SETI MB CUDA for Linux
sunu:
Yes, I agree with Raistmer, that must have been something different. I don't know exactly what ARs trigger the VLAR kill but I've never seen it kill anything larger than 0.20.
riofl:
The error said something to the effect of VLAR killed angle 0.49.
I thought it was a bit high, but several work units in the same range of angles were killed by it. The non-killer app is working perfectly. I am down to using 0.25 as the cutoff angle and it's just purring right along, even with the desktop features I like enabled.
I was getting ready to go out and get an ATI card for my video and use the NVIDIA cards for CUDA only, because I thought the desktop was using GPU resources and causing problems with the CUDA apps (desktop cube, shading and transparency features, etc.; I was experimenting with those just to see what 'glitter' was like :P ), especially since things started working again when I turned those features off. But with this non-killer app I can keep everything enabled and it all lives together well.
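For anyone following along, the cutoff idea above can be sketched in a few lines of Python. This is only an illustration and not the code of the actual VLAR-kill app: the function name `is_vlar` is hypothetical, and the default of 0.25 is simply the cutoff mentioned in the post, not a documented value.

```python
def is_vlar(angle_range: float, cutoff: float = 0.25) -> bool:
    """Return True when a work unit's true angle range is below the cutoff.

    Hypothetical helper: the real app's test and threshold may differ.
    """
    return angle_range < cutoff


# Angle ranges mentioned in this thread: 0.20 (the largest sunu has seen
# killed) and 0.49 (the value riofl remembered from the error message).
for ar in (0.20, 0.49):
    print(f"AR {ar:.2f}: treated as VLAR = {is_vlar(ar)}")
```

Under this assumed rule an angle range of 0.49 would not be treated as VLAR with a 0.25 cutoff, which fits the suspicion that the failures had a different cause.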
riofl:
OK, I went back and looked at them again. I was not very awake the last time I looked. It was a different error, an fft.cu error. However, I don't get that with this app, unless it was just those specific work units.
Here are a few. Maybe 30 work units errored out; I looked at 10 of them just now and all had the same error except this first one.
Device 1: GeForce GTX 285 is okay
SETI@home using CUDA accelerated device GeForce GTX 285
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.7.0
Work Unit Info:
...............
WU true angle range is : 10.416071
SETI@home error -12 Unknown error
cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel
File: ./cudaAcc_pulsefind.cu
Line: 232
--------------------------------------------
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce GTX 285 is okay
SETI@home using CUDA accelerated device GeForce GTX 285
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.7.0
Work Unit Info:
...............
WU true angle range is : 0.437965
CUFFT error in file './cudaAcc_fft.cu' in line 62.
-----------------------------------------
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce GTX 285 is okay
SETI@home using CUDA accelerated device GeForce GTX 285
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.7.0
Work Unit Info:
...............
WU true angle range is : 0.407435
CUFFT error in file './cudaAcc_fft.cu' in line 62.
------------------------------------------
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce GTX 285 is okay
SETI@home using CUDA accelerated device GeForce GTX 285
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.7.0
Work Unit Info:
...............
WU true angle range is : 0.437965
CUFFT error in file './cudaAcc_fft.cu' in line 62.
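With around 30 failed work units, it can be quicker to pull the angle range and the error line out of each stderr dump with a script than to read them one by one. The following is a rough sketch that assumes the dumps are pasted into one string and separated by lines of dashes, as above. The function name `summarize` and the regular expressions are my own and only match the message formats shown in this post.

```python
import re

# Patterns written against the log lines quoted above (assumed formats).
AR_RE = re.compile(r"WU true angle range is\s*:\s*([0-9.]+)")
ERR_RE = re.compile(r"^(CUFFT error .*|SETI@home error .*)$", re.MULTILINE)
SEP_RE = re.compile(r"^-{10,}\s*$", re.MULTILINE)  # lines of 10+ dashes


def summarize(text):
    """Return a list of (angle_range, error_line) pairs, one per dump."""
    results = []
    for chunk in SEP_RE.split(text):
        ar = AR_RE.search(chunk)
        err = ERR_RE.search(chunk)
        if ar and err:
            results.append((float(ar.group(1)), err.group(1).strip()))
    return results


sample = """\
WU true angle range is : 10.416071
SETI@home error -12 Unknown error
--------------------------------------------
WU true angle range is : 0.437965
CUFFT error in file './cudaAcc_fft.cu' in line 62.
"""

for angle, error in summarize(sample):
    print(f"{angle:>10.6f}  {error}")
```

Sorting or grouping the resulting pairs by error text would show at a glance whether the failures cluster around particular angle ranges.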
sunu:
riofl, is the computer 4166601 ( http://setiathome.berkeley.edu/show_host_detail.php?hostid=4166601 ) yours?
The error cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel is a "normal" one. There is nothing in it.
The errors from that computer's page are interesting. They occur right after the "preparatory" phase in the CPU and when the GPU was supposed to take over. I've checked a few and all seem to happen in your "good" GTX285 card and not in the problematic Tesla card. Am I right?
If I remember correctly you were experiencing unusually high run times in your GPUs, does it still happen?
There is definitely something not right with the setup of this computer.
I think I've asked you before and you have told me the brand of your motherboard, can you remind me?
riofl:
--- Quote from: sunu on 18 Jan 2010, 07:15:03 pm ---riofl, is the computer 4166601 ( http://setiathome.berkeley.edu/show_host_detail.php?hostid=4166601 ) yours?
The error cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel is a "normal" one. There is nothing in it.
The errors from that computer's page are interesting. They occur right after the "preparatory" phase in the CPU and when the GPU was supposed to take over. I've checked a few and all seem to happen in your "good" GTX285 card and not in the problematic Tesla card. Am I right?
If I remember correctly you were experiencing unusually high run times in your GPUs, does it still happen?
There is definitely something not right with the setup of this computer.
I think I've asked you before and you have told me the brand of your motherboard, can you remind me?
--- End quote ---
Yes, that is the computer. The Tesla is problematic in that it simply locks up randomly; there is a hardware problem with it. Restarting BOINC cures it for some time. Video RAM test software shows a bad RAM chip around the 700 MB mark. I think my workstation uses more video resources than I realized, and the GTX 285 is simply overwhelmed when I have the KDE options enabled, leaving too little for SETI.
My times now average 16-18 min on the Tesla and 19-22 min on the GTX 285, much better than the previous 30 min or so. My scores have finally climbed to near 15k, like you said they should be.
The computing errors were happening just as the GPU was supposed to take over. That was when I had all the 'cute' features of KDE4 enabled, including dimming of unfocused windows, cube desktop switching, sharpen desktop and several other things (all experimental, to see what it was like to use a workstation with the glitz enabled). I also use dual 24" monitors, each at 1920x1200, with the NVIDIA TwinView option, so I am sure that takes up a bit of video resources as well. I also use a different background on each of 9 desktops, with the same image loaded on each monitor per desktop.
Once I disabled the glitz and glitter options, did a power-down restart to let everything clear, and changed back to the older non-VLAR-killer app, all the errors stopped.
The system is an Intel Q6600 quad-core processor overclocked to 3.0 GHz using a 9x multiplier and a 333 MHz bus. The OCZ RAM is set to its stock frequency of DDR2-1066. RAM timings were adjusted slightly from the factory-recommended 5-5-5-18 to 5-5-5-15, and CPU and RAM voltages are at the stock factory recommendations. Instead of auto, the PCI-E bus speed is locked at 100 MHz, since the Gigabyte board in full auto mode tends to adjust everything as it wants, which could be dangerous.
The motherboard is a Gigabyte GA-P35-DS4 rev 2.1.
Things have been stable for the past 20 hours or so, since I readjusted everything back to the standard dull desktop :)
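As a quick sanity check on the clock figures above, the core clock is just the multiplier times the bus clock. The short Python sketch below works that out. The multiplier and bus values come from the post, while the 2400 MHz stock clock of the Q6600 is added here for comparison and the variable names are only illustrative.

```python
# Values from the post above.
multiplier = 9
bus_mhz = 333

# Stock Q6600 core clock, added for comparison (not stated in the post).
stock_mhz = 2400

cpu_mhz = multiplier * bus_mhz  # 9 x 333 MHz
overclock_pct = (cpu_mhz / stock_mhz - 1) * 100

print(f"Core clock: {cpu_mhz} MHz (about {cpu_mhz / 1000:.1f} GHz)")
print(f"Overclock over stock: {overclock_pct:.1f}%")
```

That works out to 2997 MHz, which is the 3.0 GHz quoted, or roughly a quarter above the stock clock.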