Driver, application and VRAM requirement?

Miep:
Hello world ;)

OK, so I'm running a high-spec notebook with an NVIDIA Quadro FX 570M, 256MB. See this host for a few more details.
From the start it has had stability issues with the graphics driver (IRQ_zero at system service bluescreens, among others). Well, I assume it's the graphics driver,
since updating it has increased stability,
up to 195.62, which is the most stable and the one I'm currently running.
I did upgrade to 257.21, but with the decrease in reported RAM it wouldn't run stock 6.08 any more, and it was more unstable on top of that. (Looks like they just released 258.96; I haven't tried that one...)
I downgraded again last week so I could at least run stock 6.08.
So. I tried the optimized CUDA MBnokill 2.2 dll and it errored out on me. It didn't go through to report and I didn't remember to preserve the stderr, but from what I've read I assume it's a memory issue: stock 6.08 runs just fine (AND I've got Aero enabled ;P). I can probably free up some RAM if somebody explains how to do it (or just points to the right post).
As an interim measure I've stuck 6.08 into app_info, so at least I get some WUs to play around with. So far it hasn't errored (first one is at 15% done right now), and since the errors with the optimized app occurred in the first two minutes or so, I hope it's playing along for the moment.
Any further suggestions?

BMaytum:
Miep:

Maybe others here can help you, especially if you can link to the errored WU and/or provide the STDERR output for the WU(s) that errored out when you tried the optimized NO-VLAR-kill app.

Aside: just an FYI for MB WUs run on GPU with the optimized VLAR-Kill version: if the WU terminates after only a few elapsed seconds, it was terminated because its angle range (AR) was too low, i.e. VLAR-Kill caused the termination. In such cases you'll see something similar to this:

--- Code: ---Stderr output
<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
 - exit code -6 (0xfffffffa)
</message>
<stderr_txt>

VLAR WU (AR: 0.009264 )detected... autokill initialised
SETI@home error -6 Bad workunit header


--- End code ---


The "error -6" means VLAR-Kill did its job.
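
If you want to check a saved stderr for this pattern without reading it by eye, here is a minimal Python sketch. It assumes the autokill line is worded exactly as in the excerpt above (other builds may print it differently), and the names `classify_stderr` and `to_hex32` are made up for illustration, not part of BOINC or the optimized apps. It also shows why -6 appears as 0xfffffffa: that is the same value read as an unsigned 32-bit number.

```python
import re

# Patterns taken from the stderr excerpt above.
# Assumption: other app builds may word these lines differently.
VLAR_LINE = re.compile(r"VLAR WU \(AR:\s*([0-9.]+)\s*\)\s*detected")
EXIT_LINE = re.compile(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)")


def to_hex32(code: int) -> str:
    """Show a signed exit code as an unsigned 32-bit hex value."""
    return f"0x{code & 0xFFFFFFFF:08x}"


def classify_stderr(text: str) -> dict:
    """Hypothetical helper: extract the exit code and angle range from stderr text."""
    exit_match = EXIT_LINE.search(text)
    vlar_match = VLAR_LINE.search(text)
    exit_code = int(exit_match.group(1)) if exit_match else None
    angle_range = float(vlar_match.group(1)) if vlar_match else None
    return {
        "exit_code": exit_code,
        "angle_range": angle_range,
        # Treated as a VLAR kill only if both the -6 code and the autokill line are present.
        "vlar_killed": exit_code == -6 and angle_range is not None,
    }


sample = """<message>
 - exit code -6 (0xfffffffa)
</message>
<stderr_txt>

VLAR WU (AR: 0.009264 )detected... autokill initialised
SETI@home error -6 Bad workunit header
"""

result = classify_stderr(sample)
print(result["vlar_killed"], result["angle_range"])
print(to_hex32(-6))  # 0xfffffffa
```

A result with a different exit code, or without the autokill line, is not flagged, so genuine errors still stand out.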

Raistmer:
Hi here :)
Try to disable Aero first. Also, if 191.xx drivers work with CUDA 2.3 DLLs stable - no need to upgrade to higher versions (still)

Richard Haselgrove:

--- Quote from: Raistmer on 19 Jul 2010, 11:59:29 am ---Hi here :)
Try to disable Aero first. Also, if 191.xx drivers work with CUDA 2.3 DLLs stable - no need to upgrade to higher versions (still)

--- End quote ---

When Miep asked about this on the main board last week (Nvidia driver revert?), I had a scout round and found that 190/191 drivers were never released (even in Beta) for her Quadro FX 570M / Vista 32. The first available driver with CUDA 2.3 support was the 195.62 she's using now.

Miep:
Ah well, next iteration. Disabled Aero, reran.
I think it might just be possible to get too many debugging messages... Are there actually people who can make sense of core dumps?

So, memory, as assumed:


--- Code: ---<stderr_out>
<![CDATA[
<message>
 - exit code -1073741819 (0xc0000005)
</message>
--- End code ---

and then further down


--- Code: ---After app init: total GPU memory 268435456 free GPU memory 38637568

Cuda error 'cudaMalloc((void**) &dev_WorkData' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcceleration.cu' in line 293 : out of memory.

setiathome_CUDA: CUDA runtime ERROR in device memory allocation (Step 1 of 3). Falling back to HOST CPU processing...

Unhandled Exception Detected...
- Unhandled Exception Record -

Reason: Access Violation (0xc0000005) at address 0x726F662F read attempt to address 0x726F662F
Engaging BOINC Windows Runtime Debugger...

--- End code ---

followed by no less than 7 pages of runtime debugger messages.

Kept client_state in case anybody feels like having a closer look.
Fine, back to 6.08 for now.
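
For reference, the "After app init" line above is easier to read in MiB. A minimal Python sketch, assuming the line format shown in the stderr excerpt; `parse_gpu_memory` and `summarize` are hypothetical names for illustration. It only reports what the driver said was free. It does not say how much the app actually needs.

```python
import re

# Line format taken from the stderr excerpt above (assumed stable for this app build).
MEM_LINE = re.compile(r"total GPU memory (\d+)\s+free GPU memory (\d+)")
MIB = 1024 * 1024


def parse_gpu_memory(line: str):
    """Return (total_bytes, free_bytes), or None if the line does not match."""
    match = MEM_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def summarize(total: int, free: int) -> dict:
    """Convert byte counts to MiB and compute the free fraction."""
    return {
        "total_mib": total / MIB,
        "free_mib": free / MIB,
        "free_fraction": free / total,
    }


line = "After app init: total GPU memory 268435456 free GPU memory 38637568"
total, free = parse_gpu_memory(line)
info = summarize(total, free)
print(f"{info['free_mib']:.1f} MiB free of {info['total_mib']:.0f} MiB "
      f"({info['free_fraction']:.1%})")
```

So only about 37 MiB of the 256 MiB was free when the `cudaMalloc` call failed, which fits the out-of-memory message.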
