WUs that CUDA MB can't do correctly

Raistmer:
One more VLAR
AR ~0.01
The online result differs from the standalone result, and both differ from the standalone CPU result.
There is an overflow in the online result; the CUDA standalone result has no overflow but contains invalid signals.
The standalone CUDA MB run didn't cause a driver crash but did cause a "snow screen" effect. CUDA error in stderr.



Maik:
Big WU crash with several restarts of the task, access violation errors, runtime debugger ...
application: mem-opt MB_6.06r380mod_CUDA.exe
WU: 01no08aa.9239.55814.9.8.194
WU true angle range is :  0.019967

error:
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x004097A4 read attempt to address 0x0980A820

Engaging BOINC Windows Runtime Debugger...

Running it standalone now with the stock app.
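
Since the same cudaMemcpy failure is logged several times, it can help to collapse the stderr output into distinct errors with counts. Below is a minimal Python sketch. The regular expression is inferred only from the error lines quoted above, and the function name `summarize_cuda_errors` is made up for illustration; nothing here comes from the application's source.

```python
import re
from collections import Counter

# Pattern inferred from the stderr lines quoted in this post (an assumption,
# not taken from the application's source code).
CUDA_ERROR_RE = re.compile(
    r"Cuda error '(?P<call>.+)' in file '(?P<file>[^']+)' "
    r"in line (?P<line>\d+) : (?P<message>.+?)\.?\s*$"
)


def summarize_cuda_errors(stderr_text):
    """Count identical CUDA errors in a stderr dump.

    Returns a Counter keyed by (call, file, line number, message).
    Lines that do not look like CUDA errors are ignored.
    """
    counts = Counter()
    for raw_line in stderr_text.splitlines():
        match = CUDA_ERROR_RE.search(raw_line)
        if match:
            key = (
                match["call"],
                match["file"],
                int(match["line"]),
                match["message"],
            )
            counts[key] += 1
    return counts
```

Fed the four error lines above, this would report one distinct error (the device-to-host copy at line 1265 of cudaAcc_pulsefind.cu, message "unknown error") with a count of 4.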

Update:
-------------------------
AK_v8_win_SSE41.exe -standalone, Elapsed time: 3715 seconds
 > Spike count:    1
 > Pulse count:    3
 > Triplet count:  0
 > Gaussian count: 0
-------------------------
setiathome_6.06_windows_intelx86__cuda.exe -standalone, Elapsed time: 32 seconds
 > Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
-------------------------
Results added as an attachment.
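
For a quick first check of a CPU run against a CUDA run, the signal-count summaries can be compared mechanically. This is a rough Python sketch that assumes both runs print counts in the "Spike count: N" style shown above; the function names are hypothetical. Matching counts do not prove that two results are equivalent, since the signals themselves may still differ, but mismatching counts are a fast red flag.

```python
import re

# Matches summary lines such as " > Spike count:    1" (format taken from
# the output quoted above; other apps may print it differently).
COUNT_RE = re.compile(r"(Spike|Pulse|Triplet|Gaussian) count:\s*(\d+)")


def parse_signal_counts(summary_text):
    """Return {signal type: count} for every 'X count: N' entry found."""
    return {name: int(value) for name, value in COUNT_RE.findall(summary_text)}


def diff_signal_counts(reference, candidate):
    """Return {signal type: (reference count, candidate count)} where they differ.

    A signal type missing from one side is reported with None on that side.
    """
    differences = {}
    for name in sorted(set(reference) | set(candidate)):
        ref_count = reference.get(name)
        cand_count = candidate.get(name)
        if ref_count != cand_count:
            differences[name] = (ref_count, cand_count)
    return differences
```

An empty dictionary from `diff_signal_counts` means the counts agree; a run that died before printing any counts shows up as None for every signal type.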

Raistmer:
VLAR AR ~0.13
Driver crash in the standalone run (6.06 stock, not my build); overflow in the online result.
Both results differ from the CPU one; unknown CUDA error in stderr.



Jason G:

--- Quote from: Raistmer on 05 Jan 2009, 08:14:20 am ---...unknown CUDA error in stderr....
--- End quote ---

Oooh, that sounds familiar  :P

Raistmer:

--- Quote from: Josef W. Segur on 04 Jan 2009, 05:30:41 pm ---
--- Quote from: Raistmer on 04 Jan 2009, 05:11:10 pm ---That would all be true if it showed up in the CPU result too. While we have this error on the GPU only, I tend to think it's just another CUDA app problem....
--- End quote ---

In this case it looks like a design choice rather than an execution bug. My guess is someone did a mathematical estimation of the most "above threshold" points which should occur in pure random data and allowed for somewhat more. It looks like another case where ideal randomness isn't quite achieved...
                                                                    Joe

--- End quote ---

I agree that this design flaw may show up at some point in the future, but for now it's just another CUDA bug.
http://setiathome.berkeley.edu/workunit.php?wuid=390436470
The CPU didn't report this overflow (as it should be :) )
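
Joe's guess about an allowance sized from pure random data can be illustrated with a toy calculation. This Python sketch assumes each point's power is normalized to a mean of 1 and follows an exponential distribution, a common model for the power of Gaussian noise, and that points are independent. The function names, the threshold and the margin factor are all hypothetical; none of them are taken from the CUDA app.

```python
import math


def expected_points_above(num_points, threshold):
    """Expected number of points above `threshold` in pure random data.

    Assumes unit-mean exponentially distributed power, for which
    P(power > threshold) = exp(-threshold).
    """
    return num_points * math.exp(-threshold)


def allowance(num_points, threshold, margin=2.0):
    """Buffer size: the expected count times a safety margin, rounded up."""
    return math.ceil(margin * expected_points_above(num_points, threshold))
```

If the real data is less random than the model, for example because of interference, the actual count can exceed any allowance derived this way. That would be the kind of "ideal randomness isn't quite achieved" case Joe describes.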
