WUs that CUDA MB can't do correctly
Raistmer:
Attached are this WU with its results, the testing log, and rescmpv2 for comparison.
[attachment deleted by admin]
Jason G:
Here's that WU run with AKv8b SSE4.1 and stock CUDA 6.05.
Same results as your AKv8 SSE3 vs. 6.06:
--- Quote ---------------
Running app : AK_v8b_win_SSE41.exe with -verb -nog
with WU : 03dc08ad.15767.890.15.8.213.wu
Started at : 20:20:54.402
Ended at : 20:56:35.543
2141.047 secs Elapsed
2128.016 secs CPU time
Result : stored as ref for validation.
------------
Running app : setiathome_6.05_windows_intelx86__cuda.exe with -verb -st
with WU : 03dc08ad.15767.890.15.8.213.wu
Started at : 20:56:35.590
Ended at : 21:22:04.605
1528.969 secs Elapsed
95.953 secs CPU time
Speedup : 95.49%
Ratio : 22.18 x
               ----- R1:R2 ------  ----- R2:R1 ------
                 Good   Bad  Ugly    Good   Bad  Ugly
Spike               2     0     0       2     0     0
Gaussian            2     0     0       2     0     0
Pulse               1     0     0       1     0     0
Triplet             1     0     1       1     0     0
Best Spike          1     0     0       1     0     0
Best Gaussian       1     0     0       1     0     0
Best Pulse          1     0     0       1     0     0
Best Triplet        0     0     1       0     0     1
                 ----  ----  ----    ----  ----  ----
                    9     0     2       9     0     1
Result : Weakly similar.
--- End quote ---
Bench file attached. Ignore that I broke some Init_data.xml values while experimenting with something else ... no effect on (the lack of) validity of the result.
Jason
[attachment deleted by admin]
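The Speedup and Ratio lines in the bench log above can be reproduced from the two CPU-time figures. This is a minimal sketch, assuming (inferred from the numbers only, not from the bench script's source) that both values are derived from CPU time rather than elapsed time. The function name is hypothetical.

```python
def speedup_and_ratio(ref_cpu_secs, test_cpu_secs):
    """Return (speedup in percent, ratio) of a test run against a reference run.

    Assumption: both figures are based on CPU time, which is what the
    numbers in the posted log are consistent with.
    """
    ratio = ref_cpu_secs / test_cpu_secs
    speedup_pct = (1.0 - test_cpu_secs / ref_cpu_secs) * 100.0
    return speedup_pct, ratio


# CPU times from the log: AK_v8b SSE4.1 reference vs. stock CUDA 6.05
speedup, ratio = speedup_and_ratio(2128.016, 95.953)
print(f"Speedup : {speedup:.2f}%")
print(f"Ratio : {ratio:.2f} x")
```

With the CPU times from the log this gives 95.49% and 22.18 x, matching the posted figures. Note that the elapsed times (2141.047 s vs. 1528.969 s) are much closer together than the CPU times, so the ratio says little about wall-clock gain on this WU.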
Raistmer:
Thanks!
I'll look for other reproducible failures and collect them here.
Raistmer:
Now a VHAR WU:
The online and standalone results for the CUDA app are strongly similar again, and both are invalid versus the CPU AK8 SSSE3x app result.
The data and the full log are attached.
Log excerpt:
------------
MB_6.06r380mod_CUDA.exe -verb -st / 03no08aa.5874.273823.14.11.250.wu :
Started at : 15:49:57.593
Ended at : 15:50:15.392
17.737 secs Elapsed
15.054 secs CPU time
Speedup : 97.85%
Ratio : 46.46 x
               ----- R1:R2 ------  ----- R2:R1 ------
                 Good   Bad  Ugly    Good   Bad  Ugly
Spike               0     0     1       0     0     0
Gaussian            0     0     0       0     0     0
Pulse               0     0     0       0     0     0
Triplet             0     0     2       0     0    31
Best Spike          0     0     1       0     0     0
Best Gaussian       0     0     1       0     0     0
Best Pulse          0     0     1       0     0     0
Best Triplet        0     0     1       0     0     0
                 ----  ----  ----    ----  ----  ----
                    0     0     7       0     0    31
Result : Different.
[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
setiathome_CUDA: Found 1 CUDA device(s):
Device 1 : GeForce 9600 GSO
totalGlobalMem = 402653184
sharedMemPerBlock = 16384
regsPerBlock = 8192
warpSize = 32
memPitch = 262144
maxThreadsPerBlock = 512
clockRate = 1700000
totalConstMem = 65536
major = 1
minor = 1
textureAlignment = 256
deviceOverlap = 0
multiProcessorCount = 12
setiathome_CUDA: No device specified, determined to use CUDA device 1: GeForce 9600 GSO
SETI@home using CUDA accelerated device GeForce 9600 GSO
Rise priority modification by Raistmer based on rev380 of SETI@home sources
Priority of worker thread rised successfully
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22
Work Unit Info:
...............
WU true angle range is : 14.146648
[attachment deleted by admin]
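The bottom row of each comparison table is the per-column sum of the signal rows above it. This is a small sketch for re-checking those totals when copying tables by hand. It assumes only the row layout visible in the log (a label followed by six integers); the helper name is hypothetical and this is not part of rescmpv2.

```python
# Signal rows copied from the VHAR comparison table in the log above.
TABLE = """\
Spike 0 0 1 0 0 0
Gaussian 0 0 0 0 0 0
Pulse 0 0 0 0 0 0
Triplet 0 0 2 0 0 31
Best Spike 0 0 1 0 0 0
Best Gaussian 0 0 1 0 0 0
Best Pulse 0 0 1 0 0 0
Best Triplet 0 0 1 0 0 0
"""


def column_totals(table_text):
    """Sum the six numeric columns: Good/Bad/Ugly for R1:R2, then for R2:R1."""
    totals = [0] * 6
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        counts = [int(x) for x in fields[-6:]]  # label may contain a space
        totals = [t + c for t, c in zip(totals, counts)]
    return totals


totals = column_totals(TABLE)
print(totals)
```

For this table the sums come out as 0 0 7 and 0 0 31, the same as the totals row in the log.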
Raistmer:
A VLAR WU that also produces many CUDA errors in standalone mode:
AK_v8_win_SSSE3x.exe -verb -st / 15no08ac.10856.20256.16.8.135.wu :
Started at : 16:34:50.293
Ended at : 17:38:32.699
3822.390 secs Elapsed
3820.168 secs CPU time
[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSSE3x Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale
CPUID: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
Speed: 4 x 2655 MHz
Cache: L1=64K L2=6144K
Features: MMX SSE SSE2 SSE3 SSSE3
Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.013497
Flopcounter: 19390751501029.980000
Spike count: 4
Pulse count: 2
Triplet count: 0
Gaussian count: 0
called boinc_finish
[ /stderr ]
------------
MB_6.06r380mod_CUDA.exe -verb -st / 15no08ac.10856.20256.16.8.135.wu :
Started at : 17:38:32.745
Ended at : 17:38:53.462
20.686 secs Elapsed
15.163 secs CPU time
Speedup : 99.60%
Ratio : 251.94 x
               ----- R1:R2 ------  ----- R2:R1 ------
                 Good   Bad  Ugly    Good   Bad  Ugly
Spike               0     0     4       0     0    30
Gaussian            0     0     0       0     0     0
Pulse               0     0     2       0     0     0
Triplet             0     0     0       0     0     0
Best Spike          0     0     1       0     0     0
Best Gaussian       0     0     1       0     0     0
Best Pulse          0     0     1       0     0     0
Best Triplet        0     0     0       0     0     0
                 ----  ----  ----    ----  ----  ----
                    0     0     9       0     0    30
Result : Different.
[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
setiathome_CUDA: Found 1 CUDA device(s):
Device 1 : GeForce 9600 GSO
totalGlobalMem = 402653184
sharedMemPerBlock = 16384
regsPerBlock = 8192
warpSize = 32
memPitch = 262144
maxThreadsPerBlock = 512
clockRate = 1700000
totalConstMem = 65536
major = 1
minor = 1
textureAlignment = 256
deviceOverlap = 0
multiProcessorCount = 12
setiathome_CUDA: No device specified, determined to use CUDA device 1: GeForce 9600 GSO
SETI@home using CUDA accelerated device GeForce 9600 GSO
Rise priority modification by Raistmer based on rev380 of SETI@home sources
Priority of worker thread rised successfully
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22
Work Unit Info:
...............
WU true angle range is : 0.013497
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_GetPowerSpectrum 0.00020 0.00000 test
v_GetPowerSpectrum 0.00020 0.00000 choice
v_ChirpData 0.01300 0.00000 test
v_ChirpData 0.01300 0.00000 choice
v_Transpose 0.00550 0.00000 test
v_Transpose2 0.00492 0.00000 test
v_Transpose4 0.00313 0.00000 test
v_Transpose8 0.00586 0.00000 test
v_Transpose4 0.00313 0.00000 choice
FPU opt folding 0.00775 0.00000 test
FPU opt folding 0.00775 0.00000 choice
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_find_pulse_flag, sizeof(*dev_find_pulse_flag), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1250 : unknown error.
Cuda error 'cudaMemcpy(PulseResults, dev_PulseResults, 4 * (cudaAcc_NumDataPoints / AdvanceBy + 1) * sizeof(*dev_PulseResults), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1262 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(tmp_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1269 : unknown error.
Cuda error 'cufftExecC2C' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_fft.cu' in line 63 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 160 : unknown error.
Cuda error 'find_triplets_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 224 : unknown error.
Cuda error 'find_triplets_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 224 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_flag, sizeof(*dev_flag), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 228 : unknown error.
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_find_pulse_flag, sizeof(*dev_find_pulse_flag), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1250 : unknown error.
Cuda error 'cudaMemcpy(PulseResults, dev_PulseResults, 4 * (cudaAcc_NumDataPoints / AdvanceBy + 1) * sizeof(*dev_PulseResults), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1262 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(tmp_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1269 : unknown error.
Cuda error 'cufftExecC2C' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_fft.cu' in line 63 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 160 : unknown error.
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected exceeds the storage space allocated.
Flopcounter: 20886859697.456421
Spike count: 30
Pulse count: 0
Triplet count: 0
Gaussian count: 0
called boinc_finish
[ /stderr ]
[attachment deleted by admin]
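In the stderr above, the first reported failure is find_pulse_kernel2<3, false> at line 1166 of cudaAcc_pulsefind.cu, and every later CUDA call (transpose, FFT, power spectrum, summax, memcpy) then reports the same "unknown error". That pattern is consistent with one early kernel failure leaving the CUDA context in an error state, though the log alone does not prove it. Below is a minimal sketch for condensing such a log to the first failing call and a per-file error count. The regular expression is written against the line format shown in this log only, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines of the form seen in the posted stderr:
# Cuda error '<call>' in file '<path>' in line <n> : <message>
CUDA_ERROR = re.compile(
    r"Cuda error '(?P<call>.+)' in file '(?P<path>[^']+)' "
    r"in line (?P<line>\d+) : (?P<msg>.+)"
)


def summarize_cuda_errors(stderr_text):
    """Return (first failing (call, line) or None, Counter of errors per source file)."""
    first = None
    per_file = Counter()
    for raw in stderr_text.splitlines():
        m = CUDA_ERROR.match(raw.strip())
        if not m:
            continue
        if first is None:
            first = (m["call"], int(m["line"]))
        per_file[m["path"].rsplit("/", 1)[-1]] += 1
    return first, per_file


# A few lines copied verbatim from the log above.
SAMPLE = """\
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cufftExecC2C' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_fft.cu' in line 63 : unknown error.
"""

first, per_file = summarize_cuda_errors(SAMPLE)
print("first failure:", first)
for name, count in per_file.most_common():
    print(name, count)
```

Run against the full stderr, a summary like this makes it easier to compare failing WUs: what matters is which call fails first, not the long tail of follow-on errors.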