WUs that CUDA MB can't do correctly

Raistmer:
The WU is attached, along with its results, the test log, and rescmpv2 for comparison.


[attachment deleted by admin]

Jason G:
Here's that WU run with AKv8b SSE4.1 and stock CUDA 6.05.

Same results as your AKv8 SSE3 vs. 6.06:


--- Quote ---------------
Running app : AK_v8b_win_SSE41.exe with -verb -nog
with WU     : 03dc08ad.15767.890.15.8.213.wu
Started at  : 20:20:54.402
Ended at    : 20:56:35.543
   2141.047 secs Elapsed
   2128.016 secs CPU time
Result      : stored as ref for validation.
------------
Running app : setiathome_6.05_windows_intelx86__cuda.exe with -verb -st
with WU     : 03dc08ad.15767.890.15.8.213.wu
Started at  : 20:56:35.590
Ended at    : 21:22:04.605
   1528.969 secs Elapsed
     95.953 secs CPU time
Speedup     : 95.49%
Ratio       : 22.18 x
                ----- R1:R2 ------     ----- R2:R1 ------
                Good    Bad   Ugly     Good    Bad   Ugly
        Spike      2      0      0        2      0      0
     Gaussian      2      0      0        2      0      0
        Pulse      1      0      0        1      0      0
      Triplet      1      0      1        1      0      0
   Best Spike      1      0      0        1      0      0
Best Gaussian      1      0      0        1      0      0
   Best Pulse      1      0      0        1      0      0
 Best Triplet      0      0      1        0      0      1
                ----   ----   ----     ----   ----   ----
                   9      0      2        9      0      1

Result      : Weakly similar.
--- End quote ---
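A note on the Speedup and Ratio lines in the quoted log: the figures are consistent with being computed from the CPU times rather than the elapsed times (by elapsed time the CUDA run is only about 1.40x faster). The benchmark script itself is not shown in this thread, so the formulas below are inferred from the numbers, and the function name `speedup_and_ratio` is made up for illustration. A minimal Python sketch:

```python
def speedup_and_ratio(ref_cpu_secs, test_cpu_secs):
    """Return (speedup in percent, ratio) of a test run against a reference run.

    Assumed formulas (inferred from the log, not taken from the script):
      ratio   = reference CPU time / test CPU time
      speedup = (1 - test CPU time / reference CPU time) * 100
    """
    ratio = ref_cpu_secs / test_cpu_secs
    speedup_pct = (1.0 - test_cpu_secs / ref_cpu_secs) * 100.0
    return speedup_pct, ratio


# CPU times copied from the quoted log: AK_v8b SSE4.1 vs. stock CUDA 6.05.
speedup, ratio = speedup_and_ratio(2128.016, 95.953)
print(f"Speedup     : {speedup:.2f}%")
print(f"Ratio       : {ratio:.2f} x")
```

With those inputs the sketch reproduces the 95.49% and 22.18 x shown in the log, so a large ratio here mostly says the CPU was idle while the GPU worked, not that the wall-clock time dropped by the same factor.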

Bench file attached. Ignore the fact that I broke some Init_data.xml values while experimenting with something else; that has no effect on the (lack of) validity of the result.

Jason


[attachment deleted by admin]

Raistmer:
Thanks!
I'll look for other reproducible failures and collect them here.

Raistmer:
Now a VHAR (very high angle range) WU:

The online and standalone results for the CUDA app are strongly similar again, and both are invalid against the result from the CPU AK8 SSSE3x app.
The data and the full log are attached.

Log excerpt:

------------
MB_6.06r380mod_CUDA.exe -verb -st / 03no08aa.5874.273823.14.11.250.wu :
Started at  : 15:49:57.593
Ended at    : 15:50:15.392
     17.737 secs Elapsed
     15.054 secs CPU time
Speedup     : 97.85%
Ratio       : 46.46 x
 
                ----- R1:R2 ------     ----- R2:R1 ------
                Good    Bad   Ugly     Good    Bad   Ugly
        Spike      0      0      1        0      0      0
     Gaussian      0      0      0        0      0      0
        Pulse      0      0      0        0      0      0
      Triplet      0      0      2        0      0     31
   Best Spike      0      0      1        0      0      0
Best Gaussian      0      0      1        0      0      0
   Best Pulse      0      0      1        0      0      0
 Best Triplet      0      0      1        0      0      0
                ----   ----   ----     ----   ----   ----
                   0      0      7        0      0     31

Result      : Different.
[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 9600 GSO
           totalGlobalMem = 402653184
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1700000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 0
           multiProcessorCount = 12
setiathome_CUDA: No device specified, determined to use CUDA device 1: GeForce 9600 GSO
SETI@home using CUDA accelerated device GeForce 9600 GSO
Rise priority modification by Raistmer based on rev380 of SETI@home sources
Priority of worker thread rised successfully
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22

Work Unit Info:
...............
WU true angle range is :  14.146648

[attachment deleted by admin]

Raistmer:
Here is a VLAR (very low angle range) WU that is also full of CUDA errors in standalone mode:

AK_v8_win_SSSE3x.exe -verb -st / 15no08ac.10856.20256.16.8.135.wu :
Started at  : 16:34:50.293
Ended at    : 17:38:32.699
   3822.390 secs Elapsed
   3820.168 secs CPU time
 
[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSSE3x Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale

     CPUID: Intel(R) Core(TM)2 Quad  CPU   Q9450  @ 2.66GHz
     Speed: 4 x 2655 MHz
     Cache: L1=64K L2=6144K
  Features: MMX SSE SSE2 SSE3 SSSE3
 
Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  0.013497

Flopcounter: 19390751501029.980000

Spike count:    4
Pulse count:    2
Triplet count:  0
Gaussian count: 0
called boinc_finish
[ /stderr ]
------------
MB_6.06r380mod_CUDA.exe -verb -st / 15no08ac.10856.20256.16.8.135.wu :
Started at  : 17:38:32.745
Ended at    : 17:38:53.462
     20.686 secs Elapsed
     15.163 secs CPU time
Speedup     : 99.60%
Ratio       : 251.94 x
 
                ----- R1:R2 ------     ----- R2:R1 ------
                Good    Bad   Ugly     Good    Bad   Ugly
        Spike      0      0      4        0      0     30
     Gaussian      0      0      0        0      0      0
        Pulse      0      0      2        0      0      0
      Triplet      0      0      0        0      0      0
   Best Spike      0      0      1        0      0      0
Best Gaussian      0      0      1        0      0      0
   Best Pulse      0      0      1        0      0      0
 Best Triplet      0      0      0        0      0      0
                ----   ----   ----     ----   ----   ----
                   0      0      9        0      0     30

Result      : Different.
[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 9600 GSO
           totalGlobalMem = 402653184
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1700000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 0
           multiProcessorCount = 12
setiathome_CUDA: No device specified, determined to use CUDA device 1: GeForce 9600 GSO
SETI@home using CUDA accelerated device GeForce 9600 GSO
Rise priority modification by Raistmer based on rev380 of SETI@home sources
Priority of worker thread rised successfully
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22

Work Unit Info:
...............
WU true angle range is :  0.013497
Optimal function choices:
-----------------------------------------------------
name               
-----------------------------------------------------
              v_BaseLineSmooth (no other)

            v_GetPowerSpectrum 0.00020 0.00000  test
            v_GetPowerSpectrum 0.00020 0.00000  choice

                   v_ChirpData 0.01300 0.00000  test
                   v_ChirpData 0.01300 0.00000  choice

                   v_Transpose 0.00550 0.00000  test
                  v_Transpose2 0.00492 0.00000  test
                  v_Transpose4 0.00313 0.00000  test
                  v_Transpose8 0.00586 0.00000  test
                  v_Transpose4 0.00313 0.00000  choice

               FPU opt folding 0.00775 0.00000  test
               FPU opt folding 0.00775 0.00000  choice

Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_find_pulse_flag, sizeof(*dev_find_pulse_flag), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1250 : unknown error.
Cuda error 'cudaMemcpy(PulseResults, dev_PulseResults, 4 * (cudaAcc_NumDataPoints / AdvanceBy + 1) * sizeof(*dev_PulseResults), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1262 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(tmp_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1269 : unknown error.
Cuda error 'cufftExecC2C' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_fft.cu' in line 63 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 160 : unknown error.
Cuda error 'find_triplets_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 224 : unknown error.
Cuda error 'find_triplets_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 224 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_flag, sizeof(*dev_flag), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 228 : unknown error.
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_find_pulse_flag, sizeof(*dev_find_pulse_flag), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1250 : unknown error.
Cuda error 'cudaMemcpy(PulseResults, dev_PulseResults, 4 * (cudaAcc_NumDataPoints / AdvanceBy + 1) * sizeof(*dev_PulseResults), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1262 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(tmp_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1269 : unknown error.
Cuda error 'cufftExecC2C' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_fft.cu' in line 63 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 160 : unknown error.
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected exceeds the storage space allocated.

Flopcounter: 20886859697.456421

Spike count:    30
Pulse count:    0
Triplet count:  0
Gaussian count: 0
called boinc_finish
[ /stderr ]
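With this many "Cuda error" lines, it helps to see which call failed first and how often each call site repeats; the later entries may simply be fallout from the first failure, though the log alone does not prove that. Below is a minimal Python sketch that summarizes such lines. The regular expression is written against the line format seen in this stderr output only, and the names `CUDA_ERROR_RE` and `summarize_cuda_errors` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines like:
#   Cuda error '<call>' in file '<path>' in line <n> : <message>.
CUDA_ERROR_RE = re.compile(
    r"^Cuda error '(?P<call>.*)' in file '(?P<file>[^']*)'"
    r" in line (?P<line>\d+) : (?P<msg>.*)\.$"
)


def summarize_cuda_errors(stderr_text):
    """Return (first_error, counts) for the 'Cuda error' lines in stderr_text.

    first_error is (call, file name, line number, message) or None.
    counts maps (file name, line number) to the number of occurrences.
    """
    first = None
    counts = Counter()
    for raw in stderr_text.splitlines():
        m = CUDA_ERROR_RE.match(raw.strip())
        if m is None:
            continue  # not a Cuda error line
        # Keep only the file name, not the build machine's full path.
        site = (m["file"].rsplit("/", 1)[-1], int(m["line"]))
        counts[site] += 1
        if first is None:
            first = (m["call"], site[0], site[1], m["msg"])
    return first, counts


# Three lines copied from the stderr output above.
sample = """\
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
"""

first, counts = summarize_cuda_errors(sample)
print("First failure:", first)
for (file_name, line_no), n in counts.most_common():
    print(f"{n:3d} x {file_name}:{line_no}")
```

Run against the full stderr text, the first failure reported would be the `find_pulse_kernel2<3, false>` launch at line 1166 of cudaAcc_pulsefind.cu, which is where the errors in this log begin.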


[attachment deleted by admin]
