[Split] PowerSpectrum Unit Test
Miep:
nr9
Device: Quadro FX 570M, 950 MHz clock, 242 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
FAILURE in c:/[Projects]/LunaticsUnited/Tools/Tests/PowerSpectrum/main.cpp, line 254
ouch :)
OK, stopping BOINC helps ::) Result tomorrow. OK, result now:
Device: Quadro FX 570M, 950 MHz clock, 242 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #9 (FFT pipeline)
Christmas 2010 edition.
Stock:
FFT+PS+SM( 8) 1.8 GFlops 3.2 GB/s ulps(fft 1.3,ps 4775.9) [OK]
FFT+PS+SM( 16) 2.9 GFlops 4.0 GB/s ulps(fft 1.6,ps 4817.4) [OK]
FFT+PS+SM( 32) 2.7 GFlops 3.0 GB/s ulps(fft 1.6,ps 4628.1) [OK]
FFT+PS+SM( 64) 5.3 GFlops 5.0 GB/s ulps(fft 1.6,ps 4557.6) [OK]
FFT+PS+SM( 128) 7.9 GFlops 6.5 GB/s ulps(fft 2.0,ps 4942.0) [OK]
FFT+PS+SM( 256) 11.0 GFlops 8.0 GB/s ulps(fft 2.0,ps 4967.8) [OK]
FFT+PS+SM( 512) 13.3 GFlops 8.7 GB/s ulps(fft 2.1,ps 5128.1) [OK]
FFT+PS+SM( 1024) 13.1 GFlops 7.8 GB/s ulps(fft 2.5,ps 5552.5) [OK]
FFT+PS+SM( 2048) 13.2 GFlops 7.2 GB/s ulps(fft 2.7,ps 5770.3) [OK]
FFT+PS+SM( 4096) 12.3 GFlops 6.1 GB/s ulps(fft 2.4,ps 5313.7) [OK]
FFT+PS+SM( 8192) 11.5 GFlops 5.3 GB/s ulps(fft 2.8,ps 5881.1) [OK]
FFT+PS+SM( 16384) 10.7 GFlops 4.6 GB/s ulps(fft 3.3,ps 6399.1) [OK]
FFT+PS+SM( 32768) 12.2 GFlops 5.0 GB/s ulps(fft 3.3,ps 6380.1) [OK]
FFT+PS+SM( 65536) 12.2 GFlops 4.7 GB/s ulps(fft 3.4,ps 6534.8) [OK]
FFT+PS+SM(131072) 12.5 GFlops 4.5 GB/s ulps(fft 3.6,ps 6694.2) [OK]
Opt1 (worst case): 64 thrds/block
FFT+PS+SM( 8) 3.7 GFlops 6.6 GB/s ulps(fft 1.3,ps 4637.5) [OK]
FFT+PS+SM( 16) 4.7 GFlops 6.4 GB/s ulps(fft 1.6,ps 4589.2) [OK]
FFT+PS+SM( 32) 5.6 GFlops 6.3 GB/s ulps(fft 1.6,ps 4535.6) [OK]
FFT+PS+SM( 64) 7.9 GFlops 7.5 GB/s ulps(fft 1.6,ps 4426.7) [OK]
FFT+PS+SM( 128) 9.4 GFlops 7.7 GB/s ulps(fft 2.0,ps 4818.1) [OK]
FFT+PS+SM( 256) 12.5 GFlops 9.1 GB/s ulps(fft 2.0,ps 4831.0) [OK]
FFT+PS+SM( 512) 15.3 GFlops 10.0 GB/s ulps(fft 2.1,ps 4987.2) [OK]
FFT+PS+SM( 1024) 15.0 GFlops 8.9 GB/s ulps(fft 2.5,ps 5438.0) [OK]
FFT+PS+SM( 2048) 14.6 GFlops 7.9 GB/s ulps(fft 2.7,ps 5674.7) [OK]
FFT+PS+SM( 4096) 14.1 GFlops 7.0 GB/s ulps(fft 2.4,ps 5202.4) [OK]
FFT+PS+SM( 8192) 12.8 GFlops 6.0 GB/s ulps(fft 2.8,ps 5765.4) [OK]
FFT+PS+SM( 16384) 11.6 GFlops 5.0 GB/s ulps(fft 3.3,ps 6291.8) [OK]
FFT+PS+SM( 32768) 13.1 GFlops 5.3 GB/s ulps(fft 3.3,ps 6275.5) [OK]
FFT+PS+SM( 65536) 14.1 GFlops 5.4 GB/s ulps(fft 3.4,ps 6429.1) [OK]
FFT+PS+SM(131072) 14.0 GFlops 5.0 GB/s ulps(fft 3.6,ps 6590.4) [OK]
Sorry, no time for averages atm.
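For anyone who wants the averages, here is a minimal Python sketch, not part of the test tool. It assumes the test output has been pasted into a string in exactly the layout shown above. The function name `average_gflops` and the regular expression are illustrative assumptions, and the sample only reuses the first two sizes from the run above.

```python
import re
from collections import defaultdict

# Matches one result line of the test output as it appears in this thread.
# (Hypothetical pattern, written against the lines pasted above.)
LINE_RE = re.compile(
    r"FFT\+PS\+SM\(\s*(\d+)\)\s+([\d.]+) GFlops\s+([\d.]+) GB/s"
)

def average_gflops(log_text):
    """Return {section name: mean GFlops} for the Stock and Opt1 sections."""
    values = defaultdict(list)
    section = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("Stock:"):
            section = "Stock"
        elif line.startswith("Opt1"):
            section = "Opt1"
        else:
            m = LINE_RE.search(line)
            # Ignore result lines that appear before any section header.
            if m and section is not None:
                values[section].append(float(m.group(2)))
    return {name: sum(v) / len(v) for name, v in values.items()}

# First two sizes of the Quadro FX 570M run, copied from the output above.
sample = """Stock:
FFT+PS+SM( 8) 1.8 GFlops 3.2 GB/s ulps(fft 1.3,ps 4775.9) [OK]
FFT+PS+SM( 16) 2.9 GFlops 4.0 GB/s ulps(fft 1.6,ps 4817.4) [OK]
Opt1 (worst case): 64 thrds/block
FFT+PS+SM( 8) 3.7 GFlops 6.6 GB/s ulps(fft 1.3,ps 4637.5) [OK]
FFT+PS+SM( 16) 4.7 GFlops 6.4 GB/s ulps(fft 1.6,ps 4589.2) [OK]
"""

if __name__ == "__main__":
    for name, mean in average_gflops(sample).items():
        print(f"{name}: {mean:.2f} GFlops average")
```

Pasting the full output in place of `sample` gives one average per section. A plain mean over all FFT sizes weights every size equally, which may or may not match how the real workload is distributed.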
Jason G:
Thanks Heinz, perryjay & Carola,
Nice to see the stubborn chips (that Quadro & ION) edging forward a bit now.
@perryjay: ~3x for 9500GT in some sizes? Don't completely know why that is, but I like it ;D
Jason
PatrickV2:
Hi there,
Ran test #9 on my Q6600/8GB/8800GTX, under both WinXP-32 as well as Win7-64.
First, WinXP-32:
--- Code: ---
Device: GeForce 8800 GTX, 1350 MHz clock, 768 MB memory.
Compute capability 1.0
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #9 (FFT pipeline)
Christmas 2010 edition.
Stock:
FFT+PS+SM( 8) 9.3 GFlops 16.4 GB/s ulps(fft 1.3,ps 4775.9) [OK]
FFT+PS+SM( 16) 13.6 GFlops 18.5 GB/s ulps(fft 1.6,ps 4817.4) [OK]
FFT+PS+SM( 32) 16.0 GFlops 17.8 GB/s ulps(fft 1.6,ps 4628.1) [OK]
FFT+PS+SM( 64) 28.3 GFlops 26.8 GB/s ulps(fft 1.6,ps 4557.6) [OK]
FFT+PS+SM( 128) 44.4 GFlops 36.5 GB/s ulps(fft 2.0,ps 4942.0) [OK]
FFT+PS+SM( 256) 59.2 GFlops 43.1 GB/s ulps(fft 2.0,ps 4967.8) [OK]
FFT+PS+SM( 512) 72.6 GFlops 47.4 GB/s ulps(fft 2.1,ps 5128.1) [OK]
FFT+PS+SM( 1024) 71.7 GFlops 42.5 GB/s ulps(fft 2.5,ps 5552.5) [OK]
FFT+PS+SM( 2048) 72.1 GFlops 39.1 GB/s ulps(fft 2.7,ps 5770.3) [OK]
FFT+PS+SM( 4096) 66.5 GFlops 33.3 GB/s ulps(fft 2.4,ps 5313.7) [OK]
FFT+PS+SM( 8192) 63.3 GFlops 29.4 GB/s ulps(fft 2.8,ps 5881.1) [OK]
FFT+PS+SM( 16384) 58.6 GFlops 25.3 GB/s ulps(fft 3.3,ps 6399.1) [OK]
FFT+PS+SM( 32768) 62.9 GFlops 25.5 GB/s ulps(fft 3.3,ps 6380.1) [OK]
FFT+PS+SM( 65536) 67.2 GFlops 25.6 GB/s ulps(fft 3.4,ps 6534.8) [OK]
FFT+PS+SM(131072) 66.0 GFlops 23.7 GB/s ulps(fft 3.6,ps 6694.2) [OK]
Opt1 (worst case): 64 thrds/block
FFT+PS+SM( 8) 14.3 GFlops 25.2 GB/s ulps(fft 1.3,ps 4637.5) [OK]
FFT+PS+SM( 16) 21.2 GFlops 28.9 GB/s ulps(fft 1.6,ps 4589.2) [OK]
FFT+PS+SM( 32) 27.5 GFlops 30.7 GB/s ulps(fft 1.6,ps 4535.6) [OK]
FFT+PS+SM( 64) 39.1 GFlops 37.0 GB/s ulps(fft 1.6,ps 4426.7) [OK]
FFT+PS+SM( 128) 47.4 GFlops 39.0 GB/s ulps(fft 2.0,ps 4818.1) [OK]
FFT+PS+SM( 256) 62.5 GFlops 45.5 GB/s ulps(fft 2.0,ps 4831.0) [OK]
FFT+PS+SM( 512) 76.0 GFlops 49.7 GB/s ulps(fft 2.1,ps 4987.2) [OK]
FFT+PS+SM( 1024) 74.1 GFlops 43.9 GB/s ulps(fft 2.5,ps 5438.0) [OK]
FFT+PS+SM( 2048) 74.2 GFlops 40.3 GB/s ulps(fft 2.7,ps 5674.7) [OK]
FFT+PS+SM( 4096) 67.3 GFlops 33.7 GB/s ulps(fft 2.4,ps 5202.4) [OK]
FFT+PS+SM( 8192) 64.7 GFlops 30.0 GB/s ulps(fft 2.8,ps 5765.4) [OK]
FFT+PS+SM( 16384) 59.8 GFlops 25.9 GB/s ulps(fft 3.3,ps 6291.8) [OK]
FFT+PS+SM( 32768) 64.3 GFlops 26.0 GB/s ulps(fft 3.3,ps 6275.5) [OK]
FFT+PS+SM( 65536) 68.6 GFlops 26.1 GB/s ulps(fft 3.4,ps 6429.1) [OK]
FFT+PS+SM(131072) 67.5 GFlops 24.3 GB/s ulps(fft 3.6,ps 6590.4) [OK]
--- End code ---
Second, Win7-64:
--- Code: ---
Device: GeForce 8800 GTX, 1350 MHz clock, 731 MB memory.
Compute capability 1.0
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #9 (FFT pipeline)
Christmas 2010 edition.
Stock:
FFT+PS+SM( 8) 8.4 GFlops 14.9 GB/s ulps(fft 1.3,ps 4775.9) [OK]
FFT+PS+SM( 16) 12.1 GFlops 16.6 GB/s ulps(fft 1.6,ps 4817.4) [OK]
FFT+PS+SM( 32) 14.6 GFlops 16.3 GB/s ulps(fft 1.6,ps 4628.1) [OK]
FFT+PS+SM( 64) 25.9 GFlops 24.5 GB/s ulps(fft 1.6,ps 4557.6) [OK]
FFT+PS+SM( 128) 38.6 GFlops 31.8 GB/s ulps(fft 2.0,ps 4942.0) [OK]
FFT+PS+SM( 256) 50.3 GFlops 36.6 GB/s ulps(fft 2.0,ps 4967.8) [OK]
FFT+PS+SM( 512) 61.2 GFlops 40.0 GB/s ulps(fft 2.1,ps 5128.1) [OK]
FFT+PS+SM( 1024) 61.6 GFlops 36.5 GB/s ulps(fft 2.5,ps 5552.5) [OK]
FFT+PS+SM( 2048) 62.3 GFlops 33.8 GB/s ulps(fft 2.7,ps 5770.3) [OK]
FFT+PS+SM( 4096) 57.5 GFlops 28.7 GB/s ulps(fft 2.4,ps 5313.7) [OK]
FFT+PS+SM( 8192) 56.1 GFlops 26.0 GB/s ulps(fft 2.8,ps 5881.1) [OK]
FFT+PS+SM( 16384) 52.4 GFlops 22.7 GB/s ulps(fft 3.3,ps 6399.1) [OK]
FFT+PS+SM( 32768) 55.5 GFlops 22.5 GB/s ulps(fft 3.3,ps 6380.1) [OK]
FFT+PS+SM( 65536) 59.2 GFlops 22.5 GB/s ulps(fft 3.4,ps 6534.8) [OK]
FFT+PS+SM(131072) 58.8 GFlops 21.1 GB/s ulps(fft 3.6,ps 6694.2) [OK]
Opt1 (worst case): 64 thrds/block
FFT+PS+SM( 8) 14.2 GFlops 25.0 GB/s ulps(fft 1.3,ps 4637.5) [OK]
FFT+PS+SM( 16) 21.0 GFlops 28.6 GB/s ulps(fft 1.6,ps 4589.2) [OK]
FFT+PS+SM( 32) 27.5 GFlops 30.7 GB/s ulps(fft 1.6,ps 4535.6) [OK]
FFT+PS+SM( 64) 39.2 GFlops 37.1 GB/s ulps(fft 1.6,ps 4426.7) [OK]
FFT+PS+SM( 128) 46.8 GFlops 38.5 GB/s ulps(fft 2.0,ps 4818.1) [OK]
FFT+PS+SM( 256) 61.1 GFlops 44.5 GB/s ulps(fft 2.0,ps 4831.0) [OK]
FFT+PS+SM( 512) 75.2 GFlops 49.2 GB/s ulps(fft 2.1,ps 4987.2) [OK]
FFT+PS+SM( 1024) 73.6 GFlops 43.6 GB/s ulps(fft 2.5,ps 5438.0) [OK]
FFT+PS+SM( 2048) 73.4 GFlops 39.8 GB/s ulps(fft 2.7,ps 5674.7) [OK]
FFT+PS+SM( 4096) 67.7 GFlops 33.9 GB/s ulps(fft 2.4,ps 5202.4) [OK]
FFT+PS+SM( 8192) 64.4 GFlops 29.8 GB/s ulps(fft 2.8,ps 5765.4) [OK]
FFT+PS+SM( 16384) 59.5 GFlops 25.7 GB/s ulps(fft 3.3,ps 6291.8) [OK]
FFT+PS+SM( 32768) 64.0 GFlops 25.9 GB/s ulps(fft 3.3,ps 6275.5) [OK]
FFT+PS+SM( 65536) 68.2 GFlops 26.0 GB/s ulps(fft 3.4,ps 6429.1) [OK]
FFT+PS+SM(131072) 67.1 GFlops 24.1 GB/s ulps(fft 3.6,ps 6590.4) [OK]
--- End code ---
Regards, Patrick.
Jason G:
--- Quote from: PatrickV2 on 25 Dec 2010, 05:21:25 am ---
Ran test #9 on my Q6600/8GB/8800GTX, under both WinXP-32 as well as Win7-64.
--- End quote ---
Excellent, not broken on the 8800. Last hurdle for that code area cleared & can move on :D
perryjay:
Carola just mentioned something I haven't been doing. I have been running the test without stopping BOINC. Should I run it with BOINC stopped?