[Split] PowerSpectrum Unit Test
SciManStev:
--- Quote from: Jason G on 26 Dec 2010, 02:21:14 pm ---
--- Quote from: SciManStev on 26 Dec 2010, 02:11:17 pm ---Device: GeForce GTX 480, 810 MHz clock, 1503 MB memory.
...
Compute thoughput [GFlops] -
Avg( 165.56, 1.45x) Peak( 234.17, 1.38x) Min( 61.06, 2.86x) [OK]
--- End quote ---
Winning! (just ;)) Glad you're on water cooling with those. My fan cranks up with that and creates a vortex in my room :D.
It made me think '1.21 GigaWatts!'. I'll be looking into water cooling the 480 here sometime in the new year, starting with the basics from guides like this one and doing my homework.
--- End quote ---
With all the help you have given others, I would be happy to offer any assistance I can should you choose to go with water cooling. There is a lot in my System Tuning thread in NC that you might find interesting.
Steve
PatrickV2:
Q6600/8GB/8800GTX.
One remark though: if you want a test to be run multiple times, why not do that inside the downloadable executable? I don't mind if a benchmark of yours runs for several minutes on my rig, so it could just do a few test runs, determine the min/max and standard deviation or something, and output that.
I have in any case run the benchmark 3 times on both OS versions, before running a 4th one redirected to a text-file (and compared that one too). Results and speed-ups looked stable to my 'naked' eye.
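A minimal sketch of the kind of summary suggested above, assuming the average GFlops of each run has already been collected into a list. The function name `summarize_runs` and the four sample figures are hypothetical, chosen only for illustration; they are not measurements from this thread.

```python
import statistics


def summarize_runs(gflops_per_run):
    """Return min, max, mean and sample standard deviation of repeated runs."""
    if len(gflops_per_run) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return {
        "min": min(gflops_per_run),
        "max": max(gflops_per_run),
        "mean": statistics.mean(gflops_per_run),
        "stdev": statistics.stdev(gflops_per_run),  # sample (n - 1) deviation
    }


# Hypothetical average-GFlops figures from four runs (illustrative only).
runs = [55.0, 54.8, 55.2, 55.0]
summary = summarize_runs(runs)
print(summary)
```

A small standard deviation relative to the mean would back up the "looked stable to the naked eye" impression with a number.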
WinXP-32:
--- Code: ---Device: GeForce 8800 GTX, 1350 MHz clock, 768 MB memory.
Compute capability 1.0
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 51.45) Peak( 72.63) Min( 9.33) [OK]
Memory thoughput GB/s Avg( 30.07) Peak( 47.47) Min( 16.45)
Opt1 (worst case): 64 thrds/block, 2 x 524288 element streams
revert to single stream from size 128
Processing... Done!
Compute thoughput [GFlops] -
Avg( 55.01, 1.07x) Peak( 75.98, 1.05x) Min( 13.89, 1.49x) [OK]
Memory thoughput [GB/s] -
Avg( 33.46, 1.11x) Peak( 49.65, 1.05x) Min( 24.23, 1.47x)
--- End code ---
Win7-64:
--- Code: ---Device: GeForce 8800 GTX, 1350 MHz clock, 731 MB memory.
Compute capability 1.0
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 45.04) Peak( 62.72) Min( 8.62) [OK]
Memory thoughput GB/s Avg( 26.39) Peak( 40.07) Min( 15.21)
Opt1 (worst case): 64 thrds/block, 2 x 524288 element streams
revert to single stream from size 128
Processing... Done!
Compute thoughput [GFlops] -
Avg( 54.49, 1.21x) Peak( 75.17, 1.20x) Min( 13.75, 1.59x) [OK]
Memory thoughput [GB/s] -
Avg( 33.12, 1.26x) Peak( 49.13, 1.23x) Min( 24.07, 1.58x)
--- End code ---
Regards, Patrick.
MarkJ:
Did a few runs for test #10 on different cards/machines...
Cheers,
MarkJ
-------------------------------------------------
Device: GeForce GT 240, 1340 MHz clock, 475 MB memory.
Compute capability 1.2
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 32.78) Peak( 48.81) Min( 8.49) [OK]
Memory thoughput GB/s Avg( 19.49) Peak( 28.94) Min( 12.38)
Opt1 (worst case): 64 thrds/block, 2 x 524288 element streams
revert to single stream from size 128
Processing... Done!
Compute thoughput [GFlops] -
Avg( 35.66, 1.09x) Peak( 51.41, 1.05x) Min( 12.84, 1.51x) [OK]
Memory thoughput [GB/s] -
Avg( 22.13, 1.14x) Peak( 30.48, 1.05x) Min( 15.22, 1.23x)
------------------------------------------------------------
Device: GeForce GTX 460, 1350 MHz clock, 768 MB memory.
Compute capability 2.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 62.95) Peak( 102.88) Min( 8.18) [OK]
Memory thoughput GB/s Avg( 34.05) Peak( 52.16) Min( 13.33)
Opt1 (worst case): 256 thrds/block, 2 x 524288 element streams
revert to single stream from size 512
Processing... Done!
Compute thoughput [GFlops] -
Avg( 79.87, 1.27x) Peak( 121.17, 1.18x) Min( 23.84, 2.91x) [OK]
Memory thoughput [GB/s] -
Avg( 47.79, 1.40x) Peak( 63.10, 1.21x) Min( 33.50, 2.51x)
-----------------------------------------------------------
Device: GeForce GTX 570, 1464 MHz clock, 1248 MB memory.
Compute capability 2.0
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 101.46) Peak( 151.95) Min( 20.02) [OK]
Memory thoughput GB/s Avg( 57.48) Peak( 79.89) Min( 30.85)
Opt1 (worst case): 256 thrds/block, 2 x 524288 element streams
revert to single stream from size 512
Processing... Done!
Compute thoughput [GFlops] -
Avg( 139.93, 1.38x) Peak( 199.62, 1.31x) Min( 51.29, 2.56x) [OK]
Memory thoughput [GB/s] -
Avg( 85.24, 1.48x) Peak( 106.89, 1.34x) Min( 58.81, 1.91x)
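For anyone wanting to cross-check the reported speed-ups, here is a rough sketch that parses a "Stock" line and an "Opt1" line and recomputes the ratios. It assumes the multiplier printed after each Opt1 figure is simply Opt1 divided by Stock, which matches the GTX 570 numbers above; the regular expression and the `parse_line` helper are my own and not part of the test program.

```python
import re

# Matches e.g. "Avg( 101.46)" or "Avg( 139.93, 1.38x)".
FIELD = re.compile(r"(Avg|Peak|Min)\(\s*([\d.]+)(?:,\s*([\d.]+)x)?\)")


def parse_line(line):
    """Map each label (Avg/Peak/Min) to (value, reported_speedup or None)."""
    return {
        label: (float(value), float(ratio) if ratio else None)
        for label, value, ratio in FIELD.findall(line)
    }


# Compute throughput lines copied from the GTX 570 log.
stock = parse_line("Compute Thoughput GFlops Avg( 101.46) Peak( 151.95) Min( 20.02) [OK]")
opt1 = parse_line("Avg( 139.93, 1.38x) Peak( 199.62, 1.31x) Min( 51.29, 2.56x) [OK]")

for label in ("Avg", "Peak", "Min"):
    recomputed = opt1[label][0] / stock[label][0]
    print(f"{label}: reported {opt1[label][1]:.2f}x, recomputed {recomputed:.2f}x")
```

The recomputed ratio can differ from the printed one in the last digit, since the program presumably divides the unrounded figures.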
Jason G:
--- Quote from: PatrickV2 on 26 Dec 2010, 06:22:13 pm ---Q6600/8GB/8800GTX.
One remark though: if you want a test to be run multiple times, why not do that inside the downloadable executable? I don't mind if a benchmark of yours runs for several minutes on my rig, so it could just do a few test runs, determine the min/max and standard deviation or something, and output that.
I have in any case run the benchmark 3 times on both OS versions, before running a 4th one redirected to a text-file (and compared that one too). Results and speed-ups looked stable to my 'naked' eye.
--- End quote ---
Cheers & No worries Patrick,
Just wasn't sure extending the test was going to be needed. Naked-eye judgement is plenty for the purposes of testing scientific repeatability here, and running multiple times in the same exe would make it one large test rather than several small ones for comparison (if that makes any sense). I'm happy that the 8800 seems to have some headroom left, and the 'Min' numbers indicate the slowest kernels have received a nice boost.
Win7(WDDM) & XP(XPDM) driver model performance difference is 'gone' ;D
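To put a number on that, here is a quick sketch using the average compute GFlops from Patrick's 8800 GTX logs above (WinXP-32 versus Win7-64). The helper `gap_percent` is just a name picked for illustration, and the two runs also differ in 32-bit versus 64-bit, so this is only a rough comparison.

```python
def gap_percent(xp, win7):
    """How much slower (in percent) the Win7 figure is relative to XP."""
    return (xp - win7) / xp * 100.0


# Avg compute GFlops for the 8800 GTX, copied from the WinXP-32 and Win7-64 logs.
stock_gap = gap_percent(51.45, 45.04)
opt1_gap = gap_percent(55.01, 54.49)

print(f"Stock: Win7 is {stock_gap:.1f}% slower than XP")
print(f"Opt1:  Win7 is {opt1_gap:.1f}% slower than XP")
```

With these figures the stock gap is roughly an eighth of the XP throughput, while the Opt1 gap shrinks to about one percent.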
Secondary confirmation from a friend's 8800GTS:
XP32
--- Code: ---Device: GeForce 8800 GTS 512, 1625 MHz clock, 512 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 44.40) Peak( 66.68) Min( 7.85) [OK]
Memory thoughput GB/s Avg( 26.26) Peak( 41.19) Min( 13.83)
Opt1 (worst case): 64 thrds/block, 2 x 524288 element streams
revert to single stream from size 128
Processing... Done!
Compute thoughput [GFlops] -
Avg( 47.57, 1.07x) Peak( 67.80, 1.02x) Min( 17.37, 2.21x) [OK]
Memory thoughput [GB/s] -
Avg( 30.04, 1.14x) Peak( 41.89, 1.02x) Min( 19.00, 1.37x)
--- End code ---
Win7-32
--- Code: ---Device: GeForce 8800 GTS 512, 1625 MHz clock, 500 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 40.57) Peak( 57.91) Min( 7.32) [OK]
Memory thoughput GB/s Avg( 23.86) Peak( 35.82) Min( 12.91)
Opt1 (worst case): 64 thrds/block, 2 x 524288 element streams
revert to single stream from size 128
Processing... Done!
Compute thoughput [GFlops] -
Avg( 48.43, 1.19x) Peak( 66.67, 1.15x) Min( 15.87, 2.17x) [OK]
Memory thoughput [GB/s] -
Avg( 30.30, 1.27x) Peak( 41.94, 1.17x) Min( 20.41, 1.58x)
--- End code ---
Jason G:
--- Quote from: MarkJ on 27 Dec 2010, 12:17:59 am ---Did a few runs for test #10 on different cards/machines...
Cheers,
MarkJ
--- End quote ---
Thanks Mark! Starting to make a dent with the stubborn 240, and the Fermi boosts are looking healthy.
I will need to get to checking the 260 in the other room soon, then we should have 'the full set'.
[Later:] Here 'tis
--- Quote ---Device: GeForce GTX 260, 1242 MHz clock, 896 MB memory.
Compute capability 1.3
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 62.64) Peak( 93.36) Min( 4.48) [OK]
Memory thoughput GB/s Avg( 34.47) Peak( 52.71) Min( 7.89)
Opt1 (worst case): 128 thrds/block, 2 x 524288 element streams
revert to single stream from size 256
Processing... Done!
Compute thoughput [GFlops] -
Avg( 67.78, 1.08x) Peak( 95.96, 1.03x) Min( 5.69, 1.27x) [OK]
Memory thoughput [GB/s] -
Avg( 38.80, 1.13x) Peak( 55.48, 1.05x) Min( 10.03, 1.27x)
--- End quote ---
Maybe still some headroom on 200 series as well.