[Split] PowerSpectrum Unit Test
PatrickV2:
--- Quote from: Jason G on 27 Dec 2010, 02:32:59 am ---Cheers & No worries Patrick,
Just wasn't sure extending the test was going to be needed. Naked-eye judgement is plenty for the purposes of testing scientific repeatability here, and running multiple times in the same exe would make it one large test rather than several small ones for comparison (if that makes any sense). I'm happy that the 8800 seems to have some headroom left, and the 'Min' numbers indicate the slowest kernels have received a nice boost.
Win7(WDDM) & XP(XPDM) driver model performance difference is 'gone' ;D
--- End quote ---
Thanks for the extended explanation; my remark was prompted merely by curiosity (and probably by a considerable lack of understanding of the underlying higher goals), but I feel more enlightened now. ;)
Regards, Patrick.
Miep:
Device: Quadro FX 570M, 950 MHz clock, 242 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 9.58) Peak( 13.91) Min( 2.48) [OK]
Memory thoughput GB/s Avg( 5.70) Peak( 9.09) Min( 3.53)
Opt1 (worst case): 64 thrds/block, 2 x 524288 element streams
revert to single stream from size 128
Processing... Done!
Compute thoughput [GFlops] -
Avg( 11.23, 1.17x) Peak( 15.13, 1.09x) Min( 4.27, 1.72x) [OK]
Memory thoughput [GB/s] -
Avg( 6.99, 1.23x) Peak( 9.88, 1.09x) Min( 5.01, 1.42x)
values roughly +- .3 on stock and +- .1 on opt1
[edit] compute speedup 1.56x - 1.76x, memory speedup 1.22x - 1.47x
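For anyone wanting to check the "x" factors printed in the Opt1 output, here is a minimal Python sketch that divides each Opt1 figure by the matching Stock figure. The numbers are copied from the results above; the dictionary layout and the `speedups` helper are made up for illustration and are not part of the unit test itself.

```python
# Throughput figures copied from the Quadro FX 570M results above.
# Compute values are in GFlops, memory values in GB/s.
stock = {
    "compute": {"avg": 9.58, "peak": 13.91, "min": 2.48},
    "memory": {"avg": 5.70, "peak": 9.09, "min": 3.53},
}
opt1 = {
    "compute": {"avg": 11.23, "peak": 15.13, "min": 4.27},
    "memory": {"avg": 6.99, "peak": 9.88, "min": 5.01},
}


def speedups(base, new):
    """Hypothetical helper: return new/base per metric, rounded to two decimals."""
    return {
        kind: {stat: round(new[kind][stat] / base[kind][stat], 2) for stat in base[kind]}
        for kind in base
    }


ratios = speedups(stock, opt1)
for kind, stats in ratios.items():
    # One line per throughput type, in the same Avg/Peak/Min order as the output above.
    print(kind, " ".join(f"{stat}={value:.2f}x" for stat, value in stats.items()))
```

The ratios agree with the factors shown next to the Opt1 Avg, Peak and Min values.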
Jason G:
--- Quote from: PatrickV2 on 27 Dec 2010, 08:04:57 am ---Thanks for the extended explanation; my remark was prompted merely by curiosity (and probably by a considerable lack of understanding of the underlying higher goals), but I feel more enlightened now. ;)
--- End quote ---
Yeah, a bit more info along those lines: the actual kernels under test run in timing loops set to roughly half a second each, which is enough for thousands to millions of runs. I was therefore expecting 'fair' stability in the Avg, Peak & Min values, so we are alright for discrete kernel performance measurements.
I have, however, picked up an interesting thing on a friend's i7-860 w/GTX480 when comparing against mine (45 nm Core 2 w/GTX480):
- His Peaks & Averages are ~the same as mine for the same clock rate ... BUT ... the 'Min' numbers (slowest kernels) are several times faster ... A better CPU & RAM does seem to have a significant impact on the running of the toughest parts of the code.
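As a rough illustration of the timing-loop idea described above, here is a Python sketch that reruns a workload until a half-second budget is used up and then reports Avg, Peak and Min throughput. It is not the unit test's actual code: `measure`, `kernel`, `flops_per_run` and the CPU stand-in workload are all assumptions made for the example, and the test's own averaging method is not stated in this thread. A real GPU harness would also need to synchronise with the device before reading the clock.

```python
import time


def measure(kernel, flops_per_run, budget_s=0.5):
    """Run `kernel` repeatedly for about `budget_s` seconds; summarise GFlops."""
    rates = []
    deadline = time.perf_counter() + budget_s
    while True:
        start = time.perf_counter()
        kernel()
        # Guard against a zero reading on coarse timers.
        elapsed = max(time.perf_counter() - start, 1e-9)
        rates.append(flops_per_run / elapsed / 1e9)
        if time.perf_counter() >= deadline:
            break
    return {
        "runs": len(rates),
        "avg": sum(rates) / len(rates),  # mean of the per-run rates
        "peak": max(rates),              # fastest single run
        "min": min(rates),               # slowest single run
    }


# Hypothetical CPU stand-in for a GPU kernel: one multiply and one add per element.
data = [float(i) for i in range(10_000)]
stats = measure(lambda: sum(x * x for x in data), flops_per_run=2 * len(data))
print(
    f"runs={stats['runs']} "
    f"Avg({stats['avg']:.4f}) Peak({stats['peak']:.4f}) Min({stats['min']:.4f})"
)
```

With many runs inside the budget, Avg settles down, while Peak and Min stay sensitive to individual outliers, which is one reason a slow host can drag down the Min figure.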
Jason
Jason G:
--- Quote from: Miep on 27 Dec 2010, 08:25:28 am ---values roughly +- .3 on stock and +- .1 on opt1
--- End quote ---
Hey that's decent! ... and there you were going to start a riot when initial mods yielded about 5% slowdown on yours ... tsk tsk tsk ;D
Miep:
--- Quote from: Jason G on 27 Dec 2010, 08:29:00 am ---Hey that's decent! ... and there you were going to start a riot when initial mods yielded about 5% slowdown on yours ... tsk tsk tsk ;D
--- End quote ---
Oh, I just learned how to complain when not suffering ;D Did the trick, didn't it? ;)