Cheers & no worries Patrick, I just wasn't sure extending the test was going to be needed. Naked-eye judgement is plenty for the purposes of testing scientific repeatability here, and running multiple times in the same exe would make it one large test rather than several small ones for comparison (if that makes any sense). I'm happy that the 8800 seems to have some headroom left, and the 'Min' numbers indicate the slowest kernels have received a nice boost. The performance difference between the Win7 (WDDM) and XP (XPDM) driver models is 'gone'.
Device: Quadro FX 570M, 950 MHz clock, 242 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 9.58) Peak( 13.91) Min( 2.48) [OK]
Memory thoughput GB/s Avg( 5.70) Peak( 9.09) Min( 3.53)
Opt1 (worst case): 64 thrds/block, 2 x 524288 element streams revert to single stream from size 128
Processing... Done!
Compute thoughput [GFlops] - Avg( 11.23, 1.17x) Peak( 15.13, 1.09x) Min( 4.27, 1.72x) [OK]
Memory thoughput [GB/s] - Avg( 6.99, 1.23x) Peak( 9.88, 1.09x) Min( 5.01, 1.42x)
Thanks for the extended explanation; my remark was prompted merely by curiosity (and probably a considerable lack of understanding of the underlying higher goals), but I feel more enlightened now.
Values fluctuate by roughly ±0.3 on stock and ±0.1 on Opt1.
Hey that's decent! ... and there you were going to start a riot when initial mods yielded about 5% slowdown on yours ... tsk tsk tsk
Device: GeForce 9800 GTX/9800 GTX+, 1900 MHz clock, 496 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 49.81) Peak( 71.73) Min( 8.11) [OK]
Memory thoughput GB/s Avg( 29.08) Peak( 44.80) Min( 14.31)
Opt1 (worst case): 64 thrds/block, 2 x 524288 element streams revert to single stream from size 128
Processing... Done!
Compute thoughput [GFlops] - Avg( 57.66, 1.16x) Peak( 80.19, 1.12x) Min( 18.07, 2.23x) [OK]
Memory thoughput [GB/s] - Avg( 35.80, 1.23x) Peak( 50.46, 1.13x) Min( 24.47, 1.71x)
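For anyone wondering where the "x" factors in the Opt1 lines come from: they look like plain Opt1/Stock ratios rounded to two decimals. Here is a minimal sketch in Python, assuming that is all the unit test does. The function name `speedup` and the dictionary layout are made up for illustration and are not taken from the test source; the numbers are the 9800 GTX compute figures posted above.

```python
def speedup(stock: float, opt: float) -> float:
    """Ratio of optimised to stock throughput, rounded to two decimals.

    Assumption: the 'x' factors in the Opt1 output are simply opt / stock.
    """
    if stock <= 0:
        raise ValueError("stock throughput must be positive")
    return round(opt / stock, 2)


# Compute throughput in GFlops, copied from the 9800 GTX run above.
stock_gflops = {"Avg": 49.81, "Peak": 71.73, "Min": 8.11}
opt1_gflops = {"Avg": 57.66, "Peak": 80.19, "Min": 18.07}

# One ratio per column (Avg, Peak, Min).
ratios = {name: speedup(stock_gflops[name], opt1_gflops[name])
          for name in stock_gflops}

for name, ratio in ratios.items():
    print(f"{name}: {opt1_gflops[name]:.2f} GFlops, {ratio:.2f}x")
```

The same function reproduces the factors in the Quadro FX 570M output, e.g. `speedup(2.48, 4.27)` for the compute 'Min' column.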