[Split] PowerSpectrum Unit Test
Claggy:
My 9800GTX+ on Win 7 x64:
Device: GeForce 9800 GTX/9800 GTX+, 1900 MHz clock, 496 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 49.81) Peak( 71.73) Min( 8.11) [OK]
Memory thoughput GB/s Avg( 29.08) Peak( 44.80) Min( 14.31)
Opt1 (worst case): 64 thrds/block, 2 x 524288 element streams
revert to single stream from size 128
Processing... Done!
Compute thoughput [GFlops] -
Avg( 57.66, 1.16x) Peak( 80.19, 1.12x) Min( 18.07, 2.23x) [OK]
Memory thoughput [GB/s] -
Avg( 35.80, 1.23x) Peak( 50.46, 1.13x) Min( 24.47, 1.71x)
Claggy
glennaxl:
Device: GeForce GTX 260, 1441 MHz clock, 869 MB memory.
Compute capability 1.3
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #10 (FFT pipeline throughput)
Stock:
Processing... Done!
Compute Thoughput GFlops Avg( 47.55) Peak( 65.20) Min( 10.16) [OK]
Memory thoughput GB/s Avg( 28.09) Peak( 37.12) Min( 17.92)
Opt1 (worst case): 128 thrds/block, 2 x 524288 element streams
revert to single stream from size 256
Processing... Done!
Compute thoughput [GFlops] -
Avg( 84.83, 1.78x) Peak( 111.50, 1.71x) Min( 31.57, 3.11x) [OK]
Memory thoughput [GB/s] -
Avg( 52.63, 1.87x) Peak( 67.26, 1.81x) Min( 36.24, 2.02x)
Jason G:
Thanks both!
@glennaxl: that's an impressive speedup on the GTX 260. I'll have to look at that carefully on mine when I get a chance.
@Claggy: an average at roughly 3/4 of peak seems pretty good, but I think we can probably get some more.
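For anyone who wants to check the multipliers in the Opt1 output, here is a minimal Python sketch. It assumes each "x" figure is simply the Opt1 value divided by the matching Stock value, rounded to two decimals; the test source isn't shown in this thread, so that is an inference from the numbers. The names `results`, `speedups` and `avg_to_peak` are made up for illustration. The GFlops figures are copied from the two posts above.

```python
# Compute throughput in GFlops as (avg, peak, min), copied from the posts above.
results = {
    "9800 GTX+": {"stock": (49.81, 71.73, 8.11), "opt1": (57.66, 80.19, 18.07)},
    "GTX 260": {"stock": (47.55, 65.20, 10.16), "opt1": (84.83, 111.50, 31.57)},
}


def speedups(stock, opt):
    """Assumed definition: opt / stock for each of avg, peak and min."""
    return tuple(round(o / s, 2) for s, o in zip(stock, opt))


def avg_to_peak(avg, peak):
    """Fraction of peak throughput that the average run achieves."""
    return avg / peak


for card, r in results.items():
    avg_x, peak_x, min_x = speedups(r["stock"], r["opt1"])
    ratio = avg_to_peak(r["opt1"][0], r["opt1"][1])
    print(f"{card}: avg {avg_x:.2f}x, peak {peak_x:.2f}x, "
          f"min {min_x:.2f}x, avg/peak {ratio:.2f}")
```

With the posted figures this gives the same multipliers the test printed. The Opt1 avg/peak ratio comes out at about 0.72 on the 9800 GTX+ and about 0.76 on the GTX 260.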
@ALL, Thanks! I'm closing this test for now. It's been an extremely valuable contribution from you all that has had a huge impact on the pace & quality of our progress (mine in particular).
FYI: Some urgent issues may have come to light from Raistmer's OpenCL development when combined with the refinements here. Those will need some fairly close attention for a short while, to get some information back to Berkeley, but stay tuned as there are more tests to come :)
[Locking thread, Please stay tuned for further Unit Tests!]
Jason
Jason G:
@All:
Just a note that the concerns that arose and distracted me from testing & development along this line have now been at least partially resolved, and don't require any immediate action on our part. I'm back to ruggedising & integrating what we've accomplished here into the X-builds, and plan to start devising tests for PoT (Power over Time) processing refinement soon, in similar fashion to this thread. PoT processing covers Gaussian searches and Triplet & Pulse finding, for which all CUDA releases have known issues to address, so there'll be plenty of tests to devise & collect data for yet.
Cheers once again! :)
Jason