[Split] PowerSpectrum Unit Test
Jason G:
Yes, size 128k drops off a bit on mine too, not sure why yet.
Raistmer:
Was able to get results for GSO9600 at last:
Device: GeForce 9600 GSO, 1700 MHz clock, 384 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #8 (Sanity Check)
Stock:
PS+SuMx( 8) [OK] 1.2 GFlops 5.4 GB/s
PS+SuMx( 16) [OK] 1.6 GFlops 6.9 GB/s
PS+SuMx( 32) [OK] 1.8 GFlops 7.3 GB/s
PS+SuMx( 64) [OK] 2.9 GFlops 11.8 GB/s
PS+SuMx( 128) [OK] 4.3 GFlops 17.1 GB/s
PS+SuMx( 256) [OK] 5.5 GFlops 22.1 GB/s
PS+SuMx( 512) [OK] 6.7 GFlops 27.0 GB/s
PS+SuMx( 1024) [OK] 7.0 GFlops 28.1 GB/s
PS+SuMx( 2048) [OK] 7.7 GFlops 30.8 GB/s
PS+SuMx( 4096) [OK] 7.6 GFlops 30.4 GB/s
PS+SuMx( 8192) [OK] 7.9 GFlops 31.6 GB/s
PS+SuMx( 16384) [OK] 7.7 GFlops 31.0 GB/s
PS+SuMx( 32768) [OK] 8.1 GFlops 32.5 GB/s
PS+SuMx( 65536) [OK] 7.8 GFlops 31.3 GB/s
PS+SuMx(131072) [OK] 8.0 GFlops 32.2 GB/s
Opt1: 64 thrds/block
worst case best case
GFlps GB/s ulps GFlps GB/s ulps
PS+SuMx( 8) 1.5 6.5 121.7 [OK] 4.5 19.6 121.7
PS+SuMx( 16) 2.3 9.6 121.7 [OK] 4.8 20.0 121.7
PS+SuMx( 32) 3.0 12.1 121.7 [OK] 4.5 18.5 121.7
PS+SuMx( 64) 3.1 12.7 121.7 [OK] 5.4 21.7 121.7
PS+SuMx( 128) 4.5 18.1 121.7 [OK] 5.3 21.3 121.7
PS+SuMx( 256) 5.8 23.1 121.7 [OK] 6.5 25.9 121.7
PS+SuMx( 512) 6.9 27.8 121.7 [OK] 7.5 30.0 121.7
PS+SuMx( 1024) 7.3 29.1 121.7 [OK] 7.8 31.2 121.7
PS+SuMx( 2048) 7.9 31.5 121.7 [OK] 8.4 33.6 121.7
PS+SuMx( 4096) 7.8 31.1 121.7 [OK] 8.2 32.6 121.7
PS+SuMx( 8192) 8.1 32.3 121.7 [OK] 8.5 33.9 121.7
PS+SuMx( 16384) 7.9 31.5 121.7 [OK] 8.2 32.8 121.7
PS+SuMx( 32768) 8.1 32.5 121.7 [OK] 8.6 34.6 121.7
PS+SuMx( 65536) 5.7 22.7 121.7 [OK] 8.3 33.2 121.7
PS+SuMx(131072) 8.2 32.6 121.7 [OK] 8.5 34.1 121.7
perryjay:
Okay, here's test 8. Figured it would be better for me to post it rather than try to explain what I don't understand. :8
Microsoft Windows [Version 6.1.7600]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.
C:\Users\perry>cd\test
C:\test> powerspectrumtest8.exe
Device: GeForce 9500 GT, 1848 MHz clock, 1006 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #8 (Sanity Check)
Stock:
PS+SuMx( 8) [OK] 0.7 GFlops 3.1 GB/s
PS+SuMx( 16) [OK] 0.8 GFlops 3.2 GB/s
PS+SuMx( 32) [OK] 0.7 GFlops 3.0 GB/s
PS+SuMx( 64) [OK] 1.0 GFlops 4.2 GB/s
PS+SuMx( 128) [OK] 0.8 GFlops 3.4 GB/s
PS+SuMx( 256) [OK] 1.6 GFlops 6.6 GB/s
PS+SuMx( 512) [OK] 2.0 GFlops 7.8 GB/s
PS+SuMx( 1024) [OK] 2.1 GFlops 8.2 GB/s
PS+SuMx( 2048) [OK] 2.1 GFlops 8.2 GB/s
PS+SuMx( 4096) [OK] 2.0 GFlops 8.1 GB/s
PS+SuMx( 8192) [OK] 2.1 GFlops 8.4 GB/s
PS+SuMx( 16384) [OK] 2.1 GFlops 8.4 GB/s
PS+SuMx( 32768) [OK] 0.5 GFlops 1.9 GB/s
PS+SuMx( 65536) [OK] 0.4 GFlops 1.5 GB/s
PS+SuMx(131072) [OK] 2.1 GFlops 8.5 GB/s
Opt1: 64 thrds/block
worst case best case
GFlps GB/s ulps GFlps GB/s ulps
PS+SuMx( 8) 1.1 4.8 121.7 [OK] 1.5 6.8 121.7
PS+SuMx( 16) 1.2 5.0 121.7 [OK] 1.7 6.9 121.7
PS+SuMx( 32) 1.2 5.0 121.7 [OK] 1.5 6.1 121.7
PS+SuMx( 64) 0.5 1.9 121.7 [OK] 1.7 7.1 121.7
PS+SuMx( 128) 0.6 2.5 121.7 [OK] 1.8 7.2 121.7
PS+SuMx( 256) 0.6 2.3 121.7 [OK] 2.1 8.3 121.7
PS+SuMx( 512) 2.0 8.1 121.7 [OK] 2.5 10.1 121.7
PS+SuMx( 1024) 1.9 7.8 121.7 [OK] 2.6 10.3 121.7
PS+SuMx( 2048) 2.1 8.6 121.7 [OK] 2.6 10.3 121.7
PS+SuMx( 4096) 0.5 2.1 121.7 [OK] 2.5 10.0 121.7
PS+SuMx( 8192) 2.2 8.7 121.7 [OK] 2.8 11.1 121.7
PS+SuMx( 16384) 2.1 8.2 121.7 [OK] 2.7 10.9 121.7
PS+SuMx( 32768) 2.2 8.8 121.7 [OK] 2.8 11.1 121.7
PS+SuMx( 65536) 2.2 8.9 121.7 [OK] 2.8 11.2 121.7
PS+SuMx(131072) 2.3 9.2 121.7 [OK] 2.8 11.3 121.7
C:\test>
Jason G:
--- Quote from: Raistmer on 23 Dec 2010, 11:13:47 am ---Was able to get results for GSO9600 at last:
--- End quote ---
Ouch, not much headroom between worst and best case (fast GDDR3 memory on the 9600 GSO, IIRC). I reckon the 64k size is an anomaly worth looking into, as is the 128k drop-off on other cards (like ION). Thankfully that part (the larger sizes) is mostly stock, so there should be plenty of tweaking possibilities, even if only for a GFlop here and there.
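To put a number on that headroom, here is a rough sketch that takes the Opt1 worst-case and best-case GFlops from the 9600 GSO run above and prints the relative gap per size. The helper names, the 10% threshold and the cut-off at size 512 and up are assumptions made for illustration (the smaller sizes show much wider gaps and are left out); none of this is part of the unit test itself.

```python
# Opt1 rows from the 9600 GSO post, sizes 512 and up:
# size -> (worst-case GFlops, best-case GFlops)
OPT1_9600GSO = {
    512: (6.9, 7.5),
    1024: (7.3, 7.8),
    2048: (7.9, 8.4),
    4096: (7.8, 8.2),
    8192: (8.1, 8.5),
    16384: (7.9, 8.2),
    32768: (8.1, 8.6),
    65536: (5.7, 8.3),
    131072: (8.2, 8.5),
}


def relative_gap(worst, best):
    """Fraction of the best-case throughput that the worst case gives up."""
    return (best - worst) / best


def wide_gaps(rows, threshold=0.10):
    """Sizes whose worst/best gap exceeds the (arbitrary) threshold."""
    return [
        size
        for size, (worst, best) in sorted(rows.items())
        if relative_gap(worst, best) > threshold
    ]


if __name__ == "__main__":
    for size, (worst, best) in sorted(OPT1_9600GSO.items()):
        print(f"{size:>6}  gap {relative_gap(worst, best):5.1%}")
    print("wide gaps:", wide_gaps(OPT1_9600GSO))
```

With these figures every size from 512 up stays under the 10% threshold except 65536, which is the 64k anomaly mentioned above.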
Jason G:
--- Quote from: perryjay on 23 Dec 2010, 11:21:18 am ---Okay, here's test 8. Figured it would be better for me to post it rather than try to explain what I don't understand. :8
--- End quote ---
Thanks. A couple of sizes are choking there for whatever reason. I think I'm going to have to improve everything from sizes 64 and 128 upward before moving on to the FFTs. Nice that it's working with all '[OK]'.
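One way to pick out the choking sizes mechanically is to compare each result with the best throughput seen at any smaller size, since the numbers otherwise climb with size. The sketch below does that for the 9500 GT figures posted above; the function name and the 0.75 ratio are assumptions chosen for illustration, not anything taken from the test program.

```python
SIZES = [8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096,
         8192, 16384, 32768, 65536, 131072]

# GFlops columns copied from the 9500 GT run above.
STOCK_9500GT = [0.7, 0.8, 0.7, 1.0, 0.8, 1.6, 2.0, 2.1,
                2.1, 2.0, 2.1, 2.1, 0.5, 0.4, 2.1]
OPT1_WORST_9500GT = [1.1, 1.2, 1.2, 0.5, 0.6, 0.6, 2.0, 1.9,
                     2.1, 0.5, 2.2, 2.1, 2.2, 2.2, 2.3]


def find_dips(sizes, gflops, ratio=0.75):
    """Return sizes whose throughput falls below `ratio` times the best
    throughput already measured at a smaller size."""
    dips = []
    best_so_far = 0.0
    for size, value in zip(sizes, gflops):
        if value < ratio * best_so_far:
            dips.append(size)
        best_so_far = max(best_so_far, value)
    return dips


if __name__ == "__main__":
    print("stock dips:     ", find_dips(SIZES, STOCK_9500GT))
    print("Opt1 worst dips:", find_dips(SIZES, OPT1_WORST_9500GT))
```

On this data the rule flags 32768 and 65536 for the stock kernel, and 64, 128, 256 and 4096 for the Opt1 worst case. A different ratio would move the borderline cases, so treat the list as a starting point for investigation rather than a verdict.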