
[Split] PowerSpectrum Unit Test


Jason G:
Yes, size 128k drops off a bit on mine too, not sure why yet.

Raistmer:
Was able to get results for the 9600 GSO at last:


Device: GeForce 9600 GSO, 1700 MHz clock, 384 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
      PowerSpectrum+summax Unit test #8 (Sanity Check)
Stock:
 PS+SuMx(     8) [OK]    1.2 GFlops    5.4 GB/s
 PS+SuMx(    16) [OK]    1.6 GFlops    6.9 GB/s
 PS+SuMx(    32) [OK]    1.8 GFlops    7.3 GB/s
 PS+SuMx(    64) [OK]    2.9 GFlops   11.8 GB/s
 PS+SuMx(   128) [OK]    4.3 GFlops   17.1 GB/s
 PS+SuMx(   256) [OK]    5.5 GFlops   22.1 GB/s
 PS+SuMx(   512) [OK]    6.7 GFlops   27.0 GB/s
 PS+SuMx(  1024) [OK]    7.0 GFlops   28.1 GB/s
 PS+SuMx(  2048) [OK]    7.7 GFlops   30.8 GB/s
 PS+SuMx(  4096) [OK]    7.6 GFlops   30.4 GB/s
 PS+SuMx(  8192) [OK]    7.9 GFlops   31.6 GB/s
 PS+SuMx( 16384) [OK]    7.7 GFlops   31.0 GB/s
 PS+SuMx( 32768) [OK]    8.1 GFlops   32.5 GB/s
 PS+SuMx( 65536) [OK]    7.8 GFlops   31.3 GB/s
 PS+SuMx(131072) [OK]    8.0 GFlops   32.2 GB/s


Opt1: 64 thrds/block
                        worst case              best case
                   GFlps  GB/s ulps         GFlps  GB/s ulps
 PS+SuMx(     8)    1.5    6.5 121.7 [OK]    4.5   19.6 121.7
 PS+SuMx(    16)    2.3    9.6 121.7 [OK]    4.8   20.0 121.7
 PS+SuMx(    32)    3.0   12.1 121.7 [OK]    4.5   18.5 121.7
 PS+SuMx(    64)    3.1   12.7 121.7 [OK]    5.4   21.7 121.7
 PS+SuMx(   128)    4.5   18.1 121.7 [OK]    5.3   21.3 121.7
 PS+SuMx(   256)    5.8   23.1 121.7 [OK]    6.5   25.9 121.7
 PS+SuMx(   512)    6.9   27.8 121.7 [OK]    7.5   30.0 121.7
 PS+SuMx(  1024)    7.3   29.1 121.7 [OK]    7.8   31.2 121.7
 PS+SuMx(  2048)    7.9   31.5 121.7 [OK]    8.4   33.6 121.7
 PS+SuMx(  4096)    7.8   31.1 121.7 [OK]    8.2   32.6 121.7
 PS+SuMx(  8192)    8.1   32.3 121.7 [OK]    8.5   33.9 121.7
 PS+SuMx( 16384)    7.9   31.5 121.7 [OK]    8.2   32.8 121.7
 PS+SuMx( 32768)    8.1   32.5 121.7 [OK]    8.6   34.6 121.7
 PS+SuMx( 65536)    5.7   22.7 121.7 [OK]    8.3   33.2 121.7
 PS+SuMx(131072)    8.2   32.6 121.7 [OK]    8.5   34.1 121.7

perryjay:
Okay, here's test 8. Figured it would be better for me to post it rather than try to explain what I don't understand.  :8

Microsoft Windows [Version 6.1.7600]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\perry>cd\test

C:\test> powerspectrumtest8.exe

Device: GeForce 9500 GT, 1848 MHz clock, 1006 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
                PowerSpectrum+summax Unit test #8 (Sanity Check)
Stock:
 PS+SuMx(     8) [OK]    0.7 GFlops    3.1 GB/s
 PS+SuMx(    16) [OK]    0.8 GFlops    3.2 GB/s
 PS+SuMx(    32) [OK]    0.7 GFlops    3.0 GB/s
 PS+SuMx(    64) [OK]    1.0 GFlops    4.2 GB/s
 PS+SuMx(   128) [OK]    0.8 GFlops    3.4 GB/s
 PS+SuMx(   256) [OK]    1.6 GFlops    6.6 GB/s
 PS+SuMx(   512) [OK]    2.0 GFlops    7.8 GB/s
 PS+SuMx(  1024) [OK]    2.1 GFlops    8.2 GB/s
 PS+SuMx(  2048) [OK]    2.1 GFlops    8.2 GB/s
 PS+SuMx(  4096) [OK]    2.0 GFlops    8.1 GB/s
 PS+SuMx(  8192) [OK]    2.1 GFlops    8.4 GB/s
 PS+SuMx( 16384) [OK]    2.1 GFlops    8.4 GB/s
 PS+SuMx( 32768) [OK]    0.5 GFlops    1.9 GB/s
 PS+SuMx( 65536) [OK]    0.4 GFlops    1.5 GB/s
 PS+SuMx(131072) [OK]    2.1 GFlops    8.5 GB/s


Opt1: 64 thrds/block
                        worst case              best case
                   GFlps  GB/s ulps         GFlps  GB/s ulps
 PS+SuMx(     8)    1.1    4.8 121.7 [OK]    1.5    6.8 121.7
 PS+SuMx(    16)    1.2    5.0 121.7 [OK]    1.7    6.9 121.7
 PS+SuMx(    32)    1.2    5.0 121.7 [OK]    1.5    6.1 121.7
 PS+SuMx(    64)    0.5    1.9 121.7 [OK]    1.7    7.1 121.7
 PS+SuMx(   128)    0.6    2.5 121.7 [OK]    1.8    7.2 121.7
 PS+SuMx(   256)    0.6    2.3 121.7 [OK]    2.1    8.3 121.7
 PS+SuMx(   512)    2.0    8.1 121.7 [OK]    2.5   10.1 121.7
 PS+SuMx(  1024)    1.9    7.8 121.7 [OK]    2.6   10.3 121.7
 PS+SuMx(  2048)    2.1    8.6 121.7 [OK]    2.6   10.3 121.7
 PS+SuMx(  4096)    0.5    2.1 121.7 [OK]    2.5   10.0 121.7
 PS+SuMx(  8192)    2.2    8.7 121.7 [OK]    2.8   11.1 121.7
 PS+SuMx( 16384)    2.1    8.2 121.7 [OK]    2.7   10.9 121.7
 PS+SuMx( 32768)    2.2    8.8 121.7 [OK]    2.8   11.1 121.7
 PS+SuMx( 65536)    2.2    8.9 121.7 [OK]    2.8   11.2 121.7
 PS+SuMx(131072)    2.3    9.2 121.7 [OK]    2.8   11.3 121.7



C:\test>

Jason G:

--- Quote from: Raistmer on 23 Dec 2010, 11:13:47 am ---Was able to get results for the 9600 GSO at last:
--- End quote ---

Ouch, not much headroom between worst and best (fast GDDR3 memory on the 9600 GSO, IIRC). I reckon the 64k size is an anomaly worth looking into, as with the 128k drop-off on other cards (like ION). Thankfully that part (larger sizes) is mostly stock, so there should be plenty of tweaking possibilities, even if only for a GFlop here and there.
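
For anyone who wants to check the headroom and the 64k anomaly against the posted numbers, here is a minimal sketch. It is not part of the unit test. The figures are copied from the Opt1 table in Raistmer's 9600 GSO post; the function names and the 0.8 threshold are assumptions chosen purely for illustration.

```python
# Opt1 results from the 9600 GSO post: (size, worst-case GFlops, best-case GFlops).
GSO_9600_OPT1 = [
    (8, 1.5, 4.5), (16, 2.3, 4.8), (32, 3.0, 4.5), (64, 3.1, 5.4),
    (128, 4.5, 5.3), (256, 5.8, 6.5), (512, 6.9, 7.5), (1024, 7.3, 7.8),
    (2048, 7.9, 8.4), (4096, 7.8, 8.2), (8192, 8.1, 8.5), (16384, 7.9, 8.2),
    (32768, 8.1, 8.6), (65536, 5.7, 8.3), (131072, 8.2, 8.5),
]


def headroom(rows):
    """Return {size: best / worst}; values close to 1.0 mean little headroom."""
    return {size: best / worst for size, worst, best in rows}


def local_dips(rows, threshold=0.8):
    """Return sizes whose worst-case GFlops is below threshold times BOTH neighbours.

    The threshold is a hypothetical cut-off, not something the test itself uses.
    End points are skipped because they have only one neighbour.
    """
    dips = []
    for i in range(1, len(rows) - 1):
        size, worst, _ = rows[i]
        left = rows[i - 1][1]
        right = rows[i + 1][1]
        if worst < threshold * left and worst < threshold * right:
            dips.append(size)
    return dips


if __name__ == "__main__":
    for size, ratio in headroom(GSO_9600_OPT1).items():
        print(f"{size:>6}: best/worst = {ratio:.2f}")
    print("local dips in worst case:", local_dips(GSO_9600_OPT1))
```

With the table above, only the 65536 row dips below both of its neighbours, which matches the 64k anomaly; the best/worst ratio for the larger sizes stays close to 1.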

Jason G:

--- Quote from: perryjay on 23 Dec 2010, 11:21:18 am ---Okay, here's test 8. Figured it would be better for me to post it rather than try to explain what I don't understand.  :8

--- End quote ---

Thanks. A couple of sizes are choking there for whatever reason. I think I'm going to have to improve everything from sizes 64 and 128 upward before moving on to the FFTs. Nice that it's working with all '[OK]'.
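
One rough way to pick out the choking sizes from the posted 9600 GT run is to compare each worst-case figure with the best case for the same size. The sketch below does that with the Opt1 numbers copied from perryjay's post. The 0.5 cut-off and the function name are assumptions for illustration only and are not part of the unit test.

```python
# Opt1 results from the 9500 GT post: (size, worst-case GFlops, best-case GFlops).
GT_9500_OPT1 = [
    (8, 1.1, 1.5), (16, 1.2, 1.7), (32, 1.2, 1.5), (64, 0.5, 1.7),
    (128, 0.6, 1.8), (256, 0.6, 2.1), (512, 2.0, 2.5), (1024, 1.9, 2.6),
    (2048, 2.1, 2.6), (4096, 0.5, 2.5), (8192, 2.2, 2.8), (16384, 2.1, 2.7),
    (32768, 2.2, 2.8), (65536, 2.2, 2.9 - 0.1), (131072, 2.3, 2.8),
]


def choking_sizes(rows, cutoff=0.5):
    """Return sizes whose worst case is less than cutoff times the best case.

    cutoff is a hypothetical value; pick whatever separates noise from real stalls.
    """
    return [size for size, worst, best in rows if worst < cutoff * best]


if __name__ == "__main__":
    print("sizes choking in the worst case:", choking_sizes(GT_9500_OPT1))
```

On these numbers the sizes flagged are 64, 128, 256 and 4096. The Stock run in the same post dips at different sizes (32768 and 65536), so a single run like this is only a hint about where to look.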
