[Split] PowerSpectrum Unit Test

Vyper:
Well here is one of my slightly overclocked GTX460.

Running Win7 x64 with driver version 260.99.

Kind regards Vyper

Jason G:

--- Quote from: Miep on 06 Dec 2010, 05:36:57 am ---and one small mobile GPU ;) :
--- End quote ---
The worst-case reduction is faster while the powerspectrum runs at the same speed, great ;D

Jason G:

--- Quote from: Vyper on 06 Dec 2010, 07:38:30 am ---Well here is one of my slightly overclocked GTX460.
--- End quote ---

Thanks! We Fermi users are going to need more computation packed in there to bring those GFlops up.

Miep:
OK, a bit of statistics then: average +- std dev over 15 runs.

Device: Quadro FX 570M, 950 MHz clock, 242 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
                PowerSpectrum+summax Unit test #6 (pinned mem)
Stock:
 PwrSpec<    64>    4.4 GFlops   17.5 GB/s 1183.3ulps

 SumMax (    64)    0.3 GFlops    1.1 GB/s
Every ifft average & peak OK

 PS+SuMx(    64)    0.82 +- 0.086 GFlops    3.5 GB/s


GetPowerSpectrum() choice for Opt1: 64 thrds/block
     64 threads:        4.37 +- 0.046 GFlops   17.5 GB/s 121.7ulps


Opt1 (PSmod3+SM): 64 thrds/block
PowerSpectrumSumMax array pinned in host memory.
   64 threads, fftlen 64: (worst case: full summax copy)
         1.37 +- 0.149 GFlops    6.0 GB/s 121.7ulps
Every ifft average & peak OK
   64 threads, fftlen 64: (best case, nothing to update)
         1.61 +- 0.026 GFlops    6.6 GB/s 121.7ulps


now if the pink was better distinguishable from the white ::)
would you like that for the GB/s as well?

Jason G:

--- Quote from: Miep on 06 Dec 2010, 09:30:38 am ---now if the pink was better distinguishable from the white ::)
would you like that for the GB/s as well?

--- End quote ---

Thanks for the tolerances. Since this is largely memory bound, the GFlops tolerances are more than enough; they indicate roughly +/- 10% variation in the worst case. I presume that GPU is also driving a display, so that's reasonable.
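
For anyone who wants to check the +/- 10% figure, here is a minimal sketch that turns the posted "mean +- std dev" GFlops values into a relative spread. The numbers are copied from Miep's results above; the labels and the helper name `relative_spread` are made up for illustration and are not part of the unit test.

```python
# Mean and standard deviation in GFlops, copied from the results posted above.
# The dictionary keys are illustrative labels, not names used by the unit test.
results = {
    "Stock PS+SuMx": (0.82, 0.086),
    "Opt1 PwrSpec, 64 threads": (4.37, 0.046),
    "Opt1 worst case": (1.37, 0.149),
    "Opt1 best case": (1.61, 0.026),
}


def relative_spread(mean, std_dev):
    """Return the standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean


spreads = {name: relative_spread(m, s) for name, (m, s) in results.items()}

for name, pct in spreads.items():
    print(f"{name:26s} +/- {pct:4.1f}%")
```

The stock PS+SuMx and Opt1 worst-case rows come out between 10% and 11%, while the other two rows stay under 2%.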