[Split] PowerSpectrum Unit Test

Jason G:

--- Quote from: Jason G on 20 Nov 2010, 07:52:23 am ---Yes please. The difference picked up earlier (thanks, Frizz) between XP32 and XP64 was interesting (with stock, around a 10% advantage to XP32, reduced to ~5% with Mod 3). I've little doubt XP32 has a similar advantage over Win7 x64, due to the simpler driver model, but it'd be nice to confirm whether the mods close that gap a bit too.

--- End quote ---


--- Quote from: PatrickV2 on 20 Nov 2010, 11:32:41 am ---Sure, no problem. The results:
...
     64 threads:       18.3 GFlops    7.3 GB/s 121.7ulps

--- End quote ---
Thanks! Not enough in it (~2-3%) for me to consider switching back to XP32 :).

perryjay:


Microsoft Windows [Version 6.0.6002]
Copyright (c) 2006 Microsoft Corporation.  All rights reserved.

C:\Users\perry>cd\test

C:\test>powerspectrum4.exe

Device: GeForce 9500 GT, 1840 MHz clock, 1008 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
                PowerSpectrum Unit Test #4
Stock GetPowerSpectrum():
     64 threads:        2.8 GFlops    1.1 GB/s 1183.3ulps


GetPowerSpectrum() mod 1: (made Fermi & Pre-Fermi match in accuracy.)
     32 threads:        2.7 GFlops    1.1 GB/s 121.7ulps
     64 threads:        2.9 GFlops    1.1 GB/s 121.7ulps
    128 threads:        2.9 GFlops    1.1 GB/s 121.7ulps
    256 threads:        2.9 GFlops    1.2 GB/s 121.7ulps


GetPowerSpectrum() mod 2 (fixed, but slow):
     32 threads:        0.5 GFlops    0.2 GB/s 1183.3ulps
     64 threads:        0.5 GFlops    0.2 GB/s 1183.3ulps
    128 threads:        0.5 GFlops    0.2 GB/s 1183.3ulps
    256 threads:        0.5 GFlops    0.2 GB/s 1183.3ulps


GetPowerSpectrum() mod 3: (As with mod1, +threads & split loads)
     32 threads:        2.8 GFlops    1.1 GB/s 121.7ulps
     64 threads:        2.9 GFlops    1.1 GB/s 121.7ulps
    128 threads:        2.9 GFlops    1.2 GB/s 121.7ulps
    256 threads:        2.9 GFlops    1.2 GB/s 121.7ulps
    512 threads:        2.9 GFlops    1.1 GB/s 121.7ulps
   1024 threads: N/A



C:\test>

Jason G:
Woohoo, I like the ones that say 1.2 GB/s. I might have to shift compute capability 1.1 cards into the Mod 3, 128-thread category (or print more digits next time, to find out where within 0-9% that difference actually lies; 9% would be good).
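
To put a number on why more digits would help: with only one decimal printed, 1.1 GB/s and 1.2 GB/s each stand for a range of true values. The sketch below bounds the real gain, assuming the test prints with ordinary round-to-nearest (the test's source is not shown in this thread, so that is an assumption, and `relative_gain_bounds` is just an illustrative name).

```python
def relative_gain_bounds(slow, fast, decimals=1):
    """Bound the true relative gain fast/slow - 1 when both numbers
    were printed rounded to `decimals` places (assumed round-to-nearest)."""
    scale = 10 ** decimals
    slow_n = round(slow * scale)  # 1.1 -> 11 printed units
    fast_n = round(fast * scale)  # 1.2 -> 12 printed units
    # A printed n stands for a true value in [(n - 0.5), (n + 0.5)) printed
    # units; the scale factor cancels in the ratio, so integers suffice.
    smallest = (2 * fast_n - 1) / (2 * slow_n + 1) - 1.0
    largest = (2 * fast_n + 1) / (2 * slow_n - 1) - 1.0
    return smallest, largest


low, high = relative_gain_bounds(1.1, 1.2)
print(f"true gain is somewhere between {low:.1%} and {high:.1%}")
```

Under that rounding assumption the nominal 9% could be anything from essentially nothing up to roughly 19%, so one more printed digit would narrow it down considerably.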

arkayn:
I guess I was doing it wrong before as well; I was just running it straight.

Device: GeForce GTX 460, 1600 MHz clock, 768 MB memory.
Compute capability 2.1
Compiled with CUDA 3020.
      PowerSpectrum Unit Test #4
Stock GetPowerSpectrum():
     64 threads:       12.8 GFlops    5.1 GB/s   0.0ulps


GetPowerSpectrum() mod 1: (made Fermi & Pre-Fermi match in accuracy.)
     32 threads:        7.7 GFlops    3.1 GB/s 121.7ulps
     64 threads:       12.8 GFlops    5.1 GB/s 121.7ulps
    128 threads:       17.6 GFlops    7.0 GB/s 121.7ulps
    256 threads:       19.3 GFlops    7.7 GB/s 121.7ulps


GetPowerSpectrum() mod 2 (fixed, but slow):
     32 threads:        8.7 GFlops    3.5 GB/s   0.0ulps
     64 threads:       11.2 GFlops    4.5 GB/s   0.0ulps
    128 threads:       13.2 GFlops    5.3 GB/s   0.0ulps
    256 threads:       12.8 GFlops    5.1 GB/s   0.0ulps


GetPowerSpectrum() mod 3: (As with mod1, +threads & split loads)
     32 threads:        7.8 GFlops    3.1 GB/s 121.7ulps
     64 threads:       12.9 GFlops    5.1 GB/s 121.7ulps
    128 threads:       17.6 GFlops    7.0 GB/s 121.7ulps
    256 threads:       19.3 GFlops    7.7 GB/s 121.7ulps
    512 threads:       19.1 GFlops    7.6 GB/s 121.7ulps
   1024 threads:       15.2 GFlops    6.1 GB/s 121.7ulps
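
For anyone comparing several of these pasted logs, picking the fastest thread count by eye gets tedious. Here is a minimal sketch that parses the result rows and returns the best one; the regular expression is fitted to the output format shown in this thread, and the names `ROW` and `best_row` are made up for illustration, not part of the test program.

```python
import re

# Matches rows such as "    256 threads:       19.3 GFlops    7.7 GB/s 121.7ulps".
ROW = re.compile(
    r"^\s*(\d+) threads:\s+([\d.]+) GFlops\s+([\d.]+) GB/s\s+([\d.]+)ulps"
)


def best_row(log_text):
    """Return (threads, gflops, gb_per_s, ulps) for the highest-GFlops row,
    or None if the text contains no result rows."""
    rows = []
    for line in log_text.splitlines():
        m = ROW.match(line)
        if m:  # rows like "1024 threads: N/A" do not match and are skipped
            rows.append((int(m.group(1)), float(m.group(2)),
                         float(m.group(3)), float(m.group(4))))
    return max(rows, key=lambda r: r[1]) if rows else None


# Mod 3 rows from the GTX 460 run above.
MOD3_GTX460 = """\
     32 threads:        7.8 GFlops    3.1 GB/s 121.7ulps
     64 threads:       12.9 GFlops    5.1 GB/s 121.7ulps
    128 threads:       17.6 GFlops    7.0 GB/s 121.7ulps
    256 threads:       19.3 GFlops    7.7 GB/s 121.7ulps
    512 threads:       19.1 GFlops    7.6 GB/s 121.7ulps
   1024 threads:       15.2 GFlops    6.1 GB/s 121.7ulps
"""

print(best_row(MOD3_GTX460))
```

On the GTX 460 rows this picks the 256-thread entry. Feeding it one mod section at a time keeps the mods from being mixed together.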

perryjay:
Just to add a little bit... I'm running Vista 32 on an E5400 dual-core at 2.7 GHz. My 9500 GT has driver 260.99 and is slightly overclocked, at core 723 / shader 1840 MHz and memory at 400 MHz, to give me 118 GFLOPS peak.
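
The 118 GFLOPS peak figure can be reproduced with simple arithmetic. The sketch below assumes 32 shader cores for the 9500 GT (taken from published specs, not from this thread) and counts one multiply-add, i.e. 2 flops, per core per shader clock; both are assumptions, and `peak_gflops` is only an illustrative helper.

```python
def peak_gflops(shader_cores, shader_clock_mhz, flops_per_core_per_clock=2):
    """Theoretical peak in GFLOPS: cores x clock x flops per clock.
    The default of 2 assumes one multiply-add per core per clock."""
    return shader_cores * shader_clock_mhz * flops_per_core_per_clock / 1000.0


peak = peak_gflops(32, 1840)  # 32 cores is an assumed spec; 1840 MHz is the reported shader clock
measured = 2.9                # best GFlops from the 9500 GT run above
print(f"peak ~{peak:.0f} GFLOPS, unit test reaches {measured / peak:.1%} of that")
```

That gives about 118 GFLOPS, matching the figure quoted, and shows the unit test reaching only a few percent of the arithmetic peak on this card.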
