[Split] PowerSpectrum Unit Test
Jason G:
--- Quote from: Jason G on 20 Nov 2010, 07:52:23 am ---Yes please. The difference picked up earlier (thanks, Frizz) between XP32 and XP64 was interesting (with stock, around a 10% advantage to XP32, reduced to ~5% with Mod3). I've little doubt XP32 has a similar advantage over Win7 x64, due to the simpler driver model, but it'd be nice to confirm whether the mods close that gap a bit too.
--- End quote ---
--- Quote from: PatrickV2 on 20 Nov 2010, 11:32:41 am ---Sure, no problem. The results:
...
64 threads: 18.3 GFlops 7.3 GB/s 121.7ulps
--- End quote ---
Thanks! Not enough in it (~2-3%) for me to consider switching back to XP32 :).
perryjay:
Microsoft Windows [Version 6.0.6002]
Copyright (c) 2006 Microsoft Corporation. All rights reserved.
C:\Users\perry>cd\test
C:\test>powerspectrum4.exe
Device: GeForce 9500 GT, 1840 MHz clock, 1008 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
PowerSpectrum Unit Test #4
Stock GetPowerSpectrum():
64 threads: 2.8 GFlops 1.1 GB/s 1183.3ulps
GetPowerSpectrum() mod 1: (made Fermi & Pre-Fermi match in accuracy.)
32 threads: 2.7 GFlops 1.1 GB/s 121.7ulps
64 threads: 2.9 GFlops 1.1 GB/s 121.7ulps
128 threads: 2.9 GFlops 1.1 GB/s 121.7ulps
256 threads: 2.9 GFlops 1.2 GB/s 121.7ulps
GetPowerSpectrum() mod 2 (fixed, but slow):
32 threads: 0.5 GFlops 0.2 GB/s 1183.3ulps
64 threads: 0.5 GFlops 0.2 GB/s 1183.3ulps
128 threads: 0.5 GFlops 0.2 GB/s 1183.3ulps
256 threads: 0.5 GFlops 0.2 GB/s 1183.3ulps
GetPowerSpectrum() mod 3: (As with mod1, +threads & split loads)
32 threads: 2.8 GFlops 1.1 GB/s 121.7ulps
64 threads: 2.9 GFlops 1.1 GB/s 121.7ulps
128 threads: 2.9 GFlops 1.2 GB/s 121.7ulps
256 threads: 2.9 GFlops 1.2 GB/s 121.7ulps
512 threads: 2.9 GFlops 1.1 GB/s 121.7ulps
1024 threads: N/A
C:\test>
Jason G:
Woohoo, I like the ones that say 1.2 GB/s. Might have to shift compute capability 1.1 cards into the Mod 3, 128-thread category (or add more digits next time, to find out where within 0-9% that difference is; 9% would be good).
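To put rough numbers on why more digits would help, here is a small Python sketch. It assumes the test rounds its GB/s figures to one decimal place (the thread does not say whether it rounds or truncates), and the function names are made up for illustration. Taken at face value, 1.2 vs 1.1 is about 9%, but under that rounding assumption the real gap could sit anywhere from roughly 0% to 19%.

```python
def rounded_interval(shown, decimals=1):
    """Range of true values that would print as `shown`, assuming round-to-nearest."""
    half = 0.5 * 10 ** -decimals
    return shown - half, shown + half


def relative_gain_bounds(low_shown, high_shown, decimals=1):
    """Smallest and largest relative gain consistent with two printed values."""
    lo_min, lo_max = rounded_interval(low_shown, decimals)
    hi_min, hi_max = rounded_interval(high_shown, decimals)
    smallest = hi_min / lo_max - 1.0  # high value barely rounded up, low barely rounded down
    largest = hi_max / lo_min - 1.0   # the opposite extreme
    return smallest, largest


# Printed bandwidth figures from the 9500 GT run: 1.1 GB/s vs 1.2 GB/s
nominal = 1.2 / 1.1 - 1.0
smallest, largest = relative_gain_bounds(1.1, 1.2)

print(f"nominal gain: {100 * nominal:.1f}%")
print(f"possible range: {100 * smallest:.1f}% to {100 * largest:.1f}%")
```

With a second decimal place in the output, the same calculation would narrow the range to a couple of percent either side.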
arkayn:
I guess I was doing it wrong before as well; I was just running it straight.
Device: GeForce GTX 460, 1600 MHz clock, 768 MB memory.
Compute capability 2.1
Compiled with CUDA 3020.
PowerSpectrum Unit Test #4
Stock GetPowerSpectrum():
64 threads: 12.8 GFlops 5.1 GB/s 0.0ulps
GetPowerSpectrum() mod 1: (made Fermi & Pre-Fermi match in accuracy.)
32 threads: 7.7 GFlops 3.1 GB/s 121.7ulps
64 threads: 12.8 GFlops 5.1 GB/s 121.7ulps
128 threads: 17.6 GFlops 7.0 GB/s 121.7ulps
256 threads: 19.3 GFlops 7.7 GB/s 121.7ulps
GetPowerSpectrum() mod 2 (fixed, but slow):
32 threads: 8.7 GFlops 3.5 GB/s 0.0ulps
64 threads: 11.2 GFlops 4.5 GB/s 0.0ulps
128 threads: 13.2 GFlops 5.3 GB/s 0.0ulps
256 threads: 12.8 GFlops 5.1 GB/s 0.0ulps
GetPowerSpectrum() mod 3: (As with mod1, +threads & split loads)
32 threads: 7.8 GFlops 3.1 GB/s 121.7ulps
64 threads: 12.9 GFlops 5.1 GB/s 121.7ulps
128 threads: 17.6 GFlops 7.0 GB/s 121.7ulps
256 threads: 19.3 GFlops 7.7 GB/s 121.7ulps
512 threads: 19.1 GFlops 7.6 GB/s 121.7ulps
1024 threads: 15.2 GFlops 6.1 GB/s 121.7ulps
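For anyone wondering what the ulps column is counting, here is a minimal Python sketch of measuring error in units in the last place (ULPs) for single-precision values. It is not the unit test's code: the thread does not show how the test computes its figure, the helper names are hypothetical, and the power formula (re^2 + im^2 per complex sample) is an assumption based on the function name.

```python
import struct


def to_f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_key(x):
    """Map a float32 onto an integer line that preserves ordering."""
    (u,) = struct.unpack("<I", struct.pack("<f", x))
    # Bit patterns with the sign bit set are negative floats; mirror them below zero
    return u if u < 0x80000000 else 0x80000000 - u


def ulp_distance(a, b):
    """Number of representable float32 steps between a and b."""
    return abs(float32_key(a) - float32_key(b))


def power_f32(re, im):
    """Assumed power formula, with float32 rounding after every operation."""
    return to_f32(to_f32(re * re) + to_f32(im * im))


# Hypothetical sample data, not taken from the test
samples = [(0.1, 0.2), (1.5, -2.25), (3.0, 4.0)]
errors = []
for re, im in samples:
    r, i = to_f32(re), to_f32(im)
    reference = r * r + i * i  # double-precision reference
    errors.append(ulp_distance(power_f32(r, i), reference))

print("per-sample error in ulps:", errors)
print("mean error in ulps:", sum(errors) / len(errors))
```

The fractional values in the output above (121.7, 1183.3) suggest the test reports some kind of average rather than a single worst case, but that is a guess.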
perryjay:
Just to add a little bit... I'm running Vista 32 on an E5400 dual-core at 2.7 GHz. My 9500 GT has driver 260.99 and is slightly overclocked at core 723 / shader 1840 and memory at 400, which gives me 118 GFLOPS peak.
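The 118 GFLOPS peak figure can be reproduced with a quick Python sketch. It assumes the 9500 GT has 32 shader cores and that each core retires one multiply-add (2 FLOPs) per shader clock. Neither number is stated in the thread, so treat both as assumptions.

```python
SHADER_CORES = 32        # assumption: 9500 GT shader core count, not stated in the thread
FLOPS_PER_CLOCK = 2      # assumption: one multiply-add per core per clock
SHADER_CLOCK_MHZ = 1840  # from the post (matches the "1840 MHz clock" line in the test output)


def peak_gflops(cores, clock_mhz, flops_per_clock):
    """Theoretical peak = cores x clock x FLOPs per core per clock."""
    return cores * clock_mhz * flops_per_clock / 1000.0


peak = peak_gflops(SHADER_CORES, SHADER_CLOCK_MHZ, FLOPS_PER_CLOCK)
print(f"theoretical peak: {peak:.1f} GFLOPS")  # 117.8, i.e. about 118

# Best throughput reported for this card in the test output above
measured = 2.9
print(f"unit test reaches {100 * measured / peak:.1f}% of peak")
```

The test reaching only a few percent of that peak fits with its reporting a GB/s figure alongside GFlops, though the thread does not say which limit the kernel is actually hitting.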