[Split] PowerSpectrum Unit Test
Jason G:
Thanks Steve!
Now your increased core speed is showing in the improved 'worst case' figure over mine (your 10.9 vs my 8.1 GFlops).
GTX480 (watercooled)
Average, peak calcs, thread-count heuristic: OK
worst case speedup: ~53% ( 1.53x )
best case speedup: ~119% ( 2.19x )
Ghost0210:
--- Quote from: Jason G on 29 Nov 2010, 02:43:24 pm ---
Nice that my tweaking works even faster on XP, but I'm starting to hope MS include some sort of video subsystem fixes in SP1 for Win7x64 :D
--- End quote ---
Just re-ran the Mod5 test on my GTX465 on Win7 x64 SP1 v.721 RC,
getting the same results as before:
Device: GeForce GTX 465, 1215 MHz clock, 994 MB memory.
Compute capability 2.0
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #5
Stock:
PwrSpec< 64> 16.0 GFlops 63.9 GB/s 0.0ulps
SumMax ( 64) 1.3 GFlops 5.2 GB/s
Every ifft average & peak OK
PS+SuMx( 64) 4.1 GFlops 16.5 GB/s
GetPowerSpectrum() choice for Opt1: 256 thrds/block
256 threads: 23.1 GFlops 92.5 GB/s 121.7ulps
Opt1 (PSmod3+SM): 256 thrds/block
256 threads, fftlen 64: (worst case: full summax copy)
6.0 GFlops 24.2 GB/s 121.7ulps
Every ifft average & peak OK
256 threads, fftlen 64: (best case, nothing to update)
8.7 GFlops 35.4 GB/s 121.7ulps
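For anyone wanting to turn the GFlops lines above into the "speedup" style figures used earlier in the thread, here is a minimal sketch. It assumes speedup is simply the Opt1 throughput divided by the stock PS+SuMx throughput, which is a guess at how the numbers are derived and not something the test output confirms. The function names are made up for illustration.

```python
# Sketch only: assumes speedup = optimised GFlops / stock GFlops.
# That definition is an assumption, not taken from the unit test source.

def speedup(stock_gflops, opt_gflops):
    """Ratio of optimised to stock throughput (assumed definition)."""
    return opt_gflops / stock_gflops


def as_percent(ratio):
    """Convert a ratio such as 1.53x into a percentage gain such as 53%."""
    return (ratio - 1.0) * 100.0


# Figures copied from the GTX 465 run above (fftlen 64).
STOCK = 4.1        # PS+SuMx( 64)
OPT1_WORST = 6.0   # worst case: full summax copy
OPT1_BEST = 8.7    # best case: nothing to update

for label, gflops in (("worst case", OPT1_WORST), ("best case", OPT1_BEST)):
    r = speedup(STOCK, gflops)
    print(f"{label} speedup: ~{as_percent(r):.0f}% ( {r:.2f}x )")
```

Under that assumption the GTX 465 run works out to roughly 1.46x worst case and 2.12x best case.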
Jason G:
Interesting. In the meantime I also managed to verify that a 32 bit versus 64 bit executable yielded no discernible performance difference here (since it's GPU hard-coded anyway ;) )
So we're left with WinXP32's simpler driver model with no DirectX 10+ support, or WDDM stuff going on IMO. I wonder if there's a way to turn off more stuff in Win7x64, video subsystem-wise.
[Edit:] Hmmm....
http://www.anandtech.com/show/3924/nvidia-announces-parallel-nsight-15-cuda-toolkit-32
"Compared to the old XPDM, WDDM was a big step up for GPU usage on Windows, but only for graphical purposes. With Windows’ iron-fisted control over the GPU and a focus on task scheduling for responsiveness over performance, it wasn’t ideal for GPGPU purposes. Case in point, with a WDDM driver NVIDIA was finding it took 30μs for a kernel to be launched, but if they had Windows treat the GPU as a generic device by using a Windows Driver Model (WDM) driver, that launch time dropped to 2.5μs. This coupled with the fact that a WDM driver is necessary to use Tesla cards in a Windows Remote Desktop Protocol environment (as any Folding @Home junkie can tell you, RDP sessions can’t access the GPU through WDDM) resulted in the birth of TCC mode."
Ghost0210:
Looks good - a massive drop in time to launch kernels, shame it's only available for Tesla GPUs at the moment.
Hopefully NV will release a similar driver for at least the Fermi cards, if not all the current cards.
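To put the quoted launch times (30μs under WDDM, 2.5μs with the WDM/TCC driver) in perspective, here is a rough back-of-envelope sketch. It assumes every launch pays the full latency one after another, with none of it hidden by asynchronous queueing, and the launch count is a made-up figure for illustration only.

```python
# Sketch only: treats launch latency as a fixed, fully serialised cost.
# Real applications can overlap or batch launches, so this is an upper bound.

WDDM_LAUNCH_US = 30.0  # per-launch figure from the AnandTech quote
TCC_LAUNCH_US = 2.5    # per-launch figure from the AnandTech quote


def launch_overhead_seconds(num_launches, per_launch_us):
    """Total time spent on launch latency alone, in seconds."""
    return num_launches * per_launch_us / 1e6


def overhead_ratio(slow_us, fast_us):
    """How many times more launch latency the slower driver model costs."""
    return slow_us / fast_us


launches = 100_000  # hypothetical number of kernel launches in one task

print("WDDM overhead (s):", launch_overhead_seconds(launches, WDDM_LAUNCH_US))
print("TCC overhead (s): ", launch_overhead_seconds(launches, TCC_LAUNCH_US))
print("ratio:", overhead_ratio(WDDM_LAUNCH_US, TCC_LAUNCH_US))
```

The ratio between the two quoted figures is 12x, so the more small kernels a task launches, the more the driver model matters.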
Jason G:
Yeah, the OmegaDrivers.net guy looks like he's broke and struggling to work out Win7 drivers too (none for Win7 available when you read further in).