[Split] PowerSpectrum Unit Test

Jason G:
Thanks Steve!
   Now your increased core speed is showing in the improved 'worst case' speedup over mine (your 10.9 vs. my 8.1 GFlops).

GTX480 (watercooled)
Average, peak calcs, thread-count heuristic: OK
    worst case speedup: ~53%   ( 1.53x )
    best case speedup:   ~119%  ( 2.19x )

Ghost0210:

--- Quote from: Jason G on 29 Nov 2010, 02:43:24 pm ---
Nice that my tweaking works even faster on XP, but I'm starting to hope MS include some sort of video subsystem fixes in SP1 for Win7x64  :D


--- End quote ---

Just re-ran the Mod5 test on my GTX465 on Win7 x64 SP1 v.721 RC,
getting the same results as before:

Device: GeForce GTX 465, 1215 MHz clock, 994 MB memory.
Compute capability 2.0
Compiled with CUDA 3020.
                PowerSpectrum+summax Unit test #5
Stock:
 PwrSpec<    64>   16.0 GFlops   63.9 GB/s   0.0ulps

 SumMax (    64)    1.3 GFlops    5.2 GB/s
Every ifft average & peak OK

 PS+SuMx(    64)    4.1 GFlops   16.5 GB/s


GetPowerSpectrum() choice for Opt1: 256 thrds/block
    256 threads:       23.1 GFlops   92.5 GB/s 121.7ulps


Opt1 (PSmod3+SM): 256 thrds/block
  256 threads, fftlen 64: (worst case: full summax copy)
         6.0 GFlops   24.2 GB/s 121.7ulps
Every ifft average & peak OK
  256 threads, fftlen 64: (best case, nothing to update)
         8.7 GFlops   35.4 GB/s 121.7ulps
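
For anyone wanting to compare these figures with the "worst case / best case speedup" lines posted earlier, here is a rough sketch of the arithmetic. It assumes the speedup is simply the Opt1 throughput divided by the stock PS+SuMx throughput at the same fftlen; the unit test may derive it differently, and the function name is made up for illustration.

```python
def speedup(stock_gflops, opt_gflops):
    """Return (multiplier, percent gain) of opt over stock throughput.

    Assumption: speedup = opt / stock, which may not match how the
    unit test itself computes its speedup lines.
    """
    ratio = opt_gflops / stock_gflops
    return ratio, (ratio - 1.0) * 100.0


# Figures copied from the GTX 465 run above (fftlen 64).
stock = 4.1   # PS+SuMx(64), stock
worst = 6.0   # Opt1, full summax copy
best = 8.7    # Opt1, nothing to update

for label, gflops in (("worst", worst), ("best", best)):
    ratio, pct = speedup(stock, gflops)
    print(f"{label} case speedup: ~{pct:.0f}%  ( {ratio:.2f}x )")
```

Under that assumption the GTX 465 run works out to roughly 1.46x worst case and 2.12x best case.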

Jason G:
Interesting.  In the meantime I also managed to verify that a 32-bit versus 64-bit executable yielded no discernible performance difference here ( since it's GPU hard coded anyway  ;) )

So IMO we're left with either WinXP32's simpler driver model (no DirectX 10+ support) or WDDM stuff going on in Win7 as the explanation.  I wonder if there's a way to turn off more stuff in Win7x64, video subsystem-wise.

[Edit:] Hmmm....
http://www.anandtech.com/show/3924/nvidia-announces-parallel-nsight-15-cuda-toolkit-32

"Compared to the old XPDM, WDDM was a big step up for GPU usage on Windows, but only for graphical purposes. With Windows’ iron-fisted control over the GPU and a focus on task scheduling for responsiveness over performance, it wasn’t ideal for GPGPU purposes. Case in point, with a WDDM driver NVIDIA was finding it took 30μs for a kernel to be launched, but if they had Windows treat the GPU as a generic device by using a Windows Driver Model (WDM) driver, that launch time dropped to 2.5μs. This coupled with the fact that a WDM driver is necessary to use Tesla cards in a Windows Remote Desktop Protocol environment (as any Folding @Home junkie can tell you, RDP sessions can’t access the GPU through WDDM) resulted in the birth of TCC mode."
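
To put the quoted launch times in perspective, here is a small back-of-the-envelope sketch. The 30 μs and 2.5 μs figures come from the article; the launch count is purely hypothetical and is not taken from any real application.

```python
# Per-launch latencies quoted in the AnandTech article (microseconds).
WDDM_LAUNCH_US = 30.0
TCC_LAUNCH_US = 2.5


def launch_overhead_ms(launches, per_launch_us):
    """Total time spent just launching kernels, in milliseconds."""
    return launches * per_launch_us / 1000.0


# Hypothetical workload: 10,000 kernel launches (illustrative only).
launches = 10_000

wddm_ms = launch_overhead_ms(launches, WDDM_LAUNCH_US)
tcc_ms = launch_overhead_ms(launches, TCC_LAUNCH_US)

print(f"WDDM: {wddm_ms:.0f} ms, TCC: {tcc_ms:.0f} ms, "
      f"ratio {WDDM_LAUNCH_US / TCC_LAUNCH_US:.0f}x")
```

That is a 12x difference per launch, so the shorter each kernel runs, the more of the total time goes to launch overhead under WDDM.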

Ghost0210:
Looks good - a massive drop in kernel launch time. Shame it's only available for Tesla GPUs at the moment.
Hopefully NV will release a similar driver for at least the Fermi cards, if not all the current cards.

Jason G:
Yeah, the OmegaDrivers.net guy looks like he's broke and struggling to work out Win7 drivers too (none for Win7 are available, when you read further in).
