[Split] PowerSpectrum Unit Test
Jason G:
@Heinz, something broke in that source you used, investigating.
Ghost0210:
I've been playing with a couple of other driver versions (263.xx & 256.xx) as well, and there is no improvement over the figures from the current 260.99 WHQL release drivers.
Was worth doing this just to get an XP machine up and running again - although I'm struggling to remember where anything is.....
Jason G:
--- Quote from: Ghost on 04 Dec 2010, 04:44:04 pm ---Was worth doing this just to get an XP machine up and running again - although I'm struggling to remember where anything is.....
--- End quote ---
Yep, going back is a challenge after adapting. Now that I'm pretty confident the memory transfers are the main factor, I'm hopeful a certain 'trick' may squash the difference. We'll see.
[Edit:] Updated first post:
--- Quote ---Update: powerspectrum Test 6, pinned memory
- does it improve 'worst case' optimisation on WDDM versus XPDM ?
- or does it improve on both OSes the same ? (or neither, Test5 remains for comparison)
--- End quote ---
Will use pinned memory, for Opt1, on GPUs that can do so.
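For anyone unfamiliar with the term, here is a minimal sketch of what "pinned memory on GPUs that can do so" might look like with the CUDA runtime API. It is not the unit test's actual code: `HostBuffer`, `alloc_host_buffer`, `free_host_buffer` and `copy_from_device_demo` are hypothetical names made up for illustration. The idea is to ask for page-locked host memory first and quietly fall back to ordinary pageable memory if that request fails.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper type: a host buffer that remembers how it was allocated.
struct HostBuffer {
    float* data;
    size_t bytes;
    bool   pinned;
};

// Try page-locked (pinned) host memory first; fall back to malloc if
// the runtime or device refuses the request.
HostBuffer alloc_host_buffer(size_t count) {
    HostBuffer buf = {nullptr, count * sizeof(float), false};
    void* p = nullptr;
    if (cudaMallocHost(&p, buf.bytes) == cudaSuccess) {
        buf.data   = static_cast<float*>(p);
        buf.pinned = true;
    } else {
        cudaGetLastError();  // reset the error state before carrying on
        buf.data = static_cast<float*>(std::malloc(buf.bytes));
    }
    return buf;
}

// Release the buffer with the call that matches its allocation.
void free_host_buffer(HostBuffer& buf) {
    if (buf.pinned) cudaFreeHost(buf.data);
    else            std::free(buf.data);
    buf.data = nullptr;
}

// Copy `count` zeroed floats from the device into the host buffer.
// Returns true if every CUDA call succeeded.
bool copy_from_device_demo(size_t count) {
    HostBuffer h = alloc_host_buffer(count);
    if (!h.data) return false;

    float* d = nullptr;
    bool ok = cudaMalloc((void**)&d, h.bytes) == cudaSuccess
           && cudaMemset(d, 0, h.bytes) == cudaSuccess
           && cudaMemcpy(h.data, d, h.bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    if (ok) std::printf("copied %zu bytes via %s host memory\n",
                        h.bytes, h.pinned ? "pinned" : "pageable");

    cudaFree(d);  // safe even if cudaMalloc failed and d is still null
    free_host_buffer(h);
    return ok;
}
```

Calling `copy_from_device_demo(1 << 20)` from a `main` would exercise both paths depending on the machine. Whether pinning actually narrows the WDDM versus XPDM gap is exactly what Test 6 is meant to find out, so treat this as an illustration of the mechanism only.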
Ghost0210:
Hi Jason,
Getting an error with the new build saying that cudart32_32_7.dll isn't present - is this meant to be in the .7z file?
ghost
arkayn:
Just to see if it would run, I made a copy of cudart32_32_16.dll, renamed it to cudart32_32_7.dll, and then ran the test:
Device: GeForce GTX 460, 1600 MHz clock, 768 MB memory.
Compute capability 2.1
Compiled with CUDA 3020.
PowerSpectrum+summax Unit test #6 (pinned mem)
Stock:
PwrSpec< 64> 12.9 GFlops 51.4 GB/s 0.0ulps
SumMax ( 64) 1.0 GFlops 4.4 GB/s
Every ifft average & peak OK
PS+SuMx( 64) 3.4 GFlops 13.6 GB/s
GetPowerSpectrum() choice for Opt1: 256 thrds/block
256 threads: 19.4 GFlops 77.5 GB/s 121.7ulps
Opt1 (PSmod3+SM): 256 thrds/block
PowerSpectrumSumMax array pinned in host memory.
256 threads, fftlen 64: (worst case: full summax copy)
6.0 GFlops 24.4 GB/s 121.7ulps
Every ifft average & peak OK
256 threads, fftlen 64: (best case, nothing to update)
7.0 GFlops 28.2 GB/s 121.7ulps