[Split] PowerSpectrum Unit Test
arkayn:
Got through mod 2 just fine; now it crashes on mod 3 at 512 threads.
I even set the clocks to 505/1010/1350 just to check.
Also crashes at 800/1600/1800
Jason G:
Hmm, don't know why, weird. Will look at how mod3 differs from mod2 (not much). Maybe some sort of driver bug? It runs on XP32 here, but that's only a 260, not a Fermi.
I'd try a 263.06 driver clean install & see if that helps.
Can anyone else report crashing out on Mod3? Looks like Mod1 (256 threads) will be the useful technique on Fermi cards anyway, but if there is some issue with Mod3 it'd be nice to find & fix it for a fair comparison.
[A bit later:] Might have found something, will try to adjust mod3 & update later. @arkayn: :o why is your card the only one that tells me when I do something wrong?
Jason G:
Updated first post:
--- Quote ---
[Updated] to PowerSpectrum Unit Test #4
Mod1: no changes
Mod2: no changes
Mod3: Tidy up & ironed out a bug that only manifests on Arkayn's card so far :o. Could be a smidgen faster.
--- End quote ---
Thanks Arkayn for picking up my bugs. Still no idea why yours is extra fussy, but it's very handy at the moment.
M_M:
Mod3 performance improved in the latest PS build...
Device: GeForce GTX 460, 810 MHz clock, 993 MB memory.
Compute capability 2.1
Compiled with CUDA 3020.
PowerSpectrum Unit Test #4
Stock GetPowerSpectrum():
64 threads: 14.7 GFlops 5.9 GB/s 0.0ulps
GetPowerSpectrum() mod 1: (made Fermi & Pre-Fermi match in accuracy.)
32 threads: 8.2 GFlops 3.3 GB/s 121.7ulps
64 threads: 14.6 GFlops 5.8 GB/s 121.7ulps
128 threads: 22.3 GFlops 8.9 GB/s 121.7ulps
256 threads: 26.2 GFlops 10.5 GB/s 121.7ulps
GetPowerSpectrum() mod 2 (fixed, but slow):
32 threads: 9.4 GFlops 3.8 GB/s 0.0ulps
64 threads: 12.2 GFlops 4.9 GB/s 0.0ulps
128 threads: 14.7 GFlops 5.9 GB/s 0.0ulps
256 threads: 14.3 GFlops 5.7 GB/s 0.0ulps
GetPowerSpectrum() mod 3: (As with mod1, +threads & split loads)
32 threads: 8.2 GFlops 3.3 GB/s 121.7ulps
64 threads: 14.7 GFlops 5.9 GB/s 121.7ulps
128 threads: 22.3 GFlops 8.9 GB/s 121.7ulps
256 threads: 26.1 GFlops 10.4 GB/s 121.7ulps
512 threads: 25.7 GFlops 10.3 GB/s 121.7ulps
1024 threads: 18.3 GFlops 7.3 GB/s 121.7ulps
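For anyone wondering what the ulps column measures, here is a minimal sketch in Python. It assumes the kernel computes the squared magnitude of each complex bin (which the name GetPowerSpectrum suggests but the thread does not spell out) and that error is counted in float32 units in the last place against a reference value. How the unit test actually aggregates the error is not stated here, and all function names below are hypothetical.

```python
import struct

def f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_ordinal(x):
    """Map a float32 value to an integer so that adjacent floats differ by 1."""
    bits = struct.unpack("<i", struct.pack("<f", x))[0]
    # Negative floats sort in reverse bit order; fold them onto one number line.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)

def ulp_distance(a, b):
    """Number of representable float32 steps between a and b."""
    return abs(f32_ordinal(a) - f32_ordinal(b))

def power(re, im):
    """Assumed per-bin operation: re^2 + im^2, rounded to float32 at each step."""
    return f32(f32(re * re) + f32(im * im))

# Hypothetical comparison of one bin against a double-precision reference.
reference = f32(3.0 * 3.0 + 4.0 * 4.0)
error_ulps = ulp_distance(power(3.0, 4.0), reference)
```

A result of 0.0ulps would then mean bit-identical output to the reference, while a nonzero figure means the outputs differ by that many float32 steps under whatever averaging the test uses.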
Jason G:
hehe thanks. 460 with stock code is starting to look a bit anaemic, around all those 20+ figures
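To put a number on that comparison, the result lines from M_M's run can be parsed and the best mod 1 figure divided by the stock figure. This is only a sketch: the regular expression assumes the line layout shown in the post above, and the helper name parse_results is made up for illustration.

```python
import re

# Matches lines such as "256 threads: 26.2 GFlops 10.5 GB/s 121.7ulps".
LINE = re.compile(
    r"(\d+) threads:\s+([\d.]+) GFlops\s+([\d.]+) GB/s\s+([\d.]+)\s*ulps"
)

def parse_results(text):
    """Return {threads: (gflops, gb_per_s, ulps)} for each result line in text."""
    results = {}
    for m in LINE.finditer(text):
        results[int(m.group(1))] = (
            float(m.group(2)),
            float(m.group(3)),
            float(m.group(4)),
        )
    return results

# Figures copied from the GTX 460 run posted above.
stock = parse_results("64 threads: 14.7 GFlops 5.9 GB/s 0.0ulps")
mod1 = parse_results("""
32 threads: 8.2 GFlops 3.3 GB/s 121.7ulps
64 threads: 14.6 GFlops 5.8 GB/s 121.7ulps
128 threads: 22.3 GFlops 8.9 GB/s 121.7ulps
256 threads: 26.2 GFlops 10.5 GB/s 121.7ulps
""")

# Pick the thread count with the highest GFlops, then compare with stock.
best_threads = max(mod1, key=lambda t: mod1[t][0])
speedup = mod1[best_threads][0] / stock[64][0]
print(f"mod 1 best: {best_threads} threads, {speedup:.2f}x stock")
```

On these figures, mod 1 peaks at 256 threads with roughly 1.78 times the stock throughput.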