[Split] PowerSpectrum Unit Test

arkayn:
Got through mod 2 just fine, now it crashes on mod 3 at 512 threads.

I even set the clocks to 505/1010/1350 just to check.

Also crashes at 800/1600/1800

Jason G:
mmm, don't know why, weird. Will look at mod3's differences to mod2 (not much). Maybe some sort of driver bug? It runs on XP32 here, but that's only a 260, not a Fermi.
I'd try a 263.06 driver clean install & see if that helps.

Can anyone else report crashing out on Mod3? Looks like Mod1 (256 threads) will be the useful technique on Fermi cards anyway, but if there is some issue with Mod3 it'd be nice to find & fix it for a fair comparison.

[A bit later:] Might have found something, will try adjusting mod3 & update later. @arkayn: :o why is your card the only one that tells me when I do something wrong?

Jason G:
Updated first post:

--- Quote ---[Updated] to PowerSpectrum Unit Test #4
Mod1: no changes
Mod2: no changes
Mod3: Tidy up & ironed out a bug that only manifests on Arkayn's card so far :o.  Could be a smidgen faster.
--- End quote ---

Thanks Arkayn for picking up my bugs.  Still no idea why yours is extra fussy, but it's very handy at the moment.

M_M:
Mod3 performance improved in latest PS build...

Device: GeForce GTX 460, 810 MHz clock, 993 MB memory.
Compute capability 2.1
Compiled with CUDA 3020.
                PowerSpectrum Unit Test #4
Stock GetPowerSpectrum():
     64 threads:       14.7 GFlops    5.9 GB/s   0.0ulps


GetPowerSpectrum() mod 1: (made Fermi & Pre-Fermi match in accuracy.)
     32 threads:        8.2 GFlops    3.3 GB/s 121.7ulps
     64 threads:       14.6 GFlops    5.8 GB/s 121.7ulps
    128 threads:       22.3 GFlops    8.9 GB/s 121.7ulps
    256 threads:       26.2 GFlops   10.5 GB/s 121.7ulps


GetPowerSpectrum() mod 2 (fixed, but slow):
     32 threads:        9.4 GFlops    3.8 GB/s   0.0ulps
     64 threads:       12.2 GFlops    4.9 GB/s   0.0ulps
    128 threads:       14.7 GFlops    5.9 GB/s   0.0ulps
    256 threads:       14.3 GFlops    5.7 GB/s   0.0ulps


GetPowerSpectrum() mod 3: (As with mod1, +threads & split loads)
     32 threads:        8.2 GFlops    3.3 GB/s 121.7ulps
     64 threads:       14.7 GFlops    5.9 GB/s 121.7ulps
    128 threads:       22.3 GFlops    8.9 GB/s 121.7ulps
    256 threads:       26.1 GFlops   10.4 GB/s 121.7ulps
    512 threads:       25.7 GFlops   10.3 GB/s 121.7ulps
   1024 threads:       18.3 GFlops    7.3 GB/s 121.7ulps
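
For anyone wondering about the ulps column: "ulps" is units in the last place, i.e. how many representable float32 steps a result sits from a reference value. The unit test's own measurement code isn't shown in this thread, so the following is only a rough sketch under two assumptions: that the power spectrum is re*re + im*im per complex sample, and that error is counted against a double-precision reference. All function names here are hypothetical and not taken from the test.

```python
import random
import struct


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_ordinal(x: float) -> int:
    """Map a float32 value to an integer that increases with x."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Negative floats have the sign bit set; mirror them below zero.
    return bits if bits < 0x80000000 else 0x80000000 - bits


def ulp_distance(a: float, b: float) -> int:
    """Number of representable float32 steps between a and b."""
    return abs(f32_ordinal(a) - f32_ordinal(b))


def power_f32(re: float, im: float) -> float:
    """re*re + im*im with every intermediate rounded to float32."""
    re, im = to_f32(re), to_f32(im)
    return to_f32(to_f32(re * re) + to_f32(im * im))


def power_ref(re: float, im: float) -> float:
    """Same inputs, computed in double and rounded to float32 only at the end."""
    re, im = to_f32(re), to_f32(im)
    return to_f32(re * re + im * im)


def max_ulp_error(n: int = 10000, seed: int = 0) -> int:
    """Worst-case ulp distance over n pseudo-random complex samples."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(n):
        re, im = rng.uniform(-100.0, 100.0), rng.uniform(-100.0, 100.0)
        worst = max(worst, ulp_distance(power_f32(re, im), power_ref(re, im)))
    return worst


if __name__ == "__main__":
    print("max error (ulps):", max_ulp_error())
```

This only emulates float32 rounding on the CPU. It says nothing about why the mod 1 and mod 3 rows above report 121.7 ulps while stock and mod 2 report 0.0.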

Jason G:
hehe thanks. The 460 with stock code is starting to look a bit anaemic next to all those 20+ figures.
