[Split] PowerSpectrum Unit Test
			arkayn:
			
			Got through mod 2 just fine, but now it crashes on mod 3 at 512 threads.
I even set the clocks to 505/1010/1350 just to check.
It also crashes at 800/1600/1800.
		
			Jason G:
			
			Hmm, don't know why, weird. I'll look at mod3's differences from mod2 (there aren't many). Maybe some sort of driver bug? It runs on XP32 here, but that's only a 260, not a Fermi.
I'd try a 263.06 driver clean install & see if that helps.
Can anyone else report crashing out on Mod3? Looks like Mod1 (256 threads) will be the useful technique on Fermi cards anyway, but if there is some issue with Mod3 it'd be nice to find & fix it for a fair comparison.
[A bit later:] Might have found something, will try to adjust mod3 & update later. @arkayn: :o why is your card the only one that tells me when I do something wrong?
		
			Jason G:
			
			Updated first post:
--- Quote ---[Updated] to PowerSpectrum Unit Test #4
Mod1: no changes
Mod2: no changes
Mod3: Tidied up & ironed out a bug that so far only manifests on Arkayn's card :o.  Could be a smidgen faster.
--- End quote ---
Thanks Arkayn for picking up my bugs.  Still no idea why yours is extra fussy, but it's very handy at the moment.
		
			M_M:
			
			Mod3 performance improved in the latest PS build...
Device: GeForce GTX 460, 810 MHz clock, 993 MB memory.
Compute capability 2.1
Compiled with CUDA 3020.
                PowerSpectrum Unit Test #4
Stock GetPowerSpectrum():
     64 threads:       14.7 GFlops    5.9 GB/s   0.0ulps
GetPowerSpectrum() mod 1: (made Fermi & Pre-Fermi match in accuracy.)
     32 threads:        8.2 GFlops    3.3 GB/s 121.7ulps
     64 threads:       14.6 GFlops    5.8 GB/s 121.7ulps
    128 threads:       22.3 GFlops    8.9 GB/s 121.7ulps
    256 threads:       26.2 GFlops   10.5 GB/s 121.7ulps
GetPowerSpectrum() mod 2 (fixed, but slow):
     32 threads:        9.4 GFlops    3.8 GB/s   0.0ulps
     64 threads:       12.2 GFlops    4.9 GB/s   0.0ulps
    128 threads:       14.7 GFlops    5.9 GB/s   0.0ulps
    256 threads:       14.3 GFlops    5.7 GB/s   0.0ulps
GetPowerSpectrum() mod 3: (As with mod1, +threads & split loads)
     32 threads:        8.2 GFlops    3.3 GB/s 121.7ulps
     64 threads:       14.7 GFlops    5.9 GB/s 121.7ulps
    128 threads:       22.3 GFlops    8.9 GB/s 121.7ulps
    256 threads:       26.1 GFlops   10.4 GB/s 121.7ulps
    512 threads:       25.7 GFlops   10.3 GB/s 121.7ulps
   1024 threads:       18.3 GFlops    7.3 GB/s 121.7ulps
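
For anyone wondering about the ulps column: the thread doesn't show how the unit test computes it, so the following is only a rough sketch, assuming it reports an average distance in "units in the last place" between each single-precision result and a reference value. The function names here are made up for illustration and are not taken from the test's source.

```python
import struct

def float32_to_ordered_int(x: float) -> int:
    """Round x to float32 and map its bit pattern to an integer that
    increases monotonically with the float value."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Floats with the sign bit set are mapped to negative integers so that
    # +0.0 and -0.0 coincide and ordering is preserved across zero.
    if bits & 0x80000000:
        return -(bits & 0x7FFFFFFF)
    return bits

def ulp_distance(a: float, b: float) -> int:
    """Number of representable float32 steps between a and b."""
    return abs(float32_to_ordered_int(a) - float32_to_ordered_int(b))

def mean_ulp_error(result, reference) -> float:
    """Average ULP distance over paired values (hypothetical metric)."""
    pairs = list(zip(result, reference))
    return sum(ulp_distance(r, e) for r, e in pairs) / len(pairs)
```

Under that assumption, 0.0ulps would mean the output matches the reference bit for bit, while a non-zero figure would mean the results differ in their low-order bits on average.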
		
			Jason G:
			
			hehe thanks. The 460 with stock code is starting to look a bit anaemic next to all those 20+ figures.
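
To put a rough number on that, here is a small sketch that divides the best GFlops figure for each mod in the GTX 460 run posted above by the stock 64-thread figure. The values are copied from that output; the helper name is just for illustration.

```python
# GFlops figures copied from the GTX 460 results posted in this thread.
STOCK_GFLOPS = 14.7  # stock GetPowerSpectrum(), 64 threads

BEST_GFLOPS = {
    "mod 1 (256 threads)": 26.2,
    "mod 2 (128 threads)": 14.7,
    "mod 3 (256 threads)": 26.1,
}

def speedup(gflops: float, baseline: float = STOCK_GFLOPS) -> float:
    """Throughput relative to the stock kernel."""
    return gflops / baseline

for name, gflops in BEST_GFLOPS.items():
    print(f"{name}: {speedup(gflops):.2f}x stock")
```

On these numbers mod 1 and mod 3 both come out at roughly 1.78x the stock kernel, and mod 2 is level with it.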
		