[Split] PowerSpectrum Unit Test

glennaxl:
Device: GeForce 9800 GT, 1750 MHz clock, 500 MB memory.
Compiled with CUDA 3020.
Stock GetPowerSpectrum():
     64 threads:       13.6 GFlops    5.4 GB/s 1183.3ulps


GetPowerSpectrum() mod 1:
     32 threads:       12.1 GFlops    4.9 GB/s 1183.3ulps
     64 threads:       13.7 GFlops    5.5 GB/s 1183.3ulps
    128 threads:       13.5 GFlops    5.4 GB/s 1183.3ulps
    256 threads:       13.4 GFlops    5.3 GB/s 1183.3ulps


GetPowerSpectrum() mod 2:
     32 threads:        5.3 GFlops    2.1 GB/s 1183.3ulps
     64 threads:        7.0 GFlops    2.8 GB/s 1183.3ulps
    128 threads:        7.1 GFlops    2.8 GB/s 1183.3ulps
    256 threads:        6.8 GFlops    2.7 GB/s 1183.3ulps

Jason G:
If anyone's wondering what this figure is:


--- Quote from: glennaxl on 18 Nov 2010, 08:38:12 pm ---... 1183.3ulps ...
--- End quote ---

It's a measure of the precision against a CPU double precision reference power spectrum.

Fermis get 0 ulps total deviation (most accurate) because they default to IEEE-754 compliance, whereas earlier generations consistently get 1183.3 because they use a fast single precision implementation by default.

I can either use special intrinsic functions on the older cards to force compliance, at a speed penalty, or allow the Fermis to use the faster (less accurate) computation. Will see. 1183.3 'Units of Least Precision' isn't much total deviation from the double precision reference over the 1048576 point data set used in multibeam.

An ulp is defined here as:

--- Quote ---const float ulp =  1.192092896e-07f;
--- End quote ---
... about 0.00000012 ... and there'd be variation of roughly that size from the double precision CPU reference scattered throughout the dataset.
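
A minimal C sketch of how a total like that could be accumulated, for anyone who wants to see the idea in code. This is not taken from the unit test: the function names (`power_f32`, `power_f64`, `total_deviation_ulps`) are hypothetical, the single precision result is computed on the CPU as a stand-in for the GPU output, and it assumes the total is a plain sum of absolute differences divided by `ulp`. The real test may normalise differently.

```c
#include <assert.h>
#include <stddef.h>

/* Same constant as quoted above; it is the single precision machine epsilon, 2^-23. */
const float ulp = 1.192092896e-07f;

/* Single precision power of one complex sample: re^2 + im^2.
   Stand-in for what a GPU kernel would produce. */
float power_f32(float re, float im)
{
    return re * re + im * im;
}

/* Double precision reference for the same sample. */
double power_f64(float re, float im)
{
    return (double)re * (double)re + (double)im * (double)im;
}

/* ASSUMPTION: total deviation is the sum over all points of
   |single - double|, expressed in units of `ulp`. */
double total_deviation_ulps(const float *re, const float *im, size_t n)
{
    double total = 0.0;

    assert(re != NULL && im != NULL);

    for (size_t i = 0; i < n; ++i) {
        double diff = (double)power_f32(re[i], im[i]) - power_f64(re[i], im[i]);
        if (diff < 0.0) {
            diff = -diff;   /* absolute value without needing <math.h> */
        }
        total += diff / ulp;
    }
    return total;
}
```

With a measure like this, a result that matches the double precision reference exactly at every point gives a total of 0, and any rounding differences add up across the whole data set.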

Jason

Jason G:
@Arkayn: I looked through some results I have, and I have a GTX460 set that ran to completion at stock speeds (using driver 263.06). Might be pushing the memory OC a bit on yours?

arkayn:
I think it is at 800/1600 right now, runs Collatz just fine at that speed.

I just took it down to stock speed as well as the lowest setting that Afterburner allowed and it still crashed the program.

This is on an XP-64 Pro machine though.

Driver is the 263.06, do I need the toolkit installed as well?

Jason G:

--- Quote from: arkayn on 18 Nov 2010, 10:05:35 pm ---Driver is the 263.06, do I need the toolkit installed as well?

--- End quote ---

Nope, it's definitely something weird. Bear in mind that those upper kernels are pushing Fermi's memory subsystem harder than any BOINC science app has to date that I know of, so I doubt Collatz or any other existing app would be a fair comparison (except maybe Furmark, which is just a savage thing to do to a graphics card).

If it runs this at stock OK, but not at 800/1600, then it might be Collatz stable, but is unlikely to be future X series stable.  My current feeling is that the memory frequency is the culprit, rather than the core.

(If it doesn't run correctly at stock either, then more guessing to do  ;) )

[Later:] At this stage I'm assuming some sort of bug in Mod2, so don't go pulling things to bits just yet  ;)
