[Split] PowerSpectrum Unit Test

Jason G:
@glennaxl: I have updated the PowerSpectrumTest8 archive attached to the first post to dial back the borderline kernels a bit (for now; I will dig deeper into those later if needed).

Jason

glennaxl:

--- Quote from: Jason G on 23 Dec 2010, 06:23:47 am ---@glennaxl: I have updated the PowerSpectrumTest8 archive attached to the first post to dial back the borderline kernels a bit (for now; I will dig deeper into those later if needed).

Jason

--- End quote ---
Yeah, my GTX 295's VRAM is overclocked to 1080 MHz, up from 999 MHz.

The new test8 results are all good now. Perfect!  ;)

Jason G:

--- Quote from: glennaxl on 23 Dec 2010, 07:25:44 am ---The new test8 results are all good now. Perfect!  ;)

--- End quote ---

Good, good. I will keep those ones dialled in a bit then, which leaves room for some fine tuning later. It seems that by cramming that much data through we're beginning to find weak spots, so I will look at moving on to FFT integration.
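For readers following along, here is a rough CPU-side sketch of what a "power spectrum + sum/max" pass might compute. This is an assumption based only on the test's name, not the actual kernel code: `power_spectrum`, `sum_max`, `ps_summax` and the fixed-size segment layout are all hypothetical names and choices made for illustration.

```python
# Hypothetical reference for a "power spectrum + sum/max" pass.
# Assumption: the input is a list of complex FFT bins, processed in
# fixed-size segments. None of these names come from the test program.

def power_spectrum(bins):
    """Return |z|^2 = re^2 + im^2 for each complex bin."""
    return [z.real * z.real + z.imag * z.imag for z in bins]


def sum_max(power, segment_len):
    """Return (sum, max, index_of_max) for each full segment of `power`."""
    results = []
    for start in range(0, len(power) - segment_len + 1, segment_len):
        segment = power[start:start + segment_len]
        peak = max(segment)
        results.append((sum(segment), peak, segment.index(peak)))
    return results


def ps_summax(bins, segment_len):
    """Chain the two steps: power per bin, then sum/max per segment."""
    return sum_max(power_spectrum(bins), segment_len)


if __name__ == "__main__":
    # Four bins, segments of two bins each.
    print(ps_summax([3 + 4j, 1 + 0j, 0 + 2j, 1 + 1j], 2))
```

A slow reference like this is mainly useful as a baseline to compare optimised kernel output against, within some tolerance.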

_heinz:
Hi Jason,
Device: GeForce GTX 470, 810 MHz clock, 1248 MB memory.
Stock, best result:
 PS+SuMx( 32768) [OK]   12.7 GFlops   50.7 GB/s

Opt, best result:
                        worst case              best case
                   GFlps  GB/s ulps         GFlps  GB/s ulps
 PS+SuMx( 32768)   16.4   65.7 121.7 [OK]   27.8  111.4 121.7

All other sizes are OK.
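To put the GTX 470 numbers above in perspective, the ratio of Opt to Stock throughput at size 32768 can be worked out directly. The figures below are copied from the post; the helper name `speedup` is only illustrative.

```python
# Figures from the GTX 470 result above, PS+SuMx(32768).
STOCK_GFLOPS = 12.7
OPT_WORST_GFLOPS = 16.4
OPT_BEST_GFLOPS = 27.8


def speedup(new, old):
    """Ratio of new throughput to old throughput."""
    return new / old


best = speedup(OPT_BEST_GFLOPS, STOCK_GFLOPS)
worst = speedup(OPT_WORST_GFLOPS, STOCK_GFLOPS)

if __name__ == "__main__":
    print(f"best case:  {best:.2f}x over stock")
    print(f"worst case: {worst:.2f}x over stock")
```

So the optimised kernel is roughly 1.3x to 2.2x the stock one at this size, depending on which case is hit.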

_heinz:
Hi Jason,
Excellent performance on the ION, so it is worth posting the full result:
PowerSpectrumTest8.exe -device 0

Device: ION, 1161 MHz clock, 242 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
                PowerSpectrum+summax Unit test #8 (Sanity Check)
Stock:
 PS+SuMx(     8) [OK]    0.4 GFlops    1.6 GB/s
 PS+SuMx(    16) [OK]    0.3 GFlops    1.5 GB/s
 PS+SuMx(    32) [OK]    0.3 GFlops    1.1 GB/s
 PS+SuMx(    64) [OK]    0.4 GFlops    1.8 GB/s
 PS+SuMx(   128) [OK]    0.7 GFlops    2.7 GB/s
 PS+SuMx(   256) [OK]    0.8 GFlops    3.4 GB/s
 PS+SuMx(   512) [OK]    1.1 GFlops    4.3 GB/s
 PS+SuMx(  1024) [OK]    1.1 GFlops    4.4 GB/s
 PS+SuMx(  2048) [OK]    1.2 GFlops    4.9 GB/s
 PS+SuMx(  4096) [OK]    1.2 GFlops    4.8 GB/s
 PS+SuMx(  8192) [OK]    1.3 GFlops    5.2 GB/s
 PS+SuMx( 16384) [OK]    1.3 GFlops    5.1 GB/s
 PS+SuMx( 32768) [OK]    1.3 GFlops    5.4 GB/s
 PS+SuMx( 65536) [OK]    1.4 GFlops    5.4 GB/s
 PS+SuMx(131072) [OK]    1.4 GFlops    5.6 GB/s

Opt1: 64 thrds/block
                        worst case              best case
                   GFlps  GB/s ulps         GFlps  GB/s ulps
 PS+SuMx(     8)    0.6    2.5 121.7 [OK]    0.7    2.9 121.7
 PS+SuMx(    16)    0.6    2.4 121.7 [OK]    0.6    2.7 121.7
 PS+SuMx(    32)    0.6    2.3 121.7 [OK]    0.6    2.4 121.7
 PS+SuMx(    64)    0.7    2.8 121.7 [OK]    0.7    3.0 121.7
 PS+SuMx(   128)    0.7    2.7 121.7 [OK]    0.7    3.0 121.7
 PS+SuMx(   256)    0.9    3.5 121.7 [OK]    1.0    3.9 121.7
 PS+SuMx(   512)    1.1    4.5 121.7 [OK]    1.2    5.0 121.7
 PS+SuMx(  1024)    1.2    4.6 121.7 [OK]    1.3    5.1 121.7
 PS+SuMx(  2048)    1.3    5.3 121.7 [OK]    1.5    5.9 121.7
 PS+SuMx(  4096)    1.3    5.0 121.7 [OK]    1.4    5.6 121.7
 PS+SuMx(  8192)    1.4    5.5 121.7 [OK]    1.5    6.1 121.7
 PS+SuMx( 16384)    1.3    5.4 121.7 [OK]    1.5    6.0 121.7
 PS+SuMx( 32768)    1.4    5.7 121.7 [OK]    1.6    6.4 121.7
 PS+SuMx( 65536)    1.4    5.8 121.7 [OK]    1.6    6.5 121.7
 PS+SuMx(131072)    1.2    4.8 121.7 [OK]    1.7    6.6 121.7

.
Done
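For anyone wanting to compare runs like the one above across cards, here is a minimal sketch of a log parser. It assumes the two line layouts seen in this output (Stock lines ending in GFlops and GB/s, Opt lines with worst-case and best-case columns) and only handles lines marked `[OK]`, since no other status appears in this post. The names `parse_log` and `best_case_speedups` are made up for the example.

```python
import re

# Stock line:  PS+SuMx(  N) [OK]  <GFlops> GFlops  <GB/s> GB/s
STOCK_RE = re.compile(
    r"PS\+SuMx\(\s*(\d+)\)\s+\[OK\]\s+([\d.]+) GFlops\s+([\d.]+) GB/s"
)
# Opt line:    PS+SuMx(  N)  worst(GFlps GB/s ulps) [OK] best(GFlps GB/s ulps)
OPT_RE = re.compile(
    r"PS\+SuMx\(\s*(\d+)\)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+\[OK\]"
    r"\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)"
)


def parse_log(text):
    """Return ({size: stock GFlops}, {size: opt best-case GFlops})."""
    stock, opt = {}, {}
    for line in text.splitlines():
        m = STOCK_RE.search(line)
        if m:
            stock[int(m.group(1))] = float(m.group(2))
            continue
        m = OPT_RE.search(line)
        if m:
            opt[int(m.group(1))] = float(m.group(5))  # best-case GFlops
    return stock, opt


def best_case_speedups(stock, opt):
    """Opt best-case GFlops divided by Stock GFlops, per size."""
    return {n: opt[n] / stock[n] for n in sorted(stock) if n in opt}


# Four lines copied from the ION output above.
SAMPLE = """\
 PS+SuMx(     8) [OK]    0.4 GFlops    1.6 GB/s
PS+SuMx(131072) [OK]    1.4 GFlops    5.6 GB/s
 PS+SuMx(     8)    0.6    2.5 121.7 [OK]    0.7    2.9 121.7
PS+SuMx(131072)    1.2    4.8 121.7 [OK]    1.7    6.6 121.7
"""

if __name__ == "__main__":
    stock, opt = parse_log(SAMPLE)
    for size, ratio in best_case_speedups(stock, opt).items():
        print(f"{size:>7}: {ratio:.2f}x")
```

Note that the printed figures carry only one decimal place, so ratios at the small sizes are coarse.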
