[Split] PowerSpectrum Unit Test

glennaxl:
**********
-device 0
**********
Device: GeForce GTX 295, 1476 MHz clock, 874 MB memory.
Compiled with CUDA 3020.
Stock GetPowerSpectrum():
     64 threads:       26.5 GFlops   10.6 GB/s 1183.3ulps


GetPowerSpectrum() mod 1:
     32 threads:       18.6 GFlops    7.4 GB/s 1183.3ulps
     64 threads:       26.5 GFlops   10.6 GB/s 1183.3ulps
    128 threads:       26.7 GFlops   10.7 GB/s 1183.3ulps
    256 threads:       26.7 GFlops   10.7 GB/s 1183.3ulps


GetPowerSpectrum() mod 2:
     32 threads:        5.3 GFlops    2.1 GB/s 1183.3ulps
     64 threads:        7.2 GFlops    2.9 GB/s 1183.3ulps
    128 threads:       10.6 GFlops    4.2 GB/s 1183.3ulps
    256 threads:       10.7 GFlops    4.3 GB/s 1183.3ulps


**********
-device 1
**********
Device: GeForce GTX 295, 1476 MHz clock, 873 MB memory.
Compiled with CUDA 3020.
Stock GetPowerSpectrum():
     64 threads:       25.8 GFlops   10.3 GB/s 1183.3ulps


GetPowerSpectrum() mod 1:
     32 threads:       17.9 GFlops    7.2 GB/s 1183.3ulps
     64 threads:       26.0 GFlops   10.4 GB/s 1183.3ulps
    128 threads:       26.1 GFlops   10.4 GB/s 1183.3ulps
    256 threads:       24.6 GFlops    9.8 GB/s 1183.3ulps


GetPowerSpectrum() mod 2:
     32 threads:        5.2 GFlops    2.1 GB/s 1183.3ulps
     64 threads:        7.1 GFlops    2.8 GB/s 1183.3ulps
    128 threads:       10.3 GFlops    4.1 GB/s 1183.3ulps
    256 threads:       10.6 GFlops    4.2 GB/s 1183.3ulps


**********
-device 2
**********
Device: GeForce GTX 260, 1487 MHz clock, 874 MB memory.
Compiled with CUDA 3020.
Stock GetPowerSpectrum():
     64 threads:       25.4 GFlops   10.2 GB/s 1183.3ulps


GetPowerSpectrum() mod 1:
     32 threads:       18.7 GFlops    7.5 GB/s 1183.3ulps
     64 threads:       25.6 GFlops   10.2 GB/s 1183.3ulps
    128 threads:       25.9 GFlops   10.4 GB/s 1183.3ulps
    256 threads:       25.9 GFlops   10.4 GB/s 1183.3ulps


GetPowerSpectrum() mod 2:
     32 threads:        5.2 GFlops    2.1 GB/s 1183.3ulps
     64 threads:        7.0 GFlops    2.8 GB/s 1183.3ulps
    128 threads:       10.3 GFlops    4.1 GB/s 1183.3ulps
    256 threads:       10.4 GFlops    4.1 GB/s 1183.3ulps

Jason G:
Hmm, I expected the GTX 295 results, per GPU, to be closer to half of a GTX 480.  That's something to investigate.  Maybe the memory subsystem on those 295s isn't as good, or requires some different handling. [Edit: actually, I suppose with stock code it is better than half a 480.]


--- Quote ---Device: GeForce GTX 480, 810 MHz clock, 1503 MB memory.
Compiled with CUDA 3020.
Stock GetPowerSpectrum():
     64 threads:       29.1 GFlops   11.6 GB/s   0.0ulps


GetPowerSpectrum() mod 1:
     32 threads:       17.6 GFlops    7.1 GB/s   0.0ulps
     64 threads:       28.9 GFlops   11.6 GB/s   0.0ulps
    128 threads:       40.5 GFlops   16.2 GB/s   0.0ulps
    256 threads:       44.0 GFlops   17.6 GB/s   0.0ulps


GetPowerSpectrum() mod 2:
     32 threads:       19.3 GFlops    7.7 GB/s   0.0ulps
     64 threads:       38.0 GFlops   15.2 GB/s   0.0ulps
    128 threads:       61.1 GFlops   24.5 GB/s   0.0ulps
    256 threads:       61.4 GFlops   24.6 GB/s   0.0ulps

--- End quote ---
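
As a quick check of the "half a GTX 480" comparison, here is a minimal sketch that divides the best GTX 295 (device 0) figure for each variant by the best GTX 480 figure. The numbers are copied from the results as they currently appear in this thread; the dictionary and function names are only illustrative.

```python
# Best GFlops per variant, copied from the results posted in this thread.
# GTX 295 figures are for device 0 (one of the card's two GPUs).
gtx295_dev0 = {"stock": 26.5, "mod 1": 26.7, "mod 2": 10.7}
gtx480 = {"stock": 29.1, "mod 1": 44.0, "mod 2": 61.4}


def fraction_of_480(variant):
    """Return one GTX 295 GPU's throughput as a fraction of the GTX 480's."""
    return gtx295_dev0[variant] / gtx480[variant]


for variant in gtx480:
    frac = fraction_of_480(variant)
    verdict = "above" if frac > 0.5 else "below"
    print(f"{variant}: {frac:.2f} of a GTX 480 ({verdict} half)")
```

With these figures, stock and mod 1 land above half a 480 per GPU, while mod 2 lands well below it.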

glennaxl:

--- Quote from: Jason G on 18 Nov 2010, 11:39:16 am ---Hmm, I expected the GTX 295 results, per GPU, to be closer to half of a GTX 480.  That's something to investigate.  Maybe the memory subsystem on those 295s isn't as good, or requires some different handling. [Edit: actually, I suppose with stock code it is better than half a 480.]

--- End quote ---

My bad. FAH was running in the background. I've edited my post with the new results.

Jason G:

--- Quote from: glennaxl on 18 Nov 2010, 11:44:39 am ---My bad. FAH was running in the background. I've edited my post with the new results.

--- End quote ---

Ahh, cheers & LoL... I'm wondering why mod 2 doesn't appear to help on those; it runs well below stock there.  ([Later:] Ah, probably shared memory bank conflicts or some such; I'll read into that.)
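
For anyone unfamiliar with the term, here is a rough sketch of what a shared memory bank conflict is. It uses a deliberately simplified model: 32-bit words, word `w` living in bank `w % banks`, and thread `t` reading word `t * stride_words`. It does not reproduce mod 2's actual access pattern, which isn't shown in this thread, so it illustrates the idea without confirming the cause. The function name is hypothetical.

```python
from collections import Counter


def conflict_degree(stride_words, banks, threads):
    """Worst-case number of threads that hit the same bank in one request.

    Simplified model (an assumption, not a hardware simulator): each thread
    reads a distinct 32-bit word, and word w maps to bank w % banks.
    A result of 1 means conflict-free; N means the request is serialized N ways.
    """
    hits = Counter((t * stride_words) % banks for t in range(threads))
    return max(hits.values())


# Compute capability 1.x (e.g. GT200): 16 banks, serviced per half-warp (16 threads).
# Compute capability 2.x (e.g. Fermi): 32 banks, serviced per warp (32 threads).
for stride in (1, 2, 16, 17):
    cc1x = conflict_degree(stride, banks=16, threads=16)
    cc2x = conflict_degree(stride, banks=32, threads=32)
    print(f"stride {stride:2d}: cc 1.x {cc1x}-way, cc 2.x {cc2x}-way")
```

In this model a stride that shares a factor with the bank count serializes accesses, while an odd stride stays conflict-free, which is why padding shared arrays by one word is a common workaround.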

Ghost0210:
And on my 465 with 260.99 drivers:

Device: GeForce GTX 465, 1215 MHz clock, 994 MB memory.
Compiled with CUDA 3020.
Stock GetPowerSpectrum():
     64 threads:       16.0 GFlops    6.4 GB/s   0.0ulps


GetPowerSpectrum() mod 1:
     32 threads:        9.8 GFlops    3.9 GB/s   0.0ulps
     64 threads:       15.9 GFlops    6.3 GB/s   0.0ulps
    128 threads:       20.9 GFlops    8.3 GB/s   0.0ulps
    256 threads:       23.1 GFlops    9.2 GB/s   0.0ulps


GetPowerSpectrum() mod 2:
     32 threads:       14.4 GFlops    5.8 GB/s   0.0ulps
     64 threads:       28.4 GFlops   11.4 GB/s   0.0ulps
    128 threads:       33.5 GFlops   13.4 GB/s   0.0ulps
    256 threads:       32.8 GFlops   13.1 GB/s   0.0ulps
