[Split] PowerSpectrum Unit Test

_heinz:
7_extended
~~~~~~~
PowerSpectrumTest7_extended.exe -device 0

Device: GeForce GTX 470, 810 MHz clock, 1248 MB memory.
Compute capability 2.0
Compiled with CUDA 3020.
                PowerSpectrum+summax Unit test #7 (Faster reductions)
Stock:
 PS+SuMx(     8) [OK]    2.7 GFlops   12.0 GB/s
 PS+SuMx(    16) [OK]    3.7 GFlops   15.6 GB/s
 PS+SuMx(    32) [OK]    3.3 GFlops   13.7 GB/s
 PS+SuMx(    64) [OK]    5.1 GFlops   20.7 GB/s


Opt1: 256 thrds/block
                        worst case              best case
                   GFlps  GB/s ulps         GFlps  GB/s ulps
 PS+SuMx(     8)    4.9   21.5 121.7 [OK]   17.6   77.2 121.7
 PS+SuMx(    16)    7.1   29.7 121.7 [OK]   16.7   69.8 121.7
 PS+SuMx(    32)    8.3   34.1 121.7 [OK]   16.2   66.4 121.7
 PS+SuMx(    64)   10.2   41.3 121.7 [OK]   16.0   64.6 121.7


PowerSpectrumTest7_extended.exe -device 1

Device: GeForce GTX 470, 810 MHz clock, 1249 MB memory.
Compute capability 2.0
Compiled with CUDA 3020.
                PowerSpectrum+summax Unit test #7 (Faster reductions)
Stock:
 PS+SuMx(     8) [OK]    2.7 GFlops   12.0 GB/s
 PS+SuMx(    16) [OK]    3.7 GFlops   15.4 GB/s
 PS+SuMx(    32) [OK]    3.4 GFlops   13.9 GB/s
 PS+SuMx(    64) [OK]    5.1 GFlops   20.7 GB/s


Opt1: 256 thrds/block
                        worst case              best case
                   GFlps  GB/s ulps         GFlps  GB/s ulps
 PS+SuMx(     8)    5.0   21.8 121.7 [OK]   17.7   77.4 121.7
 PS+SuMx(    16)    7.1   29.9 121.7 [OK]   16.7   70.0 121.7
 PS+SuMx(    32)    8.9   36.5 121.7 [OK]   16.3   66.6 121.7
 PS+SuMx(    64)   10.5   42.4 121.7 [OK]   16.0   64.7 121.7


.
Done
gpuload
I had never seen this Memory Controller load spike before; by comparison, PrimeGrid shows no such load.
gpuload_prime
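
To put the device 0 tables above into one number per size, the Opt1 GFlops can be divided by the stock GFlops. This is only a rough sketch: the values are copied by hand from the GTX 470 output in this post, and the helper name `speedups` is made up for illustration, not part of the test program.

```python
# GFlops per size, copied from the device 0 (GTX 470) tables above.
stock = {8: 2.7, 16: 3.7, 32: 3.3, 64: 5.1}
opt1_worst = {8: 4.9, 16: 7.1, 32: 8.3, 64: 10.2}
opt1_best = {8: 17.6, 16: 16.7, 32: 16.2, 64: 16.0}


def speedups(baseline, optimized):
    """Return the optimized/baseline ratio per size, rounded to 2 decimals."""
    return {n: round(optimized[n] / baseline[n], 2) for n in baseline}


worst = speedups(stock, opt1_worst)
best = speedups(stock, opt1_best)

for n in stock:
    print(f"PS+SuMx({n:>2}): worst case {worst[n]:.2f}x, best case {best[n]:.2f}x")
```

The printed inputs have only one decimal place, so the ratios should be read as approximate.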

_heinz:
7 extended ION
~~~~~~~~~~
PowerSpectrumTest7_extended.exe -device 0

Device: ION, 1100 MHz clock, 242 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
                PowerSpectrum+summax Unit test #7 (Faster reductions)
Stock:
 PS+SuMx(     8) [OK]    0.4 GFlops    1.5 GB/s
 PS+SuMx(    16) [OK]    0.3 GFlops    1.4 GB/s
 PS+SuMx(    32) [OK]    0.3 GFlops    1.1 GB/s
 PS+SuMx(    64) [OK]    0.4 GFlops    1.7 GB/s


Opt1: 64 thrds/block
                        worst case              best case
                   GFlps  GB/s ulps         GFlps  GB/s ulps
 PS+SuMx(     8)    0.5    2.4 121.7 [OK]    0.6    2.8 121.7
 PS+SuMx(    16)    0.6    2.3 121.7 [OK]    0.6    2.6 121.7
 PS+SuMx(    32)    0.5    2.2 121.7 [OK]    0.6    2.3 121.7
 PS+SuMx(    64)    0.7    2.7 121.7 [OK]    0.7    2.9 121.7


.
Done
Hmm, how should I interpret this?
The stock value of 1.7 GB/s looks much better on the ION.
I must look up the ION device properties.
CUDA: ION
Field   Value
Device Properties   
Device Name   ION
Clock Rate   1100 MHz
Multiprocessors / Cores   2 / 16
Max Threads Per Block   512
Max Registers Per Block   8192
Warp Size   32 threads
Max Block Size   512 x 512 x 64
Max Grid Size   65535 x 65535 x 1
Compute Capability   1.1
CUDA DLL   nvcuda.dll (8.17.12.6061 - nVIDIA ForceWare 260.61)
   
Memory Properties   
Total Memory   241 MB
Total Constant Memory   64 KB
Max Shared Memory Per Block   16 KB
Max Memory Pitch   2147483647 Bytes
Texture Alignment   256 Bytes
   
Device Features   
32-bit Floating-Point Atomic Addition   Not Supported
32-bit Integer Atomic Operations   Supported
64-bit Integer Atomic Operations   Not Supported
Concurrent Memory Copy & Execute   Not Supported
Double-Precision Floating-Point   Not Supported
Warp Vote Functions   Not Supported
__ballot()   Not Supported
__syncthreads_and()   Not Supported
__syncthreads_count()   Not Supported
__syncthreads_or()   Not Supported
__threadfence_system()   Not Supported
   
Device Manufacturer   
Company Name   NVIDIA Corporation
Product Information   http://www.nvidia.com/page/products.html
Driver Download   http://www.nvidia.com/content/drivers/drivers.asp
Driver Update   http://www.aida64.com/driver-updates
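
A few of the figures in the CUDA property list above can be related to the "Opt1: 64 thrds/block" setting from the ION run. This is just a back-of-the-envelope sketch using the listed values; the variable names are chosen here for illustration only.

```python
# Values copied from the ION CUDA property list above.
multiprocessors = 2
cores = 16
warp_size = 32
max_threads_per_block = 512

# Block size reported by the test program for the ION ("Opt1: 64 thrds/block").
opt1_threads_per_block = 64

cores_per_multiprocessor = cores // multiprocessors
warps_per_opt1_block = opt1_threads_per_block // warp_size
max_warps_per_block = max_threads_per_block // warp_size

print("cores per multiprocessor:", cores_per_multiprocessor)
print("warps per Opt1 block:", warps_per_opt1_block)
print("max warps per block:", max_warps_per_block)
```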
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OPEN_CL
~~~~~~~
OpenCL: ION
Field   Value
OpenCL Properties   
Platform Name   NVIDIA CUDA
Platform Vendor   NVIDIA Corporation
Platform Version   OpenCL 1.0 CUDA 3.2.1
Platform Profile   Full
   
Device Properties   
Device Name   ION
Device Type   Graphics Processor (GPU)
Device Vendor   NVIDIA Corporation
Device Version   OpenCL 1.0 CUDA
Device Profile   Full
Clock Rate   1100 MHz
Multiprocessors   2
Max 2D Image Size   4096 x 32768
Max 3D Image Size   2048 x 2048 x 2048
Max Samplers   16
Max Work-Item Size   512 x 512 x 64
Max Work-Group Size   512
Max Argument Size   4352 Bytes
Max Constant Buffer Size   64 KB
Max Constant Arguments   9
Profiling Timer Resolution   1000 ns
OpenCL DLL   opencl.dll (1.0.0)
   
Memory Properties   
Global Memory   241 MB
Local Memory   16 KB
Memory Base Address Alignment   2048 bits
Min Data Type Alignment   128 Bytes
   
Device Features   
Command-Queue Out Of Order Execution   Enabled
Command-Queue Profiling   Enabled
Compiler   Supported
Error Correction   Not Supported
Images   Supported
Kernel Execution   Supported
Native Kernel Execution   Not Supported
   
Device Extensions   
cl_amd_d3d10_interop   Not Supported
cl_amd_d3d9_interop   Not Supported
cl_amd_device_attribute_query   Not Supported
cl_amd_fp64   Not Supported
cl_amd_media_ops   Not Supported
cl_amd_printf   Not Supported
cl_khr_3d_image_writes   Not Supported
cl_khr_byte_addressable_store   Supported
cl_khr_d3d10_sharing   Supported
cl_khr_fp16   Not Supported
cl_khr_fp64   Not Supported
cl_khr_gl_sharing   Supported
cl_khr_global_int32_base_atomics   Supported
cl_khr_global_int32_extended_atomics   Supported
cl_khr_icd   Supported
cl_khr_int64_base_atomics   Not Supported
cl_khr_int64_extended_atomics   Not Supported
cl_khr_local_int32_base_atomics   Not Supported
cl_khr_local_int32_extended_atomics   Not Supported
cl_khr_select_fprounding_mode   Not Supported
cl_nv_compiler_options   Supported
cl_nv_d3d10_sharing   Supported
cl_nv_d3d11_sharing   Supported
cl_nv_d3d9_sharing   Supported
cl_nv_device_attribute_query   Supported
cl_nv_pragma_unroll   Supported
   
Device Manufacturer   
Company Name   NVIDIA Corporation
Product Information   http://www.nvidia.com/page/products.html
Driver Download   http://www.nvidia.com/content/drivers/drivers.asp
Driver Update   http://www.aida64.com/driver-updates

Jason G:

--- Quote from: _heinz on 22 Dec 2010, 06:42:54 am ---Hmm, how should I interpret this?
The stock value of 1.7 GB/s looks much better on the ION.
I must look up the ION device properties.

--- End quote ---

No, your labels are misaligned, Heinz; I will fix them for you ... [Done: 2.7 GB/s is a bit better than 1.7 GB/s]

[Edit] Fixed it again, and fixed the 470 ones so you can read them properly  ;)
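
One way to avoid this kind of column mix-up is to read each Opt1 row into named fields instead of reading the columns by eye. The following is a minimal sketch that assumes the row layout seen in the tables in this thread (size, worst-case GFlops / GB/s / ulps, status, best-case GFlops / GB/s / ulps). The function `parse_opt1_row` is hypothetical and not part of the test program.

```python
import re

# Assumed Opt1 row layout, taken from the tables in this thread:
# PS+SuMx(size)  worst GFlops  worst GB/s  worst ulps  [status]  best GFlops  best GB/s  best ulps
ROW = re.compile(
    r"PS\+SuMx\(\s*(\d+)\)\s+"
    r"([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+\[(\w+)\]\s+"
    r"([\d.]+)\s+([\d.]+)\s+([\d.]+)"
)


def parse_opt1_row(line):
    """Return the row as named fields, or None if the line is not an Opt1 row."""
    m = ROW.search(line)
    if m is None:
        return None
    size, w_gflops, w_gbps, w_ulps, status, b_gflops, b_gbps, b_ulps = m.groups()
    return {
        "size": int(size),
        "status": status,
        "worst": {"gflops": float(w_gflops), "gbps": float(w_gbps), "ulps": float(w_ulps)},
        "best": {"gflops": float(b_gflops), "gbps": float(b_gbps), "ulps": float(b_ulps)},
    }


# Size 64 row from the first ION run.
row = parse_opt1_row(" PS+SuMx(    64)    0.7    2.7 121.7 [OK]    0.7    2.9 121.7")
print(row["size"], row["worst"]["gbps"], row["best"]["gbps"])
```

Stock rows have a different layout (status first, then labelled values), so this pattern deliberately does not match them and the function returns None for those lines.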

_heinz:
Thanks Jason,
must clean my glasses  ::)

_heinz:
7 extended ION
~~~~~~~~~~
Rerun, now lightly overclocked from 450 / 800 / 1100 to 475 / 850 / 1161

PowerSpectrumTest7_extended.exe -device 0

Device: ION, 1161 MHz clock, 242 MB memory.
Compute capability 1.1
Compiled with CUDA 3020.
                PowerSpectrum+summax Unit test #7 (Faster reductions)
Stock:
 PS+SuMx(     8) [OK]    0.4 GFlops    1.6 GB/s
 PS+SuMx(    16) [OK]    0.3 GFlops    1.4 GB/s
 PS+SuMx(    32) [OK]    0.3 GFlops    1.1 GB/s
 PS+SuMx(    64) [OK]    0.4 GFlops    1.8 GB/s


Opt1: 64 thrds/block
                        worst case              best case
                   GFlps  GB/s ulps         GFlps  GB/s ulps
 PS+SuMx(     8)    0.6    2.5 121.7 [OK]    0.7    2.9 121.7
 PS+SuMx(    16)    0.6    2.4 121.7 [OK]    0.6    2.7 121.7
 PS+SuMx(    32)    0.6    2.3 121.7 [OK]    0.6    2.4 121.7
 PS+SuMx(    64)    0.7    2.8 121.7 [OK]    0.8    3.1 121.7


.
Done
Edit: the latest GPU-Z 0.4.9 did not show any Memory Controller load.
Looks like an issue?
It also shows 4 ROPs for the ION, but the ION has 2 multiprocessors (as far as I know).
I have emailed TechPowerUp about it.
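
For a rough idea of how the overclock relates to the measured change, the percentage gains can be worked out as below. This is a sketch only: the clock values are the three quoted in this post, the GB/s values are the size 64 rows from the two ION runs in this thread, and `pct_gain` is a helper name made up for this example.

```python
def pct_gain(before, after):
    """Percentage increase from before to after, rounded to 2 decimals."""
    return round((after - before) / before * 100, 2)


# The three clock values quoted in the post, before and after overclocking.
clocks_before = (450, 800, 1100)
clocks_after = (475, 850, 1161)
clock_gains = [pct_gain(b, a) for b, a in zip(clocks_before, clocks_after)]

# GB/s for PS+SuMx(64) from the two ION runs (first run, then overclocked rerun).
stock_gain = pct_gain(1.7, 1.8)
opt1_best_gain = pct_gain(2.9, 3.1)

print("clock gains (%):", clock_gains)
print("stock GB/s gain (%):", stock_gain)
print("Opt1 best case GB/s gain (%):", opt1_best_gain)
```

Since the test program prints GB/s with one decimal place only, the throughput gains are coarse and should not be over-interpreted.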
