GPU AP tuning: new set of test tasks for GPU AP

Raistmer:

--- Quote from: Fredericx51 on 06 Aug 2012, 08:08:54 am ---Is it useful to run a test with an I7-2600 + 2x HD5870 GPUs, using the AP rev.1316 app with unroll 15,
ffa_block 10240 and ffa_block_fetch 5120? These give the lowest run time and CPU time.
{Cat 12.4;  AMD-APP (SDK) 2.4; OpenCL 1.2}

Or try this on GTX470 or 480?


--- End quote ---

I find it more useful to get a dependence curve over the param, not just a single point. This is not a validity test, it's tuning, so I see no sense in a single point here: it says nothing about whether good or bad params were chosen.
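
A minimal sketch of what "a curve, not a single point" could look like in practice. Everything here is an assumption for illustration: `run_once` is a hypothetical callable that runs the app once with the given parameter value and returns `(elapsed, cpu)` in seconds, and `fake_run` is a made-up timing model standing in for real runs. Each value is repeated a few times and the median is kept, so one noisy run does not distort the curve.

```python
from statistics import median

def sweep(values, run_once, repeats=3):
    """Return a list of (value, median_elapsed, median_cpu), one entry per value."""
    curve = []
    for v in values:
        samples = [run_once(v) for _ in range(repeats)]
        elapsed = median(s[0] for s in samples)
        cpu = median(s[1] for s in samples)
        curve.append((v, elapsed, cpu))
    return curve

def best_by_elapsed(curve):
    """Pick the point with the lowest median elapsed time."""
    return min(curve, key=lambda point: point[1])

def fake_run(unroll):
    # Made-up timing model (NOT measured data): minimum elapsed at unroll 12.
    return (100.0 + (unroll - 12) ** 2, 5.0)

curve = sweep([4, 8, 12, 16], fake_run)
best = best_by_elapsed(curve)
```

With real runs, `run_once` would launch the app and parse its timing output; the whole `curve` is the interesting result, the single best point is only a summary.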

Raistmer:
C-60 picture updated, extraction script added to the first post.
It looks like additional cooling and keeping the display ON can indeed make results more stable (yellow dots).

Raistmer:
Out of curiosity I decided to run through the whole possible range of unrolls.
In short, it breaks at 65 for this GPU: errors (-61, invalid buffer size) and then a driver restart.
There is a very interesting point at unroll 64 (I will repeat it after a reboot; after the driver restart the host lost its mouse cursor completely): high CPU usage. We already see high CPU usage with the new FFA PC kernel sequence, where the total run time of the kernel sequence (w/o a sync point with the host) is quite big. My guess was that the ATi driver switches from interrupts to a busy-wait loop after some waiting threshold, so if the kernel sequence is too long we get an increase in CPU time (in contradiction with all GPU optimization manuals, btw).
Here, as unroll increases, a single kernel becomes longer and longer, so at some point the same driver switch should occur, if it exists. This preliminary data shows that yes, it happens. It needs to be repeated a few times, of course, to be sure.

And another conclusion: an unroll of half the number of CUs is a good guess, though slightly suboptimal, while going beyond an unroll equal to the number of CUs is pointless.

EDIT: added the missing points and repeated the last one a few times. It's reproducible: very high CPU usage at unroll 64 indeed! (Blue dots were obtained after reboot; the vertical line is the number of CUs for this GPU.)
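
One possible way to spot this kind of CPU-usage jump in sweep results without looking at the picture, as a rough sketch. It compares each point's CPU share (CPU time divided by elapsed time) against the median share of the whole sweep. The factor of 3 is an arbitrary assumption, and the numbers in `example_curve` are invented for illustration, not taken from the plots in this thread.

```python
from statistics import median

def flag_high_cpu(curve, factor=3.0):
    """Return parameter values whose CPU share exceeds factor * median share.

    curve is a list of (value, elapsed_secs, cpu_secs) tuples.
    """
    shares = [cpu / elapsed for _, elapsed, cpu in curve]
    typical = median(shares)
    return [
        value
        for (value, _, _), share in zip(curve, shares)
        if share > factor * typical
    ]

# Invented example data: (unroll, elapsed secs, CPU secs).
example_curve = [
    (8, 100.0, 5.0),
    (16, 90.0, 5.0),
    (32, 88.0, 6.0),
    (64, 95.0, 80.0),
]
suspects = flag_high_cpu(example_curve)
```

A flagged point is only a hint that the driver may have changed its waiting mode; as noted above, it still has to be repeated a few times to be sure.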

arkayn:
Ran both WUs on my HD-7750 and GTX-670.

Quick timetable
 
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 4.561 secs
      CPU 2.527 secs
AP6_win_x86_SSE2_OpenCL_ATI_r1363.exe -verbose  :
  Elapsed 53.743 secs, speedup: -1078.32%  ratio: 0.08
      CPU 51.574 secs, speedup: -1940.92%  ratio: 0.05
AP6_win_x86_SSE2_OpenCL_NV_r1363.exe -verbose  :
  Elapsed 3.401 secs, speedup: 25.43%  ratio: 1.34
      CPU 1.420 secs, speedup: 43.81%  ratio: 1.78
 
WU : Clean_01LC.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 718.923 secs
      CPU 715.717 secs
AP6_win_x86_SSE2_OpenCL_ATI_r1363.exe -verbose  :
  Elapsed 40.220 secs, speedup: 94.41%  ratio: 17.87
      CPU 8.408 secs, speedup: 98.83%  ratio: 85.12
AP6_win_x86_SSE2_OpenCL_NV_r1363.exe -verbose  :
  Elapsed 23.584 secs, speedup: 96.72%  ratio: 30.48
      CPU 20.967 secs, speedup: 97.07%  ratio: 34.14
 
WU : Clean_20LC.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 14193.554 secs
      CPU 14184.235 secs
AP6_win_x86_SSE2_OpenCL_ATI_r1363.exe -verbose  :
  Elapsed 730.437 secs, speedup: 94.85%  ratio: 19.43
      CPU 122.820 secs, speedup: 99.13%  ratio: 115.49
AP6_win_x86_SSE2_OpenCL_NV_r1363.exe -verbose  :
  Elapsed 402.683 secs, speedup: 97.16%  ratio: 35.25
      CPU 385.572 secs, speedup: 97.28%  ratio: 36.79
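
For reading the timetable: the "speedup" and "ratio" columns appear to be computed against the CPU reference app (astropulse_6.01) as shown in the sketch below. This is inferred from the printed numbers, not taken from the benchmark script itself, and the function name `compare` is made up for illustration.

```python
def compare(reference_secs, test_secs):
    """Return (speedup_percent, ratio) of a test run against a reference run.

    speedup_percent = (reference - test) / reference * 100
    ratio           = reference / test
    """
    speedup_percent = (reference_secs - test_secs) / reference_secs * 100.0
    ratio = reference_secs / test_secs
    return speedup_percent, ratio

# Clean_01LC.wu, elapsed time: reference 718.923 s, ATI build 40.220 s.
ati_speedup, ati_ratio = compare(718.923, 40.220)

# ap_genwis.dat, elapsed time: reference 4.561 s, ATI build 53.743 s.
# A slower test run gives a negative speedup and a ratio below 1.
slow_speedup, slow_ratio = compare(4.561, 53.743)
```

Rounded to two decimals, these reproduce the table entries (94.41% / 17.87 and -1078.32% / 0.08), so a ratio of 17.87 simply means the run finished about 17.87 times faster than the reference.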

Raistmer:
Here is the full range of unrolls for the C-60.
As expected, display ON and display OFF are very different modes of operation.
Although the netbook's power plans differ only in display behavior (both PCIe settings and CPU settings were exactly the same), GPU performance was considerably different with display ON and display OFF.
It's an annoying feature for GPGPU computing, because hardly anyone will keep a netbook display always ON just for crunching. I will check whether manually turning off the display (not via the power plan but via the Fn + display-off key) results in the same slowdown...
