Author Topic: AP6 for NV & ATi GPUs r1316 released  (Read 14518 times)

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
AP6 for NV & ATi GPUs r1316 released
« on: 06 Jul 2012, 08:20:45 am »
Here is a replacement for the r521, r555 and r560 GPU builds of AstroPulse that were used before.
These new builds offer a substantial (in many cases) speed increase and (in the case of the NV build) bug fixes that will result in fewer invalid results.

On a low-end HD6450 plugged into a PCI slot, r1316 can consume too much CPU, so ATi r1305 is provided for such hosts.
On other hosts it is better to use ATi r1316 because it gives an advantage in both CPU and GPU times over r1305 (and older).

A long time has passed since the last GPU AP release, so there are many changes in command line params and app behavior:

First of all, the defaults have been changed to work on the slowest known GPUs, so they almost certainly will not use your GPU at its maximum. Use command line params to tune the app to your GPU.

The second big change: there is an ap_cmdline.txt file that can be used to add command line parameters to the app.
Put params there as you would put them in the corresponding tag in app_info. The app_info tag is supported too, so use whichever way is more convenient for you.

GPUlock and CPUlock are disabled by default, so the -no_cpu_lock and -no_gpu_lock params are deprecated.
One can use -cpu_lock and -gpu_lock instead to enable these features.
On hosts where BOINC supports OpenCL, the app will use the device supplied by BOINC. With older BOINC versions the app's own enumeration ability will be used.

The -instances_per_device param is still supported but is not required for running multiple instances of the app.
One should set the <count> tag in app_info to get multiple instances running.
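
To illustrate the <count> approach, here is a minimal Python sketch that rewrites the <count> value inside a trimmed <app_version> fragment modelled on the app_info example posted in this thread. It assumes the usual BOINC convention that <count> is the fraction of a GPU one task uses, so 0.5 lets two tasks share a device. The helper name set_instances_per_gpu is hypothetical and not part of BOINC or the app.

```python
import xml.etree.ElementTree as ET

# Trimmed fragment modelled on the app_info example from this thread.
APP_VERSION = """
<app_version>
   <app_name>astropulse_v6</app_name>
   <coproc>
      <type>ATI</type>
      <count>1</count>
   </coproc>
</app_version>
"""


def set_instances_per_gpu(xml_text: str, instances: int) -> str:
    """Return the fragment with <count> set to 1/instances.

    Assumption: <count> is the GPU fraction used by one task,
    so 1/instances allows `instances` tasks per device.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    root = ET.fromstring(xml_text)
    count = root.find("./coproc/count")
    if count is None:
        raise ValueError("no <coproc><count> element found")
    # ':g' drops trailing zeros, e.g. 0.5 rather than 0.500000.
    count.text = f"{1.0 / instances:g}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_instances_per_gpu(APP_VERSION, 2))
```

Whether a given GPU has enough memory for more than one instance is not something this sketch checks; that depends on the tuning params described in this post.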

The -sbs param is supported but will only issue a warning if a single block allocation would be bigger than the supplied value. The needed amount of memory will still be allocated. The app's memory requirements depend on the -unroll N and -ffa_block N params.

Other params like -hp, -ffa_block N, -ffa_block_fetch N, -unroll N work as before.
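
As a rough sketch of how the params above could be put together, the Python below builds one command line string that can be pasted either into ap_cmdline.txt or into the <cmdline> tag of app_info. The numeric values are placeholders for illustration only, not tuning recommendations. The helper names are hypothetical, and the folder where the app looks for ap_cmdline.txt is not stated in this post, so the target directory is left to the caller.

```python
from pathlib import Path


def build_cmdline(unroll=None, ffa_block=None, ffa_block_fetch=None,
                  hp=False, cpu_lock=False, gpu_lock=False) -> str:
    """Join the selected params into one command line string."""
    parts = []
    if unroll is not None:
        parts += ["-unroll", str(unroll)]
    if ffa_block is not None:
        parts += ["-ffa_block", str(ffa_block)]
    if ffa_block_fetch is not None:
        parts += ["-ffa_block_fetch", str(ffa_block_fetch)]
    if hp:
        parts.append("-hp")
    # The locks are off by default in r1316, so they are only added on request.
    if cpu_lock:
        parts.append("-cpu_lock")
    if gpu_lock:
        parts.append("-gpu_lock")
    return " ".join(parts)


def write_cmdline_file(directory, cmdline: str) -> Path:
    """Write the params to ap_cmdline.txt in the given directory.

    Assumption: the caller knows which folder the app reads the file from.
    """
    path = Path(directory) / "ap_cmdline.txt"
    path.write_text(cmdline, encoding="ascii")
    return path


if __name__ == "__main__":
    # Placeholder values, chosen only to show the format.
    print(build_cmdline(unroll=4, ffa_block=2048, ffa_block_fetch=1024, hp=True))
```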

Please report any issues you notice here or in the corresponding threads on the SETI forums.

I would like to thank the Lunatics crew, especially our alpha testers arkayn, Claggy and Mike, and the beta testers from the SETI beta site, for their invaluable help in debugging and tuning these new releases.
« Last Edit: 06 Jul 2012, 02:29:33 pm by Raistmer »

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #1 on: 06 Jul 2012, 08:24:58 am »
Here is an example of a possible app_info section:



<app>
   <name>astropulse_v6</name>
</app>
<file_info>
   <name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</name>
   <executable/>
</file_info>
<app_version>
   <app_name>astropulse_v6</app_name>
   <version_num>604</version_num>
   <avg_ncpus>0.04</avg_ncpus>
   <max_ncpus>0.2</max_ncpus>
   <plan_class>ati13ati</plan_class>
   <cmdline></cmdline>
   <coproc>
      <type>ATI</type>
      <count>1</count>
   </coproc>
   <file_ref>
      <file_name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</file_name>
      <main_program/>
   </file_ref>
   <flops>30987654321</flops>
</app_version>


As usual, installing this app requires advanced skills and an understanding of the anonymous platform mechanism provided by BOINC. If you are unsure, ask for help on the SETI boards or wait for the next Lunatics installer release.
« Last Edit: 06 Jul 2012, 08:28:12 am by Raistmer »

Offline Urs Echternacht

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 4121
  • ++
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #2 on: 06 Jul 2012, 02:08:20 pm »
The low-end GPU with increased CPU times was a Radeon HD6450 in a PCI slot!
_\|/_
U r s

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #3 on: 09 Jul 2012, 03:03:18 am »
I recently made 2 posts about the current situation with driver support for OpenCL, one on each vendor's forum:
http://devgurus.amd.com/thread/159432
http://developer.nvidia.com/devforum/discussion/10636/feature-request-to-add-synchronization-mode-tuning-via-nv-specific-opencl-extension

If you have something to say on the topic, or can explain why this is important for users, please do post in the corresponding threads.

Offline Fredericx51

  • Knight o' The Round Table
  • ***
  • Posts: 207
  • Knight Who Says Ni N!
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #4 on: 10 Jul 2012, 12:06:37 pm »
Installed AstroPulse app rev. 1316. All looking good, even for the AP task that was running under the 555 version and stopped at 33% when I changed versions.
Oh well, SETI went offline as maintenance started. I wanted to link to this host.
And here is the host.

One task was done 32% with rev. 555 and the rest with rev. 1316; the second was done entirely with rev. 1316.

« Last Edit: 10 Jul 2012, 07:09:04 pm by Fredericx51 »

Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #5 on: 15 Sep 2012, 11:38:33 am »
I recently did a few benches of NV r1316 on my 9800GTX+ with different drivers to see whether the Cuda slowdown (on legacy GPUs) seen with the Cuda 5 preview drivers was also happening on NV OpenCL.
The driver synch changes in later drivers (as opposed to the 26x.xx drivers) have resulted in a speedup (subject to an unused core being available to feed the app),
and there wasn't a noticeable slowdown on the Cuda 5 preview drivers:

266.58:
Quick timetable
 
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.886 secs
      CPU 1.732 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.790 secs, speedup: -1001.13%  ratio: 0.09x
      CPU 39.375 secs, speedup: -2173.38%  ratio: 0.04x
 
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
      CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 137.619 secs, speedup: 69.14%  ratio: 3.24x
      CPU 9.797 secs, speedup: 97.87%  ratio: 46.91x
 
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
      CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 75.794 secs, speedup: 91.53%  ratio: 11.80x
      CPU 7.441 secs, speedup: 99.15%  ratio: 117.63x
 
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
      CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 53.152 secs, speedup: 88.55%  ratio: 8.73x
      CPU 17.254 secs, speedup: 96.15%  ratio: 25.97x
 
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
      CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 332.466 secs, speedup: 64.70%  ratio: 2.83x
      CPU 75.957 secs, speedup: 91.61%  ratio: 11.92x
 
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
      CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 44.384 secs, speedup: 90.10%  ratio: 10.10x
      CPU 7.316 secs, speedup: 98.30%  ratio: 58.89x
 
301.42:
Quick timetable
 
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.731 secs
      CPU 1.732 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 52.223 secs, speedup: -1299.71%  ratio: 0.07x
      CPU 41.028 secs, speedup: -2268.82%  ratio: 0.04x
 
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
      CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 129.331 secs, speedup: 71.00%  ratio: 3.45x
      CPU 126.813 secs, speedup: 72.41%  ratio: 3.62x
 
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
      CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 60.810 secs, speedup: 93.20%  ratio: 14.71x
      CPU 58.376 secs, speedup: 93.33%  ratio: 14.99x
 
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
      CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.138 secs, speedup: 90.92%  ratio: 11.01x
      CPU 39.453 secs, speedup: 91.19%  ratio: 11.36x
 
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
      CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 297.572 secs, speedup: 68.41%  ratio: 3.17x
      CPU 292.689 secs, speedup: 67.67%  ratio: 3.09x
 
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
      CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 33.530 secs, speedup: 92.52%  ratio: 13.37x
      CPU 30.904 secs, speedup: 92.83%  ratio: 13.94x

306.02:
Quick timetable
 
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.659 secs
      CPU 1.576 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 44.170 secs, speedup: -1107.16%  ratio: 0.08x
      CPU 39.406 secs, speedup: -2400.38%  ratio: 0.04x
 
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
      CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 129.749 secs, speedup: 70.91%  ratio: 3.44x
      CPU 125.347 secs, speedup: 72.73%  ratio: 3.67x
 
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
      CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 61.220 secs, speedup: 93.16%  ratio: 14.62x
      CPU 58.360 secs, speedup: 93.33%  ratio: 15.00x
 
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
      CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.198 secs, speedup: 90.91%  ratio: 11.00x
      CPU 39.250 secs, speedup: 91.24%  ratio: 11.41x
 
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
      CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 300.304 secs, speedup: 68.12%  ratio: 3.14x
      CPU 293.765 secs, speedup: 67.55%  ratio: 3.08x
 
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
      CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 33.899 secs, speedup: 92.44%  ratio: 13.22x
      CPU 31.029 secs, speedup: 92.80%  ratio: 13.88x

306.23:
Quick timetable
 
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.823 secs
      CPU 1.778 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 45.188 secs, speedup: -1082.00%  ratio: 0.08x
      CPU 40.857 secs, speedup: -2197.92%  ratio: 0.04x
 
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
      CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 129.390 secs, speedup: 70.99%  ratio: 3.45x
      CPU 130.947 secs, speedup: 71.51%  ratio: 3.51x
 
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
      CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 60.678 secs, speedup: 93.22%  ratio: 14.75x
      CPU 59.904 secs, speedup: 93.16%  ratio: 14.61x
 
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
      CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.019 secs, speedup: 90.95%  ratio: 11.04x
      CPU 41.496 secs, speedup: 90.74%  ratio: 10.80x
 
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
      CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 298.223 secs, speedup: 68.34%  ratio: 3.16x
      CPU 301.768 secs, speedup: 66.66%  ratio: 3.00x
 
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
      CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 33.425 secs, speedup: 92.54%  ratio: 13.41x
      CPU 31.621 secs, speedup: 92.66%  ratio: 13.62x

Claggy
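
For anyone reading the timetables above, the "speedup" and "ratio" columns appear to follow from the reference time (the stock astropulse_6.01 CPU app) and the time of the build under test. The Python sketch below is inferred from the posted numbers, not taken from the bench tool itself, and the function names are hypothetical.

```python
def speedup_percent(reference_secs: float, candidate_secs: float) -> float:
    """Percentage of the reference time saved by the candidate build.

    Negative when the candidate is slower than the reference.
    """
    return (1.0 - candidate_secs / reference_secs) * 100.0


def ratio(reference_secs: float, candidate_secs: float) -> float:
    """How many times faster the candidate is than the reference."""
    return reference_secs / candidate_secs


if __name__ == "__main__":
    # Elapsed times for ap_18se08aa_B6_P1_00046_1LC25.wu with driver 266.58.
    ref, new = 446.013, 137.619
    print(f"speedup: {speedup_percent(ref, new):.2f}%  ratio: {ratio(ref, new):.2f}x")
```

With the elapsed times from the 266.58 run this gives 69.14% and 3.24x, matching the posted line. The same formulas also give the large negative speedups for #ap_genwis.dat, where the GPU build takes far longer than the CPU app.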

 
