Seti@Home optimized science apps and information
Optimized Seti@Home apps => Windows => GPU crunching => Topic started by: Raistmer on 06 Jul 2012, 08:20:45 am
-
Here are replacements for the r521, r555 and r560 GPU builds of AstroPulse that were used before.
These new builds offer a substantial (in many cases) speed increase and, in the case of the NV build, bug fixes that will result in fewer invalid results.
On a low-end HD6450 plugged into a PCI slot, r1316 can consume too much CPU, so ATi r1305 is provided for such hosts.
On other hosts it is better to use ATi r1316 because it gives an advantage in both CPU and GPU times over r1305 (and older builds).
A long time has passed since the last GPU AP release, so there are many changes in command line params and app behavior:
First of all, the defaults have been changed to work on the slowest known GPUs, so they almost certainly will not use your GPU at its maximum. Use command line params to tune the app to your GPU.
The second big change: there is now an ap_cmdline.txt file that can be used to add command line parameters to the app.
Put params there as you would put them in the corresponding tag in app_info. The app_info tag is supported too, so use whichever way is more convenient for you.
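As a rough illustration of the ap_cmdline.txt approach, here is a minimal Python sketch that joins a set of params into one line and writes it to the file. The helper names (build_cmdline, write_ap_cmdline), the target directory and the numeric values are assumptions made up for the example; they are not tuned recommendations, and the exact place where the app looks for the file is not stated in this post.

```python
from pathlib import Path


def build_cmdline(params):
    """Join params into a single command line string.

    `params` maps a flag to its value; a value of None means the flag
    takes no argument (for example -hp).
    """
    parts = []
    for flag, value in params.items():
        parts.append(flag if value is None else f"{flag} {value}")
    return " ".join(parts)


def write_ap_cmdline(directory, params):
    """Write the joined params to ap_cmdline.txt inside `directory`."""
    path = Path(directory) / "ap_cmdline.txt"
    path.write_text(build_cmdline(params))
    return path


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    example = {"-unroll": 4, "-ffa_block": 2048, "-ffa_block_fetch": 1024, "-hp": None}
    print(build_cmdline(example))
```

The same string could instead be placed between <cmdline> and </cmdline> in app_info, as described above.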
GPUlock and CPUlock are disabled by default, so the -no_cpu_lock and -no_gpu_lock params are deprecated.
One can use -cpu_lock and -gpu_lock instead to enable these features.
On hosts where BOINC supports OpenCL, the app will use the device supplied by BOINC. With older BOINC versions the app's own enumeration ability will be used.
The -instances_per_device param is still supported but is not required for running multiple instances of the app.
One should set the <count> tag in app_info to get multiple instances running.
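To make the <count> setting concrete: in BOINC's anonymous platform, <count> inside <coproc> is the fraction of a GPU that each task claims, so a value of 0.5 lets two tasks share one GPU. The Python sketch below only does that arithmetic and prints the matching fragment; the function names are made up for the example.

```python
def coproc_count(instances_per_gpu):
    """Return the <count> value that lets N tasks share one GPU."""
    if instances_per_gpu < 1:
        raise ValueError("instances_per_gpu must be at least 1")
    return 1.0 / instances_per_gpu


def coproc_fragment(gpu_type, instances_per_gpu):
    """Build the <coproc> fragment of app_info for the given GPU type."""
    count = coproc_count(instances_per_gpu)
    return (
        "<coproc>\n"
        f"<type>{gpu_type}</type>\n"
        f"<count>{count:g}</count>\n"
        "</coproc>"
    )


if __name__ == "__main__":
    # Two instances per ATI GPU gives <count>0.5</count>.
    print(coproc_fragment("ATI", 2))
```

Whether more than one instance actually helps depends on the GPU and on the tuning params, so treat the number of instances as something to test on your own host.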
The -sbs param is supported but will only issue a warning if a single block allocation would be bigger than the supplied value. The needed amount of memory will still be allocated. The app's memory requirements depend on the -unroll N and -ffa_block N params.
Other params like -hp, -ffa_block N, -ffa_block_fetch N and -unroll N work as before.
Please report any issues you notice here or in the corresponding threads on the SETI forums.
I would like to thank the Lunatics crew, especially our alpha testers arkayn, Claggy and Mike, and the beta testers from the SETI beta site for their invaluable help in debugging and tuning these new releases.
-
Here is an example of a possible app_info section:
<app>
<name>astropulse_v6</name>
</app>
<file_info>
<name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>astropulse_v6</app_name>
<version_num>604</version_num>
<avg_ncpus>0.04</avg_ncpus>
<max_ncpus>0.2</max_ncpus>
<plan_class>ati13ati</plan_class>
<cmdline></cmdline>
<coproc>
<type>ATI</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</file_name>
<main_program/>
</file_ref>
<flops>30987654321</flops>
</app_version>
As usual, installation of this app requires advanced skills and an understanding of the anonymous platform mechanism provided with BOINC. If you are unsure, ask for help on the SETI boards or wait for the next Lunatics installer release.
-
The low-end GPU with increased CPU times was a Radeon HD6450 in a PCI slot!
-
I recently made 2 posts about the current situation with driver support for OpenCL on both vendors' forums:
http://devgurus.amd.com/thread/159432
http://developer.nvidia.com/devforum/discussion/10636/feature-request-to-add-synchronization-mode-tuning-via-nv-specific-opencl-extension
If you have something to say on the topic or can explain why this is important for users, please do post in the corresponding threads.
-
Installed the AstroPulse app r1316 and all is looking good, even for the AP task that was running under r555 and had stopped at 33% when I changed versions.
Oh well, SETI went off-line as maintenance started. I wanted to link to this host.
And here is the host: http://setiathome.berkeley.edu/show_host_detail.php?hostid=6628794
One task was done 32% with r555 and the rest with r1316; the second was done with r1316.
-
I recently did a few benches of NV r1316 on my 9800GTX+ with different drivers, to see if the Cuda slowdown (on legacy GPUs) with the Cuda 5 preview drivers was also happening on NV OpenCL.
The driver synch changes in later drivers (as opposed to the 26x.xx drivers) have resulted in a speedup (subject to an unused core being available to feed the app),
and there wasn't a noticeable slowdown on the Cuda 5 preview drivers:
266.58:
Quick timetable
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 3.886 secs
CPU 1.732 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 42.790 secs, speedup: -1001.13% ratio: 0.09x
CPU 39.375 secs, speedup: -2173.38% ratio: 0.04x
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 446.013 secs
CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 137.619 secs, speedup: 69.14% ratio: 3.24x
CPU 9.797 secs, speedup: 97.87% ratio: 46.91x
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 894.734 secs
CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 75.794 secs, speedup: 91.53% ratio: 11.80x
CPU 7.441 secs, speedup: 99.15% ratio: 117.63x
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 464.060 secs
CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 53.152 secs, speedup: 88.55% ratio: 8.73x
CPU 17.254 secs, speedup: 96.15% ratio: 25.97x
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 941.842 secs
CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 332.466 secs, speedup: 64.70% ratio: 2.83x
CPU 75.957 secs, speedup: 91.61% ratio: 11.92x
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 448.227 secs
CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 44.384 secs, speedup: 90.10% ratio: 10.10x
CPU 7.316 secs, speedup: 98.30% ratio: 58.89x
301.42:
Quick timetable
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 3.731 secs
CPU 1.732 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 52.223 secs, speedup: -1299.71% ratio: 0.07x
CPU 41.028 secs, speedup: -2268.82% ratio: 0.04x
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 446.013 secs
CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 129.331 secs, speedup: 71.00% ratio: 3.45x
CPU 126.813 secs, speedup: 72.41% ratio: 3.62x
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 894.734 secs
CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 60.810 secs, speedup: 93.20% ratio: 14.71x
CPU 58.376 secs, speedup: 93.33% ratio: 14.99x
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 464.060 secs
CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 42.138 secs, speedup: 90.92% ratio: 11.01x
CPU 39.453 secs, speedup: 91.19% ratio: 11.36x
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 941.842 secs
CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 297.572 secs, speedup: 68.41% ratio: 3.17x
CPU 292.689 secs, speedup: 67.67% ratio: 3.09x
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 448.227 secs
CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 33.530 secs, speedup: 92.52% ratio: 13.37x
CPU 30.904 secs, speedup: 92.83% ratio: 13.94x
306.02:
Quick timetable
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 3.659 secs
CPU 1.576 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 44.170 secs, speedup: -1107.16% ratio: 0.08x
CPU 39.406 secs, speedup: -2400.38% ratio: 0.04x
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 446.013 secs
CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 129.749 secs, speedup: 70.91% ratio: 3.44x
CPU 125.347 secs, speedup: 72.73% ratio: 3.67x
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 894.734 secs
CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 61.220 secs, speedup: 93.16% ratio: 14.62x
CPU 58.360 secs, speedup: 93.33% ratio: 15.00x
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 464.060 secs
CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 42.198 secs, speedup: 90.91% ratio: 11.00x
CPU 39.250 secs, speedup: 91.24% ratio: 11.41x
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 941.842 secs
CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 300.304 secs, speedup: 68.12% ratio: 3.14x
CPU 293.765 secs, speedup: 67.55% ratio: 3.08x
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 448.227 secs
CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 33.899 secs, speedup: 92.44% ratio: 13.22x
CPU 31.029 secs, speedup: 92.80% ratio: 13.88x
306.23:
Quick timetable
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 3.823 secs
CPU 1.778 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 45.188 secs, speedup: -1082.00% ratio: 0.08x
CPU 40.857 secs, speedup: -2197.92% ratio: 0.04x
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 446.013 secs
CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 129.390 secs, speedup: 70.99% ratio: 3.45x
CPU 130.947 secs, speedup: 71.51% ratio: 3.51x
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 894.734 secs
CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 60.678 secs, speedup: 93.22% ratio: 14.75x
CPU 59.904 secs, speedup: 93.16% ratio: 14.61x
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 464.060 secs
CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 42.019 secs, speedup: 90.95% ratio: 11.04x
CPU 41.496 secs, speedup: 90.74% ratio: 10.80x
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 941.842 secs
CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 298.223 secs, speedup: 68.34% ratio: 3.16x
CPU 301.768 secs, speedup: 66.66% ratio: 3.00x
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
Elapsed 448.227 secs
CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
Elapsed 33.425 secs, speedup: 92.54% ratio: 13.41x
CPU 31.621 secs, speedup: 92.66% ratio: 13.62x
Claggy
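For anyone reading the timetables above: judging by the printed figures, the speedup and ratio columns appear to be derived from the reference app's time and the tested app's time as shown in this Python sketch. The formulas are inferred from the numbers, not taken from the benchmark tool, and the function names are made up for the example. The cpu_share helper is an extra, showing how much of the elapsed time was spent on the CPU.

```python
def speedup_percent(reference_secs, tested_secs):
    """Percentage of the reference time saved by the tested app."""
    return (1.0 - tested_secs / reference_secs) * 100.0


def ratio(reference_secs, tested_secs):
    """How many times faster the tested app is than the reference app."""
    return reference_secs / tested_secs


def cpu_share(cpu_secs, elapsed_secs):
    """Fraction of the elapsed time that was spent as CPU time."""
    return cpu_secs / elapsed_secs


if __name__ == "__main__":
    # ap_18se08aa_B6_P1_00046_1LC25.wu, driver 266.58, elapsed times from the table.
    print(f"{speedup_percent(446.013, 137.619):.2f}%")  # 69.14%
    print(f"{ratio(446.013, 137.619):.2f}x")            # 3.24x
    # CPU share of the r1316 run for the same WU under 266.58 and 301.42.
    print(f"{cpu_share(9.797, 137.619):.2f}")
    print(f"{cpu_share(126.813, 129.331):.2f}")
```

For that WU the CPU share is small under 266.58 and close to 1 under 301.42, which fits the remark that the speedup on later drivers depends on a free core being available to feed the app.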