Author Topic: GPU/CPU performance dependence from AR value  (Read 23200 times)

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: GPU/CPU performance dependence from AR value
« Reply #15 on: 19 Apr 2009, 11:39:17 am »
Do I understand correctly that only the yellow lines need to be changed for CPU<->GPU rebranding?

Yes, pretty much. What I found easier (while I was doing it manually) was to add an extra application name to app_info for the CPU app with no plan class, say setiathome_enhanced_AK, version 608 (plan class CPU optional). This way all rebranded work shows in the manager as the new version but on CPU. That made it quick to identify visually which work had already been rebranded.

This way only the application name and plan class lines needed changing, rather than the version, and work already allocated to CPU via 6.03 was untouched. New work for CPU went to 6.03, new CUDA work went to CUDA 6.08, and only rebranded work was 6.08 (CPU plan class), meaning it was easy to recognise and undo if needed.



Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: GPU/CPU performance dependence from AR value
« Reply #16 on: 19 Apr 2009, 12:44:32 pm »
I'm going to rebrand not only GPU to CPU but CPU to GPU too ;)
I'm now letting the currently running results finish (for a clean first test) and will then try the rebrand. The produced file looks OK to my eye, though I'm not very experienced with client_state.xml :)

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: GPU/CPU performance dependence from AR value
« Reply #17 on: 19 Apr 2009, 01:58:18 pm »
Unfortunately, it was not the only place:
<workunit>
    <name>01fe09aa.11349.15205.4.8.69</name>
    <app_name>setiathome_enhanced</app_name>
    <version_num>608</version_num>
    <rsc_fpops_est>78856403871557.094000</rsc_fpops_est>
    <rsc_fpops_bound>788564038715571.000000</rsc_fpops_bound>
    <rsc_memory_bound>33554432.000000</rsc_memory_bound>
    <rsc_disk_bound>33554432.000000</rsc_disk_bound>
    <file_ref>
        <file_name>01fe09aa.11349.15205.4.8.69</file_name>
        <open_name>work_unit.sah</open_name>
    </file_ref>
</workunit>

The version number is mentioned here too.
I got errors on all rebranded tasks, something like "can't link task" for each result:
19/04/2009 21:56:31   SETI@home   [error] State file error: missing task
19/04/2009 21:56:31   SETI@home   [error] Can't link task 01fe09aa.11349.15205.4.8.75_0 in state file
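
A minimal sketch of a consistency check for this situation, assuming (as the post above suggests, but does not prove) that the errors appear when a <result> and its <workunit> carry different <version_num> values. The function name find_version_mismatches is hypothetical, and the regex-based parsing is a simplification that treats client_state.xml as plain text.

```python
import re


def find_version_mismatches(state_text):
    """List (wu_name, workunit_version, result_version) where the two differ."""
    wu_versions = {}
    for block in re.findall(r"<workunit>.*?</workunit>", state_text, flags=re.S):
        name = re.search(r"<name>([^<]+)</name>", block)
        ver = re.search(r"<version_num>(\d+)</version_num>", block)
        if name and ver:
            wu_versions[name.group(1)] = ver.group(1)

    mismatches = []
    for block in re.findall(r"<result>.*?</result>", state_text, flags=re.S):
        wu = re.search(r"<wu_name>([^<]+)</wu_name>", block)
        ver = re.search(r"<version_num>(\d+)</version_num>", block)
        if not (wu and ver):
            continue
        wu_ver = wu_versions.get(wu.group(1))
        # Only report results whose workunit is present but disagrees.
        if wu_ver is not None and wu_ver != ver.group(1):
            mismatches.append((wu.group(1), wu_ver, ver.group(1)))
    return mismatches
```

Running it on the patched file before restarting BOINC would show any task where only one of the two places was edited.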

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: GPU/CPU performance dependence from AR value
« Reply #18 on: 19 Apr 2009, 02:39:28 pm »
Fred's script just replaces "608" with "603" in both the <workunit> and <result> sections (matching ones, of course), and deletes the <plan_class>cuda</plan_class> line completely - making it look exactly like a 603 directly allocated to the CPU by the server. That seems the simplest solution: but I'm intrigued by Josef's suggestion. That might be worth a look.
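
A rough sketch of that kind of replacement, not Fred's actual script (which is not shown in this thread). It assumes the state file can be handled as plain text, and the function name rebrand_to_cpu and the wu_names argument are made up for illustration.

```python
import re


def rebrand_to_cpu(state_text, wu_names):
    """Turn matching 608/cuda workunits and results into plain 603 CPU ones."""
    old_tag = "<version_num>608</version_num>"
    new_tag = "<version_num>603</version_num>"

    def fix_workunit(match):
        block = match.group(0)
        name = re.search(r"<name>([^<]+)</name>", block)
        if not name or name.group(1) not in wu_names:
            return block  # not selected for rebranding
        return block.replace(old_tag, new_tag)

    def fix_result(match):
        block = match.group(0)
        wu = re.search(r"<wu_name>([^<]+)</wu_name>", block)
        if not wu or wu.group(1) not in wu_names:
            return block
        block = block.replace(old_tag, new_tag)
        # Drop the whole plan_class line, including its indentation and newline.
        return re.sub(r"[ \t]*<plan_class>cuda</plan_class>[ \t]*\r?\n?", "", block)

    text = re.sub(r"<workunit>.*?</workunit>", fix_workunit, state_text, flags=re.S)
    return re.sub(r"<result>.*?</result>", fix_result, text, flags=re.S)
```

Everything outside the selected <workunit> and <result> blocks is passed through unchanged.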

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: GPU/CPU performance dependence from AR value
« Reply #19 on: 19 Apr 2009, 04:25:38 pm »
Well, I've had a look through the Proxomitron docs, and I'm a bit daunted by the match/replace language. Does anyone have any experience of writing that sort of meta-character based filter? Here's an example of what we need to do.

Quote
<rubbish>
...
</rubbish>
<file_info>
    <name>27dc08ab.32733.481.6.8.9</name>
    <url>http://boinc2.ssl.berkeley.edu/sah/download_fanout/f1/27dc08ab.32733.481.6.8.9</url>
    <md5_cksum>d8bf53ae5251691603446976bd9e757d</md5_cksum>
    <nbytes>375323</nbytes>
</file_info>
<workunit>
    <rsc_fpops_est>23780000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>237800000000000.000000</rsc_fpops_bound>
    <rsc_memory_bound>33554432.000000</rsc_memory_bound>
    <rsc_disk_bound>33554432.000000</rsc_disk_bound>
    <name>27dc08ab.32733.481.6.8.9</name>
    <app_name>setiathome_enhanced</app_name>
<file_ref>
    <file_name>27dc08ab.32733.481.6.8.9</file_name>
    <open_name>work_unit.sah</open_name>
</file_ref>
</workunit>
...
<file>
<WU>
<file>
<WU>
...
<file_info>
  <name>27dc08ab.32733.481.6.8.9_1_0</name>
  <generated_locally/>
  <upload_when_present/>
  <max_nbytes>65536</max_nbytes>
  <url>http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler</url>
<xml_signature>
b849d6e0adcc332ad1601a97d75f4d073ea633ae16b00663e6aa98bac0477c08
3742e0330aae2deee62f2406ddcd1020b3ff02e3cf6f7f77482a97dbc453a489
21fe18199095dda88f172da2d97b1d1cddff23272c832be8e44ba10b38212700
0e5ff950052f3a870c850bb3efa7cefcee57ce02ddcb6473d55526a34ba2dc4f
.
</xml_signature>
</file_info>
<result>
<report_deadline>1240773971</report_deadline>
<wu_name>27dc08ab.32733.481.6.8.9</wu_name>
<name>27dc08ab.32733.481.6.8.9_1</name>
  <file_ref>
    <file_name>27dc08ab.32733.481.6.8.9_1_0</file_name>
    <open_name>result.sah</open_name>
  </file_ref>
    <platform>windows_intelx86</platform>
    <version_num>608</version_num>
    <plan_class>cuda</plan_class>
</result>
...
<file>
<result>
<file>
<result>
...
<rubbish>
...
</rubbish>

The <rubbish> and all the intervening file/workunit/file/result stuff needs preserving, of course.

But the process is:

Identify a WU for re-branding by <rsc_fpops_est> - this is a VHAR
Remember the next <name> for later use
Look for a matching <wu_name> (it only occurs in a <result> section)
Change the next following <version_num> from 608 to 603, or vice versa as desired
Delete or insert the <plan_class> line to match

allowing for up to 20 names to be matched and re-branded in a single sched_reply file. The process is very similar to Fred's surgery on client_state, so it could easily be scripted: but I'm not sure it could be pattern-matched on the fly.
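
The scripted version of those steps might look roughly like the sketch below. It is only an illustration under assumptions: the reply is treated as plain text, the VHAR constant is the value from the example above, and the names select_wu_names and rebrand_results are hypothetical.

```python
import re

VHAR_FPOPS = 23780000000000.0  # VHAR estimate quoted in this thread


def select_wu_names(reply_text, fpops_est=VHAR_FPOPS, limit=20):
    """Return names of workunits whose <rsc_fpops_est> equals fpops_est."""
    names = []
    for block in re.findall(r"<workunit>.*?</workunit>", reply_text, flags=re.S):
        est = re.search(r"<rsc_fpops_est>([^<]+)</rsc_fpops_est>", block)
        name = re.search(r"<name>([^<]+)</name>", block)
        if est and name and float(est.group(1)) == fpops_est:
            names.append(name.group(1))
    return names[:limit]


def rebrand_results(reply_text, wu_names, to_cpu=True):
    """Switch matching results 608 -> 603 (or back) and fix the plan_class line."""
    old, new = ("608", "603") if to_cpu else ("603", "608")
    old_tag = f"<version_num>{old}</version_num>"
    new_tag = f"<version_num>{new}</version_num>"

    def fix(match):
        block = match.group(0)
        wu = re.search(r"<wu_name>([^<]+)</wu_name>", block)
        if not wu or wu.group(1) not in wu_names or old_tag not in block:
            return block  # not selected, or already at the target version
        if to_cpu:
            block = re.sub(r"[ \t]*<plan_class>cuda</plan_class>[ \t]*\r?\n?", "", block)
            return block.replace(old_tag, new_tag)
        # CPU -> GPU: add a plan_class line with the same indentation.
        return re.sub(
            r"([ \t]*)" + re.escape(old_tag),
            lambda m: f"{m.group(1)}{new_tag}\n{m.group(1)}<plan_class>cuda</plan_class>",
            block,
        )

    return re.sub(r"<result>.*?</result>", fix, reply_text, flags=re.S)
```

Whether this could run on the fly inside a filtering proxy is a separate question; the sketch only covers the text transformation itself.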

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: GPU/CPU performance dependence from AR value
« Reply #20 on: 19 Apr 2009, 04:26:02 pm »
Fred's script just replaces "608" with "603" in both the <workunit> and <result> sections (matching ones, of course), and deletes the <plan_class>cuda</plan_class> line completely - making it look exactly like a 603 directly allocated to the CPU by the server. That seems the simplest solution: but I'm intrigued by Josef's suggestion. That might be worth a look.
Yes, mine too.
But it can do CPU->GPU move too.
Beta version in testing.

BOINC can be stopped with "boinccmd --quit" or "net stop boinc" and, after patching, restarted with start or "net start boinc".
« Last Edit: 19 Apr 2009, 04:31:11 pm by Raistmer »

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: GPU/CPU performance dependence from AR value
« Reply #21 on: 19 Apr 2009, 04:28:38 pm »

Identify a WU for re-branding by <rsc_fpops_est> - this is a VHAR
How can AR be matched with this field's value?
Does any table or formula exist?

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: GPU/CPU performance dependence from AR value
« Reply #22 on: 19 Apr 2009, 04:42:13 pm »
Identify a WU for re-branding by <rsc_fpops_est> - this is a VHAR

How can AR be matched with this field's value?
Does any table or formula exist?

That's why I said earlier that the 'V' cases are easy - 80360000000000.000000 is a VLAR (true VLAR - AR<0.05) and 23780000000000.000000 is a VHAR.

For the formula in between, ask Josef, and remind him of http://setiathome.berkeley.edu/forum_thread.php?id=44178&nowrap=true#698744 - there will be a linear scaling factor to the formula in that post. But note that the formula will not be single-valued in converting from fpops to AR - any intermediate value could be on either of the curves.
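
As a small illustration of the easy cases only: the sketch below recognises the two fixed estimates quoted above and refuses to guess for anything in between, since the mapping from fpops back to AR is not single-valued. The function name classify_by_fpops is hypothetical.

```python
VLAR_FPOPS = 80360000000000.0  # true VLAR (AR < 0.05), value from this post
VHAR_FPOPS = 23780000000000.0  # VHAR, value from this post


def classify_by_fpops(rsc_fpops_est):
    """Map an <rsc_fpops_est> value to 'VLAR', 'VHAR' or 'unknown'."""
    value = float(rsc_fpops_est)
    if value == VLAR_FPOPS:
        return "VLAR"
    if value == VHAR_FPOPS:
        return "VHAR"
    # Intermediate values could lie on either curve, so no AR is inferred.
    return "unknown"
```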

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: GPU/CPU performance dependence from AR value
« Reply #23 on: 19 Apr 2009, 04:46:46 pm »
I see. I will stay with the rebranding script for now, then.
Run once every 1-2 days, it can increase net host performance, IMHO.
A ~50% task speedup for VHAR and even more for VLAR, with no need to kill tasks: enough advantages for the next step to perfection ;)

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: GPU/CPU performance dependence from AR value
« Reply #24 on: 19 Apr 2009, 05:23:45 pm »
I've split Fred's script into an information-only part (which can run while BOINC is active), and an action part which shuts down BOINC, does the necessary, and restarts BOINC.

When I get to within 50 CUDA tasks of the first one I want to re-brand, the default button switches from 'No' to 'Yes', and the action script runs automatically unless I intervene to stop it.

[attachment deleted by admin]

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: GPU/CPU performance dependence from AR value
« Reply #25 on: 19 Apr 2009, 05:32:13 pm »
I switched from the V10 pack to 6.6.20 CUDA management in order to use rebranding, and it seems I did not edit the app_info flops estimates correctly, so I received 500 tasks at once. I did the rebranding on the whole queue (only a few tasks were lost during debugging). Sorted by deadline, the queue looks similar to the GPU/CPU performance graph: 603, then 608 CUDA, then 603 again. It seems the script works.
If other successful reports arrive, I will post it to this thread. Maybe it could be ported to VBScript, to avoid the need for a Perl interpreter.

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: GPU/CPU performance dependence from AR value
« Reply #26 on: 25 Apr 2009, 01:20:09 pm »
To evaluate how the performance curve depends on the type of GPU, I made the same measurements on the same host with a different GPU.
This time it was an 8500GT, one of the slowest CUDA-capable GPUs. The first experiment was with a 9600GSO, which can be viewed as a midrange GPU.
The results are plotted on this graph.

Performance values are normalized to 100% to simplify GPU comparison.
One can see that the slower the GPU, the less sensitive it is to differences in performance between ARs. Nevertheless, the curve has the same peculiarities as for the 9600GSO.
IMHO, extrapolating to faster GPUs will emphasize the differences in performance between ARs. That is, the faster the GPU, the more it needs tasks with a suitable AR.

EDIT: the current results in the VHAR area were obtained with a single CPU core busy and the other cores idle. CPU VHAR performance drops significantly when all CPU cores are busy with the same kind of VHAR task.
This effect needs more investigation (and will influence the VHAR part of the provided curves).
« Last Edit: 25 Apr 2009, 01:26:14 pm by Raistmer »

 
