Author Topic: Please, help to balance CPU MB + ATI GPU MW on host under BOINC 6.6.36  (Read 26129 times)

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Typical situation:
BOINC runs 4 SETI tasks w/o any MW tasks.
ATI GPU stays idle.
Another situation:
BOINC runs more than 10 MW tasks (of which only 3 actually make progress, thanks to the well-designed opt app by Gipsel, so it's not so bad), but only 3 SETI tasks. That is, one core stays idle (MW tasks consume almost no CPU, less than CUDA MB).

Please, help to set project shares and other options to balance load.
Ideally I want to see 4 SETI tasks + 3 or more (it actually doesn't matter how many more) MW tasks running simultaneously.
Maybe the avg and max CPU values in app_info should be changed too?

ADDON: additional info:
current project shares (only 2 projects active):
SETI - 10000
MW -  15000
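
For reference, the two shares above work out to a 40/60 split. A minimal Python sketch of that arithmetic follows; the dictionary and function names are made up for illustration. It says nothing about which task BOINC starts at a given moment, since resource share is generally understood as a long-term target rather than a per-moment task count.

```python
# Resource shares as quoted in this post (SETI 10000, MW 15000).
shares = {"SETI": 10000, "MW": 15000}

def share_fractions(shares):
    """Return each project's share as a fraction of the total (illustrative helper)."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}

fracs = share_fractions(shares)
for name, frac in fracs.items():
    print(f"{name}: {frac:.0%}")
```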

IMHO a bigger MW share should encourage BOINC to run MW tasks.
But here is what I observe:
2 MW tasks started, then were put into the "waiting to run" state.
Short-term debt for MW (observed via BOINCView) is ~5000 s now and continues to increase.
Debt for SETI is ~ -5000 s correspondingly...
Why doesn't BOINC allow MW tasks to run under these conditions?
« Last Edit: 10 Jul 2009, 02:10:43 pm by Raistmer »

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Even more info:
I suspended SETI for 1-2 seconds, then resumed it.
BOINC started a lot of MW tasks (>10); then, after SETI was resumed, it started 3 SETI tasks.
After a few MW tasks completed, it started a 4th SETI task.
Short-term debts were reset to zero. Now the MW short-term debt is increasing again and the number of running MW tasks is decreasing (that is, BOINC refuses to start additional MW tasks when some of them complete).
What is wrong?
ADDON:
current app_info for MW:

    <flops>1.0e11</flops>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>0.2</max_ncpus>
    <cmdline></cmdline>
« Last Edit: 10 Jul 2009, 02:28:01 pm by Raistmer »

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
How are these MW tasks supposed to work?

I know they do the actual work on a GPU that BOINC knows nothing about. So they must be scheduled as CPU tasks, right? How do you get seven tasks, all believed by BOINC to be CPU tasks, to run at the same time? Non-computationally-intensive? 175% CPU utilisation?

What (and this is a serious question) does <cpu_sched_debug> put out?

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Yes, MW uses the avg_ncpus setting in app_info to declare that it uses almost no CPU. That allows BOINC to run multiple instances of MW plus some more CPU-intensive tasks like SETI.
MW tasks are indeed processed completely on the GPU.
Will enable the option you mentioned.

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Right. Assign a plan class with avg & max cpus of ~0.95 to the SaH CPU tasks (AKv8, I presume)... Initially you should get 5, then it should drop back  :o

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
10/07/2009 22:33:13      [cpu_sched_debug] Request enforce CPU schedule: schedule_cpus
10/07/2009 22:33:13      [cpu_sched_debug] enforce_schedule(): start
10/07/2009 22:33:13      [cpu_sched_debug] preliminary job list:
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] 0: ps_sgr_210F5_3s_hiw_15791565_1247247343_0
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 1: 13se08ab.29372.13569.8.8.136_1
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 2: 22dc08ac.28597.14904.7.8.164_1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] 3: ps_sgr_222F5_3s_hiw_15793323_1247247510_0
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 4: 29au08aa.21644.18477.7.8.4_1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] 5: ps_sgr_214F5_3s_hiw_15793750_1247247570_0
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 6: 13se08ab.29372.17250.8.8.237_1
10/07/2009 22:33:13      [cpu_sched_debug] final job list:
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] 0: ps_sgr_210F5_3s_hiw_15791565_1247247343_0
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] 1: ps_sgr_230F5_3s_hiw_15810913_1247249472_0
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 2: 13se08ab.29372.13569.8.8.136_1
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 3: 22dc08ac.28597.14904.7.8.164_1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] 4: ps_sgr_222F5_3s_hiw_15793323_1247247510_0
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 5: 29au08aa.21644.18477.7.8.4_1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] 6: ps_sgr_214F5_3s_hiw_15793750_1247247570_0
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 7: 13se08ab.29372.17250.8.8.237_1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] scheduling ps_sgr_210F5_3s_hiw_15791565_1247247343_0
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] scheduling ps_sgr_230F5_3s_hiw_15810913_1247249472_0
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] scheduling 13se08ab.29372.13569.8.8.136_1
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] scheduling 22dc08ac.28597.14904.7.8.164_1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] scheduling ps_sgr_222F5_3s_hiw_15793323_1247247510_0
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] scheduling 29au08aa.21644.18477.7.8.4_1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] scheduling ps_sgr_214F5_3s_hiw_15793750_1247247570_0
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] scheduling 13se08ab.29372.17250.8.8.237_1
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 13se08ab.29372.13569.8.8.136_1 sched state 2 next 2 task state 1
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 22dc08ac.28597.14904.7.8.164_1 sched state 2 next 2 task state 1
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 29au08aa.21644.18477.7.8.4_1 sched state 2 next 2 task state 1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] ps_sgr_230F5_3s_hiw_15787913_1247246910_0 sched state 1 next 1 task state 9
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] ps_sgr_210F5_2s_hiw_15788517_1247247017_0 sched state 1 next 1 task state 9
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] ps_sgr_210F5_2s_hiw_15788518_1247247017_0 sched state 1 next 1 task state 9
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] ps_sgr_210F5_3s_hiw_15791565_1247247343_0 sched state 2 next 2 task state 1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] ps_sgr_222F5_3s_hiw_15793323_1247247510_0 sched state 2 next 2 task state 1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] ps_sgr_214F5_3s_hiw_15793750_1247247570_0 sched state 2 next 2 task state 1
10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] ps_sgr_230F5_3s_hiw_15810913_1247249472_0 sched state 2 next 2 task state 1
10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 13se08ab.29372.17250.8.8.237_1 sched state 2 next 2 task state 1
10/07/2009 22:33:13      [cpu_sched_debug] enforce_schedule: end
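
A log like this is easier to read once the per-task state lines are tallied by project. Below is a rough Python sketch that does that. The regular expression is written against the line layout shown above, the function name is invented for illustration, and the reading of sched state 2 as "scheduled" and 1 as "preempted" is an assumption that this thread does not confirm. The sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches only the per-task state lines, e.g.
# "<date> <time>   SETI@home   [cpu_sched_debug] <task> sched state 2 next 2 task state 1"
STATE_LINE = re.compile(
    r"^\S+\s+\S+\s+(?P<project>\S+)\s+\[cpu_sched_debug\]\s+"
    r"(?P<task>\S+) sched state (?P<sched>\d+) next (?P<next>\d+) "
    r"task state (?P<task_state>\d+)$"
)

def count_by_sched_state(lines):
    """Count state lines per (project, sched state); other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = STATE_LINE.match(line.strip())
        if match:
            counts[(match.group("project"), int(match.group("sched")))] += 1
    return counts

sample = [
    "10/07/2009 22:33:13      [cpu_sched_debug] final job list:",
    "10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] scheduling 13se08ab.29372.13569.8.8.136_1",
    "10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 13se08ab.29372.13569.8.8.136_1 sched state 2 next 2 task state 1",
    "10/07/2009 22:33:13   SETI@home   [cpu_sched_debug] 22dc08ac.28597.14904.7.8.164_1 sched state 2 next 2 task state 1",
    "10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] ps_sgr_230F5_3s_hiw_15787913_1247246910_0 sched state 1 next 1 task state 9",
    "10/07/2009 22:33:13   Milkyway@home   [cpu_sched_debug] ps_sgr_210F5_3s_hiw_15791565_1247247343_0 sched state 2 next 2 task state 1",
]

counts = count_by_sched_state(sample)
for (project, sched_state), n in sorted(counts.items()):
    print(f"{project}: sched state {sched_state} x{n}")
```

Run over the full log above, the same tally should come out as 4 SETI and 4 MW lines with sched state 2, plus 3 MW lines with sched state 1.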

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Quote from: Jason G
Right. Assign a plan class with avg & max cpus of ~0.95 to the SaH CPU tasks (AKv8, I presume)... Initially you should get 5, then it should drop back  :o

Ok, will add.

EDIT:
After a restart, BOINC started 4 SETI + 4 MW tasks. That's just OK. Will see what comes next...

ADDON: the current MW short-term debt is approaching 3000 s...
« Last Edit: 10 Jul 2009, 02:42:57 pm by Raistmer »

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Quote from: Raistmer
EDIT:
After a restart, BOINC started 4 SETI + 4 MW tasks. That's just OK. Will see what comes next...

It's a packing algorithm instead of graph colouring (otherwise known as Richard's back-of-the-envelope calculation). Debt will accumulate according to the amount specified in Max when the task isn't running, or even when it is running but the scheduler thinks it has no room  :o
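
To make the packing idea concrete, here is a toy Python model. It is a simplified first-fit sketch, not BOINC's actual scheduler code: the function names, the 4-core budget and the queue order are assumptions chosen to match the numbers reported in this thread. Each task claims its avg_ncpus, and a task starts only if that claim still fits in the remaining CPU budget.

```python
def pack(tasks, ncpus):
    """Greedy first-fit: walk the queue in order, start a task if its claim fits.

    CPU claims are tracked in integer hundredths to avoid float rounding issues.
    """
    budget = round(ncpus * 100)
    running = []
    for name, avg_ncpus in tasks:
        cost = round(avg_ncpus * 100)
        if cost <= budget:
            budget -= cost
            running.append(name)
    return running

def make_tasks(seti_ncpus, n_seti=4, n_mw=10, mw_ncpus=0.05):
    """Build hypothetical queues: 4 SETI CPU tasks and 10 MW GPU tasks."""
    seti = [(f"SETI_{i}", seti_ncpus) for i in range(n_seti)]
    mw = [(f"MW_{i}", mw_ncpus) for i in range(n_mw)]
    return seti, mw

def summarize(running):
    """Return (number of SETI tasks, number of MW tasks) that were started."""
    n_seti = sum(name.startswith("SETI") for name in running)
    n_mw = sum(name.startswith("MW") for name in running)
    return n_seti, n_mw

seti, mw = make_tasks(seti_ncpus=1.0)
print(summarize(pack(seti + mw, ncpus=4)))    # (4, 0): SETI fills all cores, GPU idle
print(summarize(pack(mw + seti, ncpus=4)))    # (3, 10): MW first, one core left idle

seti95, mw = make_tasks(seti_ncpus=0.95)
print(summarize(pack(seti95 + mw, ncpus=4)))  # (4, 4): 4*0.95 + 4*0.05 = 4.0
```

In this model, dropping the SETI claim to 0.95 frees exactly 0.2 of a CPU, which is room for four MW tasks at 0.05 each. That matches the 4+4 state reported above, though the real scheduler also weighs debt and deadlines.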

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Now, after a few MW tasks completed, it is still 4+4. Will see what happens after one of the SETI tasks completes...
Debt is ~3500.
And what is the max value?

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
32-bit unsigned int?

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
:) hopefully less :)

Oops, "communicating with BOINC"... and 99% idle CPU. It seems boinc.exe crashed...
I use a patched version now.
« Last Edit: 10 Jul 2009, 03:03:39 pm by Raistmer »

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Quote from: Raistmer
:) hopefully less :)

Oops, "communicating with BOINC"... and 99% idle CPU. It seems boinc.exe crashed...
I use a patched version now (with project-wide network activity suspend).

LoL. It doesn't like being watched.

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
It seems so :)

Well, no crash dump. It seems it was not a boinc.exe crash; all apps just exited with a "no heartbeat" state.

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Quote from: Raistmer
...just all apps exited with "no heartbeat" state.

Because you deliberately terminated 8 apps simultaneously?... no... didn't really think so.

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
No, I didn't terminate them.
There was a delay in the manager's communication with the daemon. It seems the daemon hung for more than 30 seconds, so all apps exited with a "no heartbeat" message. I see that message in SETI's stderr (the MW tasks are already gone, of course).
Maybe it's just because BOINC's stderr with debug enabled has grown too big (more than 2 MB now)... don't know. Will see if it happens again.
Will leave it for a while and check later. It is still 4+4 now, which can be considered a perfect state :)

 
