Topic: BOINC 6.10.58 incorrect running time estimation (Read 26759 times)

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
BOINC 6.10.58 incorrect running time estimation
« on: 08 Feb 2012, 08:25:00 am »
On my NV host, newly received GPU tasks always get a time estimate of ~2h (just like CPU ones), though the real elapsed time is ~30 minutes or less.
In the past I ran the re-scheduler on this host, but it has not been run for a few months (!) already.
And the estimated time is still hugely incorrect.
It prevents the host from downloading enough work to allow running with the network disabled for statistics collection...

Please, any advice on how it can be fixed w/o a BOINC upgrade?

EDIT: it does correct the estimate for already downloaded tasks, but when fresh tasks arrive they have a huge estimate again...
app_info is:

Quote
<app_info>
    <app>
        <name>setiathome_enhanced</name>
    </app>
    <file_info>
        <name>AK_v8b2_win_x64_SSSE3x.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>603</version_num>
        <platform>windows_intelx86</platform>
        <file_ref>
            <file_name>AK_v8b2_win_x64_SSSE3x.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>603</version_num>
        <platform>windows_x86_64</platform>
        <file_ref>
            <file_name>AK_v8b2_win_x64_SSSE3x.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <app>
        <name>setiathome_enhanced</name>
    </app>
    <file_info>
        <name>Lunatics_x41g_win32_cuda32.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>cudart32_32_16.dll</name>
        <executable/>
    </file_info>
    <file_info>
        <name>cufft32_32_16.dll</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>610</version_num>
        <platform>windows_intelx86</platform>
        <plan_class>cuda_fermi</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>609</version_num>
        <platform>windows_intelx86</platform>
        <plan_class>cuda23</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>608</version_num>
        <platform>windows_intelx86</platform>
        <plan_class>cuda</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>610</version_num>
        <platform>windows_x86_64</platform>
        <plan_class>cuda_fermi</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>609</version_num>
        <platform>windows_x86_64</platform>
        <plan_class>cuda23</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>608</version_num>
        <platform>windows_x86_64</platform>
        <plan_class>cuda</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
</app_info>
« Last Edit: 08 Feb 2012, 08:27:47 am by Raistmer »

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #1 on: 08 Feb 2012, 08:52:37 am »
ADDON:

SETI@home Enhanced (anonymous platform, CPU)
Number of tasks completed   4527
Max tasks per day   228
Number of tasks today   0
Consecutive valid tasks   129
Average processing rate   17.686758629915
Average turnaround time   8.77 days
SETI@home Enhanced (anonymous platform, nvidia GPU)
Number of tasks completed   14106
Max tasks per day   675
Number of tasks today   0
Consecutive valid tasks   576
Average processing rate   255.03870331506
Average turnaround time   6.36 days

Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #2 on: 08 Feb 2012, 09:00:37 am »
Either bother DA or Eric to fix the server-side estimates for Anonymous Platform users that aren't using flops entries,

or put flops into your app_info (use the APR figure with e09 at the end), e.g. 17.686758629915e09 for the CPU MB app & 255.03870331506e09 for the Nvidia GPU MB app,

or use Jason's 6.10.58 boinc.exe,

Claggy
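
For reference, a rough sketch of what the flops option could look like for the cuda_fermi entry from the first post. The <flops> value is simply the GPU APR from Reply #1 with e09 appended, as suggested above; where the line sits inside <app_version> and the number itself are illustrative assumptions to adapt, not a tested configuration. Everything else is copied unchanged from the original app_info.

```xml
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>
    <platform>windows_intelx86</platform>
    <plan_class>cuda_fermi</plan_class>
    <avg_ncpus>0.040000</avg_ncpus>
    <max_ncpus>0.040000</max_ncpus>
    <!-- Assumed value: GPU APR 255.03870331506 with e09 appended, per the suggestion above -->
    <flops>255.03870331506e09</flops>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_32_16.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_32_16.dll</file_name>
    </file_ref>
</app_version>
```

The CPU entries would get the CPU figure (17.686758629915e09) in the same way. The other CUDA entries reference the same executable, so they would presumably take the same GPU figure, but that is an assumption worth checking against each entry's own APR.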

Offline Mike

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 2427
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #3 on: 08 Feb 2012, 09:05:05 am »
I have the same issue with GPU ETAs since I removed the flops entry.
As long as I get enough work I don't care.
But it's funny at least.
I remember Joe telling me at main that it will correct itself.
Maybe in 100 years.

Mike

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #4 on: 08 Feb 2012, 11:24:06 am »
Quote
On my NV host, newly received GPU tasks always get a time estimate of ~2h (just like CPU ones), though the real elapsed time is ~30 minutes or less.
In the past I ran the re-scheduler on this host, but it has not been run for a few months (!) already.
And the estimated time is still hugely incorrect.
It prevents the host from downloading enough work to allow running with the network disabled for statistics collection...

Please, any advice on how it can be fixed w/o a BOINC upgrade?

That's not a client issue - it's the result of server changeset 24128, much discussed on the SETI message boards since 05 September 2011.

Options are:
  • Only run one application type at a time (probably advisable for statistical collection anyway)
  • Populate app_info with appropriate <flops> values for every app specified
  • Use Jason's aDCF client variant
  • Join us in lobbying DA and SETI staff to restore the server configuration to documented operating mode - safely

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #5 on: 08 Feb 2012, 11:28:46 am »
Quote
Either bother DA or Eric to fix the server-side estimates for Anonymous Platform users that aren't using flops entries,

or put flops into your app_info (use the APR figure with e09 at the end), e.g. 17.686758629915e09 for the CPU MB app & 255.03870331506e09 for the Nvidia GPU MB app,

or use Jason's 6.10.58 boinc.exe,

Claggy

Hehe, I think that would bother me more than them, so I'll skip directly to local methods ;)

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #6 on: 08 Feb 2012, 11:36:06 am »
Quote
On my NV host, newly received GPU tasks always get a time estimate of ~2h (just like CPU ones), though the real elapsed time is ~30 minutes or less.
In the past I ran the re-scheduler on this host, but it has not been run for a few months (!) already.
And the estimated time is still hugely incorrect.
It prevents the host from downloading enough work to allow running with the network disabled for statistics collection...

Please, any advice on how it can be fixed w/o a BOINC upgrade?

That's not a client issue - it's the result of server changeset 24128, much discussed on the SETI message boards since 05 September 2011.

Options are:
  • Only run one application type at a time (probably advisable for statistical collection anyway)
  • Populate app_info with appropriate <flops> values for every app specified
  • Use Jason's aDCF client variant
  • Join us in lobbying DA and SETI staff to restore the server configuration to documented operating mode - safely

OK, I use Jason's build on the ATi host, so I will transfer the executable there too.... It's very irritating when such basic things don't work correctly... the work cache is really one of BOINC's basic functions  :-\

And about any limitation on app type usage for statistics - no, I measure real-life performance, so it's better to get all mixes... That host runs only one MB GPU task at once anyway, so AP and MB will not bother each other...
My ATi host, on the other hand, runs 3 at once - there MB/AP mixes are indeed possible...

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #7 on: 08 Feb 2012, 12:45:37 pm »
Quote
I have the same issue with GPU ETAs since I removed the flops entry.
As long as I get enough work I don't care.
But it's funny at least.
I remember Joe telling me at main that it will correct itself.
Maybe in 100 years.

Mike

Not in any amount of time now. BOINC changeset 24128, as revised by 24217, means that if your host is telling the servers the app runs at less than 1/10 its actual speed, the situation cannot correct itself. And if you don't specify <flops> in app_info.xml, the core client does a "conservative" estimate based on the notion that the Whetstone benchmark is peak FLOPS for the CPU and a GPU might actually be slower than a CPU.

Added note: Those assumptions underlying the estimate may be true at some other project, either now or in the future. IOW, I don't expect them to change. Similarly, the 1/10 factor in changeset 24217 applies to all projects which allow anonymous platform, so getting that changed to better suit S@H may not be likely.
                                                            Joe
« Last Edit: 08 Feb 2012, 01:15:13 pm by Josef W. Segur »

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #8 on: 08 Feb 2012, 01:18:55 pm »
But we got David to write http://boinc.berkeley.edu/trac/changeset/24225/boinc (and bugfixes) specifically "for applications like SETI@home", as a more scientific and targeted approach to solving Claggy's presenting problem.

If we could get the project to adopt that code as intended, there would be no need for the crude, blunderbuss, APR capping.

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #9 on: 08 Feb 2012, 03:12:26 pm »
Quote
.... It's very irritating when such basic things don't work correctly... the work cache is really one of BOINC's basic functions  :-\ ...

In principle I agree, even though in computer science such scheduling/logistics problems are technically considered 'NP-Hard', meaning there's no known way to solve them 'optimally' by machine in reasonable time.  The usual approach is to settle on some 'near enough' acceptable answer using heuristics (i.e. human logic), which is time-proven and can be effective.  Unfortunately there seems to be some disparity between what Boinc devs & end users consider acceptable, and that Boinc devs can think like humans is debatable  :D

Jason

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #10 on: 08 Feb 2012, 03:19:57 pm »
 :D ;D

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #11 on: 08 Feb 2012, 03:36:57 pm »
FWIW, fixing the client side DCF logic, and adding & switching to per app DCFs, weren't enough by themselves, and tended to oscillate/overshoot/freakout, even though not as badly.

The later modded Boincs use a hybrid PID controller core (from Engineering control systems theory)  tuned to damp oscillation sufficiently, but allow a quick rise under normal work variation. It's considered hybrid, rather than pure PID, because it still uses heuristics for safety with overflows etc.

That's been fairly stable here for quite a while, which means if later I switch to a simpler 'fuzzy' equivalent, lock DCFs/aDCFs to 1.0 & try manipulating flops instead as suggested by Joe, there should be the capability to use some more human readable/adjustable controls.  (Neither DCF nor other fudge factors nor PID controls are considered human readable, whereas fuzzy parameters usually are.)

That, using the app flops, could end up more acceptable for future client integration (than per app DCF), though I am very wary of trying to introduce client side improvements that would disguise the fairly severe server side flaws at this stage.

Jason

[Edit:] found the good description I was looking for, illustrating the difference between pure heuristic-based control and PID (Proportional-Integral-Derivative) based control: http://en.wikipedia.org/wiki/Gaze_heuristic
Quote
Gaze heuristic
From Wikipedia, the free encyclopedia
The gaze heuristic is a heuristic employed by people when trying to catch a ball. Experimental studies have shown that people do not act as though they were solving a system of differential equations that describe the forces acting on the ball while it is in the air and then run to the place at which the ball is predicted to hit the ground. Instead they fixate the ball with their eyes and move so as to keep the angle of the gaze either constant or within a specific range. Moving in such a fashion assures that the ball will hit the catcher.
PID control 'works' but isn't very human-like, so it can be difficult to tune, sometimes looks illogical (even when tuned correctly & working), and tends to become clunky if you have to add a lot of heuristics around it.  Either full fuzzy, or fuzzy-enhanced PID, should end up better resembling natural human-like control & tuning, so 'feel' better, even though effectively doing the same job as PID, probably to similar mathematical quality.
« Last Edit: 08 Feb 2012, 05:03:18 pm by Jason G »

Offline SciManStev

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 263
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #12 on: 08 Feb 2012, 05:40:08 pm »
From what I'm reading here, it looks like these lousy limits are here to stay. I must admit I find that quite disturbing. Rats!

Steve

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #13 on: 08 Feb 2012, 05:48:52 pm »
Quote
From what I'm reading here, it looks like these lousy limits are here to stay. I must admit I find that quite disturbing. Rats!

Steve

I'd say probably not 'here to stay', other than project side seems to be focussed on some other mysterious stuff lately, rather than tying up loose ends with the main project or Beta.  What those other things might be though, is open to guessing  ;)

Offline SciManStev

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 263
Re: BOINC 6.10.58 incorrect running time estimation
« Reply #14 on: 08 Feb 2012, 06:09:43 pm »
Quote
From what I'm reading here, it looks like these lousy limits are here to stay. I must admit I find that quite disturbing. Rats!

Steve

I'd say probably not 'here to stay', other than project side seems to be focussed on some other mysterious stuff lately, rather than tying up loose ends with the main project or Beta.  What those other things might be though, is open to guessing  ;)

I'll take even the smallest bit of optimism as a very positive step. My GPUs ran dry again today. I have a few GPU units on board now, but it does get frustrating going from full to the limits to empty in just a few hours. I can only hope this will eventually get fixed.

Steve
