Author Topic: Exceeded Elapsed Time Limit - Error??  (Read 16354 times)

Ghost0210

  • Guest
Exceeded Elapsed Time Limit - Error??
« on: 30 May 2010, 02:57:35 pm »
Hi,
As both the Seti boards are down at the moment, I'm hoping someone here may be able to help.  :D
I was halfway through processing a couple of Beta wu's, got to about 55% through the tasks, and they then both aborted with the following messages:
30/05/2010 19:52:26   SETI@home Beta Test   Aborting task 18dc09aa.22310.7429.5.13.83_0: exceeded elapsed time limit 272.626338
30/05/2010 19:52:26   SETI@home Beta Test   Aborting task 18dc09aa.22310.7429.5.13.82_0: exceeded elapsed time limit 272.626338
30/05/2010 19:52:28      [wfd] Request work fetch: application exited
30/05/2010 19:52:28      [wfd] Request work fetch: application exited

I've never seen a wu error out with this message before and was hoping that someone might either have seen it or be able to explain why the tasks self aborted
Thanks
 
[edit]Should have said these tasks were cuda wu's - looks like I've had quite a few cuda tasks that have all aborted after about 4:30 run time
« Last Edit: 30 May 2010, 03:21:36 pm by Ghost0210 »

Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: Exceeded Elapsed Time Limit - Error??
« Reply #1 on: 30 May 2010, 03:57:10 pm »
I've only seen Maximum Time Exceeded errors with Astropulse tasks,
either when the flops value in the app_info was set far too high,
or with Seti Beta's Stock Hybrid Astropulse app,
when a slow (AMD) CPU was paired with a fastish GPU,
since Hybrid Astropulse app is mostly a CPU app, with a bit of GPU computation thrown in,
slow AMD computers were having their tasks aborted,
because the project used the GPU flops value instead of CPU flops,

Was that 4mins 30secs, or 4hrs 30mins? Could be a VLAR task on a very slow GPU,
or the Cuda app might have dropped into CPU-fallback mode,

Claggy

Ghost0210

  • Guest
Re: Exceeded Elapsed Time Limit - Error??
« Reply #2 on: 30 May 2010, 04:05:00 pm »
Hi Claggy, thanks for the info, I think you may have sorted this for me :)
It was 4 minutes 30. All the tasks that seem to have had this error message have been running on one of my GT240's and only today have I noticed this happening.
All the cuda tasks that have run on my GTX260 have completed fine no matter what the run time, which is why I was confused
I do use an app_info on Beta though, and because there's such a difference in compute speed between the 240's and the 260 I have had to compromise on the <flops> value and try to go in the middle of the values
I guess that I may have been leaning too far with the flops towards the 260 and may then need to lower the value so the 240's get a more realistic est. runtime   ???

Ghost
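
For anyone reading later, here is a rough Python sketch of why a "middle" <flops> value can hurt the slower card. It assumes the client derives the elapsed time limit as rsc_fpops_bound / flops, which matches the behaviour described in this thread but is not taken from the BOINC source. The function names, the bound and the per-card speeds are all made up for illustration.

```python
from typing import Mapping


def elapsed_time_limit(rsc_fpops_bound: float, flops: float) -> float:
    """Seconds a task may run before it is aborted (assumed formula)."""
    if flops <= 0:
        raise ValueError("flops must be positive")
    return rsc_fpops_bound / flops


def safe_flops(device_flops: Mapping[str, float]) -> float:
    """Use the slowest device's speed, which gives the loosest time limit."""
    return min(device_flops.values())


# Hypothetical numbers, for illustration only.
bound = 3.0e14                                  # rsc_fpops_bound of one task
cards = {"GT240": 2.0e10, "GTX260": 6.0e10}     # effective speed of each card

compromise = sum(cards.values()) / len(cards)   # the "middle" value
for label, flops in [("compromise", compromise),
                     ("slowest card", safe_flops(cards))]:
    limit = elapsed_time_limit(bound, flops)
    print(f"{label}: flops={flops:.2e}, limit={limit:.0f} s")
```

With these made-up numbers the compromise value halves the time limit compared with the slowest-card value, while the GT240 still needs just as long to finish the task.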

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: Exceeded Elapsed Time Limit - Error??
« Reply #3 on: 30 May 2010, 07:35:51 pm »
The mb_splitter processes set <rsc_fpops_bound>, which controls the computation limit, to ten times the <rsc_fpops_est> value. In the past that has been plenty: the host's Duration Correction Factor (DCF) is applied only to the estimated runtime, and with most hosts having a DCF of something like 0.2, the relative time ratio was more like 50 in practice.

The new credit system being tested at Seti Beta includes server-side code which scales the estimate and bound for work being sent to be done with each application on the host. The scaling is based on server statistics for the host's performance using that application. If it works as intended, the host DCF should tend to about 1.0, so the extra margin we had will be gone. And it's right that using the same application with two GPUs of widely differing capability is likely to cause problems. David Anderson may not have considered that, since the BOINC default setting would only use the better GPU. The server of course doesn't have any way of knowing which GPU will actually be used; I'm not even sure the characteristics of anything other than the better GPU are sent to the servers.

The project might consider a larger margin in the bound set by the splitter. Meanwhile, reducing the <flops> setting should help at least for a while, though I suspect it may only cause the server-side adjustment to shift enough to have the problem come back after you've returned more results. It's good Beta testing of the new credit system in actual use...
                                                                                         Joe
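
The margin arithmetic in this post can be sketched in a few lines of Python. This is only an illustration, not BOINC code: the function name is made up, and it assumes the bound is a fixed multiple of the estimate while DCF scales only the estimate, as described above.

```python
def effective_margin(bound_factor: float, dcf: float) -> float:
    """Ratio of the time limit to the DCF-corrected runtime estimate.

    Assumes bound = bound_factor * estimate and that DCF is applied
    only to the estimate, not to the bound.
    """
    if dcf <= 0:
        raise ValueError("dcf must be positive")
    return bound_factor / dcf


# Old situation: bound is 10x the estimate, typical host DCF around 0.2.
old_margin = effective_margin(10.0, 0.2)

# New credit system: server-side scaling should push DCF towards 1.0.
new_margin = effective_margin(10.0, 1.0)

print(f"old margin: {old_margin:.0f}x, new margin: {new_margin:.0f}x")
```

So a host that used to have roughly 50 times its corrected estimate before an abort would be left with about 10 times once DCF settles near 1.0.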

Ghost0210

  • Guest
Re: Exceeded Elapsed Time Limit - Error??
« Reply #4 on: 31 May 2010, 04:44:56 pm »
Thanks Joe,
So it looks like this could be a mix of my fault :( for having a badly written app_info and the new credit system @ Beta
Anyway bit the bullet and got myself a new HD 5670 to replace one of the 240's, so removed the troublesome app_info.

Thanks for your answers, will have to remember next time I modify an app_info to write it for the slowest of my cards

Thanks
Ghost

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Exceeded Elapsed Time Limit - Error??
« Reply #5 on: 01 Jun 2010, 03:03:28 am »

Quote
Anyway bit the bullet and got myself a new HD 5670 to replace one of the 240's, so removed the troublesome app_info.

Welcome to ATI AP test then ;)

Ghost0210

  • Guest
Re: Exceeded Elapsed Time Limit - Error??
« Reply #6 on: 01 Jun 2010, 08:43:03 am »
Already got a stock ATI AP test running at the moment ;D
As soon as I can figure out which app to run, and that the card is running correctly,
I'll be changing over to your test app

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Exceeded Elapsed Time Limit - Error??
« Reply #7 on: 01 Jun 2010, 10:00:02 am »
rev420 with OpenCL support just for your GPU.

Offline Fredericx51

  • Knight o' The Round Table
  • ***
  • Posts: 207
  • Knight Who Says Ni N!
Re: Exceeded Elapsed Time Limit - Error??
« Reply #8 on: 01 Jun 2010, 01:13:53 pm »
Have installed rev. 420 again: first updated the Catalyst driver, then installed SDK 2.1. After boot, I get a message: card error?
No problems with MW on EAH 4850 (DP) and Collatz on EAH 4850 & HD 5770.(SP)

(BTW GPUz, shows NO OpenCL!)

Can't reach SETI Beta today (time-out or very, very slow response); last night I got some new AP WU's.
What is the 'safest FLOPS setting' I can use with these cards? Unfortunately BOINC (6.10.56) shows 2 HD5770 at 1000 GFLOPS each, but the 5770 is > 1 TFLOP and the 4850 =< 1 TFLOP.
Have to stop C.C. (or make an app_info.xml for this too; MW already has one) to get AP WU's to run.

Hope it will run as supposed to  ::)

Ghost0210

  • Guest
Re: Exceeded Elapsed Time Limit - Error??
« Reply #9 on: 01 Jun 2010, 01:24:44 pm »
Quote
rev420 with OpenCL support just for your GPU.


Thanks Raistmer, I'm just downloading the SDK and will install once I've got that running

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Exceeded Elapsed Time Limit - Error??
« Reply #10 on: 01 Jun 2010, 01:39:20 pm »
Quote
Have installed rev.420 again, first UPDated  Catalyst driver, installed SDK 2.1. After BOOT, get a message :card error? .
No problems with MW on EAH 4850 (DP) and Collatz on EAH 4850 & HD 5770.(SP)

(BTW GPUz, shows NO OpenCL!)
1) Why did you reinstall everything? The error you reported before is not connected with a wrong SDK/driver installation; it's BOINC's own problem, connected ONLY with an incorrect flops setting in app_info.
2) Looks like _now_ you do need to reinstall. My GPU-Z 0.4.3 correctly reports OpenCL for the HD4870, as GPU-Z 0.4.2 did too.
You need Catalyst 10.4 drivers + SDK 2.1 installed to get OpenCL working.

Quote
What is the 'safest FLOPS setting', I can use, with these cards, unfortunatly BOINC (6.10.56) shows 2 HD5770 1000 GFLOPS,  each. But 5770 is > 1 TFLOP and 4850 =< 1 TFLOP.
It was already listed: 11987654321 (that is, 12 Gflops). See my explanation of this number in the corresponding thread.

Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: Exceeded Elapsed Time Limit - Error??
« Reply #11 on: 01 Jun 2010, 04:52:34 pm »
Seti Beta has had a problem today with resent work having a zero flops value and, on start, getting a Maximum Time Exceeded error like this one:

01/06/2010 20:47:38   SETI@home Beta Test   Starting ap_18dc09aa_B2_P0_00006_20100523_07077.wu_0
01/06/2010 20:47:38   SETI@home Beta Test   [cpu_sched] Starting ap_18dc09aa_B2_P0_00006_20100523_07077.wu_0 (initial)
01/06/2010 20:47:38   SETI@home Beta Test   Starting task ap_18dc09aa_B2_P0_00006_20100523_07077.wu_0 using astropulse version 505
01/06/2010 20:47:40   SETI@home Beta Test   Aborting task ap_18dc09aa_B2_P0_00006_20100523_07077.wu_0: exceeded elapsed time limit 0.000000
01/06/2010 20:47:40   SETI@home Beta Test   [sched_op_debug] Reason: Unrecoverable error for result ap_18dc09aa_B2_P0_00006_20100523_07077.wu_0 (Maximum elapsed time exceeded)
01/06/2010 20:47:41   SETI@home Beta Test   Computation for task ap_18dc09aa_B2_P0_00006_20100523_07077.wu_0 finished
01/06/2010 20:47:41   SETI@home Beta Test   Output file ap_18dc09aa_B2_P0_00006_20100523_07077.wu_0_0 for task ap_18dc09aa_B2_P0_00006_20100523_07077.wu_0 absent

It should have been fixed now with Changeset 21671

I'll reset the project again later and redownload my six Astropulse tasks again.

Claggy

Ghost0210

  • Guest
Re: Exceeded Elapsed Time Limit - Error??
« Reply #12 on: 01 Jun 2010, 05:26:26 pm »
I saw the email from Richard to the Boinc_Alpha email list
Glad it wasn't just my app_info that was causing the issues, and that they managed to find it and fix it so quick

 
