Author Topic: 2.4 Unrecoverable Errors ??  (Read 27665 times)

Offline Gus

  • Alpha Tester
  • Squire
  • ***
  • Posts: 35
2.4 Unrecoverable Errors ??
« on: 11 Aug 2007, 12:35:05 am »
This is a brand new machine, so I don't know if these unrecoverable errors are coming from the hardware, software, or whatever.  It's a core-2 quad (Q6600) right out of the box about 3 hours ago, a stock Dell computer, and I've never run anything on it but KWSN_2.4_SSE3-Core2_MB.   It is not OC'ed nor modified in any way.  It's cool in the house so temp should not be an issue.  The box has displayed no errors so far except these.

Or they may just be bad wu's and mean nothing.

I received two of these errors so far, pretty close together, in about 2 hours of total crunch time.  I'll attach a text file with the Messages Tab entries around the time of one of the errors, a URL to the two Results, and a URL to the computer.

I'm going to bed shortly so if you have any questions feel free to email me at obermege2 at comcast dot net and I'll reply in the morning. . . .

EDIT
OK, after watching this for a while it looks like they may be bad wu's.  I've had several others and the names of them all begin with 03mr07aa.19893.3344.xx.x.xxx.  I'll leave this post up just in case.

[attachment deleted by admin]
« Last Edit: 11 Aug 2007, 07:20:34 am by Gus »

Offline Simon

  • Ni!
  • Knight who says 'Ni!'
  • *****
  • Posts: 1045
    • Is it a bird? Is it a plane? No...its-the.net!
Re: 2.4 Unrecoverable Errors
« Reply #1 on: 11 Aug 2007, 02:53:19 am »
We'll take a look - right now, your result is the only one available for those WUs, so it's difficult to compare.

They do look like "can happen" mistakes.

Regards,
Simon.

Offline Gus

  • Alpha Tester
  • Squire
  • ***
  • Posts: 35
Re: 2.4 Unrecoverable Errors
« Reply #2 on: 11 Aug 2007, 07:10:28 am »
One has now been returned OK by someone else.  Also the most recent one is not the same vintage as the others. (Ignore the very first one, it was likely caused when I switched to the optimized app.)

It may indeed be a problem with the machine after all.  I'll reseat the memory and run some diags.  I've since set it to receive no new work. Hate to do that the way things are running but I don't want to trash a bunch of wu's.

Can anyone suggest good free diagnostics that really torture the machine?

(Later) I now see that it spit out a whole page of errors one after the other.  Just reseated memory and restarted, we'll see.
« Last Edit: 11 Aug 2007, 07:47:41 am by Gus »

christofire

  • Guest
Re: 2.4 Unrecoverable Errors ??
« Reply #3 on: 11 Aug 2007, 10:06:20 am »
Prime95 - CPU torture test
MemTest86+ - Memory test (runs at boot-time)

Offline Gus

  • Alpha Tester
  • Squire
  • ***
  • Posts: 35
Re: 2.4 Unrecoverable Errors ??
« Reply #4 on: 11 Aug 2007, 11:14:40 am »
Thanks, I have 4 copies of Prime95 running now, I'll do the memtest after they've run a while.

Offline Morten

  • Knight o' The Round Table
  • ***
  • Posts: 165
Re: 2.4 Unrecoverable Errors ??
« Reply #5 on: 11 Aug 2007, 11:29:54 am »
Hi,

I have many WUs with same error. Bad WUs? The following is common:

1: Windows Vista

2:
>
Access Violation (0xc0000005)

- Paged Pool Usage -
QuotaPagedPoolUsage: 43920, QuotaPeakPagedPoolUsage: 43952
QuotaNonPagedPoolUsage: 2512, QuotaPeakNonPagedPoolUsage: 2512

<

Peak value exceeds Quota, but this is not reflected in the Windows event log.
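
A note on reading the dump above: if these counters correspond to the identically named fields of the Windows `PROCESS_MEMORY_COUNTERS` structure (an assumption; the thread does not say where the values come from), then `QuotaPagedPoolUsage` is the current usage and `QuotaPeakPagedPoolUsage` is the peak, so a peak slightly above the current value would be expected and would not by itself indicate an exceeded quota. A minimal Python sketch that parses the two lines and compares them; `parse_counters` is a hypothetical helper, not part of BOINC:

```python
import re

def parse_counters(text):
    """Return a dict of counter name -> integer value from 'Name: value' pairs."""
    return {name: int(value) for name, value in re.findall(r"(\w+):\s*(\d+)", text)}

# The two counter lines copied from the error output quoted above.
dump = """\
QuotaPagedPoolUsage: 43920, QuotaPeakPagedPoolUsage: 43952
QuotaNonPagedPoolUsage: 2512, QuotaPeakNonPagedPoolUsage: 2512
"""

counters = parse_counters(dump)
# Gap between peak and current paged pool usage, in the dump's own units.
print(counters["QuotaPeakPagedPoolUsage"] - counters["QuotaPagedPoolUsage"])
```

Under that reading, the gap is small and the non-paged values are identical, which points away from pool exhaustion as the cause of the access violation.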

Here are more WUs:
http://setiathome.berkeley.edu/result.php?resultid=588614096
http://setiathome.berkeley.edu/result.php?resultid=588541570
http://setiathome.berkeley.edu/result.php?resultid=588535840
http://setiathome.berkeley.edu/result.php?resultid=588535838
http://setiathome.berkeley.edu/result.php?resultid=588535820
http://setiathome.berkeley.edu/result.php?resultid=588530274
http://setiathome.berkeley.edu/result.php?resultid=588530252
http://setiathome.berkeley.edu/result.php?resultid=588466412
http://setiathome.berkeley.edu/result.php?resultid=588466410
http://setiathome.berkeley.edu/result.php?resultid=588466408
http://setiathome.berkeley.edu/result.php?resultid=588466406
http://setiathome.berkeley.edu/result.php?resultid=588466404
http://setiathome.berkeley.edu/result.php?resultid=588466402
http://setiathome.berkeley.edu/result.php?resultid=588466274
http://setiathome.berkeley.edu/result.php?resultid=588434143
http://setiathome.berkeley.edu/result.php?resultid=588434135
http://setiathome.berkeley.edu/result.php?resultid=588434099
http://setiathome.berkeley.edu/result.php?resultid=588434077
http://setiathome.berkeley.edu/result.php?resultid=588432479
http://setiathome.berkeley.edu/result.php?resultid=588432477

I will install r2.4 on another box to see if this can be reproduced on Windows XP.

Morten

« Last Edit: 11 Aug 2007, 11:36:22 am by Morten »

Offline Gus

  • Alpha Tester
  • Squire
  • ***
  • Posts: 35
Re: 2.4 Unrecoverable Errors ??
« Reply #6 on: 11 Aug 2007, 12:01:20 pm »
@Morten-
Those are indeed the same errors I was getting.  They always happened after running only a few seconds.  Mine is running Vista Home Basic on a Q6600.  (Darn I hate Vista, but that's another issue.)

@Simon-
I'm beginning to think there may be a problem with the discovery routine that the science app runs when a result first starts.  What makes me think so is that I noticed on two occasions all four results that were in progress and had been running fine failed when I stopped and restarted BOINC.  The beginning is the only time they fail.  Once they get going they run fine unless Manager is stopped and restarted, then they may or may not fail immediately.
?????

Offline Gus

  • Alpha Tester
  • Squire
  • ***
  • Posts: 35
Re: 2.4 Unrecoverable Errors ??
« Reply #7 on: 11 Aug 2007, 01:35:17 pm »
Simon, See "Client error with Chicken on 64bit Vista" in the Number Crunching boards.  The author is running the 32 bit client on Vista 64, someone else reported the same with 32-bit Vista.  One is running a core 2 E6600, the other a core 2 T7400.  The second person reports that 64-bit on 64-bit works fine.

The problem looks like it may be specific to the core2 32-bit app.   If so, I'm sorry to offer problems, but relieved that my machine may be OK after all.  Prime95 is running 4 threads clean for about 3 hours now.  I'm going to try the "stock" MB app and see how that goes.
« Last Edit: 11 Aug 2007, 01:38:17 pm by Gus »

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: 2.4 Unrecoverable Errors ??
« Reply #8 on: 11 Aug 2007, 05:46:58 pm »
I've also had three (so far) unrecoverable errors with 2.4 - I've logged one in the thread Gus has just linked, but they took the database down just as I was going to post the other two! However, I disabled network access after the first one, and I've backed up the entire BOINC folder, so I should have WU / result / state files for the other two.

The first error was on a high AR linefeed WU, the next two are October deadline multibeams. I'll post links to the results on the SETI board when the server is back up.

This is on my octo, running 32-bit Vista Business, BOINC v5.10.13, and the Core 2 variant of Chicken 2.4 (for Xeon 5320). It hasn't shown any Windows error messages or other signs of distress, and crunching is continuing on all 8 cores, with a mixture of SETI, Einstein and CPDN. It's done about 20 WUs since I upgraded to 2.4 this morning, so these errors are the exception rather than the rule.

Richard Haselgrove

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: 2.4 Unrecoverable Errors ??
« Reply #9 on: 11 Aug 2007, 07:07:33 pm »
And one more error. I can't upload/report because the servers are still down, but I'll backup the data again before uploading/reporting.

The WUs affected are:

12my00aa.25379.26192.790908.3.204
03mr07ab.4943.6207.6.4.94
03mr07ab.4907.6207.5.4.101
03mr07ab.4544.11115.3.4.8

- I'll post WUID / RID when I can.
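
One rough way to check whether failures cluster in a single batch is to group the names listed above by their leading field. A minimal sketch in Python, assuming the first dot-separated field identifies the batch (that reading of the name format is an assumption, not something confirmed in this thread); `group_by_prefix` is a hypothetical helper:

```python
from collections import defaultdict

def group_by_prefix(wu_names):
    """Group work unit names by their first dot-separated field."""
    groups = defaultdict(list)
    for name in wu_names:
        prefix = name.split(".", 1)[0]
        groups[prefix].append(name)
    return dict(groups)

# The four work unit names listed in the post above.
failed = [
    "12my00aa.25379.26192.790908.3.204",
    "03mr07ab.4943.6207.6.4.94",
    "03mr07ab.4907.6207.5.4.101",
    "03mr07ab.4544.11115.3.4.8",
]

# Print each prefix with the number of failed work units that share it.
for prefix, names in sorted(group_by_prefix(failed).items()):
    print(prefix, len(names))
```

With only four names this is easy to do by eye; the grouping becomes more useful once a longer list of failed results is available.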

MikeK

  • Guest
Re: 2.4 Unrecoverable Errors ??
« Reply #10 on: 12 Aug 2007, 05:27:18 am »

I have around ten unrecoverable errors as well since switching to 2.4.

I haven't had any for a long time, so there should be something wrong with 2.4.
As I'm not able to report, I have to wait till the servers are up again.

Mike

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: 2.4 Unrecoverable Errors ??
« Reply #11 on: 12 Aug 2007, 07:03:21 am »
Four more errors on Core 2 / Vista overnight (and they'll be the last - I'm now dry on that box).

No signs of distress on SSE3 / server boxes: unless anyone has any better suggestions, I think I'll switch the Vista box to the SSE3 compile when the servers come back up, and do a comparison run that way.

NB before anyone asks: this is a Dell Precision workstation, running at stock speed with plenty of cooling. And I upgraded to the A05 BIOS after the first reports of Core 2 errata, so I think it's unlikely (and highly coincidental) that these errors are in any way platform-related.

Offline Simon

  • Ni!
  • Knight who says 'Ni!'
  • *****
  • Posts: 1045
    • Is it a bird? Is it a plane? No...its-the.net!
Re: 2.4 Unrecoverable Errors ??
« Reply #12 on: 12 Aug 2007, 09:02:46 am »
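Okay,

seeing as problems with 32-bit Windows apps on Vista have been reported, there will be new versions out shortly.

Those were compiled using Visual Studio 2003 (as all apps until now), but Vista doesn't like VS 2003 anymore, it seems.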

So, new apps compiled with VS 2005 SP1 (with Vista patches) will be uploaded shortly (today, if I can manage).

Regards,
Simon.

Offline Devaster

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 653
  • I like Duke !!!
Re: 2.4 Unrecoverable Errors ??
« Reply #13 on: 12 Aug 2007, 09:08:25 am »
having vista32 and 2.4 app and no problems  ...

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: 2.4 Unrecoverable Errors ??
« Reply #14 on: 12 Aug 2007, 09:31:38 am »
Quote from: Simon on 12 Aug 2007, 09:02:46 am
Okay,

seeing as problems with 32-bit Windows apps on Vista have been reported, there will be new versions out shortly.

Those were compiled using Visual Studio 2003 (as all apps until now), but Vista doesn't like VS 2003 anymore, it seems.

So, new apps compiled with VS 2005 SP1 (with Vista patches) will be uploaded shortly (today, if I can manage).

Regards,
Simon.
Thanks - but don't feel you need to rush, get a good night's sleep so you can think clearly!

I presume it would be helpful to run with the new Core 2 / Vista when available, rather than SSE3: anything else I can do to help track it down?

Richard

 
