Seti@Home optimized science apps and information
Optimized Seti@Home apps => Windows => Topic started by: KarVi on 09 Nov 2009, 06:14:38 pm
-
Hi there guys.
I have run into some problems with my boinc installation on one of my computers.
The system will be humming along fine for hours or days until, suddenly, it decides to error out all my WU's (Seti, Milkyway and others). It does so with "error -226 0xFFFFFF1E" according to Boincview. Boinc downloads new WU's, which also immediately error out, and it ends up with the WU quota being used up for the system (on all attached, active projects).
I have several hundreds of WU's with this error code.
E.g.:
http://setiathome.berkeley.edu/result.php?resultid=1416330014
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=164466536
Hope you get to see the MW WU, they get reissued pretty quick.
As you can see the error code is the same, and the message given by Boinc is "too many exit(0)s".
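For anyone wondering why the same error shows up as both -226 and 0xFFFFFF1E: the two are the same 32-bit value, read once as signed and once as unsigned. A minimal sketch of the conversion, where the helper names `to_unsigned32_hex` and `to_signed32` are made up for illustration and are not part of Boinc or Boincview:

```python
def to_unsigned32_hex(code: int) -> str:
    """Format a signed 32-bit exit code as unsigned hex (two's complement)."""
    return f"0x{code & 0xFFFFFFFF:08X}"


def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the value is negative in two's complement.
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    print(to_unsigned32_hex(-226))   # 0xFFFFFF1E
    print(to_signed32(0xFFFFFF1E))   # -226
```

So both numbers in the quoted message describe one error, not two.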
The only change to the system is that this Friday I replaced 3 old hard drives (2x80GB and 1x40GB) with a brand new 1TB drive.
After this the problems seem to have shown themselves.
But the drive on which Boinc resides is the same, it wasn't changed or moved. Same drive, same partition, same OS, same controller, so Boinc should see no difference.
The computer is running perfectly, for days on end, no crashes, temperatures are very fine, and everything else works on the system.
I have reinstalled Boinc this evening and hope this fixes things. Besides this, all suggestions are welcome!
-
Hm... very interesting... the stderr for the SETI result contains no info about even a single restart.
If you have access to the BOINC alpha mailing list, it's worth posting the same message there.
-
My guess is that it's due to the hard drive change.
When I install boinc I use the /a flag, portable installation.
Try uninstalling and deleting every registry entry, then install it normally or in portable mode.
-
1) I've not seen /a described as a "portable" installation before. Usually, it's taken to mean "administrative" - unpack all the files to a network storage location for later use, but don't actually do the 'installing' bit. So the file which belongs in the Windows directory won't be moved there, the sandbox accounts won't be set up, the permissions won't be set on the folders, (most of) the registry settings won't be present, etc. etc. I wouldn't have thought that was a good way to run.
2) 'Exit 0' is the old "lost heartbeat" problem - BOINC is supposed to constantly reassure the science applications that it is still alive. If BOINC disappears, the science applications shut themselves down: and if they shut themselves down more than 100 times during the run, they throw this error. So why is BOINC not pumping out the reassurance messages? Possibilities:
a) BOINC not running / shutting down because of the strange installation
b) BOINC not running continuously - CPU throttle settings are reported to cause this
c) Some other CPU-intensive process stealing time from BOINC. Windows isn't trying to index that 1TB drive, is it?
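To make the mechanism in 2) concrete, here is a toy model of it, assuming only what is described above: a science app that loses the heartbeat exits and gets restarted, and the task errors out once it has exited more than 100 times. The names `ToyTask`, `MAX_EXITS` and `simulate` are hypothetical and this is not BOINC's actual code.

```python
MAX_EXITS = 100  # threshold quoted above; an assumption, not read from BOINC source


class ToyTask:
    """Toy model of a task that exits whenever the client heartbeat is missing."""

    def __init__(self, max_exits: int = MAX_EXITS):
        self.max_exits = max_exits
        self.exit_count = 0
        self.failed = False

    def on_heartbeat_check(self, heartbeat_seen: bool) -> str:
        if self.failed:
            return "error"
        if heartbeat_seen:
            return "running"
        # No heartbeat: the app shuts itself down and is restarted later.
        self.exit_count += 1
        if self.exit_count > self.max_exits:
            self.failed = True
            return "error"  # corresponds to "too many exit(0)s" in this model
        return "restarted"


def simulate(heartbeats):
    """Feed a sequence of heartbeat observations to a fresh task."""
    task = ToyTask()
    state = "running"
    for seen in heartbeats:
        state = task.on_heartbeat_check(seen)
        if state == "error":
            break
    return state, task.exit_count


if __name__ == "__main__":
    print(simulate([True] * 5))      # healthy client
    print(simulate([False] * 101))   # client never answers
```

In this model a client that never answers pushes every newly downloaded task over the limit, which matches the pattern of all WU's erroring out one after another.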
-
Richard:
Thanks for your suggestions.
It could very well be that it's indexing the drive, but there's not much on it right now, so it should run fast.
I have turned indexing off anyway from now on.
The only other weird behaviour is that the one time I caught the computer in the act of trashing WU's, I tried stopping Boinc with "net stop boinc". Boinc didn't respond to this, so I had to kill it through task manager (process explorer). I restarted Boinc, and it started working like it was supposed to.
Since I reinstalled Boinc yesterday, no bad stuff has happened, but I don't get much work, so it hasn't been thoroughly tested.
I am considering using something like Process Lasso to boost the Boinc process to a higher priority (as a test) if this happens again.
-
To wrap things up, reinstalling Boinc worked.
I haven't lost a single WU since I did it.
Problem solved.