Author Topic: When corrupted results get validated...  (Read 59209 times)

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: When corrupted results get validated...
« Reply #30 on: 06 Jun 2010, 01:40:02 pm »
Thanks. I've now set it to run live on the main project - host 2901600. No problem fetching work - just a shortlist for the moment, because (a) I run a short cache, and (b) DCF hasn't settled yet - still estimating three hours!

All those pseudo -9s that we started this thread with will have driven DCF way low. I think we may have encountered another of BOINC's safety features - IIRC BOINC cuts down on work fetch if DCF ever gets into 'insane' territory, either high or low. There's a lot of very sound engineering practice in the original BOINC design, but I fear we're in danger of losing it with all these hurried, on-the-fly bodges to cope with evolving technologies like GPUs.

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: When corrupted results get validated...
« Reply #31 on: 06 Jun 2010, 01:47:22 pm »
Yeah, life moves too fast to think about it properly, and BOINC hasn't escaped this :) But some kind of fast-reacting block to stop invalid overflows would be a good thing IMO.
They damage the project in too many ways.

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: When corrupted results get validated...
« Reply #32 on: 06 Jun 2010, 02:29:09 pm »
Already got a wingmate to add to Claggy's list:

Pieter hostid=5431046 NVIDIA GeForce 9800 GT (1005MB) driver: 19745

Host created today, downloaded 564 tasks, got two of them to validate at 0.01 credits, I'm too depressed to look-see how many pages-full he's wasted.

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: When corrupted results get validated...
« Reply #33 on: 06 Jun 2010, 03:30:28 pm »
Already got a wingmate to add to Claggy's list:

Pieter hostid=5431046 NVIDIA GeForce 9800 GT (1005MB) driver: 19745

Host created today, downloaded 564 tasks, got two of them to validate at 0.01 credits, I'm too depressed to look-see how many pages-full he's wasted.

220 pending, 2 validated, all teensie claims.

One of the two "valid" is good evidence, text captures attached as WU619984348.7z, also attaching text captures for paired 4xx case noted by Sutaru as WU619465291.7z.
                                                                                    Joe

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: When corrupted results get validated...
« Reply #34 on: 07 Jun 2010, 04:23:22 pm »
A cuda_fermi application, v6.10, was loaded about 30 minutes ago. No-one will have any WUs yet, of course, because the splitters haven't been restarted.

I'd prefer not to test the stock download process myself if I can avoid it, because I'm rigged with an app_info and still have some VLARs waiting for optimised CPU handling. But if we could keep an eye on Claggy's list, and see if the Fermis start producing valid work, that would be good news.

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: When corrupted results get validated...
« Reply #35 on: 07 Jun 2010, 04:26:16 pm »
Please check md5 (binary equivalent) of exe against beta ones

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: When corrupted results get validated...
« Reply #36 on: 07 Jun 2010, 04:43:00 pm »
Please check md5 (binary equivalent) of exe against beta ones

E448A1489782723161EFAF99B9494661

in both cases. Binary FC says the same, too.

So this will be the one which describes itself as 6.09 in stderr, then ;D
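For anyone wanting to repeat the check, here is a minimal sketch of the comparison described above, assuming Python is available on the host. The function names and the file names in the usage note are hypothetical, and this is not necessarily the tool that produced the digest quoted in this post.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Return the MD5 digest of a file as an upper-case hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so a large executable is never held fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def same_binary(path_a, path_b):
    """True when both files hash to the same MD5 digest."""
    return md5_hex(path_a) == md5_hex(path_b)
```

Usage would be something like same_binary("main_project_app.exe", "beta_app.exe"), with both paths as placeholders for wherever the two downloads ended up. A matching digest is strong evidence that the files are identical, and a byte-for-byte compare such as FC confirms it.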

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: When corrupted results get validated...
« Reply #37 on: 07 Jun 2010, 04:44:20 pm »
LoL, thanks.  :D

Offline _heinz

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 2117
Re: When corrupted results get validated...
« Reply #38 on: 08 Jun 2010, 03:42:02 am »
Berkeley has switched off all data distribution; there is a message on the front page:
"We are experiencing a problem such that some GPU platforms are quickly overflowing on all workunits that they receive. Rather than burn through a great deal of data that we would have to redistribute, we are turning off data distribution until we get this debugged."

08.06.2010 09:27:38   SETI@home   update requested by user
08.06.2010 09:27:41   SETI@home   Fetching scheduler list
08.06.2010 09:27:43   SETI@home   Master file download succeeded
08.06.2010 09:27:48   SETI@home   Sending scheduler request: Requested by user.
08.06.2010 09:27:48   SETI@home   Reporting 6 completed tasks, requesting new tasks for CPU and GPU
08.06.2010 09:27:51      Project communication failed: attempting access to reference site
08.06.2010 09:27:51   SETI@home   Scheduler request failed: Couldn't connect to server
08.06.2010 09:27:53      Internet access OK - project servers may be temporarily down.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

That's good, so they are working on it.

heinz
« Last Edit: 08 Jun 2010, 03:44:57 am by _heinz »

Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: When corrupted results get validated...
« Reply #39 on: 08 Jun 2010, 04:17:58 am »
Eric reports in the 'Scheduler problems' News thread:

"We're having difficulty getting a new scheduler running that handles cuda_fermi applications properly. We'll be down until we get it sorted out."

Claggy

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: When corrupted results get validated...
« Reply #40 on: 08 Jun 2010, 05:19:28 am »
This may be considered a case of "be careful what you wish for" - I sent Eric an email just as the lab opened yesterday, drawing attention to the scale of the problem and the list in this thread. Probably not what he wanted to read on a Monday morning, and deploying a new scheduler probably wasn't his plan for the day either.

But it had to be done - shame it didn't go smoothly.

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: When corrupted results get validated...
« Reply #41 on: 08 Jun 2010, 05:27:23 am »
LoL, I did the same and received an answer that they were working on it right then (about the time 6.10 was spotted on main). Interesting, what was wrong with 6.10?...

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: When corrupted results get validated...
« Reply #42 on: 08 Jun 2010, 03:04:47 pm »
LoL, I did the same and received an answer that they were working on it right then (about the time 6.10 was spotted on main). Interesting, what was wrong with 6.10?...

Perhaps a host which got in a bad state trying to run 6.08 on a GTX 4xx will need a reboot to clear the problem. If so, automatically updating a host to 6.10 wouldn't help.

...I sent Eric an email just as the lab opened yesterday, drawing attention to the scale of the problem and the list in this thread.
...

User "korpela" has not been active in the last 24 hours, though this thread is visible to all and he might have viewed it as a guest.
                                                                                      Joe
« Last Edit: 08 Jun 2010, 03:22:27 pm by Josef W. Segur »

Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: When corrupted results get validated...
« Reply #43 on: 08 Jun 2010, 04:18:28 pm »
Perhaps a host which got in a bad state trying to run 6.08 on a GTX 4xx will need a reboot to clear the problem. If so, automatically updating a host to 6.10 wouldn't help.

I've not seen anything like that, and my GTX 470 has tried them all (6.09, 6.08, v12b). They just fail in their various ways, and move on to the next task. It's very different from the 'sporadic error state' on older GPUs, where the failure persists from task to task until reboot.

Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: When corrupted results get validated...
« Reply #44 on: 08 Jun 2010, 04:24:01 pm »
I started going through my resends again this morning, no new Fermis, just these hosts:

Sigurd G.Schinke hostid=5372764 NVIDIA GeForce GTX 260 (881MB) driver: 19713

Marc Jarry hostid=4247889 NVIDIA GeForce 9600 GT (511MB) driver: 19745

BabelAbu hostid=5374194 NVIDIA GeForce GTX 260 (1792MB) driver: 19732

malycc hostid=5386713 NVIDIA GeForce 9500 GT (511MB) driver: 19745

The Beef hostid=5289552 [2] NVIDIA GeForce GTX 295 (896MB) driver: 19562

Anonymous hostid=5049618 [2] NVIDIA GeForce GTX 295 (895MB) driver: 19038

k.pieschl hostid=3192436 NVIDIA GeForce 9800 GT (1023MB) driver: 18634

Claggy

 
