Author Topic: Completed, validation inconclusive  (Read 20358 times)

Offline efmer (fred)

  • Alpha Tester
  • Knight o' The Round Table
  • ***
  • Posts: 147
    • efmer
Completed, validation inconclusive
« on: 07 Jul 2009, 03:05:14 am »
I'm getting a lot of "Completed, validation inconclusive" results (100 or more) on different computers.
http://setiathome.berkeley.edu/workunit.php?wuid=473007709
Are these WUs really bad, or is something wrong with this VLAR KILL version?
I ask because my wingman's result looks OK on the CPU version.
TThrottle Keep your temperatures controlled.
BoincTasks The best way to view BOINC

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Completed, validation inconclusive
« Reply #1 on: 07 Jul 2009, 03:52:36 am »
It's hard to tell from a single example, since either result could be closer to 'right'.  If you see more examples of this mismatch, let us know.  There could have been a problem in either processing run: host or build, communications, cosmic rays causing one-off bit errors in RAM, etc.  I would be inclined to watch how the reissue comes in, and only get worried if it happens a lot (repeatability is important).  The redundancy process seems to be doing its job.
« Last Edit: 07 Jul 2009, 04:17:20 am by Jason G »

Offline efmer (fred)

  • Alpha Tester
  • Knight o' The Round Table
  • ***
  • Posts: 147
    • efmer
Re: Completed, validation inconclusive
« Reply #2 on: 07 Jul 2009, 05:44:58 am »
http://setiathome.berkeley.edu/show_host_detail.php?hostid=4955000
I've got hundreds of them on two computers running XP x64 with 295 cards.
Driver: GeForce/ION Release 186
So this got me worried. I've got another computer without any problems, but it takes forever to upload and see the validations....

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Completed, validation inconclusive
« Reply #3 on: 07 Jul 2009, 06:14:24 am »

Since I haven't seen reports of validation issues with the build, I would suggest starting with some basic checks without BOINC running, to eliminate some possible issues. That might include system memory, temps, backing off any overclock, the usual driver checks, etc.

For the video card I use AtiTool (http://www.techpowerup.com/atitool) to scan for artefacts (yes, it does work on nVidia cards!  ;) ).  Even if that goes okay for an hour's running (without any artefacts or beeping at you in 'Scan for Artefacts' mode), I would double check for known issues with those drivers/cards too (especially CUDA related). Maybe there is something there that warrants looking at another version.

Offline efmer (fred)

  • Alpha Tester
  • Knight o' The Round Table
  • ***
  • Posts: 147
    • efmer
Re: Completed, validation inconclusive
« Reply #4 on: 07 Jul 2009, 06:23:55 am »

But it seems highly unlikely that this happens on two different computers with different-brand 295 cards.

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Completed, validation inconclusive
« Reply #5 on: 07 Jul 2009, 06:40:18 am »
I agree, but not if there's some common factor, like a CUDA driver bug.  The point is to eliminate the things it isn't, narrowing the field to isolate whatever the actual problem is, rather than guessing that something is wrong with an app that others aren't reporting the same issues with.

[Edit: an x64 installer will be available soon in Beta. V12 with updated CUDA DLLs (a totally different build) might either show the same symptoms or correct them; either way, that would confirm or eliminate some suspects.]
« Last Edit: 07 Jul 2009, 06:45:07 am by Jason G »

Offline efmer (fred)

  • Alpha Tester
  • Knight o' The Round Table
  • ***
  • Posts: 147
    • efmer
Re: Completed, validation inconclusive
« Reply #6 on: 07 Jul 2009, 06:45:31 am »
Couldn't it have something to do with the bad WUs they have been sending out lately, the ones with a lot of noise in them? I ask because I think this is a memory overflow error.
I will try putting one machine on an earlier driver, 185.85.
I've got CUDA 2.2 and MB_6.08_mod_CUDA_V11_VLARKill_refined.exe. Is the VLARKill V12 worth trying, and with what memory model?

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Completed, validation inconclusive
« Reply #7 on: 07 Jul 2009, 06:53:44 am »
WUs? A possibility, but unlikely. If we can isolate it to that, then great.

That's the same setup I used for a few months until yesterday, with a much lesser card (9600GSO).

V12 has some speed enhancements put in by Raistmer, and we've tweaked it a bit to be more display friendly.  So yes, it's worth a try. It won't necessarily solve whatever is causing your difficulty, but if it behaves the same, that would at least prove it wasn't the particular app build.

Offline efmer (fred)

  • Alpha Tester
  • Knight o' The Round Table
  • ***
  • Posts: 147
    • efmer
Re: Completed, validation inconclusive
« Reply #8 on: 07 Jul 2009, 06:58:55 am »
I suspect a driver problem... Which version do you recommend, or which one has been tested on a 295?

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Completed, validation inconclusive
« Reply #9 on: 07 Jul 2009, 07:03:19 am »
The top machines with 295s (on x64) appear to be using either 185.85 or the slightly newer one that you have.  It might also be worth checking that the mobo chipset drivers etc. are up to date, among the other checks mentioned.  Beyond that I have no direct experience with 295s.

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Completed, validation inconclusive
« Reply #10 on: 07 Jul 2009, 07:19:09 am »
FYI: a 'first cut' (untested) updated Win64 installer has just been uploaded to Beta Downloads.  I will start a beta thread for that one shortly.

Offline efmer (fred)

  • Alpha Tester
  • Knight o' The Round Table
  • ***
  • Posts: 147
    • efmer
Re: Completed, validation inconclusive
« Reply #11 on: 09 Jul 2009, 12:37:33 am »
On both machines I downgraded the driver to 185.85.
1) V11 VLAR killer: still a lot of -9 errors, 100+
2) V12 VLAR killer: the -9 errors are almost gone. 100+

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Completed, validation inconclusive
« Reply #12 on: 09 Jul 2009, 04:17:07 am »
And the wingmate shows no overflow in the task?

Offline efmer (fred)

  • Alpha Tester
  • Knight o' The Round Table
  • ***
  • Posts: 147
    • efmer
Re: Completed, validation inconclusive
« Reply #13 on: 09 Jul 2009, 05:30:12 am »
No overflow on the wingman, most of the time.
And for some reason the WU is not marked as invalid but as Initial:
SETI@Home Informational message -9 result_overflow - Initial

And 2 cards, different brands, all giving the same error?

Is the Cuda buffer the same length?

And... at the moment it takes forever to see any changes in the database, that is waaaay behind.

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Completed, validation inconclusive
« Reply #14 on: 09 Jul 2009, 04:36:05 pm »
Most likely it's a driver issue.
Try V12 + driver 185.85 + CUDA RT 2.2.

BTW, do the overflows disappear after a restart (at least for a while)? Does the stderr of those results contain any CUDA-related errors?
I had overflows on an x64 host with an older driver + CUDA RT from time to time before. Restarting usually helped.
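
For anyone who wants to run that stderr check over many results at once, here is a rough sketch in Python. It assumes you have already saved each task's stderr text yourself. The function names are made up for illustration, the "result_overflow" marker is taken from the message quoted earlier in this thread, and the pattern for a "CUDA-related error" line is only a guess at what such a line might look like, so adjust it to whatever your stderr actually shows.

```python
import re

# "result_overflow" is the marker quoted in this thread.
# The CUDA pattern is an assumption, not a documented stderr format.
OVERFLOW = re.compile(r"result_overflow")
CUDA_ERROR = re.compile(r"cuda.*(error|fail)", re.IGNORECASE)


def classify_stderr(text):
    """Return (has_overflow, cuda_error_lines) for one task's stderr text."""
    lines = text.splitlines()
    has_overflow = any(OVERFLOW.search(line) for line in lines)
    cuda_lines = [line.strip() for line in lines if CUDA_ERROR.search(line)]
    return has_overflow, cuda_lines


def summarize(stderr_texts):
    """Count overflow tasks, and how many of those also show a CUDA error line."""
    overflow = 0
    overflow_with_cuda = 0
    for text in stderr_texts:
        has_overflow, cuda_lines = classify_stderr(text)
        if has_overflow:
            overflow += 1
            if cuda_lines:
                overflow_with_cuda += 1
    return {
        "tasks": len(stderr_texts),
        "overflow": overflow,
        "overflow_with_cuda": overflow_with_cuda,
    }
```

If most overflow results also carry a CUDA error line, that points toward the driver/runtime side; if none do, noisy WUs or the app build stay on the suspect list.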

 
