Author Topic: Large number of Errors when processing.  (Read 13282 times)

_Geordie_

  • Guest
Large number of Errors when processing.
« on: 10 Mar 2010, 03:13:13 pm »
I recently updated my drivers and noticed that I'm getting a large number of CUDA errors (compute errors) with the latest V12 CUDA app (driver stopped).

I've taken the driver back to 190.38 but am still getting a large number of compute errors and using up the daily quota rapidly.

I'm not sure whether it's hardware or software - I think my hardware is good - no overheating etc.

Is there any way I can diagnose?




Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Large number of Errors when processing.
« Reply #1 on: 10 Mar 2010, 03:15:43 pm »
first of all you could post link to host under question.

_Geordie_

  • Guest
Re: Large number of Errors when processing.
« Reply #2 on: 10 Mar 2010, 05:33:39 pm »
This is the host:

http://setiathome.berkeley.edu/results.php?hostid=4672231 - I've just changed this machine back to the SETI cuda client to see if that makes a difference - I'll know in the morning. (10:30pm here)

I also have another couple of machines that I don't have immediate terminal access to that are also posting errors according to SETI.

http://setiathome.berkeley.edu/results.php?hostid=4612287

http://setiathome.berkeley.edu/results.php?hostid=4093238

Edit: Just checked and the second 2 hosts are all VLAR kills as far as I can tell - (I took a sample of around 10 units from each host, out of the entire list of errored WUs for each host)
« Last Edit: 10 Mar 2010, 06:15:03 pm by _Geordie_ »

Offline Pepi

  • Knight o' The Realm
  • **
  • Posts: 119
Re: Large number of Errors when processing.
« Reply #3 on: 11 Mar 2010, 03:04:03 pm »
You are using the VLAR kill app: "VLAR WU (AR: 0.059975 )detected... autokill initialized", so there is no error :) It is supposed to work that way.

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Large number of Errors when processing.
« Reply #4 on: 11 Mar 2010, 03:08:16 pm »
no, his first host has true errors:

Work Unit Info:
...............
WU true angle range is :  2.722896
Optimal function choices:
-----------------------------------------------------
name               
-----------------------------------------------------
              v_BaseLineSmooth (no other)
            v_GetPowerSpectrum 0.00023 0.00000
                   v_ChirpData 0.01420 0.00000
                  v_Transpose4 0.00362 0.00000
               FPU opt folding 0.00234 0.00000
CUFFT error in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_fft.cu' in line 62.

</stderr_txt>

But it's the stock app, not the opt one.
[Most likely some problem with the CUFFT libraries. Maybe an incompatible driver version...]
« Last Edit: 11 Mar 2010, 03:12:38 pm by Raistmer »

_Geordie_

  • Guest
Re: Large number of Errors when processing.
« Reply #5 on: 11 Mar 2010, 04:45:52 pm »
Pepi - read above please..... ::)

Raistmer - Yep, that's because I've reverted back to the stock app in the past day or so to see if my hardware is the problem - I'm getting some app errors but no more than I've seen on other machines - it's not screaming through the entire day's quota in a couple of minutes now and I've not had one driver crash.

http://setiathome.berkeley.edu/result.php?resultid=1540425786

http://setiathome.berkeley.edu/result.php?resultid=1540423298

http://setiathome.berkeley.edu/result.php?resultid=1540423294

http://setiathome.berkeley.edu/result.php?resultid=1540423290

http://setiathome.berkeley.edu/result.php?resultid=1540423281

That's just a few of the many.

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Large number of Errors when processing.
« Reply #6 on: 11 Mar 2010, 04:52:58 pm »
Reverting to stock didn't help much, it seems.
The opt app has better means for debugging issues, not only better speed.
For example:
After app init: total GPU memory 536870912    free GPU memory 128245760
That is, your GPU has ~512MB of onboard memory, but only ~128MB was available at the moment the app started to allocate memory.
This amount is borderline; sometimes it's not enough, which is why that result (the first of those you listed) errored out with a memory allocation error.
Maybe the errors in the stock app are caused by the same shortage of GPU memory; it's the primary cause you need to fight.
Are you running some graphics-intensive app (like 3D games) while crunching with CUDA MB? Do you use any 3D screensavers? Any other GPU memory consumers, maybe?
Try to free more GPU memory. 512MB is more than enough for CUDA MB, but 128MB is sometimes not enough...
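
For anyone checking their own results, the memory line quoted above can be pulled out of the stderr text and converted to more readable units. This is only a minimal sketch in Python: the function name parse_gpu_memory is hypothetical, and the pattern assumes the line looks exactly like the one quoted in this post.

```python
import re

# Matches the opt-app stderr line quoted in the post above.
# Assumption: the wording and field order are always the same.
MEM_LINE = re.compile(r"total GPU memory\s+(\d+)\s+free GPU memory\s+(\d+)")
MIB = 1024 * 1024

def parse_gpu_memory(line):
    """Return (total_mib, free_mib) from a stderr line, or None if not found."""
    m = MEM_LINE.search(line)
    if m is None:
        return None
    total, free = (int(g) for g in m.groups())
    return total / MIB, free / MIB

line = "After app init: total GPU memory 536870912    free GPU memory 128245760"
total_mib, free_mib = parse_gpu_memory(line)
print(f"total {total_mib:.0f} MiB, free {free_mib:.1f} MiB")
# prints: total 512 MiB, free 122.3 MiB
```

The free figure is about 128 MB in decimal units, or about 122 MiB, which is under a quarter of the card's memory.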

_Geordie_

  • Guest
Re: Large number of Errors when processing.
« Reply #7 on: 11 Mar 2010, 06:55:40 pm »
Mmmm, not sure what could be causing that - definitely no game playing.  The CUDA app also only runs when the machine is idle - there's nothing running on the box other than 8 optimized clients when the CUDA app starts, AFAIK.

Just looked even further back and it's the same problem with every one of the samples (10 or so) I looked at... and yes, it looks as if the error with the stock app, even though less frequent and not crashing the driver, is a consistent error.

Looking at GPU-Z, each GPU is apparently only using 21MB of memory right now (provided it interrogates the GPU correctly?).

As for driver version I'm running 190.38 at the moment - going to try 191.07.

Are there any quirks with the 9800GX2 that could be causing this?



« Last Edit: 11 Mar 2010, 07:01:56 pm by _Geordie_ »

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Large number of Errors when processing.
« Reply #8 on: 11 Mar 2010, 07:02:30 pm »
Quote: "Are there any quirks with the 9800GX2 that could be causing this?"
Sorry, I don't know. Try asking on the main SETI forums. There are some owners of 9800 cards there; maybe they could give some insights...

_Geordie_

  • Guest
Re: Large number of Errors when processing.
« Reply #9 on: 11 Mar 2010, 07:31:37 pm »
Will do thanks.

Fairly new to CUDA/GPU crunching, so I learned a few things here anyhow! Apart from obviously not playing games when the app starts, is there any way to check and ensure that nothing is running on the GPU?

Just running a memtest on each GPU RAM. Seems ok.

I've changed to 191.07 and reverted back to the optimised app to see if that helps.

I'll see if I can find any 9800GX2 card specific issues.

If I resolve it I'll post back here.



Twidget

  • Guest
Re: Large number of Errors when processing.
« Reply #10 on: 13 Mar 2010, 12:23:55 pm »
Receiving a large number of "errors while computing" using optimized apps. Installed Lunatics_Win64v0.2 on March 8th; that is when they started. The vast majority have the same information in the "stderr out" field:

<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
 - exit code -6 (0xfffffffa)
</message>
<stderr_txt>

VLAR WU (AR: 0.008783 )detected... autokill initialised
SETI@home error -6 Bad workunit header

File: ..\worker.cpp
Line: 144


</stderr_txt>
]]>

http://setiathome.berkeley.edu/results.php?hostid=5210381

Any help will be greatly appreciated. Thanks :)
« Last Edit: 13 Mar 2010, 12:26:06 pm by Twidget »

Ghost0210

  • Guest
Re: Large number of Errors when processing.
« Reply #11 on: 13 Mar 2010, 12:30:31 pm »
Hi Twidget,
I take it from your post that you're running the VLAR kill version of the opt app, which stops processing any VLAR WUs on your GPU with this error and sends them back to SETI to be sent out to another client.
There is another version of the opt app that you can download that doesn't kill these workunits, but VLARs are extremely slow on GPUs.
If you want, you can download the Reschedule tool from here; it will move the VLARs from your GPU onto your CPU, saving you from having all these tasks error out on you.
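
To tell VLAR kills apart from genuine failures when sampling a host's errored results, the stderr text can be checked for the messages quoted in this thread. A rough sketch in Python follows; the function name classify_stderr and the labels it returns are hypothetical, and it only recognises the two messages that appear above (the autokill notice, in either spelling, and the CUFFT error).

```python
import re

# The autokill notice appears in this thread as both "initialised" and "initialized".
VLAR_KILL = re.compile(
    r"VLAR WU \(AR:\s*([0-9.]+)\s*\)detected\.\.\. autokill initiali[sz]ed"
)

def classify_stderr(text):
    """Return (label, angle_range) for a result's stderr text.

    Labels are illustrative only: 'vlar_kill', 'cufft_error' or 'other'.
    """
    m = VLAR_KILL.search(text)
    if m:
        return ("vlar_kill", float(m.group(1)))
    if "CUFFT error" in text:
        return ("cufft_error", None)
    return ("other", None)

sample = "VLAR WU (AR: 0.008783 )detected... autokill initialised\nSETI@home error -6 Bad workunit header"
print(classify_stderr(sample))
# prints: ('vlar_kill', 0.008783)
```

Anything labelled 'other' would still need to be read by hand, since this sketch knows nothing about the app's remaining error messages.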

Twidget

  • Guest
Re: Large number of Errors when processing.
« Reply #12 on: 13 Mar 2010, 12:47:20 pm »
Hi Ghost0210,
Thanks for the quick response. 8) I'm a newbie to this site. Poked around a bit and can't find the Reschedule app. :-[ Would you please point me in the right direction?

Thanks

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349

Twidget

  • Guest
Re: Large number of Errors when processing.
« Reply #14 on: 13 Mar 2010, 01:12:32 pm »
Thanks Raistmer! Will give it a try. Be back in touch if I have questions.

 
