Author Topic: GTX295 CUDA Issues  (Read 24273 times)

madmac

  • Guest
GTX295 CUDA Issues
« on: 10 May 2010, 12:44:50 pm »
OK, I decided to splash out and spent £250 getting a second hand 295 as I was fed up watching my position slide down the tables :-(
I built a new (second-hand) rig, installed my card and the latest Nvidia driver, turned off SLI, put the latest optimised apps on and fired her up.
It would appear that I have a problem with one of the cores: one GPU errors out, typically after 19 seconds and never much longer than that. There have been a couple of times where it crunched to about 80% and then errored, but that has only happened with 1 or 2 units out of too many to mention, so it might be a fluke.
Reverted back to stock apps - no difference.
Dropped the memory clocks right down to 500 MHz - no difference. The card is now back at stock speeds with one GPU disabled.
This is my new PC:

http://setiathome.berkeley.edu/show_host_detail.php?hostid=5400786

And this is the task list. Could someone on here have a look at the errors and see if there is a common theme, or if there is anything I can change, as I'm stumped on this one...
The error messages are not always the same.

http://setiathome.berkeley.edu/results.php?hostid=5400786

Is it hardware related or is it software related?

Any help gratefully received


Offline sunu

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 771
Re: GTX295 CUDA Issues
« Reply #1 on: 10 May 2010, 05:06:16 pm »
99.99% it's a hardware problem. All invalids and error workunits are from the first GPU (device 1).

A bit puzzling is http://setiathome.berkeley.edu/result.php?resultid=1604052193 that gives an out of memory error.

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: GTX295 CUDA Issues
« Reply #2 on: 11 May 2010, 03:39:20 am »
Mmmm, that's weird. I've got a GTX295 and have been suffering similar issues for about the last three weeks. I thought it was a heat issue, so I completely dismantled the GTX295, removed all the dustballs etc. and replaced the heatsink paste with Arctic Silver. Overall temperatures have dropped by about 5 to 8 degrees but the Seti problem persisted. I assumed it might be an OS or registry issue, so I clean installed Win7 64-bit with various Nvidia drivers, but still the problem persists. The errors always occur on GPU 0. I am not convinced it's a hardware issue, as all other CUDA apps work flawlessly. The GTX 295 plays intensive games without a hitch.
I have performed a burn-in test using FurMark, which took the GTX295 to 93 degrees, well within its 105 degree design limit. In case it was an issue with the GTX295 interfacing with my motherboard, I removed all overclocks from my i7 processor and memory and put everything back to stock Intel settings.
Still, GPU 0 throws an error every couple of units, or sometimes 10 in a row.

 :-\

Offline Pepi

  • Knight o' The Realm
  • **
  • Posts: 119
Re: GTX295 CUDA Issues
« Reply #3 on: 11 May 2010, 02:37:42 pm »
weak power supply?

madmac

  • Guest
Re: GTX295 CUDA Issues
« Reply #4 on: 11 May 2010, 05:15:40 pm »
Quote
I am not convinced it's a hardware issue, as all other CUDA apps work flawlessly. The GTX 295 plays intensive games without a hitch.
I have performed a burn-in test using FurMark, which took the GTX295 to 93 degrees, well within its 105 degree design limit.
Still, GPU 0 throws an error every couple of units, or sometimes 10 in a row.

 :-\

Interesting, can you clarify the 'all other CUDA apps work flawlessly' bit?
I have read over on the SETI forums that this is an issue with the older dual-PCB versions.

I have the problem that 99.99% of WUs error on one of the GPUs.
Trying to get my money back...

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: GTX295 CUDA Issues
« Reply #5 on: 12 May 2010, 01:01:31 pm »
Quote
weak power supply?

I had suspected this, so I installed a brand new Corsair 750 W nearly two weeks ago. I wired the system so that only the boot drive, motherboard, DVD writer and GTX295 were connected, and for the following 5 days the problems persisted.

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: GTX295 CUDA Issues
« Reply #6 on: 12 May 2010, 01:21:43 pm »
Quote
Interesting, can you clarify the 'all other CUDA apps work flawlessly' bit?
I have read that this is an issue with the older dual pcb versions over on the seti forums

My GTX is the dual-PCB version, and I suspect the issue you refer to is the bug in the memory controller. So far as I can make out, this is more of an urban legend than something actually confirmed by Nvidia.

As I said, other CUDA apps work with very little hitch. The ones I currently use are Badaboom, TMPGEnc, Adobe, BOINC Collatz Conjecture and GPUGrid. BOINC Collatz Conjecture throws occasional errors, but nowhere near as many as Seti.
« Last Edit: 12 May 2010, 01:24:05 pm by Pizzadude »

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: GTX295 CUDA Issues
« Reply #7 on: 13 May 2010, 02:39:11 am »
Can anybody explain this error? I'm suddenly getting lots of these now:


Name   29no06ag.1150.892.6.10.140_0
Workunit   608510479
Created   6 May 2010 8:08:47 UTC
Sent   6 May 2010 8:12:24 UTC
Received   13 May 2010 6:12:24 UTC
Server state   Over
Outcome   Client error
Client state   Compute error
Exit status   1 (0x1)
Computer ID   5180462
Report deadline   22 Jun 2010 22:42:57 UTC
Run time   989.105137
CPU time   287.4474
stderr out   

<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
setiathome_CUDA: Found 2 CUDA device(s):
   Device 1 : GeForce GTX 295
           totalGlobalMem = 919994368
           sharedMemPerBlock = 16384
           regsPerBlock = 16384
           warpSize = 32
           memPitch = 2147483647
           maxThreadsPerBlock = 512
           clockRate = 1242000
           totalConstMem = 65536
           major = 1
           minor = 3
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 30
   Device 2 : GeForce GTX 295
           totalGlobalMem = 919994368
           sharedMemPerBlock = 16384
           regsPerBlock = 16384
           warpSize = 32
           memPitch = 2147483647
           maxThreadsPerBlock = 512
           clockRate = 1242000
           totalConstMem = 65536
           major = 1
           minor = 3
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 30
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 295 is okay
SETI@home using CUDA accelerated device GeForce GTX 295
V10 modification by Raistmer
Priority of worker thread rised successfully
Priority of process adjusted successfully
Total GPU memory 919994368    free GPU memory 561876992
setiathome_enhanced 6.02 Visual Studio/Microsoft C++

Build features: Non-graphics   VLAR autokill enabled    FFTW   x86   
     CPUID: Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz

     Cache: L1=64K L2=256K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3
libboinc: 6.4.5

Work Unit Info:
...............
WU true angle range is :  0.406157
Cuda error 'cufftExecC2C' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcc_fft.cu' in line 63 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcc_summax.cu' in line 160 : unknown error.

</stderr_txt>
]]>

Validate state   Invalid
Claimed credit   73.2308450799687
Granted credit   0
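
For anyone trying to spot a common theme across many failed tasks, a rough Python sketch like the one below could tally which device each result ran on and which CUDA call failed first. It assumes stderr text in the same shape as the output above (the "CUDA Device N specified" line and the "Cuda error '...' in file ..." lines). The function names `summarize_stderr` and `tally_by_device` are made up for illustration and are not part of BOINC or the SETI apps, and fetching the stderr text from the results pages is left out.

```python
import re
from collections import Counter

# Patterns follow the stderr format shown in the post above.
DEVICE_RE = re.compile(r"CUDA Device (\d+) specified")
ERROR_RE = re.compile(r"Cuda error '([^']*)' in file '[^']*' in line (\d+)")


def summarize_stderr(text):
    """Return (device number or None, first failing CUDA call or None)."""
    device = DEVICE_RE.search(text)
    error = ERROR_RE.search(text)
    return (
        int(device.group(1)) if device else None,
        error.group(1) if error else None,
    )


def tally_by_device(stderr_texts):
    """Count results that logged at least one CUDA error, keyed by device."""
    counts = Counter()
    for text in stderr_texts:
        device, first_error = summarize_stderr(text)
        if first_error is not None:
            counts[device] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample, trimmed from the kind of output quoted above.
    sample = (
        "setiathome_CUDA: CUDA Device 1 specified, checking...\n"
        "Cuda error 'cufftExecC2C' in file 'cudaAcc_fft.cu' in line 63 : unknown error.\n"
    )
    print(summarize_stderr(sample))
    print(tally_by_device([sample]))
```

If nearly every failing result lands on the same device number, that would support the "one bad GPU" reading; a spread across both devices would point more towards drivers or the app.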

Offline sunu

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 771
Re: GTX295 CUDA Issues
« Reply #8 on: 13 May 2010, 06:13:02 am »
Have you recently updated drivers etc. ?

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: GTX295 CUDA Issues
« Reply #9 on: 13 May 2010, 03:12:44 pm »
Have you recently updated drivers etc. ?

I am currently running 197.45 as this is the version recommended for Cuda support in Adobe CS5.

Which version are folks using for Seti ?

Offline efmer (fred)

  • Alpha Tester
  • Knight o' The Round Table
  • ***
  • Posts: 147
    • efmer
Re: GTX295 CUDA Issues
« Reply #10 on: 15 May 2010, 01:28:36 pm »
Quote
Have you recently updated drivers etc. ?

I am currently running 197.45 as this is the version recommended for Cuda support in Adobe CS5.

Which version are folks using for Seti ?

These are my computers: http://setiathome.berkeley.edu/hosts_user.php?userid=8906489
XP 64 is by far the best choice. Win 7 is a lot slower and a bit buggy.

Do you have any warranty on this card?
I don't want to scare you, but these 2-PCB versions are really bad.
I got about 4 defective cards before I got 3 cards that work fine 24/7.
The defective ones were all 2-PCB cards and, lucky me, I have none left: all were replaced under full warranty, without any questions. Even as a second owner, you may want to check whether you still have any warranty left.

The 2-PCB cards get way, way too hot. It may be that under Seti the cards get a bit warmer than with other applications.

So I recognize all your problems. I had 2 systems so I could swap them around, and still it didn't work. The one-PCB cards worked flawlessly.

CUDA 2.3 works best for me. And 190.38, not the latest, works best.

TThrottle Keep your temperatures controlled.
BoincTasks The best way to view BOINC

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: GTX295 CUDA Issues
« Reply #11 on: 16 May 2010, 12:17:37 am »
Quote
These are my computers: http://setiathome.berkeley.edu/hosts_user.php?userid=8906489
XP 64 is by far the best choice. Win 7 is a lot slower and a bit buggy.

Do you have any warranty on this card?
I don't want to scare you, but these 2-PCB versions are really bad.
I got about 4 defective cards before I got 3 cards that work fine 24/7.
The defective ones were all 2-PCB cards and, lucky me, I have none left: all were replaced under full warranty, without any questions. Even as a second owner, you may want to check whether you still have any warranty left.

The 2-PCB cards get way, way too hot. It may be that under Seti the cards get a bit warmer than with other applications.

So I recognize all your problems. I had 2 systems so I could swap them around, and still it didn't work. The one-PCB cards worked flawlessly.

CUDA 2.3 works best for me. And 190.38, not the latest, works best.

Warranty's out, so that's a no-go. I have shopped around looking for a single-PCB version, but they seem to be out of circulation or discontinued in the UK.

I think I will wait for the coding issues to get sorted and go the GTX480 route.

Offline efmer (fred)

  • Alpha Tester
  • Knight o' The Round Table
  • ***
  • Posts: 147
    • efmer
Re: GTX295 CUDA Issues
« Reply #12 on: 27 May 2010, 12:36:26 pm »
Try the new Beta drivers.
These Win 7 / GTX 295 drivers are the first to really work.
My 2 cards work, for the first time with Win 7, without any problems.

madmac

  • Guest
Re: GTX295 CUDA Issues
« Reply #13 on: 27 May 2010, 04:53:55 pm »
I ended up getting a refund in the end.

Ok, I now need some advice on how to spend £300!

All I am interested in is crunching - not games.
Typical me, I was looking at a machine filled with 9800 GX2's, but I got excited and bought a 295 off eBay. It had a fault so I returned it, but I loved the output it gave me :-)
So I have to buy another card.
I know there is no hard and fast data, but what will give the best PPD?
A GTX295 or a GTX470? There isn't that much price difference now, and a 480 is still too expensive...
I know Fermi has a different design, and the 256 drivers and CUDA 3.1 are meant to improve things further, so if you had the money, what would you do?
For the best points per day would you get a 470 or a 295?
Help a cruncher in need :-)

Offline sunu

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 771
Re: GTX295 CUDA Issues
« Reply #14 on: 27 May 2010, 07:08:07 pm »
A GTX295 should give a better RAC than a GTX470 or 480. You could also wait, I don't know how long, for the dual Fermi cards to come out.

 
