Author Topic: SETI MB CUDA for Linux  (Read 348231 times)

Offline sunu

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 771
Re: SETI MB CUDA for Linux
« Reply #435 on: 08 Sep 2009, 08:38:34 pm »
Yes, boinc scheduler is pretty messed up.

IanJ

  • Guest
Re: SETI MB CUDA for Linux
« Reply #436 on: 18 Sep 2009, 03:03:30 am »
Sunu,
 I'm just reporting back to update you on my Fedora Core 10 64-bit machine. I have successfully installed the CUDA 2.3 libraries and the machine has been chugging through the workunits.
 I have a couple of questions, one of which is worrying. Occasionally over the past week the machine has locked up and only a reset has cleared the issue. I had a look at /var/log/messages and I see a number of NVRM: Xid messages. I've googled around and didn't get a clear answer, so does anyone here have an idea? Here are the entries:
Sep 17 17:11:41 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
Sep 18 04:58:05 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
Sep 18 05:11:07 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
 The second question: is there any way to ensure that the card is being used to the max? Is there any tuning or monitoring of the card that would assist?
 Thanks
 Ian

Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #437 on: 18 Sep 2009, 02:50:58 pm »

i have been getting this exact same message with the same numbers in the same positions on my tesla. i have been searching for the meaning of these for almost 3 months. i only know that i have a problem with the tesla where, even though it never locked the entire machine up, the tesla will lock up occasionally, requiring a stop and restart of boinc. until i fixed my thermal pad problem on it for the ram chips, it would require a power off to reset it. i suspect this error may indicate ram problems. i ran a utility for testing video ram available on the net but it showed nothing wrong.

if the card is still in warranty i suggest getting a swap-out just to be safe.


Offline sunu

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 771
Re: SETI MB CUDA for Linux
« Reply #438 on: 18 Sep 2009, 06:39:33 pm »
Unfortunately Xid errors aren't very useful for troubleshooting. Riofl talks about problems with his thermal pads; you might have temperature problems as well, not necessarily with your graphics card but also with your CPU or northbridge. What are your temperatures like?

Do you use KDE? I have the impression that I see many more reports of problems with KDE and nvidia drivers than with GNOME.

Which nvidia drivers do you use? You could try a different version.

Check also your invalid tasks page. If you have invalid tasks this could mean a hardware problem with your graphics card and you'll have to replace it.

As for your second question about a monitoring app for GPU use, there isn't any. I use gnome's sensor-applet to monitor my CPU and GPU temperatures so I can always see if they are working (=crunching) or not.
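
For anyone who wants to see how often the Xid lines show up and line them up against temperature readings, here is a minimal Python sketch. It assumes the syslog format shown in Ian's post and nothing else; the function names are made up for illustration, and it only counts and lists the events, it does not interpret the hex words.

```python
import re
from collections import Counter

# Matches kernel lines in the format quoted in this thread, e.g.
# "Sep 17 17:11:41 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 ..."
XID_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+"
    r"NVRM: Xid \((?P<bus>[^)]+)\): (?P<code>\d+),\s*(?P<data>.*)$"
)


def parse_xid_lines(lines):
    """Return one dict per NVRM Xid line; other log lines are ignored."""
    events = []
    for line in lines:
        match = XID_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


def summarize(events):
    """Count events per (bus id, Xid code) pair."""
    return Counter((e["bus"], e["code"]) for e in events)


def scan_log(path="/var/log/messages"):
    """Hypothetical helper: parse a whole log file (usually needs root)."""
    with open(path, errors="replace") as handle:
        return parse_xid_lines(handle)
```

Calling `summarize(scan_log())` as root gives a count per bus and Xid code, and the `stamp` field of each event can be compared with whatever temperature log you keep.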

Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #439 on: 19 Sep 2009, 07:12:47 pm »
ahh another clue.. i use kde since i use many of its built in features... i tried gnome but i admit it was years ago and it did not do what i needed at that time... a small thing to most but it helps me a lot when switching a fair number of desktops every few min, is i need unique backgrounds for each desktop, and back then gnome did not do that. i understand it is now built in or there is an app to do it.. i might get adventurous and try the new gnome.

yeah my pads on the tesla were dried, brittle and cracked and some simply were not there so i had to clean everything and find the proper thick 2mm pads with fiberglass webbing which i had to buy from the UK and spent an afternoon being an artist with scissors and knife since the pads came in large sheets. since then temps everywhere including touch on the housing are considerably cooler but it still locks maybe once a week or so... maybe something was damaged from heat when the bad pads were on it. at this point i am not worried.. i am going to replace it soon and then send it to my boss who wants to run it in a much more forgiving windows environment. hehe let him have the problems :)

IanJ

  • Guest
Re: SETI MB CUDA for Linux
« Reply #440 on: 24 Sep 2009, 12:20:15 pm »
Sunu,
 The Xid problem I reported was on a system that didn't have any window manager running; it was in runlevel 3, not 5, so the machine was just a basic vt100 console. However, there did seem to be a hardware problem, and last Sunday the disk packed up. Today, after a careful reinstall and some modifications to logical volumes and mounting, I've managed to get the machine back to a workable state. I've installed the two CUDA 2.3 packages. However, I've missed or messed up something, as my tasks keep aborting. The ldd output for the seti executable seems OK, but as I say the thing fails. What have I done wrong in the attached task error output?
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

SETI@home MB CUDA_2.2 608 Linux 64bit SM 1.0 - r12 by Crunch3r :p
VLAR autokill mod

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 9600 GT
           totalGlobalMem = 536608768
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1600000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 8
SIGSEGV: segmentation violation
Stack trace (17 frames):
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu(boinc_catch_signal+0x43)[0x485ef3]
/lib64/libpthread.so.0[0x60880f0]
/usr/lib64/libcuda.so.1[0xb8d980]
/usr/lib64/libcuda.so.1[0xb933c4]
/usr/lib64/libcuda.so.1[0xb63557]
/usr/lib64/libcuda.so.1[0xb0ecf7]
/usr/lib64/libcuda.so.1[0xb2052b]
/usr/lib64/libcuda.so.1[0xb05940]
/usr/lib64/libcuda.so.1[0xafea8a]
/usr/lib64/libcuda.so.1(cuCtxCreate+0x57)[0xb59187]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x5bf335]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x413c5b]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x41f68d]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x42b54d]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x408707]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x6c9d546]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu(__gxx_personality_v0+0x219)[0x408349]

Exiting...

</stderr_txt>
]]>

Thanks Ian!
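
A small Python sketch for reading output like the above. It only relies on the text format shown in this post; the function names are illustrative. It shows that 193, 0xc1 and -63 are the same exit code written three ways, and it picks out the stack frames that carry a symbol name. For this trace the named frame in libcuda is cuCtxCreate, which would point to the crash happening while the app creates its CUDA context.

```python
import re

# One backtrace frame, e.g.
# "/usr/lib64/libcuda.so.1(cuCtxCreate+0x57)[0xb59187]" or
# "/usr/lib64/libcuda.so.1[0xb8d980]" (no symbol available)
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(\[]+)"
    r"(?:\((?P<symbol>[^+)]*)\+0x[0-9a-fA-F]+\))?"
    r"\[0x[0-9a-fA-F]+\]$"
)


def signed_byte(code):
    """Reinterpret an exit code in 0..255 as a signed 8-bit value."""
    return code - 256 if code > 127 else code


def named_frames(stderr_text):
    """Return (module, symbol) for every frame that has a symbol name."""
    frames = []
    for line in stderr_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group("symbol"):
            frames.append((match.group("module"), match.group("symbol")))
    return frames
```

With the trace above, `named_frames` returns the signal handler first, then the cuCtxCreate frame, then the C library and application start-up frames.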

pp

  • Guest
Re: SETI MB CUDA for Linux
« Reply #441 on: 24 Sep 2009, 02:07:20 pm »
Shameless plug:

I've managed to climb to the #3 spot in the Top hosts list. This is probably the highest I'll ever be so I'll savour the moment.  :D

PDF attached for future proof.

And now we have two Linux machines among the top 20. Don't know yet how high it will reach though...  :D

Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #442 on: 24 Sep 2009, 05:48:35 pm »

ok you say you installed cuda 2.3 libraries and the 2.3 v190 series driver right (earlier drivers won't work) ?

your error report says the app is cuda 2.2 so it will error. the app must also be cuda 2.3 compliant. those who explained things to me insisted that the driver, toolkit and app must use the same cuda version. i don't believe there is such a thing as 'backward compatibility' with cuda.




Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #443 on: 24 Sep 2009, 05:49:23 pm »
Shameless plug:

I've managed to climb to the #3 spot in the Top hosts list. This is probably the highest I'll ever be so I'll savour the moment.  :D

PDF attached for future proof.

And now we have two Linux machines among the top 20. Don't know yet how high it will reach though...  :D

cool. maybe someday ill get in there too

pp

  • Guest
Re: SETI MB CUDA for Linux
« Reply #444 on: 25 Sep 2009, 05:19:31 am »
ok you say you installed cuda 2.3 libraries and the 2.3 v190 series driver right (earlier drivers won't work) ?

your error report says the app is cuda 2.2 so it will error. the app must also be cuda 2.3 compliant. those who explained things to me insisted that the driver, toolkit and app must use the same cuda version. i don't believe there is such a thing as 'backward compatibility' with cuda.

AFAIK there is no 2.3 binary for CUDA. The 2.2 version is compatible and I run it on several computers. Copy and paste the output of the following command so we can see where your CUDA library links to:
Code:
ls -l /usr/lib64/libcuda*
/PP
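
The same check can be sketched in Python. This is only an illustration: the helper names are made up, the default path is the one from the command above, and the version query assumes your libcuda exports the driver API call cuDriverGetVersion, which reports the CUDA version as 1000*major + 10*minor. If the library or the symbol is missing, the function simply returns None.

```python
import ctypes
import glob
import os


def list_libcuda(pattern="/usr/lib64/libcuda*"):
    """Return (path, resolved target) pairs, similar to reading ls -l output."""
    return [(path, os.path.realpath(path)) for path in sorted(glob.glob(pattern))]


def format_cuda_version(raw):
    """Decode 1000*major + 10*minor, so 2030 becomes '2.3'."""
    return f"{raw // 1000}.{(raw % 1000) // 10}"


def driver_cuda_version(libname="libcuda.so.1"):
    """Ask the installed driver library which CUDA version it supports."""
    try:
        lib = ctypes.CDLL(libname)
        get_version = lib.cuDriverGetVersion
    except (OSError, AttributeError):
        return None  # library not found, or symbol not exported
    version = ctypes.c_int(0)
    if get_version(ctypes.byref(version)) != 0:
        return None  # non-zero CUresult means the call failed
    return format_cuda_version(version.value)
```

If `list_libcuda()` shows libcuda.so.1 resolving to a file from an older driver than the one you installed, that stale link is the first thing to fix.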

pp

  • Guest
Re: SETI MB CUDA for Linux
« Reply #445 on: 30 Sep 2009, 08:23:49 am »
There's a third Linux computer among the top 20 hosts now but it's not me this time though. I will however fight his 4xGTX275 with my single GTX295!  ;D

b0b3r

  • Guest
Re: SETI MB CUDA for Linux
« Reply #446 on: 30 Sep 2009, 08:33:32 am »
I don't think so. It is a relatively new host (since it was upgraded) and that's why it has a lower RAC. It is currently generating a lot more points: http://pl.boincstats.com/stats/host_graph.php?pr=sah&id=5011059  ;D
« Last Edit: 30 Sep 2009, 08:41:28 am by b0b3r »

IanJ

  • Guest
Re: SETI MB CUDA for Linux
« Reply #447 on: 30 Sep 2009, 08:43:49 am »
Riofl and Sunu,
 Just an update. It looks like copying the seti CUDA executable into the /usr/sbin directory finally got it to calm down and start crunching.
 The NVRM Xid issue continues but no longer locks up the machine. It's been up nearly a week without a lockup, but I've seen eight Xid messages in the past three days. As the machine continues on happily I'll forget about it for now. During the reinstall last week I took off the expansion card blanking plates (this machine has only one card in it, the 9600GT) so the machine can get a bit more air.
 Thanks for your help!
 Ian

pp

  • Guest
Re: SETI MB CUDA for Linux
« Reply #448 on: 30 Sep 2009, 09:11:26 am »
There's a third Linux computer among the top 20 hosts now but it's not me this time though. I will however fight his 4xGTX275 with my single GTX295!  ;D

I don't think so. It is relatively new host (since it was upgraded) and that's why it have lower RAC. It currently generating a lot more points http://pl.boincstats.com/stats/host_graph.php?pr=sah&id=5011059  ;D

Compiling my already über optimized 2.6.31-kernel with some -floop-interchange or -floop-strip-mine will take care of that... or I'll just throw in another 295.  :D Nice to see another Gentooer on the list though... but I hope the heat in your room makes your skin curl up and peel off! ;D

b0b3r

  • Guest
Re: SETI MB CUDA for Linux
« Reply #449 on: 30 Sep 2009, 09:45:45 am »
Compiling my already über optimized 2.6.31-kernel with some -floop-interchange or -floop-strip-mine will take care of that... or I'll just throw in another 295.  :D Nice to see another Gentooer on the list though... but I hope the heat in your room makes your skin curl up and peel off! ;D

Also nice to see Gentooer here  :)

Don't bother. This host has a special self-made case and it's nice and cool. I also have another 295 for it, but it is currently out for service because it was factory damaged.

So when it's back it will swallow your tiny, poor über. ;D
But I think it will do it even earlier.  ;D

 
