Author Topic: SETI MB CUDA for Linux  (Read 507714 times)

pp

  • Guest
Re: SETI MB CUDA for Linux
« Reply #360 on: 19 Aug 2009, 08:36:11 am »
bit-tech.net recently had an entertaining article about this. They tried both 4xGTX295 and 7x9600GT and not surprisingly, heat was a problem. They tested Folding@Home but it has lots of nice pictures...
http://www.bit-tech.net/bits/2009/08/03/how-to-build-the-best-folding-rig/1

Offline sunu

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 771
Re: SETI MB CUDA for Linux
« Reply #361 on: 19 Aug 2009, 10:06:01 am »
Thanks for the link, I haven't seen that.

That's where water cooling enters the picture. 4 x BFG NVIDIA GeForce GTX 295 H2OC 1792MB PCIe 2.0 with ThermoIntelligence Advanced Cooling Solution or 7 x BFG NVIDIA GeForce GTX 285 H2O+ 1GB PCIe 2.0 with ThermoIntelligence Advanced Cooling Solution or any other water cooled solution.

macros

  • Guest
Re: SETI MB CUDA for Linux
« Reply #362 on: 19 Aug 2009, 10:48:42 am »
Quote
Are you still running CUDA 2.1? The 100% CPU was apparently a bug in those libraries. Upgrade CUDA to 2.3, nvidia-drivers to 190.xx and replace your setiathome executable with the 2.2 version and optionally renice that process if you think it's too slow.

Quote
Macros, what pp says. Make sure you're using cuda 2.2 or later together with a compatible nvidia driver.

Quote
i think you will find best response setting your preferences to use 6 or 7 cpus instead of 8 leaving 1 for cuda and your desktop to use. i played around a bit with max_ncpus but did not find a huge difference. mine is set at 0.35.

absolutely if you do nothing else change your cuda toolkit and sdk to 2.2 and get the 2.2 application. make sure your driver is at the minimum 185.14 or 185.29. i am using 185.29.

ver 2.1 had huge flaws in it. i have heard 2.3 is even better, however i have not had good luck with 2.3 so i went back to 2.2 until i can figure out what went wrong.

Quote
Small correction to riofl: The driver versions are 185.18.14 and 185.18.29. Latest is 185.18.31. Macros, if you go to cuda 2.3 you'll need 190.18.

Macros, what card are you using? Maybe that 99% is because your card goes out of memory?
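
The version pairings quoted above can be sketched as a small check. This is only a minimal sketch, assuming the minimum driver versions mentioned in this thread (185.18.14 for CUDA 2.2, 190.18 for CUDA 2.3); the table and the function names are hypothetical and not taken from any NVIDIA or BOINC tool.

```python
# Minimum driver versions as quoted in this thread (an assumption for
# illustration, not an official compatibility table).
MIN_DRIVER_FOR_CUDA = {
    "2.2": "185.18.14",
    "2.3": "190.18",
}


def parse_version(text):
    """Turn '185.18.14' into (185, 18, 14) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_is_new_enough(cuda_version, driver_version):
    """Return True if driver_version meets the minimum quoted for cuda_version."""
    minimum = MIN_DRIVER_FOR_CUDA[cuda_version]
    return parse_version(driver_version) >= parse_version(minimum)


if __name__ == "__main__":
    # 185.18.31 is the "latest" 185 driver mentioned above.
    print(driver_is_new_enough("2.2", "185.18.31"))
    print(driver_is_new_enough("2.3", "185.18.31"))
```

Comparing tuples of integers avoids the trap of comparing version strings character by character, where "185.18.9" would sort after "185.18.14".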

Thanks for everyone's hints.

I've installed nvidia-drivers version 185.18.14 from the Ubuntu PPA source (x-updates) (I don't want to get on the 'manual track' of managing nvidia drivers here...) plus the 2.2 CUDA libraries. I've also upgraded to the setiathome-CUDA_2.2_6.08.x86_64_vlarkill.tar.bz2 client as pp suggested. The first results weren't satisfactory: the setiathome CUDA client would crash with the following error output:

Code:
<core_client_version>6.6.37</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : Quadro FX 4600
           totalGlobalMem = 804585472
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1188000
           totalConstMem = 65536
           major = 1
           minor = 0
           textureAlignment = 256
           deviceOverlap = 0
           multiProcessorCount = 12
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: Quadro FX 4600 is okay
SIGSEGV: segmentation violation
Stack trace (16 frames):
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x47cba9]
/lib/libpthread.so.0[0x7f96066ac080]
/usr/lib/libcuda.so.1[0x7f9607123020]
/usr/lib/libcuda.so.1[0x7f9607128d84]
/usr/lib/libcuda.so.1[0x7f96070f210f]
/usr/lib/libcuda.so.1[0x7f9606e7db3b]
/usr/lib/libcuda.so.1[0x7f9606e8e46b]
/usr/lib/libcuda.so.1[0x7f9606e76211]
/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7f9606e6ffaa]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x5ace4b]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x40d4ca]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x419f23]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x424c7d]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x407f60]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7f96063495a6]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu(__gxx_personality_v0+0x241)[0x407be9]

Exiting...

</stderr_txt>
]]>
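
A dump like the one above is easier to read when the frames are grouped by module. The following is a rough sketch, not part of BOINC or the SETI client: the regular expression and the helper names are assumptions made for illustration, and only a few frames from the trace are copied in. It also shows that the three numbers on the exit-code line (193, 0xc1, -63) appear to be one value written as decimal, hex and a signed byte.

```python
import re
from collections import Counter

# A few frames copied from the stack trace above.
TRACE = """\
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x47cba9]
/lib/libpthread.so.0[0x7f96066ac080]
/usr/lib/libcuda.so.1[0x7f9607123020]
/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7f9606e6ffaa]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x5ace4b]
"""

# module, optional "(symbol+0xoffset)", then "[0xaddress]"
FRAME = re.compile(
    r"^(?P<module>[^\[(]+)"
    r"(?:\((?P<symbol>[^)+]*)\+0x[0-9a-f]+\))?"
    r"\[(?P<addr>0x[0-9a-f]+)\]$"
)


def parse_frames(text):
    """Return a list of (module, symbol or None) for every frame line."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append((match["module"], match["symbol"]))
    return frames


def frames_per_module(frames):
    """Count how many frames belong to each module."""
    return Counter(module for module, _ in frames)


def as_signed_byte(code):
    """Reinterpret an exit code in 0..255 as a signed 8-bit value."""
    return code - 256 if code > 127 else code


if __name__ == "__main__":
    frames = parse_frames(TRACE)
    for module, count in frames_per_module(frames).items():
        print(count, module)
    print([symbol for _, symbol in frames if symbol])
    print(193, hex(193), as_signed_byte(193))
```

In the full trace the only named symbol inside libcuda.so.1 is cuCtxCreate, so the segmentation fault happened while the CUDA context was being created.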

Then I tried running the seti CUDA client standalone on the very same workunit and guess what - it worked.  :o
After messing around, I've ended up in a state where there is one CUDA task running (and this time it seems to run correctly - around 3-4% CPU time), but I don't have an explanation for the previous crashes.

The machine is:
Dual QC Xeon X5460 @ 3.16GHz
16GiB RAM
nVidia Quadro FX 4600
Ubuntu 9.04 w/ 2.6.28-15-server (I understood from other threads that this might be an issue, but that doesn't really square with the fact that I didn't have a single compute error until I upgraded to 2.2 CUDA + 2.2 seti CUDA client)
boinc ver. 6.6.37

pp

  • Guest
Re: SETI MB CUDA for Linux
« Reply #363 on: 19 Aug 2009, 11:04:18 am »
The crash dump is still referencing the old executable. Did you update your app_info.xml? Also make sure you copy the new libcudart.so.2 and libcufft.so.2 to your projects/setiathome.berkeley.edu directory. And finally, as stated in another thread, also copy the new executable to /usr/local/bin or whatever directory you have in your PATH. I have had no problems since following this advice (well, apart from having to renice the executable to level 0 to give it enough CPU time).
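
The file checklist above can be sketched as a quick script. This is a minimal sketch under assumptions: the three file names come from the post, but the function name is hypothetical and the project directory is whatever path your BOINC installation actually uses.

```python
from pathlib import Path

# File names taken from the checklist above.
REQUIRED_FILES = ("app_info.xml", "libcudart.so.2", "libcufft.so.2")


def missing_files(project_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in project_dir."""
    project = Path(project_dir)
    return [name for name in required if not (project / name).exists()]


if __name__ == "__main__":
    # Assumed to be run from the BOINC data directory; adjust the path as needed.
    for name in missing_files("projects/setiathome.berkeley.edu"):
        print("missing:", name)
```

It only checks that the files exist. It does not verify that they are the 2.2 versions or that app_info.xml names the new executable.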

macros

  • Guest
Re: SETI MB CUDA for Linux
« Reply #364 on: 19 Aug 2009, 11:13:37 am »
Quote
The crash dump is still referencing the old executable.

True, but I got the same for the newer one; I just picked one from the error list and didn't notice it's the old one...

Quote
Did you update your app_info.xml? Also make sure you copy the new libcudart.so.2 and libcufft.so.2 to your projects/setiathome.berkeley.edu directory.
And finally, as stated in another thread, also copy the new executable to /usr/local/bin or whatever directory you have in your PATH. I have had no problems since following this advice (well, apart from having to renice the executable to level 0 to give it enough CPU time).

Yes, I did all that. Anyway, it seems to be running now. Since I didn't make one change at a time, I don't know exactly what the cause was.  ;) ::)
Besides, it's just the first WU; hopefully there will be no more errors.

edit: It works. Finally :)
« Last Edit: 19 Aug 2009, 11:40:07 am by macros »

Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #365 on: 19 Aug 2009, 01:02:20 pm »
Quote
The crash dump is still referencing the old executable.

True, but I got the same for the newer one; I just picked one from the error list and didn't notice it's the old one...

Quote
Did you update your app_info.xml? Also make sure you copy the new libcudart.so.2 and libcufft.so.2 to your projects/setiathome.berkeley.edu directory.
And finally, as stated in another thread, also copy the new executable to /usr/local/bin or whatever directory you have in your PATH. I have had no problems since following this advice (well, apart from having to renice the executable to level 0 to give it enough CPU time).

Yes, I did all that. Anyway, it seems to be running now. Since I didn't make one change at a time, I don't know exactly what the cause was.  ;) ::)
Besides, it's just the first WU; hopefully there will be no more errors.

edit: It works. Finally :)

one thing you need to make sure of is that the project directory where the cuda libs are is listed in the ld.so.conf file and that you have run ldconfig. without that it is very likely it would crash possibly a few times and then find its libraries by accident.
« Last Edit: 19 Aug 2009, 01:04:33 pm by riofl »

macros

  • Guest
Re: SETI MB CUDA for Linux
« Reply #366 on: 19 Aug 2009, 01:58:20 pm »
Quote
one thing you need to make sure of is that the project directory where the cuda libs are is listed in the ld.so.conf file and that you have run ldconfig. without that it is very likely it would crash possibly a few times and then find its libraries by accident.

Yeah, I always verify DSO availability with ldd on every client ...
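
The ldd check mentioned above can be automated along these lines. This is a hedged sketch: it relies on ldd printing "=> not found" for libraries the loader cannot resolve, and the function names and the binary path in the usage example are assumptions for illustration.

```python
import subprocess


def unresolved_libraries(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run ldd on path and return the libraries it could not resolve."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return unresolved_libraries(result.stdout)


if __name__ == "__main__":
    # Hypothetical location of the client; use the path from your own setup.
    print(check_binary("/usr/local/bin/setiathome-CUDA-6.08.x86_64-pc-linux-gnu"))
```

An empty list means every shared library was resolved in the environment the script ran in. The BOINC client may start the application with a different library search path, so a clean result here does not prove the task will find its libraries.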

Offline sunu

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 771
Re: SETI MB CUDA for Linux
« Reply #367 on: 21 Aug 2009, 06:48:26 am »
riofl, what is happening?

I've checked again your host today and I've seen this: http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=40&show_names=0&state=2

All the 200- and 300-second tasks were done by your 285. All the two-digit-second tasks were done by your Tesla. This is completely abnormal.

Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #368 on: 21 Aug 2009, 08:22:04 pm »
Quote
riofl, what is happening?

I've checked again your host today and I've seen this: http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=40&show_names=0&state=2

All the 200- and 300-second tasks were done by your 285. All the two-digit-second tasks were done by your Tesla. This is completely abnormal.

i have no idea. i agree. unless my desktops are keeping too many shaders busy (and i don't think standard gui desktops use many of them, if any), i am at a loss... the 285 should be at least near the speed of the tesla, allowing for the 285 being busy with things on the desktops. since we have had some strong storms in the area, boinc has been down for the past 3 hours or so and i have just started things back up.

i think maybe i will boot this machine from my tuning windows drive tomorrow (i keep it in a drawer) and review what riva tuner and the evga program tell me. nvidia-settings shows me that all the clocks are what they are supposed to be. now i don't know much about desktop settings, but there is one change i made in the past few weeks with nvidia-settings: i unchecked Sync to VBlank in the xvideo settings and also unchecked Sync to VBlank and Allow Flipping in the opengl settings. i wasn't sure what they did, but there seemed to be no difference. should they be checked? outside of that, the clock frequencies on the 285 are as follows:

2D settings: gpu 300 MHz, memory 100 MHz
3D settings: gpu 690 MHz, memory 1300 MHz

powermizer, which doesn't seem to have any settings, says: adaptive clocking enabled, performance level 2, performance mode desktop. level 2 is the 3D settings above. however, i remember that when i first got the card, performance mode said maximum performance, and somewhere along the line it changed to desktop. since the other settings are the same, i can only assume the text that shows up depends on which driver is being used.

does this give any clues?

lordvader

  • Guest
Re: SETI MB CUDA for Linux
« Reply #369 on: 21 Aug 2009, 08:23:01 pm »
Hi.

I was just wondering if there are any known issues in using the CUDA client with the 2.6.30 kernel? I recently built a 2.6.30 kernel (to see if the AP units will fail), and noticed that my CUDA units were appreciably slower (taking over an hour).
I just switched back to the latest Ubuntu kernel (2.6.28-15-generic), which seems to work fine.

Any suggestions, any particular info you need ? I'm using the same nvidia driver in both cases (185.18.31, on an x86_64 platform)

Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #370 on: 21 Aug 2009, 08:27:17 pm »
Quote
riofl, what is happening?

I've checked again your host today and I've seen this: http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=40&show_names=0&state=2

All the 200- and 300-second tasks were done by your 285. All the two-digit-second tasks were done by your Tesla. This is completely abnormal.

since 6.9.0 reports 2 teslas, could it be possible it is mixing up which device is 0 and which is 1? because it is completely odd since the tesla is running gpu 500mhz and memory 900mhz so it should be considerably slower. it rates both devices it thinks are teslas at 74gflops yet the 285 is rated by 6.6.11 as 127gflops

i am going to reboot this tomorrow so when i do i am going to go over the settings in cmos. presently it is set to auto on pci-e bus frequency. maybe i will fix it at 100mhz .. it could be doing God knows what in auto.

Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #371 on: 21 Aug 2009, 08:36:43 pm »
Quote
Hi.

I was just wondering if there are any known issues in using the CUDA client with the 2.6.30 kernel? I recently built a 2.6.30 kernel (to see if the AP units will fail), and noticed that my CUDA units were appreciably slower (taking over an hour).
I just switched back to the latest Ubuntu kernel (2.6.28-15-generic), which seems to work fine.

Any suggestions, any particular info you need ? I'm using the same nvidia driver in both cases (185.18.31, on an x86_64 platform)

i am using 2.6.29. and now that you mention it i have been having issues for a few weeks. i installed this on july 23rd. unfortunately i cannot remember far enough back since i have made so many other changes as well whether performance degraded then or not. i may try going back to my other kernel, 2.6.25 and see what happens.

Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #372 on: 21 Aug 2009, 08:41:06 pm »
Quote
riofl, what is happening?

I've checked again your host today and I've seen this: http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=40&show_names=0&state=2

All the 200- and 300-second tasks were done by your 285. All the two-digit-second tasks were done by your Tesla. This is completely abnormal.

lordvader brought up an interesting point i have not even considered. july 23rd i switched from 2.6.25 to 2.6.29 kernel. that is close to the time i started having issues. i am going to try switching back tomorrow... unfortunately it will require some recompiling of the kernel and all modules and a few other things since with the switch to 2.6.29 i also switched gcc to 4.3.2

also maybe i missed something in getting rid of cuda 2.3. i guess a good manual inspection of things is in order tomorrow.

lordvader

  • Guest
Re: SETI MB CUDA for Linux
« Reply #373 on: 22 Aug 2009, 03:44:44 am »
I'm gonna try a vanilla built 2.6.28.10 kernel, see if I get the same performance issues (and hopefully successful AP units ...).

This is fun ! Damn I missed this stuff !

Offline riofl

  • Knight o' The Round Table
  • ***
  • Posts: 240
Re: SETI MB CUDA for Linux
« Reply #374 on: 22 Aug 2009, 06:10:58 am »
Quote
I'm gonna try a vanilla built 2.6.28.10 kernel, see if I get the same performance issues (and hopefully successful AP units ...).

This is fun ! Damn I missed this stuff !

used to be for me too until i started doing this stuff for a living.. now it's just plain irritating when something doesn't go right the first time.

 
