Seti@Home optimized science apps and information
Optimized Seti@Home apps => Linux => Topic started by: guildwar on 22 Mar 2009, 08:55:10 am
-
I have a machine with one CUDA device, and setiathome CUDA for Linux runs on it perfectly.
Recently I plugged in a second CUDA device, identical to the first one.
When I run BOINC, it reports two CUDA devices. So far so good.
But I noticed that the tasks' CPU time is not consistent; it ranges from about 300 seconds to about 1600 seconds:
284.96 13.79 13.79
to
1,640.06 13.79 13.79
After checking, I found that setiathome uses the CPU instead of the GPU to crunch the workunit:
ps ax|grep seti
4311 pts/1 RNLl 9:42 setiathome-CUDA-6.08.x86_64-pc-linux-gnu --device 0
4312 pts/1 RNLl 9:42 setiathome-CUDA-6.08.x86_64-pc-linux-gnu --device 0
Both setiathome CUDA instances use the same GPU.
So I changed app_info.xml to:
<coproc>
<type>CUDA</type>
<count>2</count>
</coproc>
and restarted BOINC:
ps ax | grep seti
4423 pts/1 RNLl 9:45 setiathome-CUDA-6.08.x86_64-pc-linux-gnu --device 0 --device 1
Now only one instance of setiathome CUDA shows up, but with two devices.
But when I checked the results, the CPU time and wall time were the same as with only one GPU.
Can somebody tell me why?
-
A single CUDA MB instance can use only one GPU device.
That is, you need to run one copy with device 0 and a second with device 1.
Not both with device 0, and not a single copy with two devices.
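The rule above (one instance per device, each on a different device) can be checked against `ps ax | grep seti` output. The following is only a minimal sketch: the helper names `devices_by_process` and `find_problems` are made up for illustration, and it assumes the process lines look like the ones posted earlier in this thread.

```python
import re
from collections import defaultdict


def devices_by_process(ps_lines):
    """Map each setiathome CUDA PID to the GPU indices passed via --device."""
    result = {}
    for line in ps_lines:
        fields = line.split()
        # Skip blank lines and anything that does not start with a PID.
        if not fields or not fields[0].isdigit():
            continue
        # Skip unrelated processes (for example the grep itself).
        if "setiathome-CUDA" not in line:
            continue
        devices = [int(n) for n in re.findall(r"--device\s+(\d+)", line)]
        result[int(fields[0])] = devices
    return result


def find_problems(ps_lines):
    """Return human-readable descriptions of device assignment problems."""
    problems = []
    users = defaultdict(list)
    for pid, devices in devices_by_process(ps_lines).items():
        if len(devices) != 1:
            problems.append(
                f"PID {pid} was given {len(devices)} devices; expected exactly 1"
            )
        for device in devices:
            users[device].append(pid)
    for device, pids in sorted(users.items()):
        if len(pids) > 1:
            problems.append(f"device {device} is shared by PIDs {pids}")
    return problems


if __name__ == "__main__":
    sample = [
        "4311 pts/1 RNLl 9:42 setiathome-CUDA-6.08.x86_64-pc-linux-gnu --device 0",
        "4312 pts/1 RNLl 9:42 setiathome-CUDA-6.08.x86_64-pc-linux-gnu --device 0",
    ]
    for problem in find_problems(sample):
        print(problem)
```

An empty list from `find_problems` means every instance has exactly one device and no device is used twice.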
-
Thanks for your reply.
But I don't know how to run two instances of MB CUDA.
How should I write app_info.xml?
-
Any idea? I've got a dual-GPU GF9800 GX2 and I'm wondering if I can make it play nice ...
-
Have you tried it? What problems do you have?
-
I'm trying .. hard ;-) It actually seems to me that I'm plagued with the same problem as described in this post: http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.150.html#msg19175 - also Ubuntu 9.04 etc. I seem to recall similar error messages, but I can't check right now ... SAH server seems to have some functionality temporarily disabled. Anyway, here's the link to host's results: http://setiathome.berkeley.edu/results.php?hostid=4994295.
I was wondering if there's some special magic when machine's got more than one GPU...
-
I assume that you've set everything up correctly? Please check my posts carefully here (http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg19014.html#msg19014) and here (http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg19220.html#msg19220). Run ldd on your CUDA app and post your xorg.conf and xorg.0.log.
-
Here's the state of the matter:
- I've compiled kernel 2.6.30.1, albeit using configuration from Ubuntu 9.04 stock kernel
- I've installed (and made sure that driver gets used) nvidia 185.18.14 driver and cuda toolkit
- I've installed latest BOINC client 6.6.36
I can't get SETI CUDA to work. Here is an example of an errored WU: http://setiathome.berkeley.edu/result.php?resultid=1297997619
Quite the same stuff with a WU from GPUgrid.net ( http://www.gpugrid.net/result.php?resultid=947630 ).
Libraries:
$ ldd projects/setiathome.berkeley.edu/setiathome-CUDA-6.08.x86_64-pc-linux-gnu
linux-vdso.so.1 => (0x00007fffe0193000)
libcufft.so.2 => /usr/local/cuda/lib/libcufft.so.2 (0x00007f4fd1bd7000)
libcudart.so.2 => /usr/local/cuda/lib/libcudart.so.2 (0x00007f4fd1997000)
libcuda.so.1 => /usr/lib/libcuda.so.1 (0x00007f4fd14ca000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f4fd11bd000)
libm.so.6 => /lib/libm.so.6 (0x00007f4fd0f38000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007f4fd0d1c000)
libc.so.6 => /lib/libc.so.6 (0x00007f4fd09aa000)
libdl.so.2 => /lib/libdl.so.2 (0x00007f4fd07a6000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f4fd058e000)
librt.so.1 => /lib/librt.so.1 (0x00007f4fd0386000)
libz.so.1 => /lib/libz.so.1 (0x00007f4fd016e000)
/lib64/ld-linux-x86-64.so.2 (0x00007f4fd1ef2000)
$ ls -l /usr/local/cuda/lib/
total 24536
lrwxrwxrwx 1 root root 17 2009-07-06 14:41 libcublasemu.so -> libcublasemu.so.2
lrwxrwxrwx 1 root root 19 2009-07-06 14:41 libcublasemu.so.2 -> libcublasemu.so.2.2
-rwxr-xr-x 1 root root 4745456 2009-07-06 14:41 libcublasemu.so.2.2
lrwxrwxrwx 1 root root 14 2009-07-06 14:41 libcublas.so -> libcublas.so.2
lrwxrwxrwx 1 root root 16 2009-07-06 14:41 libcublas.so.2 -> libcublas.so.2.2
-rwxr-xr-x 1 root root 18684568 2009-07-06 14:41 libcublas.so.2.2
lrwxrwxrwx 1 root root 14 2009-07-06 14:41 libcudart.so -> libcudart.so.2
lrwxrwxrwx 1 root root 16 2009-07-06 14:41 libcudart.so.2 -> libcudart.so.2.2
-rwxr-xr-x 1 root root 261304 2009-07-06 14:41 libcudart.so.2.2
lrwxrwxrwx 1 root root 16 2009-07-06 14:41 libcufftemu.so -> libcufftemu.so.2
lrwxrwxrwx 1 root root 18 2009-07-06 14:41 libcufftemu.so.2 -> libcufftemu.so.2.2
-rwxr-xr-x 1 root root 272896 2009-07-06 14:41 libcufftemu.so.2.2
lrwxrwxrwx 1 root root 13 2009-07-06 14:41 libcufft.so -> libcufft.so.2
lrwxrwxrwx 1 root root 15 2009-07-06 14:41 libcufft.so.2 -> libcufft.so.2.2
-rwxr-xr-x 1 root root 1153304 2009-07-06 14:41 libcufft.so.2.2
Now, since the GPUgrid app exhibits the same error, it seems that some part of the Ubuntu kernel configuration is responsible ... I'll recompile the kernel when I find some time (which won't happen in the next couple of weeks).
'till then.
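For anyone repeating the ldd check requested above: a library that cannot be resolved shows up in ldd output as `=> not found`. The snippet below is only a small sketch of scanning for that; the function name `missing_libraries` is made up for illustration. In the listing posted here, every library resolves to a path, so it would report nothing.

```python
def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Resolved entries look like "name => /path (address)";
        # unresolved ones look like "name => not found".
        if "=>" in line and "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


if __name__ == "__main__":
    sample = (
        "libcufft.so.2 => /usr/local/cuda/lib/libcufft.so.2 (0x00007f4fd1bd7000)\n"
        "libcuda.so.1 => /usr/lib/libcuda.so.1 (0x00007f4fd14ca000)\n"
    )
    print(missing_libraries(sample))
```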
-
Please follow my post here (http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg19014.html#msg19014) to the letter
-
I followed your post, but it's still only using one of my cards. I have two GTX 260s, NOT in SLI.
I placed a copy of the CUDA app in /bin, all the drivers are installed, and I'm using BOINC 6.6.20 (6.6.36 wouldn't work at all).
The one that is running says: "Running (0.04 CPUs, 2 CUDA)" for the Status.
-
Are you Joseph Monk at SETI?
I just answered an identical question on the main board:
Your app_info.xml file is faulty.
The fragment which reads
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
(probably near the bottom) needs to be like that - unchanged - even when you have two cards. It's the number of CUDA cards needed per task - and none of the applications need more than one per task yet.
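A quick way to confirm the count is to parse app_info.xml and look at every CUDA `<coproc>` entry. This is only a sketch: the function name `cuda_counts` is made up for illustration, and the XML in the example is a stripped-down stand-in for a real app_info.xml, which contains more elements than shown.

```python
import xml.etree.ElementTree as ET


def cuda_counts(app_info_xml):
    """Return the <count> value of every CUDA <coproc> block in the XML text."""
    root = ET.fromstring(app_info_xml)
    counts = []
    for coproc in root.iter("coproc"):
        coproc_type = (coproc.findtext("type") or "").strip()
        if coproc_type == "CUDA":
            # A missing <count> is treated as 0 so that it gets flagged.
            counts.append(float(coproc.findtext("count") or 0))
    return counts


if __name__ == "__main__":
    # Hypothetical, heavily shortened app_info.xml used only for the demo.
    sample = """
    <app_info>
      <app_version>
        <coproc>
          <type>CUDA</type>
          <count>2</count>
        </coproc>
      </app_version>
    </app_info>
    """
    for count in cuda_counts(sample):
        if count != 1:
            print(f"coproc count is {count}; it should be 1 per task")
```

To check a real file, read it with `open(path).read()` and pass the text to `cuda_counts`.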
-
Double thanks. After I asked there I realized I got the app here, so this would be a better place to ask. It's working now... but something is odd: tasks now seem to take A LOT longer, and only one of my GPUs is heating up. Last night it ran with the count set to 2 and completed about 30 tasks in 3-8 minutes each; now it's running 2 at a time but taking hours.
-
Kunin, if you haven't seen it already, check http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg19767.html#msg19767 and subsequent posts in that thread.
-
Thanks, I've been using 6.6.11 for a while. Another user here and I have a long thread going on the SETI forums with our various attempts.
I believe I have located the problem in 6.6.36, but all my attempts to compile my own client have failed (it complains about curl, but curl is installed properly).
If you have any experience there, let me know. I'd be more than happy to run a couple of tests to confirm my suspicions, then get it fixed and submit a patch back.