Seti@Home optimized science apps and information

Optimized Seti@Home apps => Linux => Topic started by: Crunch3r on 18 Jan 2009, 08:03:10 pm

Title: SETI MB CUDA for Linux
Post by: Crunch3r on 18 Jan 2009, 08:03:10 pm
(http://calbe.dw70.de/pics/CUDA-linux.jpg)

Howdy,

as the thread title says... if you're nuts enough, give them a try.

VERY IMPORTANT!!! Note that these applications are meant for ADVANCED USERS.

Take a look at the README.TXT for some instructions.

Remember, this is still a work in progress and can crash!



dedicated to no-skilldude :p

[Mod:] Removed outdated/bad 32 bit build

[Warning note:] The remaining 64 bit build was made more than a year before nVidia released Fermi class GPUs (GTX 4xx, etc. with compute capability 2.0 and above). It will appear to work very quickly with those cards, but most results will be false "-9 result_overflow" cases. So use it only with older GPUs.

[Mod:] Removed outdated 64 bit build - not compatible with SaH v7 tasks
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 18 Jan 2009, 08:21:42 pm
Unfortunately, in the app_info.xml for MB+AP (both 32 and 64 bit), line 25 has a very small error: one "." too many.
Title: Re: SETI MB CUDA for Linux
Post by: Crunch3r on 18 Jan 2009, 08:50:59 pm
Unfortunately on app_info.xml for MB+AP both 32 and 64 bit, in line 25,  there is a very small error, a . too many.

thanks for noticing. It's corrected now ;)
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 18 Jan 2009, 08:53:34 pm
I guess you made your point then  :P (couldn't help myself  :-X)
Title: Re: SETI MB CUDA for Linux
Post by: Crunch3r on 18 Jan 2009, 09:00:32 pm
I guess you made your point then  :P (couldn't help myself  :-X)

yeah, someone had to do it, so i thought it would be ok ... well, i got caught  :P
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 18 Jan 2009, 09:23:36 pm
A small tip for those of you who choose option 2 from readme.txt concerning the CUDA libs: after you edit ld.so.conf, you need to run ldconfig as root to rebuild the library cache.
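
A quick way to confirm that the rebuilt cache actually sees the CUDA libraries is to ask the loader for them. The sketch below is only an illustration: it assumes the app needs libcuda, libcudart and libcufft (the names that show up in the ldd output later in this thread), and it relies on Python's standard `ctypes.util.find_library`, which on Linux consults the ldconfig cache among other sources.

```python
from ctypes.util import find_library

# Library base names as they would be passed to the linker (-lcudart etc.).
# The exact set the app needs is an assumption for this sketch.
CUDA_LIBS = ("cuda", "cudart", "cufft")


def resolve_libs(names=CUDA_LIBS):
    """Map each library name to what the system resolves it to, or None."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in resolve_libs().items():
        status = found or "NOT FOUND (was ldconfig run as root?)"
        print(f"lib{name}: {status}")
```

If a library is reported as not found right after editing ld.so.conf, the cache has most likely not been rebuilt yet.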
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 19 Jan 2009, 05:03:23 am
nice work :)
I already crunched ~20 units on 2 computers with the 64bit app, some are already validated. Run times are 2-3x faster on my G92 based cards than my C2D & C2Q @ 3.4 and 3.2GHz, except for the VLARs...
Due to being Linux only, I was limited to GPUGRID up to yesterday. In only 5 months my GPUs made this by far my strongest project. Now it's time to give some attention to SETI :)

Thanks for the port!
Title: Re: SETI MB CUDA for Linux
Post by: smurf on 21 Jan 2009, 05:53:48 pm
I crunched several units. One was already correctly validated. The others have error messages:
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
 file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
....

I have a GeForce 8800 GT. What's wrong here?  :(

Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 21 Jan 2009, 05:57:41 pm
I crunshed several units. One was already correctly validated. The others have error messages:
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
 file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
....

I have a GeForce 8800 GT. Whats wrong here ?  :(



For the Windows version I would suggest updating the drivers. No idea whether that applies to Linux...
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 21 Jan 2009, 06:09:45 pm
Yes, what is your driver version?

Can you give us a link to that workunit?
Title: Re: SETI MB CUDA for Linux
Post by: smurf on 21 Jan 2009, 06:14:41 pm
I am using the latest 180.06. That's a CUDA 2.1 Beta driver for OpenSUSE 11.0, but I am using it on OpenSUSE 11.1.

Workunit Link is:
http://setiathome.berkeley.edu/result.php?resultid=1128041873
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 21 Jan 2009, 06:42:39 pm
Hmm, you have these errors in all your results except one.

I know this won't help you much but I've done more than a hundred workunits without a single error. I'm using the latest official stable drivers 180.22. Go to http://www.nvidia.com/object/linux_display_ia32_180.22.html to grab them.

Have you overclocked your card? Do you know what temperatures it gets when running cuda?

Edit: Read here http://www.suse.de/~sndirsch/nvidia-installer-HOWTO.html for a how-to. I've also found this http://en.opensuse.org/NVIDIA
Title: Re: SETI MB CUDA for Linux
Post by: ML1 on 22 Jan 2009, 07:05:30 pm
OK... To jump in with both feet and an AMD Athlon64 X2 + nVidia 8600 GT and...

Read all about the fun on s@h: Boinc 6.6.2 just released (CUDA) (http://setiathome.berkeley.edu/forum_thread.php?id=51556)

Briefly, it appears to be working with two nice 19 tasks and additionally a nice 10 s@h CUDA task. Except that the Boinc Manager consistently crashes whilst starting. The boinc client runs on unperturbed...

?

Good effort on the compile. What to check next?

Cheers,
Martin
Title: Re: SETI MB CUDA for Linux
Post by: koubi89 on 24 Jan 2009, 03:45:38 am
Hello everybody, it's OK for me...
2 min 30 s per WU with a GTX 260 (216 SP, 55 nm)  (709/1518/1096)

It works.
Title: Re: SETI MB CUDA for Linux
Post by: elgrande71 on 24 Jan 2009, 05:21:55 am
It works for me with the 64-bit Seti CUDA Linux application and an 8800GTS512 GPU card (Asus EN8800GTS).
It takes 6 minutes to calculate one work unit.
Thank you for this application.  ;)
Title: Re: SETI MB CUDA for Linux
Post by: ML1 on 24 Jan 2009, 10:01:39 am
OK... To jump in with both feet and an AMD Athlon64 X2 + nVidia 8600 GT and...

Read all about the fun on s@h: Boinc 6.6.2 just released (CUDA) (http://setiathome.berkeley.edu/forum_thread.php?id=51556)

Briefly, it appears to be working with two nice 19 tasks and additionally a nice 10 s@h CUDA task. Except that the Boinc Manager consistently crashes whilst starting. The boinc client runs on unperturbed...

OK, first few results listed. (http://setiathome.berkeley.edu/forum_thread.php?id=51556&nowrap=true#857182)

All pending or validated ok. No fails so far.

However, looks like out-of-memory for processing the gaussians.

What to check next?

Cheers,
Martin

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 24 Jan 2009, 10:25:08 am
Hmm, 256MB seems a bit borderline in your case. I've got a GTX 280 and I've never had out-of-memory errors. Do you have any Compiz 3D effects enabled?
Title: Re: SETI MB CUDA for Linux
Post by: ML1 on 24 Jan 2009, 12:17:12 pm
Hmm, 256MB seem a bit borderline in your case. I've got a GTX 280 and I've never had out of memory errors. Do you have any compiz 3d effects enabled?
Nope.

Simply 1280x1024, no effects, 4 desktops, running Mandriva 2009.0 with KDE4.

Any diagnostics that can be done?

Cheers,
Martin
Title: Re: SETI MB CUDA for Linux
Post by: Crunch3r on 24 Jan 2009, 12:52:03 pm
Hmm, 256MB seem a bit borderline in your case. I've got a GTX 280 and I've never had out of memory errors. Do you have any compiz 3d effects enabled?
Nope.

Simply 1280x1024, no effects, 4 desktops, running Mandriva 2009.0 with KDE4.

Any diagnostics that can be done?

Cheers,
Martin


I've seen that once. It happens when boinc suspends a WU to crunch another one in high priority mode. You need to make sure not to leave apps in memory while suspended.

When that happens again, you should check if there's another suspended instance of the cuda app present.
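
As a rough illustration of that check, the sketch below lists processes whose command line mentions the CUDA app. It is Linux-only (it reads /proc), and the search string "setiathome-CUDA" is an assumption based on the binary name used elsewhere in this thread; adjust it to whatever your app is called.

```python
import os


def cuda_app_pids(processes, needle="setiathome-CUDA"):
    """Return PIDs whose command line mentions the CUDA app.

    `processes` is an iterable of (pid, cmdline) pairs.
    """
    return [pid for pid, cmdline in processes if needle in cmdline]


def list_processes(proc_root="/proc"):
    """Yield (pid, cmdline) for every readable process under /proc (Linux only)."""
    if not os.path.isdir(proc_root):
        return
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                raw = fh.read()
        except OSError:
            continue  # process exited meanwhile or is not readable
        yield int(entry), raw.replace(b"\0", b" ").decode(errors="replace").strip()


if __name__ == "__main__":
    pids = cuda_app_pids(list_processes())
    if len(pids) > 1:
        print("More than one CUDA app instance in memory:", pids)
    else:
        print("CUDA app instances:", pids)
```

More than one PID while only one task shows as running would point to a suspended instance still holding GPU memory.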

Title: Re: SETI MB CUDA for Linux
Post by: smurf on 24 Jan 2009, 12:57:28 pm
I'm using the latest official stable drivers 180.22. Go to http://www.nvidia.com/object/linux_display_ia32_180.22.html to grab them.
Have you overclocked your card? Do you know what temperatures it gets when running cuda?

I have now updated to 180.22, but the situation is the same. The card is not overclocked and the temperature is 62 C. There are two types of cases:
- small WUs with a time to completion of ~7 min run fine without any errors, and the granted credit is the same as the claimed credit (14-15)
- big WUs with a time to completion of ~25 min show the GaussFit error message, and the granted credit is smaller than the claimed credit (claimed ~50, granted ~40)

I also tried it with a good old TWM-only desktop, which should not tie up any GPU resources for desktop effects. Same situation.

I also tried seti_home in text-console standalone mode, but then all WUs produce lots of strange errors. Is CUDA only supported from within an X session?

Title: Re: SETI MB CUDA for Linux
Post by: koubi89 on 24 Jan 2009, 10:21:57 pm
sam 24 jan 2009 21:00:06 CET||Internet access OK - project servers may be temporarily down.
sam 24 jan 2009 21:01:08 CET||Project communication failed: attempting access to reference site
sam 24 jan 2009 21:01:08 CET|SETI@home|Temporarily failed upload of 16dc08ab.25701.18068.7.8.36_0_0: HTTP error
sam 24 jan 2009 21:01:08 CET|SETI@home|Backing off 1 min 0 sec on upload of 16dc08ab.25701.18068.7.8.36_0_0
sam 24 jan 2009 21:01:08 CET|SETI@home|Started upload of 16dc08ab.25701.18068.7.8.47_1_0
sam 24 jan 2009 21:01:09 CET||Internet access OK - project servers may be temporarily down.

Any idea?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 24 Jan 2009, 10:26:13 pm
For the last day or so Berkeley's servers have been taking a hit and are a bit slow. There is nothing on your end to worry about. More or less everyone has problems uploading.
Title: Re: SETI MB CUDA for Linux
Post by: arkayn on 24 Jan 2009, 11:57:52 pm
The server is getting blasted, partly because of the new server bug.

(http://fragment1.berkeley.edu/newcricket/mini-graph.cgi?type=png;target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;inst=0;dslist=ifInOctets%2CifOutOctets;range=151200;rand=780)

When that green line goes back down everything will work better.
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 25 Jan 2009, 11:23:09 am
I also tried seti_home in text console standalone mode, but then all WUs make lots of strange errors. Is CUDA only supported from within an X session ?
CUDA (of course) only works when the Nvidia driver is loaded. This is usually done when the X server/subsystem starts. I guess the driver wasn't loaded (or had been unloaded) when you started in, or switched to, standalone mode...
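
One hedged way to check whether the driver is loaded before starting the app from a text console is to look for the files the NVIDIA kernel module normally exposes. The paths below are assumptions for this sketch (they are the usual ones, but they can differ between driver versions and distributions).

```python
import os

# Paths the NVIDIA Linux driver normally exposes once its kernel module is
# loaded. Treat them as assumptions; they may differ on your system.
DRIVER_HINTS = ("/proc/driver/nvidia/version", "/dev/nvidiactl", "/dev/nvidia0")


def driver_hints(paths=DRIVER_HINTS, exists=os.path.exists):
    """Report, per path, whether it is present on this machine."""
    return {path: exists(path) for path in paths}


if __name__ == "__main__":
    hints = driver_hints()
    for path, present in hints.items():
        print(f"{path}: {'present' if present else 'missing'}")
    if not any(hints.values()):
        print("The NVIDIA driver does not appear to be loaded.")
```

If everything is reported missing outside an X session, that would fit the explanation above.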
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 27 Jan 2009, 10:52:04 pm
Hi everybody,

I'm not able to convince BOINC client 6.4.5 that my graphics card (8600 GT) is CUDA capable (it reports "No CUDA devices found").
I have got: openSUSE 11.1 64-bit, latest nvidia driver 180.22, libraries are copied into /usr/lib64:

ldd setiathome-CUDA-6.08.x86_64-pc-linux-gnu
        linux-vdso.so.1 =>  (0x00007fff337ff000)
        libcufft.so.2 => /usr/lib64/libcufft.so.2 (0x00007f602b259000)
        libcudart.so.2 => /usr/lib64/libcudart.so.2 (0x00007f602b01b000)
        libcuda.so.1 => /usr/lib64/libcuda.so.1 (0x00007f602ab7b000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f602a86f000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f602a619000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f602a3fd000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f602a0a4000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f6029ea0000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f6029c88000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f6029a7f000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f6029869000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f602b573000)
ldd boinc
        linux-vdso.so.1 =>  (0x00007fffefdff000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x00007fc3e7857000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fc3e7653000)
        libz.so.1 => /lib64/libz.so.1 (0x00007fc3e743d000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007fc3e7131000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fc3e6f15000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fc3e6cbf000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fc3e6966000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fc3e7a6f000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fc3e674e000)



Do you have any idea what else I am missing?

Thanks a lot for the support,
H99
Title: Re: SETI MB CUDA for Linux
Post by: Crunch3r on 28 Jan 2009, 04:23:20 am
I recommend reading this post at the S@H MB http://setiathome.berkeley.edu/forum_thread.php?id=51556&nowrap=true#856583

Seems to me that you ran into the same issues.

HTH

Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 28 Jan 2009, 04:54:27 am
Yeah, I have tried this before, but without success  :(
I will try to reinstall the nvidia driver again if it helps...

[edit] I have reinstalled the driver and CUDA is detected now  :) Previous driver was installed from nvidia repository for SUSE, maybe this was the problem. [/edit]

H99
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 28 Jan 2009, 06:01:23 am
I have another question:
CPU utilization is still almost 100% for SETI, and the temperature of the graphics card stays low (50 C).

Task Status shows " Running (0.04% CPUs, 1 CUDA)", progress proceeds very slowly.

Is it possible that the CPU (Athlon X2 3800+) is too weak to feed the graphics card?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 28 Jan 2009, 07:45:39 am
Do you have any MB work to run? Maybe you're running only AP?
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 28 Jan 2009, 07:57:50 am
Yes, app is setiathome_enhanced 6.08 (cuda).
I have set AP to NO in the seti prefs and using app_info.xml without AP.
Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 28 Jan 2009, 11:42:20 am
I have another question:
CPU utilization is still almost 100%  for SETI and temp. on the GK stays low (50 C).

Task Status shows " Running (0.04% CPUs, 1 CUDA)", progress proceeds very slowly.

Is it possible that CPU is too weak to feed Graphic Card (Athlon X2 3800+) ?

It sounds as if the app is running in CPU fallback mode if it's at 100%.
That normally happens when the GPU doesn't meet the memory requirements. When the first task completes, check your results.
I've been testing 6.08 (on Windows) on Seti Beta with an 8400M GS; since it only has 128 MB,
it goes into CPU fallback mode and takes 4 to 7 hours per WU, a lot slower than stock 6.03, let alone AK_V8.
You could try different resolutions/colour depths to see if it makes a difference.

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 28 Jan 2009, 12:22:52 pm
I have another question:
CPU utilization is still almost 100%  for SETI and temp. on the GK stays low (50 C).

Task Status shows " Running (0.04% CPUs, 1 CUDA)", progress proceeds very slowly.

Is it possible that CPU is too weak to feed Graphic Card (Athlon X2 3800+) ?

See ML1's post http://setiathome.berkeley.edu/forum_thread.php?id=51556&nowrap=true#858930

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 28 Jan 2009, 07:02:15 pm
The CUDA Linux app on my Q6600 with 9800GT (512MB) is also consuming 100% (so one full core) CPU time :-/
It is not running in fallback mode or anything like that; the times are 2-3 times faster than with the V8 app, and the GPU temperature is the same as when running GPUGRID.

http://setiathome.berkeley.edu/results.php?hostid=4425787&offset=200
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 28 Jan 2009, 07:09:03 pm
The CUDA Linux app on my Q6600 with 9800GT (512MB) is also consuming 100% (so one full core) CPU time :-/

Yes, for performance reasons (both cuda app itself and general desktop responsiveness), Crunch3r made it that way .
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 28 Jan 2009, 08:31:27 pm
Thanks, if that was intended I didn't say anything ;-)
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 28 Jan 2009, 08:39:16 pm
The CUDA Linux app on my Q6600 with 9800GT (512MB) is also consuming 100% (so one full core) CPU time :-/

Yes, for performance reasons (both cuda app itself and general desktop responsiveness), Crunch3r made it that way .
I just wonder how one would force CUDA MB to use 100% of a CPU??  ::)  ::)  ::)
Really, it's more likely CPU fallback mode...
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 28 Jan 2009, 09:23:05 pm
The CUDA Linux app on my Q6600 with 9800GT (512MB) is also consuming 100% (so one full core) CPU time :-/

Yes, for performance reasons (both cuda app itself and general desktop responsiveness), Crunch3r made it that way .
Just wonder how to force CUDA MB to use 100% of CPU ??  ::)  ::)  ::)
REally more likely it's CPU fallback mode...

Raistmer look at the pre-release section, thread about linux cuda, second page, last post (from Crunch3r) (don't think I should post a direct link here ;) )
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 28 Jan 2009, 09:26:45 pm
If it's fallback mode, then the CUDA app has CPU code that is 2-3 times faster than AK V8...  ;D
Check my results, WUs with angle range of ~0.43 take just ~19 minutes, those took close to 1h before on my machine...

VLARs I'm killing with a cron script  ::)
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 29 Jan 2009, 03:47:11 am
Ok, I see.
Check my results, WUs with angle range of ~0.43 take just ~19 minutes, those took close to 1h before on my machine...

In the task results I can see only CPU time, not wall-clock time. 19 min is a large CPU time indeed for CUDA MB. It would be interesting to know the wall-clock times for this build. Does the additional CPU usage lead to a significant reduction in wall-clock time?
Could anyone run the PG*.wu test set with this build?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 29 Jan 2009, 08:00:44 am
Raistmer, this is what Crunch3r said:
P.S. yes, the app needs a full core, cuz i've removed that part of the code that messes around with that on windows.
It also slows down the gpu if you'd run a 4+1 config.

Since it uses a full core, cpu time is actually wall time. It's not like in windows where cpu time << wall time.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 29 Jan 2009, 08:27:27 am
Raistmer, this is what Crunch3r said:
P.S. yes, the app needs a full core, cuz i've removed that part of the code that messes around with that on windows.
It also slows down the gpu if you'd run a 4+1 config.

Since it uses a full core, cpu time is actually wall time. It's not like in windows where cpu time << wall time.
OK, replace wall time with CPU time in my previous post. Anyway, I am interested in those numbers.
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 29 Jan 2009, 01:24:59 pm
I attached the CUDA start and stop messages from the log, run times are ~ the same I think...
Where to get this PG*.wu test set?

[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 29 Jan 2009, 03:30:55 pm
I attached the CUDA start and stop messages from the log, run times are ~ the same I think...
Where to get this PG*.wu test set?
I think it's in the download area of this site, in the benchmarking section.
Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 29 Jan 2009, 04:02:38 pm
I had a look at Hefto99's host (http://setiathome.berkeley.edu/results.php?hostid=4774614) today. He has now completed two CUDA WUs with Crunch3r's 608 r06 CUDA app;
both completed on CPU fallback after 'CUDA runtime ERROR in device memory allocation' in about 15K secs (4hrs),
no credit granted yet.

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 29 Jan 2009, 06:12:58 pm
I had a look at Hefto99's host (http://setiathome.berkeley.edu/results.php?hostid=4774614) today, he's now completed two Cuda WU's with Chrunch3r's 608 r06 Cuda app,
both have completed on CPU fallback after 'CUDA runtime ERROR in device memory allocation' in about 15K secs (4hrs),
no credit granted yet.

Claggy
Hm.. As I expected... I am still interested in times for (clean) CUDA work with 100% CPU allocation. I really don't understand why a CUDA app would need 100% CPU usage. That's why I want to look at elapsed times for tasks I know, not some "wild" task from the server...
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 29 Jan 2009, 08:54:24 pm
Hmm... I have tried different environments (GNOME, KDE 3.5, Xfce...), but no improvement...
It seems I have to wait for a better graphics card..  :(

Thanks to everybody for support!
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 29 Jan 2009, 09:14:07 pm
I'll check the download sector and come back to the thread when I have some benchmark results ;-)
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 30 Jan 2009, 07:21:02 am
OK, I have finally got it running under  the XFCE display manager  :)
A 14.21-credit WU finished in under 15 minutes, although it is still falling back to the CPU:
  CUDA runtime ERROR in device memory allocation (Step 1 of 3). Falling back to HOST CPU processing...


Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 30 Jan 2009, 07:31:14 am
OK, I have finally got it running under  the XFCE display manager  :)
14.21 credit WU has been finished in under 15 minutes, although it is still falling fallback:
  CUDA runtime ERROR in device memory allocation (Step 1 of 3). Falling back to HOST CPU processing...

It means you are actually using the 6.03 stock app's CPU code to process tasks. If you can't avoid the fallback to CPU, it's much better to use one of the AK v8 optimized apps for CPU processing than 6.08 in CPU mode.
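
To spot such fallbacks without reading every result by hand, one could scan the app's stderr output for the fallback message. This is only a sketch: the marker string is taken from the message quoted above, and the file name stderr.txt is an assumption about where your run writes its output.

```python
import os

# Marker copied from the message quoted in this thread.
FALLBACK_MARKER = "Falling back to HOST CPU processing"


def fell_back_to_cpu(stderr_text):
    """Return True if the given stderr text contains the fallback message."""
    return any(FALLBACK_MARKER in line for line in stderr_text.splitlines())


if __name__ == "__main__":
    path = "stderr.txt"  # assumed location of the app's stderr output
    if os.path.exists(path):
        with open(path, errors="replace") as fh:
            verdict = fell_back_to_cpu(fh.read())
        print("CPU fallback detected" if verdict else "no CPU fallback message found")
    else:
        print(f"{path} not found")
```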
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 30 Jan 2009, 07:18:43 pm
Other tasks look much better, no more runtime ERRORS  :)
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 31 Jan 2009, 02:07:50 am
Other tasks look much better, no more runtime ERRORS  :)
And what is the CPU usage with those tasks?
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 31 Jan 2009, 03:57:16 am
CPU usage is still high - 95-100 %
GPU temp is 15 C above normal.

Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 31 Jan 2009, 05:19:07 am
With the increased GPU temperature, it seems the GPU is working now.
It will be interesting to see what the run times (wall-clock ones) will be.
Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 31 Jan 2009, 05:41:37 am
I had a look at Hefto99's tasks (http://setiathome.berkeley.edu/results.php?hostid=4774614) earlier; they are all using the GPU now.
Only a couple fell back to the CPU, and those either finished shortly after or restarted on the GPU. Times are now in the thousands of seconds
instead of tens of thousands, about the speed of my T8100 C2D using AK_V8.

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 31 Jan 2009, 08:05:55 am
CPU speed?
Well, CPU time for Win CUDA MB is ~3-4 minutes.
And I am more interested in wall-clock time....
Could someone with Linux CUDA run some of the test WUs, like PGxxx.wu or testwu_x.wu?
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 31 Jan 2009, 08:58:56 pm
My CPU is running at 2 GHz (Athlon X2 3800+), the graphics card is an 8600 GT at default clocks, openSUSE 11.1 64-bit. Here are some results:

http://setiathome.berkeley.edu/result.php?resultid=1142061689
CPU time:  883.97 sec.
Wall clock: 1087 sec.
Claimed credit: 14.20

http://setiathome.berkeley.edu/result.php?resultid=1141789428
CPU time: 2646 sec.
Wall clock: 2928 sec.
Claimed credit: 57.28

http://setiathome.berkeley.edu/result.php?resultid=1141486869
CPU time: 4652 sec.
Wall clock: 4756 sec.
Claimed credit: 81.32

CPU utilization is almost 100% for SETI
I will try to run some test units during next week.
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 01 Feb 2009, 04:54:35 am
downloaded http://lunatics.kwsn.net/index.php?module=Downloads;sa=dlview;id=18, results are like this:

root@cruncher:~# for i in 1 2 3 4 5 6 7 ; do cp testWU-$i work_unit.sah; rm stderr.txt; echo "starting testWU-$i"; time ./setiathome-CUDA-6.08.x86_64-pc-linux-gnu ; cat stderr.txt | grep "angle range"; echo;done
starting testWU-1

real    1m5.579s
user    1m3.372s
sys     0m0.216s
WU true angle range is :  0.604884

starting testWU-2

real    1m16.346s
user    1m14.065s
sys     0m0.276s
WU true angle range is :  0.443732

starting testWU-3

real    1m20.750s
user    1m18.493s
sys     0m0.268s
WU true angle range is :  0.425877

starting testWU-4

real    0m44.151s
user    0m41.959s
sys     0m0.184s
WU true angle range is :  1.279649

starting testWU-5

real    1m16.539s
user    1m14.309s
sys     0m0.232s
WU true angle range is :  0.439957

starting testWU-6

real    3m40.854s
user    3m38.670s
sys     0m0.200s
WU true angle range is :  0.033858

starting testWU-7

real    0m49.831s
user    0m47.635s
sys     0m0.192s
WU true angle range is :  0.775000

App was running with priority 0 and consuming 100% of one core, no other BOINC processes running. The system is a dedicated cruncher, C2Q6600 with GF9800GT.
If you need any other output from stderr.txt or result.sah, let me know :)
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 01 Feb 2009, 05:33:17 am
Are real, user and sys CPU times? Try the standardized KWSN testbench; it should show elapsed times too.
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 01 Feb 2009, 06:58:44 am
real is wall-clock time, user is CPU time spent in user mode, and sys is CPU time spent in the kernel on behalf of the process.
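
The same three numbers can be collected from a script. The sketch below is an illustration using only the Python standard library; it runs a command and reports wall-clock, user and system time in the same spirit as the shell's `time`. The command timed in the example is a placeholder, not the CUDA app.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, sys) in seconds for the child process."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=False)
    real = time.perf_counter() - start
    after = os.times()
    # children_* only advance once the child has been waited for,
    # which subprocess.run does before returning.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


if __name__ == "__main__":
    # Placeholder workload; substitute the science app's command line here.
    real, user, system = time_command([sys.executable, "-c", "sum(range(10**6))"])
    print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```

When user + sys is close to real, the process kept one core busy for the whole run, which is what the numbers above show for the Linux CUDA app.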

I checked the KWSN testbench. It could use a little documentation, but I think I figured it out; it's running now.
default-512 vs. AK_V8_linux64_ssse3 vs. setiathome-CUDA-6.08.x86_64-pc-linux-gnu

Will update this post when results are available...

edit:

I don't know what kind of work units these are, but they don't reflect what I've seen with current MB units.
The SSSE3 app is sometimes faster than my GPU in this bench. In real-world performance on the project, the GPU is at least twice as fast as the Q6600 @ 3.2 GHz, except for VLARs, of course... Run times are attached in the text file...

my_science_app1 = AK_V8_linux64_ssse3
my_science_app2 = setiathome-CUDA-6.08.x86_64-pc-linux-gnu



[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 01 Feb 2009, 11:01:02 am
Thanks.
I will run the test with these WUs on my host under Windows and post the results for comparison.
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 02 Feb 2009, 08:47:06 am
Here are my results....   my_science_app1 is setiathome-CUDA-6.08.x86_64-pc-linux-gnu

AMD Athlon X2 3800+ (2 GHz), GeForce 8600 GT, openSUSE 11.1 64-bit



[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 02 Feb 2009, 08:53:26 am
And your system is? (CPU/GPU )
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 02 Feb 2009, 10:12:32 am
The same tasks with Windows app on Q9450/9600GSO:

AK_v8b_win_SSSE3x.exe -verb -st / testWU-1.wu :
Started at  : 17:00:17.618
Ended at    : 17:02:04.759
    107.125 secs Elapsed
    105.113 secs CPU time
MB_6.08_mod_CPU_team_CUDA.exe -verb -st / testWU-1.wu :
Started at  : 17:02:04.790
Ended at    : 17:03:07.923
     63.118 secs Elapsed
     18.159 secs CPU time

AK_v8b_win_SSSE3x.exe -verb -st / testWU-2.wu :
Started at  : 17:03:08.017
Ended at    : 17:05:13.909
    125.876 secs Elapsed
    123.849 secs CPU time
MB_6.08_mod_CPU_team_CUDA.exe -verb -st / testWU-2.wu :
Started at  : 17:05:13.940
Ended at    : 17:06:26.932
     72.977 secs Elapsed
     18.798 secs CPU time

AK_v8b_win_SSSE3x.exe -verb -st / testWU-3.wu :
Started at  : 17:06:27.026
Ended at    : 17:08:40.515
    133.474 secs Elapsed
    131.462 secs CPU time
MB_6.08_mod_CPU_team_CUDA.exe -verb -st / testWU-3.wu :
Started at  : 17:08:40.546
Ended at    : 17:09:59.591
     79.030 secs Elapsed
     20.015 secs CPU time
Speedup     : 84.78%
Ratio       : 6.57 x

AK_v8b_win_SSSE3x.exe -verb -st / testWU-4.wu :
Started at  : 17:09:59.685
Ended at    : 17:10:38.264
     38.563 secs Elapsed
     36.535 secs CPU time
MB_6.08_mod_CPU_team_CUDA.exe -verb -st / testWU-4.wu :
Started at  : 17:10:38.295
Ended at    : 17:11:14.175
     35.864 secs Elapsed
     15.865 secs CPU time

AK_v8b_win_SSSE3x.exe -verb -st / testWU-5.wu :
Started at  : 17:11:14.269
Ended at    : 17:13:21.112
    126.828 secs Elapsed
    124.816 secs CPU time
MB_6.08_mod_CPU_team_CUDA.exe -verb -st / testWU-5.wu :
Started at  : 17:13:21.143
Ended at    : 17:14:34.510
     73.351 secs Elapsed
     18.439 secs CPU time

AK_v8b_win_SSSE3x.exe -verb -st / testWU-7.wu :
Started at  : 17:16:58.498
Ended at    : 17:17:58.387
     59.857 secs Elapsed
     57.845 secs CPU time
MB_6.08_mod_CPU_team_CUDA.exe -verb -st / testWU-7.wu :
Started at  : 17:17:58.418
Ended at    : 17:18:39.898
     41.465 secs Elapsed
     16.021 secs CPU time
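
For reference, the "Speedup" and "Ratio" lines printed for testWU-3 can be reproduced from the two CPU times in that block. The formulas below are inferred from the numbers, not taken from the testbench source, so treat them as an assumption; they do match the reported 84.78% and 6.57 x.

```python
def ratio(reference_s, candidate_s):
    """How many times smaller the candidate's time is than the reference's."""
    return reference_s / candidate_s


def speedup_percent(reference_s, candidate_s):
    """Percentage of the reference time saved by the candidate."""
    return (1.0 - candidate_s / reference_s) * 100.0


if __name__ == "__main__":
    # testWU-3 figures from the listing above (AK_v8b vs. the CUDA build).
    cpu_ref, cpu_cuda = 131.462, 20.015
    wall_ref, wall_cuda = 133.474, 79.030
    print(f"CPU time: ratio {ratio(cpu_ref, cpu_cuda):.2f} x, "
          f"speedup {speedup_percent(cpu_ref, cpu_cuda):.2f}%")
    print(f"Elapsed:  ratio {ratio(wall_ref, wall_cuda):.2f} x, "
          f"speedup {speedup_percent(wall_ref, wall_cuda):.2f}%")
```

Note that the reported figures are based on CPU time; the same formulas applied to the elapsed times give a much smaller advantage, which is the number that matters for throughput.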
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 02 Feb 2009, 07:26:16 pm
AMD Athlon X2 3800+ (2 GHz), GeForce 8600 GT, openSUSE 11.1 64-bit
Title: Re: SETI MB CUDA for Linux
Post by: Hefto99 on 04 Feb 2009, 05:14:14 am
Hi all,

is it possible to autokill VLAR WUs with this Linux application? My system is very laggy with these long Work Units...  :(

Thanks for info,
H99
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Feb 2009, 06:49:37 am
Autokill, unfortunately, no. You can do it with a script or manually (search the workunits for "<true_angle_range>0.01" or "<true_angle_range>0.00" and delete or abort the matching tasks).
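
A minimal sketch of the "script" option follows. It only lists candidate workunit files and does not delete or abort anything. The 0.02 cutoff simply mirrors the "0.01 or 0.00" search above, and the assumption that the tag sits within the first 64 KB of each file is mine; pass the files to check on the command line.

```python
import re
import sys

# Matches the tag named in the post; only the opening tag is required.
ANGLE_RE = re.compile(r"<true_angle_range>\s*([0-9]+(?:\.[0-9]+)?)")


def true_angle_range(wu_text):
    """Return the workunit's true angle range as a float, or None if absent."""
    match = ANGLE_RE.search(wu_text)
    return float(match.group(1)) if match else None


def is_vlar(wu_text, threshold=0.02):
    """True if the angle range is below the (assumed) VLAR cutoff."""
    angle = true_angle_range(wu_text)
    return angle is not None and angle < threshold


if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            head = fh.read(65536)  # assumption: the header is near the top
        if is_vlar(head):
            print(f"VLAR candidate: {path} (AR {true_angle_range(head)})")
```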
Title: Re: SETI MB CUDA for Linux
Post by: rja on 05 Feb 2009, 07:37:21 pm
I realize that the app_info.xml files are hints, but is there a reason for the MB file to differ from the MB+AP file?

Should the version_num, avg_ncpus, and max_ncpus match between the MB+AP app_info.xml and MB app_info.xml files in setiathome-CUDA-6.08.i686.tar.bz2?  Same for setiathome-CUDA-6.08.x86_64.tar.bz2?

The MB+AP app_info.xml has version_num of 607 while the MB app_info.xml has version_num of 608.

The version 603 MB+AP avg_ncpus, and max_ncpus are set to 1.0000 while the MB avg_ncpus, and max_ncpus are set to 0.040000.
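
To compare the two files field by field, the values can be pulled out with a few lines of script. This is a sketch under assumptions: it expects each app_info.xml to contain `<app_version>` blocks holding `<app_name>`, `<version_num>`, `<avg_ncpus>` and `<max_ncpus>`, and the embedded sample is a trimmed, hypothetical fragment using the MB values mentioned above, not the shipped file.

```python
import xml.etree.ElementTree as ET

FIELDS = ("app_name", "version_num", "avg_ncpus", "max_ncpus")

# Hypothetical, trimmed fragment for illustration only.
SAMPLE_MB = """<app_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>608</version_num>
    <avg_ncpus>0.040000</avg_ncpus>
    <max_ncpus>0.040000</max_ncpus>
  </app_version>
</app_info>"""


def app_versions(xml_text):
    """Return one dict per <app_version> with the fields of interest."""
    root = ET.fromstring(xml_text)
    return [
        {field: (version.findtext(field) or "").strip() for field in FIELDS}
        for version in root.iter("app_version")
    ]


if __name__ == "__main__":
    for entry in app_versions(SAMPLE_MB):
        print(entry)
```

Running it over both the MB and the MB+AP file and comparing the printed entries would make the differences described above easy to see.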
Title: Re: SETI MB CUDA for Linux
Post by: ML1 on 06 Feb 2009, 08:28:49 pm
My CPU is running on 2GHz (Athlon X2 3800+), GK is 8600 GT on default clocks, openSUSE 11.1 64-bit, here are some results:

[...]

CPU utilization is almost 100% for SETI
I see pretty much the same on my system for an AthlonXP 6400+ and 8600GT GPU (256 MB VRAM).

Is the CPU doing a busy-wait poll of the GPU? Or why the high CPU utilisation?

As an experiment I'm keeping the CPU priority down to nice 19 (instead of the default 10) to see if there is any slowdown for the CUDA processing. However, that only reduces the CPU load to between 75% and 90% for a core.

Is there any profiling that we can run to see what it is doing with the CPU time?

Happy crunchin',
Martin
Title: Re: SETI MB CUDA for Linux
Post by: ML1 on 07 Feb 2009, 09:01:08 am
I see pretty much the same [100% CPU] on my system for an AthlonXP 6400+ and 8600GT GPU (256 MB VRAM).

Is the CPU doing a busy-wait poll of the GPU? Or why the high CPU utilisation?

As an experiment I'm keeping the CPU priority down to nice 19 (instead of the default 10) to see if there is any slowdown for the CUDA processing. However, that only reduces the CPU load to between 75% and 90% for a core...
And for a brief comparison of a very few examples (sorting by AR):

04-Feb-2009 20:28:20 04-Feb-2009 20:43:07 2.7155224489909 19dc08ac.31914.13160.15.8.44
05-Feb-2009 17:48:16 05-Feb-2009 18:02:44 2.7155504718476 17dc08ae.20201.15207.8.8.172
07-Feb-2009 09:34:39 07-Feb-2009 09:53:13 2.7155603822925 17dc08ae.1228.13162.7.8.1
07-Feb-2009 11:07:38 07-Feb-2009 11:26:01 2.7155603822925 17dc08ae.1228.13162.7.8.143
07-Feb-2009 11:26:01 07-Feb-2009 11:43:35 2.7155603822925 17dc08ae.1228.13162.7.8.149
07-Feb-2009 10:11:51 07-Feb-2009 10:30:33 2.7155603822925 17dc08ae.1228.13162.7.8.7
03-Feb-2009 18:19:10 03-Feb-2009 18:35:18 2.7155918111126 20dc08ae.27884.20931.9.8.10
03-Feb-2009 19:23:17 03-Feb-2009 19:38:40 2.7155918111126 20dc08ae.27884.20931.9.8.11

The runs from 07-Feb onwards are at nice 19, using between 75% and 90% CPU on one core. So roughly, 15 min WUs go up to about 19 min.

05-Feb-2009 23:24:24 05-Feb-2009 23:57:35 0.70474762854275 16dc08ad.2380.10706.11.8.3
05-Feb-2009 22:51:12 05-Feb-2009 23:24:24 0.7330587117685 16dc08ad.2380.11115.11.8.6
07-Feb-2009 04:03:45 07-Feb-2009 04:41:28 0.81850781128826 21dc08ab.5849.11933.10.8.173
06-Feb-2009 10:13:58 06-Feb-2009 10:43:48 0.82366885623122 21dc08ab.11148.20931.4.8.88
07-Feb-2009 02:27:27 07-Feb-2009 03:04:36 0.82505766342312 21dc08ab.5849.20931.10.8.31
05-Feb-2009 22:21:52 05-Feb-2009 22:51:12 0.84615758941882 16dc08ac.28940.4571.10.8.84
06-Feb-2009 05:54:03 06-Feb-2009 06:23:36 0.86971373346817 20dc08ab.30334.20931.11.8.13

And 33mins is pushed up to 38mins, & 30mins -> 37mins...

06-Feb-2009 19:11:13 06-Feb-2009 19:56:01 0.43305205747667 16no08ag.22317.25021.15.8.184
07-Feb-2009 03:04:36 07-Feb-2009 04:03:45 0.43362818142428 01dc08aa.19405.7025.3.8.251
07-Feb-2009 01:31:22 07-Feb-2009 02:27:27 0.43362879013513 01dc08aa.19405.4571.3.8.98
06-Feb-2009 18:24:59 06-Feb-2009 19:11:13 0.43418781810648 16dc08ac.14209.240437.9.8.152
06-Feb-2009 17:39:39 06-Feb-2009 18:24:59 0.43418793067813 16dc08ac.14209.238801.9.8.234
06-Feb-2009 22:16:11 06-Feb-2009 23:01:44 0.43422120716759 16dc08ac.28940.243300.10.8.199
07-Feb-2009 04:41:29 07-Feb-2009 05:40:32 0.43430233574049 16dc08ac.26815.244118.11.8.113

45mins -> 56-59mins...


Sooo... The slowdown looks to be roughly proportionate to the lower proportion of CPU used...
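
For anyone who wants to redo the arithmetic on the listings above, here is a minimal Python sketch. It assumes each line has the layout shown (start date, start time, end date, end time, angle range, WU name) and that the month abbreviations parse in an English/C locale. The function name `elapsed_minutes` is only illustrative.

```python
from datetime import datetime

FMT = "%d-%b-%Y %H:%M:%S"  # e.g. "04-Feb-2009 20:28:20" (English month names assumed)

def elapsed_minutes(line):
    """Return (angle_range, wall-clock minutes) for one line of the listing."""
    d1, t1, d2, t2, ar, _name = line.split()
    start = datetime.strptime(f"{d1} {t1}", FMT)
    end = datetime.strptime(f"{d2} {t2}", FMT)
    return float(ar), (end - start).total_seconds() / 60.0

# Two lines copied from the AR ~2.7155 group: one at the default nice level, one at nice 19.
default_nice = "04-Feb-2009 20:28:20 04-Feb-2009 20:43:07 2.7155224489909 19dc08ac.31914.13160.15.8.44"
nice_19 = "07-Feb-2009 09:34:39 07-Feb-2009 09:53:13 2.7155603822925 17dc08ae.1228.13162.7.8.1"

_, before = elapsed_minutes(default_nice)
_, after = elapsed_minutes(nice_19)
slowdown = after / before
print(f"{before:.1f} min -> {after:.1f} min, ratio {slowdown:.2f}")
```

For this pair the ratio comes out at about 1.26, which is in the same ballpark as running at 75% to 90% of a core instead of 100%.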

Are the nVidia Windows drivers so very much more efficient than their compiles for Linux?

Or are there very many frequent busy-waits for many small GPU steps?

Happy crunchin',
Martin
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 07 Feb 2009, 09:54:53 am
For my GPU -poll mod is useless:

http://setiathome.berkeley.edu/forum_thread.php?id=51712&nowrap=true#863155
Title: Re: SETI MB CUDA for Linux
Post by: rja on 07 Feb 2009, 11:28:06 pm
I tried the MB-only version of setiathome-CUDA-6.08.i686.tar.bz2 and got the GaussFit_kernel errors that smurf mentioned earlier in « Reply #7 on: 21 Jan 2009, 05:53:48 pm » after about 14 minutes of processing.

It has an Nvidia 8800 GTS 640MB.  The GPU core temp is usually 68C, but went to 80C when processing a CUDA workunit.  There was a single cpu process of setiathome running at 100%.

Would it help to use a newer version of boinc than 6.4.5?

Fedora 9, nvidia-graphics180.22-kmdl-2.6.27.12-78.2.8.fc9-180.22-106.fc9.i686 from atrpms, boinc-client-6.4.5-2.20081217svn.fc9.i386 from Fedora, with the latest Nvidia cudatoolkit_2.1_linux32_fedora9.run, for this computer - http://setiathome.berkeley.edu/show_host_detail.php?hostid=4131779

Here is an example taskid http://setiathome.berkeley.edu/result.php?resultid=1154715237

I tried removing the cuda 2.1 libs and using the cudalibs (2.0?) that were in setiathome-CUDA-6.08.i686.tar.bz2, but got the same GaussFit_kernel errors.

This was in a different workunit's slots/0/stderr.txt before it was uploaded:

SETI@home MB CUDA 608 Linux 32bit SM 1.0 - r06 by Crunch3r :p

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 8800 GTS
           totalGlobalMem = 670760960
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1188000
           totalConstMem = 65536
           major = 1
           minor = 0
           textureAlignment = 256
           deviceOverlap = 0
           multiProcessorCount = 12
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce 8800 GTS is okay
SETI@home using CUDA accelerated device GeForce 8800 GTS
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
libboinc: BOINC 6.5.0

Work Unit Info:
...............
WU true angle range is :  0.447901
Optimal function choices:
-----------------------------------------------------
name               
-----------------------------------------------------
              v_BaseLineSmooth (no other)
  v_vGetPowerSpectrumUnrolled2 0.00010 0.00000
             sse1_ChirpData_ak 0.00814 0.00098
                 v_vTranspose4 0.00381 0.00000
                BH SSE folding 0.00144 0.00000
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
and lots more of these Cuda error 'GaussFit_kernel' lines
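
"invalid configuration argument" is the error the CUDA runtime returns when a kernel launch asks for grid or block dimensions, or an amount of shared memory, that the device cannot provide. The actual launch parameters used at line 506 of cudaAcc_gaussfit.cu are not shown anywhere in this thread, so the following Python sketch only illustrates the kind of check involved. The per-block limits are the ones printed in the stderr above. The per-dimension limits are the documented ones for compute capability 1.x and are an assumption as far as this particular build is concerned. The example launch shapes are hypothetical.

```python
# Limits taken from the device listing in the stderr above (GeForce 8800 GTS).
MAX_THREADS_PER_BLOCK = 512
SHARED_MEM_PER_BLOCK = 16384

# Per-dimension limits for compute capability 1.x devices (assumed, not from the log).
MAX_BLOCK_DIM = (512, 512, 64)
MAX_GRID_DIM = (65535, 65535, 1)

def launch_problems(grid, block, shared_bytes=0):
    """Return a list of reasons why a (grid, block) launch shape would be rejected."""
    problems = []
    for axis, (g, limit) in enumerate(zip(grid, MAX_GRID_DIM)):
        if not 1 <= g <= limit:
            problems.append(f"grid dim {axis} = {g} outside 1..{limit}")
    for axis, (b, limit) in enumerate(zip(block, MAX_BLOCK_DIM)):
        if not 1 <= b <= limit:
            problems.append(f"block dim {axis} = {b} outside 1..{limit}")
    threads = block[0] * block[1] * block[2]
    if threads > MAX_THREADS_PER_BLOCK:
        problems.append(f"{threads} threads per block exceeds {MAX_THREADS_PER_BLOCK}")
    if shared_bytes > SHARED_MEM_PER_BLOCK:
        problems.append(f"{shared_bytes} bytes of shared memory exceeds {SHARED_MEM_PER_BLOCK}")
    return problems

# Hypothetical launch shapes, purely for illustration.
print(launch_problems(grid=(64, 1, 1), block=(256, 1, 1)))   # fits the limits
print(launch_problems(grid=(64, 1, 1), block=(32, 32, 1)))   # 1024 threads per block
print(launch_problems(grid=(0, 1, 1), block=(64, 1, 1)))     # zero-sized grid
```

A grid dimension computed from the workunit data that ends up as zero or too large would be one way for only some WUs to trigger the error, but that is a guess without the source.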
Title: Re: SETI MB CUDA for Linux
Post by: zjones on 08 Feb 2009, 03:12:41 pm
I have been trying to get the SETI MB CUDA client running, and every work unit so far hits a computation error as soon as it tries to crunch.

The error log looks like this:

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
process exited with code 127 (0x7f, -129)
</message>
<stderr_txt>
setiathome-CUDA-6.08.x86_64-pc-linux-gnu: error while loading shared libraries: libcufft.so.2: invalid ELF header

</stderr_txt>
]]>


The machine has an AMD Opteron 248 with 2 Quadro FX 5600 cards. These are equivalent to GeForce 8800 GTXs and use the G80 chipset.  I am using CentOS 5.2 (x86_64) with Linux kernel 2.6.18-8.el5.  I am using NVIDIA drivers x86_64-180.22 and CUDA toolkit 2.1 64-bit.  I have tried BOINC clients 6.4.5 and 6.6.2.  I have tried using the CUDA toolkit libs and the ones provided in Crunch3r's package.

Have any of you run into this problem and/or have any suggestions?

Thanks.
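
One thing that can be checked independently of kernel or driver versions: the dynamic loader prints "invalid ELF header" when the file it opened does not start with the ELF magic bytes, for example a truncated download or a text file sitting under the library's name. A minimal Python sketch of that check follows. The helper name `elf_class` is made up for illustration, and which copy of libcufft.so.2 the loader actually picks on this machine is an assumption to verify with `ldd`.

```python
def elf_class(path):
    """Return '32-bit' or '64-bit' for an ELF file, or None if it is not ELF at all."""
    with open(path, "rb") as f:
        header = f.read(5)
    # Every ELF file starts with 0x7f 'E' 'L' 'F'; byte 4 (EI_CLASS) is 1 or 2.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")

if __name__ == "__main__":
    import sys
    # Pass the library paths to check on the command line,
    # e.g. the libcufft.so.2 that ldd reports for the app.
    for path in sys.argv[1:]:
        print(path, "->", elf_class(path))
```

A result of None for the library the app loads would match the error message above. A 32-bit library under a 64-bit app normally produces a different loader message ("wrong ELF class").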
Title: Re: SETI MB CUDA for Linux
Post by: ML1 on 08 Feb 2009, 09:21:34 pm
I have been trying to get the SETI MB CUDA client running, and every work unit so far hits a computation error as soon as it tries to crunch.

The error log looks like this:

[...]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu: error while loading shared libraries: libcufft.so.2: invalid ELF header
[...]

The machine has an AMD Opteron 248 with 2 Quadro FX 5600 cards. These are equivalent to GeForce 8800 GTXs and use the G80 chipset.  I am using CentOS 5.2 (x86_64) with Linux kernel 2.6.18-8.el5.  I am using NVIDIA drivers x86_64-180.22 and CUDA toolkit 2.1 64-bit.  I have tried BOINC clients 6.4.5 and 6.6.2.  I have tried using the CUDA toolkit libs and the ones provided in Crunch3r's package.

Have any of you run into this problem and/or have any suggestions?

'Tis working fine here and for the same revision for the nVidia drivers.

Kernel 2.6.18 is from a while ago now... It could well be that Crunch3r has used a much more recent kernel and a more recent version of gcc for his compiles.

Can you try a more recent kernel/distro?

I'm using Mandriva 2.6.27.7-server-1mnb.

Good luck,
Martin
Title: Re: SETI MB CUDA for Linux
Post by: CorranHorn on 13 Feb 2009, 06:56:51 pm
It's the same problem on my computer.

http://setiathome.berkeley.edu/show_host_detail.php?hostid=4023395

Cuda is ok

Quote
chess@chess-desktop:~/Documents/setiathome-CUDA-6.08.x86_64$ ldd setiathome-CUDA-6.08.x86_64-pc-linux-gnu
   linux-vdso.so.1 =>  (0x00007ffff01ff000)
   libcufft.so.2 => /usr/lib/libcufft.so.2 (0x00007f80e7cbc000)
   libcudart.so.2 => /usr/lib/libcudart.so.2 (0x00007f80e7a7e000)
   libcuda.so.1 => /usr/lib/libcuda.so.1 (0x00007f80e7607000)
   libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f80e72fa000)
   libm.so.6 => /lib/libm.so.6 (0x00007f80e7075000)
   libpthread.so.0 => /lib/libpthread.so.0 (0x00007f80e6e59000)
   libc.so.6 => /lib/libc.so.6 (0x00007f80e6ae7000)
   libdl.so.2 => /lib/libdl.so.2 (0x00007f80e68e3000)
   libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f80e66cb000)
   librt.so.1 => /lib/librt.so.1 (0x00007f80e64c2000)
   libz.so.1 => /usr/lib/libz.so.1 (0x00007f80e62aa000)
   /lib64/ld-linux-x86-64.so.2 (0x00007f80e7fd6000)

OS

Quote
Operating System   Linux
2.6.27-11-generic
Title: Re: SETI MB CUDA for Linux
Post by: dtiger on 25 Feb 2009, 01:31:42 am
I crunched several units. One has already been validated correctly. The others have error messages:
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
 file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
....

I have a GeForce 8800 GT. What's wrong here?  :(


Now I updated to 180.22. But same situation. The card is not overclocked and temperature is 62 C. There are two types of cases:
- small WUs with a time to completion of ~7 min are running fine without any errors and granted credit is the same as claimed credit (14-15)
- big WUs with a time to completion of ~25 min have the GaussFit error message and the granted credit is smaller than the claimed credit (claimed ~50, granted ~40)


I have exactly the same situation.
Video card is 8600 GT 256 MB, NVIDIA 180.29 driver.
Exactly the same countless error messages in the stderr output, and the same claimed/granted credits.

http://setiathome.berkeley.edu/result.php?resultid=1170131986
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 25 Feb 2009, 05:22:30 am
The credit difference is universal to all SETI CUDA applications - stock Windows as well. That goes back to the developers in Berkeley / nVidia - nothing to do with optimisations in general, or Linux in particular.
Title: Re: SETI MB CUDA for Linux
Post by: dtiger on 26 Feb 2009, 02:52:55 am
The credits don't matter to me. I'm only interested in the technology.

From what I have seen, small units with a run time of about 16 mins store correct log info in stderr_txt. The longer units are full of "Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument."

Also, small units can start one after another on the GPU, while longer units fall back to the CPU after one has completed on the GPU. It looks like a memory issue with Crunch3r's current SETI-CUDA release (setiathome-CUDA-6.08.i686.tar.bz2).

Also, BOINC starts 2 normal CPU crunchers on my C2D E4400 and SETI-CUDA additionally grabs one of the CPU cores at 100%, so the crunchers start fighting over the second core and everything gets very slow, including X server response time.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 26 Feb 2009, 06:46:51 am
256MB seems borderline or not enough for the linux cuda app.

If the cpu app you run is astropulse you can force boinc to run only one instance. In your app_info.xml, in the astropulse section,  add
   
   <avg_ncpus>2.0000</avg_ncpus>
   <max_ncpus>2.0000</max_ncpus>

immediately after

       <version_num>500</version_num>
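
For those who would rather script that edit than do it by hand, a minimal Python sketch follows. It assumes the usual anonymous-platform layout, in which each `<app_version>` block carries an `<app_name>` and a `<version_num>`. The sample XML is a trimmed-down stand-in rather than the file from the package, and the app name `astropulse` and the helper name `set_ncpus` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for an app_info.xml (assumed layout, not the packaged file).
SAMPLE = """<app_info>
  <app_version>
    <app_name>astropulse</app_name>
    <version_num>500</version_num>
  </app_version>
</app_info>"""

def set_ncpus(app_info_xml, app_name, version_num, ncpus):
    """Insert avg_ncpus/max_ncpus right after version_num in the matching app_version."""
    root = ET.fromstring(app_info_xml)
    for app_version in root.iter("app_version"):
        name = (app_version.findtext("app_name") or "").strip()
        ver = app_version.find("version_num")
        if name != app_name or ver is None or (ver.text or "").strip() != str(version_num):
            continue
        # Drop any existing values so the tags are not duplicated.
        for tag in ("avg_ncpus", "max_ncpus"):
            for old in app_version.findall(tag):
                app_version.remove(old)
        position = list(app_version).index(ver) + 1
        for offset, tag in enumerate(("avg_ncpus", "max_ncpus")):
            element = ET.Element(tag)
            element.text = f"{ncpus:.4f}"
            element.tail = ver.tail  # keep the indentation roughly intact
            app_version.insert(position + offset, element)
    return ET.tostring(root, encoding="unicode")

print(set_ncpus(SAMPLE, "astropulse", 500, 2.0))
```

Stop BOINC before changing the real file and keep a backup copy, since a malformed app_info.xml can cost you the cached work.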
Title: Re: SETI MB CUDA for Linux
Post by: dtiger on 03 Mar 2009, 04:45:33 am
Seems like 256 MB is enough for Win's version of SETI-CUDA. They run fine.
Also, as I see from workunits page, Windows clients crunch units for 100-200 seconds on 8800 GTS 256MB, while my 8600 GT 256MB run about 1000-2000 seconds for the same unit, it's a huge abnormal difference for similar hardware.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 03 Mar 2009, 07:33:46 am
Also, as I see from workunits page, Windows clients crunch units for 100-200 seconds on 8800 GTS 256MB, while my 8600 GT 256MB run about 1000-2000 seconds for the same unit, it's a huge abnormal difference for similar hardware.

Two things:
1. Some users with 256 MB graphics cards see some WUs fall back to CPU computation because of not enough memory. Maybe that explains the increased time.
2. The linux CUDA app uses a full core so the time reported is the "real" computation time. The windows CUDA app uses a small percentage of a single core and records only that time. The "real" computation time for windows machines is much larger, possibly equivalent to that of linux PCs.
Title: Re: SETI MB CUDA for Linux
Post by: CorranHorn on 13 Mar 2009, 08:39:48 am
The windows version is faster than the linux version.

http://setiathome.berkeley.edu/workunit.php?wuid=423584760
Title: Re: SETI MB CUDA for Linux
Post by: Crunch3r on 13 Mar 2009, 10:16:07 am
The windows version is faster than the linux version.

http://setiathome.berkeley.edu/workunit.php?wuid=423584760

No It's not faster. You should get some info about the reported 'CPU time' first and the difference between the win & linux app, before posting such a BS...  ::)
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 13 Mar 2009, 10:51:06 am
More correctly - the Linux build uses much more CPU time than the Windows one. Why it is doing so - that's the question.
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 13 Mar 2009, 11:17:37 am
More correctly - the Linux build uses much more CPU time than the Windows one. Why it is doing so - that's the question.

LoL I'm with Crunch3r on this one.  Because CPU time is a useless measure of GPU app performance, and depends on how the OS defines CPU time. Where and how cpu time is attributed to the user program or kernel time will vary by platform, along with the methods used for handling the GPU feeding.

When dealing with a parallel program, you can only go by Wall clock time on Same AR WUs only.  The scheduling and accounting semantics between the two OSes will be vastly different, and likely the Linux figure is just being 'more honest'.

Jason
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 13 Mar 2009, 11:21:22 am
No, you missed that if CPU is busy - it's busy.
But if CPU free - it can be used somewhere else.
It seems that in Linux the CPU is busy the whole time the CUDA app runs (I can only draw conclusions from reading posts, of course; I didn't run it on my own host).

ADDON: on windows I studied total run time (elapsed) for busy cores with CUDA app so pretty confident, CPU is almost FREE while CUDA app running INDEED.
Windows doesn't cheat here as you suppose.
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 13 Mar 2009, 11:25:41 am
No, you missed that if CPU is busy - it's busy.
But if CPU free - it can be used somewhere else.
It seems that in Linux the CPU is busy the whole time the CUDA app runs (I can only draw conclusions from reading posts, of course; I didn't run it on my own host).


You missed that If I'm spending time in a kernel driver, I can attribute it to the program or not.  Windows doesn't.  
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 13 Mar 2009, 11:43:26 am
Again, I _measured_ elapsed times for the CUDA app in a config with all cores busy with CPU apps, and measured elapsed time for the CPU app while the CUDA app was running and the other cores were busy too.
So, NO noticeable kernel time increase here, all fair.
Linux does something wrong here, it seems...
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 13 Mar 2009, 11:45:33 am
Watch deferred procedure Calls process (DPCs) %CPUusage in process explorer, with & without Cuda app running.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 13 Mar 2009, 11:47:00 am
Watch deferred procedure Calls process (DPCs) %CPUusage in process explorer, with & without Cuda app running.
For what? Elapsed == WALL CLOCK.
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 13 Mar 2009, 12:00:08 pm
Watch deferred procedure Calls process (DPCs) %CPUusage in process explorer, with & without Cuda app running.
For what? Elapsed == WALL CLOCK.


That's why I said, Use only wall clock for app comparison.

Deferred procedure calls execute on another core in another process space, so they count as no extra wall clock or CPU time for that Cuda process... even though they were made by it (and consume resources).

DPC CPU usage with no Cuda App running ~0.77%
DPC CPU usage with Cuda Running ~2.5%
(~3 x)

Which is a full ~50% of the Cuda app shunted off to another kernel process, which will not affect ELAPSED WALL-CLOCK, because it runs on another core, nor register in the app's CPU_TIME either.

Linux has no Windows-style deferred procedure calls AFAIK (could be wrong), so it cannot shunt off the CPU time to another process / core, so it cops the CPU time allocation locally.

(i.e. Windows is giving extra hidden CPU time to cuda app, there is no magic. )

http://en.wikipedia.org/wiki/Deferred_Procedure_Call

Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 13 Mar 2009, 12:12:28 pm
The problem with Linux is that it consumes 100% of a CPU, not 2.5%....
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 13 Mar 2009, 12:20:13 pm
Sure, that's an implementation issue they'll need to work out, and maybe something about their cuda drivers & SDK  etc on that platform  (We know how well beta 2.1 SDK is refined on Windows after all.)

If fixable, I can reasonably expect that their apps would still report 1.5->2x the CPU time as the windows ones, since the Windows one has that hidden component.

 [ Edit: Without knowing about how Linux schedules its processes, It's probably implemented with spin-wait polling  loops instead of interrupts & DPCs, so reports 100% CPU, even though only some fraction of that is used for useful work.  in a spin-wait, it shouldn't be blocking other threads or resources anyway, so it shouldn't really prevent any other app running etc... even though it says 100%.

 It would be interesting to know the behaviour on Linux when a CPU app wants to use that core, and whether the Cuda app yields most of the percentage or not. ]
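
To make the spin-wait versus blocking-wait distinction concrete, here is a small Python sketch that waits half a second both ways and compares the wall-clock time with the CPU time charged to the process. It is only an analogy: how the Linux CUDA runtime of that era actually waited on the GPU is not established in this thread, and the function names are illustrative.

```python
import time

def measure(wait_fn, seconds):
    """Return (wall_seconds, cpu_seconds) spent in wait_fn(seconds)."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    wait_fn(seconds)
    return time.perf_counter() - wall0, time.process_time() - cpu0

def spin_wait(seconds):
    # Poll the clock in a tight loop: the process stays runnable the whole time.
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass

def blocking_wait(seconds):
    # Sleep in the kernel: the process is descheduled until it is woken up.
    time.sleep(seconds)

spin_wall, spin_cpu = measure(spin_wait, 0.5)
block_wall, block_cpu = measure(blocking_wait, 0.5)
print(f"spin-wait:     wall {spin_wall:.2f} s, cpu {spin_cpu:.2f} s")
print(f"blocking wait: wall {block_wall:.2f} s, cpu {block_cpu:.2f} s")
```

On an otherwise idle machine the spin-wait is charged roughly as much CPU time as wall time, while the blocking wait is charged almost none, even though both "waited" equally long. That is the kind of difference that makes reported CPU time a poor basis for comparing the two builds.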
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 13 Mar 2009, 12:29:13 pm
About hidden component - interesting to measure it more accurately.
I realize that I could have missed it in my tests (I measured only a single core; the others were used by BOINC). I will try to run 4 CPU apps + 1 GPU app, all standalone under appTimes control. This way all elapsed times will be accounted for - no hidden component at all.
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 13 Mar 2009, 12:45:28 pm
Yeah difficult to nail down, as it won't show in wall or cpu time for that app. It will be small, but large in proportion to reported CPU time & consume about 50% on top of whatever reported cpu time is.  Only way I've been able to see this figure is in process explorer. 

next question is does it matter? Well no, because it is really hidden and doesn't add to wall time, but it does consume some small portion of overall machine resources available to other apps.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 13 Mar 2009, 02:45:28 pm
next question is does it matter? Well no, because it is really hidden and doesn't add to wall time, but it does consume some small portion of overall machine resources available to other apps.


And that portion should be known in order to make decisions about the best host configs and performance tuning.
Overall host performance is what really matters, not the performance of just a single app.
Title: Re: SETI MB CUDA for Linux
Post by: Lysia on 15 Mar 2009, 07:01:29 pm
I'm also seeing this

Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.

Beginning of stderr.txt is
Code:
SETI@home MB CUDA 608 Linux 32bit SM 1.0 - r06 by Crunch3r :p

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 8800 GTS 512
           totalGlobalMem = 536543232
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1620000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 16
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce 8800 GTS 512 is okay
SETI@home using CUDA accelerated device GeForce 8800 GTS 512
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
libboinc: BOINC 6.5.0

Work Unit Info:
...............
WU true angle range is :  0.447697
Optimal function choices:
-----------------------------------------------------
name               
-----------------------------------------------------
              v_BaseLineSmooth (no other)
   v_vGetPowerSpectrumUnrolled 626.74337 0.00000
             sse1_ChirpData_ak 49414.78434 0.00098
                 v_vTranspose4 22247.17831 0.00000
                BH SSE folding 9764.43373 0.00000
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.

I'm running BOINC 6.4.5, Ubuntu 8.04 (2.6.24-23) and have a 8800 GTS 512 (G92). http://setiathome.berkeley.edu/show_host_detail.php?hostid=4840816
Nvidia driver is 180.29, I have tried with CUDA Toolkit from Nvidia (the versions of the libs are 2.1) and the CUDA libs delivered with the S@H-client, with the same results.

The only thing I seem to have in common with all the other reports in this thread seems to be that I run a 32 bit Linux.

I have calculated some work units, most of them still need validation, two are in state "Completed, validation inconclusive", so I guess the results may be garbage.

And some work units just gave a "Compute Error" (http://setiathome.berkeley.edu/result.php?resultid=1185431682), is this normal?

By the way:
Is there any good reason why libcufft.so.2 and libcudart.so.2 are mentioned in app_info.xml? Those should be normal libraries, and I don't think boinc should need to know about them. I did not unpack them at the beginning (I had installed the CUDA Toolkit), and boinc tried to download these files (and failed). Now I have them in projects/setiathome.berkeley.edu, but I don't think they are used from there.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Mar 2009, 09:24:24 pm
Lysius, what kind of cuda client are you running? Those tasks marked "Error while computing" seem to have been killed with the VLAR autokill. As far as I know there hasn't been released a linux cuda client with a VLAR autokill function.
Title: Re: SETI MB CUDA for Linux
Post by: Lysia on 16 Mar 2009, 05:43:28 am
Lysius, what kind of cuda client are you running? Those tasks marked "Error while computing" seem to have been killed with the VLAR autokill. As far as I know there hasn't been released a linux cuda client with a VLAR autokill function.

I'm using the download from the first post in this thread (i686); I redownloaded it and verified that this is really what I have installed. Is there anything newer?
And where can I find the source?

Has anyone successfully used the 32-bit version or does everybody use 64-bit?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 16 Mar 2009, 12:17:56 pm
Yes Lysius, there is something wrong with the 32bit app. I've found and downloaded one of the WUs that you've done ( http://setiathome.berkeley.edu/workunit.php?wuid=424266619 ). The 32bit app gives me the same errors, while the 64bit app is good. Worse still, the two results are weakly similar.

Note to Crunch3r: I've posted the two results in the development thread.
Title: Re: SETI MB CUDA for Linux
Post by: Lysia on 22 Mar 2009, 12:16:09 pm
Yes Lysius, there is something wrong with the 32bit app. I've found and downloaded one of the WUs that you've done ( http://setiathome.berkeley.edu/workunit.php?wuid=424266619 ). The 32bit app gives me the same errors, while the 64bit app is good. Worse still, the two results are weakly similar.

Any news on this?

And is there a guide anywhere on how to test an app without using BOINC and reporting possible garbage?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 22 Mar 2009, 03:07:23 pm
Any news on this?

Unfortunately no. All 32bit builds had the same error. If you can, run the 64bit app; otherwise it's better not to run CUDA for the time being.


And is there a guide anywhere on how to test an app without using BOINC and reporting possible garbage?

Just put  the app, a workunit named work_unit.sah and the file from the compressed archive I attached below in a directory and run it.

[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: Lysia on 23 Mar 2009, 05:11:29 am
Any news on this?

Unfortunately no. All 32bit builds had the same error. If you can, run the 64bit app; otherwise it's better not to run CUDA for the time being.

Then I think somebody should add a warning to the original post or remove the 32bit app entirely.

Unfortunately I only have 32bit linux installed. The last time I tried 64bit I had a sound problem in some 32bit games, so I just reverted to 32bit. Perhaps I should give it a try again, but I don't think this will happen anytime soon.
Title: Re: SETI MB CUDA for Linux
Post by: Moustacha on 26 Mar 2009, 02:54:14 am
Excellent work. Gone from ~14,000 seconds to ~1,300 seconds just by using the GPU  ;D
Title: Re: SETI MB CUDA for Linux
Post by: kpolberg on 03 Apr 2009, 11:56:10 pm
Is the source code for this available? Just wondering if it would be possible to compile a 32bit build myself just to test. I would really like to put a 9500GT to work (my other htpc is based on xbmc with the vdpau mod; I would just like to see whether crunching and htpc use are possible at the same time).
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 04 Apr 2009, 03:33:01 am
Is the source code for this available? Just wondering if it would be possible to compile a 32bit build myself just to test. I would really like to put a 9500GT to work (my other htpc is based on xbmc with the vdpau mod; I would just like to see whether crunching and htpc use are possible at the same time).
You could start with sources from Berkeley's SVN.
https://setisvn.ssl.berkeley.edu/svn/branches/seti_cuda
It contains stock CUDA.
Title: Re: SETI MB CUDA for Linux
Post by: kpolberg on 07 Apr 2009, 06:47:49 pm
Well, it didn't work too well. Actually cuda seems to be working fine, but XBMC together with seti cuda was no hit :)
Title: Re: SETI MB CUDA for Linux
Post by: Freggel_Buster on 28 Apr 2009, 07:06:22 pm
I tried to install the 32bit CUDA app from Crunch3r (setiathome-CUDA-6.08.i686).

But I need a hand to install it properly.

I followed the instructions, like this:

1. I use the Nvidia driver V180.44 from Ubuntu 9.04 32bit
2. I put the whole cudalibs folder in /usr/lib
3. Modified ld.so.conf and ran ldconfig
4. Installed the CUDA Toolkit V2.1
5. Copied the setiathome-CUDA-6.08.i686-pc-linux-gnu and app_info.xml files
to /var/lib/boinc-client/projects/setiathome.berkeley.edu
6. ldd setiathome-CUDA-6.08.i686-pc-linux-gnu shows no errors or warnings

But it doesn't work for me. :-(
Got this:

Do 16 Apr 2009 00:08:25 CEST||Starting BOINC client version 6.4.5 for i686-pc-linux-gnu
Do 16 Apr 2009 00:08:25 CEST||log flags: task, file_xfer, sched_ops
Do 16 Apr 2009 00:08:25 CEST||Libraries: libcurl/7.18.2 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.10
Do 16 Apr 2009 00:08:25 CEST||Data directory: /var/lib/boinc-client
Do 16 Apr 2009 00:08:25 CEST|SETI@home|Found app_info.xml; using anonymous platform
Do 16 Apr 2009 00:08:26 CEST||Processor: 2 AuthenticAMD AMD Athlon(tm) 64 X2 Dual Core Processor 4800+ [Family 15 Model 107 Stepping 1]
Do 16 Apr 2009 00:08:26 CEST||Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy 3dnowprefetch
Do 16 Apr 2009 00:08:26 CEST||OS: Linux: 2.6.28-11-generic
Do 16 Apr 2009 00:08:26 CEST||Memory: 1.97 GB physical, 2.97 GB virtual
Do 16 Apr 2009 00:08:26 CEST||Disk: 22.91 GB total, 16.43 GB free
Do 16 Apr 2009 00:08:26 CEST||Local time is UTC +2 hours
Do 16 Apr 2009 00:08:26 CEST||Not using a proxy
Do 16 Apr 2009 00:08:26 CEST||Can't load library libcudart
Do 16 Apr 2009 00:08:26 CEST||No coprocessors
Do 16 Apr 2009 00:08:26 CEST|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 1867036; location: home; project prefs: default
Do 16 Apr 2009 00:08:26 CEST|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 4879716; location: (none); project prefs: default
Do 16 Apr 2009 00:08:26 CEST|Milkyway@home|URL: http://milkyway.cs.rpi.edu/milkyway/; Computer ID: 56211; location: home; project prefs: default
Do 16 Apr 2009 00:08:26 CEST||General prefs: from Milkyway@home (last modified 08-Apr-2009 00:18:12)
Do 16 Apr 2009 00:08:26 CEST||Computer location: home
Do 16 Apr 2009 00:08:26 CEST||General prefs: no separate prefs for home; using your defaults
Do 16 Apr 2009 00:08:26 CEST||Reading preferences override file
Do 16 Apr 2009 00:08:26 CEST||Preferences limit memory usage when active to 1006.22MB
Do 16 Apr 2009 00:08:26 CEST||Preferences limit memory usage when idle to 1811.20MB
Do 16 Apr 2009 00:08:26 CEST||Preferences limit disk usage to 10.00GB
Do 16 Apr 2009 00:08:26 CEST||file projects/setiathome.berkeley.edu/libcudart.so.2 not found
Do 16 Apr 2009 00:08:26 CEST||file projects/setiathome.berkeley.edu/libcufft.so.2 not found
Do 16 Apr 2009 00:08:26 CEST||[error] No URL for file transfer of libcudart.so.2
Do 16 Apr 2009 00:08:26 CEST||[error] No URL for file transfer of libcufft.so.2
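
The "Can't load library libcudart" line in the log above comes from the BOINC client itself, separately from what `ldd` reports for the science app. A quick way to see whether the dynamic loader can find the libraries from a given account and environment is to try loading them by name, as in this Python sketch. The helper name `can_load` is made up, and the sketch does not reproduce how BOINC itself searches for the library.

```python
import ctypes

def can_load(names):
    """Try to load each shared library by name; return {name: True/False}."""
    result = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # uses the normal loader search path (ld.so cache, LD_LIBRARY_PATH)
            result[name] = True
        except OSError:
            result[name] = False
    return result

# Library names as they appear in the log and in app_info.xml.
print(can_load(["libcudart.so.2", "libcufft.so.2"]))
```

Running it as the same user that runs the BOINC client (often a dedicated `boinc` account on packaged installs) is the more telling test, since the library path can differ between accounts.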
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 29 Apr 2009, 12:48:00 am
Unfortunately the 32bit CUDA linux client is bad. Use the 64bit app if you can.
Title: Re: SETI MB CUDA for Linux
Post by: icy-flame on 10 May 2009, 11:07:21 am
I have a core i7 920 + 9400GT 512MB, but it took about 30min (wall clock) to do a WU with CUDA.

Given the time it took to finish, and that one of the CPU cores was running at full load, I suspect it is working in fallback mode, but I do not see any error message in boinc.log.

Is there anything or anywhere I should check to find out where the problem is?

Code:
# ldd setiathome-CUDA-6.08.x86_64-pc-linux-gnu
        libcufft.so.2 => /usr/lib64/libcufft.so.2 (0x00002b1499d87000)
        libcudart.so.2 => /usr/lib64/libcudart.so.2 (0x00002b149a0a1000)
        libcuda.so.1 => /usr/lib64/libcuda.so.1 (0x00002b149a2df000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00000031b0e00000)
        libm.so.6 => /lib64/libm.so.6 (0x00000031ac200000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00000031aca00000)
        libc.so.6 => /lib64/libc.so.6 (0x00000031abe00000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00000031ac600000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000031b0a00000)
        librt.so.1 => /lib64/librt.so.1 (0x00000031b0600000)
        libz.so.1 => /usr/lib64/libz.so.1 (0x00000031ace00000)
        /lib64/ld-linux-x86-64.so.2 (0x00000031aae00000)


Code:
14:52 [---] Starting BOINC client version 6.6.20 for x86_64-pc-linux-gnu
10-May-2009 13:54:52 [---] This a development version of BOINC and may not function properly
10-May-2009 13:54:52 [---] log flags: task, file_xfer, sched_ops
10-May-2009 13:54:52 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
10-May-2009 13:54:52 [---] Data directory: /usr/lib/BOINC
10-May-2009 13:54:52 [SETI@home] Found app_info.xml; using anonymous platform
10-May-2009 13:54:52 [---] Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU 920  @ 2.67GHz [Family 6 Model 26 Stepping 4]
10-May-2009 13:54:52 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr popcnt lahf_lm
10-May-2009 13:54:52 [---] OS: Linux: 2.6.18-128.1.10.el5
10-May-2009 13:54:52 [---] Memory: 11.72 GB physical, 13.69 GB virtual
10-May-2009 13:54:52 [---] Disk: 888.98 GB total, 836.73 GB free
10-May-2009 13:54:52 [---] Local time is UTC +1 hours
10-May-2009 13:54:52 [---] CUDA device: GeForce 9400 GT (driver version 0, CUDA version 1.1, 511MB, est. 16GFLOPS)
10-May-2009 13:54:52 [---] Not using a proxy
10-May-2009 13:54:52 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 4917711; location: (none); project prefs: default
10-May-2009 13:54:52 [SETI@home] General prefs: from SETI@home (last modified 02-Mar-2009 19:24:50)
10-May-2009 13:54:52 [SETI@home] Host location: none
10-May-2009 13:54:52 [SETI@home] General prefs: using your defaults
10-May-2009 13:54:52 [---] Preferences limit memory usage when active to 6001.50MB
10-May-2009 13:54:52 [---] Preferences limit memory usage when idle to 10802.69MB
10-May-2009 13:54:52 [---] Preferences limit disk usage to 0.50GB
10-May-2009 14:04:31 [SETI@home] Starting 27fe09ab.28487.20931.9.8.139_1
10-May-2009 14:31:47 [SETI@home] Computation for task 27fe09ab.28487.20931.9.8.139_1 finished


Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 10 May 2009, 11:10:13 am
I have a core i7 920 + 9400GT 512MB, but it took about 30min (wall clock) to do a WU with CUDA.

Given the time it took to finish, and that one of the CPU cores was running at full load, I suspect it is working in fall-back mode, but I do not see any error message in boinc.log.

My 20% OC'd 9600GSO averages about 18 minutes. Please compare the specs of the two cards, but your time sounds about right to me  :).  That's still faster than CPU with AKv8, if I recall correctly.
Title: Re: SETI MB CUDA for Linux
Post by: icy-flame on 10 May 2009, 12:25:04 pm
I guess my GPU is just not powerful enough then. What still doesn't look right is that I have one WU that has been going for 3 hours and is only at 90% progress. :(

I have an identical box (hardware and OS) running AK_V8 with boinc 6.4.5, where a typical WU finishes in 60 minutes wall clock. But since HT is enabled (i.e. eight WUs running in parallel on four physical cores), the actual CPU time is closer to 30 min each.
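
One way to read the figures above is as throughput rather than time per WU. The following Python sketch just restates that arithmetic. The function names are made up for illustration, and the 30 minute GPU figure is the first CUDA result mentioned earlier in the thread, not a benchmark.

```python
def wus_per_hour(parallel_tasks, wall_minutes_per_wu):
    """Work units completed per hour when several tasks run side by side."""
    return parallel_tasks * 60.0 / wall_minutes_per_wu


def core_minutes_per_wu(wall_minutes_per_wu, physical_cores, parallel_tasks):
    """Physical-core time charged to one WU when tasks share the cores."""
    return wall_minutes_per_wu * physical_cores / parallel_tasks


# CPU box: eight WUs in parallel (HT on four physical cores), 60 min wall clock each
cpu_rate = wus_per_hour(8, 60)
cpu_core_minutes = core_minutes_per_wu(60, 4, 8)

# GPU: one WU at a time, about 30 min wall clock
gpu_rate = wus_per_hour(1, 30)

print(cpu_rate, cpu_core_minutes, gpu_rate)
```

With these numbers the CPU box finishes 8 WUs per hour at 30 core-minutes each, while the 9400 GT adds about 2 WUs per hour, so on this host the GPU is a modest addition rather than a replacement for the CPU.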


Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 10 May 2009, 12:33:23 pm
Yeah,  I had some 8600GT's when all this CUDA kerfuffle started, and that really brought everything to its knees. Really disappointing, even though I got the cards for nothing from a friend who was throwing them out.

Because I got my 9600 GSO's really cheap (i.e. from Santa  ;D), I thought that I'd try to squeeze what I could out of them.  While they perform fairly ordinarily at stock clocks, they seem pretty good cranked up a bit.  You might consider experimenting with 'ATITool' to determine the best OC without artefacts, and see how 'RthDribl' runs in large config with Boinc suspended.

If the performance then when OC'd doesn't satisfy you, I'd say we're both hitting the limits of the wallet rather than silicon  ;D

[Edit: Having said all that, driver crashes are still possible, instigating fall-back mode, but I would've thought that'd be a Windows thing for some reason.. Being a Windows man of course  :D]
Title: Re: SETI MB CUDA for Linux
Post by: icy-flame on 10 May 2009, 03:21:56 pm
(http://img139.imageshack.us/img139/8395/tmpege.png)

3 hrs per WU on a 9400GT, that is just disappointing :(

AK_V8 on my CPU can do better than that.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 11 May 2009, 09:23:01 am
Maybe those wus were VLAR?
Title: Re: SETI MB CUDA for Linux
Post by: Gecko_R7 on 11 May 2009, 10:23:53 am
Maybe those wus were VLAR?

Exactly.  VLAR on 9400GT would certainly be disappointing.

Also keep in mind that a stock 9400GT might disappoint overall if being compared to production from a high-end CPU, or even a mid-level one w/ lots of L2 cache and a modest over-clock.  Those with Celerons, or older Core2 hosts w/ lower clock speeds and small cache, may actually see the 9400GT as quicker than a CPU core.

IMO, how 9400, 9500 & 9600 performance compares to a CPU depends heavily on the CPU being compared, especially with "reference speed" GPU cards.
Title: Re: SETI MB CUDA for Linux
Post by: dtiger on 29 May 2009, 12:27:39 pm
Also, as BOINC starts 2 normal CPU crunchers on my C2D E4400 and SETI-CUDA additionally grabs one CPU core at 100%, the crunchers start fighting for the second core and everything runs very slowly, including X-server response time.
Finally, I tried Slackware64-current, so I tried the x86_64 version of Seti-Cuda with the same hardware.
The symptoms are the same - slowdown of X-server response time, so I don't like it....

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 29 May 2009, 01:31:27 pm
dtiger you can try the latest boinc 6.6.29. It has the option to use the GPU only when the pc is not used. I haven't used it so I don't know if it works.

Also with cuda 2.2 the linux client now uses a fraction of a cpu core, like the windows client does, and not 100% of a core (not that it'll make X faster, but it's a step forward for us, linux users).
Title: Re: SETI MB CUDA for Linux
Post by: s52d on 07 Jun 2009, 03:17:24 am
Hi!

GTX 260 on Q6600 machine, Slackware. NVIDIA-Linux-x86_64-185.18.04 driver.
Tried 32 bit, worked fine. (besides returning errors, but this is a driver problem, not my machine)


It is now Slackware64, 64bit. Used to work fine in text mode, but KDE kills it. Most WUs take a lot of time, and in the error output we can see:

http://setiweb.ssl.berkeley.edu/result.php?resultid=1248087935

Cuda error 'cudaMalloc((void**) &dev_WorkData' in file './cudaAcceleration.cu' in line 293 : out of memory.
setiathome_CUDA: CUDA runtime ERROR in device memory allocation (Step 1 of 3). Falling back to HOST CPU processing...

Any hint what to do?  XFCE instead of KDE? Logging out of KDE while the PC is not used?
A bit of searching...

Aha, 16 bit color depth works, 24 does not.
    DefaultDepth    16
in xorg.conf, and seems to work fine.


A bit later: again, memory... Aaargh.


BR
Iztok
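
The fall-back message quoted above is the thing to look for when a CUDA task runs at CPU speed: it appears in the task's stderr output, not in boinc.log. A minimal Python sketch of such a check follows. The function name is an assumption, and the marker string is taken verbatim from the error shown above.

```python
# Marker text copied from the stderr output quoted in the post above
FALLBACK_MARKER = "Falling back to HOST CPU processing"


def find_fallback_lines(stderr_text):
    """Return the stderr lines showing the CUDA app gave up on the GPU."""
    return [line.strip()
            for line in stderr_text.splitlines()
            if FALLBACK_MARKER in line]
```

Paste the stderr text from a result page into it. Any returned line means that task was at least partly crunched on the CPU.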

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 07 Jun 2009, 09:17:11 pm
Try the latest driver 185.18.14. It's now the official one.

What version of cuda libraries do you use? 2.1? 2.2?

You have many errors here (http://setiweb.ssl.berkeley.edu/results.php?hostid=2408695&offset=0&show_names=0&state=5). Can you describe under what conditions did they occur?
Title: Re: SETI MB CUDA for Linux
Post by: s52d on 08 Jun 2009, 05:22:30 pm
Hi!

Main news: It is probably not DefaultDepth  but restarting boinc/X11 and freeing up memory ;-)

Errors: the PC went mad and I lost a number of WUs to "computing error" during testing.
The older errors are from the 32bit adventure.

Anyhow, I occasionally found a few copies of setiathome-CUDA-6.08.x86_64-pc-linux-gnu in memory (preferences: do not leave applications in memory).
Quick-and-dirty solution: crontab + shell script to run ps | wc; if there are 3 or more copies, restart the whole boinc.
It triggered once yesterday, and now I have 24 hours of uptime.

Slackware-current64,  2.6.29.4 kernel customized to PC, boinc 6.6.31.
libs as in driver/client packages.
CUDA device: GeForce GTX 260 (driver version 0, compute capability 1.3, 895MB, est. 104GFLOPS)

from dmesg:
NVRM: loading NVIDIA UNIX x86_64 Kernel Module  185.18.04  Thu Apr 16 21:41:04 PDT 2009
IRQ 16/nvidia: IRQF_DISABLED is not guaranteed on shared IRQs
NVRM: Xid (0001:00): 13, 0008 00000000 00005039 00000180 0000046c 00000008


Thanks for pointing out new 185.18.14 .... Will try on next reboot of X11.

BR
Iztok
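
The quick-and-dirty watchdog described above (count the running copies of the CUDA app, restart boinc when there are too many) could look roughly like this in Python. This is only a sketch and not the actual crontab script: the helper names and the default limit of 2 are assumptions, and the restart itself is left out because it depends on how boinc is started on the machine.

```python
import os
import subprocess

APP_NAME = "setiathome-CUDA-6.08.x86_64-pc-linux-gnu"


def read_ps():
    """Return one command line per running process (Linux procps 'ps')."""
    result = subprocess.run(["ps", "-e", "-o", "args="],
                            capture_output=True, text=True, check=True)
    return result.stdout


def count_instances(ps_output, app_name=APP_NAME):
    """Count processes whose executable is app_name.

    Only the first word of each command line is compared, so a grep or an
    editor that merely mentions the name is not counted.
    """
    count = 0
    for line in ps_output.splitlines():
        words = line.split()
        if words and os.path.basename(words[0]) == app_name:
            count += 1
    return count


def should_restart(instances, limit=2):
    """True when at least `limit` copies run at once (limit is an assumed value)."""
    return instances >= limit
```

A cron job would call should_restart(count_instances(read_ps())) every few minutes and, if it returns True, stop and start boinc in whatever way the installation normally does.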

Title: Re: SETI MB CUDA for Linux
Post by: pp on 10 Jun 2009, 04:01:40 pm
Also, as BOINC starts 2 normal CPU crunchers on my C2D E4400 and SETI-CUDA additionally grabs one CPU core at 100%, the crunchers start fighting for the second core and everything runs very slowly, including X-server response time.
Finally, I tried Slackware64-current, so I tried the x86_64 version of Seti-Cuda with the same hardware.
The symptoms are the same - slowdown of X-server response time, so I don't like it....

I see the same behaviour on Gentoo AMD64. I've disabled all use of CPU in my profile, but the client still takes 100% CPU with its single CUDA-process. For now I manually lower the speed of the CPU to avoid having the fan go crazy on me :-) BOINC version is 6.4.5. Any insight into this would be appreciated.
/pp
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 10 Jun 2009, 04:15:20 pm
Can you give a link to your host?
Title: Re: SETI MB CUDA for Linux
Post by: pp on 10 Jun 2009, 04:38:46 pm
Can you give a link to your host?
If that was meant for me you need to explain that to me like the noob I am.   ;) You want me to provide more detailed system info? Just let me know what you need.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 10 Jun 2009, 04:45:35 pm
For example my host: http://setiathome.berkeley.edu/show_host_detail.php?hostid=3281360

Can you give a link to yours?
Title: Re: SETI MB CUDA for Linux
Post by: pp on 10 Jun 2009, 04:54:02 pm
I feel stupid now :-)
http://setiathome.berkeley.edu/show_host_detail.php?hostid=4974779
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 10 Jun 2009, 05:18:03 pm
Ok pp, regarding your first message, if you're willing to run the CUDA app you'll have to accept a certain slowdown in your desktop.

Now tell me what exactly do you want to run? You'll get the highest RAC (credits) running two astropulse instances in your CPU, and multibeam in your GPU.

What version of nvidia drivers do you have? The latest are 185.18.14

Also a good move would be to upgrade your boinc. I see that they have released the 6.6.36 version. I haven't used it and I don't know how it is, but I've read some bad things with various 6.6.3x versions in windows. I use 6.6.20, it's supposedly beta but I don't have any problems. Direct link http://boincdl.ssl.berkeley.edu/dl/boinc_6.6.20_x86_64-pc-linux-gnu.sh . If you want the latest go here http://boinc.berkeley.edu/download_all.php .
Title: Re: SETI MB CUDA for Linux
Post by: pp on 10 Jun 2009, 05:43:29 pm
I don't mind the slowdown, it's a spare machine I only use for gaming sporadically. I participated in Seti Classic for 6 years until they shut down and I only started with Boinc a few days ago when a friend asked for some Linux assistance and my competitive spirit was awakened again. I'm still trying to learn the terminology and the overall framework structure.
The old software versions are the ones currently in the Gentoo Portage tree and I rarely have use for something outside the tree but I'll look into it if you recommend it. It didn't take long to find this forum through some googling so I'm sure there is plenty of information in here that I will find useful but some pointers to good threads covering the topics you recommend would be appreciated. Like what clients to use and how to set the whole thing up. I assume app_info.xml is the key somehow. Is "multibeam" the SETI@Home Enhanced client? What's the difference between Astropulse and Astropulse v5 and which one should I use? Sorry for all the newbie questions...  :)
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 10 Jun 2009, 07:03:12 pm
Especially for cuda, it's best to use the latest versions of boinc and nvidia drivers. If portage doesn't have them then you'll have to get them elsewhere. For boinc I gave you the links above.

Your host page identifies the GPU driver as 1600416869. This number isn't even remotely close to the numbering system that nvidia uses so I can't understand what you have. The latest official driver is 185.18.14. If portage doesn't have it, grab it from nvidia's page or from ftp://download.nvidia.com/XFree86/Linux-x86_64/185.18.14/ (get the pkg2 one).

Yes, multibeam is the seti@home enhanced client. Astropulse_v5 is a newer version of plain astropulse. I don't think there are workunits for the plain astropulse anymore so there is no point installing it. Right now there also aren't any workunits for Astropulse_v5 because we crunched them all  :D,  but eventually there will be.

What sse capabilities does your cpu have?
Title: Re: SETI MB CUDA for Linux
Post by: arkayn on 11 Jun 2009, 03:36:10 am
My guess would be SSE3 or PNI on that host.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 11 Jun 2009, 04:57:57 am
Yes, this is an SSE3 capable Athlon64. My nvidia drivers are 180.60 and if you know what the boinc client is looking for to identify them, I can inform the maintainer.
I compiled the trunk version of boinc successfully yesterday but I'll have to go through that Makefile in detail before I swap to it. I'm experimenting with GCC 4.4.0 so I'm trying to keep the number of installations outside Portage to a minimum for now to make it easier to iron out the bugs. (No, I'm not a Gentoo developer but I try to help out whenever I can.)

The app_info.xml that came with the CUDA client contains several identical sections but with different version numbers. Aren't they redundant?

# grep version_num app_info.xml
<version_num>528</version_num>
<version_num>603</version_num>
<version_num>605</version_num>
<version_num>606</version_num>
<version_num>608</version_num>

I downloaded the SSE3-capable version of Astropulse v5 and there's a new app_info.xml included with it. Can I just add that info to my current app_info.xml to have boinc use both programs? To have it use two instances on my dual core, I assume that's controlled by the "100% CPU" setting in my S@H profile?

Thanks for your help and please leave a few Astropulse WUs for me ;-)

/pp
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 11 Jun 2009, 06:29:55 am
Yes, this is an SSE3 capable Athlon64. My nvidia drivers are 180.60 and if you know what the boinc client is looking for to identify them, I can inform the maintainer.
It doesn't matter much what boinc shows. Get the newer nvidia driver if you can.

I compiled the trunk version of boinc successfully yesterday but I'll have to go through that Makefile in detail before I swap to it.
Why compile? Use the ready packages from the links I gave you above.

The app_info.xml that came with the CUDA client contains several identical sections but with different version numbers. Aren't they redundant?

# grep version_num app_info.xml
<version_num>528</version_num>
<version_num>603</version_num>
<version_num>605</version_num>
<version_num>606</version_num>
<version_num>608</version_num>
Some of them maybe, but leave them as they are. It doesn't matter.

I downloaded the SSE3-capable version of Astropulse v5 and there's a new app_info.xml included with it. Can I just add that info to my current app_info.xml to have boinc use both programs?
Get the 64bit one if you haven't already. It's much faster than the 32bit one. Yes add the info to your current app_info.xml.

To have it use two instances on my dual core, I assume that's controlled by the "100% CPU" setting in my S@H profile?
That's a bit more complicated. If you want to run 2+1 (that is 2 astropulse in your cpu and 1 multibeam in your gpu) you'll need to get the 185.18.14 nvidia driver just to begin with. More tinkering will be needed but we are here to help. If you stay with your current 180.60 you can only use 1+1 (1cpu + 1gpu).
Title: Re: SETI MB CUDA for Linux
Post by: pp on 11 Jun 2009, 12:46:26 pm
Thanks sunu. Plenty of valuable information to work with there. I'll see if I can try to isolate the cpu hog first before I try Astropulse.

About the compilation... that's what working with Gentoo for a few years does to you. I didn't even consider the precompiled binary...  ;D Well, no harm done, it was a fun experience...
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 11 Jun 2009, 01:14:39 pm
I'll see if I can try to isolate the cpu hog first before I try Astropulse.

What cpu hog?
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 11 Jun 2009, 01:36:56 pm
Didn't early Linux CUDA builds take 100% CPU? I thought that was cured by the 185 driver.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 11 Jun 2009, 02:00:56 pm
I'll see if I can try to isolate the cpu hog first before I try Astropulse.

What cpu hog?
I was referring to the behaviour dtiger reported last in his post a few pages back in this thread: http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg14847.html#msg14847
setiathome-CUDA is using 100% of one of my regular CPU's cores even though no actual work is done there. Richard's idea about nvidia-drivers is interesting. Was that ever confirmed? I actually found this forum looking for a solution to that problem...
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 11 Jun 2009, 02:19:13 pm
pp, that's what I've been saying in all my posts to you, UPGRADE TO 185.18.14!!!
Title: Re: SETI MB CUDA for Linux
Post by: pp on 11 Jun 2009, 02:34:30 pm
 :D

I'm old school. I don't blindly upgrade everything to the latest version without thought. I need to understand what has changed and how it will affect the rest of my system. I tried 185.13 right now but got computation errors on every WU. Back to 180.60 for now. I'll have a look at 185.18.14 after some more research...
Title: Re: SETI MB CUDA for Linux
Post by: pp on 11 Jun 2009, 02:59:48 pm
185.19 also gives computation errors and it's the latest from nvidia's ftp. Guess my energy has to focus on the boinc version next... I'll just have to empty my queue before I continue experimenting.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 11 Jun 2009, 03:24:45 pm
No. The latest version is 185.18.14, trust me. But as I wrote in my other post the right driver is just the beginning. There is much more for a proper seti install.

I'll try to write a guide for seti cuda in linux and post it on a new thread.
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 11 Jun 2009, 03:33:38 pm
185.19 also gives computation errors and it's the latest from nvidia's ftp. Guess my energy has to focus on the boinc version next... I'll just have to empty my queue before I continue experimenting.

The actual 'discovery post' is in a development area not generally accessible, but the facts seem clear enough:

Experiments with CUDA 2.2

I tried cuda 2.2 with 185.18.08 beta driver and our 64 bit linux app.

1. 185.18.08 driver isn't compatible with 2.1 cuda libraries. You have to install the 2.2 ones.
2. Current optimized app doesn't have any problem running with the new 2.2 libs.
3. With cuda 2.2, the current linux app exhibits the same behavior as the windows app. It no longer uses a full core except for the first few seconds, after which cpu utilisation hovers around 0-2%. Now in linux we potentially (see #6 ) can also use a 4+1 config, not only 3+1 as it is now.
4. Computation time is better with 2.2. It is now impossible to get accurate computation times for 2.2, so using file creation/modification timestamps I got, for a random 0.44 AR wu:
          2.1      9min 30sec
          2.2      8min 28sec
5. Results are strongly similar
6. While standalone it runs ok, under BOINC I couldn't make it run, I get instant computation errors.

So you need the v2.2 CUDA runtime and FFT library as well as the updated drivers. I don't know the answer to that point 6, though.
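
To put the two timings from point 4 in perspective, here is the arithmetic as a small Python sketch. The variable names are made up for illustration, and it is a single WU measured by file timestamps, so treat the result as a rough indication only.

```python
def to_seconds(minutes, seconds):
    """Convert a 'Xmin Ysec' timing to seconds."""
    return minutes * 60 + seconds


t_cuda21 = to_seconds(9, 30)   # 570 s with the 2.1 libraries
t_cuda22 = to_seconds(8, 28)   # 508 s with the 2.2 libraries

saved = t_cuda21 - t_cuda22    # 62 s
reduction = saved / t_cuda21   # fraction of run time saved
speedup = t_cuda21 / t_cuda22  # throughput ratio

print(f"{saved} s saved, {reduction:.1%} shorter, {speedup:.2f}x throughput")
```

That works out to roughly 11% less run time for this one 0.44 AR workunit, on top of freeing most of a CPU core.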
Title: Re: SETI MB CUDA for Linux
Post by: pp on 11 Jun 2009, 04:05:52 pm
Tried CUDA 2.2 with 185.18.14 and BOINC = computation error
Tried CUDA 2.2 with 180.60 and BOINC = doesn't use CUDA device despite being recognized

Did NOT try client standalone.

Me very tired now and back to CUDA 2.1 and 180.60 for now. I think I'll let other people do the experimenting for now. Thanks for your patience.
Title: Re: SETI MB CUDA for Linux
Post by: s52d on 14 Jun 2009, 08:51:55 am
Huh, once every few days all CUDA WUs error out (libraries etc).
http://setiweb.ssl.berkeley.edu/result.php?resultid=1262888457

boinc restart, and all is OK.


libraries:

-rwxr-xr-x 1 iztok users 252680 2009-01-16 23:30 /usr/lib64/libcudart.so.2.1*
-rwxr-xr-x 1 iztok users 252680 2009-01-16 23:30 /home/iztok/boinc/projects/setiathome.berkeley.edu/libcudart.so.2*
-rwxr-xr-x 1 iztok users 252680 2009-01-16 23:30 /home/iztok/boinc/libcudart.so.2*
-rwxr-xr-x 1 iztok users 254912 2009-02-05 00:39 /home/iztok/boinc/BOINC64/libcudart.so*
lrwxrwxrwx 1 iztok users     16 2009-06-04 11:30 /usr/lib64/libcudart.so.2 -> libcudart.so.2.1*
lrwxrwxrwx 1 iztok users     14 2009-06-04 11:30 /usr/lib64/libcudart.so -> libcudart.so.2*

-rwxr-xr-x 1 iztok users 1150912 2009-01-16 23:30 /usr/lib64/libcufft.so.2.1*
-rwxr-xr-x 1 iztok users 1150912 2009-01-16 23:30 /home/iztok/boinc/projects/setiathome.berkeley.edu/libcufft.so.2*
-rwxr-xr-x 1 iztok users 1150912 2009-01-16 23:30 /home/iztok/boinc/libcufft.so.2*
lrwxrwxrwx 1 iztok users      15 2009-06-04 11:30 /usr/lib64/libcufft.so.2 -> libcufft.so.2.1*
lrwxrwxrwx 1 iztok users      13 2009-06-04 11:30 /usr/lib64/libcufft.so -> libcufft.so.2*
-rw-r--r-- 1 iztok users      76 2009-06-14 14:42 /home/iztok/boinc/slots/0/libcufft.so.2

are those two to be replaced with 2.2?
Where to get them?

configuration here: slackware64, NVIDIA 185.18.14, latest boinc ...

BR
Iztok



Title: Re: SETI MB CUDA for Linux
Post by: sunu on 14 Jun 2009, 12:00:25 pm
Huh, once every few days all CUDA WUs error out (libraries etc).
http://setiweb.ssl.berkeley.edu/result.php?resultid=1262888457
boinc restart, and all is OK.
Iztok

I haven't seen or heard of anything like this before. Maybe the client runs out of memory but can't fall back to cpu crunching and throws all these errors? Just a guess. Look at the computing preferences in your account page and make sure that "Leave applications in memory while suspended?" is set to no.
Title: Re: SETI MB CUDA for Linux
Post by: s52d on 14 Jun 2009, 01:18:20 pm
Keep work in memory is set to no: it was the first thing I checked a while ago when I started hunting errors.
It was always set to NO.

BR
Iztok
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 14 Jun 2009, 03:11:49 pm
Do you switch your wus back and forth between cpu and gpu? I caught these two wus:

http://setiweb.ssl.berkeley.edu/result.php?resultid=1260953059
http://setiweb.ssl.berkeley.edu/result.php?resultid=1260953058

In their stderr.txt they have messages from both AKv8 and cuda app. They are also both vlar.

Also I've read various bad things about 6.6.3x boinc versions. Maybe try a different boinc? I'm using 6.6.20.
Title: Re: SETI MB CUDA for Linux
Post by: s52d on 14 Jun 2009, 03:32:09 pm
Do you switch your wus back and forth between cpu and gpu? I caught these two wus:

http://setiweb.ssl.berkeley.edu/result.php?resultid=1260953059
http://setiweb.ssl.berkeley.edu/result.php?resultid=1260953058

Also I've read various bad things about 6.6.3x boinc versions. Maybe try a different boinc? I'm using 6.6.20.

Yes, I do. After a failure (a quick bunch of errors) I run out of cuda work, so the perl script helps a bit.
I checked a few, and they validate.

Now... every 20 minutes I check ps: if there are two setiathome-CUDA-6.08.x86_64-pc-linux-gnu running,
boinc is killed with -TERM, and restarted after a minute.

2.2 libraries?

BR
Iztok

log example, when my watchdog triggered.
boinc started two copies of 608 (cuda) on top of AK classic 603.

14-Jun-2009 22:43:42 [SETI@home] Computation for task 30mr09aa.25976.14387.11.8.3_1 finished
14-Jun-2009 22:43:42 [SETI@home] Starting 30mr09aa.25976.14387.11.8.0_1
14-Jun-2009 22:43:42 [SETI@home] Starting task 30mr09aa.25976.14387.11.8.0_1 using setiathome_enhanced version 608
14-Jun-2009 22:43:44 [SETI@home] Started upload of 30mr09aa.25976.14387.11.8.3_1_0
14-Jun-2009 22:43:52 [SETI@home] Finished upload of 30mr09aa.25976.14387.11.8.3_1_0
14-Jun-2009 22:45:26 [SETI@home] Computation for task 22mr09aa.12857.21340.12.8.191_1 finished
14-Jun-2009 22:45:26 [SETI@home] Starting 21mr09ac.19502.15201.14.8.227_1
14-Jun-2009 22:45:26 [SETI@home] Starting task 21mr09ac.19502.15201.14.8.227_1 using setiathome_enhanced version 608
14-Jun-2009 22:45:26 [SETI@home] Starting 22mr09aa.12210.890.14.8.72_0
14-Jun-2009 22:45:26 [SETI@home] Starting task 22mr09aa.12210.890.14.8.72_0 using setiathome_enhanced version 603
14-Jun-2009 22:45:28 [SETI@home] Started upload of 22mr09aa.12857.21340.12.8.191_1_0
14-Jun-2009 22:45:34 [SETI@home] Finished upload of 22mr09aa.12857.21340.12.8.191_1_0
14-Jun-2009 22:45:53 [SETI@home] Computation for task 21mr09ad.26644.132650.12.8.71_1 finished
14-Jun-2009 22:45:53 [SETI@home] Starting 22mr09aa.12210.890.14.8.81_0
14-Jun-2009 22:45:53 [SETI@home] Starting task 22mr09aa.12210.890.14.8.81_0 using setiathome_enhanced version 603
14-Jun-2009 22:45:55 [SETI@home] Started upload of 21mr09ad.26644.132650.12.8.71_1_0
14-Jun-2009 22:46:02 [SETI@home] Finished upload of 21mr09ad.26644.132650.12.8.71_1_0
14-Jun-2009 22:50:02 [---] Received signal 15
14-Jun-2009 22:50:02 [---] Exit requested by user

one minute might be too short: a bunch of errors after restart:

14-Jun-2009 22:51:15 [SETI@home] Starting task 14mr09ab.27226.7434.12.8.96_0 using setiathome_enhanced version 608
14-Jun-2009 22:51:18 [SETI@home] Computation for task 14mr09ab.27226.7434.12.8.96_0 finished
14-Jun-2009 22:51:18 [SETI@home] Starting 14mr09ab.27226.7434.12.8.85_0
14-Jun-2009 22:51:18 [SETI@home] Starting task 14mr09ab.27226.7434.12.8.85_0 using setiathome_enhanced version 608
14-Jun-2009 22:51:20 [SETI@home] Started upload of 14mr09ab.27226.7434.12.8.96_0_0
14-Jun-2009 22:51:21 [SETI@home] Computation for task 14mr09ab.27226.7434.12.8.85_0 finished
14-Jun-2009 22:51:21 [SETI@home] Starting 14mr09ab.27226.7434.12.8.57_0
14-Jun-2009 22:51:21 [SETI@home] Starting task 14mr09ab.27226.7434.12.8.57_0 using setiathome_enhanced version 608
14-Jun-2009 22:51:23 [SETI@home] Started upload of 14mr09ab.27226.7434.12.8.85_0_0
14-Jun-2009 22:51:24 [SETI@home] Computation for task 14mr09ab.27226.7434.12.8.57_0 finished
14-Jun-2009 22:51:24 [SETI@home] Starting 14mr09ab.27226.7434.12.8.91_0
14-Jun-2009 22:51:24 [SETI@home] Starting task 14mr09ab.27226.7434.12.8.91_0 using setiathome_enhanced version 608
14-Jun-2009 22:51:28 [SETI@home] Finished upload of 14mr09ab.27226.7434.12.8.96_0_0
14-Jun-2009 22:51:28 [SETI@home] Started upload of 14mr09ab.27226.7434.12.8.57_0_0
14-Jun-2009 22:51:28 [SETI@home] Computation for task 14mr09ab.27226.7434.12.8.91_0 finished
14-Jun-2009 22:51:28 [SETI@home] Starting 14mr09ab.27226.7434.12.8.56_0
14-Jun-2009 22:51:28 [SETI@home] Starting task 14mr09ab.27226.7434.12.8.56_0 using setiathome_enhanced version 608
14-Jun-2009 22:51:29 [SETI@home] Finished upload of 14mr09ab.27226.7434.12.8.85_0_0
14-Jun-2009 22:51:30 [SETI@home] Started upload of 14mr09ab.27226.7434.12.8.91_0_0
14-Jun-2009 22:51:31 [SETI@home] Computation for task 14mr09ab.27226.7434.12.8.56_0 finished
14-Jun-2009 22:51:31 [SETI@home] Starting 14mr09ab.27226.7434.12.8.67_0
14-Jun-2009 22:51:31 [SETI@home] Starting task 14mr09ab.27226.7434.12.8.67_0 using setiathome_enhanced version 608
14-Jun-2009 22:51:33 [SETI@home] Finished upload of 14mr09ab.27226.7434.12.8.57_0_0
14-Jun-2009 22:51:33 [SETI@home] Started upload of 14mr09ab.27226.7434.12.8.56_0_0
14-Jun-2009 22:51:34 [SETI@home] Finished upload of 14mr09ab.27226.7434.12.8.91_0_0
14-Jun-2009 22:51:34 [SETI@home] Computation for task 14mr09ab.27226.7434.12.8.67_0 finished

manual restart, a bit more waiting, seems OK

BR
Iztok
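
The "bunch of errors after restart" is easy to spot in the log above because each task finishes about 3 seconds after it starts. A minimal Python sketch that flags such tasks from boinc log text follows. The function names and the 10 second threshold are assumptions, and it relies on the two message formats exactly as they appear in the log above ("Starting task ... using ..." and "Computation for task ... finished").

```python
import re
from datetime import datetime

# "14-Jun-2009 22:51:15 [SETI@home] message"
LINE_RE = re.compile(
    r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[[^\]]*\] (.*)$")
START_RE = re.compile(r"^Starting task (\S+) using ")
DONE_RE = re.compile(r"^Computation for task (\S+) finished$")


def task_durations(log_text):
    """Map task name -> seconds between its start and finish messages."""
    started = {}
    durations = {}
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        # %b expects English month names, as in the C/English locale
        stamp = datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S")
        message = match.group(2)
        start = START_RE.match(message)
        if start:
            started[start.group(1)] = stamp
            continue
        done = DONE_RE.match(message)
        if done and done.group(1) in started:
            begun = started.pop(done.group(1))
            durations[done.group(1)] = (stamp - begun).total_seconds()
    return durations


def suspiciously_fast(log_text, threshold_seconds=10):
    """Tasks that finished faster than any real computation could."""
    return sorted(task for task, secs in task_durations(log_text).items()
                  if secs < threshold_seconds)
```

Tasks that are restarted from a checkpoint will show only the time since the last start, so a short duration is a hint to look at the result page, not proof of an error.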

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 14 Jun 2009, 03:48:24 pm
2.2 libraries?

What do you mean?

Also try a different boinc version.
Title: Re: SETI MB CUDA for Linux
Post by: s52d on 14 Jun 2009, 04:22:05 pm
2.2 libraries?

What do you mean?

Also try a different boinc version.

libraries:

-rwxr-xr-x 1 iztok users 252680 2009-01-16 23:30 /usr/lib64/libcudart.so.2.1*
-rwxr-xr-x 1 iztok users 252680 2009-01-16 23:30 /home/iztok/boinc/projects/setiathome.berkeley.edu/libcudart.so.2*
-rwxr-xr-x 1 iztok users 252680 2009-01-16 23:30 /home/iztok/boinc/libcudart.so.2*
-rwxr-xr-x 1 iztok users 254912 2009-02-05 00:39 /home/iztok/boinc/BOINC64/libcudart.so*
lrwxrwxrwx 1 iztok users     16 2009-06-04 11:30 /usr/lib64/libcudart.so.2 -> libcudart.so.2.1*
lrwxrwxrwx 1 iztok users     14 2009-06-04 11:30 /usr/lib64/libcudart.so -> libcudart.so.2*

-rwxr-xr-x 1 iztok users 1150912 2009-01-16 23:30 /usr/lib64/libcufft.so.2.1*
-rwxr-xr-x 1 iztok users 1150912 2009-01-16 23:30 /home/iztok/boinc/projects/setiathome.berkeley.edu/libcufft.so.2*
-rwxr-xr-x 1 iztok users 1150912 2009-01-16 23:30 /home/iztok/boinc/libcufft.so.2*
lrwxrwxrwx 1 iztok users      15 2009-06-04 11:30 /usr/lib64/libcufft.so.2 -> libcufft.so.2.1*
lrwxrwxrwx 1 iztok users      13 2009-06-04 11:30 /usr/lib64/libcufft.so -> libcufft.so.2*
-rw-r--r-- 1 iztok users      76 2009-06-14 14:42 /home/iztok/boinc/slots/0/libcufft.so.2

are those two to be replaced with 2.2?
Where to get them?

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 14 Jun 2009, 06:36:25 pm
You can get the cuda 2.2 libraries from http://www.nvidia.com/object/cuda_get.html They are in the cuda toolkit.

Edit: From the boinc log you've posted above it seems more of a boinc problem. Try a different boinc version. Also can you post your app_info.xml?

Title: Re: SETI MB CUDA for Linux
Post by: s52d on 14 Jun 2009, 09:17:52 pm
ok, downgraded boinc to 6.4.5 (latest stable).

seems to work: 4 normal tasks and one CUDA, running with priority 10 and some low % of CPU time.

BR
Iztok

app_info: tried to hack 4 CPU and one CUDA.

<app_info>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</name>
<executable/>
</file_info>
<file_info>
<name>libcudart.so.2</name>
<executable/>
</file_info>
<file_info>
<name>libcufft.so.2</name>
<executable/>
</file_info>
<file_info>
<name>AK_V8_linux64_ssse3</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>528</version_num>
<file_ref>
<file_name>AK_V8_linux64_ssse3</file_name>
<main_program/>
</file_ref>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>603</version_num>
<file_ref>
<file_name>AK_V8_linux64_ssse3</file_name>
<main_program/>
</file_ref>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>605</version_num>
<plan_class>cuda</plan_class>
<avg_ncpus>0.040000</avg_ncpus>
<max_ncpus>0.040000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>libcudart.so.2</file_name>
</file_ref>
<file_ref>
<file_name>libcufft.so.2</file_name>
</file_ref>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>606</version_num>
<plan_class>cuda</plan_class>
<avg_ncpus>0.040000</avg_ncpus>
<max_ncpus>0.040000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>libcudart.so.2</file_name>
</file_ref>
<file_ref>
<file_name>libcufft.so.2</file_name>
</file_ref>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<plan_class>cuda</plan_class>
<avg_ncpus>0.040000</avg_ncpus>
<max_ncpus>0.040000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>libcudart.so.2</file_name>
</file_ref>
<file_ref>
<file_name>libcufft.so.2</file_name>
</file_ref>
</app_version>
</app_info>
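
For a quick overview of a long app_info.xml like the one above (which version number runs which binary, and which ones are CUDA), a few lines of Python with the standard library XML parser will do. This is a sketch with an assumed function name; it only reads the elements that appear in the file above.

```python
import xml.etree.ElementTree as ET


def list_app_versions(app_info_xml):
    """Return (version_num, plan_class, main_program) for each <app_version>."""
    root = ET.fromstring(app_info_xml)
    rows = []
    for app_version in root.findall("app_version"):
        version = int(app_version.findtext("version_num"))
        # CPU entries have no <plan_class>; report that as an empty string
        plan_class = app_version.findtext("plan_class") or ""
        main_program = None
        for file_ref in app_version.findall("file_ref"):
            if file_ref.find("main_program") is not None:
                main_program = file_ref.findtext("file_name")
        rows.append((version, plan_class, main_program))
    return rows
```

Run on the file above, it should list 528 and 603 with the AK_V8 binary and no plan class, and 605, 606 and 608 with the CUDA binary and plan class cuda.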
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Jun 2009, 08:58:05 pm
Your app_info.xml seems ok. If I'm not mistaken 6.4.5 doesn't support running two different versions of the same app simultaneously (e.g. 6.03 on the cpu and 6.08 on the gpu). If you already have 6.03s it will run them, but all new wus will be 6.08 (gpu).
Title: Re: SETI MB CUDA for Linux
Post by: s52d on 16 Jun 2009, 01:34:20 am
Thanks!

Your app_info.xml seems ok. If I'm not mistaken 6.4.5 doesn't support running two different versions of the same app simultaneously (e.g. 6.03 on the cpu and 6.08 on the gpu). If you already have 6.03s it will run them, but all new wus will be 6.08 (gpu).

Uh, ugly.... Maybe perl CPU_GPU_rebrand_V5.pl should be run daily?
One day of 6.4.5: seems to handle X and CPUs better.
- X is more responsive (still stops a bit once in a while, price for using GPU)
- one day with no errors
I guess overwriting libcuda.so helped a bit.

ls -lrt boinc/lib*
-rwxr-xr-x 1 iztok users 1150912 2009-01-16 23:30 libcufft.so.2*
-rwxr-xr-x 1 iztok users  252680 2009-01-16 23:30 libcudart.so.2*
-rwxr-xr-x 1 iztok users  254912 2009-06-15 03:10 libcudart.so*

a bit ugly to have two libcudart at the same time (and outside /usr/lib).



6.3.20 is then next to try, after SETI starts sending new jobs again.

BR
Iztok





Slightly later: BOINC 6.3.20, library copied to ~/boinc directory, smells OK.
validated WU:
http://setiweb.ssl.berkeley.edu/result.php?resultid=1263122254

X works, CPU load OK (shared among X/CPU-seti/GPU-seti). Now testing for the "error epidemic".

BR
Iztok

A bit later: it happened.
Two instances of seti-CUDA running, kill -TERM to boinc, 2 minutes to restart from crontab.
A stream of WUs gone with:


<message>
process exited with code 193 (0xc1, -63)
</message> 
<stderr_txt>

SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce GTX 260
           totalGlobalMem = 938803200
           sharedMemPerBlock = 16384 
           regsPerBlock = 16384     
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1350000     
           totalConstMem = 65536
           major = 1             
           minor = 3
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 27
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 260 is okay
SETI@home using CUDA accelerated device GeForce GTX 260
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
libboinc: BOINC 6.5.0

Work Unit Info:
...............
WU true angle range is :  0.410807
Optimal function choices:
-----------------------------------------------------
name               
-----------------------------------------------------
              v_BaseLineSmooth (no other)
            v_GetPowerSpectrum 0.00019 0.00000
                   v_ChirpData 0.01607 0.00000
                  v_Transpose4 0.00651 0.00000
               FPU opt folding 0.00152 0.00000

SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce GTX 260
           totalGlobalMem = 938803200
           sharedMemPerBlock = 16384 
           regsPerBlock = 16384
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1350000
           totalConstMem = 65536
           major = 1
           minor = 3
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 27
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 260 is okay
SIGSEGV: segmentation violation
Stack trace (16 frames):

setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x47cba9]
/lib64/libpthread.so.0[0x7f3f289e2f30]
/usr/lib64/libcuda.so.1[0x7f3f29454020]
/usr/lib64/libcuda.so.1[0x7f3f29459d84]
/usr/lib64/libcuda.so.1[0x7f3f2942310f]
/usr/lib64/libcuda.so.1[0x7f3f291aeb3b]
/usr/lib64/libcuda.so.1[0x7f3f291bf46b]
/usr/lib64/libcuda.so.1[0x7f3f291a7211]
/usr/lib64/libcuda.so.1(cuCtxCreate+0xaa)[0x7f3f291a0faa]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x5ace4b]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x40d4ca]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x419f23]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x424c7d]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x407f60]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7f3f28682526]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu(__gxx_personality_v0+0x241)[0x407be9]

Exiting...

</stderr_txt>




Title: Re: SETI MB CUDA for Linux
Post by: sunu on 16 Jun 2009, 06:00:36 am
6.3.20 is then next to try, after SETI starts sending new jobs again.

Do you mean 6.6.20? 6.3.20 is pretty old now.

Try something else. Put in your app_info.xml

   <avg_ncpus>1.0000</avg_ncpus>
   <max_ncpus>1.0000</max_ncpus>

in all AKv8 entries and also make all cuda entries 1.0000.  Then put in your cc_config.xml

<ncpus>5</ncpus>

Test it and see how it goes.
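
For anyone who'd rather not edit every entry by hand, the app_info.xml part of this could be scripted roughly as below. It's only a sketch: the function name set_ncpus_to_one is made up, it assumes each avg_ncpus/max_ncpus element sits on one line (as in the app_info.xml posted earlier), and it changes every such entry it sees, so keep a backup.

```shell
#!/bin/sh
# Hypothetical helper (not part of BOINC): reads app_info.xml on stdin and
# writes a copy on stdout with every avg_ncpus/max_ncpus value set to 1.0000.
# Assumes each element is on a single line, as in the file posted above.
set_ncpus_to_one() {
    sed -e 's|<avg_ncpus>[^<]*</avg_ncpus>|<avg_ncpus>1.0000</avg_ncpus>|' \
        -e 's|<max_ncpus>[^<]*</max_ncpus>|<max_ncpus>1.0000</max_ncpus>|'
}

# Example, run inside the project directory with BOINC stopped:
#   cp app_info.xml app_info.xml.bak
#   set_ncpus_to_one < app_info.xml.bak > app_info.xml
```

The cc_config.xml change is a single line, so that one is probably easier to do by hand.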
Title: Re: SETI MB CUDA for Linux
Post by: s52d on 16 Jun 2009, 07:43:23 am
Done.

Let me wait a day or two to see if it strikes again.

Thanks,

Iztok

Of course: <core_client_version>6.6.20</core_client_version>

Title: Re: SETI MB CUDA for Linux
Post by: Sp@r0 on 04 Jul 2009, 11:52:31 am
Hello,

I have a problem with "SETI@Home MB NVIDIA CUDA V6.08". My PC runs Fedora 11 x86_64 and I have a GTS 250. I have set up the CUDA 2.2 driver with the toolkit and the SDK.
The SDK sample apps run without problems.

The CPU optimized application AK V8 Linux 64 SSSE3 runs very well, but with the CUDA app I have 2 big problems:
* I get a computation error on all my WUs within a second
* BOINC does not detect the CUDA card if I launch it as a service, but it works if I launch it from a terminal

I have tried lots of things with different versions of CUDA, but now I'm sick of it. I do not know the reason for the computation errors because I do not know if there is an error log file.

Can you give me some tips to solve my problem?

PS: I'm sorry for the quality of my English
 

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Jul 2009, 11:58:44 am
What version is the nvidia driver you use?
Title: Re: SETI MB CUDA for Linux
Post by: Sp@r0 on 04 Jul 2009, 02:03:29 pm
I use the following versions:
CUDA: 2.2
Nvidia driver: 185.18.14
BOINC: 6.4.7 (comes from yum; a little bit strange, since the current version on BOINC's website is 6.4.5)
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Jul 2009, 03:00:48 pm
Follow all steps (1-4) below:

1)  Use a newer boinc version. The latest is 6.6.36, http://boinc.berkeley.edu/download_all.php . I haven't checked it, I use 6.6.20, direct download link http://boinc.berkeley.edu/dl/boinc_6.6.20_x86_64-pc-linux-gnu.sh
2)  Make sure all the appropriate cuda libs from 2.2 toolkit

libcudart.so
libcudart.so.2
libcudart.so.2.2
libcufft.so
libcufft.so.2
libcufft.so.2.2

are in the projects/setiathome.berkeley.edu directory.

3)  Add the above location of the cuda libs to your ld.so.conf (or the corresponding ld-something file of your distro), then run ldconfig as root to rebuild the libraries cache.

4)  Place a copy of the cuda client in one of the following locations:

/usr/local/sbin
/usr/local/bin
/usr/sbin
/usr/bin
/sbin
/bin
/usr/games
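
Step 2 is the one that is easiest to get half right, so here is a rough way to check it. This is just a sketch: the function name missing_cuda_libs is made up, and you have to pass it the full path of your own projects/setiathome.berkeley.edu directory.

```shell
#!/bin/sh
# Hypothetical helper: prints each of the six library names from step 2 that
# is absent from the given directory, and returns non-zero if any is missing.
missing_cuda_libs() {
    dir=$1
    status=0
    for lib in libcudart.so libcudart.so.2 libcudart.so.2.2 \
               libcufft.so  libcufft.so.2  libcufft.so.2.2; do
        # -e follows symlinks, so a dangling link is reported as missing too
        if [ ! -e "$dir/$lib" ]; then
            echo "$lib"
            status=1
        fi
    done
    return $status
}

# Example (the path is a placeholder, adjust it to your BOINC directory):
#   missing_cuda_libs "$HOME/boinc/projects/setiathome.berkeley.edu" \
#       && echo "all six libs present"
```

If it prints nothing, all six names are there. After changing ld.so.conf in step 3, remember to run ldconfig as root so the cache is rebuilt.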
Title: Re: SETI MB CUDA for Linux
Post by: Sp@r0 on 05 Jul 2009, 05:59:53 am
Thanks a lot for your tips sunu,

It runs very well now. I have got my first valid WUs on my GTS 250 (about 5~10 min per WU). It uses about 2% of CPU time, so I can crunch 2 other SETI WUs on my C2D E7300.

Bye
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Jul 2009, 06:30:01 am
Happy crunching Sp@r0  :)
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 07 Jul 2009, 10:57:16 am
Follow all steps (1-4) below:

Hi sunu, I'm running Ubuntu 9.04 64-bit and 2 9600 GSO's with the 4 driver packages that "nvidia-180-kernel-source_185.18.14-0ubuntu1_amd64.deb" is part of, and the 185.18.14 drivers seem to be working, but not for CUDA with Seti.  I've followed your steps, but keep getting workunits with the following errors (after <1 sec run time each):   Any ideas that might help???

http://setiathome.berkeley.edu/result.php?resultid=1293981772

<core_client_version>6.6.20</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

setiathome_CUDA: Found 2 CUDA device(s):
   Device 1 : GeForce 9600 GSO
           totalGlobalMem = 804585472
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1350000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 12
   Device 2 : GeForce 9600 GSO 512
           totalGlobalMem = 536608768
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1600000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 6
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce 9600 GSO is okay
SIGSEGV: segmentation violation
Stack trace (16 frames):
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x47cba9]
/lib/libpthread.so.0[0x7f8e8b5d8080]
/usr/lib/libcuda.so.1[0x7f8e8c04f020]
/usr/lib/libcuda.so.1[0x7f8e8c054d84]
/usr/lib/libcuda.so.1[0x7f8e8c01e10f]
/usr/lib/libcuda.so.1[0x7f8e8bda9b3b]
/usr/lib/libcuda.so.1[0x7f8e8bdba46b]
/usr/lib/libcuda.so.1[0x7f8e8bda2211]
/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7f8e8bd9bfaa]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x5ace4b]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x40d4ca]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x419f23]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x424c7d]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x407f60]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7f8e8b2755a6]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu(__gxx_personality_v0+0x241)[0x407be9]

Exiting...

</stderr_txt>
]]>
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 07 Jul 2009, 12:43:07 pm
Hi Tye!

I don't have much time right now, I'll post back in a few hours.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 07 Jul 2009, 02:45:22 pm
Hi Tye!

I don't have much time right now, I'll post back in a few hours.

Thanks sunu - I'll check back tonight.  Hope everything's not too busy anymore.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 07 Jul 2009, 05:52:06 pm
Hi again Tye.

First of all get rid of that package. I don't know what it is or what it contains but it's better to use the "original" nvidia drivers. Go to synaptic and uninstall everything related to nvidia (you can keep xserver-xorg-video-nv). Some tips from nvidia:

Quote
If you wish to install the NVIDIA Linux graphics driver on a Debian GNU/Linux or Ubuntu system that ships with Xorg 7.x, please ensure that your system meets the following requirements:

    * development tools like make (build-essential) and gcc are installed
    * the linux-headers package matching the installed Linux kernel is installed
    * the pkg-config and xserver-xorg-dev packages are installed
    * the nvidia-glx package has been uninstalled with the --purge option and the files /etc/init.d/nvidia-glx and /etc/init.d/nvidia-kernel do not exist

If you use Ubuntu, please also ensure that the linux-restricted-modules or linux-restricted-modules-common packages have been uninstalled. Alternatively, you can edit the /etc/default/linux-restricted-modules or /etc/default/linux-restricted-modules-common configuration file and disable the NVIDIA linux-restricted kernel modules (nvidia, nvidia_legacy) via:

    DISABLED_MODULES="nv nvidia_new"

Additionally, delete the following file if it exists:

    /lib/linux-restricted-modules/.nvidia_new_installed

Get the driver from http://www.nvidia.com/object/linux_display_amd64_185.18.14.html . If you need help on the installation post back.


Ubuntu 9.04's kernel has some problems with our apps. Either compile your own kernel or use 8.10's kernel (2.6.27-14). Again post back if you need step by step instructions for that.

Also please post your xorg.conf and xorg.0.log.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 08 Jul 2009, 06:17:04 am
Hmmm, I'll go ahead and try that later today, but I can tell you that my Ubuntu 9.04 will CUDA just fine with 180.44, though I lose 2 of my 4 CPUs to the overhead with the 180 drivers and CUDA for linux, using either the 2.1 *or* 2.2 libcudart and libcufft.  Also, I successfully compiled the nvidia CUDA SDK stuff and all those apps work fine, so I was hoping that would carry over to the Seti CUDA app, but I guess not...

I'm not where I can try using the driver from nvidia right now, but I'll work that later today and see what it does.  It'll hopefully work, but in that case it'll feel like "magic", since I don't know how all the CUDA SDK apps could work, but not the seti CUDA app...   ;)

I've had the driver direct from Nvidia installed before so that should be no problem - just didn't have the other stuff in place to enable the CUDA to work correctly, so I rolled back to 180.44 and then saw this 185 package so figured that might work 'better'.   Guess not...  ;)
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 08 Jul 2009, 09:50:49 am
Argh.  Even using the 185 driver from the nvidia website didn't help (did the --purge uninstall etc. of course).  Still getting computation errors after 1 or 2 seconds.  Next I'll keep the official nvidia 185 driver and try using the 8.10 kernel, but I'm not too hopeful at this point...   :(
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Jul 2009, 10:08:16 am
Tye please post your xorg.conf and xorg.0.log here. Also do a ldd on your cuda app and post here.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 08 Jul 2009, 10:13:48 am
BTW, here's my xorg.conf file...


# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 1.0  (buildmeister@builder62)  Tue Jan  6 09:43:54 PST 2009


Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "Module"
    Load           "dbe"
    Load           "extmod"
    Load           "type1"
    Load           "freetype"
    Load           "glx"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:01:00:0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "Coolbits" "1"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection


and my xorg.0.log file:


X.Org X Server 1.6.0
Release Date: 2009-2-25
X Protocol Version 11, Revision 0
Build Operating System: Linux 2.6.24-15-server x86_64 Ubuntu
Current Operating System: Linux tye 2.6.28-13-generic #45-Ubuntu SMP Tue Jun 30 22:12:12 UTC 2009 x86_64
Build Date: 09 April 2009  02:11:54AM
xorg-server 2:1.6.0-0ubuntu14 (buildd@crested.buildd)
   Before reporting problems, check http://wiki.x.org
   to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
   (++) from command line, (!!) notice, (II) informational,
   (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Wed Jul  8 09:43:01 2009
(==) Using config file: "/etc/X11/xorg.conf"
(==) ServerLayout "Layout0"
(**) |-->Screen "Screen0" (0)
(**) |   |-->Monitor "Monitor0"
(**) |   |-->Device "Device0"
(**) |-->Input Device "Keyboard0"
(**) |-->Input Device "Mouse0"
(==) Automatically adding devices
(==) Automatically enabling devices
(WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
   Entry deleted from font path.
(==) FontPath set to:
   /usr/share/fonts/X11/misc,
   /usr/share/fonts/X11/100dpi/:unscaled,
   /usr/share/fonts/X11/75dpi/:unscaled,
   /usr/share/fonts/X11/Type1,
   /usr/share/fonts/X11/100dpi,
   /usr/share/fonts/X11/75dpi,
   /var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType,
   built-ins
(==) ModulePath set to "/usr/lib/xorg/modules"
(WW) AllowEmptyInput is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
(WW) Disabling Keyboard0
(WW) Disabling Mouse0
(II) Loader magic: 0xb40
(II) Module ABI versions:
   X.Org ANSI C Emulation: 0.4
   X.Org Video Driver: 5.0
   X.Org XInput driver : 4.0
   X.Org Server Extension : 2.0
(II) Loader running on linux
(++) using VT number 7

(!!) More than one possible primary device found
(--) PCI: (0@1:0:0) nVidia Corporation GeForce 9600 GSO rev 162, Mem @ 0xcc000000/16777216, 0xb0000000/268435456, 0xca000000/33554432, I/O @ 0x00009c00/128, BIOS @ 0x????????/131072
(--) PCI: (0@3:0:0) nVidia Corporation GeForce 9600 GSO rev 162, Mem @ 0xc8000000/16777216, 0xa0000000/268435456, 0xc6000000/33554432, I/O @ 0x00008c00/128, BIOS @ 0x????????/131072
(II) Open ACPI successful (/var/run/acpid.socket)
(II) System resource ranges:
   [0] -1   0   0xffffffff - 0xffffffff (0x1) MX
   [1] -1   0   0x000f0000 - 0x000fffff (0x10000) MX
   [2] -1   0   0x000c0000 - 0x000effff (0x30000) MX
   [3] -1   0   0x00000000 - 0x0009ffff (0xa0000) MX
   [4] -1   0   0x0000ffff - 0x0000ffff (0x1) IX
   [5] -1   0   0x00000000 - 0x00000000 (0x1) IX
(II) "extmod" will be loaded. This was enabled by default and also specified in the config file.
(II) "dbe" will be loaded. This was enabled by default and also specified in the config file.
(II) "glx" will be loaded. This was enabled by default and also specified in the config file.
(II) "record" will be loaded by default.
(II) "dri" will be loaded by default.
(II) "dri2" will be loaded by default.
(II) LoadModule: "dbe"
(II) Loading /usr/lib/xorg/modules/extensions//libdbe.so
(II) Module dbe: vendor="X.Org Foundation"
   compiled for 1.6.0, module version = 1.0.0
   Module class: X.Org Server Extension
   ABI class: X.Org Server Extension, version 2.0
(II) Loading extension DOUBLE-BUFFER
(II) LoadModule: "extmod"
(II) Loading /usr/lib/xorg/modules/extensions//libextmod.so
(II) Module extmod: vendor="X.Org Foundation"
   compiled for 1.6.0, module version = 1.0.0
   Module class: X.Org Server Extension
   ABI class: X.Org Server Extension, version 2.0
(II) Loading extension MIT-SCREEN-SAVER
(II) Loading extension XFree86-VidModeExtension
(II) Loading extension XFree86-DGA
(II) Loading extension DPMS
(II) Loading extension XVideo
(II) Loading extension XVideo-MotionCompensation
(II) Loading extension X-Resource
(II) LoadModule: "type1"
(WW) Warning, couldn't open module type1
(II) UnloadModule: "type1"
(EE) Failed to load module "type1" (module does not exist, 0)
(II) LoadModule: "freetype"
(WW) Warning, couldn't open module freetype
(II) UnloadModule: "freetype"
(EE) Failed to load module "freetype" (module does not exist, 0)
(II) LoadModule: "glx"
(II) Loading /usr/lib/xorg/modules/extensions//libglx.so
(II) Module glx: vendor="NVIDIA Corporation"
   compiled for 4.0.2, module version = 1.0.0
   Module class: X.Org Server Extension
(II) NVIDIA GLX Module  185.18.14  Wed May 27 01:53:56 PDT 2009
(II) Loading extension GLX
(II) LoadModule: "record"
(II) Loading /usr/lib/xorg/modules/extensions//librecord.so
(II) Module record: vendor="X.Org Foundation"
   compiled for 1.6.0, module version = 1.13.0
   Module class: X.Org Server Extension
   ABI class: X.Org Server Extension, version 2.0
(II) Loading extension RECORD
(II) LoadModule: "dri"
(II) Loading /usr/lib/xorg/modules/extensions//libdri.so
(II) Module dri: vendor="X.Org Foundation"
   compiled for 1.6.0, module version = 1.0.0
   ABI class: X.Org Server Extension, version 2.0
(II) Loading extension XFree86-DRI
(II) LoadModule: "dri2"
(II) Loading /usr/lib/xorg/modules/extensions//libdri2.so
(II) Module dri2: vendor="X.Org Foundation"
   compiled for 1.6.0, module version = 1.0.0
   ABI class: X.Org Server Extension, version 2.0
(II) Loading extension DRI2
(II) LoadModule: "nvidia"
(II) Loading /usr/lib/xorg/modules/drivers//nvidia_drv.so
(II) Module nvidia: vendor="NVIDIA Corporation"
   compiled for 4.0.2, module version = 1.0.0
   Module class: X.Org Video Driver
(II) NVIDIA dlloader X Driver  185.18.14  Wed May 27 01:30:19 PDT 2009
(II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
(II) Primary Device is:
(II) Loading sub module "fb"
(II) LoadModule: "fb"
(II) Loading /usr/lib/xorg/modules//libfb.so
(II) Module fb: vendor="X.Org Foundation"
   compiled for 1.6.0, module version = 1.0.0
   ABI class: X.Org ANSI C Emulation, version 0.4
(II) Loading sub module "wfb"
(II) LoadModule: "wfb"
(II) Loading /usr/lib/xorg/modules//libwfb.so
(II) Module wfb: vendor="X.Org Foundation"
   compiled for 1.6.0, module version = 1.0.0
   ABI class: X.Org ANSI C Emulation, version 0.4
(II) Loading sub module "ramdac"
(II) LoadModule: "ramdac"
(II) Module "ramdac" already built-in
(II) resource ranges after probing:
   [0] -1   0   0xffffffff - 0xffffffff (0x1) MX
   [1] -1   0   0x000f0000 - 0x000fffff (0x10000) MX
   [2] -1   0   0x000c0000 - 0x000effff (0x30000) MX
   [3] -1   0   0x00000000 - 0x0009ffff (0xa0000) MX
   [4] -1   0   0x0000ffff - 0x0000ffff (0x1) IX
   [5] -1   0   0x00000000 - 0x00000000 (0x1) IX
(**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
(==) NVIDIA(0): RGB weight 888
(==) NVIDIA(0): Default visual is TrueColor
(==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
(**) NVIDIA(0): Option "Coolbits" "1"
(**) NVIDIA(0): Enabling RENDER acceleration
(II) NVIDIA(0): Support for GLX with the Damage and Composite X extensions is
(II) NVIDIA(0):     enabled.
(II) NVIDIA(0): NVIDIA GPU GeForce 9600 GSO (G92) at PCI:1:0:0 (GPU-0)
(--) NVIDIA(0): Memory: 786432 kBytes
(--) NVIDIA(0): VideoBIOS: 62.92.4c.00.06
(II) NVIDIA(0): Detected PCI Express Link width: 16X
(--) NVIDIA(0): Interlaced video modes are supported on this GPU
(--) NVIDIA(0): Connected display device(s) on GeForce 9600 GSO at PCI:1:0:0:
(--) NVIDIA(0):     HSD JC199D (CRT-0)
(--) NVIDIA(0): HSD JC199D (CRT-0): 400.0 MHz maximum pixel clock
(II) NVIDIA(0): Assigned Display Device: CRT-0
(==) NVIDIA(0):
(==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
(==) NVIDIA(0):     will be used as the requested mode.
(==) NVIDIA(0):
(II) NVIDIA(0): Validated modes:
(II) NVIDIA(0):     "nvidia-auto-select"
(II) NVIDIA(0): Virtual screen size determined to be 1280 x 1024
(--) NVIDIA(0): DPI set to (85, 86); computed from "UseEdidDpi" X config
(--) NVIDIA(0):     option
(==) NVIDIA(0): Enabling 32-bit ARGB GLX visuals.
(--) Depth 24 pixmap format is 32 bpp
(II) do I need RAC?  No, I don't.
(II) resource ranges after preInit:
   [0] -1   0   0xffffffff - 0xffffffff (0x1) MX
   [1] -1   0   0x000f0000 - 0x000fffff (0x10000) MX
   [2] -1   0   0x000c0000 - 0x000effff (0x30000) MX
   [3] -1   0   0x00000000 - 0x0009ffff (0xa0000) MX
   [4] -1   0   0x0000ffff - 0x0000ffff (0x1) IX
   [5] -1   0   0x00000000 - 0x00000000 (0x1) IX
(II) NVIDIA(GPU-1): NVIDIA GPU GeForce 9600 GSO (G92) at PCI:3:0:0 (GPU-1)
(--) NVIDIA(GPU-1): Memory: 786432 kBytes
(--) NVIDIA(GPU-1): VideoBIOS: 62.92.4c.00.06
(II) NVIDIA(GPU-1): Detected PCI Express Link width: 16X
(--) NVIDIA(GPU-1): Interlaced video modes are supported on this GPU
(--) NVIDIA(GPU-1): Connected display device(s) on GeForce 9600 GSO at PCI:3:0:0:
(II) NVIDIA(0): Initialized GPU GART.
(II) NVIDIA(0): Setting mode "nvidia-auto-select"
(II) Loading extension NV-GLX
(II) NVIDIA(0): NVIDIA 3D Acceleration Architecture Initialized
(==) NVIDIA(0): Disabling shared memory pixmaps
(II) NVIDIA(0): Using the NVIDIA 2D acceleration architecture
(==) NVIDIA(0): Backing store disabled
(==) NVIDIA(0): Silken mouse enabled
(**) Option "dpms"
(**) NVIDIA(0): DPMS enabled
(II) Loading extension NV-CONTROL
(II) Loading extension XINERAMA
(==) RandR enabled
(II) Initializing built-in extension Generic Event Extension
(II) Initializing built-in extension SHAPE
(II) Initializing built-in extension MIT-SHM
(II) Initializing built-in extension XInputExtension
(II) Initializing built-in extension XTEST
(II) Initializing built-in extension BIG-REQUESTS
(II) Initializing built-in extension SYNC
(II) Initializing built-in extension XKEYBOARD
(II) Initializing built-in extension XC-MISC
(II) Initializing built-in extension SECURITY
(II) Initializing built-in extension XINERAMA
(II) Initializing built-in extension XFIXES
(II) Initializing built-in extension RENDER
(II) Initializing built-in extension RANDR
(II) Initializing built-in extension COMPOSITE
(II) Initializing built-in extension DAMAGE
(II) Initializing extension GLX
(II) config/hal: Adding input device GenPS/2 Genius Mouse
(II) LoadModule: "evdev"
(II) Loading /usr/lib/xorg/modules/input//evdev_drv.so
(II) Module evdev: vendor="X.Org Foundation"
   compiled for 1.6.0, module version = 2.1.1
   Module class: X.Org XInput Driver
   ABI class: X.Org XInput driver, version 4.0
(**) GenPS/2 Genius Mouse: always reports core events
(**) GenPS/2 Genius Mouse: Device: "/dev/input/event5"
(II) GenPS/2 Genius Mouse: Found 5 mouse buttons
(II) GenPS/2 Genius Mouse: Found x and y relative axes
(II) GenPS/2 Genius Mouse: Configuring as mouse
(**) GenPS/2 Genius Mouse: YAxisMapping: buttons 4 and 5
(**) GenPS/2 Genius Mouse: EmulateWheelButton: 4, EmulateWheelInertia: 10, EmulateWheelTimeout: 200
(II) XINPUT: Adding extended input device "GenPS/2 Genius Mouse" (type: MOUSE)
(**) GenPS/2 Genius Mouse: (accel) keeping acceleration scheme 1
(**) GenPS/2 Genius Mouse: (accel) filter chain progression: 2.00
(**) GenPS/2 Genius Mouse: (accel) filter stage 0: 20.00 ms
(**) GenPS/2 Genius Mouse: (accel) set acceleration profile 0
(II) config/hal: Adding input device Macintosh mouse button emulation
(**) Macintosh mouse button emulation: always reports core events
(**) Macintosh mouse button emulation: Device: "/dev/input/event2"
(II) Macintosh mouse button emulation: Found 3 mouse buttons
(II) Macintosh mouse button emulation: Found x and y relative axes
(II) Macintosh mouse button emulation: Configuring as mouse
(**) Macintosh mouse button emulation: YAxisMapping: buttons 4 and 5
(**) Macintosh mouse button emulation: EmulateWheelButton: 4, EmulateWheelInertia: 10, EmulateWheelTimeout: 200
(II) XINPUT: Adding extended input device "Macintosh mouse button emulation" (type: MOUSE)
(**) Macintosh mouse button emulation: (accel) keeping acceleration scheme 1
(**) Macintosh mouse button emulation: (accel) filter chain progression: 2.00
(**) Macintosh mouse button emulation: (accel) filter stage 0: 20.00 ms
(**) Macintosh mouse button emulation: (accel) set acceleration profile 0
(II) config/hal: Adding input device AT Translated Set 2 keyboard
(**) AT Translated Set 2 keyboard: always reports core events
(**) AT Translated Set 2 keyboard: Device: "/dev/input/event3"
(II) AT Translated Set 2 keyboard: Found keys
(II) AT Translated Set 2 keyboard: Configuring as keyboard
(II) XINPUT: Adding extended input device "AT Translated Set 2 keyboard" (type: KEYBOARD)
(**) Option "xkb_rules" "evdev"
(**) AT Translated Set 2 keyboard: xkb_rules: "evdev"
(**) Option "xkb_model" "pc104"
(**) AT Translated Set 2 keyboard: xkb_model: "pc104"
(**) Option "xkb_layout" "us"
(**) AT Translated Set 2 keyboard: xkb_layout: "us"



Title: Re: SETI MB CUDA for Linux
Post by: Tye on 08 Jul 2009, 10:19:45 am
here's the ldd on the CUDA app I'm using for seti:



ldd setiathome*CUDA*

   linux-vdso.so.1 =>  (0x00007fff939ff000)
   libcufft.so.2 => /home/tye/boinc/projects/setiathome.berkeley.edu/libcufft.so.2 (0x00007f7f8b38a000)
   libcudart.so.2 => /home/tye/boinc/projects/setiathome.berkeley.edu/libcudart.so.2 (0x00007f7f8b14a000)
   libcuda.so.1 => /usr/lib/libcuda.so.1 (0x00007f7f8ac7d000)
   libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f7f8a970000)
   libm.so.6 => /lib/libm.so.6 (0x00007f7f8a6eb000)
   libpthread.so.0 => /lib/libpthread.so.0 (0x00007f7f8a4cf000)
   libc.so.6 => /lib/libc.so.6 (0x00007f7f8a15d000)
   libdl.so.2 => /lib/libdl.so.2 (0x00007f7f89f59000)
   libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f7f89d41000)
   librt.so.1 => /lib/librt.so.1 (0x00007f7f89b39000)
   libz.so.1 => /lib/libz.so.1 (0x00007f7f89921000)
   /lib64/ld-linux-x86-64.so.2 (0x00007f7f8b6a5000)
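
With a long ldd listing the interesting lines are easy to overlook, so a small filter like the one below could help. It is only a sketch: cuda_lib_report is a made-up name, and it does nothing beyond reading ldd output on stdin, flagging anything reported as "not found" and showing where the libcuda/libcudart/libcufft entries resolve.

```shell
#!/bin/sh
# Hypothetical filter for `ldd` output read from stdin.
# Prints "MISSING: <lib>" for every unresolved library and
# "<lib> -> <path>" for the CUDA-related ones; returns non-zero
# if anything was reported as "not found".
cuda_lib_report() {
    awk '
        BEGIN                  { bad = 0 }
        /not found/            { bad = 1; print "MISSING: " $1; next }
        /libcu(da|dart|fft)\./ { print $1 " -> " $3 }
        END                    { exit bad }
    '
}

# Example, run in the directory that holds the CUDA app:
#   ldd setiathome*CUDA* | cuda_lib_report
```

For the listing above it would show just the three libcu* lines, since nothing there is unresolved.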

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Jul 2009, 11:28:54 am
There is something I don't understand. Your xorg.0.log says:
(--) PCI: (0@1:0:0) nVidia Corporation GeForce 9600 GSO rev 162, Mem @ 0xcc000000/16777216,
0xb0000000/268435456, 0xca000000/33554432, I/O @ 0x00009c00/128, BIOS @ 0x????????/131072
(--) PCI: (0@3:0:0) nVidia Corporation GeForce 9600 GSO rev 162, Mem @ 0xc8000000/16777216,
0xa0000000/268435456, 0xc6000000/33554432, I/O @ 0x00008c00/128, BIOS @ 0x????????/131072
...
(II) NVIDIA(0): NVIDIA GPU GeForce 9600 GSO (G92) at PCI:1:0:0 (GPU-0)
(--) NVIDIA(0): Memory: 786432 kBytes
(--) NVIDIA(0): VideoBIOS: 62.92.4c.00.06
...
(II) NVIDIA(GPU-1): NVIDIA GPU GeForce 9600 GSO (G92) at PCI:3:0:0 (GPU-1)
(--) NVIDIA(GPU-1): Memory: 786432 kBytes
(--) NVIDIA(GPU-1): VideoBIOS: 62.92.4c.00.06

So to X server your cards appear the same. On the cuda error you posted above the cards are completely different (totalGlobalMem, clockRate, multiProcessorCount) :
   Device 1 : GeForce 9600 GSO
           totalGlobalMem = 804585472
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1350000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 12
   Device 2 : GeForce 9600 GSO 512
           totalGlobalMem = 536608768
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1600000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 6

Which one is the truth? I think your xorg.conf will need some adjustments. I'll research further.
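
To compare cards without scrolling through the whole dump, the device blocks in the app's stderr can be boiled down to one line each. A rough sketch, assuming the layout shown above ("Device N : name" followed by "property = value" lines, with multiProcessorCount last); the function name summarize_cuda_devices is made up:

```shell
#!/bin/sh
# Hypothetical helper: reads the CUDA app's stderr text on stdin and prints
# one summary line per device (name, memory, clock, multiprocessor count).
summarize_cuda_devices() {
    awk '
        # Start of a device block, e.g. "   Device 2 : GeForce 9600 GSO 512"
        /^ *Device [0-9]+ : / {
            name = $0
            sub(/^ *Device [0-9]+ : /, "", name)
            mem = clk = mp = "?"
            in_block = 1
            next
        }
        in_block && $1 == "totalGlobalMem" { mem = $3 }
        in_block && $1 == "clockRate"      { clk = $3 }
        # multiProcessorCount is the last property listed for each device
        in_block && $1 == "multiProcessorCount" {
            mp = $3
            print name " | mem=" mem " | clock=" clk " | MPs=" mp
            in_block = 0
        }
    '
}

# Example: save the stderr text of a result to a file, then
#   summarize_cuda_devices < stderr.txt
```

Lines such as "Device 1: GeForce 9600 GSO is okay" are ignored because they have no space before the colon.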

Also please do one more thing, go to synaptic and install strace. Then run boinc with:

strace -ffv boinc

Let it run for a few seconds to let cuda destroy some workunits. strace will produce some debugXXXXX files from the failed cuda runs. Tar them up and attach them here.

Thanks.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 08 Jul 2009, 12:43:56 pm
I moved some cards around to make them match the first card now.  I've been trying lots of things to see what might help.  Unfortunately making both the same wasn't helpful...  ;-)   Good catch, sunu!
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 08 Jul 2009, 01:10:42 pm
I will try the strace once I can get more workunits to be sent...
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 08 Jul 2009, 01:38:31 pm
There was some strace text output (it didn't make any debugXXX files), but I didn't see anything useful in what I could capture in my buffer.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 08 Jul 2009, 02:56:54 pm
BTW, the 9.04 kernel did have an update recently, to 2.6.28-13 rather than the 2.6.28-11 that it shipped with.  Does the newer kernel still have the same problems?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Jul 2009, 06:59:17 pm
BTW, the 9.04 kernel did have an update recently, to 2.6.28-13 rather than the 2.6.28-11 that it shipped with.  Does the newer kernel still have the same problems?

Both 2.6.28-11 and 2.6.28-12 have problems. I think I saw somewhere that 2.6.28-13 also has problems.  I say ditch ubuntu's 2.6.28 kernel altogether and use something else. I'm running ubuntu 9.04 myself with the 2.6.27 kernel from 8.10.

Sorry for the strace error. Use:

strace -ffv -o debug boinc

Now you'll see those debug files.

Also go to /dev and tell me how many nvidiaX devices you see.

Let's try some xorg.conf tweaks. Below the current "Device" section add another one for your second card, something like this:

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:01:00:0"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:03:00:0"
EndSection


On the "Screen" section add:

Option "SLI" "False"
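
The /dev check asked for above can be scripted. This is only a minimal sketch: it assumes the driver creates one node per GPU named nvidia0, nvidia1, and so on, plus a separate nvidiactl control node. The function name count_nvidia_nodes is made up for illustration.

```shell
#!/bin/sh
# Count the per-GPU device nodes (nvidia0, nvidia1, ...) in a directory.
# The control node nvidiactl is deliberately not counted.
count_nvidia_nodes() {
    dir=${1:-/dev}
    count=0
    for node in "$dir"/nvidia[0-9]*; do
        # An unmatched glob stays literal, so check the node really exists.
        [ -e "$node" ] && count=$((count + 1))
    done
    echo "$count"
}

# Usage:
#   count_nvidia_nodes        # looks in /dev
```

With two cards installed and recognised, the count would be expected to be 2; a lower number suggests the driver has not set up the second card.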


Title: Re: SETI MB CUDA for Linux
Post by: Tye on 08 Jul 2009, 08:57:16 pm
Actually, to take variables out of the system until I nail down this CUDA thing, I've taken out the 2nd GPU, so I'm only running on one.

I'll try the strace thing again, but right now the project's out of workunits again...

Also, I'm willing to use the 8.10 kernel but I can't seem to make it install with my vbox stuff.  Any hints?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 09 Jul 2009, 05:30:44 am
Also, I'm willing to use the 8.10 kernel but I can't seem to make it install with my vbox stuff.  Any hints?

What problems do you have? I don't have any problems with 8.10's kernel and virtualbox 2.x or 3.0.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 09 Jul 2009, 06:13:58 am
When it gets near the end of the kernel packages install, it fails on adding two things for virtualbox into the kernel.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 09 Jul 2009, 06:39:17 am
When it gets near the end of the kernel packages install, it fails on adding two things for virtualbox into the kernel.

Does the install fail altogether? If not I don't think it's a problem. In my /lib/modules/2.6.27-14-generic there aren't any virtualbox modules, while there are in my /lib/modules/2.6.28-13-generic and /lib/modules/2.6.28-13-server and still virtualbox runs ok. It just doesn't like module kvm_intel, so before starting a virtual machine I do a sudo rmmod kvm_intel or sudo modprobe -r kvm_intel .
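
The kvm_intel workaround can be wrapped so the module is only removed when it is actually loaded. A minimal sketch, assuming the usual /proc/modules format where the first field on each line is the module name; module_loaded is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if the named kernel module appears in a /proc/modules-style listing.
module_loaded() {
    name=$1
    list=${2:-/proc/modules}
    # The first field of each line is the module name.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$list"
}

# Usage (as root), before starting a virtual machine:
#   module_loaded kvm_intel && modprobe -r kvm_intel
```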
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 09 Jul 2009, 09:21:53 am
Hmmm, don't think it failed altogether (the kernel stuff was listed as installed and showed up in the menu.lst), but I was loath to try to boot into it with the failed insertions - not sure what else might have burped.  Still, I'll try it tonight.  For now, I've gone back to the 180.44 driver and have been Seti-CUDAing fine again (but with the CPU hit of course).

If I just try with the -image and the -headers packages, the -headers fail, but the image installs and shows up in the menu.lst.  Is that all I really need?  I get the error that the -headers package can't be installed because it depends on itself.  ;)

BTW, the modules that fail now are nvidia (180.44), vboxdrv (2.1.4), and vboxnetflt (2.1.4).
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 09 Jul 2009, 10:02:45 am
If I just try with the -image and the -headers packages, the -headers fail, but the image installs and shows up in the menu.lst.  Is that all I really need?  I get the error that the -headers package can't be installed because it depends on itself.  ;)

BTW, the modules that fail now are nvidia (180.44), vboxdrv (2.1.4), and vboxnetflt (2.1.4).

You need the headers package for the nvidia drivers. How do you install the 8.10 kernel? What I did was edit /etc/apt/sources.list and replace every "jaunty" with "intrepid", so it was just like it was in intrepid. Then update to reload the package info and install these four packages:

linux-headers-2.6.27-14
linux-headers-2.6.27-14-generic
linux-image-2.6.27-14-generic
linux-restricted-modules-2.6.27-14-generic

After installation revert back to the previous "jaunty edition" sources.list and you should be ok.


Also don't give up on cuda 2.2 with 2 gpus. Put back the second GPU, and try more. Tell me how many nvidia devices exist under /dev. Try the xorg.conf tweaks I posted above and post here a new xorg.0.log with the tweaks.

And I need those debug files. But first let's  try standalone. Do you have any workunits?
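
The sources.list trick described above could look roughly like this. It is a sketch, not a tested recipe: switch_release is a made-up helper, it assumes the codenames contain no characters that are special to sed, and the apt-get lines are shown as comments because they need root and a matching Ubuntu install.

```shell
#!/bin/sh
# Rewrite every occurrence of one release codename to another in a
# sources.list-style file, keeping a backup copy next to it as FILE.bak.
switch_release() {
    file=$1
    from=$2
    to=$3
    cp "$file" "$file.bak" || return 1
    sed "s/$from/$to/g" "$file.bak" > "$file"
}

# Usage (as root), following the steps in the post:
#   switch_release /etc/apt/sources.list jaunty intrepid
#   apt-get update
#   apt-get install linux-headers-2.6.27-14 linux-headers-2.6.27-14-generic \
#       linux-image-2.6.27-14-generic linux-restricted-modules-2.6.27-14-generic
#   cp /etc/apt/sources.list.bak /etc/apt/sources.list   # back to jaunty
#   apt-get update
```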
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 09 Jul 2009, 12:02:54 pm
I did the sources.list trick, and with a little finagling, I was able to get the 8.10 kernel to install with no broken deps or other warnings.  It even added the nvidia driver and vbox stuff correctly.  Booted fine but is crashy now - argh.  Hard to fix via SSH and VNC from work to home...
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 09 Jul 2009, 12:26:31 pm
Ok, I hope the kernel is ok. Now it's time to attack those 2 gpus.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 09 Jul 2009, 12:52:57 pm
I think it was crashy because of nvclock - will not use it just yet.  Might need the 8.10 version, or my card is unrecognized...  Getting my wife to reboot right now so I can login and see what the deal is...
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 09 Jul 2009, 01:07:39 pm
I also use nvclock with absolutely no problems at all. I don't use the one from the repositories; I got it straight from http://www.linuxhardware.org/nvclock/
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 09 Jul 2009, 01:55:24 pm
No worries.  It's not crashy now - one of the things I did must have helped...  ;)

Anyway, got the nvidia-site 185 driver back in and got some debug files for you since I was able to get some workunits.

[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 09 Jul 2009, 05:32:19 pm
Tye, the debug files tell me that you haven't followed my directions from http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg19014.html#msg19014 , step number 4 to be exact. In your case that list of locations has two more options: /home/tye/bin and /usr/local/cuda/bin . That's why you're getting those instant computation faults.
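
To check whether a copy of the client sits in one of the search locations, something like the following might help. A minimal sketch: find_on_path is a hypothetical helper, and the file name you pass must be the actual name of your cuda client.

```shell
#!/bin/sh
# Print the first directory on a PATH-style list that holds an executable
# file with the given name; return failure if none does.
find_on_path() {
    name=$1
    path_list=${2:-$PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $path_list; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            echo "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Usage:
#   find_on_path NAME_OF_YOUR_CUDA_CLIENT || echo "client is not on PATH"
```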
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 09 Jul 2009, 06:24:40 pm
Aha!  I put it in my ~/bin directory and no more immediate fails.  And it frees up a CPU after a minute or so.

I am still crashy (total freeze) however, so I'm thinking of going back to the most recent 9.04 kernel.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 09 Jul 2009, 06:40:09 pm
Watch your cards for overheating. Cuda stresses the cards very much. I use nvclock to run my card's fan always at 100%.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 09 Jul 2009, 10:46:35 pm
Thanks much sunu - looks like putting the file in the correct place also works with the ubuntu 185 driver I had earlier (now up to 0ubuntu3 though).  Tried it on another machine just out of curiosity and it's as simple as installing that driver (4 files from karmic 'release'), putting the CUDA app in your $PATH, putting the libraries in the seti work area and making the correct links.

I did have to go back to the 9.04 kernel though - kept getting hard freezes.

Anyway, thanks again - you saved me.  I thought for sure I had put the CUDA app in the right place.  Argh.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 10 Jul 2009, 04:12:27 am
Glad we solved it. Have you tried again with two cards?
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 10 Jul 2009, 10:59:09 am
Haven't tried again with two cards, but don't foresee a problem as that was working just fine with 180.44 so the config files will likely work.  If not I'll try your trick.

Cheers!  - Tye
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 11 Jul 2009, 01:44:56 pm
Came across an interesting error message in task 1294937260 (http://setiathome.berkeley.edu/result.php?resultid=1294937260) while researching something else.

Quote
SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

Error: API mismatch: the NVIDIA kernel module has version 180.29,
but this NVIDIA driver component has version 180.60.  Please make
sure that the kernel module and all NVIDIA driver components
have the same version.
setiathome_CUDA: Found 1 CUDA device(s):
Cuda error 'cudaGetDeviceProperties( &cDevProp, i )' in file './cudaAcceleration.cu' in line 138 : initialization error.

Something to watch for when fiddling about with Linux drivers and modules.

The anonymous owner of host 5011059 (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5011059) seems to be having a real problem getting his or her GTX 295 running under gentoo.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 11 Jul 2009, 02:37:27 pm
Figured out what the crashiness was - turns out that the 9600 GSOs I had that were 768 do not like being in the primary PCI-Ex slot.  If I put them in the secondary slot and use a different card in the primary, then all is good.  Even the 9600 GSO 512 does fine in the primary - just not my two GSO 768s.  Argh.  Looks like no double-CUDA'ing with them, but might try mixed-mode with the 512 if I can pick up another card and put it where the 512 is being used now...
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 11 Jul 2009, 03:12:31 pm
Came across an interesting error message in task 1294937260 (http://setiathome.berkeley.edu/result.php?resultid=1294937260) while researching something else.

Quote
SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

Error: API mismatch: the NVIDIA kernel module has version 180.29,
but this NVIDIA driver component has version 180.60.  Please make
sure that the kernel module and all NVIDIA driver components
have the same version.
setiathome_CUDA: Found 1 CUDA device(s):
Cuda error 'cudaGetDeviceProperties( &cDevProp, i )' in file './cudaAcceleration.cu' in line 138 : initialization error.

Something to watch for when fiddling about with Linux drivers and modules.

The anonymous owner of host 5011059 (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5011059) seems to be having a real problem getting his or her GTX 295 running under gentoo.

He hasn't installed the NVIDIA drivers properly.

Figured out what the crashiness was - turns out that the 9600 GSOs I had that were 768 do not like being in the primary PCI-Ex slot.  If I put them in the secondary slot and use a different card in the primary, then all is good.  Even the 9600 GSO 512 does fine in the primary - just not my two GSO 768s.  Argh.  Looks like no double-CUDA'ing with them, but might try mixed-mode with the 512 if I can pick up another card and put it where the 512 is being used now...

Interesting.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 12 Jul 2009, 07:01:59 am
In Windows the difference between the first and second PCI-E slots (if the first has a monitor attached and the second does not) is:
The GPU used by Windows for video output is subject to a 2 or 3 second timeout, but the second GPU is not.
Don't know if this is relevant to Linux though.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 12 Jul 2009, 09:26:46 am
The GPU used by Windows for video output is subject to a 2 or 3 second timeout, but the second GPU is not.
Don't know if this is relevant to Linux though.

Well, if it is because of the first gpu also drawing the screen then it will probably also exist in linux. We don't have a big sample of seti cuda users with multi gpus in linux. Actually the sample is non-existent  :D

What Tye describes might be some faulty config, strange driver behavior, or some weird motherboard-gpu-gpu hardware incompatibility.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 12 Jul 2009, 10:10:04 am
Not sure it exists in Linux. It's not a GPU feature, it's a Windows feature - it will kill the driver (Vista) after more than 2 secs of "no answer" from it.
Don't know if the Linux kernel implements such a watchdog mechanism or not.
GPUs that don't output video aren't subject to this "driver hung" check and can run long kernels. That's why surely not everything that works OK on a Tesla will work OK on users' GPUs (even if newer GPUs are slightly faster than the first released Teslas IMHO)
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 12 Jul 2009, 11:56:47 am
Hello everyone

Came across an interesting error message in task 1294937260 (http://setiathome.berkeley.edu/result.php?resultid=1294937260) while researching something else.

Quote
SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

Error: API mismatch: the NVIDIA kernel module has version 180.29,
but this NVIDIA driver component has version 180.60. 
...

Something to watch for when fiddling about with Linux drivers and modules.

The anonymous owner of host 5011059 (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5011059) seems to be having a real problem getting his or her GTX 295 running under gentoo.

With this host I don't have any problems. It just happened during a system upgrade.

I have a real problem with host 5018683 (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5018683). I don't have any idea what's wrong. It isn't overclocked or overheating. The GPUs run at about 75C~77C at full load (~52C idle). And other CUDA programs are working fine, but with SETI almost all tasks end with:

cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.2/cufft/src/execute.cu, line 1070
cufft: ERROR: CUFFT_EXEC_FAILED
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.2/cufft/src/execute.cu, line 1070
cufft: ERROR: CUFFT_EXEC_FAILED
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.2/cufft/src/cufft.cu, line 147
cufft: ERROR: CUFFT_EXEC_FAILED
Cuda error 'cufftExecC2C' in file './cudaAcc_fft.cu' in line 63 : unspecified launch failure.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file './cudaAcc_PowerSpectrum.cu' in line 56 : unspecified launch failure.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file './cudaAcc_PowerSpectrum.cu' in line 56 : unspecified launch failure.
Cuda error 'cudaAcc_summax32_kernel' in file './cudaAcc_summax.cu' in line 148 : unspecified launch failure.
Cuda error 'cudaAcc_summax32_kernel' in file './cudaAcc_summax.cu' in line 148 : unspecified launch failure.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file './cudaAcc_summax.cu' in line 161 : unspecified launch failure.

I will be thankful for any idea what's wrong and how to solve it.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 12 Jul 2009, 12:06:10 pm
FFT lib kernel launch failed, most probably incompatibility between CUDA RT and video driver used.
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 12 Jul 2009, 12:26:26 pm
I tried these combinations of drivers and cuda:
- drv 180.29 with cuda 2.1
- drv 180.60 with cuda 2.1
- drv 185.18.14 with cuda 2.1
- drv 185.18.14 with cuda 2.2

And they all give the same results. The strange thing is that just a few weeks back (drv 180.29, cuda 2.1) everything worked fine; maybe there is something wrong with those workunits.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 12 Jul 2009, 01:14:09 pm
b0b3r as a start do an ldd of the seti client and post here as well as your xorg.0.log
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 12 Jul 2009, 01:20:38 pm
I don't use Xorg on this machine. Here is ldd output:

linux-vdso.so.1 =>  (0x00007fffeb1ff000)
libcufft.so.2 => /opt/cuda/lib/libcufft.so.2 (0x00007f38e2c23000)
libcudart.so.2 => /opt/cuda/lib/libcudart.so.2 (0x00007f38e29e5000)
libcuda.so.1 => /usr/lib/libcuda.so.1 (0x00007f38e2518000)
libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.3.2/libstdc++.so.6 (0x00007f38e220d000)
libm.so.6 => /lib/libm.so.6 (0x00007f38e1f88000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007f38e1d6c000)
libc.so.6 => /lib/libc.so.6 (0x00007f38e19f9000)
libdl.so.2 => /lib/libdl.so.2 (0x00007f38e17f5000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f38e15de000)
librt.so.1 => /lib/librt.so.1 (0x00007f38e13d5000)
libz.so.1 => /lib/libz.so.1 (0x00007f38e11bf000)
/lib64/ld-linux-x86-64.so.2 (0x00007f38e2f3e000)
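
To pick just the CUDA libraries out of output like the above, and to spot any that fail to resolve, a small filter can help. A sketch only, assuming the usual "name => path (address)" ldd layout; cuda_lib_paths is a made-up name.

```shell
#!/bin/sh
# Read ldd output on stdin and print "library path" for each CUDA-related
# library (libcuda, libcudart, libcufft). Unresolved ones show "not found".
cuda_lib_paths() {
    awk '$1 ~ /^libcu/ {
        if ($3 == "not") print $1, "not found"
        else print $1, $3
    }'
}

# Usage:
#   ldd ./NAME_OF_YOUR_CUDA_CLIENT | cuda_lib_paths
```

This only shows which copies of the libraries the loader picks up; it says nothing about whether the toolkit and driver versions are compatible.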
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 12 Jul 2009, 01:34:36 pm
I found some information on google that it may be a problem with a memory leak in cufft; here is an example (http://www.accelereyes.com/forums/viewtopic.php?f=7&t=349).
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 12 Jul 2009, 01:57:31 pm
I don't use Xorg on this machine.

What do you use? What does

ls /dev/nv*

give you?
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 12 Jul 2009, 02:04:20 pm
ls -al /dev/nv*
crw-rw-rw- 1 root root 195,   0 Jul 12 20:01 /dev/nvidia0
crw-rw-rw- 1 root root 195,   1 Jul 12 20:01 /dev/nvidia1
crw-rw-rw- 1 root root 195,   2 Jul 12 20:01 /dev/nvidia2
crw-rw-rw- 1 root root 195, 255 Jul 12 20:01 /dev/nvidiactl
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 12 Jul 2009, 02:22:28 pm
Your system sees three devices.

In your host 5018683, boinc doesn't even see your graphics cards. Are you sure that you have installed them correctly?

Also in both of your hosts upgrade boinc. 6.4.5 is too old.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 12 Jul 2009, 02:29:50 pm
Came across an interesting error message in task 1294937260 (http://setiathome.berkeley.edu/result.php?resultid=1294937260) while researching something else.

Quote
SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

Error: API mismatch: the NVIDIA kernel module has version 180.29,
but this NVIDIA driver component has version 180.60.  Please make
sure that the kernel module and all NVIDIA driver components
have the same version.
setiathome_CUDA: Found 1 CUDA device(s):
Cuda error 'cudaGetDeviceProperties( &cDevProp, i )' in file './cudaAcceleration.cu' in line 138 : initialization error.

Something to watch for when fiddling about with Linux drivers and modules.

The anonymous owner of host 5011059 (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5011059) seems to be having a real problem getting his or her GTX 295 running under gentoo.

He hasn't installed the NVIDIA drivers properly.


Well, actually he has. But after installing the new package he neither rebooted nor loaded the new module. He's still running his system with the old version in memory. Easy mistake to make in Gentoo - been there, done that.  :)
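
A quick way to catch this situation is to compare the version of the module in memory with the one installed on disk. The sketch below assumes the 2009-era format of /proc/driver/nvidia/version, where the version number directly follows the words "Kernel Module"; other driver releases may word that line differently. loaded_nvidia_version is a hypothetical helper name.

```shell
#!/bin/sh
# Extract the driver version from the "NVRM version:" line of
# /proc/driver/nvidia/version (the field that follows the word "Module").
loaded_nvidia_version() {
    file=${1:-/proc/driver/nvidia/version}
    awk '/^NVRM version:/ {
        for (i = 1; i < NF; i++)
            if ($i == "Module") { print $(i + 1); exit }
    }' "$file"
}

# Usage: compare with the module installed on disk. A difference means the
# old module is still loaded, so reload it or reboot.
#   [ "$(loaded_nvidia_version)" = "$(modinfo -F version nvidia)" ] \
#       || echo "stale nvidia module in memory"
```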
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 12 Jul 2009, 03:03:43 pm
Not sure it exists in Linux. It's not a GPU feature, it's a Windows feature - it will kill the driver (Vista) after more than 2 secs of "no answer" from it.
Don't know if the Linux kernel implements such a watchdog mechanism or not.
GPUs that don't output video aren't subject to this "driver hung" check and can run long kernels. That's why surely not everything that works OK on a Tesla will work OK on users' GPUs (even if newer GPUs are slightly faster than the first released Teslas IMHO)

Raistmer do you mean something like this? From cuda 2.2 release notes:

o Individual GPU program launches are limited to a run time
  of less than 5 seconds on a GPU with a display attached.
  Exceeding this time limit causes a launch failure reported
  through the CUDA driver or the CUDA runtime. GPUs without
  a display attached are not subject to the 5 second run time
  restriction. For this reason it is recommended that CUDA is
  run on a GPU that is NOT attached to an X display.

So yes, it also exists in linux.


@b0b3r: The error you posted above with the "unspecified launch failure" messages might be because of that.

Curiously I've crunched tens of thousands of workunits with my GPU that also runs X without ever seeing that kind of error.
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 12 Jul 2009, 03:28:06 pm
....Individual GPU program launches are limited to a run time
  of less than 5 seconds on a GPU with a display attached.....
  Yes.  In embedded microcontroller system terminology, that's called a "Watchdog Timer".  Crazy people program GPUs to take longer than that, Lunatics try to fix it.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 12 Jul 2009, 03:38:19 pm
Raistmer do you mean something like this? From cuda 2.2 release notes:

o Individual GPU program launches are limited to a run time
  of less than 5 seconds on a GPU with a display attached.
  Exceeding this time limit causes a launch failure reported
  through the CUDA driver or the CUDA runtime. GPUs without
  a display attached are not subject to the 5 second run time
  restriction. For this reason it is recommended that CUDA is
  run on a GPU that is NOT attached to an X display.

So yes, it also exists in linux.
Exactly. Timer value varies between OSes but it's the same thing.
Quote
Curiously I've crunched tens of thousands of workunits with my GPU that also runs X without ever seeing that kind of error.
Well, maybe you have a faster GPU than the user who has the issues?...
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 12 Jul 2009, 03:50:52 pm
The GPU used by Windows for video output is subject to a 2 or 3 second timeout, but the second GPU is not.
Don't know if this is relevant to Linux though.

Well, if it is because of the first gpu also drawing the screen then it will probably also exist in linux. We don't have a big sample of seti cuda users with multi gpus in linux. Actually the sample is non-existent  :D

What Tye describes might be some faulty config, strange driver behavior, or some weird motherboard-gpu-gpu hardware incompatibility.

Looking at it a bit closer, it turns out that one 680i motherboard (an ASUS 680i MB) can work with the problem card in the primary slot and the other brand/model (a XFX 680i) cannot, so I think you're right and it's a weird motherboard-gpu issue (it's even crashy with just one of those gpus in the primary slot - but put a different one in the primary and move it to the secondary and it's fine).  I may look and see if there's a newer BIOS later, but since all three GPUs are stable and in different machines CUDA'ing away with the 185 drivers, I'll probably take a break from messing with them for awhile.  ;)  Plus I don't have any unused CUDA GPUs on hand to test with.
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 12 Jul 2009, 04:05:32 pm
To clarify:
- this host (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5018683) has 3 gpus.
- one nvidia 630 in the chipset and two on the 295 card.
- the display is on the 630 in the chipset.
- the 295 card is dedicated to cuda.
- also this machine is not a workstation; there is no xorg, only a text console, and I mostly work through ssh.

Your system sees three devices.

In your host 5018683, boinc doesn't even see your graphics cards. Are you sure that you have installed them correctly?

Also in both of your hosts upgrade boinc. 6.4.5 is too old.

I was actually doing some tests, so that's why the cuda devices disappeared.
6.4.5 is currently marked as stable on linux that's why I'm using it and as you can see there is no problem with the second host (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5011059).

The strange thing is that both hosts are exactly the same machine. The only difference is that host number 2 is newer (meaning it was built with the same components but about a week later) and got some new workunits that the older one has not tried yet.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 12 Jul 2009, 04:18:26 pm
6.4.5 is currently marked as stable on linux that's why I'm using it

Disregard that and get a newer version.

and as you can see there is no problem with the second host (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5011059).

It has problems, those "unspecified launch failure" errors.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 12 Jul 2009, 04:31:18 pm
Well, maybe you have a faster GPU than the user who has the issues?...
I have a GTX280 and he has a GTX295; each GTX295 core is more or less the same as a GTX280 under cuda. His saying that he is not running X makes the error message all the more strange.

I may look and see if there's a newer BIOS later

I would do the same.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 12 Jul 2009, 06:34:56 pm
I have a GTX280 and he has a GTX295; each GTX295 core is more or less the same as a GTX280 under cuda. His saying that he is not running X makes the error message all the more strange.
We can probably conclude from that that this error is not connected with watchdog timer expiration. Some other reason...
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 13 Jul 2009, 06:03:10 pm
6.4.5 is currently marked as stable on linux that's why I'm using it
Disregard that and get a newer version.
...

As you advised I did some tests with newer versions of boinc and got strange results:
- with 6.6.36 tasks don't run, they hang with "Waiting" status. So I enabled all debug flags in cc_config, but it gave me no answer as to what it is waiting for.
- with 6.6.20 tasks run, but both of them on a single gpu?

When I go back to 6.4.5, tasks run on both gpus but with errors. However, I noticed that the errors happen on only one gpu. So I thought it might be a hardware problem and decided to run some tests with cudamemtester. It ran for a few hours with no errors, but with default settings it does not run the long-duration change-detection test. I started this test and it showed a lot of errors on this single gpu (the second had no errors). Given the lack of good tools for further diagnosis in linux (nvclock has poor support for g200 chips) I decided to install windows on this box.

So far I have observed interesting behaviour. In idle mode both gpus and ram have lower clocks and voltage (0.975V); under load the clocks go up and the voltage of the first gpu goes to 1.035V, but the second one stays at 0.975V. Don't know how accurate the measurement in gpu-z is, but that's what it shows.

Currently I want to do some testing with the stock seti-cuda application for the next few weeks, with default voltage and with it manually set to 1.035V, and then I'll write some feedback.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 13 Jul 2009, 06:56:35 pm
- with 6.6.36 tasks don't run, they hang with "Waiting" status. So I enabled all debug flags in cc_config, but it gave me no answer as to what it is waiting for.

6.6.36 has an option checked by default in preferences to not run cuda while the pc is in use. Uncheck it and you should be ok.

As to the other stuff you mention, interesting. What brand/model are your video cards and motherboard?
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 14 Jul 2009, 04:35:00 am
MB is Asus M2N-VM DVI and VGA is EVGA GTX 295 with Backplate.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 14 Jul 2009, 08:48:58 am
Follow all steps (1-4) below:

1)  Use a newer boinc version. The latest is 6.6.36, http://boinc.berkeley.edu/download_all.php . I haven't checked it, I use 6.6.20, direct download link http://boinc.berkeley.edu/dl/boinc_6.6.20_x86_64-pc-linux-gnu.sh
2)  Make sure all the appropriate cuda libs from 2.2 toolkit

libcudart.so
libcudart.so.2
libcudart.so.2.2
libcufft.so
libcufft.so.2
libcufft.so.2.2

are in the projects/setiathome.berkeley.edu directory.

3)  Edit accordingly your ld.so.conf or the corresponding ld-something file of your distro with the above location of the cuda libs.

4)  Place a copy of the cuda client in one of the following locations:

/usr/local/sbin
/usr/local/bin
/usr/sbin
/usr/bin
/sbin
/bin
/usr/games
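
The library check in step 2 can be scripted. A minimal sketch, assuming the six file names listed above and that you pass in your own projects/setiathome.berkeley.edu directory; missing_cuda_libs is a made-up helper name.

```shell
#!/bin/sh
# Print each of the six CUDA 2.2 library names from step 2 that is
# missing from the given directory; print nothing if all are present.
missing_cuda_libs() {
    dir=$1
    for lib in libcudart.so libcudart.so.2 libcudart.so.2.2 \
               libcufft.so libcufft.so.2 libcufft.so.2.2; do
        [ -e "$dir/$lib" ] || echo "$lib"
    done
}

# Usage:
#   missing_cuda_libs /path/to/projects/setiathome.berkeley.edu
# After editing ld.so.conf (step 3), run ldconfig as root to rebuild the cache.
```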


I have done so and it cured my original problems nicely. However I have since added a usable card (tesla) and now i have some device mixups from boinc. i have posted this msg in the boinc linux forum and crunch3r's forum but so far no replies. i am hoping someone here may have an answer. it appears that boinc's device recognition/usage system is borked. I don't mention it in the msg below but I do have the cuda 2.2 sdk. here is the msg i posted:

I have a weird problem. The system works fine with just my vid card, or even just the Tesla telling Boinc to use only one card because device1 is the last one in the cmdline. But it falls apart trying to use both.

My setup is as follows:

1. Linux x86_64 running on a q6600 intel system
2. Video card is GTX 285 in first pci-e slot
3. Tesla C1060 is installed in 2nd pci-e slot
4. Boinc version is 6.6.36
5. Nvidia driver is 185.18.14
6. My number_of_gpus is set to 2. I had it at 1 and it made no difference in this behavior below.
6. I have use_all_gpus set to 1 assuming it is a true/false required.
7. I have this statement in my app_info.xml:

<coproc>
<type>CUDA</type>
<count>2</count>
</coproc>


When I start Boinc, it reports 2 Tesla cards instead of the proper ones. Older boincs properly identify both cards. If this were just a naming problem I could live with this but....
With the above coproc statement set to 1,
When I do a ps ax to look at my process list this is what I see:

7987 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0
7988 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0

and it uses the GTX285 for both simultaneously!

When I have the coproc statement set to 2, it uses the Tesla only and runs only 1 process. it has both device numbers but the GTX285 is not used:

10170 ? RNLl 0:07 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0 --device 1

How can I get this to do the right thing and provide me with processes like these using both cards?

setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0

setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 1


How can I fix this? I know others are using 2 cards successfully.


i am pulling my hair out over this to get it working.

Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 14 Jul 2009, 02:08:16 pm
...
<coproc>
<type>CUDA</type>
<count>2</count>
</coproc>
...

Start by changing the 2 in <count> to a 1.  This tag specifies how many GPUs the application (each instance) uses.  AFAIK so far they only ever use 1.  Other stuff, I'm sure some more Linux savvy people can help you with.

Jason
Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 14 Jul 2009, 03:12:00 pm
When I start Boinc, it reports 2 Tesla cards instead of the proper ones. Older boincs properly identify both cards. If this were just a naming problem I could live with this but....
With the above coproc statement set to 1,
When I do a ps ax to look at my process list this is what I see:

7987 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0
7988 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0

and it uses the GTX285 for both simultaneously!

When I have the coproc statement set to 2, it uses the Tesla only and runs only 1 process. it has both device numbers but the GTX285 is not used:

10170 ? RNLl 0:07 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0 --device 1

How can I get this to do the right thing and provide me with processes like these using both cards?

setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0

setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 1


How can I fix this? I know others are using 2 cards successfully.

Because Boinc versions greater than 6.6.25 only use the most capable GPU, use a cc_config.xml with this in it:

<cc_config>
  <options>
        <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
 

See How do I configure my client using the cc_config.xml file? (http://boincfaq.mundayweb.com/index.php?language=1&view=91) for more options and debug flags.
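
For anyone who prefers to create the file from a shell, a rough sketch follows. write_cc_config is a hypothetical helper, the directory you pass must be your own BOINC data directory, and it refuses to overwrite an existing cc_config.xml so that other options are not lost.

```shell
#!/bin/sh
# Write a minimal cc_config.xml that enables all GPUs into the given
# BOINC data directory. Fails without changes if the file already exists.
write_cc_config() {
    dir=$1
    [ -e "$dir/cc_config.xml" ] && return 1
    cat > "$dir/cc_config.xml" <<'EOF'
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
EOF
}

# Usage:
#   write_cc_config /path/to/your/boinc/data/dir
# Then restart the client. "boinccmd --read_cc_config" asks a running client
# to re-read the file, but a restart is the safer assumption for GPU options.
```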

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 14 Jul 2009, 04:54:55 pm
When I start Boinc, it reports 2 Tesla cards instead of the proper ones. Older boincs properly identify both cards. If this were just a naming problem I could live with this but....
With the above coproc statement set to 1,
When I do a ps ax to look at my process list this is what I see:

7987 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0
7988 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0

and it uses the GTX285 for both simultaneously!

When I have the coproc statement set to 2, it uses the Tesla only and runs only 1 process. it has both device numbers but the GTX285 is not used:

10170 ? RNLl 0:07 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0 --device 1

How can I get this to do the right thing and provide me with processes like these using both cards?

setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0

setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 1


How can I fix this? I know others are using 2 cards successfully.

Because BOINC versions greater than 6.6.25 use only the most capable GPU, use a cc_config.xml with this in it:

<cc_config>
  <options>
        <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
 

See How do I configure my client using the cc_config.xml file? (http://boincfaq.mundayweb.com/index.php?language=1&view=91) for more options and debug flags.

Claggy

Thanks for the reply! See my item #6 I do have that set in cc_config.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 14 Jul 2009, 05:05:12 pm
...
<coproc>
<type>CUDA</type>
<count>2</count>
</coproc>
...

Start by changing the 2 in <count> to a 1.  This tag specifies how many GPUs each instance of the application uses.  AFAIK so far they only ever use 1.  As for the other stuff, I'm sure some more Linux-savvy people can help you with it.

Jason


I agree, but if I do not set it to 2, then it feeds 2 workunits simultaneously to my GTX which is my device 0. my process list shows

7987 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0
7988 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0

by setting it to 2, it fools seti app into using only the tesla which is at the end of the device parameter list and it only feeds 1 wu at a time. so for now until i find out how to get boinc to set the second app to device 1, this is the most efficient setting. it gives this:

10170 ? RNLl 0:07 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0 --device 1

and the app parses things out and finds device 1 at the end so it uses that and ignores device 0.

i need to somehow make boinc recognize that there are 2 individual cards not the same in the system which they broke. it used to do that... and then to assign an individual seti app to each device.

i have a feeling this is broken boinc source and i *really* do not want to take the time to dig through to find where it discovers the devices, fix that and then fix the device assignments. my work load for work does not allow me that time needed :( . they had it working to report proper devices in an earlier version, and then it looks like someone decided to place the report on 2 lines and that broke it. however the device assignments were messed up even in that early version (6.6.20 i think it was).

i was hoping some kind of screwball configuration magic workaround would force it to do what i need, but i am beginning to doubt that.
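
A quick way to see what BOINC actually handed out is to pull the --device arguments from the process list. The sketch below is only an illustration: the helper name devices_per_process is made up, and it assumes, as described above, that the app ends up using the last --device it is given.

```python
import re

def devices_per_process(ps_output, app_marker="setiathome-6.08.CUDA"):
    """Map PID -> list of --device numbers on that process's command line.

    Hypothetical helper; expects text in the layout printed by `ps ax`
    (PID, TTY, STAT, TIME, then the command).
    """
    result = {}
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything that is not the CUDA app.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        if not any(app_marker in field for field in fields[4:]):
            continue
        result[int(fields[0])] = [int(n) for n in re.findall(r"--device\s+(\d+)", line)]
    return result

sample = """\
 7987 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0
 7988 ? RNLl 0:01 setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu --device 0
"""
assignments = devices_per_process(sample)
# Assumption taken from the post above: the app uses the last --device it sees.
effective = {pid: devs[-1] for pid, devs in assignments.items() if devs}
shared = len(set(effective.values())) < len(effective)
print(assignments, "sharing a GPU" if shared else "one GPU each")
```

On a live box the same function could be fed the text of `ps ax` instead of the sample string.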
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Jul 2009, 07:47:42 pm
Got it working! the formula to make 2 devices work simultaneously in linux is to use, of all things, the ancient 6.4.5 boinc! the device reporter is still borked. it reports 2 teslas instead of 1 gtx and 1 tesla, but it feeds 1 wu to each device like it is supposed to.
Title: Re: SETI MB CUDA for Linux
Post by: Kunin on 19 Jul 2009, 07:17:51 am
I can confirm the above, I tried it with 6.6.36, 6.6.20 and 6.4.5 and ONLY 6.4.5 worked correctly (one WU to one GPU).  Both 6.6.20 and 6.6.36 put both on the same device, and 6.6.36 I had to do funky things to get it working (had to install 6.6.20, run it, shut down, install 6.6.36 without removing anything, start 6.6.36... and repeat that any time I shut it down, VERY odd).

It could just be something with our OS, Ubuntu 8.10 x64, but for now 6.4.5 works... now I just need to find some way to get the CPU to do some work.
Title: Re: SETI MB CUDA for Linux
Post by: Josef W. Segur on 19 Jul 2009, 08:10:05 pm
If this has already been posted, please ignore, but the BOINC developers for some reason decided to use only the most capable GPU by default. Before that they had something which would use more than 1 only if they were identical, etc.

Anyhow, if you have more than one GPU and want BOINC 6.6.25 or later to use all, you need a cc_config.xml (http://boinc.berkeley.edu/wiki/Client_configuration) with the option <use_all_gpus>1</use_all_gpus>.
                                                                             Joe
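
For anyone who would rather script it, here is a minimal sketch that writes a cc_config.xml with just that option. The directory is an assumption (the BOINC data directory differs between installs), the function name is made up, and it overwrites any existing cc_config.xml, so merge by hand if you already have one.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def write_cc_config(data_dir):
    """Write a cc_config.xml containing only the use_all_gpus option.

    Overwrites an existing file of the same name in data_dir.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "use_all_gpus").text = "1"
    path = os.path.join(data_dir, "cc_config.xml")
    ET.ElementTree(root).write(path)
    return path

# Demo in a temporary directory. On a real install data_dir would be the
# BOINC data directory, whose location depends on the distro (assumption).
demo_dir = tempfile.mkdtemp()
cc_path = write_cc_config(demo_dir)
with open(cc_path) as f:
    print(f.read())
```

The client only picks the file up after a restart.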
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Jul 2009, 02:09:38 pm
If this has already been posted, please ignore, but the BOINC developers for some reason decided to use only the most capable GPU by default. Before that they had something which would use more than 1 only if they were identical, etc.

Anyhow, if you have more than one GPU and want BOINC 6.6.25 or later to use all, you need a cc_config.xml (http://boinc.berkeley.edu/wiki/Client_configuration) with the option <use_all_gpus>1</use_all_gpus>.
                                                                             Joe

I had that in cc_config since I installed the 2nd card, but it made no difference in how the 6.6.x boinc feeds the devices workunits. depending on the config you can feed 2 wu to the 1st device, or feed 1 wu to the 2nd device with --device 0 --device 1 on the cmdline. it is incapable of feeding one wu to each device the way the code behaves. In 6.6.20 it properly identified both devices and displayed such in the messages window but still reported to seti 2 of the 2nd device and did not properly feed the devices workunits. in 6.6.36 i noticed they changed it to display devices one per line and it broke, causing it to report 2 of the 2nd device, but it still had the same behavior as all the other 6.6.x in feeding devices.
it doesn't.
6.4.5 also reports 2 of the 2nd device but it feeds both devices properly.

Title: Re: SETI MB CUDA for Linux
Post by: Kunin on 21 Jul 2009, 04:45:53 pm
I'm sticking with 6.4.5 for now, combined with the CPU_GPU_rebrand_V5.pl script (on the forums here) I get all the functionality (and a little more) that I wanted out of 6.6.36, so that's good enough for now.  Later, when I have some free time, I want to take a look at the 6.6.36 code and see if I can't figure out why it's screwing up.
Title: Re: SETI MB CUDA for Linux
Post by: skip da shu on 21 Jul 2009, 11:00:24 pm
Was hoping somebody here might help a noob at this.

I downloaded the setiathome-CUDA-6.08.x86_64.tar.bz2 file and extracted it to my desktop. 
I copied all the files in the cudalibs folder to /usr/lib64.
I copied the setiathome-CUDA-6.08.x86_64-pc-linux-gnu and app_info.xml to /var/lib/boinc-client

I ran the ldd:
Code:
skip@c17-desktop:/var/lib/boinc-client/projects/setiathome.berkeley.edu$ ldd setiathome-CUDA-6.08.x86_64-pc-linux-gnu
linux-vdso.so.1 =>  (0x00007fff69fff000)
libcufft.so.2 => /usr/lib/libcufft.so.2 (0x00007f16618de000)
libcudart.so.2 => /usr/lib/libcudart.so.2 (0x00007f16616a0000)
libcuda.so.1 => /usr/lib/libcuda.so.1 (0x00007f16611ff000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f1660ef2000)
libm.so.6 => /lib/libm.so.6 (0x00007f1660c6d000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007f1660a51000)
libc.so.6 => /lib/libc.so.6 (0x00007f16606df000)
libdl.so.2 => /lib/libdl.so.2 (0x00007f16604db000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f16602c3000)
librt.so.1 => /lib/librt.so.1 (0x00007f16600bb000)
libz.so.1 => /lib/libz.so.1 (0x00007f165fea3000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1661bf8000)

I restarted BOINC.

GPUgrid is running/working on this machine (Ubuntu v9.04, 64b) with BOINC v6.4.5 from getdeb.

I get these messages:

Tue 21 Jul 2009 09:44:52 PM CDT||file projects/setiathome.berkeley.edu/libcudart.so.2 not found
Tue 21 Jul 2009 09:44:52 PM CDT||file projects/setiathome.berkeley.edu/libcufft.so.2 not found
Tue 21 Jul 2009 09:44:52 PM CDT||[error] No URL for file transfer of libcudart.so.2
Tue 21 Jul 2009 09:44:52 PM CDT||[error] No URL for file transfer of libcufft.so.2

and the BOINC transfers tab shows libcudart.so.2 and libcufft.so.2

Can anyone explain to me what I've missed?


For now I've removed the app_info.xml and aborted the xfers of the libcu* files

Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 22 Jul 2009, 04:29:28 am
Apparently BOINC can't find 2 required libraries.
Try putting these 3 libs in the same directory as the executable. I'm not sure whether Linux checks that location when loading shared libs, though...
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Jul 2009, 05:08:03 am
easiest way is to simply copy everything from the sdk lib dir into /var/lib/boinc and also into the project directory /var/lib/boinc/projects/<projectname>/

also depending on what your distro requires, add the projectname directory path to /etc/ld.so.conf or to whatever master file your distro says to and re-run ldconfig or env-update or whatever you have to scan the lib locations.

it will magically start working. my seti project dir contains these lib files:

lrwxrwxrwx 1 root wheel       17 2009-07-17 13:05 libcublasemu.so -> libcublasemu.so.2
lrwxrwxrwx 1 root wheel       19 2009-07-17 13:05 libcublasemu.so.2 -> libcublasemu.so.2.2
-rwxr-xr-x 1 root wheel  4704664 2009-07-09 08:00 libcublasemu.so.2.2
lrwxrwxrwx 1 root wheel       14 2009-07-17 13:06 libcublas.so -> libcublas.so.2
lrwxrwxrwx 1 root wheel       16 2009-07-17 13:05 libcublas.so.2 -> libcublas.so.2.2
-rwxr-xr-x 1 root wheel 18643744 2009-07-09 08:00 libcublas.so.2.2
lrwxrwxrwx 1 root wheel       14 2009-07-17 13:06 libcudart.so -> libcudart.so.2
lrwxrwxrwx 1 root wheel       16 2009-07-17 13:05 libcudart.so.2 -> libcudart.so.2.2
-rwxr-xr-x 1 root wheel   261472 2009-07-09 08:00 libcudart.so.2.2
lrwxrwxrwx 1 root wheel       16 2009-07-17 13:06 libcufftemu.so -> libcufftemu.so.2
lrwxrwxrwx 1 root wheel       18 2009-07-17 13:05 libcufftemu.so.2 -> libcufftemu.so.2.2
-rwxr-xr-x 1 root wheel   273064 2009-07-09 08:00 libcufftemu.so.2.2
lrwxrwxrwx 1 root wheel       13 2009-07-17 13:06 libcufft.so -> libcufft.so.2
lrwxrwxrwx 1 root wheel       15 2009-07-17 13:06 libcufft.so.2 -> libcufft.so.2.2
-rwxr-xr-x 1 root wheel  1153440 2009-07-09 08:00 libcufft.so.2.2


if you are not using the 2.2 cuda environment you will not have some of these files but no matter. as long as all files from the sdk lib dir are copied in it will work.
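
The "file ... not found" and "No URL for file transfer" messages suggest BOINC wants every file named in app_info.xml to sit in the project directory itself. A small sketch to list what is missing, under that assumption; the function name is made up and the demo uses a throwaway directory rather than a real BOINC install.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def missing_app_files(app_info_path, project_dir):
    """Return the <file_info> names from app_info.xml that are absent from project_dir.

    os.path.exists follows symlinks, so a dangling symlink counts as missing.
    """
    root = ET.parse(app_info_path).getroot()
    names = [info.findtext("name") for info in root.iter("file_info")]
    return [n for n in names if n and not os.path.exists(os.path.join(project_dir, n))]

# Demo: an app_info.xml naming the app and two libs, with only the app present.
demo = tempfile.mkdtemp()
app_info = os.path.join(demo, "app_info.xml")
with open(app_info, "w") as f:
    f.write(
        "<app_info>"
        "<file_info><name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</name><executable/></file_info>"
        "<file_info><name>libcudart.so.2</name><executable/></file_info>"
        "<file_info><name>libcufft.so.2</name><executable/></file_info>"
        "</app_info>"
    )
open(os.path.join(demo, "setiathome-CUDA-6.08.x86_64-pc-linux-gnu"), "w").close()

missing = missing_app_files(app_info, demo)
print(missing)
```

An empty list means every file the app_info.xml names is already in place.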
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Jul 2009, 05:13:05 am
I can confirm the above, I tried it with 6.6.36, 6.6.20 and 6.4.5 and ONLY 6.4.5 worked correctly (one WU to one GPU).  Both 6.6.20 and 6.6.36 put both on the same device, and 6.6.36 I had to do funky things to get it working (had to install 6.6.20, run it, shut down, install 6.6.36 without removing anything, start 6.6.36... and repeat that any time I shut it down, VERY odd).

It could just be something with our OS, Ubuntu 8.10 x64, but for now 6.4.5 works... now I just need to find some way to get the CPU to do some work.

i'm using the AK optimized MB app for my cpus and with this plus the cuda app in my app_info.xml my cpus and gpus are all happily crunching away. i don't know where it would be on this site but if it is not here i can give you a link to crunch3r's download page where i got mine.  my app name is

AK_V8_linux64_ssse3

they are highly optimized for the processor so be sure you get the correct one. a "cat /proc/cpuinfo" will get you the processor capabilities.
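
If you would rather not eyeball the cpuinfo output, the check can be sketched like this. The function name is made up and the sample text is a cut-down stand-in for a real /proc/cpuinfo.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()

# Cut-down sample; note that Linux lists SSE3 as "pni", while SSSE3 is "ssse3".
sample = "processor\t: 0\nmodel name\t: Example CPU\nflags\t\t: fpu sse sse2 pni ssse3\n"
flags = cpu_flags(sample)
print("ssse3 build usable:", "ssse3" in flags)
```

For the real thing, pass in the contents of /proc/cpuinfo, e.g. cpu_flags(open("/proc/cpuinfo").read()).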
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 22 Jul 2009, 08:40:46 am
After a catastrophic GPU failure (probably something burned inside my GTX280  :D)  that left me offline for a week, I'm back up and slowly catching up.

Since many linux users, lately, had problems with multi-GPU configs, I decided my next card would be a GTX 295 so I have a proper testbed to research things further. My various tweaks and experiments with xorg.conf or different video drivers didn't have any effect. Boinc still used only one of the two GPUs.

Then I started with the latest boinc (6.6.37) and moved backwards to see if I could find a boinc version that worked as expected.

I've found that the last boinc that can use both GPUs properly is 6.6.11. Boinc 6.6.12 doesn't even assign --device X flags to cuda instances. It just launches two of them and they (probably automatically) go straight to the first GPU. The next boinc linux release (6.6.15) corrects the missing --device X flags but it does this wrongly, it assigns both of them to --device 0.

So something in the transition 6.6.11 --> 6.6.12 --> 6.6.15 broke and stayed broken ever since with all subsequent boinc releases.

I'm using 6.6.11 for a few hours now with no apparent problems or bugs. Direct link http://boincdl.ssl.berkeley.edu/dl/boinc_6.6.11_x86_64-pc-linux-gnu.sh .

If someone remembers any showstopper bugs with 6.6.11 please tell.
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 22 Jul 2009, 09:11:34 am

If someone remembers any showstopper bugs with 6.6.11 please tell.


That was just before the bombshell about not supporting app_info.xml properly. v6.6.12 doesn't support app_info, v6.6.14 started to put it back together (there was no .13). So you may have problems getting v6.6.11 working with both CUDA and an optimised CPU app, or indeed Astropulse.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Jul 2009, 09:30:32 am
I just tried 6.6.11. it recognizes both devices just fine and properly i might add. however, i can only assume since i have a very large backlog of uploads, it refuses to do anything. both cuda processes are "waiting to run" and it does not load the gpu apps. with the upload clogs the way they are this is not acceptable so i went back to 6.4.5 to at least finish the existing backlog of waiting wu i have.

after things are caught up (if that ever happens :) ) i will try it again.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 22 Jul 2009, 09:38:28 am
That was just before the bombshell about not supporting app_info.xml properly. v6.6.12 doesn't support app_info, v6.6.14 started to put it back together (there was no .13). So you may have problems getting v6.6.11 working with both CUDA and an optimised CPU app, or indeed Astropulse.

Now that you mention it, I think all the newly downloaded wus went to cuda 6.08. No assignments to 6.03. Maybe it didn't ask?  :-\. When did boinc start supporting both 6.03 and 6.08 concurrently?
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 22 Jul 2009, 09:49:24 am

Now that you mention it, I think all the newly downloaded wus went to cuda 6.08. No assignments to 6.03. Maybe it didn't ask?  :-\. When did boinc start supporting both 6.03 and 6.08 concurrently?


I'd need to check if you need a detailed answer, but I think:

v6.6.1 for stock applications
v6.6.14 with app_info.xml

Edit: v6.4.5 allows you to specify and run both apps, but will only fetch for the higher-numbered one.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 22 Jul 2009, 10:16:36 am
I'd need to check if you need a detailed answer, but I think:

v6.6.1 for stock applications
v6.6.14 with app_info.xml

Edit: v6.4.5 allows you to specify and run both apps, but will only fetch for the higher-numbered one.

If you could find when boinc started to fetch for both 6.03 and 6.08, I would be very grateful.

I'll keep monitoring 6.6.11 to see how it behaves. If it can't fetch for 6.03, I'll have to do it manually till a fixed boinc is released. I'll try to report the bug in boinc_alpha and I hope I'll get some attention.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Jul 2009, 10:41:23 am
I'd need to check if you need a detailed answer, but I think:

v6.6.1 for stock applications
v6.6.14 with app_info.xml

Edit: v6.4.5 allows you to specify and run both apps, but will only fetch for the higher-numbered one.

If you could find when boinc started to fetch for both 6.03 and 6.08, I would be very grateful.

I'll keep monitoring 6.6.11 to see how it behaves. If it can't fetch for 6.03, I'll have to do it manually till a fixed boinc is released. I'll try to report the bug in boinc_alpha and I hope I'll get some attention.

because of my upload backlog nothing gets anything, but once that is cleared up if 6.4.5 only gets for 6.03 which was its behavior before, then all i will do is use 6.6.37 to get 4 or 5 days worth of units then switch back to 6.4.5 to properly process them. a pain but at least it works. that will be after i try 6.6.11 to see what it does.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Jul 2009, 04:34:32 pm
all my uploads finished so i switched to 6.6.11. so far it is getting only cpu workunits but i suspect that is because i still have a ton of unfinished gpu workunits. it detects my hardware properly and it sends 1 wu to each device :)

LOL I just checked my servers page and although it does everything right locally it is just the opposite of all the others in reporting! instead of what others report 2-teslas, this one reports to the page or at least the page displays 2 gtx285! oh well. minor problem affecting nothing but my ego :)

will see what happens as cuda units get finished to see if it gets both.

so far it is looking really good!

and yes I am running app_info.xml using the AK cpu multibeam application and the cuda 2.2 vlar killer app. i also am running the cpu-gpu script but so far i do that manually until i get the ratios adjusted properly. set it too far before and wound up with no cpu apps and all gpu apps.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Jul 2009, 05:18:08 pm
it got a few cuda workunits so it is fetching both. it appears 6.6.11 works fine.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 22 Jul 2009, 06:20:46 pm
LOL I just checked my servers page and although it does everything right locally it is just the opposite of all the others in reporting! instead of what others report 2-teslas, this one reports to the page or at least the page displays 2 gtx285! oh well. minor problem affecting nothing but my ego :)

In my case it reports correctly "[2] NVIDIA GeForce GTX 295 (895MB)"

it got a few cuda workunits so it is fetching both. it appears 6.6.11 works fine.

The big question is not if it gets cuda workunits, but if it ALSO gets workunits for cpu AK_V8.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Jul 2009, 06:28:31 pm
yeah it got near 100 cpu AK workunits before it got the cuda ones. it is downloading both properly. it seems to work just fine.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 22 Jul 2009, 07:01:32 pm
yeah it got near 100 cpu AK workunits before it got the cuda ones. it is downloading both properly. it seems to work just fine.

Well, then, there is nothing more we could ask. 6.6.11 it is for multi-gpus in linux.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 22 Jul 2009, 10:31:43 pm
Well, then, there is nothing more we could ask. 6.6.11 it is for multi-gpus in linux.

I've been looking, but haven't found exactly how to write my app_info.xml to be able to crunch MB wus with both CPU and GPU - can you point me in the right direction?

Right now my app_info.xml lets me do the optimized AP with the CPU and the CUDA MB with the GPU only, but I do run other projects as well.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 23 Jul 2009, 04:27:00 am
I also confirm that 6.6.11 gets both CPU and GPU multibeam workunits.

Tye, this is the seti section of my app_info.xml. Mind you that your cpu or your cuda client might have a different name:

    <app>
        <name>setiathome_enhanced</name>
    </app>
    <file_info>
        <name>AK_V8_linux64_ssse3</name>
        <executable/>
    </file_info>
    <file_info>
        <name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libcudart.so.2</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libcufft.so.2</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>603</version_num>
        <flops>5634219710.66940475</flops>
        <avg_ncpus>1.0000</avg_ncpus>
        <max_ncpus>1.0000</max_ncpus>
    <file_ref>
        <file_name>AK_V8_linux64_ssse3</file_name>
        <main_program/>
    </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>608</version_num>
        <plan_class>cuda</plan_class>
        <flops>19317324722.295102</flops>
        <avg_ncpus>1.0000</avg_ncpus>
        <max_ncpus>1.0000</max_ncpus>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>libcudart.so.2</file_name>
    </file_ref>
    <file_ref>
        <file_name>libcufft.so.2</file_name>
    </file_ref>
    </app_version>
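
To double-check that a file like this really offers both a CPU and a CUDA version of the app, the app_version blocks can be summarised with a few lines of script. This is a sketch only: the function name is made up, and the fragment below is a trimmed copy of the one above, wrapped in an <app_info> root so it parses on its own.

```python
import xml.etree.ElementTree as ET

def app_versions(app_info_xml):
    """Return (version_num, plan_class, main_program, cuda_count) per <app_version>."""
    root = ET.fromstring(app_info_xml)
    rows = []
    for version in root.iter("app_version"):
        main = None
        for ref in version.findall("file_ref"):
            if ref.find("main_program") is not None:
                main = ref.findtext("file_name")
        cuda = 0
        for coproc in version.findall("coproc"):
            if coproc.findtext("type") == "CUDA":
                cuda = int(coproc.findtext("count", "0"))
        # An empty plan_class is treated here as the plain CPU version.
        rows.append((version.findtext("version_num"), version.findtext("plan_class", ""), main, cuda))
    return rows

fragment = """
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <file_ref><file_name>AK_V8_linux64_ssse3</file_name><main_program/></file_ref>
</app_version>
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>608</version_num>
    <plan_class>cuda</plan_class>
    <coproc><type>CUDA</type><count>1</count></coproc>
    <file_ref><file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name><main_program/></file_ref>
    <file_ref><file_name>libcudart.so.2</file_name></file_ref>
</app_version>
"""
rows = app_versions("<app_info>" + fragment + "</app_info>")
for row in rows:
    print(row)
```

One row without a plan_class and one with cuda and a count of 1 is the layout this thread ended up with.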
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Jul 2009, 05:25:41 am
is the flops section of the app info file really important? i'm not using it. my file is below. does the addition of the flops help with processing efficiency?

<app_info>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>AK_V8_linux64_ssse3</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>603</version_num>
<file_ref>
<file_name>AK_V8_linux64_ssse3</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</name>
<executable/>
</file_info>
<file_info>
<name>libcudart.so.2</name>
<executable/>
</file_info>
<file_info>
<name>libcufft.so.2</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<plan_class>cuda</plan_class>
<avg_ncpus>0.250000</avg_ncpus>
<max_ncpus>0.250000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>libcudart.so.2</file_name>
</file_ref>
<file_ref>
<file_name>libcufft.so.2</file_name>
</file_ref>
</app_version>
</app_info>
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Jul 2009, 05:43:09 am
slightly off topic but possibly relevant.

does the cpu-gpu perl script V5 available in another topic actually catch vlars and vhars? maybe it doesn't report them because i have not seen it report any yet. a little concerned because i have had several computation error workunits from the 'vlar-killer 2.2 cuda' app and with task viewing turned off at seti i cannot tell.

the other thing is would it be better to have my ratio set so that boinc never sees a shortage of cuda workunits so it never gets cuda workunits but only cpu workunits and then let this script supply the gpu work? i would think if it handles vlars it would be the most efficient way to never get one scheduled for cuda?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 23 Jul 2009, 05:44:01 am
is the flops section of the app info file really important? i'm not using it. my file is below. does the addition of the flops help with processing efficiency?

It is important, but not for processing efficiency. The flops numbers help boinc calculate better estimated computation times, so it can plan ahead and decide whether or not to download new workunits to fill the cache. Fairly accurate estimated computation times help stabilize the Duration Correction Factor and bring boinc to a balanced state.

It is also the other way around. You see, Duration Correction Factor, estimated computation times, balanced boinc, good cache management are all interconnected and flops numbers in app_info.xml help to bring all these in a balanced state.

If you think that your boinc is balanced and steady without the flops numbers, then probably you don't need them. Otherwise you have to put them in.
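
As a rough illustration of why the numbers matter: as far as I understand it, the client's runtime estimate is approximately the task's estimated operation count divided by the app version's flops, scaled by the Duration Correction Factor. Treat the sketch below as a simplification. The task size is a made-up figure, not taken from a real workunit; the two flops values are the ones from the app_info.xml fragment posted earlier.

```python
def estimated_seconds(fpops_est, flops, dcf=1.0):
    """Simplified estimate: operations / (operations per second) * DCF."""
    return fpops_est / flops * dcf

FPOPS_EST = 3.0e13                 # hypothetical task size, for illustration only
CPU_FLOPS = 5634219710.66940475    # <flops> of the 603 (CPU) app_version
GPU_FLOPS = 19317324722.295102     # <flops> of the 608 (cuda) app_version

cpu_est = estimated_seconds(FPOPS_EST, CPU_FLOPS)
gpu_est = estimated_seconds(FPOPS_EST, GPU_FLOPS)
print(round(cpu_est), round(gpu_est), round(cpu_est / gpu_est, 2))
```

Without separate flops values the two estimates have nothing to tell the apps apart, which is presumably why the Duration Correction Factor gets pulled back and forth.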

Edit:
does the cpu-gpu perl script V5 available in another topic actually catch vlars and vhars? maybe it doesn't report them because i have not seen it report any yet. a little concerned because i have had several computation error workunits from the 'vlar-killer 2.2 cuda' app and with task viewing turned off at seti i cannot tell.
I haven't used that script. Might be that vlar_kill has a greater vlar angle range than the script is looking for? To see if it is working correctly, after a script run that doesn't report anything, make a manual search in the workunits directory to see how many files(workunits) contain the text <true_angle_range>0.0 and then see if some of them are still assigned to cuda.


the other thing is would it be better to have my ratio set so that boinc never sees a shortage of cuda workunits so it never gets cuda workunits but only cpu workunits and then let this script supply the gpu work? i would think if it handles vlars it would be the most efficient way to never get one scheduled for cuda?
In both cases you don't avoid using the script. Choose whatever suits you better.
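
The manual search mentioned above can be sketched as a few lines of script. The function name is made up and the demo runs against a throwaway directory; on a real box you would point it at the project directory holding the workunit files. It only lists the matching files, checking which of them are assigned to cuda is still up to you.

```python
import os
import tempfile

def find_vlar_candidates(directory, needle="<true_angle_range>0.0"):
    """Return sorted names of regular files in directory whose contents contain needle."""
    hits = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            if needle.encode() in f.read():
                hits.append(name)
    return hits

# Demo with two fake workunit files (names and angle values are invented).
demo = tempfile.mkdtemp()
with open(os.path.join(demo, "wu_low_angle"), "w") as f:
    f.write("<true_angle_range>0.0087</true_angle_range>")
with open(os.path.join(demo, "wu_mid_angle"), "w") as f:
    f.write("<true_angle_range>0.4321</true_angle_range>")

candidates = find_vlar_candidates(demo)
print(candidates)
```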
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Jul 2009, 07:21:38 am
wow i've never given any of that any thought. never even knew of the term balanced boinc. i just install it and let it do its thing. it seems to get and process and send workunits ok but i suppose there is something to be said for 'balancing it out'. maybe i'll mess with that a bit.

time will tell on the script. ill search for that phrase in what i have now cause i dont like killing off units like that and my tesla cannot take them. its an early engineering version of it and it locks right up on vlars. this is the main reason i chose to run the vlar killer app to protect it.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 23 Jul 2009, 07:27:47 am
my tesla cannot take them. its an early engineering version of it and it locks right up on vlars. this is the main reason i chose to run the vlar killer app to protect it.

Every GPU, in every OS, windows or linux has problems with VLARs.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 23 Jul 2009, 10:46:46 pm
I see this line:

<file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name>

But I only have the old app without the 2.2 part - where are you getting this CUDA app?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 24 Jul 2009, 03:16:25 am
I see this line:

<file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name>

But I only have the old app without the 2.2 part - where are you getting this CUDA app?

Get it from http://calbe.dw70.de/mb/viewtopic.php?f=9&t=110
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Jul 2009, 06:50:15 am
I see this line:

<file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name>

But I only have the old app without the 2.2 part - where are you getting this CUDA app?

sunu reported the url, but a reminder you need to install the 2.2 cuda sdk and tools too and update the libs in every directory you have them from the sdk lib dir. especially /usr/lib64, the boinc dir and  the seti project dir and make sure those paths are in ld.so.conf and rerun ldconfig.
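
After rerunning ldconfig, one way to confirm the cache really picked up the CUDA libs is to look them up in the output of `ldconfig -p`. A sketch with a made-up function name and a hand-written sample in the usual "name (type) => path" layout:

```python
def cached_library_paths(ldconfig_output, soname):
    """Return the paths that `ldconfig -p` style output lists for soname."""
    paths = []
    for line in ldconfig_output.splitlines():
        name_part, arrow, path = line.partition(" => ")
        words = name_part.split()
        # Lines without " => " (such as the summary header) are ignored.
        if arrow and words and words[0] == soname:
            paths.append(path.strip())
    return paths

sample = """\
2 libs found in cache `/etc/ld.so.cache'
\tlibcufft.so.2 (libc6,x86-64) => /usr/lib64/libcufft.so.2
\tlibcudart.so.2 (libc6,x86-64) => /usr/lib64/libcudart.so.2
"""
print(cached_library_paths(sample, "libcudart.so.2"))
print(cached_library_paths(sample, "libcublas.so.2"))
```

An empty list for libcudart.so.2 or libcufft.so.2 would mean the directory is not in ld.so.conf or ldconfig has not been rerun yet.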
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 24 Jul 2009, 07:40:40 am
sunu reported the url, but a reminder you need to install the 2.2 cuda sdk and tools too and update the libs in every directory you have them from the sdk lib dir. especially /usr/lib64, the boinc dir and  the seti project dir and make sure those paths are in ld.so.conf and rerun ldconfig.

Oh yes, forgot about that. You'll also need a cuda 2.2 compatible nvidia driver. Get it in http://www.nvidia.com/object/linux_display_amd64_185.18.14.html
Title: Re: SETI MB CUDA for Linux
Post by: pp on 24 Jul 2009, 08:05:31 am
Any noticeable speedup?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 24 Jul 2009, 08:42:16 am
Any noticeable speedup?

There is a small one.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 24 Jul 2009, 02:25:48 pm
Will it work with the new 190 driver and CUDA 2.3?

With the 185 driver and 2.2 libraries, I tried the 2.2 VLARkill app and putting the libraries in /usr/lib, the boinc folder, the proj*/setiat* folder, and my ~/bin but I kept getting computation errors with the 2.2 VLARkill app.  Argh.

I couldn't download the tools since they are now offering the 2.3 tools, hence the 2.3 question...
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 24 Jul 2009, 02:28:19 pm
BTW, this is the error info that shows up with such a workunit:

<core_client_version>6.6.11</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

SETI@home MB CUDA_2.2 608 Linux 64bit SM 1.0 - r12 by Crunch3r :p
VLAR autokill mod

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 9600 GSO
           totalGlobalMem = 805044224
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1350000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 12
SIGSEGV: segmentation violation
Stack trace (16 frames):
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu(boinc_catch_signal+0x43)[0x485ef3]
/lib/libpthread.so.0[0x7fc1f6ee9080]
/usr/lib/libcuda.so.1[0x7fc1f7ebb020]
/usr/lib/libcuda.so.1[0x7fc1f7ec0d84]
/usr/lib/libcuda.so.1[0x7fc1f7e8a10f]
/usr/lib/libcuda.so.1[0x7fc1f7c15b3b]
/usr/lib/libcuda.so.1[0x7fc1f7c2646b]
/usr/lib/libcuda.so.1[0x7fc1f7c0e211]
/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7fc1f7c07faa]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x5bf335]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x413c5b]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x41f68d]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x42b54d]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x408707]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7fc1f6b865a6]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu(__gxx_personality_v0+0x219)[0x408349]

Exiting...

</stderr_txt>
]]>
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 24 Jul 2009, 03:09:55 pm
Will it work with the new 190 driver and CUDA 2.3?

I'm not too sure. I'm having some bad results.

With the 185 driver and 2.2 libraries, I tried the 2.2 VLARkill app and putting the libraries in /usr/lib, the boinc folder, the proj*/setiat* folder, and my ~/bin but I kept getting computation errors with the 2.2 VLARkill app.  Argh.

Are you sure that you've done everything from http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg19014.html#msg19014 except for number1 if you have a multi-gpu setup?
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 24 Jul 2009, 03:27:10 pm
Argh - sorry, I'm a dumbass!  Forgot to put the executable in my $PATH (in my case ~/bin).  Double-argh.  Will test again once I get some WUs, but I think that should do the trick (it did last time - you'd think I'd learn!).  Triple-argh.

Still only single GPU, but I've got the CPU and GPU both doing MB stuff thanks to you guys now.  Will pick up another GPU at some point though.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 24 Jul 2009, 04:57:59 pm
Still only single GPU, but I've got the CPU and GPU both doing MB stuff thanks to you guys now.  Will pick up another GPU at some point though.

Multi-gpu problems of the past with seti CUDA are pretty much solved now in linux.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 24 Jul 2009, 06:28:31 pm
Still only single GPU, but I've got the CPU and GPU both doing MB stuff thanks to you guys now.  Will pick up another GPU at some point though.

Multi-gpu problems of the past with seti CUDA are pretty much solved now in linux.

Yep, seems like 6.6.11 is the way to go - already using it to be prepared.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 25 Jul 2009, 06:39:54 am
stupid question.. what  sets boinc's timing to report completed workunits? is it based on the number of days cache of work? it seems every time i look at it there are more than 100 wu ready to report and it doesn't seem to want to do that so i keep hitting manual update. i am presently keeping 6 days of work to do because of the recent problems at setihome. maybe i should drop it back to 4 or something?

plus i don't think that V5 script is working for me to grab VLARS as there are always 4 or 5 cuda computation error wu every time i look at it. i have never seen the script report them to me.

Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 25 Jul 2009, 06:50:10 am

stupid question.. what  sets boinc's timing to report completed workunits?


There's a whole long list of possible triggers (see John McLeod VII's post at SETI for all the gory details), but in general no 'ready to report' task should hang around on your computer for longer than 24 hours. So they should disappear at least once per day without you doing anything.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 25 Jul 2009, 09:47:20 am

stupid question.. what  sets boinc's timing to report completed workunits?


There's a whole long list of possible triggers (see John McLeod VII's post at SETI for all the gory details), but in general no 'ready to report' task should hang around on your computer for longer than 24 hours. So they should disappear at least once per day without you doing anything.

ok i wont touch it and wait till this time tomorrow to look to see if it has reported.
will check out that reference. thanks!

Title: Re: SETI MB CUDA for Linux
Post by: Josef W. Segur on 25 Jul 2009, 03:44:23 pm
The BOINC FAQ Service has very similar info in http://boincfaq.mundayweb.com/index.php?language=1&view=68, though jm7 has been more directly involved in the scheduling code development and lists things a little differently.
                                                                                    Joe
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 26 Jul 2009, 07:12:19 am
Thanks a lot for your tips, sunu,

it runs very well now and I have got my first valid WUs on my GTS250 (about 5~10 min per WU). It uses about 2% of CPU time to run, so I can crunch 2 other seti WUs on my C2D E7300.

Bye

After crunching other projects for some months, I restarted SETI@GPU just this morning, but using the initial CUDA build for Linux which uses 100% of one CPU core as well...
So these new versions, that can be found in Crunch3r's board, will use only a few % of one CPU? That would be awesome...

I'm currently still at 180.60, some 2.6.30 rc and 6.6.17, but willing to update if I could free up that core with a newer version of the app :)
Title: Re: SETI MB CUDA for Linux
Post by: pp on 27 Jul 2009, 01:21:19 pm
After crunching other projects for some months, I restarted SETI@GPU just this morning, but using the initial CUDA build for Linux which uses 100% of one CPU core as well...
So these new versions, that can be found in Crunch3r's board, will use only a few % of one CPU? That would be awesome...

I'm currently still at 180.60, some 2.6.30 rc and 6.6.17, but willing to update if I could free up that core with a newer version of the app :)

I can confirm that setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu no longer takes 100% CPU (only about 2-4% now on this PC). That's with CUDA 2.2 - I haven't been brave enough  to upgrade to CUDA 2.3 yet. Also using nVidia 185.18.14 and BOINC 6.9.0. The only downside so far is that the CPU time column now only shows actual CPU time used which is only a couple of minutes during the 19 minutes CUDA run. So no good way of checking exactly how long a WU takes now unless I monitor the clock manually.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 27 Jul 2009, 02:59:42 pm
I can confirm that setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu no longer takes 100% CPU (only about 2-4% now on this PC). That's with CUDA 2.2 - I haven't been brave enough  to upgrade to CUDA 2.3 yet. Also using nVidia 185.18.14 and BOINC 6.9.0. The only downside so far is that the CPU time column now only shows actual CPU time used which is only a couple of minutes during the 19 minutes CUDA run. So no good way of checking exactly how long a WU takes now unless I monitor the clock manually.

Yup, that's been bothering me too.  I'm wondering if there's a way to trick it into reporting clock time rather than cpu time...  I'm using nvidia 185.18.14 and BOINC 6.6.11 btw since I'd like to do multi-GPU here soon.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 27 Jul 2009, 03:45:43 pm
I can confirm that setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu no longer takes 100% CPU (only about 2-4% now on this PC). That's with CUDA 2.2 - I haven't been brave enough  to upgrade to CUDA 2.3 yet. Also using nVidia 185.18.14 and BOINC 6.9.0. The only downside so far is that the CPU time column now only shows actual CPU time used which is only a couple of minutes during the 19 minutes CUDA run. So no good way of checking exactly how long a WU takes now unless I monitor the clock manually.

Yup, that's been bothering me too.  I'm wondering if there's a way to trick it into reporting clock time rather than cpu time...  I'm using nvidia 185.18.14 and BOINC 6.6.11 btw since I'd like to do multi-GPU here soon.

6.6.37 was reporting proper cpu/gpu times, but when i went back to 6.6.11 to use multiple devices that time reporting broke. i am not sure if adding a flops statement in app_info.xml  will help with that or not.

for the ability to use multiple devices not having proper reporting time is an irritant but one i will gladly put up with until there is a newer version with fixed device code.

Title: Re: SETI MB CUDA for Linux
Post by: koschi on 27 Jul 2009, 04:58:04 pm
Follow all steps (1-4) below:

1)  Use a newer boinc version. The latest is 6.6.36, http://boinc.berkeley.edu/download_all.php . I haven't checked it, I use 6.6.20, direct download link http://boinc.berkeley.edu/dl/boinc_6.6.20_x86_64-pc-linux-gnu.sh
2)  Make sure all the appropriate cuda libs from 2.2 toolkit

libcudart.so
libcudart.so.2
libcudart.so.2.2
libcufft.so
libcufft.so.2
libcufft.so.2.2

are in the projects/setiathome.berkeley.edu directory.

3)  Edit accordingly your ld.so.conf or the corresponding ld-something file of your distro with the above location of the cuda libs.

4)  Place a copy of the cuda client in one of the following locations:

/usr/local/sbin
/usr/local/bin
/usr/sbin
/usr/bin
/sbin
/bin
/usr/games


Thanks a lot, especially number 4 I would have never tried, its a little insane  ;D
One unit was already successfully validated. The CPU usage on my Q6600 is at roughly 1-2s/1min wall clock time.
The default priority of nice 10 seems to slow the process down on my box, once I switched it to 0 or -5, it processed much faster and collected up CPU time quicker.
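
[Note:] Step 2 of the checklist quoted above can be checked with a small sketch like the one below. The six library names are the ones listed in that step; the function name missing_cuda_libs and the variable names are made up for illustration.

```shell
# The six library names come from step 2 of the checklist quoted above.
CUDA_LIBS="libcudart.so libcudart.so.2 libcudart.so.2.2 libcufft.so libcufft.so.2 libcufft.so.2.2"

# Hypothetical helper: print every expected CUDA library that is absent
# from the given directory; returns non-zero if anything is missing.
missing_cuda_libs() {
    dir="$1"
    rc=0
    for lib in $CUDA_LIBS; do
        # -e follows symlinks, so a dangling libcudart.so link counts as missing
        if [ ! -e "$dir/$lib" ]; then
            echo "missing: $dir/$lib"
            rc=1
        fi
    done
    return $rc
}
```

Point it at your own projects/setiathome.berkeley.edu directory. If it prints nothing, the files are there, and what remains is step 3 (the ld.so.conf entry, followed by running ldconfig as root).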
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 27 Jul 2009, 08:28:21 pm
hmm maybe i should try that. im still using nice 19. im only using 3 of my cpus leaving 1 to handle both gpu apps and my desktop. that seems to be most efficient, running cpus at 100% i found the times dropped from an average of 1 hr 25 min to an hour or under. and thats at nice 19.. maybe i should play a bit in priorities.
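
[Note:] Before playing with priorities it helps to see what nice value a task is actually running at. A minimal Linux-only sketch, assuming /proc is mounted; the function name nice_of is made up for illustration.

```shell
# Hypothetical helper: print the nice value of a running process by reading
# field 19 of /proc/<pid>/stat (Linux only).
nice_of() {
    # Strip "pid (comm) " first, because the command name may contain spaces;
    # the nice value is then the 17th remaining field.
    sed 's/^.*) //' "/proc/$1/stat" | awk '{ print $17 }'
}
```

Feed it the PID of the cuda app. Changing the value is then a matter of something like renice 0 -p <pid>; going below 0 normally needs root.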
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Jul 2009, 02:46:08 pm
off topic but i thought if anyone would know it would be here.

is there any utility for x86_64 besides lm_sensors that will properly monitor the GTX285 fans?

I assume this is done with the ADT7473 chip. it is enabled in the kernel but lm_sensors reports back "no driver for Analog Devices ADT7473 yet" and also "unknown adapter NVIDIA i2c adapter".

temps are monitoring just fine i just cannot see the fans.
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 28 Jul 2009, 04:32:09 pm
Try nvclock: http://www.linuxhardware.org/nvclock/#download
The current 0.8b4 is already part of Ubuntu Jaunty, if you are using some older version or another flavour of Linux you might want to compile it yourself, if it isn't available at that level. Around December/January quite some progress has been made in that tool, though it's still not supporting all nvidia cards.
My 8800GTS also uses the ADT7473, temperature readings show an offset of 8°C to the nvidia-settings temperature reading, fan controlling is working.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Jul 2009, 06:02:08 pm
Try nvclock: http://www.linuxhardware.org/nvclock/#download
The current 0.8b4 is already part of Ubuntu Jaunty, if you are using some older version or another flavour of Linux you might want to compile it yourself, if it isn't available at that level. Around December/January quite some progress has been made in that tool, though it's still not supporting all nvidia cards.
My 8800GTS also uses the ADT7473, temperature readings show an offset of 8°C to the nvidia-settings temperature reading, fan controlling is working.

i have 8b4 and i set the fans manually with it but it cannot handle the gtx285 yet:
nvclock --info
It seems your card isn't officialy supported in NVClock yet.
The reason can be that your card is too new.
If you want to try it anyhow [DANGEROUS], use the option -f to force the setting(s).
NVClock will then assume your card is a 'normal', it might be dangerous on other cards.
Also please email the author the pci_id of the card for further investigation.
[Get that value using the -i option].


i am looking for something with support for the ADT7473 as a module or interface so  i can display  them in gkrellm on my desktop. i dont know how gkrellm gets the gpu temps but it reports them accurately. maybe ill send bill (the author of gkrellm) a note to ask if he knows how to monitor them directly.

Title: Re: SETI MB CUDA for Linux
Post by: Kunin on 29 Jul 2009, 01:41:09 pm
I see this line:

<file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name>

But I only have the old app without the 2.2 part - where are you getting this CUDA app?

Get it from http://calbe.dw70.de/mb/viewtopic.php?f=9&t=110

Will that work better with CUDA 2.3 than the one at the start of this thread?

I tried to go to the link, but every time I go to calbe.dw70.de I get access denied... is there another place to get it?
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 30 Jul 2009, 02:12:51 am
Earlier versions of nvclock I had to convince with -f to read the data of my card, but it worked quite well...

Has anyone tried the CUDA 2.2 client together with 190.xx drivers and the CUDA 2.3 dlls, if there is some speed-up like under Windows?
Title: Re: SETI MB CUDA for Linux
Post by: Kunin on 30 Jul 2009, 07:07:29 am
I'm currently running CUDA 2.3 with the app from the start of this thread, and there does seem to be a speed-up.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Jul 2009, 12:30:42 pm
does anyone  know if there is a cuda 2.3 vlarkill x86_64 app available yet? i am switching everything to 2.3 and the 190 driver today.
Title: Re: SETI MB CUDA for Linux
Post by: letni on 31 Jul 2009, 01:14:06 pm
I'm trying to get CUDA working (with the 32 bit binary posted at message 1 of this thread) with my new 8600GTS in Slackware Linux and I'm having issues.. I have run the nvidia installer, etc, but I get some weird errors..

1. The output shows I have a cuda device, however, it says I have revision 0 of the driver installed, even though I have installed the 185.18.14 and upgraded to the 185.18.31..
 CUDA device: GeForce 8600 GTS (driver version 0, compute capability 1.1, 255MB, est. 18GFLOPS).
2. I have modified my app_info.xml to allow both AK_V8_SSE3 (32bit) and the cuda to run simultaneously (included .xml file).. I have 3 active tasks being worked on, two (for my dual CPU) say setiathome_enhanced 6.03 and run just fine.  The third is the CUDA one, which says setiathome_enhanced 6.08 (cuda), and its status NEVER goes past Ready to start. It will eventually error out with Computation error.
3. I have tried the CUDA toolkit 2.0 and 2.2 based libs (put them in the projects/setiathome.berkeley.edu directory).

4. I'm not using XWindows at all.  This is all console based only.. Is Xorg required to be running to utilize CUDA?

Anyone have any thoughts or advice on how to debug this?


[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Jul 2009, 01:52:03 pm
will look things over more closely later after work but what version of boinc are you running?

x is not required. the only requirement there is that the nvidia driver loads properly.

you placed the libs in the project dir but did you make sure the project dir is included in your system ld.so.conf file? and did you re-run ldconfig?

try ldd <setiathome application name> and make sure there are no errors and every line points to a library.
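
[Note:] The ldd check above can be narrowed down to just the problem lines with a small filter. This is a sketch only; the function name unresolved_libs is made up, and it relies on ldd marking unresolved libraries with "not found".

```shell
# Hypothetical helper: read `ldd` output on stdin and print the names of
# libraries the dynamic linker could not resolve ("not found" lines).
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Usage would be along the lines of ldd ./<setiathome application name> | unresolved_libs. Empty output means every library resolved.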

Title: Re: SETI MB CUDA for Linux
Post by: koschi on 31 Jul 2009, 02:31:03 pm
Could you provide a link to your host, or some failed work units?
If you follow the instructions provided by sunu and make sure the Nvidia modules are loaded, then it will also work for you :)

I'm running the 190.18 driver with CUDA 2.3 libraries and 2.2VLARkill app now on two machines with G92 chips, so far no issues.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 01 Aug 2009, 10:25:15 am
Could you provide a link to your host, or some failed work units?
If you follow the instructions provided by sunu and make sure the Nvidia modules are loaded, then it will also work for you :)

And whatever you do, don't forget to do step #4 - that's gotten me twice now.  ;-)
Title: Re: SETI MB CUDA for Linux
Post by: letni on 01 Aug 2009, 12:28:39 pm
Thanks for the replies.. I am currently trying to use version 6.6.36.. I tried 6.6.20 earlier and had the same results.. I have read in places that the 32 bit cuda binary at the beginning of this thread is broken.. I will try the 64 bit binary once I can reload the machine with a 64 bit linux.

Thanks
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Aug 2009, 06:38:26 pm
Could you provide a link to your host, or some failed work units?
If you follow the instructions provided by sunu and make sure the Nvidia modules are loaded, then it will also work for you :)

I'm running the 190.18 driver with CUDA 2.3 libraries and 2.2VLARkill app now on two machines with G92 chips, so far no issues.


hmm must be something i missed then because when i run the same software combination with boinc 6.6.11 and 2 gt200 series cards, it runs fine for about 10 to 12 hours then the desktop begins to pause and sometimes lock up and even the entire system ground to a halt one time. i returned to 2.2 and 185.18.29 and had no trouble since. and yes i paid very careful attention to be sure the 2.3 libs were in their proper places and that ldconfig found them and that ldd to the 2.2 vlarkill app showed no errors and then rebooted the system to make doubly sure the environment was sane.

my gkrellm monitors were the most sensitive to this behavior and began displaying symptoms before it got to the noticeable level affecting my desktops. since i run an extremely busy set of desktops (2 of the desktops display a total of 29 gkrellm monitor strips monitoring our servers in real time)  i suspect the 190 driver isn't ready for prime time yet for linux when handling more than near idle desktop load plus cuda.
Title: Re: SETI MB CUDA for Linux
Post by: letni on 01 Aug 2009, 10:32:04 pm
Could you provide a link to your host, or some failed work units?
If you follow the instructions provided by sunu and make sure the Nvidia modules are loaded, then it will also work for you :)

I'm running the 190.18 driver with CUDA 2.3 libraries and 2.2VLARkill app now on two machines with G92 chips, so far no issues.


Here is the machine with 32bit Slackware installed..
http://setiathome.berkeley.edu/show_host_detail.php?hostid=5050097

I am playing around with 64 bit Redhat on the same machine (with the same results) never gets past "Ready to start"
http://setiathome.berkeley.edu/show_host_detail.php?hostid=5052513


Of note.. I am also using the Windows BOINC client to connect to the linux box via remote RPC, this is how I can see it never gets past Ready to start.

Thanks..

Letni
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 02 Aug 2009, 03:20:52 am
http://setiathome.berkeley.edu/result.php?resultid=1323470350
Quote
[...]
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce 8600 GTS is okay
SIGSEGV: segmentation violation
Stack trace (16 frames):
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x47cba9]
/lib64/libpthread.so.0[0x7f0954f4f0f0]
/usr/lib64/libcuda.so.1[0x7f09559c3920]
/usr/lib64/libcuda.so.1[0x7f09559c9684]
/usr/lib64/libcuda.so.1[0x7f0955992a0f]
/usr/lib64/libcuda.so.1[0x7f095571e296]
/usr/lib64/libcuda.so.1[0x7f095572ebab]
/usr/lib64/libcuda.so.1[0x7f0955716190]
/usr/lib64/libcuda.so.1(cuCtxCreate+0xaa)[0x7f095571000a]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x5ace4b]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x40d4ca]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x419f23]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x424c7d]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x407f60]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7f0954bec576]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu(__gxx_personality_v0+0x241)[0x407be9]

Exiting...

</stderr_txt>
]]>


These are the same errors that I got when trying to use the 2.2VLARkill without putting the app into /usr/bin or any other directory within $PATH. Please check if it helps to copy the app there.
Please also check with ldd setiathome-CUDA-6.08.x86_64-pc-linux-gnu that all needed libraries are in place.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Aug 2009, 07:07:41 am
I had some pc troubles and I was offline the whole past week. I'll try to answer some messages that have been posted since.

After crunching other projects for some months, I restarted SETI@GPU just this morning, but using the initial CUDA build for Linux which uses 100% of one CPU core as well...
So these new versions, that can be found in Crunch3r's board, will use only a few % of one CPU? That would be awesome...
I'm currently still at 180.60, some 2.6.30 rc and 6.6.17, but willing to update if I could free up that core with a newer version of the app :)
The 100% core usage was a bug with 2.1 and earlier linux cuda libs. With 2.2 and later this has been fixed. Our seti cuda client had nothing to do with it. So you can use anything you like as long as you use 2.2 or later cuda libraries.

Yup, that's been bothering me too.  I'm wondering if there's a way to trick it into reporting clock time rather than cpu time...  I'm using nvidia 185.18.14 and BOINC 6.6.11 btw since I'd like to do multi-GPU here soon.
Are you talking about boinc manager? Some time down the road, boinc manager changed from cpu time to elapsed time. If you use boinc 6.6.11 for multi-gpu you can use a later boinc manager version that shows the elapsed time. I'm using boinc 6.6.11 with boinc manager 6.6.37.
If you're talking about the times reported to Berkeley, you can't change that.

6.6.37 was reporting proper cpu/gpu times, but when i went back to 6.6.11 to use multiple devices that time reporting broke. i am not sure if adding a flops statement in app_info.xml  will help with that or not.
See my previous reply and no, flops in app_info.xml will not help.

The default priority of nice 10 seems to slow the process down on my box, once I switched it to 0 or -5, it processed much faster and collected up CPU time quicker.
Are you talking about the CPU client or the GPU one?

I tried to go to the link, but every time I go to calbe.dw70.de I get access denied... is there another place to get it?
Kunin I've attached it below.

Has anyone tried the CUDA 2.2 client together with 190.xx drivers and the CUDA 2.3 dlls, if there is some speed-up like under Windows?
I haven't run a comparison but there is no harm using it.

does anyone  know if there is a cuda 2.3 vlarkill x86_64 app available yet? i am switching everything to 2.3 and the 190 driver today.
Unless Crunch3r makes one... But I don't think there will be any worthy speedup (at least in windows there isn't).

I'm trying to get CUDA working (with the 32 bit binary posted at message 1 of this thread) with my new 8600GTS in Slackware Linux and I'm having issues.. I have run the nvidia installer, etc, but I get some weird errors..
Please don't use the 32bit client. When we were testing it, it had a strange bug and didn't produce valid results. I haven't checked though if newer cuda libraries make any difference.

1. The output shows I have a cuda device, however, it says I have revision 0 of the driver installed, even though I have installed the 185.18.14 and upgraded to the 185.18.31..
 CUDA device: GeForce 8600 GTS (driver version 0, compute capability 1.1, 255MB, est. 18GFLOPS).
This is just cosmetic. Don't pay attention to it.

2. I have modified my app_info.xml to allow both AK_V8_SSE3 (32bit) and the cuda to run simultaneously (included .xml file).. I have 3 active tasks being worked on, two (for my dual CPU) say setiathome_enhanced 6.03 and run just fine.  The third is the CUDA one, which says setiathome_enhanced 6.08 (cuda), and its status NEVER goes past Ready to start. It will eventually error out with Computation error.
Anyone have any thoughts or advice on how to debug this?
There is an option that is on by default to not run cuda tasks when pc is in use. Check global_prefs.xml for <run_gpu_if_user_active>0</run_gpu_if_user_active> and change that 0 to 1.


4. I'm not using XWindows at all.  This is all console based only.. Is Xorg required to be running to utilize CUDA?
Just copy&pasting from Nvidia:
 In order to run CUDA applications, the CUDA module must be
 loaded and the entries in /dev created.  This may be achieved
 by initializing X Windows, or by creating a script to load the
 kernel module and create the entries.

 An example script (to be run at boot time):

Code: [Select]
#!/bin/bash

modprobe nvidia

if [ "$?" -eq 0 ]; then

    # Count the number of NVIDIA controllers found.
    N3D=`/sbin/lspci | grep -i NVIDIA | grep "3D controller" | wc -l`
    NVGA=`/sbin/lspci | grep -i NVIDIA | grep "VGA compatible controller" | wc -l`

    N=`expr $N3D + $NVGA - 1`
    for i in `seq 0 $N`; do
        mknod -m 666 /dev/nvidia$i c 195 $i;
    done

    mknod -m 666 /dev/nvidiactl c 195 255

else
    exit 1
fi
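
[Note:] After a boot script like the one above has run, a quick sanity check is to confirm that the device nodes really exist. A minimal sketch; the function name missing_nvidia_nodes is made up, and it only tests that the entries exist, not that they are character devices with the right major/minor numbers.

```shell
# Hypothetical helper: list the device nodes a boot script like the one
# above should have created but that are absent.
# $1 = number of NVIDIA GPUs, $2 = device directory (defaults to /dev).
missing_nvidia_nodes() {
    gpus="$1"
    devdir="${2:-/dev}"
    rc=0
    nodes="nvidiactl"
    i=0
    while [ "$i" -lt "$gpus" ]; do
        nodes="$nodes nvidia$i"
        i=$((i + 1))
    done
    for node in $nodes; do
        if [ ! -e "$devdir/$node" ]; then
            echo "missing: $devdir/$node"
            rc=1
        fi
    done
    return $rc
}
```

On a single-GPU headless box, missing_nvidia_nodes 1 should print nothing once the module is loaded and the nodes are created.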

my gkrellm monitors were the most sensitive to this behavior and began displaying symptoms before it got to the noticeable level affecting my desktops. since i run an extremely busy set of desktops (2 of the desktops display a total of 29 gkrellm monitor strips monitoring our servers in real time)  i suspect the 190 driver isn't ready for prime time yet for linux when handling more than near idle desktop load plus cuda.
I'm also using 190.18 and 2.3 cuda with no issues. This is my everyday pc with firefox with multitude of tabs and many other applications opening and closing. Have you checked if gkrellm has some kind of memory leak?

[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: Kunin on 02 Aug 2009, 07:52:03 am
Thanks, but before I use it how does the VLAR kill work?  I rebrand all of my VLAR to the CPU as soon as I can, and prefer to work whatever units I get since I have the 8 cores just sitting there most of the time.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Aug 2009, 08:08:12 am
Thanks, but before I use it how does the VLAR kill work?  I rebrand all of my VLAR to the CPU as soon as I can, and prefer to work whatever units I get since I have the 8 cores just sitting there most of the time.

Well if you miss a VLAR from the rebranding and it gets to your GPU, it will get aborted almost instantly by the client.
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 02 Aug 2009, 10:41:12 am
Are you talking about boinc manager? Some time down the road, boinc manager changed from cpu time to elapsed time. If you use boinc 6.6.11 for multi-gpu you can use a later boinc manager version that shows the elapsed time. I'm using boinc 6.6.11 with boinc manager 6.6.37.

Perfect - I didn't think of doing that!  Thanks again, sunu - you continue to be a big help and it's definitely appreciated!
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Aug 2009, 10:48:40 am
Perfect - I didn't think of doing that!  Thanks again, sunu - you continue to be a big help and it's definitely appreciated!

Thanks Tye!
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 02 Aug 2009, 11:16:11 am
Are you talking about boinc manager? Some time down the road, boinc manager changed from cpu time to elapsed time. If you use boinc 6.6.11 for multi-gpu you can use a later boinc manager version that shows the elapsed time. I'm using boinc 6.6.11 with boinc manager 6.6.37.

Perfect - I didn't think of doing that!  Thanks again, sunu - you continue to be a big help and it's definitely appreciated!

Argh - somehow it doesn't like running with 6.6.11 boinc and 6.6.36 boincmgr...  Is there some trick to that I'm missing?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Aug 2009, 12:17:15 pm
Argh - somehow it doesn't like running with 6.6.11 boinc and 6.6.36 boincmgr...  Is there some trick to that I'm missing?

What problem do you have? Just copy boincmgr to your 6.6.11 installation.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Aug 2009, 02:28:33 pm
typically memory use hardly changes once my system is stabilized into 'work mode'. this incarnation of gkrellm has been working fine for months and only showed erratic behavior with the 190 driver. the second i went back to the 185 driver all problems vanished. i even tried recompiling gkrellm and all supporting libraries and sensors just to be sure. so when i reverted back to the 185 version and 2.2 i again recompiled everything mentioned just to be safe. i originally used the gentoo-supplied 2.3 and 190 and when i started having trouble i went directly to nvidia and got them from there, uninstalled the previous ones and used the nvidia installers. same behavior.

i cannot say for sure whether it is the cuda 2.3 libraries or the 190 driver or both. one or both simply do not like something on my system i guess. i just reverted back to 185 and 2.2 and its smooth sailing once again.

Title: Re: SETI MB CUDA for Linux
Post by: Tye on 02 Aug 2009, 04:19:14 pm
Argh - somehow it doesn't like running with 6.6.11 boinc and 6.6.36 boincmgr...  Is there some trick to that I'm missing?

What problem do you have? Just copy boincmgr to your 6.6.11 installation.
Yep, that's what I did, but it just sits there frozen at the "Communicating with client" portion on startup.  No messages, no display, no processes starting, etc.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Aug 2009, 05:46:58 pm
Yep, that's what I did, but it just sits there frozen at the "Communicating with client" portion on startup.  No messages, no display, no processes starting, etc.

Start boinc first and then open boinc manager.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Aug 2009, 06:19:55 pm
@letni
Please see my big post above, I'm talking about cuda with no X server. Are you sure you've set it up all correctly? Also make sure you use compatible nvidia drivers and cuda libraries.

Also see my post http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg19014.html#msg19014 and follow it to the letter.

Your card with 256 MB is borderline. You might get some out of memory messages here and there.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Aug 2009, 07:29:29 pm
@letni
Please see my big post above, I'm talking about cuda with no X server. Are you sure you've set it up all correctly? Also make sure you use compatible nvidia drivers and cuda libraries.

Also see my post http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg19014.html#msg19014 and follow it to the letter.

Your card with 256 MB is borderline. You might get some out of memory messages here and there.

i got them all the time with my 8600GT 256mb. cuda tried to use it when it was the only device but it turned every wu in with out of memory errors. that was when i added the tesla and then later replaced the gt with the gtx i now have.

for someone who uses a linux desktop seriously, i doubt 256mb vidram will work at all unless maybe you define only 1 or 2 desktops at most and keep all the bells and whistles disabled along with using a solid color background.., or it is a 2nd card not used for video at all then it can be used properly.
Title: Re: SETI MB CUDA for Linux
Post by: letni on 02 Aug 2009, 08:17:26 pm

2. I have modified my app_info.xml to allow both AK_V8_SSE3 (32bit) and the cuda to run simultaneously (included .xml file).. I have 3 active tasks being worked on, two (for my dual CPU) say setiathome_enhanced 6.03 and run just fine.  The third is the CUDA one, which says setiathome_enhanced 6.08 (cuda), and its status NEVER goes past Ready to start. It will eventually error out with Computation error.
Anyone have any thoughts or advice on how to debug this?
There is an option that is on by default to not run cuda tasks when pc is in use. Check global_prefs.xml for <run_gpu_if_user_active>0</run_gpu_if_user_active> and change that 0 to 1.

That was the issue the whole time.. I guess I figured just cause I am not using XWindows on a headless machine that I technically wasn't using the video card.  Enabling that let it do its thing :)

Thanks everyone for all your help.



letni
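
[Note:] The setting that fixed this can be read back from the preferences file with a one-liner. A minimal sketch, assuming the tag sits on a single line as in the example quoted above; the function name gpu_if_user_active is made up for illustration.

```shell
# Hypothetical helper: print the value of <run_gpu_if_user_active> found in
# the given preferences file (empty output if the tag is absent).
gpu_if_user_active() {
    sed -n 's:.*<run_gpu_if_user_active>\([01]\)</run_gpu_if_user_active>.*:\1:p' "$1"
}
```

Run it as gpu_if_user_active global_prefs.xml from the BOINC data directory. A 0 means cuda tasks wait while the pc counts as in use, a 1 means they run regardless.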
Title: Re: SETI MB CUDA for Linux
Post by: Kunin on 02 Aug 2009, 08:19:47 pm
Thanks, but before I use it how does the VLAR kill work?  I rebrand all of my VLAR to the CPU as soon as I can, and prefer to work whatever units I get since I have the 8 cores just sitting there most of the time.

Well if you miss a VLAR from the rebranding and it gets to your GPU, it will get aborted almost instantly by the client.

Ok, so it kills it as soon as it actually starts processing and no sooner?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Aug 2009, 09:35:20 pm
That was the issue the whole time.. I guess I figured just cause I am not using XWindows on a headless machine that I technically wasn't using the video card.  Enabling that let it do its thing :)
Thanks everyone for all your help.
letni
letni the hosts that you've posted above are full of errors. Are you sure you've solved all your problems?


Ok, so it kills it as soon as it actually starts processing and no sooner?
Yes. When it starts to crunch a wu, it checks if this wu is a VLAR and if it is, it aborts.
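
[Note:] In shell terms the check described above amounts to a single comparison at startup. This is only an illustrative sketch, not the actual VLARkill code: the function name is_vlar is made up, and the cutoff is passed in as a parameter because this thread does not state the value the app uses.

```shell
# Hypothetical sketch, not the actual VLARkill code: decide from a task's
# angle range whether it counts as a VLAR. The cutoff is an argument
# because this thread does not state the value the app uses.
is_vlar() {
    angle_range="$1"
    cutoff="$2"
    # awk does the floating point comparison; exit status 0 means "is a VLAR"
    awk -v ar="$angle_range" -v cut="$cutoff" 'BEGIN { exit !(ar + 0 < cut + 0) }'
}
```

For example, with an assumed cutoff of 0.05, is_vlar 0.0087 0.05 succeeds (the task would be aborted on the GPU) while is_vlar 0.42 0.05 fails (the task would be crunched normally).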
Title: Re: SETI MB CUDA for Linux
Post by: letni on 02 Aug 2009, 10:25:53 pm
That was the issue the whole time.. I guess I figured just cause I am not using XWindows on a headless machine that I technically wasn't using the video card.  Enabling that let it do its thing :)
Thanks everyone for all your help.
letni
letni the hosts that you've posted above are full of errors. Are you sure you've solved all your problems?
I have the client running currently.. 2 threads on AK_V8_64bit_SSE3 (Pentium D) and one thread on cuda novlar 2.2 build.. The CUDA is extremely slow and I have noticed that the cuda thread is taking up 100% of one (out of 2) of my CPUs, slowing down my SSE3 CPU threads.. Here are the versions:

Nvidia Driver: 185.18.31
CUDA Toolkit: 2.2 (ldd reports proper libs - added to ld.so.conf) -also copied libs to project.. directory
BOINC Version: 6.6.36
Environment: FC10_64bit - Runlevel 3 - no xwindows
Video Card - 8600 GTS 256MB

I have copied the CUDA binary to a system directory in the path as mentioned. 

I even tried the new 190 driver and the 2.3 toolkit and same effect, just downgraded back to 2.2 toolkit and 185 driver.. Still same issue.. 100% cpu utilization.. This can't be efficient.. Any suggestions?

[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 03 Aug 2009, 05:13:09 am
letni can you give us a link of a wu done by the gpu?

I'm suspecting that you get an out of memory error and the wu is then done in cpu, that's why you get 100% cpu utilization by the cuda app.
Title: Re: SETI MB CUDA for Linux
Post by: letni on 03 Aug 2009, 08:55:31 am
letni can you give us a link of a wu done by the gpu?

I'm suspecting that you get an out of memory error and the wu is then done in cpu, that's why you get 100% cpu utilization by the cuda app.

That is absolutely the case.. my first WU was finally finished by "GPU" and the error messages say MALLOC out of memory errors.. Looks like time for a better video card!
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 03 Aug 2009, 09:26:47 am
That is absolutely the case.. my first WU was finally finished by "GPU" and the error messages say MALLOC out of memory errors.. Looks like time for a better video card!

letni, what cuda app do you use? The 2.2 one or the one from the first post of this thread? The 2.2 one is very memory hungry. Try using the one from the first post of this thread and see if it helps. Mind you that it has a different name so you'll have to edit your app_info.xml accordingly.
Title: Re: SETI MB CUDA for Linux
Post by: letni on 03 Aug 2009, 12:36:14 pm
That is absolutely the case.. my first WU was finally finished by "GPU" and the error messages say MALLOC out of memory errors.. Looks like time for a better video card!

letni, what cuda app do you use? The 2.2 one or the one from the first post of this thread? The 2.2 one is very memory hungry. Try using the one from the first post of this thread and see if it helps. Mind you that it has a different name so you'll have to edit your app_info.xml accordingly.

That did the trick.. No 100% CPU usage anymore with the older binary.. Chugging away very fast now!! Hopefully no more computation errors!
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 03 Aug 2009, 04:16:58 pm
That did the trick.. No 100% CPU usage anymore with the older binary.. Chugging away very fast now!! Hopefully no more computation errors!
letni I see you've already completed a few WUs with your GPU. Happy crunching!  :)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Aug 2009, 10:45:36 am
if i were to build a machine with 4 GTX295 cards, will boinc see and use all 8 cuda processors properly and efficiently, or would i be better off using 4 GTX285 single gpu cards? i had originally thought of using 4 teslas but the GTX 285 outperforms it by a lot just going by raw gflops (74 tesla vs 127 GTX) as measured by boinc.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Aug 2009, 11:22:38 am
if i were to build a machine with 4 GTX295 cards, will boinc see and use all 8 cuda processors properly and efficiently, or would i be better off using 4 GTX285 single gpu cards?
Well, that is your choice. Boinc won't have any problems at all.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 04 Aug 2009, 11:43:48 am
And if it doesn't work you can send that machine to me...  ;D
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Aug 2009, 12:08:06 pm
And if it doesn't work you can send that machine to me...  ;D

Well, probably, first, he's going to bring that machine down on my head hard for making him blow his money away, but I'll take it anyway.  ;D
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Aug 2009, 02:16:12 pm
And if it doesn't work you can send that machine to me...  ;D

Well, probably, first, he's going to bring that machine down on my head hard for making him blow his money away, but I'll take it anyway.  ;D

LOL as long as boinc can use all 8 processors efficiently it will be fine.. this is a '1st quarter next year' project im planning out now. prob use either an i7 proc or a couple quad core xeons not sure yet which would process better. i heard the modern xeons were super fast in math, but then ive also heard that about the i7 so... dunno... about the only thing that is a given is the choice of vid cards and hard drives and ram :) hehe friggin vid cards will cost as much as the rest of the machine will!
Title: Re: SETI MB CUDA for Linux
Post by: pp on 06 Aug 2009, 01:28:09 pm
The default priority of nice 10 seems to slow the process down on my box, once I switched it to 0 or -5, it processed much faster and collected up CPU time quicker.

Is there a way to control the SETI apps' default nice level through the configuration files? The nice level is supposed to be inherited from the parent app, but by default on my computer BOINC has 0, AK_V8_linux64_sse3 has 19 and setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu has 10, so obviously there's something I don't understand here. ;D Right now I'm using an external daemon to renice the processes now and then.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 06 Aug 2009, 05:41:04 pm
BOINC has 0, AK_V8_linux64_sse3 has 19 and setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu has 10 so obviously there's something I don't understand here. ;D Right now I'm using an external daemon to renice the processes now and then.

Renice them to what? Those priority levels are good for a system that you work on. Unless it is a dedicated cruncher.

And how does your cuda client start at 10? Mine always starts at 19 and I have to lower it with a script.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 06 Aug 2009, 07:11:48 pm
Well, running two CPU instances at nice 19 and one CUDA at nice 10 on this dualcore makes the CUDA task take several hours. Renicing the CUDA task to -5, like koschi mentioned, one CUDA task completes in 23 minutes on this 9800GT. It's almost a dedicated cruncher. It's my gaming rig but I don't use it often.
Title: Re: SETI MB CUDA for Linux
Post by: koschi on 08 Aug 2009, 05:55:28 am
For the box where I tried this, it really made a difference. Those days I was running Docking@home on the CPU, might not be that dramatic with other CPU projects.
I use a one line command in cron, being executed once per minute, renicing all cuda processes to -5. The system is just crunching and serving files, so no issues if the interface would get a little choppy or not :)
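
A minimal sketch of that kind of periodic renice, written in Python rather than as the cron one-liner itself (which is not shown here). It assumes a Linux /proc filesystem and that the GPU tasks can be recognised by "CUDA" in the executable name; the function names are made up for the example.

```python
import os


def list_processes(proc_root="/proc"):
    """Yield (pid, executable name) for every process visible under /proc."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), errors="replace") as fh:
                argv0 = fh.read().split("\0")[0]
        except OSError:
            # The process exited between listdir() and open(); skip it.
            continue
        if argv0:
            yield int(entry), os.path.basename(argv0)


def select_pids(processes, marker="CUDA"):
    """Return the PIDs whose executable name contains the marker string."""
    return [pid for pid, name in processes if marker in name]


def renice(pids, niceness=-5):
    """Set the nice value of each PID; negative values need root."""
    for pid in pids:
        try:
            os.setpriority(os.PRIO_PROCESS, pid, niceness)
        except (PermissionError, ProcessLookupError):
            # Not root, or the task finished before we got to it.
            continue
```

Run once a minute from root's crontab, the call would be `renice(select_pids(list_processes()))`. The full command line is read instead of the short process name because long executable names get truncated in the latter.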
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Aug 2009, 06:17:55 am
Back in the days when cuda needed a whole core, I was running a 3+1 config in my quad core. All processes had the lowest priority (19) and I don't think I had any serious slowdown, maybe a minute or so, not more. And this was my everyday desktop so many things were running, firefox with many many tabs, full 3d compiz effects, everyday backups, etc.

Only now that cuda shares a core with the other seti@home tasks, I started renicing them only to make them higher priority than the other seti@home instances. I think -5 is not necessary.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 09 Aug 2009, 05:21:53 am
Back in the days when cuda needed a whole core, I was running a 3+1 config in my quad core. All processes had the lowest priority (19) and I don't think I had any serious slowdown, maybe a minute or so, not more. And this was my everyday desktop so many things were running, firefox with many many tabs, full 3d compiz effects, everyday backups, etc.

Only now that cuda shares a core with the other seti@home tasks, I started renicing them only to make them higher priority than the other seti@home instances. I think -5 is not necessary.

i tried setting the AK MB cpu apps to 10 and the cuda ones to  -5 and after 24 hrs there was very little difference in my scores. the only thing that happened is my desktops became difficult to work with and gave me the impression i was working with my old p3 933mhz machine so i returned them to boinc defaults of 19 and 10.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 14 Aug 2009, 05:47:32 am
A small update for those with multi-gpu systems.

The problem with proper multi-gpu support in linux with recent boinc releases seems to have been fixed (got a message yesterday from David Anderson and I confirmed it compiling boinc and running it).

But now there seems to be another problem with task scheduling. Richard Haselgrove hinted that maybe this problem is general since there have been similar reports that this might also occur in windows. If it is so we might get a faster fix this time.

Well, anyway, we are one problem down. I hope we get a proper functioning boinc in linux soon.
Title: Re: SETI MB CUDA for Linux
Post by: Kunin on 14 Aug 2009, 07:06:30 am
Great news!  Maybe soon I won't have to use 6.4.5 for crunching and 6.6.11 for downloading (for some reason 6.6.11 would randomly stop using the GPUs, but 6.4.5 always says high priority so would never download new WU).
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 14 Aug 2009, 07:25:54 am
Great news!  Maybe soon I won't have to use 6.4.5 for crunching and 6.6.11 for downloading (for some reason 6.6.11 would randomly stop using the GPUs, but 6.4.5 always says high priority so would never download new WU).

6.6.11 has a bug where, if a GPU job is running and a 2nd GPU job with an earlier deadline arrives, neither job is ever executed. Maybe you get hit by this.

I use a script running in an infinite loop to notify me when this happens. Then a boinc restart fixes it... until next time. Also turning off "leave applications in memory while suspended" in your computing preferences seems to help a bit, but it doesn't solve it completely.
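
The script itself is not shown, so the following is only a sketch of how such a watcher might work. It assumes a stall can be detected as "no GPU task gained any CPU time between two samples", reads CPU time from /proc on Linux, and leaves the sampling and the notification as caller-supplied functions; all names are hypothetical.

```python
import time


def cpu_ticks(pid, proc_root="/proc"):
    """Return user+system CPU ticks used so far by one process."""
    with open(f"{proc_root}/{pid}/stat") as fh:
        # The fields after the ')' closing the command name start at field 3,
        # so utime (field 14) and stime (field 15) sit at offsets 11 and 12.
        fields = fh.read().rsplit(")", 1)[1].split()
    return int(fields[11]) + int(fields[12])


def is_stalled(before, after):
    """True when no watched task gained CPU time between two samples.

    `before` and `after` map PID -> CPU ticks for the GPU tasks.
    A PID that only appears in `after` counts as progress.
    """
    progressed = [pid for pid in after if after[pid] > before.get(pid, -1)]
    return not progressed


def watch(sample, notify, interval=300, rounds=None):
    """Compare successive samples and call notify() whenever nothing moved."""
    before = sample()
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval)
        after = sample()
        if is_stalled(before, after):
            notify()
        before = after
        done += 1
```

With `rounds=None` the loop runs forever, matching the "infinite loop" description; `sample` would typically build its dictionary by calling `cpu_ticks` for each CUDA task PID.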
Title: Re: SETI MB CUDA for Linux
Post by: Kunin on 14 Aug 2009, 09:11:46 pm
Sounds like it since it happens randomly.  On days I work (12 hour shifts) I'm at my computer maybe 3-4 hours, so odds of me catching it is slim, hence I use 6.4.5 for crunching.  I just switch to 6.6.11 to download 5-10 days cache, rebrand it all and then back to crunching.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 14 Aug 2009, 10:22:55 pm
Great news!  Maybe soon I won't have to use 6.4.5 for crunching and 6.6.11 for downloading (for some reason 6.6.11 would randomly stop using the GPUs, but 6.4.5 always says high priority so would never download new WU).

6.6.11 has a bug where, if a GPU job is running and a 2nd GPU job with an earlier deadline arrives, neither job is ever executed. Maybe you get hit by this.

I use a script running in an infinite loop to notify me when this happens. Then a boinc restart fixes it... until next time. Also turning off "leave applications in memory while suspended" in your computing preferences seems to help a bit, but it doesn't solve it completely.

wow. never knew that... i dont see that in my system but that is because i run the cpugpu perl script often to catch random downloads so boinc gets restarted several times an hour.

soon as the scheduler gets fixed ill give the new one a shot :)
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Aug 2009, 04:55:42 am
Sounds like it since it happens randomly.  On days I work (12 hour shifts) I'm at my computer maybe 3-4 hours, so odds of me catching it is slim, hence I use 6.4.5 for crunching.  I just switch to 6.6.11 to download 5-10 days cache, rebrand it all and then back to crunching.
Well, you could use a script to restart boinc automatically when the bug kicks in. I didn't do it because I didn't want too much complexity, and in the end I preferred to keep control over boinc restarts myself.

wow. never knew that... i dont see that in my system but that is because i run the cpugpu perl script often to catch random downloads so boinc gets restarted several times an hour.
Restarting boinc several times an hour surely squashed that bug.  :D

soon as the scheduler gets fixed ill give the new one a shot :)
Yesterday, some changes to the scheduler were introduced. The problems I posted above, it seems, were across all platforms as the changes were generic. The battle with task scheduling in boinc is an ongoing and never ending one.

I haven't checked boinc with the new changes to see how it runs. Feel free to check it. At last we'll have a "modern" boinc release with proper multi-gpu support in linux in the next round of "official" releases.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Aug 2009, 12:12:22 pm

Restarting boinc several times an hour surely squashed that bug.  :D

Yesterday, some changes to the scheduler were introduced. The problems I posted above, it seems, were across all platforms as the changes were generic. The battle with task scheduling in boinc is an ongoing and never ending one.

I haven't checked boinc with the new changes to see how it runs. Feel free to check it. At last we'll have a "modern" boinc release with proper multi-gpu support in linux in the next round of "official" releases.

what version is it? the latest i see in http://boincdl.ssl.berkeley.edu/dl/ is 6.6.37 or do i have to use svn and hope for the best? :P

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Aug 2009, 12:24:31 pm
or do i have to use svn and hope for the best? :P

Yes, you have to compile from source.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Aug 2009, 10:31:01 pm
or do i have to use svn and hope for the best? :P

Yes, you have to compile from source.

ok hope i got the right one. never did 'get' what to do with svn there are so many different sources. looks like the one i got was trunk 6.9.0. is that the correct one?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Aug 2009, 11:03:59 pm
well we shall see. i compiled 6.9.0 and it properly uses the 2 devices but the reporting is broken. it tells me in the msg log that i have 2 teslas. i have not looked at the code at all but it seems to me that the devices would be kept in an array and it should be a very simple thing to traverse the array reporting the proper string in each. seems like the index is broken. minor issue but i would think it would only take a few seconds for someone familiar with the code to fix that.

the message log also states it cannot load library libcal.so. too late tonight but ill look to see if it is supposed to be created by the boinc make and try to track down what happened. if not then i dont know where it comes from.

boincmgr would not compile for me so im still using 6.6.37 manager but it works.

if it continues to work like it has in the past 5 min it will be nice to run a new version for a change :)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Aug 2009, 11:20:22 pm
hmm either that libcal has something to do with workunit calculations or i just got an entire cache full of big units. not one time is under 2:45 and watching boinc process it is EXTREMELY slow both on the cpus and gpus. system load and all other things are normal. unless since 6.6.11 calculations were severely broken that it will take a while for this version to fix that up and get it right. dunno. will see what it looks like in the morning.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 16 Aug 2009, 05:11:30 am
Yes, trunk is the one to get.

What boinc reports is a minor cosmetic bug. The important thing is to use all gpus properly.

libcal.so is for ATI cards (something like libcudart.so for NVIDIA cards). ATI card support was added a couple of days ago for milkyway@home. It should be irrelevant to us.

I've never bothered with boincmgr while compiling from source. I use the released ones. As long as boinc works properly, we're ok.

Lately there was an increase in sensitivity so most of the recent workunits are big ones. In my pc they take about 12-15 min for the gpu and about 1:45-2:00 hours for the cpu. Boinc doesn't have anything to do with the speed of computations, unless it uses 100% of the CPU slowing things down.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 16 Aug 2009, 06:38:19 am
Yes, trunk is the one to get.

What boinc reports is a minor cosmetic bug. The important thing is to use all gpus properly.

libcal.so is for ATI cards (something like libcudart.so for NVIDIA cards). ATI card support was added a couple of days ago for milkyway@home. It should be irrelevant to us.

I've never bothered with boincmgr while compiling from source. I use the released ones. As long as boinc works properly, we're ok.

Lately there was an increase in sensitivity so most of the recent workunits are big ones. In my pc they take about 12-15 min for the gpu and about 1:45-2:00 hours for the cpu. Boinc doesn't have anything to do with the speed of computations, unless it uses 100% of the CPU slowing things down.

was concerned since previously i have never had a cuda work unit take more than 15min to process with typical 9 to 13 min, they are now taking approx 30 min for each card. and my rac has dropped for this machine by more than 400 points. ill just keep plugging away for a while to let things settle out. nothing was changed in the 'backend' applications so it must be the larger workunits presented.

overall boinc seems to be managing things nicely. it no longer keeps a backlog of completed units to report which is refreshing.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 16 Aug 2009, 08:20:23 am
was concerned since previously i have never had a cuda work unit take more than 15min to process with typical 9 to 13 min, they are now taking approx 30 min for each card. and my rac has dropped for this machine by more than 400 points. ill just keep plugging away for a while to let things settle out. nothing was changed in the 'backend' applications so it must be the larger workunits presented.

No, this is not good. Check how boinc handles the tasks. When a cuda workunit finishes, does it also stop the other one running to start a new pair?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 16 Aug 2009, 08:55:36 am
was concerned since previously i have never had a cuda work unit take more than 15min to process with typical 9 to 13 min, they are now taking approx 30 min for each card. and my rac has dropped for this machine by more than 400 points. ill just keep plugging away for a while to let things settle out. nothing was changed in the 'backend' applications so it must be the larger workunits presented.

No, this is not good. Check how boinc handles the tasks. When a cuda workunit finishes, does it also stop the other one running to start a new pair?

i think it may be because it is asking for new gpu workunits and keeps getting no work available so when it uploads a finished unit it asks for more work and reports at the same time. dunno..

no when one finishes it starts a new one and the one that was in progress continues uninterrupted.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 16 Aug 2009, 09:21:07 am
was concerned since previously i have never had a cuda work unit take more than 15min to process with typical 9 to 13 min, they are now taking approx 30 min for each card. and my rac has dropped for this machine by more than 400 points. ill just keep plugging away for a while to let things settle out. nothing was changed in the 'backend' applications so it must be the larger workunits presented.

No, this is not good. Check how boinc handles the tasks. When a cuda workunit finishes, does it also stop the other one running to start a new pair?

hmm found a few interesting things in wandering thru the workunits on the web. found a few of this one:

Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  2.715027
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected exceeds the storage space allocated.


i am assuming it did not have enough allocated ram so i increased the allocation substantially. plenty of disk space allocation (40G available to boinc, 385mb used). also my pending credit is higher than ever at 80k+ so maybe that is also why my rac has dropped. it simply needs to catch up to itself.

all this simultaneously makes it impossible to point a finger :P especially since i also this week replaced my screwy ballistix ram with ocz blade ram.. went from 4x1gb 2.0v sticks to 2x2gb 1.8v sticks . this ocz should be reliable. had 11 RMAs on ballistix in 16 months and just got tired of it. the ocz has given the machine a slightly smoother personality so i am hopeful there but i have no clue how the raw performance is ... technically it should be better since i went from 4pcs dual channel to 2pcs dual channel which is supposed to be an improvement, plus the lower voltage is better as well.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 16 Aug 2009, 12:25:06 pm
Unless you have a faulty card (if it is cuda) or cpu/ram (if it is a cpu workunit), result overflows are pretty much "normal" and they don't have anything to do with your memory/storage allocations.

Check why your pending credit has increased. Is it genuine "waiting for validation" or is it suspicious "validation inconclusive"? If it is the latter, check whether the results you returned for those workunits are strange and very different from your wingman's. Check also in the invalid category of your tasks page if there are any there.

30min for a CUDA wu seems too much. Unless you have a lower end card.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 16 Aug 2009, 11:05:15 pm
Unless you have a faulty card (if it is cuda) or cpu/ram (if it is a cpu workunit), result overflows are pretty much "normal" and they don't have anything to do with your memory/storage allocations.

Check why your pending credit has increased. Is it genuine "waiting for validation" or is it suspicious "validation inconclusive"? If it is the latter, check whether the results you returned for those workunits are strange and very different from your wingman's. Check also in the invalid category of your tasks page if there are any there.

30min for a CUDA wu seems too much. Unless you have a lower end card.

there were only 2 or maybe 3 overflow errors out of 8 or 9 pages i looked through. there are a lot of waiting for validation for inconclusive but the majority are processed and validated. card device 0 is a gtx 285, a xfx overclocked black edition (127gflops by boinc) and the  device1 is a prerelease tesla c1060 which has 1gb ram instead of the 4 in production and a bit slower clock speeds (74gflops by boinc).. 

it seems that the workunits are very large. boinc is showing time to completion for those waiting to process of about 2:40 in this current cache, including the cuda workunits. the only changes made besides new downloads since the 13min workunits for cuda and 50min to 1.5hr workunits for cpu has been the boinc upgrade from 6.6.11 and the change in system ram on monday. the gpus both are running a satisfactory temp of 62-67c under load and cpus under load betw 52 and 59c with averages around 55c so its all running cool enough. the bios diagnostics show nothing wrong so my only guess is the kind of workunits i am getting now. 

glxgears is showing around 10kFPS which is where the gtx 285 has run since i first got it and nvidia-settings shows both cards running at their maximum performance level although the wording changed for the gtx. it used to say maximum performance now it says desktop, but the numbers are still the same and i suspect it is a change of driver versions that changed it. i have been running the same driver for weeks now.

also desktop performance is as good as it always was.. so i am at a loss to explain the sudden 30min cuda processing unless it is the workunits supplied. the script is making sure there are no vlar/vhar fed to the cuda devices. in fact, lately the cpu workunits have been nothing but vlar/vhar units with whatever normal ones they may have been assigned being changed to cuda.

my pending credits have always been around 40k but it jumped to 80k i guess recently. i cannot say for sure because i rarely check it so there has been maybe a month or two between those numbers.

i know my rac drops when i have boinc shut down for several hours and that is normal, but over the past week i have lost now nearly 500 points in average on this one machine.  i wonder if running that script and stopping/restarting boinc with an 8 second delay  3 times an hour may be causing the drop?

the only other thing that may be affecting it is the ambient temp of the room which has been considerably higher this week raising the ambient of the case. gtx ambient is running around 55c now and previous weeks it has run closer to 48c  but none of this is anywhere close to limits that would cause any kind of power/speed controls kicking in to cool things down.

what is interesting is i just thought of looking at the boinc cpu benchmarks which i largely ignore so i just ran them a few seconds ago. the floating point is within normal range it has always been but the interesting thing is the integer benchmark is just under 4k higher than normal!  that may be the new system ram configuration affecting that though.

weirdness abounds... :)
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 17 Aug 2009, 07:32:43 am
riofl, give me a link to your host.

Compiled boinc also gave me increased benchmarks. They don't have any real importance though.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 17 Aug 2009, 11:09:49 pm
riofl, give me a link to your host.

Compiled boinc also gave me increased benchmarks. They don't have any real importance though.

ok so then it doesnt mean anything about my ram change...  hope this is the right link. i took it from the details link from my computer listings page

http://setiathome.berkeley.edu/show_host_detail.php?hostid=4166601
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 18 Aug 2009, 12:04:40 pm
riofl, I'm sure you know, your tesla card has some problems. It gives errors in some workunits. If you look in your errors page, all those workunits were run by the tesla card. It does run successfully though in other workunits.

Checking the reported run times, I don't see any significant difference between eg 18 August and 14 August when you were running 6.6.11.

I do see though that most of the workunits were restarted 2 or 3 or more times. The initialization phase of a cuda task takes about 30 sec. If it is restarted 2 times you lose 1 min and with a total computation time of eg. 14 min you lose 7% credit right there.
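
The arithmetic behind that 7% figure, as a small sketch. The 30 second initialization and 14 minute run time are the example numbers from this post; the function name is made up.

```python
def restart_overhead(restarts, init_seconds=30, total_minutes=14):
    """Fraction of a task's run time spent re-initialising after restarts."""
    lost_minutes = restarts * init_seconds / 60
    return lost_minutes / total_minutes


# Two restarts cost 1 minute out of 14, i.e. roughly 7%.
overhead = restart_overhead(2)
```

The loss scales linearly with the number of restarts, so a task restarted 3 times under the same assumptions gives up a bit under 11%.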

You've said that you run the rebranding script several times per hour, why? I search for vlars once per day, sometimes once per two days, and that is more than enough. If newly downloaded tasks get crunched only a few hours later, increase your cache so they are crunched after 2-3 or more days so running the script once per day will be enough.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Aug 2009, 12:42:54 pm
riofl, I'm sure you know, your tesla card has some problems. It gives errors in some workunits. If you look in your errors page, all those workunits were run by the tesla card. It does run successfully though in other workunits.

Checking the reported run times, I don't see any significant difference between eg 18 August and 14 August when you were running 6.6.11.

I do see though that most of the workunits were restarted 2 or 3 or more times. The initialization phase of a cuda task takes about 30 sec. If it is restarted 2 times you lose 1 min and with a total computation time of eg. 14 min you lose 7% credit right there.

You've said that you run the rebranding script several times per hour, why? I search for vlars once per day, sometimes once per two days, and that is more than enough. If newly downloaded tasks get crunched only a few hours later, increase your cache so they are crunched after 2-3 or more days so running the script once per day will be enough.

yes i am going to retire the tesla shortly but first im going to try to replace the existing incorrect thermal pads on all the chips. if that doesnt fix it i will ship it back to my boss who sent it to me in the first place and let him use it on a windows setup. i will then run just the single gtx285 for a month or 2 and then get another gtx285 to replace the tesla. the 285 is considerably faster than the tesla anyway. (127gflops vs 74gflops by boinc measurements)

the reason i picked every 20 min was i noticed a number of computation error results when i ran it just once an hour. i suppose the easiest way to keep it from downloading all the time is to set the cache to 10 days, get it all then run the script then turn it back to 2 days or something so it wont download more.  when i ran the cpugpureport script several times i found that it showed vlar/vhar assigned to gpu sometimes several times in an hour meaning it got more workunits. a second reason for doing this is the tesla locks up sometimes as much as every hour and a restart of boinc cures it.

maybe ill just ignore all that and run the script once every few hours using a large cache so it wont get more without me making it do so. that way i can keep control of it rejecting vlars.

thanks for checking. i didnt think frequent usage of the script would cause that much of a change.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Aug 2009, 01:45:37 pm
ok i changed my cache from 6 days to 10 but no workunits.. prob cause the project is in maintenance. script is set to run every 4 hrs until it gets its cache then ill change that to once a day and set the cache back to 2 days.

lets see what that does

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 18 Aug 2009, 02:02:43 pm
Why play with your cache levels? Change from 6 to 10 and then back to 2, why? Pick a cache level and leave it there. I'm using 10 days. If you were using 6 days, it is fine also. 6 days cache means that the workunits downloaded now will be crunched in about 6 days, so you have 6 days to check for vlars. No need to run that script x times per hour or x times per day.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Aug 2009, 09:11:47 pm
Why play with your cache levels? Change from 6 to 10 and then back to 2, why? Pick a cache level and leave it there. I'm using 10 days. If you were using 6 days, it is fine also. 6 days cache means that the workunits downloaded now will be crunched in about 6 days, so you have 6 days to check for vlars. No need to run that script x times per hour or x times per day.

ahh yes except for one thing. i have seen even this new version of boinc obey the due dates and pick the next workunit from among the newly downloaded. if they stayed in ascending date order i would agree but it does not seem to work that way for me. at least 3 or 4 times so far i noticed a cuda and/or cpu workunit placed on hold to pick up one that had a closer due date that was recently downloaded. this means there is a danger the gpu app will reject a possible vlar before it can be flagged.

if this works out better which by the logic you presented makes a lot of sense and im sure it will, ill just leave things alone and if it rejects a few workunits before the script can run, oh well. :)
Title: Re: SETI MB CUDA for Linux
Post by: macros on 19 Aug 2009, 04:49:16 am
Back in the days when cuda needed a whole core, I was running a 3+1 config in my quad core. All processes had the lowest priority (19) and I don't think I had any serious slowdown, maybe a minute or so, not more. And this was my everyday desktop so many things were running, firefox with many many tabs, full 3d compiz effects, everyday backups, etc.

Only now that cuda shares a core with the other seti@home tasks, I started renicing them only to make them higher priority than the other seti@home instances. I think -5 is not necessary.

Question regarding this. I am using the default settings in app_info.xml's <app_version> for cuda as follows:
Code: [Select]
<avg_ncpus>0.040000</avg_ncpus>
<max_ncpus>0.040000</max_ncpus>

The problem is, that setiathome-CUDA process has demand obviously higher than that and is able to eat up CPU time of whole one core. That results in other (regular CPU) processes to fight over the CPU time, context switches, cache thrashing etc. ->
Code: [Select]
  PID  PR  NI  RES  SHR %CPU    TIME+  COMMAND
15538  39  19  48m 1472  101  13:44.03 AK_V8_linux64_s
15539  39  19  48m 1464  101  20:17.47 AK_V8_linux64_s
15540  39  19  48m 1468  101  20:25.35 AK_V8_linux64_s
15541  39  19  48m 1464   99  19:52.12 AK_V8_linux64_s
15544  39  19  48m 1484   99  20:30.54 AK_V8_linux64_s
15545  30  10 114m  10m   99  18:42.69 setiathome-CUDA
15546  39  19  48m 1488   94  19:55.04 AK_V8_linux64_s
16208  39  19  48m 1488   51  12:22.11 AK_V8_linux64_s
15542  39  19  48m 1472   46  12:14.44 AK_V8_linux64_s

Now to the question - am I doing something wrong and cuda does not behave correctly?
Or is this normal and I should just set avg_ncpus & max_ncpus to 1, and pin the process to some core + make it use it exclusively?
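
Since app_info.xml is edited by hand, a closing tag that does not match its opening tag is an easy slip. The sketch below runs a fragment through Python's standard XML parser as a quick well-formedness check. The sample fragments and the function name are made up for the example, and BOINC's own parser may react differently to the same input.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if the text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Illustrative fragments: the second closes <max_ncpus> with the wrong tag.
good = (
    "<app_version><avg_ncpus>0.04</avg_ncpus>"
    "<max_ncpus>0.04</max_ncpus></app_version>"
)
bad = (
    "<app_version><avg_ncpus>0.04</avg_ncpus>"
    "<max_ncpus>0.04</avg_ncpus></app_version>"
)

good_result = check_well_formed(good)  # None: parses cleanly
bad_result = check_well_formed(bad)    # an error message from the parser
```

To check a real file, read it with `open(path).read()` and pass the text to `check_well_formed`.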
Title: Re: SETI MB CUDA for Linux
Post by: pp on 19 Aug 2009, 05:07:50 am
Why play with your cache levels? Change from 6 to 10 and then back to 2, why? Pick a cache level and leave it there. I'm using 10 days.
ahh yes except for one thing. i have seen even this new version of boinc obey the due dates and pick the next workunit from among the newly downloaded.

Set cache to 10 days and let it fill up
Set BOINC to not receive more WUs
Run script
Let computer crunch for 10 days
Repeat
Title: Re: SETI MB CUDA for Linux
Post by: pp on 19 Aug 2009, 05:12:23 am
Now to the question - am I doing something wrong and cuda does not behave correctly?
Or is this normal and I should just set avg_ncpus & max_ncpus to 1, and pin the process to some core + make it use it exclusively?

Are you still running CUDA 2.1? The 100% CPU was apparently a bug in those libraries. Upgrade CUDA to 2.3, nvidia-drivers to 190.xx  and replace your setiathome executable with the 2.2 version and optionally renice that process if you think it's too slow.
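
A rough sketch of the optional renice step, in case it helps. The helper name plan_priority and the core number in the example are made up for illustration; the function only prints the renice and taskset commands so they can be reviewed before anything is run. Raising a process's priority normally needs root.

```shell
#!/bin/sh
# plan_priority is a hypothetical helper, not part of BOINC or the seti app.
# Usage: plan_priority NICE_LEVEL CORE PID...
# It prints one renice and one taskset command per PID and changes nothing.
plan_priority() {
    nice_level=$1
    core=$2
    shift 2
    for pid in "$@"; do
        echo "renice -n $nice_level -p $pid"
        echo "taskset -cp $core $pid"
    done
}

# Example (assumption: the CUDA task shows up as setiathome-CUDA, as in the
# top listing quoted above). Review the output, then run the lines as root:
#   plan_priority 0 7 $(pgrep setiathome-CUDA)
```

Printing the commands instead of running them keeps the sketch harmless; whether pinning to a core helps at all is something to measure on your own box.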
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 19 Aug 2009, 05:32:30 am
ahh yes except for one thing. i have seen even this new version of boinc obey the due dates and pick the next workunit from among the newly downloaded. if they stayed in ascending date order i would agree but it does not seem to work that way for me. at least 3 or 4 times so far i noticed a cuda and/or cpu workunit placed on hold to pick up one that had a closer due date that was recently downloaded. this means there is a danger the gpu app will reject a possible vlar before it can be flagged.

This happens only for vhar workunits, they have shorter deadlines than the rest. VLARs have "normal" deadlines and they are crunched when their time comes, about x(cache) days after they've been downloaded.

Macros, what pp says. Make sure you're using cuda 2.2 or later together with a compatible nvidia driver.

Shameless plug: I've reached #4 in the top hosts list (http://setiathome.berkeley.edu/top_hosts.php). I don't know how long I can hold on there though. Attaching pdf for future proof.

[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: pp on 19 Aug 2009, 05:58:36 am
Congratulations sunu. My goal is a modest top 2000 like I managed in Seti Classic. I'm slowly reaching there and the upgrade to CUDA 2.3 pushed my RAC on the 9800GT over 5000 now which helps a lot  ;D
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Aug 2009, 06:05:43 am
ahh yes except for one thing. i have seen even this new version of boinc obey the due dates and pick the next workunit from among the newly downloaded. if they stayed in ascending date order i would agree but it does not seem to work that way for me. at least 3 or 4 times so far i noticed a cuda and/or cpu workunit placed on hold to pick up one that had a closer due date that was recently downloaded. this means there is a danger the gpu app will reject a possible vlar before it can be flagged.

This happens only for vhar workunits, they have shorter deadlines than the rest. VLARs have "normal" deadlines and they are crunched when their time comes, about x(cache) days after they've been downloaded.

Macros, what pp says. Make sure you're using cuda 2.2 or later together with a compatible nvidia driver.

Shameless plug: I've reached #4 in the top hosts list (http://setiathome.berkeley.edu/top_hosts.php). I don't know how long I can hold on there though. Attaching pdf for future proof.

congrats!!! hehe mine is only a paltry 188 :)

i have had bad luck with 2.3 and had to go back to 2.2 which is flawless for me.

you were right... rac gained 200 points in the last 8 hours. thanks! i didnt think frequent restarting made that huge a difference. still a bit odd with 30min workunits but that seems to be what things are now. not quite sure why. just gonna let it go and see where it falls. maybe in a week or 2 things will level out if its the wu being supplied. or maybe this is the norm for this distribution .. dunno..

after i get another gtx285 to replace this tesla and finish next year's crunching project, i think i will get a different motherboard with more x16 slots and wind up with maybe 3 gtx285s in this machine. that should help things along. i think this tesla has seen better days. probably a bad ram chip.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Aug 2009, 06:16:12 am
Back in the days when cuda needed a whole core, I was running a 3+1 config in my quad core. All processes had the lowest priority (19) and I don't think I had any serious slowdown, maybe a minute or so, not more. And this was my everyday desktop so many things were running, firefox with many many tabs, full 3d compiz effects, everyday backups, etc.

Only now that cuda shares a core with the other seti@home tasks, I started renicing them only to make them higher priority than the other seti@home instances. I think -5 is not necessary.

Question regarding this. I am using the default settings in app_info.xml's <app_version> for cuda as follows:
Code:
<avg_ncpus>0.040000</avg_ncpus>
<max_ncpus>0.040000</max_ncpus>

The problem is, that setiathome-CUDA process has demand obviously higher than that and is able to eat up CPU time of whole one core. That results in other (regular CPU) processes to fight over the CPU time, context switches, cache thrashing etc. ->
Code:
  PID  PR  NI  RES  SHR %CPU    TIME+  COMMAND
15538  39  19  48m 1472  101  13:44.03 AK_V8_linux64_s
15539  39  19  48m 1464  101  20:17.47 AK_V8_linux64_s
15540  39  19  48m 1468  101  20:25.35 AK_V8_linux64_s
15541  39  19  48m 1464   99  19:52.12 AK_V8_linux64_s
15544  39  19  48m 1484   99  20:30.54 AK_V8_linux64_s
15545  30  10 114m  10m   99  18:42.69 setiathome-CUDA
15546  39  19  48m 1488   94  19:55.04 AK_V8_linux64_s
16208  39  19  48m 1488   51  12:22.11 AK_V8_linux64_s
15542  39  19  48m 1472   46  12:14.44 AK_V8_linux64_s

Now to the question - am I doing something wrong and cuda does not behave correctly?
Or is this normal and I should just set avg_ncpus & max_ncpus to 1, and pin the process to some core + make it use it exclusively?


i think you will find the best response by setting your preferences to use 6 or 7 cpus instead of 8, leaving 1 for cuda and your desktop to use. i played around a bit with max_ncpus but did not find a huge difference. mine is set at 0.35.

absolutely, if you do nothing else, change your cuda toolkit and sdk to 2.2 and get the 2.2 application. make sure your driver is at the minimum 185.14 or 185.29. i am using 185.29.

ver 2.1 had huge flaws in it . i have heard 2.3 is even better, however i have not had good luck with 2.3 so i went back to 2.2 until i can figure out what went wrong.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 19 Aug 2009, 06:30:12 am
Thanks pp, riofl. For a long time I was the only one in the top 100 list using linux. Now I see we are three. I hope more will get in the top 100. We have to show that linux is at least equal to windows for seti crunching.

pp,compared to you I don't think I was that good in seti classic.

Small correction to riofl: The driver versions are 185.18.14 and 185.18.29. Latest is 185.18.31. Macros, if you go to cuda 2.3 you'll need 190.18.

Macros, what card are you using? Maybe that 99% is because your card goes out of memory?
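
To check an installed driver against those minimums, something like this could work. It is only a sketch: version_ge and check_driver are made-up helpers, and it assumes a sort that supports -V (GNU coreutils). The installed version can be read from /proc/driver/nvidia/version and is passed in by hand here.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A is at least B.
# Hypothetical helper; relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_driver VERSION: report which minimum from this thread is met
# (185.18.14 for cuda 2.2, 190.18 for cuda 2.3).
check_driver() {
    if version_ge "$1" 190.18; then
        echo "meets the cuda 2.3 minimum"
    elif version_ge "$1" 185.18.14; then
        echo "meets the cuda 2.2 minimum"
    else
        echo "below the cuda 2.2 minimum"
    fi
}

# Example:
#   check_driver 185.18.31
```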
Title: Re: SETI MB CUDA for Linux
Post by: pp on 19 Aug 2009, 06:46:57 am
Thanks sunu. I crunched Classic from the start and had lots of PCs running but I quit when Seti switched to Boinc. I started again with Boinc only a month ago because one of my colleagues challenged me. I keep in front of him with only two GPUs running but I'm planning a Core i7/GTX 295 upgrade to crush him!  I never say no to a challenge...;D
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 19 Aug 2009, 06:52:20 am
Go for it pp, crush him!!!  ;D
Title: Re: SETI MB CUDA for Linux
Post by: pp on 19 Aug 2009, 07:29:35 am
With pleasure, although I can't stop looking at you guys at the top with hundreds of thousands of daily RAC... that's impressive!  :o
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 19 Aug 2009, 07:38:39 am
Tens of thousands not hundreds of thousands. You're probably looking at the Top participants list (http://setiathome.berkeley.edu/top_users.php). I have only one pc so all I can manage is the Top hosts list (http://setiathome.berkeley.edu/top_hosts.php), PCs with the highest RAC output.

Well, if you build that i7/GTX 295 maybe with a second 295 thrown in the mix you'll join the club. All I have is a Q6600 with a GTX295 and a GTX285 and I managed #4. But as I said, I don't know how long I'll last there and I probably won't upgrade for quite some time so sooner or later I'll start to go down the ladder.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 19 Aug 2009, 07:55:24 am
Good thing AtlasFolder doesn't do SETI: http://atlasfolding.com/?page_id=148
That's borderline...  ;)
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 19 Aug 2009, 08:20:38 am
I don't think it is possible with BOINC to use all those motherboards as a single host, it would be good though. I think the best possible setup for a seti crunching host would be with this (http://www.asus.com/product.aspx?P_ID=9ca8hJfGz483noLk&templete=2) motherboard. 4 dual-slot or 7 single-slot graphics cards for uber RAC.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 19 Aug 2009, 08:36:11 am
bit-tech.net recently had an entertaining article about this. They tried both 4xGTX295 and 7x9600GT and not surprisingly, heat was a problem. They tested Folding@Home but it has lots of nice pictures...
http://www.bit-tech.net/bits/2009/08/03/how-to-build-the-best-folding-rig/1
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 19 Aug 2009, 10:06:01 am
Thanks for the link, I haven't seen that.

That's where water cooling enters the picture. 4 x BFG NVIDIA GeForce GTX 295 H2OC 1792MB PCIe 2.0 with ThermoIntelligence Advanced Cooling Solution (http://www.bfgtech.com/bfgrgtx2951792h2ocle.aspx) or 7 x BFG NVIDIA GeForce GTX 285 H2O+ 1GB PCIe 2.0 with ThermoIntelligence Advanced Cooling Solution (http://www.bfgtech.com/bfgrgtx2851024h2ople.aspx) or any other water cooled solution.
Title: Re: SETI MB CUDA for Linux
Post by: macros on 19 Aug 2009, 10:48:42 am
Are you still running CUDA 2.1? The 100% CPU was apparently a bug in those libraries. Upgrade CUDA to 2.3, nvidia-drivers to 190.xx  and replace your setiathome executable with the 2.2 version and optionally renice that process if you think it's too slow.
Macros, what pp says. Make sure you're using cuda 2.2 or later together with a compatible nvidia driver.
i think you will find the best response by setting your preferences to use 6 or 7 cpus instead of 8, leaving 1 for cuda and your desktop to use. i played around a bit with max_ncpus but did not find a huge difference. mine is set at 0.35.

absolutely, if you do nothing else, change your cuda toolkit and sdk to 2.2 and get the 2.2 application. make sure your driver is at the minimum 185.14 or 185.29. i am using 185.29.

ver 2.1 had huge flaws in it . i have heard 2.3 is even better, however i have not had good luck with 2.3 so i went back to 2.2 until i can figure out what went wrong.
Small correction to riofl: The driver versions are 185.18.14 and 185.18.29. Latest is 185.18.31. Macros, if you go to cuda 2.3 you'll need 190.18.

Macros, what card are you using? Maybe that 99% is because your card goes out of memory?

Thanks for everyone's hints.

I've installed nvidia-drivers version 185.18.14 from the Ubuntu PPA source (x-updates) (I don't want to get on the 'manual track' of managing nvidia drivers here...) plus the 2.2 CUDA libraries. I've also upgraded to the setiathome-CUDA_2.2_6.08.x86_64_vlarkill.tar.bz2 client as pp suggested. The first results weren't satisfactory - the setiathome CUDA client would crash with the following error output:

Code:
<core_client_version>6.6.37</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : Quadro FX 4600
           totalGlobalMem = 804585472
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1188000
           totalConstMem = 65536
           major = 1
           minor = 0
           textureAlignment = 256
           deviceOverlap = 0
           multiProcessorCount = 12
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: Quadro FX 4600 is okay
SIGSEGV: segmentation violation
Stack trace (16 frames):
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x47cba9]
/lib/libpthread.so.0[0x7f96066ac080]
/usr/lib/libcuda.so.1[0x7f9607123020]
/usr/lib/libcuda.so.1[0x7f9607128d84]
/usr/lib/libcuda.so.1[0x7f96070f210f]
/usr/lib/libcuda.so.1[0x7f9606e7db3b]
/usr/lib/libcuda.so.1[0x7f9606e8e46b]
/usr/lib/libcuda.so.1[0x7f9606e76211]
/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7f9606e6ffaa]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x5ace4b]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x40d4ca]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x419f23]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x424c7d]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x407f60]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7f96063495a6]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu(__gxx_personality_v0+0x241)[0x407be9]

Exiting...

</stderr_txt>
]]>

Then I've made an attempt to run the seti CUDA client standalone on the very same workunit and guess what - it worked.  :o
After messing around, I've ended up in a state where there is one CUDA task running (and this time it seems to run correctly - around 3-4% CPU time), but I don't have an explanation for the previous crashes.

The machine is:
Dual QC Xeon X5460 @ 3.16GHz
16GiB RAM
nVidia Quadro FX 4600
Ubuntu 9.04 w/ 2.6.28-15-server (I understood from other threads, that this might be an issue, but it doesn't really add up to fact that I didn't have a single compute error until I've upgraded to 2.2 CUDA + 2.2 seti CUDA client)
boinc ver. 6.6.37
Title: Re: SETI MB CUDA for Linux
Post by: pp on 19 Aug 2009, 11:04:18 am
The crash dump is still referencing the old executable. Did you update your app_info.xml? Also make sure you copy the new libcudart.so.2 and libcufft.so.2 to your projects/setiathome.berkeley.edu directory. And finally, as stated in another thread, also copy the new executable to /usr/local/bin or whatever directory you have in your PATH. I have had no problems since following this advice (well, apart from having to renice the executable to level 0 to give it enough CPU time).
Title: Re: SETI MB CUDA for Linux
Post by: macros on 19 Aug 2009, 11:13:37 am
The crash dump is still referencing the old executable.

True, but I got the same for the newer one; I just picked one from the error list and didn't notice it's the old one...

Quote
Did you update your app_info.xml? Also make sure you copy the new libcudart.so.2 and libcufft.so.2 to your projects/setiathome.berkeley.edu directory.
And finally, as stated in another thread, also copy the new executable to /usr/local/bin or whatever directory you have in your PATH. I have had no problems since following this advice (well, apart from having to renice the executable to level 0 to give it enough CPU time).

Yes, I did all that. Anyway, it seems to be running now; since I didn't make one change at a time, I don't know exactly what the cause was.  ;) ::)
Besides, it's just the first WU, hopefully there will be no more errors.

edit: It works. Finally :)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Aug 2009, 01:02:20 pm
The crash dump is still referencing the old executable.

True, but I got the same for the newer one; I just picked one from the error list and didn't notice it's the old one...

Quote
Did you update your app_info.xml? Also make sure you copy the new libcudart.so.2 and libcufft.so.2 to your projects/setiathome.berkeley.edu directory.
And finally, as stated in another thread, also copy the new executable to /usr/local/bin or whatever directory you have in your PATH. I have had no problems since following this advice (well, apart from having to renice the executable to level 0 to give it enough CPU time).

Yes, I did all that. Anyway, it seems to be running now; since I didn't make one change at a time, I don't know exactly what the cause was.  ;) ::)
Besides, it's just the first WU, hopefully there will be no more errors.

edit: It works. Finally :)

one thing you need to make sure of is that the project directory where the cuda libs are is listed in the ld.so.conf file and that you have run ldconfig. without that it is very likely it would crash possibly a few times and then find its libraries by accident.
Title: Re: SETI MB CUDA for Linux
Post by: macros on 19 Aug 2009, 01:58:20 pm
one thing you need to make sure of is that the project directory where the cuda libs are is listed in the ld.so.conf file and that you have run ldconfig. without that it is very likely it would crash possibly a few times and then find its libraries by accident.
Yeah, I always verify DSO availability with ldd on every client ...
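
For anyone who wants to script that check, a minimal sketch follows. parse_missing and missing_libs are names invented here; the only thing assumed about ldd is that it marks unresolved libraries with "not found". After adding a directory to ld.so.conf, ldconfig still has to be run as root, as mentioned earlier in the thread.

```shell
#!/bin/sh
# parse_missing: read ldd output on stdin, print libraries marked "not found".
parse_missing() {
    awk '/=> not found/ { print $1 }'
}

# missing_libs BINARY: list the shared libraries ldd cannot resolve.
# Errors from ldd itself are hidden, so an empty result is only a hint.
missing_libs() {
    ldd "$1" 2>/dev/null | parse_missing
}

# Example, run from the project directory (binary name taken from this thread):
#   missing_libs ./setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu
# No output usually means every library was resolved.
```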
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 21 Aug 2009, 06:48:26 am
riofl, what is happening?

I've checked again your host today and I've seen this: http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=40&show_names=0&state=2

All 2 hundred and 3 hundred sec tasks were done by your 285. All two-digit sec tasks were done by your tesla. This is completely abnormal.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Aug 2009, 08:22:04 pm
riofl, what is happening?

I've checked again your host today and I've seen this: http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=40&show_names=0&state=2

All 2 hundred and 3 hundred sec tasks were done by your 285. All two-digit sec tasks were done by your tesla. This is completely abnormal.

i have no idea. i agree unless my desktops are keeping too many shaders busy which i dont think standard gui desktops use many of them if at all then i am at a loss... the 285 should be at least near the speed of the tesla allowing for the 285 to be busy with things on the desktops. since we have had some strong storms in the area boinc has been down for the past 3 hours or so and i have just started things back up.

i think maybe i will boot this machine from my tuning windows drive tomorrow (keep it in a drawer) and review what riva tuner and the evga program tell me. nvidia-settings shows me that all the clocks meet what they are supposed to be. now i dont know about desktop settings much but there is one change i made in the past few weeks with nvidia-settings. i unchecked Sync to VBlank in xvideo settings and also unchecked sync to vblank and allow flipping in the opengl settings. wasnt sure what they did but there seemed to be no difference. should they be checked? outside of that, the clock freqs on the 285 are as follows:

2D settings gpu 300mhz memory 100mhz
3D settings  gpu 690mhz memory 1300mhz

power mizer, which seems to not have settings, says: adaptive clocking enabled, performance level 2, performance mode desktop. level 2 is the 3d settings above. however i remember when i first got the card, performance mode said maximum performance and somewhere along the line it changed to desktop. since the other settings are the same i can only assume which text shows up is a function of which driver is being used.

does this give any clues?
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 21 Aug 2009, 08:23:01 pm
Hi.

I was just wondering if there are any known issues in using the CUDA client with the 2.6.30 kernel ? I recently built a 2.6.30 kernel (to see if the AP units will fail), and noticed that my CUDA units were appreciably slower (taking over an hour).
I just switched back the latest ubuntu kernel (2.6.28-15-generic), which seems to work fine.

Any suggestions, any particular info you need ? I'm using the same nvidia driver in both cases (185.18.31, on an x86_64 platform)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Aug 2009, 08:27:17 pm
riofl, what is happening?

I've checked again your host today and I've seen this: http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=40&show_names=0&state=2

All 2 hundred and 3 hundred sec tasks were done by your 285. All two-digit sec tasks were done by your tesla. This is completely abnormal.

since 6.9.0 reports 2 teslas, could it be possible it is mixing up which device is 0 and which is 1? because it is completely odd since the tesla is running gpu 500mhz and memory 900mhz so it should be considerably slower. it rates both devices it thinks are teslas at 74gflops yet the 285 is rated by 6.6.11 as 127gflops

i am going to reboot this tomorrow so when i do i am going to go over the settings in cmos. presently it is set to auto on pci-e bus frequency. maybe i will fix it at 100mhz .. it could be doing God knows what in auto.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Aug 2009, 08:36:43 pm
Hi.

I was just wondering if there are any known issues in using the CUDA client with the 2.6.30 kernel ? I recently built a 2.6.30 kernel (to see if the AP units will fail), and noticed that my CUDA units were appreciably slower (taking over an hour).
I just switched back the latest ubuntu kernel (2.6.28-15-generic), which seems to work fine.

Any suggestions, any particular info you need ? I'm using the same nvidia driver in both cases (185.18.31, on an x86_64 platform)

i am using 2.6.29. and now that you mention it i have been having issues for a few weeks. i installed this on july 23rd. unfortunately i cannot remember far enough back since i have made so many other changes as well whether performance degraded then or not. i may try going back to my other kernel, 2.6.25 and see what happens.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Aug 2009, 08:41:06 pm
riofl, what is happening?

I've checked again your host today and I've seen this: http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=40&show_names=0&state=2

All 2 hundred and 3 hundred sec tasks were done by your 285. All two-digit sec tasks were done by your tesla. This is completely abnormal.

lordvader brought up an interesting point i have not even considered. july 23rd i switched from 2.6.25 to 2.6.29 kernel. that is close to the time i started having issues. i am going to try switching back tomorrow... unfortunately it will require some recompiling of the kernel and all modules and a few other things since with the switch to 2.6.29 i also switched gcc to 4.3.2

also maybe i missed something in getting rid of cuda 2.3. i guess a good manual inspection of things is in order tomorrow.
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 22 Aug 2009, 03:44:44 am
I'm gonna try a vanilla built 2.6.28.10 kernel, see if I get the same performance issues (and hopefully successful AP units ...).

This is fun ! Damn I missed this stuff !
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Aug 2009, 06:10:58 am
I'm gonna try a vanilla built 2.6.28.10 kernel, see if I get the same performance issues (and hopefully successful AP units ...).

This is fun ! Damn I missed this stuff !

used to be for me too until i started doing this stuff for a living.. now its just plain irritating when something doesnt go right the first time.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Aug 2009, 12:06:34 pm
riofl, what is happening?

I've checked again your host today and I've seen this: http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=40&show_names=0&state=2

All 2 hundred and 3 hundred sec tasks were done by your 285. All two-digit sec tasks were done by your tesla. This is completely abnormal.

well i just went back to 2.6.25, recompiled everything to do with cuda/boinc, made sure there were no cuda 2.3 libs laying around so now we wait and see if that changes anything. looks like im still gonna get 30 min workunits but time will tell.

booted off my windows "nvidia checking" disk and checked both cards with rivatuner and also evga precision and it all looks right. everything set to factory except fans which i keep at 100%. did not use them to set anything simply cancelled out of whatever i looked at. i only use nvclock to set fans on boot of linux after X has reset them both.


Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Aug 2009, 02:37:08 pm
12 workunits done and still around 26min. a little faster average than before at 30 min but...

this is odd

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Aug 2009, 10:45:36 pm
also the 3 digit time workunits are still the 285 and 2 digit the tesla.  i wonder if it has something to do with how busy my desktops are? i have quite a lot going on 24/7 with 18 gkrellm server monitors running in one desktop, usually 4 or 5 browser windows in different desktops with maybe 28 or so tabs open, average 8 or 10 ssh konqueror tabs open into our servers, email, virtualbox running xp which also runs boinc, kopete, 8 or 9 postit notes in the various desktops, a few kedit windows open plus momentary things like adobe reader, smplayer or whatever.. im in totally new territory here. my experience in graphics cards is plug it in and make sure it works with a stable and peppy screen :)

however the 'busyness' of the desktops is not new and was basically the same when i had 10-13min workunits out of both cards.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 22 Aug 2009, 11:04:28 pm
i am trying to help someone in the boinc cuda forum. his workunits are giving strange things. he runs windows, the vlarkill cuda app and the ak cpu app, a 9400gt 1gb ram and the 190 driver with 2.3 toolkit. this is one of his workunits.. they are all similar..

his cuda app is specifically CUDA - MB_6.08_CUDA_V12_VLARKill_FPLim2048.exe.

Work Unit Info:
...............
WU true angle range is : 1.479648
After app init: total GPU memory 1073741824 free GPU memory 963768320

Flopcounter: 11519105207831.588000

Spike count: 0
Pulse count: 0
Triplet count: 2
Gaussian count: 0

Wall-clock time elapsed since last restart: 3348.9 seconds
class T_FFT<0>: total=2.67e+006, N=98124, <>=27 (2.70e+001), min=0 (0.00e+000)
class T_FFT<8>: total=9.30e+001, N=3, <>=31 (3.10e+001), min=31 (3.10e+001)
class T_FFT<16>: total=1.25e+002, N=7, <>=17 (1.70e+001), min=15 (1.50e+001)
class T_FFT<64>: total=2.65e+002, N=29, <>=9 (9.00e+000), min=0 (0.00e+000)
class T_FFT<256>: total=0.00e+000, N=115, <>=0 (0.00e+000), min=0 (0.00e+000)
class T_FFT<512>: total=2.65e+003, N=229, <>=11 (1.10e+001), min=0 (0.00e+000)
class T_FFT<1024>: total=4.30e+003, N=457, <>=9 (9.00e+000), min=0 (0.00e+000)
class T_FFT<2048>: total=1.50e+004, N=915, <>=16 (1.60e+001), min=0 (0.00e+000)
class T_FFT<4096>: total=3.94e+003, N=211, <>=18 (1.80e+001), min=15 (1.50e+001)
class T_FFT<8192>: total=1.72e+004, N=845, <>=20 (2.00e+001), min=15 (1.50e+001)
called boinc_finish

</stderr_txt>
]]>

i dont know what the fft lines mean but i have a feeling they should not be there. my first thought was to drop back to the 185 series drivers and the 2.2 toolkit. im not sure 2.3 and a 9400 can work together.

any ideas?
Title: Re: SETI MB CUDA for Linux
Post by: Josef W. Segur on 23 Aug 2009, 01:38:25 am
i am trying to help someone in the boinc cuda forum. his workunits are giving strange things. he runs windows, the vlarkill cuda app and the ak cpu app, a 9400gt 1gb ram and the 190 driver with 2.3 toolkit. this is one of his workunits.. they are all similar..

his cuda app is specifically CUDA - MB_6.08_CUDA_V12_VLARKill_FPLim2048.exe.
...
i dont know what the fft lines mean but i have a feeling they should not be there. my first thought was to drop back to the 185 series drivers and the 2.2 toolkit. im not sure 2.3 and a 9400 can work together.

any ideas?

The FFT lines are normal for those builds, just keeping some statistics on those which may give clues for future improvements. The FPLim2048 in the file name is also related to the same effort.
                                                                         Joe
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 23 Aug 2009, 04:33:45 am
...
The FFT lines are normal for those builds, ...

Aww yeah, forgot about those  :-\ . Well in case anyone's wondering why the <256> size measures 0, it mutated into a test piece toward future coding efforts, so the 'standard' code inside the timer is never run for that one.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Aug 2009, 06:54:15 am
thanks guys. ok then his processing is probably normal. a 9400 is not the fastest card and he was concerned something was wrong since his cuda processing times just about match his cpu times. his cpu is a Q9300 so it is very probable since i believe a 9400 would weigh in around 4gflops or so going by boinc's measurement standards.

thanks again.

Title: Re: SETI MB CUDA for Linux
Post by: Kunin on 24 Aug 2009, 04:16:26 pm
Two linux hosts in the top 20 now!
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Aug 2009, 05:57:10 pm
cool!
Title: Re: SETI MB CUDA for Linux
Post by: vejpuste on 25 Aug 2009, 07:55:46 am
Hello,
I tried Crunch3r's CUDA SETI application and searched Google and this forum too, but the result is not OK.
I tried CUDA 2.1 with the application setiathome-CUDA-6.08.x86_64-pc-linux-gnu. It computes OK but takes 100% of a CPU.
http://setiathome.berkeley.edu/result.php?resultid=1340190227

Now I am testing CUDA 2.3 with the same application and the result is SIGSEGV: segmentation violation
http://setiathome.berkeley.edu/result.php?resultid=1344497914
CentOS5 64bit
CUDA driver 190.18
CUDA Toolkit and SDK 2.3
ldd setiathome-CUDA-6.08.x86_64-pc-linux-gnu
libcufft.so.2 => /usr/local/cuda/lib64/libcufft.so.2 (0x00002b7786226000)
libcudart.so.2 => /usr/local/cuda/lib64/libcudart.so.2 (0x00002b7786c7a000)
libcuda.so.1 => /usr/lib64/libcuda.so.1 (0x00002b7786eba000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003ca0200000)
libm.so.6 => /lib64/libm.so.6 (0x0000003c9a200000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003c9aa00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003c99e00000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003c9a600000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003c9f600000)
librt.so.1 => /lib64/librt.so.1 (0x0000003c9e600000)
libz.so.1 => /usr/lib64/libz.so.1 (0x0000003c9ae00000)
/lib64/ld-linux-x86-64.so.2 (0x0000003c99a00000)

I tried adding setiathome-CUDA-6.08.x86_64-pc-linux-gnu to /usr/local/bin but it is still not working.
CUDA 2.2 gives a segmentation violation too.
Thanks for any ideas
Libor
Title: Re: SETI MB CUDA for Linux
Post by: pp on 25 Aug 2009, 08:45:02 am
You need the 2.2 CUDA binary: http://calbe.dw70.de/mb/viewtopic.php?p=868#p868
Title: Re: SETI MB CUDA for Linux
Post by: vejpuste on 25 Aug 2009, 08:59:23 am
You need the 2.2 CUDA binary: http://calbe.dw70.de/mb/viewtopic.php?p=868#p868
I tried this application too, but it has another problem:
ldd setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu
./setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by ./setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu)
CentOS5 currently has no GLIBCXX_3.4.9 in its updates.
Libor
Title: Re: SETI MB CUDA for Linux
Post by: Urs Echternacht on 25 Aug 2009, 09:01:59 am
You need the 2.2 CUDA binary: http://calbe.dw70.de/mb/viewtopic.php?p=868#p868
I tried this application too, but it has another problem:
ldd setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu
./setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by ./setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu)
CentOS5 currently has no GLIBCXX_3.4.9 in its updates.
Libor
Search for a newer version of "libstdc++"; GLIBCXX_3.4.9 is a symbol version that the library has to provide.
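
One way to see which GLIBCXX versions an installed libstdc++ provides is to pull the version strings out of the library file. This is a sketch with made-up function names, and it assumes GNU grep (for -a and -o); the library path in the example is the one from the ldd output quoted above.

```shell
#!/bin/sh
# glibcxx_versions FILE: print the GLIBCXX_x.y.z strings found in FILE.
glibcxx_versions() {
    grep -a -o 'GLIBCXX_[0-9][0-9.]*' "$1" | sort -u
}

# has_glibcxx FILE VERSION: succeed if FILE contains exactly that version tag.
has_glibcxx() {
    glibcxx_versions "$1" | grep -F -q -x "$2"
}

# Example:
#   has_glibcxx /usr/lib64/libstdc++.so.6 GLIBCXX_3.4.9 && echo present
```

If the tag is missing, the binary was built against a newer libstdc++ than the one installed.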
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 25 Aug 2009, 11:32:50 am
... currently trying the 2.6.27 kernel, and CUDA seems fine (and hoping that AP is stable).

Quick question though. I had about 70 WUs uploaded, but not reported. I then tinkered with my app_info.xml, and lost all the WUs, so can't report them ! Is there a way to report with the boinc client ? Otherwise, there's 70 units sitting there, and no one will ever know !!!
Title: Re: SETI MB CUDA for Linux
Post by: Urs Echternacht on 25 Aug 2009, 11:38:01 am
... currently trying the 2.6.27 kernel, and CUDA seems fine (and hoping that AP is stable).

Quick question though. I had about 70 WUs uploaded, but not reported. I then tinkered with my app_info.xml, and lost all the WUs, so can't report them ! Is there a way to report with the boinc client ? Otherwise, there's 70 units sitting there, and no one will ever know !!!
On main the only way currently is to run your cache dry, detach from the project and reattach immediately afterwards. That way the "lost" WUs will be marked "detached" and sent out again.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 25 Aug 2009, 09:32:09 pm
I don't understand something. The developers obviously know that cuda devices hate vlar and vhar workunits. why can't they put the smarts that are in the perl script directly into boinc so that as it downloads the workunit it reads the angles etc and decides where to assign it at that point? it should be an easy thing to implement saving lots of trouble for people like me who have a card that locks up at the slightest hint of a vlar and having to run something external to make sure 'proper' workunits are fed to cuda?

seems to me that this is a gross oversight leaving this out of boinc.
Title: Re: SETI MB CUDA for Linux
Post by: Josef W. Segur on 26 Aug 2009, 01:28:34 am
I don't understand something. The developers obviously know that cuda devices hate vlar and vhar workunits. why can't they put the smarts that are in the perl script directly into boinc so that as it downloads the workunit it reads the angles etc and decides where to assign it at that point? it should be an easy thing to implement saving lots of trouble for people like me who have a card that locks up at the slightest hint of a vlar and having to run something external to make sure 'proper' workunits are fed to cuda?

seems to me that this is a gross oversight leaving this out of boinc.

True, the S@H CUDA app has its greatest advantage for midrange work. The stock BOINC won't be changed as you suggest for the simple reason that it's designed to support common features of use to all projects. BOINC is separate from S@H even though they both derive from Classic S@H in part. OTOH, it's LGPL open source so a special version could be built, feel free to volunteer for that work.  ;)

As far as the S@H project providing some additional features, I figure once volunteers begin donating enough money to resolve ongoing problems the staff might be able to find time to work something out. Meanwhile, many of those who have CUDA crunching turned off while using the computer for other things probably don't even notice the difficulty of VLAR. There isn't really any similar issue with VHAR, the main thing there is simply the number of tasks which have to be downloaded when most are VHAR 'shorties'.
                                                                                     Joe
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 26 Aug 2009, 06:04:41 am
makes sense. i keep forgetting about other projects since i only do seti. maybe a shell after the list download would be in order to run a script if it was there. i always find it difficult traipsing around others' code trying to figure out what is what since a majority of the time it is not well commented if at all. i have never looked at boinc source. maybe once they have a stable version that works with multiple devices and properly reports them i might dig in and see what is what. i doubt i would submit mods for this because it would create a separate seti branch of boinc which probably would not be wise. of course, a simple cmdline flag --seti or some such could turn the extra code on or off.. hmm... have to think a bit about this.
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 31 Aug 2009, 01:34:21 am
Hey all.

I recently got a computation error with one of my CUDA WU's, but it wasn't a segfault. Maybe there's some info in there for you guys :

http://setiathome.berkeley.edu/result.php?resultid=1347109626
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 31 Aug 2009, 05:03:48 am
Hey all.

I recently got a computation error with one of my CUDA WU's, but it wasn't a segfault. Maybe there's some info in there for you guys :

http://setiathome.berkeley.edu/result.php?resultid=1347109626

Thanks, but it's a known error.
It even has its own sticky thread on main ("-12"). It's some limitation of the current CUDA code.
But sometimes this error can appear when the card goes into a broken state.
So if you see many of these errors, it's the equivalent of many "-9 overflows" that don't validate against the wingman => the host should be rebooted.
Unfortunately, I have one such buggy card, so I've had the chance to see this "-12" for dozens of results at once (usually it's a pretty rare error).
Title: Re: SETI MB CUDA for Linux
Post by: letni on 31 Aug 2009, 06:59:36 pm
Hey folks, I'm trying to set up 2 separate CUDA devices on a dedicated 64 bit Linux system with the setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu.  I have the system running with 1 CUDA (9600GSO) no problem, but I decided to stick in the 8800GTS from my desktop machine and now I'm getting this message:

CUDA device: GeForce 8800GTS (driver version 0, compute capability 1.0, 320MB, est. 41GFLOPS)
CUDA device (not used): GeForce 8800GTS (driver version 0, compute capability 1.0, 320MB, est. 41GFLOPS)

For some reason it doesn't detect my 9600 GSO as a 9600 anymore, but still uses it (the card is physically hot).  The 8800 I added in as a secondary card is cool to the touch and I don't see a thread running for it.   Here is the lspci output:

02:00.0 VGA compatible controller: nVidia Corporation GeForce 9600 GSO (rev a2)
03:00.0 VGA compatible controller: nVidia Corporation G80 [GeForce 8800 GTS] (rev a2)

Is it even possible to use CUDA with two different devices?  If so, what am I missing?

Thanks,

Letni
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Sep 2009, 06:32:23 am
After a small hiatus I'm back  :)

now i dont know about desktop settings much but there is one change i made in the past few weeks with nvidia-settings. i unchecked Sync to VBlank in xvideo settings and also unchecked sync to vblank and allow flipping in the opengl settings. wasnt sure what they did but there seemed to be no difference.   should they be checked?
Sync to vblank and flipping don't have anything to do with cuda computations. They are about tearing.

power mizer which seems to not have settings says adaptive clocking enabled performance level2 performance mode desktop. level2 is the 3d settings above however i remember when i first got the card, performance mode said maximum performance and somewhere along the line it changed to desktop. since the other settings are the same i can only assume it is a function of which driver is being used for which text shows up.
I don't like powermizer at all, but it doesn't seem you have a problem there.

since 6.9.0 reports 2 teslas, could it be possible it is mixing up which device is 0 and which is 1? because it is completely odd since the tesla is running gpu 500mhz and memory 900mhz so it should be considerably slower. it rates both devices it thinks are teslas at 74gflops yet the 285 is rated by 6.6.11 as 127gflops
Again, what boinc reports is irrelevant, just cosmetic. What cuda client sees is important and in your case it reports your cards right.

i am going to reboot this tomorrow so when i do i am going to go over the settings in cmos. presently it is set to auto on pci-e bus frequency. maybe i will fix it at 100mhz .. it could be doing God knows what in auto.
I don't think pci-e bus frequency has any noticeable effect in cuda speed and even if it did, it should affect both your cards, not only one.

also the 3 digit time workunits are still the 285 and 2 digit the tesla.  i wonder if it has something to do with how busy my desktops are? i have quite a lot going on 24/7 with 18 gkrellm server monitors running in one desktop, usually 4 or 5 browser windows in different desktops with maybe 28 or so tabs open, average 8 or 10 ssh konqueror tabs open into our servers, email, virtualbox running xp which also runs boinc, kopete, 8 or 9 postit notes in the various desktops, a few kedit windows open plus momentary things like adobe reader, smplayer or whatever.. im in totally new territory here. my experience in graphics cards is plug it in and make sure it works with a stable and peppy screen :)

however the 'busyness' of the desktops is not new and was basically the same when i had 10-13min workunits out of both cards.

This is a very busy desktop. Have you tried running a few workunits with absolutely nothing of the above running? I think your times will return to "normal".

Hello,
I tried Crunch3r's CUDA seti application and searched google and this forum too, but the result is not OK.
I tried CUDA 2.1 with the application setiathome-CUDA-6.08.x86_64-pc-linux-gnu. This computes OK but takes 100% of the CPU.
http://setiathome.berkeley.edu/result.php?resultid=1340190227

Now I am testing CUDA 2.3 with the same application and the result is SIGSEGV: segmentation violation

I tried adding setiathome-CUDA-6.08.x86_64-pc-linux-gnu to /usr/local/bin, but it is still not working.
CUDA 2.2 gives a segmentation violation too.
Thanks for any ideas
Libor
Cuda 2.1 libraries have a bug, that's why you have 100% CPU utilisation. You need 2.2 or later. Also please follow my post in http://lunatics.kwsn.net/linux/seti-mb-cuda-for-linux.msg19014.html#msg19014 very carefully.

I tried this application too, but here is another problem:
ldd setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu
./setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by ./setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu)
CentOS 5 has no GLIBCXX_3.4.9 in its updates at the moment.
Libor
CentOS, since it's more enterprise oriented, uses old versions of... well, everything. Have you had any success?

@riofl and lordvader about kernel versions
Have you compiled these kernels yourselves or have you got them from elsewhere? Maybe some performance/optimization options you left out? Do you have any nvidia related errors in your syslog when running cuda? Any other observations with these newer kernels?

Is it even possible to use CUDA with two different devices?  If so, what am I missing?
Of course it is. Any link to your host? Also try putting

<use_all_gpus>1</use_all_gpus>

in the options section of your cc_config.xml and tell boinc to read the config file.
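For reference, a minimal sketch of generating such a cc_config.xml with Python's standard library. It assumes the usual cc_config / options layout of the file; the function name build_cc_config is hypothetical, and where the file has to live depends on your BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_cc_config(use_all_gpus: bool = True) -> str:
    """Return a cc_config.xml document with the use_all_gpus option set."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    # BOINC expects 1 or 0 as the element text.
    ET.SubElement(options, "use_all_gpus").text = "1" if use_all_gpus else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config())
```

Write the returned string to cc_config.xml in the BOINC data directory and have the client re-read it (for example with boinccmd --read_cc_config, assuming your client version supports that option), or simply restart the client.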
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Sep 2009, 12:05:14 pm
i always compile my own kernels and have been since i first ran linux 8 yrs ago. in switching back and forth there was no difference in my cuda processing times. at first i thought there was but it was just different size workunits.

no, i have not tried just running boinc without a gui.. i will try that this coming weekend when i can spare some downtime from work and monitoring the servers. will let it run for 1 hr with no X running and then will go in and see if there are any differences.   

thing is, the usage of my desktops has not changed much at all during the past year so i had the same stuff open with the 13min workunits a few months ago. will be interesting to see if the 3 digit numbers move into 2 digit though on the tasks report.

thanks for the info about video settings. i was not sure what they did but when i did not see any changes, i figured "off" may be better..  i hate powermizer myself but i cannot find any options to turn it off and leave the card in high perf mode at all times. every time i spot check it its always in hi perf mode so maybe my temps are not high enough to trigger it (assuming temp is its only trigger) and if idle is a trigger, my desktop is never idle even when i go to bed, all the gkrellm monitors are advancing their graphs every second.

seems so strange with all the mb servers down, my cuda cards are both idling at around 46c. really odd since i am used to them being in the low or mid 60s all the time.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Sep 2009, 01:48:51 pm
no, i have not tried just running boinc without a gui.. i will try that this coming weekend when i can spare some downtime from work and monitoring the servers. will let it run for 1 hr with no X running and then will go in and see if there are any differences.   
Leave X, just close all those apps you have running. Just the desktop with boinc in the background.

thing is, the usage of my desktops has not changed much at all during the past year so i had the same stuff open with the 13min workunits a few months ago. will be interesting to see if the 3 digit numbers move into 2 digit though on the tasks report.
The bigger multibeam workunits started about a month or two ago.

i hate powermizer myself but i cannot find any options to turn it off and leave the card in high perf mode at all times. every time i spot check it its always in hi perf mode so maybe my temps are not high enough to trigger it (assuming temp is its only trigger) and if idle is a trigger, my desktop is never idle even when i go to bed, all the gkrellm monitors are advancing their graphs every second.
Many people have tried many ways to turn off powermizer usually with no success. :D   Powermizer levels are triggered by GPU usage or very high (95+°C) temperatures.

seems so strange with all the mb servers down, my cuda cards are both idling at around 46c. really odd since i am used to them being in the low or mid 60s all the time.
I have some WUs cached for a few days more  ;D
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Sep 2009, 02:52:01 pm
no, i have not tried just running boinc without a gui.. i will try that this coming weekend when i can spare some downtime from work and monitoring the servers. will let it run for 1 hr with no X running and then will go in and see if there are any differences.   
Leave X, just close all those apps you have running. Just the desktop with boinc in the background.

ok ill close down all my 'server' functions as well like my jabber server, bind, etc. so its just x and boinc running.

thing is, the usage of my desktops has not changed much at all during the past year so i had the same stuff open with the 13min workunits a few months ago. will be interesting to see if the 3 digit numbers move into 2 digit though on the tasks report.
The bigger multibeam workunits started about a month or two ago.

hehe thats about the time i started noticing issues. maybe they're not issues after all.


i hate powermizer myself but i cannot find any options to turn it off and leave the card in high perf mode at all times. every time i spot check it its always in hi perf mode so maybe my temps are not high enough to trigger it (assuming temp is its only trigger) and if idle is a trigger, my desktop is never idle even when i go to bed, all the gkrellm monitors are advancing their graphs every second.
Many people have tried many ways to turn off powermizer usually with no success. :D   Powermizer levels are triggered by GPU usage or very high (95+°C) temperatures.

ok well i hardly do anything involving true graphics besides cuda running on that stuff and i have my hardware monitors set to shut the system down if the gpu gets to 80c.. once i adjusted the fans and air flow in the case they have never gone above 70c.

seems so strange with all the mb servers down, my cuda cards are both idling at around 46c. really odd since i am used to them being in the low or mid 60s all the time.
I have some WUs cached for a few days more  ;D

lucky. im set for 10 days but ran out of cuda.. still have 400+ cpu units which are all vlar/vhar. been trying to get more since saturday.

i'll just wait out the stampede and give the servers a chance to settle and then go for it again.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Sep 2009, 02:53:10 pm
ugh i have to learn to remember to add extra quote tags.. all my responses are in the quote area..

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Sep 2009, 04:03:50 pm
You can switch those VHARs to your graphics cards.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Sep 2009, 05:29:09 pm
You can switch those VHARs to your graphics cards.

cool. i have more than enough of those

Number of CPU tasks:420
Number of GPU tasks:0
Number of VLAR tasks:103
Number of VHAR tasks:317

just changed the script around to give all vhar tasks to gpu so now cpu will only get vlar tasks plus spillover if any. looks like they are loving it too :)

Number of CPU tasks:103
Number of GPU tasks:317
Number of VLAR tasks:103
Number of VHAR tasks:317
Total tasks: 420


thanks!


interesting. ... on the vhar ones my gtx285 is finishing the work units 2 full minutes ahead of my tesla in every case... so far they finished 6 of them varying from  11 to 14  min each. will check out the tasks report to see if they say something different too, just for the info. dunno what it will prove if anything... the test over the weekend should tell whether im keeping the 285 a bit busy or not.... i would think the 285 would not even notice desktop activity...
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Sep 2009, 05:53:47 pm
I've never used that rebranding script and don't know how it works. Seeing the code I think changing

if($trueAR < 0.13 || $trueAR > 1.127)

to

if($trueAR < 1.127)

should do the trick. Backup first!
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Sep 2009, 05:59:56 pm
yep thats exactly what line i changed... simply commented out the original line and inserted the new line under it.

only i used the left side since the logic assigns that evaluation to cpu.

if($trueAR < 0.13){
     $tasks{$WUname}=603;

maybe to get a bit better distribution i need to change the number from 0.13 to maybe 0.5 to make it more fair for the cpus to get some non vlar work too?  think i will leave it like this and see what the distribution looks like over this week then maybe play with that number a bit keeping it below 0.60 to be safe..
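The angle-range split discussed here can be sketched as follows. This is a Python illustration of the logic only, not the actual Perl rebranding script: the 0.13 and 1.127 cut-offs are the ones from the quoted if-statement, while the function names and the "CPU"/"GPU" labels are assumptions for the example.

```python
VLAR_MAX_AR = 0.13    # below this angle range a task counts as VLAR
VHAR_MIN_AR = 1.127   # above this angle range a task counts as VHAR


def classify(true_ar: float) -> str:
    """Label a workunit by its true angle range."""
    if true_ar < VLAR_MAX_AR:
        return "VLAR"
    if true_ar > VHAR_MIN_AR:
        return "VHAR"
    return "MID"


def assign(true_ar: float, vhar_to_gpu: bool = True) -> str:
    """Pick a device: VLAR always goes to the CPU, VHAR only if requested."""
    kind = classify(true_ar)
    if kind == "VLAR":
        return "CPU"
    if kind == "VHAR" and not vhar_to_gpu:
        return "CPU"
    return "GPU"


def summarize(angle_ranges, vhar_to_gpu: bool = True) -> dict:
    """Count how many tasks end up on each device."""
    totals = {"CPU": 0, "GPU": 0}
    for true_ar in angle_ranges:
        totals[assign(true_ar, vhar_to_gpu)] += 1
    return totals
```

With vhar_to_gpu=False this behaves like the original two-sided condition (both VLAR and VHAR go to the CPU); with the default it matches the "left side only" change. Raising VLAR_MAX_AR is the knob for shifting more work to the CPU.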
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Sep 2009, 06:26:29 pm
Well, just play with that AR value to keep both CPU and GPU busy till new WUs come out to refill your cache.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Sep 2009, 06:52:00 pm
hmmm don't remember if i asked this before, but is this -9 result overflow message anything to worry about. i  have seen it a few times. the wu was validated and credit given though.

Work Unit Info:
...............
WU true angle range is :  0.508292
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected exceeds the storage space allocated.

Flopcounter: 27434172839395.402344

Spike count:    28
Pulse count:    2
Triplet count:  0
Gaussian count: 0
called boinc_finish

</stderr_txt>
]]>
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Sep 2009, 07:29:37 pm
These result overflows are common. You should pay attention though if you generate a lot of these while your wingmen return "good" results. In this case it could be a hardware problem on your part and these results will be invalid.
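To keep an eye on how often this happens, the stderr text of finished results could be scanned for the overflow message. A rough Python sketch, assuming the stderr layout shown in the post above; the function names are hypothetical and fetching the text from the results pages is left out.

```python
import re


def parse_stderr(text: str) -> dict:
    """Extract angle range, overflow flag and signal counts from stderr text."""
    ar = re.search(r"WU true angle range is :\s*([0-9.]+)", text)
    counts = {
        name.lower(): int(value)
        for name, value in re.findall(
            r"(Spike|Pulse|Triplet|Gaussian) count:\s*(\d+)", text
        )
    }
    return {
        "angle_range": float(ar.group(1)) if ar else None,
        "overflow": "-9 result_overflow" in text,
        "counts": counts,
    }


def overflow_rate(stderr_texts) -> float:
    """Fraction of results that ended with a -9 result_overflow."""
    texts = list(stderr_texts)
    if not texts:
        return 0.0
    hits = sum(1 for t in texts if parse_stderr(t)["overflow"])
    return hits / len(texts)
```

A rate that stays high while wingmen report normal results would be the warning sign described above; an occasional overflow is nothing unusual.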
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Sep 2009, 07:40:31 pm
These result overflows are common. You should pay attention though if you generate a lot of these while your wingmen return "good" results. In this case it could be a hardware problem on your part and these results will be invalid.


hmm. i got two of these from the 285 since i started cuda back up with the vhar units.  i ran a memory tester i got off the net which showed good on all available memory..  and i am not getting any strange things happening to the machine or the desktop or movies when i push the gtx a little bit. might have been luck-o-the-draw. ill just have to keep an eye on this.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Sep 2009, 07:47:36 pm
ok my mistake. those were from yesterday when i ran out of units... no way to sort by date i assumed newest were first then i actually read the date :P  all of the 2 sept ones were fine.
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 02 Sep 2009, 10:38:37 pm
After a small hiatus I'm back  :)
... snip ...

@riofl and lordvader about kernel versions
Have you compiled these kernels yourselves or have you got them from elsewhere? Maybe some performance/optimization options you left out? Do you have any nvidia related errors in your syslog when running cuda? Any other observations with these newer kernels?


Hi !

About the kernel versions. I compiled the 2.6.30 kernel using the same config used in Ubuntu's 2.6.28 kernel. I used the same config on the 2.6.27 kernel, which currently gives me equivalent CUDA performance, as well as stable Astropulse results.

I've been using the latest nvidia drivers, and the latest CUDA library (oh ! and my CPU is a Phenom II 955, not overclocked, and I'm running 64bit Kubuntu).

An interesting observation. While crunching CUDA units (using the fast kernels), my GPU heats up quite a bit, as expected. On the slow kernel, it barely goes above idle, so it's definitely being under-utilised.

I've yet to try kernel 2.6.29, and probably won't have time to test it until sometime next week.

Any more info you may need ?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 03 Sep 2009, 03:47:50 am
About the kernel versions. I compiled the 2.6.30 kernel using the same config used in Ubuntu's 2.6.28 kernel. I used the same config on the 2.6.27 kernel, which currently gives me equivalent CUDA performance, as well as stable Astropulse results.

So let me get this straight:
2.6.27: custom compiled, fast
2.6.28: stock, fast
2.6.30: custom compiled, slow

All three supposedly with the same pre-compile config. Am I right?

In http://kernel.ubuntu.com/~kernel-ppa/mainline/ there are pre-compiled mainline kernels along with build options from Ubuntu's kernel team ready for install. Have you tried those?

Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 03 Sep 2009, 05:49:53 am
That's right, same config for all of them.

I've downloaded the precompiled kernels, but won't get a chance to try them till the end of the weekend, at the earliest.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Sep 2009, 09:34:44 am
credit granted is the same for a given workunit whether it is processed by cpu or gpu correct?

my thinking is that since the vhar are short, only taking my gpu about 12 min to process what would be the harm in assigning all vhar to gpu to speed up things? unless they would be granted more credit being done by the slower cpu then i will  go back to assigning them to cpu.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 03 Sep 2009, 10:45:38 am
credit granted is the same for a given workunit whether it is processed by cpu or gpu correct?
Yes.

my thinking is that since the vhar are short, only taking my gpu about 12 min to process what would be the harm in assigning all vhar to gpu to speed up things?
Some people say that the CPU is more efficient for VHARs than the GPU, but I can't see their reasoning.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 03 Sep 2009, 11:00:43 am
That's right, same config for all of them.

I've downloaded the precompiled kernels, but won't get a chance to try them till the end of the weekend, at the earliest.

Remember that newer kernels usually have some new configurable options so an old config won't apply cleanly.  In the transition from 2.6.28 to 2.6.30 these come to mind:

Code:
Processor type and features  --->
    [*] Supported processor vendors  --->
        [*]   Support Intel processors
        [ ]   Support Cyrix processors
        [ ]   Support AMD processors
        [ ]   Support Centaur processors
        [ ]   Support Transmeta processors
        [ ]   Support UMC processors

Really bad things seem to be possible if you don't enable this option and configure it correctly for your cpu. I can't say that this option is the cause of your problem (there are several new ones) but I always check every new option during kernel upgrades and computation times are the same for me regardless of kernel version.
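One way to see which options a newer kernel added is to compare the symbol names in the old and new .config files. Below is a small Python sketch of that comparison; the function names are hypothetical, and it only looks at plain "CONFIG_X=value" and "# CONFIG_X is not set" lines.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text: str) -> dict:
    """Map each CONFIG_ symbol in a kernel .config to its value ('n' if unset)."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def new_options(old_text: str, new_text: str) -> list:
    """Symbols that appear in the new config but were unknown to the old one."""
    old = parse_config(old_text)
    new = parse_config(new_text)
    return sorted(name for name in new if name not in old)
```

Feeding it the 2.6.28 config and the config actually used for the 2.6.30 build would list the symbols that never got a deliberate answer. I would expect the vendor entries above to show up under names like CONFIG_CPU_SUP_INTEL or CONFIG_CPU_SUP_AMD, but that mapping is my assumption, so check it against your own .config.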
Title: Re: SETI MB CUDA for Linux
Post by: pp on 03 Sep 2009, 11:06:17 am
credit granted is the same for a given workunit whether it is processed by cpu or gpu correct?
Yes.

As noted by people over at the S@H forums, the GPU seems to claim credit that is about 30% higher than what the CPU does. If your wingman uses the CPU you will be granted the lower value but if your wingman uses the GPU like you, you will receive the higher value it seems.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 03 Sep 2009, 02:53:04 pm
As noted by people over at the S@H forums, the GPU seems to claim credit that is about 30% higher than what the CPU does. If your wingman uses the CPU you will be granted the lower value but if your wingman uses the GPU like you, you will receive the higher value it seems.

The cuda app usually overclaims (other times underclaims) because it can't calculate precisely the flops needed for a certain workunit.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Sep 2009, 07:20:41 am
Shameless plug:

I've managed to climb to the #3 spot in the Top hosts (http://setiathome.berkeley.edu/top_hosts.php) list. This is probably the highest I'll ever be so I'll savour the moment.  :D

PDF attached for future proof.

[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Sep 2009, 08:30:43 am
Shameless plug:

I've managed to climb to the #3 spot in the Top hosts (http://setiathome.berkeley.edu/top_hosts.php) list. This is probably the highest I'll ever be so I'll savour the moment.  :D

PDF attached for future proof.

congrats! when i get my 2nd 285 to replace this tesla maybe ill be able to reach around 15k.

do you process ap too or just mb? i blocked ap and am only doing mb.

either i'm doing something wrong to only break into the 10ks and now dropped to 9.9 since the weekend or something.. i see so many people with considerably higher rac with basically the same equipment.. unless its because i keep this machine so busy being a workstation... my load averages are consistently between 5.0 and 8.5
Title: Re: SETI MB CUDA for Linux
Post by: IanJ on 04 Sep 2009, 08:55:47 am
Guys,
 Forgive me if this is not the place for posting questions, but from what I read here I think it is.
 I have installed the Crunch3r CUDA app on my FedoraCore10 64bit machine. After a fair amount of grief with segfaults, today I finally managed to get my first result in. However, two of my results this morning have a strange error. Could anyone elaborate on what the problem is and what I should do to fix it? The card is a 9600GT. Here is the output:

<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>

SETI@home MB CUDA_2.2 608 Linux 64bit SM 1.0 - r12 by Crunch3r :p
VLAR autokill mod

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 9600 GT
           totalGlobalMem = 536608768
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1625000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 8
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce 9600 GT is okay
SETI@home using CUDA accelerated device GeForce 9600 GT
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.7.0

Work Unit Info:
...............
WU true angle range is :  0.388520
Cuda error 'cudaAcc_CalcChirpData_kernel2' in file './cudaAcc_CalcChirpData.cu' in line 106 : unspecified launch failure.
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.1/cufft/src/execute.cu, line 1070
cufft: ERROR: CUFFT_EXEC_FAILED
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.1/cufft/src/execute.cu, line 1070
cufft: ERROR: CUFFT_EXEC_FAILED
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.1/cufft/src/cufft.cu, line 151
cufft: ERROR: CUFFT_EXEC_FAILED
CUFFT error in file './cudaAcc_fft.cu' in line 62.

</stderr_txt>
]]>

 Thanks
 Ian
 
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Sep 2009, 09:34:58 am
Thanks!

Just yesterday I put the new ap 5.06 in, but I haven't got any astropulse workunits yet. So currently only MB.

My load averages are above 4 and usually below 5.5

A GTX285 should be able to do 10000-14000 RAC alone.

With all that stuff running on your desktop, I don't know if it would be a good idea to buy a low-end, non-CUDA-capable card for X and have your other 2 cards dedicated to CUDA. Of course your motherboard would need 3 PCI-E slots.

EDIT:
...
SETI@home MB CUDA_2.2 608 Linux 64bit SM 1.0 - r12 by Crunch3r :p
VLAR autokill mod
...
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.1/cufft/src/execute.cu, line 1070
cufft: ERROR: CUFFT_EXEC_FAILED
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.1/cufft/src/execute.cu, line 1070
cufft: ERROR: CUFFT_EXEC_FAILED
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.1/cufft/src/cufft.cu, line 151
cufft: ERROR: CUFFT_EXEC_FAILED
...

You're probably running the 2.2 cuda app with 2.1 libraries. Get the newer 2.2 or even better the 2.3 cuda libraries. Also you'll have to upgrade your NVIDIA driver to a 2.2 (185.18.xx) or 2.3 (190.xx) compatible one.
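This kind of mismatch can be spotted mechanically: the app banner names the CUDA version it was built for, and the cufft error lines carry a toolkit release in their source path. A small Python sketch, assuming that the embedded path really reflects the release the library came from; the function names are made up for the example.

```python
import re


def app_cuda_version(stderr: str):
    """CUDA version named in the app banner, e.g. '2.2', or None."""
    match = re.search(r"MB CUDA_(\d+\.\d+)", stderr)
    return match.group(1) if match else None


def cufft_toolkit_versions(stderr: str) -> list:
    """Toolkit releases mentioned in cufft error paths, e.g. ['2.1']."""
    return sorted(set(re.findall(r"toolkit/r(\d+\.\d+)/cufft", stderr)))


def has_mismatch(stderr: str) -> bool:
    """True if any cufft path points at a release other than the app's."""
    app = app_cuda_version(stderr)
    if app is None:
        return False
    return any(version != app for version in cufft_toolkit_versions(stderr))
```

For the output quoted above this reports an app built for 2.2 next to cufft messages from an r2.1 tree, which is what points to outdated libraries being picked up.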
Title: Re: SETI MB CUDA for Linux
Post by: IanJ on 04 Sep 2009, 10:15:44 am
Sunu,
 I'll try with the 2.2. I've installed Cuda Driver 185.18.14 (2.2? from the nvidia website). Previously installed I had 185.18.36. I have now installed the Cuda Toolkit 2.2, set my PATH and amended ldconfig.
 I now await tasks from SETI, at the moment it's out of work.
 Ian
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Sep 2009, 10:34:44 am
well problem is i am now spoiled.. i had an 8600gt 256mb card i used for my desktop and ran the tesla for cuda before i got the 285. the 285 is several orders of magnitude better in desktop performance. i think i would rather just replace the tesla with a 2nd 285 and let that one crunch full speed and let this one do as it can. would still be a large improvement over the tesla in the 2nd slot. either that or maybe buy a motherboard with 3 slots that can take 3 of these cards leaving room for them to breathe and get a gtx 260 to use for my desktop and minor cuda crunching and let both 285 have at it full steam. i expect the 260 should be up to the task for my desktops.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Sep 2009, 10:52:05 am
I've installed Cuda Driver 185.18.14 (2.2? from the nvidia website). Previously installed I had 185.18.36.
185.18.14 is older than 185.18.36, why rollback? Also try cuda 2.3 with 190.xx driver, it's faster than 2.2.

well problem is i am now spoiled.. i had an 8600gt 256mb card i used for my desktop and ran the tesla for cuda before i got the 285. the 285 is several orders of magnitude better in desktop performance. i think i would rather just replace the tesla with a 2nd 285 and let that one crunch full speed and let this one do as it can. would still be a large improvement over the tesla in the 2nd slot. either that or maybe buy a motherboard with 3 slots that can take 3 of these cards leaving room for them to breathe and get a gtx 260 to use for my desktop and minor cuda crunching and let both 285 have at it full steam. i expect the 260 should be up to the task for my desktops.
Or maybe get a GTX295 in place of tesla and no need for a new motherboard.

TO ALL
Please see this thread (http://setiathome.berkeley.edu/forum_thread.php?id=55317) and take proper action (abort those workunits): I've lost quite a few credits because of this.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Sep 2009, 08:51:23 pm
yeah a 295 is an option. then it can dual crunch away and let the 285 'limp' along :P.. an idea to consider.. i suppose i could take the $ for that from my savings for my new project next year.. would also give me some experience with that monster :) will have to look and see if my current psu can handle it and the 285 and all the other things i have going on. just got it 2 months ago... kinda hate to replace it already.

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Sep 2009, 09:01:21 pm

TO ALL
Please see this thread (http://setiathome.berkeley.edu/forum_thread.php?id=55317) and take proper action (abort those workunits): I've lost quite a few credits because of this.

hmmm guess some of the probs people have come from the data supplied. shame, but then again with the massive amount of chopping and adjusting of the master data i can imagine errors creep in.

will have to check this out. i do get a number of units aborting with computation errors, usually 3 or 4 a day, but i have attributed those to the flaky tesla.. havent checked them all just a few and then gave up on it.
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 05 Sep 2009, 04:59:26 am
Badly-prepared data is actually pretty rare at SETI - that's why I made such a point of drawing that set to Eric's attention.

The point Sunu was making is that those WUs don't error out while crunching: they run full duration, and then error out when the time comes to upload the results. That's why they're a waste of time.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Sep 2009, 07:05:22 am
Badly-prepared data is actually pretty rare at SETI - that's why I made such a point of drawing that set to Eric's attention.

The point Sunu was making is that those WUs don't error out while crunching: they run full duration, and then error out when the time comes to upload the results. That's why they're a waste of time.

ahh. yeah.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Sep 2009, 07:07:09 am
any preferences in brand on the 295?  i was thinking of going with xfx only because my 285 is an xfx black edition.. also looked at asus and evga
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Sep 2009, 09:36:13 am
yeah a 295 is an option. then it can dual crunch away and let the 285 'limp' along :P.. an idea to consider.. i suppose i could take the $ for that from my savings for my new project next year..
Well, if you want RAC right here, right now, a GTX295 is probably your best choice with your current setup (as was the case with mine). I was "forced" to upgrade because my GTX280 burned out about 2 months ago.

You have a future project in mind so if I were you I would pursue that. End of 2009 beginning of 2010 we will have the update to Nehalem processors while NVIDIA is going to release its next generation of graphics cards (about Christmas time) with the next generation dual card probably in the 1st quarter of 2010.

any preferences in brand on the 295?  i was thinking of going with xfx only because my 285 is an xfx black edition.. also looked at asus and evga
My GTX280 that burned out was an EVGA but I stayed with them. Both my GTX285 and GTX295 are EVGA. XFX should be good and ASUS even better. If you take the plunge and buy now, prefer a vendor that gives you a step-up option, as you might catch the new NVIDIA cards when they are released, so EVGA, BFG or XFX (don't know if XFX has a step-up program).
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Sep 2009, 09:49:04 am
yeah a 295 is an option. then it can dual crunch away and let the 285 'limp' along :P.. an idea to consider.. i suppose i could take the $ for that from my savings for my new project next year..
Well if you want RAC right here, right now a GTX295 is probably your best choice with your current setup (as it was the case with mine). I was "forced" to upgrade because my GTX280 burned about 2 months ago.

You have a future project in mind so if I were you I would pursue that. End of 2009 beginning of 2010 we will have the update to Nehalem processors while NVIDIA is going to release its next generation of graphics cards (about Christmas time) with the next generation dual card probably in the 1st quarter of 2010.

the future project will take a while to get off the ground as well as gathering parts here and there as i can in addition to any bulk buys i can do. it wont happen in jan but will begin gathering parts and probably will begin assembly in march.

any preferences in brand on the 295?  i was thinking of going with xfx only because my 285 is an xfx black edition.. also looked at asus and evga
My GTX280 that burned was EVGA but I stayed with them. Both my GTX285 and GTX295 are EVGA. XFX should be good and ASUS even more. If you take the plunge and buy now prefer someone who gives you a step-up option as you might catch the new NVIDIA cards when they will be released, so EVGA, BFG or XFX (don't know if XFX has a step-up program).

don't know either but i most likely would stick with whatever i get now for this machine and will take a good hard look at all the new stuff before i commit on the new project. that one is hopefully going to give any cray supercomputers a run for their money. :) heh my family looks at me like i am insane.. instead of saving for retirement, since i'm the oldest of the 'kids', i'm spending on computers :)
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Sep 2009, 10:17:54 am
that one is hopefully going to give any cray supercomputers a run for their money.

I hope we'll have a linux machine in the #1 spot of the top hosts list.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Sep 2009, 11:22:27 am
that one is hopefully going to give any cray supercomputers a run for their money.

I hope we'll have a linux machine in the #1 spot of the top hosts list.

that would be great!

ok i ran my 1 hr test with no activity in the desktop. i only entered 1 desktop and that was to keep an eye on the cpu/gpu temps. 1 strip of gkrellm is all i ran. nothing else open.

the way i did it was to set boinc to not fetch work, then hit update to get rid of the currently finished units. then shut boinc down. rebooted the computer to clean up memory and then, with just what is mentioned above running, started boinc and let it sit for an hour. the units had approximately the same completion times..... maybe the average was 1 or 2 min faster but mostly they were about the same. then i uploaded them, and went to find them in the task listings and could not... the newest were from sept 4. i'll check again later in case they are delayed in posting, or maybe they don't post until a workday. not sure.

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 08 Sep 2009, 06:23:18 pm
heh guess i will never comprehend boinc's logic :) it sets aside workunits due sept 21st in favor of completing workunits not due till oct 20th!  there must be some urgent flag the project sets in these that we dont  see. they are not processing in "high priority" mode. it seems to eventually get to the ones it sets aside, but it seems completely illogical to stop a wu that is 93% complete. makes no sense at all  to me.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Sep 2009, 08:38:34 pm
Yes, boinc scheduler is pretty messed up.
Title: Re: SETI MB CUDA for Linux
Post by: IanJ on 18 Sep 2009, 03:03:30 am
Sunu,
 I'm just reporting back to update you on my Fedora Core 10 64bit machine. I have successfully installed the CUDA 2.3 libraries/stuff and the machine has been chugging through the workunits.
 I have a couple of questions, one of which is worrying. Occasionally over the past week the machine has locked up and only a reset has cleared the issue. I had a look at /var/log/messages and I see a number of NVRM: Xid messages. I've googled around and didn't get a clear answer, so does anyone here have an idea? Here are the entries:-
Sep 17 17:11:41 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
Sep 18 04:58:05 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
Sep 18 05:11:07 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
 The second question: is there any way to ensure that the card is being used to the max? Is there any tuning or monitoring of the card that would assist?
 Thanks
 Ian
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Sep 2009, 02:50:58 pm
Sunu,
 I'm just reporting back to update you on my Fedora Core 10 64bit machine. I have successfully installed the CUDA 2.3 libraries/stuff and the machine has been chugging through the workunits.
 I have a couple of questions, one of which is worrying. Occasionally over the past week the machine has locked up and only a reset has cleared the issue. I had a look at /var/log/messages and I see a number of NVRM: Xid messages. I've googled around and didn't get a clear answer, so does anyone here have an idea? Here are the entries:-
Sep 17 17:11:41 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
Sep 18 04:58:05 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
Sep 18 05:11:07 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
 The second question: is there any way to ensure that the card is being used to the max? Is there any tuning or monitoring of the card that would assist?
 Thanks
 Ian


i have been getting this exact same message with the same numbers in the same positions on my tesla. i have been searching for the meaning of these for almost 3 months. i only know that i have a problem with the tesla where, even though it never locked the entire machine up, the tesla will lock up occasionally, requiring a stop and restart of boinc. until i fixed the thermal pad problem on it for the ram chips, it would require a power off to reset it. i suspect this error may indicate ram problems. i ran a utility for testing video ram available on the net but it showed nothing wrong.

if the card is still in warranty i suggest getting a swap-out just to be safe.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 18 Sep 2009, 06:39:33 pm
Unfortunately Xid errors aren't very useful for troubleshooting. Riofl talks about problems with his thermal pads; you might have temperature problems as well, not necessarily with your graphics card but also with your CPU or northbridge. What are your temperatures like?

Do you use KDE? I have the impression that I see many more reports of problems with KDE and nvidia drivers than with GNOME.

Which nvidia drivers do you use? You could try a different version.

Check also your invalid tasks page. If you have invalid tasks this could mean a hardware problem with your graphics card and you'll have to replace it.

As for your second question about a monitoring app for GPU use, there isn't any. I use gnome's sensor-applet to monitor my CPU and GPU temperatures so I can always see if they are working (=crunching) or not.
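To keep an eye on how often the Xid messages Ian posted actually show up, something like the following could tally them per day from the kernel log. This is only a sketch: the regular expression is an assumption fitted to the three sample lines in this thread (the syslog layout may differ on other distros), and count_xids is a made-up helper name, not part of any NVIDIA or BOINC tool.

```python
import re
from collections import Counter

# Assumed layout, taken from the sample lines in this thread:
# "Sep 17 17:11:41 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 ..."
XID_RE = re.compile(
    r"^(\w{3}\s+\d+)\s+[\d:]+\s+\S+\s+kernel:\s+NVRM: Xid \(([^)]+)\): (\d+),"
)

def count_xids(lines):
    """Count Xid messages per (day, PCI device, Xid number)."""
    counts = Counter()
    for line in lines:
        match = XID_RE.match(line)
        if match:
            day, device, xid = match.group(1), match.group(2), int(match.group(3))
            counts[(day, device, xid)] += 1
    return counts

sample = [
    "Sep 17 17:11:41 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100",
    "Sep 18 04:58:05 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100",
    "Sep 18 05:11:07 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100",
]

for (day, device, xid), n in sorted(count_xids(sample).items()):
    print(f"{day}  device {device}  Xid {xid}: {n} time(s)")
```

On a real machine the lines would come from reading /var/log/messages (as root) instead of the sample list. A rising count per day alongside rising temperatures would at least hint at a heat problem.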
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Sep 2009, 07:12:47 pm
ahh another clue.. i use kde since i use many of its built in features... i tried gnome but i admit it was years ago and it did not do what i needed at that time... a small thing to most but it helps me a lot when switching a fair number of desktops every few min, is i need unique backgrounds for each desktop, and back then gnome did not do that. i understand it is now built in or there is an app to do it.. i might get adventurous and try the new gnome.

yeah my pads on the tesla were dried, brittle and cracked and some simply were not there so i had to clean everything and find the proper thick 2mm pads with fiberglass webbing which i had to buy from the UK and spent an afternoon being an artist with scissors and knife since the pads came in large sheets. since then temps everywhere including touch on the housing are considerably cooler but it still locks maybe once a week or so... maybe something was damaged from heat when the bad pads were on it. at this point i am not worried.. i am going to replace it soon and then send it to my boss who wants to run it in a much more forgiving windows environment. hehe let him have the problems :)
Title: Re: SETI MB CUDA for Linux
Post by: IanJ on 24 Sep 2009, 12:20:15 pm
Sunu,
 The Xid problem I reported was on a system that didn't have any window manager running; it was running in runlevel 3, not 5, so the machine was just a basic vt100 console. However there did seem to be a hardware problem, and last Sunday the disk packed up. Today, after a careful reinstall, modifications to the logical volumes and mounting, I've managed to get the machine back to a workable state. I've installed the two CUDA 2.3 packages. However I've missed or messed up something, as my tasks keep aborting. The ldd of the seti executable seems ok but as I say the thing fails. What have I done wrong in the attached task error output?
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

SETI@home MB CUDA_2.2 608 Linux 64bit SM 1.0 - r12 by Crunch3r :p
VLAR autokill mod

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 9600 GT
           totalGlobalMem = 536608768
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1600000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 8
SIGSEGV: segmentation violation
Stack trace (17 frames):
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu(boinc_catch_signal+0x43)[0x485ef3]
/lib64/libpthread.so.0[0x60880f0]
/usr/lib64/libcuda.so.1[0xb8d980]
/usr/lib64/libcuda.so.1[0xb933c4]
/usr/lib64/libcuda.so.1[0xb63557]
/usr/lib64/libcuda.so.1[0xb0ecf7]
/usr/lib64/libcuda.so.1[0xb2052b]
/usr/lib64/libcuda.so.1[0xb05940]
/usr/lib64/libcuda.so.1[0xafea8a]
/usr/lib64/libcuda.so.1(cuCtxCreate+0x57)[0xb59187]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x5bf335]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x413c5b]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x41f68d]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x42b54d]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x408707]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x6c9d546]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu(__gxx_personality_v0+0x219)[0x408349]

Exiting...

</stderr_txt>
]]>

Thanks Ian!
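A side note on the "process exited with code 193 (0xc1, -63)" line in the output above: the three numbers are the same byte written three ways (decimal, hex, and as a signed 8-bit value). A minimal sketch of that arithmetic, with exit_code_forms being a name made up for this illustration:

```python
def exit_code_forms(code):
    """Return the unsigned, hex and signed 8-bit views of an exit code."""
    unsigned = code & 0xFF                      # keep only the low byte
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return unsigned, hex(unsigned), signed

print(exit_code_forms(193))  # (193, '0xc1', -63)
```

So the number itself says nothing extra; the useful part of the report is the SIGSEGV and the stack trace pointing into libcuda.so.1 at cuCtxCreate.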
Title: Re: SETI MB CUDA for Linux
Post by: pp on 24 Sep 2009, 02:07:20 pm
Shameless plug:

I've managed to climb to the #3 spot in the Top hosts (http://setiathome.berkeley.edu/top_hosts.php) list. This is probably the highest I'll ever be so I'll savour the moment.  :D

PDF attached for future proof.

And now we have two Linux machines among the top 20. Don't know yet how high it will reach though...  :D
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Sep 2009, 05:48:35 pm
Sunu,
 The Xid problem I reported was on a system that didn't have any window manager running; it was running in runlevel 3, not 5, so the machine was just a basic vt100 console. However there did seem to be a hardware problem, and last Sunday the disk packed up. Today, after a careful reinstall, modifications to the logical volumes and mounting, I've managed to get the machine back to a workable state. I've installed the two CUDA 2.3 packages. However I've missed or messed up something, as my tasks keep aborting. The ldd of the seti executable seems ok but as I say the thing fails. What have I done wrong in the attached task error output?
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

SETI@home MB CUDA_2.2 608 Linux 64bit SM 1.0 - r12 by Crunch3r :p
VLAR autokill mod

setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 9600 GT
           totalGlobalMem = 536608768
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 262144
           maxThreadsPerBlock = 512
           clockRate = 1600000
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 8
SIGSEGV: segmentation violation
Stack trace (17 frames):
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu(boinc_catch_signal+0x43)[0x485ef3]
/lib64/libpthread.so.0[0x60880f0]
/usr/lib64/libcuda.so.1[0xb8d980]
/usr/lib64/libcuda.so.1[0xb933c4]
/usr/lib64/libcuda.so.1[0xb63557]
/usr/lib64/libcuda.so.1[0xb0ecf7]
/usr/lib64/libcuda.so.1[0xb2052b]
/usr/lib64/libcuda.so.1[0xb05940]
/usr/lib64/libcuda.so.1[0xafea8a]
/usr/lib64/libcuda.so.1(cuCtxCreate+0x57)[0xb59187]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x5bf335]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x413c5b]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x41f68d]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x42b54d]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu[0x408707]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x6c9d546]
setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu(__gxx_personality_v0+0x219)[0x408349]

Exiting...

</stderr_txt>
]]>

Thanks Ian!


ok you say you installed cuda 2.3 libraries and the 2.3 v190 series driver right (earlier drivers won't work) ?

your error report says the app is cuda 2.2 so it will error. the app must also be cuda 2.3 compliant. those who explained things to me insisted that the driver, toolkit and app must use the same cuda version. i don't believe there is such a thing as 'backward compatibility' with cuda.
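The CUDA version the app was built against can be read from the banner line in the stderr output quoted above ("SETI@home MB CUDA_2.2 608 Linux 64bit SM 1.0 ..."). A small sketch for pulling it out of a task report; the banner layout is assumed from that single example, and app_cuda_version is a hypothetical helper, not something shipped with the app or with BOINC:

```python
import re

def app_cuda_version(stderr_text):
    """Return (major, minor) of the CUDA_x.y tag in the banner, or None."""
    match = re.search(r"CUDA_(\d+)\.(\d+)", stderr_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

banner = "SETI@home MB CUDA_2.2 608 Linux 64bit SM 1.0 - r12 by Crunch3r :p"
print(app_cuda_version(banner))  # (2, 2)
```

This only tells you what the app was built with. Whether that build runs against a newer driver and toolkit is a separate question.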



Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Sep 2009, 05:49:23 pm
Shameless plug:

I've managed to climb to the #3 spot in the Top hosts (http://setiathome.berkeley.edu/top_hosts.php) list. This is probably the highest I'll ever be so I'll savour the moment.  :D

PDF attached for future proof.

And now we have two Linux machines among the top 20. Don't know yet how high it will reach though...  :D

cool. maybe someday ill get in there too
Title: Re: SETI MB CUDA for Linux
Post by: pp on 25 Sep 2009, 05:19:31 am
ok you say you installed cuda 2.3 libraries and the 2.3 v190 series driver right (earlier drivers won't work) ?

your error report says the app is cuda 2.2 so it will error. the app must also be cuda 2.3 compliant. those who explained things to me insisted that the driver, toolkit and app must use the same cuda version. i don't believe there is such a thing as 'backward compatibility' with cuda.

AFAIK there is no CUDA 2.3 build of the app. The 2.2 version is compatible and I run it on several computers. Copy and paste the output of the following command so we can see where your CUDA library is linked:
Code:
ls -l /usr/lib64/libcuda*
/PP
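For anyone who prefers a script over reading the ls output, roughly the same check can be sketched in Python. The /usr/lib64 path is the one from the command above; resolve_links is a made-up helper name, and the script only reports where each libcuda entry ends up, it does not judge whether that is the right driver version.

```python
import glob
import os

def resolve_links(pattern):
    """Map every path matching pattern to the real file it resolves to."""
    return {path: os.path.realpath(path) for path in sorted(glob.glob(pattern))}

if __name__ == "__main__":
    for path, target in resolve_links("/usr/lib64/libcuda*").items():
        # "->" marks a symlink, "==" a regular file
        marker = "->" if os.path.islink(path) else "=="
        print(f"{path} {marker} {target}")
```

If libcuda.so.1 resolves to a file left over from an older driver install, that would be a likely suspect for a crash inside cuCtxCreate.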
Title: Re: SETI MB CUDA for Linux
Post by: pp on 30 Sep 2009, 08:23:49 am
There's a third Linux computer among the top 20 hosts now but it's not me this time though. I will however fight his 4xGTX275 with my single GTX295!  ;D
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 30 Sep 2009, 08:33:32 am
There's a third Linux computer among the top 20 hosts now but it's not me this time though. I will however fight his 4xGTX275 with my single GTX295!  ;D

I don't think so. It is a relatively new host (since it was upgraded) and that's why it has a lower RAC. It is currently generating a lot more points: http://pl.boincstats.com/stats/host_graph.php?pr=sah&id=5011059  ;D
Title: Re: SETI MB CUDA for Linux
Post by: IanJ on 30 Sep 2009, 08:43:49 am
Riofl and Sunu,
 Just an update. It looks like copying the seti cuda executable into the /usr/sbin directory finally got it to calm down and start crunching.
 The NVRM Xid issue continues but now doesn't lock up the machine. It's been up nearly a week without a lockup, but I've seen eight Xid messages in the past three days. As the machine continues on happily I'll forget about it for now. During the reinstall last week I took off the expansion card blanking plates (this machine has only one card in it, the 9600GT) so the machine can get a bit more air.
 Thanks for your help!
 Ian
Title: Re: SETI MB CUDA for Linux
Post by: pp on 30 Sep 2009, 09:11:26 am
There's a third Linux computer among the top 20 hosts now but it's not me this time though. I will however fight his 4xGTX275 with my single GTX295!  ;D

I don't think so. It is a relatively new host (since it was upgraded) and that's why it has a lower RAC. It is currently generating a lot more points: http://pl.boincstats.com/stats/host_graph.php?pr=sah&id=5011059  ;D

Compiling my already über optimized 2.6.31-kernel with some -floop-interchange or -floop-strip-mine will take care of that... or I'll just throw in another 295.  :D Nice to see another Gentooer on the list though... but I hope the heat in your room makes your skin curl up and peel off! ;D
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 30 Sep 2009, 09:45:45 am
Compiling my already über optimized 2.6.31-kernel with some -floop-interchange or -floop-strip-mine will take care of that... or I'll just throw in another 295.  :D Nice to see another Gentooer on the list though... but I hope the heat in your room makes your skin curl up and peel off! ;D

Also nice to see Gentooer here  :)

Don't bother. This host has a special self-made case and it's nice and cool. I also have another 295 for it, but it is currently being serviced because it was damaged at the factory.

So when it is back it will swallow your tiny, poor über. ;D
But I think it will do it even earlier.  ;D
Title: Re: SETI MB CUDA for Linux
Post by: pp on 30 Sep 2009, 09:56:40 am
Darn, darn, darn! ;D Well, I guess I have to let you hunt down Vyper by yourself then. I wish you luck. :D Would be nice with a Linux computer on top and a majority of them occupying the top 10...
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 30 Sep 2009, 10:30:19 am
Darn, darn, darn! ;D Well, I guess I have to let you hunt down Vyper by yourself then. I wish you luck. :D Would be nice with a Linux computer on top and a majority of them occupying the top 10...

Thanks. I also wish you luck. :)
It's sad but I don't think that this host will beat Vyper's. Not at this time; maybe someday.  :(
It would be very nice to see many Linux hosts on the top computers list and generally more Linux hosts in boinc. :)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Oct 2009, 08:56:50 am
i have been doing a lot of studying on the 285 vs 295 battle that has been going on in my brain. each element of the 295 is slower than the 285 by a reasonable margin (approx 160gflops difference.. 285=1062gflops while 295=894gflops per element). the thing the 295 has is 'density' to make up for it. so even if it takes longer to do a wu than the 285 does, it can do 2 of them in the same package in an attempt to make up for it which works ok.. wonder if there is an extended length motherboard out there that will take  8 pcie devices with a reasonable distance spread (at least 1/2 - 1 in between mounted devices)?  there are cases available that can handle this, but i have not found a mobo that can..  for raw speed i would be more inclined to put 8 285 in something than 4 295. thoughts?  i have a feeling i am not accounting for something here besides the obvious power requirements and cost savings of 1 295 vs 2 285...

maybe i should be looking into addon pcie density expansion like the nvidia supercomputer appliances do,putting 4 teslas into a single pcie slot.. wonder if empty appliance devices are available....


Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 01 Oct 2009, 10:00:02 am
i have been doing a lot of studying on the 285 vs 295 battle that has been going on in my brain. each element of the 295 is slower than the 285 by a reasonable margin (approx 160gflops difference.. 285=1062gflops while 295=894gflops per element). the thing the 295 has is 'density' to make up for it. so even if it takes longer to do a wu than the 285 does, it can do 2 of them in the same package in an attempt to make up for it which works ok.. wonder if there is an extended length motherboard out there that will take  8 pcie devices with a reasonable distance spread (at least 1/2 - 1 in between mounted devices)?  there are cases available that can handle this, but i have not found a mobo that can..  for raw speed i would be more inclined to put 8 285 in something than 4 295. thoughts?  i have a feeling i am not accounting for something here besides the obvious power requirements and cost savings of 1 295 vs 2 285...

maybe i should be looking into addon pcie density expansion like the nvidia supercomputer appliances do,putting 4 teslas into a single pcie slot.. wonder if empty appliance devices are available....

You may also consider this card http://www.asus.com/product.aspx?P_ID=3OXEUQmsHmmewEyu&templete=2 if you have enough money.  ;D

But generally I have observed that the difference in Seti speed between the 260sp216, 275 and 285 is not that big compared to the difference in price. So if the theoretical speed difference is about 20%, you should be happy if you see only about half of that in Seti. That's because those figures are the peak capability of the GPU, but real computation also depends on cpu, bus and memory speed, and even more on the application architecture.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Oct 2009, 11:45:43 am
i have been doing a lot of studying on the 285 vs 295 battle that has been going on in my brain. each element of the 295 is slower than the 285 by a reasonable margin (approx 160gflops difference.. 285=1062gflops while 295=894gflops per element). the thing the 295 has is 'density' to make up for it. so even if it takes longer to do a wu than the 285 does, it can do 2 of them in the same package in an attempt to make up for it which works ok.. wonder if there is an extended length motherboard out there that will take  8 pcie devices with a reasonable distance spread (at least 1/2 - 1 in between mounted devices)?  there are cases available that can handle this, but i have not found a mobo that can..  for raw speed i would be more inclined to put 8 285 in something than 4 295. thoughts?  i have a feeling i am not accounting for something here besides the obvious power requirements and cost savings of 1 295 vs 2 285...

maybe i should be looking into addon pcie density expansion like the nvidia supercomputer appliances do,putting 4 teslas into a single pcie slot.. wonder if empty appliance devices are available....

You may also consider this card http://www.asus.com/product.aspx?P_ID=3OXEUQmsHmmewEyu&templete=2 if you have enough money.  ;D

But generally I have observed that the difference in Seti speed between the 260sp216, 275 and 285 is not that big compared to the difference in price. So if the theoretical speed difference is about 20%, you should be happy if you see only about half of that in Seti. That's because those figures are the peak capability of the GPU, but real computation also depends on cpu, bus and memory speed, and even more on the application architecture.

hmm yeah ... my basis is on integer gflops since i have not found double precision gflops comparisons. basing performance comparisons between my tesla at 933 integer gflops and my 285 at 1062 integer gflops, boinc displays them as 74 and 127 gflops respectively. now, considering the 295 is slower in integer gflops per processing system (894 each) than the tesla, i would expect it would display less than 74gflops each half.
which basically means that a 295 using both halves will only give approximately 50-60% higher performance in total than a single 285, which makes me curious about its value other than accepting that 50% more per physical device is preferable. i just wonder, since the 295 is essentially supposed to be 2x 285 with slightly degraded performance, why it is so? it has 4 fewer pixel shaders (28 vs 32) and a smaller memory bus width (448 vs 512, which to me is the most major item). although these vary by mfgr, in general the 295 also has slower default clock speeds. admittedly lower clock speeds will help with eliminating heat buildup, but instead of using the same default heatsink assy, put a better designed one on to compensate and keep the performance up. guess i just wonder why its design doesn't make a lot of sense, or maybe i am in wishful thinking mode that it 'should' be 2x full 285 units when in fact it is 2x crippled 285 units.

Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 01 Oct 2009, 12:53:44 pm
hmm yeah ... my basis is on integer gflops since i have not found double precision gflops comparisons. basing performance comparisons between my tesla at 933 integer gflops and my 285 at 1062 integer gflops, boinc displays them as 74 and 127 gflops respectively. now, considering the 295 is slower in integer gflops per processing system (894 each) than the tesla, i would expect it would display less than 74gflops each half.
which basically means that for a given card, a 295 using both halves will only give approximately 50-60% higher performance in total then a single 285 which makes me curious about its value other than accepting that 50% more per physical device is preferable. i just wonder if since the 295 is essentially supposed to be 2x 285 with slightly degraded performance why it is so? it has 4 less pixel shaders (28 vs 32) and smaller memory bus width (448 vs 512 which to me is the most major item). although these vary by mfgr, in general the 295 also has slower default clock speeds. admittedly lower clock speeds will help with eliminating heat buildup, but instead of using the same default heatsink assy, put a better designed one on to compensate and keep the performance up. guess i just wonder why its design doesn't make a lot of sense or maybe i am in wishful thinking mode that it 'should' be a 2x full 285 units when in fact it is 2x crippled 285 units.

First, there is no such thing as "integer gflops". There are single (32-bit) or double (64-bit) precision floating-point operations. And indeed double precision performance is about 8 times lower than single on nvidia gpus. The reason each of the gpus on the 295 is clocked slower is heat production. According to the documentation it is 290W for the two gpus on the 295 card. For the 285 it is about 205W for a single GPU. So a card with two 285s at normal clocks might produce over 410W of heat. It is very hard to dissipate that much heat. Even the 295 with its 290W is a very hot card and needs very good cooling to work stably. The Asus card from the link is built with two fully clocked 285 chips and its cooling system is very big. It takes 2.5 slots.

The difference in the number of pixel shaders is not important for CUDA computing. CUDA uses the shader processors, of which both have 240 per gpu, organized in 30 multiprocessors (8 shaders in each). Memory is faster, 159GB/s (285) vs 112GB/s (for a single gpu on the 295), and the shader clock is faster on the 285. And again according to the documentation, theoretical peak single precision performance is 1062Gflop/s for the 285 (about 130Gflop/s in double) and 895Gflop/s for each gpu of the 295 (about 112Gflop/s in double). The difference is about 15%, but what is the difference in real computation time? Unfortunately I don't have a 285 card to test, but I have a 275 and a 260sp216. The theoretical difference is over 25% (275 - 1010Gflop/s, 260 - 804Gflop/s) but real computation time for Seti is about 670 sec. for the 275 and about 750 sec. for the 260 (with a normal 0.44 ar. unit). So the real difference is a little over 10%, like I said before.

What I'm trying to say is that each of the gpus on the 295 is theoretically about 15% slower than a 285. But in real computation each 295 gpu will be only about 5% to 6% slower. So with a 295 we have more than 185% of the performance of a 285 for not so big a difference in price.  :)
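For anyone who wants to redo the arithmetic with their own cards, here is a small sketch using only the figures quoted in this post (peak Gflop/s and the measured Seti times). The function names are made up for the illustration, and the 5% to 6% per-GPU penalty for the 295 is the estimate from the post, not a measurement.

```python
def speedup_from_gflops(fast, slow):
    """Percent advantage of the faster card from peak Gflop/s figures."""
    return (fast / slow - 1.0) * 100.0

def speedup_from_times(fast_seconds, slow_seconds):
    """Percent advantage of the faster card from task run times."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0

def dual_gpu_relative(per_gpu_penalty):
    """Throughput of a two-GPU card relative to one full-speed GPU."""
    return 2.0 * (1.0 - per_gpu_penalty)

theoretical = speedup_from_gflops(1010.0, 804.0)   # GTX 275 vs GTX 260sp216
measured = speedup_from_times(670.0, 750.0)        # same cards, Seti run times

print(f"theoretical advantage of the 275: {theoretical:.1f}%")
print(f"measured advantage of the 275:    {measured:.1f}%")
for penalty in (0.05, 0.06):
    print(f"295 vs one 285 at {penalty:.0%} per-GPU penalty: "
          f"{dual_gpu_relative(penalty):.2f}x")
```

With these numbers the measured gap comes out at roughly half of the theoretical one, which matches the rule of thumb given earlier in the thread.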
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Oct 2009, 05:12:09 am
ahh... thanks. makes sense. i have a nasty habit of thinking myself into corners. unfortunately that asus  mars is untouchable for me. just way too much $$. 295 does sound like the way to go and i think i have enough air flow around the gpus to keep it cool. my 285 and tesla both never go above 65c with a summer room ambient temp of 28c. took running the fans at 100%, adding extra bottom front fan to move cooler air into the lower case pocket plus a few small pci slot exhaust fans. with the spacing of my mobo's pcie slots it is not easy getting the extra heated air out from between the 2 cards. had to mount a little 1 in fan on the tops of the cards aiming between them to move the air out which dropped both card temps quite a bit. probably would have been easier to buy another case side cover with fans directly over the gpus. mine has a single 25cm fan in the middle of the cover.
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 03 Oct 2009, 05:25:04 am
I'd also advise waiting for the g300. Its launch may greatly change the prices of the 295.  :)
Title: Re: SETI MB CUDA for Linux
Post by: Tye on 04 Oct 2009, 07:53:43 am
I've been using BOINC 6.6.11 for awhile now, to make sure it handles my multi-GPUs of different types.  Is there any newer version that will also do this yet?  Sunu, I think you were also using 6.6.11...
Title: Re: SETI MB CUDA for Linux
Post by: pp on 04 Oct 2009, 08:36:31 am
I've been using the 6.10.x series for a while and it works correctly with multiple GPUs. Use 6.10.7 or later because earlier versions introduced a new bug that preempted all CUDA tasks.
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 04 Oct 2009, 08:37:05 am
I've been using BOINC 6.6.11 for awhile now, to make sure it handles my multi-GPUs of different types.  Is there any newer version that will also do this yet?  Sunu, I think you were also using 6.6.11...

The version currently marked as stable is 6.6.40, and for me it works with multiple different gpus.  :)
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 04 Oct 2009, 09:05:21 am
There are very few pre-compiled Linux v6.10.xx versions available for download - the last was v6.10.6

Rom Walton was asked yesterday for a v6.10.11 build, and replied "Alright, I'll have them out tonight." - but no sign yet (the Berkeley server problem may have got in the way). But worth keeping an eye open.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 04 Oct 2009, 10:25:39 am
It's actually quite simple to build your own Boinc client. Just make sure you have all the dependencies installed as listed on the following page in the columns "Core client" and "BOINC Manager".
http://boinc.berkeley.edu/trac/wiki/SoftwarePrereqsUnix

Download whatever version you want with subversion. Available versions can be found here: http://boinc.berkeley.edu/trac/browser/tags
Code:
cd
svn co http://boinc.berkeley.edu/svn/tags/boinc_core_release_6_10_11

Run autosetup and configure. Make sure to define -march with whatever is appropriate for your own CPU.
Code:
cd boinc_core_release_6_10_11
./_autosetup
./configure --disable-server --disable-fcgi --enable-unicode CFLAGS="-march=core2 -O2 -pipe" CXXFLAGS="-march=core2 -O2 -pipe"

If everything went well you can compile the code.
Code:
make
To make things easier there's a Makefile to create a distributable Boinc data folder with all the files.
Code:
cd packages/generic/sea
make

Remove the included libcudart.so from the package since it can potentially interfere with your CUDA installation.
Code:
rm BOINC/libcudart.so
After stopping your currently running client and backing it up, you can copy the new version into your current Boinc data folder and overwrite the old binaries. This step depends heavily on how your current client is installed by your distribution. I suggest you move it to your home folder and run it manually from there in the future. In my particular case the copy command looks like this.
Code:
cp -rv BOINC/* ~/BOINC/
Command to start the client.
Code:
cd  ~/BOINC
./boinc --allow_remote_gui_rpc --daemon

Command to stop the client.
Code:
cd ~/BOINC
./boinccmd --quit

It's easy to make shell files to perform those commands if you don't want to type them manually.
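
A minimal sketch of such a shell file, assuming the client lives in ~/BOINC as in the commands above. The BOINC_DIR variable and the function names are made up for this example, so adjust them to your own setup.

```shell
#!/bin/sh
# Sketch of a start/stop helper for a BOINC client kept in ~/BOINC.
# BOINC_DIR and the function names are assumptions for this example.

boinc_start() {
    # Run in a subshell so the caller's working directory is left alone.
    ( cd "${BOINC_DIR:-$HOME/BOINC}" && ./boinc --allow_remote_gui_rpc --daemon )
}

boinc_stop() {
    ( cd "${BOINC_DIR:-$HOME/BOINC}" && ./boinccmd --quit )
}

# boinc_ctl start|stop: small dispatcher around the two commands.
boinc_ctl() {
    case "$1" in
        start) boinc_start ;;
        stop)  boinc_stop ;;
        *)     echo "usage: boinc_ctl start|stop" >&2; return 2 ;;
    esac
}
```

Save it as a script with a last line of `boinc_ctl "$1"`, or source it from your shell profile and call `boinc_ctl start` and `boinc_ctl stop` directly.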
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 04 Oct 2009, 02:41:19 pm
Linux v6.10.11 BOINC is now available for download from Berkeley.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Oct 2009, 03:45:07 pm
Riofl and Sunu,
 Just an update. It looks like the copying of the seti cuda executable into the /usr/sbin directory finally got it to calm down and start crunching.
 The NVRM Xid issue continues but now doesn't lock up the machine. It's been up nearly a week without a lockup, but I've seen eight in the past three days. As the machine continues on happily I'll forget about it for now. During the reinstall last week I took off the expansion card blanking plates (this machine has only one card in it, the 9600GT) so the machine can get a bit more air.
 Thanks for your help!
 Ian
IanJ, have you solved your problems? What version exactly are your nvidia drivers? You can try any one of 190.18, 190.25, 190.32, 190.36, to see if those xid errors go away.

your error report says the app is cuda 2.2 so it will error. the app must also be cuda 2.3 compliant. those who explained things to me insisted that the driver, toolkit and app must use the same cuda version. i don't believe there is such a thing as 'backward compatibility' with cuda.
The driver and libs have to support each other but not the app. The app can be a lower number (previous version) with no problems but can't be a higher number. So the duo driver/libs have backwards compatibility. At least that's how it seems right now.
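
As a rough sketch of that rule (the app's CUDA version must not be higher than what the driver/libs pair provides), here is a small shell check. The function name app_compatible is made up for this example, and the rule itself is only how things seem to behave, not something taken from nvidia documentation. It relies on GNU sort -V for the version comparison.

```shell
# app_compatible APP_CUDA LIBS_CUDA
# Succeeds when the app's CUDA version is lower than or equal to the
# version provided by the driver/libs pair (the rule of thumb above).
app_compatible() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$1" ]
}

if app_compatible 2.2 2.3; then
    echo "a 2.2 app on 2.3 driver/libs: should run"
fi
if ! app_compatible 3.0 2.3; then
    echo "a 3.0 app on 2.3 driver/libs: expect errors"
fi
```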

And now we have two Linux machines among the top 20. Don't know yet how high it will reach though...  :D
Congrats pp! Welcome to the top 20 hosts club!

I don't think so. It is a relatively new host (since it was upgraded) and that's why it has a lower RAC. It is currently generating a lot more points
b0b3r is that third linux machine yours? Congrats to you too!

Are these machines dedicated crunchers or you use them also as your desktops?
We need more, to flood the top 20 hosts list with linux machines. riofl is next in line with his super project. I probably won't upgrade for a year or more so eventually I'll get off the top 20.

@riofl
A GTX295 is essentially 2xGTX275 clocked lower.
Currently there are 2 motherboards with 7 PCIE slots that I know of, ASUS and EVGA (intel). So you can have 7 GTX285 with single slot water-cooling or 4 GTX295 (I would say preferably also water-cooled). Which one of the two configs would have higher RAC? I don't know, maybe still the 4 GTX295.
A few days ago, nvidia presented their next architecture GF100. Currently we don't know when the actual cards will come out, optimists say end of 2009, pessimists say Q1 2010. Since you're going to invest some serious money, if I were you, I would wait for the new cards and put my money there. I wouldn't be too happy to put multi thousand dollars in a project and in two months it would be already surpassed.
You could also go more pro/hardcore. This (http://www.brightsideofnews.com/news/2009/10/2/colfax-intl-shows-worlds-first-8gpu-box-8tflops!.aspx) is the first I see with 8 dual-slot cards.

I've been using BOINC 6.6.11 for awhile now, to make sure it handles my multi-GPUs of different types.  Is there any newer version that will also do this yet?  Sunu, I think you were also using 6.6.11...
Yes I'm still using 6.6.11. 6.6.40 (the recommended version) should have proper multi-gpu support and also 6.10.11 that Richard says is now available. Link http://boinc.berkeley.edu/download_all.php  I've seen many reports with problematic scheduling for 6.10.x but I think most of them have been solved.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 04 Oct 2009, 04:22:15 pm
Are these machines dedicated crunchers or you use them also as your desktops?
For the moment it's a dedicated cruncher but it's supposed to be my regular desktop computer. I have still to fit some more disks, serial I/O cards and deal with the air flow inside the box. It just breaks my heart to stop it from crunching to do that ;D
/PP
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Oct 2009, 04:50:04 pm
It just breaks my heart to stop it from crunching to do that ;D

Yes, I have the same feelings.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 05:20:45 am
whoah. seems like things are moving fast :) that 8 gpu machine must be able to cook a turkey placed behind it or at least heat a small home! those cards are entirely too close. each one will cascade more heat into the next one until the end one must run well over 130c! that thing would have to run in an ambient temp environment of near 0c to keep all those puppies cool. with all this changing technology and the imminent gt300 release my project may well get delayed into the 2nd quarter of the year to allow some re-design.

btw for those talking about boinc versions, i have been running 6.9.0 compiled from source and it has worked fine for multiple gpus... the scheduler is a bit odd in that it will refuse to pick up work units until it is down to less than a few hundred then it goes into panic mode and tries to continuously request them until it gets its quantity back up, rather than begin asking for more when the queue reaches around 50% which seems the sensible place to refill.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Oct 2009, 06:03:23 am
whoah. seems like things are moving fast :) that 8 gpu machine must be able to cook a turkey placed behind it or at least heat a small home! those cards are entirely too close. each one will cascade more heat into the next one until the end one must run well over 130c! that thing would have to run in an ambient temp environment of near 0c to keep all those puppies cool.
That's why I'm saying that for multi multi-gpu installations a water-cooling setup is better.


btw for those talking about boinc versions, i have been running 6.9.0 compiled from source and it has worked fine for multiple gpus... the scheduler is a bit odd in that it will refuse to pick up work units until it is down to less than a few hundred then it goes into panic mode and tries to continuously request them until it gets its quantity back up, rather than begin asking for more when the queue reaches around 50% which seems the sensible place to refill.
As I said to Tye, you could try 6.6.40 or 6.10.11, they should have proper multi-gpu support.
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 05 Oct 2009, 06:18:47 am
btw for those talking about boinc versions, i have been running 6.9.0 compiled from source and it has worked fine for multiple gpus... the scheduler is a bit odd in that it will refuse to pick up work units until it is down to less than a few hundred then it goes into panic mode and tries to continuously request them until it gets its quantity back up, rather than begin asking for more when the queue reaches around 50% which seems the sensible place to refill.

You can avoid this by setting a proper flops value in app_info.xml. This value has an impact on the "result duration correction factor", which is used by the scheduler process to guide how your client should ask for results.
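
To illustrate, the flops entry is a `<flops>` element inside an `<app_version>` block of app_info.xml. The sketch below writes a sample fragment to a scratch file and reads the value back. The app name, version number and flops figure are placeholders only, not recommended values, and show_flops is a made-up helper name.

```shell
# Write an illustrative app_version fragment to a scratch file.
# App name, version number and flops figure are placeholders only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>608</version_num>
    <flops>1.0e10</flops>
</app_version>
EOF

# show_flops FILE: print every <flops> value found in an
# app_info.xml style file, one per line.
show_flops() {
    sed -n 's:.*<flops>\(.*\)</flops>.*:\1:p' "$1"
}

show_flops "$sample"
```

Running show_flops on your real app_info.xml is a quick way to see whether a flops value is set at all for each app version.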
Title: Re: SETI MB CUDA for Linux
Post by: pp on 05 Oct 2009, 08:00:44 am
I ran 6.9.0 too for a while and I think it suffered from the on_frac bug. The branch was never tagged and was only available on trunk for a while. 6.10.x has fixed that bug but probably introduced a few more...
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 09:14:58 am
ill try upgrading boinc this week.. doesn't seem to be any rush with the server probs seti is having. most of my servers have run out of work anyway

water cooling makes some sense but do they make plates that cover all the chips that get contact from the stock cooler? jeeze on my tesla and gtx285  it is almost every chip that has a thermal pad attached with only the gpu actually getting any thermal grease. with the varying depths of the different chips the thermal plate would have to be custom made for each card like the stock plate is. do they actually make custom ones for each card? i have had thoughts of water cooling for some time especially since my case is made for it but the seeming complication of cooling the gpus properly has held me back.
Title: Re: SETI MB CUDA for Linux
Post by: Urs Echternacht on 05 Oct 2009, 11:23:42 am
GPUs under water (http://shop.aquacomputer.de/index.php?cPath=7_11_149).

@riofl: is that ^ e.g. what you are looking for ?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 11:47:30 am
sure is! thank you!

now all i have to do is figure out what cooling systems are decent / good / great and compare prices and performance. being in florida i have a lot of higher ambient temps to contend with so the radiator will have to be very efficient. my only true concern is ultra high reliability. the systems run 24/7 and my workstation which is the biggest heat generator is an absolute must to remain up at all costs (every minute it is down i cannot work).

should be an interesting adventure .. never delved into water cooling before, always used good fans and matched air flow for best heat transfers and that always worked fine for me... this will be different. :)

(hmm maybe a freon based system is worth looking into? )
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 12:26:14 pm
well, i know one thing i will not be getting anytime soon. pcie expansion cabinet. those puppies are massively expensive! i could buy a 295 and a 285 for the price of the cheapest one i found! cant see it.. no reason for that kind of price gouging except for the fact its a 'niche' and they can do it. damn greed. doubt there is more than $50 actual  mfg cost in there.

was thinking of getting a 4 device expansion cabinet to deal with heat and power separately and save the added expenses of big mobos.

if i could find the individual interfaces i could convert an old minitower.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Oct 2009, 01:13:50 pm
You could also consider ready for water cooling cards like this (http://www.evga.com/products/moreInfo.asp?pn=017-P3-1297-AR&family=Geforce%20200%20Series%20Family) or this (http://www.bfgtech.com/bfgegtx2951792h2ocwbe.aspx) or, for even easier water-cooling, plug 'n play, self-contained water-cooled cards like these (http://www.bfgtech.com/bfgrgtx2951792h2ocle.aspx).
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 04:31:29 pm
You could also consider ready for water cooling cards like this (http://www.evga.com/products/moreInfo.asp?pn=017-P3-1297-AR&family=Geforce%20200%20Series%20Family) or this (http://www.bfgtech.com/bfgegtx2951792h2ocwbe.aspx) or, for even easier water-cooling, plug 'n play, self-contained water-cooled cards like these (http://www.bfgtech.com/bfgrgtx2951792h2ocle.aspx).


hmmm adds $200+ to the price of every card plus in the case of the nice idea self contained unit, finding spots to put each of the 120mm fan/cooler units..  (i really don't like having things hanging out of the case all over the place)

wondering how air would do.. my 285 runs less than 65c full load so with a bit of tinkering i would think i could keep a 295 below 75c total  with 100% fan plus plenty of ambient air flowing all around it which is still well within factory specs. dunno. i have no clue what a properly running 295 runs at for temps. i have only ever seen 'problem' temps which are off the charts in msg boards.

i can see the need for water cooling when packing those babies together but i think i can get away without water on the standalone replacement for my tesla. plus i read water cooling results all over the place from 30c cooler to less than 5c difference between water and air. i definitely dont like the idea of spending 500 to 700 extra for proper water cooling for my system only to find it isnt much better than i had with air (especially with next year's project having to spend 200-300 extra per card.. would be cheaper to build 2 smaller air units to get the same gross results).

if it wasnt for the condensation problem i would just spend a few hundred on a top or front mount air conditioner or a window unit with a duct feed dedicated to the case.. nothing is gonna overheat when the ambient internal case temp is 1 to 3c :P
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 04:38:02 pm
heh... might as well go for broke and get this case to put it all in :P

absolutely the greatest 'style' ive ever seen!


http://www.ttlevel10.com/
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Oct 2009, 05:05:23 pm
i can see the need for water cooling when packing those babies together but i think i can get away without water on the standalone replacement for my tesla.

Of course, I'm talking about watercooling only for your multi multi-gpu project. For a single card you can do without it.

heh... might as well go for broke and get this case to put it all in :P

absolutely the greatest 'style' ive ever seen!


http://www.ttlevel10.com/
I don't think cooling-wise it'll have many big advantages over "standard" cases.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 08:11:43 pm
i can see the need for water cooling when packing those babies together but i think i can get away without water on the standalone replacement for my tesla.

Of course, I'm talking about watercooling only for your multi multi-gpu project. For a single card you can do without it.

heh... might as well go for broke and get this case to put it all in :P

absolutely the greatest 'style' ive ever seen!


http://www.ttlevel10.com/
I don't think cooling-wise it'll have many big advantages over "standard" cases.

the only advantage i see is that it separates the heat generators into separate individually cooled compartments which reduces the overall heating load to have to deal with.

i can make anything work cool enough i think, i just LOVE how it looks :P

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 08:37:39 pm
an alternative would be a mountainmods u2 ufo duality case... holds 2 complete computer systems, 4 power supplies and enough fans to create a tornado


http://www.mountainmods.com/product_info.php?cPath=21_71&products_id=359


or i could just settle for another of my boring workstation case, a thermaltake armor series full tower but the newer one with the dual large fans over the gpus instead of one big one in the middle of the side cover.  dont get me wrong its a fantastic case, but its becoming a bit too 'traditional' for my tastes :)

http://www.newegg.com/Product/Product.aspx?Item=N82E16811133021

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Oct 2009, 08:51:50 pm
an alternative would be a mountainmods u2 ufo duality case... holds 2 complete computer systems, 4 power supplies and enough fans to create a tornado
That's what I was going to tell you!!! If you want a truly high end case go Mountain Mods and never look back.

or i could just settle for another of my boring workstation case, a thermaltake armor series full tower but the newer one with the dual large fans over the gpus instead of one big one in the middle of the side cover.  dont get me wrong its a fantastic case, but its becoming a bit too 'traditional' for my tastes :)
I currently have an oldie thermaltake xaser III. It definitely shows its age, since when it was designed the thermal requirements were way lower than today and I have modded it quite a lot to keep my current system cool and it still can't keep up.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 09:02:54 pm
an alternative would be a mountainmods u2 ufo duality case... holds 2 complete computer systems, 4 power supplies and enough fans to create a tornado
That's what I was going to tell you!!! If you want a truly high end case go Mountain Mods and never look back.
i have always loved mountainmods.. i just love the look of the level10. maybe sometime ill just get one of those to put my workstation into. dunno, but i had mostly made up my mind for the ufo series for the new project. especially since if i can find a mobo with enough extended pci slots i can space the cards out more and order a special build on the ufo to cover that.


or i could just settle for another of my boring workstation case, a thermaltake armor series full tower but the newer one with the dual large fans over the gpus instead of one big one in the middle of the side cover.  dont get me wrong its a fantastic case, but its becoming a bit too 'traditional' for my tastes :)
Quote
I currently have an oldie thermaltake xaser III. It definitely shows its age, since when it was designed the thermal requirements were way lower than today and I have moded it quite a lot to keep my current system cool and it still can't keep up.

what helped mine was i put 2 more drive bays in the front so there are 3 fans and i replaced the rear 120mm led fan with a 110cfm adjustable fan.. those 2 additions made all the difference. i think with the addition of the 295 to replace the tesla i will have reached the practical limit of this case. i have no doubt it will handle it but i doubt it will handle much more.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Oct 2009, 09:27:40 pm
if i can find a mobo with enough extended pci slots i can space the cards out more and order a special build on the ufo to cover that.
Unfortunately if you go multi (4x) dual-slot cards they'll have to sit right next to each other, so no matter how strong the air circulation is, it is very difficult for air to enter the tiny space between the cards. That's why I'm telling you to go watercooling for your super project. But that's what the Mountain Mods cases are built for: extreme watercooling setups. So another reason to go Mountain Mods.

what helped mine was i put 2 more drive bays in the front so there are 3 fans and i replaced the rear 120mm led fan with a 110cfm adjustable fan.. those 2 additions made all the difference.
120mm? What is that? :D In the days of xaser III, 80mm were more than enough! I wish my case had 120mm fans.


My pc reached 41100+ RAC. With an oldie q6600 a stock clocked 295 and a stock clocked 285 sitting on a PCIE x4 slot (and all PCIE slots 1.1 not 2.0) I'm more than happy.

[attachment deleted by admin]
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Oct 2009, 11:37:22 pm
yeah im convinced i am going to have to extend the budget to include water cooling. hard to believe they wanna charge 200+ bucks just for a copper plate.... but.....  the mountainmods will def. allow the bfg self contained or any kind of water cooling.. heck if i find i need room for cooler mountings i can always mod the side panel to have them mount on it blowing out.

my case has 1 120mm in the back plus an 80mm rear and top fans plus the psu fan along with 2 pci slot exhaust fans placed next to  each of the gpus to help bring cool air into the hot pocket of the case... the front drive bay modules each have a 120mm fan in them so there are 4 120mm fans , plus the top 2 80mm plus psu and a side 25cm fan and an additional 120mm i added to the rear of the bottom housing to move air more directed into the hot spot and the 1 inch fan i have mounted blowing air between the 2 cards. that combination keeps both cards at or below 65c and nothing on the mobo runs above 49c and thats only momentary peaks for the southbridge and the cpu never goes above 60c typically staying around 55c in summer and 45c in winter under full load. so i'm happy with the air flow performance of this case.

btw the reason the cpu runs so warm compared to stock is it is oc to 3.0ghz (q6600) and has been such for more than a year.
Title: Re: SETI MB CUDA for Linux
Post by: IanJ on 09 Oct 2009, 04:54:22 am
Sunu,
 Regarding my 9600GT Xid errors. I have 190.18 cuda libraries which as you know is 2.3, you speak of trying other higher 190.xx versions, where do I get them from the Nvidia download site, as it seems only to have one version, 190.18, the one I'm using?
 Thanks
 Ian
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 09 Oct 2009, 09:58:12 am
you speak of trying other higher 190.xx versions, where do I get them from the Nvidia download site, as it seems only to have one version, 190.18, the one I'm using?
 Thanks
 Ian

You can find them in ftp://download.nvidia.com/XFree86/Linux-x86_64/  Every folder has three files, you need the ...pkg2.run one.
Title: Re: SETI MB CUDA for Linux
Post by: IanJ on 15 Oct 2009, 12:03:58 pm
Sunu,
 I have tried 190.25, 190.32 and 190.36. They all continue to have the problem of the NVRM Xid. I will continue to use 190.36 for the time being.
 Regards
 Ian
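
When comparing driver versions like this, it can help to count the Xid messages in the kernel log instead of eyeballing them. A minimal sketch, assuming the driver logs lines containing "NVRM: Xid" as in the errors discussed here; count_xid is a made-up helper name.

```shell
# count_xid: read kernel log text on stdin and print how many
# lines contain an NVRM Xid message.
count_xid() {
    # grep -c exits non-zero when the count is 0, so mask that.
    grep -c 'NVRM: Xid' || true
}

# Typical use (may need root on some systems):
#   dmesg | count_xid
```

Note the count after each driver change, together with the uptime, and you have something to compare between 190.25, 190.32 and 190.36.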
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 01 Nov 2009, 05:14:31 am
Hey all.

I've recently upgraded to Kubuntu 9.10, which is running a 2.6.31 kernel, and, as I've noticed with any kernel 2.6.29 and beyond, the CUDA units take a REALLY long time.

Look at this unit :

http://setiathome.berkeley.edu/result.php?resultid=1407056493

compared with this one :

http://setiathome.berkeley.edu/result.php?resultid=1401862775

The "CPU" time is about the same, but the actual runtime goes from 13 minutes to over an hour.
Any suggestions ?

All other GPU/CUDA benchmarks indicate that everything else is working fine.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 01 Nov 2009, 06:14:20 am
What priority do your seti apps (CPU and GPU) run?

In sidux (uses 2.6.31-5) I've seen way too slow crunching with the default priorities, 19 for CPU and 10 for GPU tasks. Renicing the GPU tasks to 0, they speeded up considerably. Maybe newer kernels need more aggressive priority levels for cuda.

I'm using the script attached below to renice the cuda tasks to 0 (it runs in infinite loop, checking every 5 seconds for seti cuda tasks, renicing them to 0).

[attachment deleted by admin]
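
Since the attachment is gone, here is a minimal sketch of what such a loop could look like. It is not the original script. The PATTERN value is an assumption about how the CUDA app's command line looks, so check yours with ps first, and the function names are made up for this example.

```shell
#!/bin/sh
# Sketch of a renice loop; not the original attachment.
# Assumption: the CUDA app's command line matches PATTERN below.
PATTERN='setiathome.*[Cc][Uu][Dd][Aa]'

# renice_pids NICE PID...: set an absolute nice value on the given PIDs.
# Does nothing when no PIDs are passed.
renice_pids() {
    nice_value=$1
    shift
    [ "$#" -gt 0 ] || return 0
    renice "$nice_value" -p "$@" >/dev/null
}

# renice_loop: every 5 seconds, renice all matching tasks to 0.
renice_loop() {
    while true; do
        # pgrep -f matches against the full command line.
        # Word splitting of its output is intended here.
        renice_pids 0 $(pgrep -f "$PATTERN")
        sleep 5
    done
}
```

Add a final line calling `renice_loop` and start the script in the background. Lowering a nice value from 10 to 0 normally needs root, so run it as root or via sudo.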
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 01 Nov 2009, 06:25:02 am
I'll try that ...

But I've just switched over to a 2.6.27 kernel, which also has the CPU tasks at 19, and CUDA at 10, and everything seems to be running fine ...
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 01 Nov 2009, 06:30:46 am
But I've just switched over to a 2.6.27 kernel, which also has the CPU tasks at 19, and CUDA at 10, and everything seems to be running fine ...

Yes, in my ubuntu with 2.6.27-14 I haven't seen such behaviour either. That's why I'm saying that newer kernels might need more aggressive priority settings for cuda.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 01 Nov 2009, 11:27:08 am
Just add this to cron:
Code:
* * * * * renice 0 `pgrep setiathome` >/dev/null
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Nov 2009, 06:02:59 am
What priority do your seti apps (CPU and GPU) run?

In sidux (uses 2.6.31-5) I've seen way too slow crunching with the default priorities, 19 for CPU and 10 for GPU tasks. Renicing the GPU tasks to 0, they speeded up considerably. Maybe newer kernels need more aggressive priority levels for cuda.

I'm using the script attached below to renice the cuda tasks to 0 (it runs in infinite loop, checking every 5 seconds for seti cuda tasks, renicing them to 0).

can you run gpus at higher levels such as -1 or so or is it the nature of the gpu system to not go below 0? just wondering if there is only marginal benefit at running them at -1 or maybe even -5.

i have set that script for cpus as well, so now my gpus run 0 and cpus 10 instead of 19.. so far no noticeable probs with the desktop, only time will tell.


Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 02 Nov 2009, 06:37:29 am
I'm currently renicing to 0, and while there's some improvement, it's nowhere near the performance levels of the 2.6.27 kernel.

I'll leave it running overnight so I can get a better gauge of average performance.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Nov 2009, 07:28:52 am
i think i will keep my 2.6.29 kernel on my desktop then... unfortunately i have to update several of my servers to 2.6.31 due to application requirements, so those machines, unfortunately, will probably suffer. i have the script on them for cpu setting them to 10 to see how things go and how server performance is affected if at all. the rest of the servers i will probably leave with the 2.6.26 kernels they now have (if it ain't broke dont fix it) :) .

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Nov 2009, 07:33:47 am
over the short term, i am seeing an average of 2 to 3 minutes lower processing times per workunit on the gpus at nice 0. nice boost.


Title: Re: SETI MB CUDA for Linux
Post by: pp on 02 Nov 2009, 08:37:37 am
can you run gpus at higher levels such as -1 or so or is it the nature of the gpu system to not go below 0? just wondering if there is only marginal benefit at running them at -1 or maybe even -5.
You can run them at -5 if you like but I didn't notice any difference in speed running them at -5 compared to 0.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Nov 2009, 08:53:40 am
can you run gpus at higher levels such as -1 or so or is it the nature of the gpu system to not go below 0? just wondering if there is only marginal benefit at running them at -1 or maybe even -5.

Some time ago, I've tried with negative priorities in my 2.6.27-14 ubuntu kernel and they had... negative effect, the tasks ran slower, but I don't remember if the slowdown was for the GPU tasks or the CPU ones or for both.

I'm trying right now the same in sidux with 2.6.31-5.slh.1-sidux-amd64 (-5 nice for cuda tasks, 19 for cpu) and it doesn't seem to make any difference over 0 for GPU and 19 for CPU ( http://setiathome.berkeley.edu/results.php?hostid=3690316 ). All results dated 2 Nov are with -5 for GPU and 19 for CPU, the rest are with 0 for GPU and 19 for CPU. They seem about the same to me. In this machine using the "default" (10 for GPU and 19 for CPU) results in doubling up computation times for CUDA tasks.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Nov 2009, 09:22:17 am
i dont get that radical a change in processing times.. seems i never do compared to you. what mobo are you using? i believe you said you had a q6600 processor which is the same one i have. i did notice an apparent slow down using -5 so i went back to 0 but again that was short term. for some reason i don't get nearly the response you do so i am beginning to suspect it may be my gigabyte ga-p35-ds4 mobo. i may try upgrading it to an asus rampage formula or something similar. this machine does not need more than 2 pci-e slots although i suspect if there were more than 2 i could spread the 2 gpu devices out farther away from each other to help minimize heat buildup. then i might not need that little 1 in fan moving air between them.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Nov 2009, 09:55:10 am
Yes I still have a Q6600, it's my other "big" machine http://setiathome.berkeley.edu/show_host_detail.php?hostid=3281360 . This has the old ubuntu kernel that saw the slowdown with negative nice levels. This has an ASUS P5K vanilla motherboard. With this machine I didn't see much difference going from 10-19 GPU-CPU to 0-19 GPU-CPU.

i did notice an apparent slow down using -5 so i went back to 0 but again that was short term.
So we see the same result, at least with my ubuntu machine using the old, 2.6.27-14 kernel.

The 3690316 is a new lightweight experimental/testing build I've made to use it for... well experimental/testing things. Right now it runs sidux and boinc with all cuda and stuff over livecd.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Nov 2009, 10:06:32 am
ahh ok so the difference was not that great with the 'big' machine compared to the others... now i dont feel so bad :)

my workstation is using 2.6.29 kernel and i doubt i will update it knowing what i do now.. i have the script running to set cpu nice to 10 to see if it helps on those servers that must run 2.6.31. i did notice rac dropped on those machines so i hope this helps.  the rest are 2.6.26 and will be left alone unless my test on one of those proves nice 10 is considerably faster with little bother to the server.
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 02 Nov 2009, 06:00:24 pm
Well I had the units run overnight with the renice script running in the background, and it really made no difference.

CUDA workunits under the 2.6.31 kernel take between 2-4 times longer than they should. I may add, that while these units are running, compositing effects (such as wobbly windows and other eye candy) are extremely stuttery. This wasn't the case with the 2.6.27 kernel.

I'm running a phenom 955 CPU, GTX 275 GPU.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Nov 2009, 06:28:07 pm
Now your machine says 2.6.31-14-generic kernel. Couple of days ago I think I saw it showing 2.6.31-14-custom or something like that.

Looking at your tasks I see most of them with good times and some others with abnormal times all seemingly mixed up.

We are now nearing midnight UTC. Please stick with one kernel, run the script and let it crunch so all workunits reported 3 Nov are with the new 2.6.31 kernel with script running so we can see what's going on.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Nov 2009, 08:36:09 pm
after running gpu at 0 all day with the 2.6.29 kernel it appears my average is down by a few hundred points, so i am cancelling the script for the next 24 hrs to see if it goes back up
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Nov 2009, 08:39:39 pm
Riofl, check your pending credits. Mine have gone up almost 8000 the last day or so and my RAC has taken a dive.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Nov 2009, 09:10:40 pm
ahh forgot about them. ok im writing current down and will put the script back into operation till sometime tomorrow
Title: Re: SETI MB CUDA for Linux
Post by: pp on 03 Nov 2009, 05:57:42 am
Well I had the units run overnight with the renice script running in the background, and it really made no difference.

CUDA workunits under the 2.6.31 kernel take between 2-4 times longer than they should. I may add, that while these units are running, compositing effects (such as wobbly windows and other eye candy) are extremely stuttery. This wasn't the case with the 2.6.27 kernel.

I'm running a phenom 955 CPU, GTX 275 GPU.

I'm beginning to suspect that someone left some debug options enabled in your kernel. What's the output of :
Code: [Select]
zgrep DEBUG /proc/config.gz
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 03 Nov 2009, 06:16:22 am
Interesting ...
Here's the output.

Code: [Select]
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y                                       
CONFIG_SLUB_DEBUG=y                                                         
CONFIG_HAVE_DMA_API_DEBUG=y                                                 
CONFIG_X86_DEBUGCTLMSR=y                                                     
CONFIG_X86_CPU_DEBUG=m                                                       
CONFIG_PM_DEBUG=y
CONFIG_IRDA_DEBUG=y
CONFIG_CFG80211_REG_DEBUG=y
CONFIG_CFG80211_DEBUGFS=y
CONFIG_MAC80211_DEBUGFS=y
CONFIG_WIMAX_DEBUG_LEVEL=8
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_CB710_DEBUG_ASSUMPTIONS=y
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC79XX_DEBUG_ENABLE=y
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_SCSI_MVSAS_DEBUG=y
CONFIG_SCSI_LPFC_DEBUG_FS=y
CONFIG_SCSI_DEBUG=m
CONFIG_FIREWIRE_OHCI_DEBUG=y
CONFIG_MLX4_DEBUG=y
CONFIG_ATH9K_DEBUG=y
CONFIG_LIBIPW_DEBUG=y
CONFIG_B43LEGACY_DEBUG=y
CONFIG_WIMAX_I2400M_DEBUG_LEVEL=8
CONFIG_ATM_FORE200E_DEBUG=0
CONFIG_USB_SERIAL_DEBUG=m
CONFIG_INFINIBAND_MTHCA_DEBUG=y
CONFIG_INFINIBAND_AMSO1100_DEBUG=y
CONFIG_INFINIBAND_IPOIB_DEBUG=y
CONFIG_THINKPAD_ACPI_DEBUGFACILITIES=y
CONFIG_OCFS2_DEBUG_MASKLOG=y
CONFIG_JFFS2_FS_DEBUG=0
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_RODATA=y
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Nov 2009, 07:31:26 am
interesting... doesnt look like i will have the large increase you did but my rac went down by about 100 points but my pending went up by almost 300 points.. wonder why pending increases when running more aggressively?

Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 03 Nov 2009, 07:45:53 am

interesting... doesnt look like i will have the large increase you did but my rac went down by about 100 points but my pending went up by almost 300 points.. wonder why pending increases when running more aggressively?


Because you complete them faster than your wingmate.

Typically, when testing new builds, you start with an empty cache (at least that's the way you should do it). So the run starts quickly. And if the build is any good, it'll finish quicker too ;D.
Title: Re: SETI MB CUDA for Linux
Post by: pp on 03 Nov 2009, 09:05:13 am
Interesting ...
Here's the output.

Code: [Select]
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y                                       
CONFIG_SLUB_DEBUG=y
<snip>

I was looking for CONFIG_USB_STORAGE_DEBUG in particular but you don't seem to have it enabled. It's a huge performance killer. Don't really know about the others but personally I disable all debug options in my kernels after my encounter with the dreaded USB_STORAGE_DEBUG...
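
The same check can be done programmatically. This is a minimal sketch, assuming the kernel exposes its config at /proc/config.gz (which needs CONFIG_IKCONFIG_PROC); the function names are made up for illustration. Options with numeric values, such as debug levels, are reported as they are rather than interpreted.

```python
import gzip


def enabled_debug_options(config_text: str) -> dict:
    """Return every CONFIG_* option whose name contains DEBUG and is set."""
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # "# CONFIG_FOO is not set" lines are disabled options
        name, value = line.split("=", 1)
        if "DEBUG" in name and value != "n":
            found[name] = value
    return found


def read_running_kernel_config(path: str = "/proc/config.gz") -> str:
    """Read the running kernel's config as text."""
    with gzip.open(path, "rt") as fh:
        return fh.read()


# Small in-memory sample so the sketch runs without /proc/config.gz.
sample = """\
CONFIG_SLUB_DEBUG=y
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_SCHED_DEBUG=y
CONFIG_HZ=250
"""
opts = enabled_debug_options(sample)
print(sorted(opts))
print("CONFIG_USB_STORAGE_DEBUG" in opts)
```

On a real machine, `enabled_debug_options(read_running_kernel_config())` gives the same list as the zgrep command, minus the options that are not set.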
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Nov 2009, 10:32:23 am

interesting... doesnt look like i will have the large increase you did but my rac went down by about 100 points but my pending went up by almost 300 points.. wonder why pending increases when running more aggressively?


Because you complete them faster than your wingmate.

Typically, when testing new builds, you start with an empty cache (at least that's the way you should do it). So the run starts quickly. And if the build is any good, it'll finish quicker too ;D.

ahh ok.... makes sense.
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 03 Nov 2009, 08:02:37 pm
You should now be able to see a bunch of completed tasks under my valid units :

http://setiathome.berkeley.edu/results.php?hostid=5015908&offset=0&show_names=0&state=3
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 04 Nov 2009, 07:17:29 am
Quick update.

I've just started trying a self-compiled kernel (2.6.31.5), and this time I disabled all the debug options AND disabled x86 PAT support.

Without the script, units take a while, but with priority set to 0, they are basically up to speed (well, they average 13 minutes, can't recall if these units ran at 13 mins, or 6 mins though).
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Nov 2009, 07:25:37 am
Looking at your tasks I see most from 3 Nov with bad times and from 4 Nov some with bad and some with good times. How were all these tasks run? Kernel, script on/off.
Title: Re: SETI MB CUDA for Linux
Post by: lordvader on 04 Nov 2009, 07:40:57 am
All these were run with the script on.

The ones with bad times were from the stock Kubuntu 9.10 kernel. The ones with good times are from the kernel I compiled with x86 PAT disabled, and all debug turned off.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Nov 2009, 08:28:22 am
The ones with bad times were from the stock Kubuntu 9.10 kernel.

All those debug flags you posted above were from a stock "official" ubuntu kernel?  :o
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 08 Nov 2009, 06:13:58 am
odd... with the gpus running at 0 nicelevel, there is hardly any difference in times, typically around 29 min, but now i do see some around 26 to 27 min, but my pending has jumped by about 4000. typically my rac is a few hundred lower than normal. odd how this stuff behaves :)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 08 Nov 2009, 06:15:31 am
are they still supplying larger workunits? i remember a few months ago my average was 11 to 13 min consistently, now its averaging 28 to 29 min consistently.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Nov 2009, 10:49:10 am
are they still supplying larger workunits? i remember a few months ago my average was 11 to 13 min consistently, now its averaging 28 to 29 min consistently.

All of them are "large". Only the VHARs are quite short.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Dec 2009, 06:07:16 pm
wow 6.10.17 is nice! it not only recognizes the proper cuda devices reporting 1 per line like it should, but it also reports what device is being fed what workunit!! very nice!

now if the website would just accept the proper device list..........

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Dec 2009, 10:09:40 am
WHOAH

did ya hear what happened to NEZ (one of the top seti participants)?


http://www.kpho.com/news/21778774/detail.html
Title: Re: SETI MB CUDA for Linux
Post by: Gecko_R7 on 03 Dec 2009, 02:02:38 pm
Holy €^#!  Nez has a serious problem!
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Dec 2009, 05:32:49 pm
if that gets air time or spreads nationally you can bet other organizations are gonna begin looking really hard at what their admins are doing. we may wind up losing a few more since it is so easy to hide unless it is looked for.

Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 03 Dec 2009, 09:44:00 pm
Holy €^#!  Nez has a serious problem!

Eric's FAQ on the subject, in the staff blog, at http://setiathome.berkeley.edu/forum_thread.php?id=56450
Title: Re: SETI MB CUDA for Linux
Post by: dmol on 18 Dec 2009, 11:59:07 am
Hi All,
I'm trying to get CUDA working on a Quadro NVS 295 without much success. It's falling back to CPU processing due to lack of memory. I know it's not a great card but I want to utilise all 4 CPU cores as well as the GPU. Is it even possible to get this card working given its lack of memory?

OS:                              Ubuntu 9.10
Kernel:                        2.6.31-16-generic x86_64
CPU:                           Intel(R) Xeon(R) CPU   (quad)  E5506  @ 2.13GHz
Memory:                     4Gb
Nvidia card:               Quadro NVS 295 256Mb RAM
Nvidia-drivers:           Linux x86_64 190.53 (latest) including CUDA v2.3 driver
Nvidia-toolkit:            v2.3
Xwindow running:    Yes
Boinc Version:          6.10.17 for x86_64-pc-linux-gnu
Optimised app:        SETI@home MB CUDA_2.2 608 Linux 64bit SM 1.0 - r12 by Crunch3r VLAR autokill mod

The card is detected fine by boinc:
18-Dec-2009 16:51:12 [---] NVIDIA GPU 0: Quadro NVS 295 (driver version unknown, CUDA version 2030, compute capability 1.1, 255MB, 21 GFLOPS peak)
18-Dec-2009 16:51:12 [SETI@home] Found app_info.xml; using anonymous platform

Sample task:  http://setiathome.berkeley.edu/result.php?resultid=1452918416 (Note: Did not use the "VLAR autokill" version here but result the same)

<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : Quadro NVS 295
           totalGlobalMem = 267714560
...
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: Quadro NVS 295 is okay
SETI@home using CUDA accelerated device Quadro NVS 295
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
libboinc: BOINC 6.5.0
....
Cuda error 'cudaMalloc((void**) &dev_GaussFitResults' in file './cudaAcceleration.cu' in line 317 : out of memory.
setiathome_CUDA: CUDA runtime ERROR in device memory allocation (Step 1 of 3). Falling back to HOST CPU processing
</stderr_txt>



SETI preferences:
Use cpu:         yes
Use GPU:       yes
Applications:  SETI@home enhanced only

App_info.xml main information:
  <name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</name>
<avg_ncpus>3.5</avg_ncpus>
<max_ncpus>3.5</max_ncpus>

There is no cc_config file in use.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Dec 2009, 04:34:39 pm
since you share this card with your desktop i seriously doubt it will work. cuda needs a minimum of 256mb vidram for itself which you do not have if your desktop uses some of it. if you are not critical of your graphics capabilities for your desktop i recommend an inexpensive card for just your desktop and let this 295 handle cuda. some have managed to smooge enough to make a single card work but to be blunt i doubt it is worth all the hoops to jump through.
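
The numbers in the stderr excerpt can be checked with a few lines of code. This is a minimal sketch that only parses the text of a task's stderr; the function name is made up for illustration, and the 256 MB figure is the one quoted in this thread, not something read from the app.

```python
import re

MIB = 1024 * 1024


def summarize_cuda_stderr(stderr_text: str) -> dict:
    """Extract device memory and CPU-fallback status from a task's stderr."""
    mem = re.search(r"totalGlobalMem\s*=\s*(\d+)", stderr_text)
    return {
        "total_mib": int(mem.group(1)) / MIB if mem else None,
        "out_of_memory": "out of memory" in stderr_text,
        "cpu_fallback": "Falling back to HOST CPU processing" in stderr_text,
    }


# Lines copied from the stderr posted above.
log = """\
   Device 1 : Quadro NVS 295
           totalGlobalMem = 267714560
Cuda error 'cudaMalloc((void**) &dev_GaussFitResults' in file './cudaAcceleration.cu' in line 317 : out of memory.
setiathome_CUDA: CUDA runtime ERROR in device memory allocation (Step 1 of 3). Falling back to HOST CPU processing
"""
info = summarize_cuda_stderr(log)
print(info)
```

The card reports 267714560 bytes, which is 255.3125 MiB, so it is already a little under 256 MB before the desktop takes its share.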
Title: Re: SETI MB CUDA for Linux
Post by: dmol on 22 Dec 2009, 06:37:04 am
Thanks riofl,
I take it that there's no way to configure the SETI app to use less memory?
Can anyone recommend any other project that will work on this card/OS without using the CPU?

Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 22 Dec 2009, 09:49:06 am
Collatz (http://boinc.thesonntags.com/collatz/index.php) has Linux CUDA apps; it's especially good for GPUs with little memory.
My laptop's 128 MB 8400M GS works fine (on Windows_X86): no lag, and 0 to 1% to feed the GPU.

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 22 Dec 2009, 01:20:31 pm
dmol, you can try the client from http://calbe.dw70.de/linux64.html . It is release r06 and it is a bit less memory hungry.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 30 Dec 2009, 03:31:08 pm
i am going to try cuda 2.3 again to iron out any oddities before my project begins next year...  i cannot find a cuda 2.3 application though.. the one i have now is

setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu

any clues where to look? i can imagine one must have been generated by now.

also have the newer kernels fixed the slower crunching time reported previously? i am still running 2.6.29. stay with it?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 30 Dec 2009, 06:14:44 pm
There are no 2.3 cuda applications. Recompiling the apps with 2.3 didn't show any speed-up. All speed-up gains are from the 2.3 cuda libraries and the nvidia drivers not the apps themselves.

As for kernel versions you can always try new ones. I've never done a speed comparison among different kernel versions. I've switched to sidux and right now I'm using 2.6.32-2, the latest stable official, and don't see any big difference but as I said I've never done a speed comparison.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Dec 2009, 04:40:41 am
ok cool. thanks.

question. do you know what VLAR angle the vlar killer app uses to trigger rejection? i have been using that cpugpu script and recently i have had a slew of computation errors and when i look at them they all were slightly higher than 0.13 such as 0.133 so they got fed to the gpu and got rejected as vlar.  i am trying 0.14 in the script now.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 31 Dec 2009, 04:57:05 am
Unfortunately I don't know the range of the VLAR kill.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Dec 2009, 08:12:10 am
ok no prob. trial and error works well :)  seems 0.14 was a huge improvement but still gave a few so i moved it to 0.15 and that seems to do it so far. it will probably give the cpu a couple that the gpu could be happy with but.....

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 31 Dec 2009, 10:23:08 am
If you have a high output machine, even with 3 GPUs like I do, the script isn't very useful. VLARs can be so many that the CPU won't be able to do them in time so you'll have to abort (at least some of) them anyway. I tried to use it for a week and my CPU's cache kept growing and growing out of proportion.

So I just do a search for AR 0.19 or lower and abort them. I don't leave them to error out with VLAR-kill so I can differentiate them in my error tasks page, VLARs (aborted) and true errors that need attention.
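
The effect of the different cutoffs mentioned in this thread (0.13, 0.14, 0.15, 0.19) can be illustrated with a small sketch. The task names are hypothetical, and the actual trigger used by the VLAR-kill app is not known here.

```python
def split_by_angle_range(tasks, cutoff=0.19):
    """Split (name, angle_range) pairs into VLAR and non-VLAR name lists.

    A task counts as VLAR when its angle range is at or below the cutoff.
    """
    vlar = [name for name, ar in tasks if ar <= cutoff]
    rest = [name for name, ar in tasks if ar > cutoff]
    return vlar, rest


# Hypothetical task names; the angle ranges are values quoted in the thread.
tasks = [("wu_a", 0.008), ("wu_b", 0.133), ("wu_c", 0.437965), ("wu_d", 10.416071)]

print(split_by_angle_range(tasks, cutoff=0.13))  # wu_b (0.133) stays on the GPU side
print(split_by_angle_range(tasks, cutoff=0.19))  # wu_b is now treated as VLAR
```

With a cutoff of 0.13 the 0.133 task slips through to the GPU, which matches the computation errors reported above for angle ranges just over 0.13.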
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Dec 2009, 02:05:49 pm
interesting... i probably wont care much once i replace this tesla. it locks solid on vlars but works ok on the rest. i was waiting for the gtx 295 to go down but it seems to have increased in price instead.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 31 Dec 2009, 02:14:28 pm
All cards suffer from VLARs, it is the software not the hardware at fault.

As for GTX295s, their price may have gone up because of their reduced production numbers. Still they are cheaper than when I bought mine last July-August.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Dec 2009, 06:12:40 pm
yeah. ill probably just bite the bullet and get one.

vlars kill the tesla. it locks up hard and i have to stop and restart boinc to clear it and sometimes i have to actually power the machine down to clear it.. i think it has a memory issue because i keep getting xvrm or some such in the logs always pointing to it and to the same locations every time.
the gpl gpumemtest i downloaded also locks up on it at the same spot. it is clearly defective. however it seems to behave nicely if it doesnt have long angles to deal with. i at first thought it was a messed up thermal pad on one of the ram chips so i replaced everything  on the chips with some VERY expensive fiberglass web pads and some shin-etsu thermal compound for the gpu and it still misbehaves. its a first design issue engineering model so i am not afraid to retire it or put it into a machine that i dont care much about.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 31 Dec 2009, 06:22:05 pm
VLAR has 2 somewhat interrelated problems: very long kernel times (this leads to sluggishness, GPU lockups and so on) and very low multiprocessor usage by some (actually the same) kernel calls (this leads to very poor performance for VLAR tasks). The GPU acts mostly as a single-core CPU, not as a many-core one, for these tasks. That is, the more multiprocessors a GPU has (the "cooler" it is), the bigger the performance hit it experiences.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Dec 2009, 08:49:12 pm
WOW! i finally upgraded enough libraries to get cuda 2.3 working right with  a 190.53 driver and the processing times have literally cut in half! typically they have been 30 min per wu for a long time. now they average 15min... sheesh!

now i just have to take my system down for a day or 2 to fully upgrade the o/s and kde to v4. im allowing so long a time because i expect some breakage from this since the system has not been fully updated in a year or a bit more. so i may not be replying to anything until sometime saturday.

i was also using march=native in the compiler cflags and switched it to core2 when i realized native shuts off all the sse* capabilities. so far ive only done nvidia stuff and the 2.6.31 kernel upgrade. now i have to do the rest. response of the system is noticeably better already.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 31 Dec 2009, 10:27:22 pm
Very nice riofl. The New Year brings new performance!   :)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Jan 2010, 06:34:12 pm
whelp. i guess i am doomed to cuda 2.2 for this machine. 2.3 worked fine with my old install of the o/s and kde 3.5. after i upgraded and went to kde 4.3.3 it blew apart and boinc kept giving instant computation errors off my gtx285 and a slew more than normal off the tesla too. by the time i was able to stop it i must have wasted more than 100 workunits with 'computation error' and since the tasks section has not updated since dec 31, i cannot look to see what the errors were :(

i expect kde is taking a ton more video resources even after i disabled more than half of kde so i returned to cuda 2.2 and its working just fine now... at least it is working on workunits instead of just trashing them.. time will tell...

guess as long as i continue using this machine i cannot get truly serious about crunching unless i go to some weaker desktop or i sit and waste days and days figuring out fvwm which from what i understand...

oh well...

Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 02 Jan 2010, 06:50:25 pm
The tasks section might not have been updated since dec 31, but the results will have been,
you just have to find which result you want to look at.  ;D

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Jan 2010, 06:58:22 pm
ahh ok ill have to look for that. usually i just go to tasks  link for the computer, then click on errors and look at the latest date/time and thats usually what i just uploaded.

thanks

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Jan 2010, 07:35:30 pm
riofl, what distro do you use? After ubuntu destroyed itself last month for the nth time during the update to 9.10 I decided to ditch ubuntu and switch to sidux. I have kde 4.3.4 with no problems at all.

If you want, post here some of the computation errors you had. Because berkeley will have a power outage in a few hours post them here, don't give links.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Jan 2010, 09:55:20 pm
they uploaded and i have no clue how to find them except in the errors listing once it is updated.
v 2.2 is working perfectly. weird.  i am using gentoo and havent updated my system for almost a year.. was running kde3.5.9 and it worked well. problem was that there have been so many changes and upgrades in everything over the past year that i started getting impossible dependency upgrades to install a simple program which wanted to upgrade half my o/s so i figured it was about time to bite the bullet and update :P  i am a firm believer in 'if it aint broke dont fix it' and i hardly ever update just because there is something new  available.

however in my effort to attempt to keep many of my configs especially login info and things, the upgrade path to kde 4.3.3 was not a smooth road. i had to uninstall much of kde 3.5 due to file collisions... i installed the monolithic version of 3.5 so it caused problems in not being able to remain in a slot. finally 4.3.3 installed and its quite nice in many ways but i get the impression that much of it is still way far from production ready.. my feeling is if 3.5 was a diamond, 4.3 is a pressured lump of coal. but there is no going back because gentoo has discontinued offering 3.5. i am slowly learning the ropes in 4.3 and find much of what i thought was not there was well hidden but still. i really dont need fancy desktops with flying windows and plasma zoom and other things.. but anyway enough rant. i have been configuring now for 20 straight hours and will be again tomorrow in an effort to get back to production by monday. have to tackle sound tomorrow. its a huge huge thing for me and presently it only plays less than a second of any file via the built in kde system... music and video are fine but i need the individual notifies i have configured so i know what i receive from whom before i even get to the computer.

 i suspect either something in the upgraded system and cuda don't like each other, or possibly this new desktop system uses considerably more video resources thereby choking out cuda.. i dont know. its a 1gb card and with 3.5 i was using around 400 to 500mb vidram for the desktops. time and research will tell.

also tomorrow i may try recompiling and reinstalling everything nvidia including the 2.3 toolkit/sdk just in case my system upgrades overwrote something they placed in the lib dirs since i did them before i upgraded.. i hate system upgrades :D  typically what i do is keep the same of everything until i build a new workstation for myself then that gets built clean from scratch with the latest of everything and then i copy/move/whatever things from old to new then wipe old and reinstall and use it as a secondary workstation.  not so in this case. i did something i never do. i updated the entire o/s and applications. to me thats asking for trouble :)

its an easy thing to keep updated if you do upgrades every few days. only one or 2 pkgs that way, but i committed the unpardonable sin of not updating for a year. thats where the problem is. gentoo is a cutting edge source distribution and everything gets compiled with your own specific compiler optimizations. you can install an os from prebuilt binaries but it is at best a compromise so after installing, i do a remake of world with an 'empty tree' which recompiles every byte of code to my optimizations.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Jan 2010, 10:21:00 pm
That was also the problem with ubuntu. Every six months that a new version came out, I had very big problems with upgrade. With 9.10 and yet another unusable system I decided enough was enough. sidux is a rolling update distro, essentially it is updated every single day. It was one of the reasons that I chose it. With more frequent and smaller upgrade steps, I think any problems along the way are more manageable.

1 GB video ram should be more than enough. In ubuntu with compiz effects in full swing and a 3d game running I didn't have any memory problems.

Gentoo is a very configurable distro. I'm sure you can make it as lightweight as you want.

Would you consider changing your distro? You can run seti with a live cd from another linux distro and see if you are happy with it.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Jan 2010, 11:06:41 pm
unfortunately i am steadfast on gentoo... if i update weekly as i did years ago when i first started with gentoo, there are never any problems and the upgrade path is very smooth. keeping it updated like that moves you slowly into the next 'distribution' so by the time a new level is announced you are already there painlessly.. i just ignored things and left it as it was because it worked.. my problems are my own stupidity in not keeping up with changes.. that wont happen again.. :)

gentoo updates just about every day as well, but i found in the past with the options i use, an update world once a week generally is no more than 5 or 6 packages.

'if it aint broke dont fix it' works fine for hardware and even software that is static and never to be added to or updated.

it will take me a few weeks to iron all the quirks out to get it rock solid . i am just one to buck change for change's sake so weekly updates are against my nature, but.. hehe its the name of the game in this day.

i have played with most distros on another machine just to mess with them and i have never ever found a binary distro that i liked (to be fair to ubuntu though, they had probably the best install manager program off the cd i have ever seen), and to be able to recompile that distro the way i want it is pure hell trying to gather together all the sources etc... gentoo is the only source based distro i have found and it truly rocks. i love it. it is also faster than any distro i have ever played with (because i compile to hardware specific optimizations).

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Jan 2010, 11:14:11 pm
if kde4 turns out to be a flop i will work with it  while i learn what i need to know to create a fvwm desktop with all the options i  want.. ill just install a virtual machine in this one and i can switch between them any time i want... i probably should have done that for the upgrade.. mirrored my system into a virtual machine and then upgraded that and once it was perfect, mirror it back to my main partition. oh well.. lesson learned for the next time... hehe you would think i would have thought of that automatically since my specialty is virtual servers and workstations :P  i manage 23 host machines each supporting a minimum of 20 virtual machines , some with as many as 80.

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Jan 2010, 02:19:32 pm
well, it has only gone through 4 workunits so far since the recompile of cuda 2.3 but it appears to be working again. evidently the system upgrades clobbered something it was linked to. time will tell. i still have to do the empty tree remake of world but i wont do that till next weekend so i have time to clean up all the config messes i have.

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Jan 2010, 11:38:18 pm
it has been running 2.3 toolkit and sdk for the past 9 hours now and apparently no errors. i cannot tell for sure since so many are in 'uploading' status but the times look right. will know more when it uploads everything later tomorrow.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 05 Jan 2010, 10:51:00 am
must have been a system upgrade clobbering a lib that cuda or the driver was linked to. for safety i recompiled and booted into the recompiled kernel, then recompiled the driver and cuda kits together, then rebooted again and it has been fine for  almost a full day.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 17 Jan 2010, 06:58:19 pm
is there a link for a setiathome 6.08 app not a vlar killer for cuda 2.2+? crunch3r has no search on his page and all i can find is a 64bit app dated january 2009.  anything newer? the vlar killer app i have says 2.2 in its name, but i want to drop the vlar killer now that i can move workunits around as needed.
Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 17 Jan 2010, 07:03:09 pm
Page One of this thread?, and i think there might be one or two others spread about on different pages in this thread.

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 17 Jan 2010, 07:22:28 pm
page 1 points to the one i have other than the vlar killer. i didnt keep the url when i got the vlar killer app and am just looking for something newer than jan 2009.. the vlarkill was july 2009 and since it was labeled -2.2- i am assuming there is a non vlarkill app also updated? searches on this site or on google show nothing.. its too hard finding updates. its like people want to hide them to make a game out of it. ill just stick with what i am running.

thanks!

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 17 Jan 2010, 07:41:17 pm
is there a link for a setiathome 6.08 app not a vlar killer for cuda 2.2+?
No, because there is no such app.

all i can find is a 64bit app dated january 2009.  anything newer?
No. If you want a non-VLAR-kill app that's the one to get. It's only a tiny bit slower than the 2.2 so you won't be missing much.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Jan 2010, 04:32:32 am
ahh ok thanks... it will probably make up the difference in the fact that it will complete more units since it will not reject any... i finally got to look at some of the error units and vlar killer got to them even with the script. the script considers .13 to be a vlar angle and i had changed that somewhat to give the cpu some work units that didnt meet that criteria.. i had mine set to call a vlar 0.30 and move anything less than that to the cpu.. the vlar killer killed some work units with angles of 0.49 approx. so i decided to revert back to the non killer app.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 18 Jan 2010, 07:43:21 am
hm... what a weird "vlar killer" you have??? It's mid-range ARs where GPU is most effective!
Check again your results, probably that was different kind of error.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 18 Jan 2010, 08:13:34 am
Yes, I agree with Raistmer, that must have been something different. I don't know exactly what ARs trigger the VLAR kill but I've never seen it kill anything larger than 0.20.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Jan 2010, 01:04:38 pm
the error said something to the effect of VLAR killed angle 0.49

i thought it was a bit high, but there were several work units with the same range of angles that were killed by it. the non killer app is working perfectly. i am  down to using 0.25 as the cutoff angle and its just purring right along even with my desktop features enabled that i like.

i was getting ready to go out and get an ati card for my video and just use the nvidia cards for cuda only as i thought that i was using gpu resources for the desktop that caused problems with the cuda apps (desktop cube, shading and transparency features etc... experimenting with those just to see what 'glitter' was like :P ). especially when i turned those features off things started working again, but with this non killer app, i can keep everything enabled and it all lives together well.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Jan 2010, 01:14:53 pm
ok i went back and looked at them again. i was not very awake last time i looked at them. it was another error, an fft.cu error....  however i dont get that with this app unless it was those specific workunits..

here are a few. there were maybe 30 workunits errored out and i looked at 10 of them just now and all had the same error except this first one.


Device 1: GeForce GTX 285 is okay
SETI@home using CUDA accelerated device GeForce GTX 285
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.7.0

Work Unit Info:
...............
WU true angle range is :  10.416071
SETI@home error -12 Unknown error
cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel
File: ./cudaAcc_pulsefind.cu
Line: 232

--------------------------------------------

setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 285 is okay
SETI@home using CUDA accelerated device GeForce GTX 285
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.7.0

Work Unit Info:
...............
WU true angle range is :  0.437965
CUFFT error in file './cudaAcc_fft.cu' in line 62.



-----------------------------------------

setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 285 is okay
SETI@home using CUDA accelerated device GeForce GTX 285
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.7.0

Work Unit Info:
...............
WU true angle range is :  0.407435
CUFFT error in file './cudaAcc_fft.cu' in line 62.

------------------------------------------


setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 285 is okay
SETI@home using CUDA accelerated device GeForce GTX 285
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.7.0

Work Unit Info:
...............
WU true angle range is :  0.437965
CUFFT error in file './cudaAcc_fft.cu' in line 62.
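
When there are dozens of errored work units, pulling the angle range and the error line out of each stderr dump can be scripted. A minimal sketch, assuming the stderr text has the same layout as the excerpts above; the function name `summarize` is made up for this example.

```python
import re

# Patterns written against the stderr excerpts quoted in this post.
AR_RE = re.compile(r"WU true angle range is\s*:\s*([0-9.]+)")
CUFFT_RE = re.compile(r"CUFFT error in file '([^']+)' in line (\d+)")
SETI_RE = re.compile(r"SETI@home error (-?\d+)")


def summarize(stderr_text):
    """Return the angle range and a short error description for one task."""
    ar = AR_RE.search(stderr_text)
    cufft = CUFFT_RE.search(stderr_text)
    seti = SETI_RE.search(stderr_text)
    if cufft:
        error = f"CUFFT error at {cufft.group(1)}:{cufft.group(2)}"
    elif seti:
        error = f"SETI@home error {seti.group(1)}"
    else:
        error = None
    return {
        "angle_range": float(ar.group(1)) if ar else None,
        "error": error,
    }


if __name__ == "__main__":
    sample = (
        "WU true angle range is :  0.437965\n"
        "CUFFT error in file './cudaAcc_fft.cu' in line 62.\n"
    )
    print(summarize(sample))
```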


Title: Re: SETI MB CUDA for Linux
Post by: sunu on 18 Jan 2010, 07:15:03 pm
riofl, is the computer 4166601 ( http://setiathome.berkeley.edu/show_host_detail.php?hostid=4166601 ) yours?

The error cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel is a "normal" one. There is nothing in it.

The errors from that computer's page are interesting. They occur right after the "preparatory" phase in the CPU and when the GPU was supposed to take over. I've checked a few and all seem to happen in your "good" GTX285 card and not in the problematic tesla card. am I right?

If I remember correctly you were experiencing unusually high run times in your GPUs, does it still happen?

There is definitely something not right with the setup of this computer.

I think I've asked you before and you have told me the brand of your motherboard, can you remind me?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Jan 2010, 12:16:26 pm
riofl, is the computer 4166601 ( http://setiathome.berkeley.edu/show_host_detail.php?hostid=4166601 ) yours?

The error cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel is a "normal" one. There is nothing in it.

The errors from that computer's page are interesting. They occur right after the "preparatory" phase in the CPU and when the GPU was supposed to take over. I've checked a few and all seem to happen in your "good" GTX285 card and not in the problematic tesla card. am I right?

If I remember correctly you were experiencing unusually high run times in your GPUs, does it still happen?

There is definitely something not right with the setup of this computer.

I think I've asked you before and you have told me the brand of your motherboard, can you remind me?


yes that is the computer... the tesla is problematic in that it simply locks up randomly. there is a hardware problem with it. restarting boinc cures it for some time.  vidram test software shows a bad ram chip around the 700mb mark. i think my workstation is using more resources than i think it does and the gtx285 is simply overwhelmed if i have kde options enabled and does not have enough resources for seti. 

my times now are averaging 16-18 min off the tesla and 19-22min off the gtx285 . much better than previously at around 30 min. my scores have finally climbed to near 15k like you said they should be.

the computing errors were happening just as the gpu was supposed to take over. that was when i had all the 'cute' features of kde4 enabled which included dimming of unfocused windows and cube desktop switching and several other things including sharpen desktop (all experimental to see what it was like to use a workstation that had glitz enabled). i also use dual 24" monitors each at 1920x1200 using nvidia twinview option so i am sure that takes up a bit of vid resources as well. i also use different backgrounds on each of 9 desktops, same image loaded in each monitor/desktop.

once i disabled the glitz and glitter options and did a power down restart to allow everything to clear and changed back to the older non vlar killer app,  all the errors stopped.

the system is an intel q6600 quad processor overclocked to 3.0ghz using a 9 multiplier and 333mhz bus, OCZ ram is adjusted to stock frequency of ddr2-1066 . ram factory recommended timings were adjusted slightly from 5-5-5-18 to 5-5-5-15 and  cpu and ram voltages are stock factory recommendations. instead of auto, the pci-e bus speed is locked at 100mhz since the gigabyte board in full auto mode tends to adjust everything as it wants which could be dangerous.
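
The clock figure above is simply multiplier times bus speed. A small sketch of that arithmetic; the helper names are made up, and the second helper assumes the multiplier stays fixed, which the post does not state for any other speed.

```python
def core_clock_mhz(multiplier, bus_mhz):
    """Core clock in MHz for a given multiplier and bus speed."""
    return multiplier * bus_mhz


def bus_for_target_mhz(target_mhz, multiplier):
    """Bus speed needed to reach target_mhz at a fixed multiplier."""
    return target_mhz / multiplier


if __name__ == "__main__":
    # 9 x 333 MHz, the setting described in the post (about 3.0 GHz).
    print(core_clock_mhz(9, 333))
    # Bus speed a 3.6 GHz target would need if the multiplier stayed at 9.
    print(bus_for_target_mhz(3600, 9))
```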

the motherboard is a gigabyte GA-P35-DS4-rev2.1

things have been stable for the past 20 hours or so since i readjusted everything back to standard dull desktop :)


Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 19 Jan 2010, 01:23:26 pm
This may be something we ought to remember when writing stock boilerplate answers to message board grumbles. It's already fairly standard to say "don't expect your graphics card to draw a screensaver while crunching CUDA". I think it's also sometimes mentioned, though perhaps less often than it should be, that the "Aero" effects in Vista, and whatever they call the equivalent in Windows 7, eat up VRAM - much more so than the simple frame buffer for the final output, no matter what the resolution, in my opinion (never seen any problem with the 1600 x 1200 screens I use here, even on 512MB CUDA cards).

I've also started to see reports from users of Mac OS X, who have just gained the ability to run Einstein on CUDA - or not, if they only have 512MB. One poster attributed the loss of 125MB available memory (512MB --> 387MB) to OS effects alone.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 19 Jan 2010, 01:34:52 pm
i think my workstation is using more resources than i think it does and the gtx285 is simply overwhelmed if i have kde options enabled and does not have enough resources for seti. 
...
the computing errors were happening just as the gpu was supposed to take over. that was when i had all the 'cute' features of kde4 enabled which included dimming of unfocused windows and cube desktop switching and several other things including sharpen desktop (all experimental to see what it was like to use a workstation that had glitz enabled). i also use dual 24" monitors each at 1920x1200 using nvidia twinview option so i am sure that takes up a bit of vid resources as well. i also use different backgrounds on each of 9 desktops, same image loaded in each monitor/desktop.
With 1 GB ram I think you should be safe. Still, even if it had no memory left it should throw an out of memory error message and switch to CPU computing, not error out completely.

my times now are averaging 16-18 min off the tesla and 19-22min off the gtx285 . much better than previously at around 30 min. my scores have finally climbed to near 15k like you said they should be.
Maybe you could do better, 20000+ RAC  ;)

once i disabled the glitz and glitter options and did a power down restart to allow everything to clear and changed back to the older non vlar killer app,  all the errors stopped.
...
things have been stable for the past 20 hours or so :)
If you were looking in your errors page and didn't see new errors that was because they haven't been updated since 17th January, not because there weren't new errors.

cpu and ram voltages are stock factory recommendations. instead of auto
Maybe this isn't enough? What are their values? For cpu voltages don't look at bios, see the real value with 100% CPU utilization under seti.


since i readjusted everything back to standard dull desktop :)
Personally I don't like kde's effects now that I've seen them in sidux and I've also switched them off. Compiz effects are way better I think.


I've also started to see reports from users of Mac OS X, who have just gained the ability to run Einstein on CUDA - or not, if they only have 512MB. One poster attributed the loss of 125MB available memory (512MB --> 387MB) to OS effects alone.
1GB video RAM might not seem excessive any more but the bare minimum?  ::)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Jan 2010, 05:56:57 pm
out of memory error: true. its just weird they errored out like they did. maybe the gpu  did not have enough shader resources when the options were enabled for the desktop. i dont know when shader threads are enabled but i am assuming with the dimming options and cube rotation / transparency options it uses the shaders..

yeah some of the options arent bad.. i like a *slight* fade of non focus windows makes it easier to pay attention to the one in focus.. but the rest... the cube stuff started making me dizzy :)

i have no idea how much vidram im actually using but i am sure its quite a bit. 9x2 1 to 3mb backgrounds would use up to a good 30mb ram keeping that in vid memory if it does.. i really dont know how it interfaces with vid cards.... will have to check some of the utilities to see if they show it or find one on the net.
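
As a rough upper bound on the numbers guessed at above: the file size of a background says little about its size once decoded. The sketch below assumes 32-bit (4 bytes per pixel) uncompressed images and assumes every background is kept decoded in video memory, neither of which the thread establishes. The function name is made up for this example.

```python
BYTES_PER_PIXEL = 4  # assumption: 32-bit colour, uncompressed


def image_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of one decoded image or frame buffer in bytes."""
    return width * height * bytes_per_pixel


if __name__ == "__main__":
    one_screen = image_bytes(1920, 1200)
    both_screens = 2 * one_screen        # twinview, two monitors
    backgrounds = 9 * 2 * one_screen     # 9 desktops x 2 monitors
    for name, size in (("one screen", one_screen),
                       ("both screens", both_screens),
                       ("18 backgrounds", backgrounds)):
        print(f"{name}: {size} bytes ({size / 2**20:.1f} MiB)")
```

If the desktop really held all 18 images decoded, that alone would be well above the 30mb guess, though still a fraction of a 1 GB card.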

20k+ rac huh? might be pushing this puppy a little bit

i have only used the voltages in the bios.. under load i have nothing that reads them properly. for some reason lm sensors and  gkrellm report the voltage sensors are in error.. for example... 2.85v for the 12v line? nope.. nada... only voltage readouts that make any sense are the ram and some cpu voltages but i am guessing they are that since they are only labelled in1 in2 in3 in4 etc... the only thing i know for sure is correct is in1 as ram voltage. it matches what the bios says., and the fans and temps. temp1 is the mosfets and temp2 is the southbridge. i discovered that with a hair dryer against the chips.. and discovered what fanx belonged to which fan by unplugging the fan to see which one dropped to 0.


to give you an idea here is the sensors output:

Adapter: ISA adapter
in0:         +1.22 V  (min =  +0.00 V, max =  +4.08 V)
in1:         +1.89 V  (min =  +0.00 V, max =  +4.08 V)
in2:         +3.22 V  (min =  +0.00 V, max =  +4.08 V)
in3:         +2.94 V  (min =  +0.00 V, max =  +4.08 V)
in4:         +1.84 V  (min =  +0.00 V, max =  +4.08 V)
in5:         +0.08 V  (min =  +0.00 V, max =  +4.08 V)
in6:         +1.02 V  (min =  +0.00 V, max =  +4.08 V)
in7:         +2.93 V  (min =  +0.00 V, max =  +4.08 V)
in8:         +3.30 V
fan1:       2360 RPM  (min =    0 RPM)
fan2:       2102 RPM  (min =    0 RPM)
fan3:       1406 RPM  (min =    0 RPM)
fan4:       1415 RPM  (min =    0 RPM)
temp1:       +41.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
temp2:       +44.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermal diode
temp3:        -2.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
cpu0_vid:   +1.219 V


i am going to have to find a reliable reporting tool to make sure what they are.
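
Until a better tool turns up, the `sensors` text itself can be turned into numbers for logging or comparison. A minimal sketch, assuming output laid out like the listing above; `parse_sensors` is a made-up helper name, and only the first value on each line is read.

```python
import re

# label, first numeric value, and unit, e.g. "in0:  +1.22 V  (min = ...)"
LINE_RE = re.compile(
    r"^(?P<label>[^:]+):\s*(?P<value>[+-]?\d+(?:\.\d+)?)\s*(?P<unit>V|RPM|°C)"
)


def parse_sensors(text):
    """Map each sensor label to a (value, unit) tuple."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            readings[match.group("label")] = (
                float(match.group("value")),
                match.group("unit"),
            )
    return readings


if __name__ == "__main__":
    sample = (
        "Adapter: ISA adapter\n"
        "in0:         +1.22 V  (min =  +0.00 V, max =  +4.08 V)\n"
        "fan1:       2360 RPM  (min =    0 RPM)\n"
        "temp3:        -2.0°C  (low  = +127.0°C, high = +127.0°C)\n"
    )
    print(parse_sensors(sample))
```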
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 19 Jan 2010, 07:43:16 pm
20k+ rac huh? might be pushing this puppy a little bit
Why not?  :)

i have only used the voltages in the bios.. under load i have nothing that reads them properly. for some reason lm sensors and  gkrellm report the voltage sensors are in error.. for example... 2.85v for the 12v line? nope.. nada... only voltage readouts that make any sense are the ram and some cpu voltages but i am guessing they are that since they are only labelled in1 in2 in3 in4 etc... the only thing i know for sure is correct is in1 as ram voltage. it matches what the bios says., and the fans and temps. temp1 is the mosfets and temp2 is the southbridge. i discovered that with a hair dryer against the chips.. and discovered what fanx belonged to which fan by unplugging the fan to see which one dropped to 0.

I agree that lm-sensors reports most of the voltages incorrectly, with the rest I'll have to disagree.
There are no RAM voltages in those values, in fact I don't think there is a utility that can show them, windows, linux or whatever.
temp1 mosfets? Maybe, if you watercool them. Mosfets go high, really high, 100+ °C.

From the values you posted, the two that resemble your CPU voltage (vcore) are in0 (1.22) and cpu0_vid (1.219). The "cpu0_vid" is a very interesting name. "VID" is something like a default voltage for the chip. The lower it is the more overclockable the chip is. A Q6600 with a VID of 1.219 is very very good. My Q6600 with a VID of 1.2750 (average for this chip) has easily gone to 3.24 GHz.

The thing is I don't think lm-sensors can show the VID of a chip and it is just the vcore with a fancy name. Now if we assume that vcore=in0=cpu0_vid=1.22, I think it is a little low for 3GHz. Maybe try 1.24-1.25volts. Still it depends on the VID of the chip. If the VID is really 1.219 then 1.22 is not necessarily bad. Then again I don't think the VID can take such a value (1.219), it goes in increments.
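
Whether a value like 1.219 can sit on a stepped VID scale depends on the step size. A small sketch of that check, working in microvolts to avoid rounding trouble. The 6.25 mV and 12.5 mV steps tried here are assumptions for illustration only; the thread does not say which step the Q6600 uses, and the helper names are made up.

```python
def nearest_step_uv(value_uv, step_uv):
    """Return the multiple of step_uv closest to value_uv (microvolts)."""
    return round(value_uv / step_uv) * step_uv


def on_grid(value_uv, step_uv, tolerance_uv=500):
    """True if value_uv is within tolerance_uv of a multiple of step_uv."""
    return abs(value_uv - nearest_step_uv(value_uv, step_uv)) <= tolerance_uv


if __name__ == "__main__":
    reported_uv = 1_219_000  # cpu0_vid as printed by sensors, 1.219 V
    for step_uv in (6_250, 12_500):
        print(step_uv, nearest_step_uv(reported_uv, step_uv),
              on_grid(reported_uv, step_uv))
```

Under the 6.25 mV assumption, 1.219 is just 1.21875 rounded to three decimals, so the odd-looking value would not by itself rule out a real VID.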

Is this machine dual boot with windows by any chance?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Jan 2010, 11:08:56 pm
20k+ rac huh? might be pushing this puppy a little bit
Why not?  :)

i have only used the voltages in the bios.. under load i have nothing that reads them properly. for some reason lm sensors and  gkrellm report the voltage sensors are in error.. for example... 2.85v for the 12v line? nope.. nada... only voltage readouts that make any sense are the ram and some cpu voltages but i am guessing they are that since they are only labelled in1 in2 in3 in4 etc... the only thing i know for sure is correct is in1 as ram voltage. it matches what the bios says., and the fans and temps. temp1 is the mosfets and temp2 is the southbridge. i discovered that with a hair dryer against the chips.. and discovered what fanx belonged to which fan by unplugging the fan to see which one dropped to 0.

I agree that lm-sensors reports most of the voltages incorrectly, with the rest I'll have to disagree.
There are no RAM voltages in those values, in fact I don't think there is a utility that can show them, windows, linux or whatever.
temp1 mosfets? Maybe, if you watercool them. Mosfets go high, really high, 100+ °C.

From the values you posted, the two that resemble your CPU voltage (vcore) are in0 (1.22) and cpu0_vid (1.219). The "cpu0_vid" is a very interesting name. "VID" is something like a default voltage for the chip. The lower it is the more overclockable the chip is. A Q6600 with a VID of 1.219 is very very good. My Q6600 with a VID of 1.2750 (average for this chip) has easily gone to 3.24 GHz.

The thing is I don't think lm-sensors can show the VID of a chip and it is just the vcore with a fancy name. Now if we assume that vcore=in0=cpu0_vid=1.22, I think it is a little low for 3GHz. Maybe try 1.24-1.25volts. Still it depends on the VID of the chip. If the VID is really 1.219 then 1.22 is not necessarily bad. Then again I don't think the VID can take such a value (1.219), it goes in increments.

Is this machine dual boot with windows by any chance?

well, the mobo has heatpipe cooling for the mosfets, north and south bridges..   when i messed with the hair dryer i aimed it at each of the  heatsinks and the 2 ends, mosfets and southbridge, are where i got the most sensitive individual temp variations.. aiming at the northbridge (middle heatsink on the pipe) gave me hardly any variation at all and it was mostly even changes between the 2 temps.

here is the image gallery. the temps 1 and 2 are most sensitive to changes on the outer two heatsinks which the manual says is mosfet on the left and southbridge on the right of the board top image.

http://www.newegg.com/Product/ImageGallery.aspx?CurImage=13-128-064-S01&SCList=13-128-064-S01%2c13-128-064-S02%2c13-128-064-S03%2c13-128-064-S04%2c13-128-064-S05%2c13-128-064-S06%2c13-128-064-S07&S7ImageFlag=2&Item=N82E16813128064&Depa=0&WaterMark=1&Description=GIGABYTE%20GA-P35-DS4%20Rev.%202.0%20LGA%20775%20Intel%20P35%20ATX%20Ultra%20Durable%20II%20Intel%20Motherboard


i only know the processor is a "G0" chip which is the best overclockable chip of that model. the other one is a "B" something..  i can check the bios readings for the cpu voltages to see what they say. this voltage will probably be displayed in ht as well.

no. i dont have dual boot. i am exclusively linux. i do have xp in virtualbox so i can guide ppl to configs they need to change  for my job, but other than that it sits there crunching numbers..  when i first got the gtx285 i did hook an ide drive onto my system and installed windows on that so i could try out riva tuner and evga precision .. i wanted to change the calibration on the fans auto setups on both cards so they would be more aggressive but found i could not do that within the bios itself only by setting things with the driver. since i have no counterpart to do so in linux i could not do it so i just use nvclock to set the fans at 100% all the time. so anyway if i have to i can boot from that ide hdd. when i do i always protect every other drive in my system by unplugging them :) i trust windows as far as i would actually use it for my desktop which is a very solid NEVER. :P

of course booting from windows will not get my system running at 100% load since boinc will not be installed. in fact, the windows installation on that ide drive does not even have networking capabilities. i made sure it was stuck to the hard drive only and anything added has to be done by cd. i will not give raw windows a chance to touch anything on my network. in my opinion it explores networks too much trying to know more than it needs to about a lan. my virtualbox windows xp has a tunnel directly to the 'outside' and has no permission to touch anything i have not specified on my system or my network , which is nothing. it has to use outside dns,  time and anything else it needs since it cannot see my machine or my lan..

are there programs i would have to install on the windows drive? if so i would have to download them with linux and burn a cd for them.

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Jan 2010, 11:41:37 pm
i have been looking around and found that the G0 stepping is a lower voltage version of the previous B3 stepping so yes that VID may be right

intel says this about the VID voltage range of the G0

VID Voltage Range   0.85V-1.5V
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Jan 2010, 11:55:27 pm
my friend has an asus rampage formula mobo with his q6600 G0 running at 3.0ghz also and his VID says 1.22 so i guess that is what it is. i wont be able to look at my bios until i can reboot which will be sometime tomorrow if this one application finishes its work before tomorrow night.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 20 Jan 2010, 06:21:10 am
Lets do something else. Post again the above values with 0% cpu utilization (idle, nothing runs) and 100%.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 20 Jan 2010, 12:23:42 pm
Lets do something else. Post again the above values with 0% cpu utilization (idle, nothing runs) and 100%.

with boinc set to use 100% cpu on 3 cpus the 4th is always reserved for cuda and my desktop. cuda is set for 0.35cpus

sensors
coretemp-isa-0000     
Adapter: ISA adapter
Core 0:      +55.0°C  (high = +84.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1:      +50.0°C  (high = +84.0°C, crit = +100.0°C)

coretemp-isa-0002
Adapter: ISA adapter
Core 2:      +50.0°C  (high = +84.0°C, crit = +100.0°C)

coretemp-isa-0003
Adapter: ISA adapter
Core 3:      +50.0°C  (high = +84.0°C, crit = +100.0°C)

it8718-isa-0290
Adapter: ISA adapter
in0:         +1.22 V  (min =  +0.00 V, max =  +4.08 V)
in1:         +1.89 V  (min =  +0.00 V, max =  +4.08 V)
in2:         +3.22 V  (min =  +0.00 V, max =  +4.08 V)
in3:         +2.94 V  (min =  +0.00 V, max =  +4.08 V)
in4:         +1.87 V  (min =  +0.00 V, max =  +4.08 V)
in5:         +0.08 V  (min =  +0.00 V, max =  +4.08 V)
in6:         +1.04 V  (min =  +0.00 V, max =  +4.08 V)
in7:         +2.96 V  (min =  +0.00 V, max =  +4.08 V)
in8:         +3.30 V
fan1:       2288 RPM  (min =    0 RPM)
fan2:       2115 RPM  (min =    0 RPM)
fan3:       1406 RPM  (min =    0 RPM)
fan4:       1415 RPM  (min =    0 RPM)
temp1:       +39.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
temp2:       +43.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermal diode
temp3:        -2.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
cpu0_vid:   +1.219 V


with boinc shut down and time given for everything to cool off average cpu usage is 0 to 1% peaking at 5% momentarily

sensors
coretemp-isa-0000     
Adapter: ISA adapter
Core 0:      +40.0°C  (high = +84.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1:      +39.0°C  (high = +84.0°C, crit = +100.0°C)

coretemp-isa-0002
Adapter: ISA adapter
Core 2:      +38.0°C  (high = +84.0°C, crit = +100.0°C)

coretemp-isa-0003
Adapter: ISA adapter
Core 3:      +38.0°C  (high = +84.0°C, crit = +100.0°C)

it8718-isa-0290
Adapter: ISA adapter
in0:         +1.23 V  (min =  +0.00 V, max =  +4.08 V)
in1:         +1.89 V  (min =  +0.00 V, max =  +4.08 V)
in2:         +3.23 V  (min =  +0.00 V, max =  +4.08 V)
in3:         +2.96 V  (min =  +0.00 V, max =  +4.08 V)
in4:         +0.69 V  (min =  +0.00 V, max =  +4.08 V)
in5:         +0.08 V  (min =  +0.00 V, max =  +4.08 V)
in6:         +0.42 V  (min =  +0.00 V, max =  +4.08 V)
in7:         +2.98 V  (min =  +0.00 V, max =  +4.08 V)
in8:         +3.30 V
fan1:       1917 RPM  (min =    0 RPM)
fan2:       2115 RPM  (min =    0 RPM)
fan3:       1406 RPM  (min =    0 RPM)
fan4:       1415 RPM  (min =    0 RPM)
temp1:       +38.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
temp2:       +33.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermal diode
temp3:        -2.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
cpu0_vid:   +1.219 V
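
To see which inputs actually move with load, the two listings above can be compared directly. A minimal sketch using the voltage values copied from this post; the function name and the 0.05 V threshold are arbitrary choices for illustration.

```python
# Voltage readings copied from the two listings in this post.
LOAD = {"in0": 1.22, "in1": 1.89, "in2": 3.22, "in3": 2.94, "in4": 1.87,
        "in5": 0.08, "in6": 1.04, "in7": 2.96, "in8": 3.30}
IDLE = {"in0": 1.23, "in1": 1.89, "in2": 3.23, "in3": 2.96, "in4": 0.69,
        "in5": 0.08, "in6": 0.42, "in7": 2.98, "in8": 3.30}


def changed_inputs(load, idle, threshold=0.05):
    """Return {label: load - idle} for inputs that differ by more than threshold."""
    result = {}
    for label in load:
        if label in idle:
            delta = round(load[label] - idle[label], 2)
            if abs(delta) > threshold:
                result[label] = delta
    return result


if __name__ == "__main__":
    print(changed_inputs(LOAD, IDLE))
```

Only in4 and in6 change by more than a few hundredths of a volt between the two runs; in0 barely moves.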
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 20 Jan 2010, 12:25:25 pm
if i truly need 0% i will have to wait till i can reboot and then kill off X and stop all my background processes like storegrid backups, openfire jabber server etc and then leave only the server os running.

Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 20 Jan 2010, 04:49:29 pm
Hello

Can you also include output of "nvidia-smi -lsa" command when gpus are on load?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 20 Jan 2010, 07:31:25 pm
Hello

Can you also include output of "nvidia-smi -lsa" command when gpus are on load?


i didnt even know that existed... so it is a temp utility... my devices 0 and 1  run between 57 and 66c, most commonly running around 64c  as reported by gkrellm. the nvidia-smi output is:

nvidia-smi -lsa

==============NVSMI LOG==============


Timestamp                       : Wed Jan 20 19:26:05 2010

GPU 0:
        Product Name            : GeForce GTX 285
        Serial                  : 3169719755757
        PCI ID                  : 5e310de
        Temperature             : 63 C
GPU 1:
        Product Name            : Tesla C1060
        Serial                  : 837485170935
        PCI ID                  : 5e710de
        Temperature             : 61 C
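
For logging both cards over time, the temperatures can be picked out of this log with a few lines of Python. A minimal sketch, assuming the text layout shown above; other nvidia-smi versions may print something different, and `gpu_temperatures` is a made-up helper name.

```python
import re


def gpu_temperatures(log_text):
    """Return {gpu_index: temperature_in_C} from an NVSMI text log."""
    temps = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        gpu = re.match(r"GPU (\d+):", line)
        if gpu:
            current = int(gpu.group(1))
            continue
        temp = re.match(r"Temperature\s*:\s*(\d+) C", line)
        if temp and current is not None:
            temps[current] = int(temp.group(1))
    return temps


if __name__ == "__main__":
    sample = (
        "GPU 0:\n"
        "        Product Name            : GeForce GTX 285\n"
        "        Temperature             : 63 C\n"
        "GPU 1:\n"
        "        Product Name            : Tesla C1060\n"
        "        Temperature             : 61 C\n"
    )
    print(gpu_temperatures(sample))
```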

and yes they are running full-bore constantly. i keep both fans set at 100% since i cannot change the card bios settings permanently to make the auto more aggressive. wish they had something like cmos that you could change the temp ranges vs fan speeds permanently without always having to fiddle with the driver. for a harder test the device 0 runs 68-71c after 5 minutes of glxgears running.

i try to keep the ambient temp in my office between 19 and 22c to help cool the systems



Title: Re: SETI MB CUDA for Linux
Post by: sunu on 20 Jan 2010, 09:32:41 pm
Well, you never know, seems like cpu0_vid is the VID after all. Still, I'd like to see it from windows too.

With this VID and this motherboard I think you can easily go to 3.6GHz if you have a decent cooler.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 20 Jan 2010, 10:51:35 pm
Well, you never know, seems like cpu0_vid is the VID after all. Still, I'd like to see it from windows too.

With this VID and this motherboard I think you can easily go to 3.6GHz if you have a decent cooler.

i know i can.. i ran it for a week at 3.6ghz but temps were running higher than i like. cpus were always in the mid to high 60s.  71c is the max operating temp before the cpu should go into lower power mode so i chose the highest rating keeping the cpu temps at a decent level and that was 3.0ghz.. at this level they never momentarily peak higher than 65c on a very hot florida day. i always keep my systems as cool as is practical.. heat kills..  thats why i put up with a cooler than i would like ambient environment.. it helps everything.

this weekend will be the earliest i can down the system long enough to put the windows drive in and run some program.. i hear cpuz is supposed to be a winner in measurements so ill download that and make a cd for the weekend unless you have another favorite i should use.


Title: Re: SETI MB CUDA for Linux
Post by: riofl on 20 Jan 2010, 10:57:17 pm
Well, you never know, seems like cpu0_vid is the VID after all. Still, I'd like to see it from windows too.

With this VID and this motherboard I think you can easily go to 3.6GHz if you have a decent cooler.

forgot... the cooler is a Zalman CNPS 9700 NT and if you want to see what that is doing 'fan1' in the sensors report is the cpu cooler fan. i tried it on full throttle but it wasnt worth making it work that hard. not enough appreciable difference in cooling so i leave it on auto in the motherboard.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 20 Jan 2010, 11:31:29 pm
Don't bother with windows but if you're going to do it anyway or for future reference:

CPU-Z from http://www.cpuid.com/cpuz.php Among other things it will show the vcore.
Core Temp http://www.alcpu.com/CoreTemp/ Most importantly it shows the VID of the processor.
Prime95 http://www.mersenne.org/freesoft/ For stability testing and to load the processor 100%.

As for the cooler, it's decent. Personally I don't like my fans autothrottled and don't connect them on the motherboard. I connect them directly to the PSU cables for 100% speed always.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 20 Jan 2010, 11:37:31 pm
ok ill prep a cd with those things on it.. ill probably do it anyway just to see how this prime prg loads the cpu and now i am also curious what the windows utilities say in addition to my bios.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 20 Jan 2010, 11:49:18 pm
Ok. Be sure to check how the vcore changes between 0% and 100% cpu utilization. And we'll see if lm_sensors reports the correct VID.

In prime95 you'll need to start 4 threads for your quad core. It has three tests: small FFTs, large FFTs and a mixed mode. Choose small FFTs.

What voltage did you use when you were running at 3.6 GHz?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Jan 2010, 05:08:01 am
Don't bother with windows but if you're going to do it anyway or for future reference:

CPU-Z from http://www.cpuid.com/cpuz.php Among other things it will show the vcore.
Core Temp http://www.alcpu.com/CoreTemp/ Most importantly it shows the VID of the processor.
Prime95 http://www.mersenne.org/freesoft/ For stability testing and to load the processor 100%.

As for the cooler, it's decent. Personally I don't like my fans autothrottled and don't connect them on the motherboard. I connect them directly to the PSU cables for 100% speed always.

i have found that high airflow is not always better. the air can move so fast it reduces the ability of the heatsink/air interface to transfer maximum heat to the air.. for instance if i run my rear fans at 100% the components, especially those under the heatpipe on the mobo actually run hotter. i have found that 80% is a nice starting value that usually is very close. the exceptions are gpu fan/coolers the stock coolers are efficient even at 100% fan which tells me they did not put high enough rpm fans into the cooler.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Jan 2010, 05:24:58 am
Ok. Be sure to check how the vcore changes between 0% and 100% cpu utilization. And we'll see if lm_sensors reports the correct VID.

In prime95 you'll need to start 4 threads for your quad core. It has three tests: small FFTs, large FFTs and a mixed mode. Choose small FFTs.

What voltage did you use when you were running at 3.6 GHz?

ok.


if i remember correctly (it has been almost 2 yrs since i did this) i believe i upped the VID to 1.3 v or so. i had found that slight voltage changes had a big difference in stability and nothing gained after a certain voltage. i tried 1.35v and only got more heat as a reward. i think i ran it at 1.28 and it was fine for almost an entire day then went south but at 1.3 it was good.,  i know of one person running the cpu at 4.0ghz with water cooling and i think he is running his  VID around 1.4v i dont remember exactly but i believe it was 1.38v. i do know he is pushing the limit and it is not 100% stable but it is good enough for him. he said he thinks he needs 1.43 or 1.45 but he has reached the safe limit of his cooling system. it is one of the cheaper ones. still that is quite a jump. i am not much of a believer in overclocking as it does not fit with my philosophy of 'if it aint broke dont fix it' and the mfgrs set their specs for a reason.

but since i always really wanted a 3.0 ghz machine but at the time could not afford the processor, and this can safely go to that speed i decided to go against my beliefs so that i had the best of both worlds for myself. :)


Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Jan 2010, 06:01:58 am
interesting.. didnt know freesoft had a prime95 for linux 64. got that too just to have it. could be useful.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 21 Jan 2010, 08:55:43 am
if i remember correctly (it has been almost 2 yrs since i did this) i believe i upped the VID to 1.3 v or so.
Please refer to it as vcore. VID is constant, something like a characteristic printed on the chip. 1.3 volts is very very good for 3.6Ghz. I also use 1.3 volts but for 3.24 Ghz.

but since i always really wanted a 3.0 ghz machine but at the time could not afford the processor, and this can safely go to that speed i decided to go against my beliefs so that i had the best of both worlds for myself. :)
That was also the reasoning behind my decision. I settled for 3.24Ghz. I didn't want to go beyond that since my motherboard is very basic.

interesting.. didnt know freesoft had a prime95 for linux 64. got that too just to have it. could be useful.

Yes, though I've never used it. Yes, it could be useful.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Jan 2010, 04:52:44 pm
i have discovered the following:

in0 = VCore
in1 = DDR2
in2 = 3.3v
in3 = Vcc
in8 = VBatt.

i have changed the labels accordingly in the lmsensors config. the rest are way off calibration and could not possibly be even close.
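
For anyone wanting to do the same relabeling, a rough sketch follows. It assumes a newer lm-sensors 3.x that reads extra files from /etc/sensors.d, and the chip name it8718-* is only a placeholder (the real one shows up in the output of `sensors`). The function name and the file name local-labels.conf are made up for illustration.

```shell
#!/bin/sh
# Sketch: write label overrides for the voltage inputs identified above.
# ASSUMPTIONS: the chip name "it8718-*" is a placeholder, and the default
# directory /etc/sensors.d is only read by newer lm-sensors 3.x releases.
write_sensor_labels() {
    conf_dir="${1:-/etc/sensors.d}"
    chip="${2:-it8718-*}"
    mkdir -p "$conf_dir" || return 1
    # The labels mirror the in0..in8 mapping listed in the post.
    cat > "$conf_dir/local-labels.conf" <<EOF
chip "$chip"
    label in0 "VCore"
    label in1 "DDR2"
    label in2 "3.3v"
    label in3 "Vcc"
    label in8 "VBatt"
EOF
}
```

Called as root with no arguments it writes to /etc/sensors.d. Passing a scratch directory as the first argument lets you inspect the file before installing it.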

the temp1 and temp2 that i think are mosfet and southbridge based solely on hair dryer proximity, may be those or they may be what this guy listed. unfortunately he did not have my motherboard, but a close model gigabyte.. he claims they are

temp1 current system temperature
temp2 current cpu casing temperature

i dont know for sure... gigabyte cannot help at all. the guy sounded like i was crazy. they arent too awfully important anyway as the hottest runner, temp2 has never exceeded 55c on peaks and typically runs 45-50c.

so in0 is the Vcore and its voltage matches the  cpu0_vid... which i find out appears to be a static listing. it never moves so maybe it reads something from the cpu? dunno.


Title: Re: SETI MB CUDA for Linux
Post by: sunu on 21 Jan 2010, 05:29:37 pm
The most important temperatures are the four coretemps. If these are good, the rest (almost) don't matter.
so in0 is the Vcore and its voltage matches the  cpu0_vid... which i find out appears to be a static listing. it never moves so maybe it reads something from the cpu? dunno.

That's why I said that cpu0_vid might be the real VID after all. We'll find out if/when you fire up windows and see what Core Temp says.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Jan 2010, 03:16:04 pm
VERY interesting!

the sensors core0VID is the VID as reported by the chip itself. still there is a difference between linux and windows.
linux says it is 1.219 while coretemp reports it as 1.2500

results of tests:

idle
core0 = 35
core1=35
core2=33
core3=33
cpuz reports coreV = 1.232

under load after 4 iterations of the small fft test
core0=56
core1=56
core2=51
core3=51
cpuz coreV=1.2500 initially then after a few seconds drops to 1.216

the motherboard health status states the VCore=1.252

i decided to give myself a "newer" computer to celebrate its 2 yr old birthday  :P

played around at 3.6ghz but since i have changed ram from the last time (old ram was crucial ballistix, new ram is OCZ low voltage Blade series) I find that at 3.6ghz the lowest besides 1000 i can go is 1200, which the ram plainly complains about even when upped from 1.8 to 2.0 volts, which is .1 v higher than the recommended maximum. so i had no success in stability at 3.6ghz so i dropped it to 3.45ghz along with 1152 on the ram (base is 1066) and VCore set at 1.312v. it appears to be rock stable. I also found that there is a 5c approx difference now between auto and max on the cpu fan so it is set at max and my temps are not much different than at 3.0ghz. maybe 2c higher.. basically nothing... the system passed small fft, large fft and mixed bag with flying colors. i then booted into memtest86 and ran 2 iterations of its test and all passed just fine.

so i am leaving it at 3.45ghz for now and will run all my apps, test movies, audio, everything.

interesting day :D

edit addition:
interesting ... the vid cards are running hotter. does that mean maybe my 0.35cpu settings in app_info file and the faster cpu are driving the vid card harder? the 285 is averaging maybe 2c higher than normal but the tesla is running a good 10c hotter! it is now averaging 73-77c at max fan! maybe i jarred it a bit too much in cleaning the system and caused some of the thermalpads to fall or a break in the gpu paste... it shouldnt do that but that thing has been one of those 'jiggle it till it works then dont touch it and hope it stays that way' items.... guess i need to replace it sooner than i wanted to...

20 min later...:   it appears that the problem was that with the cpu fan running full bore now (2750rpm approx instead of 2480 approx), it upset the rather delicate airflow balance i had achieved in the case. i had to readjust all fans :( now the tesla is peaking at 68c, which is approx 2c or so warmer than it used to be and more than acceptable.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 23 Jan 2010, 07:13:56 pm
the sensors core0VID is the VID as reported by the chip itself. still there is a difference between linux and windows.
linux says it is 1.219 while coretemp reports it as 1.2500
Why do you say that? We were never sure about core0vid and, as I said in another post, 1.219 as a VID was a strange number anyway. Coretemp clears things up for us: your chip's VID is 1.2500 and that core0VID from lm-sensors is something else.

under load after 4 iterations of the small fft test
core0=56
core1=56
core2=51
core3=51
cpuz coreV=1.2500 initially then after a few seconds drops to 1.216
1.216 volts under load? At 3Ghz? Might be dangerously low. Maybe not. See also below.

played around at 3.6ghz but since i have changed ram from the last time (old ram was crucial ballistix, new ram is OCZ low voltage Blade series) I find that at 3.6ghz the lowest besides 1000 i can go is 1200, which the ram plainly complains about even when upped from 1.8 to 2.0 volts, which is .1 v higher than the recommended maximum. so i had no success in stability at 3.6ghz so i dropped it to 3.45ghz along with 1152 on the ram (base is 1066) and VCore set at 1.312v.
What vcore at 3.6GHz? It might not be the memory's fault but too low a vcore. You also say "vcore set at 1.312v". Where, in bios? Don't look there. Look at what cpu-z says under 100% load. That's the one to watch. Before I pencil modded my motherboard I had to set the vcore in bios at 1.4750 in order to achieve 1.3000 under 100% load.

the system passed small fft, large fft and mixed bag with flying colors. i then booted into memtest86 and ran 2 iterations of its test and all passed just fine.
How much time did you let prime95 run? Many people consider a system stable after 12, 18 or even 24 hours of prime95 load.  For my overclock of 3.24GHz I went overboard and ran prime95 for three days: one day small FFTs, one day large and one day mixed  ;D Also for memtest, I think they recommend leaving it running for many hours to be sure that your ram is good and stable. Whenever I buy new ram the first thing I do is run memtest for 24 hours.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Jan 2010, 07:28:11 pm
according to several texts the VID is reported by the chip as its nominal operating voltage with stock settings.

i tried VCore all the way to 3.8v .. it ran fine at the lower voltage of 3.68 when i kept the ram set at 1000 with stock timings but i didnt want to bottleneck things so i decided to go down to 3.45ghz which also ran more within my 'comfort range' in cpu temps.

the voltage at 3.0ghz was the 'normal' bios setting and voltages were taken off auto. it ran like that since approximately july 2008 and never locked up.

i agree on the test times .. they should be measured in days, but i cannot be without this machine that long.. i cannot even be without it for more than 2 or 3 hours at most is why i used short tests. if it is going to act up it will do so while i am using it and it is a simple thing to drop it down in minutes if that happens. heck when i switched ram, on customer machines i usually run memtest for a minimum of 24 to 36 hrs, but on mine again due to time constraints, i have found with the way i use the machine if it is gonna mess up it will do it within 2 iterations.

i cant keep  this down long enough to do proper 'burn  in' etc... and i do not have a 2nd machine that is capable of taking its place while being burned in or repaired, so i just have to keep watch on things as i work and if necessary adjust which takes a few min max. due to ship times and the long times it takes to run into town to buy parts, i have spares of every part in the closet so i can do a fast replacement if need be. only spare i dont have is the tesla. i know i could replace it with the spare 285 but that just doesnt sit right with me. a spare is a spare and if i used it i would have to buy yet another one for a spare.

after i build up a newer workstation for myself, i will keep spares of that one, and will use the old spares in another machine to clone the one i have.

then i will either keep them both or sell off one of them. prob. will keep them both since this one proves to be a decent rac builder compared to the other machines..


Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Jan 2010, 07:31:13 pm
forgot.. at 3.45ghz, cpuz reports CoreV to be 1.296. the number i gave in the msg above was the number set in the bios and it also agreed in the bios voltage reading in the 'cpu health' section.

the vcore reading in gkrellm using in0, is 1.31



Title: Re: SETI MB CUDA for Linux
Post by: sunu on 23 Jan 2010, 07:45:40 pm
Very nice "policy" about the spares, but very costly...
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Jan 2010, 08:32:56 pm
thankfully the boss pays for 50% of the cost of spares and he would have a fit if i told him i used a spare as a 2nd card. he has spares for everything in the office and the NOC.. not for each server but since the servers are clones he has one of each  part used in a single computer. doesnt need any more than that really.

i love the spare policy but if i had to foot the bill totally myself i doubt i would have spares of everything, probably just ram and hard drives and fans which are all the most common parts to go.

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Jan 2010, 08:36:41 pm
wish he counted cases as 'spares' as well.. hehe then i wouldnt have to buy anything to build the extra computer once i replace this one.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Jan 2010, 01:10:24 pm
what do you run on your cpus? multibeam or astropulse? do you find astropulse gives higher credit than the equivalent time with MB would?

Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 24 Jan 2010, 02:13:28 pm
what do you run on your cpus? multibeam or astropulse? do you find astropulse gives higher credit than the equivalent time with MB would?


There is no AP work now at all, so question highly theoretical one  :P
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Jan 2010, 02:56:24 pm
ahh .. ok.. guess i will wait to wonder until they bring it back if they do.

thanks

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 24 Jan 2010, 05:59:14 pm
what do you run on your cpus? multibeam or astropulse? do you find astropulse gives higher credit than the equivalent time with MB would?

Well, what raistmer said.
But when astropulse comes back, I don't think there is currently a credit advantage over multibeam. Moreover our Q6600 is not strong enough to support, for example, 4 astropulse workunits plus 1, 2 or more cuda workunits. In that case we would lose RAC.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Jan 2010, 06:15:57 pm
ahh ok... ill stick with what i have then.

wish they would make a standard compute appliance like they have for java... azul systems makes a compute appliance that can be addressed by your computer and gives java 864 processor cores and 768gb ram to work with! their java only processors  (vega3 chips) now have 54 cores each!

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Jan 2010, 03:53:50 pm
is there a  way to tell boinc not to use a specific device for cuda? say you have 4 cuda capable processors, but you want device 0 to be left alone and use only as your video card but to use devices 1,2,3 for cuda only. is there a way to do this?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Jan 2010, 03:57:46 pm
does boinc see a gtx295 as one physical device or 2? i guess i am asking if it counts processors as devices or the entire unit based on a slot.
Title: Re: SETI MB CUDA for Linux
Post by: b0b3r on 31 Jan 2010, 04:06:45 pm
It counts GPU so will see 2.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 31 Jan 2010, 07:20:59 pm
Yes, seti sees the cores of a 295 separately. In order to not use a core or more for crunching you can use

<ignore_cuda_dev>N</ignore_cuda_dev>

in your cc_config.xml. It needs boinc 6.10.19 or later.
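
To show where that line goes, here is a minimal sketch that writes a complete cc_config.xml with the option inside the usual <options> section. The function name is made up, the BOINC data directory is whatever your install uses, and the sketch overwrites an existing cc_config.xml, so back yours up first or just add the line by hand.

```shell
#!/bin/sh
# Sketch: write a cc_config.xml that tells BOINC to skip one CUDA device.
# ASSUMPTION: the caller passes the BOINC data directory; any existing
# cc_config.xml in that directory is overwritten.
write_ignore_cuda_dev() {
    data_dir="$1"
    dev="$2"
    [ -d "$data_dir" ] || return 1
    cat > "$data_dir/cc_config.xml" <<EOF
<cc_config>
  <options>
    <ignore_cuda_dev>$dev</ignore_cuda_dev>
  </options>
</cc_config>
EOF
}
```

For the "leave device 0 for video" case that would be something like `write_ignore_cuda_dev ~/BOINC 0`, followed by a client restart. The device numbers should be the ones BOINC prints in its startup messages.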
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Feb 2010, 10:44:32 am
thanks guys! that makes life considerably easier :)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Feb 2010, 11:01:20 pm
whoah gonna have to read up on this 6.10.30 version... i set it for my normal 10 days to allow it to fill the cache before i set it back to 5 days and before i realized it it filled me with more than 3000 workunits.. 2200 of them on gpu! thankfully 1500 of them are vhars so they will go very quickly.. to help i took them off the cpu and put them on the gpus to get them outta here quickly and get the count down to something a bit more sensible. unless it is smart enough now to realize the vhars take so little time it loaded me up with things... at the rate they are going i dont see a log jam really, it should be ok but just the initial number alarmed me.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 03 Feb 2010, 05:44:54 pm
it it filled me with more than 3000 workunits..

My cache hovers around 6500 workunits  ;D
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Feb 2010, 09:17:33 am
cant get a handle on the methods of scoring...

a few days ago i made a few hardware changes concerning the cpu cooler and case air flow and then upped the processor to 3.51ghz which it appears quite happy at. 3 hrs of prime95 small fft and option 3 torture test (blend) passed every test and the prime95 benchmarks show a definite improvement over 3.3ghz. i upped the cpu VCore to 1.35v by the bios but sensors report it closer to 1.34. appears quite stable. time will tell.

anyway, my times per workunit completion are noticeably shorter, yet my scores for this machine have dropped from 18.6k to 17.5k and my pending credit jumped from 89k to 105k! is this an indication boinc isnt happy with the changes to this system?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Feb 2010, 09:20:03 am
i also noticed i have some pending credits still sitting there from last november!
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Feb 2010, 09:22:34 am
anyway, my times per workunit completion are noticeably shorter, yet my scores for this machine have dropped from 18.6k to 17.5k and my pending credit jumped from 89k to 105k! is this an indication boinc isnt happy with the changes to this system?
No, it means that your pending credit increased  ;D

Just monitor your invalid tasks page in your account for the next few days. If it stays blank then all is ok.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Feb 2010, 12:23:23 pm
so far so good. it has been running since saturday night and so far there is one invalid from the 14th. a cuda out of memory error... i suspect since it was dated yesterday morning it was when i ran glxgears while everything was running with more open than i usually have just to see how much of a load was present by the drop in fps. will keep an eye out..

hehe a challenge to get used to this varying fan noise. its quite loud.. the cpu fan is a delta 150cfm 4000rpm fan :P  .... i suspect if i cover up the 25cm fan hole in the case since the fan seems to be useless with this current airflow design i did it will muffle the fan quite a bit.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Feb 2010, 03:45:13 pm
I've never had a Delta but with 2 ultra kazes 3000 and 1 2000 plus a multitude of other fans, my case is very noisy.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Feb 2010, 06:35:48 pm
i love delta (1st pc corp). they cut no corners including noise to give you air flow and consistent high air pressure... i would rank them as the best around. their only fault is all of them are noisy. which i personally could care less. if a person wants a supercomputer at their desk they need to be prepared for a mini jet engine sitting next to them. they make a 5500rpm 253cfm fan that outputs so much air pressure you literally have to squint to keep your eyes from drying out if you aim the fan at you from arms length away! i have one but have not installed it yet. not positive i need it but i got it in case summer temps cause my comp to run a bit hot. i will probably install it and just keep it turned down so i have the 'headroom' if i need it.

to give you an idea, here are the specs for that monster

    

Delta PFB1212UHE-F00
120 x 120 x 38mm
Airflow: 252.85 CFM, 7.16 m3/min
Fan Speed: 5500 RPM
Noise: 66.5 dBa
Rated Voltage: 12 VDC
Operating Voltage: 8 to 13.2V DC
Rated Current: 4.80 amp
Rated Input Power: 48W
Air Pressure: 35.877 mmH2O, 1.41" H2O
3 bare wire or 4-pin + 3-pin optional

in case you want to see what they offer:

http://1stpccorp.com/fan_all.html
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Feb 2010, 06:48:29 pm
i have 2 invalids from the 14th, both completed within minutes of each other, both with the same cuda out of memory error so i suspect that also was a victim of my glxgears experiment. nothing else so far.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Feb 2010, 08:21:02 pm
I still can't understand why your out of memory tasks don't fall back to cpu computing.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 16 Feb 2010, 05:27:18 am
maybe they can't find room to work? i run an average of 6.0 average load with 420 average number of tasks.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 16 Feb 2010, 11:23:41 am
ugh pending is up to 109k!  why does it go so high? is it that they question the results and are waiting verification with someone else? or is it just that i am pushing out higher volume than they expect from me so they suspect most of it?


my machine's average credit dropped from 18.6k down to 16.9k :(

im thinking maybe it was better at 3.3ghz?

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 16 Feb 2010, 12:11:26 pm
No, hang in there. My pc lost 4000 RAC since yesterday. It's because of Berkeley's connection problems that started yesterday.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 16 Feb 2010, 12:23:22 pm
ahh ok... i hate it when things happen outside of my control :P especially when they happen right when i am wanting to compare results.  ok ill let it sit. machine is happy so i can wait.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 17 Feb 2010, 06:31:41 am
ugh. looks like its gonna be days before things get back to normal. a/c failure in the server closet.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Feb 2010, 04:40:38 pm
hehe major changes... their website finally came up but its different. guess they dont have the accounts section done yet either. oh well.. at least boinc works :)

Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 24 Feb 2010, 04:49:30 pm
Doesn't work very well in IE, Firefox is OK. I'm wondering if it's supposed to be an internal site.

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 24 Feb 2010, 05:39:12 pm
It's the old-ish "All SETI at Berkeley" site - much more general than just the '@home' sub-project. Check the links for SERENDIP, SEVENDIP, CASPER etc. across the top. The weekly blogs are often fun too.

I guess they hooked it up to look a bit more interesting than yesterday's placeholder - shame about IE, though. (And actually, the placeholder was more informative for boinc@home users)

I hope the change doesn't mean the main SETI@home links will be offline for an extended period - that Googlehack (http://boinc.berkeley.edu/dev/forum_thread.php?id=5486) may have been more serious than Eric thought at first.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Feb 2010, 06:36:31 pm
yeah i can see that. an old version of the site is better than no version. i just hope they can restore it all. i can just see us losing our accounts.. ugh..

Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 24 Feb 2010, 06:46:54 pm
It's already up.

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Feb 2010, 07:01:27 pm
cool thanks! ill go check things out
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Feb 2010, 12:58:47 pm
i am seeing quite a lot of these errors today. may be the tesla

Sun 28 Feb 2010 11:04:58 AM EST   SETI@home   [error] 09mr07ad.12026.32399.16.10.74_1: negative FLOPs left -966550029782.363525
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 28 Feb 2010, 07:13:28 pm
I haven't seen anything similar before. Is it consistent, or have you seen it a few times and that's it?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Feb 2010, 10:42:15 pm
it was quite often during that particular time boinc ran. it is the only time i have ever seen such an error.. i shut the machine down once i noticed that and allowed a 15min cooldown period and then restarted it and everything has been fine since. i think next month i am going to replace the tesla. it appears to be acting up more and more and in strange ways. at least i am assuming it is the tesla. i will have to research this unit below to tell which device it came from.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Feb 2010, 10:55:43 pm
i could not find that workunit  from the log, its just too daunting a task since they do not correlate the id with the workunit easily... however of all the errors i looked at every single one of them was from the tesla.
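
One way to make that search less daunting is to pull the task names out of the messages first. The sketch below assumes the BOINC messages have been saved to a plain text file and that the error lines look like the one quoted earlier; the function name is made up.

```shell
#!/bin/sh
# Sketch: read BOINC messages on stdin and print each task name that was
# reported with "negative FLOPs left", once per task.
# ASSUMPTION: lines have the form "... [error] <task name>: negative FLOPs left ..."
negative_flops_tasks() {
    sed -n 's/.*\[error\] \([^ :]*\): negative FLOPs left.*/\1/p' | sort -u
}
```

Usage would be along the lines of `negative_flops_tasks < messages.txt` (messages.txt being a hypothetical file name). The names it prints can then be compared against the task names on the host's tasks page.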
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 01 Mar 2010, 08:04:55 am
Can you post here some links to these errors? Maybe they were badly downloaded workunits?

26 March is the release day for the new Fermi GPUs from NVIDIA. You can get one of these.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Mar 2010, 11:20:31 am
that error i showed you came from my boinc log

the tasks link for my machine is

http://setiathome.berkeley.edu/results.php?hostid=4166601

the error has not repeated since.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 01 Mar 2010, 02:53:51 pm
As I said earlier it might be a badly downloaded workunit. In your error tasks pages there are several with download errors.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Mar 2010, 07:19:34 am
i believe you are right. i am seeing that error again today and it is during the download procedure.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Mar 2010, 12:42:09 pm
6.10.30 must be broken. i have the cache limit set to 1 day since i already had 3000+ in it, and it continued to download to 6600!  that is a heck of a lot more than 1 day. guess ill have to toggle wont get new tasks and manually hit the update a few times a day.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 04 Mar 2010, 04:03:52 pm
Why don't you try a different version of boinc? I've been running 6.10.17 with no problems.

6600 workunits are way too many.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Mar 2010, 09:05:30 pm
yeah i was thinking i would change it tomorrow. presently its set to not take new work so its safe for tonight
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 04 Mar 2010, 09:26:21 pm
i realized i still had 6.10.17 on the system in my pkgs install dir, so i did it now.
time will tell.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 09 Mar 2010, 06:29:03 am
hmm i seem to be getting some invalids almost daily ... considering the number of work units i got swamped with do you think this points to a problem in my system? or is it just the law of averages? it appears to be performing flawlessly without any pauses or lockups or any other odd behavior and temps are well in safe zones. it also passed 3 hrs of prime95 mix torture, and 1 hr of small fft and 1 hr of large fft. i cannot run it longer due to my work obligations.

http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=0&show_names=0&state=4

Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 09 Mar 2010, 06:53:10 am

http://setiathome.berkeley.edu/results.php?hostid=4166601&offset=0&show_names=0&state=4


Those are all "Validate error" - a server error we've been discussing at length on the SETI (and SETI Beta) Number Crunching message boards.

Nothing to do with your card/computer at all.

Edit - Thanks for the link, BTW. Now we know 'riofl' is Chuck Gorish (http://setiathome.berkeley.edu/show_user.php?userid=68751), it'll be easier to track down reports of misbehaving computers/tasks in the future...  ;D
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 09 Mar 2010, 04:04:27 pm
LOL ok no prob.. whew... was getting concerned for a bit that my upgrades were not working as well as i expected they should. overclocking usually hides several gremlins but so far none have shown up.

Title: Re: SETI MB CUDA for Linux
Post by: Leopoldo on 12 Mar 2010, 02:56:29 pm
hmm i seem to be getting some invalids almost daily ... considering the number of work units i got swamped with do you think this points to a problem in my system?

Hi! Did you try the r12 version?

Before using the 64-bit Linux optimized apps I did standalone testing (I have hated polluting the S@H science database with wrong results ever since I used Windows). Under Ubuntu, both the 32-bit and 64-bit versions of the Crunch3r-compiled CUDA apps return Gaussian-search errors. Sunu recommended that I change the OS from 32-bit to 64-bit. By googling I found Crunch3r's recommendations against Ubuntu and in favor of at least openSUSE.

Ok, 64-bit. The Crunch3r-compiled CUDA app version "r06" (http://calbe.dw70.de/cuda/rel/setiathome-CUDA-6.08.x86_64.tar.bz2) returns a POT error under 64-bit openSUSE, while the newer "r12" version (http://calbe.dw70.de/cuda/rel/setiathome-CUDA_2.2_6.08.x86_64_vlarkill.tar.bz2), mentioned on the Crunch3r forum, doesn't. Furthermore, after integrating "r12" into app_info, it works flawlessly and validates successfully against both stock apps, 6.03 and 6.08.

My system: openSUSE 11.2 64-bit with the common openSUSE-specific trick of "ln -s /usr/lib64/libcuda.so.1 /usr/lib64/libcuda.so" to work around a BOINC library search bug (without it BOINC sees no nVidia cards), latest 64-bit BOINC 6.10.36, 3 CPU cores assigned (the 4th is free of AK crunching, reserved by me for Linux itself and partially for feeding the GPU)...
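
A slightly more careful version of that symlink trick could look like the sketch below. It only creates the link when libcuda.so is missing and libcuda.so.1 is really there. The function name is made up, and the default /usr/lib64 is simply the path from the post; other distros may keep the driver library elsewhere.

```shell
#!/bin/sh
# Sketch: create the libcuda.so -> libcuda.so.1 symlink only if needed.
# ASSUMPTION: the library directory defaults to /usr/lib64 as in the post.
ensure_libcuda_link() {
    lib_dir="${1:-/usr/lib64}"
    if [ -e "$lib_dir/libcuda.so" ]; then
        echo "libcuda.so already present in $lib_dir"
        return 0
    fi
    if [ ! -e "$lib_dir/libcuda.so.1" ]; then
        echo "libcuda.so.1 not found in $lib_dir" >&2
        return 1
    fi
    ln -s "$lib_dir/libcuda.so.1" "$lib_dir/libcuda.so"
}
```

Run as root, `ensure_libcuda_link` with no argument does the same thing as the one-liner, but it can be repeated safely after a driver update.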

My MB-only app_info (both the CUDA and AK files are specified for the on-the-fly [Raistmer-made] 6.03<-->6.08 rebranding script; flops corrected for my system [as Richard and others suggested on the S@H NC forum a long time ago] to keep the S@H DCF as close to 1 as possible):

<app_info>
  <app>
    <name>setiathome_enhanced</name>
    <user_friendly_name>SETI@Home Enh.</user_friendly_name>
  </app>
  <file_info>
    <name>AK_V8_linux64_ssse3</name>
    <executable/>
  </file_info>
  <file_info>
    <name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>608</version_num>
    <flops>105467634946</flops>
    <plan_class>cuda</plan_class>
    <avg_ncpus>0.150000</avg_ncpus>
    <platform>x86_64-pc-linux-gnu</platform>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>AK_V8_linux64_ssse3</file_name>
    </file_ref>
  </app_version>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <flops>22268493150</flops>
    <platform>x86_64-pc-linux-gnu</platform>
    <file_ref>
      <file_name>AK_V8_linux64_ssse3</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name>
    </file_ref>
  </app_version>
</app_info>
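
Since a typo in app_info.xml is easy to make, a quick consistency check can help before restarting BOINC. The sketch below verifies that every <file_name> used in a <file_ref> was also declared in a <file_info> block. It is a line-based check that assumes one element per line, as in the listing above; it is not a real XML parser, and the function name is made up.

```shell
#!/bin/sh
# Sketch: report file_ref names that have no matching file_info entry.
# ASSUMPTION: one XML element per line, as in the app_info listing above.
check_app_info() {
    awk '
        /<file_info>/   { in_info = 1 }
        /<\/file_info>/ { in_info = 0 }
        in_info && /<name>/ {
            s = $0
            sub(/.*<name>/, "", s); sub(/<\/name>.*/, "", s)
            declared[s] = 1
        }
        /<file_name>/ {
            s = $0
            sub(/.*<file_name>/, "", s); sub(/<\/file_name>.*/, "", s)
            used[s] = 1
        }
        END {
            bad = 0
            for (f in used) if (!(f in declared)) { print "undeclared: " f; bad = 1 }
            exit bad
        }
    ' "$1"
}
```

`check_app_info app_info.xml` prints nothing and returns 0 when all references are declared; otherwise it lists the offending names and returns non-zero.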



And excuse me for my bad English, please...

________________
WBW, Leopoldo
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 13 Mar 2010, 03:05:34 pm
the invalids turned out not to be my system. i have used the vlarkiller version before and went back to the other because it would kill off too many work units before i had a chance to monitor the computer and run the rebranding script to remove them from gpu processing, so i went back to the non vlar killer app and it is working flawlessly now.. if it happens to get a vlar during times i am out of the office it doesnt matter it processes it anyway so there are none rejected due to bad assigning them to the gpu.

i have been running 64bit  for a few years now, and it is definitely superior :)
i prefer the gentoo or funtoo distributions. they work best for me, faster and more stable. others feel ubuntu or mandrake or redhat clones are the way to go. personally if i didnt have my specific requirements, any distro would do well.



Title: Re: SETI MB CUDA for Linux
Post by: riofl on 26 Mar 2010, 11:44:20 am
Anyone seen this behavior and maybe have a fix? I have no idea what causes this.

In boincmgr, any of the tabs i select such as tasks, transfers or messages, seem to pause in their text updates until i move the mouse over them then they update. There are no errors generated and it all works fine in the back end its just this annoying behavior. if i want to watch the progress column i have to move the mouse every few seconds or the numbers do not update. At some point they do update i suspect but it seems to be quite a while and i have never seen it do so while the focus was on the boincmgr window. i also tried moving the manager window to the other screen and making a different window the focus and it still does the same. it has to have the mouse moving over its window even unfocused to get the text to update.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 26 Mar 2010, 11:54:14 am
riofl, it's a long-standing bug. I've had it for years  :(
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 26 Mar 2010, 01:48:49 pm
ahh ok... ive seen it for years but it only has started irritating me recently... ok ill be gentle with it :)

Title: Re: SETI MB CUDA for Linux
Post by: Pepi on 27 Mar 2010, 08:44:29 pm
Please, how do I solve this problem?

boinc@pepi-desktop:~/BOINC/projects/setiathome.berkeley.edu$ ldd setiathome
   linux-vdso.so.1 =>  (0x00007fffe9dff000)
   libcuda.so.1 => /usr/lib/libcuda.so.1 (0x00007ff5e168b000)
   libcudart.so.2 => not found
   libcufft.so.2 => not found
   libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007ff5e137e000)
   libm.so.6 => /lib/libm.so.6 (0x00007ff5e10f9000)
   libpthread.so.0 => /lib/libpthread.so.0 (0x00007ff5e0edd000)
   libc.so.6 => /lib/libc.so.6 (0x00007ff5e0b6b000)
   libz.so.1 => /usr/lib/libz.so.1 (0x00007ff5e0953000)
   libdl.so.2 => /lib/libdl.so.2 (0x00007ff5e074f000)
   /lib64/ld-linux-x86-64.so.2 (0x00007ff5e1b27000)
   libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007ff5e0537000)


As you can see, all the other .so files are fine, but it cannot find libcudart.so.2 and libcufft.so.2.
I am sure that those files are in the BOINC project directory. Where else must those files be?
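
With longer ldd listings, the unresolved libraries can be filtered out directly. A minimal sketch (the function name is made up) that reads ldd output on stdin and prints only the names marked "not found":

```shell
#!/bin/sh
# Sketch: print the names of libraries that ldd reports as "not found".
# Input is ldd output on stdin; the first field of each line is the name.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Typical use would be `ldd ./setiathome | missing_libs` from the project directory. An empty result means the loader can resolve everything.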

Also: what should I put in this conf file?
3)  Edit accordingly your ld.so.conf or the corresponding ld-something file of your distro with the above location of the cuda libs.

Right now my Linux machine crunches, but it is slow.

Thanks  for any help
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 27 Mar 2010, 09:37:48 pm
In directory /etc/ld.so.conf.d create a file named cudalibs.conf and inside it put the path to your cuda libraries, for example:

/home/pepi/BOINC/projects/setiathome.berkeley.edu

Then run ldconfig as root and you should be ok.
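
Put together, the steps above can be sketched as follows. The function name is made up, and the second argument exists only so the sketch can be tried against a scratch directory instead of /etc/ld.so.conf.d.

```shell
#!/bin/sh
# Sketch: register a directory of CUDA libraries with the dynamic loader.
# ASSUMPTIONS: must run as root for the real /etc/ld.so.conf.d, and the
# cache still has to be rebuilt with ldconfig afterwards.
write_cuda_ld_conf() {
    lib_dir="$1"
    conf_dir="${2:-/etc/ld.so.conf.d}"
    [ -d "$lib_dir" ] || { echo "no such directory: $lib_dir" >&2; return 1; }
    # One path per line is all the file needs.
    printf '%s\n' "$lib_dir" > "$conf_dir/cudalibs.conf"
}
```

As root that would be `write_cuda_ld_conf /home/pepi/BOINC/projects/setiathome.berkeley.edu && ldconfig`. Afterwards `ldconfig -p | grep libcudart` should list the library if the cache picked it up.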
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Mar 2010, 06:11:05 pm
i realize this isnt the place for this but i thought one of you guys may know the answer and save me finding a forum somewhere to ask..  maybe i need to call a psu mfgr?

if i find i need to run a dedicated power supply just for the GPU cards, how does startup/shutdown work? what gets powered on first? mobo then gpus? then mobo off then gpus? or should i have them on a common switch to turn both on / off at once? can the 2 motherboard/psu 'sense' contacts that automate the power on/off be paralleled for both psus to hook to the same 2 contacts?



Title: Re: SETI MB CUDA for Linux
Post by: sunu on 28 Mar 2010, 06:51:38 pm
You can use one of these http://www.performance-pcs.com/catalog/index.php?main_page=product_info&cPath=34_186&products_id=21193

If you need it only for the gpu you can use also a "dedicated" PSU like http://thermaltakeusa.com/Product.aspx?C=1156&ID=1544 . It comes with the above cable extension built in. Review of this PSU in http://www.jonnyguru.com/modules.php?name=NDReviews&op=Story&reid=122 .
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Mar 2010, 11:08:30 pm
cool... thanks! i wasnt sure it was possible to tie the 2 sense connectors together but obviously it is. simplifies things a lot!

will look thru these links..
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 30 Mar 2010, 10:22:17 pm
is there a cuda 3.0 seti app available yet? i installed 3.0 toolkit but the existing app still uses 2.2

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 31 Mar 2010, 05:16:03 am
No.
Title: Re: SETI MB CUDA for Linux
Post by: Pepi on 04 Apr 2010, 03:51:49 pm
is there a cuda 3.0 seti app available yet? i installed 3.0 toolkit but the existing app still uses 2.2



YES LINUX x64 CUDA 3.0

http://calbe.dw70.de/mb/viewtopic.php?f=9&t=120
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 06 Apr 2010, 01:10:10 pm
is there a cuda 3.0 seti app available yet? i installed 3.0 toolkit but the existing app still uses 2.2



YES LINUX x64 CUDA 3.0

http://calbe.dw70.de/mb/viewtopic.php?f=9&t=120

thank you! i notice they still discard vlars rather than move them over to cpu responsibility... i guess that is beyond the scope of a seti app though.

Title: Re: SETI MB CUDA for Linux
Post by: Pepi on 06 Apr 2010, 08:25:05 pm
The Reschedule tool for Windows works great: why not make one for Linux too? That would be a great help!
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 06 Apr 2010, 11:23:09 pm
there is a rescheduling tool for linux. in the form of a script.. the earlier version of the windows one before it became an exe file. works well although all it does is assign all vlars and vhars to the cpu and, depending on how the conditional is set, everything else to the gpus.. also there is a script which is in one of the seti forums that acts as a workunit counting/reporting tool.. kind of a companion to the rescheduling script. the script is in the first few pages of the beginnings of the windows rescheduler forum. it is a perl script. easy to customize. for me i put all vlars on the cpus and everything else including vhar on the gpus.
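
For illustration only, the decision rule of the customized setup described above (vlars to the cpu, everything else including vhar to the gpu) can be sketched in a few lines of shell. This is not the perl script itself: the function name is made up, and the 0.05 angle-range cutoff for "VLAR" is an assumed default that can be overridden.

```shell
#!/bin/sh
# pick_device ANGLE_RANGE [VLAR_LIMIT]
# Prints "cpu" for a task whose angle range is below VLAR_LIMIT
# (assumed default: 0.05) and "gpu" for everything else.
pick_device() {
    awk -v ar="$1" -v vlar="${2:-0.05}" 'BEGIN {
        # "+ 0" forces a numeric comparison.
        if (ar + 0 < vlar + 0) print "cpu"; else print "gpu"
    }'
}

# Example:
#   pick_device 0.012
#   pick_device 2.7
```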

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 09 Apr 2010, 05:48:16 am
question on proper procedure....

i am still dealing with that huge dump of 6600 workunits i received a while back. it is down now to only VLARS and i can easily see that many of them will not get completed on time (its a shame gpus cant process them properly).

is the proper procedure to "abort" many of those tasks and report them immediately so they can be reassigned  leaving myself enough close dates that i know can be completed properly and then just get more units to make up my cache?

changing back to boinc 6.10.17 cured that wu get insanity, btw.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 09 Apr 2010, 06:45:43 am
Yes, abort them. There is really no other way.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 09 Apr 2010, 10:57:03 am
done.. aborted 973 of them all with dates between the 13th and 19th. until im sure my cache level is gonna be ok im keeping it at 1 day.. vlars are now down to 164 with most dates in may so that is not a problem :)
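
A rough sketch of how picking tasks by deadline window could be scripted instead of clicking through the manager. It assumes a made-up input listing with one "TASK_NAME YYYY-MM-DD" pair per line (not the real boinccmd output format) and only prints the boinccmd abort commands so they can be reviewed first. The function names are hypothetical.

```shell
#!/bin/sh
# tasks_due_between FROM TO
# Reads "TASK_NAME YYYY-MM-DD" lines on stdin (hypothetical format) and
# prints the names whose deadline lies in FROM..TO inclusive.
# ISO dates sort correctly as plain strings, so awk compares them as text.
tasks_due_between() {
    awk -v from="$1" -v to="$2" '$2 >= from && $2 <= to { print $1 }'
}

# print_abort_commands PROJECT_URL
# Turns task names on stdin into boinccmd invocations. Nothing is executed;
# pipe the output to sh only after checking it.
print_abort_commands() {
    while IFS= read -r name; do
        printf 'boinccmd --task %s %s abort\n' "$1" "$name"
    done
}

# Example:
#   tasks_due_between 2010-04-13 2010-04-19 < deadlines.txt |
#       print_abort_commands http://setiathome.berkeley.edu/
```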

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 11 Apr 2010, 09:42:56 am
does boinc/seti-app support ATI Radeon cards yet? if so anyone have any performance results compared to nvidia? i have been looking at reviews and comparisons but outside of gaming graphs very little is to the point for this specific application.

what is disappointing is the performance of the fermi cards compared to a gtx285 outside of gaming. the TDP is horrendously high for the performance points you appear to get.. until i see one actually in use in this application to compare average time differences i still tend to lean to the gtx285. the gtx470 isnt bad for TDP and is mildly higher than the 285 but the 480 appears to be a power hog for what it appears to give back.

the radeon cards appear to outperform nvidia in gaming and TDP  but how do they compare when processing workunits?

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 11 Apr 2010, 05:27:37 pm
ok this is driving me insane :)

does anyone know where the boinc_client gui stores its list of computers in its pulldown? i have a mess of old ones in there and cannot find where it keeps them. i did a search in the /var/lib/boinc dir for the names of a few of the computers.. nothing... searched /etc, tried a locate for boinc and nothing shows up.

Title: Re: SETI MB CUDA for Linux
Post by: Urs Echternacht on 11 Apr 2010, 07:00:11 pm
ok this is driving me insane :)

does anyone know where the boinc_client gui stores its list of computers in its pulldown? i have a mess of old ones in there and cannot find where it keeps them. i did a search in the /var/lib/boinc dir for the names of a few of the computers.. nothing... searched /etc, tried a locate for boinc and nothing shows up.
On linux :
Look for ".BOINC Manager" and in there the section starting with "[Compter MRU]" should contain what you are searching.
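
A small sketch for dumping one section of that file without opening an editor. It assumes the file is a plain INI-style text file with [Section] headers and that it sits in your home directory; the function name is made up, and the section name has to be passed exactly as it appears in the file.

```shell
#!/bin/sh
# print_ini_section FILE SECTION
# Prints the non-empty lines that belong to [SECTION] in an INI-style file.
print_ini_section() {
    awk -v want="[$2]" '
        /^\[/            { in_section = ($0 == want); next }  # any header switches state
        in_section && NF { print }                            # skip blank lines
    ' "$1"
}

# Example (pass the header text exactly as the file spells it):
#   print_ini_section "$HOME/.BOINC Manager" "SECTION NAME"
```

Once the stale entries are visible, they can be removed from the file with any text editor while the manager is closed.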
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 12 Apr 2010, 05:46:59 am
does boinc/seti-app support ATI Radeon cards yet? if so anyone have any performance results compared to nvidia? i have been looking at reviews and comparisons but outside of gaming graphs very little is to the point for this specific application.

what is disappointing is the performance of the fermi cards compared to a gtx285 outside of gaming. the TDP is horrendously high for the performance points you appear to get.. until i see one actually in use in this application to compare average time differences i still tend to lean to the gtx285. the gtx470 isnt bad for TDP and is mildly higher than the 285 but the 480 appears to be a power hog for what it appears to give back.

the radeon cards appear to outperform nvidia in gaming and TDP  but how do they compare when processing workunits?



Yes fermi seems a flop (for what it asks anyway). Unless it is proven otherwise, better to wait for fermi2. Till then a gtx295 still remains king for seti I think.

The problem with ati is inferior drivers (at least for linux) and inferior features and development tools for GPU computing.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 12 Apr 2010, 06:10:25 am
ahh...  ok.. it seems newegg stopped carrying the 295. at least it doesnt show in the listings. i will keep trying them and looking.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 12 Apr 2010, 06:11:21 am
ok this is driving me insane :)

does anyone know where the boinc_client gui stores its list of computers in its pulldown? i have a mess of old ones in there and cannot find where it keeps them. i did a search in the /var/lib/boinc dir for the names of a few of the computers.. nothing... searched /etc, tried a locate for boinc and nothing shows up.
On linux :
Look for ".BOINC Manager" and in there the section starting with "[Compter MRU]" should contain what you are searching.


that did it! thanks!
 i would never have thought to look for that combination.
Title: Re: SETI MB CUDA for Linux
Post by: ScitechGrid on 15 Apr 2010, 01:53:49 pm
Seems like the new Fermi GPUs are a big disappointment, maybe they will release faster cards later
Title: Re: SETI MB CUDA for Linux
Post by: Edboard on 15 Apr 2010, 02:22:55 pm
I wanted to buy one, but as I'm a MilkyWay@home cruncher too, I'm very disappointed.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 16 Apr 2010, 08:11:27 am
i can only speak from impressions from lots of reading about them... unless you are an avid gamer or a graphics artist they really don't have any real improvements especially for number crunchers. in fact my opinion so far is they are power hogs and why should i replace a gt200 series for the 400 series to get basically the same performance i get now while chewing up more watts?

maybe the version 2 will be better.. for now i am sticking with gt200 based cards
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 07 May 2010, 02:06:35 pm
just fyi my line will be down starting tomorrow for the next 10 days or so. i am switching providers and the new provider will not install as long as an existing t1 is installed by another carrier.
stupid.....

ill be taking a forced vacation next week due to this.

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 10 Jun 2010, 06:22:57 am
is anyone else getting these messages all of a sudden, or did something go wrong in my setup? i checked my app_info and all is as it has been..


Thu 10 Jun 2010 05:44:04 AM EDT   SETI@home   Message from server: No work sent
Thu 10 Jun 2010 05:44:04 AM EDT   SETI@home   Message from server: Your app_info.xml file doesn't have a usable version of SETI@home Enhanced.
Thu 10 Jun 2010 05:44:04 AM EDT   SETI@home   Message from server: (reached daily quota of 100 tasks)
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 10 Jun 2010, 07:16:21 am
is anyone else getting these messages all of a sudden, or did something go wrong in my setup? i checked my app_info and all is as it has been..

Thu 10 Jun 2010 05:44:04 AM EDT   SETI@home   Message from server: No work sent
Thu 10 Jun 2010 05:44:04 AM EDT   SETI@home   Message from server: Your app_info.xml file doesn't have a usable version of SETI@home Enhanced.
Thu 10 Jun 2010 05:44:04 AM EDT   SETI@home   Message from server: (reached daily quota of 100 tasks)


Yes, all over the main project message board since they fiddled with the scheduler to accommodate Fermi.

What OS are you getting this with, and what (if any) <platform> tag do you have in app_info.xml? I'm not getting the "doesn't have a useable version" message (and I am getting work) - but then, I just use bog-standard 32-bit Windows here.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 10 Jun 2010, 08:42:07 am
gentoo linux x86_64 (amd64 although i run intel hardware)

running boinc 6.10.17 because the last 'new' one i tried loaded me up with 6000+ workunits before i realized what was happening :)

easiest thing is just to post my entire app_info. it is short. it has worked well for a long time until this recent change they did.


<app_info>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>AK_V8_linux64_ssse3</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>603</version_num>
<file_ref>
<file_name>AK_V8_linux64_ssse3</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>setiathome-CUDA_3.0_6.09.x86_64</name>
<executable/>
</file_info>
<file_info>
<name>libcudart.so.3</name>
<executable/>
</file_info>
<file_info>
<name>libcufft.so.3</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<plan_class>cuda</plan_class>
<avg_ncpus>0.350000</avg_ncpus>
<max_ncpus>0.350000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>setiathome-CUDA_3.0_6.09.x86_64</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>libcudart.so.3</file_name>
</file_ref>
<file_ref>
<file_name>libcufft.so.3</file_name>
</file_ref>
</app_version>
</app_info>
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 10 Jun 2010, 08:54:30 am
interesting.... i just noticed in the cuda app it has a _6.09 in it... i assume it is a 6.09 version compared to the 6.08 i have previously run however my version is still 608... so on a whim i changed the version to 609 and now i am not seeing those red letter messages...

will give it a bit to see if it really is gone. odd...

crap.. i just looked.. i now have no cuda jobs at all damn boinc deleted them!! i really wish it would not attempt to think for itself it only messes up every time!

oh well

debating to leave that 609 in there or change it back to 608

Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 10 Jun 2010, 09:01:45 am
I'm pretty sure version numbers don't come into it. As to BOINC's intelligence - I think that particular case comes into the "you break it, you own both parts" category - after all, it was you that made the change :(

Going back to the error message: you're on Linux 64, with no platform tags at all. I'm dealing with a user at SETI, who has Windows 64 and an app_info with <platform>windows_intelx86 and <platform>windows_x86_64 scattered all over it like confetti.

I think I'll take this into the development area for a chat with Jason. May be some time.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 10 Jun 2010, 09:27:34 am
ok thanks.... if i see it messing up then ill back up the workunits first then change the version back since that has always worked before and will wait to see if you uncover anything.

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 11 Jun 2010, 05:04:10 am
6.09 definitely did not work. it got work units but got nothing for gpu. i set it back to 6.08 and it got units for gpu.
Title: Re: SETI MB CUDA for Linux
Post by: Richard Haselgrove on 11 Jun 2010, 05:43:10 am
Jason has now observed (and I agree with him) that the error is with the error message itself. The last line ("(reached daily quota of 100 tasks)") is the important, meaningful one: everything above it, like the rubbish about app_info, is just - well, rubbish.
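
To pull just that meaningful part out of a saved message log, something like the following would do. It is a sketch that assumes the log lines look like the ones quoted above; the function name is made up.

```shell
#!/bin/sh
# daily_quota
# Reads BOINC message lines on stdin and prints the number from every
# "(reached daily quota of N tasks)" message, ignoring everything else.
daily_quota() {
    sed -n 's/.*reached daily quota of \([0-9][0-9]*\) tasks.*/\1/p'
}

# Example:
#   daily_quota < messages.txt
```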
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 11 Jun 2010, 06:25:56 am
cool. thanks! breathing easier...  the daily quota thing is weird as i typically have received as many as 300 or 400 tasks in one day. but.. if it is something they are implementing, no problem.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 17 Jun 2010, 06:03:49 am
Thu 17 Jun 2010 05:57:30 AM EDT   SETI@home Beta Test   You used the wrong URL for this project
Thu 17 Jun 2010 05:57:30 AM EDT   SETI@home Beta Test   The correct URL is http://setiweb.ssl.berkeley.edu/beta/

beta??
this just started showing up.. are they wanting us to switch or is this another error? i did nothing to encourage this, it just changed by itself.


edit: 1 hr later its back to normal. i think i am going to stop paying attention to any messages this thing gives me for a while. as long as it keeps processing i will leave it alone.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 17 Jun 2010, 11:36:23 am
For the past 1-2 weeks, seti is in a state of ... well ... ::)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 17 Jun 2010, 04:45:57 pm
yeah. ive basically given up caring. when it gets back to normal then fine.... like now its not giving me work saying ive reached my daily quota of 207 tasks... must be something they set up to give everyone an equal chance with what they can produce... i mean.. im sitting here with 9 vlar cpu tasks and nothing else and its telling me no work... im just not worrying about it any more.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 17 Jun 2010, 04:54:53 pm
You're lucky with 207 tasks. Mine hovers around 100 when I need over 400 a day.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 17 Jun 2010, 07:42:05 pm
not sure what happened... it was 100 and i had tasks set for 5 days so i increased them to 10 days and got more work and now its up to 207..

just looked and i got some! wow.. jumped from 3 cpu jobs to 26 with still no gpu jobs and that stupid msg about my app_info.xml which is nonsense...
 
hehe i have to admit they get some interesting responses out of their systems when they 'upgrade'.

yet another reason why i believe firmly if it aint broke dont fix it :P
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 18 Jun 2010, 03:23:35 am

yet another reason why i believe firmly if it aint broke dont fix it :P
It was broken actually, but instead of fixing only the broken part, DA brought a whole package of changes to SETI main, most of which were not needed...
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Jun 2010, 08:22:25 am
ahh yes.. modify, improve the modifications, then modify the modifications until it is an entirely different product, then scrap all the changes and go back to the original code and fix the one thing that caused all the changes..  :D

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 18 Jun 2010, 11:19:33 am
LOL its more broke now than ever... hehe  i kept track and since midnight i have received all of 47 tasks total.. seti believes differently :D

Fri 18 Jun 2010 11:16:26 AM EDT   SETI@home   Message from server: (reached daily quota of 247 tasks)
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 12 Jul 2010, 10:04:47 pm
are the fermi cards proving out any better than when they first came out? or are they still not worth the price of their cooling fan?

i heard they need special software to make them work 'properly' which would possibly cause issues with me since i will be mixing gf100 and gt200b technologies.. maybe i should look around for leftover gtx2 series cards. i have to get one before i can send my xfx back for repair. the 2nd port is messed up.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 13 Jul 2010, 05:42:06 am
Unfortunately fermi cards can't be used for seti in linux. No client available.

Yesterday I read the reviews of 460 GTX. Pretty interesting product. If the supposed dual fermi card will be based on this, this will be the new card to get for seti. Unfortunately nobody knows when/if a dual fermi card will be released.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 13 Jul 2010, 12:21:19 pm
does look promising.. however it wont help me now since there is no linux driver.. ill just look for new/used gtx 285 or 295 and hope the seller is honest and took care of it.

Title: Re: SETI MB CUDA for Linux
Post by: Tye on 18 Jul 2010, 11:44:27 am
In one of my systems I have a 8800 GT, 9600 GSO, 250 GTS all working together - very nice.   Still, I'm thinking of picking up a 460, but I did hear that the 400s weren't doing CUDA so well.  Any advice/thoughts/benches?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 18 Jul 2010, 01:06:26 pm
Still, I'm thinking of picking up a 460, but I did hear that the 400s weren't doing CUDA so well.  Any advice/thoughts/benches?

Don't do that. Fermi cards are useless in linux as there isn't a seti client for them.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 19 Jul 2010, 02:45:03 pm
In one of my systems I have a 8800 GT, 9600 GSO, 250 GTS all working together - very nice.   Still, I'm thinking of picking up a 460, but I did hear that the 400s weren't doing CUDA so well.  Any advice/thoughts/benches?

for the foreseeable future i do not recommend fermi for seti. there are no applications to support them yet in linux. i need to get one and i have found a new gtx295 made by galaxy for $445 which is not a bad deal... i recommend looking for one of the gt200b products.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 11 Aug 2010, 09:02:01 am
any word if there is even a fermi for linux in the works?
eta guesses?
any clue if it will be backward compatible to be able to mix say gtx480 and gtx295 in the same machine?

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 11 Aug 2010, 10:02:28 am
Unfortunately nothing is known at this time. A wild guess is that if/when it's done, it should be backwards compatible.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 14 Aug 2010, 06:00:00 pm
how does ati compare to nvidia for number crunching? faster / $ spent? slower?

xfx is giving me a hard time. they say they cannot fix my 285. they have no parts available. they have no replacements. so i told them ill take a 295. they dont have them either. they want to replace it with a 280 or 275 both of which chew more power/gflop than the top 2 do.

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 14 Aug 2010, 06:25:12 pm
It's sad that xfx has essentially quit using nvidia, and it's not their fault, it's nvidia's.

As far as I know ATI has many problems in linux, not to mention it's completely useless for seti@home.

What problem does your card have?

A 285GTX is a 280GTX with a smaller process and higher clocks. Nothing else.

If xfx gives you an ATI, especially a 5xxx card, you could sell it and buy something else. Likewise with a 280GTX. They will be new (not used), so you'll be able to get a higher price from them. You'll have to find out which card will give you the biggest return.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Aug 2010, 06:35:10 am
the second video port will no longer go to 1920x1200 resolution. it maxes out at 1600x1200 which is useless to me.  the reason i am not liking the 280 is it is a power hog for its performance plus it is not a 200b gpu. the 275 is better.. it is a 200b  but still draws more power than the 285 does... if anything i would want a 295 replacement. if they cannot or will not do that then i would settle for the 275 since it performs better than the 280 and draws power in the middle between the 280 and 285. i just dont like the fact that if they cannot actually do a 'board repair' on it, i have to possibly settle for less than the flagship single gpu product i paid good money for. i am finding out lifetime warranties are almost useless after 'end of life' of the product. with my luck they will only offer a fermi product in which case i will demand the flagship 480 which is equivalent to my 'flagship' 285.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Aug 2010, 08:10:35 am
Have you made sure that it isn't some strange software problem and not a hardware problem?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Aug 2010, 09:26:47 am
yes. i switched the roles of the ports so dsp1/port1  becomes 'port0' and the problem stayed on port 1.. i switched monitors. same behavior.   i borrowed a gts250 card a friend has for about 10 min since it is his only vid card and the problem went away. both ports behaved fine at 1920x1200.  so it is located within the gpu circuitry. i called xfx and explained my tests and they agreed it was a gpu hardware problem. they've "seen it before".
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Aug 2010, 04:57:38 pm
Is this the machine with the buggy tesla card? If the gtx285 doesn't have problems with seti crunching, you could decommission the tesla card, make the 285 a dedicated crunching card and get a new one to drive your displays + seti crunching. Or if you are lucky and your motherboard has 3 pci-express slots, get a low budget card for your displays and have the tesla and the 285 as dedicated crunching cards.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 15 Aug 2010, 10:20:34 pm
i only have 2 pci-e slots... but yes my plan is to use the 295 that is due to arrive in the next 3 days as my primary vid card. the 285 gives no probs with cuda crunching and since port0 works fine and i have a 3rd monitor sitting in a box doing nothing i was thinking of expanding my desktop over 3 monitors. would be extremely convenient. the only reason i am even going to ship the xfx for repair is 1. lifetime warranty they have to satisfy me.. 2. i want it to work properly even if i never use the 2nd vid port... i may, but even if i dont i want it to handle 1920x1200 like it should. the 285 will replace the tesla ultimately, or whatever they agree to ship back to me. my only concern is if they will only replace it with a fermi.. in that case the fermi will become my primary dual monitor display and the 295 will do all the number crunching and drive the 3rd monitor until such a time as seti gets a linux fermi app. they did say there is some hope they can fix my card. they still have some parts inventory... only them seeing the card will tell.. i can also opt not to replace and have them ship the card back to me in worst case scenario.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 16 Aug 2010, 04:01:01 am
XFX doesn't have fermi cards.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 16 Aug 2010, 05:03:00 am
i dunno i may not ship it. this is looking more and more bleak to get any satisfaction out of xfx. as my 2nd card replacing the tesla, it does everything i will need it to do. port0 works fine and it number crunches fine, and although it would be nice to eventually go to 4 monitors that would just be greedy. i dont need 4. i dont even need 3 but since i have a spare monitor sitting in a box doing nothing i may as well use it. that way i can keep email and im clients (i get a LOT of msgs from both for my work) on screen on all desktops. i wouldnt really care except these monitors were expensive. all 3 are samsung 245bw and at the time of purchase they were $450 each so if i dont use it i am starting to feel like it is just wasting away (been in the box for almost 2 yrs already. i only opened it to verify it worked after receiving it). the 285 is a better performer than the tesla for cuda work and that is no slouch.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 17 Aug 2010, 09:00:15 am
ok. i am going to have to get down to some very serious diagnostics here. the gtx 295 exhibits the same trouble as the 285 does on port1 !!!!!! yet a gts250 does not... or on that one boot it didnt. i have asked my friend to bring it back when he can stay a while and i can try multiple reboots. funny thing is when the 285 is in alone, it always picks port1.

its the oddest behavior i have ever seen. i enabled the 3 monitor setup which works, but its a lottery which screen will become 1600x1200 with each reboot now. mostly it sticks to device 0 dfp1 , 2nd port... but it has done this to  all 3 ports and screens!! i'm baffled.. i am sending msgs off to everyone involved in the entire video display chain to see if anyone has some ideas.. i am including gigabyte in this too although i dont know how the mobo can affect it.

i put the 295 as the first device to use and the 285 in the 2nd slot.. its weird but the 295 is actually slower in video response than the 285 is. however all 3 gpu chips crank cuda out very nicely. the 285 is a bit faster there too. i guess they had to go with slower, cooler gtx275 setups in order to be successful in dual gpu in a single box. temps are good in this puppy though. so far they have not exceeded 65c under heavy load! of course i have the fan set to 100% :P i think it was asus or acer that made a limited edition dual 285 card.. expensive tho. also only made 1000 of them.

looks like i have a while of brain burning diagnostics ahead of me... hehe i think a phd in astrophysics would probably be easier. at least that course is ordered and logical :)

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 17 Aug 2010, 09:54:10 am
The xorg.log would be quite useful to your troubleshooting.

The 2 gpus used in a gtx295 are weaker than the gpu in gtx285. They are not even equal to a gtx275. They are downclocked gtx275s.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 17 Aug 2010, 11:56:59 am
maybe i will switch them then and put the 295 in the 2nd slot. the slower screen updates and pausing are getting to me. didnt do that with the 285 and tesla.
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 23 Aug 2010, 05:10:33 am
I crunched several units. One was already correctly validated. The others have error messages:
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
 file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 506 : invalid configuration argument.
....

I have a GeForce 8800 GT. Whats wrong here ?  :(



For the Windows version I would propose updating the drivers. No idea if that is applicable to Linux...

I've recently been bitten by this very same problem. In my case it seems that I've got too recent NVIDIA drivers (version 256.35, cudatoolkit 3.1).

I'm running 32-bit ubuntu 10.04 on Q6600, 8GB RAM and GeForce 8600 GT. I'm not running graphical user interface (X server might be running though), BOINC 6.10.17 says
Code:
NVIDIA GPU 0: GeForce 8600 GT (driver version unknown, CUDA version 3010, compute capability 1.1, 255MB, 99 GFLOPS peak)

Any suggestion? 64-bit linux on this machine is out of question due to unrelated reasons.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 23 Aug 2010, 05:58:34 am
What seti app do you use?

Also a link to your machine would be nice.

With 8 GB RAM may I ask why 64 bit is out of the question?
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 23 Aug 2010, 06:30:36 am
Here's link: http://setiathome.berkeley.edu/show_host_detail.php?hostid=129668.

I'm using Crunch3r's setiathome-CUDA-6.08.i686.

For a living this machine does some GIS stuff with an interface to MS SQL Server. We've found out that Unix ODBC is broken on 64-bit while the 32-bit one is OK. Hence the 32-bit OS on this machine. My heart bleeds, but one cannot have everything. :(

Hummm ... on the other hand, it seems to proceed somehow. state.sah has this line in it:

Code: [Select]
<prog>0.20329889</prog>

My concern (not the primary one though) is why BOINC doesn't run a GPU task in addition to the 4 CPU tasks (quad core processor).
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 23 Aug 2010, 09:13:03 am
It seems that those lines are somewhat benign for the application itself. The GPU finished one WU: http://setiathome.berkeley.edu/result.php?resultid=1690649422. The run time is nothing special (half of the normal CPU time). What remains is to see the result verified. If it verifies, then those lines in stderr amount to a major aesthetic problem.

If the above is fine, then the issue of BOINC not running a GPU task concurrently with the 4 CPU tasks becomes the primary concern.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 23 Aug 2010, 12:38:18 pm
I guess you tried a 32bit chroot for your work and it didn't work?

For the seti part:
-256.44 is the latest nvidia driver. Get it.
- If you don't run a x server make sure that you initialize your card properly. This is from the cuda release notes:

o In order to run CUDA applications, the CUDA module must be
  loaded and the entries in /dev created.  This may be achieved
  by initializing X Windows, or by creating a script to load the
  kernel module and create the entries.

  An example script (to be run at boot time):

  #!/bin/bash

  /sbin/modprobe nvidia

  if [ "$?" -eq 0 ]; then
    # Count the number of NVIDIA controllers found.
    N3D=`/sbin/lspci | grep -i NVIDIA | grep "3D controller" | wc -l`
    NVGA=`/sbin/lspci | grep -i NVIDIA | grep "VGA compatible controller" | wc -l`

    N=`expr $N3D + $NVGA - 1`
    for i in `seq 0 $N`; do
      mknod -m 666 /dev/nvidia$i c 195 $i;
    done

    mknod -m 666 /dev/nvidiactl c 195 255
  else
    exit 1
  fi

- Latest boinc is 6.10.56. Get it.


Let's try to reinstall the whole thing:
- Close boinc and delete everything in projects/setiathome.berkeley.edu directory. Delete client_state.xml and client_state_prev.xml in your boinc directory. Delete everything in the slots directory.
- Get http://calbe.dw70.de/AKV8/REL/V4/AK_V8_SSSE3_LX32.tar.bz2 and put the executable in projects/setiathome.berkeley.edu directory
- Get http://calbe.dw70.de/cuda/rel/setiathome-CUDA_2.2_6.08.x86_vlarkill.tar.bz2 and put the executable in projects/setiathome.berkeley.edu directory.
- Get http://lunatics.kwsn.net/index.php?module=Downloads;sa=dlview;id=258 and put the executable in projects/setiathome.berkeley.edu
- Get the 2.3 toolkit (delete everything from the 3.1 you have) and put

libcudart.so
libcudart.so.2
libcudart.so.2.3
libcufft.so
libcufft.so.2
libcufft.so.2.3

in the projects/setiathome.berkeley.edu directory.

- Edit your ld.so.conf (or the corresponding ld-something file of your distro) to add the above location of the cuda libs. Run ldconfig to update the cache.
- Run ldd... on the cuda app to see if all dependencies are met.
- Make a file named app_info.xml in the projects/setiathome.berkeley.edu directory and put the following in it:

<app_info>
    <app>
        <name>astropulse_v505</name>
    </app>
    <file_info>
        <name>ap_5.06r411_sse3_x86_64-beta4</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>astropulse_v505</app_name>
        <version_num>506</version_num>
        <flops>41430637443.0638666</flops>
        <avg_ncpus>1.0000</avg_ncpus>
        <max_ncpus>1.0000</max_ncpus>
        <file_ref>
            <file_name>ap_5.06r411_sse3_x86_64-beta4</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <app>
        <name>setiathome_enhanced</name>
    </app>
    <file_info>
        <name>AK_V8_linux64_ssse3</name>
        <executable/>
    </file_info>
    <file_info>
        <name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libcudart.so.2</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libcufft.so.2</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>603</version_num>
        <flops>24463044489.9277371</flops>
        <avg_ncpus>1.0000</avg_ncpus>
        <max_ncpus>1.0000</max_ncpus>
        <file_ref>
            <file_name>AK_V8_linux64_ssse3</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>608</version_num>
        <plan_class>cuda</plan_class>
        <flops>244630444899.2773714</flops>
        <avg_ncpus>0.1400</avg_ncpus>
        <max_ncpus>0.1400</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>libcudart.so.2</file_name>
        </file_ref>
        <file_ref>
            <file_name>libcufft.so.2</file_name>
        </file_ref>
    </app_version>
</app_info>


- Replace the yellow filenames with the ones you've got.
- Change the red number with something much lower, I don't know what a proper number for your card will be, maybe half of it.

- You have to make the cuda app run at 0 priority. I use the following script that runs in infinite loop:

#!/bin/sh
while true
do
    for arg in `ps -Lo lwp --no-headers -p $(pgrep setiathome-6)`
    do
        renice 0 -p ${arg}
    done
    sleep 5
done

For that script to work you'll need the procps and bsdutils packages installed.

- Fire up boinc and see what we've done  :D
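
A rough sketch of the ldd check from the list above, for anyone who wants to script it. The helper name missing_libs is made up for illustration; it only filters standard ldd output for the "not found" lines and changes nothing on the system.

```shell
#!/bin/sh
# missing_libs (hypothetical helper): read `ldd` output on stdin and print
# the names of the libraries the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage, from the projects/setiathome.berkeley.edu directory
# (substitute the name of the CUDA executable you actually installed):
#   ldd ./<your-cuda-app> | missing_libs
```

An empty result means every dependency was resolved. If libcudart.so.2 or libcufft.so.2 shows up, the ld.so.conf/ldconfig step most likely did not take effect.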
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 25 Aug 2010, 10:20:08 am
- Change the red number with something much lower, I don't know what a proper number for your card will be, maybe half of it.

I assume that this flops stuff is more or less aesthetics.

Other than a few differences in app_info.xml (missing flops stanza, different values for avg_ncpus and max_ncpus - was 0.04) I've managed quite a different setup. I've left toolkit 3.1 while installing 2.0 libraries systemwide as well. ldd shows that correct (2.0) libraries are then loaded by seti/cuda app.

I've updated toolkit to 2.3 now and Crunch3r's app to the one you proposed. I'll check and see how it works out.

- You have to make the cuda app run at 0 priority. I use the following script that runs in infinite loop:

Could this be reason for quite long run times on my card? Meanwhile two results got verified and peers were running stock GPU app for Windows. One peer had GTX295, the other GTX285, both had from 5 to 8 times shorter run time. Or is such a speed difference expected between more modern and almost ancient GPUs?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 25 Aug 2010, 08:34:29 pm
I assume that this flops stuff is more or less aesthetics.
Absolutely NOT! Check http://lunatics.kwsn.net/3-linux/seti-mb-cuda-for-linux.msg19829.html#msg19829 and http://lunatics.kwsn.net/5-windows/appinfo-flops-question.0.html


I've left toolkit 3.1 while installing 2.0 libraries systemwide as well. ldd shows that correct (2.0) libraries are then loaded by seti/cuda app.

I've updated toolkit to 2.3 now and Crunch3r's app to the one you proposed. I'll check and see how it works out.
If you don't need them for other things DELETE the 3.1 and 2.0 and KEEP the 2.3.

- You have to make the cuda app run at 0 priority. I use the following script that runs in infinite loop:

Could this be reason for quite long run times on my card?

Yes, you ABSOLUTELY have to run the cuda app at priority 0.


Meanwhile two results got verified and peers were running stock GPU app for Windows. One peer had GTX295, the other GTX285, both had from 5 to 8 times shorter run time. Or is such a speed difference expected between more modern and almost ancient GPUs?
Of course a gtx295 or gtx 285 will be multiple times faster than a 8600gt. If they weren't, why would people fork out multiple hundred euros/dollars to buy them?
Also note that the linux cuda app is slower in general than the windows one.
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 26 Aug 2010, 02:54:18 am
I assume that this flops stuff is more or less aesthetics.
Absolutely NOT! Check http://lunatics.kwsn.net/3-linux/seti-mb-cuda-for-linux.msg19829.html#msg19829 and http://lunatics.kwsn.net/5-windows/appinfo-flops-question.0.html

Well, for me that's more aesthetics than functionality. I'm running SAH alongside other projects with a not-so-high resource share, hence the effect of a wrong DCF is not so crucial.

- You have to make the cuda app run at 0 priority. I use the following script that runs in infinite loop:

Could this be reason for quite long run times on my card?

Yes, you ABSOLUTELY have to run the cuda app at priority 0.

Given that the CUDA/Win app is generally faster than CUDA/Linux and that my peers' GPUs are faster HW-wise than mine, 5-8 times longer runtimes on my ancient GPU seem reasonable. Even without running the CUDA app at priority 0.

Or is it that normal (not niced that is) priority for CUDA app is necessary for flawless operation?

[edit]
Ah, I've read a couple more articles about it and now it has become clearer that this helps a lot. At the same time I'll try to lower avg_ncpus and max_ncpus to 0.00. My reasoning: if the task runs with much higher priority than the rest of the CPU-hungry apps, then it'll just steal CPU time from them. Hence no need for BOINC to allocate CPU cycles for it - a decent OS tends to do it much better and with much finer granularity.

This might cure my concerns below.

I'll report back with results.
[/edit]

Back to my concerns: what should I put in app_info.xml to make BOINC run 4 CPU tasks and 1 GPU task in parallel? I've got a quad CPU and single GPU so this system should be able to run 5 tasks in parallel. Currently it suspends one CPU task when running the GPU task.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 26 Aug 2010, 04:01:18 am
All I had to say is in my previous posts http://lunatics.kwsn.net/3-linux/seti-mb-cuda-for-linux.msg30557.html#msg30557 and http://lunatics.kwsn.net/3-linux/seti-mb-cuda-for-linux.msg30659.html#msg30659
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Aug 2010, 08:03:32 pm
sheesh. ever since they started this 3days off thing my pending has skyrocketed. is that true for everyone? for the past few weeks i have been hovering between 310,000 and 370,000 !
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 29 Aug 2010, 01:01:49 am
Yes, I'm seeing this also. Before that, I had about 200,000-250,000 pending, now it's about 600,000 pending.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 31 Aug 2010, 05:09:44 am
i'm currently using nvidia-drivers-195.36.31.  noticed an upgrade available to nvidia-drivers-256.52.
i'm always a bit suspicious of large jumps in upgrade versions. worth it? avoid it?
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 31 Aug 2010, 07:07:35 am
Here are my experiences after some days: it works  ;)

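My observations:
- settings of <avg_ncpus> and <max_ncpus> don't matter much (if at all)
- settings of <flops> should not be too high or else BOINC bails out due to excessive resources (read: CPU/GPU cycles) being used
- setting niceness of CPU-part of GPU task to 0 (normal priority) doesn't seem to affect things a lot, but doesn't hurt.
- At least the app I'm running (x86, 2.2, vlar-kill) has a nasty habit of complaining:
Code:
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 497 : invalid configuration argument.
Seems benign though as most results have validated. Is there any particular reason for this error being reported and app seemingly still operating OK?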

Sunu, thank you for all the advice!

Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 31 Aug 2010, 07:09:01 am
i'm currently using nvidia-drivers-195.36.31.  noticed an upgrade available to nvidia-drivers-256.52.
i'm always a bit suspicious of large jumps in upgrade versions. worth it? avoid it?

I can't say anything about worthiness, however the new one works for me.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 31 Aug 2010, 07:27:09 am
i'm currently using nvidia-drivers-195.36.31.  noticed an upgrade available to nvidia-drivers-256.52.
i'm always a bit suspicious of large jumps in upgrade versions. worth it? avoid it?
There have been quite a few releases between them, so not really a big jump. You can try it and if you don't like it, revert back.

settings of <avg_ncpus> and <max_ncpus> don't matter much (if at all)
Wrong


settings of <flops> should not be too high or else BOINC bails out due to excessive resources (read: CPU/GPU cycles) being used
Wrong


setting niceness of CPU-part of GPU task to 0 (normal priority) doesn't seem to affect things a lot, but doesn't hurt.
It seems to depend on the kernel/distro used. Some systems seem to benefit highly from it, others not so much.
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 31 Aug 2010, 02:09:49 pm
settings of <avg_ncpus> and <max_ncpus> don't matter much (if at all)
Wrong
How so? I've tried some values between 0.00 and 0.15 and I haven't noticed any difference. The only time that I could imagine the difference to pop up is if there are multiple (probably more than 3-4) GPUs installed and used.

settings of <flops> should not be too high or else BOINC bails out due to excessive resources (read: CPU/GPU cycles) being used
Wrong
If not, what then? My estimates are currently way too high (around 4 days) so I tried to fix it by changing <flops> value. If I set it 10-times larger, WUs erred out due to excessive resources used. Run time (wall) was roughly the same as for successful WUs, so I can attribute the error only to too high <flops> value.
Title: Re: SETI MB CUDA for Linux
Post by: Josef W. Segur on 31 Aug 2010, 03:05:46 pm
settings of <avg_ncpus> and <max_ncpus> don't matter much (if at all)
Wrong
How so? I've tried some values between 0.00 and 0.15 and I haven't noticed any difference. The only time that I could imagine the difference to pop up is if there are multiple (probably more than 3-4) GPUs installed and used.

Set 1 and BOINC will reserve a full CPU for each GPU. Set 0.71 as the project app_plan is doing for some hosts running stock Windows builds and if the system has 2 GPUs one CPU will be reserved, etc. You're right that small fractional settings are generally insignificant.
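
The two cases above are consistent with a simple rule: multiply avg_ncpus by the number of GPUs in use and round down. Here is a sketch of that arithmetic. The rounding-down rule is an assumption inferred from the two examples, not taken from the BOINC source, and actual client behaviour may differ between BOINC versions. The function name reserved_cpus is made up.

```shell
#!/bin/sh
# reserved_cpus NGPUS AVG_NCPUS
# Assumed rule (inferred, not from BOINC source):
#   whole CPUs reserved = floor(NGPUS * AVG_NCPUS)
reserved_cpus() {
    awk -v n="$1" -v a="$2" 'BEGIN { print int(n * a) }'
}

reserved_cpus 2 1.0    # prints 2: a full CPU per GPU
reserved_cpus 2 0.71   # prints 1: the 0.71 case with 2 GPUs
reserved_cpus 1 0.14   # prints 0: small fractions reserve nothing
```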

Quote from: Metod
settings of <flops> should not be too high or else BOINC bails out due to excessive resources (read: CPU/GPU cycles) being used
Quote from: sunu
Wrong
Quote from: Metod
If not, what then? My estimates are currently way too high (around 4 days) so I tried to fix it by changing <flops> value. If I set it 10-times larger, WUs erred out due to excessive resources used. Run time (wall) was roughly the same as for successful WUs, so I can attribute the error only to too high <flops> value.

The relationships are : rsc_fpops_bound/flops = elapsed time limit. DCF*rsc_fpops_est/flops = estimated runtime. rsc_fpops_bound = 10*rsc_fpops_est.

If DCF is near or greater than 10 as sometimes happens, the estimated runtime is longer than the allowed runtime. Reducing DCF can reduce the estimates and thereby allow work fetch, without changing the allowed runtime. Adjusting flops to more than a realistic value for the host is not a very good idea, but adjusting rsc_fpops_bound values higher can protect against those errors.

With the servers attempting to provide rsc_fpops_est and _bound values which are about right for DCF 1.0, we can hope things will settle down after they have enough data to know how fast the applications are. Unfortunately the initial transitions are painful.
                                                                                    Joe
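
To make the relationships above concrete, here is a small worked example. The task size (rsc_fpops_est = 3e13) and the flops values are made-up round numbers, not real SETI@home figures, and the function names are only for illustration.

```shell
#!/bin/sh
# estimated_runtime DCF RSC_FPOPS_EST FLOPS  -> seconds
# Implements: DCF * rsc_fpops_est / flops
estimated_runtime() {
    awk -v d="$1" -v e="$2" -v f="$3" 'BEGIN { printf "%.0f\n", d * e / f }'
}

# time_limit RSC_FPOPS_EST FLOPS  -> seconds
# Implements: rsc_fpops_bound / flops, with rsc_fpops_bound = 10 * rsc_fpops_est
time_limit() {
    awk -v e="$1" -v f="$2" 'BEGIN { printf "%.0f\n", 10 * e / f }'
}

estimated_runtime 1 3e13 1e10    # prints 3000: estimate at DCF 1.0
time_limit 3e13 1e10             # prints 30000: allowed elapsed time
time_limit 3e13 1e11             # prints 3000: flops set 10x higher
estimated_runtime 12 3e13 1e10   # prints 36000: DCF 12, estimate exceeds the 30000 s limit
```

With flops set ten times too high in this example, a task that actually needs, say, 5000 s would run past its 3000 s limit and be aborted, which is the kind of failure described above. The last line shows the DCF case: once DCF goes above 10, the estimate is longer than the allowed runtime.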
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Oct 2010, 10:49:19 pm
either today's batch of downloads is supposed to take a very long time for a gpu to complete or i have something going wrong. my fastest gpu is taking 3 hours 2 minutes to reach 85%! and even the others are taking 10 to 15 minutes longer on the other 2 gpus. all 3 gpu temps are also much lower than normal. typically they run 58-65c max load and i have not seen them rise above 50c in several hours.

is this a 'common' experience others are having too today or am i facing something going haywire?
Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 02 Oct 2010, 07:23:28 am
either today's batch of downloads is supposed to take a very long time for a gpu to complete or i have something going wrong. my fastest gpu is taking 3 hours 2 minutes to reach 85%! and even the others are taking 10 to 15 minutes longer on the other 2 gpus. all 3 gpu temps are also much lower than normal. typically they run 58-65c max load and i have not seen them rise above 50c in several hours.

is this a 'common' experience others are having too today or am i facing something going haywire?
Check out your results:

resultid=1717782169 (http://setiathome.berkeley.edu/result.php?resultid=1717782169) on hostid=4166601 (http://setiathome.berkeley.edu/show_host_detail.php?hostid=4166601)

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>

SETI@home MB CUDA 3.0 6.09 Linux 64bit - r16 by Crunch3r :p
- thread priority mod

setiathome_CUDA: Found 3 CUDA device(s):
   Device 1 : GeForce GTX 285
           totalGlobalMem = 1073020928
           sharedMemPerBlock = 16384
           regsPerBlock = 16384
           warpSize = 32
           memPitch = 2147483647
           maxThreadsPerBlock = 512
           clockRate = 1476000
           totalConstMem = 65536
           major = 1
           minor = 3
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 30
   Device 2 : GeForce GTX 295
           totalGlobalMem = 939327488
           sharedMemPerBlock = 16384
           regsPerBlock = 16384
           warpSize = 32
           memPitch = 2147483647
           maxThreadsPerBlock = 512
           clockRate = 1345500
           totalConstMem = 65536
           major = 1
           minor = 3
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 30
   Device 3 : GeForce GTX 295
           totalGlobalMem = 939327488
           sharedMemPerBlock = 16384
           regsPerBlock = 16384
           warpSize = 32
           memPitch = 2147483647
           maxThreadsPerBlock = 512
           clockRate = 1345500
           totalConstMem = 65536
           major = 1
           minor = 3
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 30
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 285 is okay
SETI@home using CUDA accelerated device GeForce GTX 285
Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum], FftLen, CUFFT_C2C, NumDataPoints / FftLen)' in file './cudaAcc_fft.cu' in line 49 : no CUDA-capable device is available.
Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum], FftLen, CUFFT_C2C, NumDataPoints / FftLen)' in file './cudaAcc_fft.cu' in line 49 : no CUDA-capable device is available.
setiathome_CUDA: CUDA runtime ERROR in plan FFT. Falling back to HOST CPU processing...

setiathome_enhanced 6.01 Revision: 737 g++ (GCC) 4.2.1 (SUSE Linux)
libboinc: BOINC 6.11.0

Work Unit Info:
...............
WU true angle range is :  1.433000

Flopcounter: 11714606392639.039062

Spike count:    1
Pulse count:    0
Triplet count:  0
Gaussian count: 0
05:22:35 (16178): called boinc_finish

</stderr_txt>


I suggest you try first restarting Boinc, then your computer.

Claggy
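
The giveaway in the stderr above is the "Falling back to HOST CPU processing" line. A quick way to check a saved stderr for it, as a sketch: the function name cpu_fallback is made up, and the matched string is copied from the output above.

```shell
#!/bin/sh
# cpu_fallback (hypothetical helper): read a task's stderr text on stdin and
# succeed (exit 0) if the CUDA app reported that it gave up on the GPU and
# processed the task on the CPU instead.
cpu_fallback() {
    grep -q 'Falling back to HOST CPU processing'
}

# Usage (stderr.txt is a placeholder for wherever you saved the task output):
#   cpu_fallback < stderr.txt && echo "this task ran on the CPU"
```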
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Oct 2010, 12:35:17 pm
i noticed it was using 100% cpu and that is what tipped me off as well..
i shut down for about 1 min then restarted and that seems to have cured it.
i am wondering though if this is a symptom of something going  bad or if it
was just the occasional 'fluke'
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Oct 2010, 05:19:48 pm
Do you still make heavy use of your main graphics card?

What driver do you use?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 02 Oct 2010, 10:32:30 pm
Do you still make heavy use of your main graphics card?

What driver do you use?

im using 256.53 driver with cuda 3.1

yes. i have a monitor off both ports of the 285 and one monitor off the 295 and i make heavy use of them though mostly it is in ssh, browser, email, instant msg, text editor windows. the monitors are set up in a xinerama/twinview mixture to get 3 on one desktop.

this is the first time this problem has happened, and since i power cycled the machine it has not happened again.. although i did do something out of the ordinary yesterday. tried to watch a training seminar video but it wouldnt play. had some wrong version codecs somehow since it did work 2 weeks ago. that may have tossed the vid card into a strange state since i had to kill the player. wound up eating all available memory. i think when something like this happens in the future like with the vid player, ill just power off and start up again to be safe.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 02 Oct 2010, 11:22:27 pm
Why cuda 3.1? I think you shouldn't use it. Cuda 3.x is intended for different software and hardware.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Oct 2010, 09:11:29 am
i forget who told me but they said it was backward compatible and that performance was better.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Oct 2010, 09:20:21 am
Why cuda 3.1? I think you shouldn't use it. Cuda 3.x is intended for different software and hardware.

that must have been in an upgrade done yesterday or the day before. the list was long and i really didnt look carefully at it.
i have reinstalled cuda-toolkit 2.1. it appears device 2 started causing issues now and was done with each workunit as it began working on it.
this happened in the past hour i think... hopefully this will cure the problems.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 03 Oct 2010, 05:09:57 pm
Cuda 2.3 would be the best choice.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 03 Oct 2010, 06:39:29 pm
argh. i didnt even notice the typo.. yes i installed 2.3 not 2.1.. sorry

Title: Re: SETI MB CUDA for Linux
Post by: glennaxl on 04 Oct 2010, 01:17:27 am
just rebuilt my system, going from dotsch to ubuntu 10.04.1, and my seti cuda is now too slow (using only 0.5-1.x cpu - windows is about 7-12cpu) - about 1hr to complete. running gpugrid to see if my cuda setup has issues, but gpugrid seems to be running perfectly well. So what's the deal here? any ideas?

driver ver: 195.36
boinc: 6.10.17
seti cuda: 2.3
seti app: setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu
seti wus: not vlar
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Oct 2010, 11:46:43 am
Do you raise the priority of the seti cuda app? Also why don't you use a newer graphics driver?
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 05 Oct 2010, 02:14:50 pm
I've recently replaced gfx card with (still old but slightly) better one. This one is 9800GX2, with 2 GPUs and 512 MB graphic RAM per GPU. Now I wonder the following: does SAH CUDA app really make use of all the HW available? Or would it be sensible to try to run, say, 2 CUDA tasks in parallel on same GPU in order to 'increase throughput' (similar to hyperthreading)?

I've observed that it's quite possible to run 2 Einstein@Home CUDA apps in parallel (indeed GPU is not used much there) while my efforts to do so with SAH failed miserably (I doubt it's due to lack of graphic RAM, single SAH CUDA ran just fine on my old 8600 with mere 256 MB gRAM).

Any thoughts?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 05 Oct 2010, 04:33:05 pm
No, for pre-fermi GPUs there is no advantage in running multiple tasks per GPU, in fact usually the speed is worse.

Your card has two GPUs so you can run two tasks (1 per GPU) simultaneously.
Title: Re: SETI MB CUDA for Linux
Post by: glennaxl on 05 Oct 2010, 05:17:18 pm
Do you raise the priority of the seti cuda app? Also why don't you use a newer graphics driver?

Yah I found that was the problem - NICE is not playing nice at all  ;D

Upgraded the driver and also used crunch3r cuda3 bins as the cuda2.2 doesn't have raised priority.

BTW, I have 1 rig that doesn't raise priority at system start-up (I also set a 20sec delay but it doesn't work). I have to manually restart boinc so setiathome-CUDA_3.0_6.09.x86_64_vlarkill will be raised. My other rigs work perfectly fine though.
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 11 Oct 2010, 07:22:48 am
At least the app I'm running (x86, 2.2, vlar-kill) has a nasty habit of complaining:
Code:
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 497 : invalid configuration argument.
Seems benign though as most results have validated. Is there any particular reason for this error being reported and app seemingly still operating OK?

Can anybody comment on the error above? I don't think I've got any results without this 'error' while almost all of them validated OK. It's annoying to see thousands of lines of error while meaningful information gets truncated.
Title: Re: SETI MB CUDA for Linux
Post by: glennaxl on 11 Oct 2010, 07:08:16 pm
Do you raise the priority of the seti cuda app? Also why don't you use a newer graphics driver?

Yah I found that was the problem - NICE is not playing nice at all  ;D

Upgraded the driver and also used crunch3r cuda3 bins as the cuda2.2 doesn't have raised priority.

BTW, I have 1 rig that doesn't raise priority at system start-up (I also set a 20sec delay but it doesn't work). I have to manually restart boinc so setiathome-CUDA_3.0_6.09.x86_64_vlarkill will be raised. My other rigs work perfectly fine though.
I solved the issue. Apparently, setiathome-CUDA_3.0_6.09.x86_64_vlarkill doesn't raise the priority if boinc is run by user "boinc" - it needs admin privilege. any thoughts?
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 11 Oct 2010, 11:27:29 pm
Maybe boinc user needs right to increase process priority ?
Title: Re: SETI MB CUDA for Linux
Post by: glennaxl on 12 Oct 2010, 09:58:52 pm
Maybe boinc user needs right to increase process priority ?

Isn't giving the boinc user the right to increase process priority the same as making it root? Or is there a specific group that I need to add boinc to?
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 13 Oct 2010, 03:08:06 am
Root has many other rights too.
I hope Linux has some concept of assigning particular rights to different users/groups.
In windows right assignment can be pretty granular.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 13 Oct 2010, 07:18:04 am
Why do you need boinc user and other stuff? Keep the garbage out of your system. Install it in your home directory and go from there.
Title: Re: SETI MB CUDA for Linux
Post by: Raistmer on 13 Oct 2010, 08:26:48 am
Why do you need boinc user and other stuff? Keep the garbage out of your system. Install it in your home directory and go from there.
On Windows, a service (protected) install is required for BOINC to work without a user logged on. If Linux can launch a user's apps (installed in that user's home dir) without a user logon, good.
Also, it's fine when some kind of autologon is enabled. But if the PC sits idle and just waits for the user to come and log on, that's not good.
Title: Re: SETI MB CUDA for Linux
Post by: glennaxl on 14 Oct 2010, 03:39:33 am
Why do you need boinc user and other stuff? Keep the garbage out of your system. Install it in your home directory and go from there.
On Windows, a service (protected) install is required for BOINC to work without a user logged on. If Linux can launch a user's apps (installed in that user's home dir) without a user logon, good.
Also, it's fine when some kind of autologon is enabled. But if the PC sits idle and just waits for the user to come and log on, that's not good.

My rig is running ubuntu server with no x-window, stripped down kernel (using localmodconfig - 9MB kernel) and boinc running as service. Add some startup script to initialize cuda and other stuff. Everything is running perfectly now, just boinc running as root - i don't think its a risk as i don't have anything in this rig - just pure cruncher :)
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 14 Oct 2010, 09:24:05 am
Why do you need boinc user and other stuff? Keep the garbage out of your system. Install it in your home directory and go from there.
On Windows, a service (protected) install is required for BOINC to work without a user logged on. If Linux can launch a user's apps (installed in that user's home dir) without a user logon, good.
Also, it's fine when some kind of autologon is enabled. But if the PC sits idle and just waits for the user to come and log on, that's not good.

My rig is running ubuntu server with no x-window, stripped down kernel (using localmodconfig - 9MB kernel) and boinc running as service. Add some startup script to initialize cuda and other stuff. Everything is running perfectly now, just boinc running as root - i don't think its a risk as i don't have anything in this rig - just pure cruncher :)

You could be running BOINC under non-root account and just run that renice script as root. Theoretically that would make your rig even more safe.
Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 23 Oct 2010, 01:31:52 pm
At least the app I'm running (x86, 2.2, vlar-kill) has a nasty habit of complaining:
Code:
Cuda error 'GaussFit_kernel' in file './cudaAcc_gaussfit.cu' in line 497 : invalid configuration argument.
Seems benign though as most results have validated. Is there any particular reason for this error being reported and app seemingly still operating OK?

Can anybody comment on the error above? I don't think I've got any results without this 'error' while almost all of them validated OK. It's annoying to see thousands of lines of error while meaningful information gets truncated.
If you read the first dozen pages of this thread, you'll see that the original 32bit Linux Cuda app was broken; it might be that the x86, 2.2, vlar-kill app isn't totally reliable either.
If you can, try swapping to a 64bit OS and use the 64bit app instead.

Claggy

Edit: The old broken 32bit Linux Cuda app really should be removed from the first post.
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 23 Oct 2010, 02:14:07 pm
Edit: The old broken 32bit Linux Cuda app really should be removed from the first post.

Done:
Quote
[Mod:] Removed outdated build

Was 64bit OK? I can put that back if needed, or better yet, put it in a proper place.
Title: Re: SETI MB CUDA for Linux
Post by: Claggy on 23 Oct 2010, 02:34:20 pm
Edit: The old broken 32bit Linux Cuda app really should be removed from the first post.

Done:
Quote
[Mod:] Removed outdated build

Was 64bit OK? I can put that back if needed, or better yet, put it in a proper place.

64bit app was O.K, only 32bit app bad, see this post from a long while ago:

Any news on this?

Unfortunately no. All 32bit builds had the same error. If you can, run the 64bit app, else it's better not run CUDA for the time being.

Then I think somebody should add a warning to the original post or remove the 32bit app entirely.

Unfortunately I only have 32bit linux installed. The last time I tried 64bit I had a sound problem in some 32bit games, so I just reverted to 32bit. Perhaps I should give it a try again, but I don't think this will happen anytime soon.

Claggy
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 23 Oct 2010, 02:41:45 pm
OK, modding [Done]
Quote
[Mod:] Removed outdated/bad 32 bit build
Title: Re: SETI MB CUDA for Linux
Post by: Metod, S56RKO on 23 Oct 2010, 03:00:14 pm
If you read the first dozen pages of this thread, you'll see that the original 32bit Linux Cuda app was broken; it might be that the x86, 2.2, vlar-kill app isn't totally reliable either.
If you can, try swapping to a 64bit OS and use the 64bit app instead.

Ah, well, I kinda hoped that somebody would be nice enough to provide a good 32-bit Linux CUDA app. I can't really go to 64-bit on this machine. I actually had to go from 64-bit to 32-bit due to requirements on what this machine does for its living.
Title: Re: SETI MB CUDA for Linux
Post by: glennaxl on 08 Dec 2010, 11:47:13 am
Anyone tried setiathome-CUDA_3.0_6.09.x86_64_vlarkill with cuda32?

It seems that all dependencies get resolved when I tested it under VMware, except for libcuda.so, as cuda is not exposed under VMware.
Code: [Select]
ldd setiathome-CUDA_3.0_6.09.x86_64_vlarkill
        linux-vdso.so.1 =>  (0x00007ffff6fff000)
        libcufft.so.3 => /usr/local/cuda/lib64/libcufft.so.3 (0x00007fa18cf7d000)
        libcuda.so.1 => not found
        libcudart.so.3 => /usr/local/cuda/lib64/libcudart.so.3 (0x00007fa18cd2f000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fa18ca29000)
        libm.so.6 => /lib/libm.so.6 (0x00007fa18c7a6000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00007fa18c588000)
        libc.so.6 => /lib/libc.so.6 (0x00007fa18c205000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fa18ed44000)
        libdl.so.2 => /lib/libdl.so.2 (0x00007fa18c001000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fa18bdea000)
        librt.so.1 => /lib/librt.so.1 (0x00007fa18bbe2000)
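
A quick way to spot unresolved libraries such as the libcuda.so.1 line above is to filter the ldd output for "not found". This is only a minimal sketch, assuming the ldd output format shown in the post; the helper name missing_libs is made up for illustration.

```shell
# Read `ldd` output on stdin and print the name of every library
# that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Typical use would be something like: ldd ./setiathome-CUDA_3.0_6.09.x86_64_vlarkill | missing_libs. An empty result means every dependency was found on that machine.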
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Dec 2010, 08:08:28 am
Anyone tried setiathome-CUDA_3.0_6.09.x86_64_vlarkill with cuda32?

Supposedly setiathome-CUDA_3.0_6.09.x86_64_vlarkill was done for fermi compatibility but when I tested it, it wasn't working.
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 15 Dec 2010, 08:19:21 am
Anyone tried setiathome-CUDA_3.0_6.09.x86_64_vlarkill with cuda32?

Supposedly setiathome-CUDA_3.0_6.09.x86_64_vlarkill was done for fermi compatibility but when I tested it, it wasn't working.

Not surprising given the Fermi incompatibilities inherited from 'old stock' Cuda code (all platforms) are fairly specific, complex, and not related directly to Cuda version (apart from needing at least Cuda 3 for fermi binaries)

FYI,
Have got a few other responsibilities sorted today, so should hopefully be able to get some Linux guys on the case soon.  No firm timetable yet, for Linux, but wheels are in motion.  Will try to provide more info as something more tangible develops.

Jason
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 15 Dec 2010, 08:55:36 am
Not surprising given the Fermi incompatibilities inherited from 'old stock' Cuda code (all platforms) are fairly specific, complex, and not related directly to Cuda version (apart from needing at least Cuda 3 for fermi binaries)
When I tested it, it produced garbage results in all WUs except VLAR ones. It could run VLAR workunits in fermi GPUs and produce valid results.

FYI,
Have got a few other responsibilities sorted today, so should hopefully be able to get some Linux guys on the case soon.  No firm timetable yet, for Linux, but wheels are in motion.  Will try to provide more info as something more tangible develops.
Nice to hear that :)
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 15 Dec 2010, 09:06:13 am
When I tested it, it produced garbage results in all WUs except VLAR ones. It could run VLAR workunits in fermi GPUs and produce valid results.

LoL, that's kind of amusing but logical by nature of some of the issues. Thanks. That info will help track down & check any issues that might need resolving in an x32f port.  Fingers crossed it should be reasonably straightforward from here.  Once operational will likely be attempting to keep Linux builds up to date with Windows ones (my primary platform), so all going well shouldn't miss out on coming optimisations worked out in the Unit test threads.

Jason 
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 06 Jan 2011, 06:57:23 am
i just upgraded my nvidia driver to linux 64 version 260.19.29 and nvidia-settings of the
same version.

since this upgrade my 2nd video card will not release gpu core temps.

gpu0 is a gtx285
gpu1 and 2 is a dual processor gtx295

gpu0 still reads out fine in gkrellm properly but shows 0 for gpu1 and gpu2.

gkrellm is asking for the temps for all gpus according to the process list

when i run the command manually here is what i get:... the number 64 in the gpu0 readout
is the correct temp. the other 2 return a 0:


odyssey bin # nvidia-settings -q [gpu:0]/GPUCoreTemp
Xlib:  extension "RANDR" missing on display ":0".

  Attribute 'GPUCoreTemp' (odyssey:0[gpu:0]): 64.
    'GPUCoreTemp' is an integer attribute.
    'GPUCoreTemp' is a read-only attribute.
    'GPUCoreTemp' can use the following target types: X Screen, GPU.

odyssey bin # nvidia-settings -q [gpu:2]/GPUCoreTemp
Xlib:  extension "RANDR" missing on display ":0".

  Attribute 'GPUCoreTemp' (odyssey:0[gpu:2]): 0.
    'GPUCoreTemp' is an integer attribute.
    'GPUCoreTemp' is a read-only attribute.
    'GPUCoreTemp' can use the following target types: X Screen, GPU.

odyssey bin # nvidia-settings -q [gpu:1]/GPUCoreTemp
Xlib:  extension "RANDR" missing on display ":0".

  Attribute 'GPUCoreTemp' (odyssey:0[gpu:1]): 0.
    'GPUCoreTemp' is an integer attribute.
    'GPUCoreTemp' is a read-only attribute.
    'GPUCoreTemp' can use the following target types: X Screen, GPU.


is this a problem with the new driver?

i also recompiled cuda toolkit 2.3 to be sure everything was synchronized.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 06 Jan 2011, 07:36:51 am
You'll need to query the thermal sensors directly:

nvidia-settings -q [thermalsensor:0]/ThermalSensorReading

Each card has two thermal sensors, core and ambient, so thermal sensors 0,2,4 will be the core temperatures and 1,3,5 will be the ambient temperatures.
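
The sensor numbering described above can be wrapped in a small loop. This is a sketch only, assuming the core/ambient numbering really is 0,2,4 / 1,3,5 on your setup; the function names are invented for illustration. The target is quoted because the square brackets would otherwise be treated as a glob pattern by the shell.

```shell
# Map a GPU index to its thermal sensor index, following the numbering
# described in the post (core = even sensors, ambient = odd sensors).
core_sensor() { echo $(( $1 * 2 )); }
ambient_sensor() { echo $(( $1 * 2 + 1 )); }

# Query the core temperature of GPUs 0 .. N-1 (N is the first argument).
print_core_temps() {
    gpu=0
    while [ "$gpu" -lt "$1" ]; do
        nvidia-settings -q "[thermalsensor:$(core_sensor "$gpu")]/ThermalSensorReading"
        gpu=$(( gpu + 1 ))
    done
}
```

For the three-GPU box discussed here that would be: print_core_temps 3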
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 06 Jan 2011, 11:15:39 am
cool that worked thanks!

what i cannot understand is why then does gpu0 still work under the old syntax?

i would think it would be all or nothing.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 06 Jan 2011, 11:21:54 am
what i cannot understand is why then does gpu0 still work under the old syntax?

Probably because it has a x server running while the other gpus do not.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 06 Jan 2011, 08:39:33 pm
my x server *i think* uses 2 gpus cause i have 3 monitors hooked up, both vid cards are defined and used in xorg.conf and the screens are defined as the gpu0 card both ports are xinerama and then that screen is twinviewed into the 3rd one.

just weird..

i found a better utility to check gpu temps and bill put the call into his gkrellm so there is now a choice of using nvidia-settings gputemp or using nvidia-smi which works perfectly.. also gives a nicer formatted output if you just want to look manually..

nvidia-smi -q -a

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Mar 2011, 02:12:59 pm
is the fermi driver reliable on linux yet? i am thinking of replacing both my cards with a gtx 590
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 06 Apr 2011, 09:03:41 am
I don't know whether 270.30 supports the new GTX590, but even if it doesn't I'm sure the next driver release from nvidia will support it. What doesn't work currently is nvidia-smi.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 07 Apr 2011, 06:32:58 am
cool so they finally got things going. actually i think i asked the wrong question. i said driver didnt i.... im severely overworked.... i meant the seti software.  i can assume it now supports fermi. it has been long enough.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Jun 2011, 12:28:10 pm
this is probably the wrong place to ask, but in boinc manager v 6.12.27 clicking the X in the upper right of the window does not cancel the app but minimizes it. this is not acceptable behavior.. is there a way to change that? it is highly irritating.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 21 Jun 2011, 01:06:31 pm
I'm still with 6.10.58 and don't know what's happening with newer versions. Usually with boinc you can use managers with a different version than boinc. Maybe try an older boinc manager?

I've stopped using boinc manager. You can do many things with boinccmd. For example ./boinccmd --get_file_transfers shows you all pending uploads-downloads.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 23 Jun 2011, 03:17:33 pm
yeah im gonna go back to an earlier version. they also changed the order of the display columns and i found it lots more difficult than the older ones.. might just be because im used to the old style but it is not as quick to gather data i want to have like the old one was with important columns next to each other. in the new one they are spread across the window. they also removed the messages log and made it a menu option only and replaced the empty space with a notices tab which mirrors the messages on their website instead. yuk.
Title: Re: SETI MB CUDA for Linux
Post by: Terror Australis on 08 Jul 2011, 09:04:01 am
Hi All,
As posted on the SAH board.

I've just put a new cruncher together. QX9650, Gigabyte X38 Mobo, Mandriva 2010.2 PWP Linux, 260.10 Drivers (the latest in the Mandriva repository), BOINC 6.10.58, Aaron's Linux App and 2 GTX580s.

My problem is this. The system and BOINC both recognise there are 2 video cards and BOINC runs 2 GPU units. However they are both running on the one card. It's the card in the first PCIE socket that's doing all the work. The one in socket 2 stays dead cold. Using a cc_config file made no difference

Other things I've noticed.
In the xorg.conf files the Busses/Sockets are transposed, Socket 1 appears as socket 2 and vice versa. If I only use one card the system behaves perfectly.
Swapping the cards made no difference, it's still only slot 1 that does work

Any clues from the Linux fraternity ?

T.A.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Jul 2011, 10:28:49 am
My problem is this. The system and BOINC both recognise there are 2 video cards and BOINC runs 2 GPU units. However they are both running on the one card. It's the card in the first PCIE socket that's doing all the work. The one in socket 2 stays dead cold. Using a cc_config file made no difference

Your tasks say otherwise:
Device1: http://setiathome.berkeley.edu/result.php?resultid=1986746872
Device2: http://setiathome.berkeley.edu/result.php?resultid=1986700701
Title: Re: SETI MB CUDA for Linux
Post by: Terror Australis on 08 Jul 2011, 11:02:40 am
That's my point. As far as the system is concerned it IS using two cards, but it appears that the second card BOINC uses is a "virtual" card that's running on the card in slot 1.

The second card is way too cold to be doing any work. The NVidia monitor reports a temp of 60deg for card one and a temp of 30 for card 2. I can't believe a GTX580 running at full blat would run that cool  :)

 The crunching times compare to those that are running 2 tasks on the one card as well. Time drops by half when only one card is fitted and only one task is running.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Jul 2011, 11:24:33 am
Now I get it.

Give me the output of :
ps -o cmd --no-headers -p $(pgrep setiathome)
and
nvidia-smi -q

Also post your xorg.conf and your xorg log.

I've never used a distribution provided nvidia driver. I always use nvidia's own driver. The latest official version is 275.09.07. Try that.
Title: Re: SETI MB CUDA for Linux
Post by: Terror Australis on 08 Jul 2011, 12:43:33 pm
Now I get it.

Give me the output of :
ps -o cmd --no-headers -p $(pgrep setiathome)
and
nvidia-smi -q
Why didn't you just say you wanted the Driver version and the CUDA app ? :D

V260.16 driver and setiathome-6.11.x86_64-pc-linux-gnu_cuda32 app

Also post your xorg.conf and your xorg log.

I've never used a distribution provided nvidia driver. I always use nvidia's own driver. The latest official version is 275.09.07. Try that.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Jul 2011, 12:47:19 pm
Why didn't you just say you wanted the Driver version and the CUDA app ? :D

I don't. Please provide what I say in my previous post.

I have to go for a few hours. I'll check back later.
Title: Re: SETI MB CUDA for Linux
Post by: Terror Australis on 08 Jul 2011, 01:08:05 pm
Now I get it.

Give me the output of :
ps -o cmd --no-headers -p $(pgrep setiathome)
and
nvidia-smi -q
Why didn't you just say you wanted the Driver version and the CUDA app ? :D

Driver is 260.16 and CUDA App is: setiathome-6.11.x86_64-pc-linux-gnu__cuda32


Quote
I've never used a distribution provided nvidia driver. I always use nvidia's own driver. The latest official version is 275.09.07. Try that.
I tried installing that driver but the NVidia installer is a pain, the xorg.conf it creates doesn't work (X crashes on start up) and the problem is too subtle for me to pick up quickly.

Quote
Also post your xorg.conf and your xorg log.
Will do but I'll have to reinstall the 2nd card first to get meaningful files. will post them tomorrow.

It's only BOINC that has a problem with the 2nd card; as a video device it works perfectly, desktop and movies are fine. The system and BOINC know there are 2 cards, it's only that both instances of the app run on the same card without realising it. I'm suspecting a weird bug in the app.

T.A.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 08 Jul 2011, 05:45:34 pm
Also what version are your cuda libraries?

2.6.33 is almost a year and a half old. Why don't you use a newer kernel?
Title: Re: SETI MB CUDA for Linux
Post by: Terror Australis on 10 Jul 2011, 09:27:44 am
Also what version are your cuda libraries?

2.6.33 is almost a year and a half old. Why don't you use a newer kernel?
Cos I work on the "if it ain't broke don't fix it" principle :) Plus I don't have the time to spend compiling kernels and drivers. Repository kernels may be slow updating but they work with a minimum of fuss (usually).

FYI
[root@Geekbox-Q9650 brodo]# ps -o cmd --no-headers -p $(pgrep setiathome)
../../projects/setiathome.berkeley.edu/setiathome-6.11.x86_64-pc-linux-gnu__cuda

[root@Geekbox-Q9650 brodo]# nvidia-smi -q

==============NVSMI LOG==============


Timestamp         : Sat Jul  9 13:05:33 2011

Driver Version         : 260.19.36


As I said Driver and app versions  ;)



The Xorg conf and log files are attached. There are no glaring errors that I can see. The video system works perfectly for movies and games. The only app that has any trouble with it is BOINC. It's a weird problem. I suspect it's a bug in the Linux app and the way it handles multiple instances running on multiple devices.

Under Windows, the same hardware runs without problems and both cards are used. Compare these 2 machines Windows (http://setiathome.berkeley.edu/hosts_user.php?sort=rpc_time&rev=0&show_all=1&userid=8523123) and Linux (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5756768) . It's exactly the same hardware, only the HDD with the operating system was changed.

T.A.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 10 Jul 2011, 10:22:54 am
Is this with only one GPU? Some quick notes:

1) Why do you run as root?
2) There is no "--device x" flag on the running app. Is this because of small terminal window? Make it bigger and recheck.
3) This isn't what I would expect from nvidia-smi. Sample output:
==============NVSMI LOG==============

Timestamp                       : Sun Jul 10 17:19:00 2011

Driver Version                  : 275.09.07

Attached GPUs                   : 3

GPU 0:3:0
    Product Name                : GeForce GTX 295
    Display Mode                : N/A
    Persistence Mode            : Disabled
    Driver Model
        Current                 : N/A
        Pending                 : N/A
    Serial Number               : N/A
    GPU UUID                    : N/A
    Inforom Version
        OEM Object              : N/A
        ECC Object              : N/A
        Power Management Object : N/A
    PCI
        Bus                     : 3
        Device                  : 0
        Domain                  : 0
        Device Id               : 5EB10DE
        Bus Id                  : 0:3:0
    Fan Speed                   : N/A
    Memory Usage
        Total                   : 895 Mb
        Used                    : 265 Mb
        Free                    : 630 Mb
    Compute Mode                : Default
    Utilization
        Gpu                     : N/A
        Memory                  : N/A
    Ecc Mode
        Current                 : N/A
        Pending                 : N/A
    ECC Errors
        Volatile
            Single Bit           
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit           
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
        Aggregate
            Single Bit           
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit           
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
    Temperature
        Gpu                     : 74 C
    Power Readings
        Power State             : N/A
        Power Management        : N/A
        Power Draw              : N/A
        Power Limit             : N/A
    Clocks
        Graphics                : N/A
        SM                      : N/A
        Memory                  : N/A


I don't remember how nvidia-smi behaved with 260.xx. As I said try a newer driver.

I'll check the xorgs later.
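
If only the temperatures are of interest, a report like the sample above can be reduced to one number per GPU. A minimal sketch, assuming the "Gpu : NN C" layout shown in that sample (older drivers such as 260.xx may print less, as seen earlier in the thread); gpu_temps is a made-up helper name.

```shell
# Read `nvidia-smi -q` output on stdin and print one temperature per GPU.
# Only lines of the form "Gpu : <number> C" match, so the
# "Gpu : N/A" line under Utilization is skipped.
gpu_temps() {
    awk '$1 == "Gpu" && $2 == ":" && $4 == "C" { print $3 }'
}
```

Usage would be along the lines of: nvidia-smi -q | gpu_temps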
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 10 Jul 2011, 07:41:20 pm
What are you trying to do with X? Do you have 1 or 2 monitors? Your xorg.conf is like you want to run two X screens in one X server, while your xorg.0.log says that you have only one monitor connected and then it proceeds to unload the nvidia kernel module  :-\

(II) UnloadModule: "nvidia"

Running
lsmod | grep nvidia
what does it give you?

Also, What does
ls /dev/nv*
give you?
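
For anyone repeating these two checks often, they can be scripted. This is just a sketch: the helper name nvidia_module_loaded is invented, and it assumes the kernel module shows up in lsmod under the exact name "nvidia" in the first column.

```shell
# Read `lsmod` output on stdin; succeed (exit 0) only if a module
# named exactly "nvidia" is listed in the first column.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}
```

Example: lsmod | nvidia_module_loaded && ls /dev/nv*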

I edited your xorg.conf commenting out the double device, screen and monitor sections and changing some other minor stuff.

I see at mandriva's site that they have released 2011 RC1. Why don't you try something newer?
Title: Re: SETI MB CUDA for Linux
Post by: Terror Australis on 10 Jul 2011, 09:52:29 pm
Hi Sunu
A quick explanation and time line,
Built machine and installed OS and BOINC - latest available updates applied - Started machine with only one monitor connected -  When BOINC running noticed only one GPU was being used - searched web for answers - one answer was to make 2nd card think it had a monitor connected - Even with a 2nd monitor physically connected 2nd card still not crunching - tried several more of the "fixes" found on the web - still no joy - checked hardware by installing different OS - AOK - help sought from Lunatics.
SETI is down atm and the box is out of work, I'll try your xorg file when the project is back up and report back then.

I find this really intriguing, BOINC recognises both cards and runs 2 instances of the crunching app - it's just that both run on the card in PCIE slot 1 - Using one card only, in either slot, it works properly - Even setting the BIOS to boot using PCIE slot 2 makes no difference. What I haven't done is try a different project to see if that works, I'll try that tonight.

One positive thing to come out of this, my vi skills have definitely improved  :)

Thanks for your help
T.A.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 10 Jul 2011, 10:22:26 pm
one answer was to make 2nd card think it had a monitor connected
This is/was for windows. Doesn't matter in linux. Unless you want to run multiple X screens or servers.



I'll try your xorg file when the project is back up and report back then.

You still owe me many things:
- A proper ps... when seti is running (supposedly or not) in two GPUs.
- A proper nvidia-smi output with and without seti running, current and new xorg.conf.
- Those two above lsmod... and ls... with your current and new xorg.conf.
- A xorg.0.log with the new xorg.conf.
Title: Re: SETI MB CUDA for Linux
Post by: Terror Australis on 10 Jul 2011, 11:02:56 pm
Is this with only one GPU?
No - both were installed

Quote
1) Why do you run as root?
I don't - I just happened to be root in the console when I ran those commands

Quote
2) There is no "--device x" flag on the running app. Is this because of small terminal window? Make it bigger and recheck.
Pardon my ignorance but where should that line be ?

Quote
3) This isn't what I would expect from nvidia-smi. Sample output:....
That's all that came up. (Is this a clue to the problem ?)

Title: Re: SETI MB CUDA for Linux
Post by: sunu on 11 Jul 2011, 07:11:13 am
Is this with only one GPU?
No - both were installed
It shows only one instance running.

2) There is no "--device x" flag on the running app. Is this because of small terminal window? Make it bigger and recheck.
Pardon my ignorance but where should that line be ?
Maximize your terminal window and rerun it. Alternatively rerun it with > ps.txt at the end and post the file.
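
The truncation can also be avoided without resizing the terminal, and the device flag can be pulled out of each command line directly. A sketch under assumptions: device_flags is a hypothetical helper, and it assumes the flag appears as "--device N" separated by a space.

```shell
# Read process command lines on stdin and print, for each one, the value
# that follows "--device", or "none" when the flag is absent.
device_flags() {
    while IFS= read -r line; do
        case $line in
            *"--device "*)
                rest=${line#*--device }   # drop everything up to the flag
                echo "${rest%% *}" ;;     # keep only the first word after it
            *)
                echo "none" ;;
        esac
    done
}
```

With procps, something like ps -ww -o cmd --no-headers -p "$(pgrep -d, setiathome)" | device_flags should print one line per running instance; -ww asks ps not to cut lines at the terminal width.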


3) This isn't what I would expect from nvidia-smi. Sample output:....
That's all that came up. (Is this a clue to the problem ?)
No, it is a clue that you need to update kernel, driver, distro, everything  ;D


Also, how is boinc installed? From repositories or from berkeley? In root or your home directory? I always keep everything boinc/seti related in my home directory and far away from root.
Title: Re: SETI MB CUDA for Linux
Post by: Terror Australis on 11 Jul 2011, 11:49:29 am
Just a quick one. I didn't have time today to go through all your suggestions but I did have time to sign up to Milky Way. Both cards ran on this project exactly as they should have. I've gone back to thinking this is a problem with the SAH app and the way it handles multiple instances on multiple devices.

I'm going back to square one tomorrow and will incorporate all your suggestions and see how it goes, if nothing else I'll have a tidier install.

BTW the correct command for nvidia-smi is: nvidia-smi -a

I'll post everything when I get a chance

T.A.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 11 Jul 2011, 11:56:57 am
BTW the correct command for nvidia-smi is: nvidia-smi -a

It must have been with your driver version. As I said I don't remember how nvidia-smi worked with that driver version. From nvidia-smi manual:
Quote
The  -a,  -s and -g arguments are now deprecated in favor of -q and -i, respectively. However, the old arguments still work for this release.

EDIT: I just checked that app and it doesn't obey --device x flags. So no matter what we do, I don't think we can make it utilize other than the first GPU.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Feb 2012, 02:11:16 pm
Been inactive for a while. boincing fried both my vid cards. both have bad memory now and cause periodic lockups in my system.. been saving for a new card. was looking at the nvidia 590 but i am wondering if i should wait for the 690.. i assume the 690 will be dual gpu like the 590. thing is i need support for 4 monitors and the 590 does that which is nice. hope the 690 does too...

so... any thoughts about 590 vs 690?
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 24 Feb 2012, 06:53:56 pm
Nobody really knows what 690 will be like. It is also unknown when exactly it will be out, some say by the 3rd quarter. I would still wait for it.

As for your 4 monitors I think it would be better to go to a dedicated card for them. That is, use a separate card for display purposes and dedicate the 690 for cuda computing. Look at http://www.nvidia.com/object/desktop-nvs.html

This is what I'm thinking to do in the future. I'm also waiting for the 690 to replace my old 295+285 setup and use the 690 for cuda and a lowly card, like a 9600gt, for display (I don't have big requirements like your 4 monitor setup), if my motherboard cooperates (if it can run a display from the secondary PCIE slot).
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 24 Feb 2012, 11:11:19 pm
that nvs is interesting.. extremely low power too.. might be a bit of a mess making x work with it though. it only supports
2560x1600 per screen however i have 2 screens defined as 3840x1200 spanning across 2 monitors each screen, then i have those 2 screens 'twinview' enabled so i can move the mouse from one to the other screen. i would have preferred twinview all the way across but i could not find a way to do it so each screen was well defined and unique. didnt work so i had to go to the above solution. then again maybe that will work since each screen spans 2 ports. too tired to think properly. but this looks like a very nice setup.. much better than sharing my desktop with boinc.

Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 24 Feb 2012, 11:31:52 pm
echoing Sunu's comments regarding the 690 directly, it looks like a good time to sit and wait if you can hold out & don't need a new card right now.  It's looking like an uncommon situation development wise at the moment, where  nv driver & dev support staff appear to be in hiding off doing something else (for example the latest windows WHQL drivers aren't very well polished & there is currently zero publicly visible progress on reported issues).   With the company having recently reported something like $4 billion earnings, I'd say either the staff are hard at work on that next gen to get it 'right'... or possibly abducted by aliens.

Anyway, given that, by my understanding, Linux drivers tend to be simpler to develop than Windows ones, probably you won't see much of the expected teething problems over here in Linux land, but might be in for a quiet wait.  Windows display driver model based drivers, have already been buckling under multiple card setups for some time with latency & traffic issues [and most recently basic power management...  ::) ], so it's fairly logical that the situation will be worse for us in the short term & nv will have to find ways around that.

Jason
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 25 Feb 2012, 07:49:10 am
ugh. just reviewed the specs of the nvs 420 and 450.. neither will work for me :( they only have 512mb ram . i would clobber that in a heartbeat. typically my setup chews up a good 1gb+  vidram and thats without boinc running..
Title: Re: SETI MB CUDA for Linux
Post by: Mike on 25 Feb 2012, 08:12:51 am
ugh. just reviewed the specs of the nvs 420 and 450.. neither will work for me :( they only have 512mb ram . i would clobber that in a heartbeat. typically my setup chews up a good 1gb+  vidram and thats without boinc running..

You can get an EVGA 450 SC with 1GB.

Mike
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 25 Feb 2012, 09:56:43 am
If it is only for display, what do you need 1GB for? Maybe use 2 NVSs?

Nvidia gives extra capabilities/configuration options to their professional series of cards. For example those cards support BaseMosaic. BaseMosaic seems the perfect solution in your case:
Quote
Option "BaseMosaic" "boolean"

    This option can be used to extend a single X screen transparently across all of the available display outputs on each GPU. This is like SLI Mosaic mode except that it does not require a video bridge connected to the graphics cards. Due to this Base Mosaic does not guarantee there will be no tearing between the display boundaries. Base Mosaic is supported on all the configurations supported by SLI Mosaic Mode. It is also supported on Quadro FX 380, Quadro FX 580 and all G80 or higher non-mobile NVS cards.

    Use this in conjunction with the MetaModes X configuration option to specify the combination of mode(s) used on each display. nvidia-xconfig can be used to configure Base Mosaic via a command like nvidia-xconfig --base-mosaic --metamodes=METAMODES where the METAMODES string specifies the desired grid configuration. For example, to configure four DFPs in a 2x2 configuration, each running at 1920x1024, with two DFPs connected to two cards, the command would be:

        nvidia-xconfig --base-mosaic --metamodes="GPU-0.DFP-0: 1920x1024+0+0, GPU-0.DFP-1: 1920x1024+1920+0, GPU-1.DFP-0: 1920x1024+0+1024, GPU-1.DFP-1: 1920x1024+1920+1024"
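
Typing the METAMODES string by hand gets error-prone for bigger grids, so it can be generated. This is only a sketch under assumptions: the helper name metamodes is invented, every display is assumed to run the same mode, and displays are assumed to be DFP-0 and DFP-1 on each GPU, assigned row by row as in the 2x2 example above.

```shell
# Build a MetaModes string for a grid of identical displays.
# Arguments: width height columns rows
metamodes() {
    w=$1; h=$2; cols=$3; rows=$4
    out=""; n=0; r=0
    while [ "$r" -lt "$rows" ]; do
        c=0
        while [ "$c" -lt "$cols" ]; do
            gpu=$(( n / 2 )); dfp=$(( n % 2 ))   # two outputs per GPU
            entry="GPU-$gpu.DFP-$dfp: ${w}x${h}+$(( c * w ))+$(( r * h ))"
            if [ -z "$out" ]; then out=$entry; else out="$out, $entry"; fi
            n=$(( n + 1 )); c=$(( c + 1 ))
        done
        r=$(( r + 1 ))
    done
    printf '%s\n' "$out"
}
```

Calling metamodes 1920 1024 2 2 reproduces the string from the quoted example, so it could be passed on as nvidia-xconfig --base-mosaic --metamodes="$(metamodes 1920 1024 2 2)".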

Title: Re: SETI MB CUDA for Linux
Post by: riofl on 27 Feb 2012, 08:08:08 am
ugh. just reviewed the specs of the nvs 420 and 450.. neither will work for me :( they only have 512mb ram . i would clobber that in a heartbeat. typically my setup chews up a good 1gb+  vidram and thats without boinc running..

You can get an EVGA 450 SC with 1GB.

Mike


yeah i was just looking at the power consumption... 40w max. if i get a 590 or equivalent 690, i am afraid the power requirements would leave me lacking for my display card if i went with a mainline card.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 27 Feb 2012, 08:18:28 am
If it is only for display, what do you need 1GB for? Maybe use 2 NVSs?

Nvidia gives extra capabilities/configuration options to their professional series of cards. For example those cards support BaseMosaic. BaseMosaic seems the perfect solution in your case:
Quote
Option "BaseMosaic" "boolean"

    This option can be used to extend a single X screen transparently across all of the available display outputs on each GPU. This is like SLI Mosaic mode except that it does not require a video bridge connected to the graphics cards. Due to this Base Mosaic does not guarantee there will be no tearing between the display boundaries. Base Mosaic is supported on all the configurations supported by SLI Mosaic Mode. It is also supported on Quadro FX 380, Quadro FX 580 and all G80 or higher non-mobile NVS cards.

    Use this in conjunction with the MetaModes X configuration option to specify the combination of mode(s) used on each display. nvidia-xconfig can be used to configure Base Mosaic via a command like nvidia-xconfig --base-mosaic --metamodes=METAMODES where the METAMODES string specifies the desired grid configuration. For example, to configure four DFPs in a 2x2 configuration, each running at 1920x1024, with two DFPs connected to two cards, the command would be:

        nvidia-xconfig --base-mosaic --metamodes="GPU-0.DFP-0: 1920x1024+0+0, GPU-0.DFP-1: 1920x1024+1920+0, GPU-1.DFP-0: 1920x1024+0+1024, GPU-1.DFP-1: 1920x1024+1920+1024"



my desktop usage is a vidram hog. i keep at the very least 5 browsers open, each with an average of 14 tabs .. these take up vid space because they are constantly updating and ready to come forward instantly. add to that the backgrounds, stickynotes, open editors, konsole with 9 ssh tabs open, 20 gkrellm hardware monitors running and a bunch of other stuff. running diag software revealed my average vidram usage without boinc running is 800mb to 1gb.
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 27 Feb 2012, 09:31:26 pm
running diag software revealed my average vidram usage without boinc running is 800mb to 1gb.

That would be nvidia-smi -a ?
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 28 Feb 2012, 11:30:15 am
running diag software revealed my average vidram usage without boinc running is 800mb to 1gb.

That would be nvidia-smi -a ?

never tried running that. used a program a friend wrote. running the above command now, i get the following usage stats.. note that i am not running at 'full bore' mode since it is not needed at the moment. i imagine these would be higher if i were.

gpu0 memory 53%
gpu1 memory 36%
gpu2 memory 22%

gpu0 has 1g vidram
gpu1 and 2 have 1.8g rounded off

the gpu1 and 2 gtx295 card came with 1.8g ram advertised.
the gpu0 gtx285 card 1gb advertised
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 29 Feb 2012, 04:39:27 pm
What driver version do you use? I don't remember nvidia-smi showing percentages for memory use.
Title: Re: SETI MB CUDA for Linux
Post by: Jason G on 29 Feb 2012, 09:36:40 pm
What driver version do you use? I don't remember nvidia-smi showing percentages for memory use.
It shows Total, Used & Free here on the 280.13 Ubuntu proprietary driver (installed through the Additional Drivers tool), Ubuntu 11.10 64 bit.  I'm guessing that he just did the math from that.

Jason
Title: Re: SETI MB CUDA for Linux
Post by: aaronhaviland on 29 Feb 2012, 10:16:18 pm
nvidia-smi (without the -a option) now prints a pretty formatted output which includes the % memory usage.

Code: [Select]
+------------------------------------------------------+                       
| NVIDIA-SMI 3.295.20   Driver Version: 295.20         |                       
|-------------------------------+----------------------+----------------------+
| Nb.  Name                     | Bus Id        Disp.  | Volatile ECC SB / DB |
| Fan   Temp   Power Usage /Cap | Memory Usage         | GPU Util. Compute M. |
|===============================+======================+======================|
| 0.  GeForce GTX 460           | 0000:01:00.0  N/A    |       N/A        N/A |
|  20%   36 C  N/A   N/A /  N/A |  26%  200MB /  767MB |  N/A      Default    |
|-------------------------------+----------------------+----------------------|
| Compute processes:                                               GPU Memory |
|  GPU  PID     Process name                                       Usage      |
|=============================================================================|
|  0.           Not Supported                                                 |
+-----------------------------------------------------------------------------+
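
For what it's worth, the 26% in the Memory Usage column is consistent with used divided by total. A rough Python sketch that pulls the two figures out of that row and redoes the math, assuming the row layout matches the 295.20 output above. The regex and the name `memory_percent` are illustrative only, not anything nvidia-smi provides.

```python
import re

# Memory row copied from the nvidia-smi 295.20 table above.
ROW = "|  20%   36 C  N/A   N/A /  N/A |  26%  200MB /  767MB |  N/A      Default    |"


def memory_percent(row):
    """Return (used_mb, total_mb, percent) parsed from a memory usage row.

    Hypothetical helper: assumes the row contains 'USEDMB / TOTALMB'.
    """
    match = re.search(r"(\d+)MB\s*/\s*(\d+)MB", row)
    if match is None:
        raise ValueError("no 'used / total' memory figures found")
    used, total = int(match.group(1)), int(match.group(2))
    return used, total, round(100 * used / total)


used_mb, total_mb, percent = memory_percent(ROW)
print(f"{used_mb}MB of {total_mb}MB is {percent}%")
```

The same division works on the Total/Used figures that older drivers print with the -a option.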
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Mar 2012, 04:49:54 am
What driver version do you use? I don't remember nvidia-smi showing percentages for memory use.

nvsmi log says this:

Driver Version                  : 260.19.36
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Mar 2012, 04:59:05 am
nvidia-smi (without the -a option) now prints a pretty formatted output which includes the % memory usage.

Code: [Select]
+------------------------------------------------------+                       
| NVIDIA-SMI 3.295.20   Driver Version: 295.20         |                       
|-------------------------------+----------------------+----------------------+
| Nb.  Name                     | Bus Id        Disp.  | Volatile ECC SB / DB |
| Fan   Temp   Power Usage /Cap | Memory Usage         | GPU Util. Compute M. |
|===============================+======================+======================|
| 0.  GeForce GTX 460           | 0000:01:00.0  N/A    |       N/A        N/A |
|  20%   36 C  N/A   N/A /  N/A |  26%  200MB /  767MB |  N/A      Default    |
|-------------------------------+----------------------+----------------------|
| Compute processes:                                               GPU Memory |
|  GPU  PID     Process name                                       Usage      |
|=============================================================================|
|  0.           Not Supported                                                 |
+-----------------------------------------------------------------------------+


I get nothing like that. If I just use it without the option, I only get the timestamp and driver version.

Because of the length of the output, I am only including the first gpu report:


==============NVSMI LOG==============


Timestamp                       : Thu Mar  1 04:51:16 2012

Driver Version                  : 260.19.36


GPU 0:
        Product Name            : GeForce GTX 285
        PCI Device/Vendor ID    : 5e310de
        PCI Location ID         : 0:1:0
        Board Serial            : 3169719755757
        Display                 : Connected
        Temperature             : 44 C
        Fan Speed               : 100%
        Utilization
            GPU                  : 15%
            Memory              : 34%
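
The older log format is plain "key : value" text with indentation for sections, so it is easy to pick apart with a script. A minimal Python sketch, assuming the layout shown above. The function name `parse_nvsmi_log` is hypothetical, the sample is a shortened copy of the log, and nested sections such as Utilization are flattened into dotted keys based on indentation.

```python
# Shortened excerpt of the NVSMI log above (driver 260.19.36).
SAMPLE = """\
Driver Version                  : 260.19.36

GPU 0:
        Product Name            : GeForce GTX 285
        Temperature             : 44 C
        Fan Speed               : 100%
        Utilization
            GPU                 : 15%
            Memory              : 34%
"""


def parse_nvsmi_log(text):
    """Flatten an NVSMI log excerpt into a {dotted.key: value} dict."""
    result = {}
    stack = []  # (indent, name) of the section headers currently open
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        # Close sections that are not parents of this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        key, sep, value = line.strip().partition(":")
        key, value = key.strip(), value.strip()
        if sep and value:
            path = [name for _, name in stack] + [key]
            result[".".join(path)] = value
        else:
            # A header such as "GPU 0:" or "Utilization" opens a section.
            stack.append((indent, key))
    return result


info = parse_nvsmi_log(SAMPLE)
print(info["GPU 0.Product Name"], info["GPU 0.Utilization.Memory"])
```

Newer drivers print a different layout, so this only applies to logs shaped like the one in this post.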




Title: Re: SETI MB CUDA for Linux
Post by: sunu on 01 Mar 2012, 07:36:00 am
You're using a fairly old driver version. The good thing about it is that most of nvidia-smi's counters were still functional on non-professional cards back then.
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 01 Mar 2012, 09:36:30 am
cool. i'm not one to upgrade just for the sake of upgrading unless there is a security issue that affects me. i am of the 'if it ain't broke don't fix it' crowd. hehe i am still running kernel 2.6.31...

my boss gets really frustrated with me because other than security updates, i will only update our servers maybe twice a year and then only with versions that are at least 3 months old. my thought is that within 3 months anything that was broken would be fixed or at least would be talked about. my boss is one of those who has to have a new version of anything the second it is released. that gets him into a lot of trouble sometimes but he refuses to wake up and do his upgrades in a smart manner. of course i could wonder about his intelligence and sanity since he is a major preacher in the "microsoft is next to godliness" church.

live and let live i guess. keeps me working.. :P
Title: Re: SETI MB CUDA for Linux
Post by: sunu on 21 Mar 2012, 08:53:35 am
The new Kepler 6xx cards seem to support 4 displays on a single card, so you won't need sli/professional/quadro cards any more: http://www.guru3d.com/news/msi-geforce-gtx-680-photo/

We will know more tomorrow...
Title: Re: SETI MB CUDA for Linux
Post by: riofl on 21 Mar 2012, 12:10:13 pm
excellent! thank you! these still work. just the occasional pausing is annoying, but i can wait a while. this looks good.