SETI MB CUDA for Linux

sunu:
Yes, the BOINC scheduler is pretty messed up.

IanJ:
Sunu,
 I'm just reporting back to update you on my Fedora Core 10 64-bit machine. I have successfully installed the CUDA 2.3 libraries and the machine has been chugging through the workunits.
 I have a couple of questions, one of which is worrying. Occasionally over the past week the machine has locked up, and only a reset has cleared the issue. I had a look at /var/log/messages and I see a number of "NVRM: Xid" messages. I've googled around and didn't get a clear answer, so does anyone here have an idea? Here are the entries:
Sep 17 17:11:41 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
Sep 18 04:58:05 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
Sep 18 05:11:07 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100
 The second question: is there any way to ensure that the card is being used to the max? Is there any tuning or monitoring of the card that would assist?
 Thanks
 Ian

riofl:
i have been getting this exact same message, with the same numbers in the same positions, on my tesla. i have been searching for what these mean for almost 3 months. i only know that i have a problem with the tesla: even though it has never locked up the entire machine, the tesla itself locks up occasionally, requiring a stop and restart of boinc. until i fixed the thermal pad problem on its ram chips, it would require a power off to reset it. i suspect this error may indicate ram problems. i ran a video ram testing utility available on the net but it showed nothing wrong.

if the card is still in warranty i suggest getting a swap-out just to be safe.

sunu:
Unfortunately, Xid errors aren't very useful for troubleshooting. Riofl talks about problems with his thermal pads; you might have temperature problems as well, not necessarily with your graphics card but possibly with your CPU or northbridge. What are your temperatures like?
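
Even if the codes themselves say little, it can help to pull the Xid lines out of the log with their timestamps so they can be lined up against the lock-ups. Below is a minimal sketch in Python that assumes the syslog format shown in Ian's post; the function names `parse_xid_lines` and `count_by_code` are made up for illustration and are not part of any NVIDIA or BOINC tool.

```python
import re
from collections import Counter

# Matches syslog lines shaped like the ones quoted in this thread, e.g.
# "Sep 17 17:11:41 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 ..."
# (assumption: other distributions or drivers may format these differently)
XID_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:.*?"
    r"NVRM: Xid \((?P<dev>[^)]*)\): (?P<code>\d+),"
)


def parse_xid_lines(lines):
    """Return (timestamp, device, xid_code) tuples for every Xid line found."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append(
                (match.group("stamp"), match.group("dev"), int(match.group("code")))
            )
    return events


def count_by_code(events):
    """Tally how many times each Xid code appears."""
    return Counter(code for _, _, code in events)


if __name__ == "__main__":
    # Sample input copied from the log excerpt earlier in the thread.
    sample = [
        "Sep 17 17:11:41 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100",
        "Sep 18 04:58:05 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100",
        "Sep 18 05:11:07 fc10 kernel: NVRM: Xid (0002:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100",
    ]
    events = parse_xid_lines(sample)
    for stamp, dev, code in events:
        print(stamp, dev, code)
    print(dict(count_by_code(events)))
```

To run it against a real log, pass an open file instead of the sample list, for example `parse_xid_lines(open("/var/log/messages"))` (reading that file usually needs root).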

Do you use KDE? I have the impression that I see many more reports of problems with the nvidia drivers under KDE than under GNOME.

Which nvidia driver version do you use? You could try a different one.

Also check your invalid tasks page. If you have invalid tasks, this could mean a hardware problem with your graphics card, and you may have to replace it.

As for your second question about a monitoring app for GPU use, there isn't any. I use GNOME's sensors-applet to monitor my CPU and GPU temperatures, so I can always see whether they are working (i.e. crunching) or not.
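
For anyone who wants numbers rather than a panel applet, here is a rough sketch of the same idea in Python. It assumes a driver whose `nvidia-smi` supports the `--query-gpu` option, which may well not be the case for the driver versions discussed in this thread; the helper names `parse_gpu_status` and `read_gpu_status` are hypothetical.

```python
import subprocess

# Fields requested from nvidia-smi (assumption: the installed driver
# supports --query-gpu with these field names).
QUERY = "temperature.gpu,utilization.gpu"


def _to_int(value):
    """Convert a field to int, or None if the driver reports it as unavailable."""
    try:
        return int(value)
    except ValueError:
        return None


def parse_gpu_status(text):
    """Parse 'temp, util' CSV lines (one per GPU) into a list of dicts."""
    readings = []
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 2:
            continue  # skip anything that is not a two-column row
        readings.append(
            {"temp_c": _to_int(fields[0]), "util_pct": _to_int(fields[1])}
        )
    return readings


def read_gpu_status():
    """Run nvidia-smi once and return the parsed readings."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_status(result.stdout)


if __name__ == "__main__":
    try:
        for index, reading in enumerate(read_gpu_status()):
            print(index, reading)
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        print("could not query nvidia-smi:", err)
```

A GPU that is crunching should show a steadily high utilization figure; a reading that drops to zero while BOINC still lists a running CUDA task would suggest the card has stalled.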

riofl:
ahh another clue.. i use kde since i use many of its built in features... i tried gnome, but i admit it was years ago and it did not do what i needed at that time... a small thing to most, but since i switch between a fair number of desktops every few minutes, it helps me a lot to have a unique background for each desktop, and back then gnome did not do that. i understand it is now built in or there is an app to do it.. i might get adventurous and try the new gnome.

yeah my pads on the tesla were dried, brittle and cracked, and some simply were not there, so i had to clean everything and find the proper thick 2mm pads with fiberglass webbing, which i had to buy from the UK. i spent an afternoon being an artist with scissors and knife since the pads came in large sheets. since then temps everywhere, including the housing to the touch, are considerably cooler, but it still locks up maybe once a week or so... maybe something was damaged by heat while the bad pads were on it. at this point i am not worried.. i am going to replace it soon and then send it to my boss, who wants to run it in a much more forgiving windows environment. hehe let him have the problems :)
