hi,
not sure if Boinc is reading the Gflops correctly for my 465 either. It can go from 570->855->2253 Gflops in a day.
855 is the correct (nVidia's) rating for the 465, so I would expect your 460 to be around that mark as well
Check out http://en.wikipedia.org/wiki/GeForce_400_Series and you can see Wiki's rating there
Thanks Richard,
I saw the post you made to DA on the Boinc_Alpha list; from what he said it sounds like nVidia didn't write this into their API correctly?
Also any ideas why Boinc would report my card at 2253 Gflops?
31/07/2010 17:50:48 | | NVIDIA GPU 0: GeForce GTX 465 (driver version unknown, CUDA version 3010, compute capability 2.0, 994MB, 2253 GFLOPS peak)
31/07/2010 17:50:48 | | ATI GPU 0: ATI Radeon HD5x00 series (Redwood) (CAL version 1.4.737, 1024MB, 620 GFLOPS peak)
No idea on that one, sorry.
No worries, it reports correctly occasionally so it's not a big issue, just a shame I don't actually get that amount.
// Estimate of peak FLOPS.
// FLOPS for a given app may be much less;
// e.g. for SETI@home it's about 0.18 of the peak
//
inline double peak_flops() {
// clock rate is scaled down by 1000;
// each processor has 8 or 32 cores;
// each core can do 2 ops per clock
//
int cores_per_proc = (prop.major>=2)?32:8;
double x = (1000.*prop.clockRate) * prop.multiProcessorCount * cores_per_proc * 2.;
return x?x:5e10;
}
Hmmm, I was reading this stuff not long ago... looking
<core_client_version>6.11.4</core_client_version>
<![CDATA[
<stderr_txt>
Before cudaAcc_initializeDevice(): Boinc passed gCUDADevPref 1
setiathome_CUDA: Found 1 CUDA device(s):
Device 1: GeForce GTX 465, 993 MiB, regsPerBlock 32768
computeCap 2.0, multiProcs 11
clockRate = 3200000
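Plugging those numbers into the peak_flops() code above reproduces every figure seen in this thread; here's a minimal standalone sketch (my own check, not project code), using multiProcs 11 and the 32 cores per multiprocessor that compute capability 2.0 gets:

#include <cstdio>

int main() {
    // GTX 465: 11 multiprocessors, compute capability 2.0 -> 32 cores each
    const int mps = 11, cores_per_proc = 32;
    const double clocks_khz[] = { 810000., 1215000., 3200000. };
    for (int i = 0; i < 3; i++) {
        // same arithmetic as peak_flops(): kHz * 1000 * MPs * cores * 2 ops/clock
        double flops = 1000. * clocks_khz[i] * mps * cores_per_proc * 2.;
        printf("clockRate %7.0f -> %4.0f GFLOPS\n", clocks_khz[i], flops / 1e9);
    }
    return 0;
}

That prints 570, 855 and 2253 GFLOPS respectively, matching the 810000 default discussed below, the real 1215 MHz shader clock, and the bogus 3200000 reading above. So the wandering Gflops figure is just whatever clockRate BOINC happened to see, fed straight through that formula.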
But whether it's a bug in the app(s) coding, the NVidia API, or EVGA Precision, I couldn't begin to tell.
Don't think it's a bug in EVGA, as GPU-Z is reporting this as well as nVidia's Performance tool.
'Cuda Cores' @ 48 per Multiprocessor ( 7 * 48 = 336 cores )
...for Compute Capability 2.1...
...But this still leaves the question: where is Boinc reading the 3200MHz and the 810MHz from in the stderr text?
I still reckon something's broken :D [That's the same 810000 shown on my 480... a default of some sort I reckon. For the 3200, some addition to the GPU properties retrieved by the library might have shifted the data elements one space in the data structure... They are kind of nearby, so it'd be a fairly simple task to mess up the properties parsing in the Boinc code... checking]
Got to agree that something is broken; all the figures indicate (to me at least) that the 810 and the 3200MHz figures are either defaults as you said, or Boinc is reading the wrong clock speed (memory instead of core).
GeForce/ION Release 256 WHQL NVIDIA Recommended 258.96 July 19, 2010
GeForce/ION Release 256 BETA 258.69 June 29, 2010
GeForce/ION Release 256 WHQL 257.21 June 15, 2010
GeForce/ION Release 256 BETA 257.15 May 24, 2010
GeForce Release 197 WHQL 197.75 May 10, 2010
GeForce Release 197 BETA 197.75 May 3, 2010
GeForce Release 197 WHQL 197.41 April 9, 2010
setiathome_CUDA: Found 1 CUDA device(s):
Device 1: GeForce GTX 465, 993 MiB, regsPerBlock 32768
computeCap 2.0, multiProcs 11
clockRate = 3200000
After I reset the value back to the correct 1215MHz:
setiathome_CUDA: Found 1 CUDA device(s):
Device 1: GeForce GTX 465, 993 MiB, regsPerBlock 32768
computeCap 2.0, multiProcs 11
clockRate = 1215000
Think I can see what value Boinc is reading in the stderr now...
...
Sorry TouchuvGrey, I kinda hijacked your thread for a while there ::)
Did you manage to find out why your 460 was reporting such a low Gflops rating in Boinc?
From everything I've read about the 460 it should be around the 900 Gflops mark.
With my 465 it seems that the default profile the nVidia performance tool loaded was corrupt and was shooting my shader clock sky high, giving a massively over-rated Gflops figure.
Not sure if this would be the same problem (maybe downclocking, as another post suggested as well).
Ghost
And again, this is quite different from the problem with the 460 / GF104 that TouchuvGrey reported at the start of this thread: that one stems from the
'Cuda Cores' @ 48 per Multiprocessor ( 7 * 48 = 336 cores )
for Compute Capability 2.1
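To put rough numbers on that (assuming the stock GTX 460 shader clock of 1350 MHz; that figure is my assumption, not from the logs above): BOINC's formula counts 7 x 32 = 224 cores, giving 224 x 2 x 1.35 GHz = ~605 GFLOPS, while the real 7 x 48 = 336 cores would give 336 x 2 x 1.35 GHz = ~907 GFLOPS, right around the ~900 Gflops mark mentioned earlier. So the 460 reads low by exactly the 32/48 ratio, rather than by a bad clock reading.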
I am planning to build a system with :
1x GT 240
1x GTX 460
How do I have to modify my app_info.xml to run those two cards with x32f?
The GTX 460 can run more than one task in parallel, right? Was it 3? How do I have to configure this?
For installation, I would use the Lunatics Installer first, BUT not start Boinc yet ... Then follow the same procedure as I gave perryjay below, but modify the app info to suit the Cuda 3.0 executable names & DLLs.
Change the <count>1</count> to <count>0.5</count> or <count>0.33</count> if you want 2 or 3 tasks (rather than the default 1) to run on the Fermi card:
<app_version>
.....
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
.....
</app_info>
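To make that less cryptic, here's a rough sketch of what a complete app_version block looks like with the fractional count in place. The app name is SETI@home's usual one, but the version number and file name below are placeholders, so copy the real ones from the app_info.xml your installer wrote:

<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>                 <!-- placeholder -->
    <coproc>
        <type>CUDA</type>
        <count>0.5</count>    <!-- 0.5 = two tasks share the card, 0.33 = three -->
    </coproc>
    <file_ref>
        <file_name>MB_x32f_CUDA30.exe</file_name>  <!-- placeholder name -->
        <main_program/>
    </file_ref>
</app_version>

Note the <count> has to live inside the <coproc> block; a bare <count> elsewhere does nothing.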
As far as I know the Fermi build takes about 300MB per task, so you should be able to run 2 tasks. Sure Jason will correct me if I'm wrong on this ;D On my 465 I get the best throughput with two tasks running at a time (994MB).
But that's what I've found on this card: 1 shorty ~3min 20, 2 at a time ~5min 50, 3 at a time just over 7min 30, ...
On my 465 I get the best throughput with two tasks running at a time
OK ... when I do the math 3 WUs at a time is the optimum, no?
Completion time for 3 WUs:
1 at a time: 3 x 3:20 -> 10:00
2 at a time: 1.5 x 5:50 -> 8:45
3 at a time: 1 x 7:30 -> 7:30
Would you say it's OK if I go for the 768MB version?
I also don't think I explained the math that well either; the times I gave were for each task to complete.
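For the record, turning those per-task shorty times into throughput: 1 at a time is one task per 3:20, so about 18/hour; 2 at a time is two per 5:50, so about 20.6/hour; 3 at a time is three per 7:30, so 24/hour. On those numbers three-up does win on paper, even though in practice the 465 above did best with two, so it's worth timing your own card rather than trusting the arithmetic.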
You guys that run 2 or 3 tasks at a time in your fermi cards, what RAC does that give to your cards?
On the 480, with x32f 3.1, running on an E8400 OC'd to 3.6GHz, I just passed 21000 today. The CPU is running AKv8b with occasional AstroPulse... I think it should start to level off soon, but don't know for sure. I'm finishing two midrange tasks on the 480 in about 9-12 mins, or two shorties in ~3 mins. Single-task times were ~7-8 mins & ~2 mins for mid-AR & shorty respectively.
As for the question about the 768MB card, I would advise against that. 768MB of memory seems quite small nowadays. Maybe 2-3 years ago yes, but not now. Get the higher memory model.
I agree. For me it's the memory bus width though. With this much processing power at hand, memory bandwidth will nearly always be the primary concern, and nVidia wisely have included some special cache control mechanisms in the driver API (which we can't switch to yet, until I have the complete set of freaky powerspectrums, still on track for Christmas 2010 ;), later to be replaced with a freaky powerspectrum cascade kernel to maximise cache benefits).
How can I see in the WU XML file whether or not a WU is a VHAR?
Shorties (are they always VHARs?) usually run in 5:50 on my GT 240. So it's about 10 per hour.
Somehow I can't believe the 465 is only 2x my cheapish GT 240 ...!?
It's easier just to look at the stderr task output after the task reports (and the servers are up) though.
Yeah bummer ... now since the servers are offline I'm starting to poke around :-)
I am planning to build a system with :
1x GT 240
1x GTX 460
How do I have to modify my app_info.xml to run those two cards with x32f?
The GTX 460 can run more than one task in parallel, right? Was it 3? How do I have to configure this?
It's a shortcoming in BOINC; there's no way to set up a separate app_version for each card.
Joe
I'm not certain, but I'm fairly sure that <plan_class> isn't used at the client scheduler level for determining device allocation.
I have such a setup running too 8)
Successfully? What OS/Catalyst? I failed to launch such a config because of ATI's lack of support for the Win 2003 Server OS, and Claggy had bad failures with different apps when both GPUs were installed... Interesting what makes your config so successful :)
...
There is an exact angle-range threshold above which a WU is considered VHAR, and those make the shorties. I forget exactly what that threshold is, around 1.1 I think; Joe keeps that kind of information around here :D. Above that, whatever it is, they all tend to take pretty much the same short time to process.
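If you do want to check the WU file itself, the header carries the angle range in a <true_angle_range> tag. A quick sketch of pulling it out and classifying it; the tag name and the rough ~1.1 / ~0.13 cutoffs here are from memory, so treat them as approximate rather than gospel:

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char** argv) {
    if (argc < 2) {
        std::cerr << "usage: archeck <workunit_file>" << std::endl;
        return 1;
    }
    std::ifstream in(argv[1]);
    std::string line;
    while (std::getline(in, line)) {
        // look for the angle-range tag in the WU header
        std::string::size_type p = line.find("<true_angle_range>");
        if (p == std::string::npos) continue;
        double ar = std::atof(line.c_str() + p + 18);  // 18 = strlen("<true_angle_range>")
        // cutoffs are approximate, from forum memory, not project source
        const char* cls = ar > 1.1 ? "VHAR (shorty)" : ar < 0.13 ? "VLAR" : "midrange";
        std::cout << "AR = " << ar << " -> " << cls << std::endl;
        return 0;
    }
    std::cerr << "no <true_angle_range> found" << std::endl;
    return 1;
}

Point it at one of the WU files in BOINC's projects\setiathome.berkeley.edu folder.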
GPU-Z 0.4.5 shows CUDA to not be available on my GTX 460, nor is DirectCompute 4.0. It shows both ARE available on my GTS 250.
Any thoughts or suggestions will be greatly appreciated.
I get the same on the 465 with GPU-Z: no CUDA, no DirectCompute 5.0, but I do have OpenCL available.
Seems okay on mine (GTX460). Shows as having all the boxes ticked down the bottom. Did you get this on the machine with mixed cards?
Yes, it's currently got the GTX 465 and a HD5670 in it. Admittedly I've uninstalled PhysX, so I'm not expecting that to show up (possible issue with running Raistmer's CPU/GPU AP build), but not really sure why CUDA at least isn't showing up.
I get the same on the 465 with GPU-Z: no CUDA, no DirectCompute 5.0, but I do have OpenCL available.
I have a "mixed" system with ATI + NVIDIA too. GPUZ seems to have problems with this. Have you tried something like "GPU Caps Viewer" instead?Thanks, have just downloaded this - lot more information available in Caps Viewer - now seeing CUDA is available on the 465
Can someone explain to me ( in small words please )
just what i'm seeing and what it means ?
warp size remains the same, and so does the number of truly simultaneous threads.
Also, why such a big downgrade in frequency???
Perhaps the GTX 460 is reporting GPU clock speed instead of shader clock speed?
Looks like that. At least from my experience with other NV and ATI cards.
Claggy
Some count DDR and some not? ...
warp size remains the same, and so does the number of truly simultaneous threads.
Also, why such a big downgrade in frequency???
Good news - then I may hope NV will like float4 too as ATI does :)
[and it will save OpenCL kernels from rewriting for NV ]
LoL ;D
Good news - then I may hope NV will like float4 too as ATI does :)
[and it will save OpenCL kernels from rewriting for NV ]
Yes, though when it comes to that you may need to be backporting some of my experimental code, so don't get too complacent there :P
Cuda 3.0 build of x32f (seeing as you're a squire). Probably 2 instances per Fermi card at a time.
What ( if anything ) do i need to add or modify in my config.xml file to get
2 instances running on my GTX460 ? Taking into account that the other card
is a GTS250
Doesn't work.
Saw that thread, but haven't been over there since; did the GPU_RAM suggestion by Joe not work?
See http://setiathome.berkeley.edu/forum_thread.php?id=61184&nowrap=true#1027766
Downloaded and installed Lunatics_Win64v0.37_(SSE3+)_AP505r409_AKv8bx64_Cudax32f.exe
( Beta ) hoping this will help. i've watched my host average drop from 13,500 per day to
8700 per day since installing the GTX460. Suspecting i screwed something up along the way.
i have a Black Belt in that sort of thing. <sigh>
Actually, we screwed up. We failed to make it clear that all S@H CUDA applications built before the Fermi cards were released have problems on those cards, so you were running the v12 application and turning in a lot of tasks with a false result_overflow. Many of those ended up being judged invalid so got no credits. Some also happened to be paired with another host also running old CUDA code on Fermi, those unfortunately get validated and assimilated into the database. However, they overflow so quickly that there are few credits granted even for those.
There may be a lingering problem because the DCF has adapted to doing a lot of the work in extremely short time. That could lead to BOINC killing some tasks for "exceeded elapsed time limit", the infamous -177 error. The new rescheduler (http://www.efmer.eu/forum_tt/index.php?topic=428.0) Fred M made has an expert feature to prevent any possibility of that, and IIRC there's a way to use that feature without actually rescheduling tasks. I hope someone who has actually used that will post a quick clear procedure. I don't have any GPU capable of crunching, so am only going on what I've read elsewhere.
You might also want to reduce your cache settings before asking for work during the uptime beginning Friday, the system thinking your GTX460 is much faster than it really is could lead to getting more work than you really want. After the host has 50 or so validated tasks done with x32f the server average should be close enough to not worry about that much, so the cache can be made as large as you need before the next outage.
Joe
On the expert tab, there is a check box that says Limit rsc_fpops_bound to avoid -177 errors. Check that off, and go to the first tab and push run. It takes a few seconds, but it works perfectly. I stopped a bunch of -177 errors cold by running that. Make sure you are not in simulation mode, which is also on the expert tab.
Steve
i plead guilty as charged, i have a long history of not reading the instructions.
You are using Fred's (Efmer) 2.0 rescheduler aren't you? If you are, just look along the top, the expert tab is the last one on the right.
I read them; it's just that they might as well be in Greek most of the time. I need a lot of hand holding on these things. ;D
Actually, we screwed up. We failed to make it clear that all S@H CUDA applications built before the Fermi cards were released have problems on those cards
...
Joe
Well, we were slow to pick up, but we were there by early June: I think all the warnings were in place by
http://lunatics.kwsn.net/1-discussion-forum/when-corrupted-results-get-validated.msg27734.html#msg27734
http://lunatics.kwsn.net/gpu-crunching/unified-installer-with-fermi-support.msg27926.html#msg27926
Anybody who installed any v12 or other non-Fermi app after then, with all the warnings here and on the main project, just wasn't reading. And of course, from that point onwards, just allowing stock download would have worked.
It sounds like i need to be using that rescheduler, where do i get it and how and where do i install it ?
A lot of this is Greek to me too, but what little i can get from it is a little that i did not know before.
Agreed, anyone who read and remembered the 19th reply in the first thread you mentioned should not have gotten into difficulties. The second thread is in the section of this site closed to all except developers and alpha testers, so TouchuvGrey will not have seen that. In retrospect, the proactive approach you suggested there was a very good idea, and I hereby declare you not a part of the "we" who screwed up. And of course your warnings on the main project, etc. were very helpful. So much so that I for one didn't realize until quite recently that the smoldering embers were threatening to turn into a full-blown fire.
Basically, the difficulty is that we developers and many others here have a very technical orientation. What is obvious to us is not necessarily so for our users, and we failed to communicate effectively across that gap. Even a clear statement on the front page and in the description of the 0.36 downloads that V12 isn't Fermi compatible would not have completely prevented the developing problem; many are buying Fermi cards as an upgrade, and no reinstall seems necessary in such cases. I'd be more comfortable if we had at least tried that kind of warning, though.
Joe
... many are buying Fermi cards as an upgrade and no reinstall seems necessary in such cases. ...
Since you already print out stuff like "Device 1: GeForce GTX 460, 1023 MiB, regsPerBlock 32768" I guess you already have some kind of GPU detection code in place ;)
That one runs on all the nVidia GPUs ;) and yes, when the internal dispatch mechanisms in the app, runtime and driver use the right kernels for the right device, as nVidia intended, the appropriate codepaths from the multiple available are chosen & executed. Unfortunately that mechanism didn't exist prior to Cuda 3.0, so older (pre Cuda 3.0) builds need to be deprecated.
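For what that dispatch mechanism needs at build time: the executable has to carry compiled kernels (or PTX) for each target architecture, which nvcc does via -gencode. Something along these lines, purely as a sketch with made-up file names, not our actual build line:

nvcc -gencode arch=compute_10,code=sm_10 \
     -gencode arch=compute_13,code=sm_13 \
     -gencode arch=compute_20,code=sm_20 \
     -gencode arch=compute_20,code=compute_20 \
     -o MB_x32f_CUDA30 kernels.cu

The last -gencode embeds PTX rather than machine code, which is what lets the driver JIT kernels for architectures that didn't exist when the app was built.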
You are using Fred's (Efmer) 2.0 rescheduler aren't you? If you are, just look along the top, the expert tab is the last one on the right.
i plead guilty as charged, i have a long history of not reading the instructions.
I read them; it's just that they might as well be in Greek most of the time. I need a lot of hand holding on these things. ;D
It sounds like i need to be using that rescheduler, where do i get it and how and where do i install it ?
A lot of this is Greek to me too, but what little i can get from it is a little that i did not know before.
The download and instructions are at http://www.efmer.eu/forum_tt/index.php?topic=428.0. Installation is just a matter of taking the 64 bit version of the executable out of the zip file and putting it someplace convenient. If you have BOINC installed where it chooses by default the program should have no difficulty finding what it needs to work with.
The instructions on that linked page say it will reschedule both VLAR and VHAR to CPU unless you uncheck the "Always move VHAR's to CPU" box; I definitely recommend doing so to keep VHARs on GPU. The only VLARs you might have are some older unmarked ones which have been reissued; x32f is better at doing those than stock 6.10, but leaving "Always move VLAR's to CPU" checked is probably what you'll want.
Joe
"Always move VHAR's to CPU" unchecked, "limit rsc_fpops...." checked. Is there anything
else i should change from default ?
As for the users that don't visit here, they expect a finished product they can just plug and play. They couldn't care less what you went through to bring it to them. No amount of warnings is going to stop them, and they are going to be the loudest of the loud when they screw it up. Not much you can do about them though. As far as your quote from Joe goes, I think you made it abundantly clear there was a problem with the apps; the problem was a number of the users grabbing the installer, plugging it in and forgetting it. Then when they finally did notice, blame you guys. About all you can do is keep that "For advanced users only" sign up and maybe add "Use at your own risk".
Well said! :D Pretty much my thoughts on this point.