hi,
not sure if Boinc is reading the Gflops correctly for my 465 either. It can go from 570->855->2253 Gflops in a day.
855 is the correct (nVidia's) rating for the 465, so I would expect your 460 to be around that mark as well
Check out http://en.wikipedia.org/wiki/GeForce_400_Series and you can see Wiki's rating there
Thanks Richard,
I saw the post you made to DA on the Boinc_Alpha list; from what he said it sounds like nVidia didn't write this into their API correctly?
Also any ideas why Boinc would report my card at 2253 Gflops?
31/07/2010 17:50:48 | | NVIDIA GPU 0: GeForce GTX 465 (driver version unknown, CUDA version 3010, compute capability 2.0, 994MB, 2253 GFLOPS peak)
31/07/2010 17:50:48 | | ATI GPU 0: ATI Radeon HD5x00 series (Redwood) (CAL version 1.4.737, 1024MB, 620 GFLOPS peak)
No idea on that one, sorry.
No worries, it reports correctly occasionally so it's not a big issue, just a shame I don't actually get that amount.
// Estimate of peak FLOPS.
// FLOPS for a given app may be much less;
// e.g. for SETI@home it's about 0.18 of the peak
//
inline double peak_flops() {
// clock rate is scaled down by 1000;
// each processor has 8 or 32 cores;
// each core can do 2 ops per clock
//
int cores_per_proc = (prop.major>=2)?32:8;
double x = (1000.*prop.clockRate) * prop.multiProcessorCount * cores_per_proc * 2.;
return x?x:5e10;
}
Hmmm, I was reading this stuff not long ago... looking
<core_client_version>6.11.4</core_client_version>
<![CDATA[
<stderr_txt>
Before cudaAcc_initializeDevice(): Boinc passed gCUDADevPref 1
setiathome_CUDA: Found 1 CUDA device(s):
Device 1: GeForce GTX 465, 993 MiB, regsPerBlock 32768
computeCap 2.0, multiProcs 11
clockRate = 3200000
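Plugging those numbers into the peak_flops() code above reproduces every figure seen in this thread; here's a minimal standalone sketch (my own check, not project code), using multiProcs 11 and the 32 cores per multiprocessor that compute capability 2.0 gets:

#include <cstdio>

int main() {
    // GTX 465: 11 multiprocessors, compute capability 2.0 -> 32 cores each
    const int mps = 11, cores_per_proc = 32;
    const double clocks_khz[] = { 810000., 1215000., 3200000. };
    for (int i = 0; i < 3; i++) {
        // same arithmetic as peak_flops(): kHz * 1000 * MPs * cores * 2 ops/clock
        double flops = 1000. * clocks_khz[i] * mps * cores_per_proc * 2.;
        printf("clockRate %7.0f -> %4.0f GFLOPS\n", clocks_khz[i], flops / 1e9);
    }
    return 0;
}

That prints 570, 855 and 2253 GFLOPS respectively, matching the 810000 default discussed below, the real 1215 MHz shader clock, and the bogus 3200000 reading above. So the wandering Gflops figure is just whatever clockRate BOINC happened to see, fed straight through that formula.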
But whether it's a bug in the app(s) coding, the NVidia API, or EVGA Precision, I couldn't begin to tell.
Don't think it's a bug in EVGA, as GPU-Z is reporting this as well as nVidia's Performance tool.
'Cuda Cores' @ 48 per Multiprocessor ( 7 * 48 = 336 cores )
...for Compute Capability 2.1...
...But this still leaves the question: where is Boinc reading the 3200MHz and the 810MHz from in the stderr text?
I still reckon something's broken :D [That's the same 810000 shown on my 480... a default of some sort I reckon. For the 3200, some addition to the GPU properties retrieved by the library might have shifted the data elements one space in the data structure... They are kind of nearby, so it'd be a fairly simple task to mess up the properties parsing in the Boinc code... checking]
Got to agree that something is broken; all the figures indicate (to me at least) that the 810 and the 3200MHz figures are either defaults as you said, or Boinc is reading the wrong clock speed (memory instead of core).
GeForce/ION Release 256 WHQL NVIDIA Recommended 258.96 July 19, 2010
GeForce/ION Release 256 BETA 258.69 June 29, 2010
GeForce/ION Release 256 WHQL 257.21 June 15, 2010
GeForce/ION Release 256 BETA 257.15 May 24, 2010
GeForce Release 197 WHQL 197.75 May 10, 2010
GeForce Release 197 BETA 197.75 May 3, 2010
GeForce Release 197 WHQL 197.41 April 9, 2010
setiathome_CUDA: Found 1 CUDA device(s):
Device 1: GeForce GTX 465, 993 MiB, regsPerBlock 32768
computeCap 2.0, multiProcs 11
clockRate = 3200000
After I reset the value back to the correct 1215MHz:
setiathome_CUDA: Found 1 CUDA device(s):
Device 1: GeForce GTX 465, 993 MiB, regsPerBlock 32768
computeCap 2.0, multiProcs 11
clockRate = 1215000
Think I can see what value Boinc is reading in the stderr now...
...
Sorry TouchuvGrey, I kinda hijacked your thread for a while there ::)
Did you manage to find out why your 460 was reporting such a low Gflops rating in Boinc?
From everything I've read about the 460 it should be around the 900 Gflops mark.
With my 465 it seems that the default profile the nVidia performance tool loaded was corrupt and was shooting my shader clock sky high, giving a massively over-rated Gflops figure.
Not sure if this would be the same problem (maybe downclocking, as another post suggested as well).
Ghost
And again, this is quite different from the problem with the 460 / GF104 that TouchuvGrey reported at the start of this thread: that one stems from the
'Cuda Cores' @ 48 per Multiprocessor ( 7 * 48 = 336 cores )
for Compute Capability 2.1
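To put rough numbers on that (assuming the stock GTX 460 shader clock of 1350 MHz; that figure is my assumption, not from the logs above): BOINC's formula counts 7 x 32 = 224 cores, giving 224 x 2 x 1.35 GHz = ~605 GFLOPS, while the real 7 x 48 = 336 cores would give 336 x 2 x 1.35 GHz = ~907 GFLOPS, right around the ~900 Gflops mark mentioned earlier. So the 460 reads low by exactly the 32/48 ratio, rather than by a bad clock reading.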
I am planning to build a system with :
1x GT 240
1x GTX 460
How do I have to modify my app_info.xml to run those two cards with x32f?
The GTX 460 can run more than one task in parallel, right? Was it 3? How do I have to configure this?
For installation, I would use the Lunatics Installer first, BUT not start Boinc yet ... Then follow the same procedure as I gave perryjay below, but modify the app info to suit the Cuda 3.0 executable names & DLLs.
Change the <count>1</count> to <count>0.5</count> or <count>0.33</count> if you want 2 or 3 tasks (rather than the default 1) to run on the Fermi card:
<app_version>
.....
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
.....
</app_info>
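To make that less cryptic, here's a rough sketch of what a complete app_version block looks like with the fractional count in place. The app name is SETI@home's usual one, but the version number and file name below are placeholders, so copy the real ones from the app_info.xml your installer wrote:

<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>                 <!-- placeholder -->
    <coproc>
        <type>CUDA</type>
        <count>0.5</count>    <!-- 0.5 = two tasks share the card, 0.33 = three -->
    </coproc>
    <file_ref>
        <file_name>MB_x32f_CUDA30.exe</file_name>  <!-- placeholder name -->
        <main_program/>
    </file_ref>
</app_version>

Note the <count> has to live inside the <coproc> block; a bare <count> elsewhere does nothing.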
As far as I know the Fermi build takes about 300MB per task, so you should be able to run 2 tasks. Sure Jason will correct me if I'm wrong on this ;D On my 465 I get the best throughput with two tasks running at a time (994MB).
But that's what I've found on this card: 1 shorty ~3min 20, 2 at a time ~5min 50, 3 at a time just over 7min 30, ...
On my 465 I get the best throughput with two tasks running at a time
OK ... when I do the math 3 WUs at a time is the optimum, no?
Completion time for 3 WUs:
1 at a time: 3 x 3:20 -> 10:00
2 at a time: 1.5 x 5:50 -> 8:45
3 at a time: 1 x 7:30 -> 7:30
Would you say it's OK if I go for the 768MB version?
I also don't think I explained the math that well either; the times I gave were for each task to complete.
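For the record, turning those per-task shorty times into throughput: 1 at a time is one task per 3:20, so about 18/hour; 2 at a time is two per 5:50, so about 20.6/hour; 3 at a time is three per 7:30, so 24/hour. On those numbers three-up does win on paper, even though in practice the 465 above did best with two, so it's worth timing your own card rather than trusting the arithmetic.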
You guys that run 2 or 3 tasks at a time in your fermi cards, what RAC does that give to your cards?
On the 480, with x32f 3.1, running on an E8400 OC'd to 3.6GHz, I just passed 21000 today. The CPU is running AKv8b with occasional AstroPulse... I think it should start to level off soon, but don't know for sure. I'm finishing two midrange tasks on the 480 in about 9-12 mins, or two shorties in ~3 mins. Single-task times were ~7-8 mins & ~2 mins for mid-AR & shorty respectively.
As for the question about the 768MB card, I would advise against that. 768MB of memory seems quite small nowadays. Maybe 2-3 years ago yes, but not now. Get the higher memory model.
I agree. For me it's the memory bus width though. With this much processing power at hand, memory bandwidth will nearly always be the primary concern, and nVidia wisely have included some special cache control mechanisms in the driver API (which we can't switch to yet, until I have the complete set of freaky powerspectrums, still on track for Christmas 2010 ;), later to be replaced with a freaky powerspectrum cascade kernel to maximise cache benefits).
How can I see in the WU XML file whether or not a WU is a VHAR?
Shorties (are they always VHARs?) usually run in 5:50 on my GT 240. So it's about 10 per hour.
Somehow I can't believe the 465 is only 2x my cheapish GT 240 ...!?
It's easier just to look at the stderr task output after the task reports (and the servers are up) though.
Yeah bummer ... now since the servers are offline I'm starting to poke around :-)
I am planning to build a system with :
1x GT 240
1x GTX 460
How do I have to modify my app_info.xml to run those two cards with x32f?
The GTX 460 can run more than one task in parallel, right? Was it 3? How do I have to configure this?
It's a shortcoming in BOINC; there's no way to set up a separate app_version for each card.
Joe
I'm not certain, but I'm fairly sure that <plan_class> isn't used at the client scheduler level for determining device allocation.
I have such a setup running too 8)
Successfully? What OS/Catalyst? I failed to launch such a config because of ATI's lack of support for the Win 2003 Server OS, and Claggy had bad failures with different apps when both GPUs were installed... Interesting what makes your config so successful :)
...
There is an exact angle-range threshold above which a WU is considered VHAR, and those make the shorties. I forget exactly what that threshold is, around 1.1 I think; Joe keeps that kind of information around here :D. Above that, whatever it is, they all tend to take pretty much the same short time to process.
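If you do want to check the WU file itself, the header carries the angle range in a <true_angle_range> tag. A quick sketch of pulling it out and classifying it; the tag name and the rough ~1.1 / ~0.13 cutoffs here are from memory, so treat them as approximate rather than gospel:

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char** argv) {
    if (argc < 2) {
        std::cerr << "usage: archeck <workunit_file>" << std::endl;
        return 1;
    }
    std::ifstream in(argv[1]);
    std::string line;
    while (std::getline(in, line)) {
        // look for the angle-range tag in the WU header
        std::string::size_type p = line.find("<true_angle_range>");
        if (p == std::string::npos) continue;
        double ar = std::atof(line.c_str() + p + 18);  // 18 = strlen("<true_angle_range>")
        // cutoffs are approximate, from forum memory, not project source
        const char* cls = ar > 1.1 ? "VHAR (shorty)" : ar < 0.13 ? "VLAR" : "midrange";
        std::cout << "AR = " << ar << " -> " << cls << std::endl;
        return 0;
    }
    std::cerr << "no <true_angle_range> found" << std::endl;
    return 1;
}

Point it at one of the WU files in BOINC's projects\setiathome.berkeley.edu folder.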
GPU-Z 0.4.5 shows CUDA to not be available on my GTX 460, nor is DirectCompute 4.0. It shows both ARE available on my GTS 250.
Any thoughts or suggestions will be greatly appreciated.
I get the same on the 465 with GPU-Z: no CUDA, no DirectCompute 5.0, but I do have OpenCL available.
Seems okay on mine (GTX460). Shows as having all the boxes ticked down the bottom. Did you get this on the machine with mixed cards?
Yes, it's currently got the GTX 465 and a HD5670 in it. Admittedly I've uninstalled PhysX, so I'm not expecting that to show up (possible issue with running Raistmer's CPU/GPU AP build), but not really sure why CUDA at least isn't showing up.
I get the same on the 465 with GPU-Z: no CUDA, no DirectCompute 5.0, but I do have OpenCL available.
I have a "mixed" system with ATI + NVIDIA too. GPUZ seems to have problems with this. Have you tried something like "GPU Caps Viewer" instead?Thanks, have just downloaded this - lot more information available in Caps Viewer - now seeing CUDA is available on the 465
Can someone explain to me ( in small words please )
just what i'm seeing and what it means ?
warp size remains the same, and so does the number of truly simultaneous threads.
Also, why such a big downgrade in frequency???
Perhaps the GTX 460 is reporting GPU clock speed instead of shader clock speed?
Looks like that. At least from my experience with other NV and ATI cards.
Claggy
Some count DDR and some not? ...
warp size remains the same, and so does the number of truly simultaneous threads.
Also, why such a big downgrade in frequency???
Good news - then I may hope NV will like float4 too as ATI does :)
[and it will save OpenCL kernels from rewriting for NV ]
LoL ;D
Good news - then I may hope NV will like float4 too as ATI does :)
[and it will save OpenCL kernels from rewriting for NV ]
Yes, though when it comes to that you may need to be backporting some of my experimental code, so don't get too complacent there :P
Cuda 3.0 build of x32f (seeing as you're a squire). Probably 2 instances per Fermi card at a time.
What ( if anything ) do i need to add or modify in my config.xml file to get
2 instances running on my GTX460 ? Taking into account that the other card
is a GTS250
Doesn't work.
Saw that thread, but haven't been over there since; did the GPU_RAM suggestion by Joe not work?
See http://setiathome.berkeley.edu/forum_thread.php?id=61184&nowrap=true#1027766
Downloaded and installed Lunatics_Win64v0.37_(SSE3+)_AP505r409_AKv8bx64_Cudax32f.exe
( Beta ) hoping this will help. i've watched my host average drop from 13,500 per day to
8700 per day since installing the GTX460. Suspecting i screwed something up along the way.
i have a Black Belt in that sort of thing. <sigh>
Actually, we screwed up. We failed to make it clear that all S@H CUDA applications built before the Fermi cards were released have problems on those cards, so you were running the v12 application and turning in a lot of tasks with a false result_overflow. Many of those ended up being judged invalid so got no credits. Some also happened to be paired with another host also running old CUDA code on Fermi, those unfortunately get validated and assimilated into the database. However, they overflow so quickly that there are few credits granted even for those.
There may be a lingering problem because the DCF has adapted to doing a lot of the work in extremely short time. That could lead to BOINC killing some tasks for "exceeded elapsed time limit", the infamous -177 error. The new rescheduler (http://www.efmer.eu/forum_tt/index.php?topic=428.0) Fred M made has an expert feature to prevent any possibility of that, and IIRC there's a way to use that feature without actually rescheduling tasks. I hope someone who has actually used that will post a quick clear procedure. I don't have any GPU capable of crunching, so am only going on what I've read elsewhere.
You might also want to reduce your cache settings before asking for work during the uptime beginning Friday, the system thinking your GTX460 is much faster than it really is could lead to getting more work than you really want. After the host has 50 or so validated tasks done with x32f the server average should be close enough to not worry about that much, so the cache can be made as large as you need before the next outage.
Joe
On the expert tab, there is a check box that says Limit rsc_fpops_bound to avoid -177 errors. Check that off, and go to the first tab and push run. It takes a few seconds, but it works perfectly. I stopped a bunch of -177 errors cold by running that. Make sure you are not in simulation mode, which is also on the expert tab.
Steve
i plead guilty as charged, i have a long history of not reading the instructions.
You are using Fred's (Efmer) 2.0 rescheduler aren't you? If you are, just look along the top, the expert tab is the last one on the right.
I read them; it's just that they might as well be in Greek most of the time. I need a lot of hand holding on these things. ;D
Actually, we screwed up. We failed to make it clear that all S@H CUDA applications built before the Fermi cards were released have problems on those cards
...
Joe
Well, we were slow to pick up, but we were there by early June: I think all the warnings were in place by
http://lunatics.kwsn.net/1-discussion-forum/when-corrupted-results-get-validated.msg27734.html#msg27734
http://lunatics.kwsn.net/gpu-crunching/unified-installer-with-fermi-support.msg27926.html#msg27926
Anybody who installed any v12 or other non-Fermi app after then, with all the warnings here and on the main project, just wasn't reading. And of course, from that point onwards, just allowing stock download would have worked.
It sounds like i need to be using that rescheduler, where do i get it and how and where do i install it ?
A lot of this is Greek to me too, but what little i can get from it is a little that i did not know before.
Agreed, anyone who read and remembered the 19th reply in the first thread you mentioned should not have gotten into difficulties. The second thread is in the section of this site closed to all except developers and alpha testers, so TouchuvGrey will not have seen that. In retrospect, the proactive approach you suggested there was a very good idea, and I hereby declare you not a part of the "we" who screwed up. And of course your warnings on the main project, etc. were very helpful. So much so that I for one didn't realize until quite recently that the smoldering embers were threatening to turn into a full-blown fire.
Basically, the difficulty is that we developers and many others here have a very technical orientation. What is obvious to us is not necessarily so for our users, and we failed to communicate effectively across that gap. Even a clear statement on the front page and in the description of the 0.36 downloads that V12 isn't Fermi compatible would not have completely prevented the developing problem; many are buying Fermi cards as an upgrade, and no reinstall seems necessary in such cases. I'd be more comfortable if we had at least tried that kind of warning, though.
Joe
... many are buying Fermi cards as an upgrade and no reinstall seems necessary in such cases. ...
Since you already print out stuff like "Device 1: GeForce GTX 460, 1023 MiB, regsPerBlock 32768" I guess you already have some kind of GPU detection code in place ;)
That one runs on all the nVidia GPUs ;) and yes, when the internal dispatch mechanisms in the app, runtime and driver use the right kernels for the right device, as nVidia intended, the appropriate codepaths from the multiple available are chosen & executed. Unfortunately that mechanism didn't exist prior to Cuda 3.0, so older (pre Cuda 3.0) builds need to be deprecated.
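For what that dispatch mechanism needs at build time: the executable has to carry compiled kernels (or PTX) for each target architecture, which nvcc does via -gencode. Something along these lines, purely as a sketch with made-up file names, not our actual build line:

nvcc -gencode arch=compute_10,code=sm_10 \
     -gencode arch=compute_13,code=sm_13 \
     -gencode arch=compute_20,code=sm_20 \
     -gencode arch=compute_20,code=compute_20 \
     -o MB_x32f_CUDA30 kernels.cu

The last -gencode embeds PTX rather than machine code, which is what lets the driver JIT kernels for architectures that didn't exist when the app was built.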
You are using Fred's (Efmer) 2.0 rescheduler aren't you? If you are, just look along the top, the expert tab is the last one on the right.
i plead guilty as charged, i have a long history of not reading the instructions.
I read them; it's just that they might as well be in Greek most of the time. I need a lot of hand holding on these things. ;D
It sounds like i need to be using that rescheduler, where do i get it and how and where do i install it ?
A lot of this is Greek to me too, but what little i can get from it is a little that i did not know before.
The download and instructions are at http://www.efmer.eu/forum_tt/index.php?topic=428.0. Installation is just a matter of taking the 64 bit version of the executable out of the zip file and putting it someplace convenient. If you have BOINC installed where it chooses by default the program should have no difficulty finding what it needs to work with.
The instructions on that linked page say it will reschedule both VLAR and VHAR to CPU unless you uncheck the "Always move VHAR's to CPU" box; I definitely recommend doing so to keep VHARs on GPU. The only VLARs you might have are some older unmarked ones which have been reissued; x32f is better at doing those than stock 6.10, but leaving "Always move VLAR's to CPU" checked is probably what you'll want.
Joe
"Always move VHAR's to CPU" unchecked, "limit rsc_fpops...." checked. Is there anything
else i should change from default ?
As for the users that don't visit here, they expect a finished product they can just plug and play. They couldn't care less what you went through to bring it to them. No amount of warnings is going to stop them, and they are going to be the loudest of the loud when they screw it up. Not much you can do about them though. As far as your quote from Joe goes, I think you made it abundantly clear there was a problem with the apps; the problem was a number of the users grabbing the installer, plugging it in and forgetting it. Then when they finally did notice, blame you guys. About all you can do is keep that "For advanced users only" sign up and maybe add "Use at your own risk".
Well said! :D Pretty much my thoughts on this point.