DA has added COPROCS to HOST_INFO Changeset 19797
Claggy
Quote from: Fred M on 06 Dec 2009, 04:13:43 pm
...It looks these values don't have much to do with the actual calculation speed. More like theoretical values out of the sales brochure....

Absolutely. [rant]I actually got a little agitated when I installed Boinc 6.10.18 on my system & noticed it was claiming ~.~317GFlops 'peak' ... (the words 'when hell freezes over' came to mind. Then I had a good laugh about it and felt much better). Of course with memory bound algorithms like larger FFT sizes, on that hardware, real world performance is more like 18-20 GFlops, around twice that of each of my CPU cores with the same problem. I've little doubt that kernels that do multiple redundant operations on the same data repeatedly, sitting in registers, on register sized (very small, ~8k total IIRC) datasets could achieve that kind of throughput ... The dumb thing is that sounds like graphics frame by frame processing more than general purpose computation[/rant]

I'm fairly certain the synthetic estimates should be good enough, provided we scale the number appropriately to a realistic range... but there is the alternative of benching with real code if we find better accuracy is needed (which I doubt, but the option is there).

Jason
...It looks these values don't have much to do with the actual calculation speed. More like theoretical values out of the sales brochure....
...Using the CudaApi, get the cores, clock and make our own more realistic formula.
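As a rough illustration of what such a formula might look like, here is a minimal C++ sketch. It takes the multiprocessor count and the clock rate as the CUDA API reports them (`multiProcessorCount`, and `clockRate` in kHz), derives a theoretical peak, and then scales it down. The 8 cores per multiprocessor and 2 flops per core per clock are assumptions that fit compute capability 1.x cards; the `efficiency` parameter and both function names are hypothetical and not taken from BOINC.

```cpp
#include <cassert>

// Theoretical peak in FLOPS from values the CUDA API reports.
// Assumptions (not from BOINC): 8 cores per multiprocessor and
// 2 flops per core per clock, which fits compute capability 1.x.
double peak_flops(int multiprocessor_count, int clock_rate_khz) {
    assert(multiprocessor_count > 0 && clock_rate_khz > 0);
    const double cores_per_mp = 8.0;
    const double flops_per_core_per_clock = 2.0;
    const double clock_hz = clock_rate_khz * 1000.0;  // kHz -> Hz
    return clock_hz * multiprocessor_count * cores_per_mp *
           flops_per_core_per_clock;
}

// Scale the peak into a more realistic range.
// `efficiency` is a hypothetical tuning factor in (0, 1]; a sensible
// value would have to come from benchmarking real code.
double realistic_flops(int multiprocessor_count, int clock_rate_khz,
                       double efficiency) {
    assert(efficiency > 0.0 && efficiency <= 1.0);
    return peak_flops(multiprocessor_count, clock_rate_khz) * efficiency;
}
```

For a card reporting 16 multiprocessors at 1890000 kHz, `peak_flops(16, 1890000)` works out to about 484 GFlops under these assumptions, so the whole question is which efficiency factor to apply.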
DA has added COPROCS to HOST_INFO Changeset 19797
Claggy
Edit: and after Changeset 19798 get_host_info() GUI RPC now contains GPU info
Quote from: Claggy on 06 Dec 2009, 08:27:42 pm
DA has added COPROCS to HOST_INFO Changeset 19797
Claggy
Edit: and after Changeset 19798 get_host_info() GUI RPC now contains GPU info

No reply to my question on boinc_alpha, but it obviously had the desired effect!
I'll grab one as soon as there's a build available, and let you see it so you can decide if it's any use.
Hopefully the info is more than the same meaningless peak flop value that's already in the message log.
Better prediction of relative performance, I would add. As an absolute value it's meaningless, of course. A 9400GT performs about the same as 1 core of my quad now, and its estimated value is 45 GFlops. IMO a single core of the quad has a very different actual performance...
And this sort of correction: if (!strcmp(plan_class, "cuda23")) { flops *= 1.01; } to account for the cuda dll being more efficient.
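If more plan classes ever needed their own correction, the one-off check could be turned into a small lookup table. The following is only a sketch: the `PlanClassFactor` struct, the `kFactors` table and the `adjust_flops` name are hypothetical, and the only factor taken from this thread is the 1.01 for "cuda23".

```cpp
#include <cassert>
#include <cstring>

// Hypothetical table of per-plan-class correction factors.
// Only the "cuda23" entry (1.01) comes from the post above.
struct PlanClassFactor {
    const char* plan_class;
    double factor;
};

static const PlanClassFactor kFactors[] = {
    {"cuda23", 1.01},
};

// Returns flops scaled by the factor for plan_class, or unchanged
// when the plan class has no entry in the table.
double adjust_flops(double flops, const char* plan_class) {
    assert(plan_class != nullptr);
    for (const PlanClassFactor& f : kFactors) {
        if (!std::strcmp(plan_class, f.plan_class)) {
            return flops * f.factor;
        }
    }
    return flops;
}
```

An unknown plan class falls through and keeps the original estimate, which matches what the single `if` does today.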
<coprocs>
  <coproc_cuda>
    <count>1</count>
    <name>GeForce 9800 GTX/9800 GTX+</name>
    <req_secs>0</req_secs>
    <req_instances>0</req_instances>
    <estimated_delay>0</estimated_delay>
    <drvVersion>19038</drvVersion>
    <cudaVersion>2030</cudaVersion>
    <totalGlobalMem>536543232</totalGlobalMem>
    <sharedMemPerBlock>16384</sharedMemPerBlock>
    <regsPerBlock>8192</regsPerBlock>
    <warpSize>32</warpSize>
    <memPitch>262144</memPitch>
    <maxThreadsPerBlock>512</maxThreadsPerBlock>
    <maxThreadsDim>512 512 64</maxThreadsDim>
    <maxGridSize>65535 65535 1</maxGridSize>
    <totalConstMem>65536</totalConstMem>
    <major>1</major>
    <minor>1</minor>
    <clockRate>1890000</clockRate>
    <textureAlignment>256</textureAlignment>
    <deviceOverlap>1</deviceOverlap>
    <multiProcessorCount>16</multiProcessorCount>
  </coproc_cuda>
</coprocs>
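To get at individual fields such as `clockRate` and `multiProcessorCount` from a reply like this, something along the following lines might do for a quick experiment. It is a deliberately naive sketch, not BOINC's own XML parser: it assumes plain tags without attributes, returns only the first occurrence, and the helper names `tag_text` and `tag_long` are made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns the text between the first <tag> and its </tag>, with
// surrounding whitespace trimmed, or an empty string if not found.
// Naive on purpose: no attributes, no nesting of the same tag.
std::string tag_text(const std::string& xml, const std::string& tag) {
    assert(!tag.empty());
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    const std::size_t start = xml.find(open);
    if (start == std::string::npos) return "";
    const std::size_t begin = start + open.size();
    const std::size_t end = xml.find(close, begin);
    if (end == std::string::npos) return "";
    const std::string text = xml.substr(begin, end - begin);
    const std::size_t first = text.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) return "";
    const std::size_t last = text.find_last_not_of(" \t\r\n");
    return text.substr(first, last - first + 1);
}

// Parses the tag's text as a base-10 integer; returns `fallback`
// when the tag is missing or empty.
long tag_long(const std::string& xml, const std::string& tag,
              long fallback) {
    const std::string text = tag_text(xml, tag);
    if (text.empty()) return fallback;
    return std::strtol(text.c_str(), nullptr, 10);
}
```

With the dump above held in a string, `tag_long(xml, "clockRate", -1)` and `tag_long(xml, "multiProcessorCount", -1)` would give the two numbers needed for any home-made flops formula.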