Author Topic: Unified installer add flops (Read 31190 times)

Claggy · « **Reply #15 on:** 06 Dec 2009, 08:27:42 pm »

DA has added COPROCS to HOST_INFO Changeset 19797

Claggy

Edit: and after Changeset 19798 get_host_info() GUI RPC now contains GPU info

cristipurdel · « **Reply #16 on:** 07 Dec 2009, 01:32:32 am »

Quote from: Claggy on 06 Dec 2009, 08:27:42 pm

DA has added COPROCS to HOST_INFO Changeset 19797

Claggy

In simple English, this would mean the projects which run on the GPU will improve in which way?

efmer (fred) · « **Reply #17 on:** 07 Dec 2009, 03:26:38 am »

Quote from: Jason G on 06 Dec 2009, 04:25:51 pm

Quote from: Fred M on 06 Dec 2009, 04:13:43 pm
...
It looks these values don't have much to do with the actual calculation speed. More like theoretical values out of the sales brochure.
...
Absolutely.
[rant]
I actually got a little agitated when I installed Boinc 6.10.18 on my system & noticed it was claiming ~.~317GFlops 'peak' ... (the words 'when hell freezes over' came to mind. Then I had a good laugh about it and felt much better). Of course with memory-bound algorithms like larger FFT sizes, on that hardware, real world performance is more like 18-20 GFlops, around twice that of each of my CPU cores with the same problem. I've little doubt that kernels that do multiple redundant operations on the same data repeatedly, sitting in registers, on register-sized (very small, ~8k total IIRC) datasets could achieve that kind of throughput ... The dumb thing is that sounds like graphics frame-by-frame processing more than general purpose computation[/rant]

I'm fairly certain the synthetic estimates should be good enough, provided we scale the number appropriately to a realistic range... but there is the alternative of benchmarking with real code if we find better accuracy is needed (which I doubt, but the option is there).

Jason

We probably could get a good estimate on the performance by using the number of processor cores and clock speed. And maybe the type of cores used etc.
The best way would be to run a small WU and see how long it takes to process, nothing beats the real world.
But it is a question of need: the flops don't need to be 100% accurate. I don't think a 20% change will do much, as the WUs will vary in time anyway.
Using the CudaApi, get the cores, clock and make our own more realistic formula.
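
A minimal sketch of that idea, not anybody's actual installer code. The function names and the default `efficiency` value are made up for illustration. It assumes 8 cores per multiprocessor (true for compute capability 1.x cards, not for other hardware) and counts one multiply-add as 2 flops per core per clock. The clock is taken in kHz because that is the unit the CUDA device query uses for `clockRate`.

```python
# Hypothetical sketch: estimate GPU throughput from core count and clock.
CORES_PER_MULTIPROCESSOR = 8   # holds for compute capability 1.x only
FLOPS_PER_CORE_PER_CLOCK = 2   # assumption: one multiply-add counted as 2 flops


def peak_gflops(multiprocessors: int, clock_khz: int) -> float:
    """Theoretical peak in GFlops; clock_khz is the shader clock in kHz."""
    clock_hz = clock_khz * 1000.0
    cores = multiprocessors * CORES_PER_MULTIPROCESSOR
    return cores * FLOPS_PER_CORE_PER_CLOCK * clock_hz / 1e9


def realistic_gflops(multiprocessors: int, clock_khz: int,
                     efficiency: float = 0.2) -> float:
    """Scale the peak by an app-specific factor (placeholder, needs tuning)."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return peak_gflops(multiprocessors, clock_khz) * efficiency
```

For example, `peak_gflops(16, 1890000)` gives 483.84, and the `efficiency` argument is the one knob that would have to be fitted against real run times for a given app.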

Jason G · « **Reply #18 on:** 07 Dec 2009, 03:46:30 am »

Quote from: Fred M on 07 Dec 2009, 03:26:38 am

...
Using the CudaApi, get the cores, clock and make our own more realistic formula.

Agreed. Let's try the easy way, and try to predict indicators that show whether it'll be good enough or needs the extra effort. Hopefully I'll have my build environment back up and functioning soon (in the middle of a Win7 migration after HDD death), so I will be able to start making some small test pieces (.exe) with flop estimate outputs to experiment with in the installer scripts.

Jason

Richard Haselgrove · « **Reply #19 on:** 07 Dec 2009, 04:40:08 am »

Quote from: Claggy on 06 Dec 2009, 08:27:42 pm

DA has added COPROCS to HOST_INFO Changeset 19797

Claggy

Edit: and after Changeset 19798 get_host_info() GUI RPC now contains GPU info

No reply to my question on boinc_alpha, but it obviously had the desired effect!

I'll grab one as soon as there's a build available, and let you see it so you can decide if it's any use.

efmer (fred) · « **Reply #20 on:** 07 Dec 2009, 05:46:29 am »

Quote from: Richard Haselgrove on 07 Dec 2009, 04:40:08 am

Quote from: Claggy on 06 Dec 2009, 08:27:42 pm
DA has added COPROCS to HOST_INFO Changeset 19797

Claggy

Edit: and after Changeset 19798 get_host_info() GUI RPC now contains GPU info

No reply to my question on boinc_alpha, but it obviously had the desired effect!

I'll grab one as soon as there's a build available, and let you see it so you can decide if it's any use.

Hopefully the info is more than the same meaningless peak flop value that's already in the message log.

Josef W. Segur · « **Reply #21 on:** 07 Dec 2009, 10:00:19 am »

Quote from: Fred M on 07 Dec 2009, 05:46:29 am

Hopefully the info is more than the same meaningless peak flop value that's already in the message log.

The CUDA peak flops is exactly as meaningful as the older estimated flops, ratio 5.6. IMO, they are better predictors of performance than the Whetstone benchmark used for CPUs.
Joe

Raistmer · « **Reply #22 on:** 07 Dec 2009, 10:28:07 am »

Better prediction of relative performance I would add.
As absolute value it's meaningless of course.
A 9400GT performs about the same as 1 core of my quad now, yet its estimated value is 45 Gflops. IMO a single core of a quad has very different actual performance...

efmer (fred) · « **Reply #23 on:** 07 Dec 2009, 10:45:04 am »

Quote from: Raistmer on 07 Dec 2009, 10:28:07 am

Better prediction of relative performance I would add.
As absolute value it's meaningless of course.
9400GT performs ~ as 1 core of my quad now. And its estimation value is 45Gflops. IMO single quad core has very different actual performance...

In the BOINC source code the comment states: "FLOPS for a given app may be much less; e.g. for SETI@home it's about 0.18 of the peak".
And the flops value they actually use is divided by 2 or 5.
There are also corrections of this sort: if (!strcmp(plan_class, "cuda23")) {flops *= 1.01; } for the cuda dll being more efficient.
So it may be that newer (beta/alpha) clients already have a correction factor for the GPU flops, and the statement may be obsolete by the time of release.
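
To make the arithmetic concrete, here is a small illustration of how such factors would stack. This is not BOINC's code: the function name and the way the factors are combined are assumptions, and only the 0.18 fraction and the 1.01 multiplier for the "cuda23" plan class come from the snippets quoted above.

```python
# Hypothetical illustration of stacking correction factors on a peak rating.
def effective_gflops(peak_gflops: float, app_fraction: float = 0.18,
                     plan_class: str = "") -> float:
    """Scale a peak rating by an app fraction, then a plan-class tweak."""
    estimate = peak_gflops * app_fraction
    if plan_class == "cuda23":
        # Same 1% bump as the quoted strcmp() check on plan_class.
        estimate *= 1.01
    return estimate


if __name__ == "__main__":
    # The 45 GFlops rating mentioned earlier for a 9400GT.
    print(effective_gflops(45.0))
    print(effective_gflops(45.0, plan_class="cuda23"))
```

With the 0.18 fraction a 45 GFlops rating comes down to roughly 8.1 GFlops, and the "cuda23" bump adds only about 0.08 on top of that.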

Raistmer · « **Reply #24 on:** 07 Dec 2009, 10:59:49 am »

Quote from: Fred M on 07 Dec 2009, 10:45:04 am

And this sort of corrections if (!strcmp(plan_class, "cuda23")) {flops *= 1.01; } for the cuda dll being more efficient.

CUDA 2.3 brings a bigger performance improvement for CUDA MB than 1%, so those 2 plans will definitely have different peak-to-real performance ratios for the SETI project.
And 45/5=9 (Gflops) is too much for the performance of a single core of a quad too, IMO.
That is, 5 times "peak to real" is a pretty arbitrary coefficient.
All this makes absolute Gflops unusable IMO. What needs to be checked: will 2 cards rated 45 and 90 Gflops by BOINC (for example) exhibit a 2 times difference in performance for the CUDA MB app or not (can BOINC estimates be used for relative performance comparisons or not).
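
That check could be sketched roughly like this. The function name, the run times in the example, and the 20% tolerance are all hypothetical; real numbers would have to come from running the same CUDA MB task on both cards.

```python
# Hypothetical check: do measured speeds scale the way the ratings say?
def scales_as_rated(rating_a: float, rating_b: float,
                    runtime_a: float, runtime_b: float,
                    tolerance: float = 0.2) -> bool:
    """True if the measured speed ratio matches the rated ratio.

    Run times are for the same task on each card; a shorter run time
    means a faster card, so speed ratio b/a equals runtime_a / runtime_b.
    """
    rated_ratio = rating_b / rating_a
    measured_ratio = runtime_a / runtime_b
    return abs(measured_ratio / rated_ratio - 1.0) <= tolerance


if __name__ == "__main__":
    # Made-up run times in seconds for cards rated 45 and 90 Gflops.
    print(scales_as_rated(45, 90, 1000.0, 500.0))  # exactly 2x faster
    print(scales_as_rated(45, 90, 1000.0, 800.0))  # only 1.25x faster
```

If pairs of cards keep passing a check like this, the ratings are usable for relative comparisons even though the absolute values are not.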

Claggy · « **Reply #25 on:** 08 Dec 2009, 05:12:12 pm »

Quote from: Richard Haselgrove on 07 Dec 2009, 04:40:08 am

Quote from: Claggy on 06 Dec 2009, 08:27:42 pm
DA has added COPROCS to HOST_INFO Changeset 19797

Claggy

Edit: and after Changeset 19798 get_host_info() GUI RPC now contains GPU info

No reply to my question on boinc_alpha, but it obviously had the desired effect!

I'll grab one as soon as there's a build available, and let you see it so you can decide if it's any use.

Boinc 6.10.22 released:

boinc_6.10.22_windows_intelx86.exe

boinc_6.10.22_windows_x86_64.exe

Claggy

Edit: Installed 6.10.22_x86 on my laptop, Boinc reports a GPU on start up, but no GPU work runs, and project computer details no longer show a GPU.
Edit 2: Installed 6.10.22_64 on my Desktop PC, same, Boinc reports a GPU, but GPU tasks don't start, just show waiting to run,
Reported to Boinc_Alpha list, DA's looking into it.

Claggy · « **Reply #26 on:** 08 Dec 2009, 07:40:36 pm »

Quote from: Richard Haselgrove on 07 Dec 2009, 04:40:08 am

Quote from: Claggy on 06 Dec 2009, 08:27:42 pm
DA has added COPROCS to HOST_INFO Changeset 19797

Claggy

Edit: and after Changeset 19798 get_host_info() GUI RPC now contains GPU info

No reply to my question on boinc_alpha, but it obviously had the desired effect!

I'll grab one as soon as there's a build available, and let you see it so you can decide if it's any use.

Boinc 6.10.23 released:

boinc_6.10.23_windows_intelx86.exe

boinc_6.10.23_windows_x86_64.exe

Claggy

Edit: this version's no better:

09/12/2009 00:45:31 Collatz Conjecture Application uses missing NVIDIA GPU
09/12/2009 00:45:31 Collatz Conjecture Missing coprocessor for task collatz_1259356581_147128_0

arkayn · « **Reply #27 on:** 08 Dec 2009, 08:57:58 pm »

Looks like I will stick with 6.10.21 for the time being then.

Richard Haselgrove · « **Reply #28 on:** 09 Dec 2009, 07:12:33 am »

There's a v6.10.24 now, which does seem to work.

Here's the host_info output (my formatting):

Code:

<coprocs>		
<coproc_cuda>		
<count>			1	</count>
<name>			GeForce 9800 GTX/9800 GTX+	</name>
<req_secs>		0	</req_secs>
<req_instances>		0	</req_instances>
<estimated_delay>	0	</estimated_delay>
<drvVersion>		19038	</drvVersion>
<cudaVersion>		2030	</cudaVersion>
<totalGlobalMem>	536543232	</totalGlobalMem>
<sharedMemPerBlock>	16384	</sharedMemPerBlock>
<regsPerBlock>		8192	</regsPerBlock>
<warpSize>		32	</warpSize>
<memPitch>		262144	</memPitch>
<maxThreadsPerBlock>	512	</maxThreadsPerBlock>
<maxThreadsDim>		512 512 64	</maxThreadsDim>
<maxGridSize>		65535 65535 1	</maxGridSize>
<totalConstMem>		65536	</totalConstMem>
<major>			1	</major>
<minor>			1	</minor>
<clockRate>		1890000	</clockRate>
<textureAlignment>	256	</textureAlignment>
<deviceOverlap>		1	</deviceOverlap>
<multiProcessorCount>	16	</multiProcessorCount>
</coproc_cuda>		
</coprocs>

Shame about <req_secs>, <req_instances>, and <estimated_delay> - they belong with work fetch, not host info: but it's easier to ignore something unneeded, rather than invent something missing.
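
For anyone wanting to pick the useful fields out of that block, a rough sketch with Python's standard `xml.etree.ElementTree`. The function name and the cut-down sample are made up for illustration, and it only assumes the tags shown above. The rest of the real get_host_info reply is not assumed, which is why it searches for `coproc_cuda` at any depth.

```python
import xml.etree.ElementTree as ET

# Cut-down sample using only tags from the output posted above.
SAMPLE = """<coprocs>
<coproc_cuda>
<count>\t1\t</count>
<name>\tGeForce 9800 GTX/9800 GTX+\t</name>
<clockRate>\t1890000\t</clockRate>
<multiProcessorCount>\t16\t</multiProcessorCount>
</coproc_cuda>
</coprocs>"""


def read_cuda_coprocs(xml_text: str) -> list:
    """Return one dict per <coproc_cuda> element found anywhere in the XML."""
    root = ET.fromstring(xml_text)
    devices = []
    for node in root.iter("coproc_cuda"):
        devices.append({
            # strip() removes the padding tabs around each value
            "name": node.findtext("name", default="").strip(),
            "count": int(node.findtext("count", default="0")),
            "clock_khz": int(node.findtext("clockRate", default="0")),
            "multiprocessors": int(
                node.findtext("multiProcessorCount", default="0")),
        })
    return devices


if __name__ == "__main__":
    for dev in read_cuda_coprocs(SAMPLE):
        print(dev)
```

The clock and multiprocessor count are the two values a home-made flops formula would need; the work fetch fields can simply be skipped.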

Jason G · « **Reply #29 on:** 09 Dec 2009, 07:40:00 am »

Hmm, same info as from the CUDA API device query function (naturally, I suppose). No need to use that particular RPC then. Quite useful for a GUI frontend/manager though, since it wouldn't have to call the CUDA API.
