BOINC as library

Jason G:

--- Quote from: Raistmer on 28 Oct 2007, 05:09:02 pm ---Well, "results strong similar" ....

--- End quote ---

Nice :D, sounds to me like only a few compiler flags might be different, given there's only a 2% difference!

Jason

Raistmer:
Here are all the diffs I made to compile the 2.4 sources (actually 2.39S, but there are only 2 differences, both in #define strings; they were added to the diffs after the build and were not reason enough to rebuild the client again) with VS 2005 and trial versions of ICC and IPP.
opt_config.h was added to simplify tuning of the conditional defines and compilation across the few source files affected.
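
For anyone who has not seen this pattern, a minimal sketch of what such a central config header could look like follows. The macro and function names (USE_SSE2_PATH, USE_IPP_FFT, sse2_path_enabled, ipp_fft_enabled) are made up for illustration and are not the ones from the actual diffs.

```cpp
// opt_config.h-style sketch. All names below are hypothetical.
#ifndef OPT_CONFIG_SKETCH_H
#define OPT_CONFIG_SKETCH_H

// Each switch gets a default here and can still be overridden
// from the compiler command line (e.g. /DUSE_IPP_FFT=1).
#ifndef USE_SSE2_PATH
#define USE_SSE2_PATH 1
#endif

#ifndef USE_IPP_FFT
#define USE_IPP_FFT 0
#endif

#endif // OPT_CONFIG_SKETCH_H

// Source files include the header and branch on the shared switches
// instead of each carrying its own scattered #defines.
inline bool sse2_path_enabled() {
#if USE_SSE2_PATH
    return true;
#else
    return false;
#endif
}

inline bool ipp_fft_enabled() {
#if USE_IPP_FFT
    return true;
#else
    return false;
#endif
}
```

The point of the design is only that every optional code path is toggled in one place, so a rebuild with different settings touches a single file.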



[attachment deleted by admin]

Jason G:
Good one to record the changes like that, mine are scribbled on an old envelope :P. Looks like similar changes overall. Did you end up with favourite compiler settings? The 2.4lunatics ones for QxN look pretty close to the ones I've played with on my P4.

Jason

Raistmer:
Well, I don't even record my changes ;) I just downloaded the lunatics 2.4 sources yesterday from the link on the main SETI board, ran the WinDiff utility and collected all the discrepancies in one rar ::)

I used SSE2 build options because that binary was intended to run and be profiled on an AMD 64 host. I use CodeAnalyst as the profiling tool (on the assumption that AMD should know their own CPUs better than Intel does ;) ). It would be interesting to compare your VTune data with the CodeAnalyst data to highlight areas of interest for improvement.
I probably need to check the option set more precisely, because my build is a little less than optimal. Another possibility: the options are fine and the 2% difference in speed comes from the trial nature of my IPP installation. Intel allows only dynamic linking with the trial IPP library. So DLL calls... I really don't know whether this could account for the 2% slowdown or not (and even the 2% is still preliminary, tested only on a short WU).

Jason G:
Very good idea to compare SSE2 QxN P4 VTune data against your SSE2 AMD build. There are some arguments about that! :D

Maybe you found hotspots in an inner folding routine? Mine chooses FoldArrayBy2AL and spends about 10% of total time in there. Maybe yours chooses a different routine? Either way, we could even compare the asm listing output of those, which might explain some differences between the chips! (Those functions don't depend on IPP as far as I know.)
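
To make clear what kind of loop is being talked about, here is a very simplified sketch. It assumes "fold by 2" means adding the second half of a power array onto the first half and keeping track of the peak; fold_by_2 is a made-up name and this is not the actual client routine, which works on its own buffers and alignment rules.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical simplified fold: out[i] = in[i] + in[i + half].
// Returns the largest summed value (0 if the input has fewer than 2 samples).
float fold_by_2(const std::vector<float>& in, std::vector<float>& out) {
    const std::size_t half = in.size() / 2;
    out.resize(half);
    float peak = 0.0f;
    for (std::size_t i = 0; i < half; ++i) {
        out[i] = in[i] + in[i + half];
        // Seed the peak with the first sum, then keep the running maximum.
        peak = (i == 0) ? out[i] : std::max(peak, out[i]);
    }
    return peak;
}
```

A loop like this is all loads, adds and stores, so how each compiler vectorises it and how each chip handles the memory traffic is probably what an asm comparison would show.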

[Note also that, because I am using ICC, about 11% of the time is being spent in _Intel_fast_memcpy. Having looked at a mixture of improved memcopies, eliminating them, and hybrid processing techniques in other areas, I think some generally applicable improvements might be possible there (not just for Intel chips).]

Even though yours calls the dynamic library, it would be nice to see whether the dispatching is calling the same IPP functions (but the DLL versions), or some different, maybe more generic, ones. Mine calls the w7 static ones, which are P4 SSE2, but the internal names shown by VTune / CodeAnalyst will give the real names.
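
By dispatching I mean roughly the following: one public entry point, with a CPU-specific implementation picked behind it. This is only a sketch of the general idea, not how IPP does it internally; CpuTier, sum_generic, sum_sse2_standin and select_sum are all made-up names.

```cpp
#include <cstddef>

// Hypothetical CPU tiers, named only for this illustration.
enum class CpuTier { Generic, Sse2 };

using SumFn = float (*)(const float*, std::size_t);

// Plain scalar implementation that works on any CPU.
float sum_generic(const float* v, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += v[i];
    return s;
}

// Stand-in for a tuned variant; a real one would use SSE2 intrinsics.
float sum_sse2_standin(const float* v, std::size_t n) {
    return sum_generic(v, n);
}

// Select the implementation once; callers then go through the pointer.
SumFn select_sum(CpuTier tier) {
    return tier == CpuTier::Sse2 ? sum_sse2_standin : sum_generic;
}
```

In a profiler, the time shows up under the selected implementation's name rather than the public one, which is why the internal names are the useful thing to compare.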

Jason
