Interesting F/U on Intel Compiler vs. AMD issue
IrishFBall32:
Glad to see Intel is getting theirs handed to them over this... I remember this being an issue here when Lunatics first started taking over after the whole mess with Crunch3r getting run out of town.
viper666:
Well, it is Intel's optimized compiler, so I kinda saw that one coming. AMD should develop their own and apply the same "optimizations".
desprado7:
--- Quote from: Jason G on 04 Jan 2010, 11:26:06 am ---For completion, and interest, here's the mentioned workarounds described by Agner in his optimization manuals:
- In Green are the approaches we already use for multibeam, and the sole ICC compiled component library of astropulse (fftw SSE, release Astropulse was always an MSVC build) ... These approaches require multiple platform specific builds.
- In Yellow, are what we could do to hopefully bring the build count back down
- In Orange is the true crux of the matter.
In short, we don't use the dynamic dispatch mechanisms in Intel compiler. Never have. So any fix they apply to this, which I hope they do, while it would reduce our build count, and probably save a lot of work for which the energy could be directed elsewhere, it won't directly influence the speed of our builds on any brand of CPU.
Optimizing software in C++
An optimization guide for Windows, Linux and Mac platforms
By Agner Fog. Copenhagen University College of Engineering.
Copyright © 2009. Last updated 2009-09-26.
pp.126-127
--- Quote ---The behavior of the Intel compiler puts the programmer in a bad dilemma. You may prefer
to use the Intel compiler because it has many advanced optimizing features available, and
you may want to use the well optimized Intel function libraries, but who would like to put a
tag on his program saying that it doesn’t work well on non-Intel machines?
Possible solutions to this problem are the following:
• Compile for a specific instruction set, e.g. SSE2. The compiler will produce the
optimal code for this instruction set and insert only the SSE2 version of most library
functions without CPU dispatching. Only a few library functions still have a CPU
dispatcher in this case. Test if the program will run on an AMD CPU. If an error
message is issued then it is necessary to replace the CPU detection function as
described below. The program will not be compatible with old microprocessors.
• Compile with option /QxO. This will include a special version of certain library
functions for AMD processors with SSE2. This performs reasonably on AMD
processors but not optimally. A program compiled with /QxO will not run on any
processor prior to SSE2.
• Make two or more versions of the most critical part of the code and compile them
separately with the appropriate instruction set specified. Insert an explicit CPU
dispatching in the code to call the version that fits the microprocessor it is running
on.
• Replace the CPU detection function of the Intel compiler with another function with
the same name. This method is described below.
• Make calls directly to the CPU-specific versions of the library functions. The
CPU-specific functions typically have names ending in .J for the SSE2 version and .A for
the generic version. The dot in the function names is not allowed in C++ so you need
to use objconv or a similar utility for adding an alias to these library entry names.
• The ideal solution would be an open source library of well-optimized functions with a
performance that can compete with Intel’s libraries and with support for multiple
platforms and multiple instruction sets. I have no knowledge of any such library.
The performance on non-Intel processors can be improved by using one or more of the
above methods if the most time-consuming part of the program contains automatic CPU
dispatching or memory-intensive functions such as memcpy, memmove, memset, or
mathematical functions such as pow, log, exp, sin, etc.
--- End quote ---
--- End quote ---
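For anyone wondering what the "explicit CPU dispatching" option in the quoted list might look like in practice, here is a minimal C++ sketch. It is not Lunatics' or Agner's code: the names `sum_generic`, `sum_sse2`, `cpu_has_sse2`, `select_sum` and `sum` are made up for illustration, and it assumes GCC or Clang on x86 (it relies on the `__builtin_cpu_supports` builtin and the `__SSE2__` macro). With other compilers or targets it simply falls back to the generic path. A real multi-build setup would more likely put each version in its own source file compiled with its own instruction-set option; keeping both in one file is a simplification.

```cpp
#include <cstddef>

// Assumption: GCC/Clang define __SSE2__ when SSE2 code generation is enabled
// (the default on x86-64). Other compilers take the generic path in this sketch.
#if defined(__SSE2__)
#include <emmintrin.h>
#define HAVE_SSE2_BUILD 1
#else
#define HAVE_SSE2_BUILD 0
#endif

// Generic version: plain scalar loop, runs on any CPU.
double sum_generic(const double* data, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

#if HAVE_SSE2_BUILD
// SSE2 version: adds two doubles per iteration, then handles the leftover element.
double sum_sse2(const double* data, std::size_t n) {
    __m128d acc = _mm_setzero_pd();
    std::size_t i = 0;
    for (; i + 2 <= n; i += 2)
        acc = _mm_add_pd(acc, _mm_loadu_pd(data + i));
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    double total = lanes[0] + lanes[1];
    for (; i < n; ++i) total += data[i];
    return total;
}
#endif

// Runtime check of the SSE2 feature flag. Note that it asks what the CPU
// supports and never looks at the vendor string.
bool cpu_has_sse2() {
#if HAVE_SSE2_BUILD
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#else
    return false;
#endif
}

using SumFn = double (*)(const double*, std::size_t);

// The explicit dispatcher: pick the version that fits the CPU we are running on.
SumFn select_sum() {
#if HAVE_SSE2_BUILD
    if (cpu_has_sse2()) return sum_sse2;
#endif
    return sum_generic;
}

// Public entry point: the choice is made once, on the first call.
double sum(const double* data, std::size_t n) {
    static const SumFn impl = select_sum();
    return impl(data, n);
}
```

Calling `sum(values, count)` picks the implementation on first use and reuses it afterwards. Because the dispatcher tests a feature flag instead of the CPU brand, an AMD processor with SSE2 takes the same path as an Intel one.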
But the problem is that it's not possible to run Enhanced and Astropulse on the CPU and Enhanced (and in future maybe Astropulse too) on the GPU simultaneously. :(
So BOINC and/or SETI@home need more programmers.
(Of course the optimization crew has done a great job so far! But more members would help speed up development.)
I started a thread in my team forum.
So I thought it would be nice to have one thread here about this topic, to give my teammates (and of course other people too! :)) a place to meet and discuss it.
(This one?)