Here is a little document I wrote. I hope it will prove useful to all.
========================================================================
STATIC LIBRARY : optimize Project Overview
========================================================================
Compiler optimization command-line arguments and SETI-specific #define constants, and what each one does.
MS VC++ 2003 Compiler options
-----------------------
/arch:sse
- It's OK for the compiler to generate SSE instructions when compiling
/arch:sse2
- It's OK for the compiler to generate SSE2 & SSE instructions when compiling
/arch:sse3
- It's OK for the compiler to generate SSE3 & SSE2 & SSE instructions when compiling
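A small sketch of how code can check which SIMD level the compiler was allowed to use. The function name simd_level_enabled is made up for illustration. The macros are assumptions too: _M_IX86_FP and _M_X64 are what later Microsoft compilers define (VC++ 2003 may not define _M_IX86_FP), and __SSE__ / __SSE2__ are the GCC-style equivalents.

```cpp
// Hypothetical helper: report the SIMD level the compiler may generate.
// Assumption: the compiler defines _M_IX86_FP / _M_X64 (Microsoft style)
// or __SSE__ / __SSE2__ (GCC style). None of these come from this document.
int simd_level_enabled()
{
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return 2;   // SSE2 (and SSE) instructions may be generated
#elif defined(__SSE__) || (defined(_M_IX86_FP) && _M_IX86_FP == 1)
    return 1;   // SSE only
#else
    return 0;   // no SIMD code generation requested
#endif
}
```

The value is fixed at compile time, so it reflects the build flags and not the CPU the program later runs on.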
/Oy - Omit the frame pointer
The Intel 386 and later CPUs have 8 floating-point stack registers and 8 general
registers. One of them, the stack pointer, can't really be used by code, which leaves 7.
Older code used another of these registers (ebp, the extended base pointer) to keep track of
local variables inside functions; in that role it is called the "frame pointer".
/Oy tells the compiler to free up the 'ebp' register for regular code use.
(Note: internally the Pentium 3 and later have many more "registers", but the CPU
uses them behind the scenes when registers get loaded from or saved to memory.)
/Ob2 - Inline functions when it seems a good idea
Tells the compiler to "inline" smallish functions at the places where other functions call them.
For example, the function void add_to_total(int num) { total += num; } would be a
perfect candidate for automatic inlining.
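Here is the add_to_total example written out as a compilable sketch. The global total and the caller sum_values are made up for illustration; whether the compiler really inlines the call is up to its own heuristics.

```cpp
// Hypothetical running total, mirroring the add_to_total example.
static int total = 0;

// Small enough that an inlining compiler would likely expand it
// directly at each call site.
static inline void add_to_total(int num) { total += num; }

// If the call is inlined, the loop body becomes a plain
// "total += values[i]" with no call/return overhead.
int sum_values(const int* values, int count)
{
    total = 0;
    for (int i = 0; i < count; ++i)
        add_to_total(values[i]);
    return total;
}
```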
/Oi - Use processor specific "intrinsic" functions instead of library calls
example: the x86 processor has a built in 'sine' opcode
        source              compiled code
/Oi-    x=sin( num )  ...   call _library_sine_function
/Oi     x=sin( num )  ...   fsin
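A minimal sketch of the source side of this. The helper name wave_sample is made up. On Microsoft-style compilers, #pragma intrinsic(sin) asks for the intrinsic form of sin for this file; the pragma is guarded so other compilers ignore it. Whether the output is really a single fsin depends on the compiler and its other settings.

```cpp
#include <math.h>

#ifdef _MSC_VER
// Microsoft-specific: ask for the intrinsic form of sin in this file.
#pragma intrinsic(sin)
#endif

// Hypothetical helper: with intrinsics enabled the compiler may emit
// inline code for the sine instead of a call into the math library.
double wave_sample(double phase)
{
    return sin(phase);
}
```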
/Ot - Prefer to make the output code faster rather than making it small
/Og - "Global" optimization - Many useful optimizations within code
Intel C++ Compiler options
-----------------------
/arch:sse
/arch:sse2
/arch:sse3
/Qunroll - Unroll loops: when it makes the code faster, make multiple copies of the
body of a source loop
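To show what "multiple copies of the code" means, here is a hand-written sketch of a loop and a 4x unrolled version of it. Both function names are made up, and the factor of 4 is only an assumption; the compiler picks its own factor.

```cpp
// The loop as written in the source.
float sum_rolled(const float* data, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; ++i)
        sum += data[i];
    return sum;
}

// Roughly what a 4x unroll looks like when done by hand: four copies
// of the loop body per pass, so the loop counter is tested 1/4 as often.
float sum_unrolled4(const float* data, int n)
{
    float sum = 0.0f;
    int i = 0;
    for (; i + 3 < n; i += 4) {
        sum += data[i];
        sum += data[i + 1];
        sum += data[i + 2];
        sum += data[i + 3];
    }
    for (; i < n; ++i)      // leftover elements when n is not a multiple of 4
        sum += data[i];
    return sum;
}
```

The additions happen in the same order in both versions, so the two functions return identical results.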
/Qipo - Try to do multi-source-file optimizations:
- Inline small functions from one source file at call points in functions from other files
- Use registers to pass parameters from some functions to others (rather than the stack)
- Many other items
TAKES A LONG TIME!! - Uses up to 1.4 GB of RAM + swap file
/QxK - Code is optimized for Intel® Pentium® III and compatible Intel processors.
/QxW - Code is optimized for Intel Pentium 4 and compatible Intel processors.
/QxN - Code is optimized for Intel Pentium 4 and compatible Intel processors
with Streaming SIMD Extensions 2.
The resulting code may contain unconditional use of features that are
not supported on other processors.
This option also enables new optimizations in addition to Intel
processor-specific optimizations including advanced data layout
and code restructuring optimizations to improve memory accesses for Intel processors.
/QxB - Code is optimized for Intel Pentium M and compatible Intel processors.
This option also enables new optimizations in addition to Intel processor-specific optimizations.
/QxP - Code is optimized for Intel® Core™ Duo processors, Intel® Core™ Solo processors,
Intel® Pentium® 4 processors with SSE3, and compatible
Intel processors with SSE3. The resulting code may contain unconditional use
of features that are not supported on other processors.
/G5 - Intel® Pentium® and Pentium® with MMX™ technology processor
/G6 - Intel® Pentium® Pro, Pentium® II and Pentium® III processors
/G7 - Pentium 4, Xeon, Core Duo, Core Solo, Pentium M, SSE3
__INTEL_COMPILER
- Define created automatically by the Intel compiler - lets the programmer know which
compiler is crunching their source code
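A short sketch of how source code can use this define. The function name compiler_name is made up. __INTEL_COMPILER is tested first because the Intel compiler on Windows also defines Microsoft's _MSC_VER.

```cpp
// Hypothetical helper: name the compiler that built this file.
const char* compiler_name()
{
#if defined(__INTEL_COMPILER)
    return "Intel C++";              // must be tested before _MSC_VER
#elif defined(_MSC_VER)
    return "Microsoft Visual C++";
#else
    return "other";
#endif
}
```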
/O2 - Is the same as setting the following flags
/Og, /Oi-, /Os, /Ob2, /GF (/Qvc7 and above), /Gf (/Qvc6 and below), /Gs, and /Gy
/O3 - Even more optimizations
SETI@home worker client #define options
--------------------------------
BOINC_APP_GRAPHICS
- Build the code & calls for the graphics display of SETI WU (work unit) progress
USE_IPP
- Fast Fourier Transform functions using the Intel Integrated Performance Primitives (IPP)
- an Intel library of fast math functions
USE_SSE3
USE_SSE2
USE_SSE
- All three are only valid together with USE_IPP
- Determine which version (which SIMD level) of several IPP functions will be used
in FFT calculations
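A sketch of how defines like these usually pick a code path at compile time. This is not taken from the SETI@home source: the function name ipp_simd_variant and the returned labels are made up for illustration, and only the USE_* names come from the list above.

```cpp
// Hypothetical: report which IPP code path the USE_* defines select.
const char* ipp_simd_variant()
{
#if !defined(USE_IPP)
    return "none";      // USE_SSE* have no effect without USE_IPP
#elif defined(USE_SSE3)
    return "sse3";
#elif defined(USE_SSE2)
    return "sse2";
#elif defined(USE_SSE)
    return "sse";
#else
    return "generic";   // USE_IPP set, but no SIMD level chosen
#endif
}
```

The highest level is tested first, so a build that defines more than one USE_SSE* constant gets the most capable variant.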
USE_FFTWF
- Fast Fourier Transform functions using 'FFTW3' - 'Fastest Fourier Transform in the West'
- An open source fast Fourier transform library - optimized for all available SIMD instruction sets:
SSE, SSE2, 3DNow!, Altivec
http://www.fftw.org
DO_SMOOTH
- Call the 'baseline smooth' functions for data in WU processing