optimized sources
_heinz:
The auto-vectorizer runs ;D
-----------------------------------
------ Build started: Project: Optimizer, Configuration: Release32-NOGFX Win32 ------
Compiling...
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.20404 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
cl /O2 /Ob2 /Oi /Ot /Oy /GT /I "../../../boinc/win_build" /I ".." /I "..\.." /I "..\..\..\boinc\lib" /I "../../../boinc/api" /I "../../db" /I "C:\I\SC\vs90\seti_boinc_2k3_2.2B-Ben-Joe\client\Optimizer" /I "C:\I\INTEL\IPP\5.2_beta\ia32\tools\staticlib" /I "C:\I\INTEL\IPP\5.2_beta\ia32\include" /D "USE_AKFSIMD" /D "USE_IPP" /D "USE_SSE2" /D "WIN32" /D "_WIN32" /D "_WINDOWS" /D "_CONSOLE" /D "_DEBUG" /D "_LIB" /D "_MT" /D "CLIENT" /D "NBOINC_APP_GRAPHICS" /D "_UNICODE" /D "UNICODE" /D "_VC80_UPGRADE=0x0710" /D "_MBCS" /GF /FD /EHsc /MTd /Zp16 /arch:SSE2 /fp:fast /FAs /Fa"Release32-NOGFX\\" /Fo"Release32-NOGFX\\" /Fd"Release32-NOGFX\vc90.pdb" /W3 /c /Wp64 /Zi /Gd /TP /FI "win-config.h" ".\AKfoldSSE.cpp"
AKfoldSSE.cpp
-----IPP-----
-----SSE2/em-----
-----AKFSIMD-----
Build log was saved at "file://c:\I\SC\vs90\seti_boinc_2k3_2.2B-Ben-Joe\client\Optimizer\Release32-NOGFX\BuildLog.htm"
Optimizer - 0 error(s), 0 warning(s)
========== Build: 1 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========
_heinz:
Working now on a vectorized version of chirpfft.cpp
heinz ;D
Jason G:
Hi Heinz,
Did you manage to determine any performance differences between our 'auto vectoriser friendly' folding routine (when compiled under ICC, with the pragma hints / dependency overrides) and hand vectorised code? If you haven't had a chance I'll be able to take another look in 2 weeks (holidays ;D)
Jason
_heinz:
Hi Jason,
I'm waiting with this until your holidays. I have implemented some nice ideas to eliminate unnecessary code. ::)
The autovectorizer runs great. Prepare to be surprised.
Have a nice week.
Heinz ;D
_heinz:
As I'm going through the code, fraction_done caught my attention.
Before every call to it (sometimes not directly before) we find the following statement --->
progress = std::min( progress, 1.0 );
1. in function do_transpose
progress = std::min( progress, 1.0 );
#ifdef BOINC_APP_GRAPHICS
if ( !nographics() )
{
    if ( gbp ) gbp->rarray.add_source_row( (float *)WorkData );
    sah_graphics->local_progress = ( (( float ) ifft + 1) / NumFfts );
}
#endif
remaining = 1.0 - ( double ) ( icfft + 1 ) / num_cfft;
fraction_done( progress, remaining );
----------------------------------------------------------------------------------------------------------
2. in function process_data
progress = std::min( progress, 1.0 );
#ifdef BOINC_APP_GRAPHICS
if ( !nographics() )
{
    if ( gbp ) gbp->rarray.add_source_row( (float *)WorkData );
    sah_graphics->local_progress = ( (( float ) ifft + 1) / NumFfts );
}
#endif
remaining = 1.0 - ( double ) ( icfft + 1 ) / num_cfft;
fraction_done( progress, remaining );
------------------------------------------------------------------------------------------------------
3. in analyzePoT.cpp line 246
progress = std::min( progress, 1.0 ); // prevent display of > 100%
fraction_done( progress, remaining );
-----------------------------------------------------------------------------------------------------------------------------------
4. in analyzePoT.cpp line 387
progress = std::min( progress, 1.0 ); // prevent display of > 100%
fraction_done( progress, remaining );
----------------------------------------------------------------------------------------------------------------------------------------------------
Therefore I think that inside fraction_done( double progress, double remaining )
it is not necessary to compute progress = std::min( progress, 1.0 ) again,
because every caller has already done it and we get the same result as before. So we can comment it out.
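
A minimal sketch of that argument, not the project's actual code: the clamp is idempotent, so a second std::min on an already clamped value returns it unchanged. The helper names clamp_progress and checked_progress are hypothetical. The assert is one possible way to keep the assumption visible; it only holds as long as every call site keeps clamping first.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for what each listed call site does before
// calling fraction_done: limit progress to at most 1.0.
inline double clamp_progress(double progress) {
    return std::min(progress, 1.0);
}

// Hypothetical callee-side check that could replace the removed std::min.
// assert() compiles to nothing when NDEBUG is defined, so an optimized
// build without assertions pays no cost for it.
inline double checked_progress(double progress) {
    assert(progress <= 1.0 && "caller must clamp progress before fraction_done");
    return progress;
}
```

If a new call site is added later without the clamp, a build with assertions enabled would stop at the check instead of silently reporting more than 100%.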
After helping the compiler with some additional variables we get the following short, hopefully effective code --->
; 75 : prog2 = 1.0 - remaining;
fld1
fsub QWORD PTR _remaining$[esp-4]
; 76 : // progress = std::min( progress, 1.0 ); // is alredy done before call fraction_done
; 77 : // prog = progress * ( 1.0 - pow( prog2, PROG_POWER ) ) + prog2 * pow(prog2,PROG_POWER );//original
; 78 : // A = pow( prog2,PROG_POWER );
; 79 : // prog = progress * ( 1.0 - A ) + prog2 * A ;
; 80 : // B = 1.0 - A; C = prog2 * A;
; 81 : // prog = progress * B + C;
; 82 : // D = progress * B;
; 83 : // prog = D + C;
; 84 :
; 85 : A = pow( prog2,PROG_POWER );
fld QWORD PTR __real@4018000000000000
call __CIpow
; 86 : B = 1.0 - A; C = prog2 * A;
fld1
fsubrp ST(1), ST(0)
; 87 : D = progress * B;
; 88 : prog = D + C;
; 89 : boinc_fraction_done( prog );
sub esp, 8
fmul ST(0), ST(0)
fmul QWORD PTR _progress$[esp+4]
fadd ST(0), ST(0)
fstp QWORD PTR [esp]
call _boinc_fraction_done
add esp, 8
; 90 : }
ret 0
?fraction_done@@YAXNN@Z ENDP ; fraction_done
---------------------------------------------------------------------------------------------------------------------------------------
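
For readers who prefer source over assembly, the two forms named in the listing comments can be sketched as below. This is a reconstruction from source lines 75 to 88 as quoted in the listing, not the real file. The function names blend_original and blend_with_temporaries are hypothetical, and PROG_POWER = 6.0 is an assumption read off the constant __real@4018000000000000, which is the IEEE-754 bit pattern of 6.0. The call to boinc_fraction_done is left out so the sketch stands alone.

```cpp
#include <cmath>

// Assumption: value inferred from the constant loaded before __CIpow.
const double PROG_POWER = 6.0;

// Single-expression form (commented-out source line 77): pow appears twice.
inline double blend_original(double progress, double remaining) {
    const double prog2 = 1.0 - remaining;
    return progress * (1.0 - std::pow(prog2, PROG_POWER))
         + prog2 * std::pow(prog2, PROG_POWER);
}

// Form with temporaries (source lines 85 to 88): pow is evaluated once
// and the result A is reused for both terms.
inline double blend_with_temporaries(double progress, double remaining) {
    const double prog2 = 1.0 - remaining;
    const double A = std::pow(prog2, PROG_POWER);
    const double B = 1.0 - A;       // weight of the caller's progress
    const double C = prog2 * A;     // weight of the cfft-based estimate
    const double D = progress * B;
    return D + C;
}
```

Both functions compute progress * (1 - A) + prog2 * A, so they should agree up to floating-point rounding. With remaining = 1 the result is just progress, and with remaining = 0 it is 1.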
Your comments are welcome.
heinz