How to make your own optimized Seti@Home client for Linux
michael37:
Simon,
Does the SSE3 variant use the QxT (-xT) optimization flag? If yes, could you please update the "Compiler options & seti #defines for optimization" document? If not, we need an option that uses -xT :)
In addition, I found this odd line in the Intel compiler guide: "Use /QxW /QaxT (-xW -axT) to include other Intel 64 and like AMD* processors as well." Did anyone experiment with these flags?
Thanks.
P.S. I started running the version 1.3 SSE2-optimized code on my dual-Xeon 5160 and it's surprisingly slow (~10,000 seconds for a 0.4 ar/60 credit unit, with 4 threads in parallel). Granted, this may be partly due to the general slowness of the dual-51XX family versus single-socket Core 2 Duos (more info here), but insufficiently optimized code must also play a part.
sancio:
Hi,
ICC is a frustrating compiler >:(
Your executable is static; mine is dynamic.
The -static flag is giving me a headache:
checking size of long int... configure: error: cannot compute sizeof (long int), 77
This is strange:
(conftest.cc is a piece of code copied from config.log that writes the size of long int into the file conftest.val.)
I get the same error with the -fast flag.
--- Code: ---gtoso@marte:~/src/kwsn/kwsn/seti_boinc$ icpc -o conftest -no-sox -O3 -pc64 -xP -axP -fp-model fast -no-prec-div -no-prec-sqrt -ipo4 -I/opt/intel/cc/9.1.045/include -I/opt/intel/ipp/5.2_beta/ia32/include -I/opt/intel/ipp/5.2_beta/ia32/tools/staticlib -I/usr/include -I/usr/include/openssl -I/opt/intel/ipp/5.2_beta/ia32/include -I/opt/intel/ipp/5.2_beta/ia32/tools/staticlib -L/opt/intel/cc/9.1.045/lib -L/opt/intel/ipp/5.2_beta/ia32/lib -limf -lippsmerged -lippvmmerged -lippchmerged -lippcore -lsvml -i-static -static-libcxa -L/usr/lib -nodefaultlibs -L/opt/intel/ipp/5.2_beta/ia32/lib conftest.cc -lssl -lcrypto /usr/lib/libcrypto.a /usr/lib/libssl.a -Wl,-Bdynamic -ldl -Wl,-Bdynamic -lm /usr/lib/libnsl.a -Wl,-Bdynamic -lrt /usr/lib/libz.a /usr/lib/libjpeg.a /usr/lib/libstdc++.a -Wl,-Bstatic -lgcc_eh -Wl,-Bdynamic -lpthread -Wl,-Bdynamic -lc -lippcore -lippsmerged
IPO: performing single-file optimizations
IPO: generating object file /tmp/ipo_icpcw7cWoK.o
gtoso@marte:~/src/kwsn/kwsn/seti_boinc$ ./conftest
gtoso@marte:~/src/kwsn/kwsn/seti_boinc$ file ./conftest
./conftest: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), not stripped
gtoso@marte:~/src/kwsn/kwsn/seti_boinc$ ldd ./conftest
linux-gate.so.1 => (0xffffe000)
libimf.so => /opt/intel/cc/9.1.045/lib/libimf.so (0xb7dad000)
libsvml.so => /opt/intel/cc/9.1.045/lib/libsvml.so (0xb7d4d000)
libssl.so.0 => /usr/lib/libssl.so.0 (0xb7d15000)
libcrypto.so.0 => /usr/lib/libcrypto.so.0 (0xb7c11000)
libdl.so.2 => /lib/libdl.so.2 (0xb7c0d000)
libm.so.6 => /lib/libm.so.6 (0xb7bea000)
librt.so.1 => /lib/librt.so.1 (0xb7bd7000)
libpthread.so.0 => /lib/libpthread.so.0 (0xb7b84000)
libc.so.6 => /lib/libc.so.6 (0xb7a67000)
libirc.so => /opt/intel/cc/9.1.045/lib/libirc.so (0xb7a26000)
/lib/ld-linux.so.2 (0xb7feb000)
gtoso@marte:~/src/kwsn/kwsn/seti_boinc$ icpc -o conftest -no-sox -O3 -pc64 -xP -axP -fp-model fast -no-prec-div -no-prec-sqrt -ipo4 -I/opt/intel/cc/9.1.045/include -I/opt/intel/ipp/5.2_beta/ia32/include -I/opt/intel/ipp/5.2_beta/ia32/tools/staticlib -I/usr/include -I/usr/include/openssl -I/opt/intel/ipp/5.2_beta/ia32/include -I/opt/intel/ipp/5.2_beta/ia32/tools/staticlib -L/opt/intel/cc/9.1.045/lib -L/opt/intel/ipp/5.2_beta/ia32/lib -limf -lippsmerged -lippvmmerged -lippchmerged -lippcore -lsvml -i-static -static-libcxa -static -L/usr/lib -nodefaultlibs -L/opt/intel/ipp/5.2_beta/ia32/lib conftest.cc -lssl -lcrypto /usr/lib/libcrypto.a /usr/lib/libssl.a -Wl,-Bdynamic -ldl -Wl,-Bdynamic -lm /usr/lib/libnsl.a -Wl,-Bdynamic -lrt /usr/lib/libz.a /usr/lib/libjpeg.a /usr/lib/libstdc++.a -Wl,-Bstatic -lgcc_eh -Wl,-Bdynamic -lpthread -Wl,-Bdynamic -lc -lippcore -lippsmerged
IPO: performing single-file optimizations
IPO: generating object file /tmp/ipo_icpchnSplV.o
gtoso@marte:~/src/kwsn/kwsn/seti_boinc$ file ./conftest
./conftest: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), not stripped
gtoso@marte:~/src/kwsn/kwsn/seti_boinc$ ldd ./conftest
/usr/bin/ldd: line 124: ./conftest: No such file or directory
gtoso@marte:~/src/kwsn/kwsn/seti_boinc$ ./conftest
-bash: ./conftest: No such file or directory
gtoso@marte:~/src/kwsn/kwsn/seti_boinc$
--- End code ---
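One possible reading of the transcript above, offered as an assumption rather than something confirmed in this thread: `file` still reports the -static build as "dynamically linked", and "No such file or directory" on a file that clearly exists is the classic symptom of an ELF binary requesting a program interpreter (dynamic loader) path that is not present on the system. Mixing -static with the -Wl,-Bdynamic switches may have produced such a binary. A minimal sketch for checking this with `readelf` from binutils follows; the helper name `check_interp` is hypothetical.

```shell
# Hypothetical helper: print the ELF interpreter a binary requests and
# report whether that path exists on this machine.
# Usage: check_interp ./conftest
check_interp() {
    bin="$1"
    [ -f "$bin" ] || { echo "no such file: $bin"; return 2; }
    # "readelf -l" lists program headers; the PT_INTERP entry prints as
    #   [Requesting program interpreter: /lib/ld-linux.so.2]
    interp=$(readelf -l "$bin" 2>/dev/null |
             sed -n 's/.*Requesting program interpreter: \(.*\)]/\1/p') || true
    if [ -z "$interp" ]; then
        echo "no interpreter requested (fully static, or not an ELF file)"
        return 0
    fi
    if [ -e "$interp" ]; then
        echo "interpreter $interp exists"
    else
        echo "interpreter $interp is MISSING"
        return 1
    fi
}
```

If the reported interpreter is missing, the binary is neither truly static nor runnable as a dynamic one, which would match the behaviour shown above.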
Simon:
Hi,
I had exactly the same problem trying to use -static.
To make the released executables static, I used "statifier" (http://statifier.sourceforge.net/) with the --force-execve switch.
HTH,
Simon.
Simon:
--- Quote from: michael37 on 25 Dec 2006, 11:31:54 pm ---Simon,
Does the SSE3 variant use the QxT (-xT) optimization flag? If yes, could you please update the "Compiler options & seti #defines for optimization" document? If not, we need an option that uses -xT :)
In addition, I found this odd line in the Intel compiler guide: "Use /QxW /QaxT (-xW -axT) to include other Intel 64 and like AMD* processors as well." Did anyone experiment with these flags?
--- End quote ---
Yup - /QxT and /QaxT are for SSSE3 (Supplemental Streaming SIMD Extensions 3), meaning they only run (and on Linux, only compile, for me) on Core 2-based systems.
Support for these was added in IPP 5.1.1 and in ICC 9.1.028 and later.
HTH,
Simon.
P.S.: I know the 1.3 code isn't that quick on Core 2s - guess why I put the 1.41 up instead? ;)
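As a rough illustration of how the flag choice follows from the CPU's feature set, here is a small sketch that maps the `flags` line of /proc/cpuinfo to an ICC -x switch. The function name `pick_icc_flag` is hypothetical. -xT for SSSE3 is taken from the post above and -xP appears in the build commands earlier in the thread; reading -xP as the SSE3 switch and -xW as the SSE2 switch is an assumption based on the ICC 9.x option names and should be checked against your compiler's documentation. Note that /proc/cpuinfo reports SSE3 as `pni`.

```shell
# Hypothetical helper: choose an ICC 9.x -x switch from a space-separated
# list of CPU feature flags (the format used by /proc/cpuinfo).
# Usage on Linux: pick_icc_flag "$(grep -m1 '^flags' /proc/cpuinfo)"
pick_icc_flag() {
    flags="$1"
    # Pad with spaces so each pattern matches whole words only.
    case " $flags " in
        *" ssse3 "*) echo "-xT" ;;  # SSSE3 (Core 2), per the post above
        *" pni "*)   echo "-xP" ;;  # SSE3 (assumed mapping)
        *" sse2 "*)  echo "-xW" ;;  # SSE2 (assumed mapping)
        *)           echo "" ;;     # nothing suitable: use no -x switch
    esac
}
```

The checks run from the newest instruction set to the oldest, so a Core 2 that also lists `pni` and `sse2` still gets -xT.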
sancio:
--- Quote from: Simon on 26 Dec 2006, 11:26:11 am ---Hi,
I had exactly the same problem trying to use -static.
To make the released executables static, I used "statifier" (http://statifier.sourceforge.net/) with the --force-execve switch.
--- End quote ---
Thanks very much ;D
But I'm not convinced...
There must be a clean way to obtain a truly static binary, which in theory should also be faster.
But perhaps at this point it doesn't make much sense to put effort into 1.3.
Any progress on 2.0?
P.S. Do the same problems occur under Windows?