Author Topic: optimized sources  (Read 621349 times)

Offline _heinz

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 2117
optimized sources
« on: 05 Mar 2007, 08:42:45 pm »
Hi Simon,

after studying the sources I found that chirpfft.cpp in the client deserves my attention. I reduced the code in CalcTrigArray by using an extern function, FillTrigArray, which I wrote, and in TrigArrayInit.ptt I added some hints for the compiler. That should speed things up. Next will be analyse.cpp.
So I will go through all the other sources to find things to make shorter and more efficient, but it will take a little time to finish this.
Who compiles the sources? Should I do that?
Or should I send the sources back to you, Simon?
So far I do not have the complete environment at home to build a new client.
I have the Microsoft C Compiler Version 4.00 and the debugger Code View Version 1.0 to write some short programs to check whether my new code is fine.
Sure, I can download all the necessary new components to install a new environment, but it only works for a month. That is a bit of a pity. Otherwise I would have to invest over 600 dollars, I think, to get it for permanent use.
Does anybody have a good idea what to do?
Kind regards, seti_britta




Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: optimized sources
« Reply #1 on: 06 Mar 2007, 02:40:16 pm »
I've shifted this to the Windows side since that matches the compiler and what you are running.

What you might do is just attach your changed source files to a post here. I'm definitely interested, one of my hosts has a Pentium-MMX CPU so can't use the vectorized chirp functions. And if you've improved the TrigArray approach enough, it might turn out to be faster than those vectorized versions even on systems with SSE, etc.  Any further optimizations will also be welcome.

Simon has the full build system with Intel compiler and Intel Performance Primitives, but I've been doing test GCC builds for Windows with DevC++/MinGW (as Eric Korpela uses for the stock Windows applications). If your changes can be built this way I'll probably try.
                                                                                 Joe

BenHer

  • Guest
Re: optimized sources
« Reply #2 on: 06 Mar 2007, 04:59:18 pm »
Britta,

Regarding your other questions.

1. Final releases are compiled with Intel's C++ Compiler v9.x.  There is a free version of this for Linux, and the Windows version is available as a 30-day demo install.

2. Changes that compile with Microsoft C++ 2003 or 2005 should almost always work with the Intel compiler.

3. We (the programmers) usually make a change, compile a candidate executable with that change, and then test it by crunching one of 7 available test work units.  These WUs are modified to run in about 1/15th the normal run time of a regular WU, but they still test all parts of the seti code.

4. Once the test is complete (we also time the test and compare the time to the latest release) we verify that it produced the correct output file (result) by using rescmp, a result comparison utility.  If that works (and the time is faster) we then post the changed source file(s) along with the new executable in a posting to one of these threads for the rest of the development/testing group to try out and validate.
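
The accept/reject rule described in steps 3 and 4 can be sketched roughly as follows. This is only an illustration: `CandidateRun` and `accept_candidate` are hypothetical names, and the `result_matches` flag stands in for whatever the rescmp comparison reports (its actual interface is not shown in this thread).

```python
from dataclasses import dataclass


@dataclass
class CandidateRun:
    """One run of a candidate executable on a shortened test WU (hypothetical type)."""
    seconds: float        # measured run time of the test WU
    result_matches: bool  # assumed outcome of the result comparison (e.g. via rescmp)


def accept_candidate(candidate: CandidateRun, reference_seconds: float) -> bool:
    """Accept only if the result is correct AND the run beats the latest release."""
    return candidate.result_matches and candidate.seconds < reference_seconds


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    latest_release_seconds = 100.0
    run = CandidateRun(seconds=95.0, result_matches=True)
    print("post for group testing:", accept_candidate(run, latest_release_seconds))
```

A faster build with a wrong result is rejected just like a correct but slower one; only builds passing both checks would be posted for the group to validate.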

Offline _heinz

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 2117
Re: optimized sources
« Reply #3 on: 15 Mar 2007, 10:06:28 pm »
Joe,
I'm now working on analyzeFuncs.cpp. The important part, chirpfft.cpp, is now done.  Feel free to give some hints and comments.  Don't try to compile it on its own; some variables are defined outside of it.  All modifications are marked with "seti_britta:", so you can easily find them by searching.
seti_britta

[attachment deleted by admin]

Offline Crunch3r

  • Knight who says 'Ni!'
  • *****
  • Posts: 602
    • 64 bit boinc clients
Re: optimized sources
« Reply #4 on: 16 Mar 2007, 12:48:45 am »
Hello Seti_britta,

Since I've seen that you joined Seti.Germany, I assume I can write this one in German... (if I'm wrong please correct me ;-)

-------------------------------------------------------------------------------------------------------------------------------------------------

[Translated from German:]

OK... the (log etc.) functions should be implemented with the ones from Intel ICC/IPP or MKL.
(log with libimf, i.e. mathimf.h)

We have the necessary licenses for that... (for testing, they are also available as a 30-day demo from Intel)

What I would personally be interested in is an implementation of the powerspectrum and transpose functions via Intel MKL... or powerspectrum via Intel IPP.

Can you implement that?

P.S. Are you familiar with Linux or only Windows?









« Last Edit: 16 Mar 2007, 12:53:32 am by Crunch3r »
I want to share something with you: The three little sentences that will get you through life. Number 1: Cover for me. Number 2: Oh, good idea, Boss! Number 3: It was like that when I got here.

Homer Simpson

Offline _heinz

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 2117
Re: optimized sources
« Reply #5 on: 19 Mar 2007, 10:34:13 pm »
Hello Joe, hello Crunch3r,

at the moment I'm very busy with analyzeFuncs.cpp, reducing code and making some optimizations in the sources. After that I will look at what to do with powerspectrum and transpose. How you know analyzeFuncs is a fat thing, not easy to understand what is going on in the code. Therefore I divided it into logical parts whose function is easy to understand. This gives me a better overview and the chance to reduce code and make other logical changes. Now I'm ready to show the first result of my studies, the new program structure of seti_analyze. Hints and suggestions are welcome.

For Crunch3r --> I know Linux too and have already installed a web server with Apache and PHP, but at the moment I still have some old Win and Mac boxes and a P4 with xph; Linux is not installed.
In English so everyone else can read along :-)

seti_britta

See attached file: the new structure of seti_analyze (for now only for understanding, documentation and discussion)

[attachment deleted by admin]

BenHer

  • Guest
Re: optimized sources
« Reply #6 on: 20 Mar 2007, 05:12:00 pm »
Quote
How you know analyzeFuncs is a fat thing,

Britta,

We know because we have compiled and then run the seti executable under control of a "profiling" program.  After crunching an entire WU we then know that aa% of the time was spent within function abc, bb% of the time was spent within function xyz, and so on for all functions in the program.

The ones that use the most time get the most of our optimization thinking and programming attempts.
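
As a rough illustration of what such a profile yields, here is a minimal sketch using Python's standard `cProfile` and `pstats` modules. It is not the profiler used on the seti client; `heavy`, `light` and `workload` are hypothetical stand-ins for real functions, and the percentages will differ from run to run.

```python
import cProfile
import pstats


def heavy():
    # Stand-in for an expensive routine.
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def light():
    # Stand-in for a cheap routine.
    total = 0
    for i in range(1_000):
        total += i
    return total


def workload():
    heavy()
    light()


def time_shares(func):
    """Profile func and return {function name: percent of total own-time}."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    # stats maps (file, line, name) -> (call count, primitive calls, own time, cumulative time, callers)
    stats = pstats.Stats(profiler).stats
    total = sum(entry[2] for entry in stats.values())
    if total == 0:
        return {}
    return {key[2]: 100.0 * entry[2] / total for key, entry in stats.items()}


if __name__ == "__main__":
    shares = time_shares(workload)
    for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
        print(f"{pct:6.2f}%  {name}")
```

Sorting by share puts the functions that deserve the most optimization effort at the top of the list.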

Offline _heinz

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 2117
Re: optimized sources
« Reply #7 on: 28 Mar 2007, 08:20:58 pm »
the first news

I now use an enhanced timer that counts in timer ticks to test pieces of code

used it to test the new function CalcAng
let the function write cyclically into a 1000-element double vector
ran this in a loop of 10 000
so we call the function 10 million times.
was surprised by the result,
tried this with 2 small, different functions
here you see the result
-------------------
Timer Frequency in:
Hz  =       3579545
MHz =       3.57955
GHz =    0.00358

Start Time =     743223057648 Ticks
Stop Time  =     743224081856 Ticks

Duration in Ticks   =  1024208
Duration in seconds =  0.2861279855401
--------------------------------------
Timer Frequency in:
Hz  =       3579545
MHz =       3.57955
GHz =    0.00358

Start Time =     743224082065 Ticks
Stop Time  =     743225105999 Ticks

Duration in Ticks   =  1023934
Duration in seconds =  0.2860514394986
--------------------------------------
   P1 = 1024208
   P2 = 1023934
   dif= 274

Result: P2 is faster than P1

 ;D

the second news:
set up MS Visual Studio 2005 Express
updated it with the Windows Server 2003 Platform SDK
using this environment to compile the seti sources

going on now with further optimization of the sources

seti_britta
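
The timer figures in the post above can be rechecked with a few lines of arithmetic. The sketch below only reuses the numbers printed in the post; the function names are made up for illustration. It also shows that the gap between the two variants is about 0.03% of the run time, which a single run may not be able to separate from measurement noise.

```python
TIMER_HZ = 3_579_545  # timer frequency as reported in the post


def ticks_to_seconds(ticks, frequency_hz=TIMER_HZ):
    """Convert a tick count to seconds for a timer running at frequency_hz."""
    return ticks / frequency_hz


def relative_difference(ticks_a, ticks_b):
    """Fraction by which run B is shorter than run A."""
    return (ticks_a - ticks_b) / ticks_a


# Start/stop values copied from the post.
p1 = 743224081856 - 743223057648
p2 = 743225105999 - 743224082065

if __name__ == "__main__":
    print("P1:", p1, "ticks =", ticks_to_seconds(p1), "s")
    print("P2:", p2, "ticks =", ticks_to_seconds(p2), "s")
    print("difference:", p1 - p2, "ticks")
    print("relative difference: {:.4%}".format(relative_difference(p1, p2)))
```

Repeating each measurement several times and comparing the spread of the results would give a firmer basis for calling one variant faster.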

Offline _heinz

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 2117
Re: optimized sources
« Reply #8 on: 31 Mar 2007, 05:54:44 am »
- imported seti_boinc from Visual Studio 2003 to Visual Studio 2005 Express Edition  ;D
- can now compile and get object modules
- compile analyzeFuncs
------ Build started: Project: seti_boinc, Configuration: Debug Win32 ------
Compiling...
analyzeFuncs.cpp
....some warnings
Build log was saved at "file://c:\boincstuff\kwsn-seti_boinc_1.3\seti_boinc\client\win_build\Debug\BuildLog.htm"
seti_boinc - 0 error(s), 13 warning(s)
========== Build: 1 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========
 ;D

now I can check all my changes for compiler errors  ;D

@Simon : so far I have not installed IPP and MKL, but once I do, it should be possible to compile an optimized client. I hope I did not forget anything. Simon, what do you think about it ?


Offline Simon

  • Ni!
  • Knight who says 'Ni!'
  • *****
  • Posts: 1045
    • Is it a bird? Is it a plane? No...its-the.net!
Re: optimized sources
« Reply #9 on: 31 Mar 2007, 07:04:10 am »
Hi Britta,

for optimal results, you should use ICC and IPP. Unless you modified the sources to use the fftw wrapper that MKL provides, it's not necessary (MKL).

Go for it :)

Regards,
Simon.

Gecko_R7

  • Guest
Re: optimized sources
« Reply #10 on: 31 Mar 2007, 01:45:43 pm »
Hi Simon,

Are you planning to play w/ and compare new MKL 9.1 Beta?
You think it has caught-up/surpassed speed of IPP? 


Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: optimized sources
« Reply #11 on: 31 Mar 2007, 08:19:00 pm »
Quote
Hi Simon,

Are you planning to play w/ and compare new MKL 9.1 Beta?
You think it has caught-up/surpassed speed of IPP?

It would be interesting to know if they're products of separate teams within Intel which compete, or basically the same code under the hood with different interface and focus. My assumption has been the latter, in which case whichever one has the most recent release should be "better" in some sense. But note that "better" does not always mean "faster".
                                                                             Joe

Gecko_R7

  • Guest
Re: optimized sources
« Reply #12 on: 31 Mar 2007, 11:33:07 pm »
Think I may have as close to an Apples to Apples comparo of IPP vs. MKL 9.0

XEON 3.0 w/ IPP vs. MKL 8.1 in the first graph.
XEON 3.0 w/ new MKL 9.0 in the second

Looks pretty close w/ the new MKL 9.0 being slightly quicker than IPP in the 16K to 132K range.... if this is truly a level comparison.
At 16K, IPP = 12.5 Gflops vs. @ 13.5 Gflops for MKL 9.0 or @ 8% quicker
At 132K, IPP = 11.5 Gflops vs. @ 12.25 Gflops for MKL 9.0 or @ 6% quicker

I'd assume there are "other" improvements in 9.x w/ further optimization relevance as well?
Would the added trigonometric and other complex data support in the 9.0 VML also be worth a closer look?
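
The percentages above follow from the Gflops figures in the post, which are approximate readings from the graphs. A small sketch of the arithmetic, with `percent_faster` as a made-up helper name:

```python
def percent_faster(baseline_gflops, candidate_gflops):
    """Percent by which the candidate throughput exceeds the baseline."""
    return 100.0 * (candidate_gflops - baseline_gflops) / baseline_gflops


if __name__ == "__main__":
    # Approximate values as quoted in the post.
    print("16K :", round(percent_faster(12.5, 13.5), 1), "%")
    print("132K:", round(percent_faster(11.5, 12.25), 1), "%")
```

With these inputs the 16K case works out to 8% and the 132K case to roughly 6.5%, in line with the rounded figures quoted.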









[attachment deleted by admin]

[attachment deleted by admin]
« Last Edit: 01 Apr 2007, 03:18:26 pm by Gecko_R7 »

msattler

  • Guest
Re: optimized sources
« Reply #13 on: 01 Apr 2007, 12:47:15 am »
Does this mean I might have some newly compiled apps to test soon?

Offline Crunch3r

  • Knight who says 'Ni!'
  • *****
  • Posts: 602
    • 64 bit boinc clients
Re: optimized sources
« Reply #14 on: 01 Apr 2007, 02:16:20 pm »
Hi,

MKL 9.0 is way faster than 8.0 and is equal to or, in some cases depending on the AR, faster than IPP.
Some weeks ago I built an app from stock source and compared it to an old 5.12, and it was faster.

Regarding the trigonometric stuff, IMHO it is worth looking into!
But it depends on Ben and Joe whether they would like to have a closer look at it.


« Last Edit: 01 Apr 2007, 02:31:21 pm by Crunch3r »