Author Topic: For the programmers:  (Read 14623 times)

Offline KarVi

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 252
For the programmers:
« on: 20 Jan 2007, 06:17:05 pm »
Since I'm not a programmer, this is totally incomprehensible to me, but this link:

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF

is AMD's official optimization guide for the Athlon64 family.

If you don't already use it, perhaps some gains could be found for the many AMD users?

I have no idea whether it's easy to implement the things written in this document, but it would be interesting to see what an application specifically optimized for the Athlon64 architecture could gain in performance.
A smile is the shortest distance between two people (Victor Borge).

BenHer

  • Guest
Re: For the programmers:
« Reply #1 on: 21 Jan 2007, 01:35:08 pm »
Our current release (2.0b or c) screams on Athlon systems, and also on earlier Intel systems.

It doesn't do anywhere near as well on Core or Core2 chips.  Since none of us programmers has one, we really have conceptual issues making the code faster on them.  All of our routines that work really fast on the other chips I mentioned don't seem to speed up the crunching on the Core chips... and so far we have no brilliant insights as to why.

Let me explain why having one is necessary to the process.
To develop an optimized section of a program:
1. The programmer conceives of a method that *might* make it faster.
2. He/she then writes a separate version of the source code with these hopeful improvements.
2.a. Re-coding
3. After coding, they try to compile...
3.a. at which point they discover some errors they overlooked in their coding.
3.b. After maybe multiple attempts, the executable finally compiles.
4. Then the executable is run in either a short WU test or a test-bench run.
5. If the test-bench results are incorrect, the programmer returns to step 2.a.
6. If a test bench is used, a short WU test must be run later to verify the code works.
7. At this point timing tests are also done to see whether the changes improved the code speed, and if so by how much.  If the results don't validate, however, extra speed doesn't matter.
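Steps 4 to 7 above (run, check the results, then time) can be sketched as a tiny test bench. This is only an illustration in Python, not the project's actual test bench: the names `reference_power`, `candidate_power`, `validate`, `speedup` and `evaluate`, the sample data and the tolerance are all made up for the example.

```python
import math
import timeit

# Hypothetical input data: 1000 (re, im) pairs, standing in for a short test WU.
SAMPLES = [(math.sin(i), math.cos(i)) for i in range(1000)]

def reference_power(samples):
    """Trusted baseline: sum of squared magnitudes, written the obvious way."""
    total = 0.0
    for re, im in samples:
        total += re * re + im * im
    return total

def candidate_power(samples):
    """Hopeful rewrite of the same computation (steps 1 and 2)."""
    return math.fsum(re * re + im * im for re, im in samples)

def validate(candidate, reference, samples, rel_tol=1e-9):
    """Steps 4 to 6: the candidate must reproduce the reference result."""
    return math.isclose(candidate(samples), reference(samples), rel_tol=rel_tol)

def speedup(candidate, reference, samples, repeats=5, number=20):
    """Step 7: ratio of best reference time to best candidate time."""
    ref_t = min(timeit.repeat(lambda: reference(samples), repeat=repeats, number=number))
    cand_t = min(timeit.repeat(lambda: candidate(samples), repeat=repeats, number=number))
    return ref_t / cand_t

def evaluate(candidate, reference, samples):
    """Return the measured speedup, or None if the results don't validate."""
    if not validate(candidate, reference, samples):
        return None  # wrong results: extra speed doesn't matter
    return speedup(candidate, reference, samples)

if __name__ == "__main__":
    result = evaluate(candidate_power, reference_power, SAMPLES)
    print("rejected" if result is None else f"speedup: {result:.2f}x")
```

The point of the sketch is the ordering: timing is only looked at after the candidate validates, and the timing numbers are only meaningful on the machine that ran them, which is why the programmer needs the target chip.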

So... if the programmer doesn't have the destination platform, in this case Core or Core2 chips, then between each of steps 4, 5, 6 and 7 they would have to email or post their executable and wait for someone who did have the chip to test it, then post back results.

Hope this makes it all clearer.

P.S.: A Core2 chip with motherboard + RAM + case would probably cost, in my area, about $170+80+50.  Could I afford this... yes, easily.  Am I interested enough in SETI to buy it when I otherwise have no need for it... no.

P.P.S.: Alex Kan has a Core2-based Macintosh, and he has written some very fast code for that.  Crunch3r copied his source over to an Intel machine and modified it so it would compile for PC.  He has an internal-only version whose status we don't know.
« Last Edit: 21 Jan 2007, 01:44:13 pm by BenHer »

Offline KarVi

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 252
Re: For the programmers:
« Reply #2 on: 22 Jan 2007, 10:37:53 am »
Thanks for the explanation.

I had an idea about how development work progresses, and it was pretty much as you describe here.

Reading AMD's optimization guide could, I think, help the programmer in steps 1 and 2 to write code that doesn't execute in an unfortunate way.

I think it's a proven point that Intel's compilers don't do AMD chips any favours. So helping the AMD compiles becomes the programmer's unfortunate job, and an optimization guide must be a worthwhile read?

I know time constraints are a serious factor, but then again, having an optimization guide at hand could be helpful when looking for examples of effective coding? And doing things the right way the first time (optimization-wise) must be good?

You must excuse me if I in any way sound as if I don't respect the work you are already doing, because I have the greatest respect for what you guys are doing.

I'm just the type myself that likes to read manuals, and I think I benefit greatly by doing so, as I often know things about a product that others don't, because they don't take the time.

I'm just trying to help in my own very limited way.

If my input is not wanted I will stop giving it.
A smile is the shortest distance between two people (Victor Borge).

BenHer

  • Guest
Re: For the programmers:
« Reply #3 on: 22 Jan 2007, 02:54:46 pm »
Sorry on my part, KarVi... I don't do it very often but sometimes I "go off". :o  Heh.

You are definitely correct about Intel "shortchanging" AMD in certain compiles. If you do some searching around the forums here you will find more details about that.

The actual placement of opcodes in the object code works pretty well on both AMD and Intel, but some library functions are really built for Intel.

Personally I have a copy of that AMD optimization guide as well as the corresponding Intel one.  But the other programmers might not have gotten one... so good heads up.

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: For the programmers:
« Reply #4 on: 22 Jan 2007, 05:00:53 pm »
Quote from: KarVi on 20 Jan 2007, 06:17:05 pm
Since I'm not a programmer, this is totally incomprehensible to me, but this link:

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF

is AMD's official optimization guide for the Athlon64 family.

If you don't already use it, perhaps some gains could be found for the many AMD users?

I have no idea whether it's easy to implement the things written in this document, but it would be interesting to see what an application specifically optimized for the Athlon64 architecture could gain in performance.

I fully agree with Ben's replies, and for the record I have that document plus an earlier (pre-64) version, and also both current and old versions of the Intel optimization advice. I don't claim to have fully absorbed the contents, though.

In terms of AMD specific optimisation, what would be interesting is a comparison of the AMD and Intel advice looking for contradictions and/or differences in emphasis. And neither set of documents goes very far into how much improvement can be expected from a specific kind of optimization, so it is hard to judge where to direct effort at improvement.

Ben provided a facility to test various versions of optimized code, which runs when the 2.0 builds start. The version of each optimized function that is fastest on that system is used for crunching the WU. It is certainly true that on AMD systems that process makes different choices than on most Intel systems, so to that extent the 2.0 builds already adapt to AMD. But the variety of optimized routines we've generated certainly doesn't cover all possibilities.
Joe
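The startup selection Joe describes (time every optimized version of a function on the local machine and keep the fastest) can be sketched roughly like this. It is a minimal illustration in Python and not the code in the 2.0 builds: the three `sum_squares_*` variants, the `VARIANTS` table and `pick_fastest` are hypothetical names invented for the example.

```python
import timeit

# Three interchangeable implementations of the same routine (hypothetical).
def sum_squares_loop(values):
    total = 0
    for v in values:
        total += v * v
    return total

def sum_squares_genexpr(values):
    return sum(v * v for v in values)

def sum_squares_map(values):
    return sum(map(lambda v: v * v, values))

VARIANTS = {
    "loop": sum_squares_loop,
    "genexpr": sum_squares_genexpr,
    "map": sum_squares_map,
}

def pick_fastest(variants, sample, repeats=3, number=50):
    """Time each variant on this machine and return the quickest one.

    Returns (name, function, timings), where timings maps each variant
    name to its best observed time in seconds.
    """
    timings = {}
    for name, func in variants.items():
        timings[name] = min(
            timeit.repeat(lambda: func(sample), repeat=repeats, number=number)
        )
    best = min(timings, key=timings.get)
    return best, variants[best], timings

if __name__ == "__main__":
    name, sum_squares, timings = pick_fastest(VARIANTS, list(range(256)))
    print("using variant:", name)
    print(sum_squares([1, 2, 3]))
```

Because the choice is made from measurements on the host, the same build can pick different variants on an Athlon and on an Intel chip. The catch is the one Joe mentions: it can only choose among the variants somebody has already written.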

 
