Seti@Home optimized science apps and information

Optimized Seti@Home apps => Windows => Topic started by: nevica on 26 Mar 2008, 05:50:17 am

Title: AMD optimized App
Post by: nevica on 26 Mar 2008, 05:50:17 am
Hello,

I have the following processor:

AuthenticAMD
AMD Athlon(tm) 64 X2 Dual-Core Processor TK-57 [x86 Family 15 Model 104 Stepping 2]

running

Microsoft Windows Vista
Home Edition, Service Pack 1, (06.00.6001.00)

Could anybody please tell me the best optimized app to use on this computer.

Best wishes,

Nevica
Title: Re: AMD optimized App
Post by: KarVi on 26 Mar 2008, 10:44:05 am
As far as I know the best application for your setup is this:

http://calbe.dw70.de/2.4V/2.4V_Windows_x32_SSE2A.zip

I've run it with good results on my X2-3800+, and use it now on my Phenom.

Keep an eye out: sometime in the not too distant future, newer and significantly faster apps will be released.
Title: Re: AMD optimized App
Post by: nevica on 26 Mar 2008, 11:19:11 am
Hello Karvi,

Thanks for your reply.

I am a bit confused as to how many bits my processor has. It has a 64 in its name, so I presume it is 64-bit. I note that the optimised app you suggested has a 32 in it, so I presume this means 32-bit. Please could you enlighten me.

Regards,

Nevica
Title: Re: AMD optimized App
Post by: KarVi on 26 Mar 2008, 05:16:27 pm
If you run a 64-bit version of Vista, you can use the 64-bit or the 32-bit client.

If you run the 32-bit version of Vista, you have to use the 32-bit version.

Both can be found at:

http://calbe.dw70.de/seti.html

I think I read somewhere that the 32-bit client was still the fastest, but I could be wrong. The difference will not be big anyway.
Title: Re: AMD optimized App
Post by: nevica on 03 Apr 2008, 03:29:20 pm
Karvi,

I am, in fact, using the 32 bit version of Windows Vista Home Premium.

On this page:

http://calbe.dw70.de/win32.html

Could you tell me the app I should be using with:

AuthenticAMD
AMD Athlon(tm) 64 X2 Dual-Core Processor TK-57 [x86 Family 15 Model 104 Stepping 2]

Thanks,

Nevica

Title: Re: AMD optimized App
Post by: Josef W. Segur on 03 Apr 2008, 03:47:19 pm
Quote:
...
Could you tell me the app I should be using with:

AuthenticAMD
AMD Athlon(tm) 64 X2 Dual-Core Processor TK-57 [x86 Family 15 Model 104 Stepping 2]
Quote end:

2.4V Windows x32 SSE2 AMD
Joe
Title: Re: AMD optimized App
Post by: KarVi on 03 Apr 2008, 05:05:55 pm
Listen to Joe  :)

It's the same answer I gave you in my first reply.
Title: Re: AMD optimized App
Post by: Slagathor on 25 Apr 2008, 10:24:45 pm
So what you're saying is, I could have still used the 32-bit Seti???     :o
I just went to 64-bit Vista. Right now I'm using the normal BOINC, because the optimized BOINC wouldn't work, and the optimized 64-bit SSE2 Seti.......

Quote:
If you run a 64-bit version of Vista, you can use the 64-bit or the 32-bit client.

If you run the 32-bit version of Vista, you have to use the 32-bit version.

Both can be found at:

http://calbe.dw70.de/seti.html

I think I read somewhere that the 32-bit client was still the fastest, but I could be wrong. The difference will not be big anyway.
Quote end:
Title: Re: AMD optimized App
Post by: Raistmer on 26 Apr 2008, 04:11:19 am
With opt app V2.4 for AMD chips - yes, the 32-bit app is faster on a 64-bit OS. This will no longer be true with the upcoming release.
Title: Re: AMD optimized App
Post by: Slagathor on 26 Apr 2008, 07:23:53 am
Thanks for the reply......

I knew I should have done some reading....heh

I will use the 64 bit app till the next release......
Title: Re: AMD optimized App
Post by: Slawek on 24 Nov 2008, 11:32:54 am
Hi

I use AK_v8_win_SSE3.exe for an Athlon X2 (AM2) 1.9 GHz (45 W)
and now I don't know which app is the best for my Athlon.  :-[



OS Win XP SP3 32bit
Title: Re: AMD optimized App
Post by: Jason G on 24 Nov 2008, 11:39:03 am
Quote:
Hi

I use AK_v8_win_SSE3.exe for an Athlon X2 (AM2) 1.9 GHz (45 W)
and now I don't know which app is the best for my Athlon.  :-[

OS Win XP SP3 32bit
Quote end:


Hi there. If you have SSE3, then the AK ones are definitely faster IMO (but then I don't drive an AMD chip anymore). It is mostly the older chips with only SSE (and maybe some SSE2) where the special FFTW build of v2.4 is particularly faster.

Jason

[Note: Please keep in mind the posts in this thread are from well before the release of AK_v8 for windows, and information is no longer accurate. ]
Title: Re: AMD optimized App
Post by: Raistmer on 24 Nov 2008, 01:57:06 pm
Try comparing the SSE2 and SSE3 versions of AK.
I have received a few results showing that, for x86, the SSE2 version is faster than the SSE3 one on Athlon 64 (X2 too). It's AMD-specific; Intel chips go faster with SSE3 if available.
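
[Note: the rule of thumb in the posts above can be sketched in code. This is only an illustration of what the posters report, not a tool that shipped with the apps. The function name candidate_builds and the feature labels are hypothetical; "AuthenticAMD" is the vendor string quoted earlier in the thread.]

```python
def candidate_builds(vendor, features):
    """Return SSE build levels worth benchmarking, most promising first.

    Hypothetical helper: it only encodes the rule of thumb reported in
    this thread and does not query the CPU itself.
    """
    # Keep the SSE levels the CPU reports, highest first.
    levels = [lvl for lvl in ("SSE3", "SSE2", "SSE") if lvl in features]
    if not levels:
        return []
    if vendor == "AuthenticAMD" and levels[:2] == ["SSE3", "SSE2"]:
        # Reported in the thread: SSE2 can beat SSE3 on Athlon 64,
        # so both builds are worth timing, SSE2 first.
        return ["SSE2", "SSE3"]
    # Otherwise (for example Intel chips): use the highest level available.
    return levels[:1]


amd_choice = candidate_builds("AuthenticAMD", {"SSE", "SSE2", "SSE3"})
intel_choice = candidate_builds("GenuineIntel", {"SSE", "SSE2", "SSE3"})
print(amd_choice, intel_choice)
```

For an Athlon 64 this suggests timing both the SSE2 and SSE3 builds on the same work, which is what the post above recommends.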
Title: Re: AMD optimized App
Post by: Slawek on 18 Jan 2009, 11:15:10 am
I use the SSE3 AP build on my Athlon, but I have poor results: http://setiathome.berkeley.edu/results.php?hostid=4708858&offset=40
Can I change the optimization for AMD (SSE) for better computing?

Title: Re: AMD optimized App
Post by: Jason G on 18 Jan 2009, 11:22:52 am
Not recommended for AstroPulse: the SSE3 build is generally faster in all cases where available. This is an old thread related to characteristics of Multibeam builds. AstroPulse builds do not seem to follow this pattern (yet).

Poor results compared to what?
Title: Re: AMD optimized App
Post by: Slawek on 18 Jan 2009, 11:32:10 am
http://setiathome.berkeley.edu/workunit.php?wuid=392151435 this is Core 2 SSE 3

My SSE3 (Athlon)
http://setiathome.berkeley.edu/results.php?hostid=4708858&offset=40  ~130.000 sec vs 33.000 sec....


Title: Re: AMD optimized App
Post by: Jason G on 18 Jan 2009, 11:40:17 am
That appears to be an E8400, and probably, by the look of the time, extremely overclocked. My E8400 runs comfortably at a moderate overclock of 3.6 GHz, and got around 10.5 hour times with that r69 release build. That machine seems to have achieved results in 8 hr 40 min, or about 20% quicker. So it is probably in the range of about 4.3 GHz.

Are you sure you want to compare times against this machine?
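
[Note: the clock estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes run time scales inversely with clock speed on the same chip model, which ignores memory and bus effects; the function name is made up for illustration.]

```python
def estimate_clock_ghz(ref_clock_ghz, ref_hours, observed_hours):
    """Estimate a clock speed, assuming run time is inversely
    proportional to clock speed on the same CPU model."""
    return ref_clock_ghz * ref_hours / observed_hours


observed_hours = 8 + 40 / 60      # 8 hr 40 min on the other E8400
speedup = 10.5 / observed_hours   # relative to the 10.5 hour reference
estimate = estimate_clock_ghz(3.6, 10.5, observed_hours)
print(f"{speedup:.2f}x quicker, roughly {estimate:.2f} GHz")
```

Under that assumption the other machine is about 1.21 times quicker, which puts it a little above 4.3 GHz.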
Title: Re: AMD optimized App
Post by: Slawek on 18 Jan 2009, 11:41:47 am
OK. I have 2.9 GHz and ~130.000 sec??? That is 36 h.


36 h vs 10 h....  :-\ 


Maybe SSE3 on the Athlon is not working too fast :(
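
[Note: a quick check of the figures in this exchange. The sketch reads "130.000" and "33.000" as 130,000 and 33,000 seconds, i.e. it assumes the dot is a thousands separator.]

```python
def seconds_to_hours(seconds):
    """Convert a run time in seconds to hours."""
    return seconds / 3600


athlon_hours = seconds_to_hours(130_000)  # Athlon result from the thread
core2_hours = seconds_to_hours(33_000)    # Core 2 result from the thread
ratio = athlon_hours / core2_hours
print(f"{athlon_hours:.1f} h vs {core2_hours:.1f} h, ratio {ratio:.1f}x")
```

This gives about 36.1 h against 9.2 h, so the Athlon takes close to four times as long on these work units.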
Title: Re: AMD optimized App
Post by: Jason G on 18 Jan 2009, 11:52:12 am
That's okay,
Apart from clock speed it also comes down to clocks per instruction, and sheer cache size.  In both regards the E8400 is a bad example to compare against because it is a new generation, huge cache model.  Even the quads are generally much more bus and cache starved.  In that respect the E8400/8500/8600 are some of the fastest dual cores ever produced (to my knowledge).

If you can pick an Intel model of about the same generation as your chip, it will probably make a much better comparison. E.g. my P4 3.2 Cedar Mill running stock has slightly longer times than yours, despite a faster clock speed and larger cache.

It is difficult to compare times cross brand, but I can assure you that none of the apparent disadvantage is because of anything other than CPU generation.

Jason
Title: Re: AMD optimized App
Post by: arkayn on 18 Jan 2009, 12:33:17 pm
My T7200(2.0)(r84) does them in about 28 hours and the E6550(2.33)(r69) does them in 18.
Title: Re: AMD optimized App
Post by: Jason G on 18 Jan 2009, 12:35:54 pm
Those are slightly better middle ground comparisons, thanks arkayn.
Title: Re: AMD optimized App
Post by: KarVi on 18 Jan 2009, 02:11:34 pm
I have to reply to this thread.

Although I'm very pleased with the results of the optimizations, and have the greatest respect for the effort made by the optimizers here, I still believe much emphasis is put upon making the apps fast on Intel architecture, and less upon AMD. This is understandable, seeing how Intel has 75-80% of market share, and the tools they have are better.

Still, I have to wonder what performance AMD chips could have if an AMD-specific build were made, with optimizations that take advantage of AMD's strengths and avoid the weaknesses.

I once read this at the Ace's Hardware forum (a very good technical forum) about AMD and optimizing for it:

Quote:
I once had the opportunity to discuss online about that with several people, including some AMDers. AMD is desperate because they can't break the cycle of Intel's C++/Fortran compilers which are turning to be venom against AMD cpus much due to AMD's own design choices of late. Microsoft's compiler may help them IF coders care to create a code path for AMD.

AMD's L1 cache can work very well with FFT's and possibly compete with more expensive metal but it's still underutilized and adoption of adequate tools is going on slower than they predicted. AMD made the choice to equip their cpus L1 caches with neat features that Intels don't have and, of course, Intel doesn't care to optimize for. Prime95 worked like a dog with AMD processor until GIMPS adopted the prefetchw opcode for AMD's codepath along several other optimizations and since that, AMD's latest is the fastest clock-for-clock under Prime95. prefetchw is also useful to accelerate core-to-core coherency through the L3 cache.

Quote end:

Personally I know next to nothing about coding (once learned a little Turbo Pascal...), so I don't know how much work would be required for an AMD specific build, but if one was attempted, I would be more than happy to run lots of tests, to find the best solutions for AMD CPU's.
Title: Re: AMD optimized App
Post by: Jason G on 18 Jan 2009, 02:18:23 pm
I will say once more: the AstroPulse builds are made with Microsoft Visual Studio, not the Intel compiler. If something is unclear here, please enquire; alternatively, the source code is available from the download section.
Title: Re: AMD optimized App
Post by: KarVi on 18 Jan 2009, 02:27:59 pm
That wouldn't help me any, since I can't read code.

You seem to be annoyed by my posting my worries. I don't understand the reaction. Though I may seem to be criticizing, my hope is to be helpful and constructive.

There have been several Intel only builds, what would be wrong with an AMD only build? Do you use "prefetchw", and if you do, does it take the larger L1 cache into account? How about looking into SSE4a, or some of the 3DNow! instructions. Perhaps and only perhaps, one or two of the instructions therein could be of benefit?

Would you agree that the possibility exists that different code paths than the ones used, could perform better on AMD hardware? Why not try? This is only a suggestion, if the workload needed to create these builds, is of such magnitude that it would be counterproductive for the general optimization, it should just be left alone.
Title: Re: AMD optimized App
Post by: Jason G on 18 Jan 2009, 02:32:36 pm
I am indeed annoyed, because you are using our development site as a platform for your rant oriented around AMD Vs Intel, false information, and completely ignored our last discussion on the issue.  Take it elsewhere.
Title: Re: AMD optimized App
Post by: KarVi on 18 Jan 2009, 02:35:48 pm
I have not ignored anything. But still, there are not, and never have been, AMD-only builds.

I will let the matter rest, since it is obviously of little importance. The most important thing is to get well optimized code that is compatible with official SETI code out the door.

I will continue to run tests and so on if this is wished, otherwise tell me so.
Title: Re: AMD optimized App
Post by: Jason G on 18 Jan 2009, 02:41:55 pm
And neither will there be Intel only builds (apart from the obvious SSSE3, SSE4.1, and likely SSE4.2).

I will pass your comments onto the testing committee to assess whether your subjectiveness has compromised your value as tester.

Jason
Title: Re: AMD optimized App
Post by: Gecko_R7 on 18 Jan 2009, 02:52:51 pm
Quote:
That wouldn't help me any, since I can't read code.

You seem to be annoyed by my posting my worries. I don't understand the reaction. Though I may seem to be criticizing, my hope is to be helpful and constructive.

There have been several Intel only builds, what would be wrong with an AMD only build? Do you use "prefetchw", and if you do, does it take the larger L1 cache into account? How about looking into SSE4a, or some of the 3DNow! instructions. Perhaps and only perhaps, one or two of the instructions therein could be of benefit?

Would you agree that the possibility exists that different code paths than the ones used, could perform better on AMD hardware? Why not try? This is only a suggestion, if the workload needed to create these builds, is of such magnitude that it would be counterproductive for the general optimization, it should just be left alone.
Quote end:

Different devs have looked at (and continue to look at) opportunities where AMD could uniquely benefit. The short answer is that in almost all cases, the work required would necessitate complete rewrites of areas to benefit only "certain" AMD rigs.
A key developer who is on hiatus ATM (not active @ Lunatics BTW) recently commented that to do AMD correctly would necessitate a COMPLETE rewrite of the applications....from scratch. He also acknowledged that the amount of time required vs. the number of users that would benefit wouldn't justify the effort, unless that was the person's only interest and motivation.

It's not an issue of lack of respect, it's an issue of lack of resources and the overall challenges of resource allocation against several other considerations, application-types and prevalent platform combinations.
Title: Re: AMD optimized App
Post by: KarVi on 18 Jan 2009, 03:05:18 pm
Thanks for the reply. Explains a lot.
Title: Re: AMD optimized App
Post by: Raistmer on 20 Jan 2009, 01:12:58 am
Quote:
That wouldn't help me any, since I can't read code.

You seem to be annoyed by my posting my worries. I don't understand the reaction. Though I may seem to be criticizing, my hope is to be helpful and constructive.

There have been several Intel only builds, what would be wrong with an AMD only build? Do you use "prefetchw", and if you do, does it take the larger L1 cache into account? How about looking into SSE4a, or some of the 3DNow! instructions. Perhaps and only perhaps, one or two of the instructions therein could be of benefit?

Would you agree that the possibility exists that different code paths than the ones used, could perform better on AMD hardware? Why not try? This is only a suggestion, if the workload needed to create these builds, is of such magnitude that it would be counterproductive for the general optimization, it should just be left alone.
Quote end:

I'll try to give one more answer:
Regarding AstroPulse:
1) There is still no Intel-specific build. Any "specific" build requires additional effort. General optimisation (if possible) gives MUCH MORE in return because it's GENERAL. So AP still has mostly general optimization.
2) It's true that SSE3 MB builds sometimes work even slower on AMD than SSE2 ones. Sorry, but I see only a poor SSE3 implementation on the chip here, not any plot against AMD.
3) There is no support for SSSE3 and up on AMD, but SSSE3 gives a nice speedup for MB (again, it's just reality).
4) No AMD optimizing compiler exists. Intel has its own compiler; AMD has no compiler of its own at all. Look at AMD's pages: they recommend various third-party compilers (Intel's was included, if I remember right). But doing a build with a new compiler requires additional porting effort (as does CPU-specific optimization instead of general optimisation). For AP, just the transition from general-purpose MS VC to ICC is in progress, and we already have some difficulties with performance. Doing builds with alternative (and not freely available) compilers would take more time, which could be spent more effectively on general optimization.
5) Sure, it's possible to speed up current AP (or probably MB) builds specifically on AMD chips, but it's not our fault that Intel's latest CPUs respond better to general optimizations than AMD's (although, BTW, the AMD Athlon XP saw the best speed improvement with AP SSE).

I hope it's now clearer "why not try".

(ADDON: Gecko_R7 explained it already indeed)