Seti@Home optimized science apps and information
Optimized Seti@Home apps => Windows => Topic started by: nevica on 26 Mar 2008, 05:50:17 am
-
Hello,
I have the following processor:
AuthenticAMD
AMD Athlon(tm) 64 X2 Dual-Core Processor TK-57 [x86 Family 15 Model 104 Stepping 2]
running
Microsoft Windows Vista
Home Edition, Service Pack 1, (06.00.6001.00)
Could anybody please tell me the best optimized app to use on this computer.
Best wishes,
Nevica
-
As far as I know the best application for your setup is this:
http://calbe.dw70.de/2.4V/2.4V_Windows_x32_SSE2A.zip
I've run it with good results on my X2-3800+, and use it now on my Phenom.
Keep an eye out: sometime in the not too distant future, newer and significantly faster apps will be released.
-
Hello Karvi,
Thanks for your reply.
I am a bit confused as to how many bits my processor has. It has a 64 in its name, so I presume it is 64-bit. I note that the optimised app you suggested has a 32 in it, so I presume this means 32-bit. Please could you enlighten me?
Regards,
Nevica
-
If you run a 64 bit version of Vista, you can use the 64-bit or the 32-bit client.
If you run the 32-bit version of Vista, you have to use the 32-bit version.
Both can be found at:
http://calbe.dw70.de/seti.html
I think I read somewhere that the 32-bit client was still the fastest, but I could be wrong. The difference will not be big anyway.
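The rule above can be written down as a tiny lookup. This is only a minimal sketch in Python; the function name is made up for illustration and is not part of BOINC or of the optimized apps.

```python
def usable_client_bits(os_bits):
    """Return which client bitnesses can run on a Windows install of the given bitness.

    Hypothetical helper: it only encodes the rule stated in the post above.
    """
    if os_bits == 64:
        # A 64-bit Windows can run both the 64-bit and the 32-bit client.
        return [64, 32]
    if os_bits == 32:
        # A 32-bit Windows can only run the 32-bit client.
        return [32]
    raise ValueError("os_bits must be 32 or 64")


print(usable_client_bits(32))
print(usable_client_bits(64))
```

Note that the "64" in the processor name does not decide this; the installed version of Windows does.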
-
Karvi,
I am, in fact, using the 32 bit version of Windows Vista Home Premium.
On this page:
http://calbe.dw70.de/win32.html
Could you tell me the app I should be using with:
AuthenticAMD
AMD Athlon(tm) 64 X2 Dual-Core Processor TK-57 [x86 Family 15 Model 104 Stepping 2]
Thanks,
Nevica
-
...
Could you tell me the app I should be using with:
AuthenticAMD
AMD Athlon(tm) 64 X2 Dual-Core Processor TK-57 [x86 Family 15 Model 104 Stepping 2]
2.4V Windows x32 SSE2 AMD
Joe
-
Listen to Joe :)
It's the same answer I gave you in my first reply.
-
So what you're saying is, I could have still used the 32-bit Seti??? :o
I just went to 64-bit Vista, and right now I am using the normal BOINC, because the optimized BOINC wouldn't work, together with the optimized 64-bit SSE2 Seti.......
-
With opt app V2.4 for AMD chips - yes, the 32-bit app is faster on a 64-bit OS. This will no longer be true with the upcoming release.
-
Thanks for the reply......
I knew I should have done some reading....heh
I will use the 64 bit app till the next release......
-
Hi
I use AK_v8_win_SSE3.exe on an Athlon X2 (AM2) 1.9 GHz (45 W),
and now I don't know which app is the best for my Athlon. :-[
OS Win XP SP3 32bit
-
Hi there. If you have SSE3, then the AK ones are definitely faster IMO (but then I don't drive an AMD chip anymore). It is mostly the older chips with only SSE (and maybe some SSE2) where the special FFTW build of v2.4 is particularly faster.
Jason
[Note: Please keep in mind the posts in this thread are from well before the release of AK_v8 for windows, and information is no longer accurate. ]
-
Try comparing the SSE2 and SSE3 versions of AK.
I have received a few results showing that the x86 SSE2 version is faster than the SSE3 one on Athlon 64 (X2 too). It's AMD specific; Intel chips go faster with SSE3 if available.
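The rule of thumb in this post can be sketched as a small ordering function. This is only an illustration in Python: the function name and the feature labels are made up here, and it encodes nothing more than what this thread says about Multibeam builds at the time, so real benchmarks on your own machine should decide.

```python
def builds_to_compare(vendor, features):
    """Return Multibeam build levels worth benchmarking, most promising first.

    Hypothetical helper. 'vendor' is the CPUID vendor string and 'features'
    is a set of lowercase instruction-set names such as {"sse", "sse2", "sse3"}.
    """
    # Default order: highest supported instruction set first.
    order = ["SSE3", "SSE2", "SSE"]
    supported = [level for level in order if level.lower() in features]

    # Per the thread: on Athlon 64 (X2 too) SSE2 was sometimes faster than SSE3,
    # so for AMD chips with both, try SSE2 first.
    if vendor == "AuthenticAMD" and "SSE2" in supported and "SSE3" in supported:
        supported.remove("SSE2")
        supported.insert(0, "SSE2")
    return supported


print(builds_to_compare("AuthenticAMD", {"sse", "sse2", "sse3"}))
print(builds_to_compare("GenuineIntel", {"sse", "sse2", "sse3"}))
```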
-
I use the SSE3 AP build on my Athlon, but I get poor results: http://setiathome.berkeley.edu/results.php?hostid=4708858&offset=40
Can I change to a different optimization for AMD (SSE) for better computing??
-
Not recommended for AstroPulse: the SSE3 build is generally faster in all cases where it is available. This is an old thread related to characteristics of Multibeam builds. AstroPulse builds do not seem to follow this pattern (yet).
Poor results compared to what?
-
http://setiathome.berkeley.edu/workunit.php?wuid=392151435 this is Core 2 SSE 3
My SSE3 (Athlon)
http://setiathome.berkeley.edu/results.php?hostid=4708858&offset=40 ~130.000 sec vs 33.000 sec....
-
That appears to be an E8400, and probably, by the look of the time, extremely overclocked. My E8400 runs comfortably at a moderate overclock of 3.6GHz, and got around 10.5 hour times with that r69 release build. That machine seems to have achieved results in 8hr40mins , or about 20% quicker. So probably in the range of about 4.3GHz.
Are you sure you want to compare times against this machine?
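The clock estimate in this post can be reproduced with a few lines of arithmetic. A rough sketch in Python, assuming (as the post implicitly does) that run time on the same chip model and the same app build scales inversely with clock speed; the variable names are only for illustration.

```python
def hm_to_hours(hours, minutes=0):
    """Convert hours and minutes to decimal hours."""
    return hours + minutes / 60


ref_hours = 10.5                # E8400 at a moderate overclock, from the post
ref_ghz = 3.6                   # clock of that reference machine
other_hours = hm_to_hours(8, 40)  # 8 hr 40 min on the compared machine

# How many times faster the compared machine finished.
speedup = ref_hours / other_hours
# Assumption: time is inversely proportional to clock on otherwise equal hardware.
estimated_ghz = ref_ghz * speedup

print(round(speedup, 2), round(estimated_ghz, 2))
```

This gives a speedup of roughly 1.2 and an estimated clock a little above 4.3 GHz, in line with the figures quoted in the post.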
-
OK. I have 2.9 GHz and ~130.000 sec??? That is 36 h.
36 h vs 10 h.... :-\
Maybe SSE3 on the Athlon is not working too fast :(
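For reference, the conversion from the run times quoted above (written with "." as thousands separator, so 130.000 sec means 130,000 seconds) to hours can be checked like this. A small Python sketch; the names are only for illustration.

```python
def seconds_to_hours(seconds):
    """Convert a run time in seconds to decimal hours."""
    return seconds / 3600


athlon_hours = seconds_to_hours(130_000)  # Athlon run time from the post
core2_hours = seconds_to_hours(33_000)    # Core 2 run time from the post

# How many times longer the Athlon took.
ratio = athlon_hours / core2_hours

print(round(athlon_hours, 1), round(core2_hours, 1), round(ratio, 1))
```

So the "36 h vs 10 h" above is slightly rounded: it is about 36 hours against a bit over 9 hours, a factor of nearly 4.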
-
That's okay,
Apart from clock speed it also comes down to clocks per instruction, and sheer cache size. In both regards the E8400 is a bad example to compare against because it is a new generation, huge cache model. Even the quads are generally much more bus and cache starved. In that respect the E8400/8500/8600 are some of the fastest dual cores ever produced (to my knowledge).
If you can pick an Intel model of about the same generation as your chip it will probably make a much better comparison. e.g. My p4 3.2 Cedar Mill running stock has slightly longer times than yours, though a faster clock speed and larger cache.
It is difficult to compare times cross brand, but I can assure you that none of the apparent disadvantage is because of anything other than CPU generation.
Jason
-
My T7200(2.0)(r84) does them in about 28 hours and the E6550(2.33)(r69) does them in 18.
-
Those are slightly better middle ground comparisons, thanks arkayn.
-
I have to reply to this thread.
Although I'm very pleased with the results of the optimizations, and have the greatest respect for the effort put in by the optimizers here, I still believe much emphasis is put on making the apps fast on Intel architecture, and less on AMD. This is understandable, seeing how Intel has 75-80% of market share, and the tools they have are better.
Still, I have to wonder what performance AMD chips could have if an AMD-specific build were made, with optimizations that take advantage of AMD's strengths and avoid its weaknesses.
I once read this at the Ace's Hardware forum (a very good technical forum) about AMD and optimizing for it:
Quote:
I once had the opportunity to discuss online about that with several people, including some AMDers. AMD is desperate because they can't break the cycle of Intel's C++/Fortran compilers which are turning to be venom against AMD cpus much due to AMD's own design choices of late. Microsoft's compiler may help them IF coders care to create a code path for AMD.
AMD's L1 cache can work very well with FFT's and possibly compete with more expensive metal but it's still underutilized and adoption of adequate tools is going on slower than they predicted. AMD made the choice to equip their cpus L1 caches with neat features that Intels don't have and, of course, Intel doesn't care to optimize for. Prime95 worked like a dog with AMD processor until GIMPS adopted the prefetchw opcode for AMD's codepath along several other optimizations and since that, AMD's latest is the fastest clock-for-clock under Prime95. prefetchw is also useful to accelerate core-to-core coherency through the L3 cache.
Quote end:
Personally I know next to nothing about coding (once learned a little Turbo Pascal...), so I don't know how much work would be required for an AMD specific build, but if one was attempted, I would be more than happy to run lots of tests, to find the best solutions for AMD CPU's.
-
I will say once more, the AstroPulse builds are made with Microsoft Visual Studio, not the Intel compiler. If something is unclear here, please enquire; alternatively, the source code is available from the download section.
-
That wouldn't help me any, since I can't read code.
You seem to be annoyed by me posting my worries. I don't understand the reaction. Though I may seem to be criticizing, my hope is to be helpful and constructive.
There have been several Intel only builds, what would be wrong with an AMD only build? Do you use "prefetchw", and if you do, does it take the larger L1 cache into account? How about looking into SSE4a, or some of the 3DNow! instructions. Perhaps and only perhaps, one or two of the instructions therein could be of benefit?
Would you agree that the possibility exists that different code paths than the ones used, could perform better on AMD hardware? Why not try? This is only a suggestion, if the workload needed to create these builds, is of such magnitude that it would be counterproductive for the general optimization, it should just be left alone.
-
I am indeed annoyed, because you are using our development site as a platform for your rant oriented around AMD vs Intel and false information, and have completely ignored our last discussion on the issue. Take it elsewhere.
-
I have not ignored anything. But still, there are no AMD-only builds, and there never have been.
I will let the matter rest, since it is obviously of little importance. The most important thing is to get well optimized code that is compatible with official SETI code out the door.
I will continue to run tests and so on if this is wished, otherwise tell me so.
-
And neither will there be Intel only builds (apart from the obvious SSSE3, SSE4.1, and likely SSE4.2).
I will pass your comments onto the testing committee to assess whether your subjectiveness has compromised your value as tester.
Jason
-
Different devs have looked at (& continue to look at) opportunities where AMD could uniquely benefit. The short answer is that in almost all cases, the work required would necessitate complete re-writes of areas to benefit only "certain" AMD rigs.
A key developer who is on hiatus ATM (not active @ Lunatics BTW) recently commented that to do AMD correctly would necessitate a COMPLETE re-write of the applications....from scratch. He also acknowledged that the amount of time required vs. the number of users that would benefit from this wouldn't justify the effort, unless that was the person's only interest and motivation.
It's not an issue of lack of respect, it's an issue of lack of resources and the overall challenges of resource allocation against several other considerations, application-types and prevalent platform combinations.
-
Thanks for the reply. Explains a lot.
-
I'll try to give one more answer:
Regarding AstroPulse:
1) There is still no Intel-specific build. Any "specific" build requires additional effort. General optimisation (where possible) gives MUCH MORE return because it's GENERAL. So AP still has mostly general optimization.
2) It's true that SSE3 MB builds sometimes run even slower on AMD than SSE2 ones. Sorry, but I see only a poor SSE3 implementation on the chip here, not any plot against AMD.
3) There is no support for SSSE3 and up on AMD, but SSSE3 gives a nice speedup for MB (again, it's just reality).
4) There is NO AMD optimizing compiler. Intel has its own compiler; AMD has no compiler of its own at all. Look at AMD's pages: they recommend various third-party compilers (Intel's was included, if I remember right). But doing a build with a new compiler requires additional porting effort (just as CPU-specific optimization does compared with general optimisation). For AP, only the transition from general-purpose MS VC to ICC is in progress, and we already have some difficulties with performance. Doing builds with alternative (and not freely available) compilers would take more time, which could be spent more effectively on general optimization.
5) Sure, it's possible to speed up current AP (or probably MB) builds specifically on AMD chips, but it's not our fault that the latest Intel CPUs respond better to general optimizations than AMD's do (although for the AMD Athlon XP the speed improvement was best for AP SSE, BTW).
I hope the answer to "why not try" is clearer now.
(ADDON: Gecko_R7 explained it already indeed)