Seti@Home optimized science apps and information
Optimized Seti@Home apps => Windows => Topic started by: paxv on 30 Apr 2009, 03:15:21 pm
-
Hi to all:
Though not a complete noob, I restarted crunching just recently (I crunched for the old Seti project before).
I was wondering:
Would it be useful to have any SSE4a optimized apps for the AMD Opteron 2376+ line?
I do know there is an SSE4a for AMD, but does it enhance performance like the SSE4.1 and 4.2 used by Intel? ::)
And last question:
I'm crunching happily with my 2 quad-core Opteron 2376s (2.3 GHz) on SSE3 now though, for half the money of a Core i7 940 based system. ;) With a good mobo and 16 GB of RAM it works nicely. I use a CUDA video card (Nvidia 9600GT 1 GB) but have 8 cores:
Can I crunch two different applications at once: maybe Enhanced on 1 core and Astropulse v5 on the other 7? If so, can one lock this?
Greetings,
Ronald Zaneveld aka PaxV
I hope to get to 9k-10k points a day on average. Using 20 hrs per Astropulse v5.0 unit on 7 cores, I roughly make 7*24 hrs/20 hrs = 8.4 units a day, and 8.4*1250 pts = 10.5k points a day on average, with one core working on Leiden and Einstein. :o
Thanks to AK and all others for your inspiring work.
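The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the post: the function names are made up for illustration, and the 20 hours per unit and 1250 points per unit are the poster's own figures, not measured values.

```python
def units_per_day(cores, hours_per_unit):
    """Work units finished per day if each core completes one unit
    every hours_per_unit hours (assumes the cores run around the clock)."""
    return cores * 24 / hours_per_unit


def daily_credit(cores, hours_per_unit, credit_per_unit):
    """Average credit per day for the same setup."""
    return cores * 24 * credit_per_unit / hours_per_unit


if __name__ == "__main__":
    # Figures from the post: 7 cores, 20 hrs per unit, 1250 pts per unit.
    print(units_per_day(7, 20))        # 8.4
    print(daily_credit(7, 20, 1250))   # 10500.0
```

Changing the hours per unit shows how sensitive the daily figure is to run time.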
-
Would it be useful to have any SSE4a optimized apps for the AMD Opteron 2376+ line?
I do know there is an SSE4a for AMD, but does it enhance performance like the SSE4.1 and 4.2 used by Intel? ::)
Do you know someone who will build an SSE4a optimized app? There is no SSE4a support for now, so your question is a very theoretical one.
-
I used to be a GIS/Database programmer, but I wasn't working on something like this...
For me it was a long time ago. It has been 10 years since I had to quit programming because of epilepsy (caused mostly by CRT screens). With TFT it's better now (no more refresh), and I'm working with a system that's a bit more powerful than the late-90s machines with Pentium IIs and the first Athlons.
I wouldn't have a clue how to optimize the app. Reverse engineer AP first and then recode on my computer? AP is 32 bit and I'm running a 64 bit system: Vista 64 bit and Linux (Ubuntu 8.10 64 bit).
I am just wondering. If it WOULD be useful I could think about things...
Greetz
PaxV
-
Why reverse engineering?? The C++ source code for both AP and MB is available for download, in both stock and optimized versions.
So everyone could download it, read it and maybe notice some points for improvement.
-
...
Would it be useful to have any SSE4a optimized apps for the AMD Opteron 2376+ line?
I do know there is an SSE4a for AMD, but does it enhance performance like the SSE4.1 and 4.2 used by Intel?
...
The 6 instructions in SSE4a are listed in http://en.wikipedia.org/wiki/SSE4#SSE4a. Maybe 1 or 2 could possibly be useful, but not for any of the really intense computations done in either S@H Enhanced or Astropulse, so might at best provide a slight improvement.
The SSE3 instructions provide almost all the functionality needed for optimisation. Intel's SSSE3 and SSE4.1 provide some enhancement, but not a huge amount for our work. SSE4.2 seems unlikely to help at all.
I wouldn't have a clue how to optimize the app. Reverse engineer AP first and then recode on my computer? AP is 32 bit and I'm running a 64 bit system: Vista 64 bit and Linux (Ubuntu 8.10 64 bit).
...
Reverse engineering isn't needed; these applications are GPL, so source code for both stock and our optimized versions is available. There are certainly more optimization possibilities, and someone thinking specifically about 64 bit builds might find some opportunities to tune for that.
Joe
-
Thank you for your time and answer.
Lastly this question, a little off topic but still related to the former:
BOINC lists my processor as being SSE/SSE2, and
I know for a fact it's SSE/SSE2/SSE3/SSE4a
(2376 Opteron Shanghai).
Doesn't BOINC check that well, or is it AMD/Intel differences?
(SSSE3 and SSE4.1/SSE4.2 are Intel only)
-
BOINC only knows what the OS knows; XP only knows up to SSE2.
-
I run Windows Vista 64...
-
I run Windows Vista 64...
Vista is "clever" only in unneeded areas it seems. Big big crap in short ;) ;D
-
I run Windows Vista 64...
Then it may be reporting SSE3 as pni (Prescott New Instructions).
Joe
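To illustrate the naming point: on Linux the kernel's /proc/cpuinfo lists SSE3 under the flag name pni, so a tool that only looks for the string "sse3" would miss it. The sketch below maps those flag names to the usual SSE level names. The function name, the mapping table and the sample flags line are all made up for illustration, and whether BOINC on Vista presents its feature list the same way is an assumption.

```python
# Linux /proc/cpuinfo flag names mapped to the usual SSE level names.
# Note that SSE3 shows up as "pni" (Prescott New Instructions).
FLAG_NAMES = {
    "sse": "SSE",
    "sse2": "SSE2",
    "pni": "SSE3",
    "ssse3": "SSSE3",
    "sse4_1": "SSE4.1",
    "sse4_2": "SSE4.2",
    "sse4a": "SSE4a",
}


def sse_levels(cpuinfo_text):
    """Return the SSE levels named in the first 'flags' line of the text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags = set(value.split())
            return [name for flag, name in FLAG_NAMES.items() if flag in flags]
    return []


if __name__ == "__main__":
    # Hypothetical flags line, not copied from a real Opteron 2376.
    sample = "flags\t\t: fpu mmx sse sse2 pni sse4a"
    print(sse_levels(sample))
    try:
        with open("/proc/cpuinfo") as f:  # only exists on Linux
            print(sse_levels(f.read()))
    except OSError:
        pass
```

On the Ubuntu install mentioned earlier in the thread, running this should show what the kernel itself reports, independent of what BOINC displays.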
-
I'd suggest a 'current generation' OS has little direct use for detecting or implementing anything beyond SSE2, given that the primary cacheability and streaming instructions useful in drivers etc. are SSE2 and below. SSE3 unaligned loads are great, but since the OS must support SSE2-limited chips anyway, that instruction becomes redundant: the datasets would already be carefully aligned in things like drivers and runtime libraries.
For the bulk of SSE3 and above, the instructions become more application-targeted and vendor-specific, which suits things like codecs and compression software (we love horizontal math too!) rather than OS functions, so there isn't really a need for detection, implementation or support at the OS level. Really, the same probably applies to BOINC too, since it's just a coarser-grained version of the same thing :o ... Only better :P
-
So noted. Thank U all 4 Ur time