Hi there,
I did spend some time looking at what would be involved in making a refined i7-targeted version. The cache/memory/bus structure is the biggest change, for sure; it appears to be architecturally more oriented toward tightly multithreaded applications, as opposed to the multiple-instance static application we have now. Development was gradually moving in that direction anyhow, but some of the tools and techniques to achieve this are not entirely mature yet, so it will take some time. At this stage I don't believe a simple recompile with the SSE4.2 instructions enabled would be of real benefit while the system architecture changes aren't fully understood, but you never know. More advanced hand optimisations will come later, when the chips are more widespread (like when I have one, for example), and the nuances of what is a very different platform are better understood and yield to experimentation.
So, in short, probably a while. Until then, the SSE4.1 & SSSE3x builds should perform very well, and I suspect the memory access patterns in the SSE4.1 build might have a slight advantage on the new architecture, though the full details aren't clear in the literature yet.
Jason