ASM of compiled source of certain functions - by the Intel Compiler
Simon:
Using the complete sse2_v_chirpdata function, analyzeFuncs.cpp compiles fine for me.
So next up is a quick test run vs. my own SSE2-optimized build without this edit :)
Simon.
<edit>Seems I posted too soon: it didn't finish linking. It needs some more work before it produces a valid executable.</edit>
BenHer:
Uhh...Simon
If you grabbed sse2_v_chirpdata from sse2_opt.cpp, then you've gotten Evandro Menezes' version. He did join the sourceforge project and was an authorized submitter, so those were his latest versions.
My latest version was the sse_ v_chirpdata version (faster than his if I recall).
I just read his version now; it doesn't include Tetsuji's sin/cos tables or any caching, so it will probably be slower.
Simon:
Lol :)
Oops...I was wondering where the caching was, too...
So anyway, it helps being less tired than I was when I tried it.
Will try again with the file you pointed out.
Simon.
<edit>It's still a bit tough to integrate your function as it uses different variable types and a different number of arguments. Enhanced by default uses this:
--- Code: ---
extern int v_ChirpData(
    sah_complex * cx_DataArray,
    sah_complex * cx_ChirpDataArray,
    int ChirpRateInd,
    double ChirpRate,
    int ul_NumDataPoints,
    double sample_rate
);
--- End code ---
Yours uses this:
--- Code: ---
extern int v_ChirpData(
    float * fp_DataArray,
    float * fp_ChirpDataArray,
    float f_ChirpRate,
    int ul_NumDataPoints,
    double sample_rate
);
--- End code ---
That mismatch is giving me all sorts of trouble with incompatible-argument errors. So for now, I'm going to put it in the "to do" drawer unless you want to jump in and incorporate it yourself (or maybe someone with more C++ skills than me will).</edit>
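One way the two signatures could be bridged is a thin wrapper that keeps the six-argument interface and forwards to the five-argument float routine. This is only a rough sketch under assumptions: it assumes sah_complex is a pair of floats (real, imaginary), so an array of N sah_complex values is laid out like 2*N floats; v_ChirpData_float is a hypothetical name for the float-based routine, and its body here is a placeholder copy loop, not the real chirp code.

```cpp
#include <cassert>

// ASSUMPTION: sah_complex is two floats (real, imaginary). Check the real
// typedef in the client headers before relying on this layout.
typedef float sah_complex[2];

// HYPOTHETICAL stand-in for the five-argument float routine. The body only
// copies input to output so the wrapper can be exercised; it does NOT chirp.
int v_ChirpData_float(float* fp_DataArray, float* fp_ChirpDataArray,
                      float f_ChirpRate, int ul_NumDataPoints,
                      double sample_rate) {
    (void)f_ChirpRate;
    (void)sample_rate;
    for (int i = 0; i < 2 * ul_NumDataPoints; ++i) {
        fp_ChirpDataArray[i] = fp_DataArray[i];
    }
    return 0;
}

// Wrapper exposing the six-argument signature quoted above.
int v_ChirpData(sah_complex* cx_DataArray, sah_complex* cx_ChirpDataArray,
                int ChirpRateInd, double ChirpRate,
                int ul_NumDataPoints, double sample_rate) {
    assert(ul_NumDataPoints >= 0);
    (void)ChirpRateInd;  // the float routine has no use for the index
    return v_ChirpData_float(&cx_DataArray[0][0],        // view as floats
                             &cx_ChirpDataArray[0][0],
                             static_cast<float>(ChirpRate),  // narrows!
                             ul_NumDataPoints, sample_rate);
}
```

Note that narrowing ChirpRate from double to float loses precision, and dropping ChirpRateInd means nothing can be cached per chirp index, so results and speed would both need to be verified against the existing build.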
BenHer:
To incorporate the cache features into my code would take a little work...will check it out.
To verify it I would, of course, have to do all those things I mentioned in an earlier post ;)
korpela:
Hi Ben,
Sorry to be replying to an old thread. Just getting around to looking at this stuff. Somehow I missed your checkin of the vectorization stuff at sourceforge. I thought I was on the mailing list for checkins. Apparently not....
Looks like you and Alex have been busy. Don't know if you've seen the more recent versions of the source (in the client/vector directory) that check the speeds of at least some functions and use the fastest. Right now it just tests GetPowerSpectrum, ChirpData, Transpose, and BaselineSmooth. (BaselineSmooth should be removed since it really only gets called once.)
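For anyone who hasn't looked at that code, the general idea of timing several versions and keeping the fastest can be sketched roughly as below. This is not the code in client/vector, just a minimal illustration under assumptions: PowerFn, power_plain, power_unrolled and choose_fastest are hypothetical names, and the two candidates are toy routines that compute re*re + im*im for interleaved complex samples.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// HYPOTHETICAL common signature shared by every candidate routine.
typedef void (*PowerFn)(const float* data, float* power, int n);

// Candidate 1: straightforward loop.
void power_plain(const float* data, float* power, int n) {
    for (int i = 0; i < n; ++i) {
        float re = data[2 * i], im = data[2 * i + 1];
        power[i] = re * re + im * im;
    }
}

// Candidate 2: same result, two samples per iteration plus a tail loop.
void power_unrolled(const float* data, float* power, int n) {
    int i = 0;
    for (; i + 1 < n; i += 2) {
        power[i] = data[2 * i] * data[2 * i] + data[2 * i + 1] * data[2 * i + 1];
        power[i + 1] = data[2 * i + 2] * data[2 * i + 2] + data[2 * i + 3] * data[2 * i + 3];
    }
    for (; i < n; ++i) {
        power[i] = data[2 * i] * data[2 * i] + data[2 * i + 1] * data[2 * i + 1];
    }
}

// Time each candidate on the same scratch data and return the quickest one.
PowerFn choose_fastest(const std::vector<PowerFn>& candidates, int n, int repeats) {
    assert(!candidates.empty());
    std::vector<float> data(static_cast<std::size_t>(2 * n), 1.0f);
    std::vector<float> power(static_cast<std::size_t>(n));
    PowerFn best = candidates[0];
    std::chrono::steady_clock::duration best_time =
        std::chrono::steady_clock::duration::max();
    for (PowerFn fn : candidates) {
        std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
        for (int r = 0; r < repeats; ++r) {
            fn(data.data(), power.data(), n);
        }
        std::chrono::steady_clock::duration elapsed =
            std::chrono::steady_clock::now() - start;
        if (elapsed < best_time) {  // keep the quickest seen so far
            best_time = elapsed;
            best = fn;
        }
    }
    return best;
}
```

The client would call choose_fastest once at startup and store the returned pointer. Because the candidates run on scratch buffers, a routine that only gets called once per workunit gains little from being in the test list.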
I'd like to extend this to more functions (gaussfit, pulse_find), but the problem is that those functions might generate output while being tested. We'd need to modify them either to suppress the output or to compartmentalize them so the tested routines don't include the output. At any rate, if you have any routines you want added in the new format, please do so (you can use analyzeFuncs_sse.cpp and analyzeFuncs_altivec.cpp as guides).
I'm also adding functions hostinfo_have_altivec(), hostinfo_have_sse(), etc. to the boinc api. Unfortunately, as always, I'm swamped with other work. If there are other threads that I should be looking at, let me know.
Eric