double chirpInvariant;
time = (1/sample_rate);
chirpInvariant = time * time * M_PI*2*chirp_rate;
/* Loop invariance calculation:
   time = j/sample_rate;   // Original equation
   ang = M_PI*2*chirp_rate*time*time;
   --------------
   one = M_PI*2*chirp_rate;
   ang = one * (j/sample_rate)*(j/sample_rate);
   --------------
   one = M_PI*2*chirp_rate;
   ang = one * j * (1/sample_rate) * j * (1/sample_rate);
   --------------
   one = M_PI*2*chirp_rate * (1/sample_rate) * (1/sample_rate);
   ang = one * j*j;
*/
I am posting two sections of v_chirp: one is part of my SSE2-vectorized version, and the other is the current enhanced version (copied from your downloads section). One thing they still haven't corrected: a good deal of loop-invariant math should be hoisted out of the main loop. I haven't looked closely enough at the assembly example to determine whether the Intel C++ compiler hoisted it or not.
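To make the hoisting concrete, here is a minimal sketch of the two forms side by side. It is not the actual v_chirp code: the function names chirp_angles_naive and chirp_angles_hoisted, the output array, and the parameter types are assumptions for illustration only. Only the angle calculation from the comment above is shown.

```c
#include <assert.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Hypothetical helper: the original form. The division by sample_rate
   and the multiply by 2*pi*chirp_rate are repeated on every iteration. */
static void chirp_angles_naive(double *ang, int n,
                               double chirp_rate, double sample_rate)
{
    assert(sample_rate > 0.0);
    for (int j = 0; j < n; j++) {
        double time = j / sample_rate;
        ang[j] = M_PI * 2 * chirp_rate * time * time;
    }
}

/* Hypothetical helper: the hoisted form. Everything that does not depend
   on j is folded into chirpInvariant once, before the loop starts. */
static void chirp_angles_hoisted(double *ang, int n,
                                 double chirp_rate, double sample_rate)
{
    assert(sample_rate > 0.0);
    double time = 1.0 / sample_rate;   /* 1.0 avoids integer division */
    double chirpInvariant = time * time * M_PI * 2 * chirp_rate;
    for (int j = 0; j < n; j++) {
        double dj = (double)j;         /* avoids int overflow in j*j */
        ang[j] = chirpInvariant * dj * dj;
    }
}
```

The two versions should agree to within floating-point rounding, not necessarily bit for bit, because the order of operations changes. That is also why a compiler will generally not do this rearrangement by itself unless relaxed floating-point options are turned on, so doing it by hand is the safer bet.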
Question: What speedup (as a percentage) does your code get on a P4 without HT? Back then (21 months ago) I was getting about 55%.