AVX Optimized App Development
Mike:
FX 8150 @4.4 GHZ
Win 8 CP
Ftst_v7_J48_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.008454 0.00000 test
fpu_ChirpData 0.014305 0.00000 test
fpu_opt_ChirpData 0.009795 0.00000 test
sse1_ChirpData_ak8e 0.005828 0.00000 test
sse2_ChirpData_ak8 0.003836 0.00000 test
sse3_ChirpData_ak8 0.003835 0.00000 test
avx_ChirpData_a 0.003266 0.00000 test
avx_ChirpData_b 0.003247 0.00000 test
avx_ChirpData_c 0.003435 0.00000 test
avx_ChirpData_d 0.003360 0.00000 test
avx_ChirpData_e 0.003214 0.00000 test
avx_ChirpData_f2 0.003205 0.00000 test
avx_ChirpData_f3 0.003171 0.00000 test
avx_ChirpData_f4 0.003129 0.00000 test
avx_ChirpData_f5 0.003122 0.00000 test
avx_ChirpData_f6 0.003112 0.00000 test
avx_ChirpData_fn 0.003147 0.00000 test
avx_ChirpData_f6 0.003112 0.00000 choice
Second run
v_ChirpData 0.009746 0.00000 test
fpu_ChirpData 0.014194 0.00000 test
fpu_opt_ChirpData 0.009803 0.00000 test
sse1_ChirpData_ak8e 0.005828 0.00000 test
sse2_ChirpData_ak8 0.003910 0.00000 test
sse3_ChirpData_ak8 0.003845 0.00000 test
avx_ChirpData_a 0.003409 0.00000 test
avx_ChirpData_b 0.003379 0.00000 test
avx_ChirpData_c 0.003502 0.00000 test
avx_ChirpData_d 0.003457 0.00000 test
avx_ChirpData_e 0.003293 0.00000 test
avx_ChirpData_f2 0.003418 0.00000 test
avx_ChirpData_f3 0.003352 0.00000 test
avx_ChirpData_f4 0.003270 0.00000 test
avx_ChirpData_f5 0.003236 0.00000 test
avx_ChirpData_f6 0.003192 0.00000 test
avx_ChirpData_fn 0.003393 0.00000 test
avx_ChirpData_f6 0.003192 0.00000 choice
Third run
v_ChirpData 0.009865 0.00000 test
fpu_ChirpData 0.014297 0.00000 test
fpu_opt_ChirpData 0.009830 0.00000 test
sse1_ChirpData_ak8e 0.005848 0.00000 test
sse2_ChirpData_ak8 0.003917 0.00000 test
sse3_ChirpData_ak8 0.003848 0.00000 test
avx_ChirpData_a 0.003417 0.00000 test
avx_ChirpData_b 0.003311 0.00000 test
avx_ChirpData_c 0.003476 0.00000 test
avx_ChirpData_d 0.003420 0.00000 test
avx_ChirpData_e 0.003312 0.00000 test
avx_ChirpData_f2 0.003422 0.00000 test
avx_ChirpData_f3 0.003325 0.00000 test
avx_ChirpData_f4 0.003279 0.00000 test
avx_ChirpData_f5 0.003280 0.00000 test
avx_ChirpData_f6 0.003194 0.00000 test
avx_ChirpData_fn 0.003371 0.00000 test
avx_ChirpData_f6 0.003194 0.00000 choice
Test duration 8.23 seconds
Ftst_v7 completed successfully.
Claggy:
i7-2600K @4.7GHz (Boinc running):
=========================================================
Ftst_v7_J48_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.008760 0.00000 test
fpu_ChirpData 0.015042 0.00000 test
fpu_opt_ChirpData 0.008035 0.00000 test
sse1_ChirpData_ak8e 0.006087 0.00000 test
sse2_ChirpData_ak8 0.005136 0.00000 test
sse3_ChirpData_ak8 0.004889 0.00000 test
avx_ChirpData_a 0.002961 0.00000 test
avx_ChirpData_b 0.002496 0.00000 test
avx_ChirpData_c 0.002585 0.00000 test
avx_ChirpData_d 0.002622 0.00000 test
avx_ChirpData_e 0.002608 0.00000 test
avx_ChirpData_f2 0.002763 0.00000 test
avx_ChirpData_f3 0.002804 0.00000 test
avx_ChirpData_f4 0.002403 0.00000 test
avx_ChirpData_f5 0.002581 0.00000 test
avx_ChirpData_f6 0.002465 0.00000 test
avx_ChirpData_fn 0.002846 0.00000 test
avx_ChirpData_f4 0.002403 0.00000 choice
Second run
v_ChirpData 0.007861 0.00000 test
fpu_ChirpData 0.014469 0.00000 test
fpu_opt_ChirpData 0.006947 0.00000 test
sse1_ChirpData_ak8e 0.006156 0.00000 test
sse2_ChirpData_ak8 0.004994 0.00000 test
sse3_ChirpData_ak8 0.004901 0.00000 test
avx_ChirpData_a 0.002895 0.00000 test
avx_ChirpData_b 0.002575 0.00000 test
avx_ChirpData_c 0.002614 0.00000 test
avx_ChirpData_d 0.002759 0.00000 test
avx_ChirpData_e 0.002340 0.00000 test
avx_ChirpData_f2 0.002927 0.00000 test
avx_ChirpData_f3 0.002891 0.00000 test
avx_ChirpData_f4 0.002491 0.00000 test
avx_ChirpData_f5 0.002660 0.00000 test
avx_ChirpData_f6 0.002420 0.00000 test
avx_ChirpData_fn 0.003653 0.00000 test
avx_ChirpData_e 0.002340 0.00000 choice
Third run
v_ChirpData 0.008828 0.00000 test
fpu_ChirpData 0.015331 0.00000 test
fpu_opt_ChirpData 0.006832 0.00000 test
sse1_ChirpData_ak8e 0.006328 0.00000 test
sse2_ChirpData_ak8 0.004866 0.00000 test
sse3_ChirpData_ak8 0.004908 0.00000 test
avx_ChirpData_a 0.002686 0.00000 test
avx_ChirpData_b 0.002764 0.00000 test
avx_ChirpData_c 0.002528 0.00000 test
avx_ChirpData_d 0.002557 0.00000 test
avx_ChirpData_e 0.002444 0.00000 test
avx_ChirpData_f2 0.002746 0.00000 test
avx_ChirpData_f3 0.002616 0.00000 test
avx_ChirpData_f4 0.002821 0.00000 test
avx_ChirpData_f5 0.002443 0.00000 test
avx_ChirpData_f6 0.002607 0.00000 test
avx_ChirpData_fn 0.003108 0.00000 test
avx_ChirpData_f5 0.002443 0.00000 choice
Test duration 8.47 seconds
Ftst_v7 completed successfully.
=========================================================
i7-2600K @4.7GHz (Boinc computing suspended):
=========================================================
Ftst_v7_J48_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.003671 0.00000 test
fpu_ChirpData 0.008676 0.00000 test
fpu_opt_ChirpData 0.003585 0.00000 test
sse1_ChirpData_ak8e 0.004212 0.00000 test
sse2_ChirpData_ak8 0.003154 0.00000 test
sse3_ChirpData_ak8 0.003116 0.00000 test
avx_ChirpData_a 0.001474 0.00000 test
avx_ChirpData_b 0.001635 0.00000 test
avx_ChirpData_c 0.001500 0.00000 test
avx_ChirpData_d 0.001377 0.00000 test
avx_ChirpData_e 0.001522 0.00000 test
avx_ChirpData_f2 0.001630 0.00000 test
avx_ChirpData_f3 0.001588 0.00000 test
avx_ChirpData_f4 0.001571 0.00000 test
avx_ChirpData_f5 0.001567 0.00000 test
avx_ChirpData_f6 0.001563 0.00000 test
avx_ChirpData_fn 0.001727 0.00000 test
avx_ChirpData_d 0.001377 0.00000 choice
Second run
v_ChirpData 0.003673 0.00000 test
fpu_ChirpData 0.008625 0.00000 test
fpu_opt_ChirpData 0.003589 0.00000 test
sse1_ChirpData_ak8e 0.004206 0.00000 test
sse2_ChirpData_ak8 0.003152 0.00000 test
sse3_ChirpData_ak8 0.003113 0.00000 test
avx_ChirpData_a 0.001469 0.00000 test
avx_ChirpData_b 0.001639 0.00000 test
avx_ChirpData_c 0.001482 0.00000 test
avx_ChirpData_d 0.001376 0.00000 test
avx_ChirpData_e 0.001521 0.00000 test
avx_ChirpData_f2 0.001610 0.00000 test
avx_ChirpData_f3 0.001584 0.00000 test
avx_ChirpData_f4 0.001568 0.00000 test
avx_ChirpData_f5 0.001564 0.00000 test
avx_ChirpData_f6 0.001559 0.00000 test
avx_ChirpData_fn 0.001744 0.00000 test
avx_ChirpData_d 0.001376 0.00000 choice
Third run
v_ChirpData 0.003703 0.00000 test
fpu_ChirpData 0.008650 0.00000 test
fpu_opt_ChirpData 0.003566 0.00000 test
sse1_ChirpData_ak8e 0.004206 0.00000 test
sse2_ChirpData_ak8 0.003152 0.00000 test
sse3_ChirpData_ak8 0.003116 0.00000 test
avx_ChirpData_a 0.001470 0.00000 test
avx_ChirpData_b 0.001635 0.00000 test
avx_ChirpData_c 0.001483 0.00000 test
avx_ChirpData_d 0.001376 0.00000 test
avx_ChirpData_e 0.001520 0.00000 test
avx_ChirpData_f2 0.001614 0.00000 test
avx_ChirpData_f3 0.001585 0.00000 test
avx_ChirpData_f4 0.001571 0.00000 test
avx_ChirpData_f5 0.001566 0.00000 test
avx_ChirpData_f6 0.001567 0.00000 test
avx_ChirpData_fn 0.001727 0.00000 test
avx_ChirpData_d 0.001376 0.00000 choice
Test duration 6.29 seconds
Ftst_v7 completed successfully.
Claggy
KarVi:
FX8150@4.5G
Boinc suspended:
=========================================================
Ftst_v7_J48_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.007228 0.00000 test
fpu_ChirpData 0.013807 0.00000 test
fpu_opt_ChirpData 0.007158 0.00000 test
sse1_ChirpData_ak8e 0.005662 0.00000 test
sse2_ChirpData_ak8 0.003673 0.00000 test
sse3_ChirpData_ak8 0.003739 0.00000 test
avx_ChirpData_a 0.003099 0.00000 test
avx_ChirpData_b 0.003029 0.00000 test
avx_ChirpData_c 0.003297 0.00000 test
avx_ChirpData_d 0.003240 0.00000 test
avx_ChirpData_e 0.003132 0.00000 test
avx_ChirpData_f2 0.002997 0.00000 test
avx_ChirpData_f3 0.002977 0.00000 test
avx_ChirpData_f4 0.002973 0.00000 test
avx_ChirpData_f5 0.002961 0.00000 test
avx_ChirpData_f6 0.002952 0.00000 test
avx_ChirpData_fn 0.003013 0.00000 test
avx_ChirpData_f6 0.002952 0.00000 choice
Second run
v_ChirpData 0.007250 0.00000 test
fpu_ChirpData 0.013798 0.00000 test
fpu_opt_ChirpData 0.007152 0.00000 test
sse1_ChirpData_ak8e 0.005679 0.00000 test
sse2_ChirpData_ak8 0.003676 0.00000 test
sse3_ChirpData_ak8 0.003735 0.00000 test
avx_ChirpData_a 0.003104 0.00000 test
avx_ChirpData_b 0.003030 0.00000 test
avx_ChirpData_c 0.003297 0.00000 test
avx_ChirpData_d 0.003240 0.00000 test
avx_ChirpData_e 0.003131 0.00000 test
avx_ChirpData_f2 0.002998 0.00000 test
avx_ChirpData_f3 0.002988 0.00000 test
avx_ChirpData_f4 0.002974 0.00000 test
avx_ChirpData_f5 0.002964 0.00000 test
avx_ChirpData_f6 0.002956 0.00000 test
avx_ChirpData_fn 0.003188 0.00000 test
avx_ChirpData_f6 0.002956 0.00000 choice
Third run
v_ChirpData 0.007288 0.00000 test
fpu_ChirpData 0.013806 0.00000 test
fpu_opt_ChirpData 0.007163 0.00000 test
sse1_ChirpData_ak8e 0.005677 0.00000 test
sse2_ChirpData_ak8 0.003673 0.00000 test
sse3_ChirpData_ak8 0.003732 0.00000 test
avx_ChirpData_a 0.003099 0.00000 test
avx_ChirpData_b 0.003029 0.00000 test
avx_ChirpData_c 0.003300 0.00000 test
avx_ChirpData_d 0.003235 0.00000 test
avx_ChirpData_e 0.003128 0.00000 test
avx_ChirpData_f2 0.002994 0.00000 test
avx_ChirpData_f3 0.002987 0.00000 test
avx_ChirpData_f4 0.002975 0.00000 test
avx_ChirpData_f5 0.002966 0.00000 test
avx_ChirpData_f6 0.002952 0.00000 test
avx_ChirpData_fn 0.003193 0.00000 test
avx_ChirpData_f6 0.002952 0.00000 choice
Test duration 7.66 seconds
Ftst_v7 completed successfully.
Boinc running:
=========================================================
Ftst_v7_J48_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.011337 0.00000 test
fpu_ChirpData 0.018720 0.00000 test
fpu_opt_ChirpData 0.011250 0.00000 test
sse1_ChirpData_ak8e 0.007149 0.00000 test
sse2_ChirpData_ak8 0.004990 0.00000 test
sse3_ChirpData_ak8 0.004983 0.00000 test
avx_ChirpData_a 0.004490 0.00000 test
avx_ChirpData_b 0.004152 0.00000 test
avx_ChirpData_c 0.004441 0.00000 test
avx_ChirpData_d 0.004297 0.00000 test
avx_ChirpData_e 0.003984 0.00000 test
avx_ChirpData_f2 0.004162 0.00000 test
avx_ChirpData_f3 0.004241 0.00000 test
avx_ChirpData_f4 0.004012 0.00000 test
avx_ChirpData_f5 0.003996 0.00000 test
avx_ChirpData_f6 0.003970 0.00000 test
avx_ChirpData_fn 0.004261 0.00000 test
avx_ChirpData_f6 0.003970 0.00000 choice
Second run
v_ChirpData 0.011270 0.00000 test
fpu_ChirpData 0.018830 0.00000 test
fpu_opt_ChirpData 0.011104 0.00000 test
sse1_ChirpData_ak8e 0.007206 0.00000 test
sse2_ChirpData_ak8 0.005025 0.00000 test
sse3_ChirpData_ak8 0.004941 0.00000 test
avx_ChirpData_a 0.004436 0.00000 test
avx_ChirpData_b 0.004220 0.00000 test
avx_ChirpData_c 0.004531 0.00000 test
avx_ChirpData_d 0.004233 0.00000 test
avx_ChirpData_e 0.004130 0.00000 test
avx_ChirpData_f2 0.004156 0.00000 test
avx_ChirpData_f3 0.004129 0.00000 test
avx_ChirpData_f4 0.003999 0.00000 test
avx_ChirpData_f5 0.003965 0.00000 test
avx_ChirpData_f6 0.003952 0.00000 test
avx_ChirpData_fn 0.004213 0.00000 test
avx_ChirpData_f6 0.003952 0.00000 choice
Third run
v_ChirpData 0.011792 0.00000 test
fpu_ChirpData 0.018834 0.00000 test
fpu_opt_ChirpData 0.010799 0.00000 test
sse1_ChirpData_ak8e 0.007129 0.00000 test
sse2_ChirpData_ak8 0.004906 0.00000 test
sse3_ChirpData_ak8 0.004977 0.00000 test
avx_ChirpData_a 0.004436 0.00000 test
avx_ChirpData_b 0.004126 0.00000 test
avx_ChirpData_c 0.004484 0.00000 test
avx_ChirpData_d 0.004129 0.00000 test
avx_ChirpData_e 0.004036 0.00000 test
avx_ChirpData_f2 0.004025 0.00000 test
avx_ChirpData_f3 0.003961 0.00000 test
avx_ChirpData_f4 0.003982 0.00000 test
avx_ChirpData_f5 0.003951 0.00000 test
avx_ChirpData_f6 0.003995 0.00000 test
avx_ChirpData_fn 0.004298 0.00000 test
avx_ChirpData_f5 0.003951 0.00000 choice
Test duration 10.08 seconds
Ftst_v7 completed successfully.
Still prefers f6.
Josef W. Segur:
Well, it's clear that the software prefetching is doing some good, though it's not clear why the more distant prefetch works slightly better on the 8-core Bulldozers even when BOINC is active. Pinning those details down can wait for final tuning, though.
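For readers unfamiliar with the term, "prefetch distance" is how far ahead of the current read position the code asks the cache to start loading data. The following is only a minimal sketch of the idea, not the actual chirp code: the function name, the trivial scaling kernel, and the 16-float line step are hypothetical, and `__builtin_prefetch` assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chirp kernel: out[i] = in[i] * scale.
   prefetch_ahead is the prefetch distance in elements; 0 disables it.
   The result must not depend on the distance, only the timing may. */
void scale_with_prefetch(const float *in, float *out, size_t n,
                         float scale, size_t prefetch_ahead)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        /* Issue one hint per assumed 64-byte line (16 floats), and only
           while the prefetched address is still inside the buffer. */
        if (prefetch_ahead != 0 && (i % 16) == 0 && i + prefetch_ahead < n) {
            __builtin_prefetch(&in[i + prefetch_ahead], 0, 0); /* read, low reuse */
        }
        out[i] = in[i] * scale;
    }
}
```

Timing the same call with several values of `prefetch_ahead` is one way a "nearer" and a "more distant" variant could be compared; which distance wins depends on the CPU and on what else is competing for memory bandwidth.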
With AVX chirping times at ~80% of SSE3 chirping times on Bulldozer but ~50% on Sandy Bridge, I'm looking for something with larger effects. One faint possibility is that the way the input and output test buffers are allocated in J48 and earlier causes L1 cache thrashing. I don't think that's likely, but I'm attaching J48a. The allocations are revised, but the functions being tested are unchanged.
Edit: Attachment removed, see later posts for current test.
Joe
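As an illustration of the buffer-allocation concern raised above: if the input and output buffers start at addresses that differ by a multiple of the L1 "way size", element i of each lands in the same cache set. The sketch below shows one way to stagger them. It is not what J48a actually does (that code is not shown in this thread); the type and function names are hypothetical, and the geometry (64-byte lines, 64 sets) is an assumption to be checked against the target CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed cache geometry, for illustration only. */
#define LINE_BYTES 64u
#define NUM_SETS   64u
#define WAY_BYTES  (LINE_BYTES * NUM_SETS)   /* 4096 bytes */

/* Set index an address maps to under the assumed geometry. */
unsigned cache_set_index(const void *p)
{
    return (unsigned)(((uintptr_t)p / LINE_BYTES) % NUM_SETS);
}

typedef struct {
    void  *block;  /* pointer to pass to free() */
    float *in;     /* starts on a WAY_BYTES boundary (set 0) */
    float *out;    /* starts half a way later (set NUM_SETS / 2) */
} test_buffers;

/* Carve both buffers of n floats out of one malloc'd block so that
   in[i] and out[i] never share a set. Returns 0 on success, -1 on
   allocation failure. Overflow of n * sizeof(float) is not checked. */
int alloc_staggered(test_buffers *b, size_t n)
{
    size_t bytes   = n * sizeof(float);
    size_t in_span = (bytes + WAY_BYTES - 1) / WAY_BYTES * WAY_BYTES;

    /* Slack for alignment + input span + half-way offset + output. */
    b->block = malloc(WAY_BYTES + in_span + WAY_BYTES / 2 + bytes);
    if (b->block == NULL) {
        return -1;
    }
    uintptr_t base = ((uintptr_t)b->block + WAY_BYTES - 1)
                     / WAY_BYTES * WAY_BYTES;
    b->in  = (float *)base;
    b->out = (float *)(base + in_span + WAY_BYTES / 2);
    return 0;
}
```

Both pointers come out aligned to at least 32 bytes, so aligned AVX loads and stores would still be usable. With 4-way or 8-way associativity, two streams sharing sets would not normally evict each other, which fits the expectation that this effect is unlikely to matter much.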
Claggy:
i7-2600K @4.7GHz (Boinc running):
=========================================================
Ftst_v7_J48a_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.009743 0.00000 test
fpu_ChirpData 0.015173 0.00000 test
fpu_opt_ChirpData 0.009064 0.00000 test
sse1_ChirpData_ak8e 0.006152 0.00000 test
sse2_ChirpData_ak8 0.004908 0.00000 test
sse3_ChirpData_ak8 0.004838 0.00000 test
avx_ChirpData_a 0.002580 0.00000 test
avx_ChirpData_b 0.002608 0.00000 test
avx_ChirpData_c 0.002615 0.00000 test
avx_ChirpData_d 0.002482 0.00000 test
avx_ChirpData_e 0.002469 0.00000 test
avx_ChirpData_f2 0.002653 0.00000 test
avx_ChirpData_f3 0.002615 0.00000 test
avx_ChirpData_f4 0.002597 0.00000 test
avx_ChirpData_f5 0.002544 0.00000 test
avx_ChirpData_f6 0.002575 0.00000 test
avx_ChirpData_fn 0.002803 0.00000 test
avx_ChirpData_e 0.002469 0.00000 choice
Second run
v_ChirpData 0.008056 0.00000 test
fpu_ChirpData 0.015192 0.00000 test
fpu_opt_ChirpData 0.008239 0.00000 test
sse1_ChirpData_ak8e 0.006109 0.00000 test
sse2_ChirpData_ak8 0.004860 0.00000 test
sse3_ChirpData_ak8 0.004905 0.00000 test
avx_ChirpData_a 0.002646 0.00000 test
avx_ChirpData_b 0.002649 0.00000 test
avx_ChirpData_c 0.002640 0.00000 test
avx_ChirpData_d 0.002515 0.00000 test
avx_ChirpData_e 0.002556 0.00000 test
avx_ChirpData_f2 0.002736 0.00000 test
avx_ChirpData_f3 0.002701 0.00000 test
avx_ChirpData_f4 0.002618 0.00000 test
avx_ChirpData_f5 0.002599 0.00000 test
avx_ChirpData_f6 0.002577 0.00000 test
avx_ChirpData_fn 0.002919 0.00000 test
avx_ChirpData_d 0.002515 0.00000 choice
Third run
v_ChirpData 0.008521 0.00000 test
fpu_ChirpData 0.015196 0.00000 test
fpu_opt_ChirpData 0.008329 0.00000 test
sse1_ChirpData_ak8e 0.006129 0.00000 test
sse2_ChirpData_ak8 0.004800 0.00000 test
sse3_ChirpData_ak8 0.004910 0.00000 test
avx_ChirpData_a 0.002695 0.00000 test
avx_ChirpData_b 0.002715 0.00000 test
avx_ChirpData_c 0.002653 0.00000 test
avx_ChirpData_d 0.002489 0.00000 test
avx_ChirpData_e 0.002523 0.00000 test
avx_ChirpData_f2 0.002678 0.00000 test
avx_ChirpData_f3 0.002662 0.00000 test
avx_ChirpData_f4 0.002604 0.00000 test
avx_ChirpData_f5 0.002609 0.00000 test
avx_ChirpData_f6 0.002576 0.00000 test
avx_ChirpData_fn 0.002865 0.00000 test
avx_ChirpData_d 0.002489 0.00000 choice
Test duration 8.53 seconds
Ftst_v7 completed successfully.
=========================================================
i7-2600K @4.7GHz (Boinc computing suspended):
=========================================================
Ftst_v7_J48a_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.003665 0.00000 test
fpu_ChirpData 0.008651 0.00000 test
fpu_opt_ChirpData 0.003569 0.00000 test
sse1_ChirpData_ak8e 0.004219 0.00000 test
sse2_ChirpData_ak8 0.003156 0.00000 test
sse3_ChirpData_ak8 0.003119 0.00000 test
avx_ChirpData_a 0.001476 0.00000 test
avx_ChirpData_b 0.001635 0.00000 test
avx_ChirpData_c 0.001493 0.00000 test
avx_ChirpData_d 0.001380 0.00000 test
avx_ChirpData_e 0.001483 0.00000 test
avx_ChirpData_f2 0.001612 0.00000 test
avx_ChirpData_f3 0.001585 0.00000 test
avx_ChirpData_f4 0.001571 0.00000 test
avx_ChirpData_f5 0.001567 0.00000 test
avx_ChirpData_f6 0.001559 0.00000 test
avx_ChirpData_fn 0.001726 0.00000 test
avx_ChirpData_d 0.001380 0.00000 choice
Second run
v_ChirpData 0.003662 0.00000 test
fpu_ChirpData 0.008645 0.00000 test
fpu_opt_ChirpData 0.003563 0.00000 test
sse1_ChirpData_ak8e 0.004212 0.00000 test
sse2_ChirpData_ak8 0.003151 0.00000 test
sse3_ChirpData_ak8 0.003118 0.00000 test
avx_ChirpData_a 0.001474 0.00000 test
avx_ChirpData_b 0.001648 0.00000 test
avx_ChirpData_c 0.001484 0.00000 test
avx_ChirpData_d 0.001373 0.00000 test
avx_ChirpData_e 0.001520 0.00000 test
avx_ChirpData_f2 0.001608 0.00000 test
avx_ChirpData_f3 0.001584 0.00000 test
avx_ChirpData_f4 0.001588 0.00000 test
avx_ChirpData_f5 0.001568 0.00000 test
avx_ChirpData_f6 0.001567 0.00000 test
avx_ChirpData_fn 0.001727 0.00000 test
avx_ChirpData_d 0.001373 0.00000 choice
Third run
v_ChirpData 0.003672 0.00000 test
fpu_ChirpData 0.008651 0.00000 test
fpu_opt_ChirpData 0.003566 0.00000 test
sse1_ChirpData_ak8e 0.004210 0.00000 test
sse2_ChirpData_ak8 0.003155 0.00000 test
sse3_ChirpData_ak8 0.003115 0.00000 test
avx_ChirpData_a 0.001472 0.00000 test
avx_ChirpData_b 0.001632 0.00000 test
avx_ChirpData_c 0.001483 0.00000 test
avx_ChirpData_d 0.001375 0.00000 test
avx_ChirpData_e 0.001524 0.00000 test
avx_ChirpData_f2 0.001610 0.00000 test
avx_ChirpData_f3 0.001585 0.00000 test
avx_ChirpData_f4 0.001570 0.00000 test
avx_ChirpData_f5 0.001569 0.00000 test
avx_ChirpData_f6 0.001564 0.00000 test
avx_ChirpData_fn 0.001726 0.00000 test
avx_ChirpData_d 0.001375 0.00000 choice
Test duration 6.29 seconds
Ftst_v7 completed successfully.
Claggy