AVX Optimized App Development
arkayn:
i3-2120
=========================================================
Ftst_v7_J46_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.004987 0.00000 test
fpu_ChirpData 0.012602 0.00000 test
fpu_opt_ChirpData 0.004939 0.00000 test
sse1_ChirpData_ak 0.007443 0.00000 test
sse1_ChirpData_ak8e 0.005782 0.00000 test
sse1_ChirpData_ak8h 0.006308 0.00000 test
sse2_ChirpData_ak 0.006596 0.00000 test
sse2_ChirpData_ak8 0.004223 0.00000 test
sse3_ChirpData_ak 0.007248 0.00000 test
sse3_ChirpData_ak8 0.004744 0.00000 test
avx_ChirpData_a 0.002522 0.00000 test
avx_ChirpData_b 0.002197 0.00000 test
avx_ChirpData_c 0.002229 0.00000 test
avx_ChirpData_d 0.001941 0.00000 test
avx_ChirpData_e 0.001927 0.00000 test
avx_ChirpData_f 0.002687 0.00000 test
avx_ChirpData_e 0.001927 0.00000 choice
Second run
v_ChirpData 0.004898 0.00000 test
fpu_ChirpData 0.012865 0.00000 test
fpu_opt_ChirpData 0.004803 0.00000 test
sse1_ChirpData_ak 0.007609 0.00000 test
sse1_ChirpData_ak8e 0.006971 0.00000 test
sse1_ChirpData_ak8h 0.006140 0.00000 test
sse2_ChirpData_ak 0.011364 0.00000 test
sse2_ChirpData_ak8 0.004304 0.00000 test
sse3_ChirpData_ak 0.006403 0.00000 test
sse3_ChirpData_ak8 0.004099 0.00000 test
avx_ChirpData_a 0.002169 0.00000 test
avx_ChirpData_b 0.002218 0.00000 test
avx_ChirpData_c 0.002841 0.00000 test
avx_ChirpData_d 0.002072 0.00000 test
avx_ChirpData_e 0.002096 0.00000 test
avx_ChirpData_f 0.002106 0.00000 test
avx_ChirpData_d 0.002072 0.00000 choice
Third run
v_ChirpData 0.005091 0.00000 test
fpu_ChirpData 0.012386 0.00000 test
fpu_opt_ChirpData 0.005903 0.00000 test
sse1_ChirpData_ak 0.007593 0.00000 test
sse1_ChirpData_ak8e 0.006529 0.00000 test
sse1_ChirpData_ak8h 0.006921 0.00000 test
sse2_ChirpData_ak 0.007636 0.00000 test
sse2_ChirpData_ak8 0.004701 0.00000 test
sse3_ChirpData_ak 0.008300 0.00000 test
sse3_ChirpData_ak8 0.004363 0.00000 test
avx_ChirpData_a 0.002189 0.00000 test
avx_ChirpData_b 0.002560 0.00000 test
avx_ChirpData_c 0.002874 0.00000 test
avx_ChirpData_d 0.001963 0.00000 test
avx_ChirpData_e 0.002519 0.00000 test
avx_ChirpData_f 0.002402 0.00000 test
avx_ChirpData_d 0.001963 0.00000 choice
Test duration 9.38 seconds
Ftst_v7 completed successfully.
KarVi:
FX-8150 @ 4.5G
=========================================================
Ftst_v7_J46_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.007729 0.00000 test
fpu_ChirpData 0.014011 0.00000 test
fpu_opt_ChirpData 0.007529 0.00000 test
sse1_ChirpData_ak 0.007005 0.00000 test
sse1_ChirpData_ak8e 0.005742 0.00000 test
sse1_ChirpData_ak8h 0.006084 0.00000 test
sse2_ChirpData_ak 0.006046 0.00000 test
sse2_ChirpData_ak8 0.003682 0.00000 test
sse3_ChirpData_ak 0.005676 0.00000 test
sse3_ChirpData_ak8 0.003735 0.00000 test
avx_ChirpData_a 0.003086 0.00000 test
avx_ChirpData_b 0.003053 0.00000 test
avx_ChirpData_c 0.003305 0.00000 test
avx_ChirpData_d 0.003230 0.00000 test
avx_ChirpData_e 0.003276 0.00000 test
avx_ChirpData_f 0.002984 0.00000 test
avx_ChirpData_f 0.002984 0.00000 choice
Second run
v_ChirpData 0.007668 0.00000 test
fpu_ChirpData 0.014010 0.00000 test
fpu_opt_ChirpData 0.007530 0.00000 test
sse1_ChirpData_ak 0.007012 0.00000 test
sse1_ChirpData_ak8e 0.005742 0.00000 test
sse1_ChirpData_ak8h 0.006120 0.00000 test
sse2_ChirpData_ak 0.006099 0.00000 test
sse2_ChirpData_ak8 0.003680 0.00000 test
sse3_ChirpData_ak 0.005676 0.00000 test
sse3_ChirpData_ak8 0.003733 0.00000 test
avx_ChirpData_a 0.003084 0.00000 test
avx_ChirpData_b 0.003054 0.00000 test
avx_ChirpData_c 0.003298 0.00000 test
avx_ChirpData_d 0.003237 0.00000 test
avx_ChirpData_e 0.003160 0.00000 test
avx_ChirpData_f 0.002985 0.00000 test
avx_ChirpData_f 0.002985 0.00000 choice
Third run
v_ChirpData 0.007691 0.00000 test
fpu_ChirpData 0.014007 0.00000 test
fpu_opt_ChirpData 0.007550 0.00000 test
sse1_ChirpData_ak 0.007008 0.00000 test
sse1_ChirpData_ak8e 0.005766 0.00000 test
sse1_ChirpData_ak8h 0.006121 0.00000 test
sse2_ChirpData_ak 0.006102 0.00000 test
sse2_ChirpData_ak8 0.003683 0.00000 test
sse3_ChirpData_ak 0.005611 0.00000 test
sse3_ChirpData_ak8 0.003735 0.00000 test
avx_ChirpData_a 0.003099 0.00000 test
avx_ChirpData_b 0.003056 0.00000 test
avx_ChirpData_c 0.003307 0.00000 test
avx_ChirpData_d 0.003235 0.00000 test
avx_ChirpData_e 0.003145 0.00000 test
avx_ChirpData_f 0.002989 0.00000 test
avx_ChirpData_f 0.002989 0.00000 choice
Test duration 7.65 seconds
Ftst_v7 completed successfully.
Claggy:
i7-2600K @4.7GHz (Boinc running):
=========================================================
Ftst_v7_J46_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.009035 0.00000 test
fpu_ChirpData 0.015938 0.00000 test
fpu_opt_ChirpData 0.008348 0.00000 test
sse1_ChirpData_ak 0.008082 0.00000 test
sse1_ChirpData_ak8e 0.006617 0.00000 test
sse1_ChirpData_ak8h 0.006743 0.00000 test
sse2_ChirpData_ak 0.007306 0.00000 test
sse2_ChirpData_ak8 0.005201 0.00000 test
sse3_ChirpData_ak 0.007197 0.00000 test
sse3_ChirpData_ak8 0.004754 0.00000 test
avx_ChirpData_a 0.003005 0.00000 test
avx_ChirpData_b 0.002989 0.00000 test
avx_ChirpData_c 0.002858 0.00000 test
avx_ChirpData_d 0.002757 0.00000 test
avx_ChirpData_e 0.002860 0.00000 test
avx_ChirpData_f 0.003022 0.00000 test
avx_ChirpData_d 0.002757 0.00000 choice
Second run
v_ChirpData 0.009295 0.00000 test
fpu_ChirpData 0.016223 0.00000 test
fpu_opt_ChirpData 0.009218 0.00000 test
sse1_ChirpData_ak 0.008023 0.00000 test
sse1_ChirpData_ak8e 0.006668 0.00000 test
sse1_ChirpData_ak8h 0.006970 0.00000 test
sse2_ChirpData_ak 0.007304 0.00000 test
sse2_ChirpData_ak8 0.007223 0.00000 test
sse3_ChirpData_ak 0.006876 0.00000 test
sse3_ChirpData_ak8 0.005352 0.00000 test
avx_ChirpData_a 0.002983 0.00000 test
avx_ChirpData_b 0.002851 0.00000 test
avx_ChirpData_c 0.002793 0.00000 test
avx_ChirpData_d 0.002774 0.00000 test
avx_ChirpData_e 0.002800 0.00000 test
avx_ChirpData_f 0.003027 0.00000 test
avx_ChirpData_d 0.002774 0.00000 choice
Third run
v_ChirpData 0.010111 0.00000 test
fpu_ChirpData 0.015180 0.00000 test
fpu_opt_ChirpData 0.007804 0.00000 test
sse1_ChirpData_ak 0.007392 0.00000 test
sse1_ChirpData_ak8e 0.005719 0.00000 test
sse1_ChirpData_ak8h 0.006324 0.00000 test
sse2_ChirpData_ak 0.006736 0.00000 test
sse2_ChirpData_ak8 0.004659 0.00000 test
sse3_ChirpData_ak 0.006483 0.00000 test
sse3_ChirpData_ak8 0.004693 0.00000 test
avx_ChirpData_a 0.002670 0.00000 test
avx_ChirpData_b 0.002547 0.00000 test
avx_ChirpData_c 0.002927 0.00000 test
avx_ChirpData_d 0.002863 0.00000 test
avx_ChirpData_e 0.002530 0.00000 test
avx_ChirpData_f 0.002889 0.00000 test
avx_ChirpData_e 0.002530 0.00000 choice
Test duration 8.81 seconds
Ftst_v7 completed successfully.
=========================================================
i7-2600K @4.7GHz (Boinc computing suspended):
=========================================================
Ftst_v7_J46_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.003774 0.00000 test
fpu_ChirpData 0.008655 0.00000 test
fpu_opt_ChirpData 0.003673 0.00000 test
sse1_ChirpData_ak 0.005044 0.00000 test
sse1_ChirpData_ak8e 0.004149 0.00000 test
sse1_ChirpData_ak8h 0.004295 0.00000 test
sse2_ChirpData_ak 0.004718 0.00000 test
sse2_ChirpData_ak8 0.003110 0.00000 test
sse3_ChirpData_ak 0.004573 0.00000 test
sse3_ChirpData_ak8 0.003070 0.00000 test
avx_ChirpData_a 0.001458 0.00000 test
avx_ChirpData_b 0.001635 0.00000 test
avx_ChirpData_c 0.001466 0.00000 test
avx_ChirpData_d 0.001359 0.00000 test
avx_ChirpData_e 0.001523 0.00000 test
avx_ChirpData_f 0.001567 0.00000 test
avx_ChirpData_d 0.001359 0.00000 choice
Second run
v_ChirpData 0.003752 0.00000 test
fpu_ChirpData 0.008567 0.00000 test
fpu_opt_ChirpData 0.003682 0.00000 test
sse1_ChirpData_ak 0.005043 0.00000 test
sse1_ChirpData_ak8e 0.004156 0.00000 test
sse1_ChirpData_ak8h 0.004301 0.00000 test
sse2_ChirpData_ak 0.004715 0.00000 test
sse2_ChirpData_ak8 0.003105 0.00000 test
sse3_ChirpData_ak 0.004566 0.00000 test
sse3_ChirpData_ak8 0.003084 0.00000 test
avx_ChirpData_a 0.001450 0.00000 test
avx_ChirpData_b 0.001618 0.00000 test
avx_ChirpData_c 0.001463 0.00000 test
avx_ChirpData_d 0.001364 0.00000 test
avx_ChirpData_e 0.001512 0.00000 test
avx_ChirpData_f 0.001567 0.00000 test
avx_ChirpData_d 0.001364 0.00000 choice
Third run
v_ChirpData 0.003780 0.00000 test
fpu_ChirpData 0.008574 0.00000 test
fpu_opt_ChirpData 0.003678 0.00000 test
sse1_ChirpData_ak 0.005039 0.00000 test
sse1_ChirpData_ak8e 0.004149 0.00000 test
sse1_ChirpData_ak8h 0.004303 0.00000 test
sse2_ChirpData_ak 0.004717 0.00000 test
sse2_ChirpData_ak8 0.003103 0.00000 test
sse3_ChirpData_ak 0.004552 0.00000 test
sse3_ChirpData_ak8 0.003074 0.00000 test
avx_ChirpData_a 0.001457 0.00000 test
avx_ChirpData_b 0.001623 0.00000 test
avx_ChirpData_c 0.001465 0.00000 test
avx_ChirpData_d 0.001358 0.00000 test
avx_ChirpData_e 0.001517 0.00000 test
avx_ChirpData_f 0.001568 0.00000 test
avx_ChirpData_d 0.001358 0.00000 choice
Test duration 6.21 seconds
Ftst_v7 completed successfully.
Claggy
Mike:
FX-8150 @ 4.4 GHz
Vista 64
Ftst_v7_J46_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.013777 0.00000 test
fpu_ChirpData 0.019779 0.00000 test
fpu_opt_ChirpData 0.013983 0.00000 test
sse1_ChirpData_ak 0.008577 0.00000 test
sse1_ChirpData_ak8e 0.008094 0.00000 test
sse1_ChirpData_ak8h 0.008219 0.00000 test
sse2_ChirpData_ak 0.007457 0.00000 test
sse2_ChirpData_ak8 0.005295 0.00000 test
sse3_ChirpData_ak 0.007173 0.00000 test
sse3_ChirpData_ak8 0.005586 0.00000 test
avx_ChirpData_a not supported on CPU
avx_ChirpData_b not supported on CPU
avx_ChirpData_c not supported on CPU
avx_ChirpData_d not supported on CPU
avx_ChirpData_e not supported on CPU
avx_ChirpData_f not supported on CPU
sse2_ChirpData_ak8 0.005295 0.00000 choice
Second run
v_ChirpData 0.014368 0.00000 test
fpu_ChirpData 0.020133 0.00000 test
fpu_opt_ChirpData 0.014774 0.00000 test
sse1_ChirpData_ak 0.009611 0.00000 test
sse1_ChirpData_ak8e 0.008270 0.00000 test
sse1_ChirpData_ak8h 0.008133 0.00000 test
sse2_ChirpData_ak 0.007581 0.00000 test
sse2_ChirpData_ak8 0.005770 0.00000 test
sse3_ChirpData_ak 0.007714 0.00000 test
sse3_ChirpData_ak8 0.005994 0.00000 test
avx_ChirpData_a not supported on CPU
avx_ChirpData_b not supported on CPU
avx_ChirpData_c not supported on CPU
avx_ChirpData_d not supported on CPU
avx_ChirpData_e not supported on CPU
avx_ChirpData_f not supported on CPU
sse2_ChirpData_ak8 0.005770 0.00000 choice
Third run
v_ChirpData 0.015319 0.00000 test
fpu_ChirpData 0.019830 0.00000 test
fpu_opt_ChirpData 0.014914 0.00000 test
sse1_ChirpData_ak 0.008400 0.00000 test
sse1_ChirpData_ak8e 0.008292 0.00000 test
sse1_ChirpData_ak8h 0.008653 0.00000 test
sse2_ChirpData_ak 0.008362 0.00000 test
sse2_ChirpData_ak8 0.006021 0.00000 test
sse3_ChirpData_ak 0.007947 0.00000 test
sse3_ChirpData_ak8 0.006008 0.00000 test
avx_ChirpData_a not supported on CPU
avx_ChirpData_b not supported on CPU
avx_ChirpData_c not supported on CPU
avx_ChirpData_d not supported on CPU
avx_ChirpData_e not supported on CPU
avx_ChirpData_f not supported on CPU
sse3_ChirpData_ak8 0.006008 0.00000 choice
Test duration 7.70 seconds
Ftst_v7 completed successfully.
Windows 8 CP
=========================================================
Ftst_v7_J46_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.009532 0.00000 test
fpu_ChirpData 0.014579 0.00000 test
fpu_opt_ChirpData 0.010973 0.00000 test
sse1_ChirpData_ak 0.007270 0.00000 test
sse1_ChirpData_ak8e 0.005956 0.00000 test
sse1_ChirpData_ak8h 0.006514 0.00000 test
sse2_ChirpData_ak 0.006356 0.00000 test
sse2_ChirpData_ak8 0.003823 0.00000 test
sse3_ChirpData_ak 0.005874 0.00000 test
sse3_ChirpData_ak8 0.003933 0.00000 test
avx_ChirpData_a 0.003266 0.00000 test
avx_ChirpData_b 0.003259 0.00000 test
avx_ChirpData_c 0.003489 0.00000 test
avx_ChirpData_d 0.003396 0.00000 test
avx_ChirpData_e 0.003249 0.00000 test
avx_ChirpData_f 0.003150 0.00000 test
avx_ChirpData_f 0.003150 0.00000 choice
Second run
v_ChirpData 0.010624 0.00000 test
fpu_ChirpData 0.014424 0.00000 test
fpu_opt_ChirpData 0.010814 0.00000 test
sse1_ChirpData_ak 0.007306 0.00000 test
sse1_ChirpData_ak8e 0.006028 0.00000 test
sse1_ChirpData_ak8h 0.006386 0.00000 test
sse2_ChirpData_ak 0.006246 0.00000 test
sse2_ChirpData_ak8 0.003924 0.00000 test
sse3_ChirpData_ak 0.005858 0.00000 test
sse3_ChirpData_ak8 0.003856 0.00000 test
avx_ChirpData_a 0.003388 0.00000 test
avx_ChirpData_b 0.003372 0.00000 test
avx_ChirpData_c 0.003444 0.00000 test
avx_ChirpData_d 0.003420 0.00000 test
avx_ChirpData_e 0.003288 0.00000 test
avx_ChirpData_f 0.003243 0.00000 test
avx_ChirpData_f 0.003243 0.00000 choice
Third run
v_ChirpData 0.010755 0.00000 test
fpu_ChirpData 0.014522 0.00000 test
fpu_opt_ChirpData 0.010650 0.00000 test
sse1_ChirpData_ak 0.007303 0.00000 test
sse1_ChirpData_ak8e 0.005991 0.00000 test
sse1_ChirpData_ak8h 0.006305 0.00000 test
sse2_ChirpData_ak 0.006294 0.00000 test
sse2_ChirpData_ak8 0.003958 0.00000 test
sse3_ChirpData_ak 0.005834 0.00000 test
sse3_ChirpData_ak8 0.003853 0.00000 test
avx_ChirpData_a 0.003438 0.00000 test
avx_ChirpData_b 0.003351 0.00000 test
avx_ChirpData_c 0.003511 0.00000 test
avx_ChirpData_d 0.003449 0.00000 test
avx_ChirpData_e 0.003353 0.00000 test
avx_ChirpData_f 0.003294 0.00000 test
avx_ChirpData_f 0.003294 0.00000 choice
Test duration 8.25 seconds
Ftst_v7 completed successfully.
Josef W. Segur:
Hmm, my hopes for a single best variant for both Bulldozer and Sandy Bridge are fading.
I'm still hoping to improve things further for Bulldozer. The attached J47 test has several subvariants of f with the prefetch distance varied from 2 to 6 cache lines (it was 4 in J46). Possibly one will get the input data into L1 at just the right time; at the least, there may be some observable differences.
Edit: Attachment removed, see later post for current chirp only test.
Joe
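For anyone unfamiliar with what "prefetch distance" means here, a rough sketch in C follows. It is not the actual avx_ChirpData_f code: the function name scale_with_prefetch, the simple scaling kernel, and the 64-byte line size are all assumptions made for illustration, and it uses the GCC/Clang builtin __builtin_prefetch rather than whatever the real variants use. The idea is only that each pass over one cache line of input also asks for the line a fixed number of lines further ahead.

```c
#include <stddef.h>

/* Assumed cache line size in bytes (illustrative constant). */
#define CACHE_LINE_BYTES 64

/*
 * Hypothetical stand-in kernel, NOT the real chirp routine:
 * writes src[i] * factor into dst[i] for n floats, and while working
 * on one cache line of input it prefetches the line that lies
 * `distance_lines` cache lines ahead.
 */
static void scale_with_prefetch(const float *src, float *dst, size_t n,
                                float factor, size_t distance_lines)
{
    const size_t floats_per_line = CACHE_LINE_BYTES / sizeof(float);
    const size_t ahead = distance_lines * floats_per_line;

    for (size_t i = 0; i < n; i += floats_per_line) {
        /* Only prefetch addresses that are still inside the input. */
        if (i + ahead < n)
            __builtin_prefetch(&src[i + ahead], 0, 3); /* read, keep in all cache levels */

        /* Process one cache line of input (or the shorter tail). */
        size_t end = (i + floats_per_line < n) ? i + floats_per_line : n;
        for (size_t j = i; j < end; ++j)
            dst[j] = src[j] * factor;
    }
}
```

Calling this with distance_lines set to 2, 3, 4, 5 and 6 in turn gives the same results each time; only the timing can differ, which is what a test varying the distance would be measuring. Too short a distance and the data may not have arrived when it is needed, too long and it may already have been evicted.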