AVX Optimized App Development
PatrickV2:
I will have an Ivy Bridge i7-3770 based PC (Z77 mainboard) available for a (short) while in the near future.
Is there any interest in running this benchmark/test-tool on that config?
Regards, Patrick.
Josef W. Segur:
Yes, please! The broader the range of systems tested the better. If you'd run both the J45 test attached to message 37870 and whatever the latest Chirponly version is at that time, it would be a help.
Joe
Josef W. Segur:
The FMA4 "a" variant produced about a 5% speedup by cutting the number of floating-point instructions in the inner loop by ~11%. That's good, but it confirms that data transfer, rather than arithmetic, is what still needs improvement. For J51 I'm trying the TLB priming again, this time without block prefetching. The "i" variant for AVX is modified from the "h" variant, and the same changes were merged into the "b" variant for AVX+FMA4.
Edit: Attachment removed, see later post for current version
Joe
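For readers wondering where the instruction savings come from, here is a rough sketch. It assumes the inner loop contains a complex multiply of each sample by a rotation factor. That is an assumption about the chirp code, not something taken from the actual source. The function names are made up for illustration, and the `fma` helper is only a stand-in: Python rounds twice here, whereas a hardware FMA rounds once.

```python
def cmul_separate(a, b, c, d):
    """(a + bi) * (c + di) using separate multiplies and adds.

    Returns (real, imag, op_count): 4 multiplies + 2 add/sub = 6 operations.
    """
    re = a * c - b * d
    im = a * d + b * c
    return re, im, 6


def fma(x, y, z):
    # Stand-in for a hardware fused multiply-add (x*y + z in one instruction).
    # Hypothetical helper: real FMA rounds once, this emulation rounds twice.
    return x * y + z


def cmul_fused(a, b, c, d):
    """Same product with two of the multiply/add pairs fused.

    Returns (real, imag, op_count): 2 multiplies + 2 fused ops = 4 operations.
    The negation is assumed free, since FMA instruction sets provide
    subtract/negate forms.
    """
    re = fma(a, c, -(b * d))
    im = fma(a, d, b * c)
    return re, im, 4


if __name__ == "__main__":
    print(cmul_separate(1.0, 2.0, 3.0, 4.0))
    print(cmul_fused(1.0, 2.0, 3.0, 4.0))
```

The saving for this one step in isolation is larger than the ~11% quoted above, which would be consistent with the rest of the inner loop containing work that cannot be fused.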
arkayn:
FX-4100
BOINC running on 460
=========================================================
Ftst_v7_J51_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.009009 0.00000 test
fpu_ChirpData 0.018200 0.00000 test
fpu_opt_ChirpData 0.008752 0.00000 test
sse1_ChirpData_ak8e 0.007503 0.00000 test
sse2_ChirpData_ak8 0.004782 0.00000 test
sse3_ChirpData_ak8 0.004801 0.00000 test
avx_ChirpData_a 0.003903 0.00000 test
avx_ChirpData_b 0.003902 0.00000 test
avx_ChirpData_c 0.004298 0.00000 test
avx_ChirpData_d 0.004136 0.00000 test
avx_ChirpData_e 0.003988 0.00000 test
avx_ChirpData_f 0.003865 0.00000 test
avx_ChirpData_g 0.003858 0.00000 test
avx_ChirpData_h 0.004558 0.00000 test
avx_ChirpData_i 0.004006 0.00000 test
avx_fma4_ChirpData_a 0.003524 0.00000 test
avx_fma4_ChirpData_b 0.060127 0.50095 test
avx_fma4_ChirpData_a 0.003524 0.00000 choice
Second run
v_ChirpData 0.009023 0.00000 test
fpu_ChirpData 0.018034 0.00000 test
fpu_opt_ChirpData 0.008862 0.00000 test
sse1_ChirpData_ak8e 0.007292 0.00000 test
sse2_ChirpData_ak8 0.004615 0.00000 test
sse3_ChirpData_ak8 0.004532 0.00000 test
avx_ChirpData_a 0.003917 0.00000 test
avx_ChirpData_b 0.003865 0.00000 test
avx_ChirpData_c 0.004167 0.00000 test
avx_ChirpData_d 0.004040 0.00000 test
avx_ChirpData_e 0.004026 0.00000 test
avx_ChirpData_f 0.003821 0.00000 test
avx_ChirpData_g 0.003666 0.00000 test
avx_ChirpData_h 0.004601 0.00000 test
avx_ChirpData_i 0.003980 0.00000 test
avx_fma4_ChirpData_a 0.003389 0.00000 test
avx_fma4_ChirpData_b 0.058483 0.50095 test
avx_fma4_ChirpData_a 0.003389 0.00000 choice
Third run
v_ChirpData 0.008824 0.00000 test
fpu_ChirpData 0.017494 0.00000 test
fpu_opt_ChirpData 0.008599 0.00000 test
sse1_ChirpData_ak8e 0.007149 0.00000 test
sse2_ChirpData_ak8 0.004593 0.00000 test
sse3_ChirpData_ak8 0.004453 0.00000 test
avx_ChirpData_a 0.003842 0.00000 test
avx_ChirpData_b 0.003825 0.00000 test
avx_ChirpData_c 0.004122 0.00000 test
avx_ChirpData_d 0.004023 0.00000 test
avx_ChirpData_e 0.003950 0.00000 test
avx_ChirpData_f 0.003855 0.00000 test
avx_ChirpData_g 0.003928 0.00000 test
avx_ChirpData_h 0.004565 0.00000 test
avx_ChirpData_i 0.004058 0.00000 test
avx_fma4_ChirpData_a 0.003531 0.00000 test
avx_fma4_ChirpData_b 0.059600 0.50095 test
avx_fma4_ChirpData_a 0.003531 0.00000 choice
Test duration 11.53 seconds
Ftst_v7 completed successfully.
i3-2120
BOINC running on 560
=========================================================
Ftst_v7_J51_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.004599 0.00000 test
fpu_ChirpData 0.012435 0.00000 test
fpu_opt_ChirpData 0.004366 0.00000 test
sse1_ChirpData_ak8e 0.006014 0.00000 test
sse2_ChirpData_ak8 0.004207 0.00000 test
sse3_ChirpData_ak8 0.004177 0.00000 test
avx_ChirpData_a 0.002153 0.00000 test
avx_ChirpData_b 0.002141 0.00000 test
avx_ChirpData_c 0.002217 0.00000 test
avx_ChirpData_d 0.002032 0.00000 test
avx_ChirpData_e 0.002002 0.00000 test
avx_ChirpData_f 0.002125 0.00000 test
avx_ChirpData_g 0.002081 0.00000 test
avx_ChirpData_h 0.002745 0.00000 test
avx_ChirpData_i 0.002329 0.00000 test
avx_fma4_ChirpData_a not supported by system
avx_fma4_ChirpData_b not supported by system
avx_ChirpData_e 0.002002 0.00000 choice
Second run
v_ChirpData 0.004888 0.00000 test
fpu_ChirpData 0.012563 0.00000 test
fpu_opt_ChirpData 0.004551 0.00000 test
sse1_ChirpData_ak8e 0.005902 0.00000 test
sse2_ChirpData_ak8 0.004339 0.00000 test
sse3_ChirpData_ak8 0.004017 0.00000 test
avx_ChirpData_a 0.002142 0.00000 test
avx_ChirpData_b 0.002153 0.00000 test
avx_ChirpData_c 0.002186 0.00000 test
avx_ChirpData_d 0.002007 0.00000 test
avx_ChirpData_e 0.001946 0.00000 test
avx_ChirpData_f 0.002063 0.00000 test
avx_ChirpData_g 0.002174 0.00000 test
avx_ChirpData_h 0.002790 0.00000 test
avx_ChirpData_i 0.002347 0.00000 test
avx_fma4_ChirpData_a not supported by system
avx_fma4_ChirpData_b not supported by system
avx_ChirpData_e 0.001946 0.00000 choice
Third run
v_ChirpData 0.004868 0.00000 test
fpu_ChirpData 0.012536 0.00000 test
fpu_opt_ChirpData 0.004565 0.00000 test
sse1_ChirpData_ak8e 0.005728 0.00000 test
sse2_ChirpData_ak8 0.004225 0.00000 test
sse3_ChirpData_ak8 0.004123 0.00000 test
avx_ChirpData_a 0.002121 0.00000 test
avx_ChirpData_b 0.002155 0.00000 test
avx_ChirpData_c 0.002184 0.00000 test
avx_ChirpData_d 0.002048 0.00000 test
avx_ChirpData_e 0.002039 0.00000 test
avx_ChirpData_f 0.002137 0.00000 test
avx_ChirpData_g 0.002188 0.00000 test
avx_ChirpData_h 0.002760 0.00000 test
avx_ChirpData_i 0.002335 0.00000 test
avx_fma4_ChirpData_a not supported by system
avx_fma4_ChirpData_b not supported by system
avx_ChirpData_e 0.002039 0.00000 choice
Test duration 8.08 seconds
Ftst_v7 completed successfully.
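In every run posted so far, the "choice" line is the variant with the lowest timing among those reporting zero error, which is why avx_fma4_ChirpData_b (error 0.50095) never wins. A minimal sketch of that rule follows. The real selection logic in Ftst_v7 may differ, and `pick_choice` and its `max_error` threshold are hypothetical names used only for illustration.

```python
def pick_choice(log_lines, max_error=0.0):
    """Return (name, timing) of the fastest variant whose error is acceptable.

    Only lines of the form "<name> <timing> <error> test" are considered;
    "not supported by system" lines and headers are skipped.
    """
    best = None
    for line in log_lines:
        parts = line.split()
        if len(parts) != 4 or parts[3] != "test":
            continue
        name, timing, error = parts[0], float(parts[1]), float(parts[2])
        if error > max_error:
            continue  # wrong results disqualify a variant, however fast
        if best is None or timing < best[1]:
            best = (name, timing)
    return best


# A few lines copied from the first FX-4100 run above.
sample = [
    "sse3_ChirpData_ak8 0.004801 0.00000 test",
    "avx_ChirpData_g 0.003858 0.00000 test",
    "avx_fma4_ChirpData_a 0.003524 0.00000 test",
    "avx_fma4_ChirpData_b 0.060127 0.50095 test",
]

if __name__ == "__main__":
    print(pick_choice(sample))
```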
KarVi:
FX8150@4.5
=========================================================
Ftst_v7_J51_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.007185 0.00000 test
fpu_ChirpData 0.013632 0.00000 test
fpu_opt_ChirpData 0.007034 0.00000 test
sse1_ChirpData_ak8e 0.005612 0.00000 test
sse2_ChirpData_ak8 0.003647 0.00000 test
sse3_ChirpData_ak8 0.003540 0.00000 test
avx_ChirpData_a 0.003041 0.00000 test
avx_ChirpData_b 0.002989 0.00000 test
avx_ChirpData_c 0.003246 0.00000 test
avx_ChirpData_d 0.003197 0.00000 test
avx_ChirpData_e 0.003099 0.00000 test
avx_ChirpData_f 0.002934 0.00000 test
avx_ChirpData_g 0.003017 0.00000 test
avx_ChirpData_h 0.003557 0.00000 test
avx_ChirpData_i 0.003103 0.00000 test
avx_fma4_ChirpData_a 0.002702 0.00000 test
avx_fma4_ChirpData_b 0.046052 0.50095 test
avx_fma4_ChirpData_a 0.002702 0.00000 choice
Second run
v_ChirpData 0.007192 0.00000 test
fpu_ChirpData 0.013633 0.00000 test
fpu_opt_ChirpData 0.007051 0.00000 test
sse1_ChirpData_ak8e 0.005563 0.00000 test
sse2_ChirpData_ak8 0.003670 0.00000 test
sse3_ChirpData_ak8 0.003536 0.00000 test
avx_ChirpData_a 0.003046 0.00000 test
avx_ChirpData_b 0.002989 0.00000 test
avx_ChirpData_c 0.003246 0.00000 test
avx_ChirpData_d 0.003171 0.00000 test
avx_ChirpData_e 0.003096 0.00000 test
avx_ChirpData_f 0.002938 0.00000 test
avx_ChirpData_g 0.002863 0.00000 test
avx_ChirpData_h 0.003540 0.00000 test
avx_ChirpData_i 0.003107 0.00000 test
avx_fma4_ChirpData_a 0.002688 0.00000 test
avx_fma4_ChirpData_b 0.045858 0.50095 test
avx_fma4_ChirpData_a 0.002688 0.00000 choice
Third run
v_ChirpData 0.007145 0.00000 test
fpu_ChirpData 0.013630 0.00000 test
fpu_opt_ChirpData 0.007010 0.00000 test
sse1_ChirpData_ak8e 0.005560 0.00000 test
sse2_ChirpData_ak8 0.003667 0.00000 test
sse3_ChirpData_ak8 0.003536 0.00000 test
avx_ChirpData_a 0.003046 0.00000 test
avx_ChirpData_b 0.003003 0.00000 test
avx_ChirpData_c 0.003244 0.00000 test
avx_ChirpData_d 0.003178 0.00000 test
avx_ChirpData_e 0.003092 0.00000 test
avx_ChirpData_f 0.002933 0.00000 test
avx_ChirpData_g 0.002958 0.00000 test
avx_ChirpData_h 0.003534 0.00000 test
avx_ChirpData_i 0.003106 0.00000 test
avx_fma4_ChirpData_a 0.002691 0.00000 test
avx_fma4_ChirpData_b 0.045893 0.50095 test
avx_fma4_ChirpData_a 0.002691 0.00000 choice
Test duration 8.88 seconds
Ftst_v7 completed successfully.
Nice improvement from Fused Multiply-Add :)
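To put a number on that, the sketch below compares the best plain AVX timing in each of the three FX8150 runs above against avx_fma4_ChirpData_a. The timings are copied from the log. The helper name `percent_faster` is just for illustration, and three short runs are a small sample, so treat the result as a rough indication only.

```python
def percent_faster(t_old, t_new):
    """How much faster t_new is than t_old, as a percentage of throughput."""
    return (t_old / t_new - 1.0) * 100.0


# (best plain AVX timing, avx_fma4_ChirpData_a timing) per run, from the log above.
runs = [
    (0.002934, 0.002702),  # first run, best AVX was variant f
    (0.002863, 0.002688),  # second run, best AVX was variant g
    (0.002933, 0.002691),  # third run, best AVX was variant f
]

gains = [percent_faster(avx, fma4) for avx, fma4 in runs]

if __name__ == "__main__":
    for i, g in enumerate(gains, start=1):
        print(f"run {i}: FMA4 is {g:.1f}% faster than the best AVX variant")
```

On these numbers the gain lands somewhere between roughly 6.5% and 9%, depending on the run.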