AVX Optimized App Development
KarVi:
FX-8150 @ 4.5
=========================================================
Ftst_v7_J53_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.007298 0.00000 test
fpu_ChirpData 0.013822 0.00000 test
fpu_opt_ChirpData 0.007152 0.00000 test
sse1_ChirpData_ak8e 0.005673 0.00000 test
sse2_ChirpData_ak8 0.003697 0.00000 test
sse3_ChirpData_ak8 0.003599 0.00000 test
avx_ChirpData_a 0.003088 0.00000 test
avx_ChirpData_b 0.003041 0.00000 test
avx_ChirpData_c 0.003297 0.00000 test
avx_ChirpData_d 0.003219 0.00000 test
avx_ChirpData_e 0.003142 0.00000 test
avx_ChirpData_f 0.002986 0.00000 test
avx_ChirpData_g 0.002915 0.00000 test
avx_ChirpData_h 0.003588 0.00000 test
avx_ChirpData_i 0.003172 0.00000 test
avx_fma4_ChirpData_a 0.002730 0.00000 test
avx_fma4_ChirpData_d4 0.002747 0.00000 test
avx_fma4_ChirpData_d6 0.002718 0.00000 test
avx_fma4_ChirpData_d8 0.002710 0.00000 test
avx_fma4_ChirpData_d8 0.002710 0.00000 choice
Second run
v_ChirpData 0.007239 0.00000 test
fpu_ChirpData 0.013803 0.00000 test
fpu_opt_ChirpData 0.007090 0.00000 test
sse1_ChirpData_ak8e 0.005697 0.00000 test
sse2_ChirpData_ak8 0.003676 0.00000 test
sse3_ChirpData_ak8 0.003612 0.00000 test
avx_ChirpData_a 0.003079 0.00000 test
avx_ChirpData_b 0.003044 0.00000 test
avx_ChirpData_c 0.003298 0.00000 test
avx_ChirpData_d 0.003225 0.00000 test
avx_ChirpData_e 0.003142 0.00000 test
avx_ChirpData_f 0.002985 0.00000 test
avx_ChirpData_g 0.002933 0.00000 test
avx_ChirpData_h 0.003579 0.00000 test
avx_ChirpData_i 0.003169 0.00000 test
avx_fma4_ChirpData_a 0.002730 0.00000 test
avx_fma4_ChirpData_d4 0.002754 0.00000 test
avx_fma4_ChirpData_d6 0.002718 0.00000 test
avx_fma4_ChirpData_d8 0.002713 0.00000 test
avx_fma4_ChirpData_d8 0.002713 0.00000 choice
Third run
v_ChirpData 0.007309 0.00000 test
fpu_ChirpData 0.013824 0.00000 test
fpu_opt_ChirpData 0.007157 0.00000 test
sse1_ChirpData_ak8e 0.005677 0.00000 test
sse2_ChirpData_ak8 0.003688 0.00000 test
sse3_ChirpData_ak8 0.003599 0.00000 test
avx_ChirpData_a 0.003090 0.00000 test
avx_ChirpData_b 0.003043 0.00000 test
avx_ChirpData_c 0.003296 0.00000 test
avx_ChirpData_d 0.003227 0.00000 test
avx_ChirpData_e 0.003148 0.00000 test
avx_ChirpData_f 0.002990 0.00000 test
avx_ChirpData_g 0.002925 0.00000 test
avx_ChirpData_h 0.003579 0.00000 test
avx_ChirpData_i 0.003172 0.00000 test
avx_fma4_ChirpData_a 0.002746 0.00000 test
avx_fma4_ChirpData_d4 0.002750 0.00000 test
avx_fma4_ChirpData_d6 0.002717 0.00000 test
avx_fma4_ChirpData_d8 0.002710 0.00000 test
avx_fma4_ChirpData_d8 0.002710 0.00000 choice
Test duration 8.41 seconds
Ftst_v7 completed successfully.
Josef W. Segur:
That difference in how the FX-4100 and FX-8150 react to prefetch distance is still fascinating.
For J54 I've modified the test framework again: each test now also shows the minimum time taken by one iteration (the mintime column). That will give some indication of how much variance there is.
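A minimal sketch of this kind of measurement, written in Python purely for illustration (the test framework's own code is not shown in this thread). The name `time_iterations` and the toy workload are assumptions; the sketch times each iteration separately and reports both the average and the minimum.

```python
import time

def time_iterations(func, iterations=20):
    """Run func repeatedly; return (average, minimum) seconds per iteration."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return sum(samples) / len(samples), min(samples)

def toy_workload():
    # Hypothetical stand-in for a ChirpData variant; any callable works here.
    return sum(i * i for i in range(10_000))

average, minimum = time_iterations(toy_workload)
# A large gap between the two values suggests noisy timings
# (interrupts, frequency changes, other processes).
print(f"timing {average:.6f}  mintime= {minimum:.6f}")
```

The minimum is the least disturbed sample, so comparing it with the average gives a rough feel for how much outside interference a test picked up.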
I've also added an e variant for avx_fma4 which explores doing 128-bit operations rather than 256-bit ones like all the other AVX tests. I expect it to be faster than the existing SSE3 test, both because it uses FMA4 and because with AVX enabled there are 3-operand forms of the instructions. With old-style SSE3, an operation like a = b + c actually had to copy b to a and then add c; the 3-operand form does it in a single instruction. I doubt the e variant will challenge the 256-bit versions, but it's possible. An AMD engineer chose to have the GCC autovectorizer produce 128-bit AVX and FMA4 code for Bulldozer v1 because that outperformed 256-bit code on some of the SPEC benchmarks.
Joe
arkayn:
FX-4100 @ 3.6
First test: BOINC running on the GTX460
=========================================================
Ftst_v7_J54_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.009313 0.00000 test mintime= 0.004975
fpu_ChirpData 0.017663 0.00000 test mintime= 0.017547
fpu_opt_ChirpData 0.009177 0.00000 test mintime= 0.004658
sse1_ChirpData_ak8e 0.007268 0.00000 test mintime= 0.007189
sse2_ChirpData_ak8 0.004597 0.00000 test mintime= 0.004549
sse3_ChirpData_ak8 0.004520 0.00000 test mintime= 0.004419
avx_ChirpData_a 0.003807 0.00000 test mintime= 0.003769
avx_ChirpData_b 0.003873 0.00000 test mintime= 0.003792
avx_ChirpData_c 0.004119 0.00000 test mintime= 0.004081
avx_ChirpData_d 0.004026 0.00000 test mintime= 0.003959
avx_ChirpData_e 0.003916 0.00000 test mintime= 0.003878
avx_ChirpData_f 0.003722 0.00000 test mintime= 0.003698
avx_ChirpData_g 0.003716 0.00000 test mintime= 0.003637
avx_ChirpData_h 0.004431 0.00000 test mintime= 0.004382
avx_ChirpData_i 0.003890 0.00000 test mintime= 0.003846
avx_fma4_ChirpData_a 0.003380 0.00000 test mintime= 0.003322
avx_fma4_ChirpData_d4 0.003431 0.00000 test mintime= 0.003379
avx_fma4_ChirpData_d6 0.003521 0.00000 test mintime= 0.003345
avx_fma4_ChirpData_d8 0.003383 0.00000 test mintime= 0.003338
avx_fma4_ChirpData_e 0.003917 0.00000 test mintime= 0.003905
avx_fma4_ChirpData_a 0.003380 0.00000 choice
Second run
v_ChirpData 0.009529 0.00000 test mintime= 0.004951
fpu_ChirpData 0.017635 0.00000 test mintime= 0.017457
fpu_opt_ChirpData 0.009079 0.00000 test mintime= 0.004666
sse1_ChirpData_ak8e 0.007233 0.00000 test mintime= 0.007192
sse2_ChirpData_ak8 0.004588 0.00000 test mintime= 0.004541
sse3_ChirpData_ak8 0.004432 0.00000 test mintime= 0.004417
avx_ChirpData_a 0.003823 0.00000 test mintime= 0.003739
avx_ChirpData_b 0.003827 0.00000 test mintime= 0.003784
avx_ChirpData_c 0.004122 0.00000 test mintime= 0.004076
avx_ChirpData_d 0.004002 0.00000 test mintime= 0.003958
avx_ChirpData_e 0.003933 0.00000 test mintime= 0.003886
avx_ChirpData_f 0.003716 0.00000 test mintime= 0.003666
avx_ChirpData_g 0.003687 0.00000 test mintime= 0.003615
avx_ChirpData_h 0.004483 0.00000 test mintime= 0.004378
avx_ChirpData_i 0.003910 0.00000 test mintime= 0.003850
avx_fma4_ChirpData_a 0.003392 0.00000 test mintime= 0.003324
avx_fma4_ChirpData_d4 0.003453 0.00000 test mintime= 0.003392
avx_fma4_ChirpData_d6 0.003533 0.00000 test mintime= 0.003487
avx_fma4_ChirpData_d8 0.003477 0.00000 test mintime= 0.003394
avx_fma4_ChirpData_e 0.003999 0.00000 test mintime= 0.003937
avx_fma4_ChirpData_a 0.003392 0.00000 choice
Third run
v_ChirpData 0.009590 0.00000 test mintime= 0.005087
fpu_ChirpData 0.018358 0.00000 test mintime= 0.017907
fpu_opt_ChirpData 0.009407 0.00000 test mintime= 0.004685
sse1_ChirpData_ak8e 0.007488 0.00000 test mintime= 0.007304
sse2_ChirpData_ak8 0.004673 0.00000 test mintime= 0.004614
sse3_ChirpData_ak8 0.004549 0.00000 test mintime= 0.004473
avx_ChirpData_a 0.004010 0.00000 test mintime= 0.003766
avx_ChirpData_b 0.003849 0.00000 test mintime= 0.003803
avx_ChirpData_c 0.004126 0.00000 test mintime= 0.004085
avx_ChirpData_d 0.004000 0.00000 test mintime= 0.003981
avx_ChirpData_e 0.003917 0.00000 test mintime= 0.003881
avx_ChirpData_f 0.003818 0.00000 test mintime= 0.003664
avx_ChirpData_g 0.003710 0.00000 test mintime= 0.003597
avx_ChirpData_h 0.004417 0.00000 test mintime= 0.004379
avx_ChirpData_i 0.003895 0.00000 test mintime= 0.003867
avx_fma4_ChirpData_a 0.003405 0.00000 test mintime= 0.003341
avx_fma4_ChirpData_d4 0.003448 0.00000 test mintime= 0.003356
avx_fma4_ChirpData_d6 0.003464 0.00000 test mintime= 0.003389
avx_fma4_ChirpData_d8 0.003538 0.00000 test mintime= 0.003346
avx_fma4_ChirpData_e 0.003965 0.00000 test mintime= 0.003922
avx_fma4_ChirpData_a 0.003405 0.00000 choice
Test duration 11.20 seconds
Ftst_v7 completed successfully.
Second test: BOINC idle
=========================================================
Ftst_v7_J54_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.009191 0.00000 test mintime= 0.004928
fpu_ChirpData 0.017562 0.00000 test mintime= 0.017515
fpu_opt_ChirpData 0.008981 0.00000 test mintime= 0.004633
sse1_ChirpData_ak8e 0.007240 0.00000 test mintime= 0.007195
sse2_ChirpData_ak8 0.004578 0.00000 test mintime= 0.004515
sse3_ChirpData_ak8 0.004546 0.00000 test mintime= 0.004421
avx_ChirpData_a 0.003788 0.00000 test mintime= 0.003753
avx_ChirpData_b 0.003842 0.00000 test mintime= 0.003806
avx_ChirpData_c 0.004095 0.00000 test mintime= 0.004072
avx_ChirpData_d 0.003996 0.00000 test mintime= 0.003948
avx_ChirpData_e 0.003908 0.00000 test mintime= 0.003887
avx_ChirpData_f 0.003708 0.00000 test mintime= 0.003665
avx_ChirpData_g 0.003602 0.00000 test mintime= 0.003581
avx_ChirpData_h 0.004397 0.00000 test mintime= 0.004363
avx_ChirpData_i 0.003876 0.00000 test mintime= 0.003844
avx_fma4_ChirpData_a 0.003374 0.00000 test mintime= 0.003328
avx_fma4_ChirpData_d4 0.003371 0.00000 test mintime= 0.003353
avx_fma4_ChirpData_d6 0.003421 0.00000 test mintime= 0.003335
avx_fma4_ChirpData_d8 0.003377 0.00000 test mintime= 0.003348
avx_fma4_ChirpData_e 0.003945 0.00000 test mintime= 0.003914
avx_fma4_ChirpData_d4 0.003371 0.00000 choice
Second run
v_ChirpData 0.009147 0.00000 test mintime= 0.004946
fpu_ChirpData 0.017576 0.00000 test mintime= 0.017502
fpu_opt_ChirpData 0.008935 0.00000 test mintime= 0.004644
sse1_ChirpData_ak8e 0.007233 0.00000 test mintime= 0.007189
sse2_ChirpData_ak8 0.004593 0.00000 test mintime= 0.004523
sse3_ChirpData_ak8 0.004424 0.00000 test mintime= 0.004418
avx_ChirpData_a 0.003805 0.00000 test mintime= 0.003735
avx_ChirpData_b 0.003810 0.00000 test mintime= 0.003774
avx_ChirpData_c 0.004115 0.00000 test mintime= 0.004094
avx_ChirpData_d 0.003971 0.00000 test mintime= 0.003960
avx_ChirpData_e 0.003910 0.00000 test mintime= 0.003864
avx_ChirpData_f 0.003696 0.00000 test mintime= 0.003666
avx_ChirpData_g 0.003619 0.00000 test mintime= 0.003559
avx_ChirpData_h 0.004404 0.00000 test mintime= 0.004376
avx_ChirpData_i 0.003880 0.00000 test mintime= 0.003861
avx_fma4_ChirpData_a 0.003350 0.00000 test mintime= 0.003323
avx_fma4_ChirpData_d4 0.003392 0.00000 test mintime= 0.003354
avx_fma4_ChirpData_d6 0.003353 0.00000 test mintime= 0.003344
avx_fma4_ChirpData_d8 0.003352 0.00000 test mintime= 0.003340
avx_fma4_ChirpData_e 0.003941 0.00000 test mintime= 0.003902
avx_fma4_ChirpData_a 0.003350 0.00000 choice
Third run
v_ChirpData 0.009191 0.00000 test mintime= 0.004914
fpu_ChirpData 0.017564 0.00000 test mintime= 0.017467
fpu_opt_ChirpData 0.008974 0.00000 test mintime= 0.004635
sse1_ChirpData_ak8e 0.007437 0.00000 test mintime= 0.007225
sse2_ChirpData_ak8 0.004660 0.00000 test mintime= 0.004520
sse3_ChirpData_ak8 0.004443 0.00000 test mintime= 0.004420
avx_ChirpData_a 0.003801 0.00000 test mintime= 0.003711
avx_ChirpData_b 0.003829 0.00000 test mintime= 0.003784
avx_ChirpData_c 0.004095 0.00000 test mintime= 0.004075
avx_ChirpData_d 0.004004 0.00000 test mintime= 0.003969
avx_ChirpData_e 0.003909 0.00000 test mintime= 0.003861
avx_ChirpData_f 0.003724 0.00000 test mintime= 0.003667
avx_ChirpData_g 0.003675 0.00000 test mintime= 0.003593
avx_ChirpData_h 0.004403 0.00000 test mintime= 0.004370
avx_ChirpData_i 0.003866 0.00000 test mintime= 0.003849
avx_fma4_ChirpData_a 0.003363 0.00000 test mintime= 0.003351
avx_fma4_ChirpData_d4 0.003387 0.00000 test mintime= 0.003363
avx_fma4_ChirpData_d6 0.003381 0.00000 test mintime= 0.003345
avx_fma4_ChirpData_d8 0.003369 0.00000 test mintime= 0.003340
avx_fma4_ChirpData_e 0.003969 0.00000 test mintime= 0.003923
avx_fma4_ChirpData_a 0.003363 0.00000 choice
Test duration 11.07 seconds
Ftst_v7 completed successfully.
KarVi:
FX-8150 @ 4.5
Boinc paused as usual. System completely idle, I didn't even move the mouse.
=========================================================
Ftst_v7_J54_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.007559 0.00000 test mintime= 0.004060
fpu_ChirpData 0.013618 0.00000 test mintime= 0.013601
fpu_opt_ChirpData 0.007416 0.00000 test mintime= 0.003848
sse1_ChirpData_ak8e 0.005540 0.00000 test mintime= 0.005529
sse2_ChirpData_ak8 0.003595 0.00000 test mintime= 0.003567
sse3_ChirpData_ak8 0.003480 0.00000 test mintime= 0.003471
avx_ChirpData_a 0.002973 0.00000 test mintime= 0.002969
avx_ChirpData_b 0.002961 0.00000 test mintime= 0.002951
avx_ChirpData_c 0.003203 0.00000 test mintime= 0.003190
avx_ChirpData_d 0.003107 0.00000 test mintime= 0.003097
avx_ChirpData_e 0.003048 0.00000 test mintime= 0.003043
avx_ChirpData_f 0.002884 0.00000 test mintime= 0.002880
avx_ChirpData_g 0.002811 0.00000 test mintime= 0.002800
avx_ChirpData_h 0.003455 0.00000 test mintime= 0.003441
avx_ChirpData_i 0.003046 0.00000 test mintime= 0.003033
avx_fma4_ChirpData_a 0.002846 0.00000 test mintime= 0.002630
avx_fma4_ChirpData_d4 0.002656 0.00000 test mintime= 0.002649
avx_fma4_ChirpData_d6 0.002632 0.00000 test mintime= 0.002627
avx_fma4_ChirpData_d8 0.002620 0.00000 test mintime= 0.002618
avx_fma4_ChirpData_e 0.003091 0.00000 test mintime= 0.003088
avx_fma4_ChirpData_d8 0.002620 0.00000 choice
Second run
v_ChirpData 0.007577 0.00000 test mintime= 0.004067
fpu_ChirpData 0.013604 0.00000 test mintime= 0.013573
fpu_opt_ChirpData 0.007383 0.00000 test mintime= 0.003850
sse1_ChirpData_ak8e 0.005512 0.00000 test mintime= 0.005497
sse2_ChirpData_ak8 0.003602 0.00000 test mintime= 0.003573
sse3_ChirpData_ak8 0.003478 0.00000 test mintime= 0.003472
avx_ChirpData_a 0.002977 0.00000 test mintime= 0.002972
avx_ChirpData_b 0.002965 0.00000 test mintime= 0.002957
avx_ChirpData_c 0.003197 0.00000 test mintime= 0.003190
avx_ChirpData_d 0.003108 0.00000 test mintime= 0.003099
avx_ChirpData_e 0.003051 0.00000 test mintime= 0.003047
avx_ChirpData_f 0.002895 0.00000 test mintime= 0.002886
avx_ChirpData_g 0.002809 0.00000 test mintime= 0.002807
avx_ChirpData_h 0.003471 0.00000 test mintime= 0.003449
avx_ChirpData_i 0.003056 0.00000 test mintime= 0.003040
avx_fma4_ChirpData_a 0.002643 0.00000 test mintime= 0.002636
avx_fma4_ChirpData_d4 0.002657 0.00000 test mintime= 0.002653
avx_fma4_ChirpData_d6 0.002634 0.00000 test mintime= 0.002629
avx_fma4_ChirpData_d8 0.002620 0.00000 test mintime= 0.002618
avx_fma4_ChirpData_e 0.003102 0.00000 test mintime= 0.003091
avx_fma4_ChirpData_d8 0.002620 0.00000 choice
Third run
v_ChirpData 0.007571 0.00000 test mintime= 0.004069
fpu_ChirpData 0.013619 0.00000 test mintime= 0.013601
fpu_opt_ChirpData 0.007503 0.00000 test mintime= 0.003850
sse1_ChirpData_ak8e 0.005545 0.00000 test mintime= 0.005531
sse2_ChirpData_ak8 0.003589 0.00000 test mintime= 0.003566
sse3_ChirpData_ak8 0.003483 0.00000 test mintime= 0.003476
avx_ChirpData_a 0.002976 0.00000 test mintime= 0.002972
avx_ChirpData_b 0.002986 0.00000 test mintime= 0.002957
avx_ChirpData_c 0.003198 0.00000 test mintime= 0.003192
avx_ChirpData_d 0.003109 0.00000 test mintime= 0.003103
avx_ChirpData_e 0.003054 0.00000 test mintime= 0.003044
avx_ChirpData_f 0.002885 0.00000 test mintime= 0.002881
avx_ChirpData_g 0.002807 0.00000 test mintime= 0.002804
avx_ChirpData_h 0.003454 0.00000 test mintime= 0.003444
avx_ChirpData_i 0.003048 0.00000 test mintime= 0.003035
avx_fma4_ChirpData_a 0.002637 0.00000 test mintime= 0.002635
avx_fma4_ChirpData_d4 0.002655 0.00000 test mintime= 0.002653
avx_fma4_ChirpData_d6 0.002640 0.00000 test mintime= 0.002627
avx_fma4_ChirpData_d8 0.002629 0.00000 test mintime= 0.002618
avx_fma4_ChirpData_e 0.003095 0.00000 test mintime= 0.003092
avx_fma4_ChirpData_d8 0.002629 0.00000 choice
Test duration 8.62 seconds
Ftst_v7 completed successfully.
Again there is a difference between the 4100 and the 8150. The 8150 is more consistent, though.
I can't remember whether arkayn runs with any frequency-changing settings enabled, either Turbo or C&Q, but I find it strange that his 4100 chooses so differently when mine chooses avx_fma4 d8 every time, and with almost exactly the same timings.
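To put a number on how close those choices are, here is a small Python sketch using the four 256-bit avx_fma4 rows from the first run of arkayn's BOINC-idle test above. It assumes, as the logs suggest, that the reported choice is simply the lowest average timing; the variable names are hypothetical.

```python
# (average, mintime) copied from the first run of arkayn's BOINC-idle log.
results = {
    "avx_fma4_ChirpData_a":  (0.003374, 0.003328),
    "avx_fma4_ChirpData_d4": (0.003371, 0.003353),
    "avx_fma4_ChirpData_d6": (0.003421, 0.003335),
    "avx_fma4_ChirpData_d8": (0.003377, 0.003348),
}

# Winner by average timing versus winner by minimum iteration time.
by_average = min(results, key=lambda name: results[name][0])
by_mintime = min(results, key=lambda name: results[name][1])

# Relative spread of the averages, as a percentage of the fastest one.
averages = [avg for avg, _ in results.values()]
spread_percent = (max(averages) - min(averages)) / min(averages) * 100

print("lowest average:", by_average)
print("lowest mintime:", by_mintime)
print(f"spread of averages: {spread_percent:.1f}%")
```

With these numbers the lowest average is d4 while the lowest mintime is a, and the four averages lie within about 1.5% of each other, so a small amount of timing noise is enough to change the winner from run to run.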
arkayn:
I am running strictly stock speeds with my FX-4100 and have the system set to performance in the CP.
12GB DDR3-1600 RAM
GTX460 & HD7750
Board is an MSI 870A-G54
http://www.newegg.com/Product/Product.aspx?Item=N82E16813130632R