AVX Optimized App Development
KarVi:
So it's possible the CPU is clocking up and down?
That could explain why results fluctuate more on your system.
My 8150 sits in an ASUS Sabertooth 990FX and is locked at 4.5 GHz, with CNQ and turbo turned off, as well as all other power-saving states (C1E, C6). The SRQ/UNB/L3 cache is running at 2.4 GHz. Memory is at 1866 MHz, but it is cheap RAM with relaxed timings (9-10-9-27-48).
A locked clock produces more predictable results, because the CPU can't change speed during tests.
But looking at the mintimes on your 4100, it does seem to prefer d8, as that produces the fastest mintimes, so Josef's inclusion of mintimes is helpful indeed.
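For anyone wondering why the mintime column is less sensitive to clock changes than the average: here is a minimal sketch of the idea, not the actual Ftst harness. The helper `time_function`, the `workload` stand-in and the repeat count are all made up for illustration. A single slow run (a clock drop, a background task) pulls the average up but leaves the minimum untouched.

```python
import time

def time_function(func, repeats=20):
    """Run func `repeats` times; return (average, minimum) wall time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return sum(samples) / len(samples), min(samples)

def workload():
    # Hypothetical stand-in for a routine such as ChirpData.
    total = 0.0
    for i in range(20000):
        total += i * 0.5
    return total

if __name__ == "__main__":
    average, mintime = time_function(workload)
    print(f"timing {average:.6f}  mintime= {mintime:.6f}")
```

The closer the two numbers are, the less the run was disturbed, which is roughly what a locked clock should give you.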
PatrickV2:
Hi there,
I have my brother's new gaming machine under my desk. Basic config:
i7-3770 @ 3.4GHz / Sabertooth Z77 mainboard / 8GB of DDR3-1600 memory.
Nothing running in the background, results:
=========================================================
Ftst_v7_J45 started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_BaseLineSmooth (no other)
v_GetPowerSpectrum 0.000246 0.00000 test
v_vGetPowerSpectrum 0.000110 0.00000 test
v_vGetPowerSpectrum2 0.000140 0.00000 test
v_vGetPowerSpectrumUnrolled 0.000098 0.00000 test
v_vGetPowerSpectrumUnrolled2 0.000139 0.00000 test
v_avxGetPowerSpectrum 0.000099 0.00000 test
v_vGetPowerSpectrumUnrolled 0.000098 0.00000 choice
v_ChirpData 0.003750 0.00000 test
fpu_ChirpData 0.010178 0.00000 test
fpu_opt_ChirpData 0.003710 0.00000 test
v_vChirpData_x86_64 0.049955 0.00000 test
sse1_ChirpData_ak 0.005268 0.00000 test
sse1_ChirpData_ak8e 0.004364 0.00000 test
sse1_ChirpData_ak8h 0.004614 0.00000 test
sse2_ChirpData_ak 0.004918 0.00000 test
sse2_ChirpData_ak8 0.003300 0.00000 test
sse3_ChirpData_ak 0.004881 0.00000 test
sse3_ChirpData_ak8 0.003142 0.00000 test
avx_ChirpData_a 0.001725 0.00000 test
avx_ChirpData_b 0.001749 0.00000 test
avx_ChirpData_c 0.001756 0.00000 test
avx_ChirpData_d 0.001615 0.00000 test
avx_ChirpData_d 0.001615 0.00000 choice
v_Transpose 0.002763 0.00000 test
v_Transpose2 0.004412 0.00000 test
v_Transpose4 0.002838 0.00000 test
v_Transpose8 0.003690 0.00000 test
v_pfTranspose2 0.002367 0.00000 test
v_pfTranspose4 0.002801 0.00000 test
v_pfTranspose8 0.004063 0.00000 test
v_vTranspose4 0.001822 0.00000 test
v_vTranspose4np 0.001887 0.00000 test
v_vTranspose4ntw 0.007671 0.00000 test
v_vTranspose4x8ntw 0.004252 0.00000 test
v_vTranspose4x16ntw 0.001665 0.00000 test
v_vpfTranspose8x4ntw 0.007697 0.00000 test
v_avxTranspose4x8ntw 0.004180 0.00000 test
v_avxTranspose4x16ntw 0.001380 0.00000 test
v_avxTranspose8x4ntw 0.007660 0.00000 test
v_avxTranspose8x8ntw_a 0.004437 0.00000 test
v_avxTranspose8x8ntw_b 0.004565 0.00000 test
v_avxTranspose4x16ntw 0.001380 0.00000 choice
FPU opt folding 0.002733 0.00000 test
AK SSE folding 0.001015 0.00000 test
BH SSE folding 0.000938 0.00000 test
JS AVX_a folding 0.000798 0.00000 test
JS AVX_c folding 0.000821 0.00000 test
JS AVX_a folding 0.000798 0.00000 choice
Test duration 7.03 seconds
Ftst_v7 completed successfully.
=========================================================
Ftst_v7_J54_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.003834 0.00000 test mintime= 0.002387
fpu_ChirpData 0.010318 0.00000 test mintime= 0.010281
fpu_opt_ChirpData 0.003676 0.00000 test mintime= 0.002203
sse1_ChirpData_ak8e 0.004432 0.00000 test mintime= 0.004417
sse2_ChirpData_ak8 0.003265 0.00000 test mintime= 0.003256
sse3_ChirpData_ak8 0.003119 0.00000 test mintime= 0.003111
avx_ChirpData_a 0.001727 0.00000 test mintime= 0.001723
avx_ChirpData_b 0.001693 0.00000 test mintime= 0.001688
avx_ChirpData_c 0.001768 0.00000 test mintime= 0.001752
avx_ChirpData_d 0.001614 0.00000 test mintime= 0.001612
avx_ChirpData_e 0.001610 0.00000 test mintime= 0.001605
avx_ChirpData_f 0.001721 0.00000 test mintime= 0.001717
avx_ChirpData_g 0.001750 0.00000 test mintime= 0.001736
avx_ChirpData_h 0.002148 0.00000 test mintime= 0.002114
avx_ChirpData_i 0.001855 0.00000 test mintime= 0.001837
avx_fma4_ChirpData_a not supported by system
avx_fma4_ChirpData_d4 not supported by system
avx_fma4_ChirpData_d6 not supported by system
avx_fma4_ChirpData_d8 not supported by system
avx_fma4_ChirpData_e not supported by system
avx_ChirpData_e 0.001610 0.00000 choice
Second run
v_ChirpData 0.003801 0.00000 test mintime= 0.002389
fpu_ChirpData 0.010334 0.00000 test mintime= 0.010292
fpu_opt_ChirpData 0.003718 0.00000 test mintime= 0.002202
sse1_ChirpData_ak8e 0.004435 0.00000 test mintime= 0.004411
sse2_ChirpData_ak8 0.003280 0.00000 test mintime= 0.003258
sse3_ChirpData_ak8 0.003125 0.00000 test mintime= 0.003112
avx_ChirpData_a 0.001725 0.00000 test mintime= 0.001723
avx_ChirpData_b 0.001698 0.00000 test mintime= 0.001689
avx_ChirpData_c 0.001755 0.00000 test mintime= 0.001752
avx_ChirpData_d 0.001617 0.00000 test mintime= 0.001612
avx_ChirpData_e 0.001608 0.00000 test mintime= 0.001605
avx_ChirpData_f 0.001733 0.00000 test mintime= 0.001720
avx_ChirpData_g 0.001741 0.00000 test mintime= 0.001732
avx_ChirpData_h 0.002146 0.00000 test mintime= 0.002111
avx_ChirpData_i 0.001862 0.00000 test mintime= 0.001843
avx_fma4_ChirpData_a not supported by system
avx_fma4_ChirpData_d4 not supported by system
avx_fma4_ChirpData_d6 not supported by system
avx_fma4_ChirpData_d8 not supported by system
avx_fma4_ChirpData_e not supported by system
avx_ChirpData_e 0.001608 0.00000 choice
Third run
v_ChirpData 0.003781 0.00000 test mintime= 0.002383
fpu_ChirpData 0.010354 0.00000 test mintime= 0.010298
fpu_opt_ChirpData 0.003670 0.00000 test mintime= 0.002202
sse1_ChirpData_ak8e 0.004439 0.00000 test mintime= 0.004414
sse2_ChirpData_ak8 0.003258 0.00000 test mintime= 0.003255
sse3_ChirpData_ak8 0.003139 0.00000 test mintime= 0.003113
avx_ChirpData_a 0.001727 0.00000 test mintime= 0.001723
avx_ChirpData_b 0.001697 0.00000 test mintime= 0.001690
avx_ChirpData_c 0.001755 0.00000 test mintime= 0.001753
avx_ChirpData_d 0.001621 0.00000 test mintime= 0.001612
avx_ChirpData_e 0.001611 0.00000 test mintime= 0.001605
avx_ChirpData_f 0.001727 0.00000 test mintime= 0.001717
avx_ChirpData_g 0.001743 0.00000 test mintime= 0.001735
avx_ChirpData_h 0.002145 0.00000 test mintime= 0.002107
avx_ChirpData_i 0.001857 0.00000 test mintime= 0.001840
avx_fma4_ChirpData_a not supported by system
avx_fma4_ChirpData_d4 not supported by system
avx_fma4_ChirpData_d6 not supported by system
avx_fma4_ChirpData_d8 not supported by system
avx_fma4_ChirpData_e not supported by system
avx_ChirpData_e 0.001611 0.00000 choice
Test duration 12.43 seconds
Ftst_v7 completed successfully.
Hope this helps, regards,
Patrick.
arkayn:
--- Quote from: KarVi on 25 May 2012, 01:12:39 pm ---So it's possible the CPU is clocking up and down?
That could explain why results fluctuate more on your system.
My 8150 sits in an ASUS Sabertooth 990FX and is locked at 4.5 GHz, with CNQ and turbo turned off, as well as all other power-saving states (C1E, C6). The SRQ/UNB/L3 cache is running at 2.4 GHz.
A locked clock produces more predictable results, because the CPU can't change speed during tests.
But looking at the mintimes on your 4100, it does seem to prefer d8, as that produces the fastest mintimes, so Josef's inclusion of mintimes is helpful indeed.
--- End quote ---
Just restarted to check my BIOS and disabled Turbo and C1E, so there should be no major discrepancies now.
=========================================================
Ftst_v7_J54_Chirponly started.
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_ChirpData 0.009824 0.00000 test mintime= 0.005371
fpu_ChirpData 0.017487 0.00000 test mintime= 0.017333
fpu_opt_ChirpData 0.009654 0.00000 test mintime= 0.005122
sse1_ChirpData_ak8e 0.007148 0.00000 test mintime= 0.007111
sse2_ChirpData_ak8 0.004559 0.00000 test mintime= 0.004487
sse3_ChirpData_ak8 0.004439 0.00000 test mintime= 0.004402
avx_ChirpData_a 0.003772 0.00000 test mintime= 0.003741
avx_ChirpData_b 0.003828 0.00000 test mintime= 0.003732
avx_ChirpData_c 0.004082 0.00000 test mintime= 0.004050
avx_ChirpData_d 0.003935 0.00000 test mintime= 0.003932
avx_ChirpData_e 0.003859 0.00000 test mintime= 0.003853
avx_ChirpData_f 0.003646 0.00000 test mintime= 0.003635
avx_ChirpData_g 0.003545 0.00000 test mintime= 0.003533
avx_ChirpData_h 0.004371 0.00000 test mintime= 0.004342
avx_ChirpData_i 0.003838 0.00000 test mintime= 0.003808
avx_fma4_ChirpData_a 0.003330 0.00000 test mintime= 0.003310
avx_fma4_ChirpData_d4 0.003355 0.00000 test mintime= 0.003341
avx_fma4_ChirpData_d6 0.003351 0.00000 test mintime= 0.003328
avx_fma4_ChirpData_d8 0.003342 0.00000 test mintime= 0.003325
avx_fma4_ChirpData_e 0.003921 0.00000 test mintime= 0.003904
avx_fma4_ChirpData_a 0.003330 0.00000 choice
Second run
v_ChirpData 0.009809 0.00000 test mintime= 0.005367
fpu_ChirpData 0.017515 0.00000 test mintime= 0.017334
fpu_opt_ChirpData 0.009602 0.00000 test mintime= 0.005055
sse1_ChirpData_ak8e 0.007170 0.00000 test mintime= 0.007113
sse2_ChirpData_ak8 0.004509 0.00000 test mintime= 0.004488
sse3_ChirpData_ak8 0.004414 0.00000 test mintime= 0.004390
avx_ChirpData_a 0.003774 0.00000 test mintime= 0.003756
avx_ChirpData_b 0.003848 0.00000 test mintime= 0.003806
avx_ChirpData_c 0.004058 0.00000 test mintime= 0.004048
avx_ChirpData_d 0.003937 0.00000 test mintime= 0.003932
avx_ChirpData_e 0.003857 0.00000 test mintime= 0.003853
avx_ChirpData_f 0.003644 0.00000 test mintime= 0.003635
avx_ChirpData_g 0.003543 0.00000 test mintime= 0.003534
avx_ChirpData_h 0.004350 0.00000 test mintime= 0.004335
avx_ChirpData_i 0.003856 0.00000 test mintime= 0.003822
avx_fma4_ChirpData_a 0.003331 0.00000 test mintime= 0.003310
avx_fma4_ChirpData_d4 0.003349 0.00000 test mintime= 0.003341
avx_fma4_ChirpData_d6 0.003335 0.00000 test mintime= 0.003329
avx_fma4_ChirpData_d8 0.003333 0.00000 test mintime= 0.003326
avx_fma4_ChirpData_e 0.003913 0.00000 test mintime= 0.003900
avx_fma4_ChirpData_a 0.003331 0.00000 choice
Third run
v_ChirpData 0.009795 0.00000 test mintime= 0.005379
fpu_ChirpData 0.017380 0.00000 test mintime= 0.017333
fpu_opt_ChirpData 0.009683 0.00000 test mintime= 0.005122
sse1_ChirpData_ak8e 0.007147 0.00000 test mintime= 0.007113
sse2_ChirpData_ak8 0.004544 0.00000 test mintime= 0.004502
sse3_ChirpData_ak8 0.004440 0.00000 test mintime= 0.004403
avx_ChirpData_a 0.003776 0.00000 test mintime= 0.003748
avx_ChirpData_b 0.003836 0.00000 test mintime= 0.003741
avx_ChirpData_c 0.004129 0.00000 test mintime= 0.004049
avx_ChirpData_d 0.003951 0.00000 test mintime= 0.003937
avx_ChirpData_e 0.003870 0.00000 test mintime= 0.003853
avx_ChirpData_f 0.003650 0.00000 test mintime= 0.003638
avx_ChirpData_g 0.003574 0.00000 test mintime= 0.003537
avx_ChirpData_h 0.004354 0.00000 test mintime= 0.004334
avx_ChirpData_i 0.003865 0.00000 test mintime= 0.003821
avx_fma4_ChirpData_a 0.003316 0.00000 test mintime= 0.003311
avx_fma4_ChirpData_d4 0.003347 0.00000 test mintime= 0.003342
avx_fma4_ChirpData_d6 0.003332 0.00000 test mintime= 0.003325
avx_fma4_ChirpData_d8 0.003330 0.00000 test mintime= 0.003326
avx_fma4_ChirpData_e 0.003932 0.00000 test mintime= 0.003908
avx_fma4_ChirpData_a 0.003316 0.00000 choice
Test duration 10.93 seconds
Ftst_v7 completed successfully.
KarVi:
Well, now it doesn't prefer d8 anymore (if it ever did), but at least it's consistent in what it chooses. :)
Results are more stable it seems.
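To put a rough number on "more stable", one option is the run-to-run spread of the chosen function. The sketch below takes the three "timing" values for avx_fma4_ChirpData_a from the runs posted above; the helper `relative_spread` is just an illustrative name, and (max - min) / min is only one of several reasonable measures.

```python
# "timing" column for avx_fma4_ChirpData_a in the three runs posted above.
timings = [0.003330, 0.003331, 0.003316]

def relative_spread(values):
    """Return (max - min) / min, the run-to-run spread as a fraction."""
    return (max(values) - min(values)) / min(values)

spread = relative_spread(timings)
print(f"spread: {spread:.2%}")  # prints "spread: 0.45%"
```

So with Turbo and C1E off, the three runs agree to within about half a percent.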
arkayn:
Actually, my system has fairly consistently preferred a, but d8 was usually a close second.
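The ranking can be checked straight from the pasted output. This is a small sketch that assumes the whitespace-separated layout seen in the logs above (name, timing, error, "test", "mintime=", value); `parse` is a hypothetical helper, and lines that do not fit that shape (such as "not supported by system") are simply skipped. The sample lines are the FMA4 results from the third run.

```python
LOG = """\
avx_fma4_ChirpData_a 0.003316 0.00000 test mintime= 0.003311
avx_fma4_ChirpData_d4 0.003347 0.00000 test mintime= 0.003342
avx_fma4_ChirpData_d6 0.003332 0.00000 test mintime= 0.003325
avx_fma4_ChirpData_d8 0.003330 0.00000 test mintime= 0.003326
avx_fma4_ChirpData_e 0.003932 0.00000 test mintime= 0.003908
"""

def parse(log):
    """Return a list of (name, timing, mintime) from 'test' lines."""
    rows = []
    for line in log.splitlines():
        fields = line.split()
        # Expected fields: name, timing, error, "test", "mintime=", value
        if len(fields) == 6 and fields[3] == "test" and fields[4] == "mintime=":
            rows.append((fields[0], float(fields[1]), float(fields[5])))
    return rows

rows = parse(LOG)
# Rank once by average timing and once by minimum time.
by_timing = [name for name, timing, _ in sorted(rows, key=lambda r: r[1])]
by_mintime = [name for name, _, mintime in sorted(rows, key=lambda r: r[2])]
print("by timing: ", by_timing)
print("by mintime:", by_mintime)
```

On this run a comes first either way. d8 is second by average timing, while d6 edges it out on mintime by 0.000001, which is well within the noise.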