Pages: 1 2 3 [4] 5 6 ... 10
31
« Last post by Raistmer on 23 Aug 2016, 02:36:34 am »
r3500: class SleepQuantum: total=2.8579862, N=3, <>=0.95266207, min=0.93661302 max=0.97626472
Sleep0: class SleepQuantum: total=4.8358912, N=2704, <>=0.0017884213, min=0.00054984231 max=0.4228799
Sleep1: class SleepQuantum: total=2148.8459, N=1791, <>=1.1998023, min=0.86739361 max=3.0483601
STT: class SleepQuantum: total=3.9076965, N=2704, <>=0.001445154, min=0.0004952898 max=0.0027276319

The same question: was the CPU idle or busy? Or maybe only a single CPU core was free? Sleep behavior depends strongly on host load; that's why I always ask for a full description of the test conditions. And for the previous run without sleep enabled there is still no explanation of why these builds consume much more CPU.
32
« Last post by Mike on 22 Aug 2016, 04:28:26 pm »
Not much different. Just slower.
WU : AR075.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 474.039 secs
  CPU 228.042 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe :
  Elapsed 543.290 secs, speedup: -14.61% ratio: 0.87x
  CPU 194.861 secs, speedup: 14.55% ratio: 1.17x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe :
  Elapsed 495.577 secs, speedup: -4.54% ratio: 0.96x
  CPU 414.791 secs, speedup: -81.89% ratio: 0.55x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe :
  Elapsed 492.114 secs, speedup: -3.81% ratio: 0.96x
  CPU 297.541 secs, speedup: -30.48% ratio: 0.77x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_STT.exe :
  Elapsed 483.082 secs, speedup: -1.91% ratio: 0.98x
  CPU 415.961 secs, speedup: -82.41% ratio: 0.55x

WU : PG1327_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 62.481 secs
  CPU 36.145 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe :
  Elapsed 64.840 secs, speedup: -3.78% ratio: 0.96x
  CPU 38.345 secs, speedup: -6.09% ratio: 0.94x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe :
  Elapsed 63.978 secs, speedup: -2.40% ratio: 0.98x
  CPU 58.812 secs, speedup: -62.71% ratio: 0.61x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe :
  Elapsed 64.277 secs, speedup: -2.87% ratio: 0.97x
  CPU 44.944 secs, speedup: -24.34% ratio: 0.80x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_STT.exe :
  Elapsed 65.041 secs, speedup: -4.10% ratio: 0.96x
  CPU 59.062 secs, speedup: -63.40% ratio: 0.61x
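
For readers checking the numbers: the "speedup" and "ratio" columns appear to be consistent with a simple comparison against the r3330 reference run. The sketch below reproduces the r3500 figures for AR075.wu under that assumption. The function names are made up for illustration and are not taken from the benchmark tool.

```python
def speedup_percent(reference_secs, candidate_secs):
    """Relative time saved versus the reference, in percent (assumed formula).

    Negative values mean the candidate took longer than the reference.
    """
    return (reference_secs - candidate_secs) / reference_secs * 100.0


def ratio(reference_secs, candidate_secs):
    """Reference time divided by candidate time (assumed formula)."""
    return reference_secs / candidate_secs


# r3330 (reference) vs r3500 on AR075.wu, values copied from the post above.
elapsed_ref, elapsed_new = 474.039, 543.290
cpu_ref, cpu_new = 228.042, 194.861

print(f"Elapsed: speedup {speedup_percent(elapsed_ref, elapsed_new):.2f}% "
      f"ratio {ratio(elapsed_ref, elapsed_new):.2f}x")
print(f"CPU:     speedup {speedup_percent(cpu_ref, cpu_new):.2f}% "
      f"ratio {ratio(cpu_ref, cpu_new):.2f}x")
```

Read this way, a negative CPU "speedup" such as -81.89% means the build used about 82% more CPU time than the reference, not that it ran at 18% of the speed.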
33
« Last post by Mike on 22 Aug 2016, 12:56:50 pm »
I see no -use_sleep used. Is it idle CPU? or busy CPU run?
Well, I'd better repeat.
34
« Last post by Raistmer on 22 Aug 2016, 05:43:00 am »
I see no -use_sleep used. Is it idle CPU? or busy CPU run?
35
« Last post by Mike on 21 Aug 2016, 10:20:07 am »
WU : AR075.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 474.039 secs
  CPU 228.042 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe :
  Elapsed 476.706 secs, speedup: -0.56% ratio: 0.99x
  CPU 228.994 secs, speedup: -0.42% ratio: 1.00x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe :
  Elapsed 475.049 secs, speedup: -0.21% ratio: 1.00x
  CPU 289.210 secs, speedup: -26.82% ratio: 0.79x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe :
  Elapsed 475.277 secs, speedup: -0.26% ratio: 1.00x
  CPU 288.009 secs, speedup: -26.30% ratio: 0.79x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_STT.exe :
  Elapsed 474.973 secs, speedup: -0.20% ratio: 1.00x
  CPU 288.415 secs, speedup: -26.47% ratio: 0.79x

WU : PG1327_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 62.481 secs
  CPU 36.145 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe :
  Elapsed 61.959 secs, speedup: 0.84% ratio: 1.01x
  CPU 36.348 secs, speedup: -0.56% ratio: 0.99x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe :
  Elapsed 62.114 secs, speedup: 0.59% ratio: 1.01x
  CPU 42.370 secs, speedup: -17.22% ratio: 0.85x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe :
  Elapsed 61.824 secs, speedup: 1.05% ratio: 1.01x
  CPU 42.557 secs, speedup: -17.74% ratio: 0.85x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_STT.exe :
  Elapsed 62.313 secs, speedup: 0.27% ratio: 1.00x
  CPU 42.604 secs, speedup: -17.87% ratio: 0.85x

CPU consumption is higher on all versions.
36
« Last post by Raistmer on 20 Aug 2016, 01:42:08 pm »
Binaries updated to fix a newly introduced bug in signal logging. WARNING: don't use the binaries from V2 online.
37
« Last post by Raistmer on 19 Aug 2016, 03:18:42 pm »
Mike's results:
WU : AR075.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 474.039 secs
  CPU 228.042 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3486.exe -use_sleep :
  Elapsed 497.494 secs, speedup: -4.95% ratio: 0.95x
  CPU 180.618 secs, speedup: 20.80% ratio: 1.26x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe -use_sleep :
  Elapsed 500.524 secs, speedup: -5.59% ratio: 0.95x
  CPU 177.576 secs, speedup: 22.13% ratio: 1.28x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe -use_sleep :
  Elapsed 472.639 secs, speedup: 0.30% ratio: 1.00x
  CPU 406.617 secs, speedup: -78.31% ratio: 0.56x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe -use_sleep :
  Elapsed 474.594 secs, speedup: -0.12% ratio: 1.00x
  CPU 285.856 secs, speedup: -25.35% ratio: 0.80x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe -use_sleep :
  Elapsed 472.914 secs, speedup: 0.24% ratio: 1.00x
  CPU 407.116 secs, speedup: -78.53% ratio: 0.56x
GT720, CPU busy, use_sleep active results:
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep0.exe :
  Elapsed 3031.125 secs, speedup: 46.35% ratio: 1.86x
  CPU 365.136 secs, speedup: 90.83% ratio: 10.90x
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep1.exe :
  Elapsed 3016.956 secs, speedup: 46.60% ratio: 1.87x
  CPU 324.747 secs, speedup: 91.84% ratio: 12.26x
MB8_win_x86_SSE3_OpenCL_NV_SoG_STT.exe :
  Elapsed 3037.066 secs, speedup: 46.24% ratio: 1.86x
  CPU 348.428 secs, speedup: 91.25% ratio: 11.42x
setiathome_8.16_windows_intelx86__opencl_nvidia_SoG.exe :
  Elapsed 3012.764 secs, speedup: 46.67% ratio: 1.88x
  CPU 1721.908 secs, speedup: 56.74% ratio: 2.31x
setiathome_8.17_windows_intelx86__opencl_nvidia_SoG.exe :
  Elapsed 3016.387 secs, speedup: 46.61% ratio: 1.87x
  CPU 324.966 secs, speedup: 91.83% ratio: 12.25x
So, for these places the current choice of sleep(1) is the optimal one even without high-precision timer activation. I'll repeat the test with -high_prec_timer now for GT720.
And counters:
Sleep0: class SleepQuantum: total=13556.229, N=3065, <>=4.4229134, min=0.011274812 max=17.502548
Sleep1: class SleepQuantum: total=3163.4568, N=3153, <>=1.0033165, min=0.86198002 max=40.154495
STT: class SleepQuantum: total=16757.236, N=2412, <>=6.9474446, min=0.011177354 max=18.476677
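
When comparing counter dumps like these across runs, it can help to parse them instead of reading them by eye. The sketch below is a hypothetical helper, not part of the app or the benchmark tool. It assumes the line layout seen in this thread and that the `<>` field is the mean, which agrees with total/N for the values posted here.

```python
import re

# Layout assumed from the dumps in this thread:
# "total=..., N=..., <>=..., min=... max=..."
COUNTER_RE = re.compile(
    r"total=(?P<total>[\d.eE+-]+),\s*N=(?P<n>\d+),\s*<>=(?P<mean>[\d.eE+-]+),"
    r"\s*min=(?P<min>[\d.eE+-]+)\s+max=(?P<max>[\d.eE+-]+)"
)


def parse_counter(line):
    """Extract the SleepQuantum fields from one dump line into a dict."""
    match = COUNTER_RE.search(line)
    if match is None:
        raise ValueError(f"not a counter line: {line!r}")
    fields = {key: float(value) for key, value in match.groupdict().items()}
    fields["n"] = int(match.group("n"))
    return fields


line = ("Sleep1: class SleepQuantum: total=3163.4568, N=3153, "
        "<>=1.0033165, min=0.86198002 max=40.154495")
counter = parse_counter(line)

# Cross-check: the reported mean should equal total / N up to rounding.
recomputed = counter["total"] / counter["n"]
print(f"reported mean {counter['mean']}, recomputed {recomputed:.7f}")
```

A large gap between max and the mean, as in the Sleep1 line here, suggests a few long outliers rather than a uniformly longer sleep.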
38
« Last post by Mike on 19 Aug 2016, 10:08:34 am »
Here is bench with AR 0.75
Weakly similar on all 3 sleep variants.
39
« Last post by Raistmer on 19 Aug 2016, 08:19:40 am »
Small preliminary test on GT720: -use_sleep in tuning line, CPU busy.

WU : PG1327_v8.wu
MB8_win_x64_AVX_VS2010_r3330.exe -verb -nog :
  Elapsed 226.306 secs
  CPU 223.315 secs
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep0.exe :
  Elapsed 260.736 secs, speedup: -15.21% ratio: 0.87x
  CPU 19.641 secs, speedup: 91.20% ratio: 11.37x
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep1.exe :
  Elapsed 259.995 secs, speedup: -14.89% ratio: 0.87x
  CPU 18.939 secs, speedup: 91.52% ratio: 11.79x
MB8_win_x86_SSE3_OpenCL_NV_SoG_STT.exe :
  Elapsed 259.921 secs, speedup: -14.85% ratio: 0.87x
  CPU 19.828 secs, speedup: 91.12% ratio: 11.26x
setiathome_8.16_windows_intelx86__opencl_nvidia_SoG.exe :
  Elapsed 259.128 secs, speedup: -14.50% ratio: 0.87x
  CPU 43.602 secs, speedup: 80.48% ratio: 5.12x
setiathome_8.17_windows_intelx86__opencl_nvidia_SoG.exe :
  Elapsed 259.860 secs, speedup: -14.83% ratio: 0.87x
  CPU 19.017 secs, speedup: 91.48% ratio: 11.74x

No strong differences between the sleep methods, but one thing to notice: 8.17 is definitely better with use_sleep than 8.16.

And the SleepQuantum values are:
Sleep0: class SleepQuantum: total=91.016396, N=40, <>=2.2754099, min=0.076670475 max=4.2926121
Sleep1: class SleepQuantum: total=66.940231, N=62, <>=1.0796812, min=0.80534756 max=8.826087
STT: class SleepQuantum: total=162.07121, N=33, <>=4.9112489, min=4.226912 max=6.0012178
default: class SleepQuantum: total=46.431198, N=47, <>=0.98789783, min=0.90345198 max=1.0177377

The default build actually matches Sleep1, so it shows the noise level for this test; definitely longer tasks are required.
40
« Last post by Raistmer on 19 Aug 2016, 06:50:08 am »
Binaries updated: http://lunatics.kwsn.info/index.php/topic,1812.msg61017.html#msg61017
- both ATi and NV flavors
- all occurrences changed, so now the SleepQuantum counter really represents usage of the particular sleep method (Sleep0/1/STT).

All builds are SoG ones. SoG currently uses 2 sleep-wait loops. These builds explore whether any replacement of Sleep(1) can improve CPU consumption by the GPU app in these loops. There is a possibility to squeeze out more free CPU cycles by using STT or Sleep(0), but that will be the topic of a separate investigation and will hardly go into the near release. For testing, use a busy CPU (there is no sense in freeing CPU cycles if nobody uses them) and -use_sleep in the tuning line.
Though some configs have sleep enabled by default, it's too easy to make a mistake, so it's better to always specify -use_sleep manually for this test. More benchmark results will follow. I suggest using long enough tasks and looking at the N parameter of the SleepQuantum counter: it's the number of updates the counter has had. It's worth getting this number high enough to have representative data for this test.
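
To make the idea of a sleep-wait loop with a per-call counter concrete, here is a minimal Python sketch. It is not the app's code. `time.sleep(0)`, `time.sleep(0.001)` and `os.sched_yield()` are only rough stand-ins for the Win32 calls Sleep(0), Sleep(1) and SwitchToThread, a `threading.Timer` stands in for the GPU finishing its work, and the `QuantumCounter` class, its fields and the millisecond unit are all assumptions made for illustration.

```python
import os
import threading
import time


class QuantumCounter:
    """Running total, update count, min and max of measured pause lengths."""

    def __init__(self):
        self.total = 0.0
        self.n = 0
        self.min = float("inf")
        self.max = 0.0

    def update(self, value):
        self.total += value
        self.n += 1
        self.min = min(self.min, value)
        self.max = max(self.max, value)

    @property
    def mean(self):
        return self.total / self.n if self.n else 0.0


# Stand-ins only; the real builds call Win32 functions instead.
PAUSES = {
    "sleep0": lambda: time.sleep(0),
    "sleep1": lambda: time.sleep(0.001),
}
if hasattr(os, "sched_yield"):  # not available on Windows
    PAUSES["yield"] = os.sched_yield


def wait_for(done, pause, counter):
    """Poll `done`, pausing between checks and timing every pause in ms."""
    while not done.is_set():
        start = time.perf_counter()
        pause()
        counter.update((time.perf_counter() - start) * 1000.0)


def run(pause, work_seconds=0.05):
    """Simulate one wait: a timer plays the role of the GPU completing."""
    done = threading.Event()
    timer = threading.Timer(work_seconds, done.set)
    counter = QuantumCounter()
    timer.start()
    wait_for(done, pause, counter)
    timer.join()
    return counter


for name, pause in PAUSES.items():
    c = run(pause)
    print(f"{name}: total={c.total:.4f}, N={c.n}, <>={c.mean:.6f}, "
          f"min={c.min:.6f} max={c.max:.6f}")
```

In a sketch like this the zero-length and yield variants typically return almost immediately, so N grows large and the loop spends CPU spinning, while the 1 ms variant gives a smaller N with a mean near or above 1. That is the same trade-off the N and `<>` fields are meant to expose in the real counters.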