Recent Posts

41
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Raistmer on 19 Aug 2016, 04:48:24 am »
From Mike's run:

PG1327
Sleep0: class SleepQuantum:      total=43.846642,   N=32,   <>=1.3702075,   min=0.93442535   max=1.6681368
Sleep1: class SleepQuantum:      total=43.289009,   N=31,   <>=1.3964196,   min=1.1869471   max=1.8000549
SwitchTothread:class SleepQuantum:      total=44.513672,   N=32,   <>=1.3910522,   min=0.948681   max=1.8100463

Summary: we should forget about the PG set for this GPU, and especially for this test. There are only ~30 occurrences over the whole task, and not even all of them were modified.
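The `<>` column in these lines appears to be the plain mean, total / N; the units are not stated in the post. A minimal Python sketch, assuming that reading (QuantumStats and mean_matches are hypothetical names, not taken from the app's source), keeps the same running counters and checks the three posted lines:

```python
import math
from dataclasses import dataclass


@dataclass
class QuantumStats:
    """Hypothetical running counters with the same fields as the log line."""
    total: float = 0.0
    n: int = 0
    smallest: float = math.inf
    largest: float = -math.inf

    def add(self, sample: float) -> None:
        # Update the sum, the count and both extremes for one measured sample.
        self.total += sample
        self.n += 1
        self.smallest = min(self.smallest, sample)
        self.largest = max(self.largest, sample)

    @property
    def mean(self) -> float:
        # Assumed meaning of the "<>" column: total divided by N.
        return self.total / self.n if self.n else 0.0


def mean_matches(total: float, n: int, posted_mean: float, tol: float = 1e-6) -> bool:
    """True when total / n agrees with the posted mean within tol."""
    return abs(total / n - posted_mean) < tol


# (total, N, <>) copied from the three log lines in the post.
POSTED = {
    "Sleep0": (43.846642, 32, 1.3702075),
    "Sleep1": (43.289009, 31, 1.3964196),
    "SwitchTothread": (44.513672, 32, 1.3910522),
}

for name, (total, n, posted_mean) in POSTED.items():
    print(name, mean_matches(total, n, posted_mean))
```

With only 31 or 32 samples per run, the min/max range within each build is much wider than the gap between the three means, which fits the conclusion that the PG set is too short for this comparison.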


@Mike, please repeat a similar test at the next opportunity with the task attached here. I hope it runs longer and gives more chances to test.

P.S. From my failed C-60 test one can conclude that ATi is not well suited to this test. The ATi runtime frees the CPU well enough that non-sleep results mix with sleep ones. I'll provide an NV-flavoured version soon too.
42
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Raistmer on 19 Aug 2016, 04:31:00 am »
Damn, I forgot that the ATi low-perf path doesn't enable sleep, unlike the NV one. My two-day C-60 test is screwed   :-\
43
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Mike on 18 Aug 2016, 04:06:29 pm »

Here is my bench with busy CPU using -use_sleep.

WU : PG0009_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 78.072 secs
      CPU 37.861 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3430.exe  -use_sleep :
  Elapsed 130.747 secs, speedup: -67.47%  ratio: 0.60x
      CPU 37.113 secs, speedup: 1.98%  ratio: 1.02x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3486.exe  -use_sleep :
  Elapsed 86.725 secs, speedup: -11.08%  ratio: 0.90x
      CPU 40.092 secs, speedup: -5.89%  ratio: 0.94x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe  -use_sleep :
  Elapsed 87.227 secs, speedup: -11.73%  ratio: 0.90x
      CPU 39.359 secs, speedup: -3.96%  ratio: 0.96x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe  -use_sleep :
  Elapsed 85.652 secs, speedup: -9.71%  ratio: 0.91x
      CPU 41.902 secs, speedup: -10.67%  ratio: 0.90x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe  -use_sleep :
  Elapsed 84.263 secs, speedup: -7.93%  ratio: 0.93x
      CPU 39.811 secs, speedup: -5.15%  ratio: 0.95x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe  -use_sleep :
  Elapsed 85.323 secs, speedup: -9.29%  ratio: 0.92x
      CPU 41.886 secs, speedup: -10.63%  ratio: 0.90x
 
WU : PG0395_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 54.637 secs
      CPU 35.787 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3430.exe  -use_sleep :
  Elapsed 65.286 secs, speedup: -19.49%  ratio: 0.84x
      CPU 34.679 secs, speedup: 3.10%  ratio: 1.03x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3486.exe  -use_sleep :
  Elapsed 61.540 secs, speedup: -12.63%  ratio: 0.89x
      CPU 35.038 secs, speedup: 2.09%  ratio: 1.02x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe  -use_sleep :
  Elapsed 62.675 secs, speedup: -14.71%  ratio: 0.87x
      CPU 34.913 secs, speedup: 2.44%  ratio: 1.03x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe  -use_sleep :
  Elapsed 58.698 secs, speedup: -7.43%  ratio: 0.93x
      CPU 49.218 secs, speedup: -37.53%  ratio: 0.73x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe  -use_sleep :
  Elapsed 59.481 secs, speedup: -8.87%  ratio: 0.92x
      CPU 40.966 secs, speedup: -14.47%  ratio: 0.87x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe  -use_sleep :
  Elapsed 59.418 secs, speedup: -8.75%  ratio: 0.92x
      CPU 49.094 secs, speedup: -37.18%  ratio: 0.73x
 
WU : PG0444_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 53.981 secs
      CPU 35.085 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3430.exe  -use_sleep :
  Elapsed 62.600 secs, speedup: -15.97%  ratio: 0.86x
      CPU 34.476 secs, speedup: 1.74%  ratio: 1.02x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3486.exe  -use_sleep :
  Elapsed 61.373 secs, speedup: -13.69%  ratio: 0.88x
      CPU 35.584 secs, speedup: -1.42%  ratio: 0.99x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe  -use_sleep :
  Elapsed 61.562 secs, speedup: -14.04%  ratio: 0.88x
      CPU 35.943 secs, speedup: -2.45%  ratio: 0.98x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe  -use_sleep :
  Elapsed 57.967 secs, speedup: -7.38%  ratio: 0.93x
      CPU 48.735 secs, speedup: -38.91%  ratio: 0.72x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe  -use_sleep :
  Elapsed 58.220 secs, speedup: -7.85%  ratio: 0.93x
      CPU 40.295 secs, speedup: -14.85%  ratio: 0.87x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe  -use_sleep :
  Elapsed 58.184 secs, speedup: -7.79%  ratio: 0.93x
      CPU 48.329 secs, speedup: -37.75%  ratio: 0.73x
 
WU : PG1327_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 62.481 secs
      CPU 36.145 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3430.exe  -use_sleep :
  Elapsed 69.000 secs, speedup: -10.43%  ratio: 0.91x
      CPU 39.624 secs, speedup: -9.63%  ratio: 0.91x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3486.exe  -use_sleep :
  Elapsed 69.452 secs, speedup: -11.16%  ratio: 0.90x
      CPU 38.579 secs, speedup: -6.73%  ratio: 0.94x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe  -use_sleep :
  Elapsed 68.927 secs, speedup: -10.32%  ratio: 0.91x
      CPU 37.846 secs, speedup: -4.71%  ratio: 0.96x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe  -use_sleep :
  Elapsed 67.176 secs, speedup: -7.51%  ratio: 0.93x
      CPU 44.101 secs, speedup: -22.01%  ratio: 0.82x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe  -use_sleep :
  Elapsed 67.968 secs, speedup: -8.78%  ratio: 0.92x
      CPU 43.836 secs, speedup: -21.28%  ratio: 0.82x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe  -use_sleep :
  Elapsed 67.928 secs, speedup: -8.72%  ratio: 0.92x
      CPU 43.477 secs, speedup: -20.28%  ratio: 0.83x
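
The speedup and ratio columns in this bench output are consistent with speedup = (baseline - candidate) / baseline * 100 and ratio = baseline / candidate, with r3330 as the baseline. That is inferred from the numbers, not taken from the bench script itself. A minimal Python sketch under that assumption (function names are hypothetical):

```python
def speedup_percent(baseline: float, candidate: float) -> float:
    """Positive when the candidate needs less time than the baseline."""
    return (baseline - candidate) / baseline * 100.0


def ratio(baseline: float, candidate: float) -> float:
    """Above 1.0 when the candidate is faster than the baseline."""
    return baseline / candidate


# PG0009_v7.wu, r3330 baseline vs r3430 with -use_sleep (values from the log above).
elapsed_base, elapsed_r3430 = 78.072, 130.747
cpu_base, cpu_r3430 = 37.861, 37.113

print(f"Elapsed: speedup {speedup_percent(elapsed_base, elapsed_r3430):.2f}%  "
      f"ratio {ratio(elapsed_base, elapsed_r3430):.2f}x")
print(f"CPU:     speedup {speedup_percent(cpu_base, cpu_r3430):.2f}%  "
      f"ratio {ratio(cpu_base, cpu_r3430):.2f}x")
```

A negative speedup therefore means the build took longer (or used more CPU time) than r3330, so the -67.47% elapsed figure for r3430 is a slowdown.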
 
44
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Mike on 18 Aug 2016, 12:58:35 pm »
Quote
In the next testing window, please also try with the CPU busy and with the -use_sleep option, which is mandatory.

There is one very important difference between your GPU and my C-60 regarding this test. My C-60 is one of the slowest ATi devices, so it gets the low-perf path with sleep enabled by default.
Hence I omitted the option in my testing (still running, BTW, on the netbook).

All changes between these builds are wrapped in if(use_sleep), so enabling sleep is a requirement.

EDIT: and as usual these days, it seems you need some full-length tasks, not the PG set. The GPU is too fast.
For example, all the CPU time you see most probably came from startup code, not from icfft loop processing.

You said "no tuning line at all" in your post above.

That's what I did.  :(

Quote
No tuning line at all so fully default.
CPU fixation to 1GHz reapplied after reboot.
Binaries used for this test attached so reader can repeat it on any ATi GPU equipped host.

A short instruction on what you want would be helpful.

Will test with the CPU busy and -use_sleep.

45
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Raistmer on 18 Aug 2016, 09:43:15 am »
In the next testing window, please also try with the CPU busy and with the -use_sleep option, which is mandatory.

There is one very important difference between your GPU and my C-60 regarding this test. My C-60 is one of the slowest ATi devices, so it gets the low-perf path with sleep enabled by default.
Hence I omitted the option in my testing (still running, BTW, on the netbook).

All changes between these builds are wrapped in if(use_sleep), so enabling sleep is a requirement.

EDIT: and as usual these days, it seems you need some full-length tasks, not the PG set. The GPU is too fast.
For example, all the CPU time you see most probably came from startup code, not from icfft loop processing.
46
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Raistmer on 18 Aug 2016, 09:33:57 am »

Quote
I don't think changing the high-precision timer is a good idea for stock development,
especially for hosts that are used for things other than crunching.

Yep, I will not make this the default, at least not before quite prolonged testing. It's a system-wide change... So -high_prec_timer will not be enabled by default in the next release.
47
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Mike on 18 Aug 2016, 09:33:19 am »
Oops, forgot to attach the bench log.

Done.
48
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Raistmer on 18 Aug 2016, 09:31:14 am »

Quote
Now going to the hospital to see my new grandchild.
It's just 10 hours old.

Congrats, Mike! :)
I'll look at the test in detail in the meantime.
49
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Mike on 18 Aug 2016, 09:20:04 am »
Hello my name is Mike

Here is a bench of all sleep variants on my R9 380.
Default settings, system idle.

WU : PG0009_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 78.072 secs
      CPU 37.861 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3430.exe   :
  Elapsed 105.248 secs, speedup: -34.81%  ratio: 0.74x
      CPU 45.568 secs, speedup: -20.36%  ratio: 0.83x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3486.exe   :
  Elapsed 82.028 secs, speedup: -5.07%  ratio: 0.95x
      CPU 37.081 secs, speedup: 2.06%  ratio: 1.02x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe   :
  Elapsed 81.469 secs, speedup: -4.35%  ratio: 0.96x
      CPU 36.879 secs, speedup: 2.59%  ratio: 1.03x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe   :
  Elapsed 80.298 secs, speedup: -2.85%  ratio: 0.97x
      CPU 38.610 secs, speedup: -1.98%  ratio: 0.98x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe   :
  Elapsed 80.357 secs, speedup: -2.93%  ratio: 0.97x
      CPU 37.971 secs, speedup: -0.29%  ratio: 1.00x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe   :
  Elapsed 80.881 secs, speedup: -3.60%  ratio: 0.97x
      CPU 38.002 secs, speedup: -0.37%  ratio: 1.00x
 
WU : PG0395_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 54.637 secs
      CPU 35.787 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3430.exe   :
  Elapsed 57.780 secs, speedup: -5.75%  ratio: 0.95x
      CPU 36.208 secs, speedup: -1.18%  ratio: 0.99x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3486.exe   :
  Elapsed 57.933 secs, speedup: -6.03%  ratio: 0.94x
      CPU 36.161 secs, speedup: -1.05%  ratio: 0.99x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe   :
  Elapsed 57.769 secs, speedup: -5.73%  ratio: 0.95x
      CPU 35.459 secs, speedup: 0.92%  ratio: 1.01x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe   :
  Elapsed 56.500 secs, speedup: -3.41%  ratio: 0.97x
      CPU 38.251 secs, speedup: -6.89%  ratio: 0.94x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe   :
  Elapsed 57.102 secs, speedup: -4.51%  ratio: 0.96x
      CPU 38.064 secs, speedup: -6.36%  ratio: 0.94x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe   :
  Elapsed 56.563 secs, speedup: -3.53%  ratio: 0.97x
      CPU 38.454 secs, speedup: -7.45%  ratio: 0.93x
 
WU : PG0444_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 53.981 secs
      CPU 35.085 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3430.exe   :
  Elapsed 57.213 secs, speedup: -5.99%  ratio: 0.94x
      CPU 36.520 secs, speedup: -4.09%  ratio: 0.96x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3486.exe   :
  Elapsed 56.641 secs, speedup: -4.93%  ratio: 0.95x
      CPU 35.475 secs, speedup: -1.11%  ratio: 0.99x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe   :
  Elapsed 57.382 secs, speedup: -6.30%  ratio: 0.94x
      CPU 35.475 secs, speedup: -1.11%  ratio: 0.99x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe   :
  Elapsed 55.102 secs, speedup: -2.08%  ratio: 0.98x
      CPU 38.329 secs, speedup: -9.25%  ratio: 0.92x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe   :
  Elapsed 56.778 secs, speedup: -5.18%  ratio: 0.95x
      CPU 37.908 secs, speedup: -8.05%  ratio: 0.93x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe   :
  Elapsed 55.657 secs, speedup: -3.10%  ratio: 0.97x
      CPU 38.033 secs, speedup: -8.40%  ratio: 0.92x
 
WU : PG1327_v7.wu
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3330.exe -verb -nog :
  Elapsed 62.481 secs
      CPU 36.145 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3430.exe   :
  Elapsed 66.964 secs, speedup: -7.17%  ratio: 0.93x
      CPU 36.941 secs, speedup: -2.20%  ratio: 0.98x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3486.exe   :
  Elapsed 67.738 secs, speedup: -8.41%  ratio: 0.92x
      CPU 36.941 secs, speedup: -2.20%  ratio: 0.98x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe   :
  Elapsed 68.462 secs, speedup: -9.57%  ratio: 0.91x
      CPU 36.379 secs, speedup: -0.65%  ratio: 0.99x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe   :
  Elapsed 66.963 secs, speedup: -7.17%  ratio: 0.93x
      CPU 42.323 secs, speedup: -17.09%  ratio: 0.85x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe   :
  Elapsed 66.555 secs, speedup: -6.52%  ratio: 0.94x
      CPU 42.198 secs, speedup: -16.75%  ratio: 0.86x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe   :
  Elapsed 66.071 secs, speedup: -5.75%  ratio: 0.95x
      CPU 41.917 secs, speedup: -15.97%  ratio: 0.86x
 
To me the picture is quite clear.
The faster the GPU apps get, the more CPU they use.
I don't think changing the high-precision timer is a good idea for stock development,
especially for hosts that are used for things other than crunching.
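
One rough way to put a number on "more CPU" is the share of elapsed time the app spends on the CPU. The sketch below is only an illustration: it compares the r3330 baseline on the R9 380 from this post with the stock-app baseline from the C-60 run posted earlier in the thread, so the apps, tasks and hosts all differ, and cpu_share is a hypothetical helper name.

```python
def cpu_share(cpu_secs: float, elapsed_secs: float) -> float:
    """Fraction of wall-clock time that was spent on the CPU."""
    return cpu_secs / elapsed_secs


# (CPU secs, Elapsed secs) for baseline runs copied from the bench logs in this thread.
runs = {
    "R9 380, PG0009_v7.wu, r3330": (37.861, 78.072),
    "R9 380, PG0395_v7.wu, r3330": (35.787, 54.637),
    "C-60, AR075.wu, stock 8.12": (355.885, 17457.726),
}

for label, (cpu, elapsed) in runs.items():
    print(f"{label}: {cpu_share(cpu, elapsed) * 100:.1f}% of elapsed time on CPU")
```

On the fast GPU the CPU share is roughly half of the elapsed time or more, while on the slow C-60 it is about 2%, which is in line with the observation above.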

Now going to the hospital to see my new grandchild.
It's just 10 hours old.
50
Discussion Forum / Re: Better sleep on Windows - new round
« Last post by Raistmer on 17 Aug 2016, 11:41:03 am »
For this test the host was rebooted to restore the default multimedia timer behavior.
The -high_prec_time option will be added to the next bench run.
No tuning line at all, so fully default.
CPU fixation to 1GHz was reapplied after the reboot.
The binaries used for this test are attached, so readers can repeat it on any host with an ATi or NV (Fermi+) GPU.

And, finally, results from C-60:

CPU busy, no special changes in mm timer (and no sleep at all):

WU : AR075.wu
setiathome_8.12_windows_intelx86__opencl_ati5_sah.exe -verb -nog :
  Elapsed 17457.726 secs
      CPU 355.885 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe  :
  Elapsed 16464.784 secs, speedup: 5.69%  ratio: 1.06x
      CPU 417.849 secs, speedup: -17.41%  ratio: 0.85x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe  :
  Elapsed 16283.808 secs, speedup: 6.72%  ratio: 1.07x
      CPU 413.013 secs, speedup: -16.05%  ratio: 0.86x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe  :
  Elapsed 16677.490 secs, speedup: 4.47%  ratio: 1.05x
      CPU 433.808 secs, speedup: -21.90%  ratio: 0.82x
 
WU : AR075_1.wu
setiathome_8.12_windows_intelx86__opencl_ati5_sah.exe -verb -nog :
  Elapsed 16971.441 secs
      CPU 338.897 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe  :
  Elapsed 16420.791 secs, speedup: 3.24%  ratio: 1.03x
      CPU 434.338 secs, speedup: -28.16%  ratio: 0.78x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe  :
  Elapsed 16908.043 secs, speedup: 0.37%  ratio: 1.00x
      CPU 455.117 secs, speedup: -34.29%  ratio: 0.74x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe  :
  Elapsed 16489.931 secs, speedup: 2.84%  ratio: 1.03x
      CPU 437.832 secs, speedup: -29.19%  ratio: 0.77x
 
WU : PG1327_v8.wu
setiathome_8.12_windows_intelx86__opencl_ati5_sah.exe -verb -nog :
  Elapsed 880.452 secs
      CPU 59.764 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe  :
  Elapsed 1050.319 secs, speedup: -19.29%  ratio: 0.84x
      CPU 79.514 secs, speedup: -33.05%  ratio: 0.75x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe  :
  Elapsed 1042.097 secs, speedup: -18.36%  ratio: 0.84x
      CPU 78.188 secs, speedup: -30.83%  ratio: 0.76x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe  :
  Elapsed 1046.668 secs, speedup: -18.88%  ratio: 0.84x
      CPU 77.891 secs, speedup: -30.33%  ratio: 0.77x
 
WU : PG1327_v8_1.wu
setiathome_8.12_windows_intelx86__opencl_ati5_sah.exe -verb -nog :
  Elapsed 1052.627 secs
      CPU 70.793 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep0.exe  :
  Elapsed 1049.991 secs, speedup: 0.25%  ratio: 1.00x
      CPU 77.922 secs, speedup: -10.07%  ratio: 0.91x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_Sleep1.exe  :
  Elapsed 1040.366 secs, speedup: 1.16%  ratio: 1.01x
      CPU 77.376 secs, speedup: -9.30%  ratio: 0.91x
MB8_win_x86_SSE2_OpenCL_ATi_HD5_SwitchTothread.exe  :
  Elapsed 1040.912 secs, speedup: 1.11%  ratio: 1.01x
      CPU 77.969 secs, speedup: -10.14%  ratio: 0.91x

Summary: running on a busy system makes the variation in results too big to discriminate clearly between these sleep versions. But the tendency is that the current Sleep(1) is an adequate approach. SwitchToThread could be used in other places to extract even more free CPU cycles from the GPU app, but it can't replace Sleep(1) in bulk sleep areas.
This test shows the noise level ONLY, because the differing part was not used at all.
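
The noise level can be estimated from the numbers above. The Python sketch below assumes that PG1327_v8_1.wu is a repeat of PG1327_v8.wu (the file names suggest this, but the post does not say so) and compares the run-to-run spread of the stock app with the spread between the three sleep variants in a single run. relative_spread is a hypothetical helper.

```python
def relative_spread(values: list[float]) -> float:
    """Spread between the largest and smallest value, in percent of the smallest."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


# Elapsed secs of the stock 8.12 app on PG1327_v8.wu and PG1327_v8_1.wu.
stock_runs = [880.452, 1052.627]

# Elapsed secs of Sleep0, Sleep1 and SwitchTothread on PG1327_v8_1.wu.
variants = [1049.991, 1040.366, 1040.912]

print(f"stock app, run to run: {relative_spread(stock_runs):.1f}%")
print(f"between sleep variants: {relative_spread(variants):.1f}%")
```

The same binary differs by almost 20% between the two runs, while the three variants differ by under 1% within one run, so differences of a few percent between variants cannot be separated from noise on a busy host.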