Here is data from the GT720 on a busy i5-3470 (high_prec timer enabled):
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep0.exe -verb -nog :
Elapsed 3018.575 secs, speedup: 46.57% ratio: 1.87x
CPU 358.599 secs, speedup: 90.99% ratio: 11.10x
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep1.exe -verb -nog :
Elapsed 3024.707 secs, speedup: 46.46% ratio: 1.87x
CPU 326.494 secs, speedup: 91.80% ratio: 12.19x
MB8_win_x86_SSE3_OpenCL_NV_SoG_STT.exe -verb -nog :
Elapsed 3034.625 secs, speedup: 46.29% ratio: 1.86x
CPU 334.591 secs, speedup: 91.59% ratio: 11.89x
Sleep0:class SleepQuantum: total=5073.9668, N=3152, <>=1.609761, min=0.011221858 max=8.9496584
Sleep1:class SleepQuantum: total=3132.7358, N=3153, <>=0.99357305, min=0.85221332 max=3.1896715
STT:class SleepQuantum: total=15702.391, N=2136, <>=7.3513065, min=0.01114194 max=16.63485
Nothing new here; this just supports the previous conclusions.
GT720 on busy i5-3470 (timer at default after host power cycle):
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep0.exe :
Elapsed 3095.420 secs, speedup: 45.21% ratio: 1.83x
CPU 268.571 secs, speedup: 93.25% ratio: 14.82x
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep1.exe :
Elapsed 3051.709 secs, speedup: 45.98% ratio: 1.85x
CPU 273.017 secs, speedup: 93.14% ratio: 14.58x
MB8_win_x86_SSE3_OpenCL_NV_SoG_STT.exe :
Elapsed 3014.658 secs, speedup: 46.64% ratio: 1.87x
CPU 319.958 secs, speedup: 91.96% ratio: 12.44x
Sleep0:class SleepQuantum: total=38777.035, N=1595, <>=24.311621, min=3.4529493 max=51.648083
Sleep1:class SleepQuantum: total=24096.59, N=1575, <>=15.299422, min=14.763614 max=15.852066
STT:class SleepQuantum: total=13195.877, N=2761, <>=4.7793832, min=0.012540359 max=23.805403
And here the advantage of STT finally appears. With the sleep quantum at ~15 ms (Sleep1 average 15.3 ms), STT stayed in the ~4-5 ms range (average 4.78 ms).
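The comparison above can be reproduced directly from the SleepQuantum lines. The following is only a sketch: it assumes that `<>` is the arithmetic mean (total/N, which the numbers appear to confirm) and that all values are in milliseconds. The function name `parse_sleep_quantum` and the regular expression are illustrative and not part of the application.

```python
import re

# Lines copied from the "timer at default" run above.
LINES = [
    "Sleep0:class SleepQuantum: total=38777.035, N=1595, <>=24.311621, min=3.4529493 max=51.648083",
    "Sleep1:class SleepQuantum: total=24096.59, N=1575, <>=15.299422, min=14.763614 max=15.852066",
    "STT:class SleepQuantum: total=13195.877, N=2761, <>=4.7793832, min=0.012540359 max=23.805403",
]

# Hypothetical pattern for this log format; "\s*" tolerates "STT: class".
PATTERN = re.compile(
    r"^(?P<build>\w+):\s*class SleepQuantum: total=(?P<total>[\d.]+), "
    r"N=(?P<n>\d+), <>=(?P<mean>[\d.]+), "
    r"min=(?P<min>[\d.]+) max=(?P<max>[\d.]+)$"
)


def parse_sleep_quantum(line):
    """Parse one SleepQuantum line into a dict of numbers (assumed ms)."""
    m = PATTERN.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return {
        "build": m.group("build"),
        "total": float(m.group("total")),
        "n": int(m.group("n")),
        "mean": float(m.group("mean")),
        "min": float(m.group("min")),
        "max": float(m.group("max")),
    }


stats = {s["build"]: s for s in map(parse_sleep_quantum, LINES)}

for name, s in stats.items():
    # The reported mean should match total / N if "<>" is a plain average.
    print(f"{name}: reported mean={s['mean']:.3f} ms, total/N={s['total'] / s['n']:.3f} ms")

# How much shorter the average STT sleep is than the average Sleep1 sleep.
ratio = stats["Sleep1"]["mean"] / stats["STT"]["mean"]
print(f"Sleep1/STT mean ratio: {ratio:.2f}")
```

With the default timer, the average Sleep1 sleep comes out roughly 3.2 times longer than the average STT sleep.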
CPU idle:
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep0.exe :
Elapsed 29005.994 secs, speedup: -413.41% ratio: 0.19x (suspended overnight)
CPU 1725.527 secs, speedup: 56.64% ratio: 2.31x
MB8_win_x86_SSE3_OpenCL_NV_SoG_Sleep1.exe :
Elapsed 3032.034 secs, speedup: 46.33% ratio: 1.86x
CPU 299.272 secs, speedup: 92.48% ratio: 13.30x
MB8_win_x86_SSE3_OpenCL_NV_SoG_STT.exe :
Elapsed 3007.939 secs, speedup: 46.76% ratio: 1.88x
CPU 1729.599 secs, speedup: 56.54% ratio: 2.30x
Sleep0:class SleepQuantum: total=49.532238, N=3197, <>=0.015493349, min=0.012275338 max=8.6543369
Sleep1:class SleepQuantum: total=24069.621, N=1575, <>=15.282299, min=4.3649426 max=15.597382
STT:class SleepQuantum: total=40.04808, N=3215, <>=0.012456635, min=0.012152191 max=0.0152032
Approximate context switch overhead for the i5-3470 (Sleep0 average minus STT average, CPU idle): 0.015493349 ms - 0.012456635 ms = 0.003036714 ms, i.e. ~3 us
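The same estimate as a small calculation, for anyone who wants to repeat it with their own numbers. This is a sketch only: it takes the two idle-CPU averages from above, assumes they are in milliseconds, and assumes (as the estimate itself does) that the whole difference between Sleep0 and STT is context switch cost. The function name is made up for illustration.

```python
def context_switch_overhead_us(sleep0_mean_ms, stt_mean_ms):
    """Difference of the two average sleep quanta, converted from ms to us."""
    return (sleep0_mean_ms - stt_mean_ms) * 1000.0


# Averages ("<>") from the CPU idle run above.
sleep0_mean_ms = 0.015493349
stt_mean_ms = 0.012456635

overhead_us = context_switch_overhead_us(sleep0_mean_ms, stt_mean_ms)
print(f"approx context switch overhead: {overhead_us:.2f} us")
```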