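if (use_sleep) { // R: yields the CPU in a polling loop until the readback has finished
    cl_event ev;
    clEnqueueMarker(cq, &ev); // marker completes once all previously queued commands are done
    clFlush(cq);
    size_t wait_time = 0;   // number of polling iterations
    cl_int ret = CL_QUEUED; // initialized so a failed query never leaves it undefined
    do {
        SwitchToThread(); /*nanosleep(100);*/ /*Sleep(use_sleep_ex);*/
        wait_time++;
        err = clGetEventInfo(ev, CL_EVENT_COMMAND_EXECUTION_STATUS, sizeof(ret), &ret, NULL);
    } while (err == CL_SUCCESS && ret > CL_COMPLETE); // a negative ret means the command failed
    cl_ulong start = 0, end = 0;
    err = clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_QUEUED, sizeof(cl_ulong), &start, NULL);
    err |= clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), &end, NULL);
    OCL_LOG_ERR("clGetEventProfilingInfo");
    // profiling timestamps are in nanoseconds; dividing by 1e6 gives milliseconds
    float cur_quantum = (end - start) / (wait_time * 1e6);
    clReleaseEvent(ev);
    if (use_sleep_ex == 1 && wait_time > 7) SleepQuantumCounter::update(cur_quantum);
    if (verbose == 6) {
        if (use_sleep_ex == 1) fprintf(stderr, "current sleep quantum %2.4gms\t", cur_quantum);
        // wait_time is size_t, so cast it explicitly instead of printing it with %d
        fprintf(stderr, "Sleep before triplet result map: Awaited %lu iterations for completion; elapsed %2.4gms\n",
                (unsigned long)wait_time, (end - start) / 1e6);
    }
}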
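The quantum arithmetic in the snippet above can be checked in isolation. This is only a minimal sketch, not part of the app: `quantum_ms` is a hypothetical helper name, and it assumes the timestamps are in nanoseconds, as OpenCL profiling counters are.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not in the original source): average time per polling
// iteration in milliseconds, given profiling timestamps in nanoseconds and
// the number of loop iterations.
inline double quantum_ms(std::uint64_t start_ns, std::uint64_t end_ns,
                         std::size_t iterations) {
    assert(iterations > 0);      // the do/while loop always runs at least once
    assert(end_ns >= start_ns);  // unsigned subtraction would wrap otherwise
    return static_cast<double>(end_ns - start_ns) /
           (static_cast<double>(iterations) * 1e6);
}
```

For example, 10 iterations spread over 10,000,000 ns come out as a quantum of 1 ms per iteration.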
Now going to the hospital to see my new grandchild.
It's just 10 hours old.
I don't think changing the high-precision timer is a good idea for stock development.
Especially for hosts that are used for things other than crunching.
In the next testing window, please also try with the CPU busy and with the mandatory -use_sleep option.
There is one very important difference between your GPU and my C-60 regarding this test. My C-60 is one of the slowest ATi devices, so it gets the low-performance path with sleep enabled by default.
Hence I omit the option in my testing (still running, BTW, on the netbook).
And all changes between these builds are wrapped in if(use_sleep), so enabling sleep is a requirement.
EDIT: and as usual these days, it seems you need some full-length tasks, not the PG set. The GPU is too fast.
For example, all the CPU time you see most probably came from startup code, not from icfft loop processing.
No tuning line at all, so fully default.
CPU clock fixed at 1GHz again after reboot.
Binaries used for this test are attached so readers can repeat it on any host equipped with an ATi GPU.
Sleep0:  class SleepQuantum: total=91.016396, N=40, <>=2.2754099, min=0.076670475 max=4.2926121
Sleep1:  class SleepQuantum: total=66.940231, N=62, <>=1.0796812, min=0.80534756 max=8.826087
STT:     class SleepQuantum: total=162.07121, N=33, <>=4.9112489, min=4.226912 max=6.0012178
default: class SleepQuantum: total=46.431198, N=47, <>=0.98789783, min=0.90345198 max=1.0177377
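For readers unfamiliar with these lines: the `<>` field is the mean, which matches total/N in the posted numbers (91.016396 / 40 = 2.2754099). The counter class itself is not shown in the thread, so the following is only a guess at its shape. `QuantumStats` and its methods are hypothetical names, and the output format merely imitates the lines above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>

// Hypothetical stand-in for the SleepQuantum counter; the real class is not
// shown in the thread. It accumulates per-wait quantum samples (milliseconds).
class QuantumStats {
public:
    void update(double sample_ms) {
        total_ += sample_ms;
        ++n_;
        min_ = std::min(min_, sample_ms);
        max_ = std::max(max_, sample_ms);
    }
    std::size_t count() const { return n_; }
    double total() const { return total_; }
    double mean() const {
        assert(n_ > 0);  // the mean is undefined without samples
        return total_ / static_cast<double>(n_);
    }
    double min() const { return min_; }
    double max() const { return max_; }

    // One-line summary in the same field order as the posted statistics.
    std::string summary() const {
        char buf[200];
        std::snprintf(buf, sizeof(buf),
                      "class SleepQuantum: total=%.8g, N=%lu, <>=%.8g, min=%.8g max=%.8g",
                      total_, static_cast<unsigned long>(n_), mean(), min_, max_);
        return std::string(buf);
    }

private:
    double total_ = 0.0;
    std::size_t n_ = 0;
    double min_ = std::numeric_limits<double>::infinity();
    double max_ = -std::numeric_limits<double>::infinity();
};
```

A wide gap between min and max (as in the Sleep0 line) suggests the quantum was unstable during that run, while a tight range (as in the default line) suggests a steady quantum close to 1 ms.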
I see that -use_sleep was not used.
Was this an idle-CPU or a busy-CPU run?
r3500: class SleepQuantum: total=2.8579862, N=3, <>=0.95266207, min=0.93661302 max=0.97626472
Sleep0: class SleepQuantum: total=4.8358912, N=2704, <>=0.0017884213, min=0.00054984231 max=0.4228799
Sleep1: class SleepQuantum: total=2148.8459, N=1791, <>=1.1998023, min=0.86739361 max=3.0483601
STT: class SleepQuantum: total=3.9076965, N=2704, <>=0.001445154, min=0.0004952898 max=0.0027276319
The same question: CPU idle or busy? Or, maybe, only a single CPU core free?
Sleep behavior strongly depends on host load; that's why I always ask for a full description of the test conditions.
And for the previous run without sleep enabled, there is no explanation why these builds consume much more CPU :o
I had to remove r3500 from this bench because it doesn't even start with all cores in use.
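One way to see that dependence on a given host is to time a batch of short sleeps with the CPU idle and again with all cores busy. The sketch below is not the app's code: it swaps the Windows Sleep(1) call for the portable `std::this_thread::sleep_for`, and `measure_sleep_ms` is a hypothetical name.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: average real duration, in milliseconds, of a requested
// sleep. Uses std::this_thread::sleep_for as a portable substitute for the
// Windows Sleep() call discussed in the thread.
inline double measure_sleep_ms(std::chrono::milliseconds requested, int repeats) {
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    for (int i = 0; i < repeats; ++i) {
        std::this_thread::sleep_for(requested);  // sleeps at least 'requested'
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - t0;
    return elapsed.count() / repeats;
}
```

Calling `measure_sleep_ms(std::chrono::milliseconds(1), 100)` under both load conditions and comparing the two averages gives a rough picture of how much the effective quantum shifts on that host.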
OK for now; there is a separate issue we just discovered...
Also, the sleep versions don't even start.
Zero CPU usage on the GPU task, so I aborted after 5 minutes.
Not even wisgen started.
Please remove all wisgen tasks, run the bench, wait ~5 minutes, locate stderr.txt in the ScienceApps folder and attach it as is.
Host reset after 5 minutes.
Could it be a power issue? Maybe a stronger power supply is needed?
The FX can't use all 8 cores permanently.
stderr attached.
Thanks. The app processed OK for some time and even found some spikes.