VBscript Fights Cuda
Leopoldo:
Maik, thanks for the new version with the mod :)
May I ask you for another option in your script, one that would be very useful for "fire-and-forget" crunching? (For me, the script really helps to keep crunching stable without having to check on BOINC very often.)
For safer BOINC handling, could you add a second option to the script and ini file: "don't kill the task, but suspend/resume the project"?
(My OS is Win2003 Server, which works as a file server, mail server, proxy server and WSUS server, so I don't like killing any tasks...)
Frozen jobs can be stopped and restarted legitimately with the commands "boinccmd.exe --project setiathome.berkeley.edu suspend" and "boinccmd.exe --project setiathome.berkeley.edu resume".
Of course, that would mean new parameters in the ini file, named something like "project_name" and "boinc_dir", but you are far more skilled in such things, so it is your decision how to implement it, or whether to do it at all ;)
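A minimal sketch of how the two commands above could be assembled from such settings, written in Python rather than VBScript. The parameter names "boinc_dir" and "project_name" are only the ones suggested in this post, the function name is hypothetical, and nothing here is taken from Maik's script. The sketch only builds the argument lists; it does not run anything.

```python
import ntpath  # Windows path rules, since boinccmd.exe lives on a Windows host here

def boinccmd_args(boinc_dir, project_name, action):
    """Build the argument list for 'boinccmd.exe --project <project> <action>'.

    boinc_dir and project_name stand in for the hypothetical ini settings;
    action is limited to the two operations named in the post.
    """
    if action not in ("suspend", "resume"):
        raise ValueError("action must be 'suspend' or 'resume'")
    exe = ntpath.join(boinc_dir, "boinccmd.exe")
    return [exe, "--project", project_name, action]

# Example: the suspend/resume pair for the project named in the post.
# The directory is an assumed install location.
pair = [
    boinccmd_args("C:\\BOINC", "setiathome.berkeley.edu", action)
    for action in ("suspend", "resume")
]
```

Each list could then be passed to subprocess.run on the machine where BOINC is installed. Whether a suspend/resume cycle actually frees a frozen task is exactly the point under discussion in this thread.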
Maik:
One question: what happens if you don't use the script, a task gets stuck and you press the 'suspend' button?
Answer: Right! Nothing. The process is still in your process list ...
Explanation: If a task gets stuck, there is no more communication between the CPU process and the GPU process. That is what my script detects. If you press the 'suspend' button while there is still no communication between the processes, how should the 'suspend' command reach the GPU process? So it makes no sense to write a modification like this.
Leopoldo:
--- Quote from: Maik on 17 Jan 2009, 08:26:28 am ---One question: what happens if you don't use the script, a task gets stuck and you press the 'suspend' button?
Answer: Right! Nothing. The process is still in your process list ...
Explanation: If a task gets stuck, there is no more communication between the CPU process and the GPU process. That is what my script detects. If you press the 'suspend' button while there is still no communication between the processes, how should the 'suspend' command reach the GPU process? So it makes no sense to write a modification like this.
--- End quote ---
But suspending/resuming the project in BM does help to resume the calculation! Before using your script I never killed tasks myself; I only pressed the suspend/resume buttons in BM!
(I think the bug lies somewhere in the CUDA code that feeds the GPU with work. It is not that the task gets stuck and accepts no communication; rather, a loop inside the task waits for a CUDA answer from the GPU. The suspend/resume button sends commands from BM to the task, and the task (looping, not stuck) receives that message, forcibly breaks the loop and restarts itself.)
OK, this is just my opinion, and I will wait; maybe other crunchers will ask you for the same option later ;)
Please don't blame me; I only modestly asked for a change. If it won't happen, it won't happen.
Maik:
Another user asked me some posts earlier to add bmcmd commands to the script.
I rejected this.
After I modded 044 yesterday I noticed that the CPU-time measurement is too inaccurate.
I've fixed that and am now monitoring the results ... (crunching a VLAR atm).
If this test passes successfully, I'll release the new version.
Edit:
- attached an example log from the new version to show and explain the changes
- initial BreakPerCycle (BPC) is still in use (ini setting)
- if the script detects a LAR then
. . it adds 40% to BPC
. . it changes the measure time on the CPU process from 3 to 5 sec
. . it adds 2 sec to BPC after every 'resetting counter'
- if the script detects a VLAR then
. . it adds 80% to BPC
. . it changes the measure time on the CPU process from 3 to 7 sec
. . it adds 5 sec to BPC after every 'resetting counter'
The 'heavy-looking' CPU usage on the task before the VLAR began is caused by the new measuring procedure.
My host is a quad, so the real CPU usage is normally the shown value / 4.
Atm there is no way to distinguish between quad/dual/single-core systems. I think you can live with that :P
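The LAR/VLAR rules listed in this edit can be read as a small lookup table. The following is a rough sketch in Python, not the script's actual VBScript code. The function and key names are hypothetical, and two things are assumptions: that a normal task keeps the initial BPC and the 3 sec measure time unchanged, and that the percentage is applied once to the initial BPC while the per-reset seconds are added on top.

```python
# Numbers come from the list above; all names are hypothetical.
RULES = {
    "normal": {"bpc_percent": 0,  "measure_sec": 3, "per_reset_sec": 0},  # assumed defaults
    "LAR":    {"bpc_percent": 40, "measure_sec": 5, "per_reset_sec": 2},
    "VLAR":   {"bpc_percent": 80, "measure_sec": 7, "per_reset_sec": 5},
}

def timings(initial_bpc, task_type, resets=0):
    """Return (bpc_seconds, measure_seconds) for a task type.

    initial_bpc is the BreakPerCycle value from the ini file;
    resets is how often the 'resetting counter' event has occurred.
    """
    rule = RULES[task_type]
    # Percentage surcharge on the initial value (assumed to apply once).
    bpc = initial_bpc * (100 + rule["bpc_percent"]) / 100
    # Fixed number of seconds added for every counter reset.
    bpc += resets * rule["per_reset_sec"]
    return bpc, rule["measure_sec"]
```

With an initial BPC of 10 sec, timings(10, "LAR") gives a BPC of 14 sec with a 5 sec measure time, and timings(10, "VLAR", resets=2) gives 28 sec with a 7 sec measure time.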
Edit2:
Need more time to fine-tune the timings. The script terminated a VLAR although it was running fine ... :'(
[attachment deleted by admin]
Maik:
OK, I'm done.
Had a runtime error (overflow), but I wasn't able to reproduce it to figure out why it happened.
Now the script has been running for about an hour without errors ... -> must be Murphy's Law ;D
Please read the update info included in readme.txt before editing the ini and starting the script!
If you have questions about this version, I will be online for the next 2 to 3 hours to answer them here.