Seti@Home optimized science apps and information

Optimized Seti@Home apps => Windows => GPU crunching => Topic started by: efmer (fred) on 21 Jun 2009, 03:10:33 pm

Title: The same time with CPU and GPU
Post by: efmer (fred) on 21 Jun 2009, 03:10:33 pm
The GPU does one WU in about 10 minutes (with vlarkill V11), the CPU in about 70 minutes (optimized). So 2 GPU and 8 CPU WUs are running...
BOINC Version 6.6.36

I set the <flops> statement to the right value, so at startup it is OK.
But when one CPU WU completes, the estimated time for a GPU WU goes up from 10 minutes to 50 minutes.
This causes BOINC to panic and start and stop lots and lots of GPU WUs (>120 waiting) until it crashes, probably because it leaves too many GPU tasks in memory.

A solution may be to limit the work buffer to < 2 days. But this leaves the GPU with only a couple of hours of work...

Is there any way to solve this?
Title: Re: The same time with CPU and GPU
Post by: Raistmer on 21 Jun 2009, 04:26:01 pm
Try disabling the leave-in-memory option, and get rid of that buggy BOINC, for God's sake! What was wrong with 6.6.20 that made it worth getting into trouble with new bugs in new versions?...
I would suggest a BOINC upgrade ONLY if there is some bug in the current version that prevents your host from working well; there is no other reason to upgrade.
It seems they don't get better, they just replace known bugs with yet-to-be-discovered ones ;)

As another solution, try increasing the flops count for the CPU version.
Fool BOINC, in other words ;)
Title: Re: The same time with CPU and GPU
Post by: sunu on 21 Jun 2009, 06:38:08 pm
When the flops numbers are set correctly, there should be no problem. Problems begin when you start crunching a very short WU (if there are many in a row the problem is bigger) or other not "normal" WUs. They throw the Duration Correction Factor out of balance, so completion times go crazy and everything you describe happens.
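
To make the effect concrete, here is a minimal sketch of how a single shared Duration Correction Factor (DCF) could turn a 10 minute GPU estimate into 50 minutes. It assumes the client estimates runtime roughly as the task's operation estimate divided by the app's flops, scaled by one project-wide DCF; that model and all numbers below are illustrative assumptions chosen to match the times reported in this thread, not values taken from BOINC itself.

```python
def estimated_minutes(fpops_est, flops, dcf):
    """Estimated runtime in minutes under the assumed model:
    (fpops_est / flops) * dcf, converted from seconds to minutes."""
    return fpops_est / flops * dcf / 60.0


# Hypothetical values, picked so that DCF = 1.0 reproduces the thread's numbers.
FPOPS_EST = 4.2e13   # assumed operation estimate per WU
CPU_FLOPS = 1.0e10   # gives 70 minutes on the CPU at DCF = 1.0
GPU_FLOPS = 7.0e10   # gives 10 minutes on the GPU at DCF = 1.0

for dcf in (1.0, 5.0):
    cpu = estimated_minutes(FPOPS_EST, CPU_FLOPS, dcf)
    gpu = estimated_minutes(FPOPS_EST, GPU_FLOPS, dcf)
    print(f"DCF {dcf:.1f}: CPU estimate {cpu:.0f} min, GPU estimate {gpu:.0f} min")
```

Because the same DCF multiplies both estimates in this model, one odd CPU result that pushes it up by a factor of 5 inflates every queued GPU estimate by the same factor, even though the GPU itself has not slowed down.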

As Raistmer says, disable the leave-in-memory option; it will help. To recover sooner when BOINC goes crazy like that, manually return the Duration Correction Factor to its usual value by editing client_state.xml with BOINC closed.
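
The manual edit described above could be scripted along these lines. This is only a sketch: the tag names `<project>`, `<master_url>` and `<duration_correction_factor>` are assumed from memory of client_state.xml and should be checked against your own file, `reset_dcf` is a hypothetical helper, and the default of 1.0 is a placeholder for whatever the "usual value" is on your host. The sample text is made up for illustration.

```python
import re

# Assumed tag names; verify them against your own client_state.xml.
PROJECT_RE = re.compile(r"<project>.*?</project>", re.DOTALL)
DCF_RE = re.compile(
    r"(<duration_correction_factor>)[^<]*(</duration_correction_factor>)"
)


def reset_dcf(state_text, master_url, new_dcf=1.0):
    """Return state_text with the DCF of the matching project set to new_dcf.

    Only <project> blocks whose <master_url> equals master_url are changed.
    """
    def fix_project(match):
        block = match.group(0)
        if f"<master_url>{master_url}</master_url>" not in block:
            return block  # different project: leave untouched
        return DCF_RE.sub(
            lambda m: f"{m.group(1)}{new_dcf:.6f}{m.group(2)}", block
        )

    return PROJECT_RE.sub(fix_project, state_text)


# Made-up sample in the assumed layout, with two projects.
sample = """<client_state>
<project>
    <master_url>http://setiathome.berkeley.edu/</master_url>
    <duration_correction_factor>5.123456</duration_correction_factor>
</project>
<project>
    <master_url>http://example.org/</master_url>
    <duration_correction_factor>0.900000</duration_correction_factor>
</project>
</client_state>"""

fixed = reset_dcf(sample, "http://setiathome.berkeley.edu/")
print(fixed)
```

To apply it to a real file you would read client_state.xml as text with BOINC closed, pass it through `reset_dcf`, and write the result back, keeping a backup copy of the original first.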
Title: Re: The same time with CPU and GPU
Post by: efmer (fred) on 22 Jun 2009, 12:18:38 am
Quote from: Raistmer on 21 Jun 2009, 04:26:01 pm
> Try disabling the leave-in-memory option, and get rid of that buggy BOINC, for God's sake! What was wrong with 6.6.20 that made it worth getting into trouble with new bugs in new versions?...
> I would suggest a BOINC upgrade ONLY if there is some bug in the current version that prevents your host from working well; there is no other reason to upgrade.
> It seems they don't get better, they just replace known bugs with yet-to-be-discovered ones ;)
>
> As another solution, try increasing the flops count for the CPU version.
> Fool BOINC, in other words ;)

The strange part is that "Leave applications in memory when suspended" is already switched off. Or is there another switch?
I did have problems with previous BOINC versions, so I upgraded.
Title: Re: The same time with CPU and GPU
Post by: efmer (fred) on 22 Jun 2009, 12:41:08 am
Quote from: sunu on 21 Jun 2009, 06:38:08 pm
> When the flops numbers are set correctly, there should be no problem. Problems begin when you start crunching a very short WU (if there are many in a row the problem is bigger) or other not "normal" WUs. They throw the Duration Correction Factor out of balance, so completion times go crazy and everything you describe happens.
>
> As Raistmer says, disable the leave-in-memory option; it will help. To recover sooner when BOINC goes crazy like that, manually return the Duration Correction Factor to its usual value by editing client_state.xml with BOINC closed.

Fiddling around with the flops does improve things; it seems to give more or less sensible values. But there is probably still the bug that leaves the application in memory. Thanks. I am now bringing the number of WUs down a bit to see if that improves things, and hope there is no shortage of WUs.
Title: Re: The same time with CPU and GPU
Post by: Raistmer on 22 Jun 2009, 01:25:50 pm
I consider preempting CUDA MB a bug even if the task is not left in memory.
CUDA initialization is costly; a CUDA task should never be preempted without a real and urgent reason. They don't run long enough to be worth preempting anyway.
Title: Re: The same time with CPU and GPU
Post by: efmer (fred) on 22 Jun 2009, 01:40:07 pm
Quote from: Raistmer on 22 Jun 2009, 01:25:50 pm
> I consider preempting CUDA MB a bug even if the task is not left in memory.
> CUDA initialization is costly; a CUDA task should never be preempted without a real and urgent reason. They don't run long enough to be worth preempting anyway.

Is there another BOINC version that doesn't switch between CUDA WUs, or is there any way to prevent it?
I now feed about 100 CUDA WUs at a time and the rest are suspended (works great, but I'm babysitting). If I feed more, BOINC and my computer crash :-[ as BOINC starts switching between WUs.
Title: Re: The same time with CPU and GPU
Post by: Raistmer on 22 Jun 2009, 01:57:46 pm
Quote from: efmer (fred) on 22 Jun 2009, 01:40:07 pm
> Is there another BOINC version that doesn't switch between CUDA WUs, or is there any way to prevent it?
> I now feed about 100 CUDA WUs at a time and the rest are suspended (works great, but I'm babysitting). If I feed more, BOINC and my computer crash :-[ as BOINC starts switching between WUs.

Unfortunately, no, AFAIK.
My BOINC-disobey build will not help here, because you have many CUDA MB tasks preempted - if they all continue to run, you will get CPU fallback very soon...