Seti@Home optimized science apps and information
Optimized Seti@Home apps => Windows => GPU crunching => Topic started by: Raistmer on 31 Dec 2011, 09:14:56 am
-
2 VLARS running:
(http://gpuz.techpowerup.com/11/12/31/w0.png)
Same 2 tasks after restart:
(http://gpuz.techpowerup.com/11/12/31/6vm.png)
-
I just checked my 6970 while running 3 VHAR WUs, and it looks similar.
I do have my fan speed set at 65%, just to keep the temps lower than 65 C.
-
On my machine it was below 50% after restart.
Same behaviour with Cat 11.12 on Vista 64 Ultimate.
Mike
-
Here is one of Skildude's units.
http://setiathome.berkeley.edu/result.php?resultid=2239022262
7600 seconds for an AR 0.29 unit is way too slow for a 6970.
The unit is marked as CPU, so it was rescheduled.
Mike
-
2 non-VLAR tasks (host running without intervention for more than a day):
(http://gpuz.techpowerup.com/12/01/01/emq.png)
Same tasks after restart:
(http://gpuz.techpowerup.com/12/01/01/b77.png)
And same 2 tasks again after complete BOINC restart (no host reboot):
(http://gpuz.techpowerup.com/12/01/01/fd2.png)
So, it looks like BOINC is responsible for such disgusting performance???
-
Or it's just weird driver behavior, giving different results from restart to restart...
In this picture, one of those 2 tasks finished (small drop). Another task started, but there was no corresponding increase in GPU usage.
Then the second task finished: a great drop in GPU usage, almost to zero (though 1 task was still running!), and then, once both new tasks were past the initialization stage, GPU usage remained unacceptably low...
(http://gpuz.techpowerup.com/12/01/01/8qu.png)
-
Have you tried running 3 WUs at a time? See if that boosts the numbers. I'm still getting high GPU usage from mine.
-
Here's an example of one of my slow running ATI_r521 AP tasks with Cat 11.12:
http://setiathome.berkeley.edu/result.php?resultid=2235201318
I've also had it with Cat 11.5 on the same host.
Strangely, when the HD5770 was fitted to the E8500 and that had Cat 11.5 installed, I never saw any slowdowns, and didn't need to use the -hp switch either.
Claggy
-
I've also noticed this on my host with 11.5, during my driver test back in June.
-
Well, in the existing apps I relied on the OS/driver to clean up GPU state when the process terminated.
It's possible that the new AMD drivers can't clean up properly and need help from the application.
So I will not go further with optimization for now, and will implement full cleanup of OpenCL objects inside the app instead. Maybe this will help with this issue.
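To illustrate the idea of explicit cleanup, here is a minimal sketch in plain Python. It is not the app's code: the object names are hypothetical stand-ins, and the only point is the release order (newest object first, context last). In a real OpenCL host program the corresponding calls would be clReleaseMemObject, clReleaseKernel, clReleaseProgram, clReleaseCommandQueue and clReleaseContext.

```python
from contextlib import ExitStack


def run_task(log):
    """Create stand-in GPU objects, do the work, then release them explicitly."""
    with ExitStack() as stack:
        # Hypothetical stand-ins: each name mirrors one kind of OpenCL object.
        for name in ("context", "queue", "program", "kernel", "buffer"):
            log.append("create " + name)
            # Callbacks run in reverse registration order when the block exits,
            # so the buffer is released first and the context last.
            stack.callback(log.append, "release " + name)
        log.append("compute")
    return log


if __name__ == "__main__":
    for line in run_task([]):
        print(line)
```

The releases also run if the compute step raises an exception, so the cleanup does not depend on the OS or driver noticing that the process has gone away.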
BTW, OpenCL 1.2 is a real breakthrough. It should fully support device splitting.
For example, one can use the full GPU when there are enough work-items, and use only part of the GPU where too few work-items are available (and we have this case in pulse finding in VLAR, for example).
I hope HD69xx will be compatible with OpenCL 1.2 too, not only HD7xxx...
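As a rough illustration of the device-splitting idea, the sketch below estimates how many compute units a launch can keep busy. The function name and the items_per_unit parameter are assumptions made up for this example; in OpenCL 1.2 the actual partitioning would go through clCreateSubDevices, which this sketch does not call.

```python
def compute_units_needed(work_items, total_units, items_per_unit):
    """Return how many compute units a kernel launch can keep busy.

    items_per_unit is a hypothetical tuning value: the number of work-items
    one compute unit should get before it is worth using another one.
    """
    if work_items <= 0:
        return 0
    # Ceiling division: units required to hold all work-items.
    needed = -(-work_items // items_per_unit)
    # Never ask for more units than the device has.
    return min(total_units, needed)


if __name__ == "__main__":
    print(compute_units_needed(100000, 24, 256))  # large launch: whole device
    print(compute_units_needed(512, 24, 256))     # small launch: part of device
```

With a small launch the remaining units would be free for another task, which is the case described above for pulse finding in VLAR.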
-
3 VLARs running, BOINC just started
(http://gpuz.techpowerup.com/12/01/02/5fp.png)
-
Probably that'll be along the right lines. Something similar (with slightly different symptoms) originally happened under CUDA, and was resolved by making sure safe-exit shutdown code was used, rather than terminating the process/thread mid kernel execution or during memory transfers. The side effects on CUDA devices/drivers were more along the lines of sending the driver into failsafe/sticky-downclock, but I can see that similar mechanisms might result in resource leaks or other unpredictable problems like that.
Jason
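A minimal sketch of the safe-exit pattern described above, in plain Python rather than the real CUDA code. The function and parameter names are hypothetical. The stop request is only checked between chunks, so a chunk (standing in for a kernel launch plus waiting for it to complete) is never interrupted halfway.

```python
import threading


def process_chunks(chunks, stop_requested, run_chunk):
    """Run chunks in order, honouring a stop request only between chunks."""
    done = []
    for chunk in chunks:
        if stop_requested.is_set():
            # Safe point: nothing is in flight here, so cleanup can follow.
            break
        # Stand-in for launching a kernel and waiting for it to finish.
        run_chunk(chunk)
        done.append(chunk)
    return done


if __name__ == "__main__":
    stop = threading.Event()

    def fake_kernel(chunk):
        # Simulate a quit request arriving while chunk 2 is running.
        if chunk == 2:
            stop.set()

    print(process_chunks([1, 2, 3, 4], stop, fake_kernel))
```

Chunk 2 still completes even though the request arrives while it is running; the loop only exits at the next check.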
-
One ATI_r521 AP task running slowly, then GPUs suspended and resumed:
(http://gpuz.techpowerup.com/12/01/02/f7.png)
Claggy
-
For giggles I decided to try 4 WUs on my 6970. All seems well.
(http://i841.photobucket.com/albums/zz336/skildude/gpuz2.gif)
-
On my 6970 I get the highest RAC when running 4 Astropulse WUs in parallel. At least at times when there were such things as AP units ;)
-
I should note that I'm not talking about unexpected app termination, where async PCIe transfers or kernel execution may be in progress. I'm talking about a regular app exit, where all work is done (no async operations pending). I just don't deallocate buffers or free queue/context objects, because at process termination that is supposed to be the OS's task: freeing process resources on exit. Of course, the OS can't free GPU resources without help from the corresponding driver. That may be the weak spot, which I hope to strengthen soon.
-
One ATI_r521 AP task running slowly, then GPUs suspended and resumed:
(http://gpuz.techpowerup.com/12/01/02/f7.png)
Claggy
That's exactly my case: very low GPU usage that improves after a restart.
-
Well, many hours running in the 3-task config already, and no GPU load drop.
Always ~99%.
For some reason the x1 and x2 configs are unstable and x3 is stable? Very strange...
Well, when I plot the graphs it will be clearer whether the elapsed time deviation is reduced too in the x3 config, or whether GPU load just improved for some of the 3 tasks and not for others...
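One simple way to compare the elapsed time deviation between configs, as a sketch: divide the standard deviation of the elapsed times by their mean, so that configs with different typical run times can be compared. The function name is made up for this example, and the numbers in the usage lines are placeholders, not measurements from this host.

```python
from statistics import mean, pstdev


def relative_spread(elapsed_seconds):
    """Return the population standard deviation divided by the mean."""
    return pstdev(elapsed_seconds) / mean(elapsed_seconds)


if __name__ == "__main__":
    # Placeholder values only, to show the call; real input would be the
    # per-task elapsed times collected for one config (x1, x2 or x3).
    print(relative_spread([100.0, 100.0, 100.0]))
    print(relative_spread([50.0, 150.0]))
```

A lower value for the x3 config than for x1 and x2 would suggest the run times really became more consistent, not just the GPU load graph.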
-
That's what I fear, Raistmer.
-
On the GTX460 during an NV_r365 bench run, if I resume BOINC CPU use (GPU use suspended), GPU use drops slightly, then drops to ~zero after another 5 secs or so:
(http://gpuz.techpowerup.com/12/01/04/g30.png)
Without doing anything else, GPU use goes back to 98%, before dropping back down again:
(http://gpuz.techpowerup.com/12/01/04/4c9.png)
Claggy
-
So, this issue is common to both drivers.
What about the -hp switch in the benchmark command line? Does it help?
-
Setting the -hp switch with NV_r365 didn't help, but leaving a core free does help (in conjunction with the -hp switch), although GPU load is only ~88% now instead of ~98%.
Starting the bench without the -hp switch and with a core free also gives ~88% GPU load.
Have there been any Microsoft updates that might have interfered with scheduling/thread priority?
Claggy
-
I don't know of any such Windows updates.
The performance drop was reported more than a month ago, and the bug was confirmed... and no more info from the NV side.
-
I saw some of your posts in the regular Nvidia forums. Have you tried registering/posting in their developer forums?
http://forums.developer.nvidia.com/
Claggy
-
No, I only used the bug reporting system, as a registered developer. I will try those forums too.