Author Topic: V9 of modified SETI MB CUDA + opt AP package for full GPU+CPU utilization  (Read 61239 times)

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Quote
No Good. :'( :'(

Still looks like the MB_6.08_mod_CUDA_V9.exe is hung up in Boinc Manager.  It shows "Waiting to run, .04 cpu's, 1 CUDA".  But the GPU temp is still up indicating that it is still running.

I have seen many times in the last hours that the work does complete and starts a new work unit.  But it can show as described above at any time.

Don

IT'S STILL RUNNING. That is the key phrase. So what are you worried about?

Offline Geek@Play

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 330
But it's NOT running.  It has hung up solid 2 times on this work unit.  And I may be able to supply more.

[attachment deleted by admin]
Boinc....Boinc....Boinc....Boinc

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Quote
But it's NOT running.  It has hung up solid 2 times on this work unit.  And I may be able to supply more.

I don't need the result file, but I would like to see the stderr output for this task and the logs from BOINC.

Offline Geek@Play

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 330
Here is another real work unit.  Sorry I sent the wrong file earlier.  I will try to get the other info.

[attachment deleted by admin]
Boinc....Boinc....Boinc....Boinc

Offline Geek@Play

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 330
Hopefully these are what you are looking for.  If not tell me the actual file names you wish to see.

[attachment deleted by admin]
Boinc....Boinc....Boinc....Boinc

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Quote
Hopefully these are what you are looking for.  If not tell me the actual file names you wish to see.

Your logs show a BOINC crash.

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C421FA1 read attempt to address 0x00000034

Engaging BOINC Windows Runtime Debugger...

********************
BOINC Windows Runtime Debugger Version 6.6.5


Dump Timestamp    : 02/24/09 03:43:21

There is no info about the science app there. The stderr of the current task is located in the slot subdirectory.

Offline Geek@Play

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 330
This from the slot running the CUDA app.  That work unit is currently showing "Waiting to run, (0.04 CPU's, 1CUDA" and the progress is stuck at 0.00% .  I have seen progress stuck at many different values.  The GPU temp is still up indicating possible working.

[attachment deleted by admin]
Boinc....Boinc....Boinc....Boinc

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Quote
This from the slot running the CUDA app.  That work unit is currently showing "Waiting to run, (0.04 CPU's, 1CUDA" and the progress is stuck at 0.00% .  I have seen progress stuck at many different values.  The GPU temp is still up indicating possible working.

Try looking into state.sah in that slot every 10-15 minutes and check the value of the <prog> tag.
If it increases, the task is making progress regardless of what BOINC Manager shows.
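
The check described above can be scripted. This is only a minimal sketch, not part of the V9 package: it assumes state.sah holds the progress as a plain number inside a single <prog>...</prog> pair, and the function names read_prog and is_progressing are made up for illustration.

```python
import re
from typing import Optional

# Assumption: the progress value is stored as a plain number, e.g. <prog>0.1234</prog>.
PROG_RE = re.compile(r"<prog>\s*([0-9.eE+-]+)\s*</prog>")


def read_prog(state_text: str) -> Optional[float]:
    """Return the <prog> value found in the text of state.sah, or None if absent."""
    match = PROG_RE.search(state_text)
    if match is None:
        return None
    try:
        return float(match.group(1))
    except ValueError:
        # The tag was there but did not contain a valid number.
        return None


def is_progressing(earlier_text: str, later_text: str) -> bool:
    """True if <prog> grew between two snapshots of state.sah."""
    earlier = read_prog(earlier_text)
    later = read_prog(later_text)
    return earlier is not None and later is not None and later > earlier


if __name__ == "__main__":
    # Two made-up snapshots, standing in for copies taken 10-15 minutes apart.
    first = "<state>\n<prog>0.2501</prog>\n</state>"
    second = "<state>\n<prog>0.3127</prog>\n</state>"
    print(is_progressing(first, second))
```

To use it on a real host, read state.sah from the slot directory of the CUDA task twice, some minutes apart, and pass both texts to is_progressing. The slot path depends on the BOINC data directory and the slot number, so it is left out here.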

Offline Geek@Play

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 330
Work is being finished, uploaded, etc.  New work started.  Confirming messages in Boinc message area.  Boinc Manager continues to show work as stated earlier.  "Waiting to run, 0.04 cpu's, 1 CUDA".  Still using the dll's downloaded from Seti fanout earlier today.  Three more error files included here.

[attachment deleted by admin]
Boinc....Boinc....Boinc....Boinc

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Quote
Work is being finished, uploaded, etc.  New work started.  Confirming messages in Boinc message area.  Boinc Manager continues to show work as stated earlier.  "Waiting to run, 0.04 cpu's, 1 CUDA".  Still using the dll's downloaded from Seti fanout earlier today.  Three more error files included here.

There are no errors in these stderr files.
Can you specify what problem you have? "Waiting to run" is not an error either; it's just a message that the task is awaiting its turn.

Offline Geek@Play

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 330
Problem is that sometimes work is actually stopped or appears to be stopped.  Perhaps it's not but I don't know if it is stopped or not because of the "Waiting to run, 0.04 cpu's, 1 CUDA" message in Boinc Manager.  I have proven that the work is being done while showing that message in Boinc Manager.  It is impossible to know the true state of the work with this message.  That is the big problem.  I just cannot tell the state of the work with that message in Boinc Manager even though it is being crunched.

This state cannot be overlooked.  You will be flooded with complaints if this is released.
Boinc....Boinc....Boinc....Boinc

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Quote
Problem is that sometimes work is actually stopped or appears to be stopped.  Perhaps it's not but I don't know if it is stopped or not because of the "Waiting to run, 0.04 cpu's, 1 CUDA" message in Boinc Manager.  I have proven that the work is being done while showing that message in Boinc Manager.  It is impossible to know the true state of the work with this message.  That is the big problem.  I just cannot tell the state of the work with that message in Boinc Manager even though it is being crunched.

This state cannot be overlooked.  You will be flooded with complaints if this is released.

It is released already. V9 includes (among others) a mod that refuses to stop the CUDA app when BOINC wants to stop it.
This should decrease the probability of having 5 CPU apps running on a quad while the GPU sits idle.
You are probably seeing the effect of this mod.
As I already said, you can see whether the app is actually running and making progress by looking into the state.sah file.
My main concern is to ensure full utilization of both CPU and GPU. What BOINC thinks or writes about the situation, and how it interprets it, is not my concern.
These mods actually hack BOINC and cheat it. When BOINC is able to run both CUDA MB and opt AK_v8 simultaneously on the same host, the need for these team mods will disappear.

Offline Geek@Play

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 330
I installed using "AstroPulse & CudaMB, 1+ GPUs, no Team (req SSE3)".  I do NOT have the <ncpus> in the cc_config file.  I am therefore running 4 AP work units on the CPU's and 1 CUDA on the video card.  This is exactly the configuration I want on my quads.
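
For anyone wanting to confirm whether <ncpus> is set on their own host, a minimal sketch follows. It assumes the usual cc_config.xml layout, with <ncpus> sitting under <options> inside the <cc_config> root; the function name configured_ncpus is made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def configured_ncpus(cc_config_xml: str) -> Optional[int]:
    """Return the <ncpus> override from cc_config.xml text, or None if it is not set.

    Assumes the layout <cc_config><options><ncpus>N</ncpus></options></cc_config>.
    """
    root = ET.fromstring(cc_config_xml)
    node = root.find("./options/ncpus")
    if node is None or node.text is None:
        return None
    return int(node.text.strip())


if __name__ == "__main__":
    # Made-up examples: one file with the override, one without it.
    with_override = "<cc_config><options><ncpus>5</ncpus></options></cc_config>"
    without_override = "<cc_config><options></options></cc_config>"
    print(configured_ncpus(with_override))
    print(configured_ncpus(without_override))
```

A result of None corresponds to the setup described here, where BOINC uses the real number of CPUs.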

To be unable to monitor the progress of the CUDA-MB work in Boinc Manager is unsatisfactory.  You seem to feel that it's ok since it is crunching the data.  With respect, I humbly disagree.
« Last Edit: 24 Feb 2009, 09:14:56 pm by Geek@Play »
Boinc....Boinc....Boinc....Boinc

eschamali

  • Guest
For ones who wanna test x64 SSSE3 version of V9 combo - it attached to this post.
But keep in mind that I haven't ability to test it still. My GPU host running x86 Vista.


13 downloads at moment I write this.... And no single repost if it runs or not  :-\

Come on lurkers! fess up!

Been running this 64bit package for a few days now on my i7, and everything seems to go just fine. Seeing a nice speedup on the x64 AK also.

http://setiathome.berkeley.edu/show_host_detail.php?hostid=4496384

chelski

  • Guest
Only been able to run the new V9 app for 24 hours or so (after the upload issue had cleared and no completed units were at risk).  So far so good.  With V8 there was some intervention needed from Maik's script to clear app hangs (where it got stuck at 0:00:00 CPU time), but the experience with V9 seems very positive.

Thanks and great work.

Quote
CPU type   GenuineIntel
Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz [x86 Family 6 Model 15 Stepping 11]
Number of processors   2
Coprocessors   NVIDIA GeForce 9600 GSO (383MB)
Operating System   Microsoft Windows XP
Professional x86 Editon, Service Pack 3, (05.01.2600.00)

 
