Author Topic: AK V8 + CUDA MB team work mod (Read 140788 times)

cyclejon · « **Reply #90 on:** 04 Feb 2009, 12:36:03 pm »

Quote from: Raistmer on 04 Feb 2009, 12:18:07 pm

The SSE2 one was indeed in the AMD-specific package; that version is now obsolete. Try downloading the new one (first page). There should be _AMD in the app name.
AK_v8 is available in the "Download" section of this site.

Ok, I hadn't seen the new version. What app should I run it against?
The results I labeled as nonCUDA were from the download section.

Raistmer · « **Reply #91 on:** 04 Feb 2009, 12:59:28 pm »

Quote from: cyclejon on 04 Feb 2009, 12:36:03 pm

Quote from: Raistmer on 04 Feb 2009, 12:18:07 pm
The SSE2 one was indeed in the AMD-specific package; that version is now obsolete. Try downloading the new one (first page). There should be _AMD in the app name.
AK_v8 is available in the "Download" section of this site.

Ok, I hadn't seen the new version. What app should I run it against?
The results I labeled as nonCUDA were from the download section.

I think you can just install the new AMD-specific package. The included CPU-based app will be slightly faster (~1-2%, no more) than the usual SSE2 AK_v8 on AMD chips; it has already been tested on Phenoms too (and on previous AMD SSE3-capable CPUs).
Unless you have a Phenom II, no more testing is required at this stage IMHO.

MAOJC · « **Reply #92 on:** 04 Feb 2009, 01:08:38 pm »

Quote from: Raistmer on 26 Jan 2009, 11:57:38 am

Quote from: Slawek on 26 Jan 2009, 11:51:58 am
AMD SSE3 (X2 Athlon ) working on this build ?

No, I will do SSE3 soon.
BTW, does anyone need SSE2 and lower?
That is, does anybody have an SSE2-only CPU with a CUDA-enabled GPU?

Yep have old Opteron (SSE2) OC'd with a 9600 GTX under XP

So the AMD file on the first page will now work?

cyclejon · « **Reply #93 on:** 04 Feb 2009, 01:13:09 pm »

Quote from: Raistmer on 04 Feb 2009, 12:59:28 pm

Unless you have a Phenom II, no more testing is required at this stage IMHO.

Ok, I don't have a Phenom II yet, maybe when my tax refund comes in. I may test just out of curiosity.

Raistmer · « **Reply #94 on:** 04 Feb 2009, 01:19:52 pm »

Quote from: MAOJC on 04 Feb 2009, 01:08:38 pm

Quote from: Raistmer on 26 Jan 2009, 11:57:38 am
Quote from: Slawek on 26 Jan 2009, 11:51:58 am
AMD SSE3 (X2 Athlon ) working on this build ?

No, I will do SSE3 soon.
BTW, does anyone need SSE2 and lower?
That is, does anybody have an SSE2-only CPU with a CUDA-enabled GPU?

Yep have old Opteron (SSE2) OC'd with a 9600 GTX under XP

So the AMD file on the first page will now work?

No, I will assemble a working package and put it on the first page soon.
Currently only SSE3 and up are supported (provided AP is the SSE3 one).

Raistmer · « **Reply #95 on:** 04 Feb 2009, 01:45:45 pm »

Version for SSE2-only CPUs added.

MAOJC · « **Reply #96 on:** 04 Feb 2009, 01:47:19 pm »

Quote from: Raistmer on 04 Feb 2009, 01:19:52 pm

Quote from: MAOJC on 04 Feb 2009, 01:08:38 pm
Quote from: Raistmer on 26 Jan 2009, 11:57:38 am
Quote from: Slawek on 26 Jan 2009, 11:51:58 am
AMD SSE3 (X2 Athlon ) working on this build ?

No, I will do SSE3 soon.
BTW, does anyone need SSE2 and lower?
That is, does anybody have an SSE2-only CPU with a CUDA-enabled GPU?

Waiting with bated breath.

Yep have old Opteron (SSE2) OC'd with a 9600 GTX under XP

So the AMD file on the first page will now work?
No, I will assemble a working package and put it on the first page soon.
Currently only SSE3 and up are supported (provided AP is the SSE3 one).

Grey Shadow · « **Reply #97 on:** 04 Feb 2009, 02:04:00 pm »

Just noticed one strange thing.
If BOINC (I use 6.4.5) downloads new workunits with a shorter deadline, it stops crunching the previous WUs and switches to the new ones. But GPU crunching doesn't start: all workunits are processed only by CPU cores. As a result the GPU stays idle until all short-deadline workunits are finished and computation of the previous ones resumes.

I assume this happens because when GPU crunching starts, the GPU thread becomes fixed to a particular workunit and nothing can be changed before this workunit is finished (completed or aborted). I don't know if such behavior is specific to your 8a mod or if it also affects the stock app, but it would be really great if you could defeat it.

Raistmer · « **Reply #98 on:** 04 Feb 2009, 02:34:18 pm »

Quote from: Grey Shadow on 04 Feb 2009, 02:04:00 pm

Just noticed one strange thing.
If BOINC (I use 6.4.5) downloads new workunits with a shorter deadline, it stops crunching the previous WUs and switches to the new ones. But GPU crunching doesn't start: all workunits are processed only by CPU cores. As a result the GPU stays idle until all short-deadline workunits are finished and computation of the previous ones resumes.

I assume this happens because when GPU crunching starts, the GPU thread becomes fixed to a particular workunit and nothing can be changed before this workunit is finished (completed or aborted). I don't know if such behavior is specific to your 8a mod or if it also affects the stock app, but it would be really great if you could defeat it.

It's specific to the V8 approach to handling the GPU; stock and V7 will not suffer from it.
Restart BOINC when it happens, and consider working in this mode with V8: download a big cache -> suspend network -> process tasks -> resume network -> renew cache. There is no way to repair this other than forbidding BOINC to switch to another task until it finishes the current one.

Grey Shadow · « **Reply #99 on:** 04 Feb 2009, 03:17:06 pm »

Thank you for your response.
I'll try to turn off network communications when I am unable to control BOINC for a long time.

Raistmer · « **Reply #100 on:** 04 Feb 2009, 05:07:25 pm »

Each time the CUDA app finishes, BOINC can decide to switch to another project. In that case the 4 (for a quad) SETI MB/AP apps will continue on the CPU, the GPU will be idle, and another CPU-related app (for another project) will be started.
The fewer cores a PC has, the more probable such a situation is.

A solution could be for the CPU SETI MB app to check regularly whether the GPU app is running and, if not, exit immediately with zero status.
BOINC should treat this exit as a non-error exit. It will restart the task, allowing the CPU team app to call the CUDA app again to continue computations on the GPU.
This should decrease the amount of time the GPU stays idle.

I will try to make such a modification (the CPU app could check whether the GPU is free or busy at each checkpoint interval, for example; right after making a checkpoint is a very good time to exit).
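
For readers who want to see the shape of that check, here is a minimal sketch in Python, not the actual app code. The names `run_cpu_task`, `gpu_app_running` and `save_checkpoint` are hypothetical stand-ins: the real app would use its own checkpointing and its own way of detecting the CUDA process. The stderr message is the one quoted in the following post.

```python
import sys

RESCHEDULE_MSG = "Idle GPU detected, trying to reschedule task to GPU, exiting..."


def run_cpu_task(total_steps, checkpoint_every, gpu_app_running, save_checkpoint):
    """Do CPU work, checking for an idle GPU right after every checkpoint.

    gpu_app_running -- hypothetical callable, True while a GPU app is active
    save_checkpoint -- hypothetical callable that stores the progress so far
    Returns (steps_done, wants_reschedule).
    """
    done = 0
    while done < total_steps:
        done += 1  # stand-in for one unit of real CPU work
        if done % checkpoint_every == 0:
            save_checkpoint(done)
            # Progress is safely on disk, so stopping here loses nothing.
            if not gpu_app_running():
                print(RESCHEDULE_MSG, file=sys.stderr)
                return done, True
    return done, False
```

A caller would follow a `True` result with `sys.exit(0)`, so that the client sees a clean exit and restarts the task from the checkpoint that was just written.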

Raistmer · « **Reply #101 on:** 04 Feb 2009, 05:39:36 pm »

Ok, the app is modified and built; SSSE3 version only for now.

If someone is interested in decreasing GPU idle time, please try it and report here. It will report a rescheduling attempt with this line in stderr:
"Idle GPU detected, trying to reschedule task to GPU, exiting..."

You need to stop BOINC, extract the AK_v8 version from the attached archive in place of the old one, and start BOINC.
Please don't forget: only SSSE3 for now.

[attachment deleted by admin]

Raistmer · « **Reply #102 on:** 05 Feb 2009, 04:51:44 am »

There is a situation where VLAR autokill will waste resources:

If a VLAR task is already near completion on the CPU but rescheduling to the GPU occurs, the almost complete task will be killed by the CUDA app's autokill mod.

To avoid this, a VLAR task will not be killed but processed on the GPU if it was started before (that is, if it already has non-zero progress). If it's a fresh VLAR task, it will be aborted as before.
That way the CPU "investment" in the task will be saved.
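
The rule above boils down to a small decision function. This is only an illustrative Python sketch: `vlar_action`, its arguments and the returned labels are hypothetical names, and how the app decides that a task is VLAR (the angle range threshold) is not given in this thread, so it is passed in as a flag.

```python
def vlar_action(is_vlar, progress):
    """Decide what the CUDA app does with a task it has just received.

    is_vlar  -- True if the task counts as VLAR (threshold not specified here)
    progress -- fraction already done, 0.0 for a fresh task
    """
    if not is_vlar:
        return "process"   # ordinary task, run it on the GPU
    if progress > 0.0:
        return "continue"  # CPU time is already invested, keep the task
    return "abort"         # fresh VLAR task, autokill as before
```

So a VLAR task at 90% done is kept, while the same task at 0% is still aborted.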

An update to the CUDA app is attached.
This update is appropriate for any SSE level of CPU.

ADDON: you can see if it works by this line in stderr:
"VLAR WU (AR: xxxxx )detected, but task partially done already, continuing computations"

[attachment deleted by admin]

Raistmer · « **Reply #103 on:** 05 Feb 2009, 06:01:42 am »

And another update: now it's a hack into the BOINC API.
This build should ignore BOINC's request to suspend execution when BOINC switches to another task.
This should reduce idle GPU time and increase total system performance.
The update is appropriate for all SSE levels.

Special thanks go to Jason who pointed me where to dig

Warning: I don't know if it will work as intended, so consider this update experimental. If you have no time to watch your BOINC installation, or you don't feel able to deal with possible consequences, please don't use it.

I would like to receive some feedback on whether it helps avoid the GPU idle state, or on how it works on your host in general.

[attachment deleted by admin]

Raistmer · « **Reply #104 on:** 05 Feb 2009, 08:58:50 am »

I noticed a bug in the current CUDA mod's behavior. It's mostly an OS problem with correct scheduling, but the net result is an idle GPU:

When I run another CPU-intensive, single-threaded application (quantum chemistry calculations in this particular case), the GPU temperature drops to almost its idle value and the CUDA app's CPU consumption goes to zero. I use a quad, so only 1 core could be busy with the non-BOINC task => the OS could just reschedule the CUDA app onto another core... but it seems Vista doesn't understand that.
And what is the cure in this situation? I restricted the non-BOINC app to CPU#3 only and restricted the CUDA app to CPU#0-2 (all except 3). No process priority was changed, only affinity.
And the GPU temperature recovers, %done begins increasing, and the CUDA app's CPU consumption is restored.

A pure example of a CPU scheduling bug in Vista IMHO.

So I think a CUDA app modification with affinity bound to the first CPU can ease such a situation (I can't reset the CUDA app's affinity every 6 minutes, but I can exclude the first CPU for a very long-running non-BOINC app). Maybe it will also help with other cases of the CUDA app hanging with 0 CPU consumption described before.
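
As an illustration of the manual fix described above (non-BOINC app on CPU#3 only, CUDA app on CPU#0-2), the sketch below builds the two affinity bitmasks in Python. `affinity_mask` is a hypothetical helper, not part of the mod; it only computes the numbers. Applying them is OS-specific: on Windows a mask of this form is what `SetProcessAffinityMask` or the Task Manager affinity dialog works with.

```python
def affinity_mask(cpus):
    """Return a bitmask with one bit set per allowed CPU index (CPU#0 = bit 0)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


# The split described in the post, on a quad-core machine:
other_app_mask = affinity_mask([3])        # 0b1000, non-BOINC app on CPU#3 only
cuda_app_mask = affinity_mask([0, 1, 2])   # 0b0111, CUDA app on CPU#0-2

# The two masks share no CPU, so the two apps never compete for a core.
assert other_app_mask & cuda_app_mask == 0
```

Binding the CUDA app to the first CPU only, as proposed above, would correspond to `affinity_mask([0])`.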
