Seti@Home optimized science apps and information
Optimized Seti@Home apps => Windows => Topic started by: hiamps on 01 Sep 2010, 09:11:30 am
-
I noticed that it made my completion times on my GPUs go way up and put them in High Priority mode. I would be careful doing this while Seti is running if you don't already have lots of work, or you may not get much depending on how long it takes to settle back down. Before the installer, GPU estimates were around 14 minutes; now 1 hour 14 minutes. I was using the "j" version with CUDA 3.1, and it worked fine.
-
Still using the 'f' version, and times to complete did quickly drop to normal values, of course
depending on their AR.
Probably that's the reason I got almost 1000 MB tasks yesterday and the day before, 30 and 31 July.
This 470 crunches them very quickly; only VLARs take some more time, but as SETI is still down, I can't
see anything. (And I caught a nasty cold :-\ )
-
I work with it too, on my GTX 460.
I think it is 1% (20 sec) faster than the Berkeley Fermi app.
Why can nobody download the "Lunatics_x32f_win32_cuda30_preview.exe" separately? I think it is bad for fast testing to have to go with the installer version.
Greetings
PS: I'm going back to the Berkeley app; I have many -226 errors
-
I noticed that it made my completion times on my GPUs go way up and put them in High Priority mode. I would be careful doing this while Seti is running if you don't already have lots of work, or you may not get much depending on how long it takes to settle back down. Before the installer, GPU estimates were around 14 minutes; now 1 hour 14 minutes. I was using the "j" version with CUDA 3.1, and it worked fine.
As the release notes say - Though not 'essential', <flops> app_info entries are highly 'recommended', but this installer doesn't put them in. We haven't come up with a good way for the installer to do this automatically yet...
IOW, if you had <flops> in the previous app_info.xml you'll need to manually copy them into the new one to get the same estimates. The old app_info.xml is of course in the setiathome.berkeley.edu\oldApp_backup\ folder.
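For reference, a <flops> entry sits inside each <app_version> block of app_info.xml. A minimal sketch (the numeric value below is purely illustrative, and other elements such as <file_ref> are omitted; carry over the figure from your old app_info.xml):

```xml
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>
    <!-- illustrative value only: copy the number from your old app_info.xml -->
    <flops>3.0e10</flops>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
</app_version>
```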
Joe
-
I'll do it as you say!!!!!!
I'm going back to the Berkeley app; I have many -226 errors and my cruncher stood still for 24 h :( :( :( :(
-
Why can nobody download the "Lunatics_x32f_win32_cuda30_preview.exe" separately? I think it is bad for fast testing to have to go with the installer version.
IOW, if you had <flops> in the previous app_info.xml you'll need to manually copy them into the new one to get the same estimates. The old app_info.xml is of course in the setiathome.berkeley.edu\oldApp_backup\ folder.
Joe
Also, I think it should be possible to download the app separately for those who know how to edit app_info, especially if you have flops, since you need to edit the app_info anyway.
-
Contrary to popular perception, the main purpose of the installer is to reduce the maintenance load on the developers & the people providing support.
If you want to access the executables & support files for customisation, you can tell the installer to install to an empty folder somewhere other than the project directory, and it will put whatever files you asked for there. Either using those files in a roll-your-own fashion, or modifying the aistubs as needed and running the supplied aimerge batch file, would be sufficient.
Sorry, I'm not going to be the one making umpteen different packages ... I did it before with AKv8, and it's too much work.
Jason
-
Good tip, I hadn't thought of installing it to a different folder; actually I didn't think it would work.
Another option is to open the installer in a program like 7-zip and extract the needed files.
-
Found an interesting one this morning. It was an angle range of 0.34, which I've been completing in around 1 hour 40-some minutes. This one showed me with a CPU time of 00:06:31 and an elapsed time of 05:14:51. I think these are the ones that used to give me the -1 errors when they hung like that. I don't know if it's something you did in the new x32f, but if so I'm grateful. So far I haven't had one -1 error this time.
I looked it up in my tasks list and my wingman had completed it on his CPU. From what I could tell it took him about 3000 seconds longer than others he had completed, but I didn't check the angle ranges on the rest of his work. Will be interesting to see for sure if I get credit for it when we come back from the outage. I blamed my playing my Solitaire game, but now I'm wondering if it might be something in the WUs.
-
...but now I'm wondering if it might be something in the WUs.
In actuality, the reason VLARs have been a problem is certain long-running Pulse-finding lengths, which likely require a more sophisticated understanding of the multibeam algorithms, and corresponding code design, than is evident in the original Cuda multibeam codebase, especially when we consider now-obvious issues like the 2-second driver timeout & recovery mechanism in Vista & Win7. These problem sizes aren't entirely restricted to the lower angle ranges, but occur less frequently at most ARs, so it's still possible to 'choke' on many tasks.
As my personal understanding, and familiarity with the language, tools & hardware capabilities, grows, we all aim to get things more robust as a foundation for 'optimisation proper'. I hope the Cuda codebase can become at least as solid & dependable as the AKv8b & Astropulse CPU ones have become over time. We already proved Fermi, and earlier cards, can generate trash results really fast. It'll take quite a bit of refinement yet IMO, but things should end up 'pretty good' ;)
Jason
-
IIRC, I only had one -1 last week while I was trying the x32h build, but that might have been before I tried it. This time I haven't seen any yet. I'm still running out that mess I got myself into on the 16th, so it is the same batch of WUs. I don't like taking 5 hours to complete a GPU WU, but it beats erroring out after that length of time. (They usually hit me when I wasn't looking, so I don't know how much time was wasted on them before.)
-
Found an interesting one this morning. It was an angle range of 0.34, which I've been completing in around 1 hour 40-some minutes. This one showed me with a CPU time of 00:06:31 and an elapsed time of 05:14:51. I think these are the ones that used to give me the -1 errors when they hung like that. I don't know if it's something you did in the new x32f, but if so I'm grateful. So far I haven't had one -1 error this time.
I looked it up in my tasks list and my wingman had completed it on his CPU. From what I could tell it took him about 3000 seconds longer than others he had completed, but I didn't check the angle ranges on the rest of his work. Will be interesting to see for sure if I get credit for it when we come back from the outage. I blamed my playing my Solitaire game, but now I'm wondering if it might be something in the WUs.
I think it might be a good idea if you'd capture that WU and the result file, pack them in a .zip or .7z, whatever, and attach them here. Include the <result> section extracted from client_state.xml if you're willing to take the time to find it. Most likely the extra CPU and elapsed time are from something which can't be found by rerunning the task, but Gaussian fitting is very data-dependent and there's a lot of that for those lower midrange ARs. We might as well take advantage of the outage keeping BOINC from immediately deleting everything interesting...
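If digging the <result> section out of client_state.xml by hand is a pain, a few lines of Python can do it. This is only a sketch; the extract_result helper is my own invention, not part of any BOINC tooling, and you'd read the real text from client_state.xml in the BOINC data directory:

```python
import re

def extract_result(xml_text, task_name):
    """Return the first <result>...</result> block mentioning task_name, or None."""
    for match in re.finditer(r"<result>.*?</result>", xml_text, re.DOTALL):
        if task_name in match.group(0):
            return match.group(0)
    return None

# Toy example; the real text would come from reading client_state.xml:
sample = "<result><name>21ap10ag.22479.6202.5.10.29_0</name></result>"
print(extract_result(sample, "21ap10ag.22479.6202.5.10.29_0"))
```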
Joe
-
Ok, let's see if I did this right. Let me know if you need anything else.
Edit: So much for the first try, hopefully this one will work.
-
Unfamiliar tools can sometimes be difficult, and I may not have been clear about what I wanted. The actual WU and the result file are in the project directory with filenames 21ap10ag.22479.6202.5.10.29 and 21ap10ag.22479.6202.5.10.29_0. Maybe create a temporary folder someplace and copy those files to it. If you highlight both, then right-click on the pair, the 7-Zip submenu will offer the option to create an archive with the folder name and either a .zip or .7z extension, and will by default save the created archive in the same folder. I do have the good data you extracted from client_state.xml, but you could save it in the same folder and highlight all three files to make the archive a full set.
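If 7-Zip isn't handy, the same packing can be done with Python's standard zipfile module. A sketch, assuming it's run from the folder holding the copies (the pack helper is hypothetical, not an existing tool):

```python
import zipfile

def pack(archive_name, filenames):
    """Create archive_name containing the given files, deflate-compressed."""
    with zipfile.ZipFile(archive_name, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in filenames:
            zf.write(name)

# e.g. pack("21ap10ag.zip", ["21ap10ag.22479.6202.5.10.29",
#                            "21ap10ag.22479.6202.5.10.29_0"])
```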
Joe
-
If at first you don't succeed, try, try again!!! ;D
-
If at first you don't succeed, try, try again!!! ;D
Well done, third time's the charm! ;) I have the WU running a standalone test on a system which should take between 12 and 13 hours for that AR. Maybe someone else with CUDA capability could check whether it causes any unusual effects.
Joe
-
Will be able to load it up tonight for a look.
-
Just a thought, but could whatever is causing my little 9500GT to hang on these be what's causing stock AMDs to hang? It sure would be great if my problem helped to find a cure for that. I know optimizing AMDs cures it, but if we can find that one wrong piece in the WU, maybe the boys at Berkeley could correct it in the stock WUs.
-
Have run it just now under x32f, both Cuda 3.0 & 3.1 versions, on the 480, looking for anything unusual. Nothing immediately obvious yet. These builds, as usual, have disabled the bench code that causes those rare issues on stock with AMD. ~8 minutes elapsed, ~1 min CPU time. Pretty normal processing for a mid angle range task here. I don't have stock cuda_fermi on hand at the moment to see if that differs.
Will see if I can spot anything in the result files, such as lots of closely spaced triplets or something...
[Edit:] Notes:
- Your result file is 'Strongly Similar' to both mine
- Both detected pulses seem to be at 'fairly short' FFT lengths (i.e. long PulsePoTs), which can run more efficiently on Fermi hardware at this time, but prior gens can choke. I suspect these long PulsePoTs could explain up to around a 50% increased runtime for this task, maybe more, but I would need a chirp/FFT pair breakdown to know for sure. If correct, then it's a 'nasty bastard' task for older/lower-capacity cards, but I'm not prepared to rule out something else interfering with the run time on that machine yet.
Got a Breakdown Joe ?
The lower multiprocessor count of the 9500GT, about half that of my old 9600GSO, would see long PulsePoTs at fftLength 4096 and under split pulsefind kernel execution more often to fit the hardware. That would explain the naturally longer runtime of these tasks on lower classes of GPU, while staying the same as other midrange tasks on higher GPUs. In addition, I did move execution of those kernels to a non-default stream (i.e. not stream 0), and tampered with kernel launch geometry somewhat. That could explain why it runs to completion on x32f, while it suffers timeouts & driver crashes under stock.
Jason
-
Just to show, this is the stderr from one completed back on the 21st... also a 0.39 AR and a 21ap10ag...
Oops, not completed, errored out.. ::)
Stderr output
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
Device 1 : GeForce 9500 GT
totalGlobalMem = 1056505856
sharedMemPerBlock = 16384
regsPerBlock = 8192
warpSize = 32
memPitch = 2147483647
maxThreadsPerBlock = 512
clockRate = 1840363
totalConstMem = 65536
major = 1
minor = 1
textureAlignment = 256
deviceOverlap = 1
multiProcessorCount = 4
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce 9500 GT is okay
SETI@home using CUDA accelerated device GeForce 9500 GT
V12 modification by Raistmer
Priority of worker thread rised successfully
Priority of process adjusted successfully
Total GPU memory 1056505856 free GPU memory 983990272
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
Build features: Non-graphics CUDA VLAR autokill enabled FFTW USE_SSE x86
CPUID: Pentium(R) Dual-Core CPU E5400 @ 2.70GHz
Cache: L1=64K L2=2048K
CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3
libboinc: 6.3.22
Work Unit Info:
...............
WU true angle range is : 0.393971
After app init: total GPU memory 1056505856 free GPU memory 983990272
Cuda error 'cufftExecC2C' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_fft.cu' in line 143 : the launch timed out and was terminated.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : the launch timed out and was terminated.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : the launch timed out and was terminated.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 147 : the launch timed out and was terminated.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 147 : the launch timed out and was terminated.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 160 : the launch timed out and was terminated.
</stderr_txt>
]]>
Hope that helps.
-
Thanks, well it sort of fits the theory, from what I can tell so far. 'cufftExecC2C' would have been the first kernel executed after a pulsefind on the previous cfft pair, a long one of which crashed the driver, or application context, etc. Everything after that is clearly hosed. I reckon in the future we can handle that better.
-
That's me Jason, low class all the way!! ;D Glad I could be of help and give you guys something to play with.
-
...
Got a Breakdown Joe ?
AR=0.39430364685758, First limit=30, Second limit=100 [ChirpRes 0.1665]
FFTLen Stepsize NumCfft Spikes Gaussians Pulses Triplets PoTlen
8 7.463718 27 3538944 189 1365 2835 16621
16 3.731859 53 3473408 795 6075 11925 8310
32 1.865929 107 3506176 3317 24645 49755 4155
64 0.932965 215 3522560 13545 101115 203175 2078
128 0.466482 429 3514368 54483 409575 817245 1039
256 0.233241 857 3510272 218535 1640925 3278025 519
512 0.116621 1715 3512320 876365 6568905 13145475 260
1024 0.058310 3429 3511296 3507867 26316675 52618005 130
2048 0.029155 6859 3511808 14040373 105287445 210605595 65
4096 0.014578 13719 3512064 56179305 421314075 842689575 32
8192 0.007289 27439 3512192 224752849 1685584935 3371292735 16
16384 0.003644 54879 3512256 899082657 0 0 8
32768 0.014788 13525 432800 0 0 0 4
65536 0.003697 16229 259664 0 0 0 2
131072 0.000924 64917 519336 0 0 0 1
-------- ---------- ---------- ---------- ----------
Totals 204399 43349464 1198730280 2247255735 199747049
The lower multiprocessor count of the 9500GT, about half that of my old 9600GSO, would see long PulsePoTs at fftLength 4096 and under split pulsefind kernel execution more often to fit the hardware. That would explain the naturally longer runtime of these tasks on lower classes of GPU, while staying the same as other midrange tasks on higher GPUs. In addition, I did move execution of those kernels to a non-default stream (i.e. not stream 0), and tampered with kernel launch geometry somewhat. That could explain why it runs to completion on x32f, while it suffers timeouts & driver crashes under stock.
Jason
Perryjay did say the GPU had handled other tasks with similar AR much quicker, and given the way the ALFALFA project observes, I'd expect he even had at least several with an AR identical to the full 14 digits supplied in the WU header. About the only place for something unusual in this WU has to be in the data. For Pulse finding, about the only possibility of a slowdown would be if the best_pulse threshold built up gradually, requiring a lot of data to be sent back from GPU to CPU. For Gaussian fitting the situation is similar: there might have been a gradual buildup requiring much data return to the CPU, and even doing the final ChiSqr checks an unusually large number of times might be implicated.
My CPU run actually finished quicker than I'd expected, but that's mostly my not having done many full-length tasks on the test system. The result file is very strongly similar to Perryjay's as expected.
My judgement is that the WU is exonerated; it just happened to be the one being processed when something caused either a GPU slowdown or tied up the CPU so it wasn't getting the next GPU operation started promptly. There's no way to tell whether it was protracted sluggishness or a period of zero progress, of course. Whatever it was, the task took about 3 times as long as usual for similar tasks, which is disturbing, but didn't approach the ~10 times longer which would have risked a -177 error. (The AMD hang on stock CPU apps appears to be permanent unless the user takes action, otherwise it will always reach the time limit.)
Watching for any similar cases is of course called for; at this point, trying to make a special debug build without even a vague theory of possible causes seems impractical.
Joe
-
...
Watching for any similar cases is of course called for; at this point, trying to make a special debug build without even a vague theory of possible causes seems impractical.
Joe
Thanks for the breakdown. I agree those cffts don't look particularly hardcore, and I didn't see anything unusual in execution here.
Perhaps if caught in the act, it would warrant grabbing a HijackThis log or similar to look for interfering processes.
-
You're right Joe, I've finished a few of the same angle range and even very similar WU names. This one caught my eye because of the 5-hour run time. It was the first I'd seen that didn't -1 error out on me. The rest of them so far have been well within normal runtimes.
If you gentlemen are through, someone in another thread is saying something about the upload folder being full. So, I guess I should delete my stuff.
-
Looks like I'll have to go through for a trim session this weekend ;)
-
Looks like I'll have to go through for a trim session this weekend ;)
It was only me ;D
Was going to upload a comparison chart for Raistmer's r449 >> r454 ATI builds.
Nothing that can't wait.
-
Just to finish off my little adventure, the WU validated with 131.55 credits. Here is my stderr..
Stderr output
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
Device 1: GeForce 9500 GT, 1007 MiB, regsPerBlock 8192
computeCap 1.1, multiProcs 4
clockRate = 1840363
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce 9500 GT is okay
SETI@home using CUDA accelerated device GeForce 9500 GT
Priority of process raised successfully
Priority of worker thread raised successfully
size 8 fft, is a freaky powerspectrum
size 16 fft, is a cufft plan
size 32 fft, is a cufft plan
size 64 fft, is a cufft plan
size 128 fft, is a cufft plan
size 256 fft, is a freaky powerspectrum
size 512 fft, is a freaky powerspectrum
size 1024 fft, is a freaky powerspectrum
size 2048 fft, is a cufft plan
size 4096 fft, is a cufft plan
size 8192 fft, is a cufft plan
size 16384 fft, is a cufft plan
size 32768 fft, is a cufft plan
size 65536 fft, is a cufft plan
size 131072 fft, is a cufft plan
) _ _ _)_ o _ _
(__ (_( ) ) (_( (_ ( (_ (
not bad for a human... _)
Multibeam x32f Preview, Cuda 3.0
Work Unit Info:
...............
WU true angle range is : 0.394304
Flopcounter: 45602123959036.234000
Spike count: 0
Pulse count: 2
Triplet count: 0
Gaussian count: 2
called boinc_finish
</stderr_txt>
]]>
-
Used the new installer earlier today to add a GTX465 to the Frozen 920.
No problems to report, other than all Cuda tasks being marked as 6.08... not 6.10.
But the new card is crunching away with the existing GTX295.
Looks like a winner here.
So now the kitties are feeling kinda Fermi...............
-
So long as it is using the 6.10, you are good to go. How are you liking that Fermi? Post some times and credits when you get some in.
-
Used the new installer earlier today to add a GTX465 to the Frozen 920.
No problems to report, other than all Cuda tasks being marked as 6.08... not 6.10.
But the new card is crunching away with the existing GTX295.
Looks like a winner here.
So now the kitties are feeling kinda Fermi...............
As far as the S@H servers are concerned, it's just "SETI@home Enhanced (anonymous platform, nvidia GPU)"; the version number doesn't matter at all. It would be possible to do a conversion to 6.66, and maybe that would be interesting to some. That would make the installer larger and more complex though, since it would need to modify all existing assignments in client_state.xml.
If you put <flops> in the app_info.xml, a slightly higher setting for the 6.10 version would probably make the servers favor it. They do try to use whatever they consider the "best" in terms of productivity. When we figure out a practical method for the installer to set <flops>, perhaps it would be best to do something like that so everybody using x32f gets the same version on new work; it could make future upgrades easier.
Joe
-
I'll just leave it as is, as long as it's just the nomenclature and is not affecting anything else.
The only other side effect I noticed this afternoon is the Seti servers complaining that I don't have a usable version of Seti@home Enhanced in my app_info file.
But I am assuming that's just another bogus Boinc error message.
-
I'm jealous, it never gave me a nifty message like that! ;D
-
LOL...nifty as long as it's just Boinc blowing smoke.
Which is not all that unusual.
-
Odd thing though....
The new GTX465 is slower than my GTX295, even though core, shader, and memory clocks are all significantly higher on the 465.
-
That is odd but I'll let one of the Fermi experts on here explain that one.
-
I am sure one will be by sooner or later.....LOL.
The 295 is clocked at 684/1476/1188 core/shaders/ram.
The 465 is clocked at 775/1550/1700.........
Yet with all that apparent speed advantage, it is taking longer to crunch WUs on the new card.
EDIT...
If anybody wants to have a look, here's the link to the host's tasks.
http://setiathome.berkeley.edu/results.php?hostid=5082339
-
...
The 295 is clocked at 684/1476/1188 core/shaders/ram.
The 465 is clocked at 775/1550/1700.........
Yet with all that apparent speed advantage, it is taking longer to crunch WUs on the new card.
...
I can't explain that, but a few things may be relevant:
1. All the tasks I could find done by the 465 were VHAR, and although they took longer, that may not be a good predictor of what will happen as you get into a normal mix. Much of the work Jason did for the x32 builds was sort of targeting the VLAR problem; he hasn't really begun to apply optimization for different GPUs.
2. The 465 is the first device; I suppose that means it's the GPU that Windows is using to keep the screen refreshed. Even with no dynamic changes, the GPU still has to refresh the screen many times a second.
3. A 465 GPU with 11 multiprocessors each having 32 cores (shaders in graphics terms) is a different beast than a 295 GPU with 30 multiprocessors each having 8 cores. There are other architectural differences, too, and the possible code changes to take advantage of the newer arrangements haven't all been tried. The 465 would probably do better running more than 1 task at a time like other Fermi cards, but we haven't found a way to do that where a host also has non-Fermi cards. BOINC doesn't have a parameter which can be put in an app_info.xml saying "only use this configuration on the 465", for instance.
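For reference, the usual way to run more than one task per GPU is a fractional <count> in the <coproc> section of an <app_version> in app_info.xml (a sketch; other elements omitted). Because it applies to every GPU that app_version covers, there's no way to aim it at only the 465 in a mixed rig:

```xml
<app_version>
    <!-- other app_version elements omitted -->
    <coproc>
        <type>CUDA</type>
        <!-- half a GPU per task, so two tasks run per GPU -->
        <count>0.5</count>
    </coproc>
</app_version>
```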
Joe
-
I didn't think about the fact that it is indeed the card the monitor is running from.
I don't know if I can specify in the BIOS which card to use, or if I would have to physically swap the 295 into the top slot to make that happen.
And sometime in the last 7 hours, the 465 fell back to about half speed due to the OC. Reduced the settings to 763/1525/1700 and restarted it just now. We'll see if that holds.
-
Indeed there is a long way to go with optimisation for Fermi architectures yet. There are simply so many fundamental architectural changes that it is going to take a while to 'invent' the techniques to use them.
At this stage, times when running a single workunit should be roughly similar to a GTX 275, GTX 260-216 OC, or one GPU of a 295. When OCing, the Achilles' heel of the GF100 is the memory controller design, being nVidia's first crack at GDDR5. If it's anything like my 480, you *should* find that if you keep memory clocks near stock, you can get *a bit more* legs out of the core/shaders before instability kicks in. Fortunately the architecture includes several layers of cache in the design, which clock with the core, so memory bandwidth & latency won't become critical until some more general optimisations are made to make full use of the chip.
Also, since you have a GF100 die, albeit cut down, it should be able to be clocked quite a lot higher, provided it has adequate power and cooling. I'd expect the heatsinking & power regulation components on the cut-down board to be reduced for the model though... For example, my eVGA 480 with reference cooling tops out at ~801MHz core, but I know of others that run the same card at 830MHz with water cooling.
For the moment, running 2 instances at a time seems to be the goer, I expect mostly because the kernels are not using the cache as effectively as they should, and sit waiting for memory sometimes. That gives an opportunity for another instance to sneak in. That situation will likely change as use of the hardware improves.
There are several things on my medium-term development schedule that may start to address the situation. They include migrating fully to the driver API, instead of the Cuda runtime, as that removes a layer hiding the cache management capabilities of the chip. That will require moving away from nVidia's CUFFT library, which is good, but its inner partial results are not available for re-use. Better use of the hardware cache, and reuse of already-processed data, are fundamental steps to overcoming reliance on memory speed, along with increasing compute density... which, given that the current applications and the Cuda runtime libraries are built to run on the humblest Cuda-capable cards, are techniques not used yet.
It'll be a long, but interesting road.
Jason
-
Well.......the 465 ran all night with the new settings.
I think I will leave the configuration as is for now. It is crunching its way through a bunch of shorties.
As has been noted, there does not seem to be a way right now to get the 465 to run 2 tasks in a rig with mixed Fermi/non-Fermi cards installed. Awaiting further developments from you folks here.
Meow meow.
-
And then again, I just might.............
The kitties made me do it.
I have another 465 inbound next week
It might be more productive, instead of having a 465 and a 295 in the same rig, to take the 295 out of my air-cooled 920 rig and put it into the Frozen 920 so it can host 2 295s, and put 2 465s into the air-cooled rig so I can take advantage of multi-tasking them.
Meow meow.
-
OK....
Did the ol' switcheroo.
The Frozen 920 now has both of my GTX295s, and the air-cooled rig now has the GTX465, with the app_info mod to run 2 tasks. Showing 99% usage of the GPU now.
And once again, no problems noticed with the new installer.
Thanx again to all of you who make these things possible.
-
OK....
Did the ol' switcheroo.
The Frozen 920 now has both of my GTX295s, and the air-cooled rig now has the GTX465, with the app_info mod to run 2 tasks. Showing 99% usage of the GPU now.
And once again, no problems noticed with the new installer.
Thanx again to all of you who make these things possible.
OK, so 5025084 (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5025084) has one 465 now and another coming, and 5082339 (http://setiathome.berkeley.edu/show_host_detail.php?hostid=5082339) now has [4] NVIDIA GeForce GTX 295. Should be good.
Joe
-
Just a note, everything is running smoothly. Only one error, but on SETI's part: a validate error they didn't get to in time to fix. No -1 errors at all this time around. I'm cleaning up a bunch of _2s and _3s from ghosts timing out and other -9 overflows. Picked up an AP last night. The time is still way off on it, but it is only number seven out of the ten I need for them to get their estimates right. I can live with it. Debating whether or not to go ahead and get it done early.
-
Could this be an option for the next releases? Something like:
...
<user_friendly_name>Seti AK8 AMD x32 SSE3</user_friendly_name>
...
<user_friendly_name>Astropulse Opt. x32 r409 SSE</user_friendly_name>
...
This is just cosmetic; I like to name the applications as they are by design. ::)
-
Could this be an option for the next releases? Something like:
...
<user_friendly_name>Seti AK8 AMD x32 SSE3</user_friendly_name>
...
<user_friendly_name>Astropulse Opt. x32 r409 SSE</user_friendly_name>
...
This is just cosmetic; I like to name the applications as they are by design. ::)
Because those friendly names are in the <app> sections, it would have to be something like
...
<user_friendly_name>S@H AK_v8b AMD SSE3 or CUDA x32j</user_friendly_name>
...
<user_friendly_name>AP r409 SSE or ATI OpenCL r447</user_friendly_name>
That would take extra coding to combine friendly names for whichever apps the user chooses. For myself, I keep the Application column in BOINC Manager collapsed so there's enough width left to display the far more important info in other columns, so long friendly names would be a waste. But I don't object to the idea; if enough others want it, Jason might be persuaded...
Joe
-
Friendly names are really useful, especially when doing some beta testing. I'm using them right now, and in combination with BoincTasks it's easy to pinpoint which host has this build.