Author Topic: just installed Unified Installers, v0.37 for Windows  (Read 27838 times)

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: just installed Unified Installers, v0.37 for Windows
« Reply #15 on: 01 Sep 2010, 11:13:37 pm »
If at first you don't succeed try try again!!!   ;D

Well done, third time's the charm! ;) I have the WU running a standalone test on a system which should take between 12 and 13 hours for that AR. Maybe someone else with CUDA capability could check whether it causes any unusual effects.
                                                                                    Joe

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: just installed Unified Installers, v0.37 for Windows
« Reply #16 on: 02 Sep 2010, 03:58:42 am »
Will be able to load it up tonight for a look.

Offline perryjay

  • Knight Templar
  • ****
  • Posts: 427
Re: just installed Unified Installers, v0.37 for Windows
« Reply #17 on: 02 Sep 2010, 08:50:09 am »
Just a thought, but could whatever is causing my little 9500GT to hang on these be what's causing stock AMDs to hang? Sure would be great if my problem helped to find a cure for that. I know optimizing AMDs cures it, but if we can find that one wrong piece in the WU maybe the boys at Berkeley could correct it in the stock WUs.

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: just installed Unified Installers, v0.37 for Windows
« Reply #18 on: 02 Sep 2010, 12:29:33 pm »
Have run just now under x32f, both Cuda 3 & 3.1 versions, on the 480 looking for anything unusual.  Nothing immediately obvious yet.   These builds, as usual, have the bench code disabled that causes those rare issues on stock with AMD.  ~8 minutes elapsed, ~1min CPU time.  Pretty normal processing for a Mid Angle range task here.  I don't have stock cuda_fermi on hand at the moment to see if that differs. 

Will see if I can spot anything in the result files, such as lots of closely spaced triplets or something...

[Edit:] Notes:
  -  Your result file is 'Strongly Similar' to both mine
  -  Both detected pulses seem to be at 'fairly short' FFT Lengths, (i.e. Long PulsePoTs) which can run more efficiently on Fermi hardware at this time, but prior gen can choke.  I suspect these Long PulsePoTs could explain up to around 50% increased runtime for this task, maybe more, but would need a chirp/FFT pair breakdown to know for sure.  If correct then it's a 'nasty bastard' task for older/lower capacity cards, but I'm not prepared to rule out something else interfering with the run time on that machine yet.

Got a Breakdown Joe ?

With the lower multiprocessor count of the 9500GT, about half that of my old 9600GSO, long PulsePoTs at fftLength 4096 and under would have pulsefind kernel execution split more often to fit the hardware.  That would explain the naturally longer runtime of these tasks on lower classes of GPU, while staying the same as other midrange tasks on higher GPUs.  In addition, I did move execution of those kernels to a non-default stream (i.e. not stream 0), and tamper with kernel launch geometry somewhat.  That could explain why it runs to completion on x32f, while suffering timeouts & driver crashes under stock.
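A toy model of that splitting effect, for illustration only. It is not the launch logic of x32f or the stock app: the helper name, `items_per_multiprocessor`, and the workload size are all hypothetical. The only figures taken from the thread are the 9500 GT's 4 multiprocessors and the "about half" comparison.

```python
import math

def launches_needed(work_items, multiprocessors, items_per_multiprocessor=8):
    """Toy model: one launch covers multiprocessors * items_per_multiprocessor
    work items, so a smaller GPU needs more launches for the same workload.
    All parameters are hypothetical; this is not the real pulsefind geometry."""
    capacity = multiprocessors * items_per_multiprocessor
    return math.ceil(work_items / capacity)

work_items = 256  # hypothetical workload size
small_gpu = launches_needed(work_items, multiprocessors=4)  # 9500 GT per the stderr
larger_gpu = launches_needed(work_items, multiprocessors=8)  # "about twice" as many

print(f"4 multiprocessors: {small_gpu} launches")
print(f"8 multiprocessors: {larger_gpu} launches")
```

Under these assumptions, halving the multiprocessor count doubles the number of launches for the same work, which is the direction of the effect described above.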

Jason
« Last Edit: 02 Sep 2010, 01:02:59 pm by Jason G »

Offline perryjay

  • Knight Templar
  • ****
  • Posts: 427
Re: just installed Unified Installers, v0.37 for Windows
« Reply #19 on: 02 Sep 2010, 01:23:38 pm »
Just to show, this is the stderr from one completed back on the 21st... also a 0.39 AR and a 21ap10ag...

Oops, not completed, errored out..   ::)

Stderr output

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 9500 GT
           totalGlobalMem = 1056505856
           sharedMemPerBlock = 16384
           regsPerBlock = 8192
           warpSize = 32
           memPitch = 2147483647
           maxThreadsPerBlock = 512
           clockRate = 1840363
           totalConstMem = 65536
           major = 1
           minor = 1
           textureAlignment = 256
           deviceOverlap = 1
           multiProcessorCount = 4
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce 9500 GT is okay
SETI@home using CUDA accelerated device GeForce 9500 GT
V12 modification by Raistmer
Priority of worker thread rised successfully
Priority of process adjusted successfully
Total GPU memory 1056505856    free GPU memory 983990272
setiathome_enhanced 6.02 Visual Studio/Microsoft C++

Build features: Non-graphics   CUDA    VLAR autokill enabled    FFTW   USE_SSE   x86   
     CPUID: Pentium(R) Dual-Core  CPU      E5400  @ 2.70GHz

     Cache: L1=64K L2=2048K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3
libboinc: 6.3.22

Work Unit Info:
...............
WU true angle range is :  0.393971
After app init: total GPU memory 1056505856    free GPU memory 983990272
Cuda error 'cufftExecC2C' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_fft.cu' in line 143 : the launch timed out and was terminated.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : the launch timed out and was terminated.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : the launch timed out and was terminated.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 147 : the launch timed out and was terminated.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 147 : the launch timed out and was terminated.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 160 : the launch timed out and was terminated.

</stderr_txt>
]]>

Hope that helps.


Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: just installed Unified Installers, v0.37 for Windows
« Reply #20 on: 02 Sep 2010, 01:29:53 pm »
Thanks, well it sort of fits the theory, from what I can tell so far.  'cufftExecC2C' would have been the first kernel executed after a pulsefind on the previous cfft pair, a long one of which would have crashed the driver, or application context, etc.  Everything after that is clearly hosed.  I reckon in the future we can handle that better.
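For anyone wanting to pull the first failing call out of a stderr like the one above, here is a minimal sketch. The regular expression is written against the line format seen in that log and is an assumption about other builds; `first_cuda_error` is a hypothetical helper, not part of any SETI@home or BOINC tool.

```python
import re

# Pattern inferred from the "Cuda error '...' in file '...' in line N : msg"
# lines in the posted stderr; other app builds may format errors differently.
CUDA_ERROR = re.compile(
    r"Cuda error '(?P<call>.+?)' in file '(?P<file>[^']+)' "
    r"in line (?P<line>\d+) : (?P<msg>.+)"
)

def first_cuda_error(stderr_text):
    """Return (call, source file name, line, message) for the first CUDA error,
    or None if the text contains no such line."""
    for raw in stderr_text.splitlines():
        m = CUDA_ERROR.search(raw)
        if m:
            call = m.group("call").split("(")[0]           # drop any argument list
            source = m.group("file").rsplit("/", 1)[-1]    # keep the file name only
            return call, source, int(m.group("line")), m.group("msg").strip()
    return None

sample = (
    "Cuda error 'cufftExecC2C' in file 'd:/BoincSeti_Prog/sinbad_repositories/"
    "LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_fft.cu' in line 143 : "
    "the launch timed out and was terminated.\n"
    "Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BoincSeti_Prog/"
    "sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/"
    "cudaAcc_PowerSpectrum.cu' in line 56 : the launch timed out and was terminated.\n"
)

print(first_cuda_error(sample))
```

Only the first match matters here, since every error after the initial timeout is a knock-on failure.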

Offline perryjay

  • Knight Templar
  • ****
  • Posts: 427
Re: just installed Unified Installers, v0.37 for Windows
« Reply #21 on: 02 Sep 2010, 01:37:53 pm »
That's me Jason, low class all the way!!   ;D  Glad I could be of help and give you guys something to play with.

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: just installed Unified Installers, v0.37 for Windows
« Reply #22 on: 02 Sep 2010, 02:14:42 pm »
...
Got a Breakdown Joe ?

Code: [Select]
AR=0.39430364685758, First limit=30, Second limit=100  [ChirpRes 0.1665]

FFTLen   Stepsize  NumCfft     Spikes  Gaussians     Pulses   Triplets  PoTlen
     8   7.463718       27    3538944        189       1365       2835   16621
    16   3.731859       53    3473408        795       6075      11925    8310
    32   1.865929      107    3506176       3317      24645      49755    4155
    64   0.932965      215    3522560      13545     101115     203175    2078
   128   0.466482      429    3514368      54483     409575     817245    1039
   256   0.233241      857    3510272     218535    1640925    3278025     519
   512   0.116621     1715    3512320     876365    6568905   13145475     260
  1024   0.058310     3429    3511296    3507867   26316675   52618005     130
  2048   0.029155     6859    3511808   14040373  105287445  210605595      65
  4096   0.014578    13719    3512064   56179305  421314075  842689575      32
  8192   0.007289    27439    3512192  224752849 1685584935 3371292735      16
 16384   0.003644    54879    3512256  899082657          0          0       8
 32768   0.014788    13525     432800          0          0          0       4
 65536   0.003697    16229     259664          0          0          0       2
131072   0.000924    64917     519336          0          0          0       1
                  -------- ---------- ---------- ---------- ----------
Totals              204399   43349464 1198730280 2247255735  199747049
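A quick check of the Totals row, using the Pulses and Triplets columns copied from the table above. The Pulses total adds up as printed, while the Triplets total matches the true sum only modulo 2^32. That the tool which printed the table accumulates in an unsigned 32-bit counter is an inference from the numbers, not something stated in the thread.

```python
# Columns copied from the breakdown table (rows with non-zero counts only).
pulses = [1365, 6075, 24645, 101115, 409575, 1640925, 6568905,
          26316675, 105287445, 421314075, 1685584935]
triplets = [2835, 11925, 49755, 203175, 817245, 3278025, 13145475,
            52618005, 210605595, 842689575, 3371292735]

# Values printed in the Totals row of the table.
printed_pulses_total = 2247255735
printed_triplets_total = 199747049

true_pulses_total = sum(pulses)
true_triplets_total = sum(triplets)
wrapped_triplets_total = true_triplets_total % 2**32  # assumed 32-bit wraparound

print("pulses total matches:", true_pulses_total == printed_pulses_total)
print("true triplets total:", true_triplets_total)
print("wrapped triplets total matches:", wrapped_triplets_total == printed_triplets_total)
```

So the per-row Triplets figures look fine; only the printed total for that column appears to have wrapped.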

Quote
With the lower multiprocessor count of the 9500GT, about half that of my old 9600GSO, long PulsePoTs at fftLength 4096 and under would have pulsefind kernel execution split more often to fit the hardware.  That would explain the naturally longer runtime of these tasks on lower classes of GPU, while staying the same as other midrange tasks on higher GPUs.  In addition, I did move execution of those kernels to a non-default stream (i.e. not stream 0), and tamper with kernel launch geometry somewhat.  That could explain why it runs to completion on x32f, while suffering timeouts & driver crashes under stock.

Jason

Perryjay did say the GPU had handled other tasks with similar AR much quicker, and the way the ALFALFA project observes I'd expect he even had at least several with AR identical to the full 14 digits supplied in the WU header. So about the only place anything unusual in this WU could be is in the data. For Pulse finding, about the only possibility of a slowdown would be if the best_pulse threshold built up gradually, requiring a lot of data to be sent back from GPU to CPU. For Gaussian fitting the situation is similar: there might have been a gradual buildup requiring much data return to the CPU, and even doing the final ChiSqr checks an unusually large number of times might be implicated.
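The "gradual buildup" idea can be illustrated with a running-best counter. This is a sketch of the general effect only: the scores are made up, `count_best_updates` is a hypothetical helper, and treating every new best as one GPU-to-CPU return is an assumption based on the description above, not the app's actual code.

```python
def count_best_updates(scores):
    """Count how often a running 'best so far' is replaced while scanning scores.
    Each replacement stands in for one result sent back to the CPU (assumption)."""
    best = float("-inf")
    updates = 0
    for score in scores:
        if score > best:
            best = score
            updates += 1
    return updates

# Same values, different order (made-up data).
early_peak = [9.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # strong candidate seen first
gradual = [1.0, 2.0, 3.0, 4.0, 5.0, 9.0]      # threshold creeps up step by step

print("early peak:", count_best_updates(early_peak), "update(s)")
print("gradual buildup:", count_best_updates(gradual), "update(s)")
```

The same set of values costs one update when the strongest candidate arrives first and six when the best score creeps upward, which is the kind of data-dependent extra traffic being considered.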

My CPU run actually finished quicker than I'd expected, but that's mostly my not having done many full-length tasks on the test system. The result file is very strongly similar to Perryjay's as expected.

My judgement is the WU is exonerated; it just happened to be the one being processed when something either caused a GPU slowdown or tied up the CPU so it wasn't getting the next GPU operation started promptly. There's no way to tell if it was a protracted sluggishness or a period of zero progress, of course. In any case, the task took about 3 times as long as usual for similar tasks, which is disturbing, but it didn't approach the ~10 times longer which would have risked a -177 error. (The AMD hang on stock CPU apps appears to be permanent unless the user takes action; otherwise it will always reach the time limit.)
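The arithmetic behind that margin, as a small sketch. The ~3x slowdown and the ~10x limit are the rough figures from this post; the helper name is hypothetical and the times are expressed in units of a typical run rather than real hours.

```python
def slowdown(elapsed, typical, limit_factor=10.0):
    """Return (ratio to a typical run, fraction of the assumed time limit used).
    limit_factor=10.0 is the rough '~10 times' figure, not an exact bound."""
    ratio = elapsed / typical
    return ratio, ratio / limit_factor

# About 3 times as long as usual, measured in units of a typical run.
ratio, used = slowdown(elapsed=3.0, typical=1.0)
print(f"{ratio:.1f}x typical, {used:.0%} of the assumed limit")
```

Under those figures the task used roughly a third of its time allowance, so it was slow but not close to erroring out.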

Watching for any similar cases is of course called for. At this point, trying to make a special debug build without having even a vague theory of possible causes seems impractical.
                                                                                 Joe

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: just installed Unified Installers, v0.37 for Windows
« Reply #23 on: 02 Sep 2010, 02:24:18 pm »
...
Watching for any similar cases is of course called for. At this point, trying to make a special debug build without having even a vague theory of possible causes seems impractical.
                                                                                 Joe

Thanks for the breakdown, I agree those cffts don't look particularly hardcore, and didn't see anything unusual in execution here. 

Perhaps if caught in the act, it would warrant grabbing a HiJackThis! log or similar to look for interfering processes.

Offline perryjay

  • Knight Templar
  • ****
  • Posts: 427
Re: just installed Unified Installers, v0.37 for Windows
« Reply #24 on: 02 Sep 2010, 02:42:12 pm »
You're right Joe, I've finished a few of the same angle range and even very similar WU name. This one caught my eye because of the 5 hour run time. It was the first I'd seen that didn't -1 error out on me. The rest of them so far have been well within normal runtimes.

If you gentlemen are through, someone in another thread is saying something about the upload folder being full. So, I guess I should delete my stuff.
« Last Edit: 02 Sep 2010, 02:47:23 pm by perryjay »

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: just installed Unified Installers, v0.37 for Windows
« Reply #25 on: 02 Sep 2010, 02:51:43 pm »
Looks like I'll have to go through for a trim session this weekend  ;)

Ghost0210

  • Guest
Re: just installed Unified Installers, v0.37 for Windows
« Reply #26 on: 02 Sep 2010, 02:54:58 pm »
Looks like I'll have to go through for a trim session this weekend  ;)
It was only me  ;D
Was going to upload a comparison chart for Raistmer's r449>>r454 ATI build
nothing that can't wait

Offline perryjay

  • Knight Templar
  • ****
  • Posts: 427
Re: just installed Unified Installers, v0.37 for Windows
« Reply #27 on: 03 Sep 2010, 07:24:19 pm »
Just to finish off my little adventure, the WU validated with 131.55 credits. Here is my stderr..

Stderr output

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
  Device 1: GeForce 9500 GT, 1007 MiB, regsPerBlock 8192
     computeCap 1.1, multiProcs 4
     clockRate = 1840363
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce 9500 GT is okay
SETI@home using CUDA accelerated device GeForce 9500 GT
Priority of process raised successfully
Priority of worker thread raised successfully
size 8 fft, is a freaky powerspectrum
size 16 fft, is a cufft plan
size 32 fft, is a cufft plan
size 64 fft, is a cufft plan
size 128 fft, is a cufft plan
size 256 fft, is a freaky powerspectrum
size 512 fft, is a freaky powerspectrum
size 1024 fft, is a freaky powerspectrum
size 2048 fft, is a cufft plan
size 4096 fft, is a cufft plan
size 8192 fft, is a cufft plan
size 16384 fft, is a cufft plan
size 32768 fft, is a cufft plan
size 65536 fft, is a cufft plan
size 131072 fft, is a cufft plan

 )       _   _  _)_ o  _  _
(__ (_( ) ) (_( (_  ( (_ ( 
 not bad for a human...  _)

Multibeam x32f Preview, Cuda 3.0

Work Unit Info:
...............
WU true angle range is :  0.394304

Flopcounter: 45602123959036.234000

Spike count:    0
Pulse count:    2
Triplet count:  0
Gaussian count: 2
called boinc_finish

</stderr_txt>
]]>
« Last Edit: 03 Sep 2010, 07:26:46 pm by perryjay »

msattler

  • Guest
Re: just installed Unified Installers, v0.37 for Windows
« Reply #28 on: 04 Sep 2010, 12:51:43 am »
Used the new installer earlier today to add a GTX465 to the Frozen 920.
No problems to report. Other than all Cuda tasks are marked as 6.08....no 6.10.
But the new card is crunching away with the existing GTX295.

Looks like a winner here.

So now the kitties are feeling kinda Fermi...............
« Last Edit: 04 Sep 2010, 01:10:24 am by msattler »

Offline perryjay

  • Knight Templar
  • ****
  • Posts: 427
Re: just installed Unified Installers, v0.37 for Windows
« Reply #29 on: 04 Sep 2010, 10:11:08 am »
So long as it is using the 6.10 you are good to go. How you liking that Fermi? Post some times and credits when you get some in.

 
