just installed Unified Installers, v0.37 for Windows
Josef W. Segur:
--- Quote from: perryjay on 01 Sep 2010, 09:38:13 pm ---If at first you don't succeed try try again!!! ;D
--- End quote ---
Well done, third time's the charm! ;) I have the WU running a standalone test on a system which should take between 12 and 13 hours for that AR. Maybe someone else with CUDA capability could check whether it causes any unusual effects.
Joe
Jason G:
Will be able to load it up tonight for a look.
perryjay:
Just a thought, but could whatever is causing my little 9500 GT to hang on these be what's causing stock AMDs to hang? Sure would be great if my problem helped to find a cure for that. I know optimizing AMDs cures it, but if we can find that one wrong piece in the WU, maybe the boys at Berkeley could correct it in the stock WUs.
Jason G:
Have run just now under x32f, both Cuda 3 & 3.1 versions, on the 480 looking for anything unusual. Nothing immediately obvious yet. These builds, as usual, have the bench code disabled that causes those rare issues on stock with AMD. ~8 minutes elapsed, ~1min CPU time. Pretty normal processing for a Mid Angle range task here. I don't have stock cuda_fermi on hand at the moment to see if that differs.
Will see if I can spot anything in the result files, such as lots of closely spaced triplets or something...
[Edit:] Notes:
- Your result file is 'Strongly Similar' to both mine
- Both detected pulses seem to be at 'fairly short' FFT Lengths, (i.e. Long PulsePoTs) which can run more efficiently on Fermi hardware at this time, but prior gen can choke. I suspect these Long PulsePoTs could explain up to around 50% increased runtime for this task, maybe more, but would need a chirp/FFT pair breakdown to know for sure. If correct then it's a 'nasty bastard' task for older/lower capacity cards, but I'm not prepared to rule out something else interfering with the run time on that machine yet.
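To put rough numbers on the 'short FFT length means long PulsePoT' point: the stderr later in this thread sizes a buffer as cudaAcc_NumDataPoints / fftlen, which is the number of power samples per frequency bin. A minimal Python sketch of that arithmetic follows. The figure of 1048576 data points per WU is an assumption on my part, and pot_length is a made-up helper name, not something from the app.

```python
# Sketch only: relates FFT length to the length of each power-over-time (PoT) array.
# ASSUMPTION: a work unit holds 1024 * 1024 complex samples; the thread does not state this.
NUM_DATA_POINTS = 1024 * 1024


def pot_length(fft_len: int, num_data_points: int = NUM_DATA_POINTS) -> int:
    """Number of time samples per frequency bin for a given FFT length (hypothetical helper)."""
    if fft_len <= 0 or num_data_points % fft_len != 0:
        raise ValueError("fft_len must be a positive divisor of num_data_points")
    return num_data_points // fft_len


if __name__ == "__main__":
    # Shorter FFTs leave more samples along the time axis, so each PoT is longer.
    for fft_len in (8, 64, 512, 4096, 32768):
        print(f"fftlen {fft_len:>6} -> PoT length {pot_length(fft_len):>6}")
```

Halving the FFT length doubles the PoT length, so the pulse search has longer arrays to fold at the short-FFT end.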
Got a Breakdown Joe ?
The lower multiProcessorCount of the 9500 GT, about half that of my old 9600 GSO, would mean that long PulsePoTs at fftLength 4096 and under split pulsefind kernel execution more often to fit the hardware. That would explain the naturally longer runtime of these tasks on lower classes of GPU, while runtime stays the same as other mid-range tasks on higher-end GPUs. In addition, I did move execution of those kernels to a non-default stream (i.e. not stream 0) and tampered with the kernel launch geometry somewhat. That could explain why it runs to completion on x32f but suffers timeouts and driver crashes under stock.
Jason
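As a rough illustration of the splitting argument above, here is a small Python sketch of how the number of kernel launches could scale with multiprocessor count. It is not the x32f logic: launches_needed, total_blocks and blocks_per_mp are hypothetical names and values chosen for the example, and only the count of 4 multiprocessors comes from the 9500 GT stderr in this thread.

```python
import math


def launches_needed(total_blocks: int, multiprocessor_count: int, blocks_per_mp: int = 8) -> int:
    """Launches required if each launch is capped at what the card can hold at once.

    HYPOTHETICAL model: the cap is multiprocessor_count * blocks_per_mp blocks per launch.
    The real app may size its launches quite differently.
    """
    if total_blocks < 0 or multiprocessor_count <= 0 or blocks_per_mp <= 0:
        raise ValueError("arguments must be positive (total_blocks may be zero)")
    blocks_per_launch = multiprocessor_count * blocks_per_mp
    return math.ceil(total_blocks / blocks_per_launch)


if __name__ == "__main__":
    work = 1024  # illustrative amount of pulsefind work, in thread blocks
    for mp_count in (4, 8):  # 4 = 9500 GT per the stderr; 8 = a made-up card with twice as many
        print(f"{mp_count} multiprocessors -> {launches_needed(work, mp_count)} launches")
```

Under this model, halving the multiprocessor count doubles the number of launches for the same work, which fits the longer runtimes on smaller cards while each individual launch stays short.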
perryjay:
Just to show, this is the stderr from one completed back on the 21st....also an 0.39 AR and a 21ap10ag....
Oops, not completed, errored out.. ::)
Stderr output
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
Device 1 : GeForce 9500 GT
totalGlobalMem = 1056505856
sharedMemPerBlock = 16384
regsPerBlock = 8192
warpSize = 32
memPitch = 2147483647
maxThreadsPerBlock = 512
clockRate = 1840363
totalConstMem = 65536
major = 1
minor = 1
textureAlignment = 256
deviceOverlap = 1
multiProcessorCount = 4
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce 9500 GT is okay
SETI@home using CUDA accelerated device GeForce 9500 GT
V12 modification by Raistmer
Priority of worker thread rised successfully
Priority of process adjusted successfully
Total GPU memory 1056505856 free GPU memory 983990272
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
Build features: Non-graphics CUDA VLAR autokill enabled FFTW USE_SSE x86
CPUID: Pentium(R) Dual-Core CPU E5400 @ 2.70GHz
Cache: L1=64K L2=2048K
CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3
libboinc: 6.3.22
Work Unit Info:
...............
WU true angle range is : 0.393971
After app init: total GPU memory 1056505856 free GPU memory 983990272
Cuda error 'cufftExecC2C' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_fft.cu' in line 143 : the launch timed out and was terminated.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : the launch timed out and was terminated.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : the launch timed out and was terminated.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 147 : the launch timed out and was terminated.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 147 : the launch timed out and was terminated.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BoincSeti_Prog/sinbad_repositories/LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 160 : the launch timed out and was terminated.
</stderr_txt>
]]>
Hope that helps.
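For comparing stderr dumps like this one across tasks, a short Python sketch that pulls the failing call, source file, line number and message out of each "Cuda error" line might be handy. The regular expression is written against the line format shown in this post only; parse_cuda_errors is a hypothetical helper, not part of BOINC or the app.

```python
import re
from collections import Counter

# Matches lines of the form seen in this thread's stderr:
#   Cuda error '<call>' in file '<path>' in line <n> : <message>
CUDA_ERROR_RE = re.compile(
    r"Cuda error '(?P<call>.+?)' in file '(?P<path>[^']+)' "
    r"in line (?P<line>\d+) : (?P<message>.+)"
)


def parse_cuda_errors(stderr_text: str) -> list:
    """Return one dict per 'Cuda error' line found in the text (hypothetical helper)."""
    errors = []
    for raw in stderr_text.splitlines():
        match = CUDA_ERROR_RE.search(raw)
        if match:
            errors.append({
                "call": match.group("call"),
                # Keep only the file name; the paths in the log use forward slashes.
                "file": match.group("path").rsplit("/", 1)[-1],
                "line": int(match.group("line")),
                "message": match.group("message").strip().rstrip("."),
            })
    return errors


# Two lines copied from the stderr above, used as sample input.
SAMPLE = (
    "Cuda error 'cufftExecC2C' in file 'd:/BoincSeti_Prog/sinbad_repositories/"
    "LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_fft.cu' in line 143 : "
    "the launch timed out and was terminated.\n"
    "Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BoincSeti_Prog/sinbad_repositories/"
    "LunaticsUnited/SETI_CUDA_MB_exp/client/cuda/cudaAcc_summax.cu' in line 147 : "
    "the launch timed out and was terminated.\n"
)

if __name__ == "__main__":
    found = parse_cuda_errors(SAMPLE)
    for err in found:
        print(f"{err['file']}:{err['line']}  {err['call']}  ->  {err['message']}")
    # Tally by message to see at a glance whether everything is the same timeout.
    print(Counter(err["message"] for err in found))
```

Every error in the dump above carries the same launch-timeout message, so the first failing call (cufftExecC2C here) is probably the one to look at; the later ones are likely just fallout.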