CUDA MB V12b for multi-GPU multicore hosts.

Jason G:
The release V12 in the installer and the separate NoKill download were built on my machine using the 2.2 SDK at the time (2.3 was not in common use), with that limit thing set to 2048 as directed by yourself & Joe.

No other changes from your sources at that time, around June 20th 2009 according to the build date on the exe at my end and the svn logs (though later experiments deviate quite a lot). I think it's possible the 2.3 SDK does build larger kernels, and the 2.3 DLLs are definitely larger, produce more stress, and use more video RAM. What effect this would have on smaller cards I'm not entirely sure.

Later in the course of experimentation, as well as adding Joe's triplet kernel fixing stuff, I did introduce a constant definition in my experimental branch, called NUM_ITER which reduces the length of the pulsefinding calls. But that definition isn't in those builds.
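To illustrate the idea of shortening each call with a compile-time constant, here is a minimal host-side sketch. It is not taken from the actual sources: the value of `NUM_ITER`, the helper names `process_chunk` and `run_in_chunks`, and the assumption that the constant caps the number of iterations per call are all hypothetical, and a plain CPU loop stands in for the real CUDA kernel launch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical value: the thread names the constant but not its definition.
constexpr std::size_t NUM_ITER = 5;

// Stand-in for one short kernel launch covering iterations [first, last).
// A real build would launch a CUDA kernel here instead of looping on the CPU.
inline void process_chunk(std::vector<int>& work, std::size_t first, std::size_t last) {
    for (std::size_t i = first; i < last; ++i) {
        work[i] += 1;
    }
}

// Split the whole job into calls of at most NUM_ITER iterations each,
// so no single call runs for long. Returns the number of calls made.
inline std::size_t run_in_chunks(std::vector<int>& work) {
    std::size_t calls = 0;
    for (std::size_t first = 0; first < work.size(); first += NUM_ITER) {
        const std::size_t last = std::min(first + NUM_ITER, work.size());
        process_chunk(work, first, last);
        ++calls;
    }
    return calls;
}
```

With these assumed values, a job of 12 iterations is handled in three calls (5, 5 and 2 iterations). The trade-off is more launches in exchange for shorter individual calls, which is the property that might matter on low-end cards.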

@Al, please tell me the creation date on the exe you used that worked well on the 8400GS, so I can pinpoint which parameters were used, and corresponding svn revision.

Cheers, Jason

Raistmer:
Thanks. Low-end GPUs are a borderline case (by amount of memory available and by length of kernel calls), so they are especially sensitive to even the smallest changes between builds. I still have to understand why my own 9400GT works just fine in the Q9450 host and fails badly and often in the Core Duo and Athlon64 hosts....

Jason G:
Hmm, yes, very confusing. Could you list the builds you've tried on the Athlon (and I guess from what you say none work properly...)? At one time before I went hybrid, I did lots of test builds with reduced pulse finding blocks (NUM_ITER5 IIRC); perhaps those work in this? While v13 would be interesting to try on that, I don't think it'll help pinpoint the problem, since obviously the problem is in CUDA code or hardware somewhere. I'm thinking something to do with chipset/DMA transfers. I presume the mobo BIOS is up to date? There were some issues with PCIe on some mobos IIRC.

Pappa:

--- Quote from: Jason G on 25 Dec 2009, 04:17:14 am ---The release V12 in the installer and the separate NoKill download were built on my machine using the 2.2 SDK at the time (2.3 was not in common use), with that limit thing set to 2048 as directed by yourself & Joe.

No other changes from your sources at that time, around June 20th 2009 according to the build date on the exe at my end and the svn logs (though later experiments deviate quite a lot). I think it's possible the 2.3 SDK does build larger kernels, and the 2.3 DLLs are definitely larger, produce more stress, and use more video RAM. What effect this would have on smaller cards I'm not entirely sure.

Later in the course of experimentation, as well as adding Joe's triplet kernel fixing stuff, I did introduce a constant definition in my experimental branch, called NUM_ITER which reduces the length of the pulsefinding calls. But that definition isn't in those builds.

@Al, please tell me the creation date on the exe you used that worked well on the 8400GS, so I can pinpoint which parameters were used, and corresponding svn revision.

Cheers, Jason

--- End quote ---

6-19-2009

See this message (SETI CUDA MB). So with this, the "Proposed 'Better?' medium term VLAR solution" started not too long after that, around the 1st of July.

Jason G:

--- Quote from: Pappa on 25 Dec 2009, 11:59:34 am ---....at around the 1st of July.
--- End quote ---
OK... that date matches up with my builds of 'Bog standard V12' with FPLIM 2048 applied (amongst other values tested at the time), later committed on 7th July after proof with testing & mimo's profiling. (Prior assertions confirmed.)

@Raistmer: That corresponds to r93 in the CudaMB_exp branch, which you might like to compare to your r89, which it is based on. I don't see significant source changes amongst the experiments between those revisions, so I guess the SDK used might be one remaining suspect.
