Some thoughts and theoretical discussion about the SETI client on GPU

Devaster:
Now I am thinking about how best to parallelize the pulse find. In the standard code, pulses are calculated serially, by calling a function in the main analysis loop.

What happens if I do something like this: I take the loop that finds pulses at a given FFT size and run it in NumPoints/fftlen threads?

I think this would be a nice parallelization. But there is one extreme: as the FFT size grows beyond 4096, the number of parallel threads drops from 256 to 8. Maybe there will be a performance bottleneck, or GPU utilization will be very low...
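
To make the thread-count arithmetic above concrete, here is a minimal host-side sketch in C++. It assumes a work unit of 1,048,576 points, a value inferred only from the figure of 256 threads at an FFT length of 4096; the names `kNumPoints`, `threads_for_fft` and `thread_table` are hypothetical and are not taken from the SETI client source.

```cpp
#include <cstddef>
#include <vector>

// Assumption: 1,048,576 points per work unit, inferred from the post's
// figure of 256 threads at an FFT length of 4096.
constexpr std::size_t kNumPoints = 1048576;

// One thread per FFT-sized chunk, i.e. NumPoints / fftlen as proposed above.
constexpr std::size_t threads_for_fft(std::size_t num_points,
                                      std::size_t fft_len) {
    return num_points / fft_len;
}

struct ThreadCount {
    std::size_t fft_len;
    std::size_t threads;
};

// Tabulate the thread count for FFT lengths doubling from min_len to max_len.
std::vector<ThreadCount> thread_table(std::size_t num_points,
                                      std::size_t min_len,
                                      std::size_t max_len) {
    std::vector<ThreadCount> rows;
    for (std::size_t len = min_len; len <= max_len; len *= 2) {
        rows.push_back({len, threads_for_fft(num_points, len)});
    }
    return rows;
}
```

Calling `thread_table(kNumPoints, 4096, 131072)` gives one row per FFT length, with the thread count halving at each step. With so few threads at the largest lengths, most of the GPU would probably sit idle unless each chunk is split further.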

I must test this tomorrow... see ya!

Devaster:
About pulse find: I think it would be better to write all the kernels manually rather than have them generated automatically. Loop unrolling could be used too, for better performance...
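
As an illustration of what manual loop unrolling looks like, here is a small C++ sketch that sums an array with a plain loop and with a hand-unrolled loop. It is only a generic example of the technique, not the pulse-find kernel itself; `sum_rolled` and `sum_unrolled4` are hypothetical names.

```cpp
#include <cstddef>

// Plain loop: one addition and one loop test per element.
float sum_rolled(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Manually unrolled by four: four independent accumulators per iteration,
// so there are fewer loop tests and more work that can overlap.
float sum_unrolled4(const float* data, std::size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    float total = (s0 + s1) + (s2 + s3);
    // Handle the leftover elements when n is not a multiple of four.
    for (; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

The unrolled version adds the elements in a different order, so for general floating-point data the two results can differ in the last bits. Whether unrolling by hand actually beats what the compiler generates would have to be measured.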

popandbob:
To follow up on my last question..

Once all is programmed in CUDA will the CPU usage still be 100%? I know that Folding@home's ATI GPU client is... but I believe that's due to them not using CUDA...

~BoB

Devaster:
I don't know. There will still be some parts that run on the CPU...

popandbob:
Thanks for the reply, Devaster. I do hope CPU usage won't be at 100%, because then at least we would have something that Folding@home doesn't: a GPU app that can run alongside CPU apps (i.e. you don't have to reserve a core for the GPU app).

~BoB
