Author Topic: Some thinking and theoretic discussion about seti client on GPU (Read 25702 times)

tfp · « **Reply #15 on:** 03 Jan 2008, 01:28:49 pm »

Just a quick question: is there a reason why the data is converted to FP first and then all of the work is done?

guido.man · « **Reply #16 on:** 03 Jan 2008, 06:17:29 pm »

That is a good question.
Is there a real need to convert to Floating Point,
or could all calculations be done in binary,
and converted at the end if need be?

Gecko_R7 · « **Reply #17 on:** 07 Jan 2008, 01:12:50 am »

Mimo & Jason,
Interesting threads from '06 at GPGPU.org pertaining to FFTs.
I actually followed at the time and just came across them while cleaning out old links.
Maybe you've read already, but if not:

http://www.gpgpu.org/forums/viewtopic.php?t=2284

http://www.gpgpu.org/forums/viewtopic.php?t=2021

Cheers!

Jason G · « **Reply #18 on:** 07 Jan 2008, 01:51:09 am »

LOL, thanks Gecko. I like some of the confusion about precision and representation in the first one, as the discussion mirrors many of my own questions from when I first started exploring (and there are still many things for me to work out yet).

Jason

[ Side note, attempted clarification: As far as I've been able to understand, we are mostly dealing with 32-bit single-precision floats paired in a complex representation, derived from 1-bit samples in the native telescope data. That format relates to the telescope hardware arrangement for minimising noise, maximising sensitivity, and fitting recording capacity (like some kind of dipole antenna arrangement which has advantageous geometric characteristics). I'd imagine the 1-bit pairs representation, chosen for storage purposes, would be missing elements needed for the signal processing that are implicit in the geometrical relationships, meaning the data needs to be expanded (decompressed) as the first step, using those known relationships.

In my limited understanding, having the full complex representation available extends the effective Nyquist-imposed bandwidth limits of the system from n/2 (for real-only samples) to +/-n/2 (for a complex signal). That effectively improves the overall sensitivity of the search while making better use of the original telescope hardware, and has other processing implications for several stages in the system. Hopefully that answers "Why use 32-bit complex pairs (totalling 64 bits) instead of 64-bit double floats, which would occupy the same storage space?"

I gather those aspects of the specific telescope setup have been refined over decades to improve the sensitivity etc... ]
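
As a rough illustration of the "expand first" step described in the side note, here is a minimal Python sketch that unpacks 1-bit sample pairs into complex floating-point values. The bit layout (four samples per byte, low bits first, low bit = real sign, high bit = imaginary sign) and the function name `unpack_one_bit_complex` are assumptions made for this example, not the actual SETI work-unit format. Python's `complex` also uses double precision rather than the 32-bit floats mentioned above.

```python
def unpack_one_bit_complex(packed: bytes) -> list[complex]:
    """Expand packed 1-bit sample pairs into complex floating-point values.

    Assumed layout (hypothetical, for illustration only): each byte holds
    four samples, lowest bits first. Within a 2-bit pair, the low bit is the
    sign of the real part and the high bit is the sign of the imaginary
    part. A set bit maps to +1.0, a clear bit to -1.0.
    """
    samples = []
    for byte in packed:
        for shift in (0, 2, 4, 6):
            pair = (byte >> shift) & 0b11
            re = 1.0 if pair & 0b01 else -1.0
            im = 1.0 if pair & 0b10 else -1.0
            samples.append(complex(re, im))
    return samples


if __name__ == "__main__":
    # One byte expands to four complex samples.
    print(unpack_one_bit_complex(bytes([0b11100100])))
```

Once the samples are in this form, the later stages (FFTs, power spectra, averaging) can accumulate fractional values, which the packed 1-bit form cannot hold.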

Devaster · « **Reply #19 on:** 22 Jan 2008, 03:19:49 am »

Hi,

After running the latest SETI code in the GPU profiler, it shows some interesting things:

1. The most time-consuming part (47%) is not the FFT but FindSpike at 128k size. Why?

2. Increasing the core clock from 400 MHz to 580 MHz has an enormous effect on performance (>900 sec against 700 sec on an 8500GT with 2 multiprocessors), but increasing the memory clock from 450 to 550 MHz does nothing for performance. Why?

1. After a closer analysis I found the reason: the find spike code is massively divergent. The GPU cannot use branch prediction and must run every path of divergent code (the whole if-then-else construct; the result from the path not taken is discarded). CPUs can predict the direction of the code, precache the needed instructions and data, and so avoid running unnecessary code. GPUs, because of their strictly parallel architecture, cannot skip parts of the code the way CPUs do without a massive impact on performance: all threads in a warp (the lowest hardware thread unit, 32 threads) must execute the same code, and if only one thread in the warp gives a relevant and needed result, the other 31 threads are only ballast. That is a massive performance hit.
The way to solve this is to use reduction operations: after each compare I halve the number of threads. I have used this for finding the best spike. The classic find spike at 128k is called about 120 times and takes 47%, but the reductive find spike (size varies from 8 to 128k) is called about 4000 times and takes only 2% of the time spent on the GPU. This method also has better read/write coherency (reads come from different memory banks) and better shared memory utilization.

2. The SETI code is compute-bound, not memory-bound, so a higher core clock and more multiprocessors give better performance.
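
The reduction idea from point 1 can be sketched on the CPU as follows. This is a minimal Python model, not Devaster's actual kernel: the function name `find_spike_reduction` is made up for illustration, and it covers only the peak search, not any normalisation a real spike finder would do. Each pass of the outer loop stands for one parallel step in which every active "thread" does the same compare, and the number of active threads is halved afterwards.

```python
def find_spike_reduction(power: list[float]) -> tuple[float, int]:
    """Return (peak value, index of peak) using a halving tree reduction.

    Each pass of the while loop models one parallel step: "thread" i in the
    active half compares its element with the one `half` positions away and
    keeps the larger. Ties keep the lower original index.
    """
    n = len(power)
    if n == 0:
        raise ValueError("power must not be empty")

    # Pad to a power of two so the active range can always be halved.
    size = 1
    while size < n:
        size *= 2
    vals = list(power) + [float("-inf")] * (size - n)
    idxs = list(range(size))

    active = size
    while active > 1:
        half = active // 2
        for i in range(half):  # on a GPU these iterations run concurrently
            j = i + half
            if vals[j] > vals[i] or (vals[j] == vals[i] and idxs[j] < idxs[i]):
                vals[i], idxs[i] = vals[j], idxs[j]
        active = half
    return vals[0], idxs[0]


if __name__ == "__main__":
    print(find_spike_reduction([1.0, 5.0, 3.0, 9.0, 2.0]))
```

For 128k elements this takes 17 halving steps, and within each step every active thread follows the same instruction path, which is what avoids the warp divergence described above.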

roisen.dubh · « **Reply #20 on:** 27 Jan 2008, 09:47:47 pm »

What temps are you guys seeing while running the app, and once a fully working app is created, how hard do you predict it will be to get the code running on NVidia's next series of GPUs?

popandbob · « **Reply #21 on:** 28 Jan 2008, 01:01:55 am »

Quote from: roisen.dubh on 27 Jan 2008, 09:47:47 pm

What temps are you guys seeing while running the app, and once a fully working app is created, how hard do you predict it will be to get the code running on NVidia's next series of GPUs?

Temps are around normal for a 3D game (~70 deg. C).
As long as they support CUDA, which they do, it will work.

~BoB

riha · « **Reply #22 on:** 18 Feb 2008, 10:56:31 am »

Sorry if this is not the correct thread to post in, but what happened to this thread: http://lunatics.kwsn.net/windows/gpu-crunching-question.180.html

I found it and downloaded the software, which seems to work except for the final -9 error. I also posted an answer in that thread, but then discovered that nothing has happened in it for half a year.

Is this the replacement thread, or is there any other replacement thread? What about the software, is it still maintained?

Devaster · « **Reply #23 on:** 12 Mar 2008, 09:57:06 am »

After experimenting with pot population I have found this: pot population is faster on the CPU than on the GPU. For the GPU it is too little data and too little compute intensity, so I am moving on to the gauss and pulse functions.
