Forum > GPU crunching

Some thinking and theoretic discussion about seti client on GPU

<< < (4/5) > >>

tfp:
Just a quick question, is there a reason why the data is converted to PF first and then all of the work is done?

guido.man:
That is a good question.
Is there a real need to convert to Floating Point,
or could all calculations be done in binary,
and converted at the end if need be?

Gecko_R7:
Mimo & Jason,
Interesting threads from 06' at GPGPU.org pertaining to FFTs.
I actually followed at the time and just came across them while cleaning out old links.
Maybe you've read already, but if not:

http://www.gpgpu.org/forums/viewtopic.php?t=2284

http://www.gpgpu.org/forums/viewtopic.php?t=2021

Cheers!

Jason G:
LOL, Thanks Gecko, I like some of the confusions about precision and representation in the first one, as the discussion mirrors much of my own questions when I first started exploring (and there's still many things for me to work out yet).

Jason

[ Side Note attempted clarification: As far as I've been able to understand, we are mostly dealing using 32 bit single floats paired in a complex representatation, derived from 1 bit samples from the native  telescope data, which relates to telescope hardware arrangement for minimising noise, maximising sensitivity, and  recording capacity. (like some kind of dipole antenna arrangement which has advantageous geometric characteristics to it) . [I'd imagine the 1 bit pairs representation, for storage purposes,  would be  missing elements needed for the signal processing that are implicit in the geometrical relationships, meaning the data needs to be expanded (decompressed) as the first step, using those known relationships.

In my limited understanding - then Having available the full complex representation extends the effective nyquist induced bandwidth limits of the system from n/2, (for real only based samples)  to +/-n/2 (for a complex signal)  (effectively improving the overall sensitivity of the search while making better use of the original telescope hardware, and having other processing implications for several stages in the system ... which hopefully answers "Why use 32 bit complex pairs(totalling 64 bits)  instead of 64 bit double floats, which would occupy the same storage space?"

 I gather those aspects of the specific telescope setup have been refined over decades improve the sensitivity etc... ]

Devaster:
Hi

after running last ETI code in the GPU profiler it shows some interesting things :

1. the most time consuming thing (47 %) is not FFT , but Findspike at 128k size. Why ?

2. increasing core speed from 400 MHz to 580 MHz has enormous affect to performance (>900 sec agains 700 sec on 8500GT (2 multiprocessors)),but increasing memory from 450 to 550 MHz has doing nothing with performance. Why ? 



1. after better analyze i have found reason: find spike code is massively divergent - GPU cant use any predication and MUST run all direction of divergent code (whole if-then else construction- result from the bad direction is discarded). CPUs can predicate direction of code and can precache  needed instructions/data and then can avoid to run unnecessary code . GPUs  due its strictly parallel  architecture cannot skip part of code aka CPUs without massive impact to performance- all threads in warp (lowest hardware thread unit - 32 threads) must compute same code and if only one thread in warp give relevant and needed result other 31 threads are only ballast - massive performance hit.
   Way to solve this is in use reduction operations: after any compare i decrease the count of threads to half ... - this operation i have used at find best spike -  classic findsipke at 128k   is called about 120 times and takes 47% but reductive find spike (size is vary from 8 to 128k)is called about 4000 times and takes only 2% of time spend on GPU. This method has better read write coherency - reads are done from different memory banks and is better shared mem utilization 


2.seti code is compute based not memory  - so bigger core speed and more MP give better performance ....

Navigation

[0] Message Index

[#] Next page

[*] Previous page

Go to full version