Author Topic: Some thinking and theoretic discussion about seti client on GPU  (Read 23504 times)

tfp

  • Guest
Re: Some thinking and theoretic discussion about seti client on GPU
« Reply #15 on: 03 Jan 2008, 01:28:49 pm »
Just a quick question: is there a reason why the data is converted to FP first and then all of the work is done?

guido.man

  • Guest
Re: Some thinking and theoretic discussion about seti client on GPU
« Reply #16 on: 03 Jan 2008, 06:17:29 pm »
That is a good question.
Is there a real need to convert to floating point, or could all the calculations be done in binary, and converted at the end if need be?
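One way to see why a floating-point representation is hard to avoid here (a minimal sketch, not the actual client code): even if every input sample is just +1 or -1, the FFT multiplies them by irrational twiddle factors, so the spectrum values leave the integer domain on the very first transform.

```python
import cmath

def dft(x):
    """Naive DFT, just enough to show the arithmetic involved."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

# 1-bit style input: every sample is just +1 or -1.
bits = [1, 1, 1, -1, 1, -1, -1, -1]
spectrum = dft(bits)

# Bin 0 is a plain integer sum, but the other bins pick up irrational
# twiddle factors (multiples of sqrt(2)/2 for n = 8), so the result
# cannot stay in a pure integer/binary representation.
print(spectrum[1])
```

Doing the whole chain in scaled integers would mean rounding at every butterfly stage, which is exactly the precision worry discussed later in the thread.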

Gecko_R7

  • Guest
Re: Some thinking and theoretic discussion about seti client on GPU
« Reply #17 on: 07 Jan 2008, 01:12:50 am »
Mimo & Jason,
Interesting threads from '06 at GPGPU.org pertaining to FFTs.
I actually followed them at the time and just came across them again while cleaning out old links.
Maybe you've read them already, but if not:

http://www.gpgpu.org/forums/viewtopic.php?t=2284

http://www.gpgpu.org/forums/viewtopic.php?t=2021

Cheers!
« Last Edit: 07 Jan 2008, 01:39:03 am by Gecko_R7 »

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Some thinking and theoretic discussion about seti client on GPU
« Reply #18 on: 07 Jan 2008, 01:51:09 am »
LOL, thanks Gecko. I like some of the confusion about precision and representation in the first one, as the discussion mirrors many of my own questions when I first started exploring (and there are still many things for me to work out yet).

Jason

[ Side note, attempted clarification: as far as I've been able to understand, we are mostly dealing with 32-bit single floats paired in a complex representation, derived from 1-bit samples in the native telescope data. That format relates to the telescope hardware arrangement for minimising noise, maximising sensitivity, and recording capacity (like some kind of dipole antenna arrangement with advantageous geometric characteristics to it). I'd imagine the 1-bit pair representation, used for storage purposes, is missing elements needed for the signal processing that are implicit in the geometrical relationships, meaning the data needs to be expanded (decompressed) as the first step, using those known relationships.

In my limited understanding, having the full complex representation available extends the effective Nyquist-induced bandwidth limit of the system from n/2 (for real-only samples) to +/-n/2 (for a complex signal), effectively improving the overall sensitivity of the search while making better use of the original telescope hardware, and having other processing implications for several stages in the system. Which hopefully answers "Why use 32-bit complex pairs (totalling 64 bits) instead of 64-bit double floats, which would occupy the same storage space?"

I gather those aspects of the specific telescope setup have been refined over decades to improve the sensitivity etc. ]
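The bandwidth point in the side note can be illustrated with a toy transform (a sketch only, using a naive DFT rather than anything from the actual client): the spectrum of a real-only sequence is conjugate-symmetric, so bins above n/2 duplicate the lower ones, while complex samples keep the positive- and negative-frequency bins independent.

```python
import cmath

def dft(x):
    """Naive DFT for illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

n = 8

# Real-only samples: X[n-f] == conj(X[f]), so the upper half of the
# spectrum carries no new information (only n/2 independent bins).
real_sig = [cmath.cos(2 * cmath.pi * 3 * t / n).real for t in range(n)]
xr = dft(real_sig)
symmetric = all(abs(xr[n - f] - xr[f].conjugate()) < 1e-9 for f in range(1, n))
print(symmetric)

# Complex samples: a tone at +3 and a tone at -3 (bin 5) land in
# different bins, so the full +/- n/2 range is usable.
tone_pos = [cmath.exp(+2j * cmath.pi * 3 * t / n) for t in range(n)]
tone_neg = [cmath.exp(-2j * cmath.pi * 3 * t / n) for t in range(n)]
xp, xneg = dft(tone_pos), dft(tone_neg)
print(abs(xp[3]), abs(xneg[5]))  # each tone fills its own bin
```

With real-only data those two tones would be indistinguishable, which is the sensitivity argument for keeping the complex pairs.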

« Last Edit: 07 Jan 2008, 12:41:02 pm by j_groothu »

Offline Devaster

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 653
  • I like Duke !!!
Re: Some thinking and theoretic discussion about seti client on GPU
« Reply #19 on: 22 Jan 2008, 03:19:49 am »
Hi

after running the last SETI code in the GPU profiler it shows some interesting things:

1. The most time-consuming thing (47%) is not the FFT, but FindSpike at 128k size. Why?

2. Increasing core speed from 400 MHz to 580 MHz has an enormous effect on performance (>900 sec against 700 sec on an 8500GT (2 multiprocessors)), but increasing memory speed from 450 to 550 MHz does nothing for performance. Why?



1. After better analysis I have found the reason: the find spike code is massively divergent - the GPU can't use any prediction and MUST run all directions of the divergent code (the whole if-then-else construction; the result from the untaken direction is discarded). CPUs can predict the direction of the code, precache the needed instructions/data, and so avoid running unnecessary code. GPUs, due to their strictly parallel architecture, cannot skip part of the code as CPUs do without a massive impact on performance: all threads in a warp (the lowest hardware thread unit - 32 threads) must execute the same code, and if only one thread in the warp gives the relevant and needed result, the other 31 threads are only ballast - a massive performance hit.
   The way to solve this is to use reduction operations: after each compare I halve the number of active threads... I have used this operation in find best spike - the classic FindSpike at 128k is called about 120 times and takes 47%, but the reductive find spike (the size varies from 8 to 128k) is called about 4000 times and takes only 2% of the time spent on the GPU. This method also has better read/write coherency - reads are done from different memory banks - and better shared memory utilisation.
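The halving pattern described above can be sketched serially (the function name and data here are hypothetical; the real thing would be a CUDA kernel keeping the partial results in shared memory, this only shows the shape of the algorithm): at each step the first half of the "threads" compare their candidate against a partner's and keep the larger, so a maximum over n values takes log2(n) uniform steps with no divergent branching between lanes.

```python
def find_best_spike(powers):
    """Tree-reduction maximum: returns (bin index, best power)."""
    # Pair each power with its bin index and pad to a power of two
    # with sentinels, so every step can pair elements cleanly.
    work = list(enumerate(powers))
    while len(work) & (len(work) - 1):
        work.append((-1, float("-inf")))

    stride = len(work) // 2
    while stride > 0:
        # Only "threads" 0..stride-1 stay active; the rest retire.
        # Retiring whole contiguous halves is what keeps warps coherent.
        for i in range(stride):
            if work[i + stride][1] > work[i][1]:
                work[i] = work[i + stride]
        stride //= 2
    return work[0]

powers = [0.3, 1.7, 0.2, 9.4, 0.8, 2.6, 5.1, 0.1]
print(find_best_spike(powers))  # (3, 9.4)
```

Every active lane executes the identical compare-and-keep at every step, which is why the reductive version avoids the 31-idle-threads-per-warp penalty of the divergent scan.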


2. The SETI code is compute-bound, not memory-bound - so higher core speed and more multiprocessors give better performance ....
« Last Edit: 22 Jan 2008, 04:11:22 am by Devaster »

roisen.dubh

  • Guest
Re: Some thinking and theoretic discussion about seti client on GPU
« Reply #20 on: 27 Jan 2008, 09:47:47 pm »
What temps are you guys seeing while running the app, and once a fully working app is created, how hard do you predict it will be to get the code running on NVidia's next series of GPUs?

popandbob

  • Guest
Re: Some thinking and theoretic discussion about seti client on GPU
« Reply #21 on: 28 Jan 2008, 01:01:55 am »
Quote from: roisen.dubh
What temps are you guys seeing while running the app, and once a fully working app is created, how hard do you predict it will be to get the code running on NVidia's next series of GPUs?

Temps are around normal for a 3D game... (~70 deg. C)
As long as they support CUDA, which they do, it will work.

~BoB

riha

  • Guest
Re: Some thinking and theoretic discussion about seti client on GPU
« Reply #22 on: 18 Feb 2008, 10:56:31 am »
Sorry if this is not the correct thread to post in, but what happened to this thread: http://lunatics.kwsn.net/windows/gpu-crunching-question.180.html ?

I found it and downloaded the software, which seems to work except for the final -9 error. I posted an answer in that thread too, but then discovered that nothing has happened in it for half a year.

Is this the replacement thread, or is there another replacement thread? And what about the software, is it still maintained?

Offline Devaster

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 653
  • I like Duke !!!
Re: Some thinking and theoretic discussion about seti client on GPU
« Reply #23 on: 12 Mar 2008, 09:57:06 am »
after experimenting with PoT population I have found this: PoT population is FASTER on the CPU than on the GPU - for the GPU the data is too small and the compute intensity too low ... so I'll move on to the gauss and pulse functions ....

 
