GPU crunching question


Josef W. Segur:

--- Quote from: Devaster on 10 May 2007, 06:54:57 pm ---Is FFTLen the same for the whole computation?
--- End quote ---

No.

8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, and 131072 lengths are all used.
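Every length in that list is a power of two, from 2^3 up to 2^17. A minimal sketch that regenerates the list (the name `fft_lengths` is only illustrative, not something from the S@H code):

```python
# Illustrative only: rebuild the list of FFT lengths quoted above
# as consecutive powers of two, 2**3 through 2**17.
fft_lengths = [2 ** k for k in range(3, 18)]

print(len(fft_lengths))   # number of distinct lengths
print(fft_lengths[0])     # smallest length
print(fft_lengths[-1])    # largest length
```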

Technically, some simple changes in the parameters sent in the workunit header could require additional lengths but it's highly unlikely the project will do that.
Joe

Devaster:
ok.

For FFT sizes below 128K, a GPU version (GPUFFTW and RapidMind) is extremely inefficient: in the time the driver takes to upload the data, the CPU can calculate 2-10 small FFTs ...
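To make that trade-off concrete, here is a rough cost-model sketch. It works in units of "time for one small FFT on the CPU", takes the upload cost of 2 to 10 units from the post above, and assumes a per-FFT GPU cost that is purely hypothetical (0.5 here, chosen only for illustration). The function name `break_even_batch` is made up for this example.

```python
import math

def break_even_batch(upload_cost, gpu_fft_cost):
    """Smallest number of FFTs per upload for which the GPU path strictly wins.

    All costs are in units of "one small FFT on the CPU".
    upload_cost:  fixed cost of one driver upload (2 to 10 per the post).
    gpu_fft_cost: ASSUMED GPU time per FFT (hypothetical, must be < 1).
    """
    if not 0 <= gpu_fft_cost < 1:
        raise ValueError("GPU must be faster per FFT than the CPU to ever win")
    # GPU total = upload_cost + n * gpu_fft_cost, CPU total = n * 1.
    # The GPU wins once n > upload_cost / (1 - gpu_fft_cost).
    return math.floor(upload_cost / (1 - gpu_fft_cost)) + 1

for upload in (2, 10):
    # 0.5 is a placeholder value, not a measurement.
    print(upload, break_even_batch(upload, gpu_fft_cost=0.5))
```

Under these assumptions a single FFT per upload can never pay off; only a batch of several FFTs per transfer amortizes the fixed upload cost.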

Also, SETI only uses FFTs below 128K, so I think any development of a SETI GPU client is a waste of time for now, until the whole analysis (FFT, power spectrum, pulses, triplets and so on) runs on the GPU. But there is a problem on older cards with shader size.

So for now I have stopped development of this client ... :-\

Vyper:
Aren't GPUFFTW and RapidMind used in parallel? If so, a WU can be arranged in chunks: you can upload a data stream as one whole block to graphics RAM and then let multiple "threads" each calculate their own part of it.

That would speed up the process a lot. If that isn't doable, then of course it's no use using either API.
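The chunking idea can be sketched on the CPU as follows. This is not GPUFFTW or RapidMind code: the recursive radix-2 `fft`, the helper names, and the thread pool are all stand-ins chosen for illustration. Python threads will not give a real speedup here; the point is only that each chunk's FFT is independent, so one transferred block can feed many workers.

```python
import cmath
from concurrent.futures import ThreadPoolExecutor

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def split_into_chunks(block, fft_len):
    """Cut one contiguous block into independent fft_len-sized chunks."""
    if len(block) % fft_len:
        raise ValueError("block length must be a multiple of fft_len")
    return [block[i:i + fft_len] for i in range(0, len(block), fft_len)]

def batched_fft(block, fft_len, workers=4):
    """Hand over the block once, then transform every chunk independently."""
    chunks = split_into_chunks(block, fft_len)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fft, chunks))

# Stand-in for workunit samples: 64 values, i.e. 8 chunks of length 8.
block = [complex(i % 8, 0) for i in range(64)]
spectra = batched_fft(block, fft_len=8)
print(len(spectra), len(spectra[0]))
```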

Nvidia CUDA (which you unfortunately can't test) has 12-16 "processors" that can each execute multiple threads. If it were possible to move one block of data once, chop it up into small pieces and align the RAM reads/writes, the client would be highly efficient. The problem is that the CUDA API can currently only accept FFT sizes up to 16K, so it would require either a lot of data shuffling or a completely custom FFT library for that particular API and for different graphics setups = "Verrrry time consuming". But someone who is really keen on low-level programming could surely write a massively parallel FFT routine.
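The "data shuffling" route can be illustrated with the standard Cooley-Tukey split of a length n1*n2 transform into sub-transforms of length n1 and n2 only. The sketch below is plain Python, not CUDA; the naive `dft` merely stands in for a size-limited library FFT, and `split_fft` is a hypothetical name. For scale, 131072 = 8 * 16384, so the largest length used could in principle be built from pieces that fit the 16K limit mentioned above.

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT, standing in for a size-limited library FFT."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def split_fft(x, n1, n2):
    """Length n1*n2 DFT built from sub-transforms of length n1 and n2 only."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")
    # Step 1: n2 transforms of length n1 over strided input (the shuffle).
    # cols[b] is the DFT of x[b], x[b + n2], x[b + 2*n2], ...
    cols = [dft(x[b::n2]) for b in range(n2)]
    # Step 2: twiddle factors, multiply cols[b][k1] by exp(-2*pi*i*b*k1/n).
    for b in range(n2):
        for k1 in range(n1):
            cols[b][k1] *= cmath.exp(-2j * cmath.pi * b * k1 / n)
    # Step 3: n1 transforms of length n2 across b, written out with stride n1.
    out = [0j] * n
    for k1 in range(n1):
        row = dft([cols[b][k1] for b in range(n2)])
        for k2 in range(n2):
            out[k1 + n1 * k2] = row[k2]
    return out

# Toy check: a length-32 transform from pieces of length 4 and 8.
x = [complex(i * i % 7, i % 3) for i in range(32)]
err = max(abs(a - b) for a, b in zip(split_fft(x, 4, 8), dft(x)))
print(err < 1e-9)
```

The strided reads in step 1 and the strided writes in step 3 are exactly the extra memory traffic that makes this approach costly on a GPU unless the accesses are carefully aligned.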

Well, I hope there is another approach to implementing a S@H client in the GPU world. It would surely be a shame if only Folding@home gets away with having accomplished that, but then again I don't know what type of calculations that client needs. The only thing I know is that the good ol' ATI X19XX series cards have hugely increased throughput, and even the PS3 client struggles to keep up with that graphics card's FPU power.

So all in all, I'm hoping for a parallel execution routine to accomplish that goal.

Devaster, you have my deepest respect for even trying the first step towards GPU S@H with your reported progress. I hope the API gets better as time goes by.

Kind Regards Vyper

Simon:
Same from me - sad to see you stop development for the time being. Maybe Vyper's comments gave you some new ideas, hopefully anyway :)

Take care, and stay around!

Regards,
Simon.

Devaster:
So here is a technology preview ...
DO NOT USE IT FOR OFFICIAL CRUNCHING
Stop BOINC, unpack, run the exe with the parameter -standalone and see how it runs ...
If it doesn't run, send stderr.txt to this forum, but in 90% of cases some OpenGL extension is missing ...

As for development: I am tired and need a break and some time to think about it, so don't worry ...

I'll be back!

[attachment deleted by admin]
