By points I mean FFT plans: how large are the plans the S@H client uses?
65536, or 131072, or perhaps even more?
Someone with the knowledge, Josef, Ben... help!
131072 is the maximum in use, but that's controlled by values in the workunit header, and a properly designed program should be able to do larger ones too.
All practical FFT algorithms do divide the work into smaller units and combine them for the final result. But the way it is split and recombined is not trivial. I can't go any further since I haven't studied the algorithms in detail.
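To give a rough feel for the split-and-recombine idea, here is a minimal sketch of the textbook radix-2 Cooley-Tukey recursion in Python. It is only an illustration, assuming a power-of-two length; it is not the code the S@H client uses, and the function names are made up for this example.

```python
import cmath


def fft(x):
    """Textbook recursive radix-2 FFT (illustrative only, not optimized)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]

    # Split: one half-size FFT over the even-indexed samples,
    # one over the odd-indexed samples.
    even = fft(x[0::2])
    odd = fft(x[1::2])

    # Combine: each pair of outputs is built from one even term and one
    # odd term multiplied by a "twiddle factor".
    out = [0j] * n
    half = n // 2
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def naive_dft(x):
    """Direct O(n^2) DFT, used only to check the recursive version."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
```

A 131072-point transform done this way ends up as two 65536-point transforms plus a combining pass, each of those as two 32768-point transforms, and so on. Real libraries use much more elaborate splits, which is the non-trivial part.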
One possible area you might search is the projects trying to find Mersenne primes. They use very long FFTs in a method for multiplying huge numbers, and it's likely someone in those communities has investigated using a GPU.
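As a hedged illustration of what "multiplying huge numbers with an FFT" means, the sketch below treats the decimal digits of two non-negative integers as polynomial coefficients, multiplies them pointwise in the frequency domain, and transforms back. This is the basic schoolbook idea only; the Mersenne projects use far more specialized and optimized transforms, and the names here are hypothetical.

```python
import cmath


def _fft(a, invert=False):
    """Recursive radix-2 FFT; invert=True gives the unscaled inverse."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    sign = 1 if invert else -1
    even = _fft(a[0::2], invert)
    odd = _fft(a[1::2], invert)
    out = [0j] * n
    half = n // 2
    for k in range(half):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def fft_multiply(x, y):
    """Multiply two non-negative integers via FFT convolution of digits."""
    # Least significant digit first, so index i is the coefficient of 10**i.
    a = [int(d) for d in reversed(str(x))]
    b = [int(d) for d in reversed(str(y))]

    # Pad to a power of two large enough to hold the full product.
    size = 1
    while size < len(a) + len(b):
        size *= 2
    a += [0] * (size - len(a))
    b += [0] * (size - len(b))

    # Pointwise product in the frequency domain equals convolution of digits.
    fa = _fft(a)
    fb = _fft(b)
    prod = _fft([p * q for p, q in zip(fa, fb)], invert=True)

    # Scale the inverse transform and round away floating-point noise.
    coeffs = [round(v.real / size) for v in prod]

    # Summing coefficient * 10**i performs the carries implicitly.
    return sum(c * 10 ** i for i, c in enumerate(coeffs))
```

Floating-point rounding limits how long the inputs can get before the rounded coefficients go wrong, which is one reason the serious implementations take so much care.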
Joe