optimized sources

Jason G:
LOL, Here's one in seti_analyze that disappears if going to FFTW,
 

--- Code: ---                #ifndef USE_FFTW        // FFTW now uses out of place transforms.
                    memcpy( WorkData, &ChirpedData[CurrentSub], int(fftlen * sizeof(sah_complex)) );
                #endif
--- End code ---

I see a few of those.

Another thought: has anyone attempted to use the FFTW codelet generator, given that only a small portion of FFTW is used? I have played with OCaml before, and it didn't seem hard [but it was long enough ago that I've forgotten everything :D].

Jason

Josef W. Segur:

--- Quote from: j_groothu on 26 Oct 2007, 10:14:58 am ---LOL, Here's one in seti_analyze that disappears if going to FFTW,
 

--- Code: ---                #ifndef USE_FFTW        // FFTW now uses out of place transforms.
                    memcpy( WorkData, &ChirpedData[CurrentSub], int(fftlen * sizeof(sah_complex)) );
                #endif
--- End code ---

I see a few of those.

Another thought: has anyone attempted to use the FFTW codelet generator, given that only a small portion of FFTW is used? I have played with OCaml before, and it didn't seem hard [but it was long enough ago that I've forgotten everything :D].

Jason
--- End quote ---

Yes, those memcpy calls could be eliminated if the IPP FFTs were switched to out-of-place. Tetsuji made that change in the official sources after 5.15, so the changes aren't included in our source. I've mentioned this several times, but it would be best if someone who actually works with IPP made and tested the changes.

I've thought about codelet generation, and even downloaded OCaml, but have never done anything with it. I suspect there could be some efficiency to be gained from an FFTW function that combined the FFT and the conversion to PowerSpectrum: the final FFT stage has the values needed to save the power directly, rather than having a separate function go through the complex array and convert it. The reversibility of a complete FFT is only needed during baseline smoothing.

Joe
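
For anyone following along, the separate conversion pass mentioned above amounts to something like the sketch below. This is only an illustration, not the actual GetPowerSpectrum or PwrSpectrumOnly code; the name power_spectrum and the use of std::vector are assumptions made for the example. It also shows why the result is not reversible: only the squared magnitude is kept and the phase is thrown away.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the separate conversion pass: walk the complex
// FFT output once and store the squared magnitude of every bin.
std::vector<float> power_spectrum(const std::vector<std::complex<float>>& freq)
{
    std::vector<float> power(freq.size());
    for (std::size_t i = 0; i < freq.size(); ++i) {
        const float re = freq[i].real();
        const float im = freq[i].imag();
        power[i] = re * re + im * im;   // |z|^2; the phase is discarded here
    }
    return power;
}
```

A fused version would write re * re + im * im straight out of the last FFT stage instead of storing the complex values and reading them back in a second loop.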

Jason G:
Yes, what triggered the mention was my attempt to understand why the subsequent IPP calls are the out-of-place versions, yet are called 'in-place style' with the same buffer as both source and destination. Odd :o.

--- Code: ---ippsFFTInv_CToC_32fc(
    ( Ipp32fc * ) WorkData,   // pSrcDst for in place, pSrc for out of place
    ( Ipp32fc * ) WorkData,   // additional parameter indicating an out-of-place call?
                              // Maybe drop it for in place, or change it for a proper
                              // out-of-place call and disable the preceding memcpy.
    FftSpec[FftNum],
    FftBuf );
--- End code ---
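
To make the two calling styles concrete, here is a small sketch that does not use IPP or FFTW at all. Everything in it (toy_idft, toy_idft_inplace, copy_then_transform, transform_direct) is a hypothetical stand-in invented for this example, and the transform is a naive O(n^2) inverse DFT rather than a real FFT. The point is only that "copy, then transform the work buffer onto itself" and "transform straight from the source chunk into the work buffer" should give the same output, so the copy is redundant when a true out-of-place call is available.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<float>;

// Toy out-of-place inverse DFT (naive, unnormalised). Hypothetical stand-in
// for a library FFT; src and dst must be different buffers.
void toy_idft(const cplx* src, cplx* dst, std::size_t n)
{
    assert(src != dst);
    const double two_pi = 6.283185307179586;
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> acc(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = two_pi * double(j) * double(k) / double(n);
            acc += std::complex<double>(src[j]) *
                   std::complex<double>(std::cos(angle), std::sin(angle));
        }
        dst[k] = cplx(float(acc.real()), float(acc.imag()));
    }
}

// Toy in-place variant: needs its own scratch copy of the input.
void toy_idft_inplace(cplx* buf, std::size_t n)
{
    std::vector<cplx> tmp(buf, buf + n);
    toy_idft(tmp.data(), buf, n);
}

// Style A: copy the chunk into the work buffer (the role memcpy plays in
// the loop), then transform the work buffer onto itself.
std::vector<cplx> copy_then_transform(const std::vector<cplx>& chirped,
                                      std::size_t offset, std::size_t n)
{
    std::vector<cplx> work(n);
    std::copy_n(chirped.begin() + offset, n, work.begin());
    toy_idft_inplace(work.data(), n);
    return work;
}

// Style B: read from the chunk and write to the work buffer; no copy.
std::vector<cplx> transform_direct(const std::vector<cplx>& chirped,
                                   std::size_t offset, std::size_t n)
{
    std::vector<cplx> work(n);
    toy_idft(&chirped[offset], work.data(), n);
    return work;
}
```

With a real library the same comparison (run both styles on one work unit and diff the results) would be the obvious test before deleting the memcpy.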

I'm having a play with MinGW over Eclipse at the moment for other, less vendor-library-oriented work. I'm liking it; it's a big switch, as I haven't used a GNU compiler for years. Woot, 'proper make facilities' ;D, which means I'll have to take a deeper look at FFTW sometime soon.

Jason

[ Maybe when I'm back on ICC/IPP, I'll see what breaks if I comment out the memcpy and make the call properly out of place, just by changing the source argument (in all those places in seti_analyze). It would indeed be nice if someone more IPP-experienced and in the sources loop could look at Joe's observation, comment, test, etc. ]

Jason G:
Okay, just for kicks I managed to get the FFTW 3.1.2 configure and make scripts operational in MinGW/MSYS. No idea what to make of the actual configuration flags (config.h) yet, though ;D Back to the doccos!

_heinz:
Hi Jason,
nice that you encouraged me, thanks...

The relevant points are in process_data. Here it is, with some changes to the original code:

--- Code: ---// ----------------------------------------------------------------------------
//   Function   : process_data
//   Type       : void
//   Contents   : process data, with or without transpose
//   Parameters : none
//   Last update: 19.03.2007   by: seti_britta
// ----------------------------------------------------------------------------
// Part 4.2 process data
// ----------------------------------------------------------------------------
void process_data()
    {
    extern int have_transpose;
    if ( !have_transpose ) ifft = 0;   // seti_britta: ifft = 0 when there is no transpose
    for ( ; ifft < NumFfts; ifft++ )
        {
        CurrentSub = fftlen * ifft;
        #ifndef USE_FFTW        // FFTW now uses out of place transforms.
            memcpy( WorkData, &ChirpedData[CurrentSub], int(fftlen * sizeof(sah_complex)) );
        #endif

        // seti_britta: moved the flops calculation to the point where fftlen gets its value
        // flops_form1 = 4 * fftlen + 5 * fftlen * log( double(fftlen) ) / log( 2.0 )
        // count_flops( 4 * fftlen + 5 * fftlen * log( double(fftlen) ) / log( 2.0 ) );
        count_flops( flops_form1 );   // seti_britta: new statement

        #if defined( USE_IPP )
            ippsFFTInv_CToC_32fc(
                ( Ipp32fc * ) WorkData,
                ( Ipp32fc * ) WorkData,
                FftSpec[FftNum],
                FftBuf );
        #elif defined( USE_FFTWF )
            fftwf_execute_dft(
                analysis_plans[FftNum],
                &ChirpedData[CurrentSub],
                WorkData );
        #else
            // replace time with freq - ooura FFT
            // seti_britta: took the multiplication by 2 out of the loop, to where fftlen gets its value
            /*   cdft( fftlen * 2, 1, WorkData, BitRevTab[FftNum], CoeffTab[FftNum] ); */
            cdft( fftlen_m2, 1, WorkData, BitRevTab[FftNum], CoeffTab[FftNum] );
        #endif

        if ( have_transpose )
            {
            // BENH: new version replace freq with power
            //       does transpose as well as puts values back
            //       into WorkData (for use by findSpikes)
            GetPowerSpectrum( WorkData, PowerSpectrum, fftlen, ifft, NumFfts );
            have_transpose = false;
            }
        // replace freq with power
        // no transpose
        else PwrSpectrumOnly( WorkData, (float *)WorkData, fftlen );

        // any ETIs ?!
        // If PoT freq bin is non-negative, we are into PoT analysis
        // for this cfft pair and need not redo spike finding.
        if ( analysis_state.PoT_freq_bin == -1 )
            {
            count_flops( fftlen );
            retval = FindSpikes( (float *)WorkData, fftlen, ifft, swi );
            progress += SpikeProgressUnits( fftlen ) * ProgressUnitSize / NumFfts;
            if ( retval ) SETIERROR( retval, "from FindSpikes" );
            }

        // progress = ((float)icfft)/num_cfft + ((float)ifft)/(NumFfts*num_cfft);
        progress = std::min( progress, 1.0 );
        #ifdef BOINC_APP_GRAPHICS
            if ( !nographics() )
                {
                if ( gbp ) gbp->rarray.add_source_row( (float *)WorkData );
                sah_graphics->local_progress = ( (( float ) ifft + 1) / NumFfts );
                }
        #endif
        remaining = 1.0 - ( double ) ( icfft + 1 ) / num_cfft;
        fraction_done( progress, remaining );

        }   // end for ifft < NumFfts
    }   // end part 4.2 process_data
--- End code ---
regards heinz
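
As a side note on the flops change in the listing, the idea of computing the estimate once where fftlen is set, rather than on every pass through the loop, can be sketched as below. The formula 4 * fftlen + 5 * fftlen * log2(fftlen) is taken from the commented-out line in the listing; the function names fft_flops and total_flops_for are hypothetical, and the running total only stands in for whatever count_flops does internally.

```cpp
#include <cassert>
#include <cmath>

// Estimated floating-point operations for one FFT of length fftlen, using
// the formula from the commented-out line: 4*N + 5*N*log2(N).
double fft_flops(int fftlen)
{
    assert(fftlen > 0);
    const double n = static_cast<double>(fftlen);
    return 4.0 * n + 5.0 * n * std::log(n) / std::log(2.0);
}

// Hypothetical accumulator standing in for repeated count_flops() calls:
// the estimate is computed once, then reused on every loop iteration.
double total_flops_for(int fftlen, int num_ffts)
{
    const double per_fft = fft_flops(fftlen);   // hoisted out of the loop
    double total = 0.0;
    for (int i = 0; i < num_ffts; ++i) {
        total += per_fft;                       // no log() calls in here
    }
    return total;
}
```

For fftlen = 8 the estimate works out to 4 * 8 + 5 * 8 * 3 = 152 per FFT, so the two log() calls per iteration are replaced by a single addition.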
