Seti@Home optimized science apps and information

Optimized Seti@Home apps => Discussion Forum => Topic started by: BenHer on 18 Aug 2006, 05:31:56 pm

Title: Outline of "what seti@home code really does"
Post by: BenHer on 18 Aug 2006, 05:31:56 pm
I figured since I'm down to the two main time hogs in seti, I should figure out which functions do what to what buffers of data, and in what order.

How long is this buffer's data needed?
Which functions call which...how many loops do they do...etc?

Here is the result...it explains why 'GetFixedPoT' and 'analyze_pot' are the main CPU hogs (cache miss hogs really).

analyze_seti

foreach fft/gauss pair
    {
    if chirp rate ind changes
        ChirpData - fills array ChirpedData[]
            CalcTrigArray
    // Now we will process the just chirped data...
    //  We will break the entire ChirpedData[] into 'fftlen' chunks and do each one
    //  as a block
    //  1. backward FFT over each block and output to WorkData[]
    //  2. compute power spectrum table over fft'ed output, store in PowerSpectrum[CurrentSub]
    //  3. if pot freq bin == -1, then try to find spike in PowerSpectrum[]

    analyze_pot <- PowerSpectrum[]
        // Look for gaussians :: by looping through frequencies * FftLength
            GetFixedPoT()  <- PowerSpectrum[] --> GaussPoT[]
                If fftlen were 1024, then for frequency = 1, GaussPoT[] would contain
                    GaussPoT[0] = spectrum[0]
                    GaussPoT[1] = spectrum[1024]  // cache miss
                    GaussPoT[2] = spectrum[2048]  // ...and so on
                frequency = 2
                    GaussPoT[0] = spectrum[1]
                    GaussPoT[1] = spectrum[1025]
                    GaussPoT[2] = spectrum[2049]
            GaussFit() <- GaussPoT[]
           
        // Look for pulses :: by looping through frequencies * FftLength
            // loop through time for each frequency. PulsePoTNum is used
                Build a PulsePoT[] table extracted from the PowerSpectrum[] buffer
                   (similar to the method in GetFixedPoT above)
                - find_triplets on this PulsePoT[]
                - find_pulse(s) on this PulsePoT[]

    }   // end of fft/gauss pair

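To make the cache-miss pattern in the outline concrete, here is a minimal sketch of the strided gather. It assumes the power spectrum is stored as one block of fftlen bins per time step; the name gather_pot and its parameters are made up for illustration and are not the real GetFixedPoT signature, which does more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the gather step; not the real GetFixedPoT().
// Assumed layout: spectrum[t * fftlen + bin], i.e. one block of fftlen
// frequency bins per time step.
std::vector<float> gather_pot(const std::vector<float>& spectrum,
                              std::size_t fftlen,
                              std::size_t bin)
{
    assert(fftlen > 0 && bin < fftlen);
    assert(spectrum.size() % fftlen == 0);

    const std::size_t num_times = spectrum.size() / fftlen;
    std::vector<float> pot(num_times);

    // Consecutive reads are fftlen floats apart. With fftlen = 1024 that is
    // a 4 KB stride, so each read lands on a different cache line.
    for (std::size_t t = 0; t < num_times; ++t) {
        pot[t] = spectrum[t * fftlen + bin];
    }
    return pot;
}
```

With fftlen = 1024 and bin 0 this reads spectrum[0], spectrum[1024], spectrum[2048] and so on, which is the access pattern listed in the outline.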
Title: Re: Outline of "what seti@home code really does"
Post by: Josef W. Segur on 19 Aug 2006, 04:17:35 pm
Quote from: BenHer on 18 Aug 2006, 05:31:56 pm
I figured since I'm down to the two main time hogs in seti, I should figure out which functions do what to what buffers of data, and in what order.

How long is this buffer's data needed?
Which functions call which...how many loops do they do...etc?

Here is the result...it explains why 'GetFixedPoT' and 'analyze_pot' are the main CPU hogs (cache miss hogs really)....

All true, but 5.17+ will have the transposed PowerSpectrum. Then all the data points for a frequency will be in a contiguous vector.

Of course doing the transposition will have to deal with the same cache issue. I wonder how well the IPP functions ippmCopy_va_32f_SS() or ippmTranspose_m_32f() would do for that. The docs say that the matrix operations are meant for much smaller sizes, but I guess the routines may still be coded so autovectorization works well anyhow.
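For comparison, a cache-blocked (tiled) transpose is one common way to soften that cost. The sketch below is only an illustration under assumptions: transpose_blocked is a hypothetical helper, not an IPP routine and not what the 5.17 source does, and the tile size of 32 is a guess that would need tuning per CPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an IPP or seti@home routine.
// src is rows x cols, row-major; the result is cols x rows, row-major.
// tile is a tuning guess: 32 x 32 floats is 4 KB per tile.
std::vector<float> transpose_blocked(const std::vector<float>& src,
                                     std::size_t rows,
                                     std::size_t cols,
                                     std::size_t tile = 32)
{
    assert(tile > 0);
    assert(src.size() == rows * cols);

    std::vector<float> dst(src.size());

    for (std::size_t r0 = 0; r0 < rows; r0 += tile) {
        const std::size_t r1 = std::min(r0 + tile, rows);
        for (std::size_t c0 = 0; c0 < cols; c0 += tile) {
            const std::size_t c1 = std::min(c0 + tile, cols);
            // Within one tile, both the source rows being read and the
            // destination rows being written are short enough to stay cached.
            for (std::size_t r = r0; r < r1; ++r) {
                for (std::size_t c = c0; c < c1; ++c) {
                    dst[c * rows + r] = src[r * cols + c];
                }
            }
        }
    }
    return dst;
}
```

With rows as time steps and cols as frequency bins, all points for bin f end up at dst[f * rows] through dst[f * rows + rows - 1], so the later per-frequency pass becomes a sequential read.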

If it seems I'm urging you to go on to the 5.17 source, that's true. Simon's 5.17 builds have so far not shown the expected improvement from transposition; if you could profile a 5.17 build, that might indicate why not.
                                                                   Joe