Author Topic: New apps based on code revision 2.2 'Noo? No, Ni!' have been released!  (Read 82382 times)

BenHer

  • Guest
There are more routines added to the benchmarking suite for 2.2 vs 2.0.  Notably timing & choosing which folding routine and choosing which transpose routine will be used.

When the program is running these tests it tries to make sure it gets accurate numbers on how well each of the routines will perform when crunching an entire WU.  It only has access to a counter to tell it how long each version required. However, in modern operating systems, because of multi-tasking, the O/S might freeze Seti and then begin running a media player, word processor or whatever else needs to use some CPU time.  If an interruption occurs the timing of a given routine will be completely wrong (time = time of routine + time for O/S switch + other program running + coming back).

Therefore, the benchmark routines, just before they start timing a function version, raise the task priority very high, and just after the timing they restore the task priority to where it was (usually the IDLE_PRIORITY level).

On most higher end systems (Pentium D and beyond, X2, etc.), the benchmark testing should only require around 3-4 seconds; on a quad core, perhaps more because of memory contention.

During this time the computer might seem unresponsive.  The priority setting and benchmarking occurred in the 2.0 version also, however a few more routines are timed in 2.2.

Note: Each individual pass/test turns priority high/low, and the slowest test is about 1/50th of a second on an AMD 3800+ X2.  Every low priority time dip between tests allows the O/S to do whatever switching is needed.
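
For anyone wanting to picture the raise/time/restore pattern described above, here is a minimal C++ sketch. It is not the app's actual code: the names PriorityHooks, ScopedHighPriority, time_once and choose_fastest are made up for illustration, the priority change is passed in as plain callbacks (on Windows they might wrap SetPriorityClass, but that is an assumption), and keeping the best of several passes is my own choice rather than something stated in this thread.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical hooks: callbacks that raise and restore the task priority.
// Kept abstract so the sketch stays portable.
struct PriorityHooks {
    std::function<void()> raise;    // called just before a timed pass
    std::function<void()> restore;  // called just after a timed pass
};

// RAII guard: priority is raised only for the lifetime of the guard,
// so it is restored even if the timed routine throws.
class ScopedHighPriority {
public:
    explicit ScopedHighPriority(const PriorityHooks& hooks) : hooks_(hooks) {
        if (hooks_.raise) hooks_.raise();
    }
    ~ScopedHighPriority() {
        if (hooks_.restore) hooks_.restore();
    }
    ScopedHighPriority(const ScopedHighPriority&) = delete;
    ScopedHighPriority& operator=(const ScopedHighPriority&) = delete;

private:
    const PriorityHooks& hooks_;
};

// Times a single pass of one routine, in seconds. Priority drops back
// between passes, giving the O/S a chance to schedule other work.
inline double time_once(const std::function<void()>& routine,
                        const PriorityHooks& hooks) {
    ScopedHighPriority guard(hooks);
    const auto start = std::chrono::steady_clock::now();
    routine();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Returns the index of the candidate with the smallest best-of-N time.
inline std::size_t choose_fastest(
        const std::vector<std::function<void()>>& candidates,
        const PriorityHooks& hooks, int passes = 3) {
    assert(!candidates.empty() && passes > 0);
    std::size_t best_index = 0;
    double best_time = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        double fastest_pass = std::numeric_limits<double>::infinity();
        for (int p = 0; p < passes; ++p) {
            const double t = time_once(candidates[i], hooks);
            if (t < fastest_pass) fastest_pass = t;
        }
        if (fastest_pass < best_time) {
            best_time = fastest_pass;
            best_index = i;
        }
    }
    return best_index;
}
```

Because the guard lives only inside time_once, the high-priority window is one pass long, which matches the "priority high/low per test" behaviour in the note above.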
« Last Edit: 21 Feb 2007, 04:34:13 pm by BenHer »

Offline KarVi

  • Alpha Tester
  • Knight Templar
  • ***
  • Posts: 252
Well that explains it  :)

IMHO it's something that needs fixing, since it has become very apparent in the newest version.

Don't really know how it could be done though, as I understand the reasoning behind raising the priority.

Running the test at just normal priority would remove the problem on most systems I think, but would also give the risk of wrong measurements.

It's a trade-off: Does one want absolutely accurate measurements, or an app. that doesn't interfere with normal operation.

I vote for the last, but I will let it be up to you guys to decide.
A smile is the shortest distance between two people (Victor Borge).

Furex

  • Guest
Memory Usage?
« Reply #47 on: 21 Feb 2007, 04:49:00 pm »
Being already used to seeing the lower memory requirements of R2.2, this one caught my eye:


Offline Simon

  • Ni!
  • Knight who says 'Ni!'
  • *****
  • Posts: 1045
    • Is it a bird? Is it a plane? No...its-the.net!
That would mean that for this particular WU, it chose chirping functions other than Alex Kan's.
Strange, since if supported those functions should almost always be used (and definitely always if they've been chosen before).

Is this host being used exclusively for BOINC, or do you do work alongside?

Regards,
Simon.

Furex

  • Guest
Quote
Is this host being used exclusively for BOINC, or do you do work alongside?

It only crunches in its idle time; do I sense an explanation coming? :)

Offline Simon

  • Ni!
  • Knight who says 'Ni!'
  • *****
  • Posts: 1045
    • Is it a bird? Is it a plane? No...its-the.net!
No,

just trying to lower the number of variables.

Lord Asmodeus

  • Guest
Quote
Well that explains it  :)

IMHO it's something that needs fixing, since it has become very apparent in the newest version.

Don't really know how it could be done though, as I understand the reasoning behind raising the priority.

Running the test at just normal priority would remove the problem on most systems I think, but would also give the risk of wrong measurements.

It's a trade-off: Does one want absolutely accurate measurements, or an app. that doesn't interfere with normal operation.

I vote for the last, but I will let it be up to you guys to decide.

I agree. If I understand it correctly, doing what you suggest will slow the crunching of a WU from time to time, which does not seem a big deal. I just had a series of around 20 WUs crunched very quickly (5-6 seconds each) and it made my computer unusable for as long as a minute.

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Quote
Now I have invalid results with the "normal/good" WUs, like with the "overflow" WUs too... :(

If I understand you right, it would be better to have an application that produces less correct results... but then I have better chances of valid results (credits)?

Where can I get them? ;)


(If I remember right, while I used Crunch3r's apps I didn't have invalid results...
Or I didn't find them? ;)
Maybe he had done something in another way?)

I found my coding mistake a few hours ago and apologize for any anxiety it has caused. I had seen some indication of the problem before release but didn't expect it to have much impact, so that bad judgement compounded the original mistake.
                                                                                    Joe

Offline Urs Echternacht

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 4121
  • ++
Quote
I found my coding mistake a few hours ago and apologize for any anxiety it has caused. I had seen some indication of the problem before release but didn't expect it to have much impact, so that bad judgement compounded the original mistake.
                                                                                    Joe
Don't worry, at least you found your mistake.
Attached are some more error reports from my dual P3 yesterday (0xc0000005). Hopefully that flaw can be found someday.

[attachment deleted by admin]
_\|/_
U r s

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Quote
...
Attached are some more error reports from my dual P3 yesterday (0xc0000005). Hopefully that flaw can be found someday.

That's a puzzle. For these 3, the callstack indicates it's happening while the app is putting its identifying information in stderr.txt, specifically the line in red:

fprintf( stderr, "Optimized SETI@Home Enhanced application\n\n" );
fprintf( stderr, "%9s: Ben Herndon, Josef Segur, Alex Kan, Simon Zadra\n", "Optimizers" );
fprintf( stderr, "%9s: %s %s %s-bit based on seti V%d.%2d  'Ni!'\n", "Version"
    ,_OS_, _fft_simd_, _bits_
    , gmajor_version, gminor_version );

fprintf( stderr, "%9s: (R-%s|%s)\n", "Rev", _release_, _compOps_ );

How it got to trying to execute non-existent code at 0x0055B8E3 I don't know.

The other one you posted is even more of a puzzle: the callstack indicated it was executing a tail section which should only be used if the WU has a num_samples not evenly divisible by the fft length.

[Edit: The above is wrong, I managed to look at the wrong disassembly listing.]
                                                                              Joe
« Last Edit: 23 Feb 2007, 12:39:54 pm by Josef W. Segur »

BenHer

  • Guest
Don't worry Joe, Urs is probably super over clocking his Pentium III.  ::)
(just kidding for those whose sarcasm sensors don't function)

Offline Urs Echternacht

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 4121
  • ++

Quote
That's a puzzle. For these 3, the callstack indicates it's happening while the app is putting its identifying information in stderr.txt, specifically the line in red:

fprintf( stderr, "Optimized SETI@Home Enhanced application\n\n" );
fprintf( stderr, "%9s: Ben Herndon, Josef Segur, Alex Kan, Simon Zadra\n", "Optimizers" );
fprintf( stderr, "%9s: %s %s %s-bit based on seti V%d.%2d  'Ni!'\n", "Version"
    ,_OS_, _fft_simd_, _bits_
    , gmajor_version, gminor_version );

fprintf( stderr, "%9s: (R-%s|%s)\n", "Rev", _release_, _compOps_ );

How it got to trying to execute non-existent code at 0x0055B8E3 I don't know.

The other one you posted is even more of a puzzle: the callstack indicated it was executing a tail section which should only be used if the WU has a num_samples not evenly divisible by the fft length.
                                                                              Joe
Hi Joe,
great that you were able to spot the relevant places in the source.
Maybe the fprintf C-function is not safe (stack buffer overrun is possible?). That is one of the other things I changed in my private build at SETI Beta, streaming to stderr instead, e.g.:
Code: [Select]
std::cerr<<"Some text"<<std::endl;
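
To make the comparison with the fprintf lines a bit more concrete, here is a small sketch of how a "%9s: ..." style line could be written with streams. It is only an illustration under assumptions: write_labelled is a made-up helper name, not something from the app or from my private build, and it takes any std::ostream so it can be pointed at std::cerr or at a string stream.

```cpp
#include <cassert>   // only needed for quick sanity checks
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Writes "<label right-aligned in a 9-column field>: <text>\n",
// mirroring what fprintf(stderr, "%9s: %s\n", label, text) prints.
// std::setw applies to the next inserted item only, so just the label is padded.
inline void write_labelled(std::ostream& out, const std::string& label,
                           const std::string& text) {
    out << std::setw(9) << label << ": " << text << '\n';
}
```

Calling write_labelled(std::cerr, "Optimizers", "...") would then replace the matching fprintf call. Whether this has any bearing on the access violation is a separate question; the sketch only shows the formatting equivalent.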
Quote
Don't worry Joe, Urs is probably super over clocking his Pentium III.
(just kidding for those whose sarcasm sensors don't function)
Hi Ben,
sarcasm noted. That machine has NO overclocking possibilities! (-> Gigabyte 6BXDU)

P.S. Attached another one.

[attachment deleted by admin]
« Last Edit: 22 Feb 2007, 05:19:30 pm by Urs Echternacht »
_\|/_
U r s

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Quote
...
Attached are some more error reports from my dual P3 yesterday (0xc0000005). Hopefully that flaw can be found someday.

For my previous reply I was looking at a disassembly of a test version rather than the fresh disassembly I did of SSE 2.2. Just clumsy, I guess.

With the right file, all but the first error are in the setup for testing pulse folding, this code:
Code: [Select]
  // At some angle ranges the total number of test folds may be too few to get good
  // timing measurements. Scale up so there are at least 8K test folds.
  while (NumPlans < 8192) {
    for (iL = 0; iL < 32; iL++) {
      FFTtbl[iL][1] *= 2;
      FFTtbl[iL][2] *= 2;
      FFTtbl[iL][3] *= 2;
    }
    NumPlans *= 2;
  }

The Intel compiler unrolls the for loop completely. The FFTtbl[32][5] array holds integers and lives on the stack, so the generated code loads a value from the stack into esi, adds esi to itself, and stores it back on the stack. The specific one at 0x0055B8E3 is "mov [ebp-44h], esi". But the dump info says "write attempt to address 0x0013D20B" and "ebp=0012e6f8". I have no theories on what's really happening, nor why it's only affecting your dual PIII system. But it should only happen at high angle ranges where that scaling is needed (the two which remained in your results list an hour ago had ar=1.948616 and ar=7.924980 from other host reports).
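
For readers who want to try the scaling step in isolation, here is a self-contained sketch of the same doubling logic. The names FoldTable and scale_up, the std::array representation and the returned doubling count are assumptions for illustration; only the 32x5 integer shape, the doubled columns 1 to 3 and the 8192 threshold come from the snippet above.

```cpp
#include <array>
#include <cassert>

constexpr int kRows = 32;  // matches FFTtbl[32][5]
constexpr int kCols = 5;
using FoldTable = std::array<std::array<int, kCols>, kRows>;

// Doubles columns 1..3 of every row, and num_plans with them, until
// num_plans reaches min_plans. Returns how many doublings were applied.
inline int scale_up(FoldTable& table, int& num_plans, int min_plans = 8192) {
    // A zero or negative count would never reach the threshold.
    assert(num_plans > 0);
    int doublings = 0;
    while (num_plans < min_plans) {
        for (auto& row : table) {
            row[1] *= 2;
            row[2] *= 2;
            row[3] *= 2;
        }
        num_plans *= 2;
        ++doublings;
    }
    return doublings;
}
```

Starting from 1000 plans, for example, the loop doubles four times and stops at 16000, leaving columns 0 and 4 untouched.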

For the first "Access Violation (0xc0000005) at address 0x0056A89F write attempt to address 0x02239040" you posted, the instruction is "movntps [esi],xmm3". That's the first store for a 4x4 transpose function which writes to the transposed PowerSpectrum array. Transposing is used heavily, so it seems like a one-time glitch unless the system has been choosing one of the other transpose routines since then.

If you would put a copy of the app and a WU renamed work_unit.sah in a temporary directory and execute the app with a -bench command line parameter, it would produce a stderr.txt showing the testing of routines and which ones are chosen. It then exits without actually crunching the WU, so should take less than a minute. I'll attach a README.txt which has all the command line options.
                                                                                 Joe

[attachment deleted by admin]

Offline Urs Echternacht

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 4121
  • ++
Quote
If you would put a copy of the app and a WU renamed work_unit.sah in a temporary directory and execute the app with a -bench command line parameter, it would produce a stderr.txt showing the testing of routines and which ones are chosen...
                                                                                 Joe
Hi Joe,
here you are. Took just a few seconds. Hope it helps somehow.

[attachment deleted by admin]
« Last Edit: 23 Feb 2007, 04:52:31 pm by Urs Echternacht »
_\|/_
U r s

Offline Simon

  • Ni!
  • Knight who says 'Ni!'
  • *****
  • Posts: 1045
    • Is it a bird? Is it a plane? No...its-the.net!
Hi folks,

I'm currently compiling Rev-2.2B apps with Joe's fix. In the meantime, I've uploaded a current source archive (including this fix).

Regards,
Simon.

 
