Seti@Home optimized science apps and information
Optimized Seti@Home apps => Windows => Topic started by: Raistmer on 11 Oct 2007, 05:11:56 pm
-
I wonder whether it would be worth distributing BOINC as a set of libs to link against (for Windows) plus headers to include, just like a usual API library?
Just to reduce initial compilation time (and the time spent on getting it to compile ;) ).
For example, gutil_text.C from BOINC 5.5 (sources taken from this site) failed to compile with Intel(R) C++ 10.0.027 [IA-32] used with MS VS 2005.
Error message:
..\..\..\boinc\api\gutil_text.C(335): error: a value of type "const char *" cannot be used to initialize an entity of type "char *"
char* q = strchr(p, '\n');
^
(this line is from the following function:
void MOVING_TEXT_PANEL::set_text(int lineno, const char* p) {
char* q = strchr(p, '\n');
while (p) {
if (q) *q = 0;
strlcpy(text[lineno++], p, 256);
if (!q) break;
p = q+1;
q = strchr(p, '\n');
}
}
)
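A minimal sketch of one const-correct way to write that loop, not the BOINC fix itself. In C++, strchr on a const char* returns const char*, which is what the compiler is complaining about, and the original also writes a 0 into the caller's const string. The sketch below only reads the input and copies each line out. TextPanel, kMaxLines and kLineLen are hypothetical stand-ins for the real panel class and its sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the panel's storage; the sizes are assumptions.
const int kMaxLines = 8;
const std::size_t kLineLen = 256;

struct TextPanel {
    char text[kMaxLines][kLineLen];

    // Copies each '\n'-separated line of p into text[], starting at lineno.
    // The input is only read, so p can stay const char*.
    // Returns the index one past the last line written.
    int set_text(int lineno, const char* p) {
        assert(p != 0 && lineno >= 0);
        while (lineno < kMaxLines) {
            const char* q = std::strchr(p, '\n');      // const overload, no cast needed
            std::size_t len = q ? static_cast<std::size_t>(q - p) : std::strlen(p);
            if (len >= kLineLen) len = kLineLen - 1;   // truncate, as strlcpy would
            std::memcpy(text[lineno], p, len);
            text[lineno][len] = '\0';
            ++lineno;
            if (!q) break;                             // last line reached
            p = q + 1;                                 // continue after the newline
        }
        return lineno;
    }
};
```

Calling set_text(0, "one\ntwo\nthree") on such a panel fills the first three lines and leaves the input string untouched.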
-
LOL, and the rest of the errors. I think they should just clean up the source so we can build our own with our own optimisation settings. Which version of the BOINC API did you use? [I see now, 5.5 sources] I had the same issues... (VS2005 w/ ICC & IPP)
- So what did you do about the const char * one?
- What about position_history(0) on line 126 of seti_header.cpp?
(just curious to see if you did the same as I did :D, I'd have to go back and look at what I did and compare)
Jason
-
For now I just downloaded the 5.10.20 release sources :)
So libboincapi compiles w/o errors now, but setiboincdb gives this error:
"
..\..\db\xml_util.cpp(641): error: identifier "strlcpy" is undefined
strlcpy(tmp_tag,tag,BUFSIZ);
^
"
It seems some include line or define is missing... strange, an uncompilable release? :o
About seti_header.cpp: I got this error
"
seti_header.cpp
compilation aborted for ..\seti_header.cpp (code 4)
Command-line error: cannot open precompiled header input file Release32-NoGFX-xP\KWSN_2.3S9_MB_SSE.pchi
"
I use MS VS only occasionally, so I need some time to understand why compilation is bothered by the lack of a precompiled header on a fresh build... why can't it generate a new precompiled header? :-\
P.S. Ah, precompiled header generation was just turned off :) So now it complains only about "ipp_t7.h" missing (I use the eval version of IPP, so I probably need to edit all these includes).
No mention of position_history in the error log so far.
-
First have a look at Util.h line 31 and see whether the #if !defined(HAVE_STRLCPY) block is active... if it is, then the implementation in util.c will be active too. If those aren't active then HAVE_STRLCPY is defined; you can 'find in files' to locate the definition if there is one. If it is defined, then maybe a wrong preprocessor definition is in play (e.g. __GNUC__ or _MSC_VER).
That bit was okay for me.
The precompiled headers need to be generated at least the first time. In the master projects I just set them, in the C++ section -> Precompiled Headers, to 'Create Precompiled Headers' and left them there for now. It'll save time to change it back later, when you build pieces often but don't need to alter the headers anymore.
[ I see you found this -- Good one :D]
For now you probably do need to create them (because they aren't there, and) because some of the includes and defines are messed up.
[The Intel include will be processor specific, chosen by ICC; just add the right directory of your ICC installation to your 'Additional Include Directories', and add IPP for good measure :D]
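For reference, a rough sketch of what such a fallback usually looks like when the platform has no strlcpy of its own. This is not the BOINC implementation. The name strlcpy_fallback is hypothetical (chosen so it cannot clash with a system strlcpy), and the guard macro spelling HAVE_STRLCPY is an assumption about the headers. The semantics follow the BSD convention: copy at most size-1 characters, always NUL-terminate, return strlen(src).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// In a real header this would sit behind something like
//   #if !defined(HAVE_STRLCPY) ... #endif
// (assumed macro name), so it only exists where the platform lacks strlcpy.

// Copies at most size-1 characters of src into dst and NUL-terminates when
// size > 0. Returns strlen(src); a result >= size means the copy was truncated.
inline std::size_t strlcpy_fallback(char* dst, const char* src, std::size_t size) {
    assert(dst != 0 && src != 0);
    const std::size_t srclen = std::strlen(src);
    if (size > 0) {
        const std::size_t n = (srclen < size - 1) ? srclen : size - 1;
        std::memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

If the declaration is compiled out but the call sites remain, you get exactly the "identifier is undefined" error quoted above, which is why checking the guard is the first step.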
-
thanks :)
I found that adding the line
#include "str_util.h"
to xml_util.cpp heals the setiboincdb project.
Tomorrow I will continue with IPP.
-
It took some more time to return %)
Well, after some editing of include statements I got the whole solution to compile, but now I have some errors at the link stage.
"
Linking... (Intel C++ Environment)
ipo: error #11033: Fatal error cannot open Release32-NoGFX-xP/analyzeFuncs.obj
xilink: error error_during_IPO_compilation: problem during multi-file optimization compilation (code 1)
xilink: error error_during_IPO_compilation: problem during multi-file optimization compilation (code 1)
"
And there is indeed no analyzeFuncs.obj! I tried to compile it as a single file; it compiles OK, but still no obj anywhere... Any ideas?
-
Maybe the IPP include directories for the static libs (Intel), for the seti_boinc project?:
C:\Program Files\Intel\IPP\5.1\ia32\tools\staticlib
C:\Program Files\Intel\IPP\5.1\ia32\include
C:\Program Files\Intel\Compiler\C++\9.1\IA32\Include
(On my machine; yours will be in a different location or IPP version)
A small thing that was broken in my first attempts (that might relate to analyzeFuncs) was:
- The cmath include was defined as nothing and there are duplicates in multiple places. In the code I have, the #define in one of the config.h files was commented out and marked 'more work needed.'
The 'more work needed' is to realise that MATH_LIB is multiply included in some files (breaking compilation if you want Intel MKL). Commenting out a few #include 'math.h' lines helps (they should say MATH_LIB anyway, it is a once-only-per-module deal, and it has already been included in one of the configs). Fixing the CMATH include to point to cmath from Microsoft (not MKL, it doesn't have one) fixes something too (if using MKL).
-
Hmm, did a quick search on that error you mention. If you check the buildlog.htm it might say something about 'could not open' a certain file.
Indications are this may occur when either there are no proper access rights to the TEMP directory, or VS2005 was installed under a different login account.
-
Well, I think no static libs, because I use the trial version of IPP, which supports only dynamic linkage. And I don't use MKL at all.
What makes me curious is why the compiler doesn't generate an object file for a source file that compiled OK (w/o any errors).
-
"Could not open" just means the file doesn't exist at all. And why doesn't it exist...? :o
(I use an admin account, the same one I used for the VS installation; no special access rights were applied)
P.S.
Full output:
Output Window Compiling with Intel(R) C++ 10.0.027 [IA-32]... (Intel C++ Environment)
gaussfit.cpp
seti_header.cpp
analyzeReport.cpp
pulsefind.cpp
timecvt.cpp
analyzePoT.cpp
schema_master.cpp
progress.cpp
chirpfft.cpp
sah_gfx_base.cpp
sah_gfx.cpp
worker.cpp
spike.cpp
s_util.cpp
malloc_a.cpp
analyzeFuncs.cpp
lcgamm.cpp
main.cpp
gdata.cpp
seti.cpp
version.cpp
Linking... (Intel C++ Environment)
ipo: error #11033: Fatal error cannot open Release32-NoGFX-xP/analyzeFuncs.obj
xilink: error error_during_IPO_compilation: problem during multi-file optimization compilation (code 1)
xilink: error error_during_IPO_compilation: problem during multi-file optimization compilation (code 1)
As one can see, analyzeFuncs.cpp was compiled w/o errors, but no analyzeFuncs.obj exists.
-
Well, just to see whether the compiler stage is not producing the .obj or is putting it in the wrong location, maybe turn on assembly output (.asm file) for analyzeFuncs.cpp and see where that goes... (with just a recompile of that file, not a link.)
The output directories for asm and object files on mine [analyzeFuncs.cpp, C++ output files] are set to '$(Intdir)', which I believe is inherited, so they should be in the release directory (analyzeFuncs.obj is sitting there in mine).
Jason
-
Wow, I had just done what you suggest :)
I turned on asm output (as verbose as it could be, the /FAcs option), replaced $(Intdir) with '.' and compiled again... No obj, no asm...
Only these files were found:
analyzeFuncs.cpp
analyzeFuncs.h
analyzeFuncs.i
seti_boinc-analyzeFuncs.Po
And nothing else with analyzeFuncs in the file name...
-
Well that IS weird :o. The only thing I can think of is that somehow analyzeFuncs is set to be excluded from the build.
What happens if you 'clean only' the seti_boinc solution, then compile only analyzeFuncs? Same thing?
-
Yes, the same. I tried to build only one project, to compile only the single analyzeFuncs.cpp file, even to switch back to a VS-style project and compile analyzeFuncs.cpp via the MS compiler.
The results are the same (except that the MS compiler complained about unknown options
------ Build started: Project: seti_boinc, Configuration: Release32-NoGFX-xP Win32 ------
Compiling...
cl : Command line warning D9002 : ignoring unknown option '/Qprec-div-'
cl : Command line warning D9002 : ignoring unknown option '/Qprec-sqrt-'
cl : Command line warning D9002 : ignoring unknown option '/QxW'
analyzeFuncs.cpp
) no analyzeFuncs.obj was generated.
-
One small difference there... I am doing a release QxW, and mine goes to the xW Release directory. Shouldn't yours be set to /QxP and be going to the Release xP?... Maybe it's that compiler command line.
-
Well, I just picked up the sources with that option set active, then changed the paths to correspond to my installation (different places for ICC and IPP, ICC 10 instead of 9, IPP 5.2 instead of 5.1 and so on), changed the defines to USE_SSE2, and changed the compiler options to /QxW.
So now it corresponds more to an SSE2 build than to anything else :) When I change to the Release32-NoGFX-xW option set I get a lot of errors, because of old wrong paths there, IMHO.
The host on which VS is installed is powered by an AMD 64 Venice, which supports the SSE2/SSE3 instruction sets, but it seems there is no sense in compiling with ICC SSE3 for AMD 64.
-
The case of the missing intermediate files! :o. I must go off to school now but I'll check and see if you found them when I get home. I think you are right that the setting shouldn't change the intermediate output directory/files as that is inherited from solution configuration anyway.
Well good luck , and maybe I can look some more when I get home (if you haven't already found them).
[We can go step by step through the AnalyzeFuncs.cpp settings if you need]
Oh, did you do the seti_boinc Linker include directories as well ?
Jason
-
Yes, the linker include paths are changed too.
But (!) when the project was compiled under the Release32-NoGFX-xW option set it failed to compile as a whole, yet there are analyzePoT.obj and analyzeReport.obj in the Release32-NoGFX-xW dir.
And there were no such objs when Release32-NoGFX-xP was used. So in the morning I will try to accommodate Release32-NoGFX-xW to my setup instead of the -xP one. Probably I did something wrong with the *-xP option set...
Thank you for the support, and good luck to you too :)
-
Sounds like progress! :D .. I hope you made notes the first time through all those settings!, mine are in green felt pen scrawled all over an old envelope, maybe I should make them into a text file one day :P
-
Back along the lines of your original post Raistmer (Boinc as library) ... I've been having some crazy thoughts.
Given the already fairly neat encapsulation of certain parts of the science application (namely the boinc interface, graphics api, xml utils, jpeg maybe more...) ,and that each instance of the science app (typically 2 to 8 copies) uses these. Might these low traffic functions actually be worthy candidates for ... *shudder* ... implementation as one or more DLLs? [or equivalent for other OSes, where available]
It would certainly reduce the overall memory footprint, especially for many cores (might influence cache too?). Can you [or anyone else for that matter] think of other reasons this may be a good or bad idea?
Jason
-
Well, as a big fan of QNX OS I think the more modular the app the better, as long as it doesn't impact performance ;) Another plus of a DLL'ed BOINC interface is shortened build times - devs certainly should like this idea ;)
Regarding my build attempts, switching to the Release32-NoGFX-xW option set brought a bunch of compile errors (12 errors) in benchmark.cpp and cpu_x86.cpp. I have a strong impression that the Optimizer project just simulated compilation before (with no errors but no objs either). All objs from this project are there now, apart from these two sources that failed to compile.
Will investigate the errors more closely, then post.
P.S. After a closer look only 3 errors remain (the others originated from a misplaced #endif, my fault). All belong to these lines of code (cpu_x86.cpp):
case _INTEL:
cache.shared_L1_L2 = true;
bool hyper = testBit(1,edx_,28);
.\cpu_x86.cpp(143): error: invalid combination of type specifiers
hyper is defined as _int64 somewhere. So changing the variable name solves this error (and the remaining 2 too).
P.P.S. Well, now I got the position_history( 0 )-related error too ::) After adding some missing #include statements and forcing const char* to char* conversions (hope it will not corrupt memory at run time %) ) it's the only remaining error.
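A small sketch of why the rename helps, assuming hyper really is a preprocessor macro pulled in from some header (which header is not identified here). The macro is simulated with long long so the sketch builds anywhere; test_bit and has_hyperthreading are hypothetical helpers, not the functions from cpu_x86.cpp.

```cpp
#include <cassert>

// Stand-in for whichever header defines the macro (assumption: something like
// "#define hyper __int64" is in effect on Windows). long long is used here
// only so the sketch builds on any compiler.
#define hyper long long

// Hypothetical helper: returns true if the given bit of value is set.
inline bool test_bit(unsigned value, int bit) {
    return ((value >> bit) & 1u) != 0;
}

inline bool has_hyperthreading(unsigned edx) {
    // bool hyper = test_bit(edx, 28);   // expands to "bool long long = ...":
    //                                   // invalid combination of type specifiers
    bool hyper_flag = test_bit(edx, 28); // renamed, so no macro expansion happens
    return hyper_flag;
}

#undef hyper
```

The preprocessor replaces the identifier before the compiler ever sees it, so any variable named like the macro turns into a second type name in the declaration.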
-
Well, now I got the position_history( 0 )-related error too .....
Yeah, I figured the cryptic message was saying there was no constructor of type position_history( int) .... and position_history seems to be a vector list or some such. Carefully commenting the '0' out seemed to let C++ use the default constructor, that 'seemed' to work for me.
I'm all for either explicitly using an existing correct constructor to initialise with, though, or making a constructor of the right form if really needed.
Have fun, you're nearly there :D After a few weeks on and off, I'm still pretty mystified by a lot of it but baby steps and lots of silly questions seems to be clarifying a lot gradually.
Jason
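The two options mentioned above could look roughly like this. The real position_history type is not shown anywhere in the thread, so History, HeaderA and HeaderB are purely hypothetical stand-ins used to illustrate member initialisation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real position_history type.
struct History {
    std::vector<double> samples;
    History() {}                                     // default: empty history
    explicit History(std::size_t n) : samples(n) {}  // added: n zeroed entries
};

// Option 1: drop the argument and spell out the default constructor.
struct HeaderA {
    History position_history;
    HeaderA() : position_history() {}
};

// Option 2: call a constructor that really exists for the type.
struct HeaderB {
    History position_history;
    HeaderB() : position_history(std::size_t(0)) {}
};
```

Either way the member ends up empty; the difference is only whether the intent (start with zero entries) is stated in the initialiser.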
-
Yes, I commented it out too for the moment.
Now I encounter some linker errors. Some of these originate from using the evaluation version of IPP (so there are slightly different lib names). But the remaining ones are much worse. For example,
the linker can't find the definition of the parse_command_line function. There is a declaration
extern int parse_command_line(char*, char**);
in str_util.h but... no definition was found in the project sources. I can hardly believe this function belongs to IPP. So where to find its definition is still an open question ... :o
-
.... So where to find its definition is still an open question ... :o
Hmm, in mine that seems to be in Util.C, part of the libboinc solution. Maybe your build order has that being compiled after seti_boinc, where it is called from main.cpp? Or maybe util.H is not included [at top of main.cpp] for some reason?
[Ahhh, maybe the libboinc directory is not added in 'Additional Include Directories' for the seti_boinc soln?]
-
In my BOINC sources this function is defined in boinc\lib\str_util.c. And this file isn't included in the project source tree. That's why I didn't find the definition while searching the project files via VS. A file search over the entire directory via FAR was more successful :)
Probably I should add this file to the libboinc project in the project tree.
(I use one of the latest BOINC sources, not downloaded from this site)
P.S. Now only 3 unresolved links remain. Will search for other missing sources...
P.P.S. LoL :D Great definitions of the missing links from boinc\api\static_graphics.C:
void boinc_app_mouse_button(int x, int y, int which, int is_down) {
}
void boinc_app_mouse_move(int x, int y, int left, int middle, int right) {
}
Probably there should be a // ToDo comment ;)
I added this file to libboincapi, but it doesn't compile - the boinc_resolve_filename function declaration is missing. It seems there are some differences in the BOINC API that need to be reflected in the project tree.
P.P.P.S. After some modifications of the project tree and adding some #include statements I got an exe file. Hope it passes the validity tests ... (fingers crossed) :)
-
LOL, Like a house of cards!
-
These many generations of the BOINC interface are one of the many things I got an answer about from Joe Segur. I asked which was the best to use and if there was a preferred version. The basics of Joe's answer to that were something like 'BOINC sources from about the time of that science app build'. Which seemed reasonable ... So for some reason I ended up sticking with the BOINC sources out of the 1.31 package and never bothered updating :D Sounds like I'll keep it that way, for the time being, if there are many more differences :D
-
Well, "results strong similar" but it is ~2% slower than the optimal one for AMD 64 processors.
I think I will create a new thread and attach my build (its sources, to be precise) to the post. Maybe someone will save time by compiling already modified sources.
-
Congratulations... magnificent!
heinz
-
Thank you 8)
With Jason's help and support indeed :)
The rar'ed distribution took ~22MB... I tried to upload it last night with some comments, but in the morning discovered that IE had crashed :(.
Maybe it's better to attach it across a few sequential posts... will try again later.
-
Well, "results strong similar" ....
Nice :D, sounds to me like it might be only some compiler flags different for only 2% difference!
Jason
-
Here are all the diffs I made to compile the 2.4 sources (actually 2.39S, but there are only 2 differences, in #define strings, which were added to the diffs after the build and did not justify rebuilding the client) with VS 2005 and trial versions of ICC and IPP.
opt_config.h was added to simplify tuning of the conditional defines and compilation across the few source files affected.
[attachment deleted by admin]
-
Good one to record the changes like that, mine are scribbled on an old envelope :P. Looks like similar changes overall. Did you end up with favourite compiler settings? The 2.4 lunatics one for QxN looks pretty close to the ones I've played with on my P4.
Jason
-
Well, I didn't even record my changes ;) I just downloaded the lunatics 2.4 sources yesterday from the link on the main SETI board, ran the WinDiff utility and collected all discrepancies in one rar ::)
I used SSE2 build options because that binary was intended to run and be profiled on an AMD 64 host. I use CodeAnalyst as the profiling tool (guided by the assumption that AMD should know their own CPUs better than Intel ;) ). It would be interesting to compare your vTune data with the CodeAnalyst data to highlight areas of interest for improvement.
I probably need to check that option set more precisely, because my build is a little less than optimal. Another possibility: the options are fine and the 2% difference in speed comes from the trial nature of my IPP installation. Intel allows only dynamic linking with the trial IPP library. So DLL calls... I really don't know whether this could account for the 2% slowness or not (even the 2% is still preliminary - tested only on a short WU).
-
Very good idea to compare SSE2 QxN P4 vTune data against your SSE2 AMD build. There are some arguments about that! :D
Maybe you found hotspots in an inner folding routine? Mine chooses FoldArrayBy2AL and spends about 10% of total time in there. Maybe yours chooses a different routine? Either way we could even compare the asm listing output of those, which might explain some differences between the chips! (Those functions don't depend on IPP as far as I know.)
[Note also that because I am using ICC, about 11% of time is being spent in _Intel_fast_memcpy. Having looked at a mixture of improved memcopies, elimination of them, and hybrid processing techniques in other areas, that might allow some generally applicable improvements (not just for Intel chips).]
Even though yours calls the dynamic library, it would be nice to see whether the dispatching is calling the same IPP functions (but DLL versions)... or some different, maybe more generic ones... Mine calls the w7 static ones, which are P4 SSE2, but the internal names given by vTune / CodeAnalyst will give the real names.
Jason
-
Well, some initial results.
Most of the time my version spends in sse3_ChirpData_ak
(1 function, 78 instructions, Total: 12300 samples, 19.37% of samples in the module, 5.12% of total session samples)
[these lines take the most:
Address Line Trace Source Code Bytes Timer samples
0x4a9dfe 125 m = vec_recip3(_mm_add_ps(_mm_mul_ps(x, x), _mm_mul_ps(y, y))); 3989
0x4a9d76 111 c = _mm_add_ps(_mm_mul_ps(_mm_add_ps(_mm_mul_ps(_mm_add_ps(_mm_mul_ps(y, CC3), 3631
]
It's pretty strange, because I used SSE2 build options... Maybe the function name is not quite adequate?... (or maybe something is wrong with the profiler or my understanding of its results :) )
The next one is fastcopy_I
(1 function, 24 instructions, Total: 8714 samples, 13.72% of samples in the module, 3.63% of total session samples)
and in the IPP dll most samples hit ippsZero_8u
(1 function, 1160 instructions, Total: 46923 samples, 99.89% of samples in the module, 19.55% of total session samples)
[this line leads:
Address Code Bytes Instruction Symbol Timer samples
0x200ede9 0x 0F 28 4C 32 10 movaps xmm1,[edx+esi+10h] ippsZero_8u+1473833 5965
]
1 instructions, Total: 5965 samples, 4.87% of samples in module p:\bin\intel\ipp\5.3\ia32\bin\ippst7-5.3.dll, 0.99% of total session samples
As one can see, it's almost the only function called in the whole dll ... (very strange too).
It was a 240 sec profiling run. What time scale is best suited for profiling all the main app activities? I will try to increase the profiling time, maybe it will give more adequate results...
Some additions:
sse_sum2,3,4,5 and sse_f_GetPeak have the highest number of unaligned accesses.
sse3_ChirpData_ak and fastcopy_I have the most data cache misses.
-
Well, some initial results.
Most of the time my version spends in sse3_ChirpData_ak
...
It's pretty strange, because I used SSE2 build options... Maybe the function name is not quite adequate?... (or maybe something is wrong with the profiler or my understanding of its results :) )
The program design is to build all the hand-optimized code with at least whatever minimum options are required, then use run-time testing of the host to decide which of those routines to test. So the opt_SSE3.cpp module is built with its needed SSE3 setting, your CPU supports SSE3, and it tests faster than the other chirp routines on your system so is chosen as the one to use during actual crunching.
The fraction of time spent chirping is very much affected by the angle range. The reason WUs at high angle range are quick is that they do no Gaussian fitting and not much Pulse or Triplet finding. Chirping is also reduced, but not so much, so it becomes more of the total run time. I don't know CodeAnalyst, so don't understand the "19.37% of samples in the module, 5.12% of total session samples" distinction.
Joe
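The selection scheme described above (build every variant, check what the host supports, time the candidates, keep the fastest) can be sketched roughly as follows. This is not the actual science app code: Candidate, pick_fastest and the two placeholder routines are hypothetical, the routines do not perform real chirp maths, and std::clock is only a coarse timer.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <vector>

typedef void (*ChirpFn)(const float* in, float* out, std::size_t n);

// Placeholder workloads standing in for the hand-optimized variants.
void chirp_plain(const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = in[i] * 0.5f;
}

void chirp_unrolled(const float* in, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {          // four elements per iteration
        out[i]     = in[i]     * 0.5f;
        out[i + 1] = in[i + 1] * 0.5f;
        out[i + 2] = in[i + 2] * 0.5f;
        out[i + 3] = in[i + 3] * 0.5f;
    }
    for (; i < n; ++i) out[i] = in[i] * 0.5f;  // leftover elements
}

struct Candidate {
    const char* name;
    ChirpFn fn;
    bool supported;   // result of a CPU feature check (e.g. SSE3 present)
};

// Times every supported candidate on dummy data and returns the index of the
// fastest one, or -1 if none is supported on this host.
int pick_fastest(const Candidate* c, int count, std::size_t n, int reps) {
    assert(n > 0 && reps > 0);
    std::vector<float> in(n, 1.0f), out(n);
    int best = -1;
    std::clock_t best_time = 0;
    for (int i = 0; i < count; ++i) {
        if (!c[i].supported) continue;    // never run code the CPU cannot execute
        const std::clock_t t0 = std::clock();
        for (int r = 0; r < reps; ++r) c[i].fn(&in[0], &out[0], n);
        const std::clock_t dt = std::clock() - t0;
        if (best < 0 || dt < best_time) { best = i; best_time = dt; }
    }
    return best;
}
```

With a scheme like this, seeing an SSE3-named routine in the profile of a build compiled with SSE2 options is expected: the global compiler flags only affect the generic code, while the per-module variants are picked at run time.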
-
...so don't understand the "19.37% of samples in the module, 5.12% of total session samples" distinction. Joe
As well as the familiar/traditional instrumented 'Device Under Test' style profiling, vTune (and, I guess from this data, CodeAnalyst too) collects the OS/system counters, so data is available on all processes/threads running at the time of the test.
Without having seen the rest of the data: ( And presuming Time-based sampling was used rather than Event-Based Sampling)
From the given information, if it were vTune, for the module/process which spent 20% of its time in the chirp routine, that 20% self time constituted about 5% system time .... This 'might' imply the total self time of the module makes 25% of the system time.
That might suggest a single-threaded module going full pelt (constant 100% usage) on 1 core of a quad. Constant 100% usage would be one of the first system-level optimisation goals. !!!!GOAL!!!! Then move on to further optimisation levels --> application architecture level --> microarchitecture level.
Otherwise, if it's a dual or single core, then it may be using less than 100% of the available system CPU time ... either other processes are running and taking system resources during the profile (you can diagnose system problems like this), or the module is either IO or memory bound (which might suggest deeper optimisation if system problems are eliminated).
Again, those are just guesses / general guidelines without looking at other data... At system level, for example, what proportion the total module samples were of total system samples, and especially a CPU usage graph by module, might tell you that you forgot to stop BOINC (done it many times), maybe a virus scan had started, a Windows update, maybe you were watching a DVD? LOL (joke)
Jason
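The arithmetic behind that estimate, as a tiny sketch. It assumes both percentages refer to the same sample count (the function's samples as a share of its module and as a share of the whole session); module_share_of_session is a hypothetical helper name.

```cpp
#include <cassert>

// Given one function's share of its module and of the whole session (both in
// percent), returns the module's share of the session in percent.
// Reasoning: f/module = a and f/session = b imply module/session = b / a.
inline double module_share_of_session(double pct_of_module, double pct_of_session) {
    assert(pct_of_module > 0.0);
    return 100.0 * pct_of_session / pct_of_module;
}
```

With the figures quoted for sse3_ChirpData_ak (19.37% of the module, 5.12% of the session) this gives roughly 26.4%, and the fastcopy_I figures (13.72% and 3.63%) give about the same, which is where the "about 25% of system time" reading comes from.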
-
:)
Didn't watch a DVD ;) The host under test is an AMD 64 3200 Venice, SSE3 support is indeed available.
Yes, with a timer-based profile CodeAnalyst gathers data on the whole system. Yes, BOINC was running in the background (the Einstein project at that very time). I was interested only in the time distribution inside the SETI exe and the IPP dll, so I didn't care about stopping/restarting BOINC during the test. It makes "total system time %" meaningless, sure. But SETI should take ~50% of CPU time in this situation, not just 25%. Maybe CodeAnalyst counted the IPP dll as a distinct module?...
Work Unit Info
True angle range: 0.405774
Any comments about why ippsZero_8u takes the most time, please?
And (according to the dll name) it seems the IPP dispatcher chose the "standard" library version, not one of the specifically optimized ones (not w7, for example).
-
:)
Didn't watch a DVD ;) ... But SETI should take ~50% of CPU time in this situation, not just 25%. ....
Right so Single core (like my non HT p4),
more Guesses:
~50% 1 Einstein task
~1 to 5% - boinc (is higher because of context switching on single core, I've measured, see below)
~2 to 20% - CodeAnalyst (high sampling rate increases load)
~2 to 10% - Other system/kernel drivers & services
subtotal : 55% ->85% ... Average ~70% ;)
remaining 45%~15% Average 30% - your seti run.
So before you can move on to a deeper optimisation level, you need to measure/graph with CodeAnalyst whatever the equivalents are of these system counters (vTune names):
1) With Boinc+Einstein+your seti task (Same conditions as you did)
- "System: Processor Queue Length"
- "System: Context Switches/sec" might also be helpful
2) Without Boinc+Einstein, just your seti task
- "System: Processor Queue Length"
- "System: Context Switches/sec" might also be helpful
Maybe some memory usage figures might show something too, if you have limited physical RAM etc...
"System: Processor Queue Length" (vTune name) gives a reading of how many NON-IDLE threads are waiting in the queue for CPU time .... On my 2.0GHz non-HT P4 this typically averages about 5 with a seti run (but no boinc+seti), which means that, for the software I run, I could benefit from a dual core of at least 2GHz, preferably a bit more, to bring it into the range of 1 to 2. (A fast quad would probably be wasted on me, but would give practically every running thread, on average, a fresh whole core to itself...)
"System: Context Switches/sec" might also give an idea of how much priority competition is happening on your machine (threads/modules competing). You see this rise slightly during mouse moves, or when having more active background programs that poll for something regularly (e.g. speedfan, boincview); that looks like speed humps in the context switches/sec.
Any comments about why ippsZero_8u takes the most time, please?
And (according to the dll name) it seems the IPP dispatcher chose the "standard" library version, not one of the specifically optimized ones (not w7, for example).
Mine spends some large times in a few of the IPP functions. When you get to do some application- and architecture-level performance measurement you will see the reasons; in some small way it might partially be related to the 'denormal data' issue you brought up before (take a look at the IPP flash tutorials about that). I've been thinking about ways to approach a custom (stripped down) FFTW build for a while now, but am not ready yet.
The use of the standard library and the fact that it would be a DLL on a single core would be an issue too(probably extra context switches / cpu queue length)... means that like me you'd probably benefit from an extra core ;) so if you need to justify going to more cores for santa to bring one then "I need one for software development purposes" is probably a pretty good reason to add to the list ;D.
IMO, from the measurements I get, It is a myth that software doesn't benefit from multicore or even HT yet. Who runs only 1 single threaded process at a time? Only DOS! [ And perhaps reviewers doing synthetic benchmarks] The windows OS handles all the thread switching much better with multicore or even HT, for DLLs and services. Even without boinc/seti running, system responsiveness and use of system resources would improve for us :D
Jason
-
Yes.... but if it were multicore there would be multiple seti/einstein processes to eat the CPU too ;)
It seems my version is still not appropriate for profiling; it is better suited for debugging - checkpointing is broken.
-
Yes.... but if it were multicore there would be multiple seti/einstein processes to eat the CPU too ;)
It seems my version is still not appropriate for profiling; it is better suited for debugging - checkpointing is broken.
LOL, Good point, though you would tend to use the fully loaded cores profile data just for overall system performance analysis rather than program profile information. You would stop boinc for deeper module profile to not obscure the run.
Checkpointing? Sounds like a boincapi problem, maybe.
-
...checkpointing is broken.
The default checkpoint interval is 300 seconds. When running with BOINC, the "Write to disk" preference overrides that, when running standalone you need to use an init_data.xml file to supply that and maybe a useful memory size. The knabench package has a suitable one, but I often use this simpler one:
-----------------------------------------------------------------------------
<app_init_data>
<wu_cpu_time>0</wu_cpu_time>
<checkpoint_period>60.000000</checkpoint_period>
<host_info>
<m_nbytes>134217728.000000</m_nbytes>
</host_info>
</app_init_data>
-----------------------------------------------------------------------------
Joe
-
Thank you, but my app does write checkpoints (maybe only every 300 sec, but it does). It can't restore the computation state from the saved data - that's what I meant when I wrote "checkpointing broken".