Seti@Home optimized science apps and information
Optimized Seti@Home apps => Linux => Alternate Hardware platforms => Topic started by: Joe Fox on 10 Oct 2014, 04:39:12 pm
-
Hi all,
I have been doing some work on the OSX OpenCL port of multibeam, and have run into a roadblock. The code branch that I am using is setisvn sah_v7_opt/AKV8. I got a successful build with Xcode 5.0.2, following the readme included in the project. However, when I execute the binary against reference_work_unit.sah I get the following output:
Used GPU device parameters are:
Number of compute units: 280
Single buffer allocation size: 64MB
Total device global memory: 1536MB
max WG size: 512
local mem type: Real
period_iterations_num=23
Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3820.
Waiting 30 sec before restart...
The full stderr.txt is attached if more info is needed.
Earlier versions of this code were working back in early August, but something changed and now I run into this error -49. Unfortunately I don't know enough about OpenCL to troubleshoot further. Any suggestions on where the problem may lie?
Thanks
-Joe Fox
-
#define CL_INVALID_ARG_INDEX -49
Maybe the latest changes for AP broke the MB build. I'll check whether it compiles for Windows, then we will see.
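For reference, the numeric codes that come up in this thread can be mapped back to their symbolic names. The following is only a minimal sketch: the helper names `cl_status_to_name` and `cl_status_from_name` are hypothetical, and the values are copied by hand from the OpenCL headers so that the example builds without them. Real code would include the platform's `cl.h` and use its `CL_*` constants directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Subset of OpenCL status codes, copied by value so this sketch builds
   without the OpenCL headers. */
struct cl_status_name { int code; const char *name; };

static const struct cl_status_name k_status_names[] = {
    {   0, "CL_SUCCESS" },
    { -43, "CL_INVALID_BUILD_OPTIONS" },
    { -48, "CL_INVALID_KERNEL" },
    { -49, "CL_INVALID_ARG_INDEX" },
    { -50, "CL_INVALID_ARG_VALUE" },
    { -51, "CL_INVALID_ARG_SIZE" },
    { -52, "CL_INVALID_KERNEL_ARGS" },
};

static const size_t k_status_count =
    sizeof k_status_names / sizeof k_status_names[0];

/* Hypothetical helper: symbolic name for a status code, or "UNKNOWN". */
const char *cl_status_to_name(int code) {
    for (size_t i = 0; i < k_status_count; ++i)
        if (k_status_names[i].code == code)
            return k_status_names[i].name;
    return "UNKNOWN";
}

/* Hypothetical helper: reverse lookup. Returns 1 and stores the code in
   *out when the name is in the table, otherwise returns 0. */
int cl_status_from_name(const char *name, int *out) {
    assert(name != NULL && out != NULL);
    for (size_t i = 0; i < k_status_count; ++i) {
        if (strcmp(k_status_names[i].name, name) == 0) {
            *out = k_status_names[i].code;
            return 1;
        }
    }
    return 0;
}
```

Printing `cl_status_to_name(err)` next to the raw number in a log line would make messages like "call failed (-49)" readable without looking the code up.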
-
BTW, what device are you running on? 280 CUs??? Impressive...
-
Cool, thanks. This is on my 2013 MacBook Pro running OS X 10.9.5.
Thanks
-Joe Fox
-
Are you sure that you are using the current code from the svn repository? Your stderr.txt shows "MultiBeam_Kernels_r632.cl", but the current svn revision is r2725.
-
Ah, that r632 is just due to my build environment not setting the release DEFINE properly. The actual svn revision that I am on is r2724
Thanks
-Joe Fox
-
Then could you look up which svn revision from August was the last one that worked for you? That would help to identify the possible changes faster.
Could you specify which readme you are referring to?
ps: I have used Xcode 4.5.3 (compiler only) so far, because that was the last version of Xcode that was Lion compatible (needed for AstroPulse v7). For Intel GPUs that is of course not necessary.
-
Well, the Windows build fails with the same error, so the code tree is indeed broken.
Please wait until I fix it ::)
-
Yes, I will go back thru revs to narrow down when it stopped working for me.
As for readme, I used "AKv8/ConfigureLine_AKv8c_OPENCL_SSE3_OSX_MBv7.txt" as my starting point for getting the correct project settings.
Thanks
-Joe
-
Ah, ok. I'll hold tight until you have the windows side sorted.
Thanks!
-Joe
-
I checked in a fix, try to rebuild now.
And please share your findings with the team from time to time :)
-
Unfortunately, it looks like I still get the same error. The only difference I see in the stderr is now local mem type is "Emulated" rather than "Real".
Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.775000
Used GPU device parameters are:
Number of compute units: 280
Single buffer allocation size: 64MB
Total device global memory: 1536MB
max WG size: 512
local mem type: Emulated
period_iterations_num=23
Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3820.
Waiting 30 sec before restart...
I will continue with the effort to find which revision looks to break things, though it may be next week before I have sufficient time to isolate it.
Thanks
-Joe
-
Please check that the fft_execute.cpp file was updated.
-
Ah, I examined the svn diff, and looks like the code is now wrapped with an ifdef USE_OPENCL_INTEL. When I set that macro in the CFLAGS, then my binary works again. Thank you!
That leads me to a question. What should the USE_OPENCL_??? macros be set to for a wide distribution build, like a beta build? For my local machine, which has an Intel GPU, USE_OPENCL_INTEL makes sense, but what about the new Mac Pro desktop workstations? They have an nvidia GPU if I recall correctly. How would a widely distributed binary that was compiled with USE_OPENCL_INTEL work on the nvidia system?
Thanks for the assistance!
-Joe
-
"Emulated" is what I am working on currently.
-
Oh, and as for execution time with the reference result set: I will go back through and build CPU only, then GPU, and report back here with the processing times.
-Joe
-
You're better off checking which hardware the new Mac Pro workstations contain:
https://www.apple.com/mac-pro/
So far we have built separate apps, one per device vendor.
-
Ok, so BOINC has the ability to distribute the correct client not just for the OS platform, but also for other parameters like GPU vendor? That's cool.
-Joe
-
oh, and looks like the mac pro has dual AMD D300s or D500s, not nvidia. My mistake.
-
Looks like there may be another problem. My binary runs successfully, but doesn't produce valid results. Attached is the results.sah and stderr.txt from my latest run. For what it is worth this ran in 14 minutes and 10 seconds.
-
Currently we use 3 separate binaries for different GPU types. There are some peculiarities for each GPU vendor, so I consider it easier to maintain 3 separate optimized builds than one universal under-optimized one.
-
Glancing at your first log, it seems you are missing the correct set of defines.
I'll post what I use for Windows here; maybe Urs will correct it for the OS X specifics.
-
Regarding maintaining the different binaries, that makes sense. Regarding the incorrect defines, thanks. I'll go back to the OSX list, and also compare it to the Windows build list that you post.
Thanks
-Joe
-
The "reference_work_unit.sah" in the "test_workunits" directory seems to be an old v6 workunit without autocorr-signal detection. Instead, try to use "refquick.wu" or "reftiny.wu" from "DynWUs" directory for quick testing.
Additionally, I've adapted the readme you have been working from for Intel GPU builds (see attached). You may want to compare the "-D" command-line switches from your build (you called them "macros" earlier; each switch defines a preprocessor macro) with the ones in there. "-DSETI7" is especially important for good results.
-
Thank you Urs, I will go through these and report back.
-
I don't see "refquick.wu" or "reftiny.wu" inside of DynWUs. I only have the following:
PG0009.wu
PG0395.wu
PG0444.wu
PG1327.wu
init_data.xml
PG0009_v7.wu
PG0395_v7.wu
PG0444_v7.wu
PG1327_v7.wu
Any advice?
-Joe
-
Here is what I found with using one of the DynWUs.
I used "PG1327_v7.wu"
and got the attached results.
Elapsed time was 6 minutes and 37 seconds.
How can I test the results for accuracy?
Thanks
-Joe
-
List of defines I used in the Windows build:
OCL_CHIRP3;USE_OPENCL;ATI_OS_WIN;SETI7;USE_FFTW;WIN32;_WIN32;_MT;NDEBUG;_WINDOWS;CLIENT;_CONSOLE;USE_I386_OPTIMIZATIONS;USE_I386_XEON;USE_SSE;USE_AMD
OCL_CHIRP3 is required for best precision, USE_FFTW is for the CPU FFT when needed, and SETI7 is required too, as Urs already described.
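The list above is in the semicolon-separated form used by the Windows project settings, while the OS X build takes the same names as "-D" switches in CFLAGS. A minimal sketch of that translation follows. The helper name `defines_to_cflags` is hypothetical and not part of the source tree, and Windows-only entries such as WIN32 or ATI_OS_WIN would still have to be dropped by hand.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turns "A;B;C" into "-DA -DB -DC".
   Empty entries (e.g. ";;") are skipped.
   Returns 0 on success, -1 if out is too small. */
int defines_to_cflags(const char *defines, char *out, size_t out_size) {
    size_t used = 0;
    const char *p = defines;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    while (*p != '\0') {
        size_t len = strcspn(p, ";");   /* length of the current entry */
        if (len > 0) {
            int n = snprintf(out + used, out_size - used, "%s-D%.*s",
                             used ? " " : "", (int)len, p);
            if (n < 0 || (size_t)n >= out_size - used)
                return -1;              /* output would be truncated */
            used += (size_t)n;
        }
        p += len;
        if (*p == ';')
            ++p;                        /* skip the separator */
    }
    return 0;
}
```

For example, the input "USE_OPENCL;SETI7;OCL_CHIRP3" yields "-DUSE_OPENCL -DSETI7 -DOCL_CHIRP3".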
-
In the Lunatics downloads you will find some tools, such as more test workunits and a benchmark script for "Terminal" usage. That benchmark package contains another tool for result comparison (rescmp4_d, d for Darwin). You can also use that comparison tool standalone if you have two results to compare (e.g. result_a.sah, result_b.sah).
If you start the comparison tool without arguments it prints usage instructions.
-
One interesting thing that I have found in my testing of the new builds: only the build with USE_OPENCL_INTEL works correctly, regardless of the video card manufacturer used in the Mac. I have tested on a new Mac Pro with dual AMD D700s, a MacBook Pro with an Intel Iris, and an older MacBook Pro with an nvidia. The USE_OPENCL_HD5xxx build would fail to compile the .cl files, and the USE_OPENCL_NV build would fail with the original error I ran into (CL error -49). My conclusion is that the Macs are using Intel OpenCL drivers. Does this seem reasonable?
-
All variants should actually build OK.
I'll check what is going on with NV. MultiBeam OpenCL NV is rarely used (for Windows and Linux we have a CUDA version that is faster), so the OPENCL_NV define could be a little unmaintained there.
Regarding USE_OPENCL_HD5xxx: that's for ATi cards, and it should be used in addition to USE_OPENCL.
What errors do you see for this build?
-
Attached is the stderr.txt. Let me know if you need additional info. It was built using the same build flags and preprocessor flags as the intel version that works. The only difference is USE_OPENCL_HD5xxx instead of USE_OPENCL_INTEL.
-
Thanks.
-43 is the CL_INVALID_BUILD_OPTIONS error. It seems some compiler option is not accepted.
I'll probably take a closer look on the weekend.
-
These are the build options for the OpenCL compiler when HD5xxx is active (which has no __APPLE__-specific variant yet):
#elif USE_OPENCL_HD5xxx
strcpy(buildoptions,"-w -DRESULT_SIZE=32 -cl-unsafe-math-optimizations -DUSE_OPENCL_HD5xxx -fno-bin-amdil");
Joe, could you try a rebuild using these options in src/GPU_lock.cpp instead:
#elif USE_OPENCL_HD5xxx
#if __APPLE__
strcpy(buildoptions,"-w -D__APPLE__ -DRESULT_SIZE=32 -cl-unsafe-math-optimizations -DUSE_OPENCL_HD5xxx");
#else
strcpy(buildoptions,"-w -DRESULT_SIZE=32 -cl-unsafe-math-optimizations -DUSE_OPENCL_HD5xxx -fno-bin-amdil");
#endif
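The same selection can be sketched as a small function with a runtime flag instead of the preprocessor, which makes both variants easy to inspect. This is only an illustration: `hd5_build_options` is a hypothetical name, and the assumption behind it (that the AMD-specific "-fno-bin-amdil" switch is the option Apple's OpenCL compiler rejects with -43) is an inference from this thread, not something confirmed here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the HD5xxx OpenCL build options into buf.
   for_apple != 0 selects the variant without -fno-bin-amdil.
   Returns 0 on success, -1 if buf is too small (buf is left untouched). */
int hd5_build_options(char *buf, size_t buf_size, int for_apple) {
    static const char apple_opts[] =
        "-w -D__APPLE__ -DRESULT_SIZE=32 -cl-unsafe-math-optimizations"
        " -DUSE_OPENCL_HD5xxx";
    static const char other_opts[] =
        "-w -DRESULT_SIZE=32 -cl-unsafe-math-optimizations"
        " -DUSE_OPENCL_HD5xxx -fno-bin-amdil";

    const char *opts = for_apple ? apple_opts : other_opts;
    size_t len = strlen(opts);

    if (len + 1 > buf_size)
        return -1;
    memcpy(buf, opts, len + 1);   /* include the terminating NUL */
    return 0;
}
```

Unlike the bare strcpy, the length check means an undersized buffer is reported instead of overrun.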
-
Hi Urs,
I applied the change that you suggested. I now get error -49 instead of -43. Attached is the stderr.txt
-
Please check that we are on the same build path for the HD5 build:
OCL_ZERO_COPY;OCL_CHIRP3;ATI_OS_WIN;SETI7;USE_OPENCL_HD5xxx;USE_OPENCL;USE_SSE;NDEBUG;_WINDOWS;CLIENT;_CONSOLE;USE_I386_OPTIMIZATIONS;USE_I386_XEON;USE_AMD;USE_FFTW
These defines (except the Windows-specific ones, of course) should be set for the HD5 build.
-
The only one I see that we aren't using is USE_AMD. I will add that and try again.
-
Try going into the clFFT_ExecuteInterleaved_mb function and adding OCL_LOG_ERR("smth"); after each kernel argument is set.
Like:
err |= clSetKernelArg(kernelInfo->kernel, 0, sizeof(cl_mem), &memObj[currRead]);OCL_LOG_ERR("arg 1");
That way we will know which argument fails.
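The definition of OCL_LOG_ERR is not shown in this thread, so the following is only a hypothetical stand-in that illustrates the idea of labelling every step. The names `log_cl_step` and `LOG_CL_STEP` are invented, and the message format simply imitates the log lines quoted in this topic.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-step logger: reports whether the step
   identified by label succeeded, then returns the status unchanged so the
   call can be wrapped inline. */
int log_cl_step(int status, const char *label, const char *file, int line) {
    assert(label != NULL && file != NULL);
    if (status == 0) {
        fprintf(stderr, "call '%s' is finished OK in file %s near line %d\n",
                label, file, line);
    } else {
        fprintf(stderr,
                "ERROR: OpenCL kernel/call '%s' call failed (%d) "
                "in file %s near line %d.\n",
                label, status, file, line);
    }
    return status;
}

/* Captures the call site automatically. */
#define LOG_CL_STEP(status, label) \
    log_cl_step((status), (label), __FILE__, __LINE__)
```

A call site would then look like `err = LOG_CL_STEP(clSetKernelArg(...), "arg 0");`, giving one log line per argument.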
-
So, what was the outcome?
-
Sorry, I have been on the road for the last week and haven't had a chance to follow up. I'll be able to add the debug code and narrow down the troublesome parameter this week.
Thanks
-Joe
-
Great! I will wait with interest (and I'm not always available either, so no problem, let's catch each other when we can :) ).
EDIT: With AP tasks so scarce, the need for these builds is great and constantly increasing. Please share your efforts and issues; we need to make a breakthrough on this front.
-
Understood. I finally have the build modified with the debug info. As soon as I get access to my nvidia and ATI/AMD based machines, I'll run the tests. Should be within the next day or two.
Interestingly, the intel targeted build worked on all three machines, and appeared to be using the GPU based on the time it took for the test to run.
Thanks for the patience.
-Joe
-
Here is the final output before I get the error. This is running the HD5xxx version on a mac pro with two ATI/AMD GPUs
call 'Setting kernel argument:FindAutoCorrelation_reduce1_kernel_cl' is finished OK in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 707
INFO: After SetupKernelArgs
INFO: Freeing local variables.
call 'arg pos 115' is finished OK in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3693
call 'Setting kernel argument:CalcChirpData_kernel2_cl' is finished OK in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3694
call 'Enqueueing kernel:CalcChirpData_kernel2_cl' is finished OK in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3709
Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3820.
The full stderr.txt is attached as well.
I will be testing the nvidia one this evening.
Thanks
-Joe
-
-49 is invalid argument index.
Please check the arguments for that last call (clSetKernelArg).
Also try to update your workspace to the latest rev. It seems Urs can build OpenCL ATi for OS X without issues now. The only issue we are working on is one missed pulse in a special test task, but no runtime errors...
-
The last call is
err = clSetKernelArg(CalcChirpData_kernel2_cl,0,sizeof(cl_float2),(void *)&chirprate2)
I will examine what the values of chirprate2 are.
I will also grab the latest code and see if the issue persists.
Thanks
-
The last call is the oclFFT one.
-
I will be back to working on this today and should be able to get the parameters for the oclFFT call that is causing the error. To reiterate, though, the USE_OPENCL_INTEL version works correctly, with accurate test results, on nvidia-based as well as AMD-based Macs.
-
The Intel version probably runs slower than the ATi or NV builds would, because the code paths are different for all 3 builds.
Also, Intel uses the most precise but slowest trigonometry. Other vendors have enough precision in their native trig functions to allow simpler and faster math.
EDIT: In other words, if you are limited in time you could build all types with the same "intel" path (take care with the platform selection code, though!), but if you can, it is better to use the optimized path for each vendor.
//The "intel" path is currently almost identical to the ATi non-HD5 path. That's not right, because the iGPU has real local memory. I'm in the process of changing that, so soon USE_OPENCL_INTEL could give different results for your type of testing.
-
Ah, thanks for the clarification! That explains what I'm observing.
-
I'm not sure if this information will be useful or not, but here is the debug message I entered along with the code that was causing the error:
fprintf(stderr,"INFO: oclFFT done no strip. plan_ptr: %p, FtfNum=%d, fftlen=%d, NumBlockFfts=%d, chirplen=%d\n", plan, FftNum, fftlen, NumBlockFfts, chirplen);
err |= clFFT_ExecuteInterleaved_mb(cq, plan[FftNum],NumBlockFfts, clFFT_Inverse, gpu_ChirpedData,gpu_WorkData, FFTbuf, 0, NULL, NULL);
And the output of that line is:
INFO: oclFFT done no strip. plan_ptr: 0x7fff4fecf060, FtfNum=0, fftlen=8, NumBlockFfts=131072, chirplen=1048576
Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3823.
This is running with the USE_OPENCL_HD5xxx define on an AMD Mac Pro.
I'm not sure what exactly I should be looking for, but plan looks to be non-null, and the index (FftNum) is 0, so I would expect it to be in bounds. Any insight?
-
That call is again too high-level.
As I proposed earlier in this thread, try going inside that oclFFT API call, to the place where the params of the particular _kernel_ call are set, and add debug code there.
I'll post the exact place when I'm at my dev PC, closer to this evening.
-
//fprintf(stderr,"After getKernelWorkDimensions:\nbatchSize s =%d, gWorkItems =%d, lWorkItems =%d, dir =%d\n", s, gWorkItems, lWorkItems, dir);
err |= clSetKernelArg(kernelInfo->kernel, 0, sizeof(cl_mem), &memObj[currRead]);
err |= clSetKernelArg(kernelInfo->kernel, 1, sizeof(cl_mem), &memObj[currWrite]);
err |= clSetKernelArg(kernelInfo->kernel, 2, sizeof(cl_int), &dir);
err |= clSetKernelArg(kernelInfo->kernel, 3, sizeof(cl_int), &s);
#if USE_OPENCL_INTEL //R: only iGPU uses Taylor trig approx and LuT. Old Ati drivers can't accept NULL as valid buf
err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));
err |= clSetKernelArg(kernelInfo->kernel, 5, sizeof(cl_mem), &(plan->cossin_LUT_d2));
#endif
err |= clEnqueueNDRangeKernel(queue, kernelInfo->kernel, 1, NULL, &gWorkItems, &lWorkItems, 0, NULL,NULL/*&fft_event*/);
if(err){
fprintf(stderr,"Error in mb oclFFT_1: %d\n",err);
return err;
}
#if OCL_VERBOSE
fprintf(stderr,"INFO: in mb oclFFT_1 ok\n");
#endif
and
//fprintf(stderr,"After getKernelWorkDimensions:\nbatchSize s =%d, gWorkItems =%d, lWorkItems =%d, dir =%d\n", s, gWorkItems, lWorkItems, dir);
err |= clSetKernelArg(kernelInfo->kernel, 0, sizeof(cl_mem), &memObj[currRead]);
err |= clSetKernelArg(kernelInfo->kernel, 1, sizeof(cl_mem), &memObj[currWrite]);
err |= clSetKernelArg(kernelInfo->kernel, 2, sizeof(cl_int), &dir);
err |= clSetKernelArg(kernelInfo->kernel, 3, sizeof(cl_int), &s);
#if USE_OPENCL_INTEL //R: only iGPU uses Taylor trig approx and LuT. Old Ati drivers can't accept NULL as valid buf
err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));
err |= clSetKernelArg(kernelInfo->kernel, 5, sizeof(cl_mem), &(plan->cossin_LUT_d2));
#endif
err |= clEnqueueNDRangeKernel(queue, kernelInfo->kernel, 1, NULL, &gWorkItems, &lWorkItems, 0, NULL,NULL/*&fft_event*/);
if(err){
fprintf(stderr,"Error in mb oclFFT_2: %d\n",err);
return err;
}
#if OCL_VERBOSE
fprintf(stderr,"INFO: in mb oclFFT_2 ok\n");
#endif
-
ah, ok. Sorry for the misunderstanding.
-
Finally got some time to trace down to the KernelArg that is throwing the error.
Here is the relevant code:
fft_execute.cpp line 583-603
getKernelWorkDimensions(plan, kernelInfo, &s, &gWorkItems, &lWorkItems);
fprintf(stderr,"After getKernelWorkDimensions:\nbatchSize s =%d, gWorkItems =%lu, lWorkItems =%lu, dir =%d\n", s, gWorkItems, lWorkItems, dir);
err |= clSetKernelArg(kernelInfo->kernel, 0, sizeof(cl_mem), &memObj[currRead]);
fprintf(stderr, "after param 0: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 1, sizeof(cl_mem), &memObj[currWrite]);
fprintf(stderr, "after param 1: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 2, sizeof(cl_int), &dir);
fprintf(stderr, "after param 2: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 3, sizeof(cl_int), &s);
fprintf(stderr, "after param 3: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));
fprintf(stderr, "after param 4: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 5, sizeof(cl_mem), &(plan->cossin_LUT_d2));
fprintf(stderr, "after param 5: err=%d\n", err);
err |= clEnqueueNDRangeKernel(queue, kernelInfo->kernel, 1, NULL, &gWorkItems, &lWorkItems, 0, NULL,NULL/*&fft_event*/);
fprintf(stderr, "after enqueue: err=%d\n", err);
if(err){
fprintf(stderr,"Error in mb oclFFT_2: %d\n",err);
return err;
}
and the resulting output:
INFO: oclFFT done no strip. plan_ptr: 0x7fff500f0060, FtfNum=0, fftlen=8, NumBlockFfts=131072, chirplen=1048576
After getKernelWorkDimensions:
batchSize s =131072, gWorkItems =131072, lWorkItems =64, dir =1
after param 0: err=0
after param 1: err=0
after param 2: err=0
after param 3: err=0
after param 4: err=-49
after param 5: err=-49
after enqueue: err=-49
Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3823.
So it looks like line 593
err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));
is the culprit.
This is all on svn revision 2760 which appears to be the latest.
Any advice on next steps?
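One caveat when reading the trace above: because the status is accumulated with `err |= ...`, the value stays at -49 after the first failure, so "after param 5: err=-49" does not show by itself whether argument 5 failed as well. OR-ing two different negative codes can also produce a third, unrelated number. A minimal sketch of both effects, plus a tracker that keeps the first failing step, follows. The names `first_error` and `or_accumulate` are hypothetical, and the arithmetic assumes the usual two's-complement integers.

```c
#include <assert.h>
#include <stddef.h>

/* What `err |= status` does to an accumulated value. */
int or_accumulate(int err, int status) {
    return err | status;
}

/* Remembers the first non-zero status and the step that produced it. */
struct first_error {
    int code;   /* 0 while no step has failed */
    int step;   /* -1 while no step has failed */
};

void first_error_init(struct first_error *fe) {
    assert(fe != NULL);
    fe->code = 0;
    fe->step = -1;
}

/* Records status for the given step; later failures do not overwrite the
   first one. Returns status unchanged so calls can be wrapped inline. */
int first_error_note(struct first_error *fe, int step, int status) {
    assert(fe != NULL);
    if (status != 0 && fe->code == 0) {
        fe->code = status;
        fe->step = step;
    }
    return status;
}
```

With this, `or_accumulate(-49, 0)` is still -49, while `or_accumulate(-43, -49)` comes out as -33, which reads as a completely different OpenCL error. The tracker instead reports the code together with the step index.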
-
The advice is quite simple.
This is how the current code really looks:
err |= clSetKernelArg(kernelInfo->kernel, 3, sizeof(cl_int), &s);
#if USE_OPENCL_INTEL //R: only iGPU uses Taylor trig approx and LuT. Old Ati drivers can't accept NULL as valid buf
err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));
err |= clSetKernelArg(kernelInfo->kernel, 5, sizeof(cl_mem), &(plan->cossin_LUT_d2));
#endif
err |= clEnqueueNDRangeKernel(queue, kernelInfo->kernel, 1, NULL, &gWorkItems, &lWorkItems, 0, NULL,NULL/*&fft_event*/);
So, do update to the latest revision, and this time don't forget that the SETI_opt source tree includes not only the AKv8 directory but also the src one, which is common to both MB and AP. It seems you are using an outdated src now.
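Why a stale host file produces exactly -49 can be illustrated with a mock of the index check. The sketch assumes, based only on the `#if USE_OPENCL_INTEL` guard above, that the non-Intel kernel takes four parameters while the outdated host code still sets six. `mock_set_kernel_arg` is an invented stand-in and does not call the real clSetKernelArg.

```c
#include <assert.h>
#include <limits.h>

#define MOCK_CL_SUCCESS            0
#define MOCK_CL_INVALID_ARG_INDEX  (-49)

/* Mock of the argument-index validation: an index is accepted only if the
   kernel declares at least arg_index + 1 parameters. */
int mock_set_kernel_arg(unsigned kernel_num_args, unsigned arg_index) {
    return arg_index < kernel_num_args ? MOCK_CL_SUCCESS
                                       : MOCK_CL_INVALID_ARG_INDEX;
}

/* Sets arguments 0 .. host_num_args - 1 in order and returns the index of
   the first one that is rejected, or -1 if all are accepted. */
int first_rejected_arg(unsigned kernel_num_args, unsigned host_num_args) {
    assert(host_num_args <= (unsigned)INT_MAX);   /* index must fit in int */
    for (unsigned i = 0; i < host_num_args; ++i) {
        if (mock_set_kernel_arg(kernel_num_args, i) != MOCK_CL_SUCCESS)
            return (int)i;
    }
    return -1;
}
```

Under that assumption `first_rejected_arg(4, 6)` returns 4, matching the trace where arguments 0 to 3 succeed and argument 4 is the first to report -49.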
-
Ok, I see now. There are two copies of fft_execute. The shared one in src/OpenCL_FFT and another one in AKv8/mac_build/OpenCL_FFT/ which is out of date. I will update my project to point to the shared one. Sorry for the trouble and thanks for the help.
-
And to clarify, the duplication was due to a mistake that I made. I have cleaned up my project and removed the duplicate copy of OpenCL_FFT, and now I am getting a successful run using the ati version on the mac pro. I will also check the nvidia version later today.
Thanks again for the assistance.
-
Great! Looking forward to OS X fully supporting OpenCL MB :D
-
All three builds are working correctly based on my tests. I have submitted the binaries to Eric Korpela. If you are curious, and want to play with the builds before he posts them, you can find them here: https://www.dropbox.com/s/hlh9m3z1cl6nlbl/seti%40home.zip?dl=0
-
Fine!
I have no OS X systems, maybe Urs would try them.
Now we only have to wait until the SETI servers are finally fixed...
-
The only version of these three I could try is the ATI HD5xxx one. The first thing I got was an error after the kernels were built successfully. All the info needed for analysis is inside the attached package. I guess you did not have ML and Lion in mind when building this app.
-
Hi Urs, thanks for the debug info. I'll analyze and make adjustments.
-
@JoeFox: TBar has also tested in the meantime, and he couldn't run your ATI app either on Yosemite or on ML. (EDIT: a retest with the correct command-line call was successful on Yosemite.)
Awaiting your adjusted builds. If you need more help or some other tests, just ask.
-
Hi Urs,
Attached are updated builds. The ATI one is now running correctly on an old macbook pro that I found which exhibited the crash you saw.
Thanks
-Joe
-
Joe, the second version of your seti_boinc-hd5xxx app fails on my Mac mini 5.2 (mid 2011, with a discrete ATI Radeon HD 6630M onboard) under ML, similarly to the first one. Again I have attached the debug output, and additionally a comparison of your weak-bound and lazy-bound symbols with those of one of my latest attempts to build this app for the Mac. Mine is missing some signals but has acceptable performance, while yours works correctly but is unacceptably slow on Mavericks and Yosemite. If you check the comparison (done with the OS X tool dyldinfo) you will find that the weak and lazy symbols are somehow switched between the two builds. It also still shows that your build is using "__DATA __la_symbol_ptr 0x1002DB0F8 0x00DF libSystem ___sincos_stret", which apparently is not present before OS X 10.9 (Mavericks). This makes it impossible to run your app on older OS X versions (10.7.4 (Lion) and 10.8.5 (Mountain Lion)).
-
Hi Urs,
Thanks for the debug info. I changed the targeted SDK to 10.7, which should have removed the reference to sincos_stret, an optimization the compiler introduces for builds targeted at 10.9 and up. I'm not sure how that symbol is still present, but I will track down the cause and attach another update once I have verified that it is no longer being introduced.
Thanks.
-Joe
-
Is it possible to get all required changes into build scripts for OS X for committing back to Berkeley's repository?