Seti@Home optimized science apps and information

Optimized Seti@Home apps => Linux => Alternate Hardware platforms => Topic started by: Joe Fox on 10 Oct 2014, 04:39:12 pm

Title: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 04:39:12 pm: Hi all,

I have been doing some work on the OSX OpenCL port of multibeam, and have run into a roadblock. The code branch that I am using is setisvn sah_v7_opt/AKv8. I have gotten a successful build with Xcode 5.0.2 by following the readme included in the project; however, when I execute the binary against reference_work_unit.sah I get the following output:

Code:
Used GPU device parameters are:
Number of compute units: 280
Single buffer allocation size: 64MB
Total device global memory: 1536MB
max WG size: 512
local mem type: Real
period_iterations_num=23
Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3820.
Waiting 30 sec before restart...

The full stderr.txt is attached if more info is needed.

Earlier versions of this code were working back in early August, but something changed and now I run into this error -49. Unfortunately I don't know enough about OpenCL to troubleshoot further. Any suggestions on where the problem may lie?

Thanks
-Joe Fox
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 10 Oct 2014, 04:46:58 pm: #define CL_INVALID_ARG_INDEX -49

Maybe the latest changes for AP broke the MB build. I'll check whether it compiles for Windows, then we'll see.
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 10 Oct 2014, 04:49:06 pm: BTW, what device are you running on? 280 CUs??? Impressive...
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 05:23:16 pm: Cool, thanks. This is on my 2013 macbook pro running OSX 10.9.5.

Thanks
-Joe Fox
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 10 Oct 2014, 05:23:42 pm: Are you sure that you are using the current code from the svn repository? From your stderr.txt: "MultiBeam_Kernels_r632.cl". The current svn revision is r2725.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 05:28:26 pm: Ah, that r632 is just due to my build environment not setting the release DEFINE properly. The actual svn revision that I am on is r2724

Thanks
-Joe Fox
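
As an illustration of how a stale revision number can end up in a kernel file name, here is a sketch that bakes a revision macro into a string at compile time. The macro name SAH_SVN_REVISION, the fallback mechanism, and the function name are all hypothetical; the project's actual define and naming code are not shown in this thread.

```cpp
#include <string>

// SAH_SVN_REVISION is a hypothetical macro name, normally passed on the
// compile line, e.g. -DSAH_SVN_REVISION=2724.
#ifndef SAH_SVN_REVISION
#define SAH_SVN_REVISION 632  // assumed stale fallback, mirroring the r632 in stderr.txt
#endif

// Two-level stringification so the macro is expanded before being quoted.
#define SAH_STR_INNER(x) #x
#define SAH_STR(x) SAH_STR_INNER(x)

// Build the kernel file name from the compile-time revision.
std::string kernel_file_name() {
    return std::string("MultiBeam_Kernels_r") + SAH_STR(SAH_SVN_REVISION) + ".cl";
}
```

If the build environment never sets the macro, the fallback value is used silently, so the name in the log says nothing about the revision that was actually checked out.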
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 10 Oct 2014, 05:35:52 pm: Then could you look up which svn revision number from August was the last that worked for you? That would help to identify the possible changes faster.

Could you specify which readme you are referring to?

PS: I have used Xcode 4.5.3 (compiler only) so far, because that was the last version of Xcode that was Lion compatible (needed for AstroPulse v7). For Intel GPUs that is of course not necessary.
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 10 Oct 2014, 05:41:39 pm: Well, the Windows build fails with the same error, so the code tree is indeed broken.
Please wait until I fix it ::)
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 05:45:34 pm: Yes, I will go back through the revisions to narrow down when it stopped working for me.

As for readme, I used "AKv8/ConfigureLine_AKv8c_OPENCL_SSE3_OSX_MBv7.txt" as my starting point for getting the correct project settings.

Thanks
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 05:46:10 pm: Ah, ok. I'll hold tight until you have the windows side sorted.

Thanks!
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 10 Oct 2014, 06:20:25 pm: I checked in a fix, try to rebuild now.

And please share your findings with the team from time to time :)
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 06:32:58 pm: Unfortunately, it looks like I still get the same error. The only difference I see in the stderr is now local mem type is "Emulated" rather than "Real".

Code:
Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.775000
Used GPU device parameters are:
Number of compute units: 280
Single buffer allocation size: 64MB
Total device global memory: 1536MB
max WG size: 512
local mem type: Emulated
period_iterations_num=23
Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3820.
Waiting 30 sec before restart...

I will continue with the effort to find which revision broke things, though it may be next week before I have sufficient time to isolate it.

Thanks
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 10 Oct 2014, 06:36:08 pm: Please check that fft_execute.cpp file was updated.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 06:45:23 pm: Ah, I examined the svn diff, and looks like the code is now wrapped with an ifdef USE_OPENCL_INTEL. When I set that macro in the CFLAGS, then my binary works again. Thank you!

That leads me to a question. What should the USE_OPENCL_??? macros be set at for a wide distribution build, like a beta build? For my local machine, which has an Intel GPU, the USE_OPENCL_INTEL makes sense, but what about for the new Mac Pro desktop workstations? They have an nvidia GPU if I recall correctly. How would a widely distributed binary that was compiled with USE_OPENCL_INTEL work for the nvidia system?

Thanks for the assistance!
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 10 Oct 2014, 06:46:25 pm: "Emulated" is what I am working on currently.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 06:46:52 pm: Oh, and as for execution time with the reference result set: I will go back through and build CPU only, then GPU, and report back here with the processing times.

-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 10 Oct 2014, 07:01:31 pm: You're better off checking which hardware the new Mac Pro workstations contain:
https://www.apple.com/mac-pro/

So far we have built separate apps, one per device vendor.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 07:04:23 pm: Ok, so boinc has the ability to distribute the correct client for not just the OS platform, but other parameters like gpu vendor? That's cool

-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 07:08:01 pm: oh, and looks like the mac pro has dual AMD D300s or D500s, not nvidia. My mistake.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 10 Oct 2014, 07:16:27 pm: Looks like there may be another problem. My binary runs successfully, but doesn't produce valid results. Attached is the results.sah and stderr.txt from my latest run. For what it is worth this ran in 14 minutes and 10 seconds.
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 11 Oct 2014, 02:34:15 am: Quote from: Joe Fox on 10 Oct 2014, 06:45:23 pm
Ah, I examined the svn diff, and looks like the code is now wrapped with an ifdef USE_OPENCL_INTEL. When I set that macro in the CFLAGS, then my binary works again. Thank you!

That leads me to a question. What should the USE_OPENCL_??? macros be set at for a wide distribution build, like a beta build? For my local machine, which has an Intel GPU, the USE_OPENCL_INTEL makes sense, but what about for the new Mac Pro desktop workstations? They have an nvidia GPU if I recall correctly. How would a widely distributed binary that was compiled with USE_OPENCL_INTEL work for the nvidia system?

Thanks for the assistance!
-Joe

Currently we use 3 separate binaries for different GPU types. There are some peculiarities for each GPU vendor, so I consider it easier to maintain 3 separate optimized builds instead of one universal under-optimized build.
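
A minimal sketch of how one source tree can yield one binary per vendor through compile-time macros. The three macro names are the ones mentioned in this thread; the function name, the "generic" fallback, and the rule that at most one vendor macro may be set are assumptions made for the illustration, not the project's actual selection code.

```cpp
#include <string>

// Assumed rule for this sketch: at most one vendor macro per build.
#if (defined(USE_OPENCL_INTEL) + defined(USE_OPENCL_NV) + defined(USE_OPENCL_HD5xxx)) > 1
#error "Define at most one of USE_OPENCL_INTEL, USE_OPENCL_NV, USE_OPENCL_HD5xxx"
#endif

// Report which vendor code path this binary was compiled for.
std::string vendor_path() {
#if defined(USE_OPENCL_INTEL)
    return "intel";
#elif defined(USE_OPENCL_NV)
    return "nv";
#elif defined(USE_OPENCL_HD5xxx)
    return "hd5xxx";
#else
    return "generic";  // assumed fallback when no vendor macro is given
#endif
}
```

Because the choice is made by the preprocessor, a binary built with -DUSE_OPENCL_INTEL contains only the Intel path; supporting another vendor means another compile with a different -D switch.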
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 11 Oct 2014, 09:13:37 am: Quote from: Joe Fox on 10 Oct 2014, 07:16:27 pm
Looks like there may be another problem. My binary runs successfully, but doesn't produce valid results. Attached is the results.sah and stderr.txt from my latest run. For what it is worth this ran in 14 minutes and 10 seconds.

Glancing at your first log, it seems you are missing the correct set of defines.
I'll post what I use for Windows here; maybe Urs will correct it for the OS X specifics.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 11 Oct 2014, 10:12:47 am: Regarding maintaining the different binaries, that makes sense. Regarding the incorrect defines, thanks. I'll go back to the OSX list, as well as compare to the windows build list that you send.

Thanks
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 11 Oct 2014, 10:14:57 am: Quote from: Joe Fox on 10 Oct 2014, 07:16:27 pm
Looks like there may be another problem. My binary runs successfully, but doesn't produce valid results. Attached is the results.sah and stderr.txt from my latest run. For what it is worth this ran in 14 minutes and 10 seconds.
The "reference_work_unit.sah" in the "test_workunits" directory seems to be an old v6 workunit without autocorr-signal detection. Instead, try to use "refquick.wu" or "reftiny.wu" from "DynWUs" directory for quick testing.
Additionally, I've tried to adapt the readme you have been working from (see attached) for Intel GPU builds. You may want to compare the "-D" command-line switches (you used the term "macro" before; they are also called "preprocessor definitions") from your build against what is in there. "-DSETI7" is especially important for good results.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 11 Oct 2014, 10:34:33 am: Thank you Urs, I will go through these and report back.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 11 Oct 2014, 10:39:53 am: Quote from: Urs Echternacht on 11 Oct 2014, 10:14:57 am
The "reference_work_unit.sah" in the "test_workunits" directory seems to be an old v6 workunit without autocorr-signal detection. Instead, try to use "refquick.wu" or "reftiny.wu" from "DynWUs" directory for quick testing.

I don't see those files inside of DynWUs. I only have the following:
PG0009.wu
PG0395.wu
PG0444.wu
PG1327.wu
init_data.xml
PG0009_v7.wu
PG0395_v7.wu
PG0444_v7.wu
PG1327_v7.wu

Any advice?

-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 11 Oct 2014, 11:04:59 am: Here is what I found with using one of the DynWUs.

I used "PG1327_v7.wu"

and got the attached results.

Elapsed time was 6 minutes and 37 seconds.

How can I test the results for accuracy?

Thanks
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 11 Oct 2014, 12:59:53 pm: list of defines I used in windows build:
OCL_CHIRP3;USE_OPENCL;ATI_OS_WIN;SETI7;USE_FFTW;WIN32;_WIN32;_MT;NDEBUG;_WINDOWS;CLIENT;_CONSOLE;USE_I386_OPTIMIZATIONS;USE_I386_XEON;USE_SSE;USE_AMD

OCL_CHIRP3 is required for best precision, USE_FFTW is for CPU FFT when needed, and SETI7 is required too, as Urs already described.
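
One way to catch a missing define early is to have the binary report which of the expected macros were absent at compile time. The sketch below checks the macros singled out in this thread (SETI7, OCL_CHIRP3, USE_FFTW, USE_OPENCL); the function name is hypothetical and the real application does not necessarily contain such a check.

```cpp
#include <string>
#include <vector>

// Return the names of expected -D switches that were NOT set for this build.
std::vector<std::string> missing_mb_defines() {
    std::vector<std::string> missing;
#ifndef SETI7
    missing.push_back("SETI7");
#endif
#ifndef OCL_CHIRP3
    missing.push_back("OCL_CHIRP3");
#endif
#ifndef USE_FFTW
    missing.push_back("USE_FFTW");
#endif
#ifndef USE_OPENCL
    missing.push_back("USE_OPENCL");
#endif
    return missing;
}
```

Printing this list at startup would make a build that silently lacks -DSETI7 visible in stderr.txt instead of only showing up later as invalid results.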
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 11 Oct 2014, 01:25:52 pm: In the lunatics Downloads you will find some tools, such as more test workunits and a benchmark script for "Terminal" usage. That benchmark package contains another tool for result comparison (rescmp4_d, d for Darwin). You could also use that comparison tool standalone if you have two results to compare (e.g. result_a.sah, result_b.sah).
If you start the comparison tool without arguments you get some usage instructions.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 14 Oct 2014, 01:28:24 pm: One interesting thing that I have found in my testing of the new builds: only the build with USE_OPENCL_INTEL works correctly, regardless of the video card manufacturer used for the Mac. I have tested on a new Mac Pro with dual AMD D700s, a MacBook Pro with an Intel Iris, and an older MacBook Pro with an nvidia. The USE_OPENCL_HD5xxx build would fail to compile the .cl files, and the USE_OPENCL_NV build would fail with the original error I ran into (CL error -49). My conclusion is that the Macs are using Intel OpenCL drivers. Does this seem reasonable?
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 14 Oct 2014, 02:39:20 pm: All variants should actually build OK.
I'll check what is going on with NV. MultiBeam OpenCL NV is rarely used (for Windows and Linux we have a CUDA version that is faster), so the OPENCL_NV define could be a little unmaintained.
Regarding USE_OPENCL_HD5xxx: that's for ATi cards, and it should be used together with USE_OPENCL, as an addition.
What errors do you see for this build?
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 14 Oct 2014, 03:45:38 pm: Attached is the stderr.txt. Let me know if you need additional info. It was built using the same build flags and preprocessor flags as the intel version that works. The only difference is USE_OPENCL_HD5xxx instead of USE_OPENCL_INTEL.
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 14 Oct 2014, 03:49:06 pm: Thanks.
-43 is the CL_INVALID_BUILD_OPTIONS error. It seems some compiler option is not accepted.

I'll probably look closer on the weekend.
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 14 Oct 2014, 04:02:49 pm: These are the buildoptions for OpenCL compiler when HD5xxx is active (which has no specification for __APPLE__ yet) :
#elif USE_OPENCL_HD5xxx
      strcpy(buildoptions,"-w -DRESULT_SIZE=32 -cl-unsafe-math-optimizations -DUSE_OPENCL_HD5xxx -fno-bin-amdil");

Joe, could you try a rebuild using these options in src/GPU_lock.cpp instead:
#elif USE_OPENCL_HD5xxx
#if __APPLE__
      strcpy(buildoptions,"-w -D__APPLE__ -DRESULT_SIZE=32 -cl-unsafe-math-optimizations -DUSE_OPENCL_HD5xxx");
#else
      strcpy(buildoptions,"-w -DRESULT_SIZE=32 -cl-unsafe-math-optimizations -DUSE_OPENCL_HD5xxx -fno-bin-amdil");
#endif
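
The same idea can be sketched with std::string instead of strcpy into a fixed buffer. This is an illustrative rewrite, not the code in src/GPU_lock.cpp: the function name and the run-time is_apple parameter are assumptions, chosen so that both variants can be exercised on any host. The option strings themselves are copied from the snippet above.

```cpp
#include <string>

// Compose OpenCL program build options for the HD5xxx path.
// is_apple selects the variant proposed above for OS X, which drops the
// AMD-specific -fno-bin-amdil option and adds -D__APPLE__.
std::string hd5_build_options(bool is_apple) {
    std::string opts =
        "-w -DRESULT_SIZE=32 -cl-unsafe-math-optimizations -DUSE_OPENCL_HD5xxx";
    if (is_apple) {
        opts += " -D__APPLE__";
    } else {
        opts += " -fno-bin-amdil";
    }
    return opts;
}
```

The resulting string would be passed as the options argument of clBuildProgram via opts.c_str(). Whether -fno-bin-amdil is the option Apple's compiler rejected is only suggested by the thread, not confirmed.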
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 15 Oct 2014, 12:11:15 pm: Hi Urs,

I applied the change that you suggested. I now get error -49 instead of -43. Attached is the stderr.txt
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 15 Oct 2014, 02:37:30 pm: Please check that we are on the same build path for the HD5 build:

OCL_ZERO_COPY;OCL_CHIRP3;ATI_OS_WIN;SETI7;USE_OPENCL_HD5xxx;USE_OPENCL;USE_SSE;NDEBUG;_WINDOWS;CLIENT;_CONSOLE;USE_I386_OPTIMIZATIONS;USE_I386_XEON;USE_AMD;USE_FFTW

These defines (except the Windows-specific ones, of course) should be set for the HD5 build.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 15 Oct 2014, 03:11:25 pm: The only one that I see which we aren't using is USE_AMD. I will add that and try again.
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 15 Oct 2014, 03:22:56 pm: Try going into the clFFT_ExecuteInterleaved_mb function and adding OCL_LOG_ERR("smth"); after each kernel argument is set.
Like:
err |= clSetKernelArg(kernelInfo->kernel, 0, sizeof(cl_mem), &memObj[currRead]);OCL_LOG_ERR("arg 1");

That way we will know which argument fails.
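
A related sketch: instead of OR-ing every return code into one variable, stop at the first failing argument and remember its index. The struct and function names are hypothetical, and set_arg stands in for a wrapper around clSetKernelArg so that the example does not need the OpenCL headers. One reason to avoid OR-accumulation is that two different negative codes combine into an unrelated value: in two's complement, -43 | -49 is -33.

```cpp
#include <cstdio>

// Outcome of setting a run of kernel arguments.
struct ArgFailure {
    int index;  // index of the first argument that failed, or -1 if none did
    int err;    // error code returned for that argument, or 0
};

// Call set_arg(i) for i = 0..count-1 and stop at the first non-zero result.
// set_arg is any callable taking the argument index and returning an error code.
template <typename SetArg>
ArgFailure set_args_checked(int count, SetArg set_arg) {
    for (int i = 0; i < count; ++i) {
        const int err = set_arg(i);
        if (err != 0) {
            std::fprintf(stderr, "kernel arg %d failed: err=%d\n", i, err);
            return ArgFailure{i, err};
        }
    }
    return ArgFailure{-1, 0};
}
```

With this shape the log names the offending argument directly, and the reported code is always one that the API actually returned.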
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 26 Oct 2014, 02:04:17 pm: So, what was outcome?
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 27 Oct 2014, 01:06:36 pm: Sorry, I have been on the road for the last week and haven't had a chance to follow up. I'll be able to add the debug code and narrow down the troublesome parameter this week.

Thanks
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 27 Oct 2014, 02:04:34 pm: Great! I'll wait with interest (and I'm not always available either, so no problem, let's catch each other when we can :) ).

EDIT: With AP tasks so scarce, the need for these builds is great and constantly increasing. Please share your efforts and issues, we need to make a breakthrough on this front.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 05 Nov 2014, 02:23:24 pm: Understood. I finally have the build modified with the debug info. As soon as I get access to my nvidia and ATI/AMD based machines, I'll run the tests. Should be within the next day or two.

Interestingly, the intel targeted build worked on all three machines, and appeared to be using the GPU based on the time it took for the test to run.

Thanks for the patience.
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 05 Nov 2014, 03:15:05 pm: Here is the final output before I get the error. This is running the HD5xxx version on a mac pro with two ATI/AMD GPUs

call 'Setting kernel argument:FindAutoCorrelation_reduce1_kernel_cl' is finished OK in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 707
INFO: After SetupKernelArgs
INFO: Freeing local variables.
call 'arg pos 115' is finished OK in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3693
call 'Setting kernel argument:CalcChirpData_kernel2_cl' is finished OK in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3694
call 'Enqueueing kernel:CalcChirpData_kernel2_cl' is finished OK in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3709
Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3820.

The full stderr.txt is attached as well.

I will be testing the nvidia one this evening.
Thanks
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 05 Nov 2014, 03:52:18 pm: -49 is invalid argument index.

Please check the arguments for that last call (clSetKernelArg).

Also try to update your workspace to the latest rev. It seems Urs can now build OpenCL ATi for OS X without issues. The only issue we are working on is one missed pulse in a special test task, but no runtime errors...
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 05 Nov 2014, 04:56:54 pm: The latest call is

err = clSetKernelArg(CalcChirpData_kernel2_cl,0,sizeof(cl_float2),(void *)&chirprate2)

I will examine what the values of chirprate2 are.

I will also grab the latest code and see if the issue persists.

Thanks
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 05 Nov 2014, 04:58:30 pm: last call is oclFFT one.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 19 Nov 2014, 12:22:59 pm: I will be back to working on this today and should be able to get the parameters for the oclFFT call that is causing the error. To reiterate, though, the USE_OPENCL_INTEL version works correctly, with accurate test results, when run on nvidia as well as AMD based Mac machines.
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 19 Nov 2014, 12:31:59 pm: The Intel version probably runs slower than ATi or NV builds would, because the code paths are different for all 3 of these builds.
Also, Intel uses the most precise but slowest trigonometry. Other vendors have enough precision in their native trig functions to allow simpler and faster math.

EDIT: in other words, if you are limited in time you could build all types with the same "intel" path (take care with the platform selection code though!), but if you can, it is better to use the optimized path for each vendor.

//The "intel" path is currently almost identical to the ATi non-HD5 path. And that's not right, because the iGPU has real local memory. I'm in the process of changing that, so soon USE_OPENCL_INTEL could give different results for your type of testing.
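
To give a feel for what a lookup-table-plus-Taylor trigonometry scheme can look like, here is a generic host-side sketch. It is not the project's kernel code: the class name, the table size, the use of double precision, and the second/third-order correction are all choices made for this illustration.

```cpp
#include <cmath>
#include <vector>

const double kTwoPi = 6.283185307179586;

// Approximates sin/cos from a coarse table plus a short Taylor correction,
// using sin(a+d) = sin(a)cos(d) + cos(a)sin(d) and cos(a+d) = cos(a)cos(d) - sin(a)sin(d).
class SinCosLut {
public:
    explicit SinCosLut(int entries)
        : step_(kTwoPi / entries), sin_(entries), cos_(entries) {
        for (int i = 0; i < entries; ++i) {
            sin_[i] = std::sin(i * step_);
            cos_[i] = std::cos(i * step_);
        }
    }

    // Writes approximations of sin(x) and cos(x) into *s and *c.
    void eval(double x, double* s, double* c) const {
        double r = std::fmod(x, kTwoPi);        // reduce to (-2pi, 2pi)
        if (r < 0.0) r += kTwoPi;               // then to [0, 2pi]
        int idx = static_cast<int>(r / step_);  // nearest table entry below r
        const int last = static_cast<int>(sin_.size()) - 1;
        if (idx > last) idx = last;             // guard against rounding at 2pi
        const double d = r - idx * step_;       // small residual angle
        const double sd = d - d * d * d / 6.0;  // Taylor series for sin(d)
        const double cd = 1.0 - d * d / 2.0;    // Taylor series for cos(d)
        *s = sin_[idx] * cd + cos_[idx] * sd;
        *c = cos_[idx] * cd - sin_[idx] * sd;
    }

private:
    double step_;
    std::vector<double> sin_, cos_;
};
```

The trade-off is table memory and a few multiplies against a call to the full-precision library function; a finer table shrinks the residual angle and with it the truncation error of the Taylor terms.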
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 19 Nov 2014, 01:22:51 pm: Ah, thanks for the clarification! That explains what I'm observing.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 19 Nov 2014, 07:20:25 pm: I'm not sure if this information will be useful or not, but here is the debug message I entered along with the code that was causing the error:

Code:
fprintf(stderr,"INFO: oclFFT done no strip. plan_ptr: %p, FtfNum=%d, fftlen=%d, NumBlockFfts=%d, chirplen=%d\n", plan, FftNum, fftlen, NumBlockFfts, chirplen);
err |= clFFT_ExecuteInterleaved_mb(cq, plan[FftNum],NumBlockFfts, clFFT_Inverse, gpu_ChirpedData,gpu_WorkData, FFTbuf, 0, NULL, NULL);

And the output of that line is:
INFO: oclFFT done no strip. plan_ptr: 0x7fff4fecf060, FtfNum=0, fftlen=8, NumBlockFfts=131072, chirplen=1048576

Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3823.

This is running the build with the USE_OPENCL_HD5xxx define on an AMD Mac Pro.

I'm not sure what exactly I should be looking for, but plan looks to be non-null, and the index (FftNum) is 0, so I would expect it to be in bounds. Any insight?
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 20 Nov 2014, 04:56:32 am: That call is too high-level again.
As I proposed earlier in this thread, try to go inside that oclFFT API call, to the place where the params of the particular _kernel_ call are set, and add debug code there.

I'll post the particular place when I'm at my dev PC, closer to evening today.
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 20 Nov 2014, 12:57:44 pm: Code:
//fprintf(stderr,"After getKernelWorkDimensions:\nbatchSize s =%d, gWorkItems =%d, lWorkItems =%d, dir =%d\n", s, gWorkItems, lWorkItems, dir);
err |= clSetKernelArg(kernelInfo->kernel, 0, sizeof(cl_mem), &memObj[currRead]);
err |= clSetKernelArg(kernelInfo->kernel, 1, sizeof(cl_mem), &memObj[currWrite]);
err |= clSetKernelArg(kernelInfo->kernel, 2, sizeof(cl_int), &dir);
err |= clSetKernelArg(kernelInfo->kernel, 3, sizeof(cl_int), &s);
#if USE_OPENCL_INTEL //R: only iGPU uses Taylor trig approx and LuT. Old Ati drivers can't accept NULL as valid buf
err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));
err |= clSetKernelArg(kernelInfo->kernel, 5, sizeof(cl_mem), &(plan->cossin_LUT_d2));
#endif
err |= clEnqueueNDRangeKernel(queue, kernelInfo->kernel, 1, NULL, &gWorkItems, &lWorkItems, 0, NULL,NULL/*&fft_event*/);
if(err){
    fprintf(stderr,"Error in mb oclFFT_1: %d\n",err);
    return err;
}
#if OCL_VERBOSE
fprintf(stderr,"INFO: in mb oclFFT_1 ok\n");
#endif

and

Code:
//fprintf(stderr,"After getKernelWorkDimensions:\nbatchSize s =%d, gWorkItems =%d, lWorkItems =%d, dir =%d\n", s, gWorkItems, lWorkItems, dir);
err |= clSetKernelArg(kernelInfo->kernel, 0, sizeof(cl_mem), &memObj[currRead]);
err |= clSetKernelArg(kernelInfo->kernel, 1, sizeof(cl_mem), &memObj[currWrite]);
err |= clSetKernelArg(kernelInfo->kernel, 2, sizeof(cl_int), &dir);
err |= clSetKernelArg(kernelInfo->kernel, 3, sizeof(cl_int), &s);
#if USE_OPENCL_INTEL //R: only iGPU uses Taylor trig approx and LuT. Old Ati drivers can't accept NULL as valid buf
err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));
err |= clSetKernelArg(kernelInfo->kernel, 5, sizeof(cl_mem), &(plan->cossin_LUT_d2));
#endif
err |= clEnqueueNDRangeKernel(queue, kernelInfo->kernel, 1, NULL, &gWorkItems, &lWorkItems, 0, NULL,NULL/*&fft_event*/);
if(err){
    fprintf(stderr,"Error in mb oclFFT_2: %d\n",err);
    return err;
}
#if OCL_VERBOSE
fprintf(stderr,"INFO: in mb oclFFT_2 ok\n");
#endif
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 20 Nov 2014, 04:33:09 pm: ah, ok. Sorry for the misunderstanding.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 25 Nov 2014, 02:53:58 pm: Finally got some time to trace down to the KernelArg that is throwing the error.

Here is the relevant code:
fft_execute.cpp line 583-603

Code:
getKernelWorkDimensions(plan, kernelInfo, &s, &gWorkItems, &lWorkItems);
fprintf(stderr,"After getKernelWorkDimensions:\nbatchSize s =%d, gWorkItems =%lu, lWorkItems =%lu, dir =%d\n", s, gWorkItems, lWorkItems, dir);
err |= clSetKernelArg(kernelInfo->kernel, 0, sizeof(cl_mem), &memObj[currRead]);
fprintf(stderr, "after param 0: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 1, sizeof(cl_mem), &memObj[currWrite]);
fprintf(stderr, "after param 1: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 2, sizeof(cl_int), &dir);
fprintf(stderr, "after param 2: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 3, sizeof(cl_int), &s);
fprintf(stderr, "after param 3: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));
fprintf(stderr, "after param 4: err=%d\n", err);
err |= clSetKernelArg(kernelInfo->kernel, 5, sizeof(cl_mem), &(plan->cossin_LUT_d2));
fprintf(stderr, "after param 5: err=%d\n", err);
err |= clEnqueueNDRangeKernel(queue, kernelInfo->kernel, 1, NULL, &gWorkItems, &lWorkItems, 0, NULL,NULL/*&fft_event*/);
fprintf(stderr, "after enqueue: err=%d\n", err);
if(err){
    fprintf(stderr,"Error in mb oclFFT_2: %d\n",err);
    return err;
}

and the resulting output:

Code:
INFO: oclFFT done no strip. plan_ptr: 0x7fff500f0060, FtfNum=0, fftlen=8, NumBlockFfts=131072, chirplen=1048576
After getKernelWorkDimensions:
batchSize s =131072, gWorkItems =131072, lWorkItems =64, dir =1
after param 0: err=0
after param 1: err=0
after param 2: err=0
after param 3: err=0
after param 4: err=-49
after param 5: err=-49
after enqueue: err=-49
Error in mb oclFFT_2: -49
ERROR: OpenCL kernel/call 'non-strip fft' call failed (-49) in file /Users/joe/projects/seti/sah_v7_opt/AKv8/client/analyzeFuncs.cpp near line 3823.

So it looks like line 593

Code:
err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));

is the culprit.

This is all on svn revision 2760 which appears to be the latest.

Any advice on next steps?
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 25 Nov 2014, 03:27:29 pm: The advice is quite simple.

This is how the current code really looks:

Quote
        err |= clSetKernelArg(kernelInfo->kernel, 3, sizeof(cl_int), &s);
#if USE_OPENCL_INTEL //R: only iGPU uses Taylor trig approx and LuT. Old Ati drivers can't accept NULL as valid buf
        err |= clSetKernelArg(kernelInfo->kernel, 4, sizeof(cl_mem), &(plan->cossin_LUT_d1));
        err |= clSetKernelArg(kernelInfo->kernel, 5, sizeof(cl_mem), &(plan->cossin_LUT_d2));
#endif
        err |= clEnqueueNDRangeKernel(queue, kernelInfo->kernel, 1, NULL, &gWorkItems, &lWorkItems, 0, NULL,NULL/*&fft_event*/);

So, update to the latest revision, and this time don't forget that the SETI_opt source tree includes not only the AKv8 directory but also the src one, which is common to both MB and AP. It seems you are using an outdated src now.
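
The failure at argument 4 fits a mismatch between the number of arguments the host sets and the number the compiled kernel declares. The sketch below models that with a stand-in for the index check; it is not a real OpenCL call, and the counts (four common arguments, plus two lookup-table buffers on the Intel path) are inferred from the host code quoted above rather than taken from the kernel source.

```cpp
// Argument count the FFT kernel is presumed to declare in each build.
#if defined(USE_OPENCL_INTEL)
const int kFftKernelArgCount = 6;  // 4 common args + 2 lookup-table buffers
#else
const int kFftKernelArgCount = 4;  // lookup tables are not kernel args here
#endif

const int kClSuccess = 0;
const int kClInvalidArgIndex = -49;  // value of CL_INVALID_ARG_INDEX in cl.h

// Stand-in for the index validation done when a kernel argument is set:
// an index outside [0, kernel_arg_count) is rejected.
int mock_set_kernel_arg(int kernel_arg_count, int arg_index) {
    return (arg_index >= 0 && arg_index < kernel_arg_count) ? kClSuccess
                                                            : kClInvalidArgIndex;
}
```

An outdated host file that sets arguments 4 and 5 unconditionally would therefore succeed against a six-argument kernel and fail with -49 against a four-argument one, which matches the output posted above for the HD5xxx build.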
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 25 Nov 2014, 04:01:53 pm: Ok, I see now. There are two copies of fft_execute. The shared one in src/OpenCL_FFT and another one in AKv8/mac_build/OpenCL_FFT/ which is out of date. I will update my project to point to the shared one. Sorry for the trouble and thanks for the help.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 25 Nov 2014, 04:31:29 pm: And to clarify, the duplication was due to a mistake that I made. I have cleaned up my project and removed the duplicate copy of OpenCL_FFT, and now I am getting a successful run using the ati version on the mac pro. I will also check the nvidia version later today.

Thanks again for the assistance.
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 26 Nov 2014, 01:39:18 am: Great! Looking forward to OS X fully supporting OpenCL MB :D
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 29 Nov 2014, 02:29:06 pm: All three builds are working correctly based on my tests. I have submitted the binaries to Eric Korpela. If you are curious, and want to play with the builds before he posts them, you can find them here: https://www.dropbox.com/s/hlh9m3z1cl6nlbl/seti%40home.zip?dl=0
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 29 Nov 2014, 03:15:42 pm: Fine!
I have no OS X systems, maybe Urs could try them.
Now we only have to wait until the SETI servers are finally fixed...
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 30 Nov 2014, 09:30:11 pm: Quote from: Joe Fox on 29 Nov 2014, 02:29:06 pm
All three builds are working correctly based on my tests. I have submitted the binaries to Eric Korpela. If you are curious, and want to play with the builds before he posts them, you can find them here: https://www.dropbox.com/s/hlh9m3z1cl6nlbl/seti%40home.zip?dl=0
The only version of these three I could try is the ATI HD5xxx. The first thing I got was an error after the kernels were built successfully. All the info needed for analysis is inside the attached package. I guess you did not have ML and Lion in mind when building this app.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 01 Dec 2014, 11:35:59 am: Hi Urs, thanks for the debug info. I'll analyze and make adjustments.
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 01 Dec 2014, 06:04:46 pm: @JoeFox : TBar has also tested in the meantime, and he couldn't run your ATI app either, on ~~Yosemite~~ or ML. (EDIT: a retest with the correct command-line call was successful on Yosemite.)

Awaiting your adjusted builds. If you need more help, some other tests, just ask.
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 12 Dec 2014, 04:21:00 pm: Hi Urs,

Attached are updated builds. The ATI one is now running correctly on an old macbook pro that I found which exhibited the crash you saw.

Thanks
-Joe
Title: Re: OSX Multibeam OpenCL question
Post by: Urs Echternacht on 13 Dec 2014, 05:32:29 pm: Quote from: Joe Fox on 12 Dec 2014, 04:21:00 pm
Hi Urs,

Attached are updated builds. The ATI one is now running correctly on an old macbook pro that I found which exhibited the crash you saw.

Thanks
-Joe
Joe, the second app version of your seti_boinc-hd5xxx fails similarly to the first one on my Mac mini 5.2 (mid 2011, with a discrete ATI Radeon HD 6630M onboard) with ML. Again I have attached the debug output, and additionally a comparison of your weak bound symbols and lazy bound symbols to those of one of my latest tries to build this app for the Mac. Mine is missing some signals but has acceptable performance, while yours works correctly but is unacceptably slow on Mavericks and Yosemite. If you check the comparison (done with the OS X tool dyldinfo) you will find that weak and lazy symbols are somehow switched in the two builds. It also still shows that your build is using "__DATA __la_symbol_ptr 0x1002DB0F8 0x00DF libSystem ___sincos_stret", which apparently is not present before OS X 10.9 (Mavericks). This makes it impossible to run your app on older OS X versions (10.7.4 (Lion) and 10.8.5 (Mountain Lion)).
Title: Re: OSX Multibeam OpenCL question
Post by: Joe Fox on 15 Dec 2014, 03:13:58 pm: Hi Urs,

Thanks for the debug info. I changed the targeted SDK to 10.7, which should have removed the reference to sincos_stret, an optimization introduced in the compiler for builds targeted at 10.9 and up. I'm not sure how that symbol is still present, but I will track down the cause and attach another update once I have verified that it is no longer being introduced.

Thanks.
-Joe
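
A possible angle on the lingering symbol: the SDK a project builds against and its deployment target (the minimum OS version, set for example with clang's -mmacosx-version-min=10.7 or the MACOSX_DEPLOYMENT_TARGET setting) are separate knobs, and it is the deployment target that decides whether the compiler may emit calls to ___sincos_stret. The sketch below reads the deployment target from a macro that Apple's compilers predefine; the function names are made up for this illustration, and whether this was the actual cause here is not confirmed by the thread.

```cpp
// Return the minimum OS X version this translation unit is compiled for,
// as the compiler encodes it (e.g. 1070 for 10.7, 1090 for 10.9),
// or 0 when not building for OS X.
int osx_deployment_target() {
#ifdef __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__
    return __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__;
#else
    return 0;
#endif
}

// True when the given encoded target is an OS X release older than 10.9,
// i.e. one where ___sincos_stret is reported above as unavailable.
bool targets_pre_mavericks(int encoded_target) {
    return encoded_target > 0 && encoded_target < 1090;
}
```

Logging osx_deployment_target() from each object file, including any prebuilt static libraries that can be rebuilt, would show whether some part of the build is still compiled for 10.9 or later.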
Title: Re: OSX Multibeam OpenCL question
Post by: Raistmer on 24 Feb 2015, 07:55:52 am: Is it possible to get all required changes into build scripts for OS X for committing back to Berkeley's repository?