Topic: Nvidia Titan error on Astropulse

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Nvidia Titan error on Astropulse
« on: 15 Mar 2013, 03:28:02 pm »
I am having trouble getting my new Titan card to run Astropulse. It seems to get into a repeating loop and does not do any work on the unit. Here are the error and my setup. Any ideas?


15/03/2013 19:22:22 | SETI@home | Starting task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 11
15/03/2013 19:22:32 | SETI@home | Task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 exited with zero status but no 'finished' file
15/03/2013 19:22:32 | SETI@home | If this happens repeatedly you may need to reset the project.
15/03/2013 19:22:32 | SETI@home | Restarting task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 11
15/03/2013 19:22:32 | SETI@home | Starting task ap_04au12ac_B2_P0_00150_20130314_08598.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 12
15/03/2013 19:22:32 | SETI@home | Starting task ap_04au12ac_B1_P1_00305_20130314_07459.wu_0 using astropulse_v6 version 604 (cuda_opencl_100) in slot 13
15/03/2013 19:22:36 | SETI@home | Task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 exited with zero status but no 'finished' file
15/03/2013 19:22:36 | SETI@home | If this happens repeatedly you may need to reset the project.
15/03/2013 19:22:36 | SETI@home | Restarting task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 11
15/03/2013 19:22:36 | SETI@home | Starting task 20dc11aa.17029.18577.206158430221.10.84_1 using setiathome_enhanced version 610 (cuda_fermi) in slot 14
15/03/2013 19:22:36 | SETI@home | Starting task 20dc11aa.17029.18577.206158430221.10.145_1 using setiathome_enhanced version 610 (cuda_fermi) in slot 15
15/03/2013 19:22:41 | SETI@home | Task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 exited with zero status but no 'finished' file
15/03/2013 19:22:41 | SETI@home | If this happens repeatedly you may need to reset the project.
15/03/2013 19:22:41 | SETI@home | Restarting task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 11
15/03/2013 19:22:46 | SETI@home | Task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 exited with zero status but no 'finished' file
15/03/2013 19:22:46 | SETI@home | If this happens repeatedly you may need to reset the project.
15/03/2013 19:22:46 | SETI@home | Restarting task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 11
15/03/2013 19:22:49 | SETI@home | Starting task 16fe12ad.14612.366667.206158430221.10.4_0 using setiathome_enhanced version 610 (cuda_fermi) in slot 16
15/03/2013 19:22:49 | SETI@home | Starting task 20dc11aa.17029.18577.206158430221.10.167_0 using setiathome_enhanced version 610 (cuda_fermi) in slot 17
15/03/2013 19:22:51 | SETI@home | Task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 exited with zero status but no 'finished' file
15/03/2013 19:22:51 | SETI@home | If this happens repeatedly you may need to reset the project.
15/03/2013 19:22:51 | SETI@home | Restarting task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 11
15/03/2013 19:22:56 | SETI@home | Task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 exited with zero status but no 'finished' file
15/03/2013 19:22:56 | SETI@home | If this happens repeatedly you may need to reset the project.
15/03/2013 19:22:56 | SETI@home | Restarting task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 11
15/03/2013 19:23:01 | SETI@home | Task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 exited with zero status but no 'finished' file
15/03/2013 19:23:01 | SETI@home | If this happens repeatedly you may need to reset the project.
15/03/2013 19:23:01 | SETI@home | Restarting task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 11
15/03/2013 19:23:07 | SETI@home | Task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 exited with zero status but no 'finished' file
15/03/2013 19:23:07 | SETI@home | If this happens repeatedly you may need to reset the project.
15/03/2013 19:23:07 | SETI@home | Restarting task ap_04au12ac_B2_P1_00073_20130314_11177.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 11
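In the excerpt above, the same Astropulse task exits without a 'finished' file eight times in well under a minute. A minimal sketch for spotting that pattern in a saved event log, assuming the line format is exactly the one shown here; the function names and the restart threshold are illustrative choices, not part of BOINC:

```python
import re
from collections import Counter

# Matches the BOINC event-log line shown in the post above.
NO_FINISHED = re.compile(
    r"\| Task (?P<task>\S+) exited with zero status but no 'finished' file"
)


def count_silent_exits(log_lines):
    """Count, per task name, how often it exited without a 'finished' file."""
    counts = Counter()
    for line in log_lines:
        match = NO_FINISHED.search(line)
        if match:
            counts[match.group("task")] += 1
    return counts


def looping_tasks(log_lines, threshold=3):
    """Return tasks seen exiting this way at least `threshold` times.

    The threshold of 3 is an arbitrary assumption for illustration.
    """
    counts = count_silent_exits(log_lines)
    return sorted(task for task, n in counts.items() if n >= threshold)
```

Feeding it the lines of the log above would flag only the one Astropulse task; the setiathome_enhanced tasks start normally and never produce the message.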


<app_info>
    <app>
        <name>setiathome_enhanced</name>
    </app>
    <file_info>
        <name>AK_v8b2_win_x64_SSE3.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>603</version_num>
        <platform>windows_intelx86</platform>
        <file_ref>
           <file_name>AK_v8b2_win_x64_SSE3.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>603</version_num>
        <platform>windows_x86_64</platform>
        <file_ref>
           <file_name>AK_v8b2_win_x64_SSE3.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <app>
        <name>astropulse_v6</name>
    </app>
    <file_info>
        <name>AP6_win_x86_SSE2_OpenCL_NV_r1761.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libfftw3f-3.dll</name>
        <executable/>
    </file_info>
    <file_info>
        <name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</name>
    </file_info>
    <app_version>
        <app_name>astropulse_v6</app_name>
        <version_num>604</version_num>
        <platform>windows_intelx86</platform>
        <avg_ncpus>0.04</avg_ncpus>
        <max_ncpus>0.2</max_ncpus>
        <plan_class>cuda_opencl_100</plan_class>
        <cmdline>-ffa_block 6144 -ffa_block_fetch 1536 -unroll 10</cmdline>
        <coproc>
            <type>CUDA</type>
            <count>0.3</count>
        </coproc>
        <file_ref>
            <file_name>AP6_win_x86_SSE2_OpenCL_NV_r1761.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>libfftw3f-3.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name>
            <open_name>ap_cmdline.txt</open_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>astropulse_v6</app_name>
        <version_num>604</version_num>
        <platform>windows_intelx86</platform>
        <avg_ncpus>0.04</avg_ncpus>
        <max_ncpus>0.2</max_ncpus>
        <plan_class>opencl_nvidia_100</plan_class>
        <cmdline>-ffa_block 6144 -ffa_block_fetch 1536 -unroll 10</cmdline>
        <coproc>
            <type>CUDA</type>
            <count>0.3</count>
        </coproc>
        <file_ref>
            <file_name>AP6_win_x86_SSE2_OpenCL_NV_r1761.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>libfftw3f-3.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name>
            <open_name>ap_cmdline.txt</open_name>
        </file_ref>
    </app_version>
    <app>
        <name>setiathome_enhanced</name>
    </app>
    <file_info>
        <name>Lunatics_x41g_win32_cuda32.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>cudart32_32_16.dll</name>
        <executable/>
    </file_info>
    <file_info>
        <name>cufft32_32_16.dll</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>610</version_num>
        <platform>windows_intelx86</platform>
        <plan_class>cuda_fermi</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>0.3</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>609</version_num>
        <platform>windows_intelx86</platform>
        <plan_class>cuda23</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>0.3</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>608</version_num>
        <platform>windows_intelx86</platform>
        <plan_class>cuda</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>0.3</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>610</version_num>
        <platform>windows_x86_64</platform>
        <plan_class>cuda_fermi</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>0.3</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>609</version_num>
        <platform>windows_x86_64</platform>
        <plan_class>cuda23</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>0.3</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>608</version_num>
        <platform>windows_x86_64</platform>
        <plan_class>cuda</plan_class>
        <avg_ncpus>0.040000</avg_ncpus>
        <max_ncpus>0.040000</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>0.3</count>
        </coproc>
        <file_ref>
            <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart32_32_16.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft32_32_16.dll</file_name>
        </file_ref>
    </app_version>
</app_info>
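Every GPU app_version in the app_info above carries a coproc count of 0.3. A rough sketch of how to list those entries and estimate how many tasks would share one GPU, assuming BOINC treats the count as the fraction of a GPU each task claims (so 0.3 allows three at a time); the function name is made up for illustration and the file is parsed with Python's standard ElementTree:

```python
import math
import xml.etree.ElementTree as ET


def gpu_concurrency(app_info_xml):
    """Return (app_name, plan_class, count, tasks_per_gpu) per GPU app_version."""
    root = ET.fromstring(app_info_xml)
    rows = []
    for version in root.iter("app_version"):
        count_text = version.findtext("coproc/count")
        if count_text is None:
            continue  # CPU-only version: no <coproc> block
        count = float(count_text)
        # Assumption: tasks are started while their summed counts fit in 1 GPU.
        # The small epsilon guards against float rounding for counts like 0.1.
        tasks_per_gpu = math.floor(1.0 / count + 1e-9)
        rows.append(
            (
                version.findtext("app_name"),
                version.findtext("plan_class"),
                count,
                tasks_per_gpu,
            )
        )
    return rows
```

Changing every count to 1 in the file would, under the same assumption, drop the estimate to a single task per GPU, which is a quick way to rule out problems caused by running several instances at once.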



Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14347
Re: Nvidia Titan error on Astropulse
« Reply #1 on: 15 Mar 2013, 03:36:15 pm »
stderr?

Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: Nvidia Titan error on Astropulse
« Reply #2 on: 15 Mar 2013, 03:39:31 pm »
How many CPU cores are you reserving? You've got to reserve at least one or two cores.

Have you tried running only one instance at a time, rather than three?

If you remove the command-line parameters from the app_info, is there any improvement?
(Put them in the ap_cmdline_win_x86_SSE2_OpenCL_NV.txt file instead; then you can make changes without restarting BOINC.)
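Moving the switches into the text file can be sketched as below. This assumes the file simply holds the switches on one line, which the thread does not spell out; the project directory argument and the function name are hypothetical, and the switch values are copied from the posted app_info:

```python
from pathlib import Path

# Tuning switches copied from the <cmdline> element of the posted app_info.
AP_SWITCHES = "-ffa_block 6144 -ffa_block_fetch 1536 -unroll 10"


def write_ap_cmdline(project_dir, switches=AP_SWITCHES,
                     filename="ap_cmdline_win_x86_SSE2_OpenCL_NV.txt"):
    """Write the switches as a single line and return the file's path.

    `project_dir` is a placeholder for wherever the SETI@home project
    files live on the host; the single-line layout is an assumption.
    """
    path = Path(project_dir) / filename
    path.write_text(switches + "\n", encoding="ascii")
    return path
```

With the switches in the file, the matching `<cmdline>` elements in app_info would be left empty or removed so the two sources do not disagree.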

Can we have a link to the host too, please?

Claggy
« Last Edit: 15 Mar 2013, 03:54:32 pm by Claggy »

Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: Nvidia Titan error on Astropulse
« Reply #3 on: 15 Mar 2013, 04:00:13 pm »
    <app>
        <name>setiathome_enhanced</name>
    </app>
    <file_info>
        <name>Lunatics_x41g_win32_cuda32.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>cudart32_32_16.dll</name>
        <executable/>
    </file_info>
    <file_info>
        <name>cufft32_32_16.dll</name>
        <executable/>
    </file_info>
While you're at it, you might want to upgrade from x41g_cuda32 to x41zc_cuda5; x41zc is better optimised for Fermi and especially Kepler GPUs.

Claggy

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: Nvidia Titan error on Astropulse
« Reply #4 on: 15 Mar 2013, 04:20:42 pm »
There are 4 cores free on the CPU.

Running one instance, two or three makes no difference.
Removed cmdline parameters but still the same.
Inserted parameters into ap_cmdline_win_x86_SSE2_OpenCL_NV.txt, but still getting the same issue.

http://setiathome.berkeley.edu/results.php?userid=15792






Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: Nvidia Titan error on Astropulse
« Reply #5 on: 15 Mar 2013, 04:39:38 pm »

http://setiathome.berkeley.edu/results.php?userid=15792
That is a URL only you can view; the rest of us get:

Unable to handle request

No access

This one works:

http://setiathome.berkeley.edu/show_user.php?userid=15792

Even better, this one goes straight to your host:

http://setiathome.berkeley.edu/show_host_detail.php?hostid=6758290
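For anyone posting their own details, the two shareable link forms above can be built from the numeric IDs. A small sketch using only the URL patterns quoted in this post; the helper names are illustrative:

```python
from urllib.parse import urlencode

BASE = "http://setiathome.berkeley.edu"


def public_user_url(user_id):
    """Link to the public user page (viewable by everyone)."""
    return f"{BASE}/show_user.php?{urlencode({'userid': user_id})}"


def public_host_url(host_id):
    """Link straight to one host's detail page."""
    return f"{BASE}/show_host_detail.php?{urlencode({'hostid': host_id})}"
```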

Claggy

Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: Nvidia Titan error on Astropulse
« Reply #6 on: 15 Mar 2013, 04:52:28 pm »
On one WU, you were getting the following a couple of times:

state.fold_buf_size_short=65536; state.fold_buf_size_long=262144
ERROR: OpenCL kernel/call 'ReadBuf:gpu_results->CPU_result' call failed (-5) in file ..\..\ap_client_main.cpp near line 3238.

then:

state.fold_buf_size_short=65536; state.fold_buf_size_long=262144
ERROR: WriteBuffer(gpu_data): -4
Error in ap oclFFT_1: -4
ERROR: clFFT_Execute: -4
ERROR: Enqueueing kernel onto command queue.          (dechirp_range_kernel)
ERROR code: CL_MEM_OBJECT_ALLOCATION_FAILURE


On at least two other WUs, it was:

state.fold_buf_size_short=65536; state.fold_buf_size_long=262144
ERROR: clEnqueueNDRangeKernel: GPU_coadd_kernel_cl(inner): -4

You can try just running Nvidia Astropulse on its own (don't let any CUDA WUs run from any project) and see if it's the same.
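The numeric codes in the stderr excerpts are standard OpenCL status values: -4 is CL_MEM_OBJECT_ALLOCATION_FAILURE (as the app itself prints) and -5 is CL_OUT_OF_RESOURCES. A minimal sketch for pulling them out of stderr text, assuming the two message shapes seen above ("...: -4" at the end of a line and "call failed (-5)"); the regular expressions and function name are illustrative and only cover these shapes:

```python
import re

# Standard OpenCL status codes from CL/cl.h (only those relevant here).
CL_ERRORS = {
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

TRAILING_CODE = re.compile(r":\s*(-\d+)\s*$")       # e.g. "WriteBuffer(gpu_data): -4"
FAILED_CODE = re.compile(r"call failed \((-\d+)\)")  # e.g. "call failed (-5) in file"


def decode_stderr(lines):
    """Return (code, symbolic_name, original_line) for each recognised error line."""
    found = []
    for line in lines:
        match = FAILED_CODE.search(line) or TRAILING_CODE.search(line)
        if match:
            code = int(match.group(1))
            found.append((code, CL_ERRORS.get(code, "unknown"), line.strip()))
    return found
```

Both codes relate to the device failing to provide memory or other resources for a buffer or kernel, which is why running Astropulse alone on the card is a reasonable first test.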

Claggy

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: Nvidia Titan error on Astropulse
« Reply #7 on: 15 Mar 2013, 04:57:42 pm »
Sorry, I sat here looking at the link thinking it did not look right, but it did not click with me what was wrong.



Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14347
Re: Nvidia Titan error on Astropulse
« Reply #8 on: 15 Mar 2013, 06:01:36 pm »
This is the second time I have received reports about AP not functioning on the TITAN NV GPU.
Perhaps there is some incompatibility. I have no hardware to check.

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: Nvidia Titan error on Astropulse
« Reply #9 on: 15 Mar 2013, 06:57:00 pm »
OK, I have set up the machine to run nothing but Astropulse and downloaded fresh work. However, this has made no difference: all work units are failing.




Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: Nvidia Titan error on Astropulse
« Reply #10 on: 16 Mar 2013, 06:44:28 am »
As a long shot, you could try the compatibility grid setting laid out in the sticky post in Number Crunching. I doubt it'll work, but it's worth a try.

Claggy
« Last Edit: 16 Mar 2013, 08:29:44 am by Claggy »

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: Nvidia Titan error on Astropulse
« Reply #11 on: 16 Mar 2013, 08:39:56 am »
Already tried that last night out of desperation, but it makes no difference. I am going back to CPU until the problem gets sorted.


Offline Claggy

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 3111
    • My computers at Seti Beta
Re: Nvidia Titan error on Astropulse
« Reply #12 on: 16 Mar 2013, 08:43:53 am »
Can you post the output from CLinfo, please?

http://boinc.berkeley.edu/dl/clinfo.zip

Claggy

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14347
Re: Nvidia Titan error on Astropulse
« Reply #13 on: 16 Mar 2013, 04:13:10 pm »
Because I have no TITAN hardware, I very much doubt that the problem will ever be solved without active testers.

Offline Pizzadude

  • Knight o' The Realm
  • **
  • Posts: 97
Re: Nvidia Titan error on Astropulse
« Reply #14 on: 16 Mar 2013, 09:44:35 pm »
Updating to driver 314.21 (beta) seems to allow Astropulse processing. However, as this is a beta driver that has been out for less than 24 hours, we will have to wait and see if there are any downsides to this step forward.
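For others with a Titan who want to check whether they are already on this driver or a later one, a version comparison can be sketched as follows. Treating 314.21 as the minimum is an assumption drawn only from this single report, and how the installed version string is obtained is left to the reader; the function names are illustrative:

```python
def parse_driver_version(text):
    """Turn a version string such as '314.21' into a tuple of ints: (314, 21)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_at_least(installed, minimum="314.21"):
    """True if `installed` is the same as or newer than `minimum`.

    The default minimum comes from the report above and is not a
    confirmed requirement.
    """
    return parse_driver_version(installed) >= parse_driver_version(minimum)
```

Comparing the parts as integers rather than comparing the strings avoids mistakes with versions whose digit counts differ.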



 
