
OpenCL AP v7 memory consumption


Raistmer:
And the final modification: variable ffa block size. Now this task consumes almost the usual amount of memory, even at peak.

If this modification produces correct results for the test tasks and for overflows, it will be released soon.

Mike:
I made an interesting investigation.

I installed Win 8.1 and tested the second task 30 times.

Check the GPU clock: it is downclocking, and the app falls back to the CPU.

With -oclFFT_plan 256 8 128, memory consumption is normal even with big ffa_block values.
Also, you can see that memory is freed permanently.

Raistmer:
It's quite possible that this task hides other issues still to be discovered.

For example, I've been running it on a Q9450 with the SSE app for more than 6 hours already... With the GPU results overflowing, I would expect a CPU time of no more than 10 minutes... It's quite possible it will finish without an overflow at all, and then we will have another test case for investigating why.

But excessive memory usage on overflows is a separate issue. So please compare the results with and without the oclFFT_plan switch: are they the same?
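One way to do that comparison is a plain text diff of the two saved result files. This is only a minimal sketch: the file names in the usage comment are hypothetical, and it assumes the results are ordinary text files where trailing whitespace does not matter. It is a generic diff, not the validator used by the project.

```python
import difflib
import sys
from pathlib import Path


def diff_result_text(text_a, text_b, name_a="without_switch", name_b="with_switch"):
    """Return unified-diff lines between two result texts.

    Trailing whitespace on each line is ignored. An empty list means
    the two results are textually identical.
    """
    lines_a = [line.rstrip() for line in text_a.splitlines()]
    lines_b = [line.rstrip() for line in text_b.splitlines()]
    return list(
        difflib.unified_diff(lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm="")
    )


def diff_result_files(path_a, path_b):
    """Same comparison, reading the two results from disk."""
    return diff_result_text(
        Path(path_a).read_text(), Path(path_b).read_text(), str(path_a), str(path_b)
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare.py result_default.out result_oclfft_plan.out
    differences = diff_result_files(sys.argv[1], sys.argv[2])
    print("\n".join(differences) if differences else "results are identical")
```

An empty diff only shows that the two runs wrote the same text; results that differ in the last digits would still need a tolerance-based check.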

Mike:

--- Quote from: Raistmer on 26 Oct 2014, 03:17:00 pm ---It's quite possible that this task hides other issues still to be discovered.

For example, I've been running it on a Q9450 with the SSE app for more than 6 hours already... With the GPU results overflowing, I would expect a CPU time of no more than 10 minutes... It's quite possible it will finish without an overflow at all, and then we will have another test case for investigating why.

But excessive memory usage on overflows is a separate issue. So please compare the results with and without the oclFFT_plan switch: are they the same?

--- End quote ---

Yes, it was always the same.

Only -oclFFT_plan 256 8 128 cured it.

Raistmer:

--- Quote from: Mike on 26 Oct 2014, 03:21:03 pm ---Yes, it was always the same.

Only -oclFFT_plan 256 8 128 cured it.


--- End quote ---

It's a puzzle. I'll try with 2721 on my own host.

And regarding the CPU: it ultimately gave an overflow too... but why did it take SO long?...

astropulse_7.03_windows_intelx86__sse.exe  / ap_28jn14aa_B1_P1_00131_20141017_13023.wu :
AppName: astropulse_7.03_windows_intelx86__sse.exe
AppArgs:
TaskName: ap_28jn14aa_B1_P1_00131_20141017_13023.wu
Started at  : 16:17:19.199
Ended at    : 23:17:32.194
Result      : stored as ref for validations.
  25212.541 secs Elapsed
  23937.355 secs CPU time

[ stderr ]
Not using ap_cmdline.txt-file, using commandline options.
16:17:19 (3240): Can't set up shared mem: -1. Will run in standalone mode.

Build features: Non-graphics    BLANKIT TWINDECHIRP     USE_LRINT       FFTW    USE_INCREASED_PRECISION USE_SSE x86
     CPUID: Intel(R) Core(TM)2 Quad  CPU   Q9450  @ 2.66GHz

     Cache: L1=64K L2=6144K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 SSE4.1
AstroPulse v7 Windows x86 rev 2603, V7 match, by Raistmer with support of Lunatics.kwsn.net team.
SSE
ffa threshold, twindechirp, lrint mods by Joe Segur
state.fold_buf_size_short=65536; state.fold_buf_size_long=262144
Found 30 single pulses and 30 repeating pulses, exiting.
  percent blanked: 4.25
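To put the timing in the log above into perspective, here is a small sketch that turns the reported figures into hours and compares them with the roughly 10 minutes expected for an overflow. The function and variable names are made up for illustration; the numbers are copied from the log.

```python
from datetime import datetime, timedelta


def summarize_run(started, ended, elapsed_s, cpu_s, expected_s):
    """Summarize a run from its log timestamps and reported times.

    started/ended are "HH:MM:SS.mmm" strings, the other values are seconds.
    """
    fmt = "%H:%M:%S.%f"
    t0 = datetime.strptime(started, fmt)
    t1 = datetime.strptime(ended, fmt)
    if t1 < t0:
        # The log has no date, so assume the run crossed midnight once.
        t1 += timedelta(days=1)
    return {
        "wall_s": (t1 - t0).total_seconds(),    # from the timestamps
        "elapsed_h": elapsed_s / 3600.0,        # reported elapsed time
        "cpu_h": cpu_s / 3600.0,                # reported CPU time
        "cpu_fraction": cpu_s / elapsed_s,      # share of elapsed time spent on CPU
        "times_expected": cpu_s / expected_s,   # how far beyond the expectation
    }


summary = summarize_run(
    started="16:17:19.199",
    ended="23:17:32.194",
    elapsed_s=25212.541,
    cpu_s=23937.355,
    expected_s=10 * 60,  # "no more than 10 minutes" from the earlier post
)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

With these inputs the run comes out at about 7 hours elapsed, with CPU time around 40 times the 10-minute expectation.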
