OpenCL AP v7 memory consumption

Raistmer:
Well, something can be done: if not about peak memory consumption itself, then at least about how long that peak lasts.
In the run I just did with a reduced -ffa_block 1024, it seems memory was not freed after FFA finished and control returned to the main loop (the flat area on the graph).

Mike:
I didn't assume there was anything to fix.

The easiest way for those affected is to use either no ffa_block switch or low values.

oclfft_plan in conjunction with the tune param will still speed up processing.

Raistmer:
Well, each tuning option has its own point of application inside the code, so I don't think one option could replace the effect of another. Yes, the same (but suboptimal) performance can be reached with different option combos. The absolute peak would be at one particular combo, though.
Then additional considerations arise with practical implementation, of course:
1) How broad or sharp that peak is in the tuning-parameter space. If it is broad enough, one can change one of the options quite strongly and still stay within an acceptable performance area (this is the case where Mike's approach will work fine).
2) How that absolute peak depends on the data processed. There is such a dependence in the current OpenCL code. The app's performance will depend, in particular, on the number of signals already found, on the total number of signals, and (apparent, since it is known from the very beginning, and considerably reduced in v7) on the blanking %. All this makes the performance peak move in the app's tuning-parameter space. That per-task movement has to be averaged when one estimates overall host performance, which broadens the resulting host performance peak in the app's tuning-parameter space.
So, if the broadening is strong enough, we again get a broad peak.
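The broadening argument above can be illustrated with a toy model. This is only a sketch under made-up assumptions: the Gaussian shape of the per-task curve, the peak positions, and the names task_perf and width_at_half_max are all hypothetical and not taken from the app.

```python
import math

def task_perf(x, center, width=1.0):
    """Hypothetical per-task performance curve: a Gaussian peak at `center`."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def width_at_half_max(xs, ys):
    """Span of the grid points where ys is at least half of its maximum."""
    half = max(ys) / 2
    above = [x for x, y in zip(xs, ys) if y >= half]
    return above[-1] - above[0]

# Tuning parameter in arbitrary units (assumed, one-dimensional for simplicity).
xs = [i * 0.05 for i in range(-200, 201)]

# Assumption: the peak sits at a different position for each task.
centers = [-2.0, -1.0, 0.0, 1.0, 2.0]

single = [task_perf(x, 0.0) for x in xs]
averaged = [sum(task_perf(x, c) for c in centers) / len(centers) for x in xs]

single_width = width_at_half_max(xs, single)
averaged_width = width_at_half_max(xs, averaged)

print(f"single task width at half max: {single_width:.2f}")
print(f"host average width at half max: {averaged_width:.2f}")
```

In this toy setup the averaged curve is wider than any single-task curve, which is the "broad peak again" case: the host-level optimum tolerates larger changes of one option than a single task would.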

I prefer to exclude out-of-memory crashes as much as possible, though. With the initial params string I saw such a crash on my own hardware with the TestCase task (and I'm quite sure the same parameter string will work w/o a crash on a regular task).

Raistmer:
And here are the first results on the improvement path for this area.
This picture should be directly compared with the one posted here: http://lunatics.kwsn.net/12-gpu-crunching/opencl-ap-v7-memory-consumption.msg57231.html#msg57231

As one can see, the app's memory consumption now returns to an almost normal state after FFA completion. Hence (though peak memory consumption remains the same for now) the time during which a high amount of memory is allocated to the app has decreased considerably, so the probability that 2 or more tasks will demand a huge amount of memory simultaneously is considerably decreased too. Host HDD swapping and other low-system-memory effects are therefore less likely with the new build.
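A rough way to see why a shorter peak helps: assume each of N concurrent tasks spends some fraction of its runtime at peak memory, and that the tasks are at independent points of their runs. Both assumptions and the fractions below are illustrative only, not measured values; p_overlap is a hypothetical helper.

```python
def p_overlap(n_tasks, peak_fraction):
    """Probability that at least 2 of n_tasks are at peak memory at a random moment.

    Assumes each task is independently at peak with probability peak_fraction.
    """
    none = (1 - peak_fraction) ** n_tasks
    exactly_one = n_tasks * peak_fraction * (1 - peak_fraction) ** (n_tasks - 1)
    return 1 - none - exactly_one

# Made-up fractions: peak held for half of the run vs. for a tenth of it.
for fraction in (0.5, 0.1):
    print(f"peak fraction {fraction}: P(2+ of 3 tasks at peak) = {p_overlap(3, fraction):.3f}")
```

For small fractions the overlap probability falls roughly with the square of the fraction, so shortening the peak pays off more than linearly even when the peak height stays the same.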

Thanks to all who drew my attention to this issue.

Raistmer:
~200 MB saved from the TestCase max by checking whether overflow has already been reached and, if so, not allocating more.
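The idea can be sketched in a few lines. This is not the app's code: run_search, the (signals, bytes) block layout, and the limit of 30 are hypothetical stand-ins, and real buffers would live on the GPU rather than in Python bytearrays.

```python
def run_search(blocks, max_signals):
    """Process blocks in order; stop allocating once the signal limit is reached."""
    held = []   # buffers kept alive until the end of the run
    found = 0
    for signals_in_block, buffer_bytes in blocks:
        if found >= max_signals:
            break  # overflow already reached: skip any further allocation
        held.append(bytearray(buffer_bytes))
        found += signals_in_block
    return found, sum(len(buf) for buf in held)

# Made-up workload: (signals found in block, buffer size in bytes).
blocks = [(20, 1024), (15, 1024), (5, 1024), (5, 1024)]
found, held_bytes = run_search(blocks, max_signals=30)
print(found, held_bytes)
```

With this input the limit is crossed in the second block, so the last two buffers are never allocated.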
