Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

matlab - Is it possible to return a cell array that contains one instance in several cells?

I am writing a MEX function that has to return a huge array of strings.

I do this as follows:

  mxArray * array = mxCreateCellMatrix(ARRAY_LEN, 1);
  for (size_t k = 0; k < ARRAY_LEN; ++ k) {
      mxArray *str = mxCreateString("Hello");
      mxSetCell(array, k, str);
  }
  plhs[0] = array;

However, since the string always has the same value, I would like to create only one instance of it, like this:

  mxArray * array = mxCreateCellMatrix(ARRAY_LEN, 1);
  mxArray *str = mxCreateString("Hello");

  for (size_t k = 0; k < ARRAY_LEN; ++ k) {
      mxSetCell(array, k, str);
  }
  plhs[0] = array;

Is that possible? How does the garbage collector know when to release it? Thank you.


1 Reply


The second snippet you suggested is not safe and should not be used, as it could crash MATLAB. Instead, you should write:

mxArray *arr = mxCreateCellMatrix(len, 1);
mxArray *str = mxCreateString("Hello");
for(mwIndex i=0; i<len; i++) {
    mxSetCell(arr, i, mxDuplicateArray(str));
}
mxDestroyArray(str);
plhs[0] = arr;

Unfortunately, this is not the most memory-efficient approach, because every cell gets its own deep copy of the data. Imagine that instead of a tiny string, we were storing a very large matrix (duplicated across all the cells).
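To put a rough number on that, here is a small sketch of the arithmetic, using the sizes from the rand(5000) example further down (a 100x100 cell array of 5000x5000 double matrices). The function names are mine and purely illustrative, and the figures ignore the per-array header overhead.

```cpp
#include <cstdint>

// A MATLAB double occupies 8 bytes.
constexpr std::uint64_t kBytesPerDouble = 8;

// Payload bytes when every cell holds its own deep copy of the matrix.
constexpr std::uint64_t duplicatedBytes(std::uint64_t cells,
                                        std::uint64_t rows,
                                        std::uint64_t cols) {
    return cells * rows * cols * kBytesPerDouble;
}

// Payload bytes when all cells share a single copy of the matrix.
constexpr std::uint64_t sharedBytes(std::uint64_t rows, std::uint64_t cols) {
    return rows * cols * kBytesPerDouble;
}
```

With 10,000 cells, duplicatedBytes(10000, 5000, 5000) comes out at about 2 TB, while sharedBytes(5000, 5000) is about 200 MB.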


Now, it is possible to do what you initially wanted, but you'll have to resort to undocumented hacks (like creating shared data copies or manually incrementing the reference count in the mxArray_tag structure).

In fact this is what usually happens behind the scenes in MATLAB. Take this for example:

>> c = cell(100,100);
>> c(:) = {rand(5000)};

As you know, a cell array in MATLAB is basically an mxArray whose data pointer points to an array of other mxArray variables.

In the case above, MATLAB first creates an mxArray corresponding to the 5000x5000 matrix. This will be stored in the first cell c{1}.

For the rest of the cells, MATLAB creates "lightweight" mxArrays that basically share their data with the first cell element, i.e. their data pointers point to the same block of memory holding the huge matrix.

So there is only one copy of the matrix at all times, unless of course you modify one of them (c{2,2}(1)=99), at which point MATLAB has to "unlink" the array and make a separate copy for this cell element.

Internally, each mxArray structure has a reference counter and a cross-link pointer to make this data sharing possible.
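If it helps, here is a toy model of that share-then-unlink behaviour in plain C++. To be clear, this is only a sketch: ToyArray, makeSharedCells and the use of std::shared_ptr are my own stand-ins and say nothing about how mxArray is actually laid out.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for an array header. It only points at a reference-counted
// data block; this is NOT MATLAB's real mxArray structure.
struct ToyArray {
    std::shared_ptr<std::vector<double>> data;

    // Reading never copies the data block.
    double get(std::size_t i) const { return (*data)[i]; }

    // Writing "unlinks" first: if the block is shared with other headers,
    // make a private copy before modifying it.
    void set(std::size_t i, double value) {
        if (data.use_count() > 1) {
            data = std::make_shared<std::vector<double>>(*data);
        }
        (*data)[i] = value;
    }

    bool sharesDataWith(const ToyArray& other) const {
        return data == other.data;
    }
};

// Build a "cell array" whose elements all share one data block,
// loosely mirroring c(:) = {rand(5000)}.
inline std::vector<ToyArray> makeSharedCells(std::size_t numCells,
                                             std::size_t numValues) {
    auto block = std::make_shared<std::vector<double>>(numValues, 0.5);
    return std::vector<ToyArray>(numCells, ToyArray{block});
}
```

After makeSharedCells(4, 3), all four elements point at the same block. Calling set on one of them gives that element its own copy and leaves the other three sharing the original.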

Hint: You can study this data sharing behavior by turning on the format debug option and comparing the pr pointer addresses of the various cells.

The same concept holds true for structure fields, so when we write:

x = rand(5000);
s = struct('a',x, 'b',x, 'c',x);

all the fields would point to the same copy of the data in x.


EDIT:

I forgot to show the undocumented solution I mentioned :)

mex_test.cpp

#include "mex.h"

extern "C" mxArray* mxCreateReference(mxArray*);

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize len = 10;
    mxArray *arr = mxCreateCellMatrix(len, 1);
    mxArray *str = mxCreateString("Hello");
    for(mwIndex i=0; i<len; i++) {
        // I simply replaced the call to mxDuplicateArray here
        mxSetCell(arr, i, mxCreateReference(str));
    }
    mxDestroyArray(str);
    plhs[0] = arr;
}

MATLAB

>> %c = repmat({'Hello'}, 10, 1);
>> c = mex_test()
>> c{1} = 'bye'
>> clear c

The mxCreateReference function will increment the internal reference counter of the str array each time it is called, thus letting MATLAB know that there are other copies of it.

So when you clear the resulting cell array, MATLAB will in turn decrement this counter once for each cell, until the counter reaches 0, at which point it is safe to destroy the array in question.

Using the array directly (mxSetCell(arr, i, str)) is problematic because the ref-counter immediately reaches zero after destroying the first cell. Thus for subsequent cells, MATLAB will attempt to free arrays that have already been freed, resulting in memory corruption.
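The counting argument can be replayed with a toy model that does no real freeing, so it is safe to run. Everything here (ToyBlock, addReference, release, simulate) is hypothetical and only mimics the bookkeeping described above; it is not MATLAB's actual implementation.

```cpp
#include <cstddef>
#include <vector>

// Toy model of a reference-counted block. It only tracks what would
// happen to the counter; no memory is actually freed.
struct ToyBlock {
    int refCount = 1;      // the creator holds the first reference
    bool freed = false;
    int invalidFrees = 0;  // releases that arrived after the block was freed
};

// Hypothetical analogue of mxCreateReference: bump the counter and
// hand back the same block.
inline ToyBlock* addReference(ToyBlock* block) {
    ++block->refCount;
    return block;
}

// Analogue of destroying one owner: drop one reference, "free" at zero.
inline void release(ToyBlock* block) {
    if (block->freed) {
        ++block->invalidFrees;  // this would be a double free in real code
        return;
    }
    if (--block->refCount == 0) {
        block->freed = true;
    }
}

// Put the same block into numCells slots, with or without counting the
// references, then destroy every cell. Returns the number of invalid frees.
inline int simulate(std::size_t numCells, bool countReferences) {
    ToyBlock block;
    std::vector<ToyBlock*> cells;
    for (std::size_t i = 0; i < numCells; ++i) {
        cells.push_back(countReferences ? addReference(&block) : &block);
    }
    if (countReferences) {
        release(&block);  // mirrors mxDestroyArray(str) in mex_test.cpp
    }
    for (ToyBlock* cell : cells) {
        release(cell);    // mirrors clearing the cell array
    }
    return block.invalidFrees;
}
```

With ten cells, simulate(10, true) reports no invalid frees, whereas simulate(10, false) reports nine: the first cell drops the counter to zero and the other nine then try to free a block that is already gone.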

