Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
525 views
in Technique[技术] by (71.8m points)

matlab - Fun depending on order of subs and values in accumarray

In accumarray() the first note about 'subs', first appeared in the MATLAB R14sp3 docs, says:

Note If the subscripts in subs are not sorted, fun should not depend on the order of the values in its input data.

It is not clear to me what is considered to be sorted. Suppose:

subs = [1 1
        2 1
        1 2
        2 2];
  1. shall subs be sorted in the sense issorted(subs,'rows'), or ...
  2. in the linear indexing sense, i.e. issorted(sub2ind([2 2],subs(:,1), subs(:,2)))

I would like to rely on:

accumarray(subs,val,[], @(x) x(end))

If somebody could also provide examples/tests from older releases (to check for backward compatibility) where e.g. 1) is false while 2) is true, that would be great.

PS. I am mostly not interested in alternatives to accumarray, unless very succinct and using the same subs and val.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Ok I did some tests, and I think that "sorted" in the quote is meant in the linear indexing sense. (I'm on R2013a if that matters)

To understand how accumarray calls the specified function, I'll use the trick of grouping the values into a cellarray by specifying fun = @(x) {x} as the function to be applied.

1) 1D indices

First lets create some subscripts and values

N = 10; sz = 4;
subs = randi([1 sz], [N 1]);
vals = (1:N)'*100;

Now we call ACCUMARRAY on the non-sorted indices (multiple times)

C = cell(5,1);
for i=1:5
    C{i} = accumarray(subs, vals, [], @(x){x});
end

The order of the values passed to the function is arbitrary but still consistent across the multiple runs:

>> assert(isequal(C{:}))
>> celldisp(C{1})
ans{1} =
   800
   900
   700
ans{2} =
   300
ans{3} =
        1000
         200
         100
ans{4} =
   400
   600
   500

This is why the documentation warns you that fun should not depend on the order of the values passed to it.

Now if we sort the subscripts beforehand:

[~,ord] = sort(subs);
C = cell(5,1);
for i=1:5
    C{i} = accumarray(subs(ord), vals(ord), [], @(x){x});
end
assert(isequal(C{:}))
celldisp(C{1})

we will see that the values are passed to the function sorted themselves:

ans{1} =
   700
   800
   900
ans{2} =
   300
ans{3} =
         100
         200
        1000
ans{4} =
   400
   500
   600

2) 2D indices

I've tried the same thing in the case of 2D subscript indices. First we start with random data:

%# some 2d subscripts and corresponding values
N = 10; sz = 2;
subs = randi([1 sz], [N 2]);
vals = (1:N)*100;
  • Here is the case for non-sorted indices:

    C = cell(5,1);
    for i=1:5
        C{i} = accumarray(subs, vals, [], @(x){x});
    end
    assert(isequal(C{:}))
    celldisp(C{1})
    
  • Here is when I tried "sorting by rows":

    [~,ord] = sortrows(subs, [1 2]);
    C = cell(5,1);
    for i=1:5
        C{i} = accumarray(subs(ord,:), vals(ord), [], @(x){x});
    end
    assert(isequal(C{:}))
    celldisp(C{1})
    
  • And finally here is when we sort by "linear index":

    [~,ord] = sort(sub2ind([sz sz], subs(:,1), subs(:,2)));
    C = cell(5,1);
    for i=1:5
        C{i} = accumarray(subs(ord,:), vals(ord), [], @(x){x});
    end
    assert(isequal(C{:}))
    celldisp(C{1})
    

I'll omit the long output, and report that only in the last case that the values were passed to the function ordered. So I conclude that the "sorted" criteria is based on the linear indices.

ans{1,1} =
     []
ans{2,1} =
   200
   600
   700
ans{1,2} =
         100
         300
         400
         500
        1000
ans{2,2} =
   800
   900

Bonus:

Try using the following function instead:

function out = fun(x)
    out = {x};
    disp('[')
    disp(x)
    disp(']')
end

you'll see that the function is called in an unpredictable way, even with some duplication! You might have to increase the size of the data to see such behavior...


As far as backward compatibility goes, the documentation mentions this note all the way back in MATLAB R14sp3 but not R14sp2. Given the fact that it's been documented since 2005, I'd say it is safe to rely on this feature (I doubt anyone using such old versions expects new code to just work anyway!)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...