arrays - Is indexing vectors in MATLAB inefficient?

Question

Welcome To Ask or Share your Answers For Others

arrays - Is indexing vectors in MATLAB inefficient?

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

arrays - Is indexing vectors in MATLAB inefficient?

Background

My question is motivated by simple observations, which somewhat undermine the beliefs/assumptions often held/made by experienced MATLAB users:

MATLAB is very well optimized when it comes to the built-in functions and the fundamental language features, such as indexing vectors and matrices.
Loops in MATLAB are slow (despite the JIT) and should generally be avoided if the algorithm can be expressed in a native, 'vectorized' manner.

The bottom line: core MATLAB functionality is efficient and trying to outperform it using MATLAB code is hard, if not impossible.

Investigating performance of vector indexing

The example codes shown below are as fundamental as it gets: I assign a scalar value to all vector entries. First, I allocate an empty vector x:

tic; x = zeros(1e8,1); toc
Elapsed time is 0.260525 seconds.

Having x I would like to set all its entries to the same value. In practice you would do it differently, e.g., x = value*ones(1e8,1), but the point here is to investigate the performance of vector indexing. The simplest way is to write:

tic; x(:) = 1; toc
Elapsed time is 0.094316 seconds.

Let's call it method 1 (from the value assigned to x). It seems to be very fast (faster at least than memory allocation). Because the only thing I do here is operate on memory, I can estimate the efficiency of this code by calculating the obtained effective memory bandwidth and comparing it to the hardware memory bandwidth of my computer:

eff_bandwidth = numel(x) * 8 bytes per double * 2 / time

In the above, I multiply by 2 because unless SSE streaming is used, setting values in memory requires that the vector is both read from and written to the memory. In the above example:

eff_bandwidth(1) = 1e8*8*2/0.094316 = 17 Gb/s

STREAM-benchmarked memory bandwidth of my computer is around 17.9 Gb/s, so indeed - MATLAB delivers close to peak performance in this case! So far, so good.

Method 1 is suitable if you want to set all vector elements to some value. But if you want to access elements every step entries, you need to substitute the : with e.g., 1:step:end. Below is a direct speed comparison with method 1:

tic; x(1:end) = 2; toc
Elapsed time is 0.496476 seconds.

While you would not expect it to perform any different, method 2 is clearly big trouble: factor 5 slowdown for no reason. My suspicion is that in this case MATLAB explicitly allocates the index vector (1:end). This is somewhat confirmed by using explicit vector size instead of end:

tic; x(1:1e8) = 3; toc
Elapsed time is 0.482083 seconds.

Methods 2 and 3 perform equally bad.

Another possibility is to explicitly create an index vector id and use it to index x. This gives you the most flexible indexing capabilities. In our case:

tic;
id = 1:1e8; % colon(1,1e8);
x(id) = 4;
toc
Elapsed time is 1.208419 seconds.

Now that is really something - 12 times slowdown compared to method 1! I understand it should perform worse than method 1 because of the additional memory used for id, but why is it so much worse than methods 2 and 3?

Let's try to give the loops a try - as hopeless as it may sound.

tic;
for i=1:numel(x)
    x(i) = 5;
end
toc
Elapsed time is 0.788944 seconds.

A big surprise - a loop beats a vectorized method 4, but is still slower than methods 1, 2 and 3. It turns out that in this particular case you can do it better:

tic;
for i=1:1e8
    x(i) = 6;
end
toc
Elapsed time is 0.321246 seconds.

And that is the probably the most bizarre outcome of this study - a MATLAB-written loop significantly outperforms native vector indexing. That should certainly not be so. Note that the JIT'ed loop is still 3 times slower than the theoretical peak almost obtained by method 1. So there is still plenty of room for improvement. It is just surprising (a stronger word would be more suitable) that usual 'vectorized' indexing (1:end) is even slower.

Questions

is simple indexing in MATLAB very inefficient (methods 2, 3, and 4 are slower than method 1), or did I miss something?
why is method 4 (so much) slower than methods 2 and 3?
why does using 1e8 instead of numel(x) as a loop bound speed up the code by factor 2?

Edit After reading Jonas's comment, here is another way to do that using logical indices:

tic;
id = logical(ones(1, 1e8));
x(id) = 7;
toc
Elapsed time is 0.613363 seconds.

Much better than method 4.

For convenience:

function test

tic; x = zeros(1,1e8); toc

tic; x(:) = 1; toc
tic; x(1:end) = 2; toc
tic; x(1:1e8) = 3; toc

tic;
id = 1:1e8; % colon(1,1e8);
x(id) = 4;
toc

tic;
for i=1:numel(x)
    x(i) = 5;
end
toc

tic;
for i=1:1e8
    x(i) = 6;
end
toc

end

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-17T00:58:26+0000

I can, of course, only speculate. However when I run your test with the JIT compiler enabled vs disabled, I get the following results:

 % with JIT   no JIT
    0.1677    0.0011 %# init
    0.0974    0.0936 %# #1 I added an assigment before this line to avoid issues with deferring
    0.4005    0.4028 %# #2
    0.4047    0.4005 %# #3
    1.1160    1.1180 %# #4
    0.8221   48.3239 %# #5 This is where "don't use loops in Matlab" comes from 
    0.3232   48.2197 %# #6
    0.5464   %# logical indexing

Dividing shows us where there is any speed increase:

% withoutJit./withJit
    0.0067 %# w/o JIT, the memory allocation is deferred
    0.9614 %# no JIT
    1.0057 %# no JIT
    0.9897 %# no JIT
    1.0018 %# no JIT
   58.7792 %# numel
  149.2010 %# no numel

The apparent speed-up on initialization happens, because with JIT turned off it appears that MATLAB delays the memory allocation until it is used, so x=zeros(...) does not do anything really. (thanks, @angainor).

Methods 1 through 4 don't seem to benefit from the JIT. I guess that #4 could be slow due to additional input testing in subsref to make sure that the input is of the proper form.

The numel result could have something to do with it being harder for the compiler to deal with uncertain number of iterations, or with some overhead due to checking whether the bound of the loop is ok (thought no-JIT tests suggest only ~0.1s for that)

Surprisingly, on R2012b on my machine, logical indexing seems to be slower than #4.

I think that this goes to show, once again, that MathWorks have done great work in speeding up code, and that "don't use loops" isn't always best if you're trying to get the fastest execution time (at least at the moment). Nevertheless, I find that vectorizing is in general a good approach, since (a) the JIT fails on more complex loops, and (b) learning to vectorize makes you understand Matlab a lot better.

Conclusion: If you want speed, use the profiler, and re-profile if you switch Matlab versions. As pointed out by @Adriaan in the comments, nowadays it may be better to use timeit() to measure execution speed.

For reference, I used the following slightly modified test function

function tt = speedTest

tt = zeros(8,1);

tic; x = zeros(1,1e8); tt(1)=toc;

x(:) = 2;

tic; x(:) = 1; tt(2)=toc;
tic; x(1:end) = 2; tt(3)=toc;
tic; x(1:1e8) = 3; tt(4)=toc;

tic;
id = 1:1e8; % colon(1,1e8);
x(id) = 4;
tt(5)=toc;

tic;
for i=1:numel(x)
    x(i) = 5;
end
tt(6)=toc;

tic;
for i=1:1e8
    x(i) = 6;
end
tt(7)=toc;

%# logical indexing
tic;
id = true(1e8,1));
x(id)=7;
tt(8)=toc;

Categories

arrays - Is indexing vectors in MATLAB inefficient?

arrays - Is indexing vectors in MATLAB inefficient?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags