Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
900 views
in Technique[技术] by (71.8m points)

performance - Best practice when working with sparse matrices

This question is based on a discussion in this question. I have been working with sparse matrices earlier and my belief is that the way I've been working with these is efficient.

My question is twofold:

In the below, A = full(S) where S is a sparse matrix.

What's the "correct" way to access an element in a sparse matrix?

That is, what would the sparse equivalent to var = A(row, col) be?

My view on this topic: You wouldn't do anything different. var = S(row, col) is as efficient as it gets. I have been challenged on this with the explanation:

Accessing element on row 2 and column 2 the way you said, S(2,2) does the same as adding a new element: var = S(2,2) => A = full(S) => var = A(2,2) => S = sparse(A) => 4.

Can this statement really be correct?

What's the "correct" way to add elements to a sparse matrix?

That is, what would the sparse equivalent of A(row, col) = var be? (Assuming A(row, col) == 0 to begin with)

It is known that simply doing A(row, col) = var is slow for large sparse matrices. From the documentation:

If you wanted to change a value in this matrix, you might be tempted to use the same indexing:

B(3,1) = 42; % This code does work, however, it is slow.

My view on this topic: When working with sparse matrices, you often start with the vectors and use them to create the matrix this way: S = sparse(i,j,s,m,n). Of course, you could also have created it like this: S = sparse(A) or sprand(m,n,density) or something similar.

If you start of the first way, you would simply do:

i = [i; new_i];
j = [j; new_j];
s = [s; new_s];
S = sparse(i,j,s,m,n); 

If you started out not having the vectors, you would do the same thing, but use find first:

[i, j, s] = find(S);
i = [i; new_i];
j = [j; new_j];
s = [s; new_s];
S = sparse(i,j,s,m,n); 

Now you would of course have the vectors, and can reuse them if you're doing this operation several times. It would however be better to add all new elements at once, and not do the above in a loop, because growing vectors are slow. In this case, new_i, new_j and new_s will be vectors corresponding to the new elements.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

MATLAB stores sparse matrices in compressed column format. This means that when you perform an operations like A(2,2) (to get the element in at row 2, column 2) MATLAB first access the second column and then finds the element in row 2 (row indices in each column are stored in ascending order). You can think of it as:

 A2 = A(:,2);
 A2(2)

If you are only accessing a single element of sparse matrix doing var = S(r,c) is fine. But if you are looping over the elements of a sparse matrix, you probably want to access one column at a time, and then loop over the nonzero row indices via [i,~,x]=find(S(:,c)). Or use something like spfun.

You should avoid constructing a dense matrix A and then doing S = sparse(A), as this operations just squeezes out zeros. Instead, as you note, it's much more efficient to build a sparse matrix from scratch using triplet-form and a call to sparse(i,j,x,m,n). MATLAB has a nice page which describes how to efficiently construct sparse matrices.

The original paper describing the implementation of sparse matrices in MATLAB is quite a good read. It provides some more info on how the sparse matrix algorithms were originally implemented.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...