Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Find all rows of matrix equal to vector

Suppose I have the following matrix:

cm<-structure(c(100, 200, 400, 800, 100, 200, 400, 800, 100, 200, 
400, 800, 100, 200, 400, 800, 100, 200, 400, 800, 0, 0, 0, 0, 
0.5, 0.5, 0.5, 0.5, 1, 1, 1, 1, 0, 0, 0, 0, 0.5, 0.5, 0.5, 0.5, 
-0.4, -0.4, -0.4, -0.4, -0.4, -0.4, -0.4, -0.4, -0.4, -0.4, -0.4, 
-0.4, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1), .Dim = c(20L, 4L), .Dimnames = list(
    NULL, c("Var1", "Var2", "Var3", "n1")))

and another matrix derived from it:

a4<-data.matrix(unique(cm[,1:3]))

Now, I want to find all the rows of cm whose first three columns are equal to a4[1,], but doing the intuitive thing:

a5<-which(cm[,1:3]==a4[1,])

fails (R 3.1.3). For example, a5[2] is 13, but the 13th row of cm[,1:3] is not the same as a4[1,].


1 Reply


Use apply and all.equal to compare each row against the target row. The problem with == is that it compares element by element, recycling the shorter vector along the elements of the matrix, whereas you want to know whether all values in a row match a4[1,]. all.equal does that kind of whole-object comparison. The catch is that its return value is not always a logical: it is TRUE when the objects are (nearly) equal, and otherwise a character string describing the differences between them, which makes it a little messier to work with than == alone:

which(apply(cm, 1, function(x) all.equal(x[1:3], a4[1,])) == "TRUE")
# [1] 1

You can also make that a bit simpler by using identical instead of all.equal:

which(apply(cm, 1, function(x) identical(x[1:3], a4[1,])))
# [1] 1
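
One caveat when choosing between the two: identical compares doubles exactly, while all.equal allows a small numeric tolerance. The sketch below illustrates the difference on made-up data; the matrix m and the vector target are hypothetical and are not part of the question. It also wraps all.equal in isTRUE, which is one way to avoid comparing against the string "TRUE".

```r
# Hypothetical data, not from the question: the stored value is 0.3,
# while the target value is computed as 0.1 + 0.2 and differs in the last bits.
m <- matrix(c(1, 2, 0.3, 0.3), nrow = 2,
            dimnames = list(NULL, c("a", "b")))
target <- c(a = 1, b = 0.1 + 0.2)

# Exact comparison: no row matches, because 0.1 + 0.2 != 0.3 in floating point.
exact <- which(apply(m, 1, function(x) identical(x, target)))

# Tolerant comparison: isTRUE() turns all.equal()'s TRUE-or-message result
# into a plain logical.
tolerant <- which(apply(m, 1, function(x) isTRUE(all.equal(x, target))))

print(exact)     # integer(0)
print(tolerant)  # [1] 1
```

For the matrix in the question both approaches give the same answer, since a4 is taken directly from cm and the values are bit-for-bit equal.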

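To extract the matching rows themselves rather than their indices, use the logical vector directly as a row index: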

cm[apply(cm, 1, function(x) identical(x[1:3], a4[1,])),,drop=FALSE]
#      Var1 Var2 Var3 n1
# [1,]  100    0 -0.4  1
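
If calling a function once per row becomes slow on a large matrix, a vectorised comparison is a possible alternative. The following is only a sketch, assuming exact == matching is acceptable: it repeats each target element down its own column so that nothing is left to implicit recycling. The names target, hits and rows are illustrative, and cm is rebuilt here in a compact form that should be equivalent to the matrix in the question.

```r
# Compact rebuild of the question's matrix (assumed equivalent to the structure() call).
cm <- cbind(Var1 = rep(c(100, 200, 400, 800), 5),
            Var2 = rep(c(0, 0.5, 1, 0, 0.5), each = 4),
            Var3 = rep(c(-0.4, 0), c(12, 8)),
            n1   = 1)
a4 <- unique(cm[, 1:3])
target <- a4[1, ]

# rep(..., each = nrow(cm)) lays the target out column by column,
# so element [i, j] of the matrix is compared with target[j].
hits <- cm[, 1:3] == rep(target, each = nrow(cm))

# A row matches when all of its columns match.
rows <- which(rowSums(hits) == ncol(hits))

print(rows)                     # [1] 1
print(cm[rows, , drop = FALSE])
```

The rowSums step counts the matching columns per row, so the same pattern works for any number of compared columns.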

To clarify exactly what's happening, consider what == does implicitly when you pass a matrix argument:

which(cm[,1:3]==a4[1,])
# [1]  1 13 23 35 42 45 48 51 53 56 59

That result is the same as the one you get after first converting the matrix to a vector:

as.vector(cm[,1:3])
#  [1] 100.0 200.0 400.0 800.0 100.0 200.0 400.0 800.0 100.0 200.0 400.0 800.0 100.0 200.0 400.0 800.0 100.0 200.0 400.0 800.0   0.0   0.0   0.0   0.0   0.5   0.5   0.5
# [28]   0.5   1.0   1.0   1.0   1.0   0.0   0.0   0.0   0.0   0.5   0.5   0.5   0.5  -0.4  -0.4  -0.4  -0.4  -0.4  -0.4  -0.4  -0.4  -0.4  -0.4  -0.4  -0.4   0.0   0.0
# [55]   0.0   0.0   0.0   0.0   0.0   0.0
which(as.vector(cm[,1:3])==a4[1,])
# [1]  1 13 23 35 42 45 48 51 53 56 59

Thus, the positions are positions within the vector representation of cm[,1:3] (which runs down the columns), not rows in the matrix representation. == comparisons can also be dangerous (again due to the recycling noted above) when comparing vectors of different lengths. If the longer length is not a multiple of the shorter one, R at least produces a warning:

1:2 == 1:3
# [1]  TRUE  TRUE FALSE
# Warning message:
# In 1:2 == 1:3 :
#   longer object length is not a multiple of shorter object length

Whereas there is no warning when the longer length is an exact multiple of the shorter one, so the recycling happens silently:

1:2 == 1:6
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE
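
To make the recycling in the matrix case concrete, the sketch below reproduces by hand what == does with the 20 x 3 matrix and the length-3 target, and then maps one matched position back to a row and column. The names flat, recycled and pos are illustrative, and cm is rebuilt in a compact form that should be equivalent to the matrix in the question.

```r
# Compact rebuild of the question's matrix (assumed equivalent to the structure() call).
cm <- cbind(Var1 = rep(c(100, 200, 400, 800), 5),
            Var2 = rep(c(0, 0.5, 1, 0, 0.5), each = 4),
            Var3 = rep(c(-0.4, 0), c(12, 8)),
            n1   = 1)
target <- unique(cm[, 1:3])[1, ]   # 100, 0, -0.4

# == walks the matrix column by column and repeats the target
# until it is as long as the flattened matrix (60 elements here).
flat     <- as.vector(cm[, 1:3])
recycled <- rep_len(unname(target), length(flat))
pos      <- which(flat == recycled)

# The hand-built comparison gives the same positions as the original expression.
print(identical(pos, which(cm[, 1:3] == target)))  # TRUE

# Position 13 is row 13 of column 1; it only matched because the
# recycled target happens to be back at its first element (100) there.
print(arrayInd(13, dim(cm[, 1:3])))
print(recycled[13])                                # [1] 100
```

Since 60 is an exact multiple of 3, R performs this recycling without any warning, which is why the original which() call looked like it worked.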
