Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
482 views
in Technique[技术] by (71.8m points)

r - Using grep to help subset a data frame

I am having trouble subsetting my data. I want the data subsetted on column x, where the first 3 characters begin G45.

My data frame:

x <- c("G448", "G459", "G479", "G406")  
y <- c(1:4)
My.Data <- data.frame (x,y)

I have tried:

subset (My.Data, x=="G45*")

But I am unsure how to use wildcards. I have also tried grep() to find the indicies:

grep  ("G45*", My.Data$x)

but it returns all 4 rows, rather than just those beginning G45, probably also as I am unsure how to use wildcards.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

It's pretty straightforward using [ to extract:

grep will give you the position in which it matched your search pattern (unless you use value = TRUE).

grep("^G45", My.Data$x)
# [1] 2

Since you're searching within the values of a single column, that actually corresponds to the row index. So, use that with [ (where you would use My.Data[rows, cols] to get specific rows and columns).

My.Data[grep("^G45", My.Data$x), ]
#      x y
# 2 G459 2

The help-page for subset shows how you can use grep and grepl with subset if you prefer using this function over [. Here's an example.

subset(My.Data, grepl("^G45", My.Data$x))
#      x y
# 2 G459 2

As of R 3.3, there's now also the startsWith function, which you can again use with subset (or with any of the other approaches above). According to the help page for the function, it's considerably faster than using substring or grepl.

subset(My.Data, startsWith(as.character(x), "G45"))
#      x y
# 2 G459 2

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...