Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
946 views
in Technique[技术] by (71.8m points)

dynamic regex in R

The below code works so long as before and after strings have no characters that are special to a regex:

before <- 'Name of your Manager (note "self" if you are the Manager)' #parentheses cause problem in regex
after  <- 'CURRENT FOCUS'

pattern <- paste0(c('(?<=', before, ').*?(?=', after, ')'), collapse='')
ex <- regmatches(x, gregexpr(pattern, x, perl=TRUE))

Does R have a function to escape strings to be used in regexes?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

In Perl, there is http://perldoc.perl.org/functions/quotemeta.html for doing exactly that. If the doc is correct when it says

Returns the value of EXPR with all the ASCII non-"word" characters backslashed. (That is, all ASCII characters not matching /[A-Za-z_0-9]/ will be preceded by a backslash in the returned string, regardless of any locale settings.)

then you can achieve the same by doing:

quotemeta <- function(x) gsub("([^A-Za-z_0-9])", "\\\1", x)

And your pattern should be:

pattern <- paste0(c('(?<=', quotemeta(before), ').*?(?=', quotemeta(after), ')'),
                  collapse='')

Quick sanity check:

a <- "he'l(lo)"
grepl(a, a)
# [1] FALSE
grepl(quotemeta(a), a)
# [1] TRUE

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...