Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex - breaking up a long regular expression in R

Problem: I am using R with stringr, and I have a very long regular expression built with the "or" operator (|) that I save to an object and pass to stringr functions. How can I break it across multiple lines in R so that I do not have to keep scrolling to the right in my source editor? When I try separating the pieces with commas, only the first line is recognized. Most answers to this question cover other programming languages, not R.

regex_of_sites <- "side|southeast|north|computer|engineer|first|south|pharm|left|southwest|level|second|thirteenth"


1 Reply


Since you are using the pattern with stringr functions, which use the ICU regex flavor, you can use the (?x) free-spacing modifier (also called verbose mode, or "ignore pattern whitespace"). With it, all unescaped whitespace in the pattern is ignored when the pattern is compiled, and anything from an unescaped # to the end of the line is treated as a comment. As a consequence, every literal whitespace character and every literal # in the pattern must be escaped.

Here is an example:

> library(stringr)
> regex_of_sites <- "(?x)side     # Term 0
+ |southeast                      # Term 1
+ |north                          # Term 2
+ |computer                       # Term 3
+ |engineer
+ |first
+ |south
+ |pharm
+ |left
+ |southwest
+ |level
+ |second
+ |thirteenth"
> str_extract_all("first level", regex_of_sites)
[[1]]
[1] "first" "level"
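
The escaping rule mentioned above is easy to trip over, so here is a minimal sketch of it. The pattern, the input string, and the name `pattern` are made up for illustration and are not part of the original question. It assumes that an escaped space and an escaped # are kept as literals in free-spacing mode, as described above; note that a regex backslash is written as `\\` inside an R string.

```r
library(stringr)

# Hypothetical pattern, for illustration only.
# In (?x) mode, a literal space and a literal '#' must be escaped;
# everything after an unescaped '#' up to the end of the line is a comment.
pattern <- "(?x)
  first\\ level   # the escaped space matches a literal space
  | issue\\#\\d+  # the escaped hash matches a literal hash sign
"

result <- str_extract_all("first level, see issue#42", pattern)
print(result)
```

Without the backslash before the space, the first alternative would be compiled as `firstlevel` and would no longer match "first level".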

The same modifier is supported by PCRE, the regex engine that base R regex functions use when called with perl=TRUE.
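
As a rough sketch of the base R variant, the snippet below uses `gregexpr()` and `regmatches()` with `perl = TRUE`. The term list is shortened for brevity and the variable names `text` and `matches` are assumptions added for this example.

```r
# Shortened version of the pattern, for illustration only.
# (?x) lets the pattern span several lines and carry comments.
regex_of_sites <- "(?x)
  side | southeast | north   # whitespace and comments are ignored
  | first | level
"

text <- "first level"

# perl = TRUE switches base R to PCRE, which understands (?x)
matches <- regmatches(text, gregexpr(regex_of_sites, text, perl = TRUE))
print(matches)
```

With the default (non-PCRE) engine, that is, without `perl = TRUE`, the (?x) modifier should not be relied on.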

