regex - How to do str_extract with base R?

Question

Welcome To Ask or Share your Answers For Others

regex - How to do str_extract with base R?

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

regex - How to do str_extract with base R?

I am balancing several versions of R and want to change my R libraries loaded depending on which R and which operating system I'm using. As such, I want to stick with base R functions.

I was reading this page to see what the base R equivalent to stringr::str_extract was:

http://stat545.com/block022_regular-expression.html

It suggested I could replicate this functionality with grep. However, I haven't been able to get grep to do more than return the whole string if there is a match. Is this possible with grep alone, or do I need to combine it with another function? In my case I'm trying to distinguish between CentOS versions 6 and 7.

grep(pattern = "release ([0-9]+)", x = readLines("/etc/system-release"), value = TRUE)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

1.4m articles

1.4m replys

5 comments

57.0k users

Most popular tags

javascript python c# java How android c++ php ios html sql r c node.js .net iphone asp.net css reactjs jquery ruby What Android objective mysql linux Is git Python windows Why regex angular swift amazon excel algorithm macos Java visual how bash Can multithreading PHP Using scala angularjs typescript apache spring performance postgresql database flutter json rust arrays C# dart vba django wpf xml vue.js In go Get google jQuery xcode jsf http Google mongodb string shell oop powershell SQL C++ security assembly docker Javascript Android: Does haskell Convert azure debugging delphi vb.net Spring datetime pandas oracle math Django

联盟问答网站-Union QA website

Xstack问答社区

生活宝问答社区

OverStack问答社区

Ostack问答社区

在这了问答社区

在哪了问答社区

Xstack问答社区

无极谷问答社区

TouSu问答社区

SQlite问答社区

Qi-U问答社区

MLink问答社区

Jonic问答社区

Jike问答社区

16892问答社区

Vigges问答社区

55276问答社区

OGeek问答社区

深圳家问答社区

深圳家问答社区

深圳家问答社区

Vigges问答社区

Vigges问答社区

在这了问答社区

DevDocs API Documentations

Xstack问答社区

生活宝问答社区

OverStack问答社区

Ostack问答社区

在这了问答社区

在哪了问答社区

Xstack问答社区

无极谷问答社区

TouSu问答社区

SQlite问答社区

Qi-U问答社区

MLink问答社区

Jonic问答社区

Jike问答社区

16892问答社区

Vigges问答社区

55276问答社区

OGeek问答社区

深圳家问答社区

深圳家问答社区

深圳家问答社区

Vigges问答社区

Vigges问答社区

在这了问答社区

在这了问答社区

DevDocs API Documentations

Xstack问答社区

生活宝问答社区

OverStack问答社区

Ostack问答社区

在这了问答社区

在哪了问答社区

Xstack问答社区

无极谷问答社区

TouSu问答社区

SQlite问答社区

Qi-U问答社区

MLink问答社区

Jonic问答社区

Jike问答社区

16892问答社区

Vigges问答社区

55276问答社区

OGeek问答社区

深圳家问答社区

深圳家问答社区

深圳家问答社区

Vigges问答社区

Vigges问答社区

在这了问答社区

DevDocs API Documentations

广告位招租

深蓝 · Answer 1 · 2021-10-23T19:36:20+0000

1) strcapture If you want to extract a string of digits and dots from "release 1.2.3" using base then

x <- "release 1.2.3"
strcapture("([0-9.]+)", x, data.frame(version = character(0)))
##   version
## 1   1.2.3

2) regexec/regmatches There is also regmatches and regexec but that has already been covered in another answer.

3) sub Also it is often possible to use sub:

sub(".* ([0-9.]+).*", "\1", x)
## [1] "1.2.3"

3a) If you know the match is at the beginning or end then delete everything after or before it:

sub(".* ", "", x)
## [1] "1.2.3"

4) gsub Sometimes we know that the field to be extracted has certain characters and they do not appear elsewhere. In that case simply delete every occurrence of every character that cannot be in the string:

gsub("[^0-9.]", "", x)
## [1] "1.2.3"

5) read.table One can often decompose the input into fields and then pick off the desired one by number or via grep. strsplit, read.table or scan can be used:

read.table(text = x, as.is = TRUE)[[2]]
## [1] "1.2.3"

5a) grep/scan

grep("^[0-9.]+$", scan(textConnection(x), what = "", quiet = TRUE), value = TRUE)
## [1] "1.2.3"

5b) grep/strsplit

grep("^[0-9.]+$", strsplit(x, " ")[[1]], value = TRUE)
## [1] "1.2.3"

6) substring If we know the character position of the field we can use substring like this:

substring(x, 9)
## [1] "1.2.3"

6a) substring/regexpr or we may be able to use regexpr to locate the character position for us:

substring(x, regexpr("\d", x))
## [1] "1.2.3"

7) read.dcf Sometimes it is possible to convert the input to dcf form in which case it can be read with read.dcf. Such data is of the form name: value

 read.dcf(textConnection(sub(" ", ": ", x)))
 ##      release
 ## [1,] "1.2.3"

Categories

regex - How to do str_extract with base R?

regex - How to do str_extract with base R?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags