regex - Split on first/nth occurrence of delimiter

Question

posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)

I am trying something I thought would be easy. I'm looking for a single-regex solution (though others are welcome for completeness). I want to split a string on the nth occurrence of a delimiter.

Here is some data:

x <- "I like_to see_how_too"
pat <- "_"

Desired outcome

Say I want to split on the first occurrence of _:

[1] "I like"  "to see_how_too"

Say I want to split on the second occurrence of _:

[1] "I like_to see"   "how_too"

Ideally, the solution is a regex one-liner that generalizes to the nth occurrence and uses strsplit with a single regex.

Here's a solution that works but doesn't meet my requirement of a single regex used with strsplit:

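x <- "I like_to see_how_too"
y <- "_"
n <- 1
# Position of the nth literal occurrence of the delimiter
loc <- gregexpr(y, x, fixed = TRUE)[[1]][n]

# Text before and after that position
c(substr(x, 1, loc - 1), substr(x, loc + 1, nchar(x)))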
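
The substr approach above can be wrapped in a small function so that n and the delimiter become arguments. This is only a sketch: the name split_at_nth and its signature are made up for illustration, and returning the string unsplit when it contains fewer than n delimiters is an assumption about the desired behavior, not something the question specifies.

```r
# Hypothetical helper; name, signature and fallback behavior are assumptions.
split_at_nth <- function(x, delim, n) {
  # Start positions of every literal occurrence of `delim` (-1 if none).
  locs <- gregexpr(delim, x, fixed = TRUE)[[1]]
  # Fewer than n occurrences: return the string unsplit.
  if (locs[1] == -1 || length(locs) < n) return(x)
  loc <- locs[n]
  # Text before the nth delimiter, and text after it.
  c(substr(x, 1, loc - 1), substr(x, loc + nchar(delim), nchar(x)))
}

split_at_nth("I like_to see_how_too", "_", 1)
# [1] "I like"  "to see_how_too"
split_at_nth("I like_to see_how_too", "_", 2)
# [1] "I like_to see" "how_too"
```

Because the match is literal (fixed = TRUE), the delimiter does not need regex escaping, but this still does not satisfy the single-regex requirement.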

1 Reply

深蓝 · Answer 1 · 2021-10-23T20:00:57+0000

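Here is another solution using the gsubfn package and some regex-fu. The first group captures n delimiter-terminated chunks, the second group captures the rest, and the simplify formula strips the trailing delimiter from the captured pieces. To split on a different occurrence of the delimiter, change the number inside the range quantifier, {n}.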

library(gsubfn)
x <- 'I like_to see_how_too'
strapply(x, '((?:[^_]*_){1})(.*)', c, simplify =~ sub('_$', '', x))
# [1] "I like"  "to see_how_too"

If you would like the nth occurrence to be user defined, you could use the following:

n <- 2
re <- paste0('((?:[^_]*_){',n,'})(.*)')
strapply(x, re, c, simplify =~ sub('_$', '', x))
# [1] "I like_to see" "how_too"
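
For readers without gsubfn, a similar capture-group idea can be sketched in base R with regexec and regmatches. This is not the answer's method, just an adaptation under stated assumptions: the delimiter is hard-coded as "_" in the pattern, the variable names re, m and parts are illustrative, and leaving the string unsplit when there is no nth delimiter is an assumed fallback.

```r
x <- "I like_to see_how_too"
n <- 2

# Group 1: everything before the nth "_"; group 2: everything after it.
# The pattern reads: (n - 1) chunks ending in "_", one more chunk,
# then the nth "_" itself (outside both groups).
re <- sprintf("^((?:[^_]*_){%d}[^_]*)_(.*)$", n - 1)

# regexec() reports the full match followed by the capture groups;
# dropping the first element leaves the two pieces.
m <- regmatches(x, regexec(re, x, perl = TRUE))[[1]]
parts <- if (length(m) == 3) m[-1] else x  # no nth "_": leave x unsplit
parts
# [1] "I like_to see" "how_too"
```

Keeping the nth delimiter outside both groups means no trailing "_" has to be stripped afterwards.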
