regex - R remove last word from string

Question

Welcome To Ask or Share your Answers For Others

regex - R remove last word from string

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

regex - R remove last word from string

I'm trying to do something but can't remember/find the answer. I have a list of city names from the Census Bureau and they put the city's type on the end which is messing up my match().

I'd like to make this:

Middletown Township
Sunny Valley Borough
Hillside Village

into this:

Middletown
Sunny Valley
Hillside

Any suggestions? Ideally I'd also like to know if there's a lastIndexOf() function in R.

Here's the dput:

> dput(df1)
structure(list(id = c(1, 2, 3), city = structure(c(2L, 3L, 1L
), .Label = c("Hillside Village", "Middletown Township", "Sunny Valley Borough"
), class = "factor")), .Names = c("id", "city"), row.names = c(NA, 
-3L), class = "data.frame")

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-17T01:13:33+0000

This will work:

gsub("\s*\w*$", "", df1$city)
[1] "Middletown"   "Sunny Valley" "Hillside"

It removes any substring consisting of one or more space chararacters, followed by any number of "word" characters (spaces, numbers, or underscores), followed by the end of the string.

Categories

regex - R remove last word from string

regex - R remove last word from string

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags