Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


R - How to split a data frame column at the first instance of a character in its values

I have a data frame with a single character column:

x <- data.frame(a=c("A_B_D", "B_C"))

I want to split x$a into two new columns at the first instance of "_" to get

x$b
[1] "A" "B"

and

x$c
[1] "B_D" "C"

I tried variants of gsub, but couldn't come up with a solution.



1 Reply


One idea is to replace the first _ with another delimiter and then split on that new delimiter. This works because sub replaces only the first match (whereas gsub replaces all of them). The new delimiter must be a character that does not already occur in the values; a comma is used here, i.e.

strsplit(sub('_', ',', x$a), ',', fixed = TRUE)
#[[1]]
#[1] "A"   "B_D"

#[[2]]
#[1] "B" "C"
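If a safe replacement delimiter is hard to pick, one alternative is to locate the first "_" directly and cut the string around it. The following is a minimal sketch, not part of the original answer; the variable names pos, has_sep, before and after are illustrative, and it assumes that values without any "_" should keep the whole string in the first part and get NA in the second.

```r
x <- data.frame(a = c("A_B_D", "B_C"), stringsAsFactors = FALSE)

# Position of the first "_" in each value (-1 when there is none)
pos <- as.integer(regexpr("_", x$a, fixed = TRUE))
has_sep <- pos > 0

# Text before the first "_" (assumption: whole value when no "_" is present)
before <- ifelse(has_sep, substr(x$a, 1, pos - 1), x$a)

# Text after the first "_" (assumption: NA when no "_" is present)
after <- ifelse(has_sep, substr(x$a, pos + 1, nchar(x$a)), NA_character_)
```

Here before holds "A" and "B", and after holds "B_D" and "C". Because nothing is substituted, commas or any other characters in the data are left untouched.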

To create two new columns in your original data frame (stored here as a nested data frame column named new, which prints as new.X1 and new.X2),

within(x, new <- data.frame(do.call(rbind, strsplit(sub('_', ',', x$a), ',', fixed = TRUE))))
#      a new.X1 new.X2
#1 A_B_D      A    B_D
#2   B_C      B      C
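If plain columns named b and c are wanted, as in the question, the pieces can be assigned one at a time. This is a minimal sketch under the assumption that every value contains at least one "_"; a value without one would yield a single piece, which rbind would silently recycle. The name parts is illustrative.

```r
x <- data.frame(a = c("A_B_D", "B_C"), stringsAsFactors = FALSE)

# Swap only the first "_" for ",", then split on "," to get two pieces per row
pieces <- strsplit(sub("_", ",", x$a, fixed = TRUE), ",", fixed = TRUE)

# Stack the pieces into a two-column character matrix, one row per value
parts <- do.call(rbind, pieces)

# Assign each matrix column to its own data frame column
x$b <- parts[, 1]
x$c <- parts[, 2]
# x$b is "A" "B" and x$c is "B_D" "C"
```

This keeps x as an ordinary data frame with three columns a, b and c, so x$b and x$c can be used directly.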
