r - Unlisting columns by groups

posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

I have a data frame in the following format:

id | name               | logs                                  
---+--------------------+-----------------------------------------
84 |          "zibaroo" |                             "C47931038" 
12 | "fabien kelyarsky" | c("C47331040", "B19412225", "B18511449")
96 |     "mitra lutsko" |              c("F19712226", "A18311450")
34 |       "PaulSandoz" |                             "A47431044" 
65 |       "BeamVision" |                             "D47531045"

As you can see, each cell of the "logs" column holds a character vector.

Is there an efficient way to convert the data frame to long format (one observation per row) without the intermediate step of splitting "logs" into several columns?

This matters because the dataset is very large and the number of logs per person varies arbitrarily.

In other words, I need the following:

id | name               | log
---+--------------------+------------
84 |          "zibaroo" | "C47931038" 
12 | "fabien kelyarsky" | "C47331040"
12 | "fabien kelyarsky" | "B19412225"
12 | "fabien kelyarsky" | "B18511449"
96 |     "mitra lutsko" | "F19712226"
96 |     "mitra lutsko" | "A18311450"
34 |       "PaulSandoz" | "A47431044" 
65 |       "BeamVision" | "D47531045"

Here is the dput() output for a section of the real data frame:

structure(list(id = 148:157, name = c("avihil1", "Niarfe", "doug henderson", 
"nick tan", "madisp", "woodbusy", "kevinhcross", "cylol", "andrewarrow", 
"gstavrev"), logs = list("Z47331572", "Z47031573", c("F47531574", 
"B195945", "D186871", "S192939", "S182865", "G19539045"), c("A47231575", 
"A190933", "C181859"), "F47431576", c("B47231577", "D193936", 
"Q184862"), "Y47331579", c("A47531580", "Z195944", "B185870"), 
"N47731581", "E47231582")), .Names = c("id", "name", "logs"
), row.names = 149:158, class = "data.frame")
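
As a quick sanity check, the following sketch rebuilds that sample and confirms that "logs" is a list column whose elements have different lengths. The variable name df is an assumption (the question does not name the object); it is chosen to match the name used in the reply below.

```r
# Assumption: the dput() output above is assigned to a variable called `df`.
df <- structure(list(
  id = 148:157,
  name = c("avihil1", "Niarfe", "doug henderson", "nick tan", "madisp",
           "woodbusy", "kevinhcross", "cylol", "andrewarrow", "gstavrev"),
  logs = list(
    "Z47331572", "Z47031573",
    c("F47531574", "B195945", "D186871", "S192939", "S182865", "G19539045"),
    c("A47231575", "A190933", "C181859"),
    "F47431576",
    c("B47231577", "D193936", "Q184862"),
    "Y47331579",
    c("A47531580", "Z195944", "B185870"),
    "N47731581", "E47231582"
  )
), names = c("id", "name", "logs"), row.names = 149:158, class = "data.frame")

is.list(df$logs)       # TRUE: each cell is a separate character vector
lengths(df$logs)       # 1 1 6 3 1 3 1 3 1 1
sum(lengths(df$logs))  # 21 rows expected in the long format
```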

1 Reply

深蓝 · Answer 1 · 2021-10-17T01:13:28+0000

Using listCol_l from splitstackshape could be a good option here, since the "logs" column in the data.frame is a list:

library(splitstackshape)
# Unlist the list column 'logs' into long form; the result column is named 'logs_ul'
listCol_l(df, 'logs')

 #    id           name   logs_ul
 #1: 148        avihil1 Z47331572
 #2: 149         Niarfe Z47031573
 #3: 150 doug henderson F47531574
 #4: 150 doug henderson   B195945
 #5: 150 doug henderson   D186871
 #6: 150 doug henderson   S192939
 #7: 150 doug henderson   S182865
 #8: 150 doug henderson G19539045
 #9: 151       nick tan A47231575
#10: 151       nick tan   A190933
#11: 151       nick tan   C181859
#12: 152         madisp F47431576
#13: 153       woodbusy B47231577
#14: 153       woodbusy   D193936
#15: 153       woodbusy   Q184862
#16: 154    kevinhcross Y47331579
#17: 155          cylol A47531580
#18: 155          cylol   Z195944
#19: 155          cylol   B185870
#20: 156    andrewarrow N47731581
#21: 157       gstavrev E47231582
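
If adding a package is not an option, the same reshaping can be sketched in base R by repeating each row index once per element of its "logs" vector. This is an alternative to listCol_l, not what the reply above uses. It assumes R 3.2.0 or later for lengths(); the names df, idx and long are illustrative, and the toy data is copied from the first table in the question.

```r
# Toy data from the question's first table; `df` is an assumed name.
df <- data.frame(
  id   = c(84L, 12L, 96L, 34L, 65L),
  name = c("zibaroo", "fabien kelyarsky", "mitra lutsko",
           "PaulSandoz", "BeamVision"),
  stringsAsFactors = FALSE
)
# Assigning a list of length nrow(df) creates a list column.
df$logs <- list(
  "C47931038",
  c("C47331040", "B19412225", "B18511449"),
  c("F19712226", "A18311450"),
  "A47431044",
  "D47531045"
)

# Repeat each row index as many times as its `logs` vector has elements.
idx <- rep(seq_len(nrow(df)), lengths(df$logs))

# Expand the id/name columns, then attach the flattened logs.
long <- df[idx, c("id", "name")]
long$log <- unlist(df$logs)
rownames(long) <- NULL  # drop the "2.1", "2.2", ... row names left by indexing

long
```

No intermediate wide columns are created, so the number of logs per id can vary freely. On very large data the package-based approach may still be faster, which is worth benchmarking on the real dataset.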
