Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
224 views
in Technique[技术] by (71.8m points)

r - How to number/label data-table by group-number from group_by?

I have a tbl_df where I want to group_by(u, v) for each distinct integer combination observed with (u, v).


EDIT: this was subsequently resolved by adding the (now-deprecated) group_indices() back in dplyr 0.4.0


a) I then want to assign each distinct group some arbitrary distinct number label=1,2,3... e.g. the combination (u,v)==(2,3) could get label 1, (1,3) could get 2, and so on. How to do this with one mutate(), without a three-step summarize-and-self-join?

dplyr has a neat function n(), but that gives the number of elements within its group, not the overall number of the group. In data.table this would simply be called .GRP.

b) Actually what I really want to assign a string/character label ('A','B',...). But numbering groups by integers is good-enough, because I can then use integer_to_label(i) as below. Unless there's a clever way to merge these two? But don't sweat this part.

set.seed(1234)

# Helper fn for mapping integer 1..26 to character label
integer_to_label <- function(i) { substr("ABCDEFGHIJKLMNOPQRSTUVWXYZ",i,i) }

df <- tibble::as_tibble(data.frame(u=sample.int(3,10,replace=T), v=sample.int(4,10,replace=T)))

# Want to label/number each distinct group of unique (u,v) combinations
df %>% group_by(u,v) %>% mutate(label = n()) # WRONG: n() is number of element within its group, not overall number of group

   u v
1  2 3
2  1 3
3  1 2
4  2 3
5  1 2
6  3 3
7  1 3
8  1 2
9  3 1
10 3 4

KLUDGE 1: could do df %>% group_by(u,v) %>% summarize(label = n()) , then self-join
Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

dplyr has a group_indices() function that you can use like this:

df %>% 
    mutate(label = group_indices(., u, v)) %>% 
    group_by(label) ...

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...