Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others



r - How can I bin/bucket a data.frame by birthyears

I have a data.frame with three columns: a token, a year of birth, and a number of contacts. The birth years range from 1934 to 2020. Instead of individual years, I want 5-year groups such as 2000-2005, 2006-2010, and so on, so that I can later visualize the contact count per age group.

I already found the cut function, used like this:

# set up cut-off values 
breaks <- c(0,2,4,6,8,10,12,14,16,18,20)
# specify interval/bin labels
tags <- c("[0-2)","[2-4)", "[4-6)", "[6-8)", "[8-10)", "[10-12)","[12-14)", "[14-16)","[16-18)", "[18-20)")
# bucketing values into bins
group_tags <- cut(v$MeanEducation, 
                  breaks=breaks, 
                  include.lowest=TRUE, 
                  right=FALSE, 
                  labels=tags)

However, in this example I'd have to set up the vectors of breaks and labels manually.

Is there a way to automate this? For example, the first bucket should begin at the nearest year divisible by 5 at or below the minimum in my data frame, and the last bucket should end analogously above the maximum.

Thanks in advance!



1 Reply


You don't have to define the tags manually: the cut function generates interval labels such as [0,2) by default. If you want custom labels, you can use seq to create the sequence of breaks and paste to generate the labels programmatically.

#Generate data
set.seed(123)
x <- sample(10)
x
#[1]  3 10  2  8  6  9  1  7  5  4
#Create breaks
breaks <- seq(0, 10, 2)
#Create labels
labels <- paste(head(breaks, -1), tail(breaks, -1), sep = '-')

#Without labels
cut(x, breaks)

#[1] (2,4]  (8,10] (0,2]  (6,8]  (4,6]  (8,10] (0,2]  (6,8]  (4,6]  (2,4] 
#Levels: (0,2] (2,4] (4,6] (6,8] (8,10]

#With labels
cut(x, breaks, labels = labels)
#[1] 2-4  8-10 0-2  6-8  4-6  8-10 0-2  6-8  4-6  2-4 
#Levels: 0-2 2-4 4-6 6-8 8-10
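
Applied to the birth-year case from the question, the same idea could look roughly like the sketch below. The data frame and its column names (token, birthyear, contacts) are made up for illustration, since the question does not show the real data. The lower break is rounded down to a multiple of 5, and the upper break is pushed to the next multiple of 5 above the maximum so that the largest year always falls inside the last bin. With right = FALSE each bin is [start, start + 5), so the labels are written as start to start + 4.

```r
# Hypothetical example data; the column names are assumptions, not from the question
set.seed(42)
df <- data.frame(
  token     = sprintf("t%03d", 1:20),
  birthyear = sample(1934:2020, 20, replace = TRUE),
  contacts  = sample(0:50, 20, replace = TRUE)
)

width <- 5

# Round the minimum down to a multiple of 5
lower <- floor(min(df$birthyear) / width) * width
# Take the next multiple of 5 strictly above the maximum,
# so the largest year lies inside the last half-open bin
upper <- (floor(max(df$birthyear) / width) + 1) * width

breaks <- seq(lower, upper, by = width)

# Bins are [start, start + 5), so label them "start-(start + 4)"
labels <- paste(head(breaks, -1), tail(breaks, -1) - 1, sep = "-")

df$birth_group <- cut(df$birthyear,
                      breaks = breaks,
                      labels = labels,
                      right  = FALSE)

# Contact count per birth-year group (groups without rows are dropped)
totals <- aggregate(contacts ~ birth_group, data = df, FUN = sum)
totals
```

If closed-right bins such as (2000,2005] are preferred, drop right = FALSE and derive lower and upper the other way around (strictly below the minimum, at or above the maximum); the labels would then need to be adjusted to match.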
