I have a contingency table of counts, and I want to extend it with corresponding proportions of each group.
Some sample data (the tips data set from the ggplot2 package):
library(ggplot2)
head(tips, 3)
# total_bill tip sex smoker day time size
# 1 17 1.0 Female No Sun Dinner 2
# 2 10 1.7 Male No Sun Dinner 3
# 3 21 3.5 Male No Sun Dinner 3
First, use table to count smokers vs. non-smokers, and nrow to count the total number of subjects:
table(tips$smoker)
# No Yes
# 151 93
nrow(tips)
# [1] 244
Then I want to calculate the percentage of smokers vs. non-smokers. Something like this (ugly code):
# percentage of smokers
options(digits = 2)
transform(as.data.frame(table(tips$smoker)), percentage_column = Freq / nrow(tips) * 100)
# Var1 Freq percentage_column
# 1 No 151 62
# 2 Yes 93 38
Is there a better way to do this?
Even better would be to do this for a set of columns that I enumerate (e.g., smoker, day, and time) and have the output nicely formatted.
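To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, using only base R (table and prop.table). The helper name count_with_pct and the toy_data frame are made up for illustration. The smoker vector is rebuilt from the counts shown above rather than read from the real tips data, so treat this as a sketch under those assumptions, not a finished solution.

```r
# Hypothetical helper: counts plus percentages for a single vector.
count_with_pct <- function(x, digits = 1) {
  counts <- table(x)
  data.frame(
    level      = names(counts),
    n          = as.vector(counts),
    # prop.table() divides each count by the total of the table
    percentage = round(as.vector(prop.table(counts)) * 100, digits)
  )
}

# Stand-in for tips$smoker, rebuilt from the counts above (151 No, 93 Yes).
smoker <- rep(c("No", "Yes"), times = c(151, 93))
smoker_summary <- count_with_pct(smoker)
print(smoker_summary)

# Several enumerated columns at once.
# toy_data is invented illustration data, not the real tips values.
toy_data <- data.frame(
  smoker = c("No", "No", "Yes", "No"),
  time   = c("Dinner", "Lunch", "Dinner", "Dinner")
)
cols <- c("smoker", "time")
summaries <- lapply(toy_data[cols], count_with_pct)
print(summaries)
```

With the real data, the last step would presumably become lapply(tips[c("smoker", "day", "time")], count_with_pct), which returns one small data frame per column.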