Often you can find suitable data for your reproducible example by looking at what comes with R (data()
will show a list of data sets and brief descriptions). For example, the iris
data set is similar to yours except that the species name is the last column:
data(iris)
iris <- iris[, c(5, 1:4)]
iris.splt <- split(iris[, 2:5], iris[, 1])
Now we have loaded the data, moved the last column to the first position, and split the dataset by species into 3 data frames that are stored in a single list called iris.splt.
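A quick way to confirm what the split produced is to look at the names and dimensions of the list parts. This is only a sanity check; the first three lines repeat the steps above so the snippet runs on its own:

```r
data(iris)
iris <- iris[, c(5, 1:4)]
iris.splt <- split(iris[, 2:5], iris[, 1])

names(iris.splt)        # one list part per species
sapply(iris.splt, dim)  # rows and columns of each part
str(iris.splt$setosa)   # only the four measurement columns remain
```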
The species name is the name of each part of the list and only the data are stored in the data frame for that list part. Now you need to write a function that computes the statistics you need. Here is an example based on the picture you provided, but you will probably need to change it:
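stats <- function(x) {
  # quantiles (0%, 25%, 50%, 75%, 100%) as a one-column matrix
  quant <- as.matrix(quantile(x, na.rm = TRUE))
  # the variable names below become the row names in the rbind() result
  mean <- mean(x, na.rm = TRUE)
  sd <- sd(x, na.rm = TRUE)
  var <- var(x, na.rm = TRUE)
  rbind(quant, mean, sd, var)
}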
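Before applying the function everywhere, it may help to try it on one column. The sketch below repeats the function so it runs on its own; the choice of the setosa sepal lengths and the names setosa.sl and res are just for illustration:

```r
stats <- function(x) {
  quant <- as.matrix(quantile(x, na.rm = TRUE))
  mean <- mean(x, na.rm = TRUE)
  sd <- sd(x, na.rm = TRUE)
  var <- var(x, na.rm = TRUE)
  rbind(quant, mean, sd, var)
}

data(iris)
# one numeric vector: sepal length for a single species
setosa.sl <- iris$Sepal.Length[iris$Species == "setosa"]
res <- stats(setosa.sl)
dim(res)       # 8 rows (five quantiles, mean, sd, var) by 1 column
rownames(res)  # the labels that later become the row names of each table
```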
The stats function computes the statistics for a single column. We need to run it on each column of each part of the list, so we use lapply twice (one call nested inside the other) and then a third time, with data.frame, to combine the columns back together:
iris.stats <- lapply(iris.splt, function(x) lapply(x, stats))
iris.dfs <- lapply(iris.stats, data.frame)
iris.dfs
# $setosa
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 0% 4.3000 2.3000 1.00000 0.10000
# 25% 4.8000 3.2000 1.40000 0.20000
# 50% 5.0000 3.4000 1.50000 0.20000
# 75% 5.2000 3.6750 1.57500 0.30000
# 100% 5.8000 4.4000 1.90000 0.60000
# mean 5.0060 3.4280 1.46200 0.24600
# sd 0.3525 0.3791 0.17366 0.10539
# var 0.1242 0.1437 0.03016 0.01111
#
# $versicolor
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 0% 4.9000 2.00000 3.0000 1.00000
# 25% 5.6000 2.52500 4.0000 1.20000
# 50% 5.9000 2.80000 4.3500 1.30000
# 75% 6.3000 3.00000 4.6000 1.50000
# 100% 7.0000 3.40000 5.1000 1.80000
# mean 5.9360 2.77000 4.2600 1.32600
# sd 0.5162 0.31380 0.4699 0.19775
# var 0.2664 0.09847 0.2208 0.03911
#
# $virginica
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 0% 4.9000 2.2000 4.5000 1.40000
# 25% 6.2250 2.8000 5.1000 1.80000
# 50% 6.5000 3.0000 5.5500 2.00000
# 75% 6.9000 3.1750 5.8750 2.30000
# 100% 7.9000 3.8000 6.9000 2.50000
# mean 6.5880 2.9740 5.5520 2.02600
# sd 0.6359 0.3225 0.5519 0.27465
# var 0.4043 0.1040 0.3046 0.07543
You will have to decide how you want to use this list or if you want to combine it back into a single data frame, but this should get you started.
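If you do want a single data frame, one possible approach is to add the species and the statistic name as ordinary columns and stack the three tables with rbind. This is only a sketch: the column name Statistic and the object name iris.long are my own choices, and the earlier steps are repeated so the snippet runs on its own.

```r
data(iris)
iris.splt <- split(iris[, 1:4], iris$Species)

stats <- function(x) {
  quant <- as.matrix(quantile(x, na.rm = TRUE))
  mean <- mean(x, na.rm = TRUE)
  sd <- sd(x, na.rm = TRUE)
  var <- var(x, na.rm = TRUE)
  rbind(quant, mean, sd, var)
}

iris.stats <- lapply(iris.splt, function(x) lapply(x, stats))
iris.dfs <- lapply(iris.stats, data.frame)

# Turn the list name and the row names into columns, then stack the parts
iris.long <- do.call(rbind, lapply(names(iris.dfs), function(sp) {
  data.frame(Species = sp,
             Statistic = rownames(iris.dfs[[sp]]),
             iris.dfs[[sp]])
}))
rownames(iris.long) <- NULL  # drop the now-redundant row names
head(iris.long, 3)
```

The result has one row per species and statistic (24 rows), which is usually easier to filter or export than a list of tables.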