I am trying to follow the tutorial by Datanovia for Two-way repeated measures ANOVA.
A quick overview of my dataset:
I have measured the number of different bacterial species in 12 samplingsunits over time. I have 16 time points and 2 groups. I have organised my data as a tibble called "richness";
# A tibble: 190 x 4
id selection.group Day value
<fct> <fct> <fct> <dbl>
1 KRH1 KR 2 111.
2 KRH2 KR 2 141.
3 KRH3 KR 2 110.
4 KRH1 KR 4 126
5 KRH2 KR 4 144
6 KRH3 KR 4 135.
7 KRH1 KR 6 115.
8 KRH2 KR 6 113.
9 KRH3 KR 6 107.
10 KRH1 KR 8 119.
The id refers to each sampling unit, and the selection group is of two factors (KR and RK).
richness <- tibble(
id = factor(c("KRH1", "KRH3", "KRH2", "RKH2", "RKH1", "RKH3")),
selection.group = factor(c("KR", "KR", "KR", "RK", "RK", "RK")),
Day = factor(c(2,2,4,2,4,4)),
value = c(111, 110, 144, 92, 85, 69)) # subset of original data
My tibble appears to be in an identical format as the one in the tutorial;
> str(selfesteem2)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 72 obs. of 4 variables:
$ id : Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
$ treatment: Factor w/ 2 levels "ctr","Diet": 1 1 1 1 1 1 1 1 1 1 ...
$ time : Factor w/ 3 levels "t1","t2","t3": 1 1 1 1 1 1 1 1 1 1 ...
$ score : num 83 97 93 92 77 72 92 92 95 92 ..
Before I can run the repeated measures ANOVA I must check for normality in my data. I copied the framework proposed in the tutorial.
#my code
richness %>%
group_by(selection.group, Day) %>%
shapiro_test(value)
#tutorial code
selfesteem2 %>%
group_by(treatment, time) %>%
shapiro_test(score)
But get the error message "Error: Column variable
is unknown" when I try to run the code. Does anyone know why this happens?
I tried to continue without insurance that my data is normally distributed and tried to run the ANOVA
res.aov <- rstatix::anova_test(
data = richness, dv = value, wid = id,
within = c(selection.group, Day)
)
But get this error message; Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
0 (non-NA) cases
I have checked for NA values with any(is.na(richness))
which returns FALSE. I have also checked table(richness$selection.group, richness$Day)
to be sure my setup is correct
2 4 6 8 12 16 20 24 28 29 30 32 36 40 44 50
KR 6 6 6 6 6 6 6 6 6 6 6 5 6 6 6 6
RK 6 6 6 6 6 5 6 6 6 6 6 6 6 6 6 6
And the setup appears correct. I would be very grateful for tips on solving this.
Best regards Madeleine
Below is a subset of my dataset in a reproducible format:
library(tidyverse)
library(rstatix)
library(tibble)
richness_subset = data.frame(
id = c("KRH1", "KRH3", "KRH2", "RKH2", "RKH1", "RKH3"),
selection.group = c("KR", "KR", "KR", "RK", "RK", "RK"),
Day = c(2,2,4,2,4,4),
value = c(111, 110, 144, 92, 85, 69))
richness_subset$Day = factor(richness$Day)
richness_subset$selection.group = factor(richness$selection.group)
richness_subset$id = factor(richness$id)
richness_subset = tibble::as_tibble(richness_subset)
richness_subset %>%
group_by(selection.group, Day) %>%
shapiro_test(value)
# gives Error: Column `variable` is unknown
res.aov <- rstatix::anova_test(
data = richness, dv = value, wid = id,
within = c(selection.group, Day)
)
# gives Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
# 0 (non-NA) cases
See Question&Answers more detail:
os