Here is something I hope will get you closer to what you need.

First, group by `i`. Then create a column that holds a 1 for each change in `region`, by comparing the current value of `region` with the previous one (using `lag`). Note that if the previous value is `NA` (as it is for the first row of a given `i`), it is treated as no change.

The same approach is taken for `urban`. Finally, summarize by totaling all the changes for each `i`. I left in these temporary variables so you can check whether you are getting the desired results.
Edit: If you wish to remove rows that have `NA` for `region` or `urban`, you can add `drop_na` first.
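
    library(dplyr)
    library(tidyr)

    df_tot <- df %>%
      drop_na(region, urban) %>%   # optional: drop rows with a missing region or urban
      group_by(i) %>%
      # flag a change as 1; a missing previous value (first row per i) counts as 0
      mutate(reg_change = ifelse(region == lag(region) | is.na(lag(region)), 0, 1),
             urban_change = ifelse(urban == lag(urban) | is.na(lag(urban)), 0, 1)) %>%
      # total the flags for each i
      summarize(tot_region = sum(reg_change),
                tot_urban = sum(urban_change))
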
    # A tibble: 3 x 3
          i tot_region tot_urban
      <int>      <dbl>     <dbl>
    1     1          1         0
    2     4          3         0
    3    45          2         2

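To see what the `lag` comparison does within a single group, here is a minimal sketch on a made-up vector. The values in `region` and `region_na` are hypothetical and not taken from the question's data. The second part suggests why dropping `NA` rows first may matter: a missing value in the middle of a group makes the flag itself `NA`.

```r
library(dplyr)

# Hypothetical region values for a single i (illustration only)
region <- c("A", "A", "B", "B", "C")

# dplyr's lag() shifts the vector by one position, so the first element
# has no predecessor and becomes NA
prev <- lag(region)

# 0 where the value equals its predecessor (or has none), 1 where it differs
reg_change <- ifelse(region == prev | is.na(prev), 0, 1)
reg_change
# 0 0 1 0 1

sum(reg_change)
# 2

# With a missing value in the middle, the comparison yields NA for that row,
# so the total is NA as well unless such rows are dropped beforehand
region_na <- c("A", NA, "B")
na_change <- ifelse(region_na == lag(region_na) | is.na(lag(region_na)), 0, 1)
na_change
# 0 NA 0
```

In the second case the move from "A" to "B" is not counted either, since each of those rows is compared against an `NA` neighbour.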
Edit: Afterwards, to get a grand total for both the `tot_region` and `tot_urban` columns, you can use `colSums`. (Store your earlier result as `df_tot`, as above.)

    colSums(df_tot[-1])

    tot_region  tot_urban 
             6          2 
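If you would rather stay inside a dplyr pipeline, one possible alternative (a sketch, not part of the approach above) is `summarize` with `across`, which requires dplyr 1.0.0 or later. Here `df_tot` is rebuilt by hand from the per-`i` totals shown earlier so the snippet runs on its own; `grand_tot` is just an illustrative name.

```r
library(dplyr)

# Per-i totals copied from the output shown earlier
df_tot <- data.frame(
  i = c(1L, 4L, 45L),
  tot_region = c(1, 3, 2),
  tot_urban = c(0, 0, 2)
)

# Sum both total columns, giving a one-row data frame
grand_tot <- df_tot %>%
  summarize(across(c(tot_region, tot_urban), sum))

grand_tot
```

Unlike `colSums`, which returns a named numeric vector, this returns a data frame, which may be more convenient if you want to keep piping.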