Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - dplyr broadcasting single value per group in mutate

I am trying to do something very similar to the question "Scale relative to a value in each group (via dplyr)"; however, the solution given there seems to crash R for me. I would like to take a single value from each group and add a new column in which that value is repeated for every row of the group. As an example, I have

library(dplyr)

data = expand.grid(
  category = LETTERS[1:2],
  year = 2000:2003)
data$value = runif(nrow(data))

data

  category year     value
1        A 2000 0.6278798
2        B 2000 0.6112281
3        A 2001 0.2170495
4        B 2001 0.6454874
5        A 2002 0.9234604
6        B 2002 0.9311204
7        A 2003 0.5387899
8        B 2003 0.5573527

And I would like a data frame like

data

  category year     value    value2
1        A 2000 0.6278798 0.6278798
2        B 2000 0.6112281 0.6112281
3        A 2001 0.2170495 0.6278798
4        B 2001 0.6454874 0.6112281
5        A 2002 0.9234604 0.6278798
6        B 2002 0.9311204 0.6112281
7        A 2003 0.5387899 0.6278798
8        B 2003 0.5573527 0.6112281

i.e. value2 for each category is that category's value from year 2000. I was trying to think of a general solution that extends to any given filtering criterion, i.e. something like

data %>% group_by(category) %>% mutate(value = filter(data, year==2002))

However, this does not work because the assigned value has the wrong length.


1 Reply


Do this:

data %>% group_by(category) %>%
  mutate(value2 = value[year == 2000])

You could also do it this way:

data %>% group_by(category) %>%
  arrange(year) %>%
  mutate(value2 = value[1])

or

data %>% group_by(category) %>%
  arrange(year) %>%
  mutate(value2 = first(value))

or

data %>% group_by(category) %>%
  mutate(value2 = nth(value, n = 1, order_by = year))

or probably several other ways.
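
One thing to watch with value[year == 2000]: if a group has no row matching the criterion, the subset has length zero and mutate() cannot recycle it. The following is a minimal sketch of one way to get NA in that case, assuming a small made-up data frame (df and its values are illustrative and not taken from the question). Indexing the subset with [1] returns NA when the subset is empty.

```r
library(dplyr)

# Illustrative data (an assumption): category B has no row for year 2000
df <- data.frame(
  category = c("A", "A", "B", "B"),
  year     = c(2000, 2001, 2001, 2002),
  value    = c(10, 20, 30, 40)
)

result <- df %>%
  group_by(category) %>%
  # value[year == 2000] has length 0 in group B; [1] turns that into NA
  mutate(value2 = value[year == 2000][1]) %>%
  ungroup()

result
```

The same [1] also picks the first match if a group happens to contain several matching rows, which may or may not be what you want.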

Your attempt with mutate(value = filter(data, year==2002)) fails for two reasons.

  1. When you explicitly pass in data again, it's not part of the chain that got grouped earlier, so it doesn't know about the grouping.

  2. All dplyr verbs take a data frame as first argument and return a data frame, including filter. When you do value = filter(...) you're trying to assign a full data frame to the single column value.
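
Because filter() returns a data frame, one way to still use it for this task is to build a small lookup table and join it back on the grouping column. This is only a sketch of that alternative, not something from the original answer; the names df and reference, and the deterministic values, are assumptions made for illustration.

```r
library(dplyr)

# Same layout as the question's data, with deterministic values (an assumption)
df <- expand.grid(category = LETTERS[1:2], year = 2000:2003)
df$value <- seq_len(nrow(df))

# Lookup table: one row per category, holding the value that matches the criterion
reference <- df %>%
  filter(year == 2000) %>%
  select(category, value2 = value)

# Join the lookup back so every row of a category carries its reference value
result <- df %>%
  left_join(reference, by = "category")

result
```

Changing the criterion only means changing the filter() call. Categories with no matching row get NA in value2 after the left join.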

