Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Avoiding type conflicts with dplyr::case_when

I am trying to use dplyr::case_when within dplyr::mutate to create a new variable where I set some values to missing and recode other values simultaneously.

However, if I try to set values to NA, I get an error saying that the variable new cannot be created because NA is logical:

Error in mutate_impl(.data, dots) :
Evaluation error: must be type double, not logical.

Is there a way to set values to NA in a non-logical vector in a data frame using case_when?

library(dplyr)

# Create data
df <- data.frame(old = 1:3)

# Create new variable
df <- df %>% dplyr::mutate(new = dplyr::case_when(old == 1 ~ 5,
                                                  old == 2 ~ NA,
                                                  TRUE ~ old))

# Desired output
c(5, NA, 3)


1 Reply


As stated in ?case_when:

All RHSs must evaluate to the same type of vector.
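To see why the original call trips over this rule, it can help to inspect the types involved. The following is a minimal base R sketch; the names na_types and rhs_types are only illustrative and do not come from the question.

```r
# Base R only: each NA constant has its own type.
na_types <- c(
  logical   = typeof(NA),
  integer   = typeof(NA_integer_),
  double    = typeof(NA_real_),
  character = typeof(NA_character_)
)
print(na_types)

# The three right-hand sides in the question do not share one type:
# 5 is a double, a bare NA is logical, and old (built from 1:3) is integer.
rhs_types <- c(typeof(5), typeof(NA), typeof(1:3))
print(rhs_types)
```

A bare NA is logical, which is what the error message is complaining about.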

You have two options:

1) Create new as a numeric vector

df <- df %>% mutate(new = case_when(old == 1 ~ 5,
                                    old == 2 ~ NA_real_,
                                    TRUE ~ as.numeric(old)))

Note that NA_real_ is the numeric (double) version of NA, and that you must convert old to numeric because 1:3 creates it as an integer in your original data frame.

You get:

str(df)
# 'data.frame': 3 obs. of  2 variables:
# $ old: int  1 2 3
# $ new: num  5 NA 3
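As a quick check, the numeric variant can be compared against the desired output from the question. This is a small sketch that assumes dplyr is installed; result_dbl is a name chosen here for illustration.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(old = 1:3)

# Numeric (double) variant: every right-hand side is a double
result_dbl <- df %>%
  mutate(new = case_when(old == 1 ~ 5,
                         old == 2 ~ NA_real_,
                         TRUE ~ as.numeric(old)))

# new is a double vector and matches the desired output c(5, NA, 3)
stopifnot(is.double(result_dbl$new))
stopifnot(identical(result_dbl$new, c(5, NA, 3)))
```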

2) Create new as an integer vector

df <- df %>% mutate(new = case_when(old == 1 ~ 5L,
                                    old == 2 ~ NA_integer_,
                                    TRUE ~ old))

Here, 5L forces 5 into the integer type, and NA_integer_ is the integer version of NA.

So this time new is integer:

str(df)
# 'data.frame': 3 obs. of  2 variables:
# $ old: int  1 2 3
# $ new: int  5 NA 3
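The integer variant can be checked the same way. The sketch below again assumes dplyr is installed, and result_int is an illustrative name. It also shows that the two options hold the same values and differ only in storage type.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(old = 1:3)

# Integer variant: every right-hand side is an integer
result_int <- df %>%
  mutate(new = case_when(old == 1 ~ 5L,
                         old == 2 ~ NA_integer_,
                         TRUE ~ old))

# new is an integer vector holding 5, NA, 3
stopifnot(is.integer(result_int$new))
stopifnot(identical(result_int$new, c(5L, NA, 3L)))

# Converted to double, it equals the desired output from the question
stopifnot(identical(as.numeric(result_int$new), c(5, NA, 3)))
```

Which option to pick mostly depends on whether later code expects integers or doubles.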
