I would like to write a function that updates one data frame based on another data frame. When an id is in updated_data, I would like to update the column product in created_data. If the id is not in updated_data, I would like to keep the existing value of product from created_data. This is just a made-up example; in reality I need to update multiple columns, not only product, which is why I pass the column name as an argument to my function. However, with this function approach I am struggling to access the columns.
library(dplyr)  # needed for %>%, left_join(), mutate(), rename() and select()

# some made-up data
created_data <- data.frame(id = c("ab01", "ab02", "ab03", "ab04", "ab05", "ab06", "ab07",
                                  "ab08", "ab09", "ab10", "ab11", "ab12", "ab13", "ab14"),
                           rank = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14),
                           colour = c("blue", "blue", "red", "purple", "yellow", "black",
                                      "green", "magenta", "black", "orange", "white",
                                      "orange", "lightblue", "magenta"),
                           product = c("shoes", "socks", "socks", "shirt", "jacket",
                                       "shoes", "socks", "socks", "shirt", "jacket",
                                       "shoes", "socks", "socks", "shirt"),
                           candy = c("mars", "twix", "kitkat", "bounty", "mars",
                                     "cookie", "cookie", "mars", "twix", "bounty",
                                     "twix", "twix", "twix", "twix"))
# some update data
updated_data <- data.frame(id = c("ab03", "ab07", "ab08"),
                           product = c("shirt", "trousers", "trousers"))
# one possible solution to solve the task without using a function
created_data$id <- as.character(created_data$id)
updated_data$id <- as.character(updated_data$id)
updated_data1 <- updated_data %>%
  rename(product_new = product)
results_without_function <- created_data %>%
  left_join(updated_data1, by = "id") %>%
  mutate(product = ifelse(is.na(product_new), product, product_new)) %>%
  select(-product_new)
# one trial for my function
update_fun <- function(orig_df, upd_df, column_to_update){
  orig_df1 <- orig_df %>%
    mutate(column_to_update = ifelse(id %in% upd_df$id, upd_df$column_to_update, column_to_update))
  return(orig_df1)
}
# another trial
update_fun <- function(orig_df, upd_df, column_to_update){
  orig_df1 <- orig_df %>%
    mutate(!! column_to_update := ifelse(id %in% upd_df$id, upd_df$column_to_update, !! column_to_update))
  return(orig_df1)
}
# results
result <- update_fun(orig_df = created_data, upd_df = updated_data, column_to_update = "product")
EDIT:
Sorry for not being explicit enough. I know how to solve this problem without a function; the code above has been adapted accordingly. My question is how to translate this solution into a function where created_data and updated_data, as well as the id and product columns, are handled as input parameters.
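
To make the goal concrete, here is a minimal sketch of what such a function could look like. It is only one possible approach, not a definitive solution. It assumes dplyr 1.0 or later (for rename_with() and all_of()), that the columns to update are character vectors rather than factors, and that each id occurs at most once in the update data. The function name update_columns, its argument names, and the small data frames created_small and updated_small are hypothetical and chosen for illustration only.

```r
library(dplyr)

# Hypothetical helper (name and arguments are illustrative assumptions).
# orig_df:        data frame to be updated
# upd_df:         data frame holding the new values
# id_col:         name of the key column, as a string
# cols_to_update: character vector with the names of the columns to update
update_columns <- function(orig_df, upd_df, id_col, cols_to_update) {
  # Keep only the key and the columns to update, suffixing the latter with "_new"
  upd_small <- upd_df %>%
    select(all_of(c(id_col, cols_to_update))) %>%
    rename_with(~ paste0(.x, "_new"), all_of(cols_to_update))

  # Rows without a match in upd_small get NA in the "_new" columns
  joined <- left_join(orig_df, upd_small, by = id_col)

  for (col in cols_to_update) {
    new_col <- paste0(col, "_new")
    joined <- joined %>%
      # Take the new value where it exists, otherwise keep the old one
      mutate(!!col := coalesce(.data[[new_col]], .data[[col]])) %>%
      select(-all_of(new_col))
  }
  joined
}

# Trimmed-down illustrative data (not the full example from above)
created_small <- data.frame(id = c("ab01", "ab02", "ab03"),
                            product = c("shoes", "socks", "socks"),
                            stringsAsFactors = FALSE)
updated_small <- data.frame(id = "ab03", product = "shirt",
                            stringsAsFactors = FALSE)

result <- update_columns(created_small, updated_small,
                         id_col = "id", cols_to_update = "product")
print(result)
```

Column names are passed as strings here, so they are accessed through the .data pronoun and all_of() instead of bare names. Note that an NA in the update data is treated as "no update", because coalesce() falls back to the original value.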
question from:
https://stackoverflow.com/questions/65942015/r-function-with-multiple-input-data-frames-and-colnames-as-function-arguments-fo