Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Replace column values with column name using dplyr's transmute_all

The data set contains many columns whose values are either NA or 1, like this:

> data_frame(a = c(NA, 1, NA, 1, 1), b=c(1, NA, 1, 1, NA))
# A tibble: 5 x 2
      a     b
  <dbl> <dbl>
1 NA     1.00
2  1.00 NA   
3 NA     1.00
4  1.00  1.00
5  1.00 NA  

Desired output: replace every 1 with the name of its column, as a string:

> data_frame(a = c(NA, 'a', NA, 'a', 'a'), b=c('b', NA, 'b', 'b', NA))
# A tibble: 5 x 2
  a     b    
  <chr> <chr>
1 <NA>  b    
2 a     <NA> 
3 <NA>  b    
4 a     b    
5 a     <NA> 

Here's my attempt using an anonymous function in transmute_all:

> data_frame(a = c(NA, 1, NA, 1, 1), b=c(1, NA, 1, 1, NA)) %>%
+     transmute_all(
+         funs(function(x){if (x == 1) deparse(substitute(x)) else NA})
+     )
Error in mutate_impl(.data, dots) : 
  Column `a` is of unsupported type function

EDIT: Attempt #2:

> data_frame(a = c(NA, 1, NA, 1, 1), b=c(1, NA, 1, 1, NA)) %>%
+     transmute_all(
+         funs(
+             ((function(x){if (!is.na(x)) deparse(substitute(x)) else NA})(.))
+             )
+     )
# A tibble: 5 x 2
  a     b    
  <lgl> <chr>
1 NA    b    
2 NA    b    
3 NA    b    
4 NA    b    
5 NA    b    
Warning messages:
1: In if (!is.na(x)) deparse(substitute(x)) else NA :
  the condition has length > 1 and only the first element will be used
2: In if (!is.na(x)) deparse(substitute(x)) else NA :
  the condition has length > 1 and only the first element will be used

1 Reply

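One option is purrr's map2_df, which walks over the columns and their names in parallel and replaces each 1 with the column name: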

library(purrr)
map2_df(df1, names(df1), ~  replace(.x, .x==1, .y))
# A tibble: 5 x 2
#  a     b    
# <chr> <chr>
#1 NA    b    
#2 a     NA   
#3 NA    b    
#4 a     b    
#5 a     NA   

Or, as @Moody_Mudskipper commented, use imap_dfr, which passes each column's name as .y automatically:

imap_dfr(df1, ~replace(.x, .x==1, .y))
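For a dplyr-only version, the following is a minimal sketch that assumes dplyr 1.0.0 or later (not the version used in the question), where across() and cur_column() are available. It also avoids the problem in the question's attempts: if () looks only at the first element of a vector, whereas if_else() works element by element.

```r
library(dplyr)

# Same example data as in the question, built with tibble() instead of
# the older data_frame().
df1 <- tibble(a = c(NA, 1, NA, 1, 1), b = c(1, NA, 1, 1, NA))

# For every column, keep NA where the value is NA and put the column's
# own name (via cur_column()) where the value is 1.
out <- df1 %>%
  mutate(across(everything(),
                ~ if_else(.x == 1, cur_column(), NA_character_)))

out
```

Here if_else() returns NA wherever the condition itself is NA, so the missing values carry through as character NAs.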

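In base R, we can index the column names by each cell's column number; col(df1) * (df1 == 1) is that column number where the value is 1 and NA elsewhere: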

df1[] <- names(df1)[col(df1) *(df1 == 1)]
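If the matrix-indexing one-liner is hard to read, a column-by-column base R variant may be easier to follow. This is only a sketch: it assumes a plain data.frame rather than a tibble, and the name df_base is made up for illustration.

```r
# Example data as a plain data.frame (assumption: base R only, no tibble).
df_base <- data.frame(a = c(NA, 1, NA, 1, 1), b = c(1, NA, 1, 1, NA))

# Map() pairs each column with its name; which() drops the NA positions,
# so only the cells equal to 1 are overwritten with the column name.
# Assigning a string coerces the column to character.
df_base[] <- Map(function(x, nm) replace(x, which(x == 1), nm),
                 df_base, names(df_base))

df_base
```

Assigning into df_base[] keeps the data.frame structure and column names while swapping in the new columns.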

Data used in the examples above:

df1 <-  data_frame(a = c(NA, 1, NA, 1, 1), b=c(1, NA, 1, 1, NA))
