Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
261 views
in Technique[技术] by (71.8m points)

r - Difference between `names(df[1]) <- ` and `names(df)[1] <- `

Consider the following:

df <- data.frame(a = 1, b = 2, c = 3)
names(df[1]) <- "d" ## First method
##  a b c
##1 1 2 3

names(df)[1] <- "d" ## Second method
##  d b c
##1 1 2 3

Both methods didn't return an error, but the first didn't change the column name, while the second did.

I thought it has something to do with the fact that I'm operating only on a subset of df, but why, for example, the following works fine then?

df[1] <- 2 
##  a b c
##1 2 2 3
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

What I think is happening is that replacement into a data frame ignores the attributes of the data frame that is drawn from. I am not 100% sure of this, but the following experiments appear to back it up:

df <- data.frame(a = 1:3, b = 5:7)
#   a b
# 1 1 5
# 2 2 6
# 3 3 7

df2 <- data.frame(c = 10:12)
#    c
# 1 10
# 2 11
# 3 12

df[1] <- df2[1]   # in this case `df[1] <- df2` is equivalent

Which produces:

#    a b
# 1 10 5
# 2 11 6
# 3 12 7

Notice how the values changed for df, but not the names. Basically the replacement operator `[<-` only replaces the values. This is why the name was not updated. I believe this explains all the issues.

In the scenario:

names(df[2]) <- "x"

You can think of the assignment as follows (this is a simplification, see end of post for more detail):

tmp <- df[2]
#   b
# 1 5
# 2 6
# 3 7

names(tmp) <- "x"
#   x
# 1 5
# 2 6
# 3 7

df[2] <- tmp   # `tmp` has "x" for names, but it is ignored!
#    a b
# 1 10 5
# 2 11 6
# 3 12 7

The last step of which is an assignment with `[<-`, which doesn't respect the names attribute of the RHS.

But in the scenario:

names(df)[2] <- "x"

you can think of the assignment as (again, a simplification):

tmp <- names(df)
# [1] "a" "b"

tmp[2] <- "x"
# [1] "a" "x"

names(df) <- tmp
#    a x
# 1 10 5
# 2 11 6
# 3 12 7

Notice how we directly assign to names, instead of assigning to df which ignores attributes.

df[2] <- 2

works because we are assigning directly to the values, not the attributes, so there are no problems here.


EDIT: based on some commentary from @AriB.Friedman, here is a more elaborate version of what I think is going on (note I'm omitting the S3 dispatch to `[.data.frame`, etc., for clarity):

Version 1 names(df[2]) <- "x" translates to:

df <- `[<-`(
  df, 2, 
  value=`names<-`(   # `names<-` here returns a re-named one column data frame
    `[`(df, 2),       
    value="x"
) ) 

Version 2 names(df)[2] <- "x" translates to:

df <- `names<-`(
  df,
  `[<-`(
     names(df), 2, "x"
) )

Also, turns out this is "documented" in R Inferno Section 8.2.34 (Thanks @Frank):

right <- wrong <- c(a=1, b=2)
names(wrong[1]) <- 'changed'
wrong
# a b
# 1 2
names(right)[1] <- 'changed'
right
# changed b
# 1 2

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...