Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
399 views
in Technique[技术] by (71.8m points)

r - := (pass by reference) operator in the data.table package modifies another data table object simultaneously

While testing my code, I found out the following: If I assign a data.table DT1 to DT and change DT afterwards, DT1 changes with it. So DT and DT1 seem to be internally linked. Is this intended behavior? Although I'm not a programming expert, this looks wrong to me, and testing it with simple R variables or a data.frame, I couldn't reproduce the behavior. What's happening here?

DF <- data.frame(ID=letters[1:5],
                  value=1:5)
DF1 <- DF
all.equal(DF1, DF)
[1] TRUE
DF[1, "value"] <- DF[1, "value"]*2
all.equal(DF1, DF)
[1] "Component 2: Mean relative difference: 1"

library(data.table)
data.table 1.7.1  For help type: help("data.table")
DT <- data.table(ID=letters[1:5],
                  value=1:5)
DT1 <- DT
all.equal(DT1, DT)
[1] TRUE
DT[, value:=value*2]
     ID value
[1,]  a     2
[2,]  b     4
[3,]  c     6
[4,]  d     8
[5,]  e    10
all.equal(DT1, DT)
[1] TRUE
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This piece of documentation in data.table would help. ? data.table::copy

No value is returned. The data.table is modified by reference. If you require a copy, take a copy first (using DT2=copy(DT)). copy() may also sometimes be useful before := is used to subassign to a column by reference.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...