Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
91 views
in Technique[技术] by (71.8m points)

Operator "[<-" in RStudio and R

By accident i've encountered strange behaviour of "[<-" operator. It behaves differently depending on order of calls and whether i'm using RStudio or just ordinary RGui. I will make myself clear with an example.

x <- 1:10
"[<-"(x, 1, 111)
x[5] <- 123

As far as i know, first assigment shouldn't change x (or maybe i'm wrong?), while the second should do. And in fact the result of above operations is

x
[1]  1  2  3  4  123  6  7  8  9 10

However, when we perform these operations in different order, results are different and x has changed! Meaningly:

x <- 1:10
x[5] <- 123
"[<-"(x, 1, 111)
x
[1] 111   2   3   4   123   6   7   8   9  10

But it only happens when i'm using plain R! In RStudio the behaviour is the same in both options. I've checked it on two machines (one with Fedora one with Win7) and the situation looks exactly the same. I know the 'functional' version ("[<-"(x..)) is probably never used but i'm very curious why it is happening. Could anyone explain that?

==========================

EDIT: Ok, so from comments i get that the reason was that x <- 1:10 has type 'integer' and after replacing x[5] <- 123 it's 'double'. But still remains question why behaviour is different in RStudio? I restart R session and it doesn't change anything.

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Rstudio's behavior

Rstudio's object browser modifies objects it examines in a way that forces copying upon modification. Specifically, the object browser employs at least one R function whose call internally forces evaluation of the object, in the process resetting the value of the object's named field from 1 to 2. From the R-Internals manual:

When an object is about to be altered, the named field is consulted. A value of 2 means that the object must be duplicated before being changed. [...] A value of 1 is used for situations [...] where in principle two copies of a exist for the duration of the computation [...] but for no longer, and so some primitive functions can be optimized to avoid a copy in this case.

To see that the object browser modifies the named field ([NAM()] in the next code block), compare the results of running the following lines. In the first, both 'lines' are run together, so that Rstudio has no time to 'touch' X before its structure is queried. In the second, each line is pasted in separately, so X is modified before it is examined.

## Pasted in together
x <- 1:10; .Internal(inspect(x))
# @46b47b8 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...

## Pasted in with some delay between lines
x <- 1:10
.Internal(inspect(x))
# @42111b8 13 INTSXP g0c4 [NAM(2)] (len=10, tl=0) 1,2,3,4,5,... 

Once the named field is set to 2, [<-(X, ...) will not modify the original object. Pasting the following into Rstudio all at once modifies X, while pasting it in line-by-line does not:

x <- 1:10
"[<-"(x, 1, 111)

One more consequence of all this is that Rstudio's object browser actually makes some operations slower than they would otherwise be. Again, compare the same two commands first pasted in together, and then one at a time:

## Pasted in together
x <- 1:5e7
system.time(x[1] <- 9L)
#    user  system elapsed 
#       0       0       0 

## Pasted in one at a time
x <- 1:5e7
system.time(x[1] <- 9L)
#    user  system elapsed 
#    0.11    0.04    0.16 

Variable behavior of [<- in R

The behavior of [<- w.r.t. modifying a vector X depends on the storage types of X and of the element being assigned into it. This explains R's behavior but not Rstudio's.

In R, when [<- either appends to a vector X, or performs a subassignment that requires that X's type be modified, X is copied and the value that is returned does not overwrite the pre-existing variable X. (To do that you need to do something like X <- "[<-(X, 2, 100).

So, neither of the following modify X

X <- 1:2         ## Note: typeof(X) --> "integer"

## Subassignment that requires that X be coerced to "numeric" type
"[<-"(X, 2, 100) ## Note: typeof(100) --> "numeric"
X 
# [1]   1   2

## Appending to X
"[<-"(X, 3, 100L)
X
# [1]   1   2

Whenever possible, though, R does allow the [<- function to modify X directly by reference (i.e. without making a copy). "Possible" here includes cases in which a sub-assignment doesn't require that X's type be modified.

So all of the following modify X

X <- c(0i, 0i, 0i, 0i)
"[<-"(X, 1, TRUE)
"[<-"(X, 2, 20L)
"[<-"(X, 3, 3.14)
"[<-"(X, 4, 5+5i)
X
# [1]  1.00+0i 20.00+0i  3.14+0i  5.00+5i

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...