Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - as(x, 'double') and as.double(x) are inconsistent

x <- 1:10
str(x)
# int [1:10] 1 2 3 4 5 6 7 8 9 10
str(as.double(x))
# num [1:10] 1 2 3 4 5 6 7 8 9 10 
str(as(x, 'double'))
# int [1:10] 1 2 3 4 5 6 7 8 9 10

I'd be surprised if there was a bug in R with something so basic as type conversion. Is there a reason for this inconsistency?



1 Reply


as is for coercing to a new class, and double technically isn't a class but rather a storage.mode.

y <- x
storage.mode(y) <- "double"
identical(x, y)
# [1] FALSE
identical(as.double(x), y)
# [1] TRUE
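
As a minimal sketch of the distinction drawn here, the snippet below puts the type, storage mode, and class of the two vectors side by side. The helper describe() is a hypothetical name introduced only for this illustration; everything else is base R.

```r
# Sketch: compare the three views R offers of the same vector.
x <- 1:10
y <- x
storage.mode(y) <- "double"   # same values, different storage type

# Hypothetical helper, for illustration only.
describe <- function(v) {
  c(typeof = typeof(v), storage.mode = storage.mode(v), class = class(v))
}

print(describe(x))
print(describe(y))
print(all(x == y))       # the values compare equal ...
print(identical(x, y))   # ... but identical() also compares the type
```

The values of x and y are equal element by element, yet identical() reports a difference because one is stored as integer and the other as double.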

as handles the argument "double" as a special case and attempts to coerce to the class numeric. Since the class integer already inherits from numeric, nothing changes.

is.numeric(x)
# [1] TRUE

Not so fast...

While the above made sense, there is some further confusion. From ?double:

It is a historical anomaly that R has two names for its floating-point vectors, double and numeric (and formerly had real).

double is the name of the type. numeric is the name of the mode and also of the implicit class. As an S4 formal class, use "numeric".

The potential confusion is that R has used mode "numeric" to mean ‘double or integer’, which conflicts with the S4 usage. Thus is.numeric tests the mode, not the class, but as.numeric (which is identical to as.double) coerces to the class.
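
The different meanings of "numeric" in the quoted passage can be probed directly. This is a rough sketch using only base R and the methods package. The last line prints the S4 view without a stated result, since that is the part most tied to how the methods package defines its basic classes.

```r
x <- 1:10

print(is.numeric(x))            # mode test: accepts integer and double
print(inherits(x, "numeric"))   # class(x) is "integer", so no S3 match
print(typeof(as.numeric(x)))    # as.numeric() coerces the storage type
print(identical(as.numeric(x), as.double(x)))

# S4 view of the same question (result deliberately not annotated here).
print(methods::is(x, "numeric"))
```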

Therefore as should really change x according to the documentation... I will investigate further.

The plot is thicker than whipped cream and cornflour soup...

Well, if you debug as, you find that the following method eventually gets created. It is used in place of the c("ANY","numeric") signature of the coerce generic, which would have called as.numeric:

function (from, strict = TRUE) 
if (strict) {
    class(from) <- "numeric"
    from
} else from

So class<- actually gets called on x, which eventually leads to R_set_class in coerce.c. I believe the following part of that function determines the behaviour:

...
else if(!strcmp("numeric", valueString)) {
    setAttrib(obj, R_ClassSymbol, R_NilValue);
    if(IS_S4_OBJECT(obj)) /* NULL class is only valid for S3 objects */
      do_unsetS4(obj, value);
    switch(TYPEOF(obj)) {
    case INTSXP: case REALSXP: break;
    default: PROTECT(obj = coerceVector(obj, REALSXP));
    nProtect++;
    }
...

Note the switch statement: it breaks out without doing coercion in the case of integers and real values.
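
One way to check this reading from the R side is to call class<- directly on vectors of different types. The sketch below assumes the C excerpt above matches the R build in use. Under that assumption the integer vector should keep its type, while the character and logical vectors fall into the default branch and are coerced to double.

```r
# Integer input: per the quoted switch, no coercion is expected.
z <- 1:3
class(z) <- "numeric"
print(typeof(z))

# Character input takes the default branch and is coerced.
s <- c("1", "2.5")
class(s) <- "numeric"
print(typeof(s))

# Logical input takes the default branch as well.
b <- c(TRUE, FALSE)
class(b) <- "numeric"
print(typeof(b))
```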

Bug or not?

Whether or not this is a bug depends on your point of view. Integers are numeric in one sense, as confirmed by is.numeric(x) returning TRUE, but strictly speaking their class is integer, not numeric. On the other hand, since integers are promoted to double automatically whenever they meet a double in arithmetic, one may view the two as conceptually the same. (Pure integer arithmetic that overflows gives NA with a warning rather than a double.) There are two major differences: i) integers require less storage space, which may be significant for larger vectors, and ii) when interacting with external code that has greater type discipline, conversion costs may come into play.
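
The storage and promotion points can be illustrated with a short sketch. The vector length n is an arbitrary choice for the example. Note that pure integer arithmetic that overflows yields NA with a warning, and that promotion to double happens when an integer is combined with a double.

```r
n <- 1e6   # arbitrary length, chosen only for illustration

# Integers take 4 bytes per element, doubles take 8.
size_int <- object.size(integer(n))
size_dbl <- object.size(numeric(n))
print(size_int)
print(size_dbl)

big <- .Machine$integer.max
overflowed <- suppressWarnings(big + 1L)   # integer + integer: overflows
promoted   <- big + 1                      # integer + double: promoted

print(overflowed)
print(typeof(promoted))
```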

