Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Why does the number 1e9999... (31 9s) cause problems in R?

When I enter 1e9999999999999999999999999999999 into R, R hangs and stops responding, so the process has to be terminated.

This happens on 3 different computers and on two operating systems (Windows 7 and Ubuntu). It happens in RStudio, RGui and Rscript.

Here's some code to generate the number more easily:

boom <- paste(c("1e", rep(9, 31)), collapse="")
eval(parse(text=boom))

Now clearly this isn't a practical problem. I have no need to use numbers of this magnitude. It's just a question of curiosity.

Curiously, if you try 1e9999999999999999999999999999998 or 1e10000000000000000000000000000000 (subtract one from or add one to the power), you get Inf and 0 respectively. This number is clearly some kind of boundary, but a boundary between what, and why here?

I considered that it might be:

  • A floating point problem, but I think they max out at 1.7977e308, long before the number in question.
  • An issue with 32-bit integers, but 2^32 is 4294967296, much smaller than the number in question.
  • Really weird. This is my dominant theory.
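The first two guesses can be checked against the limits R reports about itself. This is a small sketch using the built-in `.Machine` list; the variable names are only for illustration.

```r
# Limits reported by the running R session
largest_double <- .Machine$double.xmax   # about 1.797693e+308
largest_int    <- .Machine$integer.max   # 2147483647, i.e. 2^31 - 1

# The exponent in the question is 10^31 - 1; 1e31 is close enough for a comparison
exponent <- 1e31

exponent > 308            # TRUE: far beyond the largest decimal exponent of a double
exponent > largest_int    # TRUE: far beyond a 32-bit signed integer
exponent > 2^32           # TRUE: and beyond an unsigned 32-bit integer as well
```

So neither limit sits at this magnitude by itself, which is why the boundary looks so odd at first glance.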

EDIT: As of 2015-09-15 at the latest, this no longer causes R to hang. They must have patched it.

question from:https://stackoverflow.com/questions/11700748/why-does-the-number-1e9999-31-9s-cause-problems-in-r

1 Reply

This looks like an extreme case in the parser. The XeY format is described in Section 10.3.1: Literal Constants of the R Language Definition and points to ?NumericConstants for "up-to-date information on the currently accepted formats".
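A quick way to confirm that the conversion happens while the text is parsed, rather than when it is evaluated, is to look at what `parse()` returns. This is only a sketch with a harmless literal; the variable names are illustrative.

```r
# parse() alone already yields the finished numeric value; eval() is not needed
expr  <- parse(text = "1e5")
value <- expr[[1]]

typeof(value)               # "double"
identical(value, 100000)    # TRUE

# Two other literal forms described in ?NumericConstants
typeof(parse(text = "1e5L")[[1]])          # "integer"
identical(parse(text = "0x10")[[1]], 16)   # TRUE
```

This is why the hang in the question occurs inside `parse(text=boom)` itself, before `eval` ever runs.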

The problem seems to be how the parser handles the exponent. The numeric constant is handled by NumericValue (line 4361 of main/gram.c), which calls mkFloat (line 4124 of main/gram.c), which calls R_atof (line 1584 of main/util.c), which calls R_strtod4 (line 1461 of main/util.c). (All as of revision 60052.)

Line 1464 of main/util.c shows expn declared as int, and it will overflow at line 1551 if the exponent is too large. Signed integer overflow is undefined behavior in C.
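One way to see why exactly 31 nines is special is to simulate the overflow. The sketch below rests on two assumptions that the source does not confirm: that the exponent digits are accumulated with a loop of the form expn = expn * 10 + digit, and that the overflow wraps around as 32-bit two's-complement arithmetic typically does. Since 10^31 = 2^31 * 5^31 and 5^31 is odd, 10^31 is congruent to 2^31 modulo 2^32, so 10^31 - 1 wraps to 2^31 - 1. The helpers `wrap32` and `accumulate_exponent` are hypothetical names, not R internals.

```r
# Hypothetical helper: reduce a whole number to the range of a 32-bit signed int,
# mimicking two's-complement wrap-around (an assumption, since overflow is undefined in C)
wrap32 <- function(x) {
  x <- x %% 2^32
  if (x >= 2^31) x - 2^32 else x
}

# Hypothetical helper: accumulate decimal digits the way a C loop of the
# assumed form expn = expn * 10 + digit would
accumulate_exponent <- function(digits) {
  expn <- 0
  for (d in as.integer(strsplit(digits, "")[[1]])) {
    expn <- wrap32(expn * 10 + d)
  }
  expn
}

below <- accumulate_exponent(paste0(strrep("9", 30), "8"))  #  2147483646
exact <- accumulate_exponent(strrep("9", 31))               #  2147483647
above <- accumulate_exponent(paste0("1", strrep("0", 31)))  # -2147483648
```

Under those assumptions the three literals from the question land on INT_MAX - 1, INT_MAX and INT_MIN, which lines up with the observed Inf, hang and 0. Because the overflow is undefined behavior, a real build is not obliged to wrap this way.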

For example, the code below produces finite values for exponents up to about 308 and Inf for larger exponents.

const <- paste0("1e",2^(1:31)-2)
for(n in const) print(eval(parse(text=n)))

You can see the undefined behavior for exponents > 2^31 (R hangs for an exponent = 2^31):

const <- paste0("1e",2^(31:61)+1)
for(n in const) print(eval(parse(text=n)))

I doubt this will get any attention from R-core, because R can only store numeric values between about 2e-308 and 2e+308 (see ?double) and this number is far beyond that range.
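For well-behaved literals, falling outside that range is harmless. A short sketch (variable names are illustrative):

```r
smallest_normal <- .Machine$double.xmin   # about 2.225074e-308

in_range  <- 1e308    # still finite
too_big   <- 1e309    # Inf
too_small <- 1e-400   # 0

is.finite(in_range)   # TRUE
is.infinite(too_big)  # TRUE
too_small == 0        # TRUE
```

This matches the Inf and 0 seen on either side of the problematic exponent in the question; only the exponent that trips the overflow misbehaves.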

