Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

r - Do you reassign == and != to isTRUE( all.equal() )?

A previous post prompted me to ask this question. It would seem like a best practice to reassign == to isTRUE(all.equal()) (and != to !isTRUE(all.equal())). I'm wondering whether others do this in practice.

I just realized that I use == and != for numeric equality throughout my codebase. My first reaction was that I need to do a full scrub and convert everything to all.equal. But in fact, every time I use == and != I want to test equality (regardless of the data type). I'm not sure what these operators would test for other than equality. I'm sure I'm missing some concept here. Can someone enlighten me?

The only argument I see against this approach is that in some cases two non-identical numbers will appear to be identical because of the tolerance of all.equal. But we're told that two numbers that should be equal might not pass identical() because of how they are stored in memory as floating-point values. So really, what's the point of not defaulting to all.equal?


1 Reply

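As @joran alluded to, you'll run into floating-point issues with == and != in pretty much any other language too. One important aspect of these operators in R is that they are vectorized: they compare element by element and return a logical vector, whereas isTRUE(all.equal()) collapses the whole comparison into a single TRUE or FALSE.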
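
To make the vectorization point concrete, here is a small sketch of how the two approaches differ on the same input. The vectors x and y are made-up values chosen for illustration.

```r
x <- c(0.1 + 0.2, 0.5, 1)
y <- c(0.3, 0.5, 1)

# == compares element by element; 0.1 + 0.2 is not exactly 0.3 in floating point.
x == y                      # FALSE TRUE TRUE

# all.equal() judges the two vectors as a whole, within a tolerance.
isTRUE(all.equal(x, y))     # TRUE

# Code that relies on an element-wise result stops working if == is replaced:
x[x == 0.5]                 # 0.5
isTRUE(all.equal(x, 0.5))   # FALSE, because the lengths differ
```

Logical subsetting, which(), any() and all() all depend on the element-wise behavior, which is one reason a global reassignment of == is risky.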

Rather than reassigning == and !=, it would be much better to define a new function called almostEqual, fuzzyEqual, or similar. It is unfortunate that there is no such base function. all.equal isn't very efficient for this purpose, since it handles all kinds of objects and, when they differ, returns a string describing the difference, when mostly you just want TRUE or FALSE.
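
As a quick illustration of the return-value issue, the sketch below shows what all.equal gives back when the inputs differ. The inputs 1 and 1.1 are arbitrary.

```r
res <- all.equal(1, 1.1)

is.character(res)   # TRUE: a message describing the difference, not FALSE
isTRUE(res)         # FALSE: this is why the isTRUE() wrapper is needed

# Using the raw result as a condition fails, because a message string
# cannot be interpreted as a logical value.
failed <- inherits(try(if (res) "equal", silent = TRUE), "try-error")
failed              # TRUE
```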

Here's an example of such a function. It's vectorized like ==.

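# Vectorized approximate equality: relative tolerance for values whose
# magnitude exceeds `tolerance`, absolute tolerance for values near zero.
almostEqual <- function(x, y, tolerance = 1e-8) {
  delta <- abs(x - y)            # named `delta` to avoid masking base::diff
  mag <- pmax(abs(x), abs(y))    # element-wise larger magnitude
  ifelse(mag > tolerance, delta / mag <= tolerance, delta <= tolerance)
}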

almostEqual(1, c(1+1e-8, 1+2e-8)) # [1]  TRUE FALSE
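
If the appeal of reassigning == is mainly the infix syntax, one option is to wrap the function in a user-defined operator instead. This is only a sketch: the operator name %~% is hypothetical (it is not part of base R), and almostEqual is repeated so the snippet runs on its own.

```r
# Repeated from above so this snippet is self-contained.
almostEqual <- function(x, y, tolerance = 1e-8) {
  delta <- abs(x - y)
  mag <- pmax(abs(x), abs(y))
  ifelse(mag > tolerance, delta / mag <= tolerance, delta <= tolerance)
}

# Hypothetical infix wrapper; the name %~% is chosen for illustration only.
`%~%` <- function(x, y) almostEqual(x, y)

x <- c(0.1 + 0.2, 0.5, 1)

x %~% 0.3           # element-wise, recycling the scalar like == does
x[x %~% 0.3]        # works for logical subsetting
(0.1 + 0.2) %~% 0.3 # parentheses needed: %~% binds tighter than +
```

This leaves == and != untouched for code that expects exact comparison. Note that %any% operators have higher precedence than + and -, so arithmetic on either side should be parenthesized.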

In the timings below, it is around 2x faster than all.equal for scalar values, and much faster with vectors.

x <- 1
y <- 1+1e-8
system.time(for(i in 1:1e4) almostEqual(x, y)) # 0.44 seconds
system.time(for(i in 1:1e4) all.equal(x, y))   # 0.93 seconds
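
For the vector case, a comparison could look like the sketch below. It assumes the almostEqual definition above (repeated so the snippet runs on its own); the test data and the vapply wrapper are illustrative, and timings will vary by machine, so none are quoted.

```r
# Repeated from above so this snippet is self-contained.
almostEqual <- function(x, y, tolerance = 1e-8) {
  delta <- abs(x - y)
  mag <- pmax(abs(x), abs(y))
  ifelse(mag > tolerance, delta / mag <= tolerance, delta <= tolerance)
}

set.seed(1)
n <- 1e4
x <- runif(n)
y <- x * (1 + 1e-12)   # perturb each value by a tiny relative amount

# One vectorized call gives an element-wise result.
system.time(fast <- almostEqual(x, y))

# all.equal() has to be called once per pair to get the same shape of result.
system.time(
  slow <- vapply(seq_len(n),
                 function(i) isTRUE(all.equal(x[i], y[i])),
                 logical(1))
)

all(fast)   # every pair is within tolerance
all(slow)
```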
