Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How to eliminate the extra minus sign when rounding negative numbers towards zero in numpy?

I have a simple question about the fix and floor functions in numpy. When rounding negative numbers that are larger than -1 towards zero, numpy rounds them off correctly to zero but leaves a negative sign. This negative sign interferes with my custom unique_rows function, since it uses ascontiguousarray to compare elements of the array and the sign disturbs the uniqueness. Both round and fix behave the same in this regard.

>>> np.fix(-1e-6)
Out[1]: array(-0.0)
>>> np.round(-1e-6)
Out[2]: -0.0

Any insights on how to get rid of the sign? I thought about using the np.sign function but it comes with extra computational cost.


1 Reply


The issue you're having between -0. and +0. is part of the specification of how floats are supposed to behave (IEEE 754). In some circumstances, one needs this distinction. See, for example, the references linked from the documentation for np.around.

It's also worth noting that the two zeros compare as equal, so

np.array(-0.)==np.array(+0.) 
# True
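
As a small illustration of the point above, the sign is still stored even though the values compare equal. This is only a sketch; the variable names are made up for the example, and np.signbit is used to inspect the sign bit.

```python
import numpy as np

neg_zero = np.float64(-0.0)
pos_zero = np.float64(0.0)

# The two zeros compare equal as values...
values_equal = bool(neg_zero == pos_zero)

# ...but the sign bit is still stored and can be inspected.
neg_has_sign = bool(np.signbit(neg_zero))
pos_has_sign = bool(np.signbit(pos_zero))

print(values_equal, neg_has_sign, pos_has_sign)
# True True False
```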

That is, I think the problem is more likely with your uniqueness comparison. For example:

a = np.array([-1., -0., 0., 1.])
np.unique(a)
#  array([-1., -0.,  1.])

If you want to keep the numbers as floating point but have all the zeros the same, you could use:

x = np.linspace(-2, 2, 6)
#  array([-2. , -1.2, -0.4,  0.4,  1.2,  2. ])
y = x.round()
#  array([-2., -1., -0.,  0.,  1.,  2.])
y[y==0.] = 0.
#  array([-2., -1.,  0.,  0.,  1.,  2.])

# or  
y += 0.
#  array([-2., -1.,  0.,  0.,  1.,  2.])    

Note, though, that you do have to do this bit of extra work, since you are working around behavior that the floating point specification requires.
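
If this comes up in several places, the normalization could be wrapped in a small helper. The function name below is hypothetical (it is not part of numpy), and the sketch assumes the default round-to-nearest mode, under which -0.0 + 0.0 gives +0.0.

```python
import numpy as np

def fix_without_negative_zero(values):
    """Round toward zero, then turn any -0.0 into +0.0 (hypothetical helper)."""
    rounded = np.fix(np.asarray(values, dtype=float))
    # Adding +0.0 leaves every nonzero value unchanged and maps -0.0 to +0.0
    # under the default round-to-nearest mode.
    return rounded + 0.0

result = fix_without_negative_zero([-1.7, -0.4, 0.4, 1.7])
sign_bits = np.signbit(result)
print(sign_bits)
# [ True False False False]
```

Only the -1. entry keeps its sign bit; both zeros now carry a positive sign.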

Note also that this isn't due to a rounding error. For example,

np.fix(np.array(-.4)).tobytes().hex()
# '0000000000000080'
np.fix(np.array(-0.)).tobytes().hex()
# '0000000000000080'

That is, the resulting numbers are exactly the same, but

np.fix(np.array(0.)).tobytes().hex()
# '0000000000000000'

is different. This is why your method is not working, since it's comparing the binary representation of the numbers, which is different for the two zeros. Therefore, I think the problem is more the method of comparison than the general idea of comparing floating point numbers for uniqueness.
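
To make this concrete, here is a rough sketch of a byte-wise row comparison. The asker's unique_rows is not shown in the question, so unique_rows_bytewise below is a hypothetical stand-in that simply compares the raw bytes of each row; it is not meant to be the original implementation.

```python
import numpy as np

def unique_rows_bytewise(a):
    """Keep the first occurrence of each row, comparing rows by raw bytes.

    Hypothetical stand-in for a unique_rows function that compares
    binary representations rather than values.
    """
    a = np.ascontiguousarray(a)
    seen = set()
    keep = []
    for i, row in enumerate(a):
        key = row.tobytes()  # -0.0 and +0.0 produce different bytes
        if key not in seen:
            seen.add(key)
            keep.append(i)
    return a[keep]

rows = np.fix(np.array([[-0.4, 1.2], [0.4, 1.2]]))  # [[-0., 1.], [0., 1.]]

before = unique_rows_bytewise(rows)        # the two rows differ in the sign bit
after = unique_rows_bytewise(rows + 0.0)   # zeros normalized first

print(len(before), len(after))
# 2 1
```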

A quick timeit test for the various approaches:

import timeit
import numpy as np

data0 = np.fix(4*np.random.rand(1000000,)-2)
#   [ 1. -0.  1. -0. -0.  1.  1.  0. -0. -0. .... ]

N = 100
data = np.array(data0)
print(timeit.timeit("data += 0.", setup="from __main__ import np, data", number=N))
#  0.171831846237
data = np.array(data0)
print(timeit.timeit("data[data==0.] = 0.", setup="from __main__ import np, data", number=N))
#  0.83500289917
data = np.array(data0)
print(timeit.timeit("data.astype(int).astype(float)", setup="from __main__ import np, data", number=N))
#  0.843791007996

I agree with @senderle's point that if you want simple and exact comparisons and can get by with ints, ints will generally be easier. But if you want unique floats, you should be able to do this too, though you need to do it a bit more carefully. The main issue with floats is that calculations can introduce small differences that don't appear in a normal print, but this isn't a huge barrier, especially not after a round, fix, or rint over a reasonable range of floats.
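
As a hedged illustration of the "small differences" mentioned above, the sketch below uses the familiar 0.1 + 0.2 case. The choice of 6 decimals is arbitrary and only for the example.

```python
import numpy as np

# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
values = np.array([0.1 + 0.2, 0.3])

raw_unique = np.unique(values)                   # the two entries differ slightly
rounded_unique = np.unique(np.round(values, 6))  # rounding makes them identical

print(len(raw_unique), len(rounded_unique))
# 2 1
```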

