I seem to have found a pitfall with using .sum() on numpy arrays, but I'm unable to find an explanation. Essentially, if I try to sum a large array I start getting nonsensical answers, but this happens silently and I can't make sense of the output well enough to Google the cause.
For example, this works exactly as expected:
import numpy as np
a = sum(xrange(2000))
print('a is {}'.format(a))
b = np.arange(2000).sum()
print('b is {}'.format(b))
Giving the same output for both:
a is 1999000
b is 1999000
However, this does not work:
c = sum(xrange(200000))
print('c is {}'.format(c))
d = np.arange(200000).sum()
print('d is {}'.format(d))
Giving the following output:
c is 19999900000
d is -1474936480
And on an even larger array, it's possible to get back a positive result. This is more insidious because I might not identify that something unusual was happening at all. For example this:
e = sum(xrange(100000000))
print('e is {}'.format(e))
f = np.arange(100000000).sum()
print('f is {}'.format(f))
Gives this:
e is 4999999950000000
f is 887459712
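Both wrong results are at least consistent with the exact sum wrapping around in a signed 32-bit integer. Here is a minimal pure-Python sketch of that arithmetic. `wrap_int32` and `triangular` are hypothetical helpers written for illustration (not NumPy functions), and the 32-bit accumulator is an assumption, not something the outputs alone prove:

```python
def triangular(n):
    """Exact value of 0 + 1 + ... + (n - 1), using Python's unbounded integers."""
    return n * (n - 1) // 2


def wrap_int32(value):
    """Hypothetical helper: reduce an exact integer to signed 32-bit two's complement."""
    value &= 0xFFFFFFFF        # keep only the low 32 bits
    if value >= 0x80000000:    # top bit set means the value reads as negative
        value -= 0x100000000
    return value


print('d would be {}'.format(wrap_int32(triangular(200000))))     # -1474936480
print('f would be {}'.format(wrap_int32(triangular(100000000))))  # 887459712
```

The small case is unaffected because 1999000 is well below 2**31 - 1 = 2147483647.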
I guessed that this was to do with data types, and indeed even using the Python float seems to fix the problem:
e = sum(xrange(100000000))
print('e is {}'.format(e))
f = np.arange(100000000, dtype=float).sum()
print('f is {}'.format(f))
Giving:
e is 4999999950000000
f is 4.99999995e+15
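Float is probably not the only dtype that avoids the wrong answer. This is a minimal sketch, assuming the default integer dtype on this setup is 32-bit. It requests a 64-bit integer either for the array itself or only for the accumulator used by `.sum()`, and uses the smaller array to keep it quick:

```python
import numpy as np

n = 200000
expected = n * (n - 1) // 2  # exact Python integer, 19999900000

# Option 1: build the array as 64-bit integers from the start
g = np.arange(n, dtype=np.int64).sum()

# Option 2: leave the array alone and accumulate the sum in 64 bits
h = np.arange(n).sum(dtype=np.int64)

print('g is {}'.format(g))
print('h is {}'.format(h))
```

An int64 result stays exact up to 2**63 - 1, whereas float64 only represents every integer exactly up to 2**53, so the float workaround would start rounding on much larger sums.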
I have no background in Comp. Sci. and found myself stuck (perhaps this is a dupe). Things I've tried:
- numpy arrays have a fixed size. Nope; this seems to show I should hit a MemoryError first.
- I might somehow have a 32-bit installation (probably not relevant); nope, I followed this and confirmed I have 64-bit.
- Other examples of weird sum behaviour; nope (?) I found this but I can't see how it applies.
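A 64-bit Python installation does not necessarily mean NumPy's default integer is 64-bit, so it may be worth checking the dtype directly. This is a minimal sketch; `default_int_info` is a hypothetical helper name, and the result depends on the platform and NumPy version:

```python
import numpy as np


def default_int_info():
    """Hypothetical helper: report the dtype np.arange picks by default and its largest value."""
    dtype = np.arange(3).dtype
    return dtype.name, np.iinfo(dtype).max


name, largest = default_int_info()
print('default integer dtype is {} (max {})'.format(name, largest))
```

If this reports int32, any integer sum above 2147483647 cannot be represented in that dtype.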
Can someone please explain briefly what I'm missing and tell me what I need to read up on? Also, other than remembering to define a dtype each time, is there a way to stop this happening or give a warning?
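As far as I can tell there is no built-in switch that warns when an integer array sum overflows, so one stopgap would be a small wrapper. This is a sketch only; `sum_with_wide_accumulator` is a hypothetical helper, not part of NumPy, and it does nothing for sums that exceed the int64 range:

```python
import warnings

import numpy as np


def sum_with_wide_accumulator(arr):
    """Hypothetical helper: sum narrow integer arrays in int64 and say so."""
    arr = np.asarray(arr)
    # kind 'i' is signed integer, 'u' is unsigned; itemsize is in bytes
    if arr.dtype.kind in 'iu' and arr.dtype.itemsize < 8:
        warnings.warn('summing {} array with an int64 accumulator'.format(arr.dtype))
        return arr.sum(dtype=np.int64)
    # Floats and 64-bit integers are summed as usual
    return arr.sum()


total = sum_with_wide_accumulator(np.arange(200000, dtype=np.int32))
print('total is {}'.format(total))
```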
Possibly relevant:
- Windows 7
- numpy 1.11.3
- Running out of Enthought Canopy on Python 2.7.9