What are the differences in performance and behavior between Python's native sum function and NumPy's numpy.sum? sum works on NumPy arrays and numpy.sum works on Python lists, and both return the same effective result (I haven't tested edge cases such as overflow) but with different types.
>>> import numpy as np
>>> np_a = np.array(range(5))
>>> np_a
array([0, 1, 2, 3, 4])
>>> type(np_a)
<class 'numpy.ndarray'>
>>> py_a = list(range(5))
>>> py_a
[0, 1, 2, 3, 4]
>>> type(py_a)
<class 'list'>
# The numerical answer (10) is the same for the following sums:
>>> type(np.sum(np_a))
<class 'numpy.int32'>
>>> type(sum(np_a))
<class 'numpy.int32'>
>>> type(np.sum(py_a))
<class 'numpy.int32'>
>>> type(sum(py_a))
<class 'int'>
Edit: I think my practical question here is: would using numpy.sum on a list of Python integers be any faster than using Python's own sum?
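One way to check this is to time all four combinations directly. The following is a minimal sketch, not a rigorous benchmark: the list size, the repeat counts, and the helper name time_call are arbitrary choices made for illustration, and the absolute numbers will depend on the machine and on the Python and NumPy versions in use.

```python
import timeit

import numpy as np

n = 10000  # small enough that the total also fits in a 32-bit integer
py_list = list(range(n))
np_arr = np.arange(n)


def time_call(fn, arg, repeat=5, number=20):
    """Return the best observed time in seconds for one call of fn(arg)."""
    best = min(timeit.repeat(lambda: fn(arg), repeat=repeat, number=number))
    return best / number


timings = {
    "sum(list)": time_call(sum, py_list),
    "np.sum(list)": time_call(np.sum, py_list),
    "sum(ndarray)": time_call(sum, np_arr),
    "np.sum(ndarray)": time_call(np.sum, np_arr),
}

# Print the combinations from fastest to slowest.
for label, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print("{:16s} {:10.1f} us per call".format(label, seconds * 1e6))
```

Note that np.sum on a list first has to convert the list to an array, and the built-in sum on an array iterates element by element in Python, so both mixed cases carry overhead that the two matched cases avoid.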
Additionally, what are the implications (including performance) of using a Python integer versus a scalar numpy.int32? For example, for a += 1, is there a behavior or performance difference depending on whether a is a Python integer or a numpy.int32? I am curious whether it is faster to use a NumPy scalar datatype such as numpy.int32 for a value that is added to or subtracted from a lot in Python code.
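The behavioral side of this can be illustrated with overflow: a Python integer has arbitrary precision, while numpy.int32 has a fixed width of 32 bits. The sketch below shows that difference and then times a += 1 for both types. The variable names and the iteration count are illustrative assumptions, and the timings are only indicative.

```python
import timeit

import numpy as np

INT32_MAX = 2**31 - 1

# A Python int simply grows past the 32-bit limit.
py_result = INT32_MAX + 1

# A fixed-width int32 wraps around instead. The array form wraps silently;
# the scalar form may emit a RuntimeWarning, which is suppressed here.
arr_result = np.array([INT32_MAX], dtype=np.int32) + np.int32(1)
with np.errstate(over="ignore"):
    scalar_result = np.int32(INT32_MAX) + np.int32(1)

print(py_result)          # 2147483648
print(arr_result[0])      # -2147483648
print(scalar_result)      # -2147483648

# Rough timing of repeated in-place addition for each scalar type.
number = 200000
t_py = timeit.timeit("a += one", setup="a = 0; one = 1", number=number)
t_np = timeit.timeit(
    "a += one",
    setup="import numpy as np; a = np.int32(0); one = np.int32(1)",
    number=number,
)
print("Python int: {:.4f} s, numpy.int32: {:.4f} s".format(t_py, t_np))
```

The increment here is itself a numpy.int32 so that the result type stays int32; mixing a NumPy scalar with a plain Python integer can produce a different result type depending on the NumPy version.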
For clarification, I am working on a bioinformatics simulation which partly consists of collapsing multidimensional numpy.ndarrays into single scalar sums, which are then processed further. I am using Python 3.2 and NumPy 1.6.
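For the use case described above, a minimal sketch of collapsing a multidimensional array into a scalar might look like the following. The array shape and the name grid are made up for illustration. Requesting a 64-bit accumulator with dtype is one way to reduce the risk of overflow when the input is int32, and int() converts the NumPy scalar to a plain Python integer for later scalar arithmetic.

```python
import numpy as np

# Hypothetical 3-D int32 array standing in for simulation data.
grid = np.arange(24, dtype=np.int32).reshape(2, 3, 4)

# Collapse every axis at once, accumulating in 64 bits.
total = grid.sum(dtype=np.int64)

# Collapse only the last two axes, leaving one sum per outer slice.
# (Passing a tuple of axes requires NumPy 1.7 or later.)
per_slice = grid.sum(axis=(1, 2), dtype=np.int64)

# Convert to a plain Python int before further scalar processing.
total_py = int(total)

print(total_py)    # 276
print(per_slice)   # [ 66 210]
```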
Thanks in advance!