It turns out that np.argmax
is blazingly fast, but only with the native numpy arrays. With foreign data, almost all the time is spent on conversion:
In [194]: print platform.architecture()
('64bit', 'WindowsPE')
In [5]: x = np.random.rand(10000)
In [57]: l=list(x)
In [123]: timeit numpy.argmax(x)
100000 loops, best of 3: 6.55 us per loop
In [122]: timeit numpy.argmax(l)
1000 loops, best of 3: 729 us per loop
In [134]: timeit numpy.array(l)
1000 loops, best of 3: 716 us per loop
I called your function "inefficient" because it first converts everything to list, then iterates through it 2 times (effectively, 3 iterations + list construction).
I was going so suggest something like this that only iterates once:
def imax(seq):
it=iter(seq)
im=0
try: m=it.next()
except StopIteration: raise ValueError("the sequence is empty")
for i,e in enumerate(it,start=1):
if e>m:
m=e
im=i
return im
But, your version turns out to be faster because it iterates many times but does it in C, rather that Python, code. C is just that much faster - even considering the fact a great deal of time is spent on conversion, too:
In [158]: timeit imax(x)
1000 loops, best of 3: 883 us per loop
In [159]: timeit fastest_argmax(x)
1000 loops, best of 3: 575 us per loop
In [174]: timeit list(x)
1000 loops, best of 3: 316 us per loop
In [175]: timeit max(l)
1000 loops, best of 3: 256 us per loop
In [181]: timeit l.index(0.99991619010758348) #the greatest number in my case, at index 92
100000 loops, best of 3: 2.69 us per loop
So, the key knowledge to speeding this up further is to know which format the data in your sequence natively is (e.g. whether you can omit the conversion step or use/write another functionality native to that format).
Btw, you're likely to get some speedup by using aggregate(max_fn)
instead of agg([max_fn])
.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…