Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Python - How do I calculate a rolling idxmax?

Consider the pd.Series s:

import pandas as pd
import numpy as np

np.random.seed([3,1415])
s = pd.Series(np.random.randint(0, 10, 10), list('abcdefghij'))
s

a    0
b    2
c    7
d    3
e    8
f    7
g    0
h    6
i    8
j    6
dtype: int64

I want to get the index label of the maximum value within each rolling window of size 3. The rolling maximum itself is easy:

s.rolling(3).max()

a    NaN
b    NaN
c    7.0
d    7.0
e    8.0
f    8.0
g    8.0
h    7.0
i    8.0
j    8.0
dtype: float64

What I want is

a    None
b    None
c       c
d       c
e       e
f       e
g       e
h       f
i       i
j       i
dtype: object

What I've done

s.rolling(3).apply(np.argmax)

a    NaN
b    NaN
c    2.0
d    1.0
e    2.0
f    1.0
g    0.0
h    0.0
i    2.0
j    1.0
dtype: float64

which gives the position of the maximum inside each window rather than the index label, so it is not what I want.

1 Reply

There is no simple way to do that. The argument passed to the rolling-applied function is a plain NumPy array, not a pandas Series, so the function knows nothing about the index. (In pandas 0.23 and later this is governed by the raw argument of apply; the array behavior described here corresponds to raw=True.) Moreover, the applied function must return a float, so it can't directly return index values that aren't floats.
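To see the first point concretely, here is a small sketch that records what the applied function actually receives. It assumes a pandas version whose apply accepts the raw keyword (0.23 or later), and record_type is just a made-up helper name for illustration. The series is typed in by hand with the values shown in the question.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 2, 7, 3, 8, 7, 0, 6, 8, 6], index=list('abcdefghij'))

seen = []  # type names of the objects handed to the applied function

def record_type(window):
    # hypothetical helper: remember the argument type, then return a float
    seen.append(type(window).__name__)
    return float(np.max(window))

# raw=True asks pandas to pass each window as a bare NumPy array
result = s.rolling(3).apply(record_type, raw=True)

print(set(seen))     # the windows carry no index labels
print(result.dtype)  # the result is a float series
```

Each window arrives without its labels, which is why np.argmax can only report a position inside the window.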

Here is one approach:

>>> s.index[s.rolling(3).apply(np.argmax)[2:].astype(int)+np.arange(len(s)-2)]
Index([u'c', u'c', u'e', u'e', u'e', u'f', u'i', u'i'], dtype='object')

The idea is to take the argmax values and align them with the series by adding a value indicating how far along in the series we are. (That is, for the first argmax value we add zero, because it is giving us the index into a subsequence starting at index 0 in the original series; for the second argmax value we add one, because it is giving us the index into a subsequence starting at index 1 in the original series; etc.)
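The offset arithmetic can be checked by hand with plain NumPy. This is only an illustrative sketch; the names local, starts and absolute are mine, not part of any pandas API.

```python
import numpy as np

values = np.array([0, 2, 7, 3, 8, 7, 0, 6, 8, 6])
labels = np.array(list('abcdefghij'))
window = 3
n_windows = len(values) - window + 1

# position of the maximum inside each full window (0, 1 or 2)
local = np.array([values[i:i + window].argmax() for i in range(n_windows)])

# window k starts at position k of the original series
starts = np.arange(n_windows)

# window-relative position + window start = position in the whole series
absolute = local + starts

print(local)             # [2 1 2 1 0 0 2 1]
print(absolute)          # [2 2 4 4 4 5 8 8]
print(labels[absolute])  # ['c' 'c' 'e' 'e' 'e' 'f' 'i' 'i']
```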

This gives the correct results, but doesn't include the two "None" values at the beginning; you'd have to add those back manually if you wanted them.
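If you do want the leading None values, one way is to wrap the same idea in a helper that fills an object-dtype Series. This is a sketch rather than a built-in: rolling_idxmax is a hypothetical name, and it assumes a pandas version whose apply accepts raw=True. Windows that are incomplete (or contain NaN) come back as NaN from rolling and are left as None.

```python
import numpy as np
import pandas as pd

def rolling_idxmax(s, window):
    """Label of the maximum in each trailing window; None where incomplete."""
    # raw=True hands each window to np.argmax as a plain NumPy array
    local = s.rolling(window).apply(np.argmax, raw=True)

    full = local.notna().to_numpy()            # windows with enough observations
    starts = np.arange(len(s)) - (window - 1)  # start position of each window
    positions = local.to_numpy()[full].astype(int) + starts[full]

    labels = np.full(len(s), None, dtype=object)
    labels[full] = s.index.to_numpy()[positions]
    return pd.Series(labels, index=s.index, dtype=object)

s = pd.Series([0, 2, 7, 3, 8, 7, 0, 6, 8, 6], index=list('abcdefghij'))
print(rolling_idxmax(s, 3))
```

When several values tie for the maximum, np.argmax picks the first one, so the earliest label in the window is returned.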

There is an open pandas issue to add rolling idxmax.

