You can use the compare-cumsum-groupby pattern (which I really need to getting around to writing up for the documentation), with a final cumcount
:
>>> df = pd.DataFrame({"binary": [0,1,1,1,0,0,1,1,0]})
>>> df["consec"] = df["binary"].groupby((df["binary"] == 0).cumsum()).cumcount()
>>> df
binary consec
0 0 0
1 1 1
2 1 2
3 1 3
4 0 0
5 0 0
6 1 1
7 1 2
8 0 0
This works because first we get the positions where we want to reset the counter:
>>> (df["binary"] == 0)
0 True
1 False
2 False
3 False
4 True
5 True
6 False
7 False
8 True
Name: binary, dtype: bool
The cumulative sum of these gives us a different id for each group:
>>> (df["binary"] == 0).cumsum()
0 1
1 1
2 1
3 1
4 2
5 3
6 3
7 3
8 4
Name: binary, dtype: int64
And then we can pass this to groupby
and use cumcount
to get an increasing index in each group.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…