I have a pandas DataFrame that needs to be fed in chunks of n rows into downstream functions (print
in the example). The chunks may have overlapping rows.
Let's start from a dummy DataFrame:
import pandas as pd

d = {'A': list(range(1000)), 'B': list(range(1000))}
df = pd.DataFrame(d)
For 2-row chunks with a 1-row overlap I have the following code:
# Start positions of each chunk; this relies on the default 0..N-1 integer index.
a = df.index.values[:-1]
for i in a:
    print(df.iloc[i:i+2])
The output is something like this:
...
       A    B
996  996  996
997  997  997
       A    B
997  997  997
998  998  998
       A    B
998  998  998
999  999  999
Which is exactly what I want.
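For the general case of n-row chunks with a configurable overlap, the same idea can be sketched as a generator. This is only a rough sketch: the name iter_chunks and its parameters n and overlap are my own (hypothetical, not a pandas API), and it assumes a trailing partial chunk should be dropped rather than yielded.

```python
import pandas as pd


def iter_chunks(frame, n, overlap=0):
    """Yield n-row slices of frame; consecutive slices share `overlap` rows.

    Hypothetical helper: a trailing chunk with fewer than n rows is dropped.
    """
    if n <= 0 or not 0 <= overlap < n:
        raise ValueError("need n > 0 and 0 <= overlap < n")
    step = n - overlap  # how far the window advances each iteration
    # Stop at the last start position that still leaves a full n-row chunk.
    for start in range(0, len(frame) - n + 1, step):
        # iloc slices by position, so this works for any index type.
        yield frame.iloc[start:start + n]


df = pd.DataFrame({'A': list(range(1000)), 'B': list(range(1000))})

# n=2, overlap=1 visits the same chunks as the loop above.
chunks = list(iter_chunks(df, n=2, overlap=1))
```

With n=2 and overlap=1 the generator advances one row at a time, so it produces the same slices as my original loop, without depending on the index values being 0..N-1.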
Is there a better/faster approach to iterate over chunks of n-rows of a pandas.DataFrame?