General idea:
- iterate over each row in the df (I would prefer the iterrows method over plain iteration over the column)
- whenever 'Position' is encountered in the first column (df2[1] == 'Position')
- check whether the next row in that column is 'Table'
- if it is not, delete the entire row where that 'Position' was found
Dataframe:
          1        2
0  Position  random1
1     12345  random2
2     12345  random3
3  Position  random4
4     Table  random5
5     12345  random6
6     12345  random7
Desired result:
          1        2
0     12345  random2
1     12345  random3
2  Position  random4
3     Table  random5
4     12345  random6
5     12345  random7
Pseudo/code:
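import pandas as pd

info = {1: ['Position', '12345', '12345', 'Position', 'Table', '12345', '12345'],
        2: ['random1', 'random2', 'random3', 'random4', 'random5', 'random6', 'random7']}
df2 = pd.DataFrame(info, columns=[1, 2])

# Collect the labels of 'Position' rows whose next row is not 'Table'
to_drop = []
for indx, row in df2.iterrows():
    if row[1] == 'Position':
        # Assumption: a trailing 'Position' with no next row is dropped as well
        is_last = indx + 1 not in df2.index
        if is_last or df2.loc[indx + 1, 1] != 'Table':
            to_drop.append(indx)
# Drop after the loop so the frame is not modified while iterating
df2 = df2.drop(to_drop).reset_index(drop=True)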
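For comparison, the same idea can probably be expressed without a row loop. The following is only a sketch, not a definitive answer: it assumes the default integer index shown above, and the names next_value, orphan_position and result are made up for illustration. It uses shift(-1) to line each row up with the value of the row that follows it, then filters with a boolean mask.

```python
import pandas as pd

info = {1: ['Position', '12345', '12345', 'Position', 'Table', '12345', '12345'],
        2: ['random1', 'random2', 'random3', 'random4', 'random5', 'random6', 'random7']}
df2 = pd.DataFrame(info, columns=[1, 2])

# Value of column 1 in the following row (NaN for the last row)
next_value = df2[1].shift(-1)

# True for 'Position' rows that are NOT followed by 'Table'.
# Assumption: a 'Position' in the very last row counts as not followed by 'Table'.
orphan_position = (df2[1] == 'Position') & (next_value != 'Table')

# Keep everything else and renumber the index from 0
result = df2[~orphan_position].reset_index(drop=True)
print(result)
```

The mask is built once for the whole column, so the frame is never modified while it is being iterated, which is the main pitfall of deleting rows inside an iterrows loop.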
question from:
https://stackoverflow.com/questions/65924441/compare-upcoming-row-with-previous-by-index-pandas