I have the following dataframe:
import pandas as pd

df = pd.DataFrame({
    "Code": ["9958S135K108MF-1", "9958S135-1", "9958S105-1", "9958S105K84MF-1"],
    "ID": ["FO995877000581098", "FO995877000581098", "FO995877000581098", "FO995877000581098"],
    "NUM": ["9958S135", "9958S135", "9958S105", "9958S105"],
})
I need the following output:
               Code                 ID       NUM
0  9958S135K108MF-1  FO995877000581098  9958S135
3   9958S105K84MF-1  FO995877000581098  9958S105
For every "ID" there should be a unique "NUM". There will be many duplicate "ID" values. The trick is that, when dropping a row that has a duplicate "ID" and "NUM", I need to remove the row whose "Code" ends in MF-1.
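One way to express this kind of rule, as a rough sketch only: flag the rows whose "Code" ends in MF-1, sort so the flagged rows come first, and let `drop_duplicates` keep the first row of each ("ID", "NUM") pair. The sketch assumes the expected output shown above is the target, i.e. the MF-1 row is the one that survives; `is_mf` is a hypothetical helper column name, not something from the original data.

```python
import pandas as pd

df = pd.DataFrame({
    "Code": ["9958S135K108MF-1", "9958S135-1", "9958S105-1", "9958S105K84MF-1"],
    "ID": ["FO995877000581098"] * 4,
    "NUM": ["9958S135", "9958S135", "9958S105", "9958S105"],
})

# Hypothetical helper flag: True where "Code" ends in "MF-1".
is_mf = df["Code"].str.endswith("MF-1")

result = (
    df.assign(is_mf=is_mf)
      # Put flagged rows first so they win within each duplicate pair.
      .sort_values("is_mf", ascending=False, kind="stable")
      # Keep one row per ("ID", "NUM") combination.
      .drop_duplicates(subset=["ID", "NUM"], keep="first")
      .drop(columns="is_mf")
      # Restore the original row order.
      .sort_index()
)
print(result)
```

If the opposite row should survive instead, sorting with `ascending=True` would put the unflagged rows first. Because the choice depends on the flag and not on row position, the order of the input rows should not matter.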
I have tried adding a "Mapping" column and deleting the rows where it is True, but it does not always assign True to the correct row, the one whose "Code" contains 'MF-1'.
Here is what I have tried:
import pandas as pd
df['Mapping'] = df['NUM'].eq(df['NUM'].shift()) & df['ID'].eq(df['ID'].shift())
               Code                 ID       NUM  Mapping
0  9958S135K108MF-1  FO995877000581098  9958S135    False
1        9958S135-1  FO995877000581098  9958S135     True
2        9958S105-1  FO995877000581098  9958S105    False
3   9958S105K84MF-1  FO995877000581098  9958S105     True
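A short diagnostic sketch of why the flag lands on the wrong row: comparing each row with `shift()` only looks at the row directly above, so it always marks the second row of a consecutive pair, whatever its "Code" is. The names `ends_mf` and `in_dup_group` below are hypothetical, added only to make the comparison visible.

```python
import pandas as pd

df = pd.DataFrame({
    "Code": ["9958S135K108MF-1", "9958S135-1", "9958S105-1", "9958S105K84MF-1"],
    "ID": ["FO995877000581098"] * 4,
    "NUM": ["9958S135", "9958S135", "9958S105", "9958S105"],
})

# The original attempt: True when a row repeats the "ID" and "NUM" of the row above.
mapping = df["NUM"].eq(df["NUM"].shift()) & df["ID"].eq(df["ID"].shift())

# What the flag was meant to track: whether "Code" ends in "MF-1".
ends_mf = df["Code"].str.endswith("MF-1")

# Order-independent alternative: marks every member of a duplicated ("ID", "NUM") pair.
in_dup_group = df.duplicated(subset=["ID", "NUM"], keep=False)

print(pd.DataFrame({"Mapping": mapping, "ends_mf": ends_mf, "in_dup_group": in_dup_group}))
```

On this data the two flags disagree on rows 0 and 1, because the MF-1 row happens to come first in that pair, while `duplicated(..., keep=False)` marks all four rows regardless of their order.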
question from:
https://stackoverflow.com/questions/65877070/removing-rows-from-dataframe-based-on-condition