I have two dataframes:
df1 = row1;row2;row3
df2 = row4;row5;row6;row2
I want my output dataframe to contain only the rows of df1 that do not also appear in df2, i.e.:
df_out = row1;row3
What is the most efficient way to get this?
The following code does what I want, but it uses two nested for-loops:
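import pandas as pd

a = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
b = pd.DataFrame({0: [0, 1, 2, 3], 1: [0, 1, 20, 3]})
match_ident = []
for i in range(len(a)):
    found = False
    for j in range(len(b)):
        # A row of a counts as found only if both columns match a row of b
        if a[0][i] == b[0][j]:
            if a[1][i] == b[1][j]:
                found = True
    match_ident.append(not found)
# Keep only the rows of a that were not found in b
a = a[match_ident]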
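For comparison, here is a sketch of one possible loop-free version of the same filter, assuming both dataframes share the same column labels. It does a left merge with `indicator=True` and keeps the rows marked `left_only`. The helper name `rows_only_in_left` is made up for illustration, and whether this is actually the most efficient option would need to be measured on real data.

```python
import pandas as pd


def rows_only_in_left(left, right):
    """Hypothetical helper: rows of `left` that have no identical row in `right`."""
    # Drop duplicate rows from `right` so the merge cannot multiply rows of `left`.
    # With no `on` argument, merge joins on all columns the two frames share.
    merged = left.merge(right.drop_duplicates(), how="left", indicator=True)
    # indicator=True adds a "_merge" column; "left_only" marks rows absent from `right`.
    return merged.loc[merged["_merge"] == "left_only", left.columns]


a = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
b = pd.DataFrame({0: [0, 1, 2, 3], 1: [0, 1, 20, 3]})

out = rows_only_in_left(a, b)
print(out)
```

Note that the result carries the positional index of the merged frame rather than the original index of `a`, which only coincide here because `a` uses the default index.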