I have two dataframes both of which have the same basic schema. (4 date fields, a couple of string fields, and 4-5 float fields). Call them df1
and df2
.
What I want to do is basically get a "diff" of the two - where I get back all rows that are not shared between the two dataframes (not in the set intersection). Note, the two dataframes need not be the same length.
I tried using pandas.merge(how='outer')
but I was not sure what column to pass in as the 'key' as there really isn't one and the various combinations I tried were not working. It is possible that df1
or df2
has two (or more) rows that are identical.
What is a good way to do this in pandas/Python?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…