In pandas, given a DataFrame D:
+-----+--------+--------+--------+
| | 1 | 2 | 3 |
+-----+--------+--------+--------+
| 0 | apple | banana | banana |
| 1 | orange | orange | orange |
| 2 | banana | apple | orange |
| 3 | NaN | NaN | NaN |
| 4 | apple | apple | apple |
+-----+--------+--------+--------+
How do I return rows that have the same contents across all of its columns when there are three columns or more such that it returns this:
+-----+--------+--------+--------+
| | 1 | 2 | 3 |
+-----+--------+--------+--------+
| 1 | orange | orange | orange |
| 4 | apple | apple | apple |
+-----+--------+--------+--------+
Note that it skips rows when all values are NaN.
If this were only two columns, I usually do D[D[1]==D[2]]
but I don't know how to generalize this for more than 2 column DataFrames.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…