In pandas, I want to compare only 3 columns (chosen by name) out of 8 in total, and store the result in an "Outcome" column.
- [There are many similar questions, but almost all of them are irrelevant here because they compare every column in the dataframe rather than a few selected ones from a larger dataset, as happens in real-world analysis. I want to choose the columns to compare by name.]
# Columns to compare: ColB, ColD and ColF
Fruits  ColA  ColB  ColC  ColD  ColE  ColF  Outcome
Loquat    83    98    91    98    78    96  FALSE
Medlar    82    94    87    94    91    94  TRUE
Pear      77    74    79    71    79    71  FALSE
Quince    71    93    78    93    92    93  TRUE
Date      98    81    73    94    97    99  FALSE
Rowan     89    85    77    85    95    85  TRUE
Lime      97    91    71    90    88    85  FALSE
Is there any code that can compare more than 2 columns at a time and return a boolean?
(I know that comparing 2 columns works with the code below, but adding a third column raises the error shown at the end.)
# I have tried the below code:
df.loc[(df['ColB']==df['ColD']==df['ColF']), 'Outcome'] = "True"
Traceback (most recent call last):
  File "C:Py378TestsTrial.py", line 15, in <module>
    df.loc[(df['ColB']==df['ColD']==df['ColF']), 'Outcome'] = "True"
  File "c:py378pylibsite-packagespandascoregeneric.py", line 1479, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
The above works if I remove "==df['ColF']" from it, so I know that comparing 2 columns works. Is there a form in which I can list several columns by name (3 to 5, or more) and have it work?
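For reference, the likely cause of the error is that Python expands a chained comparison `a == b == c` into `(a == b) and (b == c)`, and `and` asks for the truth value of a whole Series. Below is a minimal sketch of one possible approach, not necessarily the only or best one: compare every selected column against the first one row by row, then require all comparisons to hold. The dataframe is rebuilt from the sample table above, and the list name `cols` is just an illustrative assumption.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Fruits": ["Loquat", "Medlar", "Pear", "Quince", "Date", "Rowan", "Lime"],
    "ColA": [83, 82, 77, 71, 98, 89, 97],
    "ColB": [98, 94, 74, 93, 81, 85, 91],
    "ColC": [91, 87, 79, 78, 73, 77, 71],
    "ColD": [98, 94, 71, 93, 94, 85, 90],
    "ColE": [78, 91, 79, 92, 97, 95, 88],
    "ColF": [96, 94, 71, 93, 99, 85, 85],
})

# Columns to compare, chosen by name (hypothetical variable name).
cols = ["ColB", "ColD", "ColF"]

# Compare each selected column with the first selected column, row by row,
# then keep True only where every comparison in the row is True.
df["Outcome"] = df[cols].eq(df[cols[0]], axis=0).all(axis=1)

print(df[["Fruits"] + cols + ["Outcome"]])
```

Adding more columns only means extending `cols`. For exactly three columns, the explicit form `(df['ColB'] == df['ColD']) & (df['ColD'] == df['ColF'])` should give the same boolean Series, because `&` combines the two comparisons element by element instead of asking for a single truth value.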