python - Find all indices of maximum in Pandas DataFrame

Question

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I need to find all indices where the maximum value (per row) is attained in a Pandas DataFrame. For instance, if I have a DataFrame like this:

   cat1  cat2  cat3
0     0     2     2
1     3     0     1
2     1     1     0

then the method I am looking for would yield a result like:

[['cat2', 'cat3'],
 ['cat1'],
 ['cat1', 'cat2']]

This is a list of lists, but some other data structure is also okay.

I cannot use df.idxmax(axis=1), because it only yields the first maximum.
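
To make the limitation concrete, here is a minimal sketch using the example frame from above (the variable name first_only is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'cat1': [0, 3, 1], 'cat2': [2, 0, 1], 'cat3': [2, 1, 0]})

# idxmax reports one label per row: the first column that holds the row maximum.
first_only = df.idxmax(axis=1).tolist()
print(first_only)  # ['cat2', 'cat1', 'cat1']
```

Rows 0 and 2 each have a tie, but only the first tied column shows up.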

1 Reply

深蓝 · Answer 1 · 2021-10-23T20:01:00+0000

Here is the information, in a different data structure:

In [6]: import numpy as np

In [7]: import pandas as pd

In [8]: df = pd.DataFrame({'cat1':[0,3,1], 'cat2':[2,0,1], 'cat3':[2,1,0]})

In [9]: df
Out[9]: 
   cat1  cat2  cat3
0     0     2     2
1     3     0     1
2     1     1     0

[3 rows x 3 columns]

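In [10]: rowmax = df.max(axis=1).values

rowmax is taken as a NumPy array (via .values) because recent pandas versions no longer allow indexing a Series with [:,None]. Comparing each row against its own maximum gives a boolean array in which the max values are indicated by True: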

In [82]: df.values == rowmax[:,None]
Out[82]: 
array([[False,  True,  True],
       [ True, False, False],
       [ True,  True, False]], dtype=bool)
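
The [:,None] step matters because of how NumPy broadcasting lines up shapes. The following is a small sketch on a plain array holding the same numbers (the names values, mask and wrong are illustrative, not from the answer):

```python
import numpy as np

values = np.array([[0, 2, 2],
                   [3, 0, 1],
                   [1, 1, 0]])
rowmax = values.max(axis=1)  # shape (3,): one maximum per row

# Shape (3, 1) broadcasts across the columns, so every row is
# compared with its own maximum.
mask = values == rowmax[:, None]

# Without the extra axis, shape (3,) is aligned with the columns instead:
# column j gets compared with the maximum of row j, which is not the intent.
wrong = values == rowmax

print(mask)
print(wrong)
```

Only the first comparison reproduces the True/False pattern shown above.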

np.where returns the indices where the boolean array above is True.

In [84]: np.where(df.values == rowmax[:,None])
Out[84]: (array([0, 0, 1, 2, 2]), array([1, 2, 0, 0, 1]))

The first array holds the row positions (axis=0) and the second array the column positions (axis=1). Each array has 5 values because there are five locations that are True.
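
As a self-contained sketch of how the two arrays pair up, the positions can be zipped into (row, column label) tuples. The name pairs is illustrative, and to_numpy() is used here in place of .values; both return the underlying array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'cat1': [0, 3, 1], 'cat2': [2, 0, 1], 'cat3': [2, 1, 0]})

# Boolean mask: True where a cell equals the maximum of its row.
rowmax = df.max(axis=1).to_numpy()
mask = df.to_numpy() == rowmax[:, None]

# np.where yields matching row and column positions in row-major order.
rows, cols = np.where(mask)

# Translate the column positions back into column labels.
pairs = [(int(r), str(df.columns[c])) for r, c in zip(rows, cols)]
print(pairs)
# [(0, 'cat2'), (0, 'cat3'), (1, 'cat1'), (2, 'cat1'), (2, 'cat2')]
```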

You could use itertools.groupby to build the list of lists you posted, though perhaps you don't need this given the data structures above:

In [46]: import itertools as IT

In [47]: import operator

In [48]: idx = np.where(df.values == rowmax[:,None])

In [49]: groups = IT.groupby(zip(*idx), key=operator.itemgetter(0))

In [50]: [[df.columns[j] for i, j in grp] for k, grp in groups]
Out[50]: [['cat2', 'cat3'], ['cat1'], ['cat1', 'cat2']]
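
If you would rather skip itertools, one possible alternative (not the approach used in this answer) is to build the mask with DataFrame.eq and read the column labels off each row. A minimal sketch, with is_max and result as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'cat1': [0, 3, 1], 'cat2': [2, 0, 1], 'cat3': [2, 1, 0]})

# Boolean DataFrame: True where a cell equals its row's maximum.
# axis=0 aligns the Series of row maxima with the row index.
is_max = df.eq(df.max(axis=1), axis=0)

# For each row of the mask, keep the column labels flagged True.
result = [list(df.columns[row]) for row in is_max.to_numpy()]
print(result)  # [['cat2', 'cat3'], ['cat1'], ['cat1', 'cat2']]
```

This loops over rows in Python, so for very large frames the np.where version may be the faster of the two; that is an expectation rather than something measured here.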
