python - How to slice one MultiIndex DataFrame with the MultiIndex of another

Question

Welcome To Ask or Share your Answers For Others

python - How to slice one MultiIndex DataFrame with the MultiIndex of another

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

python - How to slice one MultiIndex DataFrame with the MultiIndex of another

I have a pandas dataframe with 3 levels of a MultiIndex. I am trying to pull out rows of this dataframe according to a list of values that correspond to two of the levels.

I have something like this:

ix = pd.MultiIndex.from_product([[1, 2, 3], ['foo', 'bar'], ['baz', 'can']], names=['a', 'b', 'c'])
data = np.arange(len(ix))
df = pd.DataFrame(data, index=ix, columns=['hi'])
print(df)

           hi
a b   c      
1 foo baz   0
      can   1
  bar baz   2
      can   3
2 foo baz   4
      can   5
  bar baz   6
      can   7
3 foo baz   8
      can   9
  bar baz  10
      can  11

Now I want to take all rows where index levels 'b' and 'c' are in this index:

ix_use = pd.MultiIndex.from_tuples([('foo', 'can'), ('bar', 'baz')], names=['b', 'c'])

i.e. values of hi having ('foo', 'can') or ('bar', 'baz') in levels b and c respectively: (1, 2, 5, 6, 9, 10).

So I'd like to take a slice(None) on the first level, and pull out specific tuples on the second and third levels.

Initially I thought that passing a multi-index object to .loc would pull out the values / levels that I wanted, but this isn't working. What's the best way to do something like this?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-17T02:58:00+0000

Here is a way to get this slice:

df.sort_index(inplace=True)
idx = pd.IndexSlice
df.loc[idx[:, ('foo','bar'), 'can'], :]

yielding

           hi
a b   c      
1 bar can   3
  foo can   1
2 bar can   7
  foo can   5
3 bar can  11
  foo can   9

Note that you might need to sort MultiIndex before you can slice it. Well pandas is kind enough to warn if you need to do it:

KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (3), lexsort depth (1)'

You can read more on how to use slicers in the docs

If for some reason using slicers is not an option here is a way to get the same slice using .isin() method:

df[df.index.get_level_values('b').isin(ix_use.get_level_values(0)) & df.index.get_level_values('c').isin(ix_use.get_level_values(1))]

Which is clearly not as concise.

UPDATE:

For the conditions that you have updated here is a way to do it:

cond1 = (df.index.get_level_values('b').isin(['foo'])) & (df.index.get_level_values('c').isin(['can']))
cond2 = (df.index.get_level_values('b').isin(['bar'])) & (df.index.get_level_values('c').isin(['baz']))
df[cond1 | cond2]

producing:

           hi
a b   c      
1 foo can   1
  bar baz   2
2 foo can   5
  bar baz   6
3 foo can   9
  bar baz  10

Categories

python - How to slice one MultiIndex DataFrame with the MultiIndex of another

python - How to slice one MultiIndex DataFrame with the MultiIndex of another

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags