I am trying to figure out the best way to insert group means back into a multi-indexed pandas dataframe as extra rows.
Suppose I have a dataframe like this:
          metric 1     metric 2
                 R   P        R   P
    foo a        0   1        2   3
        b        4   5        6   7
    bar a        8   9       10  11
        b       12  13       14  15
I would like to get the following result:
            metric 1     metric 2
                   R   P        R   P
    foo a          0   1        2   3
        b          4   5        6   7
        AVG        2   3        4   5
    bar a          8   9       10  11
        b         12  13       14  15
        AVG       10  11       12  13
Please note, I know I can do df.mean(level=0) to get the level 0 group means as a separate dataframe. That is not exactly what I want: I want to insert the group means as rows back into their groups.
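For reference, here is a small sketch of the "separate dataframe" of group means mentioned above. It uses `df.groupby(level=0).mean()` instead of `df.mean(level=0)`, because the `level` argument of `DataFrame.mean` was removed in pandas 2.0. The frame is the same toy data as in the example code further down, built more compactly, and the name `group_means` is only illustrative.

```python
import numpy as np
import pandas as pd

# Same toy frame as in the question, built with from_product for brevity.
df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=pd.MultiIndex.from_product([["foo", "bar"], ["a", "b"]]),
    columns=pd.MultiIndex.from_product([["metric 1", "metric 2"], ["R", "P"]]),
)

# One row of column means per level-0 group.
# sort=False keeps the groups in order of first appearance (foo, then bar).
group_means = df.groupby(level=0, sort=False).mean()
print(group_means)
```

The result has one row per group ("foo", "bar") and the same two-level columns, which is the frame that still has to be merged back into the original.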
I am able to get the result I want, but I suspect I am doing it wrong: there is probably a one-liner I am missing that does this without the expensive Python-level iteration. Here is my example code:
    import numpy as np
    import pandas as pd

    data = np.arange(16).reshape(4, 4)
    row_index = [("foo", "a"), ("foo", "b"), ("bar", "a"), ("bar", "b")]
    col_index = [("metric 1", "R"), ("metric 1", "P"), ("metric 2", "R"),
                 ("metric 2", "P")]
    col_multiindex = pd.MultiIndex.from_tuples(col_index)
    df = pd.DataFrame(data, index=pd.MultiIndex.from_tuples(row_index),
                      columns=col_multiindex)

    new_row_index = []
    data = []
    for name, group in df.groupby(level=0):
        # Copy the group's rows, then append its column means as an "AVG" row.
        for index_tuple, row in group.iterrows():
            new_row_index.append(index_tuple)
            data.append(row.tolist())
        new_row_index.append((name, "AVG"))
        data.append(group.mean().tolist())

    print(pd.DataFrame(data,
                       index=pd.MultiIndex.from_tuples(new_row_index),
                       columns=col_multiindex))
Which results in:
            metric 1     metric 2
                   R   P        R   P
    bar a          8   9       10  11
        b         12  13       14  15
        AVG       10  11       12  13
    foo a          0   1        2   3
        b          4   5        6   7
        AVG        2   3        4   5
This flips the order of the groups for some reason (bar now comes before foo), but is otherwise more or less what I want.
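One possible way to avoid the row-by-row loop is sketched below. It is not necessarily the most idiomatic answer. The idea is to compute the group means once, relabel them as (group, "AVG") rows, append them with `pd.concat`, and then use a stable sort on the level-0 group so each "AVG" row lands at the end of its group. The names `means`, `combined`, `group_order` and `result` are invented for this illustration. The flipped group order in the loop version most likely comes from `groupby` sorting the group keys by default; passing `sort=False` keeps them in order of first appearance.

```python
import numpy as np
import pandas as pd

data = np.arange(16).reshape(4, 4)
row_index = pd.MultiIndex.from_tuples(
    [("foo", "a"), ("foo", "b"), ("bar", "a"), ("bar", "b")])
col_index = pd.MultiIndex.from_tuples(
    [("metric 1", "R"), ("metric 1", "P"), ("metric 2", "R"), ("metric 2", "P")])
df = pd.DataFrame(data, index=row_index, columns=col_index)

# Column means per level-0 group, in order of first appearance.
means = df.groupby(level=0, sort=False).mean()
# Relabel each mean row as (group, "AVG") so it fits the two-level row index.
means.index = pd.MultiIndex.from_tuples([(name, "AVG") for name in means.index])

# Append the mean rows; at this point they all sit at the bottom.
combined = pd.concat([df, means])

# Stable sort by the group's original position: rows keep their relative
# order within a group, so each "AVG" row ends up last in its group.
group_order = df.index.get_level_values(0).unique()
codes = pd.Categorical(combined.index.get_level_values(0),
                       categories=group_order).codes
result = combined.iloc[np.argsort(codes, kind="stable")]
print(result)
```

Because the means are floats, every column in `result` becomes float after the concatenation. This sketch also assumes that no existing row already uses "AVG" as its second-level label.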