I'm plotting a cross-tabulation of various offices within certain categories. I'd like to put together a horizontal stacked bar chart where each bar segment is labeled with its office and value.
Here's some example code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# create dataframe
df = pd.DataFrame({'office1': [1, np.nan, np.nan],
                   'office2': [np.nan, 8, np.nan],
                   'office3': [12, np.nan, np.nan],
                   'office4': [np.nan, np.nan, 3],
                   'office5': [np.nan, 5, np.nan],
                   'office6': [np.nan, np.nan, 7],
                   'office7': [3, np.nan, np.nan],
                   'office8': [np.nan, np.nan, 11],
                   'office9': [np.nan, 6, np.nan]},
                  index=['catA', 'catB', 'catC'])
# plot dataframe
ax = df.plot.barh(title="Office Breakdown by Category",
                  legend=False,
                  figsize=(10, 7), stacked=True)
This gives me a fine starting point:
However, what I'd like to have is this:
After some research, I came up with the following code that correctly lines up labels on the 'category' axis:
def annotateBars(row, ax=ax):
    for col in row.index:
        value = row[col]
        if (str(value) != 'nan'):
            ax.text(value/2, labeltonum(row.name), col+","+str(value))

def labeltonum(label):
    if label == 'catA':
        return 0
    elif label == 'catB':
        return 1
    elif label == 'catC':
        return 2

df.apply(annotateBars, ax=ax, axis=1)
But this doesn't factor in the "stacking" of the bars: every label is placed at value/2, as if each segment started at zero. I've also tried iterating through the patches container returned by the plot command (which lets me retrieve the x and y positions of each rectangle), but I then lose any connection to the office labels.
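One direction that might work is to compute each segment's left edge from the data itself, so the office name is never lost. The sketch below is only a rough attempt under two assumptions: that pandas stacks the columns in order and treats NaN as a zero-width segment, and that the bars sit at y = 0, 1, 2 in index order. The names `filled`, `left` and `labels` are just illustrative helpers, and the Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'office1': [1, np.nan, np.nan],
                   'office2': [np.nan, 8, np.nan],
                   'office3': [12, np.nan, np.nan],
                   'office4': [np.nan, np.nan, 3],
                   'office5': [np.nan, 5, np.nan],
                   'office6': [np.nan, np.nan, 7],
                   'office7': [3, np.nan, np.nan],
                   'office8': [np.nan, np.nan, 11],
                   'office9': [np.nan, 6, np.nan]},
                  index=['catA', 'catB', 'catC'])

ax = df.plot.barh(title="Office Breakdown by Category",
                  legend=False, figsize=(10, 7), stacked=True)

# Assumption: NaN segments have zero width, so fill them with 0 before summing.
filled = df.fillna(0)
# Left edge of each segment = running total of everything stacked before it.
left = filled.cumsum(axis=1) - filled

labels = []  # (category, office, x, y) of every label, kept for inspection
for y, (cat, row) in enumerate(df.iterrows()):
    for office, value in row.items():
        if pd.notna(value):
            # Centre of the segment: its left edge plus half its width.
            x = left.loc[cat, office] + value / 2
            ax.text(x, y, f"{office}, {value:g}", ha="center", va="center")
            labels.append((cat, office, x, y))
```

With this data, for example, office3 in catA starts after office1's width of 1, so its label would sit at x = 1 + 12/2 = 7 rather than at 6.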