python - How to create a percentage bar plot with grouped bars?

Question


posted Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)


I have a dataframe with the columns 'City', 'Gender', 'Education level' and 'How satisfied are you about something'. Here is how my dataframe looks.

So I am trying to plot it as a bar chart:

# select the rows for the district "FATİH"
fatih_ilcesi = data.loc[data['A.01.İstanbul’un hangi ilçesinde oturuyorsunuz?'] == 'FATİH']
# then group by gender and plot the normalized answer counts for the satisfaction question
fatih_ilcesi.groupby('Cinsiyeti')['A.04. Genel olarak düşündüğünüzde İlçe Belediyenizin hizmetlerinden ne derece memnunsunuz?'].value_counts(normalize=True).plot(kind="bar")

So this is what I got:

But I'd like to get something like this:

I could not figure out how to give the bars the same colors as the answers to the question 'How satisfied are you about something'.

I also want to add percentages on top of the bars. If someone can guide me I would be really grateful. Thank you.

question from:https://stackoverflow.com/questions/65852610/how-to-create-a-percentage-bar-plot-with-grouped-bars


1 Reply

深蓝 · Answer 1 · 2021-10-06T19:28:48+0000

You could create a Seaborn countplot() as follows. Passing 'Gender' as x places the genders on the x-axis. Passing 'Satisfied?' as hue splits the bar for each gender into one smaller bar per answer and creates an accompanying legend. To fix a certain order for these values, either hue_order can be used, or the column can be made categorical.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

N = 500
data = pd.DataFrame({'City': np.random.choice(['Test City', 'Other City'], N),
                     'Gender': np.random.choice(['Male', 'Female'], N),
                     'Satisfied?': np.random.choice(['1 - very bad', '2 - bad', '3 - neutral', '4 - good', '5 - very good'], N)})
sns.countplot(data=data[data['City'] == 'Test City'], x='Gender', palette='plasma',
              hue='Satisfied?', hue_order=['1 - very bad', '2 - bad', '3 - neutral', '4 - good', '5 - very good'])
plt.show()
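As a minimal sketch of the categorical alternative to hue_order mentioned above: the column can be converted to an ordered pandas categorical once, after which Seaborn generally follows the category order on its own. The toy data and the seeded generator are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

levels = ['1 - very bad', '2 - bad', '3 - neutral', '4 - good', '5 - very good']
rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
data = pd.DataFrame({'Gender': rng.choice(['Male', 'Female'], 50),
                     'Satisfied?': rng.choice(levels, 50)})

# An ordered categorical remembers the level order, including levels that have no rows
data['Satisfied?'] = pd.Categorical(data['Satisfied?'], categories=levels, ordered=True)
print(list(data['Satisfied?'].cat.categories))
```

With the column prepared this way, the hue_order argument can usually be left out of the countplot() call.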

From here, further refinements can be made:

Change the bar heights so that they sum to one per gender. This converts the heights to percentages.
Change the formatting of the y-axis to show percentages.
While changing the heights, also change the widths of the bars, leaving a little gap between them.
Put the legend at the bottom, without a frame and with square markers.
Add the percentage as text above each bar.
Add horizontal grid lines.
Hide the spines.
...
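The first refinement, heights that sum to one per gender, can be checked independently of the plot. The following is only a sketch with hypothetical toy data; it uses pandas' crosstab with normalize='index' to compute the share of each answer within each gender.

```python
import pandas as pd

# Hypothetical toy data: 4 male and 6 female respondents
toy = pd.DataFrame({
    'Gender': ['Male'] * 4 + ['Female'] * 6,
    'Satisfied?': ['4 - good', '4 - good', '2 - bad', '3 - neutral',
                   '4 - good', '2 - bad', '2 - bad', '2 - bad', '3 - neutral', '4 - good'],
})

# normalize='index' divides every row by its row total,
# so the shares are relative to the number of respondents of that gender
shares = pd.crosstab(toy['Gender'], toy['Satisfied?'], normalize='index')
print(shares)
```

These row-wise shares are the values the rescaled bars should end up with.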

Seaborn has a myriad of ways to choose colors. The simplest is to pass a list of named colors. But note that the existing palettes have been designed so that their colors go well together. The Colorbrewer website can be used to experiment and find colors for many situations.
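As a small illustration of the two options, a list of named colors and a ready-made palette can both be passed as the palette argument. The choice of 'RdYlGn' here is only an assumption for the sketch; any palette name known to Seaborn would work the same way.

```python
import seaborn as sns

# Option 1: a plain list of named colors, one per answer
named = ['turquoise', 'tomato', 'deepskyblue', 'gold', 'limegreen']

# Option 2: five colors sampled from a ColorBrewer palette (red to green)
brewer = sns.color_palette('RdYlGn', 5)
print(brewer.as_hex())
```

A diverging palette such as this one may suit an ordered scale that runs from "very bad" to "very good".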

The variable width_scale in the code sets the gaps between the bars. An earlier version of this answer used 0.8, leaving a gap of 0.2. The current example uses 0.6, so the gap is 1.0 - 0.6 = 0.4.
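The recentering arithmetic behind width_scale can be sketched in isolation. The helper shrink_bar and the bar position are hypothetical and only serve to show that the bar's center stays where it was.

```python
def shrink_bar(x, width, width_scale):
    """Return the new (x, width) of a bar shrunk around its center."""
    new_width = width * width_scale
    new_x = x + width * (1 - width_scale) / 2  # shift right by half of the removed width
    return new_x, new_width

x, width = 0.0, 0.16  # hypothetical left edge and width of one bar
new_x, new_width = shrink_bar(x, width, 0.6)
print(new_x, new_width)
```

The center before shrinking is x + width / 2, and after shrinking it is new_x + new_width / 2. Both are equal, so only the gap on either side grows.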

Here is an example:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from matplotlib.ticker import PercentFormatter

N = 500
data = pd.DataFrame({'City': np.random.choice(['Test City', 'Other City'], N),
                     'Gender': np.random.choice(['Male', 'Female'], N, p=[0.3, 0.7]),
                     'Satisfied?': np.random.choice(['1 - very bad', '2 - bad', '3 - neutral', '4 - good', '5 - very good'], N)})
city_data = data[data['City'] == 'Test City']
fig, ax = plt.subplots(figsize=(14, 4))
sns.countplot(data=city_data, x='Gender', order=['Male', 'Female'], ax=ax,
              palette=['turquoise', 'tomato', 'deepskyblue', 'gold', 'limegreen'],
              hue='Satisfied?', hue_order=['1 - very bad', '2 - bad', '3 - neutral', '4 - good', '5 - very good'])

width_scale = 0.6  # the relative width of the bars, 1.0 means bars touching; the gap will be 1-width_scale
for bars in ax.containers:
    for bar, total_per_gender in zip(bars, [sum(city_data['Gender'] == 'Male'), sum(city_data['Gender'] == 'Female')]):
        new_height = bar.get_height() / total_per_gender
        bar.set_height(new_height)
        width = bar.get_width()
        x = bar.get_x()
        bar.set_width(width * width_scale)
        bar.set_x(x + width * (1 - width_scale) / 2)  # recenter
        if np.isnan(new_height):
            new_height = 0
        ax.text(x + width / 2, new_height, f' {new_height * 100:.1f}%\n', ha='center', va='bottom', rotation=90)
ax.set_xlabel('')  # remove superfluous x-label
ax.set_ylabel('')
ax.tick_params(axis='x', length=0, labelsize=14)  # remove tick marks, larger text
ax.yaxis.set_major_formatter(PercentFormatter(1))
ax.grid(axis='y', ls=':', clip_on=False)
sns.despine(fig, ax, top=True, right=True, left=True, bottom=True)
ax.legend(ncol=5, bbox_to_anchor=(0.5, -0.1), loc='upper center', frameon=False, handlelength=1, handleheight=1)
ax.relim()  # recompute the data limits after changing the bar heights
ax.autoscale()  # rescale the axes to the new data limits
ax.margins(y=0.15, x=0.02)  # some space for the text on top of the bars
plt.tight_layout()
plt.show()
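A possible alternative, not part of the approach above, is to compute the shares with pandas first and let matplotlib write the labels. This is only a sketch under stated assumptions: it needs matplotlib 3.4 or later for Axes.bar_label, and the random data is hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import PercentFormatter

levels = ['1 - very bad', '2 - bad', '3 - neutral', '4 - good', '5 - very good']
rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
city_data = pd.DataFrame({'Gender': rng.choice(['Male', 'Female'], 200),
                          'Satisfied?': rng.choice(levels, 200)})

# Share of each answer within each gender (every row sums to 1)
shares = pd.crosstab(city_data['Gender'], city_data['Satisfied?'], normalize='index')
shares = shares.reindex(columns=levels, fill_value=0)  # fixed answer order

fig, ax = plt.subplots(figsize=(10, 4))
shares.plot(kind='bar', ax=ax, width=0.8, rot=0)  # one group of bars per gender
for container in ax.containers:  # one container per answer
    labels = [f'{h * 100:.1f}%' for h in container.datavalues]
    ax.bar_label(container, labels=labels, padding=2)
ax.set_xlabel('')
ax.yaxis.set_major_formatter(PercentFormatter(1))
ax.margins(y=0.15)  # some space for the labels above the bars
fig.tight_layout()
```

Calling plt.show() afterwards displays the figure. Because the heights are already fractions, no bars need to be resized after plotting.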
