Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
704 views
in Technique[技术] by (71.8m points)

python - Plot an histogram with y-axis as percentage (using FuncFormatter?)

I have a list of data in which the numbers are between 1000 and 20 000.

data = [1000, 1000, 5000, 3000, 4000, 16000, 2000]

When I plot a histogram using the hist() function, the y-axis represents the number of occurrences of the values within a bin. Instead of the number of occurrences, I would like to have the percentage of occurrences.

Histogram of the data list above

Code for the above plot:

f, ax = plt.subplots(1, 1, figsize=(10,5))
ax.hist(data, bins = len(list(set(data))))

I've been looking at this post which describes an example using FuncFormatter but I can't figure out how to adapt it to my problem. Some help and guidance would be welcome :)

EDIT: Main issue with the to_percent(y, position) function used by the FuncFormatter. The y corresponds to one given value on the y-axis I guess. I need to divide this value by the total number of elements which I apparently can' t pass to the function...

EDIT 2: Current solution I dislike because of the use of a global variable:

def to_percent(y, position):
    # Ignore the passed in position. This has the effect of scaling the default
    # tick locations.
    global n

    s = str(round(100 * y / n, 3))
    print (y)

    # The percent symbol needs escaping in latex
    if matplotlib.rcParams['text.usetex'] is True:
        return s + r'$\%$'
    else:
        return s + '%'

def plotting_hist(folder, output):
    global n

    data = list()
    # Do stuff to create data from folder

    n = len(data)
    f, ax = plt.subplots(1, 1, figsize=(10,5))
    ax.hist(data, bins = len(list(set(data))), rwidth = 1)

    formatter = FuncFormatter(to_percent)
    plt.gca().yaxis.set_major_formatter(formatter)

    plt.savefig("{}.png".format(output), dpi=500)

EDIT 3: Method with density = True

enter image description here

Actual desired output (method with global variable):

enter image description here

question from:https://stackoverflow.com/questions/51473993/plot-an-histogram-with-y-axis-as-percentage-using-funcformatter

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Other answers seem utterly complicated. A histogram which shows the proportion instead of the absolute amount can easily produced by weighting the data with 1/n, where n is the number of datapoints.

Then a PercentFormatter can be used to show the proportion (e.g. 0.45) as percentage (45%).

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

data = [1000, 1000, 5000, 3000, 4000, 16000, 2000]

plt.hist(data, weights=np.ones(len(data)) / len(data))

plt.gca().yaxis.set_major_formatter(PercentFormatter(1))
plt.show()

enter image description here

Here we see that three of the 7 values are in the first bin, i.e. 3/7=43%.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...