Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Emulating deprecated seaborn distplots

Seaborn's distplot is deprecated and will be removed in a future version. The suggested replacements are histplot (or displot as a figure-level plot). However, the defaults of distplot and histplot differ:

from matplotlib import pyplot as plt
import pandas as pd
import seaborn as sns

x_list = [1, 2, 3, 4, 6, 7, 9, 9, 9, 10]
df = pd.DataFrame({"X": x_list, "Y": range(len(x_list))})

f, (ax_dist, ax_hist) = plt.subplots(2, sharex=True)

sns.distplot(df["X"], ax=ax_dist)
ax_dist.set_title("old distplot")
sns.histplot(data=df, x="X", ax=ax_hist)
ax_hist.set_title("new histplot")

plt.show()

[Figure: "old distplot" (top) and "new histplot" with default settings (bottom)]

So, how should histplot be configured to replicate the output of the deprecated distplot?


1 Reply


Since I spent some time on this, I thought I would share it so that others can easily adapt the approach:

from matplotlib import pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np

x_list = [1, 2, 3, 4, 6, 7, 9, 9, 9, 10]
df = pd.DataFrame({"X": x_list, "Y": range(len(x_list))})

f, (ax_dist, ax_hist) = plt.subplots(2, sharex=True)

sns.distplot(df["X"], ax=ax_dist)
ax_dist.set_title("old distplot")
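# Freedman-Diaconis bin edges; the number of bins is capped at 50
_, FD_bins = np.histogram(x_list, bins="fd")
bin_nr = min(len(FD_bins)-1, 50)
# Density histogram plus a KDE curve that extends beyond the data range
sns.histplot(data=df, x="X", ax=ax_hist, bins=bin_nr, stat="density", alpha=0.4, kde=True, kde_kws={"cut": 3})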
ax_hist.set_title("new histplot")

plt.show()

Sample output:
[Figure: "old distplot" (top) and "new histplot" with the settings above (bottom)]

The main changes are

  • bins=bin_nr - determine the number of histogram bins with the Freedman-Diaconis estimator, capped at 50
  • stat="density" - show density instead of count in the histogram
  • alpha=0.4 - use the same bar transparency as distplot
  • kde=True - add a kernel density plot
  • kde_kws={"cut": 3} - extend the kernel density plot beyond the histogram limits
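To make the effect of stat="density" concrete, here is a minimal NumPy-only sketch that compares raw counts with density heights for the sample data. It is not seaborn's internal code, and the choice of two bins is only for illustration.

```python
import numpy as np

x_list = [1, 2, 3, 4, 6, 7, 9, 9, 9, 10]

# Raw counts per bin, which is what histplot shows by default
counts, edges = np.histogram(x_list, bins=2)

# Density heights: counts divided by (number of samples * bin width)
density, _ = np.histogram(x_list, bins=2, density=True)
widths = np.diff(edges)

print(counts)                    # observations per bin
print(density)                   # bar heights on the density scale
print((density * widths).sum())  # total area of the bars
```

On the density scale the bar areas sum to 1, which is the y-axis scale distplot used whenever it drew a KDE curve.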

Regarding the bin estimation with bins="fd", I am not sure that this is indeed the method used by distplot. Comments and corrections are more than welcome.
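For anyone who wants to check the bin count by hand, the Freedman-Diaconis rule can be sketched as follows. The helper name fd_bin_count and the single-bin fallback for a zero interquartile range are assumptions made for this illustration, not something taken from distplot.

```python
import math
import numpy as np

def fd_bin_count(values, max_bins=50):
    """Freedman-Diaconis bin count, capped at max_bins (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    q75, q25 = np.percentile(values, [75, 25])
    # Freedman-Diaconis bin width: 2 * IQR / n^(1/3)
    width = 2 * (q75 - q25) / len(values) ** (1 / 3)
    if width == 0:
        # Assumption: use a single bin when the IQR is zero
        return 1
    # Number of bins needed to cover the data range, limited to max_bins
    return min(math.ceil((values.max() - values.min()) / width), max_bins)

x_list = [1, 2, 3, 4, 6, 7, 9, 9, 9, 10]
print(fd_bin_count(x_list))
print(len(np.histogram_bin_edges(x_list, bins="fd")) - 1)
```

For this sample, both lines should print the same bin count, so the manual rule and bins="fd" agree here.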

I removed **{"linewidth": 0} because, as @mwaskom pointed out in a comment, distplot draws an edge line around the histogram bars, and matplotlib may set its edgecolor to the default facecolor. So you will have to adjust this according to your style preferences.

