Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
454 views
in Technique[技术] by (71.8m points)

python - The `hue` parameter in Seaborn.relplot() skips an integer when given numerical data?

The hue parameter skips one integer.

d = {'column1':[1,2,3,4,5], 'column2':[2,4,5,2,3], 'cluster':[0,1,2,3,4]}

df = pd.DataFrame(data=d)

sns.relplot(x='column2', y='column1', hue='cluster', data=df)

While all points are plotted, the cluster label is missing '2'.

Python 2.7 Seaborn 0.9.0 Ubuntu 16.04 LTS

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

"Full" legend

If the hue is in numeric format, seaborn will assume that it represents some continuous quantity and will decide to display what it thinks is a representative sample along the color dimension.

You can circumvent this by using legend="full".

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.DataFrame({'column1':[1,2,3,4,5], 'column2':[2,4,5,2,3], 'cluster':[0,1,2,3,4]})
sns.relplot(x='column2', y='column1', hue='cluster', data=df, legend="full")
plt.show()

enter image description here

Categoricals

An alternative is to make sure the values are treated categorical Unfortunately, even if you plug in the numbers as strings, they will be converted to numbers falling back to the same mechanism described above. This may be seen as a bug.

However, one choice you have is to use real categories, like e.g. single letters.

'cluster':list("ABCDE")

works fine,

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

d = {'column1':[1,2,3,4,5], 'column2':[2,4,5,2,3], 'cluster':list("ABCDE")}

df = pd.DataFrame(data=d)

sns.relplot(x='column2', y='column1', hue='cluster', data=df)

plt.show()

enter image description here

Strings with customized palette

An alternative to the above is to use numbers converted to strings, and then make sure to use a custom palette with as many colors as there are unique hues.

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

d = {'column1':[1,2,3,4,5], 'column2':[2,4,5,2,3], 'cluster':[1,2,3,4,5]}

df = pd.DataFrame(data=d)
df["cluster"] = df["cluster"].astype(str)

sns.relplot(x='column2', y='column1', hue='cluster', data=df, 
            palette=["b", "g", "r", "indigo", "k"])

plt.show()

enter image description here


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...