Set up a sample DataFrame:
import pandas as pd
df = pd.DataFrame({'word': ['how', 'are', 'you', 'doing', 'this', 'afternoon'],
'count': [7, 10, 4, 1, 20, 100]})
        word  count
0        how      7
1        are     10
2        you      4
3      doing      1
4       this     20
5  afternoon    100
Convert the word and count columns to a dict:
WordCloud().generate_from_frequencies() requires a dict that maps each word to its frequency.
- Use one of the following methods.
# method 1: zip the two columns together and build a dict
data = dict(zip(df['word'].tolist(), df['count'].tolist()))
# method 2: index by word, then take the count column as a dict
data = df.set_index('word').to_dict()['count']
print(data)
[out]: {'how': 7, 'are': 10, 'you': 4, 'doing': 1, 'this': 20, 'afternoon': 100}
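Both methods assume each word appears only once in the word column. A dict holds one value per key, so repeated words would collapse into a single entry. A minimal sketch of one way to handle that, assuming the counts for a repeated word should be added together (df_dup and data_dup are hypothetical names used only for illustration):

```python
import pandas as pd

# Hypothetical data in which 'how' appears twice
df_dup = pd.DataFrame({'word': ['how', 'are', 'how'],
                       'count': [7, 10, 3]})

# Sum the counts per word, then convert the resulting Series to a dict
data_dup = df_dup.groupby('word')['count'].sum().to_dict()
print(data_dup)
```

The resulting dict can be passed to generate_from_frequencies() in the same way as data.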
Word cloud:
from wordcloud import WordCloud
wc = WordCloud(width=800, height=400, max_words=200).generate_from_frequencies(data)
Plot
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 10))
plt.imshow(wc, interpolation='bilinear')
plt.axis('off')
plt.show()
Using an image mask:
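import numpy as np
from PIL import Image
# load the mask image as an array; pure white areas of the mask are left empty
twitter_mask = np.array(Image.open('twitter.png'))
# with a mask, the cloud takes the shape of the mask, so width and height are not needed
wc = WordCloud(background_color='white', max_words=200, mask=twitter_mask).generate_from_frequencies(data)
# show the word cloud
plt.figure(figsize=(10, 10))
plt.imshow(wc, interpolation='bilinear')
plt.axis('off')
# show the mask itself for comparison
plt.figure()
plt.imshow(twitter_mask, cmap=plt.cm.gray, interpolation='bilinear')
plt.axis('off')
plt.show()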
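If no mask image is at hand, a mask can also be built directly as an array. The sketch below is only an illustration: make_circle_mask is a hypothetical helper, and it relies on the wordcloud convention that pure white (255) entries are masked out while all other entries are free to draw on.

```python
import numpy as np

def make_circle_mask(size=300, margin=20):
    """Return a size x size array: 0 inside a centered circle, 255 outside."""
    center = size // 2
    radius = center - margin
    # Open grids of row and column indices that broadcast to size x size
    rows, cols = np.ogrid[:size, :size]
    outside = (rows - center) ** 2 + (cols - center) ** 2 > radius ** 2
    # 255 (white) marks the masked-out area; 0 marks the drawable area
    return 255 * outside.astype(int)

circle_mask = make_circle_mask()
```

The array can then be passed as mask=circle_mask when constructing WordCloud, in place of twitter_mask.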