python - Get word frequency of pandas column containing lists of strings

Question

posted Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have a pandas dataframe:

import pandas as pd
test = pd.DataFrame({'words':[['foo','bar none','scare','bar','foo'],
                              ['race','bar none','scare'],
                              ['ten','scare','crow bird']]})

I'm trying to get a word/phrase count of all the list elements in the dataframe column. My current solution is:

from collections import Counter

# Collect every list element from every row into one flat list
allwords = []

for index, row in test.iterrows():
    for word in row['words']:
        allwords.append(word)

pd.Series(Counter(allwords)).sort_values(ascending=False)

This works, but I was wondering if there was a faster solution. Note: I'm not using ' '.join() because I don't want the phrases to be split into individual words.

question from:https://stackoverflow.com/questions/65844350/get-word-frequency-of-pandas-column-containing-lists-of-strings

1 Reply

深蓝 · Answer 1 · 2021-10-06T19:30:45+0000

To improve performance, don't use iterrows:

from collections import Counter
from itertools import chain

# Flatten the lists lazily, count the elements, then sort by count
a = pd.Series(Counter(chain.from_iterable(test['words']))).sort_values(ascending=False)
print(a)
scare        3
foo          2
bar none     2
bar          1
race         1
ten          1
crow bird    1
dtype: int64
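
To see why this keeps phrases intact, here is a minimal sketch using plain nested lists with the same data as the question. The names `rows`, `counts` and `top` are just illustrative and not from the original post. `chain.from_iterable` yields each list element whole, so 'bar none' is counted as one item, and `Counter.most_common` gives the sorted counts without going through pandas at all:

```python
from collections import Counter
from itertools import chain

# Same data as the question's 'words' column, as plain nested lists
# (hypothetical variable name, used here only for illustration).
rows = [['foo', 'bar none', 'scare', 'bar', 'foo'],
        ['race', 'bar none', 'scare'],
        ['ten', 'scare', 'crow bird']]

# chain.from_iterable yields each list element unchanged, so multi-word
# phrases such as 'bar none' are counted as single items.
counts = Counter(chain.from_iterable(rows))

# most_common(n) returns (item, count) pairs sorted by descending count.
top = counts.most_common(3)
print(top)
```

If you only need the top few items, this may be enough on its own; wrapping the Counter in a pd.Series is only needed when you want a pandas object back.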

Pandas-only solution:

# Flatten with a nested list comprehension, then let pandas count
a = pd.Series([y for x in test['words'] for y in x]).value_counts()
print(a)
scare        3
bar none     2
foo          2
bar          1
race         1
crow bird    1
ten          1
dtype: int64
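
Another pandas-only variant, not part of the original answer, is `Series.explode`, which puts each list element on its own row before counting. This is a sketch that assumes pandas 0.25 or later (where `explode` was added). I have not benchmarked it against the versions above, so treat it as a readability option rather than a guaranteed speedup:

```python
import pandas as pd

test = pd.DataFrame({'words': [['foo', 'bar none', 'scare', 'bar', 'foo'],
                               ['race', 'bar none', 'scare'],
                               ['ten', 'scare', 'crow bird']]})

# explode() expands every list into one row per element (phrases stay whole);
# value_counts() then counts them and sorts by descending frequency.
counts = test['words'].explode().value_counts()
print(counts)
```

As with `value_counts` above, the order of items with equal counts is not something to rely on.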
