Dataframe:
Code:
def word_count(sentence):
    return len(sentence.split())

df['word_count'] = df['PMID'].apply(word_count)
df.tail(10)
You can use scikit-learn's TfidfVectorizer.
from sklearn.feature_extraction.text import TfidfVectorizer

v = TfidfVectorizer()
x = v.fit_transform(df['PMID'])