I'm working with pandas and trying to summarise a data frame as a count per category, together with the mean sentiment score for each category.
I have a table of strings, each with a sentiment score, and I want to group by text source to get how many posts each source has, as well as the average sentiment of those posts.
My (simplified) data frame looks like this:
source  text         sent
--------------------------
bar     some string   0.13
foo     alt string   -0.8
bar     another str   0.7
foo     some text    -0.2
foo     more text    -0.5
The output from this should be something like this:
source  count  mean_sent
------------------------
foo     3      -0.5
bar     2       0.415
The answer is somewhere along the lines of:
df['sent'].groupby(df['source']).mean()
However, this only gives each source and its mean, with no column headers.
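One possible way to get both columns in a single pass is named aggregation on a groupby, sketched below. This assumes pandas 0.25 or newer (where the keyword form of `agg` is available) and rebuilds the sample frame from the table above; the variable name `summary` is just illustrative. Note that "count" counts non-missing `sent` values, so rows with a missing score would not be included in the count.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "source": ["bar", "foo", "bar", "foo", "foo"],
    "text": ["some string", "alt string", "another str", "some text", "more text"],
    "sent": [0.13, -0.8, 0.7, -0.2, -0.5],
})

summary = (
    df.groupby("source")
      # Named aggregation: output_column=(input_column, aggregation).
      .agg(count=("sent", "count"), mean_sent=("sent", "mean"))
      # Largest sources first, to match the desired output order.
      .sort_values("count", ascending=False)
      # Turn the group key back into an ordinary column with a header.
      .reset_index()
)

print(summary)
```

The `reset_index()` call is what gives `source` a regular column header; without it, `source` stays as the index of the result.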