Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


How to visualize hashtags in R, and see the trends of the hashtags?

I'm doing trend analysis and trying to use bar charts to visualize the frequencies of hashtags in different years, so that I can see the top 3 most frequent hashtag terms and how their frequencies evolve over the years. I have a dataset like this:

    terms          year
1   #A;#B;#C       2017
2   #B;#C;#D       2016
3   #C;#D;#E       2021
4   #D;#E;#F       2020
5   #E;#F;#G       2020
6   #F;#G;#H       2020
7   #G;#H;#I       2019
8   #H;#I;#J       2018
9   #I;#J;#K       2020
10  #J;#K;#L       2020

thanks!

question from:https://stackoverflow.com/questions/66060347/how-to-visualize-hashtags-in-r-and-see-the-trends-of-the-hashtags


1 Reply


Basically, we need to count each hashtag for every year. Since all the hashtags of a row are stored in a single column, we first separate them into different columns and then convert the data frame to long format, where we can group by year and hashtag to get the counts.
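To make the reshaping step concrete, here is a minimal sketch of the counting part alone, run on three rows taken from the question's data. Note that it uses `tidyr::separate_rows()` to go straight to one row per hashtag, which is a swapped-in shortcut rather than the separate-then-pivot route described above; the names `toy` and `tag_counts` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Three rows from the question's data (toy subset, for illustration only)
toy <- tibble(
  terms = c("#D;#E;#F", "#E;#F;#G", "#G;#H;#I"),
  year  = c(2020, 2020, 2019)
)

# One row per (year, hashtag), then count occurrences per year
tag_counts <- toy %>%
  separate_rows(terms, sep = ";") %>%
  count(year, terms, name = "n")

print(tag_counts)
```

A side benefit of `separate_rows()` is that it should also cope with rows that carry a different number of hashtags, whereas splitting into a fixed set of columns assumes exactly three per row.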

library(tidyverse)

# Sample data from the question
df <- structure(
  list(
    terms = c("#A;#B;#C", "#B;#C;#D", "#C;#D;#E", "#D;#E;#F", "#E;#F;#G",
              "#F;#G;#H", "#G;#H;#I", "#H;#I;#J", "#I;#J;#K", "#J;#K;#L"),
    year = c(2017, 2016, 2021, 2020, 2020, 2020, 2019, 2018, 2020, 2020)
  ),
  row.names = c(NA, -10L),
  class = c("tbl_df", "tbl", "data.frame")
)

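df %>%
  # Split the ";"-separated hashtags into three columns
  separate(terms, into = paste0("t", 1:3), sep = ";") %>%
  # Reshape to long format: one row per (year, hashtag)
  pivot_longer(-year) %>%
  # Count how often each hashtag occurs in each year
  count(year, value) %>%
  ggplot(aes(x = year, y = n, fill = value, label = n)) +
  # Use the same dodge width for bars and labels so they line up
  geom_col(position = position_dodge(width = 0.9)) +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.3)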

Created on 2021-02-05 by the reprex package (v0.3.0)
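The question also asks about the top 3 hashtags specifically, which the plot above does not single out. One possible way to do that, as a rough sketch: rank hashtags by their overall count, keep three, and plot only their yearly counts. The names `long`, `top3`, `trend` and `p` are made up here, and `separate_rows()` is again a substitute for the separate-then-pivot step. In the sample data eight hashtags are tied at 3 occurrences, so which three get picked is arbitrary; with real data the ranking should be more meaningful.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Sample data from the question
df <- tibble(
  terms = c("#A;#B;#C", "#B;#C;#D", "#C;#D;#E", "#D;#E;#F", "#E;#F;#G",
            "#F;#G;#H", "#G;#H;#I", "#H;#I;#J", "#I;#J;#K", "#J;#K;#L"),
  year = c(2017, 2016, 2021, 2020, 2020, 2020, 2019, 2018, 2020, 2020)
)

# One row per (year, hashtag)
long <- df %>% separate_rows(terms, sep = ";")

# Overall top 3 hashtags; with_ties = FALSE caps the result at exactly 3
top3 <- long %>%
  count(terms, name = "total") %>%
  slice_max(total, n = 3, with_ties = FALSE) %>%
  pull(terms)

# Yearly counts for just those hashtags
trend <- long %>%
  filter(terms %in% top3) %>%
  count(year, terms, name = "n")

# Dodged bars per year; preserve = "single" keeps bar widths equal
# in years where only some of the top hashtags appear
p <- ggplot(trend, aes(x = factor(year), y = n, fill = terms)) +
  geom_col(position = position_dodge(preserve = "single")) +
  labs(x = "year", y = "count", fill = "hashtag")

print(p)
```

If the goal is the top 3 within each year rather than overall, grouping by `year` before `slice_max()` would be the natural variation.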

