If you want to work with R and Twitter, take a look at the twitteR package. It doesn't have a function to retrieve the information you want, but we can take advantage of its internal functions to handle OAuth and then send the correct API call ourselves. The advantage of using API calls is that you don't rely on parsing the HTML page; you're doing what developers are supposed to do.
The code below assumes you have already authenticated using setup_twitter_oauth(). Tutorials on this are easy to find, since it's the package basics. Once authenticated, let's load the packages we need:
library(rjson)
library(httr)
# library(twitteR) Should have been loaded already of course
Now to do the API call, we'll use GET, since lists/members is a GET endpoint. The URL has a slug parameter, which is the Twitter list name, and an owner_screen_name parameter, which is the account that owns the list. We'll use the internal twitteR:::get_oauth_sig() to authenticate the call.
twlist <- "premier-league-players"
twowner <- "TwitterUK"
api.url <- paste0("https://api.twitter.com/1.1/lists/members.json?slug=",
                  twlist, "&owner_screen_name=", twowner, "&count=5000")
response <- GET(api.url, config(token=twitteR:::get_oauth_sig()))
# count = 5000 is the number of members per result page,
# which in this case fits everything on one page.
This returns a JSON response, which we can read using fromJSON:
response.list <- fromJSON(content(response, as = "text", encoding = "UTF-8"))
Now we have a list whose users element holds the Twitter data of each list member, one entry per member. To extract their names and screen names:
users.names <- sapply(response.list$users, function(i) i$name)
users.screennames <- sapply(response.list$users, function(i) i$screen_name)
Which are:
> head(users.names)
[1] "Peter Crouch"         "barry bannan"         "Jose Leonardo Ulloa "
[4] "Paul McShane"         "nacho monreal"        "James Ward-Prowse"
> head(users.screennames)
[1] "petercrouch"   "bazzabannan25" "Ciclone1923"   "pmacca15"
[5] "_nachomonreal" "Prowsey16"
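If a table is handier than two parallel vectors, the same extraction can be written with vapply() and collected into a data frame. This is only a sketch: the users list below is a small hand-made stand-in for response.list$users (the real entries carry many more fields), and the members name is my own choice.

```r
# Stand-in for response.list$users; each real entry has many more fields.
users <- list(
  list(name = "Peter Crouch", screen_name = "petercrouch"),
  list(name = "barry bannan", screen_name = "bazzabannan25")
)

# vapply() insists that every element yields exactly one string,
# so a missing field fails loudly instead of silently returning a list.
members <- data.frame(
  name        = vapply(users, function(u) u$name, character(1)),
  screen_name = vapply(users, function(u) u$screen_name, character(1)),
  stringsAsFactors = FALSE
)
members
```

With the real response you would pass response.list$users instead of the stand-in list.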
The best part of this code is that it opens up pretty much the entire Twitter API from R, as an already authenticated request. You can inspect the response list and its sublists to see all the information each query returns.
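The call above relies on count=5000 fitting the whole list on one page. For longer lists, the API's cursored responses include a next_cursor_str field that you pass back as the cursor parameter until it comes back as "0". Below is a rough sketch of that loop. collect_members and fetch_page are hypothetical names of mine, not part of twitteR or httr, and the fake pages exist only so the loop can run offline.

```r
# Collect the users from every cursored page.
# fetch_page is a hypothetical function(cursor) that returns one parsed page:
# a list with $users and $next_cursor_str.
collect_members <- function(fetch_page, max_pages = 100) {
  users <- list()
  cursor <- "-1"  # "-1" asks for the first page
  for (i in seq_len(max_pages)) {
    page <- fetch_page(cursor)
    users <- c(users, page$users)
    cursor <- page$next_cursor_str
    # "0" (or a missing cursor) means there are no further pages
    if (is.null(cursor) || cursor == "0") break
  }
  users
}

# Offline stand-in for the API: two fake pages keyed by cursor value.
fake_pages <- list(
  "-1" = list(users = list(list(screen_name = "a"), list(screen_name = "b")),
              next_cursor_str = "42"),
  "42" = list(users = list(list(screen_name = "c")),
              next_cursor_str = "0")
)
all_users <- collect_members(function(cursor) fake_pages[[cursor]])
length(all_users)
```

Against the real API, fetch_page would presumably append "&cursor=" and the cursor value to api.url, make the authenticated request as before, and return the fromJSON() result. The max_pages cap is just a guard against looping forever if the cursor never reaches "0".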