I have been using readLines() to scrape information from a website, following an R tutorial. I now want to extract data from my own website (specifically the AWStats data), but the domain is password protected.
Is there a way to pass the URL for the specific AWStats data I need together with a username and password?
The format of the URL is:
http://domain.name:port/awstats.pl?month=02&year=2011&config=domain.name&lang=en&framename=mainright&output=alldomains
Thanks.
If it is indeed HTTP basic access authentication, the documentation on connections provides some help:
URLs
Note that https:// connections are only supported if --internet2 or setInternet2(TRUE) was used (to make use of Internet Explorer internals), and then only if the certificate is considered to be valid. With that option only, the http://user:pass@site notation for sites requiring authentication is also accepted.
So your URL string should look like this:
http://username:password@domain.name:port/awstats.pl?month=02&year=2011&config=domain.name&lang=en&framename=mainright&output=alldomains
This might be Windows-only, though, since the option relies on Internet Explorer internals.
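As a rough sketch of the user:pass@site notation quoted from the documentation, the URL could be assembled in R like this. The helper name build_auth_url and all argument values are placeholders I made up for illustration. Percent-encoding the credentials is my own assumption, in case the password contains characters such as "@" or ":". Whether readLines() accepts the resulting URL still depends on the platform and on the option mentioned in the documentation.

```r
# Hypothetical helper: build a URL with HTTP basic-auth credentials embedded
# in the user:pass@host form. All values used below are placeholders.
build_auth_url <- function(user, password, host, port, path, query) {
  # Percent-encode the credentials so characters like "@" or ":" in them
  # cannot be confused with the URL's own separators.
  cred <- paste0(URLencode(user, reserved = TRUE), ":",
                 URLencode(password, reserved = TRUE))
  # Join the query parameters as key=value pairs separated by "&".
  qs <- paste0(names(query), "=", unlist(query), collapse = "&")
  paste0("http://", cred, "@", host, ":", port, path, "?", qs)
}

awstats_url <- build_auth_url(
  user = "username", password = "password",
  host = "domain.name", port = "port",
  path = "/awstats.pl",
  query = list(month = "02", year = "2011", config = "domain.name",
               lang = "en", framename = "mainright", output = "alldomains")
)

# Uncomment to fetch the page once real values are filled in:
# page <- readLines(awstats_url)
```

Keeping the query parameters in a named list makes it easy to change month or year without editing the string by hand.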
Hope this helps!