I'm trying to log in to https://www.voxbeam.com/login using requests so that I can scrape data. I'm a Python beginner: I have mostly done tutorials, plus some web scraping on my own with BeautifulSoup.
Looking at the HTML:
<form id="loginForm" action="https://www.voxbeam.com//login" method="post" autocomplete="off">
<input name="userName" id="userName" class="text auto_focus" placeholder="Username" autocomplete="off" type="text">
<input name="password" id="password" class="password" placeholder="Password" autocomplete="off" type="password">
<input id="challenge" name="challenge" value="78ed64f09c5bcf53ead08d967482bfac" type="hidden">
<input id="hash" name="hash" type="hidden">
I understand I should be using the POST method and sending userName and password.
I'm trying this:
import requests
import webbrowser
url = "https://www.voxbeam.com/login"
login = {'userName': 'xxxxxxxxx',
         'password': 'yyyyyyyyy'}
print("Original URL:", url)
r = requests.post(url, data=login)
print("\nNew URL", r.url)
print("Status Code:", r.status_code)
print("History:", r.history)
print("\nRedirection:")
for i in r.history:
    print(i.status_code, i.url)
# Open r in the browser to check if I logged in
new = 2 # open in a new tab, if possible
webbrowser.open(r.url, new=new)
After a successful login, I expect r.url to be the dashboard URL, so I can begin scraping the data I need.
When I run the code with the authentication information in place of xxxxxx and yyyyyy, I get the following output:
Original URL: https://www.voxbeam.com/login
New URL https://www.voxbeam.com/login
Status Code: 200
History: []
Redirection:
Process finished with exit code 0
In the browser I get a new tab showing www.voxbeam.com/login.
Is there something wrong in the code?
Am I missing something in the HTML?
Is it OK to expect the dashboard URL in r (or at least a redirect), and to open that URL in a browser tab to check the response visually? Or should I be doing things differently?
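As far as I understand, the browser tab opened by webbrowser does not share cookies with requests, so it would show the login page whether or not the POST worked. A rough check on the response itself might be more telling. This is only a heuristic sketch: login_status is a name I made up, and I am assuming that a failed login returns the page containing the loginForm form again. The fake object just mimics the output I got above so the check can be run without touching the site.

```python
from types import SimpleNamespace


def login_status(response, login_url):
    """Guess from a requests-style response whether the login worked.

    Heuristic only: assumes a failed login serves the page that
    contains the form with id="loginForm" again.
    """
    redirected = bool(response.history)
    on_login_url = response.url.rstrip("/") == login_url.rstrip("/")
    has_form = 'id="loginForm"' in response.text
    if has_form or (on_login_url and not redirected):
        return "probably not logged in"
    return "possibly logged in"


# Stand-in for the response I got: same URL, empty history, form present.
fake = SimpleNamespace(
    url="https://www.voxbeam.com/login",
    history=[],
    text='<form id="loginForm" method="post"></form>',
)
print(login_status(fake, "https://www.voxbeam.com/login"))
```

With a real response, the call would be login_status(r, url) right after requests.post.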
I have been reading many similar questions here for a couple of days, but it seems every website's authentication process is a little different. I also checked http://docs.python-requests.org/en/latest/user/authentication/, which describes other methods, but I haven't found anything in the HTML that suggests I should be using one of those instead of POST.
I also tried:
r = requests.get(url, auth=('xxxxxxxx', 'yyyyyyyy'))
but it doesn’t seem to work either.
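One thing I have not tried yet is sending the hidden challenge and hash fields together with the credentials, using a requests.Session so that cookies from the first GET are kept. Below is a minimal sketch of that idea, not something I know to work on this site. The names FormInputParser, build_payload and login are my own, and I am assuming the challenge value changes per page load and has to be read from a fresh copy of the form. The hash field is empty in the HTML, so it is presumably filled in by JavaScript before submitting; I don't know how, and this sketch just sends it empty.

```python
from html.parser import HTMLParser


class FormInputParser(HTMLParser):
    """Collect name/value pairs of <input> tags inside one form."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("name"):
            # Inputs without a value attribute are sent as empty strings.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def build_payload(login_html, username, password):
    """Return the form fields to POST, hidden inputs included."""
    parser = FormInputParser("loginForm")
    parser.feed(login_html)
    payload = dict(parser.fields)
    payload["userName"] = username
    payload["password"] = password
    return payload


def login(url, username, password):
    """GET the form, then POST it back through the same session."""
    # Imported here so the parsing helpers work without requests installed.
    import requests

    session = requests.Session()
    page = session.get(url)
    payload = build_payload(page.text, username, password)
    response = session.post(url, data=payload)
    # Keep the session: later requests need the cookies it holds.
    return session, response
```

If this is the right direction, I would call session, r = login(url, 'xxxxxxxxx', 'yyyyyyyyy') and keep using session.get(...) for the pages I want to scrape.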