Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

authentication - Webscraping in Python a page with login and redirection

I am trying to log in to a financial service of which I am a customer, in order to retrieve some data automatically using Python requests.

I have been inspired by this page:

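import requests
from lxml import html  # needed for html.fromstring below
from typing import Dict

# Placeholder: the real site name is obfuscated in this question.
URL = "https://somewebsite.com/scripts/customer.cgi"

def get_payload(username: str, password: str) -> Dict[str, str]:
    """Return the form fields sent with the login request."""
    return {
        "USERNAME": username, "PASSWORD": password, "option": "login"
    }

# A session keeps cookies between requests.
session_requests = requests.Session()
result_login = session_requests.post(
    URL,
    data=get_payload("myusername", "MyPasswordSuperSafe"),
    headers={"referer": URL},
)

# Parse the HTML returned by the login request.
tree = html.fromstring(result_login.text)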

I am able to submit the username and password. However, the site appears to use some kind of security mechanism: after the login request it serves an automatic redirection page (see screenshot).

[Screenshot: redirection page]

In a web browser, this page automatically redirects to the page shown after a successful login.

However, I don't know how to handle it in my Python web-scraping program, which ends in a timeout.

For reference, this is the source of the redirection page (with the name of the website obfuscated):

<!DOCTYPE html>
<!-- saved from url=(0059)https://somewebsite.com/scripts/customer.cgi?option=login -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
        <script language="JavaScript">
            function redirect() {
                top.location.href = 'https://somewebsite.com/scripts/customer.cgi/SC/';
            }
        </script>

<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="author" content="RL360">
<meta name="copyright" content="RL360">


<link href="./Online services redirection_files/screen.css" rel="styleSheet" media="screen">
<link href="./Online services redirection_files/print.css" rel="styleSheet" media="print">

<!--[if lte IE 8]>
<link href="https://somewebsite.com/scripts/customer.cgi/SF/stylesheets/desktop/ie8fix.css"  rel="stylesheet"  type="text/css" />
<![endif]-->
<script>
        function setCookie(cname, cvalue, exdays, path) {
            var d = new Date();
            d.setTime(d.getTime() + (exdays * 24 * 60 * 60 * 1000));
            var expires = "expires="+d.toUTCString();
            document.cookie = cname + "=" + cvalue + ";" + expires + ";path=" + path;
        }
        function getCookie(cname) {
            var name = cname + "=";
            var ca = document.cookie.split(';');
            for(var i = 0; i < ca.length; i++) {
                var c = ca[i];
                while (c.charAt(0) == ' ') {
                    c = c.substring(1);
                 }
                if (c.indexOf(name)  == 0) {
                    return c.substring(name.length, c.length);
                 }
            }
            return "";
        }
</script>

<title>Online services redirection</title>

<link href="./Online services redirection_files/css" rel="stylesheet"></head><span id="warning-container"><i data-reactroot=""></i></span>
<body onload="redirect();" style="background-color: #ffffff;">

<div id="mainarea">
    <div id="title"></div>

    <!-- main content -->
    <form action="https://somewebsite.com/scripts/customer.cgi/SC/" name="redirform" method="POST">
    <div class="level1" style="width: 700px; margin-left: 123px; height: auto;"><h2>Online services redirection</h2>

    <p><a href="https://somewebsite.com/scripts/customer.cgi/SC/" target="_top">Attempting to redirect, please click here if nothing happens after 30 seconds.</a></p>
    </div>
    </form>
</div>


</body></html>
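
In the page above, the redirect is triggered by JavaScript (`onload="redirect();"` sets `top.location.href`), and requests does not execute JavaScript, so nothing follows it automatically. As a minimal sketch, the target URL could be read out of the page source instead. The names `JS_REDIRECT`, `_FormActionFinder` and `find_redirect_target` are hypothetical helpers, not part of any library, and the regular expression assumes the target is written as a quoted string literal, as it is in this page.

```python
import re
from html.parser import HTMLParser
from typing import Optional

# Matches assignments such as: top.location.href = 'https://...';
JS_REDIRECT = re.compile(r"""location(?:\.href)?\s*=\s*['"]([^'"]+)['"]""")


class _FormActionFinder(HTMLParser):
    """Remember the action attribute of the first <form> in the page."""

    def __init__(self) -> None:
        super().__init__()
        self.action: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "form" and self.action is None:
            self.action = dict(attrs).get("action")


def find_redirect_target(page_source: str) -> Optional[str]:
    """Return the URL the page would send a browser to, or None."""
    match = JS_REDIRECT.search(page_source)
    if match:
        return match.group(1)
    # Fallback (an assumption): use the action of the first form, if any.
    finder = _FormActionFinder()
    finder.feed(page_source)
    return finder.action


# Trimmed copy of the redirection page from the question.
sample = """
<script language="JavaScript">
    function redirect() {
        top.location.href = 'https://somewebsite.com/scripts/customer.cgi/SC/';
    }
</script>
<body onload="redirect();">
<form action="https://somewebsite.com/scripts/customer.cgi/SC/" name="redirform" method="POST"></form>
</body>
"""
print(find_redirect_target(sample))
```

Only the standard library is used here; the same lookup could be done with lxml or BeautifulSoup if one of them is already in the project.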

How could I deal with this redirection?

I'm open to using requests, mechanize, BeautifulSoup, or any other solution (but would prefer to avoid Selenium if possible).
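
One possible way to stay with requests is to perform the redirect by hand: post the credentials, look for the JavaScript redirect in the response, and request that URL with the same session so the login cookies are sent along. The sketch below is untested against the real site. `login_and_follow` is a hypothetical helper, `session` is assumed to behave like a `requests.Session` (only `post` and `get` are used), and the 30-second timeout is an arbitrary choice.

```python
import re
from urllib.parse import urljoin

# Matches assignments such as: top.location.href = 'https://...';
JS_REDIRECT = re.compile(r"""location(?:\.href)?\s*=\s*['"]([^'"]+)['"]""")


def login_and_follow(session, login_url, payload, timeout=30):
    """POST the credentials, then GET the page the JavaScript redirect points to.

    `session` is expected to behave like requests.Session (post/get methods).
    """
    response = session.post(
        login_url,
        data=payload,
        headers={"referer": login_url},
        timeout=timeout,
    )
    match = JS_REDIRECT.search(response.text)
    if match is None:
        # No JavaScript redirect found: hand back the login response unchanged.
        return response
    # Resolve relative targets against the URL that was posted to.
    target = urljoin(login_url, match.group(1))
    # Same session, so cookies set during login are sent along.
    return session.get(target, headers={"referer": login_url}, timeout=timeout)
```

With requests, this would be called as `login_and_follow(requests.Session(), URL, get_payload(...))`. The script in the redirection page performs a plain navigation, which is a GET, so the sketch does the same and does not submit the `redirform` form. Whether this is enough for this particular site is not known from the page source alone.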

Question from: https://stackoverflow.com/questions/65872784/webscraping-in-python-a-page-with-login-and-redirection
