Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How can I keep a PyQt5 stream open to catch dojo/domReady! JS execution?

I am using the example code below to scrape a website. The problem is that the website has code that runs behind "dojo/domReady!", so the code below finishes and scrapes the HTML before the rest of the page content has been adjusted and finalized.

Can anybody help me adjust the code below so that it waits 10 seconds after the page connects before grabbing the HTML as it exists at that point? I want to wait an arbitrary amount of time so that any or all of the content can keep rendering after the initial page load.

Example:

import sys

import bs4 as bs
from PyQt5.QtCore import QUrl
from PyQt5.QtWebEngineWidgets import QWebEnginePage
from PyQt5.QtWidgets import QApplication


class Page(QWebEnginePage):
    def __init__(self, url):
        # A QApplication must exist before the page object is initialised.
        self.app = QApplication(sys.argv)
        QWebEnginePage.__init__(self)
        self.html = ''
        self.loadFinished.connect(self._on_load_finished)
        self.load(QUrl(url))
        # Blocks here until Callable() calls self.app.quit().
        self.app.exec_()

    def _on_load_finished(self):
        # toHtml() is asynchronous and returns None; the HTML arrives in the
        # callback, so its return value must not be assigned to self.html.
        self.toHtml(self.Callable)
        print('Load finished')

    def Callable(self, html_str):
        # Store the HTML and stop the event loop so __init__ can return.
        self.html = html_str
        self.app.quit()


def main():
    page = Page('some_website')
    soup = bs.BeautifulSoup(page.html, 'html.parser')
    print(soup)


if __name__ == '__main__':
    main()
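
A minimal sketch of one way to get the delay described above, not a confirmed fix for this particular site. It assumes the page's deferred scripts finish within the chosen delay, and it starts a `QTimer.singleShot` from the `loadFinished` handler so that `toHtml()` is only called once the delay has elapsed. The names `render_html`, `delay_to_ms` and `delay_s` are hypothetical and do not appear in the original code.

```python
import sys


def delay_to_ms(seconds):
    """Convert a delay in seconds to the whole milliseconds QTimer expects."""
    if seconds < 0:
        raise ValueError("delay must be non-negative")
    return int(round(seconds * 1000))


def render_html(url, delay_s=10.0):
    """Load url, wait delay_s seconds after loadFinished, return the HTML.

    Hypothetical helper: a fixed wait is an assumption, not a guarantee
    that every script on the page has finished running.
    """
    # PyQt5 is imported here so delay_to_ms() can be used without it.
    # QtWebEngineWidgets has to be imported before the QApplication exists.
    from PyQt5.QtCore import QTimer, QUrl
    from PyQt5.QtWebEngineWidgets import QWebEnginePage
    from PyQt5.QtWidgets import QApplication

    app = QApplication.instance() or QApplication(sys.argv)
    page = QWebEnginePage()
    result = {"html": ""}

    def store_html(html):
        # toHtml() delivers the document asynchronously to this callback.
        result["html"] = html
        app.quit()

    def grab_html():
        page.toHtml(store_html)

    def on_load_finished(ok):
        if not ok:
            # The load failed, so there is nothing worth waiting for.
            app.quit()
            return
        # The event loop keeps running during the delay, so scripts that
        # run after the initial load still get a chance to execute.
        QTimer.singleShot(delay_to_ms(delay_s), grab_html)

    page.loadFinished.connect(on_load_finished)
    page.load(QUrl(url))
    app.exec_()  # blocks until store_html() or a failed load calls quit()
    return result["html"]
```

With this sketch, `render_html('some_website')` would take the place of `Page('some_website').html` in `main()`, and the returned string can be passed to `bs.BeautifulSoup` as before. A timer is used rather than `time.sleep()` because sleeping would block the Qt event loop and stop the page from running its scripts during the wait.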
Question from: https://stackoverflow.com/questions/65713943/how-can-i-keep-a-pyqt5-stream-open-to-catch-dojo-domready-js-execution

1 Reply

Waiting for answers
