I'm using Qt's QWebPage to render a page that uses javascript to update its content dynamically - so a library that just downloads a static version of the page (such as urllib2) won't work.
My problem is, when I render a second page, about 99% of the time the program just crashes. At other times, it will work three times before crashing. I've also gotten a few segfaults, but it is all very random.
My guess is the object I'm using to render isn't getting deleted properly, so trying to reuse it is possibly causing some problems for myself. I've looked all over and no one really seems to be having this same issue.
Here's the code I'm using. The program downloads web pages from steam's community market so I can create a database of all the items. I need to call the getItemsFromPage
function multiple times to get all of the items, as they are broken up into pages (showing results 1-10 out of X amount).
import csv
import re
import sys
from string import replace
from bs4 import BeautifulSoup
from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt4.QtWebKit import *
class Item:
__slots__ = ("name", "count", "price", "game")
def __repr__(self):
return self.name + "(" + str(self.count) + ")"
def __str__(self):
return self.name + ", " + str(self.count) + ", $" + str(self.price)
class Render(QWebPage):
def __init__(self, url):
self.app = QApplication(sys.argv)
QWebPage.__init__(self)
self.loadFinished.connect(self._loadFinished)
self.mainFrame().load(QUrl(url))
self.app.exec_()
def _loadFinished(self, result):
self.frame = self.mainFrame()
self.app.quit()
self.deleteLater()
def getItemsFromPage(appid, page=1):
r = Render("http://steamcommunity.com/market/search?q=appid:" + str(appid) + "#p" + str(page))
soup = BeautifulSoup(str(r.frame.toHtml().toUtf8()))
itemLst = soup.find_all("div", "market_listing_row market_recent_listing_row")
items = []
for k in itemLst:
i = Item()
i.name = k.find("span", "market_listing_item_name").string
i.count = int(replace(k.find("span", "market_listing_num_listings_qty").string, ",", ""))
i.price = float(re.search(r'$([0-9]+.[0-9]+)', str(k)).group(1))
i.game = appid
items.append(i)
return items
if __name__ == "__main__":
print "Updating market items to dota2.csv ..."
i = 1
with open("dota2.csv", "w") as f:
writer = csv.writer(f)
r = None
while True:
print "Page " + str(i)
items = getItemsFromPage(570)
if len(items) == 0:
print "No items found, stopping..."
break
for k in items:
writer.writerow((k.name, k.count, k.price, k.game))
i += 1
print "Done."
Calling getItemsFromPage
once works fine. Subsequent calls give me my problem. The output of the program is typically
Updating market items to dota2.csv ...
Page 1
Page 2
and then it crashes. It should go on for over 700 pages.
question from:
https://stackoverflow.com/questions/66051200/quit-in-qapplication-making-the-whole-code-stop 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…