There are several issues with your code:
App
inherits from app_crawler
yet you provide an app_crawler
instance to App.__init__
.
App.__init__
calls app_crawler.__init__
instead of super().__init__()
.
Not only app_crawler.get_app
doesn't actually return anything, it creates a brand new App
object.
This results in your code passing an app_crawler
object to requests.get
instead of a url string.
You have too much encapsulation in your code.
Consider the following code that is shorter than your not-working code, cleaner and without needing to needlessly pass objects around:
from lxml import html
import requests
class App:
def __init__(self, starturl):
self.starturl = starturl
self.links = []
def get_links(self):
page = requests.get(self.starturl)
tree = html.fromstring(page.text)
self.links = tree.xpath('//div[@class="lockup-info"]//*/a[@class="name"]/@href')
def process_links(self):
for link in self.links:
self.get_docs(link)
def get_docs(self, url):
page = requests.get(url)
tree = html.fromstring(page.text)
name = tree.xpath('//h1[@itemprop="name"]/text()')[0]
developper = tree.xpath('//div[@class="left"]/h2/text()')[0]
price = tree.xpath('//div[@itemprop="price"]/text()')[0]
print(name, developper, price)
if __name__ == '__main__':
parse = App("https://itunes.apple.com/us/app/candy-crush-saga/id553834731?mt=8")
parse.get_links()
parse.process_links()
outputs
Cookie Jam By Jam City, Inc. Free
Zombie Tsunami By Mobigame Free
Flow Free By Big Duck Games LLC Free
Bejeweled Blitz By PopCap Free
Juice Jam By Jam City, Inc. Free
Candy Crush Soda Saga By King Free
Bubble Witch 3 Saga By King Free
Candy Crush Jelly Saga By King Free
Farm Heroes Saga By King Free
Pet Rescue Saga By King Free
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…