I am scraping listings with Scrapy. My script first parses the listing URLs using parse_node, then parses each listing using parse_listing, and for each listing it parses that listing's agents using parse_agent. I would like to create an array that builds up as Scrapy parses through the listings and their agents, and resets for each new listing.
Here is my parsing script:
from scrapy import Request

def parse_node(self, response, node):
    yield Request('LISTING LINK', callback=self.parse_listing)

def parse_listing(self, response):
    yield response.xpath('//node[@id="ListingId"]/text()').extract_first()
    yield response.xpath('//node[@id="ListingTitle"]/text()').extract_first()
    # Agents come back as one '^'-separated string (str.split replaces Python 2's string.split)
    agents = response.xpath('//node[@id="Agents"]/text()').extract_first() or ""
    for agent in agents.split('^'):
        yield Request('AGENT LINK', callback=self.parse_agent)

def parse_agent(self, response):
    yield response.xpath('//node[@id="AgentName"]/text()').extract_first()
    yield response.xpath('//node[@id="AgentEmail"]/text()').extract_first()
I would like parse_listing to result in:
{
    'id': 123,
    'title': 'Amazing Listing'
}
then parse_agent to add to the listing array:
{
    'id': 123,
    'title': 'Amazing Listing',
    'agent': [
        {
            'name': 'jon doe',
            'email': '[email protected]'
        },
        {
            'name': 'jane doe',
            'email': '[email protected]'
        }
    ]
}
How do I get the results from each level and build up an array?
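One pattern that might fit, sketched here under assumptions rather than as a definitive answer: start a fresh dict in parse_listing, hand it along with each agent request, and only emit it once the last agent has been parsed. In real Scrapy the dict would travel in Request(url, callback=..., meta={...}) and be read back from response.meta. The sketch below uses small stand-ins instead so it runs without Scrapy installed. The Request and Response tuples, the crawl driver, the PAGES data, the page dict layout and the example.com addresses are all hypothetical and not part of my script.

```python
from collections import namedtuple

# Hypothetical stand-ins for scrapy.Request / Response, so this runs standalone.
Request = namedtuple("Request", "url callback meta")
Response = namedtuple("Response", "data meta")


def parse_listing(response):
    # A new dict per listing: this is the "reset for each new listing".
    item = {"id": response.data["id"], "title": response.data["title"], "agent": []}
    pending = list(response.data["agent_urls"])
    if not pending:
        yield item  # no agents, so the item is already complete
        return
    # Visit agents one at a time, passing the item and the remaining URLs along.
    yield Request(pending[0], parse_agent, {"item": item, "pending": pending[1:]})


def parse_agent(response):
    item = response.meta["item"]
    item["agent"].append(
        {"name": response.data["name"], "email": response.data["email"]}
    )
    pending = response.meta["pending"]
    if pending:
        yield Request(pending[0], parse_agent, {"item": item, "pending": pending[1:]})
    else:
        yield item  # last agent parsed: emit the finished listing


def crawl(start, pages):
    """Tiny driver playing the role of the Scrapy engine (hypothetical)."""
    queue, items = [start], []
    while queue:
        req = queue.pop(0)
        for out in req.callback(Response(pages[req.url], req.meta)):
            if isinstance(out, Request):
                queue.append(out)
            else:
                items.append(out)
    return items


# Made-up page data standing in for what the XPath queries would extract.
PAGES = {
    "listing/123": {
        "id": 123,
        "title": "Amazing Listing",
        "agent_urls": ["agent/1", "agent/2"],
    },
    "agent/1": {"name": "jon doe", "email": "jon@example.com"},
    "agent/2": {"name": "jane doe", "email": "jane@example.com"},
}

items = crawl(Request("listing/123", parse_listing, {}), PAGES)
print(items)
```

The agent requests are chained one after another instead of being fired all at once, so there is a single well-defined point at which the listing is complete. One caveat with this design is that if an agent request fails, the chain stops and the listing is never emitted, so a real spider would probably also need an errback on those requests.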