Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How do I retrieve objects based on a partial custom attribute name in Scrapy?

I have the following element:

<div data-offer="MTs3O29sZG5hdnkuY29tOzQxMDYy" class="Offer__Card-sc-14rx0hy-0 iBdrTi"></div>

I need to find it with Scrapy, but there are two complications. First, the class name can change, so I cannot rely on its value. Selecting by class is pretty much off the table.

Second, the attribute name itself can vary: it may be data-offer, data-offer-promo, or data-offer-double.

Is there a way to find these elements based on a partial attribute name? For example, select everything that has a custom attribute matching "data-offer*", or everything whose attribute name starts with "data-offer". The match has to be on the attribute name, not its value.

I tried this with no success:

 response.css('[div::attr^="data-offer"]')
question from:https://stackoverflow.com/questions/65919729/how-do-i-retrieve-objects-based-on-a-partial-custom-attribute-name-in-scrapy

1 Reply

You can find those elements using BeautifulSoup. The following finds the first div element that has a "data-offer" attribute, whatever its value:

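from bs4 import BeautifulSoup
# True matches any div that has the attribute, regardless of its value
soup = BeautifulSoup(response.body, 'lxml')
results = soup.find("div", {"data-offer": True})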
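
To see what the `{"data-offer": True}` filter actually matches, here is a minimal sketch on a small sample. The HTML is made up for illustration, and `html.parser` is used only so the snippet runs without lxml installed. Note that the filter matches the exact attribute name, so the `data-offer-promo` and `data-offer-double` variants from the question are not picked up by it:

```python
from bs4 import BeautifulSoup

# Made-up sample markup: three cards using the attribute variants from the question.
html = """
<div data-offer="a" class="x"></div>
<div data-offer-promo="b" class="y"></div>
<div data-offer-double="c" class="z"></div>
"""

# "html.parser" ships with Python; the 'lxml' parser used above is a drop-in alternative.
soup = BeautifulSoup(html, "html.parser")

# True means "the attribute is present, whatever its value".
first = soup.find("div", {"data-offer": True})
print(first["data-offer"])

# The attribute name must match exactly, so the -promo and -double variants are skipped.
exact = soup.find_all("div", {"data-offer": True})
print(len(exact))
```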

You could also get a list of all the elements that meet the same condition:

# find_all returns every matching div instead of only the first one
soup = BeautifulSoup(response.body, 'lxml')
results = soup.find_all("div", {"data-offer": True})

...