Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How do I scrape data from this element? It has no wrapping div and no class attribute I can target, so I can't work out how to extract its text.

<p>
  1. "The purpose of our lives is to be happy." - <strong>Dalai Lama</strong>
</p>

The page contains many quotes in tags of the same form as the one above, and I can't find a way to locate these elements.


Whoever fights a dragon for too long becomes a dragon himself; gaze into the abyss for too long, and the abyss will gaze back…

1 Reply

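import requests
from bs4 import BeautifulSoup
from pprint import pp  # pprint.pp requires Python 3.8+


def main(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'lxml')
    # The quote paragraphs have no class of their own, so select every <p>
    # that comes after a .figure_block sibling inside the "promoarea" span.
    texts = [p.get_text(strip=True, separator=" ") for p in soup.select(
        'span[data-parade-type="promoarea"] .figure_block ~ p')]

    # Keep only the numbered quotes; t[:1] avoids an IndexError on empty strings.
    goal = [t for t in texts if t[:1].isdigit()]
    pp(goal)


main('https://parade.com/937586/parade/life-quotes/')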
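To try the same idea without hitting the network, here is a minimal sketch that runs against an inline HTML string. The markup is a made-up sample modelled on the snippet in the question (the second quote and the helper name numbered_quotes are hypothetical), and it uses the bundled html.parser instead of lxml. It locates the paragraphs purely by structure: a <p> that contains a <strong> and whose text starts with a digit.

```python
from bs4 import BeautifulSoup

# Made-up sample markup modelled on the question's snippet; not the real page.
HTML = """
<div>
  <p>Intro paragraph without a number.</p>
  <p>1. "The purpose of our lives is to be happy." - <strong>Dalai Lama</strong></p>
  <p>2. "Second sample quote." - <strong>Some Author</strong></p>
  <p></p>
</div>
"""


def numbered_quotes(html):
    """Return (full_text, author) pairs for numbered <p> elements."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for p in soup.find_all("p"):
        author = p.find("strong")
        text = p.get_text(strip=True, separator=" ")
        # Skip paragraphs with no <strong> child or no leading digit.
        if author is None or not text[:1].isdigit():
            continue
        results.append((text, author.get_text(strip=True)))
    return results


for text, author in numbered_quotes(HTML):
    print(author, "|", text)
```

The same filter can be applied to the list returned by soup.select(...) in the answer above if the CSS selector turns out to match unrelated paragraphs.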

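Note: if the quotes come out with garbled characters (this tends to show up on Windows machines), pass the raw bytes (r.content) to BeautifulSoup and set from_encoding= to the encoding the page is actually served in. from_encoding describes the document, not your system, and it is ignored when the markup is an already-decoded string such as r.text.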

Ref: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#encodings
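As a rough illustration of how from_encoding is used: per the Beautiful Soup documentation linked above, it names the document's encoding and only takes effect when the markup is passed as bytes. The sketch below uses a hand-built UTF-8 byte string in place of a real response body; the helper name parse_bytes is hypothetical.

```python
from bs4 import BeautifulSoup

# Stand-in for r.content: raw UTF-8 bytes containing curly quotes and a long dash.
RAW = (
    "<p>1. \u201cThe purpose of our lives is to be happy.\u201d \u2014 "
    "<strong>Dalai Lama</strong></p>"
).encode("utf-8")


def parse_bytes(raw_bytes, encoding):
    """Decode raw_bytes using the given encoding and return the <p> text."""
    soup = BeautifulSoup(raw_bytes, "html.parser", from_encoding=encoding)
    return soup.p.get_text(strip=True, separator=" ")


print(parse_bytes(RAW, "utf-8"))

# With requests, the equivalent call would look like:
#   soup = BeautifulSoup(r.content, "lxml", from_encoding="utf-8")
# where "utf-8" is an assumption about the page's encoding.
```

If the text parses correctly but printing still fails, the console encoding is the more likely culprit rather than the parser.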

Alternatively, to print one quote per line instead of pretty-printing the list, replace pp(goal) with:

print("\n".join(goal))
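Each string in goal still mixes the number, the quote, and the author. If separate fields are needed, a regular expression is one option. This is only a sketch: it assumes every line looks like "<number>. <quote> <dash> <author>", and the names LINE_RE and split_quote are made up for illustration.

```python
import re

# Assumed layout: "<number>. <quote text> <dash> <author>".
# The dash may be a hyphen, an en dash, or an em dash.
LINE_RE = re.compile(
    r"^(?P<number>\d+)\.\s*(?P<quote>.+)\s+[-\u2013\u2014]\s+(?P<author>.+)$"
)


def split_quote(line):
    """Split one scraped line into (number, quote, author), or None if it does not fit."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    # Drop straight or curly double quotes around the quote text.
    quote = match.group("quote").strip(' "\u201c\u201d')
    return int(match.group("number")), quote, match.group("author").strip()


sample = '1. "The purpose of our lives is to be happy." - Dalai Lama'
print(split_quote(sample))
```

Because the quote group is greedy, the split happens at the last dash surrounded by whitespace, so a dash inside the quote itself does not cut it short.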
