I'm trying to get the plain text of a website article using Python. I've heard about the BeautifulSoup library, but how do I retrieve a specific tag from an HTML page?
This is what I have done:
    import requests
    from bs4 import BeautifulSoup

    base_url = 'http://www.nytimes.com'
    r = requests.get(base_url)
    soup = BeautifulSoup(r.text, "html.parser")
Try this:
    import bs4 as bs
    import requests as rq

    # Placeholder URL; requests needs the scheme (http:// or https://)
    html = rq.get('http://site.com')
    s = bs.BeautifulSoup(html.text, features="html.parser")
    div = s.find('div', {'class': 'yourclass'})  # or id
    print(div.text)  # print the text inside the tag
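If you want to try this without hitting a live site, here is a minimal sketch that runs on an inline HTML string. The sample markup, the class name `article-body`, and the helper `extract_article_text` are all made up for illustration; a real page will use its own class or id, which you can find with the browser's inspector. Note that `find` returns `None` when nothing matches, so it is worth checking before reading the text.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <div class="nav">Home | World | Sports</div>
  <div class="article-body">
    <p>First paragraph of the story.</p>
    <p>Second paragraph   with  extra   spaces.</p>
  </div>
</body></html>
"""


def extract_article_text(html, css_class="article-body"):
    """Return the text of the first <div> with the given class, or None."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_=css_class)
    if container is None:
        # No matching tag: avoid an AttributeError on None
        return None
    # One line per <p>, with leading/trailing whitespace stripped
    paragraphs = [p.get_text(" ", strip=True) for p in container.find_all("p")]
    # Collapse runs of whitespace inside each paragraph
    return "\n".join(" ".join(p.split()) for p in paragraphs)


text = extract_article_text(SAMPLE_HTML)
print(text)
```

With a real site you would pass `r.text` from `requests.get(...)` in place of `SAMPLE_HTML` and swap in the class name that the article container actually uses.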