Google requires you to specify the User-Agent HTTP header in order to return the correct page. Without the correct User-Agent, Google returns a page that doesn't contain <div> tags with the r class. You can see the difference if you print(soup) with and without the User-Agent header.
For example:
import requests
from bs4 import BeautifulSoup

query = 'selena+gomez'
# A browser-like User-Agent; without it Google serves a different page
# that has no <div class="r"> elements.
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0'}
website = f'https://google.com/search?hl=en&q={query}'
req_web = requests.get(website, headers=headers).text
parser = BeautifulSoup(req_web, 'html.parser')
# The first result link sits in an <a> inside a <div class="r">.
gotolink = parser.find('div', class_='r').a['href']
print(gotolink)
Prints:
https://www.instagram.com/selenagomez/?hl=en
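To see the difference yourself, here is a minimal sketch (same query and header as above) that fetches the results page both ways and counts the <div class="r"> containers in each response. Note that Google's markup changes over time, so the exact class name may differ when you run it:

import requests
from bs4 import BeautifulSoup

url = 'https://google.com/search?hl=en&q=selena+gomez'
browser_headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0'}

# An empty dict falls back to the default python-requests User-Agent,
# which is the "without User-Agent" case described above.
for label, headers in (('without User-Agent', {}), ('with User-Agent', browser_headers)):
    soup = BeautifulSoup(requests.get(url, headers=headers).text, 'html.parser')
    print(label, '->', len(soup.find_all('div', class_='r')), 'matching divs')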