Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Trying to get a small part of an HTML a class element

I've been using BeautifulSoup on and off for a few years, and I still get tripped up from time to time. I put together this code.

from bs4 import BeautifulSoup
from bs4.dammit import EncodingDetector
import requests

resp = requests.get("https://finance.yahoo.com/gainers")
# Trust the HTTP charset only if the server actually declared one
http_encoding = resp.encoding if 'charset' in resp.headers.get('content-type', '').lower() else None
# Otherwise fall back to an encoding declared inside the HTML itself
html_encoding = EncodingDetector.find_declared_encoding(resp.content, is_html=True)
encoding = html_encoding or http_encoding
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(resp.content, "html.parser", from_encoding=encoding)
myclass = soup.find_all("a", {"class": "Fw(600) C($linkColor)"})
myclass

That gives me this.

[<a class="Fw(600) C($linkColor)" data-reactid="79" href="/quote/TSNP?p=TSNP" title="Tesoro Enterprises, Inc.">TSNP</a>,
 <a class="Fw(600) C($linkColor)" data-reactid="105" href="/quote/FDVRF?p=FDVRF" title="Facedrive Inc.">FDVRF</a>,
 <a class="Fw(600) C($linkColor)" data-reactid="131" href="/quote/SKLZ?p=SKLZ" title="Skillz Inc.">SKLZ</a>,
 <a class="Fw(600) C($linkColor)" data-reactid="157" href="/quote/GOOS?p=GOOS" title="Canada Goose Holdings Inc.">GOOS</a>,
 <a class="Fw(600) C($linkColor)" data-reactid="183" href="/quote/WMS?p=WMS" title="Advanced Drainage Systems, Inc.">WMS</a>, etc., etc.

What I really want is the stock symbols: TSNP, FDVRF, SKLZ, GOOS, WMS, etc., etc.

How can I modify this code to get just the stock symbols? I tried to use regex, but I've never been very proficient with that.

Thanks everyone.

Question from: https://stackoverflow.com/questions/66056527/trying-to-get-a-small-part-of-an-html-a-class-element


1 Reply


You can read the .text attribute of each element that .findAll() returns (.findAll() is the older spelling of .find_all(), and both work in bs4):

for e in soup.findAll("a", {"class": "Fw(600) C($linkColor)"}):
    print(e.text)

Output:

TSNP
FDVRF
SKLZ
GOOS
WMS
APPS
...
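If you want to try this without hitting Yahoo, here is a minimal offline sketch. The SAMPLE_HTML string is an assumption: it is just two of the anchors from the question's output pasted into a div, with some extra whitespace added around one symbol to show why get_text(strip=True) can be a bit safer than plain .text. The title lookup is only an illustration of reading another attribute from the same tag.

```python
from bs4 import BeautifulSoup

# Assumption: two anchors copied from the question's output, wrapped in a div
# so the sketch runs offline. The spaces around SKLZ are added on purpose.
SAMPLE_HTML = """
<div>
  <a class="Fw(600) C($linkColor)" href="/quote/TSNP?p=TSNP" title="Tesoro Enterprises, Inc.">TSNP</a>
  <a class="Fw(600) C($linkColor)" href="/quote/SKLZ?p=SKLZ" title="Skillz Inc."> SKLZ </a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
links = soup.find_all("a", {"class": "Fw(600) C($linkColor)"})

# get_text(strip=True) returns the tag's text with surrounding whitespace removed
symbols = [a.get_text(strip=True) for a in links]

# Attributes are read with dict-style access on the tag
names = {a.get_text(strip=True): a["title"] for a in links}

print(symbols)
print(names)
```

Against the live page, the same two comprehensions should work on whatever the find_all call returns, as long as Yahoo still uses that class string.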

If you want them in a list, a simple list comprehension will do:

gainers = soup.findAll("a", {"class": "Fw(600) C($linkColor)"})
tickers = [e.text for e in gainers]

Output:

['TSNP', 'FDVRF', 'SKLZ', 'GOOS', 'WMS', 'APPS', 'TIGR', ...]
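Since the question mentions trying regex: the symbol also appears in each link's href (for example /quote/TSNP?p=TSNP), so if the link text ever stops being the bare ticker, the standard library's URL parser may be an easier route than a regular expression. This is only a sketch. symbol_from_href is a hypothetical helper name, and the hrefs list is copied by hand from the question's output rather than scraped.

```python
from urllib.parse import urlparse, parse_qs


def symbol_from_href(href):
    """Return the ticker from an href such as '/quote/TSNP?p=TSNP'.

    Hypothetical helper: prefers the 'p' query parameter and falls back
    to the last path segment when that parameter is missing.
    """
    parsed = urlparse(href)
    query = parse_qs(parsed.query)
    if "p" in query:
        return query["p"][0]
    return parsed.path.rstrip("/").rsplit("/", 1)[-1]


# Assumption: hrefs copied from the anchors shown in the question
hrefs = ["/quote/TSNP?p=TSNP", "/quote/FDVRF?p=FDVRF", "/quote/SKLZ?p=SKLZ"]
tickers_from_href = [symbol_from_href(h) for h in hrefs]
print(tickers_from_href)
```

With the soup from the question you would pass e["href"] for each element instead of the hard-coded strings.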
