Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Are spaces around CSS combinators really optional?

I'm a bit confused by how CSS selectors with combinators behave in BeautifulSoup. Below is some simple code to illustrate what I mean:

from bs4 import BeautifulSoup as bs
import requests

response = requests.get('https://stackoverflow.com/questions/tagged/python')
soup = bs(response.text, 'html.parser')  # name the parser explicitly to avoid the bs4 warning

print(len(soup.select('#mainbar > div')))

returns 6 children... but

print(len(soup.select('#mainbar>div')))

returns 0 children...

The same happens with '#mainbar ~ div' (found 1 sibling) and '#mainbar~div' (found nothing).

According to the documentation those spaces are optional, but BeautifulSoup gives me different output for what I thought were equivalent selectors.

So is this a bs4 bug, or does this behavior depend on the CSS version or something else?


1 Reply

This is confirmed as a bug here: https://bugs.launchpad.net/beautifulsoup/+bug/1717851

From a CSS perspective, the selector is valid with or without the spaces.

I will see if I can find further evidence.

The individual reporting the bug states:

The issue, as far as I see, is that since the code is only doing a shlex.split, it doesn't treat div, >, and span as separate entities if a space is left out on either side of >.
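
To make the quoted explanation concrete, here is a minimal sketch of what whitespace-only splitting does to the two selectors from the question. It is my own illustration, not the bs4 source; whitespace_tokens and pad_combinators are hypothetical helper names, and the padding helper is deliberately naive.

```python
import re
import shlex


def whitespace_tokens(selector):
    """Split a selector on whitespace only, which is what shlex.split does."""
    return shlex.split(selector)


def pad_combinators(selector):
    """Hypothetical workaround: put single spaces around >, + and ~.

    Naive on purpose: it skips the ~= attribute operator, but it does not
    understand quoted strings or arguments such as :nth-child(2n+1).
    """
    return re.sub(r'\s*([>+~])(?!=)\s*', r' \1 ', selector).strip()


# With spaces, the combinator becomes its own token.
print(whitespace_tokens('#mainbar > div'))   # ['#mainbar', '>', 'div']

# Without spaces, everything stays glued together as one token.
print(whitespace_tokens('#mainbar>div'))     # ['#mainbar>div']

# Padding the combinator first restores the three-token form.
print(whitespace_tokens(pad_combinators('#mainbar>div')))
```

On an affected bs4 version, passing pad_combinators('#mainbar>div') to soup.select should behave like the spaced form, assuming the selector is simple enough for the regex; always writing the spaces yourself is the safer option.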
