I'm working with HTML elements that have child tags, which I want to "ignore" or remove so that only the text remains. Right now, if I call .string on any element that contains tags, all I get is None.
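import bs4

# Pass the parser explicitly so the result does not depend on what is installed.
soup = bs4.BeautifulSoup("""
<div id="main">
<p>This is a paragraph.</p>
<p>This is a paragraph <span class="test">with a tag</span>.</p>
<p>This is another paragraph.</p>
</div>
""", "html.parser")
main = soup.find(id='main')
# Note: .children also yields the whitespace strings between the <p> tags.
for child in main.children:
    print(child.string)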
Output:
This is a paragraph.
None
This is another paragraph.
I want the second line to be "This is a paragraph with a tag." How do I do this?
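For reference, here is a minimal sketch of one approach that may do what is being asked, assuming Beautiful Soup 4 and the built-in "html.parser". The .string attribute returns None when a tag has more than one child, whereas get_text() concatenates all the text nodes beneath a tag, nested ones included. The variable names html and lines are only illustrative.

```python
import bs4

html = """
<div id="main">
<p>This is a paragraph.</p>
<p>This is a paragraph <span class="test">with a tag</span>.</p>
<p>This is another paragraph.</p>
</div>
"""

soup = bs4.BeautifulSoup(html, "html.parser")
main = soup.find(id="main")

# find_all("p") skips the whitespace-only strings that .children would yield.
# get_text() joins every text node under each <p>, including text inside <span>.
lines = [p.get_text() for p in main.find_all("p")]
for line in lines:
    print(line)
```

If the goal is to drop the span from the tree itself while keeping its text, calling unwrap() on the span tag is another option, though .string may still return None afterwards because the paragraph is left with several separate text nodes.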