python - Extract all <script> tags in an HTML page and append to the bottom of the document

Question


1 Reply

深蓝 · Answer 1 · 2021-10-23T21:35:32+0000

The answer is simple and may miss many nuances. However, it should give you an idea of how to approach the problem, and you should be able to refine it quickly with the help of the documentation.

Reference doc: http://www.crummy.com/software/BeautifulSoup/documentation.html

from bs4 import BeautifulSoup

doc = ['<html><script type="text/javascript">document.write("Hello World!")',
       '</script><head><title>Page title</title></head>',
       '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
       '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
       '</html>']
# The input leaves its <p> tags unclosed, so the resulting tree depends on
# the parser. 'lxml' (assumed to be installed) closes them implicitly.
soup = BeautifulSoup(''.join(doc), 'lxml')

# find_all() returns a list, so moving tags inside the loop is safe
for tag in soup.find_all('script'):
    # Use extract to remove the tag from its current position
    tag.extract()
    # Append it as the last child of <body>
    soup.body.append(tag)

print(soup.prettify())

Output:

<html>
 <head>
  <title>
   Page title
  </title>
 </head>
 <body>
  <p id="firstpara" align="center">
   This is paragraph
   <b>
    one
   </b>
   .
  </p>
  <p id="secondpara" align="blah">
   This is paragraph
   <b>
    two
   </b>
   .
  </p>
  <script type="text/javascript">
   document.write("Hello World!")
  </script>
 </body>
</html>
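
If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: the name move_scripts_to_end is made up for illustration, the input is well-formed HTML (so the built-in 'html.parser' is enough), and a document without a <body> simply gets the scripts appended to the document root.

```python
from bs4 import BeautifulSoup


def move_scripts_to_end(html, parser="html.parser"):
    """Return a soup in which every <script> is moved to the end of <body>.

    Hypothetical helper, not part of Beautiful Soup itself.
    """
    soup = BeautifulSoup(html, parser)
    # Assumption: if there is no <body>, fall back to the document root.
    target = soup.body if soup.body is not None else soup
    # find_all() returns the tags in document order, so appending them one
    # by one keeps the scripts in their original relative order.
    for tag in soup.find_all("script"):
        target.append(tag.extract())
    return soup


html = (
    '<html><head><script src="a.js"></script><title>t</title></head>'
    '<body><script>var x = 1;</script><p>Hello</p></body></html>'
)
result = move_scripts_to_end(html)
print(result.prettify())
```

Keeping the scripts in document order matters when later scripts depend on earlier ones, which is why the loop appends rather than inserting at a fixed index.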
