You could use BeautifulSoup to extract the <script> tag, but you would still need an alternative approach to extract the information inside.
Some Python can be used to first extract flashvars and then pass this to demjson to convert the JavaScript dictionary into a Python one. For example:
import demjson
content = """<script type="text/javascript">/* <![CDATA[ */
...
...
</script>"""
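# Keep everything after the "var flashvars = " assignment
script_var = content.split('var flashvars = ')[1]
# Cut at the closing "};", keeping the "}" but dropping the ";"
script_var = script_var[:script_var.find('};') + 1]
# Decode the JavaScript object literal into a Python dictionary
data = demjson.decode(script_var)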
print(data['video_url'])
print(data['video_alt_url'])
This would then display:
https://www.ptrex.com/get_file/4/996a9088fdf801992d24457cd51469f3f7aaaee6a0/33000/33247/33247.mp4/
https://www.ptrex.com/get_file/4/774833c428771edee2cf401ef2264e746a06f9f370/33000/33247/33247_720p.mp4/
demjson is an alternative JSON decoder which can be installed via pip:
pip install demjson
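If demjson is not available, the same extraction can be approximated with the standard library alone. The following is only a minimal sketch, assuming flashvars is a flat object whose values are all single-quoted strings; the sample markup, the example.com URLs and the extract_flashvars helper are hypothetical and are not taken from the real page. It would not handle nested objects, numbers or double-quoted values, which is where a tolerant decoder such as demjson is the better choice.

```python
import re

# Hypothetical sample markup; the real flashvars content is elided above.
sample = """<script type="text/javascript">/* <![CDATA[ */
var flashvars = {
    video_id: '123',
    video_url: 'https://example.com/a.mp4/',
    video_alt_url: 'https://example.com/a_720p.mp4/'
};
/* ]]> */
</script>"""


def extract_flashvars(html):
    """Return the flashvars entries as a dict of strings (flat objects only)."""
    # Capture the text between "var flashvars = {" and the first "};"
    match = re.search(r"var\s+flashvars\s*=\s*\{(.*?)\};", html, re.DOTALL)
    if match is None:
        raise ValueError("no flashvars assignment found")
    body = match.group(1)
    # Collect key: 'value' pairs; keys may be bare or quoted,
    # values are assumed to be single-quoted strings.
    pairs = re.findall(r"""['"]?(\w+)['"]?\s*:\s*'([^']*)'""", body)
    return dict(pairs)


data = extract_flashvars(sample)
print(data['video_url'])
print(data['video_alt_url'])
```

The regular expression stops at the first "};", which mirrors the find('};') step in the demjson version above.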