Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
436 views
in Technique[技术] by (71.8m points)

python - Entity references and lxml

Here's the code I have:

from cStringIO import StringIO
from lxml import etree

xml = StringIO('''<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY test "This is a test">
]>
<root>
  <sub>&test;</sub>
</root>''')

d1 = etree.parse(xml)
print '%r' % d1.find('/sub').text

parser = etree.XMLParser(resolve_entities=False)
d2 = etree.parse(xml, parser=parser)
print '%r' % d2.find('/sub').text

Here's the output:

'This is a test'
None

How do I get lxml to give me '&test;', i.e., the raw entity reference?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The "unresolved" Entity is left as child node of the element node sub

>>> print d2.find('/sub')[0]
&test;
>>> d2.find('/sub').getchildren()
[&test;]

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...