Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
344 views
in Technique[技术] by (71.8m points)

python - ElementTree and unicode

I have this char in an xml file:

<data>
  <products>
      <color>fumè</color>
  </product>
</data>

I try to generate an instance of ElementTree with the following code:

string_data = open('file.xml')
x = ElementTree.fromstring(unicode(string_data.encode('utf-8')))

and I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'xe8' in position 185: ordinal not in range(128)

(NOTE: The position is not exact, I sampled the xml from a larger one).

How to solve it? Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Might you have stumbled upon this problem while using Requests (HTTP for Humans), response.text decodes the response by default, you can use response.content to get the undecoded data, so ElementTree can decode it itself. Just remember to use the correct encoding.

More info: http://docs.python-requests.org/en/latest/user/quickstart/#response-content


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...