Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
654 views
in Technique by (71.8m points)

unicode - Best Practices for Python UnicodeDecodeError

I use the Pylons framework with Mako templates for a web-based application. I had never looked closely at how Python handles unicode strings. I had a tense moment when my site crashed while rendering a page, and I later learned that the cause was a UnicodeDecodeError.

After seeing the error, I started messing around with my Python code, adding encode and decode calls with the 'ignore' option to strings, but the errors still showed up from time to time.

Finally, I resorted to decoding to ASCII with 'ignore', which got the site running without crashes.

Input to my site comes from many other sites, so I do not control which language or languages it arrives in. My site supports international languages as well as English. I have feed aggregation, which generally does not care about unicode/ascii/utf-8. When I display the text through the Mako template, I display it as is.

I am not a web expert. What are the best practices for handling strings in a Python project? Should I care only when rendering the text, or in every phase of the application?


1 Reply

0 votes
by (71.8m points)

If you have influence on it, this is the painless way:

  • know your input encoding (or decode with 'ignore') and call decode(encoding) on the data as soon as it hits your app
  • work internally only with unicode (u'something' is unicode), including in the database
  • for rendering, export, etc., any time data leaves your app, call encode('utf-8') on it
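
The three steps above could look roughly like this. It is only a minimal sketch, not Pylons or Mako specific: the helper names to_text and to_bytes and the sample feed bytes are made up for illustration, and it assumes the input really is UTF-8. It is written so that it runs on Python 3, where bytes/str play the roles that str/unicode play in Python 2.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers illustrating "decode on input, unicode inside,
# encode on output". Names and sample data are assumptions.

def to_text(raw, encoding="utf-8", errors="replace"):
    """Decode bytes at the input boundary; leave text untouched."""
    if isinstance(raw, bytes):
        # errors="replace" marks undecodable bytes instead of raising
        # UnicodeDecodeError; "ignore" would silently drop them.
        return raw.decode(encoding, errors)
    return raw


def to_bytes(text, encoding="utf-8"):
    """Encode text at the output boundary (rendering, export, etc.)."""
    return text.encode(encoding)


# 1. Input: bytes from a feed, assumed to be UTF-8.
raw_feed_item = b"Caf\xc3\xa9 \xe2\x80\x93 menu"
title = to_text(raw_feed_item)

# 2. Inside the app: work only with text, never with raw bytes.
heading = title.upper()

# 3. Output: encode once, when the data leaves the app.
payload = to_bytes(heading)

# For comparison: decoding as ASCII with "ignore" avoids the crash
# but throws away every non-ASCII character.
lossy = raw_feed_item.decode("ascii", "ignore")
```

Here lossy has lost both the accented letter and the dash, which is why decoding to ASCII with 'ignore' stops the crash but damages international text. If the real encoding of a source is unknown, 'replace' at least leaves a visible marker where bytes could not be decoded.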
