Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Unicode and encoding for Persian or Arabic in Python 3

I have a chunk of code like this:

city_name = obj['city_from']['name'].encode('utf-8')
print(city_name)

The output from this code is:

b'\xd8\xa8\xd9\x86\xd8\xaf\xd8\xb1\xd8\xb9\xd8\xa8\xd8\xa7\xd8\xb3'

If I remove encode('utf-8'), I get this error instead:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)

The text is in Persian (similar to Arabic). I wonder why the str class in Python 3 does not have a decode method. Do you have any solutions to this problem?

thanks


1 Reply


Your output shows that your terminal accepts utf-8 byte sequences.

You don't need to convert Unicode strings into bytes before printing them. Python does it for you.
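As a small illustration of this, and of the question about decode: in Python 3, str holds Unicode text and only has encode(), while bytes holds raw bytes and only has decode(). The sketch below assumes the byte string shown in the question is the UTF-8 form of the city name; the variable names are illustrative and not taken from the asker's program.

```python
# Byte string copied from the question's output (assumed to be UTF-8).
raw = b'\xd8\xa8\xd9\x86\xd8\xaf\xd8\xb1\xd8\xb9\xd8\xa8\xd8\xa7\xd8\xb3'

# bytes -> str: decode() lives on bytes.
city_name = raw.decode('utf-8')

# str -> bytes: encode() lives on str.
round_trip = city_name.encode('utf-8')

# Only ASCII-safe facts are printed, so this runs even on an ascii stdout.
print(type(city_name).__name__, len(city_name))    # str with 8 characters
print(type(round_trip).__name__, len(round_trip))  # bytes with 16 bytes
print(hasattr(str, 'decode'), hasattr(bytes, 'decode'))
```

Once the environment is configured for UTF-8, print(city_name) on the str itself is enough; no explicit encode() call is needed.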

To change the character encoding that Python uses for I/O, set the PYTHONIOENCODING=utf-8 environment variable or change your locale settings.

It looks like sys.stdout.encoding is ascii in your case.

$ python3 -c'import sys; print(sys.stdout.encoding)' 
UTF-8
$ python3 -c'import sys; print(sys.stdout.encoding)' | cat
ascii
$ LC_CTYPE=C python3 -c'import sys; print(sys.stdout.encoding)' 
ANSI_X3.4-1968

ANSI_X3.4-1968 is a canonical name for ascii.

$ PYTHONIOENCODING=uTf-8 python3 -c'import sys; print(sys.stdout.encoding)' | cat
uTf-8
$ LC_CTYPE=C.UTF-8 python3 -c'import sys; print(sys.stdout.encoding)' 
UTF-8
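The effect seen in the shell sessions above can also be reproduced inside a single process. This is only a sketch: it wraps an in-memory buffer in io.TextIOWrapper to stand in for sys.stdout with a given encoding, and the helper name write_with_encoding is hypothetical.

```python
import io

# The city name from the question, rebuilt from its UTF-8 bytes.
text = b'\xd8\xa8\xd9\x86\xd8\xaf\xd8\xb1\xd8\xb9\xd8\xa8\xd8\xa7\xd8\xb3'.decode('utf-8')

def write_with_encoding(s, encoding):
    """Write s through a text stream that uses `encoding`; return the bytes produced."""
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding)
    stream.write(s)
    stream.flush()
    return buffer.getvalue()

# An ascii stream behaves like the asker's stdout: it cannot encode the text.
ascii_error = None
try:
    write_with_encoding(text, 'ascii')
except UnicodeEncodeError as exc:
    ascii_error = exc
    print('ascii stream failed:', exc.reason)

# A utf-8 stream encodes the same str without any manual encode() call.
utf8_bytes = write_with_encoding(text, 'utf-8')
print(utf8_bytes)
```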

Don't hardcode the character encoding inside your scripts. Print Unicode strings and configure your environment appropriately instead.
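One way to check what a given environment does, without touching the script itself, is to launch a child interpreter with the variable set and ask it for sys.stdout.encoding. The sketch below is an assumption-laden illustration: the helper name child_stdout_encoding is hypothetical, and the second result depends entirely on the machine it runs on.

```python
import os
import subprocess
import sys

def child_stdout_encoding(extra_env):
    """Run a child Python with extra environment variables and return its stdout encoding."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, '-c', 'import sys; print(sys.stdout.encoding)'],
        env=env,
        stdout=subprocess.PIPE,   # stdout is a pipe here, like `| cat` in a shell
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()

# Encoding chosen by the environment variable, not by the script.
forced = child_stdout_encoding({'PYTHONIOENCODING': 'utf-8'})
print('with PYTHONIOENCODING=utf-8:', forced)

# Whatever the inherited environment and locale produce on this machine.
print('inherited environment:', child_stdout_encoding({}))
```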

