Happy examples:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
czech = u'Leoš Janáček'.encode("utf-8")
print(czech)
pl = u'Zdzisław Beksiński'.encode("utf-8")
print(pl)
jp = u'リング 山村 貞子'.encode("utf-8")
print(jp)
chinese = u'五行'.encode("utf-8")
print(chinese)
MIR = u'Машина для Инженерных Расчётов'.encode("utf-8")
print(MIR)
pt = u'Minha Língua Portuguesa: çáà'.encode("utf-8")
print(pt)
Unhappy output:
b'Leo\xc5\xa1 Jan\xc3\xa1\xc4\x8dek'
b'Zdzis\xc5\x82aw Beksi\xc5\x84ski'
b'\xe3\x83\xaa\xe3\x83\xb3\xe3\x82\xb0 \xe5\xb1\xb1\xe6\x9d\x91 \xe8\xb2\x9e\xe5\xad\x90'
b'\xe4\xba\x94\xe8\xa1\x8c'
b'\xd0\x9c\xd0\xb0\xd1\x88\xd0\xb8\xd0\xbd\xd0\xb0 \xd0\xb4\xd0\xbb\xd1\x8f \xd0\x98\xd0\xbd\xd0\xb6\xd0\xb5\xd0\xbd\xd0\xb5\xd1\x80\xd0\xbd\xd1\x8b\xd1\x85 \xd0\xa0\xd0\xb0\xd1\x81\xd1\x87\xd1\x91\xd1\x82\xd0\xbe\xd0\xb2'
b'Minha L\xc3\xadngua Portuguesa: \xc3\xa7\xc3\xa1\xc3\xa0'
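As far as I understand it, the b'...' output is expected in itself: in Python 3, .encode() returns a bytes object, and print() shows the repr of those bytes instead of the text. A minimal sketch of that behaviour (the variable names are only for illustration):

```python
# -*- coding: utf-8 -*-
text = u'Leoš Janáček'

# .encode() turns the str into raw UTF-8 bytes; it is no longer text.
encoded = text.encode("utf-8")
print(type(encoded).__name__)

# print() on bytes shows the escaped repr, which is the b'...' form.
print(repr(encoded))

# Decoding with the same codec gives the original string back.
print(encoded.decode("utf-8") == text)
```

So the encoded values are not corrupted, they are just not strings any more.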
And if I print them like this:
jp = u'リング 山村 貞子'
print(jp)
I get:
Traceback (most recent call last):
  File "x.py", line 5, in <module>
    print(jp)
  File "C:\Python34\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-2: character maps to <undefined>
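The traceback points at the cp850 codec, so my reading is that print() tries to encode the string with the console code page, which has no katakana or kanji. A small sketch that reproduces the same error without a console, assuming cp850 really is the encoding in use (the names failed and fallback are just for illustration):

```python
# -*- coding: utf-8 -*-
jp = u'リング 山村 貞子'

failed = None
try:
    # cp850 is a single-byte code page with no Japanese characters.
    jp.encode('cp850')
except UnicodeEncodeError as exc:
    # start/end give the span of the first unencodable run (end is exclusive).
    failed = (exc.start, exc.end)
    print(exc.encoding, failed)

# errors='replace' avoids the exception, but every unmappable
# character is turned into '?', so the text itself is lost.
fallback = jp.encode('cp850', errors='replace').decode('cp850')
print(fallback)
```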
I've also tried the following from this question (and other alternatives that involve sys.stdout.encoding):
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys

def safeprint(s):
    try:
        print(s)
    except UnicodeEncodeError:
        if sys.version_info >= (3,):
            print(s.encode('utf8').decode(sys.stdout.encoding))
        else:
            print(s.encode('utf8'))

jp = u'リング 山村 貞子'
safeprint(jp)
And things get even more cryptic:
πa?πa│πé? σ??μ¥? Φ▓?σ?é
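My guess at why this happens: encoding to UTF-8 and then decoding with a single-byte code page reinterprets every UTF-8 byte as a separate character, so each Japanese character becomes three unrelated symbols. A sketch of that effect, assuming cp850 as in the traceback (the console's real code page may differ, so the exact symbols can vary):

```python
# -*- coding: utf-8 -*-
jp = u'リング'

# Each of these characters takes three bytes in UTF-8.
utf8_bytes = jp.encode('utf8')

# Decoding those bytes as cp850 maps every byte to one character,
# which produces mojibake rather than the original text.
garbled = utf8_bytes.decode('cp850')

print(len(jp), len(utf8_bytes), len(garbled))
print(ascii(garbled))
```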
And the docs were not very helpful.
So, what's the deal with Python 3.4, Unicode, different languages and Windows? Almost all the examples I could find deal with Python 2.x.
Is there a general and cross-platform way of printing ANY Unicode character from any language in a decent and non-nasty way in Python 3.4?
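The closest thing to a fallback I can think of is a sketch like the one below, which at least never raises: it encodes with the target encoding using errors='backslashreplace', so unmappable characters become \uXXXX escapes instead of crashing. It does not display the real glyphs. console_safe is a hypothetical helper name, and defaulting to sys.stdout.encoding (then ASCII) is my own assumption:

```python
# -*- coding: utf-8 -*-
import sys

def console_safe(text, encoding=None):
    """Return text with characters the encoding lacks replaced by \\uXXXX escapes."""
    # Hypothetical helper: use the given encoding, else stdout's, else ASCII.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # backslashreplace keeps encodable characters and escapes the rest.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)

# With cp850 as the target, the Japanese text degrades to escapes.
print(console_safe(u'リング 山村 貞子', 'cp850'))
```

Characters that the code page does contain (for example the í in "Língua" under cp850) pass through unchanged, so only the unsupported ones are escaped.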
EDIT:
I've tried typing at the terminal:
chcp 65001
to change the code page, as proposed here and in the comments, and it did not work (including the attempt with sys.stdout.encoding).