Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - UnicodeEncodeError: 'charmap' codec can't encode character... problems

Before anyone gives me crap about this being asked a billion times, please note that I've tried several of the answers in many a thread but none of them seemed to work properly for my problem.

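import json
def parse(fn):
    results = []  # unused in this snippet
    with open(fn) as f:
        json_obj = json.load(f)  # parse from the handle opened above instead of opening the file twice
        for r in json_obj["result"]:
            print(r["name"])  # this is the line that raises UnicodeEncodeError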

parse("wine.json")

I'm basically just opening a JSON file and iterating over it for some values. Whenever I print a value that contains a non-ASCII character, I get this error.

Traceback (most recent call last):
  File "json_test.py", line 9, in <module>
    parse("wine.json")
  File "json_test.py", line 7, in parse
    print(r["name"])
  File "C:\Python34\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u201c' in position 15: character maps to <undefined>

As people said in other threads I've tried to encode it and whatnot, but then I get a similar error, no matter how I encode and/or decode it. Please help.

1 Reply

Everything is fine up until the point where you try to print the string. To print a string, Python must first convert it from Unicode to a byte sequence that your output device supports. That means encoding it to the device's character set, which Python has identified as cp850, the default code page of the Windows console on your system. cp850 has no mapping for '\u201c' (the left double quotation mark), so the encode step fails.
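To see the failing encode step in isolation, here is a minimal sketch that works on any platform, since the cp850 codec ships with Python. The helper name `unencodable` and the sample string are made up for illustration; they are not from the question.

```python
import sys

def unencodable(text, encoding):
    """Return the characters of text that have no mapping in encoding."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad

# Made-up sample value containing curly quotes, like the one in the traceback.
name = "Wine \u201cReserve\u201d"

# The encoding print() uses for this process's standard output.
print("stdout encoding:", sys.stdout.encoding)

# ascii() keeps the report printable even on a console that lacks the characters.
print("not in cp850:", ascii(unencodable(name, "cp850")))
print("not in utf-8:", ascii(unencodable(name, "utf-8")))
```

If the first line of output reports cp850 (or another legacy code page), any character listed as missing from that code page will raise the same UnicodeEncodeError when printed.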

Starting with Python 3.4 you can set the Windows console to use UTF-8 with the following command issued at the command prompt:

chcp 65001

This should fix your issue, as long as you've configured the window to use a font that contains the character.

Starting with Python 3.6 this is no longer necessary - Windows has always had a full Unicode interface for the console, and Python is now using it in place of the primitive code page I/O. Unicode to the console just works.
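If you are stuck on an older interpreter, or your output is redirected to a file or pipe where a legacy code page may still be in effect, one possible fallback is to wrap the output stream so that unmappable characters are escaped instead of raising. This is only a sketch: `tolerant_writer` is a hypothetical helper name, and the in-memory buffer stands in for a cp850 console so the example runs anywhere.

```python
import io

def tolerant_writer(binary_stream, encoding, errors="backslashreplace"):
    """Wrap a binary stream in a text layer that escapes unmappable characters."""
    return io.TextIOWrapper(
        binary_stream,
        encoding=encoding,
        errors=errors,         # escape instead of raising UnicodeEncodeError
        line_buffering=True,   # flush whenever a newline is written
    )

# An in-memory buffer plays the role of a cp850 console here.
fake_console = io.BytesIO()
out = tolerant_writer(fake_console, "cp850")

# The curly quotes are not in cp850, but the accented letter is.
print("\u201cCh\u00e2teau\u201d", file=out)
out.flush()

rendered = fake_console.getvalue().decode("cp850").rstrip()
print(rendered)  # the quotes appear as \u201c and \u201d escapes
```

On a real console the same idea would be applied as `sys.stdout = tolerant_writer(sys.stdout.buffer, sys.stdout.encoding)`. On Python 3.7 and later, `sys.stdout.reconfigure(errors="backslashreplace")` achieves the same effect without replacing the stream.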

