Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others



utf 8 - Python: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

I am fetching data from a catalog, and it is returned as bytes.

Bytes data:

b'\x80\x00\x00\x00
\x00\x00%\x83\xa0\x08\x01\x00\xbb@\x00\x00\x05p 
\x02\x00>\xf3\x00\x00\x00}\x02\x00`\x03\xef0\x00\x00
\xc0 
\x06\xf0>\xf3\x00\x00\x02\x88\x02\x03\xec\x03\xef0\x00\x00/.....'

When converting this data to a string or any other readable format, I get this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

Code I used (Python 3.7.3):

blobs = blob.decode('utf-8')

AND

import json
json.dumps(blob.decode())

I have also tried pickle, ast, and pprint, but they did not help here.

What I tried:


1 Reply


You can try ignoring the bytes that cannot be decoded.

blob.decode('utf-8', errors='ignore')

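Ignoring errors is not a great solution, since it silently discards every byte that is not valid UTF-8. The real issue is probably in how you are generating the bytes object, or UTF-8 may simply not be the right encoding for your data. Byte 0x80 is a UTF-8 continuation byte, so it can never begin a character.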
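
To make the trade-off concrete, here is a minimal sketch comparing the standard error handlers of `bytes.decode`, followed by two lossless fallbacks for data that may not be text at all. The short `blob` value is a hypothetical sample loosely based on the bytes in the question, not the real data, and the hex and base64 steps are only one possible approach if the result has to go through `json.dumps`.

```python
import base64
import json

# Hypothetical sample (an assumption): a few bytes in the style of the question's blob.
blob = b'\x80\x00%\x83@'

# Strict decoding is the default and fails on the very first byte.
try:
    blob.decode('utf-8')
    strict_reason = None
except UnicodeDecodeError as exc:
    strict_reason = exc.reason  # human-readable cause reported by the codec

# 'ignore' drops undecodable bytes without any trace.
ignored = blob.decode('utf-8', errors='ignore')

# 'replace' substitutes U+FFFD, so the data loss stays visible.
replaced = blob.decode('utf-8', errors='replace')

# 'backslashreplace' keeps the byte values as \xNN escape text.
escaped = blob.decode('utf-8', errors='backslashreplace')

# Lossless alternatives when the payload is binary rather than text.
as_hex = blob.hex()
as_json = json.dumps({'blob': base64.b64encode(blob).decode('ascii')})

# Both representations can be turned back into the original bytes.
from_hex = bytes.fromhex(as_hex)
from_json = base64.b64decode(json.loads(as_json)['blob'])

print(repr(ignored), repr(replaced), repr(escaped), as_hex)
```

None of the handlers recovers the meaning of the data. They only control what happens to bytes that are not valid UTF-8, so if the catalog documents a specific binary format, parsing that format is the more reliable route.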

