I have a JSON file which contains strings encoded as follows:
"sender_name": "Horn\u00c3\u00adkov\u00c3\u00a1",
I am trying to parse this file using the json module. However, I am not able to decode this string correctly.
What I get after decoding the JSON using the .load() method is 'HornÃ\xadkovÃ¡'. The string should be decoded as 'Horníková' instead.
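A minimal reproduction of the behaviour described above. It uses json.loads() on an inline string instead of .load() on the actual file, and the variable names are only for illustration:

```python
import json

# Raw JSON text as it appears in the file; the r-prefix keeps the backslashes literal.
raw = r'{"sender_name": "Horn\u00c3\u00adkov\u00c3\u00a1"}'

# The parser turns every \uXXXX escape into the code point U+XXXX.
decoded = json.loads(raw)["sender_name"]

print(repr(decoded))                    # 'HornÃ\xadkovÃ¡'
print([hex(ord(c)) for c in decoded])   # code points of each character
```

So the parser does exactly what the escapes say; the unexpected characters are already present in the file's escapes.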
I read the JSON specification and I understand that \u should be followed by 4 hexadecimal digits specifying the Unicode code point of the character. But it seems that in this JSON file, UTF-8 encoded bytes are stored as \u-sequences.
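One way to check that guess is to treat each decoded character as a single byte and decode those bytes as UTF-8. This is only a sketch, assuming every code point in the string is below U+0100 (so Latin-1 maps each one to exactly one byte); fix_mojibake is a hypothetical helper name, not part of the json module:

```python
import json

def fix_mojibake(text: str) -> str:
    """Reinterpret code points U+0000..U+00FF as raw bytes, then decode as UTF-8.

    Raises UnicodeEncodeError if the text contains a code point above U+00FF,
    and UnicodeDecodeError if the resulting bytes are not valid UTF-8.
    """
    return text.encode("latin-1").decode("utf-8")

raw = r'{"sender_name": "Horn\u00c3\u00adkov\u00c3\u00a1"}'
decoded = json.loads(raw)["sender_name"]

fixed = fix_mojibake(decoded)
print(fixed)  # Horníková
```

The round trip gives the expected name for this string, which is consistent with the escapes holding UTF-8 bytes (C3 AD for í, C3 A1 for á) rather than code points. I am not sure whether this is a proper way to handle the whole file.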
What type of encoding is this, and how do I correctly parse it in Python 3?
Is this kind of file even valid JSON according to the specification?