(In this answer, I'm assuming you use Python 2.)
First, let me explain why your snippet returns something different than you expect:
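import json

# Plain str literal: Python 2 keeps \u00b0 as a backslash followed by "u00b0"
r1 = json.dumps({"detalle":"el Expediente N\u00b0\u00a030 de la Resoluci\u00f3n 11..."}, ensure_ascii=False).encode('utf8')
print(r1)
# unicode literal: the \u escapes become the actual characters
r2 = json.dumps({"detalle":u"el Expediente N\u00b0\u00a030 de la Resoluci\u00f3n 11..."}, ensure_ascii=False).encode('utf8')
print(r2)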
This outputs:
{"detalle": "el Expediente N\\u00b0\\u00a030 de la Resoluci\\u00f3n 11..."}
{"detalle": "el Expediente N°?30 de la Resolución 11..."}
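The difference is that in the first case the input is a plain byte string (str). Python 2 does not interpret \u escapes in str literals, so the string contains only ASCII characters: a literal backslash followed by u00b0, and so on. In the second case, the u prefix makes it a unicode string, so the escapes are turned into the actual unicode characters. The second case is what you want.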
Based on this, here is what I understand from your problem:
Normally, when you read a JSON file with the json module, the strings (which are escaped in the JSON file) are unescaped by the parser. If you still see escaped characters, that indicates that the strings were (accidentally?) double-escaped in the JSON file. In that case, try an extra unescape with s.decode('unicode-escape'):
data["detalle"] = data["detalle"].decode('unicode-escape')
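To make the double-escaping case concrete, here is a minimal sketch. The JSON text in raw and the helper name unescape are made up for illustration and are not taken from your file. The helper uses s.encode('ascii').decode('unicode_escape'), which does the same job as the line above but also runs on Python 3. It assumes the loaded string is pure ASCII, which should be the case when every special character was double-escaped.

```python
import json

# Hypothetical double-escaped JSON text: the backslash itself was escaped,
# so the parser produces a literal backslash + "u00f3", not the character.
raw = r'{"detalle": "Resoluci\\u00f3n 11"}'

data = json.loads(raw)
# After normal parsing the escape sequence is still visible in the value.
still_escaped = "\\u00f3" in data["detalle"]

def unescape(s):
    # Hypothetical helper. Assumes s contains only ASCII characters;
    # a string that mixes real non-ASCII characters with escapes would
    # raise UnicodeEncodeError here.
    return s.encode("ascii").decode("unicode_escape")

data["detalle"] = unescape(data["detalle"])
```

If the string can contain real non-ASCII characters alongside escaped ones, this approach is not safe as written, and fixing whatever produced the double-escaped file is probably the better route.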
Once you have proper unicode strings loaded in Python, converting them to bytes with s.encode('utf8') and writing the result to a file is correct.
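As a rough sketch of that last step: the value in data and the output file name below are placeholders, and the temporary directory is only there to keep the example self-contained. The file is opened in binary mode because the payload is already encoded to bytes.

```python
import json
import os
import tempfile

# Hypothetical, already-unescaped unicode value
data = {u"detalle": u"Resoluci\u00f3n 11"}

# ensure_ascii=False keeps the characters themselves instead of \uXXXX escapes
payload = json.dumps(data, ensure_ascii=False).encode("utf8")

# Placeholder output path inside a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "out.json")
with open(path, "wb") as f:
    f.write(payload)

# Read the bytes back to check what actually landed on disk
with open(path, "rb") as f:
    written = f.read()
```

The file then contains the UTF-8 bytes of the characters themselves rather than \u escape sequences.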