Your input must be a byte string for decoding to make sense. Assuming it is, use bytes.decode():
>>> s = b'The restoration and rejuvenation of the Willamette Army Base\xe2\x80\x94now the Willamette Reservist Training Center\xe2\x80\x94is complete.\n'
>>> type(s)
<class 'bytes'>
>>> s2 = s.decode('utf8')
>>> type(s2)
<class 'str'>
>>> s2
'The restoration and rejuvenation of the Willamette Army Base—now the Willamette Reservist Training Center—is complete.\n'
The above shows a byte string (class bytes) being decoded to a Unicode string (class str).
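If the bytes might not be valid UTF-8, decode() raises UnicodeDecodeError by default. A minimal sketch of the errors argument follows; the input bytes here are made up for illustration, and whether replacing or dropping bad bytes is acceptable depends on your data:

```python
# Hypothetical input: 0xff is never a valid byte in UTF-8.
bad = b'Base\xffnow'

# The default (errors='strict') raises on invalid input.
try:
    bad.decode('utf-8')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='replace' substitutes U+FFFD for the undecodable byte.
replaced = bad.decode('utf-8', errors='replace')

# errors='ignore' silently drops the undecodable byte.
ignored = bad.decode('utf-8', errors='ignore')

print(strict_failed)
print(repr(replaced))
print(repr(ignored))
```

Strict decoding is usually the safer default, since it surfaces bad data instead of hiding it.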
Strip off the trailing newline (along with any other trailing whitespace) with rstrip():
>>> s2.rstrip()
'The restoration and rejuvenation of the Willamette Army Base—now the Willamette Reservist Training Center—is complete.'
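Note that rstrip() with no argument removes all trailing whitespace, not just newlines. If trailing spaces or tabs matter, you can pass the characters to strip. A small sketch with a made-up string:

```python
# Hypothetical string ending in a space, a tab and a newline.
text = 'is complete. \t\n'

# No argument: every trailing whitespace character is removed.
all_stripped = text.rstrip()

# With '\n': only trailing newline characters are removed.
newline_stripped = text.rstrip('\n')

print(repr(all_stripped))
print(repr(newline_stripped))
```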
If your data is coming from a file or other stream you can decode as you read it by specifying an encoding when you open the file/stream:
with open('file.txt', encoding='utf8') as f:
    for line in f:
        print(line)
This decodes the incoming data from UTF-8 as it is read, so your code deals only with strings, not byte strings. See open() for details.
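To see the file-based approach end to end, here is a self-contained sketch. It writes some UTF-8 bytes to a temporary file (the sample text and the temporary file are assumptions for illustration only) and reads them back in text mode:

```python
import os
import tempfile

# Hypothetical sample data: UTF-8 bytes containing two em dashes.
raw = b'Army Base\xe2\x80\x94now the Training Center\xe2\x80\x94is complete.\n'

# Write the raw bytes to a temporary file in binary mode.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Text mode with an explicit encoding: each line arrives as str.
    with open(path, encoding='utf-8') as f:
        lines = [line.rstrip('\n') for line in f]
finally:
    # Clean up the temporary file.
    os.remove(path)

print(lines)
```

Each element of lines is already a str, so no separate decode() call is needed.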