If it's a Python 2.x str, get its len. If it's a Python 3.x str (or a Python 2.x unicode), first encode to bytes (or a str, respectively) using your preferred encoding ('utf-8' is a good choice) and then get the len of the encoded bytes/str object.
For example, ASCII characters use 1 byte each:
>>> len("hello".encode("utf8"))
5
whereas most Chinese characters, including these two, use 3 bytes each in UTF-8:
>>> len("你好".encode("utf8"))
6
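If you need this in more than one place, the steps above can be wrapped in a small helper. This is only a sketch for Python 3: the name byte_length and its encoding parameter are made up for illustration, and it assumes the input is either a str or an already-encoded bytes object.

```python
def byte_length(text, encoding="utf-8"):
    """Return how many bytes `text` occupies (hypothetical helper, Python 3)."""
    # bytes are already encoded, so their len is the byte count.
    if isinstance(text, bytes):
        return len(text)
    # str must be encoded first; len of the result is the byte count.
    return len(text.encode(encoding))


print(byte_length("hello"))            # 5
print(byte_length("你好"))              # 6
print(byte_length("你好", "utf-16-le"))  # 4
```

The last call shows that the result depends on the encoding you pick: the same two characters take 6 bytes in UTF-8 but 4 in UTF-16 (without a byte order mark).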