You asked for minimal!
- Write a encode function and a decode function.
- Write a "search function" that returns a
CodecInfo
object constructed from the above encoder and decoder.
- Use codec.register to register a function that returns the above
CodecInfo
object.
Here is an example that converts the lowercase letters a-z to 0-25 in order.
import codecs
import string
from typing import Tuple
# prepare map from numbers to letters
_encode_table = {str(number): bytes(letter, 'ascii') for number, letter in enumerate(string.ascii_lowercase)}
# prepare inverse map
_decode_table = {ord(v): k for k, v in _encode_table.items()}
def custom_encode(text: str) -> Tuple[bytes, int]:
# example encoder that converts ints to letters
# see https://docs.python.org/3/library/codecs.html#codecs.Codec.encode
return b''.join(_encode_table[x] for x in text), len(text)
def custom_decode(binary: bytes) -> Tuple[str, int]:
# example decoder that converts letters to ints
# see https://docs.python.org/3/library/codecs.html#codecs.Codec.decode
return ''.join(_decode_table[x] for x in binary), len(binary)
def custom_search_function(encoding_name):
return codecs.CodecInfo(custom_encode, custom_decode, name='Reasons')
def main():
# register your custom codec
# note that CodecInfo.name is used later
codecs.register(custom_search_function)
binary = b'abcdefg'
# decode letters to numbers
text = codecs.decode(binary, encoding='Reasons')
print(text)
# encode numbers to letters
binary2 = codecs.encode(text, encoding='Reasons')
print(binary2)
# encode(decode(...)) should be an identity function
assert binary == binary2
if __name__ == '__main__':
main()
Running this prints
$ python codec_example.py
0123456
b'abcdefg'
See https://docs.python.org/3/library/codecs.html#codec-objects for details on the Codec
interface. In particular, the decode function
... decodes the object input and returns a tuple (output object, length
consumed).
whereas the encode function
... encodes the object input and returns a tuple (output object, length consumed).
Note that you should also worry about handling streams, incremental encoding/decoding, as well as error handling. For a more complete example, refer to the hexlify codec that @krs013 mentioned.
P.S. instead of of codec.decode
, you can also use codec.open(..., encoding='Reasons')
.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…