Diffstat (limited to 'Doc/library/codecs.rst')
-rw-r--r-- | Doc/library/codecs.rst | 98 |
1 files changed, 92 insertions, 6 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 14f6547e4e0..c5dae7c8e8f 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -53,6 +53,14 @@ any codec:
    :exc:`UnicodeDecodeError`). Refer to :ref:`codec-base-classes` for more
    information on codec error handling.
 
+.. function:: charmap_build(string)
+
+   Return a mapping suitable for encoding with a custom single-byte encoding.
+   Given a :class:`str` *string* of up to 256 characters representing a
+   decoding table, returns either a compact internal mapping object
+   ``EncodingMap`` or a :class:`dictionary <dict>` mapping character ordinals
+   to byte values. Raises a :exc:`TypeError` on invalid input.
+
 The full details for each codec can also be looked up directly:
 
 .. function:: lookup(encoding, /)
@@ -208,7 +216,7 @@ wider range of codecs when working with binary files:
    .. versionchanged:: 3.11
       The ``'U'`` mode has been removed.
 
-   .. deprecated:: next
+   .. deprecated:: 3.14
 
       :func:`codecs.open` has been superseded by :func:`open`.
 
@@ -235,8 +243,8 @@ wider range of codecs when working with binary files:
 .. function:: iterencode(iterator, encoding, errors='strict', **kwargs)
 
    Uses an incremental encoder to iteratively encode the input provided by
-   *iterator*. This function is a :term:`generator`.
-   The *errors* argument (as well as any
+   *iterator*. *iterator* must yield :class:`str` objects.
+   This function is a :term:`generator`. The *errors* argument (as well as any
    other keyword argument) is passed through to the incremental encoder.
 
    This function requires that the codec accept text :class:`str` objects
@@ -247,8 +255,8 @@ wider range of codecs when working with binary files:
 .. function:: iterdecode(iterator, encoding, errors='strict', **kwargs)
 
    Uses an incremental decoder to iteratively decode the input provided by
-   *iterator*. This function is a :term:`generator`.
-   The *errors* argument (as well as any
+   *iterator*. *iterator* must yield :class:`bytes` objects.
+   This function is a :term:`generator`. The *errors* argument (as well as any
    other keyword argument) is passed through to the incremental decoder.
 
    This function requires that the codec accept :class:`bytes` objects
@@ -257,6 +265,20 @@ wider range of codecs when working with binary files:
    :func:`iterencode`.
 
 
+.. function:: readbuffer_encode(buffer, errors=None, /)
+
+   Return a :class:`tuple` containing the raw bytes of *buffer*, a
+   :ref:`buffer-compatible object <bufferobjects>` or :class:`str`
+   (encoded to UTF-8 before processing), and their length in bytes.
+
+   The *errors* argument is ignored.
+
+   .. code-block:: pycon
+
+      >>> codecs.readbuffer_encode(b"Zito")
+      (b'Zito', 4)
+
+
 The module also provides the following constants which are useful for reading
 and writing to platform dependent files:
 
@@ -1373,7 +1395,11 @@ encodings.
 |                    |         | It is used in the Python  |
 |                    |         | pickle protocol.          |
 +--------------------+---------+---------------------------+
-| undefined          |         | Raise an exception for    |
+| undefined          |         | This Codec should only    |
+|                    |         | be used for testing       |
+|                    |         | purposes.                 |
+|                    |         |                           |
+|                    |         | Raise an exception for    |
 |                    |         | all conversions, even     |
 |                    |         | empty strings. The error  |
 |                    |         | handler is ignored.       |
@@ -1476,6 +1502,66 @@ mapping. It is not supported by :meth:`str.encode` (which only produces
    Restoration of the ``rot13`` alias.
 
 
+:mod:`encodings` --- Encodings package
+--------------------------------------
+
+.. module:: encodings
+   :synopsis: Encodings package
+
+This module implements the following functions:
+
+.. function:: normalize_encoding(encoding)
+
+   Normalize encoding name *encoding*.
+
+   Normalization works as follows: all non-alphanumeric characters except the
+   dot used for Python package names are collapsed and replaced with a single
+   underscore, leading and trailing underscores are removed.
+   For example, ``' -;#'`` becomes ``'_'``.
+
+   Note that *encoding* should be ASCII only.
+
+
+.. note::
+   The following functions should not be used directly, except for testing
+   purposes; :func:`codecs.lookup` should be used instead.
+
+
+.. function:: search_function(encoding)
+
+   Search for the codec module corresponding to the given encoding name
+   *encoding*.
+
+   This function first normalizes the *encoding* using
+   :func:`normalize_encoding`, then looks for a corresponding alias.
+   It attempts to import a codec module from the encodings package using either
+   the alias or the normalized name. If the module is found and defines a valid
+   ``getregentry()`` function that returns a :class:`codecs.CodecInfo` object,
+   the codec is cached and returned.
+
+   If the codec module defines a ``getaliases()`` function any returned aliases
+   are registered for future use.
+
+
+.. function:: win32_code_page_search_function(encoding)
+
+   Search for a Windows code page encoding *encoding* of the form ``cpXXXX``.
+
+   If the code page is valid and supported, return a :class:`codecs.CodecInfo`
+   object for it.
+
+   .. availability:: Windows.
+
+   .. versionadded:: 3.14
+
+
+This module implements the following exception:
+
+.. exception:: CodecRegistryError
+
+   Raised when a codec is invalid or incompatible.
+
+
 :mod:`encodings.idna` --- Internationalized Domain Names in Applications
 ------------------------------------------------------------------------
 
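For readers of the patch, the new ``charmap_build`` entry may be easier to follow with a small round trip. The sketch below is not part of the change; the decoding table is a made-up one (ASCII in the lower half, U+0400 to U+047F in the upper half) chosen only for illustration, and it pairs ``charmap_build`` with ``codecs.charmap_encode`` and ``codecs.charmap_decode``, the way the generated single-byte codec modules in ``encodings`` appear to do.

```python
import codecs

# Hypothetical 256-character decoding table (an assumption for this sketch):
# bytes 0x00-0x7F decode to the same ASCII code points,
# bytes 0x80-0xFF decode to U+0400..U+047F.
decoding_table = (
    "".join(chr(i) for i in range(128))
    + "".join(chr(0x400 + i) for i in range(128))
)

# Build the reverse (character -> byte) mapping from the decoding table.
encoding_map = codecs.charmap_build(decoding_table)

# Encode with the built mapping; U+0410 sits at table index 0x90.
encoded, consumed = codecs.charmap_encode("A\u0410", "strict", encoding_map)

# Decode back using the original table.
decoded, _ = codecs.charmap_decode(encoded, "strict", decoding_table)
```

Whether the returned object is an ``EncodingMap`` or a plain ``dict`` depends on the table contents, so the sketch avoids relying on the concrete type.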
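The clarified wording for ``iterencode`` and ``iterdecode`` (the iterator must yield ``str`` and ``bytes`` objects respectively) can be illustrated with a short sketch. The chunk boundaries below are arbitrary choices for the example; the second half deliberately splits a multi-byte UTF-8 sequence across two chunks to show that the incremental decoder carries state between items.

```python
import codecs

# iterencode: the iterator yields str objects, the generator yields bytes.
text_chunks = ["caf", "\u00e9 ", "na\u00efve"]
encoded_bytes = b"".join(codecs.iterencode(text_chunks, "utf-8"))

# iterdecode: the iterator yields bytes objects, the generator yields str.
# The two-byte sequence for U+00E9 (0xC3 0xA9) is split across chunks.
byte_chunks = [b"caf\xc3", b"\xa9"]
decoded_text = "".join(codecs.iterdecode(byte_chunks, "utf-8"))
```

Passing ``bytes`` to ``iterencode`` or ``str`` to ``iterdecode`` is exactly the misuse the added sentences are meant to rule out.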
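The normalization rule described for ``encodings.normalize_encoding`` can be checked with a few lowercase ASCII names. This is only an illustrative sketch; the input names are arbitrary, and it sticks to cases where runs of punctuation sit between alphanumeric characters, since edge cases may differ between Python versions.

```python
import encodings

# Runs of non-alphanumeric characters (other than the dot) collapse
# into a single underscore.
hyphenated = encodings.normalize_encoding("utf-8")
spaced = encodings.normalize_encoding("iso 8859-1")
repeated = encodings.normalize_encoding("latin--1")

# The dot is preserved because it is used in Python package names.
dotted = encodings.normalize_encoding("mypkg.codec-name")
```

As the note in the patch says, ``codecs.lookup`` remains the supported entry point for finding a codec; ``normalize_encoding`` only shows how a name is canonicalized before the search.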