diff options
Diffstat (limited to 'Doc/library')
-rw-r--r-- | Doc/library/__future__.rst | 78 | ||||
-rw-r--r-- | Doc/library/codecs.rst | 88 | ||||
-rw-r--r-- | Doc/library/email.compat32-message.rst | 18 | ||||
-rw-r--r-- | Doc/library/email.parser.rst | 10 | ||||
-rw-r--r-- | Doc/library/exceptions.rst | 30 | ||||
-rw-r--r-- | Doc/library/functools.rst | 33 | ||||
-rw-r--r-- | Doc/library/math.rst | 26 | ||||
-rw-r--r-- | Doc/library/os.path.rst | 7 | ||||
-rw-r--r-- | Doc/library/platform.rst | 4 | ||||
-rw-r--r-- | Doc/library/profile.rst | 306 | ||||
-rw-r--r-- | Doc/library/pyexpat.rst | 7 | ||||
-rw-r--r-- | Doc/library/random.rst | 5 | ||||
-rw-r--r-- | Doc/library/security_warnings.rst | 2 | ||||
-rw-r--r-- | Doc/library/shelve.rst | 70 | ||||
-rw-r--r-- | Doc/library/sqlite3.rst | 2 | ||||
-rw-r--r-- | Doc/library/ssl.rst | 10 | ||||
-rw-r--r-- | Doc/library/sys.rst | 2 | ||||
-rw-r--r-- | Doc/library/time.rst | 14 | ||||
-rw-r--r-- | Doc/library/unittest.rst | 2 | ||||
-rw-r--r-- | Doc/library/xml.dom.minidom.rst | 7 | ||||
-rw-r--r-- | Doc/library/xml.dom.pulldom.rst | 7 | ||||
-rw-r--r-- | Doc/library/xml.etree.elementtree.rst | 7 | ||||
-rw-r--r-- | Doc/library/xml.rst | 76 | ||||
-rw-r--r-- | Doc/library/xml.sax.rst | 7 | ||||
-rw-r--r-- | Doc/library/xmlrpc.client.rst | 4 | ||||
-rw-r--r-- | Doc/library/xmlrpc.server.rst | 4 |
26 files changed, 632 insertions, 194 deletions
diff --git a/Doc/library/__future__.rst b/Doc/library/__future__.rst index 4f3b663006f..5d916b30112 100644 --- a/Doc/library/__future__.rst +++ b/Doc/library/__future__.rst @@ -37,38 +37,52 @@ No feature description will ever be deleted from :mod:`__future__`. Since its introduction in Python 2.1 the following features have found their way into the language using this mechanism: -+------------------+-------------+--------------+---------------------------------------------+ -| feature | optional in | mandatory in | effect | -+==================+=============+==============+=============================================+ -| nested_scopes | 2.1.0b1 | 2.2 | :pep:`227`: | -| | | | *Statically Nested Scopes* | -+------------------+-------------+--------------+---------------------------------------------+ -| generators | 2.2.0a1 | 2.3 | :pep:`255`: | -| | | | *Simple Generators* | -+------------------+-------------+--------------+---------------------------------------------+ -| division | 2.2.0a2 | 3.0 | :pep:`238`: | -| | | | *Changing the Division Operator* | -+------------------+-------------+--------------+---------------------------------------------+ -| absolute_import | 2.5.0a1 | 3.0 | :pep:`328`: | -| | | | *Imports: Multi-Line and Absolute/Relative* | -+------------------+-------------+--------------+---------------------------------------------+ -| with_statement | 2.5.0a1 | 2.6 | :pep:`343`: | -| | | | *The "with" Statement* | -+------------------+-------------+--------------+---------------------------------------------+ -| print_function | 2.6.0a2 | 3.0 | :pep:`3105`: | -| | | | *Make print a function* | -+------------------+-------------+--------------+---------------------------------------------+ -| unicode_literals | 2.6.0a2 | 3.0 | :pep:`3112`: | -| | | | *Bytes literals in Python 3000* | -+------------------+-------------+--------------+---------------------------------------------+ -| generator_stop | 3.5.0b1 | 3.7 | :pep:`479`: | -| | | | *StopIteration handling inside generators* | -+------------------+-------------+--------------+---------------------------------------------+ -| annotations | 3.7.0b1 | Never [1]_ | :pep:`563`: | -| | | | *Postponed evaluation of annotations*, | -| | | | :pep:`649`: *Deferred evaluation of | -| | | | annotations using descriptors* | -+------------------+-------------+--------------+---------------------------------------------+ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * * feature + * optional in + * mandatory in + * effect + * * .. data:: nested_scopes + * 2.1.0b1 + * 2.2 + * :pep:`227`: *Statically Nested Scopes* + * * .. data:: generators + * 2.2.0a1 + * 2.3 + * :pep:`255`: *Simple Generators* + * * .. data:: division + * 2.2.0a2 + * 3.0 + * :pep:`238`: *Changing the Division Operator* + * * .. data:: absolute_import + * 2.5.0a1 + * 3.0 + * :pep:`328`: *Imports: Multi-Line and Absolute/Relative* + * * .. data:: with_statement + * 2.5.0a1 + * 2.6 + * :pep:`343`: *The “with” Statement* + * * .. data:: print_function + * 2.6.0a2 + * 3.0 + * :pep:`3105`: *Make print a function* + * * .. data:: unicode_literals + * 2.6.0a2 + * 3.0 + * :pep:`3112`: *Bytes literals in Python 3000* + * * .. data:: generator_stop + * 3.5.0b1 + * 3.7 + * :pep:`479`: *StopIteration handling inside generators* + * * .. data:: annotations + * 3.7.0b1 + * Never [1]_ + * :pep:`563`: *Postponed evaluation of annotations*, + :pep:`649`: *Deferred evaluation of annotations using descriptors* .. XXX Adding a new entry? Remember to update simple_stmts.rst, too. diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst index b231fa568cf..c5dae7c8e8f 100644 --- a/Doc/library/codecs.rst +++ b/Doc/library/codecs.rst @@ -243,8 +243,8 @@ wider range of codecs when working with binary files: .. function:: iterencode(iterator, encoding, errors='strict', **kwargs) Uses an incremental encoder to iteratively encode the input provided by - *iterator*. This function is a :term:`generator`. - The *errors* argument (as well as any + *iterator*. *iterator* must yield :class:`str` objects. + This function is a :term:`generator`. The *errors* argument (as well as any other keyword argument) is passed through to the incremental encoder. This function requires that the codec accept text :class:`str` objects @@ -255,8 +255,8 @@ wider range of codecs when working with binary files: .. function:: iterdecode(iterator, encoding, errors='strict', **kwargs) Uses an incremental decoder to iteratively decode the input provided by - *iterator*. This function is a :term:`generator`. - The *errors* argument (as well as any + *iterator*. *iterator* must yield :class:`bytes` objects. + This function is a :term:`generator`. The *errors* argument (as well as any other keyword argument) is passed through to the incremental decoder. This function requires that the codec accept :class:`bytes` objects @@ -265,6 +265,20 @@ wider range of codecs when working with binary files: :func:`iterencode`. +.. function:: readbuffer_encode(buffer, errors=None, /) + + Return a :class:`tuple` containing the raw bytes of *buffer*, a + :ref:`buffer-compatible object <bufferobjects>` or :class:`str` + (encoded to UTF-8 before processing), and their length in bytes. + + The *errors* argument is ignored. + + .. code-block:: pycon + + >>> codecs.readbuffer_encode(b"Zito") + (b'Zito', 4) + + The module also provides the following constants which are useful for reading and writing to platform dependent files: @@ -1381,7 +1395,11 @@ encodings. | | | It is used in the Python | | | | pickle protocol. | +--------------------+---------+---------------------------+ -| undefined | | Raise an exception for | +| undefined | | This Codec should only | +| | | be used for testing | +| | | purposes. | +| | | | +| | | Raise an exception for | | | | all conversions, even | | | | empty strings. The error | | | | handler is ignored. | @@ -1484,6 +1502,66 @@ mapping. It is not supported by :meth:`str.encode` (which only produces Restoration of the ``rot13`` alias. +:mod:`encodings` --- Encodings package +-------------------------------------- + +.. module:: encodings + :synopsis: Encodings package + +This module implements the following functions: + +.. function:: normalize_encoding(encoding) + + Normalize encoding name *encoding*. + + Normalization works as follows: all non-alphanumeric characters except the + dot used for Python package names are collapsed and replaced with a single + underscore, leading and trailing underscores are removed. + For example, ``' -;#'`` becomes ``'_'``. + + Note that *encoding* should be ASCII only. + + +.. note:: + The following functions should not be used directly, except for testing + purposes; :func:`codecs.lookup` should be used instead. + + +.. function:: search_function(encoding) + + Search for the codec module corresponding to the given encoding name + *encoding*. + + This function first normalizes the *encoding* using + :func:`normalize_encoding`, then looks for a corresponding alias. + It attempts to import a codec module from the encodings package using either + the alias or the normalized name. If the module is found and defines a valid + ``getregentry()`` function that returns a :class:`codecs.CodecInfo` object, + the codec is cached and returned. + + If the codec module defines a ``getaliases()`` function any returned aliases + are registered for future use. + + +.. function:: win32_code_page_search_function(encoding) + + Search for a Windows code page encoding *encoding* of the form ``cpXXXX``. + + If the code page is valid and supported, return a :class:`codecs.CodecInfo` + object for it. + + .. availability:: Windows. + + .. versionadded:: 3.14 + + +This module implements the following exception: + +.. exception:: CodecRegistryError + + Raised when a codec is invalid or incompatible. + + :mod:`encodings.idna` --- Internationalized Domain Names in Applications ------------------------------------------------------------------------ diff --git a/Doc/library/email.compat32-message.rst b/Doc/library/email.compat32-message.rst index 4285c436e8d..5754c2b65b2 100644 --- a/Doc/library/email.compat32-message.rst +++ b/Doc/library/email.compat32-message.rst @@ -181,7 +181,7 @@ Here are the methods of the :class:`Message` class: :meth:`set_payload` instead. This is a legacy method. On the - :class:`~email.emailmessage.EmailMessage` class its functionality is + :class:`~email.message.EmailMessage` class its functionality is replaced by :meth:`~email.message.EmailMessage.set_content` and the related ``make`` and ``add`` methods. @@ -224,7 +224,7 @@ Here are the methods of the :class:`Message` class: ASCII charset. This is a legacy method. On the - :class:`~email.emailmessage.EmailMessage` class its functionality is + :class:`~email.message.EmailMessage` class its functionality is replaced by :meth:`~email.message.EmailMessage.get_content` and :meth:`~email.message.EmailMessage.iter_parts`. @@ -236,7 +236,7 @@ Here are the methods of the :class:`Message` class: the message's default character set; see :meth:`set_charset` for details. This is a legacy method. On the - :class:`~email.emailmessage.EmailMessage` class its functionality is + :class:`~email.message.EmailMessage` class its functionality is replaced by :meth:`~email.message.EmailMessage.set_content`. @@ -265,9 +265,9 @@ Here are the methods of the :class:`Message` class: using that :mailheader:`Content-Transfer-Encoding` and is not modified. This is a legacy method. On the - :class:`~email.emailmessage.EmailMessage` class its functionality is + :class:`~email.message.EmailMessage` class its functionality is replaced by the *charset* parameter of the - :meth:`email.emailmessage.EmailMessage.set_content` method. + :meth:`email.message.EmailMessage.set_content` method. .. method:: get_charset() @@ -276,7 +276,7 @@ Here are the methods of the :class:`Message` class: message's payload. This is a legacy method. On the - :class:`~email.emailmessage.EmailMessage` class it always returns + :class:`~email.message.EmailMessage` class it always returns ``None``. @@ -486,7 +486,7 @@ Here are the methods of the :class:`Message` class: search instead of :mailheader:`Content-Type`. This is a legacy method. On the - :class:`~email.emailmessage.EmailMessage` class its functionality is + :class:`~email.message.EmailMessage` class its functionality is replaced by the *params* property of the individual header objects returned by the header access methods. @@ -524,7 +524,7 @@ Here are the methods of the :class:`Message` class: to ``False``. This is a legacy method. On the - :class:`~email.emailmessage.EmailMessage` class its functionality is + :class:`~email.message.EmailMessage` class its functionality is replaced by the *params* property of the individual header objects returned by the header access methods. @@ -579,7 +579,7 @@ Here are the methods of the :class:`Message` class: header is also added. This is a legacy method. On the - :class:`~email.emailmessage.EmailMessage` class its functionality is + :class:`~email.message.EmailMessage` class its functionality is replaced by the ``make_`` and ``add_`` methods. diff --git a/Doc/library/email.parser.rst b/Doc/library/email.parser.rst index 439b5c8f34b..90796370ebb 100644 --- a/Doc/library/email.parser.rst +++ b/Doc/library/email.parser.rst @@ -48,8 +48,8 @@ methods. FeedParser API ^^^^^^^^^^^^^^ -The :class:`BytesFeedParser`, imported from the :mod:`email.feedparser` module, -provides an API that is conducive to incremental parsing of email messages, +The :class:`BytesFeedParser`, imported from the :mod:`email.parser.FeedParser` +module, provides an API that is conducive to incremental parsing of email messages, such as would be necessary when reading the text of an email message from a source that can block (such as a socket). The :class:`BytesFeedParser` can of course be used to parse an email message fully contained in a :term:`bytes-like @@ -116,7 +116,7 @@ Here is the API for the :class:`BytesFeedParser`: Works like :class:`BytesFeedParser` except that the input to the :meth:`~BytesFeedParser.feed` method must be a string. This is of limited utility, since the only way for such a message to be valid is for it to - contain only ASCII text or, if :attr:`~email.policy.Policy.utf8` is + contain only ASCII text or, if :attr:`~email.policy.EmailPolicy.utf8` is ``True``, no binary attachments. .. versionchanged:: 3.3 Added the *policy* keyword. @@ -155,11 +155,11 @@ message body, instead setting the payload to the raw body. Read all the data from the binary file-like object *fp*, parse the resulting bytes, and return the message object. *fp* must support - both the :meth:`~io.IOBase.readline` and the :meth:`~io.IOBase.read` + both the :meth:`~io.IOBase.readline` and the :meth:`~io.TextIOBase.read` methods. The bytes contained in *fp* must be formatted as a block of :rfc:`5322` - (or, if :attr:`~email.policy.Policy.utf8` is ``True``, :rfc:`6532`) + (or, if :attr:`~email.policy.EmailPolicy.utf8` is ``True``, :rfc:`6532`) style headers and header continuation lines, optionally preceded by an envelope header. The header block is terminated either by the end of the data or by a blank line. Following the header block is the body of the diff --git a/Doc/library/exceptions.rst b/Doc/library/exceptions.rst index c09e1615a5b..a73cc2e8a2d 100644 --- a/Doc/library/exceptions.rst +++ b/Doc/library/exceptions.rst @@ -204,10 +204,16 @@ The following exceptions are the exceptions that are usually raised. assignment fails. (When an object does not support attribute references or attribute assignments at all, :exc:`TypeError` is raised.) - The :attr:`name` and :attr:`obj` attributes can be set using keyword-only - arguments to the constructor. When set they represent the name of the attribute - that was attempted to be accessed and the object that was accessed for said - attribute, respectively. + The optional *name* and *obj* keyword-only arguments + set the corresponding attributes: + + .. attribute:: name + + The name of the attribute that was attempted to be accessed. + + .. attribute:: obj + + The object that was accessed for the named attribute. .. versionchanged:: 3.10 Added the :attr:`name` and :attr:`obj` attributes. @@ -215,7 +221,7 @@ The following exceptions are the exceptions that are usually raised. .. exception:: EOFError Raised when the :func:`input` function hits an end-of-file condition (EOF) - without reading any data. (N.B.: the :meth:`io.IOBase.read` and + without reading any data. (Note: the :meth:`!io.IOBase.read` and :meth:`io.IOBase.readline` methods return an empty string when they hit EOF.) @@ -312,9 +318,11 @@ The following exceptions are the exceptions that are usually raised. unqualified names. The associated value is an error message that includes the name that could not be found. - The :attr:`name` attribute can be set using a keyword-only argument to the - constructor. When set it represent the name of the variable that was attempted - to be accessed. + The optional *name* keyword-only argument sets the attribute: + + .. attribute:: name + + The name of the variable that was attempted to be accessed. .. versionchanged:: 3.10 Added the :attr:`name` attribute. @@ -382,7 +390,7 @@ The following exceptions are the exceptions that are usually raised. The corresponding error message, as provided by the operating system. It is formatted by the C - functions :c:func:`perror` under POSIX, and :c:func:`FormatMessage` + functions :c:func:`!perror` under POSIX, and :c:func:`!FormatMessage` under Windows. .. attribute:: filename @@ -398,7 +406,7 @@ The following exceptions are the exceptions that are usually raised. .. versionchanged:: 3.3 :exc:`EnvironmentError`, :exc:`IOError`, :exc:`WindowsError`, :exc:`socket.error`, :exc:`select.error` and - :exc:`mmap.error` have been merged into :exc:`OSError`, and the + :exc:`!mmap.error` have been merged into :exc:`OSError`, and the constructor may return a subclass. .. versionchanged:: 3.4 @@ -597,7 +605,7 @@ The following exceptions are the exceptions that are usually raised. handled, the Python interpreter exits; no stack traceback is printed. The constructor accepts the same optional argument passed to :func:`sys.exit`. If the value is an integer, it specifies the system exit status (passed to - C's :c:func:`exit` function); if it is ``None``, the exit status is zero; if + C's :c:func:`!exit` function); if it is ``None``, the exit status is zero; if it has another type (such as a string), the object's value is printed and the exit status is one. diff --git a/Doc/library/functools.rst b/Doc/library/functools.rst index 3e75621be6d..beec9b942af 100644 --- a/Doc/library/functools.rst +++ b/Doc/library/functools.rst @@ -199,12 +199,18 @@ The :mod:`functools` module defines the following functions: and *typed*. This is for information purposes only. Mutating the values has no effect. + .. method:: lru_cache.cache_info() + :no-typesetting: + To help measure the effectiveness of the cache and tune the *maxsize* - parameter, the wrapped function is instrumented with a :func:`cache_info` + parameter, the wrapped function is instrumented with a :func:`!cache_info` function that returns a :term:`named tuple` showing *hits*, *misses*, *maxsize* and *currsize*. - The decorator also provides a :func:`cache_clear` function for clearing or + .. method:: lru_cache.cache_clear() + :no-typesetting: + + The decorator also provides a :func:`!cache_clear` function for clearing or invalidating the cache. The original underlying function is accessible through the @@ -284,9 +290,9 @@ The :mod:`functools` module defines the following functions: class decorator supplies the rest. This simplifies the effort involved in specifying all of the possible rich comparison operations: - The class must define one of :meth:`__lt__`, :meth:`__le__`, - :meth:`__gt__`, or :meth:`__ge__`. - In addition, the class should supply an :meth:`__eq__` method. + The class must define one of :meth:`~object.__lt__`, :meth:`~object.__le__`, + :meth:`~object.__gt__`, or :meth:`~object.__ge__`. + In addition, the class should supply an :meth:`~object.__eq__` method. For example:: @@ -418,7 +424,7 @@ The :mod:`functools` module defines the following functions: like normal functions, are handled as descriptors). When *func* is a descriptor (such as a normal Python function, - :func:`classmethod`, :func:`staticmethod`, :func:`abstractmethod` or + :func:`classmethod`, :func:`staticmethod`, :func:`~abc.abstractmethod` or another instance of :class:`partialmethod`), calls to ``__get__`` are delegated to the underlying descriptor, and an appropriate :ref:`partial object<partial-objects>` returned as the result. @@ -499,7 +505,10 @@ The :mod:`functools` module defines the following functions: ... print("Let me just say,", end=" ") ... print(arg) - To add overloaded implementations to the function, use the :func:`register` + .. method:: singledispatch.register() + :no-typesetting: + + To add overloaded implementations to the function, use the :func:`!register` attribute of the generic function, which can be used as a decorator. For functions annotated with types, the decorator will infer the type of the first argument automatically:: @@ -565,14 +574,14 @@ The :mod:`functools` module defines the following functions: runtime impact. To enable registering :term:`lambdas<lambda>` and pre-existing functions, - the :func:`register` attribute can also be used in a functional form:: + the :func:`~singledispatch.register` attribute can also be used in a functional form:: >>> def nothing(arg, verbose=False): ... print("Nothing.") ... >>> fun.register(type(None), nothing) - The :func:`register` attribute returns the undecorated function. This + The :func:`~singledispatch.register` attribute returns the undecorated function. This enables decorator stacking, :mod:`pickling<pickle>`, and the creation of unit tests for each variant independently:: @@ -650,10 +659,10 @@ The :mod:`functools` module defines the following functions: .. versionadded:: 3.4 .. versionchanged:: 3.7 - The :func:`register` attribute now supports using type annotations. + The :func:`~singledispatch.register` attribute now supports using type annotations. .. versionchanged:: 3.11 - The :func:`register` attribute now supports + The :func:`~singledispatch.register` attribute now supports :class:`typing.Union` as a type annotation. @@ -783,7 +792,7 @@ The :mod:`functools` module defines the following functions: 'Docstring' Without the use of this decorator factory, the name of the example function - would have been ``'wrapper'``, and the docstring of the original :func:`example` + would have been ``'wrapper'``, and the docstring of the original :func:`!example` would have been lost. diff --git a/Doc/library/math.rst b/Doc/library/math.rst index bf7a00549fc..55f2de07553 100644 --- a/Doc/library/math.rst +++ b/Doc/library/math.rst @@ -42,6 +42,8 @@ noted otherwise, all return values are floats. :func:`fabs(x) <fabs>` Absolute value of *x* :func:`floor(x) <floor>` Floor of *x*, the largest integer less than or equal to *x* :func:`fma(x, y, z) <fma>` Fused multiply-add operation: ``(x * y) + z`` +:func:`fmax(x, y) <fmax>` Maximum of two floating-point values +:func:`fmin(x, y) <fmin>` Minimum of two floating-point values :func:`fmod(x, y) <fmod>` Remainder of division ``x / y`` :func:`modf(x) <modf>` Fractional and integer parts of *x* :func:`remainder(x, y) <remainder>` Remainder of *x* with respect to *y* @@ -248,6 +250,30 @@ Floating point arithmetic .. versionadded:: 3.13 +.. function:: fmax(x, y) + + Get the larger of two floating-point values, treating NaNs as missing data. + + When both operands are (signed) NaNs or zeroes, return ``nan`` and ``0`` + respectively and the sign of the result is implementation-defined, that + is, :func:`!fmax` is not required to be sensitive to the sign of such + operands (see Annex F of the C11 standard, §F.10.0.3 and §F.10.9.2). + + .. versionadded:: next + + +.. function:: fmin(x, y) + + Get the smaller of two floating-point values, treating NaNs as missing data. + + When both operands are (signed) NaNs or zeroes, return ``nan`` and ``0`` + respectively and the sign of the result is implementation-defined, that + is, :func:`!fmin` is not required to be sensitive to the sign of such + operands (see Annex F of the C11 standard, §F.10.0.3 and §F.10.9.3). + + .. versionadded:: next + + .. function:: fmod(x, y) Return the floating-point remainder of ``x / y``, diff --git a/Doc/library/os.path.rst b/Doc/library/os.path.rst index f72aee19d8f..1c1cf07a655 100644 --- a/Doc/library/os.path.rst +++ b/Doc/library/os.path.rst @@ -298,9 +298,10 @@ the :mod:`glob` module.) device than *path*, or whether :file:`{path}/..` and *path* point to the same i-node on the same device --- this should detect mount points for all Unix and POSIX variants. It is not able to reliably detect bind mounts on the - same filesystem. On Windows, a drive letter root and a share UNC are - always mount points, and for any other path ``GetVolumePathName`` is called - to see if it is different from the input path. + same filesystem. On Linux systems, it will always return ``True`` for btrfs + subvolumes, even if they aren't mount points. On Windows, a drive letter root + and a share UNC are always mount points, and for any other path + ``GetVolumePathName`` is called to see if it is different from the input path. .. versionchanged:: 3.4 Added support for detecting non-root mount points on Windows. diff --git a/Doc/library/platform.rst b/Doc/library/platform.rst index 06de152a742..37df13f8a1e 100644 --- a/Doc/library/platform.rst +++ b/Doc/library/platform.rst @@ -176,8 +176,8 @@ Cross platform :attr:`processor` is resolved late, on demand. Note: the first two attribute names differ from the names presented by - :func:`os.uname`, where they are named :attr:`sysname` and - :attr:`nodename`. + :func:`os.uname`, where they are named :attr:`!sysname` and + :attr:`!nodename`. Entries which cannot be determined are set to ``''``. diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst index b6e51dffc40..c5b88c22325 100644 --- a/Doc/library/profile.rst +++ b/Doc/library/profile.rst @@ -4,7 +4,7 @@ The Python Profilers ******************** -**Source code:** :source:`Lib/profile.py` and :source:`Lib/pstats.py` +**Source code:** :source:`Lib/profile.py`, :source:`Lib/pstats.py`, and :source:`Lib/profile/sample.py` -------------- @@ -14,23 +14,32 @@ Introduction to the profilers ============================= .. index:: + single: statistical profiling + single: profiling, statistical single: deterministic profiling single: profiling, deterministic -:mod:`cProfile` and :mod:`profile` provide :dfn:`deterministic profiling` of +Python provides both :dfn:`statistical profiling` and :dfn:`deterministic profiling` of Python programs. A :dfn:`profile` is a set of statistics that describes how often and for how long various parts of the program executed. These statistics can be formatted into reports via the :mod:`pstats` module. -The Python standard library provides two different implementations of the same -profiling interface: +The Python standard library provides three different profiling implementations: -1. :mod:`cProfile` is recommended for most users; it's a C extension with +**Statistical Profiler:** + +1. :mod:`profile.sample` provides statistical profiling of running Python processes + using periodic stack sampling. It can attach to any running Python process without + requiring code modification or restart, making it ideal for production debugging. + +**Deterministic Profilers:** + +2. :mod:`cProfile` is recommended for development and testing; it's a C extension with reasonable overhead that makes it suitable for profiling long-running programs. Based on :mod:`lsprof`, contributed by Brett Rosen and Ted Czotter. -2. :mod:`profile`, a pure Python module whose interface is imitated by +3. :mod:`profile`, a pure Python module whose interface is imitated by :mod:`cProfile`, but which adds significant overhead to profiled programs. If you're trying to extend the profiler in some way, the task might be easier with this module. Originally designed and written by Jim Roskind. @@ -44,6 +53,77 @@ profiling interface: but not for C-level functions, and so the C code would seem faster than any Python one. +**Profiler Comparison:** + ++-------------------+----------------------+----------------------+----------------------+ +| Feature | Statistical | Deterministic | Deterministic | +| | (``profile.sample``) | (``cProfile``) | (``profile``) | ++===================+======================+======================+======================+ +| **Target** | Running process | Code you run | Code you run | ++-------------------+----------------------+----------------------+----------------------+ +| **Overhead** | Virtually none | Moderate | High | ++-------------------+----------------------+----------------------+----------------------+ +| **Accuracy** | Statistical approx. | Exact call counts | Exact call counts | ++-------------------+----------------------+----------------------+----------------------+ +| **Setup** | Attach to any PID | Instrument code | Instrument code | ++-------------------+----------------------+----------------------+----------------------+ +| **Use Case** | Production debugging | Development/testing | Profiler extension | ++-------------------+----------------------+----------------------+----------------------+ +| **Implementation**| C extension | C extension | Pure Python | ++-------------------+----------------------+----------------------+----------------------+ + +.. note:: + + The statistical profiler (:mod:`profile.sample`) is recommended for most production + use cases due to its extremely low overhead and ability to profile running processes + without modification. It can attach to any Python process and collect performance + data with minimal impact on execution speed, making it ideal for debugging + performance issues in live applications. + + +.. _statistical-profiling: + +What Is Statistical Profiling? +============================== + +:dfn:`Statistical profiling` works by periodically interrupting a running +program to capture its current call stack. Rather than monitoring every +function entry and exit like deterministic profilers, it takes snapshots at +regular intervals to build a statistical picture of where the program spends +its time. + +The sampling profiler uses process memory reading (via system calls like +``process_vm_readv`` on Linux, ``vm_read`` on macOS, and ``ReadProcessMemory`` on +Windows) to attach to a running Python process and extract stack trace +information without requiring any code modification or restart of the target +process. This approach provides several key advantages over traditional +profiling methods. + +The fundamental principle is that if a function appears frequently in the +collected stack samples, it is likely consuming significant CPU time. By +analyzing thousands of samples, the profiler can accurately estimate the +relative time spent in different parts of the program. The statistical nature +means that while individual measurements may vary, the aggregate results +converge to represent the true performance characteristics of the application. + +Since statistical profiling operates externally to the target process, it +introduces virtually no overhead to the running program. The profiler process +runs separately and reads the target process memory without interrupting its +execution. This makes it suitable for profiling production systems where +performance impact must be minimized. + +The accuracy of statistical profiling improves with the number of samples +collected. Short-lived functions may be missed or underrepresented, while +long-running functions will be captured proportionally to their execution time. +This characteristic makes statistical profiling particularly effective for +identifying the most significant performance bottlenecks rather than providing +exhaustive coverage of all function calls. + +Statistical profiling excels at answering questions like "which functions +consume the most CPU time?" and "where should I focus optimization efforts?" +rather than "exactly how many times was this function called?" The trade-off +between precision and practicality makes it an invaluable tool for performance +analysis in real-world applications. .. _profile-instant: @@ -54,6 +134,18 @@ This section is provided for users that "don't want to read the manual." It provides a very brief overview, and allows a user to rapidly perform profiling on an existing application. +**Statistical Profiling (Recommended for Production):** + +To profile an existing running process:: + + python -m profile.sample 1234 + +To profile with custom settings:: + + python -m profile.sample -i 50 -d 30 1234 + +**Deterministic Profiling (Development/Testing):** + To profile a function that takes a single argument, you can do:: import cProfile @@ -121,8 +213,208 @@ results to a file by specifying a filename to the :func:`run` function:: The :class:`pstats.Stats` class reads profile results from a file and formats them in various ways. +.. _sampling-profiler-cli: + +Statistical Profiler Command Line Interface +=========================================== + +.. program:: profile.sample + +The :mod:`profile.sample` module can be invoked as a script to profile running processes:: + + python -m profile.sample [options] PID + +**Basic Usage Examples:** + +Profile process 1234 for 10 seconds with default settings:: + + python -m profile.sample 1234 + +Profile with custom interval and duration, save to file:: + + python -m profile.sample -i 50 -d 30 -o profile.stats 1234 + +Generate collapsed stacks to use with tools like `flamegraph.pl +<https://github.com/brendangregg/FlameGraph>`_:: + + python -m profile.sample --collapsed 1234 + +Profile all threads, sort by total time:: + + python -m profile.sample -a --sort-tottime 1234 + +Profile with real-time sampling statistics:: + + python -m profile.sample --realtime-stats 1234 + +**Command Line Options:** + +.. option:: PID + + Process ID of the Python process to profile (required) + +.. option:: -i, --interval INTERVAL + + Sampling interval in microseconds (default: 100) + +.. option:: -d, --duration DURATION + + Sampling duration in seconds (default: 10) + +.. option:: -a, --all-threads + + Sample all threads in the process instead of just the main thread + +.. option:: --realtime-stats + + Print real-time sampling statistics during profiling + +.. option:: --pstats + + Generate pstats output (default) + +.. option:: --collapsed + + Generate collapsed stack traces for flamegraphs + +.. option:: -o, --outfile OUTFILE + + Save output to a file + +**Sorting Options (pstats format only):** + +.. option:: --sort-nsamples + + Sort by number of direct samples + +.. option:: --sort-tottime + + Sort by total time + +.. option:: --sort-cumtime + + Sort by cumulative time (default) + +.. option:: --sort-sample-pct + + Sort by sample percentage + +.. option:: --sort-cumul-pct + + Sort by cumulative sample percentage + +.. option:: --sort-nsamples-cumul + + Sort by cumulative samples + +.. option:: --sort-name + + Sort by function name + +.. option:: -l, --limit LIMIT + + Limit the number of rows in the output (default: 15) + +.. option:: --no-summary + + Disable the summary section in the output + +**Understanding Statistical Profile Output:** + +The statistical profiler produces output similar to deterministic profilers but with different column meanings:: + + Profile Stats: + nsamples sample% tottime (ms) cumul% cumtime (ms) filename:lineno(function) + 45/67 12.5 23.450 18.6 56.780 mymodule.py:42(process_data) + 23/23 6.4 15.230 6.4 15.230 <built-in>:0(len) + +**Column Meanings:** + +- **nsamples**: ``direct/cumulative`` - Times function was directly executing / on call stack +- **sample%**: Percentage of total samples where function was directly executing +- **tottime**: Estimated time spent directly in this function +- **cumul%**: Percentage of samples where function was anywhere on call stack +- **cumtime**: Estimated cumulative time including called functions +- **filename:lineno(function)**: Location and name of the function + .. _profile-cli: +:mod:`profile.sample` Module Reference +======================================================= + +.. module:: profile.sample + :synopsis: Python statistical profiler. + +This section documents the programmatic interface for the :mod:`profile.sample` module. +For command-line usage, see :ref:`sampling-profiler-cli`. For conceptual information +about statistical profiling, see :ref:`statistical-profiling` + +.. function:: sample(pid, *, sort=2, sample_interval_usec=100, duration_sec=10, filename=None, all_threads=False, limit=None, show_summary=True, output_format="pstats", realtime_stats=False) + + Sample a Python process and generate profiling data. + + This is the main entry point for statistical profiling. It creates a + :class:`SampleProfiler`, collects stack traces from the target process, and + outputs the results in the specified format. + + :param int pid: Process ID of the target Python process + :param int sort: Sort order for pstats output (default: 2 for cumulative time) + :param int sample_interval_usec: Sampling interval in microseconds (default: 100) + :param int duration_sec: Duration to sample in seconds (default: 10) + :param str filename: Output filename (None for stdout/default naming) + :param bool all_threads: Whether to sample all threads (default: False) + :param int limit: Maximum number of functions to display (default: None) + :param bool show_summary: Whether to show summary statistics (default: True) + :param str output_format: Output format - 'pstats' or 'collapsed' (default: 'pstats') + :param bool realtime_stats: Whether to display real-time statistics (default: False) + + :raises ValueError: If output_format is not 'pstats' or 'collapsed' + + Examples:: + + # Basic usage - profile process 1234 for 10 seconds + import profile.sample + profile.sample.sample(1234) + + # Profile with custom settings + profile.sample.sample(1234, duration_sec=30, sample_interval_usec=50, all_threads=True) + + # Generate collapsed stack traces for flamegraph.pl + profile.sample.sample(1234, output_format='collapsed', filename='profile.collapsed') + +.. class:: SampleProfiler(pid, sample_interval_usec, all_threads) + + Low-level API for the statistical profiler. + + This profiler uses periodic stack sampling to collect performance data + from running Python processes with minimal overhead. It can attach to + any Python process by PID and collect stack traces at regular intervals. + + :param int pid: Process ID of the target Python process + :param int sample_interval_usec: Sampling interval in microseconds + :param bool all_threads: Whether to sample all threads or just the main thread + + .. method:: sample(collector, duration_sec=10) + + Sample the target process for the specified duration. + + Collects stack traces from the target process at regular intervals + and passes them to the provided collector for processing. + + :param collector: Object that implements ``collect()`` method to process stack traces + :param int duration_sec: Duration to sample in seconds (default: 10) + + The method tracks sampling statistics and can display real-time + information if realtime_stats is enabled. + +.. seealso:: + + :ref:`sampling-profiler-cli` + Command-line interface documentation for the statistical profiler. + +Deterministic Profiler Command Line Interface +============================================= + .. program:: cProfile The files :mod:`cProfile` and :mod:`profile` can also be invoked as a script to @@ -564,7 +856,7 @@ What Is Deterministic Profiling? call*, *function return*, and *exception* events are monitored, and precise timings are made for the intervals between these events (during which time the user's code is executing). In contrast, :dfn:`statistical profiling` (which is -not done by this module) randomly samples the effective instruction pointer, and +provided by the :mod:`profile.sample` module) periodically samples the effective instruction pointer, and deduces where time is being spent. The latter technique traditionally involves less overhead (as the code does not need to be instrumented), but provides only relative indications of where time is being spent. diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst index 2d57cff10a9..5506ac828e5 100644 --- a/Doc/library/pyexpat.rst +++ b/Doc/library/pyexpat.rst @@ -16,11 +16,10 @@ references to these attributes should be marked using the :member: role. -.. warning:: +.. note:: - The :mod:`pyexpat` module is not secure against maliciously - constructed data. If you need to parse untrusted or unauthenticated data see - :ref:`xml-vulnerabilities`. + If you need to parse untrusted or unauthenticated data, see + :ref:`xml-security`. .. index:: single: Expat diff --git a/Doc/library/random.rst b/Doc/library/random.rst index ef0cfb0e76c..b1120b3a4d8 100644 --- a/Doc/library/random.rst +++ b/Doc/library/random.rst @@ -447,6 +447,11 @@ Alternative Generator Override this method in subclasses to customise the :meth:`~random.getrandbits` behaviour of :class:`!Random` instances. + .. method:: Random.randbytes(n) + + Override this method in subclasses to customise the + :meth:`~random.randbytes` behaviour of :class:`!Random` instances. + .. class:: SystemRandom([seed]) diff --git a/Doc/library/security_warnings.rst b/Doc/library/security_warnings.rst index a573c98f73e..70c359cc1c0 100644 --- a/Doc/library/security_warnings.rst +++ b/Doc/library/security_warnings.rst @@ -28,7 +28,7 @@ The following modules have specific security considerations: <subprocess-security>` * :mod:`tempfile`: :ref:`mktemp is deprecated due to vulnerability to race conditions <tempfile-mktemp-deprecated>` -* :mod:`xml`: :ref:`XML vulnerabilities <xml-vulnerabilities>` +* :mod:`xml`: :ref:`XML security <xml-security>` * :mod:`zipfile`: :ref:`maliciously prepared .zip files can cause disk volume exhaustion <zipfile-resources-limitations>` diff --git a/Doc/library/shelve.rst b/Doc/library/shelve.rst index 23a2e0c3d0c..23808619524 100644 --- a/Doc/library/shelve.rst +++ b/Doc/library/shelve.rst @@ -17,7 +17,8 @@ This includes most class instances, recursive data types, and objects containing lots of shared sub-objects. The keys are ordinary strings. -.. function:: open(filename, flag='c', protocol=None, writeback=False) +.. function:: open(filename, flag='c', protocol=None, writeback=False, *, \ + serializer=None, deserializer=None) Open a persistent dictionary. The filename specified is the base filename for the underlying database. As a side-effect, an extension may be added to the @@ -41,6 +42,21 @@ lots of shared sub-objects. The keys are ordinary strings. determine which accessed entries are mutable, nor which ones were actually mutated). + By default, :mod:`shelve` uses :func:`pickle.dumps` and :func:`pickle.loads` + for serializing and deserializing. This can be changed by supplying + *serializer* and *deserializer*, respectively. + + The *serializer* argument must be a callable which takes an object ``obj`` + and the *protocol* as inputs and returns the representation ``obj`` as a + :term:`bytes-like object`; the *protocol* value may be ignored by the + serializer. + + The *deserializer* argument must be callable which takes a serialized object + given as a :class:`bytes` object and returns the corresponding object. + + A :exc:`ShelveError` is raised if *serializer* is given but *deserializer* + is not, or vice-versa. + .. versionchanged:: 3.10 :const:`pickle.DEFAULT_PROTOCOL` is now used as the default pickle protocol. @@ -48,6 +64,10 @@ lots of shared sub-objects. The keys are ordinary strings. .. versionchanged:: 3.11 Accepts :term:`path-like object` for filename. + .. versionchanged:: next + Accepts custom *serializer* and *deserializer* functions in place of + :func:`pickle.dumps` and :func:`pickle.loads`. + .. note:: Do not rely on the shelf being closed automatically; always call @@ -129,7 +149,8 @@ Restrictions explicitly. -.. class:: Shelf(dict, protocol=None, writeback=False, keyencoding='utf-8') +.. class:: Shelf(dict, protocol=None, writeback=False, \ + keyencoding='utf-8', *, serializer=None, deserializer=None) A subclass of :class:`collections.abc.MutableMapping` which stores pickled values in the *dict* object. @@ -147,6 +168,9 @@ Restrictions The *keyencoding* parameter is the encoding used to encode keys before they are used with the underlying dict. + The *serializer* and *deserializer* parameters have the same interpretation + as in :func:`~shelve.open`. + A :class:`Shelf` object can also be used as a context manager, in which case it will be automatically closed when the :keyword:`with` block ends. @@ -161,8 +185,13 @@ Restrictions :const:`pickle.DEFAULT_PROTOCOL` is now used as the default pickle protocol. + .. versionchanged:: next + Added the *serializer* and *deserializer* parameters. -.. class:: BsdDbShelf(dict, protocol=None, writeback=False, keyencoding='utf-8') + +.. class:: BsdDbShelf(dict, protocol=None, writeback=False, \ + keyencoding='utf-8', *, \ + serializer=None, deserializer=None) A subclass of :class:`Shelf` which exposes :meth:`!first`, :meth:`!next`, :meth:`!previous`, :meth:`!last` and :meth:`!set_location` methods. @@ -172,18 +201,27 @@ Restrictions modules. The *dict* object passed to the constructor must support those methods. This is generally accomplished by calling one of :func:`!bsddb.hashopen`, :func:`!bsddb.btopen` or :func:`!bsddb.rnopen`. The - optional *protocol*, *writeback*, and *keyencoding* parameters have the same - interpretation as for the :class:`Shelf` class. + optional *protocol*, *writeback*, *keyencoding*, *serializer* and *deserializer* + parameters have the same interpretation as in :func:`~shelve.open`. + + .. versionchanged:: next + Added the *serializer* and *deserializer* parameters. -.. class:: DbfilenameShelf(filename, flag='c', protocol=None, writeback=False) +.. class:: DbfilenameShelf(filename, flag='c', protocol=None, \ + writeback=False, *, serializer=None, \ + deserializer=None) A subclass of :class:`Shelf` which accepts a *filename* instead of a dict-like object. The underlying file will be opened using :func:`dbm.open`. By default, the file will be created and opened for both read and write. The - optional *flag* parameter has the same interpretation as for the :func:`.open` - function. The optional *protocol* and *writeback* parameters have the same - interpretation as for the :class:`Shelf` class. + optional *flag* parameter has the same interpretation as for the + :func:`.open` function. The optional *protocol*, *writeback*, *serializer* + and *deserializer* parameters have the same interpretation as in + :func:`~shelve.open`. + + .. versionchanged:: next + Added the *serializer* and *deserializer* parameters. .. _shelve-example: @@ -225,6 +263,20 @@ object):: d.close() # close it +Exceptions +---------- + +.. exception:: ShelveError + + Exception raised when one of the arguments *deserializer* and *serializer* + is missing in the :func:`~shelve.open`, :class:`Shelf`, :class:`BsdDbShelf` + and :class:`DbfilenameShelf`. + + The *deserializer* and *serializer* arguments must be given together. + + .. versionadded:: next + + .. seealso:: Module :mod:`dbm` diff --git a/Doc/library/sqlite3.rst b/Doc/library/sqlite3.rst index 641e1f1de03..a14af6d3d88 100644 --- a/Doc/library/sqlite3.rst +++ b/Doc/library/sqlite3.rst @@ -2288,7 +2288,7 @@ This section shows recipes for common adapters and converters. def adapt_datetime_iso(val): """Adapt datetime.datetime to timezone-naive ISO 8601 date.""" - return val.isoformat() + return val.replace(tzinfo=None).isoformat() def adapt_datetime_epoch(val): """Adapt datetime.datetime to Unix timestamp.""" diff --git a/Doc/library/ssl.rst b/Doc/library/ssl.rst index ae2e324d0ab..a9930183f9a 100644 --- a/Doc/library/ssl.rst +++ b/Doc/library/ssl.rst @@ -1078,8 +1078,9 @@ SSL Sockets (but passing a non-zero ``flags`` argument is not allowed) - :meth:`~socket.socket.send`, :meth:`~socket.socket.sendall` (with the same limitation) - - :meth:`~socket.socket.sendfile` (but :mod:`os.sendfile` will be used - for plain-text sockets only, else :meth:`~socket.socket.send` will be used) + - :meth:`~socket.socket.sendfile` (it may be high-performant only when + the kernel TLS is enabled by setting :data:`~ssl.OP_ENABLE_KTLS` or when a + socket is plain-text, else :meth:`~socket.socket.send` will be used) - :meth:`~socket.socket.shutdown` However, since the SSL (and TLS) protocol has its own framing atop @@ -1113,6 +1114,11 @@ SSL Sockets functions support reading and writing of data larger than 2 GB. Writing zero-length data no longer fails with a protocol violation error. + .. versionchanged:: next + Python now uses ``SSL_sendfile`` internally when possible. The + function sends a file more efficiently because it performs TLS encryption + in the kernel to avoid additional context switches. + SSL sockets also have the following additional methods and attributes: .. method:: SSLSocket.read(len=1024, buffer=None) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst index 1626a89a073..05bc7cfb9dc 100644 --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -953,6 +953,8 @@ always available. Unless explicitly noted otherwise, all variables are read-only This function should be used for internal and specialized purposes only. It is not guaranteed to exist in all implementations of Python. + .. versionadded:: 3.12 + .. function:: getobjects(limit[, type]) diff --git a/Doc/library/time.rst b/Doc/library/time.rst index 29b695a9b19..df9be68bf4f 100644 --- a/Doc/library/time.rst +++ b/Doc/library/time.rst @@ -306,10 +306,11 @@ Functions .. versionadded:: 3.3 .. versionchanged:: 3.5 - The function is now always available and always system-wide. + The function is now always available and the clock is now the same for + all processes. .. versionchanged:: 3.10 - On macOS, the function is now system-wide. + On macOS, the clock is now the same for all processes. .. function:: monotonic_ns() -> int @@ -325,7 +326,8 @@ Functions Return the value (in fractional seconds) of a performance counter, i.e. a clock with the highest available resolution to measure a short duration. It - does include time elapsed during sleep and is system-wide. The reference + does include time elapsed during sleep. The clock is the same for all + processes. The reference point of the returned value is undefined, so that only the difference between the results of two calls is valid. @@ -340,7 +342,7 @@ Functions .. versionadded:: 3.3 .. versionchanged:: 3.10 - On Windows, the function is now system-wide. + On Windows, the clock is now the same for all processes. .. versionchanged:: 3.13 Use the same clock as :func:`time.monotonic`. @@ -987,8 +989,8 @@ The following constant is the only parameter that can be sent to .. data:: CLOCK_REALTIME - System-wide real-time clock. Setting this clock requires appropriate - privileges. + Real-time clock. Setting this clock requires appropriate privileges. + The clock is the same for all processes. .. availability:: Unix. diff --git a/Doc/library/unittest.rst b/Doc/library/unittest.rst index d526e835caa..ec96e841612 100644 --- a/Doc/library/unittest.rst +++ b/Doc/library/unittest.rst @@ -2563,7 +2563,7 @@ To add cleanup code that must be run even in the case of an exception, use .. versionadded:: 3.8 -.. classmethod:: enterModuleContext(cm) +.. function:: enterModuleContext(cm) Enter the supplied :term:`context manager`. If successful, also add its :meth:`~object.__exit__` method as a cleanup function by diff --git a/Doc/library/xml.dom.minidom.rst b/Doc/library/xml.dom.minidom.rst index 00a18751207..9ffedf7366a 100644 --- a/Doc/library/xml.dom.minidom.rst +++ b/Doc/library/xml.dom.minidom.rst @@ -19,11 +19,10 @@ not already proficient with the DOM should consider using the :mod:`xml.etree.ElementTree` module for their XML processing instead. -.. warning:: +.. note:: - The :mod:`xml.dom.minidom` module is not secure against - maliciously constructed data. If you need to parse untrusted or - unauthenticated data see :ref:`xml-vulnerabilities`. + If you need to parse untrusted or unauthenticated data, see + :ref:`xml-security`. DOM applications typically start by parsing some XML into a DOM. With diff --git a/Doc/library/xml.dom.pulldom.rst b/Doc/library/xml.dom.pulldom.rst index fd96765cbe3..8bceeecd463 100644 --- a/Doc/library/xml.dom.pulldom.rst +++ b/Doc/library/xml.dom.pulldom.rst @@ -19,11 +19,10 @@ responsible for explicitly pulling events from the stream, looping over those events until either processing is finished or an error condition occurs. -.. warning:: +.. note:: - The :mod:`xml.dom.pulldom` module is not secure against - maliciously constructed data. If you need to parse untrusted or - unauthenticated data see :ref:`xml-vulnerabilities`. + If you need to parse untrusted or unauthenticated data, see + :ref:`xml-security`. .. versionchanged:: 3.7.1 diff --git a/Doc/library/xml.etree.elementtree.rst b/Doc/library/xml.etree.elementtree.rst index 1daf6628013..00075ac2a23 100644 --- a/Doc/library/xml.etree.elementtree.rst +++ b/Doc/library/xml.etree.elementtree.rst @@ -20,11 +20,10 @@ for parsing and creating XML data. The :mod:`!xml.etree.cElementTree` module is deprecated. -.. warning:: +.. note:: - The :mod:`xml.etree.ElementTree` module is not secure against - maliciously constructed data. If you need to parse untrusted or - unauthenticated data see :ref:`xml-vulnerabilities`. + If you need to parse untrusted or unauthenticated data, see + :ref:`xml-security`. Tutorial -------- diff --git a/Doc/library/xml.rst b/Doc/library/xml.rst index d4959953989..28465219a1a 100644 --- a/Doc/library/xml.rst +++ b/Doc/library/xml.rst @@ -15,12 +15,10 @@ XML Processing Modules Python's interfaces for processing XML are grouped in the ``xml`` package. -.. warning:: +.. note:: - The XML modules are not secure against erroneous or maliciously - constructed data. If you need to parse untrusted or - unauthenticated data see the :ref:`xml-vulnerabilities` and - :ref:`defusedxml-package` sections. + If you need to parse untrusted or unauthenticated data, see + :ref:`xml-security`. It is important to note that modules in the :mod:`xml` package require that there be at least one SAX-compliant XML parser available. The Expat parser is @@ -47,46 +45,22 @@ The XML handling submodules are: * :mod:`xml.parsers.expat`: the Expat parser binding +.. _xml-security: .. _xml-vulnerabilities: -XML vulnerabilities -------------------- +XML security +------------ -The XML processing modules are not secure against maliciously constructed data. An attacker can abuse XML features to carry out denial of service attacks, access local files, generate network connections to other machines, or circumvent firewalls. -The following table gives an overview of the known attacks and whether -the various modules are vulnerable to them. - -========================= ================== ================== ================== ================== ================== -kind sax etree minidom pulldom xmlrpc -========================= ================== ================== ================== ================== ================== -billion laughs **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) -quadratic blowup **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) -external entity expansion Safe (5) Safe (2) Safe (3) Safe (5) Safe (4) -`DTD`_ retrieval Safe (5) Safe Safe Safe (5) Safe -decompression bomb Safe Safe Safe Safe **Vulnerable** -large tokens **Vulnerable** (6) **Vulnerable** (6) **Vulnerable** (6) **Vulnerable** (6) **Vulnerable** (6) -========================= ================== ================== ================== ================== ================== - -1. Expat 2.4.1 and newer is not vulnerable to the "billion laughs" and - "quadratic blowup" vulnerabilities. Items still listed as vulnerable due to - potential reliance on system-provided libraries. Check - :const:`!pyexpat.EXPAT_VERSION`. -2. :mod:`xml.etree.ElementTree` doesn't expand external entities and raises a - :exc:`~xml.etree.ElementTree.ParseError` when an entity occurs. -3. :mod:`xml.dom.minidom` doesn't expand external entities and simply returns - the unexpanded entity verbatim. -4. :mod:`xmlrpc.client` doesn't expand external entities and omits them. -5. Since Python 3.7.1, external general entities are no longer processed by - default. -6. Expat 2.6.0 and newer is not vulnerable to denial of service - through quadratic runtime caused by parsing large tokens. - Items still listed as vulnerable due to - potential reliance on system-provided libraries. Check - :const:`!pyexpat.EXPAT_VERSION`. +Expat versions lower that 2.6.0 may be vulnerable to "billion laughs", +"quadratic blowup" and "large tokens". Python may be vulnerable if it uses such +older versions of Expat as a system-provided library. +Check :const:`!pyexpat.EXPAT_VERSION`. + +:mod:`xmlrpc` is **vulnerable** to the "decompression bomb" attack. billion laughs / exponential entity expansion @@ -103,16 +77,6 @@ quadratic blowup entity expansion efficient as the exponential case but it avoids triggering parser countermeasures that forbid deeply nested entities. -external entity expansion - Entity declarations can contain more than just text for replacement. They can - also point to external resources or local files. The XML - parser accesses the resource and embeds the content into the XML document. - -`DTD`_ retrieval - Some XML libraries like Python's :mod:`xml.dom.pulldom` retrieve document type - definitions from remote or local locations. The feature has similar - implications as the external entity expansion issue. - decompression bomb Decompression bombs (aka `ZIP bomb`_) apply to all XML libraries that can parse compressed XML streams such as gzipped HTTP streams or @@ -126,21 +90,5 @@ large tokens be used to cause denial of service in the application parsing XML. The issue is known as :cve:`2023-52425`. -The documentation for :pypi:`defusedxml` on PyPI has further information about -all known attack vectors with examples and references. - -.. _defusedxml-package: - -The :mod:`!defusedxml` Package ------------------------------- - -:pypi:`defusedxml` is a pure Python package with modified subclasses of all stdlib -XML parsers that prevent any potentially malicious operation. Use of this -package is recommended for any server code that parses untrusted XML data. The -package also ships with example exploits and extended documentation on more -XML exploits such as XPath injection. - - .. _Billion Laughs: https://en.wikipedia.org/wiki/Billion_laughs .. _ZIP bomb: https://en.wikipedia.org/wiki/Zip_bomb -.. _DTD: https://en.wikipedia.org/wiki/Document_type_definition diff --git a/Doc/library/xml.sax.rst b/Doc/library/xml.sax.rst index c60e9e505f7..5fa92645a44 100644 --- a/Doc/library/xml.sax.rst +++ b/Doc/library/xml.sax.rst @@ -18,11 +18,10 @@ SAX exceptions and the convenience functions which will be most used by users of the SAX API. -.. warning:: +.. note:: - The :mod:`xml.sax` module is not secure against maliciously - constructed data. If you need to parse untrusted or unauthenticated data see - :ref:`xml-vulnerabilities`. + If you need to parse untrusted or unauthenticated data, see + :ref:`xml-security`. .. versionchanged:: 3.7.1 diff --git a/Doc/library/xmlrpc.client.rst b/Doc/library/xmlrpc.client.rst index 654154cb43d..547cb50be78 100644 --- a/Doc/library/xmlrpc.client.rst +++ b/Doc/library/xmlrpc.client.rst @@ -24,8 +24,8 @@ between conformable Python objects and XML on the wire. .. warning:: The :mod:`xmlrpc.client` module is not secure against maliciously - constructed data. If you need to parse untrusted or unauthenticated data see - :ref:`xml-vulnerabilities`. + constructed data. If you need to parse untrusted or unauthenticated data, + see :ref:`xml-security`. .. versionchanged:: 3.5 diff --git a/Doc/library/xmlrpc.server.rst b/Doc/library/xmlrpc.server.rst index 06169c7eca8..2a8f6f8d5fc 100644 --- a/Doc/library/xmlrpc.server.rst +++ b/Doc/library/xmlrpc.server.rst @@ -20,8 +20,8 @@ servers written in Python. Servers can either be free standing, using .. warning:: The :mod:`xmlrpc.server` module is not secure against maliciously - constructed data. If you need to parse untrusted or unauthenticated data see - :ref:`xml-vulnerabilities`. + constructed data. If you need to parse untrusted or unauthenticated data, + see :ref:`xml-security`. .. include:: ../includes/wasm-notavail.rst |