Diffstat (limited to 'Doc')
30 files changed, 429 insertions, 93 deletions
diff --git a/Doc/c-api/arg.rst b/Doc/c-api/arg.rst index 3bbc990b632..49dbc8d71cc 100644 --- a/Doc/c-api/arg.rst +++ b/Doc/c-api/arg.rst @@ -685,6 +685,7 @@ Building values ``p`` (:class:`bool`) [int] Convert a C :c:expr:`int` to a Python :class:`bool` object. + .. versionadded:: 3.14 ``c`` (:class:`bytes` of length 1) [char] diff --git a/Doc/c-api/exceptions.rst b/Doc/c-api/exceptions.rst index c8e1b5c2461..885dbeb7530 100644 --- a/Doc/c-api/exceptions.rst +++ b/Doc/c-api/exceptions.rst @@ -982,6 +982,7 @@ the variables: .. index:: single: PyExc_BaseException (C var) + single: PyExc_BaseExceptionGroup (C var) single: PyExc_Exception (C var) single: PyExc_ArithmeticError (C var) single: PyExc_AssertionError (C var) @@ -1041,6 +1042,8 @@ the variables: +=========================================+=================================+==========+ | :c:data:`PyExc_BaseException` | :exc:`BaseException` | [1]_ | +-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_BaseExceptionGroup` | :exc:`BaseExceptionGroup` | [1]_ | ++-----------------------------------------+---------------------------------+----------+ | :c:data:`PyExc_Exception` | :exc:`Exception` | [1]_ | +-----------------------------------------+---------------------------------+----------+ | :c:data:`PyExc_ArithmeticError` | :exc:`ArithmeticError` | [1]_ | @@ -1164,6 +1167,9 @@ the variables: .. versionadded:: 3.6 :c:data:`PyExc_ModuleNotFoundError`. +.. versionadded:: 3.11 + :c:data:`PyExc_BaseExceptionGroup`. + These are compatibility aliases to :c:data:`PyExc_OSError`: .. 
index:: @@ -1207,6 +1213,7 @@ the variables: single: PyExc_Warning (C var) single: PyExc_BytesWarning (C var) single: PyExc_DeprecationWarning (C var) + single: PyExc_EncodingWarning (C var) single: PyExc_FutureWarning (C var) single: PyExc_ImportWarning (C var) single: PyExc_PendingDeprecationWarning (C var) @@ -1225,6 +1232,8 @@ the variables: +------------------------------------------+---------------------------------+----------+ | :c:data:`PyExc_DeprecationWarning` | :exc:`DeprecationWarning` | | +------------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_EncodingWarning` | :exc:`EncodingWarning` | | ++------------------------------------------+---------------------------------+----------+ | :c:data:`PyExc_FutureWarning` | :exc:`FutureWarning` | | +------------------------------------------+---------------------------------+----------+ | :c:data:`PyExc_ImportWarning` | :exc:`ImportWarning` | | @@ -1245,6 +1254,9 @@ the variables: .. versionadded:: 3.2 :c:data:`PyExc_ResourceWarning`. +.. versionadded:: 3.10 + :c:data:`PyExc_EncodingWarning`. + Notes: .. [3] diff --git a/Doc/c-api/lifecycle.rst b/Doc/c-api/lifecycle.rst index 0e2ffc096ca..5a170862a26 100644 --- a/Doc/c-api/lifecycle.rst +++ b/Doc/c-api/lifecycle.rst @@ -55,16 +55,14 @@ that must be true for *B* to occur after *A*. .. image:: lifecycle.dot.svg :align: center :class: invert-in-dark-mode - :alt: Diagram showing events in an object's life. Explained in detail - below. + :alt: Diagram showing events in an object's life. Explained in detail below. .. only:: latex .. image:: lifecycle.dot.pdf :align: center :class: invert-in-dark-mode - :alt: Diagram showing events in an object's life. Explained in detail - below. + :alt: Diagram showing events in an object's life. Explained in detail below. .. 
container:: :name: life-events-graph-description diff --git a/Doc/c-api/stable.rst b/Doc/c-api/stable.rst index 124e58cf950..9b65e0b8d23 100644 --- a/Doc/c-api/stable.rst +++ b/Doc/c-api/stable.rst @@ -51,6 +51,7 @@ It is generally intended for specialized, low-level tools like debuggers. Projects that use this API are expected to follow CPython development and spend extra effort adjusting to changes. +.. _stable-application-binary-interface: Stable Application Binary Interface =================================== diff --git a/Doc/c-api/typeobj.rst b/Doc/c-api/typeobj.rst index 91046c0e6f1..af2bead3bb5 100644 --- a/Doc/c-api/typeobj.rst +++ b/Doc/c-api/typeobj.rst @@ -686,6 +686,26 @@ and :c:data:`PyType_Type` effectively act as defaults.) instance, and call the type's :c:member:`~PyTypeObject.tp_free` function to free the object itself. + If you may call functions that may set the error indicator, you must use + :c:func:`PyErr_GetRaisedException` and :c:func:`PyErr_SetRaisedException` + to ensure you don't clobber a preexisting error indicator (the deallocation + could have occurred while processing a different error): + + .. code-block:: c + + static void + foo_dealloc(foo_object *self) + { + PyObject *et, *ev, *etb; + PyObject *exc = PyErr_GetRaisedException(); + ... + PyErr_SetRaisedException(exc); + } + + The dealloc handler itself must not raise an exception; if it hits an error + case it should call :c:func:`PyErr_FormatUnraisable` to log (and clear) an + unraisable exception. + No guarantees are made about when an object is destroyed, except: * Python will destroy an object immediately or some time after the final diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst index 45f50ba5f97..84fee05cb4c 100644 --- a/Doc/c-api/unicode.rst +++ b/Doc/c-api/unicode.rst @@ -191,6 +191,22 @@ access to internal read-only data of Unicode objects: .. versionadded:: 3.2 +.. 
c:function:: Py_hash_t PyUnstable_Unicode_GET_CACHED_HASH(PyObject *str) + + If the hash of *str*, as returned by :c:func:`PyObject_Hash`, has been + cached and is immediately available, return it. + Otherwise, return ``-1`` *without* setting an exception. + + If *str* is not a string (that is, if ``PyUnicode_Check(obj)`` + is false), the behavior is undefined. + + This function never fails with an exception. + + Note that there are no guarantees on when an object's hash is cached, + and the (non-)existence of a cached hash does not imply that the string has + any other properties. + + Unicode Character Properties """""""""""""""""""""""""""" @@ -1811,7 +1827,7 @@ object. On success, return ``0``. On error, set an exception, leave the writer unchanged, and return ``-1``. - .. versionadded:: next + .. versionadded:: 3.14 .. c:function:: int PyUnicodeWriter_WriteWideChar(PyUnicodeWriter *writer, const wchar_t *str, Py_ssize_t size) diff --git a/Doc/conf.py b/Doc/conf.py index 7fadad66cb3..b08f5452901 100644 --- a/Doc/conf.py +++ b/Doc/conf.py @@ -234,6 +234,7 @@ nitpick_ignore += [ ('c:data', 'PyExc_AssertionError'), ('c:data', 'PyExc_AttributeError'), ('c:data', 'PyExc_BaseException'), + ('c:data', 'PyExc_BaseExceptionGroup'), ('c:data', 'PyExc_BlockingIOError'), ('c:data', 'PyExc_BrokenPipeError'), ('c:data', 'PyExc_BufferError'), @@ -287,6 +288,7 @@ nitpick_ignore += [ # C API: Standard Python warning classes ('c:data', 'PyExc_BytesWarning'), ('c:data', 'PyExc_DeprecationWarning'), + ('c:data', 'PyExc_EncodingWarning'), ('c:data', 'PyExc_FutureWarning'), ('c:data', 'PyExc_ImportWarning'), ('c:data', 'PyExc_PendingDeprecationWarning'), diff --git a/Doc/deprecations/pending-removal-in-3.19.rst b/Doc/deprecations/pending-removal-in-3.19.rst index 3936f63ca5b..25f9cba390d 100644 --- a/Doc/deprecations/pending-removal-in-3.19.rst +++ b/Doc/deprecations/pending-removal-in-3.19.rst @@ -6,3 +6,19 @@ Pending removal in Python 3.19 * Implicitly switching to the 
MSVC-compatible struct layout by setting :attr:`~ctypes.Structure._pack_` but not :attr:`~ctypes.Structure._layout_` on non-Windows platforms. + +* :mod:`hashlib`: + + - In hash function constructors such as :func:`~hashlib.new` or the + direct hash-named constructors such as :func:`~hashlib.md5` and + :func:`~hashlib.sha256`, their optional initial data parameter could + also be passed a keyword argument named ``data=`` or ``string=`` in + various :mod:`!hashlib` implementations. + + Support for the ``string`` keyword argument name is now deprecated + and slated for removal in Python 3.19. + + Before Python 3.13, the ``string`` keyword parameter was not correctly + supported depending on the backend implementation of hash functions. + Prefer passing the initial data as a positional argument for maximum + backwards compatibility. diff --git a/Doc/extending/windows.rst b/Doc/extending/windows.rst index 56aa44e4e58..a97c6182553 100644 --- a/Doc/extending/windows.rst +++ b/Doc/extending/windows.rst @@ -121,7 +121,7 @@ When creating DLLs in Windows, you can use the CPython library in two ways: :file:`Python.h` triggers an implicit, configure-aware link with the library. The header file chooses :file:`pythonXY_d.lib` for Debug, :file:`pythonXY.lib` for Release, and :file:`pythonX.lib` for Release with - the `Limited API <stable-application-binary-interface>`_ enabled. + the :ref:`Limited API <stable-application-binary-interface>` enabled. To build two DLLs, spam and ni (which uses C functions found in spam), you could use these commands:: diff --git a/Doc/library/calendar.rst b/Doc/library/calendar.rst index 39090e36ed9..b292d828841 100644 --- a/Doc/library/calendar.rst +++ b/Doc/library/calendar.rst @@ -251,7 +251,7 @@ interpreted as prescribed by the ISO 8601 standard. Year 0 is 1 BC, year -1 is 3) specifies the number of months per row. *css* is the name for the cascading style sheet to be used. :const:`None` can be passed if no style sheet should be used. 
*encoding* specifies the encoding to be used for the - output (defaulting to the system default encoding). + output (defaulting to ``'utf-8'``). .. method:: formatmonthname(theyear, themonth, withyear=True) diff --git a/Doc/library/compression.zstd.rst b/Doc/library/compression.zstd.rst index 35bcbc2bfd8..57ad8e3377f 100644 --- a/Doc/library/compression.zstd.rst +++ b/Doc/library/compression.zstd.rst @@ -247,6 +247,27 @@ Compressing and decompressing data in memory The *mode* argument is a :class:`ZstdCompressor` attribute, either :attr:`~.FLUSH_BLOCK`, or :attr:`~.FLUSH_FRAME`. + .. method:: set_pledged_input_size(size) + + Specify the amount of uncompressed data *size* that will be provided for + the next frame. *size* will be written into the frame header of the next + frame unless :attr:`CompressionParameter.content_size_flag` is ``False`` + or ``0``. A size of ``0`` means that the frame is empty. If *size* is + ``None``, the frame header will omit the frame size. Frames that include + the uncompressed data size require less memory to decompress, especially + at higher compression levels. + + If :attr:`last_mode` is not :attr:`FLUSH_FRAME`, a + :exc:`ValueError` is raised as the compressor is not at the start of + a frame. If the pledged size does not match the actual size of data + provided to :meth:`.compress`, future calls to :meth:`!compress` or + :meth:`flush` may raise :exc:`ZstdError` and the last chunk of data may + be lost. + + After :meth:`flush` or :meth:`.compress` are called with mode + :attr:`FLUSH_FRAME`, the next frame will not include the frame size into + the header unless :meth:`!set_pledged_input_size` is called again. + .. attribute:: CONTINUE Collect more data for compression, which may or may not generate output @@ -266,6 +287,13 @@ Compressing and decompressing data in memory :meth:`~.compress` will be written into a new frame and *cannot* reference past data. + .. 
attribute:: last_mode + + The last mode passed to either :meth:`~.compress` or :meth:`~.flush`. + The value can be one of :attr:`~.CONTINUE`, :attr:`~.FLUSH_BLOCK`, or + :attr:`~.FLUSH_FRAME`. The initial value is :attr:`~.FLUSH_FRAME`, + signifying that the compressor is at the start of a new frame. + .. class:: ZstdDecompressor(zstd_dict=None, options=None) @@ -620,12 +648,17 @@ Advanced parameter control Write the size of the data to be compressed into the Zstandard frame header when known prior to compressing. - This flag only takes effect under the following two scenarios: + This flag only takes effect under the following scenarios: * Calling :func:`compress` for one-shot compression * Providing all of the data to be compressed in the frame in a single :meth:`ZstdCompressor.compress` call, with the :attr:`ZstdCompressor.FLUSH_FRAME` mode. + * Calling :meth:`ZstdCompressor.set_pledged_input_size` with the exact + amount of data that will be provided to the compressor prior to any + calls to :meth:`ZstdCompressor.compress` for the current frame. + :meth:`!ZstdCompressor.set_pledged_input_size` must be called for each + new frame. All other compression calls may not write the size information into the frame header. diff --git a/Doc/library/csv.rst b/Doc/library/csv.rst index 533cdf13974..2e513bff651 100644 --- a/Doc/library/csv.rst +++ b/Doc/library/csv.rst @@ -70,7 +70,7 @@ The :mod:`csv` module defines the following functions: section :ref:`csv-fmt-params`. Each row read from the csv file is returned as a list of strings. No - automatic data type conversion is performed unless the ``QUOTE_NONNUMERIC`` format + automatic data type conversion is performed unless the :data:`QUOTE_NONNUMERIC` format option is specified (in which case unquoted fields are transformed into floats). A short usage example:: @@ -331,8 +331,14 @@ The :mod:`csv` module defines the following constants: Instructs :class:`writer` objects to quote all non-numeric fields. 
- Instructs :class:`reader` objects to convert all non-quoted fields to type *float*. + Instructs :class:`reader` objects to convert all non-quoted fields to type :class:`float`. + .. note:: + Some numeric types, such as :class:`bool`, :class:`~fractions.Fraction`, + or :class:`~enum.IntEnum`, have a string representation that cannot be + converted to :class:`float`. + They cannot be read in the :data:`QUOTE_NONNUMERIC` and + :data:`QUOTE_STRINGS` modes. .. data:: QUOTE_NONE @@ -603,7 +609,7 @@ A slightly more advanced use of the reader --- catching and reporting errors:: for row in reader: print(row) except csv.Error as e: - sys.exit('file {}, line {}: {}'.format(filename, reader.line_num, e)) + sys.exit(f'file {filename}, line {reader.line_num}: {e}') And while the module doesn't directly support parsing strings, it can easily be done:: diff --git a/Doc/library/ctypes.rst b/Doc/library/ctypes.rst index 8e74c6c9dee..2ee4450698a 100644 --- a/Doc/library/ctypes.rst +++ b/Doc/library/ctypes.rst @@ -714,10 +714,16 @@ item in the :attr:`~Structure._fields_` tuples:: ... ("second_16", c_int, 16)] ... >>> print(Int.first_16) - <Field type=c_long, ofs=0:0, bits=16> + <ctypes.CField 'first_16' type=c_int, ofs=0, bit_size=16, bit_offset=0> >>> print(Int.second_16) - <Field type=c_long, ofs=0:16, bits=16> - >>> + <ctypes.CField 'second_16' type=c_int, ofs=0, bit_size=16, bit_offset=16> + +It is important to note that bit field allocation and layout in memory are not +defined as a C standard; their implementation is compiler-specific. +By default, Python will attempt to match the behavior of a "native" compiler +for the current platform. +See the :attr:`~Structure._layout_` attribute for details on the default +behavior and how to change it. .. _ctypes-arrays: diff --git a/Doc/library/dbm.rst b/Doc/library/dbm.rst index 6f548fbb1b3..39e287b1521 100644 --- a/Doc/library/dbm.rst +++ b/Doc/library/dbm.rst @@ -254,6 +254,9 @@ functionality like crash tolerance. 
* ``'s'``: Synchronized mode. Changes to the database will be written immediately to the file. * ``'u'``: Do not lock database. + * ``'m'``: Do not use :manpage:`mmap(2)`. + This may harm performance, but improve crash tolerance. + .. versionadded:: next Not all flags are valid for all versions of GDBM. See the :data:`open_flags` member for a list of supported flag characters. diff --git a/Doc/library/hashlib.rst b/Doc/library/hashlib.rst index 4818a4944a5..8bba6700930 100644 --- a/Doc/library/hashlib.rst +++ b/Doc/library/hashlib.rst @@ -94,6 +94,13 @@ accessible by name via :func:`new`. See :data:`algorithms_available`. OpenSSL does not provide we fall back to a verified implementation from the `HACL\* project`_. +.. deprecated-removed:: 3.15 3.19 + The undocumented ``string`` keyword parameter in :func:`!_hashlib.new` + and hash-named constructors such as :func:`!_md5.md5` is deprecated. + Prefer passing the initial data as a positional argument for maximum + backwards compatibility. + + Usage ----- diff --git a/Doc/library/math.rst b/Doc/library/math.rst index 11d3b756e21..c8061fb1638 100644 --- a/Doc/library/math.rst +++ b/Doc/library/math.rst @@ -53,6 +53,8 @@ noted otherwise, all return values are floats. :func:`frexp(x) <frexp>` Mantissa and exponent of *x* :func:`isclose(a, b, rel_tol, abs_tol) <isclose>` Check if the values *a* and *b* are close to each other :func:`isfinite(x) <isfinite>` Check if *x* is neither an infinity nor a NaN +:func:`isnormal(x) <isnormal>` Check if *x* is a normal number +:func:`issubnormal(x) <issubnormal>` Check if *x* is a subnormal number :func:`isinf(x) <isinf>` Check if *x* is a positive or negative infinity :func:`isnan(x) <isnan>` Check if *x* is a NaN (not a number) :func:`ldexp(x, i) <ldexp>` ``x * (2**i)``, inverse of function :func:`frexp` @@ -373,6 +375,24 @@ Floating point manipulation functions .. versionadded:: 3.2 +.. 
function:: isnormal(x) + + Return ``True`` if *x* is a normal number, that is a finite + nonzero number that is not a subnormal (see :func:`issubnormal`). + Return ``False`` otherwise. + + .. versionadded:: next + + +.. function:: issubnormal(x) + + Return ``True`` if *x* is a subnormal number, that is a finite + nonzero number with a magnitude smaller than the smallest positive normal + number, see :data:`sys.float_info.min`. Return ``False`` otherwise. + + .. versionadded:: next + + .. function:: isinf(x) Return ``True`` if *x* is a positive or negative infinity, and diff --git a/Doc/library/os.path.rst b/Doc/library/os.path.rst index ecbbc1d7605..f72aee19d8f 100644 --- a/Doc/library/os.path.rst +++ b/Doc/library/os.path.rst @@ -408,9 +408,26 @@ the :mod:`glob` module.) system). On Windows, this function will also resolve MS-DOS (also called 8.3) style names such as ``C:\\PROGRA~1`` to ``C:\\Program Files``. - If a path doesn't exist or a symlink loop is encountered, and *strict* is - ``True``, :exc:`OSError` is raised. If *strict* is ``False`` these errors - are ignored, and so the result might be missing or otherwise inaccessible. + By default, the path is evaluated up to the first component that does not + exist, is a symlink loop, or whose evaluation raises :exc:`OSError`. + All such components are appended unchanged to the existing part of the path. + + Some errors that are handled this way include "access denied", "not a + directory", or "bad argument to internal function". Thus, the + resulting path may be missing or inaccessible, may still contain + links or loops, and may traverse non-directories. + + This behavior can be modified by keyword arguments: + + If *strict* is ``True``, the first error encountered when evaluating the path is + re-raised. + In particular, :exc:`FileNotFoundError` is raised if *path* does not exist, + or another :exc:`OSError` if it is otherwise inaccessible. 
+ + If *strict* is :py:data:`os.path.ALLOW_MISSING`, errors other than + :exc:`FileNotFoundError` are re-raised (as with ``strict=True``). + Thus, the returned path will not contain any symbolic links, but the named + file and some of its parent directories may be missing. .. note:: This function emulates the operating system's procedure for making a path @@ -429,6 +446,15 @@ the :mod:`glob` module.) .. versionchanged:: 3.10 The *strict* parameter was added. + .. versionchanged:: next + The :py:data:`~os.path.ALLOW_MISSING` value for the *strict* parameter + was added. + +.. data:: ALLOW_MISSING + + Special value used for the *strict* argument in :func:`realpath`. + + .. versionadded:: next .. function:: relpath(path, start=os.curdir) diff --git a/Doc/library/socket.rst b/Doc/library/socket.rst index 75fd637045d..bc89a3228f0 100644 --- a/Doc/library/socket.rst +++ b/Doc/library/socket.rst @@ -1492,7 +1492,7 @@ The :mod:`socket` module also offers various network-related services: The *fds* parameter is a sequence of file descriptors. Consult :meth:`~socket.sendmsg` for the documentation of these parameters. - .. availability:: Unix, Windows, not WASI. + .. availability:: Unix, not WASI. Unix platforms supporting :meth:`~socket.sendmsg` and :const:`SCM_RIGHTS` mechanism. @@ -1506,9 +1506,9 @@ The :mod:`socket` module also offers various network-related services: Return ``(msg, list(fds), flags, addr)``. Consult :meth:`~socket.recvmsg` for the documentation of these parameters. - .. availability:: Unix, Windows, not WASI. + .. availability:: Unix, not WASI. - Unix platforms supporting :meth:`~socket.sendmsg` + Unix platforms supporting :meth:`~socket.recvmsg` and :const:`SCM_RIGHTS` mechanism. .. 
versionadded:: 3.9 diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index f0b4b09ff10..b75e5ceecf8 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -1018,7 +1018,7 @@ operations have the same priority as the corresponding numeric operations. [3]_ | ``s * n`` or | equivalent to adding *s* to | (2)(7) | | ``n * s`` | itself *n* times | | +--------------------------+--------------------------------+----------+ -| ``s[i]`` | *i*\ th item of *s*, origin 0 | \(3) | +| ``s[i]`` | *i*\ th item of *s*, origin 0 | (3)(9) | +--------------------------+--------------------------------+----------+ | ``s[i:j]`` | slice of *s* from *i* to *j* | (3)(4) | +--------------------------+--------------------------------+----------+ @@ -1150,6 +1150,9 @@ Notes: without copying any data and with the returned index being relative to the start of the sequence rather than the start of the slice. +(9) + An :exc:`IndexError` is raised if *i* is outside the sequence range. + .. _typesseq-immutable: diff --git a/Doc/library/string.rst b/Doc/library/string.rst index c4012483a52..23e15780075 100644 --- a/Doc/library/string.rst +++ b/Doc/library/string.rst @@ -328,7 +328,7 @@ The general form of a *standard format specifier* is: sign: "+" | "-" | " " width_and_precision: [`width_with_grouping`][`precision_with_grouping`] width_with_grouping: [`width`][`grouping`] - precision_with_grouping: "." [`precision`][`grouping`] + precision_with_grouping: "." [`precision`][`grouping`] | "." `grouping` width: `~python-grammar:digit`+ precision: `~python-grammar:digit`+ grouping: "," | "_" diff --git a/Doc/library/tarfile.rst b/Doc/library/tarfile.rst index f9cb5495e60..7cec108a5bd 100644 --- a/Doc/library/tarfile.rst +++ b/Doc/library/tarfile.rst @@ -255,6 +255,15 @@ The :mod:`tarfile` module defines the following exceptions: Raised to refuse extracting a symbolic link pointing outside the destination directory. +.. 
exception:: LinkFallbackError + + Raised to refuse emulating a link (hard or symbolic) by extracting another + archive member, when that member would be rejected by the filter location. + The exception that was raised to reject the replacement member is available + as :attr:`!BaseException.__context__`. + + .. versionadded:: next + The following constants are available at the module level: @@ -1068,6 +1077,12 @@ reused in custom filters: Implements the ``'data'`` filter. In addition to what ``tar_filter`` does: + - Normalize link targets (:attr:`TarInfo.linkname`) using + :func:`os.path.normpath`. + Note that this removes internal ``..`` components, which may change the + meaning of the link if the path in :attr:`!TarInfo.linkname` traverses + symbolic links. + - :ref:`Refuse <tarfile-extraction-refuse>` to extract links (hard or soft) that link to absolute paths, or ones that link outside the destination. @@ -1099,6 +1114,10 @@ reused in custom filters: Note that this filter does not block *all* dangerous archive features. See :ref:`tarfile-further-verification` for details. + .. versionchanged:: next + + Link targets are now normalized. + .. _tarfile-extraction-refuse: @@ -1127,6 +1146,7 @@ Here is an incomplete list of things to consider: * Extract to a :func:`new temporary directory <tempfile.mkdtemp>` to prevent e.g. exploiting pre-existing links, and to make it easier to clean up after a failed extraction. +* Disallow symbolic links if you do not need the functionality. * When working with untrusted data, use external (e.g. OS-level) limits on disk, memory and CPU usage. * Check filenames against an allow-list of characters diff --git a/Doc/library/token.rst b/Doc/library/token.rst index 1f92b5df430..c228006d4c1 100644 --- a/Doc/library/token.rst +++ b/Doc/library/token.rst @@ -51,7 +51,7 @@ The token constants are: .. data:: NAME Token value that indicates an :ref:`identifier <identifiers>`. 
- Note that keywords are also initially tokenized an ``NAME`` tokens. + Note that keywords are also initially tokenized as ``NAME`` tokens. .. data:: NUMBER diff --git a/Doc/library/uuid.rst b/Doc/library/uuid.rst index 8cce6b98cbc..747ee3ee0e1 100644 --- a/Doc/library/uuid.rst +++ b/Doc/library/uuid.rst @@ -257,6 +257,10 @@ The :mod:`uuid` module defines the following functions: non-specified arguments are substituted for a pseudo-random integer of appropriate size. + By default, *a*, *b* and *c* are generated by a non-cryptographically + secure pseudo-random number generator (CSPRNG). Use :func:`uuid4` when + a UUID needs to be used in a security-sensitive context. + .. versionadded:: 3.14 diff --git a/Doc/reference/grammar.rst b/Doc/reference/grammar.rst index b9cca4444c9..55c148801d8 100644 --- a/Doc/reference/grammar.rst +++ b/Doc/reference/grammar.rst @@ -8,15 +8,15 @@ used to generate the CPython parser (see :source:`Grammar/python.gram`). The version here omits details related to code generation and error recovery. -The notation is a mixture of `EBNF -<https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form>`_ -and `PEG <https://en.wikipedia.org/wiki/Parsing_expression_grammar>`_. -In particular, ``&`` followed by a symbol, token or parenthesized -group indicates a positive lookahead (i.e., is required to match but -not consumed), while ``!`` indicates a negative lookahead (i.e., is -required *not* to match). We use the ``|`` separator to mean PEG's -"ordered choice" (written as ``/`` in traditional PEG grammars). See -:pep:`617` for more details on the grammar's syntax. 
+The notation used here is the same as in the preceding docs, +and is described in the :ref:`notation <notation>` section, +except for a few extra complications: + +* ``&e``: a positive lookahead (that is, ``e`` is required to match but + not consumed) +* ``!e``: a negative lookahead (that is, ``e`` is required *not* to match) +* ``~`` ("cut"): commit to the current alternative and fail the rule + even if this fails to parse .. literalinclude:: ../../Grammar/python.gram :language: peg diff --git a/Doc/reference/introduction.rst b/Doc/reference/introduction.rst index b7b70e6be5a..444acac374a 100644 --- a/Doc/reference/introduction.rst +++ b/Doc/reference/introduction.rst @@ -90,44 +90,122 @@ Notation .. index:: BNF, grammar, syntax, notation -The descriptions of lexical analysis and syntax use a modified -`Backus–Naur form (BNF) <https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form>`_ grammar -notation. This uses the following style of definition: - -.. productionlist:: notation - name: `lc_letter` (`lc_letter` | "_")* - lc_letter: "a"..."z" - -The first line says that a ``name`` is an ``lc_letter`` followed by a sequence -of zero or more ``lc_letter``\ s and underscores. An ``lc_letter`` in turn is -any of the single characters ``'a'`` through ``'z'``. (This rule is actually -adhered to for the names defined in lexical and grammar rules in this document.) - -Each rule begins with a name (which is the name defined by the rule) and -``::=``. A vertical bar (``|``) is used to separate alternatives; it is the -least binding operator in this notation. A star (``*``) means zero or more -repetitions of the preceding item; likewise, a plus (``+``) means one or more -repetitions, and a phrase enclosed in square brackets (``[ ]``) means zero or -one occurrences (in other words, the enclosed phrase is optional). The ``*`` -and ``+`` operators bind as tightly as possible; parentheses are used for -grouping. Literal strings are enclosed in quotes. 
White space is only -meaningful to separate tokens. Rules are normally contained on a single line; -rules with many alternatives may be formatted alternatively with each line after -the first beginning with a vertical bar. - -.. index:: lexical definitions, ASCII - -In lexical definitions (as the example above), two more conventions are used: -Two literal characters separated by three dots mean a choice of any single -character in the given (inclusive) range of ASCII characters. A phrase between -angular brackets (``<...>``) gives an informal description of the symbol -defined; e.g., this could be used to describe the notion of 'control character' -if needed. - -Even though the notation used is almost the same, there is a big difference -between the meaning of lexical and syntactic definitions: a lexical definition -operates on the individual characters of the input source, while a syntax -definition operates on the stream of tokens generated by the lexical analysis. -All uses of BNF in the next chapter ("Lexical Analysis") are lexical -definitions; uses in subsequent chapters are syntactic definitions. - +The descriptions of lexical analysis and syntax use a grammar notation that +is a mixture of +`EBNF <https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form>`_ +and `PEG <https://en.wikipedia.org/wiki/Parsing_expression_grammar>`_. +For example: + +.. grammar-snippet:: + :group: notation + + name: `letter` (`letter` | `digit` | "_")* + letter: "a"..."z" | "A"..."Z" + digit: "0"..."9" + +In this example, the first line says that a ``name`` is a ``letter`` followed +by a sequence of zero or more ``letter``\ s, ``digit``\ s, and underscores. +A ``letter`` in turn is any of the single characters ``'a'`` through +``'z'`` and ``A`` through ``Z``; a ``digit`` is a single character from ``0`` +to ``9``. + +Each rule begins with a name (which identifies the rule that's being defined) +followed by a colon, ``:``. 
+The definition to the right of the colon uses the following syntax elements: + +* ``name``: A name refers to another rule. + Where possible, it is a link to the rule's definition. + + * ``TOKEN``: An uppercase name refers to a :term:`token`. + For the purposes of grammar definitions, tokens are the same as rules. + +* ``"text"``, ``'text'``: Text in single or double quotes must match literally + (without the quotes). The type of quote is chosen according to the meaning + of ``text``: + + * ``'if'``: A name in single quotes denotes a :ref:`keyword <keywords>`. + * ``"case"``: A name in double quotes denotes a + :ref:`soft-keyword <soft-keywords>`. + * ``'@'``: A non-letter symbol in single quotes denotes an + :py:data:`~token.OP` token, that is, a :ref:`delimiter <delimiters>` or + :ref:`operator <operators>`. + +* ``e1 e2``: Items separated only by whitespace denote a sequence. + Here, ``e1`` must be followed by ``e2``. +* ``e1 | e2``: A vertical bar is used to separate alternatives. + It denotes PEG's "ordered choice": if ``e1`` matches, ``e2`` is + not considered. + In traditional PEG grammars, this is written as a slash, ``/``, rather than + a vertical bar. + See :pep:`617` for more background and details. +* ``e*``: A star means zero or more repetitions of the preceding item. +* ``e+``: Likewise, a plus means one or more repetitions. +* ``[e]``: A phrase enclosed in square brackets means zero or + one occurrences. In other words, the enclosed phrase is optional. +* ``e?``: A question mark has exactly the same meaning as square brackets: + the preceding item is optional. +* ``(e)``: Parentheses are used for grouping. +* ``"a"..."z"``: Two literal characters separated by three dots mean a choice + of any single character in the given (inclusive) range of ASCII characters. + This notation is only used in + :ref:`lexical definitions <notation-lexical-vs-syntactic>`. 
+* ``<...>``: A phrase between angular brackets gives an informal description + of the matched symbol (for example, ``<any ASCII character except "\">``), + or an abbreviation that is defined in nearby text (for example, ``<Lu>``). + This notation is only used in + :ref:`lexical definitions <notation-lexical-vs-syntactic>`. + +The unary operators (``*``, ``+``, ``?``) bind as tightly as possible; +the vertical bar (``|``) binds most loosely. + +White space is only meaningful to separate tokens. + +Rules are normally contained on a single line, but rules that are too long +may be wrapped: + +.. grammar-snippet:: + :group: notation + + literal: stringliteral | bytesliteral + | integer | floatnumber | imagnumber + +Alternatively, rules may be formatted with the first line ending at the colon, +and each alternative beginning with a vertical bar on a new line. +For example: + + +.. grammar-snippet:: + :group: notation-alt + + literal: + | stringliteral + | bytesliteral + | integer + | floatnumber + | imagnumber + +This does *not* mean that there is an empty first alternative. + +.. index:: lexical definitions + +.. _notation-lexical-vs-syntactic: + +Lexical and Syntactic definitions +--------------------------------- + +There is some difference between *lexical* and *syntactic* analysis: +the :term:`lexical analyzer` operates on the individual characters of the +input source, while the *parser* (syntactic analyzer) operates on the stream +of :term:`tokens <token>` generated by the lexical analysis. +However, in some cases the exact boundary between the two phases is a +CPython implementation detail. + +The practical difference between the two is that in *lexical* definitions, +all whitespace is significant. +The lexical analyzer :ref:`discards <whitespace>` all whitespace that is not +converted to tokens like :data:`token.INDENT` or :data:`~token.NEWLINE`. +*Syntactic* definitions then use these tokens, rather than source characters. 
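The split between lexical and syntactic definitions described above can be observed from Python itself. The following is a minimal sketch, not part of the patch: it assumes the standard :mod:`tokenize` module is an acceptable stand-in for the lexical analyzer (it is a separate implementation and also reports comments and non-logical newlines, which the real tokenizer does not pass on to the parser). The ``source`` string is an illustrative example.

```python
import io
import tokenize

# Illustrative input: one compound statement with an indented body.
source = "if x:\n    y = 1\n"

# generate_tokens() reads lines of text and yields TokenInfo tuples.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
names = [tokenize.tok_name[tok.type] for tok in tokens]

# Print each token's type name and the source text it covers.
for tok in tokens:
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

The spaces between ``if`` and ``x``, and around ``=``, produce no tokens of their own, while the leading indentation and the line ends surface as ``INDENT``, ``DEDENT`` and ``NEWLINE`` tokens. Those tokens, rather than the source characters, are what a syntactic definition refers to.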
+ +This documentation uses the same BNF grammar for both styles of definitions. +All uses of BNF in the next chapter (:ref:`lexical`) are lexical definitions; +uses in subsequent chapters are syntactic definitions. diff --git a/Doc/tutorial/introduction.rst b/Doc/tutorial/introduction.rst index cdb35da7bc9..9e06e03991b 100644 --- a/Doc/tutorial/introduction.rst +++ b/Doc/tutorial/introduction.rst @@ -13,10 +13,9 @@ end a multi-line command. .. only:: html - You can toggle the display of prompts and output by clicking on ``>>>`` - in the upper-right corner of an example box. If you hide the prompts - and output for an example, then you can easily copy and paste the input - lines into your interpreter. + You can use the "Copy" button (it appears in the upper-right corner + when hovering over or tapping a code example), which strips prompts + and omits output, to copy and paste the input lines into your interpreter. .. index:: single: # (hash); comment diff --git a/Doc/tutorial/modules.rst b/Doc/tutorial/modules.rst index de7aa0e2342..47bf7547b4a 100644 --- a/Doc/tutorial/modules.rst +++ b/Doc/tutorial/modules.rst @@ -27,14 +27,16 @@ called :file:`fibo.py` in the current directory with the following contents:: # Fibonacci numbers module - def fib(n): # write Fibonacci series up to n + def fib(n): + """Write Fibonacci series up to n.""" a, b = 0, 1 while a < n: print(a, end=' ') a, b = b, a+b print() - def fib2(n): # return Fibonacci series up to n + def fib2(n): + """Return Fibonacci series up to n.""" result = [] a, b = 0, 1 while a < n: diff --git a/Doc/using/android.rst b/Doc/using/android.rst index 65bf23dc994..cb762310328 100644 --- a/Doc/using/android.rst +++ b/Doc/using/android.rst @@ -63,3 +63,12 @@ link to the relevant file. * Add code to your app to :source:`start Python in embedded mode <Android/testbed/app/src/main/c/main_activity.c>`. This will need to be C code called via JNI. 
+ +Building a Python package for Android +------------------------------------- + +Python packages can be built for Android as wheels and released on PyPI. The +recommended tool for doing this is `cibuildwheel +<https://cibuildwheel.pypa.io/en/stable/platforms/#android>`__, which automates +all the details of setting up a cross-compilation environment, building the +wheel, and testing it on an emulator. diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst index 561d1a8914b..45e68aea5fb 100644 --- a/Doc/whatsnew/3.14.rst +++ b/Doc/whatsnew/3.14.rst @@ -342,15 +342,16 @@ For example the following expressions are now valid: .. code-block:: python try: - release_new_sleep_token_album() - except AlbumNotFound, SongsTooGoodToBeReleased: - print("Sorry, no new album this year.") + connect_to_server() + except TimeoutError, ConnectionRefusedError: + print("Network issue encountered.") # The same applies to except* (for exception groups): + try: - release_new_sleep_token_album() - except* AlbumNotFound, SongsTooGoodToBeReleased: - print("Sorry, no new album this year.") + connect_to_server() + except* TimeoutError, ConnectionRefusedError: + print("Network issue encountered.") Check :pep:`758` for more details. @@ -1454,7 +1455,7 @@ math ---- * Added more detailed error messages for domain errors in the module. - (Contributed by by Charlie Zhao and Sergey B Kirpichev in :gh:`101410`.) + (Contributed by Charlie Zhao and Sergey B Kirpichev in :gh:`101410`.) mimetypes diff --git a/Doc/whatsnew/3.15.rst b/Doc/whatsnew/3.15.rst index 244ce327763..88e7462f688 100644 --- a/Doc/whatsnew/3.15.rst +++ b/Doc/whatsnew/3.15.rst @@ -96,6 +96,10 @@ dbm which allow to recover unused free space previously occupied by deleted entries. (Contributed by Andrea Oliveri in :gh:`134004`.) +* Add the ``'m'`` flag for :func:`dbm.gnu.open` which allows to disable + the use of :manpage:`mmap(2)`. + This may harm performance, but improve crash tolerance. 
+ (Contributed by Serhiy Storchaka in :gh:`66234`.) difflib ------- @@ -105,6 +109,23 @@ difflib (Contributed by Jiahao Li in :gh:`134580`.) +math +---- + +* Add :func:`math.isnormal` and :func:`math.issubnormal` functions. + (Contributed by Sergey B Kirpichev in :gh:`132908`.) + + +os.path +------- + +* The *strict* parameter to :func:`os.path.realpath` accepts a new value, + :data:`os.path.ALLOW_MISSING`. + If used, errors other than :exc:`FileNotFoundError` will be re-raised; + the resulting path can be missing but it will be free of symlinks. + (Contributed by Petr Viktorin for :cve:`2025-4517`.) + + shelve ------ @@ -121,6 +142,28 @@ ssl (Contributed by Will Childs-Klein in :gh:`133624`.) +tarfile +------- + +* :func:`~tarfile.data_filter` now normalizes symbolic link targets in order to + avoid path traversal attacks. + (Contributed by Petr Viktorin in :gh:`127987` and :cve:`2025-4138`.) +* :func:`~tarfile.TarFile.extractall` now skips fixing up directory attributes + when a directory was removed or replaced by another kind of file. + (Contributed by Petr Viktorin in :gh:`127987` and :cve:`2024-12718`.) +* :func:`~tarfile.TarFile.extract` and :func:`~tarfile.TarFile.extractall` + now (re-)apply the extraction filter when substituting a link (hard or + symbolic) with a copy of another archive member, and when fixing up + directory attributes. + The former raises a new exception, :exc:`~tarfile.LinkFallbackError`. + (Contributed by Petr Viktorin for :cve:`2025-4330` and :cve:`2024-12718`.) +* :func:`~tarfile.TarFile.extract` and :func:`~tarfile.TarFile.extractall` + no longer extract rejected members when + :func:`~tarfile.TarFile.errorlevel` is zero. + (Contributed by Matt Prodani and Petr Viktorin in :gh:`112887` + and :cve:`2025-4435`.) 
+ + zlib ---- @@ -146,8 +189,20 @@ module_name Deprecated ========== -* module_name: - TODO +hashlib +------- + +* In hash function constructors such as :func:`~hashlib.new` or the + direct hash-named constructors such as :func:`~hashlib.md5` and + :func:`~hashlib.sha256`, their optional initial data parameter could + also be passed a keyword argument named ``data=`` or ``string=`` in + various :mod:`hashlib` implementations. + + Support for the ``string`` keyword argument name is now deprecated and + is slated for removal in Python 3.19. Prefer passing the initial data as + a positional argument for maximum backwards compatibility. + + (Contributed by Bénédikt Tran in :gh:`134978`.) .. Add deprecations above alphabetically, not here at the end. @@ -243,11 +298,9 @@ New features functions as replacements for :c:func:`PySys_GetObject`. (Contributed by Serhiy Storchaka in :gh:`108512`.) -* Add :c:func:`PyUnicodeWriter_WriteASCII` function to write an ASCII string - into a :c:type:`PyUnicodeWriter`. The function is faster than - :c:func:`PyUnicodeWriter_WriteUTF8`, but has an undefined behavior if the - input string contains non-ASCII characters. - (Contributed by Victor Stinner in :gh:`133968`.) +* Add :c:type:`PyUnstable_Unicode_GET_CACHED_HASH` to get the cached hash of + a string. See the documentation for caveats. + (Contributed by Petr Viktorin in :gh:`131510`) Porting to Python 3.15 |