Diffstat (limited to 'Doc/c-api/extension-modules.rst')
-rw-r--r-- Doc/c-api/extension-modules.rst | 247
1 files changed, 247 insertions, 0 deletions
diff --git a/Doc/c-api/extension-modules.rst b/Doc/c-api/extension-modules.rst
new file mode 100644
index 00000000000..4c8212f2f5e
--- /dev/null
+++ b/Doc/c-api/extension-modules.rst
@@ -0,0 +1,247 @@
+.. highlight:: c
+
+.. _extension-modules:
+
+Defining extension modules
+--------------------------
+
+A C extension for CPython is a shared library (for example, a ``.so`` file
+on Linux or a ``.pyd`` DLL on Windows) that can be loaded into the Python
+process (for example, because it was compiled with compatible compiler
+settings) and that exports an
+:ref:`initialization function <extension-export-hook>`.
+
+To be importable by default (that is, by
+:py:class:`importlib.machinery.ExtensionFileLoader`),
+the shared library must be available on :py:attr:`sys.path`,
+and must be named after the module name plus an extension listed in
+:py:attr:`importlib.machinery.EXTENSION_SUFFIXES`.
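+
+As a rough illustration of this naming rule, the following sketch lists the
+file names that would match a given module name on the running interpreter.
+The helper ``candidate_filenames`` is hypothetical and not part of any API;
+only :py:attr:`importlib.machinery.EXTENSION_SUFFIXES` comes from the
+standard library:
+
+.. code-block:: python
+
+    import importlib.machinery
+
+    def candidate_filenames(module_name):
+        """Return the file names the default loader would accept."""
+        # For a submodule such as ``pkg.spam``, only the last component
+        # of the dotted name is used for the file name.
+        basename = module_name.rpartition('.')[2]
+        return [basename + suffix
+                for suffix in importlib.machinery.EXTENSION_SUFFIXES]
+
+    # The result is platform-specific, so no output is shown here.
+    print(candidate_filenames('spam'))
+
+The exact suffixes depend on the platform and the Python build; on many
+systems the list includes a plain name such as ``spam.so`` or ``spam.pyd``
+next to more specific, ABI-tagged variants.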
+
+.. note::
+
+ Building, packaging and distributing extension modules is best done with
+ third-party tools, and is out of scope of this document.
+ One suitable tool is Setuptools, whose documentation can be found at
+ https://setuptools.pypa.io/en/latest/setuptools.html.
+
+Normally, the initialization function returns a module definition initialized
+using :c:func:`PyModuleDef_Init`.
+This allows splitting the creation process into several phases:
+
+- Before any substantial code is executed, Python can determine which
+  capabilities the module supports, and it can adjust the environment or
+  refuse loading an incompatible extension.
+- By default, Python itself creates the module object -- that is, it does
+  the equivalent of :py:meth:`object.__new__` for classes.
+  It also sets initial attributes like :attr:`~module.__package__` and
+  :attr:`~module.__loader__`.
+- Afterwards, the module object is initialized using extension-specific
+  code -- the equivalent of :py:meth:`~object.__init__` on classes.
+
+This is called *multi-phase initialization* to distinguish it from the legacy
+(but still supported) *single-phase initialization* scheme,
+where the initialization function returns a fully constructed module.
+See the :ref:`single-phase-initialization section below <single-phase-initialization>`
+for details.
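+
+The split between creating and initializing a module can also be observed
+from Python through :py:mod:`importlib`.
+The sketch below uses the pure-Python ``colorsys`` module as a stand-in; that
+choice is an assumption made for illustration, and the snippet is an analogy
+rather than a description of what happens inside an extension:
+
+.. code-block:: python
+
+    import importlib.util
+
+    spec = importlib.util.find_spec('colorsys')
+
+    # Creation phase: a module object exists and carries import-related
+    # attributes (such as __name__ and __spec__), but none of its own code
+    # has run yet.
+    module = importlib.util.module_from_spec(spec)
+    attrs_after_creation = set(vars(module))
+
+    # Initialization phase: the module's own code populates the object.
+    spec.loader.exec_module(module)
+
+    print('rgb_to_hsv' in attrs_after_creation)   # False
+    print(hasattr(module, 'rgb_to_hsv'))          # True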
+
+.. versionchanged:: 3.5
+
+ Added support for multi-phase initialization (:pep:`489`).
+
+
+Multiple module instances
+.........................
+
+By default, extension modules are not singletons.
+For example, if the :py:attr:`sys.modules` entry is removed and the module
+is re-imported, a new module object is created, and typically populated with
+fresh method and type objects.
+The old module is subject to normal garbage collection.
+This mirrors the behavior of pure-Python modules.
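+
+The pure-Python behavior that this mirrors can be sketched as follows.
+The ``colorsys`` module is used here only because it is small and has no
+import-time side effects; any pure-Python module would do:
+
+.. code-block:: python
+
+    import importlib
+    import sys
+
+    first = importlib.import_module('colorsys')
+    # Drop the cached entry so that the next import runs the module again.
+    del sys.modules['colorsys']
+    second = importlib.import_module('colorsys')
+
+    print(first is second)                        # False
+    print(first.rgb_to_hsv is second.rgb_to_hsv)  # False
+
+An isolated multi-phase extension module is expected to behave the same way:
+each import after the cache entry is removed yields a new module object with
+its own functions and types.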
+
+Additional module instances may be created in
+:ref:`sub-interpreters <sub-interpreter-support>`
+or after Python runtime reinitialization
+(:c:func:`Py_Finalize` and :c:func:`Py_Initialize`).
+In these cases, sharing Python objects between module instances would likely
+cause crashes or undefined behavior.
+
+To avoid such issues, each instance of an extension module should
+be *isolated*: changes to one instance should not implicitly affect the others,
+and all state owned by the module, including references to Python objects,
+should be specific to a particular module instance.
+See :ref:`isolating-extensions-howto` for more details and a practical guide.
+
+A simpler way to avoid these issues is
+:ref:`raising an error on repeated initialization <isolating-extensions-optout>`.
+
+All modules are expected to support
+:ref:`sub-interpreters <sub-interpreter-support>`, or otherwise explicitly
+signal a lack of support.
+This is usually achieved by isolation or blocking repeated initialization,
+as above.
+A module may also be limited to the main interpreter using
+the :c:data:`Py_mod_multiple_interpreters` slot.
+
+
+.. _extension-export-hook:
+
+Initialization function
+.......................
+
+The initialization function defined by an extension module has the
+following signature:
+
+.. c:function:: PyObject* PyInit_modulename(void)
+
+For modules with ASCII-only names, the function must be named
+:samp:`PyInit_{<name>}`, with ``<name>`` replaced by the name of the module.
+When using :ref:`multi-phase-initialization`, non-ASCII module names
+are allowed. In this case, the initialization function name is
+:samp:`PyInitU_{<name>}`, with ``<name>`` encoded using Python's
+*punycode* encoding with hyphens replaced by underscores. In Python:
+
+.. code-block:: python
+
+    def initfunc_name(name):
+        try:
+            suffix = b'_' + name.encode('ascii')
+        except UnicodeEncodeError:
+            suffix = b'U_' + name.encode('punycode').replace(b'-', b'_')
+        return b'PyInit' + suffix
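+
+A short usage sketch follows; the helper is repeated so that the snippet runs
+on its own, and the module names ``spam`` and ``spåm`` are made up for
+illustration:
+
+.. code-block:: python
+
+    def initfunc_name(name):
+        try:
+            suffix = b'_' + name.encode('ascii')
+        except UnicodeEncodeError:
+            suffix = b'U_' + name.encode('punycode').replace(b'-', b'_')
+        return b'PyInit' + suffix
+
+    # ASCII-only name: plain ``PyInit_`` prefix.
+    print(initfunc_name('spam'))    # b'PyInit_spam'
+
+    # Non-ASCII name: ``PyInitU_`` prefix followed by the punycode form,
+    # which contains no hyphens after the replacement.
+    print(initfunc_name('spåm'))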
+
+It is recommended to define the initialization function using a helper macro:
+
+.. c:macro:: PyMODINIT_FUNC
+
+ Declare an extension module initialization function.
+ This macro:
+
+ * specifies the :c:expr:`PyObject*` return type,
+ * adds any special linkage declarations required by the platform, and
+ * for C++, declares the function as ``extern "C"``.
+
+For example, a module called ``spam`` would be defined like this::
+
+   static struct PyModuleDef spam_module = {
+       .m_base = PyModuleDef_HEAD_INIT,
+       .m_name = "spam",
+       ...
+   };
+
+   PyMODINIT_FUNC
+   PyInit_spam(void)
+   {
+       return PyModuleDef_Init(&spam_module);
+   }
+
+It is possible to export multiple modules from a single shared library by
+defining multiple initialization functions. However, importing them requires
+using symbolic links or a custom importer, because by default only the
+function corresponding to the filename is found.
+See the `Multiple modules in one library <https://peps.python.org/pep-0489/#multiple-modules-in-one-library>`__
+section in :pep:`489` for details.
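+
+A minimal sketch of the custom-importer approach, modeled on the example in
+:pep:`489`, is shown below.
+The function name ``load_module_from_library`` and the path and module name
+in the usage comment are hypothetical:
+
+.. code-block:: python
+
+    import importlib.machinery
+    import importlib.util
+
+    def load_module_from_library(name, path):
+        """Load the module *name* from the shared library at *path*.
+
+        The initialization function is looked up by *name*, so the
+        library's file name does not need to match the module name.
+        """
+        loader = importlib.machinery.ExtensionFileLoader(name, path)
+        spec = importlib.util.spec_from_loader(name, loader)
+        module = importlib.util.module_from_spec(spec)
+        loader.exec_module(module)
+        return module
+
+    # Hypothetical usage, assuming a library that also defines PyInit_eggs:
+    # eggs = load_module_from_library('eggs', '/path/to/spam.so')
+
+A module loaded this way is not added to :py:attr:`sys.modules`; the caller
+decides whether to register it.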
+
+The initialization function is typically the only non-\ ``static``
+item defined in the module's C source.
+
+
+.. _multi-phase-initialization:
+
+Multi-phase initialization
+..........................
+
+Normally, the :ref:`initialization function <extension-export-hook>`
+(``PyInit_modulename``) returns a :c:type:`PyModuleDef` instance with
+non-``NULL`` :c:member:`~PyModuleDef.m_slots`.
+Before it is returned, the ``PyModuleDef`` instance must be initialized
+using the following function:
+
+
+.. c:function:: PyObject* PyModuleDef_Init(PyModuleDef *def)
+
+ Ensure a module definition is a properly initialized Python object that
+ correctly reports its type and a reference count.
+
+ Return *def* cast to ``PyObject*``, or ``NULL`` if an error occurred.
+
+ Calling this function is required for :ref:`multi-phase-initialization`.
+ It should not be used in other contexts.
+
+ Note that Python assumes that ``PyModuleDef`` structures are statically
+ allocated.
+ This function may return either a new reference or a borrowed one;
+ this reference must not be released.
+
+ .. versionadded:: 3.5
+
+
+.. _single-phase-initialization:
+
+Legacy single-phase initialization
+..................................
+
+.. attention::
+ Single-phase initialization is a legacy mechanism to initialize extension
+ modules, with known drawbacks and design flaws. Extension module authors
+ are encouraged to use multi-phase initialization instead.
+
+In single-phase initialization, the
+:ref:`initialization function <extension-export-hook>` (``PyInit_modulename``)
+should create, populate and return a module object.
+This is typically done using :c:func:`PyModule_Create` and functions like
+:c:func:`PyModule_AddObjectRef`.
+
+Single-phase initialization differs from the :ref:`default <multi-phase-initialization>`
+in the following ways:
+
+* Single-phase modules are, or rather *contain*, “singletons”.
+
+  When the module is first initialized, Python saves the contents of
+  the module's ``__dict__`` (that is, typically, the module's functions and
+  types).
+
+  For subsequent imports, Python does not call the initialization function
+  again.
+  Instead, it creates a new module object with a new ``__dict__``, and copies
+  the saved contents to it.
+  For example, given a single-phase module ``_testsinglephase``
+  [#testsinglephase]_ that defines a function ``sum`` and an exception class
+  ``error``:
+
+  .. code-block:: python
+
+     >>> import sys
+     >>> import _testsinglephase as one
+     >>> del sys.modules['_testsinglephase']
+     >>> import _testsinglephase as two
+     >>> one is two
+     False
+     >>> one.__dict__ is two.__dict__
+     False
+     >>> one.sum is two.sum
+     True
+     >>> one.error is two.error
+     True
+
+  The exact behavior should be considered a CPython implementation detail.
+
+* To work around the fact that ``PyInit_modulename`` does not take a *spec*
+  argument, some state of the import machinery is saved and applied to the
+  first suitable module created during the ``PyInit_modulename`` call.
+  Specifically, when a sub-module is imported, this mechanism prepends the
+  parent package name to the name of the module.
+
+  A single-phase ``PyInit_modulename`` function should create “its” module
+  object as soon as possible, before any other module objects can be created.
+
+* Non-ASCII module names (``PyInitU_modulename``) are not supported.
+
+* Single-phase modules support module lookup functions like
+  :c:func:`PyState_FindModule`.
+
+.. [#testsinglephase] ``_testsinglephase`` is an internal module used \
+ in CPython's self-test suite; your installation may or may not \
+ include it.