| Commit message | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Completely refactor Modules/_remote_debugging_module.c with improved
code organization, replacing scattered reference counting and error
handling with centralized goto error paths. This cleanup improves
maintainability and reduces code duplication throughout the module while
preserving the same external API.
Implement memory page caching optimization in Python/remote_debug.h to
avoid repeated reads of the same memory regions during debugging
operations. The cache stores previously read memory pages and reuses
them for subsequent reads, significantly reducing system calls and
improving performance.
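To illustrate the page-caching idea (not the module's actual code), here is a minimal sketch; `read_remote_memory`, `page_cache_entry`, and `cached_read` are invented names:
```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHE_PAGE_SIZE 4096

/* Hypothetical cache of a single remote memory page. */
typedef struct {
    uintptr_t page_addr;            /* page-aligned remote address */
    char data[CACHE_PAGE_SIZE];     /* cached copy of the page */
    bool valid;
} page_cache_entry;

/* Assumed primitive that performs the actual (expensive) remote read. */
extern int read_remote_memory(uintptr_t addr, void *buf, size_t len);

/* Serve small reads from the cached page when possible, refilling on a miss. */
static int
cached_read(page_cache_entry *cache, uintptr_t addr, void *buf, size_t len)
{
    uintptr_t page = addr & ~((uintptr_t)CACHE_PAGE_SIZE - 1);
    /* Only handle reads that fit entirely within one page. */
    if (addr + len > page + CACHE_PAGE_SIZE) {
        return read_remote_memory(addr, buf, len);
    }
    if (!cache->valid || cache->page_addr != page) {
        if (read_remote_memory(page, cache->data, CACHE_PAGE_SIZE) < 0) {
            return -1;
        }
        cache->page_addr = page;
        cache->valid = true;
    }
    memcpy(buf, cache->data + (addr - page), len);
    return 0;
}
```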
Add code object caching mechanism with a new code_object_generation
field in the interpreter state that tracks when code object caches need
invalidation. This allows efficient reuse of parsed code object metadata
and eliminates redundant processing of the same code objects across
debugging sessions.
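The generation-based invalidation amounts to stamping each cache entry with the generation it was built under; a rough sketch with illustrative names (these are not the interpreter's actual fields):
```c
#include <stdint.h>

/* Illustrative cached metadata for one remote code object. */
typedef struct {
    uint64_t generation;    /* generation at which the entry was populated */
    /* ... parsed code object metadata would live here ... */
} code_cache_entry;

/* Illustrative counter, bumped whenever code objects in the target
 * interpreter may have changed. */
static uint64_t code_object_generation;

/* Reuse the cached entry only if it was built for the current generation. */
static int
cache_entry_is_fresh(const code_cache_entry *entry)
{
    return entry->generation == code_object_generation;
}
```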
Optimize memory operations by replacing multiple individual structure
copies with single bulk reads for the same data structures. This reduces
the number of memory operations and system calls required to gather
debugging information from the target process.
Update Makefile.pre.in to include Python/remote_debug.h in the headers
list, ensuring that changes to the remote debugging header force proper
recompilation of dependent modules and maintain build consistency across
the codebase.
Also, make the module compatible with the free threading build as an extra :)
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
|
|
|
|
|
| |
Switches over to a _Py_thread_local in place of autoTssKey, and also fixes a few other checks regarding PyGILState_Ensure after finalization.
Note that this doesn't fix concurrent use of PyGILState_Ensure with Py_Finalize; I'm pretty sure zapthreads doesn't work at all, and that needs to be fixed separately.
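For context, the shape of the change is roughly the following sketch; it uses plain C11 `_Thread_local` (CPython's `_Py_thread_local` macro expands to the platform equivalent) and an invented variable name:
```c
#include <Python.h>

/* Sketch: a C thread-local pointer, valid for the lifetime of the thread,
 * with no TSS key that has to be created, looked up, or deleted during
 * runtime finalization. */
static _Thread_local PyThreadState *auto_tstate = NULL;

static void
set_auto_tstate(PyThreadState *tstate)
{
    auto_tstate = tstate;
}

static PyThreadState *
get_auto_tstate(void)
{
    return auto_tstate;
}
```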
|
|
|
|
| |
Don't use PyInterpreterState_GetID(); read the interpreter's 'id' member
directly, which cannot fail.
|
| |
|
|
|
|
|
|
|
|
| |
module (GH-133287)
* Track the current executor, not the previous one, on the thread-state.
* Batch executors for deallocation to avoid having to constantly incref executors; this is an ad-hoc form of deferred reference counting.
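The batching idea in the second bullet can be sketched like this; every name below is invented, and the real executor management is more involved:
```c
#include <Python.h>

#define EXECUTOR_BATCH_SIZE 64

/* Illustrative batch of executor references that are released together,
 * instead of paying an incref/decref pair on every dispatch. */
typedef struct {
    PyObject *items[EXECUTOR_BATCH_SIZE];
    int count;
} executor_free_batch;

/* Release every deferred reference in the batch. */
static void
batch_flush(executor_free_batch *batch)
{
    for (int i = 0; i < batch->count; i++) {
        Py_DECREF(batch->items[i]);
    }
    batch->count = 0;
}

/* Defer the decref; flush only when the batch is full. */
static void
batch_release(executor_free_batch *batch, PyObject *executor)
{
    if (batch->count == EXECUTOR_BATCH_SIZE) {
        batch_flush(batch);
    }
    batch->items[batch->count++] = executor;
}
```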
|
| |
|
|
|
|
| |
instruction offset (GH-132545)
|
|
|
|
|
|
|
|
| |
(GH-133080)
Both were added in 3.13, are undocumented, and don't make sense in 3.14 due to
changes in the stack overflow detection machinery (gh-112282).
PEP 387 exception for skipping a deprecation period: https://github.com/python/steering-council/issues/288
|
|
|
|
|
| |
We replace it with _Py_GetMainModule() and add _Py_CheckMainModule(), both in the internal-only C API. We also add _PyImport_GetModulesRef(), which is the equivalent of _PyImport_GetModules(), but which increfs before the lock is released.
This is used by a later change related to pickle and handling __main__.
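The reason for the `...Ref` variant is the ordering of the incref relative to the unlock; a sketch with placeholder lock helpers (not the real import-state internals):
```c
#include <Python.h>

/* Placeholder helpers standing in for whatever lock protects the modules
 * mapping internally. */
extern void modules_lock(PyInterpreterState *interp);
extern void modules_unlock(PyInterpreterState *interp);
extern PyObject *get_modules_dict(PyInterpreterState *interp);  /* borrowed */

/* Take a strong reference while still holding the lock, so the dict cannot
 * be torn down between reading the pointer and increfing it. */
static PyObject *
get_modules_ref(PyInterpreterState *interp)
{
    modules_lock(interp);
    PyObject *modules = get_modules_dict(interp);   /* borrowed */
    Py_XINCREF(modules);
    modules_unlock(interp);
    return modules;   /* new reference, or NULL */
}
```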
|
| |
|
|
|
|
| |
Add an explicit include to pycore_interpframe_structs.h in
pycore_runtime_structs.h to fix a dependency cycle.
|
|
|
| |
Also remove now-unused includes from C files.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Convert static inline functions to functions:
* _Py_IsMainThread()
* _PyInterpreterState_Main()
* _Py_IsMainInterpreterFinalizing()
* _Py_GetMainConfig()
|
|
|
|
|
| |
* Moves most structs in pycore_ header files into pycore_structs.h and pycore_runtime_structs.h
* Removes many cross-header dependencies
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The PyThreadState struct gains a reference count field to avoid
issues with PyThreadState being a dangling pointer to freed memory.
The refcount starts with a value of two: one reference is owned by the
interpreter's linked list of thread states and one reference is owned by
the OS thread. The reference count is decremented when the thread state
is removed from the interpreter's linked list and before the OS thread
calls `PyThread_hang_thread()`. The thread that decrements it to zero
frees the `PyThreadState` memory.
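The lifetime rule described above boils down to an atomic counter whose last owner frees the memory; a simplified sketch with invented names:
```c
#include <Python.h>
#include <stdatomic.h>

/* Illustrative wrapper: in the real change the counter lives inside
 * PyThreadState itself. */
typedef struct {
    PyThreadState *tstate;
    atomic_int refcount;    /* starts at 2: interpreter list + OS thread */
} tstate_ref;

/* Stand-in for whatever actually releases the PyThreadState memory. */
extern void free_tstate(PyThreadState *tstate);

static void
tstate_ref_decref(tstate_ref *ref)
{
    /* Whoever drops the last reference frees the memory, so neither the
     * list removal nor the exiting OS thread can leave the other holding
     * a dangling pointer. */
    if (atomic_fetch_sub(&ref->refcount, 1) == 1) {
        free_tstate(ref->tstate);
    }
}
```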
The `holds_gil` field is moved out of the `_status` bit field to avoid
a data race where one thread calls `PyThreadState_Clear()`, modifying the
`_status` bit field, while the OS thread reads `holds_gil` when
attempting to acquire the GIL.
The `PyThreadState.state` field now has `_Py_THREAD_SHUTTING_DOWN` as a
possible value. This corresponds to the `_PyThreadState_MustExit()`
check. This avoids race conditions in the free threading build when
checking `_PyThreadState_MustExit()`.
|
|
|
|
|
| |
* Add location information when accessing an already closed stackref
* Add a build-time #define option to track closed stackrefs, providing precise information for use-after-free and double-free errors.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves `tstate_activate()` down to avoid a data race in the free
threading build on the `_PyRuntime`'s thread-local `autoTSSkey`. This
key is deleted during runtime finalization, which may happen
concurrently with a call to `_PyThreadState_Attach`.
The earlier `tstate_try/wait_attach` ensures that the thread is blocked
before it attempts to access the deleted `autoTSSkey`.
This fixes a TSAN reported data race in
`test_threading.test_import_from_another_thread`.
|
|
|
|
|
|
| |
Windows and macOS require precomputing a "timebase" in order to convert
OS timestamps into nanoseconds. Retrieve and compute this value during
runtime initialization to avoid data races when accessing the time.
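A sketch of the platform calls involved; the cached globals and function names below are illustrative rather than CPython's:
```c
#include <stdint.h>

#if defined(__APPLE__)
#  include <mach/mach_time.h>
static mach_timebase_info_data_t cached_timebase;   /* numer/denom */
#elif defined(_WIN32)
#  include <windows.h>
static LONGLONG cached_qpc_frequency;               /* ticks per second */
#endif

/* Called once during runtime initialization, before any thread that reads
 * the clock exists, so later readers see an immutable value. */
static void
init_time_bases(void)
{
#if defined(__APPLE__)
    (void)mach_timebase_info(&cached_timebase);
#elif defined(_WIN32)
    LARGE_INTEGER freq;
    (void)QueryPerformanceFrequency(&freq);
    cached_qpc_frequency = freq.QuadPart;
#endif
}

#if defined(__APPLE__)
/* Convert mach absolute-time ticks to nanoseconds using the cached timebase. */
static uint64_t
ticks_to_ns(uint64_t ticks)
{
    return ticks * cached_timebase.numer / cached_timebase.denom;
}
#endif
```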
|
|
|
|
|
|
|
|
|
|
|
| |
* Implement C recursion protection with limit pointers for Linux, macOS and Windows (see the sketch after this list)
* Remove calls to PyOS_CheckStack
* Add stack protection to parser
* Make tests more robust to low stacks
* Improve error messages for stack overflow
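The limit-pointer check in the first bullet reduces to comparing the current stack address against a precomputed per-thread bound; a simplified sketch with invented names, assuming a downward-growing stack:
```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative per-thread bound: the lowest stack address we allow before
 * reporting a RecursionError. */
typedef struct {
    uintptr_t c_stack_soft_limit;
} thread_stack_info;

/* True if it is still safe to make another nested C call. The address of a
 * local variable approximates the current stack position. */
static bool
stack_has_room(const thread_stack_info *info)
{
    char local;
    return (uintptr_t)&local > info->c_stack_soft_limit;
}
```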
|
|
|
|
|
|
|
|
|
| |
counters. (GH-130007)" for now (GH-130413)
Revert "GH-91079: Implement C stack limits using addresses, not counters. (GH-130007)" for now
Unfortunately, the change broke some buildbots.
This reverts commit 2498c22fa0a2b560491bc503fa676585c1a603d0.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CPython currently changes `PYMEM_DOMAIN_RAW` to the default allocator
temporarily during initialization and shutdown. The motivation is to
ensure that core runtime structures are allocated and freed using the
same allocator. However, modifying the current allocator changes global
state and is not thread-safe, even with the GIL. Other threads may be
allocating or freeing objects using PYMEM_DOMAIN_RAW; they are not
required to hold the GIL to call PyMem_RawMalloc/PyMem_RawFree.
This adds new internal-only functions like `_PyMem_DefaultRawMalloc`
that aren't affected by calls to `PyMem_SetAllocator()`, so they're
appropriate for Python runtime initialization and finalization. Use
these calls in places where we previously swapped to the default raw
allocator.
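Conceptually, the new helpers bypass the configurable allocator table and go straight to the system allocator; a sketch of that shape (not the actual obmalloc.c code):
```c
#include <stdlib.h>

/* Sketch: always use the system allocator, regardless of what
 * PyMem_SetAllocator() has installed for PYMEM_DOMAIN_RAW. */
static void *
default_raw_malloc(size_t size)
{
    /* malloc(0) may return NULL; normalize so NULL always means failure,
     * mirroring the usual CPython convention. */
    if (size == 0) {
        size = 1;
    }
    return malloc(size);
}

static void
default_raw_free(void *ptr)
{
    free(ptr);
}
```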
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Implement C recursion protection with limit pointers
* Remove calls to PyOS_CheckStack
* Add stack protection to parser
* Make tests more robust to low stacks
* Improve error messages for stack overflow
|
|
|
| |
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
enable profiling (#124640)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Co-authored-by: Savannah Ostrowski <savannahostrowski@gmail.com>
Co-authored-by: Jacob Coffee <jacob@z7x.org>
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
|
|
|
|
|
|
| |
Remove _PyInterpreterState_GetConfigCopy() and
_PyInterpreterState_SetConfig() private functions. PEP 741 "Python
Configuration C API" added a better public C API: PyConfig_Get() and
PyConfig_Set().
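For reference, reading a configuration option through the public replacement looks roughly like this, assuming the PEP 741 signature `PyObject *PyConfig_Get(const char *name)` returning a new reference:
```c
#include <Python.h>

/* Minimal sketch: fetch the "verbose" option and print it. Error handling
 * is kept to the essentials. */
static int
print_verbose_flag(void)
{
    PyObject *verbose = PyConfig_Get("verbose");   /* new reference */
    if (verbose == NULL) {
        return -1;   /* unknown option or other error */
    }
    int rc = PyObject_Print(verbose, stdout, 0);
    Py_DECREF(verbose);
    return rc;
}
```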
|
|
|
|
|
| |
state (#128361)
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
|
|
|
|
| |
(GH-128121)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
builds (#127123)
The CALL family of instructions was mostly thread-safe already and only required a small number of changes, which are documented below.
A few changes were needed to make CALL_ALLOC_AND_ENTER_INIT thread-safe:
* Added _PyType_LookupRefAndVersion, which returns the type version corresponding to the returned ref.
* Added _PyType_CacheInitForSpecialization, which takes an init method and the corresponding type version and only populates the specialization cache if the current type version matches the supplied version (sketched below). This prevents potentially caching a stale value in free-threaded builds if we race with an update to __init__.
* Only cache __init__ functions that are deferred in free-threaded builds. This ensures that the reference to __init__ that is stored in the specialization cache is valid if the type version guard in _CHECK_AND_ALLOCATE_OBJECT passes.
* Fix a bug in _CREATE_INIT_FRAME where the frame is pushed to the stack on failure.
A few other miscellaneous changes were also needed:
* Use {LOCK,UNLOCK}_OBJECT in LIST_APPEND. This ensures that the list's per-object lock is held while we are appending to it.
* Add missing co_tlbc for _Py_InitCleanup.
* Stop/start the world around setting the eval frame hook. This allows us to read interp->eval_frame non-atomically and preserves the behavior of _CHECK_PEP_523 documented below.
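The version-guarded caching pattern used by _PyType_CacheInitForSpecialization can be sketched generically as below; the struct, fields, and `type_get_version` helper are invented, and the real code performs this under the appropriate lock:
```c
#include <Python.h>
#include <stdatomic.h>

/* Hypothetical accessor for a type's current version tag. */
extern unsigned int type_get_version(PyTypeObject *type);

/* Illustrative cache slot: the __init__ to call plus the type version it
 * was recorded against. */
typedef struct {
    _Atomic(PyObject *) init_func;
    _Atomic(unsigned int) type_version;
} init_cache_slot;

/* Populate the cache only if the type's version still matches the version
 * observed when `init` was looked up; if another thread mutated __init__
 * in between, the versions differ and the stale value is not cached. */
static int
cache_init_if_current(init_cache_slot *slot, PyTypeObject *type,
                      PyObject *init, unsigned int looked_up_version)
{
    if (type_get_version(type) != looked_up_version) {
        return 0;   /* lost the race: do not cache */
    }
    atomic_store(&slot->type_version, looked_up_version);
    atomic_store(&slot->init_func, init);
    return 1;
}
```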
|
|
|
|
|
|
|
|
|
|
| |
startup failure (GH-109761)
If Python fails to start a newly created thread
due to a failure of the underlying PyThread_start_new_thread() call,
its state should be removed from the interpreter's thread states list
to avoid double cleanup.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
|
|
|
| |
This is a precursor to the actual fix for gh-114940, where we will change these macros to use the new lock. This change is almost entirely mechanical; the exceptions are the loops in codeobject.c and ceval.c, which now hold the "head" lock. Note that almost all of the uses of _Py_FOR_EACH_TSTATE_UNLOCKED() here will change to _Py_FOR_EACH_TSTATE_BEGIN() once we add the new per-interpreter lock.
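The macro pair is essentially a bounded walk over the interpreter's thread-state list; a simplified stand-in built on the public iteration API (the real internal macros differ):
```c
#include <Python.h>

/* Simplified shape of the iteration macro: walk the interpreter's linked
 * list of thread states. Locking (the point of the later change) is
 * deliberately omitted here. */
#define FOR_EACH_TSTATE_UNLOCKED(interp, t)                         \
    for (PyThreadState *t = PyInterpreterState_ThreadHead(interp);  \
         t != NULL;                                                 \
         t = PyThreadState_Next(t))

/* Example use: count the thread states attached to an interpreter. */
static Py_ssize_t
count_thread_states(PyInterpreterState *interp)
{
    Py_ssize_t n = 0;
    FOR_EACH_TSTATE_UNLOCKED(interp, t) {
        n++;
    }
    return n;
}
```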
|
|
|
|
| |
(gh-121343)
|
|
|
|
|
| |
PyInterpreterState Field (gh-126989)
This approach eliminates the originally reported race. It also gets rid of the deadlock reported in gh-96071, so we can remove the workaround added then.
|
|
|
| |
We replace it with _PyErr_SetInterpreterAlreadyRunning().
|
|
|
|
|
|
|
|
| |
The primary objective here is to allow some later changes to be cleaner. Mostly this involves renaming things and moving a few things around.
* CrossInterpreterData -> XIData
* crossinterpdatafunc -> xidatafunc
* split out pycore_crossinterp_data_registry.h
* add _PyXIData_lookup_t
|
|
|
|
|
|
|
|
|
| |
`BINARY_OP` (#123926)
Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying their copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads (see the sketch below).
Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.
Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
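A sketch of the per-thread lookup described above, with invented field names; index 0 is the shared "main" copy:
```c
#include <Python.h>

/* Illustrative layout: an array of bytecode copies hanging off the code
 * object, indexed by a small per-thread integer. */
typedef struct {
    Py_ssize_t size;
    char *entries[1];   /* entries[0] is the shared "main" bytecode */
} tlbc_array;

/* Pick this thread's copy of the bytecode, falling back to the shared copy
 * when the thread has no private copy (or thread-local bytecode is
 * disabled). */
static char *
get_thread_bytecode(tlbc_array *tlbc, Py_ssize_t thread_index)
{
    if (thread_index < tlbc->size && tlbc->entries[thread_index] != NULL) {
        return tlbc->entries[thread_index];
    }
    return tlbc->entries[0];
}
```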
|
|
|
|
|
| |
They used to be shared, before 3.12. Returning to sharing them resolves a failure on Py_TRACE_REFS builds.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is essentially a cleanup, moving a handful of API declarations to the header files where they fit best, creating new ones when needed.
We do the following:
* add pycore_debug_offsets.h and move _Py_DebugOffsets, etc. there
* inline struct _getargs_runtime_state and struct _gilstate_runtime_state in _PyRuntimeState
* move struct _reftracer_runtime_state to the existing pycore_object_state.h
* add pycore_audit.h and move to it _Py_AuditHookEntry, _PySys_Audit(), and _PySys_ClearAuditHooks
* add audit.h and cpython/audit.h and move the existing audit-related API there
* move the perfmap/trampoline API from cpython/sysmodule.h to cpython/ceval.h, and remove the now-empty cpython/sysmodule.h
|
|
|
|
|
|
|
| |
Use per-thread refcounting for the reference from function objects to
their corresponding code object. This can be a source of contention when
frequently creating nested functions. Deferred refcounting alone isn't a
great fit here because these references are on the heap and may be
modified by other libraries.
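Very roughly, per-thread refcounting lets each thread count its own references in a private slot so hot increfs do not contend on one shared counter, and only the combined total is consulted when it matters; all names below are invented:
```c
#include <stdatomic.h>

#define MAX_TRACKED_THREADS 64

/* Illustrative per-object table of per-thread reference contributions. */
typedef struct {
    atomic_long per_thread[MAX_TRACKED_THREADS];
} per_thread_refcounts;

/* Each thread touches only its own slot, so concurrent increfs from
 * different threads do not serialize on a single shared counter.
 * (A real implementation would also pad slots to avoid false sharing.) */
static void
pt_incref(per_thread_refcounts *rc, int thread_index)
{
    atomic_fetch_add_explicit(&rc->per_thread[thread_index], 1,
                              memory_order_relaxed);
}

/* The true count is needed only rarely (e.g. when deciding whether an
 * object can be freed); it is the sum of all contributions. */
static long
pt_total(per_thread_refcounts *rc)
{
    long total = 0;
    for (int i = 0; i < MAX_TRACKED_THREADS; i++) {
        total += atomic_load_explicit(&rc->per_thread[i],
                                      memory_order_relaxed);
    }
    return total;
}
```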
|
| |
|
|
|
|
| |
(#124568)
|
|
|
|
|
|
|
| |
Currently, we only use per-thread reference counting for heap type objects and
the naming reflects that. We will extend it to a few additional types in an
upcoming change to avoid scaling bottlenecks when creating nested functions.
Rename some of the files and functions in preparation for this change.
|
|
|
|
|
| |
(GH-124443)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
|
|
|
| |
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were not properly accounting for interpreter memory leaks at
shutdown and had several sources of leaks:
* Objects that use deferred reference counting and were reachable via
static types outlive the final GC. We now disable deferred reference
counting on all objects if we are calling the GC due to interpreter
shutdown.
* `_PyMem_FreeDelayed` did not properly check for interpreter shutdown
so we had some memory blocks that were enqueued to be freed, but
never actually freed.
* `_PyType_FinalizeIdPool` wasn't called at interpreter shutdown.
|
|
|
|
|
|
|
| |
The free-threaded build partially stores heap type reference counts in a
distributed manner in per-thread arrays. This avoids reference count
contention when creating or destroying instances.
Co-authored-by: Ken Jin <kenjin@python.org>
|