path: root/Python/gc_free_threading.c
* gh-132917: Use /proc/self/status for mem usage info. (#133544) [Neil Schemenauer, 2025-05-08]
  On Linux, use /proc/self/status for memory usage info. Using smaps_rollup is quite a lot slower, and similar info is available from /proc/self/status.
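A minimal sketch of reading memory usage from /proc/self/status, in the spirit of the change described above; the parsing and the VmRSS/VmSwap selection are illustrative assumptions, not the commit's actual code:

```c
#include <stdio.h>
#include <string.h>

/* Sketch: return VmRSS + VmSwap in kilobytes, or -1 on failure. */
static long
get_mem_usage_kb(void)
{
    FILE *fp = fopen("/proc/self/status", "r");
    if (fp == NULL) {
        return -1;
    }
    long rss_kb = 0, swap_kb = 0;
    char line[256];
    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Lines look like "VmRSS:     123456 kB". */
        if (strncmp(line, "VmRSS:", 6) == 0) {
            sscanf(line + 6, "%ld", &rss_kb);
        }
        else if (strncmp(line, "VmSwap:", 7) == 0) {
            sscanf(line + 7, "%ld", &swap_kb);
        }
    }
    fclose(fp);
    return rss_kb + swap_kb;
}
```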
* gh-132917: Fix data race detected by tsan (#133508) [T. Wouters, 2025-05-06]
  Fix a data race detected by tsan (https://github.com/python/cpython/actions/runs/14857021107/job/41712717208?pr=133502): young.count can be modified by other threads even while the gcstate is locked. This is the simplest fix to (potentially) unblock beta 1, although this particular code path seems like it could just be an atomic swap followed by an atomic add, without holding the lock at all.
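The lock-free alternative the message hints at would look roughly like the sketch below, written with C11 <stdatomic.h> rather than CPython's internal atomics; the variable names are illustrative assumptions:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative counters; the real gcstate uses its own types and fields. */
static atomic_size_t young_count;
static atomic_size_t total_count;

static void
drain_young_count(void)
{
    /* Atomically take the current value and reset the counter to zero... */
    size_t taken = atomic_exchange(&young_count, 0);
    /* ...then fold it into a running total, without holding any lock. */
    atomic_fetch_add(&total_count, taken);
}
```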
* gh-132917: Use RSS + swap for estimate of process memory usage (gh-133464) [Neil Schemenauer, 2025-05-05]
* gh-132917: Check resident set size (RSS) before GC trigger. (gh-133399) [Neil Schemenauer, 2025-05-05]
  For the free-threaded build, check the increase in the process resident set size (RSS) before triggering a full automatic garbage collection. If the RSS has not increased by 10% since the last collection, the collection is deferred.
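A rough sketch of such an RSS-based deferral heuristic; the 10% figure comes from the message, while the function and field names below are illustrative assumptions:

```c
#include <stdbool.h>

/* Hypothetical state carried between collections (sketch only). */
typedef struct {
    long last_mem_kb;   /* memory usage measured at the last full collection */
} gc_mem_state;

/* Run a full collection only if memory usage grew by at least 10%. */
static bool
should_run_full_gc(gc_mem_state *st, long current_mem_kb)
{
    if (st->last_mem_kb <= 0) {
        return true;    /* no baseline yet */
    }
    long threshold = st->last_mem_kb + st->last_mem_kb / 10;
    return current_mem_kb >= threshold;
}
```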
* GH-124715: Move trashcan mechanism into `Py_Dealloc` (GH-132280) [Mark Shannon, 2025-04-30]
* GH-132508: Use tagged integers on the evaluation stack for the last instruction offset (GH-132545) [Mark Shannon, 2025-04-29]
* gh-132399: fix invalid function signatures on the free-threaded build (#132400) [Bénédikt Tran, 2025-04-12]
* gh-131586: Avoid refcount contention in some "special" calls (#131588) [Sam Gross, 2025-03-26]
  In the free-threaded build, the `_PyObject_LookupSpecial()` call can lead to reference count contention on the returned function object because it doesn't use stackrefs. Refactor some of the callers to use `_PyObject_MaybeCallSpecialNoArgs`, which uses stackrefs internally. This fixes the scaling bottleneck in the "lookup_special" microbenchmark in `ftscalingbench.py`. However, there are still some uses of `_PyObject_LookupSpecial()` that need to be addressed in future PRs.
* gh-131238: Remove includes from pycore_interp.h (#131495) [Victor Stinner, 2025-03-20]
  Also remove now-unused includes in C files.
* gh-131238: Remove many includes from pycore_interp.h (#131472) [Victor Stinner, 2025-03-19]
* gh-130931: Add pycore_interpframe.h internal header (#131249) [Victor Stinner, 2025-03-19]
  Move _PyInterpreterFrame and associated functions to a new pycore_interpframe.h header.
* gh-130019: Fix data race in _PyType_AllocNoTrack (gh-130058) [Sam Gross, 2025-02-13]
  The reference count fields, such as `ob_tid` and `ob_ref_shared`, may be accessed concurrently in the free-threaded build by a `_Py_TryXGetRef` or similar operation. The PyObject header fields will be initialized by `_PyObject_Init`, so only call `memset()` to zero-initialize the remainder of the allocation.
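A minimal sketch of the pattern described above: initialize the object header first, then zero only the trailing bytes so concurrent readers of the refcount fields never see them clobbered. The ordering and the use of the public PyObject_Init() here are simplifying assumptions, not the exact code of `_PyType_AllocNoTrack`:

```c
#include <Python.h>
#include <string.h>

static PyObject *
alloc_no_track_sketch(PyTypeObject *tp, size_t size)
{
    PyObject *op = (PyObject *)PyObject_Malloc(size);
    if (op == NULL) {
        return PyErr_NoMemory();
    }
    /* Give the header fields (refcount, type, ...) well-defined values. */
    PyObject_Init(op, tp);
    /* Zero-initialize only the remainder of the allocation, leaving the
       header fields untouched. */
    memset((char *)op + sizeof(PyObject), 0, size - sizeof(PyObject));
    return op;
}
```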
* gh-130030: Fix crash on 32-bit Linux with free threading (gh-130043) [Sam Gross, 2025-02-12]
  The `gc_get_refs` assertion needs to come after we check the alive and unreachable bits. Otherwise, `ob_tid` may store the actual thread id instead of the computed `gc_refs`, which may trigger the assertion if the `ob_tid` looks like a negative value. Also fix a few type warnings on 32-bit systems.
* gh-129533: Update PyGC_Enable/Disable/IsEnabled to use atomic operation (gh-129563) [Donghee Na, 2025-02-07]
* gh-129201: Use prefetch in GC mark alive phase. (gh-129203) [Neil Schemenauer, 2025-02-05]
  For the free-threaded version of the cyclic GC, restructure the "mark alive" phase to use software prefetch instructions. This gives a speedup in most cases when the number of objects is large enough. The prefetching is enabled conditionally based on the number of long-lived objects the GC finds.
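A minimal sketch of the software-prefetch idea: issue a prefetch a few elements ahead of the one being visited so its header is likely in cache when needed. The buffer layout, the prefetch distance, and `visit_object` are assumptions, not the commit's implementation:

```c
#include <stddef.h>

/* Hypothetical marking routine; stands in for the GC's real visit logic. */
static void visit_object(void *op) { (void)op; }

#define PREFETCH_AHEAD 8

static void
mark_alive_sketch(void **objects, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n) {
            /* GCC/Clang builtin: read prefetch, moderate temporal locality. */
            __builtin_prefetch(objects[i + PREFETCH_AHEAD], 0, 2);
        }
        visit_object(objects[i]);
    }
}
```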
* gh-129354: Use PyErr_FormatUnraisable() function (#129514) [Victor Stinner, 2025-01-31]
  Replace PyErr_WriteUnraisable() with PyErr_FormatUnraisable().
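For reference, a small usage sketch of the replacement call; the message text is an invented example following the "while <verb>ing" wording adopted in the follow-up commit below:

```c
#include <Python.h>

/* Sketch: report an exception that cannot be propagated, with context.
   Previously this would have been PyErr_WriteUnraisable(callback). */
static void
report_callback_error_sketch(PyObject *callback)
{
    PyErr_FormatUnraisable("Exception ignored while calling GC callback %R",
                           callback);
}
```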
* gh-129354: Fix grammar in PyErr_FormatUnraisable() (#129475) [Victor Stinner, 2025-01-31]
  Replace "on verb+ing" with "while verb+ing".
* gh-129236: Use `stackpointer` in free threaded GC (#129240) [Sam Gross, 2025-01-29]
  The stack pointers in interpreter frames are nearly always valid now, so use them when visiting each thread's frame. For now, don't collect objects with deferred references in the rare case that we see a frame with a NULL stack pointer.
* gh-128807: Add marking phase for free-threaded cyclic GC (gh-128808) [Neil Schemenauer, 2025-01-15]
* gh-114940: Add _Py_FOR_EACH_TSTATE_UNLOCKED(), and Friends (gh-127077) [Eric Snow, 2024-11-21]
  This is a precursor to the actual fix for gh-114940, where we will change these macros to use the new lock. This change is almost entirely mechanical; the exceptions are the loops in codeobject.c and ceval.c, which now hold the "head" lock. Note that almost all of the uses of _Py_FOR_EACH_TSTATE_UNLOCKED() here will change to _Py_FOR_EACH_TSTATE_BEGIN() once we add the new per-interpreter lock.
* gh-124470: Fix crash when reading from object instance dictionary while replacing it (#122489) [Dino Viehland, 2024-11-21]
  Delay freeing the dictionary when replacing it.
* GH-127010: Don't lazily track and untrack dicts (GH-127027) [Mark Shannon, 2024-11-20]
* Revert "GH-126491: GC: Mark objects reachable from roots before doing cycle ↵Hugo van Kemenade2024-11-19
| | | | collection (GH-126502)" (#126983)
* GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502) [Mark Shannon, 2024-11-18]
  * Mark almost all reachable objects before doing collection phase
  * Add stats for objects marked
  * Visit new frames before each increment
  * Remove lazy dict tracking
  * Update docs
  * Clearer calculation of work to do
* gh-126312: Don't traverse frozen objects on the free-threaded build (#126338) [Peter Bierma, 2024-11-15]
  Also, _PyGC_Freeze() no longer freezes unreachable objects.
  Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
* gh-115999: Implement thread-local bytecode and enable specialization for `BINARY_OP` (#123926) [mpage, 2024-11-04]
  Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying their copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.
  Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.
  Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
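A rough sketch of the per-thread lookup described above; only `co_tlbc` is named in the message, so the array layout, the `tlbc_index` slot, and the helper below are purely illustrative assumptions:

```c
#include <stddef.h>

/* Sketch: each thread indexes into the code object's array of bytecode
   copies using its own reserved slot; entries[0] is the "main" copy. */
typedef struct {
    size_t size;
    unsigned char *entries[];
} tlbc_array_sketch;

static unsigned char *
get_tlbc_sketch(const tlbc_array_sketch *co_tlbc, size_t tlbc_index)
{
    if (tlbc_index < co_tlbc->size && co_tlbc->entries[tlbc_index] != NULL) {
        return co_tlbc->entries[tlbc_index];    /* this thread's copy */
    }
    return co_tlbc->entries[0];    /* fall back to the shared "main" copy */
}
```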
* gh-125859: Fix crash when `gc.get_objects` is called during GC (#125882) [Sam Gross, 2024-10-24]
  This fixes a crash when `gc.get_objects()` or `gc.get_referrers()` is called during a GC in the free-threaded build. Switch to `_PyObjectStack` to avoid corrupting the `struct worklist` linked list maintained by the GC. Also, don't return objects that are frozen (`gc.freeze()`) or in the process of being collected, to more closely match the behavior of the default build.
* gh-124218: Use per-thread refcounts for code objects (#125216) [Sam Gross, 2024-10-15]
  Use per-thread refcounting for the reference from function objects to their corresponding code object. This can be a source of contention when frequently creating nested functions. Deferred refcounting alone isn't a great fit here because these references are on the heap and may be modified by other libraries.
* gh-124375: Avoid calling `_PyMem_ProcessDelayed` on other thread states (#124459) [Sam Gross, 2024-10-15]
  This fixes a crash when running the PyO3 test suite on the free-threaded build. The `qsbr` field is initialized after the `PyThreadState` is added to the interpreter's linked list, so it might still be NULL. Instead, we "steal" the queue of to-be-freed memory blocks. This queue is always initialized (possibly empty) and protected by the stop-the-world pause.
* gh-124218: Refactor per-thread reference counting (#124844) [Sam Gross, 2024-10-01]
  Currently, we only use per-thread reference counting for heap type objects and the naming reflects that. We will extend it to a few additional types in an upcoming change to avoid scaling bottlenecks when creating nested functions. Rename some of the files and functions in preparation for this change.
* gh-123923: Defer refcounting for `f_funcobj` in `_PyInterpreterFrame` (#124026) [Sam Gross, 2024-09-24]
  Use a `_PyStackRef` and defer the reference to `f_funcobj` when possible. This avoids some reference count contention in the common case of executing the same code object from multiple threads concurrently in the free-threaded build.
* gh-124068: Fix reference leak with generators in the free-threaded build (#124069) [Sam Gross, 2024-09-13]
  If the generator is already cleared, then most fields in the generator's frame, other than f_executable, are not valid. The invalid fields may contain dangling pointers and should not be used.
* gh-123923: Defer refcounting for `f_executable` in `_PyInterpreterFrame` (#123924) [Sam Gross, 2024-09-12]
  Use a `_PyStackRef` and defer the reference to `f_executable` when possible. This avoids some reference count contention in the common case of executing the same code object from multiple threads concurrently in the free-threaded build.
* GH-115776: Allow any fixed sized object to have inline values (GH-123192) [Mark Shannon, 2024-08-21]
* gh-117139: Garbage collector support for deferred refcounting (#122956) [Sam Gross, 2024-08-15]
  The free-threaded GC now visits interpreter stacks to keep objects that use deferred reference counting alive. Interpreter frames are zero-initialized in the free-threaded GC so that the GC doesn't see garbage data. This is a temporary measure until stack spilling around escaping calls is implemented.
  Co-authored-by: Ken Jin <kenjin@python.org>
* gh-122697: Fix free-threading memory leaks at shutdown (#122703) [Sam Gross, 2024-08-08]
  We were not properly accounting for interpreter memory leaks at shutdown and had the following sources of leaks:
  * Objects that use deferred reference counting and were reachable via static types outlive the final GC. We now disable deferred reference counting on all objects if we are calling the GC due to interpreter shutdown.
  * `_PyMem_FreeDelayed` did not properly check for interpreter shutdown, so we had some memory blocks that were enqueued to be freed but never actually freed.
  * `_PyType_FinalizeIdPool` wasn't called at interpreter shutdown.
* gh-122417: Implement per-thread heap type refcounts (#122418) [Sam Gross, 2024-08-06]
  The free-threaded build partially stores heap type reference counts in a distributed manner in per-thread arrays. This avoids reference count contention when creating or destroying instances.
  Co-authored-by: Ken Jin <kenjin@python.org>
* gh-100240: Use a consistent implementation for freelists (#121934) [Sam Gross, 2024-07-22]
  This combines and updates our freelist handling to use a consistent implementation. Objects in the freelist are linked together using the first word of the memory block. If configured with freelists disabled, these operations are essentially no-ops.
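A minimal sketch of that linking scheme, reusing the first word of each free block as the link to the next one; this is a generic illustration, not CPython's freelist code:

```c
#include <stddef.h>

/* Sketch: a freelist where each free block's first word stores the link. */
typedef struct {
    void *head;     /* most recently freed block, or NULL */
    size_t size;    /* number of blocks currently in the list */
} freelist_sketch;

static void
freelist_push(freelist_sketch *fl, void *block)
{
    /* Store the old head in the first word of the block being freed. */
    *(void **)block = fl->head;
    fl->head = block;
    fl->size++;
}

static void *
freelist_pop(freelist_sketch *fl)
{
    void *block = fl->head;
    if (block != NULL) {
        /* The first word of the block holds the next free block. */
        fl->head = *(void **)block;
        fl->size--;
    }
    return block;
}
```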
* gh-121794: Don't set `ob_tid` to zero in fast-path dealloc (#121799) [Sam Gross, 2024-07-15]
  We should maintain the invariant that a zero `ob_tid` implies the refcount fields are merged.
  * Move the assignment in `_Py_MergeZeroLocalRefcount` to immediately before the refcount merge.
  * Update `_PyTrash_thread_destroy_chain` to set `ob_ref_shared` to `_Py_REF_MERGED` when setting `ob_tid` to zero.
  Also check this invariant with assertions in the GC in debug builds. That uncovered a bug when running out of memory during GC.
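A small sketch of checking the invariant described above in debug builds; it assumes the free-threaded (Py_GIL_DISABLED) object layout where `ob_tid`, `ob_ref_shared`, and `_Py_REF_MERGED` are available (internal headers may be required), and the helper name is hypothetical:

```c
#include <Python.h>
#include <assert.h>

/* Sketch: a zero ob_tid must imply the shared refcount field is merged.
   The fields and _Py_REF_MERGED come from CPython's free-threaded object
   layout, as named in the commit message. */
static void
check_merged_invariant_sketch(PyObject *op)
{
#ifdef Py_GIL_DISABLED
    if (op->ob_tid == 0) {
        assert(op->ob_ref_shared == _Py_REF_MERGED);
    }
#endif
}
```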
* gh-117657: Fix race involving GC and heap initialization (#119923) [Sam Gross, 2024-06-04]
  The `_PyThreadState_Bind()` function is called before the first `PyEval_AcquireThread()`, so it's not synchronized with the stop-the-world GC. We had a race where `gc_visit_heaps()` might visit a thread's heap while it's being initialized. Use a simple atomic int to avoid visiting heaps for threads that are not yet fully initialized (i.e., before `tstate_mimalloc_bind()` is called).
  The race was reproducible by running: `python Lib/test/test_importlib/partial/pool_in_threads.py`.
* gh-117657: Fix race involving immortalizing objects (#119927) [Sam Gross, 2024-06-03]
  The free-threaded build currently immortalizes objects that use deferred reference counting (see gh-117783). This typically happens once the first non-main thread is created, but the behavior can be suppressed for tests, in subinterpreters, or during a compile() call. This fixes a race condition involving the tracking of whether the behavior is suppressed.
* gh-117657: Fix TSAN race in free-threaded GC (#119883) [Sam Gross, 2024-06-01]
  Only call `gc_restore_tid()` from stop-the-world contexts. `worklist_pop()` can be called while other threads are running, so use a relaxed atomic to modify `ob_tid`.
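A minimal sketch of a relaxed atomic store for a tid-like field, using C11 <stdatomic.h> as a stand-in for CPython's internal atomic wrappers:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Sketch: clear a thread-id field with a relaxed store so concurrent
   readers never observe a torn write; no ordering guarantees beyond
   atomicity are needed here. */
static void
clear_tid_relaxed(_Atomic uintptr_t *tid_field)
{
    atomic_store_explicit(tid_field, 0, memory_order_relaxed);
}
```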
* gh-110850: Remove _PyTime_TimeUnchecked() function (#118552) [Victor Stinner, 2024-05-05]
  Use the new public Raw functions:
  * PyTime_PerfCounterRaw() replaces _PyTime_PerfCounterUnchecked()
  * PyTime_TimeRaw() replaces _PyTime_TimeUnchecked()
  * PyTime_MonotonicRaw() replaces _PyTime_MonotonicUnchecked()
  Remove the internal functions:
  * _PyTime_PerfCounterUnchecked()
  * _PyTime_TimeUnchecked()
  * _PyTime_MonotonicUnchecked()
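A small usage sketch of the public Raw functions mentioned above; error handling is kept minimal:

```c
#include <Python.h>
#include <stdio.h>

/* Sketch: the Raw variants read the clocks without setting a Python
   exception on failure. */
static void
print_raw_times_sketch(void)
{
    PyTime_t t;
    if (PyTime_TimeRaw(&t) == 0) {
        /* PyTime_AsSecondsDouble() converts nanoseconds to seconds. */
        printf("time: %f s\n", PyTime_AsSecondsDouble(t));
    }
    if (PyTime_PerfCounterRaw(&t) == 0) {
        printf("perf counter: %f s\n", PyTime_AsSecondsDouble(t));
    }
}
```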
* gh-117657: TSAN fix race on `gstate->young.count` (#118313) [Alex Turner, 2024-04-29]
* gh-117783: Immortalize objects that use deferred reference counting (#118112) [Sam Gross, 2024-04-29]
  Deferred reference counting is not fully implemented yet. As a temporary measure, we immortalize objects that would use deferred reference counting to avoid multi-threaded scaling bottlenecks. This is only performed in the free-threaded build once the first non-main thread is started. Additionally, some tests, including refleak tests, suppress this behavior.
* gh-117376: Partial implementation of deferred reference counting (#117696) [Sam Gross, 2024-04-12]
  This marks objects as using deferred reference counting via the `ob_gc_bits` field in the free-threaded build and collects those objects during GC.
* gh-117439: Make refleak checking thread-safe without the GIL (#117469) [Sam Gross, 2024-04-08]
  This keeps track of the per-thread total reference count operations in PyThreadState in the free-threaded builds. The count is merged into the interpreter's total when the thread exits.
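A rough sketch of the merge-on-exit idea; the struct and field names are invented for illustration and are not CPython's:

```c
#include <stddef.h>

/* Hypothetical per-thread tally of reference count operations. */
typedef struct {
    ptrdiff_t ref_total;    /* increfs minus decrefs done by this thread */
} thread_counts_sketch;

/* Fold the thread's tally into the interpreter-wide total at thread exit;
   the real code does this under appropriate synchronization. */
static void
merge_ref_total_sketch(ptrdiff_t *interp_ref_total, thread_counts_sketch *tc)
{
    *interp_ref_total += tc->ref_total;
    tc->ref_total = 0;
}
```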
* GH-115776: Embed the values array into the object, for "normal" Python objects. (GH-116115) [Mark Shannon, 2024-04-02]
* gh-112529: Don't untrack tuples or dicts with zero refcount (#117370) [Sam Gross, 2024-03-29]
  The free-threaded GC sometimes sees objects with zero refcount. This can happen due to the delay in merging biased reference counting fields, and, in the future, due to deferred reference counting. We should not untrack these objects or they will never be collected.
  This fixes the refleaks in the free-threaded build.
* GH-117108: Change the size of the GC increment to about 1% of the total heap size. (GH-117120) [Mark Shannon, 2024-03-22]