path: root/Python/optimizer.c
Commit message | Author | Age
* gh-115999: Add free-threaded specialization for FOR_ITER (#128798) | T. Wouters | 2025-03-12
  Add free-threaded versions of existing specialization for FOR_ITER (list, tuples, fast range iterators and generators), without significantly affecting their thread-safety. (Iterating over shared lists/tuples/ranges should be fine like before. Reusing iterators between threads is not fine, like before. Sharing generators between threads is a recipe for significant crashes, like before.)
* GH-130296: Avoid stack transients in four instructions. (GH-130310) | Mark Shannon | 2025-02-28
  * Combine _GUARD_GLOBALS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_MODULE_FROM_KEYS into _LOAD_GLOBAL_MODULE
  * Combine _GUARD_BUILTINS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_BUILTINS_FROM_KEYS into _LOAD_GLOBAL_BUILTINS
  * Combine _CHECK_ATTR_MODULE_PUSH_KEYS and _LOAD_ATTR_MODULE_FROM_KEYS into _LOAD_ATTR_MODULE
  * Remove stack transient in LOAD_ATTR_WITH_HINT
* GH-129715: Don't project traces that return to an unknown caller (GH-130024) | Brandt Bucher | 2025-02-12
* gh-100239: replace BINARY_SUBSCR & family by BINARY_OP with oparg NB_SUBSCR (#129700) | Irit Katriel | 2025-02-07
* GH-129715: Remove _DYNAMIC_EXIT (GH-129716) | Brandt Bucher | 2025-02-07
* GH-128682: Spill the stack pointer in labels, as well as instructions (GH-129618) | Mark Shannon | 2025-02-04
* GH-128682: Make `PyStackRef_CLOSE` escaping. (GH-129404) | Mark Shannon | 2025-02-03
* GH-126599: Remove the PyOptimizer API (GH-129194) | Brandt Bucher | 2025-01-28
* GH-128914: Remove all but one conditional stack effects (GH-129226) | Mark Shannon | 2025-01-27
  * Remove all 'if (0)' and 'if (1)' conditional stack effects
  * Use array instead of conditional for BUILD_SLICE args
  * Refactor LOAD_GLOBAL to use a common conditional uop
  * Remove conditional stack effects from LOAD_ATTR specializations
  * Replace conditional stack effects in LOAD_ATTR with a 0 or 1 sized array.
  * Remove conditional stack effects from CALL_FUNCTION_EX
* Revert "GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918)" (GH-129202) | Sam Gross | 2025-01-23
  The commit introduced a ~2.5-3% regression in the free threading build.
  This reverts commit ab61d3f4303d14a413bc9ae6557c730ffdf7579e.
* GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918) | Mark Shannon | 2025-01-20
* GH-126599: Remove the "counter" optimizer/executor (GH-126853) | Xuanteng Huang | 2025-01-16
* GH-128375: Better instrument for `FOR_ITER` (GH-128445) | Mark Shannon | 2025-01-06
* GH-126833: Dumps graphviz representation of executor graph. (GH-126880) | Mark Shannon | 2024-12-13
* gh-115999: Specialize `LOAD_GLOBAL` in free-threaded builds (#126607) | mpage | 2024-11-21
  Enable specialization of LOAD_GLOBAL in free-threaded builds.
  Thread-safety of specialization in free-threaded builds is provided by the following:
  * A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization.
  * Generation of new keys versions is made atomic in free-threaded builds.
  * Existing helpers are used to atomically modify the opcode.
  Thread-safety of specialized instructions in free-threaded builds is provided by the following:
  * Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards.
  * Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset.
  * The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.
* gh-120619: Strength reduce function guards, support 2-operand uop forms (GH-124846) | Ken Jin | 2024-11-09
  Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
* GH-118093: Don't lose confidence when tracing through 100% biased branches (GH-124813) | Brandt Bucher | 2024-10-02
* GH-123516: Improve JIT memory consumption by invalidating cold executors (GH-124443) | Savannah Ostrowski | 2024-09-27
  Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
* gh-123923: Defer refcounting for `f_funcobj` in `_PyInterpreterFrame` (#124026) | Sam Gross | 2024-09-24
  Use a `_PyStackRef` and defer the reference to `f_funcobj` when possible. This avoids some reference count contention in the common case of executing the same code object from multiple threads concurrently in the free-threaded build.
* GH-118093: Specialize `CALL_KW` (GH-123006) | Mark Shannon | 2024-08-16
* GH-122390: Replace `_Py_GetbaseOpcode` with `_Py_GetBaseCodeUnit` (GH-122942) | Mark Shannon | 2024-08-13
* GH-118093: Handle some polymorphism before requiring progress in tier two (GH-122843) | Brandt Bucher | 2024-08-12
* GH-118095: Add tier two support for BINARY_SUBSCR_GETITEM (GH-120793) | Mark Shannon | 2024-08-01
* Replace PyObject_Del with PyObject_Free (#122453) | Victor Stinner | 2024-08-01
  PyObject_Del() is just an alias to PyObject_Free() kept for backward compatibility. Use PyObject_Free() directly instead.
* GH-118093: Improve handling of short and mid-loop traces (GH-122252) | Brandt Bucher | 2024-07-29
* GH-122294: Burn in the addresses of side exits (GH-122295) | Brandt Bucher | 2024-07-26
* GH-118093: Add tier two support for BINARY_OP_INPLACE_ADD_UNICODE (GH-122253) | Brandt Bucher | 2024-07-25
* GH-118093: Add tier two support for LOAD_ATTR_PROPERTY (GH-122283) | Brandt Bucher | 2024-07-25
* GH-118093: Remove invalidated executors from side exits (GH-121885) | Brandt Bucher | 2024-07-24
* GH-118093: Add tier two support to several instructions (GH-121884) | Brandt Bucher | 2024-07-18
* GH-116017: Get rid of _COLD_EXITs (GH-120960) | Brandt Bucher | 2024-07-01
* gh-117139: Convert the evaluation stack to stack refs (#118450) | Ken Jin | 2024-06-27
  This PR sets up tagged pointers for CPython.
  The general idea is to create a separate struct _PyStackRef for everything on the evaluation stack to store the bits. This forces the C compiler to warn us if we try to cast things or pull things out of the struct directly.
  Only for free threading: We tag the low bit if something is deferred - that means we skip incref and decref operations on it. This behavior may change in the future if Mark's plans to defer all objects in the interpreter loop pan out.
  This implies a strict stack reference discipline is required. ALL incref and decref operations on stackrefs must use the stackref variants. It is unsafe to untag something then do normal incref/decref ops on it.
  The new incref and decref variants are called dup and close. They mimic a "handle" API operating on these stackrefs.
  Please read Include/internal/pycore_stackref.h for more information!
  Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>
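  A toy model may help make the tagging discipline in the commit message above concrete. The sketch below is not CPython code: it uses plain integers as fake object addresses and a dictionary as a fake refcount table. Only the names `dup` and `close` and the "low bit means deferred" rule come from the commit message; `StackRef`, `from_object`, `is_deferred`, `as_address` and `refcounts` are hypothetical names chosen for illustration.

```python
TAG_DEFERRED = 0b1  # low bit marks a deferred reference, per the commit message

# Hypothetical refcount table: fake (even) address -> reference count.
refcounts = {}


class StackRef:
    """Toy stand-in for _PyStackRef: wraps the raw bits so callers
    cannot accidentally use them as an untagged address."""
    __slots__ = ("bits",)

    def __init__(self, bits):
        self.bits = bits


def from_object(addr, deferred=False):
    # Addresses must be even so the low bit is free to carry the tag.
    if addr % 2 != 0:
        raise ValueError("address must be aligned (low bit clear)")
    return StackRef(addr | TAG_DEFERRED if deferred else addr)


def is_deferred(ref):
    return bool(ref.bits & TAG_DEFERRED)


def as_address(ref):
    # Strip the tag bit to recover the underlying address.
    return ref.bits & ~TAG_DEFERRED


def dup(ref):
    # "incref" variant: skipped entirely for deferred references.
    if not is_deferred(ref):
        refcounts[as_address(ref)] += 1
    return StackRef(ref.bits)


def close(ref):
    # "decref" variant: skipped entirely for deferred references.
    if not is_deferred(ref):
        refcounts[as_address(ref)] -= 1
```

  In this model, untagging a deferred reference and then adjusting `refcounts` by hand would change a count that `dup` and `close` never touched, which is the kind of mismatch the "strict stack reference discipline" in the commit message is meant to rule out.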
* gh-120642: Move private PyCode APIs to the internal C API (#120643) | Victor Stinner | 2024-06-26
  * Move _Py_CODEUNIT and related functions to pycore_code.h.
  * Move _Py_BackoffCounter to pycore_backoff.h.
  * Move Include/cpython/optimizer.h content to pycore_optimizer.h.
  * Remove Include/cpython/optimizer.h.
  * Remove PyUnstable_Replace_Executor().
  Rename functions:
  * PyUnstable_GetExecutor() => _Py_GetExecutor()
  * PyUnstable_GetOptimizer() => _Py_GetOptimizer()
  * PyUnstable_SetOptimizer() => _Py_SetTier2Optimizer()
  * PyUnstable_Optimizer_NewCounter() => _PyOptimizer_NewCounter()
  * PyUnstable_Optimizer_NewUOpOptimizer() => _PyOptimizer_NewUOpOptimizer()
* GH-117062: Make _JUMP_TO_TOP a general absolute jump (GH-120854) | Brandt Bucher | 2024-06-24
* GH-120619: Clean up `RETURN_VALUE` instruction (GH-120624) | Mark Shannon | 2024-06-17
  * Rename _POP_FRAME to _RETURN_VALUE as it returns a value as well as popping a frame.
  * Remove remaining _POP_FRAMEs
* Fix typos in documentation and comments (#119763) | Xie Yanbo | 2024-06-04
* gh-111389: Add PyHASH_MULTIPLIER constant (#119214) | Victor Stinner | 2024-05-21
* gh-118771: Ensure names defined in optimizer.h start with Py/_Py (GH-118825) | Petr Viktorin | 2024-05-10
* GH-118095: Use broader specializations of CALL in tier 1, for better tier 2 support of calls. (GH-118322) | Mark Shannon | 2024-05-04
  * Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and CALL_NON_PY_GENERAL specializations.
  * Remove CALL_PY_WITH_DEFAULTS specialization
  * Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize
* GH-113464: Remove the extra jump via `_SIDE_EXIT` in `_EXIT_TRACE` (GH-118545) | Mark Shannon | 2024-05-04
* GH-118095: Unify the behavior of tier 2 FOR_ITER branch micro-ops (GH-118420) | Mark Shannon | 2024-05-02
  * Target _FOR_ITER_TIER_TWO at POP_TOP following the matching END_FOR
  * Modify _GUARD_NOT_EXHAUSTED_RANGE, _GUARD_NOT_EXHAUSTED_LIST and _GUARD_NOT_EXHAUSTED_TUPLE so that they also target the POP_TOP following the matching END_FOR
* GH-117442: Check eval-breaker at start (rather than end) of tier 2 loops (GH-118482) | Mark Shannon | 2024-05-02
* GH-115802: Use the GHC calling convention in JIT code (GH-118287) | Brandt Bucher | 2024-05-01
* gh-117958: Expose JIT code via method in UOpExecutor (#117959) | Anthony Shaw | 2024-05-01
* GH-118095: Make invalidating and clearing executors memory safe (GH-118459) | Mark Shannon | 2024-05-01
* gh-118335: Configure Tier 2 interpreter at build time (#118339) | Guido van Rossum | 2024-04-30
  The code for Tier 2 is now only compiled when configured with `--enable-experimental-jit[=yes|interpreter]`.
  We drop support for `PYTHON_UOPS` and `-Xuops`, but you can disable the interpreter or JIT at runtime by setting `PYTHON_JIT=0`. You can also build it without enabling it by default using `--enable-experimental-jit=yes-off`; enable with `PYTHON_JIT=1`.
  On Windows, the `build.bat` script supports `--experimental-jit`, `--experimental-jit-off`, `--experimental-interpreter`.
  In the C code, `_Py_JIT` is defined as before when the JIT is enabled; the new variable `_Py_TIER2` is defined when the JIT *or* the interpreter is enabled. It is actually a bitmask: 1: JIT; 2: default-off; 4: interpreter.
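  The bitmask described in the commit message above can be decoded as in the sketch below. The three bit values (1, 2, 4) are taken from the commit message; the constant names and the `describe_tier2` helper are hypothetical and exist only for this illustration. That a `yes-off` build corresponds to the value 3 is an assumption derived from combining the JIT and default-off bits, not something the message states.

```python
# Bit values as listed in the commit message for _Py_TIER2.
TIER2_JIT = 1          # JIT compiled in
TIER2_DEFAULT_OFF = 2  # built, but disabled unless enabled at runtime
TIER2_INTERPRETER = 4  # tier 2 interpreter compiled in


def describe_tier2(value):
    """Decode a _Py_TIER2-style bitmask into named booleans (hypothetical helper)."""
    return {
        "jit": bool(value & TIER2_JIT),
        "default_off": bool(value & TIER2_DEFAULT_OFF),
        "interpreter": bool(value & TIER2_INTERPRETER),
    }


# Assumed example: JIT bit plus default-off bit, as a yes-off build would combine them.
yes_off = describe_tier2(TIER2_JIT | TIER2_DEFAULT_OFF)
```

  Here `yes_off["jit"]` and `yes_off["default_off"]` are both true while `yes_off["interpreter"]` is false, matching a build where the JIT is present but stays off until `PYTHON_JIT=1` is set.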
* GH-118095: Add tier 2 support for YIELD_VALUE (GH-118380) | Mark Shannon | 2024-04-30
* GH-118095: Allow a variant of RESUME_CHECK in tier 2 (GH-118286) | Mark Shannon | 2024-04-29
* GH-118095: Add dynamic exit support and FOR_ITER_GEN support to tier 2 (GH-118279) | Mark Shannon | 2024-04-26
* GH-118095: Handle `RETURN_GENERATOR` in tier 2 (GH-118180) | Mark Shannon | 2024-04-25