diff options
author | Damien George <damien@micropython.org> | 2021-10-22 22:22:47 +1100 |
---|---|---|
committer | Damien George <damien@micropython.org> | 2022-02-24 18:08:43 +1100 |
commit | f2040bfc7ee033e48acef9f289790f3b4e6b74e5 (patch) | |
tree | 53402caf1b0e7321bf772278a94f5a87a9e7bf0d /py/emitnative.c | |
parent | 64bfaae7ab33e628f28ca3b53b10893fb047b48e (diff) | |
download | micropython-f2040bfc7ee033e48acef9f289790f3b4e6b74e5.tar.gz micropython-f2040bfc7ee033e48acef9f289790f3b4e6b74e5.zip |
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
qst = DECODE_QSTR_VALUE;
is now (schematically):
idx = DECODE_QSTR_INDEX;
qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
Diffstat (limited to 'py/emitnative.c')
-rw-r--r-- | py/emitnative.c | 265 |
1 files changed, 128 insertions, 137 deletions
diff --git a/py/emitnative.c b/py/emitnative.c index 6504f37765..ca34e89f64 100644 --- a/py/emitnative.c +++ b/py/emitnative.c @@ -48,6 +48,7 @@ #include "py/emit.h" #include "py/nativeglue.h" +#include "py/objfun.h" #include "py/objstr.h" #if MICROPY_DEBUG_VERBOSE // print debugging info @@ -92,9 +93,13 @@ #define OFFSETOF_CODE_STATE_FUN_BC (offsetof(mp_code_state_t, fun_bc) / sizeof(uintptr_t)) #define OFFSETOF_CODE_STATE_IP (offsetof(mp_code_state_t, ip) / sizeof(uintptr_t)) #define OFFSETOF_CODE_STATE_SP (offsetof(mp_code_state_t, sp) / sizeof(uintptr_t)) -#define OFFSETOF_OBJ_FUN_BC_GLOBALS (offsetof(mp_obj_fun_bc_t, globals) / sizeof(uintptr_t)) +#define OFFSETOF_CODE_STATE_N_STATE (offsetof(mp_code_state_t, n_state) / sizeof(uintptr_t)) +#define OFFSETOF_OBJ_FUN_BC_CONTEXT (offsetof(mp_obj_fun_bc_t, context) / sizeof(uintptr_t)) +#define OFFSETOF_OBJ_FUN_BC_CHILD_TABLE (offsetof(mp_obj_fun_bc_t, child_table) / sizeof(uintptr_t)) #define OFFSETOF_OBJ_FUN_BC_BYTECODE (offsetof(mp_obj_fun_bc_t, bytecode) / sizeof(uintptr_t)) -#define OFFSETOF_OBJ_FUN_BC_CONST_TABLE (offsetof(mp_obj_fun_bc_t, const_table) / sizeof(uintptr_t)) +#define OFFSETOF_MODULE_CONTEXT_OBJ_TABLE (offsetof(mp_module_context_t, constants.obj_table) / sizeof(uintptr_t)) +#define OFFSETOF_MODULE_CONTEXT_GLOBALS (offsetof(mp_module_context_t, module.globals) / sizeof(uintptr_t)) +#define INDEX_OF_MP_FUN_TABLE_IN_CONST_TABLE (0) // If not already defined, set parent args to same as child call registers #ifndef REG_PARENT_RET @@ -205,6 +210,7 @@ typedef struct _exc_stack_entry_t { } exc_stack_entry_t; struct _emit_t { + mp_emit_common_t *emit_common; mp_obj_t *error_slot; uint *label_slot; uint exit_label; @@ -225,18 +231,17 @@ struct _emit_t { exc_stack_entry_t *exc_stack; int prelude_offset; + #if N_PRELUDE_AS_BYTES_OBJ + size_t prelude_const_table_offset; + #endif int start_offset; int n_state; uint16_t code_state_start; uint16_t stack_start; int stack_size; + uint16_t n_info; uint16_t n_cell; - uint16_t const_table_cur_obj; - uint16_t const_table_num_obj; - uint16_t const_table_cur_raw_code; - mp_uint_t *const_table; - #if MICROPY_PERSISTENT_CODE_SAVE uint16_t qstr_link_cur; mp_qstr_link_entry_t *qstr_link; @@ -255,8 +260,9 @@ STATIC void emit_native_global_exc_entry(emit_t *emit); STATIC void emit_native_global_exc_exit(emit_t *emit); STATIC void emit_native_load_const_obj(emit_t *emit, mp_obj_t obj); -emit_t *EXPORT_FUN(new)(mp_obj_t * error_slot, uint *label_slot, mp_uint_t max_num_labels) { +emit_t *EXPORT_FUN(new)(mp_emit_common_t * emit_common, mp_obj_t *error_slot, uint *label_slot, mp_uint_t max_num_labels) { emit_t *emit = m_new0(emit_t, 1); + emit->emit_common = emit_common; emit->error_slot = error_slot; emit->label_slot = label_slot; emit->stack_info_alloc = 8; @@ -340,30 +346,22 @@ STATIC void emit_native_mov_reg_qstr_obj(emit_t *emit, int reg_dest, qstr qst) { emit_native_mov_state_reg((emit), (local_num), (reg_temp)); \ } while (false) -#define emit_native_mov_state_imm_fix_u16_via(emit, local_num, imm, reg_temp) \ - do { \ - ASM_MOV_REG_IMM_FIX_U16((emit)->as, (reg_temp), (imm)); \ - emit_native_mov_state_reg((emit), (local_num), (reg_temp)); \ - } while (false) - -#define emit_native_mov_state_imm_fix_word_via(emit, local_num, imm, reg_temp) \ - do { \ - ASM_MOV_REG_IMM_FIX_WORD((emit)->as, (reg_temp), (imm)); \ - emit_native_mov_state_reg((emit), (local_num), (reg_temp)); \ - } while (false) - STATIC void emit_native_start_pass(emit_t *emit, pass_kind_t pass, scope_t *scope) { DEBUG_printf("start_pass(pass=%u, scope=%p)\n", pass, scope); + if (pass == MP_PASS_SCOPE) { + // Note: the first argument passed here is mp_emit_common_t, not the native emitter context + #if N_PRELUDE_AS_BYTES_OBJ + if (scope->emit_options == MP_EMIT_OPT_NATIVE_PYTHON) { + mp_emit_common_alloc_const_obj((mp_emit_common_t *)emit, mp_const_none); + } + #endif + return; + } + emit->pass = pass; emit->do_viper_types = scope->emit_options == MP_EMIT_OPT_VIPER; emit->stack_size = 0; - #if N_PRELUDE_AS_BYTES_OBJ - emit->const_table_cur_obj = emit->do_viper_types ? 0 : 1; // reserve first obj for prelude bytes obj - #else - emit->const_table_cur_obj = 0; - #endif - emit->const_table_cur_raw_code = 0; #if MICROPY_PERSISTENT_CODE_SAVE emit->qstr_link_cur = 0; #endif @@ -455,8 +453,9 @@ STATIC void emit_native_start_pass(emit_t *emit, pass_kind_t pass, scope_t *scop #endif // Load REG_FUN_TABLE with a pointer to mp_fun_table, found in the const_table - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_3, REG_PARENT_ARG_1, OFFSETOF_OBJ_FUN_BC_CONST_TABLE); - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_LOCAL_3, 0); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_PARENT_ARG_1, OFFSETOF_OBJ_FUN_BC_CONTEXT); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_FUN_TABLE, OFFSETOF_MODULE_CONTEXT_OBJ_TABLE); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_FUN_TABLE, INDEX_OF_MP_FUN_TABLE_IN_CONST_TABLE); // Store function object (passed as first arg) to stack if needed if (NEED_FUN_OBJ(emit)) { @@ -514,7 +513,7 @@ STATIC void emit_native_start_pass(emit_t *emit, pass_kind_t pass, scope_t *scop emit->stack_start = SIZEOF_CODE_STATE; #if N_PRELUDE_AS_BYTES_OBJ // Load index of prelude bytes object in const_table - mp_asm_base_data(&emit->as->base, ASM_WORD_SIZE, (uintptr_t)(emit->scope->num_pos_args + emit->scope->num_kwonly_args + 1)); + mp_asm_base_data(&emit->as->base, ASM_WORD_SIZE, (uintptr_t)emit->prelude_const_table_offset); #else mp_asm_base_data(&emit->as->base, ASM_WORD_SIZE, (uintptr_t)emit->prelude_offset); #endif @@ -536,8 +535,9 @@ STATIC void emit_native_start_pass(emit_t *emit, pass_kind_t pass, scope_t *scop // Load REG_FUN_TABLE with a pointer to mp_fun_table, found in the const_table ASM_LOAD_REG_REG_OFFSET(emit->as, REG_TEMP0, REG_GENERATOR_STATE, LOCAL_IDX_FUN_OBJ(emit)); - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_TEMP0, REG_TEMP0, OFFSETOF_OBJ_FUN_BC_CONST_TABLE); - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_TEMP0, emit->scope->num_pos_args + emit->scope->num_kwonly_args); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_TEMP0, REG_TEMP0, OFFSETOF_OBJ_FUN_BC_CONTEXT); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_TEMP0, REG_TEMP0, OFFSETOF_MODULE_CONTEXT_OBJ_TABLE); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_TEMP0, INDEX_OF_MP_FUN_TABLE_IN_CONST_TABLE); } else { // The locals and stack start after the code_state structure emit->stack_start = emit->code_state_start + SIZEOF_CODE_STATE; @@ -555,22 +555,33 @@ STATIC void emit_native_start_pass(emit_t *emit, pass_kind_t pass, scope_t *scop #endif // Load REG_FUN_TABLE with a pointer to mp_fun_table, found in the const_table - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_3, REG_PARENT_ARG_1, OFFSETOF_OBJ_FUN_BC_CONST_TABLE); - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_LOCAL_3, emit->scope->num_pos_args + emit->scope->num_kwonly_args); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_PARENT_ARG_1, OFFSETOF_OBJ_FUN_BC_CONTEXT); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_FUN_TABLE, OFFSETOF_MODULE_CONTEXT_OBJ_TABLE); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_FUN_TABLE, INDEX_OF_MP_FUN_TABLE_IN_CONST_TABLE); // Set code_state.fun_bc ASM_MOV_LOCAL_REG(emit->as, LOCAL_IDX_FUN_OBJ(emit), REG_PARENT_ARG_1); - // Set code_state.ip (offset from start of this function to prelude info) + // Set code_state.ip, a pointer to the beginning of the prelude + // Need to use some locals for this, so assert that they are available for use + MP_STATIC_ASSERT(REG_LOCAL_3 != REG_PARENT_ARG_1); + MP_STATIC_ASSERT(REG_LOCAL_3 != REG_PARENT_ARG_2); + MP_STATIC_ASSERT(REG_LOCAL_3 != REG_PARENT_ARG_3); + MP_STATIC_ASSERT(REG_LOCAL_3 != REG_PARENT_ARG_4); int code_state_ip_local = emit->code_state_start + OFFSETOF_CODE_STATE_IP; #if N_PRELUDE_AS_BYTES_OBJ - // Prelude is a bytes object in const_table; store ip = prelude->data - fun_bc->bytecode - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_3, REG_LOCAL_3, emit->scope->num_pos_args + emit->scope->num_kwonly_args + 1); + // Prelude is a bytes object in const_table[prelude_const_table_offset]. + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_3, REG_PARENT_ARG_1, OFFSETOF_OBJ_FUN_BC_CONTEXT); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_3, REG_LOCAL_3, OFFSETOF_MODULE_CONTEXT_OBJ_TABLE); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_3, REG_LOCAL_3, emit->prelude_const_table_offset); ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_3, REG_LOCAL_3, offsetof(mp_obj_str_t, data) / sizeof(uintptr_t)); - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_PARENT_ARG_1, REG_PARENT_ARG_1, OFFSETOF_OBJ_FUN_BC_BYTECODE); - ASM_SUB_REG_REG(emit->as, REG_LOCAL_3, REG_PARENT_ARG_1); - emit_native_mov_state_reg(emit, code_state_ip_local, REG_LOCAL_3); #else + MP_STATIC_ASSERT(REG_LOCAL_2 != REG_PARENT_ARG_1); + MP_STATIC_ASSERT(REG_LOCAL_2 != REG_PARENT_ARG_2); + MP_STATIC_ASSERT(REG_LOCAL_2 != REG_PARENT_ARG_3); + MP_STATIC_ASSERT(REG_LOCAL_2 != REG_PARENT_ARG_4); + // Prelude is at the end of the machine code + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_3, REG_PARENT_ARG_1, OFFSETOF_OBJ_FUN_BC_BYTECODE); if (emit->pass == MP_PASS_CODE_SIZE) { // Commit to the encoding size based on the value of prelude_offset in this pass. // By using 32768 as the cut-off it is highly unlikely that prelude_offset will @@ -579,14 +590,16 @@ STATIC void emit_native_start_pass(emit_t *emit, pass_kind_t pass, scope_t *scop } if (emit->prelude_offset_uses_u16_encoding) { assert(emit->prelude_offset <= 65535); - emit_native_mov_state_imm_fix_u16_via(emit, code_state_ip_local, emit->prelude_offset, REG_PARENT_ARG_1); + ASM_MOV_REG_IMM_FIX_U16((emit)->as, REG_LOCAL_2, emit->prelude_offset); } else { - emit_native_mov_state_imm_fix_word_via(emit, code_state_ip_local, emit->prelude_offset, REG_PARENT_ARG_1); + ASM_MOV_REG_IMM_FIX_WORD((emit)->as, REG_LOCAL_2, emit->prelude_offset); } + ASM_ADD_REG_REG(emit->as, REG_LOCAL_3, REG_LOCAL_2); #endif + emit_native_mov_state_reg(emit, code_state_ip_local, REG_LOCAL_3); // Set code_state.n_state (only works on little endian targets due to n_state being uint16_t) - emit_native_mov_state_imm_via(emit, emit->code_state_start + offsetof(mp_code_state_t, n_state) / sizeof(uintptr_t), emit->n_state, REG_ARG_1); + emit_native_mov_state_imm_via(emit, emit->code_state_start + OFFSETOF_CODE_STATE_N_STATE, emit->n_state, REG_ARG_1); // Put address of code_state into first arg ASM_MOV_REG_LOCAL_ADDR(emit->as, REG_ARG_1, emit->code_state_start); @@ -628,30 +641,17 @@ STATIC void emit_native_start_pass(emit_t *emit, pass_kind_t pass, scope_t *scop emit->local_vtype[id->local_num] = VTYPE_PYOBJ; } } - - if (pass == MP_PASS_EMIT) { - // write argument names as qstr objects - // see comment in corresponding part of emitbc.c about the logic here - for (int i = 0; i < scope->num_pos_args + scope->num_kwonly_args; i++) { - qstr qst = MP_QSTR__star_; - for (int j = 0; j < scope->id_info_len; ++j) { - id_info_t *id = &scope->id_info[j]; - if ((id->flags & ID_FLAG_IS_PARAM) && id->local_num == i) { - qst = id->qst; - break; - } - } - emit->const_table[i] = (mp_uint_t)MP_OBJ_NEW_QSTR(qst); - } - } } - } static inline void emit_native_write_code_info_byte(emit_t *emit, byte val) { mp_asm_base_data(&emit->as->base, 1, val); } +static inline void emit_native_write_code_info_qstr(emit_t *emit, qstr qst) { + mp_encode_uint(&emit->as->base, mp_asm_base_get_cur_to_write_bytes, mp_emit_common_use_qstr(emit->emit_common, qst)); +} + STATIC void emit_native_end_pass(emit_t *emit) { emit_native_global_exc_exit(emit); @@ -662,21 +662,25 @@ STATIC void emit_native_end_pass(emit_t *emit) { size_t n_exc_stack = 0; // exc-stack not needed for native code MP_BC_PRELUDE_SIG_ENCODE(n_state, n_exc_stack, emit->scope, emit_native_write_code_info_byte, emit); - #if MICROPY_PERSISTENT_CODE - size_t n_info = 4; - #else - size_t n_info = 1; - #endif - MP_BC_PRELUDE_SIZE_ENCODE(n_info, emit->n_cell, emit_native_write_code_info_byte, emit); - - #if MICROPY_PERSISTENT_CODE - mp_asm_base_data(&emit->as->base, 1, emit->scope->simple_name); - mp_asm_base_data(&emit->as->base, 1, emit->scope->simple_name >> 8); - mp_asm_base_data(&emit->as->base, 1, emit->scope->source_file); - mp_asm_base_data(&emit->as->base, 1, emit->scope->source_file >> 8); - #else - mp_asm_base_data(&emit->as->base, 1, 1); - #endif + size_t n_info = emit->n_info; + size_t n_cell = emit->n_cell; + MP_BC_PRELUDE_SIZE_ENCODE(n_info, n_cell, emit_native_write_code_info_byte, emit); + + // bytecode prelude: source info (function and argument qstrs) + size_t info_start = mp_asm_base_get_code_pos(&emit->as->base); + emit_native_write_code_info_qstr(emit, emit->scope->simple_name); + for (int i = 0; i < emit->scope->num_pos_args + emit->scope->num_kwonly_args; i++) { + qstr qst = MP_QSTR__star_; + for (int j = 0; j < emit->scope->id_info_len; ++j) { + id_info_t *id = &emit->scope->id_info[j]; + if ((id->flags & ID_FLAG_IS_PARAM) && id->local_num == i) { + qst = id->qst; + break; + } + } + emit_native_write_code_info_qstr(emit, qst); + } + emit->n_info = mp_asm_base_get_code_pos(&emit->as->base) - info_start; // bytecode prelude: initialise closed over variables size_t cell_start = mp_asm_base_get_code_pos(&emit->as->base); @@ -690,13 +694,14 @@ STATIC void emit_native_end_pass(emit_t *emit) { emit->n_cell = mp_asm_base_get_code_pos(&emit->as->base) - cell_start; #if N_PRELUDE_AS_BYTES_OBJ - // Prelude bytes object is after qstr arg names and mp_fun_table - size_t table_off = emit->scope->num_pos_args + emit->scope->num_kwonly_args + 1; + // Create the prelude as a bytes object, and store it in the constant table + mp_obj_t prelude = mp_const_none; if (emit->pass == MP_PASS_EMIT) { void *buf = emit->as->base.code_base + emit->prelude_offset; size_t n = emit->as->base.code_offset - emit->prelude_offset; - emit->const_table[table_off] = (uintptr_t)mp_obj_new_bytes(buf, n); + prelude = mp_obj_new_bytes(buf, n); } + emit->prelude_const_table_offset = mp_emit_common_alloc_const_obj(emit->emit_common, prelude); #endif } @@ -706,31 +711,15 @@ STATIC void emit_native_end_pass(emit_t *emit) { assert(emit->stack_size == 0); assert(emit->exc_stack_size == 0); - // Deal with const table accounting - assert(emit->pass <= MP_PASS_STACK_SIZE || (emit->const_table_num_obj == emit->const_table_cur_obj)); - emit->const_table_num_obj = emit->const_table_cur_obj; + #if MICROPY_PERSISTENT_CODE_SAVE + // Allocate qstr_link table if needed if (emit->pass == MP_PASS_CODE_SIZE) { - size_t const_table_alloc = 1 + emit->const_table_num_obj + emit->const_table_cur_raw_code; - size_t nqstr = 0; - if (!emit->do_viper_types) { - // Add room for qstr names of arguments - nqstr = emit->scope->num_pos_args + emit->scope->num_kwonly_args; - const_table_alloc += nqstr; - } - emit->const_table = m_new(mp_uint_t, const_table_alloc); - #if !MICROPY_DYNAMIC_COMPILER - // Store mp_fun_table pointer just after qstrs - // (but in dynamic-compiler mode eliminate dependency on mp_fun_table) - emit->const_table[nqstr] = (mp_uint_t)(uintptr_t)&mp_fun_table; - #endif - - #if MICROPY_PERSISTENT_CODE_SAVE size_t qstr_link_alloc = emit->qstr_link_cur; if (qstr_link_alloc > 0) { emit->qstr_link = m_new(mp_qstr_link_entry_t, qstr_link_alloc); } - #endif } + #endif if (emit->pass == MP_PASS_EMIT) { void *f = mp_asm_base_get_code(&emit->as->base); @@ -738,13 +727,14 @@ STATIC void emit_native_end_pass(emit_t *emit) { mp_emit_glue_assign_native(emit->scope->raw_code, emit->do_viper_types ? MP_CODE_NATIVE_VIPER : MP_CODE_NATIVE_PY, - f, f_len, emit->const_table, + f, f_len, + emit->emit_common->children, #if MICROPY_PERSISTENT_CODE_SAVE + emit->emit_common->ct_cur_child, emit->prelude_offset, - emit->const_table_cur_obj, emit->const_table_cur_raw_code, emit->qstr_link_cur, emit->qstr_link, #endif - emit->scope->num_pos_args, emit->scope->scope_flags, 0); + emit->scope->scope_flags, 0, 0); } } @@ -1137,29 +1127,19 @@ STATIC exc_stack_entry_t *emit_native_pop_exc_stack(emit_t *emit) { return e; } -STATIC void emit_load_reg_with_ptr(emit_t *emit, int reg, mp_uint_t ptr, size_t table_off) { - if (!emit->do_viper_types) { - // Skip qstr names of arguments - table_off += emit->scope->num_pos_args + emit->scope->num_kwonly_args; - } - if (emit->pass == MP_PASS_EMIT) { - emit->const_table[table_off] = ptr; - } +STATIC void emit_load_reg_with_object(emit_t *emit, int reg, mp_obj_t obj) { + size_t table_off = mp_emit_common_alloc_const_obj(emit->emit_common, obj); emit_native_mov_reg_state(emit, REG_TEMP0, LOCAL_IDX_FUN_OBJ(emit)); - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_TEMP0, REG_TEMP0, OFFSETOF_OBJ_FUN_BC_CONST_TABLE); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_TEMP0, REG_TEMP0, OFFSETOF_OBJ_FUN_BC_CONTEXT); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_TEMP0, REG_TEMP0, OFFSETOF_MODULE_CONTEXT_OBJ_TABLE); ASM_LOAD_REG_REG_OFFSET(emit->as, reg, REG_TEMP0, table_off); } -STATIC void emit_load_reg_with_object(emit_t *emit, int reg, mp_obj_t obj) { - // First entry is for mp_fun_table - size_t table_off = 1 + emit->const_table_cur_obj++; - emit_load_reg_with_ptr(emit, reg, (mp_uint_t)obj, table_off); -} - -STATIC void emit_load_reg_with_raw_code(emit_t *emit, int reg, mp_raw_code_t *rc) { - // First entry is for mp_fun_table, then constant objects - size_t table_off = 1 + emit->const_table_num_obj + emit->const_table_cur_raw_code++; - emit_load_reg_with_ptr(emit, reg, (mp_uint_t)rc, table_off); +STATIC void emit_load_reg_with_child(emit_t *emit, int reg, mp_raw_code_t *rc) { + size_t table_off = mp_emit_common_alloc_const_child(emit->emit_common, rc); + emit_native_mov_reg_state(emit, REG_TEMP0, LOCAL_IDX_FUN_OBJ(emit)); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_TEMP0, REG_TEMP0, OFFSETOF_OBJ_FUN_BC_CHILD_TABLE); + ASM_LOAD_REG_REG_OFFSET(emit->as, reg, REG_TEMP0, table_off); } STATIC void emit_native_label_assign(emit_t *emit, mp_uint_t l) { @@ -1203,7 +1183,8 @@ STATIC void emit_native_global_exc_entry(emit_t *emit) { if (!(emit->scope->scope_flags & MP_SCOPE_FLAG_GENERATOR)) { // Set new globals emit_native_mov_reg_state(emit, REG_ARG_1, LOCAL_IDX_FUN_OBJ(emit)); - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_ARG_1, REG_ARG_1, OFFSETOF_OBJ_FUN_BC_GLOBALS); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_ARG_1, REG_ARG_1, OFFSETOF_OBJ_FUN_BC_CONTEXT); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_ARG_1, REG_ARG_1, OFFSETOF_MODULE_CONTEXT_GLOBALS); emit_call(emit, MP_F_NATIVE_SWAP_GLOBALS); // Save old globals (or NULL if globals didn't change) @@ -1254,7 +1235,8 @@ STATIC void emit_native_global_exc_entry(emit_t *emit) { #if N_NLR_SETJMP // Reload REG_FUN_TABLE, since it may be clobbered by longjmp emit_native_mov_reg_state(emit, REG_LOCAL_1, LOCAL_IDX_FUN_OBJ(emit)); - ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_1, REG_LOCAL_1, offsetof(mp_obj_fun_bc_t, const_table) / sizeof(uintptr_t)); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_1, REG_LOCAL_1, OFFSETOF_OBJ_FUN_BC_CONTEXT); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_LOCAL_1, REG_LOCAL_1, OFFSETOF_MODULE_CONTEXT_OBJ_TABLE); ASM_LOAD_REG_REG_OFFSET(emit->as, REG_FUN_TABLE, REG_LOCAL_1, emit->scope->num_pos_args + emit->scope->num_kwonly_args); #endif ASM_MOV_REG_LOCAL(emit->as, REG_LOCAL_1, LOCAL_IDX_EXC_HANDLER_PC(emit)); @@ -1385,11 +1367,7 @@ STATIC void emit_native_import(emit_t *emit, qstr qst, int kind) { STATIC void emit_native_load_const_tok(emit_t *emit, mp_token_kind_t tok) { DEBUG_printf("load_const_tok(tok=%u)\n", tok); if (tok == MP_TOKEN_ELLIPSIS) { - #if MICROPY_PERSISTENT_CODE_SAVE emit_native_load_const_obj(emit, MP_OBJ_FROM_PTR(&mp_const_ellipsis_obj)); - #else - emit_post_push_imm(emit, VTYPE_PYOBJ, (mp_uint_t)MP_OBJ_FROM_PTR(&mp_const_ellipsis_obj)); - #endif } else { emit_native_pre(emit); if (tok == MP_TOKEN_KW_NONE) { @@ -2682,33 +2660,46 @@ STATIC void emit_native_unpack_ex(emit_t *emit, mp_uint_t n_left, mp_uint_t n_ri STATIC void emit_native_make_function(emit_t *emit, scope_t *scope, mp_uint_t n_pos_defaults, mp_uint_t n_kw_defaults) { // call runtime, with type info for args, or don't support dict/default params, or only support Python objects for them emit_native_pre(emit); + emit_native_mov_reg_state(emit, REG_ARG_2, LOCAL_IDX_FUN_OBJ(emit)); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_ARG_2, REG_ARG_2, OFFSETOF_OBJ_FUN_BC_CONTEXT); if (n_pos_defaults == 0 && n_kw_defaults == 0) { need_reg_all(emit); - ASM_MOV_REG_IMM(emit->as, REG_ARG_2, (mp_uint_t)MP_OBJ_NULL); - ASM_MOV_REG_IMM(emit->as, REG_ARG_3, (mp_uint_t)MP_OBJ_NULL); + ASM_MOV_REG_IMM(emit->as, REG_ARG_3, 0); } else { - vtype_kind_t vtype_def_tuple, vtype_def_dict; - emit_pre_pop_reg_reg(emit, &vtype_def_dict, REG_ARG_3, &vtype_def_tuple, REG_ARG_2); - assert(vtype_def_tuple == VTYPE_PYOBJ); - assert(vtype_def_dict == VTYPE_PYOBJ); + emit_get_stack_pointer_to_reg_for_pop(emit, REG_ARG_3, 2); need_reg_all(emit); } - emit_load_reg_with_raw_code(emit, REG_ARG_1, scope->raw_code); + emit_load_reg_with_child(emit, REG_ARG_1, scope->raw_code); ASM_CALL_IND(emit->as, MP_F_MAKE_FUNCTION_FROM_RAW_CODE); emit_post_push_reg(emit, VTYPE_PYOBJ, REG_RET); } STATIC void emit_native_make_closure(emit_t *emit, scope_t *scope, mp_uint_t n_closed_over, mp_uint_t n_pos_defaults, mp_uint_t n_kw_defaults) { + // make function emit_native_pre(emit); + emit_native_mov_reg_state(emit, REG_ARG_2, LOCAL_IDX_FUN_OBJ(emit)); + ASM_LOAD_REG_REG_OFFSET(emit->as, REG_ARG_2, REG_ARG_2, OFFSETOF_OBJ_FUN_BC_CONTEXT); if (n_pos_defaults == 0 && n_kw_defaults == 0) { - emit_get_stack_pointer_to_reg_for_pop(emit, REG_ARG_3, n_closed_over); - ASM_MOV_REG_IMM(emit->as, REG_ARG_2, n_closed_over); + need_reg_all(emit); + ASM_MOV_REG_IMM(emit->as, REG_ARG_3, 0); } else { - emit_get_stack_pointer_to_reg_for_pop(emit, REG_ARG_3, n_closed_over + 2); - ASM_MOV_REG_IMM(emit->as, REG_ARG_2, 0x100 | n_closed_over); + emit_get_stack_pointer_to_reg_for_pop(emit, REG_ARG_3, 2 + n_closed_over); + adjust_stack(emit, 2 + n_closed_over); + need_reg_all(emit); + } + emit_load_reg_with_child(emit, REG_ARG_1, scope->raw_code); + ASM_CALL_IND(emit->as, MP_F_MAKE_FUNCTION_FROM_RAW_CODE); + + // make closure + #if REG_ARG_1 != REG_RET + ASM_MOV_REG_REG(emit->as, REG_ARG_1, REG_RET); + #endif + ASM_MOV_REG_IMM(emit->as, REG_ARG_2, n_closed_over); + emit_get_stack_pointer_to_reg_for_pop(emit, REG_ARG_3, n_closed_over); + if (n_pos_defaults != 0 || n_kw_defaults != 0) { + adjust_stack(emit, -2); } - emit_load_reg_with_raw_code(emit, REG_ARG_1, scope->raw_code); - ASM_CALL_IND(emit->as, MP_F_MAKE_CLOSURE_FROM_RAW_CODE); + ASM_CALL_IND(emit->as, MP_F_NEW_CLOSURE); emit_post_push_reg(emit, VTYPE_PYOBJ, REG_RET); } |