path: root/py/parse.c
Commit message | Author | Age
* py/parse: Remove old esp32 compiler workaround. | Alessandro Gatti | 2024-09-27
  The ESP32 port contains a workaround that prevents a certain function in `py/parse.c` from being generated incorrectly. The compiler in question is no longer part of any currently supported version of ESP-IDF, and the problem inside the compiler (well, the assembler in this case) was corrected a few years ago.

  This commit removes all traces of that workaround from the source tree.

  Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
* py/lexer: Support raw f-strings. | Damien George | 2024-06-06
  Support for raw str/bytes already exists, and extending that to raw f-strings is easy. It also reduces code size because it eliminates an error message.

  Signed-off-by: Damien George <damien@micropython.org>
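As a small illustration of what a raw f-string means at the language level (this is standard Python behaviour, runnable on CPython 3.6+, and not a view into the lexer change itself): replacement fields are still interpolated, while backslashes are kept literally. The variable names are only for the example.

```python
# Raw f-string: "{name}" is interpolated, but "\n" stays as two characters.
name = "parse"
raw_text = rf"{name}\n"
plain_text = f"{name}\n"

print(len(raw_text))    # backslash and "n" are separate characters
print(len(plain_text))  # "\n" is a single newline character

# The "rf" and "fr" prefixes are equivalent.
same_prefix = rf"{name}\n" == fr"{name}\n"
```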
* py/parse: Zero out dangling parse tree pointer to fix potential GC leak. | Angus Gratton | 2024-03-22
  This fixes a bug where a random Python object may become un-garbage-collectable until an enclosing Python file (compiled on device) finishes executing.

  Details:

  The mp_parse_tree_t structure is stored on the stack in top-level functions such as parse_compile_execute() in pyexec.c (and others).

  Although it quickly falls out of scope in these functions, it is usually still in the current stack frame when the compiled code executes. (Compiler dependent, but usually it's one stack push per function.)

  This means if any Python object happens to allocate at the same address as the (freed) root parse tree chunk, it's un-garbage-collectable as there's a (dangling) pointer up the stack referencing this same address.

  As reported by @GitHubsSilverBullet here: https://github.com/orgs/micropython/discussions/14116#discussioncomment-8837214

  This work was funded through GitHub Sponsors.

  Signed-off-by: Angus Gratton <angus@redyak.com.au>
* all: Remove the "STATIC" macro and just use "static" instead. | Angus Gratton | 2024-03-07
  The STATIC macro was introduced a very long time ago in commit d5df6cd44a433d6253a61cb0f987835fbc06b2de. The original reason for this was to have the option to define it to nothing so that all static functions become global functions and therefore visible to certain debug tools, so one could do function size comparison and other things.

  This STATIC feature is rarely (if ever) used. And with the use of LTO and heavy inline optimisation, analysing the size of individual functions when they are not static is not a good representation of the size of code when fully optimised.

  So the macro does not have much use and it's simpler to just remove it. Then you know exactly what it's doing. For example, newcomers don't have to learn what the STATIC macro is and why it exists. Reading the code is also less "loud" with a lowercase static.

  One other minor point in favour of removing it is that it stops bugs with `STATIC inline`, which should always be `static inline`.

  Methodology for this commit was:

  1) git ls-files | egrep '\.[ch]$' | \
     xargs sed -Ei "s/(^| )STATIC($| )/\1static\2/"

  2) Do some manual cleanup in the diff by searching for the word STATIC in comments and changing those back.

  3) "git-grep STATIC docs/", manually fixed those cases.

  4) "rg -t python STATIC", manually fixed codegen lines that used STATIC.

  This work was funded through GitHub Sponsors.

  Signed-off-by: Angus Gratton <angus@redyak.com.au>
* py/parse: Always free lexer even if an exception is raised. | Damien George | 2023-09-14
  Fixes issue #3843.

  Signed-off-by: Damien George <damien@micropython.org>
* all: Rename UMODULE to MODULE in preprocessor/Makefile vars. | Jim Mussared | 2023-06-08
  This work was funded through GitHub Sponsors.

  Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* all: Rename mp_umodule*, mp_module_umodule* to remove the "u" prefix. | Jim Mussared | 2023-06-08
  This work was funded through GitHub Sponsors.

  Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/parse: Fix build when COMP_CONST_FOLDING=0 and COMP_MODULE_CONST=1. | Damien George | 2023-05-03
  Signed-off-by: Damien George <damien@micropython.org>
* all: Fix spelling mistakes based on codespell check. | Damien George | 2023-04-27
  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Allow const types other than int to optimise as true/false. | Angus Gratton | 2022-09-23
  Allows optimisation of cases like:

      import micropython
      _DEBUG = micropython.const(False)
      if _DEBUG:
          print('Debugging info')

  Previously the 'if' statement was only optimised out if the type of the const() argument was integer.

  The change is implemented in a way that makes the compiler slightly smaller (-16 bytes on PYBV11) but compilation will also be very slightly slower.

  As a bonus, if const support is enabled then the compiler can now optimise const truthy/falsey expressions of other types, like:

      while "something":
          pass

  ... unclear if that is useful, but perhaps it could be.

  Signed-off-by: Angus Gratton <angus@redyak.com.au>
* py/parsenum: Optimise when building with complex disabled. | Damien George | 2022-06-23
  To reduce code size when MICROPY_PY_BUILTINS_COMPLEX is disabled.

  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Work around xtensa esp-2020r3 compiler bug. | Damien George | 2022-06-09
  This commit works around a bug in xtensa-esp32-elf-gcc version esp-2020r3.

  The bug is in generation of loop constructs. The below code is generated by the xtensa-esp32 compiler. The first extract is the buggy machine code and the second extract is the corrected machine code. The test `basics/logic_constfolding.py` fails with the first code and succeeds with the second.

  Disassembly of section .text.push_result_rule:

  00000000 <push_result_rule>:
    ...
    d6:   209770      or   a9, a7, a7
    d9:   178976      loop a9, f4 <push_result_rule+0xf4>
              d9: R_XTENSA_SLOT0_OP .text.push_result_rule+0xf4
    dc:   030190      rsr.lend   a9
    df:   130090      wsr.lbeg   a9
    e2:   a8c992      addi   a9, a9, -88
    e5:   06d992      addmi  a9, a9, 0x600
    e8:   130190      wsr.lend   a9
    eb:   002000      isync
    ee:   030290      rsr.lcount a9
    f1:   01c992      addi   a9, a9, 1
    f4:   1494e7      bne    a4, a14, 10c <push_result_rule+0x10c>
              f4: R_XTENSA_SLOT0_OP .text.push_result_rule+0x10c

  Disassembly of section .text.push_result_rule:

  00000000 <push_result_rule>:
    ...
    d6:   209770      or   a9, a7, a7
    d9:   178976      loop a9, f4 <push_result_rule+0xf4>
              d9: R_XTENSA_SLOT0_OP .text.push_result_rule+0xf4
    dc:   030190      rsr.lend   a9
    df:   130090      wsr.lbeg   a9
    e2:   000091      l32r   a9, fffc00e4 <push_result_rule+0xfffc00e4>
              e2: R_XTENSA_SLOT0_OP .literal.push_result_rule+0x18
    e5:   0020f0      nop
    e8:   130190      wsr.lend   a9
    eb:   002000      isync
    ee:   030290      rsr.lcount a9
    f1:   01c992      addi   a9, a9, 1
    f4:   1494e7      bne    a4, a14, 10c <push_result_rule+0x10c>
              f4: R_XTENSA_SLOT0_OP .text.push_result_rule+0x10c

  Work done in collaboration with @jimmo.

  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Allow all constant objects to be used in "X = const(o)". | Damien George | 2022-05-18
  Now that constant tuples are supported in the parser, eg (1, True, "str"), it's a small step to allow anything that is a constant to be used with the pattern:

      from micropython import const
      X = const(obj)

  This commit makes the required changes to allow the following types of constants:

      from micropython import const
      _INT = const(123)
      _FLOAT = const(1.2)
      _COMPLEX = const(3.4j)
      _STR = const("str")
      _BYTES = const(b"bytes")
      _TUPLE = const((_INT, _STR, _BYTES))
      _TUPLE2 = const((None, False, True, ..., (), _TUPLE))

  Prior to this, only integers could be used in const(...).

  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Add MICROPY_COMP_CONST_TUPLE option to build const tuples. | Damien George | 2022-04-14
  This commit adds support to the parser so that tuples which contain only constant elements (bool, int, str, bytes, etc) are immediately converted to a tuple object. This makes it more efficient to use tuples containing constant data because they no longer need to be created at runtime by the bytecode (or native code).

  Furthermore, with this improvement constant tuples that are part of frozen code are now able to be stored fully in ROM (this will be implemented in later commits).

  Code size is increased by about 400 bytes on Cortex-M4 platforms.

  See related issue #722.

  Signed-off-by: Damien George <damien@micropython.org>
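For readers without a MicroPython build at hand, the effect of building a constant tuple at compile time can be observed by analogy on CPython, which performs a comparable fold. This is only an analogy under that assumption: it inspects CPython's code object, not MicroPython's parser, and the helper name `code_constants` is made up for this sketch.

```python
def code_constants(source):
    """Return the constants stored in the code object compiled from source."""
    return compile(source, "<example>", "exec").co_consts

# A tuple of only constant elements is stored as a ready-made object.
folded = code_constants("t = (1, True, 'str')")

# A tuple with a non-constant element must still be built at runtime,
# so no complete tuple object appears among the constants.
unfolded = code_constants("x = 5\nt = (1, x)")
has_prebuilt_tuple = any(isinstance(c, tuple) for c in unfolded)
```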
* py/parse: Print const object value in mp_parse_node_print. | Damien George | 2022-04-14
  To give more information when printing the parse tree.

  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Factor obj extract code to mp_parse_node_extract_const_object. | Damien George | 2022-04-14
  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Handle check for target small-int size in parser. | Damien George | 2022-03-16
  This means that all constants for EMIT_ARG(load_const_obj, obj) are created in the parser (rather than some in the compiler).

  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Put const bytes objects in parse tree as const object. | Damien George | 2022-03-16
  Instead of as an intermediate qstr, which may unnecessarily intern the data of the bytes object.

  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Simplify handling of const int parse nodes. | Damien George | 2022-03-16
  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Simplify parse nodes representing a list. | Damien George | 2021-09-10
  This commit simplifies and optimises the parse tree in-memory representation of lists of expressions, for tuples and lists, and when tuples are used on the left-hand-side of assignments and within del statements. This reduces memory usage of the parse tree when such code is compiled, and also reduces the size of the compiler.

  For example, (1,) was previously the following parse tree:

      expr_stmt(5) (n=2)
        atom_paren(45) (n=1)
          testlist_comp(146) (n=2)
            int(1)
            testlist_comp_3b(149) (n=1)
              NULL
        NULL

  and with this commit is now:

      expr_stmt(5) (n=2)
        atom_paren(45) (n=1)
          testlist_comp(146) (n=1)
            int(1)
        NULL

  Similarly, (1, 2, 3) was previously:

      expr_stmt(5) (n=2)
        atom_paren(45) (n=1)
          testlist_comp(146) (n=2)
            int(1)
            testlist_comp_3c(150) (n=2)
              int(2)
              int(3)
        NULL

  and is now:

      expr_stmt(5) (n=2)
        atom_paren(45) (n=1)
          testlist_comp(146) (n=3)
            int(1)
            int(2)
            int(3)
        NULL

  Signed-off-by: Damien George <damien@micropython.org>
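The change in tree shape can be pictured with a small data-structure sketch. It is an illustration only: the `(rule_name, children)` tuple layout and the function `flatten_list_node` are hypothetical, and the real parser builds the flat form directly rather than converting an old tree.

```python
# Hypothetical node shape: (rule_name, [children]); leaves are plain values,
# and None stands for a NULL child.
TAIL_RULES = ("testlist_comp_3b", "testlist_comp_3c")

def flatten_list_node(node):
    """Splice a nested tail node's items directly into its parent list node."""
    name, children = node
    flat = []
    for child in children:
        if isinstance(child, tuple) and child[0] in TAIL_RULES:
            # Drop NULL placeholders, keep the real items in order.
            flat.extend(item for item in child[1] if item is not None)
        else:
            flat.append(child)
    return (name, flat)

# Old shapes for (1,) and (1, 2, 3), following the trees in the commit message.
old_single = ("testlist_comp", [1, ("testlist_comp_3b", [None])])
old_triple = ("testlist_comp", [1, ("testlist_comp_3c", [2, 3])])

new_single = flatten_list_node(old_single)
new_triple = flatten_list_node(old_triple)
```

In this sketch the flat form needs one node where the nested form needed two, which mirrors the memory saving the commit describes.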
* py: Implement partial PEP-498 (f-string) support. | Jim Mussared | 2021-08-14
  This implements (most of) the PEP-498 spec for f-strings and is based on https://github.com/micropython/micropython/pull/4998 by @klardotsh.

  It is implemented in the lexer as a syntax translation to `str.format`:

      f"{a}" --> "{}".format(a)

  It also supports:

      f"{a=}" --> "a={}".format(a)

  This is done by extracting the arguments into a temporary vstr buffer, then after the string has been tokenized, the lexer input queue is saved and the contents of the temporary vstr buffer are injected into the lexer instead.

  There are four main limitations:

  - raw f-strings (`fr` or `rf` prefixes) are not supported and will raise `SyntaxError: raw f-strings are not supported`.

  - literal concatenation of f-strings with adjacent strings will fail

        "{}" f"{a}" --> "{}{}".format(a)
          (str.format will incorrectly use the braces from the non-f-string)
        f"{a}" f"{a}" --> "{}".format(a) "{}".format(a)
          (cannot concatenate)

  - PEP-498 requires the full parser to understand the interpolated argument, however because this entirely runs in the lexer it cannot resolve nested braces in expressions like

        f"{'}'}"

  - The !r, !s, and !a conversions are not supported.

  Includes tests and cpydiffs.

  Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
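The translation described above can be sketched in a few lines of Python. This is a simplified model, not the lexer's C implementation: `translate_fstring` is a hypothetical helper that handles only `{expr}`, `{expr=}` and doubled-brace escapes, ignores format specs, and shares the nested-brace limitation noted in the commit message.

```python
def translate_fstring(body):
    """Split an f-string body into a str.format template and its arguments."""
    template = []
    args = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch in "{}" and body[i:i + 2] == ch * 2:
            template.append(ch * 2)    # escaped brace stays escaped
            i += 2
        elif ch == "{":
            end = body.index("}", i)   # naive scan: no nested braces
            expr = body[i + 1:end]
            if expr.endswith("="):     # f"{a=}" form: emit "a=" literally
                expr = expr[:-1]
                template.append(expr + "=")
            template.append("{}")
            args.append(expr)
            i = end + 1
        else:
            template.append(ch)
            i += 1
    return "".join(template), args

a = 5
template, args = translate_fstring("{a=}")
rendered = template.format(a)          # same text as f"{a=}" on Python 3.8+
```

Collecting the argument expressions separately from the template loosely mirrors the temporary buffer the commit describes, which is replayed into the lexer after the string token.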
* py/parse: Expose rule-name printing as MICROPY_DEBUG_PARSE_RULE_NAME. | Damien George | 2020-10-01
  So it can be enabled without modifying the source.

  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Pass in an mp_print_t to mp_parse_node_print. | Damien George | 2020-09-11
  So the output can be redirected if needed.

  Signed-off-by: Damien George <damien@micropython.org>
* py/parse: Make mp_parse_node_extract_list return size_t instead of int. | Damien George | 2020-05-09
  Because this function can only return non-negative values, and having the correct return type gives more information to the caller.
* py/parse: Support constant folding of power operator for integers. | Damien George | 2020-05-03
  Constant expressions like "2 ** 3" will now be folded, and the special form "X = const(2 ** 3)" will now compile because the argument to const is now a constant.

  Fixes issue #5865.
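Constant folding of the power operator can be sketched with Python's standard `ast` module. This is a model of the idea rather than the parser's C code: the class `PowFolder` and function `fold` are hypothetical names, the restriction to non-negative integer exponents is an assumption of this sketch (so the result stays an integer), and `ast.unparse` needs Python 3.9 or later.

```python
import ast

class PowFolder(ast.NodeTransformer):
    """Replace int ** int (non-negative exponent) with its computed value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold operands first, so chains collapse
        left, right = node.left, node.right
        if (isinstance(node.op, ast.Pow)
                and isinstance(left, ast.Constant)
                and isinstance(right, ast.Constant)
                and type(left.value) is int
                and type(right.value) is int
                and right.value >= 0):
            folded = ast.Constant(value=left.value ** right.value)
            return ast.copy_location(folded, node)
        return node

def fold(source):
    """Parse an expression, fold constant powers, and return the new source."""
    tree = ast.parse(source, mode="eval")
    tree = ast.fix_missing_locations(PowFolder().visit(tree))
    return ast.unparse(tree)

folded_simple = fold("2 ** 3")
folded_chain = fold("2 ** 3 ** 2")   # right-associative: 2 ** (3 ** 2)
not_folded = fold("x ** 2")          # left operand is a name, not a constant
```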
* all: Format code to add space after C++-style comment start. | stijn | 2020-04-23
  Note: the uncrustify configuration is explicitly set to 'add' instead of 'force' in order not to alter the comments which use extra spaces after // as a means of indenting text for clarity.
* py/parse: Remove unnecessary check in const folding for ** operator. | Damien George | 2020-04-09
  In this part of the code there is no way to get the ** operator, so no need to check for it.

  This commit also adds tests for this, and other related, invalid const operations.
* all: Use MP_ERROR_TEXT for all error messages. | Jim Mussared | 2020-04-05
* all: Reformat C and Python source code with tools/codeformat.py. | Damien George | 2020-02-28
  This is run with uncrustify 0.70.1, and black 19.10b0.
* all: Add *FORMAT-OFF* in various places. | Damien George | 2020-02-28
  This string is recognised by uncrustify, to disable formatting in the region marked by these comments. This is necessary in the qstrdef*.h files to prevent modification of the strings within the Q(...). In other places it is used to prevent excessive reformatting that would make the code less readable.
* py/parse: Add parenthesis around calculated bit-width in struct. | Damien George | 2020-02-28
  To improve interaction with uncrustify formatter.
* py: Rename MP_QSTR_NULL to MP_QSTRnull to avoid intern collisions. | Josh Lloyd | 2019-09-26
  Fixes #5140.
* py: Add support for matmul operator @ as per PEP 465. | Damien George | 2019-09-26
  To make progress towards MicroPython supporting Python 3.5, adding the matmul operator is important because it's a really "low level" part of the language, being a new token and modifications to the grammar.

  It doesn't make sense to make it configurable because 1) it would make the grammar and lexer complicated/messy; 2) no other operators are configurable; 3) it's not a feature that can be "dynamically plugged in" via an import.

  And matmul can be useful as a general purpose user-defined operator, it doesn't have to be just for numpy use.

  Based on work done by Jim Mussared.
* py/parse: Use calculation instead of table to convert token to operator. | Damien George | 2019-09-26
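A minimal sketch of replacing a lookup table with a calculation, assuming the token ids and operator ids are laid out in the same order (the companion py/lexer commit of the same date reorders operator tokens to match the binary ops). All enum names and values here are hypothetical and chosen only for the example; they are not MicroPython's actual identifiers.

```python
from enum import IntEnum

class Tok(IntEnum):       # hypothetical token ids, in operator order
    OP_PLUS = 10
    OP_MINUS = 11
    OP_STAR = 12

class BinaryOp(IntEnum):  # hypothetical operator ids, in the same order
    ADD = 0
    SUBTRACT = 1
    MULTIPLY = 2

def token_to_binary_op(tok):
    # Both enums list the operators in the same order, so the mapping is a
    # constant offset and no per-token table is needed.
    return BinaryOp(BinaryOp.ADD + (tok - Tok.OP_PLUS))

op_for_star = token_to_binary_op(Tok.OP_STAR)
```

The design trade-off is that the two enumerations must be kept in step; in exchange, the table and its storage disappear.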
* py/lexer: Reorder operator tokens to match corresponding binary ops. | Damien George | 2019-09-26
* py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API. | Damien George | 2019-02-12
  These macros could in principle be (inline) functions so it makes sense to have them lower case, to match the other C API functions.

  The remaining macros that are upper case are:
  - MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR
  - MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE
  - MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE
  - MP_OBJ_FUN_MAKE_SIG
  - MP_DECLARE_CONST_xxx
  - MP_DEFINE_CONST_xxx

  These must remain macros because they are used when defining const data (at least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have MP_OBJ_SMALL_INT_VALUE also a macro).

  For those macros that have been made lower case, compatibility macros are provided for the old names so that users do not need to change their code immediately.
* py: Shorten error messages by using contractions and some rewording. | Damien George | 2018-09-20
* py/parse: Fix macro evaluation by avoiding empty __VA_ARGS__. | Damien George | 2017-12-29
  Empty __VA_ARGS__ are not allowed in the C preprocessor so adjust the rule arg offset calculation to not use them. Also, some compilers (eg MSVC) require an extra layer of macro expansion.
* py/parse: Update debugging code to compile on 64-bit arch. | Damien George | 2017-12-29
* py/parse: Compress rule pointer table to table of offsets. | Damien George | 2017-12-29
  This is the sixth and final patch in a series of patches to the parser that aims to reduce code size by compressing the data corresponding to the rules of the grammar.

  Prior to this set of patches the rules were stored as rule_t structs with rule_id, act and arg members. And then there was a big table of pointers which allowed looking up the address of a rule_t struct given the id of that rule.

  The changes that have been made are:
  - Breaking up of the rule_t struct into individual components, with each component in a separate array.
  - Removal of the rule_id part of the struct because it's not needed.
  - Put all the rule arg data in a big array.
  - Change the table of pointers to rules to a table of offsets within the array of rule arg data.

  The last point is what is done in this patch here and brings about the biggest decreases in code size, because an array of pointers is now an array of bytes.

  The code size changes for the six patches combined are:

      bare-arm:     -644
      minimal x86: -1856
      unix x64:    -5408
      unix nanbox: -2080
      stm32:        -720
      esp8266:      -812
      cc3200:       -712

  For the change in parser performance: it was measured on pyboard that these six patches combined gave an increase in script parse time of about 0.4%. This is due to the slightly more complicated way of looking up the data for a rule (since the 9th bit of the offset into the rule arg data table is calculated with an if statement). This is an acceptable increase in parse time considering that parsing is only done once per script (if compiled on the target).
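The offset-table idea, including a 9th bit recovered with an `if`, can be modelled as follows. This is a sketch under assumptions, not the actual table layout: `build_tables`, `rule_arg_offset` and the `first_big` threshold are hypothetical, and the model relies on offsets growing monotonically with the rule id so that a single comparison can supply the high bit.

```python
def build_tables(rules):
    """Pack per-rule argument lists into one array plus a byte offset table."""
    arg_data = []
    offsets = []
    for args in rules:
        offsets.append(len(arg_data))
        arg_data.extend(args)
    if len(arg_data) > 512:
        raise ValueError("offsets must fit in 9 bits")
    # Offsets only grow, so once one rule needs the 9th bit all later ones do.
    first_big = next((i for i, off in enumerate(offsets) if off >= 256),
                     len(rules))
    offset_table = bytes(off & 0xFF for off in offsets)  # one byte per rule
    return arg_data, offset_table, first_big

def rule_arg_offset(rule_id, offset_table, first_big):
    """Recover the full 9-bit offset from the stored low 8 bits."""
    off = offset_table[rule_id]
    if rule_id >= first_big:  # the extra branch that supplies the 9th bit
        off |= 0x100
    return off

# 100 hypothetical rules with 3 args each: 300 entries, more than a byte can index.
rules = [[i, i + 1, i + 2] for i in range(100)]
arg_data, offset_table, first_big = build_tables(rules)
```

Each table entry is one byte instead of a pointer, which is where the size reduction comes from; the extra branch in `rule_arg_offset` corresponds to the small parse-time cost the commit message mentions.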
* py/parse: Remove rule_t struct because it's no longer needed. | Damien George | 2017-12-28
* py/parse: Pass rule_id to push_result_token, instead of passing rule_t*. | Damien George | 2017-12-28
* py/parse: Pass rule_id to push_result_rule, instead of passing rule_t*. | Damien George | 2017-12-28
  Reduces code size by eliminating quite a few pointer dereferences.
* py/parse: Break rule data into separate act and arg arrays. | Damien George | 2017-12-28
  Instead of each rule being stored in ROM as a struct with rule_id, act and arg, the act and arg parts are now in separate arrays and the rule_id part is removed because it's not needed. This reduces code size, by roughly one byte per grammar rule, around 150 bytes.
* py/parse: Split out rule name from rule struct into separate array. | Damien George | 2017-12-28
  The rule name is only used for debugging, and this patch makes things a bit cleaner by completely separating out the rule name from the rest of the rule data.
* py: Extend nan-boxing config to have 47-bit small integers. | Damien George | 2017-12-11
  The nan-boxing representation has an extra 16 bits of space to store small-int values, and making use of it allows creating and manipulating full 32-bit positive integers (ie up to 0xffffffff) without using the heap.
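The arithmetic behind this can be checked with a short range test. The sketch assumes a signed two's-complement small-int range, and infers the previous width of 31 bits from the "extra 16 bits" in the message (47 - 16 = 31); `fits_small_int` is a hypothetical helper, not a MicroPython function.

```python
SMALL_INT_BITS = 47      # from the commit title
OLD_SMALL_INT_BITS = 31  # inferred: 47 minus the extra 16 bits

def fits_small_int(value, bits=SMALL_INT_BITS):
    """True if value fits in a signed integer of the given bit width."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit

fits_now = fits_small_int(0xFFFFFFFF)
fitted_before = fits_small_int(0xFFFFFFFF, bits=OLD_SMALL_INT_BITS)
```

Under these assumptions 0xffffffff fits in 47 bits but not in 31, which is why such a value no longer needs a heap-allocated integer.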
* py/objstr: Make mp_obj_new_str_of_type check for existing interned qstr. | Damien George | 2017-11-16
  The function mp_obj_new_str_of_type is a general str object constructor used in many places in the code to create either a str or bytes object. When creating a str it should first check if the string data already exists as an interned qstr, and if so then return the qstr object. This patch makes the function have such behaviour, which helps to reduce heap usage by reusing existing interned data where possible.

  The old behaviour of mp_obj_new_str_of_type (which didn't check for existing interned data) is made available through the function mp_obj_new_str_copy, but should only be used in very special cases.

  One consequence of this patch is that the following expression is now True:

      'abc' is ' abc '.split()[0]
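The check-before-create behaviour can be modelled with a small intern pool. This is a conceptual sketch only: `InternPool` and `new_str` are hypothetical stand-ins for the qstr pool and the constructor, and the model reuses an existing interned object without interning new data.

```python
class InternPool:
    """Minimal model of a pool of interned strings."""

    def __init__(self):
        self._pool = {}

    def intern(self, data):
        # Store data if it is new; always return the pooled object.
        return self._pool.setdefault(data, data)

    def lookup(self, data):
        # Return the pooled object with equal contents, or None.
        return self._pool.get(data)

def new_str(pool, data):
    """Return the interned object if one exists, otherwise the new data."""
    existing = pool.lookup(data)
    return existing if existing is not None else data

pool = InternPool()
interned = pool.intern("abc")
candidate = " abc ".split()[0]    # equal contents, built at runtime
result = new_str(pool, candidate)
other = new_str(pool, "".join(["x", "y", "z"]))
```

Because the lookup returns the pooled object itself, `result` is the same object as `interned`, which parallels the identity result quoted in the commit message.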
* all: Remove inclusion of internal py header files. | Damien George | 2017-10-04
  Header files that are considered internal to the py core and should not normally be included directly are:

      py/nlr.h - internal nlr configuration and declarations
      py/bc0.h - contains bytecode macro definitions
      py/runtime0.h - contains basic runtime enums

  Instead, the top-level header files to include are one of:

      py/obj.h - includes runtime0.h and defines everything to use the mp_obj_t type
      py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h, and defines everything to use the general runtime support functions

  Additional, specific headers (eg py/objlist.h) can be included if needed.
* all: Use the name MicroPython consistently in comments | Alexander Steffen | 2017-07-31
  There were several different spellings of MicroPython present in comments, when there should be only one.
* py/parse: Simplify handling of errors by raising them directly. | Damien George | 2017-02-24
  The parser was originally written to work without raising any exceptions and instead return an error value to the caller. But it's now required that a call to the parser be wrapped in an nlr handler, so we may as well make use of that fact and simplify the parser so that it doesn't need to keep track of any memory errors that it had. The parser anyway explicitly raises an exception at the end if there was an error.

  This patch simplifies the parser by letting the underlying memory allocation functions raise an exception if they fail to allocate any memory. And if there is an error parsing the "<id> = const(<val>)" pattern then that also raises an exception right away instead of trying to recover gracefully and then raise.