summaryrefslogtreecommitdiffstatshomepage
path: root/py/objstr.c
Commit message (Collapse)AuthorAge
* py/objstr: Add support for the :_b/o/x specifier in str.format.Jeff Epler2025-05-13
| | | | | | | | | | | This groups non-decimal values by fours, such as bbb_bbbb_bbbb. It also supports `{:_d}` to use underscore for decimal numbers (grouped in threes). Use of incorrect ":,b" is not diagnosed. Thanks to @dpgeorge for the suggestion to reduce code size. Signed-off-by: Jeff Epler <jepler@gmail.com>
* all: Rename the "NORETURN" macro to "MP_NORETURN".Alessandro Gatti2025-05-13
| | | | | | | | | | | | | | | | This commit renames the NORETURN macro, indicating to the compiler that a function does not return, into MP_NORETURN to maintain the same naming convention of other similar macros. To maintain compaitiblity with existing code NORETURN is aliased to MP_NORETURN, but it is also deprecated for MicroPython v2. This changeset was created using a similar process to decf8e6a8bb940d5829ca3296790631fcece7b21 ("all: Remove the "STATIC" macro and just use "static" instead."), with no documentation or python scripts to change to reflect the new macro name. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
* py/objstr: Fix handling of OP_MODULO with namedtuple.Yoctopuce dev2025-04-21
| | | | | | | | | This fix handles attrtuple as well, eg. os.uname(). A test case has been added in basics/attrtuple2.py. Fixes issue #16969. Signed-off-by: Yoctopuce dev <dev@yoctopuce.com>
* py/objstr: Support tuples and start/end args in startswith and endswith.Glenn Moloney2025-03-02
| | | | | | | | | | | | | | This change allows tuples to be passed as the prefix/suffix argument to the `str.startswith()` and `str.endswith()` methods. The methods will return `True` if the string starts/ends with any of the prefixes/suffixes in the tuple. Also adds full support for the `start` and `end` arguments to both methods for compatibility with CPython. Tests have been updated for the new behaviour. Signed-off-by: Glenn Moloney <glenn.moloney@gmail.com>
* py/objstr: Skip whitespace in bytes.fromhex().Glenn Moloney2024-08-19
| | | | | | | | | Skip whitespace characters between pairs of hex numbers. This makes `bytes.fromhex()` compatible with cpython. Includes simple test in `tests/basic/builtin_str_hex.py`. Signed-off-by: Glenn Moloney <glenn.moloney@gmail.com>
* py: Add new cstack API for stack checking, with limit margin macro.Angus Gratton2024-08-14
| | | | | | | | | | | | | | | | | | | | Currently the stack limit margin is hard-coded in each port's call to `mp_stack_set_limit()`, but on threaded ports it's fiddlier and can lead to bugs (such as incorrect thread stack margin on esp32). This commit provides a new API to initialise the C Stack in one function call, with a config macro to set the margin. Where possible the new call is inlined to reduce code size in thread-free ports. Intended replacement for `MP_TASK_STACK_LIMIT_MARGIN` on esp32. The previous `stackctrl.h` API is still present and unmodified apart from a deprecation comment. However it's not available when the `MICROPY_PREVIEW_VERSION_2` macro is set. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
* py/objstr: Add new mp_obj_new_str_from_cstr() helper function.Jon Foster2024-07-04
| | | | | | | | | | | | | | | There were lots of places where this pattern was duplicated, to convert a standard C string to a MicroPython string: x = mp_obj_new_str(s, strlen(s)); This commit provides a simpler method that removes this code duplication: x = mp_obj_new_str_from_cstr(s); This gives clearer, and probably smaller, code. Signed-off-by: Jon Foster <jon@jon-foster.co.uk>
* all: Remove the "STATIC" macro and just use "static" instead.Angus Gratton2024-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The STATIC macro was introduced a very long time ago in commit d5df6cd44a433d6253a61cb0f987835fbc06b2de. The original reason for this was to have the option to define it to nothing so that all static functions become global functions and therefore visible to certain debug tools, so one could do function size comparison and other things. This STATIC feature is rarely (if ever) used. And with the use of LTO and heavy inline optimisation, analysing the size of individual functions when they are not static is not a good representation of the size of code when fully optimised. So the macro does not have much use and it's simpler to just remove it. Then you know exactly what it's doing. For example, newcomers don't have to learn what the STATIC macro is and why it exists. Reading the code is also less "loud" with a lowercase static. One other minor point in favour of removing it, is that it stops bugs with `STATIC inline`, which should always be `static inline`. Methodology for this commit was: 1) git ls-files | egrep '\.[ch]$' | \ xargs sed -Ei "s/(^| )STATIC($| )/\1static\2/" 2) Do some manual cleanup in the diff by searching for the word STATIC in comments and changing those back. 3) "git-grep STATIC docs/", manually fixed those cases. 4) "rg -t python STATIC", manually fixed codegen lines that used STATIC. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
* py/objstr: Fix `str % {}` edge case.mcskatkat2023-09-01
| | | | | | | Eliminate `TypeError` when format string contains no named conversions. This matches CPython behavior. Signed-off-by: mcskatkat <mc_skatkat@hotmail.com>
* all: Rename UMODULE to MODULE in preprocessor/Makefile vars.Jim Mussared2023-06-08
| | | | | | This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Return unsupported binop instead of raising TypeError.Damien George2023-05-19
| | | | | | | So that user types can implement reverse operators and have them work with str on the left-hand-side, eg `"a" + UserType()`. Signed-off-by: Damien George <damien@micropython.org>
* py: Remove the word "yet" from exception messages.Damien George2022-12-06
| | | | | | | These unimplemented features may never be implemented, and having the word "yet" there takes up space. Signed-off-by: Damien George <damien@micropython.org>
* py/objarray: Detect bytearray(str) without an encoding.Jim Mussared2022-11-08
| | | | | | | | | This prevents a very subtle bug caused by writing e.g. `bytearray('\xfd')` which gives you `(0xc3, 0xbd)`. This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Add a helper to set mp_obj_str_t data.Jim Mussared2022-10-11
| | | | Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Don't treat bytes as unicode in str.count.Jim Mussared2022-09-26
| | | | | | | | | | `b'\xaa \xaa'.count(b'\xaa')` now (correctly) returns 2 instead of 1. Fixes issue #9404. This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/obj: Optimise code size and performance for make_new as a slot.Jim Mussared2022-09-19
| | | | | | | The check for make_new (i.e. used to determine something's type) is now more complicated due to the slot access. This commit changes the inlining of a few frequently-used helpers to overall improve code size and performance.
* py/obj: Convert make_new into a mp_obj_type_t slot.Jim Mussared2022-09-19
| | | | | | | | | | | Instead of being an explicit field, it's now a slot like all the other methods. This is a marginal code size improvement because most types have a make_new (100/138 on PYBV11), however it improves consistency in how types are declared, removing the special case for make_new. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/obj: Merge getiter and iternext mp_obj_type_t slots.Jim Mussared2022-09-19
| | | | | | | | | | | | | | | | | | | | | | The goal here is to remove a slot (making way to turn make_new into a slot) as well as reduce code size by the ~40 references to mp_identity_getiter and mp_stream_unbuffered_iter. This introduces two new type flags: - MP_TYPE_FLAG_ITER_IS_ITERNEXT: This means that the "iter" slot in the type is "iternext", and should use the identity getiter. - MP_TYPE_FLAG_ITER_IS_CUSTOM: This means that the "iter" slot is a pointer to a mp_getiter_iternext_custom_t instance, which then defines both getiter and iternext. And a third flag that is the OR of both, MP_TYPE_FLAG_ITER_IS_STREAM: This means that the type should use the identity getiter, and mp_stream_unbuffered_iter as iternext. Finally, MP_TYPE_FLAG_ITER_IS_GETITER is defined as a no-op flag to give the default case where "iter" is "getiter". Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* all: Remove unnecessary locals_dict cast.Jim Mussared2022-09-19
| | | | Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* all: Make all mp_obj_type_t defs use MP_DEFINE_CONST_OBJ_TYPE.Jim Mussared2022-09-19
| | | | | | In preparation for upcoming rework of mp_obj_type_t layout. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* all: Simplify buffer protocol to just a "get buffer" callback.Jim Mussared2022-09-19
| | | | | | | | | | | | | | The buffer protocol type only has a single member, and this existing layout creates problems for the upcoming split/slot-index mp_obj_type_t layout optimisations. If we need to make the buffer protocol more sophisticated in the future either we can rely on the mp_obj_type_t optimisations to just add additional slots to mp_obj_type_t or re-visit the buffer protocol then. This change is a no-op in terms of generated code. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Always validate utf-8 for mp_obj_new_str.Jim Mussared2022-08-26
| | | | | | | | All uses of this are either tiny strings or not-known-to-be-safe. Update comments for mp_obj_new_str_copy and mp_obj_new_str_of_type. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Optimise mp_obj_new_str_from_vstr for known-safe strings.Jim Mussared2022-08-26
| | | | | | | The new `mp_obj_new_str_from_utf8_vstr` can be used when you know you already have a unicode-safe string. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Always ensure mp_obj_str_from_vstr is unicode-safe.Jim Mussared2022-08-26
| | | | | | | | | | Now that we have `mp_obj_new_str_type_from_vstr` (private helper used by objstr.c) split from the public API (`mp_obj_new_str_from_vstr`), we can enforce a unicode check at the public API without incurring a performance cost on the various objstr.c methods (which are already working on known unicode-safe strings). Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Split mp_obj_str_from_vstr into bytes/str versions.Jim Mussared2022-08-26
| | | | | | | | | | | | | | Previously the desired output type was specified. Now make the type part of the function name. Because this function is used in a few places this saves code size due to smaller call-site. This makes `mp_obj_new_str_type_from_vstr` a private function of objstr.c (which is almost the only place where the output type isn't a compile-time constant). This saves ~140 bytes on PYBV11. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Add hex/fromhex to bytes/memoryview/bytearray.Jim Mussared2022-08-12
| | | | | | | | | | | These were added in Python 3.5. Enabled via MICROPY_PY_BUILTINS_BYTES_HEX, and enabled by default for all ports that currently have ubinascii. Rework ubinascii to use the implementation of these methods. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Consolidate methods for str/bytes/bytearray/array.Andrew Leech2022-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds the bytes methods to bytearray, matching CPython. The existing implementations of these methods for str/bytes are reused for bytearray with minor updates to match CPython return types. For details on the CPython behaviour see https://docs.python.org/3/library/stdtypes.html#bytes-and-bytearray-operations The work to merge locals tables for str/bytes/bytearray/array was done by @jimmo. Because of this merging of locals the change in code size for this commit is mostly negative: bare-arm: +0 +0.000% minimal x86: +29 +0.018% unix x64: -792 -0.128% standard[incl -448(data)] unix nanbox: -436 -0.078% nanbox[incl -448(data)] stm32: -40 -0.010% PYBV10 cc3200: -32 -0.017% esp8266: -28 -0.004% GENERIC esp32: -72 -0.005% GENERIC[incl -200(data)] mimxrt: -40 -0.011% TEENSY40 renesas-ra: -40 -0.006% RA6M2_EK nrf: -16 -0.009% pca10040 rp2: -64 -0.013% PICO samd: +148 +0.105% ADAFRUIT_ITSYBITSY_M4_EXPRESS
* py/obj: Add static safety checks to mp_obj_is_type().Yonatan Goldschmidt2022-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d96cfd13e3a464862c introduced a regression by breaking existing users of mp_obj_is_type(.., &mp_obj_bool). This function (and associated helpers like mp_obj_is_int()) have some specific nuances, and mistakes like this one can happen again. This commit adds mp_obj_is_exact_type() which behaves like the the old mp_obj_is_type(). The new mp_obj_is_type() has the same prototype but it attempts to statically assert that it's not called with types which should be checked using mp_obj_is_type(). If called with any of these types: int, str, bool, NoneType - it will cause a compilation error. Additional checked types (e.g function types) can be added in the future. Existing users of mp_obj_is_type() with the now "invalid" types, were translated to use mp_obj_is_exact_type(). The use of MP_STATIC_ASSERT() is not bulletproof - usually GCC (and other compilers) can't statically check conditions that are only known during link-time (like variables' addresses comparison). However, in this case, GCC is able to statically detect these conditions, probably because it's the exact same object - `&mp_type_int == &mp_type_int` is detected. Misuses of this function with runtime-chosen types (e.g: `mp_obj_type_t *x = ...; mp_obj_is_type(..., x);` won't be detected. MSC is unable to detect this, so we use MP_STATIC_ASSERT_NOT_MSC(). Compiling with this commit and without the fix for d96cfd13e3a464862c shows that it detects the problem. Signed-off-by: Yonatan Goldschmidt <yon.goldschmidt@gmail.com>
* all: Use mp_obj_malloc everywhere it's applicable.Jim Mussared2022-05-03
| | | | | | | | | | | | | | | This replaces occurences of foo_t *foo = m_new_obj(foo_t); foo->base.type = &foo_type; with foo_t *foo = mp_obj_malloc(foo_t, &foo_type); Excludes any places where base is a sub-field or when new0/memset is used. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Support '{:08}'.format("Jan") like Python 3.10.Jeff Epler2022-01-19
| | | | | | | | | The new test has an .exp file, because it is not compatible with Python 3.9 and lower. See CPython version of the issue at https://bugs.python.org/issue27772 Signed-off-by: Jeff Epler <jepler@gmail.com>
* py: Introduce and use mp_raise_type_arg helper.Damien George2021-07-15
| | | | | | To reduce code size. Signed-off-by: Damien George <damien@micropython.org>
* py: Add option to compile without any error messages at all.Damien George2021-04-27
| | | | | | | | This introduces a new option, MICROPY_ERROR_REPORTING_NONE, which completely disables all error messages. To be used in cases where MicroPython needs to fit in very limited systems. Signed-off-by: Damien George <damien@micropython.org>
* py/mpprint: Fix length calculation for strings with precision-modifier.Joris Peeraer2020-12-07
| | | | | | | | | | | | | | | | | | | Two issues are tackled: 1. The calculation of the correct length to print is fixed to treat the precision as a maximum length instead as the exact length. This is done for both qstr (%q) and for regular str (%s). 2. Fix the incorrect use of mp_printf("%.*s") to mp_print_strn(). Because of the fix of above issue, some testcases that would print an embedded null-byte (^@ in test-output) would now fail. The bug here is that "%s" was used to print null-bytes. Instead, mp_print_strn is used to make sure all bytes are outputted and the exact length is respected. Test-cases are added for both %s and %q with a combination of precision and padding specifiers.
* py/objstr: Make bytes(bytes_obj) return bytes_obj.Iyassou Shimels2020-09-24
| | | | | | | Calling the bytes constructor on a bytes object returns the original bytes object. This saves allocating a new instance, and matches CPython. Signed-off-by: Iyassou Shimels <s.iyassou@gmail.com>
* all: Format code to add space after C++-style comment start.stijn2020-04-23
| | | | | | Note: the uncrustify configuration is explicitly set to 'add' instead of 'force' in order not to alter the comments which use extra spaces after // as a means of indenting text for clarity.
* all: Use MP_ERROR_TEXT for all error messages.Jim Mussared2020-04-05
|
* py: Use preprocessor to detect error reporting level (terse/detailed).Jim Mussared2020-04-05
| | | | | | Instead of compiler-level if-logic. This is necessary to know what error strings are included in the build at the preprocessor stage, so that string compression can be implemented.
* py/objstr: Remove duplicate % in error string.Tom Collins2020-03-11
| | | | | | | | | The double-% was added in 11de8399fe5f9ef54589b14470faf8d4fcc5ccaa (Jun 2014) when such errors were formatted with printf. But then 55830dd9bf4fee87c0a6d3f38c51614fea0eb483 (Dec 2018) changed mp_obj_new_exception_msg() to not format the message, as discussed in #3004. So such error strings are no longer formatted and a % is just that.
* all: Reformat C and Python source code with tools/codeformat.py.Damien George2020-02-28
| | | | This is run with uncrustify 0.70.1, and black 19.10b0.
* py: Add mp_raise_msg_varg helper and use it where appropriate.Damien George2020-02-13
| | | | | | | | | | | | | | | | | | This commit adds mp_raise_msg_varg(type, fmt, ...) as a helper for nlr_raise(mp_obj_new_exception_msg_varg(type, fmt, ...)). It makes the C-level API for raising exceptions more consistent, and reduces code size on most ports: bare-arm: +28 +0.042% minimal x86: +100 +0.067% unix x64: -56 -0.011% unix nanbox: -300 -0.068% stm32: -204 -0.054% PYBV10 cc3200: +0 +0.000% esp8266: -64 -0.010% GENERIC esp32: -104 -0.007% GENERIC nrf: -136 -0.094% pca10040 samd: +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
* py/obj.h: Add and use mp_obj_is_bool() helper.Yonatan Goldschmidt2020-01-24
| | | | | | | | | | | | | | | Commit d96cfd13e3a464862cecffb2718c6286b52c77b0 introduced a regression in testing for bool objects, that such objects were in some cases no longer recognised and bools, eg when using mp_obj_is_type(o, &mp_type_bool), or mp_obj_is_integer(o). This commit fixes that problem by adding mp_obj_is_bool(o). Builds with MICROPY_OBJ_IMMEDIATE_OBJS enabled check if the object is any of the const True or False objects. Builds without it use the old method of ->type checking, which compiles to smaller code (compared with the former mentioned method). Fixes #5538.
* py: Make mp_obj_get_type() return a const ptr to mp_obj_type_t.Damien George2020-01-09
| | | | | | Most types are in rodata/ROM, and mp_obj_base_t.type is a constant pointer, so enforce this const-ness throughout the code base. If a type ever needs to be modified (eg a user type) then a simple cast can be used.
* py/objstr: Don't use inline GET_STR_DATA_LEN for object-repr D.Damien George2019-12-27
| | | | | Changing to use the helper function mp_obj_str_get_data_no_check() reduces code size of nan-boxing builds by about 1000 bytes.
* py/objstr: Size-optimise failure path for mp_obj_str_get_buffer.Jim Mussared2019-10-22
| | | | These fields are never looked at if the function returns non-zero.
* py: Rename MP_QSTR_NULL to MP_QSTRnull to avoid intern collisions.Josh Lloyd2019-09-26
| | | | Fixes #5140.
* py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API.Damien George2019-02-12
| | | | | | | | | | | | | | | | | | | | | These macros could in principle be (inline) functions so it makes sense to have them lower case, to match the other C API functions. The remaining macros that are upper case are: - MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR - MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE - MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE - MP_OBJ_FUN_MAKE_SIG - MP_DECLARE_CONST_xxx - MP_DEFINE_CONST_xxx These must remain macros because they are used when defining const data (at least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have MP_OBJ_SMALL_INT_VALUE also a macro). For those macros that have been made lower case, compatibility macros are provided for the old names so that users do not need to change their code immediately.
* py: Update my copyright info on some files.Paul Sokolovsky2019-02-06
| | | | Based on git history.
* py/objstr: Make str.count() method configurable.Paul Sokolovsky2018-10-22
| | | | | | Configurable via MICROPY_PY_BUILTINS_STR_COUNT. Default is enabled. Disabled for bare-arm, minimal, unix-minimal and zephyr ports. Disabling it saves 408 bytes on x86.
* py/objstr: format: Return bytes result for bytes format string.Paul Sokolovsky2018-09-26
| | | | | | | | | | This is an improvement over previous behavior when str was returned for both str and bytes input format. This new behaviour is also consistent with how the % operator works, as well as many other str/bytes methods. It should be noted that it's not how current versions of CPython work, where there's a gap in the functionality and bytes.format() is not supported.
* py/objstr: Make % (__mod__) formatting operator configurable.Paul Sokolovsky2018-09-20
| | | | | Default is enabled, disabled for minimal builds. Saves 1296 bytes on x86, 976 bytes on ARM.