summaryrefslogtreecommitdiffstatshomepage
path: root/py/objstrunicode.c
Commit message (Collapse)AuthorAge
* all: Remove the "STATIC" macro and just use "static" instead.Angus Gratton2024-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The STATIC macro was introduced a very long time ago in commit d5df6cd44a433d6253a61cb0f987835fbc06b2de. The original reason for this was to have the option to define it to nothing so that all static functions become global functions and therefore visible to certain debug tools, so one could do function size comparison and other things. This STATIC feature is rarely (if ever) used. And with the use of LTO and heavy inline optimisation, analysing the size of individual functions when they are not static is not a good representation of the size of code when fully optimised. So the macro does not have much use and it's simpler to just remove it. Then you know exactly what it's doing. For example, newcomers don't have to learn what the STATIC macro is and why it exists. Reading the code is also less "loud" with a lowercase static. One other minor point in favour of removing it, is that it stops bugs with `STATIC inline`, which should always be `static inline`. Methodology for this commit was: 1) git ls-files | egrep '\.[ch]$' | \ xargs sed -Ei "s/(^| )STATIC($| )/\1static\2/" 2) Do some manual cleanup in the diff by searching for the word STATIC in comments and changing those back. 3) "git-grep STATIC docs/", manually fixed those cases. 4) "rg -t python STATIC", manually fixed codegen lines that used STATIC. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
* all: Rename UMODULE to MODULE in preprocessor/Makefile vars.Jim Mussared2023-06-08
| | | | | | This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/obj: Convert make_new into a mp_obj_type_t slot.Jim Mussared2022-09-19
| | | | | | | | | | | Instead of being an explicit field, it's now a slot like all the other methods. This is a marginal code size improvement because most types have a make_new (100/138 on PYBV11), however it improves consistency in how types are declared, removing the special case for make_new. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/obj: Merge getiter and iternext mp_obj_type_t slots.Jim Mussared2022-09-19
| | | | | | | | | | | | | | | | | | | | | | The goal here is to remove a slot (making way to turn make_new into a slot) as well as reduce code size by the ~40 references to mp_identity_getiter and mp_stream_unbuffered_iter. This introduces two new type flags: - MP_TYPE_FLAG_ITER_IS_ITERNEXT: This means that the "iter" slot in the type is "iternext", and should use the identity getiter. - MP_TYPE_FLAG_ITER_IS_CUSTOM: This means that the "iter" slot is a pointer to a mp_getiter_iternext_custom_t instance, which then defines both getiter and iternext. And a third flag that is the OR of both, MP_TYPE_FLAG_ITER_IS_STREAM: This means that the type should use the identity getiter, and mp_stream_unbuffered_iter as iternext. Finally, MP_TYPE_FLAG_ITER_IS_GETITER is defined as a no-op flag to give the default case where "iter" is "getiter". Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* all: Remove unnecessary locals_dict cast.Jim Mussared2022-09-19
| | | | Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* all: Make all mp_obj_type_t defs use MP_DEFINE_CONST_OBJ_TYPE.Jim Mussared2022-09-19
| | | | | | In preparation for upcoming rework of mp_obj_type_t layout. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* all: Simplify buffer protocol to just a "get buffer" callback.Jim Mussared2022-09-19
| | | | | | | | | | | | | | The buffer protocol type only has a single member, and this existing layout creates problems for the upcoming split/slot-index mp_obj_type_t layout optimisations. If we need to make the buffer protocol more sophisticated in the future either we can rely on the mp_obj_type_t optimisations to just add additional slots to mp_obj_type_t or re-visit the buffer protocol then. This change is a no-op in terms of generated code. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
* py/objstr: Consolidate methods for str/bytes/bytearray/array.Andrew Leech2022-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds the bytes methods to bytearray, matching CPython. The existing implementations of these methods for str/bytes are reused for bytearray with minor updates to match CPython return types. For details on the CPython behaviour see https://docs.python.org/3/library/stdtypes.html#bytes-and-bytearray-operations The work to merge locals tables for str/bytes/bytearray/array was done by @jimmo. Because of this merging of locals the change in code size for this commit is mostly negative: bare-arm: +0 +0.000% minimal x86: +29 +0.018% unix x64: -792 -0.128% standard[incl -448(data)] unix nanbox: -436 -0.078% nanbox[incl -448(data)] stm32: -40 -0.010% PYBV10 cc3200: -32 -0.017% esp8266: -28 -0.004% GENERIC esp32: -72 -0.005% GENERIC[incl -200(data)] mimxrt: -40 -0.011% TEENSY40 renesas-ra: -40 -0.006% RA6M2_EK nrf: -16 -0.009% pca10040 rp2: -64 -0.013% PICO samd: +148 +0.105% ADAFRUIT_ITSYBITSY_M4_EXPRESS
* py/mpprint: Fix length calculation for strings with precision-modifier.Joris Peeraer2020-12-07
| | | | | | | | | | | | | | | | | | | Two issues are tackled: 1. The calculation of the correct length to print is fixed to treat the precision as a maximum length instead as the exact length. This is done for both qstr (%q) and for regular str (%s). 2. Fix the incorrect use of mp_printf("%.*s") to mp_print_strn(). Because of the fix of above issue, some testcases that would print an embedded null-byte (^@ in test-output) would now fail. The bug here is that "%s" was used to print null-bytes. Instead, mp_print_strn is used to make sure all bytes are outputted and the exact length is respected. Test-cases are added for both %s and %q with a combination of precision and padding specifiers.
* all: Use MP_ERROR_TEXT for all error messages.Jim Mussared2020-04-05
|
* all: Reformat C and Python source code with tools/codeformat.py.Damien George2020-02-28
| | | | This is run with uncrustify 0.70.1, and black 19.10b0.
* py: Add mp_raise_msg_varg helper and use it where appropriate.Damien George2020-02-13
| | | | | | | | | | | | | | | | | | This commit adds mp_raise_msg_varg(type, fmt, ...) as a helper for nlr_raise(mp_obj_new_exception_msg_varg(type, fmt, ...)). It makes the C-level API for raising exceptions more consistent, and reduces code size on most ports: bare-arm: +28 +0.042% minimal x86: +100 +0.067% unix x64: -56 -0.011% unix nanbox: -300 -0.068% stm32: -204 -0.054% PYBV10 cc3200: +0 +0.000% esp8266: -64 -0.010% GENERIC esp32: -104 -0.007% GENERIC nrf: -136 -0.094% pca10040 samd: +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
* py: Make mp_obj_get_type() return a const ptr to mp_obj_type_t.Damien George2020-01-09
| | | | | | Most types are in rodata/ROM, and mp_obj_base_t.type is a constant pointer, so enforce this const-ness throughout the code base. If a type ever needs to be modified (eg a user type) then a simple cast can be used.
* py/objslice: Inline fetching of slice paramters in str_subscr().Nicko van Someren2019-12-29
| | | | To reduce code size.
* py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API.Damien George2019-02-12
| | | | | | | | | | | | | | | | | | | | | These macros could in principle be (inline) functions so it makes sense to have them lower case, to match the other C API functions. The remaining macros that are upper case are: - MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR - MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE - MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE - MP_OBJ_FUN_MAKE_SIG - MP_DECLARE_CONST_xxx - MP_DEFINE_CONST_xxx These must remain macros because they are used when defining const data (at least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have MP_OBJ_SMALL_INT_VALUE also a macro). For those macros that have been made lower case, compatibility macros are provided for the old names so that users do not need to change their code immediately.
* py: Update my copyright info on some files.Paul Sokolovsky2019-02-06
| | | | Based on git history.
* py/objstr: Make str.count() method configurable.Paul Sokolovsky2018-10-22
| | | | | | Configurable via MICROPY_PY_BUILTINS_STR_COUNT. Default is enabled. Disabled for bare-arm, minimal, unix-minimal and zephyr ports. Disabling it saves 408 bytes on x86.
* py/unicode: Clean up utf8 funcs and provide non-utf8 inline versions.Damien George2018-02-14
| | | | | | | | | | This patch provides inline versions of the utf8 helper functions for the case when unicode is disabled (MICROPY_PY_BUILTINS_STR_UNICODE set to 0). This saves code size. The unichar_charlen function is also renamed to utf8_charlen to match the other utf8 helper functions, and the signature of this function is adjusted for consistency (const char* -> const byte*, mp_uint_t -> size_t).
* py/objstr: Remove "make_qstr_if_not_already" arg from mp_obj_new_str.Damien George2017-11-16
| | | | | | | | | | | This patch simplifies the str creation API to favour the common case of creating a str object that is not forced to be interned. To force interning of a new str the new mp_obj_new_str_via_qstr function is added, and should only be used if warranted. Apart from simplifying the mp_obj_new_str function (and making it have the same signature as mp_obj_new_bytes), this patch also reduces code size by a bit (-16 bytes for bare-arm and roughly -40 bytes on the bare-metal archs).
* all: Remove inclusion of internal py header files.Damien George2017-10-04
| | | | | | | | | | | | | | | | Header files that are considered internal to the py core and should not normally be included directly are: py/nlr.h - internal nlr configuration and declarations py/bc0.h - contains bytecode macro definitions py/runtime0.h - contains basic runtime enums Instead, the top-level header files to include are one of: py/obj.h - includes runtime0.h and defines everything to use the mp_obj_t type py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h, and defines everything to use the general runtime support functions Additional, specific headers (eg py/objlist.h) can be included if needed.
* all: Convert mp_uint_t to mp_unary_op_t/mp_binary_op_t where appropriateDamien George2017-08-29
| | | | | | | The unary-op/binary-op enums are already defined, and there are no arithmetic tricks used with these types, so it makes sense to use the correct enum type for arguments that take these values. It also reduces code size quite a bit for nan-boxing builds.
* all: Raise exceptions via mp_raise_XXXJavier Candeira2017-08-13
| | | | | | | | - Changed: ValueError, TypeError, NotImplementedError - OSError invocations unchanged, because the corresponding utility function takes ints, not strings like the long form invocation. - OverflowError, IndexError and RuntimeError etc. not changed for now until we decide whether to add new utility functions.
* all: Use the name MicroPython consistently in commentsAlexander Steffen2017-07-31
| | | | | There were several different spellings of MicroPython present in comments, when there should be only one.
* all: Make more use of mp_raise_{msg,TypeError,ValueError} helpers.Damien George2017-06-15
|
* py: Use size_t as len argument and return type of mp_get_index.Damien George2017-03-23
| | | | | These values are used to compute memory addresses and so size_t is the more appropriate type to use.
* py: Add iter_buf to getiter type method.Damien George2017-02-16
| | | | | | | | | | | | | | | Allows to iterate over the following without allocating on the heap: - tuple - list - string, bytes - bytearray, array - dict (not dict.keys, dict.values, dict.items) - set, frozenset Allows to call the following without heap memory: - all, any, min, max, sum TODO: still need to allocate stack memory in bytecode for iter_buf.
* py/objstr: Convert mp_uint_t to size_t (and use int) where appropriate.Damien George2017-02-16
|
* py/objstr,objstrunicode: Fix inconistent #if indentation.Paul Sokolovsky2016-08-07
|
* py/objstr: Make .partition()/.rpartition() methods configurable.Paul Sokolovsky2016-08-07
| | | | Default is disabled, enabled for unix port. Saves 600 bytes on x86.
* py/objstrunicode: str_index_to_ptr: Implement positive indexing properly.Paul Sokolovsky2016-07-25
| | | | Order out-of-bounds check, completion check, and increment in the right way.
* py/objstrunicode: str_index_to_ptr: Should handle bytes too.Paul Sokolovsky2016-07-25
| | | | | There's single str_index_to_ptr() function, called for both bytes and unicode objects, so should handle each properly.
* py/objstr*: Properly ifdef str.center().Dave Hylands2016-05-22
|
* py/objstr: Implement str.center().Paul Sokolovsky2016-05-22
| | | | | | Disabled by default, enabled in unix port. Need for this method easily pops up when working with text UI/reporting, and coding workalike manually again and again counter-productive.
* py: Use polymorphic iterator type where possible to reduce code size.Damien George2016-01-03
| | | | | | | Only types whose iterator instances still fit in 4 machine words have been changed to use the polymorphic iterator. Reduces Thumb2 arch code size by 264 bytes.
* py: Wrap all obj-ptr conversions in MP_OBJ_TO_PTR/MP_OBJ_FROM_PTR.Damien George2015-11-29
| | | | | | | | | This allows the mp_obj_t type to be configured to something other than a pointer-sized primitive type. This patch also includes additional changes to allow the code to compile when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of mp_uint_t, and various casts.
* py: Add MP_ROM_* macros and mp_rom_* types and use them.Damien George2015-11-29
|
* py: Rename MP_BOOL() to mp_obj_new_bool() for consistency in naming.Paul Sokolovsky2015-10-11
|
* py: Use mp_not_implemented consistently for not implemented features.Damien George2015-09-03
|
* py: Clean up declarations of str type/funcs that are also in unicode.Damien George2015-05-17
| | | | | Background: trying to make an amalgamation of all the code gave some errors with redefined types and inconsistent use of static.
* py: Overhaul and simplify printf/pfenv mechanism.Damien George2015-04-16
| | | | | | | | | | | | | | | | | | | | | | Previous to this patch the printing mechanism was a bit of a tangled mess. This patch attempts to consolidate printing into one interface. All (non-debug) printing now uses the mp_print* family of functions, mainly mp_printf. All these functions take an mp_print_t structure as their first argument, and this structure defines the printing backend through the "print_strn" function of said structure. Printing from the uPy core can reach the platform-defined print code via two paths: either through mp_sys_stdout_obj (defined pert port) in conjunction with mp_stream_write; or through the mp_plat_print structure which uses the MP_PLAT_PRINT_STRN macro to define how string are printed on the platform. The former is only used when MICROPY_PY_IO is defined. With this new scheme printing is generally more efficient (less layers to go through, less arguments to pass), and, given an mp_print_t* structure, one can call mp_print_str for efficiency instead of mp_printf("%s", ...). Code size is also reduced by around 200 bytes on Thumb2 archs.
* py: In str unicode, str_subscr will never be passed a bytes object.Damien George2015-04-04
|
* objstr: Add .splitlines() method.Paul Sokolovsky2015-04-04
| | | | | | | | | splitlines() occurs ~179 times in CPython3 standard library, so was deemed worthy to implement. The method has subtle semantic differences from just .split("\n"). It is also defined as working for any end-of-line combination, but this is currently not implemented - it works only with LF line-endings (which should be OK for text strings on any platforms, but not OK for bytes).
* py: Allow to compile with extra warnings (sign-compare, unused-param).Damien George2015-03-19
|
* py: Remove duplicated mp_obj_str_make_new function from objstrunicode.c.Damien George2015-01-28
|
* objstr: Remove code duplication and unbreak Windows build.Paul Sokolovsky2015-01-23
| | | | | | | | There was really weird warning (promoted to error) when building Windows port. Exact cause is still unknown, but it uncovered another issue: 8-bit and unicode str_make_new implementations should be mutually exclusive, and not built at the same time. What we had is that bytes_decode() pulled 8-bit str_make_new() even for unicode build.
* objstr*: Use separate names for locals_dict of 8-bit and unicode str's.Paul Sokolovsky2015-01-23
| | | | To somewhat unbreak -DSTATIC="" compile.
* py: Add mp_obj_new_str_from_vstr, and use it where relevant.Damien George2015-01-21
| | | | | | | | This patch allows to reuse vstr memory when creating str/bytes object. This improves memory usage. Also saves code ROM: 128 bytes on stmhal, 92 bytes on bare-arm, and 88 bytes on unix x64.
* py, unix: Allow to compile with -Wunused-parameter.Damien George2015-01-20
| | | | See issue #699.
* py: Move to guarded includes, everywhere in py/ core.Damien George2015-01-01
| | | | Addresses issue #1022.
* objstr: Allow to convert any buffer proto object to str.Paul Sokolovsky2014-10-31
| | | | | Original motivation is to support converting bytearrays, but easier to just support buffer protocol at all.