path: root/py/objstrunicode.c
Commit message (Author, Date)
* py: Use polymorphic iterator type where possible to reduce code size. (Damien George, 2016-01-03)
  Only types whose iterator instances still fit in 4 machine words have been changed to use the polymorphic iterator. Reduces Thumb2 arch code size by 264 bytes.
* py: Wrap all obj-ptr conversions in MP_OBJ_TO_PTR/MP_OBJ_FROM_PTR. (Damien George, 2015-11-29)
  This allows the mp_obj_t type to be configured to something other than a pointer-sized primitive type. This patch also includes additional changes to allow the code to compile when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of mp_uint_t, and various casts.
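The idea of routing every object/pointer conversion through a pair of macros can be sketched roughly as below. This is only an illustration, not MicroPython's actual definitions: `obj_t`, `OBJ_TO_PTR`, `OBJ_FROM_PTR`, `point_t` and `point_sum` are hypothetical names, and the sketch assumes an object handle that is a 64-bit integer, which may be wider than a pointer on 32-bit targets.

```c
#include <stdint.h>

/* Hypothetical object handle: a fixed 64-bit integer, which may be wider
   than a pointer on 32-bit targets. Not MicroPython's real mp_obj_t. */
typedef uint64_t obj_t;

/* All handle<->pointer conversions go through these two macros, so the
   handle representation can change without touching the call sites. */
#define OBJ_TO_PTR(o)   ((void *)(uintptr_t)(o))
#define OBJ_FROM_PTR(p) ((obj_t)(uintptr_t)(p))

/* Hypothetical heap object used only for this illustration. */
typedef struct {
    int x;
    int y;
} point_t;

/* Call sites never cast obj_t directly; they use the macros. */
static int point_sum(obj_t o) {
    point_t *p = OBJ_TO_PTR(o);
    return p->x + p->y;
}
```

With conversions centralised like this, code that previously assumed a handle and a pointer have the same size only needs to change in the two macro definitions.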
* py: Add MP_ROM_* macros and mp_rom_* types and use them. (Damien George, 2015-11-29)
* py: Rename MP_BOOL() to mp_obj_new_bool() for consistency in naming. (Paul Sokolovsky, 2015-10-11)
* py: Use mp_not_implemented consistently for not implemented features. (Damien George, 2015-09-03)
* py: Clean up declarations of str type/funcs that are also in unicode. (Damien George, 2015-05-17)
  Background: trying to make an amalgamation of all the code gave some errors with redefined types and inconsistent use of static.
* py: Overhaul and simplify printf/pfenv mechanism. (Damien George, 2015-04-16)
  Previous to this patch the printing mechanism was a bit of a tangled mess. This patch attempts to consolidate printing into one interface.

  All (non-debug) printing now uses the mp_print* family of functions, mainly mp_printf. All these functions take an mp_print_t structure as their first argument, and this structure defines the printing backend through the "print_strn" function of said structure.

  Printing from the uPy core can reach the platform-defined print code via two paths: either through mp_sys_stdout_obj (defined per port) in conjunction with mp_stream_write; or through the mp_plat_print structure which uses the MP_PLAT_PRINT_STRN macro to define how strings are printed on the platform. The former is only used when MICROPY_PY_IO is defined.

  With this new scheme printing is generally more efficient (fewer layers to go through, fewer arguments to pass), and, given an mp_print_t* structure, one can call mp_print_str for efficiency instead of mp_printf("%s", ...). Code size is also reduced by around 200 bytes on Thumb2 archs.
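The shape of the interface described above, a structure whose "print_strn" function defines the backend, might look something like the following. This is a minimal sketch under assumptions, not the code from the commit: every `sketch_*` name is hypothetical, and the buffer backend is invented purely to show how a backend plugs in.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the structure described in the commit message. */
typedef void (*sketch_print_strn_t)(void *data, const char *str, size_t len);

typedef struct {
    void *data;                      /* backend-specific state */
    sketch_print_strn_t print_strn;  /* writes len bytes of str to the backend */
} sketch_print_t;

/* Printing a plain string calls the backend directly, with no format parsing. */
static void sketch_print_str(const sketch_print_t *print, const char *str) {
    print->print_strn(print->data, str, strlen(str));
}

/* Example backend: append into a fixed-size, NUL-terminated buffer. */
typedef struct {
    char buf[64];
    size_t len;
} sketch_buf_t;

static void sketch_buf_print_strn(void *data, const char *str, size_t len) {
    sketch_buf_t *b = data;
    size_t room = sizeof(b->buf) - 1 - b->len;  /* keep space for the NUL */
    if (len > room) {
        len = room;  /* truncate silently in this sketch */
    }
    memcpy(b->buf + b->len, str, len);
    b->len += len;
    b->buf[b->len] = '\0';
}
```

A caller would fill in a `sketch_print_t` with a pointer to a `sketch_buf_t` and `sketch_buf_print_strn`, then pass that structure as the first argument to every print helper. Swapping the backend (a UART, a stream object, a buffer) then means changing only that one structure.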
* py: In str unicode, str_subscr will never be passed a bytes object. (Damien George, 2015-04-04)
* objstr: Add .splitlines() method. (Paul Sokolovsky, 2015-04-04)
  splitlines() occurs ~179 times in CPython3 standard library, so was deemed worthy to implement. The method has subtle semantic differences from just .split("\n"). It is also defined as working for any end-of-line combination, but this is currently not implemented - it works only with LF line-endings (which should be OK for text strings on any platforms, but not OK for bytes).
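The semantic difference between splitlines() and .split("\n") shows up at the end of the string and for empty input. The sketch below only counts the resulting pieces rather than building lists, and it recognises LF only, matching the limitation stated above. Both function names are hypothetical and this is not the code added by the commit.

```c
#include <stddef.h>

/* Number of pieces s.split("\n") yields: one more than the number of LFs. */
static size_t count_split_lf(const char *s, size_t len) {
    size_t pieces = 1;
    for (size_t i = 0; i < len; i++) {
        if (s[i] == '\n') {
            pieces++;
        }
    }
    return pieces;
}

/* Number of lines s.splitlines() yields when only LF is recognised:
   a trailing LF does not start a new empty line, and "" yields no lines. */
static size_t count_splitlines_lf(const char *s, size_t len) {
    size_t lines = 0;
    size_t start = 0;  /* byte offset where the current line begins */
    for (size_t i = 0; i < len; i++) {
        if (s[i] == '\n') {
            lines++;
            start = i + 1;
        }
    }
    if (start < len) {
        lines++;  /* final line without a terminating LF */
    }
    return lines;
}
```

For the input "a\nb\n", split("\n") produces three pieces (the last one empty) while splitlines() produces two. For the empty string, split("\n") produces one empty piece and splitlines() produces none.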
* py: Allow to compile with extra warnings (sign-compare, unused-param). (Damien George, 2015-03-19)
* py: Remove duplicated mp_obj_str_make_new function from objstrunicode.c. (Damien George, 2015-01-28)
* objstr: Remove code duplication and unbreak Windows build. (Paul Sokolovsky, 2015-01-23)
  There was a really weird warning (promoted to error) when building the Windows port. The exact cause is still unknown, but it uncovered another issue: 8-bit and unicode str_make_new implementations should be mutually exclusive, and not built at the same time. What we had is that bytes_decode() pulled in the 8-bit str_make_new() even for the unicode build.
* objstr*: Use separate names for locals_dict of 8-bit and unicode str's. (Paul Sokolovsky, 2015-01-23)
  To somewhat unbreak -DSTATIC="" compile.
* py: Add mp_obj_new_str_from_vstr, and use it where relevant. (Damien George, 2015-01-21)
  This patch allows to reuse vstr memory when creating str/bytes object. This improves memory usage. Also saves code ROM: 128 bytes on stmhal, 92 bytes on bare-arm, and 88 bytes on unix x64.
* py, unix: Allow to compile with -Wunused-parameter. (Damien George, 2015-01-20)
  See issue #699.
* py: Move to guarded includes, everywhere in py/ core. (Damien George, 2015-01-01)
  Addresses issue #1022.
* objstr: Allow to convert any buffer proto object to str. (Paul Sokolovsky, 2014-10-31)
  Original motivation is to support converting bytearrays, but easier to just support buffer protocol at all.
* py: Simplify JSON str printing (while still conforming to JSON spec). (Damien George, 2014-09-25)
  The JSON specs are relatively flexible and allow us to use one function to print strings, be they ascii, bytes or utf-8 encoded.
* py: Add native json printing using existing print framework. (Damien George, 2014-09-17)
  Also add start of ujson module with dumps implemented. Enabled in unix and stmhal ports. Test passes on both.
* py: Change uint to mp_uint_t in runtime.h, stackctrl.h, binary.h. (Damien George, 2014-08-30)
  Part of code cleanup, working towards resolving issue #50.
* Change some parts of the core API to use mp_uint_t instead of uint/int. (Damien George, 2014-08-30)
  Addressing issue #50, still some way to go yet.
* py: Make MP_OBJ_NEW_SMALL_INT cast arg to mp_int_t itself. (Damien George, 2014-07-31)
  Addresses issue #724.
* Rename machine_(u)int_t to mp_(u)int_t. (Damien George, 2014-07-03)
  See discussion in issue #50.
* py: Make unichar_charlen() accept/return machine_uint_t. (Paul Sokolovsky, 2014-06-28)
* py: Small comments, name changes, use of machine_int_t. (Damien George, 2014-06-28)
* objstrunicode: Refactor str_index_to_ptr() following objstr. (Paul Sokolovsky, 2014-06-27)
* objstrunicode: Signedness issues. (Paul Sokolovsky, 2014-06-27)
* objstrunicode: Implement iterator. (Paul Sokolovsky, 2014-06-27)
* objstrunicode: Re-add buffer protocol back for now, required for io.StringIO. (Paul Sokolovsky, 2014-06-27)
* objstrunicode: Revamp len() handling for unicode, and optimize bool(). (Paul Sokolovsky, 2014-06-27)
* objstrunicode: Get rid of bytes checking, it's separate type. (Paul Sokolovsky, 2014-06-27)
* py: Prune unneeded code from objstrunicode, reuse code in objstr. (Paul Sokolovsky, 2014-06-27)
* objstrunicode: Basic implementation of unicode handling. (Chris Angelico, 2014-06-27)
  Squashed commit of the following:

  commit 99dc21b67a895dc10d3c846bc158d27c839cee48
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Thu Jun 12 02:18:54 2014 +1000

      Optimize as per TODO (thanks Damien!)

  commit 5bf0153ecad8348443058d449d74504fc458fe51
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Tue Jun 10 08:42:06 2014 +1000

      Test a default (= UTF-8) encode and decode

  commit c962057ac340832c4fde60896f656a3fe3ad78a9
  Merge: e2c9782 195de32
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Tue Jun 10 05:23:03 2014 +1000

      Merge branch 'master' into unicode, resolving conflict on py/obj.h

  commit e2c9782a65eb57f481d441d40161de427e1940ba
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Tue Jun 10 05:05:57 2014 +1000

      More whitespace fixups

  commit 086a2a0f57afbc1f731697fd5d3a0cbbb80e5418
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Tue Jun 10 05:04:20 2014 +1000

      Properly implement string slicing

  commit 0d339a143e2b6442366145e7f3d64aada293eaa0
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Tue Jun 10 02:24:11 2014 +1000

      Support slicing in str_index_to_ptr, and fix a bounds error

  commit 24371c7267d360e77cf5eabc2e8ce9a73d2ee0da
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Tue Jun 10 02:10:22 2014 +1000

      Break out index-to-pointer calculation into a function

  commit 616c24ac014c3ca56008428c506034dd1bfff7a8
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Tue Jun 10 02:03:11 2014 +1000

      Add tests of string slicing, which currently fail

  commit a24d19f676fe8cc21dad512d91b826892e162a5b
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Tue Jun 10 01:56:53 2014 +1000

      Change string indexing to not precalculate the charlen, and add test for neg indexing

  commit 0bcc7ab89eafb2ae53195e94c9bea42a4e886b64
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sun Jun 8 22:09:17 2014 +1000

      Clean up constant qstr declarations now that charlen isn't needed

  commit 5473e1a1dba2124b7b0c207f2964293cfbe80167
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sun Jun 8 07:18:42 2014 +1000

      Remove the charlen field from strings, calculating it when required

  commit 5c1658ec71aefbdc88c261ce2e57dc7670cdc6ef
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sun Jun 8 07:11:27 2014 +1000

      Get rid of mp_obj_str_get_data_len() which was used in only one place

  commit a019ba968b4e8daf7f3674f63c5cc400e304c509
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sun Jun 8 06:58:26 2014 +1000

      Add a unichar_charlen() function to calculate length-in-characters from length-in-bytes

  commit 44b0d5cff846ba487c526ed95be1b3d1cd3d762a
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sun Jun 8 06:32:44 2014 +1000

      Use utf8_get/next_char in building up a string's repr

  commit 30d1bad33f7af90f1971987c39864c8fcf3f5c21
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sun Jun 8 06:10:45 2014 +1000

      Make utf8_get_char() and utf8_next_char() actually do what their names say

  commit bc990dad9afb8ec112f5e7f7f79d5ab415da0e72
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sun Jun 8 02:10:59 2014 +1000

      Revert "Add PEP 393-flags to strings and stub usage."

      This reverts commit c239f509521d1a0f9563bf9c5de0c4fb9a6a33ba.

  commit f9bebb28ad52467f2f2d7a752bb033296b6c2f9b
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 15:41:48 2014 +1000

      Whitespace fixes

  commit 279de0c8eb3cb186914799ccc5ee94ea97f56de4
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 15:28:35 2014 +1000

      Formatting/layout improvements - introduce macros for UTF-8 byte detection, add braces. No functional changes.

  commit f1911f53d56da809c97b07245f5728a419e8fb30
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 11:56:02 2014 +1000

      Make chr() Unicode-aware

  commit f51ad737b48ac04c161197a4012821d50885c4c7
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 11:44:07 2014 +1000

      Make a string's repr Unicode-aware

  commit 01bd68684611585d437982dccdf05b33cbedc630
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 11:33:43 2014 +1000

      Expand the Unicode tests

  commit 7bc91904f899f8012089fc14a06495680a51e590
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 11:27:30 2014 +1000

      Record byte lengths for byte strings

  commit bb132120717cf176dcfb26f87fa309378f76ab5f
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 11:25:06 2014 +1000

      Make ord() Unicode-aware

  commit 03f0cbe9051b62192be97b59f84f63f9216668bf
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 10:24:35 2014 +1000

      Retain characters as UTF-8 encoded Unicode

  commit e924659b85c001916a5ff7f4d1d8b3ebe2bf0c2f
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 08:37:27 2014 +1000

      Add support for \u and \U escapes, but not \N (with explanatory comment)

  commit 231031ac5f0346e4ffcf9c4abec2bd33f566232c
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Sat Jun 7 05:09:35 2014 +1000

      Add character length to qstr

  commit 6df1b946fb17d8d5df3d91b21cde627c3d4556a8
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Fri Jun 6 13:48:36 2014 +1000

      Add test of UTF-8 encoded source file resulting in properly formed string

  commit 16429b81a8483cf25865ed11afd81a7d9c253c26
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Fri Jun 6 13:44:15 2014 +1000

      Make len(s) return character length (even though creation's still buggy)

  commit cd2cf6663cc47831dbc97819ad5c50ad33f939d3
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Fri Jun 6 13:15:36 2014 +1000

      HACK - When indexing a qstr, count its charlen. Stupidly inefficient but POC. All tests pass now, though string creation is still buggy.

  commit 47c234584d3358dfa6b4003d5e7264105d17b8f7
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Fri Jun 6 13:15:32 2014 +1000

      objstr: Record character length separately from byte length

      CAUTION: Buggy, may crash stuff - qstr needs equivalent functionality too

  commit b0f41c72af27d3b361027146025877b3d7e8785c
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Fri Jun 6 05:37:36 2014 +1000

      Beginnings of UTF-8 support - construct strings from that many UTF-8-encoded chars, and subscript bytes the same way

  commit 89452be641674601e9bfce86dc71c17c3140a6cf
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Fri Jun 6 05:28:47 2014 +1000

      Update comments - now aiming for UTF-8 rather than PEP 393 strings

  commit c239f509521d1a0f9563bf9c5de0c4fb9a6a33ba
  Author: Chris Angelico <rosuav@gmail.com>
  Date: Wed Jun 4 05:28:12 2014 +1000

      Add PEP 393-flags to strings and stub usage.

      The test suite all passes, but nothing has actually been changed.
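Several of the squashed commits above deal with computing a string's length in characters from its UTF-8 bytes and with mapping a character index to a byte pointer. A rough sketch of both operations follows. It assumes valid UTF-8 input and non-negative indices, uses hypothetical `sketch_*` / `SKETCH_*` names, and is not the actual unichar_charlen() or str_index_to_ptr() code.

```c
#include <stddef.h>

/* A byte of the form 10xxxxxx continues a multi-byte UTF-8 sequence. */
#define SKETCH_UTF8_IS_CONT(b) (((b) & 0xC0) == 0x80)

/* Length in characters of a valid UTF-8 buffer of len bytes:
   count every byte that is not a continuation byte. */
static size_t sketch_charlen(const unsigned char *s, size_t len) {
    size_t chars = 0;
    for (size_t i = 0; i < len; i++) {
        if (!SKETCH_UTF8_IS_CONT(s[i])) {
            chars++;
        }
    }
    return chars;
}

/* Pointer to the first byte of character number index (0-based),
   or NULL if index is out of range. Assumes valid UTF-8. */
static const unsigned char *sketch_index_to_ptr(const unsigned char *s, size_t len, size_t index) {
    for (size_t i = 0; i < len; i++) {
        if (!SKETCH_UTF8_IS_CONT(s[i])) {
            if (index == 0) {
                return s + i;
            }
            index--;
        }
    }
    return NULL;
}
```

Both functions walk the buffer linearly, which is the cost of storing only the byte length: character length and indexing become O(n) in exchange for not keeping a separate charlen field on every string.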
* objstrunicode: Complete copy of objstr, to be patched for unicode support. (Paul Sokolovsky, 2014-06-27)