| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
|
| |
(gh-134306)
Co-authored-by: David Goncalves <davegoncalves@gmail.com>
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
handler (GH-129648)
If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().
_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
|
|
|
|
| |
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
|
|
|
|
|
|
| |
If the cpXXX encoding is not directly implemented in Python, fall back
to use the Windows-specific API codecs.code_page_encode() and
codecs.code_page_decode().
|
|
|
|
|
|
|
|
|
|
| |
Run them with different locales and different date and time.
Add the @run_with_locales() decorator to run the test with multiple
locales.
Improve the run_with_locale() context manager/decorator -- it now
catches only expected exceptions and reports the test as skipped if no
appropriate locale is available.
|
|
|
|
| |
Ignore linter "imported but unused" warnings in tests when the linter
doesn't understand how the import is used.
|
|
|
|
|
|
| |
Split unicode.c tests of _testcapi into two parts: limited C API
tests in _testlimitedcapi and non-limited C API tests in _testcapi.
Update test_codecs.
|
|
|
|
|
| |
UnicodeDecodeError (#113674)
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
|
|
|
|
|
|
|
|
| |
Any capitalization of "xn--" should be acceptable for the ACE prefix
(see https://tools.ietf.org/html/rfc3490#section-5).
Co-authored-by: Pepijn de Vos <pepijndevos@gmail.com>
Co-authored-by: Erlend E. Aasland <erlend@python.org>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
|
| |
|
|
|
| |
Co-authored-by: Robert Lehmann <mail@robertlehmann.de>
|
|
|
|
|
|
|
|
|
| |
Attempts to pickle or create a shallow or deep copy of codecs streams
now raise a TypeError.
Previously, copying failed with a RecursionError, while pickling
produced wrong results that eventually caused unpickling to fail with
a RecursionError.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
There was an unnecessary quadratic loop in idna decoding. This restores
the behavior to linear.
This also adds an early length check in IDNA decoding to outright reject
huge inputs early on given the ultimate result is defined to be 63 or fewer
characters.
|
|
|
|
| |
conversion to binary mode (#94370)
|
| |
|
|
|
|
| |
(GH-91668)
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
encodings registers the _alias_mbcs() codec search function before
the search_function() codec search function. Previously, the
_alias_mbcs() was never used.
Fix the test_codecs.test_mbcs_alias() test: use the current ANSI code
page, not a fake ANSI code page number.
Remove the test_site.test_aliasing_mbcs() test: the alias is now
implemented in the encodings module, no longer in the site module.
|
|
|
|
|
|
|
|
|
|
| |
Move almost all private functions of Include/cpython/fileutils.h to
the internal C API Include/internal/pycore_fileutils.h.
Only keep _Py_fopen_obj() in Include/cpython/fileutils.h, since it's
used by _testcapi which must not use the internal C API.
Move EncodeLocaleEx() and DecodeLocaleEx() functions from _testcapi
to _testinternalcapi, since the C API moved to the internal C API.
|
| |
|
|
|
|
|
|
|
|
|
| |
"raw-unicode-escape" codec (GH-28944)
They support now splitting escape sequences between input chunks.
Add the third parameter "final" in codecs.raw_unicode_escape_decode().
It is True by default to match the former behavior.
|
|
|
|
|
|
|
|
|
| |
codec (GH-28939)
They support now splitting escape sequences between input chunks.
Add the third parameter "final" in codecs.unicode_escape_decode().
It is True by default to match the former behavior.
|
|
|
|
|
| |
open(), io.open(), codecs.open() and fileinput.FileInput no longer
accept "U" ("universal newline") in the file mode. This flag was
deprecated since Python 3.3.
|
|
|
|
| |
(GH-19940)
|
|
|
|
| |
* Move the codecs' (un)register operation to testcases.
* Remove _codecs._forget_codec() and _PyCodec_Forget()
|
|
|
|
| |
(GH-22219)
|
| |
|
|
|
|
| |
Add codecs.unregister() and PyCodec_Unregister() functions
to unregister a codec search function.
|
|
|
| |
Rename 5 test method names in test_codecs and test_typing.
|
| |
|
|
|
|
|
|
|
| |
(GH-16959)" (GH-18767)
This reverts commit e471e72977c83664f13d041c78549140c86c92de.
The mode will be removed from Python 3.10.
|
|
|
|
|
|
| |
Open issue in the BPO indicated a desire to make the implementation of
codecs.open() at parity with io.open(), which implements a try/except to
assure file stream gets closed before an exception is raised.
|
|
|
|
| |
Trying to decode an invalid string with the punycode codec
shoud raise UnicodeError.
|
|
|
|
|
|
|
| |
creating cycles (GH-17246)
Capturing exceptions into names can lead to reference cycles though the __traceback__ attribute of the exceptions in some obscure cases that have been reported previously and fixed individually. As these variables are not used anyway, we can remove the binding to reduce the chances of creating reference cycles.
See for example GH-13135
|
|
|
|
|
| |
open(), io.open(), codecs.open() and fileinput.FileInput no longer
accept "U" ("universal newline") in the file mode. This flag was
deprecated since Python 3.3.
|
|
|
|
| |
The Rot-13 codec is for educational use but does not have unit tests,
dragging down test coverage. This adds a few very simple tests.
|
|
|
|
| |
improves decoding performance (GH-15083)
|
| |
|
|
|
|
|
|
|
| |
* The UTF-8 incremental decoders fails now fast if encounter
a sequence that can't be handled by the error handler.
* The UTF-16 incremental decoders with the surrogatepass error
handler decodes now a lone low surrogate with final=False.
|
|
|
| |
CP65001Test has been removed.
|
| |
|
|
|
|
|
|
| |
A very simple fix. I found this while writing typeshed stubs for StreamRecoder.
https://bugs.python.org/issue33482
|
| |
|
| |
|
|
|
|
| |
The bug occurred when the encoded surrogate character is passed
to the incremental decoder in two chunks.
|
| |
|