author | Dan Lenski <dlenski@gmail.com> | 2025-06-15 12:29:38 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-06-15 15:29:38 -0400 |
commit | 60181f4ed0e48ff35dc296da6b51473bfc553d16 (patch) | |
tree | 1a18b541bea0f55398a0d46cc899a6468f63cf46 /Lib/test/test_email | |
parent | 54e29ea4eb7b54c888fd5764eef2215535e4d862 (diff) | |
gh-67022: Document bytes/str inconsistency in email.header.decode_header() and suggest email.headerregistry.HeaderRegistry as a sane alternative (#92900)
* gh-67022: Document bytes/str inconsistency in email.header.decode_header()
This function's possible return types have been surprising and error-prone
for the entirety of its Python 3.x history. It can return either:
1. `typing.List[typing.Tuple[bytes, typing.Optional[str]]]` of length >1
2. or `typing.List[typing.Tuple[str, None]]`, of length exactly 1
This means that any user of this function must be prepared to accept either
`bytes` or `str` for the first member of the 2-tuples it returns, which is a
very surprising behavior in Python 3.x, particularly given that the second
member of the tuple is supposed to represent the charset/encoding of the
first member.
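The mixed return types described above can be seen with a short sketch. The helper `normalize_parts` and its `fallback` parameter are hypothetical names introduced only for illustration; they are not part of the `email` package, and the fallback decoding with `errors="replace"` is a simplifying assumption (it does not handle special charsets such as `unknown-8bit`).

```python
from email.header import decode_header


def normalize_parts(raw_header, fallback="ascii"):
    """Hypothetical helper: return decode_header() fragments as str.

    Handles both shapes that decode_header() can produce: str values
    (no encoded words found) and bytes values (encoded words found).
    """
    fragments = []
    for value, charset in decode_header(raw_header):
        if isinstance(value, bytes):
            # Assumption: fall back to ASCII when no charset is reported.
            value = value.decode(charset or fallback, errors="replace")
        fragments.append(value)
    return fragments


# No encoded words: the single tuple holds a str.
plain = decode_header("header without encoded words")

# At least one encoded word: every tuple holds bytes.
mixed = decode_header("=?utf-8?q?caf=C3=A9?= menu")

print([type(value).__name__ for value, _ in plain])
print([type(value).__name__ for value, _ in mixed])
print(normalize_parts("=?utf-8?q?caf=C3=A9?= menu"))
```

A caller that skips the `isinstance` check will work for one kind of header and fail for the other, which is the error-prone behavior this patch documents.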
This patch documents the behavior of this function, and adds test cases
to demonstrate it.
As discussed in bpo-22833, this cannot be changed in a backwards-compatible
way, and some users of this function depend precisely on the existing
behavior.
Add warnings about obsolescence of 'email.header.decode_header' and 'email.header.make_header' functions.
Recommend use of `email.headerregistry.HeaderRegistry` instead, as suggested
in https://github.com/python/cpython/pull/92900#discussion_r1112472177
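As a rough illustration of the recommended alternative, the sketch below builds a header object with a default `HeaderRegistry` instance. The header name and value are made-up sample data, and this is only one plausible way to use the registry, not necessarily the usage the reviewers had in mind.

```python
from email.headerregistry import HeaderRegistry

# A registry with default settings maps header names to header classes.
registry = HeaderRegistry()

# Calling the registry parses the raw value and decodes any encoded words.
subject = registry("Subject", "=?utf-8?q?caf=C3=A9?= menu")

# The result is a str subclass, so there is no bytes/str ambiguity to handle.
print(isinstance(subject, str))
print(subject.name)
print(str(subject))
```

Because the returned object is always a `str`, callers do not need to branch on the type of the decoded value as they do with `decode_header()`.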
Diffstat (limited to 'Lib/test/test_email')
-rw-r--r-- | Lib/test/test_email/test_email.py | 12 |
1 files changed, 12 insertions, 0 deletions
diff --git a/Lib/test/test_email/test_email.py b/Lib/test/test_email/test_email.py
index 8765d121fd0..b8116d073a2 100644
--- a/Lib/test/test_email/test_email.py
+++ b/Lib/test/test_email/test_email.py
@@ -2568,6 +2568,18 @@ Re: =?mac-iceland?q?r=8Aksm=9Arg=8Cs?= baz foo bar =?mac-iceland?q?r=8Aksm?=
         self.assertEqual(str(make_header(decode_header(s))),
                          '"Müller T" <T.Mueller@xxx.com>')
 
+    def test_unencoded_ascii(self):
+        # bpo-22833/gh-67022: returns [(str, None)] rather than [(bytes, None)]
+        s = 'header without encoded words'
+        self.assertEqual(decode_header(s),
+            [('header without encoded words', None)])
+
+    def test_unencoded_utf8(self):
+        # bpo-22833/gh-67022: returns [(str, None)] rather than [(bytes, None)]
+        s = 'header with unexpected non ASCII caract\xe8res'
+        self.assertEqual(decode_header(s),
+            [('header with unexpected non ASCII caract\xe8res', None)])
+
 
 # Test the MIMEMessage class
 class TestMIMEMessage(TestEmailBase):