path: root/Lib/test/test_htmlparser.py
Commit message [Author, Age]
* gh-86155: Fix data loss after unclosed script or style tag in HTMLParser (GH-22658) [Waylan Limberg, 8 days]
|     When calling .close() the HTMLParser should flush all remaining content,
|     even when that content is in an unclosed script or style tag.
* gh-77057: Fix handling of invalid markup declarations in HTMLParser (GH-9295) [Ezio Melotti, 8 days]
|     Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* gh-69426: HTMLParser: only unescape properly terminated character entities in attribute values (GH-95215) [Sascha Ißbrücker, 11 days]
|     According to the HTML5 spec, named character references in attribute values
|     should only be processed if they are not followed by an ASCII alphanumeric,
|     or an equals sign.
|     https://html.spec.whatwg.org/multipage/parsing.html#named-character-reference-state
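The attribute-value rule described in gh-69426 can be observed with a small collector. This is only a sketch: `AttrCollector` and `attrs_of` are hypothetical helper names, not part of the test file, and the result for the unterminated `&copy=` case depends on whether the interpreter includes this fix, so no output is claimed for it.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records the attribute list of every start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def attrs_of(markup):
    """Parse markup and return [(tag, [(name, value), ...]), ...]."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


# Properly terminated reference: "&amp;" ends with ";" and is unescaped.
print(attrs_of('<a href="?a=1&amp;b=2">'))

# Unterminated "&copy" followed by "=": per the commit message this should be
# left untouched, but interpreters without the fix may convert it.
print(attrs_of('<a href="?x=1&copy=2">'))
```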
* gh-95813: Improve HTMLParser from the view of inheritance (#95874) [Dong-hee Na, 2022-08-18]
|     * gh-95813: Improve HTMLParser from the view of inheritance
|     * gh-95813: Add unittest
|     * Address code review
* bpo-41748: Handles unquoted attributes with commas (#24072) [Karl Dubost, 2021-02-01]
|     * bpo-41748: Adds tests for unquoted attributes with comma
|     * bpo-41748: Handles unquoted attributes with comma
|     * bpo-41748: Addresses review comments
|     * bpo-41748: Addresses review comments
|     * Adds more test cases
|     * Simplifies the regex for handling spaces
|     * bpo-41748: Moves attributes tests under the right class
|     * bpo-41748: Addresses review about duplicate attributes
|     * bpo-41748: Adds NEWS.d entry for this patch
* bpo-37328: remove deprecated HTMLParser.unescape (GH-14186) [Inada Naoki, 2019-08-27]
|     It is deprecated since Python 3.4.
* #27364: fix "incorrect" uses of escape character in the stdlib. [R David Murray, 2016-09-08]
|     And most of the tools.
|     Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and Martin Panter.
* Issue #23277: Remove unused support.run_unittest import. [Serhiy Storchaka, 2016-04-24]
|
* #23144: merge with 3.4. [Ezio Melotti, 2015-09-06]
|\
| * #23144: Make sure that HTMLParser.feed() returns all the data, even when convert_charrefs is True. [Ezio Melotti, 2015-09-06]
* | #21047: set the default value for the *convert_charrefs* argument of HTMLParser to True. Patch by Berker Peksag. [Ezio Melotti, 2014-08-02]
* | #15114: the strict mode and argument of HTMLParser, HTMLParser.error, and the HTMLParserError exception have been removed. [Ezio Melotti, 2014-08-02]
|/
* #20288: merge with 3.3. [Ezio Melotti, 2014-02-01]
|\
| * #20288: fix handling of invalid numeric charrefs in HTMLParser. [Ezio Melotti, 2014-02-01]
| |
* | #13633: Added a new convert_charrefs keyword arg to HTMLParser that, when True, automatically converts all character references. [Ezio Melotti, 2013-11-23]
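The effect of the convert_charrefs keyword from #13633 can be sketched by logging parser callbacks under both settings. `EventLog` and `events_for` are hypothetical names chosen for illustration; only `HTMLParser` and its documented handler methods come from the standard library.

```python
from html.parser import HTMLParser


class EventLog(HTMLParser):
    """Hypothetical helper: logs text and character-reference callbacks."""

    def __init__(self, *, convert_charrefs):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        self.events.append(("charref", name))


def events_for(markup, *, convert_charrefs):
    """Feed markup to a fresh parser and return the logged events."""
    parser = EventLog(convert_charrefs=convert_charrefs)
    parser.feed(markup)
    parser.close()
    return parser.events


# With convert_charrefs=True the reference arrives already converted in the
# text passed to handle_data(); handle_entityref() is not called.
converted = events_for("<p>a &amp; b</p>", convert_charrefs=True)

# With convert_charrefs=False the reference is reported separately through
# handle_entityref().
raw = events_for("<p>a &amp; b</p>", convert_charrefs=False)

print(converted)
print(raw)
```

Joining the data events rather than comparing single calls keeps the check independent of how the parser chunks text.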
* | #19688: add back and deprecate the internal HTMLParser.unescape() method. [Ezio Melotti, 2013-11-22]
| |
* | #2927: Added the unescape() function to the html module. [Ezio Melotti, 2013-11-19]
| |
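For reference, the module-level function added in #2927 is html.unescape(). A minimal usage sketch follows; the sample string is made up for illustration.

```python
import html

# Named (&eacute;) and numeric (&#232;) references are both converted.
escaped = "&lt;p&gt;caf&eacute; &amp; cr&#232;me&lt;/p&gt;"
unescaped = html.unescape(escaped)
print(unescaped)
```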
* | #19480: merge with 3.3. [Ezio Melotti, 2013-11-07]
|\|
| * #19480: HTMLParser now accepts all valid start-tag names as defined by the HTML5 standard. [Ezio Melotti, 2013-11-07]
* | Merge test_htmlparser changes from 3.3. [Ezio Melotti, 2013-11-02]
|\|
| * Use unittest.main() in test_htmlparser. [Ezio Melotti, 2013-11-02]
| |
* | #15114: The html.parser module now raises a DeprecationWarning when the strict argument of HTMLParser or the HTMLParser.error method are used. [Ezio Melotti, 2013-11-02]
|/
* #17802: Fix an UnboundLocalError in html.parser. Initial tests by Thomas Barlow. [Ezio Melotti, 2013-05-01]
* #15156: HTMLParser now uses the new "html.entities.html5" dictionary. [Ezio Melotti, 2012-06-24]
|
* #15114: the strict mode of HTMLParser and the HTMLParseError exception are deprecated now that the parser is able to parse invalid markup. [Ezio Melotti, 2012-06-23]
* #14538: HTMLParser can now parse correctly start tags that contain a bare /. [Ezio Melotti, 2012-04-18]
|
* HTMLParser is now able to handle slashes in the start tag. [Ezio Melotti, 2012-02-21]
|
* Fix an index and clean up comments. [Ezio Melotti, 2012-02-13]
|
* Improve handling of declarations in HTMLParser. [Ezio Melotti, 2012-02-13]
|
* Fix htmlparser tests to always use the right collector. [Ezio Melotti, 2012-02-13]
|
* #13993: HTMLParser is now able to handle broken end tags when strict=False. [Ezio Melotti, 2012-02-13]
|
* #13960: HTMLParser is now able to handle broken comments when strict=False. [Ezio Melotti, 2012-02-10]
|
* #13576: add tests about the handling of (possibly broken) condcoms. [Ezio Melotti, 2011-12-19]
|
* #13358: HTMLParser now calls handle_data only once for each CDATA. [Ezio Melotti, 2011-11-18]
|
* #1745761, #755670, #13357, #12629, #1200313: improve attribute handling in HTMLParser. [Ezio Melotti, 2011-11-14]
* Group tests about attributes in a separate class. [Ezio Melotti, 2011-11-14]
|
* Make sure that the tolerant parser still parses valid HTML correctly. [Ezio Melotti, 2011-11-01]
|
* Avoid reusing the same collector in the tests. [Ezio Melotti, 2011-11-01]
|
* #12008: add a test. [Ezio Melotti, 2011-11-01]
|
* #670664: Fix HTMLParser to correctly handle the content of ``<script>...</script>`` and ``<style>...</style>``. [Ezio Melotti, 2011-11-01]
* #13273: fix a bug that prevented HTMLParser to properly detect some tags when strict=False. [Ezio Melotti, 2011-10-28]
* #12888: Fix a bug in HTMLParser.unescape that prevented it to escape more than 128 entities. Patch by Peter Otten. [Ezio Melotti, 2011-09-05]
* #7311: fix html.parser to accept non-ASCII attribute values. [Ezio Melotti, 2011-04-07]
|
* Fix Issue10759 - html.parser.unescape() fails on HTML entities with incorrect syntax [Senthil Kumaran, 2010-12-28]
* #1486713: Add a tolerant mode to HTMLParser. [R. David Murray, 2010-12-03]
|     The motivation for adding this option is that the functionality it
|     provides used to be provided by sgmllib in Python2, and was used by, for
|     example, BeautifulSoup. Without this option, the Python3 version of
|     BeautifulSoup and the many programs that use it are crippled.
|
|     The original patch was by 'kxroberto'. I modified it heavily but kept his
|     heuristics and test. I also added additional heuristics to fix #975556,
|     #1046092, and part of #6191. This patch should be completely backward
|     compatible: the behavior with the default strict=True is unchanged.
* Recorded merge of revisions 81500-81501 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk [Victor Stinner, 2010-05-24]
|     r81500 | victor.stinner | 2010-05-24 23:33:24 +0200 (lun., 24 mai 2010) | 2 lines
|         Issue #6662: Fix parsing of malformatted charref (&#bad;)
|     r81501 | victor.stinner | 2010-05-24 23:37:28 +0200 (lun., 24 mai 2010) | 2 lines
|         Add the author of the last fix (Issue #6662)
* Merged revisions 78678,78680,78682 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk [Benjamin Peterson, 2010-03-05]
|     r78678 | benjamin.peterson | 2010-03-04 21:07:59 -0600 (Thu, 04 Mar 2010) | 1 line
|         set svn:eol-style
|     r78680 | benjamin.peterson | 2010-03-04 21:15:07 -0600 (Thu, 04 Mar 2010) | 1 line
|         set svn:eol-style on Lib files
|     r78682 | benjamin.peterson | 2010-03-04 21:20:06 -0600 (Thu, 04 Mar 2010) | 1 line
|         remove the svn:executable property from files that don't have shebang lines
* Change test_htmlparser to reflect the HTMLParser -> html.parser rename in r63439. [Mark Dickinson, 2008-05-21]
|     Also fix one occurrence of unichr() in html.parser.
* #2621 rename test.test_support to test.support [Benjamin Peterson, 2008-05-20]
|
* Merged revisions 60990-61002 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk [Christian Heimes, 2008-02-23]
|     r60990 | eric.smith | 2008-02-23 17:05:26 +0100 (Sat, 23 Feb 2008) | 1 line
|         Removed duplicate Py_CHARMASK define. It's already defined in Python.h.
|     r60991 | andrew.kuchling | 2008-02-23 17:23:05 +0100 (Sat, 23 Feb 2008) | 4 lines
|         #1330538: Improve comparison of xmlrpclib.DateTime and datetime instances.
|         Remove automatic handling of datetime.date and datetime.time.
|         This breaks backward compatibility, but python-dev discussion was strongly
|         against this automatic conversion; see the bug for a link.
|     r60994 | andrew.kuchling | 2008-02-23 17:39:43 +0100 (Sat, 23 Feb 2008) | 1 line
|         #835521: Add index entries for various pickle-protocol methods and attributes
|     r60995 | andrew.kuchling | 2008-02-23 18:10:46 +0100 (Sat, 23 Feb 2008) | 2 lines
|         #1433694: minidom's .normalize() failed to set .nextSibling for last element.
|         Fix by Malte Helmert
|     r61000 | christian.heimes | 2008-02-23 18:40:11 +0100 (Sat, 23 Feb 2008) | 1 line
|         Patch #2167 from calvin: Remove unused imports
|     r61001 | christian.heimes | 2008-02-23 18:42:31 +0100 (Sat, 23 Feb 2008) | 1 line
|         Patch #1957: syslogmodule: Release GIL when calling syslog(3)
|     r61002 | christian.heimes | 2008-02-23 18:52:07 +0100 (Sat, 23 Feb 2008) | 2 lines
|         Issue #2051 and patch from Alexander Belopolsky: Permission for pyc and pyo
|         files are inherited from the py file.