# Changelog
All notable changes to Nokogumbo will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Added
### Changed
### Deprecated
### Removed
### Fixed
### Security
## [2.0.5] - 2021-03-19
### Fixed
- Support Mageia distros when libxml2/libxslt system libraries are install. #165 (Thank you,
@pterjan!)
### Added
- Forward-looking support for a version of Nokogiri that will provide HTML5 parsing. #171
### Improved
- Update extconf.rb to use Nokogiri v1.11's CPPFLAGS for more reliable installation. #163
## [2.0.4] - 2020-11-27
### Fixed
- Fixed a bug where `Nokogiri::HTML5.fragment(nil)` would raise an error. Now
it returns an empty `DocumentFragment` like it did in v2.0.2.
- Fixed assertion failure when a tag immediately followed the UTF-8 BOM.
## [2.0.3] - 2020-11-21
### Added
- Limit enforced on number of attributes per element, defaulting to 400 and
configurable with the `:max_attributes` argument.
### Fixed
- Ignore UTF-8 byte order mark at the beginning of the input.
- Fix content sniffing for Unicode strings.
- Fixed crash where Ruby objects constructed in C can be garbage collected.
## [2.0.2] - 2019-11-19
### Added
- Support Ruby 2.6
### Fixed
- Fix assertion failures with nonstandard HTML tags.
- Fix the handling of mis-nested formatting tags (the adoption agency
algorithm).
- Fix crash with zero-length HTML tags.
### Security
- Prevent 1-byte buffer over read when constructing an error message about an
unexpected EOF.
## [2.0.1] - 2018-11-11
### Fixed
- Fix line numbers on elements from `#line`.
## [2.0.0] - 2018-10-04
### Added
- Experimental support for errors (it was supported in 1.5.0 but
undocumented).
- Added proper HTML5 serialization.
- Added option `:max_errors` to control the maximum number of errors reported
by `#errors`.
- Added option `:max_tree_depth` to control the maximum parse tree depth.
- Line number support via `Nokogiri::XML::Node#line` as long as Nokogumbo has
been compiled with libxml2 support.
### Changed
- Integrated [Gumbo parser](https://github.com/google/gumbo-parser) into
Nokogumbo. A system version will not be used.
- The undocumented (but publicly mentioned) `:max_parse_errors` renamed to `:max_errors`;
`:max_parse_errors` is deprecated and will go away
- The various `#parse` and `#fragment` (and `Nokogiri.HTML5`) methods return
`Nokogiri::HTML5::Document` and `Nokogiri::HTML5::DocumentFragment` classes
rather than `Nokogiri::HTML::Document` and
`Nokogiri::HTML::DocumentFragment`.
- Changed the top-level API to more closely match Nokogiri's while maintaining
backwards compatibility. The new APIs are
* `Nokogiri::HTML5(html, url = nil, encoding = nil, **options, &block)`
* `Nokogiri::HTML5.parse(html, url = nil, encoding = nil, **options, &block)`
* `Nokogiri::HTML5::Document.parse(html, url = nil, encoding = nil, **options, &block)`
* `Nokogiri::HTML5.fragment(html, encoding = nil, **options)`
* `Nokogiri::HTML5::DocumentFragment.parse(html, encoding = nil, **options)`
* `Nokogiri::HTML5::DocumentFragment.new(document, html = nil, ctx = nil)`
* `Nokogiri::HTML5::Document#fragment(html = nil)`
* `Nokogiri::XML::Node#fragment(html = nil)`
In all cases, `html` can be a string or an `IO` object (something that
responds to `#read`). The `url` parameter is entirely for error reporting,
as in Nokogiri. The `encoding` parameter only signals what encoding `html`
should have on input; the output `Document` or `DocumentFragment` will be in
UTF-8. Currently, the only options supported are `:max_errors` which controls
the maximum number of reported by `#errors`.
- Minimum supported version of Ruby changed to 2.1.
- Minimum supported version of Nokogiri changed to 1.8.0.
- `Nokogiri::HTML5::DocumentFragment#errors` returns errors for the document
fragment itself, not the underlying document.
- The five XML namespaces described in the HTML spec, MathML, SVG, XLink, XML,
and XMLNS, are now supported. Thus `<svg>` will create an `svg` element in
the SVG namespace and `<math>` will create a `math` element in the MathML
namespace. An attribute `xml:lang=en`, for example, will create a `lang`
attribute in the XML namespace, **but only in foreign elements (i.e., those
in the SVG or MathML namespaces)**. On HTML elements, this creates an
attribute with the name `xml:lang`. This changes the `#xpath` and related
APIs.
- Calling `#to_xml` on a `Nokogiri::HTML5::Document` will produce XML output
rather than HTML.
### Deprecated
- `:max_parse_errors`; use `:max_errors`
### Fixed
- Fixed documents failing to serialize (via `to_html`) if they contain certain
`meta` elements that set the `charset`.
- Documents are now properly marked as UTF-8 after parsing.
- Fixed `Nokogiri::HTML5.fragment` reporting an error due to a missing
`<!DOCTYPE html>`.
- Fixed crash when input contains U+0000 NULL bytes and error reporting is
enabled.
### Security
- The most recent, released version of Gumbo has a [potential security
issue](https://github.com/google/gumbo-parser/pull/375) that could result in
a cross-site scripting vulnerability. This has been fixed by integrating
Gumbo into Nokogumbo.