#parser #html-parser #html5ever #dom #testing #automated-testing #web

markup5ever_rcdom

Basic, unsupported DOM structure for use by tests in html5ever/xml5ever

5 releases (breaking)

0.5.0-unofficial Sep 20, 2024
0.4.0-unofficial Aug 10, 2024
0.3.0 Mar 23, 2024
0.2.0 Aug 18, 2022
0.1.0 Dec 19, 2019

#208 in Web programming

88,503 downloads per month
Used in 231 crates (43 directly)

MIT/Apache

740KB
12K SLoC

This crate is built for the express purpose of writing automated tests for the html5ever and xml5ever crates. It is not intended to be a production-quality DOM implementation, and has not been fuzzed or tested against arbitrary, malicious, or nontrivial inputs. No maintenance or support for any such issues will be provided. If you use this DOM implementation in a production, user-facing system, you do so at your own risk.


lib.rs:

A simple reference-counted DOM.

This is sufficient as a static parse tree, but don't build a web browser using it. :)

A DOM is a tree structure with ordered children that can be represented in an XML-like format. For example, the following graph

div
 +- "text node"
 +- span

in HTML would be serialized as

<div>text node<span></span></div>

See the Document Object Model article on Wikipedia for more information.
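To make the mapping from tree to markup concrete, here is a minimal, std-only sketch that builds the example tree and serializes it. The `Node` enum and the `serialize` and `example_tree` functions are hypothetical names invented for illustration; they are not this crate's types, and the sketch skips escaping, attributes, and void elements.

```rust
// Hypothetical node type for illustration only; not the crate's API.
enum Node {
    Element { name: String, children: Vec<Node> },
    Text(String),
}

// Writes `node` and its descendants into `out`, in document order.
// Assumption: no escaping, attributes, or void-element handling.
fn serialize(node: &Node, out: &mut String) {
    match node {
        Node::Text(text) => out.push_str(text),
        Node::Element { name, children } => {
            out.push('<');
            out.push_str(name);
            out.push('>');
            for child in children {
                serialize(child, out);
            }
            out.push_str("</");
            out.push_str(name);
            out.push('>');
        }
    }
}

// Builds the example tree: a div holding a text node and an empty span.
fn example_tree() -> Node {
    Node::Element {
        name: "div".to_string(),
        children: vec![
            Node::Text("text node".to_string()),
            Node::Element { name: "span".to_string(), children: vec![] },
        ],
    }
}
```

Calling `serialize(&example_tree(), &mut out)` on an empty `String` walks the children in order, which is why the text node appears before the span in the output.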

This implementation stores the information associated with each node once, and then hands out references to children. The nodes themselves are reference-counted to avoid copying: if you hold your own reference to a node, that node can outlive the document it came from. Nodes own their children, but hold only weak references to their parents.
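The ownership scheme described above can be sketched with the standard library's `Rc` and `Weak`. This is an illustrative model under stated assumptions, not the crate's actual definitions: the `Node` struct and the `new_node`, `append`, and `detached_span` helpers are hypothetical names.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type for illustration only; not the crate's API.
struct Node {
    name: String,
    // Weak link upward, so a child never keeps its parent alive.
    parent: RefCell<Weak<Node>>,
    // Strong links downward: a node owns its children.
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

// Attaches `child` under `parent` and records the weak back-reference.
fn append(parent: &Rc<Node>, child: Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(child);
}

// Builds document -> div -> span, keeps an extra handle to the span,
// and lets the rest of the tree drop when the function returns.
fn detached_span() -> Rc<Node> {
    let document = new_node("document");
    let div = new_node("div");
    let span = new_node("span");
    append(&div, Rc::clone(&span));
    append(&document, div);
    span
}
```

In this sketch the handle returned by `detached_span` stays valid after the document is gone, while upgrading its parent link yields `None` because nothing kept the `div` alive. Using weak parent links also avoids reference cycles, which would otherwise prevent the tree from ever being freed.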

Dependencies

~0.7–5.5MB
~22K SLoC