#html-parser #html #parser

toks

Efficient tokens for parsing html5ever::rcdom::RcDom Handles, aiming for O(1) HTML DOM walking.

14 releases (4 stable)

1.3.0 Aug 2, 2024
1.2.0 Jul 21, 2023
1.1.0 Aug 21, 2022
1.0.0 Apr 11, 2020
0.4.0 Nov 7, 2017

#2097 in Web programming


743 downloads per month

MIT/Apache

12KB
71 lines



Documentation

Usage

Get started by looking at the examples.

License

MIT/Apache-2.0


lib.rs:

Efficient tokens for rcdom::RcDom Handle parsing aiming for O(1) HTML DOM walking.

This library aims to provide convenient and efficient handling of HTML DOM elements.

Examples

 extern crate toks;
 #[macro_use]
 extern crate html5ever;

 use toks::prelude::*;
 use std::io::{self, Read};

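 // Counts the `<a>` elements seen while walking the DOM.
 pub struct LinkTok {
     total: u32,
 }

 impl Tok for LinkTok {
     // Match on the element's local name, so only `<a>` tags are processed.
     fn is_match(&self, qn: &QualName) -> bool {
         qn.local == local_name!("a")
     }

     // Both arguments are ignored here; each matching element only bumps the counter.
     fn process(&mut self, _: &mut Vec<Attribute>, _: &mut Vec<Handle>) {
         self.total += 1;
     }
 }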

 // How to use
 // $ cargo build --example count_links
 // $ cat your.html | ./target/debug/examples/count_links
 // Link <a> count 9
 fn main() {
     let mut chunk = String::new();
     io::stdin().read_to_string(&mut chunk).unwrap();

     let dom = parse_document(RcDom::default(), Default::default()).one(chunk);

     let mut lt = LinkTok { total: 0 };

     // Scope the mutable borrow of `lt` so it ends before `lt.total` is read
     {
         recursion(&mut vec![&mut lt], dom.document);
     }

     println!("Link <a> count {}", lt.total);
 }
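
The example above needs toks and html5ever to build. As a rough, dependency-free illustration of the same idea (a matcher that is offered every element during one recursive walk), here is a minimal sketch. `Node`, `Matcher`, `LinkCounter`, `walk` and `count_links` are hypothetical names invented for this illustration. They are not part of toks or html5ever, and the sketch does not claim to mirror how toks is implemented.

```rust
// Hypothetical, dependency-free model of the matcher-plus-walker pattern.
// None of these types come from toks or html5ever.

/// A toy DOM node: a tag name and its children.
struct Node {
    name: String,
    children: Vec<Node>,
}

impl Node {
    fn new(name: &str, children: Vec<Node>) -> Node {
        Node { name: name.to_string(), children }
    }
}

/// Stand-in for a token: decides whether it cares about a node, then handles it.
trait Matcher {
    fn is_match(&self, name: &str) -> bool;
    fn process(&mut self, node: &Node);
}

/// Counts `<a>` elements.
struct LinkCounter {
    total: u32,
}

impl Matcher for LinkCounter {
    fn is_match(&self, name: &str) -> bool {
        name == "a"
    }

    fn process(&mut self, _node: &Node) {
        self.total += 1;
    }
}

/// Depth-first walk that offers every node to every matcher exactly once.
fn walk(matchers: &mut [&mut dyn Matcher], node: &Node) {
    for m in matchers.iter_mut() {
        if m.is_match(&node.name) {
            m.process(node);
        }
    }
    for child in &node.children {
        walk(matchers, child);
    }
}

/// Runs a single `LinkCounter` over the tree and returns its total.
fn count_links(root: &Node) -> u32 {
    let mut counter = LinkCounter { total: 0 };
    {
        // The mutable borrow of `counter` lives only inside this block.
        let mut matchers: [&mut dyn Matcher; 1] = [&mut counter];
        walk(&mut matchers, root);
    }
    counter.total
}

fn main() {
    let doc = Node::new("html", vec![
        Node::new("body", vec![
            Node::new("a", vec![]),
            Node::new("p", vec![Node::new("a", vec![])]),
        ]),
    ]);
    println!("Link <a> count {}", count_links(&doc));
}
```

Because `walk` takes a slice of matchers, several independent counters or collectors could share one traversal instead of each walking the tree separately.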

Dependencies

~1.5–6MB
~34K SLoC