#html-parser #web-scraping #html #selector #scrape #query #jquery

visdom

An HTML document parsing and manipulation library with a jQuery-like API, easy to use for web scraping and for obfuscating HTML

58 releases (3 stable)

1.0.2 Oct 27, 2024
1.0.1 Apr 18, 2024
1.0.0 Sep 11, 2023
0.5.10 Apr 17, 2023
0.4.8 Mar 27, 2021

#167 in Web programming

Download history 42/week @ 2024-07-15 19/week @ 2024-07-22 48/week @ 2024-07-29 41/week @ 2024-08-05 33/week @ 2024-08-12 23/week @ 2024-08-19 83/week @ 2024-08-26 24/week @ 2024-09-02 22/week @ 2024-09-09 59/week @ 2024-09-16 242/week @ 2024-09-23 76/week @ 2024-09-30 10/week @ 2024-10-14 104/week @ 2024-10-21 88/week @ 2024-10-28

209 downloads per month
Used in 7 crates

MIT license

285KB
7K SLoC

Rust 6K SLoC // 0.1% comments Go 471 SLoC // 0.1% comments JavaScript 409 SLoC // 0.1% comments

Visdom

API Documentation     Performance     Chinese API Documentation     Changelog

🏠 An HTML parsing, node selection, and mutation library written in Rust. Its API is modeled on jQuery, leaving out the parts that only work in browsers (e.g. rendering and event-related methods).

It is useful not only for HTML scraping: it also provides APIs for mutating text nodes, so you can mix your HTML with dirty HTML fragments to keep web scrapers away. 💖

Usage

use visdom::Vis;
use visdom::types::BoxDynError;

fn main() -> Result<(), BoxDynError>{
  let html = r##"
    <!DOCTYPE html>
    <html>
      <head>
        <meta charset="utf-8" />
      </head>
      <body>
        <nav id="header">
          <ul>
            <li>Hello,</li>
            <li>Vis</li>
            <li>Dom</li>
          </ul>
        </nav>
      </body>
    </html>
  "##;
  // load html
  let root = Vis::load(html)?;
  let lis = root.find("#header li");
  let lis_text = lis.text();
  println!("{}", lis_text);
  // will output "Hello,VisDom"
  Ok(())
}

Feature flags

Since v0.5.0, visdom provides feature flags that enable conditional compilation for different use cases.

| Feature | Description | API | Config |
| --- | --- | --- | --- |
| destroy | Needed only to remove or clear elements; otherwise you can leave this flag off. | .remove() .empty() (IElementTrait) remove_child() clone() | visdom = { version = xxx, features = ["destroy"]} |
| insertion | Needed only to mutate the DOM; otherwise you can leave this flag off. | append() append_to() prepend() prepend_to() insert_after() after() insert_before() before() replace_with() | visdom = { version = xxx, features = ["insertion"]} |
| text | Needed only to mutate text nodes; otherwise you can leave this flag off. | .texts() .texts_by() texts_by_rec() | visdom = { version = xxx, features = ["text"]} |
| full | Enables all of the APIs above. | - | visdom = { version = xxx, features = ["full"]} |
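
As a concrete illustration of the Config column, a Cargo.toml entry might look like the sketch below. The version shown is the 1.0.2 release listed in the release history; the combination of "insertion" and "text" is only an example choice, not a recommendation from the project.

```toml
[dependencies]
# Example only: enable DOM insertion and text-node mutation, leave "destroy" off.
visdom = { version = "1.0.2", features = ["insertion", "text"] }

# Alternative: enable every optional API at once.
# visdom = { version = "1.0.2", features = ["full"] }
```

Enabling only the flags you use keeps the gated APIs out of the build; "full" is the simplest choice when you need everything.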

Dependencies

Questions, Suggestions & Bugs?

Feel free to open an issue if you have a question, a bug report, or a suggestion.

License

MIT License.

Dependencies

~2.8–4.5MB
~85K SLoC