2 unstable releases
Version | Released |
---|---|
0.2.0 | Jun 19, 2021 |
0.1.0 | Jun 18, 2021 |
node2text
A tool for extracting text from HTML in your terminal.
Usage
# pipe in
curl -s 'https://en.wikipedia.org/wiki/Wiki' | node2text '#siteSub'
# Outputs: From Wikipedia, the free encyclopedia
# extract from path
node2text '#app.title' /path/to/file.html
# Prints output only if the selector matches an element
Motivation
When I reinstall my machine, I want to automate the install process. That usually involves quickly grabbing a snippet from the internet and writing it to a file; this tool aims to help script that step.
Hugely inspired by pup.
Installation
node2text is available on crates.io, so you can install it directly if you already have the Rust toolchain. If you don't, install Rust first by following the instructions on the official website.
Run
cargo install node2text
Note
Piped input always takes precedence, even if <path> is provided.
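The precedence rule can be sketched in Rust as follows. This is only an illustration and not node2text's actual source: the `InputSource` enum and the `choose_input` and `stdin_is_piped` functions are hypothetical names, and the sketch assumes that "piped" means standard input is not attached to a terminal.

```rust
use std::io::IsTerminal;

/// Where the HTML should be read from (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum InputSource {
    /// HTML was piped in on standard input.
    Stdin,
    /// HTML should be read from the given file path.
    File(String),
    /// Nothing was piped in and no path was given.
    Missing,
}

/// Assumption: stdin counts as "piped" when it is not a terminal.
fn stdin_is_piped() -> bool {
    !std::io::stdin().is_terminal()
}

/// Piped input wins over a path argument; the path is used only
/// when nothing is piped in.
fn choose_input(piped: bool, path: Option<&str>) -> InputSource {
    match (piped, path) {
        (true, _) => InputSource::Stdin,
        (false, Some(p)) => InputSource::File(p.to_string()),
        (false, None) => InputSource::Missing,
    }
}
```

Under these assumptions, calling `choose_input(stdin_is_piped(), Some("/path/to/file.html"))` while something is piped in yields `InputSource::Stdin`, so the path is ignored.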
Comparison with pup:
node2text
- Selectors are pure CSS selectors, with no DSL
- Takes HTML, outputs text
- Written in Rust
- Fewer features than pup
- Output is not escaped
pup
- Selectors are CSS selectors plus a DSL
- Takes HTML, outputs text, JSON, or HTML
- Written in Go
- Has many more features; see its GitHub page for details
- Output is escaped
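To illustrate what the escaped versus unescaped distinction means in practice, here is a minimal sketch of HTML escaping in Rust. The `escape_html` function is hypothetical and is not taken from either tool; which characters pup actually escapes is not stated here, so the five handled below are an assumption.

```rust
/// Replace characters that are significant in HTML with entities.
/// Assumption: only these five characters are escaped.
fn escape_html(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#39;"),
            other => out.push(other),
        }
    }
    out
}
```

A tool that does not escape its output would print text such as `Tom & Jerry` unchanged, whereas passing the same text through a function like this one gives `Tom &amp; Jerry`.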