2 unstable releases

0.2.0 Jun 19, 2021
0.1.0 Jun 18, 2021

#20 in #html-text

MIT license

9KB
98 lines

node2text

A command-line tool that extracts text from HTML.

Usage

# pipe in
curl -s 'https://en.wikipedia.org/wiki/Wiki' | node2text '#siteSub'
# Outputs: From Wikipedia, the free encyclopedia

# extract from path
node2text '#app.title' /path/to/file.html
# Prints text only if the selector matches something
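
Because a selector may match nothing, a script may want a fallback value. The following is a minimal sketch, not part of node2text itself: `extract_or_default` is a hypothetical helper, and it assumes that an unmatched selector produces empty output. It is meant to be called without piped input, so that the file path is the one actually read.

```shell
#!/bin/sh
# Hypothetical helper: extract_or_default SELECTOR FILE DEFAULT
# Prints the extracted text, or DEFAULT when nothing was extracted
# (assumption: an unmatched selector yields empty output).
extract_or_default() {
    text="$(node2text "$1" "$2" 2>/dev/null)"
    if [ -n "$text" ]; then
        printf '%s\n' "$text"
    else
        printf '%s\n' "$3"
    fi
}
```

For example, `extract_or_default '#app.title' /path/to/file.html 'untitled'` would print `untitled` whenever the selector finds nothing.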

Motivation

When I reinstall my machine, I want to automate the setup process. That usually involves quickly grabbing a snippet from the internet and writing it to a file. This tool aims to make that step scriptable.

Hugely inspired by pup.

Installation

node2text is available on crates.io. If you don't have the Rust toolchain installed, install it first from the official Rust website.

Run

cargo install node2text

Note

Piped input always takes precedence, even if <path> is provided.
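
A minimal sketch of what this precedence means in practice, assuming node2text is on your PATH. The temporary file and both HTML fragments are made up for illustration, and the expected behavior is inferred from the note above rather than taken from a verified run.

```shell
#!/bin/sh
# Hypothetical fixture: a file whose #msg element differs from the piped one.
tmp_file="$(mktemp)"
printf '%s\n' '<p id="msg">from file</p>' > "$tmp_file"
piped_html='<p id="msg">from pipe</p>'

if command -v node2text >/dev/null 2>&1; then
    # Both stdin and a path are supplied. Per the note, stdin should win,
    # so the extracted text should come from the piped fragment.
    printf '%s\n' "$piped_html" | node2text '#msg' "$tmp_file"
else
    echo 'node2text is not installed; skipping' >&2
fi

# Clean up the temporary fixture.
rm -f "$tmp_file"
```

If the file contents are what you want, call node2text with the path only and no pipe.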

Comparison with pup:

node2text

  • Selectors are plain CSS selectors, with no DSL
  • Takes HTML, outputs text
  • Written in Rust
  • Fewer features than pup
  • Output is not escaped

pup

  • Selectors are CSS selectors plus a DSL
  • Takes HTML, outputs text, JSON, or HTML
  • Written in Go
  • Has many features; see its GitHub page for details
  • Output is escaped

Dependencies

~5–10MB
~106K SLoC