1 unstable release

0.1.0 Jul 1, 2024

#1292 in Text processing

MIT/Apache

17KB
386 lines

summary

Extract the sentences which best summarize a document.

Example

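// Assumed import path; check the crate docs for the actual module layout.
use summary::{Language, Summarizer};

let summarizer = Summarizer::new(Language::English);
let text = "See Spot. See Spot run. Run Spot, run!";
// Number of sentences to extract, converted to the type the API expects.
let n = 2.try_into().unwrap();
// Print each sentence selected for the summary.
for sentence in summarizer.summarize_sentences(text, n) {
    println!("{sentence}");
}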

lib.rs:

The algorithm is a heuristic: it first identifies a "core" sentence based on its tf-idf cosine distance to the document as a whole, then gathers the sentences that have small cosine distances to that "core" sentence.
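To make the heuristic concrete, here is a minimal standalone sketch of the idea. It is not the crate's implementation. It assumes the core sentence is the one with the smallest cosine distance to the document, that the document vector is the sum of the sentence vectors, that idf is smoothed, and that tokenization is a naive split on non-alphanumeric characters. The names `tokenize`, `tfidf_vectors`, `cosine_distance`, and `summarize` are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Lowercase a sentence and split it into alphanumeric word tokens (naive, an assumption).
fn tokenize(sentence: &str) -> Vec<String> {
    sentence
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Build one sparse tf-idf vector per sentence, using a smoothed idf (an assumption).
fn tfidf_vectors(sentences: &[&str]) -> Vec<HashMap<String, f64>> {
    let tokenized: Vec<Vec<String>> = sentences.iter().map(|s| tokenize(s)).collect();
    let n = tokenized.len() as f64;
    // Document frequency: the number of sentences that contain each term.
    let mut df: HashMap<String, f64> = HashMap::new();
    for tokens in &tokenized {
        let unique: HashSet<&String> = tokens.iter().collect();
        for term in unique {
            *df.entry(term.clone()).or_insert(0.0) += 1.0;
        }
    }
    tokenized
        .iter()
        .map(|tokens| {
            // Raw term counts first, then scale each count by the term's idf.
            let mut v: HashMap<String, f64> = HashMap::new();
            for t in tokens {
                *v.entry(t.clone()).or_insert(0.0) += 1.0;
            }
            for (term, w) in v.iter_mut() {
                *w *= ((1.0 + n) / (1.0 + df[term])).ln() + 1.0;
            }
            v
        })
        .collect()
}

/// Cosine distance (1 - cosine similarity); an empty vector is treated as maximally distant.
fn cosine_distance(a: &HashMap<String, f64>, b: &HashMap<String, f64>) -> f64 {
    let dot: f64 = a.iter().filter_map(|(k, x)| b.get(k).map(|y| x * y)).sum();
    let norm = |v: &HashMap<String, f64>| v.values().map(|x| x * x).sum::<f64>().sqrt();
    let denom = norm(a) * norm(b);
    if denom == 0.0 { 1.0 } else { 1.0 - dot / denom }
}

/// Return the indices, in document order, of the `n` sentences closest to the core sentence.
fn summarize(sentences: &[&str], n: usize) -> Vec<usize> {
    let vectors = tfidf_vectors(sentences);
    if vectors.is_empty() {
        return Vec::new();
    }
    // Document vector: the sum of all sentence vectors.
    let mut doc: HashMap<String, f64> = HashMap::new();
    for v in &vectors {
        for (k, w) in v {
            *doc.entry(k.clone()).or_insert(0.0) += *w;
        }
    }
    // Core sentence: the one with the smallest cosine distance to the document vector.
    let core = (0..vectors.len())
        .min_by(|&i, &j| {
            cosine_distance(&vectors[i], &doc).total_cmp(&cosine_distance(&vectors[j], &doc))
        })
        .unwrap();
    // Rank every sentence by its distance to the core and keep the `n` closest.
    let mut order: Vec<usize> = (0..vectors.len()).collect();
    order.sort_by(|&i, &j| {
        cosine_distance(&vectors[i], &vectors[core])
            .total_cmp(&cosine_distance(&vectors[j], &vectors[core]))
    });
    order.truncate(n);
    order.sort_unstable(); // restore document order
    order
}
```

For example, calling `summarize(&["cats chase mice", "cats chase mice daily", "stock markets fell"], 2)` selects the two sentences about cats, because the unrelated third sentence shares no terms with the core sentence. Returning indices rather than string slices keeps the sketch free of lifetime annotations; a real API would more likely yield the sentences themselves.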

Dependencies

~4.5MB
~58K SLoC