2 releases

new 0.0.2 Nov 5, 2024
0.0.1 Oct 24, 2024

#392 in Text processing


162 downloads per month

MIT license

15MB
177K SLoC

common-words-all

Most common words sorted by ngram frequency.

Available in the following languages:

  • Chinese
  • English
  • French
  • German
  • Hebrew
  • Italian
  • Russian
  • Spanish

Available ngram sizes:

  • 1
  • 2
  • 3
  • 4
  • 5
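To make the sizes concrete: an ngram of size n is a run of n consecutive words. The following is a minimal standalone sketch of that idea. It does not use this crate, and the `ngrams` helper is a hypothetical name introduced only for illustration, with simple whitespace tokenization assumed.

```rust
/// Split `text` on whitespace and return every run of `n` consecutive words.
/// Hypothetical helper for illustration; not part of common-words-all.
fn ngrams(text: &str, n: usize) -> Vec<String> {
    // `windows(0)` would panic, so treat size 0 as "no ngrams".
    if n == 0 {
        return Vec::new();
    }
    let words: Vec<&str> = text.split_whitespace().collect();
    // `windows(n)` yields nothing when the text has fewer than `n` words.
    words.windows(n).map(|w| w.join(" ")).collect()
}

fn main() {
    let text = "the quick brown fox";
    for n in 1..=5 {
        println!("size {n}: {:?}", ngrams(text, n));
    }
}
```

With a four-word input, sizes 1 through 4 produce at least one ngram and size 5 produces none.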

Usage

Get the top 10 English ngrams of size 1:

let top = get_top(Language::English, 10, NgramSize::One);
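The return type of get_top is not shown here, so the sketch below does not call the crate. It only illustrates what "top N by frequency" means, using a hypothetical `top_n` helper and made-up counts. Breaking ties alphabetically is an assumption chosen to keep the output deterministic.

```rust
/// Return the `n` entries with the highest counts, most frequent first.
/// Hypothetical helper for illustration; not part of common-words-all.
fn top_n<'a>(counts: &[(&'a str, u64)], n: usize) -> Vec<&'a str> {
    let mut sorted = counts.to_vec();
    // Descending by count; equal counts fall back to alphabetical order.
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
    sorted.into_iter().take(n).map(|(word, _)| word).collect()
}

fn main() {
    // Made-up counts for illustration only, not taken from the dataset.
    let counts: [(&str, u64); 4] = [("of", 30), ("the", 50), ("and", 20), ("to", 20)];
    println!("{:?}", top_n(&counts, 3));
}
```

Asking for more entries than exist simply returns everything, in frequency order.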

Examples

Simple

Select the language (english) and the ngram size (one) through Cargo features:

cargo run --example simple --no-default-features -F english -F one --release

Data

Google Books Ngram dataset, version 20200217

License

MIT

© 2024, Eugene Hauptmann

No runtime deps