2 releases
| Version | Released |
|---|---|
| 0.0.2 (new) | Nov 5, 2024 |
| 0.0.1 | Oct 24, 2024 |
common-words-all
Most common words sorted by ngram frequency.
Available in the following languages:
- Chinese
- English
- French
- German
- Hebrew
- Italian
- Russian
- Spanish
Available ngram sizes:
- 1
- 2
- 3
- 4
- 5
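To make "sorted by ngram frequency" concrete, here is a minimal sketch that counts word ngrams in a small text and ranks them by how often they occur. It is an illustration only and not this crate's implementation: the function name `top_ngrams`, the whitespace tokenization, and the alphabetical tie-breaking are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of this crate): count word ngrams of size `n`
/// in `text` and return the `k` most frequent ones, most frequent first.
/// Ties are broken alphabetically so the order is deterministic.
fn top_ngrams(text: &str, n: usize, k: usize) -> Vec<(String, usize)> {
    // `windows(0)` would panic, so treat n == 0 as "no ngrams".
    if n == 0 {
        return Vec::new();
    }

    // Naive tokenization: lowercase and split on whitespace.
    let lowered = text.to_lowercase();
    let words: Vec<&str> = lowered.split_whitespace().collect();

    // Slide a window of `n` words over the text and count each ngram.
    let mut counts: HashMap<String, usize> = HashMap::new();
    for window in words.windows(n) {
        *counts.entry(window.join(" ")).or_insert(0) += 1;
    }

    // Sort by descending count, then alphabetically, and keep the top `k`.
    let mut ranked: Vec<(String, usize)> = counts.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(k);
    ranked
}

fn main() {
    let text = "the cat sat on the mat and the cat slept";
    // Ngram size 1: single words.
    println!("{:?}", top_ngrams(text, 1, 2)); // [("the", 3), ("cat", 2)]
    // Ngram size 2: pairs of adjacent words.
    println!("{:?}", top_ngrams(text, 2, 1)); // [("the cat", 2)]
}
```

The crate ships precomputed rankings, so nothing like this runs at lookup time; the sketch only shows what a ranking by ngram size looks like.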
Usage
Get the top 10 English ngrams:
let top = get_top(Language::English, 10, NgramSize::One);
Examples
Simple
You can select the language (english) and ngram size (one) as Cargo features:
cargo run --example simple --no-default-features -F english -F one --release
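The command above turns off the default features and enables only `english` and `one`. As a rough sketch of how Cargo features can be checked at compile time, the snippet below reports which of those two features were enabled. The helper `enabled_features` is hypothetical, and whether this crate gates its data this way internally is an assumption.

```rust
/// Hypothetical helper (not part of this crate): report which of the two
/// example features were enabled when this code was compiled.
fn enabled_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    // `cfg!` evaluates to a compile-time boolean for the given feature flag.
    if cfg!(feature = "english") {
        enabled.push("english");
    }
    if cfg!(feature = "one") {
        enabled.push("one");
    }
    enabled
}

fn main() {
    // The printed list depends on which features were passed to the build.
    println!("enabled features: {:?}", enabled_features());
}
```

Gating large datasets behind features like this is a common way to keep compile times and binary size down when only one language or ngram size is needed.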
Data
Dataset version 20200217 from Google Books
License
Copyright © 2024, Eugene Hauptmann