#nlp #information-retrieval #stemming #language #retrieval

rust-stemmers

A Rust implementation of some popular Snowball stemming algorithms

6 releases (stable)

Uses old Rust 2015

1.2.0 Nov 17, 2019
1.1.0 Jan 24, 2019
1.0.2 Dec 30, 2017
1.0.1 Aug 22, 2017
0.1.0 Feb 7, 2017

#97 in Algorithms

Download history: between roughly 32,000 and 94,000 downloads per week from 2024-09-25 to 2025-01-08

240,170 downloads per month
Used in 254 crates (31 directly)

MIT/BSD-3-Clause

2.5MB
19K SLoC

Rust Stemmers

This crate implements some of the stemming algorithms from the Snowball project. They are compiled to Rust using the Rust backend of the Snowball compiler.

Supported Algorithms

  • Arabic
  • Danish
  • Dutch
  • English
  • French
  • German
  • Greek
  • Hungarian
  • Italian
  • Norwegian
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Tamil
  • Turkish

Usage

extern crate rust_stemmers;
use rust_stemmers::{Algorithm, Stemmer};

fn main() {
    // Create a stemmer for the English language
    let en_stemmer = Stemmer::create(Algorithm::English);

    // Stem the word "fruitlessly".
    // All algorithms expect their input to contain only lowercase characters.
    assert_eq!(en_stemmer.stem("fruitlessly"), "fruitless");
}

Related Projects

  • The stemmer crate provides bindings to the C Snowball implementation.

lib.rs:

This library provides Rust implementations of some stemming algorithms written in the Snowball language.

All algorithms expect the input to already be lowercased.
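Because the stemmers do not lowercase their input, callers usually normalize text first. The following is a minimal sketch of such a preprocessing step, using only the standard library. `stem_text` is a hypothetical helper and not part of rust-stemmers; it takes the stemming function as a parameter so the normalization logic can be tried on its own.

```rust
/// Lowercases `text`, splits it on non-alphanumeric characters, and applies
/// `stem` to every non-empty token.
///
/// `stem_text` is an illustrative helper, not an API of rust-stemmers.
/// `stem` can be any function that maps a lowercase word to its stem.
fn stem_text<F>(text: &str, stem: F) -> Vec<String>
where
    F: Fn(&str) -> String,
{
    text.to_lowercase()
        // Treat punctuation and whitespace as token separators.
        .split(|c: char| !c.is_alphanumeric())
        // Consecutive separators produce empty tokens; drop them.
        .filter(|token| !token.is_empty())
        .map(|token| stem(token))
        .collect()
}
```

With this crate, the closure could be `|w| en_stemmer.stem(w).into_owned()`, since `Stemmer::stem` returns a `Cow<str>`. Splitting on non-alphanumeric characters is a simplification; real tokenization rules depend on the language being stemmed.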

Usage

[dependencies]
rust-stemmers = "^1.0"

extern crate rust_stemmers;

use rust_stemmers::{Algorithm, Stemmer};

fn main() {
    let en_stemmer = Stemmer::create(Algorithm::English);
    assert_eq!(en_stemmer.stem("fruitlessly"), "fruitless");
}

Dependencies