#stemmer #english #incomplete #porter #stemming

nightly bin+lib porter2

An (incomplete) implementation of the Porter 2 English Stemmer

2 releases

Uses old Rust 2015

0.0.1004 Nov 23, 2014
0.0.1003 Nov 23, 2014

#8 in #stemmer

MIT license

395KB
482 lines

Porter2 English Stemmer

This is an INCOMPLETE implementation of the Porter2 English stemmer, written in Rust. It's a little toy project for me to learn Rust with while doing something somewhat useful.

Please check rust-ci.org for Rust version compatibility.

Many thanks to mrordinaire, whose Porter stemmer in Rust gave me a starting point!!

Compiling

I'm using Cargo!!! Just run cargo build!!!!

Running the tests

I'm using Cargo!!! Just run cargo test!!!!

The tests are really just one test with a lot of cases: it runs through the words in some input files and asserts that the stem of each word matches the corresponding line in the expected output files.
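For illustration, the line-by-line comparison described above might look roughly like the sketch below. This is not the crate's actual test code: `stem` here is a hypothetical stand-in that only strips trailing "s" characters (not the Porter2 algorithm), and `check_cases` is an invented helper name.

```rust
// Hypothetical stand-in for the crate's stemmer. This is NOT Porter2;
// it only strips trailing 's' characters so the sketch is self-contained.
fn stem(word: &str) -> String {
    word.trim_end_matches('s').to_string()
}

// Pair each input word with the expected stem on the same line number and
// collect every mismatch as (word, expected, actual).
// Note: `zip` stops at the shorter input, so extra lines are ignored.
fn check_cases(input: &str, expected: &str) -> Vec<(String, String, String)> {
    input
        .lines()
        .zip(expected.lines())
        .filter_map(|(word, want)| {
            let got = stem(word);
            if got == want {
                None
            } else {
                Some((word.to_string(), want.to_string(), got))
            }
        })
        .collect()
}
```

A single test could then read the input and expected files into strings and assert that `check_cases` returns an empty list, printing the mismatches when it does not.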

Stemming

After compiling, you should have a binary in target/porter2 that will read a list of words, one per line, from stdin and print their stems to stdout.

Example:

./target/porter2 < test-data/voc.txt > output.txt
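
As a rough sketch of what a filter like this does (one word per line in, one stem per line out), assuming a hypothetical `stem` function: the placeholder below just lowercases the word and is not the crate's implementation, and `stem_lines` is an invented name.

```rust
use std::io::{self, BufRead, Write};

// Hypothetical placeholder for the real stemmer: it only lowercases the word.
fn stem(word: &str) -> String {
    word.to_lowercase()
}

// Read words one per line from `input` and write one stem per line to `output`.
// Generic over the reader and writer so it works with stdin/stdout or in-memory buffers.
fn stem_lines<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<()> {
    for line in input.lines() {
        let word = line?;
        writeln!(output, "{}", stem(word.trim()))?;
    }
    Ok(())
}
```

In a binary, this would be called with `io::stdin().lock()` and `io::stdout()` so that shell redirection like the example above works.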

License

MIT. See LICENSE.

No runtime deps