4 releases (breaking)

0.25.0 Feb 17, 2025
0.3.0 Apr 18, 2024
0.2.0 Oct 14, 2023
0.1.0 Sep 8, 2023

#216 in Database implementations

Download history 4/week @ 2024-11-27 101/week @ 2024-12-04 13/week @ 2024-12-11 79/week @ 2024-12-18 1/week @ 2024-12-25 72/week @ 2025-01-08 10/week @ 2025-01-29 29/week @ 2025-02-05 152/week @ 2025-02-12 449/week @ 2025-02-19 17/week @ 2025-02-26

648 downloads per month
Used in 3 crates (via izihawa-tantivy)

MIT license

7KB
117 lines

#Tokenizer-API

An API to interface a tokenizer with tantivy.

The API will be kept stable in order to not break support for existing tokenizers.


lib.rs:

Tokenizer are in charge of chopping text into a stream of tokens ready for indexing. This is an separate crate from tantivy, so implementors don't need to update for each new tantivy version.

To add support for a tokenizer, implement the Tokenizer trait. Checkout the tantivy repo for some examples.

Dependencies

~0.3–0.9MB
~20K SLoC