
tantivy-jieba

A library that bridges tantivy and jieba-rs.

12 releases (breaking)

0.11.0 Apr 28, 2024
0.10.0 Oct 13, 2023
0.9.0 Jul 3, 2023
0.7.0 Jan 5, 2023
0.1.1 Feb 13, 2019

#149 in Text processing


8,550 downloads per month

MIT license

8KB
104 lines


An adapter that lets tantivy use jieba-rs as a tokenizer, so that Chinese text is split into words before it is indexed or queried.

Usage

Add tantivy-jieba as a dependency in your Cargo.toml, alongside tantivy itself.

Example

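// Brings the `Tokenizer` and `TokenStream` traits into scope.
use tantivy::tokenizer::*;

let mut tokenizer = tantivy_jieba::JiebaTokenizer {};
let mut token_stream = tokenizer.token_stream("测试");
// "测试" is a single word, so exactly one token comes back.
assert_eq!(token_stream.next().unwrap().text, "测试");
assert!(token_stream.next().is_none());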

Register tantivy tokenizer

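use tantivy::schema::Schema;
use tantivy::Index;

// Placeholder schema: add your own fields here. A text field only uses
// this tokenizer if its indexing options name the tokenizer "jieba".
let schema = Schema::builder().build();
let tokenizer = tantivy_jieba::JiebaTokenizer {};
let index = Index::create_in_ram(schema);
// Register the tokenizer under the name "jieba".
index.tokenizers()
     .register("jieba", tokenizer);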

License

Released under the MIT license.

Dependencies

~9.5MB
~96K SLoC