12 releases (breaking)
Version | Released |
---|---|
0.11.0 | Apr 28, 2024 |
0.10.0 | Oct 13, 2023 |
0.9.0 | Jul 3, 2023 |
0.7.0 | Jan 5, 2023 |
0.1.1 | Feb 13, 2019 |
#149 in Text processing
8,550 downloads per month
8KB
104 lines
tantivy-jieba
An adapter that bridges between tantivy and jieba-rs.
Usage
Add tantivy-jieba as a dependency in your Cargo.toml.
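As a minimal sketch, the manifest entry could look like the following. The version requirement assumes the 0.11 line, the most recent release listed for this crate; choose whichever release is compatible with the tantivy version your project already uses.

```toml
[dependencies]
# Assumed version requirement; adjust to match your tantivy release.
tantivy-jieba = "0.11"
```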
Example
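// Brings the Tokenizer and TokenStream traits into scope.
use tantivy::tokenizer::*;
let mut tokenizer = tantivy_jieba::JiebaTokenizer {};
let mut token_stream = tokenizer.token_stream("测试");
// jieba keeps "测试" as a single word, so exactly one token is produced.
assert_eq!(token_stream.next().unwrap().text, "测试");
assert!(token_stream.next().is_none());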
Register tantivy tokenizer
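use tantivy::schema::Schema;
use tantivy::tokenizer::*;
use tantivy::Index;
// An empty schema is enough to demonstrate registration.
let schema = Schema::builder().build();
let tokenizer = tantivy_jieba::JiebaTokenizer {};
let index = Index::create_in_ram(schema);
// Make the tokenizer available to this index under the name "jieba".
index.tokenizers()
    .register("jieba", tokenizer);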
License
Dependencies
~9.5MB
~96K SLoC