Cargo Features

[dependencies]
charabia = { version = "0.9.1", default-features = false, features = ["chinese", "chinese-segmentation", "chinese-normalization", "chinese-normalization-pinyin", "hebrew", "japanese", "japanese-segmentation-ipadic", "japanese-segmentation-unidic", "japanese-transliteration", "korean", "thai", "greek", "latin-camelcase", "khmer", "vietnamese", "latin-snakecase", "swedish-recomposition", "turkish", "german-segmentation"] }

default = chinese, german-segmentation, greek, hebrew, japanese, khmer, korean, swedish-recomposition, thai, turkish, vietnamese

These default features are set whenever charabia is added without default-features = false somewhere in the dependency tree.
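As an illustration of opting out, the fragment below is a minimal sketch that assumes a project only needs Korean and Thai support. The choice of languages is hypothetical; the feature names and version come from the listing on this page.

```toml
[dependencies]
# default-features = false drops everything in the `default` list;
# only the features named here are enabled by this declaration.
charabia = { version = "0.9.1", default-features = false, features = ["korean", "thai"] }
```

Because Cargo unifies features across the dependency tree, any other crate that depends on charabia with its defaults left on will still bring the default set back in.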

chinese default = chinese-normalization, chinese-segmentation

Allows Chinese specialized tokenization.

chinese-segmentation chinese

Enables jieba-rs

chinese-normalization chinese chinese-normalization-pinyin?

chinese-normalization-pinyin = chinese-normalization

Enables pinyin
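Since chinese-normalization-pinyin does not appear in the default list, it has to be requested explicitly. A minimal sketch, assuming the default features are otherwise wanted as they are:

```toml
[dependencies]
# Default features stay on; pinyin normalization is added on top.
# Per the listing, this feature also pulls in chinese-normalization.
charabia = { version = "0.9.1", features = ["chinese-normalization-pinyin"] }
```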

hebrew default

Allows Hebrew specialized tokenization.

japanese default = japanese-segmentation-unidic

Allows Japanese specialized tokenization.

japanese-segmentation-ipadic

Enables compress and ipadic of lindera =0.32.2

japanese-segmentation-unidic japanese

Enables compress and unidic of lindera =0.32.2
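The listing shows that japanese turns on the UniDic dictionary, while the IPADIC dictionary is a separate, non-default feature. The fragment below is a sketch of requesting IPADIC alone. It assumes that enabling japanese-segmentation-ipadic without japanese is enough for Japanese segmentation, which this page does not confirm, so the crate documentation should be checked before relying on it.

```toml
[dependencies]
# Assumption: only the IPADIC dictionary is wanted.
# `japanese` is left out because it enables japanese-segmentation-unidic.
charabia = { version = "0.9.1", default-features = false, features = ["japanese-segmentation-ipadic"] }
```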

japanese-transliteration

Enables wana_kana ^3.0.0

korean default

Allows Korean specialized tokenization.

Enables compress and ko-dic of lindera =0.32.2

thai default

Allows Thai specialized tokenization.

greek default

Allows Greek specialized tokenization.

latin-camelcase

Allows splitting camelCase Latin words.

Enables finl_unicode

khmer default

vietnamese default

Allows Vietnamese specialized tokenization.

latin-snakecase

Allows splitting snake_case Latin words.

Enables finl_unicode
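Neither latin-camelcase nor latin-snakecase is part of the default set, so both are opt-in. A minimal sketch, assuming a project wants the defaults plus both Latin splitting features:

```toml
[dependencies]
# Defaults remain enabled; the two Latin splitting features are added.
# Each of them enables the finl_unicode dependency, per the listing above.
charabia = { version = "0.9.1", features = ["latin-camelcase", "latin-snakecase"] }
```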

swedish-recomposition default

Forces Charabia to recompose Swedish characters.

turkish default

Allows Turkish specialized tokenization.

german-segmentation default

Allows decomposition of German compound words.

Features from optional dependencies

In crates that don't use the dep: syntax, optional dependencies automatically become Cargo features. These features may have been created by mistake, and this functionality may be removed in the future.

lindera japanese-segmentation-ipadic? japanese-segmentation-unidic? korean

Enables lindera =0.32.2
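For context, the fragment below sketches how a crate can use the dep: syntax so that an optional dependency does not become an implicit feature. It is a hypothetical manifest written for illustration, not charabia's actual Cargo.toml; it only reuses the dependency and feature names listed on this page.

```toml
# Hypothetical manifest, shown only to illustrate the dep: syntax.
[dependencies]
lindera = { version = "=0.32.2", optional = true }

[features]
# Referring to the dependency as `dep:lindera` stops Cargo from
# creating an implicit `lindera` feature for it.
korean = ["dep:lindera", "lindera/ko-dic", "lindera/compress"]
```

With a manifest like this, users can only enable the dependency through korean, and no standalone lindera feature would be listed.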