5 releases
Version | Release date |
---|---|
0.1.4 | Oct 27, 2024 |
0.1.3 | Sep 17, 2024 |
0.1.2 | Sep 17, 2024 |
0.1.1 | Jan 25, 2024 |
0.1.0 | Nov 26, 2023 |
#436 in Text processing
147 downloads per month
40KB
783 lines
Nomnom 🥘 吃吃
Nado - CLI
A small utility that converts the cedict_ts.u8 file into a JSON or CSV file. Additional features:
- Add pinyin with tone marks (accents), based on these rules
- Add the HSK level of each character, fetched from mandarinbean. The HSK 7-9 levels are parsed from a different website, by wohok
- Add zhuyin support, based on the conversion rules at this link
- Add wade-giles support, based on the conversion rules at this link
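To make the conversion concrete, here is a minimal sketch of parsing a single CC-CEDICT line into its fields, which is the step any such converter needs before it can write JSON or CSV. The `Entry` struct and the `parse_line` function are hypothetical names used for illustration and are not taken from this project's source. The sketch assumes the usual line layout `Traditional Simplified [pin1 yin1] /definition/definition/`.

```rust
/// One parsed dictionary entry (hypothetical type, not the crate's own struct).
#[derive(Debug, PartialEq)]
struct Entry {
    traditional: String,
    simplified: String,
    pinyin: String,
    translations: Vec<String>,
}

/// Parse one line of cedict_ts.u8.
/// Returns None for comments, blank lines, and lines that do not match the layout.
fn parse_line(line: &str) -> Option<Entry> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    // The headwords end where the bracketed pinyin begins.
    let (head, rest) = line.split_once(" [")?;
    // The pinyin ends at the closing bracket, followed by the definitions.
    let (pinyin, defs) = rest.split_once("] /")?;
    let (traditional, simplified) = head.split_once(' ')?;
    let translations = defs
        .trim_end_matches('/')
        .split('/')
        .filter(|d| !d.is_empty())
        .map(str::to_string)
        .collect();
    Some(Entry {
        traditional: traditional.to_string(),
        simplified: simplified.to_string(),
        pinyin: pinyin.to_string(),
        translations,
    })
}

fn main() {
    if let Some(entry) = parse_line("我 我 [wo3] /I/me/my/") {
        println!("{:?}", entry);
    }
}
```

Splitting on the first ` [` and the first `] /` keeps bracketed pinyin that appears inside a definition from confusing the parser, since the headwords themselves never contain those sequences.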
Usage
Clone this project and run one of the cargo commands below. If needed, I can provide the generated JSON and CSV files.
Json
cargo run -- generate -e ../cedict_ts.u8 -o ../cedict.json -f json
Csv
cargo run -- generate -e ../cedict_ts.u8 -o ../cedict.csv -f csv
Dodo - Lib
A small crate that lets you run several operations on the cedict.u8 file, and also some operations on Chinese text, such as:
- Convert pinyin with tone marks to pinyin with tone numbers and vice versa
- Convert pinyin to wade-giles
- Convert pinyin to zhuyin
- Convert a simplified Chinese text to traditional and vice versa
- Detect which Chinese variant a text is written in
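use dodo_zh;
use dodo_zh::variant::KeyVariant;
fn main() {
    // `path` is the location of your cedict_ts.u8 file.
    // The KeyVariant can be either Traditional or Simplified Chinese.
    let cedict = dodo_zh::load_cedict_dictionary(path, KeyVariant::Traditional).unwrap();
    // `get` returns an Item struct for the entry, if it exists.
    let wo = cedict.items.get("我").unwrap();
    // Print the translations (this assumes the field implements Debug).
    println!("{:?}", wo.translations);
}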
A set of examples shows how to do some pinyin manipulation, for instance converting pinyin with tone numbers to pinyin with tone marks.
You can run the example with the following command:
cargo run --example dodo
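For readers who want to see what the tone-number to tone-mark conversion involves, the following is a minimal standalone sketch. It does not use the dodo_zh API; `mark_vowel`, `numbered_to_marked` and `phrase_to_marked` are hypothetical helpers written for illustration. It assumes lowercase input and the common placement rule: the mark goes on `a` or `e` if present, on the `o` of `ou`, and otherwise on the last vowel.

```rust
/// Return `vowel` carrying the mark for `tone` (1 to 4), or unchanged otherwise.
fn mark_vowel(vowel: char, tone: usize) -> char {
    let marked = match vowel {
        'a' => "āáǎà",
        'e' => "ēéěè",
        'i' => "īíǐì",
        'o' => "ōóǒò",
        'u' => "ūúǔù",
        'ü' => "ǖǘǚǜ",
        _ => return vowel,
    };
    marked.chars().nth(tone.wrapping_sub(1)).unwrap_or(vowel)
}

/// Convert one lowercase syllable such as "hao3" into "hǎo".
fn numbered_to_marked(syllable: &str) -> String {
    // CC-CEDICT writes ü as "u:"; "v" is another common ASCII stand-in.
    let s = syllable.replace("u:", "ü").replace('v', "ü");
    let tone = match s.chars().last().and_then(|c| c.to_digit(10)) {
        Some(t) => t as usize,
        None => return s, // no trailing tone number, nothing to convert
    };
    // The trailing digit is ASCII, so dropping one byte is safe.
    let mut chars: Vec<char> = s[..s.len() - 1].chars().collect();
    if !(1..=4).contains(&tone) {
        // Neutral tone (usually written 5): drop the number, add no mark.
        return chars.into_iter().collect();
    }
    let target = chars
        .iter()
        .position(|&c| c == 'a' || c == 'e')
        .or_else(|| chars.windows(2).position(|w| w[0] == 'o' && w[1] == 'u'))
        .or_else(|| chars.iter().rposition(|&c| "aeiouü".contains(c)));
    if let Some(i) = target {
        chars[i] = mark_vowel(chars[i], tone);
    }
    chars.into_iter().collect()
}

/// Convert a space-separated pinyin string, syllable by syllable.
fn phrase_to_marked(pinyin: &str) -> String {
    pinyin
        .split_whitespace()
        .map(numbered_to_marked)
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    println!("{}", phrase_to_marked("ni3 hao3"));
    println!("{}", numbered_to_marked("lu:4"));
}
```

Capitalized syllables (for example proper nouns in CC-CEDICT) are left unmarked by this sketch; handling them would need uppercase entries in the vowel table.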
Dependencies
~1.6–2.5MB
~68K SLoC