#chinese #convert-text #simplified-chinese #traditional-chinese

bin+lib hanconv

Convert between Chinese characters variants

6 releases

Uses new Rust 2024

new 0.3.4 Mar 19, 2025
0.3.3 Feb 27, 2025
0.2.5 Feb 20, 2025

#438 in Text processing

Download history 91/week @ 2025-02-09 552/week @ 2025-02-16 294/week @ 2025-02-23 59/week @ 2025-03-02 3/week @ 2025-03-09 105/week @ 2025-03-16

522 downloads per month

MIT license

475KB
644 lines

Hanconv

Hanconv converts between Chinese characters variants base on OpenCC's dictionaries and conversion rules. It provides efficient conversion between Simplified Chinese and Traditional Chinese both as a command-line tool and as a library for integration into other projects.

Installation

cargo install hanconv

Usage

Command-Line Interface

  • Display the help message:

    hanconv --help
    
  • Convert a text file from Simplified to Traditional Chinese:

    hanconv s2t -i input.txt -o output.txt
    
  • Convert a text file from Traditional to Simplified Chinese:

    hanconv t2s -i input.txt -o output.txt
    
  • Convert text directly from command line:

    hanconv s2t "简繁转换"
    
  • Read from stdin and write to stdout:

    cat input.txt | hanconv s2t > output.txt
    
  • For files with non-UTF-8 encodings, specify the encoding:

    # Same encoding for input and output
    hanconv t2s -i input.txt -o output.txt --encoding GBK
    
    # Different encodings for input and output
    hanconv t2s -i input.txt -o output.txt --input-encoding GBK --output-encoding UTF-8
    

As a Library

  1. Add hanconv as a dependency in your Cargo.toml.

    [dependencies]
    hanconv = "0.3"
    
  2. Use the provided conversion functions in your project.

    fn main() {
        let result = hanconv::s2t("简繁转换");
        println!("{}", result);
    }
    
  3. Available conversion functions:

    • s2t: Simplified Chinese → Traditional Chinese
    • t2s: Traditional Chinese → Simplified Chinese
    • s2tw: Simplified Chinese → Traditional Chinese (Taiwan)
    • tw2s: Traditional Chinese (Taiwan) → Simplified Chinese
    • s2twp: Simplified Chinese → Traditional Chinese (Taiwan) with Taiwanese idiom
    • tw2sp: Traditional Chinese (Taiwan) → Simplified Chinese with Mainland Chinese idiom
    • t2tw: Traditional Chinese → Traditional Chinese (Taiwan)
    • tw2t: Traditional Chinese (Taiwan) → Traditional Chinese
    • s2hk: Simplified Chinese → Traditional Chinese (Hong Kong)
    • hk2s: Traditional Chinese (Hong Kong) → Simplified Chinese
    • t2hk: Traditional Chinese → Traditional Chinese (Hong Kong)
    • hk2t: Traditional Chinese (Hong Kong) → Traditional Chinese
    • t2jp: Traditional Chinese characters (Kyūjitai) → New Japanese Kanji (Shinjitai)
    • jp2t: New Japanese Kanji (Shinjitai) → Traditional Chinese characters (Kyūjitai)

Conversions

Conversion Source Target
S2T Simplified Chinese Traditional Chinese
T2S Traditional Chinese Simplified Chinese
S2TW Simplified Chinese Traditional Chinese (Taiwan)
TW2S Traditional Chinese (Taiwan) Simplified Chinese
S2TWP Simplified Chinese Traditional Chinese (Taiwan) with Taiwanese idiom
TW2SP Traditional Chinese (Taiwan) Simplified Chinese with Mainland Chinese idiom
T2TW Traditional Chinese Traditional Chinese (Taiwan)
TW2T Traditional Chinese (Taiwan) Traditional Chinese
S2HK Simplified Chinese Traditional Chinese (Hong Kong)
HK2S Traditional Chinese (Hong Kong) Simplified Chinese
T2HK Traditional Chinese Traditional Chinese (Hong Kong)
HK2T Traditional Chinese (Hong Kong) Traditional Chinese
T2JP Traditional Chinese characters (Kyūjitai) New Japanese Kanji (Shinjitai)
JP2T New Japanese Kanji (Shinjitai) Traditional Chinese characters (Kyūjitai)

Dependencies

~7MB
~187K SLoC