2 releases

0.1.1 Nov 9, 2023
0.1.0 Nov 9, 2023

#1452 in Text processing

MIT OR Apache-2.0 OR CC0-1.0

330KB
313 lines

unicode_font


Convert unicode characters between fonts.

Unicode Standard Annex #44 defines the Character Decomposition Mapping. In it, some characters carry a <font> tag, which marks them as font variants of other characters. On top of these standard variants, we add carefully selected ones, such as superscript, subscript and squared. This extension is enabled by default and can be turned off.
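To make the idea of a font variant concrete: in the Unicode Character Database, U+1D400 (MATHEMATICAL BOLD CAPITAL A) decomposes to `<font> 0041`, that is, it is a font variant of `A`. The sketch below inverts that relation for the Mathematical Bold block only, using plain code point arithmetic. It is an illustration, not this crate's API: the names `to_math_bold` and `bold` are hypothetical, and the crate itself relies on generated lookup tables rather than arithmetic.

```rust
/// Hypothetical helper (not part of unicode_font): map an ASCII letter or
/// digit to its Mathematical Bold font variant, if one exists.
fn to_math_bold(c: char) -> Option<char> {
    let code = match c {
        // U+1D400..=U+1D419: MATHEMATICAL BOLD CAPITAL A..Z
        'A'..='Z' => 0x1D400 + (c as u32 - 'A' as u32),
        // U+1D41A..=U+1D433: MATHEMATICAL BOLD SMALL A..Z
        'a'..='z' => 0x1D41A + (c as u32 - 'a' as u32),
        // U+1D7CE..=U+1D7D7: MATHEMATICAL BOLD DIGIT ZERO..NINE
        '0'..='9' => 0x1D7CE + (c as u32 - '0' as u32),
        _ => return None,
    };
    char::from_u32(code)
}

/// Convert a whole string, leaving characters without a bold variant as-is.
fn bold(s: &str) -> String {
    s.chars().map(|c| to_math_bold(c).unwrap_or(c)).collect()
}

fn main() {
    println!("{}", bold("Unicode 15"));
}
```

Arithmetic only works here because the Mathematical Bold range is contiguous. Other styles have gaps and exceptions, which is one reason a table-driven approach is the more general choice.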

Characteristics

  • Standard-compliant
  • Database-driven
    • Code is generated from CSV files
  • Hash lookup
    • We use perfect hash functions for lookup, which is generally faster than binary search on ordered tables
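
As a rough illustration of the hash lookup point, the sketch below builds a minimal perfect hash for a small, fixed set of characters by brute-force searching for a seed that sends every key to a distinct slot. How this crate actually constructs its perfect hash functions is not described here, so treat `PerfectMap`, `find_seed` and the mixing function as hypothetical stand-ins.

```rust
/// Mix a key with a seed (32-bit finalizer in the style of MurmurHash3)
/// and reduce it to a slot index. Illustrative only.
fn slot(key: u32, seed: u32, len: usize) -> usize {
    let mut h = key ^ seed;
    h ^= h >> 16;
    h = h.wrapping_mul(0x85EB_CA6B);
    h ^= h >> 13;
    h = h.wrapping_mul(0xC2B2_AE35);
    h ^= h >> 16;
    h as usize % len
}

/// Search for a seed under which all keys land in distinct slots.
fn find_seed(keys: &[u32]) -> Option<u32> {
    (0..100_000u32).find(|&seed| {
        let mut used = vec![false; keys.len()];
        keys.iter().all(|&k| {
            let s = slot(k, seed, keys.len());
            // `replace` returns the previous flag: true means a collision.
            !std::mem::replace(&mut used[s], true)
        })
    })
}

/// A table with exactly one slot per key, so a lookup is one hash and one compare.
struct PerfectMap {
    seed: u32,
    table: Vec<(char, char)>,
}

impl PerfectMap {
    fn build(pairs: &[(char, char)]) -> Option<Self> {
        if pairs.is_empty() {
            return None;
        }
        let keys: Vec<u32> = pairs.iter().map(|&(k, _)| k as u32).collect();
        let seed = find_seed(&keys)?;
        let mut table = vec![('\0', '\0'); pairs.len()];
        for &(k, v) in pairs {
            table[slot(k as u32, seed, pairs.len())] = (k, v);
        }
        Some(Self { seed, table })
    }

    fn get(&self, key: char) -> Option<char> {
        let (k, v) = self.table[slot(key as u32, self.seed, self.table.len())];
        // Keys outside the set still hash to some slot, so compare before returning.
        (k == key).then_some(v)
    }
}

fn main() {
    let superscripts = [('0', '⁰'), ('1', '¹'), ('2', '²'), ('3', '³')];
    let map = PerfectMap::build(&superscripts).expect("no seed found");
    println!("{:?} {:?}", map.get('2'), map.get('x'));
}
```

A lookup costs a constant amount of work regardless of table size, whereas binary search needs a number of comparisons that grows with the logarithm of the table size. In a generated-code setting the seed search would run once, ahead of time, rather than at program start.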

Similar projects

  • YayText
    • Online tool to transform to multiple unicode styles
  • sprezz keyboard
    • Online tool to transform to multiple unicode styles
  • Unicode Toys
    • Transliterate plain text to obscure characters from Unicode

Building

The lookup maps are generated by a builder script, the crate builder, which reads its input from the folder unidata. Run that crate to update unicode_font's maps.

We chose this strategy over a build script (build.rs) to speed up compilation of dependent crates: the maps are generated ahead of time, so dependents do not have to generate them on every build.
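
The generation step can be pictured with the following minimal sketch, which turns CSV-like rows into the source text of a Rust lookup function. The row format (`variant;base`, both as hexadecimal code points), the function name `generate` and the emitted function name `to_bold` are all assumptions made for illustration. The actual layout of the files in unidata and the output of builder may differ.

```rust
/// Parse a hexadecimal code point such as `1D400` into a `char`.
fn parse_hex(field: &str) -> Option<char> {
    u32::from_str_radix(field.trim(), 16)
        .ok()
        .and_then(char::from_u32)
}

/// Hypothetical generator: each non-comment row is `variant;base`.
/// Returns the source of a function mapping each base to its variant,
/// or `None` if any row is malformed.
fn generate(csv: &str) -> Option<String> {
    let mut out =
        String::from("pub fn to_bold(c: char) -> Option<char> {\n    match c {\n");
    for line in csv.lines().map(str::trim) {
        if line.is_empty() || line.starts_with('#') {
            continue; // skip blank lines and comments
        }
        let (variant, base) = line.split_once(';')?;
        let (variant, base) = (parse_hex(variant)?, parse_hex(base)?);
        out.push_str(&format!(
            "        '\\u{{{:04X}}}' => Some('\\u{{{:04X}}}'),\n",
            base as u32, variant as u32
        ));
    }
    out.push_str("        _ => None,\n    }\n}\n");
    Some(out)
}

fn main() {
    let rows = "# variant;base\n1D400;0041\n1D401;0042\n";
    match generate(rows) {
        // A real builder would write this to a source file in the library crate.
        Some(source) => print!("{source}"),
        None => eprintln!("malformed input"),
    }
}
```

Because the generated source is committed alongside the library, crates that depend on it compile plain Rust code and never run the generator themselves.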
