#string #unicode #validation #simd #utf-8 #unicode-characters

no-std simdutf

Unicode validation and transcoding at billions of characters per second

19 releases

0.6.0 Jan 22, 2025
0.5.72 Dec 31, 2024
0.5.1 Sep 1, 2024
0.4.17 May 2, 2024
0.3.0 Jul 27, 2022

MIT license

2.5MB
44K SLoC

C++ 44K SLoC // 0.2% comments Rust 720 SLoC // 0.0% comments Python 63 SLoC // 0.0% comments Just 26 SLoC

simdutf

Unicode validation and transcoding at billions of characters per second.

This crate is the Rust binding of simdutf.

Documentation: https://docs.rs/simdutf
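
To make "validation and transcoding" concrete, here is a minimal scalar sketch that uses only the Rust standard library. It does not call this crate: the helper names `is_valid_utf8` and `utf8_to_utf16` are hypothetical and chosen for illustration, and the simdutf API itself is described in the documentation linked above. The sketch only shows the kind of operation that simdutf accelerates with SIMD.

```rust
// Scalar baseline using only std; this is NOT the simdutf API.
// The function names below are hypothetical, for illustration only.

/// Returns true if `bytes` is well-formed UTF-8.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// Transcodes UTF-8 bytes to UTF-16 code units.
/// Returns None when the input is not valid UTF-8.
fn utf8_to_utf16(bytes: &[u8]) -> Option<Vec<u16>> {
    let s = std::str::from_utf8(bytes).ok()?;
    Some(s.encode_utf16().collect())
}

fn main() {
    let good = "héllo, 世界".as_bytes();
    // The byte 0xFF never appears in well-formed UTF-8.
    let bad = [0x66, 0x6f, 0xff, 0x6f];

    println!("good is valid: {}", is_valid_utf8(good));
    println!("bad is valid: {}", is_valid_utf8(&bad));

    if let Some(units) = utf8_to_utf16(good) {
        println!("UTF-16 code units: {}", units.len());
    }
}
```

A function like `is_valid_utf8` is a convenient reference implementation when checking that a faster validator agrees with the standard library on the same inputs.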

Compilation

This crate works out of the box as long as you have a C++11-compatible toolchain installed correctly.

simdutf links against the C++ standard library, which adds a dynamic linking dependency.

For more details, see the simdutf documentation and the cc documentation.

Here is an example setup for local benchmarking, which tunes both the Rust and the C++ code for the host CPU:

export RUSTFLAGS='-C target-cpu=native'
export CXXFLAGS='-march=native'
cargo build --release
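
As a rough way to see what `-C target-cpu=native` changes on the Rust side, the following sketch compares the CPU features enabled at compile time with those the running CPU reports. It uses only the standard library; the function names are hypothetical, the feature list is a small illustrative subset, and the runtime check covers x86 and x86_64 only. It does not inspect the C++ side, which is controlled by `CXXFLAGS`.

```rust
/// Features enabled at compile time. RUSTFLAGS such as
/// `-C target-cpu=native` can add entries to this list.
fn compile_time_features() -> Vec<&'static str> {
    let mut v = Vec::new();
    if cfg!(target_feature = "sse2") {
        v.push("sse2");
    }
    if cfg!(target_feature = "avx2") {
        v.push("avx2");
    }
    if cfg!(target_feature = "neon") {
        v.push("neon");
    }
    v
}

/// Features the running CPU reports (x86/x86_64 only in this sketch;
/// on other architectures the list stays empty).
#[allow(unused_mut)]
fn runtime_features() -> Vec<&'static str> {
    let mut v = Vec::new();
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("sse2") {
            v.push("sse2");
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            v.push("avx2");
        }
    }
    v
}

fn main() {
    println!("compile-time features: {:?}", compile_time_features());
    println!("runtime features:      {:?}", runtime_features());
}
```

Running this once with and once without the `RUSTFLAGS` export shows whether the native setting actually widened the compile-time feature set on the benchmark machine.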