#string-representation #string #reference-counting #inline #cow #byte-slice

no-std hipstr

Yet another string for Rust: zero-cost borrow and slicing, inline representation for small strings, (atomic) reference counting

11 releases (6 breaking)

new 0.7.0 Jan 10, 2025
0.6.0 Oct 8, 2024
0.5.1 Aug 2, 2024
0.5.0 Jun 25, 2024
0.2.0 Jul 6, 2023

#26 in Memory management


4,328 downloads per month
Used in 17 crates (2 directly)

MIT/Apache

445KB
10K SLoC

hipstr


Yet another string type for Rust 🦀

  • no-copy borrow via borrowed (a const constructor) or from_static
  • no-alloc small strings (up to 23 bytes on 64-bit platforms)
  • no-copy owned slices
  • a niche: Option<HipStr> and HipStr have the same size
  • zero dependencies, and no_std compatible (with alloc)

Also byte strings, OS strings, and paths!
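
To make the "no-alloc small strings" and "niche" points more concrete, here is a minimal std-only sketch. SmallStr, INLINE_CAP and option_is_free are hypothetical names invented for this illustration; this is not hipstr's actual layout, only the general idea.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// 23 bytes, the inline capacity quoted above for 64-bit platforms.
pub const INLINE_CAP: usize = 23;

/// Hypothetical toy type, NOT hipstr's layout: short strings live in a
/// fixed buffer inside the value, longer ones go to the heap.
pub enum SmallStr {
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(Box<str>),
}

impl SmallStr {
    pub fn new(s: &str) -> Self {
        if s.len() <= INLINE_CAP {
            // Small enough: copy the bytes into the value, no allocation.
            let mut buf = [0u8; INLINE_CAP];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SmallStr::Inline { len: s.len() as u8, buf }
        } else {
            SmallStr::Heap(s.into())
        }
    }

    pub fn is_inline(&self) -> bool {
        matches!(self, SmallStr::Inline { .. })
    }

    pub fn as_str(&self) -> &str {
        match self {
            SmallStr::Inline { len, buf } => {
                // The buffer was filled from a valid `&str`, so this cannot fail.
                std::str::from_utf8(&buf[..*len as usize]).expect("valid UTF-8")
            }
            SmallStr::Heap(s) => &**s,
        }
    }
}

/// Niche check with a std type: `NonNull` can never be null, so `Option`
/// stores `None` in that forbidden bit pattern and needs no extra space.
pub fn option_is_free() -> bool {
    size_of::<Option<NonNull<u8>>>() == size_of::<NonNull<u8>>()
}
```

With this toy type, SmallStr::new("hello") stays inline, while a string longer than 23 bytes falls back to the heap variant.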

⚡ Examples

use hipstr::HipStr;

let simple_greetings = HipStr::from_static("Hello world");
let _clone = simple_greetings.clone(); // no copy

let user = "John";
let greetings = HipStr::from(format!("Hello {}", user));
let user = greetings.slice(6..); // no copy
drop(greetings); // the slice is owned, it exists even if greetings disappears

let chars = user.chars().count(); // "inherits" `&str` methods
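
Why can the slice outlive greetings? One common way to get owned, no-copy slices is to pair a reference-counted buffer with a byte range. The sketch below uses only std; SharedStr and its methods are hypothetical names for illustration and are not hipstr's implementation.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical toy type, not hipstr's implementation: a shared buffer plus
/// the byte range this handle is allowed to see.
#[derive(Clone)]
pub struct SharedStr {
    buf: Arc<str>,
    range: Range<usize>,
}

impl SharedStr {
    pub fn new(s: &str) -> Self {
        SharedStr { buf: Arc::from(s), range: 0..s.len() }
    }

    /// Returns a handle on a sub-range. Only the reference count and the
    /// range change; the bytes are not copied.
    /// Panics if the range is out of bounds or splits a UTF-8 character.
    pub fn slice(&self, r: Range<usize>) -> Self {
        let start = self.range.start + r.start;
        let end = self.range.start + r.end;
        assert!(start <= end && end <= self.range.end, "range out of bounds");
        assert!(self.buf.is_char_boundary(start) && self.buf.is_char_boundary(end));
        SharedStr { buf: Arc::clone(&self.buf), range: start..end }
    }

    pub fn as_str(&self) -> &str {
        &self.buf[self.range.clone()]
    }

    /// True when both handles point at the same allocation.
    pub fn shares_buffer_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.buf, &other.buf)
    }
}
```

Dropping the original handle only decrements the count, so a slice taken from it keeps the buffer alive for as long as it needs it.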

✏️ Features

  • std (default): provides HipOsStr and HipPath types, and more trait implementations (for comparison and conversions)
  • serde: provides serialization/deserialization support with serde
  • borsh: provides serialization/deserialization support with borsh
  • bstr: provides compatibility with BurntSushi's bstr crate
  • unstable: does nothing by itself; used to reveal unstable implementation details

☣️ Safety of hipstr

This crate makes extensive use of unsafe code blocks. 🤷

It leverages the 2-bit alignment niche present in pointers across most platforms (all platforms currently supported by the Rust compiler?) to discriminate between the three possible representations.
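
The idea of an alignment niche can be sketched on plain integers, with no unsafe code. The tag names and values below are made up for this example and do not reflect hipstr's real encoding; the only fact relied on is that an address aligned to 4 bytes or more has its two low bits set to zero.

```rust
/// Mask covering the two low bits that are always zero in the address of a
/// value aligned to 4 bytes or more.
pub const TAG_MASK: usize = 0b11;

/// Hypothetical tags for three representations; the names and the numeric
/// values are invented for this sketch.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Tag {
    Allocated = 0b00,
    Inline = 0b01,
    Borrowed = 0b10,
}

/// Packs a tag into the spare low bits of an aligned address.
pub fn pack(addr: usize, tag: Tag) -> usize {
    assert_eq!(addr & TAG_MASK, 0, "address must be 4-byte aligned");
    addr | tag as usize
}

/// Splits a packed word back into the address and its tag.
pub fn unpack(word: usize) -> (usize, Tag) {
    let tag = match word & TAG_MASK {
        0b00 => Tag::Allocated,
        0b01 => Tag::Inline,
        0b10 => Tag::Borrowed,
        _ => unreachable!("0b11 is never produced by `pack`"),
    };
    (word & !TAG_MASK, tag)
}
```

Two bits give four patterns, which is enough to tell three representations apart without spending an extra byte on a discriminant.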

To make things safer, this crate is tested thoroughly on multiple platforms, normally and with Miri (the MIR interpreter).

🧪 Testing and Verification Strategy

To ensure safety and reliability, this crate undergoes thorough testing:

  • Near 100% test coverage
  • Cross-platform validation:
    • 32-bit and 64-bit architectures
    • little and big endian

In addition, this crate is checked with advanced dynamic verification methods:

  • Concurrency testing using Tokio's loom crate
  • Undefined behavior detection using Miri (the MIR interpreter)

☔ Coverage

This crate has near full line coverage:

cargo llvm-cov --all-features --html
# or
cargo tarpaulin --all-features --out html --engine llvm

Check out the current coverage on Codecov:

Coverage grid

🖥️ Cross-platform testing

In the GitHub-provided CI, hipstr is tested under:

  • Linux
  • Windows
  • macOS (ARM 64-bit LE)

You can easily run the tests on various platforms with cross:

cross test --target s390x-unknown-linux-gnu         # 64-bit BE
cross test --target powerpc64-unknown-linux-gnu     # 64-bit BE
cross test --target i686-unknown-linux-gnu          # 32-bit LE
cross test --target x86_64-unknown-linux-gnu        # 64-bit LE

NB: previously I used MIPS targets for big endian, but due to some LLVM-related issue they are not working anymore… see Rust issue #113065

🧵 Loom

This crate uses the loom crate to check the custom "Arc" implementation. To run the tests:

RUSTFLAGS='--cfg loom' cargo test --release loom
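
For readers unfamiliar with what such a check is after, the property is roughly "after concurrent clones and drops, the reference count is back where it started". The sketch below is a plain thread smoke test of that property using std::sync::Arc. It does not use loom and does not touch hipstr's custom "Arc"; loom explores many thread interleavings systematically, whereas this runs whichever one the scheduler picks. The function name clones_and_drops is made up for the example.

```rust
use std::sync::Arc;
use std::thread;

/// Clones a shared string into `threads` threads, clones it again inside
/// each thread, waits for all of them, and returns the final strong count.
pub fn clones_and_drops(threads: usize) -> usize {
    let shared: Arc<str> = Arc::from("hello");
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let local = Arc::clone(&shared);
            thread::spawn(move || {
                // A second clone created and dropped on the worker thread.
                let extra = Arc::clone(&local);
                extra.len()
            })
        })
        .collect();
    for handle in handles {
        assert_eq!(handle.join().expect("thread panicked"), 5);
    }
    // Every clone has been dropped by now, so only `shared` remains.
    Arc::strong_count(&shared)
}
```

A correct reference-counting implementation should report a count of one here regardless of how the threads were scheduled.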

🔍 Miri

This crate runs successfully with Miri:

MIRIFLAGS='-Zmiri-symbolic-alignment-check -Zmiri-permissive-provenance' cargo +nightly miri test

for SEED in $(seq 0 10); do
  echo "Trying seed: $SEED"
  MIRIFLAGS="-Zmiri-seed=$SEED -Zmiri-permissive-provenance" cargo +nightly miri test || { echo "Failing seed: $SEED"; break; };
done

To check with different word size and endianness:

# Big endian, 64-bit
cargo +nightly miri test --target mips64-unknown-linux-gnuabi64
# Little endian, 32-bit
cargo +nightly miri test --target i686-unknown-linux-gnu

Note: this crate leverages the "exposed provenance" semantics. MIRIFLAGS=-Zmiri-permissive-provenance silences the warning related to the use of exposed provenance.

📦 Similar crates

#[non_exhaustive]

| Name | TS cheap-clone | Local cheap-clone | Inline | Cheap slice | Bytes | Borrow 'static | Borrow any 'a | Comment |
|---|---|---|---|---|---|---|---|---|
| hipstr | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | obviously! |
| arcstr | ✓* | - | - | -** | - | ✓ | - | *use a custom thin Arc, **heavy slice (with dedicated substring type) |
| flexstr | ✓* | ✓ | ✓ | - | - | ✓ | - | *use (A)rc<str> instead of (A)rc<String> (remove a level of indirection but use fat pointers) |
| imstr | ✓ | ✓ | - | ✓ | - | - | - | |
| faststr | ✓ | - | ✓ | ✓ | - | ✓ | - | zero-doc with complex API |
| fast-str | ✓ | - | ✓ | ✓ | - | ✓ | - | inline repr is opt-in |
| ecow | ✓* | - | ✓ | - | ✓** | ✓ | - | *on two words only 🤤, **even any T |
| cowstr | ✓ | - | - | -* | - | ✓ | -** | *heavy slice, **contrary to its name |
| compact_str | - | - | ✓ | - | ✓* | - | - | *opt-in via smallvec |
| inline_string | - | - | ✓ | - | - | - | - | |
| kstring | ✓ | ✓ | ✓ | - | - | ✓ | ✓* | safe mode, use boxed strings; *with second type |
| smartstring | - | - | ✓ | - | - | - | - | |
| smallstr | - | - | ✓ | - | - | - | - | |
| smol_str | - | - | ✓* | - | - | ✓ | - | *but only inline string, here for reference |

skipping specialized string types like tinystr (ASCII-only, bounded), or bstr, or bytestring, or...

In short, HipStr, one string type to rule them all 😉

How standards proliferate

🏎️ Performance

While speed is not the main motivator for hipstr, it seems to be doing OK on that front.

See some actual benchmarks on Rust's String Rosetta.

📖 Author and licenses

For now, just me PoLazarus 👻
Help welcome! 🚨

MIT + Apache

Dependencies

~0–23MB
~302K SLoC