3 releases (breaking)
0.3.0 | Jul 6, 2024 |
---|---|
0.2.0 | Jul 4, 2024 |
0.1.0 | Jul 4, 2024 |
#19 in #interning
36KB
625 lines
interned-string
This crate exposes IString
, an
immutable and interned string type.
It's built for high performance and with multithreading in mind.
It provides O(1) Hash
and Eq
operations, perfect for your HashMap<IString, _>
.
Getting Started
You can intern any String
or &str
value by calling intern()
.
You can pass an IString
by reference to any function that accepts a &str
.
use interned_string::Intern;
fn main() {
let my_istring = "hello".intern();
foo(&my_istring);
}
fn foo(string: &str) {
println!("{string}");
}
If you enable the serde
feature, you can use IString
in place of String
in your DTOs.
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
interned-string = { version = "0.1", features = ["serde"] }
use serde::Deserialize;
use interned_string::IString;
#[derive(Deserialize, Debug)]
struct ExampleDTO {
favorite_dish: IString
}
fn main() {
let serialized = "{\"favorite_dish\":\"pasta\"}";
let example: ExampleDTO = serde_json::from_str(&serialized).unwrap();
println!("{example:?}")
// ExampleDTO { favorite_dish: IString("pasta") }
}
Performance Characteristics
Reading an IString
's contents is very fast, lock free and wait free (thanks to left_right
).
The IString
can be shared and read from any number of threads.
It scales linearly with the number of reading threads.
The tradeoff is that creating a new IString
is slower.
A radix tree (compact trie) needs to be traversed to deduplicate the new string,
a lock needs to be acquired, and the tree needs to be updated if the string wasn't interned yet.
While the tree walk can be done in parallel from multiple threads, the lock prevents linear
scaling for writes.
Planned Improvements
-
Replace or rewrite the radix tree to make it reuse the string storage, instead of storing a clone of the each interned string. Currently the crate uses 2x the interned string storage space because of this (1x in storage, 1x as a clone in the radix tree).
-
Free the memory of unused strings in the background / in a cold path. Currenly the memory is only free'd the next time a new string is inserted in storage, which may be unexpected.
Contributing
Feel free to open a PR. Any contribution is greatly appreciated.
If you have a suggestion, please open an Issue.
License
Acknowledgments
Special thanks to Jon Gjengset (@jonhoo) for providing the inspiration for this crate, and for his
work on left-right
.
Dependencies
~0.8–27MB
~345K SLoC