2 stable releases
1.1.0 | Nov 19, 2024 |
---|---|
1.0.0 | Oct 18, 2024 |
#377 in Algorithms
548 downloads per month
87KB
1.5K
SLoC
entropy-map
entropy-map
is an ultra-low latency hash map Rust crate using minimal perfect hash functions (MPHF) and compact encoding of values, designed for scenarios where both memory efficiency and fast data retrieval are critical. Ideal for applications in high-performance computing, entropy-map
offers a unique blend of speed and compactness.
Getting Started
Simple example to quickly get started:
use entropy_map::Mphf;
let keys = [1, 2, 3, 4, 5];
let mphf = Mphf::<32, 8>::from_slice(&keys, 2.0).unwrap();
assert!(mphf.get(&1).is_some());
Check out the provided examples for detailed usage:
Overview
This crate provides advanced data structures leveraging MPHF, optimized for scenarios requiring high-speed data access and minimal memory usage. It includes the following key components:
Minimal Perfect Hash Function (MPHF)
- Implements MPHF based on fingerprinting techniques as detailed in Fingerprinting-based minimal perfect hashing revisited
- Inspired by ph crate but with improved rank storage and reduced construction and query times.
- Optimized rank storage mechanism based on Engineering Compact Data Structures for Rank and Select Queries on Bit Vectors
- Memory usage ranging from
2.10 bits
to2.71 bits
per key depending on parameters. - Query time ranging from
5 ns
to20 ns
depending on the parameters, number of keys and L1-L3 cache sizes. - Configurable template parameters for flexibility.
B
: group size in bits in [1..64] range, default 32 bits.S
: defines maximum seed value to try (2^S) in [0..16] range, default 8.ST
: seed type (unsigned integer), defaultu8
.H
: hasher used to hash keys, defaultWyHash
.
- Configurable
gamma
parameter to tune construction time vs query time trade-off. - Optional rkyv support to enable zero-copy serialization/deserialization of MPHF.
MapWithDict
- Immutable hash map leveraging MPHF for indexing.
- Stores keys to ensure presence/absence of the key in the map.
- Optimized for space, using a dictionary to pack unique values.
- Efficient storage and retrieval, reducing overall memory footprint.
- Optional rkyv support to enable zero-copy serialization/deserialization and superior memory footprint and performance when compared with
rkyv::ArchivedHashMap
.
MapWithDictBitpacked
- Specialized version of
MapWithDict
, further optimized for memory usage when values areVec<u32>
. - Bit-packs
Vec<u32>
values for minimal space usage using SIMD instructions. - Excels in scenarios where values are within a limited range and can be efficiently encoded.
Set
Special case of MapWithDict
, optimized for set membership operations.
- Immutable set using MPHF for indexing.
- Stores keys to ensure presence/absence of the key in the set.
- Optional rkyv support to enable zero-copy serialization/deserialization.
Dependencies
~0.6–1.2MB
~25K SLoC