#hashing #bloom #hash-table #hash-values #bloom-filter #key-hash #double-hashing

hash-iter

Iterator producing sequence of hash values for a given input (using double hashing technique)

6 releases (stable)

1.1.0 Nov 8, 2024
1.0.2 Nov 7, 2024
0.2.0 Nov 7, 2024
0.1.0 Nov 7, 2024

#632 in Algorithms

Download history 348/week @ 2024-11-02 115/week @ 2024-11-09

463 downloads per month
Used in mpchash

MIT license

14KB
176 lines

hash-iter

crates.io docs.rs

Implementation of the enhanced double hashing technique based on the Bloom Filters in Probabilistic Verification paper (Dillinger and Manolios, 2004).

Motivation

This crate is very simple: given a key key, one can hash it to a sequence of k hashes [hash_1, hash_2, .., hash_k], instead of a single hash.

This is useful for implementing many hash-based algorithms, such as:

Usage

Create and configure a hasher either by directly calling constructors of DoubleHashHasher or by using the builder object DoubleHashBuilder.

Basic usage

When we are relying on default parameters (for seed values, max hash value etc), and do not need to keep builder around (which is normally the case, as a single generated hasher is often enough), we can use the DoubleHashHasher constructor directly:

use hash_iter::{DoubleHashHasher, HashIterHasher};

let hasher = DoubleHashHasher::new();

// Hash each key to 3 hash points.
let hashes = hasher.hash_iter(&"foo", 3).collect::<Vec<_>>();
let hashes = hasher.hash_iter(&"bar", 3).collect::<Vec<_>>();

Configuring hasher state

There are several optional parameters that can be configured:

  • n: the maximum hash value producible (by default it is usize::MAX, so that array indexing is safe).
  • seed1 and seed2: seeds for the two hash functions (by default they are 12345 and 67890 respectively).

The DoubleHashBuilder allows you to configure how hash iterators are produced:

use hash_iter::{BuildHashIterHasher, DoubleHashHasher, DoubleHashBuilder, HashIterHasher};

// Specify default values explicitly.
let hasher = DoubleHashBuilder::new()
    .with_seed1(12345)
    .with_seed2(67890)
    .with_n(usize::MAX as u64)
    .build_hash_iter_hasher();

let hashes = hasher.hash_iter(&"hello", 3).collect::<Vec<_>>();

Custom hash functions

One can specify which hash functions to use when creating the first two hash values used to produce the sequence.

All you need to do is to supply DoubleHashHasher::with_hash_builders() function with two structs that implement hash::BuildHasher:

use hash_iter::{DoubleHashHasher, HashIterHasher};
use xxhash_rust::xxh3::Xxh3Builder;

let hasher = DoubleHashHasher::with_hash_builders(
    Xxh3Builder::new().with_seed(12345),
    Xxh3Builder::new().with_seed(67890),
    usize::MAX, // n
);

let hashes = hasher.hash_iter(&"hello", 3).collect::<Vec<_>>();

Dependencies

~255KB