#tlsh #digest #similarity

tlsh2

A Rust implementation of the TLSH algorithm

5 releases (3 breaking)

0.4.0 Jul 1, 2024
0.3.0 Jul 30, 2023
0.2.1 May 4, 2023
0.2.0 Jan 1, 2023
0.1.0 Dec 30, 2022

#281 in Algorithms


2,275 downloads per month
Used in boreal

Apache-2.0 OR BSD-3-Clause

505KB
568 lines

TLSH2


Rust port of the TLSH library. The code is kept close to the original C++ version to limit bugs and ease maintenance.

This crate is no_std. The supported combinations of bucket count and checksum length are handled as generics, so each configuration is compiled and optimized on its own.

// The default builder uses 128 buckets and a 1-byte checksum.
// Other builders are also available.
let mut builder = tlsh2::TlshDefaultBuilder::new();
builder.update(b"Sed ut perspiciatis unde omnis iste natus");
builder.update(b"error sit voluptatem accusantium");
let tlsh = builder.build()
    .ok_or_else(|| "could not generate TLSH from payload")?;

// Alternatively, a TLSH object can be generated directly from
// a byte slice.
let tlsh2 = tlsh2::TlshDefaultBuilder::build_from(
    b"odit aut fugit, sed quia consequuntur magni dolores"
).ok_or_else(|| "could not generate TLSH from second payload")?;

// Then, the TLSH object can be used to generate a hash or compute
// distances.
assert_eq!(
    tlsh.hash(),
    b"T184A022B383C2A2A20ACB0830880CF0202CCAC080033A023800338\
      A30B0880AA8E0BE38".as_slice(),
);
// The `diff` feature is required for this computation.
assert_eq!(tlsh.diff(&tlsh2, true), 209);

The following configurations are available:

  • 128 buckets and 1-byte checksum (default).
  • 128 buckets and 3-byte checksum.
  • 256 buckets and 1-byte checksum.
  • 256 buckets and 3-byte checksum.
  • 48 buckets and 1-byte checksum.
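
To illustrate how a bucket count and a checksum length can be expressed as generics, here is a minimal, self-contained sketch using const generics. Every name in it (SketchBuilder, Sketch128x1, and so on) is hypothetical and is not part of the tlsh2 API, and the update logic is a toy placeholder rather than the TLSH bucket mapping. It only shows that each combination becomes a distinct type with fixed-size arrays the compiler can specialize.

```rust
// Hypothetical sketch: these names are NOT the tlsh2 API.
// BUCKETS and CHECKSUM_LEN are compile-time parameters, so every
// combination is monomorphized into its own specialized type.
pub struct SketchBuilder<const BUCKETS: usize, const CHECKSUM_LEN: usize> {
    buckets: [u32; BUCKETS],
    checksum: [u8; CHECKSUM_LEN],
}

impl<const BUCKETS: usize, const CHECKSUM_LEN: usize> SketchBuilder<BUCKETS, CHECKSUM_LEN> {
    pub fn new() -> Self {
        Self {
            buckets: [0; BUCKETS],
            checksum: [0; CHECKSUM_LEN],
        }
    }

    /// Toy update: counts bytes per bucket and folds each byte into the
    /// checksum. This is a placeholder, not the real TLSH computation.
    pub fn update(&mut self, data: &[u8]) {
        for &byte in data {
            self.buckets[byte as usize % BUCKETS] += 1;
            for c in self.checksum.iter_mut() {
                *c = c.wrapping_mul(31).wrapping_add(byte);
            }
        }
    }

    /// Number of buckets, known at compile time.
    pub fn bucket_count(&self) -> usize {
        BUCKETS
    }

    /// Checksum length in bytes, known at compile time.
    pub fn checksum_len(&self) -> usize {
        CHECKSUM_LEN
    }

    /// Total number of bytes counted so far.
    pub fn total(&self) -> u32 {
        self.buckets.iter().sum()
    }
}

// One alias per configuration in the list above (hypothetical names).
pub type Sketch128x1 = SketchBuilder<128, 1>;
pub type Sketch128x3 = SketchBuilder<128, 3>;
pub type Sketch256x1 = SketchBuilder<256, 1>;
pub type Sketch256x3 = SketchBuilder<256, 3>;
pub type Sketch48x1 = SketchBuilder<48, 1>;
```

Because the array sizes are part of the type, no heap allocation is needed, which is consistent with a no_std design. Picking a configuration then amounts to picking a type alias.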

The fast feature speeds up TLSH generation but adds a 64kB lookup table.
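
The following sketch illustrates the general space-for-time trade-off behind a lookup table of that size. It is an assumption-laden illustration, not the crate's implementation: the permutation, the function names, and the table layout are all hypothetical. It shows that replacing two chained lookups in a 256-entry table with a single lookup needs 256 x 256 one-byte entries, which is 65,536 bytes.

```rust
// Hypothetical sketch of a space-for-time trade-off; not tlsh2's code.

/// Builds a 256-entry byte permutation. The affine map is a stand-in:
/// an odd multiplier modulo 256 always yields a permutation of 0..=255.
const fn small_table() -> [u8; 256] {
    let mut t = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        t[i] = ((i * 167 + 13) % 256) as u8;
        i += 1;
    }
    t
}

const SMALL: [u8; 256] = small_table();

/// Slow path: two chained lookups through the small table.
pub fn mix_slow(a: u8, b: u8) -> u8 {
    SMALL[(SMALL[a as usize] ^ b) as usize]
}

/// Fast path: every (a, b) result is precomputed once.
pub struct FastMixer {
    table: Vec<u8>,
}

impl FastMixer {
    pub fn new() -> Self {
        // 256 * 256 one-byte entries = 65,536 bytes.
        let mut table = vec![0u8; 256 * 256];
        for a in 0..256usize {
            for b in 0..256usize {
                table[a * 256 + b] = mix_slow(a as u8, b as u8);
            }
        }
        Self { table }
    }

    /// A single lookup replaces the two chained ones.
    pub fn mix(&self, a: u8, b: u8) -> u8 {
        self.table[(a as usize) * 256 + b as usize]
    }

    /// Size of the precomputed table in bytes.
    pub fn table_bytes(&self) -> usize {
        self.table.len()
    }
}
```

The sketch builds its table at run time in a Vec for brevity; a no_std crate would more plausibly embed such a table as static data, which is where the extra binary size would come from.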

The threaded and private options that exist in the original TLSH implementation are not yet implemented.

No runtime deps