#post-quantum-cryptography #hashing #keccak #xof #public-key

turboshake

A family of extendable output functions based on keccak-p[1600, 12] permutation

14 unstable releases (3 breaking)

new 0.4.0 Feb 17, 2025
0.3.1 Feb 16, 2025
0.2.0 Nov 25, 2023
0.1.9 Nov 21, 2023
0.1.5 Mar 24, 2023

#696 in Cryptography

Download history 18/week @ 2024-12-09 138/week @ 2025-02-10

141 downloads per month
Used in kangarootwelve

MIT license

63KB
900 lines

turboshake

TurboSHAKE: A Family of eXtendable Output Functions based on round reduced ( 12 rounds ) Keccak[1600] Permutation.

Overview

TurboSHAKE is a family of extendable output functions (xof) powered by round-reduced ( i.e. 12 -rounds ) Keccak-p[1600, 12] permutation. Keccak-p[1600, 12] has previously been used in fast parallel hashing algorithm KangarooTwelve ( more @ https://keccak.team/kangarootwelve.html ). Recently a formal specification, describing TurboSHAKE was released ( more @ https://ia.cr/2023/342 ) which generally exposes the underlying primitive of KangarooTwelve ( also known as K12, see https://blake12.org ) so that post-quantum public key cryptosystems ( such as ML-KEM, ML-DSA etc. - standardized by NIST ) might benefit from it ( more @ https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/5HveEPBsbxY ).

Here I'm maintaining a Rust library which implements TurboSHAKE{128, 256} xof s.t. one can absorb arbitrary many bytes into sponge state, finalize sponge and squeeze arbitrary many bytes out of sponge. See usage section below for more info.

Prerequisites

Rust stable toolchain; see https://rustup.rs for installation guide.

# When developing this library, I was using
$ rustc --version
rustc 1.84.1 (e71f9a9a9 2025-01-27)

Testing

For ensuring functional correctness of TurboSHAKE{128, 256} implementation, I use test vectors from section 4 ( on page 9 ) and Appendix A ( on page 17 ) of https://datatracker.ietf.org/doc/draft-irtf-cfrg-kangarootwelve. Issue following command to run all test cases

cargo test

Benchmarking

Issue following command for benchmarking round-reduced Keccak-p[1600, 12] permutation and TurboSHAKE{128, 256} Xof, for variable input and output sizes.

[!WARNING] When benchmarking make sure you've disabled CPU frequency scaling, otherwise numbers you see can be misleading. I found https://github.com/google/benchmark/blob/b40db869/docs/reducing_variance.md helpful.

cargo bench --bench keccak --features=dev --profile optimized # Only keccak permutation
cargo bench --bench turboshake --profile optimized            # Only TurboSHAKE{128, 256} Xof
cargo bench --all-features --profile optimized                # Both of above

On 12th Gen Intel(R) Core(TM) i7-1260P

Running kernel Linux 6.11.0-14-generic x86_64, with Rust compiler 1.84.1 (e71f9a9a9 2025-01-27), compiled in optimized mode.

Timer precision: 20 ns
keccak                fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ permute_12_rounds  82.32 ns      │ 226.3 ns      │ 83.1 ns       │ 88.24 ns      │ 100     │ 3200
                      2.262 GiB/s   │ 842.7 MiB/s   │ 2.241 GiB/s   │ 2.11 GiB/s    │         │
                      12.14 Mitem/s │ 4.418 Mitem/s │ 12.03 Mitem/s │ 11.33 Mitem/s │         │

Timer precision: 19 ns
turboshake                      fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ turboshake128                              │               │               │               │         │
  ├─ msg = 2.0KB, md = 64.0B   1.25 µs       │ 9.534 µs      │ 1.475 µs      │ 1.547 µs      │ 100     │ 100
  │                            1.572 GiB/s   │ 211.2 MiB/s   │ 1.332 GiB/s   │ 1.27 GiB/s    │         │
  │                            799.6 Kitem/s │ 104.8 Kitem/s │ 677.6 Kitem/s │ 646.1 Kitem/s │         │
  ├─ msg = 8.0KB, md = 64.0B   4.592 µs      │ 6.655 µs      │ 4.623 µs      │ 4.644 µs      │ 100     │ 100
  │                            1.674 GiB/s   │ 1.155 GiB/s   │ 1.663 GiB/s   │ 1.655 GiB/s   │         │
  │                            217.7 Kitem/s │ 150.2 Kitem/s │ 216.3 Kitem/s │ 215.3 Kitem/s │         │
  ├─ msg = 32.0B, md = 64.0B   115.1 ns      │ 121.6 ns      │ 116 ns        │ 116.1 ns      │ 100     │ 1600
  │                            795.3 MiB/s   │ 752.8 MiB/s   │ 788.9 MiB/s   │ 788.4 MiB/s   │         │
  │                            8.687 Mitem/s │ 8.223 Mitem/s │ 8.617 Mitem/s │ 8.612 Mitem/s │         │
  ├─ msg = 128.0B, md = 64.0B  119.7 ns      │ 249.9 ns      │ 124 ns        │ 131.8 ns      │ 100     │ 1600
  │                            1.493 GiB/s   │ 732.4 MiB/s   │ 1.441 GiB/s   │ 1.356 GiB/s   │         │
  │                            8.352 Mitem/s │ 4 Mitem/s     │ 8.061 Mitem/s │ 7.584 Mitem/s │         │
  ╰─ msg = 512.0B, md = 64.0B  400.6 ns      │ 725.6 ns      │ 407.1 ns      │ 423.9 ns      │ 100     │ 400
                               1.339 GiB/s   │ 757 MiB/s     │ 1.317 GiB/s   │ 1.265 GiB/s   │         │
                               2.496 Mitem/s │ 1.378 Mitem/s │ 2.456 Mitem/s │ 2.358 Mitem/s │         │
╰─ turboshake256                              │               │               │               │         │
   ├─ msg = 2.0KB, md = 64.0B   1.526 µs      │ 4.404 µs      │ 1.544 µs      │ 1.574 µs      │ 100     │ 100
                               1.288 GiB/s   │ 457.2 MiB/s   │ 1.273 GiB/s   │ 1.248 GiB/s   │         │
                               655 Kitem/s   │ 227 Kitem/s   │ 647.6 Kitem/s │ 634.9 Kitem/s │         │
   ├─ msg = 8.0KB, md = 64.0B   5.711 µs      │ 8.574 µs      │ 5.747 µs      │ 5.922 µs      │ 100     │ 100
                               1.346 GiB/s   │ 918.2 MiB/s   │ 1.337 GiB/s   │ 1.298 GiB/s   │         │
                               175 Kitem/s   │ 116.6 Kitem/s │ 173.9 Kitem/s │ 168.8 Kitem/s │         │
   ├─ msg = 32.0B, md = 64.0B   114.2 ns      │ 201.4 ns      │ 116.7 ns      │ 125.1 ns      │ 100     │ 1600
                               801.4 MiB/s   │ 454.3 MiB/s   │ 784.3 MiB/s   │ 731.5 MiB/s   │         │
                               8.754 Mitem/s │ 4.963 Mitem/s │ 8.566 Mitem/s │ 7.99 Mitem/s  │         │
   ├─ msg = 128.0B, md = 64.0B  119.9 ns      │ 141.9 ns      │ 121.9 ns      │ 122.3 ns      │ 100     │ 1600
                               1.49 GiB/s    │ 1.259 GiB/s   │ 1.465 GiB/s   │ 1.461 GiB/s   │         │
                               8.334 Mitem/s │ 7.046 Mitem/s │ 8.197 Mitem/s │ 8.172 Mitem/s │         │
   ╰─ msg = 512.0B, md = 64.0B  400.2 ns      │ 427.2 ns      │ 408.2 ns      │ 407.4 ns      │ 100     │ 800
                                1.34 GiB/s    │ 1.255 GiB/s   │ 1.314 GiB/s   │ 1.316 GiB/s   │         │
                                2.498 Mitem/s │ 2.34 Mitem/s  │ 2.449 Mitem/s │ 2.454 Mitem/s │         │

Usage

Using TurboSHAKE{128, 256} Xof API is fairly easy.

  1. Add turboshake to your project's Cargo.toml.
[dependencies]
turboshake = "0.4.0"
  1. Create a TurboSHAKE{128, 256} Xof object.
use turboshake;

fn main() {
    let msg = [1u8; 8];      // message to be absorbed
    let mut dig = [0u8; 32]; // digest to be computed

    let mut hasher = turboshake::TurboShake128::default();
    // ...
}
  1. Absorb N(>=0) -bytes message into sponge state by invoking absorb() M(>1) -many times.
hasher.absorb(&msg[..2]);
hasher.absorb(&msg[2..4]);
hasher.absorb(&msg[4..]);
  1. When all message bytes are consumed, finalize sponge state by calling finalize().
// Note, one needs to pass a domain seperator constant byte in finalization step.
// You can use 0x1f ( i.e. default domain seperator value ) if you're not using
// multiple instances of TurboSHAKE. Consider reading section 1 ( top of page 2 )
// of TurboSHAKE specification https://eprint.iacr.org/2023/342.pdf.
hasher.finalize::<{ turboshake::TurboShake128::DEFAULT_DOMAIN_SEPARATOR }>();
  1. Now sponge is ready to be squeezed i.e. read arbitrary many bytes by invoking squeeze() arbitrary many times.
hasher.squeeze(&mut dig[..16]);
hasher.squeeze(&mut dig[16..]);

I maintain two examples demonstrating use of TurboSHAKE{128, 256} Xof API.

You should be able to run those examples with following commands

cargo run --example turboshake128

Message: 44abe2f57f3186dd40e761d955fbda1b0dd21a86ed17fdd0d389e1b578857b09a0ef1236ef02cefd6f7d7e7a23e1d200066361de50315655b614ef5f7f72f1e6
Digest: 721f1b1c6afa722e10001f2e2058844756cdf51c4ca00179073665a34720b317

# or
cargo run --example turboshake256

Message: 2f9f1b0bcf2b22a641ac3db02308c3bdf19acea8d271bd4d72d107c53b19e145fa520ffe15cdba0236131071b0d4f84cb57b2842220f5d13ff0393cb1c37d679
Digest: 9e5310a6f2965899ebcdea891b01d08431957ad0dd12bee163c55c8e38b2cf4c

No runtime deps