#neural-network #tensorflow

tract-linalg

Tiny, no-nonsense, self-contained TensorFlow and ONNX inference

139 releases

0.21.11 Mar 19, 2025
0.21.9 Jan 8, 2025
0.21.8 Dec 5, 2024
0.21.7 Sep 23, 2024
0.2.9 Mar 28, 2019

#1384 in Machine learning

Download history: weekly download chart, 2024-12-04 through 2025-03-19 (roughly 1,500–6,100 downloads/week)

20,717 downloads per month
Used in 38 crates (2 directly)

MIT/Apache

1MB
27K SLoC

Rust: 18K SLoC // 0.0% comments
Templ: 9K SLoC // 0.1% comments
GNU Style Assembly: 13 SLoC // 0.3% comments

tract-linalg

linalg stands for "linear algebra". This is a misnomer: this crate contains the low-level, architecture-dependent optimisations used by tract-core.

Functions

  • MatMatMul: extended matrix*matrix product:
    • inspired by the GotoBLAS and BLIS micro-kernel approach
    • extended for convolution-friendly addressing (fused im2col)
    • fused output pipeline (min, max, and a few more simple, fast ops; see the sketch after this list)
    • f32*f32 -> f32 (à la sgemm)
    • i8*i8 -> i32 accumulator -> i32 storage
    • i8*i8 -> i32 accumulator -> i8 (with channel zero-point and scale, and a re-quantization pipeline)
  • f32 sigmoid and f32 tanh: at f32 precision, computed by a rational function (no exponentiation); a rough sketch follows this list
  • byte-to-byte lookup table
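
To make the fused output pipeline concrete, here is a minimal scalar sketch in Rust. It is not tract-linalg's API: the names (`matmul_fused`, `OutputOp`) are hypothetical, and the real kernels are tiled, architecture-specific micro kernels rather than a triple loop. The point is only that min/max (and similar simple ops) are applied to the accumulator before the store, instead of in a second pass over the output.

```rust
// Illustrative sketch only: a scalar stand-in for a micro kernel with a
// fused output pipeline. tract-linalg's real kernels are tiled and written
// per architecture; the names here (matmul_fused, OutputOp) are hypothetical.

/// Post-accumulation ops applied while each output value is still in a register.
enum OutputOp {
    Min(f32),
    Max(f32),
}

fn matmul_fused(
    a: &[f32],     // m x k, row major
    b: &[f32],     // k x n, row major
    c: &mut [f32], // m x n, row major
    m: usize,
    k: usize,
    n: usize,
    pipeline: &[OutputOp],
) {
    for i in 0..m {
        for j in 0..n {
            // Accumulate the dot product for c[i, j].
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            // Fused output pipeline: apply the simple ops before storing,
            // avoiding a second pass over the output matrix.
            for op in pipeline {
                acc = match op {
                    OutputOp::Min(x) => acc.min(*x),
                    OutputOp::Max(x) => acc.max(*x),
                };
            }
            c[i * n + j] = acc;
        }
    }
}
```

A fused ReLU, for instance, is just a pipeline of `[OutputOp::Max(0.0)]` applied while each value is still hot, rather than a separate pass over c.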
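The sigmoid/tanh entry refers to rational (polynomial-over-polynomial) approximations that avoid exponentials. As a rough sketch, with no claim about tract's actual polynomial degrees or coefficients, here is a clamped [3/2] Padé approximant of tanh; the function names are hypothetical and the accuracy is well below what tract targets.

```rust
// Illustrative sketch only: a clamped [3/2] Padé approximation of tanh.
// tract-linalg's actual approximations use different, higher-degree
// polynomials tuned for f32 accuracy, and are vectorised per architecture.

fn tanh_rational(x: f32) -> f32 {
    let x2 = x * x;
    // tanh(x) ≈ x * (15 + x²) / (15 + 6x²): a ratio of polynomials,
    // so no exponentiation is needed.
    let t = x * (15.0 + x2) / (15.0 + 6.0 * x2);
    // Clamp to tanh's range; for large |x| the approximation overshoots.
    t.clamp(-1.0, 1.0)
}

// Sigmoid can then be derived from tanh: sigmoid(x) = 0.5 + 0.5 * tanh(x / 2).
fn sigmoid_rational(x: f32) -> f32 {
    0.5 + 0.5 * tanh_rational(0.5 * x)
}
```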

Implementations

|                   | generic fallback | armv6, vfp | armv7 neon | armv8 simd | x64 FMA |
|-------------------|------------------|------------|------------|------------|---------|
| MatMatMul f32     | 4x4              |            | 8x4        | 8x8        | 16x6    |
| MatMatMul i8->i8  |                  |            | 8x4        | 8x8        |         |
| MatMatMul i8->i32 |                  |            |            | 8x8        |         |
| sigmoid f32       |                  |            | 4n         | 4n         |         |
| tanh f32          |                  |            | 4n         | 4n         |         |
| byte lookup       |                  |            |            |            |         |
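
This table exists because each function is implemented several times, once per instruction set, and the best available kernel is picked on the machine running the model. The sketch below illustrates the general pattern, runtime CPU feature detection with a generic fallback; it is not tract-linalg's own selection machinery, and the function names (`dot`, `dot_fma`, `dot_generic`) are hypothetical.

```rust
// Illustrative sketch of runtime kernel selection with a generic fallback.
// tract-linalg uses its own selection machinery; these names are hypothetical.

fn dot_generic(a: &[f32], b: &[f32]) -> f32 {
    // Works on any target; the compiler auto-vectorises what it can.
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "fma")]
unsafe fn dot_fma(a: &[f32], b: &[f32]) -> f32 {
    // A real kernel would use FMA intrinsics on wide registers here;
    // the generic body only stands in for it in this sketch.
    dot_generic(a, b)
}

pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("fma") {
            // Safe: the FMA feature was just detected at runtime.
            return unsafe { dot_fma(a, b) };
        }
    }
    dot_generic(a, b)
}
```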

Dependencies

~10–18MB
~248K SLoC