4 releases (2 breaking)

0.3.1 Jan 13, 2023
0.3.0 Dec 12, 2022
0.2.0 May 25, 2022
0.1.0 Aug 26, 2021

#1082 in Encoding

10,344 downloads per month
Used in 64 crates (via tantivy-nightly)

MIT license

280KB
6.5K SLoC

Fast Field Codecs

This crate contains various fast field codecs, used to compress/decompress fast field data in tantivy.

Contributing

Adding a codec is straightforward. Bitpacking is the simplest codec, so use it as a reference implementation.

A codec needs to implement 2 traits:

  • A reader implementing FastFieldCodecReader to read the codec.
  • A serializer implementing FastFieldCodecSerializer, which serializes the data, estimates the compression ratio, and provides the codec's name and id.
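
To make the reader/serializer split concrete, here is a minimal, self-contained sketch of a bitpacking codec. The traits SketchSerializer and SketchReader are hypothetical, simplified stand-ins invented for this example; they are not the crate's FastFieldCodecSerializer and FastFieldCodecReader, whose actual signatures should be taken from the source. The byte layout (8-byte minimum, 1-byte bit width, then packed deltas) is likewise an assumption made for illustration.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for the serializer side of a codec.
pub trait SketchSerializer {
    const NAME: &'static str;
    const ID: u8;
    /// Estimated compressed size relative to raw 64-bit values.
    fn estimate(values: &[u64]) -> f32;
    fn serialize(out: &mut impl Write, values: &[u64]) -> io::Result<()>;
}

/// Hypothetical stand-in for the reader side of a codec.
pub trait SketchReader: Sized {
    fn open(bytes: &[u8]) -> io::Result<Self>;
    fn get_u64(&self, idx: usize) -> u64;
}

/// Number of bits needed to represent `amplitude` (0 needs 0 bits).
fn bits_for(amplitude: u64) -> u8 {
    (64 - amplitude.leading_zeros()) as u8
}

fn min_max(values: &[u64]) -> (u64, u64) {
    let min = values.iter().copied().min().unwrap_or(0);
    let max = values.iter().copied().max().unwrap_or(0);
    (min, max)
}

pub struct BitpackedSketch;

impl SketchSerializer for BitpackedSketch {
    const NAME: &'static str = "BitpackedSketch";
    const ID: u8 = 1;

    fn estimate(values: &[u64]) -> f32 {
        let (min, max) = min_max(values);
        // Header bytes are ignored in this estimate.
        bits_for(max - min) as f32 / 64.0
    }

    fn serialize(out: &mut impl Write, values: &[u64]) -> io::Result<()> {
        let (min, max) = min_max(values);
        let num_bits = bits_for(max - min);
        // Header: minimum value (8 bytes, little endian) and bit width (1 byte).
        out.write_all(&min.to_le_bytes())?;
        out.write_all(&[num_bits])?;
        // Body: each `value - min` packed into `num_bits` bits, lowest bit first.
        let mut acc: u128 = 0;
        let mut filled: u32 = 0;
        for &v in values {
            acc |= ((v - min) as u128) << filled;
            filled += num_bits as u32;
            while filled >= 8 {
                out.write_all(&[acc as u8])?;
                acc >>= 8;
                filled -= 8;
            }
        }
        if filled > 0 {
            out.write_all(&[acc as u8])?;
        }
        Ok(())
    }
}

pub struct BitpackedSketchReader {
    min: u64,
    num_bits: u8,
    body: Vec<u8>,
}

impl SketchReader for BitpackedSketchReader {
    fn open(bytes: &[u8]) -> io::Result<Self> {
        if bytes.len() < 9 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "header too short"));
        }
        let mut min_bytes = [0u8; 8];
        min_bytes.copy_from_slice(&bytes[..8]);
        Ok(Self {
            min: u64::from_le_bytes(min_bytes),
            num_bits: bytes[8],
            body: bytes[9..].to_vec(),
        })
    }

    fn get_u64(&self, idx: usize) -> u64 {
        // Read the value bit by bit; slow, but easy to verify.
        let start = idx * self.num_bits as usize;
        let mut delta: u64 = 0;
        for bit in 0..self.num_bits as usize {
            let pos = start + bit;
            let b = (self.body[pos / 8] >> (pos % 8)) & 1;
            delta |= (b as u64) << bit;
        }
        self.min + delta
    }
}
```

Serializing the values 1000, 1003, 1001, 1007 with this sketch stores 3 bits per value (the largest delta is 7), and opening the resulting bytes with BitpackedSketchReader returns the original values by index. A real codec would read whole words rather than single bits.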

Tests

Once the traits are implemented, integrating the codec into the tests and benchmarks is easy (see test_with_codec_data_sets and bench.rs).

Make sure to add the codec to main.rs, which compares the actual compression ratio with the estimate on different data sets. You can run it with:

cargo run --features bin

TODO

  • Add real-world data sets to the comparison
  • Add a codec that covers sparse data sets

Codec Comparison

+----------------------------------+-------------------+------------------------+
|                                  | Compression Ratio | Compression Estimation |
+----------------------------------+-------------------+------------------------+
| Autoincrement                    |                   |                        |
+----------------------------------+-------------------+------------------------+
| LinearInterpol                   | 0.000039572664    | 0.000004396963         |
+----------------------------------+-------------------+------------------------+
| MultiLinearInterpol              | 0.1477348         | 0.17275847             |
+----------------------------------+-------------------+------------------------+
| Bitpacked                        | 0.28126493        | 0.28125                |
+----------------------------------+-------------------+------------------------+
| Monotonically increasing concave |                   |                        |
+----------------------------------+-------------------+------------------------+
| LinearInterpol                   | 0.25003937        | 0.26562938             |
+----------------------------------+-------------------+------------------------+
| MultiLinearInterpol              | 0.190665          | 0.1883836              |
+----------------------------------+-------------------+------------------------+
| Bitpacked                        | 0.31251436        | 0.3125                 |
+----------------------------------+-------------------+------------------------+
| Monotonically increasing convex  |                   |                        |
+----------------------------------+-------------------+------------------------+
| LinearInterpol                   | 0.25003937        | 0.28125438             |
+----------------------------------+-------------------+------------------------+
| MultiLinearInterpol              | 0.18676           | 0.2040086              |
+----------------------------------+-------------------+------------------------+
| Bitpacked                        | 0.31251436        | 0.3125                 |
+----------------------------------+-------------------+------------------------+
| Almost monotonically increasing  |                   |                        |
+----------------------------------+-------------------+------------------------+
| LinearInterpol                   | 0.14066513        | 0.1562544              |
+----------------------------------+-------------------+------------------------+
| MultiLinearInterpol              | 0.16335973        | 0.17275847             |
+----------------------------------+-------------------+------------------------+
| Bitpacked                        | 0.28126493        | 0.28125                |
+----------------------------------+-------------------+------------------------+
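
The table can be read with the following rough sketch in mind. It assumes that a ratio is the compressed size divided by the size of the raw 64-bit values, which is consistent with the Bitpacked estimate of 0.28125 (18 / 64) but is not stated in this README. It also assumes that linear interpolation predicts each value from the straight line through the first and last value and stores only the deviations. The function names are hypothetical, headers are ignored, and the crate's actual codecs may differ in detail.

```rust
/// Number of bits needed to represent `amplitude` (0 needs 0 bits).
fn bits_for(amplitude: u64) -> u32 {
    64 - amplitude.leading_zeros()
}

/// Value predicted at `idx` by the line through the first and last value.
fn predict(first: u64, last: u64, len: usize, idx: usize) -> i128 {
    if len < 2 {
        return first as i128;
    }
    let rise = last as i128 - first as i128;
    first as i128 + rise * idx as i128 / (len as i128 - 1)
}

/// Hypothetical estimate for a linear interpolation codec:
/// bits needed for the spread of deviations from the line, over 64.
pub fn linear_interpol_ratio(values: &[u64]) -> f32 {
    if values.is_empty() {
        return 0.0;
    }
    let first = values[0];
    let last = values[values.len() - 1];
    let mut min_dev: i128 = 0;
    let mut max_dev: i128 = 0;
    for (idx, &v) in values.iter().enumerate() {
        let dev = v as i128 - predict(first, last, values.len(), idx);
        min_dev = min_dev.min(dev);
        max_dev = max_dev.max(dev);
    }
    // Clamp so that extreme inputs cannot overflow the cast.
    let spread = (max_dev - min_dev).min(u64::MAX as i128) as u64;
    bits_for(spread) as f32 / 64.0
}

/// Hypothetical estimate for bitpacking: bits needed for `max - min`, over 64.
pub fn bitpacked_ratio(values: &[u64]) -> f32 {
    let min = values.iter().copied().min().unwrap_or(0);
    let max = values.iter().copied().max().unwrap_or(0);
    bits_for(max - min) as f32 / 64.0
}
```

Under these assumptions an autoincrement column lies exactly on the line, so every deviation is zero and only the header remains, while bitpacking still has to pay for the full value range. That matches the pattern in the Autoincrement rows, where LinearInterpol is close to zero and Bitpacked is not.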

Dependencies

~0.5–2.7MB
~46K SLoC