6 releases
0.2.4 | Jul 31, 2020 |
---|---|
0.2.3 | Jul 31, 2020 |
0.1.0 | Jul 24, 2020 |
#494 in Compression
47KB
642 lines
C-Squared Histograms Aaron Gindi
This library is designed to facilitate user control over the trade-off between precision and memory usage of histograms, which are often essential in memory research.
Instead of implementing memory saving operations, this library merely simulates the operations for evaluation purposes.
Patch 0.2.0: Fix bugs relating to CompressedHistogram and made CompressedHistogram more efficient. Breaking change because Copy trait was removed from Param implementors, and C2Params now return references to embedded parameters instead of copies.
lib.rs
:
This library contains tools for simulating space efficient histograms. For the purposes of this library, a histogram is a map from labels to frequencies, both of which must be numerical.
We provide is a standard histogram implementation (StandardHistogram
) which
stores label-frequency pairs at face value using a HashMap.
Furthermore, we provide three additional histogram implementations that use either
strictly less memory than StandardHistogram
or use a fixed amount of memory. The
optimizations are done across two dimensions, the label storage (referred to as "compaction")
and the frequency storage (referred to as "compression").
Hence the name "C-Squared Histograms". The three implementations are as follows:
CompressedHistogram
- This implementation uses strictly less space than aStandardHistogram
by approximating the frequenciesCompactHistogram
- This implementation consumes a fixed amount of space depending on a few precision parameters. It saves space by approximating labels.C2Histogram
- This implementation also utilizes a fixed amount of space, usually much less than a CompactHistogram. It approximates both labels and frequencies.
Each of these implementations is parameterized such that the trade-off between precision and memory performance is directly controlled by the user.
To construct a StandardHistogram
, use the function create_standard_histogram
. Then,
the other histograms can be derived from that standard histogram using the conversion functions
to_compact
, to_compressed
, or to_c2
.