#arithmetic-coding #coding #arithmetic #lossless #entropy #cacm87

bin+lib simple-arithmetic-coding

Arithmetic coding, directly derived from the well-known CACM87 C-language implementation

4 releases

new 0.2.2 Mar 4, 2025
0.2.1 Mar 3, 2025
0.2.0 Mar 2, 2025
0.1.1 Aug 8, 2024
0.1.0 Aug 8, 2024

#259 in Compression

401 downloads per month

MIT license

29KB
863 lines

Arithmetic coding in Rust

A simple arithmetic coder implementation that is meant to be easy to read. The crate includes a driver binary and a library; the binary is built by default.

Encoding from the command line

Run target/release/simple-arithmetic-coding -e, or use the path to wherever you compiled the binary. The input is read from stdin. For example, on Linux you can pipe a file in with cat and redirect the output to a file: cat war_and_peace.txt | target/release/simple-arithmetic-coding -e > output.bin

Decoding from the command line

Decoding works the same way: run target/release/simple-arithmetic-coding -d with the encoded input on stdin. Only the command-line switch differs.
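
The two command-line modes can also be driven from Rust to check that a file survives an encode/decode round trip. The following is only a sketch under stated assumptions: the helper names coder_command, run_coder and round_trips are made up for this example and are not part of the crate, and it assumes that decoding writes its result to stdout in the same way the encoding example above does.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Hypothetical helper: builds the command for one mode.
/// `flag` is either "-e" (encode) or "-d" (decode), as described above.
fn coder_command(bin: &str, flag: &str) -> Command {
    let mut cmd = Command::new(bin);
    cmd.arg(flag).stdin(Stdio::piped()).stdout(Stdio::piped());
    cmd
}

/// Hypothetical helper: feeds `input` to the binary on stdin and
/// returns whatever the binary wrote to stdout.
fn run_coder(bin: &str, flag: &str, input: &[u8]) -> std::io::Result<Vec<u8>> {
    let mut child = coder_command(bin, flag).spawn()?;
    let mut stdin = child.stdin.take().expect("stdin was piped");
    let data = input.to_vec();
    // Write on a separate thread so that a full stdout pipe cannot
    // deadlock against a blocked stdin write. Dropping `stdin` at the
    // end of the closure signals end of input to the child.
    let writer = std::thread::spawn(move || stdin.write_all(&data));
    let output = child.wait_with_output()?;
    writer.join().expect("writer thread panicked")?;
    Ok(output.stdout)
}

/// Encodes and then decodes `input`, reporting whether the result
/// matches. Assumes the decoder writes to stdout.
fn round_trips(bin: &str, input: &[u8]) -> std::io::Result<bool> {
    let encoded = run_coder(bin, "-e", input)?;
    let decoded = run_coder(bin, "-d", &encoded)?;
    Ok(decoded == input)
}
```

A call such as round_trips("target/release/simple-arithmetic-coding", &std::fs::read("war_and_peace.txt")?) would then be expected to return Ok(true) if the round trip is lossless.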

As a lib

The library provides two functions and two iterators. The functions are encoding_routine and decoding_routine; their argument names are self-documenting, and an IDE or text editor will show you their trait bounds. The iterators work similarly, except that they only need to wrap their input. Both iterators yield u8 values. A goal of this project is to keep the external API unchanged forever.
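
To illustrate the general shape of an iterator that wraps an input and yields u8 values, here is a minimal sketch. It is not this crate's API and it does not perform arithmetic coding: DeltaEncoder, DeltaDecoder and round_trip are hypothetical names, and a trivial delta transform stands in for the real coder so that the example stays short. Check the crate's own documentation for the actual type names and trait bounds.

```rust
/// Hypothetical stand-in for an encoding iterator: wraps any iterator
/// over bytes and yields the difference to the previous byte.
struct DeltaEncoder<I: Iterator<Item = u8>> {
    inner: I,
    prev: u8,
}

impl<I: Iterator<Item = u8>> DeltaEncoder<I> {
    fn new(inner: I) -> Self {
        Self { inner, prev: 0 }
    }
}

impl<I: Iterator<Item = u8>> Iterator for DeltaEncoder<I> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        let byte = self.inner.next()?;
        let out = byte.wrapping_sub(self.prev);
        self.prev = byte;
        Some(out)
    }
}

/// Hypothetical stand-in for a decoding iterator: undoes DeltaEncoder.
struct DeltaDecoder<I: Iterator<Item = u8>> {
    inner: I,
    prev: u8,
}

impl<I: Iterator<Item = u8>> DeltaDecoder<I> {
    fn new(inner: I) -> Self {
        Self { inner, prev: 0 }
    }
}

impl<I: Iterator<Item = u8>> Iterator for DeltaDecoder<I> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        let delta = self.inner.next()?;
        self.prev = self.prev.wrapping_add(delta);
        Some(self.prev)
    }
}

/// Wraps the input in the encoder, wraps that in the decoder, and
/// collects the resulting u8 values.
fn round_trip(input: &[u8]) -> Vec<u8> {
    DeltaDecoder::new(DeltaEncoder::new(input.iter().copied())).collect()
}
```

Because both wrappers are ordinary iterators over u8, they compose with the rest of the standard iterator toolbox (collect, take, chain and so on), which is the main appeal of this style of interface.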

Updates in version 0.2.1

  • Fixed a silly bug regarding subtraction overflow.
  • Added some smart integration tests.
  • Added a trivial example.
  • Updating to this version or higher is recommended and probably required.

Useful notes

  • The encoder introduces noticeable drift on certain inputs in which the same byte repeats a little over 187,000 times or more.
  • War and Peace from Project Gutenberg does not include a single zero byte.
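
Both notes can be checked against your own data before encoding. The helpers below are a sketch with hypothetical names (repeated_input, longest_run, contains_zero_byte); none of them is part of the crate. The sketch assumes that the first note refers to consecutive repeats of the same byte, which the note itself does not state.

```rust
/// Builds a test input consisting of `len` copies of `byte`.
fn repeated_input(byte: u8, len: usize) -> Vec<u8> {
    vec![byte; len]
}

/// Returns the length of the longest run of identical consecutive bytes.
/// Assumption: "repeats" in the note above means consecutive repeats.
fn longest_run(data: &[u8]) -> usize {
    let mut best = 0;
    let mut current = 0;
    let mut prev: Option<u8> = None;
    for &byte in data {
        if Some(byte) == prev {
            current += 1;
        } else {
            current = 1;
            prev = Some(byte);
        }
        best = best.max(current);
    }
    best
}

/// Reports whether the data contains at least one zero byte.
fn contains_zero_byte(data: &[u8]) -> bool {
    data.contains(&0)
}
```

For example, repeated_input(b'a', 190_000) produces an input whose longest run exceeds the 187,000 mark mentioned above, and contains_zero_byte can be applied to the bytes of any file read with std::fs::read.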

Dependencies

~150KB