duval-rs

A simple implementation of the Duval algorithm in Rust

1 unstable release

0.1.0 Feb 26, 2024

#924 in Text processing


Used in 2 crates

MIT license

7KB
103 lines

Duval algorithm for Lyndon factorization and smallest rotation computation

We provide implementation of Duval algorithm for Lyndon factorization and smallest rotation computation in C++, Rust and Python.

Our code for the Lyndon factorization is based on this implementation.

Our code for the smallest rotation computation is custom and highly optimized (for the C++ and Rust versions).

Usage

Pure Python

Just use the following code snippet (from duval_pure.py)

def min_rotation(s):
    N = len(s)
    s += s
    a = b = 0
    while b < N:
        for i in range(N - a):
            sai = s[a + i]
            sbi = s[b + i]
            if sai < sbi or a + i == b:
                if i:
                    b += i - 1
                break
            if sai > sbi:
                a = b
                break
        b += 1
    return a

Call C++ from Python

Requires pybind11 (pip install pybind11).

Install with pip install ./duval-py

from duval import duval, min_rotation

s = "abracadabra"
print(duval(s))
print(min_rotation(s))

C++

Simply include duval-cpp/duval.hpp.

#include <iostream>
#include <string>
#include "duval.hpp"

int main()
{
    std::string s = "abracadabra";
    for (auto &factor : duval(s))
        std::cout << factor << std::endl;
    std::cout << min_rotation(s) << std::endl;
}

Rust

Code is provided in duval-rs.

Testing

We provide simple tests for the Python, C++ and Rust versions.

C++

cd duval-cpp && g++ --std=c++11 -O3 test.cpp -o test && ./test && cd ..

We do extended testing for the C++ version with random strings against a very naive implementation. The random strings are generated with fixed seeds to ensure reproducibility.

We also display the performance. On our laptop, computing min_rotation for 10000 strings (half random and half random) of length 10000 takes 0.2s.

Python

python duval-py/test.py

We test both the pure Python and the C++ version and compare their performance. The Python version is about 150x slower than the C++ version.

Rust

cargo test --manifest-path=duval-rs/Cargo.toml --release -- --nocapture

The Rust version has about the same performance as the C++ version.

Dependencies

~315KB