17 releases (5 breaking)

new 0.9.0 Jan 14, 2025
0.8.1 Jan 7, 2025
0.8.0 Dec 23, 2024
0.7.0 Nov 25, 2024
0.4.3 Oct 18, 2024

#845 in Algorithms

Download history 489/week @ 2024-10-16 410/week @ 2024-10-23 44/week @ 2024-10-30 300/week @ 2024-11-06 31/week @ 2024-11-13 259/week @ 2024-11-20 43/week @ 2024-11-27 162/week @ 2024-12-04 134/week @ 2024-12-11 127/week @ 2024-12-18 63/week @ 2024-12-25 61/week @ 2025-01-01 284/week @ 2025-01-08

540 downloads per month
Used in augurs

MIT/Apache

1.5MB
372 lines

Time series clustering algorithms

Time series clustering algorithms.

So far, only DBSCAN is implemented, and the distance matrix must be passed directly. A crate such as augurs-dtw must be used to calculate the distance matrix for now.

Usage

use augurs::clustering::{DbscanCluster, DbscanClusterer, DistanceMatrix};

# fn main() -> Result<(), Box<dyn std::error::Error>> {
// Start with a distance matrix.
// This can be calculated using e.g. dynamic time warping
// using the `augurs-dtw` crate.
let distance_matrix = DistanceMatrix::try_from_square(
    vec![
        vec![0.0, 0.1, 0.2, 2.0, 1.9],
        vec![0.1, 0.0, 0.15, 2.1, 2.2],
        vec![0.2, 0.15, 0.0, 2.2, 2.3],
        vec![2.0, 2.1, 2.2, 0.0, 0.1],
        vec![1.9, 2.2, 2.3, 0.1, 0.0],
    ],
)?;

// Epsilon is the maximum distance between two series for them to be considered in the same cluster.
let epsilon = 0.3;
// The minimum number of series in a cluster before it is considered non-noise.
let min_cluster_size = 2;

// Use DBSCAN to detect clusters of series.
// Note that we don't need to specify the number of clusters in advance.
let clusters = DbscanClusterer::new(epsilon, min_cluster_size).fit(&distance_matrix);
assert_eq!(
    clusters,
    vec![
        DbscanCluster::Cluster(1.try_into().unwrap()),
        DbscanCluster::Cluster(1.try_into().unwrap()),
        DbscanCluster::Cluster(1.try_into().unwrap()),
        DbscanCluster::Cluster(2.try_into().unwrap()),
        DbscanCluster::Cluster(2.try_into().unwrap()),
    ],
);
# Ok(())
# }

Credits

This implementation is based heavily on to the implementation in linfa-clustering and scikit-learn. The main difference between these is that we operate directly on the distance matrix rather than calculating it as part of the clustering algorithm.

License

Dual-licensed to be compatible with the Rust project. Licensed under the Apache License, Version 2.0 <http://www.apache.org/licenses/LICENSE-2.0> or the MIT license <http://opensource.org/licenses/MIT>, at your option.

Dependencies