1 unstable release

0.1.0 Jun 26, 2022

#2351 in Data structures

MIT license

32KB
386 lines

sortbuf -- data structure for sorting large numbers of items in memory

This library provides types and traits for accumulating a large number of items in memory and iterating over them in ascending or descending order. It outperforms BTree-based sorting, introduces low memory overhead and allows insertion of items from multiple threads as well as reacting to allocation failures without losing data. However, it's sole purpose is sorting and it provides no other functionality.

Example

let mut sortbuf = sortbuf::SortBuf::new();
let mut inserter = sortbuf::Inserter::new(&mut sortbuf);
inserter.insert_items([10, 20, 5, 17]).expect("Failed to insert items");
drop(inserter);
assert!(sortbuf.into_iter().eq([20, 17, 10, 5]));

License

This work is provided under the MIT license. See LICENSE for more details.


lib.rs:

Sort a large number of items in memory

This library provides types, most prominently [SortBuf], and traits for accumulating a large number of items in memory and iterating over them in ascending or descending order (as per [Ord]). The implementation

  • avoids potentially costly reallocations,
  • releases chunks of memory every now and then during iteration,
  • doesn't introduce much memory overhead,
  • allows reacting to allocation failures,
  • supports multi-threaded insertion and
  • isn't awfully slow.

Examples

Natively, [SortBuf] will prepare items for descending iteration:

let mut sortbuf = sortbuf::SortBuf::new();
let mut inserter = sortbuf::Inserter::new(&mut sortbuf);
inserter.insert_items([10, 20, 5, 17]).expect("Failed to insert items");
drop(inserter);
assert!(sortbuf.into_iter().eq([20, 17, 10, 5]));

For ascending iteration, items need to be wrapped in std::cmp::Reverse. However, the library provides convenience functions for handling the (un)wrapping:

let mut sortbuf = sortbuf::SortBuf::new();
let mut inserter = sortbuf::Inserter::new(&mut sortbuf);
inserter.insert_items_reversed([10, 20, 5, 17]).expect("Failed to insert items");
drop(inserter);
assert!(sortbuf.unreversed().eq([5, 10, 17, 20]));

Multithreaded insertion is supported via multiple [Inserter]s:

use std::sync::{Arc, Mutex};
let sortbuf: Arc<Mutex<sortbuf::SortBuf<_>>> = Default::default();
let workers: Vec<_> = (0..4).map(|n| {
    let mut inserter = sortbuf::Inserter::new(sortbuf.clone());
    std::thread::spawn(move || inserter
        .insert_items((0..1000).map(|i| 4*i+n))
        .expect("Failed to insert items"))
}).collect();
workers.into_iter().try_for_each(|h| h.join()).unwrap();
assert!(sortbuf.lock().unwrap().take().into_iter().eq((0..4000).rev()));

Approach and comparison

As indicated in the examples above, adding new items to a buffer is done via [Inserter]s. These accumulate items in pre-sorted [Bucket]s and commit them to their target buffer. Later, that buffer can be converted to an [Iterator] which yields items taken from those [Bucket]s, which involves selecting the [Bucket] with the current greatest item in the buffer.

While a significant amount of time is spent during insertion, the majority of time is usually spent during iteration. Performance is usually better, and skews towards more time spent in the parallelizable insertion state, with fewer, bigger [Bucket]s. As [Bucket]s are pre-allocated, this comes at the cost of flexibility regarding memory.

Comparison to Vec and sort

Buffering and sorting items can also be done using a [Vec] (for buffering) in conjunction with slice::sort, slice::sort_unstable or another sorting function. The process is then usually split into an insertion, a sorting and an iteration phase, with sort being the most computational intensive phase.

Sorting and iteration over the items in a [Vec] is generally faster than with the utilities provided by this library ---in the single-threaded case. However, this library does allow insertion from multiple threads, contrary to a bare [Vec]. In addition, the performance of inserting items into a [Vec] hinges on the reallocation performance, which might be poor in some cases, e.g. if multiple buffers are involved.

The need of a single, separate, computational intensive sorting phase may also have some implications on overall performance in some use-cases. With the types provided by this library, sorting will likely interleave with I/O linked to insertion and/or the final iteration, spread out over the entire process. Thus, the underlying OS may have more opportunities to perform background operations related to reads (insertion stage) and writes (iteration stage), increasing the overall throughput.

Comparison to BTreeSet

Another option for sorting items without the need for a separate sorting phase would be an BTreeSet. Contrary to the sortbuf approach, most of the time is spent in the insertion phase rather than the iteration phase. Using a BTreeSet is usually slower than a [SortBuf] with sufficiently large [Bucket]s, not parallelizable and incurs a higher memory overhead.

No runtime deps