2 releases
0.1.1 | Oct 11, 2023 |
---|---|
0.1.0 | Sep 12, 2023 |
#1034 in Algorithms
Used in 5 crates
(4 directly)
33KB
531 lines
QuickDiv
QuickDiv is a Rust crate that allows you to speed up repeated division and modulo operations by the same divisor, based on the libdivide C/C++ library.
On most hardware today integer division operations take longer to execute compared to operations like multiplication and addition. Because of this, compilers generally optimize division by a constant, by replacing it with a cheaper sequence of shifts, multiplications and additions. This crate lets you apply a similar algorithm to optimize division by values that are only known at runtime.
Performance gains will vary between platforms, CPUs, and integer widths, but you can expect dividing an integer by a precomputed divisor to be somewhere between 2 to 10 times faster compared to the built-in hardware division method. Note that preparing the divisor is more expensive than a single unoptimized division: it will take at least 2 divisions by the same divisor to break even.
This crate supports primitive integer types of all widths, in both signed and
unsigned variants. It requires Rust version 1.54 or greater. It is #![no_std]
and #![forbid(unsafe_code)]
.
Example
use quickdiv::DivisorU64;
fn is_quadratic_residue(q: u64, modulus: u64) -> bool {
// Initializing a divisor is more expensive than a single
// unoptimized division, to gain a benefit you must divide
// multiple times by the same divisor.
let modulus = DivisorU64::new(modulus);
// The original value can be recovered by using ::get().
for x in (0..modulus.get()) {
// A divisor can be used as the second operand with
// the / and % operators.
if (x * x) % modulus == q {
return true;
}
}
false
}
assert!(is_quadratic_residue(152, 169));
assert!(!is_quadratic_residue(51, 111));
Performance
The following benchmarks should give a rough sense of the kind of speed-up you
can expect. The numbers represent throughput in millions of elements per second
(larger is better) on various tasks.
For Quotient Sum
we divide a collection of random integers by a fixed
divisor, in LCG
we compute the remainder given a fixed modulus
and, finally, in FizzBuzz
we check the divisibility of random integers by a
fixed divisor.
Task | CPU | Compiler | QuickDiv |
---|---|---|---|
Quotient Sum | 248.3 | 838 | 823.5 |
LCG | 168.8 | 252.7 | 255 |
FizzBuzz | 41.13 | 1350 | 556 |
Note that while QuickDiv computes the remainder and checks if its
zero, the compiler uses a different method to directly check divisibility,
leading to faster performance on the FizzBuzz
task.
Caveats
- These results are for
u64
only. Performance can vary with width and signedness. - Benchmarks were run on an AMD Ryzen 5 2600 CPU (i.e. an older x86-64 CPU). Some newer high-end processors like the Apple M1/M2 have very fast hardware division, and will experience a less dramatic speed-up.
- All tasks involved at least 1000 repeated uses of the same divisor, making branch prediction trivial. You will experience worse performance if you are instead iterating over a collection of different divisors. If you want to read more about this, check out Paul Khuong's post about his branchfree Rust integer division library Reciprocal.
If you would like to run these benchmarks yourself, check out the benchmarks
crate in the GitHub
repository.
License
Licensed under any of:
- Apache License, Version 2.0, (LICENSE-APACHE or https://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or https://opensource.org/licenses/MIT)
- zlib License (LICENSE-ZLIB or https://opensource.org/license/zlib/)
by your choice.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be multi-licensed as above, without any additional terms or conditions.