app cargo-multivers

Cargo subcommand to build multiple versions of the same binary, each with a different CPU features set, merged into a single portable optimized binary

14 releases (8 breaking)

0.9.0 Jul 26, 2024
0.8.1 Feb 17, 2024
0.8.0 Dec 27, 2023
0.5.0 Aug 12, 2023
0.3.0 Mar 28, 2023

MIT/Apache

cargo-multivers

Overview

cargo-multivers builds multiple versions of the binary of a Rust package. Each version is built with a set of CPU features (e.g., +cmpxchg16b,+fxsr,+sse,+sse2,+sse3) from a CPU (e.g., ivybridge) supported by the target (e.g., x86_64-pc-windows-msvc).

By default, it lists the CPUs known to rustc for a given target, then it fetches each set of CPU features and filters out the duplicates (i.e., the CPUs that support the same extensions). You can also add a section to your Cargo.toml to set the allowed list of CPUs for your package. For example, for x86_64 you could add:

[package.metadata.multivers.x86_64]
cpus = ["x86-64", "x86-64-v2", "x86-64-v3", "x86-64-v4", "raptorlake"]
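The default filtering of CPUs by feature set, described above, can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the function name `dedup_by_features`, the CPU names, and the feature lists are hypothetical, and the real tool obtains its data from rustc.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Keeps one CPU name per distinct feature set (hypothetical helper).
/// `cpus` maps a CPU name to the set of features it enables.
fn dedup_by_features(cpus: &BTreeMap<&str, BTreeSet<&str>>) -> Vec<String> {
    let mut seen: BTreeSet<&BTreeSet<&str>> = BTreeSet::new();
    let mut kept = Vec::new();
    for (name, features) in cpus {
        // `insert` returns false when an identical feature set was already seen.
        if seen.insert(features) {
            kept.push(name.to_string());
        }
    }
    kept
}

fn main() {
    // Made-up CPU names and features, for illustration only.
    let mut cpus = BTreeMap::new();
    cpus.insert("cpu-a", BTreeSet::from(["sse", "sse2"]));
    cpus.insert("cpu-b", BTreeSet::from(["sse", "sse2"])); // same features as cpu-a
    cpus.insert("cpu-c", BTreeSet::from(["sse", "sse2", "avx"]));
    println!("{:?}", dedup_by_features(&cpus));
}
```

Because a `BTreeMap` iterates in key order, the CPU that sorts first is the one kept for each feature set. Which duplicate the real tool keeps is not stated here.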

After building the different versions, it hashes each one and filters out the duplicates (i.e., the builds that produced identical binaries despite targeting different CPU features). Finally, it builds a runner that embeds one version in compressed form (the source) and the others as compressed binary patches against the source. For instance, when building for the target x86_64-pc-windows-msvc, by default 37 different versions will be built, filtered, compressed, and merged into a single portable binary.
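As a rough sketch of the hash-based filtering step, the snippet below keeps only builds whose bytes have not been seen before. It assumes the standard library's `DefaultHasher` as a stand-in, since the hash function the tool actually uses is not specified here; `content_hash`, `dedup_builds`, and the byte contents are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hashes the bytes of one build (stand-in for whatever hash the tool uses).
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Returns the names of builds whose contents were not seen before, in input order.
fn dedup_builds<'a>(builds: &[(&'a str, Vec<u8>)]) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    let mut kept = Vec::new();
    for (name, bytes) in builds {
        // `insert` returns false when this hash is already present.
        if seen.insert(content_hash(bytes)) {
            kept.push(*name);
        }
    }
    kept
}

fn main() {
    // Fake "binaries": the first two are byte-for-byte identical.
    let builds = [
        ("x86-64", vec![1u8, 2, 3]),
        ("x86-64-v2", vec![1u8, 2, 3]),
        ("x86-64-v3", vec![4u8, 5, 6]),
    ];
    println!("{:?}", dedup_builds(&builds));
}
```

A 64-bit hash is enough for a sketch, but a production tool would more likely use a cryptographic hash to make accidental collisions negligible.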

When executed, the runner uncompresses and executes the version that matches the CPU features of the host.
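The selection step can be illustrated with a minimal sketch that picks the first embedded build whose required features are all available on the host. This is an assumption about the general approach, not the runner's actual code: the `Build` type, `select_build`, `host_supports`, and the abbreviated feature lists are hypothetical, and decompression and patching are left out.

```rust
/// One embedded build and the CPU features it requires (hypothetical type).
struct Build {
    name: &'static str,
    required: &'static [&'static str],
}

/// Picks the first build whose required features are all supported.
/// `builds` is assumed to be ordered from most to least demanding.
fn select_build<'a>(
    builds: &'a [Build],
    supported: impl Fn(&str) -> bool,
) -> Option<&'a Build> {
    builds
        .iter()
        .find(|b| b.required.iter().all(|f| supported(*f)))
}

/// Runtime detection for a handful of x86 features; unknown names count as unsupported.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn host_supports(feature: &str) -> bool {
    match feature {
        "sse2" => std::is_x86_feature_detected!("sse2"),
        "sse4.2" => std::is_x86_feature_detected!("sse4.2"),
        "avx" => std::is_x86_feature_detected!("avx"),
        "avx2" => std::is_x86_feature_detected!("avx2"),
        "fma" => std::is_x86_feature_detected!("fma"),
        _ => false,
    }
}

/// Fallback for other architectures: report every feature as unsupported.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn host_supports(_feature: &str) -> bool {
    false
}

fn main() {
    // Abbreviated feature lists, for illustration only.
    let builds = [
        Build { name: "x86-64-v3", required: &["avx", "avx2", "fma"] },
        Build { name: "x86-64-v2", required: &["sse4.2"] },
        Build { name: "x86-64", required: &[] },
    ];
    match select_build(&builds, host_supports) {
        Some(b) => println!("would run {}", b.name),
        None => println!("no compatible build"),
    }
}
```

Passing the detection function as a parameter keeps the selection logic testable on any machine. The last entry requires no features, so it acts as a baseline fallback.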

Intended Use

While cargo-multivers could be used to build any kind of binary from a Rust package, it is mostly intended for the following use cases:

  • To build a project that is distributed to multiple users with different microarchitectures (e.g., a release version of your project).
  • To build a program that performs long running tasks (e.g., heavy computations, a server, or a game).

Tip: If you only want to optimize your program for your own CPU, you do not need cargo multivers. Use -C target-cpu=native instead, like this: RUSTFLAGS=-Ctarget-cpu=native cargo build --release. You will save some CPU cycles :)

Supported Operating Systems

This project is tested on Windows and Linux (due to the use of memfd_create, only Linux >= v3.17 is supported).

Supported Architectures

In theory, the following architectures are supported: x86, x86_64, arm, aarch64, riscv32, riscv64, powerpc, powerpc64, mips, and mips64. In practice, only x86_64 is tested.

Installation

cargo install --locked cargo-multivers

Usage

cargo +nightly multivers

Recommendations

cargo multivers uses the release profile of your package to build the binary ([profile.release]). To optimize the size of your binary and to reduce the startup time, it is recommended to enable options that reduce the size of each build. For example, the following profile reduces the size of your binary while still prioritizing speed optimizations and without significantly increasing the build time:

[profile.release]
strip = "symbols"
panic = "abort"
lto = "thin"

To reduce the total build time, it might be best to limit the set of CPUs for which the project will be built. For instance, you can add to your Cargo.toml the following section if you build for x86_64:

[package.metadata.multivers.x86_64]
cpus = ["x86-64", "x86-64-v2", "x86-64-v3", "x86-64-v4"]

GitHub Actions Integration

If you want to publish an optimized portable binary built by cargo-multivers when releasing a new version of your project, you can use the cargo-multivers GitHub Action. To do that, you need a Rust nightly toolchain and an additional step in your job:

    - uses: ronnychevalier/cargo-multivers@main

For example, this can look like:

jobs:
  cargo-multivers-build:
    name: Build with cargo-multivers
    runs-on: windows-latest
    steps:
    - name: Checkout repository
      uses: actions/checkout@v4
    - name: Install Rust nightly
      uses: dtolnay/rust-toolchain@master
      with:
        toolchain: nightly
    - uses: ronnychevalier/cargo-multivers@main
      with:
        manifest_path: path/to-your/Cargo.toml
    - name: Upload release archive
      uses: softprops/action-gh-release@v1
      if: startsWith(github.ref, 'refs/tags/')
      with:
        files: the-name-of-your-binary.exe

        ... [other config fields]

Inputs

You can set two types of inputs. The ones that are related to how cargo-multivers is installed:

| Name | Description | Required | Default |
|------|-------------|----------|---------|
| version | Version of cargo-multivers to use (e.g., 0.7.0) | false | Latest published version on crates.io |

And the ones that are related to the arguments given to cargo multivers (e.g., target configures the --target option):

| Name | Description | Required | Default |
|------|-------------|----------|---------|
| manifest_path | Path to Cargo.toml | false | |
| target | Build for the target triple | false | |
| out_dir | Copy final artifacts to this directory | true | . |
| profile | Build artifacts with the specified profile | false | |
| runner_version | Specify the version of the runner to use | false | |
| other_args | Other arguments given to cargo multivers | false | |
| build_args | Arguments given to cargo build | false | |

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
