#profiling #multi-threading #performance #macro #instrumentation

macro profi-attributes

Procedural macro attributes for the profi crate

3 releases

0.1.2 Mar 30, 2024
0.1.1 Mar 30, 2024
0.1.0 Mar 30, 2024

#66 in #instrumentation

33 downloads per month
Used in 2 crates (via profi)

MIT license

13KB
65 lines

Profi

A simple profiler for single and multithreaded applications.

Record the time it takes for a scope to end and print the timings when the program exits.

Each measurement has an overhead of ~25ns-50ns, so it shouldn't impact benchmarks.
Run the benchmarks example to see what's the overhead on your machine.

Setup

profi is controlled by the enable feature, which is active by default.
When disabled, all macros and methods will become no-ops, resulting in zero impact on your code.

To disable it, add default-features = false to the profi dependency in your Cargo.toml.

For convenience, you can also add a custom feature:

[dependencies]
profi = { version = "*", default-features = false }

[features]
prof = ["profi/enable"]

And run it with cargo run --release --features prof

If you use rayon, enable the rayon feature!

Usage

See the examples for more usage cases.

Basic Usage

use profi::{prof, print_on_exit};

fn main() {
 // Prints the timings to stdout when the program exits
 // Always put at the top of the main function to ensure it's dropped last
 //
 // An implicit `main` guard is created to profile the whole application
 print_on_exit!();

 // Sleep for a bit to simulate some work
 std::thread::sleep(std::time::Duration::from_millis(200));
}
┌──────────────┬────────────────────┬───────────┬──────────────┬───────┐
│ Name         ┆ % Application Time ┆ Real Time ┆ Average time ┆ Calls │
╞══════════════╪════════════════════╪═══════════╪══════════════╪═══════╡
│ simple::main ┆ 100.00%            ┆ 200.13ms  ┆       -      ┆     1 │
└──────────────┴────────────────────┴───────────┴──────────────┴───────┘

Loops

use profi::{prof, print_on_exit};

fn main() {
  print_on_exit!();

  for _ in 0..100 {
      prof!(iteration);
      std::thread::sleep(std::time::Duration::from_millis(10));
  }
}
┌────────────┬────────────────────┬───────────┬──────────────┬───────┐
│ Name       ┆ % Application Time ┆ Real Time ┆ Average time ┆ Calls │
╞════════════╪════════════════════╪═══════════╪══════════════╪═══════╡
│ loop::main ┆ 100.00%            ┆ 1.01s     ┆       -      ┆     1 │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ iteration  ┆ 99.99%             ┆ 1.01s     ┆ 10.10ms/call ┆   100 │
└────────────┴────────────────────┴───────────┴──────────────┴───────┘

Multiple threads

use profi::{print_on_exit, prof_guard};

fn do_work(i: usize) {
    // Need to bind it to a variable to ensure it isn't dropped before sleeping
    let _guard = if i < 6 {
        prof_guard!("6 first")
    } else {
        prof_guard!("4 last")
    };
    std::thread::sleep(std::time::Duration::from_millis(10));
    // The guard goes out of scope here
}

fn main() {
    print_on_exit!();
    
    // Spawn 10 threads
    std::thread::scope(|s| {
        for i in 0..10 {
            s.spawn(move || {
                do_work(i);
            });
        }
    });
}
┌───────────────┬────────────────────┬───────────┬────────────┬──────────┬──────────────┬───────┐
│ Name          ┆ % Application Time ┆ Real Time ┆ % CPU Time ┆ CPU Time ┆ Average time ┆ Calls │
╞═══════════════╪════════════════════╪═══════════╪════════════╪══════════╪══════════════╪═══════╡
│ threads::main ┆ 100.00%            ┆ 10.48ms   ┆ 9.43%      ┆ 10.48ms  ┆       -      ┆     1 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 6 first       ┆ 96.42%             ┆ 10.11ms   ┆ 54.38%     ┆ 60.44ms  ┆ 10.08ms/call ┆     6 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 4 last        ┆ 95.93%             ┆ 10.06ms   ┆ 36.19%     ┆ 40.22ms  ┆ 10.06ms/call ┆     4 │
└───────────────┴────────────────────┴───────────┴────────────┴──────────┴──────────────┴───────┘

"CPU Time" is the combined time all threads have spent on that scope.

For example, "6 first" has a "CPU Time" of 60 milliseconds because each thread waits 10ms, and the program spawns six of them.

Attribute

Enable the attributes feature to use the profile attribute on functions.
This will add a guard at the start of the function.

use profi::profile;

#[profile]
fn anotated() { /* ... */ }

Features

Name Description
enable Activates the profiling, if not active all macros become no-ops
attributes Enables the #[prof] macro
deep-hierarchy By default profi merges all uses of a function, use this feature to disable this behaviour.
See the nested example for more information
nightly Enables nightly-only optimizations (unused at the moment)
rayon Necessary if using rayon

Dependencies

~1.5MB
~24K SLoC