5 unstable releases

0.3.0 May 3, 2020
0.3.0-alpha.2 Apr 5, 2020
0.3.0-alpha.1 Jan 11, 2020
0.2.0 Jun 14, 2018
0.1.0 Nov 20, 2017

#64 in #gpgpu


Used in 2 crates (via accel)

MIT/Apache

15KB
334 lines

accel-derive

Procedural-macro crate providing the #[kernel] attribute. A function annotated with #[kernel] is converted into two parts:

  • Device code, which is compiled to PTX assembly
  • Host code, which calls the generated device code (PTX assembly) through the accel::module API

lib.rs:

Get the compiled PTX as a string

For a kernel function named add, the #[kernel] proc-macro generates a submodule add in addition to the function add. The kernel's Rust code is compiled to a PTX string using rustc's nvptx64-nvidia-cuda target. The generated PTX string is embedded in the proc-macro output as {kernel_name}::PTX_STR.

use accel_derive::kernel;

#[kernel]
unsafe fn add(a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    let i = accel_core::index();
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}

// The PTX assembly code is embedded as `add::PTX_STR`
println!("{}", add::PTX_STR);
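
The shape of the generated items can be illustrated without a GPU or the accel crates. The following is a minimal hand-written sketch, not the actual macro output: it only shows how a function add and a submodule add can coexist in one scope (Rust keeps functions and modules in separate namespaces) and how a string constant is then reachable as add::PTX_STR. The PTX text is a placeholder, and the body and return type of the host-side add stub are assumptions made for illustration; the real generated function launches the kernel through the accel::module API.

```rust
// Hypothetical stand-in for the items #[kernel] generates for a kernel
// named `add`. Everything here is illustrative, not real macro output.

/// Submodule that carries the embedded device code.
mod add {
    /// Placeholder text only; the real constant holds PTX emitted by rustc.
    pub const PTX_STR: &str = "// placeholder PTX\n.visible .entry add() { ret; }\n";
}

/// Host-side function sharing its name with the submodule above.
/// This stub only reports the size of the embedded string (an assumption
/// for illustration); it does not load or launch anything.
fn add() -> usize {
    // `add::PTX_STR` resolves to the module, `add()` to the function.
    add::PTX_STR.len()
}

fn main() {
    println!("{}", add::PTX_STR);
    println!("host stub sees {} bytes of PTX", add());
}
```

Because the string is an ordinary constant, it can be inspected or written to a file at run time without any CUDA device present.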

Dependencies

~0.7–1.7MB
~32K SLoC