5 unstable releases

0.3.0	May 3, 2020
0.3.0-alpha.2	Apr 5, 2020
0.3.0-alpha.1	Jan 11, 2020
0.2.0	Jun 14, 2018
0.1.0	Nov 20, 2017

#68 in #gpgpu

25 downloads per month
Used in 2 crates (via accel)

MIT/Apache

15KB
334 lines

accel-derive

Procedural-macro crate that provides the #[kernel] attribute. A function annotated with #[kernel] is converted into two parts:

Device code, which is compiled into PTX assembly
Host code, which calls the generated device code (PTX assembly) through the accel::module API
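
The two-part split above can be pictured with a hand-written sketch. This is not the actual output of #[kernel]: the PTX text is a placeholder header, and the host-side function is a hypothetical stand-in that only returns the PTX it would hand to the loader, where the real generated code would go through the accel::module API. It assumes the embedded constant is a static string slice.

```rust
// Hand-written illustration of the two-part shape; NOT real macro output.

/// Part 1 (device side): a module named after the kernel that carries the
/// PTX text. The content here is a placeholder header, not a compiled kernel.
pub mod add {
    pub const PTX_STR: &str = ".version 6.0\n.target sm_30\n.address_size 64\n";
}

/// Part 2 (host side): a function with the same name as the kernel.
/// Hypothetical stand-in: it only returns the PTX it would load, whereas the
/// real host code would load and launch it on the GPU.
pub fn add() -> &'static str {
    // `add::PTX_STR` resolves to the module, because modules and functions
    // live in different namespaces and may share a name.
    add::PTX_STR
}
```

The module and the function can share the name add because Rust keeps modules and functions in separate namespaces, which is what lets a single attribute produce both from one kernel definition.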

Get compiled PTX as `String`

The #[kernel] proc-macro creates a submodule add in addition to a function add. The kernel's Rust code is compiled into a PTX string using rustc's nvptx64-nvidia-cuda target, and the generated PTX string is embedded into the proc-macro output as {kernel_name}::PTX_STR.

use accel_derive::kernel;

#[kernel]
unsafe fn add(a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    let i = accel_core::index();
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}

// PTX assembler code is embedded as `add::PTX_STR`
println!("{}", add::PTX_STR);
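
Once the PTX is available as a string, it can be inspected like any other text. The following is a minimal sketch, assuming only that kernels are declared with the PTX `.entry` directive; `entry_names` and `SAMPLE_PTX` are hypothetical names introduced for illustration, and the sample is a hand-written fragment standing in for `add::PTX_STR`, not real compiler output.

```rust
/// Collect the names of kernel entry points declared in a PTX string.
/// A kernel declaration looks like `.visible .entry add(`.
pub fn entry_names(ptx: &str) -> Vec<String> {
    ptx.lines()
        .filter_map(|line| {
            // Keep only the text that follows the `.entry` directive.
            let rest = line.split(".entry").nth(1)?;
            // The name ends at the opening parenthesis or at whitespace.
            let name = rest
                .trim_start()
                .split(|c: char| c == '(' || c.is_whitespace())
                .next()?;
            if name.is_empty() {
                None
            } else {
                Some(name.to_string())
            }
        })
        .collect()
}

/// Hand-written stand-in for `add::PTX_STR` (assumption, not compiler output).
pub const SAMPLE_PTX: &str = "\
.version 6.0
.target sm_30
.address_size 64

.visible .entry add(
    .param .u64 add_param_0
)
{
    ret;
}
";
```

Calling `entry_names(SAMPLE_PTX)` yields a one-element list containing `add`. A check like this could serve as a quick sanity test that a kernel was actually emitted before the PTX is loaded.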

Dependencies

~0.7–1.6MB
~31K SLoC
