5 unstable releases

Version | Release date |
---|---|
0.3.0 | May 3, 2020 |
0.3.0-alpha.2 | Apr 5, 2020 |
0.3.0-alpha.1 | Jan 11, 2020 |
0.2.0 | Jun 14, 2018 |
0.1.0 | Nov 20, 2017 |
accel-derive
Procedural-macro crate for `#[kernel]`. A `#[kernel]` function is converted into two parts:
- Device code, which is compiled into PTX assembly
- Host code, which calls the generated device code (PTX assembly) through the `accel::module` API
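The two-part split can be pictured with a small declarative macro. This is only a sketch under stated assumptions, not the accel-derive implementation: `fake_kernel!` is a hypothetical name, the "device" part is a placeholder string rather than compiled PTX, and the "host" part merely reads that string instead of calling the `accel::module` API.

```rust
// Hypothetical stand-in for #[kernel]; NOT how accel-derive is implemented.
macro_rules! fake_kernel {
    ($name:ident, $ptx:expr) => {
        // Part 1: the "device" side. A string constant stands in for the
        // PTX that the real macro would obtain by compiling the kernel body.
        mod $name {
            pub const PTX_STR: &str = $ptx;
        }

        // Part 2: the "host" side. The real crate would launch the kernel
        // here; this stub only reports the size of the embedded string.
        fn $name() -> usize {
            $name::PTX_STR.len()
        }
    };
}

// One invocation yields both a module `add` and a function `add`.
fake_kernel!(add, "// placeholder text, not real PTX");

fn main() {
    println!("{}", add::PTX_STR);
    println!("embedded {} bytes", add());
}
```

The module and the function can share the name `add` because Rust resolves modules and functions in separate namespaces.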
lib.rs:
Get compiled PTX as String
For a kernel function named `add`, the `#[kernel]` proc-macro creates a submodule `add` in addition to the function `add`.
The kernel's Rust code is compiled into a PTX string using rustc's `nvptx64-nvidia-cuda` target.
The generated PTX string is embedded into the proc-macro output as `{kernel_name}::PTX_STR`.
use accel_derive::kernel;

#[kernel]
unsafe fn add(a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    // Index of the current GPU thread
    let i = accel_core::index();
    // Skip threads that fall past the end of the arrays
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}

fn main() {
    // PTX assembler code is embedded as `add::PTX_STR`
    println!("{}", add::PTX_STR);
}
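Because the PTX is available as an ordinary string, it can be inspected with plain string handling. The sketch below is illustrative only: `entry_names` is a hypothetical helper, and `SAMPLE_PTX` is a hand-written placeholder in the general shape of PTX, not output produced by accel-derive. It relies on the fact that PTX declares kernel entry points with the `.entry` directive.

```rust
/// Hypothetical helper: collects the names that follow each `.entry`
/// directive in a PTX listing.
fn entry_names(ptx: &str) -> Vec<String> {
    ptx.lines()
        .filter_map(|line| {
            // Text after the first `.entry` on this line, if any.
            let rest = line.split(".entry").nth(1)?;
            let name: String = rest
                .trim_start()
                .chars()
                .take_while(|c| c.is_alphanumeric() || *c == '_' || *c == '$')
                .collect();
            if name.is_empty() { None } else { Some(name) }
        })
        .collect()
}

// Hand-written placeholder; a real program would pass `add::PTX_STR` instead.
const SAMPLE_PTX: &str = "\
.version 6.0
.target sm_30
.address_size 64

.visible .entry add(
    .param .u64 add_param_0
)
{
    ret;
}
";

fn main() {
    for name in entry_names(SAMPLE_PTX) {
        println!("kernel entry point: {}", name);
    }
}
```

In a crate that uses `#[kernel]`, the same helper could be applied to `add::PTX_STR` to check which kernels were embedded.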