This module provides support for GPU offloading. For technical details on the offload_kernel macro and how to use it, see its documentation.
The offload_kernel macro can be applied to a function to generate the necessary code to launch a kernel on the target device.
#[offload_kernel]
fn kernel(x: *mut [f64; 256]) {
    // SAFETY:
    // calling our `arch` functions and dereferencing a raw pointer is unsafe
    unsafe {
        let n = (*x).len();
        let i = (thread_idx_x() + block_idx_x() * block_dim_x()) as usize;
        if i < n {
            (*x)[i] = i as f64;
        }
    }
}
Currently, the only way to launch an offloaded kernel is the core::intrinsics::offload intrinsic (note that using intrinsics outside the standard library is discouraged). It lets you specify the grid and block dimensions and pass the required arguments to the device.
let mut x = [0.0f64; 256];
core::intrinsics::offload::<_, _, ()>(
    kernel,
    [256, 1, 1],
    [1, 1, 1],
    (&mut x as *mut [f64; 256],),
);
For precise information on the offload intrinsic, see its respective documentation.
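To see why the launch above touches every element exactly once, the index arithmetic can be replayed on the host. The following is a minimal CPU-only sketch, not part of the offload API: `global_index` and `simulate_launch` are hypothetical helper names introduced here for illustration, and the sketch assumes the first dimension array in the launch is the grid and the second the block, as the prose order suggests. It only mirrors the expression `thread_idx_x() + block_idx_x() * block_dim_x()` along the x axis.

```rust
/// Hypothetical CPU stand-in for the device-side expression
/// `thread_idx_x() + block_idx_x() * block_dim_x()`.
fn global_index(thread_idx: u32, block_idx: u32, block_dim: u32) -> usize {
    (thread_idx + block_idx * block_dim) as usize
}

/// Hypothetical helper: runs the kernel body sequentially on the host,
/// once per (block, thread) pair along the x axis.
/// `grid_x` is the number of blocks, `block_x` the threads per block.
fn simulate_launch(grid_x: u32, block_x: u32, x: &mut [f64]) {
    let n = x.len();
    for block_idx in 0..grid_x {
        for thread_idx in 0..block_x {
            let i = global_index(thread_idx, block_idx, block_x);
            // Same bounds guard as in the kernel: indices past the
            // end of the buffer are skipped.
            if i < n {
                x[i] = i as f64;
            }
        }
    }
}
```

Under that assumption, calling `simulate_launch(256, 1, &mut x)` on a 256-element buffer corresponds to 256 blocks of one thread each, so block `b` writes element `b`. The `i < n` guard only matters when the grid and block sizes multiply to more than the buffer length, for example 2 blocks of 8 threads over 10 elements.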