A code generator that emits GPU code from a given Halide stmt.
#include <CodeGen_GPU_Dev.h>
Definition at line 18 of file CodeGen_GPU_Dev.h.
◆ MemoryFenceType
A mask describing which type of memory fence to use for the gpu_thread_barrier() intrinsic.
Not all GPU APIs support all types.
| Enumerator |
|---|
| None |
| Device |
| Shared |
Definition at line 79 of file CodeGen_GPU_Dev.h.
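Because the value is a mask, fence kinds can be combined and tested bitwise. The following is a minimal sketch using a stand-in enum rather than the real header. The numeric values (`None = 0`, `Device = 1`, `Shared = 2`) and the helper `describe_fence` are assumptions made for illustration and do not come from this page.

```cpp
#include <string>

// Stand-in for CodeGen_GPU_Dev::MemoryFenceType. The numeric values are
// assumed here so that the enumerators can be OR-ed together as a mask.
enum MemoryFenceType {
    None = 0,
    Device = 1,
    Shared = 2,
};

// Hypothetical helper: describe which fences a mask requests.
std::string describe_fence(int mask) {
    if (mask == None) return "none";
    std::string out;
    if (mask & Device) out += "device";
    if (mask & Shared) out += out.empty() ? "shared" : "+shared";
    return out;
}
```

A backend that supports only one kind of fence could test the mask in the same way and reject or ignore the bits it cannot honor.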
◆ ~CodeGen_GPU_Dev()
virtual Halide::Internal::CodeGen_GPU_Dev::~CodeGen_GPU_Dev()
virtual
◆ add_kernel()
virtual void Halide::Internal::CodeGen_GPU_Dev::add_kernel(Stmt stmt, const std::string &name, const std::vector<DeviceArgument> &args)
pure virtual
Compile a GPU kernel into the module.
This may be called many times with different kernels, which will all be accumulated into a single source module shared by a given Halide pipeline.
◆ init_module()
virtual void Halide::Internal::CodeGen_GPU_Dev::init_module()
pure virtual
(Re)initialize the GPU kernel module.
This is separate from compile, since a GPU device module will often have many kernels compiled into it for a single pipeline.
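The split between initializing the module, adding kernels, and retrieving the source can be sketched with a toy backend. `ToyGPUDev` and `build_two_kernel_module` are hypothetical: the class mirrors the method names on this page, but it takes kernel bodies as plain strings instead of a `Stmt` and a `DeviceArgument` list so that the example has no Halide dependency. It is an illustration of the calling pattern under those assumptions, not how any real backend is implemented.

```cpp
#include <string>
#include <vector>

// Hypothetical toy device backend (not a real CodeGen_GPU_Dev subclass).
class ToyGPUDev {
public:
    // (Re)initialize: discard anything accumulated so far.
    void init_module() {
        src.clear();
        current_kernel.clear();
    }

    // Append one kernel to the shared source module.
    void add_kernel(const std::string &body, const std::string &name) {
        current_kernel = name;
        src += "kernel " + name + " { " + body + " }\n";
    }

    // Return the accumulated module source as bytes.
    std::vector<char> compile_to_src() const {
        return std::vector<char>(src.begin(), src.end());
    }

    // Name of the kernel most recently added.
    std::string get_current_kernel_name() const {
        return current_kernel;
    }

private:
    std::string src;
    std::string current_kernel;
};

// Build one module holding two kernels, as a pipeline with two GPU loop
// nests might. The kernel names are made up for the example.
std::string build_two_kernel_module() {
    ToyGPUDev dev;
    dev.init_module();
    dev.add_kernel("/* blur_x */", "kernel_blur_x");
    dev.add_kernel("/* blur_y */", "kernel_blur_y");
    std::vector<char> bytes = dev.compile_to_src();
    return std::string(bytes.begin(), bytes.end());
}
```

Both kernels end up in the single buffer returned by `compile_to_src()`, and calling `init_module()` again starts a fresh, empty module.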
◆ compile_to_src()
virtual std::vector<char> Halide::Internal::CodeGen_GPU_Dev::compile_to_src()
pure virtual
◆ get_current_kernel_name()
virtual std::string Halide::Internal::CodeGen_GPU_Dev::get_current_kernel_name()
pure virtual
◆ dump()
virtual void Halide::Internal::CodeGen_GPU_Dev::dump()
pure virtual
◆ api_unique_name()
virtual std::string Halide::Internal::CodeGen_GPU_Dev::api_unique_name()
pure virtual
Returns the GPU API name. This name is combined into runtime routine names so that the routines of each GPU API have unique names.
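As an illustration of how such a name keeps runtime routines distinct, the sketch below composes a routine name from an API name and a suffix. The helper `runtime_routine_name` is hypothetical, and the `halide_<api>_<routine>` pattern is an assumption made for the example; this page does not specify the exact format.

```cpp
#include <string>

// Hypothetical helper: form a runtime routine name from the API name and a
// routine suffix. The "halide_<api>_<routine>" pattern is assumed.
std::string runtime_routine_name(const std::string &api_unique_name,
                                 const std::string &routine) {
    return "halide_" + api_unique_name + "_" + routine;
}
```

With this scheme, two backends that return different API names never produce the same routine name for the same suffix.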
◆ print_gpu_name()
virtual std::string Halide::Internal::CodeGen_GPU_Dev::print_gpu_name(const std::string &name)
pure virtual
Returns the specified name transformed by the variable naming rules for the GPU language backend.
Used to determine the name of a parameter during host codegen.
◆ kernel_run_takes_types()
virtual bool Halide::Internal::CodeGen_GPU_Dev::kernel_run_takes_types() const
inline virtual
Allows the GPU device-specific code to request that halide_type_t values be passed to the kernel_run routine rather than just argument type sizes.
Definition at line 54 of file CodeGen_GPU_Dev.h.
◆ is_gpu_var()
static bool Halide::Internal::CodeGen_GPU_Dev::is_gpu_var(const std::string &name)
static
◆ is_gpu_block_var()
static bool Halide::Internal::CodeGen_GPU_Dev::is_gpu_block_var(const std::string &name)
static
◆ is_gpu_thread_var()
static bool Halide::Internal::CodeGen_GPU_Dev::is_gpu_thread_var(const std::string &name)
static
◆ is_block_uniform()
static bool Halide::Internal::CodeGen_GPU_Dev::is_block_uniform(const Expr &expr)
static
Checks if expr is block uniform, i.e. does not depend on a thread var.
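The idea of block uniformity can be sketched on a toy expression tree: an expression is block uniform when no thread variable appears anywhere inside it. `ToyExpr`, `make_var`, `make_op`, `toy_is_thread_var`, and `toy_is_block_uniform` are all hypothetical stand-ins, and the `.__thread_id_` naming marker is an assumption for the example. The real function operates on a Halide `Expr`.

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy expression node: a variable (leaf) or an operation over children.
// This is a stand-in for Halide's Expr, not its real representation.
struct ToyExpr {
    std::string var;  // non-empty for a variable leaf
    std::vector<std::shared_ptr<ToyExpr>> children;
};

std::shared_ptr<ToyExpr> make_var(const std::string &name) {
    auto e = std::make_shared<ToyExpr>();
    e->var = name;
    return e;
}

std::shared_ptr<ToyExpr> make_op(std::shared_ptr<ToyExpr> a,
                                 std::shared_ptr<ToyExpr> b) {
    auto e = std::make_shared<ToyExpr>();
    e->children = {a, b};
    return e;
}

// Assumed naming convention: thread variable names contain ".__thread_id_".
bool toy_is_thread_var(const std::string &name) {
    return name.find(".__thread_id_") != std::string::npos;
}

// Block uniform: no thread variable appears anywhere in the expression.
bool toy_is_block_uniform(const std::shared_ptr<ToyExpr> &e) {
    if (!e->var.empty() && toy_is_thread_var(e->var)) return false;
    for (const auto &c : e->children) {
        if (!toy_is_block_uniform(c)) return false;
    }
    return true;
}
```

An expression built only from block indices and scalar parameters passes the check, while one that mixes in a thread index does not.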
◆ is_buffer_constant()
static bool Halide::Internal::CodeGen_GPU_Dev::is_buffer_constant(const Stmt &kernel, const std::string &buffer)
static
Checks if the buffer is a candidate for constant storage.
Most GPU APIs support a constant memory storage class that cannot be written to and performs well for block-uniform accesses. A buffer is a candidate for constant storage if it is never written to and all loads from it are uniform within the workgroup.
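The two conditions above can be sketched over a flat list of memory accesses. `ToyAccess` and `is_constant_candidate` are hypothetical: the real function walks a Halide `Stmt`, whereas this example assumes the accesses and their uniformity have already been extracted.

```cpp
#include <string>
#include <vector>

// Toy record of one memory access inside a kernel (a stand-in for walking
// a Halide Stmt).
struct ToyAccess {
    std::string buffer;
    bool is_store;
    bool index_is_block_uniform;  // index does not depend on a thread var
};

// A buffer qualifies if it is never stored to and every load from it uses
// a block-uniform index.
bool is_constant_candidate(const std::vector<ToyAccess> &kernel,
                           const std::string &buffer) {
    for (const ToyAccess &a : kernel) {
        if (a.buffer != buffer) continue;
        if (a.is_store) return false;
        if (!a.index_is_block_uniform) return false;
    }
    return true;
}
```

A lookup table read at the same index by every thread in a block would qualify, while an input read at a per-thread index or an output buffer would not.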
◆ scalarize_predicated_loads_stores()
static Stmt Halide::Internal::CodeGen_GPU_Dev::scalarize_predicated_loads_stores(Stmt &s)
static
Modifies predicated loads and stores to be non-predicated, since most GPU backends do not support predication.
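What scalarizing a predicated load means can be shown on plain vectors: the masked vector load becomes one guarded scalar load per lane. The function `scalarized_predicated_load` is a hypothetical illustration of the resulting code shape, not the IR transformation itself, and it assumes `index`, `pred`, and `dst` all have the same length.

```cpp
#include <cstddef>
#include <vector>

// Scalarized form of a predicated vector load: one guarded scalar load per
// lane. Lanes whose predicate is false keep the value already in `dst`,
// and their index is never dereferenced.
void scalarized_predicated_load(const std::vector<int> &buf,
                                const std::vector<std::size_t> &index,
                                const std::vector<bool> &pred,
                                std::vector<int> &dst) {
    for (std::size_t lane = 0; lane < index.size(); lane++) {
        if (pred[lane]) {
            dst[lane] = buf[index[lane]];
        }
    }
}
```

Because each lane is guarded by an ordinary branch, a lane with a false predicate can carry an out-of-range index without causing an invalid access.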