Halide
Halide::Internal::CodeGen_GPU_Dev Struct Reference (abstract)

A code generator that emits GPU code from a given Halide stmt.

#include <CodeGen_GPU_Dev.h>

Public Types

enum  MemoryFenceType { None = 0, Device = 1, Shared = 2 }
 A mask describing which type of memory fence to use for the gpu_thread_barrier() intrinsic.
 

Public Member Functions

virtual ~CodeGen_GPU_Dev ()
 
virtual void add_kernel (Stmt stmt, const std::string &name, const std::vector< DeviceArgument > &args)=0
 Compile a GPU kernel into the module.
 
virtual void init_module ()=0
 (Re)initialize the GPU kernel module.
 
virtual std::vector< char > compile_to_src ()=0
 
virtual std::string get_current_kernel_name ()=0
 
virtual void dump ()=0
 
virtual std::string api_unique_name ()=0
 This routine returns the GPU API name that is combined into runtime routine names to ensure each GPU API has a unique name.
 
virtual std::string print_gpu_name (const std::string &name)=0
 Returns the specified name transformed by the variable naming rules for the GPU language backend.
 
virtual bool kernel_run_takes_types () const
 Allows the GPU device specific code to request halide_type_t values to be passed to the kernel_run routine rather than just argument type sizes.
 

Static Public Member Functions

static bool is_gpu_var (const std::string &name)
 
static bool is_gpu_block_var (const std::string &name)
 
static bool is_gpu_thread_var (const std::string &name)
 
static bool is_block_uniform (const Expr &expr)
 Checks if expr is block uniform, i.e. does not depend on a thread var.
 
static bool is_buffer_constant (const Stmt &kernel, const std::string &buffer)
 Checks if the buffer is a candidate for constant storage.
 
static Stmt scalarize_predicated_loads_stores (Stmt &s)
 Modifies predicated loads and stores to be non-predicated, since most GPU backends do not support predication.
 

Detailed Description

A code generator that emits GPU code from a given Halide stmt.

Definition at line 18 of file CodeGen_GPU_Dev.h.

Member Enumeration Documentation

◆ MemoryFenceType

A mask describing which type of memory fence to use for the gpu_thread_barrier() intrinsic.

Not all GPU APIs support all types.

Enumerator
None 
Device 
Shared 

Definition at line 79 of file CodeGen_GPU_Dev.h.
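
Because the enumerator values are 0, 1 and 2, fence types can be combined bitwise into a single mask. The following is a minimal sketch of that idea, not code from Halide: the enum is a local stand-in copy of the values listed above, and make_fence_mask and mask_has are hypothetical helper names introduced only for illustration.

```cpp
#include <cassert>

// Stand-in copy of the enumerator values listed above. The real enum is a
// member of Halide::Internal::CodeGen_GPU_Dev.
enum MemoryFenceType { None = 0, Device = 1, Shared = 2 };

// Hypothetical helper: combine the requested fence types into one mask.
inline int make_fence_mask(bool fence_device, bool fence_shared) {
    int mask = None;
    if (fence_device) mask |= Device;
    if (fence_shared) mask |= Shared;
    return mask;
}

// Hypothetical helper: test whether a fence type is set in a mask.
inline bool mask_has(int mask, MemoryFenceType type) {
    assert(type != None);  // None is the empty mask, not a testable bit
    return (mask & type) != 0;
}
```

A backend that does not support one of the fence types could inspect the mask this way and fall back to whatever fence it does provide.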

Constructor & Destructor Documentation

◆ ~CodeGen_GPU_Dev()

virtual Halide::Internal::CodeGen_GPU_Dev::~CodeGen_GPU_Dev ( )
virtual

Member Function Documentation

◆ add_kernel()

virtual void Halide::Internal::CodeGen_GPU_Dev::add_kernel ( Stmt  stmt,
const std::string &  name,
const std::vector< DeviceArgument > &  args 
)
pure virtual

Compile a GPU kernel into the module.

This may be called many times with different kernels, which will all be accumulated into a single source module shared by a given Halide pipeline.

◆ init_module()

virtual void Halide::Internal::CodeGen_GPU_Dev::init_module ( )
pure virtual

(Re)initialize the GPU kernel module.

This is separate from compile, since a GPU device module will often have many kernels compiled into it for a single pipeline.
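
The lifecycle described for add_kernel() and init_module() can be sketched with a toy class. This is an illustration under stated assumptions, not a Halide backend: ToyGpuCodeGen is a hypothetical name, kernel bodies are plain strings instead of Stmt, the DeviceArgument list is omitted, and "current kernel" is assumed to mean the most recently added one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a device code generator. It does not
// derive from CodeGen_GPU_Dev so that it compiles without Halide.
class ToyGpuCodeGen {
public:
    // (Re)initialize the module: drop everything accumulated so far.
    void init_module() {
        src_.clear();
        current_kernel_.clear();
        kernel_count_ = 0;
    }

    // Append one kernel to the single shared source module.
    void add_kernel(const std::string &body, const std::string &name) {
        assert(!name.empty());
        src_ += "kernel " + name + " { " + body + " }\n";
        current_kernel_ = name;  // assumption: "current" = last added
        ++kernel_count_;
    }

    // Return the accumulated source as bytes, mirroring the
    // std::vector<char> return type of compile_to_src().
    std::vector<char> compile_to_src() const {
        return std::vector<char>(src_.begin(), src_.end());
    }

    std::string get_current_kernel_name() const { return current_kernel_; }
    int kernel_count() const { return kernel_count_; }

private:
    std::string src_;
    std::string current_kernel_;
    int kernel_count_ = 0;
};
```

In this sketch, calling init_module() once and then add_kernel() several times yields one source blob containing every kernel, which is the accumulation behavior the descriptions above refer to.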

◆ compile_to_src()

virtual std::vector<char> Halide::Internal::CodeGen_GPU_Dev::compile_to_src ( )
pure virtual

◆ get_current_kernel_name()

virtual std::string Halide::Internal::CodeGen_GPU_Dev::get_current_kernel_name ( )
pure virtual

◆ dump()

virtual void Halide::Internal::CodeGen_GPU_Dev::dump ( )
pure virtual

◆ api_unique_name()

virtual std::string Halide::Internal::CodeGen_GPU_Dev::api_unique_name ( )
pure virtual

This routine returns the GPU API name that is combined into runtime routine names to ensure each GPU API has a unique name.

◆ print_gpu_name()

virtual std::string Halide::Internal::CodeGen_GPU_Dev::print_gpu_name ( const std::string &  name)
pure virtual

Returns the specified name transformed by the variable naming rules for the GPU language backend.

Used to determine the name of a parameter during host codegen.

◆ kernel_run_takes_types()

virtual bool Halide::Internal::CodeGen_GPU_Dev::kernel_run_takes_types ( ) const
inline virtual

Allows the GPU device specific code to request halide_type_t values to be passed to the kernel_run routine rather than just argument type sizes.

Definition at line 54 of file CodeGen_GPU_Dev.h.

◆ is_gpu_var()

static bool Halide::Internal::CodeGen_GPU_Dev::is_gpu_var ( const std::string &  name)
static

◆ is_gpu_block_var()

static bool Halide::Internal::CodeGen_GPU_Dev::is_gpu_block_var ( const std::string &  name)
static

◆ is_gpu_thread_var()

static bool Halide::Internal::CodeGen_GPU_Dev::is_gpu_thread_var ( const std::string &  name)
static

◆ is_block_uniform()

static bool Halide::Internal::CodeGen_GPU_Dev::is_block_uniform ( const Expr &  expr)
static

Checks if expr is block uniform, i.e. does not depend on a thread var.
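
To make "does not depend on a thread var" concrete, here is a minimal sketch over a toy expression tree. Everything in it is hypothetical: ToyExpr, the make_* helpers and toy_is_block_uniform are illustration names, and the set of thread variable names is passed in explicitly rather than derived from Halide's naming rules.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical miniature expression tree; Halide's Expr is far richer.
struct ToyExpr {
    enum Kind { Const, Var, Add };
    Kind kind = Const;
    int value = 0;                  // used when kind == Const
    std::string name;               // used when kind == Var
    std::shared_ptr<ToyExpr> a, b;  // used when kind == Add
};
using ToyExprPtr = std::shared_ptr<ToyExpr>;

ToyExprPtr make_const(int v) {
    auto e = std::make_shared<ToyExpr>();
    e->kind = ToyExpr::Const;
    e->value = v;
    return e;
}

ToyExprPtr make_var(const std::string &name) {
    auto e = std::make_shared<ToyExpr>();
    e->kind = ToyExpr::Var;
    e->name = name;
    return e;
}

ToyExprPtr make_add(ToyExprPtr a, ToyExprPtr b) {
    auto e = std::make_shared<ToyExpr>();
    e->kind = ToyExpr::Add;
    e->a = std::move(a);
    e->b = std::move(b);
    return e;
}

// True if no variable reachable from e is in thread_vars.
bool toy_is_block_uniform(const ToyExprPtr &e,
                          const std::set<std::string> &thread_vars) {
    assert(e);
    switch (e->kind) {
    case ToyExpr::Const:
        return true;
    case ToyExpr::Var:
        return thread_vars.count(e->name) == 0;
    case ToyExpr::Add:
        return toy_is_block_uniform(e->a, thread_vars) &&
               toy_is_block_uniform(e->b, thread_vars);
    }
    return false;
}
```

With a thread variable named tx, an expression such as bx + 4 is uniform across the block in this sketch, while tx + 4 is not.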

◆ is_buffer_constant()

static bool Halide::Internal::CodeGen_GPU_Dev::is_buffer_constant ( const Stmt &  kernel,
const std::string &  buffer 
)
static

Checks if the buffer is a candidate for constant storage.

Most GPU APIs support a constant memory storage class that cannot be written to and performs well for block uniform accesses. A buffer is a candidate for constant storage if it is never written to, and loads are uniform within the workgroup.
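
The two conditions above (never written to, and all loads uniform within the workgroup) can be sketched as a scan over a flat list of memory accesses. This is an assumption-laden illustration, not Halide's implementation: ToyAccess and toy_is_buffer_constant are hypothetical names, and the real routine inspects a Stmt rather than a precomputed list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record of one memory access inside a kernel.
struct ToyAccess {
    std::string buffer;           // name of the buffer accessed
    bool is_store;                // true for a store, false for a load
    bool index_is_block_uniform;  // true if the index ignores thread vars
};

// True if `buffer` is never stored to and every load from it uses a
// block uniform index. A buffer with no accesses trivially qualifies
// in this sketch.
bool toy_is_buffer_constant(const std::vector<ToyAccess> &kernel,
                            const std::string &buffer) {
    assert(!buffer.empty());
    for (const ToyAccess &acc : kernel) {
        if (acc.buffer != buffer) continue;
        if (acc.is_store) return false;                 // written to
        if (!acc.index_is_block_uniform) return false;  // per-thread load
    }
    return true;
}
```

A lookup table that every thread reads at the same index qualifies under this check, while an input read at a per-thread index or an output buffer does not.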

◆ scalarize_predicated_loads_stores()

static Stmt Halide::Internal::CodeGen_GPU_Dev::scalarize_predicated_loads_stores ( Stmt &  s)
static

Modifies predicated loads and stores to be non-predicated, since most GPU backends do not support predication.
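
The effect of removing predication can be illustrated at the value level: a predicated vector access becomes one guarded scalar access per lane. The sketch below shows only that semantic, not the IR rewrite Halide performs. The names, the fixed lane count of 4, and the explicit fallback value for inactive load lanes are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // assumed vector width for this sketch
using Vec = std::array<int, kLanes>;
using Mask = std::array<bool, kLanes>;

// Scalarized predicated load: one guarded scalar load per lane. Lanes whose
// predicate is false keep the corresponding value from `fallback`.
Vec scalarized_predicated_load(const int *base, const Vec &index,
                               const Mask &pred, const Vec &fallback) {
    assert(base != nullptr);
    Vec result = fallback;
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        if (pred[lane]) result[lane] = base[index[lane]];
    }
    return result;
}

// Scalarized predicated store: one guarded scalar store per lane. Lanes whose
// predicate is false leave memory untouched.
void scalarized_predicated_store(int *base, const Vec &index,
                                 const Mask &pred, const Vec &value) {
    assert(base != nullptr);
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        if (pred[lane]) base[index[lane]] = value[lane];
    }
}
```

Each lane is handled by an ordinary conditional, which is a construct any GPU language backend can express even when it has no masked vector load or store.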


The documentation for this struct was generated from the following file: