Halide 19.0.0
Halide compiler and libraries
Halide::Internal::CodeGen_GPU_Dev Struct Reference (abstract)

A code generator that emits GPU code from a given Halide stmt.

#include <CodeGen_GPU_Dev.h>

Public Types

enum  MemoryFenceType { None = 0 , Device = 1 , Shared = 2 }
 A mask describing which type of memory fence to use for the gpu_thread_barrier() intrinsic.
 

Public Member Functions

virtual ~CodeGen_GPU_Dev ()
 
virtual void add_kernel (Stmt stmt, const std::string &name, const std::vector< DeviceArgument > &args)=0
 Compile a GPU kernel into the module.
 
virtual void init_module ()=0
 (Re)initialize the GPU kernel module.
 
virtual std::vector< char > compile_to_src ()=0
 
virtual std::string get_current_kernel_name ()=0
 
virtual void dump ()=0
 
virtual std::string api_unique_name ()=0
 This routine returns the GPU API name that is combined into runtime routine names to ensure each GPU API has a unique name.
 
virtual std::string print_gpu_name (const std::string &name)=0
 Returns the specified name transformed by the variable naming rules for the GPU language backend.
 
virtual bool kernel_run_takes_types () const
 Allows the GPU device specific code to request halide_type_t values to be passed to the kernel_run routine rather than just argument type sizes.
 

Static Public Member Functions

static bool is_block_uniform (const Expr &expr)
 Checks if expr is block uniform, i.e. does not depend on a thread var.
 
static bool is_buffer_constant (const Stmt &kernel, const std::string &buffer)
 Checks if the buffer is a candidate for constant storage.
 
static Stmt scalarize_predicated_loads_stores (Stmt &s)
 Modifies predicated loads and stores to be non-predicated, since most GPU backends do not support predication.
 

Detailed Description

A code generator that emits GPU code from a given Halide stmt.

Definition at line 18 of file CodeGen_GPU_Dev.h.

Member Enumeration Documentation

◆ MemoryFenceType

A mask describing which type of memory fence to use for the gpu_thread_barrier() intrinsic.

Not all GPU APIs support all types.

Enumerator
None 
Device 
Shared 

Definition at line 75 of file CodeGen_GPU_Dev.h.
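
Because the enumerators are described as a mask, their values (0, 1, 2) can be combined bitwise. The following is a minimal sketch of that idea using a local copy of the enumerator values; `combine_fences` and `has_fence` are hypothetical helpers introduced here for illustration and are not part of Halide.

```cpp
#include <cassert>

// Local copy of the enumerator values listed above; this is not the Halide header.
enum MemoryFenceType {
    None = 0,
    Device = 1,
    Shared = 2
};

// Hypothetical helper: combine two fence kinds into a single integer mask.
inline int combine_fences(MemoryFenceType a, MemoryFenceType b) {
    return static_cast<int>(a) | static_cast<int>(b);
}

// Hypothetical helper: test whether a mask requests a given fence kind.
inline bool has_fence(int mask, MemoryFenceType kind) {
    assert(kind != None && "None has no bit to test");
    return (mask & static_cast<int>(kind)) != 0;
}
```

With these values, combining Device and Shared yields a mask with both low bits set, while None leaves a mask unchanged.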

Constructor & Destructor Documentation

◆ ~CodeGen_GPU_Dev()

virtual Halide::Internal::CodeGen_GPU_Dev::~CodeGen_GPU_Dev ( )
virtual

Member Function Documentation

◆ add_kernel()

virtual void Halide::Internal::CodeGen_GPU_Dev::add_kernel ( Stmt stmt,
const std::string & name,
const std::vector< DeviceArgument > & args )
pure virtual

Compile a GPU kernel into the module.

This may be called many times with different kernels, which will all be accumulated into a single source module shared by a given Halide pipeline.

◆ init_module()

virtual void Halide::Internal::CodeGen_GPU_Dev::init_module ( )
pure virtual

(Re)initialize the GPU kernel module.

This is separate from compile, since a GPU device module will often have many kernels compiled into it for a single pipeline.
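
The call pattern described for add_kernel() and init_module() can be illustrated with a toy model: initialize once, add several kernels, then retrieve the accumulated source. `ToyKernelModule` below is a hypothetical stand-in that uses plain strings instead of Stmt and DeviceArgument; it sketches the accumulation behavior only and is not how any real backend emits code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a backend. It models only the accumulation
// pattern described above and uses no Halide types.
class ToyKernelModule {
public:
    // (Re)initialize: discard everything accumulated so far.
    void init_module() {
        src_.clear();
        current_kernel_.clear();
    }

    // Append one kernel to the shared source module.
    void add_kernel(const std::string &body, const std::string &name) {
        assert(!name.empty());
        current_kernel_ = name;
        src_ += "kernel " + name + " { " + body + " }\n";
    }

    // Return the accumulated source as a byte vector.
    std::vector<char> compile_to_src() const {
        return std::vector<char>(src_.begin(), src_.end());
    }

    // Name of the kernel added most recently.
    std::string get_current_kernel_name() const {
        return current_kernel_;
    }

private:
    std::string src_;
    std::string current_kernel_;
};
```

Calling init_module() again on the toy object empties the module, which mirrors the "(Re)initialize" wording above.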

◆ compile_to_src()

virtual std::vector< char > Halide::Internal::CodeGen_GPU_Dev::compile_to_src ( )
pure virtual

◆ get_current_kernel_name()

virtual std::string Halide::Internal::CodeGen_GPU_Dev::get_current_kernel_name ( )
pure virtual

◆ dump()

virtual void Halide::Internal::CodeGen_GPU_Dev::dump ( )
pure virtual

◆ api_unique_name()

virtual std::string Halide::Internal::CodeGen_GPU_Dev::api_unique_name ( )
pure virtual

This routine returns the GPU API name that is combined into runtime routine names to ensure each GPU API has a unique name.
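
As a rough illustration of how an API name might be combined into a runtime routine name, consider the sketch below. The helper `runtime_routine_name` and the `halide_<api>_<routine>` pattern are assumptions made for this example; the page itself does not specify the exact naming scheme.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build a per-API runtime routine name.
// The "halide_<api>_<routine>" pattern is an assumption for illustration.
inline std::string runtime_routine_name(const std::string &api_unique_name,
                                        const std::string &routine) {
    assert(!api_unique_name.empty());
    return "halide_" + api_unique_name + "_" + routine;
}
```

Under this assumed scheme, two backends that return different API names produce different routine names for the same routine.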

◆ print_gpu_name()

virtual std::string Halide::Internal::CodeGen_GPU_Dev::print_gpu_name ( const std::string & name)
pure virtual

Returns the specified name transformed by the variable naming rules for the GPU language backend.

Used to determine the name of a parameter during host codegen.

◆ kernel_run_takes_types()

virtual bool Halide::Internal::CodeGen_GPU_Dev::kernel_run_takes_types ( ) const
inline virtual

Allows the GPU device specific code to request halide_type_t values to be passed to the kernel_run routine rather than just argument type sizes.

Definition at line 54 of file CodeGen_GPU_Dev.h.

◆ is_block_uniform()

static bool Halide::Internal::CodeGen_GPU_Dev::is_block_uniform ( const Expr & expr)
static

Checks if expr is block uniform, i.e. does not depend on a thread var.
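
The idea of block uniformity can be sketched on a miniature expression tree: an expression is uniform if none of its variable leaves is a thread variable. `ToyExpr`, `make_var`, `make_op`, and `toy_is_block_uniform` are hypothetical names for this illustration; Halide's own Expr type and its way of identifying thread variables are not reproduced here.

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature expression tree; this is not Halide's Expr.
struct ToyExpr {
    std::string var;  // non-empty for a variable leaf, empty for an operator node
    std::vector<std::shared_ptr<ToyExpr>> children;
};

using ToyExprPtr = std::shared_ptr<ToyExpr>;

// Build a variable leaf.
inline ToyExprPtr make_var(const std::string &name) {
    auto e = std::make_shared<ToyExpr>();
    e->var = name;
    return e;
}

// Build an operator node over the given operands.
inline ToyExprPtr make_op(std::vector<ToyExprPtr> children) {
    auto e = std::make_shared<ToyExpr>();
    e->children = std::move(children);
    return e;
}

// Returns true if no variable in the tree is one of the given thread vars.
inline bool toy_is_block_uniform(const ToyExprPtr &e,
                                 const std::set<std::string> &thread_vars) {
    if (!e) {
        return true;
    }
    if (!e->var.empty() && thread_vars.count(e->var) > 0) {
        return false;
    }
    for (const ToyExprPtr &child : e->children) {
        if (!toy_is_block_uniform(child, thread_vars)) {
            return false;
        }
    }
    return true;
}
```

The set of thread variable names is passed in explicitly here, a simplification chosen so the sketch makes no claim about how thread variables are named in practice.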

◆ is_buffer_constant()

static bool Halide::Internal::CodeGen_GPU_Dev::is_buffer_constant ( const Stmt & kernel,
const std::string & buffer )
static

Checks if the buffer is a candidate for constant storage.

Most GPU APIs support a constant memory storage class that cannot be written to and performs well for block-uniform accesses. A buffer is a candidate for constant storage if it is never written to and all loads from it are uniform within the workgroup.
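
The two conditions above (never written, and every load uniform within the workgroup) can be expressed as a simple scan over a kernel's memory accesses. The sketch below assumes the accesses have already been collected into a flat list; `ToyAccess` and `toy_is_buffer_constant` are hypothetical and do not traverse a real Halide Stmt.

```cpp
#include <string>
#include <vector>

// Hypothetical record of one memory access found in a kernel.
struct ToyAccess {
    std::string buffer;           // name of the buffer being accessed
    bool is_store;                // true for a store, false for a load
    bool index_is_block_uniform;  // true if the index has no thread dependence
};

// Returns true if the named buffer is never stored to and every load
// from it uses a block-uniform index.
inline bool toy_is_buffer_constant(const std::vector<ToyAccess> &kernel,
                                   const std::string &buffer) {
    for (const ToyAccess &a : kernel) {
        if (a.buffer != buffer) {
            continue;  // access to some other buffer
        }
        if (a.is_store) {
            return false;  // written to: cannot live in constant storage
        }
        if (!a.index_is_block_uniform) {
            return false;  // per-thread load: not a good fit
        }
    }
    return true;
}
```

A lookup table read at the same index by every thread in a block would pass this check, while an output buffer or a per-thread input would not.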

◆ scalarize_predicated_loads_stores()

static Stmt Halide::Internal::CodeGen_GPU_Dev::scalarize_predicated_loads_stores ( Stmt & s)
static

Modifies predicated loads and stores to be non-predicated, since most GPU backends do not support predication.
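
To make the effect of scalarization concrete, the sketch below models a predicated vector store as a loop of per-lane conditional scalar stores. It operates on plain vectors rather than Halide IR, and `scalarized_store` is a hypothetical name; the actual transformation rewrites Stmt nodes and may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a predicated vector store after scalarization:
// each lane becomes an ordinary scalar store guarded by its predicate bit.
inline void scalarized_store(std::vector<int> &dst, std::size_t base,
                             const std::vector<int> &values,
                             const std::vector<bool> &predicate) {
    assert(values.size() == predicate.size());
    assert(base + values.size() <= dst.size());
    for (std::size_t lane = 0; lane < values.size(); ++lane) {
        if (predicate[lane]) {
            dst[base + lane] = values[lane];  // plain, non-predicated store
        }
    }
}
```

Lanes whose predicate is false leave the destination untouched, which is the behavior a predicated store would have had.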


The documentation for this struct was generated from the following file: