Halide
Namespaces | |
Autoscheduler | |
Elf | |
GeneratorMinMax | |
IntegerDivision | |
Introspection | |
IRMatcher | |
An alternative template-metaprogramming approach to expression matching. | |
Test | |
Classes | |
class | AbstractGenerator |
AbstractGenerator is an ABC that defines the API a Generator must provide to work with the existing Generator infrastructure (GenGen, RunGen, execute_generator(), Generator Stubs). More... | |
struct | Acquire |
struct | Add |
The sum of two expressions. More... | |
struct | all_are_convertible |
struct | all_are_printable_args |
struct | all_ints_and_optional_name |
struct | all_ints_and_optional_name< First, Rest... > |
struct | all_ints_and_optional_name< T > |
struct | all_ints_and_optional_name<> |
struct | Allocate |
Allocate a scratch area with the given name, type, and size. More... | |
struct | And |
Logical and - are both expressions true. More... | |
struct | ApplySplitResult |
class | aslog |
struct | AssertStmt |
If the 'condition' is false, then evaluate and return the message, which should be a call to an error function. More... | |
struct | AssociativeOp |
Represent the equivalent associative op of an update definition. More... | |
struct | AssociativePattern |
Represent an associative op with its identity. More... | |
struct | Atomic |
Lock all the Store nodes in the body statement. More... | |
struct | BaseExprNode |
A base class for expression nodes. More... | |
struct | BaseStmtNode |
IR nodes are split into expressions and statements. More... | |
struct | Block |
A sequence of statements to be executed in-order. More... | |
struct | Bound |
A bound on a loop, typically from Func::bound. More... | |
struct | Box |
Represents the bounds of a region of arbitrary dimension. More... | |
struct | Broadcast |
A vector with 'lanes' elements, in which every element is 'value'. More... | |
struct | BufferBuilder |
A builder to help create Exprs representing halide_buffer_t structs (e.g. More... | |
struct | BufferContents |
struct | BufferInfo |
Find all calls to image buffers and parameters in the function. More... | |
struct | Call |
A function call. More... | |
struct | Cast |
The actual IR nodes begin here. More... | |
class | Closure |
A helper class to manage closures. More... | |
class | CodeGen_C |
This class emits C++ code equivalent to a Halide Stmt. More... | |
class | CodeGen_GPU_C |
A base class for GPU backends that require C-like shader output. More... | |
struct | CodeGen_GPU_Dev |
A code generator that emits GPU code from a given Halide stmt. More... | |
class | CodeGen_LLVM |
A code generator abstract base class. More... | |
class | CodeGen_Posix |
A code generator that emits posix code from a given Halide stmt. More... | |
class | CodeGen_PyTorch |
This class emits C++ code to wrap a Halide pipeline so that it can be used as a C++ extension operator in PyTorch. More... | |
class | CompilerLogger |
struct | cond |
struct | ConstantInterval |
A class to represent ranges of integers. More... | |
struct | Convert |
struct | Cost |
class | debug |
For optional debugging during codegen, use the debug class as follows: More... | |
class | Definition |
A Function definition which can represent either an init or an update definition. More... | |
struct | DeviceArgument |
A DeviceArgument looks similar to a Halide::Argument, but has behavioral differences that make it specific to the GPU pipeline; the fact that it neither is-a nor has-a Halide::Argument is deliberate. More... | |
struct | Dim |
The Dim struct represents one loop in the schedule's representation of a loop nest. More... | |
class | Dimension |
struct | Div |
The ratio of two expressions. More... | |
struct | EQ |
Is the first expression equal to the second. More... | |
struct | ErrorReport |
struct | Evaluate |
Evaluate and discard an expression, presumably because it has some side-effect. More... | |
struct | ExecuteGeneratorArgs |
ExecuteGeneratorArgs is the set of arguments to execute_generator(). More... | |
struct | ExprNode |
We use the "curiously recurring template pattern" to avoid duplicated code in the IR Nodes. More... | |
class | ExprUsesVars |
struct | ExprWithCompareCache |
A wrapper around Exprs so that they can be deeply compared with a cache for known-equal subexpressions. More... | |
struct | FeatureIntermediates |
struct | FileStat |
class | FindAllCalls |
Visitor for keeping track of functions that are directly called and the arguments with which they are called. More... | |
struct | FloatImm |
Floating point constants. More... | |
struct | For |
A for loop. More... | |
struct | Fork |
A pair of statements executed concurrently. More... | |
struct | Free |
Free the resources associated with the given buffer. More... | |
class | FuncSchedule |
A schedule for a Function of a Halide pipeline. More... | |
class | Function |
A reference-counted handle to Halide's internal representation of a function. More... | |
struct | FunctionPtr |
A possibly-weak pointer to a Halide function. More... | |
struct | FusedPair |
This represents two stages with fused loop nests from outermost to a specific loop level. More... | |
struct | GE |
Is the first expression greater than or equal to the second. More... | |
class | GeneratorBase |
class | GeneratorFactoryProvider |
GeneratorFactoryProvider provides a way to customize the Generators that are visible to generate_filter_main (which otherwise would just look at the global registry of C++ Generators). More... | |
class | GeneratorInput_Arithmetic |
class | GeneratorInput_Buffer |
class | GeneratorInput_DynamicScalar |
class | GeneratorInput_Func |
class | GeneratorInput_Scalar |
class | GeneratorInputBase |
class | GeneratorInputImpl |
class | GeneratorOutput_Arithmetic |
class | GeneratorOutput_Buffer |
class | GeneratorOutput_Func |
class | GeneratorOutputBase |
class | GeneratorOutputImpl |
class | GeneratorParam_Arithmetic |
class | GeneratorParam_AutoSchedulerParams |
class | GeneratorParam_Bool |
class | GeneratorParam_Enum |
class | GeneratorParam_LoopLevel |
class | GeneratorParam_String |
class | GeneratorParam_Synthetic |
class | GeneratorParam_Target |
class | GeneratorParam_Type |
class | GeneratorParamBase |
class | GeneratorParamImpl |
class | GeneratorParamInfo |
class | GeneratorRegistry |
class | GIOBase |
GIOBase is the base class for all GeneratorInput<> and GeneratorOutput<> instantiations; it is not part of the public API and should never be used directly by user code. More... | |
class | GPUCompilationCache |
class | GpuObjectLifetimeTracker |
struct | GT |
Is the first expression greater than the second. More... | |
struct | HalideBufferStaticTypeAndDims |
struct | HalideBufferStaticTypeAndDims<::Halide::Buffer< T, Dims > > |
struct | HalideBufferStaticTypeAndDims<::Halide::Runtime::Buffer< T, Dims > > |
struct | has_static_halide_type_method |
struct | has_static_halide_type_method< T2, typename type_sink< decltype(T2::static_halide_type())>::type > |
class | HexagonAlignmentAnalyzer |
class | HostClosure |
A Closure modified to inspect GPU-specific memory accesses, and produce a vector of DeviceArgument objects. More... | |
struct | IfThenElse |
An if-then-else block. More... | |
struct | Indentation |
struct | InferredArgument |
An inferred argument. More... | |
struct | Interval |
A class to represent ranges of Exprs. More... | |
struct | IntImm |
Integer constants. More... | |
struct | IntrusivePtr |
Intrusive shared pointers have a reference count (a RefCount object) stored in the class itself. More... | |
class | IRCompareCache |
Lossily track known equal exprs with a cache. More... | |
struct | IRDeepCompare |
A compare struct suitable for use in std::map and std::set that computes a lexical ordering on IR nodes. More... | |
class | IRGraphMutator |
A mutator that caches and reapplies previously-done mutations, so that it can handle graphs of IR that have not had CSE done to them. More... | |
class | IRGraphVisitor |
A base class for algorithms that walk recursively over the IR without visiting the same node twice. More... | |
struct | IRHandle |
IR nodes are passed around using opaque handles to them. More... | |
class | IRMutator |
A base class for passes over the IR which modify it (e.g. More... | |
struct | IRNode |
The abstract base class for a node in the Halide IR. More... | |
class | IRPrinter |
An IRVisitor that emits IR to the given output stream in a human readable form. More... | |
class | IRVisitor |
A base class for algorithms that need to recursively walk over the IR. More... | |
struct | is_printable_arg |
struct | IsHalideBuffer |
struct | IsHalideBuffer< const halide_buffer_t * > |
struct | IsHalideBuffer< halide_buffer_t * > |
struct | IsHalideBuffer<::Halide::Buffer< T, Dims > > |
struct | IsHalideBuffer<::Halide::Runtime::Buffer< T, Dims > > |
struct | IsRoundtrippable |
struct | JITCache |
struct | JITErrorBuffer |
struct | JITFuncCallContext |
struct | JITModule |
class | JITSharedRuntime |
class | JSONCompilerLogger |
JSONCompilerLogger is a basic implementation of the CompilerLogger interface that saves logged data, then logs it all in JSON format in emit_to_stream(). More... | |
struct | LE |
Is the first expression less than or equal to the second. More... | |
struct | Let |
A let expression, like you might find in a functional language. More... | |
struct | LetStmt |
The statement form of a let node. More... | |
struct | Load |
Load a value from a named symbol if predicate is true. More... | |
struct | LoweredArgument |
Definition of an argument to a LoweredFunc. More... | |
struct | LoweredFunc |
Definition of a lowered function. More... | |
struct | LT |
Is the first expression less than the second. More... | |
struct | Max |
The greater of two values. More... | |
struct | meta_and |
struct | meta_and< T1, Args... > |
struct | meta_or |
struct | meta_or< T1, Args... > |
struct | Min |
The lesser of two values. More... | |
struct | Mod |
The remainder of a / b. More... | |
struct | ModulusRemainder |
The result of modulus_remainder analysis. More... | |
struct | Mul |
The product of two expressions. More... | |
struct | NE |
Is the first expression not equal to the second. More... | |
struct | NoRealizations |
struct | NoRealizations< T, Args... > |
struct | NoRealizations<> |
struct | Not |
Logical not - true if the expression is false. More... | |
class | ObjectInstanceRegistry |
struct | Or |
Logical or - is at least one of the expressions true. More... | |
struct | OutputInfo |
class | Parameter |
A reference-counted handle to a parameter to a Halide pipeline. More... | |
struct | PipelineFeatures |
struct | Prefetch |
Represent a multi-dimensional region of a Func or an ImageParam that needs to be prefetched. More... | |
struct | PrefetchDirective |
struct | ProducerConsumer |
This node is a helpful annotation to do with permissions. More... | |
struct | Provide |
This defines the value of a function at a multi-dimensional location. More... | |
class | PythonExtensionGen |
struct | Ramp |
A linear ramp vector node. More... | |
struct | Realize |
Allocate a multi-dimensional buffer of the given type and size. More... | |
class | ReductionDomain |
A reference-counted handle on a reduction domain, which is just a vector of ReductionVariable. More... | |
struct | ReductionVariable |
A single named dimension of a reduction domain. More... | |
struct | ReductionVariableInfo |
Return a list of reduction variables the expression or tuple depends on. More... | |
class | RefCount |
A class representing a reference count to be used with IntrusivePtr. More... | |
struct | RegionCosts |
Auto scheduling component which is used to assign costs for computing a region of a function or one of its stages. More... | |
class | RegisterGenerator |
struct | Reinterpret |
Reinterpret value as another type, without affecting any of the bits (on little-endian systems). More... | |
struct | ScheduleFeatures |
class | Scope |
A common pattern when traversing Halide IR is that you need to keep track of stuff when you find a Let or a LetStmt, and that it should hide previous values with the same name until you leave the Let or LetStmt nodes. This class helps with that. More... | |
struct | ScopedBinding |
Helper class for pushing/popping Scope<> values, to allow for early-exit in Visitor/Mutators that preserves correctness. More... | |
struct | ScopedBinding< void > |
struct | ScopedValue |
Helper class for saving/restoring variable values on the stack, to allow for early-exit that preserves correctness. More... | |
struct | Select |
A ternary operator. More... | |
struct | select_type |
struct | select_type< First > |
struct | Shuffle |
Construct a new vector by taking elements from another sequence of vectors. More... | |
class | Simplify |
class | SmallStack |
A stack which can store one item very efficiently. More... | |
class | SmallStack< void > |
struct | SolverResult |
struct | Specialization |
struct | Split |
class | StageSchedule |
A schedule for a single stage of a Halide pipeline. More... | |
struct | StaticCast |
struct | Stmt |
A reference-counted handle to a statement node. More... | |
struct | StmtNode |
struct | StorageDim |
Properties of one axis of the storage of a Func. More... | |
struct | Store |
Store a 'value' to the buffer called 'name' at a given 'index' if 'predicate' is true. More... | |
struct | StringImm |
String constants. More... | |
class | StubInput |
class | StubInputBuffer |
StubInputBuffer is the placeholder that a Stub uses when it requires a Buffer for an input (rather than merely a Func or Expr). More... | |
class | StubOutputBuffer |
StubOutputBuffer is the placeholder that a Stub uses when it requires a Buffer for an output (rather than merely a Func). More... | |
class | StubOutputBufferBase |
struct | Sub |
The difference of two expressions. More... | |
class | TemporaryFile |
A simple utility class that creates a temporary file in its ctor and deletes that file in its dtor; this is useful for temporary files that you want to ensure are deleted when exiting a certain scope. More... | |
struct | type_sink |
struct | UIntImm |
Unsigned integer constants. More... | |
struct | Variable |
A named variable. More... | |
class | VariadicVisitor |
A visitor/mutator capable of passing arbitrary arguments to the visit methods using CRTP and returning any types from them. More... | |
struct | VectorReduce |
Horizontally reduce a vector to a scalar or narrower vector using the given commutative and associative binary operator. More... | |
class | Voidifier |
struct | WasmModule |
Handle to compiled wasm code which can be called later. More... | |
struct | Weights |
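The Scope entry above describes values that hide earlier values of the same name until the enclosing Let or LetStmt is left. The following is a minimal standalone sketch of that shadowing behaviour, assuming a per-name stack is an acceptable model. `TinyScope` and its member names are hypothetical and chosen for illustration; this is not Halide's Scope class or its API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only; not Halide's Scope implementation.
// Each name maps to a stack of values; the top of the stack is the
// binding that is currently visible.
template<typename T>
class TinyScope {
    std::map<std::string, std::vector<T>> table;

public:
    // Entering a Let: the new value hides any earlier value with the same name.
    void push(const std::string &name, T value) {
        table[name].push_back(std::move(value));
    }

    // Leaving a Let: the previously hidden value (if any) becomes visible again.
    void pop(const std::string &name) {
        auto it = table.find(name);
        assert(it != table.end() && !it->second.empty());
        it->second.pop_back();
        if (it->second.empty()) {
            table.erase(it);
        }
    }

    // True if some binding for the name is currently visible.
    bool contains(const std::string &name) const {
        return table.count(name) != 0;
    }

    // The innermost (most recently pushed) binding for the name.
    const T &get(const std::string &name) const {
        return table.at(name).back();
    }
};
```

In this sketch a visitor would call `push` when it enters a Let node and `pop` when it leaves, so lookups made while visiting the body always see the innermost binding.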
Typedefs | |
using | AbstractGeneratorPtr = std::unique_ptr< AbstractGenerator > |
typedef std::map< std::string, Interval > | DimBounds |
typedef std::map< std::pair< std::string, int >, Interval > | FuncValueBounds |
template<typename T , typename T2 > | |
using | add_const_if_T_is_const = typename std::conditional< std::is_const< T >::value, const T2, T2 >::type |
template<typename T > | |
using | GeneratorParamImplBase = typename select_type< cond< std::is_same< T, Target >::value, GeneratorParam_Target< T > >, cond< std::is_same< T, LoopLevel >::value, GeneratorParam_LoopLevel >, cond< std::is_same< T, std::string >::value, GeneratorParam_String< T > >, cond< std::is_same< T, Type >::value, GeneratorParam_Type< T > >, cond< std::is_same< T, bool >::value, GeneratorParam_Bool< T > >, cond< std::is_arithmetic< T >::value, GeneratorParam_Arithmetic< T > >, cond< std::is_enum< T >::value, GeneratorParam_Enum< T > >>::type |
template<typename T , typename TBase = typename std::remove_all_extents<T>::type> | |
using | GeneratorInputImplBase = typename select_type< cond< has_static_halide_type_method< TBase >::value, GeneratorInput_Buffer< T > >, cond< std::is_same< TBase, Func >::value, GeneratorInput_Func< T > >, cond< std::is_arithmetic< TBase >::value, GeneratorInput_Arithmetic< T > >, cond< std::is_scalar< TBase >::value, GeneratorInput_Scalar< T > >, cond< std::is_same< TBase, Expr >::value, GeneratorInput_DynamicScalar< T > >>::type |
template<typename T , typename TBase = typename std::remove_all_extents<T>::type> | |
using | GeneratorOutputImplBase = typename select_type< cond< has_static_halide_type_method< TBase >::value, GeneratorOutput_Buffer< T > >, cond< std::is_same< TBase, Func >::value, GeneratorOutput_Func< T > >, cond< std::is_arithmetic< TBase >::value, GeneratorOutput_Arithmetic< T > >>::type |
using | GeneratorFactory = std::function< AbstractGeneratorPtr(const GeneratorContext &context)> |
typedef llvm::raw_pwrite_stream | LLVMOStream |
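The GeneratorParamImplBase, GeneratorInputImplBase, and GeneratorOutputImplBase aliases above all pick an implementation class by listing `cond< predicate, Type >` entries inside `select_type< ... >::type`. A minimal sketch of one way such first-match type selection can be written follows. The names `cond_sketch`, `select_type_sketch`, `handler_for`, and the three handler structs are hypothetical stand-ins, and the fallback to `void` when nothing matches is an assumption of this sketch rather than something the page states.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a (predicate, type) pair.
template<bool B, typename T>
struct cond_sketch {
    static constexpr bool value = B;
    using type = T;
};

// Pick the type of the first entry whose predicate is true.
template<typename First, typename... Rest>
struct select_type_sketch
    : std::conditional<First::value,
                       typename First::type,
                       typename select_type_sketch<Rest...>::type> {};

// Last entry: assumed to fall back to void when its predicate is false.
template<typename First>
struct select_type_sketch<First> {
    using type = typename std::conditional<First::value,
                                           typename First::type,
                                           void>::type;
};

// Hypothetical handler tags used only for this example.
struct StringHandler {};
struct BoolHandler {};
struct ArithmeticHandler {};

// Order matters: bool is also an arithmetic type, so the bool entry
// must come before the arithmetic entry to win the first-match search.
template<typename T>
using handler_for = typename select_type_sketch<
    cond_sketch<std::is_same<T, std::string>::value, StringHandler>,
    cond_sketch<std::is_same<T, bool>::value, BoolHandler>,
    cond_sketch<std::is_arithmetic<T>::value, ArithmeticHandler>>::type;
```

The ordering comment mirrors the alias listed above, where the `bool` entry appears before the `std::is_arithmetic` entry.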
Functions | |
Stmt | add_atomic_mutex (Stmt s, const std::map< std::string, Function > &env) |
Stmt | add_image_checks (const Stmt &s, const std::vector< Function > &outputs, const Target &t, const std::vector< std::string > &order, const std::map< std::string, Function > &env, const FuncValueBounds &fb, bool will_inject_host_copies) |
Insert checks to make sure a statement doesn't read out of bounds on inputs or outputs, and that the inputs and outputs conform to the format required (e.g. More... | |
Stmt | add_parameter_checks (const std::vector< Stmt > &requirements, Stmt s, const Target &t) |
Insert checks to make sure that all referenced parameters meet their constraints. More... | |
Stmt | align_loads (const Stmt &s, int alignment, int min_bytes_to_align) |
Attempt to rewrite unaligned loads from buffers which are known to be aligned to instead load aligned vectors that cover the original load, and then slice the original load out of the aligned vectors. More... | |
Stmt | allocation_bounds_inference (Stmt s, const std::map< std::string, Function > &env, const std::map< std::pair< std::string, int >, Interval > &func_bounds) |
Take a partially lowered statement with Realize nodes in terms of variables, and define values for those variables. More... | |
std::vector< ApplySplitResult > | apply_split (const Split &split, bool is_update, const std::string &prefix, std::map< std::string, Expr > &dim_extent_alignment) |
Given a Split schedule on a definition (init or update), return a list of predicates on the definition, substitutions that need to be applied to the definition (in ascending order of application), and let stmts which define the values of variables referred to by the predicates and substitutions (ordered from innermost to outermost let). More... | |
std::vector< std::pair< std::string, Expr > > | compute_loop_bounds_after_split (const Split &split, const std::string &prefix) |
Compute the loop bounds of the new dimensions resulting from applying the split schedules using the loop bounds of the old dimensions. More... | |
const std::vector< AssociativePattern > & | get_ops_table (const std::vector< Expr > &exprs) |
AssociativeOp | prove_associativity (const std::string &f, std::vector< Expr > args, std::vector< Expr > exprs) |
Given an update definition of a Func 'f', determine its equivalent associative binary/unary operator if there is any. More... | |
void | associativity_test () |
Stmt | fork_async_producers (Stmt s, const std::map< std::string, Function > &env) |
int | string_to_int (const std::string &s) |
Return an int representation of 's'. More... | |
Expr | substitute_var_estimates (Expr e) |
Substitute every variable in an Expr or a Stmt with its estimate if specified. More... | |
Stmt | substitute_var_estimates (Stmt s) |
Expr | get_extent (const Interval &i) |
Return the size of an interval. More... | |
Expr | box_size (const Box &b) |
Return the size of an n-d box. More... | |
void | disp_regions (const std::map< std::string, Box > &regions) |
Helper function to print the bounds of a region. More... | |
Definition | get_stage_definition (const Function &f, int stage_num) |
Return the corresponding definition of a function given the stage. More... | |
std::vector< Dim > & | get_stage_dims (const Function &f, int stage_num) |
Return the corresponding loop dimensions of a function given the stage. More... | |
void | combine_load_costs (std::map< std::string, Expr > &result, const std::map< std::string, Expr > &partial) |
Add partial load costs to the corresponding function in the result costs. More... | |
DimBounds | get_stage_bounds (const Function &f, int stage_num, const DimBounds &pure_bounds) |
Return the required bounds of an intermediate stage (f, stage_num) of function 'f' given the bounds of the pure dimensions. More... | |
std::vector< DimBounds > | get_stage_bounds (const Function &f, const DimBounds &pure_bounds) |
Return the required bounds for all the stages of the function 'f'. More... | |
Expr | perform_inline (Expr e, const std::map< std::string, Function > &env, const std::set< std::string > &inlines=std::set< std::string >(), const std::vector< std::string > &order=std::vector< std::string >()) |
Recursively inline all the functions in the set 'inlines' into the expression 'e' and return the resulting expression. More... | |
std::set< std::string > | get_parents (Function f, int stage) |
Return all functions that are directly called by a function stage (f, stage). More... | |
template<typename K , typename V > | |
V | get_element (const std::map< K, V > &m, const K &key) |
Return value of element within a map. More... | |
template<typename K , typename V > | |
V & | get_element (std::map< K, V > &m, const K &key) |
bool | inline_all_trivial_functions (const std::vector< Function > &outputs, const std::vector< std::string > &order, const std::map< std::string, Function > &env) |
If the cost of computing a Func is about the same as calling the Func, inline the Func. More... | |
std::string | is_func_called_element_wise (const std::vector< std::string > &order, size_t index, const std::map< std::string, Function > &env) |
Determine if a Func (order[index]) is only consumed by another single Func in an element-wise manner. More... | |
bool | inline_all_element_wise_functions (const std::vector< Function > &outputs, const std::vector< std::string > &order, const std::map< std::string, Function > &env) |
Inline a Func if its values are only consumed by another single Func in an element-wise manner. More... | |
void | propagate_estimate_test () |
const FuncValueBounds & | empty_func_value_bounds () |
Interval | bounds_of_expr_in_scope (const Expr &expr, const Scope< Interval > &scope, const FuncValueBounds &func_bounds=empty_func_value_bounds(), bool const_bound=false) |
Given an expression in some variables, and a map from those variables to their bounds (in the form of (minimum possible value, maximum possible value)), compute two expressions that give the minimum possible value and the maximum possible value of this expression. More... | |
Expr | find_constant_bound (const Expr &e, Direction d, const Scope< Interval > &scope=Scope< Interval >::empty_scope()) |
Interval | find_constant_bounds (const Expr &e, const Scope< Interval > &scope) |
Find bounds for a varying expression that are either constants or +/-inf. More... | |
void | merge_boxes (Box &a, const Box &b) |
Expand box a to encompass box b. More... | |
bool | boxes_overlap (const Box &a, const Box &b) |
Test if box a could possibly overlap box b. More... | |
Box | box_union (const Box &a, const Box &b) |
The union of two boxes. More... | |
Box | box_intersection (const Box &a, const Box &b) |
The intersection of two boxes. More... | |
bool | box_contains (const Box &a, const Box &b) |
Test if box a provably contains box b. More... | |
std::map< std::string, Box > | boxes_required (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Compute rectangular domains large enough to cover all the 'Call's to each function that occurs within a given statement or expression. More... | |
std::map< std::string, Box > | boxes_required (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
std::map< std::string, Box > | boxes_provided (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Compute rectangular domains large enough to cover all the 'Provide's to each function that occurs within a given statement or expression. More... | |
std::map< std::string, Box > | boxes_provided (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
std::map< std::string, Box > | boxes_touched (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Compute rectangular domains large enough to cover all the 'Call's and 'Provide's to each function that occurs within a given statement or expression. More... | |
std::map< std::string, Box > | boxes_touched (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_required (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Variants of the above that are only concerned with a single function. More... | |
Box | box_required (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_provided (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_provided (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_touched (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_touched (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
FuncValueBounds | compute_function_value_bounds (const std::vector< std::string > &order, const std::map< std::string, Function > &env) |
Compute the maximum and minimum possible value for each function in an environment. More... | |
Expr | span_of_bounds (const Interval &bounds) |
void | bounds_test () |
Stmt | bounds_inference (Stmt, const std::vector< Function > &outputs, const std::vector< std::string > &realization_order, const std::vector< std::vector< std::string >> &fused_groups, const std::map< std::string, Function > &environment, const std::map< std::pair< std::string, int >, Interval > &func_bounds, const Target &target) |
Take a partially lowered statement that includes symbolic representations of the bounds over which things should be realized, and inject expressions defining those bounds. More... | |
Stmt | bound_small_allocations (const Stmt &s) |
Expr | buffer_accessor (const Buffer<> &buf, const std::vector< Expr > &args) |
template<typename T , typename = typename std::enable_if<!std::is_convertible<T, std::string>::value>::type> | |
std::string | get_name_from_end_of_parameter_pack (T &&) |
std::string | get_name_from_end_of_parameter_pack (const std::string &n) |
std::string | get_name_from_end_of_parameter_pack () |
template<typename First , typename Second , typename... Args> | |
std::string | get_name_from_end_of_parameter_pack (First first, Second second, Args &&...rest) |
void | get_shape_from_start_of_parameter_pack_helper (std::vector< int > &, const std::string &) |
void | get_shape_from_start_of_parameter_pack_helper (std::vector< int > &) |
template<typename... Args> | |
void | get_shape_from_start_of_parameter_pack_helper (std::vector< int > &result, int x, Args &&...rest) |
template<typename... Args> | |
std::vector< int > | get_shape_from_start_of_parameter_pack (Args &&...args) |
template<typename T > | |
void | buffer_type_name_non_const (std::ostream &s) |
template<> | |
void | buffer_type_name_non_const< void > (std::ostream &s) |
template<typename T > | |
std::string | buffer_type_name () |
Stmt | canonicalize_gpu_vars (Stmt s) |
Canonicalize GPU var names into some pre-determined block/thread names (i.e. More... | |
Stmt | clamp_unsafe_accesses (const Stmt &s, const std::map< std::string, Function > &env, FuncValueBounds &func_bounds) |
Inject clamps around func calls h(...) when all the following conditions hold: More... | |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_D3D12Compute_Dev (const Target &target) |
llvm::Type * | get_vector_element_type (llvm::Type *) |
Get the scalar type of an llvm vector type. More... | |
bool | function_takes_user_context (const std::string &name) |
Which built-in functions require a user-context first argument? More... | |
bool | can_allocation_fit_on_stack (int64_t size) |
Given a size (in bytes), return true if the allocation size can fit on the stack; otherwise, return false. More... | |
std::pair< Expr, Expr > | long_div_mod_round_to_zero (const Expr &a, const Expr &b, const uint64_t *max_abs=nullptr) |
Does a {div/mod}_round_to_zero using binary long division for int/uint. More... | |
Expr | lower_mux (const Call *mux) |
Reduce a mux intrinsic to a select tree. More... | |
Expr | lower_round_to_nearest_ties_to_even (const Expr &) |
A vectorizable implementation of Halide::round that doesn't depend on any standard library being present. More... | |
void | get_target_options (const llvm::Module &module, llvm::TargetOptions &options) |
Given an llvm::Module, set llvm::TargetOptions information. More... | |
void | clone_target_options (const llvm::Module &from, llvm::Module &to) |
Given two llvm::Modules, clone target options from one to the other. More... | |
std::unique_ptr< llvm::TargetMachine > | make_target_machine (const llvm::Module &module) |
Given an llvm::Module, get or create an llvm::TargetMachine. More... | |
void | set_function_attributes_from_halide_target_options (llvm::Function &) |
Set the appropriate llvm Function attributes given the Halide Target. More... | |
void | embed_bitcode (llvm::Module *M, const std::string &halide_command) |
Save a copy of the llvm IR currently represented by the module as data in the __LLVM,__bitcode section. More... | |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_Metal_Dev (const Target &target) |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_OpenCL_Dev (const Target &target) |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_OpenGLCompute_Dev (const Target &target) |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_PTX_Dev (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_ARM (const Target &target) |
Construct CodeGen object for a variety of targets. More... | |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_Hexagon (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_PowerPC (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_RISCV (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_X86 (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_WebAssembly (const Target &target) |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_Vulkan_Dev (const Target &target) |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_WebGPU_Dev (const Target &target) |
std::unique_ptr< CompilerLogger > | set_compiler_logger (std::unique_ptr< CompilerLogger > compiler_logger) |
Set the active CompilerLogger object, replacing any existing one. More... | |
CompilerLogger * | get_compiler_logger () |
Return the currently active CompilerLogger object. More... | |
std::string | cplusplus_function_mangled_name (const std::string &name, const std::vector< std::string > &namespaces, Type return_type, const std::vector< ExternFuncArgument > &args, const Target &target) |
Return the mangled C++ name for a function. More... | |
void | cplusplus_mangle_test () |
Expr | common_subexpression_elimination (const Expr &, bool lift_all=false) |
Replace each common sub-expression in the argument with a variable, and wrap the resulting expr in a let statement giving a value to that variable. More... | |
Stmt | common_subexpression_elimination (const Stmt &, bool lift_all=false) |
Do common-subexpression-elimination on each expression in a statement. More... | |
void | cse_test () |
std::ostream & | operator<< (std::ostream &stream, const Stmt &) |
Emit a Halide statement on an output stream (such as std::cout) in a human-readable form. More... | |
std::ostream & | operator<< (std::ostream &stream, const LoweredFunc &) |
Emit a Halide LoweredFunc in a human-readable format. More... | |
void | debug_arguments (LoweredFunc *func, const Target &t) |
Injects debug prints in a LoweredFunc that describe the target and arguments. More... | |
Stmt | debug_to_file (Stmt s, const std::vector< Function > &outputs, const std::map< std::string, Function > &env) |
Takes a statement with Realize nodes still unlowered. More... | |
Expr | extract_odd_lanes (const Expr &a) |
Extract the odd-numbered lanes in a vector. More... | |
Expr | extract_even_lanes (const Expr &a) |
Extract the even-numbered lanes in a vector. More... | |
Expr | extract_lane (const Expr &vec, int lane) |
Extract the nth lane of a vector. More... | |
Stmt | rewrite_interleavings (const Stmt &s) |
Look through a statement for expressions of the form select(ramp % 2 == 0, a, b) and replace them with calls to an interleave intrinsic. More... | |
void | deinterleave_vector_test () |
Expr | remove_let_definitions (const Expr &expr) |
Remove all let definitions of expr. More... | |
std::vector< int > | gather_variables (const Expr &expr, const std::vector< std::string > &filter) |
Return a list of the indices of the variables that expr depends on and that are in the filter. More... | |
std::vector< int > | gather_variables (const Expr &expr, const std::vector< Var > &filter) |
std::map< std::string, ReductionVariableInfo > | gather_rvariables (const Expr &expr) |
std::map< std::string, ReductionVariableInfo > | gather_rvariables (const Tuple &tuple) |
Expr | add_let_expression (const Expr &expr, const std::map< std::string, Expr > &let_var_mapping, const std::vector< std::string > &let_variables) |
Add necessary let expressions to expr. More... | |
std::vector< Expr > | sort_expressions (const Expr &expr) |
Topologically sort the expression graph expressed by expr. More... | |
std::map< std::string, Box > | inference_bounds (const std::vector< Func > &funcs, const std::vector< Box > &output_bounds) |
Compute the bounds of funcs. More... | |
std::map< std::string, Box > | inference_bounds (const Func &func, const Box &output_bounds) |
std::vector< std::pair< Expr, Expr > > | box_to_vector (const Box &bounds) |
Convert a Box to a vector of (min, extent) pairs. More... | |
bool | equal (const RDom &bounds0, const RDom &bounds1) |
Return true if bounds0 and bounds1 represent the same bounds. More... | |
std::vector< std::string > | vars_to_strings (const std::vector< Var > &vars) |
Return a list of variable names. More... | |
ReductionDomain | extract_rdom (const Expr &expr) |
Return the reduction domain used by expr. More... | |
std::pair< bool, Expr > | solve_inverse (Expr expr, const std::string &new_var, const std::string &var) |
Given expr of the form new_var == f(var), solve for var == g(new_var). If multiple values of new_var correspond to the same var, introduce an RDom. More... | |
std::map< std::string, BufferInfo > | find_buffer_param_calls (const Func &func) |
std::set< std::string > | find_implicit_variables (const Expr &expr) |
Find all implicit variables in expr. More... | |
Expr | substitute_rdom_predicate (const std::string &name, const Expr &replacement, const Expr &expr) |
Substitute the variable. More... | |
bool | is_calling_function (const std::string &func_name, const Expr &expr, const std::map< std::string, Expr > &let_var_mapping) |
Return true if expr contains a call to func_name. More... | |
bool | is_calling_function (const Expr &expr, const std::map< std::string, Expr > &let_var_mapping) |
Return true if expr depends on any function or buffer. More... | |
Expr | substitute_call_arg_with_pure_arg (Func f, int variable_id, const Expr &e) |
Replaces call to Func f in Expr e such that the call argument at variable_id is the pure argument. More... | |
Expr | make_device_interface_call (DeviceAPI device_api, MemoryType memory_type=MemoryType::Auto) |
Get an Expr which evaluates to the device interface for the given device api at runtime. More... | |
Stmt | inject_early_frees (const Stmt &s) |
Take a statement with allocations and inject markers (in the form of calls to "mark buffer dead") after the last use of each allocation. More... | |
Type | eliminated_bool_type (Type bool_type, Type other_type) |
If a type is a boolean vector, find the type that it has been changed to by eliminate_bool_vectors. More... | |
bool | is_float16_transcendental (const Call *) |
Check if a call is a float16 transcendental (e.g. More... | |
Expr | lower_float16_transcendental_to_float32_equivalent (const Call *) |
Implement a float16 transcendental using the float32 equivalent. More... | |
Expr | float32_to_bfloat16 (Expr e) |
Cast to/from float and bfloat using bitwise math. More... | |
Expr | float32_to_float16 (Expr e) |
Expr | float16_to_float32 (Expr e) |
Expr | bfloat16_to_float32 (Expr e) |
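The "bitwise math" behind the float/bfloat conversions listed above can be illustrated on plain scalars. The following is a minimal standalone sketch, not Halide's implementation: the helper names `f32_to_bf16_bits` and `bf16_bits_to_f32` are hypothetical, and round-to-nearest-even is an assumption about the rounding mode rather than something this page states.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: narrow a float32 to bfloat16 bits using integer math only.
// bfloat16 keeps the sign, the 8 exponent bits, and the top 7 mantissa bits.
inline uint16_t f32_to_bf16_bits(float f) {
    if (std::isnan(f)) {
        return 0x7FC0;  // canonical quiet NaN, so rounding cannot turn a NaN into infinity
    }
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    // Round to nearest even (assumed): add 0x7FFF plus the lowest bit that survives.
    uint32_t lsb = (bits >> 16) & 1u;
    bits += 0x7FFFu + lsb;
    return static_cast<uint16_t>(bits >> 16);
}

// Hypothetical helper: widen bfloat16 bits back to float32 by restoring the low 16 zero bits.
inline float bf16_bits_to_f32(uint16_t h) {
    uint32_t bits = static_cast<uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Usage sketch: values exactly representable in bfloat16 survive the round trip.
inline void bf16_example() {
    assert(bf16_bits_to_f32(f32_to_bf16_bits(1.5f)) == 1.5f);
}
```

The widening direction is exact because every bfloat16 value is also a float32 value; only the narrowing direction needs a rounding decision.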
Expr | lower_float16_cast (const Cast *op) |
HALIDE_EXPORT_SYMBOL void | unhandled_exception_handler () |
template<> | |
RefCount & | ref_count< IRNode > (const IRNode *t) noexcept |
template<> | |
void | destroy< IRNode > (const IRNode *t) |
bool | is_unordered_parallel (ForType for_type) |
Check if for_type executes for loop iterations in parallel and unordered. More... | |
bool | is_parallel (ForType for_type) |
Returns true if for_type executes for loop iterations in parallel. More... | |
template<typename StmtOrExpr , typename T > | |
bool | stmt_or_expr_uses_vars (const StmtOrExpr &e, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if a statement or expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument. More... | |
template<typename StmtOrExpr > | |
bool | stmt_or_expr_uses_var (const StmtOrExpr &e, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if a statement or expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument. More... | |
bool | expr_uses_var (const Expr &e, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if an expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument. More... | |
bool | stmt_uses_var (const Stmt &stmt, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if a statement references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument. More... | |
template<typename T > | |
bool | expr_uses_vars (const Expr &e, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if an expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument. More... | |
template<typename T > | |
bool | stmt_uses_vars (const Stmt &stmt, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if a statement references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument. More... | |
Stmt | extract_tile_operations (const Stmt &s) |
Rewrite any AMX tile operations that have been stored in the AMXTile memory type as intrinsic calls, to be used in the X86 backend. More... | |
std::map< std::string, Function > | find_direct_calls (const Function &f) |
Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, including in update definitions, update index expressions, and RDom extents. More... | |
std::map< std::string, Function > | find_transitive_calls (const Function &f) |
Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, or indirectly in those functions' definitions, recursively. More... | |
std::map< std::string, Function > | build_environment (const std::vector< Function > &funcs) |
Find all Functions transitively referenced by any Function in funcs and return a map of them. More... | |
Expr | lower_widen_right_add (const Expr &a, const Expr &b) |
Implement intrinsics using non-intrinsic equivalents. More... | |
Expr | lower_widen_right_mul (const Expr &a, const Expr &b) |
Expr | lower_widen_right_sub (const Expr &a, const Expr &b) |
Expr | lower_widening_add (const Expr &a, const Expr &b) |
Expr | lower_widening_mul (const Expr &a, const Expr &b) |
Expr | lower_widening_sub (const Expr &a, const Expr &b) |
Expr | lower_widening_shift_left (const Expr &a, const Expr &b) |
Expr | lower_widening_shift_right (const Expr &a, const Expr &b) |
Expr | lower_rounding_shift_left (const Expr &a, const Expr &b) |
Expr | lower_rounding_shift_right (const Expr &a, const Expr &b) |
Expr | lower_saturating_add (const Expr &a, const Expr &b) |
Expr | lower_saturating_sub (const Expr &a, const Expr &b) |
Expr | lower_saturating_cast (const Type &t, const Expr &a) |
Expr | lower_halving_add (const Expr &a, const Expr &b) |
Expr | lower_halving_sub (const Expr &a, const Expr &b) |
Expr | lower_rounding_halving_add (const Expr &a, const Expr &b) |
Expr | lower_mul_shift_right (const Expr &a, const Expr &b, const Expr &q) |
Expr | lower_rounding_mul_shift_right (const Expr &a, const Expr &b, const Expr &q) |
Expr | lower_intrinsic (const Call *op) |
Replace one of the above ops with equivalent arithmetic. More... | |
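To give a feel for what "equivalent arithmetic" means for the ops above, here is a scalar sketch on 8-bit unsigned values. It is an illustration only: the `_u8` function names are hypothetical, the semantics are inferred from the intrinsic names, and the real lowerings operate on Halide Exprs of arbitrary type rather than on C++ integers.

```cpp
#include <cassert>
#include <cstdint>

// Widening add: compute the sum in a type twice as wide, so it cannot overflow.
inline uint16_t widening_add_u8(uint8_t a, uint8_t b) {
    return static_cast<uint16_t>(static_cast<uint16_t>(a) + b);
}

// Saturating add: clamp the wide sum back into the narrow range.
inline uint8_t saturating_add_u8(uint8_t a, uint8_t b) {
    uint16_t wide = widening_add_u8(a, b);
    return static_cast<uint8_t>(wide > 255 ? 255 : wide);
}

// Halving add: (a + b) / 2 computed without overflow, rounding down.
inline uint8_t halving_add_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(widening_add_u8(a, b) >> 1);
}

// Rounding halving add: (a + b + 1) / 2 computed without overflow.
inline uint8_t rounding_halving_add_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>((widening_add_u8(a, b) + 1) >> 1);
}

// Rounding shift right: add half of the divisor before shifting.
inline uint8_t rounding_shift_right_u8(uint8_t a, int shift) {
    assert(shift > 0 && shift < 8);  // this sketch only handles in-range positive shifts
    return static_cast<uint8_t>((static_cast<uint16_t>(a) + (1u << (shift - 1))) >> shift);
}
```

Each helper widens first and narrows last, which is the simplest way to express these ops without relying on a dedicated instruction.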
Stmt | find_intrinsics (const Stmt &s) |
Replace common arithmetic patterns with intrinsics. More... | |
Expr | find_intrinsics (const Expr &e) |
Expr | lower_intrinsics (const Expr &e) |
The reverse of find_intrinsics. More... | |
Stmt | lower_intrinsics (const Stmt &s) |
Stmt | flatten_nested_ramps (const Stmt &s) |
Take a statement/expression and replace nested ramps and broadcasts. More... | |
Expr | flatten_nested_ramps (const Expr &e) |
template<typename Last > | |
void | check_types (const Tuple &t, int idx) |
template<typename First , typename Second , typename... Rest> | |
void | check_types (const Tuple &t, int idx) |
template<typename Last > | |
void | assign_results (Realization &r, int idx, Last last) |
template<typename First , typename Second , typename... Rest> | |
void | assign_results (Realization &r, int idx, First first, Second second, Rest &&...rest) |
void | schedule_scalar (Func f) |
std::pair< std::vector< Function >, std::map< std::string, Function > > | deep_copy (const std::vector< Function > &outputs, const std::map< std::string, Function > &env) |
Deep copy an entire Function DAG. More... | |
Stmt | zero_gpu_loop_mins (const Stmt &s) |
Rewrite all GPU loops to have a min of zero. More... | |
Stmt | fuse_gpu_thread_loops (Stmt s) |
Converts Halide's GPGPU IR to the OpenCL/CUDA/Metal model. More... | |
Stmt | fuzz_float_stores (const Stmt &s) |
On every store of a floating point value, mask off the least-significant-bit of the mantissa. More... | |
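The masking described for fuzz_float_stores can be shown on a single 32-bit float. This is a standalone sketch under the assumption of IEEE 754 single precision; `float_from_bits`, `bits_from_float`, and `mask_mantissa_lsb` are hypothetical names, and the actual pass rewrites Store nodes in the IR rather than individual C++ values.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

// Reinterpret a float as its raw bit pattern, and back.
inline uint32_t bits_from_float(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

inline float float_from_bits(uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Clear the least-significant mantissa bit, which is bit 0 of the encoding.
inline float mask_mantissa_lsb(float f) {
    return float_from_bits(bits_from_float(f) & ~uint32_t{1});
}

// Usage sketch: a value whose lowest mantissa bit is already zero is unchanged.
inline void mask_example() {
    assert(mask_mantissa_lsb(1.0f) == 1.0f);
}
```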
void | generator_test () |
std::vector< Expr > | parameter_constraints (const Parameter &p) |
template<typename T > | |
HALIDE_NO_USER_CODE_INLINE std::string | enum_to_string (const std::map< std::string, T > &enum_map, const T &t) |
template<typename T > | |
T | enum_from_string (const std::map< std::string, T > &enum_map, const std::string &s) |
const std::map< std::string, Halide::Type > & | get_halide_type_enum_map () |
std::string | halide_type_to_enum_string (const Type &t) |
std::string | halide_type_to_c_source (const Type &t) |
std::string | halide_type_to_c_type (const Type &t) |
const GeneratorFactoryProvider & | get_registered_generators () |
Return a GeneratorFactoryProvider that knows about all the currently-registered C++ Generators. More... | |
int | generate_filter_main (int argc, char **argv) |
generate_filter_main() is a convenient wrapper for GeneratorRegistry::create() + compile_to_files(); it can be trivially wrapped by a "real" main() to produce a command-line utility for ahead-of-time filter compilation. More... | |
int | generate_filter_main (int argc, char **argv, const GeneratorFactoryProvider &generator_factory_provider) |
This overload of generate_filter_main lets you provide your own provider for how to enumerate and/or create the generators based on registration name; this is useful if you want to re-use the 'main' logic but avoid the global Generator registry (e.g. More... | |
template<typename T > | |
T | parse_scalar (const std::string &value) |
std::vector< Type > | parse_halide_type_list (const std::string &types) |
void | execute_generator (const ExecuteGeneratorArgs &args) |
Execute a Generator for AOT compilation – this provides the implementation of the command-line Generator interface generate_filter_main(), but with a structured API that is more suitable for calling directly from code (vs command line). More... | |
Stmt | inject_hexagon_rpc (Stmt s, const Target &host_target, Module &module) |
Pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module. More... | |
Buffer< uint8_t > | compile_module_to_hexagon_shared_object (const Module &device_code) |
Stmt | optimize_hexagon_shuffles (const Stmt &s, int lut_alignment) |
Replace indirect and other loads with simple loads + vlut calls. More... | |
Stmt | scatter_gather_generator (Stmt s) |
Stmt | optimize_hexagon_instructions (Stmt s, const Target &t) |
Hexagon deinterleaves when performing widening operations, and interleaves when performing narrowing operations. More... | |
Expr | native_deinterleave (const Expr &x) |
Generate deinterleave or interleave operations, operating on groups of vectors at a time. More... | |
Expr | native_interleave (const Expr &x) |
bool | is_native_deinterleave (const Expr &x) |
bool | is_native_interleave (const Expr &x) |
std::string | type_suffix (Type type, bool signed_variants=true) |
std::string | type_suffix (const Expr &a, bool signed_variants=true) |
std::string | type_suffix (const Expr &a, const Expr &b, bool signed_variants=true) |
std::string | type_suffix (const std::vector< Expr > &ops, bool signed_variants=true) |
std::vector< InferredArgument > | infer_arguments (const Stmt &body, const std::vector< Function > &outputs) |
Stmt | call_extern_and_assert (const std::string &name, const std::vector< Expr > &args) |
A helper function to call an extern function, and assert that it returns 0. More... | |
Stmt | inject_host_dev_buffer_copies (Stmt s, const Target &t) |
Inject calls to halide_device_malloc, halide_copy_to_device, and halide_copy_to_host as needed. More... | |
Stmt | inline_function (Stmt s, const Function &f) |
Inline a single named function, which must be pure. More... | |
Expr | inline_function (Expr e, const Function &f) |
void | inline_function (Function caller, const Function &f) |
void | validate_schedule_inlined_function (Function f) |
Check if the schedule of an inlined function is legal, throwing an error if it is not. More... | |
template<typename T > | |
RefCount & | ref_count (const T *t) noexcept |
Because in this header we don't yet know how client classes store their RefCount (and we don't want to depend on the declarations of the client classes), any class that you want to hold onto via one of these must provide implementations of ref_count and destroy, which we forward-declare here. More... | |
template<typename T > | |
void | destroy (const T *t) |
bool | equal (const Expr &a, const Expr &b) |
Compare IR nodes for equality of value. More... | |
bool | equal (const Stmt &a, const Stmt &b) |
bool | graph_equal (const Expr &a, const Expr &b) |
bool | graph_equal (const Stmt &a, const Stmt &b) |
bool | graph_less_than (const Expr &a, const Expr &b) |
Order unsanitized IRNodes for use in a map key. More... | |
bool | graph_less_than (const Stmt &a, const Stmt &b) |
void | ir_equality_test () |
bool | expr_match (const Expr &pattern, const Expr &expr, std::vector< Expr > &result) |
Does the first expression have the same structure as the second? Variables in the first expression with the name * are interpreted as wildcards, and their matching equivalent in the second expression is placed in the vector given as the third argument. More... | |
bool | expr_match (const Expr &pattern, const Expr &expr, std::map< std::string, Expr > &result) |
Does the first expression have the same structure as the second? Variables are matched consistently. More... | |
Expr | with_lanes (const Expr &x, int lanes) |
Rewrite the expression x to have lanes lanes. More... | |
void | expr_match_test () |
template<typename Mutator , typename... Args> | |
std::pair< Region, bool > | mutate_region (Mutator *mutator, const Region &bounds, Args &&...args) |
A helper function for mutator-like things to mutate regions. More... | |
bool | is_const (const Expr &e) |
Is the expression either an IntImm, a FloatImm, a StringImm, or a Cast of the same, or a Ramp or Broadcast of the same. More... | |
bool | is_const (const Expr &e, int64_t v) |
Is the expression an IntImm, FloatImm of a particular value, or a Cast, or Broadcast of the same. More... | |
const int64_t * | as_const_int (const Expr &e) |
If an expression is an IntImm or a Broadcast of an IntImm, return a pointer to its value. More... | |
const uint64_t * | as_const_uint (const Expr &e) |
If an expression is a UIntImm or a Broadcast of a UIntImm, return a pointer to its value. More... | |
const double * | as_const_float (const Expr &e) |
If an expression is a FloatImm or a Broadcast of a FloatImm, return a pointer to its value. More... | |
bool | is_const_power_of_two_integer (const Expr &e, int *bits) |
Is the expression a constant integer power of two. More... | |
bool | is_positive_const (const Expr &e) |
Is the expression a const (as defined by is_const), and also strictly greater than zero (in all lanes, if a vector expression) More... | |
bool | is_negative_const (const Expr &e) |
Is the expression a const (as defined by is_const), and also strictly less than zero (in all lanes, if a vector expression) More... | |
bool | is_undef (const Expr &e) |
Is the expression an undef. More... | |
bool | is_const_zero (const Expr &e) |
Is the expression a const (as defined by is_const), and also equal to zero (in all lanes, if a vector expression) More... | |
bool | is_const_one (const Expr &e) |
Is the expression a const (as defined by is_const), and also equal to one (in all lanes, if a vector expression) More... | |
bool | is_no_op (const Stmt &s) |
Is the statement a no-op (which we represent as either an undefined Stmt, or as an Evaluate node of a constant) More... | |
bool | is_pure (const Expr &e) |
Does the expression 1) take on the same value no matter where it appears in a Stmt, and 2) have no side effects when evaluated? More... | |
Expr | make_const (Type t, int64_t val) |
Construct an immediate of the given type from any numeric C++ type. More... | |
Expr | make_const (Type t, uint64_t val) |
Expr | make_const (Type t, double val) |
Expr | make_const (Type t, int32_t val) |
Expr | make_const (Type t, uint32_t val) |
Expr | make_const (Type t, int16_t val) |
Expr | make_const (Type t, uint16_t val) |
Expr | make_const (Type t, int8_t val) |
Expr | make_const (Type t, uint8_t val) |
Expr | make_const (Type t, bool val) |
Expr | make_const (Type t, float val) |
Expr | make_const (Type t, float16_t val) |
Expr | make_signed_integer_overflow (Type type) |
Construct a unique signed_integer_overflow Expr. More... | |
bool | is_signed_integer_overflow (const Expr &expr) |
Check if an expression is a signed_integer_overflow. More... | |
void | check_representable (Type t, int64_t val) |
Check if a constant value can be correctly represented as the given type. More... | |
Expr | make_bool (bool val, int lanes=1) |
Construct a boolean constant from a C++ boolean value. More... | |
Expr | make_zero (Type t) |
Construct the representation of zero in the given type. More... | |
Expr | make_one (Type t) |
Construct the representation of one in the given type. More... | |
Expr | make_two (Type t) |
Construct the representation of two in the given type. More... | |
Expr | const_true (int lanes=1) |
Construct the constant boolean true. More... | |
Expr | const_false (int lanes=1) |
Construct the constant boolean false. More... | |
Expr | lossless_cast (Type t, Expr e) |
Attempt to cast an expression to a smaller type while provably not losing information. More... | |
Expr | lossless_negate (const Expr &x) |
Attempt to negate x without introducing new IR and without overflow. More... | |
void | match_types (Expr &a, Expr &b) |
Coerce the two expressions to have the same type, using C-style casting rules. More... | |
void | match_types_bitwise (Expr &a, Expr &b, const char *op_name) |
Asserts that both expressions are integer types and are either both signed or both unsigned. More... | |
Expr | halide_log (const Expr &a) |
Halide's vectorizable transcendentals. More... | |
Expr | halide_exp (const Expr &a) |
Expr | halide_erf (const Expr &a) |
Expr | raise_to_integer_power (Expr a, int64_t b) |
Raise an expression to an integer power by repeatedly multiplying it by itself. More... | |
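Raising a value to an integer power by repeated multiplication can be sketched on a plain double. The version below uses exponentiation by squaring, which needs only O(log b) multiplications; whether raise_to_integer_power uses squaring or a simple linear chain is not stated on this page, so treat that choice, the reciprocal for negative exponents, and the name `raise_to_integer_power_sketch` as assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar analogue: compute a^b using multiplications only.
inline double raise_to_integer_power_sketch(double a, int64_t b) {
    assert(b != INT64_MIN);  // -b would overflow for this one value
    if (b < 0) {
        // Negative powers are taken as the reciprocal of the positive power.
        return 1.0 / raise_to_integer_power_sketch(a, -b);
    }
    double result = 1.0;
    double base = a;
    uint64_t n = static_cast<uint64_t>(b);
    while (n > 0) {
        if (n & 1u) {
            result *= base;  // this bit of the exponent contributes base^(2^k)
        }
        base *= base;        // square to move on to the next bit
        n >>= 1;
    }
    return result;
}
```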
void | split_into_ands (const Expr &cond, std::vector< Expr > &result) |
Split a boolean condition into a vector of ANDs. More... | |
Expr | strided_ramp_base (const Expr &e, int stride=1) |
If e is a ramp expression with stride, default 1, return the base, otherwise undefined. More... | |
template<typename T > | |
T | mod_imp (T a, T b) |
Implementations of division and mod that are specific to Halide. More... | |
template<typename T > | |
T | div_imp (T a, T b) |
template<> | |
float | mod_imp< float > (float a, float b) |
template<> | |
double | mod_imp< double > (double a, double b) |
template<> | |
float | div_imp< float > (float a, float b) |
template<> | |
double | div_imp< double > (double a, double b) |
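The listing above only says that these division and mod implementations are specific to Halide. As a hedged illustration, the sketch below assumes Euclidean semantics for integers (the remainder is never negative) and assumes that dividing by zero yields zero; both points should be checked against the detailed documentation of mod_imp and div_imp. The names `euclid_mod` and `euclid_div` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed semantics: remainder is always in [0, |b|), and x % 0 is defined as 0.
inline int64_t euclid_mod(int64_t a, int64_t b) {
    if (b == 0) {
        return 0;
    }
    assert(!(a == INT64_MIN && b == -1));  // the C++ % operator overflows here
    int64_t r = a % b;                     // C++ remainder takes the sign of a
    if (r < 0) {
        r += (b < 0) ? -b : b;             // shift into the non-negative range
    }
    return r;
}

// Assumed semantics: the quotient that pairs with euclid_mod, and x / 0 is defined as 0.
inline int64_t euclid_div(int64_t a, int64_t b) {
    if (b == 0) {
        return 0;
    }
    assert(!(a == INT64_MIN && b == -1));  // the C++ / operator overflows here
    int64_t q = a / b;                     // C++ division truncates toward zero
    if (a % b < 0) {
        q += (b < 0) ? 1 : -1;             // compensate for the remainder adjustment
    }
    return q;
}
```

Under these assumptions the identity `a == euclid_div(a, b) * b + euclid_mod(a, b)` holds for every nonzero `b`, which is the property that usually motivates a custom definition over the built-in C++ operators.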
Expr | remove_likelies (const Expr &e) |
Return an Expr that is identical to the input Expr, but with all calls to likely() and likely_if_innermost() removed. More... | |
Stmt | remove_likelies (const Stmt &s) |
Return a Stmt that is identical to the input Stmt, but with all calls to likely() and likely_if_innermost() removed. More... | |
Expr | remove_promises (const Expr &e) |
Return an Expr that is identical to the input Expr, but with all calls to promise_clamped() and unsafe_promise_clamped() removed. More... | |
Stmt | remove_promises (const Stmt &s) |
Return a Stmt that is identical to the input Stmt, but with all calls to promise_clamped() and unsafe_promise_clamped() removed. More... | |
Expr | unwrap_tags (const Expr &e) |
If the expression is a tag helper call, remove it and return the tagged expression. More... | |
HALIDE_NO_USER_CODE_INLINE void | collect_print_args (std::vector< Expr > &args) |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE void | collect_print_args (std::vector< Expr > &args, const char *arg, Args &&...more_args) |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE void | collect_print_args (std::vector< Expr > &args, Expr arg, Args &&...more_args) |
Expr | requirement_failed_error (Expr condition, const std::vector< Expr > &args) |
Expr | memoize_tag_helper (Expr result, const std::vector< Expr > &cache_key_values) |
Expr | unreachable (Type t=Int(32)) |
Return an expression that should never be evaluated. More... | |
template<typename T > | |
Expr | unreachable () |
Expr | promise_clamped (const Expr &value, const Expr &min, const Expr &max) |
FOR INTERNAL USE ONLY. More... | |
template<typename T = void> | |
Expr | widen_right_add (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | widen_right_mul (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | widen_right_sub (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | widening_add (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | widening_mul (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | widening_sub (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | widening_shift_left (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | widening_shift_left (const Expr &a, int b, T *=nullptr) |
template<typename T = void> | |
Expr | widening_shift_right (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | widening_shift_right (const Expr &a, int b, T *=nullptr) |
template<typename T = void> | |
Expr | rounding_shift_left (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | rounding_shift_left (const Expr &a, int b, T *=nullptr) |
template<typename T = void> | |
Expr | rounding_shift_right (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | rounding_shift_right (const Expr &a, int b, T *=nullptr) |
template<typename T = void> | |
Expr | saturating_add (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | saturating_sub (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | halving_add (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | rounding_halving_add (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | halving_sub (const Expr &a, const Expr &b, T *=nullptr) |
template<typename T = void> | |
Expr | mul_shift_right (const Expr &a, const Expr &b, const Expr &q, T *=nullptr) |
template<typename T = void> | |
Expr | mul_shift_right (const Expr &a, const Expr &b, int q, T *=nullptr) |
template<typename T = void> | |
Expr | rounding_mul_shift_right (const Expr &a, const Expr &b, const Expr &q, T *=nullptr) |
template<typename T = void> | |
Expr | rounding_mul_shift_right (const Expr &a, const Expr &b, int q, T *=nullptr) |
std::ostream & | operator<< (std::ostream &stream, const AssociativePattern &) |
Emit a Halide associative pattern on an output stream (such as std::cout) in a human-readable form. More... | |
std::ostream & | operator<< (std::ostream &stream, const AssociativeOp &) |
Emit a Halide associative op on an output stream (such as std::cout) in a human-readable form. More... | |
std::ostream & | operator<< (std::ostream &stream, const ForType &) |
Emit a Halide for loop type (vectorized, serial, etc.) in a human-readable form. More... | |
std::ostream & | operator<< (std::ostream &stream, const VectorReduce::Operator &) |
Emit a horizontal vector reduction op in human-readable form. More... | |
std::ostream & | operator<< (std::ostream &stream, const NameMangling &) |
Emit a Halide name mangling value in a human-readable format. More... | |
std::ostream & | operator<< (std::ostream &stream, const LinkageType &) |
Emit a Halide linkage value in a human-readable format. More... | |
std::ostream & | operator<< (std::ostream &stream, const DimType &) |
Emit a Halide dimension type in human-readable format. More... | |
std::ostream & | operator<< (std::ostream &out, const Closure &c) |
Emit a Closure in human-readable format. More... | |
std::ostream & | operator<< (std::ostream &stream, const Indentation &) |
void * | get_symbol_address (const char *s) |
Expr | lower_lerp (Type final_type, Expr zero_val, Expr one_val, const Expr &weight, const Target &target) |
Build Halide IR that computes a lerp. More... | |
Stmt | hoist_loop_invariant_values (Stmt) |
Hoist loop-invariants out of inner loops. More... | |
Stmt | hoist_loop_invariant_if_statements (Stmt) |
Just hoist loop-invariant if statements as far up as possible. More... | |
template<typename T > | |
auto | iterator_to_pointer (T iter) -> decltype(&*std::declval< T >()) |
std::string | get_llvm_function_name (const llvm::Function *f) |
std::string | get_llvm_function_name (const llvm::Function &f) |
llvm::StructType * | get_llvm_struct_type_by_name (llvm::Module *module, const char *name) |
llvm::Triple | get_triple_for_target (const Target &target) |
Return the llvm::Triple that corresponds to the given Halide Target. More... | |
std::unique_ptr< llvm::Module > | get_initial_module_for_target (Target, llvm::LLVMContext *, bool for_shared_jit_runtime=false, bool just_gpu=false) |
Create an llvm module containing the support code for a given target. More... | |
std::unique_ptr< llvm::Module > | get_initial_module_for_ptx_device (Target, llvm::LLVMContext *c) |
Create an llvm module containing the support code for ptx device. More... | |
void | add_bitcode_to_module (llvm::LLVMContext *context, llvm::Module &module, const std::vector< uint8_t > &bitcode, const std::string &name) |
Link a block of llvm bitcode into an llvm module. More... | |
std::unique_ptr< llvm::Module > | link_with_wasm_jit_runtime (llvm::LLVMContext *c, const Target &t, std::unique_ptr< llvm::Module > extra_module) |
Take the llvm::Module(s) in extra_modules (if any), add the runtime modules needed for the WASM JIT, and link into a single llvm::Module. More... | |
Stmt | loop_carry (Stmt, int max_carried_values=8) |
Reuse loads done on previous loop iterations by stashing them in induction variables instead of redoing the load. More... | |
Module | lower (const std::vector< Function > &output_funcs, const std::string &pipeline_name, const Target &t, const std::vector< Argument > &args, LinkageType linkage_type, const std::vector< Stmt > &requirements=std::vector< Stmt >(), bool trace_pipeline=false, const std::vector< IRMutator * > &custom_passes=std::vector< IRMutator * >()) |
Given a vector of scheduled halide functions, create a Module that evaluates it. More... | |
Stmt | lower_main_stmt (const std::vector< Function > &output_funcs, const std::string &pipeline_name, const Target &t, const std::vector< Stmt > &requirements=std::vector< Stmt >(), bool trace_pipeline=false, const std::vector< IRMutator * > &custom_passes=std::vector< IRMutator * >()) |
Given a halide function with a schedule, create a statement that evaluates it. More... | |
void | lower_test () |
Stmt | lower_parallel_tasks (const Stmt &s, std::vector< LoweredFunc > &closure_implementations, const std::string &name, const Target &t) |
Stmt | lower_warp_shuffles (Stmt s, const Target &t) |
Rewrite access to things stored outside the loop over GPU lanes to use NVIDIA's warp shuffle instructions. More... | |
Stmt | inject_memoization (const Stmt &s, const std::map< std::string, Function > &env, const std::string &name, const std::vector< Function > &outputs) |
Transform pipeline calls for Funcs scheduled with memoize to do a lookup call to the runtime cache implementation, and if there is a miss, compute the results and call the runtime to store it back to the cache. More... | |
Stmt | rewrite_memoized_allocations (const Stmt &s, const std::map< std::string, Function > &env) |
This should be called after Storage Flattening has added Allocation IR nodes. More... | |
std::map< OutputFileType, const OutputInfo > | get_output_info (const Target &target) |
ModulusRemainder | operator+ (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator- (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator* (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator/ (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator% (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator+ (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | operator- (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | operator* (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | operator/ (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | operator% (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | modulus_remainder (const Expr &e) |
For things like alignment analysis, often it's helpful to know if an integer expression is some multiple of a constant plus some other constant. More... | |
ModulusRemainder | modulus_remainder (const Expr &e, const Scope< ModulusRemainder > &scope) |
If we have alignment information about external variables, we can let the analysis know about that using this version of modulus_remainder: More... | |
void | modulus_remainder_test () |
int64_t | gcd (int64_t, int64_t) |
The greatest common divisor of two integers. More... | |
int64_t | lcm (int64_t, int64_t) |
The least common multiple of two integers. More... | |
ConstantInterval | derivative_bounds (const Expr &e, const std::string &var, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope()) |
Find the bounds of the derivative of an expression. More... | |
Monotonic | is_monotonic (const Expr &e, const std::string &var, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope()) |
Monotonic | is_monotonic (const Expr &e, const std::string &var, const Scope< Monotonic > &scope) |
std::ostream & | operator<< (std::ostream &stream, const Monotonic &m) |
Emit the monotonic class in human-readable form for debugging. More... | |
void | is_monotonic_test () |
Stmt | inject_gpu_offload (const Stmt &s, const Target &host_target) |
Pull loops marked with GPU device APIs to a separate module, and call them through the appropriate host runtime module. More... | |
Stmt | optimize_shuffles (Stmt s, int lut_alignment) |
bool | can_parallelize_rvar (const std::string &rvar, const std::string &func, const Definition &r) |
Returns whether or not Halide can prove that it is safe to parallelize an update definition across a specific variable. More... | |
void | check_call_arg_types (const std::string &name, std::vector< Expr > *args, int dims) |
Validate arguments to a call to a func, image or imageparam. More... | |
bool | has_uncaptured_likely_tag (const Expr &e) |
Return true if an expression uses a likely tag that isn't captured by an enclosing Select, Min, or Max. More... | |
bool | has_likely_tag (const Expr &e) |
Return true if an expression uses a likely tag. More... | |
Stmt | partition_loops (Stmt s) |
Partitions loop bodies into a prologue, a steady state, and an epilogue. More... | |
Stmt | inject_placeholder_prefetch (const Stmt &s, const std::map< std::string, Function > &env, const std::string &prefix, const std::vector< PrefetchDirective > &prefetches) |
Inject placeholder prefetches to 's'. More... | |
Stmt | inject_prefetch (const Stmt &s, const std::map< std::string, Function > &env) |
Compute the actual region to be prefetched and place it in the placeholder prefetch. More... | |
Stmt | reduce_prefetch_dimension (Stmt stmt, const Target &t) |
Reduce a multi-dimensional prefetch into a prefetch of lower dimension (max dimension of the prefetch is specified by target architecture). More... | |
Stmt | hoist_prefetches (const Stmt &s) |
Hoist all the prefetches in a Block to the beginning of the Block. More... | |
std::string | print_loop_nest (const std::vector< Function > &output_funcs) |
Emit some simple pseudocode that shows the structure of the loop nest specified by this pipeline's schedule, and the schedules of the functions it uses. More... | |
Stmt | inject_profiling (Stmt, const std::string &) |
Take a statement representing a halide pipeline and insert high-resolution timing into the generated code (via spawning a thread that acts as a sampling profiler); summaries of execution times and counts will be logged at the end. More... | |
Expr | purify_index_math (const Expr &) |
Bounds inference and related stages can lift integer bounds expressions out of if statements that guard against those integer expressions doing side-effecty things like dividing or modding by zero. More... | |
Expr | qualify (const std::string &prefix, const Expr &value) |
Prefix all variable names in the given expression with the prefix string. More... | |
Expr | random_float (const std::vector< Expr > &) |
Return a random floating-point number between zero and one that varies deterministically based on the input expressions. More... | |
Expr | random_int (const std::vector< Expr > &) |
Return a random unsigned integer between zero and 2^32-1 that varies deterministically based on the input expressions (which must be integers or unsigned integers). More... | |
Expr | lower_random (const Expr &e, const std::vector< VarOrRVar > &free_vars, int tag) |
Convert calls to random() to IR generated by random_float and random_int. More... | |
std::pair< std::vector< std::string >, std::vector< std::vector< std::string > > > | realization_order (const std::vector< Function > &outputs, std::map< std::string, Function > &env) |
Given a bunch of functions that call each other, determine an order in which to do the scheduling. More... | |
std::vector< std::string > | topological_order (const std::vector< Function > &outputs, const std::map< std::string, Function > &env) |
Given a bunch of functions that call each other, determine a topological order which stays constant regardless of the schedule. More... | |
Stmt | rebase_loops_to_zero (const Stmt &) |
Rewrite the mins of most loops to 0. More... | |
void | split_predicate_test () |
bool | is_func_trivial_to_inline (const Function &func) |
Return true if the cost of inlining a function is equivalent to the cost of calling the function directly. More... | |
Stmt | remove_dead_allocations (const Stmt &s) |
Find Allocate/Free pairs that are never loaded from or stored to, and remove them from the Stmt. More... | |
Stmt | remove_extern_loops (const Stmt &s) |
Removes placeholder loops for extern stages. More... | |
Stmt | remove_undef (Stmt s) |
Removes stores that depend on undef values, and statements that only contain such stores. More... | |
Stmt | schedule_functions (const std::vector< Function > &outputs, const std::vector< std::vector< std::string >> &fused_groups, const std::map< std::string, Function > &env, const Target &target, bool &any_memoized) |
Build loop nests and inject Function realizations at the appropriate places using the schedule. More... | |
template<typename T > | |
std::ostream & | operator<< (std::ostream &stream, const Scope< T > &s) |
Stmt | select_gpu_api (const Stmt &s, const Target &t) |
Replace for loops with GPU_Default device_api with an actual device API depending on what's enabled in the target. More... | |
Stmt | simplify (const Stmt &, bool remove_dead_code=true, const Scope< Interval > &bounds=Scope< Interval >::empty_scope(), const Scope< ModulusRemainder > &alignment=Scope< ModulusRemainder >::empty_scope()) |
Perform a wide range of simplifications to expressions and statements, including constant folding, substituting in trivial values, arithmetic rearranging, etc. More... | |
Expr | simplify (const Expr &, bool remove_dead_code=true, const Scope< Interval > &bounds=Scope< Interval >::empty_scope(), const Scope< ModulusRemainder > &alignment=Scope< ModulusRemainder >::empty_scope()) |
bool | can_prove (Expr e, const Scope< Interval > &bounds=Scope< Interval >::empty_scope()) |
Attempt to statically prove an expression is true using the simplifier. More... | |
Stmt | simplify_exprs (const Stmt &) |
Simplify expressions found in a statement, but don't simplify across different statements. More... | |
int64_t | saturating_mul (int64_t a, int64_t b) |
Stmt | simplify_correlated_differences (const Stmt &) |
Symbolic interval arithmetic can be extremely conservative in cases where we analyze the difference between two correlated expressions. More... | |
Expr | bound_correlated_differences (const Expr &expr) |
Refactor the expression to remove correlated differences or rewrite them in a form that is more amenable to bounds inference. More... | |
void | simplify_specializations (std::map< std::string, Function > &env) |
Try to simplify the RHS/LHS of a function's definition based on its specializations. More... | |
Stmt | skip_stages (Stmt s, const std::vector< std::string > &order) |
Avoid computing certain stages if we can infer a runtime condition to check that tells us they won't be used. More... | |
Stmt | sliding_window (const Stmt &s, const std::map< std::string, Function > &env) |
Perform sliding window optimizations on a halide statement. More... | |
SolverResult | solve_expression (const Expr &e, const std::string &variable, const Scope< Expr > &scope=Scope< Expr >::empty_scope()) |
Attempts to collect all instances of a variable in an expression tree and place it as far to the left as possible, and as far up the tree as possible (i.e. More... | |
Interval | solve_for_outer_interval (const Expr &c, const std::string &variable) |
Find the smallest interval such that the condition is either true or false inside of it, but definitely false outside of it. More... | |
Interval | solve_for_inner_interval (const Expr &c, const std::string &variable) |
Find the largest interval such that the condition is definitely true inside of it, and might be true or false outside of it. More... | |
Expr | and_condition_over_domain (const Expr &c, const Scope< Interval > &varying) |
Take a conditional that includes variables that vary over some domain, and convert it to a more conservative (less frequently true) condition that doesn't depend on those variables. More... | |
void | solve_test () |
void | spirv_ir_test () |
Internal test for SPIR-V IR. More... | |
Stmt | split_tuples (const Stmt &s, const std::map< std::string, Function > &env) |
Rewrite all tuple-valued Realizations, Provide nodes, and Call nodes into several scalar-valued ones, so that later lowering passes only need to think about scalar-valued productions. More... | |
Stmt | stage_strided_loads (const Stmt &s) |
Convert all unpredicated strided loads in a Stmt into dense loads followed by shuffles. More... | |
void | print_to_viz (const std::string &html_output_filename, const Module &m, const std::string &assembly_input_filename="") |
Dump an HTML-formatted visualization of a Module to filename. More... | |
Stmt | storage_flattening (Stmt s, const std::vector< Function > &outputs, const std::map< std::string, Function > &env, const Target &target) |
Take a statement with multi-dimensional Realize, Provide, and Call nodes, and turn it into a statement with single-dimensional Allocate, Store, and Load nodes respectively. More... | |
Stmt | storage_folding (const Stmt &s, const std::map< std::string, Function > &env) |
Fold storage of functions if possible. More... | |
bool | strictify_float (std::map< std::string, Function > &env, const Target &t) |
Propagate strict_float intrinsics such that they immediately wrap all floating-point expressions. More... | |
Expr | substitute (const std::string &name, const Expr &replacement, const Expr &expr) |
Substitute variables with the given name with the replacement expression within expr. More... | |
Stmt | substitute (const std::string &name, const Expr &replacement, const Stmt &stmt) |
Substitute variables with the given name with the replacement expression within stmt. More... | |
Expr | substitute (const std::map< std::string, Expr > &replacements, const Expr &expr) |
Substitute variables with names in the map. More... | |
Stmt | substitute (const std::map< std::string, Expr > &replacements, const Stmt &stmt) |
Expr | substitute (const Expr &find, const Expr &replacement, const Expr &expr) |
Substitute expressions for other expressions. More... | |
Stmt | substitute (const Expr &find, const Expr &replacement, const Stmt &stmt) |
Expr | graph_substitute (const std::string &name, const Expr &replacement, const Expr &expr) |
Substitutions where the IR may be a general graph (and not just a DAG). More... | |
Stmt | graph_substitute (const std::string &name, const Expr &replacement, const Stmt &stmt) |
Expr | graph_substitute (const Expr &find, const Expr &replacement, const Expr &expr) |
Stmt | graph_substitute (const Expr &find, const Expr &replacement, const Stmt &stmt) |
Expr | substitute_in_all_lets (const Expr &expr) |
Substitute in all let Exprs in a piece of IR. More... | |
Stmt | substitute_in_all_lets (const Stmt &stmt) |
void | target_test () |
Stmt | inject_tracing (Stmt, const std::string &pipeline_name, bool trace_pipeline, const std::map< std::string, Function > &env, const std::vector< Function > &outputs, const Target &Target) |
Take a statement representing a halide pipeline, and inject calls to tracing functions at interesting points, such as allocations. More... | |
Stmt | trim_no_ops (Stmt s) |
Truncate loop bounds to the region over which they actually do something. More... | |
Stmt | unify_duplicate_lets (const Stmt &s) |
Find let statements that all define the same value, and make later ones just reuse the symbol names of the earlier ones. More... | |
Stmt | uniquify_variable_names (const Stmt &s) |
Modify a statement so that every internally-defined variable name is unique. More... | |
void | uniquify_variable_names_test () |
Stmt | unpack_buffers (Stmt s) |
Creates let stmts for the various buffer components (e.g. More... | |
Stmt | unroll_loops (const Stmt &) |
Take a statement with for loops marked for unrolling, and convert each into several copies of the innermost statement. More... | |
Stmt | lower_unsafe_promises (const Stmt &s, const Target &t) |
Lower all unsafe promises into either assertions or unchecked code, depending on the target. More... | |
Stmt | lower_safe_promises (const Stmt &s) |
Lower all safe promises by just stripping them. More... | |
template<typename DST , typename SRC , typename std::enable_if< std::is_floating_point< SRC >::value >::type * = nullptr> | |
DST | safe_numeric_cast (SRC s) |
Some numeric conversions are UB if the value won't fit in the result; safe_numeric_cast<>() is meant as a drop-in replacement for a C/C++ cast that adds well-defined behavior for the UB cases, attempting to mimic common implementation behavior as much as possible. More... | |
template<typename DstType , typename SrcType > | |
DstType | reinterpret_bits (const SrcType &src) |
An aggressive form of reinterpret cast used for correct type-punning. More... | |
std::string | make_entity_name (void *stack_ptr, const std::string &type, char prefix) |
Make a unique name for an object based on the name of the stack variable passed in. More... | |
std::string | get_env_variable (char const *env_var_name) |
Get value of an environment variable. More... | |
std::string | running_program_name () |
Get the name of the currently running executable. More... | |
std::string | unique_name (char prefix) |
Generate a unique name starting with the given prefix. More... | |
std::string | unique_name (const std::string &prefix) |
bool | starts_with (const std::string &str, const std::string &prefix) |
Test if the first string starts with the second string. More... | |
bool | ends_with (const std::string &str, const std::string &suffix) |
Test if the first string ends with the second string. More... | |
std::string | replace_all (const std::string &str, const std::string &find, const std::string &replace) |
Replace all matches of the second string in the first string with the last string. More... | |
std::vector< std::string > | split_string (const std::string &source, const std::string &delim) |
Split the source string using 'delim' as the divider. More... | |
template<typename T , typename Fn > | |
T | fold_left (const std::vector< T > &vec, Fn f) |
Perform a left fold of a vector. More... | |
template<typename T , typename Fn > | |
T | fold_right (const std::vector< T > &vec, Fn f) |
Returns a right fold of a vector. More... | |
std::string | extract_namespaces (const std::string &name, std::vector< std::string > &namespaces) |
Returns base name and fills in namespaces, outermost one first in vector. More... | |
std::string | strip_namespaces (const std::string &name) |
Like extract_namespaces(), but strip and discard the namespaces, returning base name only. More... | |
std::string | file_make_temp (const std::string &prefix, const std::string &suffix) |
Create a unique file with a name of the form prefixXXXXXsuffix in an arbitrary (but writable) directory; this is typically /tmp, but the specific location is not guaranteed. More... | |
std::string | dir_make_temp () |
Create a unique directory in an arbitrary (but writable) directory; this is typically somewhere inside /tmp, but the specific location is not guaranteed. More... | |
bool | file_exists (const std::string &name) |
Wrapper for access(). More... | |
void | assert_file_exists (const std::string &name) |
assert-fail if the file doesn't exist. More... | |
void | assert_no_file_exists (const std::string &name) |
assert-fail if the file DOES exist. More... | |
void | file_unlink (const std::string &name) |
Wrapper for unlink(). More... | |
void | ensure_no_file_exists (const std::string &name) |
Ensure that no file with this path exists. More... | |
void | dir_rmdir (const std::string &name) |
Wrapper for rmdir(). More... | |
FileStat | file_stat (const std::string &name) |
Wrapper for stat(). More... | |
std::vector< char > | read_entire_file (const std::string &pathname) |
Read the entire contents of a file into a vector<char>. More... | |
void | write_entire_file (const std::string &pathname, const void *source, size_t source_len) |
Create or replace the contents of a file with a given pointer-and-length of memory. More... | |
void | write_entire_file (const std::string &pathname, const std::vector< char > &source) |
bool | add_would_overflow (int bits, int64_t a, int64_t b) |
Routines to test if math would overflow for signed integers with the given number of bits. More... | |
bool | sub_would_overflow (int bits, int64_t a, int64_t b) |
bool | mul_would_overflow (int bits, int64_t a, int64_t b) |
HALIDE_MUST_USE_RESULT bool | add_with_overflow (int bits, int64_t a, int64_t b, int64_t *result) |
Routines to perform arithmetic on signed types without triggering signed overflow. More... | |
HALIDE_MUST_USE_RESULT bool | sub_with_overflow (int bits, int64_t a, int64_t b, int64_t *result) |
HALIDE_MUST_USE_RESULT bool | mul_with_overflow (int bits, int64_t a, int64_t b, int64_t *result) |
void | halide_tic_impl (const char *file, int line) |
void | halide_toc_impl (const char *file, int line) |
std::string | c_print_name (const std::string &name, bool prefix_underscore=true) |
Emit a version of a string that is a valid identifier in C (. More... | |
int | get_llvm_version () |
Return the LLVM_VERSION against which this libHalide is compiled. More... | |
void | run_with_large_stack (const std::function< void()> &action) |
Call the given action in a platform-specific context that provides at least the stack space returned by get_compiler_stack_size. More... | |
int | popcount64 (uint64_t x) |
Portable versions of popcount, count-leading-zeros, and count-trailing-zeros. More... | |
int | clz64 (uint64_t x) |
int | ctz64 (uint64_t x) |
std::vector< Var > | make_argument_list (int dimensionality) |
Make a list of unique arguments for definitions with unnamed arguments. More... | |
Stmt | vectorize_loops (const Stmt &s, const std::map< std::string, Function > &env) |
Take a statement with for loops marked for vectorization, and turn them into single statements that operate on vectors. More... | |
std::map< std::string, Function > | wrap_func_calls (const std::map< std::string, Function > &env) |
Replace every call to wrapped Functions in the Functions' definitions with calls to their wrapper functions. More... | |
std::string | get_test_tmp_dir () |
Return the path to a directory that can be safely written to when running tests; the contents of the directory may or may not outlast the lifetime of the test itself (i.e., the files may be cleaned up after test execution). More... | |
Variables | |
const int64_t | unknown = std::numeric_limits<int64_t>::min() |
constexpr IRNodeType | StrongestExprNodeType = IRNodeType::VectorReduce |
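The overflow helpers listed above (add_would_overflow, add_with_overflow and their sub/mul siblings) take a bit width and 64-bit operands. The following is a minimal standalone sketch of the addition case only; it is not Halide's implementation. The `sketch_` names are hypothetical, and the convention that the `_with_overflow` variant returns true on success is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest value representable in a signed integer of the given width (1..64 bits).
inline int64_t sketch_signed_max(int bits) {
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? std::numeric_limits<int64_t>::max()
                      : (int64_t(1) << (bits - 1)) - 1;
}

// Smallest value representable in a signed integer of the given width.
inline int64_t sketch_signed_min(int bits) {
    return -sketch_signed_max(bits) - 1;
}

// True if a + b does not fit in a signed integer of `bits` bits.
// Assumes a and b themselves already fit in that width.
inline bool sketch_add_would_overflow(int bits, int64_t a, int64_t b) {
    const int64_t max_val = sketch_signed_max(bits);
    const int64_t min_val = sketch_signed_min(bits);
    // The comparisons are rearranged so no intermediate value overflows int64_t.
    return (b > 0 && a > max_val - b) || (b < 0 && a < min_val - b);
}

// Writes a + b to *result and returns true only when the sum is representable.
// (Return convention is an assumption of this sketch.)
inline bool sketch_add_with_overflow(int bits, int64_t a, int64_t b, int64_t *result) {
    if (sketch_add_would_overflow(bits, a, b)) {
        return false;
    }
    *result = a + b;
    return true;
}
```

Testing against the limits before adding avoids ever evaluating a signed sum that would be undefined behavior in C++.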
using Halide::Internal::AbstractGeneratorPtr = typedef std::unique_ptr<AbstractGenerator> |
Definition at line 221 of file AbstractGenerator.h.
typedef std::map<std::string, Interval> Halide::Internal::DimBounds |
Definition at line 20 of file AutoScheduleUtils.h.
typedef std::map<std::pair<std::string, int>, Interval> Halide::Internal::FuncValueBounds |
using Halide::Internal::add_const_if_T_is_const = typedef typename std::conditional<std::is_const<T>::value, const T2, T2>::type |
using Halide::Internal::GeneratorParamImplBase = typedef typename select_type< cond<std::is_same<T, Target>::value, GeneratorParam_Target<T> >, cond<std::is_same<T, LoopLevel>::value, GeneratorParam_LoopLevel>, cond<std::is_same<T, std::string>::value, GeneratorParam_String<T> >, cond<std::is_same<T, Type>::value, GeneratorParam_Type<T> >, cond<std::is_same<T, bool>::value, GeneratorParam_Bool<T> >, cond<std::is_arithmetic<T>::value, GeneratorParam_Arithmetic<T> >, cond<std::is_enum<T>::value, GeneratorParam_Enum<T> >>::type |
Definition at line 950 of file Generator.h.
using Halide::Internal::GeneratorInputImplBase = typedef typename select_type< cond<has_static_halide_type_method<TBase>::value, GeneratorInput_Buffer<T> >, cond<std::is_same<TBase, Func>::value, GeneratorInput_Func<T> >, cond<std::is_arithmetic<TBase>::value, GeneratorInput_Arithmetic<T> >, cond<std::is_scalar<TBase>::value, GeneratorInput_Scalar<T> >, cond<std::is_same<TBase, Expr>::value, GeneratorInput_DynamicScalar<T> >>::type |
Definition at line 2177 of file Generator.h.
using Halide::Internal::GeneratorOutputImplBase = typedef typename select_type< cond<has_static_halide_type_method<TBase>::value, GeneratorOutput_Buffer<T> >, cond<std::is_same<TBase, Func>::value, GeneratorOutput_Func<T> >, cond<std::is_arithmetic<TBase>::value, GeneratorOutput_Arithmetic<T> >>::type |
Definition at line 2781 of file Generator.h.
using Halide::Internal::GeneratorFactory = typedef std::function<AbstractGeneratorPtr(const GeneratorContext &context)> |
Definition at line 3105 of file Generator.h.
typedef llvm::raw_pwrite_stream Halide::Internal::LLVMOStream |
Definition at line 27 of file LLVM_Output.h.
|
strong |
Enumerator | |
---|---|
Scalar | |
Function | |
Buffer |
Definition at line 26 of file AbstractGenerator.h.
|
strong |
Enumerator | |
---|---|
Input | |
Output |
Definition at line 30 of file AbstractGenerator.h.
|
strong |
All our IR node types get unique IDs for the purposes of RTTI.
|
strong |
An enum describing a type of loop traversal.
Used in schedules, and in the For loop IR node. Serial is a conventional ordered for loop. Iterations occur in increasing order, and each iteration must appear to have finished before the next begins. Parallel, GPUBlock, and GPUThread are parallel and unordered: iterations may occur in any order, and multiple iterations may occur simultaneously. Vectorized and GPULane are parallel and synchronous: they act as if all iterations occur at the same time in lockstep.
Enumerator | |
---|---|
Serial | |
Parallel | |
Vectorized | |
Unrolled | |
Extern | |
GPUBlock | |
GPUThread | |
GPULane |
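The grouping described above (ordered, parallel and unordered, parallel and synchronous) can be written down as predicates. This is a standalone sketch using a local mirror of the enumerators, not Halide's own header; the `sketch_` names are hypothetical. The description does not classify Unrolled or Extern, so the sketch reports them as neither kind of parallel.

```cpp
#include <cassert>

// Local mirror of the enumerators listed above (hypothetical type name).
enum class ForTypeSketch {
    Serial, Parallel, Vectorized, Unrolled, Extern, GPUBlock, GPUThread, GPULane
};

// Parallel and unordered: iterations may run in any order, possibly simultaneously.
inline bool sketch_is_unordered_parallel(ForTypeSketch t) {
    return t == ForTypeSketch::Parallel ||
           t == ForTypeSketch::GPUBlock ||
           t == ForTypeSketch::GPUThread;
}

// Parallel and synchronous: iterations act as if they all run in lockstep.
inline bool sketch_is_lockstep_parallel(ForTypeSketch t) {
    return t == ForTypeSketch::Vectorized || t == ForTypeSketch::GPULane;
}

// Any loop type that the description above calls parallel.
inline bool sketch_is_parallel(ForTypeSketch t) {
    return sketch_is_unordered_parallel(t) || sketch_is_lockstep_parallel(t);
}
```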
|
strong |
Enumerator | |
---|---|
Type | |
Dim | |
ArraySize |
Definition at line 2883 of file Generator.h.
|
strong |
Detect whether an expression is monotonic increasing in a variable, decreasing, or unknown.
Enumerator | |
---|---|
Constant | |
Increasing | |
Decreasing | |
Unknown |
Definition at line 25 of file Monotonic.h.
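To illustrate how the four classifications above compose, here is a standalone sketch that combines the classifications of two subexpressions under addition and negation. It uses a local mirror of the enum and hypothetical `sketch_` helpers; it assumes "increasing" and "decreasing" are meant non-strictly, and it is not how Halide's is_monotonic is implemented.

```cpp
#include <cassert>

// Local mirror of the enumerators listed above (hypothetical type name).
enum class MonotonicSketch { Constant, Increasing, Decreasing, Unknown };

// Classification of a + b given the classifications of a and b.
inline MonotonicSketch sketch_sum(MonotonicSketch a, MonotonicSketch b) {
    if (a == MonotonicSketch::Constant) return b;  // adding a constant changes nothing
    if (b == MonotonicSketch::Constant) return a;
    if (a == b) return a;                          // same direction is preserved
    return MonotonicSketch::Unknown;               // opposite directions may cancel
}

// Classification of -a given the classification of a.
inline MonotonicSketch sketch_negate(MonotonicSketch a) {
    switch (a) {
    case MonotonicSketch::Increasing: return MonotonicSketch::Decreasing;
    case MonotonicSketch::Decreasing: return MonotonicSketch::Increasing;
    default: return a;  // Constant and Unknown are unchanged by negation
    }
}

// Classification of a - b, expressed as a + (-b).
inline MonotonicSketch sketch_difference(MonotonicSketch a, MonotonicSketch b) {
    return sketch_sum(a, sketch_negate(b));
}
```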
|
strong |
Each Dim below has a dim_type, which tells you what transformations are legal on it.
When you combine two Dims of distinct DimTypes (e.g. with Stage::fuse), the combined result has the greater enum value of the two types.
Enumerator | |
---|---|
PureVar | This dim originated from a Var. You can evaluate a Func at distinct values of this Var in any order over an interval that's at least as large as the interval required. In pure definitions you can even redundantly re-evaluate points. |
PureRVar | The dim originated from an RVar. You can evaluate a Func at distinct values of this RVar in any order (including in parallel) over exactly the interval specified in the RDom. PureRVars can also be reordered arbitrarily in the dims list, as there are no data hazards between the evaluation of the Func at distinct values of the RVar. The most common case where an RVar is considered pure is RVars that are used in a way which obeys all the syntactic constraints that a Var does, e.g: RDom r(0, 100);
f(r.x) = f(r.x) + 5;
Other cases where RVars are pure are where the sites being written to by the Func evaluated at one value of the RVar couldn't possibly collide with the sites being written or read by the Func at a distinct value of the RVar. For example, r.x is pure in the following three definitions: // This definition writes to even coordinates and reads from the
// same site (which no other value of r.x is writing to) and odd
// sites (which no other value of r.x is writing to):
f(2*r.x) = max(f(2*r.x), f(2*r.x + 7));
// This definition writes to scanline zero and reads from the
// same site and scanline one:
f(r.x, 0) += f(r.x, 1);
// This definition reads and writes over non-overlapping ranges:
f(r.x + 100) += f(r.x);
To give two counterexamples, r.x is not pure in the following definitions: // The same site is written by distinct values of the RVar
// (write-after-write hazard):
f(r.x / 2) += f(r.x);
// One value of r.x reads from a site that another value of r.x
// is writing to (read-after-write hazard):
f(r.x) += f(r.x + 1);
|
ImpureRVar | The dim originated from an RVar. You must evaluate a Func at distinct values of this RVar in increasing order over precisely the interval specified in the RDom. ImpureRVars may not be reordered with respect to other ImpureRVars. All RVars are impure by default. Those for which we can prove no data hazards exist get promoted to PureRVar. There are two instances in which ImpureRVars may be parallelized or reordered even in the presence of hazards: 1) In the case of an update definition that has been proven to be an associative and commutative reduction, reordering of ImpureRVars is allowed, and parallelizing them is allowed if the update has been made atomic. 2) ImpureRVars can also be reordered and parallelized if Func::allow_race_conditions() has been set. This is the escape hatch for when there are no hazards but the checks above failed to prove that (RDom::where can encode arbitrary facts about non-linear integer arithmetic, which is undecidable), or for when you don't actually care about the non-determinism introduced by data hazards (e.g. in the algorithm HOGWILD!). |
Definition at line 326 of file Schedule.h.
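The difference between a pure and an impure RVar can be observed on plain arrays. The sketch below is standalone C++ (no Halide types): it replays two of the update definitions from the table above in ascending and descending order of r.x and compares the results. All `sketch_` names, the buffer size, and the initial contents are assumptions made for illustration; agreement between two orders is evidence of purity, not a proof.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

using UpdateSketch = std::function<void(std::vector<int> &, int)>;

// Applies update(f, x) for every x in [0, extent), ascending or descending.
inline std::vector<int> sketch_run(std::vector<int> f, int extent, bool ascending,
                                   const UpdateSketch &update) {
    assert(extent >= 0);
    for (int i = 0; i < extent; i++) {
        const int x = ascending ? i : extent - 1 - i;
        update(f, x);
    }
    return f;
}

// True if both traversal orders produce the same buffer.
inline bool sketch_order_independent(const std::vector<int> &init, int extent,
                                     const UpdateSketch &update) {
    return sketch_run(init, extent, true, update) ==
           sketch_run(init, extent, false, update);
}

// f(r.x + 100) += f(r.x): reads [0, 100) and writes [100, 200), so no hazard.
inline void sketch_disjoint_update(std::vector<int> &f, int x) { f[x + 100] += f[x]; }

// f(r.x) += f(r.x + 1): one iteration reads a site that another one writes.
inline void sketch_hazard_update(std::vector<int> &f, int x) { f[x] += f[x + 1]; }

// Initial buffer holding 0, 1, 2, ..., 200.
inline std::vector<int> sketch_initial() {
    std::vector<int> f(201);
    std::iota(f.begin(), f.end(), 0);
    return f;
}
```

With an RDom of extent 100, the first update is order-independent while the second is not, matching the pure and impure examples in the table.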
Stmt Halide::Internal::add_image_checks | ( | const Stmt & | s, |
const std::vector< Function > & | outputs, | ||
const Target & | t, | ||
const std::vector< std::string > & | order, | ||
const std::map< std::string, Function > & | env, | ||
const FuncValueBounds & | fb, | ||
bool | will_inject_host_copies | ||
) |
Insert checks to make sure a statement doesn't read out of bounds on inputs or outputs, and that the inputs and outputs conform to the format required (e.g.
stride.0 must be 1).
Stmt Halide::Internal::add_parameter_checks | ( | const std::vector< Stmt > & | requirements, |
Stmt | s, | ||
const Target & | t | ||
) |
Insert checks to make sure that all referenced parameters meet their constraints.
Also injects any custom requirements provided by the user.
Attempt to rewrite unaligned loads from buffers which are known to be aligned to instead load aligned vectors that cover the original load, and then slice the original load out of the aligned vectors.
Types that are less than min_bytes_to_align in size are not rewritten. This is intended to make a distinction between data that will be accessed as a scalar and that which will be accessed as a vector.
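The rewrite described above can be pictured on a plain array: load the aligned vectors that cover the requested range, then slice the requested lanes out of them. This is a standalone sketch with hypothetical names. It assumes alignment is measured in elements and equals the vector width, and that the buffer is padded to a multiple of that width; the real pass operates on IR rather than on data.

```cpp
#include <cassert>
#include <vector>

// Emulates an unaligned load of `lanes` elements starting at `start` using only
// loads whose base index is a multiple of `lanes`.
inline std::vector<int> sketch_load_via_aligned(const std::vector<int> &buf,
                                                int start, int lanes) {
    assert(lanes > 0 && start >= 0);
    // Aligned base of the vector holding the first requested element.
    const int first = (start / lanes) * lanes;
    // Aligned base of the vector holding the last requested element.
    const int last = ((start + lanes - 1) / lanes) * lanes;

    std::vector<int> covering;
    for (int base = first; base <= last; base += lanes) {
        assert(base + lanes <= static_cast<int>(buf.size()));
        covering.insert(covering.end(), buf.begin() + base, buf.begin() + base + lanes);
    }

    // Slice the originally requested lanes out of the covering vectors.
    const int offset = start - first;
    return std::vector<int>(covering.begin() + offset,
                            covering.begin() + offset + lanes);
}
```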
Stmt Halide::Internal::allocation_bounds_inference | ( | Stmt | s, |
const std::map< std::string, Function > & | env, | ||
const std::map< std::pair< std::string, int >, Interval > & | func_bounds | ||
) |
Take a partially lowered statement with Realize nodes in terms of variables, and define values for those variables.
std::vector<ApplySplitResult> Halide::Internal::apply_split | ( | const Split & | split, |
bool | is_update, | ||
const std::string & | prefix, | ||
std::map< std::string, Expr > & | dim_extent_alignment | ||
) |
Given a Split schedule on a definition (init or update), return a list of predicates on the definition, substitutions that need to be applied to the definition (in ascending order of application), and let stmts that define the values of variables referred to by the predicates and substitutions (ordered from innermost to outermost let).
std::vector<std::pair<std::string, Expr> > Halide::Internal::compute_loop_bounds_after_split | ( | const Split & | split, |
const std::string & | prefix | ||
) |
Compute the loop bounds of the new dimensions resulting from applying the split schedules using the loop bounds of the old dimensions.
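As a concrete picture of what "loop bounds of the new dimensions" means, the sketch below splits a loop of a given extent by a constant factor and guards the tail. It is a standalone illustration with hypothetical names and concrete integers; it assumes a simple split with a guard on the last partial tile, whereas the function above works on symbolic Exprs and supports other tail strategies not modelled here.

```cpp
#include <cassert>
#include <vector>

struct SplitBoundsSketch {
    int outer_extent;  // number of tiles, rounded up
    int inner_extent;  // iterations per tile
};

// Extents of the outer and inner loops produced by splitting by `factor`.
inline SplitBoundsSketch sketch_split_bounds(int old_extent, int factor) {
    assert(old_extent >= 0 && factor > 0);
    return {(old_extent + factor - 1) / factor, factor};
}

// Visits the original iteration space through the split loops, skipping
// iterations of the last tile that fall outside the original bounds.
inline std::vector<int> sketch_visit(int old_min, int old_extent, int factor) {
    const SplitBoundsSketch b = sketch_split_bounds(old_extent, factor);
    std::vector<int> visited;
    for (int outer = 0; outer < b.outer_extent; outer++) {
        for (int inner = 0; inner < b.inner_extent; inner++) {
            const int x = old_min + outer * factor + inner;
            if (x < old_min + old_extent) {
                visited.push_back(x);
            }
        }
    }
    return visited;
}
```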
const std::vector<AssociativePattern>& Halide::Internal::get_ops_table | ( | const std::vector< Expr > & | exprs | ) |
AssociativeOp Halide::Internal::prove_associativity | ( | const std::string & | f, |
std::vector< Expr > | args, | ||
std::vector< Expr > | exprs | ||
) |
Given an update definition of a Func 'f', determine its equivalent associative binary/unary operator if there is any.
'is_associative' indicates if the operation was successfully proven as associative.
void Halide::Internal::associativity_test | ( | ) |
Stmt Halide::Internal::fork_async_producers | ( | Stmt | s, |
const std::map< std::string, Function > & | env | ||
) |
int Halide::Internal::string_to_int | ( | const std::string & | s | ) |
Return an int representation of 's'.
Throw an error on failure.
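A strict string-to-int conversion of this kind can be sketched with a string stream, rejecting both unparsable input and trailing characters. This is a standalone sketch, not the library's code: the name `sketch_string_to_int` is hypothetical, and it reports failure with `std::invalid_argument` as a stand-in for whatever error mechanism the real function uses.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Parses the whole of `s` as an int, or throws if that is not possible.
inline int sketch_string_to_int(const std::string &s) {
    std::istringstream iss(s);
    int value = 0;
    iss >> value;
    // Reject input where no number was parsed or where characters remain.
    if (iss.fail() || iss.get() != std::istringstream::traits_type::eof()) {
        throw std::invalid_argument("Unable to parse as int: " + s);
    }
    return value;
}
```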
Return the size of an interval.
Return an undefined expr if the interval is unbounded.
void Halide::Internal::disp_regions | ( | const std::map< std::string, Box > & | regions | ) |
Helper function to print the bounds of a region.
Definition Halide::Internal::get_stage_definition | ( | const Function & | f, |
int | stage_num | ||
) |
Return the corresponding definition of a function given the stage.
This will fail an assertion if the function is an extern function (an extern Func does not have a definition).
void Halide::Internal::combine_load_costs | ( | std::map< std::string, Expr > & | result, |
const std::map< std::string, Expr > & | partial | ||
) |
Add partial load costs to the corresponding function in the result costs.
DimBounds Halide::Internal::get_stage_bounds | ( | const Function & | f, |
int | stage_num, | ||
const DimBounds & | pure_bounds | ||
) |
Return the required bounds of an intermediate stage (f, stage_num) of function 'f' given the bounds of the pure dimensions.
std::vector<DimBounds> Halide::Internal::get_stage_bounds | ( | const Function & | f, |
const DimBounds & | pure_bounds | ||
) |
Return the required bounds for all the stages of the function 'f'.
Each entry in the returned vector corresponds to a stage.
Expr Halide::Internal::perform_inline | ( | Expr | e, |
const std::map< std::string, Function > & | env, | ||
const std::set< std::string > & | inlines = std::set< std::string >() , |
||
const std::vector< std::string > & | order = std::vector< std::string >() |
||
) |
Recursively inline all the functions in the set 'inlines' into the expression 'e' and return the resulting expression.
If 'order' is passed, inlining will be done in the reverse order of function realization to avoid extra inlining work.
std::set<std::string> Halide::Internal::get_parents | ( | Function | f, |
int | stage | ||
) |
Return all functions that are directly called by a function stage (f, stage).
V Halide::Internal::get_element | ( | const std::map< K, V > & | m, |
const K & | key | ||
) |
Return value of element within a map.
This will assert if the element is not in the map.
Definition at line 101 of file AutoScheduleUtils.h.
References internal_assert.
V& Halide::Internal::get_element | ( | std::map< K, V > & | m, |
const K & | key | ||
) |
Definition at line 108 of file AutoScheduleUtils.h.
References internal_assert.
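The two get_element overloads can be pictured with a small sketch. This is not the code from AutoScheduleUtils.h: the name get_element_sketch is hypothetical, and a plain assert stands in for Halide's internal_assert.

```cpp
#include <cassert>
#include <map>
#include <string>

// Read-only lookup: returns a copy of the value and asserts that the key exists.
// Sketch only: plain assert stands in for Halide's internal_assert.
template<typename K, typename V>
V get_element_sketch(const std::map<K, V> &m, const K &key) {
    const auto iter = m.find(key);
    assert(iter != m.end() && "key must be present in the map");
    return iter->second;
}

// Mutable lookup: returns a reference so the caller can update the entry in place.
template<typename K, typename V>
V &get_element_sketch(std::map<K, V> &m, const K &key) {
    const auto iter = m.find(key);
    assert(iter != m.end() && "key must be present in the map");
    return iter->second;
}
```

Unlike std::map::operator[], a lookup written this way never inserts a default-constructed entry for a missing key, which is presumably why a checked helper is useful here.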
void Halide::Internal::propagate_estimate_test | ( | ) |
const FuncValueBounds& Halide::Internal::empty_func_value_bounds | ( | ) |
Interval Halide::Internal::bounds_of_expr_in_scope | ( | const Expr & | expr, |
const Scope< Interval > & | scope, | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() , |
||
bool | const_bound = false |
||
) |
Given an expression in some variables, and a map from those variables to their bounds (in the form of (minimum possible value, maximum possible value)), compute two expressions that give the minimum possible value and the maximum possible value of this expression.
Max or min may be undefined expressions if the value is not bounded above or below. If the expression is a vector, also takes the bounds across the vector lanes and returns a scalar result.
This is for tasks such as deducing the region of a buffer loaded by a chunk of code.
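As a rough illustration of the idea behind bounds_of_expr_in_scope, the sketch below propagates (min, max) pairs through addition and multiplication. It is a toy over plain ints: ToyInterval, toy_interval_add and toy_interval_mul are hypothetical names, whereas Halide's Interval holds Exprs and may be unbounded on either side.

```cpp
#include <algorithm>
#include <cassert>

// Toy closed interval over ints (hypothetical; not Halide's Interval).
struct ToyInterval {
    int min;
    int max;
};

// Bounds of a + b given bounds of a and b.
ToyInterval toy_interval_add(ToyInterval a, ToyInterval b) {
    assert(a.min <= a.max && b.min <= b.max);
    return {a.min + b.min, a.max + b.max};
}

// Bounds of a * b: the extremes lie at one of the four corner products.
ToyInterval toy_interval_mul(ToyInterval a, ToyInterval b) {
    assert(a.min <= a.max && b.min <= b.max);
    const int corners[4] = {a.min * b.min, a.min * b.max,
                            a.max * b.min, a.max * b.max};
    return {*std::min_element(corners, corners + 4),
            *std::max_element(corners, corners + 4)};
}
```

With x in [0, 10] and y in [-2, 3], evaluating x*y + x this way gives [-20, 40]. The true range is [-10, 40], so the toy result is a conservative enclosure rather than a tight bound.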
Expr Halide::Internal::find_constant_bound | ( | const Expr & | e, |
Direction | d, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() |
||
) |
Find bounds for a varying expression that are either constants or +/-inf.
Test if box a could possibly overlap box b.
The intersection of two boxes.
Test if box a provably contains box b.
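The three box predicates above can be sketched on boxes with constant integer bounds. Everything here (ToyRange, ToyBox, the toy_ functions) is a hypothetical stand-in: Halide's Box stores symbolic Exprs, which is why its documentation distinguishes "could possibly" from "provably"; with constants the two coincide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyRange {
    int min;  // inclusive
    int max;  // inclusive
};
using ToyBox = std::vector<ToyRange>;  // one range per dimension

// True if the boxes share at least one point in every dimension.
bool toy_boxes_overlap(const ToyBox &a, const ToyBox &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (a[i].max < b[i].min || b[i].max < a[i].min) return false;
    }
    return true;
}

// Per-dimension intersection; a dimension with min > max means the result is empty.
ToyBox toy_box_intersection(const ToyBox &a, const ToyBox &b) {
    assert(a.size() == b.size());
    ToyBox result;
    for (std::size_t i = 0; i < a.size(); i++) {
        result.push_back({std::max(a[i].min, b[i].min), std::min(a[i].max, b[i].max)});
    }
    return result;
}

// True if every dimension of b lies inside the matching dimension of a.
bool toy_box_contains(const ToyBox &a, const ToyBox &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (b[i].min < a[i].min || b[i].max > a[i].max) return false;
    }
    return true;
}
```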
std::map<std::string, Box> Halide::Internal::boxes_required | ( | const Expr & | e, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
Compute rectangular domains large enough to cover all the 'Call's to each function that occurs within a given statement or expression.
This is useful for figuring out what regions of things to evaluate.
std::map<std::string, Box> Halide::Internal::boxes_required | ( | Stmt | s, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
std::map<std::string, Box> Halide::Internal::boxes_provided | ( | const Expr & | e, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
Compute rectangular domains large enough to cover all the 'Provides's to each function that occurs within a given statement or expression.
std::map<std::string, Box> Halide::Internal::boxes_provided | ( | Stmt | s, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
std::map<std::string, Box> Halide::Internal::boxes_touched | ( | const Expr & | e, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
Compute rectangular domains large enough to cover all the 'Call's and 'Provides's to each function that occurs within a given statement or expression.
std::map<std::string, Box> Halide::Internal::boxes_touched | ( | Stmt | s, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
Box Halide::Internal::box_required | ( | const Expr & | e, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
Variants of the above that are only concerned with a single function.
Box Halide::Internal::box_required | ( | Stmt | s, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
Box Halide::Internal::box_provided | ( | const Expr & | e, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
Box Halide::Internal::box_provided | ( | Stmt | s, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
Box Halide::Internal::box_touched | ( | const Expr & | e, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
Box Halide::Internal::box_touched | ( | Stmt | s, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() , |
||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() |
||
) |
FuncValueBounds Halide::Internal::compute_function_value_bounds | ( | const std::vector< std::string > & | order, |
const std::map< std::string, Function > & | env | ||
) |
Compute the maximum and minimum possible value for each function in an environment.
void Halide::Internal::bounds_test | ( | ) |
Stmt Halide::Internal::bounds_inference | ( | Stmt | , |
const std::vector< Function > & | outputs, | ||
const std::vector< std::string > & | realization_order, | ||
const std::vector< std::vector< std::string >> & | fused_groups, | ||
const std::map< std::string, Function > & | environment, | ||
const std::map< std::pair< std::string, int >, Interval > & | func_bounds, | ||
const Target & | target | ||
) |
Take a partially lowered statement that includes symbolic representations of the bounds over which things should be realized, and inject expressions defining those bounds.
Referenced by Halide::Buffer< void >::operator()().
std::string Halide::Internal::get_name_from_end_of_parameter_pack | ( | T && | ) |
Definition at line 44 of file Buffer.h.
Referenced by get_name_from_end_of_parameter_pack().
std::string Halide::Internal::get_name_from_end_of_parameter_pack | ( | First | first, |
Second | second, | ||
Args &&... | rest | ||
) |
Definition at line 59 of file Buffer.h.
References get_name_from_end_of_parameter_pack().
Definition at line 63 of file Buffer.h.
Referenced by get_shape_from_start_of_parameter_pack(), and get_shape_from_start_of_parameter_pack_helper().
void Halide::Internal::get_shape_from_start_of_parameter_pack_helper | ( | std::vector< int > & | result, |
int | x, | ||
Args &&... | rest | ||
) |
Definition at line 70 of file Buffer.h.
References get_shape_from_start_of_parameter_pack_helper().
std::vector<int> Halide::Internal::get_shape_from_start_of_parameter_pack | ( | Args &&... | args | ) |
Definition at line 76 of file Buffer.h.
References get_shape_from_start_of_parameter_pack_helper().
void Halide::Internal::buffer_type_name_non_const | ( | std::ostream & | s | ) |
std::string Halide::Internal::buffer_type_name | ( | ) |
Canonicalize GPU var names into some pre-determined block/thread names (i.e. __block_id_x, __thread_id_x, etc.). The x/y/z/w order is determined by the nesting order: innermost is assigned to x and so on.
Stmt Halide::Internal::clamp_unsafe_accesses | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env, | ||
FuncValueBounds & | func_bounds | ||
) |
Inject clamps around func calls h(...) when all the following conditions hold:
std::unique_ptr<CodeGen_GPU_Dev> Halide::Internal::new_CodeGen_D3D12Compute_Dev | ( | const Target & | target | ) |
llvm::Type* Halide::Internal::get_vector_element_type | ( | llvm::Type * | ) |
Get the scalar type of an llvm vector type.
Returns the argument if it's not a vector type.
bool Halide::Internal::function_takes_user_context | ( | const std::string & | name | ) |
Which built-in functions require a user-context first argument?
bool Halide::Internal::can_allocation_fit_on_stack | ( | int64_t | size | ) |
Given a size (in bytes), return True if the allocation size can fit on the stack; otherwise, return False.
This routine asserts if size is non-positive.
std::pair<Expr, Expr> Halide::Internal::long_div_mod_round_to_zero | ( | const Expr & | a, |
const Expr & | b, | ||
const uint64_t * | max_abs = nullptr |
||
) |
Does a {div/mod}_round_to_zero using binary long division for int/uint.
max_abs is the maximum absolute value of (a/b). Returns the pair {div_round_to_zero, mod_round_to_zero}.
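For readers unfamiliar with the technique, here is a minimal sketch of binary (restoring) long division on scalar unsigned values. It only illustrates the algorithm named above: Halide's function builds Exprs rather than computing numbers, and the name toy_long_div_mod, the fixed 32-bit width, and the omission of signed operands and of the max_abs hint are all simplifications.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Returns {quotient, remainder} of a / b using shift-and-subtract long division.
// For unsigned operands, rounding toward zero and flooring are the same thing.
std::pair<uint32_t, uint32_t> toy_long_div_mod(uint32_t a, uint32_t b) {
    assert(b != 0);
    uint32_t quotient = 0;
    uint64_t remainder = 0;  // wider than the operands so the shift cannot overflow
    for (int bit = 31; bit >= 0; bit--) {
        // Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((a >> bit) & 1u);
        if (remainder >= b) {
            remainder -= b;
            quotient |= (uint32_t{1} << bit);
        }
    }
    return {quotient, static_cast<uint32_t>(remainder)};
}
```

Each iteration produces one quotient bit, so the loop length is the bit width of the operands.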
Expr Halide::Internal::lower_int_uint_div | ( | const Expr & | a, |
const Expr & | b, | ||
bool | round_to_zero = false |
||
) |
Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.
Can introduce mulhi_shr and sorted_avg intrinsics as well as those from the lower_euclidean_ operation – div_round_to_zero or mod_round_to_zero.
Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.
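To make the relationship concrete, the sketch below expresses Euclidean division and modulus (remainder r with 0 <= r < |b|) in terms of round-to-zero operations, using the fact that C++ '/' and '%' on ints round toward zero. The function names are hypothetical, division by zero is simply asserted away, and overflow corner cases such as INT_MIN are ignored; Halide's lowering works on Exprs and may differ in all of these respects.

```cpp
#include <cassert>

// Euclidean remainder: always in [0, |b|).
int toy_euclidean_mod(int a, int b) {
    assert(b != 0);
    int r = a % b;  // round-to-zero remainder; takes the sign of a
    if (r < 0) {
        r += (b < 0 ? -b : b);
    }
    return r;
}

// Euclidean quotient: chosen so that a == q * b + toy_euclidean_mod(a, b).
int toy_euclidean_div(int a, int b) {
    assert(b != 0);
    int q = a / b;  // round-to-zero quotient
    if (a % b < 0) {
        q += (b < 0 ? 1 : -1);  // step the quotient so the remainder becomes non-negative
    }
    return q;
}
```

For example, -7 divided by 2 rounds to -3 with remainder -1 under round-to-zero, but gives quotient -4 and remainder 1 under the Euclidean convention.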
Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.
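A scalar sketch of the rewrite, under the assumption that a left shift by a negative amount means a right shift by its magnitude. The name toy_shift_left_signed and the restriction to 32-bit unsigned values with |amount| < 32 are choices made for this example only; the actual pass produces an equivalent Expr rather than branching at run time.

```cpp
#include <cassert>
#include <cstdint>

// Left shift by a signed amount, written using only shifts by unsigned amounts.
uint32_t toy_shift_left_signed(uint32_t value, int amount) {
    assert(amount > -32 && amount < 32);
    if (amount >= 0) {
        return value << static_cast<unsigned>(amount);
    }
    // Negative amount: shift the other way by the magnitude.
    return value >> static_cast<unsigned>(-amount);
}
```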
Reduce bit extraction and concatenation to bit ops.
A vectorizable implementation of Halide::round that doesn't depend on any standard library being present.
void Halide::Internal::get_target_options | ( | const llvm::Module & | module, |
llvm::TargetOptions & | options | ||
) |
Given an llvm::Module, set llvm::TargetOptions information.
void Halide::Internal::clone_target_options | ( | const llvm::Module & | from, |
llvm::Module & | to | ||
) |
Given two llvm::Modules, clone target options from one to the other.
std::unique_ptr<llvm::TargetMachine> Halide::Internal::make_target_machine | ( | const llvm::Module & | module | ) |
Given an llvm::Module, get or create an llvm::TargetMachine.
void Halide::Internal::set_function_attributes_from_halide_target_options | ( | llvm::Function & | ) |
void Halide::Internal::embed_bitcode | ( | llvm::Module * | M, |
const std::string & | halide_command | ||
) |
Save a copy of the llvm IR currently represented by the module as data in the __LLVM,__bitcode section.
Emulates clang's -fembed-bitcode flag and is useful to satisfy Apple's bitcode inclusion requirements.
std::unique_ptr<CodeGen_GPU_Dev> Halide::Internal::new_CodeGen_Metal_Dev | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_GPU_Dev> Halide::Internal::new_CodeGen_OpenCL_Dev | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_GPU_Dev> Halide::Internal::new_CodeGen_OpenGLCompute_Dev | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_GPU_Dev> Halide::Internal::new_CodeGen_PTX_Dev | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_Posix> Halide::Internal::new_CodeGen_ARM | ( | const Target & | target | ) |
Construct CodeGen object for a variety of targets.
std::unique_ptr<CodeGen_Posix> Halide::Internal::new_CodeGen_Hexagon | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_Posix> Halide::Internal::new_CodeGen_PowerPC | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_Posix> Halide::Internal::new_CodeGen_RISCV | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_Posix> Halide::Internal::new_CodeGen_X86 | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_Posix> Halide::Internal::new_CodeGen_WebAssembly | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_GPU_Dev> Halide::Internal::new_CodeGen_Vulkan_Dev | ( | const Target & | target | ) |
std::unique_ptr<CodeGen_GPU_Dev> Halide::Internal::new_CodeGen_WebGPU_Dev | ( | const Target & | target | ) |
std::unique_ptr<CompilerLogger> Halide::Internal::set_compiler_logger | ( | std::unique_ptr< CompilerLogger > | compiler_logger | ) |
Set the active CompilerLogger object, replacing any existing one.
It is legal to pass in a nullptr (which means "don't do any compiler logging"). Returns the previous CompilerLogger (if any).
CompilerLogger* Halide::Internal::get_compiler_logger | ( | ) |
Return the currently active CompilerLogger object.
If set_compiler_logger() has never been called, a nullptr implementation will be returned. Do not save the pointer returned! It is intended to be used for immediate calls only.
std::string Halide::Internal::cplusplus_function_mangled_name | ( | const std::string & | name, |
const std::vector< std::string > & | namespaces, | ||
Type | return_type, | ||
const std::vector< ExternFuncArgument > & | args, | ||
const Target & | target | ||
) |
Return the mangled C++ name for a function.
The target parameter is used to decide on the C++ ABI/mangling style to use.
void Halide::Internal::cplusplus_mangle_test | ( | ) |
Replace each common sub-expression in the argument with a variable, and wrap the resulting expr in a let statement giving a value to that variable.
This is important to do within Halide (instead of punting to llvm), because exprs that come in from the front-end are small when considered as a graph, but combinatorially large when considered as a tree. For an example of such a case, see test/code_explosion.cpp.
The last parameter determines whether all common subexpressions are lifted, or only those that the simplifier would not substitute back in (e.g. addition of a constant).
Do common-subexpression-elimination on each expression in a statement.
Does not introduce let statements.
void Halide::Internal::cse_test | ( | ) |
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const Stmt & | |||
) |
Emit a halide statement on an output stream (such as std::cout) in a human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | , |
const LoweredFunc & | |||
) |
Emit a halide LoweredFunc in a human readable format.
void Halide::Internal::debug_arguments | ( | LoweredFunc * | func, |
const Target & | t | ||
) |
Injects debug prints in a LoweredFunc that describe the target and arguments.
Mutates the given func.
Stmt Halide::Internal::debug_to_file | ( | Stmt | s, |
const std::vector< Function > & | outputs, | ||
const std::map< std::string, Function > & | env | ||
) |
Takes a statement with Realize nodes still unlowered.
If the corresponding functions have a debug_file set, then inject code that will dump the contents of those functions to a file after the realization.
Extract the odd-numbered lanes in a vector.
Extract the even-numbered lanes in a vector.
Extract the nth lane of a vector.
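On a plain std::vector, extracting the even or odd lanes amounts to a strided copy. The helpers below are hypothetical illustrations of that idea only; the Halide functions operate on vector-typed Exprs, not on containers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy every 'stride'-th lane of v, starting at lane 'start'.
std::vector<int> toy_extract_lanes(const std::vector<int> &v,
                                   std::size_t start, std::size_t stride) {
    assert(stride > 0);
    std::vector<int> out;
    for (std::size_t i = start; i < v.size(); i += stride) {
        out.push_back(v[i]);
    }
    return out;
}

// Lanes 0, 2, 4, ...
std::vector<int> toy_extract_even_lanes(const std::vector<int> &v) {
    return toy_extract_lanes(v, 0, 2);
}

// Lanes 1, 3, 5, ...
std::vector<int> toy_extract_odd_lanes(const std::vector<int> &v) {
    return toy_extract_lanes(v, 1, 2);
}
```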
Look through a statement for expressions of the form select(ramp % 2 == 0, a, b) and replace them with calls to an interleave intrinsic.
void Halide::Internal::deinterleave_vector_test | ( | ) |
Remove all let definitions of expr.
std::vector<int> Halide::Internal::gather_variables | ( | const Expr & | expr, |
const std::vector< std::string > & | filter | ||
) |
Return the indices of the variables in the filter that expr depends on.
std::vector<int> Halide::Internal::gather_variables | ( | const Expr & | expr, |
const std::vector< Var > & | filter | ||
) |
std::map<std::string, ReductionVariableInfo> Halide::Internal::gather_rvariables | ( | const Expr & | expr | ) |
std::map<std::string, ReductionVariableInfo> Halide::Internal::gather_rvariables | ( | const Tuple & | tuple | ) |
Expr Halide::Internal::add_let_expression | ( | const Expr & | expr, |
const std::map< std::string, Expr > & | let_var_mapping, | ||
const std::vector< std::string > & | let_variables | ||
) |
Add necessary let expressions to expr.
Topologically sort the expression graph expressed by expr.
std::map<std::string, Box> Halide::Internal::inference_bounds | ( | const std::vector< Func > & | funcs, |
const std::vector< Box > & | output_bounds | ||
) |
Compute the bounds of funcs.
The bounds represent a conservative region that is used by the "consumers" of the function, excluding the function itself.
std::map<std::string, Box> Halide::Internal::inference_bounds | ( | const Func & | func, |
const Box & | output_bounds | ||
) |
Return true if bounds0 and bounds1 represent the same bounds.
Referenced by Halide::Internal::AssociativePattern::operator==(), and Halide::Internal::AssociativeOp::Replacement::operator==().
std::vector<std::string> Halide::Internal::vars_to_strings | ( | const std::vector< Var > & | vars | ) |
Return a list of variable names.
ReductionDomain Halide::Internal::extract_rdom | ( | const Expr & | expr | ) |
Return the reduction domain used by expr.
std::pair<bool, Expr> Halide::Internal::solve_inverse | ( | Expr | expr, |
const std::string & | new_var, | ||
const std::string & | var | ||
) |
Given expr of the form new_var == f(var), solve for var == g(new_var). If multiple values of new_var correspond to the same var, introduce an RDom.
std::map<std::string, BufferInfo> Halide::Internal::find_buffer_param_calls | ( | const Func & | func | ) |
std::set<std::string> Halide::Internal::find_implicit_variables | ( | const Expr & | expr | ) |
Find all implicit variables in expr.
Expr Halide::Internal::substitute_rdom_predicate | ( | const std::string & | name, |
const Expr & | replacement, | ||
const Expr & | expr | ||
) |
Substitute the variable.
Also replace all occurrences in rdom.where() predicates.
bool Halide::Internal::is_calling_function | ( | const std::string & | func_name, |
const Expr & | expr, | ||
const std::map< std::string, Expr > & | let_var_mapping | ||
) |
Return true if expr contains a call to func_name.
bool Halide::Internal::is_calling_function | ( | const Expr & | expr, |
const std::map< std::string, Expr > & | let_var_mapping | ||
) |
Return true if expr depends on any function or buffer.
Expr Halide::Internal::make_device_interface_call | ( | DeviceAPI | device_api, |
MemoryType | memory_type = MemoryType::Auto |
||
) |
Get an Expr which evaluates to the device interface for the given device api at runtime.
Take a statement with allocations and inject markers (of the form of calls to "mark buffer dead") after the last use of each allocation.
Targets may use this to free buffers earlier than the close of their Allocate node.
Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.
For example, instead of select(i1x8, u16x8, u16x8), the target would prefer to see select(u16x8, u16x8, u16x8), where the first argument is a vector of integers representing a mask. This pass converts vectors of bools to vectors of integers to meet this requirement. This is done by injecting intrinsics to convert bools to architecture-specific masks, and using a select_mask intrinsic instead of a Select node. This also converts any intrinsics that operate on vectorized conditions to a *_mask equivalent (if_then_else, require). Because the masks are architecture specific, they may not be stored or loaded. On Stores, the masks are converted to UInt(8) with a value of 0 or 1, which is our canonical in-memory representation of a bool.
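The select(u16x8, u16x8, u16x8) form can be illustrated lane by lane. This sketch assumes an all-ones/all-zeros mask encoding, which is only one possible architecture-specific choice; the names U16x8, toy_bool_to_mask and toy_select_mask are hypothetical and do not correspond to Halide intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using U16x8 = std::array<uint16_t, 8>;

// Widen each bool lane to a mask lane of the same width as the data (assumed encoding).
U16x8 toy_bool_to_mask(const std::array<bool, 8> &cond) {
    U16x8 mask{};
    for (std::size_t i = 0; i < mask.size(); i++) {
        mask[i] = cond[i] ? 0xFFFF : 0x0000;
    }
    return mask;
}

// Lane-wise select driven by an integer mask instead of a vector of bools.
U16x8 toy_select_mask(const U16x8 &mask, const U16x8 &a, const U16x8 &b) {
    U16x8 out{};
    for (std::size_t i = 0; i < out.size(); i++) {
        assert(mask[i] == 0x0000 || mask[i] == 0xFFFF);
        out[i] = static_cast<uint16_t>((mask[i] & a[i]) | (~mask[i] & b[i]));
    }
    return out;
}
```

Because the mask has the same lane width as the operands, the select reduces to bitwise operations on whole vectors, which is the shape such targets prefer.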
If a type is a boolean vector, find the type that it has been changed to by eliminate_bool_vectors.
Definition at line 32 of file EliminateBoolVectors.h.
References Halide::Type::bits(), Halide::Type::Int, Halide::Type::is_vector(), Halide::Type::with_bits(), and Halide::Type::with_code().
bool Halide::Internal::is_float16_transcendental | ( | const Call * | ) |
Check if a call is a float16 transcendental (e.g. sqrt_f16).
Implement a float16 transcendental using the float32 equivalent.
Cast to/from float and bfloat using bitwise math.
HALIDE_EXPORT_SYMBOL void Halide::Internal::unhandled_exception_handler | ( | ) |
bool Halide::Internal::is_unordered_parallel | ( | ForType | for_type | ) |
Check if for_type executes for loop iterations in parallel and unordered.
Referenced by Halide::Internal::Dim::is_unordered_parallel(), and Halide::Internal::For::is_unordered_parallel().
bool Halide::Internal::is_parallel | ( | ForType | for_type | ) |
Returns true if for_type executes for loop iterations in parallel.
Referenced by Halide::Internal::Dim::is_parallel(), and Halide::Internal::For::is_parallel().
Test if a statement or expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 101 of file ExprUsesVar.h.
References Halide::Internal::ExprUsesVars< T >::result.
Referenced by expr_uses_vars(), and stmt_uses_vars().
Test if a statement or expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 113 of file ExprUsesVar.h.
References Halide::Internal::Scope< T >::push().
Referenced by expr_uses_var(), and stmt_uses_var().
Test if an expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 124 of file ExprUsesVar.h.
References stmt_or_expr_uses_var().
Test if a statement references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 133 of file ExprUsesVar.h.
References Halide::stmt, and stmt_or_expr_uses_var().
Test if an expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 143 of file ExprUsesVar.h.
References stmt_or_expr_uses_vars().
Test if a statement references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 153 of file ExprUsesVar.h.
References Halide::stmt, and stmt_or_expr_uses_vars().
Rewrite any AMX tile operations that have been stored in the AMXTile memory type as intrinsic calls, to be used in the X86 backend.
std::map<std::string, Function> Halide::Internal::build_environment | ( | const std::vector< Function > & | funcs | ) |
Find all Functions transitively referenced by any Function in funcs and return a map of them.
Implement intrinsics using non-intrinsic equivalents.
Expr Halide::Internal::lower_rounding_mul_shift_right | ( | const Expr & | a, |
const Expr & | b, | ||
const Expr & | q | ||
) |
Replace one of the above ops with equivalent arithmetic.
Replace common arithmetic patterns with intrinsics.
Take a statement/expression and replace nested ramps and broadcasts.
Definition at line 2482 of file Func.h.
References user_assert.
Referenced by check_types(), Halide::evaluate(), and Halide::evaluate_may_gpu().
Definition at line 2491 of file Func.h.
References check_types().
Definition at line 2497 of file Func.h.
Referenced by assign_results(), Halide::evaluate(), and Halide::evaluate_may_gpu().
Definition at line 2503 of file Func.h.
References assign_results().
Definition at line 2550 of file Func.h.
References Halide::get_jit_target_from_environment(), Halide::Func::gpu_single_thread(), Halide::Target::has_feature(), Halide::Target::has_gpu_feature(), Halide::Func::hexagon(), and Halide::Target::HVX.
Referenced by Halide::evaluate_may_gpu().
std::pair<std::vector<Function>, std::map<std::string, Function> > Halide::Internal::deep_copy | ( | const std::vector< Function > & | outputs, |
const std::map< std::string, Function > & | env | ||
) |
Deep copy an entire Function DAG.
Rewrite all GPU loops to have a min of zero.
Converts Halide's GPGPU IR to the OpenCL/CUDA/Metal model.
Within every loop over gpu block indices, fuse the inner loops over thread indices into a single loop (with predication to turn off threads). Push if conditions between GPU blocks to the innermost GPU threads. Also injects synchronization points as needed, and hoists shared allocations at the block level out into a single shared memory array, and heap allocations into a slice of a global pool allocated outside the kernel.
On every store of a floating point value, mask off the least-significant-bit of the mantissa.
We've found that whether or not this dramatically changes the output of a pipeline correlates very well with whether or not a pipeline will produce very different outputs on different architectures (e.g. with and without FMA). It's also a useful way to detect bad tests, such as those that expect exact floating point equality across platforms.
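The masking step itself is simple to show on a single 32-bit float. This is a scalar sketch assuming IEEE 754 single precision; toy_strip_mantissa_lsb and toy_store are hypothetical names, and the real pass rewrites Store nodes in the IR rather than calling a helper.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Clear the least-significant mantissa bit of a 32-bit float.
float toy_strip_mantissa_lsb(float x) {
    static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");
    uint32_t bits = 0;
    std::memcpy(&bits, &x, sizeof(bits));  // reinterpret without undefined behaviour
    bits &= ~uint32_t{1};                  // bit 0 is the lowest mantissa bit
    float result = 0.0f;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}

// Model of "on every store": the value is masked just before it is written.
void toy_store(float *dst, float value) {
    assert(dst != nullptr);
    *dst = toy_strip_mantissa_lsb(value);
}
```

Two results that differ only in the last bit, as can happen with and without FMA, become identical after this masking about half the time; a pipeline whose output still changes dramatically is amplifying small rounding differences.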
void Halide::Internal::generator_test | ( | ) |
HALIDE_NO_USER_CODE_INLINE std::string Halide::Internal::enum_to_string | ( | const std::map< std::string, T > & | enum_map, |
const T & | t | ||
) |
Definition at line 298 of file Generator.h.
References user_error.
Referenced by Halide::Internal::GeneratorParam_Enum< T >::get_default_value(), and halide_type_to_enum_string().
T Halide::Internal::enum_from_string | ( | const std::map< std::string, T > & | enum_map, |
const std::string & | s | ||
) |
Definition at line 309 of file Generator.h.
References user_assert.
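The pair enum_to_string / enum_from_string is a two-way lookup over one name-to-value map. The sketch below mirrors the signatures listed above, but the bodies are a guess at the general shape and not the code in Generator.h; plain assert stands in for user_error and user_assert, and the toy_ names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Value -> name: linear scan, since the map is keyed by name.
template<typename T>
std::string toy_enum_to_string(const std::map<std::string, T> &enum_map, const T &t) {
    for (const auto &entry : enum_map) {
        if (entry.second == t) {
            return entry.first;
        }
    }
    assert(false && "enumeration value not found");
    return std::string();
}

// Name -> value: direct map lookup.
template<typename T>
T toy_enum_from_string(const std::map<std::string, T> &enum_map, const std::string &s) {
    const auto it = enum_map.find(s);
    assert(it != enum_map.end() && "enumeration name not found");
    return it->second;
}
```

Keeping a single map as the source of truth means the two directions cannot drift apart, at the cost of a linear scan when converting a value back to its name.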
const std::map<std::string, Halide::Type>& Halide::Internal::get_halide_type_enum_map | ( | ) |
Referenced by halide_type_to_enum_string().
Definition at line 316 of file Generator.h.
References enum_to_string(), and get_halide_type_enum_map().
std::string Halide::Internal::halide_type_to_c_source | ( | const Type & | t | ) |
std::string Halide::Internal::halide_type_to_c_type | ( | const Type & | t | ) |
const GeneratorFactoryProvider& Halide::Internal::get_registered_generators | ( | ) |
Return a GeneratorFactoryProvider that knows about all the currently-registered C++ Generators.
int Halide::Internal::generate_filter_main | ( | int | argc, |
char ** | argv | ||
) |
generate_filter_main() is a convenient wrapper for GeneratorRegistry::create() + compile_to_files(); it can be trivially wrapped by a "real" main() to produce a command-line utility for ahead-of-time filter compilation.
int Halide::Internal::generate_filter_main | ( | int | argc, |
char ** | argv, | ||
const GeneratorFactoryProvider & | generator_factory_provider | ||
) |
This overload of generate_filter_main lets you provide your own provider for how to enumerate and/or create the generators based on registration name; this is useful if you want to re-use the 'main' logic but avoid the global Generator registry (e.g. for bindings in languages other than C++).
T Halide::Internal::parse_scalar | ( | const std::string & | value | ) |
Definition at line 2873 of file Generator.h.
References user_assert.
std::vector<Type> Halide::Internal::parse_halide_type_list | ( | const std::string & | types | ) |
void Halide::Internal::execute_generator | ( | const ExecuteGeneratorArgs & | args | ) |
Execute a Generator for AOT compilation. This provides the implementation of the command-line Generator interface generate_filter_main(), but with a structured API that is more suitable for calling directly from code (vs command line).
Pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module.
Buffer<uint8_t> Halide::Internal::compile_module_to_hexagon_shared_object | ( | const Module & | device_code | ) |
Replace indirect and other loads with simple loads + vlut calls.
Hexagon deinterleaves when performing widening operations, and interleaves when performing narrowing operations.
This pass rewrites widenings/narrowings to be explicit in the IR, and attempts to simplify away most of the interleaving/deinterleaving.
Generate deinterleave or interleave operations, operating on groups of vectors at a time.
bool Halide::Internal::is_native_deinterleave | ( | const Expr & | x | ) |
bool Halide::Internal::is_native_interleave | ( | const Expr & | x | ) |
std::string Halide::Internal::type_suffix | ( | Type | type, |
bool | signed_variants = true |
||
) |
std::string Halide::Internal::type_suffix | ( | const Expr & | a, |
bool | signed_variants = true |
||
) |
std::string Halide::Internal::type_suffix | ( | const Expr & | a, |
const Expr & | b, | ||
bool | signed_variants = true |
||
) |
std::string Halide::Internal::type_suffix | ( | const std::vector< Expr > & | ops, |
bool | signed_variants = true |
||
) |
std::vector<InferredArgument> Halide::Internal::infer_arguments | ( | const Stmt & | body, |
const std::vector< Function > & | outputs | ||
) |
Stmt Halide::Internal::call_extern_and_assert | ( | const std::string & | name, |
const std::vector< Expr > & | args | ||
) |
A helper function to call an extern function, and assert that it returns 0.
Inject calls to halide_device_malloc, halide_copy_to_device, and halide_copy_to_host as needed.
Inline a single named function, which must be pure.
For a pure function to be inlined, it must not have any specializations (i.e. it can only have one values definition).
void Halide::Internal::validate_schedule_inlined_function | ( | Function | f | ) |
Check if the schedule of an inlined function is legal, throwing an error if it is not.
Because in this header we don't yet know how client classes store their RefCount (and we don't want to depend on the declarations of the client classes), any class that you want to hold onto via one of these must provide implementations of ref_count and destroy, which we forward-declare here.
E.g. if you want to use IntrusivePtr<MyClass>, then you should define something like this in MyClass.cpp (assuming MyClass has a field: mutable RefCount ref_count):
template<> RefCount &ref_count<MyClass>(const MyClass *c) noexcept { return c->ref_count; }
template<> void destroy<MyClass>(const MyClass *c) { delete c; }
void Halide::Internal::destroy | ( | const T * | t | ) |
Compare IR nodes for equality of value.
Traverses entire IR tree. For equality of reference, use Expr::same_as. If you're comparing non-CSE'd Exprs, use graph_equal, which is safe for nasty graphs of IR nodes.
Order unsanitized IRNodes for use in a map key.
void Halide::Internal::ir_equality_test | ( | ) |
bool Halide::Internal::expr_match | ( | const Expr & | pattern, |
const Expr & | expr, | ||
std::vector< Expr > & | result | ||
) |
Does the first expression have the same structure as the second? Variables in the first expression with the name "*" are interpreted as wildcards, and their matching equivalent in the second expression is placed in the vector given as the third argument.
Wildcards require the types to match. For the type bits and width, a 0 indicates "match anything". So an Int(8, 0) will match 8-bit integer vectors of any width (including scalars), and a UInt(0, 0) will match any unsigned integer type.
For example:
should return true, and set result[0] to 3 and result[1] to 2*k.
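The wildcard behaviour can be illustrated with a small, self-contained model. The sketch below does not use Halide's `Expr`. `Node`, `toy_const`, `toy_var`, `toy_add`, `toy_mul`, and `toy_expr_match` are hypothetical names, and the type-matching rules for wildcards described above are deliberately omitted.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical toy IR: constants, named variables, and binary add/mul.
struct Node;
using NodePtr = std::shared_ptr<const Node>;
struct Node {
    enum Kind { Const, Var, Add, Mul } kind;
    int value;         // used by Const
    std::string name;  // used by Var
    NodePtr a, b;      // used by Add and Mul
};

NodePtr toy_const(int v) { return std::make_shared<Node>(Node{Node::Const, v, "", nullptr, nullptr}); }
NodePtr toy_var(const std::string &n) { return std::make_shared<Node>(Node{Node::Var, 0, n, nullptr, nullptr}); }
NodePtr toy_add(NodePtr a, NodePtr b) { return std::make_shared<Node>(Node{Node::Add, 0, "", a, b}); }
NodePtr toy_mul(NodePtr a, NodePtr b) { return std::make_shared<Node>(Node{Node::Mul, 0, "", a, b}); }

// Recursive structural comparison; a pattern variable named "*" matches anything
// and records the matched subtree in left-to-right order.
bool toy_match_impl(const NodePtr &pattern, const NodePtr &expr, std::vector<NodePtr> &result) {
    if (pattern->kind == Node::Var && pattern->name == "*") {
        result.push_back(expr);
        return true;
    }
    if (pattern->kind != expr->kind) return false;
    switch (pattern->kind) {
    case Node::Const:
        return pattern->value == expr->value;
    case Node::Var:
        return pattern->name == expr->name;
    case Node::Add:
    case Node::Mul:
        return toy_match_impl(pattern->a, expr->a, result) &&
               toy_match_impl(pattern->b, expr->b, result);
    }
    return false;
}

// Entry point: result holds the wildcard captures on success and is empty on failure.
bool toy_expr_match(const NodePtr &pattern, const NodePtr &expr, std::vector<NodePtr> &result) {
    result.clear();
    if (toy_match_impl(pattern, expr, result)) return true;
    result.clear();
    return false;
}
```

Matching the pattern `* + *` against `3 + (2*k)` in this model captures the constant `3` first and the product `2*k` second, mirroring the behaviour described above.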
bool Halide::Internal::expr_match | ( | const Expr & | pattern, |
const Expr & | expr, | ||
std::map< std::string, Expr > & | result | ||
) |
Does the first expression have the same structure as the second? Variables are matched consistently.
The first time a variable is matched, it assumes the value of the matching part of the second expression. Subsequent matches must be equal to the first match.
For example:
should return true, and set result["x"] = a, and result["y"] = b.
Rewrite the expression x to have `lanes` lanes.
This is useful for substituting the results of expr_match into a pattern expression.
void Halide::Internal::expr_match_test | ( | ) |
std::pair<Region, bool> Halide::Internal::mutate_region | ( | Mutator * | mutator, |
const Region & | bounds, | ||
Args &&... | args | ||
) |
A helper function for mutator-like things to mutate regions.
Definition at line 123 of file IRMutator.h.
References Halide::Internal::IntrusivePtr< T >::same_as().
bool Halide::Internal::is_const | ( | const Expr & | e | ) |
const double* Halide::Internal::as_const_float | ( | const Expr & | e | ) |
bool Halide::Internal::is_const_power_of_two_integer | ( | const Expr & | e, |
int * | bits | ||
) |
Is the expression a constant integer power of two.
Also returns log base two of the expression if it is. Only returns true for integer types.
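As an illustration of the check on a plain integer (not on an `Expr`), the following sketch tests for a power of two and reports its base-two logarithm through an out-parameter. `toy_is_power_of_two` is a hypothetical helper written for this example.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if v is a positive power of two, writing log2(v) to *bits.
// *bits is left untouched when the function returns false.
bool toy_is_power_of_two(int64_t v, int *bits) {
    // A positive power of two has exactly one bit set, so v & (v - 1) is zero.
    if (v <= 0 || (v & (v - 1)) != 0) return false;
    int log2 = 0;
    while ((v >> log2) != 1) ++log2;
    *bits = log2;
    return true;
}
```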
bool Halide::Internal::is_positive_const | ( | const Expr & | e | ) |
Is the expression a const (as defined by is_const), and also strictly greater than zero (in all lanes, if a vector expression).
bool Halide::Internal::is_negative_const | ( | const Expr & | e | ) |
Is the expression a const (as defined by is_const), and also strictly less than zero (in all lanes, if a vector expression).
bool Halide::Internal::is_undef | ( | const Expr & | e | ) |
Is the expression an undef.
bool Halide::Internal::is_const_zero | ( | const Expr & | e | ) |
Is the expression a const (as defined by is_const), and also equal to zero (in all lanes, if a vector expression).
Referenced by Halide::Internal::IRMatcher::NegateOp< A >::match().
bool Halide::Internal::is_const_one | ( | const Expr & | e | ) |
Is the expression a const (as defined by is_const), and also equal to one (in all lanes, if a vector expression).
Referenced by Halide::Internal::IRMatcher::CanProve< A, Prover >::make_folded_const().
bool Halide::Internal::is_no_op | ( | const Stmt & | s | ) |
bool Halide::Internal::is_pure | ( | const Expr & | e | ) |
Does the expression (1) take on the same value no matter where it appears in a Stmt, and (2) have no side effects when evaluated?
Construct an immediate of the given type from any numeric C++ type.
Referenced by Halide::Internal::IRMatcher::fuzz_test_rule(), Halide::Internal::IRMatcher::IntLiteral::make(), make_const(), Halide::Internal::GeneratorParamImpl< LoopLevel >::operator Expr(), and Halide::Internal::IRMatcher::Rewriter< Instance >::operator()().
Definition at line 78 of file IROperator.h.
References make_const().
Definition at line 81 of file IROperator.h.
References make_const().
Definition at line 84 of file IROperator.h.
References make_const().
Definition at line 87 of file IROperator.h.
References make_const().
Definition at line 90 of file IROperator.h.
References make_const().
Definition at line 93 of file IROperator.h.
References make_const().
Definition at line 96 of file IROperator.h.
References make_const().
Definition at line 99 of file IROperator.h.
References make_const().
Definition at line 102 of file IROperator.h.
References make_const().
Construct a unique signed_integer_overflow Expr.
Referenced by Halide::Internal::IRMatcher::make_const_special_expr().
bool Halide::Internal::is_signed_integer_overflow | ( | const Expr & | expr | ) |
Check if an expression is a signed_integer_overflow.
Check if a constant value can be correctly represented as the given type.
Expr Halide::Internal::make_bool | ( | bool | val, |
int | lanes = 1 |
||
) |
Construct a boolean constant from a C++ boolean value.
May also be a vector if lanes is given. It is not possible to coerce a C++ boolean to Expr because if we provide such a path then char objects can ambiguously be converted to Halide Expr or to std::string. The problem is that C++ does not have a real bool type - it is in fact close enough to char that C++ does not know how to distinguish them. make_bool is the explicit coercion.
Construct the representation of zero in the given type.
Referenced by Halide::Internal::IRMatcher::NegateOp< A >::make().
Expr Halide::Internal::const_true | ( | int | lanes = 1 | ) |
Construct the constant boolean true.
May also be a vector of trues, if a lanes argument is given.
Expr Halide::Internal::const_false | ( | int | lanes = 1 | ) |
Construct the constant boolean false.
May also be a vector of falses, if a lanes argument is given.
Attempt to cast an expression to a smaller type while provably not losing information.
If it can't be done, return an undefined Expr.
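The idea of a cast that succeeds only when no information is lost can be sketched for plain integer constants. The real function works on arbitrary expressions and must prove bounds; this hypothetical `toy_lossless_narrow` only range-checks a single 64-bit value and signals failure with `false` where Halide would return an undefined Expr.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Try to narrow a 64-bit constant to a smaller integer type T.
// Returns false, leaving *out untouched, if the value does not fit.
template<typename T>
bool toy_lossless_narrow(int64_t v, T *out) {
    static_assert(std::is_integral<T>::value, "T must be an integer type");
    static_assert(sizeof(T) < sizeof(int64_t), "T must be narrower than 64 bits");
    const int64_t lo = static_cast<int64_t>(std::numeric_limits<T>::min());
    const int64_t hi = static_cast<int64_t>(std::numeric_limits<T>::max());
    if (v < lo || v > hi) return false;
    *out = static_cast<T>(v);
    return true;
}
```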
Attempt to negate x without introducing new IR and without overflow.
If it can't be done, return an undefined Expr.
Coerce the two expressions to have the same type, using C-style casting rules.
For the purposes of casting, a boolean type is UInt(1). We use the following procedure:
If the types already match, do nothing.
Then, if one type is a vector and the other is a scalar, the scalar is broadcast to match the vector width, and we continue.
Then, if one type is floating-point and the other is not, the non-float is cast to the floating-point type, and we're done.
Then, if both types are unsigned ints, the one with fewer bits is cast to match the one with more bits and we're done.
Then, if both types are signed ints, the one with fewer bits is cast to match the one with more bits and we're done.
Finally, if one type is an unsigned int and the other type is a signed int, both are cast to a signed int with the greater of the two bit-widths. For example, matching an Int(8) with a UInt(16) results in an Int(16).
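The procedure above can be sketched on a toy type descriptor. `ToyType` and `toy_match_types` are hypothetical and compute only the resulting common type rather than inserting casts into expressions. The handling of two different floating-point types (picking the wider one) is an assumption, since the steps above do not spell it out.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a Halide type: kind, bit width, and vector lanes.
struct ToyType {
    enum Code { Int, UInt, Float } code;
    int bits;
    int lanes;
    bool operator==(const ToyType &o) const {
        return code == o.code && bits == o.bits && lanes == o.lanes;
    }
};

// Returns the type both operands would be coerced to.
ToyType toy_match_types(ToyType a, ToyType b) {
    // Step 1: identical types need no work.
    if (a == b) return a;

    // Step 2: broadcast a scalar to the vector width of the other operand.
    if (a.lanes == 1 && b.lanes > 1) a.lanes = b.lanes;
    if (b.lanes == 1 && a.lanes > 1) b.lanes = a.lanes;
    assert(a.lanes == b.lanes);  // mismatched vector widths are not handled here

    // Step 3: a float operand wins over a non-float operand.
    if (a.code == ToyType::Float && b.code != ToyType::Float) return a;
    if (b.code == ToyType::Float && a.code != ToyType::Float) return b;

    // Steps 4 and 5: same kind, so widen to the larger bit width.
    // (Assumption: two floats of different widths are treated the same way.)
    const int bits = std::max(a.bits, b.bits);
    if (a.code == b.code) return ToyType{a.code, bits, a.lanes};

    // Step 6: mixed signedness becomes a signed int of the larger width.
    return ToyType{ToyType::Int, bits, a.lanes};
}
```

For instance, combining an `Int(8)` with a `UInt(16)` in this model yields `Int(16)`, matching the example in the last step.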
Asserts that both expressions are integer types and are either both signed or both unsigned.
If one argument is scalar and the other a vector, the scalar is broadcast to have the same number of lanes as the vector. If one expression is of narrower type than the other, it is widened to the bit width of the wider.
Raise an expression to an integer power by repeatedly multiplying it by itself.
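A sketch of the same idea on a plain `double` follows. It uses exponentiation by squaring to keep the number of multiplications logarithmic in the exponent, which may or may not be the strategy Halide uses. `toy_integer_power` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Computes x raised to the integer power p using only multiplication
// (and one division for negative exponents).
// Precondition: p is not INT64_MIN, since negating it would overflow.
double toy_integer_power(double x, int64_t p) {
    if (p < 0) return 1.0 / toy_integer_power(x, -p);
    double result = 1.0;
    while (p > 0) {
        if (p & 1) result *= x;  // fold in the current bit of the exponent
        x *= x;                  // square the base for the next bit
        p >>= 1;
    }
    return result;
}
```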
Split a boolean condition into vector of ANDs.
If 'cond' is undefined, return an empty vector.
If e is a ramp expression with the given stride (default 1), return the base; otherwise return an undefined Expr.
|
inline |
Implementations of division and mod that are specific to Halide.
Use these implementations; do not use native C division or mod to simplify Halide expressions. Halide division and modulo satisfy the Euclidean definition of division for integers a and b:
when b != 0, (a/b)*b + a%b = a
0 <= a%b < |b|
Additionally, mod by zero returns zero, and div by zero returns zero. This makes mod and div total functions.
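The difference from native C++ division, which truncates toward zero, shows up with negative operands. The following sketch implements the Euclidean rules stated above for 64-bit integers. `toy_euclid_div` and `toy_euclid_mod` are hypothetical names and this is not the code from IROperator.h.

```cpp
#include <cassert>
#include <cstdint>

// Euclidean remainder: always in [0, |b|), and defined as 0 when b == 0.
// Precondition: not (a == INT64_MIN && b == -1), which overflows in C++.
int64_t toy_euclid_mod(int64_t a, int64_t b) {
    if (b == 0) return 0;
    int64_t r = a % b;  // C++ remainder takes the sign of a
    if (r < 0) r += (b < 0 ? -b : b);
    return r;
}

// Euclidean quotient: chosen so that q * b + toy_euclid_mod(a, b) == a,
// and defined as 0 when b == 0. Same precondition as above.
int64_t toy_euclid_div(int64_t a, int64_t b) {
    if (b == 0) return 0;
    int64_t q = a / b;  // C++ division truncates toward zero
    if (a % b < 0) q += (b > 0 ? -1 : 1);
    return q;
}
```

With these definitions, `-7 / 2` is `-4` with remainder `1`, whereas native C++ gives `-3` with remainder `-1`.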
Definition at line 239 of file IROperator.h.
References Halide::Type::is_float(), and Halide::Type::is_int().
Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), and Halide::Internal::Simplify::ExprInfo::trim_bounds_using_alignment().
|
inline |
Definition at line 260 of file IROperator.h.
References Halide::Type::is_float(), and Halide::Type::is_int().
Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Div >().
|
inline |
Definition at line 285 of file IROperator.h.
|
inline |
Definition at line 291 of file IROperator.h.
References Halide::floor().
|
inline |
Definition at line 297 of file IROperator.h.
|
inline |
Definition at line 301 of file IROperator.h.
Return an Expr that is identical to the input Expr, but with all calls to likely() and likely_if_innermost() removed.
Return a Stmt that is identical to the input Stmt, but with all calls to likely() and likely_if_innermost() removed.
Return an Expr that is identical to the input Expr, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.
Return a Stmt that is identical to the input Stmt, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.
If the expression is a tag helper call, remove it and return the tagged expression.
If not, returns the expression.
|
inline |
Definition at line 335 of file IROperator.h.
Referenced by Halide::Pipeline::add_requirement(), Halide::Internal::GeneratorBase::add_requirement(), collect_print_args(), Halide::print(), Halide::print_when(), and Halide::require().
|
inline |
Definition at line 339 of file IROperator.h.
References collect_print_args().
|
inline |
Definition at line 345 of file IROperator.h.
References collect_print_args().
Expr Halide::Internal::requirement_failed_error | ( | Expr | condition, |
const std::vector< Expr > & | args | ||
) |
Expr Halide::Internal::memoize_tag_helper | ( | Expr | result, |
const std::vector< Expr > & | cache_key_values | ||
) |
Referenced by Halide::memoize_tag().
Return an expression that should never be evaluated.
Expressions that depend on unreachable values are also unreachable, and statements that execute unreachable expressions are also considered unreachable.
Referenced by unreachable().
|
inline |
Definition at line 1328 of file IROperator.h.
References unreachable().
FOR INTERNAL USE ONLY.
An entirely unchecked version of unsafe_promise_clamped, used inside the compiler as an annotation of the known bounds of an Expr when it has proved something is bounded and wants to record that fact for later passes (notably bounds inference) to exploit. This gets introduced by GuardWithIf tail strategies, because the bounds machinery has a hard time exploiting if statement conditions.
Unlike unsafe_promise_clamped, this expression is context-dependent, because 'value' might be statically bounded at some point in the IR (e.g. due to a containing if statement), but not elsewhere.
This intrinsic always evaluates to its first argument. If this value is used by a side-effecting operation and it is outside the range specified by its second and third arguments, behavior is undefined. The compiler can therefore assume that the value is within the range given and optimize accordingly. Note that this permits promise_clamped to evaluate to something outside of the range, provided that this value is not used.
Note that this produces an intrinsic that is marked as 'pure' and thus is allowed to be hoisted, etc.; thus, extra care must be taken with its use.
Expr Halide::Internal::widen_right_add | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1669 of file IROperator.h.
References Halide::widen_right_add().
Expr Halide::Internal::widen_right_mul | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1674 of file IROperator.h.
References Halide::widen_right_mul().
Expr Halide::Internal::widen_right_sub | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1679 of file IROperator.h.
References Halide::widen_right_sub().
Expr Halide::Internal::widening_add | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1684 of file IROperator.h.
References Halide::widening_add().
Expr Halide::Internal::widening_mul | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1689 of file IROperator.h.
References Halide::widening_mul().
Expr Halide::Internal::widening_sub | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1694 of file IROperator.h.
References Halide::widening_sub().
Expr Halide::Internal::widening_shift_left | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1699 of file IROperator.h.
References Halide::widening_shift_left().
Expr Halide::Internal::widening_shift_left | ( | const Expr & | a, |
int | b, | ||
T * | = nullptr |
||
) |
Definition at line 1704 of file IROperator.h.
References Halide::widening_shift_left().
Expr Halide::Internal::widening_shift_right | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1709 of file IROperator.h.
References Halide::widening_shift_right().
Expr Halide::Internal::widening_shift_right | ( | const Expr & | a, |
int | b, | ||
T * | = nullptr |
||
) |
Definition at line 1714 of file IROperator.h.
References Halide::widening_shift_right().
Expr Halide::Internal::rounding_shift_left | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1719 of file IROperator.h.
References Halide::widening_shift_left().
Expr Halide::Internal::rounding_shift_left | ( | const Expr & | a, |
int | b, | ||
T * | = nullptr |
||
) |
Definition at line 1724 of file IROperator.h.
References Halide::widening_shift_left().
Expr Halide::Internal::rounding_shift_right | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1729 of file IROperator.h.
References Halide::rounding_shift_right().
Expr Halide::Internal::rounding_shift_right | ( | const Expr & | a, |
int | b, | ||
T * | = nullptr |
||
) |
Definition at line 1734 of file IROperator.h.
References Halide::rounding_shift_right().
Expr Halide::Internal::saturating_add | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1739 of file IROperator.h.
References Halide::saturating_add().
Expr Halide::Internal::saturating_sub | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1744 of file IROperator.h.
References Halide::saturating_sub().
Expr Halide::Internal::halving_add | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1749 of file IROperator.h.
References Halide::halving_add().
Expr Halide::Internal::rounding_halving_add | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1754 of file IROperator.h.
References Halide::rounding_halving_add().
Expr Halide::Internal::halving_sub | ( | const Expr & | a, |
const Expr & | b, | ||
T * | = nullptr |
||
) |
Definition at line 1759 of file IROperator.h.
References Halide::halving_sub().
Expr Halide::Internal::mul_shift_right | ( | const Expr & | a, |
const Expr & | b, | ||
const Expr & | q, | ||
T * | = nullptr |
||
) |
Definition at line 1764 of file IROperator.h.
References Halide::mul_shift_right().
Expr Halide::Internal::mul_shift_right | ( | const Expr & | a, |
const Expr & | b, | ||
int | q, | ||
T * | = nullptr |
||
) |
Definition at line 1769 of file IROperator.h.
References Halide::mul_shift_right().
Expr Halide::Internal::rounding_mul_shift_right | ( | const Expr & | a, |
const Expr & | b, | ||
const Expr & | q, | ||
T * | = nullptr |
||
) |
Definition at line 1774 of file IROperator.h.
References Halide::rounding_mul_shift_right().
Expr Halide::Internal::rounding_mul_shift_right | ( | const Expr & | a, |
const Expr & | b, | ||
int | q, | ||
T * | = nullptr |
||
) |
Definition at line 1779 of file IROperator.h.
References Halide::rounding_mul_shift_right().
std::ostream& Halide::Internal::operator<< | ( | std::ostream & | stream, |
const AssociativePattern & | |||
) |
Emit a halide associative pattern on an output stream (such as std::cout) in a human-readable form.
std::ostream& Halide::Internal::operator<< | ( | std::ostream & | stream, |
const AssociativeOp & | |||
) |
Emit a halide associative op on an output stream (such as std::cout) in a human-readable form.
std::ostream& Halide::Internal::operator<< | ( | std::ostream & | stream, |
const ForType & | |||
) |
Emit a Halide for loop type (vectorized, serial, etc.) in a human-readable form.
std::ostream& Halide::Internal::operator<< | ( | std::ostream & | stream, |
const VectorReduce::Operator & | |||
) |
Emit a horizontal vector reduction op in human-readable form.
std::ostream& Halide::Internal::operator<< | ( | std::ostream & | stream, |
const NameMangling & | |||
) |
Emit a Halide name mangling value in a human-readable format.
std::ostream& Halide::Internal::operator<< | ( | std::ostream & | stream, |
const LinkageType & | |||
) |
Emit a Halide linkage value in a human-readable format.
std::ostream& Halide::Internal::operator<< | ( | std::ostream & | stream, |
const DimType & | |||
) |
Emit a halide dimension type in human-readable format.
std::ostream& Halide::Internal::operator<< | ( | std::ostream & | out, |
const Closure & | c | ||
) |
Emit a Closure in human-readable format.
std::ostream& Halide::Internal::operator<< | ( | std::ostream & | stream, |
const Indentation & | |||
) |
void* Halide::Internal::get_symbol_address | ( | const char * | s | ) |
Expr Halide::Internal::lower_lerp | ( | Type | final_type, |
Expr | zero_val, | ||
Expr | one_val, | ||
const Expr & | weight, | ||
const Target & | target | ||
) |
Build Halide IR that computes a lerp.
Used by codegen targets that don't have a native lerp. The lerp is done in the type of the zero value. The final_type is a cast that should occur after the lerp. It's included because in some cases you can incorporate a final cast into the lerp math.
Hoist loop-invariants out of inner loops.
This is especially important in cases where LLVM would not do it for us automatically. For example, it hoists loop invariants out of CUDA kernels.
Just hoist loop-invariant if statements as far up as possible.
Does not lift other values. It's useful to run this earlier in lowering to simplify the IR.
auto Halide::Internal::iterator_to_pointer | ( | T | iter | ) | -> decltype(&*std::declval<T>()) |
Definition at line 124 of file LLVM_Headers.h.
|
inline |
Definition at line 128 of file LLVM_Headers.h.
|
inline |
Definition at line 132 of file LLVM_Headers.h.
|
inline |
Definition at line 136 of file LLVM_Headers.h.
llvm::Triple Halide::Internal::get_triple_for_target | ( | const Target & | target | ) |
std::unique_ptr<llvm::Module> Halide::Internal::get_initial_module_for_target | ( | Target | , |
llvm::LLVMContext * | , | ||
bool | for_shared_jit_runtime = false , |
||
bool | just_gpu = false |
||
) |
Create an llvm module containing the support code for a given target.
std::unique_ptr<llvm::Module> Halide::Internal::get_initial_module_for_ptx_device | ( | Target | , |
llvm::LLVMContext * | c | ||
) |
Create an llvm module containing the support code for ptx device.
void Halide::Internal::add_bitcode_to_module | ( | llvm::LLVMContext * | context, |
llvm::Module & | module, | ||
const std::vector< uint8_t > & | bitcode, | ||
const std::string & | name | ||
) |
Link a block of llvm bitcode into an llvm module.
std::unique_ptr<llvm::Module> Halide::Internal::link_with_wasm_jit_runtime | ( | llvm::LLVMContext * | c, |
const Target & | t, | ||
std::unique_ptr< llvm::Module > | extra_module | ||
) |
Take the llvm::Module(s) in extra_modules (if any), add the runtime modules needed for the WASM JIT, and link into a single llvm::Module.
Reuse loads done on previous loop iterations by stashing them in induction variables instead of redoing the load.
If the loads are predicated, the predicates need to match. Can be an optimization or pessimization depending on how good the L1 cache is on the architecture and how many memory issue slots there are. Currently only intended for Hexagon.
Module Halide::Internal::lower | ( | const std::vector< Function > & | output_funcs, |
const std::string & | pipeline_name, | ||
const Target & | t, | ||
const std::vector< Argument > & | args, | ||
LinkageType | linkage_type, | ||
const std::vector< Stmt > & | requirements = std::vector< Stmt >() , |
||
bool | trace_pipeline = false , |
||
const std::vector< IRMutator * > & | custom_passes = std::vector< IRMutator * >() |
||
) |
Given a vector of scheduled Halide functions, create a Module that evaluates them.
Automatically pulls in all the functions f depends on. Some stages of lowering may be target-specific. The Module may contain submodules for computation offloaded to another execution engine or API as well as buffers that are used in the passed in Stmt.
Stmt Halide::Internal::lower_main_stmt | ( | const std::vector< Function > & | output_funcs, |
const std::string & | pipeline_name, | ||
const Target & | t, | ||
const std::vector< Stmt > & | requirements = std::vector< Stmt >() , |
||
bool | trace_pipeline = false , |
||
const std::vector< IRMutator * > & | custom_passes = std::vector< IRMutator * >() |
||
) |
Given a halide function with a schedule, create a statement that evaluates it.
Automatically pulls in all the functions f depends on. Some stages of lowering may be target-specific. Mostly used as a convenience function in tests that wish to assert some property of the lowered IR.
void Halide::Internal::lower_test | ( | ) |
Stmt Halide::Internal::lower_parallel_tasks | ( | const Stmt & | s, |
std::vector< LoweredFunc > & | closure_implementations, | ||
const std::string & | name, | ||
const Target & | t | ||
) |
Rewrite access to things stored outside the loop over GPU lanes to use NVIDIA's warp shuffle instructions.
Stmt Halide::Internal::inject_memoization | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env, | ||
const std::string & | name, | ||
const std::vector< Function > & | outputs | ||
) |
Transform pipeline calls for Funcs scheduled with memoize to do a lookup call to the runtime cache implementation, and if there is a miss, compute the results and call the runtime to store it back to the cache.
Should leave non-memoized Funcs unchanged.
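The lookup, miss, compute, and store-back sequence described above follows the usual memoization pattern. The sketch below shows that pattern in isolation. `ToyCache` and `get_or_compute` are hypothetical and unrelated to Halide's runtime cache API, which keys on more than a single integer.

```cpp
#include <cassert>
#include <functional>
#include <map>

// Hypothetical in-memory cache keyed by an int, standing in for the runtime cache.
struct ToyCache {
    std::map<int, int> entries;
    int misses = 0;  // counts how often the producer actually ran

    int get_or_compute(int key, const std::function<int(int)> &compute) {
        // Lookup: on a hit, return the stored result without recomputing.
        auto it = entries.find(key);
        if (it != entries.end()) return it->second;

        // Miss: run the producer, then store the result back for later calls.
        ++misses;
        const int value = compute(key);
        entries[key] = value;
        return value;
    }
};
```

Calling `get_or_compute` twice with the same key runs the producer only once, which is the behaviour the transformed pipeline relies on.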
Stmt Halide::Internal::rewrite_memoized_allocations | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env | ||
) |
This should be called after Storage Flattening has added Allocation IR nodes.
It connects the memoization cache lookups to the Allocations so they point to the buffers from the memoization cache and those buffers are released when no longer used. Should not affect allocations for non-memoized Funcs.
std::map<OutputFileType, const OutputInfo> Halide::Internal::get_output_info | ( | const Target & | target | ) |
Referenced by Halide::SimdOpCheckTest::compile_and_check().
ModulusRemainder Halide::Internal::operator+ | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b | ||
) |
ModulusRemainder Halide::Internal::operator- | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b | ||
) |
ModulusRemainder Halide::Internal::operator* | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b | ||
) |
ModulusRemainder Halide::Internal::operator/ | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b | ||
) |
ModulusRemainder Halide::Internal::operator% | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b | ||
) |
ModulusRemainder Halide::Internal::operator+ | ( | const ModulusRemainder & | a, |
int64_t | b | ||
) |
ModulusRemainder Halide::Internal::operator- | ( | const ModulusRemainder & | a, |
int64_t | b | ||
) |
ModulusRemainder Halide::Internal::operator* | ( | const ModulusRemainder & | a, |
int64_t | b | ||
) |
ModulusRemainder Halide::Internal::operator/ | ( | const ModulusRemainder & | a, |
int64_t | b | ||
) |
ModulusRemainder Halide::Internal::operator% | ( | const ModulusRemainder & | a, |
int64_t | b | ||
) |
ModulusRemainder Halide::Internal::modulus_remainder | ( | const Expr & | e | ) |
For things like alignment analysis, often it's helpful to know if an integer expression is some multiple of a constant plus some other constant.
For example, it is straightforward to deduce that ((10*x + 2)*(6*y - 3) - 1) is congruent to five modulo six.
We get the most information when the modulus is large. E.g. if something is congruent to 208 modulo 384, then we also know it's congruent to 0 mod 8, and we can possibly use it as an index for an aligned load. If all else fails, we can just say that an integer is congruent to zero modulo one.
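The deduction in the example can be reproduced with a small model of the analysis. `ToyModRem` and the `toy_*` helpers below are hypothetical. They track "some multiple of `modulus` plus `remainder`" through addition, subtraction, and multiplication, use a modulus of 0 to mean an exact constant, and ignore overflow, so this is a sketch rather than Halide's actual rules.

```cpp
#include <cassert>
#include <cstdint>

// Value known to equal modulus * k + remainder for some integer k.
// modulus == 0 means the value is exactly `remainder`.
struct ToyModRem {
    int64_t modulus;
    int64_t remainder;
};

// Greatest common divisor on absolute values; toy_gcd(0, n) is |n|.
int64_t toy_gcd(int64_t a, int64_t b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        const int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Bring the remainder into [0, modulus) unless the value is an exact constant.
ToyModRem toy_normalize(int64_t m, int64_t r) {
    if (m != 0) {
        r %= m;
        if (r < 0) r += m;
    }
    return ToyModRem{m, r};
}

ToyModRem toy_add(ToyModRem a, ToyModRem b) {
    return toy_normalize(toy_gcd(a.modulus, b.modulus), a.remainder + b.remainder);
}

ToyModRem toy_sub(ToyModRem a, ToyModRem b) {
    return toy_normalize(toy_gcd(a.modulus, b.modulus), a.remainder - b.remainder);
}

ToyModRem toy_mul(ToyModRem a, ToyModRem b) {
    // (am*x + ar) * (bm*y + br) = am*bm*x*y + am*br*x + bm*ar*y + ar*br,
    // so everything except ar*br is a multiple of the gcd of the three coefficients.
    const int64_t m = toy_gcd(a.modulus * b.modulus,
                              toy_gcd(a.modulus * b.remainder, b.modulus * a.remainder));
    return toy_normalize(m, a.remainder * b.remainder);
}

// True if the tracked fact guarantees the value is a multiple of k (k > 0).
bool toy_is_multiple_of(ToyModRem v, int64_t k) {
    return v.modulus % k == 0 && v.remainder % k == 0;
}
```

An unknown integer is `{1, 0}` and a constant `c` is `{0, c}`. Building the example expression from those pieces yields modulus 6 and remainder 5, and `{384, 208}` is reported as a multiple of 8, as stated in the text.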
ModulusRemainder Halide::Internal::modulus_remainder | ( | const Expr & | e, |
const Scope< ModulusRemainder > & | scope | ||
) |
If we have alignment information about external variables, we can let the analysis know about that using this version of modulus_remainder.
Reduce an expression modulo some integer.
Returns true and assigns to remainder if an answer could be found.
bool Halide::Internal::reduce_expr_modulo | ( | const Expr & | e, |
int64_t | modulus, | ||
int64_t * | remainder, | ||
const Scope< ModulusRemainder > & | scope | ||
) |
Reduce an expression modulo some integer.
Returns true and assigns to remainder if an answer could be found.
void Halide::Internal::modulus_remainder_test | ( | ) |
The greatest common divisor of two integers.
Referenced by Halide::Internal::Autoscheduler::OptionalRational::operator+=().
The least common multiple of two integers.
Referenced by Halide::Internal::Autoscheduler::OptionalRational::operator+=().
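For reference, both quantities can be computed with Euclid's algorithm. This is a minimal sketch for non-negative inputs. `toy_gcd` and `toy_lcm` are hypothetical names, and the division before the multiplication in `toy_lcm` is a choice made here to reduce the risk of overflow.

```cpp
#include <cassert>
#include <cstdint>

// Euclid's algorithm. Assumes a >= 0 and b >= 0; toy_gcd(n, 0) is n.
int64_t toy_gcd(int64_t a, int64_t b) {
    while (b != 0) {
        const int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Least common multiple via lcm(a, b) = a / gcd(a, b) * b.
// Returns 0 if either input is 0.
int64_t toy_lcm(int64_t a, int64_t b) {
    if (a == 0 || b == 0) return 0;
    return (a / toy_gcd(a, b)) * b;
}
```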
ConstantInterval Halide::Internal::derivative_bounds | ( | const Expr & | e, |
const std::string & | var, | ||
const Scope< ConstantInterval > & | scope = Scope< ConstantInterval >::empty_scope() |
||
) |
Find the bounds of the derivative of an expression.
Monotonic Halide::Internal::is_monotonic | ( | const Expr & | e, |
const std::string & | var, | ||
const Scope< ConstantInterval > & | scope = Scope< ConstantInterval >::empty_scope() |
||
) |