Halide 19.0.0
Halide compiler and libraries
Namespaces | |
namespace | Autoscheduler |
namespace | Elf |
namespace | GeneratorMinMax |
namespace | IntegerDivision |
namespace | IRMatcher |
An alternative template-metaprogramming approach to expression matching. | |
namespace | Test |
Classes | |
class | AbstractGenerator |
AbstractGenerator is an ABC that defines the API a Generator must provide to work with the existing Generator infrastructure (GenGen, RunGen, execute_generator(), Generator Stubs). More... | |
struct | Acquire |
struct | Add |
The sum of two expressions. More... | |
struct | all_are_convertible |
struct | all_are_printable_args |
struct | all_ints_and_optional_name |
struct | all_ints_and_optional_name< First, Rest... > |
struct | all_ints_and_optional_name< T > |
struct | all_ints_and_optional_name<> |
struct | Allocate |
Allocate a scratch area with the given name, type, and size. More... | |
struct | And |
Logical and - are both expressions true. More... | |
struct | ApplySplitResult |
class | aslog |
struct | AssertStmt |
If the 'condition' is false, then evaluate and return the message, which should be a call to an error function. More... | |
struct | AssociativeOp |
Represent the equivalent associative op of an update definition. More... | |
struct | AssociativePattern |
Represent an associative op with its identity. More... | |
struct | Atomic |
Lock all the Store nodes in the body statement. More... | |
struct | BaseExprNode |
A base class for expression nodes. More... | |
struct | BaseStmtNode |
IR nodes are split into expressions and statements. More... | |
struct | Block |
A sequence of statements to be executed in-order. More... | |
struct | Bound |
A bound on a loop, typically from Func::bound. More... | |
struct | Box |
Represents the bounds of a region of arbitrary dimension. More... | |
struct | Broadcast |
A vector with 'lanes' elements, in which every element is 'value'. More... | |
struct | BufferBuilder |
A builder to help create Exprs representing halide_buffer_t structs (e.g. More... | |
struct | BufferContents |
struct | BufferInfo |
Find all calls to image buffers and parameters in the function. More... | |
struct | Call |
A function call. More... | |
struct | Cast |
The actual IR nodes begin here. More... | |
class | Closure |
A helper class to manage closures. More... | |
class | CodeGen_C |
This class emits C++ code equivalent to a halide Stmt. More... | |
class | CodeGen_GPU_C |
A base class for GPU backends that require C-like shader output. More... | |
struct | CodeGen_GPU_Dev |
A code generator that emits GPU code from a given Halide stmt. More... | |
class | CodeGen_LLVM |
A code generator abstract base class. More... | |
class | CodeGen_Posix |
A code generator that emits posix code from a given Halide stmt. More... | |
class | CodeGen_PyTorch |
This class emits C++ code to wrap a Halide pipeline so that it can be used as a C++ extension operator in PyTorch. More... | |
class | CompilerLogger |
struct | cond |
struct | ConstantInterval |
A class to represent ranges of integers. More... | |
struct | Convert |
struct | Cost |
class | debug |
For optional debugging during codegen, use the debug class as follows: More... | |
class | Definition |
A Function definition which can represent either an init or an update definition. More... | |
struct | DeviceArgument |
A DeviceArgument looks similar to a Halide::Argument, but has behavioral differences that make it specific to the GPU pipeline; the fact that it neither is-a nor has-a Halide::Argument is deliberate. More... | |
struct | Dim |
The Dim struct represents one loop in the schedule's representation of a loop nest. More... | |
class | Dimension |
struct | Div |
The ratio of two expressions. More... | |
struct | EQ |
Is the first expression equal to the second. More... | |
struct | ErrorReport |
struct | Evaluate |
Evaluate and discard an expression, presumably because it has some side-effect. More... | |
struct | ExecuteGeneratorArgs |
ExecuteGeneratorArgs is the set of arguments to execute_generator(). More... | |
struct | ExprNode |
We use the "curiously recurring template pattern" to avoid duplicated code in the IR Nodes. More... | |
class | ExprUsesVars |
struct | FeatureIntermediates |
struct | FileStat |
class | FindAllCalls |
Visitor for keeping track of functions that are directly called and the arguments with which they are called. More... | |
struct | FloatImm |
Floating point constants. More... | |
struct | For |
A for loop. More... | |
struct | Fork |
A pair of statements executed concurrently. More... | |
struct | Free |
Free the resources associated with the given buffer. More... | |
class | FuncSchedule |
A schedule for a Function of a Halide pipeline. More... | |
class | Function |
A reference-counted handle to Halide's internal representation of a function. More... | |
struct | FunctionPtr |
A possibly-weak pointer to a Halide function. More... | |
struct | FusedPair |
This represents two stages with fused loop nests from outermost to a specific loop level. More... | |
struct | GE |
Is the first expression greater than or equal to the second. More... | |
class | GeneratorBase |
class | GeneratorFactoryProvider |
GeneratorFactoryProvider provides a way to customize the Generators that are visible to generate_filter_main (which otherwise would just look at the global registry of C++ Generators). More... | |
class | GeneratorInput_Arithmetic |
class | GeneratorInput_Buffer |
class | GeneratorInput_DynamicScalar |
class | GeneratorInput_Func |
class | GeneratorInput_Scalar |
class | GeneratorInputBase |
class | GeneratorInputImpl |
class | GeneratorOutput_Arithmetic |
class | GeneratorOutput_Buffer |
class | GeneratorOutput_Func |
class | GeneratorOutputBase |
class | GeneratorOutputImpl |
class | GeneratorParam_Arithmetic |
class | GeneratorParam_AutoSchedulerParams |
class | GeneratorParam_Bool |
class | GeneratorParam_Enum |
class | GeneratorParam_LoopLevel |
class | GeneratorParam_String |
class | GeneratorParam_Synthetic |
class | GeneratorParam_Target |
class | GeneratorParam_Type |
class | GeneratorParamBase |
class | GeneratorParamImpl |
class | GeneratorParamInfo |
class | GeneratorRegistry |
class | GIOBase |
GIOBase is the base class for all GeneratorInput<> and GeneratorOutput<> instantiations; it is not part of the public API and should never be used directly by user code. More... | |
class | GPUCompilationCache |
class | GpuObjectLifetimeTracker |
struct | GT |
Is the first expression greater than the second. More... | |
struct | HalideBufferStaticTypeAndDims |
struct | HalideBufferStaticTypeAndDims<::Halide::Buffer< T, Dims > > |
struct | HalideBufferStaticTypeAndDims<::Halide::Runtime::Buffer< T, Dims > > |
struct | has_static_halide_type_method |
struct | has_static_halide_type_method< T2, typename type_sink< decltype(T2::static_halide_type())>::type > |
class | HexagonAlignmentAnalyzer |
struct | HoistedStorage |
Represents a location where storage will be hoisted to for a Func / Realize node with a given name. More... | |
class | HostClosure |
A Closure modified to inspect GPU-specific memory accesses, and produce a vector of DeviceArgument objects. More... | |
struct | IfThenElse |
An if-then-else block. More... | |
struct | Indentation |
struct | InferredArgument |
An inferred argument. More... | |
struct | Interval |
A class to represent ranges of Exprs. More... | |
struct | IntImm |
Integer constants. More... | |
struct | IntrusivePtr |
Intrusive shared pointers have a reference count (a RefCount object) stored in the class itself. More... | |
struct | IRDeepCompare |
A compare struct built around less_than, for use as the comparison object in a std::map or std::set. More... | |
struct | IRGraphDeepCompare |
A compare struct built around graph_less_than, for use as the comparison object in a std::map or std::set. More... | |
class | IRGraphMutator |
A mutator that caches and reapplies previously-done mutations, so that it can handle graphs of IR that have not had CSE done to them. More... | |
class | IRGraphVisitor |
A base class for algorithms that walk recursively over the IR without visiting the same node twice. More... | |
struct | IRHandle |
IR nodes are passed around using opaque handles to them. More... | |
class | IRMutator |
A base class for passes over the IR which modify it (e.g. More... | |
struct | IRNode |
The abstract base class for a node in the Halide IR. More... | |
class | IRPrinter |
An IRVisitor that emits IR to the given output stream in a human readable form. More... | |
class | IRVisitor |
A base class for algorithms that need to recursively walk over the IR. More... | |
struct | is_printable_arg |
struct | IsHalideBuffer |
struct | IsHalideBuffer< const halide_buffer_t * > |
struct | IsHalideBuffer< halide_buffer_t * > |
struct | IsHalideBuffer<::Halide::Buffer< T, Dims > > |
struct | IsHalideBuffer<::Halide::Runtime::Buffer< T, Dims > > |
struct | IsRoundtrippable |
struct | JITCache |
struct | JITErrorBuffer |
struct | JITFuncCallContext |
struct | JITModule |
class | JITSharedRuntime |
class | JSONCompilerLogger |
JSONCompilerLogger is a basic implementation of the CompilerLogger interface that saves logged data, then logs it all in JSON format in emit_to_stream(). More... | |
struct | LE |
Is the first expression less than or equal to the second. More... | |
struct | Let |
A let expression, like you might find in a functional language. More... | |
struct | LetStmt |
The statement form of a let node. More... | |
struct | Load |
Load a value from a named symbol if predicate is true. More... | |
struct | LoweredArgument |
Definition of an argument to a LoweredFunc. More... | |
struct | LoweredFunc |
Definition of a lowered function. More... | |
struct | LT |
Is the first expression less than the second. More... | |
struct | Max |
The greater of two values. More... | |
struct | meta_and |
struct | meta_and< T1, Args... > |
struct | meta_or |
struct | meta_or< T1, Args... > |
struct | Min |
The lesser of two values. More... | |
struct | Mod |
The remainder of a / b. More... | |
struct | ModulusRemainder |
The result of modulus_remainder analysis. More... | |
struct | Mul |
The product of two expressions. More... | |
struct | NE |
Is the first expression not equal to the second. More... | |
struct | NoRealizations |
struct | NoRealizations< T, Args... > |
struct | NoRealizations<> |
struct | Not |
Logical not - true if the expression is false. More... | |
class | ObjectInstanceRegistry |
struct | Or |
Logical or - is at least one of the expressions true. More... | |
struct | OutputInfo |
struct | PipelineFeatures |
struct | Prefetch |
Represent a multi-dimensional region of a Func or an ImageParam that needs to be prefetched. More... | |
struct | PrefetchDirective |
struct | PrintSpan |
Allow easily printing the contents of containers, or std::vector-like containers, in debug output. More... | |
struct | PrintSpanLn |
Allow easily printing the contents of spans, or std::vector-like spans, in debug output. More... | |
struct | ProducerConsumer |
This node is a helpful annotation to do with permissions. More... | |
struct | Provide |
This defines the value of a function at a multi-dimensional location. More... | |
class | PythonExtensionGen |
struct | Ramp |
A linear ramp vector node. More... | |
struct | Realize |
Allocate a multi-dimensional buffer of the given type and size. More... | |
class | ReductionDomain |
A reference-counted handle on a reduction domain, which is just a vector of ReductionVariable. More... | |
struct | ReductionVariable |
A single named dimension of a reduction domain. More... | |
struct | ReductionVariableInfo |
Return a list of reduction variables the expression or tuple depends on. More... | |
class | RefCount |
A class representing a reference count to be used with IntrusivePtr. More... | |
struct | RegionCosts |
Auto scheduling component which is used to assign costs for computing a region of a function or one of its stages. More... | |
class | RegisterGenerator |
struct | Reinterpret |
Reinterpret value as another type, without affecting any of the bits (on little-endian systems). More... | |
struct | ScheduleFeatures |
class | Scope |
A common pattern when traversing Halide IR is that you need to keep track of stuff when you find a Let or a LetStmt, and that it should hide previous values with the same name until you leave the Let or LetStmt nodes. This class helps with that. More... | |
struct | ScopedBinding |
Helper class for pushing/popping Scope<> values, to allow for early-exit in Visitor/Mutators that preserves correctness. More... | |
struct | ScopedBinding< void > |
struct | ScopedValue |
Helper class for saving/restoring variable values on the stack, to allow for early-exit that preserves correctness. More... | |
struct | Select |
A ternary operator. More... | |
struct | select_type |
struct | select_type< First > |
struct | Shuffle |
Construct a new vector by taking elements from another sequence of vectors. More... | |
class | Simplify |
class | SmallStack |
A stack which can store one item very efficiently. More... | |
class | SmallStack< void > |
struct | SolverResult |
struct | Specialization |
struct | Split |
class | StageSchedule |
A schedule for a single stage of a Halide pipeline. More... | |
struct | StaticCast |
struct | Stmt |
A reference-counted handle to a statement node. More... | |
struct | StmtNode |
struct | StorageDim |
Properties of one axis of the storage of a Func. More... | |
struct | Store |
Store a 'value' to the buffer called 'name' at a given 'index' if 'predicate' is true. More... | |
struct | StringImm |
String constants. More... | |
class | StubInput |
class | StubInputBuffer |
StubInputBuffer is the placeholder that a Stub uses when it requires a Buffer for an input (rather than merely a Func or Expr). More... | |
class | StubOutputBuffer |
StubOutputBuffer is the placeholder that a Stub uses when it requires a Buffer for an output (rather than merely a Func). More... | |
class | StubOutputBufferBase |
struct | Sub |
The difference of two expressions. More... | |
class | TemporaryFile |
A simple utility class that creates a temporary file in its ctor and deletes that file in its dtor; this is useful for temporary files that you want to ensure are deleted when exiting a certain scope. More... | |
struct | type_sink |
struct | UIntImm |
Unsigned integer constants. More... | |
struct | Variable |
A named variable. More... | |
class | VariadicVisitor |
A visitor/mutator capable of passing arbitrary arguments to the visit methods using CRTP and returning any types from them. More... | |
struct | VectorReduce |
Horizontally reduce a vector to a scalar or narrower vector using the given commutative and associative binary operator. More... | |
class | Voidifier |
struct | WasmModule |
Handle to compiled wasm code which can be called later. More... | |
struct | Weights |
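The Scope and ScopedBinding entries above describe a binding table in which an inner Let hides an outer value of the same name until the Let is left. The following is a minimal standalone sketch of that idea, not Halide's implementation: `SimpleScope` and `SimpleScopedBinding` are hypothetical names, and the per-name stack is an assumption about one reasonable way to get the hiding behavior.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the idea behind Scope<T>; not Halide's class.
template<typename T>
class SimpleScope {
    // Each name maps to a stack of bindings; the back is the innermost one.
    std::map<std::string, std::vector<T>> table;

public:
    // Entering a Let: the new binding hides any outer binding of the same name.
    void push(const std::string &name, T value) {
        table[name].push_back(std::move(value));
    }

    // Leaving a Let: the previous binding (if any) becomes visible again.
    void pop(const std::string &name) {
        auto it = table.find(name);
        assert(it != table.end() && !it->second.empty());
        it->second.pop_back();
        if (it->second.empty()) {
            table.erase(it);
        }
    }

    bool contains(const std::string &name) const {
        return table.count(name) != 0;
    }

    // Return the innermost binding; the name must currently be in scope.
    const T &get(const std::string &name) const {
        auto it = table.find(name);
        assert(it != table.end());
        return it->second.back();
    }
};

// RAII helper in the spirit of ScopedBinding: the destructor pops the binding,
// so an early return from a visitor cannot leave a stale entry behind.
template<typename T>
struct SimpleScopedBinding {
    SimpleScope<T> &scope;
    std::string name;

    SimpleScopedBinding(SimpleScope<T> &s, const std::string &n, T value)
        : scope(s), name(n) {
        scope.push(name, std::move(value));
    }
    ~SimpleScopedBinding() {
        scope.pop(name);
    }
    SimpleScopedBinding(const SimpleScopedBinding &) = delete;
    SimpleScopedBinding &operator=(const SimpleScopedBinding &) = delete;
};
```

In a visitor, one would typically create a `SimpleScopedBinding` at the top of the method that handles a Let, visit the body, and rely on the destructor to restore the outer binding on every exit path.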
Typedefs | |
using | AbstractGeneratorPtr = std::unique_ptr<AbstractGenerator> |
typedef std::map< std::string, Interval > | DimBounds |
typedef std::map< std::pair< std::string, int >, Interval > | FuncValueBounds |
template<typename T , typename T2 > | |
using | add_const_if_T_is_const = typename std::conditional<std::is_const<T>::value, const T2, T2>::type |
template<typename T > | |
using | GeneratorParamImplBase |
template<typename T , typename TBase = typename std::remove_all_extents<T>::type> | |
using | GeneratorInputImplBase |
template<typename T , typename TBase = typename std::remove_all_extents<T>::type> | |
using | GeneratorOutputImplBase |
using | GeneratorFactory = std::function<AbstractGeneratorPtr(const GeneratorContext &context)> |
typedef llvm::raw_pwrite_stream | LLVMOStream |
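The `add_const_if_T_is_const` alias listed above is fully spelled out, so its behavior can be checked directly. The sketch below reproduces the alias so that it stands alone and adds a hypothetical `ByteView` type (an assumption made purely for illustration) whose pointer is const exactly when `T` is const.

```cpp
#include <cassert>
#include <type_traits>

// Same definition as the alias in the listing, repeated so this compiles alone.
template<typename T, typename T2>
using add_const_if_T_is_const =
    typename std::conditional<std::is_const<T>::value, const T2, T2>::type;

// T is const, so const is added to T2.
static_assert(std::is_same<add_const_if_T_is_const<const int, float>, const float>::value,
              "const on T propagates to T2");
// T is not const, so T2 is returned unchanged.
static_assert(std::is_same<add_const_if_T_is_const<int, float>, float>::value,
              "non-const T leaves T2 alone");

// Hypothetical use: a byte-level view that preserves the constness of T.
template<typename T>
struct ByteView {
    using byte_t = add_const_if_T_is_const<T, unsigned char>;
    byte_t *bytes;

    explicit ByteView(T *p)
        : bytes(reinterpret_cast<byte_t *>(p)) {
    }
};
```

With this, `ByteView<const int>` exposes a `const unsigned char *` and rejects writes at compile time, while `ByteView<int>` exposes a writable `unsigned char *`.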
Enumerations | |
enum class | ArgInfoKind { Scalar , Function , Buffer } |
enum class | ArgInfoDirection { Input , Output } |
enum class | Direction { Upper , Lower } |
Given a varying expression, try to find a constant that is either an upper bound (always greater than or equal to the expression) or a lower bound (always less than or equal to the expression). If it fails, returns an undefined Expr. More... | |
enum class | IRNodeType { IntImm , UIntImm , FloatImm , StringImm , Broadcast , Cast , Reinterpret , Variable , Add , Sub , Mod , Mul , Div , Min , Max , EQ , NE , LT , LE , GT , GE , And , Or , Not , Select , Load , Ramp , Call , Let , Shuffle , VectorReduce , LetStmt , AssertStmt , ProducerConsumer , For , Acquire , Store , Provide , Allocate , Free , Realize , Block , Fork , IfThenElse , Evaluate , Prefetch , Atomic , HoistedStorage } |
All our IR node types get unique IDs for the purposes of RTTI. More... | |
enum class | ForType { Serial , Parallel , Vectorized , Unrolled , Extern , GPUBlock , GPUThread , GPULane } |
An enum describing a type of loop traversal. More... | |
enum class | SyntheticParamType { Type , Dim , ArraySize } |
enum class | Monotonic { Constant , Increasing , Decreasing , Unknown } |
Detect whether an expression is monotonic increasing in a variable, decreasing, or unknown. More... | |
enum class | DimType { PureVar = 0 , PureRVar , ImpureRVar } |
Each Dim below has a dim_type, which tells you what transformations are legal on it. More... | |
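The IRNodeType entry above says each IR node type gets a unique ID for RTTI, and the ExprNode entry mentions the curiously recurring template pattern. A rough standalone sketch of how those two ideas can combine follows. All names here (`NodeType`, `Node`, `NodeOf`, `static_type`, `IntImmNode`, `AddNode`) are hypothetical and the enum is reduced to two entries; this is not a copy of Halide's classes.

```cpp
#include <cassert>

// Hypothetical, reduced tag enum; the real IRNodeType has many more entries.
enum class NodeType { IntImm, Add };

// The base node stores its tag, so a checked downcast needs no dynamic_cast.
struct Node {
    const NodeType node_type;

    explicit Node(NodeType t) : node_type(t) {
    }
    virtual ~Node() = default;

    // Return this node as a T if its tag matches T's tag, otherwise nullptr.
    template<typename T>
    const T *as() const {
        return node_type == T::static_type ? static_cast<const T *>(this) : nullptr;
    }
};

// CRTP layer: each concrete node passes itself as T, and the tag is filled in
// from T::static_type, so no node has to repeat the constructor boilerplate.
template<typename T>
struct NodeOf : Node {
    NodeOf() : Node(T::static_type) {
    }
};

struct IntImmNode : NodeOf<IntImmNode> {
    static constexpr NodeType static_type = NodeType::IntImm;
    long long value = 0;
};

struct AddNode : NodeOf<AddNode> {
    static constexpr NodeType static_type = NodeType::Add;
    const Node *a = nullptr;
    const Node *b = nullptr;
};
```

The design trade-off is that the enum must list every node type up front, in exchange for a downcast that is a single integer comparison.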
Functions | |
Stmt | add_atomic_mutex (Stmt s, const std::vector< Function > &outputs) |
Stmt | add_image_checks (const Stmt &s, const std::vector< Function > &outputs, const Target &t, const std::vector< std::string > &order, const std::map< std::string, Function > &env, const FuncValueBounds &fb, bool will_inject_host_copies) |
Insert checks to make sure a statement doesn't read out of bounds on inputs or outputs, and that the inputs and outputs conform to the format required (e.g. | |
Stmt | add_parameter_checks (const std::vector< Stmt > &requirements, Stmt s, const Target &t) |
Insert checks to make sure that all referenced parameters meet their constraints. | |
Stmt | add_split_factor_checks (const Stmt &s, const std::map< std::string, Function > &env) |
Insert checks that all split factors that depend on scalar parameters are strictly positive. | |
Stmt | align_loads (const Stmt &s, int alignment, int min_bytes_to_align) |
Attempt to rewrite unaligned loads from buffers which are known to be aligned to instead load aligned vectors that cover the original load, and then slice the original load out of the aligned vectors. | |
Stmt | allocation_bounds_inference (Stmt s, const std::map< std::string, Function > &env, const std::map< std::pair< std::string, int >, Interval > &func_bounds) |
Take a partially lowered statement with Realize nodes in terms of variables, and define values for those variables. | |
std::vector< ApplySplitResult > | apply_split (const Split &split, bool is_update, const std::string &prefix, std::map< std::string, Expr > &dim_extent_alignment) |
Given a Split schedule on a definition (init or update), return a list of predicates on the definition, substitutions that need to be applied to the definition (in ascending order of application), and let stmts which define the values of variables referred to by the predicates and substitutions (ordered from innermost to outermost let). | |
std::vector< std::pair< std::string, Expr > > | compute_loop_bounds_after_split (const Split &split, const std::string &prefix) |
Compute the loop bounds of the new dimensions resulting from applying the split schedules using the loop bounds of the old dimensions. | |
const std::vector< AssociativePattern > & | get_ops_table (const std::vector< Expr > &exprs) |
AssociativeOp | prove_associativity (const std::string &f, std::vector< Expr > args, std::vector< Expr > exprs) |
Given an update definition of a Func 'f', determine its equivalent associative binary/unary operator if there is any. | |
void | associativity_test () |
Stmt | fork_async_producers (Stmt s, const std::map< std::string, Function > &env) |
int | string_to_int (const std::string &s) |
Return an int representation of 's'. | |
Expr | substitute_var_estimates (Expr e) |
Substitute every variable in an Expr or a Stmt with its estimate if specified. | |
Stmt | substitute_var_estimates (Stmt s) |
Expr | get_extent (const Interval &i) |
Return the size of an interval. | |
Expr | box_size (const Box &b) |
Return the size of an n-d box. | |
void | disp_regions (const std::map< std::string, Box > &regions) |
Helper function to print the bounds of a region. | |
Definition | get_stage_definition (const Function &f, int stage_num) |
Return the corresponding definition of a function given the stage. | |
std::vector< Dim > & | get_stage_dims (const Function &f, int stage_num) |
Return the corresponding loop dimensions of a function given the stage. | |
void | combine_load_costs (std::map< std::string, Expr > &result, const std::map< std::string, Expr > &partial) |
Add partial load costs to the corresponding function in the result costs. | |
DimBounds | get_stage_bounds (const Function &f, int stage_num, const DimBounds &pure_bounds) |
Return the required bounds of an intermediate stage (f, stage_num) of function 'f' given the bounds of the pure dimensions. | |
std::vector< DimBounds > | get_stage_bounds (const Function &f, const DimBounds &pure_bounds) |
Return the required bounds for all the stages of the function 'f'. | |
Expr | perform_inline (Expr e, const std::map< std::string, Function > &env, const std::set< std::string > &inlines=std::set< std::string >(), const std::vector< std::string > &order=std::vector< std::string >()) |
Recursively inline all the functions in the set 'inlines' into the expression 'e' and return the resulting expression. | |
std::set< std::string > | get_parents (Function f, int stage) |
Return all functions that are directly called by a function stage (f, stage). | |
template<typename K , typename V > | |
V | get_element (const std::map< K, V > &m, const K &key) |
Return value of element within a map. | |
template<typename K , typename V > | |
V & | get_element (std::map< K, V > &m, const K &key) |
bool | inline_all_trivial_functions (const std::vector< Function > &outputs, const std::vector< std::string > &order, const std::map< std::string, Function > &env) |
If the cost of computing a Func is about the same as calling the Func, inline the Func. | |
std::string | is_func_called_element_wise (const std::vector< std::string > &order, size_t index, const std::map< std::string, Function > &env) |
Determine if a Func (order[index]) is only consumed by another single Func in an element-wise manner. | |
bool | inline_all_element_wise_functions (const std::vector< Function > &outputs, const std::vector< std::string > &order, const std::map< std::string, Function > &env) |
Inline a Func if its values are only consumed by another single Func in an element-wise manner. | |
void | propagate_estimate_test () |
Stmt | bound_constant_extent_loops (const Stmt &s) |
Replace all loop extents of unrolled or vectorized loops with constants, by substituting and simplifying as needed. | |
const FuncValueBounds & | empty_func_value_bounds () |
Interval | bounds_of_expr_in_scope (const Expr &expr, const Scope< Interval > &scope, const FuncValueBounds &func_bounds=empty_func_value_bounds(), bool const_bound=false) |
Given an expression in some variables, and a map from those variables to their bounds (in the form of (minimum possible value, maximum possible value)), compute two expressions that give the minimum possible value and the maximum possible value of this expression. | |
Expr | find_constant_bound (const Expr &e, Direction d, const Scope< Interval > &scope=Scope< Interval >::empty_scope()) |
Interval | find_constant_bounds (const Expr &e, const Scope< Interval > &scope) |
Find bounds for a varying expression that are either constants or +/-inf. | |
void | merge_boxes (Box &a, const Box &b) |
Expand box a to encompass box b. | |
bool | boxes_overlap (const Box &a, const Box &b) |
Test if box a could possibly overlap box b. | |
Box | box_union (const Box &a, const Box &b) |
The union of two boxes. | |
Box | box_intersection (const Box &a, const Box &b) |
The intersection of two boxes. | |
bool | box_contains (const Box &a, const Box &b) |
Test if box a provably contains box b. | |
std::map< std::string, Box > | boxes_required (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Compute rectangular domains large enough to cover all the 'Call's to each function that occurs within a given statement or expression. | |
std::map< std::string, Box > | boxes_required (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
std::map< std::string, Box > | boxes_provided (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Compute rectangular domains large enough to cover all the 'Provide's to each function that occurs within a given statement or expression. | |
std::map< std::string, Box > | boxes_provided (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
std::map< std::string, Box > | boxes_touched (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Compute rectangular domains large enough to cover all the 'Call's and 'Provide's to each function that occurs within a given statement or expression. | |
std::map< std::string, Box > | boxes_touched (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_required (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Variants of the above that are only concerned with a single function. | |
Box | box_required (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_provided (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_provided (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_touched (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
Box | box_touched (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds()) |
FuncValueBounds | compute_function_value_bounds (const std::vector< std::string > &order, const std::map< std::string, Function > &env) |
Compute the maximum and minimum possible value for each function in an environment. | |
Expr | span_of_bounds (const Interval &bounds) |
void | bounds_test () |
Stmt | bounds_inference (Stmt, const std::vector< Function > &outputs, const std::vector< std::string > &realization_order, const std::vector< std::vector< std::string > > &fused_groups, const std::map< std::string, Function > &environment, const std::map< std::pair< std::string, int >, Interval > &func_bounds, const Target &target) |
Take a partially lowered statement that includes symbolic representations of the bounds over which things should be realized, and inject expressions defining those bounds. | |
Stmt | bound_small_allocations (const Stmt &s) |
Expr | buffer_accessor (const Buffer<> &buf, const std::vector< Expr > &args) |
template<typename T , typename = typename std::enable_if<!std::is_convertible<T, std::string>::value>::type> | |
std::string | get_name_from_end_of_parameter_pack (T &&) |
std::string | get_name_from_end_of_parameter_pack (const std::string &n) |
std::string | get_name_from_end_of_parameter_pack () |
template<typename First , typename Second , typename... Args> | |
std::string | get_name_from_end_of_parameter_pack (First first, Second second, Args &&...rest) |
void | get_shape_from_start_of_parameter_pack_helper (std::vector< int > &, const std::string &) |
void | get_shape_from_start_of_parameter_pack_helper (std::vector< int > &) |
template<typename... Args> | |
void | get_shape_from_start_of_parameter_pack_helper (std::vector< int > &result, int x, Args &&...rest) |
template<typename... Args> | |
std::vector< int > | get_shape_from_start_of_parameter_pack (Args &&...args) |
template<typename T > | |
void | buffer_type_name_non_const (std::ostream &s) |
template<> | |
void | buffer_type_name_non_const< void > (std::ostream &s) |
template<typename T > | |
std::string | buffer_type_name () |
Stmt | canonicalize_gpu_vars (Stmt s) |
Canonicalize GPU var names into some pre-determined block/thread names (i.e. | |
const std::string & | gpu_thread_name (int index) |
Names for the thread and block id variables. | |
const std::string & | gpu_block_name (int index) |
Stmt | clamp_unsafe_accesses (const Stmt &s, const std::map< std::string, Function > &env, FuncValueBounds &func_bounds) |
Inject clamps around func calls h(...) when all the following conditions hold: | |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_D3D12Compute_Dev (const Target &target) |
llvm::Type * | get_vector_element_type (llvm::Type *) |
Get the scalar type of an llvm vector type. | |
bool | function_takes_user_context (const std::string &name) |
Which built-in functions require a user-context first argument? | |
bool | can_allocation_fit_on_stack (int64_t size) |
Given a size (in bytes), return true if the allocation can fit on the stack; otherwise, return false. | |
std::pair< Expr, Expr > | long_div_mod_round_to_zero (const Expr &a, const Expr &b, const uint64_t *max_abs=nullptr) |
Does a {div/mod}_round_to_zero using binary long division for int/uint. | |
Expr | lower_mux (const Call *mux) |
Reduce a mux intrinsic to a select tree. | |
Expr | lower_round_to_nearest_ties_to_even (const Expr &) |
A vectorizable implementation of Halide::round that doesn't depend on any standard library being present. | |
void | get_target_options (const llvm::Module &module, llvm::TargetOptions &options) |
Given an llvm::Module, set llvm::TargetOptions information. | |
void | clone_target_options (const llvm::Module &from, llvm::Module &to) |
Given two llvm::Modules, clone target options from one to the other. | |
std::unique_ptr< llvm::TargetMachine > | make_target_machine (const llvm::Module &module) |
Given an llvm::Module, get or create an llvm::TargetMachine. | |
void | set_function_attributes_from_halide_target_options (llvm::Function &) |
Set the appropriate llvm Function attributes given the Halide Target. | |
void | embed_bitcode (llvm::Module *M, const std::string &halide_command) |
Save a copy of the llvm IR currently represented by the module as data in the __LLVM,__bitcode section. | |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_Metal_Dev (const Target &target) |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_OpenCL_Dev (const Target &target) |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_PTX_Dev (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_ARM (const Target &target) |
Construct CodeGen object for a variety of targets. | |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_Hexagon (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_PowerPC (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_RISCV (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_X86 (const Target &target) |
std::unique_ptr< CodeGen_Posix > | new_CodeGen_WebAssembly (const Target &target) |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_Vulkan_Dev (const Target &target) |
std::unique_ptr< CodeGen_GPU_Dev > | new_CodeGen_WebGPU_Dev (const Target &target) |
std::unique_ptr< CompilerLogger > | set_compiler_logger (std::unique_ptr< CompilerLogger > compiler_logger) |
Set the active CompilerLogger object, replacing any existing one. | |
CompilerLogger * | get_compiler_logger () |
Return the currently active CompilerLogger object. | |
ConstantInterval | constant_integer_bounds (const Expr &e, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope(), std::map< Expr, ConstantInterval, ExprCompare > *cache=nullptr) |
Deduce constant integer bounds on an expression. | |
ConstantInterval | operator+ (const ConstantInterval &a, const ConstantInterval &b) |
Arithmetic operators on ConstantIntervals. | |
ConstantInterval | operator+ (const ConstantInterval &a, int64_t b) |
ConstantInterval | operator- (const ConstantInterval &a, const ConstantInterval &b) |
ConstantInterval | operator- (const ConstantInterval &a, int64_t b) |
ConstantInterval | operator/ (const ConstantInterval &a, const ConstantInterval &b) |
ConstantInterval | operator/ (const ConstantInterval &a, int64_t b) |
ConstantInterval | operator* (const ConstantInterval &a, const ConstantInterval &b) |
ConstantInterval | operator* (const ConstantInterval &a, int64_t b) |
ConstantInterval | operator% (const ConstantInterval &a, const ConstantInterval &b) |
ConstantInterval | operator% (const ConstantInterval &a, int64_t b) |
ConstantInterval | min (const ConstantInterval &a, const ConstantInterval &b) |
ConstantInterval | min (const ConstantInterval &a, int64_t b) |
ConstantInterval | max (const ConstantInterval &a, const ConstantInterval &b) |
ConstantInterval | max (const ConstantInterval &a, int64_t b) |
ConstantInterval | abs (const ConstantInterval &a) |
ConstantInterval | operator<< (const ConstantInterval &a, const ConstantInterval &b) |
ConstantInterval | operator<< (const ConstantInterval &a, int64_t b) |
ConstantInterval | operator<< (int64_t a, const ConstantInterval &b) |
ConstantInterval | operator>> (const ConstantInterval &a, const ConstantInterval &b) |
ConstantInterval | operator>> (const ConstantInterval &a, int64_t b) |
ConstantInterval | operator>> (int64_t a, const ConstantInterval &b) |
bool | operator<= (const ConstantInterval &a, const ConstantInterval &b) |
Comparison operators on ConstantIntervals. | |
bool | operator<= (const ConstantInterval &a, int64_t b) |
bool | operator<= (int64_t a, const ConstantInterval &b) |
bool | operator< (const ConstantInterval &a, const ConstantInterval &b) |
bool | operator< (const ConstantInterval &a, int64_t b) |
bool | operator< (int64_t a, const ConstantInterval &b) |
bool | operator>= (const ConstantInterval &a, const ConstantInterval &b) |
bool | operator> (const ConstantInterval &a, const ConstantInterval &b) |
bool | operator>= (const ConstantInterval &a, int64_t b) |
bool | operator> (const ConstantInterval &a, int64_t b) |
bool | operator>= (int64_t a, const ConstantInterval &b) |
bool | operator> (int64_t a, const ConstantInterval &b) |
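The arithmetic and comparison operators listed above follow the usual pattern of interval arithmetic. The sketch below shows that pattern on a toy closed interval. `ToyInterval` is a hypothetical stand-in, not Halide's ConstantInterval: it assumes both bounds are always defined and ignores overflow. The comparison semantics (true only if the relation holds for every pair of values) are likewise an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy closed interval [lo, hi]; a stand-in for illustration only.
struct ToyInterval {
    int64_t lo, hi;
};

ToyInterval make_toy_interval(int64_t lo, int64_t hi) {
    assert(lo <= hi && "an interval needs lo <= hi");
    return ToyInterval{lo, hi};
}

// Sum: the smallest result pairs the two lower bounds, the largest the two
// upper bounds.
ToyInterval operator+(const ToyInterval &a, const ToyInterval &b) {
    return make_toy_interval(a.lo + b.lo, a.hi + b.hi);
}

// Difference: the smallest result subtracts the largest b from the smallest a.
ToyInterval operator-(const ToyInterval &a, const ToyInterval &b) {
    return make_toy_interval(a.lo - b.hi, a.hi - b.lo);
}

// Product: signs can flip the ordering, so examine all four corner products.
ToyInterval operator*(const ToyInterval &a, const ToyInterval &b) {
    int64_t p[4] = {a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi};
    return make_toy_interval(*std::min_element(p, p + 4),
                             *std::max_element(p, p + 4));
}

// Comparisons in this sketch answer: is the relation true for every value
// of a paired with every value of b?
bool operator<(const ToyInterval &a, const ToyInterval &b) {
    return a.hi < b.lo;
}

bool operator<=(const ToyInterval &a, const ToyInterval &b) {
    return a.hi <= b.lo;
}
```

Under these definitions [1, 3] + [10, 20] is [11, 23], and [1, 4] < [4, 5] is false because the value 4 appears on both sides.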
std::string | cplusplus_function_mangled_name (const std::string &name, const std::vector< std::string > &namespaces, Type return_type, const std::vector< ExternFuncArgument > &args, const Target &target) |
Return the mangled C++ name for a function. | |
void | cplusplus_mangle_test () |
Expr | common_subexpression_elimination (const Expr &, bool lift_all=false) |
Replace each common sub-expression in the argument with a variable, and wrap the resulting expr in a let statement giving a value to that variable. | |
Stmt | common_subexpression_elimination (const Stmt &, bool lift_all=false) |
Do common-subexpression-elimination on each expression in a statement. | |
void | cse_test () |
std::ostream & | operator<< (std::ostream &stream, const Stmt &) |
Emit a halide statement on an output stream (such as std::cout) in a human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const LoweredFunc &) |
Emit a halide LoweredFunc in a human-readable format. | |
template<typename T > | |
PrintSpan (const T &) -> PrintSpan< T > | |
template<typename StreamT , typename T > | |
StreamT & | operator<< (StreamT &stream, const PrintSpan< T > &wrapper) |
template<typename T > | |
PrintSpanLn (const T &) -> PrintSpanLn< T > | |
template<typename StreamT , typename T > | |
StreamT & | operator<< (StreamT &stream, const PrintSpanLn< T > &wrapper) |
void | debug_arguments (LoweredFunc *func, const Target &t) |
Injects debug prints in a LoweredFunc that describe the target and arguments. | |
Stmt | debug_to_file (Stmt s, const std::vector< Function > &outputs, const std::map< std::string, Function > &env) |
Takes a statement with Realize nodes still unlowered. | |
Expr | extract_odd_lanes (const Expr &a) |
Extract the odd-numbered lanes in a vector. | |
Expr | extract_even_lanes (const Expr &a) |
Extract the even-numbered lanes in a vector. | |
Expr | extract_lane (const Expr &vec, int lane) |
Extract the nth lane of a vector. | |
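The three lane-extraction helpers above can be pictured on a plain `std::vector`. This is a minimal sketch that assumes lanes are numbered from zero, so the "even" lanes are 0, 2, 4 and so on. The `toy_` functions are hypothetical and operate on concrete values rather than on Exprs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: keep every second lane, starting at lane `start`.
std::vector<int> toy_extract_strided(const std::vector<int> &v, std::size_t start) {
    assert(start < 2 && "start must be 0 (even lanes) or 1 (odd lanes)");
    std::vector<int> out;
    for (std::size_t i = start; i < v.size(); i += 2) {
        out.push_back(v[i]);
    }
    return out;
}

std::vector<int> toy_extract_even_lanes(const std::vector<int> &v) {
    return toy_extract_strided(v, 0);
}

std::vector<int> toy_extract_odd_lanes(const std::vector<int> &v) {
    return toy_extract_strided(v, 1);
}

// Extract a single lane by zero-based index.
int toy_extract_lane(const std::vector<int> &v, std::size_t lane) {
    assert(lane < v.size() && "lane index out of range");
    return v[lane];
}
```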
Stmt | rewrite_interleavings (const Stmt &s) |
Look through a statement for expressions of the form select(ramp % 2 == 0, a, b) and replace them with calls to an interleave intrinsic. | |
void | deinterleave_vector_test () |
Expr | remove_let_definitions (const Expr &expr) |
Remove all let definitions of expr. | |
std::vector< int > | gather_variables (const Expr &expr, const std::vector< std::string > &filter) |
Return a list of variables' indices that expr depends on and are in the filter. | |
std::vector< int > | gather_variables (const Expr &expr, const std::vector< Var > &filter) |
std::map< std::string, ReductionVariableInfo > | gather_rvariables (const Expr &expr) |
std::map< std::string, ReductionVariableInfo > | gather_rvariables (const Tuple &tuple) |
Expr | add_let_expression (const Expr &expr, const std::map< std::string, Expr > &let_var_mapping, const std::vector< std::string > &let_variables) |
Add necessary let expressions to expr. | |
std::vector< Expr > | sort_expressions (const Expr &expr) |
Topologically sort the expression graph expressed by expr. | |
std::map< std::string, Box > | inference_bounds (const std::vector< Func > &funcs, const std::vector< Box > &output_bounds) |
Compute the bounds of funcs. | |
std::map< std::string, Box > | inference_bounds (const Func &func, const Box &output_bounds) |
std::vector< std::pair< Expr, Expr > > | box_to_vector (const Box &bounds) |
Convert Box to vector of (min, extent) | |
bool | equal (const RDom &bounds0, const RDom &bounds1) |
Return true if bounds0 and bounds1 represent the same bounds. | |
std::vector< std::string > | vars_to_strings (const std::vector< Var > &vars) |
Return a list of variable names. | |
ReductionDomain | extract_rdom (const Expr &expr) |
Return the reduction domain used by expr. | |
std::pair< bool, Expr > | solve_inverse (Expr expr, const std::string &new_var, const std::string &var) |
Given expr of the form new_var == f(var), solve for var == g(new_var). If multiple values of new_var correspond to the same var, introduce an RDom. | |
std::map< std::string, BufferInfo > | find_buffer_param_calls (const Func &func) |
std::set< std::string > | find_implicit_variables (const Expr &expr) |
Find all implicit variables in expr. | |
Expr | substitute_rdom_predicate (const std::string &name, const Expr &replacement, const Expr &expr) |
Substitute the variable. | |
bool | is_calling_function (const std::string &func_name, const Expr &expr, const std::map< std::string, Expr > &let_var_mapping) |
Return true if expr contains call to func_name. | |
bool | is_calling_function (const Expr &expr, const std::map< std::string, Expr > &let_var_mapping) |
Return true if expr depends on any function or buffer. | |
Expr | substitute_call_arg_with_pure_arg (Func f, int variable_id, const Expr &e) |
Replaces call to Func f in Expr e such that the call argument at variable_id is the pure argument. | |
Expr | make_device_interface_call (DeviceAPI device_api, MemoryType memory_type=MemoryType::Auto) |
Get an Expr which evaluates to the device interface for the given device api at runtime. | |
Stmt | distribute_shifts (const Stmt &stmt, bool multiply_adds) |
Stmt | inject_early_frees (const Stmt &s) |
Take a statement with allocations and inject markers (in the form of calls to "mark buffer dead") after the last use of each allocation. | |
Type | eliminated_bool_type (Type bool_type, Type other_type) |
If a type is a boolean vector, find the type that it has been changed to by eliminate_bool_vectors. | |
bool | is_float16_transcendental (const Call *) |
Check if a call is a float16 transcendental (e.g. | |
Expr | lower_float16_transcendental_to_float32_equivalent (const Call *) |
Implement a float16 transcendental using the float32 equivalent. | |
Expr | float32_to_bfloat16 (Expr e) |
Cast to/from float and bfloat using bitwise math. | |
Expr | float32_to_float16 (Expr e) |
Expr | float16_to_float32 (Expr e) |
Expr | bfloat16_to_float32 (Expr e) |
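As an illustration of what "bitwise math" can mean for the bfloat16 casts, the sketch below converts between float32 and bfloat16 on scalars. It relies on the fact that bfloat16 is the upper 16 bits of an IEEE float32. The round-to-nearest-even step and the NaN handling are choices made for this sketch, and every `toy_` name is hypothetical; none of it is taken from Halide's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

// float32 -> bfloat16 bits, rounding to nearest with ties to even.
uint16_t toy_float32_to_bfloat16(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    bool is_nan = (bits & 0x7F800000u) == 0x7F800000u && (bits & 0x007FFFFFu) != 0;
    if (is_nan) {
        // Force a mantissa bit so truncation cannot turn a NaN into infinity.
        return static_cast<uint16_t>((bits >> 16) | 0x0040u);
    }
    // Adding 0x7FFF plus the lowest kept bit rounds the 16 dropped bits to
    // nearest, with ties going to the even result.
    uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<uint16_t>((bits + rounding_bias) >> 16);
}

// bfloat16 bits -> float32: the 16 bits become the top half, the rest is zero.
float toy_bfloat16_to_float32(uint16_t h) {
    uint32_t bits = static_cast<uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Convert a whole buffer; dst must already have the same length as src.
void toy_convert_buffer(const std::vector<float> &src, std::vector<uint16_t> &dst) {
    assert(src.size() == dst.size() && "buffers must have equal length");
    for (std::size_t i = 0; i < src.size(); i++) {
        dst[i] = toy_float32_to_bfloat16(src[i]);
    }
}
```

Values with at most eight significant bits, such as 1.0 or 3.140625, survive the round trip unchanged.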
Expr | lower_float16_cast (const Cast *op) |
HALIDE_EXPORT_SYMBOL void | unhandled_exception_handler () |
template<> | |
RefCount & | ref_count< IRNode > (const IRNode *t) noexcept |
template<> | |
void | destroy< IRNode > (const IRNode *t) |
bool | is_unordered_parallel (ForType for_type) |
Check if for_type executes for loop iterations in parallel and unordered. | |
bool | is_parallel (ForType for_type) |
Returns true if for_type executes for loop iterations in parallel. | |
bool | is_gpu (ForType for_type) |
Returns true if for_type is GPUBlock, GPUThread, or GPULane. | |
template<typename StmtOrExpr , typename T > | |
bool | stmt_or_expr_uses_vars (const StmtOrExpr &e, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if a statement or expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument. | |
template<typename StmtOrExpr > | |
bool | stmt_or_expr_uses_var (const StmtOrExpr &e, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if a statement or expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument. | |
bool | expr_uses_var (const Expr &e, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if an expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument. | |
bool | stmt_uses_var (const Stmt &stmt, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if a statement references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument. | |
template<typename T > | |
bool | expr_uses_vars (const Expr &e, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if an expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument. | |
template<typename T > | |
bool | stmt_uses_vars (const Stmt &stmt, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope()) |
Test if a statement references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument. | |
Stmt | extract_tile_operations (const Stmt &s) |
Rewrite any AMX tile operations that have been stored in the AMXTile memory type as intrinsic calls, to be used in the X86 backend. | |
std::map< std::string, Function > | find_direct_calls (const Function &f) |
Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, including in update definitions, update index expressions, and RDom extents. | |
std::map< std::string, Function > | find_transitive_calls (const Function &f) |
Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, or indirectly in those functions' definitions, recursively. | |
std::map< std::string, Function > | build_environment (const std::vector< Function > &funcs) |
Find all Functions transitively referenced by any Function in funcs and return a map of them. | |
std::vector< Function > | called_funcs_in_order_found (const std::vector< Function > &funcs) |
Returns the same Functions as build_environment, but returns a vector of Functions instead, where the order is the order in which the Functions were first encountered. | |
Expr | lower_widen_right_add (const Expr &a, const Expr &b) |
Implement intrinsics using non-intrinsic equivalents. | |
Expr | lower_widen_right_mul (const Expr &a, const Expr &b) |
Expr | lower_widen_right_sub (const Expr &a, const Expr &b) |
Expr | lower_widening_add (const Expr &a, const Expr &b) |
Expr | lower_widening_mul (const Expr &a, const Expr &b) |
Expr | lower_widening_sub (const Expr &a, const Expr &b) |
Expr | lower_widening_shift_left (const Expr &a, const Expr &b) |
Expr | lower_widening_shift_right (const Expr &a, const Expr &b) |
Expr | lower_rounding_shift_left (const Expr &a, const Expr &b) |
Expr | lower_rounding_shift_right (const Expr &a, const Expr &b) |
Expr | lower_saturating_add (const Expr &a, const Expr &b) |
Expr | lower_saturating_sub (const Expr &a, const Expr &b) |
Expr | lower_saturating_cast (const Type &t, const Expr &a) |
Expr | lower_halving_add (const Expr &a, const Expr &b) |
Expr | lower_halving_sub (const Expr &a, const Expr &b) |
Expr | lower_rounding_halving_add (const Expr &a, const Expr &b) |
Expr | lower_sorted_avg (const Expr &a, const Expr &b) |
Expr | lower_mul_shift_right (const Expr &a, const Expr &b, const Expr &q) |
Expr | lower_rounding_mul_shift_right (const Expr &a, const Expr &b, const Expr &q) |
Expr | lower_intrinsic (const Call *op) |
Replace one of the above ops with equivalent arithmetic. | |
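To give a feel for the "equivalent arithmetic" these lowerings produce, here is a scalar sketch of a few of the ops on `uint8_t`, written by widening to 16 bits and narrowing again. The semantics shown (for example, halving_add as the truncated average computed without overflow, and mul_shift_right saturating on narrowing) are assumptions of this sketch, and the `toy_` names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// widening_add: the sum computed in a type twice as wide, so it cannot wrap.
uint16_t toy_widening_add(uint8_t a, uint8_t b) {
    return static_cast<uint16_t>(uint16_t(a) + uint16_t(b));
}

// saturating_add: clamp the wide sum to the narrow type's maximum.
uint8_t toy_saturating_add(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(std::min<uint16_t>(toy_widening_add(a, b), uint16_t(255)));
}

// saturating_sub: clamp at zero instead of wrapping around.
uint8_t toy_saturating_sub(uint8_t a, uint8_t b) {
    return a > b ? static_cast<uint8_t>(a - b) : uint8_t(0);
}

// halving_add: (a + b) / 2 without intermediate overflow, rounding down.
uint8_t toy_halving_add(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(toy_widening_add(a, b) / 2);
}

// rounding_halving_add: (a + b + 1) / 2, so halves round up.
uint8_t toy_rounding_halving_add(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>((toy_widening_add(a, b) + 1) / 2);
}

// mul_shift_right: (a * b) >> q in a wide type, then a saturating narrow
// (the saturation is an assumption of this sketch).
uint8_t toy_mul_shift_right(uint8_t a, uint8_t b, unsigned q) {
    assert(q < 16 && "shift must be smaller than the wide type's bit width");
    uint32_t wide = (uint32_t(a) * uint32_t(b)) >> q;
    return static_cast<uint8_t>(std::min<uint32_t>(wide, 255u));
}
```

For example, `toy_halving_add(255, 0)` is 127 while `toy_rounding_halving_add(255, 0)` is 128.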
Stmt | find_intrinsics (const Stmt &s) |
Replace common arithmetic patterns with intrinsics. | |
Expr | find_intrinsics (const Expr &e) |
Expr | lower_intrinsics (const Expr &e) |
The reverse of find_intrinsics. | |
Stmt | lower_intrinsics (const Stmt &s) |
Stmt | flatten_nested_ramps (const Stmt &s) |
Take a statement/expression and replace nested ramps and broadcasts. | |
Expr | flatten_nested_ramps (const Expr &e) |
template<typename Last > | |
void | check_types (const Tuple &t, int idx) |
template<typename First , typename Second , typename... Rest> | |
void | check_types (const Tuple &t, int idx) |
template<typename Last > | |
void | assign_results (Realization &r, int idx, Last last) |
template<typename First , typename Second , typename... Rest> | |
void | assign_results (Realization &r, int idx, First first, Second second, Rest &&...rest) |
void | schedule_scalar (Func f) |
std::pair< std::vector< Function >, std::map< std::string, Function > > | deep_copy (const std::vector< Function > &outputs, const std::map< std::string, Function > &env) |
Deep copy an entire Function DAG. | |
Stmt | zero_gpu_loop_mins (const Stmt &s) |
Rewrite all GPU loops to have a min of zero. | |
Stmt | fuse_gpu_thread_loops (Stmt s) |
Converts Halide's GPGPU IR to the OpenCL/CUDA/Metal model. | |
Stmt | fuzz_float_stores (const Stmt &s) |
On every store of a floating point value, mask off the least-significant-bit of the mantissa. | |
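On a single float32 value, masking off the least-significant mantissa bit amounts to clearing bit 0 of the IEEE bit pattern. The sketch below shows that on scalars. `toy_mask_mantissa_lsb` and `toy_store_fuzzed` are hypothetical names, and the buffer helper only models the idea of applying the mask at each store.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

// Clear the lowest mantissa bit of a float32 value.
float toy_mask_mantissa_lsb(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    bits &= ~uint32_t(1);
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Model of a store that applies the mask before writing to the buffer.
void toy_store_fuzzed(std::vector<float> &buf, std::size_t i, float value) {
    assert(i < buf.size() && "store index out of range");
    buf[i] = toy_mask_mantissa_lsb(value);
}
```

Values whose lowest mantissa bit is already zero, such as 1.0f, pass through unchanged.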
void | generator_test () |
std::vector< Expr > | parameter_constraints (const Parameter &p) |
template<typename T > | |
HALIDE_NO_USER_CODE_INLINE std::string | enum_to_string (const std::map< std::string, T > &enum_map, const T &t) |
template<typename T > | |
T | enum_from_string (const std::map< std::string, T > &enum_map, const std::string &s) |
const std::map< std::string, Halide::Type > & | get_halide_type_enum_map () |
std::string | halide_type_to_enum_string (const Type &t) |
std::string | halide_type_to_c_source (const Type &t) |
std::string | halide_type_to_c_type (const Type &t) |
const GeneratorFactoryProvider & | get_registered_generators () |
Return a GeneratorFactoryProvider that knows about all the currently-registered C++ Generators. | |
int | generate_filter_main (int argc, char **argv) |
generate_filter_main() is a convenient wrapper for GeneratorRegistry::create() + compile_to_files(); it can be trivially wrapped by a "real" main() to produce a command-line utility for ahead-of-time filter compilation. | |
int | generate_filter_main (int argc, char **argv, const GeneratorFactoryProvider &generator_factory_provider) |
This overload of generate_filter_main lets you provide your own provider for how to enumerate and/or create the generators based on registration name; this is useful if you want to re-use the 'main' logic but avoid the global Generator registry (e.g. | |
template<typename T > | |
T | parse_scalar (const std::string &value) |
std::vector< Type > | parse_halide_type_list (const std::string &types) |
void | execute_generator (const ExecuteGeneratorArgs &args) |
Execute a Generator for AOT compilation – this provides the implementation of the command-line Generator interface generate_filter_main() , but with a structured API that is more suitable for calling directly from code (vs command line). | |
Stmt | inject_hexagon_rpc (Stmt s, const Target &host_target, Module &module) |
Pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module. | |
Buffer< uint8_t > | compile_module_to_hexagon_shared_object (const Module &device_code) |
Stmt | optimize_hexagon_shuffles (const Stmt &s, int lut_alignment) |
Replace indirect and other loads with simple loads + vlut calls. | |
Stmt | scatter_gather_generator (Stmt s) |
Stmt | optimize_hexagon_instructions (Stmt s, const Target &t) |
Hexagon deinterleaves when performing widening operations, and interleaves when performing narrowing operations. | |
Expr | native_deinterleave (const Expr &x) |
Generate deinterleave or interleave operations, operating on groups of vectors at a time. | |
Expr | native_interleave (const Expr &x) |
bool | is_native_deinterleave (const Expr &x) |
bool | is_native_interleave (const Expr &x) |
std::string | type_suffix (Type type, bool signed_variants=true) |
std::string | type_suffix (const Expr &a, bool signed_variants=true) |
std::string | type_suffix (const Expr &a, const Expr &b, bool signed_variants=true) |
std::string | type_suffix (const std::vector< Expr > &ops, bool signed_variants=true) |
std::vector< InferredArgument > | infer_arguments (const Stmt &body, const std::vector< Function > &outputs) |
Stmt | call_extern_and_assert (const std::string &name, const std::vector< Expr > &args) |
A helper function to call an extern function, and assert that it returns 0. | |
Stmt | inject_host_dev_buffer_copies (Stmt s, const Target &t) |
Inject calls to halide_device_malloc, halide_copy_to_device, and halide_copy_to_host as needed. | |
Stmt | inline_function (Stmt s, const Function &f) |
Inline a single named function, which must be pure. | |
Expr | inline_function (Expr e, const Function &f) |
void | inline_function (Function caller, const Function &f) |
void | validate_schedule_inlined_function (Function f) |
Check if the schedule of an inlined function is legal, throwing an error if it is not. | |
template<typename T > | |
RefCount & | ref_count (const T *t) noexcept |
Because in this header we don't yet know how client classes store their RefCount (and we don't want to depend on the declarations of the client classes), any class that you want to hold onto via one of these must provide implementations of ref_count and destroy, which we forward-declare here. | |
template<typename T > | |
void | destroy (const T *t) |
bool | equal_impl (const IRNode &a, const IRNode &b) |
bool | graph_equal_impl (const IRNode &a, const IRNode &b) |
bool | less_than_impl (const IRNode &a, const IRNode &b) |
bool | graph_less_than_impl (const IRNode &a, const IRNode &b) |
HALIDE_ALWAYS_INLINE bool | equal (const Expr &a, int b) |
Compare an Expr to an int literal. | |
HALIDE_ALWAYS_INLINE bool | equal (const IRNode &a, const IRNode &b) |
Check if two defined Stmts or Exprs are equal. | |
HALIDE_ALWAYS_INLINE bool | equal (const IRHandle &a, const IRHandle &b) |
Check if two possibly-undefined Stmts or Exprs are equal. | |
HALIDE_ALWAYS_INLINE bool | graph_equal (const IRNode &a, const IRNode &b) |
Check if two defined Stmts or Exprs are equal. | |
HALIDE_ALWAYS_INLINE bool | graph_equal (const IRHandle &a, const IRHandle &b) |
Check if two possibly-undefined Stmts or Exprs are equal. | |
HALIDE_ALWAYS_INLINE bool | less_than (const IRNode &a, const IRNode &b) |
Check if two defined Stmts or Exprs are in a lexicographic order. | |
HALIDE_ALWAYS_INLINE bool | less_than (const IRHandle &a, const IRHandle &b) |
Check if two possibly-undefined Stmts or Exprs are in a lexicographic order. | |
HALIDE_ALWAYS_INLINE bool | graph_less_than (const IRNode &a, const IRNode &b) |
Check if two defined Stmts or Exprs are in a lexicographic order. | |
HALIDE_ALWAYS_INLINE bool | graph_less_than (const IRHandle &a, const IRHandle &b) |
Check if two possibly-undefined Stmts or Exprs are in a lexicographic order. | |
void | ir_equality_test () |
bool | expr_match (const Expr &pattern, const Expr &expr, std::vector< Expr > &result) |
Does the first expression have the same structure as the second? Variables in the first expression with the name * are interpreted as wildcards, and their matching equivalent in the second expression is placed in the vector given as the third argument. | |
bool | expr_match (const Expr &pattern, const Expr &expr, std::map< std::string, Expr > &result) |
Does the first expression have the same structure as the second? Variables are matched consistently. | |
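The wildcard behaviour described for the vector overload of expr_match can be pictured on a toy expression tree. The sketch below is not Halide's matcher: `ToyNode`, `toy_leaf`, `toy_op`, and `toy_expr_match` are hypothetical, types are ignored, and clearing the result vector on failure is a choice made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression node. A leaf has no children and `name` is a variable or
// literal; an interior node uses `name` as its operator, e.g. "+".
struct ToyNode {
    std::string name;
    std::vector<std::shared_ptr<ToyNode>> kids;
};
using ToyPtr = std::shared_ptr<ToyNode>;

ToyPtr toy_leaf(const std::string &name) {
    return std::make_shared<ToyNode>(ToyNode{name, {}});
}

ToyPtr toy_op(const std::string &name, ToyPtr a, ToyPtr b) {
    return std::make_shared<ToyNode>(ToyNode{name, {std::move(a), std::move(b)}});
}

// Walk both trees in lockstep. A leaf named "*" in the pattern matches any
// subtree, which is appended to `result` in left-to-right order.
static bool toy_match_rec(const ToyPtr &pattern, const ToyPtr &expr,
                          std::vector<ToyPtr> &result) {
    assert(pattern && expr);
    if (pattern->kids.empty() && pattern->name == "*") {
        result.push_back(expr);
        return true;
    }
    if (pattern->name != expr->name || pattern->kids.size() != expr->kids.size()) {
        return false;
    }
    for (std::size_t i = 0; i < pattern->kids.size(); i++) {
        if (!toy_match_rec(pattern->kids[i], expr->kids[i], result)) {
            return false;
        }
    }
    return true;
}

bool toy_expr_match(const ToyPtr &pattern, const ToyPtr &expr,
                    std::vector<ToyPtr> &result) {
    result.clear();
    if (toy_match_rec(pattern, expr, result)) {
        return true;
    }
    result.clear();  // do not leak partial matches on failure
    return false;
}
```

Matching the pattern `* + (* * 2)` against `x + (y * 2)` succeeds in this sketch and leaves `x` and `y` in the result vector, in that order.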
Expr | with_lanes (const Expr &x, int lanes) |
Rewrite the expression x to have lanes lanes. | |
void | expr_match_test () |
template<typename Mutator , typename... Args> | |
std::pair< Region, bool > | mutate_region (Mutator *mutator, const Region &bounds, Args &&...args) |
A helper function for mutator-like things to mutate regions. | |
bool | is_const (const Expr &e) |
Is the expression either an IntImm, a FloatImm, a StringImm, or a Cast of the same, or a Ramp or Broadcast of the same. | |
bool | is_const (const Expr &e, int64_t v) |
Is the expression an IntImm, FloatImm of a particular value, or a Cast, or Broadcast of the same. | |
const int64_t * | as_const_int (const Expr &e) |
If an expression is an IntImm or a Broadcast of an IntImm, return a pointer to its value. | |
const uint64_t * | as_const_uint (const Expr &e) |
If an expression is a UIntImm or a Broadcast of a UIntImm, return a pointer to its value. | |
const double * | as_const_float (const Expr &e) |
If an expression is a FloatImm or a Broadcast of a FloatImm, return a pointer to its value. | |
bool | is_const_power_of_two_integer (const Expr &e, int *bits) |
Is the expression a constant integer power of two. | |
bool | is_positive_const (const Expr &e) |
Is the expression a const (as defined by is_const), and also strictly greater than zero (in all lanes, if a vector expression) | |
bool | is_negative_const (const Expr &e) |
Is the expression a const (as defined by is_const), and also strictly less than zero (in all lanes, if a vector expression) | |
bool | is_undef (const Expr &e) |
Is the expression an undef. | |
bool | is_const_zero (const Expr &e) |
Is the expression a const (as defined by is_const), and also equal to zero (in all lanes, if a vector expression) | |
bool | is_const_one (const Expr &e) |
Is the expression a const (as defined by is_const), and also equal to one (in all lanes, if a vector expression) | |
bool | is_no_op (const Stmt &s) |
Is the statement a no-op (which we represent as either an undefined Stmt, or as an Evaluate node of a constant) | |
bool | is_pure (const Expr &e) |
Does the expression 1) take on the same value no matter where it appears in a Stmt, and 2) have no side effects when evaluated. | |
Expr | make_const (Type t, int64_t val) |
Construct an immediate of the given type from any numeric C++ type. | |
Expr | make_const (Type t, uint64_t val) |
Expr | make_const (Type t, double val) |
Expr | make_const (Type t, int32_t val) |
Expr | make_const (Type t, uint32_t val) |
Expr | make_const (Type t, int16_t val) |
Expr | make_const (Type t, uint16_t val) |
Expr | make_const (Type t, int8_t val) |
Expr | make_const (Type t, uint8_t val) |
Expr | make_const (Type t, bool val) |
Expr | make_const (Type t, float val) |
Expr | make_const (Type t, float16_t val) |
Expr | make_signed_integer_overflow (Type type) |
Construct a unique signed_integer_overflow Expr. | |
bool | is_signed_integer_overflow (const Expr &expr) |
Check if an expression is a signed_integer_overflow. | |
void | check_representable (Type t, int64_t val) |
Check if a constant value can be correctly represented as the given type. | |
Expr | make_bool (bool val, int lanes=1) |
Construct a boolean constant from a C++ boolean value. | |
Expr | make_zero (Type t) |
Construct the representation of zero in the given type. | |
Expr | make_one (Type t) |
Construct the representation of one in the given type. | |
Expr | make_two (Type t) |
Construct the representation of two in the given type. | |
Expr | const_true (int lanes=1) |
Construct the constant boolean true. | |
Expr | const_false (int lanes=1) |
Construct the constant boolean false. | |
Expr | lossless_cast (Type t, Expr e, std::map< Expr, ConstantInterval, ExprCompare > *cache=nullptr) |
Attempt to cast an expression to a smaller type while provably not losing information. | |
Expr | lossless_negate (const Expr &x) |
Attempt to negate x without introducing new IR and without overflow. | |
void | match_types (Expr &a, Expr &b) |
Coerce the two expressions to have the same type, using C-style casting rules. | |
void | match_types_bitwise (Expr &a, Expr &b, const char *op_name) |
Asserts that both expressions are integer types and are either both signed or both unsigned. | |
Expr | halide_log (const Expr &a) |
Halide's vectorizable transcendentals. | |
Expr | halide_exp (const Expr &a) |
Expr | halide_erf (const Expr &a) |
Expr | raise_to_integer_power (Expr a, int64_t b) |
Raise an expression to an integer power by repeatedly multiplying it by itself. | |
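A scalar analogue of raising a value to an integer power through multiplication alone is sketched below. It uses exponentiation by squaring, which is one common way to organise the repeated multiplications and is a choice made for this example. `toy_raise_to_integer_power` is a hypothetical name, and negative exponents are left out of scope.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar helper: compute a^b using only multiplication.
double toy_raise_to_integer_power(double a, int64_t b) {
    assert(b >= 0 && "negative exponents are out of scope for this sketch");
    if (b == 0) {
        return 1.0;
    }
    // a^b = (a^(b/2))^2, times one extra factor of a when b is odd.
    double half = toy_raise_to_integer_power(a, b / 2);
    return (b % 2 != 0) ? half * half * a : half * half;
}
```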
void | split_into_ands (const Expr &cond, std::vector< Expr > &result) |
Split a boolean condition into vector of ANDs. | |
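Splitting a condition into its conjuncts can be sketched on a toy boolean tree: recurse through every "&&" node and collect whatever is left. `ToyCond`, `toy_cond`, `toy_both`, and `toy_split_into_ands` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy boolean node: name "&&" marks a conjunction of `a` and `b`; any other
// name is treated as an opaque leaf condition.
struct ToyCond {
    std::string name;
    std::shared_ptr<ToyCond> a, b;
};
using ToyCondPtr = std::shared_ptr<ToyCond>;

ToyCondPtr toy_cond(const std::string &name) {
    return std::make_shared<ToyCond>(ToyCond{name, nullptr, nullptr});
}

ToyCondPtr toy_both(ToyCondPtr a, ToyCondPtr b) {
    return std::make_shared<ToyCond>(ToyCond{"&&", std::move(a), std::move(b)});
}

// Append the non-conjunction terms of `c` to `result`, left to right.
void toy_split_into_ands(const ToyCondPtr &c, std::vector<ToyCondPtr> &result) {
    assert(c && "this sketch expects a defined condition");
    if (c->name == "&&") {
        toy_split_into_ands(c->a, result);
        toy_split_into_ands(c->b, result);
    } else {
        result.push_back(c);
    }
}
```

Applied to `a && (b && c)`, the sketch produces the three terms `a`, `b`, and `c`.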
Expr | strided_ramp_base (const Expr &e, int stride=1) |
If e is a ramp expression with the given stride (default 1), return its base; otherwise return an undefined Expr. | |
template<typename T > | |
T | mod_imp (T a, T b) |
Implementations of division and mod that are specific to Halide. | |
template<typename T > | |
T | div_imp (T a, T b) |
template<> | |
float | mod_imp< float > (float a, float b) |
template<> | |
double | mod_imp< double > (double a, double b) |
template<> | |
float | div_imp< float > (float a, float b) |
template<> | |
double | div_imp< double > (double a, double b) |
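The listing does not spell out how these Halide-specific division and mod routines differ from C++. The sketch below assumes the convention usually described for Halide integers: the remainder is never negative (Euclidean division) and dividing by zero yields zero. Treat both as assumptions. `toy_mod` and `toy_div` are hypothetical scalar helpers, not the templates listed above.

```cpp
#include <cassert>
#include <cstdint>

// Remainder in the range [0, |b|), or 0 when b is 0.
int64_t toy_mod(int64_t a, int64_t b) {
    if (b == 0) {
        return 0;
    }
    assert(!(a == INT64_MIN && b == -1) && "overflow case is out of scope");
    int64_t r = a % b;  // C++ remainder takes the sign of a
    if (r < 0) {
        r += (b < 0) ? -b : b;
    }
    return r;
}

// Quotient chosen so that a == b * toy_div(a, b) + toy_mod(a, b).
int64_t toy_div(int64_t a, int64_t b) {
    if (b == 0) {
        return 0;
    }
    assert(!(a == INT64_MIN && b == -1) && "overflow case is out of scope");
    int64_t q = a / b;  // C++ quotient rounds toward zero
    if (a % b < 0) {
        // Step the quotient away from zero so the remainder is non-negative.
        q += (b > 0) ? -1 : 1;
    }
    return q;
}
```

Under these assumptions -7 divided by 2 gives quotient -4 and remainder 1, whereas plain C++ gives -3 and -1.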
Expr | remove_likelies (const Expr &e) |
Return an Expr that is identical to the input Expr, but with all calls to likely() and likely_if_innermost() removed. | |
Stmt | remove_likelies (const Stmt &s) |
Return a Stmt that is identical to the input Stmt, but with all calls to likely() and likely_if_innermost() removed. | |
Expr | remove_promises (const Expr &e) |
Return an Expr that is identical to the input Expr, but with all calls to promise_clamped() and unsafe_promise_clamped() removed. | |
Stmt | remove_promises (const Stmt &s) |
Return a Stmt that is identical to the input Stmt, but with all calls to promise_clamped() and unsafe_promise_clamped() removed. | |
Expr | unwrap_tags (const Expr &e) |
If the expression is a tag helper call, remove it and return the tagged expression. | |
HALIDE_NO_USER_CODE_INLINE void | collect_print_args (std::vector< Expr > &args) |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE void | collect_print_args (std::vector< Expr > &args, const char *arg, Args &&...more_args) |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE void | collect_print_args (std::vector< Expr > &args, Expr arg, Args &&...more_args) |
Expr | requirement_failed_error (Expr condition, const std::vector< Expr > &args) |
Expr | memoize_tag_helper (Expr result, const std::vector< Expr > &cache_key_values) |
void | reset_random_counters () |
Reset the counters used for random-number seeds in random_float/int/uint. | |
Expr | unreachable (Type t=Int(32)) |
Return an expression that should never be evaluated. | |
template<typename T > | |
Expr | unreachable () |
Expr | promise_clamped (const Expr &value, const Expr &min, const Expr &max) |
FOR INTERNAL USE ONLY. | |
std::ostream & | operator<< (std::ostream &stream, IRNodeType) |
Emit a halide node type on an output stream (such as std::cout) in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const AssociativePattern &) |
Emit a halide associative pattern on an output stream (such as std::cout) in a human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const AssociativeOp &) |
Emit a halide associative op on an output stream (such as std::cout) in a human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const ForType &) |
Emit a halide for loop type (vectorized, serial, etc) in a human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const VectorReduce::Operator &) |
Emit a horizontal vector reduction op in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const NameMangling &) |
Emit a halide name mangling value in a human-readable format. | |
std::ostream & | operator<< (std::ostream &stream, const LinkageType &) |
Emit a halide linkage value in a human-readable format. | |
std::ostream & | operator<< (std::ostream &stream, const DimType &) |
Emit a halide dimension type in human-readable format. | |
std::ostream & | operator<< (std::ostream &out, const Closure &c) |
Emit a Closure in human-readable form. | |
std::ostream & | operator<< (std::ostream &out, const Interval &c) |
Emit an Interval in human-readable form. | |
std::ostream & | operator<< (std::ostream &out, const ConstantInterval &c) |
Emit a ConstantInterval in human-readable form. | |
std::ostream & | operator<< (std::ostream &out, const ModulusRemainder &c) |
Emit a ModulusRemainder in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const Indentation &) |
void * | get_symbol_address (const char *s) |
Expr | lower_lerp (Type final_type, Expr zero_val, Expr one_val, const Expr &weight, const Target &target) |
Build Halide IR that computes a lerp. | |
Stmt | hoist_loop_invariant_values (Stmt) |
Hoist loop-invariants out of inner loops. | |
Stmt | hoist_loop_invariant_if_statements (Stmt) |
Just hoist loop-invariant if statements as far up as possible. | |
template<typename T > | |
auto | iterator_to_pointer (T iter) -> decltype(&*std::declval< T >()) |
std::string | get_llvm_function_name (const llvm::Function *f) |
std::string | get_llvm_function_name (const llvm::Function &f) |
llvm::StructType * | get_llvm_struct_type_by_name (llvm::Module *module, const char *name) |
llvm::Triple | get_triple_for_target (const Target &target) |
Return the llvm::Triple that corresponds to the given Halide Target. | |
std::unique_ptr< llvm::Module > | get_initial_module_for_target (Target, llvm::LLVMContext *, bool for_shared_jit_runtime=false, bool just_gpu=false) |
Create an llvm module containing the support code for a given target. | |
std::unique_ptr< llvm::Module > | get_initial_module_for_ptx_device (Target, llvm::LLVMContext *c) |
Create an llvm module containing the support code for ptx device. | |
void | add_bitcode_to_module (llvm::LLVMContext *context, llvm::Module &module, const std::vector< uint8_t > &bitcode, const std::string &name) |
Link a block of llvm bitcode into an llvm module. | |
std::unique_ptr< llvm::Module > | link_with_wasm_jit_runtime (llvm::LLVMContext *c, const Target &t, std::unique_ptr< llvm::Module > extra_module) |
Take the llvm::Module(s) in extra_modules (if any), add the runtime modules needed for the WASM JIT, and link into a single llvm::Module. | |
Stmt | loop_carry (Stmt, int max_carried_values=8) |
Reuse loads done on previous loop iterations by stashing them in induction variables instead of redoing the load. | |
Module | lower (const std::vector< Function > &output_funcs, const std::string &pipeline_name, const Target &t, const std::vector< Argument > &args, LinkageType linkage_type, const std::vector< Stmt > &requirements=std::vector< Stmt >(), bool trace_pipeline=false, const std::vector< IRMutator * > &custom_passes=std::vector< IRMutator * >()) |
Given a vector of scheduled halide functions, create a Module that evaluates it. | |
Stmt | lower_main_stmt (const std::vector< Function > &output_funcs, const std::string &pipeline_name, const Target &t, const std::vector< Stmt > &requirements=std::vector< Stmt >(), bool trace_pipeline=false, const std::vector< IRMutator * > &custom_passes=std::vector< IRMutator * >()) |
Given a halide function with a schedule, create a statement that evaluates it. | |
void | lower_test () |
Stmt | lower_parallel_tasks (const Stmt &s, std::vector< LoweredFunc > &closure_implementations, const std::string &name, const Target &t) |
Stmt | lower_warp_shuffles (Stmt s, const Target &t) |
Rewrite access to things stored outside the loop over GPU lanes to use nvidia's warp shuffle instructions. | |
Stmt | inject_memoization (const Stmt &s, const std::map< std::string, Function > &env, const std::string &name, const std::vector< Function > &outputs) |
Transform pipeline calls for Funcs scheduled with memoize to do a lookup call to the runtime cache implementation, and if there is a miss, compute the results and call the runtime to store it back to the cache. | |
Stmt | rewrite_memoized_allocations (const Stmt &s, const std::map< std::string, Function > &env) |
This should be called after Storage Flattening has added Allocation IR nodes. | |
std::map< OutputFileType, const OutputInfo > | get_output_info (const Target &target) |
ModulusRemainder | operator+ (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator- (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator* (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator/ (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator% (const ModulusRemainder &a, const ModulusRemainder &b) |
ModulusRemainder | operator+ (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | operator- (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | operator* (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | operator/ (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | operator% (const ModulusRemainder &a, int64_t b) |
ModulusRemainder | modulus_remainder (const Expr &e) |
For things like alignment analysis, often it's helpful to know if an integer expression is some multiple of a constant plus some other constant. | |
ModulusRemainder | modulus_remainder (const Expr &e, const Scope< ModulusRemainder > &scope) |
If we have alignment information about external variables, we can let the analysis know about that using this version of modulus_remainder: | |
void | modulus_remainder_test () |
int64_t | gcd (int64_t, int64_t) |
The greatest common divisor of two integers. | |
int64_t | lcm (int64_t, int64_t) |
The least common multiple of two integers. | |
ConstantInterval | derivative_bounds (const Expr &e, const std::string &var, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope()) |
Find the bounds of the derivative of an expression. | |
Monotonic | is_monotonic (const Expr &e, const std::string &var, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope()) |
Monotonic | is_monotonic (const Expr &e, const std::string &var, const Scope< Monotonic > &scope) |
std::ostream & | operator<< (std::ostream &stream, const Monotonic &m) |
Emit the monotonic class in human-readable form for debugging. | |
void | is_monotonic_test () |
Stmt | inject_gpu_offload (const Stmt &s, const Target &host_target) |
Pull loops marked with GPU device APIs to a separate module, and call them through the appropriate host runtime module. | |
Stmt | optimize_shuffles (Stmt s, int lut_alignment) |
bool | can_parallelize_rvar (const std::string &rvar, const std::string &func, const Definition &r) |
Returns whether or not Halide can prove that it is safe to parallelize an update definition across a specific variable. | |
void | check_call_arg_types (const std::string &name, std::vector< Expr > *args, int dims) |
Validate arguments to a call to a func, image or imageparam. | |
bool | has_uncaptured_likely_tag (const Expr &e, const Scope<> &scope) |
Return true if an expression uses a likely tag that isn't captured by an enclosing Select, Min, or Max. | |
bool | has_likely_tag (const Expr &e, const Scope<> &scope) |
Return true if an expression uses a likely tag. | |
Stmt | partition_loops (Stmt s) |
Partitions loop bodies into a prologue, a steady state, and an epilogue. | |
Stmt | inject_placeholder_prefetch (const Stmt &s, const std::map< std::string, Function > &env, const std::string &prefix, const std::vector< PrefetchDirective > &prefetches) |
Inject placeholder prefetches to 's'. | |
Stmt | inject_prefetch (const Stmt &s, const std::map< std::string, Function > &env) |
Compute the actual region to be prefetched and place it in the placeholder prefetch. | |
Stmt | reduce_prefetch_dimension (Stmt stmt, const Target &t) |
Reduce a multi-dimensional prefetch into a prefetch of lower dimension (max dimension of the prefetch is specified by target architecture). | |
Stmt | hoist_prefetches (const Stmt &s) |
Hoist all the prefetches in a Block to the beginning of the Block. | |
std::string | print_loop_nest (const std::vector< Function > &output_funcs) |
Emit some simple pseudocode that shows the structure of the loop nest specified by this pipeline's schedule, and the schedules of the functions it uses. | |
Stmt | inject_profiling (const Stmt &, const std::string &, const std::map< std::string, Function > &env) |
Take a statement representing a halide pipeline and insert high-resolution timing into the generated code (via spawning a thread that acts as a sampling profiler); summaries of execution times and counts will be logged at the end. | |
Expr | purify_index_math (const Expr &) |
Bounds inference and related stages can lift integer bounds expressions out of if statements that guard against those integer expressions doing side-effecty things like dividing or modding by zero. | |
Expr | qualify (const std::string &prefix, const Expr &value) |
Prefix all variable names in the given expression with the prefix string. | |
Expr | random_float (const std::vector< Expr > &) |
Return a random floating-point number between zero and one that varies deterministically based on the input expressions. | |
Expr | random_int (const std::vector< Expr > &) |
Return a random unsigned integer between zero and 2^32-1 that varies deterministically based on the input expressions (which must be integers or unsigned integers). | |
Expr | lower_random (const Expr &e, const std::vector< VarOrRVar > &free_vars, int tag) |
Convert calls to random() to IR generated by random_float and random_int. | |
std::pair< std::vector< std::string >, std::vector< std::vector< std::string > > > | realization_order (const std::vector< Function > &outputs, std::map< std::string, Function > &env) |
Given a bunch of functions that call each other, determine an order in which to do the scheduling. | |
std::vector< std::string > | topological_order (const std::vector< Function > &outputs, const std::map< std::string, Function > &env) |
Given a bunch of functions that call each other, determine a topological order which stays constant regardless of the schedule. | |
Stmt | rebase_loops_to_zero (const Stmt &) |
Rewrite the mins of most loops to 0. | |
void | split_predicate_test () |
bool | is_func_trivial_to_inline (const Function &func) |
Return true if the cost of inlining a function is equivalent to the cost of calling the function directly. | |
Stmt | remove_dead_allocations (const Stmt &s) |
Find Allocate/Free pairs that are never loaded from or stored to, and remove them from the Stmt. | |
Stmt | remove_extern_loops (const Stmt &s) |
Removes placeholder loops for extern stages. | |
Stmt | remove_undef (Stmt s) |
Removes stores that depend on undef values, and statements that only contain such stores. | |
Stmt | schedule_functions (const std::vector< Function > &outputs, const std::vector< std::vector< std::string > > &fused_groups, const std::map< std::string, Function > &env, const Target &target, bool &any_memoized) |
Build loop nests and inject Function realizations at the appropriate places using the schedule. | |
template<typename T > | |
std::ostream & | operator<< (std::ostream &stream, const Scope< T > &s) |
Stmt | select_gpu_api (const Stmt &s, const Target &t) |
Replace for loops with GPU_Default device_api with an actual device API depending on what's enabled in the target. | |
Stmt | simplify (const Stmt &, bool remove_dead_code=true, const Scope< Interval > &bounds=Scope< Interval >::empty_scope(), const Scope< ModulusRemainder > &alignment=Scope< ModulusRemainder >::empty_scope(), const std::vector< Expr > &assumptions=std::vector< Expr >()) |
Perform a wide range of simplifications to expressions and statements, including constant folding, substituting in trivial values, arithmetic rearranging, etc. | |
Expr | simplify (const Expr &, bool remove_dead_code=true, const Scope< Interval > &bounds=Scope< Interval >::empty_scope(), const Scope< ModulusRemainder > &alignment=Scope< ModulusRemainder >::empty_scope(), const std::vector< Expr > &assumptions=std::vector< Expr >()) |
bool | can_prove (Expr e, const Scope< Interval > &bounds=Scope< Interval >::empty_scope()) |
Attempt to statically prove an expression is true using the simplifier. | |
Stmt | simplify_exprs (const Stmt &) |
Simplify expressions found in a statement, but don't simplify across different statements. | |
Stmt | simplify_correlated_differences (const Stmt &) |
Symbolic interval arithmetic can be extremely conservative in cases where we analyze the difference between two correlated expressions. | |
Expr | bound_correlated_differences (const Expr &expr) |
Refactor the expression to remove correlated differences or rewrite them in a form that is more amenable to bounds inference. | |
void | simplify_specializations (std::map< std::string, Function > &env) |
Try to simplify the RHS/LHS of a function's definition based on its specializations. | |
Stmt | skip_stages (const Stmt &s, const std::vector< Function > &outputs, const std::vector< std::vector< std::string > > &order, const std::map< std::string, Function > &env) |
Avoid computing certain stages if we can infer a runtime condition to check that tells us they won't be used. | |
Stmt | sliding_window (const Stmt &s, const std::map< std::string, Function > &env) |
Perform sliding window optimizations on a halide statement. | |
SolverResult | solve_expression (const Expr &e, const std::string &variable, const Scope< Expr > &scope=Scope< Expr >::empty_scope()) |
Attempts to collect all instances of a variable in an expression tree and place it as far to the left as possible, and as far up the tree as possible (i.e. | |
Interval | solve_for_outer_interval (const Expr &c, const std::string &variable) |
Find the smallest interval such that the condition is either true or false inside of it, but definitely false outside of it. | |
Interval | solve_for_inner_interval (const Expr &c, const std::string &variable) |
Find the largest interval such that the condition is definitely true inside of it, and might be true or false outside of it. | |
Expr | and_condition_over_domain (const Expr &c, const Scope< Interval > &varying) |
Take a conditional that includes variables that vary over some domain, and convert it to a more conservative (less frequently true) condition that doesn't depend on those variables. | |
void | solve_test () |
void | spirv_ir_test () |
Internal test for SPIR-V IR. | |
Stmt | split_tuples (const Stmt &s, const std::map< std::string, Function > &env) |
Rewrite all tuple-valued Realizations, Provide nodes, and Call nodes into several scalar-valued ones, so that later lowering passes only need to think about scalar-valued productions. | |
Stmt | stage_strided_loads (const Stmt &s) |
Convert all unpredicated strided loads in a Stmt into dense loads followed by shuffles. | |
void | print_to_stmt_html (const std::string &html_output_filename, const Module &m, const std::string &assembly_input_filename="") |
Dump an HTML-formatted visualization of a Module to filename. | |
void | print_to_conceptual_stmt_html (const std::string &html_output_filename, const Module &m, const std::string &assembly_input_filename="") |
Dump an HTML-formatted visualization of a Module's conceptual Stmt code to filename. | |
Stmt | storage_flattening (Stmt s, const std::vector< Function > &outputs, const std::map< std::string, Function > &env, const Target &target) |
Take a statement with multi-dimensional Realize, Provide, and Call nodes, and turn it into a statement with single-dimensional Allocate, Store, and Load nodes respectively. | |
Stmt | storage_folding (const Stmt &s, const std::map< std::string, Function > &env) |
Fold storage of functions if possible. | |
bool | strictify_float (std::map< std::string, Function > &env, const Target &t) |
Propagate strict_float intrinsics such that they immediately wrap all floating-point expressions. | |
Stmt | strip_asserts (const Stmt &s) |
Expr | substitute (const std::string &name, const Expr &replacement, const Expr &expr) |
Substitute variables with the given name with the replacement expression within expr. | |
Stmt | substitute (const std::string &name, const Expr &replacement, const Stmt &stmt) |
Substitute variables with the given name with the replacement expression within stmt. | |
Expr | substitute (const std::map< std::string, Expr > &replacements, const Expr &expr) |
Substitute variables with names in the map. | |
Stmt | substitute (const std::map< std::string, Expr > &replacements, const Stmt &stmt) |
Expr | substitute (const Expr &find, const Expr &replacement, const Expr &expr) |
Substitute expressions for other expressions. | |
Stmt | substitute (const Expr &find, const Expr &replacement, const Stmt &stmt) |
Expr | graph_substitute (const std::string &name, const Expr &replacement, const Expr &expr) |
Substitutions where the IR may be a general graph (and not just a DAG). | |
Stmt | graph_substitute (const std::string &name, const Expr &replacement, const Stmt &stmt) |
Expr | graph_substitute (const Expr &find, const Expr &replacement, const Expr &expr) |
Stmt | graph_substitute (const Expr &find, const Expr &replacement, const Stmt &stmt) |
Expr | substitute_in_all_lets (const Expr &expr) |
Substitute in all let Exprs in a piece of IR. | |
Stmt | substitute_in_all_lets (const Stmt &stmt) |
void | target_test () |
void | lower_target_query_ops (std::map< std::string, Function > &env, const Target &t) |
Stmt | inject_tracing (Stmt, const std::string &pipeline_name, bool trace_pipeline, const std::map< std::string, Function > &env, const std::vector< Function > &outputs, const Target &Target) |
Take a statement representing a halide pipeline, inject calls to tracing functions at interesting points, such as allocations. | |
Stmt | trim_no_ops (Stmt s) |
Truncate loop bounds to the region over which they actually do something. | |
Stmt | unify_duplicate_lets (const Stmt &s) |
Find let statements that all define the same value, and make later ones just reuse the symbol names of the earlier ones. | |
Stmt | uniquify_variable_names (const Stmt &s) |
Modify a statement so that every internally-defined variable name is unique. | |
void | uniquify_variable_names_test () |
Stmt | unpack_buffers (Stmt s) |
Creates let stmts for the various buffer components (e.g. | |
Stmt | unroll_loops (const Stmt &) |
Take a statement with for loops marked for unrolling, and convert each into several copies of the innermost statement. | |
Stmt | lower_unsafe_promises (const Stmt &s, const Target &t) |
Lower all unsafe promises into either assertions or unchecked code, depending on the target. | |
Stmt | lower_safe_promises (const Stmt &s) |
Lower all safe promises by just stripping them. | |
template<typename DST , typename SRC , typename std::enable_if< std::is_floating_point< SRC >::value >::type * = nullptr> | |
DST | safe_numeric_cast (SRC s) |
Some numeric conversions are UB if the value won't fit in the result; safe_numeric_cast<>() is meant as a drop-in replacement for a C/C++ cast that adds well-defined behavior for the UB cases, attempting to mimic common implementation behavior as much as possible. | |
template<typename DstType , typename SrcType > | |
DstType | reinterpret_bits (const SrcType &src) |
An aggressive form of reinterpret cast used for correct type-punning. | |
std::string | get_env_variable (char const *env_var_name) |
Get value of an environment variable. | |
std::string | running_program_name () |
Get the name of the currently running executable. | |
std::string | unique_name (char prefix) |
Generate a unique name starting with the given prefix. | |
std::string | unique_name (const std::string &prefix) |
bool | starts_with (const std::string &str, const std::string &prefix) |
Test if the first string starts with the second string. | |
bool | ends_with (const std::string &str, const std::string &suffix) |
Test if the first string ends with the second string. | |
std::string | replace_all (const std::string &str, const std::string &find, const std::string &replace) |
Replace all matches of the second string in the first string with the last string. | |
std::vector< std::string > | split_string (const std::string &source, const std::string &delim) |
Split the source string using 'delim' as the divider. | |
template<typename T > | |
std::string | join_strings (const std::vector< T > &sources, const std::string &delim) |
Join the source vector using 'delim' as the divider. | |
template<typename T , typename Fn > | |
T | fold_left (const std::vector< T > &vec, Fn f) |
Perform a left fold of a vector. | |
template<typename T , typename Fn > | |
T | fold_right (const std::vector< T > &vec, Fn f) |
Returns a right fold of a vector. | |
std::string | extract_namespaces (const std::string &name, std::vector< std::string > &namespaces) |
Returns base name and fills in namespaces, outermost one first in vector. | |
std::string | strip_namespaces (const std::string &name) |
Like extract_namespaces(), but strip and discard the namespaces, returning base name only. | |
std::string | file_make_temp (const std::string &prefix, const std::string &suffix) |
Create a unique file with a name of the form prefixXXXXXsuffix in an arbitrary (but writable) directory; this is typically /tmp, but the specific location is not guaranteed. | |
std::string | dir_make_temp () |
Create a unique directory in an arbitrary (but writable) directory; this is typically somewhere inside /tmp, but the specific location is not guaranteed. | |
bool | file_exists (const std::string &name) |
Wrapper for access(). | |
void | assert_file_exists (const std::string &name) |
assert-fail if the file doesn't exist. | |
void | assert_no_file_exists (const std::string &name) |
assert-fail if the file DOES exist. | |
void | file_unlink (const std::string &name) |
Wrapper for unlink(). | |
void | ensure_no_file_exists (const std::string &name) |
Ensure that no file with this path exists. | |
void | dir_rmdir (const std::string &name) |
Wrapper for rmdir(). | |
FileStat | file_stat (const std::string &name) |
Wrapper for stat(). | |
std::vector< char > | read_entire_file (const std::string &pathname) |
Read the entire contents of a file into a vector<char>. | |
void | write_entire_file (const std::string &pathname, const void *source, size_t source_len) |
Create or replace the contents of a file with a given pointer-and-length of memory. | |
void | write_entire_file (const std::string &pathname, const std::vector< char > &source) |
bool | add_would_overflow (int bits, int64_t a, int64_t b) |
Routines to test if math would overflow for signed integers with the given number of bits. | |
bool | sub_would_overflow (int bits, int64_t a, int64_t b) |
bool | mul_would_overflow (int bits, int64_t a, int64_t b) |
HALIDE_MUST_USE_RESULT bool | add_with_overflow (int bits, int64_t a, int64_t b, int64_t *result) |
Routines to perform arithmetic on signed types without triggering signed overflow. | |
HALIDE_MUST_USE_RESULT bool | sub_with_overflow (int bits, int64_t a, int64_t b, int64_t *result) |
HALIDE_MUST_USE_RESULT bool | mul_with_overflow (int bits, int64_t a, int64_t b, int64_t *result) |
void | halide_tic_impl (const char *file, int line) |
void | halide_toc_impl (const char *file, int line) |
std::string | c_print_name (const std::string &name, bool prefix_underscore=true) |
Emit a version of a string that is a valid identifier in C (. | |
int | get_llvm_version () |
Return the LLVM_VERSION against which this libHalide is compiled. | |
void | run_with_large_stack (const std::function< void()> &action) |
Call the given action in a platform-specific context that provides at least the stack space returned by get_compiler_stack_size. | |
int | popcount64 (uint64_t x) |
Portable versions of popcount, count-leading-zeros, and count-trailing-zeros. | |
int | clz64 (uint64_t x) |
int | ctz64 (uint64_t x) |
int64_t | next_power_of_two (int64_t x) |
Return an integer 2^n, for some n, which is >= x. | |
template<typename T > | |
T | align_up (T x, int n) |
std::vector< Var > | make_argument_list (int dimensionality) |
Make a list of unique arguments for definitions with unnamed arguments. | |
Stmt | vectorize_loops (const Stmt &s, const std::map< std::string, Function > &env) |
Take a statement with for loops marked for vectorization, and turn them into single statements that operate on vectors. | |
std::map< std::string, Function > | wrap_func_calls (const std::map< std::string, Function > &env) |
Replace every call to wrapped Functions in the Functions' definitions with calls to their wrapper functions. | |
std::string | get_test_tmp_dir () |
Return the path to a directory that can be safely written to when running tests; the contents of the directory may or may not outlast the lifetime of the test itself (i.e., the files may be cleaned up after test execution). | |
Expr | lower_int_uint_div (const Expr &a, const Expr &b, bool round_to_zero=false) |
Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary. | |
Expr | lower_int_uint_mod (const Expr &a, const Expr &b) |
Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary. | |
Expr | lower_euclidean_div (Expr a, Expr b) |
Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero. | |
Expr | lower_euclidean_mod (Expr a, Expr b) |
Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero. | |
Expr | lower_signed_shift_left (const Expr &a, const Expr &b) |
Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts. | |
Expr | lower_signed_shift_right (const Expr &a, const Expr &b) |
Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts. | |
Expr | lower_extract_bits (const Call *c) |
Reduce bit extraction and concatenation to bit ops. | |
Expr | lower_concat_bits (const Call *c) |
Reduce bit extraction and concatenation to bit ops. | |
Stmt | eliminate_bool_vectors (const Stmt &s) |
Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on. | |
Expr | eliminate_bool_vectors (const Expr &s) |
Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on. | |
HALIDE_MUST_USE_RESULT bool | reduce_expr_modulo (const Expr &e, int64_t modulus, int64_t *remainder) |
Reduce an expression modulo some integer. | |
HALIDE_MUST_USE_RESULT bool | reduce_expr_modulo (const Expr &e, int64_t modulus, int64_t *remainder, const Scope< ModulusRemainder > &scope) |
Reduce an expression modulo some integer. | |
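The alignment-style reasoning behind modulus_remainder and reduce_expr_modulo can be illustrated with a small standalone sketch. It does not use Halide's types: `ModRem`, `gcd_sketch`, `euclid_mod`, `add_modrem`, and `reduce_modulo_sketch` are hypothetical names, and the sketch assumes every modulus is strictly positive. A `ModRem` states that a value equals `modulus * k + remainder` for some integer `k`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an alignment fact: every described value x
// satisfies x == modulus * k + remainder for some integer k.
struct ModRem {
    int64_t modulus;    // assumed > 0 in this sketch
    int64_t remainder;  // kept in [0, modulus)
};

// Greatest common divisor by Euclid's algorithm (inputs assumed >= 0).
int64_t gcd_sketch(int64_t a, int64_t b) {
    while (b != 0) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Remainder that always lies in [0, m), unlike C++'s % for negative a.
int64_t euclid_mod(int64_t a, int64_t m) {
    assert(m > 0);
    int64_t r = a % m;
    return r < 0 ? r + m : r;
}

// Sum of two described values: only the common divisor of the two
// moduli is still known about the result.
ModRem add_modrem(ModRem a, ModRem b) {
    int64_t m = gcd_sketch(a.modulus, b.modulus);
    return {m, euclid_mod(a.remainder + b.remainder, m)};
}

// Try to reduce a described value modulo `modulus`. This succeeds only
// when the known modulus is a multiple of the requested one.
bool reduce_modulo_sketch(ModRem a, int64_t modulus, int64_t *remainder) {
    assert(modulus > 0);
    if (a.modulus % modulus != 0) {
        return false;
    }
    *remainder = euclid_mod(a.remainder, modulus);
    return true;
}
```

For instance, a value known to be `8k + 3` reduces to 3 modulo 4, but nothing can be concluded about it modulo 3, so the reduction reports failure in that case.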
Variables | |
const int64_t | unknown = std::numeric_limits<int64_t>::min() |
constexpr IRNodeType | StrongestExprNodeType = IRNodeType::VectorReduce |
std::atomic< int > | random_variable_counter |
using Halide::Internal::AbstractGeneratorPtr = std::unique_ptr<AbstractGenerator> |
Definition at line 244 of file AbstractGenerator.h.
typedef std::map<std::string, Interval> Halide::Internal::DimBounds |
Definition at line 20 of file AutoScheduleUtils.h.
typedef std::map<std::pair<std::string, int>, Interval> Halide::Internal::FuncValueBounds |
using Halide::Internal::add_const_if_T_is_const = typename std::conditional<std::is_const<T>::value, const T2, T2>::type |
using Halide::Internal::GeneratorParamImplBase |
Definition at line 941 of file Generator.h.
using Halide::Internal::GeneratorInputImplBase |
Definition at line 2175 of file Generator.h.
using Halide::Internal::GeneratorOutputImplBase |
Definition at line 2786 of file Generator.h.
using Halide::Internal::GeneratorFactory = std::function<AbstractGeneratorPtr(const GeneratorContext &context)> |
Definition at line 3115 of file Generator.h.
typedef llvm::raw_pwrite_stream Halide::Internal::LLVMOStream |
Definition at line 27 of file LLVM_Output.h.
Enumerator | |
---|---|
Scalar | |
Function | |
Buffer |
Definition at line 26 of file AbstractGenerator.h.
Enumerator | |
---|---|
Input | |
Output |
Definition at line 30 of file AbstractGenerator.h.
All our IR node types get unique IDs for the purposes of RTTI.
An enum describing a type of loop traversal.
Used in schedules, and in the For loop IR node. Serial is a conventional ordered for loop. Iterations occur in increasing order, and each iteration must appear to have finished before the next begins. Parallel, GPUBlock, and GPUThread are parallel and unordered: iterations may occur in any order, and multiple iterations may occur simultaneously. Vectorized and GPULane are parallel and synchronous: they act as if all iterations occur at the same time in lockstep.
Enumerator | |
---|---|
Serial | |
Parallel | |
Vectorized | |
Unrolled | |
Extern | |
GPUBlock | |
GPUThread | |
GPULane |
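The three traversal categories described above can be written down as simple predicates. This is a standalone sketch: the enum mirrors the enumerators listed here, but the helper names are hypothetical and are not claimed to match Halide's own helpers. Unrolled and Extern are not classified by the description, so the sketch treats them as neither kind of parallel loop.

```cpp
#include <cassert>

// Mirrors the enumerators listed above, for illustration only.
enum class ForType { Serial, Parallel, Vectorized, Unrolled, Extern,
                     GPUBlock, GPUThread, GPULane };

// Parallel and unordered: iterations may run in any order, possibly
// simultaneously.
constexpr bool is_unordered_parallel_sketch(ForType t) {
    return t == ForType::Parallel ||
           t == ForType::GPUBlock ||
           t == ForType::GPUThread;
}

// Parallel and synchronous: all iterations act as if run in lockstep.
constexpr bool is_lockstep_sketch(ForType t) {
    return t == ForType::Vectorized || t == ForType::GPULane;
}

// Either kind of parallel traversal.
constexpr bool is_parallel_sketch(ForType t) {
    return is_unordered_parallel_sketch(t) || is_lockstep_sketch(t);
}
```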
Enumerator | |
---|---|
Type | |
Dim | |
ArraySize |
Definition at line 2892 of file Generator.h.
Detect whether an expression is monotonic increasing in a variable, decreasing, or unknown.
Enumerator | |
---|---|
Constant | |
Increasing | |
Decreasing | |
Unknown |
Definition at line 26 of file Monotonic.h.
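One way to see how such a classification composes is to propagate it through negation and addition. The sketch below is standalone and only illustrative: the enum mirrors the enumerators above, the function names are hypothetical, and "Increasing" is assumed to mean non-strictly increasing. It is not claimed to be how is_monotonic is implemented.

```cpp
#include <cassert>

// Mirrors the enumerators listed above, for illustration only.
enum class Monotonic { Constant, Increasing, Decreasing, Unknown };

// Monotonicity of -e, given the monotonicity of e.
Monotonic negate_mono(Monotonic m) {
    switch (m) {
    case Monotonic::Increasing:
        return Monotonic::Decreasing;
    case Monotonic::Decreasing:
        return Monotonic::Increasing;
    default:
        return m;  // Constant stays Constant, Unknown stays Unknown
    }
}

// Monotonicity of a + b, given the monotonicity of each operand.
Monotonic add_mono(Monotonic a, Monotonic b) {
    if (a == Monotonic::Constant) return b;
    if (b == Monotonic::Constant) return a;
    if (a == b) return a;       // same direction (or both Unknown)
    return Monotonic::Unknown;  // opposite directions, or one Unknown
}

// Monotonicity of a - b, expressed as a + (-b).
Monotonic sub_mono(Monotonic a, Monotonic b) {
    return add_mono(a, negate_mono(b));
}
```

Under these rules, an increasing expression minus a decreasing one is increasing, while their sum cannot be classified.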
Each Dim below has a dim_type, which tells you what transformations are legal on it.
When you combine two Dims of distinct DimTypes (e.g. with Stage::fuse), the combined result has the greater enum value of the two types.
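The combining rule can be sketched in a few lines. This is a standalone illustration: the enum is declared locally in the order in which the enumerators are documented below, and `combine_dim_types` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>

// Declared in the documented order, so that a greater value means a
// more restrictive dim type.
enum class DimType { PureVar = 0, PureRVar, ImpureRVar };

// The fused dim takes the greater (more restrictive) of the two types.
DimType combine_dim_types(DimType a, DimType b) {
    return std::max(a, b);
}
```

Fusing a PureVar with an ImpureRVar therefore yields an ImpureRVar under this rule.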
Enumerator | |
---|---|
PureVar | This dim originated from a Var. You can evaluate a Func at distinct values of this Var in any order over an interval that's at least as large as the interval required. In pure definitions you can even redundantly re-evaluate points. |
PureRVar | The dim originated from an RVar. You can evaluate a Func at distinct values of this RVar in any order (including in parallel) over exactly the interval specified in the RDom. PureRVars can also be reordered arbitrarily in the dims list, as there are no data hazards between the evaluation of the Func at distinct values of the RVar. The most common case where an RVar is considered pure is RVars that are used in a way which obeys all the syntactic constraints that a Var does, e.g: RDom r(0, 100);
f(r.x) = f(r.x) + 5;
Other cases where RVars are pure are where the sites being written to by the Func evaluated at one value of the RVar couldn't possibly collide with the sites being written or read by the Func at a distinct value of the RVar. For example, r.x is pure in the following three definitions: // This definition writes to even coordinates and reads from the
// same site (which no other value of r.x is writing to) and odd
// sites (which no other value of r.x is writing to):
f(2*r.x) = max(f(2*r.x), f(2*r.x + 7));
// This definition writes to scanline zero and reads from the
// same site and scanline one:
f(r.x, 0) += f(r.x, 1);
// This definition reads and writes over non-overlapping ranges:
f(r.x + 100) += f(r.x);
To give two counterexamples, r.x is not pure in the following definitions: // The same site is written by distinct values of the RVar
// (write-after-write hazard):
f(r.x / 2) += f(r.x);
// One value of r.x reads from a site that another value of r.x
// is writing to (read-after-write hazard):
f(r.x) += f(r.x + 1);
|
ImpureRVar | The dim originated from an RVar. You must evaluate a Func at distinct values of this RVar in increasing order over precisely the interval specified in the RDom. ImpureRVars may not be reordered with respect to other ImpureRVars. All RVars are impure by default. Those for which we can prove no data hazards exist get promoted to PureRVar. There are two instances in which ImpureRVars may be parallelized or reordered even in the presence of hazards: 1) In the case of an update definition that has been proven to be an associative and commutative reduction, reordering of ImpureRVars is allowed, and parallelizing them is allowed if the update has been made atomic. 2) ImpureRVars can also be reordered and parallelized if Func::allow_race_conditions() has been set. This is the escape hatch for when there are no hazards but the checks above failed to prove that (RDom::where can encode arbitrary facts about non-linear integer arithmetic, which is undecidable), or for when you don't actually care about the non-determinism introduced by data hazards (e.g. in the algorithm HOGWILD!). |
Definition at line 370 of file Schedule.h.
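The difference between a pure and an impure RVar can be observed with plain arrays by running the same update in increasing and in decreasing order. The sketch below is standalone and does not use Halide; `run_update` is a hypothetical helper that applies `f(x + write_offset) += f(x + read_offset)` for `x` in `[0, n)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Apply f(x + write_offset) += f(x + read_offset) for every x in [0, n),
// visiting x in increasing or decreasing order. Returns the updated copy.
std::vector<int> run_update(std::vector<int> f, int n, int write_offset,
                            int read_offset, bool increasing) {
    assert(n >= 0 && write_offset >= 0 && read_offset >= 0);
    assert(static_cast<std::size_t>(n + std::max(write_offset, read_offset)) <= f.size());
    for (int i = 0; i < n; i++) {
        int x = increasing ? i : n - 1 - i;
        f[x + write_offset] += f[x + read_offset];
    }
    return f;
}
```

With `write_offset = 0` and `read_offset = 1` (the shape of `f(r.x) += f(r.x + 1)`), the two traversal orders give different results, because one iteration reads a site that another iteration writes. With `write_offset = n` and `read_offset = 0` (the shape of `f(r.x + 100) += f(r.x)` for an extent of 100), reads and writes never overlap and the order does not matter.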
Stmt Halide::Internal::add_image_checks | ( | const Stmt & | s, |
const std::vector< Function > & | outputs, | ||
const Target & | t, | ||
const std::vector< std::string > & | order, | ||
const std::map< std::string, Function > & | env, | ||
const FuncValueBounds & | fb, | ||
bool | will_inject_host_copies ) |
Insert checks to make sure a statement doesn't read out of bounds on inputs or outputs, and that the inputs and outputs conform to the format required (e.g. stride.0 must be 1).
Stmt Halide::Internal::add_parameter_checks | ( | const std::vector< Stmt > & | requirements, |
Stmt | s, | ||
const Target & | t ) |
Insert checks to make sure that all referenced parameters meet their constraints.
Also injects any custom requirements provided by the user.
Stmt Halide::Internal::add_split_factor_checks | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env ) |
Insert checks that all split factors that depend on scalar parameters are strictly positive.
Attempt to rewrite unaligned loads from buffers which are known to be aligned to instead load aligned vectors that cover the original load, and then slice the original load out of the aligned vectors.
Types that are less than min_bytes_to_align in size are not rewritten. This is intended to make a distinction between data that will be accessed as a scalar and that which will be accessed as a vector.
Stmt Halide::Internal::allocation_bounds_inference | ( | Stmt | s, |
const std::map< std::string, Function > & | env, | ||
const std::map< std::pair< std::string, int >, Interval > & | func_bounds ) |
Take a partially lowered statement with Realize nodes in terms of variables, and define values for those variables.
std::vector< ApplySplitResult > Halide::Internal::apply_split | ( | const Split & | split, |
bool | is_update, | ||
const std::string & | prefix, | ||
std::map< std::string, Expr > & | dim_extent_alignment ) |
Given a Split schedule on a definition (init or update), return a list of predicates on the definition, substitutions that need to be applied to the definition (in ascending order of application), and let stmts that define the values of variables referred to by the predicates and substitutions (ordered from innermost to outermost let).
std::vector< std::pair< std::string, Expr > > Halide::Internal::compute_loop_bounds_after_split | ( | const Split & | split, |
const std::string & | prefix ) |
Compute the loop bounds of the new dimensions resulting from applying the split schedules using the loop bounds of the old dimensions.
const std::vector< AssociativePattern > & Halide::Internal::get_ops_table | ( | const std::vector< Expr > & | exprs | ) |
AssociativeOp Halide::Internal::prove_associativity | ( | const std::string & | f, |
std::vector< Expr > | args, | ||
std::vector< Expr > | exprs ) |
Given an update definition of a Func 'f', determine its equivalent associative binary/unary operator if there is any.
'is_associative' indicates whether the operation was successfully proven to be associative.
void Halide::Internal::associativity_test | ( | ) |
Stmt Halide::Internal::fork_async_producers | ( | Stmt | s, |
const std::map< std::string, Function > & | env ) |
int Halide::Internal::string_to_int | ( | const std::string & | s | ) |
Return an int representation of 's'.
Throw an error on failure.
Return the size of an interval.
Return an undefined expr if the interval is unbounded.
void Halide::Internal::disp_regions | ( | const std::map< std::string, Box > & | regions | ) |
Helper function to print the bounds of a region.
Definition Halide::Internal::get_stage_definition | ( | const Function & | f, |
int | stage_num ) |
Return the corresponding definition of a function given the stage.
This will throw an assertion if the function is an extern function (an extern Func does not have a definition).
void Halide::Internal::combine_load_costs | ( | std::map< std::string, Expr > & | result, |
const std::map< std::string, Expr > & | partial ) |
Add partial load costs to the corresponding function in the result costs.
DimBounds Halide::Internal::get_stage_bounds | ( | const Function & | f, |
int | stage_num, | ||
const DimBounds & | pure_bounds ) |
Return the required bounds of an intermediate stage (f, stage_num) of function 'f' given the bounds of the pure dimensions.
std::vector< DimBounds > Halide::Internal::get_stage_bounds | ( | const Function & | f, |
const DimBounds & | pure_bounds ) |
Return the required bounds for all the stages of the function 'f'.
Each entry in the returned vector corresponds to a stage.
Expr Halide::Internal::perform_inline | ( | Expr | e, |
const std::map< std::string, Function > & | env, | ||
const std::set< std::string > & | inlines = std::set< std::string >(), | ||
const std::vector< std::string > & | order = std::vector< std::string >() ) |
Recursively inline all the functions in the set 'inlines' into the expression 'e' and return the resulting expression.
If 'order' is passed, inlining will be done in the reverse order of function realization to avoid extra inlining work.
std::set< std::string > Halide::Internal::get_parents | ( | Function | f, |
int | stage ) |
Return all functions that are directly called by a function stage (f, stage).
V Halide::Internal::get_element | ( | const std::map< K, V > & | m, |
const K & | key ) |
Return value of element within a map.
This will assert if the element is not in the map.
Definition at line 101 of file AutoScheduleUtils.h.
References internal_assert.
V & Halide::Internal::get_element | ( | std::map< K, V > & | m, |
const K & | key ) |
Definition at line 108 of file AutoScheduleUtils.h.
References internal_assert.
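The two get_element overloads above follow a lookup-or-assert pattern. A minimal standalone sketch of that pattern is given here, assuming the standard `assert` as a stand-in for Halide's `internal_assert`; the name `get_element_sketch` is hypothetical and this is not the code from AutoScheduleUtils.h.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the const overload: return a copy of the mapped
// value, asserting (rather than inserting a default) when the key is missing.
template<typename K, typename V>
V get_element_sketch(const std::map<K, V> &m, const K &key) {
    auto it = m.find(key);
    assert(it != m.end() && "key not found in map");
    return it->second;
}

// Hypothetical stand-in for the non-const overload: return a mutable
// reference to the mapped value so the caller can update it in place.
template<typename K, typename V>
V &get_element_sketch(std::map<K, V> &m, const K &key) {
    auto it = m.find(key);
    assert(it != m.end() && "key not found in map");
    return it->second;
}
```

Unlike `std::map::operator[]`, neither overload silently inserts a default value for a missing key, which is presumably why a helper of this shape is useful.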
void Halide::Internal::propagate_estimate_test | ( | ) |
Replace all loop extents of unrolled or vectorized loops with constants, by substituting and simplifying as needed.
If we can't determine a constant extent, but can determine a constant upper bound, inject an if statement into the body. If we can't even determine a constant upper bound, throw a user error.
const FuncValueBounds & Halide::Internal::empty_func_value_bounds | ( | ) |
Interval Halide::Internal::bounds_of_expr_in_scope | ( | const Expr & | expr, |
const Scope< Interval > & | scope, | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds(), | ||
bool | const_bound = false ) |
Given an expression in some variables, and a map from those variables to their bounds (in the form of (minimum possible value, maximum possible value)), compute two expressions that give the minimum possible value and the maximum possible value of this expression.
Max or min may be undefined expressions if the value is not bounded above or below. If the expression is a vector, also takes the bounds across the vector lanes and returns a scalar result.
This is for tasks such as deducing the region of a buffer loaded by a chunk of code.
Expr Halide::Internal::find_constant_bound | ( | const Expr & | e, |
Direction | d, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope() ) |
Find bounds for a varying expression that are either constants or +/-inf.
Test if box a could possibly overlap box b.
The intersection of two boxes.
Test if box a provably contains box b.
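The three box operations above can be illustrated with a standalone sketch that uses concrete integer bounds. The real Box holds symbolic Interval bounds that may be undefined or unbounded, which is where "could possibly" and "provably" differ; with the constants assumed here that distinction disappears. All names below are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical concrete stand-ins: one closed interval [min, max] per dimension.
struct IntervalSketch {
    int min, max;
};
using BoxSketch = std::vector<IntervalSketch>;

// Two boxes overlap only if their intervals overlap in every dimension.
bool boxes_overlap_sketch(const BoxSketch &a, const BoxSketch &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (a[i].max < b[i].min || b[i].max < a[i].min) return false;
    }
    return true;
}

// Per-dimension intersection. A dimension with min > max means the boxes
// do not overlap there, so the result is empty.
BoxSketch box_intersection_sketch(const BoxSketch &a, const BoxSketch &b) {
    assert(a.size() == b.size());
    BoxSketch result;
    for (std::size_t i = 0; i < a.size(); i++) {
        result.push_back({std::max(a[i].min, b[i].min), std::min(a[i].max, b[i].max)});
    }
    return result;
}

// Box a contains box b if b's interval lies inside a's in every dimension.
bool box_contains_sketch(const BoxSketch &a, const BoxSketch &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (b[i].min < a[i].min || b[i].max > a[i].max) return false;
    }
    return true;
}
```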
std::map< std::string, Box > Halide::Internal::boxes_required | ( | const Expr & | e, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
Compute rectangular domains large enough to cover all the 'Call's to each function that occurs within a given statement or expression.
This is useful for figuring out what regions of things to evaluate. Respects control flow (e.g. encodes if statement conditions), but assumes all encountered asserts pass. If it encounters an assert(false) in one if branch, assumes the opposite if branch runs unconditionally.
std::map< std::string, Box > Halide::Internal::boxes_required | ( | Stmt | s, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
std::map< std::string, Box > Halide::Internal::boxes_provided | ( | const Expr & | e, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
Compute rectangular domains large enough to cover all the 'Provides's to each function that occurs within a given statement or expression.
Handles asserts in the same way as boxes_required.
std::map< std::string, Box > Halide::Internal::boxes_provided | ( | Stmt | s, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
std::map< std::string, Box > Halide::Internal::boxes_touched | ( | const Expr & | e, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
Compute rectangular domains large enough to cover all the 'Call's and 'Provides's to each function that occurs within a given statement or expression.
Handles asserts in the same way as boxes_required.
std::map< std::string, Box > Halide::Internal::boxes_touched | ( | Stmt | s, |
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
Box Halide::Internal::box_required | ( | const Expr & | e, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
Variants of the above that are only concerned with a single function.
Box Halide::Internal::box_required | ( | Stmt | s, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
Box Halide::Internal::box_provided | ( | const Expr & | e, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
Box Halide::Internal::box_provided | ( | Stmt | s, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
Box Halide::Internal::box_touched | ( | const Expr & | e, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
Box Halide::Internal::box_touched | ( | Stmt | s, |
const std::string & | fn, | ||
const Scope< Interval > & | scope = Scope< Interval >::empty_scope(), | ||
const FuncValueBounds & | func_bounds = empty_func_value_bounds() ) |
FuncValueBounds Halide::Internal::compute_function_value_bounds | ( | const std::vector< std::string > & | order, |
const std::map< std::string, Function > & | env ) |
Compute the maximum and minimum possible value for each function in an environment.
void Halide::Internal::bounds_test | ( | ) |
Stmt Halide::Internal::bounds_inference | ( | Stmt | , |
const std::vector< Function > & | outputs, | ||
const std::vector< std::string > & | realization_order, | ||
const std::vector< std::vector< std::string > > & | fused_groups, | ||
const std::map< std::string, Function > & | environment, | ||
const std::map< std::pair< std::string, int >, Interval > & | func_bounds, | ||
const Target & | target ) |
Take a partially lowered statement that includes symbolic representations of the bounds over which things should be realized, and inject expressions defining those bounds.
std::string Halide::Internal::get_name_from_end_of_parameter_pack | ( | T && | ) |
Definition at line 52 of file Buffer.h.
Referenced by get_name_from_end_of_parameter_pack().
std::string Halide::Internal::get_name_from_end_of_parameter_pack | ( | First | first, |
Second | second, | ||
Args &&... | rest ) |
Definition at line 59 of file Buffer.h.
References get_name_from_end_of_parameter_pack().
Definition at line 63 of file Buffer.h.
Referenced by get_shape_from_start_of_parameter_pack(), and get_shape_from_start_of_parameter_pack_helper().
void Halide::Internal::get_shape_from_start_of_parameter_pack_helper | ( | std::vector< int > & | result, |
int | x, | ||
Args &&... | rest ) |
Definition at line 70 of file Buffer.h.
References get_shape_from_start_of_parameter_pack_helper().
std::vector< int > Halide::Internal::get_shape_from_start_of_parameter_pack | ( | Args &&... | args | ) |
Definition at line 76 of file Buffer.h.
References get_shape_from_start_of_parameter_pack_helper().
void Halide::Internal::buffer_type_name_non_const | ( | std::ostream & | s | ) |
std::string Halide::Internal::buffer_type_name | ( | ) |
Canonicalize GPU var names into some pre-determined block/thread names (i.e. __block_id_x, __thread_id_x, etc.). The x/y/z/w order is determined by the nesting order: innermost is assigned to x and so on.
const std::string & Halide::Internal::gpu_thread_name | ( | int | index | ) |
Names for the thread and block id variables.
Includes the leading dot. Indexed from inside out, so 0 gives you the innermost loop.
const std::string & Halide::Internal::gpu_block_name | ( | int | index | ) |
Stmt Halide::Internal::clamp_unsafe_accesses | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env, | ||
FuncValueBounds & | func_bounds ) |
Inject clamps around func calls h(...) when all the following conditions hold:
std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_D3D12Compute_Dev | ( | const Target & | target | ) |
llvm::Type * Halide::Internal::get_vector_element_type | ( | llvm::Type * | ) |
Get the scalar type of an llvm vector type.
Returns the argument if it's not a vector type.
bool Halide::Internal::function_takes_user_context | ( | const std::string & | name | ) |
Which built-in functions require a user-context first argument?
bool Halide::Internal::can_allocation_fit_on_stack | ( | int64_t | size | ) |
Given a size (in bytes), return True if the allocation size can fit on the stack; otherwise, return False.
This routine asserts if size is non-positive.
std::pair< Expr, Expr > Halide::Internal::long_div_mod_round_to_zero | ( | const Expr & | a, |
const Expr & | b, | ||
const uint64_t * | max_abs = nullptr ) |
Does a {div/mod}_round_to_zero using binary long division for int/uint.
max_abs is the maximum absolute value of (a/b). Returns the pair {div_round_to_zero, mod_round_to_zero}.
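Binary long division, the technique named above, can be sketched on plain unsigned integers as follows. This is a standalone illustration, not Halide's implementation: the real function builds Exprs, also handles signed types, and takes the optional max_abs hint, none of which is modeled here. The name `long_div_mod_sketch` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Restoring binary long division on unsigned 32-bit values.
// Returns {quotient, remainder}. Requires b != 0.
std::pair<uint32_t, uint32_t> long_div_mod_sketch(uint32_t a, uint32_t b) {
    assert(b != 0);
    uint32_t quotient = 0;
    uint64_t remainder = 0;  // 64-bit so the left shift below cannot overflow
    for (int bit = 31; bit >= 0; bit--) {
        // Bring down the next bit of the dividend, most significant first.
        remainder = (remainder << 1) | ((a >> bit) & 1u);
        // If the divisor fits, subtract it and set this quotient bit.
        if (remainder >= b) {
            remainder -= b;
            quotient |= (uint32_t(1) << bit);
        }
    }
    return {quotient, static_cast<uint32_t>(remainder)};
}
```

For unsigned operands, rounding toward zero and rounding down coincide, so the pair returned here matches the {div_round_to_zero, mod_round_to_zero} shape described above.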
Expr Halide::Internal::lower_int_uint_div | ( | const Expr & | a, |
const Expr & | b, | ||
bool | round_to_zero = false ) |
Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.
Can introduce mulhi_shr and sorted_avg intrinsics as well as those from the lower_euclidean_ operation – div_round_to_zero or mod_round_to_zero.
Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.
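Defining Euclidean division in terms of round-to-zero division can be sketched on scalar integers. C++ `/` and `%` already round toward zero, so they stand in for div_round_to_zero and mod_round_to_zero here. This is a hedged illustration with hypothetical names; it does not reproduce how Halide lowers Exprs or vectors, and it simply excludes a zero denominator rather than modeling any particular behavior for it.

```cpp
#include <cassert>
#include <cstdint>

struct DivModSketch {
    int32_t quot, rem;
};

// Euclidean division: a == quot * b + rem with 0 <= rem < |b|.
// Requires b != 0 and excludes the overflowing case INT32_MIN / -1.
DivModSketch euclidean_div_mod_sketch(int32_t a, int32_t b) {
    assert(b != 0);
    assert(!(a == INT32_MIN && b == -1));
    int32_t q = a / b;  // rounds toward zero
    int32_t r = a % b;  // takes the sign of a
    if (r < 0) {
        // Shift the remainder into [0, |b|) and compensate in the quotient.
        if (b > 0) {
            q -= 1;
            r += b;
        } else {
            q += 1;
            r -= b;
        }
    }
    return {q, r};
}
```

For example, -7 divided by 2 truncates to quotient -3 and remainder -1, and the correction step turns that into quotient -4 and remainder 1.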
Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.
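A scalar sketch of this rewrite is shown here, assuming that a negative shift amount means a shift in the opposite direction. The function name and the restriction to amounts smaller than the bit width are assumptions made for the example, not part of the documented API.

```cpp
#include <cassert>
#include <cstdint>

// Left shift by a signed amount, expressed using only unsigned shift amounts:
// a negative amount becomes a right shift by its magnitude.
// Assumes |b| < 32 so every shift is well defined in C++.
uint32_t shift_left_signed_sketch(uint32_t a, int32_t b) {
    assert(b > -32 && b < 32);
    if (b >= 0) {
        return a << static_cast<uint32_t>(b);
    }
    return a >> static_cast<uint32_t>(-b);
}
```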
Reduce bit extraction and concatenation to bit ops.
A vectorizable implementation of Halide::round that doesn't depend on any standard library being present.
void Halide::Internal::get_target_options | ( | const llvm::Module & | module, |
llvm::TargetOptions & | options ) |
Given an llvm::Module, set llvm::TargetOptions information.
void Halide::Internal::clone_target_options | ( | const llvm::Module & | from, |
llvm::Module & | to ) |
Given two llvm::Modules, clone target options from one to the other.
std::unique_ptr< llvm::TargetMachine > Halide::Internal::make_target_machine | ( | const llvm::Module & | module | ) |
Given an llvm::Module, get or create an llvm::TargetMachine.
void Halide::Internal::set_function_attributes_from_halide_target_options | ( | llvm::Function & | ) |
void Halide::Internal::embed_bitcode | ( | llvm::Module * | M, |
const std::string & | halide_command ) |
Save a copy of the llvm IR currently represented by the module as data in the __LLVM,__bitcode section.
Emulates clang's -fembed-bitcode flag and is useful to satisfy Apple's bitcode inclusion requirements.
std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_Metal_Dev | ( | const Target & | target | ) |
std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_OpenCL_Dev | ( | const Target & | target | ) |
std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_PTX_Dev | ( | const Target & | target | ) |
std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_ARM | ( | const Target & | target | ) |
Construct CodeGen object for a variety of targets.
std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_Hexagon | ( | const Target & | target | ) |
std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_PowerPC | ( | const Target & | target | ) |
std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_RISCV | ( | const Target & | target | ) |
std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_X86 | ( | const Target & | target | ) |
std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_WebAssembly | ( | const Target & | target | ) |
std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_Vulkan_Dev | ( | const Target & | target | ) |
std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_WebGPU_Dev | ( | const Target & | target | ) |
std::unique_ptr< CompilerLogger > Halide::Internal::set_compiler_logger | ( | std::unique_ptr< CompilerLogger > | compiler_logger | ) |
Set the active CompilerLogger object, replacing any existing one.
It is legal to pass in a nullptr (which means "don't do any compiler logging"). Returns the previous CompilerLogger (if any).
CompilerLogger * Halide::Internal::get_compiler_logger | ( | ) |
Return the currently active CompilerLogger object.
If set_compiler_logger() has never been called, a nullptr implementation will be returned. Do not save the pointer returned! It is intended to be used for immediate calls only.
ConstantInterval Halide::Internal::constant_integer_bounds | ( | const Expr & | e, |
const Scope< ConstantInterval > & | scope = Scope< ConstantInterval >::empty_scope(), | ||
std::map< Expr, ConstantInterval, ExprCompare > * | cache = nullptr ) |
Deduce constant integer bounds on an expression.
This can be useful to decide if, for example, the expression can be cast to another type, be negated, be incremented, etc without risking overflow.
Also optionally accepts a scope containing the integer bounds of any variables that may be referenced, and a cache of constant integer bounds on known Exprs, which this function will update. The cache is helpful to short-circuit large numbers of redundant queries, but it should not be used in contexts where the same Expr object may take on different values within a single Expr (i.e. before uniquify_variable_names).
ConstantInterval Halide::Internal::operator+ | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
Arithmetic operators on ConstantIntervals.
The resulting interval contains all possible values of the operator applied to any two elements of the argument intervals. Note that these operate on unbounded integers. If you are applying this to concrete small integer types, you will need to manually cast the constant interval back to the desired type to model the effect of overflow.
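The interval arithmetic described above can be sketched with a bounded-only interval type. The real ConstantInterval can also be unbounded on either side, which is not modeled; `CIntervalSketch` and `fits_in_int8_sketch` are hypothetical names introduced for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical closed interval [min, max] over mathematical integers.
struct CIntervalSketch {
    int64_t min, max;
};

// Sum: the extremes add.
CIntervalSketch operator+(const CIntervalSketch &a, const CIntervalSketch &b) {
    return {a.min + b.min, a.max + b.max};
}

// Difference: the smallest result subtracts the largest b, and vice versa.
CIntervalSketch operator-(const CIntervalSketch &a, const CIntervalSketch &b) {
    return {a.min - b.max, a.max - b.min};
}

// Product: signs may flip, so take the extremes of all four corner products.
CIntervalSketch operator*(const CIntervalSketch &a, const CIntervalSketch &b) {
    const int64_t p1 = a.min * b.min, p2 = a.min * b.max;
    const int64_t p3 = a.max * b.min, p4 = a.max * b.max;
    return {std::min({p1, p2, p3, p4}), std::max({p1, p2, p3, p4})};
}

// Example use: decide whether every value in the interval fits in int8_t.
bool fits_in_int8_sketch(const CIntervalSketch &i) {
    return i.min >= -128 && i.max <= 127;
}
```

With a in [100, 120] and b in [10, 20], the sum lies in [110, 140]. The operands each fit in an 8-bit signed integer but the sum does not, which is the kind of overflow question these bounds are meant to answer.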
ConstantInterval Halide::Internal::operator+ | ( | const ConstantInterval & | a, |
int64_t | b ) |
ConstantInterval Halide::Internal::operator- | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
ConstantInterval Halide::Internal::operator- | ( | const ConstantInterval & | a, |
int64_t | b ) |
ConstantInterval Halide::Internal::operator/ | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
ConstantInterval Halide::Internal::operator/ | ( | const ConstantInterval & | a, |
int64_t | b ) |
ConstantInterval Halide::Internal::operator* | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
ConstantInterval Halide::Internal::operator* | ( | const ConstantInterval & | a, |
int64_t | b ) |
ConstantInterval Halide::Internal::operator% | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
ConstantInterval Halide::Internal::operator% | ( | const ConstantInterval & | a, |
int64_t | b ) |
ConstantInterval Halide::Internal::min | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
ConstantInterval Halide::Internal::min | ( | const ConstantInterval & | a, |
int64_t | b ) |
ConstantInterval Halide::Internal::max | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
ConstantInterval Halide::Internal::max | ( | const ConstantInterval & | a, |
int64_t | b ) |
ConstantInterval Halide::Internal::abs | ( | const ConstantInterval & | a | ) |
Referenced by Halide::Internal::IRMatcher::Intrin< Args >::make().
ConstantInterval Halide::Internal::operator<< | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
ConstantInterval Halide::Internal::operator<< | ( | const ConstantInterval & | a, |
int64_t | b ) |
ConstantInterval Halide::Internal::operator<< | ( | int64_t | a, |
const ConstantInterval & | b ) |
ConstantInterval Halide::Internal::operator>> | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
ConstantInterval Halide::Internal::operator>> | ( | const ConstantInterval & | a, |
int64_t | b ) |
ConstantInterval Halide::Internal::operator>> | ( | int64_t | a, |
const ConstantInterval & | b ) |
bool Halide::Internal::operator<= | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
Comparison operators on ConstantIntervals.
Returns whether the comparison is true for all values of the two intervals.
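Because a comparison must hold for all values of both intervals, a false result does not imply that the opposite comparison is true. A minimal sketch on bounded intervals, with hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed interval [min, max]; unbounded intervals are not modeled.
struct CIntervalSketch {
    int64_t min, max;
};

// True only if x <= y for every x in a and every y in b.
bool provably_le_sketch(const CIntervalSketch &a, const CIntervalSketch &b) {
    return a.max <= b.min;
}

// True only if x < y for every x in a and every y in b.
bool provably_lt_sketch(const CIntervalSketch &a, const CIntervalSketch &b) {
    return a.max < b.min;
}
```

For overlapping intervals such as [0, 5] and [3, 7], neither ordering can be proven, so both directions return false.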
bool Halide::Internal::operator<= | ( | const ConstantInterval & | a, |
int64_t | b ) |
bool Halide::Internal::operator<= | ( | int64_t | a, |
const ConstantInterval & | b ) |
bool Halide::Internal::operator< | ( | const ConstantInterval & | a, |
const ConstantInterval & | b ) |
bool Halide::Internal::operator< | ( | const ConstantInterval & | a, |
int64_t | b ) |
bool Halide::Internal::operator< | ( | int64_t | a, |
const ConstantInterval & | b ) |
Definition at line 144 of file ConstantInterval.h.
Definition at line 147 of file ConstantInterval.h.
Definition at line 150 of file ConstantInterval.h.
Definition at line 153 of file ConstantInterval.h.
Definition at line 156 of file ConstantInterval.h.
Definition at line 159 of file ConstantInterval.h.
std::string Halide::Internal::cplusplus_function_mangled_name | ( | const std::string & | name, |
const std::vector< std::string > & | namespaces, | ||
Type | return_type, | ||
const std::vector< ExternFuncArgument > & | args, | ||
const Target & | target ) |
Return the mangled C++ name for a function.
The target parameter is used to decide on the C++ ABI/mangling style to use.
void Halide::Internal::cplusplus_mangle_test | ( | ) |
Replace each common sub-expression in the argument with a variable, and wrap the resulting expr in a let statement giving a value to that variable.
This is important to do within Halide (instead of punting to llvm), because exprs that come in from the front-end are small when considered as a graph, but combinatorially large when considered as a tree. For an example of such a case, see test/code_explosion.cpp
The last parameter determines whether all common subexpressions are lifted, or only those that the simplifier would not substitute back in (e.g. addition of a constant).
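The graph-versus-tree size gap mentioned above can be demonstrated with a toy expression type. This sketch is not Halide IR and does not perform CSE; it only counts nodes for a chain in which each expression adds the previous one to itself. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <set>

// Toy binary node: a leaf has two null children.
struct NodeSketch {
    std::shared_ptr<NodeSketch> a, b;
};
using NodePtr = std::shared_ptr<NodeSketch>;

// Build e_depth, where e_0 is a leaf and e_{k+1} = e_k + e_k (child shared).
NodePtr build_doubling_chain(int depth) {
    NodePtr e = std::make_shared<NodeSketch>();
    for (int i = 0; i < depth; i++) {
        NodePtr parent = std::make_shared<NodeSketch>();
        parent->a = e;
        parent->b = e;
        e = parent;
    }
    return e;
}

// Size when viewed as a tree: shared children are counted once per use.
uint64_t tree_size(const NodePtr &n) {
    if (!n) return 0;
    return 1 + tree_size(n->a) + tree_size(n->b);
}

// Helper: visit each distinct node once.
void collect_nodes(const NodePtr &n, std::set<const NodeSketch *> &seen) {
    if (!n || !seen.insert(n.get()).second) return;
    collect_nodes(n->a, seen);
    collect_nodes(n->b, seen);
}

// Size when viewed as a graph: distinct nodes only.
uint64_t graph_size(const NodePtr &n) {
    std::set<const NodeSketch *> seen;
    collect_nodes(n, seen);
    return seen.size();
}
```

At depth 10 the graph has 11 nodes while the tree has 2047, so any pass that walks the expression as a tree without sharing does exponentially more work.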
Do common-subexpression-elimination on each expression in a statement.
Does not introduce let statements.
void Halide::Internal::cse_test | ( | ) |
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const Stmt & | ) |
Emit a halide statement on an output stream (such as std::cout) in a human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | , |
const LoweredFunc & | ) |
Emit a halide LoweredFunc in a human readable format.
Halide::Internal::PrintSpan | ( | const T & | ) | -> PrintSpan< T > |
Definition at line 85 of file Debug.h.
References Halide::Internal::PrintSpan< T >::span.
Halide::Internal::PrintSpanLn | ( | const T & | ) | -> PrintSpanLn< T > |
Definition at line 119 of file Debug.h.
References Halide::Internal::PrintSpanLn< T >::span.
void Halide::Internal::debug_arguments | ( | LoweredFunc * | func, |
const Target & | t ) |
Injects debug prints in a LoweredFunc that describe the target and arguments.
Mutates the given func.
Stmt Halide::Internal::debug_to_file | ( | Stmt | s, |
const std::vector< Function > & | outputs, | ||
const std::map< std::string, Function > & | env ) |
Takes a statement with Realize nodes still unlowered.
If the corresponding functions have a debug_file set, then inject code that will dump the contents of those functions to a file after the realization.
Extract the odd-numbered lanes in a vector.
Extract the even-numbered lanes in a vector.
Extract the nth lane of a vector.
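The three lane extractions above can be pictured on a plain `std::vector`. The actual functions operate on vector-typed Exprs; the names below are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Take every second lane, beginning at lane `start`.
std::vector<int> extract_strided_lanes_sketch(const std::vector<int> &v, std::size_t start) {
    std::vector<int> out;
    for (std::size_t i = start; i < v.size(); i += 2) {
        out.push_back(v[i]);
    }
    return out;
}

// Even-numbered lanes: 0, 2, 4, ...
std::vector<int> extract_even_lanes_sketch(const std::vector<int> &v) {
    return extract_strided_lanes_sketch(v, 0);
}

// Odd-numbered lanes: 1, 3, 5, ...
std::vector<int> extract_odd_lanes_sketch(const std::vector<int> &v) {
    return extract_strided_lanes_sketch(v, 1);
}

// A single lane, checked against the vector width.
int extract_lane_sketch(const std::vector<int> &v, std::size_t n) {
    assert(n < v.size());
    return v[n];
}
```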
Look through a statement for expressions of the form select(ramp % 2 == 0, a, b) and replace them with calls to an interleave intrinsic.
void Halide::Internal::deinterleave_vector_test | ( | ) |
Remove all let definitions of expr.
std::vector< int > Halide::Internal::gather_variables | ( | const Expr & | expr, |
const std::vector< std::string > & | filter ) |
Return the indices of the variables in the filter that expr depends on.
std::vector< int > Halide::Internal::gather_variables | ( | const Expr & | expr, |
const std::vector< Var > & | filter ) |
std::map< std::string, ReductionVariableInfo > Halide::Internal::gather_rvariables | ( | const Expr & | expr | ) |
std::map< std::string, ReductionVariableInfo > Halide::Internal::gather_rvariables | ( | const Tuple & | tuple | ) |
Expr Halide::Internal::add_let_expression | ( | const Expr & | expr, |
const std::map< std::string, Expr > & | let_var_mapping, | ||
const std::vector< std::string > & | let_variables ) |
Add necessary let expressions to expr.
Topologically sort the expression graph expressed by expr.
std::map< std::string, Box > Halide::Internal::inference_bounds | ( | const std::vector< Func > & | funcs, |
const std::vector< Box > & | output_bounds ) |
Compute the bounds of funcs.
The bounds represent a conservative region that is used by the "consumers" of the function, excluding the function itself.
std::map< std::string, Box > Halide::Internal::inference_bounds | ( | const Func & | func, |
const Box & | output_bounds ) |
Return true if bounds0 and bounds1 represent the same bounds.
Referenced by equal(), graph_equal(), Halide::Internal::IRMatcher::SpecificExpr::match(), Halide::Internal::IRMatcher::Wild< i >::match(), Halide::Internal::AssociativeOp::Replacement::operator==(), and Halide::Internal::AssociativePattern::operator==().
std::vector< std::string > Halide::Internal::vars_to_strings | ( | const std::vector< Var > & | vars | ) |
Return a list of variable names.
ReductionDomain Halide::Internal::extract_rdom | ( | const Expr & | expr | ) |
Return the reduction domain used by expr.
std::pair< bool, Expr > Halide::Internal::solve_inverse | ( | Expr | expr, |
const std::string & | new_var, | ||
const std::string & | var ) |
expr is new_var == f(var); solve for var == g(new_var). If multiple values of new_var correspond to the same var, introduce an RDom.
std::map< std::string, BufferInfo > Halide::Internal::find_buffer_param_calls | ( | const Func & | func | ) |
std::set< std::string > Halide::Internal::find_implicit_variables | ( | const Expr & | expr | ) |
Find all implicit variables in expr.
Expr Halide::Internal::substitute_rdom_predicate | ( | const std::string & | name, |
const Expr & | replacement, | ||
const Expr & | expr ) |
Substitute the variable.
Also replace all occurrences in rdom.where() predicates.
bool Halide::Internal::is_calling_function | ( | const std::string & | func_name, |
const Expr & | expr, | ||
const std::map< std::string, Expr > & | let_var_mapping ) |
Return true if expr contains a call to func_name.
bool Halide::Internal::is_calling_function | ( | const Expr & | expr, |
const std::map< std::string, Expr > & | let_var_mapping ) |
Return true if expr depends on any function or buffer.
Expr Halide::Internal::make_device_interface_call | ( | DeviceAPI | device_api, |
MemoryType | memory_type = MemoryType::Auto ) |
Get an Expr which evaluates to the device interface for the given device api at runtime.
Take a statement with allocations and inject markers (of the form of calls to "mark buffer dead") after the last use of each allocation.
Targets may use this to free buffers earlier than the close of their Allocate node.
Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.
For example, instead of select(i1x8, u16x8, u16x8), the target would prefer to see select(u16x8, u16x8, u16x8), where the first argument is a vector of integers representing a mask. This pass converts vectors of bools to vectors of integers to meet this requirement. This is done by injecting intrinsics to convert bools to architecture-specific masks, and using a select_mask intrinsic instead of a Select node. This also converts any intrinsics that operate on vectorized conditions to a *_mask equivalent (if_then_else, require). Because the masks are architecture specific, they may not be stored or loaded. On Stores, the masks are converted to UInt(8) with a value of 0 or 1, which is our canonical in-memory representation of a bool.
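The idea of selecting with an integer mask instead of a bool vector can be sketched with fixed-size arrays. The all-ones mask encoding used here is an assumption made for the example (the text above says masks are architecture-specific), and every name is hypothetical. Only the 0-or-1 in-memory form is taken from the description.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using U16x8 = std::array<uint16_t, 8>;
using Boolx8 = std::array<bool, 8>;

// Widen each bool lane to a 16-bit mask: all ones for true, all zeros for false.
// (Assumed encoding; a real target may use a different mask layout.)
U16x8 bools_to_mask_sketch(const Boolx8 &cond) {
    U16x8 mask{};
    for (std::size_t i = 0; i < mask.size(); i++) {
        mask[i] = cond[i] ? 0xFFFF : 0x0000;
    }
    return mask;
}

// Lane-wise select driven by an integer mask rather than a bool vector.
U16x8 select_mask_sketch(const U16x8 &mask, const U16x8 &a, const U16x8 &b) {
    U16x8 result{};
    for (std::size_t i = 0; i < result.size(); i++) {
        result[i] = static_cast<uint16_t>((mask[i] & a[i]) | (~mask[i] & b[i]));
    }
    return result;
}

// Convert one mask lane to the canonical stored form of a bool: 0 or 1 in a uint8_t.
uint8_t mask_lane_to_stored_bool_sketch(uint16_t mask_lane) {
    return mask_lane != 0 ? 1 : 0;
}
```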
If a type is a boolean vector, find the type that it has been changed to by eliminate_bool_vectors.
Definition at line 32 of file EliminateBoolVectors.h.
References Halide::Type::bits(), Halide::Type::Int, Halide::Type::is_vector(), Halide::Type::with_bits(), and Halide::Type::with_code().
bool Halide::Internal::is_float16_transcendental | ( | const Call * | ) |
Check if a call is a float16 transcendental (e.g. sqrt_f16).
Implement a float16 transcendental using the float32 equivalent.
Cast to/from float and bfloat using bitwise math.
HALIDE_EXPORT_SYMBOL void Halide::Internal::unhandled_exception_handler | ( | ) |
References unhandled_exception_handler().
Referenced by unhandled_exception_handler().
bool Halide::Internal::is_unordered_parallel | ( | ForType | for_type | ) |
Check if for_type executes for loop iterations in parallel and unordered.
Referenced by Halide::Internal::Dim::is_unordered_parallel(), and Halide::Internal::For::is_unordered_parallel().
bool Halide::Internal::is_parallel | ( | ForType | for_type | ) |
Returns true if for_type executes for loop iterations in parallel.
Referenced by Halide::Internal::Dim::is_parallel(), and Halide::Internal::For::is_parallel().
bool Halide::Internal::is_gpu | ( | ForType | for_type | ) |
Returns true if for_type is GPUBlock, GPUThread, or GPULane.
Test if a statement or expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 101 of file ExprUsesVar.h.
References Halide::Internal::ExprUsesVars< T >::result.
Referenced by expr_uses_vars(), stmt_or_expr_uses_var(), and stmt_uses_vars().
Test if a statement or expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 113 of file ExprUsesVar.h.
References Halide::Internal::Scope< T >::push(), and stmt_or_expr_uses_vars().
Referenced by expr_uses_var(), and stmt_uses_var().
|
inline |
Test if an expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 124 of file ExprUsesVar.h.
References stmt_or_expr_uses_var().
|
inline |
Test if a statement references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 133 of file ExprUsesVar.h.
References Halide::stmt, and stmt_or_expr_uses_var().
|
inline |
Test if an expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 143 of file ExprUsesVar.h.
References stmt_or_expr_uses_vars().
|
inline |
Test if a statement references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.
Definition at line 153 of file ExprUsesVar.h.
References Halide::stmt, and stmt_or_expr_uses_vars().
Rewrite any AMX tile operations that have been stored in the AMXTile memory type as intrinsic calls, to be used in the X86 backend.
std::map< std::string, Function > Halide::Internal::build_environment | ( | const std::vector< Function > & | funcs | ) |
Find all Functions transitively referenced by any Function in funcs and return a map of them.
std::vector< Function > Halide::Internal::called_funcs_in_order_found | ( | const std::vector< Function > & | funcs | ) |
Returns the same Functions as build_environment, but returns a vector of Functions instead, where the order is the order in which the Functions were first encountered.
This is stable to changes in the names of the Functions.
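The relationship between build_environment and called_funcs_in_order_found can be sketched in plain C++ without Halide. Everything named below (Node, visit, order_found_sketch, build_env_sketch) is a hypothetical stand-in. The depth-first traversal and the inclusion of the root functions in the result are assumptions made for illustration, not a statement of how Halide walks the call graph.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for Halide::Internal::Function: a name plus the
// functions it calls directly.
struct Node {
    std::string name;
    std::vector<const Node *> calls;
};

// Depth-first walk that records each node the first time it is seen.
inline void visit(const Node *n, std::set<const Node *> &seen,
                  std::vector<const Node *> &order) {
    if (!seen.insert(n).second) return;  // already encountered
    order.push_back(n);
    for (const Node *callee : n->calls) visit(callee, seen, order);
}

// Analogue of called_funcs_in_order_found: the order depends only on the
// call structure, so renaming a node does not change it.
inline std::vector<const Node *> order_found_sketch(const std::vector<const Node *> &roots) {
    std::set<const Node *> seen;
    std::vector<const Node *> order;
    for (const Node *r : roots) visit(r, seen, order);
    return order;
}

// Analogue of build_environment: the same set of nodes, keyed by name
// (and therefore iterated in name order).
inline std::map<std::string, const Node *> build_env_sketch(const std::vector<const Node *> &roots) {
    std::map<std::string, const Node *> env;
    for (const Node *n : order_found_sketch(roots)) env[n->name] = n;
    return env;
}
```

Iterating the map visits entries sorted by name, while the vector keeps first-encounter order; that difference is why only the vector form is stable under renaming.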
Implement intrinsics using non-intrinsic equivalents.
Expr Halide::Internal::lower_rounding_mul_shift_right | ( | const Expr & | a, |
const Expr & | b, | ||
const Expr & | q ) |
Replace one of the above ops with equivalent arithmetic.
Replace common arithmetic patterns with intrinsics.
Take a statement/expression and replace nested ramps and broadcasts.
|
inline |
Definition at line 2616 of file Func.h.
References Halide::type_of(), and user_assert.
Referenced by check_types(), Halide::evaluate(), and Halide::evaluate_may_gpu().
|
inline |
Definition at line 2625 of file Func.h.
References check_types().
|
inline |
Definition at line 2631 of file Func.h.
References Buffer.
Referenced by assign_results(), Halide::evaluate(), and Halide::evaluate_may_gpu().
|
inline |
Definition at line 2637 of file Func.h.
References assign_results().
|
inline |
Definition at line 2684 of file Func.h.
References Halide::get_jit_target_from_environment(), Halide::Func::gpu_single_thread(), Halide::Target::has_feature(), Halide::Target::has_gpu_feature(), Halide::Func::hexagon(), and Halide::Target::HVX.
Referenced by Halide::evaluate_may_gpu(), and Halide::evaluate_may_gpu().
std::pair< std::vector< Function >, std::map< std::string, Function > > Halide::Internal::deep_copy | ( | const std::vector< Function > & | outputs, |
const std::map< std::string, Function > & | env ) |
Deep copy an entire Function DAG.
Rewrite all GPU loops to have a min of zero.
Converts Halide's GPGPU IR to the OpenCL/CUDA/Metal model.
Within every loop over gpu block indices, fuse the inner loops over thread indices into a single loop (with predication to turn off threads). Push if conditions between GPU blocks to the innermost GPU threads. Also injects synchronization points as needed, and hoists shared allocations at the block level out into a single shared memory array, and heap allocations into a slice of a global pool allocated outside the kernel.
On every store of a floating point value, mask off the least-significant-bit of the mantissa.
We've found that whether or not this dramatically changes the output of a pipeline correlates very well with whether or not a pipeline will produce very different outputs on different architectures (e.g. with and without FMA). It's also a useful way to detect bad tests, such as those that expect exact floating point equality across platforms.
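As a rough illustration of the masking step, the following standalone C++ sketch clears the lowest mantissa bit of a single 32-bit float. The function name mask_mantissa_lsb is hypothetical, and the sketch assumes IEEE 754 binary32 floats; the actual pass operates on Halide Store nodes rather than on host values.

```cpp
#include <cstdint>
#include <cstring>

// Clear the least-significant mantissa bit of a 32-bit float.
// Assumes float is IEEE 754 binary32, where bit 0 is the mantissa LSB.
inline float mask_mantissa_lsb(float v) {
    std::uint32_t bits;
    std::memcpy(&bits, &v, sizeof(bits));  // copy out the bit pattern
    bits &= ~std::uint32_t{1};             // drop the lowest mantissa bit
    std::memcpy(&v, &bits, sizeof(v));     // copy the pattern back
    return v;
}
```

Values whose lowest mantissa bit is already zero pass through unchanged, so a pipeline that is robust to small rounding differences should produce nearly identical output with and without the mask.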
void Halide::Internal::generator_test | ( | ) |
HALIDE_NO_USER_CODE_INLINE std::string Halide::Internal::enum_to_string | ( | const std::map< std::string, T > & | enum_map, |
const T & | t ) |
Definition at line 297 of file Generator.h.
References user_error.
Referenced by Halide::Internal::GeneratorParam_Enum< T >::get_default_value(), and halide_type_to_enum_string().
T Halide::Internal::enum_from_string | ( | const std::map< std::string, T > & | enum_map, |
const std::string & | s ) |
Definition at line 308 of file Generator.h.
References user_assert.
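The two enum helpers amount to a forward and a reverse lookup in a name-to-value map. The sketch below shows that idea in standalone C++. The names enum_to_string_sketch and enum_from_string_sketch are hypothetical, and throwing std::invalid_argument is an assumption standing in for the user_error and user_assert macros that the real functions reference.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Reverse lookup: find the name whose mapped value equals t.
template<typename T>
std::string enum_to_string_sketch(const std::map<std::string, T> &enum_map, const T &t) {
    for (const auto &kv : enum_map) {
        if (kv.second == t) return kv.first;
    }
    // Assumption: report failure with an exception instead of user_error.
    throw std::invalid_argument("enumeration value not found");
}

// Forward lookup: find the value registered under the name s.
template<typename T>
T enum_from_string_sketch(const std::map<std::string, T> &enum_map, const std::string &s) {
    auto it = enum_map.find(s);
    if (it == enum_map.end()) {
        // Assumption: report failure with an exception instead of user_assert.
        throw std::invalid_argument("enumeration value not found: " + s);
    }
    return it->second;
}
```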
|
extern |
Referenced by halide_type_to_enum_string().
|
inline |
Definition at line 315 of file Generator.h.
References enum_to_string(), and get_halide_type_enum_map().
std::string Halide::Internal::halide_type_to_c_source | ( | const Type & | t | ) |
std::string Halide::Internal::halide_type_to_c_type | ( | const Type & | t | ) |
const GeneratorFactoryProvider & Halide::Internal::get_registered_generators | ( | ) |
Return a GeneratorFactoryProvider that knows about all the currently-registered C++ Generators.
int Halide::Internal::generate_filter_main | ( | int | argc, |
char ** | argv ) |
generate_filter_main() is a convenient wrapper for GeneratorRegistry::create() + compile_to_files(); it can be trivially wrapped by a "real" main() to produce a command-line utility for ahead-of-time filter compilation.
int Halide::Internal::generate_filter_main | ( | int | argc, |
char ** | argv, | ||
const GeneratorFactoryProvider & | generator_factory_provider ) |
This overload of generate_filter_main lets you provide your own provider for how to enumerate and/or create the generators based on registration name; this is useful if you want to re-use the 'main' logic but avoid the global Generator registry (e.g. for bindings in languages other than C++).
T Halide::Internal::parse_scalar | ( | const std::string & | value | ) |
Definition at line 2882 of file Generator.h.
References parse_scalar(), and user_assert.
Referenced by parse_scalar().
std::vector< Type > Halide::Internal::parse_halide_type_list | ( | const std::string & | types | ) |
References parse_halide_type_list().
Referenced by parse_halide_type_list().
void Halide::Internal::execute_generator | ( | const ExecuteGeneratorArgs & | args | ) |
Execute a Generator for AOT compilation. This provides the implementation of the command-line Generator interface generate_filter_main(), but with a structured API that is more suitable for calling directly from code (vs command line).
References execute_generator().
Referenced by execute_generator().
Pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module.
Buffer< uint8_t > Halide::Internal::compile_module_to_hexagon_shared_object | ( | const Module & | device_code | ) |
Replace indirect and other loads with simple loads + vlut calls.
Hexagon deinterleaves when performing widening operations, and interleaves when performing narrowing operations.
This pass rewrites widenings/narrowings to be explicit in the IR, and attempts to simplify away most of the interleaving/deinterleaving.
Generate deinterleave or interleave operations, operating on groups of vectors at a time.
bool Halide::Internal::is_native_deinterleave | ( | const Expr & | x | ) |
bool Halide::Internal::is_native_interleave | ( | const Expr & | x | ) |
std::string Halide::Internal::type_suffix | ( | Type | type, |
bool | signed_variants = true ) |
std::string Halide::Internal::type_suffix | ( | const Expr & | a, |
bool | signed_variants = true ) |
std::string Halide::Internal::type_suffix | ( | const Expr & | a, |
const Expr & | b, | ||
bool | signed_variants = true ) |
std::string Halide::Internal::type_suffix | ( | const std::vector< Expr > & | ops, |
bool | signed_variants = true ) |
std::vector< InferredArgument > Halide::Internal::infer_arguments | ( | const Stmt & | body, |
const std::vector< Function > & | outputs ) |
Stmt Halide::Internal::call_extern_and_assert | ( | const std::string & | name, |
const std::vector< Expr > & | args ) |
A helper function to call an extern function, and assert that it returns 0.
Inject calls to halide_device_malloc, halide_copy_to_device, and halide_copy_to_host as needed.
Inline a single named function, which must be pure.
For a pure function to be inlined, it must not have any specializations (i.e. it can only have one values definition).
void Halide::Internal::validate_schedule_inlined_function | ( | Function | f | ) |
Check if the schedule of an inlined function is legal, throwing an error if it is not.
|
noexcept |
Because in this header we don't yet know how client classes store their RefCount (and we don't want to depend on the declarations of the client classes), any class that you want to hold onto via one of these must provide implementations of ref_count and destroy, which we forward-declare here.
E.g. if you want to use IntrusivePtr<MyClass>, then you should define something like this in MyClass.cpp (assuming MyClass has a field: mutable RefCount ref_count):
template<> RefCount &ref_count<MyClass>(const MyClass *c) noexcept { return c->ref_count; }
template<> void destroy<MyClass>(const MyClass *c) { delete c; }
Referenced by Halide::Internal::IntrusivePtr< T >::is_sole_reference().
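The forward-declared hook pattern described above can be sketched as a self-contained program. All names ending in Sketch or _sketch are hypothetical; this is a minimal illustration of an intrusive pointer that relies on per-type specializations, not Halide's IntrusivePtr implementation.

```cpp
#include <atomic>

// Hypothetical minimal reference count, standing in for RefCount.
struct RefCountSketch {
    std::atomic<int> count{0};
};

// Hooks every client type must specialize, mirroring ref_count and destroy.
template<typename T> RefCountSketch &ref_count_sketch(const T *t) noexcept;
template<typename T> void destroy_sketch(const T *t);

// Minimal intrusive pointer: the count lives inside the pointee.
template<typename T>
class IntrusivePtrSketch {
    T *ptr = nullptr;
    static void incref(T *p) {
        if (p) ref_count_sketch<T>(p).count++;
    }
    static void decref(T *p) {
        if (p && --ref_count_sketch<T>(p).count == 0) destroy_sketch<T>(p);
    }

public:
    IntrusivePtrSketch() = default;
    explicit IntrusivePtrSketch(T *p) : ptr(p) { incref(ptr); }
    IntrusivePtrSketch(const IntrusivePtrSketch &o) : ptr(o.ptr) { incref(ptr); }
    IntrusivePtrSketch &operator=(const IntrusivePtrSketch &o) {
        incref(o.ptr);  // increment first so self-assignment is safe
        decref(ptr);
        ptr = o.ptr;
        return *this;
    }
    ~IntrusivePtrSketch() { decref(ptr); }
    T *get() const { return ptr; }
    bool is_sole_reference() const {
        return ptr && ref_count_sketch<T>(ptr).count == 1;
    }
};

// A client type with its count stored in a mutable field, plus the two
// specializations the pointer needs.
struct MyClass {
    mutable RefCountSketch ref_count;
    int payload = 0;
};
template<> inline RefCountSketch &ref_count_sketch<MyClass>(const MyClass *c) noexcept {
    return c->ref_count;
}
template<> inline void destroy_sketch<MyClass>(const MyClass *c) {
    delete c;
}
```

Because the pointer template only names the two hooks, the header that defines it never needs to see the definition of MyClass; only the translation unit that provides the specializations does.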
void Halide::Internal::destroy | ( | const T * | t | ) |
Referenced by equal(), and graph_equal().
Referenced by less_than().
Referenced by graph_less_than().
HALIDE_ALWAYS_INLINE bool Halide::Internal::equal | ( | const Expr & | a, |
int | b ) |
Compare an Expr to an int literal.
This is a somewhat common use of equal in tests. Making this separate avoids constructing an Expr out of the int literal just to check if it's equal to a.
Definition at line 28 of file IREquality.h.
References Halide::Internal::IRHandle::as(), Halide::Int(), and Halide::Expr::type().
HALIDE_ALWAYS_INLINE bool Halide::Internal::equal | ( | const IRNode & | a, |
const IRNode & | b ) |
Check if two defined Stmts or Exprs are equal.
Definition at line 38 of file IREquality.h.
References equal_impl(), and Halide::Internal::IRNode::node_type.
HALIDE_ALWAYS_INLINE bool Halide::Internal::equal | ( | const IRHandle & | a, |
const IRHandle & | b ) |
Check if two possibly-undefined Stmts or Exprs are equal.
Definition at line 50 of file IREquality.h.
References Halide::Internal::IntrusivePtr< T >::defined(), equal(), and Halide::Internal::IntrusivePtr< T >::get().
HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_equal | ( | const IRNode & | a, |
const IRNode & | b ) |
Check if two defined Stmts or Exprs are equal.
Safe to call on Exprs that haven't been passed to common_subexpression_elimination.
Definition at line 63 of file IREquality.h.
References equal_impl(), and Halide::Internal::IRNode::node_type.
HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_equal | ( | const IRHandle & | a, |
const IRHandle & | b ) |
Check if two possibly-undefined Stmts or Exprs are equal.
Safe to call on Exprs that haven't been passed to common_subexpression_elimination.
Definition at line 76 of file IREquality.h.
References Halide::Internal::IntrusivePtr< T >::defined(), equal(), and Halide::Internal::IntrusivePtr< T >::get().
HALIDE_ALWAYS_INLINE bool Halide::Internal::less_than | ( | const IRNode & | a, |
const IRNode & | b ) |
Check if two defined Stmts or Exprs are in a lexicographic order.
For use in map keys.
Definition at line 89 of file IREquality.h.
References less_than_impl(), and Halide::Internal::IRNode::node_type.
Referenced by less_than(), and Halide::Internal::IRDeepCompare::operator()().
HALIDE_ALWAYS_INLINE bool Halide::Internal::less_than | ( | const IRHandle & | a, |
const IRHandle & | b ) |
Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.
For use in map keys.
Definition at line 102 of file IREquality.h.
References Halide::Internal::IntrusivePtr< T >::defined(), Halide::Internal::IntrusivePtr< T >::get(), and less_than().
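To illustrate what "for use in map keys" means in practice, the sketch below defines a deep lexicographic order on a tiny hypothetical expression tree and uses it as a std::map comparator. ExprSketch, make_node, less_than_sketch, and DeepCompareSketch are invented for this example; the ordering rule (null before non-null, then node label, then children left to right) is an assumption and not Halide's actual comparison.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical tiny expression tree.
struct ExprSketch {
    std::string op;                          // node kind or leaf name
    std::shared_ptr<const ExprSketch> a, b;  // children, may be null
};
using ExprPtr = std::shared_ptr<const ExprSketch>;

inline ExprPtr make_node(std::string op, ExprPtr a = nullptr, ExprPtr b = nullptr) {
    return std::make_shared<ExprSketch>(
        ExprSketch{std::move(op), std::move(a), std::move(b)});
}

// Deep lexicographic order. A null (undefined) handle sorts before any
// defined one, and two nulls compare equivalent.
inline bool less_than_sketch(const ExprPtr &x, const ExprPtr &y) {
    if (!x || !y) return !x && y;
    if (x->op != y->op) return x->op < y->op;
    if (less_than_sketch(x->a, y->a)) return true;
    if (less_than_sketch(y->a, x->a)) return false;
    return less_than_sketch(x->b, y->b);
}

// Comparator object usable as the third template argument of std::map.
struct DeepCompareSketch {
    bool operator()(const ExprPtr &x, const ExprPtr &y) const {
        return less_than_sketch(x, y);
    }
};
```

With this comparator, two separately allocated but structurally identical trees land on the same map entry, which is the property a pointer comparison would not give.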
HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_less_than | ( | const IRNode & | a, |
const IRNode & | b ) |
Check if two defined Stmts or Exprs are in a lexicographic order.
For use in map keys. Safe to use on Exprs that haven't been passed to common_subexpression_elimination.
Definition at line 118 of file IREquality.h.
References graph_less_than_impl(), and Halide::Internal::IRNode::node_type.
Referenced by graph_less_than(), and Halide::Internal::IRGraphDeepCompare::operator()().
HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_less_than | ( | const IRHandle & | a, |
const IRHandle & | b ) |
Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.
For use in map keys. Safe to use on Exprs that haven't been passed to common_subexpression_elimination.
Definition at line 132 of file IREquality.h.
References Halide::Internal::IntrusivePtr< T >::defined(), Halide::Internal::IntrusivePtr< T >::get(), and graph_less_than().
void Halide::Internal::ir_equality_test | ( | ) |
bool Halide::Internal::expr_match | ( | const Expr & | pattern, |
const Expr & | expr, | ||
std::vector< Expr > & | result ) |
Does the first expression have the same structure as the second? Variables in the first expression with the name * are interpreted as wildcards, and their matching equivalent in the second expression is placed in the vector given as the third argument.
Wildcards require the types to match. For the type bits and width, a 0 indicates "match anything". So an Int(8, 0) will match 8-bit integer vectors of any width (including scalars), and a UInt(0, 0) will match any unsigned integer type.
For example:
should return true, and set result[0] to 3 and result[1] to 2*k.
bool Halide::Internal::expr_match | ( | const Expr & | pattern, |
const Expr & | expr, | ||
std::map< std::string, Expr > & | result ) |
Does the first expression have the same structure as the second? Variables are matched consistently.
The first time a variable is matched, it assumes the value of the matching part of the second expression. Subsequent matches must be equal to the first match.
For example:
should return true, and set result["x"] = a, and result["y"] = b.
Rewrite the expression x to have 'lanes' lanes.
This is useful for substituting the results of expr_match into a pattern expression.
void Halide::Internal::expr_match_test | ( | ) |
std::pair< Region, bool > Halide::Internal::mutate_region | ( | Mutator * | mutator, |
const Region & | bounds, | ||
Args &&... | args ) |
A helper function for mutator-like things to mutate regions.
Definition at line 124 of file IRMutator.h.
References Halide::Internal::IntrusivePtr< T >::same_as().
bool Halide::Internal::is_const | ( | const Expr & | e | ) |
const double * Halide::Internal::as_const_float | ( | const Expr & | e | ) |
bool Halide::Internal::is_const_power_of_two_integer | ( | const Expr & | e, |
int * | bits ) |
Is the expression a constant integer power of two.
Also returns log base two of the expression if it is. Only returns true for integer types.
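The same test on a plain integer can be sketched as follows. The function name is_power_of_two_sketch is hypothetical and it takes a host integer rather than an Expr, so it only illustrates the contract (a boolean result plus the base-two logarithm written through the pointer).

```cpp
#include <cstdint>

// Returns true and writes log2(v) to *bits when v is a positive power of
// two; otherwise returns false and leaves *bits untouched.
inline bool is_power_of_two_sketch(std::int64_t v, int *bits) {
    // A power of two has exactly one bit set, so v & (v - 1) clears it.
    if (v <= 0 || (v & (v - 1)) != 0) return false;
    int log2 = 0;
    while ((v >> log2) != 1) ++log2;
    *bits = log2;
    return true;
}
```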
bool Halide::Internal::is_positive_const | ( | const Expr & | e | ) |
Is the expression a const (as defined by is_const), and also strictly greater than zero (in all lanes, if a vector expression)
bool Halide::Internal::is_negative_const | ( | const Expr & | e | ) |
Is the expression a const (as defined by is_const), and also strictly less than zero (in all lanes, if a vector expression)
bool Halide::Internal::is_undef | ( | const Expr & | e | ) |
Is the expression an undef.
bool Halide::Internal::is_const_zero | ( | const Expr & | e | ) |
Is the expression a const (as defined by is_const), and also equal to zero (in all lanes, if a vector expression)
Referenced by Halide::Internal::IRMatcher::NegateOp< A >::match().
bool Halide::Internal::is_const_one | ( | const Expr & | e | ) |
Is the expression a const (as defined by is_const), and also equal to one (in all lanes, if a vector expression)
Referenced by Halide::Internal::IRMatcher::CanProve< A, Prover >::make_folded_const().
bool Halide::Internal::is_no_op | ( | const Stmt & | s | ) |
bool Halide::Internal::is_pure | ( | const Expr & | e | ) |
Does the expression 1) take on the same value no matter where it appears in a Stmt, and 2) have no side-effects when evaluated?
Construct an immediate of the given type from any numeric C++ type.
Referenced by Halide::Internal::IRMatcher::fuzz_test_rule(), Halide::Internal::IRMatcher::IntLiteral::make(), make_const(), make_const(), make_const(), make_const(), make_const(), make_const(), make_const(), make_const(), make_const(), Halide::Internal::GeneratorParamImpl< T >::operator Expr(), Halide::Internal::IRMatcher::Rewriter< Instance >::operator()(), and Halide::Internal::IRMatcher::Rewriter< Instance >::operator()().
Definition at line 80 of file IROperator.h.
References make_const().
Definition at line 83 of file IROperator.h.
References make_const().
Definition at line 86 of file IROperator.h.
References make_const().
Definition at line 89 of file IROperator.h.
References make_const().
Definition at line 92 of file IROperator.h.
References make_const().
Definition at line 95 of file IROperator.h.
References make_const().
Definition at line 98 of file IROperator.h.
References make_const().
Definition at line 101 of file IROperator.h.
References make_const().
Definition at line 104 of file IROperator.h.
References make_const().
Construct a unique signed_integer_overflow Expr.
Referenced by Halide::Internal::IRMatcher::make_const_special_expr().
bool Halide::Internal::is_signed_integer_overflow | ( | const Expr & | expr | ) |
Check if an expression is a signed_integer_overflow.
Check if a constant value can be correctly represented as the given type.
Expr Halide::Internal::make_bool | ( | bool | val, |
int | lanes = 1 ) |
Construct a boolean constant from a C++ boolean value.
May also be a vector if lanes is given. It is not possible to coerce a C++ boolean to Expr because if we provide such a path then char objects can ambiguously be converted to Halide Expr or to std::string. The problem is that C++ does not have a real bool type - it is in fact close enough to char that C++ does not know how to distinguish them. make_bool is the explicit coercion.
Construct the representation of zero in the given type.
Referenced by Halide::Internal::IRMatcher::NegateOp< A >::make().
Expr Halide::Internal::const_true | ( | int | lanes = 1 | ) |
Construct the constant boolean true.
May also be a vector of trues, if a lanes argument is given.
Expr Halide::Internal::const_false | ( | int | lanes = 1 | ) |
Construct the constant boolean false.
May also be a vector of falses, if a lanes argument is given.
Expr Halide::Internal::lossless_cast | ( | Type | t, |
Expr | e, | ||
std::map< Expr, ConstantInterval, ExprCompare > * | cache = nullptr ) |
Attempt to cast an expression to a smaller type while provably not losing information.
If it can't be done, return an undefined Expr.
Optionally accepts a map that gives the constant bounds of exprs already analyzed to avoid redoing work across many calls to lossless_cast. It is not safe to use this optional map in contexts where the same Expr object may take on a different value. For example: (let x = 4 in some_expr_object) + (let x = 5 in the_same_expr_object). It is safe to use it after uniquify_variable_names has been run.
Attempt to negate x without introducing new IR and without overflow.
If it can't be done, return an undefined Expr.
Coerce the two expressions to have the same type, using C-style casting rules.
For the purposes of casting, a boolean type is UInt(1). We use the following procedure:
If the types already match, do nothing.
Then, if one type is a vector and the other is a scalar, the scalar is broadcast to match the vector width, and we continue.
Then, if one type is floating-point and the other is not, the non-float is cast to the floating-point type, and we're done.
Then, if both types are unsigned ints, the one with fewer bits is cast to match the one with more bits and we're done.
Then, if both types are signed ints, the one with fewer bits is cast to match the one with more bits and we're done.
Finally, if one type is an unsigned int and the other type is a signed int, both are cast to a signed int with the greater of the two bit-widths. For example, matching an Int(8) with a UInt(16) results in an Int(16).
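The procedure above can be traced with a small model of a type as (code, bits, lanes). TypeCodeSketch, TypeSketch, and match_types_sketch are hypothetical names. The sketch returns the common type rather than rewriting two expressions, treats two floats of different widths like the same-signedness integer cases (the text above does not spell that case out), and assumes that two vector operands already have equal lane counts.

```cpp
#include <algorithm>

enum class TypeCodeSketch { Int, UInt, Float };

// Hypothetical model of a type: element code, bit width, lane count.
struct TypeSketch {
    TypeCodeSketch code;
    int bits;
    int lanes;
};

inline bool operator==(const TypeSketch &a, const TypeSketch &b) {
    return a.code == b.code && a.bits == b.bits && a.lanes == b.lanes;
}

// Returns the type both operands would be coerced to.
inline TypeSketch match_types_sketch(TypeSketch a, TypeSketch b) {
    // Types already match: nothing to do.
    if (a == b) return a;
    // Broadcast a scalar to the other operand's vector width.
    if (a.lanes == 1 && b.lanes > 1) {
        a.lanes = b.lanes;
    } else if (b.lanes == 1 && a.lanes > 1) {
        b.lanes = a.lanes;
    }
    // Float versus non-float: the float type wins.
    if (a.code == TypeCodeSketch::Float && b.code != TypeCodeSketch::Float) return a;
    if (b.code == TypeCodeSketch::Float && a.code != TypeCodeSketch::Float) return b;
    // Same code (both unsigned, both signed, or both float): wider wins.
    if (a.code == b.code) return a.bits >= b.bits ? a : b;
    // Mixed signedness: signed int with the greater of the two widths.
    return {TypeCodeSketch::Int, std::max(a.bits, b.bits), a.lanes};
}
```

Matching an Int(8) with a UInt(16) in this model yields an Int(16), in line with the example in the text.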
Asserts that both expressions are integer types and are either both signed or both unsigned.
If one argument is scalar and the other a vector, the scalar is broadcast to have the same number of lanes as the vector. If one expression is of narrower type than the other, it is widened to the bit width of the wider.
Raise an expression to an integer power by repeatedly multiplying it by itself.
Split a boolean condition into vector of ANDs.
If 'cond' is undefined, return an empty vector.
If e is a ramp expression with the given stride (default 1), return its base; otherwise return an undefined Expr.
|
inline |
Implementations of division and mod that are specific to Halide.
Use these implementations; do not use native C division or mod to simplify Halide expressions. Halide division and modulo satisfy the Euclidean definition of division for integers a and b:
when b != 0, (a/b)*b + a%b = a
0 <= a%b < |b|
Additionally, mod by zero returns zero, and div by zero returns zero. This makes mod and div total functions.
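A minimal sketch of these semantics for 64-bit signed integers is shown below. The names mod_sketch and div_sketch are hypothetical, and the code is written for clarity rather than to mirror the library's implementation; it ignores the overflow corner case of the most negative value with a divisor of -1.

```cpp
#include <cstdint>

// Euclidean remainder: the result lies in [0, |b|) when b != 0, and is 0
// when b == 0.
inline std::int64_t mod_sketch(std::int64_t a, std::int64_t b) {
    if (b == 0) return 0;
    std::int64_t r = a % b;  // C++ remainder takes the sign of a
    if (r < 0) r += (b < 0 ? -b : b);
    return r;
}

// Euclidean quotient, chosen so that div * b + mod == a when b != 0, and
// 0 when b == 0.
inline std::int64_t div_sketch(std::int64_t a, std::int64_t b) {
    if (b == 0) return 0;
    return (a - mod_sketch(a, b)) / b;  // exact: a - mod is a multiple of b
}
```

For a = -7 and b = 2, native C++ gives a quotient of -3 and a remainder of -1, while the Euclidean definition gives -4 and 1. That difference is why native operators should not be used when folding Halide expressions.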
Definition at line 247 of file IROperator.h.
References Halide::Type::is_float(), Halide::Type::is_int(), and Halide::type_of().
Referenced by Halide::Internal::Simplify::ExprInfo::cast_to(), Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), and Halide::Internal::Simplify::ExprInfo::trim_bounds_using_alignment().
|
inline |
Definition at line 268 of file IROperator.h.
References Halide::Type::is_float(), Halide::Type::is_int(), and Halide::type_of().
Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Div >(), Halide::Internal::IRMatcher::constant_fold_bin_op< Div >(), and Halide::Internal::IRMatcher::constant_fold_bin_op< Div >().
|
inline |
Definition at line 293 of file IROperator.h.
|
inline |
Definition at line 299 of file IROperator.h.
|
inline |
Definition at line 305 of file IROperator.h.
|
inline |
Definition at line 309 of file IROperator.h.
Return an Expr that is identical to the input Expr, but with all calls to likely() and likely_if_innermost() removed.
Return a Stmt that is identical to the input Stmt, but with all calls to likely() and likely_if_innermost() removed.
Return an Expr that is identical to the input Expr, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.
Return a Stmt that is identical to the input Stmt, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.
If the expression is a tag helper call, remove it and return the tagged expression.
If not, returns the expression.
|
inline |
Definition at line 343 of file IROperator.h.
Referenced by Halide::Pipeline::add_requirement(), collect_print_args(), collect_print_args(), Halide::print(), Halide::print_when(), and Halide::require().
|
inline |
Definition at line 347 of file IROperator.h.
References collect_print_args().
|
inline |
Definition at line 353 of file IROperator.h.
References collect_print_args().
Expr Halide::Internal::requirement_failed_error | ( | Expr | condition, |
const std::vector< Expr > & | args ) |
Expr Halide::Internal::memoize_tag_helper | ( | Expr | result, |
const std::vector< Expr > & | cache_key_values ) |
Referenced by Halide::memoize_tag().
void Halide::Internal::reset_random_counters | ( | ) |
Reset the counters used for random-number seeds in random_float/int/uint.
(Note that the counters are incremented for each call, even if a seed is passed in.) This is used for multitarget compilation to ensure that each subtarget gets the same sequence of random numbers.
Return an expression that should never be evaluated.
Expressions that depend on unreachable values are also unreachable, and statements that execute unreachable expressions are also considered unreachable.
|
inline |
Definition at line 1356 of file IROperator.h.
References Halide::type_of(), and unreachable().
Referenced by unreachable().
FOR INTERNAL USE ONLY.
An entirely unchecked version of unsafe_promise_clamped, used inside the compiler as an annotation of the known bounds of an Expr when it has proved something is bounded and wants to record that fact for later passes (notably bounds inference) to exploit. This gets introduced by GuardWithIf tail strategies, because the bounds machinery has a hard time exploiting if statement conditions.
Unlike unsafe_promise_clamped, this expression is context-dependent, because 'value' might be statically bounded at some point in the IR (e.g. due to a containing if statement), but not elsewhere.
This intrinsic always evaluates to its first argument. If this value is used by a side-effecting operation and it is outside the range specified by its second and third arguments, behavior is undefined. The compiler can therefore assume that the value is within the range given and optimize accordingly. Note that this permits promise_clamped to evaluate to something outside of the range, provided that this value is not used.
Note that this produces an intrinsic that is marked as 'pure' and thus is allowed to be hoisted, etc.; thus, extra care must be taken with its use.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
IRNodeType | ) |
Emit a halide node type on an output stream (such as std::cout) in human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const AssociativePattern & | ) |
Emit a halide associative pattern on an output stream (such as std::cout) in a human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const AssociativeOp & | ) |
Emit a halide associative op on an output stream (such as std::cout) in a human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const ForType & | ) |
Emit a halide for loop type (vectorized, serial, etc) in a human readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const VectorReduce::Operator & | ) |
Emit a horizontal vector reduction op in human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const NameMangling & | ) |
Emit a halide name mangling value in a human readable format.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const LinkageType & | ) |
Emit a halide linkage value in a human readable format.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const DimType & | ) |
Emit a halide dimension type in human-readable format.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | out, |
const Closure & | c ) |
Emit a Closure in human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | out, |
const Interval & | c ) |
Emit an Interval in human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | out, |
const ConstantInterval & | c ) |
Emit a ConstantInterval in human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | out, |
const ModulusRemainder & | c ) |
Emit a ModulusRemainder in human-readable form.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const Indentation & | ) |
void * Halide::Internal::get_symbol_address | ( | const char * | s | ) |
Expr Halide::Internal::lower_lerp | ( | Type | final_type, |
Expr | zero_val, | ||
Expr | one_val, | ||
const Expr & | weight, | ||
const Target & | target ) |
Build Halide IR that computes a lerp.
Used by codegen targets that don't have a native lerp. The lerp is done in the type of the zero value. The final_type is a cast that should occur after the lerp. It's included because in some cases you can incorporate a final cast into the lerp math.
Hoist loop-invariants out of inner loops.
This is especially important in cases where LLVM would not do it for us automatically. For example, it hoists loop invariants out of CUDA kernels.
Just hoist loop-invariant if statements as far up as possible.
Does not lift other values. It's useful to run this earlier in lowering to simplify the IR.
auto Halide::Internal::iterator_to_pointer | ( | T | iter | ) | -> decltype(&*std::declval<T>()) |
Definition at line 117 of file LLVM_Headers.h.
|
inline |
Definition at line 121 of file LLVM_Headers.h.
|
inline |
Definition at line 125 of file LLVM_Headers.h.
|
inline |
Definition at line 129 of file LLVM_Headers.h.
llvm::Triple Halide::Internal::get_triple_for_target | ( | const Target & | target | ) |
std::unique_ptr< llvm::Module > Halide::Internal::get_initial_module_for_target | ( | Target | , |
llvm::LLVMContext * | , | ||
bool | for_shared_jit_runtime = false, | ||
bool | just_gpu = false ) |
Create an llvm module containing the support code for a given target.
std::unique_ptr< llvm::Module > Halide::Internal::get_initial_module_for_ptx_device | ( | Target | , |
llvm::LLVMContext * | c ) |
Create an llvm module containing the support code for ptx device.
void Halide::Internal::add_bitcode_to_module | ( | llvm::LLVMContext * | context, |
llvm::Module & | module, | ||
const std::vector< uint8_t > & | bitcode, | ||
const std::string & | name ) |
Link a block of llvm bitcode into an llvm module.
std::unique_ptr< llvm::Module > Halide::Internal::link_with_wasm_jit_runtime | ( | llvm::LLVMContext * | c, |
const Target & | t, | ||
std::unique_ptr< llvm::Module > | extra_module ) |
Take the llvm::Module in extra_module (if any), add the runtime modules needed for the WASM JIT, and link into a single llvm::Module.
Reuse loads done on previous loop iterations by stashing them in induction variables instead of redoing the load.
If the loads are predicated, the predicates need to match. Can be an optimization or pessimization depending on how good the L1 cache is on the architecture and how many memory issue slots there are. Currently only intended for Hexagon.
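The effect of carrying a load across iterations can be seen in a two-tap sum over host memory. Both functions below are hypothetical illustrations in plain C++: the first reloads each input element twice, and the second keeps the value loaded in one iteration in a local variable for use in the next. The real pass performs an analogous rewrite on Halide IR.

```cpp
#include <cstddef>
#include <vector>

// Baseline: every iteration loads in[i] and in[i + 1].
inline std::vector<int> sum_pairs_naive(const std::vector<int> &in) {
    std::vector<int> out;
    for (std::size_t i = 0; i + 1 < in.size(); ++i) {
        out.push_back(in[i] + in[i + 1]);
    }
    return out;
}

// Loop-carried: the value loaded as in[i + 1] is stashed and reused as
// in[i] on the following iteration, so each iteration performs one load.
inline std::vector<int> sum_pairs_carried(const std::vector<int> &in) {
    std::vector<int> out;
    if (in.size() < 2) return out;
    int carried = in[0];  // load needed before the first iteration
    for (std::size_t i = 0; i + 1 < in.size(); ++i) {
        int next = in[i + 1];  // the only load inside the loop body
        out.push_back(carried + next);
        carried = next;
    }
    return out;
}
```

The carried version trades a memory access for a value kept live across iterations. As the text notes, that trade helps or hurts depending on the architecture's cache behavior and memory issue slots.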
Module Halide::Internal::lower | ( | const std::vector< Function > & | output_funcs, |
const std::string & | pipeline_name, | ||
const Target & | t, | ||
const std::vector< Argument > & | args, | ||
LinkageType | linkage_type, | ||
const std::vector< Stmt > & | requirements = std::vector< Stmt >(), | ||
bool | trace_pipeline = false, | ||
const std::vector< IRMutator * > & | custom_passes = std::vector< IRMutator * >() ) |
Given a vector of scheduled Halide functions, create a Module that evaluates them.
Automatically pulls in all the functions they depend on. Some stages of lowering may be target-specific. The Module may contain submodules for computation offloaded to another execution engine or API as well as buffers that are used in the passed in Stmt.
Stmt Halide::Internal::lower_main_stmt | ( | const std::vector< Function > & | output_funcs, |
const std::string & | pipeline_name, | ||
const Target & | t, | ||
const std::vector< Stmt > & | requirements = std::vector< Stmt >(), | ||
bool | trace_pipeline = false, | ||
const std::vector< IRMutator * > & | custom_passes = std::vector< IRMutator * >() ) |
Given a halide function with a schedule, create a statement that evaluates it.
Automatically pulls in all the functions f depends on. Some stages of lowering may be target-specific. Mostly used as a convenience function in tests that wish to assert some property of the lowered IR.
void Halide::Internal::lower_test | ( | ) |
Stmt Halide::Internal::lower_parallel_tasks | ( | const Stmt & | s, |
std::vector< LoweredFunc > & | closure_implementations, | ||
const std::string & | name, | ||
const Target & | t ) |
Rewrite access to things stored outside the loop over GPU lanes to use nvidia's warp shuffle instructions.
Stmt Halide::Internal::inject_memoization | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env, | ||
const std::string & | name, | ||
const std::vector< Function > & | outputs ) |
Transform pipeline calls for Funcs scheduled with memoize to do a lookup call to the runtime cache implementation, and if there is a miss, compute the results and call the runtime to store it back to the cache.
Should leave non-memoized Funcs unchanged.
Stmt Halide::Internal::rewrite_memoized_allocations | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env ) |
This should be called after Storage Flattening has added Allocation IR nodes.
It connects the memoization cache lookups to the Allocations so they point to the buffers from the memoization cache and those buffers are released when no longer used. Should not affect allocations for non-memoized Funcs.
std::map< OutputFileType, const OutputInfo > Halide::Internal::get_output_info | ( | const Target & | target | ) |
Referenced by Halide::SimdOpCheckTest::compile_and_check().
ModulusRemainder Halide::Internal::operator+ | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b ) |
ModulusRemainder Halide::Internal::operator- | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b ) |
ModulusRemainder Halide::Internal::operator* | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b ) |
ModulusRemainder Halide::Internal::operator/ | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b ) |
ModulusRemainder Halide::Internal::operator% | ( | const ModulusRemainder & | a, |
const ModulusRemainder & | b ) |
ModulusRemainder Halide::Internal::operator+ | ( | const ModulusRemainder & | a, |
int64_t | b ) |
ModulusRemainder Halide::Internal::operator- | ( | const ModulusRemainder & | a, |
int64_t | b ) |
ModulusRemainder Halide::Internal::operator* | ( | const ModulusRemainder & | a, |
int64_t | b ) |
ModulusRemainder Halide::Internal::operator/ | ( | const ModulusRemainder & | a, |
int64_t | b ) |
ModulusRemainder Halide::Internal::operator% | ( | const ModulusRemainder & | a, |
int64_t | b ) |
ModulusRemainder Halide::Internal::modulus_remainder | ( | const Expr & | e | ) |
For things like alignment analysis, often it's helpful to know if an integer expression is some multiple of a constant plus some other constant.
For example, it is straightforward to deduce that ((10*x + 2)*(6*y - 3) - 1) is congruent to five modulo six.
We get the most information when the modulus is large. E.g. if something is congruent to 208 modulo 384, then we also know it's congruent to 0 mod 8, and we can possibly use it as an index for an aligned load. If all else fails, we can just say that an integer is congruent to zero modulo one.
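The congruences above can be checked numerically. The following is a minimal standalone sketch that does not use Halide; `ModRem`, `euclid_mod`, `example_expr`, `holds_on_grid`, and `weaken` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a (modulus, remainder) fact:
// the value is known to equal k * modulus + remainder for some integer k.
struct ModRem {
    int64_t modulus;
    int64_t remainder;
};

// Euclidean remainder in [0, m), for m > 0 (C++ % can be negative).
inline int64_t euclid_mod(int64_t v, int64_t m) {
    return ((v % m) + m) % m;
}

// The example expression from the text.
inline int64_t example_expr(int64_t x, int64_t y) {
    return (10 * x + 2) * (6 * y - 3) - 1;
}

// Brute-force check that example_expr satisfies the fact on a small grid.
inline bool holds_on_grid(ModRem mr, int64_t lo, int64_t hi) {
    for (int64_t x = lo; x <= hi; x++) {
        for (int64_t y = lo; y <= hi; y++) {
            if (euclid_mod(example_expr(x, y), mr.modulus) != mr.remainder) {
                return false;
            }
        }
    }
    return true;
}

// Weaken a fact to a smaller modulus that divides the original one,
// e.g. "208 mod 384" implies "0 mod 8".
inline ModRem weaken(ModRem mr, int64_t smaller_modulus) {
    assert(smaller_modulus > 0 && mr.modulus % smaller_modulus == 0);
    return {smaller_modulus, euclid_mod(mr.remainder, smaller_modulus)};
}
```

Here `holds_on_grid(ModRem{6, 5}, -20, 20)` confirms the first claim on a finite range, and `weaken(ModRem{384, 208}, 8)` yields a remainder of zero, which is the alignment fact an aligned load would need.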
ModulusRemainder Halide::Internal::modulus_remainder | ( | const Expr & | e, |
const Scope< ModulusRemainder > & | scope ) |
If we have alignment information about external variables, we can let the analysis know about that using this version of modulus_remainder:
HALIDE_MUST_USE_RESULT bool Halide::Internal::reduce_expr_modulo | ( | const Expr & | e, |
int64_t | modulus, | ||
int64_t * | remainder ) |
Reduce an expression modulo some integer.
Returns true and assigns to remainder if an answer could be found.
HALIDE_MUST_USE_RESULT bool Halide::Internal::reduce_expr_modulo | ( | const Expr & | e, |
int64_t | modulus, | ||
int64_t * | remainder, | ||
const Scope< ModulusRemainder > & | scope ) |
Reduce an expression modulo some integer.
Returns true and assigns to remainder if an answer could be found.
void Halide::Internal::modulus_remainder_test | ( | ) |
The greatest common divisor of two integers.
Referenced by Halide::Internal::Autoscheduler::OptionalRational::operator+=().
The least common multiple of two integers.
Referenced by Halide::Internal::Autoscheduler::OptionalRational::operator+=().
ConstantInterval Halide::Internal::derivative_bounds | ( | const Expr & | e, |
const std::string & | var, | ||
const Scope< ConstantInterval > & | scope = Scope< ConstantInterval >::empty_scope() ) |
Find the bounds of the derivative of an expression.
The scope gives the bounds on the derivatives of any variables found.
Monotonic Halide::Internal::is_monotonic | ( | const Expr & | e, |
const std::string & | var, | ||
const Scope< ConstantInterval > & | scope = Scope< ConstantInterval >::empty_scope() ) |
Monotonic Halide::Internal::is_monotonic | ( | const Expr & | e, |
const std::string & | var, | ||
const Scope< Monotonic > & | scope ) |
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const Monotonic & | m ) |
Emit the monotonic class in human-readable form for debugging.
void Halide::Internal::is_monotonic_test | ( | ) |
Pull loops marked with GPU device APIs to a separate module, and call them through the appropriate host runtime module.
bool Halide::Internal::can_parallelize_rvar | ( | const std::string & | rvar, |
const std::string & | func, | ||
const Definition & | r ) |
void Halide::Internal::check_call_arg_types | ( | const std::string & | name, |
std::vector< Expr > * | args, | ||
int | dims ) |
Validate arguments to a call to a func, image or imageparam.
Return true if an expression uses a likely tag.
The scope contains all vars in scope that should be considered to have likely tags.
Partitions loop bodies into a prologue, a steady state, and an epilogue.
Finds the steady state by hunting for use of clamped ramps, or the 'likely' intrinsic.
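To illustrate the idea, here is a hand-written standalone sketch (not Halide output) of a three-tap sum with clamped indexing, first with the clamp on every iteration and then partitioned into a prologue, a clamp-free steady state, and an epilogue. The function names are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Reference version: every iteration pays for the clamp.
inline std::vector<int> sum3_clamped(const std::vector<int> &in) {
    const int n = static_cast<int>(in.size());
    std::vector<int> out(n);
    for (int x = 0; x < n; x++) {
        int left = in[std::min(std::max(x - 1, 0), n - 1)];
        int right = in[std::min(std::max(x + 1, 0), n - 1)];
        out[x] = left + in[x] + right;
    }
    return out;
}

// Partitioned version: the clamp is only evaluated near the boundaries.
inline std::vector<int> sum3_partitioned(const std::vector<int> &in) {
    const int n = static_cast<int>(in.size());
    std::vector<int> out(n);
    auto clamped_at = [&](int x) {
        int left = in[std::min(std::max(x - 1, 0), n - 1)];
        int right = in[std::min(std::max(x + 1, 0), n - 1)];
        return left + in[x] + right;
    };
    // The clamp is a no-op exactly when 1 <= x <= n - 2.
    const int steady_begin = std::min(1, n);
    const int steady_end = std::max(steady_begin, n - 1);
    for (int x = 0; x < steady_begin; x++) {  // prologue
        out[x] = clamped_at(x);
    }
    for (int x = steady_begin; x < steady_end; x++) {  // steady state
        out[x] = in[x - 1] + in[x] + in[x + 1];
    }
    for (int x = steady_end; x < n; x++) {  // epilogue
        out[x] = clamped_at(x);
    }
    return out;
}
```

Both functions produce the same output; the partitioned form simply moves the boundary handling out of the hot loop.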
Stmt Halide::Internal::inject_placeholder_prefetch | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env, | ||
const std::string & | prefix, | ||
const std::vector< PrefetchDirective > & | prefetches ) |
Inject placeholder prefetches to 's'.
This placeholder prefetch does not yet have an explicit region to be prefetched. The region will be computed during the call to inject_prefetch.
Stmt Halide::Internal::inject_prefetch | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env ) |
Compute the actual region to be prefetched and place it in the placeholder prefetch.
Wrap the prefetch call with a condition when applicable.
Reduce a multi-dimensional prefetch into a prefetch of lower dimension (max dimension of the prefetch is specified by target architecture).
This keeps the 'max_dim' innermost dimensions and adds loops for the rest of the dimensions. If a maximum prefetched-byte-size is specified (depending on the architecture), this also adds outer loops that tile the prefetches.
Hoist all the prefetches in a Block to the beginning of the Block.
This generally only happens when a loop with prefetches is unrolled; in some cases, LLVM's code generation can be suboptimal (unnecessary register spills) when prefetches are scattered through the loop. Hoisting to the top of the loop is a good way to mitigate this, at the cost of the prefetch calls possibly being less useful due to distance from use point. (This is a bit experimental and may need revisiting.) See also https://bugs.llvm.org/show_bug.cgi?id=51172
std::string Halide::Internal::print_loop_nest | ( | const std::vector< Function > & | output_funcs | ) |
Emit some simple pseudocode that shows the structure of the loop nest specified by this pipeline's schedule, and the schedules of the functions it uses.
Stmt Halide::Internal::inject_profiling | ( | const Stmt & | , |
const std::string & | , | ||
const std::map< std::string, Function > & | env ) |
Take a statement representing a halide pipeline and insert high-resolution timing into the generated code (via spawning a thread that acts as a sampling profiler); summaries of execution times and counts will be logged at the end.
Should be done before storage flattening, but after all bounds inference.
Bounds inference and related stages can lift integer bounds expressions out of if statements that guard against those integer expressions doing side-effecty things like dividing or modding by zero.
In those cases, if the lowering passes are functional, the value resulting from the division or mod is evaluated but not used. This mutator rewrites divs and mods in such expressions to fail silently (evaluate to undef) when the denominator is zero.
Prefix all variable names in the given expression with the prefix string.
Return a random floating-point number between zero and one that varies deterministically based on the input expressions.
Return a random unsigned integer between zero and 2^32-1 that varies deterministically based on the input expressions (which must be integers or unsigned integers).
Expr Halide::Internal::lower_random | ( | const Expr & | e, |
const std::vector< VarOrRVar > & | free_vars, | ||
int | tag ) |
Convert calls to random() to IR generated by random_float and random_int.
Tags all calls with the variables in free_vars, and the integer given as the last argument.
std::pair< std::vector< std::string >, std::vector< std::vector< std::string > > > Halide::Internal::realization_order | ( | const std::vector< Function > & | outputs, |
std::map< std::string, Function > & | env ) |
Given a bunch of functions that call each other, determine an order in which to do the scheduling.
This in turn influences the order in which stages are computed when there's no strict dependency between them. Currently just some arbitrary depth-first traversal of the call graph. In addition, determine grouping of functions with fused computation loops. The functions within the fused groups are sorted based on realization order. There should not be any dependencies among functions within a fused group. This pass will also populate the 'fused_pairs' list in the function's schedule. Return a pair of the realization order and the fused groups in that order.
std::vector< std::string > Halide::Internal::topological_order | ( | const std::vector< Function > & | outputs, |
const std::map< std::string, Function > & | env ) |
Given a bunch of functions that call each other, determine a topological order which stays constant regardless of the schedule.
This ordering adheres to the producer-consumer dependencies, i.e. a producer will come before its consumers in that order.
void Halide::Internal::split_predicate_test | ( | ) |
bool Halide::Internal::is_func_trivial_to_inline | ( | const Function & | func | ) |
Return true if the cost of inlining a function is equivalent to the cost of calling the function directly.
Find Allocate/Free pairs that are never loaded from or stored to, and remove them from the Stmt.
This doesn't touch Realize/Call nodes and so must be called after storage_flattening.
Removes placeholder loops for extern stages.
Removes stores that depend on undef values, and statements that only contain such stores.
Stmt Halide::Internal::schedule_functions | ( | const std::vector< Function > & | outputs, |
const std::vector< std::vector< std::string > > & | fused_groups, | ||
const std::map< std::string, Function > & | env, | ||
const Target & | target, | ||
bool & | any_memoized ) |
Build loop nests and inject Function realizations at the appropriate places using the schedule.
Returns a flag indicating whether memoization passes need to be run.
std::ostream & Halide::Internal::operator<< | ( | std::ostream & | stream, |
const Scope< T > & | s ) |
Definition at line 307 of file Scope.h.
References Halide::Internal::Scope< T >::cbegin(), Halide::Internal::Scope< T >::cend(), and Halide::Internal::Scope< T >::const_iterator::name().
Replace for loops with GPU_Default device_api with an actual device API depending on what's enabled in the target.
Choose the first of the following: opencl, cuda
Stmt Halide::Internal::simplify | ( | const Stmt & | , |
bool | remove_dead_code = true, | ||
const Scope< Interval > & | bounds = Scope< Interval >::empty_scope(), | ||
const Scope< ModulusRemainder > & | alignment = Scope< ModulusRemainder >::empty_scope(), | ||
const std::vector< Expr > & | assumptions = std::vector< Expr >() ) |
Perform a wide range of simplifications to expressions and statements, including constant folding, substituting in trivial values, arithmetic rearranging, etc.
Simplifies across let statements, so must not be called on stmts with dangling or repeated variable names. Can optionally be passed known bounds of any variables, known alignment properties, and any other Exprs that should be assumed to be true.
Expr Halide::Internal::simplify | ( | const Expr & | , |
bool | remove_dead_code = true, | ||
const Scope< Interval > & | bounds = Scope< Interval >::empty_scope(), | ||
const Scope< ModulusRemainder > & | alignment = Scope< ModulusRemainder >::empty_scope(), | ||
const std::vector< Expr > & | assumptions = std::vector< Expr >() ) |
bool Halide::Internal::can_prove | ( | Expr | e, |
const Scope< Interval > & | bounds = Scope< Interval >::empty_scope() ) |
Attempt to statically prove an expression is true using the simplifier.
Simplify expressions found in a statement, but don't simplify across different statements.
This is safe to perform at an earlier stage in lowering than full simplification of a stmt.
Symbolic interval arithmetic can be extremely conservative in cases where we analyze the difference between two correlated expressions.
For example, consider:
for x in [0, 10]:
  let y = x + 3
  let z = y - x
x lies within [0, 10]. Interval arithmetic will correctly determine that y lies within [3, 13]. When z is encountered, it is treated as a difference of two independent variables, and gives [3 - 10, 13 - 0] = [-7, 13] instead of the tighter interval [3, 3]. It doesn't understand that y and x are correlated.
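The arithmetic in this example can be reproduced with a minimal standalone sketch. `Range`, `add_const`, `sub`, `naive_z_bounds`, and `simplified_z_bounds` are hypothetical helpers for illustration, not Halide's Interval API.

```cpp
#include <cstdint>

// Hypothetical closed integer interval [min, max].
struct Range {
    int64_t min;
    int64_t max;
};

// Bounds of a + c for a constant c.
inline Range add_const(Range a, int64_t c) {
    return {a.min + c, a.max + c};
}

// Bounds of a - b, treating a and b as independent quantities.
inline Range sub(Range a, Range b) {
    return {a.min - b.max, a.max - b.min};
}

// Naive bounds of z = y - x with y = x + 3: the correlation is lost.
inline Range naive_z_bounds(Range x) {
    Range y = add_const(x, 3);
    return sub(y, x);
}

// After substituting y = x + 3, z = (x + 3) - x simplifies to 3,
// so the bounds no longer depend on x at all.
inline Range simplified_z_bounds(Range /*x*/) {
    return {3, 3};
}
```

With x in [0, 10], `naive_z_bounds` gives [-7, 13] while `simplified_z_bounds` gives [3, 3], matching the numbers in the text.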
In practice, this causes problems for unrolling, and arbitrarily bad overconservative behavior in bounds inference (e.g. https://github.com/halide/Halide/issues/3697 )
The function below attempts to address this by walking the IR, remembering whether each let variable is monotonic increasing, decreasing, unknown, or constant w.r.t each loop var. When it encounters a subtract node where both sides have the same monotonicity it substitutes, solves, and attempts to generally simplify as aggressively as possible to try to cancel out the repeated dependence on the loop var. The same is done for addition nodes with arguments of opposite monotonicity.
Bounds inference is particularly sensitive to these false dependencies, but removing false dependencies also helps other lowering passes. E.g. if this simplification means a value no longer depends on a loop variable, it can remain scalar during vectorization of that loop, or we can lift it out as a loop invariant, or it might avoid some of the complex paths in GPU codegen that trigger when values depend on the block index (e.g. warp shuffles).
This pass is safe to use on code with repeated instances of the same variable name (it must be, because we want to run it before allocation bounds inference).
Refactor the expression to remove correlated differences or rewrite them in a form that is more amenable to bounds inference.
Performs a subset of what simplify_correlated_differences does. Can increase Expr size (i.e. does not follow the simplifier's reduction order).
void Halide::Internal::simplify_specializations | ( | std::map< std::string, Function > & | env | ) |
Try to simplify the RHS/LHS of a function's definition based on its specializations.
Stmt Halide::Internal::skip_stages | ( | const Stmt & | s, |
const std::vector< Function > & | outputs, | ||
const std::vector< std::vector< std::string > > & | order, | ||
const std::map< std::string, Function > & | env ) |
Avoid computing certain stages if we can infer a runtime condition to check that tells us they won't be used.
Does this by analyzing all reads of each buffer allocated, and inferring some condition that tells us if the reads occur. If the condition is non-trivial, inject ifs that guard the production.
Stmt Halide::Internal::sliding_window | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env ) |
Perform sliding window optimizations on a halide statement.
I.e. don't bother computing points in a function that have provably already been computed by a previous iteration.
SolverResult Halide::Internal::solve_expression | ( | const Expr & | e, |
const std::string & | variable, | ||
const Scope< Expr > & | scope = Scope< Expr >::empty_scope() ) |
Attempts to collect all instances of a variable in an expression tree and place it as far to the left as possible, and as far up the tree as possible (i.e. outside most parentheses).
If the expression is an equality or comparison, this 'solves' the equation. Returns a pair of Expr and bool. The Expr is the mutated expression, and the bool indicates whether there is a single instance of the variable in the result. If it is false, the expression has only been partially solved, and there are still multiple instances of the variable.
Interval Halide::Internal::solve_for_outer_interval | ( | const Expr & | c, |
const std::string & | variable ) |
Find the smallest interval such that the condition is either true or false inside of it, but definitely false outside of it.
Never returns undefined Exprs, instead it uses variables called "pos_inf" and "neg_inf" to represent positive and negative infinity.
Interval Halide::Internal::solve_for_inner_interval | ( | const Expr & | c, |
const std::string & | variable ) |
Find the largest interval such that the condition is definitely true inside of it, and might be true or false outside of it.
Expr Halide::Internal::and_condition_over_domain | ( | const Expr & | c, |
const Scope< Interval > & | varying ) |
Take a conditional that includes variables that vary over some domain, and convert it to a more conservative (less frequently true) condition that doesn't depend on those variables.
Formally, the output expr implies the input expr.
The condition may be a vector condition, in which case we also 'and' over the vector lanes, and return a scalar result.
void Halide::Internal::solve_test | ( | ) |
void Halide::Internal::spirv_ir_test | ( | ) |
Internal test for SPIR-V IR.
Convert all unpredicated strided loads in a Stmt into dense loads followed by shuffles.
For a stride of two, the trick is to do a dense load of twice the size, and then extract either the even or odd lanes. This was previously done in codegen, where it was challenging, because it's not easy to know there if it's safe to do the double-sized load, as it either loads one element beyond or before the original load. We used the alignment of the ramp base to try to tell if it was safe to shift backwards, and we added padding to internal allocations so that for those at least it was safe to shift forwards. Unfortunately the alignment of the ramp base is usually unknown if you don't know anything about the strides of the input, and adding padding to allocations was a serious wart in our memory allocators.
This pass instead actively looks for evidence elsewhere in the Stmt (at some location which definitely executes whenever the load being transformed executes) that it's safe to read further forwards or backwards in memory. The evidence is in the form of a load at the same base address with a different constant offset. It also clusters groups of these loads so that they do the same dense load and extract the appropriate slice of lanes. If it fails to find any evidence, for loads from external buffers it does two overlapping half-sized dense loads and shuffles out the desired lanes, and for loads from internal allocations it adds padding to the allocation explicitly, by setting the padding field on Allocate nodes.
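The stride-two trick can be sketched in scalar C++ as follows. This is a standalone illustration with hypothetical function names; it models vectors as `std::vector<int>` and assumes the caller already knows the wider read is in bounds, which is exactly the evidence the pass looks for.

```cpp
#include <cstddef>
#include <vector>

// Direct strided load: `lanes` elements at base, base + 2, base + 4, ...
inline std::vector<int> strided_load(const std::vector<int> &buf,
                                     std::size_t base, std::size_t lanes) {
    std::vector<int> out(lanes);
    for (std::size_t i = 0; i < lanes; i++) {
        out[i] = buf[base + 2 * i];
    }
    return out;
}

// Dense load of 2 * lanes contiguous elements, then a shuffle that keeps
// the even lanes. The dense load touches buf[base + 2 * lanes - 1], one
// element past the last one the strided load reads, so the caller must
// know that this extra element is safe to read.
inline std::vector<int> dense_load_then_shuffle(const std::vector<int> &buf,
                                                std::size_t base,
                                                std::size_t lanes) {
    auto first = buf.begin() + static_cast<std::ptrdiff_t>(base);
    auto last = first + static_cast<std::ptrdiff_t>(2 * lanes);
    std::vector<int> dense(first, last);
    std::vector<int> out(lanes);
    for (std::size_t i = 0; i < lanes; i++) {
        out[i] = dense[2 * i];
    }
    return out;
}
```

Extracting the odd lanes of a dense load that starts one element earlier would give the same result while reading one element before the original load instead of one after.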
void Halide::Internal::print_to_stmt_html | ( | const std::string & | html_output_filename, |
const Module & | m, | ||
const std::string & | assembly_input_filename = "" ) |
Dump an HTML-formatted visualization of a Module to filename.
If assembly_input_filename is not empty, it is expected to be the path to assembly output. If empty, the code will attempt to find such a file based on html_output_filename (replacing ".stmt.html" with ".s"), and will assert-fail if no such file is found.
void Halide::Internal::print_to_conceptual_stmt_html | ( | const std::string & | html_output_filename, |
const Module & | m, | ||
const std::string & | assembly_input_filename = "" ) |
Dump an HTML-formatted visualization of a Module's conceptual Stmt code to filename.
If assembly_input_filename is not empty, it is expected to be the path to assembly output. If empty, the code will attempt to find such a file based on html_output_filename (replacing ".stmt.html" with ".s"), and will assert-fail if no such file is found.
Stmt Halide::Internal::storage_folding | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env ) |
Fold storage of functions if possible.
This means reducing one of the dimensions modulo something for the purpose of storage, if we can prove that this is safe to do. E.g. consider:
We can store f as a circular buffer of size two, instead of allocating space for all of it.
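As a standalone illustration of folding, assume a hypothetical consumer g(x) = f(x) + f(x + 1) that is computed in increasing x, with f produced just in time. Only two values of f are live at once, so f can be indexed modulo two. The names `producer`, `consumer_full`, and `consumer_folded` are made up for this sketch.

```cpp
#include <vector>

// Hypothetical producer function f.
inline int producer(int x) {
    return x * x;
}

// Unfolded storage: allocate all n + 1 values of f that g reads.
inline std::vector<int> consumer_full(int n) {
    std::vector<int> f_buf(n + 1);
    for (int x = 0; x <= n; x++) {
        f_buf[x] = producer(x);
    }
    std::vector<int> g(n);
    for (int x = 0; x < n; x++) {
        g[x] = f_buf[x] + f_buf[x + 1];
    }
    return g;
}

// Folded storage: f lives in a circular buffer of size two, indexed by
// x % 2. Each new value overwrites the slot that is no longer needed.
inline std::vector<int> consumer_folded(int n) {
    int f_buf[2] = {0, 0};
    std::vector<int> g(n);
    if (n > 0) {
        f_buf[0] = producer(0);
    }
    for (int x = 0; x < n; x++) {
        f_buf[(x + 1) % 2] = producer(x + 1);
        g[x] = f_buf[x % 2] + f_buf[(x + 1) % 2];
    }
    return g;
}
```

Both versions return identical results for any non-negative n; the folded one uses constant storage for f regardless of n.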
bool Halide::Internal::strictify_float | ( | std::map< std::string, Function > & | env, |
const Target & | t ) |
Propagate strict_float intrinsics such that they immediately wrap all floating-point expressions.
This makes the IR nodes context independent. If the Target::StrictFloat flag is specified in target, starts in strict_float mode so all floating-point type Exprs in the compilation will be marked with strict_float. Returns whether any strict floating-point is used in any function in the passed in env.
Expr Halide::Internal::substitute | ( | const std::string & | name, |
const Expr & | replacement, | ||
const Expr & | expr ) |
Substitute variables with the given name with the replacement expression within expr.
This is a dangerous thing to do if variable names have not been uniquified. While it won't traverse inside let statements with the same name as the first argument, moving a piece of syntax around can change its meaning, because it can cross lets that redefine variable names that it includes references to.
Stmt Halide::Internal::substitute | ( | const std::string & | name, |
const Expr & | replacement, | ||
const Stmt & | stmt ) |
Substitute variables with the given name with the replacement expression within stmt.
Expr Halide::Internal::substitute | ( | const std::map< std::string, Expr > & | replacements, |
const Expr & | expr ) |
Substitute variables with names in the map.
Stmt Halide::Internal::substitute | ( | const std::map< std::string, Expr > & | replacements, |
const Stmt & | stmt ) |
Expr Halide::Internal::substitute | ( | const Expr & | find, |
const Expr & | replacement, | ||
const Expr & | expr ) |
Substitute expressions for other expressions.
Stmt Halide::Internal::substitute | ( | const Expr & | find, |
const Expr & | replacement, | ||
const Stmt & | stmt ) |
Expr Halide::Internal::graph_substitute | ( | const std::string & | name, |
const Expr & | replacement, | ||
const Expr & | expr ) |
Substitutions where the IR may be a general graph (and not just a DAG).
Stmt Halide::Internal::graph_substitute | ( | const std::string & | name, |
const Expr & | replacement, | ||
const Stmt & | stmt ) |
Expr Halide::Internal::graph_substitute | ( | const Expr & | find, |
const Expr & | replacement, | ||
const Expr & | expr ) |
Stmt Halide::Internal::graph_substitute | ( | const Expr & | find, |
const Expr & | replacement, | ||
const Stmt & | stmt ) |
Substitute in all let Exprs in a piece of IR.
Doesn't substitute in let stmts, as this may change the meaning of the IR (e.g. by moving a load after a store). Produces graphs of IR, so don't use non-graph-aware visitors or mutators on it until you've CSE'd the result.
void Halide::Internal::target_test | ( | ) |
void Halide::Internal::lower_target_query_ops | ( | std::map< std::string, Function > & | env, |
const Target & | t ) |
Stmt Halide::Internal::inject_tracing | ( | Stmt | , |
const std::string & | pipeline_name, | ||
bool | trace_pipeline, | ||
const std::map< std::string, Function > & | env, | ||
const std::vector< Function > & | outputs, | ||
const Target & | Target ) |
Take a statement representing a halide pipeline, inject calls to tracing functions at interesting points, such as allocations.
Should be done before storage flattening, but after all bounds inference.
Truncate loop bounds to the region over which they actually do something.
For examples see test/correctness/trim_no_ops.cpp
Find let statements that all define the same value, and make later ones just reuse the symbol names of the earlier ones.
Modify a statement so that every internally-defined variable name is unique.
This lets later passes assume syntactic equivalence is semantic equivalence.
void Halide::Internal::uniquify_variable_names_test | ( | ) |
Creates let stmts for the various buffer components (e.g. foo.extent.0) in any referenced concrete buffers or buffer parameters.
After this pass, the only undefined symbols should be scalar parameters and the buffers themselves (e.g. foo.buffer).
Take a statement with for loops marked for unrolling, and convert each into several copies of the innermost statement.
I.e. unroll the loop.
Lower all unsafe promises into either assertions or unchecked code, depending on the target.
Lower all safe promises by just stripping them.
This is a good idea once no more lowering stages are going to use boxes_touched.
DST Halide::Internal::safe_numeric_cast | ( | SRC | s | ) |
Some numeric conversions are UB if the value won't fit in the result; safe_numeric_cast<>() is meant as a drop-in replacement for a C/C++ cast that adds well-defined behavior for the UB cases, attempting to mimic common implementation behavior as much as possible.
DstType Halide::Internal::reinterpret_bits | ( | const SrcType & | src | ) |
An aggressive form of reinterpret cast used for correct type-punning.
Definition at line 135 of file Util.h.
References memcpy().
Referenced by Halide::Internal::IRMatcher::fuzz_test_rule().
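A memcpy-based bit reinterpretation of this kind can be sketched as below. This is an illustrative standalone version under the assumption that source and destination types have the same size and are trivially copyable; `reinterpret_bits_sketch` is a hypothetical name, and the body of the real function in Util.h may differ.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copy the object representation of src into a value of DstType.
// Unlike casting through a pointer, memcpy does not violate strict aliasing.
template<typename DstType, typename SrcType>
DstType reinterpret_bits_sketch(const SrcType &src) {
    static_assert(sizeof(SrcType) == sizeof(DstType),
                  "Types must be the same size");
    static_assert(std::is_trivially_copyable<SrcType>::value &&
                      std::is_trivially_copyable<DstType>::value,
                  "Types must be trivially copyable");
    DstType dst;
    std::memcpy(&dst, &src, sizeof(SrcType));
    return dst;
}
```

For example, `reinterpret_bits_sketch<uint32_t>(1.0f)` gives the raw bit pattern of the float, which is 0x3f800000 on platforms that use IEEE 754 single precision.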
std::string Halide::Internal::get_env_variable | ( | char const * | env_var_name | ) |
Get value of an environment variable.
Returns its value if it is defined in the environment. If the var is not defined, an empty string is returned.
std::string Halide::Internal::running_program_name | ( | ) |
Get the name of the currently running executable.
Platform-specific. If program name cannot be retrieved, function returns an empty string.
std::string Halide::Internal::unique_name | ( | char | prefix | ) |
Generate a unique name starting with the given prefix.
It's unique relative to all other strings returned by unique_name in this process.
The single-character version always appends a numeric suffix to the character.
The string version will either return the input as-is (with high probability on the first time it is called with that input), or replace any existing '$' characters with underscores, then add a '$' sign and a numeric suffix to it.
Note that unique_name('f') therefore differs from unique_name("f"). The former returns something like f123, and the latter returns either f or f$123.
Referenced by Halide::Buffer< T, Dims >::Buffer().
std::string Halide::Internal::unique_name | ( | const std::string & | prefix | ) |
bool Halide::Internal::starts_with | ( | const std::string & | str, |
const std::string & | prefix ) |
Test if the first string starts with the second string.
bool Halide::Internal::ends_with | ( | const std::string & | str, |
const std::string & | suffix ) |
Test if the first string ends with the second string.
std::string Halide::Internal::replace_all | ( | const std::string & | str, |
const std::string & | find, | ||
const std::string & | replace ) |
Replace all matches of the second string in the first string with the last string.
std::vector< std::string > Halide::Internal::split_string | ( | const std::string & | source, |
const std::string & | delim ) |
Split the source string using 'delim' as the divider.
std::string Halide::Internal::join_strings | ( | const std::vector< T > & | sources, |
const std::string & | delim ) |
T Halide::Internal::fold_left | ( | const std::vector< T > & | vec, |
Fn | f ) |
T Halide::Internal::fold_right | ( | const std::vector< T > & | vec, |
Fn | f ) |
std::string Halide::Internal::extract_namespaces | ( | const std::string & | name, |
std::vector< std::string > & | namespaces ) |
Returns base name and fills in namespaces, outermost one first in vector.
Referenced by halide_handle_cplusplus_type::make().
std::string Halide::Internal::strip_namespaces | ( | const std::string & | name | ) |
Like extract_namespaces(), but strip and discard the namespaces, returning base name only.
std::string Halide::Internal::file_make_temp | ( | const std::string & | prefix, |
const std::string & | suffix ) |
Create a unique file with a name of the form prefixXXXXXsuffix in an arbitrary (but writable) directory; this is typically /tmp, but the specific location is not guaranteed.
(Note that the exact form of the file name may vary; in particular, the suffix may be ignored on Windows.) The file is created (but not opened), thus this can be called from different threads (or processes, e.g. when building with parallel make) without risking collision. Note that if this file is used as a temporary file, the caller is responsible for deleting it. Neither the prefix nor suffix may contain a directory separator.
std::string Halide::Internal::dir_make_temp | ( | ) |
Create a unique directory in an arbitrary (but writable) directory; this is typically somewhere inside /tmp, but the specific location is not guaranteed.
The directory will be empty (i.e., this will never return /tmp itself, but rather a new directory inside /tmp). The caller is responsible for removing the directory after use.
bool Halide::Internal::file_exists | ( | const std::string & | name | ) |
Wrapper for access().
Quietly ignores errors.
void Halide::Internal::assert_file_exists | ( | const std::string & | name | ) |
Assert-fail if the file doesn't exist.
Useful primarily for testing purposes.
void Halide::Internal::assert_no_file_exists | ( | const std::string & | name | ) |
Assert-fail if the file DOES exist.
Useful primarily for testing purposes.
void Halide::Internal::file_unlink | ( | const std::string & | name | ) |
Wrapper for unlink().
Asserts upon error.
Quietly ignores errors.
Referenced by Halide::Internal::TemporaryFile::~TemporaryFile().
void Halide::Internal::ensure_no_file_exists | ( | const std::string & | name | ) |
Ensure that no file with this path exists.
If such a file exists and cannot be removed, assert-fail.
void Halide::Internal::dir_rmdir | ( | const std::string & | name | ) |
Wrapper for rmdir().
Asserts upon error.
FileStat Halide::Internal::file_stat | ( | const std::string & | name | ) |
Wrapper for stat().
Asserts upon error.
std::vector< char > Halide::Internal::read_entire_file | ( | const std::string & | pathname | ) |
Read the entire contents of a file into a vector<char>.
The file is read in binary mode. Errors trigger an assertion failure.
void Halide::Internal::write_entire_file | ( | const std::string & | pathname, |
const void * | source, | ||
size_t | source_len ) |
Create or replace the contents of a file with a given pointer-and-length of memory.
If the file doesn't exist, it is created; if it does exist, it is completely overwritten. Any error triggers an assertion failure.
Referenced by write_entire_file().
Definition at line 322 of file Util.h.
References write_entire_file().
Routines to test if math would overflow for signed integers with the given number of bits.
Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Add >().
Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Sub >().
Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Mul >().
HALIDE_MUST_USE_RESULT bool Halide::Internal::add_with_overflow | ( | int | bits, |
int64_t | a, | ||
int64_t | b, | ||
int64_t * | result ) |
Routines to perform arithmetic on signed types without triggering signed overflow.
If overflow would occur, sets result to zero and returns false. Otherwise sets result to the correct value and returns true.
Referenced by Halide::Internal::Simplify::ExprInfo::trim_bounds_using_alignment().
HALIDE_MUST_USE_RESULT bool Halide::Internal::sub_with_overflow | ( | int | bits, |
int64_t | a, | ||
int64_t | b, | ||
int64_t * | result ) |
HALIDE_MUST_USE_RESULT bool Halide::Internal::mul_with_overflow | ( | int | bits, |
int64_t | a, | ||
int64_t | b, | ||
int64_t * | result ) |
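The contract documented for add_with_overflow() (zero result and false on overflow, the correct value and true otherwise) can be sketched as follows. This is an illustration only, not Halide's implementation: the name add_with_overflow_sketch is hypothetical, and the sketch assumes that bits lies in [1, 64] and that "overflow" means the sum does not fit in a signed two's-complement integer of that width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the documented contract for add_with_overflow().
inline bool add_with_overflow_sketch(int bits, int64_t a, int64_t b,
                                     int64_t *result) {
    assert(bits >= 1 && bits <= 64);  // assumption: supported widths

    // Range of a signed two's-complement integer with `bits` bits.
    const int64_t max_val =
        (bits == 64) ? INT64_MAX : ((int64_t{1} << (bits - 1)) - 1);
    const int64_t min_val = -max_val - 1;

    // Step 1: detect overflow of the int64_t addition without performing it,
    // so the sketch itself never triggers signed overflow.
    if ((b > 0 && a > INT64_MAX - b) || (b < 0 && a < INT64_MIN - b)) {
        *result = 0;
        return false;
    }
    const int64_t sum = a + b;

    // Step 2: check that the sum fits in the requested width.
    if (sum < min_val || sum > max_val) {
        *result = 0;
        return false;
    }
    *result = sum;
    return true;
}
```

Because the real routines are marked HALIDE_MUST_USE_RESULT, callers are expected to test the returned bool before reading result; the same two-step pattern would apply to subtraction and multiplication with the corresponding pre-checks.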
void Halide::Internal::halide_tic_impl | ( | const char * | file, |
int | line ) |
void Halide::Internal::halide_toc_impl | ( | const char * | file, |
int | line ) |
std::string Halide::Internal::c_print_name | ( | const std::string & | name, |
bool | prefix_underscore = true ) |
Emit a version of a string that is a valid identifier in C ('.' is replaced with '_').
If prefix_underscore is true (the default), an underscore will be prepended if the input starts with an alphabetic character to avoid reserved word clashes.
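The two rules stated above can be sketched in a few lines. This is a hedged illustration covering only the documented behavior: the name c_print_name_sketch is hypothetical, and any handling the real c_print_name() applies to characters other than '.' is deliberately left out.

```cpp
#include <cctype>
#include <string>

// Hypothetical sketch of the two documented rules of c_print_name():
//   1. '.' is replaced with '_'
//   2. an underscore is prepended when prefix_underscore is true and the
//      input starts with an alphabetic character
// Other characters are copied through unchanged (an assumption of this sketch).
inline std::string c_print_name_sketch(const std::string &name,
                                       bool prefix_underscore = true) {
    std::string out;
    if (prefix_underscore && !name.empty() &&
        std::isalpha(static_cast<unsigned char>(name[0]))) {
        out += '_';
    }
    for (char c : name) {
        out += (c == '.') ? '_' : c;
    }
    return out;
}
```

Under these rules an input such as "while" becomes "_while", which is why the prefix avoids clashes with C reserved words.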
int Halide::Internal::get_llvm_version | ( | ) |
Return the LLVM_VERSION against which this libHalide is compiled.
This is provided only for internal tests which need to verify behavior; please don't use this outside of Halide tests.
void Halide::Internal::run_with_large_stack | ( | const std::function< void()> & | action | ) |
Call the given action in a platform-specific context that provides at least the stack space returned by get_compiler_stack_size.
If that value is zero, just calls the function on the calling thread. Otherwise on Windows this uses a Fiber, and on other platforms it uses swapcontext.
int Halide::Internal::popcount64 | ( | uint64_t | x | ) |
Portable versions of popcount, count-leading-zeros, and count-trailing-zeros.
int Halide::Internal::clz64 | ( | uint64_t | x | ) |
int Halide::Internal::ctz64 | ( | uint64_t | x | ) |
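For readers unfamiliar with these operations, a loop-based sketch shows what each one computes without relying on compiler intrinsics. The names below are hypothetical and this is not Halide's implementation; in particular, returning 64 for a zero input in the clz and ctz sketches is a choice made here, and the documentation above does not say what the real functions do in that case.

```cpp
#include <cstdint>

// Hypothetical sketch: count the set bits by clearing the lowest set bit
// until none remain.
inline int popcount64_sketch(uint64_t x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;  // clears the lowest set bit
        ++count;
    }
    return count;
}

// Hypothetical sketch: count zero bits above the highest set bit.
// Returns 64 when x == 0 (an assumption of this sketch).
inline int clz64_sketch(uint64_t x) {
    int n = 0;
    for (uint64_t mask = uint64_t{1} << 63; mask != 0 && (x & mask) == 0;
         mask >>= 1) {
        ++n;
    }
    return n;
}

// Hypothetical sketch: count zero bits below the lowest set bit.
// Returns 64 when x == 0 (an assumption of this sketch).
inline int ctz64_sketch(uint64_t x) {
    int n = 0;
    for (uint64_t mask = 1; mask != 0 && (x & mask) == 0; mask <<= 1) {
        ++n;
    }
    return n;
}
```

A production version would normally dispatch to a compiler builtin or hardware instruction where one is available; the loops are used here only because they behave identically on every platform.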
|
inline |
std::vector< Var > Halide::Internal::make_argument_list | ( | int | dimensionality | ) |
Make a list of unique arguments for definitions with unnamed arguments.
Referenced by Halide::Func::define_extern() (three overloads).
Stmt Halide::Internal::vectorize_loops | ( | const Stmt & | s, |
const std::map< std::string, Function > & | env ) |
Take a statement with for loops marked for vectorization, and turn them into single statements that operate on vectors.
The loops in question must have constant extent.
std::map< std::string, Function > Halide::Internal::wrap_func_calls | ( | const std::map< std::string, Function > & | env | ) |
Replace every call to wrapped Functions in the Functions' definitions with calls to their wrapper functions.
|
inline |
Return the path to a directory that can be safely written to when running tests; the directory's contents may or may not outlast the lifetime of the test itself (i.e., the files may be cleaned up after test execution).
The path is guaranteed to be an absolute path and end in a directory separator, so a leaf filename can simply be appended. It is not guaranteed that this directory will be empty. If the path cannot be created, the function will assert-fail and return an invalid path.
Definition at line 75 of file halide_test_dirs.h.
References Halide::Internal::Test::get_current_directory(), and Halide::Internal::Test::get_env_variable().
Definition at line 22 of file AutoScheduleUtils.h.
|
constexpr |
|
extern |