Halide 19.0.0
Halide compiler and libraries
Halide::Internal Namespace Reference

Namespaces

namespace  Autoscheduler
 
namespace  Elf
 
namespace  GeneratorMinMax
 
namespace  IntegerDivision
 
namespace  IRMatcher
 An alternative template-metaprogramming approach to expression matching.
 
namespace  Test
 

Classes

class  AbstractGenerator
 AbstractGenerator is an ABC that defines the API a Generator must provide to work with the existing Generator infrastructure (GenGen, RunGen, execute_generator(), Generator Stubs). More...
 
struct  Acquire
 
struct  Add
 The sum of two expressions. More...
 
struct  all_are_convertible
 
struct  all_are_printable_args
 
struct  all_ints_and_optional_name
 
struct  all_ints_and_optional_name< First, Rest... >
 
struct  all_ints_and_optional_name< T >
 
struct  all_ints_and_optional_name<>
 
struct  Allocate
 Allocate a scratch area with the given name, type, and size. More...
 
struct  And
 Logical and - are both expressions true. More...
 
struct  ApplySplitResult
 
class  aslog
 
struct  AssertStmt
 If the 'condition' is false, then evaluate and return the message, which should be a call to an error function. More...
 
struct  AssociativeOp
 Represent the equivalent associative op of an update definition. More...
 
struct  AssociativePattern
 Represent an associative op with its identity. More...
 
struct  Atomic
 Lock all the Store nodes in the body statement. More...
 
struct  BaseExprNode
 A base class for expression nodes. More...
 
struct  BaseStmtNode
 IR nodes are split into expressions and statements. More...
 
struct  Block
 A sequence of statements to be executed in-order. More...
 
struct  Bound
 A bound on a loop, typically from Func::bound. More...
 
struct  Box
 Represents the bounds of a region of arbitrary dimension. More...
 
struct  Broadcast
 A vector with 'lanes' elements, in which every element is 'value'. More...
 
struct  BufferBuilder
 A builder to help create Exprs representing halide_buffer_t structs (e.g. More...
 
struct  BufferContents
 
struct  BufferInfo
 Find all calls to image buffers and parameters in the function. More...
 
struct  Call
 A function call. More...
 
struct  Cast
 The actual IR nodes begin here. More...
 
class  Closure
 A helper class to manage closures. More...
 
class  CodeGen_C
 This class emits C++ code equivalent to a halide Stmt. More...
 
class  CodeGen_GPU_C
 A base class for GPU backends that require C-like shader output. More...
 
struct  CodeGen_GPU_Dev
 A code generator that emits GPU code from a given Halide stmt. More...
 
class  CodeGen_LLVM
 A code generator abstract base class. More...
 
class  CodeGen_Posix
 A code generator that emits posix code from a given Halide stmt. More...
 
class  CodeGen_PyTorch
 This class emits C++ code to wrap a Halide pipeline so that it can be used as a C++ extension operator in PyTorch. More...
 
class  CompilerLogger
 
struct  cond
 
struct  ConstantInterval
 A class to represent ranges of integers. More...
 
struct  Convert
 
struct  Cost
 
class  debug
 For optional debugging during codegen, use the debug class as follows: More...
 
class  Definition
 A Function definition which can represent either an init or an update definition. More...
 
struct  DeviceArgument
 A DeviceArgument looks similar to a Halide::Argument, but has behavioral differences that make it specific to the GPU pipeline; the fact that it neither is-a nor has-a Halide::Argument is deliberate. More...
 
struct  Dim
 The Dim struct represents one loop in the schedule's representation of a loop nest. More...
 
class  Dimension
 
struct  Div
 The ratio of two expressions. More...
 
struct  EQ
 Is the first expression equal to the second. More...
 
struct  ErrorReport
 
struct  Evaluate
 Evaluate and discard an expression, presumably because it has some side-effect. More...
 
struct  ExecuteGeneratorArgs
 ExecuteGeneratorArgs is the set of arguments to execute_generator(). More...
 
struct  ExprNode
 We use the "curiously recurring template pattern" to avoid duplicated code in the IR Nodes. More...
 
class  ExprUsesVars
 
struct  FeatureIntermediates
 
struct  FileStat
 
class  FindAllCalls
 Visitor for keeping track of functions that are directly called and the arguments with which they are called. More...
 
struct  FloatImm
 Floating point constants. More...
 
struct  For
 A for loop. More...
 
struct  Fork
 A pair of statements executed concurrently. More...
 
struct  Free
 Free the resources associated with the given buffer. More...
 
class  FuncSchedule
 A schedule for a Function of a Halide pipeline. More...
 
class  Function
 A reference-counted handle to Halide's internal representation of a function. More...
 
struct  FunctionPtr
 A possibly-weak pointer to a Halide function. More...
 
struct  FusedPair
 This represents two stages with fused loop nests from outermost to a specific loop level. More...
 
struct  GE
 Is the first expression greater than or equal to the second. More...
 
class  GeneratorBase
 
class  GeneratorFactoryProvider
 GeneratorFactoryProvider provides a way to customize the Generators that are visible to generate_filter_main (which otherwise would just look at the global registry of C++ Generators). More...
 
class  GeneratorInput_Arithmetic
 
class  GeneratorInput_Buffer
 
class  GeneratorInput_DynamicScalar
 
class  GeneratorInput_Func
 
class  GeneratorInput_Scalar
 
class  GeneratorInputBase
 
class  GeneratorInputImpl
 
class  GeneratorOutput_Arithmetic
 
class  GeneratorOutput_Buffer
 
class  GeneratorOutput_Func
 
class  GeneratorOutputBase
 
class  GeneratorOutputImpl
 
class  GeneratorParam_Arithmetic
 
class  GeneratorParam_AutoSchedulerParams
 
class  GeneratorParam_Bool
 
class  GeneratorParam_Enum
 
class  GeneratorParam_LoopLevel
 
class  GeneratorParam_String
 
class  GeneratorParam_Synthetic
 
class  GeneratorParam_Target
 
class  GeneratorParam_Type
 
class  GeneratorParamBase
 
class  GeneratorParamImpl
 
class  GeneratorParamInfo
 
class  GeneratorRegistry
 
class  GIOBase
 GIOBase is the base class for all GeneratorInput<> and GeneratorOutput<> instantiations; it is not part of the public API and should never be used directly by user code. More...
 
class  GPUCompilationCache
 
class  GpuObjectLifetimeTracker
 
struct  GT
 Is the first expression greater than the second. More...
 
struct  HalideBufferStaticTypeAndDims
 
struct  HalideBufferStaticTypeAndDims<::Halide::Buffer< T, Dims > >
 
struct  HalideBufferStaticTypeAndDims<::Halide::Runtime::Buffer< T, Dims > >
 
struct  has_static_halide_type_method
 
struct  has_static_halide_type_method< T2, typename type_sink< decltype(T2::static_halide_type())>::type >
 
class  HexagonAlignmentAnalyzer
 
struct  HoistedStorage
 Represents a location where storage will be hoisted to for a Func / Realize node with a given name. More...
 
class  HostClosure
 A Closure modified to inspect GPU-specific memory accesses, and produce a vector of DeviceArgument objects. More...
 
struct  IfThenElse
 An if-then-else block. More...
 
struct  Indentation
 
struct  InferredArgument
 An inferred argument. More...
 
struct  Interval
 A class to represent ranges of Exprs. More...
 
struct  IntImm
 Integer constants. More...
 
struct  IntrusivePtr
 Intrusive shared pointers have a reference count (a RefCount object) stored in the class itself. More...
 
struct  IRDeepCompare
 A compare struct built around less_than, for use as the comparison object in a std::map or std::set. More...
 
struct  IRGraphDeepCompare
 A compare struct built around graph_less_than, for use as the comparison object in a std::map or std::set. More...
 
class  IRGraphMutator
 A mutator that caches and reapplies previously-done mutations, so that it can handle graphs of IR that have not had CSE done to them. More...
 
class  IRGraphVisitor
 A base class for algorithms that walk recursively over the IR without visiting the same node twice. More...
 
struct  IRHandle
 IR nodes are passed around as opaque handles to them. More...
 
class  IRMutator
 A base class for passes over the IR which modify it (e.g. More...
 
struct  IRNode
 The abstract base class for a node in the Halide IR. More...
 
class  IRPrinter
 An IRVisitor that emits IR to the given output stream in a human readable form. More...
 
class  IRVisitor
 A base class for algorithms that need to recursively walk over the IR. More...
 
struct  is_printable_arg
 
struct  IsHalideBuffer
 
struct  IsHalideBuffer< const halide_buffer_t * >
 
struct  IsHalideBuffer< halide_buffer_t * >
 
struct  IsHalideBuffer<::Halide::Buffer< T, Dims > >
 
struct  IsHalideBuffer<::Halide::Runtime::Buffer< T, Dims > >
 
struct  IsRoundtrippable
 
struct  JITCache
 
struct  JITErrorBuffer
 
struct  JITFuncCallContext
 
struct  JITModule
 
class  JITSharedRuntime
 
class  JSONCompilerLogger
 JSONCompilerLogger is a basic implementation of the CompilerLogger interface that saves logged data, then logs it all in JSON format in emit_to_stream(). More...
 
struct  LE
 Is the first expression less than or equal to the second. More...
 
struct  Let
 A let expression, like you might find in a functional language. More...
 
struct  LetStmt
 The statement form of a let node. More...
 
struct  Load
 Load a value from a named symbol if predicate is true. More...
 
struct  LoweredArgument
 Definition of an argument to a LoweredFunc. More...
 
struct  LoweredFunc
 Definition of a lowered function. More...
 
struct  LT
 Is the first expression less than the second. More...
 
struct  Max
 The greater of two values. More...
 
struct  meta_and
 
struct  meta_and< T1, Args... >
 
struct  meta_or
 
struct  meta_or< T1, Args... >
 
struct  Min
 The lesser of two values. More...
 
struct  Mod
 The remainder of a / b. More...
 
struct  ModulusRemainder
 The result of modulus_remainder analysis. More...
 
struct  Mul
 The product of two expressions. More...
 
struct  NE
 Is the first expression not equal to the second. More...
 
struct  NoRealizations
 
struct  NoRealizations< T, Args... >
 
struct  NoRealizations<>
 
struct  Not
 Logical not - true if the expression is false. More...
 
class  ObjectInstanceRegistry
 
struct  Or
 Logical or - is at least one of the expressions true. More...
 
struct  OutputInfo
 
struct  PipelineFeatures
 
struct  Prefetch
 Represent a multi-dimensional region of a Func or an ImageParam that needs to be prefetched. More...
 
struct  PrefetchDirective
 
struct  PrintSpan
 Allow easily printing the contents of containers, or std::vector-like containers, in debug output. More...
 
struct  PrintSpanLn
 Allow easily printing the contents of spans, or std::vector-like spans, in debug output. More...
 
struct  ProducerConsumer
 This node is a helpful annotation to do with permissions. More...
 
struct  Provide
 This defines the value of a function at a multi-dimensional location. More...
 
class  PythonExtensionGen
 
struct  Ramp
 A linear ramp vector node. More...
 
struct  Realize
 Allocate a multi-dimensional buffer of the given type and size. More...
 
class  ReductionDomain
 A reference-counted handle on a reduction domain, which is just a vector of ReductionVariable. More...
 
struct  ReductionVariable
 A single named dimension of a reduction domain. More...
 
struct  ReductionVariableInfo
 Return a list of reduction variables the expression or tuple depends on. More...
 
class  RefCount
 A class representing a reference count to be used with IntrusivePtr. More...
 
struct  RegionCosts
 Auto scheduling component which is used to assign costs for computing a region of a function or one of its stages. More...
 
class  RegisterGenerator
 
struct  Reinterpret
 Reinterpret value as another type, without affecting any of the bits (on little-endian systems). More...
 
struct  reverse_adaptor
 
struct  ScheduleFeatures
 
class  Scope
 A common pattern when traversing Halide IR is that you need to keep track of stuff when you find a Let or a LetStmt, and that it should hide previous values with the same name until you leave the Let or LetStmt nodes. This class helps with that. More...
 
struct  ScopedBinding
 Helper class for pushing/popping Scope<> values, to allow for early-exit in Visitor/Mutators that preserves correctness. More...
 
struct  ScopedBinding< void >
 
struct  ScopedValue
 Helper class for saving/restoring variable values on the stack, to allow for early-exit that preserves correctness. More...
 
struct  Select
 A ternary operator. More...
 
struct  select_type
 
struct  select_type< First >
 
struct  Shuffle
 Construct a new vector by taking elements from another sequence of vectors. More...
 
class  Simplify
 
class  SmallStack
 A stack which can store one item very efficiently. More...
 
class  SmallStack< void >
 
struct  SolverResult
 
struct  Specialization
 
struct  Split
 
class  StageSchedule
 A schedule for a single stage of a Halide pipeline. More...
 
struct  StaticCast
 
struct  Stmt
 A reference-counted handle to a statement node. More...
 
struct  StmtNode
 
struct  StorageDim
 Properties of one axis of the storage of a Func. More...
 
struct  Store
 Store a 'value' to the buffer called 'name' at a given 'index' if 'predicate' is true. More...
 
struct  StringImm
 String constants. More...
 
class  StubInput
 
class  StubInputBuffer
 StubInputBuffer is the placeholder that a Stub uses when it requires a Buffer for an input (rather than merely a Func or Expr). More...
 
class  StubOutputBuffer
 StubOutputBuffer is the placeholder that a Stub uses when it requires a Buffer for an output (rather than merely a Func). More...
 
class  StubOutputBufferBase
 
struct  Sub
 The difference of two expressions. More...
 
class  TemporaryFile
 A simple utility class that creates a temporary file in its ctor and deletes that file in its dtor; this is useful for temporary files that you want to ensure are deleted when exiting a certain scope. More...
 
struct  type_sink
 
struct  UIntImm
 Unsigned integer constants. More...
 
struct  Variable
 A named variable. More...
 
class  VariadicVisitor
 A visitor/mutator capable of passing arbitrary arguments to the visit methods using CRTP and returning any types from them. More...
 
struct  VectorReduce
 Horizontally reduce a vector to a scalar or narrower vector using the given commutative and associative binary operator. More...
 
class  Voidifier
 
struct  WasmModule
 Handle to compiled wasm code which can be called later. More...
 
struct  Weights
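
The Scope entry above describes name bindings that hide earlier bindings of the same name until the enclosing Let or LetStmt is left. A minimal standalone sketch of that idea follows. It does not use Halide and is not Halide's implementation; ToyScope, its push/pop/get/contains members, and visit_nested_lets are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the idea behind Scope<T> (not Halide's code):
// each name maps to a stack of values, so an inner binding hides an outer one.
template<typename T>
class ToyScope {
    std::map<std::string, std::vector<T>> table;

public:
    // Bind a name, hiding any previous binding of the same name.
    void push(const std::string &name, T value) {
        table[name].push_back(std::move(value));
    }

    // Remove the most recent binding, re-exposing the one it hid.
    void pop(const std::string &name) {
        auto it = table.find(name);
        assert(it != table.end() && !it->second.empty());
        it->second.pop_back();
        if (it->second.empty()) {
            table.erase(it);
        }
    }

    // True if the name currently has at least one binding.
    bool contains(const std::string &name) const {
        return table.count(name) != 0;
    }

    // Return the innermost binding of a name.
    const T &get(const std::string &name) const {
        auto it = table.find(name);
        assert(it != table.end());
        return it->second.back();
    }
};

// Simulates visiting `let x = 1 in (let x = 2 in <body>)` and records what
// `x` resolves to at each point of the traversal.
std::vector<int> visit_nested_lets() {
    std::vector<int> seen;
    ToyScope<int> scope;
    scope.push("x", 1);              // enter the outer Let
    seen.push_back(scope.get("x"));
    scope.push("x", 2);              // enter the inner Let, hiding the outer x
    seen.push_back(scope.get("x"));
    scope.pop("x");                  // leave the inner Let
    seen.push_back(scope.get("x"));
    scope.pop("x");                  // leave the outer Let
    return seen;
}
```

A visitor built this way would push on entering a Let node and pop on leaving it, which is the bookkeeping that ScopedBinding is described as automating.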
 

Typedefs

using AbstractGeneratorPtr = std::unique_ptr<AbstractGenerator>
 
typedef std::map< std::string, Interval > DimBounds
 
typedef std::map< std::pair< std::string, int >, Interval > FuncValueBounds
 
template<typename T , typename T2 >
using add_const_if_T_is_const = typename std::conditional<std::is_const<T>::value, const T2, T2>::type
 
template<typename T >
using GeneratorParamImplBase
 
template<typename T , typename TBase = typename std::remove_all_extents<T>::type>
using GeneratorInputImplBase
 
template<typename T , typename TBase = typename std::remove_all_extents<T>::type>
using GeneratorOutputImplBase
 
using GeneratorFactory = std::function<AbstractGeneratorPtr(const GeneratorContext &context)>
 
typedef llvm::raw_pwrite_stream LLVMOStream
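
As a rough illustration of the two map typedefs in this list, the sketch below mirrors their shapes with a toy interval type. It is standalone and does not use Halide. ToyInterval, ToyDimBounds, ToyFuncValueBounds, and extent_of are hypothetical names, and the reading of the keys (a dimension name for DimBounds, a function name plus an integer index for FuncValueBounds) is an assumption based on the declared key types only.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for Interval; the real type holds Exprs, not ints.
struct ToyInterval {
    int min, max;
};

// Same container shapes as the DimBounds and FuncValueBounds typedefs,
// with ToyInterval substituted for Interval.
using ToyDimBounds = std::map<std::string, ToyInterval>;
using ToyFuncValueBounds = std::map<std::pair<std::string, int>, ToyInterval>;

// Extent of one named dimension, or 0 if the dimension is not present.
int extent_of(const ToyDimBounds &bounds, const std::string &dim) {
    auto it = bounds.find(dim);
    return it == bounds.end() ? 0 : it->second.max - it->second.min + 1;
}
```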
 

Enumerations

enum class  ArgInfoKind { Scalar , Function , Buffer }
 
enum class  ArgInfoDirection { Input , Output }
 
enum class  Direction { Upper , Lower }
 Given a varying expression, try to find a constant that is either an upper bound (always greater than or equal to the expression) or a lower bound (always less than or equal to the expression). If it fails, returns an undefined Expr. More...
 
enum class  IRNodeType {
  IntImm , UIntImm , FloatImm , StringImm ,
  Broadcast , Cast , Reinterpret , Variable ,
  Add , Sub , Mod , Mul ,
  Div , Min , Max , EQ ,
  NE , LT , LE , GT ,
  GE , And , Or , Not ,
  Select , Load , Ramp , Call ,
  Let , Shuffle , VectorReduce , LetStmt ,
  AssertStmt , ProducerConsumer , For , Acquire ,
  Store , Provide , Allocate , Free ,
  Realize , Block , Fork , IfThenElse ,
  Evaluate , Prefetch , Atomic , HoistedStorage
}
 All our IR node types get unique IDs for the purposes of RTTI. More...
 
enum class  ForType {
  Serial , Parallel , Vectorized , Unrolled ,
  Extern , GPUBlock , GPUThread , GPULane
}
 An enum describing a type of loop traversal. More...
 
enum class  SyntheticParamType { Type , Dim , ArraySize }
 
enum class  Monotonic { Constant , Increasing , Decreasing , Unknown }
 Detect whether an expression is monotonic increasing in a variable, decreasing, or unknown. More...
 
enum class  DimType { PureVar = 0 , PureRVar , ImpureRVar }
 Each Dim below has a dim_type, which tells you what transformations are legal on it. More...
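
The IRNodeType entry above says each IR node type gets a unique ID for RTTI, and the ExprNode entry mentions the curiously recurring template pattern. The standalone sketch below shows one way those two ideas can combine: a tag stored in the base class, filled in by a CRTP helper, and used for a checked downcast. It does not use Halide and is not Halide's implementation; every name in it (ToyNodeType, ToyNode, ToyNodeOf, ToyIntImm, ToyAdd, static_type, as, evaluate) is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy subset of a node tag enum; the entries echo two IRNodeType names.
enum class ToyNodeType { IntImm, Add };

// The base node stores its tag, so type checks need no dynamic_cast.
struct ToyNode {
    const ToyNodeType node_type;
    explicit ToyNode(ToyNodeType t) : node_type(t) {}
    virtual ~ToyNode() = default;

    // Checked downcast: returns nullptr when the tag does not match T.
    template<typename T>
    const T *as() const {
        return node_type == T::static_type ? static_cast<const T *>(this) : nullptr;
    }
};

// CRTP helper: each concrete node passes itself as T, and the base
// constructor picks up T's tag, so no node repeats that boilerplate.
template<typename T>
struct ToyNodeOf : ToyNode {
    ToyNodeOf() : ToyNode(T::static_type) {}
};

struct ToyIntImm : ToyNodeOf<ToyIntImm> {
    static constexpr ToyNodeType static_type = ToyNodeType::IntImm;
    long value;
    explicit ToyIntImm(long v) : value(v) {}
};

struct ToyAdd : ToyNodeOf<ToyAdd> {
    static constexpr ToyNodeType static_type = ToyNodeType::Add;
    std::shared_ptr<const ToyNode> a, b;
    ToyAdd(std::shared_ptr<const ToyNode> lhs, std::shared_ptr<const ToyNode> rhs)
        : a(std::move(lhs)), b(std::move(rhs)) {}
};

// Evaluate a small tree by dispatching on the stored tag.
long evaluate(const ToyNode &n) {
    if (const ToyIntImm *i = n.as<ToyIntImm>()) {
        return i->value;
    }
    if (const ToyAdd *add = n.as<ToyAdd>()) {
        return evaluate(*add->a) + evaluate(*add->b);
    }
    assert(false && "unhandled node type");
    return 0;
}
```

Comparing an enum tag is a single integer comparison, which is one plausible reason to prefer it over dynamic_cast in code that inspects node types very frequently.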
 

Functions

Stmt add_atomic_mutex (Stmt s, const std::vector< Function > &outputs)
 
Stmt add_image_checks (const Stmt &s, const std::vector< Function > &outputs, const Target &t, const std::vector< std::string > &order, const std::map< std::string, Function > &env, const FuncValueBounds &fb, bool will_inject_host_copies)
 Insert checks to make sure a statement doesn't read out of bounds on inputs or outputs, and that the inputs and outputs conform to the format required (e.g.
 
Stmt add_parameter_checks (const std::vector< Stmt > &requirements, Stmt s, const Target &t)
 Insert checks to make sure that all referenced parameters meet their constraints.
 
Stmt add_split_factor_checks (const Stmt &s, const std::map< std::string, Function > &env)
 Insert checks that all split factors that depend on scalar parameters are strictly positive.
 
Stmt align_loads (const Stmt &s, int alignment, int min_bytes_to_align)
 Attempt to rewrite unaligned loads from buffers which are known to be aligned to instead load aligned vectors that cover the original load, and then slice the original load out of the aligned vectors.
 
Stmt allocation_bounds_inference (Stmt s, const std::map< std::string, Function > &env, const std::map< std::pair< std::string, int >, Interval > &func_bounds)
 Take a partially lowered statement with Realize nodes in terms of variables, and define values for those variables.
 
std::vector< ApplySplitResult > apply_split (const Split &split, const std::string &prefix, std::map< std::string, Expr > &dim_extent_alignment)
 Given a Split schedule on a definition (init or update), return a list of predicates on the definition, substitutions that need to be applied to the definition (in ascending order of application), and let stmts that define the values of variables referred to by the predicates and substitutions (ordered from innermost to outermost let).
 
std::vector< std::pair< std::string, Expr > > compute_loop_bounds_after_split (const Split &split, const std::string &prefix)
 Compute the loop bounds of the new dimensions resulting from applying the split schedules using the loop bounds of the old dimensions.
 
const std::vector< AssociativePattern > & get_ops_table (const std::vector< Expr > &exprs)
 
AssociativeOp prove_associativity (const std::string &f, std::vector< Expr > args, std::vector< Expr > exprs)
 Given an update definition of a Func 'f', determine its equivalent associative binary/unary operator if there is any.
 
void associativity_test ()
 
Stmt fork_async_producers (Stmt s, const std::map< std::string, Function > &env)
 
int string_to_int (const std::string &s)
 Return an int representation of 's'.
 
Expr substitute_var_estimates (Expr e)
 Substitute every variable in an Expr or a Stmt with its estimate if specified.
 
Stmt substitute_var_estimates (Stmt s)
 
Expr get_extent (const Interval &i)
 Return the size of an interval.
 
Expr box_size (const Box &b)
 Return the size of an n-d box.
 
void disp_regions (const std::map< std::string, Box > &regions)
 Helper function to print the bounds of a region.
 
Definition get_stage_definition (const Function &f, int stage_num)
 Return the corresponding definition of a function given the stage.
 
std::vector< Dim > & get_stage_dims (const Function &f, int stage_num)
 Return the corresponding loop dimensions of a function given the stage.
 
void combine_load_costs (std::map< std::string, Expr > &result, const std::map< std::string, Expr > &partial)
 Add partial load costs to the corresponding function in the result costs.
 
DimBounds get_stage_bounds (const Function &f, int stage_num, const DimBounds &pure_bounds)
 Return the required bounds of an intermediate stage (f, stage_num) of function 'f' given the bounds of the pure dimensions.
 
std::vector< DimBounds > get_stage_bounds (const Function &f, const DimBounds &pure_bounds)
 Return the required bounds for all the stages of the function 'f'.
 
Expr perform_inline (Expr e, const std::map< std::string, Function > &env, const std::set< std::string > &inlines=std::set< std::string >(), const std::vector< std::string > &order=std::vector< std::string >())
 Recursively inline all the functions in the set 'inlines' into the expression 'e' and return the resulting expression.
 
std::set< std::string > get_parents (Function f, int stage)
 Return all functions that are directly called by a function stage (f, stage).
 
template<typename K , typename V >
V get_element (const std::map< K, V > &m, const K &key)
 Return value of element within a map.
 
template<typename K , typename V >
V & get_element (std::map< K, V > &m, const K &key)
 
bool inline_all_trivial_functions (const std::vector< Function > &outputs, const std::vector< std::string > &order, const std::map< std::string, Function > &env)
 If the cost of computing a Func is about the same as calling the Func, inline the Func.
 
std::string is_func_called_element_wise (const std::vector< std::string > &order, size_t index, const std::map< std::string, Function > &env)
 Determine if a Func (order[index]) is only consumed by a single other Func in an element-wise manner.
 
bool inline_all_element_wise_functions (const std::vector< Function > &outputs, const std::vector< std::string > &order, const std::map< std::string, Function > &env)
 Inline a Func if its values are only consumed by a single other Func in an element-wise manner.
 
void propagate_estimate_test ()
 
Stmt bound_constant_extent_loops (const Stmt &s)
 Replace all loop extents of unrolled or vectorized loops with constants, by substituting and simplifying as needed.
 
const FuncValueBounds & empty_func_value_bounds ()
 
Interval bounds_of_expr_in_scope (const Expr &expr, const Scope< Interval > &scope, const FuncValueBounds &func_bounds=empty_func_value_bounds(), bool const_bound=false)
 Given an expression in some variables, and a map from those variables to their bounds (in the form of (minimum possible value, maximum possible value)), compute two expressions that give the minimum possible value and the maximum possible value of this expression.
 
Expr find_constant_bound (const Expr &e, Direction d, const Scope< Interval > &scope=Scope< Interval >::empty_scope())
 
Interval find_constant_bounds (const Expr &e, const Scope< Interval > &scope)
 Find bounds for a varying expression that are either constants or +/-inf.
 
void merge_boxes (Box &a, const Box &b)
 Expand box a to encompass box b.
 
bool boxes_overlap (const Box &a, const Box &b)
 Test if box a could possibly overlap box b.
 
Box box_union (const Box &a, const Box &b)
 The union of two boxes.
 
Box box_intersection (const Box &a, const Box &b)
 The intersection of two boxes.
 
bool box_contains (const Box &a, const Box &b)
 Test if box a provably contains box b.
 
std::map< std::string, Box > boxes_required (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 Compute rectangular domains large enough to cover all the 'Call's to each function that occurs within a given statement or expression.
 
std::map< std::string, Box > boxes_required (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 
std::map< std::string, Box > boxes_provided (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 Compute rectangular domains large enough to cover all the 'Provide's to each function that occurs within a given statement or expression.
 
std::map< std::string, Box > boxes_provided (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 
std::map< std::string, Box > boxes_touched (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 Compute rectangular domains large enough to cover all the 'Call's and 'Provide's to each function that occurs within a given statement or expression.
 
std::map< std::string, Box > boxes_touched (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 
Box box_required (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 Variants of the above that are only concerned with a single function.
 
Box box_required (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 
Box box_provided (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 
Box box_provided (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 
Box box_touched (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 
Box box_touched (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
 
FuncValueBounds compute_function_value_bounds (const std::vector< std::string > &order, const std::map< std::string, Function > &env)
 Compute the maximum and minimum possible value for each function in an environment.
 
Expr span_of_bounds (const Interval &bounds)
 
void bounds_test ()
 
Stmt bounds_inference (Stmt, const std::vector< Function > &outputs, const std::vector< std::string > &realization_order, const std::vector< std::vector< std::string > > &fused_groups, const std::map< std::string, Function > &environment, const std::map< std::pair< std::string, int >, Interval > &func_bounds, const Target &target)
 Take a partially lowered statement that includes symbolic representations of the bounds over which things should be realized, and inject expressions defining those bounds.
 
Stmt bound_small_allocations (const Stmt &s)
 
Expr buffer_accessor (const Buffer<> &buf, const std::vector< Expr > &args)
 
template<typename T , typename = typename std::enable_if<!std::is_convertible<T, std::string>::value>::type>
std::string get_name_from_end_of_parameter_pack (T &&)
 
std::string get_name_from_end_of_parameter_pack (const std::string &n)
 
std::string get_name_from_end_of_parameter_pack ()
 
template<typename First , typename Second , typename... Args>
std::string get_name_from_end_of_parameter_pack (First first, Second second, Args &&...rest)
 
void get_shape_from_start_of_parameter_pack_helper (std::vector< int > &, const std::string &)
 
void get_shape_from_start_of_parameter_pack_helper (std::vector< int > &)
 
template<typename... Args>
void get_shape_from_start_of_parameter_pack_helper (std::vector< int > &result, int x, Args &&...rest)
 
template<typename... Args>
std::vector< int > get_shape_from_start_of_parameter_pack (Args &&...args)
 
template<typename T >
void buffer_type_name_non_const (std::ostream &s)
 
template<>
void buffer_type_name_non_const< void > (std::ostream &s)
 
template<typename T >
std::string buffer_type_name ()
 
Stmt canonicalize_gpu_vars (Stmt s)
 Canonicalize GPU var names into some pre-determined block/thread names (i.e.
 
const std::string & gpu_thread_name (int index)
 Names for the thread and block id variables.
 
const std::string & gpu_block_name (int index)
 
Stmt clamp_unsafe_accesses (const Stmt &s, const std::map< std::string, Function > &env, FuncValueBounds &func_bounds)
 Inject clamps around func calls h(...) when all the following conditions hold:
 
std::unique_ptr< CodeGen_GPU_Dev > new_CodeGen_D3D12Compute_Dev (const Target &target)
 
llvm::Type * get_vector_element_type (llvm::Type *)
 Get the scalar type of an llvm vector type.
 
bool function_takes_user_context (const std::string &name)
 Which built-in functions require a user-context first argument?
 
bool can_allocation_fit_on_stack (int64_t size)
 Given a size (in bytes), return true if the allocation size can fit on the stack; otherwise, return false.
 
std::pair< Expr, Expr > long_div_mod_round_to_zero (const Expr &a, const Expr &b, std::optional< uint64_t > max_abs=std::nullopt)
 Does a {div/mod}_round_to_zero using binary long division for int/uint.
 
Expr lower_mux (const Call *mux)
 Reduce a mux intrinsic to a select tree.
 
Expr lower_round_to_nearest_ties_to_even (const Expr &)
 A vectorizable implementation of Halide::round that doesn't depend on any standard library being present.
 
void get_target_options (const llvm::Module &module, llvm::TargetOptions &options)
 Given an llvm::Module, set llvm::TargetOptions information.
 
void clone_target_options (const llvm::Module &from, llvm::Module &to)
 Given two llvm::Modules, clone target options from one to the other.
 
std::unique_ptr< llvm::TargetMachine > make_target_machine (const llvm::Module &module)
 Given an llvm::Module, get or create an llvm::TargetMachine.
 
void set_function_attributes_from_halide_target_options (llvm::Function &)
 Set the appropriate llvm Function attributes given the Halide Target.
 
void embed_bitcode (llvm::Module *M, const std::string &halide_command)
 Save a copy of the llvm IR currently represented by the module as data in the __LLVM,__bitcode section.
 
std::unique_ptr< CodeGen_GPU_Dev > new_CodeGen_Metal_Dev (const Target &target)
 
std::unique_ptr< CodeGen_GPU_Dev > new_CodeGen_OpenCL_Dev (const Target &target)
 
std::unique_ptr< CodeGen_GPU_Dev > new_CodeGen_PTX_Dev (const Target &target)
 
std::unique_ptr< CodeGen_Posix > new_CodeGen_ARM (const Target &target)
 Construct CodeGen object for a variety of targets.
 
std::unique_ptr< CodeGen_Posix > new_CodeGen_Hexagon (const Target &target)
 
std::unique_ptr< CodeGen_Posix > new_CodeGen_PowerPC (const Target &target)
 
std::unique_ptr< CodeGen_Posix > new_CodeGen_RISCV (const Target &target)
 
std::unique_ptr< CodeGen_Posix > new_CodeGen_X86 (const Target &target)
 
std::unique_ptr< CodeGen_Posix > new_CodeGen_WebAssembly (const Target &target)
 
std::unique_ptr< CodeGen_GPU_Dev > new_CodeGen_Vulkan_Dev (const Target &target)
 
std::unique_ptr< CodeGen_GPU_Dev > new_CodeGen_WebGPU_Dev (const Target &target)
 
std::unique_ptr< CompilerLogger > set_compiler_logger (std::unique_ptr< CompilerLogger > compiler_logger)
 Set the active CompilerLogger object, replacing any existing one.
 
CompilerLogger * get_compiler_logger ()
 Return the currently active CompilerLogger object.
 
ConstantInterval constant_integer_bounds (const Expr &e, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope(), std::map< Expr, ConstantInterval, ExprCompare > *cache=nullptr)
 Deduce constant integer bounds on an expression.
 
ConstantInterval operator+ (const ConstantInterval &a, const ConstantInterval &b)
 Arithmetic operators on ConstantIntervals.
 
ConstantInterval operator+ (const ConstantInterval &a, int64_t b)
 
ConstantInterval operator- (const ConstantInterval &a, const ConstantInterval &b)
 
ConstantInterval operator- (const ConstantInterval &a, int64_t b)
 
ConstantInterval operator/ (const ConstantInterval &a, const ConstantInterval &b)
 
ConstantInterval operator/ (const ConstantInterval &a, int64_t b)
 
ConstantInterval operator* (const ConstantInterval &a, const ConstantInterval &b)
 
ConstantInterval operator* (const ConstantInterval &a, int64_t b)
 
ConstantInterval operator% (const ConstantInterval &a, const ConstantInterval &b)
 
ConstantInterval operator% (const ConstantInterval &a, int64_t b)
 
ConstantInterval min (const ConstantInterval &a, const ConstantInterval &b)
 
ConstantInterval min (const ConstantInterval &a, int64_t b)
 
ConstantInterval max (const ConstantInterval &a, const ConstantInterval &b)
 
ConstantInterval max (const ConstantInterval &a, int64_t b)
 
ConstantInterval abs (const ConstantInterval &a)
 
ConstantInterval operator<< (const ConstantInterval &a, const ConstantInterval &b)
 
ConstantInterval operator<< (const ConstantInterval &a, int64_t b)
 
ConstantInterval operator<< (int64_t a, const ConstantInterval &b)
 
ConstantInterval operator>> (const ConstantInterval &a, const ConstantInterval &b)
 
ConstantInterval operator>> (const ConstantInterval &a, int64_t b)
 
ConstantInterval operator>> (int64_t a, const ConstantInterval &b)
 
bool operator<= (const ConstantInterval &a, const ConstantInterval &b)
 Comparison operators on ConstantIntervals.
 
bool operator<= (const ConstantInterval &a, int64_t b)
 
bool operator<= (int64_t a, const ConstantInterval &b)
 
bool operator< (const ConstantInterval &a, const ConstantInterval &b)
 
bool operator< (const ConstantInterval &a, int64_t b)
 
bool operator< (int64_t a, const ConstantInterval &b)
 
bool operator>= (const ConstantInterval &a, const ConstantInterval &b)
 
bool operator> (const ConstantInterval &a, const ConstantInterval &b)
 
bool operator>= (const ConstantInterval &a, int64_t b)
 
bool operator> (const ConstantInterval &a, int64_t b)
 
bool operator>= (int64_t a, const ConstantInterval &b)
 
bool operator> (int64_t a, const ConstantInterval &b)
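
As a rough illustration of what interval arithmetic and comparison of this kind compute, here is a standalone sketch that does not use Halide. It assumes a simplified, always-bounded interval type (the hypothetical `IntervalSketch`; a real ConstantInterval may also be unbounded on either side) and ignores integer overflow. The comparison assumes the conservative reading that `a <= b` holds only when it holds for every pair of values drawn from the two intervals; all function names here are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a bounded interval: all integers in [min, max].
struct IntervalSketch {
    int64_t min, max;
};

// Sum: the smallest and largest sums come from adding matching endpoints.
inline IntervalSketch interval_add_sketch(IntervalSketch a, IntervalSketch b) {
    return {a.min + b.min, a.max + b.max};
}

// Difference: the smallest result subtracts the largest b from the smallest a.
inline IntervalSketch interval_sub_sketch(IntervalSketch a, IntervalSketch b) {
    return {a.min - b.max, a.max - b.min};
}

// Product: the extremes are among the four endpoint products.
inline IntervalSketch interval_mul_sketch(IntervalSketch a, IntervalSketch b) {
    const int64_t p1 = a.min * b.min, p2 = a.min * b.max;
    const int64_t p3 = a.max * b.min, p4 = a.max * b.max;
    return {std::min({p1, p2, p3, p4}), std::max({p1, p2, p3, p4})};
}

// Conservative comparison: true only if every value in a is <= every value in b.
inline bool interval_always_le_sketch(IntervalSketch a, IntervalSketch b) {
    return a.max <= b.min;
}
```

Under this reading, a comparison returning false does not mean the opposite comparison is true: overlapping intervals are simply not provably ordered.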
 
std::string cplusplus_function_mangled_name (const std::string &name, const std::vector< std::string > &namespaces, Type return_type, const std::vector< ExternFuncArgument > &args, const Target &target)
 Return the mangled C++ name for a function.
 
void cplusplus_mangle_test ()
 
Expr common_subexpression_elimination (const Expr &, bool lift_all=false)
 Replace each common sub-expression in the argument with a variable, and wrap the resulting expr in a let statement giving a value to that variable.
 
Stmt common_subexpression_elimination (const Stmt &, bool lift_all=false)
 Do common-subexpression-elimination on each expression in a statement.
 
void cse_test ()
 
std::ostream & operator<< (std::ostream &stream, const Stmt &)
 Emit a halide statement on an output stream (such as std::cout) in a human-readable form.
 
std::ostream & operator<< (std::ostream &stream, const LoweredFunc &)
 Emit a halide LoweredFunc in a human readable format.
 
template<typename T >
 PrintSpan (const T &) -> PrintSpan< T >
 
template<typename StreamT , typename T >
StreamT & operator<< (StreamT &stream, const PrintSpan< T > &wrapper)
 
template<typename T >
 PrintSpanLn (const T &) -> PrintSpanLn< T >
 
template<typename StreamT , typename T >
StreamT & operator<< (StreamT &stream, const PrintSpanLn< T > &wrapper)
 
void debug_arguments (LoweredFunc *func, const Target &t)
 Injects debug prints in a LoweredFunc that describe the target and arguments.
 
Stmt debug_to_file (Stmt s, const std::vector< Function > &outputs, const std::map< std::string, Function > &env)
 Takes a statement with Realize nodes still unlowered.
 
Expr extract_odd_lanes (const Expr &a)
 Extract the odd-numbered lanes in a vector.
 
Expr extract_even_lanes (const Expr &a)
 Extract the even-numbered lanes in a vector.
 
Expr extract_lane (const Expr &vec, int lane)
 Extract the nth lane of a vector.
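
The lane-extraction helpers above operate on Halide vector expressions. The following standalone sketch shows the same idea on a plain `std::vector<int>`, assuming lanes are numbered from zero so that the even lanes are 0, 2, 4, ... The function names are hypothetical and the code does not use Halide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: keep lanes start, start + 2, start + 4, ...
inline std::vector<int> every_other_lane_sketch(const std::vector<int> &v, std::size_t start) {
    std::vector<int> out;
    for (std::size_t i = start; i < v.size(); i += 2) {
        out.push_back(v[i]);
    }
    return out;
}

// Even-numbered lanes: indices 0, 2, 4, ...
inline std::vector<int> even_lanes_sketch(const std::vector<int> &v) {
    return every_other_lane_sketch(v, 0);
}

// Odd-numbered lanes: indices 1, 3, 5, ...
inline std::vector<int> odd_lanes_sketch(const std::vector<int> &v) {
    return every_other_lane_sketch(v, 1);
}
```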
 
Stmt rewrite_interleavings (const Stmt &s)
 Look through a statement for expressions of the form select(ramp % 2 == 0, a, b) and replace them with calls to an interleave intrinsic.
 
void deinterleave_vector_test ()
 
Expr remove_let_definitions (const Expr &expr)
 Remove all let definitions of expr.
 
std::vector< int > gather_variables (const Expr &expr, const std::vector< std::string > &filter)
 Return a list of variables' indices that expr depends on and are in the filter.
 
std::vector< int > gather_variables (const Expr &expr, const std::vector< Var > &filter)
 
std::map< std::string, ReductionVariableInfo > gather_rvariables (const Expr &expr)
 
std::map< std::string, ReductionVariableInfo > gather_rvariables (const Tuple &tuple)
 
Expr add_let_expression (const Expr &expr, const std::map< std::string, Expr > &let_var_mapping, const std::vector< std::string > &let_variables)
 Add necessary let expressions to expr.
 
std::vector< Expr > sort_expressions (const Expr &expr)
 Topologically sort the expression graph expressed by expr.
 
std::map< std::string, Box > inference_bounds (const std::vector< Func > &funcs, const std::vector< Box > &output_bounds)
 Compute the bounds of funcs.
 
std::map< std::string, Box > inference_bounds (const Func &func, const Box &output_bounds)
 
std::vector< std::pair< Expr, Expr > > box_to_vector (const Box &bounds)
 Convert Box to vector of (min, extent)
 
bool equal (const RDom &bounds0, const RDom &bounds1)
 Return true if bounds0 and bounds1 represent the same bounds.
 
std::vector< std::string > vars_to_strings (const std::vector< Var > &vars)
 Return a list of variable names.
 
ReductionDomain extract_rdom (const Expr &expr)
 Return the reduction domain used by expr.
 
std::pair< bool, Expr > solve_inverse (Expr expr, const std::string &new_var, const std::string &var)
 expr is new_var == f(var); solve for var == g(new_var). If multiple values of new_var correspond to the same var, introduce an RDom.
 
std::map< std::string, BufferInfo > find_buffer_param_calls (const Func &func)
 
std::set< std::string > find_implicit_variables (const Expr &expr)
 Find all implicit variables in expr.
 
Expr substitute_rdom_predicate (const std::string &name, const Expr &replacement, const Expr &expr)
 Substitute the variable.
 
bool is_calling_function (const std::string &func_name, const Expr &expr, const std::map< std::string, Expr > &let_var_mapping)
 Return true if expr contains call to func_name.
 
bool is_calling_function (const Expr &expr, const std::map< std::string, Expr > &let_var_mapping)
 Return true if expr depends on any function or buffer.
 
Expr substitute_call_arg_with_pure_arg (Func f, int variable_id, const Expr &e)
 Replaces call to Func f in Expr e such that the call argument at variable_id is the pure argument.
 
Expr make_device_interface_call (DeviceAPI device_api, MemoryType memory_type=MemoryType::Auto)
 Get an Expr which evaluates to the device interface for the given device api at runtime.
 
Stmt distribute_shifts (const Stmt &stmt, bool multiply_adds)
 
Stmt inject_early_frees (const Stmt &s)
 Take a statement with allocations and inject markers (of the form of calls to "mark buffer dead") after the last use of each allocation.
 
Type eliminated_bool_type (Type bool_type, Type other_type)
 If a type is a boolean vector, find the type that it has been changed to by eliminate_bool_vectors.
 
bool is_float16_transcendental (const Call *)
 Check if a call is a float16 transcendental (e.g.
 
Expr lower_float16_transcendental_to_float32_equivalent (const Call *)
 Implement a float16 transcendental using the float32 equivalent.
 
Expr float32_to_bfloat16 (Expr e)
 Cast to/from float and bfloat using bitwise math.
 
Expr float32_to_float16 (Expr e)
 
Expr float16_to_float32 (Expr e)
 
Expr bfloat16_to_float32 (Expr e)
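
To make "bitwise math" concrete for the bfloat16 case: a bfloat16 value has the same layout as the upper 16 bits of an IEEE float32. The standalone sketch below works on plain C++ scalars rather than Halide Exprs. The rounding mode (round to nearest, ties to even) and the lack of NaN handling are assumptions of this sketch, not a statement about what the Halide functions do; all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern and back, without undefined behavior.
inline uint32_t float_to_bits_sketch(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return u;
}

inline float bits_to_float_sketch(uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

// Narrow float32 to bfloat16: round to nearest (ties to even), then keep
// the top 16 bits. NaN inputs are not handled in this sketch.
inline uint16_t float32_to_bfloat16_sketch(float f) {
    uint32_t bits = float_to_bits_sketch(f);
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<uint16_t>(bits >> 16);
}

// Widen bfloat16 to float32: the missing low mantissa bits are zero, so this is exact.
inline float bfloat16_to_float32_sketch(uint16_t b) {
    return bits_to_float_sketch(static_cast<uint32_t>(b) << 16);
}
```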
 
Expr lower_float16_cast (const Cast *op)
 
HALIDE_EXPORT_SYMBOL void unhandled_exception_handler ()
 
template<>
RefCount & ref_count< IRNode > (const IRNode *t) noexcept
 
template<>
void destroy< IRNode > (const IRNode *t)
 
bool is_unordered_parallel (ForType for_type)
 Check if for_type executes for loop iterations in parallel and unordered.
 
bool is_parallel (ForType for_type)
 Returns true if for_type executes for loop iterations in parallel.
 
bool is_gpu (ForType for_type)
 Returns true if for_type is GPUBlock, GPUThread, or GPULane.
 
template<typename StmtOrExpr , typename T >
bool stmt_or_expr_uses_vars (const StmtOrExpr &e, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
 Test if a statement or expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.
 
template<typename StmtOrExpr >
bool stmt_or_expr_uses_var (const StmtOrExpr &e, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
 Test if a statement or expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.
 
bool expr_uses_var (const Expr &e, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
 Test if an expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.
 
bool stmt_uses_var (const Stmt &stmt, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
 Test if a statement references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.
 
template<typename T >
bool expr_uses_vars (const Expr &e, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
 Test if an expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.
 
template<typename T >
bool stmt_uses_vars (const Stmt &stmt, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
 Test if a statement references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.
 
Stmt extract_tile_operations (const Stmt &s)
 Rewrite any AMX tile operations that have been stored in the AMXTile memory type as intrinsic calls, to be used in the X86 backend.
 
std::map< std::string, Function > find_direct_calls (const Function &f)
 Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, including in update definitions, update index expressions, and RDom extents.
 
std::map< std::string, Function > find_transitive_calls (const Function &f)
 Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, or indirectly in those functions' definitions, recursively.
 
std::map< std::string, Function > build_environment (const std::vector< Function > &funcs)
 Find all Functions transitively referenced by any Function in funcs and return a map of them.
 
std::vector< Function > called_funcs_in_order_found (const std::vector< Function > &funcs)
 Returns the same Functions as build_environment, but returns a vector of Functions instead, where the order is the order in which the Functions were first encountered.
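
The difference between direct and transitive calls, and the "order first encountered" variant, can be sketched on a plain call graph. This standalone example uses a hypothetical `CallGraphSketch` map from names to directly called names instead of Halide's Function objects. The depth-first traversal order, and the choice to include the root functions themselves in the result, are assumptions of the sketch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: function name -> names it calls directly.
using CallGraphSketch = std::map<std::string, std::vector<std::string>>;

// Depth-first walk that records each function the first time it is reached.
inline void visit_calls_sketch(const CallGraphSketch &graph, const std::string &name,
                               std::set<std::string> &seen, std::vector<std::string> &order) {
    if (!seen.insert(name).second) {
        return;  // already encountered
    }
    order.push_back(name);
    auto it = graph.find(name);
    if (it == graph.end()) {
        return;  // a leaf, such as an input
    }
    for (const std::string &callee : it->second) {
        visit_calls_sketch(graph, callee, seen, order);
    }
}

// All functions reachable from the roots, in order of first encounter.
inline std::vector<std::string> calls_in_order_found_sketch(const CallGraphSketch &graph,
                                                            const std::vector<std::string> &roots) {
    std::set<std::string> seen;
    std::vector<std::string> order;
    for (const std::string &root : roots) {
        visit_calls_sketch(graph, root, seen, order);
    }
    return order;
}
```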
 
Expr lower_widen_right_add (const Expr &a, const Expr &b)
 Implement intrinsics with equivalents that do not use intrinsics.
 
Expr lower_widen_right_mul (const Expr &a, const Expr &b)
 
Expr lower_widen_right_sub (const Expr &a, const Expr &b)
 
Expr lower_widening_add (const Expr &a, const Expr &b)
 
Expr lower_widening_mul (const Expr &a, const Expr &b)
 
Expr lower_widening_sub (const Expr &a, const Expr &b)
 
Expr lower_widening_shift_left (const Expr &a, const Expr &b)
 
Expr lower_widening_shift_right (const Expr &a, const Expr &b)
 
Expr lower_rounding_shift_left (const Expr &a, const Expr &b)
 
Expr lower_rounding_shift_right (const Expr &a, const Expr &b)
 
Expr lower_saturating_add (const Expr &a, const Expr &b)
 
Expr lower_saturating_sub (const Expr &a, const Expr &b)
 
Expr lower_saturating_cast (const Type &t, const Expr &a)
 
Expr lower_halving_add (const Expr &a, const Expr &b)
 
Expr lower_halving_sub (const Expr &a, const Expr &b)
 
Expr lower_rounding_halving_add (const Expr &a, const Expr &b)
 
Expr lower_sorted_avg (const Expr &a, const Expr &b)
 
Expr lower_mul_shift_right (const Expr &a, const Expr &b, const Expr &q)
 
Expr lower_rounding_mul_shift_right (const Expr &a, const Expr &b, const Expr &q)
 
Expr lower_intrinsic (const Call *op)
 Replace one of the above ops with equivalent arithmetic.
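
To give a feel for what "equivalent arithmetic" means for a few of these operations, here is a standalone scalar sketch on `int8_t`. It assumes the usual meanings of the operation names: widening_add adds in a type twice as wide, saturating_add clamps to the narrow range, halving_add computes (a + b) / 2 rounding down, and rounding_halving_add computes (a + b + 1) / 2 rounding down. The function names are hypothetical, and this is not the IR the lowering functions produce.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// widening_add: add in a wider type so the sum cannot overflow.
inline int16_t widening_add_sketch(int8_t a, int8_t b) {
    return static_cast<int16_t>(static_cast<int16_t>(a) + static_cast<int16_t>(b));
}

// saturating_add: clamp the widened sum back into the int8_t range.
inline int8_t saturating_add_sketch(int8_t a, int8_t b) {
    const int wide = widening_add_sketch(a, b);
    return static_cast<int8_t>(std::max<int>(INT8_MIN, std::min<int>(INT8_MAX, wide)));
}

// Floor division by two that does not rely on >> for negative values.
inline int floor_half_sketch(int w) {
    return (w - (w < 0 ? 1 : 0)) / 2;
}

// halving_add: (a + b) / 2, rounding down, computed without overflow.
inline int8_t halving_add_sketch(int8_t a, int8_t b) {
    return static_cast<int8_t>(floor_half_sketch(widening_add_sketch(a, b)));
}

// rounding_halving_add: (a + b + 1) / 2, rounding down.
inline int8_t rounding_halving_add_sketch(int8_t a, int8_t b) {
    return static_cast<int8_t>(floor_half_sketch(widening_add_sketch(a, b) + 1));
}
```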
 
Stmt find_intrinsics (const Stmt &s)
 Replace common arithmetic patterns with intrinsics.
 
Expr find_intrinsics (const Expr &e)
 
Expr lower_intrinsics (const Expr &e)
 The reverse of find_intrinsics.
 
Stmt lower_intrinsics (const Stmt &s)
 
Stmt flatten_nested_ramps (const Stmt &s)
 Take a statement/expression and replace nested ramps and broadcasts.
 
Expr flatten_nested_ramps (const Expr &e)
 
template<typename Last >
void check_types (const Tuple &t, int idx)
 
template<typename First , typename Second , typename... Rest>
void check_types (const Tuple &t, int idx)
 
template<typename Last >
void assign_results (Realization &r, int idx, Last last)
 
template<typename First , typename Second , typename... Rest>
void assign_results (Realization &r, int idx, First first, Second second, Rest &&...rest)
 
void schedule_scalar (Func f)
 
std::pair< std::vector< Function >, std::map< std::string, Function > > deep_copy (const std::vector< Function > &outputs, const std::map< std::string, Function > &env)
 Deep copy an entire Function DAG.
 
Stmt zero_gpu_loop_mins (const Stmt &s)
 Rewrite all GPU loops to have a min of zero.
 
Stmt fuse_gpu_thread_loops (Stmt s)
 Converts Halide's GPGPU IR to the OpenCL/CUDA/Metal model.
 
Stmt fuzz_float_stores (const Stmt &s)
 On every store of a floating point value, mask off the least-significant-bit of the mantissa.
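
On a single 32-bit float, masking off the least-significant mantissa bit amounts to clearing bit 0 of the value's bit pattern. The standalone sketch below shows that one step on a C++ `float`; the function name is hypothetical, and the real pass rewrites Store nodes in Halide IR rather than operating on scalars.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Clear the lowest mantissa bit of an IEEE-754 single-precision value.
inline float mask_mantissa_lsb_sketch(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));   // reinterpret without undefined behavior
    bits &= ~uint32_t{1};                   // bit 0 is the least-significant mantissa bit
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```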
 
void generator_test ()
 
std::vector< Expr > parameter_constraints (const Parameter &p)
 
template<typename T >
HALIDE_NO_USER_CODE_INLINE std::string enum_to_string (const std::map< std::string, T > &enum_map, const T &t)
 
template<typename T >
T enum_from_string (const std::map< std::string, T > &enum_map, const std::string &s)
 
const std::map< std::string, Halide::Type > & get_halide_type_enum_map ()
 
std::string halide_type_to_enum_string (const Type &t)
 
std::string halide_type_to_c_source (const Type &t)
 
std::string halide_type_to_c_type (const Type &t)
 
const GeneratorFactoryProvider & get_registered_generators ()
 Return a GeneratorFactoryProvider that knows about all the currently-registered C++ Generators.
 
int generate_filter_main (int argc, char **argv)
 generate_filter_main() is a convenient wrapper for GeneratorRegistry::create() + compile_to_files(); it can be trivially wrapped by a "real" main() to produce a command-line utility for ahead-of-time filter compilation.
 
int generate_filter_main (int argc, char **argv, const GeneratorFactoryProvider &generator_factory_provider)
 This overload of generate_filter_main lets you provide your own provider for how to enumerate and/or create the generators based on registration name; this is useful if you want to re-use the 'main' logic but avoid the global Generator registry (e.g.
 
template<typename T >
T parse_scalar (const std::string &value)
 
std::vector< Type > parse_halide_type_list (const std::string &types)
 
void execute_generator (const ExecuteGeneratorArgs &args)
 Execute a Generator for AOT compilation – this provides the implementation of the command-line Generator interface generate_filter_main(), but with a structured API that is more suitable for calling directly from code (vs command line).
 
Stmt inject_hexagon_rpc (Stmt s, const Target &host_target, Module &module)
 Pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module.
 
Buffer< uint8_t > compile_module_to_hexagon_shared_object (const Module &device_code)
 
Stmt optimize_hexagon_shuffles (const Stmt &s, int lut_alignment)
 Replace indirect and other loads with simple loads + vlut calls.
 
Stmt scatter_gather_generator (Stmt s)
 
Stmt optimize_hexagon_instructions (Stmt s, const Target &t)
 Hexagon deinterleaves when performing widening operations, and interleaves when performing narrowing operations.
 
Expr native_deinterleave (const Expr &x)
 Generate deinterleave or interleave operations, operating on groups of vectors at a time.
 
Expr native_interleave (const Expr &x)
 
bool is_native_deinterleave (const Expr &x)
 
bool is_native_interleave (const Expr &x)
 
std::string type_suffix (Type type, bool signed_variants=true)
 
std::string type_suffix (const Expr &a, bool signed_variants=true)
 
std::string type_suffix (const Expr &a, const Expr &b, bool signed_variants=true)
 
std::string type_suffix (const std::vector< Expr > &ops, bool signed_variants=true)
 
std::vector< InferredArgument > infer_arguments (const Stmt &body, const std::vector< Function > &outputs)
 
Stmt call_extern_and_assert (const std::string &name, const std::vector< Expr > &args)
 A helper function to call an extern function, and assert that it returns 0.
 
Stmt inject_host_dev_buffer_copies (Stmt s, const Target &t)
 Inject calls to halide_device_malloc, halide_copy_to_device, and halide_copy_to_host as needed.
 
Stmt inline_function (Stmt s, const Function &f)
 Inline a single named function, which must be pure.
 
Expr inline_function (Expr e, const Function &f)
 
void inline_function (Function caller, const Function &f)
 
void validate_schedule_inlined_function (Function f)
 Check if the schedule of an inlined function is legal, throwing an error if it is not.
 
template<typename T >
RefCount & ref_count (const T *t) noexcept
 Because in this header we don't yet know how client classes store their RefCount (and we don't want to depend on the declarations of the client classes), any class that you want to hold onto via one of these must provide implementations of ref_count and destroy, which we forward-declare here.
 
template<typename T >
void destroy (const T *t)
 
bool equal_impl (const IRNode &a, const IRNode &b)
 
bool graph_equal_impl (const IRNode &a, const IRNode &b)
 
bool less_than_impl (const IRNode &a, const IRNode &b)
 
bool graph_less_than_impl (const IRNode &a, const IRNode &b)
 
HALIDE_ALWAYS_INLINE bool equal (const Expr &a, int b)
 Compare an Expr to an int literal.
 
HALIDE_ALWAYS_INLINE bool equal (const IRNode &a, const IRNode &b)
 Check if two defined Stmts or Exprs are equal.
 
HALIDE_ALWAYS_INLINE bool equal (const IRHandle &a, const IRHandle &b)
 Check if two possibly-undefined Stmts or Exprs are equal.
 
HALIDE_ALWAYS_INLINE bool graph_equal (const IRNode &a, const IRNode &b)
 Check if two defined Stmts or Exprs are equal.
 
HALIDE_ALWAYS_INLINE bool graph_equal (const IRHandle &a, const IRHandle &b)
 Check if two possibly-undefined Stmts or Exprs are equal.
 
HALIDE_ALWAYS_INLINE bool less_than (const IRNode &a, const IRNode &b)
 Check if two defined Stmts or Exprs are in a lexicographic order.
 
HALIDE_ALWAYS_INLINE bool less_than (const IRHandle &a, const IRHandle &b)
 Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.
 
HALIDE_ALWAYS_INLINE bool graph_less_than (const IRNode &a, const IRNode &b)
 Check if two defined Stmts or Exprs are in a lexicographic order.
 
HALIDE_ALWAYS_INLINE bool graph_less_than (const IRHandle &a, const IRHandle &b)
 Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.
 
void ir_equality_test ()
 
bool expr_match (const Expr &pattern, const Expr &expr, std::vector< Expr > &result)
 Does the first expression have the same structure as the second? Variables in the first expression with the name * are interpreted as wildcards, and their matching equivalent in the second expression is placed in the vector given as the third argument.
 
bool expr_match (const Expr &pattern, const Expr &expr, std::map< std::string, Expr > &result)
 Does the first expression have the same structure as the second? Variables are matched consistently.
 
Expr with_lanes (const Expr &x, int lanes)
 Rewrite the expression x to have lanes lanes.
 
void expr_match_test ()
 
template<typename Mutator , typename... Args>
std::pair< Region, bool > mutate_region (Mutator *mutator, const Region &bounds, Args &&...args)
 A helper function for mutator-like things to mutate regions.
 
bool is_const (const Expr &e)
 Is the expression either an IntImm, a FloatImm, a StringImm, or a Cast of the same, or a Ramp or Broadcast of the same.
 
bool is_const (const Expr &e, int64_t v)
 Is the expression an IntImm, FloatImm of a particular value, or a Cast, or Broadcast of the same.
 
std::optional< int64_t > as_const_int (const Expr &e)
 If an expression is an IntImm or a Broadcast of an IntImm, return its value.
 
std::optional< uint64_t > as_const_uint (const Expr &e)
 If an expression is a UIntImm or a Broadcast of a UIntImm, return its value.
 
std::optional< double > as_const_float (const Expr &e)
 If an expression is a FloatImm or a Broadcast of a FloatImm, return its value.
 
std::optional< int > is_const_power_of_two_integer (const Expr &e)
 Is the expression a constant integer power of two.
 
std::optional< int > is_const_power_of_two_integer (uint64_t)
 
std::optional< int > is_const_power_of_two_integer (int64_t)
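
The `std::optional< int >` return type suggests that a successful check also yields a value. The standalone sketch below assumes that value is the base-two logarithm of the input, which is an assumption rather than something stated in the brief description above. It works on a plain `uint64_t` and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Return log2(v) if v is a power of two, or std::nullopt otherwise.
inline std::optional<int> power_of_two_exponent_sketch(uint64_t v) {
    // A power of two has exactly one bit set; zero has none.
    if (v == 0 || (v & (v - 1)) != 0) {
        return std::nullopt;
    }
    int bits = 0;
    while (v > 1) {
        v >>= 1;
        ++bits;
    }
    return bits;
}
```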
 
bool is_positive_const (const Expr &e)
 Is the expression a const (as defined by is_const), and also strictly greater than zero (in all lanes, if a vector expression)
 
bool is_negative_const (const Expr &e)
 Is the expression a const (as defined by is_const), and also strictly less than zero (in all lanes, if a vector expression)
 
bool is_undef (const Expr &e)
 Is the expression an undef.
 
bool is_const_zero (const Expr &e)
 Is the expression a const (as defined by is_const), and also equal to zero (in all lanes, if a vector expression)
 
bool is_const_one (const Expr &e)
 Is the expression a const (as defined by is_const), and also equal to one (in all lanes, if a vector expression)
 
bool is_no_op (const Stmt &s)
 Is the statement a no-op (which we represent as either an undefined Stmt, or as an Evaluate node of a constant)
 
bool is_pure (const Expr &e)
 Is the expression pure, i.e. does it 1) take on the same value no matter where it appears in a Stmt, and 2) have no side-effects when evaluated.
 
Expr make_const (Type t, int64_t val)
 Construct an immediate of the given type from any numeric C++ type.
 
Expr make_const (Type t, uint64_t val)
 
Expr make_const (Type t, double val)
 
Expr make_const (Type t, int32_t val)
 
Expr make_const (Type t, uint32_t val)
 
Expr make_const (Type t, int16_t val)
 
Expr make_const (Type t, uint16_t val)
 
Expr make_const (Type t, int8_t val)
 
Expr make_const (Type t, uint8_t val)
 
Expr make_const (Type t, bool val)
 
Expr make_const (Type t, float val)
 
Expr make_const (Type t, float16_t val)
 
Expr make_signed_integer_overflow (Type type)
 Construct a unique signed_integer_overflow Expr.
 
bool is_signed_integer_overflow (const Expr &expr)
 Check if an expression is a signed_integer_overflow.
 
void check_representable (Type t, int64_t val)
 Check if a constant value can be correctly represented as the given type.
 
Expr make_bool (bool val, int lanes=1)
 Construct a boolean constant from a C++ boolean value.
 
Expr make_zero (Type t)
 Construct the representation of zero in the given type.
 
Expr make_one (Type t)
 Construct the representation of one in the given type.
 
Expr make_two (Type t)
 Construct the representation of two in the given type.
 
Expr const_true (int lanes=1)
 Construct the constant boolean true.
 
Expr const_false (int lanes=1)
 Construct the constant boolean false.
 
Expr lossless_cast (Type t, Expr e, std::map< Expr, ConstantInterval, ExprCompare > *cache=nullptr)
 Attempt to cast an expression to a smaller type while provably not losing information.
 
Expr lossless_negate (const Expr &x)
 Attempt to negate x without introducing new IR and without overflow.
 
void match_types (Expr &a, Expr &b)
 Coerce the two expressions to have the same type, using C-style casting rules.
 
void match_types_bitwise (Expr &a, Expr &b, const char *op_name)
 Asserts that both expressions are integer types and are either both signed or both unsigned.
 
Expr halide_log (const Expr &a)
 Halide's vectorizable transcendentals.
 
Expr halide_exp (const Expr &a)
 
Expr halide_erf (const Expr &a)
 
Expr raise_to_integer_power (Expr a, int64_t b)
 Raise an expression to an integer power by repeatedly multiplying it by itself.
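
A scalar analogue of raising a value to an integer power by multiplication alone is sketched below on `double`. It uses repeated squaring to keep the number of multiplications small and handles negative exponents by taking a reciprocal; both choices are assumptions of this sketch and say nothing about how the Halide function builds its expression. The function name is hypothetical, and an exponent of INT64_MIN is not handled.

```cpp
#include <cassert>
#include <cstdint>

// Compute a^b using only multiplication (and one division for negative b).
inline double raise_to_integer_power_sketch(double a, int64_t b) {
    if (b < 0) {
        return 1.0 / raise_to_integer_power_sketch(a, -b);
    }
    double result = 1.0;
    while (b > 0) {
        if (b & 1) {
            result *= a;  // this bit of the exponent contributes a factor
        }
        a *= a;           // square for the next bit
        b >>= 1;
    }
    return result;
}
```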
 
void split_into_ands (const Expr &cond, std::vector< Expr > &result)
 Split a boolean condition into vector of ANDs.
 
Expr strided_ramp_base (const Expr &e, int stride=1)
 If e is a ramp expression with stride, default 1, return the base, otherwise undefined.
 
template<typename T >
T mod_imp (T a, T b)
 Implementations of division and mod that are specific to Halide.
 
template<typename T >
T div_imp (T a, T b)
 
template<>
float mod_imp< float > (float a, float b)
 
template<>
double mod_imp< double > (double a, double b)
 
template<>
float div_imp< float > (float a, float b)
 
template<>
double div_imp< double > (double a, double b)
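
What makes integer division and mod "specific to Halide" is, as documented for Halide's Expr operators, that the remainder is never negative and that dividing or taking the remainder by zero yields zero instead of faulting. The standalone sketch below reproduces those semantics for `int64_t`. The function names are hypothetical, this is not the library's code, and the overflowing case INT64_MIN / -1 is not handled.

```cpp
#include <cassert>
#include <cstdint>

// Remainder that is always in [0, |b|), with mod-by-zero defined as zero.
inline int64_t euclidean_mod_sketch(int64_t a, int64_t b) {
    if (b == 0) {
        return 0;
    }
    int64_t r = a % b;            // C++ remainder takes the sign of a
    if (r < 0) {
        r += (b < 0) ? -b : b;    // shift into the non-negative range
    }
    return r;
}

// Quotient matching the remainder above, so that a == q * b + r for b != 0.
inline int64_t euclidean_div_sketch(int64_t a, int64_t b) {
    if (b == 0) {
        return 0;
    }
    int64_t q = a / b;            // C++ division truncates toward zero
    if (a % b < 0) {
        q += (b > 0) ? -1 : 1;    // compensate for the remainder adjustment
    }
    return q;
}
```

With a positive divisor this rounds the quotient toward negative infinity, so for example -7 divided by 2 gives -4 with remainder 1, where C++ gives -3 with remainder -1.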
 
Expr remove_likelies (const Expr &e)
 Return an Expr that is identical to the input Expr, but with all calls to likely() and likely_if_innermost() removed.
 
Stmt remove_likelies (const Stmt &s)
 Return a Stmt that is identical to the input Stmt, but with all calls to likely() and likely_if_innermost() removed.
 
Expr remove_promises (const Expr &e)
 Return an Expr that is identical to the input Expr, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.
 
Stmt remove_promises (const Stmt &s)
 Return a Stmt that is identical to the input Stmt, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.
 
Expr unwrap_tags (const Expr &e)
 If the expression is a tag helper call, remove it and return the tagged expression.
 
HALIDE_NO_USER_CODE_INLINE void collect_print_args (std::vector< Expr > &args)
 
template<typename... Args>
HALIDE_NO_USER_CODE_INLINE void collect_print_args (std::vector< Expr > &args, const char *arg, Args &&...more_args)
 
template<typename... Args>
HALIDE_NO_USER_CODE_INLINE void collect_print_args (std::vector< Expr > &args, Expr arg, Args &&...more_args)
 
Expr requirement_failed_error (Expr condition, const std::vector< Expr > &args)
 
Expr memoize_tag_helper (Expr result, const std::vector< Expr > &cache_key_values)
 
void reset_random_counters ()
 Reset the counters used for random-number seeds in random_float/int/uint.
 
Expr unreachable (Type t=Int(32))
 Return an expression that should never be evaluated.
 
template<typename T >
Expr unreachable ()
 
Expr promise_clamped (const Expr &value, const Expr &min, const Expr &max)
 FOR INTERNAL USE ONLY.
 
std::ostream & operator<< (std::ostream &stream, IRNodeType)
 Emit a halide node type on an output stream (such as std::cout) in human-readable form.
 
std::ostream & operator<< (std::ostream &stream, const AssociativePattern &)
 Emit a halide associative pattern on an output stream (such as std::cout) in a human-readable form.
 
std::ostream & operator<< (std::ostream &stream, const AssociativeOp &)
 Emit a halide associative op on an output stream (such as std::cout) in a human-readable form.
 
std::ostream & operator<< (std::ostream &stream, const ForType &)
 Emit a halide for loop type (vectorized, serial, etc) in a human readable form.
 
std::ostream & operator<< (std::ostream &stream, const VectorReduce::Operator &)
 Emit a horizontal vector reduction op in human-readable form.
 
std::ostream & operator<< (std::ostream &stream, const NameMangling &)
 Emit a halide name mangling value in a human readable format.
 
std::ostream & operator<< (std::ostream &stream, const LinkageType &)
 Emit a halide linkage value in a human readable format.
 
std::ostream & operator<< (std::ostream &stream, const DimType &)
 Emit a halide dimension type in human-readable format.
 
std::ostream & operator<< (std::ostream &out, const Closure &c)
 Emit a Closure in human-readable form.
 
std::ostream & operator<< (std::ostream &out, const Interval &c)
 Emit an Interval in human-readable form.
 
std::ostream & operator<< (std::ostream &out, const ConstantInterval &c)
 Emit a ConstantInterval in human-readable form.
 
std::ostream & operator<< (std::ostream &out, const ModulusRemainder &c)
 Emit a ModulusRemainder in human-readable form.
 
std::ostream & operator<< (std::ostream &stream, const Indentation &)
 
void * get_symbol_address (const char *s)
 
Expr lower_lerp (Type final_type, Expr zero_val, Expr one_val, const Expr &weight, const Target &target)
 Build Halide IR that computes a lerp.
 
Stmt hoist_loop_invariant_values (Stmt)
 Hoist loop-invariants out of inner loops.
 
Stmt hoist_loop_invariant_if_statements (Stmt)
 Just hoist loop-invariant if statements as far up as possible.
 
template<typename T >
auto iterator_to_pointer (T iter) -> decltype(&*std::declval< T >())
 
std::string get_llvm_function_name (const llvm::Function *f)
 
std::string get_llvm_function_name (const llvm::Function &f)
 
llvm::StructType * get_llvm_struct_type_by_name (llvm::Module *module, const char *name)
 
llvm::Triple get_triple_for_target (const Target &target)
 Return the llvm::Triple that corresponds to the given Halide Target.
 
std::unique_ptr< llvm::Module > get_initial_module_for_target (Target, llvm::LLVMContext *, bool for_shared_jit_runtime=false, bool just_gpu=false)
 Create an llvm module containing the support code for a given target.
 
std::unique_ptr< llvm::Module > get_initial_module_for_ptx_device (Target, llvm::LLVMContext *c)
 Create an llvm module containing the support code for ptx device.
 
void add_bitcode_to_module (llvm::LLVMContext *context, llvm::Module &module, const std::vector< uint8_t > &bitcode, const std::string &name)
 Link a block of llvm bitcode into an llvm module.
 
std::unique_ptr< llvm::Module > link_with_wasm_jit_runtime (llvm::LLVMContext *c, const Target &t, std::unique_ptr< llvm::Module > extra_module)
 Take the llvm::Module(s) in extra_modules (if any), add the runtime modules needed for the WASM JIT, and link into a single llvm::Module.
 
Stmt loop_carry (Stmt, int max_carried_values=8)
 Reuse loads done on previous loop iterations by stashing them in induction variables instead of redoing the load.
 
Module lower (const std::vector< Function > &output_funcs, const std::string &pipeline_name, const Target &t, const std::vector< Argument > &args, LinkageType linkage_type, const std::vector< Stmt > &requirements=std::vector< Stmt >(), bool trace_pipeline=false, const std::vector< IRMutator * > &custom_passes=std::vector< IRMutator * >())
 Given a vector of scheduled halide functions, create a Module that evaluates it.
 
Stmt lower_main_stmt (const std::vector< Function > &output_funcs, const std::string &pipeline_name, const Target &t, const std::vector< Stmt > &requirements=std::vector< Stmt >(), bool trace_pipeline=false, const std::vector< IRMutator * > &custom_passes=std::vector< IRMutator * >())
 Given a halide function with a schedule, create a statement that evaluates it.
 
void lower_test ()
 
Stmt lower_parallel_tasks (const Stmt &s, std::vector< LoweredFunc > &closure_implementations, const std::string &name, const Target &t)
 
Stmt lower_warp_shuffles (Stmt s, const Target &t)
 Rewrite access to things stored outside the loop over GPU lanes to use nvidia's warp shuffle instructions.
 
Stmt inject_memoization (const Stmt &s, const std::map< std::string, Function > &env, const std::string &name, const std::vector< Function > &outputs)
 Transform pipeline calls for Funcs scheduled with memoize to do a lookup call to the runtime cache implementation, and if there is a miss, compute the results and call the runtime to store it back to the cache.
 
Stmt rewrite_memoized_allocations (const Stmt &s, const std::map< std::string, Function > &env)
 This should be called after Storage Flattening has added Allocation IR nodes.
 
std::map< OutputFileType, const OutputInfo > get_output_info (const Target &target)
 
ModulusRemainder operator+ (const ModulusRemainder &a, const ModulusRemainder &b)
 
ModulusRemainder operator- (const ModulusRemainder &a, const ModulusRemainder &b)
 
ModulusRemainder operator* (const ModulusRemainder &a, const ModulusRemainder &b)
 
ModulusRemainder operator/ (const ModulusRemainder &a, const ModulusRemainder &b)
 
ModulusRemainder operator% (const ModulusRemainder &a, const ModulusRemainder &b)
 
ModulusRemainder operator+ (const ModulusRemainder &a, int64_t b)
 
ModulusRemainder operator- (const ModulusRemainder &a, int64_t b)
 
ModulusRemainder operator* (const ModulusRemainder &a, int64_t b)
 
ModulusRemainder operator/ (const ModulusRemainder &a, int64_t b)
 
ModulusRemainder operator% (const ModulusRemainder &a, int64_t b)
 
ModulusRemainder modulus_remainder (const Expr &e)
 For things like alignment analysis, often it's helpful to know if an integer expression is some multiple of a constant plus some other constant.
 
ModulusRemainder modulus_remainder (const Expr &e, const Scope< ModulusRemainder > &scope)
 If we have alignment information about external variables, we can let the analysis know about that using this version of modulus_remainder:
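The "multiple of a constant plus some other constant" idea can be illustrated with a small standalone sketch. The names `ModRemSketch`, `gcd_sketch`, and `mod_rem_add_sketch` are hypothetical and are not Halide's ModulusRemainder implementation; the sketch only shows how such facts might combine under addition, and it ignores overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the fact "value == modulus * k + remainder".
// In this sketch, modulus == 0 means "exactly equal to remainder".
struct ModRemSketch {
    int64_t modulus;
    int64_t remainder;
};

// Greatest common divisor of |a| and |b|; gcd_sketch(0, 0) == 0.
inline int64_t gcd_sketch(int64_t a, int64_t b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// If x == ma*i + ra and y == mb*j + rb, then x + y is known modulo gcd(ma, mb).
inline ModRemSketch mod_rem_add_sketch(ModRemSketch a, ModRemSketch b) {
    assert(a.modulus >= 0 && b.modulus >= 0);
    int64_t m = gcd_sketch(a.modulus, b.modulus);
    int64_t r = a.remainder + b.remainder;
    if (m != 0) {
        r %= m;
        if (r < 0) r += m;  // keep the remainder in [0, m)
    }
    return {m, r};
}
```

For example, adding a value of the form 8k+3 to one of the form 12j+5 yields a value known to be a multiple of 4.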
 
void modulus_remainder_test ()
 
int64_t gcd (int64_t, int64_t)
 The greatest common divisor of two integers.
 
int64_t lcm (int64_t, int64_t)
 The least common multiple of two integers.
 
ConstantInterval derivative_bounds (const Expr &e, const std::string &var, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope())
 Find the bounds of the derivative of an expression.
 
Monotonic is_monotonic (const Expr &e, const std::string &var, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope())
 
Monotonic is_monotonic (const Expr &e, const std::string &var, const Scope< Monotonic > &scope)
 
std::ostream & operator<< (std::ostream &stream, const Monotonic &m)
 Emit the monotonic class in human-readable form for debugging.
 
void is_monotonic_test ()
 
Stmt inject_gpu_offload (const Stmt &s, const Target &host_target)
 Pull loops marked with GPU device APIs to a separate module, and call them through the appropriate host runtime module.
 
Stmt optimize_shuffles (Stmt s, int lut_alignment)
 
bool can_parallelize_rvar (const std::string &rvar, const std::string &func, const Definition &r)
 Returns whether or not Halide can prove that it is safe to parallelize an update definition across a specific variable.
 
void check_call_arg_types (const std::string &name, std::vector< Expr > *args, int dims)
 Validate arguments to a call to a func, image or imageparam.
 
bool has_uncaptured_likely_tag (const Expr &e, const Scope<> &scope)
 Return true if an expression uses a likely tag that isn't captured by an enclosing Select, Min, or Max.
 
bool has_likely_tag (const Expr &e, const Scope<> &scope)
 Return true if an expression uses a likely tag.
 
Stmt partition_loops (Stmt s)
 Partitions loop bodies into a prologue, a steady state, and an epilogue.
 
Stmt inject_placeholder_prefetch (const Stmt &s, const std::map< std::string, Function > &env, const std::string &prefix, const std::vector< PrefetchDirective > &prefetches)
 Inject placeholder prefetches to 's'.
 
Stmt inject_prefetch (const Stmt &s, const std::map< std::string, Function > &env)
 Compute the actual region to be prefetched and place it in the placeholder prefetch.
 
Stmt reduce_prefetch_dimension (Stmt stmt, const Target &t)
 Reduce a multi-dimensional prefetch into a prefetch of lower dimension (max dimension of the prefetch is specified by target architecture).
 
Stmt hoist_prefetches (const Stmt &s)
 Hoist all the prefetches in a Block to the beginning of the Block.
 
std::string print_loop_nest (const std::vector< Function > &output_funcs)
 Emit some simple pseudocode that shows the structure of the loop nest specified by this pipeline's schedule, and the schedules of the functions it uses.
 
Stmt inject_profiling (const Stmt &, const std::string &, const std::map< std::string, Function > &env)
 Take a statement representing a halide pipeline and insert high-resolution timing into the generated code (via spawning a thread that acts as a sampling profiler); summaries of execution times and counts will be logged at the end.
 
Expr purify_index_math (const Expr &)
 Bounds inference and related stages can lift integer bounds expressions out of if statements that guard against those integer expressions doing side-effecty things like dividing or modding by zero.
 
Expr qualify (const std::string &prefix, const Expr &value)
 Prefix all variable names in the given expression with the prefix string.
 
Expr random_float (const std::vector< Expr > &)
 Return a random floating-point number between zero and one that varies deterministically based on the input expressions.
 
Expr random_int (const std::vector< Expr > &)
 Return a random unsigned integer between zero and 2^32-1 that varies deterministically based on the input expressions (which must be integers or unsigned integers).
 
Expr lower_random (const Expr &e, const std::vector< VarOrRVar > &free_vars, int tag)
 Convert calls to random() to IR generated by random_float and random_int.
 
std::pair< std::vector< std::string >, std::vector< std::vector< std::string > > > realization_order (const std::vector< Function > &outputs, std::map< std::string, Function > &env)
 Given a bunch of functions that call each other, determine an order in which to do the scheduling.
 
std::vector< std::string > topological_order (const std::vector< Function > &outputs, const std::map< std::string, Function > &env)
 Given a bunch of functions that call each other, determine a topological order which stays constant regardless of the schedule.
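A schedule-independent ordering of this kind can be sketched as a post-order depth-first traversal of the call graph. The sketch below is standalone and does not use Halide's Function type: `CallGraphSketch` (a map from a function name to the names it calls) and all `*_sketch` names are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical call graph: function name -> names of the functions it calls.
using CallGraphSketch = std::map<std::string, std::vector<std::string>>;

enum class VisitStateSketch { InProgress, Done };

// Post-order depth-first visit: callees are emitted before their callers.
inline void visit_sketch(const std::string &name, const CallGraphSketch &calls,
                         std::map<std::string, VisitStateSketch> &state,
                         std::vector<std::string> &order) {
    auto seen = state.find(name);
    if (seen != state.end()) {
        // Reaching an in-progress node again would mean a call cycle.
        assert(seen->second == VisitStateSketch::Done);
        return;
    }
    state[name] = VisitStateSketch::InProgress;
    auto it = calls.find(name);
    if (it != calls.end()) {
        for (const std::string &callee : it->second) {
            visit_sketch(callee, calls, state, order);
        }
    }
    state[name] = VisitStateSketch::Done;
    order.push_back(name);
}

inline std::vector<std::string> topological_order_sketch(
    const std::vector<std::string> &outputs, const CallGraphSketch &calls) {
    std::map<std::string, VisitStateSketch> state;
    std::vector<std::string> order;
    for (const std::string &out : outputs) {
        visit_sketch(out, calls, state, order);
    }
    return order;
}

// Example: out calls blur_y, which calls blur_x, which calls input.
inline std::vector<std::string> example_order_sketch() {
    CallGraphSketch calls = {
        {"out", {"blur_y"}}, {"blur_y", {"blur_x"}}, {"blur_x", {"input"}}};
    return topological_order_sketch({"out"}, calls);
}
```

Because the result depends only on the call graph and the order of the outputs, it does not change when scheduling directives change.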
 
Stmt rebase_loops_to_zero (const Stmt &)
 Rewrite the mins of most loops to 0.
 
void split_predicate_test ()
 
bool is_func_trivial_to_inline (const Function &func)
 Return true if the cost of inlining a function is equivalent to the cost of calling the function directly.
 
Stmt remove_dead_allocations (const Stmt &s)
 Find Allocate/Free pairs that are never loaded from or stored to, and remove them from the Stmt.
 
Stmt remove_extern_loops (const Stmt &s)
 Removes placeholder loops for extern stages.
 
Stmt remove_undef (Stmt s)
 Removes stores that depend on undef values, and statements that only contain such stores.
 
Stmt schedule_functions (const std::vector< Function > &outputs, const std::vector< std::vector< std::string > > &fused_groups, const std::map< std::string, Function > &env, const Target &target, bool &any_memoized)
 Build loop nests and inject Function realizations at the appropriate places using the schedule.
 
template<typename T >
std::ostream & operator<< (std::ostream &stream, const Scope< T > &s)
 
Stmt select_gpu_api (const Stmt &s, const Target &t)
 Replace for loops with GPU_Default device_api with an actual device API depending on what's enabled in the target.
 
Stmt simplify (const Stmt &, bool remove_dead_code=true, const Scope< Interval > &bounds=Scope< Interval >::empty_scope(), const Scope< ModulusRemainder > &alignment=Scope< ModulusRemainder >::empty_scope(), const std::vector< Expr > &assumptions=std::vector< Expr >())
 Perform a wide range of simplifications to expressions and statements, including constant folding, substituting in trivial values, arithmetic rearranging, etc.
 
Expr simplify (const Expr &, bool remove_dead_code=true, const Scope< Interval > &bounds=Scope< Interval >::empty_scope(), const Scope< ModulusRemainder > &alignment=Scope< ModulusRemainder >::empty_scope(), const std::vector< Expr > &assumptions=std::vector< Expr >())
 
bool can_prove (Expr e, const Scope< Interval > &bounds=Scope< Interval >::empty_scope())
 Attempt to statically prove an expression is true using the simplifier.
 
Stmt simplify_exprs (const Stmt &)
 Simplify expressions found in a statement, but don't simplify across different statements.
 
Stmt simplify_correlated_differences (const Stmt &)
 Symbolic interval arithmetic can be extremely conservative in cases where we analyze the difference between two correlated expressions.
 
Expr bound_correlated_differences (const Expr &expr)
 Refactor the expression to remove correlated differences or rewrite them in a form that is more amenable to bounds inference.
 
void simplify_specializations (std::map< std::string, Function > &env)
 Try to simplify the RHS/LHS of a function's definition based on its specializations.
 
Stmt skip_stages (const Stmt &s, const std::vector< Function > &outputs, const std::vector< std::vector< std::string > > &order, const std::map< std::string, Function > &env)
 Avoid computing certain stages if we can infer a runtime condition to check that tells us they won't be used.
 
Stmt sliding_window (const Stmt &s, const std::map< std::string, Function > &env)
 Perform sliding window optimizations on a halide statement.
 
SolverResult solve_expression (const Expr &e, const std::string &variable, const Scope< Expr > &scope=Scope< Expr >::empty_scope())
 Attempts to collect all instances of a variable in an expression tree and place it as far to the left as possible, and as far up the tree as possible (i.e.
 
Interval solve_for_outer_interval (const Expr &c, const std::string &variable)
 Find the smallest interval such that the condition is either true or false inside of it, but definitely false outside of it.
 
Interval solve_for_inner_interval (const Expr &c, const std::string &variable)
 Find the largest interval such that the condition is definitely true inside of it, and might be true or false outside of it.
 
Expr and_condition_over_domain (const Expr &c, const Scope< Interval > &varying)
 Take a conditional that includes variables that vary over some domain, and convert it to a more conservative (less frequently true) condition that doesn't depend on those variables.
 
void solve_test ()
 
void spirv_ir_test ()
 Internal test for SPIR-V IR.
 
Stmt split_tuples (const Stmt &s, const std::map< std::string, Function > &env)
 Rewrite all tuple-valued Realizations, Provide nodes, and Call nodes into several scalar-valued ones, so that later lowering passes only need to think about scalar-valued productions.
 
Stmt stage_strided_loads (const Stmt &s)
 Convert all unpredicated strided loads in a Stmt into dense loads followed by shuffles.
 
void print_to_stmt_html (const std::string &html_output_filename, const Module &m, const std::string &assembly_input_filename="")
 Dump an HTML-formatted visualization of a Module to filename.
 
void print_to_conceptual_stmt_html (const std::string &html_output_filename, const Module &m, const std::string &assembly_input_filename="")
 Dump an HTML-formatted visualization of a Module's conceptual Stmt code to filename.
 
Stmt storage_flattening (Stmt s, const std::vector< Function > &outputs, const std::map< std::string, Function > &env, const Target &target)
 Take a statement with multi-dimensional Realize, Provide, and Call nodes, and turn it into a statement with single-dimensional Allocate, Store, and Load nodes respectively.
 
Stmt storage_folding (const Stmt &s, const std::map< std::string, Function > &env)
 Fold storage of functions if possible.
 
bool strictify_float (std::map< std::string, Function > &env, const Target &t)
 Propagate strict_float intrinsics such that they immediately wrap all floating-point expressions.
 
Stmt strip_asserts (const Stmt &s)
 
Expr substitute (const std::string &name, const Expr &replacement, const Expr &expr)
 Substitute variables with the given name with the replacement expression within expr.
 
Stmt substitute (const std::string &name, const Expr &replacement, const Stmt &stmt)
 Substitute variables with the given name with the replacement expression within stmt.
 
Expr substitute (const std::map< std::string, Expr > &replacements, const Expr &expr)
 Substitute variables with names in the map.
 
Stmt substitute (const std::map< std::string, Expr > &replacements, const Stmt &stmt)
 
Expr substitute (const Expr &find, const Expr &replacement, const Expr &expr)
 Substitute expressions for other expressions.
 
Stmt substitute (const Expr &find, const Expr &replacement, const Stmt &stmt)
 
Expr graph_substitute (const std::string &name, const Expr &replacement, const Expr &expr)
 Substitutions where the IR may be a general graph (and not just a DAG).
 
Stmt graph_substitute (const std::string &name, const Expr &replacement, const Stmt &stmt)
 
Expr graph_substitute (const Expr &find, const Expr &replacement, const Expr &expr)
 
Stmt graph_substitute (const Expr &find, const Expr &replacement, const Stmt &stmt)
 
Expr substitute_in_all_lets (const Expr &expr)
 Substitute in all let Exprs in a piece of IR.
 
Stmt substitute_in_all_lets (const Stmt &stmt)
 
void target_test ()
 
void lower_target_query_ops (std::map< std::string, Function > &env, const Target &t)
 
Stmt inject_tracing (Stmt, const std::string &pipeline_name, bool trace_pipeline, const std::map< std::string, Function > &env, const std::vector< Function > &outputs, const Target &Target)
 Take a statement representing a halide pipeline, inject calls to tracing functions at interesting points, such as allocations.
 
Stmt trim_no_ops (Stmt s)
 Truncate loop bounds to the region over which they actually do something.
 
Stmt unify_duplicate_lets (const Stmt &s)
 Find let statements that all define the same value, and make later ones just reuse the symbol names of the earlier ones.
 
Stmt uniquify_variable_names (const Stmt &s)
 Modify a statement so that every internally-defined variable name is unique.
 
void uniquify_variable_names_test ()
 
Stmt unpack_buffers (Stmt s)
 Creates let stmts for the various buffer components (e.g.
 
Stmt unroll_loops (const Stmt &)
 Take a statement with for loops marked for unrolling, and convert each into several copies of the innermost statement.
 
Stmt lower_unsafe_promises (const Stmt &s, const Target &t)
 Lower all unsafe promises into either assertions or unchecked code, depending on the target.
 
Stmt lower_safe_promises (const Stmt &s)
 Lower all safe promises by just stripping them.
 
template<typename DST , typename SRC , typename std::enable_if< std::is_floating_point< SRC >::value >::type * = nullptr>
DST safe_numeric_cast (SRC s)
 Some numeric conversions are UB if the value won't fit in the result; safe_numeric_cast<>() is meant as a drop-in replacement for a C/C++ cast that adds well-defined behavior for the UB cases, attempting to mimic common implementation behavior as much as possible.
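One way to give a float-to-integer conversion well-defined behavior is to saturate at the limits of the destination type. The sketch below is standalone and is not Halide's safe_numeric_cast: the name `saturating_cast_sketch`, the restriction to a `double` source, and the choice to map NaN to 0 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <type_traits>

// Saturating double -> integer conversion (a sketch; NaN maps to 0 by choice).
template<typename DST>
DST saturating_cast_sketch(double s) {
    static_assert(std::is_integral<DST>::value, "DST must be an integer type");
    const DST lo = std::numeric_limits<DST>::min();
    const DST hi = std::numeric_limits<DST>::max();
    if (std::isnan(s)) return 0;
    // Compare in double so the out-of-range cast is never evaluated.
    if (s <= static_cast<double>(lo)) return lo;
    if (s >= static_cast<double>(hi)) return hi;
    return static_cast<DST>(s);
}

inline void saturating_cast_examples() {
    assert(saturating_cast_sketch<int8_t>(300.0) == 127);
    assert(saturating_cast_sketch<int8_t>(-300.0) == -128);
    assert(saturating_cast_sketch<int32_t>(42.9) == 42);  // truncates toward zero
}
```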
 
template<typename DstType , typename SrcType >
DstType reinterpret_bits (const SrcType &src)
 An aggressive form of reinterpret cast used for correct type-punning.
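Type punning without undefined behavior is commonly done by copying bytes with `std::memcpy`. The following standalone sketch shows that technique; the name `reinterpret_bits_sketch` is hypothetical, and the example value assumes `float` is IEEE-754 binary32.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copy the object representation of src into a value of a same-sized type.
template<typename DstType, typename SrcType>
DstType reinterpret_bits_sketch(const SrcType &src) {
    static_assert(sizeof(DstType) == sizeof(SrcType), "Types must be the same size");
    static_assert(std::is_trivially_copyable<DstType>::value &&
                      std::is_trivially_copyable<SrcType>::value,
                  "memcpy requires trivially copyable types");
    DstType dst;
    std::memcpy(&dst, &src, sizeof(SrcType));
    return dst;
}

inline void reinterpret_bits_example() {
    // Assumes IEEE-754 binary32, where 1.0f has the bit pattern 0x3f800000.
    assert(reinterpret_bits_sketch<uint32_t>(1.0f) == 0x3f800000u);
}
```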
 
std::string get_env_variable (char const *env_var_name)
 Get value of an environment variable.
 
std::string running_program_name ()
 Get the name of the currently running executable.
 
std::string unique_name (char prefix)
 Generate a unique name starting with the given prefix.
 
std::string unique_name (const std::string &prefix)
 
bool starts_with (const std::string &str, const std::string &prefix)
 Test if the first string starts with the second string.
 
bool ends_with (const std::string &str, const std::string &suffix)
 Test if the first string ends with the second string.
 
std::string replace_all (const std::string &str, const std::string &find, const std::string &replace)
 Replace all matches of the second string in the first string with the last string.
 
std::vector< std::string > split_string (const std::string &source, const std::string &delim)
 Split the source string using 'delim' as the divider.
 
template<typename T >
std::string join_strings (const std::vector< T > &sources, const std::string &delim)
 Join the source vector using 'delim' as the divider.
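The split and join helpers are inverses of each other for a fixed delimiter. A minimal standalone sketch of that relationship follows; `split_sketch` and `join_sketch` are hypothetical names, the join is restricted to strings, and the handling of empty tokens is an assumption that may differ from Halide's.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split source at every occurrence of delim; empty tokens are kept.
inline std::vector<std::string> split_sketch(const std::string &source,
                                             const std::string &delim) {
    assert(!delim.empty());
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        std::size_t found = source.find(delim, start);
        if (found == std::string::npos) {
            parts.push_back(source.substr(start));
            break;
        }
        parts.push_back(source.substr(start, found - start));
        start = found + delim.size();
    }
    return parts;
}

// Concatenate the strings, placing delim between consecutive elements.
inline std::string join_sketch(const std::vector<std::string> &sources,
                               const std::string &delim) {
    std::string out;
    for (std::size_t i = 0; i < sources.size(); i++) {
        if (i > 0) out += delim;
        out += sources[i];
    }
    return out;
}
```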
 
template<typename T , typename Fn >
T fold_left (const std::vector< T > &vec, Fn f)
 Perform a left fold of a vector.
 
template<typename T , typename Fn >
T fold_right (const std::vector< T > &vec, Fn f)
 Returns a right fold of a vector.
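The difference between a left fold and a right fold only shows up with a non-associative operator. The standalone sketch below uses hypothetical `*_sketch` names and assumes a non-empty vector whose first (or last) element seeds the fold; it is not taken from Halide's source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Left fold: f(f(f(v0, v1), v2), ...).
template<typename T, typename Fn>
T fold_left_sketch(const std::vector<T> &vec, Fn f) {
    assert(!vec.empty());
    T result = vec[0];
    for (std::size_t i = 1; i < vec.size(); i++) {
        result = f(result, vec[i]);
    }
    return result;
}

// Right fold: f(v0, f(v1, f(v2, ...))).
template<typename T, typename Fn>
T fold_right_sketch(const std::vector<T> &vec, Fn f) {
    assert(!vec.empty());
    T result = vec.back();
    for (std::size_t i = vec.size() - 1; i > 0; i--) {
        result = f(vec[i - 1], result);
    }
    return result;
}

// Subtraction is not associative, so the two folds disagree.
inline int sub_ints_sketch(int a, int b) {
    return a - b;
}
```

With the vector {10, 3, 2} and subtraction, the left fold computes (10 - 3) - 2 and the right fold computes 10 - (3 - 2).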
 
std::string extract_namespaces (const std::string &name, std::vector< std::string > &namespaces)
 Returns base name and fills in namespaces, outermost one first in vector.
 
std::string strip_namespaces (const std::string &name)
 Like extract_namespaces(), but strip and discard the namespaces, returning base name only.
 
std::string file_make_temp (const std::string &prefix, const std::string &suffix)
 Create a unique file with a name of the form prefixXXXXXsuffix in an arbitrary (but writable) directory; this is typically /tmp, but the specific location is not guaranteed.
 
std::string dir_make_temp ()
 Create a unique directory in an arbitrary (but writable) directory; this is typically somewhere inside /tmp, but the specific location is not guaranteed.
 
bool file_exists (const std::string &name)
 Wrapper for access().
 
void assert_file_exists (const std::string &name)
 assert-fail if the file doesn't exist.
 
void assert_no_file_exists (const std::string &name)
 assert-fail if the file DOES exist.
 
void file_unlink (const std::string &name)
 Wrapper for unlink().
 
void ensure_no_file_exists (const std::string &name)
 Ensure that no file with this path exists.
 
void dir_rmdir (const std::string &name)
 Wrapper for rmdir().
 
FileStat file_stat (const std::string &name)
 Wrapper for stat().
 
std::vector< char > read_entire_file (const std::string &pathname)
 Read the entire contents of a file into a vector<char>.
 
void write_entire_file (const std::string &pathname, const void *source, size_t source_len)
 Create or replace the contents of a file with a given pointer-and-length of memory.
 
void write_entire_file (const std::string &pathname, const std::vector< char > &source)
 
bool add_would_overflow (int bits, int64_t a, int64_t b)
 Routines to test if math would overflow for signed integers with the given number of bits.
 
bool sub_would_overflow (int bits, int64_t a, int64_t b)
 
bool mul_would_overflow (int bits, int64_t a, int64_t b)
 
HALIDE_MUST_USE_RESULT bool add_with_overflow (int bits, int64_t a, int64_t b, int64_t *result)
 Routines to perform arithmetic on signed types without triggering signed overflow.
 
HALIDE_MUST_USE_RESULT bool sub_with_overflow (int bits, int64_t a, int64_t b, int64_t *result)
 
HALIDE_MUST_USE_RESULT bool mul_with_overflow (int bits, int64_t a, int64_t b, int64_t *result)
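Overflow tests of this kind can be written so that the check itself never overflows, by comparing against the limits of a signed type of the given width. The sketch below is standalone and uses hypothetical `*_sketch` names; it assumes `a` and `b` are already representable in `bits` bits and is not Halide's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Largest value of a signed two's-complement integer with the given width.
inline int64_t signed_max_sketch(int bits) {
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? INT64_MAX : ((int64_t{1} << (bits - 1)) - 1);
}

// True if a + b does not fit in a signed integer with `bits` bits.
inline bool add_would_overflow_sketch(int bits, int64_t a, int64_t b) {
    const int64_t max_val = signed_max_sketch(bits);
    const int64_t min_val = -max_val - 1;
    if (b > 0) return a > max_val - b;  // max_val - b cannot overflow when b > 0
    return a < min_val - b;             // min_val - b cannot overflow when b <= 0
}

// True if a - b does not fit in a signed integer with `bits` bits.
inline bool sub_would_overflow_sketch(int bits, int64_t a, int64_t b) {
    const int64_t max_val = signed_max_sketch(bits);
    const int64_t min_val = -max_val - 1;
    if (b < 0) return a > max_val + b;  // max_val + b cannot overflow when b < 0
    return a < min_val + b;             // min_val + b cannot overflow when b >= 0
}
```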
 
void halide_tic_impl (const char *file, int line)
 
void halide_toc_impl (const char *file, int line)
 
template<typename T >
auto begin (reverse_adaptor< T > i)
 
template<typename T >
auto end (reverse_adaptor< T > i)
 
template<typename T >
reverse_adaptor< T > reverse_view (T &&range)
 Reverse-order adaptor for range-based for-loops.
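A reverse adaptor for range-based for-loops can be built from a thin wrapper plus free `begin`/`end` functions that the loop finds by argument-dependent lookup. The sketch below is standalone with hypothetical `*_sketch` names; unlike the declaration above, it only accepts lvalue ranges, so it should be read as an illustration of the technique and not as Halide's code.

```cpp
#include <cassert>
#include <vector>

// Holds a reference to the underlying range; it does not own it.
template<typename T>
struct reverse_adaptor_sketch {
    T &range;
};

// The range-based for-loop finds these via argument-dependent lookup.
template<typename T>
auto begin(reverse_adaptor_sketch<T> r) -> decltype(r.range.rbegin()) {
    return r.range.rbegin();
}

template<typename T>
auto end(reverse_adaptor_sketch<T> r) -> decltype(r.range.rend()) {
    return r.range.rend();
}

template<typename T>
reverse_adaptor_sketch<T> reverse_view_sketch(T &range) {
    return {range};
}

// Usage: iterate a vector back to front.
inline std::vector<int> reversed_copy_sketch(const std::vector<int> &v) {
    std::vector<int> out;
    for (int x : reverse_view_sketch(v)) {
        out.push_back(x);
    }
    return out;
}

inline void reverse_view_example() {
    assert((reversed_copy_sketch({1, 2, 3}) == std::vector<int>{3, 2, 1}));
}
```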
 
std::string c_print_name (const std::string &name, bool prefix_underscore=true)
 Emit a version of a string that is a valid identifier in C (.
 
int get_llvm_version ()
 Return the LLVM_VERSION against which this libHalide is compiled.
 
void run_with_large_stack (const std::function< void()> &action)
 Call the given action in a platform-specific context that provides at least the stack space returned by get_compiler_stack_size.
 
int popcount64 (uint64_t x)
 Portable versions of popcount, count-leading-zeros, and count-trailing-zeros.
 
int clz64 (uint64_t x)
 
int ctz64 (uint64_t x)
 
int64_t next_power_of_two (int64_t x)
 Return an integer 2^n, for some n, which is >= x.
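Portable versions of these bit utilities can be written with simple loops when compiler intrinsics are unavailable. The standalone sketch below uses hypothetical `*_sketch` names and picks the smallest power of two that is >= x; the preconditions (nonzero input for the zero counts, x no larger than 2^62) are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Number of set bits, clearing the lowest set bit on each iteration.
inline int popcount64_sketch(uint64_t x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

// Number of zero bits above the highest set bit. Requires x != 0.
inline int clz64_sketch(uint64_t x) {
    assert(x != 0);
    int count = 0;
    while ((x >> 63) == 0) {
        x <<= 1;
        count++;
    }
    return count;
}

// Number of zero bits below the lowest set bit. Requires x != 0.
inline int ctz64_sketch(uint64_t x) {
    assert(x != 0);
    int count = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        count++;
    }
    return count;
}

// Smallest power of two that is >= x (returns 1 for x <= 1).
inline int64_t next_power_of_two_sketch(int64_t x) {
    assert(x <= (int64_t{1} << 62));
    int64_t p = 1;
    while (p < x) {
        p <<= 1;
    }
    return p;
}
```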
 
template<typename T >
T align_up (T x, int n)
 
std::vector< Var > make_argument_list (int dimensionality)
 Make a list of unique arguments for definitions with unnamed arguments.
 
Stmt vectorize_loops (const Stmt &s, const std::map< std::string, Function > &env)
 Take a statement with for loops marked for vectorization, and turn them into single statements that operate on vectors.
 
std::map< std::string, Function > wrap_func_calls (const std::map< std::string, Function > &env)
 Replace every call to wrapped Functions in the Functions' definitions with call to their wrapper functions.
 
std::string get_test_tmp_dir ()
 Return the path to a directory that can be safely written to when running tests; the directory's contents may or may not outlast the lifetime of the test itself (i.e., the files may be cleaned up after test execution).
 
Expr lower_int_uint_div (const Expr &a, const Expr &b, bool round_to_zero=false)
 Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.
 
Expr lower_int_uint_mod (const Expr &a, const Expr &b)
 Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.
 
Expr lower_euclidean_div (Expr a, Expr b)
 Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.
 
Expr lower_euclidean_mod (Expr a, Expr b)
 Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.
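Euclidean division can be expressed in terms of round-to-zero division by correcting the quotient whenever the truncated remainder is negative. The sketch below works on plain `int64_t` values, using C++'s `/` and `%` (which round toward zero) in place of div_round_to_zero and mod_round_to_zero. The `*_sketch` names are hypothetical, the code requires `b != 0`, and it ignores the `INT64_MIN` edge cases.

```cpp
#include <cassert>
#include <cstdint>

// Euclidean remainder: the unique r with a == q*b + r and 0 <= r < |b|.
inline int64_t euclidean_mod_sketch(int64_t a, int64_t b) {
    assert(b != 0);
    int64_t r = a % b;  // C++ remainder takes the sign of a
    if (r < 0) {
        r += (b < 0 ? -b : b);
    }
    return r;
}

// Euclidean quotient matching euclidean_mod_sketch.
inline int64_t euclidean_div_sketch(int64_t a, int64_t b) {
    assert(b != 0);
    int64_t q = a / b;  // C++ division rounds toward zero
    if (a % b < 0) {
        q += (b > 0 ? -1 : 1);  // step the quotient away from zero
    }
    return q;
}
```

For a = -7 and b = 2, truncating division gives quotient -3 and remainder -1, while the Euclidean result is quotient -4 and remainder 1.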
 
Expr lower_signed_shift_left (const Expr &a, const Expr &b)
 Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.
 
Expr lower_signed_shift_right (const Expr &a, const Expr &b)
 Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.
 
Expr lower_extract_bits (const Call *c)
 Reduce bit extraction and concatenation to bit ops.
 
Expr lower_concat_bits (const Call *c)
 Reduce bit extraction and concatenation to bit ops.
 
Stmt eliminate_bool_vectors (const Stmt &s)
 Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.
 
Expr eliminate_bool_vectors (const Expr &s)
 Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.
 
std::string lldb_string (const Expr &)
 Debugging helpers for LLDB.
 
std::string lldb_string (const Internal::BaseExprNode *)
 Debugging helpers for LLDB.
 
std::string lldb_string (const Stmt &)
 Debugging helpers for LLDB.
 
HALIDE_MUST_USE_RESULT bool reduce_expr_modulo (const Expr &e, int64_t modulus, int64_t *remainder)
 Reduce an expression modulo some integer.
 
HALIDE_MUST_USE_RESULT bool reduce_expr_modulo (const Expr &e, int64_t modulus, int64_t *remainder, const Scope< ModulusRemainder > &scope)
 Reduce an expression modulo some integer.
 

Variables

const int64_t unknown = std::numeric_limits<int64_t>::min()
 
constexpr IRNodeType StrongestExprNodeType = IRNodeType::VectorReduce
 
std::atomic< int > random_variable_counter
 

Typedef Documentation

◆ AbstractGeneratorPtr

Definition at line 244 of file AbstractGenerator.h.

◆ DimBounds

typedef std::map<std::string, Interval> Halide::Internal::DimBounds

Definition at line 20 of file AutoScheduleUtils.h.

◆ FuncValueBounds

typedef std::map<std::pair<std::string, int>, Interval> Halide::Internal::FuncValueBounds

Definition at line 17 of file Bounds.h.

◆ add_const_if_T_is_const

template<typename T , typename T2 >
using Halide::Internal::add_const_if_T_is_const = typename std::conditional<std::is_const<T>::value, const T2, T2>::type

Definition at line 83 of file Buffer.h.

◆ GeneratorParamImplBase

template<typename T >
using Halide::Internal::GeneratorParamImplBase
Initial value:
typename select_type<
cond<std::is_same<T, Target>::value, GeneratorParam_Target<T>>,
cond<std::is_same<T, LoopLevel>::value, GeneratorParam_LoopLevel>,
cond<std::is_same<T, std::string>::value, GeneratorParam_String<T>>,
cond<std::is_same<T, Type>::value, GeneratorParam_Type<T>>,
cond<std::is_same<T, bool>::value, GeneratorParam_Bool<T>>,
cond<std::is_arithmetic<T>::value, GeneratorParam_Arithmetic<T>>,
cond<std::is_enum<T>::value, GeneratorParam_Enum<T>>>::type

Definition at line 941 of file Generator.h.

◆ GeneratorInputImplBase

template<typename T , typename TBase = typename std::remove_all_extents<T>::type>
using Halide::Internal::GeneratorInputImplBase
Initial value:
typename select_type<
cond<has_static_halide_type_method<TBase>::value, GeneratorInput_Buffer<T>>,
cond<std::is_same<TBase, Func>::value, GeneratorInput_Func<T>>,
cond<std::is_arithmetic<TBase>::value, GeneratorInput_Arithmetic<T>>,
cond<std::is_scalar<TBase>::value, GeneratorInput_Scalar<T>>,
cond<std::is_same<TBase, Expr>::value, GeneratorInput_DynamicScalar<T>>>::type

Definition at line 2175 of file Generator.h.

◆ GeneratorOutputImplBase

template<typename T , typename TBase = typename std::remove_all_extents<T>::type>
using Halide::Internal::GeneratorOutputImplBase
Initial value:
typename select_type<
cond<has_static_halide_type_method<TBase>::value, GeneratorOutput_Buffer<T>>,
cond<std::is_same<TBase, Func>::value, GeneratorOutput_Func<T>>,
cond<std::is_arithmetic<TBase>::value, GeneratorOutput_Arithmetic<T>>>::type

Definition at line 2786 of file Generator.h.

◆ GeneratorFactory

Definition at line 3115 of file Generator.h.

◆ LLVMOStream

typedef llvm::raw_pwrite_stream Halide::Internal::LLVMOStream

Definition at line 27 of file LLVM_Output.h.

Enumeration Type Documentation

◆ ArgInfoKind

enum class Halide::Internal::ArgInfoKind
strong
Enumerator
Scalar 
Function 
Buffer 

Definition at line 26 of file AbstractGenerator.h.

◆ ArgInfoDirection

Enumerator
Input 
Output 

Definition at line 30 of file AbstractGenerator.h.

◆ Direction

enum class Halide::Internal::Direction
strong

Given a varying expression, try to find a constant that is either an upper bound (always greater than or equal to the expression) or a lower bound (always less than or equal to the expression). If it fails, returns an undefined Expr.

Enumerator
Upper 
Lower 

Definition at line 42 of file Bounds.h.

◆ IRNodeType

enum class Halide::Internal::IRNodeType
strong

All our IR node types get unique IDs for the purposes of RTTI.

Enumerator
IntImm 
UIntImm 
FloatImm 
StringImm 
Broadcast 
Cast 
Reinterpret 
Variable 
Add 
Sub 
Mod 
Mul 
Div 
Min 
Max 
EQ 
NE 
LT 
LE 
GT 
GE 
And 
Or 
Not 
Select 
Load 
Ramp 
Call 
Let 
Shuffle 
VectorReduce 
LetStmt 
AssertStmt 
ProducerConsumer 
For 
Acquire 
Store 
Provide 
Allocate 
Free 
Realize 
Block 
Fork 
IfThenElse 
Evaluate 
Prefetch 
Atomic 
HoistedStorage 

Definition at line 25 of file Expr.h.

◆ ForType

enum class Halide::Internal::ForType
strong

An enum describing a type of loop traversal.

Used in schedules, and in the For loop IR node. Serial is a conventional ordered for loop. Iterations occur in increasing order, and each iteration must appear to have finished before the next begins. Parallel, GPUBlock, and GPUThread are parallel and unordered: iterations may occur in any order, and multiple iterations may occur simultaneously. Vectorized and GPULane are parallel and synchronous: they act as if all iterations occur at the same time in lockstep.

Enumerator
Serial 
Parallel 
Vectorized 
Unrolled 
Extern 
GPUBlock 
GPUThread 
GPULane 

Definition at line 406 of file Expr.h.

◆ SyntheticParamType

Enumerator
Type 
Dim 
ArraySize 

Definition at line 2892 of file Generator.h.

◆ Monotonic

enum class Halide::Internal::Monotonic
strong

Detect whether an expression is monotonic increasing in a variable, decreasing, or unknown.

Enumerator
Constant 
Increasing 
Decreasing 
Unknown 

Definition at line 26 of file Monotonic.h.
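The four classes compose in a simple way under negation and addition. The standalone sketch below shows one conservative set of rules; `MonoSketch` and the `mono_*_sketch` functions are hypothetical names and are not Halide's is_monotonic analysis.

```cpp
#include <cassert>

enum class MonoSketch { Constant, Increasing, Decreasing, Unknown };

// Monotonicity of -e, given the monotonicity of e.
inline MonoSketch mono_negate_sketch(MonoSketch m) {
    if (m == MonoSketch::Increasing) return MonoSketch::Decreasing;
    if (m == MonoSketch::Decreasing) return MonoSketch::Increasing;
    return m;  // Constant and Unknown are unchanged
}

// Monotonicity of a + b, given the monotonicity of each operand.
inline MonoSketch mono_add_sketch(MonoSketch a, MonoSketch b) {
    if (a == MonoSketch::Constant) return b;
    if (b == MonoSketch::Constant) return a;
    if (a == b) return a;        // both increasing, both decreasing, or both unknown
    return MonoSketch::Unknown;  // opposite trends: nothing can be concluded
}

// Monotonicity of a - b, expressed as a + (-b).
inline MonoSketch mono_sub_sketch(MonoSketch a, MonoSketch b) {
    return mono_add_sketch(a, mono_negate_sketch(b));
}

inline void mono_example() {
    // x - x is really constant, but these local rules only report Unknown.
    assert(mono_sub_sketch(MonoSketch::Increasing, MonoSketch::Increasing) ==
           MonoSketch::Unknown);
}
```

The last example shows why rules of this kind are conservative for correlated operands.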

◆ DimType

enum class Halide::Internal::DimType
strong

Each Dim below has a dim_type, which tells you what transformations are legal on it.

When you combine two Dims of distinct DimTypes (e.g. with Stage::fuse), the combined result has the greater enum value of the two types.

Enumerator
PureVar 

This dim originated from a Var.

You can evaluate a Func at distinct values of this Var in any order over an interval that's at least as large as the interval required. In pure definitions you can even redundantly re-evaluate points.

PureRVar 

The dim originated from an RVar.

You can evaluate a Func at distinct values of this RVar in any order (including in parallel) over exactly the interval specified in the RDom. PureRVars can also be reordered arbitrarily in the dims list, as there are no data hazards between the evaluation of the Func at distinct values of the RVar.

The most common case where an RVar is considered pure is RVars that are used in a way which obeys all the syntactic constraints that a Var does, e.g:

RDom r(0, 100);
f(r.x) = f(r.x) + 5;

Other cases where RVars are pure are where the sites being written to by the Func evaluated at one value of the RVar couldn't possibly collide with the sites being written or read by the Func at a distinct value of the RVar. For example, r.x is pure in the following three definitions:

// This definition writes to even coordinates and reads from the
// same site (which no other value of r.x is writing to) and odd
// sites (which no other value of r.x is writing to):
f(2*r.x) = max(f(2*r.x), f(2*r.x + 7));
// This definition writes to scanline zero and reads from the
// same site and scanline one:
f(r.x, 0) += f(r.x, 1);
// This definition reads and writes over non-overlapping ranges:
f(r.x + 100) += f(r.x);

To give two counterexamples, r.x is not pure in the following definitions:

// The same site is written by distinct values of the RVar
// (write-after-write hazard):
f(r.x / 2) += f(r.x);
// One value of r.x reads from a site that another value of r.x
// is writing to (read-after-write hazard):
f(r.x) += f(r.x + 1);
ImpureRVar 

The dim originated from an RVar.

You must evaluate a Func at distinct values of this RVar in increasing order over precisely the interval specified in the RDom. ImpureRVars may not be reordered with respect to other ImpureRVars.

All RVars are impure by default. Those for which we can prove no data hazards exist get promoted to PureRVar. There are two instances in which ImpureRVars may be parallelized or reordered even in the presence of hazards:

1) In the case of an update definition that has been proven to be an associative and commutative reduction, reordering of ImpureRVars is allowed, and parallelizing them is allowed if the update has been made atomic.

2) ImpureRVars can also be reordered and parallelized if Func::allow_race_conditions() has been set. This is the escape hatch for when there are no hazards but the checks above failed to prove that (RDom::where can encode arbitrary facts about non-linear integer arithmetic, which is undecidable), or for when you don't actually care about the non-determinism introduced by data hazards (e.g. in the algorithm HOGWILD!).

Definition at line 357 of file Schedule.h.
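The hazards described for DimType can be reproduced with plain arrays. The standalone sketch below applies an update of the form f(x) += f(x + offset) in increasing or decreasing order; `run_update_sketch` and its parameters are hypothetical and use no Halide types. With overlapping reads and writes the visiting order changes the result, and with disjoint ranges it does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Apply f(x) += f(x + offset) for x in [0, n), visiting x in the given order.
inline std::vector<int> run_update_sketch(std::vector<int> f, int n, int offset,
                                          bool increasing) {
    assert(n >= 0 && offset >= 0);
    assert(static_cast<std::size_t>(n + offset) <= f.size());
    for (int i = 0; i < n; i++) {
        int x = increasing ? i : n - 1 - i;
        f[x] += f[x + offset];
    }
    return f;
}

inline void hazard_example() {
    // Like f(r.x) += f(r.x + 1): reads overlap writes, so order matters.
    assert(run_update_sketch({1, 2, 3, 4}, 3, 1, true) !=
           run_update_sketch({1, 2, 3, 4}, 3, 1, false));
    // Like f(r.x) += f(r.x + 2) over two sites: the ranges are disjoint.
    assert(run_update_sketch({1, 2, 3, 4}, 2, 2, true) ==
           run_update_sketch({1, 2, 3, 4}, 2, 2, false));
}
```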

Function Documentation

◆ add_atomic_mutex()

Stmt Halide::Internal::add_atomic_mutex ( Stmt s,
const std::vector< Function > & outputs )

◆ add_image_checks()

Stmt Halide::Internal::add_image_checks ( const Stmt & s,
const std::vector< Function > & outputs,
const Target & t,
const std::vector< std::string > & order,
const std::map< std::string, Function > & env,
const FuncValueBounds & fb,
bool will_inject_host_copies )

Insert checks to make sure a statement doesn't read out of bounds on inputs or outputs, and that the inputs and outputs conform to the format required (e.g. stride.0 must be 1).

◆ add_parameter_checks()

Stmt Halide::Internal::add_parameter_checks ( const std::vector< Stmt > & requirements,
Stmt s,
const Target & t )

Insert checks to make sure that all referenced parameters meet their constraints.

Also injects any custom requirements provided by the user.

◆ add_split_factor_checks()

Stmt Halide::Internal::add_split_factor_checks ( const Stmt & s,
const std::map< std::string, Function > & env )

Insert checks that all split factors that depend on scalar parameters are strictly positive.

◆ align_loads()

Stmt Halide::Internal::align_loads ( const Stmt & s,
int alignment,
int min_bytes_to_align )

Attempt to rewrite unaligned loads from buffers which are known to be aligned to instead load aligned vectors that cover the original load, and then slice the original load out of the aligned vectors.

Types that are less than min_bytes_to_align in size are not rewritten. This is intended to make a distinction between data that will be accessed as a scalar and that which will be accessed as a vector.
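The idea of covering an unaligned load with aligned loads and then slicing can be sketched on a plain array. This is a scalar model only, not the IR rewrite itself; the function name is hypothetical, and the alignment is counted in elements here rather than bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Load `lanes` elements starting at `start` by reading whole aligned chunks
// that cover the requested range, then slicing the request out of them.
// Assumes buf.size() is padded so every covering chunk is in bounds.
inline std::vector<int> load_via_aligned_vectors(const std::vector<int> &buf,
                                                 int start, int lanes, int alignment) {
    assert(start >= 0 && lanes > 0 && alignment > 0);
    const int first = (start / alignment) * alignment;  // round start down
    const int end = start + lanes;
    std::vector<int> cover;
    for (int base = first; base < end; base += alignment) {
        assert(static_cast<std::size_t>(base + alignment) <= buf.size());
        cover.insert(cover.end(), buf.begin() + base, buf.begin() + base + alignment);
    }
    const int offset = start - first;  // position of the request in the cover
    return std::vector<int>(cover.begin() + offset, cover.begin() + offset + lanes);
}
```

With an alignment of 4, a 4-lane load at offset 3 reads the chunks at 0 and 4 and keeps elements 3 through 6, while a load at offset 4 needs only one chunk.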

◆ allocation_bounds_inference()

Stmt Halide::Internal::allocation_bounds_inference ( Stmt s,
const std::map< std::string, Function > & env,
const std::map< std::pair< std::string, int >, Interval > & func_bounds )

Take a partially lowered statement with Realize nodes in terms of variables, and define values for those variables.

◆ apply_split()

std::vector< ApplySplitResult > Halide::Internal::apply_split ( const Split & split,
const std::string & prefix,
std::map< std::string, Expr > & dim_extent_alignment )

Given a Split schedule on a definition (init or update), return a list of predicates on the definition, substitutions that need to be applied to the definition (in ascending order of application), and let stmts that define the values of variables referred to by the predicates and substitutions (ordered from innermost to outermost let).
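To make the roles of the substitution and the predicate concrete, here is a minimal standalone sketch of splitting one loop by a factor and guarding the tail with an if. It models only one possible tail strategy, and the function name is hypothetical rather than part of Halide.

```cpp
#include <cassert>
#include <vector>

// Split a loop over x in [0, extent) into an outer and an inner loop.
// Substitution: x = xo * factor + xi.
// Predicate:    x < extent (guards the last, possibly partial, tile).
inline std::vector<int> visit_split(int extent, int factor) {
    assert(extent >= 0 && factor > 0);
    std::vector<int> visited;
    const int outer_extent = (extent + factor - 1) / factor;  // round up
    for (int xo = 0; xo < outer_extent; xo++) {
        for (int xi = 0; xi < factor; xi++) {
            const int x = xo * factor + xi;  // the substitution
            if (x < extent) {                // the predicate
                visited.push_back(x);
            }
        }
    }
    return visited;
}
```

Each original value of x is visited exactly once, even when the factor does not divide the extent.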

◆ compute_loop_bounds_after_split()

std::vector< std::pair< std::string, Expr > > Halide::Internal::compute_loop_bounds_after_split ( const Split & split,
const std::string & prefix )

Compute the loop bounds of the new dimensions resulting from applying the split schedules using the loop bounds of the old dimensions.

◆ get_ops_table()

const std::vector< AssociativePattern > & Halide::Internal::get_ops_table ( const std::vector< Expr > & exprs)

◆ prove_associativity()

AssociativeOp Halide::Internal::prove_associativity ( const std::string & f,
std::vector< Expr > args,
std::vector< Expr > exprs )

Given an update definition of a Func 'f', determine its equivalent associative binary/unary operator if there is any.

'is_associative' indicates if the operation was successfully proven as associative.

◆ associativity_test()

void Halide::Internal::associativity_test ( )

◆ fork_async_producers()

Stmt Halide::Internal::fork_async_producers ( Stmt s,
const std::map< std::string, Function > & env )

◆ string_to_int()

int Halide::Internal::string_to_int ( const std::string & s)

Return an int representation of 's'.

Throw an error on failure.

◆ substitute_var_estimates() [1/2]

Expr Halide::Internal::substitute_var_estimates ( Expr e)

Substitute every variable in an Expr or a Stmt with its estimate if specified.

◆ substitute_var_estimates() [2/2]

Stmt Halide::Internal::substitute_var_estimates ( Stmt s)

◆ get_extent()

Expr Halide::Internal::get_extent ( const Interval & i)

Return the size of an interval.

Return an undefined expr if the interval is unbounded.

◆ box_size()

Expr Halide::Internal::box_size ( const Box & b)

Return the size of an n-d box.

◆ disp_regions()

void Halide::Internal::disp_regions ( const std::map< std::string, Box > & regions)

Helper function to print the bounds of a region.

◆ get_stage_definition()

Definition Halide::Internal::get_stage_definition ( const Function & f,
int stage_num )

Return the corresponding definition of a function given the stage.

This will throw an assertion if the function is an extern function (an extern Func does not have a definition).

◆ get_stage_dims()

std::vector< Dim > & Halide::Internal::get_stage_dims ( const Function & f,
int stage_num )

Return the corresponding loop dimensions of a function given the stage.

For an extern Func, this will return a list of size 1 containing the dummy __outermost loop dimension.

◆ combine_load_costs()

void Halide::Internal::combine_load_costs ( std::map< std::string, Expr > & result,
const std::map< std::string, Expr > & partial )

Add partial load costs to the corresponding function in the result costs.

◆ get_stage_bounds() [1/2]

DimBounds Halide::Internal::get_stage_bounds ( const Function & f,
int stage_num,
const DimBounds & pure_bounds )

Return the required bounds of an intermediate stage (f, stage_num) of function 'f' given the bounds of the pure dimensions.

◆ get_stage_bounds() [2/2]

std::vector< DimBounds > Halide::Internal::get_stage_bounds ( const Function & f,
const DimBounds & pure_bounds )

Return the required bounds for all the stages of the function 'f'.

Each entry in the returned vector corresponds to a stage.

◆ perform_inline()

Expr Halide::Internal::perform_inline ( Expr e,
const std::map< std::string, Function > & env,
const std::set< std::string > & inlines = std::set< std::string >(),
const std::vector< std::string > & order = std::vector< std::string >() )

Recursively inline all the functions in the set 'inlines' into the expression 'e' and return the resulting expression.

If 'order' is passed, inlining will be done in the reverse order of function realization to avoid extra inlining work.

◆ get_parents()

std::set< std::string > Halide::Internal::get_parents ( Function f,
int stage )

Return all functions that are directly called by a function stage (f, stage).

◆ get_element() [1/2]

template<typename K , typename V >
V Halide::Internal::get_element ( const std::map< K, V > & m,
const K & key )

Return value of element within a map.

This will assert if the element is not in the map.

Definition at line 101 of file AutoScheduleUtils.h.

References internal_assert.

◆ get_element() [2/2]

template<typename K , typename V >
V & Halide::Internal::get_element ( std::map< K, V > & m,
const K & key )

Definition at line 108 of file AutoScheduleUtils.h.

References internal_assert.
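A minimal sketch of what these two overloads might look like, with the standard assert macro standing in for internal_assert. The bodies are an assumption based on the documented behavior, not a copy of AutoScheduleUtils.h.

```cpp
#include <cassert>
#include <map>
#include <string>

// Read-only lookup: returns a copy of the mapped value and asserts that
// the key is present.
template<typename K, typename V>
V get_element(const std::map<K, V> &m, const K &key) {
    const auto iter = m.find(key);
    assert(iter != m.end());
    return iter->second;
}

// Mutable lookup: returns a reference so the caller can update the value.
template<typename K, typename V>
V &get_element(std::map<K, V> &m, const K &key) {
    const auto iter = m.find(key);
    assert(iter != m.end());
    return iter->second;
}
```

Unlike std::map::operator[], neither overload inserts a default value when the key is missing.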

◆ inline_all_trivial_functions()

bool Halide::Internal::inline_all_trivial_functions ( const std::vector< Function > & outputs,
const std::vector< std::string > & order,
const std::map< std::string, Function > & env )

If the cost of computing a Func is about the same as calling the Func, inline the Func.

Return true if any of the Funcs is inlined.

◆ is_func_called_element_wise()

std::string Halide::Internal::is_func_called_element_wise ( const std::vector< std::string > & order,
size_t index,
const std::map< std::string, Function > & env )

Determine if a Func (order[index]) is only consumed by another single Func in an element-wise manner.

If it is, return the name of the consumer Func; otherwise, return an empty string.

◆ inline_all_element_wise_functions()

bool Halide::Internal::inline_all_element_wise_functions ( const std::vector< Function > & outputs,
const std::vector< std::string > & order,
const std::map< std::string, Function > & env )

Inline a Func if its values are only consumed by another single Func in an element-wise manner.

◆ propagate_estimate_test()

void Halide::Internal::propagate_estimate_test ( )

◆ bound_constant_extent_loops()

Stmt Halide::Internal::bound_constant_extent_loops ( const Stmt & s)

Replace all loop extents of unrolled or vectorized loops with constants, by substituting and simplifying as needed.

If we can't determine a constant extent, but can determine a constant upper bound, inject an if statement into the body. If we can't even determine a constant upper bound, throw a user error.

◆ empty_func_value_bounds()

const FuncValueBounds & Halide::Internal::empty_func_value_bounds ( )

◆ bounds_of_expr_in_scope()

Interval Halide::Internal::bounds_of_expr_in_scope ( const Expr & expr,
const Scope< Interval > & scope,
const FuncValueBounds & func_bounds = empty_func_value_bounds(),
bool const_bound = false )

Given an expression in some variables, and a map from those variables to their bounds (in the form of (minimum possible value, maximum possible value)), compute two expressions that give the minimum possible value and the maximum possible value of this expression.

Max or min may be undefined expressions if the value is not bounded above or below. If the expression is a vector, also takes the bounds across the vector lanes and returns a scalar result.

This is for tasks such as deducing the region of a buffer loaded by a chunk of code.
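The computation can be pictured as interval arithmetic over an expression tree. The sketch below uses concrete integer bounds and a tiny hypothetical expression type; the real function works on symbolic Exprs and handles unbounded and vector cases that are omitted here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

struct Interval {
    int64_t min, max;  // inclusive bounds
};

// A tiny expression tree: constants, variables, addition, multiplication.
struct Node {
    enum Kind { Const, Var, Add, Mul } kind;
    int64_t value = 0;           // used by Const
    std::string name;            // used by Var
    std::shared_ptr<Node> a, b;  // used by Add and Mul
};
using NodePtr = std::shared_ptr<Node>;

inline NodePtr make_const(int64_t v) {
    auto n = std::make_shared<Node>();
    n->kind = Node::Const;
    n->value = v;
    return n;
}
inline NodePtr make_var(const std::string &name) {
    auto n = std::make_shared<Node>();
    n->kind = Node::Var;
    n->name = name;
    return n;
}
inline NodePtr make_binary(Node::Kind k, NodePtr a, NodePtr b) {
    auto n = std::make_shared<Node>();
    n->kind = k;
    n->a = a;
    n->b = b;
    return n;
}

// Bounds of e, given the bounds of every variable in scope.
inline Interval bounds(const NodePtr &e, const std::map<std::string, Interval> &scope) {
    switch (e->kind) {
    case Node::Const:
        return {e->value, e->value};
    case Node::Var:
        return scope.at(e->name);
    case Node::Add: {
        const Interval x = bounds(e->a, scope), y = bounds(e->b, scope);
        return {x.min + y.min, x.max + y.max};
    }
    case Node::Mul: {
        const Interval x = bounds(e->a, scope), y = bounds(e->b, scope);
        // The extremes of a product lie at the corners of the two intervals.
        return {std::min({x.min * y.min, x.min * y.max, x.max * y.min, x.max * y.max}),
                std::max({x.min * y.min, x.min * y.max, x.max * y.min, x.max * y.max})};
    }
    }
    return {0, 0};  // not reached
}
```

For x in [0, 10] and y in [-2, 3], the expression x * y + 5 is bounded by [-15, 35].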

◆ find_constant_bound()

Expr Halide::Internal::find_constant_bound ( const Expr & e,
Direction d,
const Scope< Interval > & scope = Scope< Interval >::empty_scope() )

◆ find_constant_bounds()

Interval Halide::Internal::find_constant_bounds ( const Expr & e,
const Scope< Interval > & scope )

Find bounds for a varying expression that are either constants or +/-inf.

◆ merge_boxes()

void Halide::Internal::merge_boxes ( Box & a,
const Box & b )

Expand box a to encompass box b.

◆ boxes_overlap()

bool Halide::Internal::boxes_overlap ( const Box & a,
const Box & b )

Test if box a could possibly overlap box b.

◆ box_union()

Box Halide::Internal::box_union ( const Box & a,
const Box & b )

The union of two boxes.

◆ box_intersection()

Box Halide::Internal::box_intersection ( const Box & a,
const Box & b )

The intersection of two boxes.

◆ box_contains()

bool Halide::Internal::box_contains ( const Box & a,
const Box & b )

Test if box a provably contains box b.
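The box helpers above can be modeled with concrete integer bounds. In this sketch the answers are exact, whereas the real functions work on symbolic bounds, which is why they speak of boxes that "could possibly" overlap or are "provably" contained. The types are hypothetical stand-ins that reuse the documented names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Interval {
    int min, max;  // inclusive bounds
};
using Box = std::vector<Interval>;  // one interval per dimension

// Smallest box that covers both a and b.
inline Box box_union(const Box &a, const Box &b) {
    assert(a.size() == b.size());
    Box r;
    for (std::size_t i = 0; i < a.size(); i++) {
        r.push_back({std::min(a[i].min, b[i].min), std::max(a[i].max, b[i].max)});
    }
    return r;
}

// Region common to a and b; a dimension with min > max means it is empty.
inline Box box_intersection(const Box &a, const Box &b) {
    assert(a.size() == b.size());
    Box r;
    for (std::size_t i = 0; i < a.size(); i++) {
        r.push_back({std::max(a[i].min, b[i].min), std::min(a[i].max, b[i].max)});
    }
    return r;
}

// True if the boxes share at least one point.
inline bool boxes_overlap(const Box &a, const Box &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (a[i].max < b[i].min || b[i].max < a[i].min) return false;
    }
    return true;
}

// True if every point of b lies inside a.
inline bool box_contains(const Box &a, const Box &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (a[i].min > b[i].min || a[i].max < b[i].max) return false;
    }
    return true;
}
```

Expanding a box in place to the union with another, as merge_boxes does, corresponds to assigning the result of box_union back to the first argument in this model.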

◆ boxes_required() [1/2]

std::map< std::string, Box > Halide::Internal::boxes_required ( const Expr & e,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

Compute rectangular domains large enough to cover all the 'Call's to each function that occurs within a given statement or expression.

This is useful for figuring out what regions of things to evaluate. Respects control flow (e.g. encodes if statement conditions), but assumes all encountered asserts pass. If it encounters an assert(false) in one if branch, assumes the opposite if branch runs unconditionally.

◆ boxes_required() [2/2]

std::map< std::string, Box > Halide::Internal::boxes_required ( Stmt s,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

◆ boxes_provided() [1/2]

std::map< std::string, Box > Halide::Internal::boxes_provided ( const Expr & e,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

Compute rectangular domains large enough to cover all the 'Provide's to each function that occurs within a given statement or expression.

Handles asserts in the same way as boxes_required.

◆ boxes_provided() [2/2]

std::map< std::string, Box > Halide::Internal::boxes_provided ( Stmt s,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

◆ boxes_touched() [1/2]

std::map< std::string, Box > Halide::Internal::boxes_touched ( const Expr & e,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

Compute rectangular domains large enough to cover all the 'Call's and 'Provide's to each function that occurs within a given statement or expression.

Handles asserts in the same way as boxes_required.

◆ boxes_touched() [2/2]

std::map< std::string, Box > Halide::Internal::boxes_touched ( Stmt s,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

◆ box_required() [1/2]

Box Halide::Internal::box_required ( const Expr & e,
const std::string & fn,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

Variants of the above that are only concerned with a single function.

◆ box_required() [2/2]

Box Halide::Internal::box_required ( Stmt s,
const std::string & fn,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

◆ box_provided() [1/2]

Box Halide::Internal::box_provided ( const Expr & e,
const std::string & fn,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

◆ box_provided() [2/2]

Box Halide::Internal::box_provided ( Stmt s,
const std::string & fn,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

◆ box_touched() [1/2]

Box Halide::Internal::box_touched ( const Expr & e,
const std::string & fn,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

◆ box_touched() [2/2]

Box Halide::Internal::box_touched ( Stmt s,
const std::string & fn,
const Scope< Interval > & scope = Scope< Interval >::empty_scope(),
const FuncValueBounds & func_bounds = empty_func_value_bounds() )

◆ compute_function_value_bounds()

FuncValueBounds Halide::Internal::compute_function_value_bounds ( const std::vector< std::string > & order,
const std::map< std::string, Function > & env )

Compute the maximum and minimum possible value for each function in an environment.

◆ span_of_bounds()

Expr Halide::Internal::span_of_bounds ( const Interval & bounds)

◆ bounds_test()

void Halide::Internal::bounds_test ( )

◆ bounds_inference()

Stmt Halide::Internal::bounds_inference ( Stmt ,
const std::vector< Function > & outputs,
const std::vector< std::string > & realization_order,
const std::vector< std::vector< std::string > > & fused_groups,
const std::map< std::string, Function > & environment,
const std::map< std::pair< std::string, int >, Interval > & func_bounds,
const Target & target )

Take a partially lowered statement that includes symbolic representations of the bounds over which things should be realized, and inject expressions defining those bounds.

◆ bound_small_allocations()

Stmt Halide::Internal::bound_small_allocations ( const Stmt & s)

◆ buffer_accessor()

Expr Halide::Internal::buffer_accessor ( const Buffer<> & buf,
const std::vector< Expr > & args )

◆ get_name_from_end_of_parameter_pack() [1/4]

template<typename T , typename = typename std::enable_if<!std::is_convertible<T, std::string>::value>::type>
std::string Halide::Internal::get_name_from_end_of_parameter_pack ( T && )

Definition at line 44 of file Buffer.h.

◆ get_name_from_end_of_parameter_pack() [2/4]

std::string Halide::Internal::get_name_from_end_of_parameter_pack ( const std::string & n)
inline

Definition at line 48 of file Buffer.h.

◆ get_name_from_end_of_parameter_pack() [3/4]

std::string Halide::Internal::get_name_from_end_of_parameter_pack ( )
inline

Definition at line 52 of file Buffer.h.

Referenced by get_name_from_end_of_parameter_pack().

◆ get_name_from_end_of_parameter_pack() [4/4]

template<typename First , typename Second , typename... Args>
std::string Halide::Internal::get_name_from_end_of_parameter_pack ( First first,
Second second,
Args &&... rest )

Definition at line 59 of file Buffer.h.

References get_name_from_end_of_parameter_pack().

◆ get_shape_from_start_of_parameter_pack_helper() [1/3]

void Halide::Internal::get_shape_from_start_of_parameter_pack_helper ( std::vector< int > & ,
const std::string &  )
inline

◆ get_shape_from_start_of_parameter_pack_helper() [2/3]

void Halide::Internal::get_shape_from_start_of_parameter_pack_helper ( std::vector< int > & )
inline

Definition at line 66 of file Buffer.h.

◆ get_shape_from_start_of_parameter_pack_helper() [3/3]

template<typename... Args>
void Halide::Internal::get_shape_from_start_of_parameter_pack_helper ( std::vector< int > & result,
int x,
Args &&... rest )

Definition at line 70 of file Buffer.h.

References get_shape_from_start_of_parameter_pack_helper().

◆ get_shape_from_start_of_parameter_pack()

template<typename... Args>
std::vector< int > Halide::Internal::get_shape_from_start_of_parameter_pack ( Args &&... args)

Definition at line 76 of file Buffer.h.

References get_shape_from_start_of_parameter_pack_helper().
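The overload sets above peel a parameter pack of the form (int, int, ..., optional name) from both ends. The following sketch reproduces the pattern from the documented signatures inside a hypothetical `sketch` namespace; the function bodies are assumptions, not copies of Buffer.h.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {

// Name extraction: recurse until one argument is left, then return it if
// it is a string, or an empty name otherwise.
template<typename T,
         typename = typename std::enable_if<!std::is_convertible<T, std::string>::value>::type>
std::string get_name_from_end_of_parameter_pack(T &&) {
    return "";
}
inline std::string get_name_from_end_of_parameter_pack(const std::string &n) {
    return n;
}
inline std::string get_name_from_end_of_parameter_pack() {
    return "";
}
template<typename First, typename Second, typename... Args>
std::string get_name_from_end_of_parameter_pack(First, Second second, Args &&...rest) {
    // Drop the first argument and keep looking at the tail.
    return get_name_from_end_of_parameter_pack(second, std::forward<Args>(rest)...);
}

// Shape extraction: collect leading ints, stop at a trailing name or at
// the end of the pack.
inline void get_shape_from_start_of_parameter_pack_helper(std::vector<int> &, const std::string &) {
}
inline void get_shape_from_start_of_parameter_pack_helper(std::vector<int> &) {
}
template<typename... Args>
void get_shape_from_start_of_parameter_pack_helper(std::vector<int> &result, int x, Args &&...rest) {
    result.push_back(x);
    get_shape_from_start_of_parameter_pack_helper(result, std::forward<Args>(rest)...);
}
template<typename... Args>
std::vector<int> get_shape_from_start_of_parameter_pack(Args &&...args) {
    std::vector<int> result;
    get_shape_from_start_of_parameter_pack_helper(result, std::forward<Args>(args)...);
    return result;
}

}  // namespace sketch
```

Given the arguments (3, 4, std::string("img")), the shape helper yields {3, 4} and the name helper yields "img"; without the trailing string the name is empty.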

◆ buffer_type_name_non_const()

template<typename T >
void Halide::Internal::buffer_type_name_non_const ( std::ostream & s)

Definition at line 89 of file Buffer.h.

◆ buffer_type_name_non_const< void >()

template<>
void Halide::Internal::buffer_type_name_non_const< void > ( std::ostream & s)
inline

Definition at line 94 of file Buffer.h.

◆ buffer_type_name()

template<typename T >
std::string Halide::Internal::buffer_type_name ( )

Definition at line 99 of file Buffer.h.

◆ canonicalize_gpu_vars()

Stmt Halide::Internal::canonicalize_gpu_vars ( Stmt s)

Canonicalize GPU var names into some pre-determined block/thread names (i.e. __block_id_x, __thread_id_x, etc.). The x/y/z/w order is determined by the nesting order: innermost is assigned to x and so on.

◆ gpu_thread_name()

const std::string & Halide::Internal::gpu_thread_name ( int index)

Names for the thread and block id variables.

Includes the leading dot. Indexed from inside out, so 0 gives you the innermost loop.

◆ gpu_block_name()

const std::string & Halide::Internal::gpu_block_name ( int index)

◆ clamp_unsafe_accesses()

Stmt Halide::Internal::clamp_unsafe_accesses ( const Stmt & s,
const std::map< std::string, Function > & env,
FuncValueBounds & func_bounds )

Inject clamps around func calls h(...) when all the following conditions hold:

  1. The call is in an indexing context, such as: f(x) = g(h(x));
  2. The FuncValueBounds of h are smaller than those of its type
  3. The allocation bounds of h might be wider than its compute bounds.

◆ new_CodeGen_D3D12Compute_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_D3D12Compute_Dev ( const Target & target)

◆ get_vector_element_type()

llvm::Type * Halide::Internal::get_vector_element_type ( llvm::Type * )

Get the scalar type of an llvm vector type.

Returns the argument if it's not a vector type.

◆ function_takes_user_context()

bool Halide::Internal::function_takes_user_context ( const std::string & name)

Which built-in functions require a user-context first argument?

◆ can_allocation_fit_on_stack()

bool Halide::Internal::can_allocation_fit_on_stack ( int64_t size)

Given a size (in bytes), return True if the allocation size can fit on the stack; otherwise, return False.

This routine asserts if size is non-positive.

◆ long_div_mod_round_to_zero()

std::pair< Expr, Expr > Halide::Internal::long_div_mod_round_to_zero ( const Expr & a,
const Expr & b,
std::optional< uint64_t > max_abs = std::nullopt )

Does a {div/mod}_round_to_zero using binary long division for int/uint.

max_abs is the maximum absolute value of (a/b). Returns the pair {div_round_to_zero, mod_round_to_zero}.
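Binary long division itself is easy to show on concrete unsigned integers. The sketch below is a scalar illustration of the technique with a hypothetical function name; the real function builds Exprs and also covers signed types and the max_abs hint, which are left out here.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Restoring binary long division: returns {a / b, a % b} for unsigned
// 32-bit operands, examining one bit of the dividend per iteration.
inline std::pair<uint32_t, uint32_t> long_div_mod_u32(uint32_t a, uint32_t b) {
    assert(b != 0);
    uint32_t q = 0;
    uint64_t r = 0;  // 64 bits so that the shift below cannot overflow
    for (int bit = 31; bit >= 0; bit--) {
        r = (r << 1) | ((a >> bit) & 1u);  // bring down the next bit
        if (r >= b) {                      // divisor fits: subtract, set bit
            r -= b;
            q |= (uint32_t(1) << bit);
        }
    }
    return {q, static_cast<uint32_t>(r)};
}
```

A signed round-to-zero variant can be built on top of this by dividing the magnitudes and then restoring the signs of the quotient and remainder.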

◆ lower_int_uint_div()

Expr Halide::Internal::lower_int_uint_div ( const Expr & a,
const Expr & b,
bool round_to_zero = false )

Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.

Can introduce mulhi_shr and sorted_avg intrinsics as well as those from the lower_euclidean_ operation – div_round_to_zero or mod_round_to_zero.

◆ lower_int_uint_mod()

Expr Halide::Internal::lower_int_uint_mod ( const Expr & a,
const Expr & b )

Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.

Can introduce mulhi_shr and sorted_avg intrinsics as well as those from the lower_euclidean_ operation – div_round_to_zero or mod_round_to_zero.

◆ lower_euclidean_div()

Expr Halide::Internal::lower_euclidean_div ( Expr a,
Expr b )

Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.

◆ lower_euclidean_mod()

Expr Halide::Internal::lower_euclidean_mod ( Expr a,
Expr b )

Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.
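For scalars, expressing Euclidean division and modulus through round-to-zero operations can be sketched as follows. C++'s built-in `/` and `%` round toward zero, so they play the role of div_round_to_zero and mod_round_to_zero here. The function names are hypothetical, and the sketch assumes a nonzero divisor and no INT_MIN / -1 case.

```cpp
#include <cassert>

// Euclidean quotient: chosen so that the remainder is never negative.
inline int euclidean_div(int a, int b) {
    assert(b != 0);
    int q = a / b;  // rounds toward zero
    const int r = a % b;
    if (r < 0) {
        // Move the quotient one step away from zero along the sign of b.
        q += (b > 0) ? -1 : 1;
    }
    return q;
}

// Euclidean remainder: always in [0, |b|).
inline int euclidean_mod(int a, int b) {
    assert(b != 0);
    int r = a % b;  // has the sign of a
    if (r < 0) {
        r += (b > 0) ? b : -b;
    }
    return r;
}
```

In every case the identity a == euclidean_div(a, b) * b + euclidean_mod(a, b) holds, with a non-negative remainder.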

◆ lower_signed_shift_left()

Expr Halide::Internal::lower_signed_shift_left ( const Expr & a,
const Expr & b )

Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.

◆ lower_signed_shift_right()

Expr Halide::Internal::lower_signed_shift_right ( const Expr & a,
const Expr & b )

Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.
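The lowering for both shift directions amounts to a select on the sign of the shift amount. Here is a scalar sketch with hypothetical names; treating shifts of 32 or more as zero is a choice made in this sketch only and is not a statement about Halide's semantics.

```cpp
#include <cassert>
#include <cstdint>

// Shifts by unsigned amounts; amounts >= 32 give 0 in this sketch.
inline uint32_t shl_unsigned(uint32_t a, uint32_t n) {
    return n < 32 ? (a << n) : 0;
}
inline uint32_t shr_unsigned(uint32_t a, uint32_t n) {
    return n < 32 ? (a >> n) : 0;
}

// Magnitude of a negative shift amount without signed overflow.
inline uint32_t magnitude(int32_t b) {
    return b < 0 ? (0u - static_cast<uint32_t>(b)) : static_cast<uint32_t>(b);
}

// a << b with signed b: a negative amount shifts right instead.
inline uint32_t signed_shift_left(uint32_t a, int32_t b) {
    return b >= 0 ? shl_unsigned(a, magnitude(b)) : shr_unsigned(a, magnitude(b));
}

// a >> b with signed b: a negative amount shifts left instead.
inline uint32_t signed_shift_right(uint32_t a, int32_t b) {
    return b >= 0 ? shr_unsigned(a, magnitude(b)) : shl_unsigned(a, magnitude(b));
}
```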

◆ lower_mux()

Expr Halide::Internal::lower_mux ( const Call * mux)

Reduce a mux intrinsic to a select tree.
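A select tree for a mux can be pictured as a chain of nested selects on the index. The sketch below evaluates such a chain on concrete values; the helper names are hypothetical, and using the last value as the fallback for any other index is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Scalar stand-in for a select: picks t when c is true, f otherwise.
template<typename T>
T select_value(bool c, T t, T f) {
    return c ? t : f;
}

// Evaluates select(i == 0, v0, select(i == 1, v1, ... v_last)).
template<typename T>
T mux_as_select_tree(int index, const std::vector<T> &values) {
    assert(!values.empty());
    T result = values.back();  // innermost "else" branch
    for (int i = static_cast<int>(values.size()) - 2; i >= 0; i--) {
        result = select_value(index == i, values[i], result);
    }
    return result;
}
```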

◆ lower_extract_bits()

Expr Halide::Internal::lower_extract_bits ( const Call * c)

Reduce bit extraction and concatenation to bit ops.

◆ lower_concat_bits()

Expr Halide::Internal::lower_concat_bits ( const Call * c)

Reduce bit extraction and concatenation to bit ops.

◆ lower_round_to_nearest_ties_to_even()

Expr Halide::Internal::lower_round_to_nearest_ties_to_even ( const Expr & )

A vectorizable implementation of Halide::round that doesn't depend on any standard library being present.
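One well-known way to round to nearest with ties to even without calling a library routine is to add and then subtract 2^23 in single precision. The sketch below shows that trick for a scalar float. It is not claimed to be the approach this function takes, the name is hypothetical, and it assumes the default IEEE round-to-nearest-even mode with no fast-math optimizations.

```cpp
#include <cassert>

inline float round_ties_to_even_sketch(float x) {
    const float magic = 8388608.0f;  // 2^23: floats this large have no fraction bits
    const float ax = x < 0 ? -x : x;
    if (!(ax < magic)) {
        return x;  // already integral, or NaN/infinity
    }
    // Adding 2^23 forces the hardware to round away the fraction bits using
    // the current rounding mode; subtracting it recovers the rounded value.
    volatile float t = ax + magic;  // volatile discourages reassociation
    const float r = t - magic;
    return x < 0 ? -r : r;
}
```

Only comparisons, negation, addition, and subtraction are used, all of which map directly onto vector instructions. The sign of negative zero is not preserved by this sketch.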

◆ get_target_options()

void Halide::Internal::get_target_options ( const llvm::Module & module,
llvm::TargetOptions & options )

Given an llvm::Module, set llvm::TargetOptions information.

◆ clone_target_options()

void Halide::Internal::clone_target_options ( const llvm::Module & from,
llvm::Module & to )

Given two llvm::Modules, clone target options from one to the other.

◆ make_target_machine()

std::unique_ptr< llvm::TargetMachine > Halide::Internal::make_target_machine ( const llvm::Module & module)

Given an llvm::Module, get or create an llvm::TargetMachine.

◆ set_function_attributes_from_halide_target_options()

void Halide::Internal::set_function_attributes_from_halide_target_options ( llvm::Function & )

Set the appropriate llvm Function attributes given the Halide Target.

◆ embed_bitcode()

void Halide::Internal::embed_bitcode ( llvm::Module * M,
const std::string & halide_command )

Save a copy of the llvm IR currently represented by the module as data in the __LLVM,__bitcode section.

Emulates clang's -fembed-bitcode flag and is useful to satisfy Apple's bitcode inclusion requirements.

◆ new_CodeGen_Metal_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_Metal_Dev ( const Target & target)

◆ new_CodeGen_OpenCL_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_OpenCL_Dev ( const Target & target)

◆ new_CodeGen_PTX_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_PTX_Dev ( const Target & target)

◆ new_CodeGen_ARM()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_ARM ( const Target & target)

Construct CodeGen object for a variety of targets.

◆ new_CodeGen_Hexagon()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_Hexagon ( const Target & target)

◆ new_CodeGen_PowerPC()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_PowerPC ( const Target & target)

◆ new_CodeGen_RISCV()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_RISCV ( const Target & target)

◆ new_CodeGen_X86()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_X86 ( const Target & target)

◆ new_CodeGen_WebAssembly()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_WebAssembly ( const Target & target)

◆ new_CodeGen_Vulkan_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_Vulkan_Dev ( const Target & target)

◆ new_CodeGen_WebGPU_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_WebGPU_Dev ( const Target & target)

◆ set_compiler_logger()

std::unique_ptr< CompilerLogger > Halide::Internal::set_compiler_logger ( std::unique_ptr< CompilerLogger > compiler_logger)

Set the active CompilerLogger object, replacing any existing one.

It is legal to pass in a nullptr (which means "don't do any compiler logging"). Returns the previous CompilerLogger (if any).

◆ get_compiler_logger()

CompilerLogger * Halide::Internal::get_compiler_logger ( )

Return the currently active CompilerLogger object.

If set_compiler_logger() has never been called, a nullptr implementation will be returned. Do not save the pointer returned! It is intended to be used for immediate calls only.

◆ constant_integer_bounds()

ConstantInterval Halide::Internal::constant_integer_bounds ( const Expr & e,
const Scope< ConstantInterval > & scope = Scope< ConstantInterval >::empty_scope(),
std::map< Expr, ConstantInterval, ExprCompare > * cache = nullptr )

Deduce constant integer bounds on an expression.

This can be useful to decide if, for example, the expression can be cast to another type, be negated, be incremented, etc without risking overflow.

Also optionally accepts a scope containing the integer bounds of any variables that may be referenced, and a cache of constant integer bounds on known Exprs, which this function will update. The cache is helpful to short-circuit large numbers of redundant queries, but it should not be used in contexts where the same Expr object may take on different values within a single Expr (i.e. before uniquify_variable_names).

◆ operator+() [1/4]

ConstantInterval Halide::Internal::operator+ ( const ConstantInterval & a,
const ConstantInterval & b )

Arithmetic operators on ConstantIntervals.

The resulting interval contains all possible values of the operator applied to any two elements of the argument intervals. Note that these operate on unbounded integers. If you are applying this to concrete small integer types, you will need to manually cast the constant interval back to the desired type to model the effect of overflow.
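The note about overflow can be illustrated with a small bounded-interval model. `IntervalSketch` and `cast_to_int8` are hypothetical names; the sketch omits unbounded intervals, and falling back to the full int8 range on overflow is a conservative choice made here, not necessarily what ConstantInterval does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// A bounded interval [min, max] of mathematical integers.
struct IntervalSketch {
    int64_t min, max;
};

inline IntervalSketch operator+(const IntervalSketch &a, const IntervalSketch &b) {
    return {a.min + b.min, a.max + b.max};
}

inline IntervalSketch operator*(const IntervalSketch &a, const IntervalSketch &b) {
    // The extremes of a product lie at the corners of the two intervals.
    const int64_t p[] = {a.min * b.min, a.min * b.max, a.max * b.min, a.max * b.max};
    return {*std::min_element(p, p + 4), *std::max_element(p, p + 4)};
}

// Model a cast to int8: if every value fits, the interval is unchanged;
// otherwise some value wraps, so fall back to the full int8 range.
inline IntervalSketch cast_to_int8(const IntervalSketch &i) {
    if (i.min >= -128 && i.max <= 127) {
        return i;
    }
    return {-128, 127};
}
```

Adding [100, 120] and [10, 20] gives [110, 140] over unbounded integers; only the explicit cast reveals that an int8 result could be anywhere in [-128, 127].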

◆ operator+() [2/4]

ConstantInterval Halide::Internal::operator+ ( const ConstantInterval & a,
int64_t b )

◆ operator-() [1/4]

ConstantInterval Halide::Internal::operator- ( const ConstantInterval & a,
const ConstantInterval & b )

◆ operator-() [2/4]

ConstantInterval Halide::Internal::operator- ( const ConstantInterval & a,
int64_t b )

◆ operator/() [1/4]

ConstantInterval Halide::Internal::operator/ ( const ConstantInterval & a,
const ConstantInterval & b )

◆ operator/() [2/4]

ConstantInterval Halide::Internal::operator/ ( const ConstantInterval & a,
int64_t b )

◆ operator*() [1/4]

ConstantInterval Halide::Internal::operator* ( const ConstantInterval & a,
const ConstantInterval & b )

◆ operator*() [2/4]

ConstantInterval Halide::Internal::operator* ( const ConstantInterval & a,
int64_t b )

◆ operator%() [1/4]

ConstantInterval Halide::Internal::operator% ( const ConstantInterval & a,
const ConstantInterval & b )

◆ operator%() [2/4]

ConstantInterval Halide::Internal::operator% ( const ConstantInterval & a,
int64_t b )

◆ min() [1/2]

◆ min() [2/2]

ConstantInterval Halide::Internal::min ( const ConstantInterval & a,
int64_t b )

◆ max() [1/2]

◆ max() [2/2]

ConstantInterval Halide::Internal::max ( const ConstantInterval & a,
int64_t b )

◆ abs()

ConstantInterval Halide::Internal::abs ( const ConstantInterval & a)

◆ operator<<() [1/22]

ConstantInterval Halide::Internal::operator<< ( const ConstantInterval & a,
const ConstantInterval & b )

◆ operator<<() [2/22]

ConstantInterval Halide::Internal::operator<< ( const ConstantInterval & a,
int64_t b )

◆ operator<<() [3/22]

ConstantInterval Halide::Internal::operator<< ( int64_t a,
const ConstantInterval & b )

◆ operator>>() [1/3]

ConstantInterval Halide::Internal::operator>> ( const ConstantInterval & a,
const ConstantInterval & b )

◆ operator>>() [2/3]

ConstantInterval Halide::Internal::operator>> ( const ConstantInterval & a,
int64_t b )

◆ operator>>() [3/3]

ConstantInterval Halide::Internal::operator>> ( int64_t a,
const ConstantInterval & b )

◆ operator<=() [1/3]

bool Halide::Internal::operator<= ( const ConstantInterval & a,
const ConstantInterval & b )

Comparison operators on ConstantIntervals.

Returns whether the comparison is true for all values of the two intervals.
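Because a comparison must hold for all pairs of values, it reduces to comparing one endpoint of each interval. A minimal sketch on bounded intervals follows; `IntervalSketch` is a hypothetical stand-in for ConstantInterval, and unbounded cases are omitted.

```cpp
#include <cassert>
#include <cstdint>

struct IntervalSketch {
    int64_t min, max;  // inclusive bounds
};

// True only if every value of a is <= every value of b.
inline bool operator<=(const IntervalSketch &a, const IntervalSketch &b) {
    return a.max <= b.min;
}
// True only if every value of a is < every value of b.
inline bool operator<(const IntervalSketch &a, const IntervalSketch &b) {
    return a.max < b.min;
}
// The reversed comparisons are defined by swapping the operands.
inline bool operator>=(const IntervalSketch &a, const IntervalSketch &b) {
    return b <= a;
}
inline bool operator>(const IntervalSketch &a, const IntervalSketch &b) {
    return b < a;
}
```

One consequence is that these operators do not form a total order: for overlapping intervals, both `a < b` and `a >= b` can be false at the same time.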

◆ operator<=() [2/3]

bool Halide::Internal::operator<= ( const ConstantInterval & a,
int64_t b )

◆ operator<=() [3/3]

bool Halide::Internal::operator<= ( int64_t a,
const ConstantInterval & b )

◆ operator<() [1/3]

bool Halide::Internal::operator< ( const ConstantInterval & a,
const ConstantInterval & b )

◆ operator<() [2/3]

bool Halide::Internal::operator< ( const ConstantInterval & a,
int64_t b )

◆ operator<() [3/3]

bool Halide::Internal::operator< ( int64_t a,
const ConstantInterval & b )

◆ operator>=() [1/3]

bool Halide::Internal::operator>= ( const ConstantInterval & a,
const ConstantInterval & b )
inline

Definition at line 144 of file ConstantInterval.h.

◆ operator>() [1/3]

bool Halide::Internal::operator> ( const ConstantInterval & a,
const ConstantInterval & b )
inline

Definition at line 147 of file ConstantInterval.h.

◆ operator>=() [2/3]

bool Halide::Internal::operator>= ( const ConstantInterval & a,
int64_t b )
inline

Definition at line 150 of file ConstantInterval.h.

◆ operator>() [2/3]

bool Halide::Internal::operator> ( const ConstantInterval & a,
int64_t b )
inline

Definition at line 153 of file ConstantInterval.h.

◆ operator>=() [3/3]

bool Halide::Internal::operator>= ( int64_t a,
const ConstantInterval & b )
inline

Definition at line 156 of file ConstantInterval.h.

◆ operator>() [3/3]

bool Halide::Internal::operator> ( int64_t a,
const ConstantInterval & b )
inline

Definition at line 159 of file ConstantInterval.h.

◆ cplusplus_function_mangled_name()

std::string Halide::Internal::cplusplus_function_mangled_name ( const std::string & name,
const std::vector< std::string > & namespaces,
Type return_type,
const std::vector< ExternFuncArgument > & args,
const Target & target )

Return the mangled C++ name for a function.

The target parameter is used to decide on the C++ ABI/mangling style to use.

◆ cplusplus_mangle_test()

void Halide::Internal::cplusplus_mangle_test ( )

◆ common_subexpression_elimination() [1/2]

Expr Halide::Internal::common_subexpression_elimination ( const Expr & ,
bool lift_all = false )

Replace each common sub-expression in the argument with a variable, and wrap the resulting expr in a let statement giving a value to that variable.

This is important to do within Halide (instead of punting to llvm), because exprs that come in from the front-end are small when considered as a graph, but combinatorially large when considered as a tree. For an example of such a case, see test/code_explosion.cpp.

The last parameter determines whether all common subexpressions are lifted, or only those that the simplifier would not substitute back in (e.g. addition of a constant).
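The gap between graph size and tree size is easy to reproduce with a toy expression type. In the sketch below, which uses hypothetical types and names rather than Halide's IR, each new node uses the previous node as both operands, the way `e = e + e` would.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <set>

struct Node {
    std::shared_ptr<Node> a, b;  // both null for a leaf
};
using NodePtr = std::shared_ptr<Node>;

// Build e_0 = leaf, e_{i+1} = op(e_i, e_i), sharing the operand node.
inline NodePtr build_doubling_chain(int depth) {
    NodePtr e = std::make_shared<Node>();
    for (int i = 0; i < depth; i++) {
        auto n = std::make_shared<Node>();
        n->a = e;
        n->b = e;
        e = n;
    }
    return e;
}

// Node count when shared nodes are expanded, as a tree walk would see them.
inline uint64_t tree_size(const NodePtr &e) {
    if (!e) return 0;
    return 1 + tree_size(e->a) + tree_size(e->b);
}

// Node count when each distinct node is counted once, as in a graph.
inline std::size_t graph_size(const NodePtr &e, std::set<const Node *> &seen) {
    if (!e || !seen.insert(e.get()).second) return 0;
    return 1 + graph_size(e->a, seen) + graph_size(e->b, seen);
}
```

At depth 10 the graph has 11 nodes while the expanded tree has 2047. Binding each shared node to a variable once keeps later passes working at the graph size.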

◆ common_subexpression_elimination() [2/2]

Stmt Halide::Internal::common_subexpression_elimination ( const Stmt & ,
bool lift_all = false )

Do common-subexpression-elimination on each expression in a statement.

Does not introduce let statements.

◆ cse_test()

void Halide::Internal::cse_test ( )

◆ operator<<() [4/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const Stmt &  )

Emit a halide statement on an output stream (such as std::cout) in a human-readable form.

◆ operator<<() [5/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & ,
const LoweredFunc &  )

Emit a halide LoweredFunc in a human readable format.

◆ PrintSpan()

template<typename T >
Halide::Internal::PrintSpan ( const T & ) -> PrintSpan< T >

◆ operator<<() [6/22]

template<typename StreamT , typename T >
StreamT & Halide::Internal::operator<< ( StreamT & stream,
const PrintSpan< T > & wrapper )
inline

Definition at line 85 of file Debug.h.

References Halide::Internal::PrintSpan< T >::span.

◆ PrintSpanLn()

template<typename T >
Halide::Internal::PrintSpanLn ( const T & ) -> PrintSpanLn< T >

◆ operator<<() [7/22]

template<typename StreamT , typename T >
StreamT & Halide::Internal::operator<< ( StreamT & stream,
const PrintSpanLn< T > & wrapper )
inline

Definition at line 119 of file Debug.h.

References Halide::Internal::PrintSpanLn< T >::span.

◆ debug_arguments()

void Halide::Internal::debug_arguments ( LoweredFunc * func,
const Target & t )

Injects debug prints in a LoweredFunc that describe the target and arguments.

Mutates the given func.

◆ debug_to_file()

Stmt Halide::Internal::debug_to_file ( Stmt s,
const std::vector< Function > & outputs,
const std::map< std::string, Function > & env )

Takes a statement with Realize nodes still unlowered.

If the corresponding functions have a debug_file set, then inject code that will dump the contents of those functions to a file after the realization.

◆ extract_odd_lanes()

Expr Halide::Internal::extract_odd_lanes ( const Expr & a)

Extract the odd-numbered lanes in a vector.

◆ extract_even_lanes()

Expr Halide::Internal::extract_even_lanes ( const Expr & a)

Extract the even-numbered lanes in a vector.

◆ extract_lane()

Expr Halide::Internal::extract_lane ( const Expr & vec,
int lane )

Extract the nth lane of a vector.
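As a rough picture of what the three lane helpers above compute, the sketch below applies the same selections to a plain std::vector. It assumes zero-based lane numbering (lanes 0, 2, 4, ... are "even"); even_lanes, odd_lanes, and lane are hypothetical names, and the real functions operate on vector-typed Exprs rather than on concrete values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lanes 0, 2, 4, ... of v (zero-based numbering is an assumption).
std::vector<int> even_lanes(const std::vector<int> &v) {
    std::vector<int> out;
    for (std::size_t i = 0; i < v.size(); i += 2) out.push_back(v[i]);
    return out;
}

// Lanes 1, 3, 5, ... of v.
std::vector<int> odd_lanes(const std::vector<int> &v) {
    std::vector<int> out;
    for (std::size_t i = 1; i < v.size(); i += 2) out.push_back(v[i]);
    return out;
}

// The single lane at index n.
int lane(const std::vector<int> &v, int n) {
    assert(n >= 0 && static_cast<std::size_t>(n) < v.size());
    return v[static_cast<std::size_t>(n)];
}
```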

◆ rewrite_interleavings()

Stmt Halide::Internal::rewrite_interleavings ( const Stmt & s)

Look through a statement for expressions of the form select(ramp % 2 == 0, a, b) and replace them with calls to an interleave intrinsic.

◆ deinterleave_vector_test()

void Halide::Internal::deinterleave_vector_test ( )

◆ remove_let_definitions()

Expr Halide::Internal::remove_let_definitions ( const Expr & expr)

Remove all let definitions of expr.

◆ gather_variables() [1/2]

std::vector< int > Halide::Internal::gather_variables ( const Expr & expr,
const std::vector< std::string > & filter )

Return a list of variables' indices that expr depends on and are in the filter.

◆ gather_variables() [2/2]

std::vector< int > Halide::Internal::gather_variables ( const Expr & expr,
const std::vector< Var > & filter )

◆ gather_rvariables() [1/2]

std::map< std::string, ReductionVariableInfo > Halide::Internal::gather_rvariables ( const Expr & expr)

◆ gather_rvariables() [2/2]

std::map< std::string, ReductionVariableInfo > Halide::Internal::gather_rvariables ( const Tuple & tuple)

◆ add_let_expression()

Expr Halide::Internal::add_let_expression ( const Expr & expr,
const std::map< std::string, Expr > & let_var_mapping,
const std::vector< std::string > & let_variables )

Add necessary let expressions to expr.

◆ sort_expressions()

std::vector< Expr > Halide::Internal::sort_expressions ( const Expr & expr)

Topologically sort the expression graph expressed by expr.

◆ inference_bounds() [1/2]

std::map< std::string, Box > Halide::Internal::inference_bounds ( const std::vector< Func > & funcs,
const std::vector< Box > & output_bounds )

Compute the bounds of funcs.

The bounds represent a conservative region that is used by the "consumers" of the function, excluding the function itself.

◆ inference_bounds() [2/2]

std::map< std::string, Box > Halide::Internal::inference_bounds ( const Func & func,
const Box & output_bounds )

◆ box_to_vector()

std::vector< std::pair< Expr, Expr > > Halide::Internal::box_to_vector ( const Box & bounds)

Convert Box to vector of (min, extent)

◆ equal() [1/4]

bool Halide::Internal::equal ( const RDom & bounds0,
const RDom & bounds1 )

◆ vars_to_strings()

std::vector< std::string > Halide::Internal::vars_to_strings ( const std::vector< Var > & vars)

Return a list of variable names.

◆ extract_rdom()

ReductionDomain Halide::Internal::extract_rdom ( const Expr & expr)

Return the reduction domain used by expr.

◆ solve_inverse()

std::pair< bool, Expr > Halide::Internal::solve_inverse ( Expr expr,
const std::string & new_var,
const std::string & var )

Given expr of the form new_var == f(var), solve for var == g(new_var). If multiple values of new_var correspond to the same var, introduce an RDom.

◆ find_buffer_param_calls()

std::map< std::string, BufferInfo > Halide::Internal::find_buffer_param_calls ( const Func & func)

◆ find_implicit_variables()

std::set< std::string > Halide::Internal::find_implicit_variables ( const Expr & expr)

Find all implicit variables in expr.

◆ substitute_rdom_predicate()

Expr Halide::Internal::substitute_rdom_predicate ( const std::string & name,
const Expr & replacement,
const Expr & expr )

Substitute the variable.

Also replace all occurrences in rdom.where() predicates.

◆ is_calling_function() [1/2]

bool Halide::Internal::is_calling_function ( const std::string & func_name,
const Expr & expr,
const std::map< std::string, Expr > & let_var_mapping )

Return true if expr contains a call to func_name.

◆ is_calling_function() [2/2]

bool Halide::Internal::is_calling_function ( const Expr & expr,
const std::map< std::string, Expr > & let_var_mapping )

Return true if expr depends on any function or buffer.

◆ substitute_call_arg_with_pure_arg()

Expr Halide::Internal::substitute_call_arg_with_pure_arg ( Func f,
int variable_id,
const Expr & e )

Replace calls to Func f in Expr e so that the call argument at variable_id is the pure argument.

◆ make_device_interface_call()

Expr Halide::Internal::make_device_interface_call ( DeviceAPI device_api,
MemoryType memory_type = MemoryType::Auto )

Get an Expr which evaluates to the device interface for the given device api at runtime.

◆ distribute_shifts()

Stmt Halide::Internal::distribute_shifts ( const Stmt & stmt,
bool multiply_adds )

◆ inject_early_frees()

Stmt Halide::Internal::inject_early_frees ( const Stmt & s)

Take a statement with allocations and inject markers (in the form of calls to "mark buffer dead") after the last use of each allocation.

Targets may use this to free buffers earlier than the close of their Allocate node.

◆ eliminate_bool_vectors() [1/2]

Stmt Halide::Internal::eliminate_bool_vectors ( const Stmt & s)

Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.

For example, instead of select(i1x8, u16x8, u16x8), the target would prefer to see select(u16x8, u16x8, u16x8), where the first argument is a vector of integers representing a mask. This pass converts vectors of bools to vectors of integers to meet this requirement. This is done by injecting intrinsics to convert bools to architecture-specific masks, and using a select_mask intrinsic instead of a Select node. This also converts any intrinsics that operate on vectorized conditions to a *_mask equivalent (if_then_else, require). Because the masks are architecture specific, they may not be stored or loaded. On Stores, the masks are converted to UInt(8) with a value of 0 or 1, which is our canonical in-memory representation of a bool.
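The mask-based select described above can be sketched on plain arrays. This is an illustration only: the all-ones/all-zeros 16-bit mask is one plausible architecture-specific format chosen for the example, and bools_to_mask, select_mask, and mask_to_stored_bools are hypothetical helper names, not Halide intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using U16x8 = std::array<std::uint16_t, 8>;
using Boolx8 = std::array<bool, 8>;

// Widen each bool to an all-ones or all-zeros lane of the operand width.
U16x8 bools_to_mask(const Boolx8 &c) {
    U16x8 m{};
    for (std::size_t i = 0; i < c.size(); i++) m[i] = c[i] ? 0xFFFF : 0x0000;
    return m;
}

// Select implemented as a bitwise blend driven by the integer mask.
U16x8 select_mask(const U16x8 &m, const U16x8 &a, const U16x8 &b) {
    U16x8 r{};
    for (std::size_t i = 0; i < r.size(); i++) {
        r[i] = static_cast<std::uint16_t>((m[i] & a[i]) | (~m[i] & b[i]));
    }
    return r;
}

// Before storing, convert the mask to the canonical one-byte 0/1 form.
std::array<std::uint8_t, 8> mask_to_stored_bools(const U16x8 &m) {
    std::array<std::uint8_t, 8> r{};
    for (std::size_t i = 0; i < r.size(); i++) r[i] = m[i] != 0 ? 1 : 0;
    return r;
}
```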

◆ eliminate_bool_vectors() [2/2]

Expr Halide::Internal::eliminate_bool_vectors ( const Expr & s)

Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.

For example, instead of select(i1x8, u16x8, u16x8), the target would prefer to see select(u16x8, u16x8, u16x8), where the first argument is a vector of integers representing a mask. This pass converts vectors of bools to vectors of integers to meet this requirement. This is done by injecting intrinsics to convert bools to architecture-specific masks, and using a select_mask intrinsic instead of a Select node. This also converts any intrinsics that operate on vectorized conditions to a *_mask equivalent (if_then_else, require). Because the masks are architecture specific, they may not be stored or loaded. On Stores, the masks are converted to UInt(8) with a value of 0 or 1, which is our canonical in-memory representation of a bool.

◆ eliminated_bool_type()

Type Halide::Internal::eliminated_bool_type ( Type bool_type,
Type other_type )
inline

If a type is a boolean vector, find the type that it has been changed to by eliminate_bool_vectors.

Definition at line 32 of file EliminateBoolVectors.h.

References Halide::Type::bits(), Halide::Type::Int, Halide::Type::is_vector(), Halide::Type::with_bits(), and Halide::Type::with_code().

◆ is_float16_transcendental()

bool Halide::Internal::is_float16_transcendental ( const Call * )

Check if a call is a float16 transcendental (e.g. sqrt_f16).

◆ lower_float16_transcendental_to_float32_equivalent()

Expr Halide::Internal::lower_float16_transcendental_to_float32_equivalent ( const Call * )

Implement a float16 transcendental using the float32 equivalent.

◆ float32_to_bfloat16()

Expr Halide::Internal::float32_to_bfloat16 ( Expr e)

Cast to/from float and bfloat using bitwise math.
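To make "bitwise math" concrete: bfloat16 is the upper 16 bits of an IEEE float32, so conversion reduces to shifts plus a rounding step. The sketch below works on concrete values and uses round-to-nearest-even, which is a common choice and an assumption here; it is not claimed to match the rounding Halide emits. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

// Round a float32 bit pattern to a bfloat16 bit pattern (nearest, ties to even).
std::uint16_t float32_bits_to_bfloat16_bits(std::uint32_t bits) {
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        // NaN: force a mantissa bit so truncation cannot turn it into infinity.
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }
    const std::uint32_t lsb = (bits >> 16) & 1u;  // lowest bit that survives
    bits += 0x7FFFu + lsb;                        // bias for ties-to-even
    return static_cast<std::uint16_t>(bits >> 16);
}

std::uint16_t to_bfloat16_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return float32_bits_to_bfloat16_bits(bits);
}

// Widening is exact: place the 16 bits in the top half of a float32.
float from_bfloat16_bits(std::uint16_t h) {
    const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```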

◆ float32_to_float16()

Expr Halide::Internal::float32_to_float16 ( Expr e)

◆ float16_to_float32()

Expr Halide::Internal::float16_to_float32 ( Expr e)

◆ bfloat16_to_float32()

Expr Halide::Internal::bfloat16_to_float32 ( Expr e)

◆ lower_float16_cast()

Expr Halide::Internal::lower_float16_cast ( const Cast * op)

◆ unhandled_exception_handler()

HALIDE_EXPORT_SYMBOL void Halide::Internal::unhandled_exception_handler ( )

◆ ref_count< IRNode >()

template<>
RefCount & Halide::Internal::ref_count< IRNode > ( const IRNode * t)
inline noexcept

Definition at line 117 of file Expr.h.

◆ destroy< IRNode >()

template<>
void Halide::Internal::destroy< IRNode > ( const IRNode * t)
inline

Definition at line 122 of file Expr.h.

◆ is_unordered_parallel()

bool Halide::Internal::is_unordered_parallel ( ForType for_type)

Check if for_type executes for loop iterations in parallel and unordered.

Referenced by Halide::Internal::Dim::is_unordered_parallel(), and Halide::Internal::For::is_unordered_parallel().

◆ is_parallel()

bool Halide::Internal::is_parallel ( ForType for_type)

Returns true if for_type executes for loop iterations in parallel.

Referenced by Halide::Internal::Dim::is_parallel(), and Halide::Internal::For::is_parallel().

◆ is_gpu()

bool Halide::Internal::is_gpu ( ForType for_type)

Returns true if for_type is GPUBlock, GPUThread, or GPULane.

◆ stmt_or_expr_uses_vars()

template<typename StmtOrExpr , typename T >
bool Halide::Internal::stmt_or_expr_uses_vars ( const StmtOrExpr & e,
const Scope< T > & v,
const Scope< Expr > & s = Scope<Expr>::empty_scope() )
inline

Test if a statement or expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 101 of file ExprUsesVar.h.

References Halide::Internal::ExprUsesVars< T >::result.

Referenced by expr_uses_vars(), stmt_or_expr_uses_var(), and stmt_uses_vars().

◆ stmt_or_expr_uses_var()

template<typename StmtOrExpr >
bool Halide::Internal::stmt_or_expr_uses_var ( const StmtOrExpr & e,
const std::string & v,
const Scope< Expr > & s = Scope<Expr>::empty_scope() )
inline

Test if a statement or expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 113 of file ExprUsesVar.h.

References Halide::Internal::Scope< T >::push(), and stmt_or_expr_uses_vars().

Referenced by expr_uses_var(), and stmt_uses_var().

◆ expr_uses_var()

bool Halide::Internal::expr_uses_var ( const Expr & e,
const std::string & v,
const Scope< Expr > & s = Scope<Expr>::empty_scope() )
inline

Test if an expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 124 of file ExprUsesVar.h.

References stmt_or_expr_uses_var().

◆ stmt_uses_var()

bool Halide::Internal::stmt_uses_var ( const Stmt & stmt,
const std::string & v,
const Scope< Expr > & s = Scope<Expr>::empty_scope() )
inline

Test if a statement references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 133 of file ExprUsesVar.h.

References Halide::stmt, and stmt_or_expr_uses_var().

◆ expr_uses_vars()

template<typename T >
bool Halide::Internal::expr_uses_vars ( const Expr & e,
const Scope< T > & v,
const Scope< Expr > & s = Scope<Expr>::empty_scope() )
inline

Test if an expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 143 of file ExprUsesVar.h.

References stmt_or_expr_uses_vars().

◆ stmt_uses_vars()

template<typename T >
bool Halide::Internal::stmt_uses_vars ( const Stmt & stmt,
const Scope< T > & v,
const Scope< Expr > & s = Scope<Expr>::empty_scope() )
inline

Test if a statement references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 153 of file ExprUsesVar.h.

References Halide::stmt, and stmt_or_expr_uses_vars().

◆ extract_tile_operations()

Stmt Halide::Internal::extract_tile_operations ( const Stmt & s)

Rewrite any AMX tile operations that have been stored in the AMXTile memory type as intrinsic calls, to be used in the X86 backend.

◆ find_direct_calls()

std::map< std::string, Function > Halide::Internal::find_direct_calls ( const Function & f)

Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, including in update definitions, update index expressions, and RDom extents.

This map does not include the Function f, unless it is called recursively by itself.

◆ find_transitive_calls()

std::map< std::string, Function > Halide::Internal::find_transitive_calls ( const Function & f)

Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, or indirectly in those functions' definitions, recursively.

This map always includes the Function f.
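The difference between the direct and transitive variants can be sketched on a toy call graph keyed by function name. CallGraph, direct_calls, and transitive_calls are hypothetical names for this illustration; the real functions walk Function definitions and return maps of Function objects.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: function name -> names it calls directly.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Direct callees only; f itself appears only if it calls itself.
std::set<std::string> direct_calls(const CallGraph &g, const std::string &f) {
    std::set<std::string> out;
    auto it = g.find(f);
    if (it != g.end()) out.insert(it->second.begin(), it->second.end());
    return out;
}

// Everything reachable from f; f is always part of the result.
std::set<std::string> transitive_calls(const CallGraph &g, const std::string &f) {
    std::set<std::string> seen{f};
    std::vector<std::string> work{f};
    while (!work.empty()) {
        const std::string cur = work.back();
        work.pop_back();
        for (const std::string &callee : direct_calls(g, cur)) {
            if (seen.insert(callee).second) work.push_back(callee);
        }
    }
    return seen;
}
```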

◆ build_environment()

std::map< std::string, Function > Halide::Internal::build_environment ( const std::vector< Function > & funcs)

Find all Functions transitively referenced by any Function in funcs and return a map of them.

◆ called_funcs_in_order_found()

std::vector< Function > Halide::Internal::called_funcs_in_order_found ( const std::vector< Function > & funcs)

Returns the same Functions as build_environment, but returns a vector of Functions instead, where the order is the order in which the Functions were first encountered.

This is stable to changes in the names of the Functions.

◆ lower_widen_right_add()

Expr Halide::Internal::lower_widen_right_add ( const Expr & a,
const Expr & b )

Implement intrinsics using non-intrinsic equivalents.

◆ lower_widen_right_mul()

Expr Halide::Internal::lower_widen_right_mul ( const Expr & a,
const Expr & b )

◆ lower_widen_right_sub()

Expr Halide::Internal::lower_widen_right_sub ( const Expr & a,
const Expr & b )

◆ lower_widening_add()

Expr Halide::Internal::lower_widening_add ( const Expr & a,
const Expr & b )

◆ lower_widening_mul()

Expr Halide::Internal::lower_widening_mul ( const Expr & a,
const Expr & b )

◆ lower_widening_sub()

Expr Halide::Internal::lower_widening_sub ( const Expr & a,
const Expr & b )

◆ lower_widening_shift_left()

Expr Halide::Internal::lower_widening_shift_left ( const Expr & a,
const Expr & b )

◆ lower_widening_shift_right()

Expr Halide::Internal::lower_widening_shift_right ( const Expr & a,
const Expr & b )

◆ lower_rounding_shift_left()

Expr Halide::Internal::lower_rounding_shift_left ( const Expr & a,
const Expr & b )

◆ lower_rounding_shift_right()

Expr Halide::Internal::lower_rounding_shift_right ( const Expr & a,
const Expr & b )

◆ lower_saturating_add()

Expr Halide::Internal::lower_saturating_add ( const Expr & a,
const Expr & b )

◆ lower_saturating_sub()

Expr Halide::Internal::lower_saturating_sub ( const Expr & a,
const Expr & b )

◆ lower_saturating_cast()

Expr Halide::Internal::lower_saturating_cast ( const Type & t,
const Expr & a )

◆ lower_halving_add()

Expr Halide::Internal::lower_halving_add ( const Expr & a,
const Expr & b )

◆ lower_halving_sub()

Expr Halide::Internal::lower_halving_sub ( const Expr & a,
const Expr & b )

◆ lower_rounding_halving_add()

Expr Halide::Internal::lower_rounding_halving_add ( const Expr & a,
const Expr & b )

◆ lower_sorted_avg()

Expr Halide::Internal::lower_sorted_avg ( const Expr & a,
const Expr & b )

◆ lower_mul_shift_right()

Expr Halide::Internal::lower_mul_shift_right ( const Expr & a,
const Expr & b,
const Expr & q )

◆ lower_rounding_mul_shift_right()

Expr Halide::Internal::lower_rounding_mul_shift_right ( const Expr & a,
const Expr & b,
const Expr & q )

◆ lower_intrinsic()

Expr Halide::Internal::lower_intrinsic ( const Call * op)

Replace one of the above ops with equivalent arithmetic.
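A few of these ops have well-known equivalents in ordinary arithmetic. The sketch below shows them for 8-bit unsigned scalars, assuming the usual meanings (a widening add yields the next wider type, a halving add is the floor of the average, and a rounding halving add rounds the average up). The function names are hypothetical, and the expressions Halide actually generates may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Add in a type twice as wide, so the sum cannot overflow.
std::uint16_t widening_add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint16_t>(std::uint16_t{a} + std::uint16_t{b});
}

// Clamp the widened sum back into the 8-bit range.
std::uint8_t saturating_add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(std::min<std::uint16_t>(widening_add_u8(a, b), 255));
}

// floor((a + b) / 2): shared bits plus half of the differing bits.
std::uint8_t halving_add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((a & b) + ((a ^ b) >> 1));
}

// floor((a + b + 1) / 2): all set bits minus half of the differing bits.
std::uint8_t rounding_halving_add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((a | b) - ((a ^ b) >> 1));
}
```

The two halving forms never form the full sum, which is why they are useful on targets without a wider type.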

◆ find_intrinsics() [1/2]

Stmt Halide::Internal::find_intrinsics ( const Stmt & s)

Replace common arithmetic patterns with intrinsics.

◆ find_intrinsics() [2/2]

Expr Halide::Internal::find_intrinsics ( const Expr & e)

◆ lower_intrinsics() [1/2]

Expr Halide::Internal::lower_intrinsics ( const Expr & e)

The reverse of find_intrinsics.

◆ lower_intrinsics() [2/2]

Stmt Halide::Internal::lower_intrinsics ( const Stmt & s)

◆ flatten_nested_ramps() [1/2]

Stmt Halide::Internal::flatten_nested_ramps ( const Stmt & s)

Take a statement/expression and replace nested ramps and broadcasts.

◆ flatten_nested_ramps() [2/2]

Expr Halide::Internal::flatten_nested_ramps ( const Expr & e)

◆ check_types() [1/2]

template<typename Last >
void Halide::Internal::check_types ( const Tuple & t,
int idx )
inline

Definition at line 2616 of file Func.h.

References Halide::type_of(), and user_assert.

Referenced by check_types(), Halide::evaluate(), and Halide::evaluate_may_gpu().

◆ check_types() [2/2]

template<typename First , typename Second , typename... Rest>
void Halide::Internal::check_types ( const Tuple & t,
int idx )
inline

Definition at line 2625 of file Func.h.

References check_types().

◆ assign_results() [1/2]

template<typename Last >
void Halide::Internal::assign_results ( Realization & r,
int idx,
Last last )
inline

Definition at line 2631 of file Func.h.

References Buffer.

Referenced by assign_results(), Halide::evaluate(), and Halide::evaluate_may_gpu().

◆ assign_results() [2/2]

template<typename First , typename Second , typename... Rest>
void Halide::Internal::assign_results ( Realization & r,
int idx,
First first,
Second second,
Rest &&... rest )
inline

Definition at line 2637 of file Func.h.

References assign_results().

◆ schedule_scalar()

◆ deep_copy()

std::pair< std::vector< Function >, std::map< std::string, Function > > Halide::Internal::deep_copy ( const std::vector< Function > & outputs,
const std::map< std::string, Function > & env )

Deep copy an entire Function DAG.

◆ zero_gpu_loop_mins()

Stmt Halide::Internal::zero_gpu_loop_mins ( const Stmt & s)

Rewrite all GPU loops to have a min of zero.

◆ fuse_gpu_thread_loops()

Stmt Halide::Internal::fuse_gpu_thread_loops ( Stmt s)

Converts Halide's GPGPU IR to the OpenCL/CUDA/Metal model.

Within every loop over gpu block indices, fuse the inner loops over thread indices into a single loop (with predication to turn off threads). Push if conditions between GPU blocks to the innermost GPU threads. Also injects synchronization points as needed, and hoists shared allocations at the block level out into a single shared memory array, and heap allocations into a slice of a global pool allocated outside the kernel.

◆ fuzz_float_stores()

Stmt Halide::Internal::fuzz_float_stores ( const Stmt & s)

On every store of a floating point value, mask off the least-significant-bit of the mantissa.

We've found that whether or not this dramatically changes the output of a pipeline correlates very well with whether or not a pipeline will produce very different outputs on different architectures (e.g. with and without FMA). It's also a useful way to detect bad tests, such as those that expect exact floating point equality across platforms.
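On a single value, the masking step amounts to clearing bit 0 of the float's bit pattern. The sketch below shows this for a 32-bit float; clear_mantissa_lsb is a hypothetical name, and the real pass rewrites Store nodes in the IR rather than operating on concrete floats.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

// Clear the least-significant mantissa bit of a float32.
float clear_mantissa_lsb(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // reinterpret without undefined behavior
    bits &= ~std::uint32_t{1};
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

A value whose last mantissa bit is already zero is unchanged, and any other value moves by one unit in the last place toward zero.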

◆ generator_test()

void Halide::Internal::generator_test ( )

◆ parameter_constraints()

std::vector< Expr > Halide::Internal::parameter_constraints ( const Parameter & p)

◆ enum_to_string()

template<typename T >
HALIDE_NO_USER_CODE_INLINE std::string Halide::Internal::enum_to_string ( const std::map< std::string, T > & enum_map,
const T & t )

◆ enum_from_string()

template<typename T >
T Halide::Internal::enum_from_string ( const std::map< std::string, T > & enum_map,
const std::string & s )

Definition at line 308 of file Generator.h.

References user_assert.

◆ get_halide_type_enum_map()

const std::map< std::string, Halide::Type > & Halide::Internal::get_halide_type_enum_map ( )
extern

◆ halide_type_to_enum_string()

std::string Halide::Internal::halide_type_to_enum_string ( const Type & t)
inline

Definition at line 315 of file Generator.h.

References enum_to_string(), and get_halide_type_enum_map().

◆ halide_type_to_c_source()

std::string Halide::Internal::halide_type_to_c_source ( const Type & t)

◆ halide_type_to_c_type()

std::string Halide::Internal::halide_type_to_c_type ( const Type & t)

◆ get_registered_generators()

const GeneratorFactoryProvider & Halide::Internal::get_registered_generators ( )

Return a GeneratorFactoryProvider that knows about all the currently-registered C++ Generators.

◆ generate_filter_main() [1/2]

int Halide::Internal::generate_filter_main ( int argc,
char ** argv )

generate_filter_main() is a convenient wrapper for GeneratorRegistry::create() + compile_to_files(); it can be trivially wrapped by a "real" main() to produce a command-line utility for ahead-of-time filter compilation.

◆ generate_filter_main() [2/2]

int Halide::Internal::generate_filter_main ( int argc,
char ** argv,
const GeneratorFactoryProvider & generator_factory_provider )

This overload of generate_filter_main lets you provide your own provider for how to enumerate and/or create the generators based on registration name; this is useful if you want to re-use the 'main' logic but avoid the global Generator registry (e.g. for bindings in languages other than C++).

◆ parse_scalar()

template<typename T >
T Halide::Internal::parse_scalar ( const std::string & value)

Definition at line 2882 of file Generator.h.

References parse_scalar(), and user_assert.

Referenced by parse_scalar().

◆ parse_halide_type_list()

std::vector< Type > Halide::Internal::parse_halide_type_list ( const std::string & types)

◆ execute_generator()

void Halide::Internal::execute_generator ( const ExecuteGeneratorArgs & args)

Execute a Generator for AOT compilation – this provides the implementation of the command-line Generator interface generate_filter_main(), but with a structured API that is more suitable for calling directly from code (vs command line).

References execute_generator().

Referenced by execute_generator().

◆ inject_hexagon_rpc()

Stmt Halide::Internal::inject_hexagon_rpc ( Stmt s,
const Target & host_target,
Module & module )

Pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module.

◆ compile_module_to_hexagon_shared_object()

Buffer< uint8_t > Halide::Internal::compile_module_to_hexagon_shared_object ( const Module & device_code)

◆ optimize_hexagon_shuffles()

Stmt Halide::Internal::optimize_hexagon_shuffles ( const Stmt & s,
int lut_alignment )

Replace indirect and other loads with simple loads + vlut calls.

◆ scatter_gather_generator()

Stmt Halide::Internal::scatter_gather_generator ( Stmt s)

◆ optimize_hexagon_instructions()

Stmt Halide::Internal::optimize_hexagon_instructions ( Stmt s,
const Target & t )

Hexagon deinterleaves when performing widening operations, and interleaves when performing narrowing operations.

This pass rewrites widenings/narrowings to be explicit in the IR, and attempts to simplify away most of the interleaving/deinterleaving.

◆ native_deinterleave()

Expr Halide::Internal::native_deinterleave ( const Expr & x)

Generate deinterleave or interleave operations, operating on groups of vectors at a time.

◆ native_interleave()

Expr Halide::Internal::native_interleave ( const Expr & x)

◆ is_native_deinterleave()

bool Halide::Internal::is_native_deinterleave ( const Expr & x)

◆ is_native_interleave()

bool Halide::Internal::is_native_interleave ( const Expr & x)

◆ type_suffix() [1/4]

std::string Halide::Internal::type_suffix ( Type type,
bool signed_variants = true )

◆ type_suffix() [2/4]

std::string Halide::Internal::type_suffix ( const Expr & a,
bool signed_variants = true )

◆ type_suffix() [3/4]

std::string Halide::Internal::type_suffix ( const Expr & a,
const Expr & b,
bool signed_variants = true )

◆ type_suffix() [4/4]

std::string Halide::Internal::type_suffix ( const std::vector< Expr > & ops,
bool signed_variants = true )

◆ infer_arguments()

std::vector< InferredArgument > Halide::Internal::infer_arguments ( const Stmt & body,
const std::vector< Function > & outputs )

◆ call_extern_and_assert()

Stmt Halide::Internal::call_extern_and_assert ( const std::string & name,
const std::vector< Expr > & args )

A helper function to call an extern function, and assert that it returns 0.

◆ inject_host_dev_buffer_copies()

Stmt Halide::Internal::inject_host_dev_buffer_copies ( Stmt s,
const Target & t )

Inject calls to halide_device_malloc, halide_copy_to_device, and halide_copy_to_host as needed.

◆ inline_function() [1/3]

Stmt Halide::Internal::inline_function ( Stmt s,
const Function & f )

Inline a single named function, which must be pure.

For a pure function to be inlined, it must not have any specializations (i.e. it can only have one values definition).

◆ inline_function() [2/3]

Expr Halide::Internal::inline_function ( Expr e,
const Function & f )

◆ inline_function() [3/3]

void Halide::Internal::inline_function ( Function caller,
const Function & f )

◆ validate_schedule_inlined_function()

void Halide::Internal::validate_schedule_inlined_function ( Function f)

Check if the schedule of an inlined function is legal, throwing an error if it is not.

◆ ref_count()

template<typename T >
RefCount & Halide::Internal::ref_count ( const T * t)
noexcept

Because in this header we don't yet know how client classes store their RefCount (and we don't want to depend on the declarations of the client classes), any class that you want to hold onto via one of these must provide implementations of ref_count and destroy, which we forward-declare here.

E.g. if you want to use IntrusivePtr<MyClass>, then you should define something like this in MyClass.cpp (assuming MyClass has a field: mutable RefCount ref_count):

template<>
RefCount &ref_count<MyClass>(const MyClass *c) noexcept {
    return c->ref_count;
}
template<>
void destroy<MyClass>(const MyClass *c) {
    delete c;
}
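The pattern above can be tried in isolation. The sketch below is a standalone illustration with simplified stand-ins: this RefCount and ToyIntrusivePtr are hypothetical and much smaller than Halide's own classes, and the destroyed counter exists only so the behavior can be observed.

```cpp
#include <atomic>
#include <cassert>

// Simplified stand-in for a reference counter.
struct RefCount {
    std::atomic<int> count{0};
};

// Forward declarations only: each client class supplies its own specializations.
template<typename T> RefCount &ref_count(const T *t) noexcept;
template<typename T> void destroy(const T *t);

// A minimal intrusive pointer built on those two hooks.
template<typename T>
class ToyIntrusivePtr {
    T *ptr = nullptr;
    static void incref(T *p) { if (p) ref_count(p).count++; }
    static void decref(T *p) {
        if (p && --ref_count(p).count == 0) destroy(p);
    }
public:
    explicit ToyIntrusivePtr(T *p) : ptr(p) { incref(ptr); }
    ToyIntrusivePtr(const ToyIntrusivePtr &other) : ptr(other.ptr) { incref(ptr); }
    ToyIntrusivePtr &operator=(const ToyIntrusivePtr &other) {
        incref(other.ptr);  // increment first so self-assignment is safe
        decref(ptr);
        ptr = other.ptr;
        return *this;
    }
    ~ToyIntrusivePtr() { decref(ptr); }
    bool is_sole_reference() const { return ptr && ref_count(ptr).count == 1; }
};

// A client class that stores its own count in a mutable field.
struct MyClass {
    mutable RefCount ref_count;
    static int destroyed;  // counts deletions, for demonstration only
    ~MyClass() { destroyed++; }
};
int MyClass::destroyed = 0;

template<> RefCount &ref_count<MyClass>(const MyClass *c) noexcept { return c->ref_count; }
template<> void destroy<MyClass>(const MyClass *c) { delete c; }
```

Because the pointer class only sees the forward declarations, it never needs the full definition of the client class in its own header.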

Referenced by Halide::Internal::IntrusivePtr< T >::is_sole_reference().

◆ destroy()

template<typename T >
void Halide::Internal::destroy ( const T * t)

◆ equal_impl()

bool Halide::Internal::equal_impl ( const IRNode & a,
const IRNode & b )

Referenced by equal(), and graph_equal().

◆ graph_equal_impl()

bool Halide::Internal::graph_equal_impl ( const IRNode & a,
const IRNode & b )

◆ less_than_impl()

bool Halide::Internal::less_than_impl ( const IRNode & a,
const IRNode & b )

Referenced by less_than().

◆ graph_less_than_impl()

bool Halide::Internal::graph_less_than_impl ( const IRNode & a,
const IRNode & b )

Referenced by graph_less_than().

◆ equal() [2/4]

HALIDE_ALWAYS_INLINE bool Halide::Internal::equal ( const Expr & a,
int b )

Compare an Expr to an int literal.

This is a somewhat common use of equal in tests. Making this separate avoids constructing an Expr out of the int literal just to check if it's equal to a.

Definition at line 28 of file IREquality.h.

References Halide::Internal::IRHandle::as(), Halide::Int(), and Halide::Expr::type().

◆ equal() [3/4]

HALIDE_ALWAYS_INLINE bool Halide::Internal::equal ( const IRNode & a,
const IRNode & b )

Check if two defined Stmts or Exprs are equal.

Definition at line 38 of file IREquality.h.

References equal_impl(), and Halide::Internal::IRNode::node_type.

◆ equal() [4/4]

HALIDE_ALWAYS_INLINE bool Halide::Internal::equal ( const IRHandle & a,
const IRHandle & b )

Check if two possibly-undefined Stmts or Exprs are equal.

Definition at line 50 of file IREquality.h.

References Halide::Internal::IntrusivePtr< T >::defined(), equal(), and Halide::Internal::IntrusivePtr< T >::get().

◆ graph_equal() [1/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_equal ( const IRNode & a,
const IRNode & b )

Check if two defined Stmts or Exprs are equal.

Safe to call on Exprs that haven't been passed to common_subexpression_elimination.

Definition at line 63 of file IREquality.h.

References equal_impl(), and Halide::Internal::IRNode::node_type.

◆ graph_equal() [2/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_equal ( const IRHandle & a,
const IRHandle & b )

Check if two possibly-undefined Stmts or Exprs are equal.

Safe to call on Exprs that haven't been passed to common_subexpression_elimination.

Definition at line 76 of file IREquality.h.

References Halide::Internal::IntrusivePtr< T >::defined(), equal(), and Halide::Internal::IntrusivePtr< T >::get().

◆ less_than() [1/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::less_than ( const IRNode & a,
const IRNode & b )

Check if two defined Stmts or Exprs are in a lexicographic order.

For use in map keys.

Definition at line 89 of file IREquality.h.

References less_than_impl(), and Halide::Internal::IRNode::node_type.

Referenced by less_than(), and Halide::Internal::IRDeepCompare::operator()().

◆ less_than() [2/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::less_than ( const IRHandle & a,
const IRHandle & b )

Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.

For use in map keys.

Definition at line 102 of file IREquality.h.

References Halide::Internal::IntrusivePtr< T >::defined(), Halide::Internal::IntrusivePtr< T >::get(), and less_than().

◆ graph_less_than() [1/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_less_than ( const IRNode & a,
const IRNode & b )

Check if two defined Stmts or Exprs are in a lexicographic order.

For use in map keys. Safe to use on Exprs that haven't been passed to common_subexpression_elimination.

Definition at line 118 of file IREquality.h.

References graph_less_than_impl(), and Halide::Internal::IRNode::node_type.

Referenced by graph_less_than(), and Halide::Internal::IRGraphDeepCompare::operator()().

◆ graph_less_than() [2/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_less_than ( const IRHandle & a,
const IRHandle & b )

Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.

For use in map keys. Safe to use on Exprs that haven't been passed to common_subexpression_elimination.

Definition at line 132 of file IREquality.h.

References Halide::Internal::IntrusivePtr< T >::defined(), Halide::Internal::IntrusivePtr< T >::get(), and graph_less_than().

◆ ir_equality_test()

void Halide::Internal::ir_equality_test ( )

◆ expr_match() [1/2]

bool Halide::Internal::expr_match ( const Expr & pattern,
const Expr & expr,
std::vector< Expr > & result )

Does the first expression have the same structure as the second? Variables in the first expression with the name * are interpreted as wildcards, and their matching equivalent in the second expression is placed in the vector given as the third argument.

Wildcards require the types to match. For the type bits and width, a 0 indicates "match anything". So an Int(8, 0) will match 8-bit integer vectors of any width (including scalars), and a UInt(0, 0) will match any unsigned integer type.

For example:

Expr x = Variable::make(Int(32), "*");
expr_match(x + x, 3 + (2*k), result)

should return true, and set result[0] to 3 and result[1] to 2*k.

◆ expr_match() [2/2]

bool Halide::Internal::expr_match ( const Expr & pattern,
const Expr & expr,
std::map< std::string, Expr > & result )

Does the first expression have the same structure as the second? Variables are matched consistently.

The first time a variable is matched, it assumes the value of the matching part of the second expression. Subsequent matches must be equal to the first match.

For example:

Var x("x"), y("y");
match(x*(x + y), a*(a + b), result)

should return true, and set result["x"] = a, and result["y"] = b.
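The consistent-binding rule can be illustrated with a toy matcher. This is a minimal sketch, not Halide's implementation: the `Node` tree, the helpers `var` and `bin`, and `toy_match` are all hypothetical stand-ins for Halide's IR and expr_match, and the sketch handles only variables and binary operators.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy expression tree (hypothetical, not Halide IR): a node is either a
// named variable leaf or a binary operator with two children.
struct Node;
using NodePtr = std::shared_ptr<const Node>;
struct Node {
    char op;           // '+', '*', or 'v' for a variable leaf
    std::string name;  // meaningful only when op == 'v'
    NodePtr a, b;      // meaningful only when op != 'v'
};

NodePtr var(const std::string &n) {
    return std::make_shared<Node>(Node{'v', n, nullptr, nullptr});
}
NodePtr bin(char op, NodePtr a, NodePtr b) {
    return std::make_shared<Node>(Node{op, "", std::move(a), std::move(b)});
}

// Structural equality of two toy trees.
bool equal(const NodePtr &x, const NodePtr &y) {
    if (x->op != y->op) return false;
    if (x->op == 'v') return x->name == y->name;
    return equal(x->a, y->a) && equal(x->b, y->b);
}

// Every variable in the pattern binds to the subtree it first meets;
// later occurrences must be structurally equal to that first binding.
bool toy_match(const NodePtr &pattern, const NodePtr &expr,
               std::map<std::string, NodePtr> &result) {
    if (pattern->op == 'v') {
        auto it = result.find(pattern->name);
        if (it == result.end()) {
            result[pattern->name] = expr;  // first match: record the binding
            return true;
        }
        return equal(it->second, expr);    // later match: must agree
    }
    if (pattern->op != expr->op) return false;
    return toy_match(pattern->a, expr->a, result) &&
           toy_match(pattern->b, expr->b, result);
}
```

With this sketch, matching x*(x + y) against a*(a + b) binds x to a and y to b, while matching it against a*(c + b) fails because x would need two different values.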

◆ with_lanes()

Expr Halide::Internal::with_lanes ( const Expr & x,
int lanes )

Rewrite the expression x to have lanes lanes.

This is useful for substituting the results of expr_match into a pattern expression.

◆ expr_match_test()

void Halide::Internal::expr_match_test ( )

◆ mutate_region()

template<typename Mutator , typename... Args>
std::pair< Region, bool > Halide::Internal::mutate_region ( Mutator * mutator,
const Region & bounds,
Args &&... args )

A helper function for mutator-like things to mutate regions.

Definition at line 124 of file IRMutator.h.

References Halide::Internal::IntrusivePtr< T >::same_as().

◆ is_const() [1/2]

bool Halide::Internal::is_const ( const Expr & e)

Is the expression either an IntImm, a FloatImm, a StringImm, or a Cast of the same, or a Ramp or Broadcast of the same.

Doesn't do any constant folding.

Referenced by Halide::Internal::IRMatcher::IsConst< A >::make_folded_const().

◆ is_const() [2/2]

bool Halide::Internal::is_const ( const Expr & e,
int64_t v )

Is the expression an IntImm or FloatImm of a particular value, or a Cast or Broadcast of the same.

◆ as_const_int()

std::optional< int64_t > Halide::Internal::as_const_int ( const Expr & e)

If an expression is an IntImm or a Broadcast of an IntImm, return its value.

Otherwise returns std::nullopt.

◆ as_const_uint()

std::optional< uint64_t > Halide::Internal::as_const_uint ( const Expr & e)

If an expression is a UIntImm or a Broadcast of a UIntImm, return its value.

Otherwise returns std::nullopt.

◆ as_const_float()

std::optional< double > Halide::Internal::as_const_float ( const Expr & e)

If an expression is a FloatImm or a Broadcast of a FloatImm, return its value.

Otherwise returns std::nullopt.

◆ is_const_power_of_two_integer() [1/3]

std::optional< int > Halide::Internal::is_const_power_of_two_integer ( const Expr & e)

Is the expression a constant integer power of two.

Returns log base two of the expression if it is, or std::nullopt if not. Also returns std::nullopt for non-integer types.
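The return convention (log base two on success, std::nullopt otherwise) can be sketched for a plain 64-bit integer. This is an illustration only: `log2_if_power_of_two` is a hypothetical name, and the code is not taken from Halide.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper, not Halide's implementation: returns log2(v) when v
// is a positive power of two, and std::nullopt otherwise.
std::optional<int> log2_if_power_of_two(int64_t v) {
    // A positive power of two has exactly one bit set, so clearing the
    // lowest set bit with v & (v - 1) must leave zero.
    if (v <= 0 || (v & (v - 1)) != 0) {
        return std::nullopt;
    }
    int shift = 0;
    while ((v >> shift) != 1) {
        ++shift;  // count how far the single set bit is from bit 0
    }
    return shift;
}
```

A caller can then test the optional directly, for example `if (auto k = log2_if_power_of_two(n)) { /* use *k as a shift amount */ }`.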

◆ is_const_power_of_two_integer() [2/3]

std::optional< int > Halide::Internal::is_const_power_of_two_integer ( uint64_t )

◆ is_const_power_of_two_integer() [3/3]

std::optional< int > Halide::Internal::is_const_power_of_two_integer ( int64_t )

◆ is_positive_const()

bool Halide::Internal::is_positive_const ( const Expr & e)

Is the expression a const (as defined by is_const), and also strictly greater than zero (in all lanes, if a vector expression).

◆ is_negative_const()

bool Halide::Internal::is_negative_const ( const Expr & e)

Is the expression a const (as defined by is_const), and also strictly less than zero (in all lanes, if a vector expression).

◆ is_undef()

bool Halide::Internal::is_undef ( const Expr & e)

Is the expression an undef.

◆ is_const_zero()

bool Halide::Internal::is_const_zero ( const Expr & e)

Is the expression a const (as defined by is_const), and also equal to zero (in all lanes, if a vector expression).

Referenced by Halide::Internal::IRMatcher::NegateOp< A >::match().

◆ is_const_one()

bool Halide::Internal::is_const_one ( const Expr & e)

Is the expression a const (as defined by is_const), and also equal to one (in all lanes, if a vector expression).

Referenced by Halide::Internal::IRMatcher::CanProve< A, Prover >::make_folded_const().

◆ is_no_op()

bool Halide::Internal::is_no_op ( const Stmt & s)

Is the statement a no-op (which we represent as either an undefined Stmt, or as an Evaluate node of a constant)

◆ is_pure()

bool Halide::Internal::is_pure ( const Expr & e)

Does the expression 1) Take on the same value no matter where it appears in a Stmt, and 2) Evaluating it has no side-effects.

◆ make_const() [1/12]

◆ make_const() [2/12]

Expr Halide::Internal::make_const ( Type t,
uint64_t val )

◆ make_const() [3/12]

Expr Halide::Internal::make_const ( Type t,
double val )

◆ make_const() [4/12]

Expr Halide::Internal::make_const ( Type t,
int32_t val )
inline

Definition at line 85 of file IROperator.h.

References make_const().

◆ make_const() [5/12]

Expr Halide::Internal::make_const ( Type t,
uint32_t val )
inline

Definition at line 88 of file IROperator.h.

References make_const().

◆ make_const() [6/12]

Expr Halide::Internal::make_const ( Type t,
int16_t val )
inline

Definition at line 91 of file IROperator.h.

References make_const().

◆ make_const() [7/12]

Expr Halide::Internal::make_const ( Type t,
uint16_t val )
inline

Definition at line 94 of file IROperator.h.

References make_const().

◆ make_const() [8/12]

Expr Halide::Internal::make_const ( Type t,
int8_t val )
inline

Definition at line 97 of file IROperator.h.

References make_const().

◆ make_const() [9/12]

Expr Halide::Internal::make_const ( Type t,
uint8_t val )
inline

Definition at line 100 of file IROperator.h.

References make_const().

◆ make_const() [10/12]

Expr Halide::Internal::make_const ( Type t,
bool val )
inline

Definition at line 103 of file IROperator.h.

References make_const().

◆ make_const() [11/12]

Expr Halide::Internal::make_const ( Type t,
float val )
inline

Definition at line 106 of file IROperator.h.

References make_const().

◆ make_const() [12/12]

Expr Halide::Internal::make_const ( Type t,
float16_t val )
inline

Definition at line 109 of file IROperator.h.

References make_const().

◆ make_signed_integer_overflow()

Expr Halide::Internal::make_signed_integer_overflow ( Type type)

Construct a unique signed_integer_overflow Expr.

Referenced by Halide::Internal::IRMatcher::make_const_special_expr().

◆ is_signed_integer_overflow()

bool Halide::Internal::is_signed_integer_overflow ( const Expr & expr)

Check if an expression is a signed_integer_overflow.

◆ check_representable()

void Halide::Internal::check_representable ( Type t,
int64_t val )

Check if a constant value can be correctly represented as the given type.

◆ make_bool()

Expr Halide::Internal::make_bool ( bool val,
int lanes = 1 )

Construct a boolean constant from a C++ boolean value.

May also be a vector if lanes is given. It is not possible to coerce a C++ boolean to Expr because if we provide such a path then char objects can ambiguously be converted to Halide Expr or to std::string. The problem is that C++ does not have a real bool type - it is in fact close enough to char that C++ does not know how to distinguish them. make_bool is the explicit coercion.

◆ make_zero()

Expr Halide::Internal::make_zero ( Type t)

Construct the representation of zero in the given type.

Referenced by Halide::Internal::IRMatcher::NegateOp< A >::make().

◆ make_one()

Expr Halide::Internal::make_one ( Type t)

Construct the representation of one in the given type.

◆ make_two()

Expr Halide::Internal::make_two ( Type t)

Construct the representation of two in the given type.

◆ const_true()

Expr Halide::Internal::const_true ( int lanes = 1)

Construct the constant boolean true.

May also be a vector of trues, if a lanes argument is given.

◆ const_false()

Expr Halide::Internal::const_false ( int lanes = 1)

Construct the constant boolean false.

May also be a vector of falses, if a lanes argument is given.

◆ lossless_cast()

Expr Halide::Internal::lossless_cast ( Type t,
Expr e,
std::map< Expr, ConstantInterval, ExprCompare > * cache = nullptr )

Attempt to cast an expression to a smaller type while provably not losing information.

If it can't be done, return an undefined Expr.

Optionally accepts a map that gives the constant bounds of exprs already analyzed to avoid redoing work across many calls to lossless_cast. It is not safe to use this optional map in contexts where the same Expr object may take on a different value. For example: (let x = 4 in some_expr_object) + (let x = 5 in the_same_expr_object). It is safe to use it after uniquify_variable_names has been run.

◆ lossless_negate()

Expr Halide::Internal::lossless_negate ( const Expr & x)

Attempt to negate x without introducing new IR and without overflow.

If it can't be done, return an undefined Expr.

◆ match_types()

void Halide::Internal::match_types ( Expr & a,
Expr & b )

Coerce the two expressions to have the same type, using C-style casting rules.

For the purposes of casting, a boolean type is UInt(1). We use the following procedure:

If the types already match, do nothing.

Then, if one type is a vector and the other is a scalar, the scalar is broadcast to match the vector width, and we continue.

Then, if one type is floating-point and the other is not, the non-float is cast to the floating-point type, and we're done.

Then, if both types are unsigned ints, the one with fewer bits is cast to match the one with more bits and we're done.

Then, if both types are signed ints, the one with fewer bits is cast to match the one with more bits and we're done.

Finally, if one type is an unsigned int and the other type is a signed int, both are cast to a signed int with the greater of the two bit-widths. For example, matching an Int(8) with a UInt(16) results in an Int(16).
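The procedure above can be sketched on a simplified type descriptor. This is an illustration under stated assumptions: `SimpleType`, `Code`, and `common_type` are hypothetical and do not use Halide::Type. Two cases that the text does not spell out are handled by assumption here: two floating-point types of different widths resolve to the wider one, and mismatched vector widths simply take the larger lane count.

```cpp
#include <algorithm>

// Hypothetical stand-in for a Halide type: just the fields the rules need.
enum class Code { Int, UInt, Float };
struct SimpleType {
    Code code;
    int bits;
    int lanes;
};

// Returns the type both operands would end up with under the listed rules.
SimpleType common_type(SimpleType a, SimpleType b) {
    // A scalar is broadcast to the vector width of the other operand.
    int lanes = std::max(a.lanes, b.lanes);
    a.lanes = b.lanes = lanes;

    // Types already match: nothing to do.
    if (a.code == b.code && a.bits == b.bits) return a;

    // Float versus non-float: the non-float is cast to the float type.
    if (a.code == Code::Float && b.code != Code::Float) return a;
    if (b.code == Code::Float && a.code != Code::Float) return b;

    // Same kind, different widths: widen to the larger bit count.
    // (For two floats this is an assumption, not stated in the text.)
    if (a.code == b.code) return {a.code, std::max(a.bits, b.bits), lanes};

    // One signed and one unsigned int: signed int of the greater width.
    return {Code::Int, std::max(a.bits, b.bits), lanes};
}
```

For instance, `common_type({Code::Int, 8, 1}, {Code::UInt, 16, 1})` yields a 16-bit signed scalar, which agrees with the Int(8) and UInt(16) example in the text.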

◆ match_types_bitwise()

void Halide::Internal::match_types_bitwise ( Expr & a,
Expr & b,
const char * op_name )

Asserts that both expressions are integer types and are either both signed or both unsigned.

If one argument is scalar and the other a vector, the scalar is broadcast to have the same number of lanes as the vector. If one expression is of narrower type than the other, it is widened to the bit width of the wider.

◆ halide_log()

Expr Halide::Internal::halide_log ( const Expr & a)

Halide's vectorizable transcendentals.

◆ halide_exp()

Expr Halide::Internal::halide_exp ( const Expr & a)

◆ halide_erf()

Expr Halide::Internal::halide_erf ( const Expr & a)

◆ raise_to_integer_power()

Expr Halide::Internal::raise_to_integer_power ( Expr a,
int64_t b )

Raise an expression to an integer power by repeatedly multiplying it by itself.
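As a scalar analogy, an integer power can be built from multiplications alone. The sketch below is an assumption-laden illustration, not Halide's code: it works on `double` rather than Expr, uses repeated squaring to keep the number of multiplications small, and treats a negative exponent as a reciprocal. The name `power_by_multiplication` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical scalar analogue: computes a^b using only multiplication
// (plus one division when b is negative).
double power_by_multiplication(double a, int64_t b) {
    // Take the magnitude as unsigned so that INT64_MIN is handled safely.
    uint64_t n = b < 0 ? 0 - static_cast<uint64_t>(b)
                       : static_cast<uint64_t>(b);
    double result = 1.0;
    double square = a;  // holds a^(2^i) on the i-th iteration
    while (n != 0) {
        if (n & 1) {
            result *= square;  // this bit of the exponent contributes
        }
        square *= square;
        n >>= 1;
    }
    return b < 0 ? 1.0 / result : result;
}
```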

◆ split_into_ands()

void Halide::Internal::split_into_ands ( const Expr & cond,
std::vector< Expr > & result )

Split a boolean condition into a vector of ANDs.

If 'cond' is undefined, return an empty vector.

◆ strided_ramp_base()

Expr Halide::Internal::strided_ramp_base ( const Expr & e,
int stride = 1 )

If e is a ramp expression with the given stride (default 1), return the base; otherwise return an undefined Expr.

◆ mod_imp()

template<typename T >
T Halide::Internal::mod_imp ( T a,
T b )
inline

Implementations of division and mod that are specific to Halide.

Use these implementations; do not use native C division or mod to simplify Halide expressions. Halide division and modulo satisfy the Euclidean definition of division for integers a and b:

when b != 0,
    (a/b)*b + a%b = a
    0 <= a%b < |b|

Additionally, mod by zero returns zero, and div by zero returns zero. This makes mod and div total functions.
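The Euclidean rules (a remainder in the range [0, |b|), and a zero result when b is zero) can be sketched in plain C++ for comparison with native `/` and `%`, which truncate toward zero. The names `euclidean_mod` and `euclidean_div` are hypothetical, and this is not the code in IROperator.h; overflow corner cases such as INT64_MIN with b = -1 are deliberately ignored.

```cpp
#include <cstdint>

// Hypothetical sketch of Euclidean modulo on 64-bit integers.
int64_t euclidean_mod(int64_t a, int64_t b) {
    if (b == 0) return 0;  // mod by zero is defined to be zero
    int64_t r = a % b;     // native remainder takes the sign of a
    if (r < 0) {
        r += (b < 0 ? -b : b);  // shift the remainder into [0, |b|)
    }
    return r;
}

// Hypothetical sketch of the matching division: chosen so that
// euclidean_div(a, b) * b + euclidean_mod(a, b) == a whenever b != 0.
int64_t euclidean_div(int64_t a, int64_t b) {
    if (b == 0) return 0;  // div by zero is defined to be zero
    return (a - euclidean_mod(a, b)) / b;
}
```

For a = -7 and b = 2, native C++ gives a quotient of -3 and a remainder of -1, whereas this sketch gives -4 and 1, keeping the remainder non-negative.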

Definition at line 252 of file IROperator.h.

References Halide::Type::is_float(), Halide::Type::is_int(), and Halide::type_of().

Referenced by Halide::Internal::Simplify::ExprInfo::cast_to(), Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), and Halide::Internal::Simplify::ExprInfo::trim_bounds_using_alignment().

◆ div_imp()

◆ mod_imp< float >()

template<>
float Halide::Internal::mod_imp< float > ( float a,
float b )
inline

Definition at line 298 of file IROperator.h.

◆ mod_imp< double >()

template<>
double Halide::Internal::mod_imp< double > ( double a,
double b )
inline

Definition at line 304 of file IROperator.h.

◆ div_imp< float >()

template<>
float Halide::Internal::div_imp< float > ( float a,
float b )
inline

Definition at line 310 of file IROperator.h.

◆ div_imp< double >()

template<>
double Halide::Internal::div_imp< double > ( double a,
double b )
inline

Definition at line 314 of file IROperator.h.

◆ remove_likelies() [1/2]

Expr Halide::Internal::remove_likelies ( const Expr & e)

Return an Expr that is identical to the input Expr, but with all calls to likely() and likely_if_innermost() removed.

◆ remove_likelies() [2/2]

Stmt Halide::Internal::remove_likelies ( const Stmt & s)

Return a Stmt that is identical to the input Stmt, but with all calls to likely() and likely_if_innermost() removed.

◆ remove_promises() [1/2]

Expr Halide::Internal::remove_promises ( const Expr & e)

Return an Expr that is identical to the input Expr, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.

◆ remove_promises() [2/2]

Stmt Halide::Internal::remove_promises ( const Stmt & s)

Return a Stmt that is identical to the input Stmt, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.

◆ unwrap_tags()

Expr Halide::Internal::unwrap_tags ( const Expr & e)

If the expression is a tag helper call, remove it and return the tagged expression.

If not, returns the expression.

◆ collect_print_args() [1/3]

HALIDE_NO_USER_CODE_INLINE void Halide::Internal::collect_print_args ( std::vector< Expr > & args)
inline

◆ collect_print_args() [2/3]

template<typename... Args>
HALIDE_NO_USER_CODE_INLINE void Halide::Internal::collect_print_args ( std::vector< Expr > & args,
const char * arg,
Args &&... more_args )
inline

Definition at line 352 of file IROperator.h.

References collect_print_args().

◆ collect_print_args() [3/3]

template<typename... Args>
HALIDE_NO_USER_CODE_INLINE void Halide::Internal::collect_print_args ( std::vector< Expr > & args,
Expr arg,
Args &&... more_args )
inline

Definition at line 358 of file IROperator.h.

References collect_print_args().

◆ requirement_failed_error()

Expr Halide::Internal::requirement_failed_error ( Expr condition,
const std::vector< Expr > & args )

◆ memoize_tag_helper()

Expr Halide::Internal::memoize_tag_helper ( Expr result,
const std::vector< Expr > & cache_key_values )

Referenced by Halide::memoize_tag().

◆ reset_random_counters()

void Halide::Internal::reset_random_counters ( )

Reset the counters used for random-number seeds in random_float/int/uint.

(Note that the counters are incremented for each call, even if a seed is passed in.) This is used for multitarget compilation to ensure that each subtarget gets the same sequence of random numbers.

◆ unreachable() [1/2]

Expr Halide::Internal::unreachable ( Type t = Int(32))

Return an expression that should never be evaluated.

Expressions that depend on unreachable values are also unreachable, and statements that execute unreachable expressions are also considered unreachable.

◆ unreachable() [2/2]

template<typename T >
Expr Halide::Internal::unreachable ( )
inline

Definition at line 1361 of file IROperator.h.

References Halide::type_of(), and unreachable().

Referenced by unreachable().

◆ promise_clamped()

Expr Halide::Internal::promise_clamped ( const Expr & value,
const Expr & min,
const Expr & max )

FOR INTERNAL USE ONLY.

An entirely unchecked version of unsafe_promise_clamped, used inside the compiler as an annotation of the known bounds of an Expr when it has proved something is bounded and wants to record that fact for later passes (notably bounds inference) to exploit. This gets introduced by GuardWithIf tail strategies, because the bounds machinery has a hard time exploiting if statement conditions.

Unlike unsafe_promise_clamped, this expression is context-dependent, because 'value' might be statically bounded at some point in the IR (e.g. due to a containing if statement), but not elsewhere.

This intrinsic always evaluates to its first argument. If this value is used by a side-effecting operation and it is outside the range specified by its second and third arguments, behavior is undefined. The compiler can therefore assume that the value is within the range given and optimize accordingly. Note that this permits promise_clamped to evaluate to something outside of the range, provided that this value is not used.

Note that this produces an intrinsic that is marked as 'pure' and thus is allowed to be hoisted, etc.; thus, extra care must be taken with its use.

◆ operator<<() [8/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
IRNodeType  )

Emit a halide node type on an output stream (such as std::cout) in human-readable form.

◆ operator<<() [9/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const AssociativePattern &  )

Emit a halide associative pattern on an output stream (such as std::cout) in a human-readable form.

◆ operator<<() [10/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const AssociativeOp &  )

Emit a halide associative op on an output stream (such as std::cout) in a human-readable form.

◆ operator<<() [11/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const ForType &  )

Emit a halide for loop type (vectorized, serial, etc) in a human readable form.

◆ operator<<() [12/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const VectorReduce::Operator &  )

Emit a horizontal vector reduction op in human-readable form.

◆ operator<<() [13/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const NameMangling &  )

Emit a halide name mangling value in a human readable format.

◆ operator<<() [14/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const LinkageType &  )

Emit a halide linkage value in a human readable format.

◆ operator<<() [15/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const DimType &  )

Emit a halide dimension type in human-readable format.

◆ operator<<() [16/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & out,
const Closure & c )

Emit a Closure in human-readable form.

◆ operator<<() [17/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & out,
const Interval & c )

Emit an Interval in human-readable form.

◆ operator<<() [18/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & out,
const ConstantInterval & c )

Emit a ConstantInterval in human-readable form.

◆ operator<<() [19/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & out,
const ModulusRemainder & c )

Emit a ModulusRemainder in human-readable form.

◆ operator<<() [20/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const Indentation &  )

◆ lldb_string() [1/3]

std::string Halide::Internal::lldb_string ( const Expr & )

Debugging helpers for LLDB.

◆ lldb_string() [2/3]

std::string Halide::Internal::lldb_string ( const Internal::BaseExprNode * )

Debugging helpers for LLDB.

◆ lldb_string() [3/3]

std::string Halide::Internal::lldb_string ( const Stmt & )

Debugging helpers for LLDB.

◆ get_symbol_address()

void * Halide::Internal::get_symbol_address ( const char * s)

◆ lower_lerp()

Expr Halide::Internal::lower_lerp ( Type final_type,
Expr zero_val,
Expr one_val,
const Expr & weight,
const Target & target )

Build Halide IR that computes a lerp.

Used by codegen targets that don't have a native lerp. The lerp is done in the type of the zero value. The final_type is a cast that should occur after the lerp. It's included because in some cases you can incorporate a final cast into the lerp math.

◆ hoist_loop_invariant_values()

Stmt Halide::Internal::hoist_loop_invariant_values ( Stmt )

Hoist loop-invariants out of inner loops.

This is especially important in cases where LLVM would not do it for us automatically. For example, it hoists loop invariants out of cuda kernels.

◆ hoist_loop_invariant_if_statements()

Stmt Halide::Internal::hoist_loop_invariant_if_statements ( Stmt )

Just hoist loop-invariant if statements as far up as possible.

Does not lift other values. It's useful to run this earlier in lowering to simplify the IR.

◆ iterator_to_pointer()

template<typename T >
auto Halide::Internal::iterator_to_pointer ( T iter) -> decltype(&*std::declval<T>())

Definition at line 119 of file LLVM_Headers.h.

◆ get_llvm_function_name() [1/2]

std::string Halide::Internal::get_llvm_function_name ( const llvm::Function * f)
inline

Definition at line 123 of file LLVM_Headers.h.

◆ get_llvm_function_name() [2/2]

std::string Halide::Internal::get_llvm_function_name ( const llvm::Function & f)
inline

Definition at line 127 of file LLVM_Headers.h.

◆ get_llvm_struct_type_by_name()

llvm::StructType * Halide::Internal::get_llvm_struct_type_by_name ( llvm::Module * module,
const char * name )
inline

Definition at line 131 of file LLVM_Headers.h.

◆ get_triple_for_target()

llvm::Triple Halide::Internal::get_triple_for_target ( const Target & target)

Return the llvm::Triple that corresponds to the given Halide Target.

◆ get_initial_module_for_target()

std::unique_ptr< llvm::Module > Halide::Internal::get_initial_module_for_target ( Target ,
llvm::LLVMContext * ,
bool for_shared_jit_runtime = false,
bool just_gpu = false )

Create an llvm module containing the support code for a given target.

◆ get_initial_module_for_ptx_device()

std::unique_ptr< llvm::Module > Halide::Internal::get_initial_module_for_ptx_device ( Target ,
llvm::LLVMContext * c )

Create an llvm module containing the support code for ptx device.

◆ add_bitcode_to_module()

void Halide::Internal::add_bitcode_to_module ( llvm::LLVMContext * context,
llvm::Module & module,
const std::vector< uint8_t > & bitcode,
const std::string & name )

Link a block of llvm bitcode into an llvm module.

◆ link_with_wasm_jit_runtime()

std::unique_ptr< llvm::Module > Halide::Internal::link_with_wasm_jit_runtime ( llvm::LLVMContext * c,
const Target & t,
std::unique_ptr< llvm::Module > extra_module )

Take the llvm::Module(s) in extra_modules (if any), add the runtime modules needed for the WASM JIT, and link into a single llvm::Module.

◆ loop_carry()

Stmt Halide::Internal::loop_carry ( Stmt ,
int max_carried_values = 8 )

Reuse loads done on previous loop iterations by stashing them in induction variables instead of redoing the load.

If the loads are predicated, the predicates need to match. Can be an optimization or pessimization depending on how good the L1 cache is on the architecture and how many memory issue slots there are. Currently only intended for Hexagon.

◆ lower()

Module Halide::Internal::lower ( const std::vector< Function > & output_funcs,
const std::string & pipeline_name,
const Target & t,
const std::vector< Argument > & args,
LinkageType linkage_type,
const std::vector< Stmt > & requirements = std::vector< Stmt >(),
bool trace_pipeline = false,
const std::vector< IRMutator * > & custom_passes = std::vector< IRMutator * >() )

Given a vector of scheduled halide functions, create a Module that evaluates it.

Automatically pulls in all the functions f depends on. Some stages of lowering may be target-specific. The Module may contain submodules for computation offloaded to another execution engine or API as well as buffers that are used in the passed in Stmt.

◆ lower_main_stmt()

Stmt Halide::Internal::lower_main_stmt ( const std::vector< Function > & output_funcs,
const std::string & pipeline_name,
const Target & t,
const std::vector< Stmt > & requirements = std::vector< Stmt >(),
bool trace_pipeline = false,
const std::vector< IRMutator * > & custom_passes = std::vector< IRMutator * >() )

Given a halide function with a schedule, create a statement that evaluates it.

Automatically pulls in all the functions f depends on. Some stages of lowering may be target-specific. Mostly used as a convenience function in tests that wish to assert some property of the lowered IR.

◆ lower_test()

void Halide::Internal::lower_test ( )

◆ lower_parallel_tasks()

Stmt Halide::Internal::lower_parallel_tasks ( const Stmt & s,
std::vector< LoweredFunc > & closure_implementations,
const std::string & name,
const Target & t )

◆ lower_warp_shuffles()

Stmt Halide::Internal::lower_warp_shuffles ( Stmt s,
const Target & t )

Rewrite access to things stored outside the loop over GPU lanes to use nvidia's warp shuffle instructions.

◆ inject_memoization()

Stmt Halide::Internal::inject_memoization ( const Stmt & s,
const std::map< std::string, Function > & env,
const std::string & name,
const std::vector< Function > & outputs )

Transform pipeline calls for Funcs scheduled with memoize to do a lookup call to the runtime cache implementation, and if there is a miss, compute the results and call the runtime to store it back to the cache.

Should leave non-memoized Funcs unchanged.

◆ rewrite_memoized_allocations()

Stmt Halide::Internal::rewrite_memoized_allocations ( const Stmt & s,
const std::map< std::string, Function > & env )

This should be called after Storage Flattening has added Allocation IR nodes.

It connects the memoization cache lookups to the Allocations so they point to the buffers from the memoization cache and those buffers are released when no longer used. Should not affect allocations for non-memoized Funcs.

◆ get_output_info()

std::map< OutputFileType, const OutputInfo > Halide::Internal::get_output_info ( const Target & target)

◆ operator+() [3/4]

ModulusRemainder Halide::Internal::operator+ ( const ModulusRemainder & a,
const ModulusRemainder & b )

◆ operator-() [3/4]

ModulusRemainder Halide::Internal::operator- ( const ModulusRemainder & a,
const ModulusRemainder & b )

◆ operator*() [3/4]

ModulusRemainder Halide::Internal::operator* ( const ModulusRemainder & a,
const ModulusRemainder & b )

◆ operator/() [3/4]

ModulusRemainder Halide::Internal::operator/ ( const ModulusRemainder & a,
const ModulusRemainder & b )

◆ operator%() [3/4]

ModulusRemainder Halide::Internal::operator% ( const ModulusRemainder & a,
const ModulusRemainder & b )

◆ operator+() [4/4]

ModulusRemainder Halide::Internal::operator+ ( const ModulusRemainder & a,
int64_t b )

◆ operator-() [4/4]

ModulusRemainder Halide::Internal::operator- ( const ModulusRemainder & a,
int64_t b )

◆ operator*() [4/4]

ModulusRemainder Halide::Internal::operator* ( const ModulusRemainder & a,
int64_t b )

◆ operator/() [4/4]

ModulusRemainder Halide::Internal::operator/ ( const ModulusRemainder & a,
int64_t b )

◆ operator%() [4/4]

ModulusRemainder Halide::Internal::operator% ( const ModulusRemainder & a,
int64_t b )

◆ modulus_remainder() [1/2]

ModulusRemainder Halide::Internal::modulus_remainder ( const Expr & e)

For things like alignment analysis, often it's helpful to know if an integer expression is some multiple of a constant plus some other constant.

For example, it is straightforward to deduce that ((10*x + 2)*(6*y - 3) - 1) is congruent to five modulo six.

We get the most information when the modulus is large. E.g. if something is congruent to 208 modulo 384, then we also know it's congruent to 0 mod 8, and we can possibly use it as an index for an aligned load. If all else fails, we can just say that an integer is congruent to zero modulo one.
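Both claims can be checked by brute force over a small range of inputs. The sketch below does not call Halide; the function names and the sampled ranges are arbitrary choices made for illustration.

```cpp
#include <cstdint>

// Non-negative remainder for a positive modulus m.
int64_t mod_nonneg(int64_t v, int64_t m) {
    int64_t r = v % m;
    return r < 0 ? r + m : r;
}

// Checks that (10*x + 2)*(6*y - 3) - 1 is congruent to 5 modulo 6 for
// every sampled x and y. The product is always a multiple of 6 because
// (10*x + 2) is even and (6*y - 3) is a multiple of 3.
bool example_is_five_mod_six() {
    for (int64_t x = -20; x <= 20; ++x) {
        for (int64_t y = -20; y <= 20; ++y) {
            int64_t e = (10 * x + 2) * (6 * y - 3) - 1;
            if (mod_nonneg(e, 6) != 5) return false;
        }
    }
    return true;
}

// Checks that any value congruent to 208 modulo 384 is a multiple of 8,
// since both 208 and 384 are multiples of 8.
bool congruence_implies_alignment() {
    for (int64_t k = -50; k <= 50; ++k) {
        if (mod_nonneg(384 * k + 208, 8) != 0) return false;
    }
    return true;
}
```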

◆ modulus_remainder() [2/2]

ModulusRemainder Halide::Internal::modulus_remainder ( const Expr & e,
const Scope< ModulusRemainder > & scope )

If we have alignment information about external variables, we can let the analysis know about that using this version of modulus_remainder.

◆ reduce_expr_modulo() [1/2]

HALIDE_MUST_USE_RESULT bool Halide::Internal::reduce_expr_modulo ( const Expr & e,
int64_t modulus,
int64_t * remainder )

Reduce an expression modulo some integer.

Returns true and assigns to remainder if an answer could be found.

◆ reduce_expr_modulo() [2/2]

HALIDE_MUST_USE_RESULT bool Halide::Internal::reduce_expr_modulo ( const Expr & e,
int64_t modulus,
int64_t * remainder,
const Scope< ModulusRemainder > & scope )

Reduce an expression modulo some integer.

Returns true and assigns to remainder if an answer could be found.

◆ modulus_remainder_test()

void Halide::Internal::modulus_remainder_test ( )

◆ gcd()

int64_t Halide::Internal::gcd ( int64_t ,
int64_t  )

The greatest common divisor of two integers.

Referenced by Halide::Internal::Autoscheduler::OptionalRational::operator+=().

◆ lcm()

int64_t Halide::Internal::lcm ( int64_t ,
int64_t  )

The least common multiple of two integers.

Referenced by Halide::Internal::Autoscheduler::OptionalRational::operator+=().

◆ derivative_bounds()

ConstantInterval Halide::Internal::derivative_bounds ( const Expr & e,
const std::string & var,
const Scope< ConstantInterval > & scope = Scope< ConstantInterval >::empty_scope() )

Find the bounds of the derivative of an expression.

The scope gives the bounds on the derivatives of any variables found.

◆ is_monotonic() [1/2]

Monotonic Halide::Internal::is_monotonic ( const Expr & e,
const std::string & var,
const Scope< ConstantInterval > & scope = Scope< ConstantInterval >::empty_scope() )

◆ is_monotonic() [2/2]

Monotonic Halide::Internal::is_monotonic ( const Expr & e,
const std::string & var,
const Scope< Monotonic > & scope )

◆ operator<<() [21/22]

std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const Monotonic & m )

Emit the monotonic class in human-readable form for debugging.

◆ is_monotonic_test()

void Halide::Internal::is_monotonic_test ( )

◆ inject_gpu_offload()

Stmt Halide::Internal::inject_gpu_offload ( const Stmt & s,
const Target & host_target )

Pull loops marked with GPU device APIs to a separate module, and call them through the appropriate host runtime module.

◆ optimize_shuffles()

Stmt Halide::Internal::optimize_shuffles ( Stmt s,
int lut_alignment )

◆ can_parallelize_rvar()

bool Halide::Internal::can_parallelize_rvar ( const std::string & rvar,
const std::string & func,
const Definition & r )

Returns whether or not Halide can prove that it is safe to parallelize an update definition across a specific variable.

If this returns true, it's definitely safe. If this returns false, it may still be safe, but Halide couldn't prove it.

◆ check_call_arg_types()

void Halide::Internal::check_call_arg_types ( const std::string & name,
std::vector< Expr > * args,
int dims )

Validate arguments to a call to a func, image or imageparam.

◆ has_uncaptured_likely_tag()

bool Halide::Internal::has_uncaptured_likely_tag ( const Expr & e,
const Scope<> & scope )

Return true if an expression uses a likely tag that isn't captured by an enclosing Select, Min, or Max.

The scope contains all vars that should be considered to have uncaptured likelies.

◆ has_likely_tag()

bool Halide::Internal::has_likely_tag ( const Expr & e,
const Scope<> & scope )

Return true if an expression uses a likely tag.

The scope contains all vars in scope that should be considered to have likely tags.

◆ partition_loops()

Stmt Halide::Internal::partition_loops ( Stmt s)

Partitions loop bodies into a prologue, a steady state, and an epilogue.

Finds the steady state by hunting for use of clamped ramps, or the 'likely' intrinsic.

◆ inject_placeholder_prefetch()

Stmt Halide::Internal::inject_placeholder_prefetch ( const Stmt & s,
const std::map< std::string, Function > & env,
const std::string & prefix,
const std::vector< PrefetchDirective > & prefetches )

Inject placeholder prefetches to 's'.

This placeholder prefetch does not yet have an explicit region to be prefetched. The region will be computed during the call to inject_prefetch.

◆ inject_prefetch()

Stmt Halide::Internal::inject_prefetch ( const Stmt & s,
const std::map< std::string, Function > & env )

Compute the actual region to be prefetched and place it in the placeholder prefetch.

Wrap the prefetch call in a condition when applicable.

◆ reduce_prefetch_dimension()

Stmt Halide::Internal::reduce_prefetch_dimension ( Stmt stmt,
const Target & t )

Reduce a multi-dimensional prefetch into a prefetch of lower dimension (max dimension of the prefetch is specified by target architecture).

This keeps the 'max_dim' innermost dimensions and adds loops for the rest of the dimensions. If a maximum prefetched-byte-size is specified (depending on the architecture), this also adds outer loops that tile the prefetches.

◆ hoist_prefetches()

Stmt Halide::Internal::hoist_prefetches ( const Stmt & s)

Hoist all the prefetches in a Block to the beginning of the Block.

This generally only happens when a loop with prefetches is unrolled; in some cases, LLVM's code generation can be suboptimal (unnecessary register spills) when prefetches are scattered through the loop. Hoisting to the top of the loop is a good way to mitigate this, at the cost of the prefetch calls possibly being less useful due to distance from use point. (This is a bit experimental and may need revisiting.) See also https://bugs.llvm.org/show_bug.cgi?id=51172

◆ print_loop_nest()

std::string Halide::Internal::print_loop_nest ( const std::vector< Function > & output_funcs)

Emit some simple pseudocode that shows the structure of the loop nest specified by this pipeline's schedule, and the schedules of the functions it uses.

◆ inject_profiling()

Stmt Halide::Internal::inject_profiling ( const Stmt & ,
const std::string & ,
const std::map< std::string, Function > & env )

Take a statement representing a Halide pipeline and insert high-resolution timing into the generated code (via spawning a thread that acts as a sampling profiler); summaries of execution times and counts will be logged at the end.

Should be done before storage flattening, but after all bounds inference.

◆ purify_index_math()

Expr Halide::Internal::purify_index_math ( const Expr & )

Bounds inference and related stages can lift integer bounds expressions out of if statements that guard against those integer expressions doing side-effecty things like dividing or modding by zero.

In those cases, if the lowering passes are functional, the value resulting from the division or mod is evaluated but not used. This mutator rewrites divs and mods in such expressions to fail silently (evaluate to undef) when the denominator is zero.

◆ qualify()

Expr Halide::Internal::qualify ( const std::string & prefix,
const Expr & value )

Prefix all variable names in the given expression with the prefix string.

◆ random_float()

Expr Halide::Internal::random_float ( const std::vector< Expr > & )

Return a random floating-point number between zero and one that varies deterministically based on the input expressions.

◆ random_int()

Expr Halide::Internal::random_int ( const std::vector< Expr > & )

Return a random unsigned integer between zero and 2^32-1 that varies deterministically based on the input expressions (which must be integers or unsigned integers).

◆ lower_random()

Expr Halide::Internal::lower_random ( const Expr & e,
const std::vector< VarOrRVar > & free_vars,
int tag )

Convert calls to random() to IR generated by random_float and random_int.

Tags all calls with the variables in free_vars, and the integer given as the last argument.

◆ realization_order()

std::pair< std::vector< std::string >, std::vector< std::vector< std::string > > > Halide::Internal::realization_order ( const std::vector< Function > & outputs,
std::map< std::string, Function > & env )

Given a bunch of functions that call each other, determine an order in which to do the scheduling.

This in turn influences the order in which stages are computed when there's no strict dependency between them. Currently just some arbitrary depth-first traversal of the call graph. In addition, determine grouping of functions with fused computation loops. The functions within the fused groups are sorted based on realization order. There should not be any dependencies among functions within a fused group. This pass will also populate the 'fused_pairs' list in the function's schedule. Return a pair of the realization order and the fused groups in that order.

◆ topological_order()

std::vector< std::string > Halide::Internal::topological_order ( const std::vector< Function > & outputs,
const std::map< std::string, Function > & env )

Given a bunch of functions that call each other, determine a topological order which stays constant regardless of the schedule.

This ordering adheres to the producer-consumer dependencies, i.e. a producer will come before its consumers in that order.
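A minimal sketch of one way to obtain such an order, assuming the call graph is acyclic and is given as a plain map from each function name to the names it calls. The `CallGraph` alias and the functions `visit` and `topological_order_sketch` are hypothetical and do not use Halide's `Function` type.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the environment: consumer name -> producers it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first post-order visit: all producers are emitted before the consumer.
void visit(const std::string &name, const CallGraph &calls,
           std::set<std::string> &done, std::vector<std::string> &order) {
    if (!done.insert(name).second) {
        return;  // already visited
    }
    auto it = calls.find(name);
    if (it != calls.end()) {
        for (const std::string &producer : it->second) {
            visit(producer, calls, done, order);
        }
    }
    order.push_back(name);
}

// Producers appear before their consumers. The result depends only on the
// call graph and the order of the outputs, not on any schedule.
std::vector<std::string> topological_order_sketch(
    const std::vector<std::string> &outputs, const CallGraph &calls) {
    std::set<std::string> done;
    std::vector<std::string> order;
    for (const std::string &out : outputs) {
        visit(out, calls, done, order);
    }
    return order;
}
```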

◆ rebase_loops_to_zero()

Stmt Halide::Internal::rebase_loops_to_zero ( const Stmt & )

Rewrite the mins of most loops to 0.

◆ split_predicate_test()

void Halide::Internal::split_predicate_test ( )

◆ is_func_trivial_to_inline()

bool Halide::Internal::is_func_trivial_to_inline ( const Function & func)

Return true if the cost of inlining a function is equivalent to the cost of calling the function directly.

◆ remove_dead_allocations()

Stmt Halide::Internal::remove_dead_allocations ( const Stmt & s)

Find Allocate/Free pairs that are never loaded from or stored to, and remove them from the Stmt.

This doesn't touch Realize/Call nodes and so must be called after storage_flattening.

◆ remove_extern_loops()

Stmt Halide::Internal::remove_extern_loops ( const Stmt & s)

Removes placeholder loops for extern stages.

◆ remove_undef()

Stmt Halide::Internal::remove_undef ( Stmt s)

Removes stores that depend on undef values, and statements that only contain such stores.

◆ schedule_functions()

Stmt Halide::Internal::schedule_functions ( const std::vector< Function > & outputs,
const std::vector< std::vector< std::string > > & fused_groups,
const std::map< std::string, Function > & env,
const Target & target,
bool & any_memoized )

Build loop nests and inject Function realizations at the appropriate places using the schedule.

Returns a flag indicating whether memoization passes need to be run.

◆ operator<<() [22/22]

template<typename T >
std::ostream & Halide::Internal::operator<< ( std::ostream & stream,
const Scope< T > & s )

◆ select_gpu_api()

Stmt Halide::Internal::select_gpu_api ( const Stmt & s,
const Target & t )

Replace for loops with GPU_Default device_api with an actual device API depending on what's enabled in the target.

Choose the first of the following: opencl, cuda

◆ simplify() [1/2]

Stmt Halide::Internal::simplify ( const Stmt & ,
bool remove_dead_code = true,
const Scope< Interval > & bounds = Scope< Interval >::empty_scope(),
const Scope< ModulusRemainder > & alignment = Scope< ModulusRemainder >::empty_scope(),
const std::vector< Expr > & assumptions = std::vector< Expr >() )

Perform a wide range of simplifications to expressions and statements, including constant folding, substituting in trivial values, arithmetic rearranging, etc.

Simplifies across let statements, so must not be called on stmts with dangling or repeated variable names. Can optionally be passed known bounds of any variables, known alignment properties, and any other Exprs that should be assumed to be true.

◆ simplify() [2/2]

Expr Halide::Internal::simplify ( const Expr & ,
bool remove_dead_code = true,
const Scope< Interval > & bounds = Scope< Interval >::empty_scope(),
const Scope< ModulusRemainder > & alignment = Scope< ModulusRemainder >::empty_scope(),
const std::vector< Expr > & assumptions = std::vector< Expr >() )

◆ can_prove()

bool Halide::Internal::can_prove ( Expr e,
const Scope< Interval > & bounds = Scope< Interval >::empty_scope() )

Attempt to statically prove an expression is true using the simplifier.

◆ simplify_exprs()

Stmt Halide::Internal::simplify_exprs ( const Stmt & )

Simplify expressions found in a statement, but don't simplify across different statements.

This is safe to perform at an earlier stage in lowering than full simplification of a stmt.

◆ simplify_correlated_differences()

Stmt Halide::Internal::simplify_correlated_differences ( const Stmt & )

Symbolic interval arithmetic can be extremely conservative in cases where we analyze the difference between two correlated expressions.

For example, consider:

for x in [0, 10]:
    let y = x + 3
    let z = y - x

x lies within [0, 10]. Interval arithmetic will correctly determine that y lies within [3, 13]. When z is encountered, it is treated as a difference of two independent variables, and gives [3 - 10, 13 - 0] = [-7, 13] instead of the tighter interval [3, 3]. It doesn't understand that y and x are correlated.
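The arithmetic in this example can be reproduced with a toy interval type. This is only a sketch: the `Interval` struct here is a hypothetical two-int stand-in and not Halide's `Interval` class, and `correlated_bound_z` hard-codes the result of substituting y = x + 3 by hand.

```cpp
#include <cassert>

// Toy closed interval [min, max] over ints (hypothetical, not Halide's Interval).
struct Interval {
    int min, max;
};

// Adding a constant shifts both ends.
Interval add_const(Interval a, int c) {
    assert(a.min <= a.max);
    return {a.min + c, a.max + c};
}

// Standard interval subtraction: assumes a and b vary independently.
Interval sub(Interval a, Interval b) {
    return {a.min - b.max, a.max - b.min};
}

// Naive bound on z = y - x: bound y first, then treat y and x as unrelated.
Interval naive_bound_z(Interval x) {
    Interval y = add_const(x, 3);
    return sub(y, x);
}

// After substituting y = x + 3, z = (x + 3) - x simplifies to the constant 3,
// so the bound no longer depends on the range of x.
Interval correlated_bound_z(Interval /*x*/) {
    return {3, 3};
}
```

With x in [0, 10], `naive_bound_z` yields [-7, 13] while `correlated_bound_z` yields [3, 3], matching the intervals quoted above.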

In practice, this causes problems for unrolling, and arbitrarily bad overconservative behavior in bounds inference (e.g. https://github.com/halide/Halide/issues/3697 )

The function below attempts to address this by walking the IR, remembering whether each let variable is monotonic increasing, decreasing, unknown, or constant w.r.t each loop var. When it encounters a subtract node where both sides have the same monotonicity it substitutes, solves, and attempts to generally simplify as aggressively as possible to try to cancel out the repeated dependence on the loop var. The same is done for addition nodes with arguments of opposite monotonicity.

Bounds inference is particularly sensitive to these false dependencies, but removing false dependencies also helps other lowering passes. E.g. if this simplification means a value no longer depends on a loop variable, it can remain scalar during vectorization of that loop, or we can lift it out as a loop invariant, or it might avoid some of the complex paths in GPU codegen that trigger when values depend on the block index (e.g. warp shuffles).

This pass is safe to use on code with repeated instances of the same variable name (it must be, because we want to run it before allocation bounds inference).

◆ bound_correlated_differences()

Expr Halide::Internal::bound_correlated_differences ( const Expr & expr)

Refactor the expression to remove correlated differences or rewrite them in a form that is more amenable to bounds inference.

Performs a subset of what simplify_correlated_differences does. Can increase Expr size (i.e. does not follow the simplifier's reduction order).

◆ simplify_specializations()

void Halide::Internal::simplify_specializations ( std::map< std::string, Function > & env)

Try to simplify the RHS/LHS of a function's definition based on its specializations.

◆ skip_stages()

Stmt Halide::Internal::skip_stages ( const Stmt & s,
const std::vector< Function > & outputs,
const std::vector< std::vector< std::string > > & order,
const std::map< std::string, Function > & env )

Avoid computing certain stages if we can infer a runtime condition to check that tells us they won't be used.

Does this by analyzing all reads of each buffer allocated, and inferring some condition that tells us if the reads occur. If the condition is non-trivial, inject ifs that guard the production.

◆ sliding_window()

Stmt Halide::Internal::sliding_window ( const Stmt & s,
const std::map< std::string, Function > & env )

Perform sliding window optimizations on a halide statement.

I.e. don't bother computing points in a function that have provably already been computed by a previous iteration.

◆ solve_expression()

SolverResult Halide::Internal::solve_expression ( const Expr & e,
const std::string & variable,
const Scope< Expr > & scope = Scope< Expr >::empty_scope() )

Attempts to collect all instances of a variable in an expression tree and place it as far to the left as possible, and as far up the tree as possible (i.e. outside most parentheses).

If the expression is an equality or comparison, this 'solves' the equation. Returns a pair of Expr and bool. The Expr is the mutated expression, and the bool indicates whether there is a single instance of the variable in the result. If it is false, the expression has only been partially solved, and there are still multiple instances of the variable.

◆ solve_for_outer_interval()

Interval Halide::Internal::solve_for_outer_interval ( const Expr & c,
const std::string & variable )

Find the smallest interval such that the condition is either true or false inside of it, but definitely false outside of it.

Never returns undefined Exprs, instead it uses variables called "pos_inf" and "neg_inf" to represent positive and negative infinity.

◆ solve_for_inner_interval()

Interval Halide::Internal::solve_for_inner_interval ( const Expr & c,
const std::string & variable )

Find the largest interval such that the condition is definitely true inside of it, and might be true or false outside of it.

◆ and_condition_over_domain()

Expr Halide::Internal::and_condition_over_domain ( const Expr & c,
const Scope< Interval > & varying )

Take a conditional that includes variables that vary over some domain, and convert it to a more conservative (less frequently true) condition that doesn't depend on those variables.

Formally, the output expr implies the input expr.

The condition may be a vector condition, in which case we also 'and' over the vector lanes, and return a scalar result.

◆ solve_test()

void Halide::Internal::solve_test ( )

◆ spirv_ir_test()

void Halide::Internal::spirv_ir_test ( )

Internal test for SPIR-V IR.

◆ split_tuples()

Stmt Halide::Internal::split_tuples ( const Stmt & s,
const std::map< std::string, Function > & env )

Rewrite all tuple-valued Realizations, Provide nodes, and Call nodes into several scalar-valued ones, so that later lowering passes only need to think about scalar-valued productions.

◆ stage_strided_loads()

Stmt Halide::Internal::stage_strided_loads ( const Stmt & s)

Convert all unpredicated strided loads in a Stmt into dense loads followed by shuffles.

For a stride of two, the trick is to do a dense load of twice the size, and then extract either the even or odd lanes. This was previously done in codegen, where it was challenging, because it's not easy to know there if it's safe to do the double-sized load, as it either loads one element beyond or before the original load. We used the alignment of the ramp base to try to tell if it was safe to shift backwards, and we added padding to internal allocations so that for those at least it was safe to shift forwards. Unfortunately the alignment of the ramp base is usually unknown if you don't know anything about the strides of the input, and adding padding to allocations was a serious wart in our memory allocators.

This pass instead actively looks for evidence elsewhere in the Stmt (at some location which definitely executes whenever the load being transformed executes) that it's safe to read further forwards or backwards in memory. The evidence is in the form of a load at the same base address with a different constant offset. It also clusters groups of these loads so that they do the same dense load and extract the appropriate slice of lanes. If it fails to find any evidence, for loads from external buffers it does two overlapping half-sized dense loads and shuffles out the desired lanes, and for loads from internal allocations it adds padding to the allocation explicitly, by setting the padding field on Allocate nodes.
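The stride-two trick can be sketched on plain arrays. This is an illustration only, with hypothetical function names; it models vectors as `std::vector<int>` and shows why the dense load needs one extra readable element beyond the last strided element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strided load: reads in[base], in[base + 2], ..., one element per lane.
std::vector<int> strided_load(const std::vector<int> &in, std::size_t base,
                              std::size_t lanes) {
    std::vector<int> out(lanes);
    for (std::size_t i = 0; i < lanes; i++) {
        out[i] = in[base + 2 * i];
    }
    return out;
}

// Dense load of twice the size, followed by a shuffle that keeps the even lanes.
std::vector<int> dense_load_then_shuffle(const std::vector<int> &in,
                                         std::size_t base, std::size_t lanes) {
    // The dense load touches in[base + 2 * lanes - 1], which is one element
    // past the last element the strided load would read.
    assert(base + 2 * lanes <= in.size());
    std::vector<int> dense(in.begin() + static_cast<std::ptrdiff_t>(base),
                           in.begin() + static_cast<std::ptrdiff_t>(base + 2 * lanes));
    std::vector<int> out(lanes);
    for (std::size_t i = 0; i < lanes; i++) {
        out[i] = dense[2 * i];  // extract the even lanes
    }
    return out;
}
```

The assertion in `dense_load_then_shuffle` stands in for the safety evidence the pass has to find before it may widen a load.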

◆ print_to_stmt_html()

void Halide::Internal::print_to_stmt_html ( const std::string & html_output_filename,
const Module & m,
const std::string & assembly_input_filename = "" )

Dump an HTML-formatted visualization of a Module to filename.

If assembly_input_filename is not empty, it is expected to be the path to assembly output. If empty, the code will attempt to find such a file based on output_filename (replacing ".stmt.html" with ".s"), and will assert-fail if no such file is found.

◆ print_to_conceptual_stmt_html()

void Halide::Internal::print_to_conceptual_stmt_html ( const std::string & html_output_filename,
const Module & m,
const std::string & assembly_input_filename = "" )

Dump an HTML-formatted visualization of a Module's conceptual Stmt code to filename.

If assembly_input_filename is not empty, it is expected to be the path to assembly output. If empty, the code will attempt to find such a file based on output_filename (replacing ".stmt.html" with ".s"), and will assert-fail if no such file is found.

◆ storage_flattening()

Stmt Halide::Internal::storage_flattening ( Stmt s,
const std::vector< Function > & outputs,
const std::map< std::string, Function > & env,
const Target & target )

Take a statement with multi-dimensional Realize, Provide, and Call nodes, and turn it into a statement with single-dimensional Allocate, Store, and Load nodes respectively.

◆ storage_folding()

Stmt Halide::Internal::storage_folding ( const Stmt & s,
const std::map< std::string, Function > & env )

Fold storage of functions if possible.

This means reducing one of the dimensions modulo something for the purpose of storage, if we can prove that this is safe to do. E.g. consider:

f(x) = ...
g(x) = f(x-1) + f(x)
f.store_root().compute_at(g, x);

We can store f as a circular buffer of size two, instead of allocating space for all of it.
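A hand-written sketch of what the folded storage amounts to, assuming a hypothetical producer f(x) = x * x. The names `g_full`, `g_folded`, and `fold` are illustrative and this is not the code Halide generates.

```cpp
#include <cassert>
#include <vector>

int f(int x) { return x * x; }  // hypothetical producer

// Non-negative remainder modulo 2, valid for negative x as well.
int fold(int x) { return ((x % 2) + 2) % 2; }

// Unfolded storage: f is stored for every x in [-1, n - 1].
std::vector<int> g_full(int n) {
    assert(n >= 0);
    std::vector<int> f_buf(n + 1);
    for (int x = -1; x < n; x++) {
        f_buf[x + 1] = f(x);  // f(x) lives at index x + 1
    }
    std::vector<int> g(n);
    for (int x = 0; x < n; x++) {
        g[x] = f_buf[x] + f_buf[x + 1];  // f(x - 1) + f(x)
    }
    return g;
}

// Folded storage: only f(x - 1) and f(x) are live, so two slots suffice.
std::vector<int> g_folded(int n) {
    assert(n >= 0);
    int f_buf[2];
    std::vector<int> g(n);
    f_buf[fold(-1)] = f(-1);  // warm-up value needed by the first iteration
    for (int x = 0; x < n; x++) {
        f_buf[fold(x)] = f(x);  // overwrites the slot that held f(x - 2)
        g[x] = f_buf[fold(x - 1)] + f_buf[fold(x)];
    }
    return g;
}
```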

◆ strictify_float()

bool Halide::Internal::strictify_float ( std::map< std::string, Function > & env,
const Target & t )

Propagate strict_float intrinsics such that they immediately wrap all floating-point expressions.

This makes the IR nodes context independent. If the Target::StrictFloat flag is specified in target, starts in strict_float mode so all floating-point type Exprs in the compilation will be marked with strict_float. Returns whether any strict floating-point is used in any function in the passed in env.

◆ strip_asserts()

Stmt Halide::Internal::strip_asserts ( const Stmt & s)

◆ substitute() [1/6]

Expr Halide::Internal::substitute ( const std::string & name,
const Expr & replacement,
const Expr & expr )

Substitute variables with the given name with the replacement expression within expr.

This is a dangerous thing to do if variable names have not been uniquified. While it won't traverse inside let statements with the same name as the first argument, moving a piece of syntax around can change its meaning, because it can cross lets that redefine variable names that it includes references to.

◆ substitute() [2/6]

Stmt Halide::Internal::substitute ( const std::string & name,
const Expr & replacement,
const Stmt & stmt )

Substitute variables with the given name with the replacement expression within stmt.

◆ substitute() [3/6]

Expr Halide::Internal::substitute ( const std::map< std::string, Expr > & replacements,
const Expr & expr )

Substitute variables with names in the map.

◆ substitute() [4/6]

Stmt Halide::Internal::substitute ( const std::map< std::string, Expr > & replacements,
const Stmt & stmt )

◆ substitute() [5/6]

Expr Halide::Internal::substitute ( const Expr & find,
const Expr & replacement,
const Expr & expr )

Substitute expressions for other expressions.

◆ substitute() [6/6]

Stmt Halide::Internal::substitute ( const Expr & find,
const Expr & replacement,
const Stmt & stmt )

◆ graph_substitute() [1/4]

Expr Halide::Internal::graph_substitute ( const std::string & name,
const Expr & replacement,
const Expr & expr )

Substitutions where the IR may be a general graph (and not just a DAG).

◆ graph_substitute() [2/4]

Stmt Halide::Internal::graph_substitute ( const std::string & name,
const Expr & replacement,
const Stmt & stmt )

◆ graph_substitute() [3/4]

Expr Halide::Internal::graph_substitute ( const Expr & find,
const Expr & replacement,
const Expr & expr )

◆ graph_substitute() [4/4]

Stmt Halide::Internal::graph_substitute ( const Expr & find,
const Expr & replacement,
const Stmt & stmt )

◆ substitute_in_all_lets() [1/2]

Expr Halide::Internal::substitute_in_all_lets ( const Expr & expr)

Substitute in all let Exprs in a piece of IR.

Doesn't substitute in let stmts, as this may change the meaning of the IR (e.g. by moving a load after a store). Produces graphs of IR, so don't use non-graph-aware visitors or mutators on it until you've CSE'd the result.

◆ substitute_in_all_lets() [2/2]

Stmt Halide::Internal::substitute_in_all_lets ( const Stmt & stmt)

◆ target_test()

void Halide::Internal::target_test ( )

◆ lower_target_query_ops()

void Halide::Internal::lower_target_query_ops ( std::map< std::string, Function > & env,
const Target & t )

◆ inject_tracing()

Stmt Halide::Internal::inject_tracing ( Stmt ,
const std::string & pipeline_name,
bool trace_pipeline,
const std::map< std::string, Function > & env,
const std::vector< Function > & outputs,
const Target & Target )

Take a statement representing a halide pipeline, inject calls to tracing functions at interesting points, such as allocations.

Should be done before storage flattening, but after all bounds inference.

◆ trim_no_ops()

Stmt Halide::Internal::trim_no_ops ( Stmt s)

Truncate loop bounds to the region over which they actually do something.

For examples see test/correctness/trim_no_ops.cpp

◆ unify_duplicate_lets()

Stmt Halide::Internal::unify_duplicate_lets ( const Stmt & s)

Find let statements that all define the same value, and make later ones just reuse the symbol names of the earlier ones.

◆ uniquify_variable_names()

Stmt Halide::Internal::uniquify_variable_names ( const Stmt & s)

Modify a statement so that every internally-defined variable name is unique.

This lets later passes assume syntactic equivalence is semantic equivalence.

◆ uniquify_variable_names_test()

void Halide::Internal::uniquify_variable_names_test ( )

◆ unpack_buffers()

Stmt Halide::Internal::unpack_buffers ( Stmt s)

Creates let stmts for the various buffer components (e.g. foo.extent.0) in any referenced concrete buffers or buffer parameters.

After this pass, the only undefined symbols should be scalar parameters and the buffers themselves (e.g. foo.buffer).

◆ unroll_loops()

Stmt Halide::Internal::unroll_loops ( const Stmt & )

Take a statement with for loops marked for unrolling, and convert each into several copies of the innermost statement.

I.e. unroll the loop.

◆ lower_unsafe_promises()

Stmt Halide::Internal::lower_unsafe_promises ( const Stmt & s,
const Target & t )

Lower all unsafe promises into either assertions or unchecked code, depending on the target.

◆ lower_safe_promises()

Stmt Halide::Internal::lower_safe_promises ( const Stmt & s)

Lower all safe promises by just stripping them.

This is a good idea once no more lowering stages are going to use boxes_touched.

◆ safe_numeric_cast()

template<typename DST , typename SRC , typename std::enable_if< std::is_floating_point< SRC >::value >::type * = nullptr>
DST Halide::Internal::safe_numeric_cast ( SRC s)

Some numeric conversions are UB if the value won't fit in the result; safe_numeric_cast<>() is meant as a drop-in replacement for a C/C++ cast that adds well-defined behavior for the UB cases, attempting to mimic common implementation behavior as much as possible.

Definition at line 99 of file Util.h.

◆ reinterpret_bits()

template<typename DstType , typename SrcType >
DstType Halide::Internal::reinterpret_bits ( const SrcType & src)

An aggressive form of reinterpret cast used for correct type-punning.

Definition at line 135 of file Util.h.

References memcpy().

Referenced by Halide::Internal::IRMatcher::fuzz_test_rule().
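Since the entry references memcpy(), a memcpy-based type pun is presumably the technique in use. The following is a sketch under that assumption, with the hypothetical name `reinterpret_bits_sketch`; the size check is an assumption of this sketch and may not match the checks in Util.h.

```cpp
#include <cstdint>
#include <cstring>

// Copy the object representation of src into a value of type DstType.
// Unlike casting through a pointer, memcpy does not violate strict aliasing.
template<typename DstType, typename SrcType>
DstType reinterpret_bits_sketch(const SrcType &src) {
    static_assert(sizeof(SrcType) == sizeof(DstType),
                  "Source and destination must have the same size");
    DstType dst;
    std::memcpy(&dst, &src, sizeof(SrcType));
    return dst;
}
```

For example, on a platform with 32-bit IEEE-754 floats, `reinterpret_bits_sketch<uint32_t>(1.0f)` yields the bit pattern 0x3f800000.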

◆ get_env_variable()

std::string Halide::Internal::get_env_variable ( char const * env_var_name)

Get value of an environment variable.

Returns its value if it is defined in the environment. If the var is not defined, an empty string is returned.

◆ running_program_name()

std::string Halide::Internal::running_program_name ( )

Get the name of the currently running executable.

Platform-specific. If program name cannot be retrieved, function returns an empty string.

◆ unique_name() [1/2]

std::string Halide::Internal::unique_name ( char prefix)

Generate a unique name starting with the given prefix.

It's unique relative to all other strings returned by unique_name in this process.

The single-character version always appends a numeric suffix to the character.

The string version will either return the input as-is (with high probability on the first time it is called with that input), or replace any existing '$' characters with underscores, then add a '$' sign and a numeric suffix to it.

Note that unique_name('f') therefore differs from unique_name("f"). The former returns something like f123, and the latter returns either f or f$123.

Referenced by Halide::Buffer< T, Dims >::Buffer().

◆ unique_name() [2/2]

std::string Halide::Internal::unique_name ( const std::string & prefix)

◆ starts_with()

bool Halide::Internal::starts_with ( const std::string & str,
const std::string & prefix )

Test if the first string starts with the second string.

◆ ends_with()

bool Halide::Internal::ends_with ( const std::string & str,
const std::string & suffix )

Test if the first string ends with the second string.

◆ replace_all()

std::string Halide::Internal::replace_all ( const std::string & str,
const std::string & find,
const std::string & replace )

Replace all matches of the second string in the first string with the last string.

◆ split_string()

std::vector< std::string > Halide::Internal::split_string ( const std::string & source,
const std::string & delim )

Split the source string using 'delim' as the divider.

◆ join_strings()

template<typename T >
std::string Halide::Internal::join_strings ( const std::vector< T > & sources,
const std::string & delim )

Join the source vector using 'delim' as the divider.

Definition at line 187 of file Util.h.
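A minimal sketch of the split and join helpers on `std::string`, using the hypothetical names `split_string_sketch` and `join_strings_sketch`. How the real split_string treats empty fields or an empty delimiter is not stated here, so the choices below (empty fields are kept, an empty delimiter returns the input as a single element) are assumptions of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split source at every occurrence of delim, keeping empty fields.
std::vector<std::string> split_string_sketch(const std::string &source,
                                             const std::string &delim) {
    std::vector<std::string> parts;
    if (delim.empty()) {
        parts.push_back(source);  // assumption: nothing to split on
        return parts;
    }
    std::size_t start = 0;
    std::size_t found;
    while ((found = source.find(delim, start)) != std::string::npos) {
        parts.push_back(source.substr(start, found - start));
        start = found + delim.size();
    }
    parts.push_back(source.substr(start));  // the final field
    return parts;
}

// Concatenate the elements with delim between consecutive elements.
std::string join_strings_sketch(const std::vector<std::string> &sources,
                                const std::string &delim) {
    std::string result;
    for (std::size_t i = 0; i < sources.size(); i++) {
        if (i > 0) {
            result += delim;
        }
        result += sources[i];
    }
    return result;
}
```

Under these assumptions, joining the result of a split with the same delimiter reproduces the original string.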

◆ fold_left()

template<typename T , typename Fn >
T Halide::Internal::fold_left ( const std::vector< T > & vec,
Fn f )

Perform a left fold of a vector.

Returns a default-constructed vector element if the vector is empty. Similar to std::accumulate but with a less clunky syntax.

Definition at line 212 of file Util.h.

◆ fold_right()

template<typename T , typename Fn >
T Halide::Internal::fold_right ( const std::vector< T > & vec,
Fn f )

Returns a right fold of a vector.

Returns a default-constructed vector element if the vector is empty.

Definition at line 227 of file Util.h.
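The difference between the two folds is easiest to see with a non-associative operation. The following sketch uses the hypothetical names `fold_left_sketch` and `fold_right_sketch`, and value-initializes the result so that the empty case is well defined for built-in types; that detail is an assumption of this sketch.

```cpp
#include <cstddef>
#include <vector>

// Left fold: f(f(f(v0, v1), v2), v3) ...
template<typename T, typename Fn>
T fold_left_sketch(const std::vector<T> &vec, Fn f) {
    T result{};  // returned unchanged for an empty vector
    if (vec.empty()) {
        return result;
    }
    result = vec[0];
    for (std::size_t i = 1; i < vec.size(); i++) {
        result = f(result, vec[i]);
    }
    return result;
}

// Right fold: f(v0, f(v1, f(v2, v3))) ...
template<typename T, typename Fn>
T fold_right_sketch(const std::vector<T> &vec, Fn f) {
    T result{};  // returned unchanged for an empty vector
    if (vec.empty()) {
        return result;
    }
    result = vec.back();
    for (std::size_t i = vec.size() - 1; i > 0; i--) {
        result = f(vec[i - 1], result);
    }
    return result;
}
```

With subtraction over {1, 2, 3}, the left fold computes (1 - 2) - 3 and the right fold computes 1 - (2 - 3).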

◆ extract_namespaces()

std::string Halide::Internal::extract_namespaces ( const std::string & name,
std::vector< std::string > & namespaces )

Returns base name and fills in namespaces, outermost one first in vector.

Referenced by halide_handle_cplusplus_type::make().

◆ strip_namespaces()

std::string Halide::Internal::strip_namespaces ( const std::string & name)

Like extract_namespaces(), but strip and discard the namespaces, returning base name only.

◆ file_make_temp()

std::string Halide::Internal::file_make_temp ( const std::string & prefix,
const std::string & suffix )

Create a unique file with a name of the form prefixXXXXXsuffix in an arbitrary (but writable) directory; this is typically /tmp, but the specific location is not guaranteed.

(Note that the exact form of the file name may vary; in particular, the suffix may be ignored on Windows.) The file is created (but not opened), thus this can be called from different threads (or processes, e.g. when building with parallel make) without risking collision. Note that if this file is used as a temporary file, the caller is responsible for deleting it. Neither the prefix nor suffix may contain a directory separator.

◆ dir_make_temp()

std::string Halide::Internal::dir_make_temp ( )

Create a unique directory in an arbitrary (but writable) directory; this is typically somewhere inside /tmp, but the specific location is not guaranteed.

The directory will be empty (i.e., this will never return /tmp itself, but rather a new directory inside /tmp). The caller is responsible for removing the directory after use.

◆ file_exists()

bool Halide::Internal::file_exists ( const std::string & name)

Wrapper for access().

Quietly ignores errors.

◆ assert_file_exists()

void Halide::Internal::assert_file_exists ( const std::string & name)

Assert-fail if the file doesn't exist.

Useful primarily for testing purposes.

◆ assert_no_file_exists()

void Halide::Internal::assert_no_file_exists ( const std::string & name)

Assert-fail if the file DOES exist.

Useful primarily for testing purposes.

◆ file_unlink()

void Halide::Internal::file_unlink ( const std::string & name)

Wrapper for unlink().

Asserts upon error.

Quietly ignores errors.

Referenced by Halide::Internal::TemporaryFile::~TemporaryFile().

◆ ensure_no_file_exists()

void Halide::Internal::ensure_no_file_exists ( const std::string & name)

Ensure that no file with this path exists.

If such a file exists and cannot be removed, assert-fail.

◆ dir_rmdir()

void Halide::Internal::dir_rmdir ( const std::string & name)

Wrapper for rmdir().

Asserts upon error.

◆ file_stat()

FileStat Halide::Internal::file_stat ( const std::string & name)

Wrapper for stat().

Asserts upon error.

◆ read_entire_file()

std::vector< char > Halide::Internal::read_entire_file ( const std::string & pathname)

Read the entire contents of a file into a vector<char>.

The file is read in binary mode. Errors trigger an assertion failure.

◆ write_entire_file() [1/2]

void Halide::Internal::write_entire_file ( const std::string & pathname,
const void * source,
size_t source_len )

Create or replace the contents of a file with a given pointer-and-length of memory.

If the file doesn't exist, it is created; if it does exist, it is completely overwritten. Any error triggers an assertion failure.

Referenced by write_entire_file().

◆ write_entire_file() [2/2]

void Halide::Internal::write_entire_file ( const std::string & pathname,
const std::vector< char > & source )
inline

Definition at line 322 of file Util.h.

References write_entire_file().

◆ add_would_overflow()

bool Halide::Internal::add_would_overflow ( int bits,
int64_t a,
int64_t b )

Routines to test if math would overflow for signed integers with the given number of bits.

Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Add >().

◆ sub_would_overflow()

bool Halide::Internal::sub_would_overflow ( int bits,
int64_t a,
int64_t b )

◆ mul_would_overflow()

bool Halide::Internal::mul_would_overflow ( int bits,
int64_t a,
int64_t b )

◆ add_with_overflow()

HALIDE_MUST_USE_RESULT bool Halide::Internal::add_with_overflow ( int bits,
int64_t a,
int64_t b,
int64_t * result )

Routines to perform arithmetic on signed types without triggering signed overflow.

If overflow would occur, sets result to zero, and returns false. Otherwise set result to the correct value, and returns true.

Referenced by Halide::Internal::Simplify::ExprInfo::trim_bounds_using_alignment().
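The contract described above (zero and false on overflow, the sum and true otherwise) can be sketched as follows. `add_with_overflow_sketch` is a hypothetical re-implementation for illustration, not Halide's source; it assumes 1 <= bits <= 64 and that both inputs already fit in a signed integer of that width.

```cpp
#include <cstdint>

// Add two signed values as if they were 'bits'-wide integers.
// Returns false and sets *result to 0 if the sum does not fit.
bool add_with_overflow_sketch(int bits, int64_t a, int64_t b, int64_t *result) {
    // Largest and smallest values of a signed integer with this many bits.
    const int64_t max_val =
        (bits == 64) ? INT64_MAX : (int64_t(1) << (bits - 1)) - 1;
    const int64_t min_val = -max_val - 1;
    // Test against the limits before adding, so the int64_t addition
    // below can never overflow itself.
    const bool overflows = (b > 0 && a > max_val - b) ||
                           (b < 0 && a < min_val - b);
    if (overflows) {
        *result = 0;
        return false;
    }
    *result = a + b;
    return true;
}
```

Checking before the addition, rather than inspecting the sum afterwards, avoids relying on signed wraparound, which is undefined behavior in C++.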

◆ sub_with_overflow()

HALIDE_MUST_USE_RESULT bool Halide::Internal::sub_with_overflow ( int bits,
int64_t a,
int64_t b,
int64_t * result )

◆ mul_with_overflow()

HALIDE_MUST_USE_RESULT bool Halide::Internal::mul_with_overflow ( int bits,
int64_t a,
int64_t b,
int64_t * result )

◆ halide_tic_impl()

void Halide::Internal::halide_tic_impl ( const char * file,
int line )

◆ halide_toc_impl()

void Halide::Internal::halide_toc_impl ( const char * file,
int line )

◆ begin()

◆ end()

◆ reverse_view()

template<typename T >
reverse_adaptor< T > Halide::Internal::reverse_view ( T && range)

Reverse-order adaptor for range-based for-loops.

TODO: Replace with std::ranges::reverse_view when upgrading to C++20.

Definition at line 481 of file Util.h.
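One common way to build such an adaptor is a small wrapper struct plus free begin/end functions found by argument-dependent lookup. The sketch below follows that pattern with hypothetical `_sketch` names; `reversed_copy` is only a usage helper for the example.

```cpp
#include <iterator>
#include <vector>

// Holds a reference to the underlying range; the range must outlive the adaptor.
template<typename T>
struct reverse_adaptor_sketch {
    T &range;
};

// Range-based for finds these via argument-dependent lookup.
template<typename T>
auto begin(reverse_adaptor_sketch<T> r) {
    return std::rbegin(r.range);
}

template<typename T>
auto end(reverse_adaptor_sketch<T> r) {
    return std::rend(r.range);
}

template<typename T>
reverse_adaptor_sketch<T> reverse_view_sketch(T &&range) {
    return {range};
}

// Usage helper: iterate a vector back to front.
std::vector<int> reversed_copy(const std::vector<int> &v) {
    std::vector<int> out;
    for (int x : reverse_view_sketch(v)) {
        out.push_back(x);
    }
    return out;
}
```

Because the adaptor stores only a reference, passing a temporary container to `reverse_view_sketch` would leave it dangling; the sketch is intended for named ranges.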

◆ c_print_name()

std::string Halide::Internal::c_print_name ( const std::string & name,
bool prefix_underscore = true )

Emit a version of a string that is a valid identifier in C ('.' is replaced with '_').

If prefix_underscore is true (the default), an underscore will be prepended if the input starts with an alphabetic character to avoid reserved word clashes.

◆ get_llvm_version()

int Halide::Internal::get_llvm_version ( )

Return the LLVM_VERSION against which this libHalide is compiled.

This is provided only for internal tests which need to verify behavior; please don't use this outside of Halide tests.

◆ run_with_large_stack()

void Halide::Internal::run_with_large_stack ( const std::function< void()> & action)

Call the given action in a platform-specific context that provides at least the stack space returned by get_compiler_stack_size.

If that value is zero, this just calls the function on the calling thread. Otherwise, on Windows this uses a Fiber, and on other platforms it uses swapcontext.

◆ popcount64()

int Halide::Internal::popcount64 ( uint64_t x)

Portable versions of popcount, count-leading-zeros, and count-trailing-zeros.

◆ clz64()

int Halide::Internal::clz64 ( uint64_t x)

◆ ctz64()

int Halide::Internal::ctz64 ( uint64_t x)
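
Portable fallbacks for these three operations are commonly written with plain loops, as in the sketch below. These are not Halide's implementations, and the sketch_ names are hypothetical. Returning 64 from the clz and ctz sketches when x is zero is an assumption made here; the documentation above does not say what the real functions do for a zero input.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical portable popcount: clears the lowest set bit each iteration,
// so the loop runs once per set bit.
inline int sketch_popcount64(uint64_t x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}

// Hypothetical count-leading-zeros. Returns 64 for x == 0 (an assumption).
inline int sketch_clz64(uint64_t x) {
    if (x == 0) return 64;
    int n = 0;
    while ((x & (uint64_t(1) << 63)) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}

// Hypothetical count-trailing-zeros. Returns 64 for x == 0 (an assumption).
inline int sketch_ctz64(uint64_t x) {
    if (x == 0) return 64;
    int n = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}
```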

◆ next_power_of_two()

int64_t Halide::Internal::next_power_of_two ( int64_t x)
inline

Return an integer 2^n, for some n, which is >= x.

Argument x must be > 0.

Definition at line 556 of file Util.h.
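
One simple way to meet this contract is to double a candidate until it reaches x, as sketched below. This is not the definition in Util.h, and sketch_next_power_of_two is a hypothetical name. Besides the documented requirement x > 0, the sketch also assumes x <= 2^62, so that the doubling cannot overflow int64_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: returns the smallest power of two that is >= x.
// Assumes 0 < x <= 2^62 (the upper limit is an assumption of this sketch).
inline int64_t sketch_next_power_of_two(int64_t x) {
    int64_t p = 1;
    while (p < x) {
        p <<= 1;
    }
    return p;
}
```

Note that the documented contract only promises some power of two that is >= x; returning the smallest such value is a choice made in this sketch.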

◆ align_up()

template<typename T >
T Halide::Internal::align_up ( T x,
int n )
inline

Definition at line 561 of file Util.h.

◆ make_argument_list()

std::vector< Var > Halide::Internal::make_argument_list ( int dimensionality)

Make a list of unique arguments for definitions with unnamed arguments.

Referenced by Halide::Func::define_extern() (three overloads).

◆ vectorize_loops()

Stmt Halide::Internal::vectorize_loops ( const Stmt & s,
const std::map< std::string, Function > & env )

Take a statement with for loops marked for vectorization, and turn them into single statements that operate on vectors.

The loops in question must have constant extent.

◆ wrap_func_calls()

std::map< std::string, Function > Halide::Internal::wrap_func_calls ( const std::map< std::string, Function > & env)

Replace every call to wrapped Functions in the Functions' definitions with calls to their wrapper functions.

◆ get_test_tmp_dir()

std::string Halide::Internal::get_test_tmp_dir ( )
inline

Return the path to a directory that can be safely written to when running tests; the directory's contents may or may not outlast the lifetime of the test itself (i.e., the files may be cleaned up after test execution).

The path is guaranteed to be an absolute path and end in a directory separator, so a leaf filename can simply be appended. It is not guaranteed that this directory will be empty. If the path cannot be created, the function will assert-fail and return an invalid path.

Definition at line 76 of file halide_test_dirs.h.

References Halide::Internal::Test::get_current_directory(), and Halide::Internal::Test::get_env_variable().

Variable Documentation

◆ unknown

const int64_t Halide::Internal::unknown = std::numeric_limits<int64_t>::min()

Definition at line 22 of file AutoScheduleUtils.h.

◆ StrongestExprNodeType

IRNodeType Halide::Internal::StrongestExprNodeType = IRNodeType::VectorReduce
constexpr

Definition at line 81 of file Expr.h.

◆ random_variable_counter

std::atomic<int> Halide::Internal::random_variable_counter
extern