Namespaces
namespace	Autoscheduler

namespace	Elf

namespace	GeneratorMinMax

namespace	IntegerDivision

namespace	IRMatcher
	An alternative template-metaprogramming approach to expression matching.

namespace	Test

Classes
class	AbstractGenerator
	AbstractGenerator is an ABC that defines the API a Generator must provide to work with the existing Generator infrastructure (GenGen, RunGen, execute_generator(), Generator Stubs). More...

struct	Acquire

struct	Add
	The sum of two expressions. More...

struct	all_are_convertible

struct	all_are_printable_args

struct	all_ints_and_optional_name

struct	all_ints_and_optional_name< First, Rest... >

struct	all_ints_and_optional_name< T >

struct	all_ints_and_optional_name<>

struct	Allocate
	Allocate a scratch area called with the given name, type, and size. More...

struct	And
	Logical and - are both expressions true. More...

struct	ApplySplitResult

class	aslog

struct	AssertStmt
	If the 'condition' is false, then evaluate and return the message, which should be a call to an error function. More...

struct	AssociativeOp
	Represent the equivalent associative op of an update definition. More...

struct	AssociativePattern
	Represent an associative op with its identity. More...

struct	Atomic
	Lock all the Store nodes in the body statement. More...

struct	BaseExprNode
	A base class for expression nodes. More...

struct	BaseStmtNode
	IR nodes are split into expressions and statements. More...

struct	Block
	A sequence of statements to be executed in-order. More...

struct	Bound
	A bound on a loop, typically from Func::bound. More...

struct	Box
	Represents the bounds of a region of arbitrary dimension. More...

struct	Broadcast
	A vector with 'lanes' elements, in which every element is 'value'. More...

struct	BufferBuilder
	A builder to help create Exprs representing halide_buffer_t structs (e.g. More...

struct	BufferContents

struct	BufferInfo
	Find all calls to image buffers and parameters in the function. More...

struct	Call
	A function call. More...

struct	Cast
	The actual IR nodes begin here. More...

class	Closure
	A helper class to manage closures. More...

class	CodeGen_C
	This class emits C++ code equivalent to a halide Stmt. More...

class	CodeGen_GPU_C
	A base class for GPU backends that require C-like shader output. More...

struct	CodeGen_GPU_Dev
	A code generator that emits GPU code from a given Halide stmt. More...

class	CodeGen_LLVM
	A code generator abstract base class. More...

class	CodeGen_Posix
	A code generator that emits posix code from a given Halide stmt. More...

class	CodeGen_PyTorch
	This class emits C++ code to wrap a Halide pipeline so that it can be used as a C++ extension operator in PyTorch. More...

class	CompilerLogger

struct	cond

struct	ConstantInterval
	A class to represent ranges of integers. More...

struct	Convert

struct	Cost

class	debug
	For optional debugging during codegen, use the debug class as follows: More...

class	Definition
	A Function definition which can represent either an init or an update definition. More...

struct	DeviceArgument
	A DeviceArgument looks similar to a Halide::Argument, but has behavioral differences that make it specific to the GPU pipeline; the fact that neither is-a nor has-a Halide::Argument is deliberate. More...

struct	Dim
	The Dim struct represents one loop in the schedule's representation of a loop nest. More...

class	Dimension

struct	Div
	The ratio of two expressions. More...

struct	EQ
	Is the first expression equal to the second. More...

struct	ErrorReport

struct	Evaluate
	Evaluate and discard an expression, presumably because it has some side-effect. More...

struct	ExecuteGeneratorArgs
	ExecuteGeneratorArgs is the set of arguments to execute_generator(). More...

struct	ExprNode
	We use the "curiously recurring template pattern" to avoid duplicated code in the IR Nodes. More...
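
To illustrate the pattern named above, here is a minimal standalone sketch. `Visitor`, `BaseNode`, `NodeOf`, `AddNode`, and `ConstNode` are hypothetical names invented for this example and are not Halide's types; the sketch only shows how a CRTP layer lets `accept()` be written once for every node type.

```cpp
#include <cassert>
#include <string>

// Hypothetical node types, forward-declared so the visitor can name them.
struct AddNode;
struct ConstNode;

// Hypothetical visitor with one overload per concrete node type.
struct Visitor {
    std::string log;
    void visit(const AddNode *) { log += "Add;"; }
    void visit(const ConstNode *) { log += "Const;"; }
};

// Type-erased base so heterogeneous nodes can be stored together.
struct BaseNode {
    virtual ~BaseNode() = default;
    virtual void accept(Visitor *v) const = 0;
};

// CRTP layer: accept() is written once and dispatches on the derived type T.
template<typename T>
struct NodeOf : BaseNode {
    void accept(Visitor *v) const override {
        v->visit(static_cast<const T *>(this));
    }
};

// Concrete nodes inherit the shared accept() without repeating it.
struct AddNode : NodeOf<AddNode> {};
struct ConstNode : NodeOf<ConstNode> {};
```

Calling `accept()` through a `BaseNode` pointer reaches the overload for the concrete node type, with no per-node boilerplate.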

class	ExprUsesVars

struct	FeatureIntermediates

struct	FileStat

class	FindAllCalls
	Visitor for keeping track of functions that are directly called and the arguments with which they are called. More...

struct	FloatImm
	Floating point constants. More...

struct	For
	A for loop. More...

struct	Fork
	A pair of statements executed concurrently. More...

struct	Free
	Free the resources associated with the given buffer. More...

class	FuncSchedule
	A schedule for a Function of a Halide pipeline. More...

class	Function
	A reference-counted handle to Halide's internal representation of a function. More...

struct	FunctionPtr
	A possibly-weak pointer to a Halide function. More...

struct	FusedPair
	This represents two stages with fused loop nests from outermost to a specific loop level. More...

struct	GE
	Is the first expression greater than or equal to the second. More...

class	GeneratorBase

class	GeneratorFactoryProvider
	GeneratorFactoryProvider provides a way to customize the Generators that are visible to generate_filter_main (which otherwise would just look at the global registry of C++ Generators). More...

class	GeneratorInput_Arithmetic

class	GeneratorInput_Buffer

class	GeneratorInput_DynamicScalar

class	GeneratorInput_Func

class	GeneratorInput_Scalar

class	GeneratorInputBase

class	GeneratorInputImpl

class	GeneratorOutput_Arithmetic

class	GeneratorOutput_Buffer

class	GeneratorOutput_Func

class	GeneratorOutputBase

class	GeneratorOutputImpl

class	GeneratorParam_Arithmetic

class	GeneratorParam_AutoSchedulerParams

class	GeneratorParam_Bool

class	GeneratorParam_Enum

class	GeneratorParam_LoopLevel

class	GeneratorParam_String

class	GeneratorParam_Synthetic

class	GeneratorParam_Target

class	GeneratorParam_Type

class	GeneratorParamBase

class	GeneratorParamImpl

class	GeneratorParamInfo

class	GeneratorRegistry

class	GIOBase
	GIOBase is the base class for all GeneratorInput<> and GeneratorOutput<> instantiations; it is not part of the public API and should never be used directly by user code. More...

class	GPUCompilationCache

class	GpuObjectLifetimeTracker

struct	GT
	Is the first expression greater than the second. More...

struct	HalideBufferStaticTypeAndDims

struct	HalideBufferStaticTypeAndDims<::Halide::Buffer< T, Dims > >

struct	HalideBufferStaticTypeAndDims<::Halide::Runtime::Buffer< T, Dims > >

struct	has_static_halide_type_method

struct	has_static_halide_type_method< T2, typename type_sink< decltype(T2::static_halide_type())>::type >

class	HexagonAlignmentAnalyzer

struct	HoistedStorage
	Represents a location where storage will be hoisted to for a Func / Realize node with a given name. More...

class	HostClosure
	A Closure modified to inspect GPU-specific memory accesses, and produce a vector of DeviceArgument objects. More...

struct	IfThenElse
	An if-then-else block. More...

struct	Indentation

struct	InferredArgument
	An inferred argument. More...

struct	Interval
	A class to represent ranges of Exprs. More...

struct	IntImm
	Integer constants. More...

struct	IntrusivePtr
	Intrusive shared pointers have a reference count (a RefCount object) stored in the class itself. More...
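
The idea of keeping the count inside the object can be sketched as follows. This is a simplified illustration under stated assumptions, not Halide's `IntrusivePtr` or `RefCount`: the names `Counter`, `Node`, and `Handle` are hypothetical, and the real classes may differ in how they locate the count and destroy the object.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical reference count, standing in for a RefCount-like class.
class Counter {
    std::atomic<int> count{0};
public:
    int increment() { return ++count; }
    int decrement() { return --count; }
    int value() const { return count.load(); }
};

// Hypothetical payload type: the count is a member of the object itself.
struct Node {
    mutable Counter ref_count;
    int payload = 0;
    static int live;  // number of Node objects currently alive
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Hypothetical intrusive handle: copies bump the count stored in *ptr.
template<typename T>
class Handle {
    T *ptr = nullptr;
    void acquire() {
        if (ptr) ptr->ref_count.increment();
    }
    void release() {
        if (ptr && ptr->ref_count.decrement() == 0) delete ptr;
        ptr = nullptr;
    }
public:
    Handle() = default;
    explicit Handle(T *p) : ptr(p) { acquire(); }
    Handle(const Handle &other) : ptr(other.ptr) { acquire(); }
    Handle &operator=(Handle other) {  // copy-and-swap
        std::swap(ptr, other.ptr);
        return *this;
    }
    ~Handle() { release(); }
    T *get() const { return ptr; }
    int use_count() const { return ptr ? ptr->ref_count.value() : 0; }
};
```

Because the count lives in the object, a handle is a single pointer wide and no separate control block is allocated.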

struct	IRDeepCompare
	A compare struct built around less_than, for use as the comparison object in a std::map or std::set. More...

struct	IRGraphDeepCompare
	A compare struct built around graph_less_than, for use as the comparison object in a std::map or std::set. More...

class	IRGraphMutator
	A mutator that caches and reapplies previously-done mutations, so that it can handle graphs of IR that have not had CSE done to them. More...

class	IRGraphVisitor
	A base class for algorithms that walk recursively over the IR without visiting the same node twice. More...

struct	IRHandle
	IR nodes are passed around via opaque handles to them. More...

class	IRMutator
	A base class for passes over the IR which modify it (e.g. More...

struct	IRNode
	The abstract base class for a node in the Halide IR. More...

class	IRPrinter
	An IRVisitor that emits IR to the given output stream in a human readable form. More...

class	IRVisitor
	A base class for algorithms that need to recursively walk over the IR. More...

struct	is_printable_arg

struct	IsHalideBuffer

struct	IsHalideBuffer< const halide_buffer_t * >

struct	IsHalideBuffer< halide_buffer_t * >

struct	IsHalideBuffer<::Halide::Buffer< T, Dims > >

struct	IsHalideBuffer<::Halide::Runtime::Buffer< T, Dims > >

struct	IsRoundtrippable

struct	JITCache

struct	JITErrorBuffer

struct	JITFuncCallContext

struct	JITModule

class	JITSharedRuntime

class	JSONCompilerLogger
	JSONCompilerLogger is a basic implementation of the CompilerLogger interface that saves logged data, then logs it all in JSON format in emit_to_stream(). More...

struct	LE
	Is the first expression less than or equal to the second. More...

struct	Let
	A let expression, like you might find in a functional language. More...

struct	LetStmt
	The statement form of a let node. More...

struct	Load
	Load a value from a named symbol if predicate is true. More...

struct	LoweredArgument
	Definition of an argument to a LoweredFunc. More...

struct	LoweredFunc
	Definition of a lowered function. More...

struct	LT
	Is the first expression less than the second. More...

struct	Max
	The greater of two values. More...

struct	meta_and

struct	meta_and< T1, Args... >

struct	meta_or

struct	meta_or< T1, Args... >

struct	Min
	The lesser of two values. More...

struct	Mod
	The remainder of a / b. More...

struct	ModulusRemainder
	The result of modulus_remainder analysis. More...

struct	Mul
	The product of two expressions. More...

struct	NE
	Is the first expression not equal to the second. More...

struct	NoRealizations

struct	NoRealizations< T, Args... >

struct	NoRealizations<>

struct	Not
	Logical not - true if the expression is false. More...

class	ObjectInstanceRegistry

struct	Or
	Logical or - is at least one of the expressions true. More...

struct	OutputInfo

struct	PipelineFeatures

struct	Prefetch
	Represent a multi-dimensional region of a Func or an ImageParam that needs to be prefetched. More...

struct	PrefetchDirective

struct	PrintSpan
	Allow easily printing the contents of containers, or std::vector-like containers, in debug output. More...

struct	PrintSpanLn
	Allow easily printing the contents of spans, or std::vector-like spans, in debug output. More...

struct	ProducerConsumer
	This node is a helpful annotation to do with permissions. More...

struct	Provide
	This defines the value of a function at a multi-dimensional location. More...

class	PythonExtensionGen

struct	Ramp
	A linear ramp vector node. More...

struct	Realize
	Allocate a multi-dimensional buffer of the given type and size. More...

class	ReductionDomain
	A reference-counted handle on a reduction domain, which is just a vector of ReductionVariable. More...

struct	ReductionVariable
	A single named dimension of a reduction domain. More...

struct	ReductionVariableInfo
	Return a list of reduction variables the expression or tuple depends on. More...

class	RefCount
	A class representing a reference count to be used with IntrusivePtr. More...

struct	RegionCosts
	Auto scheduling component which is used to assign costs for computing a region of a function or one of its stages. More...

class	RegisterGenerator

struct	Reinterpret
	Reinterpret value as another type, without affecting any of the bits (on little-endian systems). More...

struct	reverse_adaptor

struct	ScheduleFeatures

class	Scope
	A common pattern when traversing Halide IR is that you need to keep track of stuff when you find a Let or a LetStmt, and that it should hide previous values with the same name until you leave the Let or LetStmt nodes. This class helps with that. More...
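
The shadowing behaviour described above can be sketched with one stack of values per name. This is an illustrative stand-in, not Halide's `Scope`: the class name `ShadowScope` and its members are assumptions made for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical name-to-value table in which an inner binding hides outer ones.
template<typename T>
class ShadowScope {
    std::map<std::string, std::vector<T>> table;
public:
    // Entering a let: the new value hides any earlier value of the same name.
    void push(const std::string &name, T value) {
        table[name].push_back(std::move(value));
    }
    // Leaving a let: the previously hidden value (if any) becomes visible again.
    void pop(const std::string &name) {
        auto it = table.find(name);
        assert(it != table.end() && !it->second.empty());
        it->second.pop_back();
        if (it->second.empty()) table.erase(it);
    }
    bool contains(const std::string &name) const {
        return table.count(name) != 0;
    }
    // Returns the innermost binding for the name.
    const T &get(const std::string &name) const {
        auto it = table.find(name);
        assert(it != table.end());
        return it->second.back();
    }
};
```

A traversal would call `push` when it enters a let body and `pop` when it leaves, so lookups inside the body always see the innermost binding.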

struct	ScopedBinding
	Helper class for pushing/popping Scope<> values, to allow for early-exit in Visitor/Mutators that preserves correctness. More...

struct	ScopedBinding< void >

struct	ScopedValue
	Helper class for saving/restoring variable values on the stack, to allow for early-exit that preserves correctness. More...
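
A save-and-restore helper of this kind is usually an RAII guard. The sketch below is a minimal illustration, not Halide's `ScopedValue`; `SavedValue` and `visit_with_flag` are hypothetical names, and the depth limit is an arbitrary value chosen to force an early return.

```cpp
#include <cassert>
#include <utility>

// Hypothetical guard: sets a variable on construction, restores it on destruction.
template<typename T>
class SavedValue {
    T &var;
    T old_value;
public:
    SavedValue(T &v, T new_value) : var(v), old_value(std::move(v)) {
        var = std::move(new_value);
    }
    ~SavedValue() { var = std::move(old_value); }
    SavedValue(const SavedValue &) = delete;
    SavedValue &operator=(const SavedValue &) = delete;
};

// Hypothetical traversal step with an early exit.
inline bool visit_with_flag(bool &in_loop, int depth) {
    SavedValue<bool> guard(in_loop, true);
    if (depth > 3) return false;  // early exit: the guard still restores in_loop
    return in_loop;
}
```

Both return paths restore the flag, which is the property that makes early exits safe.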

struct	Select
	A ternary operator. More...

struct	select_type

struct	select_type< First >

struct	Shuffle
	Construct a new vector by taking elements from another sequence of vectors. More...

class	Simplify

class	SmallStack
	A stack which can store one item very efficiently. More...
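
One way to make the single-item case cheap is to hold the top element inline and spill older elements to a vector only when a second item is pushed. The sketch below illustrates that idea under this assumption; `OneFastStack` is a hypothetical name, not Halide's `SmallStack`, whose layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stack whose top element lives inline; no heap use for one item.
template<typename T>
class OneFastStack {
    T top_item{};
    std::vector<T> rest;  // older elements, used only when size() > 1
    bool is_empty = true;
public:
    void push(T t) {
        if (!is_empty) rest.push_back(std::move(top_item));
        top_item = std::move(t);
        is_empty = false;
    }
    void pop() {
        assert(!is_empty);
        if (rest.empty()) {
            is_empty = true;
            top_item = T();
        } else {
            top_item = std::move(rest.back());
            rest.pop_back();
        }
    }
    const T &top() const {
        assert(!is_empty);
        return top_item;
    }
    bool empty() const { return is_empty; }
    std::size_t size() const { return is_empty ? 0 : rest.size() + 1; }
};
```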

class	SmallStack< void >

struct	SolverResult

struct	Specialization

struct	Split

class	StageSchedule
	A schedule for a single stage of a Halide pipeline. More...

struct	StaticCast

struct	Stmt
	A reference-counted handle to a statement node. More...

struct	StmtNode

struct	StorageDim
	Properties of one axis of the storage of a Func. More...

struct	Store
	Store a 'value' to the buffer called 'name' at a given 'index' if 'predicate' is true. More...

struct	StringImm
	String constants. More...

class	StubInput

class	StubInputBuffer
	StubInputBuffer is the placeholder that a Stub uses when it requires a Buffer for an input (rather than merely a Func or Expr). More...

class	StubOutputBuffer
	StubOutputBuffer is the placeholder that a Stub uses when it requires a Buffer for an output (rather than merely a Func). More...

class	StubOutputBufferBase

struct	Sub
	The difference of two expressions. More...

class	TemporaryFile
	A simple utility class that creates a temporary file in its ctor and deletes that file in its dtor; this is useful for temporary files that you want to ensure are deleted when exiting a certain scope. More...

struct	type_sink

struct	UIntImm
	Unsigned integer constants. More...

struct	Variable
	A named variable. More...

class	VariadicVisitor
	A visitor/mutator capable of passing arbitrary arguments to the visit methods using CRTP and returning any types from them. More...

struct	VectorReduce
	Horizontally reduce a vector to a scalar or narrower vector using the given commutative and associative binary operator. More...

class	Voidifier

struct	WasmModule
	Handle to compiled wasm code which can be called later. More...

struct	Weights

Typedefs
using	AbstractGeneratorPtr = std::unique_ptr<AbstractGenerator>

typedef std::map< std::string, Interval >	DimBounds

typedef std::map< std::pair< std::string, int >, Interval >	FuncValueBounds

template<typename T , typename T2 >
using	add_const_if_T_is_const = typename std::conditional<std::is_const<T>::value, const T2, T2>::type
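
The alias above copies the constness of `T` onto `T2`. The following standalone snippet repeats the definition exactly as listed and checks both cases at compile time; the choice of `int` and `float` is arbitrary.

```cpp
#include <type_traits>

// Definition as listed above: T2 becomes const exactly when T is const.
template<typename T, typename T2>
using add_const_if_T_is_const =
    typename std::conditional<std::is_const<T>::value, const T2, T2>::type;

// const T propagates const onto T2.
static_assert(std::is_same<add_const_if_T_is_const<const int, float>, const float>::value,
              "const is propagated");
// Non-const T leaves T2 unchanged.
static_assert(std::is_same<add_const_if_T_is_const<int, float>, float>::value,
              "non-const is left alone");
```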

template<typename T >
using	GeneratorParamImplBase

template<typename T , typename TBase = typename std::remove_all_extents<T>::type>
using	GeneratorInputImplBase

template<typename T , typename TBase = typename std::remove_all_extents<T>::type>
using	GeneratorOutputImplBase

using	GeneratorFactory = std::function<AbstractGeneratorPtr(const GeneratorContext &context)>

typedef llvm::raw_pwrite_stream	LLVMOStream

Enumerations
enum class	ArgInfoKind { Scalar , Function , Buffer }

enum class	ArgInfoDirection { Input , Output }

enum class	Direction { Upper , Lower }
	Given a varying expression, try to find a constant that is either an upper bound (always greater than or equal to the expression) or a lower bound (always less than or equal to the expression). If it fails, returns an undefined Expr. More...

enum class	IRNodeType { IntImm , UIntImm , FloatImm , StringImm , Broadcast , Cast , Reinterpret , Variable , Add , Sub , Mod , Mul , Div , Min , Max , EQ , NE , LT , LE , GT , GE , And , Or , Not , Select , Load , Ramp , Call , Let , Shuffle , VectorReduce , LetStmt , AssertStmt , ProducerConsumer , For , Acquire , Store , Provide , Allocate , Free , Realize , Block , Fork , IfThenElse , Evaluate , Prefetch , Atomic , HoistedStorage }
	All our IR node types get unique IDs for the purposes of RTTI. More...

enum class	ForType { Serial , Parallel , Vectorized , Unrolled , Extern , GPUBlock , GPUThread , GPULane }
	An enum describing a type of loop traversal. More...

enum class	SyntheticParamType { Type , Dim , ArraySize }

enum class	Monotonic { Constant , Increasing , Decreasing , Unknown }
	Detect whether an expression is monotonic increasing in a variable, decreasing, or unknown. More...

enum class	DimType { PureVar = 0 , PureRVar , ImpureRVar }
	Each Dim below has a dim_type, which tells you what transformations are legal on it. More...

Functions
Stmt	add_atomic_mutex (Stmt s, const std::vector< Function > &outputs)

Stmt	add_image_checks (const Stmt &s, const std::vector< Function > &outputs, const Target &t, const std::vector< std::string > &order, const std::map< std::string, Function > &env, const FuncValueBounds &fb, bool will_inject_host_copies)
	Insert checks to make sure a statement doesn't read out of bounds on inputs or outputs, and that the inputs and outputs conform to the format required (e.g.

Stmt	add_parameter_checks (const std::vector< Stmt > &requirements, Stmt s, const Target &t)
	Insert checks to make sure that all referenced parameters meet their constraints.

Stmt	add_split_factor_checks (const Stmt &s, const std::map< std::string, Function > &env)
	Insert checks that all split factors that depend on scalar parameters are strictly positive.

Stmt	align_loads (const Stmt &s, int alignment, int min_bytes_to_align)
	Attempt to rewrite unaligned loads from buffers which are known to be aligned to instead load aligned vectors that cover the original load, and then slice the original load out of the aligned vectors.

Stmt	allocation_bounds_inference (Stmt s, const std::map< std::string, Function > &env, const std::map< std::pair< std::string, int >, Interval > &func_bounds)
	Take a partially lowered statement with Realize nodes in terms of variables, and define values for those variables.

std::vector< ApplySplitResult >	apply_split (const Split &split, const std::string &prefix, std::map< std::string, Expr > &dim_extent_alignment)
	Given a Split schedule on a definition (init or update), return a list of predicates on the definition, substitutions that need to be applied to the definition (in ascending order of application), and let stmts which define the values of variables referred to by the predicates and substitutions (ordered from innermost to outermost let).

std::vector< std::pair< std::string, Expr > >	compute_loop_bounds_after_split (const Split &split, const std::string &prefix)
	Compute the loop bounds of the new dimensions resulting from applying the split schedules using the loop bounds of the old dimensions.

const std::vector< AssociativePattern > &	get_ops_table (const std::vector< Expr > &exprs)

AssociativeOp	prove_associativity (const std::string &f, std::vector< Expr > args, std::vector< Expr > exprs)
	Given an update definition of a Func 'f', determine its equivalent associative binary/unary operator if there is any.

void	associativity_test ()

Stmt	fork_async_producers (Stmt s, const std::map< std::string, Function > &env)

int	string_to_int (const std::string &s)
	Return an int representation of 's'.

Expr	substitute_var_estimates (Expr e)
	Substitute every variable in an Expr or a Stmt with its estimate if specified.

Stmt	substitute_var_estimates (Stmt s)

Expr	get_extent (const Interval &i)
	Return the size of an interval.

Expr	box_size (const Box &b)
	Return the size of an n-d box.

void	disp_regions (const std::map< std::string, Box > &regions)
	Helper function to print the bounds of a region.

Definition	get_stage_definition (const Function &f, int stage_num)
	Return the corresponding definition of a function given the stage.

std::vector< Dim > &	get_stage_dims (const Function &f, int stage_num)
	Return the corresponding loop dimensions of a function given the stage.

void	combine_load_costs (std::map< std::string, Expr > &result, const std::map< std::string, Expr > &partial)
	Add partial load costs to the corresponding function in the result costs.

DimBounds	get_stage_bounds (const Function &f, int stage_num, const DimBounds &pure_bounds)
	Return the required bounds of an intermediate stage (f, stage_num) of function 'f' given the bounds of the pure dimensions.

std::vector< DimBounds >	get_stage_bounds (const Function &f, const DimBounds &pure_bounds)
	Return the required bounds for all the stages of the function 'f'.

Expr	perform_inline (Expr e, const std::map< std::string, Function > &env, const std::set< std::string > &inlines=std::set< std::string >(), const std::vector< std::string > &order=std::vector< std::string >())
	Recursively inline all the functions in the set 'inlines' into the expression 'e' and return the resulting expression.

std::set< std::string >	get_parents (Function f, int stage)
	Return all functions that are directly called by a function stage (f, stage).

template<typename K , typename V >
V	get_element (const std::map< K, V > &m, const K &key)
	Return value of element within a map.

template<typename K , typename V >
V &	get_element (std::map< K, V > &m, const K &key)

bool	inline_all_trivial_functions (const std::vector< Function > &outputs, const std::vector< std::string > &order, const std::map< std::string, Function > &env)
	If the cost of computing a Func is about the same as calling the Func, inline the Func.

std::string	is_func_called_element_wise (const std::vector< std::string > &order, size_t index, const std::map< std::string, Function > &env)
	Determine if a Func (order[index]) is only consumed by another single Func in an element-wise manner.

bool	inline_all_element_wise_functions (const std::vector< Function > &outputs, const std::vector< std::string > &order, const std::map< std::string, Function > &env)
	Inline a Func if its values are only consumed by another single Func in an element-wise manner.

void	propagate_estimate_test ()

Stmt	bound_constant_extent_loops (const Stmt &s)
	Replace all loop extents of unrolled or vectorized loops with constants, by substituting and simplifying as needed.

const FuncValueBounds &	empty_func_value_bounds ()

Interval	bounds_of_expr_in_scope (const Expr &expr, const Scope< Interval > &scope, const FuncValueBounds &func_bounds=empty_func_value_bounds(), bool const_bound=false)
	Given an expression in some variables, and a map from those variables to their bounds (in the form of (minimum possible value, maximum possible value)), compute two expressions that give the minimum possible value and the maximum possible value of this expression.

Expr	find_constant_bound (const Expr &e, Direction d, const Scope< Interval > &scope=Scope< Interval >::empty_scope())

Interval	find_constant_bounds (const Expr &e, const Scope< Interval > &scope)
	Find bounds for a varying expression that are either constants or +/-inf.

void	merge_boxes (Box &a, const Box &b)
	Expand box a to encompass box b.

bool	boxes_overlap (const Box &a, const Box &b)
	Test if box a could possibly overlap box b.

Box	box_union (const Box &a, const Box &b)
	The union of two boxes.

Box	box_intersection (const Box &a, const Box &b)
	The intersection of two boxes.

bool	box_contains (const Box &a, const Box &b)
	Test if box a provably contains box b.
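
The box operations listed above can be illustrated on concrete integer bounds. This is a simplified sketch: `Span`, `IntBox`, and the `int_box_*` functions are hypothetical names, and they work on plain integers, so every answer is exact. The functions listed here operate on `Box` and can only answer "provably" or "possibly".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Span {
    int lo, hi;  // closed interval [lo, hi]
};
using IntBox = std::vector<Span>;  // one Span per dimension

// Smallest box that covers both a and b.
inline IntBox int_box_union(const IntBox &a, const IntBox &b) {
    assert(a.size() == b.size());
    IntBox r(a.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        r[i] = {std::min(a[i].lo, b[i].lo), std::max(a[i].hi, b[i].hi)};
    }
    return r;
}

// Boxes overlap only if their intervals overlap in every dimension.
inline bool int_boxes_overlap(const IntBox &a, const IntBox &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (a[i].hi < b[i].lo || b[i].hi < a[i].lo) return false;
    }
    return true;
}

// a contains b if b's interval lies inside a's in every dimension.
inline bool int_box_contains(const IntBox &a, const IntBox &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (b[i].lo < a[i].lo || b[i].hi > a[i].hi) return false;
    }
    return true;
}
```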

std::map< std::string, Box >	boxes_required (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
	Compute rectangular domains large enough to cover all the 'Call's to each function that occurs within a given statement or expression.

std::map< std::string, Box >	boxes_required (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())

std::map< std::string, Box >	boxes_provided (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
	Compute rectangular domains large enough to cover all the 'Provide's to each function that occurs within a given statement or expression.

std::map< std::string, Box >	boxes_provided (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())

std::map< std::string, Box >	boxes_touched (const Expr &e, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
	Compute rectangular domains large enough to cover all the 'Call's and 'Provide's to each function that occurs within a given statement or expression.

std::map< std::string, Box >	boxes_touched (Stmt s, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())

Box	box_required (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())
	Variants of the above that are only concerned with a single function.

Box	box_required (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())

Box	box_provided (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())

Box	box_provided (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())

Box	box_touched (const Expr &e, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())

Box	box_touched (Stmt s, const std::string &fn, const Scope< Interval > &scope=Scope< Interval >::empty_scope(), const FuncValueBounds &func_bounds=empty_func_value_bounds())

FuncValueBounds	compute_function_value_bounds (const std::vector< std::string > &order, const std::map< std::string, Function > &env)
	Compute the maximum and minimum possible value for each function in an environment.

Expr	span_of_bounds (const Interval &bounds)

void	bounds_test ()

Stmt	bounds_inference (Stmt, const std::vector< Function > &outputs, const std::vector< std::string > &realization_order, const std::vector< std::vector< std::string > > &fused_groups, const std::map< std::string, Function > &environment, const std::map< std::pair< std::string, int >, Interval > &func_bounds, const Target &target)
	Take a partially lowered statement that includes symbolic representations of the bounds over which things should be realized, and inject expressions defining those bounds.

Stmt	bound_small_allocations (const Stmt &s)

Expr	buffer_accessor (const Buffer<> &buf, const std::vector< Expr > &args)

template<typename T , typename = typename std::enable_if<!std::is_convertible<T, std::string>::value>::type>
std::string	get_name_from_end_of_parameter_pack (T &&)

std::string	get_name_from_end_of_parameter_pack (const std::string &n)

std::string	get_name_from_end_of_parameter_pack ()

template<typename First , typename Second , typename... Args>
std::string	get_name_from_end_of_parameter_pack (First first, Second second, Args &&...rest)

void	get_shape_from_start_of_parameter_pack_helper (std::vector< int > &, const std::string &)

void	get_shape_from_start_of_parameter_pack_helper (std::vector< int > &)

template<typename... Args>
void	get_shape_from_start_of_parameter_pack_helper (std::vector< int > &result, int x, Args &&...rest)

template<typename... Args>
std::vector< int >	get_shape_from_start_of_parameter_pack (Args &&...args)

template<typename T >
void	buffer_type_name_non_const (std::ostream &s)

template<>
void	buffer_type_name_non_const< void > (std::ostream &s)

template<typename T >
std::string	buffer_type_name ()

Stmt	canonicalize_gpu_vars (Stmt s)
	Canonicalize GPU var names into some pre-determined block/thread names (i.e.

const std::string &	gpu_thread_name (int index)
	Names for the thread and block id variables.

const std::string &	gpu_block_name (int index)

Stmt	clamp_unsafe_accesses (const Stmt &s, const std::map< std::string, Function > &env, FuncValueBounds &func_bounds)
	Inject clamps around func calls h(...) when all the following conditions hold:

std::unique_ptr< CodeGen_GPU_Dev >	new_CodeGen_D3D12Compute_Dev (const Target &target)

llvm::Type *	get_vector_element_type (llvm::Type *)
	Get the scalar type of an llvm vector type.

bool	function_takes_user_context (const std::string &name)
	Which built-in functions require a user-context first argument?

bool	can_allocation_fit_on_stack (int64_t size)
	Given a size (in bytes), return true if the allocation size can fit on the stack; otherwise, return false.

std::pair< Expr, Expr >	long_div_mod_round_to_zero (const Expr &a, const Expr &b, std::optional< uint64_t > max_abs=std::nullopt)
	Does a {div/mod}_round_to_zero using binary long division for int/uint.
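
Binary long division itself can be sketched on plain unsigned integers. The function below is a scalar illustration of the shift-and-subtract technique only; it is not the Halide routine, which builds `Expr`s and also handles signed types, and the name `long_div_mod` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Restoring binary long division on unsigned 32-bit values.
// Returns {quotient, remainder}; b must be non-zero.
inline std::pair<uint32_t, uint32_t> long_div_mod(uint32_t a, uint32_t b) {
    assert(b != 0);
    uint32_t q = 0;
    uint64_t r = 0;  // wider than a so the shift below cannot overflow
    for (int bit = 31; bit >= 0; --bit) {
        // Bring down the next bit of the dividend.
        r = (r << 1) | ((a >> bit) & 1u);
        // If the divisor fits, subtract it and set this quotient bit.
        if (r >= b) {
            r -= b;
            q |= (uint32_t{1} << bit);
        }
    }
    return {q, static_cast<uint32_t>(r)};
}
```

For unsigned operands the result already rounds toward zero; a signed variant would typically divide the magnitudes and then fix up the signs.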

Expr	lower_mux (const Call *mux)
	Reduce a mux intrinsic to a select tree.

Expr	lower_round_to_nearest_ties_to_even (const Expr &)
	A vectorizable implementation of Halide::round that doesn't depend on any standard library being present.

void	get_target_options (const llvm::Module &module, llvm::TargetOptions &options)
	Given an llvm::Module, set llvm::TargetOptions information.

void	clone_target_options (const llvm::Module &from, llvm::Module &to)
	Given two llvm::Modules, clone target options from one to the other.

std::unique_ptr< llvm::TargetMachine >	make_target_machine (const llvm::Module &module)
	Given an llvm::Module, get or create an llvm::TargetMachine.

void	set_function_attributes_from_halide_target_options (llvm::Function &)
	Set the appropriate llvm Function attributes given the Halide Target.

void	embed_bitcode (llvm::Module *M, const std::string &halide_command)
	Save a copy of the llvm IR currently represented by the module as data in the __LLVM,__bitcode section.

std::unique_ptr< CodeGen_GPU_Dev >	new_CodeGen_Metal_Dev (const Target &target)

std::unique_ptr< CodeGen_GPU_Dev >	new_CodeGen_OpenCL_Dev (const Target &target)

std::unique_ptr< CodeGen_GPU_Dev >	new_CodeGen_PTX_Dev (const Target &target)

std::unique_ptr< CodeGen_Posix >	new_CodeGen_ARM (const Target &target)
	Construct CodeGen object for a variety of targets.

std::unique_ptr< CodeGen_Posix >	new_CodeGen_Hexagon (const Target &target)

std::unique_ptr< CodeGen_Posix >	new_CodeGen_PowerPC (const Target &target)

std::unique_ptr< CodeGen_Posix >	new_CodeGen_RISCV (const Target &target)

std::unique_ptr< CodeGen_Posix >	new_CodeGen_X86 (const Target &target)

std::unique_ptr< CodeGen_Posix >	new_CodeGen_WebAssembly (const Target &target)

std::unique_ptr< CodeGen_GPU_Dev >	new_CodeGen_Vulkan_Dev (const Target &target)

std::unique_ptr< CodeGen_GPU_Dev >	new_CodeGen_WebGPU_Dev (const Target &target)

std::unique_ptr< CompilerLogger >	set_compiler_logger (std::unique_ptr< CompilerLogger > compiler_logger)
	Set the active CompilerLogger object, replacing any existing one.

CompilerLogger *	get_compiler_logger ()
	Return the currently active CompilerLogger object.

ConstantInterval	constant_integer_bounds (const Expr &e, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope(), std::map< Expr, ConstantInterval, ExprCompare > *cache=nullptr)
	Deduce constant integer bounds on an expression.

ConstantInterval	operator+ (const ConstantInterval &a, const ConstantInterval &b)
	Arithmetic operators on ConstantIntervals.

ConstantInterval	operator+ (const ConstantInterval &a, int64_t b)

ConstantInterval	operator- (const ConstantInterval &a, const ConstantInterval &b)

ConstantInterval	operator- (const ConstantInterval &a, int64_t b)

ConstantInterval	operator/ (const ConstantInterval &a, const ConstantInterval &b)

ConstantInterval	operator/ (const ConstantInterval &a, int64_t b)

ConstantInterval	operator* (const ConstantInterval &a, const ConstantInterval &b)

ConstantInterval	operator* (const ConstantInterval &a, int64_t b)

ConstantInterval	operator% (const ConstantInterval &a, const ConstantInterval &b)

ConstantInterval	operator% (const ConstantInterval &a, int64_t b)

ConstantInterval	min (const ConstantInterval &a, const ConstantInterval &b)

ConstantInterval	min (const ConstantInterval &a, int64_t b)

ConstantInterval	max (const ConstantInterval &a, const ConstantInterval &b)

ConstantInterval	max (const ConstantInterval &a, int64_t b)

ConstantInterval	abs (const ConstantInterval &a)

ConstantInterval	operator<< (const ConstantInterval &a, const ConstantInterval &b)

ConstantInterval	operator<< (const ConstantInterval &a, int64_t b)

ConstantInterval	operator<< (int64_t a, const ConstantInterval &b)

ConstantInterval	operator>> (const ConstantInterval &a, const ConstantInterval &b)

ConstantInterval	operator>> (const ConstantInterval &a, int64_t b)

ConstantInterval	operator>> (int64_t a, const ConstantInterval &b)
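
The arithmetic operators listed above can be read as ordinary interval arithmetic. The sketch below is only an illustration under simplifying assumptions: `SimpleInterval` is a hypothetical stand-in with two finite, inclusive bounds, not the real ConstantInterval, and it ignores overflow and any unbounded ends the real class may track.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for illustration; NOT Halide's ConstantInterval.
// Both bounds are finite and inclusive, and overflow is ignored.
struct SimpleInterval {
    int64_t min;
    int64_t max;
};

// Sum: the extremes add component-wise.
inline SimpleInterval operator+(const SimpleInterval &a, const SimpleInterval &b) {
    return {a.min + b.min, a.max + b.max};
}

// Difference: the smallest result is a.min - b.max, the largest a.max - b.min.
inline SimpleInterval operator-(const SimpleInterval &a, const SimpleInterval &b) {
    return {a.min - b.max, a.max - b.min};
}

// Product: the extremes lie among the four corner products.
inline SimpleInterval operator*(const SimpleInterval &a, const SimpleInterval &b) {
    const int64_t c[] = {a.min * b.min, a.min * b.max, a.max * b.min, a.max * b.max};
    return {*std::min_element(c, c + 4), *std::max_element(c, c + 4)};
}
```

For example, multiplying [-2, 3] by [4, 5] gives [-10, 15], because the corner products are -8, -10, 12 and 15.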

bool	operator<= (const ConstantInterval &a, const ConstantInterval &b)
	Comparison operators on ConstantIntervals.

bool	operator<= (const ConstantInterval &a, int64_t b)

bool	operator<= (int64_t a, const ConstantInterval &b)

bool	operator< (const ConstantInterval &a, const ConstantInterval &b)

bool	operator< (const ConstantInterval &a, int64_t b)

bool	operator< (int64_t a, const ConstantInterval &b)

bool	operator>= (const ConstantInterval &a, const ConstantInterval &b)

bool	operator> (const ConstantInterval &a, const ConstantInterval &b)

bool	operator>= (const ConstantInterval &a, int64_t b)

bool	operator> (const ConstantInterval &a, int64_t b)

bool	operator>= (int64_t a, const ConstantInterval &b)

bool	operator> (int64_t a, const ConstantInterval &b)

std::string	cplusplus_function_mangled_name (const std::string &name, const std::vector< std::string > &namespaces, Type return_type, const std::vector< ExternFuncArgument > &args, const Target &target)
	Return the mangled C++ name for a function.

void	cplusplus_mangle_test ()

Expr	common_subexpression_elimination (const Expr &, bool lift_all=false)
	Replace each common sub-expression in the argument with a variable, and wrap the resulting expr in a let statement giving a value to that variable.

Stmt	common_subexpression_elimination (const Stmt &, bool lift_all=false)
	Do common-subexpression-elimination on each expression in a statement.

void	cse_test ()

std::ostream &	operator<< (std::ostream &stream, const Stmt &)
	Emit a halide statement on an output stream (such as std::cout) in a human-readable form.

std::ostream &	operator<< (std::ostream &stream, const LoweredFunc &)
	Emit a halide LoweredFunc in a human readable format.

template<typename T >
	PrintSpan (const T &) -> PrintSpan< T >

template<typename StreamT , typename T >
StreamT &	operator<< (StreamT &stream, const PrintSpan< T > &wrapper)

template<typename T >
	PrintSpanLn (const T &) -> PrintSpanLn< T >

template<typename StreamT , typename T >
StreamT &	operator<< (StreamT &stream, const PrintSpanLn< T > &wrapper)

void	debug_arguments (LoweredFunc *func, const Target &t)
	Injects debug prints in a LoweredFunc that describe the target and arguments.

Stmt	debug_to_file (Stmt s, const std::vector< Function > &outputs, const std::map< std::string, Function > &env)
	Takes a statement with Realize nodes still unlowered.

Expr	extract_odd_lanes (const Expr &a)
	Extract the odd-numbered lanes in a vector.

Expr	extract_even_lanes (const Expr &a)
	Extract the even-numbered lanes in a vector.

Expr	extract_lane (const Expr &vec, int lane)
	Extract the nth lane of a vector.

Stmt	rewrite_interleavings (const Stmt &s)
	Look through a statement for expressions of the form select(ramp % 2 == 0, a, b) and replace them with calls to an interleave intrinsic.

void	deinterleave_vector_test ()

Expr	remove_let_definitions (const Expr &expr)
	Remove all let definitions of expr.

std::vector< int >	gather_variables (const Expr &expr, const std::vector< std::string > &filter)
	Return the indices of the variables that expr depends on and that are in the filter.

std::vector< int >	gather_variables (const Expr &expr, const std::vector< Var > &filter)

std::map< std::string, ReductionVariableInfo >	gather_rvariables (const Expr &expr)

std::map< std::string, ReductionVariableInfo >	gather_rvariables (const Tuple &tuple)

Expr	add_let_expression (const Expr &expr, const std::map< std::string, Expr > &let_var_mapping, const std::vector< std::string > &let_variables)
	Add necessary let expressions to expr.

std::vector< Expr >	sort_expressions (const Expr &expr)
	Topologically sort the expression graph expressed by expr.

std::map< std::string, Box >	inference_bounds (const std::vector< Func > &funcs, const std::vector< Box > &output_bounds)
	Compute the bounds of funcs.

std::map< std::string, Box >	inference_bounds (const Func &func, const Box &output_bounds)

std::vector< std::pair< Expr, Expr > >	box_to_vector (const Box &bounds)
	Convert Box to vector of (min, extent)

bool	equal (const RDom &bounds0, const RDom &bounds1)
	Return true if bounds0 and bounds1 represent the same bounds.

std::vector< std::string >	vars_to_strings (const std::vector< Var > &vars)
	Return a list of variable names.

ReductionDomain	extract_rdom (const Expr &expr)
	Return the reduction domain used by expr.

std::pair< bool, Expr >	solve_inverse (Expr expr, const std::string &new_var, const std::string &var)
	Given expr of the form new_var == f(var), solve for var == g(new_var). If multiple values of new_var correspond to the same var, introduce an RDom.

std::map< std::string, BufferInfo >	find_buffer_param_calls (const Func &func)

std::set< std::string >	find_implicit_variables (const Expr &expr)
	Find all implicit variables in expr.

Expr	substitute_rdom_predicate (const std::string &name, const Expr &replacement, const Expr &expr)
	Substitute the variable.

bool	is_calling_function (const std::string &func_name, const Expr &expr, const std::map< std::string, Expr > &let_var_mapping)
	Return true if expr contains call to func_name.

bool	is_calling_function (const Expr &expr, const std::map< std::string, Expr > &let_var_mapping)
	Return true if expr depends on any function or buffer.

Expr	substitute_call_arg_with_pure_arg (Func f, int variable_id, const Expr &e)
	Replaces call to Func f in Expr e such that the call argument at variable_id is the pure argument.

Expr	make_device_interface_call (DeviceAPI device_api, MemoryType memory_type=MemoryType::Auto)
	Get an Expr which evaluates to the device interface for the given device api at runtime.

Stmt	distribute_shifts (const Stmt &stmt, bool multiply_adds)

Stmt	inject_early_frees (const Stmt &s)
	Take a statement with allocations and inject markers (of the form of calls to "mark buffer dead") after the last use of each allocation.

Type	eliminated_bool_type (Type bool_type, Type other_type)
	If a type is a boolean vector, find the type that it has been changed to by eliminate_bool_vectors.

bool	is_float16_transcendental (const Call *)
	Check if a call is a float16 transcendental (e.g.

Expr	lower_float16_transcendental_to_float32_equivalent (const Call *)
	Implement a float16 transcendental using the float32 equivalent.

Expr	float32_to_bfloat16 (Expr e)
	Cast to/from float and bfloat using bitwise math.

Expr	float32_to_float16 (Expr e)

Expr	float16_to_float32 (Expr e)

Expr	bfloat16_to_float32 (Expr e)
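
The "bitwise math" behind the float/bfloat conversions above relies on bfloat16 being the upper 16 bits of an IEEE 754 float32. The sketch below shows that idea on host values. The helper names are hypothetical, the round-to-nearest-even choice is an assumption, and NaN inputs are not special-cased; the real functions build Exprs and may differ in all of these respects.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical host-side helpers; not Halide's API, which operates on Exprs.
// Assumes float is a 32-bit IEEE 754 value.

// Narrow a float32 to bfloat16 bits, rounding to nearest with ties to even.
// NaN inputs are not special-cased in this sketch.
inline uint16_t f32_to_bf16_bits(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    // Adding 0x7FFF plus the lowest kept bit makes exact ties round to even.
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<uint16_t>(bits >> 16);
}

// Widen bfloat16 bits to float32: the 16 bits become the high half and the
// low half is zero, so this direction is exact.
inline float bf16_bits_to_f32(uint16_t h) {
    uint32_t bits = static_cast<uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```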

Expr	lower_float16_cast (const Cast *op)

HALIDE_EXPORT_SYMBOL void	unhandled_exception_handler ()

template<>
RefCount &	ref_count< IRNode > (const IRNode *t) noexcept

template<>
void	destroy< IRNode > (const IRNode *t)

bool	is_unordered_parallel (ForType for_type)
	Check if for_type executes for loop iterations in parallel and unordered.

bool	is_parallel (ForType for_type)
	Returns true if for_type executes for loop iterations in parallel.

bool	is_gpu (ForType for_type)
	Returns true if for_type is GPUBlock, GPUThread, or GPULane.

template<typename StmtOrExpr , typename T >
bool	stmt_or_expr_uses_vars (const StmtOrExpr &e, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
	Test if a statement or expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.

template<typename StmtOrExpr >
bool	stmt_or_expr_uses_var (const StmtOrExpr &e, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
	Test if a statement or expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.

bool	expr_uses_var (const Expr &e, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
	Test if an expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.

bool	stmt_uses_var (const Stmt &stmt, const std::string &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
	Test if a statement references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.

template<typename T >
bool	expr_uses_vars (const Expr &e, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
	Test if an expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.

template<typename T >
bool	stmt_uses_vars (const Stmt &stmt, const Scope< T > &v, const Scope< Expr > &s=Scope< Expr >::empty_scope())
	Test if a statement references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.

Stmt	extract_tile_operations (const Stmt &s)
	Rewrite any AMX tile operations that have been stored in the AMXTile memory type as intrinsic calls, to be used in the X86 backend.

std::map< std::string, Function >	find_direct_calls (const Function &f)
	Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, including in update definitions, update index expressions, and RDom extents.

std::map< std::string, Function >	find_transitive_calls (const Function &f)
	Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, or indirectly in those functions' definitions, recursively.

std::map< std::string, Function >	build_environment (const std::vector< Function > &funcs)
	Find all Functions transitively referenced by any Function in `funcs` and return a map of them.

std::vector< Function >	called_funcs_in_order_found (const std::vector< Function > &funcs)
	Returns the same Functions as build_environment, but returns a vector of Functions instead, where the order is the order in which the Functions were first encountered.

Expr	lower_widen_right_add (const Expr &a, const Expr &b)
	Implement intrinsics using non-intrinsic equivalents.

Expr	lower_widen_right_mul (const Expr &a, const Expr &b)

Expr	lower_widen_right_sub (const Expr &a, const Expr &b)

Expr	lower_widening_add (const Expr &a, const Expr &b)

Expr	lower_widening_mul (const Expr &a, const Expr &b)

Expr	lower_widening_sub (const Expr &a, const Expr &b)

Expr	lower_widening_shift_left (const Expr &a, const Expr &b)

Expr	lower_widening_shift_right (const Expr &a, const Expr &b)

Expr	lower_rounding_shift_left (const Expr &a, const Expr &b)

Expr	lower_rounding_shift_right (const Expr &a, const Expr &b)

Expr	lower_saturating_add (const Expr &a, const Expr &b)

Expr	lower_saturating_sub (const Expr &a, const Expr &b)

Expr	lower_saturating_cast (const Type &t, const Expr &a)

Expr	lower_halving_add (const Expr &a, const Expr &b)

Expr	lower_halving_sub (const Expr &a, const Expr &b)

Expr	lower_rounding_halving_add (const Expr &a, const Expr &b)

Expr	lower_sorted_avg (const Expr &a, const Expr &b)

Expr	lower_mul_shift_right (const Expr &a, const Expr &b, const Expr &q)

Expr	lower_rounding_mul_shift_right (const Expr &a, const Expr &b, const Expr &q)

Expr	lower_intrinsic (const Call *op)
	Replace one of the above ops with equivalent arithmetic.
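
To give a feel for what "equivalent arithmetic" can look like, here is a scalar sketch for a few of the ops above on `uint8_t`. It assumes the names mean what they suggest (a halving add computes the average rounded down, a saturating add clamps at the type maximum). The function names are hypothetical, and the formulas Halide actually emits may differ.

```cpp
#include <cstdint>

// Hypothetical scalar analogues on uint8_t; Halide's lowering works on Exprs
// and may choose different formulas.

// floor((a + b) / 2) without needing a wider type.
inline uint8_t halving_add_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>((a & b) + ((a ^ b) >> 1));
}

// floor((a + b + 1) / 2) without needing a wider type.
inline uint8_t rounding_halving_add_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>((a | b) - ((a ^ b) >> 1));
}

// a + b clamped to 255, computed by widening to 16 bits first.
inline uint8_t saturating_add_u8(uint8_t a, uint8_t b) {
    uint16_t wide = static_cast<uint16_t>(static_cast<uint16_t>(a) + b);
    return static_cast<uint8_t>(wide > 255 ? 255 : wide);
}

// a - b clamped at zero.
inline uint8_t saturating_sub_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(a > b ? a - b : 0);
}
```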

Stmt	find_intrinsics (const Stmt &s)
	Replace common arithmetic patterns with intrinsics.

Expr	find_intrinsics (const Expr &e)

Expr	lower_intrinsics (const Expr &e)
	The reverse of find_intrinsics.

Stmt	lower_intrinsics (const Stmt &s)

Stmt	flatten_nested_ramps (const Stmt &s)
	Take a statement/expression and replace nested ramps and broadcasts.

Expr	flatten_nested_ramps (const Expr &e)

template<typename Last >
void	check_types (const Tuple &t, int idx)

template<typename First , typename Second , typename... Rest>
void	check_types (const Tuple &t, int idx)

template<typename Last >
void	assign_results (Realization &r, int idx, Last last)

template<typename First , typename Second , typename... Rest>
void	assign_results (Realization &r, int idx, First first, Second second, Rest &&...rest)

void	schedule_scalar (Func f)

std::pair< std::vector< Function >, std::map< std::string, Function > >	deep_copy (const std::vector< Function > &outputs, const std::map< std::string, Function > &env)
	Deep copy an entire Function DAG.

Stmt	zero_gpu_loop_mins (const Stmt &s)
	Rewrite all GPU loops to have a min of zero.

Stmt	fuse_gpu_thread_loops (Stmt s)
	Converts Halide's GPGPU IR to the OpenCL/CUDA/Metal model.

Stmt	fuzz_float_stores (const Stmt &s)
	On every store of a floating point value, mask off the least-significant-bit of the mantissa.
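
Masking off the least significant mantissa bit can be pictured on a single host-side float32 as follows. This is a sketch assuming a 32-bit IEEE 754 float; `clear_mantissa_lsb` is a hypothetical name, and the real pass rewrites Store nodes in the IR rather than touching host values.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical host-side analogue for one float32 value.
inline float clear_mantissa_lsb(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    bits &= ~uint32_t{1};  // Bit 0 is the lowest mantissa bit.
    float out;
    std::memcpy(&out, &bits, sizeof(out));
    return out;
}
```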

void	generator_test ()

std::vector< Expr >	parameter_constraints (const Parameter &p)

template<typename T >
HALIDE_NO_USER_CODE_INLINE std::string	enum_to_string (const std::map< std::string, T > &enum_map, const T &t)

template<typename T >
T	enum_from_string (const std::map< std::string, T > &enum_map, const std::string &s)
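
The two enum helpers above both take a `std::map< std::string, T >`, so one plausible reading is a reverse lookup by value and a forward lookup by key. The sketch below uses hypothetical `_sketch` names and reports a missing entry through `std::optional`; how the real helpers report failure is not stated in this listing.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical variants that signal "not found" with std::nullopt.

// Reverse lookup: find the first name whose mapped value equals t.
template<typename T>
std::optional<std::string> enum_to_string_sketch(const std::map<std::string, T> &enum_map, const T &t) {
    for (const auto &[name, value] : enum_map) {
        if (value == t) {
            return name;
        }
    }
    return std::nullopt;
}

// Forward lookup: find the value registered under the name s.
template<typename T>
std::optional<T> enum_from_string_sketch(const std::map<std::string, T> &enum_map, const std::string &s) {
    auto it = enum_map.find(s);
    if (it == enum_map.end()) {
        return std::nullopt;
    }
    return it->second;
}
```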

const std::map< std::string, Halide::Type > &	get_halide_type_enum_map ()

std::string	halide_type_to_enum_string (const Type &t)

std::string	halide_type_to_c_source (const Type &t)

std::string	halide_type_to_c_type (const Type &t)

const GeneratorFactoryProvider &	get_registered_generators ()
	Return a GeneratorFactoryProvider that knows about all the currently-registered C++ Generators.

int	generate_filter_main (int argc, char **argv)
	generate_filter_main() is a convenient wrapper for GeneratorRegistry::create() + compile_to_files(); it can be trivially wrapped by a "real" main() to produce a command-line utility for ahead-of-time filter compilation.

int	generate_filter_main (int argc, char **argv, const GeneratorFactoryProvider &generator_factory_provider)
	This overload of generate_filter_main lets you provide your own provider for how to enumerate and/or create the generators based on registration name; this is useful if you want to re-use the 'main' logic but avoid the global Generator registry (e.g.

template<typename T >
T	parse_scalar (const std::string &value)

std::vector< Type >	parse_halide_type_list (const std::string &types)

void	execute_generator (const ExecuteGeneratorArgs &args)
	Execute a Generator for AOT compilation – this provides the implementation of the command-line Generator interface `generate_filter_main()`, but with a structured API that is more suitable for calling directly from code (vs command line).

Stmt	inject_hexagon_rpc (Stmt s, const Target &host_target, Module &module)
	Pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module.

Buffer< uint8_t >	compile_module_to_hexagon_shared_object (const Module &device_code)

Stmt	optimize_hexagon_shuffles (const Stmt &s, int lut_alignment)
	Replace indirect and other loads with simple loads + vlut calls.

Stmt	scatter_gather_generator (Stmt s)

Stmt	optimize_hexagon_instructions (Stmt s, const Target &t)
	Hexagon deinterleaves when performing widening operations, and interleaves when performing narrowing operations.

Expr	native_deinterleave (const Expr &x)
	Generate deinterleave or interleave operations, operating on groups of vectors at a time.

Expr	native_interleave (const Expr &x)

bool	is_native_deinterleave (const Expr &x)

bool	is_native_interleave (const Expr &x)

std::string	type_suffix (Type type, bool signed_variants=true)

std::string	type_suffix (const Expr &a, bool signed_variants=true)

std::string	type_suffix (const Expr &a, const Expr &b, bool signed_variants=true)

std::string	type_suffix (const std::vector< Expr > &ops, bool signed_variants=true)

std::vector< InferredArgument >	infer_arguments (const Stmt &body, const std::vector< Function > &outputs)

Stmt	call_extern_and_assert (const std::string &name, const std::vector< Expr > &args)
	A helper function to call an extern function, and assert that it returns 0.

Stmt	inject_host_dev_buffer_copies (Stmt s, const Target &t)
	Inject calls to halide_device_malloc, halide_copy_to_device, and halide_copy_to_host as needed.

Stmt	inline_function (Stmt s, const Function &f)
	Inline a single named function, which must be pure.

Expr	inline_function (Expr e, const Function &f)

void	inline_function (Function caller, const Function &f)

void	validate_schedule_inlined_function (Function f)
	Check if the schedule of an inlined function is legal, throwing an error if it is not.

template<typename T >
RefCount &	ref_count (const T *t) noexcept
	Because in this header we don't yet know how client classes store their RefCount (and we don't want to depend on the declarations of the client classes), any class that you want to hold onto via one of these must provide implementations of ref_count and destroy, which we forward-declare here.

template<typename T >
void	destroy (const T *t)

bool	equal_impl (const IRNode &a, const IRNode &b)

bool	graph_equal_impl (const IRNode &a, const IRNode &b)

bool	less_than_impl (const IRNode &a, const IRNode &b)

bool	graph_less_than_impl (const IRNode &a, const IRNode &b)

HALIDE_ALWAYS_INLINE bool	equal (const Expr &a, int b)
	Compare an Expr to an int literal.

HALIDE_ALWAYS_INLINE bool	equal (const IRNode &a, const IRNode &b)
	Check if two defined Stmts or Exprs are equal.

HALIDE_ALWAYS_INLINE bool	equal (const IRHandle &a, const IRHandle &b)
	Check if two possibly-undefined Stmts or Exprs are equal.

HALIDE_ALWAYS_INLINE bool	graph_equal (const IRNode &a, const IRNode &b)
	Check if two defined Stmts or Exprs are equal.

HALIDE_ALWAYS_INLINE bool	graph_equal (const IRHandle &a, const IRHandle &b)
	Check if two possibly-undefined Stmts or Exprs are equal.

HALIDE_ALWAYS_INLINE bool	less_than (const IRNode &a, const IRNode &b)
	Check if two defined Stmts or Exprs are in a lexicographic order.

HALIDE_ALWAYS_INLINE bool	less_than (const IRHandle &a, const IRHandle &b)
	Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.

HALIDE_ALWAYS_INLINE bool	graph_less_than (const IRNode &a, const IRNode &b)
	Check if two defined Stmts or Exprs are in a lexicographic order.

HALIDE_ALWAYS_INLINE bool	graph_less_than (const IRHandle &a, const IRHandle &b)
	Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.

void	ir_equality_test ()

bool	expr_match (const Expr &pattern, const Expr &expr, std::vector< Expr > &result)
	Does the first expression have the same structure as the second? Variables in the first expression with the name * are interpreted as wildcards, and their matching equivalent in the second expression is placed in the vector given as the third argument.

bool	expr_match (const Expr &pattern, const Expr &expr, std::map< std::string, Expr > &result)
	Does the first expression have the same structure as the second? Variables are matched consistently.

Expr	with_lanes (const Expr &x, int lanes)
	Rewrite the expression x to have `lanes` lanes.

void	expr_match_test ()

template<typename Mutator , typename... Args>
std::pair< Region, bool >	mutate_region (Mutator *mutator, const Region &bounds, Args &&...args)
	A helper function for mutator-like things to mutate regions.

bool	is_const (const Expr &e)
	Is the expression either an IntImm, a FloatImm, a StringImm, or a Cast of the same, or a Ramp or Broadcast of the same.

bool	is_const (const Expr &e, int64_t v)
	Is the expression an IntImm, FloatImm of a particular value, or a Cast, or Broadcast of the same.

std::optional< int64_t >	as_const_int (const Expr &e)
	If an expression is an IntImm or a Broadcast of an IntImm, return its value.

std::optional< uint64_t >	as_const_uint (const Expr &e)
	If an expression is a UIntImm or a Broadcast of a UIntImm, return its value.

std::optional< double >	as_const_float (const Expr &e)
	If an expression is a FloatImm or a Broadcast of a FloatImm, return its value.

std::optional< int >	is_const_power_of_two_integer (const Expr &e)
	Is the expression a constant integer power of two.

std::optional< int >	is_const_power_of_two_integer (uint64_t)

std::optional< int >	is_const_power_of_two_integer (int64_t)
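
The integer overloads above return `std::optional< int >`. The sketch below assumes the returned int is the exponent, and that a value which is not a power of two yields an empty optional; `power_of_two_exponent` is a hypothetical name used only for illustration.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper; assumes the result is the exponent k with v == 2^k.
inline std::optional<int> power_of_two_exponent(uint64_t v) {
    // A power of two has exactly one bit set.
    if (v == 0 || (v & (v - 1)) != 0) {
        return std::nullopt;
    }
    int exponent = 0;
    while ((v >>= 1) != 0) {
        ++exponent;
    }
    return exponent;
}
```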

bool	is_positive_const (const Expr &e)
	Is the expression a const (as defined by is_const), and also strictly greater than zero (in all lanes, if a vector expression)

bool	is_negative_const (const Expr &e)
	Is the expression a const (as defined by is_const), and also strictly less than zero (in all lanes, if a vector expression)

bool	is_undef (const Expr &e)
	Is the expression an undef.

bool	is_const_zero (const Expr &e)
	Is the expression a const (as defined by is_const), and also equal to zero (in all lanes, if a vector expression)

bool	is_const_one (const Expr &e)
	Is the expression a const (as defined by is_const), and also equal to one (in all lanes, if a vector expression)

bool	is_no_op (const Stmt &s)
	Is the statement a no-op (which we represent as either an undefined Stmt, or as an Evaluate node of a constant)

bool	is_pure (const Expr &e)
	Does the expression 1) take on the same value no matter where it appears in a Stmt, and 2) have no side-effects when evaluated?

Expr	make_const (Type t, int64_t val)
	Construct an immediate of the given type from any numeric C++ type.

Expr	make_const (Type t, uint64_t val)

Expr	make_const (Type t, double val)

Expr	make_const (Type t, int32_t val)

Expr	make_const (Type t, uint32_t val)

Expr	make_const (Type t, int16_t val)

Expr	make_const (Type t, uint16_t val)

Expr	make_const (Type t, int8_t val)

Expr	make_const (Type t, uint8_t val)

Expr	make_const (Type t, bool val)

Expr	make_const (Type t, float val)

Expr	make_const (Type t, float16_t val)

Expr	make_signed_integer_overflow (Type type)
	Construct a unique signed_integer_overflow Expr.

bool	is_signed_integer_overflow (const Expr &expr)
	Check if an expression is a signed_integer_overflow.

void	check_representable (Type t, int64_t val)
	Check if a constant value can be correctly represented as the given type.

Expr	make_bool (bool val, int lanes=1)
	Construct a boolean constant from a C++ boolean value.

Expr	make_zero (Type t)
	Construct the representation of zero in the given type.

Expr	make_one (Type t)
	Construct the representation of one in the given type.

Expr	make_two (Type t)
	Construct the representation of two in the given type.

Expr	const_true (int lanes=1)
	Construct the constant boolean true.

Expr	const_false (int lanes=1)
	Construct the constant boolean false.

Expr	lossless_cast (Type t, Expr e, std::map< Expr, ConstantInterval, ExprCompare > *cache=nullptr)
	Attempt to cast an expression to a smaller type while provably not losing information.

Expr	lossless_negate (const Expr &x)
	Attempt to negate x without introducing new IR and without overflow.

void	match_types (Expr &a, Expr &b)
	Coerce the two expressions to have the same type, using C-style casting rules.

void	match_types_bitwise (Expr &a, Expr &b, const char *op_name)
	Asserts that both expressions are integer types and are either both signed or both unsigned.

Expr	halide_log (const Expr &a)
	Halide's vectorizable transcendentals.

Expr	halide_exp (const Expr &a)

Expr	halide_erf (const Expr &a)

Expr	raise_to_integer_power (Expr a, int64_t b)
	Raise an expression to an integer power by repeatedly multiplying it by itself.
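
One common way to raise a value to an integer power with few multiplications is exponentiation by squaring. The sketch below applies it to a plain `double`; it is an illustration of the general technique, not a claim about the exact multiplication order Halide uses, and `pow_int_sketch` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical scalar analogue using exponentiation by squaring.
// The most negative int64_t exponent is not handled.
inline double pow_int_sketch(double a, int64_t b) {
    if (b < 0) {
        return 1.0 / pow_int_sketch(a, -b);
    }
    double result = 1.0;
    while (b > 0) {
        if (b & 1) {
            result *= a;  // This bit of the exponent contributes a factor.
        }
        a *= a;  // Square the base for the next bit.
        b >>= 1;
    }
    return result;
}
```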

void	split_into_ands (const Expr &cond, std::vector< Expr > &result)
	Split a boolean condition into vector of ANDs.
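
Splitting a condition into its conjuncts is a short recursion over nested And nodes. The sketch below uses a hypothetical miniature expression tree (`Node`, `leaf`, `make_and`) rather than Halide's Expr, and assumes the operands are collected left to right.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature expression tree; not Halide's Expr.
struct Node {
    bool is_and = false;
    std::string name;            // Leaf label, used when !is_and.
    std::shared_ptr<Node> a, b;  // Operands, used when is_and.
};
using NodePtr = std::shared_ptr<Node>;

inline NodePtr leaf(const std::string &name) {
    auto n = std::make_shared<Node>();
    n->name = name;
    return n;
}

inline NodePtr make_and(NodePtr a, NodePtr b) {
    auto n = std::make_shared<Node>();
    n->is_and = true;
    n->a = std::move(a);
    n->b = std::move(b);
    return n;
}

// Append every non-And operand of a nested conjunction to result.
inline void split_into_ands_sketch(const NodePtr &cond, std::vector<NodePtr> &result) {
    if (cond->is_and) {
        split_into_ands_sketch(cond->a, result);
        split_into_ands_sketch(cond->b, result);
    } else {
        result.push_back(cond);
    }
}
```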

Expr	strided_ramp_base (const Expr &e, int stride=1)
	If e is a ramp expression with stride, default 1, return the base, otherwise undefined.

template<typename T >
T	mod_imp (T a, T b)
	Implementations of division and mod that are specific to Halide.

template<typename T >
T	div_imp (T a, T b)

template<>
float	mod_imp< float > (float a, float b)

template<>
double	mod_imp< double > (double a, double b)

template<>
float	div_imp< float > (float a, float b)

template<>
double	div_imp< double > (double a, double b)
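
The listing does not spell out how the Halide-specific integer semantics differ from C++. The sketch below assumes the commonly documented Halide convention: integer remainders are never negative (Euclidean division), and division or mod by zero yields zero. The names `euclid_div` and `euclid_mod` are hypothetical, and the overflowing case of the most negative value divided by -1 is not handled.

```cpp
#include <cstdint>

// Hypothetical scalar sketch of Euclidean division and modulo.
// Assumption: x / 0 and x % 0 are defined to be 0.
inline int64_t euclid_mod(int64_t a, int64_t b) {
    if (b == 0) {
        return 0;
    }
    int64_t r = a % b;  // C++ remainder takes the sign of a.
    if (r < 0) {
        r += (b < 0 ? -b : b);  // Shift into [0, |b|).
    }
    return r;
}

inline int64_t euclid_div(int64_t a, int64_t b) {
    if (b == 0) {
        return 0;
    }
    int64_t q = a / b;  // C++ division truncates toward zero.
    if (a % b < 0) {
        q += (b < 0 ? 1 : -1);  // Adjust so that a == b * q + euclid_mod(a, b).
    }
    return q;
}
```

With these definitions, -7 divided by 3 gives quotient -3 and remainder 2, whereas plain C++ gives -2 and -1.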

Expr	remove_likelies (const Expr &e)
	Return an Expr that is identical to the input Expr, but with all calls to likely() and likely_if_innermost() removed.

Stmt	remove_likelies (const Stmt &s)
	Return a Stmt that is identical to the input Stmt, but with all calls to likely() and likely_if_innermost() removed.

Expr	remove_promises (const Expr &e)
	Return an Expr that is identical to the input Expr, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.

Stmt	remove_promises (const Stmt &s)
	Return a Stmt that is identical to the input Stmt, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.

Expr	unwrap_tags (const Expr &e)
	If the expression is a tag helper call, remove it and return the tagged expression.

HALIDE_NO_USER_CODE_INLINE void	collect_print_args (std::vector< Expr > &args)

template<typename... Args>
HALIDE_NO_USER_CODE_INLINE void	collect_print_args (std::vector< Expr > &args, const char *arg, Args &&...more_args)

template<typename... Args>
HALIDE_NO_USER_CODE_INLINE void	collect_print_args (std::vector< Expr > &args, Expr arg, Args &&...more_args)

Expr	requirement_failed_error (Expr condition, const std::vector< Expr > &args)

Expr	memoize_tag_helper (Expr result, const std::vector< Expr > &cache_key_values)

void	reset_random_counters ()
	Reset the counters used for random-number seeds in random_float/int/uint.

Expr	unreachable (Type t=Int(32))
	Return an expression that should never be evaluated.

template<typename T >
Expr	unreachable ()

Expr	promise_clamped (const Expr &value, const Expr &min, const Expr &max)
	FOR INTERNAL USE ONLY.

std::ostream &	operator<< (std::ostream &stream, IRNodeType)
	Emit a halide node type on an output stream (such as std::cout) in human-readable form.

std::ostream &	operator<< (std::ostream &stream, const AssociativePattern &)
	Emit a halide associative pattern on an output stream (such as std::cout) in a human-readable form.

std::ostream &	operator<< (std::ostream &stream, const AssociativeOp &)
	Emit a halide associative op on an output stream (such as std::cout) in a human-readable form.

std::ostream &	operator<< (std::ostream &stream, const ForType &)
	Emit a halide for loop type (vectorized, serial, etc) in a human readable form.

std::ostream &	operator<< (std::ostream &stream, const VectorReduce::Operator &)
	Emit a horizontal vector reduction op in human-readable form.

std::ostream &	operator<< (std::ostream &stream, const NameMangling &)
	Emit a halide name mangling value in a human readable format.

std::ostream &	operator<< (std::ostream &stream, const LinkageType &)
	Emit a halide linkage value in a human readable format.

std::ostream &	operator<< (std::ostream &stream, const DimType &)
	Emit a halide dimension type in human-readable format.

std::ostream &	operator<< (std::ostream &out, const Closure &c)
	Emit a Closure in human-readable form.

std::ostream &	operator<< (std::ostream &out, const Interval &c)
	Emit an Interval in human-readable form.

std::ostream &	operator<< (std::ostream &out, const ConstantInterval &c)
	Emit a ConstantInterval in human-readable form.

std::ostream &	operator<< (std::ostream &out, const ModulusRemainder &c)
	Emit a ModulusRemainder in human-readable form.

std::ostream &	operator<< (std::ostream &stream, const Indentation &)

void *	get_symbol_address (const char *s)

Expr	lower_lerp (Type final_type, Expr zero_val, Expr one_val, const Expr &weight, const Target &target)
	Build Halide IR that computes a lerp.

Stmt	hoist_loop_invariant_values (Stmt)
	Hoist loop-invariants out of inner loops.

Stmt	hoist_loop_invariant_if_statements (Stmt)
	Just hoist loop-invariant if statements as far up as possible.

template<typename T >
auto	iterator_to_pointer (T iter) -> decltype(&*std::declval< T >())

std::string	get_llvm_function_name (const llvm::Function *f)

std::string	get_llvm_function_name (const llvm::Function &f)

llvm::StructType *	get_llvm_struct_type_by_name (llvm::Module *module, const char *name)

llvm::Triple	get_triple_for_target (const Target &target)
	Return the llvm::Triple that corresponds to the given Halide Target.

std::unique_ptr< llvm::Module >	get_initial_module_for_target (Target, llvm::LLVMContext *, bool for_shared_jit_runtime=false, bool just_gpu=false)
	Create an llvm module containing the support code for a given target.

std::unique_ptr< llvm::Module >	get_initial_module_for_ptx_device (Target, llvm::LLVMContext *c)
	Create an llvm module containing the support code for ptx device.

void	add_bitcode_to_module (llvm::LLVMContext *context, llvm::Module &module, const std::vector< uint8_t > &bitcode, const std::string &name)
	Link a block of llvm bitcode into an llvm module.

std::unique_ptr< llvm::Module >	link_with_wasm_jit_runtime (llvm::LLVMContext *c, const Target &t, std::unique_ptr< llvm::Module > extra_module)
	Take the llvm::Module(s) in extra_modules (if any), add the runtime modules needed for the WASM JIT, and link into a single llvm::Module.

Stmt	loop_carry (Stmt, int max_carried_values=8)
	Reuse loads done on previous loop iterations by stashing them in induction variables instead of redoing the load.

Module	lower (const std::vector< Function > &output_funcs, const std::string &pipeline_name, const Target &t, const std::vector< Argument > &args, LinkageType linkage_type, const std::vector< Stmt > &requirements=std::vector< Stmt >(), bool trace_pipeline=false, const std::vector< IRMutator * > &custom_passes=std::vector< IRMutator * >())
	Given a vector of scheduled halide functions, create a Module that evaluates it.

Stmt	lower_main_stmt (const std::vector< Function > &output_funcs, const std::string &pipeline_name, const Target &t, const std::vector< Stmt > &requirements=std::vector< Stmt >(), bool trace_pipeline=false, const std::vector< IRMutator * > &custom_passes=std::vector< IRMutator * >())
	Given a halide function with a schedule, create a statement that evaluates it.

void	lower_test ()

Stmt	lower_parallel_tasks (const Stmt &s, std::vector< LoweredFunc > &closure_implementations, const std::string &name, const Target &t)

Stmt	lower_warp_shuffles (Stmt s, const Target &t)
	Rewrite access to things stored outside the loop over GPU lanes to use nvidia's warp shuffle instructions.

Stmt	inject_memoization (const Stmt &s, const std::map< std::string, Function > &env, const std::string &name, const std::vector< Function > &outputs)
	Transform pipeline calls for Funcs scheduled with memoize to do a lookup call to the runtime cache implementation, and if there is a miss, compute the results and call the runtime to store it back to the cache.

Stmt	rewrite_memoized_allocations (const Stmt &s, const std::map< std::string, Function > &env)
	This should be called after Storage Flattening has added Allocation IR nodes.

std::map< OutputFileType, const OutputInfo >	get_output_info (const Target &target)

ModulusRemainder	operator+ (const ModulusRemainder &a, const ModulusRemainder &b)

ModulusRemainder	operator- (const ModulusRemainder &a, const ModulusRemainder &b)

ModulusRemainder	operator* (const ModulusRemainder &a, const ModulusRemainder &b)

ModulusRemainder	operator/ (const ModulusRemainder &a, const ModulusRemainder &b)

ModulusRemainder	operator% (const ModulusRemainder &a, const ModulusRemainder &b)

ModulusRemainder	operator+ (const ModulusRemainder &a, int64_t b)

ModulusRemainder	operator- (const ModulusRemainder &a, int64_t b)

ModulusRemainder	operator* (const ModulusRemainder &a, int64_t b)

ModulusRemainder	operator/ (const ModulusRemainder &a, int64_t b)

ModulusRemainder	operator% (const ModulusRemainder &a, int64_t b)

ModulusRemainder	modulus_remainder (const Expr &e)
	For things like alignment analysis, often it's helpful to know if an integer expression is some multiple of a constant plus some other constant.
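
To make the "multiple of a constant plus some other constant" idea concrete, here is a minimal standalone sketch. It is not Halide's implementation: the struct and helper names are hypothetical, and it only handles positive moduli.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an alignment fact:
// value == modulus * k + remainder for some integer k.
struct SketchModRem {
    int64_t modulus;
    int64_t remainder;
};

// Greatest common divisor of two positive integers (Euclid's algorithm).
inline int64_t sketch_gcd_pos(int64_t a, int64_t b) {
    while (b != 0) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// x = m1*k1 + r1 and y = m2*k2 + r2 imply
// x + y == r1 + r2 (mod gcd(m1, m2)).
inline SketchModRem sketch_add(const SketchModRem &a, const SketchModRem &b) {
    assert(a.modulus > 0 && b.modulus > 0);
    int64_t m = sketch_gcd_pos(a.modulus, b.modulus);
    int64_t r = (a.remainder + b.remainder) % m;
    if (r < 0) r += m;  // keep the remainder in [0, m)
    return {m, r};
}

// c * (m*k + r) == (c*m)*k + c*r, for a positive constant c.
inline SketchModRem sketch_scale(const SketchModRem &a, int64_t c) {
    assert(a.modulus > 0 && c > 0);
    return {a.modulus * c, a.remainder * c};
}
```

For example, adding a value known to be 3 modulo 8 to a value known to be 2 modulo 12 yields a value known to be 1 modulo 4 under this sketch.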

ModulusRemainder	modulus_remainder (const Expr &e, const Scope< ModulusRemainder > &scope)
	If we have alignment information about external variables, we can let the analysis know about that using this version of modulus_remainder:

void	modulus_remainder_test ()

int64_t	gcd (int64_t, int64_t)
	The greatest common divisor of two integers.

int64_t	lcm (int64_t, int64_t)
	The least common multiple of two integers.
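
A minimal sketch of how these two quantities are commonly computed, using Euclid's algorithm. The function names are hypothetical, and the handling of zero and negative inputs is an assumption of this sketch rather than documented Halide behavior.

```cpp
#include <cassert>
#include <cstdint>

// Absolute value with an explicit guard against the one unrepresentable case.
inline int64_t sketch_abs(int64_t v) {
    assert(v != INT64_MIN);  // negating INT64_MIN would overflow
    return v < 0 ? -v : v;
}

// Euclid's algorithm on the absolute values; sketch_gcd(0, n) is |n|.
inline int64_t sketch_gcd(int64_t a, int64_t b) {
    a = sketch_abs(a);
    b = sketch_abs(b);
    while (b != 0) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// lcm(a, b) = |a / gcd(a, b) * b|; dividing first keeps intermediates small.
// This sketch defines the result as 0 when either input is 0.
inline int64_t sketch_lcm(int64_t a, int64_t b) {
    if (a == 0 || b == 0) return 0;
    return sketch_abs(a / sketch_gcd(a, b) * b);
}
```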

ConstantInterval	derivative_bounds (const Expr &e, const std::string &var, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope())
	Find the bounds of the derivative of an expression.

Monotonic	is_monotonic (const Expr &e, const std::string &var, const Scope< ConstantInterval > &scope=Scope< ConstantInterval >::empty_scope())

Monotonic	is_monotonic (const Expr &e, const std::string &var, const Scope< Monotonic > &scope)

std::ostream &	operator<< (std::ostream &stream, const Monotonic &m)
	Emit the monotonic class in human-readable form for debugging.

void	is_monotonic_test ()

Stmt	inject_gpu_offload (const Stmt &s, const Target &host_target)
	Pull loops marked with GPU device APIs to a separate module, and call them through the appropriate host runtime module.

Stmt	optimize_shuffles (Stmt s, int lut_alignment)

bool	can_parallelize_rvar (const std::string &rvar, const std::string &func, const Definition &r)
	Returns whether or not Halide can prove that it is safe to parallelize an update definition across a specific variable.

void	check_call_arg_types (const std::string &name, std::vector< Expr > *args, int dims)
	Validate arguments to a call to a func, image or imageparam.

bool	has_uncaptured_likely_tag (const Expr &e, const Scope<> &scope)
	Return true if an expression uses a likely tag that isn't captured by an enclosing Select, Min, or Max.

bool	has_likely_tag (const Expr &e, const Scope<> &scope)
	Return true if an expression uses a likely tag.

Stmt	partition_loops (Stmt s)
	Partitions loop bodies into a prologue, a steady state, and an epilogue.

Stmt	inject_placeholder_prefetch (const Stmt &s, const std::map< std::string, Function > &env, const std::string &prefix, const std::vector< PrefetchDirective > &prefetches)
	Inject placeholder prefetches to 's'.

Stmt	inject_prefetch (const Stmt &s, const std::map< std::string, Function > &env)
	Compute the actual region to be prefetched and place it in the placeholder prefetch.

Stmt	reduce_prefetch_dimension (Stmt stmt, const Target &t)
	Reduce a multi-dimensional prefetch into a prefetch of lower dimension (max dimension of the prefetch is specified by target architecture).

Stmt	hoist_prefetches (const Stmt &s)
	Hoist all the prefetches in a Block to the beginning of the Block.

std::string	print_loop_nest (const std::vector< Function > &output_funcs)
	Emit some simple pseudocode that shows the structure of the loop nest specified by this pipeline's schedule, and the schedules of the functions it uses.

Stmt	inject_profiling (const Stmt &, const std::string &, const std::map< std::string, Function > &env)
	Take a statement representing a Halide pipeline and insert high-resolution timing into the generated code (via spawning a thread that acts as a sampling profiler); summaries of execution times and counts will be logged at the end.

Expr	purify_index_math (const Expr &)
	Bounds inference and related stages can lift integer bounds expressions out of if statements that guard against those integer expressions doing side-effecty things like dividing or modding by zero.

Expr	qualify (const std::string &prefix, const Expr &value)
	Prefix all variable names in the given expression with the prefix string.

Expr	random_float (const std::vector< Expr > &)
	Return a random floating-point number between zero and one that varies deterministically based on the input expressions.

Expr	random_int (const std::vector< Expr > &)
	Return a random unsigned integer between zero and 2^32-1 that varies deterministically based on the input expressions (which must be integers or unsigned integers).

Expr	lower_random (const Expr &e, const std::vector< VarOrRVar > &free_vars, int tag)
	Convert calls to random() to IR generated by random_float and random_int.

std::pair< std::vector< std::string >, std::vector< std::vector< std::string > > >	realization_order (const std::vector< Function > &outputs, std::map< std::string, Function > &env)
	Given a bunch of functions that call each other, determine an order in which to do the scheduling.

std::vector< std::string >	topological_order (const std::vector< Function > &outputs, const std::map< std::string, Function > &env)
	Given a bunch of functions that call each other, determine a topological order which stays constant regardless of the schedule.

Stmt	rebase_loops_to_zero (const Stmt &)
	Rewrite the mins of most loops to 0.

void	split_predicate_test ()

bool	is_func_trivial_to_inline (const Function &func)
	Return true if the cost of inlining a function is equivalent to the cost of calling the function directly.

Stmt	remove_dead_allocations (const Stmt &s)
	Find Allocate/Free pairs that are never loaded from or stored to, and remove them from the Stmt.

Stmt	remove_extern_loops (const Stmt &s)
	Removes placeholder loops for extern stages.

Stmt	remove_undef (Stmt s)
	Removes stores that depend on undef values, and statements that only contain such stores.

Stmt	schedule_functions (const std::vector< Function > &outputs, const std::vector< std::vector< std::string > > &fused_groups, const std::map< std::string, Function > &env, const Target &target, bool &any_memoized)
	Build loop nests and inject Function realizations at the appropriate places using the schedule.

template<typename T >
std::ostream &	operator<< (std::ostream &stream, const Scope< T > &s)

Stmt	select_gpu_api (const Stmt &s, const Target &t)
	Replace for loops with GPU_Default device_api with an actual device API depending on what's enabled in the target.

Stmt	simplify (const Stmt &, bool remove_dead_code=true, const Scope< Interval > &bounds=Scope< Interval >::empty_scope(), const Scope< ModulusRemainder > &alignment=Scope< ModulusRemainder >::empty_scope(), const std::vector< Expr > &assumptions=std::vector< Expr >())
	Perform a wide range of simplifications to expressions and statements, including constant folding, substituting in trivial values, arithmetic rearranging, etc.

Expr	simplify (const Expr &, bool remove_dead_code=true, const Scope< Interval > &bounds=Scope< Interval >::empty_scope(), const Scope< ModulusRemainder > &alignment=Scope< ModulusRemainder >::empty_scope(), const std::vector< Expr > &assumptions=std::vector< Expr >())

bool	can_prove (Expr e, const Scope< Interval > &bounds=Scope< Interval >::empty_scope())
	Attempt to statically prove an expression is true using the simplifier.

Stmt	simplify_exprs (const Stmt &)
	Simplify expressions found in a statement, but don't simplify across different statements.

Stmt	simplify_correlated_differences (const Stmt &)
	Symbolic interval arithmetic can be extremely conservative in cases where we analyze the difference between two correlated expressions.

Expr	bound_correlated_differences (const Expr &expr)
	Refactor the expression to remove correlated differences or rewrite them in a form that is more amenable to bounds inference.

void	simplify_specializations (std::map< std::string, Function > &env)
	Try to simplify the RHS/LHS of a function's definition based on its specializations.

Stmt	skip_stages (const Stmt &s, const std::vector< Function > &outputs, const std::vector< std::vector< std::string > > &order, const std::map< std::string, Function > &env)
	Avoid computing certain stages if we can infer a runtime condition to check that tells us they won't be used.

Stmt	sliding_window (const Stmt &s, const std::map< std::string, Function > &env)
	Perform sliding window optimizations on a halide statement.

SolverResult	solve_expression (const Expr &e, const std::string &variable, const Scope< Expr > &scope=Scope< Expr >::empty_scope())
	Attempts to collect all instances of a variable in an expression tree and place them as far to the left as possible, and as far up the tree as possible.

Interval	solve_for_outer_interval (const Expr &c, const std::string &variable)
	Find the smallest interval such that the condition is either true or false inside of it, but definitely false outside of it.

Interval	solve_for_inner_interval (const Expr &c, const std::string &variable)
	Find the largest interval such that the condition is definitely true inside of it, and might be true or false outside of it.

Expr	and_condition_over_domain (const Expr &c, const Scope< Interval > &varying)
	Take a conditional that includes variables that vary over some domain, and convert it to a more conservative (less frequently true) condition that doesn't depend on those variables.

void	solve_test ()

void	spirv_ir_test ()
	Internal test for SPIR-V IR.

Stmt	split_tuples (const Stmt &s, const std::map< std::string, Function > &env)
	Rewrite all tuple-valued Realizations, Provide nodes, and Call nodes into several scalar-valued ones, so that later lowering passes only need to think about scalar-valued productions.

Stmt	stage_strided_loads (const Stmt &s)
	Convert all unpredicated strided loads in a Stmt into dense loads followed by shuffles.

void	print_to_stmt_html (const std::string &html_output_filename, const Module &m, const std::string &assembly_input_filename="")
	Dump an HTML-formatted visualization of a Module to filename.

void	print_to_conceptual_stmt_html (const std::string &html_output_filename, const Module &m, const std::string &assembly_input_filename="")
	Dump an HTML-formatted visualization of a Module's conceptual Stmt code to filename.

Stmt	storage_flattening (Stmt s, const std::vector< Function > &outputs, const std::map< std::string, Function > &env, const Target &target)
	Take a statement with multi-dimensional Realize, Provide, and Call nodes, and turn it into a statement with single-dimensional Allocate, Store, and Load nodes respectively.

Stmt	storage_folding (const Stmt &s, const std::map< std::string, Function > &env)
	Fold storage of functions if possible.

bool	strictify_float (std::map< std::string, Function > &env, const Target &t)
	Propagate strict_float intrinsics such that they immediately wrap all floating-point expressions.

Stmt	strip_asserts (const Stmt &s)

Expr	substitute (const std::string &name, const Expr &replacement, const Expr &expr)
	Substitute variables with the given name with the replacement expression within expr.

Stmt	substitute (const std::string &name, const Expr &replacement, const Stmt &stmt)
	Substitute variables with the given name with the replacement expression within stmt.

Expr	substitute (const std::map< std::string, Expr > &replacements, const Expr &expr)
	Substitute variables with names in the map.

Stmt	substitute (const std::map< std::string, Expr > &replacements, const Stmt &stmt)

Expr	substitute (const Expr &find, const Expr &replacement, const Expr &expr)
	Substitute expressions for other expressions.

Stmt	substitute (const Expr &find, const Expr &replacement, const Stmt &stmt)

Expr	graph_substitute (const std::string &name, const Expr &replacement, const Expr &expr)
	Substitutions where the IR may be a general graph (and not just a DAG).

Stmt	graph_substitute (const std::string &name, const Expr &replacement, const Stmt &stmt)

Expr	graph_substitute (const Expr &find, const Expr &replacement, const Expr &expr)

Stmt	graph_substitute (const Expr &find, const Expr &replacement, const Stmt &stmt)

Expr	substitute_in_all_lets (const Expr &expr)
	Substitute in all let Exprs in a piece of IR.

Stmt	substitute_in_all_lets (const Stmt &stmt)

void	target_test ()

void	lower_target_query_ops (std::map< std::string, Function > &env, const Target &t)

Stmt	inject_tracing (Stmt, const std::string &pipeline_name, bool trace_pipeline, const std::map< std::string, Function > &env, const std::vector< Function > &outputs, const Target &Target)
	Take a statement representing a halide pipeline, inject calls to tracing functions at interesting points, such as allocations.

Stmt	trim_no_ops (Stmt s)
	Truncate loop bounds to the region over which they actually do something.

Stmt	unify_duplicate_lets (const Stmt &s)
	Find let statements that all define the same value, and make later ones just reuse the symbol names of the earlier ones.

Stmt	uniquify_variable_names (const Stmt &s)
	Modify a statement so that every internally-defined variable name is unique.

void	uniquify_variable_names_test ()

Stmt	unpack_buffers (Stmt s)
	Creates let stmts for the various buffer components.

Stmt	unroll_loops (const Stmt &)
	Take a statement with for loops marked for unrolling, and convert each into several copies of the innermost statement.

Stmt	lower_unsafe_promises (const Stmt &s, const Target &t)
	Lower all unsafe promises into either assertions or unchecked code, depending on the target.

Stmt	lower_safe_promises (const Stmt &s)
	Lower all safe promises by just stripping them.

template<typename DST , typename SRC , typename std::enable_if< std::is_floating_point< SRC >::value >::type * = nullptr>
DST	safe_numeric_cast (SRC s)
	Some numeric conversions are UB if the value won't fit in the result; safe_numeric_cast<>() is meant as a drop-in replacement for a C/C++ cast that adds well-defined behavior for the UB cases, attempting to mimic common implementation behavior as much as possible.
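
One way to give an out-of-range floating-point conversion well-defined behavior is to saturate at the limits of the destination type. The sketch below shows that approach for double to int32_t only. The function name is hypothetical, and both the saturation policy and the NaN result are assumptions of this sketch; the listing does not say which behavior safe_numeric_cast actually picks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical saturating conversion. A plain static_cast<int32_t>(s) is
// undefined behavior when the truncated value does not fit in int32_t.
inline int32_t sketch_saturating_cast_i32(double s) {
    if (s != s) return 0;  // NaN: this sketch arbitrarily picks 0
    if (s <= static_cast<double>(INT32_MIN)) return INT32_MIN;
    if (s >= static_cast<double>(INT32_MAX)) return INT32_MAX;
    return static_cast<int32_t>(s);  // in range, so the cast is well defined
}

inline void sketch_saturating_cast_example() {
    assert(sketch_saturating_cast_i32(1e20) == INT32_MAX);
    assert(sketch_saturating_cast_i32(-1e20) == INT32_MIN);
    assert(sketch_saturating_cast_i32(42.9) == 42);  // truncates toward zero
}
```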

template<typename DstType , typename SrcType >
DstType	reinterpret_bits (const SrcType &src)
	An aggressive form of reinterpret cast used for correct type-punning.
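
A common way to do type-punning without violating aliasing rules is to copy the bytes with std::memcpy. The sketch below shows that technique under a hypothetical name; whether reinterpret_bits is implemented this way is not stated in this listing.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy the object representation of src into a value of type DstType.
template<typename DstType, typename SrcType>
DstType sketch_reinterpret_bits(const SrcType &src) {
    static_assert(sizeof(DstType) == sizeof(SrcType),
                  "source and destination must have the same size");
    DstType dst;
    std::memcpy(&dst, &src, sizeof(dst));
    return dst;
}

inline void sketch_reinterpret_bits_example() {
    // Assumes float is a 32-bit IEEE 754 value, where 1.0f is 0x3f800000.
    uint32_t bits = sketch_reinterpret_bits<uint32_t>(1.0f);
    assert(bits == 0x3f800000u);
    float back = sketch_reinterpret_bits<float>(bits);
    assert(back == 1.0f);
}
```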

std::string	get_env_variable (char const *env_var_name)
	Get value of an environment variable.

std::string	running_program_name ()
	Get the name of the currently running executable.

std::string	unique_name (char prefix)
	Generate a unique name starting with the given prefix.

std::string	unique_name (const std::string &prefix)

bool	starts_with (const std::string &str, const std::string &prefix)
	Test if the first string starts with the second string.

bool	ends_with (const std::string &str, const std::string &suffix)
	Test if the first string ends with the second string.

std::string	replace_all (const std::string &str, const std::string &find, const std::string &replace)
	Replace all matches of the second string in the first string with the last string.

std::vector< std::string >	split_string (const std::string &source, const std::string &delim)
	Split the source string using 'delim' as the divider.

template<typename T >
std::string	join_strings (const std::vector< T > &sources, const std::string &delim)
	Join the source vector using 'delim' as the divider.
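
The string helpers above can be approximated with the standard library alone. The following is a sketch with hypothetical names; edge-case behavior (an empty delimiter, empty fields between adjacent delimiters) is an assumption here and may differ from Halide's functions.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

inline bool sketch_starts_with(const std::string &str, const std::string &prefix) {
    return str.size() >= prefix.size() &&
           str.compare(0, prefix.size(), prefix) == 0;
}

inline bool sketch_ends_with(const std::string &str, const std::string &suffix) {
    return str.size() >= suffix.size() &&
           str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Splits on every occurrence of delim; adjacent delimiters yield empty fields.
inline std::vector<std::string> sketch_split_string(const std::string &source,
                                                    const std::string &delim) {
    std::vector<std::string> out;
    if (delim.empty()) {  // sketch choice: no delimiter means no splitting
        out.push_back(source);
        return out;
    }
    std::size_t start = 0;
    for (;;) {
        std::size_t pos = source.find(delim, start);
        if (pos == std::string::npos) {
            out.push_back(source.substr(start));
            break;
        }
        out.push_back(source.substr(start, pos - start));
        start = pos + delim.size();
    }
    return out;
}

// Streams each element, so T only needs an operator<<.
template<typename T>
std::string sketch_join_strings(const std::vector<T> &sources, const std::string &delim) {
    std::ostringstream oss;
    for (std::size_t i = 0; i < sources.size(); ++i) {
        if (i > 0) oss << delim;
        oss << sources[i];
    }
    return oss.str();
}

inline void sketch_string_example() {
    std::vector<std::string> parts = sketch_split_string("a,b,,c", ",");
    assert(parts.size() == 4 && parts[2].empty());
    assert(sketch_join_strings(parts, ",") == "a,b,,c");
}
```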

template<typename T , typename Fn >
T	fold_left (const std::vector< T > &vec, Fn f)
	Perform a left fold of a vector.

template<typename T , typename Fn >
T	fold_right (const std::vector< T > &vec, Fn f)
	Returns a right fold of a vector.
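
The difference between the two folds only shows up with a non-associative operator. A minimal sketch with hypothetical names follows; returning a default-constructed T for an empty vector is an assumption of this sketch, not documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Left fold: f(f(f(v0, v1), v2), v3) ...
template<typename T, typename Fn>
T sketch_fold_left(const std::vector<T> &vec, Fn f) {
    if (vec.empty()) return T();
    T acc = vec[0];
    for (std::size_t i = 1; i < vec.size(); ++i) {
        acc = f(acc, vec[i]);
    }
    return acc;
}

// Right fold: f(v0, f(v1, f(v2, v3))) ...
template<typename T, typename Fn>
T sketch_fold_right(const std::vector<T> &vec, Fn f) {
    if (vec.empty()) return T();
    T acc = vec.back();
    for (std::size_t i = vec.size() - 1; i > 0; --i) {
        acc = f(vec[i - 1], acc);
    }
    return acc;
}

inline void sketch_fold_example() {
    std::vector<int> v;
    v.push_back(10);
    v.push_back(3);
    v.push_back(2);
    auto sub = [](int a, int b) { return a - b; };
    assert(sketch_fold_left(v, sub) == 5);   // (10 - 3) - 2
    assert(sketch_fold_right(v, sub) == 9);  // 10 - (3 - 2)
}
```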

std::string	extract_namespaces (const std::string &name, std::vector< std::string > &namespaces)
	Returns base name and fills in namespaces, outermost one first in vector.

std::string	strip_namespaces (const std::string &name)
	Like extract_namespaces(), but strip and discard the namespaces, returning base name only.
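
As an illustration of the contract described above (base name returned, namespaces filled outermost first), here is a sketch under a hypothetical name. It assumes that names are separated by "::" and that the output vector is cleared first; neither detail is stated in this listing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits "A::B::name" into namespaces {"A", "B"} and returns "name".
inline std::string sketch_extract_namespaces(const std::string &name,
                                             std::vector<std::string> &namespaces) {
    namespaces.clear();
    std::size_t start = 0;
    for (;;) {
        std::size_t pos = name.find("::", start);
        if (pos == std::string::npos) break;
        namespaces.push_back(name.substr(start, pos - start));
        start = pos + 2;  // skip past the "::" separator
    }
    return name.substr(start);
}

inline void sketch_extract_namespaces_example() {
    std::vector<std::string> ns;
    std::string base = sketch_extract_namespaces("Halide::Internal::simplify", ns);
    assert(base == "simplify");
    assert(ns.size() == 2 && ns[0] == "Halide" && ns[1] == "Internal");
}
```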

std::string	file_make_temp (const std::string &prefix, const std::string &suffix)
	Create a unique file with a name of the form prefixXXXXXsuffix in an arbitrary (but writable) directory; this is typically /tmp, but the specific location is not guaranteed.

std::string	dir_make_temp ()
	Create a unique directory in an arbitrary (but writable) directory; this is typically somewhere inside /tmp, but the specific location is not guaranteed.

bool	file_exists (const std::string &name)
	Wrapper for access().

void	assert_file_exists (const std::string &name)
	assert-fail if the file doesn't exist.

void	assert_no_file_exists (const std::string &name)
	assert-fail if the file DOES exist.

void	file_unlink (const std::string &name)
	Wrapper for unlink().

void	ensure_no_file_exists (const std::string &name)
	Ensure that no file with this path exists.

void	dir_rmdir (const std::string &name)
	Wrapper for rmdir().

FileStat	file_stat (const std::string &name)
	Wrapper for stat().

std::vector< char >	read_entire_file (const std::string &pathname)
	Read the entire contents of a file into a vector<char>.

void	write_entire_file (const std::string &pathname, const void *source, size_t source_len)
	Create or replace the contents of a file with a given pointer-and-length of memory.

void	write_entire_file (const std::string &pathname, const std::vector< char > &source)

bool	add_would_overflow (int bits, int64_t a, int64_t b)
	Routines to test if math would overflow for signed integers with the given number of bits.

bool	sub_would_overflow (int bits, int64_t a, int64_t b)

bool	mul_would_overflow (int bits, int64_t a, int64_t b)

HALIDE_MUST_USE_RESULT bool	add_with_overflow (int bits, int64_t a, int64_t b, int64_t *result)
	Routines to perform arithmetic on signed types without triggering signed overflow.

HALIDE_MUST_USE_RESULT bool	sub_with_overflow (int bits, int64_t a, int64_t b, int64_t *result)

HALIDE_MUST_USE_RESULT bool	mul_with_overflow (int bits, int64_t a, int64_t b, int64_t *result)
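
A sketch of how an overflow test for a narrow signed type can be written without itself overflowing: do the arithmetic in int64_t and compare against the range of the narrower type. Only addition is shown, the names are hypothetical, and the sketch is limited to 1 to 63 bits. The convention that the checked routine returns true on success and writes zero on overflow is an assumption here, not something this listing states.

```cpp
#include <cassert>
#include <cstdint>

// True if a + b does not fit in a signed integer of the given width.
// Assumes a and b are themselves representable in that width.
inline bool sketch_add_would_overflow(int bits, int64_t a, int64_t b) {
    assert(bits >= 1 && bits <= 63);
    const int64_t max_val = (int64_t(1) << (bits - 1)) - 1;
    const int64_t min_val = -max_val - 1;
    assert(a >= min_val && a <= max_val && b >= min_val && b <= max_val);
    const int64_t sum = a + b;  // cannot overflow int64_t given the checks above
    return sum < min_val || sum > max_val;
}

// Assumed convention: returns true and stores the sum when it fits;
// returns false and stores 0 otherwise.
inline bool sketch_add_with_overflow(int bits, int64_t a, int64_t b, int64_t *result) {
    if (sketch_add_would_overflow(bits, a, b)) {
        *result = 0;
        return false;
    }
    *result = a + b;
    return true;
}
```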

void	halide_tic_impl (const char *file, int line)

void	halide_toc_impl (const char *file, int line)

template<typename T >
auto	begin (reverse_adaptor< T > i)

template<typename T >
auto	end (reverse_adaptor< T > i)

template<typename T >
reverse_adaptor< T >	reverse_view (T &&range)
	Reverse-order adaptor for range-based for-loops.
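
A reverse-order adaptor for range-based for-loops can be built by exposing a container's rbegin()/rend() as begin()/end(). The sketch below uses hypothetical names and, unlike the signature listed above, only accepts lvalue ranges to keep lifetimes simple.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Wraps a reference to a range and iterates it back to front.
template<typename T>
struct sketch_reverse_adaptor {
    T &range;
    auto begin() -> decltype(std::declval<T &>().rbegin()) { return range.rbegin(); }
    auto end() -> decltype(std::declval<T &>().rend()) { return range.rend(); }
};

template<typename T>
sketch_reverse_adaptor<T> sketch_reverse_view(T &range) {
    return sketch_reverse_adaptor<T>{range};
}

// Usage: collect the elements of v in reverse order.
inline std::vector<int> sketch_reversed_copy(const std::vector<int> &v) {
    std::vector<int> out;
    for (int x : sketch_reverse_view(v)) {
        out.push_back(x);
    }
    return out;
}
```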

std::string	c_print_name (const std::string &name, bool prefix_underscore=true)
	Emit a version of a string that is a valid identifier in C.

int	get_llvm_version ()
	Return the LLVM_VERSION against which this libHalide is compiled.

void	run_with_large_stack (const std::function< void()> &action)
	Call the given action in a platform-specific context that provides at least the stack space returned by get_compiler_stack_size.

int	popcount64 (uint64_t x)
	Portable versions of popcount, count-leading-zeros, and count-trailing-zeros.

int	clz64 (uint64_t x)

int	ctz64 (uint64_t x)
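
Portable versions of these bit counts can be written with plain loops, without compiler intrinsics. The names below are hypothetical, and returning 64 for a zero input to the leading and trailing counts is a choice made for this sketch; the listing does not specify that case.

```cpp
#include <cassert>
#include <cstdint>

// Number of set bits: clear the lowest set bit until none remain.
inline int sketch_popcount64(uint64_t x) {
    int n = 0;
    while (x != 0) {
        x &= x - 1;
        ++n;
    }
    return n;
}

// Number of zero bits above the highest set bit (64 when x == 0).
inline int sketch_clz64(uint64_t x) {
    int n = 0;
    for (uint64_t bit = uint64_t(1) << 63; bit != 0 && (x & bit) == 0; bit >>= 1) {
        ++n;
    }
    return n;
}

// Number of zero bits below the lowest set bit (64 when x == 0).
inline int sketch_ctz64(uint64_t x) {
    if (x == 0) return 64;
    int n = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

inline void sketch_bitcount_example() {
    assert(sketch_popcount64(0xFF) == 8);
    assert(sketch_clz64(1) == 63);
    assert(sketch_ctz64(8) == 3);
}
```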

int64_t	next_power_of_two (int64_t x)
	Return an integer 2^n, for some n, which is >= x.

template<typename T >
T	align_up (T x, int n)
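
A sketch of the two rounding helpers above, under hypothetical names. The listing gives no description for align_up, so the behavior shown (round a non-negative x up to the next multiple of a positive n) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= x (1 for any x <= 1).
inline int64_t sketch_next_power_of_two(int64_t x) {
    assert(x <= (int64_t(1) << 62));  // larger results would not fit in int64_t
    int64_t p = 1;
    while (p < x) {
        p *= 2;
    }
    return p;
}

// Round x up to the nearest multiple of n.
template<typename T>
T sketch_align_up(T x, int n) {
    assert(n > 0 && x >= 0);
    return (x + n - 1) / n * n;
}
```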

std::vector< Var >	make_argument_list (int dimensionality)
	Make a list of unique arguments for definitions with unnamed arguments.

Stmt	vectorize_loops (const Stmt &s, const std::map< std::string, Function > &env)
	Take a statement with for loops marked for vectorization, and turn them into single statements that operate on vectors.

std::map< std::string, Function >	wrap_func_calls (const std::map< std::string, Function > &env)
	Replace every call to wrapped Functions in the Functions' definitions with call to their wrapper functions.

std::string	get_test_tmp_dir ()
	Return the path to a directory that can be safely written to when running tests; the contents of the directory may or may not outlast the lifetime of the test itself (i.e., the files may be cleaned up after test execution).


Expr	lower_int_uint_div (const Expr &a, const Expr &b, bool round_to_zero=false)
	Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.

Expr	lower_int_uint_mod (const Expr &a, const Expr &b)
	Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.


Expr	lower_euclidean_div (Expr a, Expr b)
	Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.

Expr	lower_euclidean_mod (Expr a, Expr b)
	Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.
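
To illustrate defining Euclidean division in terms of round-to-zero operations, here is a scalar sketch. C++ `/` and `%` round toward zero, so they play the role of div_round_to_zero and mod_round_to_zero. The names are hypothetical, the sketch assumes the convention that a Euclidean remainder is never negative, and it does not cover b == 0 or INT64_MIN / -1.

```cpp
#include <cassert>
#include <cstdint>

// Remainder in [0, |b|), built from the round-to-zero remainder.
inline int64_t sketch_euclidean_mod(int64_t a, int64_t b) {
    assert(b != 0);
    int64_t r = a % b;  // has the sign of a (or is zero)
    if (r < 0) {
        r += (b < 0 ? -b : b);
    }
    return r;
}

// Quotient q such that a == q * b + sketch_euclidean_mod(a, b).
inline int64_t sketch_euclidean_div(int64_t a, int64_t b) {
    assert(b != 0);
    int64_t q = a / b;  // rounds toward zero
    if (a % b < 0) {
        q += (b > 0 ? -1 : 1);  // step the quotient away so the remainder is >= 0
    }
    return q;
}
```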


Expr	lower_signed_shift_left (const Expr &a, const Expr &b)
	Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.

Expr	lower_signed_shift_right (const Expr &a, const Expr &b)
	Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.
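
The idea can be sketched on scalars: a shift by a possibly negative amount becomes a shift in the opposite direction by the negated, now non-negative, amount. The names are hypothetical, and this sketch covers only 32-bit unsigned values with shift amounts strictly between -32 and 32.

```cpp
#include <cassert>
#include <cstdint>

// a << b for signed b: a negative b means a right shift by -b.
inline uint32_t sketch_shift_left(uint32_t a, int b) {
    assert(b > -32 && b < 32);
    return b >= 0 ? a << static_cast<unsigned>(b)
                  : a >> static_cast<unsigned>(-b);
}

// a >> b for signed b: a negative b means a left shift by -b.
inline uint32_t sketch_shift_right(uint32_t a, int b) {
    assert(b > -32 && b < 32);
    return b >= 0 ? a >> static_cast<unsigned>(b)
                  : a << static_cast<unsigned>(-b);
}
```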


Expr	lower_extract_bits (const Call *c)
	Reduce bit extraction and concatenation to bit ops.

Expr	lower_concat_bits (const Call *c)
	Reduce bit extraction and concatenation to bit ops.


Stmt	eliminate_bool_vectors (const Stmt &s)
	Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.

Expr	eliminate_bool_vectors (const Expr &s)
	Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.


std::string	lldb_string (const Expr &)
	Debugging helpers for LLDB.

std::string	lldb_string (const Internal::BaseExprNode *)
	Debugging helpers for LLDB.

std::string	lldb_string (const Stmt &)
	Debugging helpers for LLDB.


HALIDE_MUST_USE_RESULT bool	reduce_expr_modulo (const Expr &e, int64_t modulus, int64_t *remainder)
	Reduce an expression modulo some integer.

HALIDE_MUST_USE_RESULT bool	reduce_expr_modulo (const Expr &e, int64_t modulus, int64_t *remainder, const Scope< ModulusRemainder > &scope)
	Reduce an expression modulo some integer.

Variables
const int64_t	unknown = std::numeric_limits<int64_t>::min()

constexpr IRNodeType	StrongestExprNodeType = IRNodeType::VectorReduce

std::atomic< int >	random_variable_counter

Typedef Documentation

◆ AbstractGeneratorPtr

using Halide::Internal::AbstractGeneratorPtr = std::unique_ptr<AbstractGenerator>

Definition at line 244 of file AbstractGenerator.h.

◆ DimBounds

typedef std::map<std::string, Interval> Halide::Internal::DimBounds

Definition at line 20 of file AutoScheduleUtils.h.

◆ FuncValueBounds

typedef std::map<std::pair<std::string, int>, Interval> Halide::Internal::FuncValueBounds

Definition at line 17 of file Bounds.h.

◆ add_const_if_T_is_const

template<typename T , typename T2 >

using Halide::Internal::add_const_if_T_is_const = typename std::conditional<std::is_const<T>::value, const T2, T2>::type

Definition at line 83 of file Buffer.h.

◆ GeneratorParamImplBase

template<typename T >

using Halide::Internal::GeneratorParamImplBase

Initial value:

 
    typename select_type<
        cond<std::is_same<T, Target>::value, GeneratorParam_Target<T>>,
        cond<std::is_same<T, LoopLevel>::value, GeneratorParam_LoopLevel>,
        cond<std::is_same<T, std::string>::value, GeneratorParam_String<T>>,
        cond<std::is_same<T, Type>::value, GeneratorParam_Type<T>>,
        cond<std::is_same<T, bool>::value, GeneratorParam_Bool<T>>,
        cond<std::is_arithmetic<T>::value, GeneratorParam_Arithmetic<T>>,
        cond<std::is_enum<T>::value, GeneratorParam_Enum<T>>>::type

Definition at line 941 of file Generator.h.
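
The alias above picks the first `cond` whose condition holds, which is why the `bool` case is listed before the `is_arithmetic` case (`bool` is also an arithmetic type). The following is a sketch of how such a first-match selector could be written; the `sketch_` names and tag types are hypothetical, and yielding `void` when nothing matches is a choice made for this sketch.

```cpp
#include <string>
#include <type_traits>

// Pairs a compile-time condition with a candidate type.
template<bool B, typename T>
struct sketch_cond {
    static constexpr bool value = B;
    using type = T;
};

// Yields the type of the first sketch_cond whose value is true.
template<typename First, typename... Rest>
struct sketch_select_type
    : std::conditional<First::value, typename First::type,
                       typename sketch_select_type<Rest...>::type> {};

// Last candidate: fall back to void if it does not match either.
template<typename First>
struct sketch_select_type<First> {
    using type = typename std::conditional<First::value, typename First::type, void>::type;
};

// Hypothetical tag types standing in for the GeneratorParam_* implementations.
struct SketchStringImpl {};
struct SketchBoolImpl {};
struct SketchArithmeticImpl {};

template<typename T>
using sketch_impl_for = typename sketch_select_type<
    sketch_cond<std::is_same<T, std::string>::value, SketchStringImpl>,
    sketch_cond<std::is_same<T, bool>::value, SketchBoolImpl>,
    sketch_cond<std::is_arithmetic<T>::value, SketchArithmeticImpl>>::type;

// bool satisfies is_arithmetic too, but the earlier bool entry wins.
static_assert(std::is_same<sketch_impl_for<bool>, SketchBoolImpl>::value,
              "bool should match the bool entry first");
static_assert(std::is_same<sketch_impl_for<int>, SketchArithmeticImpl>::value,
              "int should match the arithmetic entry");
```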

◆ GeneratorInputImplBase

template<typename T , typename TBase = typename std::remove_all_extents<T>::type>

using Halide::Internal::GeneratorInputImplBase

Initial value:

 
    typename select_type<
        cond<has_static_halide_type_method<TBase>::value, GeneratorInput_Buffer<T>>,
        cond<std::is_same<TBase, Func>::value, GeneratorInput_Func<T>>,
        cond<std::is_arithmetic<TBase>::value, GeneratorInput_Arithmetic<T>>,
        cond<std::is_scalar<TBase>::value, GeneratorInput_Scalar<T>>,
        cond<std::is_same<TBase, Expr>::value, GeneratorInput_DynamicScalar<T>>>::type

Definition at line 2175 of file Generator.h.

◆ GeneratorOutputImplBase

template<typename T , typename TBase = typename std::remove_all_extents<T>::type>

using Halide::Internal::GeneratorOutputImplBase

Initial value:

 
    typename select_type<
        cond<has_static_halide_type_method<TBase>::value, GeneratorOutput_Buffer<T>>,
        cond<std::is_same<TBase, Func>::value, GeneratorOutput_Func<T>>,
        cond<std::is_arithmetic<TBase>::value, GeneratorOutput_Arithmetic<T>>>::type

Definition at line 2786 of file Generator.h.

◆ GeneratorFactory

using Halide::Internal::GeneratorFactory = std::function<AbstractGeneratorPtr(const GeneratorContext &context)>

Definition at line 3115 of file Generator.h.

◆ LLVMOStream

typedef llvm::raw_pwrite_stream Halide::Internal::LLVMOStream

Definition at line 27 of file LLVM_Output.h.

Enumeration Type Documentation

◆ ArgInfoKind

enum class Halide::Internal::ArgInfoKind

strong

Enumerator
Scalar
Function
Buffer

Definition at line 26 of file AbstractGenerator.h.

◆ ArgInfoDirection

enum class Halide::Internal::ArgInfoDirection

strong

Enumerator
Input
Output

Definition at line 30 of file AbstractGenerator.h.

◆ Direction

enum class Halide::Internal::Direction

strong

Given a varying expression, try to find a constant that is either an upper bound (always greater than or equal to the expression) or a lower bound (always less than or equal to the expression). If it fails, returns an undefined Expr.

Enumerator
Upper
Lower

Definition at line 42 of file Bounds.h.

◆ IRNodeType

enum class Halide::Internal::IRNodeType

strong

All our IR node types get unique IDs for the purposes of RTTI.

Enumerator
IntImm
UIntImm
FloatImm
StringImm
Broadcast
Cast
Reinterpret
Variable
Add
Sub
Mod
Mul
Div
Min
Max
EQ
NE
LT
LE
GT
GE
And
Or
Not
Select
Load
Ramp
Call
Let
Shuffle
VectorReduce
LetStmt
AssertStmt
ProducerConsumer
For
Acquire
Store
Provide
Allocate
Free
Realize
Block
Fork
IfThenElse
Evaluate
Prefetch
Atomic
HoistedStorage

Definition at line 25 of file Expr.h.

◆ ForType

enum class Halide::Internal::ForType

strong

An enum describing a type of loop traversal.

Used in schedules, and in the For loop IR node. Serial is a conventional ordered for loop. Iterations occur in increasing order, and each iteration must appear to have finished before the next begins. Parallel, GPUBlock, and GPUThread are parallel and unordered: iterations may occur in any order, and multiple iterations may occur simultaneously. Vectorized and GPULane are parallel and synchronous: they act as if all iterations occur at the same time in lockstep.

Enumerator
Serial
Parallel
Vectorized
Unrolled
Extern
GPUBlock
GPUThread
GPULane

Definition at line 406 of file Expr.h.

◆ SyntheticParamType

enum class Halide::Internal::SyntheticParamType

strong

Enumerator
Type
Dim
ArraySize

Definition at line 2892 of file Generator.h.

◆ Monotonic

enum class Halide::Internal::Monotonic

strong

Detect whether an expression is monotonic increasing in a variable, decreasing, or unknown.

Enumerator
Constant
Increasing
Decreasing
Unknown

Definition at line 26 of file Monotonic.h.

◆ DimType

enum class Halide::Internal::DimType

strong

Each Dim below has a dim_type, which tells you what transformations are legal on it.

When you combine two Dims of distinct DimTypes (e.g. with Stage::fuse), the combined result has the greater enum value of the two types.

Enumerator

PureVar

This dim originated from a Var.

You can evaluate a Func at distinct values of this Var in any order over an interval that's at least as large as the interval required. In pure definitions you can even redundantly re-evaluate points.

PureRVar

The dim originated from an RVar.

You can evaluate a Func at distinct values of this RVar in any order (including in parallel) over exactly the interval specified in the RDom. PureRVars can also be reordered arbitrarily in the dims list, as there are no data hazards between the evaluation of the Func at distinct values of the RVar.

The most common case where an RVar is considered pure is RVars that are used in a way which obeys all the syntactic constraints that a Var does, e.g:

RDom r(0, 100);

f(r.x) = f(r.x) + 5;

Other cases where RVars are pure are where the sites being written to by the Func evaluated at one value of the RVar couldn't possibly collide with the sites being written or read by the Func at a distinct value of the RVar. For example, r.x is pure in the following three definitions:

// This definition writes to even coordinates and reads from the
// same site (which no other value of r.x is writing to) and odd
// sites (which no other value of r.x is writing to):
f(2*r.x) = max(f(2*r.x), f(2*r.x + 7));
 
// This definition writes to scanline zero and reads from the
// same site and scanline one:
f(r.x, 0) += f(r.x, 1);
 
// This definition reads and writes over non-overlapping ranges:
f(r.x + 100) += f(r.x);

To give two counterexamples, r.x is not pure in the following definitions:

// The same site is written by distinct values of the RVar
// (write-after-write hazard):
f(r.x / 2) += f(r.x);
 
// One value of r.x reads from a site that another value of r.x
// is writing to (read-after-write hazard):
f(r.x) += f(r.x + 1);
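
The effect of such a hazard can be seen by simulating the updates on a plain array in two different iteration orders. This is a standalone sketch with hypothetical function names, not Halide code: the overlapping update depends on the order, while an update over non-overlapping ranges does not.

```cpp
#include <cassert>
#include <vector>

// Applies f[i] += f[i + 1] for i in [0, size - 1), mirroring f(r.x) += f(r.x + 1).
inline std::vector<int> sketch_overlapping_update(std::vector<int> f, bool increasing) {
    const int n = static_cast<int>(f.size()) - 1;
    for (int k = 0; k < n; ++k) {
        const int i = increasing ? k : n - 1 - k;
        f[i] += f[i + 1];  // reads a site that another iteration writes
    }
    return f;
}

// Applies f[i + n] += f[i] for i in [0, n), where f holds 2 * n values.
// Reads and writes touch disjoint halves, as in f(r.x + 100) += f(r.x).
inline std::vector<int> sketch_disjoint_update(std::vector<int> f, bool increasing) {
    assert(f.size() % 2 == 0);
    const int n = static_cast<int>(f.size() / 2);
    for (int k = 0; k < n; ++k) {
        const int i = increasing ? k : n - 1 - k;
        f[i + n] += f[i];
    }
    return f;
}
```

With the input {1, 2, 3, 4}, the overlapping update gives different arrays for the two orders, whereas the disjoint update gives the same array either way.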

ImpureRVar

The dim originated from an RVar.

You must evaluate a Func at distinct values of this RVar in increasing order over precisely the interval specified in the RDom. ImpureRVars may not be reordered with respect to other ImpureRVars.

All RVars are impure by default. Those for which we can prove no data hazards exist get promoted to PureRVar. There are two instances in which ImpureRVars may be parallelized or reordered even in the presence of hazards:

1) In the case of an update definition that has been proven to be an associative and commutative reduction, reordering of ImpureRVars is allowed, and parallelizing them is allowed if the update has been made atomic.

2) ImpureRVars can also be reordered and parallelized if Func::allow_race_conditions() has been set. This is the escape hatch for when there are no hazards but the checks above failed to prove that (RDom::where can encode arbitrary facts about non-linear integer arithmetic, which is undecidable), or for when you don't actually care about the non-determinism introduced by data hazards (e.g. in the algorithm HOGWILD!).

Definition at line 357 of file Schedule.h.

Function Documentation

◆ add_atomic_mutex()

Stmt Halide::Internal::add_atomic_mutex	(	Stmt	s,
		const std::vector< Function > &	outputs )

◆ add_image_checks()

Stmt Halide::Internal::add_image_checks	(	const Stmt &	s,
		const std::vector< Function > &	outputs,
		const Target &	t,
		const std::vector< std::string > &	order,
		const std::map< std::string, Function > &	env,
		const FuncValueBounds &	fb,
		bool	will_inject_host_copies )

Insert checks to make sure a statement doesn't read out of bounds on inputs or outputs, and that the inputs and outputs conform to the format required (e.g. stride.0 must be 1).

◆ add_parameter_checks()

Stmt Halide::Internal::add_parameter_checks	(	const std::vector< Stmt > &	requirements,
		Stmt	s,
		const Target &	t )

Insert checks to make sure that all referenced parameters meet their constraints.

Also injects any custom requirements provided by the user.

◆ add_split_factor_checks()

Stmt Halide::Internal::add_split_factor_checks	(	const Stmt &	s,
		const std::map< std::string, Function > &	env )

Insert checks that all split factors that depend on scalar parameters are strictly positive.

◆ align_loads()

Stmt Halide::Internal::align_loads	(	const Stmt &	s,
		int	alignment,
		int	min_bytes_to_align )

Attempt to rewrite unaligned loads from buffers which are known to be aligned to instead load aligned vectors that cover the original load, and then slice the original load out of the aligned vectors.

Types that are less than min_bytes_to_align in size are not rewritten. This is intended to make a distinction between data that will be accessed as a scalar and that which will be accessed as a vector.

◆ allocation_bounds_inference()

Stmt Halide::Internal::allocation_bounds_inference	(	Stmt	s,
		const std::map< std::string, Function > &	env,
		const std::map< std::pair< std::string, int >, Interval > &	func_bounds )

Take a partially lowered statement with Realize nodes in terms of variables, and define values for those variables.

◆ apply_split()

std::vector< ApplySplitResult > Halide::Internal::apply_split	(	const Split &	split,
		const std::string &	prefix,
		std::map< std::string, Expr > &	dim_extent_alignment )

Given a Split schedule on a definition (init or update), return a list of predicates on the definition, substitutions that need to be applied to the definition (in ascending order of application), and let stmts that define the values of variables referred to by the predicates and substitutions (ordered from innermost to outermost let).

◆ compute_loop_bounds_after_split()

std::vector< std::pair< std::string, Expr > > Halide::Internal::compute_loop_bounds_after_split	(	const Split &	split,
		const std::string &	prefix )

Compute the loop bounds of the new dimensions resulting from applying the split schedules using the loop bounds of the old dimensions.

◆ get_ops_table()

const std::vector< AssociativePattern > & Halide::Internal::get_ops_table ( const std::vector< Expr > & exprs )

◆ prove_associativity()

AssociativeOp Halide::Internal::prove_associativity	(	const std::string &	f,
		std::vector< Expr >	args,
		std::vector< Expr >	exprs )

Given an update definition of a Func 'f', determine its equivalent associative binary/unary operator if there is any.

'is_associative' indicates if the operation was successfully proven as associative.

◆ associativity_test()

void Halide::Internal::associativity_test ( )

◆ fork_async_producers()

Stmt Halide::Internal::fork_async_producers	(	Stmt	s,
		const std::map< std::string, Function > &	env )

◆ string_to_int()

int Halide::Internal::string_to_int ( const std::string & s )

Return an int representation of 's'.

Throw an error on failure.

◆ substitute_var_estimates() [1/2]

Expr Halide::Internal::substitute_var_estimates ( Expr e )

Substitute every variable in an Expr or a Stmt with its estimate if specified.

◆ substitute_var_estimates() [2/2]

Stmt Halide::Internal::substitute_var_estimates ( Stmt s )

◆ get_extent()

Expr Halide::Internal::get_extent ( const Interval & i )

Return the size of an interval.

Return an undefined expr if the interval is unbounded.

◆ box_size()

Expr Halide::Internal::box_size ( const Box & b )

Return the size of an n-d box.
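
A minimal sketch of these two helpers on concrete integers, assuming inclusive endpoints and using std::optional in place of an undefined Expr. The type SimpleInterval and the function names are hypothetical stand-ins, not the Halide types.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for Interval: a missing endpoint means unbounded.
struct SimpleInterval {
    std::optional<int64_t> min, max;
};

// Size of an interval, or nullopt (standing in for an undefined Expr) if the
// interval is unbounded on either side.
std::optional<int64_t> extent(const SimpleInterval &i) {
    if (!i.min || !i.max) return std::nullopt;
    return *i.max - *i.min + 1;  // assumes both endpoints are inclusive
}

// Size of an n-d box: the product of the per-dimension extents. In this
// sketch the result is nullopt if any dimension is unbounded.
std::optional<int64_t> size_of_box(const std::vector<SimpleInterval> &b) {
    int64_t size = 1;
    for (const auto &i : b) {
        std::optional<int64_t> e = extent(i);
        if (!e) return std::nullopt;
        size *= *e;
    }
    return size;
}
```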

◆ disp_regions()

void Halide::Internal::disp_regions ( const std::map< std::string, Box > & regions )

Helper function to print the bounds of a region.

◆ get_stage_definition()

Definition Halide::Internal::get_stage_definition	(	const Function &	f,
		int	stage_num )

Return the corresponding definition of a function given the stage.

This will trigger an assertion failure if the function is an extern function (an extern Func does not have a definition).

◆ get_stage_dims()

std::vector< Dim > & Halide::Internal::get_stage_dims	(	const Function &	f,
		int	stage_num )

Return the corresponding loop dimensions of a function given the stage.

For extern Func, this will return a list of size 1 containing the dummy __outermost loop dimension.

◆ combine_load_costs()

void Halide::Internal::combine_load_costs	(	std::map< std::string, Expr > &	result,
		const std::map< std::string, Expr > &	partial )

Add partial load costs to the corresponding function in the result costs.

◆ get_stage_bounds() [1/2]

DimBounds Halide::Internal::get_stage_bounds	(	const Function &	f,
		int	stage_num,
		const DimBounds &	pure_bounds )

Return the required bounds of an intermediate stage (f, stage_num) of function 'f' given the bounds of the pure dimensions.

◆ get_stage_bounds() [2/2]

std::vector< DimBounds > Halide::Internal::get_stage_bounds	(	const Function &	f,
		const DimBounds &	pure_bounds )

Return the required bounds for all the stages of the function 'f'.

Each entry in the returned vector corresponds to a stage.

◆ perform_inline()

Expr Halide::Internal::perform_inline	(	Expr	e,
		const std::map< std::string, Function > &	env,
		const std::set< std::string > &	inlines = std::set< std::string >(),
		const std::vector< std::string > &	order = std::vector< std::string >() )

Recursively inline all the functions in the set 'inlines' into the expression 'e' and return the resulting expression.

If 'order' is passed, inlining will be done in the reverse order of function realization to avoid extra inlining work.

◆ get_parents()

std::set< std::string > Halide::Internal::get_parents	(	Function	f,
		int	stage )

Return all functions that are directly called by a function stage (f, stage).

◆ get_element() [1/2]

template<typename K , typename V >

V Halide::Internal::get_element	(	const std::map< K, V > &	m,
		const K &	key )

Return value of element within a map.

This will assert if the element is not in the map.

Definition at line 101 of file AutoScheduleUtils.h.

References internal_assert.
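
A checked map lookup of this kind can be sketched as follows. The name get_element_checked is hypothetical, and the standard assert macro stands in for internal_assert.

```cpp
#include <cassert>
#include <map>
#include <string>

// Read-only lookup: fail loudly instead of silently inserting a default value
// the way std::map::operator[] would.
template<typename K, typename V>
V get_element_checked(const std::map<K, V> &m, const K &key) {
    const auto iter = m.find(key);
    assert(iter != m.end() && "key not found in map");
    return iter->second;
}

// Mutable lookup: same check, but returns a reference into the map.
template<typename K, typename V>
V &get_element_checked(std::map<K, V> &m, const K &key) {
    const auto iter = m.find(key);
    assert(iter != m.end() && "key not found in map");
    return iter->second;
}
```

Note that K is deduced from both arguments, so a std::string-keyed map needs a std::string key rather than a string literal.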

◆ get_element() [2/2]

template<typename K , typename V >

V & Halide::Internal::get_element	(	std::map< K, V > &	m,
		const K &	key )

Definition at line 108 of file AutoScheduleUtils.h.

References internal_assert.

◆ inline_all_trivial_functions()

bool Halide::Internal::inline_all_trivial_functions	(	const std::vector< Function > &	outputs,
		const std::vector< std::string > &	order,
		const std::map< std::string, Function > &	env )

If the cost of computing a Func is about the same as calling the Func, inline the Func.

Return true if any of the Funcs is inlined.

◆ is_func_called_element_wise()

std::string Halide::Internal::is_func_called_element_wise	(	const std::vector< std::string > &	order,
		size_t	index,
		const std::map< std::string, Function > &	env )

Determine if a Func (order[index]) is only consumed by another single Func in an element-wise manner.

If it is, return the name of the consumer Func; otherwise, return an empty string.

◆ inline_all_element_wise_functions()

bool Halide::Internal::inline_all_element_wise_functions	(	const std::vector< Function > &	outputs,
		const std::vector< std::string > &	order,
		const std::map< std::string, Function > &	env )

Inline a Func if its values are only consumed by another single Func in an element-wise manner.

◆ propagate_estimate_test()

void Halide::Internal::propagate_estimate_test ( )

◆ bound_constant_extent_loops()

Stmt Halide::Internal::bound_constant_extent_loops ( const Stmt & s )

Replace all loop extents of unrolled or vectorized loops with constants, by substituting and simplifying as needed.

If we can't determine a constant extent, but can determine a constant upper bound, inject an if statement into the body. If we can't even determine a constant upper bound, throw a user error.

◆ empty_func_value_bounds()

const FuncValueBounds & Halide::Internal::empty_func_value_bounds ( )

◆ bounds_of_expr_in_scope()

Interval Halide::Internal::bounds_of_expr_in_scope	(	const Expr &	expr,
		const Scope< Interval > &	scope,
		const FuncValueBounds &	func_bounds = empty_func_value_bounds(),
		bool	const_bound = false )

Given an expression in some variables, and a map from those variables to their bounds (in the form of (minimum possible value, maximum possible value)), compute two expressions that give the minimum possible value and the maximum possible value of this expression.

Max or min may be undefined expressions if the value is not bounded above or below. If the expression is a vector, also takes the bounds across the vector lanes and returns a scalar result.

This is for tasks such as deducing the region of a buffer loaded by a chunk of code.

◆ find_constant_bound()

Expr Halide::Internal::find_constant_bound	(	const Expr &	e,
		Direction	d,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope() )

◆ find_constant_bounds()

Interval Halide::Internal::find_constant_bounds	(	const Expr &	e,
		const Scope< Interval > &	scope )

Find bounds for a varying expression that are either constants or +/-inf.

◆ merge_boxes()

void Halide::Internal::merge_boxes	(	Box &	a,
		const Box &	b )

Expand box a to encompass box b.

◆ boxes_overlap()

bool Halide::Internal::boxes_overlap	(	const Box &	a,
		const Box &	b )

Test if box a could possibly overlap box b.

◆ box_union()

Box Halide::Internal::box_union	(	const Box &	a,
		const Box &	b )

The union of two boxes.

◆ box_intersection()

Box Halide::Internal::box_intersection	(	const Box &	a,
		const Box &	b )

The intersection of two boxes.

◆ box_contains()

bool Halide::Internal::box_contains	(	const Box &	a,
		const Box &	b )

Test if box a provably contains box b.
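
The box operations above can be illustrated with concrete integer bounds. In Halide the bounds are symbolic expressions, which is why the descriptions say "could possibly" and "provably"; with plain integers, as assumed here, every test is exact. The types Range and IntBox and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical concrete box: one inclusive [min, max] pair per dimension.
struct Range {
    int min, max;
};
using IntBox = std::vector<Range>;

// Smallest box that covers both a and b.
IntBox union_of(const IntBox &a, const IntBox &b) {
    assert(a.size() == b.size());
    IntBox r(a.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        r[i] = {std::min(a[i].min, b[i].min), std::max(a[i].max, b[i].max)};
    }
    return r;
}

// Region covered by both boxes (empty if max < min in some dimension).
IntBox intersection_of(const IntBox &a, const IntBox &b) {
    assert(a.size() == b.size());
    IntBox r(a.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        r[i] = {std::max(a[i].min, b[i].min), std::min(a[i].max, b[i].max)};
    }
    return r;
}

// Boxes overlap only if their ranges intersect in every dimension.
bool overlap(const IntBox &a, const IntBox &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (a[i].max < b[i].min || b[i].max < a[i].min) return false;
    }
    return true;
}

// a contains b if b's range lies inside a's range in every dimension.
bool contains(const IntBox &a, const IntBox &b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (b[i].min < a[i].min || b[i].max > a[i].max) return false;
    }
    return true;
}
```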

◆ boxes_required() [1/2]

std::map< std::string, Box > Halide::Internal::boxes_required	(	const Expr &	e,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

Compute rectangular domains large enough to cover all the 'Call's to each function that occurs within a given statement or expression.

This is useful for figuring out what regions of things to evaluate. Respects control flow (e.g. encodes if statement conditions), but assumes all encountered asserts pass. If it encounters an assert(false) in one if branch, assumes the opposite if branch runs unconditionally.

◆ boxes_required() [2/2]

std::map< std::string, Box > Halide::Internal::boxes_required	(	Stmt	s,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

◆ boxes_provided() [1/2]

std::map< std::string, Box > Halide::Internal::boxes_provided	(	const Expr &	e,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

Compute rectangular domains large enough to cover all the 'Provide's to each function that occurs within a given statement or expression.

Handles asserts in the same way as boxes_required.

◆ boxes_provided() [2/2]

std::map< std::string, Box > Halide::Internal::boxes_provided	(	Stmt	s,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

◆ boxes_touched() [1/2]

std::map< std::string, Box > Halide::Internal::boxes_touched	(	const Expr &	e,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

Compute rectangular domains large enough to cover all the 'Call's and 'Provide's to each function that occurs within a given statement or expression.

Handles asserts in the same way as boxes_required.

◆ boxes_touched() [2/2]

std::map< std::string, Box > Halide::Internal::boxes_touched	(	Stmt	s,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

◆ box_required() [1/2]

Box Halide::Internal::box_required	(	const Expr &	e,
		const std::string &	fn,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

Variants of the above that are only concerned with a single function.

◆ box_required() [2/2]

Box Halide::Internal::box_required	(	Stmt	s,
		const std::string &	fn,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

◆ box_provided() [1/2]

Box Halide::Internal::box_provided	(	const Expr &	e,
		const std::string &	fn,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

◆ box_provided() [2/2]

Box Halide::Internal::box_provided	(	Stmt	s,
		const std::string &	fn,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

◆ box_touched() [1/2]

Box Halide::Internal::box_touched	(	const Expr &	e,
		const std::string &	fn,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

◆ box_touched() [2/2]

Box Halide::Internal::box_touched	(	Stmt	s,
		const std::string &	fn,
		const Scope< Interval > &	scope = Scope< Interval >::empty_scope(),
		const FuncValueBounds &	func_bounds = empty_func_value_bounds() )

◆ compute_function_value_bounds()

FuncValueBounds Halide::Internal::compute_function_value_bounds	(	const std::vector< std::string > &	order,
		const std::map< std::string, Function > &	env )

Compute the maximum and minimum possible value for each function in an environment.

◆ span_of_bounds()

Expr Halide::Internal::span_of_bounds ( const Interval & bounds )

◆ bounds_test()

void Halide::Internal::bounds_test ( )

◆ bounds_inference()

Stmt Halide::Internal::bounds_inference	(	Stmt	,
		const std::vector< Function > &	outputs,
		const std::vector< std::string > &	realization_order,
		const std::vector< std::vector< std::string > > &	fused_groups,
		const std::map< std::string, Function > &	environment,
		const std::map< std::pair< std::string, int >, Interval > &	func_bounds,
		const Target &	target )

Take a partially lowered statement that includes symbolic representations of the bounds over which things should be realized, and inject expressions defining those bounds.

◆ bound_small_allocations()

Stmt Halide::Internal::bound_small_allocations ( const Stmt & s )

◆ buffer_accessor()

Expr Halide::Internal::buffer_accessor	(	const Buffer<> &	buf,
		const std::vector< Expr > &	args )

◆ get_name_from_end_of_parameter_pack() [1/4]

template<typename T , typename = typename std::enable_if<!std::is_convertible<T, std::string>::value>::type>

std::string Halide::Internal::get_name_from_end_of_parameter_pack ( T && )

Definition at line 44 of file Buffer.h.

◆ get_name_from_end_of_parameter_pack() [2/4]

std::string Halide::Internal::get_name_from_end_of_parameter_pack ( const std::string & n )

inline

Definition at line 48 of file Buffer.h.

◆ get_name_from_end_of_parameter_pack() [3/4]

std::string Halide::Internal::get_name_from_end_of_parameter_pack ( )

inline

Definition at line 52 of file Buffer.h.

Referenced by get_name_from_end_of_parameter_pack().

◆ get_name_from_end_of_parameter_pack() [4/4]

template<typename First , typename Second , typename... Args>

std::string Halide::Internal::get_name_from_end_of_parameter_pack	(	First	first,
		Second	second,
		Args &&...	rest )

Definition at line 59 of file Buffer.h.

References get_name_from_end_of_parameter_pack().
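
The four overloads above follow a common recursive pattern for picking an optional trailing name out of a parameter pack. The sketch below mirrors the listed signatures in a separate namespace; the function bodies are an assumption about how such overloads would typically be written, not a copy of Buffer.h.

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// Last argument is not string-like: there is no name.
template<typename T,
         typename = typename std::enable_if<!std::is_convertible<T, std::string>::value>::type>
std::string name_from_end(T &&) {
    return "";
}

// Last argument is a string: that is the name.
inline std::string name_from_end(const std::string &n) {
    return n;
}

// Empty pack: no name.
inline std::string name_from_end() {
    return "";
}

// Two or more arguments: drop the first one and recurse on the rest.
template<typename First, typename Second, typename... Args>
std::string name_from_end(First first, Second second, Args &&...rest) {
    return name_from_end(second, std::forward<Args>(rest)...);
}

}  // namespace sketch
```

The enable_if guard keeps the generic single-argument overload from competing with the std::string overload when the last argument is string-like.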

◆ get_shape_from_start_of_parameter_pack_helper() [1/3]

void Halide::Internal::get_shape_from_start_of_parameter_pack_helper	(	std::vector< int > &	,
		const std::string &	)

inline

Definition at line 63 of file Buffer.h.

Referenced by get_shape_from_start_of_parameter_pack(), and get_shape_from_start_of_parameter_pack_helper().

◆ get_shape_from_start_of_parameter_pack_helper() [2/3]

void Halide::Internal::get_shape_from_start_of_parameter_pack_helper ( std::vector< int > & )

inline

Definition at line 66 of file Buffer.h.

◆ get_shape_from_start_of_parameter_pack_helper() [3/3]

template<typename... Args>

void Halide::Internal::get_shape_from_start_of_parameter_pack_helper	(	std::vector< int > &	result,
		int	x,
		Args &&...	rest )

Definition at line 70 of file Buffer.h.

References get_shape_from_start_of_parameter_pack_helper().

◆ get_shape_from_start_of_parameter_pack()

template<typename... Args>

std::vector< int > Halide::Internal::get_shape_from_start_of_parameter_pack ( Args &&... args )

Definition at line 76 of file Buffer.h.

References get_shape_from_start_of_parameter_pack_helper().
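
The shape helpers are the mirror image of the name helpers: they collect the leading ints of a pack and stop at a trailing name or at the end of the pack. A sketch under the same assumption (signatures follow the listing above, bodies are illustrative):

```cpp
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Stop when only a trailing name is left...
inline void shape_helper(std::vector<int> &, const std::string &) {
}

// ...or when the pack is exhausted.
inline void shape_helper(std::vector<int> &) {
}

// Otherwise consume one leading int and recurse on the remaining arguments.
template<typename... Args>
void shape_helper(std::vector<int> &result, int x, Args &&...rest) {
    result.push_back(x);
    shape_helper(result, std::forward<Args>(rest)...);
}

// Collect the leading ints of the pack into a shape vector.
template<typename... Args>
std::vector<int> shape_from_start(Args &&...args) {
    std::vector<int> result;
    shape_helper(result, std::forward<Args>(args)...);
    return result;
}

}  // namespace sketch
```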

◆ buffer_type_name_non_const()

template<typename T >

void Halide::Internal::buffer_type_name_non_const ( std::ostream & s )

Definition at line 89 of file Buffer.h.

◆ buffer_type_name_non_const< void >()

template<>

void Halide::Internal::buffer_type_name_non_const< void > ( std::ostream & s )

inline

Definition at line 94 of file Buffer.h.

◆ buffer_type_name()

template<typename T >

std::string Halide::Internal::buffer_type_name ( )

Definition at line 99 of file Buffer.h.

◆ canonicalize_gpu_vars()

Stmt Halide::Internal::canonicalize_gpu_vars ( Stmt s )

Canonicalize GPU var names into some pre-determined block/thread names (i.e. __block_id_x, __thread_id_x, etc.). The x/y/z/w order is determined by the nesting order: innermost is assigned to x and so on.

◆ gpu_thread_name()

const std::string & Halide::Internal::gpu_thread_name ( int index )

Names for the thread and block id variables.

Includes the leading dot. Indexed from inside out, so 0 gives you the innermost loop.

◆ gpu_block_name()

const std::string & Halide::Internal::gpu_block_name ( int index )

◆ clamp_unsafe_accesses()

Stmt Halide::Internal::clamp_unsafe_accesses	(	const Stmt &	s,
		const std::map< std::string, Function > &	env,
		FuncValueBounds &	func_bounds )

Inject clamps around func calls h(...) when all the following conditions hold:

The call is in an indexing context, such as: f(x) = g(h(x));
The FuncValueBounds of h are smaller than those of its type
The allocation bounds of h might be wider than its compute bounds.

◆ new_CodeGen_D3D12Compute_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_D3D12Compute_Dev ( const Target & target )

◆ get_vector_element_type()

llvm::Type * Halide::Internal::get_vector_element_type ( llvm::Type * )

Get the scalar type of an llvm vector type.

Returns the argument if it's not a vector type.

◆ function_takes_user_context()

bool Halide::Internal::function_takes_user_context ( const std::string & name )

Which built-in functions require a user-context first argument?

◆ can_allocation_fit_on_stack()

bool Halide::Internal::can_allocation_fit_on_stack ( int64_t size )

Given a size (in bytes), return True if the allocation size can fit on the stack; otherwise, return False.

This routine asserts if size is non-positive.

◆ long_div_mod_round_to_zero()

std::pair< Expr, Expr > Halide::Internal::long_div_mod_round_to_zero	(	const Expr &	a,
		const Expr &	b,
		std::optional< uint64_t >	max_abs = std::nullopt )

Does a {div/mod}_round_to_zero using binary long division for int/uint.

max_abs is the maximum absolute value of (a/b). Returns the pair {div_round_to_zero, mod_round_to_zero}.
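
Binary long division itself can be sketched on plain 32-bit unsigned integers. This illustrates the general technique only; it does not handle signed operands or the max_abs parameter, and the name long_div_mod is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Unsigned binary long division: returns {quotient, remainder} using only
// shifts, compares, and subtracts. Assumes b != 0.
std::pair<uint32_t, uint32_t> long_div_mod(uint32_t a, uint32_t b) {
    assert(b != 0);
    uint32_t q = 0;
    uint64_t r = 0;  // wide enough that (r << 1) cannot overflow
    for (int bit = 31; bit >= 0; bit--) {
        r = (r << 1) | ((a >> bit) & 1);  // bring down the next bit of a
        if (r >= b) {
            r -= b;                       // the divisor fits: subtract it
            q |= (uint32_t(1) << bit);    // and set this quotient bit
        }
    }
    return {q, uint32_t(r)};
}
```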

◆ lower_int_uint_div()

Expr Halide::Internal::lower_int_uint_div	(	const Expr &	a,
		const Expr &	b,
		bool	round_to_zero = false )

Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.

Can introduce mulhi_shr and sorted_avg intrinsics as well as those from the lower_euclidean_ operation – div_round_to_zero or mod_round_to_zero.

◆ lower_int_uint_mod()

Expr Halide::Internal::lower_int_uint_mod	(	const Expr &	a,
		const Expr &	b )

Given a Halide Euclidean division/mod operation, do constant optimizations and possibly call lower_euclidean_div/lower_euclidean_mod if necessary.

Can introduce mulhi_shr and sorted_avg intrinsics as well as those from the lower_euclidean_ operation – div_round_to_zero or mod_round_to_zero.

◆ lower_euclidean_div()

Expr Halide::Internal::lower_euclidean_div	(	Expr	a,
		Expr	b )

Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.

◆ lower_euclidean_mod()

Expr Halide::Internal::lower_euclidean_mod	(	Expr	a,
		Expr	b )

Given a Halide Euclidean division/mod operation, define it in terms of div_round_to_zero or mod_round_to_zero.
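
Defining Euclidean division and mod in terms of round-to-zero operations can be sketched with C++'s built-in / and %, which round toward zero. This scalar version is an illustration, not the expression Halide generates; it assumes b != 0 and no overflow, and the name euclidean_div_mod is hypothetical.

```cpp
#include <cassert>
#include <utility>

// Euclidean {quotient, remainder} built from round-to-zero division.
// The remainder is always in [0, |b|), and a == q * b + r holds.
std::pair<int, int> euclidean_div_mod(int a, int b) {
    assert(b != 0);
    int q = a / b;  // rounds toward zero
    int r = a % b;  // takes the sign of a
    if (r < 0) {
        // Adjust by one step so the remainder becomes non-negative.
        if (b > 0) {
            q -= 1;
            r += b;
        } else {
            q += 1;
            r -= b;
        }
    }
    return {q, r};
}
```

For example, -7 divided by 2 gives quotient -4 and remainder 1, where round-to-zero division alone would give -3 and -1.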

◆ lower_signed_shift_left()

Expr Halide::Internal::lower_signed_shift_left	(	const Expr &	a,
		const Expr &	b )

Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.

◆ lower_signed_shift_right()

Expr Halide::Internal::lower_signed_shift_right	(	const Expr &	a,
		const Expr &	b )

Given a Halide shift operation with a signed shift amount (may be negative), define an equivalent expression using only shifts by unsigned amounts.
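
On scalars, the rewrite amounts to selecting between a shift in the requested direction and a shift in the opposite direction by the negated amount. A minimal sketch on uint32_t, assuming the shift amount has magnitude below 32; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Shift left by a signed amount using only shifts by non-negative amounts:
// a negative amount becomes a right shift by its magnitude.
uint32_t shift_left_signed(uint32_t a, int b) {
    assert(b > -32 && b < 32);
    return b >= 0 ? (a << b) : (a >> -b);
}

// Shift right by a signed amount: a negative amount becomes a left shift.
uint32_t shift_right_signed(uint32_t a, int b) {
    assert(b > -32 && b < 32);
    return b >= 0 ? (a >> b) : (a << -b);
}
```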

◆ lower_mux()

Expr Halide::Internal::lower_mux ( const Call * mux )

Reduce a mux intrinsic to a select tree.
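
On scalar values, a mux reduced to selects looks like a chain of equality tests with the last value as the fallback. The sketch below assumes that shape; select_value and mux_as_selects are hypothetical names, and the out-of-range behaviour described is a property of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar stand-in for a select: pick t if c holds, otherwise f.
int select_value(bool c, int t, int f) {
    return c ? t : f;
}

// Reduce mux(id, {v0, v1, ..., vN}) to nested selects:
// select(id == 0, v0, select(id == 1, v1, ... vN)).
int mux_as_selects(int id, const std::vector<int> &values) {
    assert(!values.empty());
    int result = values.back();  // innermost fallback
    // Wrap the fallback from the second-to-last value outwards.
    for (std::size_t i = values.size() - 1; i-- > 0;) {
        result = select_value(id == static_cast<int>(i), values[i], result);
    }
    return result;
}
```

In this sketch any index past the end falls through to the last value.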

◆ lower_extract_bits()

Expr Halide::Internal::lower_extract_bits ( const Call * c )

Reduce bit extraction and concatenation to bit ops.

◆ lower_concat_bits()

Expr Halide::Internal::lower_concat_bits ( const Call * c )

Reduce bit extraction and concatenation to bit ops.

◆ lower_round_to_nearest_ties_to_even()

Expr Halide::Internal::lower_round_to_nearest_ties_to_even ( const Expr & )

A vectorizable implementation of Halide::round that doesn't depend on any standard library being present.
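
One well-known branch-light way to round to nearest with ties to even, without calling into a math library, is to add and subtract 2^23 for 32-bit floats. The sketch below uses that trick; it is not claimed to be the method Halide uses, and it assumes the default floating-point rounding mode and no value-changing compiler options such as fast-math. The function name is hypothetical.

```cpp
// Round a float to the nearest integer, with ties going to the even integer.
float round_ties_to_even(float x) {
    const float magic = 8388608.0f;  // 2^23: floats at or above this are integers
    float ax = x < 0 ? -x : x;
    if (!(ax < magic)) {
        return x;  // already an integer, or NaN/infinity
    }
    // In [2^23, 2^24) adjacent floats are exactly 1 apart, so the addition
    // itself rounds to an integer using the hardware's ties-to-even rule.
    // volatile forces the intermediate to be stored as a float.
    volatile float t = ax + magic;
    float r = t - magic;
    return x < 0 ? -r : r;
}
```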

◆ get_target_options()

void Halide::Internal::get_target_options	(	const llvm::Module &	module,
		llvm::TargetOptions &	options )

Given an llvm::Module, set llvm::TargetOptions information.

◆ clone_target_options()

void Halide::Internal::clone_target_options	(	const llvm::Module &	from,
		llvm::Module &	to )

Given two llvm::Modules, clone target options from one to the other.

◆ make_target_machine()

std::unique_ptr< llvm::TargetMachine > Halide::Internal::make_target_machine ( const llvm::Module & module )

Given an llvm::Module, get or create an llvm::TargetMachine.

◆ set_function_attributes_from_halide_target_options()

void Halide::Internal::set_function_attributes_from_halide_target_options ( llvm::Function & )

Set the appropriate llvm Function attributes given the Halide Target.

◆ embed_bitcode()

void Halide::Internal::embed_bitcode	(	llvm::Module *	M,
		const std::string &	halide_command )

Save a copy of the llvm IR currently represented by the module as data in the __LLVM,__bitcode section.

Emulates clang's -fembed-bitcode flag and is useful to satisfy Apple's bitcode inclusion requirements.

◆ new_CodeGen_Metal_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_Metal_Dev ( const Target & target )

◆ new_CodeGen_OpenCL_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_OpenCL_Dev ( const Target & target )

◆ new_CodeGen_PTX_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_PTX_Dev ( const Target & target )

◆ new_CodeGen_ARM()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_ARM ( const Target & target )

Construct CodeGen object for a variety of targets.

◆ new_CodeGen_Hexagon()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_Hexagon ( const Target & target )

◆ new_CodeGen_PowerPC()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_PowerPC ( const Target & target )

◆ new_CodeGen_RISCV()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_RISCV ( const Target & target )

◆ new_CodeGen_X86()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_X86 ( const Target & target )

◆ new_CodeGen_WebAssembly()

std::unique_ptr< CodeGen_Posix > Halide::Internal::new_CodeGen_WebAssembly ( const Target & target )

◆ new_CodeGen_Vulkan_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_Vulkan_Dev ( const Target & target )

◆ new_CodeGen_WebGPU_Dev()

std::unique_ptr< CodeGen_GPU_Dev > Halide::Internal::new_CodeGen_WebGPU_Dev ( const Target & target )

◆ set_compiler_logger()

std::unique_ptr< CompilerLogger > Halide::Internal::set_compiler_logger ( std::unique_ptr< CompilerLogger > compiler_logger )

Set the active CompilerLogger object, replacing any existing one.

It is legal to pass in a nullptr (which means "don't do any compiler logging"). Returns the previous CompilerLogger (if any).

◆ get_compiler_logger()

CompilerLogger * Halide::Internal::get_compiler_logger ( )

Return the currently active CompilerLogger object.

If set_compiler_logger() has never been called, a nullptr implementation will be returned. Do not save the pointer returned! It is intended to be used for immediate calls only.
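
The set/get pair follows an ownership-swap pattern: the setter takes ownership of the new logger and hands the previous one back, while the getter only lends out a raw pointer. A minimal sketch with a hypothetical Logger type and hypothetical function names; in this simplified version the getter returns nullptr when no logger is installed.

```cpp
#include <memory>
#include <utility>

// Hypothetical logger type standing in for CompilerLogger.
struct Logger {
    int id;
};

// Hypothetical global slot holding the active logger (may be null).
std::unique_ptr<Logger> &active_logger() {
    static std::unique_ptr<Logger> logger;
    return logger;
}

// Install a new logger (nullptr disables logging) and return the old one.
std::unique_ptr<Logger> set_logger(std::unique_ptr<Logger> next) {
    std::swap(active_logger(), next);
    return next;  // now holds the previously active logger
}

// Borrowed pointer: only valid until the next set_logger() call, which is
// why callers should use it immediately and not store it.
Logger *get_logger() {
    return active_logger().get();
}
```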

◆ constant_integer_bounds()

ConstantInterval Halide::Internal::constant_integer_bounds	(	const Expr &	e,
		const Scope< ConstantInterval > &	scope = Scope< ConstantInterval >::empty_scope(),
		std::map< Expr, ConstantInterval, ExprCompare > *	cache = nullptr )

Deduce constant integer bounds on an expression.

This can be useful to decide if, for example, the expression can be cast to another type, be negated, be incremented, etc without risking overflow.

Also optionally accepts a scope containing the integer bounds of any variables that may be referenced, and a cache of constant integer bounds on known Exprs, which this function will update. The cache is helpful to short-circuit large numbers of redundant queries, but it should not be used in contexts where the same Expr object may take on different values within a single Expr (i.e. before uniquify_variable_names).

◆ operator+() [1/4]

ConstantInterval Halide::Internal::operator+	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

Arithmetic operators on ConstantIntervals.

The resulting interval contains all possible values of the operator applied to any two elements of the argument intervals. Note that these operate on unbounded integers. If you are applying this to concrete small integer types, you will need to manually cast the constant interval back to the desired type to model the effect of overflow.
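
The following sketch shows the idea for addition and subtraction, plus a manual cast back to an 8-bit unsigned range. The type Ival and the helper cast_to_uint8 are hypothetical and much simpler than ConstantInterval; a missing endpoint means unbounded, and int64_t overflow is ignored.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical interval over unbounded integers; a missing endpoint is infinite.
struct Ival {
    std::optional<int64_t> min, max;
};

// Every x + y with x in a and y in b lies in the result.
Ival operator+(const Ival &a, const Ival &b) {
    Ival r;
    if (a.min && b.min) r.min = *a.min + *b.min;
    if (a.max && b.max) r.max = *a.max + *b.max;
    return r;
}

// Every x - y with x in a and y in b lies in the result.
Ival operator-(const Ival &a, const Ival &b) {
    Ival r;
    if (a.min && b.max) r.min = *a.min - *b.max;
    if (a.max && b.min) r.max = *a.max - *b.min;
    return r;
}

// Model a cast to uint8: keep the bounds if they fit, otherwise fall back to
// the full range of the type because the value may have wrapped.
Ival cast_to_uint8(const Ival &a) {
    if (a.min && a.max && *a.min >= 0 && *a.max <= 255) return a;
    return Ival{0, 255};
}
```

Adding [0, 200] and [100, 100] gives [100, 300] on unbounded integers; only the explicit cast brings the result back to [0, 255].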

◆ operator+() [2/4]

ConstantInterval Halide::Internal::operator+	(	const ConstantInterval &	a,
		int64_t	b )

◆ operator-() [1/4]

ConstantInterval Halide::Internal::operator-	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

◆ operator-() [2/4]

ConstantInterval Halide::Internal::operator-	(	const ConstantInterval &	a,
		int64_t	b )

◆ operator/() [1/4]

ConstantInterval Halide::Internal::operator/	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

◆ operator/() [2/4]

ConstantInterval Halide::Internal::operator/	(	const ConstantInterval &	a,
		int64_t	b )

◆ operator*() [1/4]

ConstantInterval Halide::Internal::operator*	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

◆ operator*() [2/4]

ConstantInterval Halide::Internal::operator*	(	const ConstantInterval &	a,
		int64_t	b )

◆ operator%() [1/4]

ConstantInterval Halide::Internal::operator%	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

◆ operator%() [2/4]

ConstantInterval Halide::Internal::operator%	(	const ConstantInterval &	a,
		int64_t	b )

◆ min() [1/2]

ConstantInterval Halide::Internal::min	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

Referenced by Halide::Internal::GeneratorMinMax::min_forward().

◆ min() [2/2]

ConstantInterval Halide::Internal::min	(	const ConstantInterval &	a,
		int64_t	b )

◆ max() [1/2]

ConstantInterval Halide::Internal::max	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

Referenced by Halide::Internal::GeneratorMinMax::max_forward().

◆ max() [2/2]

ConstantInterval Halide::Internal::max	(	const ConstantInterval &	a,
		int64_t	b )

◆ abs()

ConstantInterval Halide::Internal::abs ( const ConstantInterval & a )

◆ operator<<() [1/22]

ConstantInterval Halide::Internal::operator<<	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

◆ operator<<() [2/22]

ConstantInterval Halide::Internal::operator<<	(	const ConstantInterval &	a,
		int64_t	b )

◆ operator<<() [3/22]

ConstantInterval Halide::Internal::operator<<	(	int64_t	a,
		const ConstantInterval &	b )

◆ operator>>() [1/3]

ConstantInterval Halide::Internal::operator>>	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

◆ operator>>() [2/3]

ConstantInterval Halide::Internal::operator>>	(	const ConstantInterval &	a,
		int64_t	b )

◆ operator>>() [3/3]

ConstantInterval Halide::Internal::operator>>	(	int64_t	a,
		const ConstantInterval &	b )

◆ operator<=() [1/3]

bool Halide::Internal::operator<=	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

Comparison operators on ConstantIntervals.

Returns whether the comparison is true for all values of the two intervals.
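
Because a comparison must hold for all pairs of values, it reduces to comparing one interval's upper bound with the other's lower bound. A sketch on bounded intervals only, with hypothetical names:

```cpp
#include <cstdint>

// Hypothetical bounded interval with inclusive endpoints.
struct BoundedIval {
    int64_t min, max;
};

// True only if x <= y for every x in a and every y in b.
bool always_le(const BoundedIval &a, const BoundedIval &b) {
    return a.max <= b.min;
}

// True only if x < y for every x in a and every y in b.
bool always_lt(const BoundedIval &a, const BoundedIval &b) {
    return a.max < b.min;
}
```

One consequence is that a false result does not imply the opposite comparison: for overlapping intervals such as [0, 5] and [3, 7], neither direction holds.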

◆ operator<=() [2/3]

bool Halide::Internal::operator<=	(	const ConstantInterval &	a,
		int64_t	b )

◆ operator<=() [3/3]

bool Halide::Internal::operator<=	(	int64_t	a,
		const ConstantInterval &	b )

◆ operator<() [1/3]

bool Halide::Internal::operator<	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

◆ operator<() [2/3]

bool Halide::Internal::operator<	(	const ConstantInterval &	a,
		int64_t	b )

◆ operator<() [3/3]

bool Halide::Internal::operator<	(	int64_t	a,
		const ConstantInterval &	b )

◆ operator>=() [1/3]

bool Halide::Internal::operator>=	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

inline

Definition at line 144 of file ConstantInterval.h.

◆ operator>() [1/3]

bool Halide::Internal::operator>	(	const ConstantInterval &	a,
		const ConstantInterval &	b )

inline

Definition at line 147 of file ConstantInterval.h.

◆ operator>=() [2/3]

bool Halide::Internal::operator>=	(	const ConstantInterval &	a,
		int64_t	b )

inline

Definition at line 150 of file ConstantInterval.h.

◆ operator>() [2/3]

bool Halide::Internal::operator>	(	const ConstantInterval &	a,
		int64_t	b )

inline

Definition at line 153 of file ConstantInterval.h.

◆ operator>=() [3/3]

bool Halide::Internal::operator>=	(	int64_t	a,
		const ConstantInterval &	b )

inline

Definition at line 156 of file ConstantInterval.h.

◆ operator>() [3/3]

bool Halide::Internal::operator>	(	int64_t	a,
		const ConstantInterval &	b )

inline

Definition at line 159 of file ConstantInterval.h.

◆ cplusplus_function_mangled_name()

std::string Halide::Internal::cplusplus_function_mangled_name	(	const std::string &	name,
		const std::vector< std::string > &	namespaces,
		Type	return_type,
		const std::vector< ExternFuncArgument > &	args,
		const Target &	target )

Return the mangled C++ name for a function.

The target parameter is used to decide on the C++ ABI/mangling style to use.

◆ cplusplus_mangle_test()

void Halide::Internal::cplusplus_mangle_test ( )

◆ common_subexpression_elimination() [1/2]

Expr Halide::Internal::common_subexpression_elimination	(	const Expr &	,
		bool	lift_all = false )

Replace each common sub-expression in the argument with a variable, and wrap the resulting expr in a let statement giving a value to that variable.

This is important to do within Halide (instead of punting to llvm), because exprs that come in from the front-end are small when considered as a graph, but combinatorially large when considered as a tree. For an example of such a case, see test/code_explosion.cpp.

The last parameter determines whether all common subexpressions are lifted, or only those that the simplifier would not substitute back in (e.g. addition of a constant).
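
The gap between graph size and tree size can be made concrete with a toy expression type. The sketch below builds e_0 = x, e_{i+1} = e_i + e_i with shared children and measures both sizes; Node and the function names are hypothetical and unrelated to Halide's IR classes.

```cpp
#include <cstdint>
#include <memory>
#include <set>

// Toy expression node: a leaf has no children, an add has two.
struct Node {
    std::shared_ptr<Node> a, b;
};
using NodePtr = std::shared_ptr<Node>;

// Build e_0 = x, e_{i+1} = e_i + e_i, where both operands are the same object.
NodePtr build_shared(int depth) {
    NodePtr e = std::make_shared<Node>();
    for (int i = 0; i < depth; i++) {
        e = std::make_shared<Node>(Node{e, e});
    }
    return e;
}

// Size when every use of a subexpression is visited separately (tree view).
uint64_t tree_size(const NodePtr &e) {
    if (!e) return 0;
    return 1 + tree_size(e->a) + tree_size(e->b);
}

// Size when each distinct node is counted once (graph view).
uint64_t graph_size(const NodePtr &e, std::set<const Node *> &seen) {
    if (!e || !seen.insert(e.get()).second) return 0;
    return 1 + graph_size(e->a, seen) + graph_size(e->b, seen);
}
```

At depth 10 the graph has 11 nodes while the tree view has 2047, and the tree view doubles with each extra level.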

◆ common_subexpression_elimination() [2/2]

Stmt Halide::Internal::common_subexpression_elimination	(	const Stmt &	,
		bool	lift_all = false )

Do common-subexpression-elimination on each expression in a statement.

Does not introduce let statements.

◆ cse_test()

void Halide::Internal::cse_test ( )

◆ operator<<() [4/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const Stmt &	)

Emit a halide statement on an output stream (such as std::cout) in a human-readable form.

◆ operator<<() [5/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	,
		const LoweredFunc &	)

Emit a halide LoweredFunc in a human readable format.

◆ PrintSpan()

template<typename T >

Halide::Internal::PrintSpan ( const T & ) -> PrintSpan< T >

◆ operator<<() [6/22]

template<typename StreamT , typename T >

StreamT & Halide::Internal::operator<<	(	StreamT &	stream,
		const PrintSpan< T > &	wrapper )

inline

Definition at line 85 of file Debug.h.

References Halide::Internal::PrintSpan< T >::span.

◆ PrintSpanLn()

template<typename T >

Halide::Internal::PrintSpanLn ( const T & ) -> PrintSpanLn< T >

◆ operator<<() [7/22]

template<typename StreamT , typename T >

StreamT & Halide::Internal::operator<<	(	StreamT &	stream,
		const PrintSpanLn< T > &	wrapper )

inline

Definition at line 119 of file Debug.h.

References Halide::Internal::PrintSpanLn< T >::span.

◆ debug_arguments()

void Halide::Internal::debug_arguments	(	LoweredFunc *	func,
		const Target &	t )

Injects debug prints in a LoweredFunc that describe the target and arguments.

Mutates the given func.

◆ debug_to_file()

Stmt Halide::Internal::debug_to_file	(	Stmt	s,
		const std::vector< Function > &	outputs,
		const std::map< std::string, Function > &	env )

Takes a statement with Realize nodes still unlowered.

If the corresponding functions have a debug_file set, then inject code that will dump the contents of those functions to a file after the realization.

◆ extract_odd_lanes()

Expr Halide::Internal::extract_odd_lanes ( const Expr & a )

Extract the odd-numbered lanes in a vector.

◆ extract_even_lanes()

Expr Halide::Internal::extract_even_lanes ( const Expr & a )

Extract the even-numbered lanes in a vector.

◆ extract_lane()

Expr Halide::Internal::extract_lane	(	const Expr &	vec,
		int	lane )

Extract the nth lane of a vector.

◆ rewrite_interleavings()

Stmt Halide::Internal::rewrite_interleavings ( const Stmt & s )

Look through a statement for expressions of the form select(ramp % 2 == 0, a, b) and replace them with calls to an interleave intrinsic.
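
As a rough illustration of why this rewrite is valid, the sketch below models vectors as `std::vector<int>` and checks that a lane-wise `select(lane % 2 == 0, a, b)` equals an interleave of the even lanes of `a` with the odd lanes of `b`. All names here are hypothetical stand-ins rather than Halide functions, and an even lane count is assumed.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Lanes 0, 2, 4, ... of v.
Vec even_lanes(const Vec &v) {
    Vec out;
    for (std::size_t i = 0; i < v.size(); i += 2) out.push_back(v[i]);
    return out;
}

// Lanes 1, 3, 5, ... of v.
Vec odd_lanes(const Vec &v) {
    Vec out;
    for (std::size_t i = 1; i < v.size(); i += 2) out.push_back(v[i]);
    return out;
}

// x0, y0, x1, y1, ... Assumes x.size() == y.size().
Vec interleave(const Vec &x, const Vec &y) {
    Vec out;
    for (std::size_t i = 0; i < x.size(); i++) {
        out.push_back(x[i]);
        out.push_back(y[i]);
    }
    return out;
}

// Lane-wise select(lane % 2 == 0, a, b). Assumes a.size() == b.size().
Vec select_even_odd(const Vec &a, const Vec &b) {
    Vec out(a.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        out[i] = (i % 2 == 0) ? a[i] : b[i];
    }
    return out;
}
```

With a = {0, 1, 2, 3} and b = {10, 11, 12, 13}, both `select_even_odd(a, b)` and `interleave(even_lanes(a), odd_lanes(b))` produce {0, 11, 2, 13}.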

◆ deinterleave_vector_test()

void Halide::Internal::deinterleave_vector_test ( )

◆ remove_let_definitions()

Expr Halide::Internal::remove_let_definitions ( const Expr & expr )

Remove all let definitions of expr.

◆ gather_variables() [1/2]

std::vector< int > Halide::Internal::gather_variables	(	const Expr &	expr,
		const std::vector< std::string > &	filter )

Return the indices of the variables in filter that expr depends on.

◆ gather_variables() [2/2]

std::vector< int > Halide::Internal::gather_variables	(	const Expr &	expr,
		const std::vector< Var > &	filter )

◆ gather_rvariables() [1/2]

std::map< std::string, ReductionVariableInfo > Halide::Internal::gather_rvariables ( const Expr & expr )

◆ gather_rvariables() [2/2]

std::map< std::string, ReductionVariableInfo > Halide::Internal::gather_rvariables ( const Tuple & tuple )

◆ add_let_expression()

Expr Halide::Internal::add_let_expression	(	const Expr &	expr,
		const std::map< std::string, Expr > &	let_var_mapping,
		const std::vector< std::string > &	let_variables )

Add necessary let expressions to expr.

◆ sort_expressions()

std::vector< Expr > Halide::Internal::sort_expressions ( const Expr & expr )

Topologically sort the expression graph expressed by expr.

◆ inference_bounds() [1/2]

std::map< std::string, Box > Halide::Internal::inference_bounds	(	const std::vector< Func > &	funcs,
		const std::vector< Box > &	output_bounds )

Compute the bounds of funcs.

The bounds represent a conservative region that is used by the "consumers" of the function, excluding the function itself.

◆ inference_bounds() [2/2]

std::map< std::string, Box > Halide::Internal::inference_bounds	(	const Func &	func,
		const Box &	output_bounds )

◆ box_to_vector()

std::vector< std::pair< Expr, Expr > > Halide::Internal::box_to_vector ( const Box & bounds )

Convert Box to vector of (min, extent)

◆ equal() [1/4]

bool Halide::Internal::equal	(	const RDom &	bounds0,
		const RDom &	bounds1 )

Return true if bounds0 and bounds1 represent the same bounds.

Referenced by equal(), graph_equal(), Halide::Internal::IRMatcher::SpecificExpr::match(), Halide::Internal::IRMatcher::Wild< i >::match(), Halide::Internal::AssociativeOp::Replacement::operator==(), and Halide::Internal::AssociativePattern::operator==().

◆ vars_to_strings()

std::vector< std::string > Halide::Internal::vars_to_strings ( const std::vector< Var > & vars )

Return a list of variable names.

◆ extract_rdom()

ReductionDomain Halide::Internal::extract_rdom ( const Expr & expr )

Return the reduction domain used by expr.

◆ solve_inverse()

std::pair< bool, Expr > Halide::Internal::solve_inverse	(	Expr	expr,
		const std::string &	new_var,
		const std::string &	var )

Given expr of the form new_var == f(var), solve for var == g(new_var). If multiple values of new_var correspond to the same var, introduce an RDom.

◆ find_buffer_param_calls()

std::map< std::string, BufferInfo > Halide::Internal::find_buffer_param_calls ( const Func & func )

◆ find_implicit_variables()

std::set< std::string > Halide::Internal::find_implicit_variables ( const Expr & expr )

Find all implicit variables in expr.

◆ substitute_rdom_predicate()

Expr Halide::Internal::substitute_rdom_predicate	(	const std::string &	name,
		const Expr &	replacement,
		const Expr &	expr )

Substitute the variable.

Also replace all occurrences in rdom.where() predicates.

◆ is_calling_function() [1/2]

bool Halide::Internal::is_calling_function	(	const std::string &	func_name,
		const Expr &	expr,
		const std::map< std::string, Expr > &	let_var_mapping )

Return true if expr contains a call to func_name.

◆ is_calling_function() [2/2]

bool Halide::Internal::is_calling_function	(	const Expr &	expr,
		const std::map< std::string, Expr > &	let_var_mapping )

Return true if expr depends on any function or buffer.

◆ substitute_call_arg_with_pure_arg()

Expr Halide::Internal::substitute_call_arg_with_pure_arg	(	Func	f,
		int	variable_id,
		const Expr &	e )

Replaces call to Func f in Expr e such that the call argument at variable_id is the pure argument.

◆ make_device_interface_call()

Expr Halide::Internal::make_device_interface_call	(	DeviceAPI	device_api,
		MemoryType	memory_type = MemoryType::Auto )

Get an Expr which evaluates to the device interface for the given device api at runtime.

◆ distribute_shifts()

Stmt Halide::Internal::distribute_shifts	(	const Stmt &	stmt,
		bool	multiply_adds )

◆ inject_early_frees()

Stmt Halide::Internal::inject_early_frees ( const Stmt & s )

Take a statement with allocations and inject markers (in the form of calls to "mark buffer dead") after the last use of each allocation.

Targets may use this to free buffers earlier than the close of their Allocate node.

◆ eliminate_bool_vectors() [1/2]

Stmt Halide::Internal::eliminate_bool_vectors ( const Stmt & s )

Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.

For example, instead of select(i1x8, u16x8, u16x8), the target would prefer to see select(u16x8, u16x8, u16x8), where the first argument is a vector of integers representing a mask. This pass converts vectors of bools to vectors of integers to meet this requirement. This is done by injecting intrinsics to convert bools to architecture-specific masks, and using a select_mask intrinsic instead of a Select node. This also converts any intrinsics that operate on vectorized conditions to a *_mask equivalent (if_then_else, require). Because the masks are architecture specific, they may not be stored or loaded. On Stores, the masks are converted to UInt(8) with a value of 0 or 1, which is our canonical in-memory representation of a bool.
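
The mask-based form can be sketched in plain C++. The encoding below (all-ones or all-zeros per 16-bit lane) is only one plausible choice made for illustration, since the text above notes that real masks are architecture-specific. `bool_to_mask`, `select_mask`, and `mask_to_stored_bool` are hypothetical helper names, not the Halide intrinsics themselves.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 8;
using U16x8 = std::array<std::uint16_t, kLanes>;
using Boolx8 = std::array<bool, kLanes>;

// Widen each bool to an all-ones or all-zeros 16-bit lane
// (an assumed mask encoding; real targets may differ).
U16x8 bool_to_mask(const Boolx8 &c) {
    U16x8 m{};
    for (std::size_t i = 0; i < kLanes; i++) {
        m[i] = static_cast<std::uint16_t>(c[i] ? 0xFFFF : 0x0000);
    }
    return m;
}

// Bitwise blend: take t where the mask is set, f elsewhere.
U16x8 select_mask(const U16x8 &m, const U16x8 &t, const U16x8 &f) {
    U16x8 r{};
    for (std::size_t i = 0; i < kLanes; i++) {
        r[i] = static_cast<std::uint16_t>((t[i] & m[i]) | (f[i] & ~m[i]));
    }
    return r;
}

// In-memory form described above: one byte per lane holding 0 or 1.
std::array<std::uint8_t, kLanes> mask_to_stored_bool(const U16x8 &m) {
    std::array<std::uint8_t, kLanes> out{};
    for (std::size_t i = 0; i < kLanes; i++) {
        out[i] = static_cast<std::uint8_t>(m[i] != 0 ? 1 : 0);
    }
    return out;
}
```

The point of the sketch is that once the condition has the same lane width as the operands, the select reduces to bitwise operations, and the mask only needs converting back to 0 or 1 when it is stored.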

◆ eliminate_bool_vectors() [2/2]

Expr Halide::Internal::eliminate_bool_vectors ( const Expr & s )

Some targets treat vectors of bools as integers of the same type that the boolean operation is being used to operate on.

For example, instead of select(i1x8, u16x8, u16x8), the target would prefer to see select(u16x8, u16x8, u16x8), where the first argument is a vector of integers representing a mask. This pass converts vectors of bools to vectors of integers to meet this requirement. This is done by injecting intrinsics to convert bools to architecture-specific masks, and using a select_mask intrinsic instead of a Select node. This also converts any intrinsics that operate on vectorized conditions to a *_mask equivalent (if_then_else, require). Because the masks are architecture specific, they may not be stored or loaded. On Stores, the masks are converted to UInt(8) with a value of 0 or 1, which is our canonical in-memory representation of a bool.

◆ eliminated_bool_type()

Type Halide::Internal::eliminated_bool_type	(	Type	bool_type,
		Type	other_type )

inline

If a type is a boolean vector, find the type that it has been changed to by eliminate_bool_vectors.

Definition at line 32 of file EliminateBoolVectors.h.

References Halide::Type::bits(), Halide::Type::Int, Halide::Type::is_vector(), Halide::Type::with_bits(), and Halide::Type::with_code().

◆ is_float16_transcendental()

bool Halide::Internal::is_float16_transcendental ( const Call * )

Check if a call is a float16 transcendental (e.g. sqrt_f16).

◆ lower_float16_transcendental_to_float32_equivalent()

Expr Halide::Internal::lower_float16_transcendental_to_float32_equivalent ( const Call * )

Implement a float16 transcendental using the float32 equivalent.

◆ float32_to_bfloat16()

Expr Halide::Internal::float32_to_bfloat16 ( Expr e )

Cast to/from float and bfloat using bitwise math.
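
For readers unfamiliar with the bit layout, here is a minimal scalar sketch of a float32 to bfloat16 conversion using only integer operations. It is a common round-to-nearest-even formulation and is not claimed to match the expressions Halide generates. The function names and the simplified NaN handling are assumptions made for this example.

```cpp
#include <cstdint>
#include <cstring>

// float32 -> bfloat16 bit pattern with round-to-nearest-even.
// bfloat16 keeps the top 16 bits of a float32 (sign, 8 exponent bits,
// 7 mantissa bits).
std::uint16_t float32_to_bfloat16_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    if ((u & 0x7FFFFFFFu) > 0x7F800000u) {
        // NaN: truncate and force a mantissa bit so it stays a NaN.
        return static_cast<std::uint16_t>((u >> 16) | 0x0040u);
    }
    // Add 0x7FFF plus the lowest kept bit, so exact ties round to even.
    std::uint32_t rounding_bias = 0x7FFFu + ((u >> 16) & 1u);
    return static_cast<std::uint16_t>((u + rounding_bias) >> 16);
}

// bfloat16 bit pattern -> float32: the low 16 mantissa bits become zero.
float bfloat16_bits_to_float32(std::uint16_t b) {
    std::uint32_t u = static_cast<std::uint32_t>(b) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

Values such as 1.0f and -2.5f are exactly representable in bfloat16, so they survive a round trip unchanged.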

◆ float32_to_float16()

Expr Halide::Internal::float32_to_float16 ( Expr e )

◆ float16_to_float32()

Expr Halide::Internal::float16_to_float32 ( Expr e )

◆ bfloat16_to_float32()

Expr Halide::Internal::bfloat16_to_float32 ( Expr e )

◆ lower_float16_cast()

Expr Halide::Internal::lower_float16_cast ( const Cast * op )

◆ unhandled_exception_handler()

HALIDE_EXPORT_SYMBOL void Halide::Internal::unhandled_exception_handler ( )

References unhandled_exception_handler().

Referenced by unhandled_exception_handler().

◆ ref_count< IRNode >()

template<>

RefCount & Halide::Internal::ref_count< IRNode > ( const IRNode * t )

inline noexcept

Definition at line 117 of file Expr.h.

◆ destroy< IRNode >()

template<>

void Halide::Internal::destroy< IRNode > ( const IRNode * t )

inline

Definition at line 122 of file Expr.h.

◆ is_unordered_parallel()

bool Halide::Internal::is_unordered_parallel ( ForType for_type )

Check if for_type executes for loop iterations in parallel and unordered.

Referenced by Halide::Internal::Dim::is_unordered_parallel(), and Halide::Internal::For::is_unordered_parallel().

◆ is_parallel()

bool Halide::Internal::is_parallel ( ForType for_type )

Returns true if for_type executes for loop iterations in parallel.

Referenced by Halide::Internal::Dim::is_parallel(), and Halide::Internal::For::is_parallel().

◆ is_gpu()

bool Halide::Internal::is_gpu ( ForType for_type )

Returns true if for_type is GPUBlock, GPUThread, or GPULane.

◆ stmt_or_expr_uses_vars()

template<typename StmtOrExpr , typename T >

bool Halide::Internal::stmt_or_expr_uses_vars	(	const StmtOrExpr &	e,
		const Scope< T > &	v,
		const Scope< Expr > &	s = Scope<Expr>::empty_scope() )

inline

Test if a statement or expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 101 of file ExprUsesVar.h.

References Halide::Internal::ExprUsesVars< T >::result.

Referenced by expr_uses_vars(), stmt_or_expr_uses_var(), and stmt_uses_vars().

◆ stmt_or_expr_uses_var()

template<typename StmtOrExpr >

bool Halide::Internal::stmt_or_expr_uses_var	(	const StmtOrExpr &	e,
		const std::string &	v,
		const Scope< Expr > &	s = Scope<Expr>::empty_scope() )

inline

Test if a statement or expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 113 of file ExprUsesVar.h.

References Halide::Internal::Scope< T >::push(), and stmt_or_expr_uses_vars().

Referenced by expr_uses_var(), and stmt_uses_var().

◆ expr_uses_var()

bool Halide::Internal::expr_uses_var	(	const Expr &	e,
		const std::string &	v,
		const Scope< Expr > &	s = Scope<Expr>::empty_scope() )

inline

Test if an expression references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 124 of file ExprUsesVar.h.

References stmt_or_expr_uses_var().

◆ stmt_uses_var()

bool Halide::Internal::stmt_uses_var	(	const Stmt &	stmt,
		const std::string &	v,
		const Scope< Expr > &	s = Scope<Expr>::empty_scope() )

inline

Test if a statement references or defines the given variable, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 133 of file ExprUsesVar.h.

References Halide::stmt, and stmt_or_expr_uses_var().

◆ expr_uses_vars()

template<typename T >

bool Halide::Internal::expr_uses_vars	(	const Expr &	e,
		const Scope< T > &	v,
		const Scope< Expr > &	s = Scope<Expr>::empty_scope() )

inline

Test if an expression references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 143 of file ExprUsesVar.h.

References stmt_or_expr_uses_vars().

◆ stmt_uses_vars()

template<typename T >

bool Halide::Internal::stmt_uses_vars	(	const Stmt &	stmt,
		const Scope< T > &	v,
		const Scope< Expr > &	s = Scope<Expr>::empty_scope() )

inline

Test if a statement references or defines any of the variables in a scope, additionally considering variables bound to Expr's in the scope provided in the final argument.

Definition at line 153 of file ExprUsesVar.h.

References Halide::stmt, and stmt_or_expr_uses_vars().

◆ extract_tile_operations()

Stmt Halide::Internal::extract_tile_operations ( const Stmt & s )

Rewrite any AMX tile operations that have been stored in the AMXTile memory type as intrinsic calls, to be used in the X86 backend.

◆ find_direct_calls()

std::map< std::string, Function > Halide::Internal::find_direct_calls ( const Function & f )

Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, including in update definitions, update index expressions, and RDom extents.

This map does not include the Function f, unless it is called recursively by itself.

◆ find_transitive_calls()

std::map< std::string, Function > Halide::Internal::find_transitive_calls ( const Function & f )

Construct a map from name to Function definition object for all Halide functions called directly in the definition of the Function f, or indirectly in those functions' definitions, recursively.

This map always includes the Function f.

◆ build_environment()

std::map< std::string, Function > Halide::Internal::build_environment ( const std::vector< Function > & funcs )

Find all Functions transitively referenced by any Function in funcs and return a map of them.

◆ called_funcs_in_order_found()

std::vector< Function > Halide::Internal::called_funcs_in_order_found ( const std::vector< Function > & funcs )

Returns the same Functions as build_environment, but returns a vector of Functions instead, where the order is the order in which the Functions were first encountered.

This is stable to changes in the names of the Functions.
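
The difference between a name-keyed map and a first-encountered ordering can be sketched with a toy call graph. The traversal below is a plain depth-first walk chosen for illustration and is an assumption; the order Halide actually visits calls in may differ. `CallGraph` and `reachable_in_order_found` are hypothetical names.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: function name -> names it calls directly.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Names reachable from roots, in the order a depth-first walk first
// encounters them. The result depends on graph structure, not on how
// the names happen to sort.
std::vector<std::string> reachable_in_order_found(
    const CallGraph &g, const std::vector<std::string> &roots) {
    std::vector<std::string> order;
    std::set<std::string> seen;
    // Push in reverse so the first root is visited first.
    std::vector<std::string> stack(roots.rbegin(), roots.rend());
    while (!stack.empty()) {
        std::string f = stack.back();
        stack.pop_back();
        if (!seen.insert(f).second) continue;  // already visited
        order.push_back(f);
        auto it = g.find(f);
        if (it == g.end()) continue;
        for (auto c = it->second.rbegin(); c != it->second.rend(); ++c) {
            stack.push_back(*c);
        }
    }
    return order;
}
```

Renaming a function in the graph changes where it falls when iterating a `std::map`, but leaves its position in the returned vector unchanged.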

◆ lower_widen_right_add()

Expr Halide::Internal::lower_widen_right_add	(	const Expr &	a,
		const Expr &	b )

Implement intrinsics using non-intrinsic equivalents.

◆ lower_widen_right_mul()

Expr Halide::Internal::lower_widen_right_mul	(	const Expr &	a,
		const Expr &	b )

◆ lower_widen_right_sub()

Expr Halide::Internal::lower_widen_right_sub	(	const Expr &	a,
		const Expr &	b )

◆ lower_widening_add()

Expr Halide::Internal::lower_widening_add	(	const Expr &	a,
		const Expr &	b )

◆ lower_widening_mul()

Expr Halide::Internal::lower_widening_mul	(	const Expr &	a,
		const Expr &	b )

◆ lower_widening_sub()

Expr Halide::Internal::lower_widening_sub	(	const Expr &	a,
		const Expr &	b )

◆ lower_widening_shift_left()

Expr Halide::Internal::lower_widening_shift_left	(	const Expr &	a,
		const Expr &	b )

◆ lower_widening_shift_right()

Expr Halide::Internal::lower_widening_shift_right	(	const Expr &	a,
		const Expr &	b )

◆ lower_rounding_shift_left()

Expr Halide::Internal::lower_rounding_shift_left	(	const Expr &	a,
		const Expr &	b )

◆ lower_rounding_shift_right()

Expr Halide::Internal::lower_rounding_shift_right	(	const Expr &	a,
		const Expr &	b )

◆ lower_saturating_add()

Expr Halide::Internal::lower_saturating_add	(	const Expr &	a,
		const Expr &	b )

◆ lower_saturating_sub()

Expr Halide::Internal::lower_saturating_sub	(	const Expr &	a,
		const Expr &	b )

◆ lower_saturating_cast()

Expr Halide::Internal::lower_saturating_cast	(	const Type &	t,
		const Expr &	a )

◆ lower_halving_add()

Expr Halide::Internal::lower_halving_add	(	const Expr &	a,
		const Expr &	b )

◆ lower_halving_sub()

Expr Halide::Internal::lower_halving_sub	(	const Expr &	a,
		const Expr &	b )

◆ lower_rounding_halving_add()

Expr Halide::Internal::lower_rounding_halving_add	(	const Expr &	a,
		const Expr &	b )

◆ lower_sorted_avg()

Expr Halide::Internal::lower_sorted_avg	(	const Expr &	a,
		const Expr &	b )

◆ lower_mul_shift_right()

Expr Halide::Internal::lower_mul_shift_right	(	const Expr &	a,
		const Expr &	b,
		const Expr &	q )

◆ lower_rounding_mul_shift_right()

Expr Halide::Internal::lower_rounding_mul_shift_right	(	const Expr &	a,
		const Expr &	b,
		const Expr &	q )

◆ lower_intrinsic()

Expr Halide::Internal::lower_intrinsic ( const Call * op )

Replace one of the above ops with equivalent arithmetic.
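
To give a flavor of what "equivalent arithmetic" can look like, the sketch below writes a few of these operations for 8-bit unsigned scalars using well-known bit tricks. These are illustrative formulations only, not necessarily the expressions Halide emits, and the `_u8` function names are made up for this example.

```cpp
#include <cstdint>

// (a + b) / 2 rounded down, computed without a wider intermediate.
std::uint8_t halving_add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((a & b) + ((a ^ b) >> 1));
}

// (a + b + 1) / 2, again without a wider intermediate.
std::uint8_t rounding_halving_add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((a | b) - ((a ^ b) >> 1));
}

// a + b clamped to the maximum value of the type.
std::uint8_t saturating_add_u8(std::uint8_t a, std::uint8_t b) {
    std::uint8_t sum = static_cast<std::uint8_t>(a + b);  // wraps mod 256
    return sum < a ? std::uint8_t{255} : sum;             // wrapped => overflowed
}

// a + b computed in a type twice as wide, so it cannot overflow.
std::uint16_t widening_add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint16_t>(
        static_cast<std::uint16_t>(a) + static_cast<std::uint16_t>(b));
}
```

The halving variants matter on targets without a native averaging instruction, because the naive `(a + b) / 2` would overflow in the narrow type.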

◆ find_intrinsics() [1/2]

Stmt Halide::Internal::find_intrinsics ( const Stmt & s )

Replace common arithmetic patterns with intrinsics.

◆ find_intrinsics() [2/2]

Expr Halide::Internal::find_intrinsics ( const Expr & e )

◆ lower_intrinsics() [1/2]

Expr Halide::Internal::lower_intrinsics ( const Expr & e )

The reverse of find_intrinsics.

◆ lower_intrinsics() [2/2]

Stmt Halide::Internal::lower_intrinsics ( const Stmt & s )

◆ flatten_nested_ramps() [1/2]

Stmt Halide::Internal::flatten_nested_ramps ( const Stmt & s )

Take a statement/expression and replace nested ramps and broadcasts.

◆ flatten_nested_ramps() [2/2]

Expr Halide::Internal::flatten_nested_ramps ( const Expr & e )

◆ check_types() [1/2]

template<typename Last >

void Halide::Internal::check_types	(	const Tuple &	t,
		int	idx )

inline

Definition at line 2616 of file Func.h.

References Halide::type_of(), and user_assert.

Referenced by check_types(), Halide::evaluate(), and Halide::evaluate_may_gpu().

◆ check_types() [2/2]

template<typename First , typename Second , typename... Rest>

void Halide::Internal::check_types	(	const Tuple &	t,
		int	idx )

inline

Definition at line 2625 of file Func.h.

References check_types().

◆ assign_results() [1/2]

template<typename Last >

void Halide::Internal::assign_results	(	Realization &	r,
		int	idx,
		Last	last )

inline

Definition at line 2631 of file Func.h.

References Buffer.

Referenced by assign_results(), Halide::evaluate(), and Halide::evaluate_may_gpu().

◆ assign_results() [2/2]

template<typename First , typename Second , typename... Rest>

void Halide::Internal::assign_results	(	Realization &	r,
		int	idx,
		First	first,
		Second	second,
		Rest &&...	rest )

inline

Definition at line 2637 of file Func.h.

References assign_results().

◆ schedule_scalar()

void Halide::Internal::schedule_scalar ( Func f )

inline

Definition at line 2684 of file Func.h.

References Halide::get_jit_target_from_environment(), Halide::Func::gpu_single_thread(), Halide::Target::has_feature(), Halide::Target::has_gpu_feature(), Halide::Func::hexagon(), and Halide::Target::HVX.

Referenced by Halide::evaluate_may_gpu().

◆ deep_copy()

std::pair< std::vector< Function >, std::map< std::string, Function > > Halide::Internal::deep_copy	(	const std::vector< Function > &	outputs,
		const std::map< std::string, Function > &	env )

Deep copy an entire Function DAG.

◆ zero_gpu_loop_mins()

Stmt Halide::Internal::zero_gpu_loop_mins ( const Stmt & s )

Rewrite all GPU loops to have a min of zero.

◆ fuse_gpu_thread_loops()

Stmt Halide::Internal::fuse_gpu_thread_loops ( Stmt s )

Converts Halide's GPGPU IR to the OpenCL/CUDA/Metal model.

Within every loop over gpu block indices, fuse the inner loops over thread indices into a single loop (with predication to turn off threads). Push if conditions between GPU blocks to the innermost GPU threads. Also injects synchronization points as needed, and hoists shared allocations at the block level out into a single shared memory array, and heap allocations into a slice of a global pool allocated outside the kernel.

◆ fuzz_float_stores()

Stmt Halide::Internal::fuzz_float_stores ( const Stmt & s )

On every store of a floating point value, mask off the least-significant-bit of the mantissa.

We've found that whether or not this dramatically changes the output of a pipeline correlates very well with whether or not a pipeline will produce very different outputs on different architectures (e.g. with and without FMA). It's also a useful way to detect bad tests, such as those that expect exact floating point equality across platforms.
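
On a single float32 value, the masking step amounts to clearing bit 0 of the IEEE 754 bit pattern. The scalar sketch below shows that operation in isolation. The helper names are hypothetical, and the real pass operates on Store nodes in the IR rather than on host floats.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a float32 as its raw bit pattern.
std::uint32_t float_to_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Reinterpret a raw bit pattern as a float32.
float bits_to_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Clear the least-significant mantissa bit, leaving sign, exponent,
// and the remaining mantissa bits untouched.
float mask_mantissa_lsb(float f) {
    return bits_to_float(float_to_bits(f) & ~std::uint32_t{1});
}
```

A value whose last mantissa bit is already zero, such as 1.0f, passes through unchanged; its immediate successor (bit pattern 0x3F800001) is pulled back to 1.0f.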

◆ generator_test()

void Halide::Internal::generator_test ( )

◆ parameter_constraints()

std::vector< Expr > Halide::Internal::parameter_constraints ( const Parameter & p )

◆ enum_to_string()

template<typename T >

HALIDE_NO_USER_CODE_INLINE std::string Halide::Internal::enum_to_string	(	const std::map< std::string, T > &	enum_map,
		const T &	t )

Definition at line 297 of file Generator.h.

References user_error.

Referenced by Halide::Internal::GeneratorParam_Enum< T >::get_default_value(), and halide_type_to_enum_string().

◆ enum_from_string()

template<typename T >

T Halide::Internal::enum_from_string	(	const std::map< std::string, T > &	enum_map,
		const std::string &	s )

Definition at line 308 of file Generator.h.

References user_assert.

◆ get_halide_type_enum_map()

const std::map< std::string, Halide::Type > & Halide::Internal::get_halide_type_enum_map ( )

extern

Referenced by halide_type_to_enum_string().

◆ halide_type_to_enum_string()

std::string Halide::Internal::halide_type_to_enum_string ( const Type & t )

inline

Definition at line 315 of file Generator.h.

References enum_to_string(), and get_halide_type_enum_map().

◆ halide_type_to_c_source()

std::string Halide::Internal::halide_type_to_c_source ( const Type & t )

Referenced by Halide::Internal::GeneratorParam_Type< T >::get_default_value().

◆ halide_type_to_c_type()

std::string Halide::Internal::halide_type_to_c_type ( const Type & t )

Referenced by Halide::Internal::GeneratorInput_Buffer< T2 >::get_c_type(), and Halide::Internal::GeneratorOutput_Buffer< T >::get_c_type().

◆ get_registered_generators()

const GeneratorFactoryProvider & Halide::Internal::get_registered_generators ( )

Return a GeneratorFactoryProvider that knows about all the currently-registered C++ Generators.

◆ generate_filter_main() [1/2]

int Halide::Internal::generate_filter_main	(	int	argc,
		char **	argv )

generate_filter_main() is a convenient wrapper for GeneratorRegistry::create() + compile_to_files(); it can be trivially wrapped by a "real" main() to produce a command-line utility for ahead-of-time filter compilation.

◆ generate_filter_main() [2/2]

int Halide::Internal::generate_filter_main	(	int	argc,
		char **	argv,
		const GeneratorFactoryProvider &	generator_factory_provider )

This overload of generate_filter_main lets you provide your own provider for how to enumerate and/or create the generators based on registration name; this is useful if you want to re-use the 'main' logic but avoid the global Generator registry (e.g. for bindings in languages other than C++).

◆ parse_scalar()

template<typename T >

T Halide::Internal::parse_scalar ( const std::string & value )

Definition at line 2882 of file Generator.h.

References parse_scalar(), and user_assert.

Referenced by parse_scalar().

◆ parse_halide_type_list()

std::vector< Type > Halide::Internal::parse_halide_type_list ( const std::string & types )

References parse_halide_type_list().

Referenced by parse_halide_type_list().

◆ execute_generator()

void Halide::Internal::execute_generator ( const ExecuteGeneratorArgs & args )

Execute a Generator for AOT compilation – this provides the implementation of the command-line Generator interface generate_filter_main(), but with a structured API that is more suitable for calling directly from code (vs command line).

References execute_generator().

Referenced by execute_generator().

◆ inject_hexagon_rpc()

Stmt Halide::Internal::inject_hexagon_rpc	(	Stmt	s,
		const Target &	host_target,
		Module &	module )

Pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module.

◆ compile_module_to_hexagon_shared_object()

Buffer< uint8_t > Halide::Internal::compile_module_to_hexagon_shared_object ( const Module & device_code )

◆ optimize_hexagon_shuffles()

Stmt Halide::Internal::optimize_hexagon_shuffles	(	const Stmt &	s,
		int	lut_alignment )

Replace indirect and other loads with simple loads + vlut calls.

◆ scatter_gather_generator()

Stmt Halide::Internal::scatter_gather_generator ( Stmt s )

◆ optimize_hexagon_instructions()

Stmt Halide::Internal::optimize_hexagon_instructions	(	Stmt	s,
		const Target &	t )

Hexagon deinterleaves when performing widening operations, and interleaves when performing narrowing operations.

This pass rewrites widenings/narrowings to be explicit in the IR, and attempts to simplify away most of the interleaving/deinterleaving.

◆ native_deinterleave()

Expr Halide::Internal::native_deinterleave ( const Expr & x )

Generate deinterleave or interleave operations, operating on groups of vectors at a time.

◆ native_interleave()

Expr Halide::Internal::native_interleave ( const Expr & x )

◆ is_native_deinterleave()

bool Halide::Internal::is_native_deinterleave ( const Expr & x )

◆ is_native_interleave()

bool Halide::Internal::is_native_interleave ( const Expr & x )

◆ type_suffix() [1/4]

std::string Halide::Internal::type_suffix	(	Type	type,
		bool	signed_variants = true )

◆ type_suffix() [2/4]

std::string Halide::Internal::type_suffix	(	const Expr &	a,
		bool	signed_variants = true )

◆ type_suffix() [3/4]

std::string Halide::Internal::type_suffix	(	const Expr &	a,
		const Expr &	b,
		bool	signed_variants = true )

◆ type_suffix() [4/4]

std::string Halide::Internal::type_suffix	(	const std::vector< Expr > &	ops,
		bool	signed_variants = true )

◆ infer_arguments()

std::vector< InferredArgument > Halide::Internal::infer_arguments	(	const Stmt &	body,
		const std::vector< Function > &	outputs )

◆ call_extern_and_assert()

Stmt Halide::Internal::call_extern_and_assert	(	const std::string &	name,
		const std::vector< Expr > &	args )

A helper function to call an extern function, and assert that it returns 0.

◆ inject_host_dev_buffer_copies()

Stmt Halide::Internal::inject_host_dev_buffer_copies	(	Stmt	s,
		const Target &	t )

Inject calls to halide_device_malloc, halide_copy_to_device, and halide_copy_to_host as needed.

◆ inline_function() [1/3]

Stmt Halide::Internal::inline_function	(	Stmt	s,
		const Function &	f )

Inline a single named function, which must be pure.

For a pure function to be inlined, it must not have any specializations (i.e. it can only have one values definition).

◆ inline_function() [2/3]

Expr Halide::Internal::inline_function	(	Expr	e,
		const Function &	f )

◆ inline_function() [3/3]

void Halide::Internal::inline_function	(	Function	caller,
		const Function &	f )

◆ validate_schedule_inlined_function()

void Halide::Internal::validate_schedule_inlined_function ( Function f )

Check if the schedule of an inlined function is legal, throwing an error if it is not.

◆ ref_count()

template<typename T >

RefCount & Halide::Internal::ref_count ( const T * t )

noexcept

Because in this header we don't yet know how client classes store their RefCount (and we don't want to depend on the declarations of the client classes), any class that you want to hold onto via one of these must provide implementations of ref_count and destroy, which we forward-declare here.

E.g. if you want to use IntrusivePtr<MyClass>, then you should define something like this in MyClass.cpp (assuming MyClass has a field: mutable RefCount ref_count):

template<> RefCount &ref_count<MyClass>(const MyClass *c) noexcept { return c->ref_count; }
template<> void destroy<MyClass>(const MyClass *c) { delete c; }

Referenced by Halide::Internal::IntrusivePtr< T >::is_sole_reference().
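
The forward-declare-then-specialize pattern described above can be tried out in isolation. The sketch below uses simplified stand-ins: `RefCount` here is just an atomic counter, and `IntrusivePtrSketch` is a hypothetical, cut-down smart pointer with copy assignment left out for brevity. Neither is Halide's actual implementation. The `live` counter on `MyClass` exists only to make destruction observable.

```cpp
#include <atomic>

// Simplified stand-in for Halide's RefCount.
struct RefCount {
    std::atomic<int> count{0};
};

// Forward declarations; each client class supplies specializations.
template<typename T> RefCount &ref_count(const T *t) noexcept;
template<typename T> void destroy(const T *t);

// Hypothetical minimal intrusive pointer built on those two hooks.
template<typename T>
class IntrusivePtrSketch {
    T *ptr = nullptr;

public:
    IntrusivePtrSketch() = default;
    explicit IntrusivePtrSketch(T *p) : ptr(p) {
        if (ptr) ref_count(ptr).count++;
    }
    IntrusivePtrSketch(const IntrusivePtrSketch &other) : ptr(other.ptr) {
        if (ptr) ref_count(ptr).count++;
    }
    IntrusivePtrSketch &operator=(const IntrusivePtrSketch &) = delete;
    ~IntrusivePtrSketch() {
        // Drop our reference; the last owner destroys the object.
        if (ptr && --ref_count(ptr).count == 0) destroy(ptr);
    }
    T *get() const { return ptr; }
};

// A client class that stores its own count, as in the text above.
struct MyClass {
    mutable RefCount ref_count;
    static int live;  // number of live instances, for demonstration
    MyClass() { live++; }
    ~MyClass() { live--; }
};
int MyClass::live = 0;

template<> RefCount &ref_count<MyClass>(const MyClass *c) noexcept { return c->ref_count; }
template<> void destroy<MyClass>(const MyClass *c) { delete c; }
```

Because the pointer template only calls `ref_count` and `destroy` through the forward declarations, it never needs to see the definition of `MyClass`, which is the decoupling the header is after.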

◆ destroy()

template<typename T >

void Halide::Internal::destroy ( const T * t )

◆ equal_impl()

bool Halide::Internal::equal_impl	(	const IRNode &	a,
		const IRNode &	b )

Referenced by equal(), and graph_equal().

◆ graph_equal_impl()

bool Halide::Internal::graph_equal_impl	(	const IRNode &	a,
		const IRNode &	b )

◆ less_than_impl()

bool Halide::Internal::less_than_impl	(	const IRNode &	a,
		const IRNode &	b )

Referenced by less_than().

◆ graph_less_than_impl()

bool Halide::Internal::graph_less_than_impl	(	const IRNode &	a,
		const IRNode &	b )

Referenced by graph_less_than().

◆ equal() [2/4]

HALIDE_ALWAYS_INLINE bool Halide::Internal::equal	(	const Expr &	a,
		int	b )

Compare an Expr to an int literal.

This is a somewhat common use of equal in tests. Making this separate avoids constructing an Expr out of the int literal just to check if it's equal to a.

Definition at line 28 of file IREquality.h.

References Halide::Internal::IRHandle::as(), Halide::Int(), and Halide::Expr::type().

◆ equal() [3/4]

HALIDE_ALWAYS_INLINE bool Halide::Internal::equal	(	const IRNode &	a,
		const IRNode &	b )

Check if two defined Stmts or Exprs are equal.

Definition at line 38 of file IREquality.h.

References equal_impl(), and Halide::Internal::IRNode::node_type.

◆ equal() [4/4]

HALIDE_ALWAYS_INLINE bool Halide::Internal::equal	(	const IRHandle &	a,
		const IRHandle &	b )

Check if two possibly-undefined Stmts or Exprs are equal.

Definition at line 50 of file IREquality.h.

References Halide::Internal::IntrusivePtr< T >::defined(), equal(), and Halide::Internal::IntrusivePtr< T >::get().

◆ graph_equal() [1/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_equal	(	const IRNode &	a,
		const IRNode &	b )

Check if two defined Stmts or Exprs are equal.

Safe to call on Exprs that haven't been passed to common_subexpression_elimination.

Definition at line 63 of file IREquality.h.

References equal_impl(), and Halide::Internal::IRNode::node_type.

◆ graph_equal() [2/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_equal	(	const IRHandle &	a,
		const IRHandle &	b )

Check if two possibly-undefined Stmts or Exprs are equal.

Safe to call on Exprs that haven't been passed to common_subexpression_elimination.

Definition at line 76 of file IREquality.h.

References Halide::Internal::IntrusivePtr< T >::defined(), equal(), and Halide::Internal::IntrusivePtr< T >::get().

◆ less_than() [1/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::less_than	(	const IRNode &	a,
		const IRNode &	b )

Check if two defined Stmts or Exprs are in a lexicographic order.

For use in map keys.

Definition at line 89 of file IREquality.h.

References less_than_impl(), and Halide::Internal::IRNode::node_type.

Referenced by less_than(), and Halide::Internal::IRDeepCompare::operator()().

◆ less_than() [2/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::less_than	(	const IRHandle &	a,
		const IRHandle &	b )

Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.

For use in map keys.

Definition at line 102 of file IREquality.h.

References Halide::Internal::IntrusivePtr< T >::defined(), Halide::Internal::IntrusivePtr< T >::get(), and less_than().
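
The "for use in map keys" remark refers to the usual requirement that an ordered container needs a strict weak ordering on its keys. A minimal sketch of the idea follows, using a hypothetical toy node (`ToyNode`) and comparator (`ToyDeepCompare`) rather than Halide's IR. The comparison is structural, so two separately allocated but identical trees act as the same key, and an undefined (null) handle is assumed to sort before any defined one.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical toy IR node, not Halide's.
struct ToyNode {
    int node_type;
    std::string name;
    std::shared_ptr<ToyNode> a, b;
};
using ToyHandle = std::shared_ptr<ToyNode>;

ToyHandle leaf(const std::string &name) {
    return std::make_shared<ToyNode>(ToyNode{0, name, nullptr, nullptr});
}

// Lexicographic structural order: node type, then name, then children.
bool toy_less_than(const ToyNode *x, const ToyNode *y) {
    if (x == y) return false;  // same node, or both undefined
    if (!x) return true;       // undefined sorts first (an assumption)
    if (!y) return false;
    if (x->node_type != y->node_type) return x->node_type < y->node_type;
    if (x->name != y->name) return x->name < y->name;
    if (toy_less_than(x->a.get(), y->a.get())) return true;
    if (toy_less_than(y->a.get(), x->a.get())) return false;
    return toy_less_than(x->b.get(), y->b.get());
}

// Comparator object suitable as the third template argument of std::map.
struct ToyDeepCompare {
    bool operator()(const ToyHandle &x, const ToyHandle &y) const {
        return toy_less_than(x.get(), y.get());
    }
};
```

With `std::map<ToyHandle, int, ToyDeepCompare>`, inserting under `leaf("x")` twice updates a single entry even though the two handles point at different allocations.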

◆ graph_less_than() [1/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_less_than	(	const IRNode &	a,
		const IRNode &	b )

Check if two defined Stmts or Exprs are in a lexicographic order.

For use in map keys. Safe to use on Exprs that haven't been passed to common_subexpression_elimination.

Definition at line 118 of file IREquality.h.

References graph_less_than_impl(), and Halide::Internal::IRNode::node_type.

Referenced by graph_less_than(), and Halide::Internal::IRGraphDeepCompare::operator()().

◆ graph_less_than() [2/2]

HALIDE_ALWAYS_INLINE bool Halide::Internal::graph_less_than	(	const IRHandle &	a,
		const IRHandle &	b )

Check if two possibly-undefined Stmts or Exprs are in a lexicographic order.

For use in map keys. Safe to use on Exprs that haven't been passed to common_subexpression_elimination.

Definition at line 132 of file IREquality.h.

References Halide::Internal::IntrusivePtr< T >::defined(), Halide::Internal::IntrusivePtr< T >::get(), and graph_less_than().

◆ ir_equality_test()

void Halide::Internal::ir_equality_test ( )

◆ expr_match() [1/2]

bool Halide::Internal::expr_match	(	const Expr &	pattern,
		const Expr &	expr,
		std::vector< Expr > &	result )

Does the first expression have the same structure as the second? Variables in the first expression with the name * are interpreted as wildcards, and their matching equivalent in the second expression is placed in the vector given as the third argument.

Wildcards require the types to match. For the type bits and width, a 0 indicates "match anything". So an Int(8, 0) will match 8-bit integer vectors of any width (including scalars), and a UInt(0, 0) will match any unsigned integer type.

For example:

Expr x = Variable::make(Int(32), "*");

expr_match(x + x, 3 + (2*k), result)

should return true, and set result[0] to 3 and result[1] to 2*k.

◆ expr_match() [2/2]

bool Halide::Internal::expr_match	(	const Expr &	pattern,
		const Expr &	expr,
		std::map< std::string, Expr > &	result )

Does the first expression have the same structure as the second? Variables are matched consistently.

The first time a variable is matched, it assumes the value of the matching part of the second expression. Subsequent matches must be equal to the first match.

For example:

Var x("x"), y("y");

expr_match(x*(x + y), a*(a + b), result)

should return true, and set result["x"] = a, and result["y"] = b.
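The consistent-matching rule can be illustrated with a small self-contained sketch. It uses a toy expression tree rather than Halide's Expr, and every name in it (Node, var, add, mul, same, toy_match) is hypothetical; it is not how expr_match is implemented.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy expression tree standing in for Halide's Expr (hypothetical types).
struct Node;
using NodePtr = std::shared_ptr<const Node>;
struct Node {
    enum Kind { Var, Add, Mul } kind;
    std::string name;  // used when kind == Var
    NodePtr a, b;      // used when kind is Add or Mul
};

NodePtr var(const std::string &n) {
    return std::make_shared<Node>(Node{Node::Var, n, nullptr, nullptr});
}
NodePtr add(NodePtr a, NodePtr b) {
    return std::make_shared<Node>(Node{Node::Add, "", std::move(a), std::move(b)});
}
NodePtr mul(NodePtr a, NodePtr b) {
    return std::make_shared<Node>(Node{Node::Mul, "", std::move(a), std::move(b)});
}

// Structural equality of two toy expressions.
bool same(const NodePtr &x, const NodePtr &y) {
    if (x->kind != y->kind) return false;
    if (x->kind == Node::Var) return x->name == y->name;
    return same(x->a, y->a) && same(x->b, y->b);
}

// The first time a pattern variable is seen it is bound to the matching
// sub-expression; later occurrences must equal the earlier binding.
bool toy_match(const NodePtr &pattern, const NodePtr &expr,
               std::map<std::string, NodePtr> &result) {
    if (pattern->kind == Node::Var) {
        auto it = result.find(pattern->name);
        if (it == result.end()) {
            result[pattern->name] = expr;
            return true;
        }
        return same(it->second, expr);
    }
    if (pattern->kind != expr->kind) return false;
    return toy_match(pattern->a, expr->a, result) &&
           toy_match(pattern->b, expr->b, result);
}
```

In this sketch, matching the pattern x*(x + y) against a*(a + b) binds x to a and y to b, while matching it against a*(b + b) fails because the second occurrence of x disagrees with the first binding.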

◆ with_lanes()

Expr Halide::Internal::with_lanes	(	const Expr &	x,
		int	lanes )

Rewrite the expression x to have the given number of lanes.

This is useful for substituting the results of expr_match into a pattern expression.

◆ expr_match_test()

void Halide::Internal::expr_match_test ( )

◆ mutate_region()

template<typename Mutator , typename... Args>

std::pair< Region, bool > Halide::Internal::mutate_region	(	Mutator *	mutator,
		const Region &	bounds,
		Args &&...	args )

A helper function for mutator-like things to mutate regions.

Definition at line 124 of file IRMutator.h.

References Halide::Internal::IntrusivePtr< T >::same_as().

◆ is_const() [1/2]

bool Halide::Internal::is_const ( const Expr & e )

Is the expression either an IntImm, a FloatImm, a StringImm, or a Cast of the same, or a Ramp or Broadcast of the same.

Doesn't do any constant folding.

Referenced by Halide::Internal::IRMatcher::IsConst< A >::make_folded_const().

◆ is_const() [2/2]

bool Halide::Internal::is_const	(	const Expr &	e,
		int64_t	v )

Is the expression an IntImm or FloatImm of a particular value, or a Cast or Broadcast of the same.

◆ as_const_int()

std::optional< int64_t > Halide::Internal::as_const_int ( const Expr & e )

If an expression is an IntImm or a Broadcast of an IntImm, return its value.

Otherwise returns std::nullopt.

◆ as_const_uint()

std::optional< uint64_t > Halide::Internal::as_const_uint ( const Expr & e )

If an expression is a UIntImm or a Broadcast of a UIntImm, return its value.

Otherwise returns std::nullopt.

◆ as_const_float()

std::optional< double > Halide::Internal::as_const_float ( const Expr & e )

If an expression is a FloatImm or a Broadcast of a FloatImm, return its value.

Otherwise returns std::nullopt.

◆ is_const_power_of_two_integer() [1/3]

std::optional< int > Halide::Internal::is_const_power_of_two_integer ( const Expr & e )

Is the expression a constant integer power of two.

Returns log base two of the expression if it is, or std::nullopt if not. Also returns std::nullopt for non-integer types.
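As a rough illustration of the return convention (log base two on success, std::nullopt otherwise), here is a minimal sketch for a plain uint64_t. The function name log2_if_power_of_two is hypothetical, and this is not Halide's implementation.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the uint64_t overload.
std::optional<int> log2_if_power_of_two(uint64_t v) {
    // A power of two has exactly one bit set, so v & (v - 1) is zero.
    if (v == 0 || (v & (v - 1)) != 0) {
        return std::nullopt;
    }
    int exponent = 0;
    while ((v >>= 1) != 0) {
        ++exponent;
    }
    return exponent;
}
```

Callers would typically test the optional before using it, for example `if (auto bits = log2_if_power_of_two(n)) { /* use *bits */ }`.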

◆ is_const_power_of_two_integer() [2/3]

std::optional< int > Halide::Internal::is_const_power_of_two_integer ( uint64_t )

◆ is_const_power_of_two_integer() [3/3]

std::optional< int > Halide::Internal::is_const_power_of_two_integer ( int64_t )

◆ is_positive_const()

bool Halide::Internal::is_positive_const ( const Expr & e )

Is the expression a const (as defined by is_const), and also strictly greater than zero (in all lanes, if a vector expression)

◆ is_negative_const()

bool Halide::Internal::is_negative_const ( const Expr & e )

Is the expression a const (as defined by is_const), and also strictly less than zero (in all lanes, if a vector expression)

◆ is_undef()

bool Halide::Internal::is_undef ( const Expr & e )

Is the expression an undef.

◆ is_const_zero()

bool Halide::Internal::is_const_zero ( const Expr & e )

Is the expression a const (as defined by is_const), and also equal to zero (in all lanes, if a vector expression)

Referenced by Halide::Internal::IRMatcher::NegateOp< A >::match().

◆ is_const_one()

bool Halide::Internal::is_const_one ( const Expr & e )

Is the expression a const (as defined by is_const), and also equal to one (in all lanes, if a vector expression)

Referenced by Halide::Internal::IRMatcher::CanProve< A, Prover >::make_folded_const().

◆ is_no_op()

bool Halide::Internal::is_no_op ( const Stmt & s )

Is the statement a no-op (which we represent as either an undefined Stmt, or as an Evaluate node of a constant)

◆ is_pure()

bool Halide::Internal::is_pure ( const Expr & e )

Is the expression pure, i.e. does it 1) take on the same value no matter where it appears in a Stmt, and 2) have no side effects when evaluated.

◆ make_const() [1/12]

Expr Halide::Internal::make_const	(	Type	t,
		int64_t	val )

Construct an immediate of the given type from any numeric C++ type.

Referenced by Halide::Internal::IRMatcher::fuzz_test_rule(), Halide::Internal::IRMatcher::IntLiteral::make(), make_const(), make_const(), make_const(), make_const(), make_const(), make_const(), make_const(), make_const(), make_const(), Halide::Internal::GeneratorParamImpl< T >::operator Expr(), Halide::Internal::IRMatcher::Rewriter< Instance >::operator()(), and Halide::Internal::IRMatcher::Rewriter< Instance >::operator()().

◆ make_const() [2/12]

Expr Halide::Internal::make_const	(	Type	t,
		uint64_t	val )

◆ make_const() [3/12]

Expr Halide::Internal::make_const	(	Type	t,
		double	val )

◆ make_const() [4/12]

Expr Halide::Internal::make_const	(	Type	t,
		int32_t	val )

inline

Definition at line 85 of file IROperator.h.

References make_const().

◆ make_const() [5/12]

Expr Halide::Internal::make_const	(	Type	t,
		uint32_t	val )

inline

Definition at line 88 of file IROperator.h.

References make_const().

◆ make_const() [6/12]

Expr Halide::Internal::make_const	(	Type	t,
		int16_t	val )

inline

Definition at line 91 of file IROperator.h.

References make_const().

◆ make_const() [7/12]

Expr Halide::Internal::make_const	(	Type	t,
		uint16_t	val )

inline

Definition at line 94 of file IROperator.h.

References make_const().

◆ make_const() [8/12]

Expr Halide::Internal::make_const	(	Type	t,
		int8_t	val )

inline

Definition at line 97 of file IROperator.h.

References make_const().

◆ make_const() [9/12]

Expr Halide::Internal::make_const	(	Type	t,
		uint8_t	val )

inline

Definition at line 100 of file IROperator.h.

References make_const().

◆ make_const() [10/12]

Expr Halide::Internal::make_const	(	Type	t,
		bool	val )

inline

Definition at line 103 of file IROperator.h.

References make_const().

◆ make_const() [11/12]

Expr Halide::Internal::make_const	(	Type	t,
		float	val )

inline

Definition at line 106 of file IROperator.h.

References make_const().

◆ make_const() [12/12]

Expr Halide::Internal::make_const	(	Type	t,
		float16_t	val )

inline

Definition at line 109 of file IROperator.h.

References make_const().

◆ make_signed_integer_overflow()

Expr Halide::Internal::make_signed_integer_overflow ( Type type )

Construct a unique signed_integer_overflow Expr.

Referenced by Halide::Internal::IRMatcher::make_const_special_expr().

◆ is_signed_integer_overflow()

bool Halide::Internal::is_signed_integer_overflow ( const Expr & expr )

Check if an expression is a signed_integer_overflow.

◆ check_representable()

void Halide::Internal::check_representable	(	Type	t,
		int64_t	val )

Check if a constant value can be correctly represented as the given type.

◆ make_bool()

Expr Halide::Internal::make_bool	(	bool	val,
		int	lanes = 1 )

Construct a boolean constant from a C++ boolean value.

May also be a vector if lanes is given. It is not possible to coerce a C++ boolean to Expr because if we provide such a path then char objects can ambiguously be converted to Halide Expr or to std::string. The problem is that C++ does not have a real bool type - it is in fact close enough to char that C++ does not know how to distinguish them. make_bool is the explicit coercion.

◆ make_zero()

Expr Halide::Internal::make_zero ( Type t )

Construct the representation of zero in the given type.

Referenced by Halide::Internal::IRMatcher::NegateOp< A >::make().

◆ make_one()

Expr Halide::Internal::make_one ( Type t )

Construct the representation of one in the given type.

◆ make_two()

Expr Halide::Internal::make_two ( Type t )

Construct the representation of two in the given type.

◆ const_true()

Expr Halide::Internal::const_true ( int lanes = 1 )

Construct the constant boolean true.

May also be a vector of trues, if a lanes argument is given.

◆ const_false()

Expr Halide::Internal::const_false ( int lanes = 1 )

Construct the constant boolean false.

May also be a vector of falses, if a lanes argument is given.

◆ lossless_cast()

Expr Halide::Internal::lossless_cast	(	Type	t,
		Expr	e,
		std::map< Expr, ConstantInterval, ExprCompare > *	cache = nullptr )

Attempt to cast an expression to a smaller type while provably not losing information.

If it can't be done, return an undefined Expr.

Optionally accepts a map that gives the constant bounds of exprs already analyzed to avoid redoing work across many calls to lossless_cast. It is not safe to use this optional map in contexts where the same Expr object may take on a different value. For example: (let x = 4 in some_expr_object) + (let x = 5 in the_same_expr_object). It is safe to use it after uniquify_variable_names has been run.

◆ lossless_negate()

Expr Halide::Internal::lossless_negate ( const Expr & x )

Attempt to negate x without introducing new IR and without overflow.

If it can't be done, return an undefined Expr.

◆ match_types()

void Halide::Internal::match_types	(	Expr &	a,
		Expr &	b )

Coerce the two expressions to have the same type, using C-style casting rules.

For the purposes of casting, a boolean type is UInt(1). We use the following procedure:

If the types already match, do nothing.

Then, if one type is a vector and the other is a scalar, the scalar is broadcast to match the vector width, and we continue.

Then, if one type is floating-point and the other is not, the non-float is cast to the floating-point type, and we're done.

Then, if both types are unsigned ints, the one with fewer bits is cast to match the one with more bits and we're done.

Then, if both types are signed ints, the one with fewer bits is cast to match the one with more bits and we're done.

Finally, if one type is an unsigned int and the other type is a signed int, both are cast to a signed int with the greater of the two bit-widths. For example, matching an Int(8) with a UInt(16) results in an Int(16).
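The procedure above can be sketched on a toy type descriptor. ToyType and toy_match_types are hypothetical names; the sketch returns the common type instead of rewriting two Exprs, and its handling of two floating-point types of different widths (the wider one wins) is an assumption that the list above does not state.

```cpp
#include <algorithm>
#include <cassert>

// Toy stand-in for Halide::Type (hypothetical).
struct ToyType {
    enum Code { Int, UInt, Float } code;
    int bits;
    int lanes;
};

// Returns the type both operands would end up with under the rules above.
ToyType toy_match_types(ToyType a, ToyType b) {
    // 1. Types already match: nothing to do.
    if (a.code == b.code && a.bits == b.bits && a.lanes == b.lanes) return a;
    // 2. Broadcast a scalar to the other operand's vector width.
    if (a.lanes == 1) {
        a.lanes = b.lanes;
    } else if (b.lanes == 1) {
        b.lanes = a.lanes;
    }
    assert(a.lanes == b.lanes);  // mismatched vector widths are out of scope here
    // 3. Float versus non-float: the floating-point type wins.
    if (a.code == ToyType::Float && b.code != ToyType::Float) return a;
    if (b.code == ToyType::Float && a.code != ToyType::Float) return b;
    // 4 and 5. Same kind: the wider type wins
    // (for two floats this is an assumption of the sketch).
    if (a.code == b.code) return a.bits >= b.bits ? a : b;
    // 6. Mixed signedness: a signed int with the greater bit width.
    return ToyType{ToyType::Int, std::max(a.bits, b.bits), a.lanes};
}
```

With these rules, an Int(8) paired with a UInt(16) yields Int(16), matching the example in the last step.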

◆ match_types_bitwise()

void Halide::Internal::match_types_bitwise	(	Expr &	a,
		Expr &	b,
		const char *	op_name )

Asserts that both expressions are integer types and are either both signed or both unsigned.

If one argument is a scalar and the other a vector, the scalar is broadcast to have the same number of lanes as the vector. If one expression is of narrower type than the other, it is widened to the bit width of the wider.

◆ halide_log()

Expr Halide::Internal::halide_log ( const Expr & a )

Halide's vectorizable transcendentals.

◆ halide_exp()

Expr Halide::Internal::halide_exp ( const Expr & a )

◆ halide_erf()

Expr Halide::Internal::halide_erf ( const Expr & a )

◆ raise_to_integer_power()

Expr Halide::Internal::raise_to_integer_power	(	Expr	a,
		int64_t	b )

Raise an expression to an integer power by repeatedly multiplying it by itself.
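A scalar analogue of repeated multiplication, working on double instead of an Expr, might look like the sketch below. The name pow_by_repeated_multiplication is hypothetical, and treating a negative power as the reciprocal of the positive power is an assumption of the sketch, not something this page states.

```cpp
#include <cstdint>

// Hypothetical scalar analogue; not Halide's implementation.
double pow_by_repeated_multiplication(double a, int64_t b) {
    if (b < 0) {
        // Assumption: negative powers are reciprocals.
        // (b == INT64_MIN is not handled, since -b would overflow.)
        return 1.0 / pow_by_repeated_multiplication(a, -b);
    }
    double result = 1.0;
    for (int64_t i = 0; i < b; ++i) {
        result *= a;  // one multiplication per unit of the exponent
    }
    return result;
}
```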

◆ split_into_ands()

void Halide::Internal::split_into_ands	(	const Expr &	cond,
		std::vector< Expr > &	result )

Split a boolean condition into a vector of ANDs.

If 'cond' is undefined, return an empty vector.

◆ strided_ramp_base()

Expr Halide::Internal::strided_ramp_base	(	const Expr &	e,
		int	stride = 1 )

If e is a ramp expression with the given stride (default 1), return its base; otherwise return an undefined Expr.

◆ mod_imp()

template<typename T >

T Halide::Internal::mod_imp	(	T	a,
		T	b )

inline

Implementations of division and mod that are specific to Halide.

Use these implementations; do not use native C division or mod to simplify Halide expressions. Halide division and modulo satisfy the Euclidean definition of division for integers a and b:

when b != 0, (a/b)*b + a%b = a
0 <= a%b < |b|

Additionally, mod by zero returns zero, and div by zero returns zero. This makes mod and div total functions.
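To make the difference from native C++ division concrete, here is a minimal sketch of Euclidean division and modulo on int64_t, including the zero-divisor convention. The names euclid_div and euclid_mod are hypothetical, and the code is written for clarity rather than copied from mod_imp or div_imp.

```cpp
#include <cstdint>

// Hypothetical helpers; extreme values such as INT64_MIN are not handled.
int64_t euclid_mod(int64_t a, int64_t b) {
    if (b == 0) return 0;  // mod by zero is defined as zero
    int64_t r = a % b;     // C++ remainder takes the sign of a
    if (r < 0) r += (b < 0 ? -b : b);
    return r;              // now 0 <= r < |b|
}

int64_t euclid_div(int64_t a, int64_t b) {
    if (b == 0) return 0;  // div by zero is defined as zero
    int64_t q = a / b;     // C++ division truncates toward zero
    if (a % b < 0) q += (b > 0 ? -1 : 1);
    return q;              // q*b + euclid_mod(a, b) == a
}
```

For a = -7 and b = 2, native C++ gives a quotient of -3 and a remainder of -1, whereas the sketch gives -4 and 1, so the remainder stays non-negative.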

Definition at line 252 of file IROperator.h.

References Halide::Type::is_float(), Halide::Type::is_int(), and Halide::type_of().

Referenced by Halide::Internal::Simplify::ExprInfo::cast_to(), Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), Halide::Internal::IRMatcher::constant_fold_bin_op< Mod >(), and Halide::Internal::Simplify::ExprInfo::trim_bounds_using_alignment().

◆ div_imp()

template<typename T >

T Halide::Internal::div_imp	(	T	a,
		T	b )

inline

Definition at line 273 of file IROperator.h.

References Halide::Type::is_float(), Halide::Type::is_int(), and Halide::type_of().

Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Div >(), Halide::Internal::IRMatcher::constant_fold_bin_op< Div >(), and Halide::Internal::IRMatcher::constant_fold_bin_op< Div >().

◆ mod_imp< float >()

template<>

float Halide::Internal::mod_imp< float >	(	float	a,
		float	b )

inline

Definition at line 298 of file IROperator.h.

◆ mod_imp< double >()

template<>

double Halide::Internal::mod_imp< double >	(	double	a,
		double	b )

inline

Definition at line 304 of file IROperator.h.

◆ div_imp< float >()

template<>

float Halide::Internal::div_imp< float >	(	float	a,
		float	b )

inline

Definition at line 310 of file IROperator.h.

◆ div_imp< double >()

template<>

double Halide::Internal::div_imp< double >	(	double	a,
		double	b )

inline

Definition at line 314 of file IROperator.h.

◆ remove_likelies() [1/2]

Expr Halide::Internal::remove_likelies ( const Expr & e )

Return an Expr that is identical to the input Expr, but with all calls to likely() and likely_if_innermost() removed.

◆ remove_likelies() [2/2]

Stmt Halide::Internal::remove_likelies ( const Stmt & s )

Return a Stmt that is identical to the input Stmt, but with all calls to likely() and likely_if_innermost() removed.

◆ remove_promises() [1/2]

Expr Halide::Internal::remove_promises ( const Expr & e )

Return an Expr that is identical to the input Expr, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.

◆ remove_promises() [2/2]

Stmt Halide::Internal::remove_promises ( const Stmt & s )

Return a Stmt that is identical to the input Stmt, but with all calls to promise_clamped() and unsafe_promise_clamped() removed.

◆ unwrap_tags()

Expr Halide::Internal::unwrap_tags ( const Expr & e )

If the expression is a tag helper call, remove it and return the tagged expression.

If not, returns the expression.

◆ collect_print_args() [1/3]

HALIDE_NO_USER_CODE_INLINE void Halide::Internal::collect_print_args ( std::vector< Expr > & args )

inline

Definition at line 348 of file IROperator.h.

Referenced by Halide::Pipeline::add_requirement(), collect_print_args(), collect_print_args(), Halide::print(), Halide::print_when(), and Halide::require().

◆ collect_print_args() [2/3]

template<typename... Args>

HALIDE_NO_USER_CODE_INLINE void Halide::Internal::collect_print_args	(	std::vector< Expr > &	args,
		const char *	arg,
		Args &&...	more_args )

inline

Definition at line 352 of file IROperator.h.

References collect_print_args().

◆ collect_print_args() [3/3]

template<typename... Args>

HALIDE_NO_USER_CODE_INLINE void Halide::Internal::collect_print_args	(	std::vector< Expr > &	args,
		Expr	arg,
		Args &&...	more_args )

inline

Definition at line 358 of file IROperator.h.

References collect_print_args().

◆ requirement_failed_error()

Expr Halide::Internal::requirement_failed_error	(	Expr	condition,
		const std::vector< Expr > &	args )

◆ memoize_tag_helper()

Expr Halide::Internal::memoize_tag_helper	(	Expr	result,
		const std::vector< Expr > &	cache_key_values )

Referenced by Halide::memoize_tag().

◆ reset_random_counters()

void Halide::Internal::reset_random_counters ( )

Reset the counters used for random-number seeds in random_float/int/uint.

(Note that the counters are incremented for each call, even if a seed is passed in.) This is used for multitarget compilation to ensure that each subtarget gets the same sequence of random numbers.

◆ unreachable() [1/2]

Expr Halide::Internal::unreachable ( Type t = Int(32) )

Return an expression that should never be evaluated.

Expressions that depend on unreachable values are also unreachable, and statements that execute unreachable expressions are also considered unreachable.

◆ unreachable() [2/2]

template<typename T >

Expr Halide::Internal::unreachable ( )

inline

Definition at line 1361 of file IROperator.h.

References Halide::type_of(), and unreachable().

Referenced by unreachable().

◆ promise_clamped()

Expr Halide::Internal::promise_clamped	(	const Expr &	value,
		const Expr &	min,
		const Expr &	max )

FOR INTERNAL USE ONLY.

An entirely unchecked version of unsafe_promise_clamped, used inside the compiler as an annotation of the known bounds of an Expr when it has proved something is bounded and wants to record that fact for later passes (notably bounds inference) to exploit. This gets introduced by GuardWithIf tail strategies, because the bounds machinery has a hard time exploiting if statement conditions.

Unlike unsafe_promise_clamped, this expression is context-dependent, because 'value' might be statically bounded at some point in the IR (e.g. due to a containing if statement), but not elsewhere.

This intrinsic always evaluates to its first argument. If this value is used by a side-effecting operation and it is outside the range specified by its second and third arguments, behavior is undefined. The compiler can therefore assume that the value is within the range given and optimize accordingly. Note that this permits promise_clamped to evaluate to something outside of the range, provided that this value is not used.

Note that this produces an intrinsic that is marked as 'pure' and thus is allowed to be hoisted, etc.; thus, extra care must be taken with its use.

◆ operator<<() [8/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		IRNodeType	)

Emit a halide node type on an output stream (such as std::cout) in human-readable form.

◆ operator<<() [9/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const AssociativePattern &	)

Emit a halide associative pattern on an output stream (such as std::cout) in a human-readable form.

◆ operator<<() [10/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const AssociativeOp &	)

Emit a halide associative op on an output stream (such as std::cout) in a human-readable form.

◆ operator<<() [11/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const ForType &	)

Emit a halide for loop type (vectorized, serial, etc) in a human readable form.

◆ operator<<() [12/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const VectorReduce::Operator &	)

Emit a horizontal vector reduction op in human-readable form.

◆ operator<<() [13/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const NameMangling &	)

Emit a halide name mangling value in a human readable format.

◆ operator<<() [14/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const LinkageType &	)

Emit a halide linkage value in a human readable format.

◆ operator<<() [15/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const DimType &	)

Emit a halide dimension type in human-readable format.

◆ operator<<() [16/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	out,
		const Closure &	c )

Emit a Closure in human-readable form.

◆ operator<<() [17/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	out,
		const Interval &	c )

Emit an Interval in human-readable form.

◆ operator<<() [18/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	out,
		const ConstantInterval &	c )

Emit a ConstantInterval in human-readable form.

◆ operator<<() [19/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	out,
		const ModulusRemainder &	c )

Emit a ModulusRemainder in human-readable form.

◆ operator<<() [20/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const Indentation &	)

◆ lldb_string() [1/3]

std::string Halide::Internal::lldb_string ( const Expr & )

Debugging helpers for LLDB.

◆ lldb_string() [2/3]

std::string Halide::Internal::lldb_string ( const Internal::BaseExprNode * )

Debugging helpers for LLDB.

◆ lldb_string() [3/3]

std::string Halide::Internal::lldb_string ( const Stmt & )

Debugging helpers for LLDB.

◆ get_symbol_address()

void * Halide::Internal::get_symbol_address ( const char * s )

◆ lower_lerp()

Expr Halide::Internal::lower_lerp	(	Type	final_type,
		Expr	zero_val,
		Expr	one_val,
		const Expr &	weight,
		const Target &	target )

Build Halide IR that computes a lerp.

Used by codegen targets that don't have a native lerp. The lerp is done in the type of the zero value. The final_type is a cast that should occur after the lerp. It's included because in some cases you can incorporate a final cast into the lerp math.

◆ hoist_loop_invariant_values()

Stmt Halide::Internal::hoist_loop_invariant_values ( Stmt )

Hoist loop-invariants out of inner loops.

This is especially important in cases where LLVM would not do it for us automatically. For example, it hoists loop invariants out of cuda kernels.

◆ hoist_loop_invariant_if_statements()

Stmt Halide::Internal::hoist_loop_invariant_if_statements ( Stmt )

Just hoist loop-invariant if statements as far up as possible.

Does not lift other values. It's useful to run this earlier in lowering to simplify the IR.

◆ iterator_to_pointer()

template<typename T >

auto Halide::Internal::iterator_to_pointer ( T iter ) -> decltype(&*std::declval<T>())

Definition at line 119 of file LLVM_Headers.h.

◆ get_llvm_function_name() [1/2]

std::string Halide::Internal::get_llvm_function_name ( const llvm::Function * f )

inline

Definition at line 123 of file LLVM_Headers.h.

◆ get_llvm_function_name() [2/2]

std::string Halide::Internal::get_llvm_function_name ( const llvm::Function & f )

inline

Definition at line 127 of file LLVM_Headers.h.

◆ get_llvm_struct_type_by_name()

llvm::StructType * Halide::Internal::get_llvm_struct_type_by_name	(	llvm::Module *	module,
		const char *	name )

inline

Definition at line 131 of file LLVM_Headers.h.

◆ get_triple_for_target()

llvm::Triple Halide::Internal::get_triple_for_target ( const Target & target )

Return the llvm::Triple that corresponds to the given Halide Target.

◆ get_initial_module_for_target()

std::unique_ptr< llvm::Module > Halide::Internal::get_initial_module_for_target	(	Target	,
		llvm::LLVMContext *	,
		bool	for_shared_jit_runtime = false,
		bool	just_gpu = false )

Create an llvm module containing the support code for a given target.

◆ get_initial_module_for_ptx_device()

std::unique_ptr< llvm::Module > Halide::Internal::get_initial_module_for_ptx_device	(	Target	,
		llvm::LLVMContext *	c )

Create an llvm module containing the support code for ptx device.

◆ add_bitcode_to_module()

void Halide::Internal::add_bitcode_to_module	(	llvm::LLVMContext *	context,
		llvm::Module &	module,
		const std::vector< uint8_t > &	bitcode,
		const std::string &	name )

Link a block of llvm bitcode into an llvm module.

◆ link_with_wasm_jit_runtime()

std::unique_ptr< llvm::Module > Halide::Internal::link_with_wasm_jit_runtime	(	llvm::LLVMContext *	c,
		const Target &	t,
		std::unique_ptr< llvm::Module >	extra_module )

Take the llvm::Module in extra_module (if any), add the runtime modules needed for the WASM JIT, and link into a single llvm::Module.

◆ loop_carry()

Stmt Halide::Internal::loop_carry	(	Stmt	,
		int	max_carried_values = 8 )

Reuse loads done on previous loop iterations by stashing them in induction variables instead of redoing the load.

If the loads are predicated, the predicates need to match. Can be an optimization or pessimization depending on how good the L1 cache is on the architecture and how many memory issue slots there are. Currently only intended for Hexagon.
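The idea can be shown on an ordinary C++ loop. In the sketch below (function names are hypothetical, and this is a hand-written analogue rather than the IR transformation itself), the second version keeps the value loaded in one iteration in a local variable and reuses it in the next, halving the number of loads from the input.

```cpp
#include <cstddef>
#include <vector>

// Baseline: every iteration loads both in[i] and in[i + 1].
std::vector<int> pair_sums_naive(const std::vector<int> &in) {
    std::vector<int> out;
    for (std::size_t i = 0; i + 1 < in.size(); ++i) {
        out.push_back(in[i] + in[i + 1]);
    }
    return out;
}

// Carried: the value loaded as in[i + 1] is reused as in[i] next iteration.
std::vector<int> pair_sums_carried(const std::vector<int> &in) {
    std::vector<int> out;
    if (in.size() < 2) return out;
    int carried = in[0];  // loaded once before the loop
    for (std::size_t i = 0; i + 1 < in.size(); ++i) {
        int next = in[i + 1];  // the only load inside the loop body
        out.push_back(carried + next);
        carried = next;
    }
    return out;
}
```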

◆ lower()

Module Halide::Internal::lower	(	const std::vector< Function > &	output_funcs,
		const std::string &	pipeline_name,
		const Target &	t,
		const std::vector< Argument > &	args,
		LinkageType	linkage_type,
		const std::vector< Stmt > &	requirements = std::vector< Stmt >(),
		bool	trace_pipeline = false,
		const std::vector< IRMutator * > &	custom_passes = std::vector< IRMutator * >() )

Given a vector of scheduled halide functions, create a Module that evaluates them.

Automatically pulls in all the functions the output functions depend on. Some stages of lowering may be target-specific. The Module may contain submodules for computation offloaded to another execution engine or API as well as buffers that are used in the passed in Stmt.

◆ lower_main_stmt()

Stmt Halide::Internal::lower_main_stmt	(	const std::vector< Function > &	output_funcs,
		const std::string &	pipeline_name,
		const Target &	t,
		const std::vector< Stmt > &	requirements = std::vector< Stmt >(),
		bool	trace_pipeline = false,
		const std::vector< IRMutator * > &	custom_passes = std::vector< IRMutator * >() )

Given a vector of scheduled halide functions, create a statement that evaluates them.

Automatically pulls in all the functions the output functions depend on. Some stages of lowering may be target-specific. Mostly used as a convenience function in tests that wish to assert some property of the lowered IR.

◆ lower_test()

void Halide::Internal::lower_test ( )

◆ lower_parallel_tasks()

Stmt Halide::Internal::lower_parallel_tasks	(	const Stmt &	s,
		std::vector< LoweredFunc > &	closure_implementations,
		const std::string &	name,
		const Target &	t )

◆ lower_warp_shuffles()

Stmt Halide::Internal::lower_warp_shuffles	(	Stmt	s,
		const Target &	t )

Rewrite access to things stored outside the loop over GPU lanes to use nvidia's warp shuffle instructions.

◆ inject_memoization()

Stmt Halide::Internal::inject_memoization	(	const Stmt &	s,
		const std::map< std::string, Function > &	env,
		const std::string &	name,
		const std::vector< Function > &	outputs )

Transform pipeline calls for Funcs scheduled with memoize to do a lookup call to the runtime cache implementation, and if there is a miss, compute the results and call the runtime to store it back to the cache.

Should leave non-memoized Funcs unchanged.
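The lookup, compute-on-miss, store-back pattern described above can be sketched with an in-memory map. ToyCache and get_or_compute are hypothetical names; the real pass emits calls into the Halide runtime cache, which this sketch does not model.

```cpp
#include <functional>
#include <map>

// Hypothetical cache keyed on an int, standing in for the runtime cache.
struct ToyCache {
    std::map<int, int> entries;
    int misses = 0;

    int get_or_compute(int key, const std::function<int(int)> &compute) {
        auto it = entries.find(key);
        if (it != entries.end()) {
            return it->second;  // hit: reuse the stored result
        }
        ++misses;               // miss: compute, then store back
        int value = compute(key);
        entries[key] = value;
        return value;
    }
};
```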

◆ rewrite_memoized_allocations()

Stmt Halide::Internal::rewrite_memoized_allocations	(	const Stmt &	s,
		const std::map< std::string, Function > &	env )

This should be called after Storage Flattening has added Allocation IR nodes.

It connects the memoization cache lookups to the Allocations so they point to the buffers from the memoization cache and those buffers are released when no longer used. Should not affect allocations for non-memoized Funcs.

◆ get_output_info()

std::map< OutputFileType, const OutputInfo > Halide::Internal::get_output_info ( const Target & target )

Referenced by Halide::SimdOpCheckTest::compile_and_check().

◆ operator+() [3/4]

ModulusRemainder Halide::Internal::operator+	(	const ModulusRemainder &	a,
		const ModulusRemainder &	b )

◆ operator-() [3/4]

ModulusRemainder Halide::Internal::operator-	(	const ModulusRemainder &	a,
		const ModulusRemainder &	b )

◆ operator*() [3/4]

ModulusRemainder Halide::Internal::operator*	(	const ModulusRemainder &	a,
		const ModulusRemainder &	b )

◆ operator/() [3/4]

ModulusRemainder Halide::Internal::operator/	(	const ModulusRemainder &	a,
		const ModulusRemainder &	b )

◆ operator%() [3/4]

ModulusRemainder Halide::Internal::operator%	(	const ModulusRemainder &	a,
		const ModulusRemainder &	b )

◆ operator+() [4/4]

ModulusRemainder Halide::Internal::operator+	(	const ModulusRemainder &	a,
		int64_t	b )

◆ operator-() [4/4]

ModulusRemainder Halide::Internal::operator-	(	const ModulusRemainder &	a,
		int64_t	b )

◆ operator*() [4/4]

ModulusRemainder Halide::Internal::operator*	(	const ModulusRemainder &	a,
		int64_t	b )

◆ operator/() [4/4]

ModulusRemainder Halide::Internal::operator/	(	const ModulusRemainder &	a,
		int64_t	b )

◆ operator%() [4/4]

ModulusRemainder Halide::Internal::operator%	(	const ModulusRemainder &	a,
		int64_t	b )

◆ modulus_remainder() [1/2]

ModulusRemainder Halide::Internal::modulus_remainder ( const Expr & e )

For things like alignment analysis, often it's helpful to know if an integer expression is some multiple of a constant plus some other constant.

For example, it is straightforward to deduce that ((10*x + 2)*(6*y - 3) - 1) is congruent to five modulo six.

We get the most information when the modulus is large. E.g. if something is congruent to 208 modulo 384, then we also know it's congruent to 0 mod 8, and we can possibly use it as an index for an aligned load. If all else fails, we can just say that an integer is congruent to zero modulo one.
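The deduction in the example can be reproduced with a toy version of the analysis. ToyModRem and its helpers are hypothetical names, the combination rules are a simplified sketch rather than Halide's operators, and overflow is ignored.

```cpp
#include <cstdint>

// Toy stand-in for ModulusRemainder: the value equals modulus*k + remainder
// for some integer k. A modulus of 0 means an exactly known constant.
struct ToyModRem {
    int64_t modulus;
    int64_t remainder;
};

int64_t toy_gcd(int64_t a, int64_t b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Reduce r into [0, m); with m == 0 the value is exact, so keep r unchanged.
int64_t wrap(int64_t r, int64_t m) {
    if (m == 0) return r;
    r %= m;
    return r < 0 ? r + m : r;
}

ToyModRem constant(int64_t c) { return {0, c}; }
ToyModRem unknown() { return {1, 0}; }  // any integer is 0 modulo 1

ToyModRem operator+(ToyModRem a, ToyModRem b) {
    int64_t m = toy_gcd(a.modulus, b.modulus);
    return {m, wrap(a.remainder + b.remainder, m)};
}

ToyModRem operator*(ToyModRem a, ToyModRem b) {
    // (m1*j + r1)*(m2*k + r2) = m1*m2*j*k + m1*r2*j + m2*r1*k + r1*r2,
    // so every term except r1*r2 is a multiple of the gcd computed below.
    int64_t m = toy_gcd(toy_gcd(a.modulus * b.modulus, a.modulus * b.remainder),
                        b.modulus * a.remainder);
    return {m, wrap(a.remainder * b.remainder, m)};
}
```

Evaluating `(constant(10) * unknown() + constant(2)) * (constant(6) * unknown() + constant(-3)) + constant(-1)` in this sketch yields modulus 6 and remainder 5, in agreement with the example.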

◆ modulus_remainder() [2/2]

ModulusRemainder Halide::Internal::modulus_remainder	(	const Expr &	e,
		const Scope< ModulusRemainder > &	scope )

If we have alignment information about external variables, we can let the analysis know about that using this version of modulus_remainder.

◆ reduce_expr_modulo() [1/2]

HALIDE_MUST_USE_RESULT bool Halide::Internal::reduce_expr_modulo	(	const Expr &	e,
		int64_t	modulus,
		int64_t *	remainder )

Reduce an expression modulo some integer.

Returns true and assigns to remainder if an answer could be found.

◆ reduce_expr_modulo() [2/2]

HALIDE_MUST_USE_RESULT bool Halide::Internal::reduce_expr_modulo	(	const Expr &	e,
		int64_t	modulus,
		int64_t *	remainder,
		const Scope< ModulusRemainder > &	scope )

Reduce an expression modulo some integer.

Returns true and assigns to remainder if an answer could be found.

◆ modulus_remainder_test()

void Halide::Internal::modulus_remainder_test ( )

◆ gcd()

int64_t Halide::Internal::gcd	(	int64_t	,
		int64_t	)

The greatest common divisor of two integers.

Referenced by Halide::Internal::Autoscheduler::OptionalRational::operator+=().

◆ lcm()

int64_t Halide::Internal::lcm	(	int64_t	,
		int64_t	)

The least common multiple of two integers.

Referenced by Halide::Internal::Autoscheduler::OptionalRational::operator+=().

◆ derivative_bounds()

ConstantInterval Halide::Internal::derivative_bounds	(	const Expr &	e,
		const std::string &	var,
		const Scope< ConstantInterval > &	scope = Scope< ConstantInterval >::empty_scope() )

Find the bounds of the derivative of an expression.

The scope gives the bounds on the derivatives of any variables found.

◆ is_monotonic() [1/2]

Monotonic Halide::Internal::is_monotonic	(	const Expr &	e,
		const std::string &	var,
		const Scope< ConstantInterval > &	scope = Scope< ConstantInterval >::empty_scope() )

◆ is_monotonic() [2/2]

Monotonic Halide::Internal::is_monotonic	(	const Expr &	e,
		const std::string &	var,
		const Scope< Monotonic > &	scope )

◆ operator<<() [21/22]

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const Monotonic &	m )

Emit the monotonic class in human-readable form for debugging.

◆ is_monotonic_test()

void Halide::Internal::is_monotonic_test ( )

◆ inject_gpu_offload()

Stmt Halide::Internal::inject_gpu_offload	(	const Stmt &	s,
		const Target &	host_target )

Pull loops marked with GPU device APIs to a separate module, and call them through the appropriate host runtime module.

◆ optimize_shuffles()

Stmt Halide::Internal::optimize_shuffles	(	Stmt	s,
		int	lut_alignment )

◆ can_parallelize_rvar()

bool Halide::Internal::can_parallelize_rvar	(	const std::string &	rvar,
		const std::string &	func,
		const Definition &	r )

Returns whether or not Halide can prove that it is safe to parallelize an update definition across a specific variable.

If this returns true, it's definitely safe. If this returns false, it may still be safe, but Halide couldn't prove it.

◆ check_call_arg_types()

void Halide::Internal::check_call_arg_types	(	const std::string &	name,
		std::vector< Expr > *	args,
		int	dims )

Validate arguments to a call to a func, image or imageparam.

◆ has_uncaptured_likely_tag()

bool Halide::Internal::has_uncaptured_likely_tag	(	const Expr &	e,
		const Scope<> &	scope )

Return true if an expression uses a likely tag that isn't captured by an enclosing Select, Min, or Max.

The scope contains all vars that should be considered to have uncaptured likelies.

◆ has_likely_tag()

bool Halide::Internal::has_likely_tag	(	const Expr &	e,
		const Scope<> &	scope )

Return true if an expression uses a likely tag.

The scope contains all vars in scope that should be considered to have likely tags.

◆ partition_loops()

Stmt Halide::Internal::partition_loops ( Stmt s )

Partitions loop bodies into a prologue, a steady state, and an epilogue.

Finds the steady state by hunting for use of clamped ramps, or the 'likely' intrinsic.
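As a rough illustration of the idea, here is a hand-partitioned loop in plain C++. It is a sketch only: the functions blur_clamped and blur_partitioned are hypothetical, and the real pass operates on Halide IR rather than on C++ loops.

```cpp
#include <algorithm>
#include <vector>

// Reference version: every iteration pays for the clamps.
std::vector<int> blur_clamped(const std::vector<int> &in) {
    const int n = static_cast<int>(in.size());
    std::vector<int> out(n);
    for (int x = 0; x < n; x++) {
        out[x] = in[std::max(x - 1, 0)] + in[std::min(x + 1, n - 1)];
    }
    return out;
}

// Partitioned version: the clamps are provably no-ops for x in [1, n - 2],
// so only the prologue and epilogue keep them.
std::vector<int> blur_partitioned(const std::vector<int> &in) {
    const int n = static_cast<int>(in.size());
    std::vector<int> out(n);
    const int steady_begin = std::min(1, n);
    const int steady_end = std::max(steady_begin, n - 1);
    // Prologue: iterations where the lower clamp may fire.
    for (int x = 0; x < steady_begin; x++) {
        out[x] = in[std::max(x - 1, 0)] + in[std::min(x + 1, n - 1)];
    }
    // Steady state: no clamps needed.
    for (int x = steady_begin; x < steady_end; x++) {
        out[x] = in[x - 1] + in[x + 1];
    }
    // Epilogue: iterations where the upper clamp may fire.
    for (int x = steady_end; x < n; x++) {
        out[x] = in[std::max(x - 1, 0)] + in[std::min(x + 1, n - 1)];
    }
    return out;
}
```

Both functions produce the same output; the partitioned form simply moves the boundary handling out of the hot loop.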

◆ inject_placeholder_prefetch()

Stmt Halide::Internal::inject_placeholder_prefetch	(	const Stmt &	s,
		const std::map< std::string, Function > &	env,
		const std::string &	prefix,
		const std::vector< PrefetchDirective > &	prefetches )

Inject placeholder prefetches to 's'.

This placeholder prefetch does not yet have an explicit region to be prefetched. The region will be computed during the call to inject_prefetch.

◆ inject_prefetch()

Stmt Halide::Internal::inject_prefetch	(	const Stmt &	s,
		const std::map< std::string, Function > &	env )

Compute the actual region to be prefetched and place it in the placeholder prefetch.

Wrap the prefetch call in a condition when applicable.

◆ reduce_prefetch_dimension()

Stmt Halide::Internal::reduce_prefetch_dimension	(	Stmt	stmt,
		const Target &	t )

Reduce a multi-dimensional prefetch into a prefetch of lower dimension (max dimension of the prefetch is specified by target architecture).

This keeps the 'max_dim' innermost dimensions and adds loops for the rest of the dimensions. If a maximum prefetched-byte-size is specified (depending on the architecture), this also adds outer loops that tile the prefetches.

◆ hoist_prefetches()

Stmt Halide::Internal::hoist_prefetches ( const Stmt & s )

Hoist all the prefetches in a Block to the beginning of the Block.

This generally only happens when a loop with prefetches is unrolled; in some cases, LLVM's code generation can be suboptimal (unnecessary register spills) when prefetches are scattered through the loop. Hoisting to the top of the loop is a good way to mitigate this, at the cost of the prefetch calls possibly being less useful due to distance from use point. (This is a bit experimental and may need revisiting.) See also https://bugs.llvm.org/show_bug.cgi?id=51172

◆ print_loop_nest()

std::string Halide::Internal::print_loop_nest ( const std::vector< Function > & output_funcs )

Emit some simple pseudocode that shows the structure of the loop nest specified by this pipeline's schedule, and the schedules of the functions it uses.

◆ inject_profiling()

Stmt Halide::Internal::inject_profiling	(	const Stmt &	,
		const std::string &	,
		const std::map< std::string, Function > &	env )

Take a statement representing a Halide pipeline and insert high-resolution timing into the generated code (via spawning a thread that acts as a sampling profiler); summaries of execution times and counts will be logged at the end.

Should be done before storage flattening, but after all bounds inference.

◆ purify_index_math()

Expr Halide::Internal::purify_index_math ( const Expr & )

Bounds inference and related stages can lift integer bounds expressions out of if statements that guard against those integer expressions doing side-effecty things like dividing or modding by zero.

In those cases, if the lowering passes are functional, the value resulting from the division or mod is evaluated but not used. This mutator rewrites divs and mods in such expressions to fail silently (evaluate to undef) when the denominator is zero.

◆ qualify()

Expr Halide::Internal::qualify	(	const std::string &	prefix,
		const Expr &	value )

Prefix all variable names in the given expression with the prefix string.

◆ random_float()

Expr Halide::Internal::random_float ( const std::vector< Expr > & )

Return a random floating-point number between zero and one that varies deterministically based on the input expressions.

◆ random_int()

Expr Halide::Internal::random_int ( const std::vector< Expr > & )

Return a random unsigned integer between zero and 2^32-1 that varies deterministically based on the input expressions (which must be integers or unsigned integers).

◆ lower_random()

Expr Halide::Internal::lower_random	(	const Expr &	e,
		const std::vector< VarOrRVar > &	free_vars,
		int	tag )

Convert calls to random() to IR generated by random_float and random_int.

Tags all calls with the variables in free_vars, and the integer given as the last argument.

◆ realization_order()

std::pair< std::vector< std::string >, std::vector< std::vector< std::string > > > Halide::Internal::realization_order	(	const std::vector< Function > &	outputs,
		std::map< std::string, Function > &	env )

Given a bunch of functions that call each other, determine an order in which to do the scheduling.

This in turn influences the order in which stages are computed when there's no strict dependency between them. Currently just some arbitrary depth-first traversal of the call graph. In addition, determine grouping of functions with fused computation loops. The functions within the fused groups are sorted based on realization order. There should not be any dependencies among functions within a fused group. This pass will also populate the 'fused_pairs' list in the function's schedule. Return a pair of the realization order and the fused groups in that order.

◆ topological_order()

std::vector< std::string > Halide::Internal::topological_order	(	const std::vector< Function > &	outputs,
		const std::map< std::string, Function > &	env )

Given a bunch of functions that call each other, determine a topological order which stays constant regardless of the schedule.

This ordering adheres to the producer-consumer dependencies, i.e. a producer will come before its consumers in that order.
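A producer-before-consumer ordering of this kind can be obtained with a post-order depth-first traversal. The sketch below is an illustration under stated assumptions, not Halide's code: CallGraph, visit, producer_first_order and example_order are hypothetical names, functions are represented only by their names, and the call graph is assumed to be acyclic.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: maps each function name to the functions it calls
// (its producers).
using CallGraph = std::map<std::string, std::vector<std::string>>;

void visit(const std::string &f, const CallGraph &calls,
           std::set<std::string> &done, std::vector<std::string> &order) {
    if (!done.insert(f).second) return;  // already visited
    auto it = calls.find(f);
    if (it != calls.end()) {
        for (const std::string &producer : it->second) {
            visit(producer, calls, done, order);
        }
    }
    order.push_back(f);  // post-order: all producers of f are already in `order`
}

std::vector<std::string> producer_first_order(const std::vector<std::string> &outputs,
                                              const CallGraph &calls) {
    std::set<std::string> done;
    std::vector<std::string> order;
    for (const std::string &out : outputs) {
        visit(out, calls, done, order);
    }
    return order;
}

// Hypothetical pipeline: out calls blur_y, which calls blur_x, which calls input.
std::vector<std::string> example_order() {
    CallGraph calls;
    calls["out"].push_back("blur_y");
    calls["blur_y"].push_back("blur_x");
    calls["blur_x"].push_back("input");
    std::vector<std::string> outputs(1, "out");
    return producer_first_order(outputs, calls);
}
```

Because the traversal depends only on the call graph, the resulting order does not change when the schedule changes.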

◆ rebase_loops_to_zero()

Stmt Halide::Internal::rebase_loops_to_zero ( const Stmt & )

Rewrite the mins of most loops to 0.

◆ split_predicate_test()

void Halide::Internal::split_predicate_test ( )

◆ is_func_trivial_to_inline()

bool Halide::Internal::is_func_trivial_to_inline ( const Function & func )

Return true if the cost of inlining a function is equivalent to the cost of calling the function directly.

◆ remove_dead_allocations()

Stmt Halide::Internal::remove_dead_allocations ( const Stmt & s )

Find Allocate/Free pairs that are never loaded from or stored to, and remove them from the Stmt.

This doesn't touch Realize/Call nodes and so must be called after storage_flattening.

◆ remove_extern_loops()

Stmt Halide::Internal::remove_extern_loops ( const Stmt & s )

Removes placeholder loops for extern stages.

◆ remove_undef()

Stmt Halide::Internal::remove_undef ( Stmt s )

Removes stores that depend on undef values, and statements that only contain such stores.

◆ schedule_functions()

Stmt Halide::Internal::schedule_functions	(	const std::vector< Function > &	outputs,
		const std::vector< std::vector< std::string > > &	fused_groups,
		const std::map< std::string, Function > &	env,
		const Target &	target,
		bool &	any_memoized )

Build loop nests and inject Function realizations at the appropriate places using the schedule.

Returns a flag indicating whether memoization passes need to be run.

◆ operator<<() [22/22]

template<typename T >

std::ostream & Halide::Internal::operator<<	(	std::ostream &	stream,
		const Scope< T > &	s )

Definition at line 307 of file Scope.h.

References Halide::Internal::Scope< T >::cbegin(), Halide::Internal::Scope< T >::cend(), and Halide::Internal::Scope< T >::const_iterator::name().

◆ select_gpu_api()

Stmt Halide::Internal::select_gpu_api	(	const Stmt &	s,
		const Target &	t )

Replace for loops with GPU_Default device_api with an actual device API depending on what's enabled in the target.

Choose the first of the following: opencl, cuda

◆ simplify() [1/2]

Stmt Halide::Internal::simplify	(	const Stmt &	,
		bool	remove_dead_code = true,
		const Scope< Interval > &	bounds = Scope< Interval >::empty_scope(),
		const Scope< ModulusRemainder > &	alignment = Scope< ModulusRemainder >::empty_scope(),
		const std::vector< Expr > &	assumptions = std::vector< Expr >() )

Perform a wide range of simplifications to expressions and statements, including constant folding, substituting in trivial values, arithmetic rearranging, etc.

Simplifies across let statements, so must not be called on stmts with dangling or repeated variable names. Can optionally be passed known bounds of any variables, known alignment properties, and any other Exprs that should be assumed to be true.

◆ simplify() [2/2]

Expr Halide::Internal::simplify	(	const Expr &	,
		bool	remove_dead_code = true,
		const Scope< Interval > &	bounds = Scope< Interval >::empty_scope(),
		const Scope< ModulusRemainder > &	alignment = Scope< ModulusRemainder >::empty_scope(),
		const std::vector< Expr > &	assumptions = std::vector< Expr >() )

◆ can_prove()

bool Halide::Internal::can_prove	(	Expr	e,
		const Scope< Interval > &	bounds = Scope< Interval >::empty_scope() )

Attempt to statically prove an expression is true using the simplifier.

◆ simplify_exprs()

Stmt Halide::Internal::simplify_exprs ( const Stmt & )

Simplify expressions found in a statement, but don't simplify across different statements.

This is safe to perform at an earlier stage in lowering than full simplification of a stmt.

◆ simplify_correlated_differences()

Stmt Halide::Internal::simplify_correlated_differences ( const Stmt & )

Symbolic interval arithmetic can be extremely conservative in cases where we analyze the difference between two correlated expressions.

For example, consider:

for x in [0, 10]:
  let y = x + 3
  let z = y - x

x lies within [0, 10]. Interval arithmetic will correctly determine that y lies within [3, 13]. When z is encountered, it is treated as a difference of two independent variables, and gives [3 - 10, 13 - 0] = [-7, 13] instead of the tighter interval [3, 3]. It doesn't understand that y and x are correlated.
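The loss of precision described above can be reproduced in a few lines. This is a minimal sketch, assuming a simple closed-interval type and an affine form coeff * x + offset; the names Interval, Affine, naive_z and correlated_z are hypothetical and are not part of Halide.

```cpp
#include <algorithm>
#include <cstdint>

struct Interval { int64_t lo, hi; };

Interval add(Interval a, Interval b) { return {a.lo + b.lo, a.hi + b.hi}; }

// Interval subtraction treats its operands as independent.
Interval sub(Interval a, Interval b) { return {a.lo - b.hi, a.hi - b.lo}; }

// Naive analysis: z = y - x, with y and x bounded separately.
Interval naive_z() {
    Interval x = {0, 10};
    Interval three = {3, 3};
    Interval y = add(x, three);  // [3, 13]
    return sub(y, x);            // [3 - 10, 13 - 0] = [-7, 13]
}

// Hypothetical affine form in the loop variable: coeff * x + offset.
struct Affine { int64_t coeff, offset; };

Affine sub(Affine a, Affine b) { return {a.coeff - b.coeff, a.offset - b.offset}; }

Interval bounds(Affine e, Interval x) {
    int64_t at_lo = e.coeff * x.lo + e.offset;
    int64_t at_hi = e.coeff * x.hi + e.offset;
    return {std::min(at_lo, at_hi), std::max(at_lo, at_hi)};
}

// Substituting y = x + 3 first lets the dependence on x cancel.
Interval correlated_z() {
    Interval x_range = {0, 10};
    Affine x = {1, 0};
    Affine y = {1, 3};
    return bounds(sub(y, x), x_range);  // 0 * x + 3, so [3, 3]
}
```

naive_z() returns [-7, 13], while correlated_z() returns the tight interval [3, 3] because the x terms cancel before any bounds are taken.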

In practice, this causes problems for unrolling, and arbitrarily bad overconservative behavior in bounds inference (e.g. https://github.com/halide/Halide/issues/3697 )

The function below attempts to address this by walking the IR, remembering whether each let variable is monotonic increasing, decreasing, unknown, or constant w.r.t each loop var. When it encounters a subtract node where both sides have the same monotonicity it substitutes, solves, and attempts to generally simplify as aggressively as possible to try to cancel out the repeated dependence on the loop var. The same is done for addition nodes with arguments of opposite monotonicity.

Bounds inference is particularly sensitive to these false dependencies, but removing false dependencies also helps other lowering passes. E.g. if this simplification means a value no longer depends on a loop variable, it can remain scalar during vectorization of that loop, or we can lift it out as a loop invariant, or it might avoid some of the complex paths in GPU codegen that trigger when values depend on the block index (e.g. warp shuffles).

This pass is safe to use on code with repeated instances of the same variable name (it must be, because we want to run it before allocation bounds inference).

◆ bound_correlated_differences()

Expr Halide::Internal::bound_correlated_differences ( const Expr & expr )

Refactor the expression to remove correlated differences or rewrite them in a form that is more amenable to bounds inference.

Performs a subset of what simplify_correlated_differences does. Can increase Expr size (i.e. does not follow the simplifier's reduction order).

◆ simplify_specializations()

void Halide::Internal::simplify_specializations ( std::map< std::string, Function > & env )

Try to simplify the RHS/LHS of a function's definition based on its specializations.

◆ skip_stages()

Stmt Halide::Internal::skip_stages	(	const Stmt &	s,
		const std::vector< Function > &	outputs,
		const std::vector< std::vector< std::string > > &	order,
		const std::map< std::string, Function > &	env )

Avoid computing certain stages if we can infer a runtime condition to check that tells us they won't be used.

Does this by analyzing all reads of each buffer allocated, and inferring some condition that tells us if the reads occur. If the condition is non-trivial, inject ifs that guard the production.

◆ sliding_window()

Stmt Halide::Internal::sliding_window	(	const Stmt &	s,
		const std::map< std::string, Function > &	env )

Perform sliding window optimizations on a halide statement.

I.e. don't bother computing points in a function that have provably already been computed by a previous iteration.

◆ solve_expression()

SolverResult Halide::Internal::solve_expression	(	const Expr &	e,
		const std::string &	variable,
		const Scope< Expr > &	scope = Scope< Expr >::empty_scope() )

Attempts to collect all instances of a variable in an expression tree and place it as far to the left as possible, and as far up the tree as possible (i.e. outside most parentheses).

If the expression is an equality or comparison, this 'solves' the equation. Returns a pair of Expr and bool. The Expr is the mutated expression, and the bool indicates whether there is a single instance of the variable in the result. If it is false, the expression has only been partially solved, and there are still multiple instances of the variable.

◆ solve_for_outer_interval()

Interval Halide::Internal::solve_for_outer_interval	(	const Expr &	c,
		const std::string &	variable )

Find the smallest interval such that the condition is either true or false inside of it, but definitely false outside of it.

Never returns undefined Exprs, instead it uses variables called "pos_inf" and "neg_inf" to represent positive and negative infinity.

◆ solve_for_inner_interval()

Interval Halide::Internal::solve_for_inner_interval	(	const Expr &	c,
		const std::string &	variable )

Find the largest interval such that the condition is definitely true inside of it, and might be true or false outside of it.

◆ and_condition_over_domain()

Expr Halide::Internal::and_condition_over_domain	(	const Expr &	c,
		const Scope< Interval > &	varying )

Take a conditional that includes variables that vary over some domain, and convert it to a more conservative (less frequently true) condition that doesn't depend on those variables.

Formally, the output expr implies the input expr.

The condition may be a vector condition, in which case we also 'and' over the vector lanes, and return a scalar result.

◆ solve_test()

void Halide::Internal::solve_test ( )

◆ spirv_ir_test()

void Halide::Internal::spirv_ir_test ( )

Internal test for SPIR-V IR.

◆ split_tuples()

Stmt Halide::Internal::split_tuples	(	const Stmt &	s,
		const std::map< std::string, Function > &	env )

Rewrite all tuple-valued Realizations, Provide nodes, and Call nodes into several scalar-valued ones, so that later lowering passes only need to think about scalar-valued productions.

◆ stage_strided_loads()

Stmt Halide::Internal::stage_strided_loads ( const Stmt & s )

Convert all unpredicated strided loads in a Stmt into dense loads followed by shuffles.

For a stride of two, the trick is to do a dense load of twice the size, and then extract either the even or odd lanes. This was previously done in codegen, where it was challenging, because it's not easy to know there if it's safe to do the double-sized load, as it either loads one element beyond or before the original load. We used the alignment of the ramp base to try to tell if it was safe to shift backwards, and we added padding to internal allocations so that for those at least it was safe to shift forwards. Unfortunately the alignment of the ramp base is usually unknown if you don't know anything about the strides of the input, and adding padding to allocations was a serious wart in our memory allocators.

This pass instead actively looks for evidence elsewhere in the Stmt (at some location which definitely executes whenever the load being transformed executes) that it's safe to read further forwards or backwards in memory. The evidence is in the form of a load at the same base address with a different constant offset. It also clusters groups of these loads so that they do the same dense load and extract the appropriate slice of lanes. If it fails to find any evidence, for loads from external buffers it does two overlapping half-sized dense loads and shuffles out the desired lanes, and for loads from internal allocations it adds padding to the allocation explicitly, by setting the padding field on Allocate nodes.
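The stride-two trick can be illustrated on ordinary arrays. The sketch below is not the pass itself: strided_load and dense_load_then_shuffle are hypothetical helpers working on a std::vector, and the caller is assumed to already know that the extra element read by the dense load is in bounds.

```cpp
#include <cstddef>
#include <vector>

// Directly gather `lanes` elements starting at `base` with a stride of two.
std::vector<int> strided_load(const std::vector<int> &buf, std::size_t base, std::size_t lanes) {
    std::vector<int> out(lanes);
    for (std::size_t i = 0; i < lanes; i++) {
        out[i] = buf[base + 2 * i];
    }
    return out;
}

// Dense load of 2 * lanes contiguous elements, then keep the even lanes.
// This reads buf[base + 2 * lanes - 1], one element beyond the last one the
// strided load touches, so it is only valid when that element exists.
std::vector<int> dense_load_then_shuffle(const std::vector<int> &buf, std::size_t base, std::size_t lanes) {
    std::vector<int> dense(buf.begin() + static_cast<std::ptrdiff_t>(base),
                           buf.begin() + static_cast<std::ptrdiff_t>(base + 2 * lanes));
    std::vector<int> out(lanes);
    for (std::size_t i = 0; i < lanes; i++) {
        out[i] = dense[2 * i];  // even lanes of the dense vector
    }
    return out;
}
```

For an 8-element buffer, base 0 and 4 lanes, both functions return elements 0, 2, 4 and 6, but the dense version also reads element 7. That extra read is why the pass looks for evidence that reading further is safe.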

◆ print_to_stmt_html()

void Halide::Internal::print_to_stmt_html	(	const std::string &	html_output_filename,
		const Module &	m,
		const std::string &	assembly_input_filename = "" )

Dump an HTML-formatted visualization of a Module to filename.

If assembly_input_filename is not empty, it is expected to be the path to assembly output. If empty, the code will attempt to find such a file based on html_output_filename (replacing ".stmt.html" with ".s"), and will assert-fail if no such file is found.

◆ print_to_conceptual_stmt_html()

void Halide::Internal::print_to_conceptual_stmt_html	(	const std::string &	html_output_filename,
		const Module &	m,
		const std::string &	assembly_input_filename = "" )

Dump an HTML-formatted visualization of a Module's conceptual Stmt code to filename.

If assembly_input_filename is not empty, it is expected to be the path to assembly output. If empty, the code will attempt to find such a file based on html_output_filename (replacing ".stmt.html" with ".s"), and will assert-fail if no such file is found.

◆ storage_flattening()

Stmt Halide::Internal::storage_flattening	(	Stmt	s,
		const std::vector< Function > &	outputs,
		const std::map< std::string, Function > &	env,
		const Target &	target )

Take a statement with multi-dimensional Realize, Provide, and Call nodes, and turn it into a statement with single-dimensional Allocate, Store, and Load nodes respectively.

◆ storage_folding()

Stmt Halide::Internal::storage_folding	(	const Stmt &	s,
		const std::map< std::string, Function > &	env )

Fold storage of functions if possible.

This means reducing one of the dimensions modulo something for the purpose of storage, if we can prove that this is safe to do. E.g. consider:

f(x) = ...
g(x) = f(x-1) + f(x)
f.store_root().compute_at(g, x);

We can store f as a circular buffer of size two, instead of allocating space for all of it.
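In plain C++ terms, the folded storage might look roughly like the following. This is a sketch under stated assumptions: the producer f(x) = x * x is a made-up stand-in for the elided definition, g is evaluated over [0, n), and g_folded and g_reference are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical producer standing in for the elided definition of f.
int f(int x) { return x * x; }

// Reference: evaluate f directly at both points for every x.
std::vector<int> g_reference(int n) {
    assert(n >= 0);
    std::vector<int> g(n);
    for (int x = 0; x < n; x++) {
        g[x] = f(x - 1) + f(x);
    }
    return g;
}

// Folded: f lives in a circular buffer of size two, indexed by x mod 2.
std::vector<int> g_folded(int n) {
    assert(n >= 0);
    int f_storage[2];
    std::vector<int> g(n);
    f_storage[1] = f(-1);  // warm-up: -1 mod 2 == 1
    for (int x = 0; x < n; x++) {
        f_storage[x % 2] = f(x);  // overwrites f(x - 2), which is no longer needed
        // (x - 1) mod 2 equals (x + 1) mod 2, which avoids a negative operand.
        g[x] = f_storage[(x + 1) % 2] + f_storage[x % 2];
    }
    return g;
}
```

Each iteration computes one new value of f and reuses the previous one, so two slots are enough no matter how large n is.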

◆ strictify_float()

bool Halide::Internal::strictify_float	(	std::map< std::string, Function > &	env,
		const Target &	t )

Propagate strict_float intrinsics such that they immediately wrap all floating-point expressions.

This makes the IR nodes context independent. If the Target::StrictFloat flag is specified in target, starts in strict_float mode so all floating-point type Exprs in the compilation will be marked with strict_float. Returns whether any strict floating-point is used in any function in the passed in env.

◆ strip_asserts()

Stmt Halide::Internal::strip_asserts ( const Stmt & s )

◆ substitute() [1/6]

Expr Halide::Internal::substitute	(	const std::string &	name,
		const Expr &	replacement,
		const Expr &	expr )

Substitute variables with the given name with the replacement expression within expr.

This is a dangerous thing to do if variable names have not been uniquified. While it won't traverse inside let statements with the same name as the first argument, moving a piece of syntax around can change its meaning, because it can cross lets that redefine variable names that it includes references to.

◆ substitute() [2/6]

Stmt Halide::Internal::substitute	(	const std::string &	name,
		const Expr &	replacement,
		const Stmt &	stmt )

Substitute variables with the given name with the replacement expression within stmt.

◆ substitute() [3/6]

Expr Halide::Internal::substitute	(	const std::map< std::string, Expr > &	replacements,
		const Expr &	expr )

Substitute variables with names in the map.

◆ substitute() [4/6]

Stmt Halide::Internal::substitute	(	const std::map< std::string, Expr > &	replacements,
		const Stmt &	stmt )

◆ substitute() [5/6]

Expr Halide::Internal::substitute	(	const Expr &	find,
		const Expr &	replacement,
		const Expr &	expr )

Substitute expressions for other expressions.

◆ substitute() [6/6]

Stmt Halide::Internal::substitute	(	const Expr &	find,
		const Expr &	replacement,
		const Stmt &	stmt )

◆ graph_substitute() [1/4]

Expr Halide::Internal::graph_substitute	(	const std::string &	name,
		const Expr &	replacement,
		const Expr &	expr )

Substitutions where the IR may be a general graph (and not just a DAG).

◆ graph_substitute() [2/4]

Stmt Halide::Internal::graph_substitute	(	const std::string &	name,
		const Expr &	replacement,
		const Stmt &	stmt )

◆ graph_substitute() [3/4]

Expr Halide::Internal::graph_substitute	(	const Expr &	find,
		const Expr &	replacement,
		const Expr &	expr )

◆ graph_substitute() [4/4]

Stmt Halide::Internal::graph_substitute	(	const Expr &	find,
		const Expr &	replacement,
		const Stmt &	stmt )

◆ substitute_in_all_lets() [1/2]

Expr Halide::Internal::substitute_in_all_lets ( const Expr & expr )

Substitute in all let Exprs in a piece of IR.

Doesn't substitute in let stmts, as this may change the meaning of the IR (e.g. by moving a load after a store). Produces graphs of IR, so don't use non-graph-aware visitors or mutators on it until you've CSE'd the result.

◆ substitute_in_all_lets() [2/2]

Stmt Halide::Internal::substitute_in_all_lets ( const Stmt & stmt )

◆ target_test()

void Halide::Internal::target_test ( )

◆ lower_target_query_ops()

void Halide::Internal::lower_target_query_ops	(	std::map< std::string, Function > &	env,
		const Target &	t )

◆ inject_tracing()

Stmt Halide::Internal::inject_tracing	(	Stmt	,
		const std::string &	pipeline_name,
		bool	trace_pipeline,
		const std::map< std::string, Function > &	env,
		const std::vector< Function > &	outputs,
		const Target &	Target )

Take a statement representing a halide pipeline, inject calls to tracing functions at interesting points, such as allocations.

Should be done before storage flattening, but after all bounds inference.

◆ trim_no_ops()

Stmt Halide::Internal::trim_no_ops ( Stmt s )

Truncate loop bounds to the region over which they actually do something.

For examples see test/correctness/trim_no_ops.cpp

◆ unify_duplicate_lets()

Stmt Halide::Internal::unify_duplicate_lets ( const Stmt & s )

Find let statements that all define the same value, and make later ones just reuse the symbol names of the earlier ones.

◆ uniquify_variable_names()

Stmt Halide::Internal::uniquify_variable_names ( const Stmt & s )

Modify a statement so that every internally-defined variable name is unique.

This lets later passes assume syntactic equivalence is semantic equivalence.

◆ uniquify_variable_names_test()

void Halide::Internal::uniquify_variable_names_test ( )

◆ unpack_buffers()

Stmt Halide::Internal::unpack_buffers ( Stmt s )

Creates let stmts for the various buffer components (e.g. foo.extent.0) in any referenced concrete buffers or buffer parameters.

After this pass, the only undefined symbols should be scalar parameters and the buffers themselves (e.g. foo.buffer).

◆ unroll_loops()

Stmt Halide::Internal::unroll_loops ( const Stmt & )

Take a statement with for loops marked for unrolling, and convert each into several copies of the innermost statement.

I.e. unroll the loop.

◆ lower_unsafe_promises()

Stmt Halide::Internal::lower_unsafe_promises	(	const Stmt &	s,
		const Target &	t )

Lower all unsafe promises into either assertions or unchecked code, depending on the target.

◆ lower_safe_promises()

Stmt Halide::Internal::lower_safe_promises ( const Stmt & s )

Lower all safe promises by just stripping them.

This is a good idea once no more lowering stages are going to use boxes_touched.

◆ safe_numeric_cast()

template<typename DST , typename SRC , typename std::enable_if< std::is_floating_point< SRC >::value >::type * = nullptr>

DST Halide::Internal::safe_numeric_cast ( SRC s )

Some numeric conversions are UB if the value won't fit in the result; safe_numeric_cast<>() is meant as a drop-in replacement for a C/C++ cast that adds well-defined behavior for the UB cases, attempting to mimic common implementation behavior as much as possible.

Definition at line 99 of file Util.h.
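To make the problem concrete: converting a floating-point value to an integer type is undefined behavior in C++ when the truncated value does not fit. One well-defined alternative is a saturating conversion, sketched below for double to int32_t. The function saturating_cast_to_int32 is hypothetical, and both the saturation and the choice of 0 for NaN are assumptions made for illustration; they are not claimed to match what safe_numeric_cast does.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: a well-defined double -> int32_t conversion.
// Assumption: out-of-range values saturate and NaN maps to 0.
int32_t saturating_cast_to_int32(double s) {
    const int32_t lo = std::numeric_limits<int32_t>::min();
    const int32_t hi = std::numeric_limits<int32_t>::max();
    if (std::isnan(s)) return 0;
    // Both limits are exactly representable as double, so these comparisons are exact.
    if (s <= static_cast<double>(lo)) return lo;
    if (s >= static_cast<double>(hi)) return hi;
    return static_cast<int32_t>(s);  // in range, so this truncates toward zero
}
```

With this sketch, 1e20 maps to the largest int32_t, -1e20 to the smallest, and 3.9 to 3, whereas a plain cast of 1e20 would be undefined.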

◆ reinterpret_bits()

template<typename DstType , typename SrcType >

DstType Halide::Internal::reinterpret_bits ( const SrcType & src )

An aggressive form of reinterpret cast used for correct type-punning.

Definition at line 135 of file Util.h.

References memcpy().

Referenced by Halide::Internal::IRMatcher::fuzz_test_rule().
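Type punning through memcpy, which the reference above points at, can be sketched as follows. The names bit_copy and float_bits are hypothetical, the size check is an assumption about what a careful version would enforce, and the bit pattern mentioned afterwards assumes IEEE 754 binary32 floats.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the object representation of src into a DstType.
// Unlike a pointer cast, this does not violate strict aliasing rules.
template<typename DstType, typename SrcType>
DstType bit_copy(const SrcType &src) {
    static_assert(sizeof(DstType) == sizeof(SrcType), "types must have the same size");
    DstType dst;
    std::memcpy(&dst, &src, sizeof(dst));
    return dst;
}

// Example use: view the bits of a float as an unsigned 32-bit integer.
uint32_t float_bits(float f) {
    return bit_copy<uint32_t>(f);
}
```

On a platform with IEEE 754 binary32 floats, float_bits(1.0f) is 0x3f800000.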

◆ get_env_variable()

std::string Halide::Internal::get_env_variable ( char const * env_var_name )

Get value of an environment variable.

Returns its value if it is defined in the environment. If the var is not defined, an empty string is returned.

◆ running_program_name()

std::string Halide::Internal::running_program_name ( )

Get the name of the currently running executable.

Platform-specific. If program name cannot be retrieved, function returns an empty string.

◆ unique_name() [1/2]

std::string Halide::Internal::unique_name ( char prefix )

Generate a unique name starting with the given prefix.

It's unique relative to all other strings returned by unique_name in this process.

The single-character version always appends a numeric suffix to the character.

The string version will either return the input as-is (with high probability on the first time it is called with that input), or replace any existing '$' characters with underscores, then add a '$' sign and a numeric suffix to it.

Note that unique_name('f') therefore differs from unique_name("f"). The former returns something like f123, and the latter returns either f or f$123.

Referenced by Halide::Buffer< T, Dims >::Buffer().

◆ unique_name() [2/2]

std::string Halide::Internal::unique_name ( const std::string & prefix )

◆ starts_with()

bool Halide::Internal::starts_with	(	const std::string &	str,
		const std::string &	prefix )

Test if the first string starts with the second string.

◆ ends_with()

bool Halide::Internal::ends_with	(	const std::string &	str,
		const std::string &	suffix )

Test if the first string ends with the second string.

◆ replace_all()

std::string Halide::Internal::replace_all	(	const std::string &	str,
		const std::string &	find,
		const std::string &	replace )

Replace all matches of the second string in the first string with the last string.

◆ split_string()

std::vector< std::string > Halide::Internal::split_string	(	const std::string &	source,
		const std::string &	delim )

Split the source string using 'delim' as the divider.

◆ join_strings()

template<typename T >

std::string Halide::Internal::join_strings	(	const std::vector< T > &	sources,
		const std::string &	delim )

Join the source vector using 'delim' as the divider.

Definition at line 187 of file Util.h.
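For orientation, splitting and joining on a delimiter can be sketched like this. These are hypothetical stand-ins (split_sketch, join_sketch), not the Halide implementations: the join handles only std::string elements, the delimiter is assumed to be non-empty, and edge cases such as trailing delimiters may be handled differently by the real functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical split: returns the pieces of `source` between occurrences of `delim`.
std::vector<std::string> split_sketch(const std::string &source, const std::string &delim) {
    assert(!delim.empty());  // an empty delimiter would never advance
    std::vector<std::string> parts;
    std::size_t start = 0;
    std::size_t found = 0;
    while ((found = source.find(delim, start)) != std::string::npos) {
        parts.push_back(source.substr(start, found - start));
        start = found + delim.size();
    }
    parts.push_back(source.substr(start));  // the piece after the last delimiter
    return parts;
}

// Hypothetical join: concatenates the pieces with `delim` between them.
std::string join_sketch(const std::vector<std::string> &sources, const std::string &delim) {
    std::string out;
    for (std::size_t i = 0; i < sources.size(); i++) {
        if (i > 0) out += delim;
        out += sources[i];
    }
    return out;
}
```

In this sketch, joining the result of a split with the same delimiter reproduces the original string.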

◆ fold_left()

template<typename T , typename Fn >

T Halide::Internal::fold_left	(	const std::vector< T > &	vec,
		Fn	f )

Perform a left fold of a vector.

Returns a default-constructed vector element if the vector is empty. Similar to std::accumulate but with a less clunky syntax.

Definition at line 212 of file Util.h.

◆ fold_right()

template<typename T , typename Fn >

T Halide::Internal::fold_right	(	const std::vector< T > &	vec,
		Fn	f )

Returns a right fold of a vector.

Returns a default-constructed vector element if the vector is empty.

Definition at line 227 of file Util.h.
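The difference between the two folds only shows up with a non-associative operation. The sketch below uses hypothetical re-implementations (fold_left_sketch, fold_right_sketch) that follow the behavior described above, including returning a default-constructed element for an empty vector; they are illustrations rather than the code in Util.h.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical left fold: f(f(f(v0, v1), v2), v3) ...
template<typename T, typename Fn>
T fold_left_sketch(const std::vector<T> &vec, Fn f) {
    if (vec.empty()) return T();
    T result = vec[0];
    for (std::size_t i = 1; i < vec.size(); i++) {
        result = f(result, vec[i]);
    }
    return result;
}

// Hypothetical right fold: f(v0, f(v1, f(v2, v3))) ...
template<typename T, typename Fn>
T fold_right_sketch(const std::vector<T> &vec, Fn f) {
    if (vec.empty()) return T();
    T result = vec.back();
    for (std::size_t i = vec.size() - 1; i > 0; i--) {
        result = f(vec[i - 1], result);
    }
    return result;
}

// A non-associative operation makes the two folds differ.
int subtract(int a, int b) { return a - b; }
```

With the vector {10, 3, 2}, the left fold computes (10 - 3) - 2 = 5 and the right fold computes 10 - (3 - 2) = 9.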

◆ extract_namespaces()

std::string Halide::Internal::extract_namespaces	(	const std::string &	name,
		std::vector< std::string > &	namespaces )

Returns base name and fills in namespaces, outermost one first in vector.

Referenced by halide_handle_cplusplus_type::make().
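A simplified version of this kind of name splitting might look as follows. The function split_qualified_name is hypothetical and only splits on "::"; how the real function treats leading "::", templates, or other unusual names is not covered here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the base name and fills `namespaces` with the
// enclosing scopes, outermost first.
std::string split_qualified_name(const std::string &name,
                                 std::vector<std::string> &namespaces) {
    namespaces.clear();
    std::size_t start = 0;
    std::size_t pos = 0;
    while ((pos = name.find("::", start)) != std::string::npos) {
        namespaces.push_back(name.substr(start, pos - start));
        start = pos + 2;  // skip past the "::" separator
    }
    return name.substr(start);
}
```

For the name Halide::Internal::gcd, this sketch returns gcd and fills the vector with Halide followed by Internal.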

◆ strip_namespaces()

std::string Halide::Internal::strip_namespaces ( const std::string & name )

Like extract_namespaces(), but strip and discard the namespaces, returning base name only.

◆ file_make_temp()

std::string Halide::Internal::file_make_temp	(	const std::string &	prefix,
		const std::string &	suffix )

Create a unique file with a name of the form prefixXXXXXsuffix in an arbitrary (but writable) directory; this is typically /tmp, but the specific location is not guaranteed.

(Note that the exact form of the file name may vary; in particular, the suffix may be ignored on Windows.) The file is created (but not opened), thus this can be called from different threads (or processes, e.g. when building with parallel make) without risking collision. Note that if this file is used as a temporary file, the caller is responsible for deleting it. Neither the prefix nor suffix may contain a directory separator.

◆ dir_make_temp()

std::string Halide::Internal::dir_make_temp ( )

Create a unique directory in an arbitrary (but writable) directory; this is typically somewhere inside /tmp, but the specific location is not guaranteed.

The directory will be empty (i.e., this will never return /tmp itself, but rather a new directory inside /tmp). The caller is responsible for removing the directory after use.

◆ file_exists()

bool Halide::Internal::file_exists ( const std::string & name )

Wrapper for access().

Quietly ignores errors.

◆ assert_file_exists()

void Halide::Internal::assert_file_exists ( const std::string & name )

Assert-fail if the file doesn't exist.

Useful primarily for testing purposes.

◆ assert_no_file_exists()

void Halide::Internal::assert_no_file_exists ( const std::string & name )

Assert-fail if the file DOES exist.

Useful primarily for testing purposes.

◆ file_unlink()

void Halide::Internal::file_unlink ( const std::string & name )

Wrapper for unlink().

Quietly ignores errors.

Referenced by Halide::Internal::TemporaryFile::~TemporaryFile().

◆ ensure_no_file_exists()

void Halide::Internal::ensure_no_file_exists ( const std::string & name )

Ensure that no file with this path exists.

If such a file exists and cannot be removed, assert-fail.

◆ dir_rmdir()

void Halide::Internal::dir_rmdir ( const std::string & name )

Wrapper for rmdir().

Asserts upon error.

◆ file_stat()

FileStat Halide::Internal::file_stat ( const std::string & name )

Wrapper for stat().

Asserts upon error.

◆ read_entire_file()

std::vector< char > Halide::Internal::read_entire_file ( const std::string & pathname )

Read the entire contents of a file into a vector<char>.

The file is read in binary mode. Errors trigger an assertion failure.

◆ write_entire_file() [1/2]

void Halide::Internal::write_entire_file	(	const std::string &	pathname,
		const void *	source,
		size_t	source_len )

Create or replace the contents of a file with a given pointer-and-length of memory.

If the file doesn't exist, it is created; if it does exist, it is completely overwritten. Any error triggers an assertion failure.

Referenced by write_entire_file().

◆ write_entire_file() [2/2]

void Halide::Internal::write_entire_file	(	const std::string &	pathname,
		const std::vector< char > &	source )

inline

Definition at line 322 of file Util.h.

References write_entire_file().
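
The read and write contracts above (binary mode, create-or-replace semantics) can be sketched with standard streams. This is only an illustration under stated assumptions: the sketch_ names are hypothetical, and failures throw std::runtime_error here, whereas the documented functions trigger an assertion failure.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch: read a whole file in binary mode.
std::vector<char> sketch_read_entire_file(const std::string &pathname) {
    std::ifstream in(pathname, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open for reading: " + pathname);
    }
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Hypothetical sketch: create the file, or replace its contents entirely.
void sketch_write_entire_file(const std::string &pathname,
                              const void *source, std::size_t source_len) {
    assert(source != nullptr || source_len == 0);
    std::ofstream out(pathname, std::ios::binary | std::ios::trunc);
    if (!out) {
        throw std::runtime_error("cannot open for writing: " + pathname);
    }
    out.write(static_cast<const char *>(source),
              static_cast<std::streamsize>(source_len));
    if (!out) {
        throw std::runtime_error("write failed: " + pathname);
    }
}

// Convenience overload that forwards a vector to the pointer-and-length form.
inline void sketch_write_entire_file(const std::string &pathname,
                                     const std::vector<char> &source) {
    sketch_write_entire_file(pathname, source.data(), source.size());
}
```

Opening with std::ios::trunc is what gives the "completely overwritten" behavior: a shorter second write leaves no trailing bytes from the first.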

◆ add_would_overflow()

bool Halide::Internal::add_would_overflow	(	int	bits,
		int64_t	a,
		int64_t	b )

Routines to test if math would overflow for signed integers with the given number of bits.

Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Add >().

◆ sub_would_overflow()

bool Halide::Internal::sub_would_overflow	(	int	bits,
		int64_t	a,
		int64_t	b )

Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Sub >().
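
One way to test for overflow of a signed integer with a given number of bits, without performing the overflowing operation itself, is to rearrange the range comparison. The following is a minimal sketch of that idea, not the library's source; the sketch_ names and the signed_max helper are hypothetical, and bits is assumed to lie in [1, 64].

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: largest value of a signed integer with `bits` bits.
inline int64_t signed_max(int bits) {
    assert(bits >= 1 && bits <= 64);
    return INT64_MAX >> (64 - bits);
}

// True if a + b does not fit in a signed integer with `bits` bits.
// The comparisons are rearranged so no intermediate value overflows int64_t.
inline bool sketch_add_would_overflow(int bits, int64_t a, int64_t b) {
    const int64_t max_val = signed_max(bits);
    const int64_t min_val = -max_val - 1;
    return (b > 0 && a > max_val - b) ||  // a + b > max_val
           (b < 0 && a < min_val - b);    // a + b < min_val
}

// True if a - b does not fit in a signed integer with `bits` bits.
inline bool sketch_sub_would_overflow(int bits, int64_t a, int64_t b) {
    const int64_t max_val = signed_max(bits);
    const int64_t min_val = -max_val - 1;
    return (b < 0 && a > max_val + b) ||  // a - b > max_val
           (b > 0 && a < min_val + b);    // a - b < min_val
}
```

For example, with bits = 8 the sum 100 + 27 fits (127), while 100 + 28 does not.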

◆ mul_would_overflow()

bool Halide::Internal::mul_would_overflow	(	int	bits,
		int64_t	a,
		int64_t	b )

Referenced by Halide::Internal::IRMatcher::constant_fold_bin_op< Mul >().

◆ add_with_overflow()

HALIDE_MUST_USE_RESULT bool Halide::Internal::add_with_overflow	(	int	bits,
		int64_t	a,
		int64_t	b,
		int64_t *	result )

Routines to perform arithmetic on signed types without triggering signed overflow.

If overflow would occur, sets result to zero and returns false. Otherwise, sets result to the correct value and returns true.

Referenced by Halide::Internal::Simplify::ExprInfo::trim_bounds_using_alignment().

◆ sub_with_overflow()

HALIDE_MUST_USE_RESULT bool Halide::Internal::sub_with_overflow	(	int	bits,
		int64_t	a,
		int64_t	b,
		int64_t *	result )

Referenced by Halide::Internal::Simplify::ExprInfo::trim_bounds_using_alignment().

◆ mul_with_overflow()

HALIDE_MUST_USE_RESULT bool Halide::Internal::mul_with_overflow	(	int	bits,
		int64_t	a,
		int64_t	b,
		int64_t *	result )
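
The result-and-flag contract described for add_with_overflow() (zero result and false on overflow, correct value and true otherwise) can be sketched for multiplication as follows. This assumes mul_with_overflow() follows the same contract, which the page groups together but does not state separately. The name sketch_mul_with_overflow is hypothetical, and the division-based checks are a generic technique rather than the library's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: multiply a and b as signed `bits`-bit integers.
// On overflow, stores 0 in *result and returns false; otherwise stores
// the product and returns true. Division is used so that the check itself
// never overflows int64_t.
inline bool sketch_mul_with_overflow(int bits, int64_t a, int64_t b,
                                     int64_t *result) {
    assert(bits >= 1 && bits <= 64);
    assert(result != nullptr);
    const int64_t max_val = INT64_MAX >> (64 - bits);
    const int64_t min_val = -max_val - 1;
    bool overflow = false;
    if (a > 0 && b > 0) {
        overflow = a > max_val / b;  // product above max_val
    } else if (a > 0 && b < 0) {
        overflow = b < min_val / a;  // product below min_val
    } else if (a < 0 && b > 0) {
        overflow = a < min_val / b;  // product below min_val
    } else if (a < 0 && b < 0) {
        overflow = b < max_val / a;  // product above max_val
    }
    // If either operand is zero the product is zero and cannot overflow.
    *result = overflow ? 0 : a * b;
    return !overflow;
}
```

Because the function returns a flag rather than the product, callers are expected to check the return value before using the result, which is presumably the purpose of HALIDE_MUST_USE_RESULT on the declarations above.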

◆ halide_tic_impl()

void Halide::Internal::halide_tic_impl	(	const char *	file,
		int	line )

◆ halide_toc_impl()

void Halide::Internal::halide_toc_impl	(	const char *	file,
		int	line )

◆ begin()

template<typename T >

auto Halide::Internal::begin ( reverse_adaptor< T > i )

Definition at line 467 of file Util.h.

References Halide::Internal::reverse_adaptor< T >::range.

Referenced by Halide::Internal::Elf::Section::append_contents(), Halide::Internal::Elf::Section::prepend_contents(), Halide::Internal::Elf::Section::set_contents(), and Halide::Internal::Elf::Section::set_relocations().

◆ end()

template<typename T >

auto Halide::Internal::end ( reverse_adaptor< T > i )

Definition at line 472 of file Util.h.

References Halide::Internal::reverse_adaptor< T >::range.

Referenced by Halide::Internal::Elf::Section::append_contents(), Halide::Internal::Elf::Section::prepend_contents(), Halide::Internal::Elf::Section::set_contents(), and Halide::Internal::Elf::Section::set_relocations().

◆ reverse_view()

template<typename T >

reverse_adaptor< T > Halide::Internal::reverse_view ( T && range )

Reverse-order adaptor for range-based for-loops.

TODO: Replace with std::ranges::reverse_view when upgrading to C++20.

Definition at line 481 of file Util.h.
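
A reverse-order adaptor of this kind is typically a small wrapper whose begin() and end() return reverse iterators, so that a range-based for-loop walks the range back to front. The sketch below shows the general pattern in its own namespace; it mirrors the names listed on this page for readability, but it is written independently and should not be read as the contents of Util.h.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

namespace sketch {

// Holds a reference to the wrapped range; T is deduced by reverse_view().
template<typename T>
struct reverse_adaptor {
    T &range;
};

// Found by argument-dependent lookup when used in a range-based for-loop.
template<typename T>
auto begin(reverse_adaptor<T> r) {
    return std::rbegin(r.range);
}

template<typename T>
auto end(reverse_adaptor<T> r) {
    return std::rend(r.range);
}

// Wraps an existing range. The adaptor only stores a reference, so the
// wrapped range must outlive the loop (avoid passing temporaries).
template<typename T>
reverse_adaptor<T> reverse_view(T &&range) {
    return {range};
}

}  // namespace sketch

// Usage:
//   std::vector<int> v{1, 2, 3};
//   for (int x : sketch::reverse_view(v)) { /* visits 3, then 2, then 1 */ }
```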

◆ c_print_name()

std::string Halide::Internal::c_print_name	(	const std::string &	name,
		bool	prefix_underscore = true )

Emit a version of a string that is a valid identifier in C ('.' is replaced with '_').

If prefix_underscore is true (the default), an underscore will be prepended if the input starts with an alphabetic character to avoid reserved word clashes.
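
The two rules stated above can be sketched as follows. The function name sketch_c_name is hypothetical, and the sketch covers only what this description states (the '.' replacement and the optional leading underscore); the real function may treat other characters specially.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical sketch of the documented rules only:
//  - every '.' becomes '_'
//  - if prefix_underscore is set and the name starts with a letter,
//    a leading '_' is added so the result cannot clash with a C keyword.
std::string sketch_c_name(const std::string &name,
                          bool prefix_underscore = true) {
    std::string out;
    if (prefix_underscore && !name.empty() &&
        std::isalpha(static_cast<unsigned char>(name[0]))) {
        out += '_';
    }
    for (char c : name) {
        out += (c == '.') ? '_' : c;
    }
    return out;
}
```

With these rules a name such as "while" becomes "_while", which is no longer a reserved word.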

◆ get_llvm_version()

int Halide::Internal::get_llvm_version ( )

Return the LLVM_VERSION against which this libHalide is compiled.

This is provided only for internal tests which need to verify behavior; please don't use this outside of Halide tests.

◆ run_with_large_stack()

void Halide::Internal::run_with_large_stack ( const std::function< void()> & action )

Call the given action in a platform-specific context that provides at least the stack space returned by get_compiler_stack_size.

If that value is zero, just calls the function on the calling thread. Otherwise on Windows this uses a Fiber, and on other platforms it uses swapcontext.

◆ popcount64()

int Halide::Internal::popcount64 ( uint64_t x )

Portable versions of popcount, count-leading-zeros, and count-trailing-zeros.

◆ clz64()

int Halide::Internal::clz64 ( uint64_t x )

◆ ctz64()

int Halide::Internal::ctz64 ( uint64_t x )
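
Portable versions of these bit operations can be written without compiler intrinsics. The loop-based sketch below is one possible approach and makes no claim about how the library implements them; the sketch_ names are hypothetical, and returning 64 from the two zero-counting functions for an input of zero is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Number of set bits: clear the lowest set bit until none remain.
inline int sketch_popcount64(uint64_t x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}

// Number of zero bits above the highest set bit (64 if x is zero).
inline int sketch_clz64(uint64_t x) {
    int count = 0;
    for (uint64_t mask = uint64_t{1} << 63;
         mask != 0 && (x & mask) == 0; mask >>= 1) {
        ++count;
    }
    return count;
}

// Number of zero bits below the lowest set bit (64 if x is zero).
inline int sketch_ctz64(uint64_t x) {
    if (x == 0) {
        return 64;
    }
    int count = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        ++count;
    }
    return count;
}
```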

◆ next_power_of_two()

int64_t Halide::Internal::next_power_of_two ( int64_t x )

inline

Return an integer 2^n, for some n, which is >= x.

Argument x must be > 0.

Definition at line 556 of file Util.h.
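
A simple integer-only way to satisfy this contract is to double a candidate until it reaches x. The sketch below returns the smallest such power of two, which is one valid choice under the description above; the name is hypothetical and the upper bound on x is an assumption added to keep the shift within int64_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: smallest power of two that is >= x.
inline int64_t sketch_next_power_of_two(int64_t x) {
    assert(x > 0);                    // documented precondition
    assert(x <= (int64_t{1} << 62));  // largest power of two in int64_t
    int64_t p = 1;
    while (p < x) {
        p <<= 1;
    }
    return p;
}
```

Inputs that are already powers of two are returned unchanged, since the contract only requires a result that is >= x.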

◆ align_up()

template<typename T >

T Halide::Internal::align_up	(	T	x,
		int	n )

inline

Definition at line 561 of file Util.h.

◆ make_argument_list()

std::vector< Var > Halide::Internal::make_argument_list ( int dimensionality )

Make a list of unique arguments for definitions with unnamed arguments.

Referenced by Halide::Func::define_extern().

◆ vectorize_loops()

Stmt Halide::Internal::vectorize_loops	(	const Stmt &	s,
		const std::map< std::string, Function > &	env )

Take a statement with for loops marked for vectorization, and turn them into single statements that operate on vectors.

The loops in question must have constant extent.

◆ wrap_func_calls()

std::map< std::string, Function > Halide::Internal::wrap_func_calls ( const std::map< std::string, Function > & env )

Replace every call to wrapped Functions in the Functions' definitions with calls to their wrapper functions.

◆ get_test_tmp_dir()

std::string Halide::Internal::get_test_tmp_dir ( )

inline

Return the path to a directory that can be safely written to when running tests; the directory's contents may or may not outlast the lifetime of the test itself (i.e., the files may be cleaned up after test execution).

The path is guaranteed to be an absolute path and end in a directory separator, so a leaf filename can simply be appended. It is not guaranteed that this directory will be empty. If the path cannot be created, the function will assert-fail and return an invalid path.

Definition at line 76 of file halide_test_dirs.h.

References Halide::Internal::Test::get_current_directory(), and Halide::Internal::Test::get_env_variable().

Variable Documentation

◆ unknown

const int64_t Halide::Internal::unknown = std::numeric_limits<int64_t>::min()

Definition at line 22 of file AutoScheduleUtils.h.

◆ StrongestExprNodeType

IRNodeType Halide::Internal::StrongestExprNodeType = IRNodeType::VectorReduce

constexpr

Definition at line 81 of file Expr.h.

◆ random_variable_counter

std::atomic<int> Halide::Internal::random_variable_counter

extern

Typedef Documentation

◆ AbstractGeneratorPtr

◆ DimBounds

◆ FuncValueBounds

◆ add_const_if_T_is_const

◆ GeneratorParamImplBase

◆ GeneratorInputImplBase

◆ GeneratorOutputImplBase

◆ GeneratorFactory

◆ LLVMOStream

Enumeration Type Documentation

◆ ArgInfoKind

◆ ArgInfoDirection

◆ Direction

◆ IRNodeType

◆ ForType

◆ SyntheticParamType

◆ Monotonic

◆ DimType

Function Documentation

◆ add_atomic_mutex()

◆ add_image_checks()

◆ add_parameter_checks()

◆ add_split_factor_checks()

◆ align_loads()

◆ allocation_bounds_inference()

◆ apply_split()

◆ compute_loop_bounds_after_split()

◆ get_ops_table()

◆ prove_associativity()

◆ associativity_test()

◆ fork_async_producers()

◆ string_to_int()

◆ substitute_var_estimates() [1/2]

◆ substitute_var_estimates() [2/2]

◆ get_extent()

◆ box_size()

◆ disp_regions()

◆ get_stage_definition()

◆ get_stage_dims()

◆ combine_load_costs()

◆ get_stage_bounds() [1/2]

◆ get_stage_bounds() [2/2]

◆ perform_inline()

◆ get_parents()

◆ get_element() [1/2]

◆ get_element() [2/2]

◆ inline_all_trivial_functions()

◆ is_func_called_element_wise()

◆ inline_all_element_wise_functions()

◆ propagate_estimate_test()

◆ bound_constant_extent_loops()

◆ empty_func_value_bounds()

◆ bounds_of_expr_in_scope()

◆ find_constant_bound()

◆ find_constant_bounds()

◆ merge_boxes()

◆ boxes_overlap()

◆ box_union()

◆ box_intersection()

◆ box_contains()

◆ boxes_required() [1/2]

◆ boxes_required() [2/2]

◆ boxes_provided() [1/2]

◆ boxes_provided() [2/2]

◆ boxes_touched() [1/2]

◆ boxes_touched() [2/2]

◆ box_required() [1/2]

◆ box_required() [2/2]

◆ box_provided() [1/2]

◆ box_provided() [2/2]

◆ box_touched() [1/2]

◆ box_touched() [2/2]

◆ compute_function_value_bounds()