Halide 19.0.0
Halide compiler and libraries
This file defines the class FunctionDAG, which is our representation of a Halide pipeline, and contains methods that use Halide's bounds tools to query properties of it. More...
Namespaces | |
namespace | BoundaryConditions |
namespace to hold functions for imposing boundary conditions on Halide Funcs. | |
namespace | ConciseCasts |
namespace | Internal |
namespace | PythonBindings |
namespace | PyTorch |
namespace | Runtime |
Classes | |
struct | Argument |
A struct representing an argument to a Halide-generated function. More... | |
struct | ArgumentEstimates |
struct | AutoschedulerParams |
Specifies the Autoscheduler to be used (if any), along with arbitrary additional arguments specific to the given Autoscheduler. More... | |
struct | AutoSchedulerResults |
struct | bfloat16_t |
Class that provides a type that implements half precision floating point using the bfloat16 format. More... | |
class | Buffer |
A Halide::Buffer is a named shared reference to a Halide::Runtime::Buffer. More... | |
struct | BufferConstraint |
class | Callable |
struct | CompileError |
An error that occurs while compiling a Halide pipeline that Halide attributes to a user error. More... | |
class | CompileTimeErrorReporter |
CompileTimeErrorReporter is used at compile time (not runtime) when an error or warning is generated by Halide. More... | |
class | CostModel |
struct | CustomLoweringPass |
A custom lowering pass. More... | |
class | DefaultCostModel |
class | Derivative |
Helper structure storing the adjoint Funcs. More... | |
struct | Error |
A base class for Halide errors. More... | |
class | EvictionKey |
Helper class for identifying purpose of an Expr passed to memoize. More... | |
struct | Expr |
A fragment of Halide syntax. More... | |
struct | ExprCompare |
This lets you use an Expr as a key in a map of the form map<Expr, Foo, ExprCompare> More... | |
struct | ExternCFunction |
struct | ExternFuncArgument |
An argument to an extern-defined Func. More... | |
struct | ExternSignature |
struct | float16_t |
Class that provides a type that implements half precision floating point (IEEE754 2008 binary16) in software. More... | |
class | Func |
A Halide function. More... | |
class | FuncRef |
A fragment of front-end syntax of the form f(x, y, z), where x, y, z are Vars or Exprs. More... | |
class | FuncTupleElementRef |
A fragment of front-end syntax of the form f(x, y, z)[index], where x, y, z are Vars or Exprs. More... | |
struct | FuseLoopLevel |
class | Generator |
class | GeneratorContext |
GeneratorContext is a class that is used when using Generators (or Stubs) directly; it is used to allow the outer context (typically, either a Generator or "top-level" code) to specify certain information to the inner context to ensure that inner and outer Generators are compiled in a compatible way. More... | |
class | GeneratorInput |
class | GeneratorOutput |
class | GeneratorParam |
GeneratorParam is a templated class that can be used to modify the behavior of the Generator at code-generation time. More... | |
class | ImageParam |
An image parameter to a Halide pipeline. More... | |
struct | ImplicitVar |
struct | InternalError |
An error that occurs while compiling a Halide pipeline that Halide attributes to an internal compiler bug, or to an invalid use of Halide's internals. More... | |
struct | JITExtern |
struct | JITHandlers |
A set of custom overrides of runtime functions. More... | |
struct | JITUserContext |
A context to be passed to Pipeline::realize. More... | |
class | LoopLevel |
A reference to a site in a Halide statement at the top of the body of a particular for loop. More... | |
class | Module |
A Halide module. More... | |
class | NamesInterface |
class | OutputImageParam |
A handle on the output buffer of a pipeline. More... | |
class | Param |
A scalar parameter to a Halide pipeline. More... | |
class | Parameter |
A reference-counted handle to a parameter to a Halide pipeline. More... | |
class | Pipeline |
A class representing a Halide pipeline. More... | |
struct | Range |
A single-dimensional span. More... | |
class | RDom |
A multi-dimensional domain over which to iterate. More... | |
class | Realization |
A Realization is a vector of references to existing Buffer objects. More... | |
struct | RuntimeError |
An error that occurs while running a JIT-compiled Halide pipeline. More... | |
class | RVar |
A reduction variable represents a single dimension of a reduction domain (RDom). More... | |
class | SimdOpCheckTest |
class | Stage |
A single definition of a Func. More... | |
struct | Target |
A struct representing a target machine and OS to generate code for. More... | |
struct | Task |
struct | TestResult |
class | Tuple |
Create a small array of Exprs for defining and calling functions with multiple outputs. More... | |
struct | Type |
Types in the Halide type system. More... | |
class | Var |
A Halide variable, to be used when defining functions. More... | |
struct | VarOrRVar |
A class that can represent Vars or RVars. More... | |
Typedefs | |
using | GeneratorParamsMap = std::map<std::string, std::string> |
typedef std::vector< Range > | Region |
A multi-dimensional box. | |
typedef Stage | ScheduleHandle |
using | MetadataNameMap = std::map<std::string, std::string> |
using | ModuleFactory = std::function<Module(const std::string &fn_name, const Target &target)> |
using | CompilerLoggerFactory = std::function<std::unique_ptr<Internal::CompilerLogger>(const std::string &fn_name, const Target &target)> |
using | AutoSchedulerFn = std::function<void(const Pipeline &, const Target &, const AutoschedulerParams &, AutoSchedulerResults *outputs)> |
Enumerations | |
enum class | DeviceAPI { None , Host , Default_GPU , CUDA , OpenCL , Metal , Hexagon , HexagonDma , D3D12Compute , Vulkan , WebGPU } |
An enum describing a type of device API. More... | |
enum class | MemoryType { Auto , Heap , Stack , Register , GPUShared , GPUTexture , LockedCache , VTCM , AMXTile } |
An enum describing different address spaces to be used with Func::store_in. More... | |
enum class | NameMangling { Default , C , CPlusPlus } |
An enum to specify calling convention for extern stages. More... | |
enum class | Partition { Auto , Never , Always } |
Different ways to handle loops with potentially optimizable boundary conditions. More... | |
enum class | OutputFileType { assembly , bitcode , c_header , c_source , compiler_log , cpp_stub , featurization , function_info_header , hlpipe , llvm_assembly , object , python_extension , pytorch_wrapper , registration , schedule , static_library , stmt , conceptual_stmt , stmt_html , conceptual_stmt_html , device_code } |
Enums specifying various kinds of outputs that can be produced from a Halide Pipeline. More... | |
enum class | LinkageType { External , ExternalPlusMetadata , ExternalPlusArgv , Internal } |
Type of linkage a function in a lowered Halide module can have. More... | |
enum | StmtOutputFormat { Text , HTML } |
Used to determine whether the output printed to file should be a normal string or an HTML file that can be opened in a browser and manipulated via JS and CSS. More... | |
enum class | PrefetchBoundStrategy { Clamp , GuardWithIf , NonFaulting } |
Different ways to handle accesses outside the original extents in a prefetch. More... | |
enum class | TailStrategy { RoundUp , GuardWithIf , Predicate , PredicateLoads , PredicateStores , ShiftInwards , ShiftInwardsAndBlend , RoundUpAndBlend , Auto } |
Different ways to handle a tail case in a split when the factor does not provably divide the extent. More... | |
enum class | LoopAlignStrategy { AlignStart , AlignEnd , NoAlign , Auto } |
Different ways to handle the case when the start/end of the loops of stages computed with (fused) are not aligned. More... | |
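The TailStrategy entry above concerns splits whose factor does not divide the loop extent. The sketch below is plain C++, not Halide code: it imitates three of the listed strategies (RoundUp, GuardWithIf, ShiftInwards) by recording which indices a split loop would visit. The helper names are hypothetical and exist only for this illustration, and the sketch ignores the legality conditions that the real strategies carry.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helpers (not part of the Halide API). Each returns the
// indices visited when a loop over [0, extent) is split by `factor`
// into an outer loop (xo) and an inner loop (xi).

// RoundUp-like: the extent is rounded up to a multiple of the factor,
// so the loop may visit indices at or beyond `extent`.
std::vector<int> visit_round_up(int extent, int factor) {
    assert(extent > 0 && factor > 0);
    std::vector<int> visited;
    const int outer_extent = (extent + factor - 1) / factor;
    for (int xo = 0; xo < outer_extent; xo++) {
        for (int xi = 0; xi < factor; xi++) {
            visited.push_back(xo * factor + xi);
        }
    }
    return visited;
}

// GuardWithIf-like: same loop nest, but the body is guarded so that
// nothing beyond the original extent is visited.
std::vector<int> visit_guard_with_if(int extent, int factor) {
    assert(extent > 0 && factor > 0);
    std::vector<int> visited;
    const int outer_extent = (extent + factor - 1) / factor;
    for (int xo = 0; xo < outer_extent; xo++) {
        for (int xi = 0; xi < factor; xi++) {
            const int x = xo * factor + xi;
            if (x < extent) {
                visited.push_back(x);
            }
        }
    }
    return visited;
}

// ShiftInwards-like: the final outer iteration is shifted back so that it
// ends exactly at `extent`; some indices are therefore visited twice.
// This sketch assumes extent >= factor.
std::vector<int> visit_shift_inwards(int extent, int factor) {
    assert(factor > 0 && extent >= factor);
    std::vector<int> visited;
    const int outer_extent = (extent + factor - 1) / factor;
    for (int xo = 0; xo < outer_extent; xo++) {
        const int base = std::min(xo * factor, extent - factor);
        for (int xi = 0; xi < factor; xi++) {
            visited.push_back(base + xi);
        }
    }
    return visited;
}
```

With an extent of 10 and a factor of 4, the RoundUp-like loop visits indices 0 through 11, the GuardWithIf-like loop visits 0 through 9 once each, and the ShiftInwards-like loop visits 0 through 9 with indices 6 and 7 visited twice.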
Functions | |
std::unique_ptr< DefaultCostModel > | make_default_cost_model (const std::string &weights_in_dir="", const std::string &weights_out_dir="", bool randomize_weights=false) |
std::unique_ptr< DefaultCostModel > | make_default_cost_model (Internal::Autoscheduler::Statistics &stats, const std::string &weights_in_dir="", const std::string &weights_out_dir="", bool randomize_weights=false) |
std::unique_ptr< llvm::Module > | codegen_llvm (const Module &module, llvm::LLVMContext &context) |
Given a Halide module, generate an llvm::Module. | |
Internal::ConstantInterval | cast (Type t, const Internal::ConstantInterval &a) |
Cast operators for ConstantIntervals. | |
Internal::ConstantInterval | saturating_cast (Type t, const Internal::ConstantInterval &a) |
std::ostream & | operator<< (std::ostream &stream, const Expr &) |
Emit an expression on an output stream (such as std::cout) in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const Type &) |
Emit a Halide type on an output stream (such as std::cout) in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const Module &) |
Emit a Halide Module on an output stream (such as std::cout) in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const Target &) |
Emit a Halide Target in human-readable form. | |
Derivative | propagate_adjoints (const Func &output, const Func &adjoint, const Region &output_bounds) |
Given a Func and a corresponding adjoint, (back)propagate the adjoint to all dependent Funcs, buffers, and parameters. | |
Derivative | propagate_adjoints (const Func &output, const Buffer< float > &adjoint) |
Given a Func and a corresponding adjoint buffer, (back)propagate the adjoint to all dependent Funcs, buffers, and parameters. | |
Derivative | propagate_adjoints (const Func &output) |
Given a scalar Func with size 1, (back)propagate the gradient to all dependent Funcs, buffers, and parameters. | |
Pipeline | deserialize_pipeline (const std::string &filename, const std::map< std::string, Parameter > &user_params) |
Deserialize a Halide pipeline from a file. | |
Pipeline | deserialize_pipeline (std::istream &in, const std::map< std::string, Parameter > &user_params) |
Deserialize a Halide pipeline from an input stream. | |
Pipeline | deserialize_pipeline (const std::vector< uint8_t > &data, const std::map< std::string, Parameter > &user_params) |
Deserialize a Halide pipeline from a byte buffer containing a serialized pipeline in binary format. | |
std::map< std::string, Parameter > | deserialize_parameters (const std::string &filename) |
Deserialize the external parameters for the Halide pipeline from a file. | |
std::map< std::string, Parameter > | deserialize_parameters (std::istream &in) |
Deserialize the external parameters for the Halide pipeline from an input stream. | |
std::map< std::string, Parameter > | deserialize_parameters (const std::vector< uint8_t > &data) |
Deserialize the external parameters for the Halide pipeline from a byte buffer containing a serialized pipeline in binary format. | |
const halide_device_interface_t * | get_device_interface_for_device_api (DeviceAPI d, const Target &t=get_jit_target_from_environment(), const char *error_site=nullptr) |
Gets the appropriate halide_device_interface_t * for a DeviceAPI. | |
DeviceAPI | get_default_device_api_for_target (const Target &t) |
Get the specific DeviceAPI that Halide would select when presented with DeviceAPI::Default_GPU for a given target. | |
bool | host_supports_target_device (const Target &t) |
This attempts to sniff whether a given Target (and its implied DeviceAPI) is usable on the current host. | |
bool | exceptions_enabled () |
Query whether Halide was compiled with exceptions. | |
void | set_custom_compile_time_error_reporter (CompileTimeErrorReporter *error_reporter) |
The default error reporter logs to stderr, then throws an exception (if HALIDE_WITH_EXCEPTIONS) or calls abort (if not). | |
Expr | fast_integer_divide (const Expr &numerator, const Expr &denominator) |
Integer division by small values can be done exactly as multiplies and shifts. | |
Expr | fast_integer_divide_round_to_zero (const Expr &numerator, const Expr &denominator) |
A variant of the above which rounds towards zero instead of rounding towards negative infinity. | |
Expr | fast_integer_modulo (const Expr &numerator, const Expr &denominator) |
Use the fast integer division tables to implement a modulo operation via the Euclidean identity: a%b = a - (a/b)*b. | |
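The three entries above differ in rounding direction and in how the remainder is derived from the quotient. The sketch below is plain C++, not Halide code, and it does not reproduce the table-based multiply-and-shift technique; it only illustrates the rounding behaviors and the identity a%b = a - (a/b)*b for a positive denominator. The helper names are hypothetical.

```cpp
#include <cassert>

// Hypothetical helpers (not part of the Halide API), restricted to
// positive denominators for simplicity.

// Division that rounds toward negative infinity.
int floor_div(int a, int b) {
    assert(b > 0);
    int q = a / b;  // built-in C++ division truncates toward zero
    if (a % b != 0 && a < 0) {
        q -= 1;  // step down when truncation rounded a negative quotient up
    }
    return q;
}

// Division that rounds toward zero (the built-in C++ behavior).
int trunc_div(int a, int b) {
    assert(b > 0);
    return a / b;
}

// Remainder recovered from the quotient: a - (a/b)*b.
// With floor division and b > 0 the result lies in [0, b).
int mod_from_div(int a, int b) {
    return a - floor_div(a, b) * b;
}
```

For example, with a = -7 and b = 2, floor_div gives -4, trunc_div gives -3, and mod_from_div gives 1, whereas the built-in C++ expression -7 % 2 gives -1.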
Expr | min (const FuncRef &a, const FuncRef &b) |
Explicit overloads of min and max for FuncRef. | |
Expr | max (const FuncRef &a, const FuncRef &b) |
template<typename T > | |
HALIDE_NO_USER_CODE_INLINE T | evaluate (JITUserContext *ctx, const Expr &e) |
JIT-compile and run enough code to evaluate a Halide expression. | |
template<typename T > | |
HALIDE_NO_USER_CODE_INLINE T | evaluate (const Expr &e) |
Evaluate with a default user context. | |
template<typename First , typename... Rest> | |
HALIDE_NO_USER_CODE_INLINE void | evaluate (JITUserContext *ctx, Tuple t, First first, Rest &&...rest) |
JIT-compile and run enough code to evaluate a Halide Tuple. | |
template<typename First , typename... Rest> | |
HALIDE_NO_USER_CODE_INLINE void | evaluate (Tuple t, First first, Rest &&...rest) |
JIT-compile and run enough code to evaluate a Halide Tuple. | |
template<typename T > | |
HALIDE_NO_USER_CODE_INLINE T | evaluate_may_gpu (const Expr &e) |
JIT-compile and run enough code to evaluate a Halide expression. | |
template<typename First , typename... Rest> | |
HALIDE_NO_USER_CODE_INLINE void | evaluate_may_gpu (Tuple t, First first, Rest &&...rest) |
JIT-compile and run enough code to evaluate a Halide Tuple. | |
template<typename Other , typename T > | |
auto | operator+ (const Other &a, const GeneratorParam< T > &b) -> decltype(a+(T) b) |
Addition between GeneratorParam<T> and any type that supports operator+ with T. | |
template<typename Other , typename T > | |
auto | operator+ (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a+b) |
template<typename Other , typename T > | |
auto | operator- (const Other &a, const GeneratorParam< T > &b) -> decltype(a -(T) b) |
Subtraction between GeneratorParam<T> and any type that supports operator- with T. | |
template<typename Other , typename T > | |
auto | operator- (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a - b) |
template<typename Other , typename T > | |
auto | operator* (const Other &a, const GeneratorParam< T > &b) -> decltype(a *(T) b) |
Multiplication between GeneratorParam<T> and any type that supports operator* with T. | |
template<typename Other , typename T > | |
auto | operator* (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a *b) |
template<typename Other , typename T > | |
auto | operator/ (const Other &a, const GeneratorParam< T > &b) -> decltype(a/(T) b) |
Division between GeneratorParam<T> and any type that supports operator/ with T. | |
template<typename Other , typename T > | |
auto | operator/ (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a/b) |
template<typename Other , typename T > | |
auto | operator% (const Other &a, const GeneratorParam< T > &b) -> decltype(a %(T) b) |
Modulo between GeneratorParam<T> and any type that supports operator% with T. | |
template<typename Other , typename T > | |
auto | operator% (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a % b) |
template<typename Other , typename T > | |
auto | operator> (const Other &a, const GeneratorParam< T > &b) -> decltype(a >(T) b) |
Greater than comparison between GeneratorParam<T> and any type that supports operator> with T. | |
template<typename Other , typename T > | |
auto | operator> (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a > b) |
template<typename Other , typename T > | |
auto | operator< (const Other &a, const GeneratorParam< T > &b) -> decltype(a<(T) b) |
Less than comparison between GeneratorParam<T> and any type that supports operator< with T. | |
template<typename Other , typename T > | |
auto | operator< (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a< b) |
template<typename Other , typename T > | |
auto | operator>= (const Other &a, const GeneratorParam< T > &b) -> decltype(a >=(T) b) |
Greater than or equal comparison between GeneratorParam<T> and any type that supports operator>= with T. | |
template<typename Other , typename T > | |
auto | operator>= (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a >=b) |
template<typename Other , typename T > | |
auto | operator<= (const Other &a, const GeneratorParam< T > &b) -> decltype(a<=(T) b) |
Less than or equal comparison between GeneratorParam<T> and any type that supports operator<= with T. | |
template<typename Other , typename T > | |
auto | operator<= (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a<=b) |
template<typename Other , typename T > | |
auto | operator== (const Other &a, const GeneratorParam< T > &b) -> decltype(a==(T) b) |
Equality comparison between GeneratorParam<T> and any type that supports operator== with T. | |
template<typename Other , typename T > | |
auto | operator== (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a==b) |
template<typename Other , typename T > | |
auto | operator!= (const Other &a, const GeneratorParam< T > &b) -> decltype(a !=(T) b) |
Inequality comparison between GeneratorParam<T> and any type that supports operator!= with T. | |
template<typename Other , typename T > | |
auto | operator!= (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a !=b) |
template<typename Other , typename T > | |
auto | operator&& (const Other &a, const GeneratorParam< T > &b) -> decltype(a &&(T) b) |
Logical and between GeneratorParam<T> and any type that supports operator&& with T. | |
template<typename Other , typename T > | |
auto | operator&& (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a &&b) |
template<typename T > | |
auto | operator&& (const GeneratorParam< T > &a, const GeneratorParam< T > &b) -> decltype((T) a &&(T) b) |
template<typename Other , typename T > | |
auto | operator|| (const Other &a, const GeneratorParam< T > &b) -> decltype(a||(T) b) |
Logical or between GeneratorParam<T> and any type that supports operator|| with T. | |
template<typename Other , typename T > | |
auto | operator|| (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a||b) |
template<typename T > | |
auto | operator|| (const GeneratorParam< T > &a, const GeneratorParam< T > &b) -> decltype((T) a||(T) b) |
template<typename Other , typename T > | |
auto | min (const Other &a, const GeneratorParam< T > &b) -> decltype(Internal::GeneratorMinMax::min_forward(a, b)) |
Compute minimum between GeneratorParam<T> and any type that supports min with T. | |
template<typename Other , typename T > | |
auto | min (const GeneratorParam< T > &a, const Other &b) -> decltype(Internal::GeneratorMinMax::min_forward(a, b)) |
template<typename Other , typename T > | |
auto | max (const Other &a, const GeneratorParam< T > &b) -> decltype(Internal::GeneratorMinMax::max_forward(a, b)) |
Compute the maximum value between GeneratorParam<T> and any type that supports max with T. | |
template<typename Other , typename T > | |
auto | max (const GeneratorParam< T > &a, const Other &b) -> decltype(Internal::GeneratorMinMax::max_forward(a, b)) |
template<typename T > | |
auto | operator! (const GeneratorParam< T > &a) -> decltype(!(T) a) |
Not operator for GeneratorParam. | |
Callable | create_callable_from_generator (const GeneratorContext &context, const std::string &name, const GeneratorParamsMap &generator_params={}) |
Create a Generator from the currently registered Generators and use it to create a Callable. | |
Callable | create_callable_from_generator (const Target &target, const std::string &name, const GeneratorParamsMap &generator_params={}) |
Expr | sum (Expr, const std::string &s="sum") |
An inline reduction. | |
Expr | saturating_sum (Expr, const std::string &s="saturating_sum") |
Expr | product (Expr, const std::string &s="product") |
Expr | maximum (Expr, const std::string &s="maximum") |
Expr | minimum (Expr, const std::string &s="minimum") |
Expr | sum (const RDom &, Expr, const std::string &s="sum") |
Variants of the inline reduction in which the RDom is stated explicitly. | |
Expr | saturating_sum (const RDom &r, Expr e, const std::string &s="saturating_sum") |
Expr | product (const RDom &, Expr, const std::string &s="product") |
Expr | maximum (const RDom &, Expr, const std::string &s="maximum") |
Expr | minimum (const RDom &, Expr, const std::string &s="minimum") |
Tuple | argmax (Expr, const std::string &s="argmax") |
Returns an Expr or Tuple representing the coordinates of the point in the RDom which minimizes or maximizes the expression. | |
Tuple | argmin (Expr, const std::string &s="argmin") |
Tuple | argmax (const RDom &, Expr, const std::string &s="argmax") |
Tuple | argmin (const RDom &, Expr, const std::string &s="argmin") |
Expr | sum (Expr, const Func &) |
Inline reductions create an anonymous helper Func to do the work. | |
Expr | saturating_sum (Expr, const Func &) |
Expr | product (Expr, const Func &) |
Expr | maximum (Expr, const Func &) |
Expr | minimum (Expr, const Func &) |
Expr | sum (const RDom &, Expr, const Func &) |
Expr | saturating_sum (const RDom &r, Expr e, const Func &) |
Expr | product (const RDom &, Expr, const Func &) |
Expr | maximum (const RDom &, Expr, const Func &) |
Expr | minimum (const RDom &, Expr, const Func &) |
Tuple | argmax (Expr, const Func &) |
Tuple | argmin (Expr, const Func &) |
Tuple | argmax (const RDom &, Expr, const Func &) |
Tuple | argmin (const RDom &, Expr, const Func &) |
template<typename T > | |
Expr | cast (Expr a) |
Cast an expression to the Halide type corresponding to the C++ type T. | |
Expr | cast (Type t, Expr a) |
Cast an expression to a new type. | |
Expr | operator+ (Expr a, Expr b) |
Return the sum of two expressions, doing any necessary type coercion using Internal::match_types. | |
Expr | operator+ (Expr a, int b) |
Add an expression and a constant integer. | |
Expr | operator+ (int a, Expr b) |
Add a constant integer and an expression. | |
Expr & | operator+= (Expr &a, Expr b) |
Modify the first expression to be the sum of two expressions, without changing its type. | |
Expr | operator- (Expr a, Expr b) |
Return the difference of two expressions, doing any necessary type coercion using Internal::match_types. | |
Expr | operator- (Expr a, int b) |
Subtracts a constant integer from an expression. | |
Expr | operator- (int a, Expr b) |
Subtracts an expression from a constant integer. | |
Expr | operator- (Expr a) |
Return the negative of the argument. | |
Expr & | operator-= (Expr &a, Expr b) |
Modify the first expression to be the difference of two expressions, without changing its type. | |
Expr | operator* (Expr a, Expr b) |
Return the product of two expressions, doing any necessary type coercion using Internal::match_types. | |
Expr | operator* (Expr a, int b) |
Multiply an expression and a constant integer. | |
Expr | operator* (int a, Expr b) |
Multiply a constant integer and an expression. | |
Expr & | operator*= (Expr &a, Expr b) |
Modify the first expression to be the product of two expressions, without changing its type. | |
Expr | operator/ (Expr a, Expr b) |
Return the ratio of two expressions, doing any necessary type coercion using Internal::match_types. | |
Expr & | operator/= (Expr &a, Expr b) |
Modify the first expression to be the ratio of two expressions, without changing its type. | |
Expr | operator/ (Expr a, int b) |
Divides an expression by a constant integer. | |
Expr | operator/ (int a, Expr b) |
Divides a constant integer by an expression. | |
Expr | operator% (Expr a, Expr b) |
Return the first argument reduced modulo the second, doing any necessary type coercion using Internal::match_types. | |
Expr | operator% (Expr a, int b) |
Mods an expression by a constant integer. | |
Expr | operator% (int a, Expr b) |
Mods a constant integer by an expression. | |
Expr | operator> (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is greater than the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator> (Expr a, int b) |
Return a boolean expression that tests whether an expression is greater than a constant integer. | |
Expr | operator> (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is greater than an expression. | |
Expr | operator< (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is less than the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator< (Expr a, int b) |
Return a boolean expression that tests whether an expression is less than a constant integer. | |
Expr | operator< (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is less than an expression. | |
Expr | operator<= (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is less than or equal to the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator<= (Expr a, int b) |
Return a boolean expression that tests whether an expression is less than or equal to a constant integer. | |
Expr | operator<= (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is less than or equal to an expression. | |
Expr | operator>= (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is greater than or equal to the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator>= (const Expr &a, int b) |
Return a boolean expression that tests whether an expression is greater than or equal to a constant integer. | |
Expr | operator>= (int a, const Expr &b) |
Return a boolean expression that tests whether a constant integer is greater than or equal to an expression. | |
Expr | operator== (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is equal to the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator== (Expr a, int b) |
Return a boolean expression that tests whether an expression is equal to a constant integer. | |
Expr | operator== (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is equal to an expression. | |
Expr | operator!= (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is not equal to the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator!= (Expr a, int b) |
Return a boolean expression that tests whether an expression is not equal to a constant integer. | |
Expr | operator!= (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is not equal to an expression. | |
Expr | operator&& (Expr a, Expr b) |
Returns the logical and of the two arguments. | |
Expr | operator&& (Expr a, bool b) |
Logical and of an Expr and a bool. | |
Expr | operator&& (bool a, Expr b) |
Expr | operator|| (Expr a, Expr b) |
Returns the logical or of the two arguments. | |
Expr | operator|| (Expr a, bool b) |
Logical or of an Expr and a bool. | |
Expr | operator|| (bool a, Expr b) |
Expr | operator! (Expr a) |
Returns the logical not of the argument. | |
Expr | max (Expr a, Expr b) |
Returns an expression representing the greater of the two arguments, after doing any necessary type coercion using Internal::match_types. | |
Expr | max (Expr a, int b) |
Returns an expression representing the greater of an expression and a constant integer. | |
Expr | max (int a, Expr b) |
Returns an expression representing the greater of a constant integer and an expression. | |
Expr | max (float a, Expr b) |
Expr | max (Expr a, float b) |
template<typename A , typename B , typename C , typename... Rest, typename std::enable_if< Halide::Internal::all_are_convertible< Expr, Rest... >::value >::type * = nullptr> | |
Expr | max (A &&a, B &&b, C &&c, Rest &&...rest) |
Returns an expression representing the greatest of several expressions, after doing any necessary type coercion using Internal::match_types. | |
Expr | min (Expr a, Expr b) |
Expr | min (Expr a, int b) |
Returns an expression representing the lesser of an expression and a constant integer. | |
Expr | min (int a, Expr b) |
Returns an expression representing the lesser of a constant integer and an expression. | |
Expr | min (float a, Expr b) |
Expr | min (Expr a, float b) |
template<typename A , typename B , typename C , typename... Rest, typename std::enable_if< Halide::Internal::all_are_convertible< Expr, Rest... >::value >::type * = nullptr> | |
Expr | min (A &&a, B &&b, C &&c, Rest &&...rest) |
Returns an expression representing the least of several expressions, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator+ (Expr a, float b) |
Operators on floats treat those floats as Exprs. | |
Expr | operator+ (float a, Expr b) |
Expr | operator- (Expr a, float b) |
Expr | operator- (float a, Expr b) |
Expr | operator* (Expr a, float b) |
Expr | operator* (float a, Expr b) |
Expr | operator/ (Expr a, float b) |
Expr | operator/ (float a, Expr b) |
Expr | operator% (Expr a, float b) |
Expr | operator% (float a, Expr b) |
Expr | operator> (Expr a, float b) |
Expr | operator> (float a, Expr b) |
Expr | operator< (Expr a, float b) |
Expr | operator< (float a, Expr b) |
Expr | operator>= (Expr a, float b) |
Expr | operator>= (float a, Expr b) |
Expr | operator<= (Expr a, float b) |
Expr | operator<= (float a, Expr b) |
Expr | operator== (Expr a, float b) |
Expr | operator== (float a, Expr b) |
Expr | operator!= (Expr a, float b) |
Expr | operator!= (float a, Expr b) |
Expr | clamp (Expr a, const Expr &min_val, const Expr &max_val) |
Clamps an expression to lie within the given bounds. | |
Expr | abs (Expr a) |
Returns the absolute value of a signed integer or floating-point expression. | |
Expr | absd (Expr a, Expr b) |
Return the absolute difference between two values. | |
Expr | select (Expr condition, Expr true_value, Expr false_value) |
Returns an expression similar to the ternary operator in C, except that it always evaluates all arguments. | |
template<typename... Args, typename std::enable_if< Halide::Internal::all_are_convertible< Expr, Args... >::value >::type * = nullptr> | |
Expr | select (Expr c0, Expr v0, Expr c1, Expr v1, Args &&...args) |
A multi-way variant of select similar to a switch statement in C, which can accept multiple conditions and values in pairs. | |
Tuple | select (const Tuple &condition, const Tuple &true_value, const Tuple &false_value) |
Equivalent of ternary select(), but taking/returning tuples. | |
Tuple | select (const Expr &condition, const Tuple &true_value, const Tuple &false_value) |
template<typename... Args> | |
Tuple | select (const Tuple &c0, const Tuple &v0, const Tuple &c1, const Tuple &v1, Args &&...args) |
Equivalent of multiway select(), but taking/returning tuples. | |
template<typename... Args> | |
Tuple | select (const Expr &c0, const Tuple &v0, const Expr &c1, const Tuple &v1, Args &&...args) |
Expr | select (const Expr &condition, const FuncRef &true_value, const FuncRef &false_value) |
select applied to FuncRefs (e.g. | |
template<typename... Args> | |
Expr | select (const Expr &c0, const FuncRef &v0, const Expr &c1, const FuncRef &v1, Args &&...args) |
Expr | mux (const Expr &id, const std::initializer_list< Expr > &values) |
Oftentimes we want to pack a list of expressions with the same type into a channel dimension, e.g., img(x, y, c) = select(c == 0, 100, c == 1, 50, 25), giving 100 for red, 50 for green, and 25 for blue. This is tedious when the list is long. | |
Expr | mux (const Expr &id, const std::vector< Expr > &values) |
Expr | mux (const Expr &id, const Tuple &values) |
Expr | mux (const Expr &id, const std::initializer_list< FuncRef > &values) |
Tuple | mux (const Expr &id, const std::initializer_list< Tuple > &values) |
Tuple | mux (const Expr &id, const std::vector< Tuple > &values) |
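The relationship between a chained select and mux can be sketched on concrete integers in plain C++. This is a scalar model only, not Halide code: `select3` and `mux_model` are hypothetical names, and the fall-through to the last value for an out-of-range index is an assumption taken from the select chain shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of select(c0, v0, c1, v1, v2):
// the first true condition picks its value, otherwise the final value is used.
int select3(bool c0, int v0, bool c1, int v1, int v2) {
    return c0 ? v0 : (c1 ? v1 : v2);
}

// Hypothetical scalar model of mux(id, values), written as the chain
// id == 0 -> values[0], id == 1 -> values[1], ..., else values.back().
// Assumption: any id that matches no earlier entry falls through to the last value.
int mux_model(int id, const std::vector<int> &values) {
    assert(!values.empty());
    for (std::size_t i = 0; i + 1 < values.size(); ++i) {
        if (id == static_cast<int>(i)) {
            return values[i];
        }
    }
    return values.back();
}
```

For the three-channel example, `mux_model(c, {100, 50, 25})` and `select3(c == 0, 100, c == 1, 50, 25)` agree for every `c`.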
Expr | sin (Expr x) |
Return the sine of a floating-point expression. | |
Expr | asin (Expr x) |
Return the arcsine of a floating-point expression. | |
Expr | cos (Expr x) |
Return the cosine of a floating-point expression. | |
Expr | acos (Expr x) |
Return the arccosine of a floating-point expression. | |
Expr | tan (Expr x) |
Return the tangent of a floating-point expression. | |
Expr | atan (Expr x) |
Return the arctangent of a floating-point expression. | |
Expr | atan2 (Expr y, Expr x) |
Return the angle of a floating-point gradient. | |
Expr | sinh (Expr x) |
Return the hyperbolic sine of a floating-point expression. | |
Expr | asinh (Expr x) |
Return the hyperbolic arcsine of a floating-point expression. | |
Expr | cosh (Expr x) |
Return the hyperbolic cosine of a floating-point expression. | |
Expr | acosh (Expr x) |
Return the hyperbolic arccosine of a floating-point expression. | |
Expr | tanh (Expr x) |
Return the hyperbolic tangent of a floating-point expression. | |
Expr | atanh (Expr x) |
Return the hyperbolic arctangent of a floating-point expression. | |
Expr | sqrt (Expr x) |
Return the square root of a floating-point expression. | |
Expr | hypot (const Expr &x, const Expr &y) |
Return the square root of the sum of the squares of two floating-point expressions. | |
Expr | exp (Expr x) |
Return the exponential of a floating-point expression. | |
Expr | log (Expr x) |
Return the logarithm of a floating-point expression. | |
Expr | pow (Expr x, Expr y) |
Return one floating point expression raised to the power of another. | |
Expr | erf (const Expr &x) |
Evaluate the error function erf. | |
Expr | fast_sin (const Expr &x) |
Fast vectorizable approximation to some trigonometric functions for Float(32). | |
Expr | fast_cos (const Expr &x) |
Expr | fast_log (const Expr &x) |
Fast approximate cleanly vectorizable log for Float(32). | |
Expr | fast_exp (const Expr &x) |
Fast approximate cleanly vectorizable exp for Float(32). | |
Expr | fast_pow (Expr x, Expr y) |
Fast approximate cleanly vectorizable pow for Float(32). | |
Expr | fast_inverse (Expr x) |
Fast approximate inverse for Float(32). | |
Expr | fast_inverse_sqrt (Expr x) |
Fast approximate inverse square root for Float(32). | |
Expr | floor (Expr x) |
Return the greatest whole number less than or equal to a floating-point expression. | |
Expr | ceil (Expr x) |
Return the least whole number greater than or equal to a floating-point expression. | |
Expr | round (Expr x) |
Return the whole number closest to a floating-point expression. | |
Expr | trunc (Expr x) |
Return the integer part of a floating-point expression. | |
Expr | is_nan (Expr x) |
Returns true if the argument is Not a Number (NaN). | |
Expr | is_inf (Expr x) |
Returns true if the argument is Inf or -Inf. | |
Expr | is_finite (Expr x) |
Returns true if the argument is a finite value (i.e., neither NaN nor Inf). | |
Expr | fract (const Expr &x) |
Return the fractional part of a floating-point expression. | |
Expr | reinterpret (Type t, Expr e) |
Reinterpret the bits of one value as another type. | |
template<typename T > | |
Expr | reinterpret (Expr e) |
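What "reinterpret the bits" means can be illustrated with a plain C++ scalar model. This sketch is not Halide code: `float_bits` and `bits_to_float` are hypothetical helpers, it covers only two equal-sized types, and the bit patterns in the usage note assume an IEEE 754 binary32 `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical scalar model of reinterpreting a float as a 32-bit unsigned
// integer: the bits are copied unchanged, no numeric conversion happens.
std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// The inverse direction: view a 32-bit pattern as a float.
float bits_to_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

Under that assumption `float_bits(1.0f)` yields `0x3f800000`, whereas a numeric cast of `1.0f` to an integer would yield `1`.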
Expr | operator& (Expr x, Expr y) |
Return the bitwise and of two expressions (which need not have the same type). | |
Expr | operator& (Expr x, int y) |
Return the bitwise and of an expression and an integer. | |
Expr | operator& (int x, Expr y) |
Expr | operator| (Expr x, Expr y) |
Return the bitwise or of two expressions (which need not have the same type). | |
Expr | operator| (Expr x, int y) |
Return the bitwise or of an expression and an integer. | |
Expr | operator| (int x, Expr y) |
Expr | operator^ (Expr x, Expr y) |
Return the bitwise xor of two expressions (which need not have the same type). | |
Expr | operator^ (Expr x, int y) |
Return the bitwise xor of an expression and an integer. | |
Expr | operator^ (int x, Expr y) |
Expr | operator~ (Expr x) |
Return the bitwise not of an expression. | |
Expr | operator<< (Expr x, Expr y) |
Shift the bits of an integer value left. | |
Expr | operator<< (Expr x, int y) |
Expr | operator>> (Expr x, Expr y) |
Shift the bits of an integer value right. | |
Expr | operator>> (Expr x, int y) |
Expr | lerp (Expr zero_val, Expr one_val, Expr weight) |
Linear interpolate between the two values according to a weight. | |
Expr | popcount (Expr x) |
Count the number of set bits in an expression. | |
Expr | count_leading_zeros (Expr x) |
Count the number of leading zero bits in an expression. | |
Expr | count_trailing_zeros (Expr x) |
Count the number of trailing zero bits in an expression. | |
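The three bit-counting operations can be sketched as plain C++ loops over a 32-bit value. This is a scalar model with hypothetical names, not Halide code. The result of 32 for a zero input in the leading/trailing counts is a property of this model only and is an assumption, not a statement about Halide's behaviour.

```cpp
#include <cassert>
#include <cstdint>

// Number of set bits in x.
int popcount_model(std::uint32_t x) {
    int n = 0;
    while (x != 0) {
        n += static_cast<int>(x & 1u);
        x >>= 1;
    }
    return n;
}

// Number of zero bits above the most significant set bit.
int clz_model(std::uint32_t x) {
    int n = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1) {
        ++n;
    }
    return n;
}

// Number of zero bits below the least significant set bit.
int ctz_model(std::uint32_t x) {
    int n = 0;
    for (std::uint32_t mask = 1u; mask != 0 && (x & mask) == 0; mask <<= 1) {
        ++n;
    }
    return n;
}
```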
Expr | div_round_to_zero (Expr x, Expr y) |
Divide two integers, rounding towards zero. | |
Expr | mod_round_to_zero (Expr x, Expr y) |
Compute the remainder of dividing two integers, when division is rounding toward zero. | |
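Rounding toward zero is what C++'s built-in `/` and `%` do for signed integers, so they can serve as a reference model for these two functions. The sketch below uses hypothetical names and is not Halide code; `floor_div` is included only as a contrast with a division that rounds toward negative infinity. The divisor is assumed to be nonzero.

```cpp
#include <cassert>

// Reference model: C++ integer division truncates toward zero (since C++11).
int div_round_to_zero_model(int x, int y) {
    assert(y != 0);
    return x / y;
}

// The matching remainder takes the sign of the dividend.
int mod_round_to_zero_model(int x, int y) {
    assert(y != 0);
    return x % y;
}

// For contrast only: a division that rounds toward negative infinity.
int floor_div(int x, int y) {
    assert(y != 0);
    int q = x / y;
    if ((x % y != 0) && ((x < 0) != (y < 0))) {
        --q;
    }
    return q;
}
```

With `x = -7` and `y = 2`, the round-to-zero pair gives quotient `-3` and remainder `-1`, while `floor_div` gives `-4`. In both conventions `x == q * y + r` holds for the matching remainder.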
Expr | random_float (Expr seed=Expr()) |
Return a random variable representing a uniformly distributed float in the half-open interval [0.0f, 1.0f). | |
Expr | random_uint (Expr seed=Expr()) |
Return a random variable representing a uniformly distributed unsigned 32-bit integer. | |
Expr | random_int (Expr seed=Expr()) |
Return a random variable representing a uniformly distributed 32-bit integer. | |
Expr | print (const std::vector< Expr > &values) |
Create an Expr that prints out its value whenever it is evaluated. | |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE Expr | print (Expr a, Args &&...args) |
Expr | print_when (Expr condition, const std::vector< Expr > &values) |
Create an Expr that prints whenever it is evaluated, provided that the condition is true. | |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE Expr | print_when (Expr condition, Expr a, Args &&...args) |
Expr | require (Expr condition, const std::vector< Expr > &values) |
Create an Expr that guarantees a precondition. | |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE Expr | require (Expr condition, Expr value, Args &&...args) |
Expr | undef (Type t) |
Return an undef value of the given type. | |
template<typename T > | |
Expr | undef () |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE Expr | memoize_tag (Expr result, Args &&...args) |
Control the values used in the memoization cache key for memoize. | |
Expr | likely (Expr e) |
Expressions tagged with this intrinsic are considered to be part of the steady state of some loop with a nasty beginning and end (e.g. | |
Expr | likely_if_innermost (Expr e) |
Equivalent to likely, but only triggers a loop partitioning if found in an innermost loop. | |
template<typename T > | |
Expr | saturating_cast (Expr e) |
Cast an expression to the halide type corresponding to the C++ type T. | |
Expr | saturating_cast (Type t, Expr e) |
Cast an expression to a new type, clamping to the minimum and maximum values of the result type. | |
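The clamping behaviour described for saturating_cast can be sketched in plain C++ for one concrete case, a 32-bit signed value narrowed to 8 bits unsigned. `saturating_cast_u8` is a hypothetical scalar model, not Halide code.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical scalar model of a saturating cast from int32 to uint8:
// values outside [0, 255] are clamped to the nearest representable value.
std::uint8_t saturating_cast_u8(std::int32_t v) {
    const std::int32_t lo = std::numeric_limits<std::uint8_t>::min();
    const std::int32_t hi = std::numeric_limits<std::uint8_t>::max();
    if (v < lo) {
        return static_cast<std::uint8_t>(lo);
    }
    if (v > hi) {
        return static_cast<std::uint8_t>(hi);
    }
    return static_cast<std::uint8_t>(v);
}
```

By contrast, an ordinary `static_cast<std::uint8_t>(300)` wraps modulo 256 and yields `44`, while the saturating model yields `255`.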
Expr | strict_float (Expr e) |
Makes a best effort attempt to preserve IEEE floating-point semantics in evaluating an expression. | |
Expr | unsafe_promise_clamped (const Expr &value, const Expr &min, const Expr &max) |
Create an Expr that promises another Expr is clamped but does not generate code to check the assertion or modify the value. | |
Expr | scatter (const std::vector< Expr > &args) |
Scatter and gather are used for update definition which must store multiple values to distinct locations at the same time. | |
Expr | gather (const std::vector< Expr > &args) |
template<typename... Args> | |
Expr | scatter (const Expr &e, Args &&...args) |
template<typename... Args> | |
Expr | gather (const Expr &e, Args &&...args) |
Expr | extract_bits (Type t, const Expr &e, const Expr &lsb) |
Extract a contiguous subsequence of the bits of 'e', starting at the bit index given by 'lsb', where zero is the least-significant bit, returning a value of type 't'. | |
template<typename T > | |
Expr | extract_bits (const Expr &e, const Expr &lsb) |
Expr | concat_bits (const std::vector< Expr > &e) |
Given a number of Exprs of the same type, concatenate their bits producing a single Expr of the same type code of the input but with more bits. | |
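The indexing convention of extract_bits (bit zero is the least significant) can be sketched in plain C++ for one concrete case: pulling 8 bits out of a 32-bit value. `extract_bits_u8` is a hypothetical scalar model, not Halide code, and it only models `lsb` values for which every requested bit exists in the input.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of extracting a uint8 from a uint32:
// take 8 contiguous bits starting at bit index lsb (0 = least significant).
// Only lsb in [0, 24] is modelled here.
std::uint8_t extract_bits_u8(std::uint32_t e, int lsb) {
    assert(lsb >= 0 && lsb <= 24);
    return static_cast<std::uint8_t>((e >> lsb) & 0xffu);
}
```

For `e = 0x12345678`, an `lsb` of 0 selects `0x78`, 8 selects `0x56`, and 4 selects `0x67`, which straddles two bytes.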
Expr | widen_right_add (Expr a, Expr b) |
Below is a collection of intrinsics for fixed-point programming. | |
Expr | widen_right_mul (Expr a, Expr b) |
Compute a * widen(b). | |
Expr | widen_right_sub (Expr a, Expr b) |
Compute a - widen(b). | |
Expr | widening_add (Expr a, Expr b) |
Compute widen(a) + widen(b). | |
Expr | widening_mul (Expr a, Expr b) |
Compute widen(a) * widen(b). | |
Expr | widening_sub (Expr a, Expr b) |
Compute widen(a) - widen(b). | |
Expr | widening_shift_left (Expr a, Expr b) |
Compute widen(a) << b. | |
Expr | widening_shift_left (Expr a, int b) |
Expr | widening_shift_right (Expr a, Expr b) |
Compute widen(a) >> b. | |
Expr | widening_shift_right (Expr a, int b) |
Expr | rounding_shift_left (Expr a, Expr b) |
Compute saturating_narrow(widening_add(a, (1 >> min(b, 0)) / 2) << b). | |
Expr | rounding_shift_left (Expr a, int b) |
Expr | rounding_shift_right (Expr a, Expr b) |
Compute saturating_narrow(widening_add(a, (1 << max(b, 0)) / 2) >> b). | |
Expr | rounding_shift_right (Expr a, int b) |
Expr | saturating_add (Expr a, Expr b) |
Compute saturating_narrow(widen(a) + widen(b)) | |
Expr | saturating_sub (Expr a, Expr b) |
Compute saturating_narrow(widen(a) - widen(b)) | |
Expr | halving_add (Expr a, Expr b) |
Compute narrow((widen(a) + widen(b)) / 2) | |
Expr | rounding_halving_add (Expr a, Expr b) |
Compute narrow((widen(a) + widen(b) + 1) / 2) | |
Expr | halving_sub (Expr a, Expr b) |
Compute narrow((widen(a) - widen(b)) / 2) | |
Expr | mul_shift_right (Expr a, Expr b, Expr q) |
Compute saturating_narrow(shift_right(widening_mul(a, b), q)) | |
Expr | mul_shift_right (Expr a, Expr b, int q) |
Expr | rounding_mul_shift_right (Expr a, Expr b, Expr q) |
Compute saturating_narrow(rounding_shift_right(widening_mul(a, b), q)) | |
Expr | rounding_mul_shift_right (Expr a, Expr b, int q) |
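The formulas listed for the fixed-point intrinsics can be checked against a plain C++ scalar model for 8-bit unsigned operands. This is a sketch with hypothetical names, not Halide code: `widen` promotes to 16 bits, `saturating_narrow` clamps back to 8 bits, and `rounding_shift_right_u8` only models non-negative shift amounts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// "widen" for uint8_t: a type with twice as many bits.
std::uint16_t widen(std::uint8_t v) { return v; }

// Clamp into the uint8 range, then narrow.
std::uint8_t saturating_narrow(int v) {
    return static_cast<std::uint8_t>(std::min(std::max(v, 0), 255));
}

// saturating_narrow(widen(a) + widen(b))
std::uint8_t saturating_add_u8(std::uint8_t a, std::uint8_t b) {
    return saturating_narrow(widen(a) + widen(b));
}

// saturating_narrow(widen(a) - widen(b))
std::uint8_t saturating_sub_u8(std::uint8_t a, std::uint8_t b) {
    return saturating_narrow(widen(a) - widen(b));
}

// narrow((widen(a) + widen(b)) / 2): the sum cannot overflow once widened.
std::uint8_t halving_add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((widen(a) + widen(b)) / 2);
}

// narrow((widen(a) + widen(b) + 1) / 2)
std::uint8_t rounding_halving_add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((widen(a) + widen(b) + 1) / 2);
}

// saturating_narrow((widen(a) + (1 << b) / 2) >> b), modelled for 0 <= b < 8.
std::uint8_t rounding_shift_right_u8(std::uint8_t a, int b) {
    assert(b >= 0 && b < 8);
    return saturating_narrow((widen(a) + (1 << b) / 2) >> b);
}
```

For example, `saturating_add_u8(200, 100)` clamps to `255`, `halving_add_u8(1, 2)` gives `1` while `rounding_halving_add_u8(1, 2)` gives `2`, and `rounding_shift_right_u8(7, 2)` gives `2` (7/4 rounded to nearest).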
Expr | target_arch_is (Target::Arch arch) |
Return a boolean Expr for the corresponding field of the Target being used during lowering; they can be useful in writing library code without having to plumb a Target through call sites, so that you can do things like. | |
Expr | target_os_is (Target::OS os) |
Expr | target_has_feature (Target::Feature feat) |
Expr | target_bits () |
Return the bit width of the Target used during lowering; this can be useful in writing library code without having to plumb a Target through call sites, so that you can do things like. | |
Expr | target_natural_vector_size (Type t) |
Return the natural vector width for the given Type for the Target being used during lowering; this can be useful in writing library code without having to plumb a Target through call sites, so that you can do things like. | |
template<typename data_t > | |
Expr | target_natural_vector_size () |
std::ostream & | operator<< (std::ostream &stream, const DeviceAPI &) |
Emit a halide device api type in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const MemoryType &) |
Emit a halide memory type in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const TailStrategy &) |
Emit a halide tail strategy in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const Partition &) |
Emit a halide loop partitioning policy in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const LoopLevel &) |
Emit a halide LoopLevel in human-readable form. | |
Func | lambda (const Expr &e) |
Create a zero-dimensional halide function that returns the given expression. | |
Func | lambda (const Var &x, const Expr &e) |
Create a 1-D halide function in the first argument that returns the second argument. | |
Func | lambda (const Var &x, const Var &y, const Expr &e) |
Create a 2-D halide function in the first two arguments that returns the last argument. | |
Func | lambda (const Var &x, const Var &y, const Var &z, const Expr &e) |
Create a 3-D halide function in the first three arguments that returns the last argument. | |
Func | lambda (const Var &x, const Var &y, const Var &z, const Var &w, const Expr &e) |
Create a 4-D halide function in the first four arguments that returns the last argument. | |
Func | lambda (const Var &x, const Var &y, const Var &z, const Var &w, const Var &v, const Expr &e) |
Create a 5-D halide function in the first five arguments that returns the last argument. | |
std::unique_ptr< llvm::Module > | compile_module_to_llvm_module (const Module &module, llvm::LLVMContext &context) |
Generate an LLVM module. | |
std::unique_ptr< llvm::raw_fd_ostream > | make_raw_fd_ostream (const std::string &filename) |
Construct an llvm output stream for writing to files. | |
void | compile_llvm_module_to_object (llvm::Module &module, Internal::LLVMOStream &out) |
Compile an LLVM module to native targets (objects, native assembly). | |
void | compile_llvm_module_to_assembly (llvm::Module &module, Internal::LLVMOStream &out) |
void | compile_llvm_module_to_llvm_bitcode (llvm::Module &module, Internal::LLVMOStream &out) |
Compile an LLVM module to LLVM targets (bitcode, LLVM assembly). | |
void | compile_llvm_module_to_llvm_assembly (llvm::Module &module, Internal::LLVMOStream &out) |
void | create_static_library (const std::vector< std::string > &src_files, const Target &target, const std::string &dst_file, bool deterministic=true) |
Concatenate the list of src_files into dst_file, using the appropriate static library format for the given target (e.g., .a or .lib). | |
Module | link_modules (const std::string &name, const std::vector< Module > &modules) |
Link a set of modules together into one module. | |
void | compile_standalone_runtime (const std::string &object_filename, const Target &t) |
Create an object file containing the Halide runtime for a given target. | |
std::map< OutputFileType, std::string > | compile_standalone_runtime (const std::map< OutputFileType, std::string > &output_files, const Target &t) |
Create an object and/or static library file containing the Halide runtime for a given target. | |
void | compile_multitarget (const std::string &fn_name, const std::map< OutputFileType, std::string > &output_files, const std::vector< Target > &targets, const std::vector< std::string > &suffixes, const ModuleFactory &module_factory, const CompilerLoggerFactory &compiler_logger_factory=nullptr) |
Expr | user_context_value () |
Returns an Expr corresponding to the user context passed to the function (if any). | |
std::ostream & | operator<< (std::ostream &stream, const RVar &) |
Emit an RVar in a human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const RDom &) |
Emit an RDom in a human-readable form. | |
void | serialize_pipeline (const Pipeline &pipeline, std::vector< uint8_t > &data) |
Serialize a Halide pipeline into the given data buffer. | |
void | serialize_pipeline (const Pipeline &pipeline, std::vector< uint8_t > &data, std::map< std::string, Parameter > &params) |
Serialize a Halide pipeline into the given data buffer. | |
void | serialize_pipeline (const Pipeline &pipeline, const std::string &filename) |
Serialize a Halide pipeline into the given filename. | |
void | serialize_pipeline (const Pipeline &pipeline, const std::string &filename, std::map< std::string, Parameter > &params) |
Serialize a Halide pipeline into the given filename. | |
Target | get_host_target () |
Return the target corresponding to the host machine. | |
Target | get_target_from_environment () |
Return the target that Halide will use. | |
Target | get_jit_target_from_environment () |
Return the target that Halide will use for jit-compilation. | |
Target::Feature | target_feature_for_device_api (DeviceAPI api) |
Get the Target feature corresponding to a DeviceAPI. | |
Type | Int (int bits, int lanes=1) |
Construct a signed integer type. | |
Type | UInt (int bits, int lanes=1) |
Construct an unsigned integer type. | |
Type | Float (int bits, int lanes=1) |
Construct a floating-point type. | |
Type | BFloat (int bits, int lanes=1) |
Construct a floating-point type in the bfloat format. | |
Type | Bool (int lanes=1) |
Construct a boolean type. | |
Type | Handle (int lanes=1, const halide_handle_cplusplus_type *handle_type=nullptr) |
Construct a handle type. | |
template<typename T > | |
Type | type_of () |
Construct the halide equivalent of a C type. | |
std::string | type_to_c_type (Type type, bool include_space, bool c_plus_plus=true) |
Convert a Halide type to the name of the corresponding C++ type. | |
void | load_plugin (const std::string &lib_name) |
Load a plugin in the form of a dynamic library (e.g. | |
void | set_compiler_stack_size (size_t) |
Set how much stack the compiler should use for compilation in bytes. | |
size_t | get_compiler_stack_size () |
Return how much stack size the compiler should use for calls that go through run_with_large_stack below. | |
template<typename T > | |
T | pick_value_in_vector (FuzzedDataProvider &fdp, std::vector< T > &vec) |
Variables | |
const int | head1_channels = 8 |
const int | head1_w = 40 |
const int | head1_h = 7 |
const int | head2_channels = 24 |
const int | head2_w = 39 |
const int | conv1_channels = 32 |
constexpr int | AnyDims = Halide::Runtime::AnyDims |
const DeviceAPI | all_device_apis [] |
An array containing all the device apis. | |
constexpr size_t | default_compiler_stack_size = 32 * 1024 * 1024 |
The default amount of stack used for lowering and codegen. | |
This file defines the class FunctionDAG, which is our representation of a Halide pipeline, and contains methods that use Halide's bounds tools to query properties of it.
Defines methods for manipulating and analyzing boolean expressions.
This file defines the LoopNest, which is our representation of a Halide schedule, and contains methods to generate candidates for scheduling as well as extract a featurization that can be used to cost each candidate.
using Halide::GeneratorParamsMap = std::map<std::string, std::string> |
Definition at line 22 of file AbstractGenerator.h.
typedef std::vector<Range> Halide::Region |
typedef Stage Halide::ScheduleHandle |
using Halide::MetadataNameMap = std::map<std::string, std::string> |
using Halide::ModuleFactory = std::function<Module(const std::string &fn_name, const Target &target)> |
using Halide::CompilerLoggerFactory = std::function<std::unique_ptr<Internal::CompilerLogger>(const std::string &fn_name, const Target &target)> |
using Halide::AutoSchedulerFn = std::function<void(const Pipeline &, const Target &, const AutoschedulerParams &, AutoSchedulerResults *outputs)> |
Definition at line 103 of file Pipeline.h.
An enum describing a type of device API.
Used by schedules, and in the For loop IR node.
Enumerator | |
---|---|
None | |
Host | Used to denote for loops that run on the same device as the containing code. |
Default_GPU | |
CUDA | |
OpenCL | |
Metal | |
Hexagon | |
HexagonDma | |
D3D12Compute | |
Vulkan | |
WebGPU |
Definition at line 15 of file DeviceAPI.h.
An enum describing different address spaces to be used with Func::store_in.
Enumerator | |
---|---|
Auto | Let Halide select a storage type automatically. |
Heap | Heap/global memory. Allocated using halide_malloc or halide_device_malloc. |
Stack | Stack memory. Allocated using alloca. Requires a constant size. Corresponds to per-thread local memory on the GPU. If all accesses are at constant coordinates, may be promoted into the register file at the discretion of the register allocator. |
Register | Register memory. The allocation should be promoted into the register file. All stores must be at constant coordinates. May be spilled to the stack at the discretion of the register allocator. |
GPUShared | Allocation is stored in GPU shared memory. Also known as "local" in OpenCL, and "threadgroup" in Metal. Can be shared across GPU threads within the same block. |
GPUTexture | Allocation is stored in GPU texture memory and accessed through hardware sampler. |
LockedCache | Allocate Locked Cache Memory to act as local memory. |
VTCM | Vector Tightly Coupled Memory. HVX (Hexagon) local memory available on v65+. This memory has higher performance and lower power. Ideal for intermediate buffers. Necessary for vgather-vscatter instructions on Hexagon |
AMXTile | AMX Tile register for X86. Any data that would be used in an AMX matrix multiplication must first be loaded into an AMX tile register. |
An enum to specify calling convention for extern stages.
Enumerator | |
---|---|
Default | Match whatever is specified in the Target. |
C | No name mangling. |
CPlusPlus | C++ name mangling. |
Definition at line 26 of file Function.h.
Different ways to handle loops with potentially optimizable boundary conditions.
Enumerator | |
---|---|
Auto | Automatically let Halide decide on loop partitioning. |
Never | Disallow loop partitioning. |
Always | Force partitioning of the loop, even in the tail cases of outer partitioned loops. If Halide can't find a way to partition this loop, it will raise an error. |
Definition at line 16 of file LoopPartitioningDirective.h.
Enums specifying various kinds of outputs that can be produced from a Halide Pipeline.
Type of linkage a function in a lowered Halide module can have.
Also controls whether auxiliary functions and metadata are generated.
Enumerator | |
---|---|
External | Visible externally. |
ExternalPlusMetadata | Visible externally. Argument metadata and an argv wrapper are also generated. |
ExternalPlusArgv | Visible externally. Argv wrapper is generated but not argument metadata. |
Internal | Not visible externally, similar to 'static' linkage in C. |
Used to determine if the output printed to file should be as a normal string or as an HTML file which can be opened in a browser and manipulated via JS and CSS.
Enumerator | |
---|---|
Text | |
HTML |
Definition at line 72 of file Pipeline.h.
Different ways to handle accesses outside the original extents in a prefetch.
Definition at line 16 of file PrefetchDirective.h.
Different ways to handle a tail case in a split when the factor does not provably divide the extent.
Enumerator | |
---|---|
RoundUp | Round up the extent to be a multiple of the split factor. Not legal for RVars, as it would change the meaning of the algorithm. Pros: generates the simplest, fastest code. Cons: if used on a stage that reads from the input or writes to the output, constrains the input or output size to be a multiple of the split factor. |
GuardWithIf | Guard the inner loop with an if statement that prevents evaluation beyond the original extent. Always legal. The if statement is treated like a boundary condition, and factored out into a loop epilogue if possible. Pros: no redundant re-evaluation; does not constrain input or output sizes. Cons: increases code size due to separate tail-case handling; vectorization will scalarize in the tail case to handle the if statement. |
Predicate | Guard the loads and stores in the loop with an if statement that prevents evaluation beyond the original extent. Always legal. The if statement is treated like a boundary condition, and factored out into a loop epilogue if possible. Pros: no redundant re-evaluation; does not constrain input or output sizes. Cons: increases code size due to separate tail-case handling. |
PredicateLoads | Guard the loads in the loop with an if statement that prevents evaluation beyond the original extent. Only legal for innermost splits. Not legal for RVars, as it would change the meaning of the algorithm. The if statement is treated like a boundary condition, and factored out into a loop epilogue if possible. Pros: does not constrain input sizes, output size constraints are simpler than full predication. Cons: increases code size due to separate tail-case handling, constrains the output size to be a multiple of the split factor. |
PredicateStores | Guard the stores in the loop with an if statement that prevents evaluation beyond the original extent. Only legal for innermost splits. Not legal for RVars, as it would change the meaning of the algorithm. The if statement is treated like a boundary condition, and factored out into a loop epilogue if possible. Pros: does not constrain output sizes, input size constraints are simpler than full predication. Cons: increases code size due to separate tail-case handling, constrains the input size to be a multiple of the split factor. |
ShiftInwards | Prevent evaluation beyond the original extent by shifting the tail case inwards, re-evaluating some points near the end. Only legal for pure variables in pure definitions. If the inner loop is very simple, the tail case is treated like a boundary condition and factored out into an epilogue. This is a good trade-off between several factors. Like RoundUp, it supports vectorization well, because the inner loop is always a fixed size with no data-dependent branching. It increases code size slightly for inner loops due to the epilogue handling, but not for outer loops (e.g. loops over tiles). If used on a stage that reads from an input or writes to an output, this strategy only requires that the input/output extent be at least the split factor, instead of a multiple of the split factor as with RoundUp. |
ShiftInwardsAndBlend | Equivalent to ShiftInwards, but protects values that would be re-evaluated by loading the memory location that would be stored to, modifying only the elements not contained within the overlap, and then storing the blended result. This tail strategy is useful when you want to use ShiftInwards to vectorize without a scalar tail, but are scheduling a stage where that isn't legal (e.g. an update definition). Because this is a read - modify - write, this tail strategy cannot be used on any dimension the stage is parallelized over as it would cause a race condition. |
RoundUpAndBlend | Equivalent to RoundUp, but protects values that would be written beyond the end by loading the memory location that would be stored to, modifying only the elements within the region being computed, and then storing the blended result. This tail strategy is useful when vectorizing an update to some sub-region of a larger Func. As with ShiftInwardsAndBlend, it can't be combined with parallelism. |
Auto | For pure definitions use ShiftInwards. For pure vars in update definitions use RoundUp. For RVars in update definitions use GuardWithIf. |
Definition at line 33 of file Schedule.h.
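The difference between RoundUp, GuardWithIf, and ShiftInwards can be made concrete by listing which loop indices each one evaluates when a loop over [0, extent) is split by a factor that does not divide the extent. The following is a plain C++ sketch of that index arithmetic under stated assumptions, not Halide code: `TailModel` and `visited_indices` are hypothetical names, and ShiftInwards is modelled by clamping the start of the last tile to `extent - factor`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model of three of the tail strategies described above.
enum class TailModel { RoundUp, GuardWithIf, ShiftInwards };

// Returns, in order, the indices evaluated when a loop over [0, extent)
// is split into an outer loop and an inner loop of size `factor`.
std::vector<int> visited_indices(int extent, int factor, TailModel strategy) {
    assert(extent > 0 && factor > 0);
    std::vector<int> visited;
    const int outer_count = (extent + factor - 1) / factor;  // ceil(extent / factor)
    for (int outer = 0; outer < outer_count; ++outer) {
        int base = outer * factor;
        if (strategy == TailModel::ShiftInwards) {
            // Requires the extent to be at least the split factor.
            assert(extent >= factor);
            base = std::min(base, extent - factor);  // shift the last tile inwards
        }
        for (int inner = 0; inner < factor; ++inner) {
            const int x = base + inner;
            if (strategy == TailModel::GuardWithIf && x >= extent) {
                continue;  // the guard skips points beyond the original extent
            }
            visited.push_back(x);
        }
    }
    return visited;
}
```

With `extent = 10` and `factor = 4`, RoundUp evaluates indices 0 through 11 (two past the end), GuardWithIf evaluates exactly 0 through 9, and ShiftInwards evaluates 0 through 7 followed by 6 through 9, so indices 6 and 7 are computed twice but nothing outside the extent is touched.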
Different ways to handle the case when the start/end of the loops of stages computed with (fused) are not aligned.
Definition at line 137 of file Schedule.h.
std::unique_ptr< DefaultCostModel > Halide::make_default_cost_model | ( | const std::string & | weights_in_dir = "", |
const std::string & | weights_out_dir = "", | ||
bool | randomize_weights = false ) |
std::unique_ptr< DefaultCostModel > Halide::make_default_cost_model | ( | Internal::Autoscheduler::Statistics & | stats, |
const std::string & | weights_in_dir = "", | ||
const std::string & | weights_out_dir = "", | ||
bool | randomize_weights = false ) |
std::unique_ptr< llvm::Module > Halide::codegen_llvm | ( | const Module & | module, |
llvm::LLVMContext & | context ) |
Given a Halide module, generate an llvm::Module.
Internal::ConstantInterval Halide::cast | ( | Type | t, |
const Internal::ConstantInterval & | a ) |
Cast operators for ConstantIntervals.
These ones have to live out in Halide::, to avoid C++ name lookup confusion with the Halide::cast variants that take Exprs.
Referenced by Halide::ConciseCasts::bf16(), cast(), Halide::NamesInterface::cast(), Halide::NamesInterface::cast(), Halide::SimdOpCheckTest::check_one(), Halide::ConciseCasts::f16(), Halide::ConciseCasts::f32(), Halide::ConciseCasts::f64(), Halide::ConciseCasts::i16(), Halide::ConciseCasts::i32(), Halide::ConciseCasts::i64(), Halide::ConciseCasts::i8(), Halide::ConciseCasts::u16(), Halide::ConciseCasts::u32(), Halide::ConciseCasts::u64(), and Halide::ConciseCasts::u8().
Internal::ConstantInterval Halide::saturating_cast | ( | Type | t, |
const Internal::ConstantInterval & | a ) |
std::ostream & Halide::operator<< | ( | std::ostream & | stream, |
const Expr & | ) |
Emit an expression on an output stream (such as std::cout) in human-readable form.
std::ostream & Halide::operator<< | ( | std::ostream & | stream, |
const Type & | ) |
Emit a halide type on an output stream (such as std::cout) in human-readable form.
std::ostream & Halide::operator<< | ( | std::ostream & | stream, |
const Module & | ) |
Emit a halide Module on an output stream (such as std::cout) in human-readable form.
std::ostream & Halide::operator<< | ( | std::ostream & | stream, |
const Target & | ) |
Derivative Halide::propagate_adjoints | ( | const Func & | output, |
const Func & | adjoint, | ||
const Region & | output_bounds ) |
Given a Func and a corresponding adjoint, (back)propagate the adjoint to all dependent Funcs, buffers, and parameters.
The bounds of output and adjoint need to be specified as {min, extent} pairs. For each Func the output depends on, and for the pure definition and each update of that Func, it generates a derivative Func stored in the Derivative.
Derivative Halide::propagate_adjoints | ( | const Func & | output, |
const Buffer< float > & | adjoint ) |
Given a Func and a corresponding adjoint buffer, (back)propagate the adjoint to all dependent Funcs, buffers, and parameters.
For each Func the output depends on, and for the pure definition and each update of that Func, it generates a derivative Func stored in the Derivative.
Derivative Halide::propagate_adjoints | ( | const Func & | output | ) |
Given a scalar Func with size 1, (back)propagate the gradient to all dependent Funcs, buffers, and parameters.
For each Func the output depends on, and for the pure definition and each update of that Func, it generates a derivative Func stored in the Derivative.
Pipeline Halide::deserialize_pipeline | ( | const std::string & | filename, |
const std::map< std::string, Parameter > & | user_params ) |
Deserialize a Halide pipeline from a file.
filename | The location of the file to deserialize. Must use .hlpipe extension. |
user_params | Map of named input/output parameters to bind with the resulting pipeline (used to avoid deserializing specific objects and enable the use of externally defined ones instead). |
Pipeline Halide::deserialize_pipeline | ( | std::istream & | in, |
const std::map< std::string, Parameter > & | user_params ) |
Deserialize a Halide pipeline from an input stream.
in | The input stream to read from containing a serialized Halide pipeline |
user_params | Map of named input/output parameters to bind with the resulting pipeline (used to avoid deserializing specific objects and enable the use of externally defined ones instead). |
Pipeline Halide::deserialize_pipeline | ( | const std::vector< uint8_t > & | data, |
const std::map< std::string, Parameter > & | user_params ) |
Deserialize a Halide pipeline from a byte buffer containing a serialized pipeline in binary format.
data | The data buffer containing a serialized Halide pipeline |
user_params | Map of named input/output parameters to bind with the resulting pipeline (used to avoid deserializing specific objects and enable the use of externally defined ones instead). |
std::map< std::string, Parameter > Halide::deserialize_parameters | ( | const std::string & | filename | ) |
Deserialize the external parameters for the Halide pipeline from a file.
This method allows a minimal deserialization of just the external pipeline parameters, so they can be remapped and overridden with user parameters prior to deserializing the pipeline definition.
filename | The location of the file to deserialize. Must use .hlpipe extension. |
std::map< std::string, Parameter > Halide::deserialize_parameters | ( | std::istream & | in | ) |
Deserialize the external parameters for the Halide pipeline from an input stream.
This method allows a minimal deserialization of just the external pipeline parameters, so they can be remapped and overridden with user parameters prior to deserializing the pipeline definition.
in | The input stream to read from containing a serialized Halide pipeline |
std::map< std::string, Parameter > Halide::deserialize_parameters | ( | const std::vector< uint8_t > & | data | ) |
Deserialize the external parameters for the Halide pipeline from a byte buffer containing a serialized pipeline in binary format.
This method allows a minimal deserialization of just the external pipeline parameters, so they can be remapped and overridden with user parameters prior to deserializing the pipeline definition.
data | The data buffer containing a serialized Halide pipeline |
const halide_device_interface_t * Halide::get_device_interface_for_device_api | ( | DeviceAPI | d, |
const Target & | t = get_jit_target_from_environment(), | ||
const char * | error_site = nullptr ) |
Gets the appropriate halide_device_interface_t * for a DeviceAPI.
If error_site is non-null, e.g. the name of the routine calling get_device_interface_for_device_api, a user_error is reported if the requested device API is not enabled in or supported by the target, Halide has been compiled without this device API, or the device API is None or Host or a bad value. The error_site argument is printed in the error message. If error_site is null, this routine returns nullptr instead of calling user_error.
Referenced by Halide::Buffer< T, Dims >::copy_to_device(), Halide::Buffer< T, Dims >::device_malloc(), and Halide::Buffer< T, Dims >::device_wrap_native().
Get the specific DeviceAPI that Halide would select when presented with DeviceAPI::Default_GPU for a given target.
If no suitable api is enabled in the target, returns DeviceAPI::Host.
bool Halide::host_supports_target_device | ( | const Target & | t | ) |
This attempts to sniff whether a given Target (and its implied DeviceAPI) is usable on the current host.
If it appears to be usable, return true; if not, return false. Note that a return value of true does not guarantee that future usage of that device will succeed; it is intended mainly as a simple diagnostic to allow early-exit when a desired device is definitely not usable. Also note that this call is NOT threadsafe, as it temporarily redirects various global error-handling hooks in Halide.
References Internal.
bool Halide::exceptions_enabled | ( | ) |
Query whether Halide was compiled with exceptions.
void Halide::set_custom_compile_time_error_reporter | ( | CompileTimeErrorReporter * | error_reporter | ) |
The default error reporter logs to stderr, then throws an exception (if HALIDE_WITH_EXCEPTIONS) or calls abort (if not).
This allows customization of that behavior if a more gentle response to error reporting is desired. Note that error_reporter is expected to remain valid across all Halide usage; it is up to the caller to ensure that this is the case (and to do any cleanup necessary).
References Internal, and set_custom_compile_time_error_reporter().
Referenced by set_custom_compile_time_error_reporter().
Integer division by small values can be done exactly as multiplies and shifts.
This function does integer division for numerators of various integer types (8-, 16-, and 32-bit, signed and unsigned) and uint8 denominators. The type of the result is the type of the numerator. The unsigned version is faster than the signed version, so cast the numerator to an unsigned int if you know it's positive.
If your divisor is compile-time constant, Halide performs a slightly better optimization automatically, so there's no need to use this function (but it won't hurt).
This function vectorizes well on arm, and well on x86 for 16 and 8 bit vectors. For 32-bit vectors on x86 you're better off using native integer division.
Also, this routine treats division by zero as division by
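The multiply-and-shift idea can be illustrated in plain C++ for one fixed divisor. This is only a sketch of the general technique, not Halide's lookup tables: the function names, the multiplier 171, and the shift of 9 are assumptions chosen for the divisor 3 and 8-bit unsigned numerators.

```cpp
#include <cassert>
#include <cstdint>

// Divide an 8-bit unsigned value by the constant 3 with one multiply and one
// shift. 171/512 exceeds 1/3 by 1/1536, so for x <= 255 the product overshoots
// x/3 by less than 1/6, too little to push the truncated result past x / 3.
inline uint8_t div3_mul_shift(uint8_t x) {
    return static_cast<uint8_t>((static_cast<uint32_t>(x) * 171u) >> 9);
}

// Exhaustively compare against native division over the whole uint8 range.
inline bool div3_matches_native() {
    for (int x = 0; x <= 255; ++x) {
        if (div3_mul_shift(static_cast<uint8_t>(x)) != x / 3) return false;
    }
    return true;
}
```

A runtime divisor needs a per-divisor multiplier and shift, which is presumably why the routine relies on precomputed tables.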
A variant of the above which rounds towards zero instead of rounding towards negative infinity.
Use the fast integer division tables to implement a modulo operation via the Euclidean identity: a % b = a - (a/b)*b.
Explicit overloads of min and max for FuncRef.
These exist to disambiguate calls to min on FuncRefs when a user has pulled both Halide::min and std::min into their namespace.
Definition at line 597 of file Func.h.
References min().
Referenced by min(), min(), min(), min(), Halide::Runtime::Internal::BlockStorage::replace(), and Halide::Runtime::Internal::PointerTable::replace().
Definition at line 600 of file Func.h.
References max().
Referenced by Halide::Runtime::Internal::BlockAllocator::conform(), Halide::Runtime::Internal::conform_alignment(), Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::contains(), Halide::Runtime::Internal::LinkedList::initialize(), Halide::Runtime::Internal::LinkedList::LinkedList(), max(), max(), max(), max(), Halide::Runtime::Internal::BlockStorage::replace(), Halide::Runtime::Internal::PointerTable::replace(), Halide::Runtime::Internal::BlockStorage::reserve(), Halide::Runtime::Internal::PointerTable::reserve(), Halide::Runtime::Internal::BlockStorage::resize(), and Halide::Runtime::Internal::PointerTable::resize().
HALIDE_NO_USER_CODE_INLINE T Halide::evaluate | ( | JITUserContext * | ctx, |
const Expr & | e ) |
JIT-compile and run enough code to evaluate a Halide expression.
This can be thought of as a scalar version of Func::realize.
Definition at line 2648 of file Func.h.
References Halide::Func::realize(), Halide::Expr::type(), type_of(), and user_assert.
Referenced by evaluate(), and evaluate().
HALIDE_NO_USER_CODE_INLINE T Halide::evaluate | ( | const Expr & | e | ) |
HALIDE_NO_USER_CODE_INLINE void Halide::evaluate | ( | JITUserContext * | ctx, |
Tuple | t, | ||
First | first, | ||
Rest &&... | rest ) |
JIT-compile and run enough code to evaluate a Halide Tuple.
Definition at line 2667 of file Func.h.
References Halide::Internal::assign_results(), Halide::Internal::check_types(), and Halide::Func::realize().
HALIDE_NO_USER_CODE_INLINE void Halide::evaluate | ( | Tuple | t, |
First | first, | ||
Rest &&... | rest ) |
JIT-compile and run enough code to evaluate a Halide Tuple.
Definition at line 2678 of file Func.h.
References evaluate().
HALIDE_NO_USER_CODE_INLINE T Halide::evaluate_may_gpu | ( | const Expr & | e | ) |
JIT-compile and run enough code to evaluate a Halide expression.
This can be thought of as a scalar version of Func::realize. Can use GPU if jit target from environment specifies one.
Definition at line 2702 of file Func.h.
References Halide::Func::realize(), Halide::Internal::schedule_scalar(), Halide::Expr::type(), type_of(), and user_assert.
HALIDE_NO_USER_CODE_INLINE void Halide::evaluate_may_gpu | ( | Tuple | t, |
First | first, | ||
Rest &&... | rest ) |
JIT-compile and run enough code to evaluate a Halide Tuple.
Can use GPU if jit target from environment specifies one.
Definition at line 2718 of file Func.h.
References Halide::Internal::assign_results(), Halide::Internal::check_types(), Halide::Func::realize(), and Halide::Internal::schedule_scalar().
auto Halide::operator+ | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a + (T)b) |
Addition between GeneratorParam<T> and any type that supports operator+ with T.
Returns type of underlying operator+.
Definition at line 1013 of file Generator.h.
auto Halide::operator+ | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a + b) |
Definition at line 1017 of file Generator.h.
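The forwarding pattern used by these operator overloads can be sketched with a toy wrapper. ParamModel below is a hypothetical stand-in for GeneratorParam<T>, not the real class; only the trailing-return-type shape is taken from the signatures above.

```cpp
#include <cassert>

// Hypothetical stand-in for GeneratorParam<T>: holds a value and converts to T.
template <typename T>
struct ParamModel {
    T value;
    operator T() const { return value; }
};

// Other + ParamModel<T>: cast the parameter to T, then defer to the
// underlying operator+; the result type is whatever that operator returns.
template <typename Other, typename T>
auto operator+(const Other &a, const ParamModel<T> &b) -> decltype(a + (T)b) {
    return a + (T)b;
}

// ParamModel<T> + Other: the mirrored overload.
template <typename T, typename Other>
auto operator+(const ParamModel<T> &a, const Other &b) -> decltype((T)a + b) {
    return (T)a + b;
}
```

With a ParamModel<int> holding 3, `2 + p` yields an int and `p + 2.5` yields a double, because the return type follows the underlying operator rather than the wrapper.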
auto Halide::operator- | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a - (T)b) |
Subtraction between GeneratorParam<T> and any type that supports operator- with T.
Returns type of underlying operator-.
Definition at line 1026 of file Generator.h.
auto Halide::operator- | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a - b) |
Definition at line 1030 of file Generator.h.
auto Halide::operator* | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a * (T)b) |
Multiplication between GeneratorParam<T> and any type that supports operator* with T.
Returns type of underlying operator*.
Definition at line 1039 of file Generator.h.
auto Halide::operator* | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a * b) |
Definition at line 1043 of file Generator.h.
auto Halide::operator/ | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a / (T)b) |
Division between GeneratorParam<T> and any type that supports operator/ with T.
Returns type of underlying operator/.
Definition at line 1052 of file Generator.h.
auto Halide::operator/ | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a / b) |
Definition at line 1056 of file Generator.h.
auto Halide::operator% | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a % (T)b) |
Modulo between GeneratorParam<T> and any type that supports operator% with T.
Returns type of underlying operator%.
Definition at line 1065 of file Generator.h.
auto Halide::operator% | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a % b) |
Definition at line 1069 of file Generator.h.
auto Halide::operator> | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a > (T)b) |
Greater than comparison between GeneratorParam<T> and any type that supports operator> with T.
Returns type of underlying operator>.
Definition at line 1078 of file Generator.h.
auto Halide::operator> | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a > b) |
Definition at line 1082 of file Generator.h.
auto Halide::operator< | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a < (T)b) |
Less than comparison between GeneratorParam<T> and any type that supports operator< with T.
Returns type of underlying operator<.
Definition at line 1091 of file Generator.h.
auto Halide::operator< | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a < b) |
Definition at line 1095 of file Generator.h.
auto Halide::operator>= | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a >= (T)b) |
Greater than or equal comparison between GeneratorParam<T> and any type that supports operator>= with T.
Returns type of underlying operator>=.
Definition at line 1104 of file Generator.h.
auto Halide::operator>= | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a >= b) |
Definition at line 1108 of file Generator.h.
auto Halide::operator<= | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a <= (T)b) |
Less than or equal comparison between GeneratorParam<T> and any type that supports operator<= with T.
Returns type of underlying operator<=.
Definition at line 1117 of file Generator.h.
auto Halide::operator<= | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a <= b) |
Definition at line 1121 of file Generator.h.
auto Halide::operator== | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a == (T)b) |
Equality comparison between GeneratorParam<T> and any type that supports operator== with T.
Returns type of underlying operator==.
Definition at line 1130 of file Generator.h.
auto Halide::operator== | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a == b) |
Definition at line 1134 of file Generator.h.
auto Halide::operator!= | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a != (T)b) |
Inequality comparison between GeneratorParam<T> and any type that supports operator!= with T.
Returns type of underlying operator!=.
Definition at line 1143 of file Generator.h.
auto Halide::operator!= | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a != b) |
Definition at line 1147 of file Generator.h.
auto Halide::operator&& | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a && (T)b) |
Logical and between GeneratorParam<T> and any type that supports operator&& with T.
Returns type of underlying operator&&.
Definition at line 1156 of file Generator.h.
auto Halide::operator&& | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a && b) |
Definition at line 1160 of file Generator.h.
auto Halide::operator&& | ( | const GeneratorParam< T > & | a, |
const GeneratorParam< T > & | b ) -> decltype((T)a && (T)b) |
Definition at line 1164 of file Generator.h.
auto Halide::operator|| | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(a || (T)b) |
Logical or between GeneratorParam<T> and any type that supports operator|| with T.
Returns type of underlying operator||.
Definition at line 1173 of file Generator.h.
auto Halide::operator|| | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype((T)a || b) |
Definition at line 1177 of file Generator.h.
auto Halide::operator|| | ( | const GeneratorParam< T > & | a, |
const GeneratorParam< T > & | b ) -> decltype((T)a || (T)b) |
Definition at line 1181 of file Generator.h.
auto Halide::min | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(Internal::GeneratorMinMax::min_forward(a, b)) |
Compute minimum between GeneratorParam<T> and any type that supports min with T.
Will automatically import std::min. Returns type of underlying min call.
Definition at line 1221 of file Generator.h.
References Halide::Internal::GeneratorMinMax::min_forward().
auto Halide::min | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype(Internal::GeneratorMinMax::min_forward(a, b)) |
Definition at line 1225 of file Generator.h.
References Halide::Internal::GeneratorMinMax::min_forward().
auto Halide::max | ( | const Other & | a, |
const GeneratorParam< T > & | b ) -> decltype(Internal::GeneratorMinMax::max_forward(a, b)) |
Compute the maximum value between GeneratorParam<T> and any type that supports max with T.
Will automatically import std::max. Returns type of underlying max call.
Definition at line 1234 of file Generator.h.
References Halide::Internal::GeneratorMinMax::max_forward().
auto Halide::max | ( | const GeneratorParam< T > & | a, |
const Other & | b ) -> decltype(Internal::GeneratorMinMax::max_forward(a, b)) |
Definition at line 1238 of file Generator.h.
References Halide::Internal::GeneratorMinMax::max_forward().
auto Halide::operator! | ( | const GeneratorParam< T > & | a | ) | -> decltype(!(T)a) |
Not operator for GeneratorParam.
Definition at line 1245 of file Generator.h.
Callable Halide::create_callable_from_generator | ( | const GeneratorContext & | context, |
const std::string & | name, | ||
const GeneratorParamsMap & | generator_params = {} ) |
Create a Generator from the currently-registered Generators, and use it to create a Callable.
Any GeneratorParams specified will be applied to the Generator before compilation. If the name isn't registered, assert-fail.
References create_callable_from_generator().
Referenced by create_callable_from_generator(), and create_callable_from_generator().
Callable Halide::create_callable_from_generator | ( | const Target & | target, |
const std::string & | name, | ||
const GeneratorParamsMap & | generator_params = {} ) |
References create_callable_from_generator().
An inline reduction.
This is suitable for convolution-type operations - the reduction will be computed in the innermost loop that it is used in. The argument may contain free or implicit variables, and must refer to some reduction domain. The free variables are still free in the return value, but the reduction domain is captured - the result expression does not refer to a reduction domain and can be used in a pure function definition.
An example using sum :
Here g computes some blur of x, but g is still a pure function. The sum is being computed by an anonymous reduction function that is scheduled innermost within g.
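What an inline reduction computes at each point can be modelled in plain C++. The sketch below assumes a pure function f(x) = x * x and a reduction domain of [0, 10); both are illustrative choices, and f_model and g_model are hypothetical names rather than Halide API.

```cpp
#include <cassert>

// Illustrative pure function f(x) = x * x (an assumption for this sketch).
inline int f_model(int x) { return x * x; }

// Models g(x) = sum(f(x + r)) with r ranging over [0, 10): the reduction runs
// as an inner loop at each point x, and the caller only sees the final value,
// so g_model behaves like a pure function of x.
inline int g_model(int x) {
    int acc = 0;
    for (int r = 0; r < 10; ++r) {
        acc += f_model(x + r);
    }
    return acc;
}
```

The accumulator is local to each evaluation, which mirrors how the reduction domain is captured and does not leak into the result expression.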
Referenced by do_cost_model_schedule().
Referenced by Halide::SimdOpCheckTest::check_one().
Variants of the inline reduction in which the RDom is stated explicitly.
The expression can refer to multiple RDoms, and only the inner one is captured by the reduction. This allows you to write expressions like:
Cast an expression to the halide type corresponding to the C++ type T.
Definition at line 377 of file IROperator.h.
Return the sum of two expressions, doing any necessary type coercion using Internal::match_types.
Add an expression and a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Add a constant integer and an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Modify the first expression to be the sum of two expressions, without changing its type.
This casts the second argument to match the type of the first.
Return the difference of two expressions, doing any necessary type coercion using Internal::match_types.
Subtracts a constant integer from an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Subtracts an expression from a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Return the negative of the argument.
Does no type casting, so more formally: return that number which when added to the original, yields zero of the same type. For unsigned integers the negative is still an unsigned integer. E.g. in UInt(8), the negative of 56 is 200, because 56 + 200 == 0
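The UInt(8) case can be checked with ordinary modular arithmetic. This is a minimal plain C++ sketch; negate_u8 is a hypothetical helper, not a Halide function.

```cpp
#include <cassert>
#include <cstdint>

// Negate within 8 unsigned bits: the result is the value that, added to x,
// gives 0 modulo 256.
inline uint8_t negate_u8(uint8_t x) {
    return static_cast<uint8_t>(0u - x);
}
```

For x = 56 this returns 200, and 56 + 200 wraps to 0 in 8 bits.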
Modify the first expression to be the difference of two expressions, without changing its type.
This casts the second argument to match the type of the first.
Return the product of two expressions, doing any necessary type coercion using Internal::match_types.
Multiply an expression and a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Multiply a constant integer and an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Modify the first expression to be the product of two expressions, without changing its type.
This casts the second argument to match the type of the first.
Return the ratio of two expressions, doing any necessary type coercion using Internal::match_types.
Note that integer division in Halide is not the same as integer division in C-like languages in two ways.
First, signed integer division in Halide rounds according to the sign of the denominator. This means towards minus infinity for positive denominators, and towards positive infinity for negative denominators. This is unlike C, which rounds towards zero. This decision ensures that upsampling expressions like f(x/2, y/2) don't have funny discontinuities when x and y cross zero.
Second, division by zero returns zero instead of faulting. For types where overflow is defined behavior, division of the largest negative signed integer by -1 returns the largest negative signed integer for the type (i.e. it wraps). This ensures that a division operation can never have a side-effect, which is helpful in Halide because scheduling directives can expand the domain of computation of a Func, potentially introducing new zero-division.
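The rounding and zero-division rules described above can be modelled in plain C++ for 32-bit signed integers. This is a sketch of the stated semantics only, not Halide's code generation; div_model is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Models the described signed division: rounds towards minus infinity for
// positive denominators and towards positive infinity for negative ones,
// returns 0 for a zero denominator, and wraps for INT32_MIN / -1.
inline int32_t div_model(int32_t a, int32_t b) {
    if (b == 0) return 0;
    if (a == std::numeric_limits<int32_t>::min() && b == -1) return a;
    int32_t q = a / b;  // C++ truncates towards zero
    int32_t r = a % b;  // remainder has the sign of a
    if (r < 0) q += (b > 0) ? -1 : 1;  // adjust so the remainder is non-negative
    return q;
}
```

For example, div_model(-1, 2) is -1 where C would give 0, so x/2 steps down uniformly as x crosses zero.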
Modify the first expression to be the ratio of two expressions, without changing its type.
This casts the second argument to match the type of the first. Note that signed integer division in Halide rounds towards minus infinity, unlike C, which rounds towards zero.
Divides an expression by a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Divides a constant integer by an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Return the first argument reduced modulo the second, doing any necessary type coercion using Internal::match_types.
There are two key differences between C-like languages and Halide for the modulo operation, which complement the way division works.
First, the result is never negative, so x % 2 is always zero or one, unlike in C-like languages. x % -2 is equivalent, and is also always zero or one. Second, mod by zero evaluates to zero (unlike in C, where it faults). This makes modulo, like division, a side-effect-free operation.
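A plain C++ model of these modulo rules for 32-bit signed integers might look as follows. It sketches the stated behavior only; mod_model is a hypothetical name, not part of Halide.

```cpp
#include <cassert>
#include <cstdint>

// Models the described modulo: the result is never negative, and a zero
// modulus yields zero instead of faulting.
inline int32_t mod_model(int32_t a, int32_t b) {
    if (b == 0) return 0;
    if (b == -1) return 0;  // always 0; also avoids INT32_MIN % -1 in C++
    int32_t r = a % b;      // C++ remainder takes the sign of a
    if (r < 0) r = (b > 0) ? r + b : r - b;  // shift into [0, |b|)
    return r;
}
```

Here mod_model(-3, 2) and mod_model(-3, -2) both give 1, whereas the C remainder operator gives -1 in both cases.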
Mods an expression by a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Mods a constant integer by an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Return a boolean expression that tests whether the first argument is greater than the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is greater than a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is greater than an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether the first argument is less than the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is less than a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is less than an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether the first argument is less than or equal to the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is less than or equal to a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is less than or equal to an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether the first argument is greater than or equal to the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is greater than or equal to a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is greater than or equal to an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether the first argument is equal to the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is equal to a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is equal to an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Expr Halide::operator!= | ( | Expr | a, |
Expr | b ) |
Return a boolean expression that tests whether the first argument is not equal to the second, after doing any necessary type coercion using Internal::match_types.
Expr Halide::operator!= | ( | Expr | a, |
int | b ) |
Return a boolean expression that tests whether an expression is not equal to a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Expr Halide::operator!= | ( | int | a, |
Expr | b ) |
Return a boolean expression that tests whether a constant integer is not equal to an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Returns an expression representing the greater of the two arguments, after doing any necessary type coercion using Internal::match_types.
Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
Returns an expression representing the greater of an expression and a constant integer.
The integer is coerced to the type of the expression. Errors if the integer is not representable as that type. Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
Returns an expression representing the greater of a constant integer and an expression.
The integer is coerced to the type of the expression. Errors if the integer is not representable as that type. Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
Definition at line 655 of file IROperator.h.
References max().
Definition at line 658 of file IROperator.h.
References max().
Returns an expression representing the greatest of a set of expressions, after doing any necessary type coercion using Internal::match_types.
Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4). The expressions are folded from the right, i.e. max(.., max(.., ..)). The arguments can be any mix of types but must all be convertible to Expr.
Definition at line 670 of file IROperator.h.
References max().
Returns an expression representing the lesser of an expression and a constant integer.
The integer is coerced to the type of the expression. Errors if the integer is not representable as that type. Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
Returns an expression representing the lesser of a constant integer and an expression.
The integer is coerced to the type of the expression. Errors if the integer is not representable as that type. Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
Definition at line 690 of file IROperator.h.
References min().
Definition at line 693 of file IROperator.h.
References min().
Returns an expression representing the least of a set of expressions, after doing any necessary type coercion using Internal::match_types.
Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4). The expressions are folded from the right, i.e. min(.., min(.., ..)). The arguments can be any mix of types but must all be convertible to Expr.
Definition at line 705 of file IROperator.h.
References min().
Operators on floats treat those floats as Exprs.
Making these explicit prevents implicit float->int casts that might otherwise occur.
Definition at line 713 of file IROperator.h.
Definition at line 716 of file IROperator.h.
Definition at line 719 of file IROperator.h.
Definition at line 722 of file IROperator.h.
Definition at line 725 of file IROperator.h.
Definition at line 728 of file IROperator.h.
Definition at line 731 of file IROperator.h.
Definition at line 734 of file IROperator.h.
Definition at line 737 of file IROperator.h.
Definition at line 740 of file IROperator.h.
Definition at line 743 of file IROperator.h.
Definition at line 746 of file IROperator.h.
Definition at line 749 of file IROperator.h.
Definition at line 752 of file IROperator.h.
Definition at line 755 of file IROperator.h.
Definition at line 758 of file IROperator.h.
Definition at line 761 of file IROperator.h.
Definition at line 764 of file IROperator.h.
Definition at line 767 of file IROperator.h.
Definition at line 770 of file IROperator.h.
Definition at line 773 of file IROperator.h.
Definition at line 776 of file IROperator.h.
Clamps an expression to lie within the given bounds.
The bounds are type-cast to match the expression. Vectorizes as well as min/max.
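One way to read this description is as a min followed by a max, with the bounds first converted to the type of the value. The plain C++ sketch below works under that assumption; clamp_model is a hypothetical name, and the min/max formulation is inferred from the text, not confirmed by it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp x to [lo, hi] after casting the integer bounds to x's type,
// expressed as max(min(x, hi), lo).
template <typename T>
inline T clamp_model(T x, int lo, int hi) {
    const T lo_t = static_cast<T>(lo);
    const T hi_t = static_cast<T>(hi);
    return std::max(std::min(x, hi_t), lo_t);
}
```

For instance, clamp_model<uint8_t>(200, 10, 100) gives 100 and clamp_model<uint8_t>(5, 10, 100) gives 10.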
Returns the absolute value of a signed integer or floating-point expression.
Vectorizes cleanly. Unlike in C, abs of a signed integer returns an unsigned integer of the same bit width. This means that abs of the most negative integer doesn't overflow.
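The benefit of the unsigned result can be seen in a plain C++ model for 8-bit inputs. This is a sketch of the described behavior only; abs_i8 is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Absolute value of an int8_t returned as uint8_t. The magnitude of -128 is
// 128, which does not fit in int8_t but does fit in uint8_t.
inline uint8_t abs_i8(int8_t x) {
    const uint8_t bits = static_cast<uint8_t>(x);  // reinterpret modulo 256
    return x < 0 ? static_cast<uint8_t>(0u - bits) : bits;
}
```

With a signed result type, the most negative input would have no representable answer; with the unsigned type every input does.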
Return the absolute difference between two values.
Vectorizes cleanly. Returns an unsigned value of the same bit width. There are various ways to write this yourself, but they contain numerous gotchas and don't always compile to good code, so use this instead.
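One of the gotchas in writing this by hand is overflow in the intermediate subtraction. The plain C++ sketch below avoids it by widening first; absd_i8 and absd_u8 are hypothetical helpers that model the described result type, not Halide functions.

```cpp
#include <cassert>
#include <cstdint>

// Absolute difference of two int8_t values as uint8_t. The operands are
// widened to int so the subtraction cannot overflow; the result is at most 255.
inline uint8_t absd_i8(int8_t a, int8_t b) {
    const int wa = a, wb = b;
    return static_cast<uint8_t>(wa > wb ? wa - wb : wb - wa);
}

// Unsigned variant: always subtract the smaller value from the larger one.
inline uint8_t absd_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(a > b ? a - b : b - a);
}
```

A naive 8-bit `a - b` for a = 3, b = 10 wraps to 249 in unsigned arithmetic, whereas absd_u8(3, 10) gives 7.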
Referenced by Halide::SimdOpCheckTest::check_one().
Returns an expression similar to the ternary operator in C, except that it always evaluates all arguments.
If the first argument is true, then return the second, else return the third. Typically vectorizes cleanly, but benefits from SSE41 or newer on x86.
A multi-way variant of select similar to a switch statement in C, which can accept multiple conditions and values in pairs.
Evaluates to the first value for which the condition is true. Returns the final value if all conditions are false.
Definition at line 810 of file IROperator.h.
References select().
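The pairing of conditions and values can be modelled with a small recursive template in plain C++. This is a sketch of the selection rule over ordinary values, not Halide's Expr-based implementation; select_model is a hypothetical name.

```cpp
#include <cassert>

// Base case: the ordinary three-argument form.
template <typename T>
inline T select_model(bool c, T true_value, T false_value) {
    return c ? true_value : false_value;
}

// Multi-way form: (condition, value) pairs followed by a final default.
// Yields the first value whose condition is true, else the final value.
template <typename T, typename... Rest>
inline T select_model(bool c0, T v0, bool c1, T v1, Rest... rest) {
    return c0 ? v0 : select_model(c1, v1, rest...);
}
```

As with the real select, every argument is evaluated before the choice is made, because they are all passed as function arguments.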
Tuple Halide::select | ( | const Expr & | condition, |
const Tuple & | true_value, | ||
const Tuple & | false_value ) |
Equivalent of multiway select(), but taking/returning tuples.
If the condition is a Tuple, it must match the size of the true and false Tuples.
Definition at line 825 of file IROperator.h.
References select().
Definition at line 829 of file IROperator.h.
References select().
Definition at line 841 of file IROperator.h.
References select().
Oftentimes we want to pack a list of expressions with the same type into a channel dimension, e.g.:
img(x, y, c) = select(c == 0, 100, // Red
                      c == 1, 50,  // Green
                      25);         // Blue
This is tedious when the list is long.
The following function provides convenient syntax that allows one to write: img(x, y, c) = mux(c, {100, 50, 25});
As with the select equivalent, if the first argument (the index) is out of range, the expression evaluates to the last value.
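The out-of-range rule follows from treating mux as a chain of equality tests with the last value as the default. A plain C++ model over ordinary values, with the hypothetical name mux_model, might look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns values[index] for indices before the last one; any other index,
// including negative or too-large ones, yields the last value.
template <typename T>
inline T mux_model(int index, const std::vector<T> &values) {
    assert(!values.empty());  // the model requires at least one value
    for (std::size_t i = 0; i + 1 < values.size(); ++i) {
        if (index == static_cast<int>(i)) return values[i];
    }
    return values.back();
}
```

With the values {100, 50, 25}, indices 0 and 1 select 100 and 50, while 2, 7, and -1 all select 25.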
Return the sine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the arcsine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the cosine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the arccosine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the tangent of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the arctangent of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the angle of a floating-point gradient.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic sine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic arcsine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic cosine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic arccosine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic tangent of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic arctangent of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the square root of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Typically vectorizes cleanly.
Return the square root of the sum of the squares of two floating-point expressions.
If the argument is not floating-point, it is cast to Float(32). Vectorizes cleanly.
Return the exponential of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). For Float(64) arguments, this calls the system exp function, and does not vectorize well. For Float(32) arguments, this function is vectorizable, does the right thing for extremely small or extremely large inputs, and is accurate up to the last bit of the mantissa. Vectorizes cleanly.
Return the logarithm of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). For Float(64) arguments, this calls the system log function, and does not vectorize well. For Float(32) arguments, this function is vectorizable, does the right thing for inputs <= 0 (returns -inf or nan), and is accurate up to the last bit of the mantissa. Vectorizes cleanly.
Return one floating point expression raised to the power of another.
The type of the result is given by the type of the first argument. If the first argument is not a floating-point type, it is cast to Float(32). For Float(32), cleanly vectorizable, and accurate up to the last few bits of the mantissa. Gets worse when approaching overflow. Vectorizes cleanly.
Evaluate the error function erf.
Only available for Float(32). Accurate up to the last three bits of the mantissa. Vectorizes cleanly.
Fast vectorizable approximation to some trigonometric functions for Float(32).
Absolute approximation error is less than 1e-5.
Fast approximate cleanly vectorizable log for Float(32).
Returns nonsense for x <= 0.0f. Accurate up to the last 5 bits of the mantissa. Vectorizes cleanly.
Fast approximate cleanly vectorizable exp for Float(32).
Returns nonsense for inputs that would overflow or underflow. Typically accurate up to the last 5 bits of the mantissa. Gets worse when approaching overflow. Vectorizes cleanly.
Fast approximate cleanly vectorizable pow for Float(32).
Returns nonsense for x < 0.0f. Accurate up to the last 5 bits of the mantissa for typical exponents. Gets worse when approaching overflow. Vectorizes cleanly.
Fast approximate inverse for Float(32).
Corresponds to the rcpps instruction on x86, and the vrecpe instruction on ARM. Vectorizes cleanly. Note that this can produce slightly different results across different implementations of the same architecture (e.g. AMD vs Intel), even when strict_float is enabled.
Fast approximate inverse square root for Float(32).
Corresponds to the rsqrtps instruction on x86, and the vrsqrte instruction on ARM. Vectorizes cleanly. Note that this can produce slightly different results across different implementations of the same architecture (e.g. AMD vs Intel), even when strict_float is enabled.
Return the greatest whole number less than or equal to a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value is still in floating point, despite being a whole number. Vectorizes cleanly.
Return the least whole number greater than or equal to a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value is still in floating point, despite being a whole number. Vectorizes cleanly.
Return the whole number closest to a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value is still in floating point, despite being a whole number. On ties, we round towards the nearest even integer. Note that this is not the same as std::round in C, which rounds away from zero. On platforms without a native instruction for this, it is emulated, and may be more expensive than cast<int>(x + 0.5f) or similar.
Return the integer part of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value is still in floating point, despite being a whole number. Vectorizes cleanly.
Return the fractional part of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value has the same sign as the original expression. Vectorizes cleanly.
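The rounding conventions described above for round, trunc, and fract can be checked with a small plain C++ model. This is only a sketch: the function names are hypothetical, the code does not call Halide, and the round model assumes the default floating-point rounding mode (round to nearest, ties to even).

```cpp
#include <cmath>

// Plain C++ models (hypothetical names) of the documented rounding rules.

// round: ties go to the nearest even whole number. std::nearbyint behaves
// this way under the default rounding mode, which this sketch assumes.
float round_model(float x) {
    return std::nearbyint(x);
}

// trunc: the integer part, still returned as a float.
float trunc_model(float x) {
    return std::trunc(x);
}

// fract: the fractional part, carrying the sign of the input.
float fract_model(float x) {
    return x - std::trunc(x);
}
```

Note the contrast with std::round, which rounds ties away from zero: std::round(2.5f) is 3.0f, while the model above returns 2.0f.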
Reinterpret the bits of one value as another type.
Referenced by reinterpret(), and Halide::Internal::GeneratorInput_Scalar< T >::set_estimate().
Definition at line 1064 of file IROperator.h.
References reinterpret(), and type_of().
Return the bitwise and of two expressions (which need not have the same type).
The result type is the wider of the two expressions. Only integral types are allowed and both expressions must be signed or both must be unsigned.
Return the bitwise and of an expression and an integer.
The type of the result is the type of the expression argument.
Return the bitwise or of two expressions (which need not have the same type).
The result type is the wider of the two expressions. Only integral types are allowed and both expressions must be signed or both must be unsigned.
Return the bitwise or of an expression and an integer.
The type of the result is the type of the expression argument.
Return the bitwise xor of two expressions (which need not have the same type).
The result type is the wider of the two expressions. Only integral types are allowed and both expressions must be signed or both must be unsigned.
Return the bitwise xor of an expression and an integer.
The type of the result is the type of the expression argument.
Shift the bits of an integer value left.
This is actually less efficient than multiplying by 2^n, because Halide's optimization passes understand multiplication, and will compile it to shifting. Use this operator only when you really need bit shifting (e.g. because the exponent is a run-time parameter). The type of the result is equal to the type of the first argument. Both arguments must have integer type.
Shift the bits of an integer value right.
Does sign extension for signed integers. This is less efficient than dividing by a power of two. Halide's definition of division (always round to negative infinity) means that all divisions by powers of two get compiled to bit-shifting, and Halide's optimization routines understand division and can work with it. The type of the result is equal to the type of the first argument. Both arguments must have integer type.
Linear interpolate between the two values according to a weight.
Parameters:
    zero_val: The result when weight is 0
    one_val: The result when weight is 1
    weight: The interpolation amount
Both zero_val and one_val must have the same type. All types are supported, including bool.
The weight is treated as its own type and must be float or an unsigned integer type. It is scaled to the bit-size of the type of zero_val and one_val if they are integer, or converted to float if they are float. Integer weights are converted to float via division by the full-range value of the weight's type. Floating-point weights used to interpolate between integer values must be between 0.0f and 1.0f, and an error may be signaled if this is not provably so. (clamp operators can be added to provide proof. Currently an error is only signaled for constant weights.)
For integer linear interpolation, out of range values cannot be represented. In particular, weights that are conceptually less than 0 or greater than 1.0 are not representable. As such the result is always between zero_val and one_val (inclusive). For lerp with floating-point values and floating-point weight, the full range of a float is valid, however underflow and overflow can still occur.
Ordering is not required between zero_val and one_val: lerp(42, 69, .5f) == lerp(69, 42, .5f) == 56
Results for integer types are for exactly rounded arithmetic. As such, there are cases where 16-bit and float differ because 32-bit floating-point (float) does not have enough precision to produce the exact result. (Likely true for 32-bit integer vs. double-precision floating-point as well.)
At present, double precision and 64-bit integers are not supported.
Generally, lerp will vectorize as if it were an operation on a type twice the bit size of the inferred type for zero_val and one_val.
Some examples:
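As a hedged illustration of the integer case, the sketch below models lerp on 8-bit values in plain C++. It is not Halide code: the function names are hypothetical, the interpolation is carried out in double precision, and rounding halves upward is an assumption of this sketch (it agrees with the lerp(42, 69, .5f) == 56 example above).

```cpp
#include <cmath>
#include <cstdint>

// Plain C++ model (hypothetical name) of lerp on uint8_t values with a
// floating-point weight in [0, 1]. Rounding halves upward is an assumption.
std::uint8_t lerp_u8_model(std::uint8_t zero_val, std::uint8_t one_val, double weight) {
    double exact = zero_val * (1.0 - weight) + one_val * weight;
    return static_cast<std::uint8_t>(std::floor(exact + 0.5));
}

// Same, but with a uint8_t weight, converted to a fraction by dividing by
// the full-range value of the weight's type (255), as described above.
std::uint8_t lerp_u8_weight_u8_model(std::uint8_t zero_val, std::uint8_t one_val,
                                     std::uint8_t weight) {
    return lerp_u8_model(zero_val, one_val, weight / 255.0);
}
```

With an integer weight, 0 yields zero_val and 255 yields one_val, so the result can never leave the closed range between the two endpoints.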
Count the number of leading zero bits in an expression.
If the expression is zero, the result is the number of bits in the type.
Count the number of trailing zero bits in an expression.
If the expression is zero, the result is the number of bits in the type.
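The zero-input convention for both counting operations can be modelled in a few lines of plain C++. This is a sketch for 32-bit unsigned values only, with hypothetical function names; it does not call Halide.

```cpp
#include <cstdint>

// Plain C++ models (hypothetical names) for 32-bit unsigned values.

// Count zero bits from the most significant end; 32 when x == 0.
int count_leading_zeros_model(std::uint32_t x) {
    int n = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1) {
        ++n;
    }
    return n;
}

// Count zero bits from the least significant end; 32 when x == 0.
int count_trailing_zeros_model(std::uint32_t x) {
    int n = 0;
    for (std::uint32_t mask = 1u; mask != 0 && (x & mask) == 0; mask <<= 1) {
        ++n;
    }
    return n;
}
```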
Divide two integers, rounding towards zero.
This is the typical behavior of most hardware architectures, which differs from Halide's division operator, which is Euclidean (rounds towards -infinity). Will throw a runtime error if y is zero, or if y is -1 and x is the minimum signed integer.
Compute the remainder of dividing two integers, when division is rounding toward zero.
This is the typical behavior of most hardware architectures, which differs from Halide's mod operator, which is Euclidean (produces the remainder when division rounds towards -infinity). Will throw a runtime error if y is zero.
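The difference between the two rounding conventions only shows up for negative, inexact quotients. The sketch below models both in plain C++ with hypothetical function names; it is restricted to positive divisors, where rounding toward negative infinity and Euclidean division agree, and it does not call Halide.

```cpp
// Plain C++ models (hypothetical names), restricted to positive divisors.

// Division that rounds toward negative infinity.
int floor_div_model(int x, int y) {
    int q = x / y;  // built-in C++ division truncates toward zero
    if (x % y != 0 && x < 0) {
        --q;        // step down for negative, inexact quotients
    }
    return q;
}

// Matching remainder, always in [0, y) for positive y.
int floor_mod_model(int x, int y) {
    return x - floor_div_model(x, y) * y;
}

// The round-toward-zero variants correspond to the built-in operators.
int trunc_div_model(int x, int y) { return x / y; }
int trunc_mod_model(int x, int y) { return x % y; }
```

For example, dividing -7 by 2 gives -4 remainder 1 when rounding toward negative infinity, but -3 remainder -1 when rounding toward zero.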
Return a random variable representing a uniformly distributed float in the half-open interval [0.0f, 1.0f).
For random numbers of other types, use lerp with a random float as the last parameter.
Optionally takes a seed.
Note that:
    Expr x = random_float();
    Expr y = x + x;
is very different to
    Expr y = random_float() + random_float();
The first doubles a random variable, and the second adds two independent random variables.
A given random variable takes on a unique value that depends deterministically on the pure variables of the function it belongs to, the identity of the function itself, and which definition of the function it is used in. Random variables are, however, shared across tuple elements.
This function vectorizes cleanly.
Return a random variable representing a uniformly distributed unsigned 32-bit integer.
See random_float. Vectorizes cleanly.
Return a random variable representing a uniformly distributed 32-bit integer.
See random_float. Vectorizes cleanly.
Definition at line 1274 of file IROperator.h.
References Halide::Internal::collect_print_args(), and print().
Create an Expr that prints whenever it is evaluated, provided that the condition is true.
Referenced by print_when().
Definition at line 1287 of file IROperator.h.
References Halide::Internal::collect_print_args(), and print_when().
Create an Expr that guarantees a precondition.
If 'condition' is true, the return value is equal to the first Expr. If 'condition' is false, halide_error() is called, and the return value is arbitrary. Any additional arguments after the first Expr are stringified and passed as a user-facing message to halide_error(), similar to print().
Note that this essentially always inserts a runtime check into the generated code (except when the condition can be proven at compile time); as such, it should be avoided inside inner loops, except for debugging or testing purposes. Note also that it does not vectorize cleanly (vector values will be scalarized for the check).
However, using this to make assertions about (say) input values can be useful, both in terms of correctness and (potentially) in terms of code generation, e.g.
    Param<int> p;
    Expr y = require(p > 0, p);
will allow the optimizer to assume positive, nonzero values for y.
Referenced by require().
Definition at line 1320 of file IROperator.h.
References Halide::Internal::collect_print_args(), and require().
Return an undef value of the given type.
Halide skips stores that depend on undef values, so you can use this to mean "do not modify this memory location". This is an escape hatch that can be used for several things:
You can define a reduction with no pure step, by setting the pure step to undef. Do this only if you're confident that the update steps are sufficient to correctly fill in the domain.
For a tuple-valued reduction, you can write an update step that only updates some tuple elements.
You can define a single-stage pipeline that only has update steps, and depends on the values already in the output buffer.
Use this feature with great caution, as you can use it to load from uninitialized memory.
Definition at line 1348 of file IROperator.h.
References type_of(), and undef().
Referenced by undef().
Control the values used in the memoization cache key for memoize.
Normally parameters and other external dependencies are automatically inferred and added to the cache key. The memoize_tag operator allows computing one expression and using either the computed value, or one or more other expressions in the cache key instead of the parameter dependencies of the computation. The single argument version is completely safe in that the cache key will use the actual computed value; it is difficult or impossible to produce erroneous caching this way. The more-than-one argument version allows generating cache keys that do not uniquely identify the computation and thus can result in caching errors.
A potential use for the single argument version is to handle a floating-point parameter that is quantized to a small integer. Multiple values of the float will produce the same integer, so keying the cache on the integer is more efficient.
The main use for the more-than-one argument version is to provide cache key information for Handles and ImageParams, which otherwise are not allowed inside compute_cached operations. E.g. when passing a group of parameters to an external array function via a Handle, memoize_tag can be used to isolate the actual values used by that computation. If an ImageParam is a constant image with a persistent digest, memoize_tag can be used to key computations using that image on the digest.
Definition at line 1395 of file IROperator.h.
References Halide::Internal::memoize_tag_helper().
Expressions tagged with this intrinsic are considered to be part of the steady state of some loop with a nasty beginning and end (e.g. a boundary condition).
When Halide encounters likely intrinsics, it splits the containing loop body into three, and tries to simplify down all conditions that lead to the likely. For example, given the expression: select(x < 1, bar, x > 10, bar, likely(foo)), Halide will split the loop over x into portions where x < 1, 1 <= x <= 10, and x > 10.
You're unlikely to want to call this directly. You probably want to use the boundary condition helpers in the BoundaryConditions namespace instead.
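The effect of the three-way split described above can be pictured in plain C++. The sketch below is only an illustration of the shape of the transformation, not the code Halide generates; foo_model, bar_model, and both loop functions are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins for the expressions foo and bar in the example above.
int foo_model(int x) { return 2 * x; }
const int bar_model = -1;

// One loop with the boundary test evaluated on every iteration,
// mirroring select(x < 1, bar, x > 10, bar, foo).
std::vector<int> single_loop_model(int n) {
    std::vector<int> out(n);
    for (int x = 0; x < n; ++x) {
        out[x] = (x < 1) ? bar_model : (x > 10) ? bar_model : foo_model(x);
    }
    return out;
}

// The same computation split into three loops: x < 1, 1 <= x <= 10,
// and x > 10. The middle (steady-state) loop contains no conditionals.
std::vector<int> partitioned_loops_model(int n) {
    std::vector<int> out(n);
    int steady_begin = std::min(1, n);
    int steady_end = std::min(11, n);
    for (int x = 0; x < steady_begin; ++x) out[x] = bar_model;
    for (int x = steady_begin; x < steady_end; ++x) out[x] = foo_model(x);
    for (int x = steady_end; x < n; ++x) out[x] = bar_model;
    return out;
}
```

Both functions produce the same output; the partitioned form simply moves the boundary handling out of the loop that does most of the work.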
Equivalent to likely, but only triggers a loop partitioning if found in an innermost loop.
Cast an expression to the halide type corresponding to the C++ type T.
As part of the cast, clamp to the minimum and maximum values of the result type.
Definition at line 1424 of file IROperator.h.
References saturating_cast(), and type_of().
Cast an expression to a new type, clamping to the minimum and maximum values of the result type.
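The clamping behaviour can be sketched in plain C++ for integer types. This is a model under stated assumptions rather than Halide's implementation: the name saturating_cast_model is hypothetical, and the sketch only handles integer types of at most 32 bits.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Plain C++ model (hypothetical name) of a saturating cast between
// integer types of at most 32 bits: clamp to the range of To, then convert.
template <typename To, typename From>
To saturating_cast_model(From value) {
    const long long lo = std::numeric_limits<To>::min();
    const long long hi = std::numeric_limits<To>::max();
    const long long wide = static_cast<long long>(value);
    return static_cast<To>(std::min(std::max(wide, lo), hi));
}
```

A plain cast of 300 to an 8-bit unsigned type would wrap around, whereas the saturating model returns 255.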
Makes a best effort attempt to preserve IEEE floating-point semantics in evaluating an expression.
May not be implemented for all backends. (E.g. it is difficult to do this for C++ code generation as it depends on the compiler flags used to compile the generated code.)
Create an Expr that promises another Expr is clamped but does not generate code to check the assertion or modify the value.
No attempt is made to prove the bound at compile time. (If it is proved false as a result of something else, an error might be generated, but it is also possible the compiler will crash.) The promised bound is used in bounds inference so it will allow satisfying bounds checks as well as possibly aiding optimization.
unsafe_promise_clamped returns its first argument, the Expr 'value'.
This is a very easy way to make Halide generate erroneous code if the promised bound is not kept. Use sparingly, when there is no other way to convey the information to the compiler and it is required for a valuable optimization.
Unsafe promises can be checked by turning on Target::CheckUnsafePromises. This is intended for debugging only.
References Internal.
Scatter and gather are used for update definitions that must store multiple values to distinct locations at the same time.
The multiple expressions on the right-hand side are bundled together into a "gather", which must match a "scatter" with the same number of arguments on the left-hand side. For example, to store the values 1 and 2 to the locations (x, y, 3) and (x, y, 4), respectively:
The result of gather or scatter can be treated as an expression. Any containing operations on it can be assumed to distribute over the elements. If two gather expressions are combined with an arithmetic operator (e.g. added), they combine element-wise. The following example stores the values 2 * x, 2 * y, and 2 * c to the locations (x + 1, y, c), (x, y + 3, c), and (x, y, c + 2) respectively:
Repeated values in the scatter cause multiple stores to the same location. The stores happen in order from left to right, so the rightmost value wins. The following code is equivalent to f(x) = 5
Gathers are most useful for algorithms which require in-place swapping or permutation of multiple elements, or other kinds of in-place mutations that require loading multiple inputs, doing some operations to them jointly, then storing them again. The following update definition swaps the values of f at locations 3 and 5 if an input parameter p is true:
For more examples of the use of scatter and gather, see test/correctness/multiple_scatter.cpp
It is not currently possible to use scatter and gather to write an update definition in which the number of values loaded or stored varies, as the size of the scatter/gather packet must be fixed at compile time. A workaround is to make the unwanted extra operations a redundant copy of the last operation, which will be dead-code-eliminated by the compiler. For example, the following update definition swaps the values at locations 3 and 5 when the parameter p is true, and rotates the values at locations 1, 2, and 3 when it is false. The load from 3 and store to 5 will be redundantly repeated:
Note that in the p == true case, we redundantly load from 3 and write to 5 twice.
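The ordering rules for scatter and gather (load everything first, then store left to right) can be modelled on an ordinary array. The sketch below is plain C++, not Halide syntax; both function names are hypothetical, and each pair of index/value lists is assumed to have equal length.

```cpp
#include <cstddef>
#include <vector>

// Plain C++ model (hypothetical name) of one scatter/gather update on a
// 1-D buffer: every load happens before any store, and the stores are
// applied left to right, so the rightmost store to a location wins.
std::vector<int> scatter_gather_model(std::vector<int> f,
                                      const std::vector<int> &store_to,
                                      const std::vector<int> &load_from) {
    // Gather: read all inputs first.
    std::vector<int> gathered;
    for (int src : load_from) {
        gathered.push_back(f[src]);
    }
    // Scatter: write them back in order.
    for (std::size_t i = 0; i < store_to.size(); ++i) {
        f[store_to[i]] = gathered[i];
    }
    return f;
}

// Scatter constant values (no loads), e.g. storing 4 and then 5 to the
// same location.
std::vector<int> scatter_values_model(std::vector<int> f,
                                      const std::vector<int> &store_to,
                                      const std::vector<int> &values) {
    for (std::size_t i = 0; i < store_to.size(); ++i) {
        f[store_to[i]] = values[i];
    }
    return f;
}
```

In this model, storing to {3, 5} while loading from {5, 3} swaps the two elements, and storing to {1, 2, 3} while loading from {2, 3, 1} rotates three elements, because no store can disturb a value that is still to be loaded.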
Referenced by scatter().
Definition at line 1557 of file IROperator.h.
References scatter().
Definition at line 1562 of file IROperator.h.
References gather().
Extract a contiguous subsequence of the bits of 'e', starting at the bit index given by 'lsb', where zero is the least-significant bit, returning a value of type 't'.
Any out-of-range bits requested are filled with zeros.
extract_bits is especially useful when one wants to load a small vector of a wide type, and treat it as a larger vector of a smaller type. For example, loading a vector of 32 uint8 values from a uint32 Func can be done as follows:
Note that the align_bounds call is critical so that the narrow Exprs are aligned to the wider Exprs. This makes the x%4 term collapse to a constant. If f8 is an output Func, then constraining the min value of x to be a known multiple of four would also be sufficient, e.g. via:
See test/correctness/extract_concat_bits.cpp for a complete example.
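The bit-level behaviour, including the zero fill for out-of-range bits, can be modelled in plain C++. The sketch below covers only one case (extracting a uint8_t from a uint32_t); the function name is hypothetical and the code does not call Halide.

```cpp
#include <cstdint>

// Plain C++ model (hypothetical name): extract 8 bits of e starting at bit
// index lsb, where zero is the least-significant bit. Requested bits that
// fall outside e (index < 0 or >= 32) read as zero.
std::uint8_t extract_bits_u8_model(std::uint32_t e, int lsb) {
    std::uint8_t result = 0;
    for (int i = 0; i < 8; ++i) {
        int src = lsb + i;  // index of the source bit in e
        if (src >= 0 && src < 32 && ((e >> src) & 1u)) {
            result |= static_cast<std::uint8_t>(1u << i);
        }
    }
    return result;
}
```

For the value 0xAABBCCDD, an lsb of 8 selects the byte 0xCC, while an lsb of 28 selects the top four bits and pads the rest with zeros, giving 0x0A.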
Referenced by extract_bits().
Definition at line 1592 of file IROperator.h.
References extract_bits(), and type_of().
Given a number of Exprs of the same type, concatenate their bits producing a single Expr of the same type code of the input but with more bits.
The number of arguments must be a power of two.
concat_bits is especially useful when one wants to treat a Func containing values of a narrow type as a Func containing fewer values of a wider type. For example, the following code reinterprets vectors of 32 uint8 values as a vector of 8 uint32s:
See test/correctness/extract_concat_bits.cpp for a complete example.
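A plain C++ model of the four-argument case is sketched below. The function name is hypothetical, and the ordering is an assumption of this sketch: the text above does not say which argument supplies the least significant bits, so the model places the first argument lowest.

```cpp
#include <cstdint>

// Plain C++ model (hypothetical name) of concatenating four uint8_t values
// into one uint32_t. ASSUMPTION: the first argument supplies the least
// significant bits; the documentation above does not state the ordering.
std::uint32_t concat_bits_u8x4_model(std::uint8_t b0, std::uint8_t b1,
                                     std::uint8_t b2, std::uint8_t b3) {
    return static_cast<std::uint32_t>(b0) |
           (static_cast<std::uint32_t>(b1) << 8) |
           (static_cast<std::uint32_t>(b2) << 16) |
           (static_cast<std::uint32_t>(b3) << 24);
}
```

The result keeps the type code of the inputs (unsigned integer) with four times the bit width, which is why the number of arguments must be a power of two.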
Below is a collection of intrinsics for fixed-point programming.
Most of them can be expressed via other means, but this is more natural for some, as it avoids ghost widened intermediates that don't (or shouldn't) actually show up in codegen, and doesn't rely on pattern-matching inside the compiler to succeed to get good instruction selection.
The semantics of each call are defined in terms of non-existent 'widen' and 'narrow' operators, which stand in for casts that double or halve the bit-width of a type respectively.
Compute a + widen(b).
Compute widen(a) * widen(b).
a and b may have different signedness, in which case the result is signed.
Compute widen(a) - widen(b).
The result is always signed.
Compute saturating_narrow(widening_add(a, (1 >> min(b, 0)) / 2) << b).
When b is positive indicating a left shift, the rounding term is zero.
Compute saturating_narrow(widening_add(a, (1 << max(b, 0)) / 2) >> b).
When b is negative indicating a left shift, the rounding term is zero.
Compute saturating_narrow(shift_right(widening_mul(a, b), q)).
Compute saturating_narrow(rounding_shift_right(widening_mul(a, b), q)).
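The formulas above can be traced by hand with a plain C++ model for 8-bit unsigned operands. This is a sketch only: every function name is hypothetical, 'widen' is taken to mean uint8_t to uint16_t, only right shifts with 0 <= shift <= 16 are modelled, and the intermediate arithmetic is done in 32 bits so it cannot overflow.

```cpp
#include <algorithm>
#include <cstdint>

// Plain C++ models (hypothetical names) for uint8_t operands.

// widen(a) * widen(b).
std::uint16_t widening_mul_u8_model(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint16_t>(static_cast<unsigned>(a) * b);
}

// widen(a) - widen(b); the result is signed even for unsigned inputs.
std::int16_t widening_sub_u8_model(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::int16_t>(static_cast<int>(a) - static_cast<int>(b));
}

// Clamp a wide value into the uint8_t range.
std::uint8_t saturating_narrow_model(std::uint32_t wide) {
    return static_cast<std::uint8_t>(std::min<std::uint32_t>(wide, 255u));
}

// (a + (1 << b) / 2) >> b, computed in 32 bits; requires 0 <= b <= 16.
std::uint32_t rounding_shift_right_wide_model(std::uint32_t a, int b) {
    std::uint32_t rounding_term = (std::uint32_t{1} << b) / 2;  // zero when b == 0
    return (a + rounding_term) >> b;
}

// saturating_narrow(widening_add(a, (1 << max(b, 0)) / 2) >> b).
std::uint8_t rounding_shift_right_u8_model(std::uint8_t a, int b) {
    return saturating_narrow_model(rounding_shift_right_wide_model(a, b));
}

// saturating_narrow(rounding_shift_right(widening_mul(a, b), q)).
std::uint8_t rounding_mul_shift_right_u8_model(std::uint8_t a, std::uint8_t b, int q) {
    return saturating_narrow_model(
        rounding_shift_right_wide_model(widening_mul_u8_model(a, b), q));
}
```

For instance, 200 * 200 = 40000 shifted right by 8 with rounding gives 156, and 255 * 255 shifted right by 4 exceeds the 8-bit range and saturates to 255.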
Expr Halide::target_arch_is(Target::Arch arch)
Return a boolean Expr for the corresponding field of the Target being used during lowering. These can be useful in writing library code without having to plumb a Target through call sites.
Note that this doesn't do any checking at runtime to verify that the Target is valid for the current hardware configuration.
Expr Halide::target_os_is(Target::OS os)
Expr Halide::target_has_feature(Target::Feature feat)
Expr Halide::target_bits()
Return the bit width of the Target used during lowering. This can be useful in writing library code without having to plumb a Target through call sites.
Note that this doesn't do any checking at runtime to verify that the Target is valid for the current hardware configuration.
Return the natural vector width for the given Type for the Target being used during lowering. This can be useful in writing library code without having to plumb a Target through call sites.
Note that this doesn't do any checking at runtime to verify that the Target is valid for the current hardware configuration.
Expr Halide::target_natural_vector_size()
Definition at line 1738 of file IROperator.h.
References target_natural_vector_size(), and type_of().
Referenced by target_natural_vector_size().
std::ostream & Halide::operator<<(std::ostream & stream, const DeviceAPI &)
Emit a halide device api type in human-readable form.
std::ostream & Halide::operator<<(std::ostream & stream, const MemoryType &)
Emit a halide memory type in human-readable form.
std::ostream & Halide::operator<<(std::ostream & stream, const TailStrategy &)
Emit a halide tail strategy in human-readable form.
std::ostream & Halide::operator<<(std::ostream & stream, const Partition &)
Emit a halide loop partitioning policy in human-readable form.
std::ostream & Halide::operator<<(std::ostream & stream, const LoopLevel &)
Create a zero-dimensional halide function that returns the given expression.
The function may have more dimensions if the expression contains implicit arguments.
Referenced by Halide::BoundaryConditions::Internal::func_like_to_func().
Create a 1-D halide function in the first argument that returns the second argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
Create a 2-D halide function in the first two arguments that returns the last argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
Create a 3-D halide function in the first three arguments that returns the last argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
Create a 4-D halide function in the first four arguments that returns the last argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
Func Halide::lambda(const Var & x,
                    const Var & y,
                    const Var & z,
                    const Var & w,
                    const Var & v,
                    const Expr & e)
Create a 5-D halide function in the first five arguments that returns the last argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
std::unique_ptr< llvm::Module > Halide::compile_module_to_llvm_module(const Module & module, llvm::LLVMContext & context)
Generate an LLVM module.
std::unique_ptr< llvm::raw_fd_ostream > Halide::make_raw_fd_ostream(const std::string & filename)
Construct an llvm output stream for writing to files.
void Halide::compile_llvm_module_to_object(llvm::Module & module, Internal::LLVMOStream & out)
Compile an LLVM module to native targets (objects, native assembly).
void Halide::compile_llvm_module_to_assembly(llvm::Module & module, Internal::LLVMOStream & out)
void Halide::compile_llvm_module_to_llvm_bitcode(llvm::Module & module, Internal::LLVMOStream & out)
Compile an LLVM module to LLVM targets (bitcode, LLVM assembly).
void Halide::compile_llvm_module_to_llvm_assembly(llvm::Module & module, Internal::LLVMOStream & out)
void Halide::create_static_library(const std::vector< std::string > & src_files,
                                   const Target & target,
                                   const std::string & dst_file,
                                   bool deterministic = true)
Concatenate the list of src_files into dst_file, using the appropriate static library format for the given target (e.g., .a or .lib).
If deterministic is true, emit 0 for all GID/UID/timestamps, and 0644 for all modes (equivalent to the ar -D option).
Link a set of modules together into one module.
void Halide::compile_standalone_runtime(const std::string & object_filename, const Target & t)
Create an object file containing the Halide runtime for a given target.
For use with Target::NoRuntime. Standalone runtimes are only compatible with pipelines compiled by the same build of Halide used to call this function.
Referenced by Halide::SimdOpCheckTest::main().
std::map< OutputFileType, std::string > Halide::compile_standalone_runtime(const std::map< OutputFileType, std::string > & output_files, const Target & t)
Create an object and/or static library file containing the Halide runtime for a given target.
For use with Target::NoRuntime. Standalone runtimes are only compatible with pipelines compiled by the same build of Halide used to call this function. Return a map with just the actual outputs filled in (typically, OutputFileType::object and/or OutputFileType::static_library).
void Halide::compile_multitarget(const std::string & fn_name,
                                 const std::map< OutputFileType, std::string > & output_files,
                                 const std::vector< Target > & targets,
                                 const std::vector< std::string > & suffixes,
                                 const ModuleFactory & module_factory,
                                 const CompilerLoggerFactory & compiler_logger_factory = nullptr)
Returns an Expr corresponding to the user context passed to the function (if any).
It is rare that this function is necessary (e.g. to pass the user context to an extern function written in C).
Definition at line 329 of file Param.h.
References Handle(), and Halide::Internal::Variable::make().
std::ostream & Halide::operator<<(std::ostream & stream, const RVar &)
Emit an RVar in a human-readable form.
std::ostream & Halide::operator<<(std::ostream & stream, const RDom &)
Emit an RDom in a human-readable form.
void Halide::serialize_pipeline(const Pipeline & pipeline,
                                std::vector< uint8_t > & data,
                                std::map< std::string, Parameter > & params)
Serialize a Halide pipeline into the given data buffer.
Parameters:
    pipeline: The Halide pipeline to serialize.
    data: The data buffer to store the serialized Halide pipeline into. Any existing contents will be destroyed.
    params: Map of named parameters which will get populated during serialization (can be used to bind external parameters to objects in the pipeline by name).
void Halide::serialize_pipeline(const Pipeline & pipeline, const std::string & filename)
void Halide::serialize_pipeline(const Pipeline & pipeline,
                                const std::string & filename,
                                std::map< std::string, Parameter > & params)
Serialize a Halide pipeline into the given filename.
Parameters:
    pipeline: The Halide pipeline to serialize.
    filename: The location of the file to write the serialized pipeline into. Any existing contents will be destroyed.
    params: Map of named parameters which will get populated during serialization (can be used to bind external parameters to objects in the pipeline by name).
Target Halide::get_host_target()
Return the target corresponding to the host machine.
Referenced by Halide::SimdOpCheckTest::can_run_code(), and Halide::SimdOpCheckTest::main().
Target Halide::get_target_from_environment()
Return the target that Halide will use.
If HL_TARGET is set, it uses that. Otherwise it calls get_host_target.
Target Halide::get_jit_target_from_environment()
Return the target that Halide will use for jit-compilation.
If HL_JIT_TARGET is set it uses that. Otherwise calls get_host_target. Throws an error if the architecture, bit width, and OS of the target do not match the host target, so this is only useful for controlling the feature set.
Referenced by Halide::Internal::schedule_scalar().
Target::Feature Halide::target_feature_for_device_api(DeviceAPI api)
Get the Target feature corresponding to a DeviceAPI.
For device APIs that do not correspond to any single target feature, returns Target::FeatureEnd.
References Internal.
Construct a signed integer type.
Definition at line 541 of file Type.h.
References Halide::Type::Int.
Referenced by Halide::SimdOpCheckTest::check_one(), Halide::Internal::equal(), Halide::ConciseCasts::i16(), Halide::ConciseCasts::i16_sat(), Halide::ConciseCasts::i32(), Halide::ConciseCasts::i32_sat(), Halide::ConciseCasts::i64(), Halide::ConciseCasts::i64_sat(), Halide::ConciseCasts::i8(), Halide::ConciseCasts::i8_sat(), and Halide::NamesInterface::Int().
Construct an unsigned integer type.
Definition at line 546 of file Type.h.
References Halide::Type::UInt.
Referenced by Bool(), Halide::SimdOpCheckTest::check_one(), Halide::ConciseCasts::u16(), Halide::ConciseCasts::u16_sat(), Halide::ConciseCasts::u32(), Halide::ConciseCasts::u32_sat(), Halide::ConciseCasts::u64(), Halide::ConciseCasts::u64_sat(), Halide::ConciseCasts::u8(), Halide::ConciseCasts::u8_sat(), and Halide::NamesInterface::UInt().
Construct a floating-point type.
Definition at line 551 of file Type.h.
References Halide::Type::Float.
Referenced by Halide::SimdOpCheckTest::check_one(), Halide::ConciseCasts::f16(), Halide::ConciseCasts::f32(), Halide::ConciseCasts::f64(), and Halide::NamesInterface::Float().
Construct a floating-point type in the bfloat format.
Only 16-bit currently supported.
Definition at line 556 of file Type.h.
References Halide::Type::BFloat.
Referenced by Halide::ConciseCasts::bf16(), and Halide::SimdOpCheckTest::check_one().
|
inline |
Construct a boolean type.
Definition at line 561 of file Type.h.
References UInt().
Referenced by Halide::NamesInterface::Bool().
|
inline |
Construct a handle type.
Definition at line 566 of file Type.h.
References Halide::Type::Handle.
Referenced by user_context_value().
|
inline |
Construct the Halide equivalent of a C type.
Definition at line 572 of file Type.h.
Referenced by cast(), Halide::Internal::check_types(), Halide::Internal::div_imp(), evaluate(), evaluate_may_gpu(), Halide::ExternSignature::ExternSignature(), extract_bits(), Halide::Internal::mod_imp(), Halide::Target::natural_vector_size(), Halide::Internal::GeneratorParamImpl< T >::operator Expr(), Halide::Param< T >::operator=(), Halide::Param< T >::Param(), reinterpret(), saturating_cast(), Halide::Parameter::scalar(), Halide::Parameter::set_scalar(), Halide::Param< T >::static_type(), target_natural_vector_size(), undef(), and Halide::Internal::unreachable().
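To illustrate what "the Halide equivalent of a C type" means, here is a small self-contained model of the mapping. It is a sketch under stated assumptions, not Halide's code: MiniType, TypeCode and mini_type_of are hypothetical stand-ins for Halide::Type and its type-construction helpers. It assumes the conventions that a boolean is a 1-bit unsigned integer (consistent with the boolean constructor above referencing UInt()) and that handles are 64 bits wide.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical, simplified stand-in for Halide::Type (illustration only).
enum class TypeCode { Int, UInt, Float, Handle };

struct MiniType {
    TypeCode code;
    int bits;
};

// Map a C/C++ scalar type to a (code, bits) pair at compile time.
// Intended for fixed-width integers, float, double, bool, and pointers.
template<typename T>
constexpr MiniType mini_type_of() {
    if constexpr (std::is_same_v<T, bool>) {
        return {TypeCode::UInt, 1};  // assumption: bool is a 1-bit unsigned int
    } else if constexpr (std::is_pointer_v<T>) {
        return {TypeCode::Handle, 64};  // assumption: handles are always 64-bit
    } else if constexpr (std::is_floating_point_v<T>) {
        return {TypeCode::Float, static_cast<int>(sizeof(T) * 8)};
    } else if constexpr (std::is_signed_v<T>) {
        return {TypeCode::Int, static_cast<int>(sizeof(T) * 8)};
    } else {
        static_assert(std::is_unsigned_v<T>, "unsupported type in this sketch");
        return {TypeCode::UInt, static_cast<int>(sizeof(T) * 8)};
    }
}
```

In this model, mini_type_of<float>() plays the role that a call such as Float(32) plays in the constructors listed above, and mini_type_of<std::uint8_t>() the role of UInt(8).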
std::string Halide::type_to_c_type | ( | Type | type, |
bool | include_space, | ||
bool | c_plus_plus = true ) |
Convert a Halide type to the name of the equivalent C or C++ type.
void Halide::load_plugin | ( | const std::string & | lib_name | ) |
Load a plugin in the form of a dynamic library (e.g. for custom autoschedulers). If the string doesn't contain any . characters, the proper prefix and/or suffix for the platform will be added:
foo -> libfoo.so (Linux/OSX/etc; note that .dylib is not supported)
foo -> foo.dll (Windows)
Otherwise, it is assumed to be an appropriate pathname.
Any error in loading will assert-fail.
References Internal.
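The naming rule for load_plugin can be modelled in a few lines. This is a sketch of the rule as documented, not the library's implementation: resolve_plugin_name and the Platform enum are hypothetical names, and the platform is passed explicitly so both cases can be exercised on any machine.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Halide) modelling the documented rule.
enum class Platform { Posix, Windows };

std::string resolve_plugin_name(const std::string &lib_name, Platform platform) {
    assert(!lib_name.empty());
    // A name containing '.' is taken to be a pathname and used unchanged.
    if (lib_name.find('.') != std::string::npos) {
        return lib_name;
    }
    // Otherwise add the platform's conventional prefix and/or suffix.
    if (platform == Platform::Windows) {
        return lib_name + ".dll";
    }
    return "lib" + lib_name + ".so";  // also used on OSX; .dylib is not supported
}
```

One consequence of the rule is that any relative path such as ./foo already contains a . character, so it is passed through untouched rather than being decorated.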
void Halide::set_compiler_stack_size | ( | size_t | ) |
Set how much stack, in bytes, the compiler should use for compilation.
This can also be set through the environment variable HL_COMPILER_STACK_SIZE, though this function takes precedence. A value of zero causes the compiler to just use the calling stack for all compilation tasks.
Calling this or setting the environment variable should not be necessary. It is provided for three kinds of testing:
First, Halide uses it in our internal tests to make sure we're not using an excessive amount of stack on some canary programs, to avoid stack usage regressions.
Second, if you have a mysterious crash inside a generator, you can set a larger stack size as a way to test if it's a stack overflow. Perhaps our default stack size is not large enough for your program and schedule. Use this call or the environment var as a workaround, and then open a bug with a reproducer at github.com/halide/Halide/issues so that we can determine what's going wrong that is causing your code to use so much stack.
Third, perhaps using a side-stack is causing problems with sanitizing, debugging, or profiling tools. If this is a problem, you can set HL_COMPILER_STACK_SIZE to zero to make Halide stay on the main thread's stack.
size_t Halide::get_compiler_stack_size | ( | ) |
Return how much stack size the compiler should use for calls that go through run_with_large_stack below.
Currently that's lowering and codegen. If no call to set_compiler_stack_size has been made, this checks the value of the environment variable HL_COMPILER_STACK_SIZE. If that's unset, it returns default_compiler_stack_size, defined above.
References Internal.
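The precedence between the explicit setter, the environment variable, and the default can be sketched as a pure function. This is an illustrative model only: resolve_stack_size and kAssumedDefaultStackSize are hypothetical, the placeholder default is not Halide's actual default_compiler_stack_size, and parsing the variable as a decimal byte count is an assumption.

```cpp
#include <cstddef>
#include <cstdlib>
#include <optional>

// Placeholder value for illustration; the real default is not stated here.
constexpr std::size_t kAssumedDefaultStackSize = 8 * 1024 * 1024;

// explicit_size: value passed to set_compiler_stack_size, if it was called.
// env_value: contents of HL_COMPILER_STACK_SIZE, or nullptr when unset.
std::size_t resolve_stack_size(std::optional<std::size_t> explicit_size,
                               const char *env_value) {
    if (explicit_size) {
        return *explicit_size;  // the function call takes precedence
    }
    if (env_value != nullptr && *env_value != '\0') {
        // Assumption: the variable holds a decimal number of bytes.
        return static_cast<std::size_t>(std::strtoull(env_value, nullptr, 10));
    }
    return kAssumedDefaultStackSize;  // neither source provided a value
}
```

A caller would pass std::getenv("HL_COMPILER_STACK_SIZE") as the second argument. Note that an explicit value of zero is still an explicit value: it is returned as-is, which corresponds to staying on the calling stack.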
|
inline |
Definition at line 10 of file fuzz_helpers.h.
const int Halide::head1_channels = 8 |
Definition at line 7 of file NetworkSize.h.
const int Halide::head1_w = 40 |
Definition at line 7 of file NetworkSize.h.
const int Halide::head1_h = 7 |
Definition at line 7 of file NetworkSize.h.
const int Halide::head2_channels = 24 |
Definition at line 8 of file NetworkSize.h.
const int Halide::head2_w = 39 |
Definition at line 8 of file NetworkSize.h.
const int Halide::conv1_channels = 32 |
Definition at line 9 of file NetworkSize.h.
const DeviceAPI Halide::all_device_apis[] |
An array containing all the device apis.
Useful for iterating through them.
Definition at line 31 of file DeviceAPI.h.