Halide
This file defines the class FunctionDAG, which is our representation of a Halide pipeline, and contains methods for using Halide's bounds tools to query properties of it. More...
Namespaces | |
namespace | BoundaryConditions |
namespace to hold functions for imposing boundary conditions on Halide Funcs. | |
namespace | ConciseCasts |
namespace | Internal |
namespace | PythonBindings |
namespace | PyTorch |
namespace | Runtime |
Classes | |
struct | Argument |
A struct representing an argument to a halide-generated function. More... | |
struct | ArgumentEstimates |
struct | AutoschedulerParams |
Specifies the Autoscheduler to be used (if any), along with arbitrary additional arguments specific to the given Autoscheduler. More... | |
struct | AutoSchedulerResults |
struct | bfloat16_t |
Class that provides a type that implements half precision floating point using the bfloat16 format. More... | |
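As background for the bfloat16 format named above: a bfloat16 value keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE 754 binary32 value. The sketch below illustrates that layout with a truncating conversion. The function names are hypothetical, truncation is an assumption made for brevity, and this is not Halide's bfloat16_t implementation, which may round differently.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: keep the upper 16 bits of a binary32 value.
// Truncation is an assumption made for brevity; real conversions usually round.
std::uint16_t float_to_bf16_truncate(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined way to read the bit pattern
    return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widen back by zero-filling the dropped mantissa bits.
float bf16_to_float(std::uint16_t h) {
    std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Values whose mantissa fits in 7 bits, such as 1.0f or -2.5f, survive the round trip exactly; other values lose their low mantissa bits.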
class | Buffer |
A Halide::Buffer is a named shared reference to a Halide::Runtime::Buffer. More... | |
class | Callable |
struct | CompileError |
An error that occurs while compiling a Halide pipeline that Halide attributes to a user error. More... | |
class | CompileTimeErrorReporter |
CompileTimeErrorReporter is used at compile time (not runtime) when an error or warning is generated by Halide. More... | |
class | CostModel |
struct | CustomLoweringPass |
A custom lowering pass. More... | |
class | DefaultCostModel |
class | Derivative |
Helper structure storing the adjoints Func. More... | |
struct | Error |
A base class for Halide errors. More... | |
class | EvictionKey |
Helper class for identifying purpose of an Expr passed to memoize. More... | |
struct | Expr |
A fragment of Halide syntax. More... | |
struct | ExprCompare |
This lets you use an Expr as a key in a map of the form map<Expr, Foo, ExprCompare> More... | |
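The map<Expr, Foo, ExprCompare> pattern mentioned above is the standard three-argument form of std::map, where the third template argument supplies the key ordering. The sketch below shows that pattern with stand-in types; NodeHandle, NodeCompare, and make_labels are hypothetical names, and ordering by an integer id is an assumption, not a description of how ExprCompare orders Exprs.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an expression handle; not Halide's Expr.
struct NodeHandle {
    int id;
};

// Strict weak ordering on the handle, playing the role of ExprCompare.
struct NodeCompare {
    bool operator()(const NodeHandle &a, const NodeHandle &b) const {
        return a.id < b.id;
    }
};

// Hypothetical usage: attach a label to each handle.
std::map<NodeHandle, std::string, NodeCompare> make_labels() {
    std::map<NodeHandle, std::string, NodeCompare> labels;
    labels[NodeHandle{2}] = "two";
    labels[NodeHandle{1}] = "one";
    labels[NodeHandle{2}] = "two again";  // same key under NodeCompare, so this overwrites
    return labels;
}
```

Two keys count as equal when neither compares less than the other, which is why the third assignment replaces the first instead of adding an entry.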
struct | ExternCFunction |
struct | ExternFuncArgument |
An argument to an extern-defined Func. More... | |
struct | ExternSignature |
struct | float16_t |
Class that provides a type that implements half precision floating point (IEEE754 2008 binary16) in software. More... | |
class | Func |
A halide function. More... | |
class | FuncRef |
A fragment of front-end syntax of the form f(x, y, z), where x, y, z are Vars or Exprs. More... | |
class | FuncTupleElementRef |
A fragment of front-end syntax of the form f(x, y, z)[index], where x, y, z are Vars or Exprs. More... | |
struct | FuseLoopLevel |
class | Generator |
class | GeneratorContext |
GeneratorContext is a class that is used when using Generators (or Stubs) directly; it is used to allow the outer context (typically, either a Generator or "top-level" code) to specify certain information to the inner context to ensure that inner and outer Generators are compiled in a compatible way. More... | |
class | GeneratorInput |
class | GeneratorOutput |
class | GeneratorParam |
GeneratorParam is a templated class that can be used to modify the behavior of the Generator at code-generation time. More... | |
class | ImageParam |
An Image parameter to a halide pipeline. More... | |
struct | ImplicitVar |
struct | InternalError |
An error that occurs while compiling a Halide pipeline that Halide attributes to an internal compiler bug, or to an invalid use of Halide's internals. More... | |
struct | JITExtern |
struct | JITHandlers |
A set of custom overrides of runtime functions. More... | |
struct | JITUserContext |
A context to be passed to Pipeline::realize. More... | |
class | LoopLevel |
A reference to a site in a Halide statement at the top of the body of a particular for loop. More... | |
class | Module |
A halide module. More... | |
class | NamesInterface |
class | OutputImageParam |
A handle on the output buffer of a pipeline. More... | |
class | Param |
A scalar parameter to a halide pipeline. More... | |
class | ParamMap |
class | Pipeline |
A class representing a Halide pipeline. More... | |
struct | Range |
A single-dimensional span. More... | |
class | RDom |
A multi-dimensional domain over which to iterate. More... | |
class | Realization |
A Realization is a vector of references to existing Buffer objects. More... | |
struct | RuntimeError |
An error that occurs while running a JIT-compiled Halide pipeline. More... | |
class | RVar |
A reduction variable represents a single dimension of a reduction domain (RDom). More... | |
class | SimdOpCheckTest |
class | Stage |
A single definition of a Func. More... | |
struct | Target |
A struct representing a target machine and OS to generate code for. More... | |
struct | Task |
struct | TestResult |
class | Tuple |
Create a small array of Exprs for defining and calling functions with multiple outputs. More... | |
struct | Type |
Types in the halide type system. More... | |
class | Var |
A Halide variable, to be used when defining functions. More... | |
struct | VarOrRVar |
A class that can represent Vars or RVars. More... | |
Typedefs | |
using | GeneratorParamsMap = std::map< std::string, std::string > |
typedef std::vector< Range > | Region |
A multi-dimensional box. | |
typedef Stage | ScheduleHandle |
using | MetadataNameMap = std::map< std::string, std::string > |
using | ModuleFactory = std::function< Module(const std::string &fn_name, const Target &target)> |
using | CompilerLoggerFactory = std::function< std::unique_ptr< Internal::CompilerLogger >(const std::string &fn_name, const Target &target)> |
using | AutoSchedulerFn = std::function< void(const Pipeline &, const Target &, const AutoschedulerParams &, AutoSchedulerResults *outputs)> |
Enumerations | |
enum class | DeviceAPI { None , Host , Default_GPU , CUDA , OpenCL , OpenGLCompute , Metal , Hexagon , HexagonDma , D3D12Compute } |
An enum describing a type of device API. More... | |
enum class | MemoryType { Auto , Heap , Stack , Register , GPUShared , GPUTexture , LockedCache , VTCM , AMXTile } |
An enum describing different address spaces to be used with Func::store_in. More... | |
enum class | NameMangling { Default , C , CPlusPlus } |
An enum to specify calling convention for extern stages. More... | |
enum class | OutputFileType { assembly , bitcode , c_header , c_source , compiler_log , cpp_stub , featurization , llvm_assembly , object , python_extension , pytorch_wrapper , registration , schedule , static_library , stmt , stmt_html } |
Enums specifying various kinds of outputs that can be produced from a Halide Pipeline. More... | |
enum class | LinkageType { External , ExternalPlusMetadata , ExternalPlusArgv , Internal } |
Type of linkage a function in a lowered Halide module can have. More... | |
enum | StmtOutputFormat { Text , HTML } |
Used to determine if the output printed to file should be as a normal string or as an HTML file which can be opened in a browser and manipulated via JS and CSS. More... | |
enum class | PrefetchBoundStrategy { Clamp , GuardWithIf , NonFaulting } |
Different ways to handle accesses outside the original extents in a prefetch. More... | |
enum class | TailStrategy { RoundUp , GuardWithIf , Predicate , PredicateLoads , PredicateStores , ShiftInwards , Auto } |
Different ways to handle a tail case in a split when the factor does not provably divide the extent. More... | |
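To make the tail case concrete, the sketch below lists which indices a split loop would visit when the factor does not divide the extent. It models only three of the strategies, under the reading that RoundUp evaluates past the original extent, GuardWithIf skips out-of-range iterations, and ShiftInwards moves the last block back so that it ends at the extent. The enum Tail and the function visited are hypothetical and are not part of Halide.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model of three tail strategies; not Halide's TailStrategy.
enum class Tail { RoundUp, GuardWithIf, ShiftInwards };

// Returns the indices visited when a loop of `extent` is split by `factor`.
std::vector<int> visited(int extent, int factor, Tail t) {
    assert(factor > 0 && extent >= factor);
    std::vector<int> out;
    int outer = (extent + factor - 1) / factor;  // number of blocks, rounded up
    for (int o = 0; o < outer; ++o) {
        int base = o * factor;
        if (t == Tail::ShiftInwards) {
            base = std::min(base, extent - factor);  // last block slides back inside
        }
        for (int i = 0; i < factor; ++i) {
            int x = base + i;
            if (t == Tail::GuardWithIf && x >= extent) continue;  // guarded iteration
            out.push_back(x);
        }
    }
    return out;
}
```

For an extent of 10 and a factor of 4, this model visits 0..11 under RoundUp, 0..9 under GuardWithIf, and 0..9 with indices 6 and 7 visited twice under ShiftInwards.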
enum class | LoopAlignStrategy { AlignStart , AlignEnd , NoAlign , Auto } |
Different ways to handle the case when the start/end of the loops of stages computed with (fused) are not aligned. More... | |
Functions | |
std::unique_ptr< DefaultCostModel > | make_default_cost_model (const std::string &weights_in_dir="", const std::string &weights_out_dir="", bool randomize_weights=false) |
std::unique_ptr< llvm::Module > | codegen_llvm (const Module &module, llvm::LLVMContext &context) |
Given a Halide module, generate an llvm::Module. | |
std::ostream & | operator<< (std::ostream &stream, const Expr &) |
Emit an expression on an output stream (such as std::cout) in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const Type &) |
Emit a halide type on an output stream (such as std::cout) in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const Module &) |
Emit a halide Module on an output stream (such as std::cout) in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const Target &) |
Emit a halide Target in a human readable form. | |
Derivative | propagate_adjoints (const Func &output, const Func &adjoint, const Region &output_bounds) |
Given a Func and a corresponding adjoint, (back)propagate the adjoint to all dependent Funcs, buffers, and parameters. | |
Derivative | propagate_adjoints (const Func &output, const Buffer< float > &adjoint) |
Given a Func and a corresponding adjoint buffer, (back)propagate the adjoint to all dependent Funcs, buffers, and parameters. | |
Derivative | propagate_adjoints (const Func &output) |
Given a scalar Func with size 1, (back)propagate the gradient to all dependent Funcs, buffers, and parameters. | |
const halide_device_interface_t * | get_device_interface_for_device_api (DeviceAPI d, const Target &t=get_jit_target_from_environment(), const char *error_site=nullptr) |
Gets the appropriate halide_device_interface_t * for a DeviceAPI. | |
DeviceAPI | get_default_device_api_for_target (const Target &t) |
Get the specific DeviceAPI that Halide would select when presented with DeviceAPI::Default_GPU for a given target. | |
bool | host_supports_target_device (const Target &t) |
This attempts to sniff whether a given Target (and its implied DeviceAPI) is usable on the current host. | |
bool | exceptions_enabled () |
Query whether Halide was compiled with exceptions. | |
void | set_custom_compile_time_error_reporter (CompileTimeErrorReporter *error_reporter) |
The default error reporter logs to stderr, then throws an exception (if HALIDE_WITH_EXCEPTIONS) or calls abort (if not). | |
Expr | fast_integer_divide (const Expr &numerator, const Expr &denominator) |
Integer division by small values can be done exactly as multiplies and shifts. | |
Expr | fast_integer_divide_round_to_zero (const Expr &numerator, const Expr &denominator) |
A variant of the above which rounds towards zero instead of rounding towards negative infinity. | |
Expr | fast_integer_modulo (const Expr &numerator, const Expr &denominator) |
Use the fast integer division tables to implement a modulo operation via the Euclidean identity: a % b = a - (a/b)*b. | |
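The three entries above differ only in rounding direction and in how the remainder is derived from the quotient. The sketch below works through that arithmetic in plain C++ with ordinary division; it does not use the multiply-and-shift tables, and the helper names floor_div, trunc_div, and mod_via_identity are hypothetical.

```cpp
#include <cassert>

// Quotient rounded toward negative infinity.
int floor_div(int a, int b) {
    assert(b != 0);
    int q = a / b;  // C++ integer division truncates toward zero
    if ((a % b != 0) && ((a < 0) != (b < 0))) {
        --q;  // operands have opposite signs and division was inexact: step down
    }
    return q;
}

// Quotient rounded toward zero (the round_to_zero variant).
int trunc_div(int a, int b) {
    assert(b != 0);
    return a / b;
}

// Remainder via the identity a % b = a - (a/b)*b, using the floor quotient.
int mod_via_identity(int a, int b) {
    return a - floor_div(a, b) * b;
}
```

For a = -7 and b = 2, the floor quotient is -4 and the truncating quotient is -3, so the identity yields a remainder of 1 with the floor quotient.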
Expr | min (const FuncRef &a, const FuncRef &b) |
Explicit overloads of min and max for FuncRef. | |
Expr | max (const FuncRef &a, const FuncRef &b) |
template<typename T > | |
HALIDE_NO_USER_CODE_INLINE T | evaluate (JITUserContext *ctx, const Expr &e) |
JIT-Compile and run enough code to evaluate a Halide expression. | |
template<typename T > | |
HALIDE_NO_USER_CODE_INLINE T | evaluate (const Expr &e) |
evaluate with a default user context | |
template<typename First , typename... Rest> | |
HALIDE_NO_USER_CODE_INLINE void | evaluate (JITUserContext *ctx, Tuple t, First first, Rest &&...rest) |
JIT-compile and run enough code to evaluate a Halide Tuple. | |
template<typename First , typename... Rest> | |
HALIDE_NO_USER_CODE_INLINE void | evaluate (Tuple t, First first, Rest &&...rest) |
JIT-compile and run enough code to evaluate a Halide Tuple. | |
template<typename T > | |
HALIDE_NO_USER_CODE_INLINE T | evaluate_may_gpu (const Expr &e) |
JIT-Compile and run enough code to evaluate a Halide expression. | |
template<typename First , typename... Rest> | |
HALIDE_NO_USER_CODE_INLINE void | evaluate_may_gpu (Tuple t, First first, Rest &&...rest) |
JIT-compile and run enough code to evaluate a Halide Tuple. | |
template<typename Other , typename T > | |
auto | operator+ (const Other &a, const GeneratorParam< T > &b) -> decltype(a+(T) b) |
Addition between GeneratorParam<T> and any type that supports operator+ with T. | |
template<typename Other , typename T > | |
auto | operator+ (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a+b) |
template<typename Other , typename T > | |
auto | operator- (const Other &a, const GeneratorParam< T > &b) -> decltype(a -(T) b) |
Subtraction between GeneratorParam<T> and any type that supports operator- with T. | |
template<typename Other , typename T > | |
auto | operator- (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a - b) |
template<typename Other , typename T > | |
auto | operator* (const Other &a, const GeneratorParam< T > &b) -> decltype(a *(T) b) |
Multiplication between GeneratorParam<T> and any type that supports operator* with T. | |
template<typename Other , typename T > | |
auto | operator* (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a *b) |
template<typename Other , typename T > | |
auto | operator/ (const Other &a, const GeneratorParam< T > &b) -> decltype(a/(T) b) |
Division between GeneratorParam<T> and any type that supports operator/ with T. | |
template<typename Other , typename T > | |
auto | operator/ (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a/b) |
template<typename Other , typename T > | |
auto | operator% (const Other &a, const GeneratorParam< T > &b) -> decltype(a %(T) b) |
Modulo between GeneratorParam<T> and any type that supports operator% with T. | |
template<typename Other , typename T > | |
auto | operator% (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a % b) |
template<typename Other , typename T > | |
auto | operator> (const Other &a, const GeneratorParam< T > &b) -> decltype(a >(T) b) |
Greater than comparison between GeneratorParam<T> and any type that supports operator> with T. | |
template<typename Other , typename T > | |
auto | operator> (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a > b) |
template<typename Other , typename T > | |
auto | operator< (const Other &a, const GeneratorParam< T > &b) -> decltype(a<(T) b) |
Less than comparison between GeneratorParam<T> and any type that supports operator< with T. | |
template<typename Other , typename T > | |
auto | operator< (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a< b) |
template<typename Other , typename T > | |
auto | operator>= (const Other &a, const GeneratorParam< T > &b) -> decltype(a >=(T) b) |
Greater than or equal comparison between GeneratorParam<T> and any type that supports operator>= with T. | |
template<typename Other , typename T > | |
auto | operator>= (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a >=b) |
template<typename Other , typename T > | |
auto | operator<= (const Other &a, const GeneratorParam< T > &b) -> decltype(a<=(T) b) |
Less than or equal comparison between GeneratorParam<T> and any type that supports operator<= with T. | |
template<typename Other , typename T > | |
auto | operator<= (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a<=b) |
template<typename Other , typename T > | |
auto | operator== (const Other &a, const GeneratorParam< T > &b) -> decltype(a==(T) b) |
Equality comparison between GeneratorParam<T> and any type that supports operator== with T. | |
template<typename Other , typename T > | |
auto | operator== (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a==b) |
template<typename Other , typename T > | |
auto | operator!= (const Other &a, const GeneratorParam< T > &b) -> decltype(a !=(T) b) |
Inequality comparison between GeneratorParam<T> and any type that supports operator!= with T. | |
template<typename Other , typename T > | |
auto | operator!= (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a !=b) |
template<typename Other , typename T > | |
auto | operator&& (const Other &a, const GeneratorParam< T > &b) -> decltype(a &&(T) b) |
Logical and between GeneratorParam<T> and any type that supports operator&& with T. | |
template<typename Other , typename T > | |
auto | operator&& (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a &&b) |
template<typename T > | |
auto | operator&& (const GeneratorParam< T > &a, const GeneratorParam< T > &b) -> decltype((T) a &&(T) b) |
template<typename Other , typename T > | |
auto | operator|| (const Other &a, const GeneratorParam< T > &b) -> decltype(a||(T) b) |
Logical or between GeneratorParam<T> and any type that supports operator|| with T. | |
template<typename Other , typename T > | |
auto | operator|| (const GeneratorParam< T > &a, const Other &b) -> decltype((T) a||b) |
template<typename T > | |
auto | operator|| (const GeneratorParam< T > &a, const GeneratorParam< T > &b) -> decltype((T) a||(T) b) |
template<typename Other , typename T > | |
auto | min (const Other &a, const GeneratorParam< T > &b) -> decltype(Internal::GeneratorMinMax::min_forward(a, b)) |
Compute minimum between GeneratorParam<T> and any type that supports min with T. | |
template<typename Other , typename T > | |
auto | min (const GeneratorParam< T > &a, const Other &b) -> decltype(Internal::GeneratorMinMax::min_forward(a, b)) |
template<typename Other , typename T > | |
auto | max (const Other &a, const GeneratorParam< T > &b) -> decltype(Internal::GeneratorMinMax::max_forward(a, b)) |
Compute the maximum value between GeneratorParam<T> and any type that supports max with T. | |
template<typename Other , typename T > | |
auto | max (const GeneratorParam< T > &a, const Other &b) -> decltype(Internal::GeneratorMinMax::max_forward(a, b)) |
template<typename T > | |
auto | operator! (const GeneratorParam< T > &a) -> decltype(!(T) a) |
Not operator for GeneratorParam. | |
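The operator overloads listed above all share one shape: convert the parameter to its underlying T, apply the operator, and let a trailing decltype pick the result type. The sketch below reproduces that shape for operator+ only, using a hypothetical wrapper named ParamSketch in place of GeneratorParam; it is an illustration of the pattern, not Halide's code.

```cpp
// Hypothetical stand-in for GeneratorParam<T>: a value that converts to T.
template <typename T>
class ParamSketch {
public:
    explicit ParamSketch(T v) : value_(v) {}
    operator T() const { return value_; }

private:
    T value_;
};

// Other + ParamSketch<T>: the result type is whatever Other + T yields.
template <typename Other, typename T>
auto operator+(const Other &a, const ParamSketch<T> &b) -> decltype(a + (T)b) {
    return a + (T)b;
}

// ParamSketch<T> + Other: the mirror image of the overload above.
template <typename Other, typename T>
auto operator+(const ParamSketch<T> &a, const Other &b) -> decltype((T)a + b) {
    return (T)a + b;
}
```

With a ParamSketch<int> holding 4, adding 3 on the right gives the int 7, while adding it to 2.5 on the left gives the double 6.5, because the result type follows the usual arithmetic conversions between Other and T.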
Callable | create_callable_from_generator (const GeneratorContext &context, const std::string &name, const GeneratorParamsMap &generator_params={}) |
Create a Generator from the currently-registered Generators, use it to create a Callable. | |
Callable | create_callable_from_generator (const Target &target, const std::string &name, const GeneratorParamsMap &generator_params={}) |
Expr | sum (Expr, const std::string &s="sum") |
An inline reduction. | |
Expr | saturating_sum (Expr, const std::string &s="saturating_sum") |
Expr | product (Expr, const std::string &s="product") |
Expr | maximum (Expr, const std::string &s="maximum") |
Expr | minimum (Expr, const std::string &s="minimum") |
Expr | sum (const RDom &, Expr, const std::string &s="sum") |
Variants of the inline reduction in which the RDom is stated explicitly. | |
Expr | saturating_sum (const RDom &r, Expr e, const std::string &s="saturating_sum") |
Expr | product (const RDom &, Expr, const std::string &s="product") |
Expr | maximum (const RDom &, Expr, const std::string &s="maximum") |
Expr | minimum (const RDom &, Expr, const std::string &s="minimum") |
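As a rough model of what the reductions above compute, the sketch below folds a list of 8-bit values in two ways. The explicit loop stands in for the reduction domain. Reading saturating_sum as "clamp at the type's maximum instead of wrapping" is an assumption, since the listing gives no description for it, and the function names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Plain sum kept in 8 bits: overflow wraps modulo 256.
std::uint8_t wrapping_sum_u8(const std::vector<std::uint8_t> &xs) {
    std::uint8_t acc = 0;
    for (std::uint8_t x : xs) {
        acc = static_cast<std::uint8_t>(acc + x);
    }
    return acc;
}

// Saturating sum: each step clamps at 255 instead of wrapping.
std::uint8_t saturating_sum_u8(const std::vector<std::uint8_t> &xs) {
    std::uint8_t acc = 0;
    for (std::uint8_t x : xs) {
        unsigned wide = static_cast<unsigned>(acc) + x;  // widen so the add cannot overflow
        acc = static_cast<std::uint8_t>(wide > 255u ? 255u : wide);
    }
    return acc;
}
```

For the inputs 200, 100, 7 the wrapping sum is 51 (307 modulo 256) and the saturating sum is 255; for small inputs such as 1, 2, 3 both return 6.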
Tuple | argmax (Expr, const std::string &s="argmax") |
Returns an Expr or Tuple representing the coordinates of the point in the RDom which minimizes or maximizes the expression. | |
Tuple | argmin (Expr, const std::string &s="argmin") |
Tuple | argmax (const RDom &, Expr, const std::string &s="argmax") |
Tuple | argmin (const RDom &, Expr, const std::string &s="argmin") |
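The argmax and argmin entries above return coordinates, not just the extreme value. A one-dimensional model of that idea is sketched below; argmax_sketch is a hypothetical name, returning the value alongside the index is a choice made here for illustration, and keeping the first index on ties is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns (index of the largest element, that element) for a non-empty list.
std::pair<std::size_t, int> argmax_sketch(const std::vector<int> &v) {
    assert(!v.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i] > v[best]) {
            best = i;  // strictly greater, so the first maximum wins on ties
        }
    }
    return {best, v[best]};
}
```

For the list 3, 9, 2, 9 this returns index 1 and value 9.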
Expr | sum (Expr, const Func &) |
Inline reductions create an anonymous helper Func to do the work. | |
Expr | saturating_sum (Expr, const Func &) |
Expr | product (Expr, const Func &) |
Expr | maximum (Expr, const Func &) |
Expr | minimum (Expr, const Func &) |
Expr | sum (const RDom &, Expr, const Func &) |
Expr | saturating_sum (const RDom &r, Expr e, const Func &) |
Expr | product (const RDom &, Expr, const Func &) |
Expr | maximum (const RDom &, Expr, const Func &) |
Expr | minimum (const RDom &, Expr, const Func &) |
Tuple | argmax (Expr, const Func &) |
Tuple | argmin (Expr, const Func &) |
Tuple | argmax (const RDom &, Expr, const Func &) |
Tuple | argmin (const RDom &, Expr, const Func &) |
template<typename T > | |
Expr | cast (Expr a) |
Cast an expression to the halide type corresponding to the C++ type T. | |
Expr | cast (Type t, Expr a) |
Cast an expression to a new type. | |
Expr | operator+ (Expr a, Expr b) |
Return the sum of two expressions, doing any necessary type coercion using Internal::match_types. | |
Expr | operator+ (Expr a, int b) |
Add an expression and a constant integer. | |
Expr | operator+ (int a, Expr b) |
Add a constant integer and an expression. | |
Expr & | operator+= (Expr &a, Expr b) |
Modify the first expression to be the sum of two expressions, without changing its type. | |
Expr | operator- (Expr a, Expr b) |
Return the difference of two expressions, doing any necessary type coercion using Internal::match_types. | |
Expr | operator- (Expr a, int b) |
Subtracts a constant integer from an expression. | |
Expr | operator- (int a, Expr b) |
Subtracts an expression from a constant integer. | |
Expr | operator- (Expr a) |
Return the negative of the argument. | |
Expr & | operator-= (Expr &a, Expr b) |
Modify the first expression to be the difference of two expressions, without changing its type. | |
Expr | operator* (Expr a, Expr b) |
Return the product of two expressions, doing any necessary type coercion using Internal::match_types. | |
Expr | operator* (Expr a, int b) |
Multiply an expression and a constant integer. | |
Expr | operator* (int a, Expr b) |
Multiply a constant integer and an expression. | |
Expr & | operator*= (Expr &a, Expr b) |
Modify the first expression to be the product of two expressions, without changing its type. | |
Expr | operator/ (Expr a, Expr b) |
Return the ratio of two expressions, doing any necessary type coercion using Internal::match_types. | |
Expr & | operator/= (Expr &a, Expr b) |
Modify the first expression to be the ratio of two expressions, without changing its type. | |
Expr | operator/ (Expr a, int b) |
Divides an expression by a constant integer. | |
Expr | operator/ (int a, Expr b) |
Divides a constant integer by an expression. | |
Expr | operator% (Expr a, Expr b) |
Return the first argument reduced modulo the second, doing any necessary type coercion using Internal::match_types. | |
Expr | operator% (Expr a, int b) |
Mods an expression by a constant integer. | |
Expr | operator% (int a, Expr b) |
Mods a constant integer by an expression. | |
Expr | operator> (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is greater than the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator> (Expr a, int b) |
Return a boolean expression that tests whether an expression is greater than a constant integer. | |
Expr | operator> (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is greater than an expression. | |
Expr | operator< (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is less than the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator< (Expr a, int b) |
Return a boolean expression that tests whether an expression is less than a constant integer. | |
Expr | operator< (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is less than an expression. | |
Expr | operator<= (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is less than or equal to the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator<= (Expr a, int b) |
Return a boolean expression that tests whether an expression is less than or equal to a constant integer. | |
Expr | operator<= (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is less than or equal to an expression. | |
Expr | operator>= (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is greater than or equal to the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator>= (const Expr &a, int b) |
Return a boolean expression that tests whether an expression is greater than or equal to a constant integer. | |
Expr | operator>= (int a, const Expr &b) |
Return a boolean expression that tests whether a constant integer is greater than or equal to an expression. | |
Expr | operator== (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is equal to the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator== (Expr a, int b) |
Return a boolean expression that tests whether an expression is equal to a constant integer. | |
Expr | operator== (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is equal to an expression. | |
Expr | operator!= (Expr a, Expr b) |
Return a boolean expression that tests whether the first argument is not equal to the second, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator!= (Expr a, int b) |
Return a boolean expression that tests whether an expression is not equal to a constant integer. | |
Expr | operator!= (int a, Expr b) |
Return a boolean expression that tests whether a constant integer is not equal to an expression. | |
Expr | operator&& (Expr a, Expr b) |
Returns the logical and of the two arguments. | |
Expr | operator&& (Expr a, bool b) |
Logical and of an Expr and a bool. | |
Expr | operator&& (bool a, Expr b) |
Expr | operator|| (Expr a, Expr b) |
Returns the logical or of the two arguments. | |
Expr | operator|| (Expr a, bool b) |
Logical or of an Expr and a bool. | |
Expr | operator|| (bool a, Expr b) |
Expr | operator! (Expr a) |
Returns the logical not of the argument. | |
Expr | max (Expr a, Expr b) |
Returns an expression representing the greater of the two arguments, after doing any necessary type coercion using Internal::match_types. | |
Expr | max (Expr a, int b) |
Returns an expression representing the greater of an expression and a constant integer. | |
Expr | max (int a, Expr b) |
Returns an expression representing the greater of a constant integer and an expression. | |
Expr | max (float a, Expr b) |
Expr | max (Expr a, float b) |
template<typename A , typename B , typename C , typename... Rest, typename std::enable_if< Halide::Internal::all_are_convertible< Expr, Rest... >::value >::type * = nullptr> | |
Expr | max (A &&a, B &&b, C &&c, Rest &&...rest) |
Returns an expression representing the greatest of a list of expressions, after doing any necessary type coercion using Internal::match_types. | |
Expr | min (Expr a, Expr b) |
Expr | min (Expr a, int b) |
Returns an expression representing the lesser of an expression and a constant integer. | |
Expr | min (int a, Expr b) |
Returns an expression representing the lesser of a constant integer and an expression. | |
Expr | min (float a, Expr b) |
Expr | min (Expr a, float b) |
template<typename A , typename B , typename C , typename... Rest, typename std::enable_if< Halide::Internal::all_are_convertible< Expr, Rest... >::value >::type * = nullptr> | |
Expr | min (A &&a, B &&b, C &&c, Rest &&...rest) |
Returns an expression representing the least of a list of expressions, after doing any necessary type coercion using Internal::match_types. | |
Expr | operator+ (Expr a, float b) |
Operators on floats treat those floats as Exprs. | |
Expr | operator+ (float a, Expr b) |
Expr | operator- (Expr a, float b) |
Expr | operator- (float a, Expr b) |
Expr | operator* (Expr a, float b) |
Expr | operator* (float a, Expr b) |
Expr | operator/ (Expr a, float b) |
Expr | operator/ (float a, Expr b) |
Expr | operator% (Expr a, float b) |
Expr | operator% (float a, Expr b) |
Expr | operator> (Expr a, float b) |
Expr | operator> (float a, Expr b) |
Expr | operator< (Expr a, float b) |
Expr | operator< (float a, Expr b) |
Expr | operator>= (Expr a, float b) |
Expr | operator>= (float a, Expr b) |
Expr | operator<= (Expr a, float b) |
Expr | operator<= (float a, Expr b) |
Expr | operator== (Expr a, float b) |
Expr | operator== (float a, Expr b) |
Expr | operator!= (Expr a, float b) |
Expr | operator!= (float a, Expr b) |
Expr | clamp (Expr a, const Expr &min_val, const Expr &max_val) |
Clamps an expression to lie within the given bounds. | |
Expr | abs (Expr a) |
Returns the absolute value of a signed integer or floating-point expression. | |
Expr | absd (Expr a, Expr b) |
Return the absolute difference between two values. | |
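The clamp and absd entries above can be modeled on plain scalars as follows. Writing clamp as a min followed by a max is one common formulation and is assumed here; the 8-bit absd variant shows why an absolute difference is handy for unsigned types, where a naive subtraction would wrap. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Clamp a into [lo, hi] using min then max.
template <typename T>
T clamp_sketch(T a, T lo, T hi) {
    return std::max(std::min(a, hi), lo);
}

// Absolute difference of two unsigned bytes without wrapping.
std::uint8_t absd_u8(std::uint8_t a, std::uint8_t b) {
    return a > b ? static_cast<std::uint8_t>(a - b)
                 : static_cast<std::uint8_t>(b - a);
}
```

Clamping 300 to the range 0..255 gives 255, and the absolute difference of 3 and 250 is 247 regardless of argument order.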
Expr | select (Expr condition, Expr true_value, Expr false_value) |
Returns an expression similar to the ternary operator in C, except that it always evaluates all arguments. | |
template<typename... Args, typename std::enable_if< Halide::Internal::all_are_convertible< Expr, Args... >::value >::type * = nullptr> | |
Expr | select (Expr c0, Expr v0, Expr c1, Expr v1, Args &&...args) |
A multi-way variant of select similar to a switch statement in C, which can accept multiple conditions and values in pairs. | |
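The difference from the C ternary operator noted above, and the pairing of conditions with values in the multi-way form, can be mimicked in plain C++. A function call evaluates all of its arguments before the body runs, so a function-based select behaves like the entry describes. The names select_chain, counted, and branch_evaluations are hypothetical; this is a scalar model, not Halide's select.

```cpp
// Counts how many branch values were computed.
int branch_evaluations = 0;

int counted(int v) {
    ++branch_evaluations;
    return v;
}

// Two-way form: both t and f are already evaluated when the call happens.
template <typename T>
T select_chain(bool c, T t, T f) {
    return c ? t : f;
}

// Multi-way form: (c0, v0, c1, v1, ..., default), resolved first match first.
template <typename T, typename... Rest>
T select_chain(bool c0, T v0, bool c1, T v1, Rest... rest) {
    return c0 ? v0 : select_chain(c1, v1, rest...);
}
```

Calling select_chain(true, counted(1), counted(2)) returns 1 but leaves branch_evaluations at 2, whereas the expression true ? counted(1) : counted(2) would evaluate only one branch.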
Tuple | tuple_select (const Tuple &condition, const Tuple &true_value, const Tuple &false_value) |
Equivalent of ternary select(), but taking/returning tuples. | |
Tuple | tuple_select (const Expr &condition, const Tuple &true_value, const Tuple &false_value) |
template<typename... Args> | |
Tuple | tuple_select (const Tuple &c0, const Tuple &v0, const Tuple &c1, const Tuple &v1, Args &&...args) |
Equivalent of multiway select(), but taking/returning tuples. | |
template<typename... Args> | |
Tuple | tuple_select (const Expr &c0, const Tuple &v0, const Expr &c1, const Tuple &v1, Args &&...args) |
Expr | mux (const Expr &id, const std::initializer_list< Expr > &values) |
Oftentimes we want to pack a list of expressions with the same type into a channel dimension, e.g., img(x, y, c) = select(c == 0, 100, c == 1, 50, 25), giving 100 for red, 50 for green, and 25 for blue. This is tedious when the list is long. | |
Expr | mux (const Expr &id, const std::vector< Expr > &values) |
Expr | mux (const Expr &id, const Tuple &values) |
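The mux overloads above replace a long select chain with an index into a list of values. A scalar model of that equivalence is sketched below. Following the select form quoted in the description, the last value acts as the fallthrough; treating out-of-range ids that way is an assumption, and mux_sketch is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Picks values[id], built as a chain of selects with the last value as default.
int mux_sketch(int id, const std::vector<int> &values) {
    assert(!values.empty());
    int result = values.back();  // default, like the final argument of select
    // Walk from the second-to-last entry down to the first.
    for (std::size_t i = values.size() - 1; i-- > 0;) {
        if (id == static_cast<int>(i)) {
            result = values[i];
        }
    }
    return result;
}
```

With the values 100, 50, 25, the ids 0, 1, and 2 give 100, 50, and 25, matching the select form in the description.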
Expr | sin (Expr x) |
Return the sine of a floating-point expression. | |
Expr | asin (Expr x) |
Return the arcsine of a floating-point expression. | |
Expr | cos (Expr x) |
Return the cosine of a floating-point expression. | |
Expr | acos (Expr x) |
Return the arccosine of a floating-point expression. | |
Expr | tan (Expr x) |
Return the tangent of a floating-point expression. | |
Expr | atan (Expr x) |
Return the arctangent of a floating-point expression. | |
Expr | atan2 (Expr y, Expr x) |
Return the angle of a floating-point gradient. | |
Expr | sinh (Expr x) |
Return the hyperbolic sine of a floating-point expression. | |
Expr | asinh (Expr x) |
Return the hyperbolic arcsine of a floating-point expression. | |
Expr | cosh (Expr x) |
Return the hyperbolic cosine of a floating-point expression. | |
Expr | acosh (Expr x) |
Return the hyperbolic arccosine of a floating-point expression. | |
Expr | tanh (Expr x) |
Return the hyperbolic tangent of a floating-point expression. | |
Expr | atanh (Expr x) |
Return the hyperbolic arctangent of a floating-point expression. | |
Expr | sqrt (Expr x) |
Return the square root of a floating-point expression. | |
Expr | hypot (const Expr &x, const Expr &y) |
Return the square root of the sum of the squares of two floating-point expressions. | |
Expr | exp (Expr x) |
Return the exponential of a floating-point expression. | |
Expr | log (Expr x) |
Return the logarithm of a floating-point expression. | |
Expr | pow (Expr x, Expr y) |
Return one floating point expression raised to the power of another. | |
Expr | erf (const Expr &x) |
Evaluate the error function erf. | |
Expr | fast_sin (const Expr &x) |
Fast vectorizable approximation to some trigonometric functions for Float(32). | |
Expr | fast_cos (const Expr &x) |
Expr | fast_log (const Expr &x) |
Fast approximate cleanly vectorizable log for Float(32). | |
Expr | fast_exp (const Expr &x) |
Fast approximate cleanly vectorizable exp for Float(32). | |
Expr | fast_pow (Expr x, Expr y) |
Fast approximate cleanly vectorizable pow for Float(32). | |
Expr | fast_inverse (Expr x) |
Fast approximate inverse for Float(32). | |
Expr | fast_inverse_sqrt (Expr x) |
Fast approximate inverse square root for Float(32). | |
Expr | floor (Expr x) |
Return the greatest whole number less than or equal to a floating-point expression. | |
Expr | ceil (Expr x) |
Return the least whole number greater than or equal to a floating-point expression. | |
Expr | round (Expr x) |
Return the whole number closest to a floating-point expression. | |
Expr | trunc (Expr x) |
Return the integer part of a floating-point expression. | |
Expr | is_nan (Expr x) |
Returns true if the argument is Not a Number (NaN). | |
Expr | is_inf (Expr x) |
Returns true if the argument is Inf or -Inf. | |
Expr | is_finite (Expr x) |
Returns true if the argument is a finite value (i.e., neither NaN nor Inf). | |
Expr | fract (const Expr &x) |
Return the fractional part of a floating-point expression. | |
Expr | reinterpret (Type t, Expr e) |
Reinterpret the bits of one value as another type. | |
template<typename T > | |
Expr | reinterpret (Expr e) |
Expr | operator& (Expr x, Expr y) |
Return the bitwise and of two expressions (which need not have the same type). | |
Expr | operator& (Expr x, int y) |
Return the bitwise and of an expression and an integer. | |
Expr | operator& (int x, Expr y) |
Expr | operator| (Expr x, Expr y) |
Return the bitwise or of two expressions (which need not have the same type). | |
Expr | operator| (Expr x, int y) |
Return the bitwise or of an expression and an integer. | |
Expr | operator| (int x, Expr y) |
Expr | operator^ (Expr x, Expr y) |
Return the bitwise xor of two expressions (which need not have the same type). | |
Expr | operator^ (Expr x, int y) |
Return the bitwise xor of an expression and an integer. | |
Expr | operator^ (int x, Expr y) |
Expr | operator~ (Expr x) |
Return the bitwise not of an expression. | |
Expr | operator<< (Expr x, Expr y) |
Shift the bits of an integer value left. | |
Expr | operator<< (Expr x, int y) |
Expr | operator>> (Expr x, Expr y) |
Shift the bits of an integer value right. | |
Expr | operator>> (Expr x, int y) |
Expr | lerp (Expr zero_val, Expr one_val, Expr weight) |
Linearly interpolate between the two values according to a weight. | |
Expr | popcount (Expr x) |
Count the number of set bits in an expression. | |
Expr | count_leading_zeros (Expr x) |
Count the number of leading zero bits in an expression. | |
Expr | count_trailing_zeros (Expr x) |
Count the number of trailing zero bits in an expression. | |
Expr | div_round_to_zero (Expr x, Expr y) |
Divide two integers, rounding towards zero. | |
Expr | mod_round_to_zero (Expr x, Expr y) |
Compute the remainder of dividing two integers, when division is rounding toward zero. | |
Expr | random_float (Expr seed=Expr()) |
Return a random variable representing a uniformly distributed float in the half-open interval [0.0f, 1.0f). | |
Expr | random_uint (Expr seed=Expr()) |
Return a random variable representing a uniformly distributed unsigned 32-bit integer. | |
Expr | random_int (Expr seed=Expr()) |
Return a random variable representing a uniformly distributed 32-bit integer. | |
Expr | print (const std::vector< Expr > &values) |
Create an Expr that prints out its value whenever it is evaluated. | |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE Expr | print (Expr a, Args &&...args) |
Expr | print_when (Expr condition, const std::vector< Expr > &values) |
Create an Expr that prints whenever it is evaluated, provided that the condition is true. | |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE Expr | print_when (Expr condition, Expr a, Args &&...args) |
Expr | require (Expr condition, const std::vector< Expr > &values) |
Create an Expr that guarantees a precondition. | |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE Expr | require (Expr condition, Expr value, Args &&...args) |
Expr | undef (Type t) |
Return an undef value of the given type. | |
template<typename T > | |
Expr | undef () |
template<typename... Args> | |
HALIDE_NO_USER_CODE_INLINE Expr | memoize_tag (Expr result, Args &&...args) |
Control the values used in the memoization cache key for memoize. | |
Expr | likely (Expr e) |
Expressions tagged with this intrinsic are considered to be part of the steady state of some loop with a nasty beginning and end (e.g. | |
Expr | likely_if_innermost (Expr e) |
Equivalent to likely, but only triggers a loop partitioning if found in an innermost loop. | |
template<typename T > | |
Expr | saturating_cast (Expr e) |
Cast an expression to the halide type corresponding to the C++ type T. | |
Expr | saturating_cast (Type t, Expr e) |
Cast an expression to a new type, clamping to the minimum and maximum values of the result type. | |
Expr | strict_float (Expr e) |
Makes a best effort attempt to preserve IEEE floating-point semantics in evaluating an expression. | |
Expr | unsafe_promise_clamped (const Expr &value, const Expr &min, const Expr &max) |
Create an Expr that promises another Expr is clamped, but does not generate code to check the assertion or modify the value. | |
Expr | scatter (const std::vector< Expr > &args) |
Scatter and gather are used for update definitions that must store multiple values to distinct locations at the same time. | |
Expr | gather (const std::vector< Expr > &args) |
template<typename... Args> | |
Expr | scatter (const Expr &e, Args &&...args) |
template<typename... Args> | |
Expr | gather (const Expr &e, Args &&...args) |
Expr | extract_bits (Type t, const Expr &e, const Expr &lsb) |
Extract a contiguous subsequence of the bits of 'e', starting at the bit index given by 'lsb', where zero is the least-significant bit, returning a value of type 't'. | |
template<typename T > | |
Expr | extract_bits (const Expr &e, const Expr &lsb) |
Expr | concat_bits (const std::vector< Expr > &e) |
Given a number of Exprs of the same type, concatenate their bits, producing a single Expr with the same type code as the inputs but with more bits. | |
std::ostream & | operator<< (std::ostream &stream, const DeviceAPI &) |
Emit a halide device api type in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const MemoryType &) |
Emit a halide memory type in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const TailStrategy &t) |
Emit a halide tail strategy in human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const LoopLevel &) |
Emit a halide LoopLevel in human-readable form. | |
Func | lambda (const Expr &e) |
Create a zero-dimensional halide function that returns the given expression. | |
Func | lambda (const Var &x, const Expr &e) |
Create a 1-D halide function in the first argument that returns the second argument. | |
Func | lambda (const Var &x, const Var &y, const Expr &e) |
Create a 2-D halide function in the first two arguments that returns the last argument. | |
Func | lambda (const Var &x, const Var &y, const Var &z, const Expr &e) |
Create a 3-D halide function in the first three arguments that returns the last argument. | |
Func | lambda (const Var &x, const Var &y, const Var &z, const Var &w, const Expr &e) |
Create a 4-D halide function in the first four arguments that returns the last argument. | |
Func | lambda (const Var &x, const Var &y, const Var &z, const Var &w, const Var &v, const Expr &e) |
Create a 5-D halide function in the first five arguments that returns the last argument. | |
std::unique_ptr< llvm::Module > | compile_module_to_llvm_module (const Module &module, llvm::LLVMContext &context) |
Generate an LLVM module. | |
std::unique_ptr< llvm::raw_fd_ostream > | make_raw_fd_ostream (const std::string &filename) |
Construct an llvm output stream for writing to files. | |
void | compile_llvm_module_to_object (llvm::Module &module, Internal::LLVMOStream &out) |
Compile an LLVM module to native targets (objects, native assembly). | |
void | compile_llvm_module_to_assembly (llvm::Module &module, Internal::LLVMOStream &out) |
void | compile_llvm_module_to_llvm_bitcode (llvm::Module &module, Internal::LLVMOStream &out) |
Compile an LLVM module to LLVM targets (bitcode, LLVM assembly). | |
void | compile_llvm_module_to_llvm_assembly (llvm::Module &module, Internal::LLVMOStream &out) |
void | create_static_library (const std::vector< std::string > &src_files, const Target &target, const std::string &dst_file, bool deterministic=true) |
Concatenate the list of src_files into dst_file, using the appropriate static library format for the given target (e.g., .a or .lib). | |
Module | link_modules (const std::string &name, const std::vector< Module > &modules) |
Link a set of modules together into one module. | |
void | compile_standalone_runtime (const std::string &object_filename, const Target &t) |
Create an object file containing the Halide runtime for a given target. | |
std::map< OutputFileType, std::string > | compile_standalone_runtime (const std::map< OutputFileType, std::string > &output_files, const Target &t) |
Create an object and/or static library file containing the Halide runtime for a given target. | |
void | compile_multitarget (const std::string &fn_name, const std::map< OutputFileType, std::string > &output_files, const std::vector< Target > &targets, const std::vector< std::string > &suffixes, const ModuleFactory &module_factory, const CompilerLoggerFactory &compiler_logger_factory=nullptr) |
Expr | user_context_value () |
Returns an Expr corresponding to the user context passed to the function (if any). | |
std::ostream & | operator<< (std::ostream &stream, const RVar &) |
Emit an RVar in a human-readable form. | |
std::ostream & | operator<< (std::ostream &stream, const RDom &) |
Emit an RDom in a human-readable form. | |
Target | get_host_target () |
Return the target corresponding to the host machine. | |
Target | get_target_from_environment () |
Return the target that Halide will use. | |
Target | get_jit_target_from_environment () |
Return the target that Halide will use for jit-compilation. | |
Target::Feature | target_feature_for_device_api (DeviceAPI api) |
Get the Target feature corresponding to a DeviceAPI. | |
Type | Int (int bits, int lanes=1) |
Construct a signed integer type. | |
Type | UInt (int bits, int lanes=1) |
Construct an unsigned integer type. | |
Type | Float (int bits, int lanes=1) |
Construct a floating-point type. | |
Type | BFloat (int bits, int lanes=1) |
Construct a floating-point type in the bfloat format. | |
Type | Bool (int lanes=1) |
Construct a boolean type. | |
Type | Handle (int lanes=1, const halide_handle_cplusplus_type *handle_type=nullptr) |
Construct a handle type. | |
template<typename T > | |
Type | type_of () |
Construct the halide equivalent of a C type. | |
std::string | type_to_c_type (Type type, bool include_space, bool c_plus_plus=true) |
Convert a Halide type to a C++ type. | |
void | load_plugin (const std::string &lib_name) |
Load a plugin in the form of a dynamic library (e.g. | |
void | set_compiler_stack_size (size_t) |
Set how much stack the compiler should use for compilation in bytes. | |
size_t | get_compiler_stack_size () |
Return how much stack size the compiler should use for calls that go through run_with_large_stack below. | |
Variables | |
const int | head1_channels = 8 |
const int | head1_w = 40 |
const int | head1_h = 7 |
const int | head2_channels = 24 |
const int | head2_w = 39 |
const int | conv1_channels = 32 |
constexpr int | AnyDims = Halide::Runtime::AnyDims |
const DeviceAPI | all_device_apis [] |
An array containing all the device apis. | |
constexpr size_t | default_compiler_stack_size = 32 * 1024 * 1024 |
The default amount of stack used for lowering and codegen. | |
This file defines the class FunctionDAG, which is our representation of a Halide pipeline, and contains methods that use Halide's bounds tools to query properties of it.
Defines methods for manipulating and analyzing boolean expressions.
This file defines the LoopNest, which is our representation of a Halide schedule, and contains methods to generate candidates for scheduling as well as extract a featurization that can be used to cost each candidate.
using Halide::GeneratorParamsMap = typedef std::map<std::string, std::string> |
Definition at line 22 of file AbstractGenerator.h.
typedef std::vector<Range> Halide::Region |
typedef Stage Halide::ScheduleHandle |
using Halide::MetadataNameMap = typedef std::map<std::string, std::string> |
using Halide::ModuleFactory = typedef std::function<Module(const std::string &fn_name, const Target &target)> |
using Halide::CompilerLoggerFactory = typedef std::function<std::unique_ptr<Internal::CompilerLogger>(const std::string &fn_name, const Target &target)> |
using Halide::AutoSchedulerFn = typedef std::function<void(const Pipeline &, const Target &, const AutoschedulerParams &, AutoSchedulerResults *outputs)> |
Definition at line 143 of file Pipeline.h.
An enum describing a type of device API.
Used by schedules, and in the For loop IR node.
Enumerator | |
---|---|
None | |
Host | Used to denote for loops that run on the same device as the containing code. |
Default_GPU | |
CUDA | |
OpenCL | |
OpenGLCompute | |
Metal | |
Hexagon | |
HexagonDma | |
D3D12Compute |
Definition at line 15 of file DeviceAPI.h.
An enum describing different address spaces to be used with Func::store_in.
Enumerator | |
---|---|
Auto | Let Halide select a storage type automatically. |
Heap | Heap/global memory. Allocated using halide_malloc or halide_device_malloc. |
Stack | Stack memory. Allocated using alloca. Requires a constant size. Corresponds to per-thread local memory on the GPU. If all accesses are at constant coordinates, may be promoted into the register file at the discretion of the register allocator. |
Register | Register memory. The allocation should be promoted into the register file. All stores must be at constant coordinates. May be spilled to the stack at the discretion of the register allocator. |
GPUShared | Allocation is stored in GPU shared memory. Also known as "local" in OpenCL, and "threadgroup" in Metal. Can be shared across GPU threads within the same block. |
GPUTexture | Allocation is stored in GPU texture memory and accessed through hardware sampler. |
LockedCache | Allocate Locked Cache Memory to act as local memory. |
VTCM | Vector Tightly Coupled Memory. HVX (Hexagon) local memory available on v65+. This memory has higher performance and lower power. Ideal for intermediate buffers. Necessary for vgather-vscatter instructions on Hexagon. |
AMXTile | AMX Tile register for X86. Any data that would be used in an AMX matrix multiplication must first be loaded into an AMX tile register. |
An enum to specify calling convention for extern stages.
Enumerator | |
---|---|
Default | Match whatever is specified in the Target. |
C | No name mangling. |
CPlusPlus | C++ name mangling. |
Definition at line 25 of file Function.h.
Type of linkage a function in a lowered Halide module can have.
Also controls whether auxiliary functions and metadata are generated.
Enumerator | |
---|---|
External | Visible externally. |
ExternalPlusMetadata | Visible externally. Argument metadata and an argv wrapper are also generated. |
ExternalPlusArgv | Visible externally. Argv wrapper is generated but not argument metadata. |
Internal | Not visible externally, similar to 'static' linkage in C. |
Used to determine whether the output printed to file should be a normal string or an HTML file that can be opened in a browser and manipulated via JS and CSS.
Enumerator | |
---|---|
Text | |
HTML |
Definition at line 103 of file Pipeline.h.
Different ways to handle accesses outside the original extents in a prefetch.
Definition at line 16 of file PrefetchDirective.h.
Different ways to handle a tail case in a split when the factor does not provably divide the extent.
Definition at line 32 of file Schedule.h.
Different ways to handle the case when the start/end of the loops of stages computed with (fused) are not aligned.
Definition at line 110 of file Schedule.h.
std::unique_ptr< DefaultCostModel > Halide::make_default_cost_model | ( | const std::string & | weights_in_dir = "" , |
const std::string & | weights_out_dir = "" , |
||
bool | randomize_weights = false |
||
) |
std::unique_ptr< llvm::Module > Halide::codegen_llvm | ( | const Module & | module, |
llvm::LLVMContext & | context | ||
) |
Given a Halide module, generate an llvm::Module.
std::ostream & Halide::operator<< | ( | std::ostream & | stream, |
const Expr & | |||
) |
Emit an expression on an output stream (such as std::cout) in human-readable form.
std::ostream & Halide::operator<< | ( | std::ostream & | stream, |
const Type & | |||
) |
Emit a halide type on an output stream (such as std::cout) in human-readable form.
std::ostream & Halide::operator<< | ( | std::ostream & | stream, |
const Module & | |||
) |
Emit a halide Module on an output stream (such as std::cout) in human-readable form.
std::ostream & Halide::operator<< | ( | std::ostream & | stream, |
const Target & | |||
) |
Derivative Halide::propagate_adjoints | ( | const Func & | output, |
const Func & | adjoint, | ||
const Region & | output_bounds | ||
) |
Given a Func and a corresponding adjoint, (back)propagate the adjoint to all dependent Funcs, buffers, and parameters.
The bounds of output and adjoint need to be specified as {min, extent} pairs. For each Func the output depends on, and for the pure definition and each update of that Func, it generates a derivative Func stored in the Derivative.
Derivative Halide::propagate_adjoints | ( | const Func & | output, |
const Buffer< float > & | adjoint | ||
) |
Given a Func and a corresponding adjoint buffer, (back)propagate the adjoint to all dependent Funcs, buffers, and parameters.
For each Func the output depends on, and for the pure definition and each update of that Func, it generates a derivative Func stored in the Derivative.
Derivative Halide::propagate_adjoints | ( | const Func & | output | ) |
Given a scalar Func with size 1, (back)propagate the gradient to all dependent Funcs, buffers, and parameters.
For each Func the output depends on, and for the pure definition and each update of that Func, it generates a derivative Func stored in the Derivative.
const halide_device_interface_t * Halide::get_device_interface_for_device_api | ( | DeviceAPI | d, |
const Target & | t = get_jit_target_from_environment() , |
||
const char * | error_site = nullptr |
||
) |
Gets the appropriate halide_device_interface_t * for a DeviceAPI.
If error_site is non-null, e.g. the name of the routine calling get_device_interface_for_device_api, a user_error is reported if the requested device API is not enabled in or supported by the target, Halide has been compiled without this device API, or the device API is None or Host or a bad value. The error_site argument is printed in the error message. If error_site is null, this routine returns nullptr instead of calling user_error.
Referenced by Halide::Buffer< T, Dims >::copy_to_device(), Halide::Buffer< T, Dims >::device_malloc(), and Halide::Buffer< T, Dims >::device_wrap_native().
Get the specific DeviceAPI that Halide would select when presented with DeviceAPI::Default_GPU for a given target.
If no suitable api is enabled in the target, returns DeviceAPI::Host.
bool Halide::host_supports_target_device | ( | const Target & | t | ) |
This attempts to sniff whether a given Target (and its implied DeviceAPI) is usable on the current host.
If it appears to be usable, return true; if not, return false. Note that a return value of true does not guarantee that future usage of that device will succeed; it is intended mainly as a simple diagnostic to allow early-exit when a desired device is definitely not usable. Also note that this call is NOT threadsafe, as it temporarily redirects various global error-handling hooks in Halide.
References Internal.
bool Halide::exceptions_enabled | ( | ) |
Query whether Halide was compiled with exceptions.
void Halide::set_custom_compile_time_error_reporter | ( | CompileTimeErrorReporter * | error_reporter | ) |
The default error reporter logs to stderr, then throws an exception (if HALIDE_WITH_EXCEPTIONS) or calls abort (if not).
This allows customization of that behavior if a more gentle response to error reporting is desired. Note that error_reporter is expected to remain valid across all Halide usage; it is up to the caller to ensure that this is the case (and to do any cleanup necessary).
References Internal, and set_custom_compile_time_error_reporter().
Referenced by set_custom_compile_time_error_reporter().
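The lifetime note above can be illustrated with a small self-contained model. This is only a sketch of the pattern: `ReporterLike`, `CollectingReporter`, `set_reporter`, and `report_error` are hypothetical stand-ins written for this example, not Halide's actual classes or functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a reporter interface with warning and error hooks.
struct ReporterLike {
    virtual ~ReporterLike() = default;
    virtual void warning(const char *msg) = 0;
    virtual void error(const char *msg) = 0;
};

// A gentler reporter: collects messages instead of logging and aborting.
struct CollectingReporter : ReporterLike {
    std::vector<std::string> warnings, errors;
    void warning(const char *msg) override { warnings.emplace_back(msg); }
    void error(const char *msg) override { errors.emplace_back(msg); }
};

// Hypothetical registry holding a NON-owning pointer. The caller must keep
// the reporter alive for as long as it is installed, and clean it up later.
inline ReporterLike *&installed_reporter() {
    static ReporterLike *reporter = nullptr;
    return reporter;
}

inline void set_reporter(ReporterLike *r) { installed_reporter() = r; }

// Route an error to the installed reporter, if there is one.
inline void report_error(const char *msg) {
    if (installed_reporter() != nullptr) installed_reporter()->error(msg);
}

// Small self-check showing install, use, and uninstall in that order.
inline void check_reporter_roundtrip() {
    CollectingReporter rep;
    set_reporter(&rep);
    report_error("example failure");
    assert(rep.errors.size() == 1);
    set_reporter(nullptr);  // uninstall before rep goes out of scope
}
```

Because the registry only stores a raw pointer, uninstalling the reporter before it is destroyed is the caller's job, which mirrors the caveat in the text.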
Integer division by small values can be done exactly as multiplies and shifts.
This function does integer division for numerators of various integer types (8-, 16-, and 32-bit, signed and unsigned) and uint8 denominators. The type of the result is the type of the numerator. The unsigned version is faster than the signed version, so cast the numerator to an unsigned int if you know it's positive.
If your divisor is compile-time constant, Halide performs a slightly better optimization automatically, so there's no need to use this function (but it won't hurt).
This function vectorizes well on arm, and well on x86 for 16 and 8 bit vectors. For 32-bit vectors on x86 you're better off using native integer division.
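As a rough illustration of the "multiplies and shifts" idea, the plain C++ sketch below divides a 16-bit unsigned value by the fixed divisor 3. The function names, the multiplier 43691, and the shift 17 are assumptions chosen for this one divisor and bit width; they are not taken from Halide's division tables.

```cpp
#include <cassert>
#include <cstdint>

// Divide a 16-bit unsigned value by 3 without a division instruction.
// 43691 == ceil(2^17 / 3), and for every x in [0, 65535] the rounding
// error stays below 1/3, so (x * 43691) >> 17 equals x / 3 exactly.
inline uint16_t div3_mul_shift(uint16_t x) {
    // The product is at most 65535 * 43691, which fits in 32 bits.
    return static_cast<uint16_t>((static_cast<uint32_t>(x) * 43691u) >> 17);
}

// Exhaustively compare against native division over the whole input range.
inline void check_div3_exhaustive() {
    for (uint32_t x = 0; x <= 0xFFFFu; ++x) {
        assert(div3_mul_shift(static_cast<uint16_t>(x)) == x / 3);
    }
}
```

A table-driven version would look up the multiplier and shift per divisor at runtime rather than hard-coding them.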
Also, this routine treats division by zero as division by
A variant of the above which rounds towards zero instead of rounding towards negative infinity.
Use the fast integer division tables to implement a modulo operation via the Euclidean identity: a % b = a - (a/b)*b.
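The difference between the two rounding modes, and how a remainder follows from a quotient through a % b = a - (a/b)*b, can be sketched in plain C++. The helper names are hypothetical, and the example assumes a positive divisor; it models the arithmetic only and does not call any Halide function.

```cpp
#include <cassert>

// Quotient rounded toward negative infinity (floor division), b != 0.
inline int div_floor(int a, int b) {
    int q = a / b;  // C++ '/' truncates toward zero
    // Step down by one when the division was inexact and the signs differ.
    if ((a % b != 0) && ((a < 0) != (b < 0))) --q;
    return q;
}

// Quotient rounded toward zero: exactly what C++ '/' does.
inline int div_trunc(int a, int b) { return a / b; }

// Remainder recovered from a quotient q via a % b = a - q * b.
inline int mod_from_quotient(int a, int b, int q) { return a - q * b; }

// With a positive divisor, the floor quotient gives a non-negative remainder,
// while the truncating quotient gives a remainder with the sign of a.
inline void check_rounding_modes() {
    assert(mod_from_quotient(-7, 2, div_floor(-7, 2)) == 1);
    assert(mod_from_quotient(-7, 2, div_trunc(-7, 2)) == -1);
}
```

For non-negative numerators the two quotients agree, so the choice of variant only matters when the numerator can be negative.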
Explicit overloads of min and max for FuncRef.
These exist to disambiguate calls to min on FuncRefs when a user has pulled both Halide::min and std::min into their namespace.
Definition at line 584 of file Func.h.
References min().
Referenced by Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::Dimension::begin(), Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::contains(), Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::copy_to_interleaved(), Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::copy_to_planar(), Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::crop(), Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::cropped(), Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::Dimension::end(), Halide::GeneratorInput< T >::GeneratorInput(), Halide::Internal::GeneratorInput_Arithmetic< T >::GeneratorInput_Arithmetic(), Halide::GeneratorParam< T >::GeneratorParam(), Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::Dimension::max(), Halide::Internal::IRMatcher::min(), min(), Halide::Internal::GeneratorMinMax::min_forward(), Halide::Param< T >::Param(), Halide::RDom::RDom(), Halide::Runtime::Internal::PointerTable::replace(), Halide::Runtime::Internal::BlockStorage::replace(), Halide::Internal::GeneratorOutput_Func< T >::set_estimate(), Halide::Internal::GeneratorInput_Buffer< T >::set_estimate(), Halide::Internal::GeneratorInput_Func< T >::set_estimate(), Halide::Param< T >::set_min_value(), Halide::Param< T >::set_range(), Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::slice(), and Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::sliced().
Definition at line 587 of file Func.h.
References max().
Referenced by Halide::Runtime::Buffer< T, Dims, InClassDimStorage >::contains(), Halide::GeneratorInput< T >::GeneratorInput(), Halide::Internal::GeneratorInput_Arithmetic< T >::GeneratorInput_Arithmetic(), Halide::GeneratorParam< T >::GeneratorParam(), Halide::Runtime::Internal::LinkedList::initialize(), Halide::Runtime::Internal::LinkedList::LinkedList(), Halide::Internal::IRMatcher::max(), max(), Halide::Internal::GeneratorMinMax::max_forward(), Halide::Param< T >::Param(), Halide::Runtime::Internal::PointerTable::replace(), Halide::Runtime::Internal::BlockStorage::replace(), Halide::Runtime::Internal::BlockStorage::reserve(), Halide::Runtime::Internal::PointerTable::reserve(), Halide::Runtime::Internal::BlockStorage::resize(), Halide::Runtime::Internal::PointerTable::resize(), Halide::Param< T >::set_max_value(), and Halide::Param< T >::set_range().
HALIDE_NO_USER_CODE_INLINE T Halide::evaluate | ( | JITUserContext * | ctx, |
const Expr & | e | ||
) |
JIT-Compile and run enough code to evaluate a Halide expression.
This can be thought of as a scalar version of Func::realize.
Definition at line 2521 of file Func.h.
References Halide::Func::realize(), Halide::Expr::type(), and user_assert.
Referenced by evaluate().
HALIDE_NO_USER_CODE_INLINE T Halide::evaluate | ( | const Expr & | e | ) |
HALIDE_NO_USER_CODE_INLINE void Halide::evaluate | ( | JITUserContext * | ctx, |
Tuple | t, | ||
First | first, | ||
Rest &&... | rest | ||
) |
JIT-compile and run enough code to evaluate a Halide Tuple.
Definition at line 2540 of file Func.h.
References Halide::Internal::assign_results(), Halide::Internal::check_types(), and Halide::Func::realize().
HALIDE_NO_USER_CODE_INLINE void Halide::evaluate | ( | Tuple | t, |
First | first, | ||
Rest &&... | rest | ||
) |
JIT-compile and run enough code to evaluate a Halide Tuple.
Definition at line 2551 of file Func.h.
References evaluate().
HALIDE_NO_USER_CODE_INLINE T Halide::evaluate_may_gpu | ( | const Expr & | e | ) |
JIT-Compile and run enough code to evaluate a Halide expression.
This can be thought of as a scalar version of Func::realize. Can use GPU if jit target from environment specifies one.
Definition at line 2575 of file Func.h.
References Halide::Func::realize(), Halide::Internal::schedule_scalar(), Halide::Expr::type(), and user_assert.
HALIDE_NO_USER_CODE_INLINE void Halide::evaluate_may_gpu | ( | Tuple | t, |
First | first, | ||
Rest &&... | rest | ||
) |
JIT-compile and run enough code to evaluate a Halide Tuple.
Can use GPU if jit target from environment specifies one.
Definition at line 2591 of file Func.h.
References Halide::Internal::assign_results(), Halide::Internal::check_types(), Halide::Func::realize(), and Halide::Internal::schedule_scalar().
auto Halide::operator+ | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a + (T)b) |
Addition between GeneratorParam<T> and any type that supports operator+ with T.
Returns type of underlying operator+.
Definition at line 1050 of file Generator.h.
auto Halide::operator+ | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a + b) |
Definition at line 1054 of file Generator.h.
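The trailing-return-type pattern in these two signatures can be tried out with a minimal wrapper. In the sketch below, `ParamLike<T>` is a hypothetical stand-in for GeneratorParam<T> that only stores a value and converts to T; the two operator templates copy the shape of the declarations above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for GeneratorParam<T>: wraps a value, converts to T.
template <typename T>
class ParamLike {
public:
    explicit ParamLike(T v) : value_(v) {}
    operator T() const { return value_; }

private:
    T value_;
};

// Other + ParamLike<T>: the result type is whatever Other + T yields.
template <typename Other, typename T>
auto operator+(const Other &a, const ParamLike<T> &b) -> decltype(a + (T)b) {
    return a + (T)b;
}

// ParamLike<T> + Other: the mirrored overload.
template <typename Other, typename T>
auto operator+(const ParamLike<T> &a, const Other &b) -> decltype((T)a + b) {
    return (T)a + b;
}

// Adding a double to an int-valued parameter yields a double, as for plain int.
inline void check_param_addition() {
    ParamLike<int> p(3);
    assert(1.5 + p == 4.5);
}
```

Since the return type is computed with decltype, the wrapper adds no conversions of its own: the usual arithmetic promotion rules of the underlying operator decide the result type.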
auto Halide::operator- | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a - (T)b) |
Subtraction between GeneratorParam<T> and any type that supports operator- with T.
Returns type of underlying operator-.
Definition at line 1063 of file Generator.h.
auto Halide::operator- | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a - b) |
Definition at line 1067 of file Generator.h.
auto Halide::operator* | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a * (T)b) |
Multiplication between GeneratorParam<T> and any type that supports operator* with T.
Returns type of underlying operator*.
Definition at line 1076 of file Generator.h.
auto Halide::operator* | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a * b) |
Definition at line 1080 of file Generator.h.
auto Halide::operator/ | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a / (T)b) |
Division between GeneratorParam<T> and any type that supports operator/ with T.
Returns type of underlying operator/.
Definition at line 1089 of file Generator.h.
auto Halide::operator/ | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a / b) |
Definition at line 1093 of file Generator.h.
auto Halide::operator% | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a % (T)b) |
Modulo between GeneratorParam<T> and any type that supports operator% with T.
Returns type of underlying operator%.
Definition at line 1102 of file Generator.h.
auto Halide::operator% | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a % b) |
Definition at line 1106 of file Generator.h.
auto Halide::operator> | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a > (T)b) |
Greater than comparison between GeneratorParam<T> and any type that supports operator> with T.
Returns type of underlying operator>.
Definition at line 1115 of file Generator.h.
auto Halide::operator> | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a > b) |
Definition at line 1119 of file Generator.h.
auto Halide::operator< | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a < (T)b) |
Less than comparison between GeneratorParam<T> and any type that supports operator< with T.
Returns type of underlying operator<.
Definition at line 1128 of file Generator.h.
auto Halide::operator< | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a < b) |
Definition at line 1132 of file Generator.h.
auto Halide::operator>= | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a >= (T)b) |
Greater than or equal comparison between GeneratorParam<T> and any type that supports operator>= with T.
Returns type of underlying operator>=.
Definition at line 1141 of file Generator.h.
auto Halide::operator>= | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a >= b) |
Definition at line 1145 of file Generator.h.
auto Halide::operator<= | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a <= (T)b) |
Less than or equal comparison between GeneratorParam<T> and any type that supports operator<= with T.
Returns type of underlying operator<=.
Definition at line 1154 of file Generator.h.
auto Halide::operator<= | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a <= b) |
Definition at line 1158 of file Generator.h.
auto Halide::operator== | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a == (T)b) |
Equality comparison between GeneratorParam<T> and any type that supports operator== with T.
Returns type of underlying operator==.
Definition at line 1167 of file Generator.h.
auto Halide::operator== | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a == b) |
Definition at line 1171 of file Generator.h.
auto Halide::operator!= | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a != (T)b) |
Inequality comparison between GeneratorParam<T> and any type that supports operator!= with T.
Returns type of underlying operator!=.
Definition at line 1180 of file Generator.h.
auto Halide::operator!= | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a != b) |
Definition at line 1184 of file Generator.h.
auto Halide::operator&& | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a && (T)b) |
Logical and between GeneratorParam<T> and any type that supports operator&& with T.
Returns type of underlying operator&&.
Definition at line 1193 of file Generator.h.
auto Halide::operator&& | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a && b) |
Definition at line 1197 of file Generator.h.
auto Halide::operator&& | ( | const GeneratorParam< T > & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype((T)a && (T)b) |
Definition at line 1201 of file Generator.h.
auto Halide::operator|| | ( | const Other & | a, |
const GeneratorParam< T > & | b | ||
) | -> decltype(a || (T)b) |
Logical or between GeneratorParam<T> and any type that supports operator|| with T.
Returns type of underlying operator||.
Definition at line 1210 of file Generator.h.
auto Halide::operator|| | ( | const GeneratorParam< T > & | a, |
const Other & | b | ||
) | -> decltype((T)a || b) |
Definition at line 1214 of file Generator.h.
auto Halide::operator||(const GeneratorParam<T> &a, const GeneratorParam<T> &b) -> decltype((T)a || (T)b)
Definition at line 1218 of file Generator.h.
auto Halide::min(const Other &a, const GeneratorParam<T> &b) -> decltype(Internal::GeneratorMinMax::min_forward(a, b))
Compute the minimum value between GeneratorParam<T> and any type that supports min with T.
Will automatically import std::min. Returns type of underlying min call.
Definition at line 1258 of file Generator.h.
References min(), and Halide::Internal::GeneratorMinMax::min_forward().
auto Halide::min(const GeneratorParam<T> &a, const Other &b) -> decltype(Internal::GeneratorMinMax::min_forward(a, b))
Definition at line 1262 of file Generator.h.
References min(), and Halide::Internal::GeneratorMinMax::min_forward().
auto Halide::max(const Other &a, const GeneratorParam<T> &b) -> decltype(Internal::GeneratorMinMax::max_forward(a, b))
Compute the maximum value between GeneratorParam<T> and any type that supports max with T.
Will automatically import std::max. Returns type of underlying max call.
Definition at line 1271 of file Generator.h.
References max(), and Halide::Internal::GeneratorMinMax::max_forward().
auto Halide::max(const GeneratorParam<T> &a, const Other &b) -> decltype(Internal::GeneratorMinMax::max_forward(a, b))
Definition at line 1275 of file Generator.h.
References max(), and Halide::Internal::GeneratorMinMax::max_forward().
auto Halide::operator!(const GeneratorParam<T> &a) -> decltype(!(T)a)
Not operator for GeneratorParam.
Definition at line 1282 of file Generator.h.
Callable Halide::create_callable_from_generator(const GeneratorContext &context, const std::string &name, const GeneratorParamsMap &generator_params = {})
Callable Halide::create_callable_from_generator(const Target &target, const std::string &name, const GeneratorParamsMap &generator_params = {})
An inline reduction.
This is suitable for convolution-type operations - the reduction will be computed in the innermost loop that it is used in. The argument may contain free or implicit variables, and must refer to some reduction domain. The free variables are still free in the return value, but the reduction domain is captured - the result expression does not refer to a reduction domain and can be used in a pure function definition.
An example using sum:
Here g computes some blur of x, but g is still a pure function. The sum is being computed by an anonymous reduction function that is scheduled innermost within g.
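What such an inline reduction evaluates to can be modelled in plain C++. This is only a sketch of the arithmetic, not Halide code: the function f(x) = x * x and the reduction domain [0, 10) are assumptions chosen for illustration. The free variable x remains a parameter of g, while the reduction variable r is consumed by the loop.

```cpp
#include <cassert>

// Assumed pure function for illustration: f(x) = x * x.
inline int f(int x) { return x * x; }

// Model of g(x) = sum(f(x + r)) with r ranging over the assumed domain [0, 10).
// x stays free; r exists only inside the reduction loop.
inline int g(int x) {
    int total = 0;
    for (int r = 0; r < 10; ++r) {
        total += f(x + r);
    }
    return total;
}

// Usage: g(0) = 0^2 + 1^2 + ... + 9^2 = 285, g(1) = 1^2 + ... + 10^2 = 385.
inline void demo_inline_sum() {
    assert(g(0) == 285);
    assert(g(1) == 385);
}
```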
Referenced by do_cost_model_schedule().
Referenced by Halide::SimdOpCheckTest::check_one().
Variants of the inline reduction in which the RDom is stated explicitly.
The expression can refer to multiple RDoms, and only the inner one is captured by the reduction. This allows you to write expressions like:
Cast an expression to the halide type corresponding to the C++ type T.
Definition at line 406 of file IROperator.h.
References cast().
Referenced by Halide::ConciseCasts::bf16(), cast(), Halide::NamesInterface::cast(), Halide::ConciseCasts::f16(), Halide::ConciseCasts::f32(), Halide::ConciseCasts::f64(), Halide::ConciseCasts::i16(), Halide::ConciseCasts::i32(), Halide::ConciseCasts::i64(), Halide::ConciseCasts::i8(), Halide::ConciseCasts::u16(), Halide::ConciseCasts::u32(), Halide::ConciseCasts::u64(), and Halide::ConciseCasts::u8().
Return the sum of two expressions, doing any necessary type coercion using Internal::match_types.
Add an expression and a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
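The "cannot be represented" condition that recurs in the integer-constant overloads can be illustrated with a range check. The helper below is hypothetical (it is not a Halide function) and only models integer expression types of at most 32 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if `value` fits in the integer type T without
// changing its numeric value. Only models integer types up to 32 bits.
template <typename T>
bool representable_as(int value) {
    static_assert(std::numeric_limits<T>::is_integer && sizeof(T) <= 4,
                  "sketch only covers integer types up to 32 bits");
    const std::int64_t v = value;
    return v >= static_cast<std::int64_t>(std::numeric_limits<T>::min()) &&
           v <= static_cast<std::int64_t>(std::numeric_limits<T>::max());
}

// Usage: 255 fits in an 8-bit unsigned value, 256 and -1 do not.
inline void demo_representable() {
    assert(representable_as<std::uint8_t>(255));
    assert(!representable_as<std::uint8_t>(256));
    assert(!representable_as<std::uint8_t>(-1));
    assert(representable_as<std::int8_t>(-128));
    assert(!representable_as<std::int8_t>(128));
}
```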
Add a constant integer and an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Modify the first expression to be the sum of two expressions, without changing its type.
This casts the second argument to match the type of the first.
Return the difference of two expressions, doing any necessary type coercion using Internal::match_types.
Subtracts a constant integer from an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Subtracts an expression from a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Return the negative of the argument.
Does no type casting, so more formally: return that number which when added to the original, yields zero of the same type. For unsigned integers the negative is still an unsigned integer. E.g. in UInt(8), the negative of 56 is 200, because 56 + 200 == 0
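The UInt(8) example can be checked with ordinary modular arithmetic. A minimal sketch in plain C++, where `negate_u8` is an illustrative name rather than a Halide function:

```cpp
#include <cassert>
#include <cstdint>

// Negation modulo 2^8: the value which, added to x, wraps around to zero.
inline std::uint8_t negate_u8(std::uint8_t x) {
    return static_cast<std::uint8_t>(0u - x);
}

// Usage: the negative of 56 is 200, and 56 + 200 wraps to 0 in 8 bits.
inline void demo_negate_u8() {
    assert(negate_u8(56) == 200);
    assert(static_cast<std::uint8_t>(56 + negate_u8(56)) == 0);
    assert(negate_u8(0) == 0);
}
```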
Modify the first expression to be the difference of two expressions, without changing its type.
This casts the second argument to match the type of the first.
Return the product of two expressions, doing any necessary type coercion using Internal::match_types.
Multiply an expression and a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Multiply a constant integer and an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Modify the first expression to be the product of two expressions, without changing its type.
This casts the second argument to match the type of the first.
Return the ratio of two expressions, doing any necessary type coercion using Internal::match_types.
Note that integer division in Halide is not the same as integer division in C-like languages in two ways.
First, signed integer division in Halide rounds according to the sign of the denominator. This means towards minus infinity for positive denominators, and towards positive infinity for negative denominators. This is unlike C, which rounds towards zero. This decision ensures that upsampling expressions like f(x/2, y/2) don't have funny discontinuities when x and y cross zero.
Second, division by zero returns zero instead of faulting. For types where overflow is defined behavior, division of the largest negative signed integer by -1 returns the largest negative signed integer for the type (i.e. it wraps). This ensures that a division operation can never have a side-effect, which is helpful in Halide because scheduling directives can expand the domain of computation of a Func, potentially introducing new zero-division.
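The two rules above can be modelled for 32-bit signed integers in plain C++. This is a sketch of the documented semantics, not Halide's implementation, and `halide_style_div` is an illustrative name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Model of the documented division rules for 32-bit signed integers.
inline std::int32_t halide_style_div(std::int32_t a, std::int32_t b) {
    if (b == 0) {
        return 0;  // division by zero yields zero instead of faulting
    }
    if (a == std::numeric_limits<std::int32_t>::min() && b == -1) {
        return a;  // the one overflowing case wraps to the same value
    }
    std::int32_t q = a / b;  // C++ rounds towards zero
    std::int32_t r = a % b;
    if (r < 0) {
        // Round down for positive denominators, up for negative ones.
        q += (b > 0) ? -1 : 1;
    }
    return q;
}

// Usage: compare with C++'s truncating division, where -7 / 2 == -3.
inline void demo_div() {
    assert(halide_style_div(-7, 2) == -4);
    assert(halide_style_div(-1, 2) == -1);
    assert(halide_style_div(7, -2) == -3);
    assert(halide_style_div(-7, -2) == 4);
    assert(halide_style_div(5, 0) == 0);
}
```

With this rounding, x/2 maps -2 and -1 to -1, and 0 and 1 to 0, so each output value is hit by exactly two consecutive inputs on both sides of zero.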
Modify the first expression to be the ratio of two expressions, without changing its type.
This casts the second argument to match the type of the first. Note that signed integer division in Halide rounds towards minus infinity, unlike C, which rounds towards zero.
Divides an expression by a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Divides a constant integer by an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Return the first argument reduced modulo the second, doing any necessary type coercion using Internal::match_types.
There are two key differences between C-like languages and Halide for the modulo operation, which complement the way division works.
First, the result is never negative, so x % 2 is always zero or one, unlike in C-like languages. x % -2 is equivalent, and is also always zero or one. Second, mod by zero evaluates to zero (unlike in C, where it faults). This makes modulo, like division, a side-effect-free operation.
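A matching model of the modulo rules, again as a plain C++ sketch rather than Halide's implementation (`halide_style_mod` is an illustrative name). The arithmetic is done in 64 bits so that no intermediate value overflows.

```cpp
#include <cassert>
#include <cstdint>

// Model of the documented modulo rules for 32-bit signed integers.
inline std::int32_t halide_style_mod(std::int32_t a, std::int32_t b) {
    if (b == 0) {
        return 0;  // mod by zero evaluates to zero
    }
    const std::int64_t wide_b = b;
    std::int64_t r = static_cast<std::int64_t>(a) % wide_b;  // sign follows a in C++
    if (r < 0) {
        r += (wide_b > 0) ? wide_b : -wide_b;  // shift into [0, |b|)
    }
    return static_cast<std::int32_t>(r);
}

// Usage: x % 2 and x % -2 are both always zero or one.
inline void demo_mod() {
    assert(halide_style_mod(-1, 2) == 1);
    assert(halide_style_mod(-3, 2) == 1);
    assert(halide_style_mod(3, -2) == 1);
    assert(halide_style_mod(-3, -2) == 1);
    assert(halide_style_mod(-4, 2) == 0);
    assert(halide_style_mod(5, 0) == 0);
}
```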
Mods an expression by a constant integer.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Mods a constant integer by an expression.
Coerces the type of the integer to match the type of the expression. Errors if the integer cannot be represented in the type of the expression.
Return a boolean expression that tests whether the first argument is greater than the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is greater than a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is greater than an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether the first argument is less than the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is less than a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is less than an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether the first argument is less than or equal to the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is less than or equal to a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is less than or equal to an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether the first argument is greater than or equal to the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is greater than or equal to a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is greater than or equal to an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether the first argument is equal to the second, after doing any necessary type coercion using Internal::match_types.
Return a boolean expression that tests whether an expression is equal to a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Return a boolean expression that tests whether a constant integer is equal to an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Expr Halide::operator!=(Expr a, Expr b)
Return a boolean expression that tests whether the first argument is not equal to the second, after doing any necessary type coercion using Internal::match_types.
Expr Halide::operator!=(Expr a, int b)
Return a boolean expression that tests whether an expression is not equal to a constant integer.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Expr Halide::operator!=(int a, Expr b)
Return a boolean expression that tests whether a constant integer is not equal to an expression.
Coerces the integer to the type of the expression. Errors if the integer is not representable in that type.
Returns an expression representing the greater of the two arguments, after doing any necessary type coercion using Internal::match_types.
Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
References max().
Returns an expression representing the greater of an expression and a constant integer.
The integer is coerced to the type of the expression. Errors if the integer is not representable as that type. Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
References max().
Returns an expression representing the greater of a constant integer and an expression.
The integer is coerced to the type of the expression. Errors if the integer is not representable as that type. Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
References max().
Definition at line 684 of file IROperator.h.
References max().
Definition at line 687 of file IROperator.h.
References max().
Returns an expression representing the greatest of a list of expressions, after doing any necessary type coercion using Internal::match_types.
Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4). The expressions are folded from the right, i.e. max(.., max(.., ..)). The arguments can be any mix of types but must all be convertible to Expr.
Definition at line 699 of file IROperator.h.
References max().
Returns an expression representing the lesser of an expression and a constant integer.
The integer is coerced to the type of the expression. Errors if the integer is not representable as that type. Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
References min().
Returns an expression representing the lesser of a constant integer and an expression.
The integer is coerced to the type of the expression. Errors if the integer is not representable as that type. Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4).
References min().
Definition at line 719 of file IROperator.h.
References min().
Definition at line 722 of file IROperator.h.
References min().
Returns an expression representing the least of a list of expressions, after doing any necessary type coercion using Internal::match_types.
Vectorizes cleanly on most platforms (with the exception of integer types on x86 without SSE4). The expressions are folded from the right, i.e. min(.., min(.., ..)). The arguments can be any mix of types but must all be convertible to Expr.
Definition at line 734 of file IROperator.h.
References min().
Operators on floats treat those floats as Exprs.
Making these explicit prevents implicit float->int casts that might otherwise occur.
Definition at line 742 of file IROperator.h.
Definition at line 745 of file IROperator.h.
Definition at line 748 of file IROperator.h.
Definition at line 751 of file IROperator.h.
Definition at line 754 of file IROperator.h.
Definition at line 757 of file IROperator.h.
Definition at line 760 of file IROperator.h.
Definition at line 763 of file IROperator.h.
Definition at line 766 of file IROperator.h.
Definition at line 769 of file IROperator.h.
Definition at line 772 of file IROperator.h.
Definition at line 775 of file IROperator.h.
Definition at line 778 of file IROperator.h.
Definition at line 781 of file IROperator.h.
Definition at line 784 of file IROperator.h.
Definition at line 787 of file IROperator.h.
Definition at line 790 of file IROperator.h.
Definition at line 793 of file IROperator.h.
Definition at line 796 of file IROperator.h.
Definition at line 799 of file IROperator.h.
Definition at line 802 of file IROperator.h.
Definition at line 805 of file IROperator.h.
Clamps an expression to lie within the given bounds.
The bounds are type-cast to match the expression. Vectorizes as well as min/max.
Returns the absolute value of a signed integer or floating-point expression.
Vectorizes cleanly. Unlike in C, abs of a signed integer returns an unsigned integer of the same bit width. This means that abs of the most negative integer doesn't overflow.
Referenced by Halide::Internal::IRMatcher::Intrin< Args >::make().
Return the absolute difference between two values.
Vectorizes cleanly. Returns an unsigned value of the same bit width. There are various ways to write this yourself, but they contain numerous gotchas and don't always compile to good code, so use this instead.
Referenced by Halide::SimdOpCheckTest::check_one(), and Halide::Internal::IRMatcher::Intrin< Args >::make().
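The unsigned result type of abs and absd can be illustrated for 8-bit values. The functions below are plain C++ models with illustrative names, not Halide calls; they widen to int internally so that the intermediate difference cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Model of abs on Int(8): the result is an unsigned 8-bit value, so the
// most negative input (-128) maps to 128 instead of overflowing.
inline std::uint8_t abs_i8(std::int8_t x) {
    const int wide = x;
    return static_cast<std::uint8_t>(wide < 0 ? -wide : wide);
}

// Model of absd on Int(8): |a - b| computed without intermediate overflow.
inline std::uint8_t absd_i8(std::int8_t a, std::int8_t b) {
    const int diff = static_cast<int>(a) - static_cast<int>(b);
    return static_cast<std::uint8_t>(diff < 0 ? -diff : diff);
}

// Usage: the extreme cases fit in the unsigned result type.
inline void demo_abs_absd() {
    assert(abs_i8(-128) == 128);
    assert(abs_i8(-5) == 5);
    assert(absd_i8(-128, 127) == 255);
    assert(absd_i8(127, -128) == 255);
}
```

One of the gotchas mentioned above is visible here: computing a - b directly in 8 bits and then taking the absolute value would wrap for inputs such as -128 and 127.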
Returns an expression similar to the ternary operator in C, except that it always evaluates all arguments.
If the first argument is true, then return the second, else return the third. Typically vectorizes cleanly, but benefits from SSE41 or newer on x86.
Referenced by select().
A multi-way variant of select similar to a switch statement in C, which can accept multiple conditions and values in pairs.
Evaluates to the first value for which the condition is true. Returns the final value if all conditions are false.
Definition at line 839 of file IROperator.h.
References select().
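The evaluation rule of the multi-way form can be sketched as a recursive variadic template over ordinary values. `multiway_select` is a hypothetical name for this model; as with the Halide operator, every argument is evaluated before the choice is made.

```cpp
#include <cassert>

// Base case: the ordinary three-argument select.
template <typename T>
constexpr T multiway_select(bool condition, T true_value, T false_value) {
    return condition ? true_value : false_value;
}

// Recursive case: (c0, v0, c1, v1, ..., final). Yields the value paired with
// the first true condition, or the final value if no condition holds.
template <typename T, typename... Rest>
constexpr T multiway_select(bool c0, T v0, bool c1, T v1, Rest... rest) {
    return c0 ? v0 : multiway_select(c1, v1, rest...);
}

// Usage: three branches plus a final fallback value.
inline void demo_multiway_select() {
    assert(multiway_select(true, 100, true, 50, 25) == 100);
    assert(multiway_select(false, 100, true, 50, 25) == 50);
    assert(multiway_select(false, 100, false, 50, 25) == 25);
}
```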
Tuple Halide::tuple_select(const Tuple &condition, const Tuple &true_value, const Tuple &false_value)
Equivalent of ternary select(), but taking/returning tuples.
If the condition is a Tuple, it must match the size of the true and false Tuples.
Referenced by tuple_select().
Tuple Halide::tuple_select(const Expr &condition, const Tuple &true_value, const Tuple &false_value)
Equivalent of multiway select(), but taking/returning tuples.
If the condition is a Tuple, it must match the size of the true and false Tuples.
Definition at line 854 of file IROperator.h.
References tuple_select().
Definition at line 859 of file IROperator.h.
References tuple_select().
Oftentimes we want to pack a list of expressions with the same type into a channel dimension, e.g.,
    img(x, y, c) = select(c == 0, 100, // Red
                          c == 1, 50,  // Green
                          25);         // Blue
This is tedious when the list is long.
The following function provides convenient syntax that allows one to write:
    img(x, y, c) = mux(c, {100, 50, 25});
As with the select equivalent, if the first argument (the index) is out of range, the expression evaluates to the last value.
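The out-of-range rule follows directly from the select expansion, which can be modelled over plain integers. `mux_model` is an illustrative name, and the sketch assumes a non-empty list of values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of mux(index, values) as the equivalent chain of selects: each value
// except the last is guarded by (index == i); the last value is the fallback.
// Assumes `values` is non-empty.
inline int mux_model(int index, const std::vector<int> &values) {
    for (std::size_t i = 0; i + 1 < values.size(); ++i) {
        if (index == static_cast<int>(i)) {
            return values[i];
        }
    }
    return values.back();
}

// Usage: in-range indices pick their value; anything else gives the last one.
inline void demo_mux() {
    const std::vector<int> channels = {100, 50, 25};
    assert(mux_model(0, channels) == 100);
    assert(mux_model(1, channels) == 50);
    assert(mux_model(2, channels) == 25);
    assert(mux_model(7, channels) == 25);
    assert(mux_model(-1, channels) == 25);
}
```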
Return the sine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the arcsine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the cosine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the arccosine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the tangent of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the arctangent of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the angle of a floating-point gradient.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic sine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic arcsine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic cosine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic arccosine of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic tangent of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the hyperbolic arctangent of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Does not vectorize well.
Return the square root of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). Typically vectorizes cleanly.
Return the square root of the sum of the squares of two floating-point expressions.
If the argument is not floating-point, it is cast to Float(32). Vectorizes cleanly.
Return the exponential of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). For Float(64) arguments, this calls the system exp function, and does not vectorize well. For Float(32) arguments, this function is vectorizable, does the right thing for extremely small or extremely large inputs, and is accurate up to the last bit of the mantissa. Vectorizes cleanly.
Return the logarithm of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). For Float(64) arguments, this calls the system log function, and does not vectorize well. For Float(32) arguments, this function is vectorizable, does the right thing for inputs <= 0 (returns -inf or nan), and is accurate up to the last bit of the mantissa. Vectorizes cleanly.
Return one floating point expression raised to the power of another.
The type of the result is given by the type of the first argument. If the first argument is not a floating-point type, it is cast to Float(32). For Float(32), cleanly vectorizable, and accurate up to the last few bits of the mantissa. Gets worse when approaching overflow. Vectorizes cleanly.
Evaluate the error function erf.
Only available for Float(32). Accurate up to the last three bits of the mantissa. Vectorizes cleanly.
Fast vectorizable approximation to some trigonometric functions for Float(32).
Absolute approximation error is less than 1e-5.
Fast approximate cleanly vectorizable log for Float(32).
Returns nonsense for x <= 0.0f. Accurate up to the last 5 bits of the mantissa. Vectorizes cleanly.
Fast approximate cleanly vectorizable exp for Float(32).
Returns nonsense for inputs that would overflow or underflow. Typically accurate up to the last 5 bits of the mantissa. Gets worse when approaching overflow. Vectorizes cleanly.
Fast approximate cleanly vectorizable pow for Float(32).
Returns nonsense for x < 0.0f. Accurate up to the last 5 bits of the mantissa for typical exponents. Gets worse when approaching overflow. Vectorizes cleanly.
Fast approximate inverse for Float(32).
Corresponds to the rcpps instruction on x86, and the vrecpe instruction on ARM. Vectorizes cleanly. Note that this can produce slightly different results across different implementations of the same architecture (e.g. AMD vs Intel), even when strict_float is enabled.
Fast approximate inverse square root for Float(32).
Corresponds to the rsqrtps instruction on x86, and the vrsqrte instruction on ARM. Vectorizes cleanly. Note that this can produce slightly different results across different implementations of the same architecture (e.g. AMD vs Intel), even when strict_float is enabled.
Return the greatest whole number less than or equal to a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value is still in floating point, despite being a whole number. Vectorizes cleanly.
Return the least whole number greater than or equal to a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value is still in floating point, despite being a whole number. Vectorizes cleanly.
Return the whole number closest to a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value is still in floating point, despite being a whole number. On ties, we round towards the nearest even integer. Note that this is not the same as std::round in C, which rounds away from zero. On platforms without a native instruction for this, it is emulated, and may be more expensive than cast<int>(x + 0.5f) or similar.
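The difference between ties-to-even and std::round can be seen with the standard library alone. The sketch below uses std::nearbyint, which rounds ties to even under the default floating-point rounding mode (assumed here to be unchanged); it models the documented tie behavior and is not Halide's implementation.

```cpp
#include <cassert>
#include <cmath>

// Ties-to-even rounding, assuming the default rounding mode (FE_TONEAREST).
inline float round_ties_to_even(float x) {
    return std::nearbyint(x);
}

// Usage: halfway cases go to the even neighbour, unlike std::round, and
// unlike the cast<int>(x + 0.5f) idiom, which always moves ties upward.
inline void demo_round() {
    assert(round_ties_to_even(0.5f) == 0.0f);
    assert(round_ties_to_even(1.5f) == 2.0f);
    assert(round_ties_to_even(2.5f) == 2.0f);
    assert(std::round(2.5f) == 3.0f);
    assert(static_cast<int>(2.5f + 0.5f) == 3);
}
```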
Return the integer part of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value is still in floating point, despite being a whole number. Vectorizes cleanly.
Return the fractional part of a floating-point expression.
If the argument is not floating-point, it is cast to Float(32). The return value has the same sign as the original expression. Vectorizes cleanly.
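A fractional part that keeps the sign of its input is what results from subtracting the truncated value. The following plain C++ model assumes that definition (x minus its integer part); `fract_model` is an illustrative name.

```cpp
#include <cassert>
#include <cmath>

// Fractional part with the same sign as the input: x minus its integer part.
inline float fract_model(float x) {
    return x - std::trunc(x);
}

// Usage: the chosen inputs are exactly representable, so the comparisons are exact.
inline void demo_fract() {
    assert(fract_model(2.75f) == 0.75f);
    assert(fract_model(-1.25f) == -0.25f);
    assert(fract_model(3.0f) == 0.0f);
}
```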
Reinterpret the bits of one value as another type.
Referenced by reinterpret(), and Halide::Internal::GeneratorInput_Scalar< T >::set_estimate().
Definition at line 1079 of file IROperator.h.
References reinterpret().
Return the bitwise and of two expressions (which need not have the same type).
The result type is the wider of the two expressions. Only integral types are allowed and both expressions must be signed or both must be unsigned.
Return the bitwise and of an expression and an integer.
The type of the result is the type of the expression argument.
Return the bitwise or of two expressions (which need not have the same type).
The result type is the wider of the two expressions. Only integral types are allowed and both expressions must be signed or both must be unsigned.
Return the bitwise or of an expression and an integer.
The type of the result is the type of the expression argument.
Return the bitwise xor of two expressions (which need not have the same type).
The result type is the wider of the two expressions. Only integral types are allowed and both expressions must be signed or both must be unsigned.
Return the bitwise xor of an expression and an integer.
The type of the result is the type of the expression argument.
Shift the bits of an integer value left.
This is actually less efficient than multiplying by 2^n, because Halide's optimization passes understand multiplication and will compile it to shifting. Use this operator only when you genuinely need bit shifting (e.g. because the exponent is a run-time parameter). The type of the result is equal to the type of the first argument. Both arguments must have integer type.
Shift the bits of an integer value right.
Does sign extension for signed integers. This is less efficient than dividing by a power of two. Halide's definition of division (always round to negative infinity) means that all divisions by powers of two get compiled to bit-shifting, and Halide's optimization routines understand division and can work with it. The type of the result is equal to the type of the first argument. Both arguments must have integer type.
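The reason round-to-negative-infinity division by a power of two can be compiled to a shift is that an arithmetic right shift computes exactly that quotient. The check below is a plain C++ sketch; it assumes `>>` on a negative int is an arithmetic shift, which C++20 guarantees and mainstream compilers also provide for earlier standards.

```cpp
#include <cassert>

// floor(x / 2^n), written with C++'s truncating division plus a correction.
inline int floor_div_pow2(int x, int n) {
    const int divisor = 1 << n;
    int q = x / divisor;
    if (x % divisor != 0 && x < 0) {
        --q;  // truncation rounded up for a negative dividend; round down instead
    }
    return q;
}

// Usage: a sign-extending shift matches floor division for every tested input.
inline void demo_shift_is_floor_div() {
    for (int x = -16; x <= 16; ++x) {
        for (int n = 0; n <= 3; ++n) {
            assert((x >> n) == floor_div_pow2(x, n));
        }
    }
}
```

Truncating division does not have this property: -7 / 2 is -3 in C++, while -7 >> 1 is -4.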
Linear interpolate between the two values according to a weight.
zero_val | The result when weight is 0 |
one_val | The result when weight is 1 |
weight | The interpolation amount |
Both zero_val and one_val must have the same type. All types are supported, including bool.
The weight is treated as its own type and must be float or an unsigned integer type. It is scaled to the bit-size of the type of zero_val and one_val if they are integer, or converted to float if they are float. Integer weights are converted to float via division by the full-range value of the weight's type. Floating-point weights used to interpolate between integer values must be between 0.0f and 1.0f, and an error may be signaled if they are not provably so. (clamp operators can be added to provide proof. Currently an error is only signaled for constant weights.)
For integer linear interpolation, out of range values cannot be represented. In particular, weights that are conceptually less than 0 or greater than 1.0 are not representable. As such the result is always between zero_val and one_val (inclusive). For lerp with floating-point values and floating-point weight, the full range of a float is valid, however underflow and overflow can still occur.
Ordering is not required between zero_val and one_val: lerp(42, 69, .5f) == lerp(69, 42, .5f) == 56
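The symmetry claim can be reproduced with a small model of lerp on 8-bit unsigned values with a float weight. This is a sketch only: it computes the exact interpolant in double precision and rounds halfway cases upward, a tie rule inferred from the 56 in the example above rather than taken from Halide's implementation. The weight is assumed to lie in [0, 1].

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Model of lerp(zero_val, one_val, weight) for UInt(8) values and a float
// weight in [0, 1]. Halfway cases round upward (an assumption of this sketch).
inline std::uint8_t lerp_u8(std::uint8_t zero_val, std::uint8_t one_val, float weight) {
    const double w = weight;
    const double exact = (1.0 - w) * zero_val + w * one_val;
    return static_cast<std::uint8_t>(std::floor(exact + 0.5));
}

// Usage: the exact midpoint of 42 and 69 is 55.5, which rounds to 56 either way.
inline void demo_lerp() {
    assert(lerp_u8(42, 69, 0.5f) == 56);
    assert(lerp_u8(69, 42, 0.5f) == 56);
    assert(lerp_u8(0, 255, 0.0f) == 0);
    assert(lerp_u8(0, 255, 1.0f) == 255);
}
```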
Results for integer types are for exactly rounded arithmetic. As such, there are cases where 16-bit and float differ because 32-bit floating-point (float) does not have enough precision to produce the exact result. (Likely true for 32-bit integer vs. double-precision floating-point as well.)
At present, double precision and 64-bit integers are not supported.
Generally, lerp will vectorize as if it were an operation on a type twice the bit size of the inferred type for zero_val and one_val.
Some examples:
Count the number of leading zero bits in an expression.
If the expression is zero, the result is the number of bits in the type.
Count the number of trailing zero bits in an expression.
If the expression is zero, the result is the number of bits in the type.
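The zero-input convention for both counts can be illustrated for 32-bit unsigned values. The loops below are a plain C++ model with illustrative names; a real target would typically use a single instruction.

```cpp
#include <cassert>
#include <cstdint>

// Number of zero bits above the highest set bit; 32 when x is zero.
inline int count_leading_zeros_u32(std::uint32_t x) {
    int n = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1) {
        ++n;
    }
    return n;
}

// Number of zero bits below the lowest set bit; 32 when x is zero.
inline int count_trailing_zeros_u32(std::uint32_t x) {
    int n = 0;
    for (std::uint32_t mask = 1u; mask != 0 && (x & mask) == 0; mask <<= 1) {
        ++n;
    }
    return n;
}

// Usage: a zero input yields the bit width of the type in both cases.
inline void demo_bit_counts() {
    assert(count_leading_zeros_u32(0u) == 32);
    assert(count_leading_zeros_u32(1u) == 31);
    assert(count_trailing_zeros_u32(0u) == 32);
    assert(count_trailing_zeros_u32(8u) == 3);
}
```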
Divide two integers, rounding towards zero.
This is the typical behavior of most hardware architectures, which differs from Halide's division operator, which is Euclidean (rounds towards -infinity). Will throw a runtime error if y is zero, or if y is -1 and x is the minimum signed integer.
Compute the remainder of dividing two integers, when division is rounding toward zero.
This is the typical behavior of most hardware architectures, which differs from Halide's mod operator, which is Euclidean (produces the remainder when division rounds towards -infinity). Will throw a runtime error if y is zero.
Return a random variable representing a uniformly distributed float in the half-open interval [0.0f, 1.0f).
For random numbers of other types, use lerp with a random float as the last parameter.
Optionally takes a seed.
Note that:
    Expr x = random_float();
    Expr y = x + x;
is very different to
    Expr y = random_float() + random_float();
The first doubles a random variable, and the second adds two independent random variables.
A given random variable takes on a unique value that depends deterministically on the pure variables of the function they belong to, the identity of the function itself, and which definition of the function it is used in. They are, however, shared across tuple elements.
This function vectorizes cleanly.
Return a random variable representing a uniformly distributed unsigned 32-bit integer.
See random_float. Vectorizes cleanly.
Return a random variable representing a uniformly distributed 32-bit integer.
See random_float. Vectorizes cleanly.
Definition at line 1289 of file IROperator.h.
References Halide::Internal::collect_print_args(), and print().
Create an Expr that prints whenever it is evaluated, provided that the condition is true.
Referenced by print_when().
Definition at line 1302 of file IROperator.h.
References Halide::Internal::collect_print_args(), and print_when().
Create an Expr that guarantees a precondition.
If 'condition' is true, the return value is equal to the first Expr. If 'condition' is false, halide_error() is called, and the return value is arbitrary. Any additional arguments after the first Expr are stringified and passed as a user-facing message to halide_error(), similar to print().
Note that this essentially always inserts a runtime check into the generated code (except when the condition can be proven at compile time); as such, it should be avoided inside inner loops, except for debugging or testing purposes. Note also that it does not vectorize cleanly (vector values will be scalarized for the check).
However, using this to make assertions about (say) input values can be useful, both in terms of correctness and (potentially) in terms of code generation, e.g.
will allow the optimizer to assume positive, nonzero values for y.
Referenced by require().
Definition at line 1335 of file IROperator.h.
References Halide::Internal::collect_print_args(), and require().
Return an undef value of the given type.
Halide skips stores that depend on undef values, so you can use this to mean "do not modify this memory location". This is an escape hatch that can be used for several things:
You can define a reduction with no pure step, by setting the pure step to undef. Do this only if you're confident that the update steps are sufficient to correctly fill in the domain.
For a tuple-valued reduction, you can write an update step that only updates some tuple elements.
You can define a single-stage pipeline that only has update steps, and depends on the values already in the output buffer.
Use this feature with great caution, as you can use it to load from uninitialized memory.
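The "skip the store" behaviour can be pictured with a plain C++ model. This is not Halide code: `MaybeValue` and `apply_update` are hypothetical names, and an explicit flag stands in for an undef value.

```cpp
#include <cstddef>
#include <vector>

// A value that may be undefined; `defined == false` models undef.
struct MaybeValue {
    bool defined;
    int value;
};

// Apply an update step: stores whose value is undefined are skipped,
// so the corresponding buffer entries keep their previous contents.
inline void apply_update(std::vector<int> &buffer,
                         const std::vector<MaybeValue> &update) {
    for (std::size_t i = 0; i < buffer.size() && i < update.size(); ++i) {
        if (update[i].defined) {
            buffer[i] = update[i].value;  // Defined value: perform the store.
        }
        // Undefined value: no store, buffer[i] is left as it was.
    }
}
```

If the buffer was never initialized, the skipped entries stay uninitialized, which is the hazard the caution above refers to.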
Control the values used in the memoization cache key for memoize.
Normally parameters and other external dependencies are automatically inferred and added to the cache key. The memoize_tag operator allows computing one expression and using either the computed value, or one or more other expressions, in the cache key instead of the parameter dependencies of the computation. The single-argument version is completely safe in that the cache key will use the actual computed value; it is difficult or impossible to produce erroneous caching this way. The more-than-one-argument version allows generating cache keys that do not uniquely identify the computation and thus can result in caching errors.
A potential use for the single-argument version is to handle a floating-point parameter that is quantized to a small integer. Multiple values of the float will produce the same integer, so keying the cache on the integer is more efficient.
The main use for the more-than-one argument version is to provide cache key information for Handles and ImageParams, which otherwise are not allowed inside compute_cached operations. E.g. when passing a group of parameters to an external array function via a Handle, memoize_tag can be used to isolate the actual values used by that computation. If an ImageParam is a constant image with a persistent digest, memoize_tag can be used to key computations using that image on the digest.
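The quantized-parameter use case can be sketched as a plain C++ cache. This is a model of the idea, not Halide's memoization machinery: `QuantizedCache`, the quantization step of 0.25, and `expensive` are all hypothetical.

```cpp
#include <cmath>
#include <map>

// Cache keyed on the quantized integer rather than on the raw float.
struct QuantizedCache {
    std::map<int, double> entries;
    int evaluations = 0;  // How many times the expensive step actually ran.

    double lookup(float parameter) {
        // Many float values collapse onto the same small integer key.
        int key = static_cast<int>(std::floor(parameter * 4.0f));
        std::map<int, double>::iterator it = entries.find(key);
        if (it != entries.end()) {
            return it->second;  // Cache hit: nothing is recomputed.
        }
        ++evaluations;
        double result = expensive(key);
        entries[key] = result;
        return result;
    }

    // The computation depends only on the quantized value, so keying on it
    // cannot produce a stale or mismatched result.
    static double expensive(int key) { return key * 0.25; }
};
```

Keying on the raw float instead would give one cache entry per distinct float, even though the computation only distinguishes a handful of integers.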
Definition at line 1410 of file IROperator.h.
References Halide::Internal::memoize_tag_helper().
Expressions tagged with this intrinsic are considered to be part of the steady state of some loop with a nasty beginning and end (e.g. a boundary condition).
When Halide encounters likely intrinsics, it splits the containing loop body into three, and tries to simplify down all conditions that lead to the likely. For example, given the expression: select(x < 1, bar, x > 10, bar, likely(foo)), Halide will split the loop over x into portions where x < 1, 1 <= x <= 10, and x > 10.
You're unlikely to want to call this directly. You probably want to use the boundary condition helpers in the BoundaryConditions namespace instead.
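The three-way split described above can be written out by hand in plain C++. This is an illustrative model of the transformation, not code that Halide emits: `unpartitioned` and `partitioned` are hypothetical names, and `bar` and `foo` are plain integers here.

```cpp
#include <vector>

// Reference form: one loop, with the boundary tests evaluated on every iteration.
inline std::vector<int> unpartitioned(int n, int bar, int foo) {
    std::vector<int> out(n);
    for (int x = 0; x < n; ++x) {
        out[x] = (x < 1) ? bar : (x > 10) ? bar : foo;
    }
    return out;
}

// Partitioned form: the same results, but the steady-state loop has no tests.
inline std::vector<int> partitioned(int n, int bar, int foo) {
    std::vector<int> out(n);
    int x = 0;
    for (; x < n && x < 1; ++x) out[x] = bar;    // prologue: x < 1
    for (; x < n && x <= 10; ++x) out[x] = foo;  // steady state: 1 <= x <= 10
    for (; x < n; ++x) out[x] = bar;             // epilogue: x > 10
    return out;
}
```

The payoff is in the middle loop, where the conditions have been simplified away entirely.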
Referenced by Halide::Internal::IRMatcher::Intrin< Args >::make().
Equivalent to likely, but only triggers a loop partitioning if found in an innermost loop.
Referenced by Halide::Internal::IRMatcher::Intrin< Args >::make().
Cast an expression to the halide type corresponding to the C++ type T.
As part of the cast, clamp to the minimum and maximum values of the result type.
Definition at line 1439 of file IROperator.h.
References saturating_cast().
Referenced by Halide::ConciseCasts::i16_sat(), Halide::ConciseCasts::i32_sat(), Halide::ConciseCasts::i64_sat(), Halide::ConciseCasts::i8_sat(), saturating_cast(), Halide::ConciseCasts::u16_sat(), Halide::ConciseCasts::u32_sat(), Halide::ConciseCasts::u64_sat(), and Halide::ConciseCasts::u8_sat().
Cast an expression to a new type, clamping to the minimum and maximum values of the result type.
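The clamping behaviour can be modelled for integer types in plain C++. This is a sketch of the semantics, not Halide's implementation: `saturating_cast_model` is a hypothetical name, and the model is restricted to integer result types of at most 32 bits so that the limits fit in a 64-bit intermediate.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Clamp `value` to the representable range of T, then convert.
template <typename T>
T saturating_cast_model(std::int64_t value) {
    static_assert(std::is_integral<T>::value && sizeof(T) <= 4,
                  "model only covers integer types of at most 32 bits");
    const std::int64_t lo = static_cast<std::int64_t>(std::numeric_limits<T>::min());
    const std::int64_t hi = static_cast<std::int64_t>(std::numeric_limits<T>::max());
    if (value < lo) return std::numeric_limits<T>::min();  // below range: clamp to min
    if (value > hi) return std::numeric_limits<T>::max();  // above range: clamp to max
    return static_cast<T>(value);                          // in range: plain conversion
}
```

A plain cast of 300 to an 8-bit unsigned type would wrap to 44; the saturating form yields 255 instead.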
Makes a best effort attempt to preserve IEEE floating-point semantics in evaluating an expression.
May not be implemented for all backends. (E.g. it is difficult to do this for C++ code generation as it depends on the compiler flags used to compile the generated code.)
Create an Expr that promises another Expr is clamped, but does not generate code to check the assertion or modify the value.
No attempt is made to prove the bound at compile time. (If it is proved false as a result of something else, an error might be generated, but it is also possible the compiler will crash.) The promised bound is used in bounds inference so it will allow satisfying bounds checks as well as possibly aiding optimization.
unsafe_promise_clamped returns its first argument, the Expr 'value'.
This is a very easy way to make Halide generate erroneous code if the promised bound is not kept. Use sparingly, when there is no other way to convey the information to the compiler and it is required for a valuable optimization.
Unsafe promises can be checked by turning on Target::CheckUnsafePromises. This is intended for debugging only.
References Internal.
Scatter and gather are used for update definitions which must store multiple values to distinct locations at the same time.
The multiple expressions on the right-hand side are bundled together into a "gather", which must match a "scatter" with the same number of arguments on the left-hand side. For example, to store the values 1 and 2 to the locations (x, y, 3) and (x, y, 4), respectively:
The result of gather or scatter can be treated as an expression. Any containing operations on it can be assumed to distribute over the elements. If two gather expressions are combined with an arithmetic operator (e.g. added), they combine element-wise. The following example stores the values 2 * x, 2 * y, and 2 * c to the locations (x + 1, y, c), (x, y + 3, c), and (x, y, c + 2) respectively:
Repeated values in the scatter cause multiple stores to the same location. The stores happen in order from left to right, so the rightmost value wins. The following code is equivalent to f(x) = 5
Gathers are most useful for algorithms which require in-place swapping or permutation of multiple elements, or other kinds of in-place mutations that require loading multiple inputs, doing some operations to them jointly, then storing them again. The following update definition swaps the values of f at locations 3 and 5 if an input parameter p is true:
For more examples of the use of scatter and gather, see test/correctness/multiple_scatter.cpp
It is not currently possible to use scatter and gather to write an update definition in which the number of values loaded or stored varies, as the size of the scatter/gather packet must be fixed at compile time. A workaround is to make the unwanted extra operations a redundant copy of the last operation, which will be dead-code-eliminated by the compiler. For example, the following update definition swaps the values at locations 3 and 5 when the parameter p is true, and rotates the values at locations 1, 2, and 3 when it is false. The load from 3 and store to 5 will be redundantly repeated:
Note that in the p == true case, we redundantly load from 3 and write to 5 twice.
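The ordering rules described above can be modelled on a one-dimensional array in plain C++. This is a sketch of the semantics, not Halide code: `scatter_gather_model` is a hypothetical name, and the model assumes, as the swap example implies, that every gathered load completes before the first store.

```cpp
#include <cstddef>
#include <vector>

// Load f at each index in `load_from`, then store the loaded values to the
// matching index in `store_to`, proceeding left to right.
inline void scatter_gather_model(std::vector<int> &f,
                                 const std::vector<std::size_t> &store_to,
                                 const std::vector<std::size_t> &load_from) {
    std::vector<int> gathered;
    for (std::size_t i = 0; i < load_from.size(); ++i) {
        gathered.push_back(f[load_from[i]]);  // gather: all loads happen first
    }
    for (std::size_t i = 0; i < store_to.size() && i < gathered.size(); ++i) {
        f[store_to[i]] = gathered[i];  // scatter: later stores overwrite earlier ones
    }
}
```

Storing to {3, 5} from {5, 3} swaps the two elements, and storing to {3, 5, 5} from {5, 3, 3} produces the same result, which is why a redundant final operation is a harmless way to pad a packet to a fixed size.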
Referenced by scatter().
Definition at line 1572 of file IROperator.h.
References scatter().
Definition at line 1577 of file IROperator.h.
References gather().
Extract a contiguous subsequence of the bits of 'e', starting at the bit index given by 'lsb', where zero is the least-significant bit, returning a value of type 't'.
Any out-of-range bits requested are filled with zeros.
extract_bits is especially useful when one wants to load a small vector of a wide type, and treat it as a larger vector of a smaller type. For example, loading a vector of 32 uint8 values from a uint32 Func can be done as follows:
Note that the align_bounds call is critical so that the narrow Exprs are aligned to the wider Exprs. This makes the x%4 term collapse to a constant. If f8 is an output Func, then constraining the min value of x to be a known multiple of four would also be sufficient, e.g. via:
See test/correctness/extract_concat_bits.cpp for a complete example.
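The bit arithmetic behind the narrow view can be modelled on scalars in plain C++. This is a sketch rather than Halide code: `extract_u8` and `narrow_view` are hypothetical names, and the x / 4 and 8 * (x % 4) indexing is inferred from the mention of the x%4 term above.

```cpp
#include <cstdint>

// Extract 8 bits of `word` starting at bit index `lsb` (0 = least significant).
// Bits requested beyond bit 31 read as zero.
inline std::uint8_t extract_u8(std::uint32_t word, unsigned lsb) {
    if (lsb >= 32) return 0;  // Shifting by the full width or more is undefined in C++.
    return static_cast<std::uint8_t>((word >> lsb) & 0xFFu);
}

// Element x of the narrow view reads byte (x % 4) of wide element x / 4.
inline std::uint8_t narrow_view(const std::uint32_t *wide, unsigned x) {
    return extract_u8(wide[x / 4], 8 * (x % 4));
}
```

When x is known to start at a multiple of four and the loop is processed four elements at a time, x % 4 takes the fixed values 0, 1, 2, 3, so each shift amount is a constant.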
Referenced by extract_bits().
Definition at line 1607 of file IROperator.h.
References extract_bits().
Given a number of Exprs of the same type, concatenate their bits producing a single Expr of the same type code of the input but with more bits.
The number of arguments must be a power of two.
concat_bits is especially useful when one wants to treat a Func containing values of a narrow type as a Func containing fewer values of a wider type. For example, the following code reinterprets vectors of 32 uint8 values as a vector of 8 uint32s:
See test/correctness/extract_concat_bits.cpp for a complete example.
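The widening direction can be modelled on scalars in the same way. This plain C++ sketch is not Halide code: `concat_u8x4` is a hypothetical name, and it assumes that the first argument supplies the least-significant bits, which the text above does not state.

```cpp
#include <cstdint>

// Concatenate four 8-bit values into one 32-bit value.
// Assumption: the first argument lands in the least-significant byte.
inline std::uint32_t concat_u8x4(std::uint8_t b0, std::uint8_t b1,
                                 std::uint8_t b2, std::uint8_t b3) {
    return static_cast<std::uint32_t>(b0) |
           (static_cast<std::uint32_t>(b1) << 8) |
           (static_cast<std::uint32_t>(b2) << 16) |
           (static_cast<std::uint32_t>(b3) << 24);
}
```

Four inputs is a power of two, in line with the restriction on the argument count, and the result keeps the unsigned-integer type code while quadrupling the bit width.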
std::ostream & Halide::operator<<(std::ostream & stream, const DeviceAPI &)
Emit a halide device api type in human-readable form.
std::ostream & Halide::operator<<(std::ostream & stream, const MemoryType &)
Emit a halide memory type in human-readable form.
std::ostream & Halide::operator<<(std::ostream & stream, const TailStrategy & t)
Emit a halide tail strategy in human-readable form.
std::ostream & Halide::operator<<(std::ostream & stream, const LoopLevel &)
Create a zero-dimensional halide function that returns the given expression.
The function may have more dimensions if the expression contains implicit arguments.
Referenced by Halide::BoundaryConditions::Internal::func_like_to_func().
Create a 1-D halide function in the first argument that returns the second argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
Create a 2-D halide function in the first two arguments that returns the last argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
Create a 3-D halide function in the first three arguments that returns the last argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
Create a 4-D halide function in the first four arguments that returns the last argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
Func Halide::lambda(const Var & x, const Var & y, const Var & z, const Var & w, const Var & v, const Expr & e)
Create a 5-D halide function in the first five arguments that returns the last argument.
The function may have more dimensions if the expression contains implicit arguments and the list of Var arguments contains a placeholder ("_").
std::unique_ptr< llvm::Module > Halide::compile_module_to_llvm_module(const Module & module, llvm::LLVMContext & context)
Generate an LLVM module.
std::unique_ptr< llvm::raw_fd_ostream > Halide::make_raw_fd_ostream(const std::string & filename)
Construct an llvm output stream for writing to files.
void Halide::compile_llvm_module_to_object(llvm::Module & module, Internal::LLVMOStream & out)
Compile an LLVM module to native targets (objects, native assembly).
void Halide::compile_llvm_module_to_assembly(llvm::Module & module, Internal::LLVMOStream & out)
void Halide::compile_llvm_module_to_llvm_bitcode(llvm::Module & module, Internal::LLVMOStream & out)
Compile an LLVM module to LLVM targets (bitcode, LLVM assembly).
void Halide::compile_llvm_module_to_llvm_assembly(llvm::Module & module, Internal::LLVMOStream & out)
void Halide::create_static_library(const std::vector< std::string > & src_files, const Target & target, const std::string & dst_file, bool deterministic = true)
Concatenate the list of src_files into dst_file, using the appropriate static library format for the given target (e.g., .a or .lib).
If deterministic is true, emit 0 for all GID/UID/timestamps, and 0644 for all modes (equivalent to the ar -D option).
Link a set of modules together into one module.
void Halide::compile_standalone_runtime(const std::string & object_filename, const Target & t)
Create an object file containing the Halide runtime for a given target.
For use with Target::NoRuntime. Standalone runtimes are only compatible with pipelines compiled by the same build of Halide used to call this function.
std::map< OutputFileType, std::string > Halide::compile_standalone_runtime(const std::map< OutputFileType, std::string > & output_files, const Target & t)
Create an object and/or static library file containing the Halide runtime for a given target.
For use with Target::NoRuntime. Standalone runtimes are only compatible with pipelines compiled by the same build of Halide used to call this function. Return a map with just the actual outputs filled in (typically, OutputFileType::object and/or OutputFileType::static_library).
void Halide::compile_multitarget(const std::string & fn_name, const std::map< OutputFileType, std::string > & output_files, const std::vector< Target > & targets, const std::vector< std::string > & suffixes, const ModuleFactory & module_factory, const CompilerLoggerFactory & compiler_logger_factory = nullptr)
|
inline |
Returns an Expr corresponding to the user context passed to the function (if any).
It is rare that this function is necessary (e.g. to pass the user context to an extern function written in C).
Definition at line 329 of file Param.h.
References Handle(), and Halide::Internal::Variable::make().
std::ostream & Halide::operator<<(std::ostream & stream, const RVar &)
Emit an RVar in a human-readable form.
std::ostream & Halide::operator<<(std::ostream & stream, const RDom &)
Emit an RDom in a human-readable form.
Target Halide::get_host_target()
Return the target corresponding to the host machine.
Referenced by Halide::SimdOpCheckTest::can_run_code().
Target Halide::get_target_from_environment()
Return the target that Halide will use.
If HL_TARGET is set, it uses that. Otherwise it calls get_host_target.
Target Halide::get_jit_target_from_environment()
Return the target that Halide will use for jit-compilation.
If HL_JIT_TARGET is set, it uses that. Otherwise it calls get_host_target. Throws an error if the architecture, bit width, and OS of the target do not match the host target, so this is only useful for controlling the feature set.
Referenced by Halide::Internal::schedule_scalar().
Target::Feature Halide::target_feature_for_device_api(DeviceAPI api)
Get the Target feature corresponding to a DeviceAPI.
For device APIs that do not correspond to any single target feature, returns Target::FeatureEnd.
References Internal.
Construct a signed integer type.
Definition at line 526 of file Type.h.
References Halide::Type::Int.
Referenced by Halide::ConciseCasts::i16(), Halide::ConciseCasts::i16_sat(), Halide::ConciseCasts::i32(), Halide::ConciseCasts::i32_sat(), Halide::ConciseCasts::i64(), Halide::ConciseCasts::i64_sat(), Halide::ConciseCasts::i8(), Halide::ConciseCasts::i8_sat(), and Halide::NamesInterface::Int().
Construct an unsigned integer type.
Definition at line 531 of file Type.h.
References Halide::Type::UInt.
Referenced by Bool(), Halide::ConciseCasts::u16(), Halide::ConciseCasts::u16_sat(), Halide::ConciseCasts::u32(), Halide::ConciseCasts::u32_sat(), Halide::ConciseCasts::u64(), Halide::ConciseCasts::u64_sat(), Halide::ConciseCasts::u8(), Halide::ConciseCasts::u8_sat(), and Halide::NamesInterface::UInt().
Construct a floating-point type.
Definition at line 536 of file Type.h.
References Halide::Type::Float.
Referenced by Halide::SimdOpCheckTest::check_one(), Halide::ConciseCasts::f16(), Halide::ConciseCasts::f32(), Halide::ConciseCasts::f64(), and Halide::NamesInterface::Float().
Construct a floating-point type in the bfloat format.
Only 16-bit is currently supported.
Definition at line 541 of file Type.h.
References Halide::Type::BFloat.
Referenced by Halide::ConciseCasts::bf16().
Construct a boolean type.
Definition at line 546 of file Type.h.
References UInt().
Referenced by Halide::NamesInterface::Bool().
Construct a handle type.
Definition at line 551 of file Type.h.
References Halide::Type::Handle.
Referenced by user_context_value().
std::string Halide::type_to_c_type(Type type, bool include_space, bool c_plus_plus = true)
Convert a Halide type to a C++ type.
void Halide::load_plugin(const std::string & lib_name)
Load a plugin in the form of a dynamic library (e.g. for custom autoschedulers).
If the string doesn't contain any . characters, the proper prefix and/or suffix for the platform will be added:
foo -> libfoo.so (Linux/OSX/etc; note that .dylib is not supported)
foo -> foo.dll (Windows)
Otherwise, it is assumed to be an appropriate pathname.
Any error in loading will assert-fail.
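The naming rule above can be expressed as a small pure function. This is a plain C++ sketch of the rule as documented, not Halide's loader: `plugin_filename` and its `is_windows` flag are hypothetical, and no library is actually opened.

```cpp
#include <string>

// Map a plugin name to the filename that would be loaded.
inline std::string plugin_filename(const std::string &lib_name, bool is_windows) {
    if (lib_name.find('.') != std::string::npos) {
        return lib_name;  // Contains a '.', so it is treated as a pathname already.
    }
    // Bare name: add the platform's prefix and/or suffix.
    return is_windows ? lib_name + ".dll" : "lib" + lib_name + ".so";
}
```

Under this rule a bare name never resolves to a .dylib file, so on macOS such a plugin has to be given as an explicit pathname.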
References Internal.
void Halide::set_compiler_stack_size(size_t)
Set how much stack the compiler should use for compilation in bytes.
This can also be set through the environment variable HL_COMPILER_STACK_SIZE, though this function takes precedence. A value of zero causes the compiler to just use the calling stack for all compilation tasks.
Calling this or setting the environment variable should not be necessary. It is provided for three kinds of testing:
First, Halide uses it in our internal tests to make sure we're not using a silly amount of stack size on some canary programs to avoid stack usage regressions.
Second, if you have a mysterious crash inside a generator, you can set a larger stack size as a way to test if it's a stack overflow. Perhaps our default stack size is not large enough for your program and schedule. Use this call or the environment var as a workaround, and then open a bug with a reproducer at github.com/halide/Halide/issues so that we can determine what's going wrong that is causing your code to use so much stack.
Third, perhaps using a side-stack is causing problems with sanitizing, debugging, or profiling tools. If this is a problem, you can set HL_COMPILER_STACK_SIZE to zero to make Halide stay on the main thread's stack.
size_t Halide::get_compiler_stack_size()
Return how much stack size the compiler should use for calls that go through run_with_large_stack below.
Currently that's lowering and codegen. If no call to set_compiler_stack_size has been made, this checks the value of the environment variable HL_COMPILER_STACK_SIZE. If that's unset, it returns default_compiler_stack_size, defined above.
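The lookup order described above (explicit call, then environment variable, then default) can be modelled as follows. This is a plain C++ sketch, not Halide's implementation: `StackSizeSetting` and `resolve_stack_size` are hypothetical names, the default is passed in rather than taken from Halide, and treating an empty environment string as unset is an assumption.

```cpp
#include <cstddef>
#include <cstdlib>

// Records whether set_compiler_stack_size-style configuration has happened.
struct StackSizeSetting {
    bool explicitly_set;
    std::size_t explicit_value;
};

// env_value is the raw environment string, or nullptr when the variable is unset.
inline std::size_t resolve_stack_size(const StackSizeSetting &setting,
                                      const char *env_value,
                                      std::size_t default_size) {
    if (setting.explicitly_set) {
        return setting.explicit_value;  // An explicit call takes precedence.
    }
    if (env_value != nullptr && env_value[0] != '\0') {
        return static_cast<std::size_t>(std::strtoull(env_value, nullptr, 10));
    }
    return default_size;  // Neither source was provided.
}
```

In real use the environment string would come from std::getenv("HL_COMPILER_STACK_SIZE"); an explicit value of zero is a legitimate setting meaning "stay on the calling stack", which is why the model tracks a separate flag instead of testing for zero.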
References Internal.
const int Halide::head1_channels = 8
Definition at line 7 of file NetworkSize.h.
const int Halide::head1_w = 40
Definition at line 7 of file NetworkSize.h.
const int Halide::head1_h = 7
Definition at line 7 of file NetworkSize.h.
const int Halide::head2_channels = 24
Definition at line 8 of file NetworkSize.h.
const int Halide::head2_w = 39
Definition at line 8 of file NetworkSize.h.
const int Halide::conv1_channels = 32
Definition at line 9 of file NetworkSize.h.
constexpr const DeviceAPI Halide::all_device_apis[]
An array containing all the device apis.
Useful for iterating through them.
Definition at line 30 of file DeviceAPI.h.