Halide 19.0.0
Halide compiler and libraries
doc | |
► src | |
► autoschedulers | |
► adams2019 | |
AutoSchedule.h | |
Cache.h | |
cost_model_schedule.h | |
CostModel.h | |
DefaultCostModel.h | |
Featurization.h | |
FunctionDAG.h | |
LoopNest.h | |
NetworkSize.h | |
State.h | |
Timer.h | |
Weights.h | |
► anderson2021 | |
► test | |
test.h | |
AutoSchedule.h | |
cost_model_schedule.h | |
CostModel.h | |
DefaultCostModel.h | |
Featurization.h | |
FunctionDAG.h | |
GPULoopInfo.h | Data structure containing information about the current GPU loop nest hierarchy of blocks, threads, etc |
GPUMemInfo.h | Data structures that help track memory access information |
LoopNest.h | |
LoopNestParser.h | |
NetworkSize.h | |
SearchSpace.h | |
SearchSpaceOptions.h | |
State.h | |
Statistics.h | |
ThreadInfo.h | Data structure containing information about GPU threads for a particular location in the loop nest and its surrounding block |
Tiling.h | |
Weights.h | |
► common | |
ASLog.h | |
cmdline.h | |
Errors.h | |
HalidePlugin.h | |
ParamParser.h | |
PerfectHashMap.h | |
► runtime | |
► hexagon_remote | |
► bin | |
► src | |
halide_hexagon_remote.h | |
► qurt | |
known_symbols.h | |
log.h | |
sim_protocol.h | |
► internal | |
block_allocator.h | |
block_storage.h | |
linked_list.h | |
memory_arena.h | |
memory_resources.h | |
pointer_table.h | |
region_allocator.h | |
string_storage.h | |
string_table.h | |
android_ioctl.h | |
cl_functions.h | |
constants.h | This file contains private constants shared between the Halide library and the Halide runtime |
cpu_features.h | |
cuda_functions.h | |
device_buffer_utils.h | |
device_interface.h | |
gpu_context_common.h | |
HalideBuffer.h | Defines a Buffer type that wraps halide_buffer_t and adds functionality, along with methods for more conveniently iterating over the samples in a halide_buffer_t outside of Halide code |
HalidePyTorchCudaHelpers.h | Override Halide's CUDA hooks so that the Halide code called from PyTorch uses the correct GPU device and stream |
HalidePyTorchHelpers.h | Set of utility functions to wrap PyTorch tensors into Halide buffers, making sure the data is on the correct device (CPU/GPU) |
HalideRuntime.h | This file declares the routines used by Halide internally in its runtime |
HalideRuntimeCuda.h | Routines specific to the Halide Cuda runtime |
HalideRuntimeD3D12Compute.h | Routines specific to the Halide Direct3D 12 Compute runtime |
HalideRuntimeHexagonDma.h | Routines specific to the Halide Hexagon DMA host-side runtime |
HalideRuntimeHexagonHost.h | Routines specific to the Halide Hexagon host-side runtime |
HalideRuntimeMetal.h | Routines specific to the Halide Metal runtime |
HalideRuntimeOpenCL.h | Routines specific to the Halide OpenCL runtime |
HalideRuntimeQurt.h | Routines specific to the Halide QuRT runtime |
HalideRuntimeVulkan.h | Routines specific to the Halide Vulkan runtime |
HalideRuntimeWebGPU.h | Routines specific to the Halide WebGPU runtime |
hashmap.h | |
hexagon_dma_pool.h | |
metal_objc_platform_dependent.h | |
mini_cl.h | |
mini_cuda.h | |
mini_d3d12.h | |
mini_hexagon_dma.h | |
mini_qurt.h | |
mini_qurt_vtcm.h | |
mini_webgpu.h | |
objc_support.h | |
posix_timeval.h | |
printer.h | |
runtime_atomics.h | |
runtime_internal.h | |
scoped_mutex_lock.h | |
scoped_spin_lock.h | |
synchronization_common.h | |
thread_pool_common.h | |
vulkan_context.h | |
vulkan_extensions.h | |
vulkan_functions.h | |
vulkan_interface.h | |
vulkan_internal.h | |
vulkan_memory.h | |
vulkan_resources.h | |
AbstractGenerator.h | |
AddAtomicMutex.h | Defines the lowering pass that inserts mutex allocation code and locks for the atomic nodes that require mutex locks |
AddImageChecks.h | Defines the lowering pass that adds the assertions that validate input and output buffers |
AddParameterChecks.h | Defines the lowering pass that adds the assertions that validate scalar parameters |
AddSplitFactorChecks.h | Defines the lowering pass that adds the assertions that all split factors are strictly positive |
AlignLoads.h | Defines a lowering pass that rewrites unaligned loads into sequences of aligned loads |
AllocationBoundsInference.h | Defines the lowering pass that determines how large internal allocations should be |
ApplySplit.h | Defines a method that returns a list of let stmts, substitutions, and predicates to be added given a split schedule |
Argument.h | Defines a type used for expressing the type signature of a generated halide pipeline |
AssociativeOpsTable.h | Tables listing associative operators and their identities |
Associativity.h | Methods for extracting an associative operator from a Func's update definition, if there is one, and computing the identity of that associative operator |
AsyncProducers.h | Defines the lowering pass that injects task parallelism for producers that are scheduled as async |
AutoScheduleUtils.h | Defines utility functions used by the auto scheduler |
BoundaryConditions.h | Support for imposing boundary conditions on Halide::Funcs |
BoundConstantExtentLoops.h | Defines the lowering pass that enforces a constant extent on all vectorized or unrolled loops |
Bounds.h | Methods for computing the upper and lower bounds of an expression, and the regions of a function read or written by a statement |
BoundsInference.h | Defines the bounds_inference lowering pass |
BoundSmallAllocations.h | Defines the lowering pass that attempts to rewrite small allocations to have constant size |
Buffer.h | |
Callable.h | Defines the front-end class representing a jitted, callable Halide pipeline |
CanonicalizeGPUVars.h | Defines the lowering pass that canonicalizes the GPU var names |
ClampUnsafeAccesses.h | Defines the clamp_unsafe_accesses lowering pass |
Closure.h | Provides Closure class |
CodeGen_C.h | Defines an IRPrinter that emits C++ code equivalent to a halide stmt |
CodeGen_D3D12Compute_Dev.h | Defines the code-generator for producing D3D12-compatible HLSL kernel code |
CodeGen_GPU_Dev.h | Defines the code-generator interface for producing GPU device code |
CodeGen_Internal.h | Defines functionality that's useful to multiple target-specific CodeGen paths, but shouldn't live in CodeGen_LLVM.h (because that's the front-end-facing interface to CodeGen) |
CodeGen_LLVM.h | Defines the base-class for all architecture-specific code generators that use llvm |
CodeGen_Metal_Dev.h | Defines the code-generator for producing Apple Metal shading language kernel code |
CodeGen_OpenCL_Dev.h | Defines the code-generator for producing OpenCL C kernel code |
CodeGen_Posix.h | Defines a base-class for code-generators on posixy cpu platforms |
CodeGen_PTX_Dev.h | Defines the code-generator for producing CUDA device code (PTX) |
CodeGen_PyTorch.h | Defines an IRPrinter that emits C++ code that: |
CodeGen_Targets.h | Provides constructors for code generators for various targets |
CodeGen_Vulkan_Dev.h | Defines the code-generator for producing SPIR-V binary modules for use with the Vulkan runtime |
CodeGen_WebGPU_Dev.h | Defines the code-generator for producing WebGPU shader code (WGSL) |
CompilerLogger.h | Defines an interface used to gather and log compile-time information, stats, etc for use in evaluating internal Halide compilation rules and efficiency |
ConciseCasts.h | Defines concise cast and saturating cast operators to make it easier to read cast-heavy code |
ConstantBounds.h | Methods for computing compile-time constant int64_t upper and lower bounds of an expression |
ConstantInterval.h | Defines the ConstantInterval class, and operators on it |
CPlusPlusMangle.h | A simple function to get a C++ mangled function name for a function |
CSE.h | Defines a pass for introducing let expressions to wrap common sub-expressions |
Debug.h | Defines functions for debug logging during code generation |
DebugArguments.h | Defines a lowering pass that injects debug statements inside a LoweredFunc |
DebugToFile.h | Defines the lowering pass that injects code at the end of every realization to dump functions to a file for debugging |
Definition.h | Defines the internal representation of a halide function's definition and related classes |
Deinterleave.h | Defines methods for splitting up a vector into the even lanes and the odd lanes |
Derivative.h | Automatic differentiation |
DerivativeUtils.h | |
Deserialization.h | |
DeviceAPI.h | Defines DeviceAPI |
DeviceArgument.h | Defines helpers for passing arguments to separate devices, such as GPUs |
DeviceInterface.h | Methods for managing device allocations when jitting |
Dimension.h | Defines the Dimension utility class for Halide pipelines |
DistributeShifts.h | A tool to distribute shifts as multiplies, useful for some backends |
EarlyFree.h | Defines the lowering pass that injects markers just after the last use of each buffer so that they can potentially be freed earlier |
Elf.h | |
EliminateBoolVectors.h | Method to eliminate vectors of booleans from IR |
EmulateFloat16Math.h | Methods for dealing with float16 arithmetic using float32 math, by casting back and forth with bit tricks |
Error.h | |
Expr.h | Base classes for Halide expressions (Halide::Expr) and statements (Halide::Internal::Stmt) |
ExprUsesVar.h | Defines a method to determine if an expression depends on some variables |
Extern.h | Convenience macros that lift functions that take C types into functions that take and return exprs, and call the original function at runtime under the hood |
ExternFuncArgument.h | Defines the internal representation of a halide ExternFuncArgument |
ExtractTileOperations.h | Defines the lowering pass that injects calls to tile intrinsics that support AMX instructions |
FastIntegerDivide.h | |
FindCalls.h | Defines analyses to extract the functions called by a function |
FindIntrinsics.h | Tools to replace common patterns with more readily recognizable intrinsics |
FlattenNestedRamps.h | Defines the lowering pass that flattens nested ramps and broadcasts |
Float16.h | |
Func.h | Defines Func - the front-end handle on a halide function, and related classes |
Function.h | Defines the internal representation of a halide function and related classes |
FunctionPtr.h | |
FuseGPUThreadLoops.h | Defines the lowering pass that fuses and normalizes loops over gpu threads to target CUDA, OpenCL, and Metal |
FuzzFloatStores.h | Defines a lowering pass that messes with floating point stores |
Generator.h | Generator is a class used to encapsulate the building of Funcs in user pipelines |
HexagonAlignment.h | Class for analyzing Alignment of loads and stores for Hexagon |
HexagonOffload.h | Defines a lowering pass to pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module |
HexagonOptimize.h | Tools for optimizing IR for Hexagon |
ImageParam.h | Classes for declaring image parameters to halide pipelines |
InferArguments.h | Interface for a visitor to infer arguments used in a body Stmt |
InjectHostDevBufferCopies.h | Defines the lowering passes that deal with host and device buffer flow |
Inline.h | Methods for replacing calls to functions with their definitions |
InlineReductions.h | Defines some inline reductions: sum, product, minimum, maximum |
IntegerDivisionTable.h | Tables telling us how to do integer division via fixed-point multiplication for various small constants |
Interval.h | Defines the Interval class |
IntrusivePtr.h | Support classes for reference-counting via intrusive shared pointers |
IR.h | Subtypes for Halide expressions (Halide::Expr) and statements (Halide::Internal::Stmt) |
IREquality.h | Methods to test Exprs and Stmts for equality of value |
IRMatch.h | Defines a method to match a fragment of IR against a pattern containing wildcards |
IRMutator.h | Defines a base class for passes over the IR that modify it |
IROperator.h | Defines various operator overloads and utility functions that make it more pleasant to work with Halide expressions |
IRPrinter.h | This header file defines operators that let you dump a Halide expression, statement, or type directly into an output stream in a human readable form |
IRVisitor.h | Defines the base class for things that recursively walk over the IR |
JITModule.h | Defines the struct representing lifetime and dependencies of a JIT compiled halide pipeline |
Lambda.h | Convenience functions for creating small anonymous Halide functions |
Lerp.h | Defines methods for converting a lerp intrinsic into Halide IR |
LICM.h | Methods for lifting loop invariants out of inner loops |
LLVM_Headers.h | |
LLVM_Output.h | |
LLVM_Runtime_Linker.h | Support for linking LLVM modules that comprise the runtime |
LoopCarry.h | |
LoopPartitioningDirective.h | Defines the Partition enum |
Lower.h | Defines the function that generates a statement that computes a Halide function using its schedule |
LowerParallelTasks.h | Support for platform independent lowering of Halide parallel and async mechanisms |
LowerWarpShuffles.h | Defines the lowering pass that injects CUDA warp shuffle instructions to access storage outside of a GPULane loop |
MainPage.h | This file only exists to contain the front-page of the documentation |
Memoization.h | Defines the interface to the pass that injects support for compute_cached roots |
Module.h | Defines Module, an IR container that fully describes a Halide program |
ModulusRemainder.h | Routines for statically determining what expressions are divisible by |
Monotonic.h | Methods for computing whether expressions are monotonic |
ObjectInstanceRegistry.h | Provides a single global registry of Generators, GeneratorParams, and Params indexed by this pointer |
OffloadGPULoops.h | Defines a lowering pass to pull loops marked with GPU device APIs to a separate module, and call them through the appropriate host runtime module |
OptimizeShuffles.h | Defines a lowering pass that replaces indirect loads with dynamic_shuffle intrinsics where possible |
OutputImageParam.h | Classes for declaring output image parameters to halide pipelines |
ParallelRVar.h | Method for checking if it's safe to parallelize an update definition across a reduction variable |
Param.h | Classes for declaring scalar parameters to halide pipelines |
Parameter.h | Defines the internal representation of parameters to halide pipelines |
PartitionLoops.h | Defines a lowering pass that partitions loop bodies into three parts to handle boundary conditions: a prologue, a simplified steady state, and an epilogue |
Pipeline.h | Defines the front-end class representing an entire Halide imaging pipeline |
Prefetch.h | Defines the lowering pass that injects prefetch calls when prefetching appears in the schedule |
PrefetchDirective.h | Defines the PrefetchDirective struct |
PrintLoopNest.h | Defines methods to print out the loop nest corresponding to a schedule |
Profiling.h | Defines the lowering pass that injects print statements when profiling is turned on |
PurifyIndexMath.h | Removes side-effects in integer math |
PythonExtensionGen.h | |
Qualify.h | Defines methods for prefixing names in an expression with a prefix string |
Random.h | Defines deterministic random functions, and methods to redirect front-end calls to random_float and random_int to use them |
RDom.h | Defines the front-end syntax for reduction domains and reduction variables |
Realization.h | Defines Realization - a vector of Buffer for use in pipelines with multiple outputs |
RealizationOrder.h | Defines the lowering pass that determines the order in which realizations are injected and groups functions with fused computation loops |
RebaseLoopsToZero.h | Defines the lowering pass that rewrites loop mins to be 0 |
Reduction.h | Defines internal classes related to Reduction Domains |
RegionCosts.h | Defines RegionCosts - used by the auto scheduler to query the cost of computing some function regions |
RemoveDeadAllocations.h | Defines the lowering pass that removes allocate and free nodes that are not used |
RemoveExternLoops.h | Defines a lowering pass that removes placeholder loops for extern stages |
RemoveUndef.h | Defines a lowering pass that elides stores that depend on uninitialized values |
Schedule.h | Defines the internal representation of the schedule for a function |
ScheduleFunctions.h | Defines the function that does initial lowering of Halide Functions into a loop nest using its schedule |
Scope.h | Defines the Scope class, which is used for keeping track of names in a scope while traversing IR |
SelectGPUAPI.h | Defines a lowering pass that selects which GPU API to use for each GPU for loop |
Serialization.h | |
Simplify.h | Methods for simplifying halide statements and expressions |
Simplify_Internal.h | The simplifier is separated into multiple compilation units with this single shared header to speed up the build |
SimplifyCorrelatedDifferences.h | Defines a simplification pass for handling differences of correlated expressions |
SimplifySpecializations.h | Defines a pass that tries to simplify the RHS/LHS of a function's definition based on its specializations |
SkipStages.h | Defines a pass that dynamically avoids realizing unnecessary stages |
SlidingWindow.h | Defines the sliding_window lowering optimization pass, which avoids computing provably-already-computed values |
Solve.h | |
SpirvIR.h | Defines methods for constructing and encoding instructions into the Khronos format specification known as the Standard Portable Intermediate Representation for Vulkan (SPIR-V) |
SplitTuples.h | Defines the lowering pass that breaks up Tuple-valued realization and productions into several scalar-valued ones |
StageStridedLoads.h | Defines the compiler pass that converts strided loads into dense loads followed by shuffles |
StmtToHTML.h | Defines a function to dump an HTML-formatted visualization to a file |
StorageFlattening.h | Defines the lowering pass that flattens multi-dimensional storage into single-dimensional array access |
StorageFolding.h | Defines the lowering optimization pass that reduces large buffers down to smaller circular buffers when possible |
StrictifyFloat.h | Defines a lowering pass to make all floating-point operations strict for all top-level Exprs |
StripAsserts.h | Defines the lowering pass that strips asserts when NoAsserts is set |
Substitute.h | Defines methods for substituting out variables in expressions and statements |
Target.h | Defines the structure that describes a Halide target |
TargetQueryOps.h | Defines a lowering pass to lower all target_is() and target_has() helpers |
Tracing.h | Defines the lowering pass that injects print statements when tracing is turned on |
TrimNoOps.h | Defines a lowering pass that truncates loops to the region over which they actually do something |
Tuple.h | Defines Tuple - the front-end handle on small arrays of expressions |
Type.h | Defines halide types |
UnifyDuplicateLets.h | Defines the lowering pass that coalesces redundant let statements |
UniquifyVariableNames.h | Defines the lowering pass that renames all variables to have unique names |
UnpackBuffers.h | Defines the lowering pass that unpacks buffer arguments onto the symbol table |
UnrollLoops.h | Defines the lowering pass that unrolls loops marked as such |
UnsafePromises.h | Defines the lowering pass that removes unsafe promises |
Util.h | Various utility functions used internally by Halide |
Var.h | Defines Var - the front-end variable |
VectorizeLoops.h | Defines the lowering pass that vectorizes loops marked as such |
WasmExecutor.h | Support for running Halide-compiled Wasm code in-process |
WrapCalls.h | Defines a pass to replace calls to wrapped Functions with their wrappers |
► test | |
► autoschedulers | |
► adams2019 | |
included_schedule_file.schedule.h | |
► anderson2021 | |
included_schedule_file.schedule.h | |
► common | |
check_call_graphs.h | |
gpu_context.h | |
gpu_object_lifetime_tracker.h | |
halide_test_dirs.h | |
test_sharding.h | |
► correctness | |
simd_op_check.h | |
► fuzz | |
fuzz_helpers.h | |
► runtime | |
common.h |