Halide 19.0.0
Halide compiler and libraries
This is a detailed guide to building your own Halide programs with the official CMake package. If you need directions for building Halide, see BuildingHalideWithCMake.md. If you are looking for Halide's CMake coding guidelines, see CodeStyleCMake.md.
This document assumes some basic familiarity with CMake but tries to be explicit in all its examples. To learn more about CMake, consult the documentation and engage with the community on the CMake Discourse.
There are two main ways to use Halide in your application: as a JIT compiler for dynamic pipelines or an ahead-of-time (AOT) compiler for static pipelines. CMake provides robust support for both use cases.
No matter how you intend to use Halide, you will need some basic CMake boilerplate.
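A minimal sketch of that boilerplate, which the following paragraphs walk through, might look like this. The project name `HalideExample` and the minimum CMake version shown are assumptions; use the version your Halide release actually requires.

```cmake
cmake_minimum_required(VERSION 3.28)  # assumed minimum; check your Halide release
project(HalideExample)                # hypothetical project name

set(CMAKE_CXX_STANDARD 17)            # Halide requires at least C++17
set(CMAKE_CXX_STANDARD_REQUIRED YES)  # fail if the compiler cannot provide C++17
set(CMAKE_CXX_EXTENSIONS NO)          # disable vendor-specific C++ extensions

find_package(Halide REQUIRED)         # locate the installed Halide package
```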
The `cmake_minimum_required` command is required to be the first command executed in a CMake program. It disables all the deprecated behavior ("policies" in CMake lingo) from earlier versions. The `project` command sets the name of the project (and accepts arguments for versioning, language support, etc.) and is required by CMake to be called immediately after setting the minimum version.
The next three variables set the project-wide C++ standard. The first, `CMAKE_CXX_STANDARD`, simply sets the standard version. Halide requires at least C++17. The second, `CMAKE_CXX_STANDARD_REQUIRED`, tells CMake to fail if the compiler cannot provide the requested standard version. Lastly, `CMAKE_CXX_EXTENSIONS` tells CMake to disable vendor-specific extensions to C++. This is not necessary to simply use Halide, but we do not allow such extensions in the Halide repo.
Finally, we use `find_package` to locate Halide on your system. When using the pip package on Linux and macOS, CMake's `find_package` command should find Halide as long as you're in the same virtual environment you installed it in. On Windows, you will need to add the virtual environment root directory to `CMAKE_PREFIX_PATH`.
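One way to do this from within the build itself is sketched below. It assumes the virtual environment has been activated, so that the `VIRTUAL_ENV` environment variable is set. Passing `-DCMAKE_PREFIX_PATH=...` on the `cmake` command line at configure time achieves the same result.

```cmake
# Assumption: an activated virtual environment exports VIRTUAL_ENV.
if (DEFINED ENV{VIRTUAL_ENV})
    # Normalize backslashes so the path is safe to use in CMake lists.
    file(TO_CMAKE_PATH "$ENV{VIRTUAL_ENV}" venv_root)
    list(APPEND CMAKE_PREFIX_PATH "${venv_root}")
endif ()

find_package(Halide REQUIRED)
```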
If `find_package` cannot find Halide, set `CMAKE_PREFIX_PATH` to the Halide installation directory.
To use Halide in JIT mode (like the tutorials do, for example), you can simply link to `Halide::Halide`.
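As a short sketch, with a hypothetical executable `my_halide_app` built from a hypothetical `main.cpp`:

```cmake
add_executable(my_halide_app main.cpp)                        # hypothetical names
target_link_libraries(my_halide_app PRIVATE Halide::Halide)   # JIT-mode library
```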
Then `Halide.h` will be available to your code and everything should just work. That's it!
Using Halide in AOT mode is more complicated, so we'll walk through it step by step. Note that this only applies to Halide generators, so it might be useful to re-read the tutorial on generators. Assume (like in the tutorial) that you have a source file named `my_generators.cpp` and that in it, you have generator classes `MyFirstGenerator` and `MySecondGenerator` with registered names `my_first_generator` and `my_second_generator` respectively.
Then the first step is to add a generator executable to your build:
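A sketch of that step, using `add_halide_generator` (documented later in this guide) and calling the executable target `my_generators`, a name chosen here for illustration:

```cmake
# Builds my_generators.cpp into a generator executable target named my_generators.
add_halide_generator(my_generators SOURCES my_generators.cpp)
```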
Using the generator executable, we can add a Halide library corresponding to `MyFirstGenerator`.
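For instance, assuming the generator executable target is called `my_generators` and that a hypothetical application target `my_app` consumes the result:

```cmake
# The target name doubles as the generator name and the function name by default.
add_halide_library(my_first_generator FROM my_generators)

add_executable(my_app main.cpp)                                  # hypothetical consumer
target_link_libraries(my_app PRIVATE my_first_generator)
```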
This will create a static library target in CMake that corresponds to the output of running your generator. The second generator in the file requires generator parameters to be passed to it. These are also easy to handle:
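A sketch of passing generator parameters through `PARAMS` follows. The parameter names and values are illustrative assumptions modeled on the generators tutorial; substitute the `GeneratorParam`s your generator actually declares.

```cmake
add_halide_library(my_second_generator FROM my_generators
                   # Hypothetical GeneratorParams, passed as key=value pairs.
                   PARAMS parallel=false scale=3.0 rotation=ccw output.type=uint16)
```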
Adding multiple configurations is easy, too:
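For example, two differently configured libraries could be produced from the same generator. As before, the parameter names and values are assumptions for illustration.

```cmake
add_halide_library(my_second_generator_2 FROM my_generators
                   GENERATOR my_second_generator
                   PARAMS scale=9.0 rotation=ccw output.type=float32)

add_halide_library(my_second_generator_3 FROM my_generators
                   GENERATOR my_second_generator
                   PARAMS parallel=false output.type=float64)
```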
Here, we had to specify which generator to use (`my_second_generator`) since it uses the target name by default. The functions in these libraries will be named after the target names, `my_second_generator_2` and `my_second_generator_3`, by default, but it is possible to control this via the `FUNCTION_NAME` parameter.
Each one of these targets, `<GEN>`, carries an associated `<GEN>.runtime` target, which is also a static library containing the Halide runtime. It is transitively linked through `<GEN>` to targets that link to `<GEN>`. On an operating system like Linux, where weak linking is available, this is not an issue. However, on Windows, this can fail due to symbol redefinitions. In these cases, you must declare that the two Halide libraries share a runtime by using the `USE_RUNTIME` argument.
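A sketch of sharing a runtime, reusing the runtime target that accompanies `my_first_generator` (the `PARAMS` values are illustrative assumptions):

```cmake
add_halide_library(my_second_generator_2 FROM my_generators
                   GENERATOR my_second_generator
                   # Reuse the existing runtime instead of creating my_second_generator_2.runtime.
                   USE_RUNTIME my_first_generator.runtime
                   PARAMS scale=9.0 rotation=ccw output.type=float32)
```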
This works correctly even when different combinations of targets are specified for each Halide library. A "greatest common denominator" target that is compatible with all of them will be chosen (or the build will fail).
When the autoschedulers are included in the release package, they are very simple to apply to your own generators. For example, we could update the definition of the `my_first_generator` library above to use the `Adams2019` autoscheduler:
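A sketch of that update, using the `AUTOSCHEDULER` argument and the `Halide::Adams2019` target described later in this guide:

```cmake
add_halide_library(my_first_generator FROM my_generators
                   AUTOSCHEDULER Halide::Adams2019)
```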
Halide provides a generic driver for generators to be used during development for benchmarking and debugging. Suppose you have a generator executable called `my_gen` and a generator within called `my_filter`. Then you can pass a variable name to the `REGISTRATION` parameter of `add_halide_library`. That variable will contain the name of a generated C++ source file, which should be compiled into an executable that links to `Halide::RunGenMain` and `my_filter`.
For example:
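The following sketch wires these pieces together. The variable name `filter_reg_cpp` is an arbitrary choice, and the executable is called `runner` to match the text below.

```cmake
add_halide_library(my_filter FROM my_gen
                   REGISTRATION filter_reg_cpp)   # receives the generated source path

add_executable(runner ${filter_reg_cpp})
target_link_libraries(runner PRIVATE my_filter Halide::RunGenMain)
```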
Then you can run, debug, and benchmark your generator through the `runner` executable. Learn how to interact with these executables in RunGen.md.
Halide provides a CMake package configuration module. The intended way to use the CMake build is to run `find_package(Halide ...)` in your `CMakeLists.txt` file. Closely read the `find_package` documentation before proceeding.
The Halide package script understands a handful of optional components when loading the package.
First, if you plan to use the Halide Image IO library, you will want to include the `png` and `jpeg` components when loading Halide.
Second, Halide releases can contain a variety of configurations: static, shared, debug, release, etc. CMake handles Debug/Release configurations automatically, but generally only allows one type of library to be loaded.
The package understands two components, `static` and `shared`, that specify which type of library you would like to load. For example, if you want to make sure that you link against shared Halide, you can write:
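A one-line sketch of requesting the shared libraries through the `shared` component:

```cmake
find_package(Halide REQUIRED COMPONENTS shared)
```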
If the shared libraries are not available, this will result in a failure.
If no component is specified, then the `Halide_SHARED_LIBS` variable is checked. If it is defined and set to true, then the shared libraries will be loaded or the package loading will fail. Similarly, if it is defined and set to false, the static libraries will be loaded.
If no component is specified and `Halide_SHARED_LIBS` is not defined, then the `BUILD_SHARED_LIBS` variable will be inspected. If it is not defined, or is defined and set to true, then the package will attempt to load the shared libs and fall back to the static libs if they are not available. Similarly, if `BUILD_SHARED_LIBS` is defined and set to false, then it will try the static libs first and then fall back to the shared libs.
To ensure that the Python bindings are available, include the `Python` component.
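As a brief sketch, requesting the `Python` component and checking the `Halide_Python_FOUND` variable mentioned later in this guide could look like this:

```cmake
find_package(Halide REQUIRED COMPONENTS Python)

# With REQUIRED, configuration fails if the component is missing,
# so this message is only reached when the bindings were found.
if (Halide_Python_FOUND)
    message(STATUS "Halide Python bindings are available")
endif ()
```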
Variables that control package loading:
Variable | Description |
---|---|
Halide_SHARED_LIBS | override `BUILD_SHARED_LIBS` when loading the Halide package via `find_package`. Has no effect when using Halide via `add_subdirectory` as a Git or `FetchContent` submodule. |
Halide_RUNTIME_NO_THREADS | skip linking of Threads library to runtime. Should be set if your toolchain does not support it (e.g. baremetal). |
Halide_RUNTIME_NO_DL_LIBS | skip linking of DL library to runtime. Should be set if your toolchain does not support it (e.g. baremetal). |
Variables set by the package:
Variable | Description |
---|---|
Halide_VERSION | The full version string of the loaded Halide package |
Halide_VERSION_MAJOR | The major version of the loaded Halide package |
Halide_VERSION_MINOR | The minor version of the loaded Halide package |
Halide_VERSION_PATCH | The patch version of the loaded Halide package |
Halide_VERSION_TWEAK | The tweak version of the loaded Halide package |
Halide_HOST_TARGET | The Halide target triple corresponding to "host" for this build. |
Halide_CMAKE_TARGET | The Halide target triple corresponding to the active CMake target. |
Halide_ENABLE_EXCEPTIONS | Whether Halide was compiled with exception support |
Halide_ENABLE_RTTI | Whether Halide was compiled with RTTI |
WITH_AUTOSCHEDULERS | Whether the autoschedulers are available |
Variables that control package behavior:
Variable | Description |
---|---|
Halide_PYTHON_LAUNCHER | Semicolon separated list containing a command to launch the Python interpreter. Can be used to set environment variables for Python generators. |
Halide_NO_DEFAULT_FLAGS | Off by default. When enabled, suppresses recommended compiler flags that would be added by add_halide_generator |
Halide defines the following targets that are available to users:
Imported target | Description |
---|---|
Halide::Halide | this is the JIT-mode library to use when using Halide from C++. |
Halide::Generator | this is the target to use when manually defining a generator executable. It supplies a main() function. |
Halide::Runtime | adds include paths to the Halide runtime headers |
Halide::Tools | adds include paths to the Halide tools, including the benchmarking utility. |
Halide::ImageIO | adds include paths to the Halide image IO utility. Depends on PNG::PNG and JPEG::JPEG if they exist or were loaded through the corresponding package components. |
Halide::ThreadPool | adds include paths to the Halide simple thread pool utility library. This is not the same as the runtime's thread pool and is intended only for use by tests. Depends on `Threads::Threads`. |
Halide::RunGenMain | used with the REGISTRATION parameter of add_halide_library to create simple runners and benchmarking tools for Halide libraries. |
The following targets are only guaranteed when requesting the `Python` component (`Halide_Python_FOUND` will be true):
Imported target | Description |
---|---|
Halide::Python | this is a Python 3 package that can be referenced as $<TARGET_FILE_DIR:Halide::Python>/.. when setting up PYTHONPATH for Python tests or the like from CMake. |
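A sketch of using that expression to set `PYTHONPATH` for a CTest test follows. The test name and script `my_test.py` are hypothetical, and the example assumes the standard `FindPython3` module is used to locate the interpreter.

```cmake
find_package(Python3 REQUIRED COMPONENTS Interpreter)
find_package(Halide REQUIRED COMPONENTS Python)

enable_testing()
add_test(NAME my_python_test   # hypothetical test
         COMMAND "${Python3_EXECUTABLE}" "${CMAKE_CURRENT_SOURCE_DIR}/my_test.py")

# Point Python at the directory containing the Halide package.
set_tests_properties(my_python_test PROPERTIES
                     ENVIRONMENT "PYTHONPATH=$<TARGET_FILE_DIR:Halide::Python>/..")
```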
The following targets are only guaranteed when `WITH_AUTOSCHEDULERS` is true:
Imported target | Description |
---|---|
Halide::Adams2019 | the Adams et al. 2019 autoscheduler (no GPU support) |
Halide::Anderson2021 | the Anderson et al. 2021 autoscheduler (full GPU support) |
Halide::Li2018 | the Li et al. 2018 gradient autoscheduler (limited GPU support) |
Halide::Mullapudi2016 | the Mullapudi et al. 2016 autoscheduler (no GPU support) |
The Halide package provides several useful functions for dealing with AOT compilation steps.
add_halide_generator
This function aids in creating cross-compilable builds that use Halide generators.
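The call shape can be summarized roughly as follows. This is a sketch assembled from the arguments described in this section, not an authoritative signature; in particular, the value taken by `PYSTUB` is assumed to be a generator name.

```cmake
add_halide_generator(
    <target>
    [PACKAGE_NAME <package-name>]
    [PACKAGE_NAMESPACE <namespace>]
    [EXPORT_FILE <export-file>]
    [PYSTUB <generator-name>]        # assumed argument value
    [LINK_LIBRARIES <lib1> ...]
    [[SOURCES] <source1> ...]
)
```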
Every named argument is optional, and the function uses the following default arguments:
- If `PACKAGE_NAME` is not provided, it defaults to `${PROJECT_NAME}-halide_generators`.
- If `PACKAGE_NAMESPACE` is not provided, it defaults to `${PROJECT_NAME}::halide_generators::`.
- If `EXPORT_FILE` is not provided, it defaults to `${PROJECT_BINARY_DIR}/cmake/${ARG_PACKAGE_NAME}-config.cmake`.
This function guarantees that a Halide generator target named `<namespace><target>` is available. It will first search for a package named `<package-name>` using `find_package`; if it is found, it is assumed that it provides the target. Otherwise, it will create an executable target named `<target>` and an `ALIAS` target `<namespace><target>`. This function also creates a custom target named `<package-name>` if it does not exist and `<target>` would exist. In this case, `<package-name>` will depend on `<target>`, which enables easy building of just the Halide generators managed by this function.
After the call, `<PACKAGE_NAME>_FOUND` will be set to true if the host generators were imported (and hence won't be built). Otherwise, it will be set to false. This variable may be used to conditionally set properties on `<target>`.
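A sketch of using this variable follows. The package name `MyGenerators` and the compile definition are hypothetical and only illustrate the pattern.

```cmake
add_halide_generator(my_generators
                     PACKAGE_NAME MyGenerators   # hypothetical package name
                     SOURCES my_generators.cpp)

if (NOT MyGenerators_FOUND)
    # The generator is built in this tree (not imported), so its
    # properties may be modified here.
    target_compile_definitions(my_generators PRIVATE MY_GENERATOR_VERBOSE=1)  # hypothetical macro
endif ()
```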
Please see test/integration/xc for a simple example and apps/hannk for a complete app that uses it extensively.
The `SOURCES` keyword marks the beginning of sources to be used to build `<target>`, if it is not loaded. All unparsed arguments will be interpreted as sources.
The `LINK_LIBRARIES` argument lists libraries that should be linked to `<target>` when it is being built in the present build system.
If `PYSTUB` is specified, then a Python Extension will be built that wraps the Generator with CPython glue to allow use of the Generator from Python 3. The result will be a shared library of the form `<target>_pystub.<soabi>.so`, where `<soabi>` describes the specific Python version and platform (e.g., `cpython-310-darwin` for Python 3.10 on macOS). See Python.md for examples of use.
add_halide_library
This is the main function for managing generators in AOT compilation. The full signature follows:
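The sketch below is assembled from the arguments documented in the rest of this section rather than copied from the package source, so treat it as a summary. The set of accepted `<extra-output>` keywords is not enumerated here; `PYTHON_EXTENSION` is one that appears later in this guide.

```cmake
add_halide_library(<target> FROM <generator-target>
                   [GENERATOR <generator-name>]
                   [FUNCTION_NAME <function-name>]
                   [NAMESPACE <cpp-namespace>]
                   [USE_RUNTIME <hl-target>]
                   [PARAMS <param1> [<param2> ...]]
                   [TARGETS <target1> [<target2> ...]]
                   [FEATURES <feature1> [<feature2> ...]]
                   [PLUGINS <plugin1> [<plugin2> ...]]
                   [AUTOSCHEDULER <scheduler-name>]
                   [GRADIENT_DESCENT]
                   [C_BACKEND]
                   [NO_THREADS]
                   [NO_DL_LIBS]
                   [REGISTRATION <OUTVAR>]
                   [HEADER <OUTVAR>]
                   [FUNCTION_INFO_HEADER <OUTVAR>]
                   [<extra-output> <OUTVAR>])
```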
This function creates a library target called `<target>` corresponding to running the `<generator-target>` (an executable target which links to `Halide::Generator`) one time, using command line arguments derived from the other parameters.
The arguments `GENERATOR` and `FUNCTION_NAME` default to `<target>`. They correspond to the `-g` and `-f` command line flags, respectively.
`NAMESPACE` is syntactic sugar to specify the C++ namespace (if any) of the generated function; you can also specify the C++ namespace (if any) directly in the `FUNCTION_NAME` argument, but for repeated declarations or very long namespaces, specifying this separately can provide more readable build files.
If `USE_RUNTIME` is not specified, this function will create another target called `<target>.runtime` which corresponds to running the generator with `-r` and a compatible list of targets. This runtime target is an `INTERFACE` dependency of `<target>`. If multiple runtime targets need to be linked together, setting `USE_RUNTIME` to another Halide runtime library, `<target2>`, will prevent the generation of `<target>.runtime` and instead use `<target2>.runtime`. This argument is most commonly used in conjunction with `add_halide_runtime`.
Parameters can be passed to a generator via the `PARAMS` argument. Parameters should be space-separated. Similarly, `TARGETS` is a space-separated list of targets for which to generate code in a single function. They must all share the same platform/bits/os triple (e.g. `arm-32-linux`). Features that are in common among all targets, including device libraries (like `cuda`), should go in `FEATURES`. If `TARGETS` is not specified, the value of `Halide_TARGET` specified at configure time will be used.
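Two sketches of these arguments follow. The specific target strings are illustrative assumptions; the first library is compiled for several x86-64 Linux variants in one function, and the second adds a feature common to all of its targets via `FEATURES`.

```cmake
# Multiple targets sharing the same arch-bits-os triple (x86-64-linux).
add_halide_library(my_first_generator FROM my_generators
                   TARGETS x86-64-linux-avx2 x86-64-linux-sse41 x86-64-linux)

# A single target plus a common feature (a device library).
add_halide_library(my_first_generator_gpu FROM my_generators
                   GENERATOR my_first_generator
                   TARGETS x86-64-linux
                   FEATURES cuda)
```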
Every element of `TARGETS` must begin with the same `arch-bits-os` triple. This function understands two meta-triples, `host` and `cmake`. The meta-triple `host` is equal to the `arch-bits-os` triple used to compile Halide along with all the supported instruction set extensions. On platforms that support running both 32 and 64-bit programs, this will not necessarily equal the platform the compiler is running on or that CMake is targeting.
The meta-triple `cmake` is equal to the `arch-bits-os` of the current CMake target. This is useful if you want to make sure you are not unintentionally cross-compiling, which would result in an `IMPORTED` target being created. When `TARGETS` is empty and the `host` target would not cross-compile, then `host` will be used. Otherwise, `cmake` will be used and an author warning will be issued.
When `CMAKE_OSX_ARCHITECTURES` is set and the `TARGETS` argument resolves to `cmake`, the generator will be run once for each architecture and the results will be fused together using `lipo`. This behavior extends to runtime targets.
To use an autoscheduler, set the `AUTOSCHEDULER` argument to a target named like `Namespace::Scheduler`, for example `Halide::Adams2019`. This will set the `autoscheduler` GeneratorParam on the generator command line to `Scheduler` and add the target to the list of plugins. Additional plugins can be loaded by setting the `PLUGINS` argument. If the argument to `AUTOSCHEDULER` does not contain `::` or it does not name a target, it will be passed to the `-s` flag verbatim.
If `GRADIENT_DESCENT` is set, then the module will be built suitably for gradient descent calculation in TensorFlow or PyTorch. See `Generator::build_gradient_module()` for more documentation. This corresponds to passing `-d 1` at the generator command line.
If the `C_BACKEND` option is set, this command will invoke the configured C++ compiler on a generated source. Note that a `<target>.runtime` target is not created in this case, and the `USE_RUNTIME` option is ignored. Other options work as expected.
If `REGISTRATION` is set, the path (relative to `CMAKE_CURRENT_BINARY_DIR`) to the generated `.registration.cpp` file will be set in `OUTVAR`. This can be used to generate a runner for a Halide library that is useful for benchmarking and testing, as documented above. This is equivalent to setting `-e registration` at the generator command line.
If `HEADER` is set, the path (relative to `CMAKE_CURRENT_BINARY_DIR`) to the generated `.h` header file will be set in `OUTVAR`. This can be used with `install(FILES)` to conveniently deploy the generated header along with your library.
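A sketch of deploying the generated header follows. It takes the statement above at face value and prefixes the relative path with `CMAKE_CURRENT_BINARY_DIR`; the variable name `my_filter_header` is an arbitrary choice, and `GNUInstallDirs` is used only to pick a conventional destination.

```cmake
include(GNUInstallDirs)

add_halide_library(my_filter FROM my_gen
                   HEADER my_filter_header)   # receives the generated header path

# Assumption: the path is relative to the current binary directory, as documented.
install(FILES "${CMAKE_CURRENT_BINARY_DIR}/${my_filter_header}"
        DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}")
```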
If `FUNCTION_INFO_HEADER` is set, the path (relative to `CMAKE_CURRENT_BINARY_DIR`) to the generated `.function_info.h` header file will be set in `OUTVAR`. This produces a file that contains `constexpr` descriptions of information about the generated functions (e.g., argument type and information). It is generated separately from the normal `HEADER` file because `HEADER` is intended to work with basic `extern "C"` linkage, while `FUNCTION_INFO_HEADER` requires C++17 or later to use effectively. (This can be quite useful for advanced usages, such as producing automatic call wrappers, etc.) Examples of usage can be found in the generated file.
Each of the `extra-output` arguments directly corresponds to an extra output (via `-e`) from the generator. The value `OUTVAR` names a variable into which a path (relative to `CMAKE_CURRENT_BINARY_DIR`) to the extra file will be written.
When `NO_THREADS` is passed, the library targets will not depend on `Threads::Threads`. It is your responsibility to link to an equivalent target.
When `NO_DL_LIBS` is passed, the library targets will not depend on `${CMAKE_DL_LIBS}`. It is your responsibility to link to an equivalent library.
add_halide_python_extension_library
This function wraps the outputs of one or more `add_halide_library` targets with glue code to produce a Python Extension library.
`HALIDE_LIBRARIES` is a list of one or more `add_halide_library` targets. Each will be added to the extension as a callable method of the module. Note that every library specified must be built with the `PYTHON_EXTENSION` keyword specified, and all libraries must use the same Halide runtime.
The result will be a shared library of the form `<target>.<soabi>.so`, where `<soabi>` describes the specific Python version and platform (e.g., `cpython-310-darwin` for Python 3.10 on macOS).
add_halide_runtime
This function generates a library containing a Halide runtime. Most user code will never need to use this, as `add_halide_library()` will call it for you if necessary. The most common use case is usually in conjunction with `add_halide_python_extension_library()`, as a way to ensure that all the Halide libraries share an identical runtime.
The `TARGETS`, `NO_THREADS`, and `NO_DL_LIBS` arguments have identical semantics to the arguments of the same names for `add_halide_library`.
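A sketch of the shared-runtime pattern for Python extensions follows. The target names `my_runtime` and `my_extension`, the output variable names, and the `PARAMS` values are hypothetical, and the example assumes Halide was loaded with the `Python` component.

```cmake
# One runtime shared by every library that goes into the extension.
add_halide_runtime(my_runtime)

add_halide_library(my_first_generator FROM my_generators
                   USE_RUNTIME my_runtime
                   PYTHON_EXTENSION first_py_ext)     # required for each wrapped library

add_halide_library(my_second_generator FROM my_generators
                   USE_RUNTIME my_runtime
                   PARAMS scale=3.0                   # hypothetical GeneratorParam
                   PYTHON_EXTENSION second_py_ext)

add_halide_python_extension_library(my_extension
                                    HALIDE_LIBRARIES my_first_generator my_second_generator)
```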
Cross-compiling in CMake can be tricky, since CMake doesn't easily support compiling for both the host platform and the cross platform within the same build. Unfortunately, Halide generator executables are just about always designed to run on the host platform. Each project will be set up differently and have different requirements, but here are some suggestions for effective use of CMake in these scenarios.
add_halide_generator
If you are writing new programs that use Halide, you might wish to use `add_halide_generator`. When using this helper, you are expected to build your project twice: once for your build host and again for your intended target.
When building the host build, you can use the `<package-name>` target (see the documentation above) to build just the generators. Then, in the target build, set `<package-name>_ROOT` to the host build directory.
For example:
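The sketch below shows a single `CMakeLists.txt` intended to be configured twice, with the two configure steps noted in comments. The project name, the assumed minimum CMake version, and all paths are hypothetical; the package and namespace names follow the defaults described above for a project named `HalideExample`.

```cmake
cmake_minimum_required(VERSION 3.28)  # assumed minimum
project(HalideExample)                # hypothetical project name

find_package(Halide REQUIRED)

# Host build: builds my_generators.
# Target build: imports it from the package located via
# HalideExample-halide_generators_ROOT instead of building it.
add_halide_generator(my_generators SOURCES my_generators.cpp)

# Refer to the generator by its namespaced name, which is
# available in both the host and the target build.
add_halide_library(my_first_generator
                   FROM HalideExample::halide_generators::my_generators)

# Host build (generators only):
#   cmake -S . -B build-host
#   cmake --build build-host --target HalideExample-halide_generators
#
# Target build (hypothetical toolchain file and host build path):
#   cmake -S . -B build-target
#         -DCMAKE_TOOLCHAIN_FILE=/path/to/target-toolchain.cmake
#         -DHalideExample-halide_generators_ROOT=/path/to/build-host
#   cmake --build build-target
```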
A CMake super-build consists of breaking down a project into subprojects that are isolated by toolchain. The basic structure is to have an outermost project that only coordinates the sub-builds via the `ExternalProject` module.
One would then use Halide to build a generator executable in one self-contained project, then export that target to be used in a separate project. The second project would be configured with the target toolchain and would call `add_halide_library` with no `TARGETS` option and set `FROM` equal to the name of the imported generator executable. Obviously, this is a significant increase in complexity over a typical CMake project.
This is very compatible with the `add_halide_generator` strategy above.
`ExternalProject` directly
A lighter weight alternative to the above is to use `ExternalProject` directly in your parent build. Configure the parent build with the target toolchain, and configure the inner project to use the host toolchain. Then, manually create an `IMPORTED` target for your generator executable and call `add_halide_library` as described above.
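A rough sketch of this arrangement follows. The `generators/` subdirectory, the target names, and especially the guessed location of the built executable are assumptions; the guessed location is exactly the fragile part of this approach.

```cmake
include(ExternalProject)

# Inner project: a hypothetical generators/ directory with its own
# CMakeLists.txt. It does not inherit the parent's toolchain file.
set(host_gen_dir "${CMAKE_CURRENT_BINARY_DIR}/host_generators")
set(host_gen_exe "${host_gen_dir}/my_generators")   # guessed output path, no extension handling

ExternalProject_Add(
    host_generators
    SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/generators"
    BINARY_DIR "${host_gen_dir}"
    INSTALL_COMMAND ""                               # use the binary from the build tree
    BUILD_BYPRODUCTS "${host_gen_exe}"
)

# Manually describe the host-built generator to the parent build.
add_executable(my_generators IMPORTED)
set_target_properties(my_generators PROPERTIES IMPORTED_LOCATION "${host_gen_exe}")
add_dependencies(my_generators host_generators)

add_halide_library(my_first_generator FROM my_generators)
```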
The main drawback of this approach is that creating accurate `IMPORTED` targets is difficult, since it is hard to predict the names and locations of your binaries across all possible platforms and CMake project generators. In particular, it is hard to predict executable extensions in cross-OS builds.
The `CMAKE_CROSSCOMPILING_EMULATOR` variable allows one to specify a command prefix to run a target-system binary on the host machine. One could set this to a custom shell script that uploads the generator executable, runs it on the device and copies back the results.
Another option is to install `qemu-user-static` to transparently emulate the cross-built generator.
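For the explicit variant, a toolchain file might set the emulator as sketched here. The emulator path and target architecture are hypothetical and depend on your system.

```cmake
# In the target toolchain file (hypothetical path and architecture):
set(CMAKE_CROSSCOMPILING_EMULATOR "/usr/bin/qemu-aarch64-static")
```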
The previous two options ensure that the targets generated by `add_halide_library` will be normal static libraries. This approach does not use `ExternalProject`, but instead produces `IMPORTED` targets. The main drawback of `IMPORTED` targets is that they are considered second-class in CMake. In particular, they cannot be installed with the typical `install(TARGETS)` command. Instead, they must be installed using `install(FILES)` and the `$<TARGET_FILE:tgt>` generator expression.
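A sketch of installing such a library, assuming a Halide library target named `my_first_generator` and using `GNUInstallDirs` only to choose a conventional destination:

```cmake
include(GNUInstallDirs)

# install(TARGETS) cannot be used for IMPORTED targets, so install the file itself.
install(FILES "$<TARGET_FILE:my_first_generator>"
        DESTINATION "${CMAKE_INSTALL_LIBDIR}")
```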