Halide 19.0.0
Halide compiler and libraries
Halide supports the Khronos Vulkan framework as a compute API backend for GPU-like devices, and compiles directly to a binary SPIR-V representation as part of its code generation before submitting it to the Vulkan API. Both JIT and AOT usage are supported via the `vulkan` target flag (e.g. `HL_JIT_TARGET=host-vulkan`).
Vulkan support is under active development and is considered BETA quality at this stage. Tests are passing, but performance tuning and user testing are needed to identify potential issues before rolling this into production.
See below for details.
You'll need to configure Halide and enable the cmake option TARGET_VULKAN (which is now ON by default).
For example, on Linux & OSX:
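A minimal sketch of such a configure-and-build step follows. The Ninja generator, the `LLVM_ROOT` variable, the `LLVM_DIR` path layout, and the `build` directory name are all assumptions here, so adjust them to your setup. The guard only runs CMake when invoked from a source tree with `cmake` available:

```shell
# Assumed location of an LLVM install; override by exporting LLVM_ROOT beforehand.
LLVM_ROOT="${LLVM_ROOT:-/usr/lib/llvm}"
# Assumed out-of-tree build directory name.
BUILD_DIR="build"

if command -v cmake >/dev/null 2>&1 && [ -f CMakeLists.txt ]; then
  # Configure with the Vulkan backend enabled, then build.
  cmake -G Ninja \
    -DTARGET_VULKAN=ON \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_DIR="$LLVM_ROOT/lib/cmake/llvm" \
    -S . -B "$BUILD_DIR"
  cmake --build "$BUILD_DIR"
else
  echo "run this from the Halide source root with cmake installed"
fi
```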
On Windows, you may need to specify the location of the Vulkan SDK if the paths aren't resolved by CMake automatically. For example (assuming the Vulkan SDK is installed in the default path):
Halide has no direct dependency on Vulkan for code-generation, but the runtime requires a working Vulkan environment to run Halide generated code. Any valid Vulkan v1.0+ device driver should work.
Specifically, you'll need:
For AMD & NVIDIA & Intel devices, download and install the latest graphics driver for your platform. Vulkan support should be included.
To build Halide AOT generators, you'll need the Vulkan SDK (specifically the Vulkan loader library and headers): https://sdk.lunarg.com/sdk/download/latest/windows/vulkan-sdk.exe
For Vulkan device drivers, consult the appropriate hardware vendor for your device. A few common ones are listed below.
The Vulkan SDK packages are now maintained by LunarG. These include the Vulkan Loader library, as well as the Vulkan Tools packages. Installation instructions can be found in their Getting Started Guide.
Once the SDK has been installed, you need to install the appropriate driver for your device. Proprietary drivers can be installed via 'apt' using PPAs for each vendor. Examples for AMD and NVIDIA are provided below.
For AMD on Ubuntu v22.04:
For NVIDIA on Ubuntu v22.04:
Note that only valid drivers for your system should be installed, since there are reports of the Vulkan loader segfaulting just by having a non-supported driver present. Specifically, the seemingly generic `mesa-vulkan-drivers` package actually includes the AMD graphics driver, which can cause problems if installed on an NVIDIA-only system.
You're better off using Halide's Metal backend instead, but it is possible to run Vulkan apps on a Mac via the MoltenVK library:
The easiest way to get the necessary dependencies is to use the official MoltenVK SDK installer provided by LunarG:
Alternatively, if you have the Homebrew package manager installed for MacOS, you can use it to install the Vulkan Loader and MoltenVK compatibility layer:
You can validate that everything is configured correctly by running the `vulkaninfo` app (bundled in the `vulkan-utils` package) to make sure your device is detected.
Make sure everything looks correct before continuing!
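A minimal sketch of that check, assuming a reasonably recent `vulkaninfo` that accepts the `--summary` option (older builds may only support the full report, so the sketch falls back to it):

```shell
if command -v vulkaninfo >/dev/null 2>&1; then
  # Short per-device overview; fall back to the full report if --summary is unsupported.
  vulkaninfo --summary || vulkaninfo
  have_vulkaninfo="yes"
else
  echo "vulkaninfo not found; install the Vulkan tools for your platform first"
  have_vulkaninfo="no"
fi
```

If the tool runs, the device you intend to use should appear by name in the output.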
To generate Halide code for Vulkan, simply add the `vulkan` flag to your target as well as any other optional device specific features you wish to enable for Halide:
Target Feature | Description |
---|---|
vulkan | Enables the vulkan backend |
vk_int8 | Allows 8-bit integer storage types to be used |
vk_int16 | Allows 16-bit integer storage types to be used |
vk_int64 | Allows 64-bit integer storage types to be used |
vk_float16 | Allows 16-bit floating-point values to be used for computation |
vk_float64 | Allows 64-bit floating-point values to be used for computation |
vk_v10 | Generates code compatible with the Vulkan v1.0+ API |
vk_v12 | Generates code compatible with the Vulkan v1.2+ API |
vk_v13 | Generates code compatible with the Vulkan v1.3+ API |
Note that 32-bit integer and floating-point types are always available. All other optional device features are off by default (since they are not required by the Vulkan API, and thus must be explicitly enabled to ensure that the code being generated will be compatible with the device and API version being used for execution).
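As an illustration, a target string can be assembled by joining the base target and the feature names from the table with `-`. The sketch below picks an arbitrary subset of features (`vk_int8`, `vk_float16`, `vk_v13`). The variable names are placeholders, and the features chosen should match what your device and API version actually support:

```shell
# Base target with the Vulkan backend enabled.
base="host-vulkan"
# Optional features to enable (illustrative selection from the table above).
features="vk_int8 vk_float16 vk_v13"

target="$base"
for f in $features; do
  target="$target-$f"   # features are appended with '-' separators
done

echo "$target"   # prints host-vulkan-vk_int8-vk_float16-vk_v13
```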
For AOT generators add `vulkan` (and any other flags you wish to use) to the target command line option:
For JIT apps use the `HL_JIT_TARGET` environment variable:
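The variable can be set for a single command or exported for the whole shell session. In this sketch, `sh -c` stands in for a JIT-enabled Halide binary (which is not named here), and the feature selection is only an example:

```shell
# Set the JIT target for one command only; the variable does not persist afterwards.
# Replace the `sh -c ...` stand-in with your own JIT-enabled binary.
HL_JIT_TARGET=host-vulkan-vk_int8-vk_int16 sh -c 'echo "JIT target: $HL_JIT_TARGET"'

# Or export it so every subsequent command in this shell session sees it.
export HL_JIT_TARGET=host-vulkan-vk_int8-vk_int16
```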
To modify the default behavior of the runtime, the following environment variables can be used to adjust the configuration of the Vulkan backend at execution time:
`HL_VK_LAYERS=...` will tell Halide to choose a suitable Vulkan instance that supports the given list of layers. If not set, `VK_INSTANCE_LAYERS=...` will be used instead. If neither is present, Halide will use the first Vulkan compute device it can find. Multiple layers can be specified using the appropriate environment variable list delimiter (`:` on Linux/OSX/Posix, or `;` on Windows).
`HL_VK_DEVICE_TYPE=...` will tell Halide which type of device to select when creating the Vulkan instance. Valid options are 'gpu', 'discrete-gpu', 'integrated-gpu', 'virtual-gpu', or 'cpu'. If not set, Halide will search for the first 'gpu' like device it can find, or fall back to the first compute device it can find.
`HL_VK_ALLOC_CONFIG=...` will tell Halide to configure the Vulkan memory allocator to use the given constraints, specified as five integer values separated by the appropriate environment variable list delimiter (e.g. `N:N:N:N:N` on Linux/OSX/Posix, or `N;N;N;N;N` on Windows). These values correspond to `maximum_pool_size`, `minimum_block_size`, `maximum_block_size`, `maximum_block_count` and `nearest_multiple`.
The `maximum_pool_size` constraint will tell Halide to configure the Vulkan memory allocator to never request more than N megabytes for the entire pool of allocations for the context. This includes all resource blocks used for suballocations. Setting this to a non-zero value will limit the amount of device memory used by Halide, which may be useful when other applications and frameworks are competing for resources. Default is 0, meaning no limit.
The `minimum_block_size` constraint will tell Halide to configure the Vulkan memory allocator to always request a minimum of N megabytes for a resource block, which will be used as a pool for suballocations. Increasing this value may improve performance while sacrificing the amount of available device memory. Default is 32MB.
The `maximum_block_size` constraint will tell Halide to configure the Vulkan memory allocator to never exceed a maximum of N megabytes for a resource block. Decreasing this value may free up more memory but may impact performance, and/or restrict allocations to be unusably small. Default is 0, meaning no limit.
The `maximum_block_count` constraint will tell Halide to configure the Vulkan memory allocator to never exceed a total of N block allocations. Decreasing this value may free up more memory but may impact performance, and/or restrict allocations. Default is 0, meaning no limit.
The `nearest_multiple` constraint will tell Halide to configure the Vulkan memory allocator to always round up the requested allocation sizes to the given integer value. This is useful for architectures that require specific alignments for subregions allocated within a block. Default is 32; setting this to zero means no constraint.
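The runtime variables described above can be combined in a shell session. The sketch below builds `HL_VK_ALLOC_CONFIG` from named values so the order of the five fields stays readable. The 1024 MB pool cap and the `discrete-gpu` choice are arbitrary illustrations, and the shell variable names are placeholders:

```shell
# Allocator constraints, in the order the runtime expects them.
maximum_pool_size=1024   # illustrative cap on the whole pool, in megabytes (0 = no limit)
minimum_block_size=32    # megabytes (documented default)
maximum_block_size=0     # 0 = no limit
maximum_block_count=0    # 0 = no limit
nearest_multiple=32      # documented default

sep=":"                  # list delimiter on Linux/OSX/Posix; use ";" on Windows

export HL_VK_ALLOC_CONFIG="${maximum_pool_size}${sep}${minimum_block_size}${sep}${maximum_block_size}${sep}${maximum_block_count}${sep}${nearest_multiple}"
export HL_VK_DEVICE_TYPE="discrete-gpu"
export HL_VK_LAYERS="VK_LAYER_KHRONOS_validation"

echo "$HL_VK_ALLOC_CONFIG"   # prints 1024:32:0:0:32
```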
The following environment variables may be useful for tracking down potential issues related to Vulkan:
`HL_DEBUG_CODEGEN=3` will print debug info during compilation, including output from the SPIR-V code generator used for Vulkan.
`HL_SPIRV_DUMP_FILE=...` specifies a file to dump the binary SPIR-V generated during compilation. Useful for debugging CodeGen issues. Can be inspected, validated and disassembled via the SPIR-V tools:
https://github.com/KhronosGroup/SPIRV-Tools
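A minimal sketch of inspecting such a dump with the `spirv-val` and `spirv-dis` tools from that repository. The fallback file name `kernel.spv` is a placeholder, and the guard skips the step when no dump or no tools are present:

```shell
# Use the same path that HL_SPIRV_DUMP_FILE pointed to during compilation;
# `kernel.spv` is only a placeholder default.
SPV_FILE="${HL_SPIRV_DUMP_FILE:-kernel.spv}"

if [ -f "$SPV_FILE" ] && command -v spirv-val >/dev/null 2>&1; then
  # Check the module against the SPIR-V validation rules.
  spirv-val "$SPV_FILE"
  # Write a human-readable disassembly next to the binary.
  spirv-dis "$SPV_FILE" -o "${SPV_FILE%.spv}.spvasm"
else
  echo "no dump at $SPV_FILE, or SPIRV-Tools not installed"
fi
```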
In addition to the SPIR-V Tools, you may also wish to install the Khronos Validation Layers, which provide an exhaustive suite of runtime checks that can be injected by adding `VK_LAYER_KHRONOS_validation` to the `VK_INSTANCE_LAYERS=` environment variable.
To install the validation layers and the SPIR-V tools on Ubuntu v22.04:
To test the validation layer, you can prepend your shell command for any Vulkan enabled binary with the appropriate environment settings. For example, you can run one of the JIT-enabled correctness tests with debug output and validation layers enabled like so:
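One way to package those settings is a small wrapper function that prefixes any command with them. The function name `run_vk_debug` and the default dump file `debug.spv` are hypothetical, and `sh -c` stands in for the actual test binary:

```shell
# Run any command with validation layers, codegen debug output, and a SPIR-V dump enabled.
run_vk_debug() {
  VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation \
  HL_DEBUG_CODEGEN=3 \
  HL_SPIRV_DUMP_FILE="${HL_SPIRV_DUMP_FILE:-debug.spv}" \
  HL_JIT_TARGET="${HL_JIT_TARGET:-host-vulkan}" \
  "$@"
}

# Stand-in command that shows what the wrapped process sees;
# replace it with the path to a Vulkan enabled test binary.
run_vk_debug sh -c 'echo "layers=$VK_INSTANCE_LAYERS debug=$HL_DEBUG_CODEGEN"'
```

Because the assignments are prefixed to the command rather than exported, they apply only to the wrapped process and leave the calling shell unchanged.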
All correctness tests are now passing on tested configs for Linux & Windows using the target `host-vulkan-vk_int8-vk_int16-vk_int64-vk_float16-vk_float64-vk_v13` on LLVM v14.x.
MacOS passes most tests but encounters internal MoltenVK code translation issues for wide vectors and ambiguous function calls.
Python apps, tutorials and correctness tests are now passing, but the AOT cases are skipped since the runtime environment needs to be customized to locate the platform specific Vulkan loader library.
Android platform support is currently being worked on.