Halide 19.0.0
Halide compiler and libraries
StageStridedLoads.h
#ifndef HALIDE_INTERNAL_STAGE_STRIDED_LOADS_H
#define HALIDE_INTERNAL_STAGE_STRIDED_LOADS_H

/** \file
 *
 * Defines the compiler pass that converts strided loads into dense loads
 * followed by shuffles.
 */

#include "Expr.h"

namespace Halide {
namespace Internal {

/** Convert all unpredicated strided loads in a Stmt into dense loads followed
 * by shuffles.
 *
 * For a stride of two, the trick is to do a dense load of twice the size, and
 * then extract either the even or odd lanes. This was previously done in
 * codegen, where it was challenging, because it's not easy to know there if
 * it's safe to do the double-sized load, as it either loads one element beyond
 * or before the original load. We used the alignment of the ramp base to try to
 * tell if it was safe to shift backwards, and we added padding to internal
 * allocations so that for those at least it was safe to shift
 * forwards. Unfortunately the alignment of the ramp base is usually unknown if
 * you don't know anything about the strides of the input, and adding padding to
 * allocations was a serious wart in our memory allocators.
 *
 * This pass instead actively looks for evidence elsewhere in the Stmt (at some
 * location which definitely executes whenever the load being transformed
 * executes) that it's safe to read further forwards or backwards in memory. The
 * evidence is in the form of a load at the same base address with a different
 * constant offset. It also clusters groups of these loads so that they do the
 * same dense load and extract the appropriate slice of lanes. If it fails to
 * find any evidence, for loads from external buffers it does two overlapping
 * half-sized dense loads and shuffles out the desired lanes, and for loads from
 * internal allocations it adds padding to the allocation explicitly, by setting
 * the padding field on Allocate nodes.
 */
Stmt stage_strided_loads(const Stmt &s);

}  // namespace Internal
}  // namespace Halide

#endif
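
The stride-two trick described in the comment above can be illustrated without Halide at all. The following is a minimal sketch on plain `std::vector` data, not the pass's implementation: `dense_load`, `slice_lanes`, `strided_load_forward`, and `strided_load_backward` are hypothetical names introduced here for illustration. The assert inside `dense_load` stands in for the safety question the pass has to answer, namely whether the extra element before or after the original load may be read.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense vector load: reads `lanes` contiguous
// elements starting at `base`. The assert models the requirement that every
// lane read must lie inside the buffer.
std::vector<int> dense_load(const std::vector<int> &buf, std::size_t base, std::size_t lanes) {
    assert(base + lanes <= buf.size());
    return std::vector<int>(buf.begin() + base, buf.begin() + base + lanes);
}

// Hypothetical stand-in for a slice shuffle: picks `lanes` elements of `v`,
// starting at lane `begin` and stepping by `stride`.
std::vector<int> slice_lanes(const std::vector<int> &v, std::size_t begin,
                             std::size_t stride, std::size_t lanes) {
    std::vector<int> out;
    out.reserve(lanes);
    for (std::size_t i = 0; i < lanes; i++) {
        out.push_back(v[begin + i * stride]);
    }
    return out;
}

// buf[base + 2*i] for i in [0, lanes), via a double-sized dense load at `base`
// followed by extracting the even lanes. Reads one element beyond the last
// element the strided load itself would touch.
std::vector<int> strided_load_forward(const std::vector<int> &buf, std::size_t base,
                                      std::size_t lanes) {
    return slice_lanes(dense_load(buf, base, 2 * lanes), 0, 2, lanes);
}

// The same strided load, but with the dense load shifted back by one and the
// odd lanes extracted. Reads one element before the original load.
std::vector<int> strided_load_backward(const std::vector<int> &buf, std::size_t base,
                                       std::size_t lanes) {
    assert(base >= 1);
    return slice_lanes(dense_load(buf, base - 1, 2 * lanes), 1, 2, lanes);
}
```

In this model, a second strided load at `base + 1` would take the odd lanes of the same `dense_load(buf, base, 2 * lanes)`. That is the clustering case: its presence both shares the dense load and shows that reading up to `base + 2 * lanes - 1` is in bounds.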
Stmt stage_strided_loads(const Stmt &s)
Convert all unpredicated strided loads in a Stmt into dense loads followed by shuffles.
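
The fallback for external buffers, two overlapping half-sized dense loads, can be sketched in the same plain-C++ model. This is an assumption about the general shape of the idea for a stride of two and an even lane count, not a transcription of what the pass emits; `dense_load`, `slice_lanes`, and `strided_load_overlapping` are hypothetical names. The second load is shifted back by one element so that it ends exactly at the last element the strided load needs, and the two loads overlap in one element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense vector load with a bounds check.
std::vector<int> dense_load(const std::vector<int> &buf, std::size_t base, std::size_t lanes) {
    assert(base + lanes <= buf.size());
    return std::vector<int>(buf.begin() + base, buf.begin() + base + lanes);
}

// Hypothetical stand-in for a slice shuffle.
std::vector<int> slice_lanes(const std::vector<int> &v, std::size_t begin,
                             std::size_t stride, std::size_t lanes) {
    std::vector<int> out;
    out.reserve(lanes);
    for (std::size_t i = 0; i < lanes; i++) {
        out.push_back(v[begin + i * stride]);
    }
    return out;
}

// buf[base + 2*i] for i in [0, lanes), without reading outside the range
// [base, base + 2*(lanes - 1)] that the strided load itself touches.
std::vector<int> strided_load_overlapping(const std::vector<int> &buf, std::size_t base,
                                          std::size_t lanes) {
    assert(lanes % 2 == 0);  // sketch assumption: even lane count
    // First half: buf[base .. base + lanes - 1], keep the even lanes.
    std::vector<int> lo = dense_load(buf, base, lanes);
    // Second half, shifted back by one: buf[base + lanes - 1 .. base + 2*lanes - 2],
    // keep the odd lanes.
    std::vector<int> hi = dense_load(buf, base + lanes - 1, lanes);
    std::vector<int> out = slice_lanes(lo, 0, 2, lanes / 2);
    std::vector<int> hi_part = slice_lanes(hi, 1, 2, lanes / 2);
    out.insert(out.end(), hi_part.begin(), hi_part.end());
    return out;
}
```

With a seven-element buffer, `base = 0`, and `lanes = 4`, both half-sized loads stay in bounds, whereas a single eight-element dense load would not.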