Internals reference
Operation types
LoopVectorization.OperationType — Type
OperationType is an @enum for classifying supported operations that can appear in @turbo blocks. Enter LoopVectorization.OperationType at the REPL to see the different types.
LoopVectorization.constant — Constant
An operation setting a variable to a constant value (e.g., a = 0.0)
LoopVectorization.memload — Constant
An operation setting a variable from a memory location (e.g., a = A[i,j])
LoopVectorization.compute — Constant
An operation computing a new value from one or more variables (e.g., a = b + c)
LoopVectorization.memstore — Constant
An operation storing a value to a memory location (e.g., A[i,j] = a)
LoopVectorization.loopvalue — Constant
loopvalue indicates a loop variable (i in for i in ...). These are the "parents" of compute operations that involve the loop variables.
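To make the categories above concrete, here is a minimal, self-contained sketch that sorts single assignment expressions into the same kinds. ToyOperationType and classify are hypothetical names invented for this illustration; they are not part of LoopVectorization, and the library's own classification is considerably more involved. Loop variables (loopvalue) come from the loop header rather than from an assignment, so the sketch never returns toy_loopvalue.

```julia
# Hypothetical stand-in for the real enum. The member names follow the list above,
# but ToyOperationType and classify are NOT part of LoopVectorization.
@enum ToyOperationType toy_constant toy_memload toy_compute toy_memstore toy_loopvalue

# Classify a single assignment expression `lhs = rhs` by its shape.
function classify(ex::Expr)
    ex.head === :(=) || throw(ArgumentError("expected an assignment"))
    lhs, rhs = ex.args
    if lhs isa Expr && lhs.head === :ref
        return toy_memstore          # A[i,j] = a
    elseif rhs isa Expr && rhs.head === :ref
        return toy_memload           # a = A[i,j]
    elseif rhs isa Expr && rhs.head === :call
        return toy_compute           # a = b + c
    elseif rhs isa Number
        return toy_constant          # a = 0.0
    end
    throw(ArgumentError("unsupported right-hand side"))
end

kinds = (classify(:(a = 0.0)), classify(:(a = A[i,j])),
         classify(:(a = b + c)), classify(:(A[i,j] = a)))
```

With the package loaded, the real members can be listed with instances(LoopVectorization.OperationType), since OperationType is an ordinary @enum.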
Operation
LoopVectorization.Operation — Type
Operation
A structure to encode a particular action occurring inside an @turbo block.
Fields
identifier::Int64: A unique identifier for this operation. identifier(op::Operation) returns the index of this operation within operations(ls::LoopSet).
variable::Symbol: The name of the variable storing the result of this operation. For a = val this would be :a. For array assignments A[i,j] = val this would be :A.
elementbytes::Int64: Intended to be the size of the result, in bytes. Often inaccurate, not to be relied on.
instruction::LoopVectorization.Instruction: The specific operator, e.g., identity or +
node_type::LoopVectorization.OperationType: The OperationType associated with this operation
dependencies::Vector{Symbol}: The loop variables this operation depends on
reduced_deps::Vector{Symbol}: Additional loop dependencies that must execute before this operation can be performed successfully (often needed in reductions)
parents::Vector{LoopVectorization.Operation}: Operations whose result this operation depends on
children::Vector{LoopVectorization.Operation}: Operations that depend on this result
ref::LoopVectorization.ArrayReferenceMeta: For memload or memstore, encodes the array location
mangledvariable::Symbol: gensymmed name of result.
reduced_children::Vector{Symbol}: Loop variables that consumers of this operation depend on. Often used in reductions to replicate assignment of initializers when unrolling.
u₁unrolled::Bool: Cached value for whether u₁loopsym ∈ loopdependencies(op)
u₂unrolled::Bool: Cached value for whether u₂loopsym ∈ loopdependencies(op)
vectorized::Bool: Cached value for whether vectorized ∈ loopdependencies(op)
rejectcurly::Bool: Cached value for whether or not to lower memop using Unrolled
rejectinterleave::Bool: Cached value for whether or not to lower memop by interleaving it with offset operations
Example
julia> using LoopVectorization
julia> AmulBq = :(for m ∈ 1:M, n ∈ 1:N
           C[m,n] = zero(eltype(B))
           for k ∈ 1:K
               C[m,n] += A[m,k] * B[k,n]
           end
       end);
julia> lsAmulB = LoopVectorization.LoopSet(AmulBq);
julia> LoopVectorization.operations(lsAmulB)
6-element Vector{LoopVectorization.Operation}:
var"##RHS#245" = var"##zero#246"
C[m, n] = var"##RHS#245"
var"##tempload#248" = A[m, k]
var"##tempload#249" = B[k, n]
var"##RHS#245" = LoopVectorization.vfmadd(var"##tempload#248", var"##tempload#249", var"##RHS#245")
var"##RHS#245" = LoopVectorization.identity(var"##RHS#245")
Each one of these lines is a pretty-printed Operation.
Instructions and costs
LoopVectorization.Instruction — Type
Instruction
Instruction represents a function via its module and symbol. It is similar to a GlobalRef and may someday be replaced by GlobalRef.
LoopVectorization.InstructionCost — Type
InstructionCost
Store parameters related to performance for individual CPU instructions.
scaling::Float64: A flag indicating how instruction cost scales with vector width (128, 256, or 512 bits)
scalar_reciprocal_throughput::Float64: The number of clock cycles per operation when many of the same operation are repeated in sequence. Think of it as the inverse of the flow rate at steady-state. It is typically ≤ the scalar_latency.
scalar_latency::Int64: The minimum delay, in clock cycles, associated with the instruction. Think of it as the delay from turning on a faucet to when water starts coming out the end of the pipe. See also scalar_reciprocal_throughput.
register_pressure::Int64: Number of floating-point registers used
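The difference between latency and reciprocal throughput can be illustrated with a deliberately simplified pipeline model. The numbers and the two helper functions below are assumptions made up for this sketch; they are not values from LoopVectorization's cost tables, and real CPUs have more constraints than this model captures.

```julia
# Hypothetical cost numbers for illustration only.
latency = 4          # cycles until one result becomes available
rthroughput = 0.5    # cycles per operation at steady state

# n independent operations can overlap in the pipeline: roughly one latency
# for the first result, then one more result every `rthroughput` cycles.
independent_cycles(n) = latency + (n - 1) * rthroughput

# n operations in a dependency chain cannot overlap: each one has to wait
# for the previous result, so the full latency is paid every time.
dependent_cycles(n) = n * latency

independent_cycles(100)   # 53.5 under this model
dependent_cycles(100)     # 400 under this model
```

Under this model a long dependency chain, such as a reduction into a single accumulator, is bound by latency, while independent work is bound by throughput.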
Array references
LoopVectorization.ArrayReference — Type
ArrayReference
A type for encoding an array reference A[i,j] occurring inside an @turbo block.
Fields
array::Symbol: The array variable
indices::Vector{Symbol}: The list of indices (e.g., [:i, :j]), or name(op) for computed indices.
offsets::Vector{Int8}: Index offset, e.g., a[i+7] would store the 7. offsets is also used to help identify opportunities for avoiding reloads, for example in y[i] = x[i] - x[i-1], the previous load x[i-1] can be "carried over" to the next iteration. Only used for small (Int8) offsets.
strides::Vector{Int8}
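The "carried over" load mentioned for offsets can be shown by hand in plain Julia. This is only a sketch of the idea, written with hypothetical function names; it is not the code LoopVectorization generates.

```julia
# Naive version: two loads from x in every iteration.
function diff_naive!(y, x)
    for i in 2:length(x)
        y[i] = x[i] - x[i-1]
    end
    return y
end

# Carried version: one load per iteration. The value loaded as x[i] in one
# iteration is kept in a local and reused as x[i-1] in the next.
function diff_carried!(y, x)
    prev = x[1]
    for i in 2:length(x)
        cur = x[i]
        y[i] = cur - prev
        prev = cur
    end
    return y
end

x = [1.0, 4.0, 9.0, 16.0]
y_naive = diff_naive!(zeros(4), x)
y_carried = diff_carried!(zeros(4), x)
```

Both functions fill y[2:end] with the same differences; the second simply avoids reloading a value that is already in a register.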
LoopVectorization.ArrayReferenceMeta — Type
ArrayReferenceMeta
A type similar to ArrayReference but holding additional information.
Fields
ref::LoopVectorization.ArrayReference: The ArrayReference
loopedindex::Vector{Bool}: A vector of Bools indicating whether each index is a loop variable (false for operation-computed indices)
ptr::Symbol: Variable holding the pointer to the array's underlying storage
Condensed types
These are used when encoding the @turbo block as a type parameter for passing through to the @generated function.
LoopVectorization.ArrayRefStruct — Type
ArrayRefStruct
A condensed representation of an ArrayReference. It supports array-references with up to 8 indexes, where the data for each consecutive index is packed into corresponding 8-bit fields of index_types (storing the enum IndexType), indices (the id for each index symbol), and offsets (currently unused).
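The packing scheme described above, one 8-bit field per index and up to 8 fields, can be sketched with a single UInt64. The helpers pack_bytes and unpack_byte are hypothetical and exist only for this illustration; the library's actual field order and encoding may differ.

```julia
# Pack up to 8 small values into one UInt64, one byte per index position.
# Hypothetical helper; not part of LoopVectorization.
function pack_bytes(vals::AbstractVector{<:Integer})
    length(vals) ≤ 8 || throw(ArgumentError("at most 8 fields fit in a UInt64"))
    packed = zero(UInt64)
    for (k, v) in enumerate(vals)
        # Truncate to 8 bits, then shift into the k-th byte (lowest byte first).
        packed |= UInt64(v % UInt8) << (8 * (k - 1))
    end
    return packed
end

# Recover the k-th 8-bit field (1-based) from the packed word.
unpack_byte(packed::UInt64, k::Integer) = (packed >> (8 * (k - 1))) % UInt8

packed = pack_bytes([3, 1, 7])
fields = (unpack_byte(packed, 1), unpack_byte(packed, 2), unpack_byte(packed, 3))
```

A plain integer like this is a bits-type value, which is what allows it to appear as a type parameter, unlike a Vector{Symbol}.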
LoopVectorization.OperationStruct — Type
OperationStruct
A condensed representation of an Operation.