Internals reference
Operation types
LoopVectorization.OperationType — Type
OperationType is an @enum for classifying supported operations that can appear in @turbo blocks. Type LoopVectorization.OperationType to see the different types.
LoopVectorization.constant — Constant
An operation setting a variable to a constant value (e.g., a = 0.0)
LoopVectorization.memload — Constant
An operation setting a variable from a memory location (e.g., a = A[i,j])
LoopVectorization.compute — Constant
An operation computing a new value from one or more variables (e.g., a = b + c)
LoopVectorization.memstore — Constant
An operation storing a value to a memory location (e.g., A[i,j] = a)
LoopVectorization.loopvalue — Constant
loopvalue indicates a loop variable (i in for i in ...). These are the "parents" of compute operations that involve the loop variables.
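As a small illustration of the enum described above, the sketch below lists its members with Base's instances function, which works for any @enum type. It assumes LoopVectorization is installed. The integer values printed are an implementation detail and should not be relied on.

```julia
using LoopVectorization

# `instances` is a Base function defined for every @enum type.
for t in instances(LoopVectorization.OperationType)
    println(Int(t), " => ", t)
end

# Enum members are ordinary values and can be compared directly.
is_memory_op(t) = t == LoopVectorization.memload || t == LoopVectorization.memstore
```

The helper is_memory_op is hypothetical and only shows how the enum values might be used in a comparison.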
Operation
LoopVectorization.Operation — Type
Operation
A structure to encode a particular action occurring inside an @turbo block.
Fields
identifier::Int64: A unique identifier for this operation. identifier(op::Operation) returns the index of this operation within operations(ls::LoopSet).
variable::Symbol: The name of the variable storing the result of this operation. For a = val this would be :a. For array assignments A[i,j] = val this would be :A.
elementbytes::Int64: Intended to be the size of the result, in bytes. Often inaccurate, not to be relied on.
instruction::LoopVectorization.Instruction: The specific operator, e.g., identity or +
node_type::LoopVectorization.OperationType: The OperationType associated with this operation
dependencies::Vector{Symbol}: The loop variables this operation depends on
reduced_deps::Vector{Symbol}: Additional loop dependencies that must execute before this operation can be performed successfully (often needed in reductions)
parents::Vector{LoopVectorization.Operation}: Operations whose result this operation depends on
children::Vector{LoopVectorization.Operation}: Operations that depend on this result
ref::LoopVectorization.ArrayReferenceMeta: For memload or memstore, encodes the array location
mangledvariable::Symbol: gensymmed name of result.
reduced_children::Vector{Symbol}: Loop variables that consumers of this operation depend on. Often used in reductions to replicate assignment of initializers when unrolling.
u₁unrolled::Bool: Cached value for whether u₁loopsym ∈ loopdependencies(op)
u₂unrolled::Bool: Cached value for whether u₂loopsym ∈ loopdependencies(op)
vectorized::Bool: Cached value for whether vectorized ∈ loopdependencies(op)
rejectcurly::Bool: Cached value for whether or not to lower memop using Unrolled
rejectinterleave::Bool: Cached value for whether or not to lower memop by interleaving it with offset operations
Example
julia> using LoopVectorization
julia> AmulBq = :(for m ∈ 1:M, n ∈ 1:N
           C[m,n] = zero(eltype(B))
           for k ∈ 1:K
               C[m,n] += A[m,k] * B[k,n]
           end
       end);
julia> lsAmulB = LoopVectorization.LoopSet(AmulBq);
julia> LoopVectorization.operations(lsAmulB)
6-element Vector{LoopVectorization.Operation}:
var"##RHS#245" = var"##zero#246"
C[m, n] = var"##RHS#245"
var"##tempload#248" = A[m, k]
var"##tempload#249" = B[k, n]
var"##RHS#245" = LoopVectorization.vfmadd(var"##tempload#248", var"##tempload#249", var"##RHS#245")
var"##RHS#245" = LoopVectorization.identity(var"##RHS#245")
Each one of these lines is a pretty-printed Operation.
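To look past the pretty-printed form, the documented fields can be read directly. The following is a minimal sketch that rebuilds the same LoopSet and prints a few fields of each operation. It assumes plain field access on Operation works as the field list suggests. No output is shown because gensym'd names and details vary between versions.

```julia
using LoopVectorization

AmulBq = :(for m ∈ 1:M, n ∈ 1:N
    C[m,n] = zero(eltype(B))
    for k ∈ 1:K
        C[m,n] += A[m,k] * B[k,n]
    end
end)

ls = LoopVectorization.LoopSet(AmulBq)
ops = LoopVectorization.operations(ls)

for op in ops
    # node_type, variable, dependencies, and parents are fields listed above.
    parent_names = [p.variable for p in op.parents]
    println(op.node_type, "  ", op.variable,
            "  loops=", op.dependencies,
            "  parents=", parent_names)
end
```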
Instructions and costs
LoopVectorization.Instruction — Type
Instruction
Instruction represents a function via its module and symbol. It is similar to a GlobalRef and may someday be replaced by GlobalRef.
LoopVectorization.InstructionCost — Type
InstructionCost
Store parameters related to performance for individual CPU instructions.
scaling::Float64: A flag indicating how instruction cost scales with vector width (128, 256, or 512 bits)
scalar_reciprocal_throughput::Float64: The number of clock cycles per operation when many of the same operation are repeated in sequence. Think of it as the inverse of the flow rate at steady-state. It is typically ≤ the scalar_latency.
scalar_latency::Int64: The minimum delay, in clock cycles, associated with the instruction. Think of it as the delay from turning on a faucet to when water starts coming out the end of the pipe. See also scalar_reciprocal_throughput.
register_pressure::Int64: Number of floating-point registers used
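The difference between latency and reciprocal throughput can be made concrete with a toy pipeline model. The sketch below is self-contained and does not use LoopVectorization. ToyCost, its numbers, and both formulas are assumptions chosen for illustration, not values or logic from the package's cost tables.

```julia
# Hypothetical cost record; the field names only loosely mirror InstructionCost.
struct ToyCost
    reciprocal_throughput::Float64  # cycles per operation at steady state
    latency::Int                    # cycles until one result is available
end

# n independent operations: wait for the first result once, then
# one further result arrives every `reciprocal_throughput` cycles.
independent_cycles(c::ToyCost, n::Integer) =
    c.latency + (n - 1) * c.reciprocal_throughput

# n operations that each need the previous result: latency is paid every time.
dependent_cycles(c::ToyCost, n::Integer) = n * c.latency

fma = ToyCost(0.5, 4)          # made-up numbers
independent_cycles(fma, 100)   # 4 + 99 * 0.5 = 53.5
dependent_cycles(fma, 100)     # 100 * 4 = 400
```

Under this simplified model a long dependency chain is bound by latency, while independent operations approach the throughput limit, which is one reason unrolling a reduction into several accumulators can help.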
Array references
LoopVectorization.ArrayReference — Type
ArrayReference
A type for encoding an array reference A[i,j] occurring inside an @turbo block.
Fields
array::Symbol: The array variable
indices::Vector{Symbol}: The list of indices (e.g., [:i, :j]), or name(op) for computed indices.
offsets::Vector{Int8}: Index offset, e.g., a[i+7] would store the 7. offsets is also used to help identify opportunities for avoiding reloads, for example in y[i] = x[i] - x[i-1], the previous load x[i-1] can be "carried over" to the next iteration. Only used for small (Int8) offsets.
strides::Vector{Int8}
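The "carried over" load mentioned for offsets can be illustrated in scalar Julia. This sketch is hand-written and self-contained. It shows the idea of reusing the previous iteration's load, not the code LoopVectorization actually generates. Both function names are hypothetical.

```julia
# Straightforward version: two loads from x per iteration.
function diff_reload!(y, x)
    for i in 2:length(x)
        y[i] = x[i] - x[i-1]
    end
    return y
end

# Carried version: x[i-1] was already loaded as x[i] in the previous iteration,
# so it is kept in a local variable and only one load per iteration remains.
function diff_carried!(y, x)
    isempty(x) && return y
    prev = x[1]
    for i in 2:length(x)
        cur = x[i]
        y[i] = cur - prev
        prev = cur
    end
    return y
end

x = [1.0, 4.0, 9.0, 16.0]
diff_carried!(zeros(4), x)   # [0.0, 3.0, 5.0, 7.0]
```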
LoopVectorization.ArrayReferenceMeta — Type
ArrayReferenceMeta
A type similar to ArrayReference but holding additional information.
Fields
ref::LoopVectorization.ArrayReference: The ArrayReference
loopedindex::Vector{Bool}: A vector of Bools indicating whether each index is a loop variable (false for operation-computed indices)
ptr::Symbol: Variable holding the pointer to the array's underlying storage
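The array reference attached to a memory operation can be inspected through the fields listed above. The sketch below assumes that plain field access works on Operation, ArrayReferenceMeta, and ArrayReference, and that only memload and memstore operations carry a meaningful ref. No output is shown because it may differ between versions.

```julia
using LoopVectorization

ex = :(for m ∈ 1:M, n ∈ 1:N
    C[m,n] = zero(eltype(B))
    for k ∈ 1:K
        C[m,n] += A[m,k] * B[k,n]
    end
end)

ls = LoopVectorization.LoopSet(ex)

# Keep only loads and stores; other operations have no array location.
memops = filter(LoopVectorization.operations(ls)) do op
    op.node_type == LoopVectorization.memload ||
        op.node_type == LoopVectorization.memstore
end

for op in memops
    meta = op.ref    # ArrayReferenceMeta
    aref = meta.ref  # ArrayReference
    println(op.node_type, "  ", aref.array,
            "  indices=", aref.indices,
            "  looped=", meta.loopedindex)
end
```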
Condensed types
These are used when encoding the @turbo block as a type parameter for passing through to the @generated function.
LoopVectorization.ArrayRefStruct — Type
ArrayRefStruct
A condensed representation of an ArrayReference. It supports array-references with up to 8 indexes, where the data for each consecutive index is packed into corresponding 8-bit fields of index_types (storing the enum IndexType), indices (the id for each index symbol), and offsets (currently unused).
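Packing up to eight 8-bit fields into one 64-bit integer can be sketched as follows. This is a generic, self-contained illustration. The helper names are hypothetical, and the byte order used here (first index in the lowest byte) is an assumption that may not match ArrayRefStruct's actual layout.

```julia
# Pack up to 8 small non-negative integers into one UInt64, one byte each.
function pack_bytes(vals)
    length(vals) <= 8 || throw(ArgumentError("at most 8 fields fit in a UInt64"))
    packed = zero(UInt64)
    for (k, v) in enumerate(vals)
        # UInt8(v) throws if v does not fit in 8 bits.
        packed |= UInt64(UInt8(v)) << (8 * (k - 1))
    end
    return packed
end

# Read back the k-th byte (1-based).
unpack_byte(packed::UInt64, k::Integer) = UInt8((packed >> (8 * (k - 1))) & 0xff)

p = pack_bytes([1, 2, 3])   # 0x0000000000030201
unpack_byte(p, 2)           # 0x02
```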
LoopVectorization.OperationStruct — Type
OperationStruct
A condensed representation of an Operation.
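The reason for condensing these types is that plain bits values such as integers can be used as type parameters, which makes them visible to a @generated function while it builds code. The sketch below shows that mechanism in isolation. Encoded and low_byte are hypothetical names, and none of this is LoopVectorization's actual encoding.

```julia
# A singleton type whose parameter P carries a packed UInt64 value.
struct Encoded{P} end

# Inside the generator body, P is an ordinary value, so it can be decoded
# there and the result spliced into the generated code as a constant.
@generated function low_byte(::Encoded{P}) where {P}
    v = Int(P & 0xff)
    return :($v)
end

low_byte(Encoded{0x0000000000030201}())   # 1
```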