# LoopSet Structure

The loopsets define loops as a set of operations that depend on one another, and also on loops. Cycles are not allowed, making it a directed acyclic graph. Let's use a set of nested loops performing matrix multiplication as an example. We can create a naive LoopSet from an expression (naive due to being created without access to any type information):

julia> using LoopVectorization

julia> AmulBq = :(for m ∈ 1:M, n ∈ 1:N
C[m,n] = zero(eltype(B))
for k ∈ 1:K
C[m,n] += A[m,k] * B[k,n]
end
end);

julia> lsAmulB = LoopVectorization.LoopSet(AmulBq);

This LoopSet consists of seven operations that define the relationships within the loop:

julia> LoopVectorization.operations(lsAmulB)
7-element Array{LoopVectorization.Operation,1}:
var"##RHS#256" = var"##zero#257"
C[m, n] = var"##RHS#256"
var"##reduction#260" = var"##reductzero#261"
var"##RHS#256" = LoopVectorization.reduce_to_add(var"##reduction#260", var"##RHS#256")

The act of performing a "reduction" across a loop introduces a few extra operations that manage creating a "zero" with respect to the reduction, and then combining with the specified value using reduce_to_add, which performs any necessary type conversions, such as from an Vec vector-type to a scalar, if necessary. This simplifies code generation, by making the functions agnostic with respect to the actual vectorization decisions the library makes.

Each operation is listed as depending on a set of loop iteration symbols:

julia> LoopVectorization.loopdependencies.(LoopVectorization.operations(lsAmulB))
7-element Array{Array{Symbol,1},1}:
[:m, :n]
[:m, :n]
[:m, :k]
[:k, :n]
[:m, :n]
[:m, :k, :n]
[:m, :n]

We can also see which of the operations each of these operations depend on:

julia> LoopVectorization.operations(lsAmulB)

julia> LoopVectorization.parents(ans)
3-element Array{LoopVectorization.Operation,1}:
var"##reduction#260" = var"##reductzero#261"
References to arrays are represented with an ArrayReferenceMeta data structure:
julia> LoopVectorization.operations(lsAmulB).ref
LoopVectorization.ArrayReferenceMeta(LoopVectorization.ArrayReference(:A, [:m, :k], Int8[0, 0]), Bool[1, 1], Symbol("##vptr##_A"))
It contains the name of the parent array (:A), the indicies [:m,:k], and a boolean vector (Bool[1, 1]) indicating whether these indices are loop iterables. Note that the optimizer assumes arrays are column-major, and thus that it is efficient to read contiguous elements from the first index. In lower level terms, it means that high-throughput vmov instructions can be used rather than low-throughput gathers. Similar story for storing elements. When no axis has unit stride, the first given index will be the dummy Symbol("##DISCONTIGUOUSSUBARRAY##").