# Composite Types: DateTime Arrays

Currently, loops over some types are easier to work with than others. Here we show: (1) a sequential loop over Vector{DateTime} that cannot have @turbo applied directly, and (2) a solution that uses the interpreted integer representation of DateTime.

This may be applicable if you have a composite type that may be represented with primitive types.

## Setting up the Problem

Here's a simple problem involving timestamps:

Problem statement:

• Given: a vector of strictly increasing timestamps.
• Output: a vector of the same length starting at 0.0 and ending at 1.0. Each intermediate element is scaled proportionally to the length of time since the beginning.

Sample Output:

using Dates

sample_input = [
Dates.DateTime(2021, 5, 5, 10, 0, 0),
Dates.DateTime(2021, 5, 5, 10, 5, 15),
Dates.DateTime(2021, 5, 6, 10, 0, 0),
Dates.DateTime(2021, 5, 6, 10, 5, 15),
Dates.DateTime(2021, 5, 7, 10, 0, 20),
]

expected_output = [
0.0,
0.0018227057053581761,
0.499942136326814,
0.5017648420321722,
1.0,
]

## First Attempt: Sequential version of the loop

This implementation satisfies the problem statement by iterating over the examples:

using Dates

function scale_timeseries_sequential(data::Vector{Dates.DateTime})
out = similar(data, Float64)
ϕ = (data[lastindex(data)] - data[1]).value

@inbounds for i ∈ eachindex(data)
out[i] = (data[i] - data[1]).value / ϕ
end

return out
end

## Second Attempt: Turbo Loop

Our Vector{Dates.DateTime} has an integer interpretation which we can take advantage of here. We'll reinterpret our vector as Int, make the needed adjustments, then apply the @turbo macro to our loop:

using LoopVectorization, Dates

function scale_timeseries_turbo(data::Vector{Dates.DateTime})

# Interpret our DateTime vector as Int
tsi = reinterpret(Int, data)

out = similar(data, Float64)

# We've interpreted our data as integers, so we no longer need .value
ϕ = tsi[lastindex(tsi)] - tsi[1]

@turbo for i ∈ eachindex(tsi)
out[i] = (tsi[i] - tsi[1]) / ϕ
end

return out
end

## Benchmarks

We'll benchmark with randomly generated data:

function generate_timestamps(N::Int64)
data = Vector{Dates.DateTime}(undef,N)
v = DateTime(1990, 1, 1, 0, 0, 0)
for i in 1:N
v += Second(rand(1:5, 1)[1])
data[i] =v
end
return data
end

Briefly, the benchmark suggests that the mean time for the sequential vs. turbo solution is a ~3x speedup while holding memory requirements constant:

julia> using BenchmarkTools

julia> data_100000 = generate_timestamps(100000);

julia> data_200000 = generate_timestamps(200000);

julia> @benchmark scale_timeseries_sequential(data_100000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max):  318.864 μs … 967.760 μs  ┊ GC (min … max): 0.00% … 40.41%
Time  (median):     321.291 μs               ┊ GC (median):    0.00%
Time  (mean ± σ):   332.503 μs ±  52.040 μs  ┊ GC (mean ± σ):  1.97% ±  6.98%

█▆▅▂▂▂▁                                                       ▁
█████████▆▆▆▅▅▅▅▅▄▄▄▄▁▁▄▁▁▃▁▁▄▄▃▃▁▃▄▁▃▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅█▇ █
319 μs        Histogram: log(frequency) by time        701 μs <

Memory estimate: 781.33 KiB, allocs estimate: 2.

julia> @benchmark scale_timeseries_turbo(data_100000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max):   71.942 μs … 933.400 μs  ┊ GC (min … max):  0.00% … 71.93%
Time  (median):      87.926 μs               ┊ GC (median):     0.00%
Time  (mean ± σ):   100.082 μs ±  89.095 μs  ┊ GC (mean ± σ):  11.63% ± 11.43%

▄█▃▁                                                        ▁ ▁
████▇▄▁▁▁▁▁▁▁▁▃▄▆▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ █
71.9 μs       Histogram: log(frequency) by time        764 μs <

Memory estimate: 781.33 KiB, allocs estimate: 2.

julia> @benchmark scale_timeseries_sequential(data_200000)
BenchmarkTools.Trial: 7153 samples with 1 evaluation.
Range (min … max):  637.692 μs …   2.277 ms  ┊ GC (min … max): 0.00% … 65.01%
Time  (median):     640.729 μs               ┊ GC (median):    0.00%
Time  (mean ± σ):   694.282 μs ± 184.965 μs  ┊ GC (mean ± σ):  3.69% ±  8.68%

█▆▅▃▂▁                                                        ▁
███████▇▅▆▆▄▄▅▁▁▁▁▁▁▆██▇▅▄▄▁▃▁▃▃▄▁▁▁▁▁▁▁▁▁▁▁▇█▇▇▅▅▄▃▄▄▁▁▁▄▄█▇ █
638 μs        Histogram: log(frequency) by time       1.71 ms <

Memory estimate: 1.53 MiB, allocs estimate: 2.

julia> @benchmark scale_timeseries_turbo(data_200000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max):  159.023 μs …   2.092 ms  ┊ GC (min … max):  0.00% … 50.30%
Time  (median):     176.559 μs               ┊ GC (median):     0.00%
Time  (mean ± σ):   230.513 μs ± 189.542 μs  ┊ GC (mean ± σ):  11.86% ± 12.80%

█▇▅▄▄▃▂▂▁          ▁▁                                         ▂
██████████▇▅▅▃▁▄▁▃▁██▇▆▄▅▄▄▅▄▅▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁███▇▅▅▅▅▄▅▄▅█▇▇ █
159 μs        Histogram: log(frequency) by time       1.22 ms <

Memory estimate: 1.53 MiB, allocs estimate: 2.