Simply pass your data in either a tibble or a lazy DBI object to the time intelligence function and input the required arguments.
We will use mtd() function to calculate the month-to-date sum of Contoso’s margin from their sales dataset (contoso::sales).
Most time intelligence functions follow the same structure:
Specify the date column to index the time intelligence functions (e.g., order_date, deliver_date, etc.)
Specify the value column to aggregate (e.g., margin, net_margin, cogs, etc.)
If there is a period rollback / rollforward then clarify the number of periods to roll
Clarify if we are using a "standard" calendar or a retail calendar variation ("445", "454", or "544") – see the standard vs. non-standard calendar article for more information
If using a retail calendar, optionally specify fiscal_year_start (1–12) to control which month the fiscal year begins nearest to (default is January)
When you execute mtd(), your console will return a ti object displaying a custom print message:
A summary of the function’s actions
Details the calendar’s attributes used to support the calculation
Describes the main transformation steps and columns that are referenced
This creates a daily `cumsum()` of the current monthmargin from the start of
the standard calendar month to the end of the month
── Calendar: ──
• The calendar aggregated order_date to the day time unit
• A standard calendar is created with 0 groups
• Calendar ranges from 2021-05-18 to 2024-04-20
• 222 days were missing and replaced with 0
• New date column date, year, quarter, month was created from order_date
Select the dimensions you want to group and pass a grouped column to your ti function
Why do we need this package when we have lubridate?
Lubridate is an excellent package and is at the core of many of ti’s functions. The issue is not with lubridate itself, but rather the challenges in your dataset that prevent you from directly using lubridate functions.
The advantage of this package is that it will perform all of the required pre-processing steps for you.
Issue 1: Many datasets do not have continuous dates, especially if data is recorded only on business days or for active transactions
Issue 2: Imbalances between periods (e.g., the different number of days between February vs. January) can create misleading analysis or trends, or your analysis may require non-standard calendar types
Issue 3: Calculating time intelligence for groups can lead to larger-than-memory issues even with smaller datasets
Issue 1: Continuous Dates
Referencing the Table 1 below, if we were to use dplyr::lag() to compare Day-over-Day (DoD) margin, we would be missing 2024-01-02, 2024-01-04, and 2024-01-05 which would lead to incorrect answers or trends.
Table 1: Incomplete calendar table can lead to wrong conclusions or trends
Incomplete Calendar Data
Missing dates can lead to wrong conclusions
date
margin
2024-01-01
1,200
2024-01-03
1,100
2024-01-06
1,300
2024-01-07
900
2024-01-08
1,200
2024-01-09
850
2024-01-11
1,450
To correct this, ti will automatically complete your calendar for each group for the missing periods to ensure there are no missing periods when calculating trends.
In Table 2 we see a complete calendar set and a new column, “missing_date_indicator” to indicate how many dates were missing.
Table 2: Original table now complete with missing dates and a missing date indicator
Complete Calendar with Missing Date Indicator
ti automatically fills missing dates
date
year
quarter
month
week
day
margin
dod_margin
missing_date_indicator
2024-01-01
2024
1
1
1
1
1,200
NA
0
2024-01-02
2024
1
1
1
2
0
1,200
1
2024-01-03
2024
1
1
1
3
1,100
0
0
2024-01-04
2024
1
1
1
4
0
1,100
1
2024-01-05
2024
1
1
1
5
0
0
1
2024-01-06
2024
1
1
1
6
1,300
0
0
2024-01-07
2024
1
1
1
7
900
1,300
0
2024-01-08
2024
1
1
2
8
1,200
900
0
2024-01-09
2024
1
1
2
9
850
1,200
0
2024-01-10
2024
1
1
2
10
0
850
1
2024-01-11
2024
1
1
2
11
1,450
0
0
Issue 2: Period imbalances
When comparing two performance periods with a standard calendar, you may end up comparing periods with an unequal number of days.
For example, if you want to compare January sales to February, you can get misleading conclusions due to the unequal number of weekends and days in those periods.
In practice you have two choices:
Compare periods with similar days (e.g., the 28th of February should only compare up to the 28th of January) and you omit three days of January sales altogether
Have an imbalanced comparison (e.g., the 28th of February compares to the 31st of January so that no days are lost)
This package uses the second option to ensure we don’t lose any of January’s sales, but to help flag the imbalance, ti will add a column to let you know how many periods (e.g., days) are in your comparison period to increase transparency.
To create this example, we will use the pmtd() function to calculate the prior month to date cumulative margin in the current month.
When we pass the ti object through to calculate() and filter the results for February 2022, we would see the below Table 3.
On 2022-02-28, we see that is comparing 31 days of the previous period to the 28 days in the current period.
Table 3
Period Imbalance Example
February 2022 comparing to January (31 vs 28 days)
date
year
month
pmtd_margin
days_in_comparison_period
2022-02-26
2022
2
129,436
26
2022-02-27
2022
2
132,202
27
2022-02-28
2022
2
142,668
31
Issue 3: Larger-than-memory data
If your data isn’t already in a database, ti will leverage DuckDB to convert your data and enable larger-than-memory calculations.
This is necessary for time intelligence functions because, with grouped data, you need to complete the calendar for each group combination. For even modest datasets, this can quickly multiply and grow your data beyond available memory.
If your data is already in a database, ti will use dbplyr to convert the functions to SQL queries.
In either scenario, you can use dplyr::collect() to execute your SQL query and return a tibble to your local computer.
# quickly load contoso package to a database via the contoso packagedb <- contoso::create_contoso_duckdb()# pass through to the same function mtd()db$sales |>mtd(order_date,margin,"standard")# same syntax even though data source is now a tibblecontoso::sales |>mtd(order_date,margin,"standard")