ABC Classification

How to use classification strategies to focus your analysis

Published

2026-02-10

The ABC classification method is used to categorize items, customers, or any group members according to their relative contribution to a total metric.

The abc() function applies this method by ranking group members according to either transaction counts or the sum of a numeric variable (e.g., revenue, margin).

How It Works

abc() requires a grouped tibble or lazy DBI object. Group the data with dplyr::group_by() to specify the group members whose contributions are ranked.

Value Capture

  • If .value is provided, that column is summed per group member; otherwise, abc() counts the rows per group member
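
The two aggregation modes can be illustrated with a minimal base R sketch. The sales data frame and its values are hypothetical, and the code only mimics the idea of summing versus counting; it is not how abc() is implemented internally.

```r
# Hypothetical toy data: one row per transaction (not the contoso data)
sales <- data.frame(
  store_key = c("s1", "s1", "s2", "s3", "s3", "s3"),
  margin    = c(10, 20, 50, 5, 5, 10)
)

# Analogous to supplying .value = margin: sum the column per group member
value_by_store <- tapply(sales$margin, sales$store_key, sum)

# Analogous to omitting .value: count the rows per group member
count_by_store <- table(sales$store_key)

value_by_store
count_by_store
```

Here store s2 contributes the most by value but the least by count, so the two modes can rank the same group members differently.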

Category Values

  • Provide the break points that define the cumulative categories
  • Each break point is assigned a letter category in order, starting with ‘a’
  • For example, to see the stores that make up the top 40% of revenue, then the top 70%, and then the top 90%, use c(0.4, 0.7, 0.9, 1); the final break point of 1 places the remaining stores in a fourth category

Table 1
Function Arguments

| Argument | Description |
|---|---|
| .data | A grouped tibble or DBI object (using dplyr::group_by()) |
| category_values | A numeric vector of breakpoints between 0 and 1, representing cumulative proportions for ABC categories. |
| .value | Optional. A column to sum for categorization. If not provided, the function counts the number of rows per group. |
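
To make the category_values argument concrete, the following base R sketch maps cumulative proportions to letter categories. The cumulative proportions are hypothetical, and the treatment of values that land exactly on a break point (included in the lower category here) is an assumption, not confirmed behavior of abc().

```r
# Break points as cumulative proportions, as passed to category_values
category_values <- c(0.4, 0.7, 0.9, 1)

# Hypothetical cumulative proportions for six group members
cum_prop_total <- c(0.25, 0.40, 0.55, 0.72, 0.90, 1.00)

# cut() assigns each value to the interval (previous break, break];
# one lowercase letter label per break point
# (assumption: a value equal to a break point falls in the lower category)
category_name <- cut(
  cum_prop_total,
  breaks = c(0, category_values),
  labels = letters[seq_along(category_values)],
  include.lowest = TRUE
)

as.character(category_name)
# "a" "a" "b" "c" "c" "d"
```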

When you execute abc(), it returns a segment_abc object, and the console displays a custom print message containing:

  • A summary of the function’s actions
  • The category break points and labels
  • The main transformation steps and the columns they reference
  • Possible next actions
# Example

contoso::sales |> 
   dplyr::group_by(store_key) |> 
   ti::abc(
      category_values = c(0.4, 0.7, 0.9, 1), 
      .value = margin
      )
── ABC Classification ──────────────────────────────────────────────────────────
Function: `ABC` was executed
── Description: ──
This calculates a rolling cumulative distribution of variable and segments each
group member's contribution by the break points provided. Helpful to know which
group member's proportional contribution to the total.
── Category Information ──
• The data set is summarized by store_key and then sums each group member's margin contribution of the total margin and then finally calculates each groups rolling cumulative proportion of the total
• Then cumulative distribution was then arranged from lowest to highest and finally classified into 4 break points 40%, 70%, 90%, 100%  and labelled into the following categories a, b, c, d
── Actions: ──
✔Aggregate
✖Shift
✖Compare
✔Proportion of Total
✖Count Distinct
── Next Steps: ──
• Use `calculate()` to return the results
────────────────────────────────────────────────────────────────────────────────

Use calculate() to generate the ABC classification table as a lazy DBI object, then use dplyr::collect() to return it as a tibble.

contoso::sales |> 
   dplyr::group_by(store_key) |> 
   ti::abc(
      category_values = c(0.4, 0.7, 0.9, 1), 
      .value = margin
      ) |> 
   ti::calculate()

Table 2
ABC Classification Results
Top 10 stores by margin contribution

| store_key | abc_margin | cum_sum | prop_total | cum_prop_total | row_id | category_name |
|---|---|---|---|---|---|---|
| 540 | 78,124 | 78,124 | 4% | 4% | 1 | a |
| 610 | 74,604 | 152,728 | 4% | 8% | 2 | a |
| 510 | 69,455 | 222,183 | 4% | 12% | 3 | a |
| 80 | 67,881 | 290,064 | 4% | 15% | 4 | a |
| 270 | 56,723 | 346,786 | 3% | 18% | 5 | a |
| 440 | 56,500 | 403,287 | 3% | 21% | 6 | a |
| 450 | 54,942 | 458,228 | 3% | 24% | 7 | a |
| 550 | 53,874 | 512,103 | 3% | 27% | 8 | a |
| 490 | 51,348 | 563,450 | 3% | 30% | 9 | a |
| 650 | 49,467 | 612,918 | 3% | 32% | 10 | a |

Table 2 lists the ten stores with the largest margin contribution, together with metrics that describe each store's share of the total. The key columns are interpreted as follows:

Understanding the Results

  • Store 540 has a margin of 78,124 (“abc_margin”), which accounts for about 4% (“prop_total”) of the total margin across all stores

  • The “cum_sum” column tracks the running total of the aggregated value (here, margin) down the table; for example, store 610's cum_sum of 152,728 is 78,124 + 74,604

  • The “cum_prop_total” column shows the running share of the total margin as you move down the table; the ten stores shown together account for about 32%

  • The store with the highest contribution has a “row_id” of 1 and is assigned to the first category segment (‘a’) via the “category_name” column

  • The “max_row_id” column (not shown in Table 2) shows that there are 57 additional stores in the same category (‘a’)

  • The “cum_unit_prop” column (not shown in Table 2) tracks the cumulative contribution from a transaction count perspective, similar to cum_prop_total but at the unit level

  • The category_value and category_name columns reflect the break points you provided, assigning stores to categories (e.g., ‘a’, ‘b’, ‘c’) based on their cumulative contribution

This is summarized in Table 3 below:

Table 3
Output Column Descriptions

| Column | Description | Examples |
|---|---|---|
| cum_sum | The cumulative sum of the aggregated values (e.g., revenue or count), representing the running total up to that row. | 1000, 2500, 4000 |
| prop_total | The proportion of the total contributed by the current row. | 0.10, 0.25, 0.40 |
| cum_prop_total | The cumulative proportion of the total, showing the running share of the entire dataset as you move through the rows. | 0.10, 0.35, 0.75 |
| row_id | The sequential position of the row after ranking; the largest contributor has a row_id of 1. | 1, 2, 3 |
| max_row_id | The maximum row ID in the current group (if grouping is applied), representing the total number of rows in the group. | 5, 5, 5 |
| cum_unit_prop | The cumulative proportion on a unit (count) basis, analogous to cum_prop_total. | 0.10, 0.30, 0.70 |
| category_value | The break point that the row's cumulative proportion falls under, taken from the break points provided. | 0.4, 0.7, 0.9 |
| category_name | The letter category assigned to each row based on its cumulative contribution. | "a", "b", "c" |
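
The main output columns can be reproduced by hand with a short base R sketch. The store names and margins are hypothetical, and the steps are only an approximation of what calculate() returns, written to show how the columns relate to one another; it is not the package's implementation.

```r
# Hypothetical margins per store (not the contoso data)
margin_by_store <- c(s1 = 200, s2 = 350, s3 = 80, s4 = 250, s5 = 120)
category_values <- c(0.4, 0.7, 0.9, 1)

# Rank group members from largest to smallest contribution
ranked <- sort(margin_by_store, decreasing = TRUE)
total  <- sum(ranked)

result <- data.frame(
  store_key  = names(ranked),
  abc_margin = unname(ranked),
  cum_sum    = cumsum(unname(ranked)),  # running total down the ranking
  prop_total = unname(ranked) / total,  # each store's share of the total
  row_id     = seq_along(ranked)        # 1 = largest contributor
)

# Running share of the total, then the letter category for each row
result$cum_prop_total <- result$cum_sum / total
result$category_name <- as.character(cut(
  result$cum_prop_total,
  breaks = c(0, category_values),
  labels = letters[seq_along(category_values)],
  include.lowest = TRUE
))

result
```

In this toy data the largest store alone covers 35% of the total and lands in category a, while the two smallest stores push the running share past 90% and land in category d.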