Excel Formula Translation

Overview

tidychain can extract formulas from Excel files and translate them to R code. This helps you understand Excel models and port them to R.

NoteTranslation Scope

Formula translation covers direct 1:1 mappings only. Basic arithmetic, aggregation functions (SUM, AVERAGE, MIN, MAX), and logical functions (IF, AND, OR) translate automatically. Functions that require syntax changes or data structure context (VLOOKUP, POWER, MOD) are not translated.

Extracting Formulas

Use extract_formulas() to get all formulas with their dependencies:

library(tidychain)

formulas <- extract_formulas("simple_model.xlsx")
formulas
# A tibble: 3 × 6
  sheet cell      cell_local formula references has_references
  <chr> <chr>     <chr>      <chr>   <list>     <lgl>
1 Model Model!B7  B7         B2*B3   <chr [2]>  TRUE
2 Model Model!B8  B8         B7*B4   <chr [2]>  TRUE
3 Model Model!B11 B11        B7+B8   <chr [2]>  TRUE

Each formula’s cell references are parsed into a list, revealing the dependency structure.

Parsing Formula References

Extract cell references from any formula string:

parse_formula_refs("=SUM(A1:A10)+B5*C3")
[1] "A1" "A10" "B5" "C3"
parse_formula_refs("=VLOOKUP(A1,Sheet2!B:C,2,FALSE)")
[1] "A1"
parse_formula_refs("=IF(AND(A1>0,B1<10),A1*2,0)")
[1] "A1" "B1"

Extracts single cell references (A1, $C$3), range endpoints (A1:A10A1, A10), and cross-sheet references (sheet prefix stripped). Named ranges are not currently resolved.

Expanding Ranges

Convert range notation to individual cells:

expand_range("A1:C3")
[1] "A1" "B1" "C1" "A2" "B2" "C2" "A3" "B3" "C3"
expand_range("B2:B5")
[1] "B2" "B3" "B4" "B5"
expand_range("A1:A1")  # Single cell
[1] "A1"

Translating to R Code

Use translate_formula() to convert Excel formulas to R:

translate_formula("A1+B1")
$r_code
[1] "a1+b1"

$success
[1] TRUE

$notes
[1] NA
translate_formula("=SUM(A1:A10)")
$r_code
[1] "sum(a1:a10)"

$success
[1] FALSE

$notes
[1] "Contains ranges that may need manual conversion: a1:a10"
translate_formula("=IF(A1>0,A1*2,0)")
$r_code
[1] "if_else(a1>0,a1*2,0)"

$success
[1] TRUE

$notes
[1] NA

The output includes r_code (translated R), success (TRUE if fully translated), and notes (warnings). When success = FALSE, the r_code is still provided but may need editing.

Supported Translations

These Excel functions have direct R equivalents and translate automatically:

Mathematical

Excel R
SUM(range) sum(range)
AVERAGE(range) mean(range)
MIN(range) min(range)
MAX(range) max(range)
ABS(x) abs(x)
SQRT(x) sqrt(x)
ROUND(x, n) round(x, n)
CEILING(x) ceiling(x)
FLOOR(x) floor(x)
EXP(x) exp(x)
LN(x) log(x)
LOG10(x) log10(x)

Statistical

Excel R
COUNT(range) length(range)
STDEV(range) sd(range)
VAR(range) var(range)
MEDIAN(range) median(range)

Logical

Excel R
IF(test, true, false) if_else(test, true, false)
AND(a, b, ...) all(a, b, ...)
OR(a, b, ...) any(a, b, ...)
NOT(x) !(x)

Text

Excel R
CONCATENATE(...) paste0(...)
CONCAT(...) paste0(...)
LEN(text) nchar(text)
UPPER(text) toupper(text)
LOWER(text) tolower(text)
TRIM(text) trimws(text)

Date

Excel R
TODAY() Sys.Date()
NOW() Sys.time()
YEAR(date) lubridate::year(date)
MONTH(date) lubridate::month(date)
DAY(date) lubridate::day(date)

Not Supported

These functions require manual translation—tidychain does not attempt to convert them:

Function Why Manual Approach
VLOOKUP Range-based lookup doesn’t map to R Use left_join() or match()
HLOOKUP Same as VLOOKUP Use left_join() or match()
INDEX/MATCH Requires data structure context Use df[row, col] indexing
POWER(x, n) Syntax change (x^n) Write x^n manually
MOD(x, n) Syntax change (x %% n) Write x %% n manually
COUNTA Complex NA handling Use sum(!is.na(x))
LEFT/RIGHT/MID Different argument order Use substr() with adjusted args
IFERROR Requires try/catch logic Use tryCatch()

Using Cell Mapping

Provide meaningful variable names instead of cell references:

mapping <- list(
  "A1" = "price",
  "B1" = "quantity",
  "C1" = "tax_rate"
)

translate_formula("=A1*B1*(1+C1)", cell_mapping = mapping)
$r_code
[1] "price*quantity*(1+tax_rate)"

$success
[1] TRUE

$notes
[1] NA

Practical Examples

Simple Calculation Model

# Extract formulas
formulas <- extract_formulas("pricing_model.xlsx")
formulas
# A tibble: 4 × 6
  sheet  cell        cell_local formula           references has_references
  <chr>  <chr>       <chr>      <chr>             <list>     <lgl>
1 Pricing Pricing!B5 B5         B2*B3             <chr [2]>  TRUE
2 Pricing Pricing!B6 B6         B5*B4             <chr [2]>  TRUE
3 Pricing Pricing!B7 B7         B5+B6             <chr [2]>  TRUE
4 Pricing Pricing!B8 B8         IF(B7>1000,B7*0.9,B7) <chr [2]>  TRUE
# Translate each formula
formulas |>
  rowwise() |>
  mutate(
    r_code = translate_formula(formula)$r_code
  ) |>
  select(cell_local, formula, r_code)
# A tibble: 4 × 3
  cell_local formula               r_code
  <chr>      <chr>                 <chr>
1 B5         B2*B3                 b2*b3
2 B6         B5*B4                 b5*b4
3 B7         B5+B6                 b5+b6
4 B8         IF(B7>1000,B7*0.9,B7) if_else(b7>1000,b7*0.9,b7)

With Cell Mapping

# Define meaningful names
mapping <- list(
  "B2" = "price",
  "B3" = "quantity",
  "B4" = "tax_rate",
  "B5" = "subtotal",
  "B6" = "tax",
  "B7" = "total"
)

# Translate with mapping
formulas |>
  rowwise() |>
  mutate(
    r_code = translate_formula(formula, cell_mapping = mapping)$r_code
  ) |>
  select(cell_local, formula, r_code)
# A tibble: 4 × 3
  cell_local formula               r_code
  <chr>      <chr>                 <chr>
1 B5         B2*B3                 price*quantity
2 B6         B5*B4                 subtotal*tax_rate
3 B7         B5+B6                 subtotal+tax
4 B8         IF(B7>1000,B7*0.9,B7) if_else(total>1000,total*0.9,total)

Limitations

ImportantBeyond Function Translation

Even supported functions may need manual adjustment:

  • Range syntax (A1:A10) — R uses vectors, not Excel ranges
  • Array formulas — Complex R vectorization needed
  • Structured referencesTable1[Column] syntax not parsed
  • Named ranges — Require workbook context to resolve
  • External references[Book.xlsx]Sheet!A1 not supported
  • VBA/macros — User-defined functions cannot be translated

Best Practices

TipTranslation Workflow
  1. Extract all formulas with extract_formulas()
  2. Identify inputs (cells with no references)
  3. Build cell mapping for meaningful names
  4. Translate iteratively starting with simple formulas
  5. Test each translation against Excel output
  6. Document manual changes for complex functions

Next Steps