---
title: "Derived variables"
format: html
vignette: >
  %\VignetteIndexEntry{Derived variables}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r, warning=FALSE, message=FALSE}
library(chmsflow)
```

## Introduction

The CHMS surveys contain two types of derived variables, and chmsflow supports both:

- **Variable mapping** -- combining two or more variables into a single variable.
- **Computed variables** -- variables derived using mathematical equations or clinical logic.

chmsflow computes derived variables using functions referenced in `variable-details.csv`. The `recEnd` column uses the prefix `Func::` to name the R function, and the `variableStart` column uses the prefix `DerivedVar::` to list the input variables.

For example, GFR (`gfr_ml_min`) has:

- `recEnd`: `Func::calculate_gfr`
- `variableStart`: `DerivedVar::[lab_bcre, pgdcgt, clc_sex, clc_age]`

This tells `rec_with_table()` to call `calculate_gfr()` with the four input variables.
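
To make the two prefixes concrete, the chunk below sketches how a `Func::` and `DerivedVar::` pair could be unpacked using base R only. It is an illustration, not the parsing code used inside `rec_with_table()`. The one-row data frame `gfr_row` is a hypothetical stand-in for the GFR entry in `variable-details.csv`, and the `variable` column name is an assumption.

```{r}
# Hypothetical one-row stand-in for the GFR entry in variable-details.csv
gfr_row <- data.frame(
  variable = "gfr_ml_min",
  variableStart = "DerivedVar::[lab_bcre, pgdcgt, clc_sex, clc_age]",
  recEnd = "Func::calculate_gfr",
  stringsAsFactors = FALSE
)

# Drop the "Func::" prefix to recover the function name
func_name <- sub("^Func::", "", gfr_row$recEnd)

# Drop the "DerivedVar::[ ... ]" wrapper, then split the list on commas
inputs_raw <- sub("^DerivedVar::\\[(.*)\\]$", "\\1", gfr_row$variableStart)
input_vars <- trimws(strsplit(inputs_raw, ",", fixed = TRUE)[[1]])

func_name
input_vars
```

Here `func_name` holds the name of the function to call and `input_vars` holds the four input variable names in the order they are listed in `variableStart`.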

## How to use derived variables

Since derived variables depend on their input variables, you must list both the derived variable and its inputs when calling `rec_with_table()`:

```{r, warning=FALSE, eval=FALSE}
# Include the four inputs alongside the derived variable that depends on them
cycle2_gfr <- recodeflow::rec_with_table(
  cycle2,
  variables = c("lab_bcre", "pgdcgt", "clc_sex", "clc_age", "gfr_ml_min"),
  variable_details = variable_details,
  log = TRUE
)
```
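
Before running a call like the GFR example above, it can help to confirm that every input is in the `variables` vector. The following is a minimal sketch of such a check using base R. The names `requested`, `required_inputs`, and `missing_inputs` are hypothetical, and `clc_age` is left out on purpose to show what the check reports.

```{r}
# Variables requested for recoding; clc_age is deliberately omitted here
requested <- c("lab_bcre", "pgdcgt", "clc_sex", "gfr_ml_min")

# Inputs listed for gfr_ml_min in its DerivedVar:: entry
required_inputs <- c("lab_bcre", "pgdcgt", "clc_sex", "clc_age")

# Any input that was not requested shows up here
missing_inputs <- setdiff(required_inputs, requested)
missing_inputs
```

An empty result means all inputs are present. Otherwise, add the reported names to `variables` before recoding.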

For variables that depend on medication status (e.g., hypertension, diabetes), use `recode_after_meds()` instead of `rec_with_table()`. See [Recoding medications](recoding_medications.html) and [Analysis walkthrough](analysis_walkthrough.html) for the full workflow.

## Creating a derived variable

To add a new derived variable to chmsflow, you need to create a harmonized set of input variables and an R function that computes the derived value. See [How to add variables](how_to_add_variables.html) for step-by-step instructions.
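
As a rough illustration of the function half of that work, the chunk below defines a hypothetical `calculate_example_ratio()`. It is not part of chmsflow, and it uses plain `NA` rather than the `tagged_na()` codes that the package's own functions handle, which is a simplification.

```{r}
# Hypothetical derived-variable function (not part of chmsflow)
calculate_example_ratio <- function(numerator, denominator) {
  # Flag rows where either input is missing or the denominator is zero
  invalid <- is.na(numerator) | is.na(denominator) | denominator == 0
  # Return NA for flagged rows and the ratio otherwise
  ifelse(invalid, NA_real_, numerator / denominator)
}

example_result <- calculate_example_ratio(c(10, 5, NA), c(2, 0, 4))
example_result
```

Following the convention described in the introduction, a function like this would be referenced with `Func::calculate_example_ratio` in `recEnd`, and its inputs would be listed in `variableStart` with the `DerivedVar::` prefix.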

For details on the metadata schema, see [Variable schema reference](variables_and_variable_details.html).

## Next steps

- **See derived variables in a full analysis** -- The [Analysis walkthrough](analysis_walkthrough.html) demonstrates deriving hypertension status from CHMS cycle 3 data.
- **Handle missing data** -- Learn how `tagged_na()` codes propagate through derived variable functions in [Missing data (tagged_na)](tagged_na_usage.html).
- **Understand the methodology** -- For the design rationale behind the rules-as-data approach, see [Methodology](methodology.html).
