Derived variables

library(chmsflow)

Introduction

There are two types of derived variables in the CHMS surveys. Both are supported in chmsflow.

chmsflow computes derived variables using functions referenced in variable-details.csv. The recEnd column uses the prefix Func:: to name the R function, and the variableStart column uses the prefix DerivedVar:: to list the input variables.

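For example, the variable-details.csv row for GFR (gfr_ml_min) has:

variableStart: DerivedVar::[lab_bcre, pgdcgt, clc_sex, clc_age]
recEnd: Func::calculate_gfr

This tells rec_with_table() to call calculate_gfr() with the four input variables.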
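The prefix convention can be illustrated with a minimal base R sketch of how a Func:: entry and a DerivedVar:: entry might be resolved into a function call. This is an illustration only, not recodeflow's actual implementation: parse_derived_spec() and calculate_gfr_demo() are hypothetical names, the bracketed list format is assumed, the toy data values are made up, and the arithmetic in the stand-in function is a placeholder rather than the GFR formula.

```r
# Hypothetical helper: split the two metadata strings into a function name
# and a character vector of input variable names.
parse_derived_spec <- function(variable_start, rec_end) {
  stopifnot(startsWith(variable_start, "DerivedVar::"),
            startsWith(rec_end, "Func::"))
  inputs <- sub("^DerivedVar::", "", variable_start)
  inputs <- gsub("\\[|\\]", "", inputs)  # drop the square brackets
  inputs <- trimws(strsplit(inputs, ",", fixed = TRUE)[[1]])
  list(fun = sub("^Func::", "", rec_end), inputs = inputs)
}

# Placeholder stand-in for a derived-variable function (NOT the GFR formula).
calculate_gfr_demo <- function(lab_bcre, pgdcgt, clc_sex, clc_age) {
  lab_bcre + clc_age
}

spec <- parse_derived_spec(
  "DerivedVar::[lab_bcre, pgdcgt, clc_sex, clc_age]",
  "Func::calculate_gfr_demo"
)

# Made-up toy data with the four input columns.
toy <- data.frame(lab_bcre = c(80, 95), pgdcgt = c(1, 2),
                  clc_sex = c(1, 2), clc_age = c(40, 55))

# Look up the function by name and pass the input columns positionally.
toy$gfr_demo <- do.call(spec$fun, unname(as.list(toy[spec$inputs])))
```

The point of the sketch is the dispatch step: the metadata supplies both the function name and the order of its arguments, so the recoding code never needs to hard-code either.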

How to use derived variables

Since derived variables depend on their input variables, you must list both the derived variable and its inputs when calling rec_with_table():

# Request the four GFR inputs together with the derived variable gfr_ml_min
cycle2_gfr <- recodeflow::rec_with_table(
  cycle2,
  variables = c("lab_bcre", "pgdcgt", "clc_sex", "clc_age", "gfr_ml_min"),
  variable_details = variable_details,
  log = TRUE
)
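Because it is easy to forget an input, one option is to look the inputs up from the metadata instead of typing them. The sketch below does this in base R on a toy stand-in for variable_details. The helper with_inputs() is hypothetical and not part of chmsflow or recodeflow, and the sketch assumes columns named variable and variableStart and a DerivedVar::[...] list format; the toy rows are made up.

```r
# Toy stand-in for the relevant columns of variable_details (values assumed).
toy_details <- data.frame(
  variable = c("clc_age", "gfr_ml_min"),
  variableStart = c("[clc_age]",
                    "DerivedVar::[lab_bcre, pgdcgt, clc_sex, clc_age]"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the requested variables plus any inputs that
# derived variables among them declare via DerivedVar::.
with_inputs <- function(variables, details) {
  is_derived <- details$variable %in% variables &
    startsWith(details$variableStart, "DerivedVar::")
  inputs <- unlist(lapply(details$variableStart[is_derived], function(x) {
    x <- gsub("^DerivedVar::|\\[|\\]", "", x)
    trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  }))
  unique(c(inputs, variables))
}

vars <- with_inputs("gfr_ml_min", toy_details)
```

The resulting vector could then be passed as the variables argument of rec_with_table(). The helper does not handle inputs that are themselves derived variables.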

For variables that depend on medication status (e.g., hypertension, diabetes), use recode_after_meds() instead of rec_with_table(). See Recoding medications and Analysis walkthrough for the full workflow.

Creating a derived variable

To add a new derived variable to chmsflow, you need to create a harmonized set of input variables and an R function that computes the derived value. See How to add variables for step-by-step instructions.
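As a rough idea of what the two pieces look like, the sketch below pairs a small vectorised function with the metadata entries that would point at it. Everything here is hypothetical: the function calculate_pulse_pressure(), the variable names sbp, dbp and pulse_pressure, and the handling of missing values are illustrative assumptions and do not reflect chmsflow's actual variables or missing-data conventions.

```r
# Hypothetical derived-variable function: vectorised, one argument per input.
calculate_pulse_pressure <- function(sbp, dbp) {
  out <- sbp - dbp
  # Treat negative results as missing; real missing-data codes are not modelled.
  out[!is.na(out) & out < 0] <- NA_real_
  out
}

# Metadata entries that would point at this function (names are hypothetical).
new_row <- data.frame(
  variable = "pulse_pressure",
  variableStart = "DerivedVar::[sbp, dbp]",
  recEnd = "Func::calculate_pulse_pressure",
  stringsAsFactors = FALSE
)

# Quick check on made-up values before wiring the function into the metadata.
pp <- calculate_pulse_pressure(c(120, 135, NA), c(80, 85, 70))
```

Testing the function on a few hand-picked values, including missing inputs, before adding the metadata row keeps errors in the formula separate from errors in the recoding setup.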

For details on the metadata schema, see Variable schema reference.

Next steps