# ==============================================================================
# R/data.R - Example dataset documentation
# ==============================================================================

#' Simulated Birth Weight Study Data
#'
#' A simulated dataset containing information about maternal factors and
#' birth weight outcomes, used for demonstrating risk difference calculations.
#'
#' @format A data frame with 2,500 rows and 8 variables:
#' \describe{
#'   \item{id}{Patient identifier (1 to 2500)}
#'   \item{low_birthweight}{Binary outcome: 1 = low birth weight (< 2,500 g), 0 = normal birth weight}
#'   \item{smoking}{Maternal smoking status: "No" or "Yes"}
#'   \item{maternal_age}{Maternal age in years (continuous, mean ~28)}
#'   \item{race}{Maternal race: "White", "Black", "Hispanic", "Other"}
#'   \item{education}{Education level: "Less than HS", "HS", "Some college", "College+"}
#'   \item{prenatal_care}{Prenatal care adequacy: "Adequate", "Inadequate"}
#'   \item{parity}{Number of previous births: 0, 1, 2, or 3 (values of 3 or more are recorded as 3)}
#' }
#'
#' @details
#' This dataset was simulated to reflect realistic associations between
#' maternal factors and low birth weight risk. The relationships include:
#'
#' * **Smoking**: Increases risk of low birth weight
#' * **Maternal age**: Modest association with risk
#' * **Race**: Health disparities reflected in different baseline risks
#' * **Education**: Higher education associated with lower risk
#' * **Prenatal care**: Adequate care reduces risk
#' * **Parity**: Higher parity associated with slightly increased risk
#'
#' The base rate of low birth weight is approximately 8%, which is realistic
#' for developed countries. The effect sizes and interactions were designed
#' to demonstrate various analysis scenarios including stratification and
#' adjustment.
#'
#' @source Simulated data based on patterns from epidemiological literature
#'   and the National Center for Health Statistics
#'
#' @examples
#' data(birthweight)
#' head(birthweight)
#'
#' # Basic descriptive statistics
#' table(birthweight$smoking, birthweight$low_birthweight)
#'
#' # Summary by race
#' with(birthweight, table(race, low_birthweight))
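#'
#' # Crude risk of low birth weight by smoking status within each race stratum.
#' # A base-R sketch of the stratification mentioned in the details; it assumes
#' # low_birthweight is coded 0/1 as documented, so the cell mean is the risk.
#' with(birthweight, tapply(low_birthweight, list(race, smoking), mean))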
#'
#' # Simple risk difference
#' rd <- calc_risk_diff(birthweight, "low_birthweight", "smoking")
#' print(rd)
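#'
#' # Hand-computed crude risk difference as a rough cross-check.
#' # A sketch only: it assumes low_birthweight is coded 0/1 and smoking takes
#' # the values "No" and "Yes" as documented, and it ignores any adjustment
#' # that calc_risk_diff() may apply, so the two results need not match exactly.
#' risk_by_smoking <- tapply(birthweight$low_birthweight, birthweight$smoking, mean)
#' unname(risk_by_smoking["Yes"] - risk_by_smoking["No"])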
#'
#' # Create a simple summary table
#' cat(create_simple_table(rd, "Risk of Low Birth Weight by Smoking Status"))
#'
"birthweight"
