The EmpiricalDynamics package provides a complete
workflow for discovering differential equations from time series data.
This approach, sometimes called “equation discovery” or “symbolic
regression for dynamics,” allows researchers to uncover the mathematical
laws governing observed phenomena.
The package implements a six-step workflow:
Let’s discover the equation governing logistic population growth.
library(EmpiricalDynamics)
# Load example data
logistic_growth <- load_example_data("logistic_growth")
# View the data
head(logistic_growth)
# Compute derivative using Total Variation Regularization (recommended)
deriv <- compute_derivative(
logistic_growth$Z,
time = logistic_growth$time,
method = "tvr",
lambda = "auto"
)
# Add derivative to data
logistic_growth$dZ <- deriv$derivative
# Plot diagnostic
plot(deriv)Based on theory, logistic growth follows: \(\frac{dZ}{dt} = rZ(1 - Z/K)\)
# Get LaTeX equation
latex_eq <- to_latex(equation, variable = "Z")
cat(latex_eq)
# Generate full report (using tempdir() per CRAN policy)
generate_report(
results = list(
data = logistic_growth,
equation = equation,
validation = validation
),
output_file = file.path(tempdir(), "logistic_analysis"),
format = "markdown"
)Computing reliable derivatives from noisy data is critical. The package offers several methods:
Recommended for most applications, especially economic data with trends and structural breaks.
deriv_tvr <- compute_derivative(
y = data$Z,
time = data$time,
method = "tvr",
lambda = "auto" # Automatic regularization selection
)
# Plot diagnostic: shows original, derivative, reconstruction, residuals
plot_tvr_diagnostic(deriv_tvr)TVR solves the optimization problem: \[\min_{\dot{Z}} \left\| Z - \int \dot{Z} \, dt \right\|_2^2 + \lambda \|\Delta \dot{Z}\|_1\]
# Savitzky-Golay filter (preserves peaks, good for smooth data)
deriv_sg <- compute_derivative(y, time, method = "savgol",
window = 11, order = 3)
# Smoothing spline (good for high noise)
deriv_spline <- compute_derivative(y, time, method = "spline",
spar = 0.7)
# Finite differences (simple, fast, sensitive to noise)
deriv_fd <- compute_derivative(y, time, method = "fd")
# Spectral (for periodic data ONLY - warns about Gibbs phenomenon otherwise)
deriv_spec <- compute_derivative(y, time, method = "spectral")Before formal fitting, explore the data visually:
# Genetic algorithm search (pure R)
search_result <- symbolic_search(
data = data,
response = "dZ",
predictors = c("Z", "X"),
backend = "r_genetic",
max_complexity = 15,
n_generations = 50,
population_size = 100,
n_runs = 3
)
# View Pareto front
plot_pareto_front(search_result)
# Select best equation
best_eq <- select_equation(search_result, criterion = "bic")For large-scale symbolic regression, use the Julia backend:
# Compute residuals
resid <- compute_residuals(equation, data, "dZ")
# Run diagnostic tests
diag <- residual_diagnostics(resid, data)
print(diag)
plot(diag)The diagnostics include: - Ljung-Box test: Autocorrelation in residuals - ARCH-LM test: Conditional heteroscedasticity - Breusch-Pagan test: Heteroscedasticity vs predictors - Jarque-Bera test: Normality - Runs test: Randomness
When heteroscedasticity is significant, use iterative GLS:
# Iterative estimation for unbiased drift coefficients
sde <- estimate_sde_iterative(
drift_formula = "a + b * Z + c * Z^2",
data = data,
derivative_col = "dZ",
diffusion_formula = "sigma * (1 + tau * abs(Z))",
max_iter = 20,
tolerance = 1e-4
)
# Check convergence
print(sde$converged)
print(sde$iterations)# Simulate from the SDE
sim <- simulate_trajectory(
sde,
initial_conditions = c(Z = data$Z[1]),
times = data$time,
n_sims = 100,
method = "euler"
)
# Plot with observed data
plot(sim, observed_data = data)
# Compare quantitatively
metrics <- compare_trajectories(sim, data, "time", "Z")
print(metrics)# Analyze fixed points
fps <- analyze_fixed_points(equation, "Z", range = c(-10, 150))
print(fps)
# Bifurcation analysis (vary a parameter)
bifurc <- analyze_bifurcations(
equation,
variable = "Z",
parameter = "K",
param_range = c(50, 200)
)
plot(bifurc)
# Check against expected behavior
qual_check <- check_qualitative_behavior(
equation, data, "Z",
expected_features = list(
n_fixed_points = 2,
stability_pattern = c("unstable", "stable"),
bounded = TRUE
)
)
print(qual_check)# Coefficient table
coef_table <- coefficient_table(equation, format = "latex")
cat(coef_table)
# Model comparison
models <- list(
"Linear" = fit_specified_equation("a + b*Z", data, "dZ"),
"Quadratic" = fit_specified_equation("a + b*Z + c*Z^2", data, "dZ"),
"Logistic" = fit_specified_equation("r*Z*(1-Z/K)", data, "dZ")
)
comparison <- model_comparison_table(models, data, "dZ", format = "markdown")
cat(comparison)# Collect all results
results <- list(
data = data,
exploration = exploration,
equation = equation,
sde = sde,
validation = validation
)
# Generate markdown report (using tempdir() per CRAN policy)
generate_report(
results,
output_file = file.path(tempdir(), "analysis_report"),
format = "markdown",
title = "My Dynamics Analysis"
)
# Export to multiple formats (using tempdir() per CRAN policy)
export_results(results, output_dir = tempdir(), formats = c("rds", "csv", "json"))
# Save plots (using tempdir() per CRAN policy)
save_plots(results, output_dir = tempdir(), format = c("png", "pdf"))suggest_differentiation_method() for guidancediagnose_sampling_frequency()fit_specified_equation() is more reliablesessionInfo()
#> R version 4.4.2 (2024-10-31 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26200)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=C LC_CTYPE=Spanish_Spain.utf8
#> [3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C
#> [5] LC_TIME=Spanish_Spain.utf8
#>
#> time zone: America/Costa_Rica
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.37 R6_2.6.1 fastmap_1.2.0 xfun_0.55
#> [5] cachem_1.1.0 knitr_1.51 htmltools_0.5.8.1 rmarkdown_2.29
#> [9] lifecycle_1.0.4 cli_3.6.5 sass_0.4.10 jquerylib_0.1.4
#> [13] compiler_4.4.2 rstudioapi_0.17.1 tools_4.4.2 evaluate_1.0.5
#> [17] bslib_0.9.0 yaml_2.3.12 rlang_1.1.6 jsonlite_2.0.0