version 1.4.0
SVEMnet
implements Self-Validated Ensemble Models (SVEM,
Lemkus et al. 2021) and the SVEM whole model test (Karl 2024) using
Elastic Net regression via the glmnet
package Friedman et
al. (2010). This vignette provides an overview of the package’s
functionality and usage.
library(SVEMnet)
# Example data
data <- iris
svem_model <- SVEMnet(Sepal.Length ~ ., data = data, nBoot = 300)
coef(svem_model)
## Percent of Bootstraps Nonzero
## Sepal.Width 100.00000
## Petal.Length 100.00000
## Speciesvirginica 94.66667
## Petal.Width 92.00000
## Speciesversicolor 92.00000
Generate a plot of actual versus predicted values:
Predict outcomes for new data using the predict()
function:
## 1 2 3 4 5 6 7 8
## 5.004969 4.746061 4.775963 4.871504 5.056751 5.381124 4.927210 5.026849
## 9 10 11 12 13 14 15 16
## 4.694279 4.897482 5.182194 5.100510 4.772039 4.551054 5.116554 5.492709
## 17 18 19 20 21 22 23 24
## 5.086478 4.978991 5.355321 5.207998 5.174172 5.130238 4.762104 5.044457
## 25 26 27 28 29 30 31 32
## 5.321495 4.893384 5.048555 5.078631 4.953187 4.996947 4.945166 4.974893
## 33 34 35 36 37 38 39 40
## 5.415298 5.367440 4.871504 4.702301 4.931307 5.082728 4.672400 5.026849
## 41 42 43 44 45 46 47 48
## 4.905330 4.283951 4.775963 5.048381 5.476666 4.720083 5.307637 4.849624
## 49 50 51 52 53 54 55 56
## 5.182194 4.901406 6.466747 6.293446 6.536311 5.511060 6.159981 6.138275
## 57 58 59 60 61 62 63 64
## 6.466573 5.125143 6.263718 5.618547 5.065340 5.968898 5.537211 6.311402
## 65 66 67 68 69 70 71 72
## 5.527103 6.193981 6.189883 5.869781 5.775630 5.592917 6.436497 5.769967
## 73 74 75 76 77 78 79 80
## 6.225621 6.311576 6.042734 6.142199 6.333282 6.506235 6.138101 5.376030
## 81 82 83 84 85 86 87 88
## 5.467473 5.419790 5.670502 6.450530 6.189883 6.371031 6.388988 5.805706
## 89 90 91 92 93 94 95 96
## 5.947192 5.614623 5.987028 6.289522 5.692382 5.073362 5.865509 6.046831
## 97 98 99 100 101 102 103 104
## 5.969072 6.042734 4.929963 5.843629 6.966991 6.149214 6.841896 6.647063
## 105 106 107 108 109 110 111 112
## 6.742256 7.357526 5.655637 7.162693 6.587259 7.195997 6.382144 6.296537
## 113 114 115 116 117 118 119 120
## 6.547250 5.946012 6.071107 6.451534 6.625183 7.819463 7.319429 5.920556
## 121 122 123 124 125 126 127 128
## 6.746180 6.027695 7.353603 6.027869 6.849917 7.097054 6.005989 6.183213
## 129 130 131 132 133 134 135 136
## 6.517348 6.898123 6.937611 7.650434 6.491370 6.304907 6.595629 6.937263
## 137 138 139 140 141 142 143 144
## 6.750104 6.676964 6.109552 6.525370 6.594760 6.252430 6.149214 6.893503
## 145 146 147 148 149 150
## 6.746007 6.274310 5.971989 6.352243 6.628759 6.330537
This is the serial version of the significance test. It is slower but the code is less complicated to read than the faster parallel version.
test_result <- svem_significance_test(Sepal.Length ~ ., data = data)
print(test_result)
plot(test_result)
SVEM Significance Test p-value:
[1] 0
Whole model test result
Note that there is a parallelized version that runs much faster
# Simulate data
set.seed(1)
n <- 25
X1 <- runif(n)
X2 <- runif(n)
X3 <- runif(n)
X4 <- runif(n)
X5 <- runif(n)
#y only depends on X1 and X2
y <- 1 + X1 + X2 + X1 * X2 + X1^2 + rnorm(n)
data <- data.frame(y, X1, X2, X3, X4, X5)
# Perform the SVEM significance test
test_result <- svem_significance_test_parallel(
y ~ (X1 + X2 + X3)^2 + I(X1^2) + I(X2^2) + I(X3^2),
data = data
)
# View the p-value
print(test_result)
SVEM Significance Test p-value:
[1] 0.009399093
test_result2 <- svem_significance_test_parallel(
y ~ (X1 + X2 )^2 + I(X1^2) + I(X2^2),
data = data
)
# View the p-value
print(test_result2)
SVEM Significance Test p-value:
[1] 0.006475736
#note that the response does not depend on X4 or X5
test_result3 <- svem_significance_test_parallel(
y ~ (X4 + X5)^2 + I(X4^2) + I(X5^2),
data = data
)
# View the p-value
print(test_result3)
SVEM Significance Test p-value:
[1] 0.8968502
# Plot the Mahalanobis distances
plot(test_result,test_result2,test_result3)
Whole Model Test Results for Example 2
Newly added wrapper for cv.glmnet() to compare performance of SVEM to glmnet’s native CV implementation.
We ran focused simulations to examine when debiasing and objective
choice help, under standardize=TRUE
,
alpha_set = c(1)
, SVEM weights, and disjoint
train/test:
p_full
be the number of columns in the full model matrix
(no intercept). Debiasing tends to hurt when
n_train_used / p_full < 1
, is mostly
neutral near approx 1
, and can
help modestly when > 1
.wAIC
vs wSSE
).
wAIC
generally produced lower test RMSE than
wSSE
across R²∈{0.3,0.7,0.9}, especially in tighter‑sample
regimes; wSSE
narrows the gap as n/p_full
grows.debias = FALSE
is the safer default. Use
debias = TRUE
only when n/p_full
is
comfortably >1 or when you need JMP‑style outputs.## ======================================================================
## SVEMnet boundary sims: wAIC vs wSSE + debias(TRUE/FALSE)
## - standardize = TRUE
## - glmnet_alpha = c(1)
## - No leakage; test is disjoint from train; debias refit check
## - n_train ranges chosen to cross n_train_used vs p_full boundary
## - Outputs:
## * res ........ per-fit records (you can model this)
## * A tidy summary with 4 win-rates:
## winrate_debias_TRUE, winrate_debias_FALSE,
## winrate_wAIC, winrate_wSSE
## * Two simple plots (comment out prints if running headless)
## ======================================================================
if (!requireNamespace("SVEMnet", quietly = TRUE)) install.packages("SVEMnet")
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")
library(SVEMnet)
library(ggplot2)
## ---------- Logging / Progress ----------
VERBOSE <- TRUE # per-fit messages
ECHO_EVERY <- 1 # print every k-th fit
PROGRESS <- TRUE # progress bar
stamp <- function() format(Sys.time(), "%H:%M:%S")
log_msg <- function(...) if (VERBOSE) {
cat(sprintf("[%s] %s\n", stamp(), sprintf(...))); flush.console()
}
options(warn = 1) # surface warnings immediately
set.seed(20250814)
## ------------------ Controls ------------------
REPS <- 10 # increase to 20+ for more power
NBOOT <- 200
R2_GRID <- c(0.30, 0.70, 0.90)
RHO_GRID <- c(0.2, 0.6, 0.9)
P_GRID <- c(4, 8, 12)
NTEST <- 1000
MODELS <- c( "main_plus_int", "full_quadratic")
## ------------------ Helpers -------------------
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
safe_predict <- function(fit, newdata, debias=FALSE) {
out <- try(predict(fit, newdata=newdata, debias=debias), silent=TRUE)
if (inherits(out, "try-error")) return(rep(NA_real_, nrow(newdata)))
as.numeric(out)
}
## Count p_full (columns in model.matrix minus intercept) WITHOUT response 'y'
count_p_full <- function(fm, p) {
rhs_fml <- stats::reformulate(
attr(stats::delete.response(stats::terms(fm)), "term.labels")
)
tmp <- as.data.frame(matrix(rnorm(4 * p), nrow = 4))
names(tmp) <- paste0("X", seq_len(p))
mm <- model.matrix(rhs_fml, data = tmp)
sum(colnames(mm) != "(Intercept)")
}
verify_no_leakage <- function(fit, train_df_used, test_df) {
ok_disjoint <- !any(rownames(train_df_used) %in% rownames(test_df))
raw_pred_train <- safe_predict(fit, train_df_used, debias=FALSE)
y_train <- train_df_used$y
ref <- lm(y_train ~ raw_pred_train)
co_ref <- coef(ref)
if (is.null(fit$debias_fit)) {
return(list(ok=FALSE, ok_disjoint=ok_disjoint, coef_close=FALSE))
}
co_pkg <- coef(fit$debias_fit)
same_len <- length(co_pkg) == length(co_ref)
coef_close <- same_len && all(abs(co_pkg - co_ref) < 1e-8)
list(ok = (ok_disjoint && coef_close), ok_disjoint=ok_disjoint, coef_close=coef_close)
}
build_formulas <- function(p) {
vars <- paste0("X", seq_len(p))
main_terms <- paste(vars, collapse=" + ")
int_terms <- paste0("(", main_terms, ")^2")
sq_terms <- paste(sprintf("I(%s^2)", vars), collapse=" + ")
list(
main_effects = as.formula(paste0("y ~ ", main_terms)),
main_plus_int = as.formula(paste0("y ~ ", int_terms)),
full_quadratic = as.formula(paste0("y ~ ", int_terms, " + ", sq_terms))
)
}
## Sparse truth
truth_for_p <- function(p) {
list(
active_main = seq_len(min(3, p)),
active_pairs = if (p >= 3) rbind(c(1,2), c(2,3)) else matrix(numeric(), ncol=2),
active_sq = if (p >= 2) 2L else integer(0)
)
}
## Equal scales; AR-ish correlation
gen_X <- function(n, p, rho=0.5) {
Z <- matrix(rnorm(n*p), n, p)
if (p >= 2 && abs(rho) > 0) {
for (j in 2:p) {
Z[, j] <- rho * scale(Z[, j-1], TRUE, TRUE) + sqrt(1-rho^2) * scale(Z[, j], TRUE, TRUE)
}
}
X <- Z
colnames(X) <- paste0("X", seq_len(p))
as.data.frame(X)
}
## Data with targeted in-sample R^2
make_sparse_data_R2 <- function(n, p, rho, target_R2,
active_main, active_pairs, active_sq,
coef_main=2, coef_int=1.5, coef_sq=1.0, seed=NULL) {
if (!is.null(seed)) set.seed(seed)
X <- gen_X(n, p, rho)
mu <- 1
y_sig <- rep(mu, n)
for (j in active_main) y_sig <- y_sig + coef_main * X[[paste0("X", j)]]
if (length(active_sq)) for (j in active_sq) y_sig <- y_sig + coef_sq * (X[[paste0("X", j)]]^2)
if (nrow(active_pairs) > 0) {
for (k in seq_len(nrow(active_pairs))) {
i <- active_pairs[k,1]; j <- active_pairs[k,2]
y_sig <- y_sig + coef_int * (X[[paste0("X", i)]] * X[[paste0("X", j)]])
}
}
vs <- var(y_sig); sd_sig <- sqrt(max(vs, .Machine$double.eps))
sd_eps <- sd_sig * sqrt((1 - target_R2) / target_R2)
y <- y_sig + rnorm(n, 0, sd_eps)
df <- cbind.data.frame(y=y, X)
df$y_signal <- y_sig
rownames(df) <- sprintf("row%06d", seq_len(n))
df
}
## ------------------ RUN ------------------
rows <- list(); rid <- 1
leaks <- list()
TOTAL_FITS <- sum(sapply(P_GRID, function(p) length(MODELS)*length(RHO_GRID)*length(R2_GRID)*REPS))
fit_counter <- 0L
if (PROGRESS) pb <- txtProgressBar(min = 0, max = TOTAL_FITS, style = 3)
for (p in P_GRID) {
fms <- build_formulas(p)
thr <- truth_for_p(p)
for (model in MODELS) {
fm <- fms[[model]]
p_full <- count_p_full(fm, p)
## choose n_train to cross the boundary; clamp [10, 80]
n_grid_raw <- round(c(0.5, 0.75, 1.0, 1.25, 1.75, 2.5) * p_full)
n_grid <- sort(unique(pmin(pmax(n_grid_raw, 10), 80)))
for (rho in RHO_GRID) {
for (R2_tgt in R2_GRID) {
for (rep in seq_len(REPS)) {
n_tr <- sample(n_grid, 1L)
n_te <- NTEST
df <- make_sparse_data_R2(
n = n_tr + n_te, p = p, rho = rho, target_R2 = R2_tgt,
active_main = thr$active_main,
active_pairs = thr$active_pairs,
active_sq = thr$active_sq,
seed = 1000 + 7*rep + p + round(100*R2_tgt) + round(100*rho)
)
idx <- sample(seq_len(nrow(df)), size = n_tr)
train_df <- df[idx, ]
test_df <- df[-idx, ]
## PRE-FILTER complete cases for THIS formula (avoid NA crash)
mf_all <- model.frame(fm, data = train_df, na.action = na.pass)
keep <- complete.cases(mf_all)
if (sum(keep) < 2) {
warning(sprintf("Skipping: n_train=%d but <2 complete cases for %s (p=%d)", n_tr, model, p))
next
}
train_used <- train_df[keep, , drop = FALSE]
n_used <- nrow(train_used)
## realized true R^2
r2_true <- function(d) var(d$y_signal)/var(d$y)
R2_train_true <- r2_true(train_used)
R2_test_true <- r2_true(test_df)
fit_counter <- fit_counter + 1L
if (PROGRESS) setTxtProgressBar(pb, fit_counter)
if (fit_counter %% max(1L, ECHO_EVERY) == 0L) {
log_msg("Start #%d | p=%d (p_full=%d) | model=%s | rho=%.2f | R2=%.2f | n_train=%d (used=%d) | ratio=%.2f",
fit_counter, p, p_full, model, rho, R2_tgt, n_tr, n_used, n_used/p_full)
}
## ---- Fit wAIC ----
t0 <- proc.time()[3]
fit_waic <- SVEMnet(
formula = fm, data = train_used,
nBoot = NBOOT,
glmnet_alpha = c(1),
weight_scheme = "SVEM",
objective = "wAIC",
standardize = TRUE
)
t1 <- proc.time()[3]
## ---- Fit wSSE ----
t2 <- proc.time()[3]
fit_wsse <- SVEMnet(
formula = fm, data = train_used,
nBoot = NBOOT,
glmnet_alpha = c(1),
weight_scheme = "SVEM",
objective = "wSSE",
standardize = TRUE
)
t3 <- proc.time()[3]
## Predictions
pr_tr_raw_waic <- safe_predict(fit_waic, train_used, debias = FALSE)
pr_te_raw_waic <- safe_predict(fit_waic, test_df, debias = FALSE)
pr_tr_deb_waic <- safe_predict(fit_waic, train_used, debias = TRUE)
pr_te_deb_waic <- safe_predict(fit_waic, test_df, debias = TRUE)
pr_tr_raw_wsse <- safe_predict(fit_wsse, train_used, debias = FALSE)
pr_te_raw_wsse <- safe_predict(fit_wsse, test_df, debias = FALSE)
pr_tr_deb_wsse <- safe_predict(fit_wsse, train_used, debias = TRUE)
pr_te_deb_wsse <- safe_predict(fit_wsse, test_df, debias = TRUE)
## RMSEs
rmse_tr_raw_waic <- rmse(train_used$y, pr_tr_raw_waic)
rmse_te_raw_waic <- rmse(test_df$y, pr_te_raw_waic)
rmse_tr_deb_waic <- rmse(train_used$y, pr_tr_deb_waic)
rmse_te_deb_waic <- rmse(test_df$y, pr_te_deb_waic)
rmse_tr_raw_wsse <- rmse(train_used$y, pr_tr_raw_wsse)
rmse_te_raw_wsse <- rmse(test_df$y, pr_te_raw_wsse)
rmse_tr_deb_wsse <- rmse(train_used$y, pr_tr_deb_wsse)
rmse_te_deb_wsse <- rmse(test_df$y, pr_te_deb_wsse)
log_msg("Done #%d | AIC raw=%.3f deb=%.3f | SSE raw=%.3f deb=%.3f | Δ_deb(raw): AIC=%.3f SSE=%.3f | %.1fs+%.1fs",
fit_counter, rmse_te_raw_waic, rmse_te_deb_waic,
rmse_te_raw_wsse, rmse_te_deb_wsse,
rmse_te_deb_waic - rmse_te_raw_waic,
rmse_te_deb_wsse - rmse_te_raw_wsse,
(t1 - t0), (t3 - t2))
## Save row
rows[[rid]] <- data.frame(
p = p, p_full = p_full, model = model, rho = rho, R2_target = R2_tgt,
n_train = n_tr, n_train_used = n_used, n_test = n_te,
ratio_n_over_p = n_used / p_full,
above_boundary = as.integer(n_used > p_full),
R2_true_train = R2_train_true, R2_true_test = R2_test_true,
rmse_test_raw_waic = rmse_te_raw_waic,
rmse_test_deb_waic = rmse_te_deb_waic,
rmse_test_raw_wsse = rmse_te_raw_wsse,
rmse_test_deb_wsse = rmse_te_deb_wsse,
rmse_train_raw_waic = rmse_tr_raw_waic,
rmse_train_deb_waic = rmse_tr_deb_waic,
rmse_train_raw_wsse = rmse_tr_raw_wsse,
rmse_train_deb_wsse = rmse_tr_deb_wsse,
time_sec_waic = round(t1 - t0, 2),
time_sec_wsse = round(t3 - t2, 2),
stringsAsFactors = FALSE
)
rid <- rid + 1
## light leakage check
if (rep <= 2) {
chk <- verify_no_leakage(fit_waic, train_used, test_df)
leaks[[length(leaks) + 1]] <- data.frame(
p = p, model = model, rho = rho, R2_target = R2_tgt, n_train = n_tr, n_train_used = n_used,
ok = chk$ok, ok_disjoint = chk$ok_disjoint, coef_close = chk$coef_close
)
}
}
}
}
}
}
if (!length(rows)) stop("No successful fits were recorded. Check logs/warnings above.")
res <- do.call(rbind, rows)
leak_df <- if (length(leaks)) do.call(rbind, leaks) else NULL
## Ratio bins for readability on plots/tables
res$ratio_bin <- cut(
res$ratio_n_over_p,
breaks = c(-Inf, 0.75, 1.0, 1.25, 1.75, Inf),
labels = c("<0.75", "0.75–1.0", "1.0–1.25", "1.25–1.75", ">1.75"),
right = TRUE
)
## ------------------ SUMMARY (with 4 win-rates) ------------------
## CI helper
ci95 <- function(x) {
x <- x[is.finite(x)]
n <- length(x); if (!n) return(c(mean=NA, lo=NA, hi=NA, n=0))
m <- mean(x); se <- sd(x)/sqrt(n)
c(mean=m, lo=m-1.96*se, hi=m+1.96*se, n=n)
}
## Grouping key
grp_vars <- c("model","R2_target","ratio_bin","above_boundary")
## Build summary per group using by() to avoid unpack glitches
summ <- do.call(rbind, by(res, INDICES = res[grp_vars], FUN = function(d){
## win indicators (row-wise)
deb_win_waic <- as.numeric(d$rmse_test_deb_waic < d$rmse_test_raw_waic)
deb_win_wsse <- as.numeric(d$rmse_test_deb_wsse < d$rmse_test_raw_wsse)
deb_lose_waic <- as.numeric(d$rmse_test_deb_waic > d$rmse_test_raw_waic)
deb_lose_wsse <- as.numeric(d$rmse_test_deb_wsse > d$rmse_test_raw_wsse)
waic_win_raw <- as.numeric(d$rmse_test_raw_waic < d$rmse_test_raw_wsse)
waic_win_deb <- as.numeric(d$rmse_test_deb_waic < d$rmse_test_deb_wsse)
wsse_win_raw <- as.numeric(d$rmse_test_raw_wsse < d$rmse_test_raw_waic)
wsse_win_deb <- as.numeric(d$rmse_test_deb_wsse < d$rmse_test_deb_waic)
## pooled winrates (redundant on purpose)
winrate_debias_TRUE <- mean(c(deb_win_waic, deb_win_wsse), na.rm = TRUE)
winrate_debias_FALSE <- mean(c(deb_lose_waic, deb_lose_wsse), na.rm = TRUE)
winrate_wAIC <- mean(c(waic_win_raw, waic_win_deb), na.rm = TRUE)
winrate_wSSE <- mean(c(wsse_win_raw, wsse_win_deb), na.rm = TRUE)
## RMSE means + CI
a_raw <- ci95(d$rmse_test_raw_waic)
a_deb <- ci95(d$rmse_test_deb_waic)
s_raw <- ci95(d$rmse_test_raw_wsse)
s_deb <- ci95(d$rmse_test_deb_wsse)
## debias deltas (for context)
dA <- ci95(d$rmse_test_deb_waic - d$rmse_test_raw_waic)
dS <- ci95(d$rmse_test_deb_wsse - d$rmse_test_raw_wsse)
## realized true R2
rtr <- ci95(d$R2_true_train); rte <- ci95(d$R2_true_test)
## timings
tt <- ci95(rowMeans(cbind(d$time_sec_waic, d$time_sec_wsse), na.rm = TRUE))
data.frame(
model = d$model[1],
R2_target = d$R2_target[1],
ratio_bin = d$ratio_bin[1],
above_boundary = d$above_boundary[1],
R2_true_train_mean = rtr["mean"], R2_true_test_mean = rte["mean"],
rmse_raw_waic_mean = a_raw["mean"], rmse_raw_waic_ci95_lo = a_raw["lo"], rmse_raw_waic_ci95_hi = a_raw["hi"],
rmse_deb_waic_mean = a_deb["mean"], rmse_deb_waic_ci95_lo = a_deb["lo"], rmse_deb_waic_ci95_hi = a_deb["hi"],
rmse_raw_wsse_mean = s_raw["mean"], rmse_raw_wsse_ci95_lo = s_raw["lo"], rmse_raw_wsse_ci95_hi = s_raw["hi"],
rmse_deb_wsse_mean = s_deb["mean"], rmse_deb_wsse_ci95_lo = s_deb["lo"], rmse_deb_wsse_ci95_hi = s_deb["hi"],
## the four requested winrates
winrate_debias_TRUE = winrate_debias_TRUE,
winrate_debias_FALSE = winrate_debias_FALSE,
winrate_wAIC = winrate_wAIC,
winrate_wSSE = winrate_wSSE,
## debias deltas by objective (context)
deb_raw_diff_AIC_mean = dA["mean"], deb_raw_diff_AIC_ci95_lo = dA["lo"], deb_raw_diff_AIC_ci95_hi = dA["hi"],
deb_raw_diff_SSE_mean = dS["mean"], deb_raw_diff_SSE_ci95_lo = dS["lo"], deb_raw_diff_SSE_ci95_hi = dS["hi"],
mean_time = tt["mean"],
row.names = NULL,
check.names = FALSE
)
}))
rownames(summ) <- NULL
summ <- summ[order(summ$model, summ$R2_target, summ$ratio_bin, summ$above_boundary), ]
## Leakage summary
cat("\n---- Leakage summary (should be ~100%) ----\n")
if (!is.null(leak_df)) {
print(aggregate(cbind(ok, ok_disjoint, coef_close) ~ model, data = leak_df,
FUN = function(x) mean(x, na.rm=TRUE)), row.names = FALSE)
} else {
cat("No leakage samples collected.\n")
}
## Print compact summary columns
ord <- c("model","R2_target","ratio_bin","above_boundary",
"R2_true_train_mean","R2_true_test_mean",
"rmse_raw_waic_mean","rmse_deb_waic_mean",
"rmse_raw_wsse_mean","rmse_deb_wsse_mean",
"winrate_debias_TRUE","winrate_debias_FALSE",
"winrate_wAIC","winrate_wSSE",
"deb_raw_diff_AIC_mean","deb_raw_diff_AIC_ci95_lo","deb_raw_diff_AIC_ci95_hi",
"deb_raw_diff_SSE_mean","deb_raw_diff_SSE_ci95_lo","deb_raw_diff_SSE_ci95_hi",
"mean_time")
cat("\n================ SUMMARY (with 4 win-rates) ================\n")
print(summ[, ord], row.names = FALSE, digits = 4)
## ------------------ Plots (optional) ------------------
## Debias − Raw (negative helps) for each objective
plot_df <- within(res, {
deb_raw_AIC <- rmse_test_deb_waic - rmse_test_raw_waic
deb_raw_SSE <- rmse_test_deb_wsse - rmse_test_raw_wsse
})
ggA <- ggplot(plot_df, aes(ratio_n_over_p, deb_raw_AIC)) +
geom_hline(yintercept = 0, linetype = 2) +
geom_point(alpha = 0.2) +
geom_smooth(se = TRUE, method = "loess", span = 0.7) +
facet_grid(R2_target ~ model) +
labs(x = expression(paste(n[train[used]], " / ", p[full])),
y = "Debias − Raw (TEST RMSE) under wAIC",
title = "When does debiasing help? (wAIC)") +
theme_bw()
ggS <- ggplot(plot_df, aes(ratio_n_over_p, deb_raw_SSE)) +
geom_hline(yintercept = 0, linetype = 2) +
geom_point(alpha = 0.2) +
geom_smooth(se = TRUE, method = "loess", span = 0.7) +
facet_grid(R2_target ~ model) +
labs(x = expression(paste(n[train[used]], " / ", p[full])),
y = "Debias − Raw (TEST RMSE) under wSSE",
title = "When does debiasing help? (wSSE)") +
theme_bw()
print(ggA)
print(ggS)
## ------------------ Export (for your own modeling) ------------------
keep_cols <- c("p","p_full","model","rho","R2_target",
"n_train","n_train_used","n_test",
"ratio_n_over_p","above_boundary",
"R2_true_train","R2_true_test",
"rmse_test_raw_waic","rmse_test_deb_waic",
"rmse_test_raw_wsse","rmse_test_deb_wsse",
"time_sec_waic","time_sec_wsse")
res_export <- res[, keep_cols]
cat("\nFirst few rows of res_export:\n")
print(utils::head(res_export), row.names = FALSE)
## Write to CSV if you like:
## write.csv(res_export, "res_export.csv", row.names = FALSE)
## ---- Oneway-style plots: block on Row and show ANOM-like needles --------
library(dplyr)
library(tidyr)
library(ggplot2)
library(patchwork)
# If your results frame is called res_export, just do:
# res <- res_export
# 1) Long format with a Row (block) ID and log(RMSE)
df <- res_export %>%
mutate(Row = row_number()) %>% # block label = run id
pivot_longer(
cols = c(rmse_test_deb_waic, rmse_test_deb_wsse,
rmse_test_raw_waic, rmse_test_raw_wsse),
names_to = "settings",
values_to = "rmse"
) %>%
mutate(
settings = factor(settings,
levels = c("rmse_test_deb_waic","rmse_test_deb_wsse",
"rmse_test_raw_waic","rmse_test_raw_wsse")),
lmrse = log(pmax(rmse, .Machine$double.eps))
)
# 2) Block-centering (remove Row mean, then add back overall mean so the scale is familiar)
overall_mean <- mean(df$lmrse, na.rm = TRUE)
df <- df %>%
group_by(Row) %>%
mutate(lmrse_block = lmrse - mean(lmrse, na.rm = TRUE) + overall_mean) %>%
ungroup()
# 3) Fit two-way fixed effects model to get a pooled MSE (settings + block)
fit <- lm(lmrse_block ~ settings + factor(Row), data = df)
MSE <- summary(fit)$sigma^2
df_resid <- df.residual(fit)
# Effective sample size per setting (usually each setting appears once per Row)
n_i <- as.numeric(table(df$settings))
SE_i <- sqrt(MSE / n_i) # SE for each setting mean (RCBD-style)
names(SE_i) <- names(table(df$settings))
# 4) Means per setting and an ANOM-like CI
alpha <- 0.05
tcrit <- qt(1 - alpha/2, df = df_resid)
means <- df %>%
group_by(settings) %>%
summarise(mean_lmrse = mean(lmrse_block, na.rm = TRUE), .groups = "drop") %>%
mutate(
SE = SE_i[as.character(settings)],
lo = mean_lmrse - tcrit * SE,
hi = mean_lmrse + tcrit * SE
)
# Grand mean and a constant decision band
grand <- mean(df$lmrse_block, na.rm = TRUE)
SE_bar <- mean(SE_i) # one band for the panel
UDL <- grand + tcrit * SE_bar
LDL <- grand - tcrit * SE_bar
band_df <- data.frame(LDL = LDL, UDL = UDL)
# 5) Panel 1: “Block Centered lmrse” (violin + dots)
p1 <- ggplot(df, aes(settings, lmrse_block)) +
geom_violin(fill = "grey90", color = "grey40", scale = "width") +
geom_jitter(width = 0.10, height = 0, alpha = 0.25, size = 1) +
geom_hline(yintercept = grand, linetype = 2, color = "grey35") +
labs(title = "Oneway Analysis of lmrse by settings",
subtitle = "Block: Row (run ID) — block-centered values",
y = "Block-Centered lmrse", x = "settings") +
theme_bw() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
# 6) Panel 2: “Analysis of Means” (lollipops + CI + decision band)
p2 <- ggplot(means, aes(settings, mean_lmrse)) +
# decision band
geom_rect(data = band_df,
aes(xmin = -Inf, xmax = Inf, ymin = LDL, ymax = UDL),
inherit.aes = FALSE, alpha = 0.12, fill = "skyblue") +
geom_hline(yintercept = grand, color = "grey25") +
# “needles”
geom_segment(aes(x = settings, xend = settings, y = grand, yend = mean_lmrse),
linewidth = 0.6, color = "black") +
# point + CI
geom_point(size = 3, color = "firebrick") +
geom_errorbar(aes(ymin = lo, ymax = hi), width = 0.15) +
annotate("text", x = Inf, y = UDL, hjust = 1.05, vjust = -0.4,
label = sprintf("UDL=%.4f", UDL), size = 3.3) +
annotate("text", x = Inf, y = grand, hjust = 1.05, vjust = -0.4,
label = sprintf("Avg = %.4f", grand), size = 3.3) +
annotate("text", x = Inf, y = LDL, hjust = 1.05, vjust = 1.2,
label = sprintf("LDL=%.4f", LDL), size = 3.3) +
labs(title = "Analysis of Means (ANOM-style)",
subtitle = expression(paste("95% t-band; pooled MSE from ",
"lm(lmrse_block ~ settings + block)")),
y = "lmrse", x = "settings") +
coord_cartesian(ylim = range(c(means$lo, means$hi, LDL, UDL))) +
theme_bw() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
# 7)
(p1 / p2) + plot_annotation(
title = "Blocked Oneway (Row as Block) for lmrse vs. settings",
theme = theme(plot.title = element_text(face = "bold"))
)
Simulation-blocked Residual LRMSE for different objective functions and debiasing
We ran a larger simulation study spanning mixtures of factor structures (main-effects, interactions, quadratics), signal levels (R² ∈ {0.30, 0.50, 0.70, 0.90}), factor correlations ρ ∈ {−0.9, −0.5, 0, 0.5, 0.9}, sparsity levels (density ∈ {0.05, 0.10, 0.20, 0.30, 0.50, 0.75}), and a range of undersaturated to oversaturated settings via offsets in n_used − p_full. All fits used standardize=TRUE. Both SVEM and cv.glmnet searched α ∈ {0, 0.5, 1}. SVEM used nBoot=200 and no debiasing.
Headline findings
Among SVEM objectives, AICc is the most reliable default. Across the full grid, the geometric-mean (GM) RMSE ratios relative to cv.glmnet were:
Method (vs CV) | GM RMSE ratio |
---|---|
SVEM–AIC | approx 1.049 |
SVEM–AICc | approx 1.019 |
SVEM–wSSE | approx 1.200 |
So AICc improves meaningfully over AIC and wSSE in aggregate, reducing tail risk without giving up much average accuracy.
cv.glmnet remains a strong overall baseline. On the same grid, cv.glmnet had the lowest average LRMSE and the most first-place finishes (rank counts: CV approx 8.6k firsts vs AICc approx 5.5k; AIC and wSSE lag behind). A simple regret analysis (“always pick X”) shows Always-CV has slightly lower mean regret than Always-AICc overall.
Where AICc shines:
Tail behavior matters. When AIC or wSSE lose, they tend to lose big (heavy right tails in the RMSE ratio). AICc’s losses are rarer and smaller; its 90th percentile loss is substantially lower than AIC’s and far below wSSE’s.
Speed trade-off: With nBoot=200, SVEM is slower than cv.glmnet (GM speedup approx 0.04, i.e., CV is ~25× faster on a geometric mean basis). If runtime is critical and the regime is not severely undersaturated/low-signal, cv.glmnet is a pragmatic default.
Guidance
objective="wAICc"
. It is the best-performing and safest
objective inside SVEM across diverse conditions.Notes: All comparisons use disjoint train/test and log-RMSE
summaries; both SVEM and CV searched α ∈ {0, 0.5, 1} with
standardize=TRUE
. Debiasing was off
because it generally degraded test performance in these settings.
Results are representative of the tested grid; users should feel free to
focus simulations on their local regimes of interest.
How we compute wAIC/wAICc (and what if n < k?)
For each λ on the glmnet path, we score models on the validation side using SVEM’s fractional weights. Let w_i be validation weights normalized so Σw_i approx n. We define weighted MSE as MSE_w = SSE_w / Σw_i and replace n with an effective validation size n_eff = (Σw_i)^2 / Σw_i^2. Degrees of freedom k come from glmnet’s df plus the intercept. We then form:
To avoid pathologies we use a conservative k_eff = max(1, min(k, n−2)) and we mask any λ where the correction is undefined: if k ≥ n or k_eff ≥ n_eff − 1, that λ is marked inadmissible (score = +∞) and cannot be selected. So if someone asks, “how do you compute AICc when n < k?”, the answer is: we don’t—those candidates are explicitly excluded under the mask, and we only compare AICc where the correction is valid.
Design & metric. We simulated mixture factors A–D (sum-to-one) and a two-level categorical factor E. Each method proposes an optimal recipe (A–E), and we score it by True.CDF (closer to 1 is better). We analyze within-block comparisons where a “block” is one simulated dataset (File.Name), so all methods face the same data. Methods: SVEM objectives AIC, AICc, SSE; baselines cv.glmnet (reported as “glmnet”) and JMP’s SVEM-LASSO_w_int.
Debiasing note. JMP’s SVEM-LASSO_w_int performs a post-lasso unpenalized refit (“debiasing”) by design and this cannot be turned off. SVEMnet runs with debias = FALSE (our earlier LRMSE grid showed debiasing often hurt test performance near/under the n_used ≲ p_full boundary). Despite that asymmetry, the True.CDF analysis below shows SVEMnet(AICc) tracks—and slightly edges—JMP on average regret and mean delta.
Regret definition (higher-is-better metric like True.CDF). For each block b and method m, score_{m,b} = True.CDF_{m,b} best_b = max_m score_{m,b} regret_{m,b} = best_b − score_{m,b} (≥ 0; 0 if the method is best/tied best) We report mean regret across blocks. (For lower-is-better metrics like RMSE, either negate the metric or take score − min_m score.)
SVEM objective choice (AIC vs AICc vs SSE). Across 2,296 paired blocks:
• Winner counts (SVEM-only): AICc = 1155, AIC = 659, SSE = 482.
• Paired True.CDF (means/medians):
Method | Mean True.CDF | Median True.CDF |
---|---|---|
AIC | 0.8375 | 0.9854 |
AICc | 0.8694 | 0.9901 |
SSE | 0.8170 | 0.9719 |
• Paired deltas (positive favors first): – AICc − AIC: mean +0.03195, median approx 0, [10%,90%] = [−0.1789, +0.3566] – AICc − SSE: mean +0.05244, median approx 0, [10%,90%] = [−0.1944, +0.4683] – AIC − SSE: mean +0.02050, median approx 0, [10%,90%] = [−0.00261, +0.02687]
• Mean regret vs best SVEM objective (lower is better):
Always pick… | Mean regret |
---|---|
AIC | 0.0908 |
AICc | 0.0589 |
SSE | 0.1110 |
• By n_total × R² (means + AICc win rates): AICc tends to dominate AIC and SSE more clearly as R² rises and/or n_total increases. Examples:
n_total | R² | mean(AIC) | mean(AICc) | mean(SSE) | WR[AICc>AIC] | WR[AICc>SSE] |
---|---|---|---|---|---|---|
16 | 0.3 | 0.728 | 0.801 | 0.719 | 0.572 | 0.591 |
26 | 0.3 | 0.787 | 0.883 | 0.717 | 0.610 | 0.697 |
26 | 0.7 | 0.939 | 0.976 | 0.896 | 0.655 | 0.721 |
SVEM AICc vs cv.glmnet. Overall, cv.glmnet remains a strong, fast baseline; AICc is close and sometimes wins in tighter/low-signal strata. • Win rate: P(AICc > CV) = 0.453. • Paired delta: mean(AICc − CV) = −0.00443, median = 0. • Mean regret: Always-CV 0.0491 vs Always-AICc 0.0536. • Stratified highlights: AICc slightly beats CV at n_total=16, R²∈{0.3,0.5} (means: 0.801 vs 0.781; 0.851 vs 0.843), breaks even at n_total=26, R²approx0.5–0.7, and trails CV in many high-R² bins.
SVEM AICc vs JMP’s SVEM-LASSO_w_int (debiasing ON). AICc closely tracks JMP and is modestly better on average. • Win rate: P(AICc > JMP) = 0.497 (essentially a tie). • Paired delta: mean(AICc − JMP) = +0.0111, median = 0. • Mean regret: Always-JMP 0.0668 vs Always-AICc 0.0556 (AICc smaller).
League table (all blocks):
Metric | AIC | AICc | SSE | CV | JMP |
---|---|---|---|---|---|
Mean True.CDF | 0.8375 | 0.8694 | 0.8170 | 0.8739 | 0.8583 |
Median True.CDF | 0.9854 | 0.9901 | 0.9719 | 0.9907 | 0.9898 |
N blocks (non-NA) | 2296 | 2296 | 2296 | 2296 | 2296 |
Takeaways for SVEMnet. • Default objective: Use objective = “wAICc”. It wins most blocks among SVEM objectives, has the best average/median True.CDF, and the lowest regret. SSE is not recommended as a default in this setting. • Validation vs baselines: AICc is very close to cv.glmnet on this metric (slightly behind on average) and slightly ahead of JMP’s debiased SVEM-LASSO_w_int on mean regret and mean delta. • When to prefer CV: If you want the best average across all scenarios and speed, cv.glmnet remains a pragmatic choice. If you want SVEM’s ensembling and more robust tails, use AICc inside SVEM.
Relation to our earlier LRMSE study. The earlier grid (evaluated on test RMSE / LRMSE) also found AICc > AIC, wSSE inside SVEM and cv.glmnet slightly ahead of AICc on average (GM ratio AICc/CV approx 1.019). This True.CDF study—implemented in a different simulation environment and targeting a different metric—confirms the same hierarchy: inside SVEM, AICc is the most reliable objective; cv.glmnet edges AICc on average but by a small margin. The new study additionally verifies that SVEMnet(AICc, no debias) performs on par with (and slightly lowers regret vs) JMP’s debiased SVEM-LASSO_w_int.
## ============================================================
## SVEM (wAIC / wAICc / wSSE, no debias) vs glmnet_with_cv (defaults)
## Focus: small training sizes; compare test RMSE head-to-head
## Parallelized with future.apply (5 workers)
## ============================================================
## ----- Packages -----
if (!requireNamespace("SVEMnet", quietly = TRUE)) install.packages("SVEMnet")
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")
if (!requireNamespace("tidyr", quietly = TRUE)) install.packages("tidyr")
if (!requireNamespace("patchwork", quietly = TRUE)) install.packages("patchwork")
if (!requireNamespace("future.apply", quietly = TRUE)) install.packages("future.apply")
# Optional progress bar:
# if (!requireNamespace("progressr", quietly = TRUE)) install.packages("progressr")
library(SVEMnet)
library(ggplot2)
library(dplyr)
library(tidyr)
library(patchwork)
library(future.apply)
library(progressr) # uncomment if you want progress bars
## ----- Controls -----
set.seed(425)
REPS <- 20
NBOOT <- 200
R2_GRID <- c(0.30, 0.50, 0.70, 0.90)
RHO_GRID <- c(0, -0.5, 0.5, -0.9, 0.9) # correlation among factors
P_GRID <- c(3, 5, 7)
MODELS <- c("main_effects","main_plus_int", "full_quadratic")
NTEST <- 1000
DENSITY_GRID <- c(.05, .10, .20, .30, .50,.75) # fraction of non-intercept terms active
N_WORKERS <- 7
## ----- Helpers -----
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
safe_predict_svem <- function(fit, newdata, agg = "mean") {
out <- try(predict(fit, newdata = newdata, debias = FALSE, agg = agg), silent = TRUE)
if (inherits(out, "try-error")) return(rep(NA_real_, nrow(newdata)))
as.numeric(out)
}
safe_predict_cv <- function(fitcv, newdata) {
out <- try(predict_cv(fitcv, newdata = newdata, debias = FALSE), silent = TRUE)
if (inherits(out, "try-error")) return(rep(NA_real_, nrow(newdata)))
as.numeric(out)
}
## Count p_full (columns in model matrix – intercept)
count_p_full <- function(fm, p) {
rhs <- stats::reformulate(attr(stats::delete.response(stats::terms(fm)), "term.labels"))
tmp <- as.data.frame(matrix(rnorm(4 * p), nrow = 4))
names(tmp) <- paste0("X", seq_len(p))
mm <- model.matrix(rhs, data = tmp)
sum(colnames(mm) != "(Intercept)")
}
build_formulas <- function(p) {
vars <- paste0("X", seq_len(p))
main_terms <- paste(vars, collapse = " + ")
int_terms <- paste0("(", main_terms, ")^2")
sq_terms <- paste(sprintf("I(%s^2)", vars), collapse = " + ")
list(
main_effects = as.formula(paste0("y ~ ", main_terms)),
main_plus_int = as.formula(paste0("y ~ ", int_terms)),
full_quadratic = as.formula(paste0("y ~ ", int_terms, " + ", sq_terms))
)
}
gen_X <- function(n, p, rho = 0.5) {
Z <- matrix(rnorm(n * p), n, p)
if (p >= 2 && abs(rho) > 0) {
for (j in 2:p) {
Z[, j] <- rho * scale(Z[, j - 1], TRUE, TRUE) +
sqrt(1 - rho^2) * scale(Z[, j], TRUE, TRUE)
}
}
X <- Z
colnames(X) <- paste0("X", seq_len(p))
as.data.frame(X)
}
## Density-driven truth generator using the selected formula fm
make_sparse_data_R2 <- function(n, p, rho, target_R2, fm, density = 0.2, seed = NULL) {
if (!is.null(seed)) set.seed(seed)
Xdf <- gen_X(n, p, rho); names(Xdf) <- paste0("X", seq_len(p))
terms_obj <- stats::delete.response(stats::terms(fm, data = transform(Xdf, y = 0)))
mf <- model.frame(terms_obj, data = Xdf)
MM <- model.matrix(terms_obj, mf) # n x (M+1) with intercept at column 1
M <- ncol(MM) - 1L
if (M < 1L) stop("Formula produces no predictors beyond intercept.")
n_active <- max(1L, floor(density * M))
active_idx <- sample.int(M, n_active, replace = FALSE) + 1L # skip intercept
beta <- numeric(M + 1L)
beta[active_idx] <- rexp(n_active) - rexp(n_active)
y_signal <- drop(MM %*% beta)
sd_sig <- sqrt(max(var(y_signal), .Machine$double.eps))
sd_eps <- sd_sig * sqrt((1 - target_R2) / target_R2)
y <- y_signal + rnorm(n, 0, sd_eps)
out <- cbind.data.frame(y = y, Xdf)
out$y_signal <- y_signal
rownames(out) <- sprintf("row%06d", seq_len(n))
out
}
## Convenience wrapper to fit SVEM for a given objective (namespaced; robust)
fit_and_predict_svem <- function(fm, train_used, test_df, objective, nBoot = NBOOT, agg = "mean") {
t0 <- proc.time()[3]
mod <- try(SVEMnet::SVEMnet(
formula = fm, data = train_used,
nBoot = nBoot,
glmnet_alpha = c(0,.5,1),
weight_scheme = "SVEM",
objective = objective,
standardize = TRUE
), silent = TRUE)
elapsed <- round(proc.time()[3] - t0, 2)
if (inherits(mod, "try-error")) {
preds <- rep(NA_real_, nrow(test_df))
} else {
preds <- safe_predict_svem(mod, test_df, agg = agg)
}
list(pred = preds, time = elapsed)
}
## ----- Build scenario grid -----
grid <- expand.grid(
p = P_GRID,
model = MODELS,
rho = RHO_GRID,
R2_tgt = R2_GRID,
dens = DENSITY_GRID,
rep = seq_len(REPS),
stringsAsFactors = FALSE
)
## ----- Parallel plan (5 workers) -----
old_plan <- future::plan()
on.exit(future::plan(old_plan), add = TRUE)
future::plan(future::multisession, workers = N_WORKERS)
## Optional progress bar:
# handlers(global = TRUE)
# with_progress({
# pbar <- progressor(along = seq_len(nrow(grid)))
# ... call future_lapply with on.exit(pbar()) ...
# })
## ----- One scenario run -----
run_one <- function(row) {
p <- row$p
model <- row$model
rho <- row$rho
R2_tgt <- row$R2_tgt
dens <- row$dens
repi <- row$rep
fms <- build_formulas(p)
fm <- fms[[model]]
p_full <- count_p_full(fm, p)
# n around the boundary; clamp to [12, 40]
n_grid <- sort(unique(pmin(pmax(p_full + c(-12,-8,-6,-4,-2,0,2,4,6,8,12), 12), 40)))
# per-row deterministic seed (so sampling is reproducible too)
seed_i <- 3000 +
19 * repi + 100 * p +
1000 * match(model, MODELS) +
round(1000 * R2_tgt) +
round(1000 * rho) +
round(1000 * dens)
set.seed(seed_i)
n_tr <- sample(n_grid, 1L)
n_te <- NTEST
df <- make_sparse_data_R2(
n = n_tr + n_te, p = p, rho = rho, target_R2 = R2_tgt,
fm = fm, density = dens, seed = seed_i + 1L
)
set.seed(seed_i + 2L)
idx <- sample(seq_len(nrow(df)), size = n_tr)
train_df <- df[idx, ]
test_df <- df[-idx, ]
keep <- complete.cases(model.frame(fm, data = train_df, na.action = stats::na.pass))
if (sum(keep) < 2) return(NULL)
train_used <- train_df[keep, , drop = FALSE]
n_used <- nrow(train_used)
r2_true <- function(d) var(d$y_signal) / var(d$y)
R2_train_true <- r2_true(train_used)
R2_test_true <- r2_true(test_df)
# SVEM variants (alpha=1; no debias)
sv_waic <- fit_and_predict_svem(fm, train_used, test_df, "wAIC", nBoot = NBOOT)
sv_waicc <- fit_and_predict_svem(fm, train_used, test_df, "wAICc", nBoot = NBOOT)
sv_wsse <- fit_and_predict_svem(fm, train_used, test_df, "wSSE", nBoot = NBOOT)
# glmnet_with_cv (alpha=1; no debias)
t1 <- proc.time()[3]
fit_cv <- glmnet_with_cv(
formula = fm, data = train_used,
glmnet_alpha = c(0,.5,1),
standardize = TRUE
)
pr_te_cv <- safe_predict_cv(fit_cv, test_df)
time_cv <- round(proc.time()[3] - t1, 2)
data.frame(
p = p, model = model, rho = rho, R2_target = R2_tgt, density = dens,
p_full = p_full, n_train = n_tr, n_train_used = n_used, n_test = n_te,
n_train_minus_p_full = n_tr - p_full,
ratio_n_over_p = n_used / p_full,
above_boundary = as.integer(n_used > p_full),
R2_true_train = R2_train_true, R2_true_test = R2_test_true,
rmse_test_svem_waic = rmse(test_df$y, sv_waic$pred),
rmse_test_svem_waicc = rmse(test_df$y, sv_waicc$pred),
rmse_test_svem_wsse = rmse(test_df$y, sv_wsse$pred),
rmse_test_glmnet_cv = rmse(test_df$y, pr_te_cv),
time_sec_svem_waic = sv_waic$time,
time_sec_svem_waicc = sv_waicc$time,
time_sec_svem_wsse = sv_wsse$time,
time_sec_cv = time_cv,
stringsAsFactors = FALSE
)
}
## ----- Run in parallel -----
res_list <- future.apply::future_lapply(
seq_len(nrow(grid)),
function(i) {
# if using progressr:
# on.exit(pbar(), add = TRUE)
run_one(grid[i, , drop = FALSE])
},
future.seed = TRUE
)
res <- dplyr::bind_rows(Filter(Negate(is.null), res_list))
stopifnot(nrow(res) > 0)
## ---------- Summary & win-rates (robust, single block) ----------
stopifnot(all(c("rmse_test_svem_waic","rmse_test_svem_waicc","rmse_test_svem_wsse","rmse_test_glmnet_cv") %in% names(res)))
# Long format for plotting and summaries
res_long <- res %>%
mutate(
Row = dplyr::row_number(),
lmrse_waic = log(pmax(rmse_test_svem_waic, .Machine$double.eps)),
lmrse_waicc = log(pmax(rmse_test_svem_waicc, .Machine$double.eps)),
lmrse_wsse = log(pmax(rmse_test_svem_wsse, .Machine$double.eps)),
lmrse_cv = log(pmax(rmse_test_glmnet_cv, .Machine$double.eps))
) %>%
tidyr::pivot_longer(
cols = c(lmrse_waic, lmrse_waicc, lmrse_wsse, lmrse_cv),
names_to = "metric", values_to = "lmrse"
) %>%
mutate(
settings = dplyr::recode(metric,
lmrse_waic = "SVEM_wAIC",
lmrse_waicc = "SVEM_wAICc",
lmrse_wsse = "SVEM_wSSE",
lmrse_cv = "glmnet_cv"),
settings = factor(settings, levels = c("SVEM_wAIC","SVEM_wAICc","SVEM_wSSE","glmnet_cv"))
)
# Head-to-head win-rates vs glmnet_cv (paired by row)
win_rates_vs_cv <- res %>%
summarise(
SVEM_wAIC = mean(rmse_test_svem_waic < rmse_test_glmnet_cv, na.rm = TRUE),
SVEM_wAICc = mean(rmse_test_svem_waicc < rmse_test_glmnet_cv, na.rm = TRUE),
SVEM_wSSE = mean(rmse_test_svem_wsse < rmse_test_glmnet_cv, na.rm = TRUE)
) %>%
tidyr::pivot_longer(everything(), names_to = "settings", values_to = "winrate_vs_cv")
# Overall summary (log RMSE)
summ <- res_long %>%
group_by(settings) %>%
summarise(
mean_lmrse = mean(lmrse, na.rm = TRUE),
sd_lmrse = sd(lmrse, na.rm = TRUE),
n = dplyr::n(),
se = sd_lmrse / sqrt(pmax(n, 1)),
ci_lo = mean_lmrse - 1.96 * se,
ci_hi = mean_lmrse + 1.96 * se,
.groups = "drop"
) %>%
left_join(win_rates_vs_cv, by = "settings")
cat("\n================ SUMMARY (log RMSE) ================\n")
print(summ, row.names = FALSE, digits = 4)
## ---------- By-boundary summaries (attach boundary to long) ----------
if ("above_boundary" %in% names(res)) {
bound_key <- res %>%
mutate(Row = dplyr::row_number()) %>%
select(Row, above_boundary)
} else if (all(c("n_train_used","p_full") %in% names(res))) {
bound_key <- res %>%
mutate(Row = dplyr::row_number(),
above_boundary = as.integer(n_train_used > p_full)) %>%
select(Row, above_boundary)
} else if ("ratio_n_over_p" %in% names(res)) {
bound_key <- res %>%
mutate(Row = dplyr::row_number(),
above_boundary = as.integer(ratio_n_over_p > 1)) %>%
select(Row, above_boundary)
} else {
bound_key <- res %>%
mutate(Row = dplyr::row_number(), above_boundary = NA_integer_) %>%
select(Row, above_boundary)
}
res_long <- res_long %>% left_join(bound_key, by = "Row")
if ("above_boundary.x" %in% names(res_long) || "above_boundary.y" %in% names(res_long)) {
res_long <- res_long %>%
dplyr::mutate(
above_boundary = dplyr::coalesce(
if ("above_boundary.y" %in% names(.)) .data$above_boundary.y else NA_integer_,
if ("above_boundary.x" %in% names(.)) .data$above_boundary.x else NA_integer_
)
) %>%
dplyr::select(-dplyr::any_of(c("above_boundary.x","above_boundary.y")))
}
# Means of log RMSE by boundary & setting
summ_by_boundary <- res_long %>%
group_by(above_boundary, settings) %>%
summarise(
mean_lmrse = mean(lmrse, na.rm = TRUE),
sd_lmrse = sd(lmrse, na.rm = TRUE),
n = dplyr::n(),
.groups = "drop"
)
cat("\n---- Means by boundary (n_train_used > p_full) ----\n")
print(summ_by_boundary, row.names = FALSE, digits = 4)
# Boundary-specific win-rate for each SVEM variant vs glmnet
win_by_boundary <- res %>%
mutate(
above_boundary = if ("above_boundary" %in% names(.)) above_boundary
else if (all(c("n_train_used","p_full") %in% names(.))) as.integer(n_train_used > p_full)
else if ("ratio_n_over_p" %in% names(.)) as.integer(ratio_n_over_p > 1)
else NA_integer_,
win_waic = rmse_test_svem_waic < rmse_test_glmnet_cv,
win_waicc = rmse_test_svem_waicc < rmse_test_glmnet_cv,
win_wsse = rmse_test_svem_wsse < rmse_test_glmnet_cv
) %>%
group_by(above_boundary) %>%
summarise(
winrate_waic = mean(win_waic, na.rm = TRUE),
winrate_waicc = mean(win_waicc, na.rm = TRUE),
winrate_wsse = mean(win_wsse, na.rm = TRUE),
n = sum(!is.na(win_waic) | !is.na(win_waicc) | !is.na(win_wsse)),
.groups = "drop"
)
cat("\n---- Win-rate vs glmnet (by boundary) ----\n")
print(win_by_boundary, row.names = FALSE, digits = 4)
## ========== ANOM-style, row-blocked plot ==========
res_long_anom <- res_long %>%
dplyr::mutate(Row = as.factor(Row)) %>%
dplyr::filter(is.finite(lmrse))
fit_anom <- lm(lmrse ~ settings + Row, data = res_long_anom)
aov_tbl <- anova(fit_anom)
MSE <- aov_tbl[["Mean Sq"]][nrow(aov_tbl)]
df_res <- aov_tbl[["Df"]][nrow(aov_tbl)]
t_methods <- nlevels(res_long_anom$settings)
b_blocks <- nlevels(res_long_anom$Row)
grand_mu <- mean(res_long_anom$lmrse, na.rm = TRUE)
# Var(ȳ_i. − ȳ..) = σ^2 * (t − 1) / (t * b) under RBD
se_group <- sqrt(MSE * (t_methods - 1) / (t_methods * b_blocks))
alpha <- 0.05
crit <- qt(1 - alpha / (2 * t_methods), df = df_res)
UCL <- grand_mu + crit * se_group
LCL <- grand_mu - crit * se_group
means_df <- res_long_anom %>%
dplyr::group_by(settings) %>%
dplyr::summarise(mean_lmrse = mean(lmrse, na.rm = TRUE), .groups = "drop") %>%
dplyr::mutate(flag = dplyr::case_when(
mean_lmrse > UCL ~ "Above UCL",
mean_lmrse < LCL ~ "Below LCL",
TRUE ~ "Within Limits"
))
p_anom <- ggplot(means_df, aes(x = settings, y = mean_lmrse)) +
geom_hline(yintercept = grand_mu, linetype = 2) +
geom_hline(yintercept = UCL, linetype = 3) +
geom_hline(yintercept = LCL, linetype = 3) +
geom_segment(aes(xend = settings, y = grand_mu, yend = mean_lmrse), linewidth = 1) +
geom_point(aes(color = flag), size = 3) +
scale_color_manual(values = c("Within Limits" = "black",
"Above UCL" = "red",
"Below LCL" = "red")) +
labs(
title = "Blocked ANOM-style plot for lmrse (Row = block)",
subtitle = sprintf("Grand mean = %.3f | Limits = [%.3f, %.3f] | df = %d",
grand_mu, LCL, UCL, df_res),
x = NULL, y = "Mean lmrse", color = NULL
) +
theme_bw() +
theme(plot.title = element_text(face = "bold"))
p_pairs <- ggplot(res_long_anom, aes(x = settings, y = lmrse, group = Row)) +
geom_line(alpha = 0.25) +
geom_point(alpha = 0.6, position = position_dodge(width = 0.05)) +
stat_summary(fun = mean, geom = "point", size = 3, color = "black") +
labs(title = "Paired runs by Row", x = NULL, y = "lmrse") +
theme_bw()
(p_anom / p_pairs)
## ---------- Geometric-mean RMSE ratio (SVEM wAIC vs CV) ----------
eps <- .Machine$double.eps
rmse_ratio <- with(res, pmax(rmse_test_svem_waic, eps) / pmax(rmse_test_glmnet_cv, eps))
logdiff <- log(rmse_ratio)
gm_ratio <- exp(mean(logdiff, na.rm = TRUE)) # primary
q75 <- quantile(rmse_ratio, 0.75, na.rm = TRUE) # tail
q90 <- quantile(rmse_ratio, 0.90, na.rm = TRUE)
winrate <- mean(rmse_ratio < 1, na.rm = TRUE)
# simple bootstrap CI for GM ratio
set.seed(123)
B <- 2000
idx <- replicate(B, sample.int(length(logdiff), replace = TRUE))
gm_boot <- apply(idx, 2, function(ii) exp(mean(logdiff[ii], na.rm = TRUE)))
gm_ci <- quantile(gm_boot, c(0.025, 0.975), na.rm = TRUE)
cat(sprintf(
"\nPaired geometric-mean RMSE ratio (SVEM/CV): %.3f (95%% CI %.3f – %.3f)\n",
gm_ratio, gm_ci[1], gm_ci[2]
))
cat(sprintf("Win rate (SVEM better): %.1f%%\n", 100*winrate))
cat(sprintf("Tail ratios: 75th=%.3f, 90th=%.3f\n", q75, q90))
# Trim top/bottom 5% of logdiff before averaging
trim <- function(x, p=0.05) x[x >= quantile(x, p, na.rm=TRUE) & x <= quantile(x, 1-p, na.rm=TRUE)]
gm_ratio_trim <- exp(mean(trim(logdiff), na.rm = TRUE))
cat(sprintf("Trimmed GM ratio (5%%/5%%): %.3f\n", gm_ratio_trim))
Added objective = "wAICc"
.
Chosen default:
SVEMnet(..., objective = "wAICc")
and
predict(..., debias = FALSE)
.
Rationale from focused simulations (head-to-head test RMSE,
disjoint train/test): - AICc vs AIC (inside
SVEM): AICc consistently outperformed AIC. In paired
comparisons on log-RMSE, the geometric-mean (GM) ratio favored AICc
(e.g., GM approx 0.875 when debias=FALSE), indicating materially lower
test error for AICc across diverse settings. -
Debiasing: Turning debias on generally
increased test error for both SVEM and glmnet (typical GM inflation
approx +3%). Debias helped only in high-signal slices
(e.g., R² ≳ 0.7), but tended to hurt near or below the
n/p_full
boundary. As a broad default,
debias=FALSE is safer. - SVEM objectives
hierarchy: wAICc
> wAIC
≫
wSSE
in aggregate performance and tail risk.
wSSE
especially underperformed near boundary regimes. -
glmnet (for context only): cv.glmnet is a strong
baseline and was very close to SVEM–AICc on average. However, since this
package’s goal is to provide a robust SVEM implementation with
consistent defaults across regimes, we select AICc with debias
off as the package default and use glmnet primarily for
comparison.
Key practical implication: Unless you have strong
prior evidence of a high-signal regime and want JMP-style
debiased outputs, keep debias = FALSE
. AICc’s small-sample
penalty stabilizes selection near tight designs and lowers tail losses
without noticeably sacrificing average accuracy.
Default usage:
# Fit with the recommended defaults
fit <- SVEMnet(
y ~ (X1 + X2 + X3)^2 + I(X1^2) + I(X2^2) + I(X3^2),
data = dat,
objective = "wAICc",
nBoot = 200,
glmnet_alpha = c(0, 0.5, 1),
weight_scheme = "SVEM",
standardize = TRUE
)
# Predict with debias turned off (recommended default)
pred <- predict(fit, newdata = dat, debias = FALSE)
## ============================================================
## SVEM (wAICc & wAIC) vs glmnet:
## debias = FALSE vs TRUE for all methods
## Focus: small training sizes; head-to-head test RMSE
## Parallelized with future.apply (N_WORKERS)
## Includes paired analyses + interactions + extra slices
## ============================================================
## ----- Packages -----
if (!requireNamespace("SVEMnet", quietly = TRUE)) install.packages("SVEMnet")
if (!requireNamespace("glmnet", quietly = TRUE)) install.packages("glmnet")
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")
if (!requireNamespace("tidyr", quietly = TRUE)) install.packages("tidyr")
if (!requireNamespace("tibble", quietly = TRUE)) install.packages("tibble")
if (!requireNamespace("patchwork", quietly = TRUE)) install.packages("patchwork")
if (!requireNamespace("future.apply", quietly = TRUE)) install.packages("future.apply")
# Optional progress bar:
# if (!requireNamespace("progressr", quietly = TRUE)) install.packages("progressr")
library(SVEMnet)
library(glmnet) # glmnet_with_cv likely depends on this
library(ggplot2)
library(dplyr)
library(tidyr)
library(tibble)
library(patchwork)
library(future.apply)
# library(progressr)
## ----- Controls -----
set.seed(67)
REPS <- 10
NBOOT <- 200
R2_GRID <- seq(.1, .9, .2)
RHO_GRID <- c(0, -0.9, 0.9)
P_GRID <- c(4, 5, 6, 7)
MODELS <- c("full_quadratic", "full_cubic") # quadratic + cubic truths
NTEST <- 1000
DENSITY_GRID <- seq(.1, .9, .2)
N_WORKERS <- 7
## ----- Helpers -----
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
safe_predict_svem <- function(fit, newdata, debias = FALSE, agg = "mean") {
out <- try(predict(fit, newdata = newdata, debias = debias, agg = agg), silent = TRUE)
if (inherits(out, "try-error")) return(rep(NA_real_, nrow(newdata)))
as.numeric(out)
}
## Robustly call your glmnet_with_cv() + predict_cv() (they must exist in the environment)
safe_predict_cv <- function(object, newdata, debias = FALSE) {
out <- try(predict_cv(object, newdata = newdata, debias = debias), silent = TRUE)
if (inherits(out, "try-error")) return(rep(NA_real_, nrow(newdata)))
as.numeric(out)
}
## Count p_full (columns in model matrix – intercept)
count_p_full <- function(fm, p) {
rhs <- stats::reformulate(attr(stats::delete.response(stats::terms(fm)), "term.labels"))
tmp <- as.data.frame(matrix(rnorm(4 * p), nrow = 4))
names(tmp) <- paste0("X", seq_len(p))
mm <- model.matrix(rhs, data = tmp)
sum(colnames(mm) != "(Intercept)")
}
build_formulas <- function(p) {
vars <- paste0("X", seq_len(p))
main_terms <- paste(vars, collapse = " + ")
int2_terms <- paste0("(", main_terms, ")^2") # up to 2-way interactions
int3_terms <- paste0("(", main_terms, ")^3") # up to 3-way interactions
sq_terms <- paste(sprintf("I(%s^2)", vars), collapse = " + ")
cube_terms <- paste(sprintf("I(%s^3)", vars), collapse = " + ")
list(
main_effects = as.formula(paste0("y ~ ", main_terms)),
main_plus_int = as.formula(paste0("y ~ ", int2_terms)),
full_quadratic = as.formula(paste0("y ~ ", int2_terms, " + ", sq_terms)),
full_cubic = as.formula(paste0("y ~ ", int3_terms, " + ", sq_terms, " + ", cube_terms))
)
}
gen_X <- function(n, p, rho = 0.5) {
Z <- matrix(rnorm(n * p), n, p)
if (p >= 2 && abs(rho) > 0) {
for (j in 2:p) {
Z[, j] <- rho * scale(Z[, j - 1], TRUE, TRUE) +
sqrt(1 - rho^2) * scale(Z[, j], TRUE, TRUE)
}
}
X <- Z
colnames(X) <- paste0("X", seq_len(p))
as.data.frame(X)
}
## Density-driven truth generator using the selected formula fm
make_sparse_data_R2 <- function(n, p, rho, target_R2, fm, density = 0.2, seed = NULL) {
if (!is.null(seed)) set.seed(seed)
Xdf <- gen_X(n, p, rho); names(Xdf) <- paste0("X", seq_len(p))
terms_obj <- stats::delete.response(stats::terms(fm, data = transform(Xdf, y = 0)))
mf <- model.frame(terms_obj, data = Xdf)
MM <- model.matrix(terms_obj, mf) # n x (M+1) with intercept at column 1
M <- ncol(MM) - 1L
if (M < 1L) stop("Formula produces no predictors beyond intercept.")
n_active <- max(1L, floor(density * M))
active_idx <- sample.int(M, n_active, replace = FALSE) + 1L # skip intercept
beta <- numeric(M + 1L)
beta[active_idx] <- rexp(n_active) - rexp(n_active)
y_signal <- drop(MM %*% beta)
sd_sig <- sqrt(max(var(y_signal), .Machine$double.eps))
sd_eps <- sd_sig * sqrt((1 - target_R2) / target_R2)
y <- y_signal + stats::rnorm(n, 0, sd_eps)
out <- cbind.data.frame(y = y, Xdf)
out$y_signal <- y_signal
rownames(out) <- sprintf("row%06d", seq_len(n))
out
}
## SVEM: fit once for a given objective, then predict for debias=FALSE/TRUE
fit_and_predict_obj <- function(fm, train_used, test_df, objective, nBoot = NBOOT, agg = "mean") {
t0 <- proc.time()[3]
mod <- try(SVEMnet::SVEMnet(
formula = fm, data = train_used,
nBoot = nBoot,
glmnet_alpha = c(0, .5, 1),
weight_scheme = "SVEM",
objective = objective, # "wAICc" or "wAIC"
standardize = TRUE
), silent = TRUE)
elapsed <- round(proc.time()[3] - t0, 2)
if (inherits(mod, "try-error")) {
pred_false <- rep(NA_real_, nrow(test_df))
pred_true <- rep(NA_real_, nrow(test_df))
} else {
pred_false <- safe_predict_svem(mod, test_df, debias = FALSE, agg = agg)
pred_true <- safe_predict_svem(mod, test_df, debias = TRUE, agg = agg)
}
list(pred_false = pred_false, pred_true = pred_true, time = elapsed)
}
## glmnet comparator using your glmnet_with_cv() + predict_cv()
fit_and_predict_glmnet_cv <- function(fm, train_used, test_df, alphas = c(0, 0.5, 1)) {
if (!exists("glmnet_with_cv")) stop("glmnet_with_cv() not found in environment.")
if (!exists("predict_cv")) stop("predict_cv() not found in environment.")
t0 <- proc.time()[3]
mod <- try(glmnet_with_cv(formula = fm, data = train_used, glmnet_alpha = alphas), silent = TRUE)
elapsed <- round(proc.time()[3] - t0, 2)
if (inherits(mod, "try-error")) {
return(list(pred_false = rep(NA_real_, nrow(test_df)),
pred_true = rep(NA_real_, nrow(test_df)),
time = elapsed))
}
pred_false <- safe_predict_cv(mod, test_df, debias = FALSE)
pred_true <- safe_predict_cv(mod, test_df, debias = TRUE)
list(pred_false = pred_false, pred_true = pred_true, time = elapsed)
}
## ----- Build scenario grid -----
grid <- expand.grid(
p = P_GRID,
model = MODELS,
rho = RHO_GRID,
R2_tgt = R2_GRID,
dens = DENSITY_GRID,
rep = seq_len(REPS),
stringsAsFactors = FALSE
)
## ----- Parallel plan -----
old_plan <- future::plan()
on.exit(future::plan(old_plan), add = TRUE)
future::plan(future::multisession, workers = N_WORKERS)
## ----- One scenario run -----
run_one <- function(row) {
p <- row$p
model <- row$model
rho <- row$rho
R2_tgt <- row$R2_tgt
dens <- row$dens
repi <- row$rep
fms <- build_formulas(p)
fm <- fms[[model]]
p_full <- count_p_full(fm, p)
# n around the boundary; clamp to [12, 40]
n_grid <- sort(unique(pmin(pmax(p_full + c(-12, -8, -6, -4, -2, 0, 2), 12), 40)))
# per-row deterministic seed
seed_i <- 3000 +
19 * repi + 100 * p +
1000 * match(model, MODELS) +
round(1000 * R2_tgt) +
round(1000 * rho) +
round(1000 * dens)
set.seed(seed_i)
n_tr <- sample(n_grid, 1L)
n_te <- NTEST
df <- make_sparse_data_R2(
n = n_tr + n_te, p = p, rho = rho, target_R2 = R2_tgt,
fm = fm, density = dens, seed = seed_i + 1L
)
set.seed(seed_i + 2L)
idx <- sample(seq_len(nrow(df)), size = n_tr)
train_df <- df[idx, ]
test_df <- df[-idx, ]
keep <- complete.cases(model.frame(fm, data = train_df, na.action = stats::na.pass))
if (sum(keep) < 2) return(NULL)
train_used <- train_df[keep, , drop = FALSE]
n_used <- nrow(train_used)
r2_true <- function(d) var(d$y_signal) / var(d$y)
R2_train_true <- r2_true(train_used)
R2_test_true <- r2_true(test_df)
# SVEM (wAICc & wAIC)
sv_wAICc <- fit_and_predict_obj(fm, train_used, test_df, "wAICc", nBoot = NBOOT)
sv_wAIC <- fit_and_predict_obj(fm, train_used, test_df, "wAIC", nBoot = NBOOT)
# glmnet comparator (cv + built-in debias)
glmn <- fit_and_predict_glmnet_cv(fm, train_used, test_df)
data.frame(
p = p, model = model, rho = rho, R2_target = R2_tgt, density = dens,
p_full = p_full, n_train = n_tr, n_train_used = n_used, n_test = n_te,
n_train_minus_p_full = n_tr - p_full,
ratio_n_over_p = n_used / p_full,
above_boundary = as.integer(n_used > p_full),
R2_true_train = R2_train_true, R2_true_test = R2_test_true,
rmse_test_wAICc_debiasFalse = rmse(test_df$y, sv_wAICc$pred_false),
rmse_test_wAICc_debiasTrue = rmse(test_df$y, sv_wAICc$pred_true),
rmse_test_wAIC_debiasFalse = rmse(test_df$y, sv_wAIC$pred_false),
rmse_test_wAIC_debiasTrue = rmse(test_df$y, sv_wAIC$pred_true),
rmse_test_glmnet_debiasFalse= rmse(test_df$y, glmn$pred_false),
rmse_test_glmnet_debiasTrue = rmse(test_df$y, glmn$pred_true),
time_sec_fit_wAICc = sv_wAICc$time,
time_sec_fit_wAIC = sv_wAIC$time,
time_sec_fit_glmnet = glmn$time,
stringsAsFactors = FALSE
)
}
## ----- Run in parallel -----
# progressr::with_progress({
# pbar <- progressr::progressor(along = seq_len(nrow(grid)))
# res_list <- future.apply::future_lapply(
# seq_len(nrow(grid)),
# function(i) { on.exit(pbar(), add = TRUE); run_one(grid[i, , drop = FALSE]) },
# future.seed = TRUE
# )
# })
res_list <- future.apply::future_lapply(
seq_len(nrow(grid)),
function(i) run_one(grid[i, , drop = FALSE]),
future.seed = TRUE
)
res <- dplyr::bind_rows(Filter(Negate(is.null), res_list))
stopifnot(nrow(res) > 0)
## ---------- Summaries (marginal + paired + interactions) ----------
need_cols <- c("rmse_test_wAICc_debiasFalse","rmse_test_wAICc_debiasTrue",
"rmse_test_wAIC_debiasFalse","rmse_test_wAIC_debiasTrue",
"rmse_test_glmnet_debiasFalse","rmse_test_glmnet_debiasTrue")
stopifnot(all(need_cols %in% names(res)))
## Long format (for marginal summaries & ANOM)
eps <- .Machine$double.eps
res_long <- res %>%
mutate(Row = dplyr::row_number()) %>%
transmute(
Row, model, p, rho, R2_target, density,
above_boundary, ratio_n_over_p,
wAICc_FALSE = log(pmax(rmse_test_wAICc_debiasFalse, eps)),
wAICc_TRUE = log(pmax(rmse_test_wAICc_debiasTrue, eps)),
wAIC_FALSE = log(pmax(rmse_test_wAIC_debiasFalse, eps)),
wAIC_TRUE = log(pmax(rmse_test_wAIC_debiasTrue, eps)),
glmnet_FALSE= log(pmax(rmse_test_glmnet_debiasFalse,eps)),
glmnet_TRUE = log(pmax(rmse_test_glmnet_debiasTrue, eps))
) %>%
tidyr::pivot_longer(
cols = c(wAICc_FALSE, wAICc_TRUE, wAIC_FALSE, wAIC_TRUE, glmnet_FALSE, glmnet_TRUE),
names_to = "setting", values_to = "lmrse"
) %>%
mutate(
method = case_when(grepl("^glmnet", setting) ~ "glmnet",
grepl("^wAICc", setting) ~ "SVEM-wAICc",
TRUE ~ "SVEM-wAIC"),
debias = ifelse(grepl("TRUE$", setting), "TRUE", "FALSE"),
method = factor(method, levels = c("SVEM-wAICc","SVEM-wAIC","glmnet")),
debias = factor(debias, levels = c("FALSE","TRUE")),
setting = factor(paste0(method, " (debias=", debias, ")"),
levels = c("SVEM-wAICc (debias=FALSE)","SVEM-wAICc (debias=TRUE)",
"SVEM-wAIC (debias=FALSE)","SVEM-wAIC (debias=TRUE)",
"glmnet (debias=FALSE)","glmnet (debias=TRUE)"))
)
## ---------- Which combo is best overall? (per-row winner among 6)
winners <- res %>%
mutate(Row = dplyr::row_number()) %>%
transmute(
Row,
`SVEM-wAICc (debias=FALSE)` = rmse_test_wAICc_debiasFalse,
`SVEM-wAICc (debias=TRUE)` = rmse_test_wAICc_debiasTrue,
`SVEM-wAIC (debias=FALSE)` = rmse_test_wAIC_debiasFalse,
`SVEM-wAIC (debias=TRUE)` = rmse_test_wAIC_debiasTrue,
`glmnet (debias=FALSE)` = rmse_test_glmnet_debiasFalse,
`glmnet (debias=TRUE)` = rmse_test_glmnet_debiasTrue
) %>%
tidyr::pivot_longer(-Row, names_to = "setting", values_to = "rmse") %>%
group_by(Row) %>%
filter(is.finite(rmse), rmse == min(rmse, na.rm = TRUE)) %>%
ungroup()
best_counts <- winners %>% count(setting, name = "wins") %>%
mutate(share = wins / sum(wins)) %>%
arrange(desc(share))
cat("\n==== Which (method, debias) wins most often? ====\n")
print(best_counts, row.names = FALSE, digits = 4)
## ---------- Marginal summary (log RMSE) — info only
summ_marg <- res_long %>%
group_by(setting) %>%
summarise(
mean_lmrse = mean(lmrse, na.rm = TRUE),
sd_lmrse = sd(lmrse, na.rm = TRUE),
n = dplyr::n(),
se = sd_lmrse / sqrt(pmax(n, 1)),
ci_lo = mean_lmrse - 1.96 * se,
ci_hi = mean_lmrse + 1.96 * se,
.groups = "drop"
) %>%
arrange(mean_lmrse)
cat("\n================ SUMMARY (log RMSE; marginal) ================\n")
print(summ_marg, row.names = FALSE, digits = 4)
## ---------- Debias effect within each method (paired)
paired_effect <- function(num, den, label) {
logdiff <- log(pmax(num, eps)) - log(pmax(den, eps))
logdiff <- logdiff[is.finite(logdiff)]
n <- length(logdiff)
mu <- mean(logdiff); se <- sd(logdiff)/sqrt(n); ci <- mu + c(-1,1) * qt(0.975, n-1) * se
tibble(contrast = label, n = n,
gm_ratio = exp(mu), gm_lo = exp(ci[1]), gm_hi = exp(ci[2]),
winrate = mean(exp(logdiff) < 1))
}
debias_wAICc <- paired_effect(res$rmse_test_wAICc_debiasTrue, res$rmse_test_wAICc_debiasFalse, "SVEM-wAICc: TRUE / FALSE")
debias_wAIC <- paired_effect(res$rmse_test_wAIC_debiasTrue, res$rmse_test_wAIC_debiasFalse, "SVEM-wAIC: TRUE / FALSE")
debias_glm <- paired_effect(res$rmse_test_glmnet_debiasTrue, res$rmse_test_glmnet_debiasFalse,"glmnet: TRUE / FALSE")
cat("\n==== Debias effect within each method (GM ratios; <1 favors debias=TRUE) ====\n")
print(bind_rows(debias_wAICc, debias_wAIC, debias_glm), row.names = FALSE, digits = 4)
## ---------- Objective effect at fixed debias (SVEM only; paired)
obj_FALSE <- paired_effect(res$rmse_test_wAICc_debiasFalse, res$rmse_test_wAIC_debiasFalse,
"SVEM: wAICc / wAIC (debias=FALSE)")
obj_TRUE <- paired_effect(res$rmse_test_wAICc_debiasTrue, res$rmse_test_wAIC_debiasTrue,
"SVEM: wAICc / wAIC (debias=TRUE)")
cat("\n==== Objective effect (SVEM) at each debias (GM ratios; <1 favors wAICc) ====\n")
print(bind_rows(obj_FALSE, obj_TRUE), row.names = FALSE, digits = 4)
## ---------- SVEM vs glmnet at fixed debias (paired; optional but handy)
svem_wAICc_vs_glm_FALSE <- paired_effect(res$rmse_test_wAICc_debiasFalse, res$rmse_test_glmnet_debiasFalse,
"SVEM-wAICc / glmnet (debias=FALSE)")
svem_wAICc_vs_glm_TRUE <- paired_effect(res$rmse_test_wAICc_debiasTrue, res$rmse_test_glmnet_debiasTrue,
"SVEM-wAICc / glmnet (debias=TRUE)")
svem_wAIC_vs_glm_FALSE <- paired_effect(res$rmse_test_wAIC_debiasFalse, res$rmse_test_glmnet_debiasFalse,
"SVEM-wAIC / glmnet (debias=FALSE)")
svem_wAIC_vs_glm_TRUE <- paired_effect(res$rmse_test_wAIC_debiasTrue, res$rmse_test_glmnet_debiasTrue,
"SVEM-wAIC / glmnet (debias=TRUE)")
cat("\n==== SVEM vs glmnet (GM ratios; <1 favors SVEM) ====\n")
print(bind_rows(svem_wAICc_vs_glm_FALSE, svem_wAICc_vs_glm_TRUE,
svem_wAIC_vs_glm_FALSE, svem_wAIC_vs_glm_TRUE),
row.names = FALSE, digits = 4)
## ---------- Interaction: method × debias (Row-blocked + DiD style)
# assumes res_long is already built
res_long_dm <- res_long %>%
dplyr::group_by(Row) %>%
dplyr::mutate(lmrse_dm = lmrse - mean(lmrse)) %>%
dplyr::ungroup()
fit_interact_fast <- lm(lmrse_dm ~ method * debias, data = res_long_dm)
anova(fit_interact_fast)
# If you want the “setting” one-way with Row FE:
fit_setting_fast <- lm(lmrse_dm ~ setting, data = res_long_dm)
anova(fit_setting_fast)
## ---------- By-boundary (marginal means over 6 settings)
ab_key <- res %>% mutate(Row = dplyr::row_number()) %>% select(Row, above_boundary)
res_long_ab <- res_long %>%
select(-dplyr::any_of(c("above_boundary"))) %>%
left_join(ab_key, by = "Row")
summ_by_boundary <- res_long_ab %>%
group_by(above_boundary, setting) %>%
summarise(
mean_lmrse = mean(lmrse, na.rm = TRUE),
sd_lmrse = sd(lmrse, na.rm = TRUE),
n = dplyr::n(),
.groups = "drop"
)
cat("\n---- Means by boundary (n_train_used > p_full; marginal) ----\n")
print(summ_by_boundary, row.names = FALSE, digits = 4)
## ---------- Win-rate by boundary (which setting wins among 6; paired within row)
win_by_boundary6 <- res %>%
mutate(Row = dplyr::row_number(),
above_boundary = as.integer(n_train_used > p_full)) %>%
transmute(
Row, above_boundary,
`SVEM-wAICc (debias=FALSE)` = rmse_test_wAICc_debiasFalse,
`SVEM-wAICc (debias=TRUE)` = rmse_test_wAICc_debiasTrue,
`SVEM-wAIC (debias=FALSE)` = rmse_test_wAIC_debiasFalse,
`SVEM-wAIC (debias=TRUE)` = rmse_test_wAIC_debiasTrue,
`glmnet (debias=FALSE)` = rmse_test_glmnet_debiasFalse,
`glmnet (debias=TRUE)` = rmse_test_glmnet_debiasTrue
) %>%
tidyr::pivot_longer(-c(Row, above_boundary), names_to = "setting", values_to = "rmse") %>%
group_by(Row) %>%
filter(is.finite(rmse), rmse == min(rmse, na.rm = TRUE)) %>%
ungroup() %>%
count(above_boundary, setting, name = "wins") %>%
group_by(above_boundary) %>%
mutate(share = wins / sum(wins)) %>%
ungroup()
cat("\n---- Win shares by boundary (among 6 settings) ----\n")
print(win_by_boundary6, row.names = FALSE, digits = 4)
## ---------- Extra slices: wins by density & by R2_target
win_by_density <- winners %>%
left_join(res %>% mutate(Row = dplyr::row_number(), density_fac = factor(density)), by = "Row") %>%
count(density_fac, setting, name = "wins") %>%
group_by(density_fac) %>% mutate(share = wins / sum(wins)) %>% ungroup()
cat("\n---- Win shares by density ----\n")
print(win_by_density, row.names = FALSE, digits = 4)
win_by_R2 <- winners %>%
left_join(res %>% mutate(Row = dplyr::row_number(), R2_fac = factor(R2_target)), by = "Row") %>%
count(R2_fac, setting, name = "wins") %>%
group_by(R2_fac) %>% mutate(share = wins / sum(wins)) %>% ungroup()
cat("\n---- Win shares by R2_target ----\n")
print(win_by_R2, row.names = FALSE, digits = 4)
## ---------- ANOM-style, row-blocked plot (now 6 settings)
means_df <- res_long_anom %>%
group_by(setting) %>%
summarise(mean_lmrse = mean(lmrse, na.rm = TRUE), .groups = "drop")
## ---- FAST ANOM pieces (avoid lm(... + Row)) ----
# 1) Row-demean ( already ran this earlier)
res_long_dm <- res_long %>%
dplyr::group_by(Row) %>%
dplyr::mutate(lmrse_dm = lmrse - mean(lmrse)) %>%
dplyr::ungroup()
# 2) One-way on setting using the demeaned response
fit_setting_fast <- lm(lmrse_dm ~ setting, data = res_long_dm)
aov_tbl <- anova(fit_setting_fast)
# 3) Extract the same quantities you were using before
MSE <- aov_tbl[["Mean Sq"]][nrow(aov_tbl)]
df_res <- aov_tbl[["Df"]][nrow(aov_tbl)]
t_methods<- nlevels(res_long_anom$setting)
b_blocks <- nlevels(res_long_anom$Row)
grand_mu <- mean(res_long_anom$lmrse, na.rm = TRUE)
# ANOM group SE and limits (same formulas as before)
se_group <- sqrt(MSE * (t_methods - 1) / (t_methods * b_blocks))
crit <- qt(1 - 0.05 / (2 * t_methods), df = df_res)
UCL <- grand_mu + crit * se_group
LCL <- grand_mu - crit * se_group
p_anom <- ggplot(means_df, aes(x = setting, y = mean_lmrse)) +
geom_hline(yintercept = grand_mu, linetype = 2) +
geom_hline(yintercept = UCL, linetype = 3) +
geom_hline(yintercept = LCL, linetype = 3) +
geom_segment(aes(xend = setting, y = grand_mu, yend = mean_lmrse), linewidth = 1) +
geom_point(size = 3) +
labs(title = "Blocked ANOM-style plot (6 settings)", x = NULL, y = "Mean log RMSE") +
theme_bw()
p_pairs <- ggplot(res_long_anom, aes(x = setting, y = lmrse, group = Row)) +
geom_line(alpha = 0.2) +
geom_point(alpha = 0.5, position = position_dodge(width = 0.05)) +
stat_summary(fun = mean, geom = "point", size = 3, color = "black") +
labs(title = "Paired runs by Row", x = NULL, y = "log RMSE") +
theme_bw()
p_anom
## ---------- (Optional) Timing summary
time_summary <- res %>%
summarise(
median_wAICc = median(time_sec_fit_wAICc, na.rm = TRUE),
median_wAIC = median(time_sec_fit_wAIC, na.rm = TRUE),
median_glm = median(time_sec_fit_glmnet,na.rm = TRUE)
)
cat("\n---- Median fit times (sec) ----\n")
print(time_summary, row.names = FALSE, digits = 4)
## Win shares by n/p deciles (how “over/under” the boundary affects winners)
win_by_np <- winners %>%
left_join(res %>% mutate(Row = dplyr::row_number(),
np_decile = ntile(ratio_n_over_p, 10)), by = "Row") %>%
count(np_decile, setting, name = "wins") %>%
group_by(np_decile) %>% mutate(share = wins / sum(wins)) %>% ungroup()
print(win_by_np, n = 60)
## Winners by model family (quadratic vs. cubic)
win_by_model <- winners %>%
left_join(res %>% mutate(Row = dplyr::row_number(),
model = factor(model)), by = "Row") %>%
count(model, setting, name = "wins") %>%
group_by(model) %>% mutate(share = wins / sum(wins)) %>% ungroup()
print(win_by_model, n = 20)
paired_by <- function(res, num_col, den_col, by) {
eps <- .Machine$double.eps
res %>%
mutate(Row = dplyr::row_number()) %>%
select(Row, !!by, num = all_of(num_col), den = all_of(den_col)) %>%
filter(is.finite(num), is.finite(den)) %>%
group_by(!!sym(by)) %>%
summarise(
n = n(),
gm_ratio = exp(mean(log(pmax(num, eps)) - log(pmax(den, eps)))),
gm_lo = exp(t.test(log(pmax(num, eps)) - log(pmax(den, eps)))$conf.int[1]),
gm_hi = exp(t.test(log(pmax(num, eps)) - log(pmax(den, eps)))$conf.int[2]),
winrate = mean(num < den),
.groups = "drop"
)
}
# Prepare stratifiers
res_np <- res %>% mutate(np_decile = ntile(ratio_n_over_p, 10))
res_den <- res %>% mutate(density_fac = factor(density))
res_r2 <- res %>% mutate(R2_fac = factor(R2_target))
# SVEM-wAICc debias effect
db_wAICc_np <- paired_by(res_np, "rmse_test_wAICc_debiasTrue", "rmse_test_wAICc_debiasFalse", "np_decile")
db_wAICc_den<- paired_by(res_den, "rmse_test_wAICc_debiasTrue", "rmse_test_wAICc_debiasFalse", "density_fac")
db_wAICc_r2 <- paired_by(res_r2, "rmse_test_wAICc_debiasTrue", "rmse_test_wAICc_debiasFalse", "R2_fac")
# glmnet debias effect
db_glm_np <- paired_by(res_np, "rmse_test_glmnet_debiasTrue", "rmse_test_glmnet_debiasFalse", "np_decile")
db_glm_den <- paired_by(res_den, "rmse_test_glmnet_debiasTrue", "rmse_test_glmnet_debiasFalse", "density_fac")
db_glm_r2 <- paired_by(res_r2, "rmse_test_glmnet_debiasTrue", "rmse_test_glmnet_debiasFalse", "R2_fac")
# SVEM-wAIC debias effect
db_wAIC_np <- paired_by(res_np, "rmse_test_wAIC_debiasTrue", "rmse_test_wAIC_debiasFalse", "np_decile")
db_wAIC_den <- paired_by(res_den, "rmse_test_wAIC_debiasTrue", "rmse_test_wAIC_debiasFalse", "density_fac")
db_wAIC_r2 <- paired_by(res_r2, "rmse_test_wAIC_debiasTrue", "rmse_test_wAIC_debiasFalse", "R2_fac")
db_wAICc_np; db_glm_np; db_wAIC_np
db_wAICc_den; db_glm_den; db_wAIC_den
db_wAICc_r2; db_glm_r2; db_wAIC_r2
svem_vs_glm_by <- function(res, by, debias_flag = c("False","True")) {
eps <- .Machine$double.eps
col_svem <- paste0("rmse_test_wAICc_debias", debias_flag)
col_glm <- paste0("rmse_test_glmnet_debias", debias_flag)
res %>%
mutate(Row = dplyr::row_number()) %>%
select(Row, !!by, svem = all_of(col_svem), glm = all_of(col_glm)) %>%
filter(is.finite(svem), is.finite(glm)) %>%
group_by(!!sym(by)) %>%
summarise(
n = n(),
gm_ratio = exp(mean(log(pmax(svem, eps)) - log(pmax(glm, eps)))),
gm_lo = exp(t.test(log(pmax(svem, eps)) - log(pmax(glm, eps)))$conf.int[1]),
gm_hi = exp(t.test(log(pmax(svem, eps)) - log(pmax(glm, eps)))$conf.int[2]),
winrate = mean(svem < glm),
.groups = "drop"
) %>%
mutate(debias = debias_flag)
}
sv_glm_np_F <- svem_vs_glm_by(res_np, "np_decile", "False")
sv_glm_np_T <- svem_vs_glm_by(res_np, "np_decile", "True")
sv_glm_den_F<- svem_vs_glm_by(res_den, "density_fac","False")
sv_glm_den_T<- svem_vs_glm_by(res_den, "density_fac","True")
sv_glm_r2_F <- svem_vs_glm_by(res_r2, "R2_fac", "False")
sv_glm_r2_T <- svem_vs_glm_by(res_r2, "R2_fac", "True")
rbind(sv_glm_np_F, sv_glm_np_T) %>% arrange(np_decile, debias)
na_rates <- res %>%
transmute(
glmnet_FALSE = is.na(rmse_test_glmnet_debiasFalse),
glmnet_TRUE = is.na(rmse_test_glmnet_debiasTrue),
wAICc_FALSE = is.na(rmse_test_wAICc_debiasFalse),
wAICc_TRUE = is.na(rmse_test_wAICc_debiasTrue),
wAIC_FALSE = is.na(rmse_test_wAIC_debiasFalse),
wAIC_TRUE = is.na(rmse_test_wAIC_debiasTrue)
) %>%
summarise(across(everything(), ~mean(.)))
print(na_rates, digits = 3)
frontier <- res_long %>%
group_by(setting) %>%
summarise(mean_lmrse = mean(lmrse, na.rm = TRUE), .groups="drop") %>%
mutate(median_time = case_when(
grepl("^SVEM-wAICc", setting) ~ median(res$time_sec_fit_wAICc, na.rm = TRUE),
grepl("^SVEM-wAIC", setting) ~ median(res$time_sec_fit_wAIC, na.rm = TRUE),
grepl("^glmnet", setting) ~ median(res$time_sec_fit_glmnet,na.rm = TRUE)
))
frontier
# ggplot(frontier, aes(median_time, mean_lmrse, label = setting)) + geom_point() + ggrepel::geom_text_repel()
winner_margins <- res %>%
mutate(Row = dplyr::row_number()) %>%
transmute(
Row,
`SVEM-wAICc (debias=FALSE)` = rmse_test_wAICc_debiasFalse,
`SVEM-wAICc (debias=TRUE)` = rmse_test_wAICc_debiasTrue,
`SVEM-wAIC (debias=FALSE)` = rmse_test_wAIC_debiasFalse,
`SVEM-wAIC (debias=TRUE)` = rmse_test_wAIC_debiasTrue,
`glmnet (debias=FALSE)` = rmse_test_glmnet_debiasFalse,
`glmnet (debias=TRUE)` = rmse_test_glmnet_debiasTrue
) %>%
tidyr::pivot_longer(-Row, names_to="setting", values_to="rmse") %>%
group_by(Row) %>%
summarise(
best_setting = setting[which.min(rmse)],
best_rmse = min(rmse, na.rm=TRUE),
second_rmse = sort(rmse, partial=2)[2],
margin_pct = (second_rmse / best_rmse) - 1,
.groups="drop"
)
winner_margins %>%
summarise(
median_margin_pct = median(margin_pct, na.rm=TRUE),
p90_margin_pct = quantile(margin_pct, 0.9, na.rm=TRUE)
)
winner_margins <- res %>%
mutate(Row = dplyr::row_number()) %>%
transmute(
Row,
`SVEM-wAICc (debias=FALSE)` = rmse_test_wAICc_debiasFalse,
`SVEM-wAICc (debias=TRUE)` = rmse_test_wAICc_debiasTrue,
`SVEM-wAIC (debias=FALSE)` = rmse_test_wAIC_debiasFalse,
`SVEM-wAIC (debias=TRUE)` = rmse_test_wAIC_debiasTrue,
`glmnet (debias=FALSE)` = rmse_test_glmnet_debiasFalse,
`glmnet (debias=TRUE)` = rmse_test_glmnet_debiasTrue
) %>%
tidyr::pivot_longer(-Row, names_to="setting", values_to="rmse") %>%
group_by(Row) %>%
summarise(
best_setting = setting[which.min(rmse)],
best_rmse = min(rmse, na.rm=TRUE),
second_rmse = { vals <- rmse[is.finite(rmse)] ; sort(vals, partial=2)[2] },
margin_pct = (second_rmse / best_rmse) - 1,
.groups="drop"
)
# Overall margins (adds “how often is it basically a tie?”)
winner_margins %>%
summarise(
median_margin_pct = median(margin_pct, na.rm=TRUE),
p90_margin_pct = quantile(margin_pct, 0.9, na.rm=TRUE),
share_lt_2pct = mean(margin_pct < 0.02, na.rm=TRUE),
share_lt_5pct = mean(margin_pct < 0.05, na.rm=TRUE)
)
# Margins by winner (who wins “by a lot”?)
winner_margins %>%
group_by(best_setting) %>%
summarise(
n = n(),
med = median(margin_pct, na.rm=TRUE),
p90 = quantile(margin_pct, 0.9, na.rm=TRUE),
share_gt_5pct = mean(margin_pct > 0.05, na.rm=TRUE),
share_gt_10pct = mean(margin_pct > 0.10, na.rm=TRUE),
.groups="drop"
) %>%
arrange(desc(med))
margins_np_r2 <- winner_margins %>%
left_join(res %>% mutate(
Row = dplyr::row_number(),
np_decile = dplyr::ntile(ratio_n_over_p, 10),
R2_fac = factor(R2_target)
), by="Row") %>%
group_by(R2_fac, np_decile) %>%
summarise(
med_margin = median(margin_pct, na.rm=TRUE),
p90_margin = quantile(margin_pct, 0.9, na.rm=TRUE),
.groups="drop"
)
margins_np_r2
tau <- 0.02 # try 0.02 (2%) or 0.05 (5%)
policy_eval <- winner_margins %>%
mutate(choice = ifelse(best_setting == "SVEM-wAICc (debias=FALSE)" & margin_pct > tau,
"SVEM-wAICc (debias=FALSE)",
"glmnet (debias=FALSE)")) %>%
left_join(
res %>% mutate(Row = dplyr::row_number()) %>%
transmute(Row,
`SVEM-wAICc (debias=FALSE)`=rmse_test_wAICc_debiasFalse,
`SVEM-wAICc (debias=TRUE)` =rmse_test_wAICc_debiasTrue,
`SVEM-wAIC (debias=FALSE)` =rmse_test_wAIC_debiasFalse,
`SVEM-wAIC (debias=TRUE)` =rmse_test_wAIC_debiasTrue,
`glmnet (debias=FALSE)` =rmse_test_glmnet_debiasFalse,
`glmnet (debias=TRUE)` =rmse_test_glmnet_debiasTrue
) %>% tidyr::pivot_longer(-Row, names_to="setting", values_to="rmse"),
by=c("Row","choice"="setting")
) %>%
rename(rmse_choice = rmse) %>%
mutate(regret_pct = (rmse_choice / best_rmse) - 1) %>%
left_join(
res %>% mutate(Row = dplyr::row_number()) %>%
select(Row, time_sec_fit_wAICc, time_sec_fit_wAIC, time_sec_fit_glmnet),
by="Row"
) %>%
mutate(time_choice = dplyr::case_when(
grepl("^glmnet", choice) ~ time_sec_fit_glmnet,
grepl("^SVEM-wAICc", choice) ~ time_sec_fit_wAICc,
grepl("^SVEM-wAIC", choice) ~ time_sec_fit_wAIC
))
policy_eval %>%
summarise(
median_regret_pct = median(regret_pct, na.rm=TRUE),
p90_regret_pct = quantile(regret_pct, 0.9, na.rm=TRUE),
share_no_regret = mean(regret_pct <= 0, na.rm=TRUE),
median_time_sec = median(time_choice, na.rm=TRUE)
)
Lemkus, T., Gotwalt, C., Ramsey, P., & Weese, M. L.
(2021). Self-Validated Ensemble Models for Elastic Net
Regression.
Chemometrics and Intelligent Laboratory Systems, 219,
104439.
DOI: 10.1016/j.chemolab.2021.104439
Karl, A. T. (2024). A Randomized Permutation
Whole-Model Test for SVEM.
Chemometrics and Intelligent Laboratory Systems, 249,
105122.
DOI: 10.1016/j.chemolab.2024.105122
Friedman, J. H., Hastie, T., & Tibshirani, R.
(2010). Regularization Paths for Generalized Linear Models
via Coordinate Descent.
Journal of Statistical Software, 33(1), 1–22.
DOI: 10.18637/jss.v033.i01
Gotwalt, C., & Ramsey, P. (2018). Model
Validation Strategies for Designed Experiments Using Bootstrapping
Techniques With Applications to Biopharmaceuticals.
JMP Discovery Conference.
Link
Ramsey, P., Gaudard, M., & Levin, W. (2021).
Accelerating Innovation with Space-Filling Mixture Designs, Neural
Networks, and SVEM.
JMP Discovery Conference.
Link
Ramsey, P., & Gotwalt, C. (2018). Model
Validation Strategies for Designed Experiments Using Bootstrapping
Techniques With Applications to Biopharmaceuticals.
JMP Discovery Summit Europe.
Link
Ramsey, P., Levin, W., Lemkus, T., & Gotwalt, C.
(2021). SVEM: A Paradigm Shift in Design and Analysis of
Experiments.
JMP Discovery Summit Europe.
Link
Ramsey, P., & McNeill, P. (2023). CMC,
SVEM, Neural Networks, DOE, and Complexity: It’s All About
Prediction.
JMP Discovery Conference.
Karl, A., Wisnowski, J., & Rushing, H.
(2022). JMP Pro 17 Remedies for Practical Struggles with
Mixture Experiments.
JMP Discovery Conference.
Link
Xu, L., Gotwalt, C., Hong, Y., King, C. B., & Meeker,
W. Q. (2020). Applications of the Fractional-Random-Weight
Bootstrap.
The American Statistician, 74(4), 345–358.
Link
Karl, A. T. (2024). SVEMnet: Self-Validated
Ensemble Models with Elastic Net Regression.
R package
JMP Help Documentation Overview of
Self-Validated Ensemble Models.
Link