lsReg supports five hypothesis tests for the added
covariate(s), specified via the testtype argument to
lsReg():
| testtype | Description |
|---|---|
| "lrt" | Likelihood ratio test: twice the log-likelihood difference between the full and base models |
| "score" | Score (Rao) test: evaluated at the base-model estimates; no full-model fit required |
| "wald" | Wald test: based on the maximum likelihood estimate and its standard error |
| "robustscore" | Score test with Huber-White sandwich variance |
| "robustwald" | Wald test with Huber-White sandwich variance |
The standard tests (LRT, score, Wald) assume the model is correctly specified. The robust tests use the Huber-White sandwich estimator for the variance-covariance matrix, providing valid inference when the variance function is mis-specified (e.g., overdispersion in count data, heteroscedasticity in linear models).
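To see the sandwich correction in isolation, here is a minimal sketch using the sandwich package (illustration only; sandwich is not part of lsReg), comparing model-based and Huber-White standard errors for an ordinary glm:

# Illustration (not lsReg): model-based vs. Huber-White standard errors
# for a plain glm, via sandwich::vcovHC
library(sandwich)
fit <- glm(mpg ~ wt + hp, data = mtcars, family = gaussian)
se_model  <- sqrt(diag(vcov(fit)))                  # model-based
se_robust <- sqrt(diag(vcovHC(fit, type = "HC0")))  # Huber-White sandwich
cbind(model = se_model, robust = se_robust)

When the variance function is correctly specified the two columns are close; under mis-specification the robust column remains valid while the model-based one does not.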
"lrt" returns a chi-square statistic with degrees of
freedom equal to the number of added columns. P-values use
pchisq(..., lower.tail = FALSE).2 * pnorm(-abs(statistic)).Under correct model specification all five tests are asymptotically equivalent, so their statistics should be similar in large samples.
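For a single added column the two p-value scales coincide exactly, because a squared z-statistic is chi-square distributed with 1 degree of freedom. A quick numerical check:

# For df = 1, pchisq on a squared z-statistic equals the two-sided
# normal p-value
z <- 2.5
pchisq(z^2, df = 1, lower.tail = FALSE)
#> [1] 0.01241933
2 * pnorm(-abs(z))
#> [1] 0.01241933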
We use the simulated dataset with ylin as the outcome and ylin ~ x1 + x2 as the base model. We test z5 and z9 (both true predictors) separately under all five test types and compare the resulting p-values.
library(lsReg)
datafile <- system.file("extdata", "simulated_data.rds", package = "lsReg")
dat <- readRDS(datafile)
basemdl <- glm(ylin ~ x1 + x2, data = dat, family = gaussian)

testtypes <- c("lrt", "score", "robustscore", "wald", "robustwald")
zvars <- c("z5", "z9")
results <- expand.grid(testtype = testtypes, variable = zvars,
                       stringsAsFactors = FALSE)
results$statistic <- NA_real_
results$pvalue <- NA_real_
for (tt in testtypes) {
  # Build one lsReg test object per test type; colstoadd = 1 because we
  # test a single added column at a time
  mem <- lsReg(basemdl, colstoadd = 1, testtype = tt)
  for (zv in zvars) {
    xr <- as.matrix(dat[, zv, drop = FALSE])
    addcovar(mem, xr)
    # "lrt" yields a scalar chi-square statistic; the other tests yield
    # a 1 x 1 matrix holding a z-statistic
    stat <- if (tt == "lrt") mem$testvalue[1] else mem$testvalue[1, 1]
    pval <- if (tt == "lrt") pchisq(stat, df = 1, lower.tail = FALSE) else
      2 * pnorm(-abs(stat))
    idx <- results$testtype == tt & results$variable == zv
    results$statistic[idx] <- stat
    results$pvalue[idx] <- pval
  }
}

# Reshape for a readable side-by-side comparison
res_z5 <- results[results$variable == "z5", c("testtype", "statistic", "pvalue")]
res_z9 <- results[results$variable == "z9", c("testtype", "statistic", "pvalue")]
cat("=== z5 ===\n")
#> === z5 ===
print(res_z5, digits = 5, row.names = FALSE)
#>     testtype statistic     pvalue
#>          lrt   12.6852 0.00036857
#>        score    3.5450 0.00039257
#>  robustscore    3.5656 0.00036305
#>         wald    3.5658 0.00036276
#>   robustwald    3.6315 0.00028173
cat("\n=== z9 ===\n")
#>
#> === z9 ===
print(res_z9, digits = 5, row.names = FALSE)
#>     testtype statistic   pvalue
#>          lrt    6.1142 0.013410
#>        score   -2.4652 0.013694
#>  robustscore   -2.5030 0.012314
#>         wald   -2.4715 0.013454
#>   robustwald   -2.5182 0.011797

For both z5 and z9 all five tests yield
similar statistics and p-values, consistent with the asymptotic
equivalence of the tests under a correctly specified model. Minor
differences arise because each test uses a different approximation to
the same quantity.
The LRT statistic is on the chi-square scale, while the others are z-scores. To compare them directly, take the square root of the LRT statistic: it approximates the absolute value of the z-scores from the other tests (the square root cannot recover the sign of the association):
for (zv in zvars) {
  lrt_z   <- sqrt(results$statistic[results$testtype == "lrt" & results$variable == zv])
  wald_z  <- results$statistic[results$testtype == "wald" & results$variable == zv]
  score_z <- results$statistic[results$testtype == "score" & results$variable == zv]
  cat(zv, "— sqrt(LRT):", round(lrt_z, 4),
      " Wald:", round(wald_z, 4),
      " Score:", round(score_z, 4), "\n")
}
#> z5 — sqrt(LRT): 3.5616 Wald: 3.5658 Score: 3.545
#> z9 — sqrt(LRT): 2.4727 Wald: -2.4715 Score: -2.4652
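As an independent sanity check, the table above defines "lrt" as twice the log-likelihood difference between the full and base models, so the z5 value should be reproducible from a direct full-model fit. A sketch, reusing dat and basemdl from the earlier chunk:

# Sketch: recompute the likelihood ratio statistic for z5 by fitting the
# full model directly; this should match the "lrt" row for z5 above
fullmdl <- glm(ylin ~ x1 + x2 + z5, data = dat, family = gaussian)
lrt_direct <- as.numeric(2 * (logLik(fullmdl) - logLik(basemdl)))
lrt_direct
pchisq(lrt_direct, df = 1, lower.tail = FALSE)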