| Type: | Package |
| Title: | Estimation under not Missing at Random Nonresponse |
| Version: | 0.1.1 |
| Description: | Methods to estimate finite-population parameters under nonresponse that is not missing at random (NMAR, nonignorable). Incorporates auxiliary information and user-specified response models, and supports independent samples and complex survey designs via objects from the 'survey' package. Provides diagnostics and optional variance estimates. For methodological background see Qin, Leung and Shao (2002) <doi:10.1198/016214502753479338> and Riddles, Kim and Im (2016) <doi:10.1093/jssam/smv047>. |
| License: | MIT + file LICENSE |
| URL: | https://github.com/ncn-foreigners/NMAR, https://ncn-foreigners.ue.poznan.pl/NMAR/index.html |
| BugReports: | https://github.com/ncn-foreigners/NMAR/issues |
| Encoding: | UTF-8 |
| Imports: | stats, nleqslv, utils, generics, Formula |
| RoxygenNote: | 7.3.3 |
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0), numDeriv, survey, svrep, broom, progressr, future, future.apply, spelling |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| Depends: | R (≥ 3.5) |
| LazyData: | true |
| Language: | en-US |
| NeedsCompilation: | no |
| Packaged: | 2026-01-10 15:12:54 UTC; runner |
| Author: | Maciej Beresewicz |
| Maintainer: | Maciej Beresewicz <maciej.beresewicz@ue.poznan.pl> |
| Repository: | CRAN |
| Date/Publication: | 2026-01-16 10:50:02 UTC |
Apply scaling to a matrix using a recipe
Description
Apply scaling to a matrix using a recipe
Usage
apply_nmar_scaling(matrix_to_scale, recipe)
Arguments
matrix_to_scale |
A numeric matrix with column names present in the recipe. |
recipe |
An object of class nmar_scaling_recipe. |
Value
A numeric matrix with each column centered and scaled using the recipe.
Shared bootstrap variance helpers
Description
Internal helpers to estimate the variance of a scalar estimator via bootstrap resampling (IID data) or bootstrap replicate weights (survey designs). Designed to be reused across NMAR engines.
Usage
bootstrap_variance(data, estimator_func, point_estimate, ...)
Arguments
data |
A data.frame (IID bootstrap) or a survey.design (replicate-weight bootstrap). |
estimator_func |
Function returning an object with a numeric scalar
component |
point_estimate |
Numeric scalar; used for survey bootstrap variance
(passed to survey::svrVar()). |
... |
Additional arguments. Some are consumed by the bootstrap implementation itself (see 'Bootstrap-specific options'); the rest are forwarded to estimator_func. |
Details
- For data.frame inputs, performs IID bootstrap by resampling rows and rerunning estimator_func on each resample, then computing the empirical variance of the replicate estimates.
- For survey.design inputs, converts the design to a bootstrap replicate-weight design with svrep::as_bootstrap_design(), evaluates estimator_func on each replicate weight vector (by injecting the replicate analysis weights into a copy of the input design), and passes the resulting replicate estimates and replicate scaling factors to survey::svrVar().
estimator_func is typically an engine-level estimator (for example
the EL engine) and is called with the same arguments used for the point
estimate, except that the data argument is replaced by the resampled
data (IID) or a replicate-weighted survey.design (survey). Arguments
reserved for the bootstrap implementation are stripped from ...
before forwarding.
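The following minimal sketch illustrates the IID branch described above; the function name and the scalar 'estimate' component are illustrative, not the package's internal implementation.
iid_bootstrap_variance_sketch <- function(data, estimator_func, bootstrap_reps = 500, ...) {
  replicates <- vapply(seq_len(bootstrap_reps), function(b) {
    idx <- sample.int(nrow(data), replace = TRUE)      # resample rows with replacement
    fit <- estimator_func(data[idx, , drop = FALSE], ...)
    fit$estimate                                       # assumed scalar component
  }, numeric(1))
  variance <- stats::var(replicates)                   # empirical variance of replicate estimates
  list(se = sqrt(variance), variance = variance, replicates = replicates)
}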
Bootstrap-specific options
- resample_guard: IID bootstrap only. A function function(indices, data) that returns TRUE to accept a resample and FALSE to reject it.
- bootstrap_settings: Survey bootstrap only. A list of arguments forwarded to svrep::as_bootstrap_design().
- bootstrap_options: Alias for bootstrap_settings.
- bootstrap_type: Shortcut for the type argument to svrep::as_bootstrap_design().
- bootstrap_mse: Shortcut for the mse argument to svrep::as_bootstrap_design().
Progress Reporting
If the optional progressr package is installed, bootstrap calls
signal progress via a progressr::progressor inside
progressr::with_progress(). Users control whether progress is shown
(and how) by registering handlers with progressr::handlers(). When
progressr is not installed or no handlers are active, bootstrap runs
silently. Progress reporting is compatible with all future backends.
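For example, one way to register a progress handler for the current session (assuming the suggested progressr package is installed):
progressr::handlers(global = TRUE)      # report progress for calls made in this session
progressr::handlers("txtprogressbar")   # use a plain text progress bar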
Reproducibility
For reproducible bootstrap results, always set a seed before calling the estimation function:
set.seed(123) # Set seed for reproducibility
result <- nmar(Y ~ X, data = df,
engine = el_engine(variance_method = "bootstrap",
bootstrap_reps = 500))
The future framework (via future.seed = TRUE in
future.apply::future_lapply()) ensures that each bootstrap replicate
uses an independent L'Ecuyer-CMRG random number stream derived from this
seed. This gives reproducible results across supported future backends
(sequential, multisession, cluster, and so on).
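Continuing the snippet above, a sketch of running the same bootstrap on a parallel backend (assumes the suggested future and future.apply packages are installed and that df is defined as before):
library(future)
plan(multisession, workers = 2)   # or plan(sequential)
set.seed(123)
result <- nmar(Y ~ X, data = df,
               engine = el_engine(variance_method = "bootstrap",
                                  bootstrap_reps = 100))   # smaller rep count for illustration
plan(sequential)                  # restore the default plan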
Bootstrap for IID data frames
Description
Bootstrap for IID data frames
Usage
bootstrap_variance.data.frame(
data,
estimator_func,
point_estimate,
bootstrap_reps = 500,
...
)
Arguments
data |
A data.frame of IID observations. |
estimator_func |
Function returning an object with a numeric scalar
component |
point_estimate |
Unused for IID bootstrap; included for signature consistency. |
bootstrap_reps |
integer; number of resamples. |
... |
Additional arguments. Some are consumed by the bootstrap implementation itself (see 'Bootstrap-specific options'); the rest are forwarded to estimator_func. |
Value
A list with components se, variance, and replicates.
Default method dispatch (internal safety net)
Description
Default method dispatch (internal safety net)
Usage
bootstrap_variance.default(data, estimator_func, point_estimate, ...)
Bootstrap for survey designs via replicate weights
Description
Bootstrap for survey designs via replicate weights
Usage
bootstrap_variance.survey.design(
data,
estimator_func,
point_estimate,
bootstrap_reps = 500,
survey_na_policy = c("strict", "omit"),
...
)
Arguments
data |
A survey.design object. |
estimator_func |
Function returning an object with a numeric scalar
component |
point_estimate |
Numeric scalar; used for survey bootstrap variance
(passed to survey::svrVar()). |
bootstrap_reps |
integer; number of bootstrap replicates. |
survey_na_policy |
Character string specifying how to handle replicates that fail to produce estimates. Options: "strict" (default; error if any replicate fails) or "omit" (drop failed replicates). |
... |
Additional arguments. Some are consumed by the bootstrap implementation itself (see 'Bootstrap-specific options'); the rest are forwarded to estimator_func. |
Details
This path constructs a replicate-weight design using
svrep::as_bootstrap_design() and evaluates the estimator on each set of
bootstrap replicate analysis weights.
Replicate evaluation starts from a shallow template copy of the input survey
design (including its ids/strata/fpc structure) and injects each replicate's
analysis weights by
updating the design's probability slots (prob/allprob) so that
weights(design) returns the desired replicate weights (with
zero weights represented as prob = Inf). This avoids replaying or
reconstructing a svydesign() call and therefore supports designs
created via subset() and update().
NA policy: By default, survey bootstrap uses a strict NA policy:
if any replicate fails to produce a finite estimate, the entire bootstrap
fails with an error. Setting survey_na_policy = "omit" drops failed
replicates (and their corresponding rscales) and proceeds with the
remaining replicates.
Value
A list with components se, variance, and replicates.
Limitations
Calibrated/post-stratified designs: Post-hoc adjustments applied
via survey::calibrate(), survey::postStratify(), or
survey::rake() are not supported here and will cause the function to
error. These adjustments are not recomputed when replicate weights are
injected, so the replicate designs would not reflect the intended
calibrated/post-stratified analysis.
Default coefficients for NMAR results
Description
Returns missingness-model coefficients if available.
Usage
## S3 method for class 'nmar_result'
coef(object, ...)
Arguments
object |
An 'nmar_result' object. |
... |
Ignored. |
Value
A named numeric vector or 'NULL'.
Coefficient table for summary objects
Description
Returns a coefficients table (Estimate, Std. Error, statistic, p-value) from a 'summary_nmar_result*' object when missingness-model coefficients and a variance matrix are available. If the summary does not carry missingness-model coefficients, returns 'NULL'.
Usage
## S3 method for class 'summary_nmar_result'
coef(object, ...)
Arguments
object |
An object of class 'summary_nmar_result' (or subclass). |
... |
Ignored. |
Details
The statistic column is labelled "t value" when finite degrees of freedom are available (e.g., survey designs); otherwise, it is labelled "z value".
Value
A data.frame with rows named by coefficient, or 'NULL' if not available.
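For example (assuming fit is an nmar_result whose summary carries missingness-model coefficients and a variance matrix):
s <- summary(fit)
coef(s)   # columns: Estimate, Std. Error, statistic, p-value; NULL if unavailable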
Compute (possibly weighted) mean and standard deviation
Description
Compute (possibly weighted) mean and standard deviation
Usage
compute_weighted_stats(values, weights = NULL)
Wald confidence interval for base NMAR results
Description
Wald confidence interval for base NMAR results
Usage
## S3 method for class 'nmar_result'
confint(object, parm, level = 0.95, ...)
Arguments
object |
An object of class 'nmar_result'. |
parm |
Ignored. |
level |
Confidence level. |
... |
Ignored. |
Value
A 1x2 numeric matrix with confidence limits.
Confidence intervals for coefficient table (summary objects)
Description
Returns Wald-style confidence intervals for missingness-model coefficients from a 'summary_nmar_result*' object. Uses t-quantiles when finite degrees of freedom are available, otherwise normal quantiles.
Usage
## S3 method for class 'summary_nmar_result'
confint(object, parm, level = 0.95, ...)
Arguments
object |
An object of class 'summary_nmar_result' (or subclass). |
parm |
A specification of which coefficients are to be given confidence intervals, either a vector of names or a vector of indices; by default, all coefficients are considered. |
level |
The confidence level required. |
... |
Ignored. |
Value
A numeric matrix with columns giving lower and upper confidence limits for each parameter. Row names correspond to coefficient names. Returns 'NULL' if coefficients are unavailable.
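For example (assuming fit is an nmar_result with available coefficients and variance):
confint(summary(fit), level = 0.90)   # t-quantiles with finite df, otherwise normal quantiles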
Constraint summaries for EL diagnostics
Description
Constraint summaries for EL diagnostics
Usage
constraint_summaries(w_i_hat, W_hat, mass_untrim, X_centered)
Build a scaling recipe from one or more design matrices
Description
Build a scaling recipe from one or more design matrices
Usage
create_nmar_scaling_recipe(
...,
intercept_col = "(Intercept)",
weights = NULL,
weight_mask = NULL,
tol_constant = 1e-08,
warn_on_constant = TRUE
)
Arguments
... |
One or more numeric matrices with column names. |
intercept_col |
Name of an intercept column that should remain unscaled. |
weights |
Optional nonnegative numeric vector used to compute weighted means and standard deviations. |
weight_mask |
Optional logical mask or nonnegative numeric multipliers
applied to weights. |
tol_constant |
Numeric tolerance below which columns are treated as constant and left unscaled. |
warn_on_constant |
Logical; warn when a column is treated as constant. |
Create Verbose Printer Factory
Description
Creates a verbose printing function based on trace level settings. Messages are printed only if their level is <= trace_level.
Usage
create_verboser(trace_level = 0)
Arguments
trace_level |
Integer 0-3; controls verbosity detail: 0 = no output (silent mode), 1 = major steps only (initialization, convergence), 2 = moderate detail (iteration summaries, key diagnostics), 3 = full detail (all diagnostics, intermediate values). |
Value
A function with signature: 'verboser(msg, level = 1, type = c("info", "step", "detail", "result"))'
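A minimal sketch of such a factory, matching the documented signature (the real implementation's formatting may differ):
create_verboser_sketch <- function(trace_level = 0) {
  function(msg, level = 1, type = c("info", "step", "detail", "result")) {
    type <- match.arg(type)
    if (level <= trace_level) message(sprintf("[%s] %s", type, msg))
    invisible(NULL)
  }
}
v <- create_verboser_sketch(trace_level = 2)
v("starting solver", level = 1, type = "step")   # printed
v("full Jacobian dump", level = 3)               # suppressed at trace_level = 2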
Empirical likelihood estimator
Description
Generic for the empirical likelihood (EL) estimator under NMAR.
Methods are provided for data.frame and survey.design.
Usage
el(data, ...)
Arguments
data |
A data.frame or a survey.design. |
... |
Passed to class-specific methods. |
See Also
Assert that terms object lacks offsets
Description
Assert that terms object lacks offsets
Usage
el_assert_no_offset(terms_obj, label)
Strata augmentation for survey designs
Description
Augments the auxiliary design with strata dummies (dropping one level) and
appends stratum-share means when user-supplied auxiliary_means are
present. This is the Wu-style strategy of adding stratum indicators to the
auxiliary calibration block in pseudo empirical likelihood for surveys.
Usage
el_augment_strata_aux(
aux_design_full,
strata_factor,
weights_full,
N_pop,
auxiliary_means
)
Empirical likelihood estimating equations
Description
Empirical likelihood estimating equations
Usage
el_build_equation_system(
family,
missingness_model_matrix,
auxiliary_matrix,
respondent_weights,
N_pop,
n_resp_weighted,
mu_x_scaled
)
Details
Returns a function that evaluates the stacked EL system for
\theta = (\beta, z, \lambda_x) with z = \operatorname{logit}(W).
Blocks correspond to: (i) missingness (response) model score equations in \beta,
(ii) the response-rate equation in W, and (iii) auxiliary moment
constraints in \lambda_x. When no auxiliaries are present the last
block is omitted. The system matches Qin, Leung, and Shao (2002, Eqs. 7-10)
with empirical masses m_i = d_i/D_i(\theta), D_i as in the paper.
We cap \eta, clip w_i in ratios, and guard D_i away from zero to
ensure numerical stability; these safeguards are applied consistently in
equations, Jacobian, and post-solution weights.
Guarding policy (must remain consistent across equations/Jacobian/post):
- Cap \eta: eta <- pmax(pmin(eta, get_eta_cap()), -get_eta_cap()).
- Compute w <- family$linkinv(eta) and clip to [1e-12, 1 - 1e-12] when used in ratios.
- Denominator floor: Di <- pmax(Di_raw, nmar_get_el_denom_floor()). In the Jacobian, terms that depend on d(1/Di)/d(.) are multiplied by active = 1(Di_raw > floor) to match the clamped equations.
The score with respect to the linear predictor uses the Bernoulli form
s_{\eta,i}(\beta) = \partial \log w_i / \partial \eta_i = \mu.\eta(\eta_i)/w_i,
which is valid for both logit and probit links when w_i is clipped.
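The guarding policy above can be written as small standalone helpers; in this sketch get_eta_cap() and nmar_get_el_denom_floor() are replaced by illustrative numeric defaults:
guard_eta <- function(eta, eta_cap = 50) pmax(pmin(eta, eta_cap), -eta_cap)
clip_prob <- function(w) pmin(pmax(w, 1e-12), 1 - 1e-12)
floor_denom <- function(Di_raw, floor = 1e-8) {
  list(Di = pmax(Di_raw, floor),
       active = as.numeric(Di_raw > floor))   # mask keeping the Jacobian consistent with the clamp
}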
References
Qin, J., Leung, D., and Shao, J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193-200.
Empirical likelihood equations for survey designs (design-weighted QLS system)
Description
Empirical likelihood equations for survey designs (design-weighted QLS system)
Usage
el_build_equation_system_survey(
family,
missingness_model_matrix,
auxiliary_matrix,
respondent_weights,
N_pop,
n_resp_weighted,
mu_x_scaled
)
Details
Returns a function that evaluates the stacked EL system for complex survey
designs using design weights. Unknowns are
\theta = (\beta, z, \lambda_W, \lambda_x) with z = \operatorname{logit}(W).
Blocks correspond to:
- response-model score equations in \beta,
- the response-rate equation in W based on \sum d_i (w_i - W)/D_i = 0,
- auxiliary moment constraints \sum d_i (X_i - \mu_x)/D_i = 0,
- and the design-based linkage between \lambda_W and the nonrespondent total: T_0/(1-W) - \lambda_W \sum d_i / D_i = 0, where T_0 = N_{\mathrm{pop}} - \sum d_i on the analysis scale.
When all design weights are equal and N_{\mathrm{pop}} and the respondent
count match the simple random sampling setup, this system reduces to the
Qin, Leung, and Shao (2002) equations (6)-(10).
Analytical Jacobian for empirical likelihood
Description
Analytical Jacobian for empirical likelihood
Usage
el_build_jacobian(
family,
missingness_model_matrix,
auxiliary_matrix,
respondent_weights,
N_pop,
n_resp_weighted,
mu_x_scaled
)
Details
Builds the block Jacobian A = \partial F/\partial \theta for the
EL system with \theta = (\beta, z, \lambda_x) and z = \operatorname{logit}(W).
Blocks follow Qin, Leung, and Shao (2002, Eqs. 7-10). The derivative with
respect to the linear predictor for the missingness (response) model uses the Bernoulli score form
\partial/\partial\eta\, \log w(\eta) = \mu.\eta(\eta)/w(\eta) with
link-inverse clipping. Denominator guards are applied consistently when
forming terms depending on D_i(\theta).
Guarding policy (must remain consistent across equations/Jacobian/post):
- Cap \eta: eta <- pmax(pmin(eta, get_eta_cap()), -get_eta_cap()).
- Compute w <- family$linkinv(eta) and clip to [1e-12, 1 - 1e-12] when used in ratios.
- Denominator floor: Di <- pmax(Di_raw, nmar_get_el_denom_floor()). Terms that depend on d(1/Di)/d(.) are multiplied by active = 1(Di_raw > floor) to match the clamped equations.
References
Qin, J., Leung, D., and Shao, J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193-200.
Analytical Jacobian for survey EL system (design-weighted QLS analogue)
Description
Analytical Jacobian for survey EL system (design-weighted QLS analogue)
Usage
el_build_jacobian_survey(
family,
missingness_model_matrix,
auxiliary_matrix,
respondent_weights,
N_pop,
n_resp_weighted,
mu_x_scaled
)
Details
Builds the block Jacobian A = \partial g/\partial \theta for the
survey EL system with \theta = (\beta, z, \lambda_W, \lambda_x) and
z = \operatorname{logit}(W). Blocks follow the design-weighted analogue
of Qin, Leung, and Shao (2002) used in el_build_equation_system_survey().
Guarding policy matches the IID Jacobian:
- cap eta: eta <- pmax(pmin(eta, get_eta_cap()), -get_eta_cap())
- compute w <- family$linkinv(eta) and clip to [1e-12, 1-1e-12] when used in ratios
- denominator floor: Di <- pmax(Di_raw, nmar_get_el_denom_floor()); multiply terms depending on d(1/Di)/d(.) by active = 1(Di_raw > floor)
The Jacobian uses the same score and second-derivative machinery as
el_build_jacobian(); when family$d2mu.deta2 is missing, this
function returns NULL and the solver falls back to numeric/Broyden
Jacobians.
Build EL result object (success or failure)
Description
Build EL result object (success or failure)
Usage
el_build_result(
core_results,
inputs,
call,
formula,
engine_name = "empirical_likelihood"
)
Build starting values for the EL solver (beta, z, lambda)
Description
Build starting values for the EL solver (beta, z, lambda)
Usage
el_build_start(
missingness_model_matrix_scaled,
auxiliary_matrix_scaled,
nmar_scaling_recipe,
start,
N_pop,
respondent_weights
)
Check auxiliary means consistency against respondents' sample support.
Description
Computes a simple z-score diagnostic comparing user-supplied auxiliary means to the respondents' sample means. The caller is responsible for comparing the returned maximum z-score to any desired threshold.
Usage
el_check_auxiliary_inconsistency_matrix(
auxiliary_matrix_resp,
provided_means = NULL
)
Arguments
auxiliary_matrix_resp |
Respondent-side auxiliary design matrix. |
provided_means |
Optional named numeric vector of auxiliary means aligned to the matrix columns. |
Value
list(max_z = numeric(1) or NA, cols = character())
Compute diagnostics at the EL solution
Description
Compute diagnostics at the EL solution
Usage
el_compute_diagnostics(
estimates,
equation_system_func,
analytical_jac_func,
post,
respondent_weights,
auxiliary_matrix_scaled,
K_beta,
K_aux,
X_centered
)
Variance driver for EL (bootstrap or none)
Description
Variance driver for EL (bootstrap or none)
Usage
el_compute_variance(
y_hat,
full_data,
formula,
N_pop,
variance_method,
bootstrap_reps,
standardize,
trim_cap,
on_failure,
auxiliary_means,
control,
start,
family
)
Core eta-state computation for EL engines
Description
Computes the capped linear predictor, response probabilities, derivatives, and stable scores with respect to the linear predictor for a given family. This helper centralizes the numerically delicate pieces (capping, clipping, Mills ratios, and score derivatives) and is used consistently across the EL equation system and analytical Jacobians for both IID and survey designs.
Usage
el_core_eta_state(family, eta_raw, eta_cap)
Arguments
family |
List-like response family bundle (e.g., as returned by logit_family()). |
eta_raw |
Numeric vector of unconstrained linear predictors. |
eta_cap |
Scalar cap applied symmetrically to eta_raw. |
Value
A list with components:
- eta: Capped linear predictor.
- w: Mean function family$linkinv(eta).
- w_clipped: w clipped to [1e-12, 1-1e-12] for use in ratios.
- mu_eta: Derivative family$mu.eta(eta).
- d2mu: Second derivative family$d2mu.deta2(eta) when available, otherwise NULL.
- s_eta: Score with respect to eta, using stable logit/probit forms where possible.
- ds_eta_deta: Derivative of s_eta with respect to eta when d2mu is available, otherwise NULL.
EL core helpers
Description
Internal helpers for solving and post-processing the EL system.
el_run_solver() orchestrates nleqslv::nleqslv() with a small,
deterministic fallback ladder; el_post_solution() computes masses and
the point estimate with
denominator guards and optional trimming.
Empirical likelihood for data frames (NMAR)
Description
Internal method dispatched by el() when data is a
data.frame. Returns c("nmar_result_el","nmar_result") with the
point estimate, optional
bootstrap SE, weights, coefficients, diagnostics, and metadata.
Usage
## S3 method for class 'data.frame'
el(
data,
formula,
auxiliary_means = NULL,
standardize = TRUE,
trim_cap = Inf,
control = list(),
on_failure = c("return", "error"),
variance_method = c("bootstrap", "none"),
bootstrap_reps = 500,
n_total = NULL,
start = NULL,
trace_level = 0,
family = logit_family(),
...
)
Arguments
data |
A data.frame. |
formula |
Two-sided formula |
auxiliary_means |
Named numeric vector of population means for auxiliary
design columns. Names must match the materialized model.matrix column names. |
standardize |
Logical; whether to standardize predictors prior to estimation. |
trim_cap |
Numeric; cap for EL weights ( |
control |
List; optional solver control parameters for nleqslv::nleqslv(). |
on_failure |
Character; one of "return" or "error". |
variance_method |
Character; one of "bootstrap" or "none". |
bootstrap_reps |
Integer; number of bootstrap reps if variance_method = "bootstrap". |
n_total |
Optional analysis-scale population total N_pop (required for respondents-only data). |
start |
Optional list of starting values passed to the solver helpers. |
trace_level |
Integer 0-3 controlling estimator logging detail. |
family |
Missingness (response) model family specification (defaults to the logit bundle). |
... |
Additional arguments passed to the solver. |
Details
Implements the empirical likelihood estimator for IID data with
optional auxiliary moment constraints. The missingness-model score is the
Bernoulli derivative with respect to the linear predictor, supporting logit
and probit links. When respondents-only data are supplied (no NA in the
outcome), n_total is required so the response-rate equation targets the
full sample size. When missingness is observed (NA present), the default
population total is nrow(data). If respondents-only data are used and
auxiliaries are requested, you must also provide population auxiliary
means via auxiliary_means. Result weights are the unnormalized EL
masses a_i / D_i(\theta) on the analysis scale, where a_i \equiv 1
for IID data.
Build denominator and floor pack
Description
Build denominator and floor pack
Usage
el_denominator(lambda_W, W, Xc_lambda, p_i, floor)
Arguments
lambda_W |
numeric scalar |
W |
numeric scalar in (0,1) |
Xc_lambda |
numeric vector (X_centered %*% lambda_x) or 0 |
p_i |
numeric vector of response probabilities |
floor |
numeric scalar > 0, denominator floor |
Value
list with denom, active, inv, inv_sq
Empirical likelihood (EL) engine for NMAR
Description
Constructs an engine specification for the empirical likelihood (EL) estimator of a full-data mean under nonignorable nonresponse (NMAR).
Usage
el_engine(
standardize = TRUE,
trim_cap = Inf,
on_failure = c("return", "error"),
variance_method = c("bootstrap", "none"),
bootstrap_reps = 500,
auxiliary_means = NULL,
control = list(),
strata_augmentation = TRUE,
n_total = NULL,
start = NULL,
family = c("logit", "probit")
)
Arguments
standardize |
logical; standardize predictors. Default TRUE. |
trim_cap |
numeric; cap for EL weights ( |
on_failure |
character; one of "return" or "error". |
variance_method |
character; one of "bootstrap" or "none". |
bootstrap_reps |
integer; number of bootstrap replicates when variance_method = "bootstrap". |
auxiliary_means |
named numeric vector; population means for auxiliary
design columns. Names must match the materialized model.matrix column names
on the first RHS (after formula expansion), e.g., factor indicator columns
created by |
control |
Optional solver configuration forwarded to nleqslv::nleqslv(). |
strata_augmentation |
logical; when TRUE (the default), augment the auxiliary block with stratum indicator dummies for stratified survey designs. |
n_total |
numeric; optional when supplying respondents-only data (no |
start |
list; optional starting point for the solver. Fields:
|
family |
Missingness (response) model family. Either "logit" or "probit". |
Details
The implementation follows Qin, Leung, and Shao (2002): the response
mechanism is modeled as w(y, x; \beta) = P(R = 1 \mid Y = y, X = x) and
the joint law of (Y, X) is represented nonparametrically by respondent
masses that satisfy empirical likelihood constraints. The mean is estimated
as a respondent weighted mean with weights proportional to
\tilde w_i = a_i / D_i(\beta, W, \lambda), where a_i are base
weights (a_i \equiv 1 for IID data and a_i = d_i for survey
designs) and D_i is the EL denominator.
For data.frame inputs the estimator solves the Qin-Leung-Shao (QLS)
estimating equations for (\beta, W, \lambda_x) with W reparameterized
as z = \operatorname{logit}(W), and profiles out the response multiplier
\lambda_W using the closed-form QLS identity (their Eq. 10). For
survey.design inputs the estimator uses a design-weighted analogue
(Chen and Sitter 1999; Wu 2005) with an explicit \lambda_W and an
additional linkage equation involving the nonrespondent design-weight total
T_0.
Numerical stability:
- W is optimized on the logit scale so 0 < W < 1.
- The response-model linear predictor is capped and EL denominators D_i are floored at a small positive value; the analytic Jacobian is consistent with this guard via an active-set mask.
- Optional trimming (trim_cap) is applied only after solving, to the unnormalized masses \tilde w_i = a_i/D_i; this changes the returned weights and therefore the point estimate.
Formula syntax and data constraints: nmar() accepts a
partitioned right-hand side y_miss ~ auxiliaries | response_only. Variables
left of | enter auxiliary moment constraints; variables right of |
enter only the response model. The outcome (LHS) is always included as a
response-model predictor through the evaluated LHS expression; explicit use of
the outcome on the RHS is rejected. The response model always includes an
intercept; the auxiliary block never includes an intercept.
To include a covariate in both the auxiliary constraints and the response
model, repeat it on both sides, e.g. y_miss ~ X | X.
Auxiliary means: If auxiliary_means = NULL (default) and the
outcome contains at least one NA, auxiliary means are estimated from the
full input and used as \bar X in the QLS constraints. For respondents-only
data (no NA in the outcome), n_total must be supplied; and if the
auxiliary RHS is non-empty, auxiliary_means must also be supplied.
When standardize = TRUE, supply auxiliary_means on the original
data scale; the engine applies the same standardization internally.
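A sketch of the respondents-only case (assuming a data frame df with columns Y_miss and X as in the Examples below, and a known population mean for X):
resp_df <- df[!is.na(df$Y_miss), ]          # respondents only: no NA left in the outcome
eng_known <- el_engine(
  auxiliary_means = c(X = 0),               # population mean of X, on the original data scale
  n_total = nrow(df),                       # analysis-scale population total
  variance_method = "none"
)
fit_known <- nmar(Y_miss ~ X | X, data = resp_df, engine = eng_known)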
Survey scale: For survey.design inputs, n_total (if
provided) must be on the same analysis scale as weights(design). The
default is sum(weights(design)).
Convergence and identification: the stacked EL system can have
multiple solutions. Adding response-only predictors (variables to the right
of |) can make the problem sensitive to starting values. Inspect
diagnostics such as jacobian_condition_number and consider supplying
start = list(beta = ..., W = ...) when needed.
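For example, a sketch of supplying starting values (the beta length shown is illustrative and must match the missingness-model design, i.e. intercept, outcome, and any response-only predictors):
eng_start <- el_engine(
  variance_method = "none",
  start = list(beta = c(0, 0, 0), W = 0.7)
)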
Variance: The EL engine supports bootstrap standard errors via
variance_method = "bootstrap" or can skip variance with
variance_method = "none".
Set a seed for reproducible bootstrap results.
Bootstrap requires suggested packages: for IID resampling it requires
future.apply (and future); for survey replicate-weight bootstrap
it requires survey and svrep.
Value
A list of class "nmar_engine_el" (also inheriting from "nmar_engine")
containing configuration fields to be supplied to nmar(). Users rarely
access fields directly; instead, pass the engine to nmar() together with a
formula and data.
References
Qin, J., Leung, D., and Shao, J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193-200. doi:10.1198/016214502753479338
Chen, J., and Sitter, R. R. (1999). A pseudo empirical likelihood approach for the effective use of auxiliary information in complex surveys. Statistica Sinica, 9, 385-406.
Wu, C. (2005). Algorithms and R codes for the pseudo empirical likelihood method in survey sampling. Survey Methodology, 31(2), 239-243.
See Also
nmar, weights.nmar_result, summary.nmar_result
Examples
set.seed(1)
n <- 200
X <- rnorm(n)
Y <- 2 + 0.5 * X + rnorm(n)
p <- plogis(-0.7 + 0.4 * scale(Y)[, 1])
R <- runif(n) < p
if (all(R)) R[1] <- FALSE
df <- data.frame(Y_miss = Y, X = X)
df$Y_miss[!R] <- NA_real_
# Estimate auxiliary mean from full data (QLS "use Xbar" case)
eng <- el_engine(auxiliary_means = NULL, variance_method = "none")
# Put X in both the auxiliary block and the response model (QLS-like)
fit <- nmar(Y_miss ~ X | X, data = df, engine = eng)
summary(fit)
# Response-only predictors can be placed to the right of |:
Z <- rnorm(n)
df2 <- data.frame(Y_miss = Y, X = X, Z = Z)
df2$Y_miss[!R] <- NA_real_
eng2 <- el_engine(auxiliary_means = NULL, variance_method = "none")
fit2 <- nmar(Y_miss ~ X | Z, data = df2, engine = eng2)
print(fit2)
# Survey design usage
if (requireNamespace("survey", quietly = TRUE)) {
des <- survey::svydesign(ids = ~1, weights = ~1, data = df)
eng3 <- el_engine(auxiliary_means = NULL, variance_method = "none")
fit3 <- nmar(Y_miss ~ X, data = des, engine = eng3)
summary(fit3)
}
Core Empirical Likelihood Estimator
Description
Implements the core computational engine for empirical likelihood estimation under nonignorable nonresponse, including parameter solving, variance calculation, and diagnostic computation.
Usage
el_estimator_core(
missingness_design,
aux_matrix,
aux_means,
respondent_weights,
analysis_data,
outcome_expr,
N_pop,
formula,
standardize,
trim_cap,
control,
on_failure,
family = logit_family(),
variance_method,
bootstrap_reps,
start = NULL,
trace_level = 0,
auxiliary_means = NULL
)
Arguments
missingness_design |
Respondent-side missingness (response) model design matrix (intercept + predictors). |
aux_matrix |
Auxiliary design matrix on respondents (may have zero columns). |
aux_means |
Named numeric vector of auxiliary population means (aligned to columns of |
respondent_weights |
Numeric vector of respondent weights aligned with |
analysis_data |
Data object used for logging and variance (survey designs supply the design object). |
outcome_expr |
Character string identifying the outcome expression displayed in outputs. |
N_pop |
Population size on the analysis scale. |
formula |
Original model formula used for estimation. |
standardize |
Logical. Whether to standardize predictors during estimation. |
trim_cap |
Numeric. Upper bound for empirical likelihood weight trimming. |
control |
List of control parameters for the nonlinear equation solver. |
on_failure |
Character. Action when solver fails: "return" or "error". |
family |
List. Link function specification (typically logit). |
variance_method |
Character. Variance estimation method. |
bootstrap_reps |
Integer. Number of bootstrap replications. |
auxiliary_means |
Named numeric vector of known population means supplied by the user (optional; used for diagnostics). |
Details
Orchestrates EL estimation for NMAR following Qin, Leung, and Shao (2002).
For data.frame inputs (IID setting) the stacked system in
(\beta, z, \lambda_x) with z = \mathrm{logit}(W) is solved by
nleqslv::nleqslv() using an analytic Jacobian. For survey.design inputs a
design-weighted analogue in (\beta, z, \lambda_W, \lambda_x) is solved
with an analytic Jacobian when the response family supplies second
derivatives, or with numeric/Broyden Jacobians otherwise. Numerical
safeguards are applied consistently across equations, Jacobian, and
post-solution weights: bounded linear predictors, probability clipping in
ratios, and a small floor on denominators D_i(\theta) with an
active-set mask in derivatives. After solving, unnormalized masses
d_i/D_i(\theta) are formed, optional trimming may be applied (with
normalization only for reporting), and optional variance is computed via
bootstrap when variance_method = "bootstrap".
Value
List containing estimation results, diagnostics, and metadata.
Extract a strata factor from a survey.design object
Description
Prefers strata already materialized in the survey.design object
(typically design$strata). When unavailable, attempts to reconstruct
strata from the original svydesign() call. When multiple
stratification variables are supplied, their interaction is used.
Usage
el_extract_strata_factor(design)
Compute lambda_W from C_const and W
Description
Compute lambda_W from C_const and W
Usage
el_lambda_W(C_const, W)
Arguments
C_const |
numeric scalar: (N_pop / sum(d_resp) - 1) |
W |
numeric scalar in (0,1) |
Log a step banner line
Description
Log a step banner line
Usage
el_log_banner(verboser, title)
Log data prep summary
Description
Log data prep summary
Usage
el_log_data_prep(
verboser,
outcome_var,
family_name,
K_beta,
K_aux,
auxiliary_names,
standardize,
is_survey,
N_pop,
n_resp_weighted
)
Log detailed diagnostics
Description
Log detailed diagnostics
Usage
el_log_detailed_diagnostics(
verboser,
beta_hat_unscaled,
W_hat,
lambda_W_hat,
lambda_hat,
denominator_hat
)
Log final summary
Description
Log final summary
Usage
el_log_final(verboser, y_hat, se)
Log solver configuration
Description
Log solver configuration
Usage
el_log_solver_config(verboser, control_top, final_control)
Log solver termination status
Description
Log solver termination status
Usage
el_log_solver_result(verboser, converged_success, solution, elapsed)
Log a short solver progress note
Description
Log a short solver progress note
Usage
el_log_solving(verboser)
Log starting values
Description
Log starting values
Usage
el_log_start_values(verboser, init_beta, init_z, init_lambda)
Log a short trace message with the chosen level
Description
Log a short trace message with the chosen level
Usage
el_log_trace(verboser, trace_level)
Log variance header and result
Description
Log variance header and result
Usage
el_log_variance_header(verboser, variance_method, bootstrap_reps)
Log weight diagnostics
Description
Log weight diagnostics
Usage
el_log_weight_diagnostics(verboser, W_hat, weights, trimmed_fraction)
EL masses and probabilities from denominators
Description
EL masses and probabilities from denominators
Usage
el_masses(weights, denom, floor, trim_cap)
Arguments
weights |
numeric respondent base weights (d_i) |
denom |
numeric denominators Di after floor guard |
floor |
numeric small positive guard (unused in core logic here, kept for API symmetry) |
trim_cap |
numeric cap (>0) or Inf |
Value
list with mass_untrim, mass_trimmed, prob_mass, trimmed_fraction
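A sketch consistent with the documented arguments and return value (the floor argument, unused in the core logic per the description above, is omitted):
el_masses_sketch <- function(weights, denom, trim_cap = Inf) {
  mass_untrim  <- weights / denom                        # unnormalized masses d_i / D_i
  mass_trimmed <- pmin(mass_untrim, trim_cap)            # optional trimming
  list(mass_untrim = mass_untrim,
       mass_trimmed = mass_trimmed,
       prob_mass = mass_trimmed / sum(mass_trimmed),     # normalized for reporting
       trimmed_fraction = mean(mass_untrim > trim_cap))
}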
Mean from probability masses
Description
Mean from probability masses
Usage
el_mean(prob_mass, y)
Prepare EL inputs for IID and survey designs
Description
Parses the two-part Formula, constructs EL design matrices, injects the respondent delta indicator, attaches weights and (optionally) survey metadata, and returns the pieces needed by the EL core. The outcome enters the missingness design only through the evaluated LHS expression; any explicit use of the outcome variable on RHS2 is rejected.
Usage
el_prepare_inputs(
formula,
data,
weights = NULL,
n_total = NULL,
design_object = NULL
)
Details
Invariants enforced here (relied on by all downstream EL code):
- LHS references exactly one outcome source variable in data; any transforms are applied via the formula environment and must be defined for all respondent rows.
- The outcome is never allowed to appear on RHS1 (auxiliaries) or RHS2 (missingness predictors), either explicitly in the formula or implicitly via dot (.) expansion. The missingness model uses the evaluated LHS expression as a dedicated predictor column instead.
- RHS1 always yields an intercept-free auxiliary design matrix with k-1 coding for factor auxiliaries, regardless of user +0/-1 syntax or custom contrasts. Auxiliary columns are validated to be fully observed and non-constant among respondents.
- RHS2 always yields a missingness-design matrix for respondents that includes an intercept column; zero-variance predictors only emit warnings (not errors), and NA among respondents is rejected.
- respondent_mask is defined from the raw outcome in data, not from the transformed LHS; an injected ..nmar_delta.. indicator in analysis_data matches this mask exactly.
- N_pop is the analysis-scale population size used in the EL system: for IID it is nrow(data) unless overridden by n_total; for survey designs it is sum(weights) or n_total when supplied.
Prepare nleqslv top-level args and control
Description
Prepare nleqslv top-level args and control
Usage
el_prepare_nleqslv(control)
EL auxiliary design resolution and population means
Description
Computes the respondent-side auxiliary matrix and the population means vector
used for centering X - \mu_x. When auxiliary_means is supplied, only
respondent rows are required to be fully observed; NA values are permitted on
nonrespondent rows. When auxiliary_means is NULL, auxiliaries must be fully
observed in the full data used to estimate population means.
Usage
el_resolve_auxiliaries(
aux_design_full,
respondent_mask,
auxiliary_means,
weights_full = NULL
)
Solver orchestration with staged policy
Description
Solver orchestration with staged policy
Usage
el_run_solver(
equation_system_func,
analytical_jac_func,
init,
final_control,
top_args,
solver_method,
use_solver_jac,
K_beta,
K_aux,
respondent_weights,
N_pop,
trace_level = 0
)
Arguments
equation_system_func |
Function mapping parameter vector to equation residuals. |
analytical_jac_func |
Analytic Jacobian function; may be NULL if unavailable or when forcing Broyden. |
init |
Numeric vector of initial parameter values. |
final_control |
List passed to |
top_args |
List of top-level |
solver_method |
Character; one of "auto", "newton", or "broyden". |
use_solver_jac |
Logical; whether to pass analytic Jacobian to Newton. |
K_beta |
Integer; number of response model parameters. |
K_aux |
Integer; number of auxiliary constraints. |
respondent_weights |
Numeric vector of base sampling weights. |
N_pop |
Numeric; population total (weighted when survey design). |
trace_level |
Integer; verbosity level (0 silent, 1-3 increasingly verbose). |
Empirical likelihood for survey designs (NMAR)
Description
Internal method dispatched by el() when data is a
survey.design.
Usage
## S3 method for class 'survey.design'
el(
data,
formula,
auxiliary_means = NULL,
standardize = TRUE,
strata_augmentation = TRUE,
trim_cap = Inf,
control = list(),
on_failure = c("return", "error"),
variance_method = c("bootstrap", "none"),
bootstrap_reps = 500,
n_total = NULL,
start = NULL,
trace_level = 0,
family = logit_family(),
...
)
Arguments
data |
A |
formula |
Two-sided formula with an NA-valued outcome on the LHS; auxiliaries on the first RHS and, optionally, missingness predictors on the second RHS partition. |
auxiliary_means |
Named numeric vector of population means for auxiliary
design columns. Names must match the materialized model.matrix column names. |
standardize |
Logical; standardize predictors. |
strata_augmentation |
Logical; when |
trim_cap |
Numeric; cap for EL weights ( |
control |
List; solver control for nleqslv::nleqslv(). |
on_failure |
Character; one of "return" or "error". |
variance_method |
Character; one of "bootstrap" or "none". |
bootstrap_reps |
Integer; reps when variance_method = "bootstrap". |
n_total |
Optional analysis-scale population size N_pop. |
start |
Optional list of starting values passed to solver helpers. |
trace_level |
Integer 0-3 controlling estimator logging detail. |
family |
Missingness (response) model family specification (defaults to logit). |
... |
Passed to solver. |
Details
Implements the empirical likelihood estimator with design weights.
If n_total is supplied, it is treated as the analysis-scale population
size N_pop used in the design-weighted QLS system. If n_total
is not supplied, sum(weights(design)) is used as N_pop. Design
weights are not rescaled internally; the EL equations use respondent weights
and N_pop via T_0 = N_{\mathrm{pop}} - \sum d_i in the linkage equation.
When respondents-only designs are used (no NA in the outcome), n_total
must be provided; if auxiliaries are requested you must also provide
population auxiliary means via auxiliary_means. Result weights are the
unnormalized EL masses d_i / D_i(\theta) on this analysis scale;
weights(result, scale = "population") sums to N_pop.
Value
c("nmar_result_el","nmar_result").
EL utility helpers
Description
Internal helpers for auxiliary consistency checks and shared validation routines used during input parsing.
Validate design spec dimensions
Description
Validate design spec dimensions
Usage
el_validate_design_spec(design, data_nrow)
Validate matrix columns for NA and zero variance
Description
Validate matrix columns for NA and zero variance
Usage
el_validate_matrix(
mat,
allow_na,
label,
severity,
row_map = NULL,
scope_note = NULL,
plural_label = FALSE
)
Enforce (near-)nonnegativity of weights
Description
Softly enforces nonnegativity of a numeric weight vector. Large negative values (beyond a tolerance) are treated as errors; small negative values (for example, from numerical noise) are truncated to zero.
Usage
enforce_nonneg_weights(weights, tol = 1e-08)
Arguments
weights |
numeric vector of weights. |
tol |
numeric tolerance below which negative values are treated as numerical noise and clipped to zero. |
Details
Values below -tol are treated as clearly negative. Values in
[-tol, 0) are clipped to zero.
Value
A list with components:
- ok: logical; TRUE if no clearly negative weights were found.
- message: character; diagnostic message when ok is FALSE, otherwise NULL.
- weights: numeric vector of adjusted weights (original if ok is FALSE, otherwise with small negatives clipped to zero).
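A minimal sketch matching the documented behavior:
enforce_nonneg_weights_sketch <- function(weights, tol = 1e-08) {
  if (any(weights < -tol)) {
    return(list(ok = FALSE,
                message = "clearly negative weights detected",
                weights = weights))
  }
  list(ok = TRUE, message = NULL, weights = pmax(weights, 0))   # clip small negatives to zero
}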
Extract engine configuration
Description
Returns the underlying configuration of an engine as a named list. This is intended for programmatic inspection (e.g., parameter tuning, logging). The returned object should be treated as read-only.
Usage
engine_config(x)
Arguments
x |
An object inheriting from class 'nmar_engine'. |
Value
A named list of configuration fields.
Canonical engine name
Description
Returns a stable, machine-friendly identifier for an engine object. This identifier is also used in 'nmar_result$meta$engine_name' to keep a consistent naming scheme between configurations and results.
Usage
engine_name(x)
Arguments
x |
An object inheriting from class 'nmar_engine'. |
Value
A single character string, e.g. "empirical_likelihood".
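For example:
eng <- el_engine(variance_method = "none")
engine_name(eng)          # e.g. "empirical_likelihood" for the EL engine
str(engine_config(eng))   # named list of configuration fields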
Exponential tilting estimator
Description
Generic for the exponential tilting (ET) estimator under NMAR. Methods are provided for 'data.frame' and 'survey.design'.
Usage
exptilt(data, ...)
Arguments
data |
A 'data.frame' or a 'survey.design'. |
... |
Passed to class-specific methods. |
Value
An engine-specific NMAR result object (for example
nmar_result_exptilt).
See Also
'exptilt.data.frame()', 'exptilt.survey.design()', 'exptilt_engine()'
Exponential tilting (ET) engine for NMAR estimation
Description
Constructs a configuration for the exponential tilting estimator under
nonignorable nonresponse (NMAR).
The estimator solves S_2(\boldsymbol{\phi}, \hat{\boldsymbol{\gamma}}) = 0 using nleqslv within an EM-type algorithm.
Usage
exptilt_engine(
standardize = FALSE,
on_failure = c("return", "error"),
variance_method = c("bootstrap", "none"),
bootstrap_reps = 10,
supress_warnings = FALSE,
control = list(),
family = c("logit", "probit"),
y_dens = c("normal", "lognormal", "exponential", "binomial"),
stopping_threshold = 1,
sample_size = 2000
)
Arguments
standardize |
logical; standardize predictors. Default |
on_failure |
character; |
variance_method |
character; one of |
bootstrap_reps |
integer; number of bootstrap replicates when
|
supress_warnings |
Logical; suppress variance-related warnings. |
control |
Named list of control parameters passed to
See |
family |
character; response model family, either |
y_dens |
Outcome density model ( |
stopping_threshold |
Numeric; early stopping threshold. If the maximum absolute value of the score function falls below this threshold, the algorithm stops early (default: 1). |
sample_size |
Integer; maximum sample size for stratified random sampling (default: 2000). When the dataset exceeds this size, a stratified random sample is drawn to optimize memory usage. The sampling preserves the ratio of respondents to non-respondents in the original data. |
Details
The method is a robust propensity-score adjustment (PSA) approach for data that are not missing at random (NMAR).
It uses maximum likelihood estimation (MLE), basing the likelihood on the observed part of the sample (f(\boldsymbol{Y}_i | \delta_i = 1, \boldsymbol{X}_i)), which makes it robust to outcome-model misspecification.
The propensity score is estimated by assuming an instrumental variable X_2 that is independent of the response status given the other covariates and the study variable.
The estimator computes fractional imputation weights w_i.
The final estimator is a weighted average in which the weights are the inverses of the estimated response probabilities \hat{\pi}_i, satisfying the estimating equation:
\sum_{i \in \mathcal{R}} \frac{\boldsymbol{g}(\boldsymbol{Y}_i, \boldsymbol{X}_i ; \boldsymbol{\theta})}{\hat{\pi}_i} = 0,
where \mathcal{R} is the set of observed respondents.
Value
An engine object of class c("nmar_engine_exptilt","nmar_engine").
This is a configuration list; it is not a fit. Pass it to nmar.
References
Riddles, M. K., Kim, J. K., and Im, J. (2016). A propensity-score-adjustment method for nonignorable nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215-245.
Examples
generate_test_data <- function(
n_rows = 500,
n_cols = 1,
case = 1,
x_var = 0.5,
eps_var = 0.9,
a = 0.8,
b = -0.2
) {
# Generate X variables - fixed to match comparison
X <- as.data.frame(replicate(n_cols, rnorm(n_rows, 0, sqrt(x_var))))
colnames(X) <- paste0("x", 1:n_cols)
# Generate Y - fixed coefficients to match comparison
eps <- rnorm(n_rows, 0, sqrt(eps_var))
if (case == 1) {
# Use fixed coefficient of 1 for all x variables to match: y = -1 + x1 + epsilon
X$Y <- as.vector(-1 + as.matrix(X) %*% rep(1, n_cols) + eps)
}
else if (case == 2) {
X$Y <- -2 + 0.5 * exp(as.matrix(X) %*% rep(1, n_cols)) + eps
}
else if (case == 3) {
X$Y <- -1 + sin(2 * as.matrix(X) %*% rep(1, n_cols)) + eps
}
else if (case == 4) {
X$Y <- -1 + 0.4 * as.matrix(X)^3 %*% rep(1, n_cols) + eps
}
Y_original <- X$Y
# Missingness mechanism - identical to comparison
pi_obs <- 1 / (1 + exp(-(a + b * X$Y)))
# Create missing values
mask <- runif(nrow(X)) > pi_obs
mask[1] <- FALSE # Ensure at least one observation is not missing
X$Y[mask] <- NA
return(list(X = X, Y_original = Y_original))
}
res_test_data <- generate_test_data(n_rows = 500, n_cols = 1, case = 1)
x <- res_test_data$X
exptilt_config <- exptilt_engine(
y_dens = 'normal',
control = list(maxit = 10),
stopping_threshold = 0.1,
standardize = FALSE,
family = 'logit',
bootstrap_reps = 5
)
formula = Y ~ x1
res <- nmar(formula = formula, data = x, engine = exptilt_config, trace_level = 1)
summary(res)
Nonparametric Exponential Tilting (Internal Generic)
Description
Nonparametric Exponential Tilting (Internal Generic)
Usage
exptilt_nonparam(data, ...)
Arguments
data |
A data.frame or survey.design object |
... |
Other arguments passed to methods |
Value
An engine-specific NMAR result object for the nonparametric exponential tilting estimator.
Nonparametric exponential tilting (EM) engine for NMAR
Description
Constructs a configuration for the nonparametric exponential tilting estimator
under nonignorable nonresponse (NMAR).
This engine implements the "Fully Nonparametric Approach" from Appendix 2
of Riddles et al. (2016). The estimator uses an
Expectation-Maximization (EM) algorithm to directly estimate the
nonresponse odds O(x_1, y) for aggregated, categorical data.
Usage
exptilt_nonparam_engine(refusal_col = "", max_iter = 100, tol_value = 1e-06)
Arguments
refusal_col |
character; the column name in |
max_iter |
integer; the maximum number of iterations for the EM algorithm. |
tol_value |
numeric; the convergence tolerance for the EM algorithm. The loop stops when the sum of absolute changes in the odds matrix is less than this value. |
Details
This engine is designed for cases where all variables (outcomes Y,
response predictors X_1, and instrumental variables X_2) are categorical,
and the input data is pre-aggregated into strata.
The method assumes an instrumental variable X_2 is available. The
response probability is assumed to depend on X_1 and Y, but not
on X_2.
The EM algorithm iteratively solves for the nonresponse odds:
O^{(t+1)}(x_1^*, y^*) = \frac{M_{y^*x_1^*}^{(t)}}{N_{y^*x_1^*}}
where M_{y^*x_1^*}^{(t)} is the expected count of non-respondents
(calculated in the E-step) and N_{y^*x_1^*} is the observed count
of respondents for a given stratum (x_1, y).
The final output from the nmar call is an object containing
data_to_return, an aggregated data frame where the original
'refusal' counts have been redistributed into the outcome columns
(e.g., 'Voted_A', 'Voted_B') as expected non-respondent counts.
Value
An engine object of class c("nmar_engine_exptilt_nonparam","nmar_engine").
This is a configuration list; it is not a fit. Pass it to nmar.
References
Riddles, M. K., Kim, J. K., and Im, J. (2016). A propensity-score-adjustment method for nonignorable nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215-245. (See Appendix 2 for this specific method.)
Examples
# Test data (Riddles 2016, Table 9)
voting_data_example <- data.frame(
Gender = rep(c("Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female"), 1),
Age_group = c("20-29", "30-39", "40-49", ">=50", "20-29", "30-39", "40-49", ">=50"),
Voted_A = c(93, 104, 146, 560, 106, 129, 170, 501),
Voted_B = c(115, 233, 295, 350, 159, 242, 262, 218),
Other = c(4, 8, 5, 3, 8, 5, 5, 7),
Refusal = c(28, 82, 49, 174, 62, 70, 69, 211),
Total = c(240, 427, 495, 1087, 335, 446, 506, 937)
)
np_em_config <- exptilt_nonparam_engine(
refusal_col = "Refusal",
max_iter = 100,
tol_value = 0.001
)
# Formula: Y1 + Y2 + ... ~ X1_vars | X2_vars
# Here, Y = Voted_A, Voted_B, Other
# x1 = Gender (response model)
# x2 = Age_group (instrumental variable)
em_formula <- Voted_A + Voted_B + Other ~ Gender | Age_group
results_em_np <- nmar(
formula = em_formula,
data = voting_data_example,
engine = np_em_config,
trace_level = 0
)
# View the final adjusted counts
# (Original counts + expected non-respondent counts)
print(results_em_np$data_final)
Extract top-level nleqslv arguments from a control-like list
Description
Extract top-level nleqslv arguments from a control-like list
Usage
extract_nleqslv_top(ctrl)
Default fitted values for NMAR results
Description
Returns fitted response probabilities if available.
Usage
## S3 method for class 'nmar_result'
fitted(object, ...)
Arguments
object |
An 'nmar_result' object. |
... |
Ignored. |
Value
A numeric vector (possibly length 0).
One-line formatter for NMAR engines
Description
Returns a single concise line summarizing an engine configuration.
Usage
## S3 method for class 'nmar_engine'
format(x, ...)
Arguments
x |
An engine object inheriting from 'nmar_engine'. |
... |
Unused. |
Value
A length-1 character vector.
Default formula for NMAR results
Description
Returns the estimation formula if available.
Usage
## S3 method for class 'nmar_result'
formula(x, ...)
Arguments
x |
An 'nmar_result' object. |
... |
Ignored. |
Value
A formula or 'NULL'.
Generate conditional density
Description
Generate conditional density
Usage
generate_conditional_density(model)
Arguments
model |
An internal exptilt object |
Glance summary for NMAR results
Description
One-row diagnostics for NMAR fits.
Usage
## S3 method for class 'nmar_result'
glance(x, ...)
Arguments
x |
An object of class 'nmar_result'. |
... |
Ignored. |
Value
A one-row data frame with diagnostics and metadata.
Construct a logit response family bundle
Description
Construct a logit response family bundle
Usage
logit_family()
Value
A list with components name, linkinv, mu.eta,
d2mu.deta2, and score_eta.
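A sketch of a bundle with the documented components, using the logit relations described in the EL equation details (the package's actual implementation may differ):
logit_family_sketch <- function() {
  list(
    name = "logit",
    linkinv = stats::plogis,                          # w(eta)
    mu.eta = stats::dlogis,                           # dw/deta = w(1 - w)
    d2mu.deta2 = function(eta) {
      w <- stats::plogis(eta)
      w * (1 - w) * (1 - 2 * w)                       # second derivative of plogis
    },
    score_eta = function(eta) 1 - stats::plogis(eta)  # d log w / d eta = mu.eta / w for the logit link
  )
}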
Prefer explicit solver_args over control-provided top-level args
Description
Prefer explicit solver_args over control-provided top-level args
Usage
merge_nleqslv_top(solver_args, control_top)
Construct EL Engine Object
Description
Construct EL Engine Object
Usage
new_nmar_engine_el(engine)
Construct Result Object (parent helper)
Description
Builds an 'nmar_result' list using the shared schema and validates it. Engines must pass named fields; no legacy positional signature is supported.
Usage
new_nmar_result(...)
Details
Engine-level constructors should call this helper with named arguments rather
than assembling result lists by hand. At minimum, engines should supply
estimate (numeric scalar) and converged (logical). All other
fields are optional:
- estimate_name: label for the primary estimand (defaults to NA_character_ if omitted).
- se: standard error for the primary estimand (defaults to NA_real_ when not available).
- model, weights_info, sample, inference, diagnostics, meta, extra: lists that may be partially specified or NULL; validate_nmar_result() will back-fill missing subfields with safe defaults.
- class: engine-specific result subclass name, e.g. "nmar_result_el"; it is combined with the parent class "nmar_result".
Calling new_nmar_result() ensures that every engine returns objects
that satisfy the shared schema and are immediately compatible with parent
S3 methods such as vcov(), confint(), tidy(),
glance(), and weights().
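A sketch of an engine constructor calling the shared helper (the subclass name is hypothetical; field values are illustrative):
res <- new_nmar_result(
  estimate = 1.23,
  estimate_name = "mean",
  se = 0.11,
  converged = TRUE,
  class = "nmar_result_myengine"   # hypothetical engine-specific subclass
)
class(res)   # c("nmar_result_myengine", "nmar_result")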
Construct EL Result Object
Description
Construct EL Result Object
Usage
new_nmar_result_el(
y_hat,
se,
weights,
coefficients,
vcov,
converged,
diagnostics,
inputs,
nmar_scaling_recipe,
fitted_values,
call,
formula = NULL
)
Not Missing at Random (NMAR) Estimation
Description
High-level interface for NMAR estimation.
nmar() validates basic inputs and dispatches to an engine (for example
el_engine). The engine controls the estimation method and
interprets formula; see the engine documentation for model-specific
requirements.
Usage
nmar(formula, data, engine, trace_level = 0)
Arguments
formula |
A two-sided formula. Many engines support a partitioned
right-hand side via |
data |
A |
engine |
An NMAR engine configuration object, typically created by
|
trace_level |
Integer 0-3; controls verbosity during estimation (default 0). |
Value
An object of class "nmar_result" with an engine-specific subclass
(for example "nmar_result_el"). Use summary(),
se, confint(), weights(), coef(),
fitted(), and generics::tidy() / generics::glance() to
access estimates, standard errors, weights, and diagnostics.
See Also
el_engine, exptilt_engine,
exptilt_nonparam_engine, summary.nmar_result,
weights.nmar_result
Examples
set.seed(1)
n <- 200
x1 <- rnorm(n)
z1 <- rnorm(n)
y_true <- 0.5 + 0.3 * x1 + 0.2 * z1 + rnorm(n, sd = 0.3)
resp <- rbinom(n, 1, plogis(2 + 0.1 * y_true + 0.1 * z1))
if (all(resp == 1)) resp[sample.int(n, 1)] <- 0L
y_obs <- ifelse(resp == 1, y_true, NA_real_)
# Empirical likelihood engine
df_el <- data.frame(Y_miss = y_obs, X = x1, Z = z1)
eng_el <- el_engine(variance_method = "none")
fit_el <- nmar(Y_miss ~ X | Z, data = df_el, engine = eng_el)
summary(fit_el)
# Exponential tilting engine (illustrative)
dat_et <- data.frame(y = y_obs, x2 = z1, x1 = x1)
eng_et <- exptilt_engine(
y_dens = "normal",
family = "logit",
variance_method = "none"
)
fit_et <- nmar(y ~ x2 | x1, data = dat_et, engine = eng_et)
summary(fit_et)
# Survey design example (same outcome, random weights)
if (requireNamespace("survey", quietly = TRUE)) {
w <- runif(n, 0.5, 2)
des <- survey::svydesign(ids = ~1, weights = ~w,
data = data.frame(Y_miss = y_obs, X = x1, Z = z1))
eng_svy <- el_engine(variance_method = "none")
fit_svy <- nmar(Y_miss ~ X | Z, data = des, engine = eng_svy)
summary(fit_svy)
}
# Bootstrap variance usage
if (requireNamespace("future.apply", quietly = TRUE)) {
set.seed(2)
eng_boot <- el_engine(
variance_method = "bootstrap",
bootstrap_reps = 20
)
fit_boot <- nmar(Y_miss ~ X | Z, data = df_el, engine = eng_boot)
se(fit_boot)
}
S3 helpers for NMAR engine objects
Description
Lightweight, user-facing methods for engine configuration objects (class 'nmar_engine'). These improve discoverability and provide a consistent print surface across engines while keeping the objects as simple lists internally.
Design
- 'engine_name()' returns a canonical identifier used across the package (e.g., in 'nmar_result$meta$engine_name').
- 'print.nmar_engine()' provides a concise, readable summary of the engine configuration; engine-specific classes reuse the parent method unless they need to override it.
- 'engine_config()' returns the underlying configuration as a named list for programmatic inspection.
Format a number with fixed decimal places using nmar.digits
Description
Format a number with fixed decimal places using nmar.digits
Usage
nmar_fmt_num(x, digits = nmar_get_digits())
Format an abridged call line for printing
Description
Builds a concise one-line summary of the original call without materializing large objects (e.g., full data frames). Intended for use by print/summary methods.
Usage
nmar_format_call_line(x)
Details
Uses option 'nmar.show_call' (default TRUE). Width can be tuned via option 'nmar.call_width' (default 120), but the formatter aims to keep the line compact regardless of width.
Resolve global digits setting for printing
Description
Resolve global digits setting for printing
Usage
nmar_get_digits()
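A small sketch of the interaction with the 'nmar.digits' option (assuming these formatting helpers are accessible; the option name is taken from the titles above):
old <- options(nmar.digits = 3)   # request three decimal places for printed numbers
nmar_get_digits()                 # resolves to 3
nmar_fmt_num(pi)                  # formats pi with three fixed decimal places
options(old)                      # restore the previous setting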
EL denominator floor (global, consistent)
Description
Returns the small positive floor δ used to guard the empirical
likelihood denominator D_i(θ) away from zero. This guard must be
applied consistently in the estimating equations, analytic Jacobian, and
post-solution weight construction. Advanced users can override via
'options(nmar.el_denom_floor = 1e-8)'.
Usage
nmar_get_el_denom_floor()
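An illustrative sketch of applying such a floor (hypothetical denominator values; the fallback default below is only for illustration, not the package's internal code):
delta <- getOption("nmar.el_denom_floor", 1e-8)   # user-overridable floor (fallback illustrative)
D <- c(0.7, 4e-12, 0.3)                           # hypothetical denominators D_i(theta)
D_guarded <- pmax(D, delta)                       # keep every denominator away from zero
1 / D_guarded                                     # downstream ratios stay finite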
NMAR numeric settings
Description
NMAR numeric settings
Usage
nmar_get_numeric_settings()
Details
Centralized access to numeric thresholds used across the package. These settings can be overridden via options() for advanced users:
- 'nmar.eta_cap': scalar > 0. Caps the response-model linear predictor to avoid extreme link values in Newton updates. Default 50.
- 'nmar.grad_eps': finite-difference step size epsilon for numeric gradients of smooth functionals. Default 1e-6.
- 'nmar.grad_d': relative step adjustment for numeric gradients. Default 1e-3.
The defaults are chosen to be conservative and stable across typical NMAR problems. Engines should retrieve values via this helper rather than hard-coding numbers, so documentation stays consistent.
Value
A named list with entries 'eta_cap', 'grad_eps', and 'grad_d'.
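A short sketch of overriding one threshold and reading the resolved settings back (assuming the helper is accessible):
old <- options(nmar.eta_cap = 100)   # loosen the linear-predictor cap
s <- nmar_get_numeric_settings()     # named list: eta_cap, grad_eps, grad_d
s$eta_cap
options(old)                         # restore the previous value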
Internal helpers for nmar_result objects
Description
Internal helpers for nmar_result objects
Usage
nmar_result_get_estimate(x)
Parent S3 surface for NMAR results
Description
Methods that apply to the parent 'nmar_result' class and are not specific to a particular engine (e.g., EL). Engines return a child class (e.g., 'nmar_result_el') that inherits from 'nmar_result' and may override or extend behavior.
Details
S3 surface for base 'nmar_result'
Result objects expose a universal schema:
- 'y_hat', 'estimate_name', 'se', 'converged'.
- 'model': list with 'coefficients', 'vcov', plus optional extras.
- 'weights_info': list with respondent weights and trimming metadata.
- 'sample': list with total units, respondent count, survey flag, and 'design'.
- 'inference': variance metadata ('variance_method', 'df', diagnostic flags).
- 'diagnostics', 'meta', and 'extra' for estimator-specific details.
New engines should populate these components in their constructors and rely on the 'nmar_result_get_*' utilities when implementing child-specific S3 methods.
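A sketch of inspecting the shared schema on a fitted result, reusing 'fit_el' from the nmar() examples (component names as listed above):
fit_el$y_hat                        # primary estimate
fit_el$model$coefficients           # response-model coefficients
fit_el$sample$n_respondents         # respondent count
fit_el$inference$variance_method    # how (or whether) the variance was estimated
str(fit_el$weights_info)            # respondent weights and trimming metadata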
Shared scaling infrastructure for NMAR engines
Description
Centralized feature scaling and parameter unscaling routines used by NMAR estimation engines to ensure consistent, numerically stable behavior.
Goals
Provide an engine-agnostic API for standardizing design matrices and auxiliary moments before solving.
Return a minimal scaling recipe (nmar_scaling_recipe) used to unscale coefficients and covariance matrices after solving.
Inputs/Outputs
- Inputs: Z_un (response model matrix with intercept), optional X_un (auxiliary model matrix, no intercept), optional named mu_x_un (auxiliary means on the original scale), and a logical standardize flag.
- Outputs: scaled matrices Z, X, and mu_x, plus an nmar_scaling_recipe used later for unscaling.
Integration pattern
- Before solving: call validate_and_apply_nmar_scaling() (engine-level) or prepare_nmar_scaling() (low-level) to obtain scaled matrices and a recipe.
- Solve in the scaled space.
- After solving: call unscale_coefficients() to unscale coefficients and their covariance matrices.
- Store the nmar_scaling_recipe in results for diagnostics and reproducibility.
Notes
- The intercept column is never scaled.
- Columns with near-zero variance are centered but assigned sd = 1 so that the corresponding parameter is not inflated by division by a very small standard deviation.
- Engines may use design-weighted scaling via the weights and weight_mask arguments.
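A compact sketch of this integration pattern with hypothetical matrices; prepare_nmar_scaling() and unscale_coefficients() are documented below but may be internal-only helpers, and the intercept column name used here is an assumption:
set.seed(42)
Z_un <- cbind("(Intercept)" = 1, z1 = rnorm(30))   # response model matrix, intercept first
X_un <- cbind(x1 = rnorm(30))                      # auxiliary matrix, no intercept
mu_x_un <- c(x1 = 0)                               # auxiliary means on the original scale
prep <- prepare_nmar_scaling(Z_un, X_un, mu_x_un, standardize = TRUE)
str(prep$recipe)                                   # the nmar_scaling_recipe to store with results
# ... solve in the scaled space using prep$Z, prep$X, prep$mu_x, then e.g.:
# back <- unscale_coefficients(beta_scaled, vcov_scaled, prep$recipe)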
Polish Household Budget Data with Simulated Nonignorable Nonresponse
Description
This dataset is derived from the 'h05' dataset (Polish household budgets for 2005) found in the 'RClas' package. The original data was cleaned to remove all rows with missing values.
Usage
polish_households
Format
A data frame with 19,330 rows and 17 columns. The key variables are:
- class
TODO
- voi
TODO
- bio
TODO
- type
TODO
- d345
TODO
- d347
TODO
- d348
TODO
- d36
TODO
- d38
TODO
- d61
TODO
- noper
TODO
- income
TODO
- expenditure
TODO
- y_exp
Numeric. The **true** scaled expenditure ('expenditure / mean(expenditure)'). This is the complete study variable without missingness.
- resp
TODO
- R
Integer. The simulated response indicator (1=responded, 0=nonresponse).
- y_exp_miss
Numeric. The **observed** scaled expenditure, containing 7,778 'NA' values where 'R = 0'. This is the variable to be used as the NMAR-affected outcome.
Details
To create a realistic test case for nonignorable nonresponse (NMAR), a nonresponse mechanism was simulated and applied to the scaled expenditure variable ('y_exp').
The key simulation steps were (see the sketch at the end of this entry):
1. 'y_exp' (true study variable) was created by scaling total expenditure.
2. A true response probability ('resp') was created using the logistic model 'plogis(1 - 0.6 * y_exp)'.
3. A response indicator ('R') was simulated based on this probability.
4. The final variable 'y_exp_miss' was generated by setting 'y_exp' to 'NA' wherever 'R' was 0.
The response is **nonignorable** because the probability of missingness depends directly on the value of the expenditure variable itself.
Source
TODO
See Also
'riddles_case1', 'riddles_case2', 'riddles_case3', 'riddles_case4'
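A minimal sketch of the simulated nonresponse mechanism described in the Details (the seed is hypothetical, so these draws will not reproduce the shipped 'resp', 'R', and 'y_exp_miss' columns exactly):
data("polish_households", package = "NMAR")
set.seed(123)                                                # hypothetical seed
y_exp  <- with(polish_households, expenditure / mean(expenditure))
p_resp <- plogis(1 - 0.6 * y_exp)                            # true response probability
R_sim  <- rbinom(length(y_exp), 1, p_resp)                   # simulated response indicator
y_miss <- ifelse(R_sim == 1, y_exp, NA_real_)                # NMAR-affected outcome
mean(R_sim == 0)                                             # share of simulated nonresponse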
Prepare scaled matrices and moments (low-level)
Description
Prepare scaled matrices and moments (low-level)
Usage
prepare_nmar_scaling(
Z_un,
X_un,
mu_x_un,
standardize,
weights = NULL,
weight_mask = NULL
)
Arguments
Z_un |
response model matrix (with intercept column). |
X_un |
auxiliary model matrix (no intercept), or NULL. |
mu_x_un |
named numeric vector of auxiliary means on the original scale
(names must match the column names of X_un). |
standardize |
logical; apply standardization if TRUE. |
weights |
Optional numeric vector used for weighted scaling. |
weight_mask |
Optional logical mask or nonnegative numeric multipliers
applied to |
Value
A list with components Z, X, mu_x, and
recipe.
Print method for NMAR engines
Description
Provides a compact, human-friendly summary for 'nmar_engine' objects. Child classes inherit this method; they can override it if they need a different presentation.
Usage
## S3 method for class 'nmar_engine'
print(x, ...)
Arguments
x |
An engine object inheriting from 'nmar_engine'. |
... |
Unused. |
Value
'x', invisibly.
Print method for nmar_result
Description
Print method for nmar_result
Usage
## S3 method for class 'nmar_result'
print(x, ...)
Arguments
x |
nmar_result object |
... |
Additional parameters |
Value
'x', invisibly.
Print method for EL results
Description
Compact print for objects of class nmar_result_el.
Usage
## S3 method for class 'nmar_result_el'
print(x, ...)
Arguments
x |
An object of class nmar_result_el. |
... |
Ignored. |
Value
x, invisibly.
Print method for Exponential Tilting results (engine-specific)
Description
This print method is tailored for 'nmar_result_exptilt' objects and shows a concise, human-friendly summary of the estimation result together with exptilt-specific diagnostics (loss, iterations) and a compact view of the response coefficients stored in the fitted model.
Usage
## S3 method for class 'nmar_result_exptilt'
print(x, ...)
Arguments
x |
An object of class 'nmar_result_exptilt'. |
... |
Ignored. |
Value
'x', invisibly.
Print method for summary.nmar_result
Description
Print method for summary.nmar_result
Usage
## S3 method for class 'summary_nmar_result'
print(x, ...)
Arguments
x |
summary_nmar_result object |
... |
Additional parameters |
Value
'x', invisibly.
Construct a probit response family bundle
Description
Construct a probit response family bundle
Usage
probit_family()
Value
A list with components name, linkinv, mu.eta,
d2mu.deta2, and score_eta.
Riddles Simulation, Case 1: Linear Mean
Description
A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 1) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.
Usage
riddles_case1
Format
A data frame with 500 rows and 4 variables:
- x
Numeric. The auxiliary variable, x ~ Normal(0, 0.5).
- y
Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.
- y_true
Numeric. The complete, true value of 'y' before missingness was introduced.
- delta
Integer. The response indicator (1 = responded, 0 = nonresponse).
Details
This dataset was generated using the following model parameters (n = 500):
- Density for x:
x ~ Normal(mean = 0, variance = 0.5)
- Density for error:
e ~ Normal(mean = 0, variance = 0.9)
- True Model (Case 1):
y_true = -1 + x + e
- Response Model (NMAR):
logit(pi) = 0.8 - 0.2 * y_true
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(1), 1-31.
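An illustrative regeneration of this design; the documented variances are converted to standard deviations, and the seed is hypothetical, so values will not match the shipped data exactly:
set.seed(1)                                    # hypothetical seed
n <- 500
x      <- rnorm(n, mean = 0, sd = sqrt(0.5))   # Var(x) = 0.5
e      <- rnorm(n, mean = 0, sd = sqrt(0.9))   # Var(e) = 0.9
y_true <- -1 + x + e                           # Case 1: linear mean
p_resp <- plogis(0.8 - 0.2 * y_true)           # NMAR response probability
delta  <- rbinom(n, 1, p_resp)                 # response indicator
y      <- ifelse(delta == 1, y_true, NA_real_)
sim_case1 <- data.frame(x, y, y_true, delta)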
Riddles Simulation, Case 2: Exponential Mean
Description
A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 2) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.
Usage
riddles_case2
Format
A data frame with 500 rows and 4 variables:
- x
Numeric. The auxiliary variable, x ~ Normal(0, 0.5).
- y
Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.
- y_true
Numeric. The complete, true value of 'y' before missingness was introduced.
- delta
Integer. The response indicator (1 = responded, 0 = nonresponse).
Details
This dataset was generated using the following model parameters (n = 500):
- Density for x:
x ~ Normal(mean = 0, variance = 0.5)
- Density for error:
e ~ Normal(mean = 0, variance = 0.9)
- True Model (Case 2):
y_true = -2 + 0.5 * exp(x) + e
- Response Model (NMAR):
logit(pi) = 0.8 - 0.2 * y_true
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(1), 1-31.
Riddles Simulation, Case 3: Sine Wave Mean
Description
A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 3) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.
Usage
riddles_case3
Format
A data frame with 500 rows and 4 variables:
- x
Numeric. The auxiliary variable, x ~ Normal(0, 0.5).
- y
Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.
- y_true
Numeric. The complete, true value of 'y' before missingness was introduced.
- delta
Integer. The response indicator (1 = responded, 0 = nonresponse).
Details
This dataset was generated using the following model parameters (n = 500):
- Density for x:
x ~ Normal(mean = 0, variance = 0.5)
- Density for error:
e ~ Normal(mean = 0, variance = 0.9)
- True Model (Case 3):
y_true = -1 + sin(2 * x) + e
- Response Model (NMAR):
logit(pi) = 0.8 - 0.2 * y_true
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(1), 1-31.
Riddles Simulation, Case 4: Cubic Mean
Description
A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 4) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.
Usage
riddles_case4
Format
A data frame with 500 rows and 4 variables:
- x
Numeric. The auxiliary variable, x ~ Normal(0, 0.5).
- y
Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.
- y_true
Numeric. The complete, true value of 'y' before missingness was introduced.
- delta
Integer. The response indicator (1 = responded, 0 = nonresponse).
Details
This dataset was generated using the following model parameters (n = 500):
- Density for x:
x ~ Normal(mean = 0, variance = 0.5)
- Density for error:
e ~ Normal(mean = 0, variance = 0.9)
- True Model (Case 4):
y_true = -1 + 0.4 * x^3 + e
- Response Model (NMAR):
logit(pi) = 0.8 - 0.2 * y_true
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(1), 1-31.
Run method for EL engine
Description
Run method for EL engine
Usage
## S3 method for class 'nmar_engine_el'
run_engine(engine, formula, data, trace_level = 0)
Arguments
engine |
An object of class nmar_engine_el. |
formula |
A two-sided formula passed through by nmar(). |
data |
A data.frame or a survey design object. |
trace_level |
Integer 0-3 controlling verbosity. |
Value
An object of class nmar_result_el (which also inherits from
nmar_result).
Sanitize nleqslv control list for compatibility
Description
Sanitize nleqslv control list for compatibility
Usage
sanitize_nleqslv_control(ctrl)
Map unscaled auxiliary multipliers to scaled space
Description
Map unscaled auxiliary multipliers to scaled space
Usage
scale_aux_multipliers(lambda_unscaled, recipe, columns)
Arguments
lambda_unscaled |
named numeric vector of auxiliary multipliers aligned to auxiliary design columns (no intercept) on original scale. |
recipe |
Scaling recipe of class nmar_scaling_recipe. |
columns |
character vector of auxiliary column names (order) for the scaled design. |
Value
numeric vector of multipliers in the scaled space.
Map unscaled coefficients to scaled space
Description
Map unscaled coefficients to scaled space
Usage
scale_coefficients(beta_unscaled, recipe, columns)
Arguments
beta_unscaled |
named numeric vector of coefficients for the response
model on the original scale, including an intercept named
|
recipe |
Scaling recipe of class nmar_scaling_recipe. |
columns |
character vector of column names (order) for the scaled design matrix (including intercept). |
Value
numeric vector of coefficients in the scaled space, ordered by
columns.
Extract standard error for NMAR results
Description
Returns the standard error of the primary mean estimate.
Usage
se(object, ...)
Arguments
object |
An 'nmar_result' or subclass. |
... |
Ignored. |
Value
Numeric scalar.
Weighted linear algebra helpers
Description
Weighted linear algebra helpers
Usage
shared_weighted_gram(X, w)
Summary method for nmar_result
Description
Summary method for nmar_result
Usage
## S3 method for class 'nmar_result'
summary(object, conf.level = 0.95, ...)
Arguments
object |
nmar_result object |
conf.level |
Confidence level for intervals. |
... |
Additional parameters |
Value
An object of class 'summary_nmar_result'.
Summary method for EL results
Description
Summarize the point estimate, its standard error, and the missingness-model coefficients.
Usage
## S3 method for class 'nmar_result_el'
summary(object, ...)
Arguments
object |
An object of class nmar_result_el. |
... |
Ignored. |
Value
An object of class summary_nmar_result_el.
Summary method for Exponential Tilting results (engine-specific)
Description
Summarize the point estimate, its standard error, and the model coefficients.
Usage
## S3 method for class 'nmar_result_exptilt'
summary(object, conf.level = 0.95, ...)
Arguments
object |
An object of class 'nmar_result_exptilt'. |
conf.level |
Confidence level for confidence interval (default 0.95). |
... |
Ignored. |
Value
An object of class 'summary_nmar_result_exptilt'.
Tidy summary for NMAR results
Description
Return a data frame with the primary estimate and (if available) missingness-model coefficients.
Usage
## S3 method for class 'nmar_result'
tidy(x, conf.level = 0.95, ...)
Arguments
x |
An object of class 'nmar_result'. |
conf.level |
Confidence level for the primary estimate. |
... |
Ignored. |
Value
A data frame with one row for the primary estimate and, when available, additional rows for the response-model coefficients.
Trim weights by capping and proportional redistribution
Description
Applies a cap to a nonnegative weight vector and, when feasible, redistributes excess mass across the remaining positive entries so that the total sum is preserved. When the requested cap is too tight to preserve the total mass, all positive entries are set to the cap and the total sum decreases.
Usage
trim_weights(weights, cap, tol = 1e-12, warn_tol = 1e-08)
Arguments
weights |
numeric vector of weights. |
cap |
positive numeric scalar; maximum allowed weight, or |
tol |
numeric tolerance used when testing whether a rescaling step respects the cap. |
warn_tol |
numeric tolerance used when testing whether the total sum has been preserved. |
Details
Zero weights remain zero; only entries that are positive after nonnegativity enforcement can absorb redistributed mass.
Internally, a simple water-filling style algorithm is used on the positive weights: the largest weights are successively saturated at the cap and the remaining weights are rescaled by a common factor chosen to maintain the total sum.
Value
A list with components:
- weights: numeric vector of trimmed weights.
- trimmed_fraction: fraction of entries at or very close to the cap (within tol).
- preserved_sum: logical; TRUE if the total sum of weights is preserved to within warn_tol.
- total_before: numeric; sum of the original weights.
- total_after: numeric; sum of the trimmed weights.
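A small usage sketch with hypothetical weights (the helper may be internal-only):
w <- c(1, 1, 1, 5)
res <- trim_weights(w, cap = 3)
res$weights          # the largest weight is capped at 3; the others are rescaled upward
res$preserved_sum    # should be TRUE here: the cap leaves room to keep the total at 8
c(res$total_before, res$total_after)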
Unscale regression coefficients and covariance
Description
Unscale regression coefficients and covariance
Usage
unscale_coefficients(scaled_coeffs, scaled_vcov, recipe)
Arguments
scaled_coeffs |
named numeric vector of coefficients estimated on the scaled space. |
scaled_vcov |
covariance matrix of scaled_coeffs. |
recipe |
Scaling recipe of class nmar_scaling_recipe. |
Value
A list with components coefficients and vcov.
Validate and apply scaling (engine-friendly)
Description
Validate and apply scaling (engine-friendly)
Usage
validate_and_apply_nmar_scaling(
standardize,
has_aux,
response_model_matrix_unscaled,
aux_matrix_unscaled,
mu_x_unscaled,
weights = NULL,
weight_mask = NULL
)
Arguments
standardize |
logical; apply standardization if TRUE. |
has_aux |
logical; whether the engine uses auxiliary constraints. |
response_model_matrix_unscaled |
response model matrix (with intercept). |
aux_matrix_unscaled |
auxiliary matrix (no intercept) or an empty matrix. |
mu_x_unscaled |
named auxiliary means on original scale, or NULL. |
weights |
Optional numeric vector used for weighted scaling. |
weight_mask |
Optional logical mask or nonnegative numeric multipliers
applied to |
Value
A list with components nmar_scaling_recipe,
response_model_matrix_scaled, auxiliary_matrix_scaled, and
mu_x_scaled.
Validate Data for NMAR Analysis
Description
A lightweight sanity check of the input data.
Usage
validate_data(data)
Arguments
data |
A data frame or a survey object. |
Value
Returns 'invisible(NULL)' on success, stopping with a descriptive error on failure.
Validate top-level nleqslv arguments (coerce invalid to defaults)
Description
Validate top-level nleqslv arguments (coerce invalid to defaults)
Usage
validate_nleqslv_top(top)
Validate EL Engine Settings
Description
Validate EL Engine Settings
Usage
validate_nmar_engine_el(engine)
Validate nmar_result structure
Description
Ensures both the child class and the parent schema are satisfied. The validator also back-fills defaults so downstream code can rely on the presence of optional components without defensive checks.
Usage
validate_nmar_result(x, class_name)
Details
This helper is the single authority on the 'nmar_result' schema. It expects
a list that already carries class c(class_name, "nmar_result") and
at least a primary estimate stored in y_hat. All other components are
optional; when they are NULL or missing, the validator supplies safe
defaults:
- Core scalars: se (numeric, default NA_real_), estimate_name (character, default NA_character_), converged (logical, default NA).
- model: list with coefficients and vcov, both defaulting to NULL.
- weights_info: list with values (default NULL) and trimmed_fraction (default NA_real_).
- sample: list with n_total, n_respondents, is_survey, and design, defaulted to missing/empty values.
- inference: list with variance_method, df, and message, all defaulted to missing values.
- diagnostics, meta, and extra: defaulted to empty lists, with meta carrying engine_name, call, and formula when unset.
Engine constructors should normally call new_nmar_result() rather than
invoking this function directly. new_nmar_result() attaches classes and
funnels all objects through validate_nmar_result() so downstream S3
methods can assume a consistent structure.
Variance-covariance for base NMAR results
Description
Variance-covariance for base NMAR results
Usage
## S3 method for class 'nmar_result'
vcov(object, ...)
Arguments
object |
An object of class 'nmar_result'. |
... |
Ignored. |
Value
A 1x1 numeric matrix (the variance of the primary estimate).
Aggregated Exit Poll Data for Gangdong-Gap (2012)
Description
This dataset contains the aggregated exit poll results for the Gangdong-Gap district in Seoul from the nineteenth South Korean legislative election (2012). The data is transcribed directly from Table 9 of Riddles, Kim, and Im (2016).
Usage
voting
Format
A data frame with 8 rows and 7 variables:
- Gender
Factor. The gender of the voter ("Male", "Female").
- Age_group
Character. The age group of the voter.
- Voted_A
Numeric. Count of respondents voting for Party A.
- Voted_B
Numeric. Count of respondents voting for Party B.
- Other
Numeric. Count of respondents voting for another party.
- Refusal
Numeric. Count of sampled individuals who refused to respond (this is the nonresponse count).
- Total
Numeric. Total individuals sampled in the group (Responders + Refusals).
Details
In the paper's application, 'Gender' is used as the nonresponse instrumental variable and 'Age_group' is the primary auxiliary variable.
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. *Journal of Survey Statistics and Methodology*, 4(1), 1–31. (Data from Table 9, p. 20).
Extract weights from an 'nmar_result'
Description
Return analysis weights stored in an 'nmar_result' as either probability-scale (summing to 1) or population-scale (summing to 'sample$n_total'). The function normalizes stored masses and attaches informative attributes.
Usage
## S3 method for class 'nmar_result'
weights(object, scale = c("probability", "population"), ...)
Arguments
object |
An 'nmar_result' object. |
scale |
One of '"probability"' (default) or '"population"'. |
... |
Additional arguments (ignored). |
Value
Numeric vector of weights with length equal to the number of respondents.
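A short sketch, reusing 'fit_el' from the nmar() examples:
w_prob <- weights(fit_el)                         # probability scale, sums to 1
w_pop  <- weights(fit_el, scale = "population")   # population scale, sums to sample$n_total
c(sum(w_prob), sum(w_pop))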