Type: Package
Title: Estimation under not Missing at Random Nonresponse
Version: 0.1.1
Description: Methods to estimate finite-population parameters under nonresponse that is not missing at random (NMAR, nonignorable). Incorporates auxiliary information and user-specified response models, and supports independent samples and complex survey designs via objects from the 'survey' package. Provides diagnostics and optional variance estimates. For methodological background see Qin, Leung and Shao (2002) <doi:10.1198/016214502753479338> and Riddles, Kim and Im (2016) <doi:10.1093/jssam/smv047>.
License: MIT + file LICENSE
URL: https://github.com/ncn-foreigners/NMAR, https://ncn-foreigners.ue.poznan.pl/NMAR/index.html
BugReports: https://github.com/ncn-foreigners/NMAR/issues
Encoding: UTF-8
Imports: stats, nleqslv, utils, generics, Formula
RoxygenNote: 7.3.3
Suggests: knitr, rmarkdown, testthat (>= 3.0.0), numDeriv, survey, svrep, broom, progressr, future, future.apply, spelling
VignetteBuilder: knitr
Config/testthat/edition: 3
Depends: R (>= 3.5)
LazyData: true
Language: en-US
NeedsCompilation: no
Packaged: 2026-01-10 15:12:54 UTC; runner
Author: Maciej Beresewicz [aut, cre], Igor Kołodziej [aut, ctb], Mateusz Iwaniuk [aut, ctb]
Maintainer: Maciej Beresewicz <maciej.beresewicz@ue.poznan.pl>
Repository: CRAN
Date/Publication: 2026-01-16 10:50:02 UTC

Apply scaling to a matrix using a recipe

Description

Apply scaling to a matrix using a recipe

Usage

apply_nmar_scaling(matrix_to_scale, recipe)

Arguments

matrix_to_scale

A numeric matrix with column names present in recipe.

recipe

An object of class nmar_scaling_recipe.

Value

A numeric matrix with each column centered and scaled using recipe.
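
As a rough illustration of what applying a recipe amounts to, the sketch below centers and scales matrix columns in base R. The internal layout of nmar_scaling_recipe is not documented in this section, so recipe_sketch (a named list with mean and sd entries) and apply_recipe_sketch() are hypothetical stand-ins, not the package objects.

```r
# Hypothetical stand-in for a scaling recipe: per-column mean and sd.
recipe_sketch <- list(
  mean = c("(Intercept)" = 0, x1 = 10, x2 = -2),
  sd   = c("(Intercept)" = 1, x1 = 2,  x2 = 0.5)
)

# Illustrative helper (not the NMAR implementation): center and scale
# each column of `m` using the recipe entries with the same name.
apply_recipe_sketch <- function(m, recipe) {
  stopifnot(is.matrix(m), is.numeric(m), !is.null(colnames(m)))
  missing_cols <- setdiff(colnames(m), names(recipe$mean))
  if (length(missing_cols) > 0) {
    stop("Columns missing from recipe: ", paste(missing_cols, collapse = ", "))
  }
  centered <- sweep(m, 2, recipe$mean[colnames(m)], "-")
  sweep(centered, 2, recipe$sd[colnames(m)], "/")
}

m <- cbind("(Intercept)" = 1, x1 = c(8, 10, 12), x2 = c(-2.5, -2, -1.5))
scaled <- apply_recipe_sketch(m, recipe_sketch)
```

Giving the intercept a mean of 0 and an sd of 1 leaves that column unchanged, which is one simple way to keep it unscaled.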


Shared bootstrap variance helpers

Description

Internal helpers to estimate the variance of a scalar estimator via bootstrap resampling (IID data) or bootstrap replicate weights (survey designs). Designed to be reused across NMAR engines.

Usage

bootstrap_variance(data, estimator_func, point_estimate, ...)

Arguments

data

A data.frame or a survey.design.

estimator_func

Function returning an object with a numeric scalar component y_hat and an optional logical component converged.

point_estimate

Numeric scalar; used for survey bootstrap variance (passed to survey::svrVar() as coef).

...

Additional arguments. Some are consumed by bootstrap_variance() itself (for example resample_guard for IID bootstrap or bootstrap_settings/bootstrap_options/bootstrap_type/bootstrap_mse for survey bootstrap); remaining arguments are forwarded to estimator_func.

Details

estimator_func is typically an engine-level estimator (for example the EL engine) and is called with the same arguments used for the point estimate, except that the data argument is replaced by the resampled data (IID) or a replicate-weighted survey.design (survey). Arguments reserved for the bootstrap implementation are stripped from ... before forwarding.

Bootstrap-specific options

resample_guard

IID bootstrap only. A function function(indices, data) that returns TRUE to accept a resample and FALSE to reject it.

bootstrap_settings

Survey bootstrap only. A list of arguments forwarded to svrep::as_bootstrap_design().

bootstrap_options

Alias for bootstrap_settings.

bootstrap_type

Shortcut for the type argument to svrep::as_bootstrap_design().

bootstrap_mse

Shortcut for the mse argument to svrep::as_bootstrap_design().
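
A resample_guard could look like the following sketch, which accepts a resample only when it contains both respondents and nonrespondents. The function name guard_mixed_response and the outcome column name Y_miss are assumptions made for this illustration; only the function(indices, data) signature and the TRUE/FALSE return convention come from the description above.

```r
# Example guard: accept a resample only if it contains at least one
# respondent and at least one nonrespondent in the outcome column.
# The column name "Y_miss" is an assumption made for this illustration.
guard_mixed_response <- function(indices, data) {
  y <- data$Y_miss[indices]
  any(is.na(y)) && any(!is.na(y))
}

df <- data.frame(Y_miss = c(1.2, NA, 0.7, NA, 2.1), X = 1:5)
only_respondents <- guard_mixed_response(c(1, 3, 5, 1, 3), df)  # rejected
both_groups <- guard_mixed_response(c(1, 2, 3, 4, 5), df)       # accepted
```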

Progress Reporting

If the optional progressr package is installed, bootstrap calls signal progress via a progressr::progressor inside progressr::with_progress(). Users control whether progress is shown (and how) by registering handlers with progressr::handlers(). When progressr is not installed or no handlers are active, bootstrap runs silently. Progress reporting is compatible with all future backends.

Reproducibility

For reproducible bootstrap results, always set a seed before calling the estimation function:

  set.seed(123)  # Set seed for reproducibility
  result <- nmar(Y ~ X, data = df,
                 engine = el_engine(variance_method = "bootstrap",
                                    bootstrap_reps = 500))
  

The future framework (via future.seed = TRUE in future.apply::future_lapply()) ensures that each bootstrap replicate uses an independent L'Ecuyer-CMRG random number stream derived from this seed. This gives reproducible results across supported future backends (sequential, multisession, cluster, and so on).


Bootstrap for IID data frames

Description

Bootstrap for IID data frames

Usage

bootstrap_variance.data.frame(
  data,
  estimator_func,
  point_estimate,
  bootstrap_reps = 500,
  ...
)

Arguments

data

A data.frame.

estimator_func

Function returning an object with a numeric scalar component y_hat and an optional logical component converged.

point_estimate

Unused for IID bootstrap; included for signature consistency.

bootstrap_reps

integer; number of resamples.

...

Additional arguments. Some are consumed by bootstrap_variance() itself (for example resample_guard for IID bootstrap or bootstrap_settings/bootstrap_options/bootstrap_type/bootstrap_mse for survey bootstrap); remaining arguments are forwarded to estimator_func.

Value

A list with components se, variance, and replicates.
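
The general shape of an IID bootstrap with this interface can be sketched in base R as below. This is a simplified, sequential illustration under stated assumptions, not the package code: iid_bootstrap_sketch(), its max_tries argument, and toy_estimator() are hypothetical, and the sketch does not use the future framework.

```r
# Illustrative IID bootstrap (not the NMAR implementation).
iid_bootstrap_sketch <- function(data, estimator_func, bootstrap_reps = 500,
                                 resample_guard = NULL, max_tries = 100, ...) {
  n <- nrow(data)
  replicates <- vapply(seq_len(bootstrap_reps), function(b) {
    # Redraw until the guard accepts; after max_tries the last draw is used.
    for (attempt in seq_len(max_tries)) {
      idx <- sample.int(n, n, replace = TRUE)
      if (is.null(resample_guard) || isTRUE(resample_guard(idx, data))) break
    }
    fit <- estimator_func(data = data[idx, , drop = FALSE], ...)
    ok <- is.null(fit$converged) || isTRUE(fit$converged)
    if (ok) fit$y_hat else NA_real_
  }, numeric(1))
  variance <- stats::var(replicates, na.rm = TRUE)
  list(se = sqrt(variance), variance = variance, replicates = replicates)
}

# Toy estimator: respondent mean of Y_miss (ignores the NMAR mechanism).
toy_estimator <- function(data, ...) {
  list(y_hat = mean(data$Y_miss, na.rm = TRUE), converged = TRUE)
}

set.seed(123)
df <- data.frame(Y_miss = c(rnorm(80, mean = 2), rep(NA, 20)))
res <- iid_bootstrap_sketch(df, toy_estimator, bootstrap_reps = 200)
```

The returned list mirrors the documented components se, variance, and replicates; how the package treats non-converged replicates is not stated here, so mapping them to NA is an assumption.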


Default method dispatch (internal safety net)

Description

Default method dispatch (internal safety net)

Usage

bootstrap_variance.default(data, estimator_func, point_estimate, ...)

Bootstrap for survey designs via replicate weights

Description

Bootstrap for survey designs via replicate weights

Usage

bootstrap_variance.survey.design(
  data,
  estimator_func,
  point_estimate,
  bootstrap_reps = 500,
  survey_na_policy = c("strict", "omit"),
  ...
)

Arguments

data

A survey.design.

estimator_func

Function returning an object with a numeric scalar component y_hat and an optional logical component converged.

point_estimate

Numeric scalar; used for survey bootstrap variance (passed to survey::svrVar() as coef).

bootstrap_reps

integer; number of bootstrap replicates.

survey_na_policy

Character string specifying how to handle replicates that fail to produce estimates. Options:

"strict"

(default) Any failed replicate causes an error. This is a conservative default that makes instability explicit.

"omit"

Failed replicates are omitted. The corresponding rscales are also omitted to maintain correct variance scaling. Use with caution: if failures are non-random, variance may be biased.

...

Additional arguments. Some are consumed by bootstrap_variance() itself (for example resample_guard for IID bootstrap or bootstrap_settings/bootstrap_options/bootstrap_type/bootstrap_mse for survey bootstrap); remaining arguments are forwarded to estimator_func.

Details

This path constructs a replicate-weight design using svrep::as_bootstrap_design() and evaluates the estimator on each set of bootstrap replicate analysis weights.

Replicate evaluation starts from a shallow template copy of the input survey design (including its ids/strata/fpc structure) and injects each replicate's analysis weights by updating the design's probability slots (prob/allprob) so that weights(design) returns the desired replicate weights (with zero weights represented as prob = Inf). This avoids replaying or reconstructing a svydesign() call and therefore supports designs created via subset() and update().

NA policy: By default, survey bootstrap uses a strict NA policy: if any replicate fails to produce a finite estimate, the entire bootstrap fails with an error. Setting survey_na_policy = "omit" drops failed replicates (and their corresponding rscales) and proceeds with the remaining replicates.
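
The difference between the two NA policies can be illustrated with a small base-R sketch of a replicate-weight variance calculation. It is an assumption-laden illustration, not the package code: replicate_variance_sketch() is hypothetical, the scale and rscales values are made up, and centering on the point estimate (the mse-style form) is chosen here for simplicity.

```r
# Illustrative replicate-variance calculation (not the NMAR implementation).
# `scale` and `rscales` play the role of the replicate-design scaling
# factors; the values used below are made up for the example.
replicate_variance_sketch <- function(thetas, point_estimate, scale, rscales,
                                      na_policy = c("strict", "omit")) {
  na_policy <- match.arg(na_policy)
  failed <- !is.finite(thetas)
  if (any(failed)) {
    if (na_policy == "strict") {
      stop(sum(failed), " replicate(s) failed to produce a finite estimate")
    }
    # "omit": drop failed replicates together with their rscales
    thetas <- thetas[!failed]
    rscales <- rscales[!failed]
  }
  variance <- scale * sum(rscales * (thetas - point_estimate)^2)
  list(se = sqrt(variance), variance = variance, replicates = thetas)
}

thetas <- c(2.1, 1.9, NA, 2.2, 2.0)
out <- replicate_variance_sketch(thetas, point_estimate = 2.05,
                                 scale = 1 / 5, rscales = rep(1, 5),
                                 na_policy = "omit")
```

With na_policy = "strict" the same input stops with an error, which is what makes replicate instability visible by default.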

Value

A list with components se, variance, and replicates.

Limitations

Calibrated/post-stratified designs: Post-hoc adjustments applied via survey::calibrate(), survey::postStratify(), or survey::rake() are not supported here and will cause the function to error. These adjustments are not recomputed when replicate weights are injected, so the replicate designs would not reflect the intended calibrated/post-stratified analysis.


Default coefficients for NMAR results

Description

Returns missingness-model coefficients if available.

Usage

## S3 method for class 'nmar_result'
coef(object, ...)

Arguments

object

An 'nmar_result' object.

...

Ignored.

Value

A named numeric vector or 'NULL'.


Coefficient table for summary objects

Description

Returns a coefficients table (Estimate, Std. Error, statistic, p-value) from a 'summary_nmar_result*' object when missingness-model coefficients and a variance matrix are available. If the summary does not carry missingness-model coefficients, returns 'NULL'.

Usage

## S3 method for class 'summary_nmar_result'
coef(object, ...)

Arguments

object

An object of class 'summary_nmar_result' (or subclass).

...

Ignored.

Details

The statistic column is labelled "t value" when finite degrees of freedom are available (e.g., survey designs); otherwise, it is labelled "z value".

Value

A data.frame with rows named by coefficient, or 'NULL' if not available.
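
The labelling rule described under Details can be sketched as follows. wald_table_sketch() is a hypothetical helper written for this illustration; the p-value column names in particular are assumptions, since this page only specifies the "t value" and "z value" labels.

```r
# Illustrative Wald coefficient table (not the NMAR implementation).
wald_table_sketch <- function(estimate, std_error, df = Inf) {
  statistic <- estimate / std_error
  use_t <- is.finite(df)
  p_value <- if (use_t) {
    2 * stats::pt(abs(statistic), df = df, lower.tail = FALSE)
  } else {
    2 * stats::pnorm(abs(statistic), lower.tail = FALSE)
  }
  out <- data.frame(estimate, std_error, statistic, p_value,
                    row.names = names(estimate))
  # Column labels switch between t and z depending on the degrees of freedom.
  names(out) <- c("Estimate", "Std. Error",
                  if (use_t) "t value" else "z value",
                  if (use_t) "Pr(>|t|)" else "Pr(>|z|)")
  out
}

beta <- c("(Intercept)" = -0.7, Y_miss = 0.4)
se <- c(0.2, 0.1)
tab_z <- wald_table_sketch(beta, se)           # no finite df: "z value"
tab_t <- wald_table_sketch(beta, se, df = 30)  # finite df: "t value"
```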


Compute (possibly weighted) mean and standard deviation

Description

Compute (possibly weighted) mean and standard deviation

Usage

compute_weighted_stats(values, weights = NULL)

Wald confidence interval for base NMAR results

Description

Wald confidence interval for base NMAR results

Usage

## S3 method for class 'nmar_result'
confint(object, parm, level = 0.95, ...)

Arguments

object

An object of class 'nmar_result'.

parm

Ignored.

level

Confidence level.

...

Ignored.

Value

A 1x2 numeric matrix with confidence limits.


Confidence intervals for coefficient table (summary objects)

Description

Returns Wald-style confidence intervals for missingness-model coefficients from a 'summary_nmar_result*' object. Uses t-quantiles when finite degrees of freedom are available, otherwise normal quantiles.

Usage

## S3 method for class 'summary_nmar_result'
confint(object, parm, level = 0.95, ...)

Arguments

object

An object of class 'summary_nmar_result' (or subclass).

parm

A specification of which coefficients are to be given confidence intervals, either a vector of names or a vector of indices; by default, all coefficients are considered.

level

The confidence level required.

...

Ignored.

Value

A numeric matrix with columns giving lower and upper confidence limits for each parameter. Row names correspond to coefficient names. Returns 'NULL' if coefficients are unavailable.
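
A Wald-style interval of this kind can be sketched in base R as below. wald_confint_sketch() is a hypothetical helper for illustration only; the column labels follow the usual stats::confint() style and are an assumption about the package output.

```r
# Illustrative Wald confidence intervals (not the NMAR implementation).
wald_confint_sketch <- function(estimate, std_error, df = Inf, level = 0.95) {
  alpha <- 1 - level
  probs <- c(alpha / 2, 1 - alpha / 2)
  # t quantile when degrees of freedom are finite, normal quantile otherwise
  crit <- if (is.finite(df)) {
    stats::qt(probs[2], df = df)
  } else {
    stats::qnorm(probs[2])
  }
  ci <- cbind(estimate - crit * std_error, estimate + crit * std_error)
  dimnames(ci) <- list(names(estimate),
                       paste0(format(100 * probs, trim = TRUE), " %"))
  ci
}

beta <- c("(Intercept)" = -0.7, Y_miss = 0.4)
se <- c(0.2, 0.1)
ci_z <- wald_confint_sketch(beta, se)            # normal quantiles
ci_t <- wald_confint_sketch(beta, se, df = 10)   # wider t-based limits
```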


Constraint summaries for EL diagnostics

Description

Constraint summaries for EL diagnostics

Usage

constraint_summaries(w_i_hat, W_hat, mass_untrim, X_centered)

Build a scaling recipe from one or more design matrices

Description

Build a scaling recipe from one or more design matrices

Usage

create_nmar_scaling_recipe(
  ...,
  intercept_col = "(Intercept)",
  weights = NULL,
  weight_mask = NULL,
  tol_constant = 1e-08,
  warn_on_constant = TRUE
)

Arguments

...

One or more numeric matrices with column names.

intercept_col

Name of an intercept column that should remain unscaled.

weights

Optional nonnegative numeric vector used to compute weighted means and standard deviations.

weight_mask

Optional logical mask or nonnegative numeric multipliers applied to weights before computing moments (useful for respondents-only scaling). If weights is NULL, weight_mask is treated as weights.

tol_constant

Numeric tolerance below which columns are treated as constant and left unscaled.

warn_on_constant

Logical; warn when a column is treated as constant.
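
The interaction of weights, weight_mask, intercept_col, and tol_constant can be illustrated with a simplified base-R sketch. build_recipe_sketch() and the list it returns are hypothetical; the use of a weighted population-style standard deviation (dividing by the weight total) is an assumption, and the sketch handles a single matrix and omits the warning.

```r
# Illustrative recipe builder (not the NMAR implementation).
build_recipe_sketch <- function(m, intercept_col = "(Intercept)",
                                weights = NULL, weight_mask = NULL,
                                tol_constant = 1e-08) {
  # With weights = NULL the mask itself acts as the weights.
  w <- if (is.null(weights)) rep(1, nrow(m)) else weights
  if (!is.null(weight_mask)) w <- w * as.numeric(weight_mask)
  mu <- colSums(m * w) / sum(w)
  sigma <- sqrt(colSums(sweep(m, 2, mu)^2 * w) / sum(w))
  # Intercept and (near-)constant columns are left unscaled: mean 0, sd 1.
  keep_raw <- colnames(m) == intercept_col | sigma < tol_constant
  mu[keep_raw] <- 0
  sigma[keep_raw] <- 1
  list(mean = mu, sd = sigma)
}

m <- cbind("(Intercept)" = 1, x = c(1, 2, 3, 100), k = 5)
# Respondents-only scaling: the fourth row is masked out of the moments.
rec <- build_recipe_sketch(m, weight_mask = c(TRUE, TRUE, TRUE, FALSE))
```

Here the outlying fourth row does not influence the moments of x, and the constant column k is left unscaled rather than divided by a near-zero standard deviation.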


Create Verbose Printer Factory

Description

Creates a verbose printing function based on trace level settings. Messages are printed only if their level is <= trace_level.

Usage

create_verboser(trace_level = 0)

Arguments

trace_level

Integer 0-3; controls verbosity detail:

  • 0: No output (silent mode).

  • 1: Major steps only (initialization, convergence).

  • 2: Moderate detail (iteration summaries, key diagnostics).

  • 3: Full detail (all diagnostics, intermediate values).

Value

A function with signature: 'verboser(msg, level = 1, type = c("info", "step", "detail", "result"))'
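
The factory pattern described here can be sketched as a closure over trace_level. make_verboser_sketch() is a hypothetical stand-in; the message format and the invisible TRUE/FALSE return values are assumptions made so the behaviour can be checked.

```r
# Illustrative verbose-printer factory (not the NMAR implementation).
make_verboser_sketch <- function(trace_level = 0) {
  function(msg, level = 1, type = c("info", "step", "detail", "result")) {
    type <- match.arg(type)
    # Print only messages whose level does not exceed trace_level.
    if (level > trace_level) return(invisible(FALSE))
    message(sprintf("[%s] %s", type, msg))
    invisible(TRUE)
  }
}

verbose <- make_verboser_sketch(trace_level = 1)
shown <- verbose("Solver converged", level = 1, type = "step")
hidden <- verbose("Iteration details", level = 3, type = "detail")
```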


Empirical likelihood estimator

Description

Generic for the empirical likelihood (EL) estimator under NMAR. Methods are provided for data.frame and survey.design.

Usage

el(data, ...)

Arguments

data

A data.frame or a survey.design.

...

Passed to class-specific methods.

See Also

el_engine


Assert that terms object lacks offsets

Description

Assert that terms object lacks offsets

Usage

el_assert_no_offset(terms_obj, label)

Strata augmentation for survey designs

Description

Augments the auxiliary design with strata dummies (dropping one level) and appends stratum-share means when user-supplied auxiliary_means are present. This is the Wu-style strategy of adding stratum indicators to the auxiliary calibration block in pseudo empirical likelihood for surveys.

Usage

el_augment_strata_aux(
  aux_design_full,
  strata_factor,
  weights_full,
  N_pop,
  auxiliary_means
)
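
The augmentation described above can be pictured with a small base-R sketch: build stratum dummies with one level dropped, then compute design-weighted stratum shares to serve as their known means. The data and the column names stratum_B and stratum_C are made up for illustration and are not the names used by the package.

```r
# Toy inputs (made up for this illustration).
strata_factor <- factor(c("A", "A", "B", "B", "C", "C"))
weights_full <- c(10, 10, 20, 20, 5, 5)
N_pop <- sum(weights_full)

# Stratum dummies with the first level dropped.
dummies <- stats::model.matrix(~ strata_factor)[, -1, drop = FALSE]
colnames(dummies) <- paste0("stratum_", levels(strata_factor)[-1])  # hypothetical names

# Design-weighted stratum shares, usable as means of the dummy columns.
shares <- colSums(dummies * weights_full) / N_pop
```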

Empirical likelihood estimating equations

Description

Empirical likelihood estimating equations

Usage

el_build_equation_system(
  family,
  missingness_model_matrix,
  auxiliary_matrix,
  respondent_weights,
  N_pop,
  n_resp_weighted,
  mu_x_scaled
)

Details

Returns a function that evaluates the stacked EL system for \theta = (\beta, z, \lambda_x) with z = \operatorname{logit}(W). Blocks correspond to: (i) missingness (response) model score equations in \beta, (ii) the response-rate equation in W, and (iii) auxiliary moment constraints in \lambda_x. When no auxiliaries are present the last block is omitted. The system matches Qin, Leung, and Shao (2002, Eqs. 7-10) with empirical masses m_i = d_i/D_i(\theta), D_i as in the paper. We cap \eta, clip w_i in ratios, and guard D_i away from zero to ensure numerical stability; these safeguards are applied consistently in equations, Jacobian, and post-solution weights.

Guarding policy (must remain consistent across equations/Jacobian/post):

The score with respect to the linear predictor uses the Bernoulli form s_{\eta,i}(\beta) = \partial \log w_i / \partial \eta_i = \mu.\eta(\eta_i)/w_i, which is valid for both logit and probit links when w_i is clipped.
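
For the two built-in links, the Bernoulli score form reduces to familiar expressions. The short derivation below uses only standard properties of the logistic and normal distributions, with \phi and \Phi denoting the standard normal density and distribution function.

```latex
\begin{align*}
\text{logit:}\quad & w = \frac{1}{1 + e^{-\eta}}, \qquad
  \mu.\eta(\eta) = w(1 - w)
  \;\Longrightarrow\;
  s_{\eta} = \frac{w(1 - w)}{w} = 1 - w, \\
\text{probit:}\quad & w = \Phi(\eta), \qquad
  \mu.\eta(\eta) = \phi(\eta)
  \;\Longrightarrow\;
  s_{\eta} = \frac{\phi(\eta)}{\Phi(\eta)}.
\end{align*}
```

The logit form needs no division at all, and the probit form is an inverse Mills ratio, which is why clipping w_i matters mainly when w_i is close to zero.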

References

Qin, J., Leung, D., and Shao, J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193-200.


Empirical likelihood equations for survey designs (design-weighted QLS system)

Description

Empirical likelihood equations for survey designs (design-weighted QLS system)

Usage

el_build_equation_system_survey(
  family,
  missingness_model_matrix,
  auxiliary_matrix,
  respondent_weights,
  N_pop,
  n_resp_weighted,
  mu_x_scaled
)

Details

Returns a function that evaluates the stacked EL system for complex survey designs using design weights. Unknowns are \theta = (\beta, z, \lambda_W, \lambda_x) with z = \operatorname{logit}(W). Blocks correspond to:

When all design weights are equal and N_{\mathrm{pop}} and the respondent count match the simple random sampling setup, this system reduces to the Qin, Leung, and Shao (2002) equations (6)-(10).


Analytical Jacobian for empirical likelihood

Description

Analytical Jacobian for empirical likelihood

Usage

el_build_jacobian(
  family,
  missingness_model_matrix,
  auxiliary_matrix,
  respondent_weights,
  N_pop,
  n_resp_weighted,
  mu_x_scaled
)

Details

Builds the block Jacobian A = \partial F/\partial \theta for the EL system with \theta = (\beta, z, \lambda_x) and z = \operatorname{logit}(W). Blocks follow Qin, Leung, and Shao (2002, Eqs. 7-10). The derivative with respect to the linear predictor for the missingness (response) model uses the Bernoulli score form \partial/\partial\eta\, \log w(\eta) = \mu.\eta(\eta)/w(\eta) with link-inverse clipping. Denominator guards are applied consistently when forming terms depending on D_i(\theta).

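Guarding policy (must remain consistent across equations/Jacobian/post): cap \eta, clip w_i in ratios, and guard D_i(\theta) away from zero, as in el_build_equation_system().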

References

Qin, J., Leung, D., and Shao, J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193-200.


Analytical Jacobian for survey EL system (design-weighted QLS analogue)

Description

Analytical Jacobian for survey EL system (design-weighted QLS analogue)

Usage

el_build_jacobian_survey(
  family,
  missingness_model_matrix,
  auxiliary_matrix,
  respondent_weights,
  N_pop,
  n_resp_weighted,
  mu_x_scaled
)

Details

Builds the block Jacobian A = \partial g/\partial \theta for the survey EL system with \theta = (\beta, z, \lambda_W, \lambda_x) and z = \operatorname{logit}(W). Blocks follow the design-weighted analogue of Qin, Leung, and Shao (2002) used in el_build_equation_system_survey(). The guarding policy matches that of the IID Jacobian, el_build_jacobian().

The Jacobian uses the same score and second-derivative machinery as el_build_jacobian(); when family$d2mu.deta2 is missing, this function returns NULL and the solver falls back to numeric/Broyden Jacobians.


Build EL result object (success or failure)

Description

Build EL result object (success or failure)

Usage

el_build_result(
  core_results,
  inputs,
  call,
  formula,
  engine_name = "empirical_likelihood"
)

Build starting values for the EL solver (beta, z, lambda)

Description

Build starting values for the EL solver (beta, z, lambda)

Usage

el_build_start(
  missingness_model_matrix_scaled,
  auxiliary_matrix_scaled,
  nmar_scaling_recipe,
  start,
  N_pop,
  respondent_weights
)

Check auxiliary means consistency against respondents' sample support.

Description

Computes a simple z-score diagnostic comparing user-supplied auxiliary means to the respondents' sample means. The caller is responsible for comparing the returned maximum z-score to any desired threshold.

Usage

el_check_auxiliary_inconsistency_matrix(
  auxiliary_matrix_resp,
  provided_means = NULL
)

Arguments

auxiliary_matrix_resp

Respondent-side auxiliary design matrix.

provided_means

Optional named numeric vector of auxiliary means aligned to the matrix columns.

Value

list(max_z = numeric(1) or NA, cols = character())
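
One way such a z-score diagnostic could be computed is sketched below. aux_inconsistency_sketch() is hypothetical: standardizing by the respondents' standard error (sd / sqrt(n)) and reporting the compared columns in cols are assumptions, since the exact definitions are not given on this page.

```r
# Illustrative z-score diagnostic (not the NMAR implementation).
aux_inconsistency_sketch <- function(auxiliary_matrix_resp,
                                     provided_means = NULL) {
  empty <- list(max_z = NA_real_, cols = character())
  if (is.null(provided_means) || ncol(auxiliary_matrix_resp) == 0) {
    return(empty)
  }
  cols <- intersect(colnames(auxiliary_matrix_resp), names(provided_means))
  if (length(cols) == 0) return(empty)
  x <- auxiliary_matrix_resp[, cols, drop = FALSE]
  se <- apply(x, 2, stats::sd) / sqrt(nrow(x))
  z <- abs(colMeans(x) - provided_means[cols]) / se
  list(max_z = max(z), cols = cols)
}

x_resp <- cbind(a = c(1, 2, 3, 4, 5), b = c(10, 10.5, 9.5, 10, 10))
# The supplied mean for `b` is far from the respondents' mean of 10.
diag_out <- aux_inconsistency_sketch(x_resp, c(a = 3, b = 12))
no_means <- aux_inconsistency_sketch(x_resp)
```

A large max_z suggests that the supplied population means sit far outside the respondents' sample support; the threshold for "large" is left to the caller, as noted above.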


Compute diagnostics at the EL solution

Description

Compute diagnostics at the EL solution

Usage

el_compute_diagnostics(
  estimates,
  equation_system_func,
  analytical_jac_func,
  post,
  respondent_weights,
  auxiliary_matrix_scaled,
  K_beta,
  K_aux,
  X_centered
)

Variance driver for EL (bootstrap or none)

Description

Variance driver for EL (bootstrap or none)

Usage

el_compute_variance(
  y_hat,
  full_data,
  formula,
  N_pop,
  variance_method,
  bootstrap_reps,
  standardize,
  trim_cap,
  on_failure,
  auxiliary_means,
  control,
  start,
  family
)

Core eta-state computation for EL engines

Description

Computes the capped linear predictor, response probabilities, derivatives, and stable scores with respect to the linear predictor for a given family. This helper centralizes the numerically delicate pieces (capping, clipping, Mills ratios, and score derivatives) and is used consistently across the EL equation system and analytical Jacobians for both IID and survey designs.

Usage

el_core_eta_state(family, eta_raw, eta_cap)

Arguments

family

List-like response family bundle (see logit_family() and probit_family()).

eta_raw

Numeric vector of unconstrained linear predictors.

eta_cap

Scalar cap applied symmetrically to eta_raw.

Value

A list with components:

eta

Capped linear predictor.

w

Mean function family$linkinv(eta).

w_clipped

w clipped to [1e-12, 1-1e-12] for use in ratios.

mu_eta

Derivative family$mu.eta(eta).

d2mu

Second derivative family$d2mu.deta2(eta) when available, otherwise NULL.

s_eta

Score with respect to eta, using stable logit/probit forms where possible.

ds_eta_deta

Derivative of s_eta with respect to eta when d2mu is available, otherwise NULL.
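
For the logit link, the returned components can be reproduced with a few lines of base R. eta_state_logit_sketch() is a hypothetical illustration restricted to the logit case; the default eta_cap = 50 is an assumption, while the clipping bounds are those listed above.

```r
# Illustrative eta-state for the logit link (not the NMAR implementation).
eta_state_logit_sketch <- function(eta_raw, eta_cap = 50) {
  eta <- pmax(pmin(eta_raw, eta_cap), -eta_cap)  # symmetric cap
  w <- stats::plogis(eta)                        # mean function (linkinv)
  w_clipped <- pmin(pmax(w, 1e-12), 1 - 1e-12)   # safe for use in ratios
  mu_eta <- w * (1 - w)                          # d w / d eta
  d2mu <- mu_eta * (1 - 2 * w)                   # d^2 w / d eta^2
  s_eta <- 1 - w                                 # stable form of mu_eta / w
  ds_eta_deta <- -mu_eta                         # d s_eta / d eta
  list(eta = eta, w = w, w_clipped = w_clipped, mu_eta = mu_eta,
       d2mu = d2mu, s_eta = s_eta, ds_eta_deta = ds_eta_deta)
}

state <- eta_state_logit_sketch(c(-100, 0, 100), eta_cap = 20)
```

Using 1 - w for the score avoids the division mu_eta / w entirely, which is the kind of stable form the s_eta entry refers to.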


EL core helpers

Description

Internal helpers for solving and post-processing the EL system. el_run_solver() orchestrates nleqslv::nleqslv() with a small, deterministic fallback ladder; el_post_solution() computes masses and the point estimate with denominator guards and optional trimming.


Empirical likelihood for data frames (NMAR)

Description

Internal method dispatched by el() when data is a data.frame. Returns c("nmar_result_el","nmar_result") with the point estimate, optional bootstrap SE, weights, coefficients, diagnostics, and metadata.

Usage

## S3 method for class 'data.frame'
el(
  data,
  formula,
  auxiliary_means = NULL,
  standardize = TRUE,
  trim_cap = Inf,
  control = list(),
  on_failure = c("return", "error"),
  variance_method = c("bootstrap", "none"),
  bootstrap_reps = 500,
  n_total = NULL,
  start = NULL,
  trace_level = 0,
  family = logit_family(),
  ...
)

Arguments

data

A data.frame where the outcome column contains NA for nonrespondents.

formula

Two-sided formula Y_miss ~ auxiliaries or Y_miss ~ auxiliaries | missingness_predictors.

auxiliary_means

Named numeric vector of population means for auxiliary design columns. Names must match the materialized model.matrix columns on the first RHS (after formula expansion), including factor indicators and transformed terms. The intercept is always excluded.

standardize

Logical; whether to standardize predictors prior to estimation.

trim_cap

Numeric; cap for EL weights (Inf = no trimming).

control

List; optional solver control parameters for nleqslv::nleqslv(control = ...).

on_failure

Character; one of "return" or "error" on solver failure.

variance_method

Character; one of "bootstrap" or "none".

bootstrap_reps

Integer; number of bootstrap reps if variance_method = "bootstrap".

n_total

Optional analysis-scale population total N_pop. When the outcome contains at least one NA, n_total defaults to nrow(data). When respondents-only data are supplied (no NA in the outcome), n_total must be provided.

start

Optional list of starting values passed to the solver helpers.

trace_level

Integer 0-3 controlling estimator logging detail.

family

Missingness (response) model family specification (defaults to the logit bundle).

...

Additional arguments passed to the solver.

Details

Implements the empirical likelihood estimator for IID data with optional auxiliary moment constraints. The missingness-model score is the Bernoulli derivative with respect to the linear predictor, supporting logit and probit links. When respondents-only data are supplied (no NA in the outcome), n_total is required so the response-rate equation targets the full sample size. When missingness is observed (NA present), the default population total is nrow(data). If respondents-only data are used and auxiliaries are requested, you must also provide population auxiliary means via auxiliary_means. Result weights are the unnormalized EL masses a_i / D_i(\theta) on the analysis scale, where a_i \equiv 1 for IID data.


Build denominator and floor pack

Description

Build denominator and floor pack

Usage

el_denominator(lambda_W, W, Xc_lambda, p_i, floor)

Arguments

lambda_W

numeric scalar

W

numeric scalar in (0,1)

Xc_lambda

numeric vector (X_centered %*% lambda_x) or 0

p_i

numeric vector of response probabilities

floor

numeric scalar > 0, denominator floor

Value

list with denom, active, inv, inv_sq
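
A floored denominator of this kind might be computed as in the sketch below. The formula D_i = 1 + \lambda_W (p_i - W) + X_c \lambda_x is an assumption based on the Qin, Leung, and Shao (2002) weight form, and the meaning given to active (entries where the floor is not binding) is likewise assumed; denominator_sketch() is not the package function.

```r
# Illustrative floored EL denominator (not the NMAR implementation).
denominator_sketch <- function(lambda_W, W, Xc_lambda, p_i, floor = 1e-8) {
  denom_raw <- 1 + lambda_W * (p_i - W) + Xc_lambda  # assumed QLS-style form
  active <- denom_raw > floor        # TRUE where the floor is not binding
  denom <- pmax(denom_raw, floor)    # guard away from zero
  inv <- 1 / denom
  list(denom = denom, active = active, inv = inv, inv_sq = inv^2)
}

# The first unit would have a negative raw denominator and gets floored.
pack <- denominator_sketch(lambda_W = 4, W = 0.5, Xc_lambda = 0,
                           p_i = c(0.2, 0.5, 0.9), floor = 1e-8)
```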


Empirical likelihood (EL) engine for NMAR

Description

Constructs an engine specification for the empirical likelihood (EL) estimator of a full-data mean under nonignorable nonresponse (NMAR).

Usage

el_engine(
  standardize = TRUE,
  trim_cap = Inf,
  on_failure = c("return", "error"),
  variance_method = c("bootstrap", "none"),
  bootstrap_reps = 500,
  auxiliary_means = NULL,
  control = list(),
  strata_augmentation = TRUE,
  n_total = NULL,
  start = NULL,
  family = c("logit", "probit")
)

Arguments

standardize

logical; standardize predictors. Default TRUE.

trim_cap

numeric; cap for EL weights (Inf = no trimming).

on_failure

character; "return" or "error" on solver failure.

variance_method

character; one of "bootstrap" or "none".

bootstrap_reps

integer; number of bootstrap replicates when variance_method = "bootstrap".

auxiliary_means

named numeric vector; population means for auxiliary design columns. Names must match the materialized model.matrix column names on the first RHS (after formula expansion), e.g., factor indicator columns created by model.matrix() or transformed terms like I(X^2). Auxiliary intercepts are always dropped automatically, so do not supply (Intercept). If NULL (default) and the outcome contains at least one NA, auxiliary means are estimated from the full input (including nonrespondents): IID uses unweighted column means of the auxiliary design; survey designs use the design-weighted means based on weights(design). This corresponds to the QLS case where \mu_x is replaced by \bar X (the full-sample mean) when auxiliary variables are observed for all sampled units.

control

Optional solver configuration forwarded to nleqslv::nleqslv(). Provide a single list that may include solver tolerances (e.g., xtol, ftol, maxit) and, optionally, top-level entries global and xscalm for globalization and scaling. Example: control = list(maxit = 500, xtol = 1e-10, ftol = 1e-10, global = "qline", xscalm = "auto").

strata_augmentation

logical; when TRUE (default), survey designs with an identifiable strata structure are augmented with stratum indicators and corresponding population shares in the auxiliary block (Wu-style strata augmentation). Has no effect for data.frame inputs or survey designs without strata.

n_total

numeric; required when supplying respondents-only data (no NA in the outcome), optional otherwise. For data.frame inputs, set to the total number of sampled units before filtering to respondents. For survey.design inputs, set to the design-weight total on the same analysis scale as weights(design) (default sum(weights(design))). If omitted and the outcome contains no NAs, the estimator stops with an error requesting n_total.

start

list; optional starting point for the solver. Fields:

  • beta: named numeric vector of missingness-model coefficients on the original (unscaled) scale, including (Intercept).

  • W or z: starting value for population response rate (0 < W < 1) or its logit (z). If both are provided, z takes precedence.

  • lambda: named numeric vector of auxiliary multipliers on the original scale (names must match auxiliary design columns; no intercept). Values are mapped to the scaled space internally.

family

Missingness (response) model family. Either "logit" (default) or "probit", or a custom family object: a list with components name, linkinv, mu.eta, score_eta, and optionally d2mu.deta2. When d2mu.deta2 is absent the solver uses Broyden/numeric Jacobians.

Details

The implementation follows Qin, Leung, and Shao (2002): the response mechanism is modeled as w(y, x; \beta) = P(R = 1 \mid Y = y, X = x) and the joint law of (Y, X) is represented nonparametrically by respondent masses that satisfy empirical likelihood constraints. The mean is estimated as a respondent weighted mean with weights proportional to \tilde w_i = a_i / D_i(\beta, W, \lambda), where a_i are base weights (a_i \equiv 1 for IID data and a_i = d_i for survey designs) and D_i is the EL denominator.

For data.frame inputs the estimator solves the Qin-Leung-Shao (QLS) estimating equations for (\beta, W, \lambda_x) with W reparameterized as z = \operatorname{logit}(W), and profiles out the response multiplier \lambda_W using the closed-form QLS identity (their Eq. 10). For survey.design inputs the estimator uses a design-weighted analogue (Chen and Sitter 1999; Wu 2005) with an explicit \lambda_W and an additional linkage equation involving the nonrespondent design-weight total T_0.
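
The reparameterization and the profiling step for the IID case can be illustrated numerically. The closed form used below, \lambda_W = (N/n - 1)/(1 - W), is an assumption about the QLS identity referred to above and should be checked against the paper; the variable names and values are made up for the example.

```r
# Solver works with z on the real line; W is recovered with the logistic map.
z <- 0.3
W <- stats::plogis(z)

N_pop <- 200   # total sample size (made-up value)
n_resp <- 120  # number of respondents (made-up value)

# Assumed closed form for the response multiplier (see lead-in).
lambda_W <- (N_pop / n_resp - 1) / (1 - W)

# Sanity check: when W equals the observed response rate n/N,
# the assumed expression simplifies to N/n.
W_obs <- n_resp / N_pop
lambda_W_obs <- (N_pop / n_resp - 1) / (1 - W_obs)
```

Working with z = logit(W) keeps W inside (0, 1) without imposing box constraints on the solver.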

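Numerical stability: the engine bounds the linear predictor, clips response probabilities used in ratios, and applies a small floor to the EL denominators D_i; the same safeguards are applied in the equations, the Jacobian, and the post-solution weights.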

Formula syntax and data constraints: nmar() accepts a partitioned right-hand side y_miss ~ auxiliaries | response_only. Variables left of | enter auxiliary moment constraints; variables right of | enter only the response model. The outcome (LHS) is always included as a response-model predictor through the evaluated LHS expression; explicit use of the outcome on the RHS is rejected. The response model always includes an intercept; the auxiliary block never includes an intercept.

To include a covariate in both the auxiliary constraints and the response model, repeat it on both sides, e.g. y_miss ~ X | X.

Auxiliary means: If auxiliary_means = NULL (default) and the outcome contains at least one NA, auxiliary means are estimated from the full input and used as \bar X in the QLS constraints. For respondents-only data (no NA in the outcome), n_total must be supplied, and if the auxiliary RHS is non-empty, auxiliary_means must also be supplied. When standardize = TRUE, supply auxiliary_means on the original data scale; the engine applies the same standardization internally.

Survey scale: For survey.design inputs, n_total (if provided) must be on the same analysis scale as weights(design). The default is sum(weights(design)).

Convergence and identification: the stacked EL system can have multiple solutions. Adding response-only predictors (variables to the right of |) can make the problem sensitive to starting values. Inspect diagnostics such as jacobian_condition_number and consider supplying start = list(beta = ..., W = ...) when needed.

Variance: The EL engine supports bootstrap standard errors via variance_method = "bootstrap" or can skip variance with variance_method = "none". Set a seed for reproducible bootstrap results.

Bootstrap requires suggested packages: for IID resampling it requires future.apply (and future); for survey replicate-weight bootstrap it requires survey and svrep.

Value

A list of class "nmar_engine_el" (also inheriting from "nmar_engine") containing configuration fields to be supplied to nmar(). Users rarely access fields directly; instead, pass the engine to nmar() together with a formula and data.

References

Qin, J., Leung, D., and Shao, J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193-200. doi:10.1198/016214502753479338

Chen, J., and Sitter, R. R. (1999). A pseudo empirical likelihood approach for the effective use of auxiliary information in complex surveys. Statistica Sinica, 9, 385-406.

Wu, C. (2005). Algorithms and R codes for the pseudo empirical likelihood method in survey sampling. Survey Methodology, 31(2), 239-243.

See Also

nmar, weights.nmar_result, summary.nmar_result

Examples

set.seed(1)
n <- 200
X <- rnorm(n)
Y <- 2 + 0.5 * X + rnorm(n)
p <- plogis(-0.7 + 0.4 * scale(Y)[, 1])
R <- runif(n) < p
if (all(R)) R[1] <- FALSE
df <- data.frame(Y_miss = Y, X = X)
df$Y_miss[!R] <- NA_real_

# Estimate auxiliary mean from full data (QLS "use Xbar" case)
eng <- el_engine(auxiliary_means = NULL, variance_method = "none")

# Put X in both the auxiliary block and the response model (QLS-like)
fit <- nmar(Y_miss ~ X | X, data = df, engine = eng)
summary(fit)


# Response-only predictors can be placed to the right of |:
Z <- rnorm(n)
df2 <- data.frame(Y_miss = Y, X = X, Z = Z)
df2$Y_miss[!R] <- NA_real_
eng2 <- el_engine(auxiliary_means = NULL, variance_method = "none")
fit2 <- nmar(Y_miss ~ X | Z, data = df2, engine = eng2)
print(fit2)

# Survey design usage
if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svydesign(ids = ~1, weights = ~1, data = df)
  eng3 <- el_engine(auxiliary_means = NULL, variance_method = "none")
  fit3 <- nmar(Y_miss ~ X, data = des, engine = eng3)
  summary(fit3)
}


Core Empirical Likelihood Estimator

Description

Implements the core computational engine for empirical likelihood estimation under nonignorable nonresponse, including parameter solving, variance calculation, and diagnostic computation.

Usage

el_estimator_core(
  missingness_design,
  aux_matrix,
  aux_means,
  respondent_weights,
  analysis_data,
  outcome_expr,
  N_pop,
  formula,
  standardize,
  trim_cap,
  control,
  on_failure,
  family = logit_family(),
  variance_method,
  bootstrap_reps,
  start = NULL,
  trace_level = 0,
  auxiliary_means = NULL
)

Arguments

missingness_design

Respondent-side missingness (response) model design matrix (intercept + predictors).

aux_matrix

Auxiliary design matrix on respondents (may have zero columns).

aux_means

Named numeric vector of auxiliary population means (aligned to columns of aux_matrix).

respondent_weights

Numeric vector of respondent weights aligned with missingness_design rows.

analysis_data

Data object used for logging and variance (survey designs supply the design object).

outcome_expr

Character string identifying the outcome expression displayed in outputs.

N_pop

Population size on the analysis scale.

formula

Original model formula used for estimation.

standardize

Logical. Whether to standardize predictors during estimation.

trim_cap

Numeric. Upper bound for empirical likelihood weight trimming.

control

List of control parameters for the nonlinear equation solver.

on_failure

Character. Action when solver fails: "return" or "error".

family

List. Link function specification (typically logit).

variance_method

Character. Variance estimation method.

bootstrap_reps

Integer. Number of bootstrap replications.

auxiliary_means

Named numeric vector of known population means supplied by the user (optional; used for diagnostics).

Details

Orchestrates EL estimation for NMAR following Qin, Leung, and Shao (2002). For data.frame inputs (IID setting) the stacked system in (\beta, z, \lambda_x) with z = \mathrm{logit}(W) is solved by nleqslv::nleqslv() using an analytic Jacobian. For survey.design inputs a design-weighted analogue in (\beta, z, \lambda_W, \lambda_x) is solved with an analytic Jacobian when the response family supplies second derivatives, or with numeric/Broyden Jacobians otherwise. Numerical safeguards are applied consistently across equations, Jacobian, and post-solution weights: bounded linear predictors, probability clipping in ratios, and a small floor on denominators D_i(\theta) with an active-set mask in derivatives. After solving, unnormalized masses d_i/D_i(\theta) are formed, optional trimming may be applied (with normalization only for reporting), and optional variance is computed via bootstrap when variance_method = "bootstrap".

Value

List containing estimation results, diagnostics, and metadata.


Extract a strata factor from a survey.design object

Description

Prefers strata already materialized in the survey.design object (typically design$strata). When unavailable, attempts to reconstruct strata from the original svydesign() call. When multiple stratification variables are supplied, their interaction is used.

Usage

el_extract_strata_factor(design)
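
As an illustration of the last point, the interaction of several stratification variables can be pictured with base R. This is only a sketch: the data frame and its column names are hypothetical, and the helper's actual implementation may differ.

```r
# Hypothetical design variables; names are illustrative only
frame <- data.frame(
  region = c("N", "N", "S", "S", "S"),
  urban  = c("yes", "no", "yes", "yes", "no")
)

# One factor level per observed combination of the stratification variables
strata <- interaction(frame$region, frame$urban, drop = TRUE)
table(strata)
```

With drop = TRUE, combinations that never occur in the data do not become empty strata.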

Compute lambda_W from C_const and W

Description

Compute lambda_W from C_const and W

Usage

el_lambda_W(C_const, W)

Arguments

C_const

numeric scalar: (N_pop / sum(d_resp) - 1)

W

numeric scalar in (0,1)
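
A minimal sketch of the computation, assuming the Qin, Leung and Shao (2002) relation \lambda_W = C / (1 - W). The relation and the function name lambda_W_sketch are assumptions made for illustration; they are not taken from the package source.

```r
# Sketch only: assumes lambda_W = C_const / (1 - W)
lambda_W_sketch <- function(C_const, W) {
  stopifnot(length(W) == 1L, W > 0, W < 1)
  C_const / (1 - W)
}

# Hypothetical inputs: 600 respondents with unit weights, N_pop = 1000
N_pop <- 1000
d_resp <- rep(1, 600)
C_const <- N_pop / sum(d_resp) - 1
lambda_W <- lambda_W_sketch(C_const, W = 0.6)
lambda_W
```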


Log a step banner line

Description

Log a step banner line

Usage

el_log_banner(verboser, title)

Log data prep summary

Description

Log data prep summary

Usage

el_log_data_prep(
  verboser,
  outcome_var,
  family_name,
  K_beta,
  K_aux,
  auxiliary_names,
  standardize,
  is_survey,
  N_pop,
  n_resp_weighted
)

Log detailed diagnostics

Description

Log detailed diagnostics

Usage

el_log_detailed_diagnostics(
  verboser,
  beta_hat_unscaled,
  W_hat,
  lambda_W_hat,
  lambda_hat,
  denominator_hat
)

Log final summary

Description

Log final summary

Usage

el_log_final(verboser, y_hat, se)

Log solver configuration

Description

Log solver configuration

Usage

el_log_solver_config(verboser, control_top, final_control)

Log solver termination status

Description

Log solver termination status

Usage

el_log_solver_result(verboser, converged_success, solution, elapsed)

Log a short solver progress note

Description

Log a short solver progress note

Usage

el_log_solving(verboser)

Log starting values

Description

Log starting values

Usage

el_log_start_values(verboser, init_beta, init_z, init_lambda)

Log a short trace message with the chosen level

Description

Log a short trace message with the chosen level

Usage

el_log_trace(verboser, trace_level)

Log variance header and result

Description

Log variance header and result

Usage

el_log_variance_header(verboser, variance_method, bootstrap_reps)

Log weight diagnostics

Description

Log weight diagnostics

Usage

el_log_weight_diagnostics(verboser, W_hat, weights, trimmed_fraction)

EL masses and probabilities from denominators

Description

EL masses and probabilities from denominators

Usage

el_masses(weights, denom, floor, trim_cap)

Arguments

weights

numeric respondent base weights (d_i)

denom

numeric denominators D_i after floor guard

floor

numeric small positive guard (unused in core logic here, kept for API symmetry)

trim_cap

numeric cap (>0) or Inf

Value

list with mass_untrim, mass_trimmed, prob_mass, trimmed_fraction
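
The returned components can be pictured with the following base-R sketch. It assumes that trimming caps each mass at trim_cap and that prob_mass is the trimmed mass normalized to sum to one; el_masses_sketch is a hypothetical stand-in, not the package function.

```r
# Hypothetical stand-in for el_masses(); the trimming rule is an assumption
el_masses_sketch <- function(weights, denom, trim_cap = Inf) {
  mass_untrim <- weights / denom               # d_i / D_i
  mass_trimmed <- pmin(mass_untrim, trim_cap)  # cap large masses
  list(
    mass_untrim = mass_untrim,
    mass_trimmed = mass_trimmed,
    prob_mass = mass_trimmed / sum(mass_trimmed),
    trimmed_fraction = mean(mass_untrim > trim_cap)
  )
}

d <- c(1, 1, 2, 2)          # base weights
D <- c(0.5, 0.8, 0.4, 1.0)  # denominators after the floor guard
m <- el_masses_sketch(d, D, trim_cap = 4)

# Mean of a respondent outcome under the normalized masses
y <- c(10, 12, 9, 15)
y_hat <- sum(m$prob_mass * y)
```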


Mean from probability masses

Description

Mean from probability masses

Usage

el_mean(prob_mass, y)

Prepare EL inputs for IID and survey designs

Description

Parses the two-part Formula, constructs EL design matrices, injects the respondent delta indicator, attaches weights and (optionally) survey metadata, and returns the pieces needed by the EL core. The outcome enters the missingness design only through the evaluated LHS expression; any explicit use of the outcome variable on RHS2 is rejected.

Usage

el_prepare_inputs(
  formula,
  data,
  weights = NULL,
  n_total = NULL,
  design_object = NULL
)

Details

Invariants enforced here (relied on by all downstream EL code):


Prepare nleqslv top-level args and control

Description

Prepare nleqslv top-level args and control

Usage

el_prepare_nleqslv(control)

EL auxiliary design resolution and population means

Description

Computes the respondent-side auxiliary matrix and the population means vector used for centering X - \mu_x. When auxiliary_means is supplied, only respondent rows are required to be fully observed; NA values are permitted on nonrespondent rows. When auxiliary_means is NULL, auxiliaries must be fully observed in the full data used to estimate population means.

Usage

el_resolve_auxiliaries(
  aux_design_full,
  respondent_mask,
  auxiliary_means,
  weights_full = NULL
)

Solver orchestration with staged policy

Description

Solver orchestration with staged policy

Usage

el_run_solver(
  equation_system_func,
  analytical_jac_func,
  init,
  final_control,
  top_args,
  solver_method,
  use_solver_jac,
  K_beta,
  K_aux,
  respondent_weights,
  N_pop,
  trace_level = 0
)

Arguments

equation_system_func

Function mapping parameter vector to equation residuals.

analytical_jac_func

Analytic Jacobian function; may be NULL if unavailable or when forcing Broyden.

init

Numeric vector of initial parameter values.

final_control

List passed to nleqslv::nleqslv(control = ...).

top_args

List of top-level nleqslv::nleqslv args (e.g., global, xscalm).

solver_method

Character; one of "auto", "newton", or "broyden".

use_solver_jac

Logical; whether to pass analytic Jacobian to Newton.

K_beta

Integer; number of response model parameters.

K_aux

Integer; number of auxiliary constraints.

respondent_weights

Numeric vector of base sampling weights.

N_pop

Numeric; population total (weighted when survey design).

trace_level

Integer; verbosity level (0 silent, 1-3 increasingly verbose).


Empirical likelihood for survey designs (NMAR)

Description

Internal method dispatched by el() when data is a survey.design.

Usage

## S3 method for class 'survey.design'
el(
  data,
  formula,
  auxiliary_means = NULL,
  standardize = TRUE,
  strata_augmentation = TRUE,
  trim_cap = Inf,
  control = list(),
  on_failure = c("return", "error"),
  variance_method = c("bootstrap", "none"),
  bootstrap_reps = 500,
  n_total = NULL,
  start = NULL,
  trace_level = 0,
  family = logit_family(),
  ...
)

Arguments

data

A survey.design created with survey::svydesign().

formula

Two-sided formula with an NA-valued outcome on the LHS; auxiliaries on the first RHS and, optionally, missingness predictors on the second RHS partition.

auxiliary_means

Named numeric vector of population means for auxiliary design columns. Names must match the materialized model.matrix columns on the first RHS (after formula expansion), including factor indicators and transformed terms. The intercept is always excluded.

standardize

Logical; standardize predictors.

strata_augmentation

Logical; when TRUE (default), augment the auxiliary design with stratum indicators and stratum shares when a strata structure is present in the survey design.

trim_cap

Numeric; cap for EL weights (Inf = no trimming).

control

List; solver control for nleqslv::nleqslv(control = ...).

on_failure

Character; "return" or "error" on solver failure.

variance_method

Character; "bootstrap" or "none".

bootstrap_reps

Integer; reps when variance_method = "bootstrap".

n_total

Optional analysis-scale population size N_pop; required for respondents-only designs.

start

Optional list of starting values passed to solver helpers.

trace_level

Integer 0-3 controlling estimator logging detail.

family

Missingness (response) model family specification (defaults to logit).

...

Passed to solver.

Details

Implements the empirical likelihood estimator with design weights. If n_total is supplied, it is treated as the analysis-scale population size N_pop used in the design-weighted QLS system. If n_total is not supplied, sum(weights(design)) is used as N_pop. Design weights are not rescaled internally; the EL equations use respondent weights and N_pop via T_0 = N_{\mathrm{pop}} - \sum d_i in the linkage equation. When respondents-only designs are used (no NA in the outcome), n_total must be provided; if auxiliaries are requested you must also provide population auxiliary means via auxiliary_means. Result weights are the unnormalized EL masses d_i / D_i(\theta) on this analysis scale; weights(result, scale = "population") sums to N_pop.

Value

An object of class c("nmar_result_el","nmar_result").


EL utility helpers

Description

Internal helpers for auxiliary consistency checks and shared validation routines used during input parsing.


Validate design spec dimensions

Description

Validate design spec dimensions

Usage

el_validate_design_spec(design, data_nrow)

Validate matrix columns for NA and zero variance

Description

Validate matrix columns for NA and zero variance

Usage

el_validate_matrix(
  mat,
  allow_na,
  label,
  severity,
  row_map = NULL,
  scope_note = NULL,
  plural_label = FALSE
)

Enforce (near-)nonnegativity of weights

Description

Softly enforces nonnegativity of a numeric weight vector. Large negative values (beyond a tolerance) are treated as errors; small negative values (for example, from numerical noise) are truncated to zero.

Usage

enforce_nonneg_weights(weights, tol = 1e-08)

Arguments

weights

numeric vector of weights.

tol

numeric tolerance below which negative values are treated as numerical noise and clipped to zero.

Details

Values below -tol are treated as clearly negative. Values in [-tol, 0) are clipped to zero.

Value

A list with components:

ok

logical; TRUE if no clearly negative weights were found.

message

character; diagnostic message when ok is FALSE, otherwise NULL.

weights

numeric vector of adjusted weights (original if ok is FALSE, otherwise with small negatives clipped to zero).
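
The documented behavior can be sketched in a few lines of base R. The function name enforce_nonneg_sketch and the wording of the diagnostic message are hypothetical; only the tolerance rule and the shape of the return value follow the description above.

```r
# Hypothetical re-implementation of the documented rule, for illustration
enforce_nonneg_sketch <- function(weights, tol = 1e-8) {
  clearly_negative <- weights < -tol
  if (any(clearly_negative)) {
    # Clearly negative values: report and return the weights unchanged
    return(list(
      ok = FALSE,
      message = sprintf("%d weight(s) below -tol", sum(clearly_negative)),
      weights = weights
    ))
  }
  # Values in [-tol, 0) are treated as numerical noise and clipped to zero
  weights[weights < 0] <- 0
  list(ok = TRUE, message = NULL, weights = weights)
}

noise <- enforce_nonneg_sketch(c(1, -1e-12, 0.5))
bad <- enforce_nonneg_sketch(c(1, -0.1))
```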


Extract engine configuration

Description

Returns the underlying configuration of an engine as a named list. This is intended for programmatic inspection (e.g., parameter tuning, logging). The returned object should be treated as read-only.

Usage

engine_config(x)

Arguments

x

An object inheriting from class 'nmar_engine'.

Value

A named list of configuration fields.


Canonical engine name

Description

Returns a stable, machine-friendly identifier for an engine object. This identifier is also used in 'nmar_result$meta$engine_name' to keep a consistent naming scheme between configurations and results.

Usage

engine_name(x)

Arguments

x

An object inheriting from class 'nmar_engine'.

Value

A single character string, e.g. "empirical_likelihood".


Exponential tilting estimator

Description

Generic for the exponential tilting (ET) estimator under NMAR. Methods are provided for 'data.frame' and 'survey.design'.

Usage

exptilt(data, ...)

Arguments

data

A 'data.frame' or a 'survey.design'.

...

Passed to class-specific methods.

Value

An engine-specific NMAR result object (for example nmar_result_exptilt).

See Also

'exptilt.data.frame()', 'exptilt.survey.design()', 'exptilt_engine()'


Exponential tilting (ET) engine for NMAR estimation

Description

Constructs a configuration for the exponential tilting estimator under nonignorable nonresponse (NMAR). The estimator solves S_2(\boldsymbol{\phi}, \hat{\boldsymbol{\gamma}}) = 0 with nleqslv::nleqslv() as part of an EM algorithm.

Usage

exptilt_engine(
  standardize = FALSE,
  on_failure = c("return", "error"),
  variance_method = c("bootstrap", "none"),
  bootstrap_reps = 10,
  supress_warnings = FALSE,
  control = list(),
  family = c("logit", "probit"),
  y_dens = c("normal", "lognormal", "exponential", "binomial"),
  stopping_threshold = 1,
  sample_size = 2000
)

Arguments

standardize

logical; standardize predictors. Default FALSE.

on_failure

character; "return" or "error" on solver failure

variance_method

character; one of "bootstrap" or "none".

bootstrap_reps

integer; number of bootstrap replicates when variance_method = "bootstrap".

supress_warnings

Logical; suppress variance-related warnings.

control

Named list of control parameters passed to nleqslv::nleqslv. Common parameters include:

  • maxit: Maximum number of iterations (default: 100)

  • method: Solver method - "Newton" or "Broyden" (default: "Newton")

  • global: Global strategy - "dbldog", "pwldog", "qline", "gline", "hook", or "none" (default: "dbldog")

  • xtol: Tolerance for relative error in solution (default: 1e-8)

  • ftol: Tolerance for function value (default: 1e-8)

  • btol: Tolerance for backtracking (default: 0.01)

  • allowSingular: Allow singular Jacobians (default: TRUE)

See ?nleqslv::nleqslv for full details.

family

character; response model family, either "logit" or "probit", or a family object created by logit_family() / probit_family().

y_dens

Outcome density model; one of "normal", "lognormal", "exponential", or "binomial".

stopping_threshold

Numeric; early stopping threshold. If the maximum absolute value of the score function falls below this threshold, the algorithm stops early (default: 1).

sample_size

Integer; maximum sample size for stratified random sampling (default: 2000). When the dataset exceeds this size, a stratified random sample is drawn to optimize memory usage. The sampling preserves the ratio of respondents to non-respondents in the original data.
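
A rough sketch of ratio-preserving subsampling as described for sample_size. The helper stratified_idx and the simulated response indicator are hypothetical; the engine's own sampling code may differ in its details.

```r
set.seed(42)
# Hypothetical response indicator: 7000 respondents, 3000 nonrespondents
resp_flag <- c(rep(TRUE, 7000), rep(FALSE, 3000))
sample_size <- 2000

stratified_idx <- function(resp_flag, sample_size) {
  n <- length(resp_flag)
  if (n <= sample_size) return(seq_len(n))  # small data: keep everything
  idx_resp <- which(resp_flag)
  idx_non <- which(!resp_flag)
  # Allocate the sample in proportion to the respondent share
  n_resp <- round(sample_size * length(idx_resp) / n)
  n_non <- sample_size - n_resp
  c(
    idx_resp[sample.int(length(idx_resp), n_resp)],
    idx_non[sample.int(length(idx_non), n_non)]
  )
}

idx <- stratified_idx(resp_flag, sample_size)
mean(resp_flag[idx])  # respondent share in the subsample
```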

Details

The method is a robust propensity-score adjustment (PSA) approach for data that are not missing at random (NMAR). It uses maximum likelihood estimation (MLE), basing the likelihood on the observed part of the sample (f(\boldsymbol{Y}_i | \delta_i = 1, \boldsymbol{X}_i)), which makes it robust against outcome model misspecification. The propensity score is estimated by assuming an instrumental variable (X_2) that is independent of the response status given the other covariates and the study variable. The estimator calculates fractional imputation weights w_i. The final estimator is a weighted average whose weights are the inverse of the estimated response probabilities \hat{\pi}_i, satisfying the estimating equation:

\sum_{i \in \mathcal{R}} \frac{\boldsymbol{g}(\boldsymbol{Y}_i, \boldsymbol{X}_i ; \boldsymbol{\theta})}{\hat{\pi}_i} = 0,

where \mathcal{R} is the set of observed respondents.
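
For the population mean, g(Y; \theta) = Y - \theta, and the estimating equation has a closed-form solution. The sketch below illustrates this on simulated data. As a simplifying assumption it plugs in the true response probabilities rather than the estimates \hat{\pi}_i produced by the engine, so it shows the final weighting step only, not the exponential tilting fit itself.

```r
set.seed(1)
n <- 5000
x <- rnorm(n)
y <- -1 + x + rnorm(n)

# Response probability depends on y itself (NMAR); assumed known here
pi_true <- plogis(0.8 - 0.2 * y)
delta <- runif(n) < pi_true

# Solve sum_{i in R} (y_i - theta) / pi_i = 0 for theta
theta_ipw <- sum(y[delta] / pi_true[delta]) / sum(1 / pi_true[delta])

c(full_sample = mean(y), respondents_only = mean(y[delta]), weighted = theta_ipw)
```

The respondents-only mean ignores the response mechanism, while the weighted solution corrects for it.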

Value

An engine object of class c("nmar_engine_exptilt","nmar_engine"). This is a configuration list; it is not a fit. Pass it to nmar.

References

Riddles, M. K., Kim, J. K., and Im, J. (2016). A propensity-score-adjustment method for nonignorable nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215-245. doi:10.1093/jssam/smv047

Examples


generate_test_data <- function(
  n_rows = 500,
  n_cols = 1,
  case = 1,
  x_var = 0.5,
  eps_var = 0.9,
  a = 0.8,
  b = -0.2
) {
  # Generate the covariates x1, ..., x<n_cols>
  X <- as.data.frame(replicate(n_cols, rnorm(n_rows, 0, sqrt(x_var))))
  colnames(X) <- paste0("x", seq_len(n_cols))

  # Generate the outcome Y; each case uses a different mean function
  eps <- rnorm(n_rows, 0, sqrt(eps_var))
  x_mat <- as.matrix(X)
  ones <- rep(1, n_cols)
  if (case == 1) {
    # Linear mean with unit coefficients: y = -1 + x1 + ... + epsilon
    X$Y <- as.vector(-1 + x_mat %*% ones + eps)
  } else if (case == 2) {
    X$Y <- as.vector(-2 + 0.5 * exp(x_mat %*% ones) + eps)
  } else if (case == 3) {
    X$Y <- as.vector(-1 + sin(2 * x_mat %*% ones) + eps)
  } else if (case == 4) {
    X$Y <- as.vector(-1 + 0.4 * x_mat^3 %*% ones + eps)
  }

  Y_original <- X$Y

  # Missingness mechanism: logistic in Y, so nonresponse is nonignorable
  pi_obs <- 1 / (1 + exp(-(a + b * X$Y)))

  # Set Y to NA for simulated nonrespondents
  mask <- runif(nrow(X)) > pi_obs
  mask[1] <- FALSE # Ensure at least one observation is not missing
  X$Y[mask] <- NA

  return(list(X = X, Y_original = Y_original))
}
res_test_data <- generate_test_data(n_rows = 500, n_cols = 1, case = 1)
x <- res_test_data$X

exptilt_config <- exptilt_engine(
  y_dens = 'normal',
  control = list(maxit = 10),
  stopping_threshold = 0.1,
  standardize = FALSE,
  family = 'logit',
  bootstrap_reps = 5
)
formula <- Y ~ x1
res <- nmar(formula = formula, data = x, engine = exptilt_config, trace_level = 1)
summary(res)


Nonparametric Exponential Tilting (Internal Generic)

Description

Nonparametric Exponential Tilting (Internal Generic)

Usage

exptilt_nonparam(data, ...)

Arguments

data

A data.frame or survey.design object

...

Other arguments passed to methods

Value

An engine-specific NMAR result object for the nonparametric exponential tilting estimator.


Nonparametric exponential tilting (EM) engine for NMAR

Description

Constructs a configuration for the nonparametric exponential tilting estimator under nonignorable nonresponse (NMAR). This engine implements the "Fully Nonparametric Approach" from Appendix 2 of Riddles et al. (2016). The estimator uses an Expectation-Maximization (EM) algorithm to directly estimate the nonresponse odds O(x_1, y) for aggregated, categorical data.

Usage

exptilt_nonparam_engine(refusal_col = "", max_iter = 100, tol_value = 1e-06)

Arguments

refusal_col

character; the column name in data that contains the aggregated counts of non-respondents (refusals).

max_iter

integer; the maximum number of iterations for the EM algorithm.

tol_value

numeric; the convergence tolerance for the EM algorithm. The loop stops when the sum of absolute changes in the odds matrix is less than this value.

Details

This engine is designed for cases where all variables (outcomes Y, response predictors X_1, and instrumental variables X_2) are categorical, and the input data is pre-aggregated into strata.

The method assumes an instrumental variable X_2 is available. The response probability is assumed to depend on X_1 and Y, but not on X_2.

The EM algorithm iteratively solves for the nonresponse odds:

O^{(t+1)}(x_1^*, y^*) = \frac{M_{y^*x_1^*}^{(t)}}{N_{y^*x_1^*}}

where M_{y^*x_1^*}^{(t)} is the expected count of non-respondents (calculated in the E-step) and N_{y^*x_1^*} is the observed count of respondents for a given stratum (x_1^*, y^*).

The final output from the nmar call is an object containing data_to_return, an aggregated data frame where the original 'refusal' counts have been redistributed into the outcome columns (e.g., 'Voted_A', 'Voted_B') as expected non-respondent counts.
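
The update above can be sketched in base R on the voting counts from the example. The E-step used here is an assumption about the method, not a transcription of the package code: within each (x_1, x_2) cell, refusals are split across outcomes in proportion to the respondent count times the current odds. All object names are illustrative.

```r
# Respondent counts per (Gender x Age_group) cell and outcome
resp <- cbind(
  Voted_A = c(93, 104, 146, 560, 106, 129, 170, 501),
  Voted_B = c(115, 233, 295, 350, 159, 242, 262, 218),
  Other   = c(4, 8, 5, 3, 8, 5, 5, 7)
)
refusal <- c(28, 82, 49, 174, 62, 70, 69, 211)  # nonrespondents per cell
gender <- rep(c("Male", "Female"), each = 4)    # x_1; age group is x_2

# N_{y x1}: respondent counts summed over the instrument (age group)
N_yx1 <- rowsum(resp, gender)
odds <- matrix(1, nrow(N_yx1), ncol(N_yx1), dimnames = dimnames(N_yx1))

for (iter in seq_len(100)) {
  # E-step (assumed form): allocate each cell's refusals across outcomes
  w <- resp * odds[gender, , drop = FALSE]
  alloc <- refusal * w / rowSums(w)
  # M-step: O(x1, y) = M_{y x1} / N_{y x1}
  new_odds <- rowsum(alloc, gender) / N_yx1
  delta <- sum(abs(new_odds - odds))
  odds <- new_odds
  if (delta < 1e-6) break
}

# Respondent counts plus expected nonrespondent counts
adjusted <- resp + alloc
```

By construction, each row of alloc sums to that cell's refusal count, so the adjusted table keeps the original cell totals.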

Value

An engine object of class c("nmar_engine_exptilt_nonparam","nmar_engine"). This is a configuration list; it is not a fit. Pass it to nmar.

References

Riddles, M. K., Kim, J. K., and Im, J. (2016). A propensity-score-adjustment method for nonignorable nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215-245. doi:10.1093/jssam/smv047 (see Appendix 2 for this specific method).

Examples

# Test data (Riddles et al. 2016, Table 9)
voting_data_example <- data.frame(
  Gender = rep(c("Male", "Female"), each = 4),
  Age_group = rep(c("20-29", "30-39", "40-49", "50+"), times = 2),
  Voted_A = c(93, 104, 146, 560, 106, 129, 170, 501),
  Voted_B = c(115, 233, 295, 350, 159, 242, 262, 218),
  Other = c(4, 8, 5, 3, 8, 5, 5, 7),
  Refusal = c(28, 82, 49, 174, 62, 70, 69, 211),
  Total = c(240, 427, 495, 1087, 335, 446, 506, 937)
)

np_em_config <- exptilt_nonparam_engine(
  refusal_col = "Refusal",
  max_iter = 100,
  tol_value = 0.001
)

# Formula: Y1 + Y2 + ... ~ X1_vars | X2_vars
# Here, Y = Voted_A, Voted_B, Other
#      x1 = Gender (response model)
#      x2 = Age_group (instrumental variable)
em_formula <- Voted_A + Voted_B + Other ~ Gender | Age_group


results_em_np <- nmar(
  formula = em_formula,
  data = voting_data_example,
  engine = np_em_config,
  trace_level = 0
)

# View the final adjusted counts
# (Original counts + expected non-respondent counts)
print(results_em_np$data_final)


Extract top-level nleqslv arguments from a control-like list

Description

Extract top-level nleqslv arguments from a control-like list

Usage

extract_nleqslv_top(ctrl)

Default fitted values for NMAR results

Description

Returns fitted response probabilities if available.

Usage

## S3 method for class 'nmar_result'
fitted(object, ...)

Arguments

object

An 'nmar_result' object.

...

Ignored.

Value

A numeric vector (possibly length 0).


One-line formatter for NMAR engines

Description

Returns a single concise line summarizing an engine configuration.

Usage

## S3 method for class 'nmar_engine'
format(x, ...)

Arguments

x

An engine object inheriting from 'nmar_engine'.

...

Unused.

Value

A length-1 character vector.


Default formula for NMAR results

Description

Returns the estimation formula if available.

Usage

## S3 method for class 'nmar_result'
formula(x, ...)

Arguments

x

An 'nmar_result' object.

...

Ignored.

Value

A formula or 'NULL'.


Generate conditional density

Description

Generate conditional density

Usage

generate_conditional_density(model)

Arguments

model

An internal exptilt object


Glance summary for NMAR results

Description

One-row diagnostics for NMAR fits.

Usage

## S3 method for class 'nmar_result'
glance(x, ...)

Arguments

x

An object of class 'nmar_result'.

...

Ignored.

Value

A one-row data frame with diagnostics and metadata.


Construct a logit response family bundle

Description

Construct a logit response family bundle

Usage

logit_family()

Value

A list with components name, linkinv, mu.eta, d2mu.deta2, and score_eta.


Prefer explicit solver_args over control-provided top-level args

Description

Prefer explicit solver_args over control-provided top-level args

Usage

merge_nleqslv_top(solver_args, control_top)

Construct EL Engine Object

Description

Construct EL Engine Object

Usage

new_nmar_engine_el(engine)

Construct Result Object (parent helper)

Description

Builds an 'nmar_result' list using the shared schema and validates it. Engines must pass named fields; no legacy positional signature is supported.

Usage

new_nmar_result(...)

Details

Engine-level constructors should call this helper with named arguments rather than assembling result lists by hand. At minimum, engines should supply estimate (numeric scalar) and converged (logical). All other fields are optional.

Calling new_nmar_result() ensures that every engine returns objects that satisfy the shared schema and are immediately compatible with parent S3 methods such as vcov(), confint(), tidy(), glance(), and weights().


Construct EL Result Object

Description

Construct EL Result Object

Usage

new_nmar_result_el(
  y_hat,
  se,
  weights,
  coefficients,
  vcov,
  converged,
  diagnostics,
  inputs,
  nmar_scaling_recipe,
  fitted_values,
  call,
  formula = NULL
)

Not Missing at Random (NMAR) Estimation

Description

High-level interface for NMAR estimation.

nmar() validates basic inputs and dispatches to an engine (for example el_engine). The engine controls the estimation method and interprets formula; see the engine documentation for model-specific requirements.

Usage

nmar(formula, data, engine, trace_level = 0)

Arguments

formula

A two-sided formula. Many engines support a partitioned right-hand side via |, for example y_miss ~ block1_vars | block2_vars. The meaning of these blocks is engine-specific (see the engine documentation). In the common "missing values indicate nonresponse" workflow, the left-hand side is the outcome with NA values for nonrespondents.

data

A data.frame or a survey.design containing the variables referenced by formula.

engine

An NMAR engine configuration object, typically created by el_engine, exptilt_engine, or exptilt_nonparam_engine. This object defines the estimation method and tuning parameters and must inherit from class "nmar_engine".

trace_level

Integer 0-3; controls verbosity during estimation (default 0):

  • 0: no output (silent mode);

  • 1: major steps only (initialization, convergence, final results);

  • 2: iteration summaries and key diagnostics;

  • 3: full diagnostic output.

Value

An object of class "nmar_result" with an engine-specific subclass (for example "nmar_result_el"). Use summary(), se(), confint(), weights(), coef(), fitted(), and generics::tidy() / generics::glance() to access estimates, standard errors, weights, and diagnostics.

See Also

el_engine, exptilt_engine, exptilt_nonparam_engine, summary.nmar_result, weights.nmar_result

Examples

set.seed(1)
n <- 200
x1 <- rnorm(n)
z1 <- rnorm(n)
y_true <- 0.5 + 0.3 * x1 + 0.2 * z1 + rnorm(n, sd = 0.3)
resp <- rbinom(n, 1, plogis(2 + 0.1 * y_true + 0.1 * z1))
if (all(resp == 1)) resp[sample.int(n, 1)] <- 0L
y_obs <- ifelse(resp == 1, y_true, NA_real_)

# Empirical likelihood engine
df_el <- data.frame(Y_miss = y_obs, X = x1, Z = z1)
eng_el <- el_engine(variance_method = "none")
fit_el <- nmar(Y_miss ~ X | Z, data = df_el, engine = eng_el)
summary(fit_el)


# Exponential tilting engine (illustrative)
dat_et <- data.frame(y = y_obs, x2 = z1, x1 = x1)
eng_et <- exptilt_engine(
  y_dens = "normal",
  family = "logit",
  variance_method = "none"
)
fit_et <- nmar(y ~ x2 | x1, data = dat_et, engine = eng_et)
summary(fit_et)

# Survey design example (same outcome, random weights)
if (requireNamespace("survey", quietly = TRUE)) {
  w <- runif(n, 0.5, 2)
  des <- survey::svydesign(ids = ~1, weights = ~w,
                           data = data.frame(Y_miss = y_obs, X = x1, Z = z1))
  eng_svy <- el_engine(variance_method = "none")
  fit_svy <- nmar(Y_miss ~ X | Z, data = des, engine = eng_svy)
  summary(fit_svy)
}

# Bootstrap variance usage
if (requireNamespace("future.apply", quietly = TRUE)) {
  set.seed(2)
  eng_boot <- el_engine(
    variance_method = "bootstrap",
    bootstrap_reps = 20
  )
  fit_boot <- nmar(Y_miss ~ X | Z, data = df_el, engine = eng_boot)
  se(fit_boot)
}


S3 helpers for NMAR engine objects

Description

Lightweight, user-facing methods for engine configuration objects (class 'nmar_engine'). These improve discoverability and provide a consistent print surface across engines while keeping the objects as simple lists internally.

Design

  • 'engine_name()' returns a canonical identifier used across the package (e.g., in 'nmar_result$meta$engine_name').

  • 'print.nmar_engine()' provides a concise, readable summary of the engine configuration; engine-specific classes reuse the parent method unless they need to override it.

  • 'engine_config()' returns the underlying configuration as a named list for programmatic inspection.


Format a number with fixed decimal places using nmar.digits

Description

Format a number with fixed decimal places using nmar.digits

Usage

nmar_fmt_num(x, digits = nmar_get_digits())

Format an abridged call line for printing

Description

Builds a concise one-line summary of the original call without materializing large objects (e.g., full data frames). Intended for use by print/summary methods.

Usage

nmar_format_call_line(x)

Details

Uses option 'nmar.show_call' (default TRUE). Width can be tuned via option 'nmar.call_width' (default 120), but the formatter aims to keep the line compact regardless of width.


Resolve global digits setting for printing

Description

Resolve global digits setting for printing

Usage

nmar_get_digits()

EL denominator floor (global, consistent)

Description

Returns the small positive floor \delta used to guard the empirical likelihood denominator D_i(\theta) away from zero. This guard must be applied consistently in the estimating equations, analytic Jacobian, and post-solution weight construction. Advanced users can override via 'options(nmar.el_denom_floor = 1e-8)'.

Usage

nmar_get_el_denom_floor()

NMAR numeric settings

Description

NMAR numeric settings

Usage

nmar_get_numeric_settings()

Details

Centralized access to numeric thresholds used across the package. Advanced users can override these settings via options():

  • 'nmar.eta_cap': scalar > 0. Caps the response-model linear predictor to avoid extreme link values in Newton updates. Default 50.

  • 'nmar.grad_eps': finite-difference step size epsilon for numeric gradients of smooth functionals. Default 1e-6.

  • 'nmar.grad_d': relative step adjustment for numeric gradients. Default 1e-3.

The defaults are chosen to be conservative and stable across typical NMAR problems. Engines should retrieve values via this helper rather than hard-coding numbers, so documentation stays consistent.

Value

A named list with entries 'eta_cap', 'grad_eps', and 'grad_d'.
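
A short sketch of overriding one of these options and applying the resulting cap. The option name and its default of 50 come from the text above; applying the cap symmetrically with pmin()/pmax() is an assumption about how such a bound could be used, not the package's internal code.

```r
# Temporarily tighten the cap on the linear predictor
old <- options(nmar.eta_cap = 30)
eta_cap <- getOption("nmar.eta_cap", default = 50)

# Assumed usage: bound the linear predictor to [-eta_cap, eta_cap]
eta <- c(-80, -5, 0, 12, 95)
eta_capped <- pmax(pmin(eta, eta_cap), -eta_cap)

options(old)  # restore the previous setting
```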


Internal helpers for nmar_result objects

Description

Internal helpers for nmar_result objects

Usage

nmar_result_get_estimate(x)

Parent S3 surface for NMAR results

Description

Methods that apply to the parent 'nmar_result' class and are not specific to a particular engine (e.g., EL). Engines return a child class (e.g., 'nmar_result_el') that inherits from 'nmar_result' and may override or extend behavior.

Details

S3 surface for base 'nmar_result'

Result objects expose a universal schema:

  • 'y_hat', 'estimate_name', 'se', 'converged'.

  • 'model': list with 'coefficients', 'vcov', plus optional extras.

  • 'weights_info': list with respondent weights and trimming metadata.

  • 'sample': list with total units, respondent count, survey flag, and 'design'.

  • 'inference': variance metadata ('variance_method', 'df', diagnostic flags).

  • 'diagnostics', 'meta', and 'extra' for estimator-specific details.

New engines should populate these components in their constructors and rely on the 'nmar_result_get_*' utilities when implementing child-specific S3 methods.


Shared scaling infrastructure for NMAR engines

Description

Centralized feature scaling and parameter unscaling routines used by NMAR estimation engines to ensure consistent, numerically stable behavior.

Goals

Inputs/Outputs

Inputs

Z_un (response model matrix with intercept), optional X_un (auxiliary model matrix, no intercept), optional named mu_x_un (auxiliary means on the original scale), and a logical standardize flag.

Outputs

Scaled matrices Z, X, and mu_x, plus an nmar_scaling_recipe used later for unscaling.

Integration pattern

  1. Before solving: call validate_and_apply_nmar_scaling() (engine-level) or prepare_nmar_scaling() (low-level) to obtain scaled matrices and recipe.

  2. Solve in the scaled space.

  3. After solving: call unscale_coefficients() to unscale coefficients and their covariance matrices.

  4. Store the nmar_scaling_recipe in results for diagnostics and reproducibility.
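The four steps can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the recipe here is a plain list of column means and standard deviations (a hypothetical stand-in for nmar_scaling_recipe), the data are simulated, and an ordinary logistic regression plays the role of the solver.

```r
set.seed(1)
n <- 200
x1 <- rnorm(n, mean = 50, sd = 10)
x2 <- rnorm(n, mean = -3, sd = 0.2)
r <- rbinom(n, 1, plogis(-2 + 0.05 * x1 + 0.5 * x2))
Z_un <- cbind("(Intercept)" = 1, x1 = x1, x2 = x2)

# 1. Build a recipe (assumed form: means and sds) and scale non-intercept columns.
cols <- setdiff(colnames(Z_un), "(Intercept)")
recipe <- list(mean = colMeans(Z_un[, cols]), sd = apply(Z_un[, cols], 2, sd))
Z <- Z_un
Z[, cols] <- sweep(sweep(Z_un[, cols], 2, recipe$mean, "-"), 2, recipe$sd, "/")

# 2. Solve in the scaled space.
b_s <- glm.fit(Z, r, family = binomial())$coefficients

# 3. Map the coefficients back to the original scale.
b <- b_s
b[cols] <- b_s[cols] / recipe$sd
b["(Intercept)"] <- b_s["(Intercept)"] - sum(b_s[cols] * recipe$mean / recipe$sd)

# 4. Keep the recipe with the result for diagnostics and reproducibility.
result <- list(coefficients = b, recipe = recipe)

# Check: fitting directly on the unscaled design gives the same coefficients.
b_raw <- glm.fit(Z_un, r, family = binomial())$coefficients
all.equal(unname(b), unname(b_raw), tolerance = 1e-4)
```

Solving on standardized columns keeps the columns on comparable scales, while the back-transformation ensures reported coefficients refer to the original variables.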

Notes


Polish Household Budget Data with Simulated Nonignorable Nonresponse

Description

This dataset is derived from the 'h05' dataset (Polish household budgets for 2005) found in the 'RClas' package. The original data was cleaned to remove all rows with missing values.

Usage

polish_households

Format

A data frame with 19,330 rows and 17 columns. The key variables are:

class

TODO

voi

TODO

bio

TODO

type

TODO

d345

TODO

d347

TODO

d348

TODO

d36

TODO

d38

TODO

d61

TODO

noper

TODO

income

TODO

expenditure

Numeric. Total household expenditure; the variable that was scaled to create 'y_exp'.

y_exp

Numeric. The **true** scaled expenditure ('expenditure / mean(expenditure)'). This is the complete study variable without missingness.

resp

Numeric. The true response probability used in the simulation, 'plogis(1 - 0.6 * y_exp)'.

R

Integer. The simulated response indicator (1=responded, 0=nonresponse).

y_exp_miss

Numeric. The **observed** scaled expenditure, containing 7,778 'NA' values where 'R = 0'. This is the variable to be used as the NMAR-affected outcome.

Details

To create a realistic test case for nonignorable nonresponse (NMAR), a nonresponse mechanism was simulated and applied to the scaled expenditure variable ('y_exp').

The key simulation steps were:

  1. 'y_exp' (true study variable) was created by scaling total expenditure.

  2. A true response probability ('resp') was created using the logistic model: 'plogis(1 - 0.6 * y_exp)'.

  3. A response indicator ('R') was simulated based on this probability.

  4. The final variable 'y_exp_miss' was generated by setting 'y_exp' to 'NA' wherever 'R' was 0.

The response is **nonignorable** because the probability of missingness depends directly on the value of the expenditure variable itself.
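The simulation steps can be reproduced in a few lines of base R. The sketch below uses a synthetic log-normal 'expenditure' vector as a stand-in (an assumption; it is not the household data) and an arbitrary seed, so the counts will not match the 7,778 missing values in the shipped dataset.

```r
set.seed(2005)

# Synthetic stand-in for household expenditure (assumption; not the real data).
expenditure <- rlnorm(5000, meanlog = 7, sdlog = 0.6)

y_exp <- expenditure / mean(expenditure)        # step 1: scaled study variable
resp <- plogis(1 - 0.6 * y_exp)                 # step 2: true response probability
R <- rbinom(length(y_exp), 1, resp)             # step 3: response indicator
y_exp_miss <- ifelse(R == 1, y_exp, NA_real_)   # step 4: observed outcome

mean(y_exp)                     # equals 1 by construction
mean(y_exp_miss, na.rm = TRUE)  # respondent mean, biased downward under this mechanism
```

Because households with higher expenditure respond less often, the respondent mean understates the full-sample mean, which is the bias an NMAR estimator is meant to correct.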

Source

TODO

See Also

'riddles_case1', 'riddles_case2', 'riddles_case3', 'riddles_case4'


Prepare scaled matrices and moments (low-level)

Description

Prepare scaled matrices and moments (low-level)

Usage

prepare_nmar_scaling(
  Z_un,
  X_un,
  mu_x_un,
  standardize,
  weights = NULL,
  weight_mask = NULL
)

Arguments

Z_un

response model matrix (with intercept column).

X_un

auxiliary model matrix (no intercept), or NULL.

mu_x_un

named numeric vector of auxiliary means on the original scale (names must match colnames(X_un)), or NULL.

standardize

logical; apply standardization if TRUE.

weights

Optional numeric vector used for weighted scaling.

weight_mask

Optional logical mask or nonnegative numeric multipliers applied to weights.

Value

A list with components Z, X, mu_x, and recipe.


Print method for NMAR engines

Description

Provides a compact, human-friendly summary for 'nmar_engine' objects. Child classes inherit this method; they can override it if they need a different presentation.

Usage

## S3 method for class 'nmar_engine'
print(x, ...)

Arguments

x

An engine object inheriting from 'nmar_engine'.

...

Unused.

Value

'x', invisibly.


Print method for nmar_result

Description

Print method for nmar_result

Usage

## S3 method for class 'nmar_result'
print(x, ...)

Arguments

x

nmar_result object

...

Additional parameters

Value

'x', invisibly.


Print method for EL results

Description

Compact print for objects of class nmar_result_el.

Usage

## S3 method for class 'nmar_result_el'
print(x, ...)

Arguments

x

An object of class nmar_result_el.

...

Ignored.

Value

x, invisibly.


Print method for Exponential Tilting results (engine-specific)

Description

This print method is tailored for 'nmar_result_exptilt' objects and shows a concise, human-friendly summary of the estimation result together with exptilt-specific diagnostics (loss, iterations) and a compact view of the response coefficients stored in the fitted model.

Usage

## S3 method for class 'nmar_result_exptilt'
print(x, ...)

Arguments

x

An object of class 'nmar_result_exptilt'.

...

Ignored.

Value

'x', invisibly.


Print method for summary.nmar_result

Description

Print method for summary.nmar_result

Usage

## S3 method for class 'summary_nmar_result'
print(x, ...)

Arguments

x

summary_nmar_result object

...

Additional parameters

Value

'x', invisibly.


Construct a probit response family bundle

Description

Construct a probit response family bundle

Usage

probit_family()

Value

A list with components name, linkinv, mu.eta, d2mu.deta2, and score_eta.
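For orientation, a probit bundle of this shape can be sketched with base R's normal distribution functions. This is an assumption-laden illustration rather than the package's code: the list name probit_sketch is hypothetical, and score_eta is omitted because its signature is not documented here.

```r
# Hypothetical sketch of a probit bundle (score_eta intentionally omitted).
probit_sketch <- list(
  name = "probit",
  linkinv = function(eta) pnorm(eta),            # mu = Phi(eta)
  mu.eta = function(eta) dnorm(eta),             # d mu / d eta = phi(eta)
  d2mu.deta2 = function(eta) -eta * dnorm(eta)   # d^2 mu / d eta^2 = -eta * phi(eta)
)

# Check the analytic derivatives against central finite differences.
eta <- c(-1.5, 0, 0.7)
h <- 1e-5
num_d1 <- (probit_sketch$linkinv(eta + h) - probit_sketch$linkinv(eta - h)) / (2 * h)
num_d2 <- (probit_sketch$mu.eta(eta + h) - probit_sketch$mu.eta(eta - h)) / (2 * h)

all.equal(num_d1, probit_sketch$mu.eta(eta), tolerance = 1e-6)
all.equal(num_d2, probit_sketch$d2mu.deta2(eta), tolerance = 1e-6)
```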


Riddles Simulation, Case 1: Linear Mean

Description

A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 1) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.

Usage

riddles_case1

Format

A data frame with 500 rows and 4 variables:

x

Numeric. The auxiliary variable, x ~ Normal(mean = 0, variance = 0.5).

y

Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.

y_true

Numeric. The complete, true value of 'y' before missingness was introduced.

delta

Integer. The response indicator (1 = responded, 0 = nonresponse).

Details

This dataset was generated using the following model parameters (n = 500):

Density for x:

x ~ Normal(mean = 0, variance = 0.5)

Density for error:

e ~ Normal(mean = 0, variance = 0.9)

True Model (Case 1):

y_true = -1 + x + e

Response Model (NMAR):

logit(pi) = 0.8 - 0.2 * y_true
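A dataset with the same structure can be generated from these parameters in base R. The sketch below is not the script used to build riddles_case1: the seed is arbitrary and the object name sim_case1 is hypothetical, so the values will differ from the shipped data.

```r
set.seed(2016)
n <- 500

x <- rnorm(n, mean = 0, sd = sqrt(0.5))   # variance 0.5, so sd = sqrt(0.5)
e <- rnorm(n, mean = 0, sd = sqrt(0.9))   # variance 0.9, so sd = sqrt(0.9)

y_true <- -1 + x + e                      # Case 1: linear mean
pi_resp <- plogis(0.8 - 0.2 * y_true)     # response probability depends on y_true
delta <- rbinom(n, 1, pi_resp)            # 1 = responded, 0 = nonresponse
y <- ifelse(delta == 1, y_true, NA_real_) # mask nonrespondents

sim_case1 <- data.frame(x = x, y = y, y_true = y_true, delta = delta)
head(sim_case1)
```

Note that rnorm() is parameterized by the standard deviation, so the variances listed above must be passed as square roots.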

Source

Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(1), 1-31.


Riddles Simulation, Case 2: Exponential Mean

Description

A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 2) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.

Usage

riddles_case2

Format

A data frame with 500 rows and 4 variables:

x

Numeric. The auxiliary variable, x ~ Normal(mean = 0, variance = 0.5).

y

Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.

y_true

Numeric. The complete, true value of 'y' before missingness was introduced.

delta

Integer. The response indicator (1 = responded, 0 = nonresponse).

Details

This dataset was generated using the following model parameters (n = 500):

Density for x:

x ~ Normal(mean = 0, variance = 0.5)

Density for error:

e ~ Normal(mean = 0, variance = 0.9)

True Model (Case 2):

y_true = -2 + 0.5 * exp(x) + e

Response Model (NMAR):

logit(pi) = 0.8 - 0.2 * y_true

Source

Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(1), 1-31.


Riddles Simulation, Case 3: Sine Wave Mean

Description

A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 3) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.

Usage

riddles_case3

Format

A data frame with 500 rows and 4 variables:

x

Numeric. The auxiliary variable, x ~ Normal(mean = 0, variance = 0.5).

y

Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.

y_true

Numeric. The complete, true value of 'y' before missingness was introduced.

delta

Integer. The response indicator (1 = responded, 0 = nonresponse).

Details

This dataset was generated using the following model parameters (n = 500):

Density for x:

x ~ Normal(mean = 0, variance = 0.5)

Density for error:

e ~ Normal(mean = 0, variance = 0.9)

True Model (Case 3):

y_true = -1 + sin(2 * x) + e

Response Model (NMAR):

logit(pi) = 0.8 - 0.2 * y_true

Source

Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(1), 1-31.


Riddles Simulation, Case 4: Cubic Mean

Description

A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 4) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.

Usage

riddles_case4

Format

A data frame with 500 rows and 4 variables:

x

Numeric. The auxiliary variable, x ~ Normal(mean = 0, variance = 0.5).

y

Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.

y_true

Numeric. The complete, true value of 'y' before missingness was introduced.

delta

Integer. The response indicator (1 = responded, 0 = nonresponse).

Details

This dataset was generated using the following model parameters (n = 500):

Density for x:

x ~ Normal(mean = 0, variance = 0.5)

Density for error:

e ~ Normal(mean = 0, variance = 0.9)

True Model (Case 4):

y_true = -1 + 0.4 * x^3 + e

Response Model (NMAR):

logit(pi) = 0.8 - 0.2 * y_true

Source

Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(1), 1-31.


Run method for EL engine

Description

Run method for EL engine

Usage

## S3 method for class 'nmar_engine_el'
run_engine(engine, formula, data, trace_level = 0)

Arguments

engine

An object of class nmar_engine_el.

formula

A two-sided formula passed through by nmar().

data

A data.frame or survey.design.

trace_level

Integer 0-3 controlling verbosity.

Value

An object of class nmar_result_el (which also inherits from nmar_result).


Sanitize nleqslv control list for compatibility

Description

Sanitize nleqslv control list for compatibility

Usage

sanitize_nleqslv_control(ctrl)


Map unscaled auxiliary multipliers to scaled space

Description

Map unscaled auxiliary multipliers to scaled space

Usage

scale_aux_multipliers(lambda_unscaled, recipe, columns)

Arguments

lambda_unscaled

named numeric vector of auxiliary multipliers aligned to auxiliary design columns (no intercept) on original scale.

recipe

Scaling recipe of class nmar_scaling_recipe.

columns

character vector of auxiliary column names (order) for the scaled design.

Value

numeric vector of multipliers in the scaled space.


Map unscaled coefficients to scaled space

Description

Map unscaled coefficients to scaled space

Usage

scale_coefficients(beta_unscaled, recipe, columns)

Arguments

beta_unscaled

named numeric vector of coefficients for the response model on the original scale, including an intercept named "(Intercept)".

recipe

Scaling recipe of class nmar_scaling_recipe, or NULL.

columns

character vector of column names (order) for the scaled design matrix (including intercept).

Value

numeric vector of coefficients in the scaled space, ordered by columns.


Extract standard error for NMAR results

Description

Returns the standard error of the primary mean estimate.

Usage

se(object, ...)

Arguments

object

An 'nmar_result' or subclass.

...

Ignored.

Value

Numeric scalar.


Weighted linear algebra helpers

Description

Weighted linear algebra helpers

Usage

shared_weighted_gram(X, w)


Summary method for nmar_result

Description

Summary method for nmar_result

Usage

## S3 method for class 'nmar_result'
summary(object, conf.level = 0.95, ...)

Arguments

object

nmar_result object

conf.level

Confidence level for intervals.

...

Additional parameters

Value

An object of class 'summary_nmar_result'.


Summary method for EL results

Description

Summarizes the point estimate, its standard error, and the missingness-model coefficients.

Usage

## S3 method for class 'nmar_result_el'
summary(object, ...)

Arguments

object

An object of class nmar_result_el.

...

Ignored.

Value

An object of class summary_nmar_result_el.


Summary method for Exponential Tilting results (engine-specific)

Description

Summarizes the point estimate, its standard error, and the model coefficients.

Usage

## S3 method for class 'nmar_result_exptilt'
summary(object, conf.level = 0.95, ...)

Arguments

object

An object of class 'nmar_result_exptilt'.

conf.level

Confidence level for confidence interval (default 0.95).

...

Ignored.

Value

An object of class 'summary_nmar_result_exptilt'.


Tidy summary for NMAR results

Description

Return a data frame with the primary estimate and (if available) missingness-model coefficients.

Usage

## S3 method for class 'nmar_result'
tidy(x, conf.level = 0.95, ...)

Arguments

x

An object of class 'nmar_result'.

conf.level

Confidence level for the primary estimate.

...

Ignored.

Value

A data frame with one row for the primary estimate and, when available, additional rows for the response-model coefficients.


Trim weights by capping and proportional redistribution

Description

Applies a cap to a nonnegative weight vector and, when feasible, redistributes excess mass across the remaining positive entries so that the total sum is preserved. When the requested cap is too tight to preserve the total mass, all positive entries are set to the cap and the total sum decreases.

Usage

trim_weights(weights, cap, tol = 1e-12, warn_tol = 1e-08)

Arguments

weights

numeric vector of weights.

cap

positive numeric scalar; maximum allowed weight, or Inf to disable trimming.

tol

numeric tolerance used when testing whether a rescaling step respects the cap.

warn_tol

numeric tolerance used when testing whether the total sum has been preserved.

Details

Zero weights remain zero; only entries that are positive after nonnegativity enforcement can absorb redistributed mass.

Internally, a simple water-filling style algorithm is used on the positive weights: the largest weights are successively saturated at the cap and the remaining weights are rescaled by a common factor chosen to maintain the total sum.
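The capping-and-redistribution idea can be sketched as below. This is one possible reading of the description, not the package's code: the function name trim_weights_sketch is hypothetical, it returns only the trimmed vector, and it omits the warn_tol check and the diagnostic components.

```r
trim_weights_sketch <- function(weights, cap, tol = 1e-12) {
  w <- pmax(weights, 0)                 # nonnegativity enforcement
  total <- sum(w)
  pos <- w > 0
  if (is.infinite(cap) || !any(w > cap)) return(w)   # nothing to trim
  if (cap * sum(pos) < total) {         # cap too tight: total cannot be preserved
    w[pos] <- cap
    return(w)
  }
  capped <- rep(FALSE, length(w))
  repeat {
    free <- pos & !capped
    remaining <- total - cap * sum(capped)   # mass left for uncapped entries
    scale <- remaining / sum(w[free])        # common rescaling factor
    over <- free & (w * scale > cap + tol)   # entries that would exceed the cap
    if (!any(over)) {
      w[free] <- w[free] * scale
      w[capped] <- cap
      return(w)
    }
    capped <- capped | over                  # saturate them and try again
  }
}

trim_weights_sketch(c(1, 2, 3, 10), cap = 5)  # 2 4 5 5, total 16 preserved
trim_weights_sketch(c(4, 4, 4), cap = 3)      # 3 3 3, total decreases
```

In the first call the weight 10 is capped, the rescaled weight 3 then also exceeds the cap and is saturated, and the remaining mass of 6 is spread over the weights 1 and 2 by a common factor of 2.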

Value

A list with components:

weights

numeric vector of trimmed weights.

trimmed_fraction

fraction of entries at or very close to the cap (within tol).

preserved_sum

logical; TRUE if the total sum of weights is preserved to within warn_tol.

total_before

numeric; sum of the original weights.

total_after

numeric; sum of the trimmed weights.


Unscale regression coefficients and covariance

Description

Unscale regression coefficients and covariance

Usage

unscale_coefficients(scaled_coeffs, scaled_vcov, recipe)

Arguments

scaled_coeffs

named numeric vector of coefficients estimated on the scaled space.

scaled_vcov

covariance matrix of scaled_coeffs.

recipe

Scaling recipe of class nmar_scaling_recipe.

Value

A list with components coefficients and vcov.
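When each non-intercept column is standardized as (x - mean) / sd, unscaling is a linear map of the scaled coefficients, so the covariance transforms as D V D'. The sketch below illustrates that algebra with a linear model on simulated data; the matrix name D and the use of lm() are assumptions for illustration, not the package's internals.

```r
set.seed(42)
n <- 100
x1 <- rnorm(n, 10, 3)
x2 <- rnorm(n, -5, 0.5)
y <- 1 + 0.3 * x1 - 2 * x2 + rnorm(n)

m <- c(mean(x1), mean(x2))   # centering constants
s <- c(sd(x1), sd(x2))       # scaling constants

# Fit on the scaled space.
z1 <- (x1 - m[1]) / s[1]
z2 <- (x2 - m[2]) / s[2]
fit_s <- lm(y ~ z1 + z2)

# Linear map from scaled to original coefficients:
#   intercept = b0_s - sum(b_s * m / s),  slope_j = b_s_j / s_j
D <- rbind(
  c(1, -m / s),
  cbind(0, diag(1 / s))
)
beta <- drop(D %*% coef(fit_s))
V <- D %*% vcov(fit_s) %*% t(D)

# Check against a direct fit on the original scale.
fit_raw <- lm(y ~ x1 + x2)
all.equal(unname(beta), unname(coef(fit_raw)))
all.equal(unname(V), unname(vcov(fit_raw)))
```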


Validate and apply scaling (engine-friendly)

Description

Validate and apply scaling (engine-friendly)

Usage

validate_and_apply_nmar_scaling(
  standardize,
  has_aux,
  response_model_matrix_unscaled,
  aux_matrix_unscaled,
  mu_x_unscaled,
  weights = NULL,
  weight_mask = NULL
)

Arguments

standardize

logical; apply standardization if TRUE.

has_aux

logical; whether the engine uses auxiliary constraints.

response_model_matrix_unscaled

response model matrix (with intercept).

aux_matrix_unscaled

auxiliary matrix (no intercept) or an empty matrix.

mu_x_unscaled

named auxiliary means on original scale, or NULL.

weights

Optional numeric vector used for weighted scaling.

weight_mask

Optional logical mask or nonnegative numeric multipliers applied to weights.

Value

A list with components nmar_scaling_recipe, response_model_matrix_scaled, auxiliary_matrix_scaled, and mu_x_scaled.


Validate Data for NMAR Analysis

Description

Performs a basic sanity check on the input data.

Usage

validate_data(data)

Arguments

data

A data frame or a survey object.

Value

Returns 'invisible(NULL)' on success, stopping with a descriptive error on failure.


Validate top-level nleqslv arguments (coerce invalid to defaults)

Description

Validate top-level nleqslv arguments (coerce invalid to defaults)

Usage

validate_nleqslv_top(top)


Validate EL Engine Settings

Description

Validate EL Engine Settings

Usage

validate_nmar_engine_el(engine)


Validate nmar_result structure

Description

Ensures both the child class and the parent schema are satisfied. The validator also back-fills defaults so downstream code can rely on the presence of optional components without defensive checks.

Usage

validate_nmar_result(x, class_name)

Details

This helper is the single authority on the 'nmar_result' schema. It expects a list that already carries class c(class_name, "nmar_result") and at least a primary estimate stored in y_hat. All other components are optional; when they are NULL or missing, the validator supplies safe defaults.

Engine constructors should normally call new_nmar_result() rather than invoking this function directly. new_nmar_result() attaches classes and funnels all objects through validate_nmar_result() so downstream S3 methods can assume a consistent structure.


Variance-covariance for base NMAR results

Description

Variance-covariance for base NMAR results

Usage

## S3 method for class 'nmar_result'
vcov(object, ...)

Arguments

object

An object of class 'nmar_result'.

...

Ignored.

Value

A 1x1 numeric matrix (the variance of the primary estimate).


Aggregated Exit Poll Data for Gangdong-Gap (2012)

Description

This dataset contains the aggregated exit poll results for the Gangdong-Gap district in Seoul from the 19th South Korean legislative election, held in 2012. The data are transcribed directly from Table 9 of Riddles, Kim, and Im (2016).

Usage

voting

Format

A data frame with 8 rows and 7 variables:

Gender

Factor. The gender of the voter ("Male", "Female").

Age_group

Character. The age group of the voter.

Voted_A

Numeric. Count of respondents voting for Party A.

Voted_B

Numeric. Count of respondents voting for Party B.

Other

Numeric. Count of respondents voting for another party.

Refusal

Numeric. Count of sampled individuals who refused to respond (this is the nonresponse count).

Total

Numeric. Total individuals sampled in the group (Responders + Refusals).

Details

In the paper's application, 'Gender' is used as the nonresponse instrumental variable and 'Age_group' is the primary auxiliary variable.

Source

Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(1), 1-31. (Data from Table 9, p. 20).


Extract weights from an 'nmar_result'

Description

Return analysis weights stored in an 'nmar_result' as either probability-scale (summing to 1) or population-scale (summing to 'sample$n_total'). The function normalizes stored masses and attaches informative attributes.

Usage

## S3 method for class 'nmar_result'
weights(object, scale = c("probability", "population"), ...)

Arguments

object

An 'nmar_result' object.

scale

One of '"probability"' (default) or '"population"'.

...

Additional arguments (ignored).

Value

Numeric vector of weights with length equal to the number of respondents.
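The relationship between the two scales can be illustrated with plain vectors. The numbers below are made up, and 'masses' and 'n_total' are hypothetical stand-ins for the stored respondent masses and 'sample$n_total'; the sketch does not call the package.

```r
# Hypothetical unnormalized respondent masses and total sample size.
masses <- c(1.2, 0.8, 2.5, 1.5)
n_total <- 10

w_prob <- masses / sum(masses)  # probability scale: sums to 1
w_pop <- n_total * w_prob       # population scale: sums to n_total

# Both scales give the same weighted mean of a respondent outcome.
y_resp <- c(3.1, 2.4, 5.0, 4.2)
mean_prob <- sum(w_prob * y_resp)
mean_pop <- sum(w_pop * y_resp) / n_total
all.equal(mean_prob, mean_pop)
```

The choice of scale therefore affects totals but not means, which is why either can be requested from the same stored masses.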