Help for package propertee

Version:

1.0.1

Title:

Standardization-Based Effect Estimation with Optional Prior Covariance Adjustment

Description:

The Prognostic Regression Offsets with Propagation of ERrors (for Treatment Effect Estimation) package facilitates direct adjustment for experiments and observational studies that is compatible with a range of study designs and covariance adjustment strategies. It uses explicit specification of clusters, blocks and treatment allocations to furnish probability of assignment-based weights targeting any of several average treatment effect parameters, and for standard error calculations reflecting these design parameters. For covariance adjustment of its Hajek and (one-way) fixed effects estimates, it enables offsetting the outcome against predictions from a dedicated covariance model, with standard error calculations propagating error as appropriate from the covariance model.

License:

MIT + file LICENSE

License_is_FOSS:

yes

License_restricts_use:

Encoding:

UTF-8

LazyData:

true

RoxygenNote:

7.3.2

Suggests:

knitr, rmarkdown, testthat (≥ 3.0.0), multcomp, MASS

Config/testthat/edition:

Imports:

stats, methods, sandwich

Enhances:

robustbase

Depends:

R (≥ 4.1.0)

VignetteBuilder:

knitr

URL:

https://github.com/benbhansen-stats/propertee

BugReports:

https://github.com/benbhansen-stats/propertee/issues

Collate:

'StudySpecification.R' 'StudySpecificationAccessors.R' 'SandwichLayer.R' 'block_center_residuals.R' 'SandwichLayerVariance.R' 'StudySpecificationConverters.R' 'StudySpecificationStructure.R' 'StudySpecificationUtilities.R' 'WeightedStudySpecification.R' 'areg.center.R' 'confint_lm.R' 'teeMod.R' 'as.lmitt.R' 'as_data_frame.R' 'assigned.R' 'bread.R' 'c_weightedStudySpecification.R' 'cov_adj.R' 'data.R' 'dichotomy.R' 'expand.model.frame_tee.R' 'get_cov_adj.R' 'get_data_from_model.R' 'get_spec.R' 'glmrobMethods.R' 'lmitt.R' 'lmrob_methods.R' 'merge_preserve_order.R' 'propertee_package.R' 'specification_table.R' 'summary.StudySpecification.R' 'summary.teeMod.R' 'update_by.R' 'validWeights.R' 'weights_internal.R' 'weights_exported.R'

NeedsCompilation:

Packaged:

2025-08-25 13:54:02 UTC; josh

Author:

Josh Errickson [cre, aut], Josh Wasserman [aut], Mark Fredrickson [ctb], Adam Sales [ctb], Xinhe Wang [ctb], Ben Hansen [aut]

Maintainer:

Josh Errickson <jerrick@umich.edu>

Repository:

CRAN

Date/Publication:

2025-08-26 08:30:14 UTC

`WeightedStudySpecification` Operations

Description

Algebraic operators on WeightedStudySpecification objects and numeric vectors. WeightedStudySpecifications do not support addition or subtraction.

Usage

## S4 method for signature 'WeightedStudySpecification,numeric'
e1 + e2

## S4 method for signature 'numeric,WeightedStudySpecification'
e1 + e2

## S4 method for signature 'WeightedStudySpecification,numeric'
e1 - e2

## S4 method for signature 'numeric,WeightedStudySpecification'
e1 - e2

## S4 method for signature 'WeightedStudySpecification,numeric'
e1 * e2

## S4 method for signature 'numeric,WeightedStudySpecification'
e1 * e2

## S4 method for signature 'WeightedStudySpecification,numeric'
e1 / e2

## S4 method for signature 'numeric,WeightedStudySpecification'
e1 / e2

Arguments

e1, e2

WeightedStudySpecification or numeric objects

Details

These are primarily used to either combine weights via multiplication, or to invert weights. Addition and subtraction are not supported and will produce errors.

Value

a WeightedStudySpecification object
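
The operator behaviour described above can be sketched with a toy S4 class. This is a minimal illustration, not the package's implementation: the class name ToyWeights and the error message are hypothetical, and only the behaviour (multiplication and division return a weights object, other arithmetic errors) is taken from this help page.

```r
library(methods)

# Hypothetical stand-in for a weights class; NOT propertee's actual class.
setClass("ToyWeights", contains = "numeric")

# Weights on the left, numeric on the right.
setMethod("Arith", signature("ToyWeights", "numeric"), function(e1, e2) {
  out <- switch(as.character(.Generic),
    "*" = e1@.Data * e2,
    "/" = e1@.Data / e2,
    stop("ToyWeights objects support only * and /")
  )
  new("ToyWeights", out)
})

# Numeric on the left, weights on the right.
setMethod("Arith", signature("numeric", "ToyWeights"), function(e1, e2) {
  out <- switch(as.character(.Generic),
    "*" = e1 * e2@.Data,
    "/" = e1 / e2@.Data,
    stop("ToyWeights objects support only * and /")
  )
  new("ToyWeights", out)
})

w <- new("ToyWeights", c(0.5, 0.25, 0.25))
combined <- w * c(2, 2, 4)  # combine with another vector of weights
inverted <- 1 / w           # invert the weights
```

In this sketch, w + 1 stops with an error, mirroring the unsupported addition and subtraction noted in Details.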

Return ..uoa.. column

Description

Return ..uoa.. column

Usage

..uoa..(spec)

Arguments

spec

A StudySpecification

Value

The ..uoa.. column

(Internal) Helper function for design-based meat matrix calculation

Description

(Internal) Helper function for design-based meat matrix calculation

Usage

.add_mat_diag(A, B)

(Internal) Helper function for design-based meat matrix calculation

Description

(Internal) Helper function for design-based meat matrix calculation

Usage

.add_mat_sqdif(X, zobs, bid, b, upper = TRUE)

(Internal) Helper function for design-based meat matrix calculation

Description

(Internal) Helper function for design-based meat matrix calculation

Usage

.add_vec(a, upper = TRUE)

(Internal) Aggregate weights and outcomes to cluster level

Description

(Internal) Aggregate weights and outcomes to cluster level

Usage

.aggregate_to_cluster(x, ...)

Arguments

x

a fitted teeMod model

Details

Aggregates individual weights and outcomes to cluster-level weighted sums.

Value

a list of

a data frame of cluster weights, outcomes, treatments, and block ids;
treatment id column name;
block id column name

(Internal) Align the dimensions and rows of direct adjustment and covariance adjustment model estimating equations matrices

Description

(Internal) Align the dimensions and rows of direct adjustment and covariance adjustment model estimating equations matrices

Usage

.align_and_extend_estfuns(x, ctrl_means_ef_mat = NULL, by = NULL, ...)

Arguments

x

a fitted teeMod model

ctrl_means_ef_mat

optional, a matrix of estimating equations corresponding to the estimates of the marginal (and possibly conditional) means of the outcome and offset in the control condition. These are aligned and extended in the same way as the matrix of estimating equations for x and cbinded to them

by

optional, a character vector indicating columns that uniquely identify rows in the dataframe used for fitting x and the dataframe passed to the data argument of the covariance adjustment model fit. The default is NULL, in which case the unit of assignment columns specified in the StudySpecification slot of x are used.

...

mostly arguments passed to methods. The special case is the argument loco_residuals, which indicates that the offsets in the residuals of x should be replaced by versions that use leave-one-cluster-out estimates of the covariance model

Details

.align_and_extend_estfuns() first extracts the matrices of contributions to the empirical estimating equations for the direct adjustment and covariance adjustment models; then, it pads the matrices with zeros to account for units of observation that appear in one model-fitting sample but not the other; finally it orders the matrices so units of observation (or if unit of observation-level ordering is impossible, units of assignment) are aligned.

Value

A list of two matrices, one being the aligned contributions to the estimating equations for the direct adjustment model, and the other being the aligned contributions to the covariance adjustment model.
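
The pad-then-align steps described in Details can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the inputs are hypothetical matrices whose row names are unit IDs, and the helper name and the output element names psi and phi are invented for the example.

```r
# Hypothetical per-unit contributions; row names are unit IDs.
ef_Q <- matrix(1:6, nrow = 3,
               dimnames = list(c("a", "b", "c"), c("int", "trt")))
ef_C <- matrix(1:6 / 10, nrow = 3,
               dimnames = list(c("b", "c", "d"), c("x1", "x2")))

align_and_pad_sketch <- function(ef_Q, ef_C) {
  ids_Q <- rownames(ef_Q)
  ids_C <- rownames(ef_C)
  # Order: units only in Q, units in both samples, units only in C.
  all_ids <- c(setdiff(ids_Q, ids_C),
               intersect(ids_Q, ids_C),
               setdiff(ids_C, ids_Q))
  pad <- function(ef) {
    # Start from zeros so units missing from this sample contribute nothing.
    out <- matrix(0, nrow = length(all_ids), ncol = ncol(ef),
                  dimnames = list(all_ids, colnames(ef)))
    out[rownames(ef), ] <- ef
    out
  }
  list(psi = pad(ef_Q), phi = pad(ef_C))
}

aligned <- align_and_pad_sketch(ef_Q, ef_C)
```

Both matrices now have one row per unit in the union of the two samples, in the same order, so they can be bound column-wise or summed within clusters.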

(Internal) Applies dichotomy to treatment

Description

Given a dichotomy formula and a data.frame with a treatment variable and any variables in the formula, returns a vector containing only 0, 1, or NA.

Usage

.apply_dichotomy(txt, dichotomy)

Arguments

txt

A named data.frame containing a column of the treatment, such as that produced by treatment(myspecification), and any variables specified in dichotomy.

dichotomy

A formula specifying how to dichotomize the non-binary treatment column in txt (or a call that evaluates to a formula). See the Details section of the ett() or att() help pages for information on specifying this formula.

Value

A vector of binary treatments
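
A minimal sketch of how a two-sided dichotomy formula could be turned into a 0/1/NA vector. The helper name is hypothetical, and the sketch assumes the left-hand side marks treatment and the right-hand side marks control; consult the ett() or att() help pages for the conventions the package actually supports.

```r
apply_dichotomy_sketch <- function(txt, dichotomy) {
  stopifnot(inherits(dichotomy, "formula"), length(dichotomy) == 3)
  # Evaluate each side of the formula as a logical condition on txt.
  treated <- eval(dichotomy[[2]], txt, environment(dichotomy))
  control <- eval(dichotomy[[3]], txt, environment(dichotomy))
  out <- rep(NA_integer_, nrow(txt))
  out[which(control)] <- 0L  # which() drops NA conditions
  out[which(treated)] <- 1L
  out
}

txt <- data.frame(dose = c(50, 150, 250, NA))
z <- apply_dichotomy_sketch(txt, dose >= 200 ~ dose < 100)
```

Units satisfying neither condition, or with a missing treatment value, stay NA.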

Convert object to `data.frame` or produce meaningful error

Description

Convert object to data.frame or produce meaningful error

Usage

.as_data_frame(x)

Arguments

x

An object

Value

x as a data.frame

(Internal) Extract empirical estimating equations from a `teeMod` model using the S3 method associated with its `.S3Class` slot

Description

(Internal) Extract empirical estimating equations from a teeMod model using the S3 method associated with its .S3Class slot

Usage

.base_S3class_estfun(x)

Arguments

x

a fitted teeMod model

Value

S3 method

(Internal) Extracts treatment as binary `vector`

Description

(Internal) Extracts treatment as binary vector

Usage

.bin_txt(spec, data = NULL, dichotomy = NULL)

Arguments

spec

A StudySpecification, used to get treatment assignment information

data

A dataframe with unit of assignment information and, if a dichotomy is provided, columns specified therein

dichotomy

Optional, a formula. See the Details section of the ett() or att() help pages for information on specifying the formula

Details

If a dichotomy is specified or the StudySpecification has a treatment variable consisting only of 0/1 or NA, then returns the binary treatment. Otherwise (it has a non-binary treatment and lacks a dichotomy) it errors.

Value

A vector of binary treatments

(Internal) A few checks to ensure `by=` is valid

Description

This ensures that the by= argument is of the proper type, is named, and consists of only unique entries.

Usage

.check_by(by)

Arguments

by

named vector or list connecting names of unit of assignment/unitid/cluster variables in specification to unit of assignment/unitid/cluster variables in data. Names represent variables in the StudySpecification; values represent variables in the data.

Value

NULL if no errors are found
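
The three checks named in the Description (type, names, uniqueness) might look like the following. This is a sketch with a hypothetical helper name and hypothetical error messages, not the package's code.

```r
check_by_sketch <- function(by) {
  if (!(is.character(by) || is.list(by))) {
    stop("'by' must be a character vector or a list")
  }
  nms <- names(by)
  if (is.null(nms) || any(is.na(nms)) || any(nms == "")) {
    stop("'by' must be fully named")
  }
  if (anyDuplicated(nms) > 0 || anyDuplicated(unlist(by)) > 0) {
    stop("'by' must have unique names and unique values")
  }
  invisible(NULL)
}

# Names refer to the specification, values to the data.
ok <- check_by_sketch(c(uoa_in_spec = "uoa_in_data"))
```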

(Internal) Replace standard errors for moderator effect estimates with insufficient degrees of freedom with `NA`

Description

(Internal) Replace standard errors for moderator effect estimates with insufficient degrees of freedom with NA

Usage

.check_df_moderator_estimates(
  vmat,
  model,
  cluster_cols,
  model_data = quote(data),
  envir = environment(formula(model))
)

Arguments

vmat

output of .vcov_XXX() called with input to model argument below as the first argument

model

a fitted teeMod model

cluster_cols

a character vector indicating the column(s) defining cluster ID's

model_data

dataframe or name of dataframe used to fit model

envir

environment to get model_data from if model_data has class name

Value

A variance-covariance matrix with NA's for estimates lacking sufficient degrees of freedom

(Internal) Perform checks on formula for creation of StudySpecification.

Description

Checks performed:

Ensure presence of no more than one of unit_of_assignment(), cluster() or unitid().
Disallow multiple block() or multiple forcing() terms.
Disallow forcing() unless in RDD.

Usage

.check_spec_formula(form, allow_forcing = FALSE)

Arguments

form

A formula passed to .new_StudySpecification()

allow_forcing

Logical; whether forcing() is allowed (TRUE for RDD, FALSE for RCT and Obs).

Value

TRUE if all checks pass, otherwise errors.

Compute the degrees of freedom of a sandwich standard error with HC2 correction

Description

Compute the degrees of freedom of a sandwich standard error with HC2 correction

Usage

.compute_IK_dof(
  tm,
  ell,
  cluster = NULL,
  bin_y = FALSE,
  exclude = na.action(tm),
  tol = 1e-09
)

References

Guido W. Imbens and Michael Kolesár. "Robust Standard Errors in Small Samples: Some Practical Advice". In: The Review of Economics and Statistics 98.4 (Oct. 2016), pp. 701-712.

Compute residuals for a `teeMod` object with leave-one-out estimates of the `offset`

Description

Compute residuals for a teeMod object with leave-one-out estimates of the offset

Usage

.compute_loo_resids(x, cluster, ...)

Arguments

x

a teeMod object

cluster

vector of column names that identify clusters

Details

The residual for any observation also used for fitting the fitted_covariance_model stored in the offset of x is replaced by an estimated residual that uses a cluster leave-one-out estimate of the fitted_covariance_model for generating a value of the offset.
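
To illustrate the idea of a cluster leave-one-out offset, the sketch below refits a covariance model once per cluster and predicts for the held-out cluster. It is a brute-force illustration with simulated data and a hypothetical helper name; the package may compute these quantities differently, and the remaining step (subtracting the offset and the fitted effects from the outcome) is not shown.

```r
set.seed(1)
dat <- data.frame(cl = rep(1:5, each = 4), x = rnorm(20))
dat$y <- 1 + 2 * dat$x + rnorm(20)

loo_cluster_offsets <- function(formula, data, cluster) {
  offsets <- numeric(nrow(data))
  for (g in unique(data[[cluster]])) {
    held_out <- data[[cluster]] == g
    # Fit the covariance model without cluster g ...
    fit_minus_g <- lm(formula, data = data[!held_out, , drop = FALSE])
    # ... and predict only for the rows of cluster g.
    offsets[held_out] <- predict(fit_minus_g,
                                 newdata = data[held_out, , drop = FALSE])
  }
  offsets
}

off_loo <- loo_cluster_offsets(y ~ x, dat, "cl")
```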

Produce confidence intervals for linear models

Description

Produce confidence intervals for linear models

Usage

.confint_lm(object, parm, level = 0.95, ...)

Arguments

object

a fitted teeMod model

parm

a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered.

level

the confidence level required.

...

additional arguments to pass to vcov.teeMod()

Details

.confint_lm() is a copy of stats::confint.lm but passes arguments in ... to the vcov() call. When called on a teeMod model, this produces confidence intervals where standard errors are computed based on the desired formulation of the vcov_tee() call.

Value

A matrix (or vector) with columns giving lower and upper confidence limits for each parameter. These will be labelled as (1-level)/2 and 1 - (1-level)/2 in % (by default 2.5% and 97.5%)
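
The mechanism described in Details, a confint.lm-style computation whose standard errors come from a user-controlled variance call, can be sketched as follows. The function name and the vcov_fun argument are hypothetical devices for the illustration; .confint_lm() itself forwards ... to vcov().

```r
confint_with_vcov <- function(object, parm, level = 0.95,
                              vcov_fun = stats::vcov, ...) {
  cf <- stats::coef(object)
  if (missing(parm)) {
    parm <- names(cf)
  } else if (is.numeric(parm)) {
    parm <- names(cf)[parm]
  }
  a <- (1 - level) / 2
  a <- c(a, 1 - a)
  # Standard errors come from whatever variance estimator is supplied.
  ses <- sqrt(diag(vcov_fun(object, ...)))
  fac <- stats::qt(a, stats::df.residual(object))
  ci <- cf[parm] + ses[parm] %o% fac
  dimnames(ci) <- list(parm,
                       paste0(format(100 * a, trim = TRUE, digits = 3), "%"))
  ci
}

fit <- lm(dist ~ speed, data = cars)
ci_default <- confint_with_vcov(fit)
# Extra arguments reach the variance function through '...'.
ci_wide <- confint_with_vcov(
  fit,
  vcov_fun = function(m, scale) scale * stats::vcov(m),
  scale = 4
)
```

With the default variance function the result matches stats::confint() on the same lm fit; swapping in a different estimator changes only the standard errors.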

(Internal) Ensures replacement column for `StudySpecification` is a `data.frame`.

Description

Helper function for StudySpecification replacers to ensure replacement is a properly named data.frame

Usage

.convert_to_data.frame(value, specification, type)

Arguments

value

A vector or data.frame containing a replacement.

specification

A StudySpecification

type

One of "t", "f", "u" or "b"

Details

When given a replacement set of values (e.g., a vector or matrix), this ensures that the replacement is a named data.frame.

Input vector: Since it cannot be named, a vector can only be used to replace an existing component. If the existing component has more than one column, the name of the first column is used.

Input matrix or data.frame: If unnamed and replacing an existing component, it must have no more columns than the original component. (If it has fewer columns, the names of the first few columns are used.) If named, it can replace any number of columns.

Value

data.frame containing named column(s)

(Internal) Helper function for design-based meat matrix calculation

Description

(Internal) Helper function for design-based meat matrix calculation

Usage

.cov01_est(XX, zobs, bid)

Details

Young's elementary inequality is used.

Value

estimated upper and lower bounds of covariance matrix of estimating function vectors under treatment and under control

(Internal) Helper function for design-based meat matrix calculation

Description

(Internal) Helper function for design-based meat matrix calculation

Usage

.cov_mat_est(XXz, bidz)

Details

Diagonal elements are estimated by sample variances. Off-diagonal elements are estimated using Young's elementary inequality.

Value

estimated upper and lower bounds of covariance matrix of estimating function vectors under either treatment or control
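
For reference, Young's elementary inequality in its simplest form, and one way it bounds a covariance by the two corresponding variances. This help page does not say exactly how the package applies the inequality, so the second statement is an illustration under that assumption; \sigma_{jk}, \sigma_j^2 and \sigma_k^2 are notation introduced here for a covariance and its two variances.

```latex
ab \le \frac{a^2 + b^2}{2} \quad \text{for all real } a, b
\qquad\Longrightarrow\qquad
-\frac{\sigma_j^2 + \sigma_k^2}{2}
\;\le\; \sigma_{jk} \;\le\;
\frac{\sigma_j^2 + \sigma_k^2}{2}
```

The implication follows by applying the inequality to centered variables (and to a and -b for the lower bound) and averaging.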

Design-based estimating equations contributions

Description

Design-based estimating equations contributions

Usage

.estfun_DB_blockabsorb(x, by = NULL, ...)

Arguments

x

a fitted teeMod object

...

arguments passed to methods

Details

Calculates contributions to empirical estimating equations from a teeMod model with absorbed intercepts from the design-based perspective.

Value

An n\times k matrix

Add new variables to a model frame from a `teeMod` object

Description

A variation of expand.model.frame which works for teeMod objects

Usage

.expand.model.frame_teeMod(
  model,
  extras,
  envir = environment(formula(model)),
  na.expand = FALSE
)

Arguments

model

A teeMod object

extras

one-sided formula or vector of character strings describing new variables to be added

envir

an environment to evaluate things in

na.expand

logical; see stats::expand.model.frame for details

Details

When building a teeMod object inside lmitt(), we do a lot of manipulation of the variables involved in the model such that by the time the teeMod is produced, neither the outcome nor predictors actually fit in the model exist in the data passed into the call.

(E.g. to be specific, if a user calls myda <- lmitt(y ~ 1, data = mydata), then model.frame(myda) would contain column names not found in mydata.)

This is a clone of stats::expand.model.frame() with one addition: after extracting model$call$data from model, it adds columns from model.frame(model) to the object. This ensures that the additional variables created during lmitt() can be found.

Trivial modifications from stats::expand.model.frame() include ensuring model is a teeMod object, and using the :: syntax as appropriate.

Value

A data.frame

(Internal) Expand treatment variable from a `StudySpecification` to a dataframe with unit of assignment information

Description

(Internal) Expand treatment variable from a StudySpecification to a dataframe with unit of assignment information

Usage

.expand_txt(txt, data, spec)

Arguments

txt

A dataframe with one column corresponding to the treatment. Can be dichotomized or as it's stored in spec

data

A dataframe with unit of assignment information

spec

A StudySpecification, used to align unit of assignment information with txt

(Internal) Fallback brute force method to locate `data` in the call stack.

Description

We try to be intelligent about finding the appropriate data. If this fails, we may have need for a brute force method that just loops through frames and looks for a data object.

Usage

.fallback_data_search()

Value

If found, the data.

(Internal) Find `dichotomy` formulas in the call stack

Description

(Internal) Find dichotomy formulas in the call stack

Usage

.find_dichotomies()

Details

.find_dichotomies() searches for lmitt.formula() calls and their weights arguments for any dichotomy arguments.

Value

A list where elements are either formulas or NULL, depending on whether a dichotomy argument was found in lmitt.formula() or its weights argument

(Internal) Bread matrix of design-based variance

Description

(Internal) Bread matrix of design-based variance

Usage

.get_DB_covadj_bread(x, ...)

Arguments

x

a fitted teeMod model

Details

Calculate bread matrix for design-based variance estimate for teeMod models with covariance adjustment and without absorbed effects

Value

a list of bread matrices

(Internal) Meat matrix of design-based variance

Description

(Internal) Meat matrix of design-based variance

Usage

.get_DB_covadj_meat(x, ...)

Arguments

x

a fitted teeMod model

Details

Calculate upper and lower bound estimates of meat matrix for design-based variance estimate for teeMod models with covariance adjustment and without absorbed effects

Value

a list of meat matrix bounds

(Internal) Design-based variance for models with covariance adjustment

Description

(Internal) Design-based variance for models with covariance adjustment

Usage

.get_DB_covadj_se(x, ...)

Arguments

x

a fitted teeMod model

Details

Calculate design-based variance estimate for teeMod models with covariance adjustment and without absorbed effects

Value

design-based variance estimate of the main treatment effect estimate

(Internal) Design-based variance for models without covariance adjustment

Description

(Internal) Design-based variance for models without covariance adjustment

Usage

.get_DB_wo_covadj_se(x, ...)

Arguments

x

a fitted teeMod model

Details

Calculate design-based variance estimate for teeMod models without covariance adjustment and without absorbed effects

Value

design-based variance estimate of the main treatment effect estimate

(Internal) Estimate components of the sandwich covariance matrix returned by `vcov_tee()`

Description

(Internal) Estimate components of the sandwich covariance matrix returned by vcov_tee()

Usage

.get_a22_inverse(x, ...)

.get_a11_inverse(x)

.get_a21(x, ...)

.get_tilde_a22_inverse(x, ...)

.get_tilde_a21(x)

Arguments

x

a fitted teeMod model

...

arguments passed to bread method

Details

.get_a22_inverse()/.get_tilde_a22_inverse(): A_{22}^{-1} is the "bread" of the sandwich covariance matrix returned by vcov_tee() whether one has fit a prior covariance adjustment model or not.

.get_a11_inverse(): A_{11}^{-1} is the "bread" of the sandwich covariance matrix for the covariance adjustment model. This matrix contributes to the meat matrix of the direct adjustment sandwich covariance matrix.

.get_a21()/.get_tilde_a21(): A_{21} is the gradient of the estimating equations for the direct adjustment model taken with respect to the covariance adjustment model parameters. This matrix is the crossproduct of the prediction gradient for the units of observation in \mathcal{Q} and the model matrix of the direct adjustment model.

Value

.get_a22_inverse()/.get_tilde_a22_inverse(): A 2\times 2 matrix corresponding to an intercept and the treatment variable in the direct adjustment model

.get_a11_inverse(): A p\times p matrix where p is the dimension of the covariance adjustment model, including an intercept

.get_a21()/.get_tilde_a21(): A 2\times p matrix where the number of rows is given by the intercept and the treatment variable in the direct adjustment model, and the number of columns is given by the dimension of the covariance adjustment model

(Internal) Product of `A_{pp}^{-1} A_{\tau p}^T`

Description

(Internal) Product of A_{pp}^{-1} A_{\tau p}^T

Usage

.get_appinv_atp(x, ...)

Arguments

x

a fitted teeMod model

Value

An s\times k matrix A_{pp}^{-1} A_{\tau p}^T

(Internal) Extract specified `type` from new data set

Description

(Internal) Extract specified type from new data set

Usage

.get_col_from_new_data(
  specification,
  newdata,
  type,
  by = NULL,
  implicitBlock = FALSE,
  ...
)

Arguments

specification

A StudySpecification

newdata

A data.frame, which may or may not be the one that was used to create specification. It must contain the unit of assignment variable(s) (the by= argument can be used if the names differ); the blocks, treatment, or forcings from the specification are merged onto it as appropriate.

type

One of "t", "f", or "b".

by

optional; named vector or list connecting names of unit of assignment/unitid/cluster variables in specification to unit of assignment/unitid/cluster variables in data. Names represent variables in the StudySpecification; values represent variables in the data. Only needed if variable names differ.

implicitBlock

If the StudySpecification does not include a block, TRUE will return a constant 1 for the blocks if type requests it.

...

Additional arguments to merge().

Value

The column(s) belonging to the requested type in

(Internal) Locate data in call stack

Description

Whenever a function in a model (ate()/ett()/cov_adj()/assigned()) is called without an explicit data= argument, this will attempt to extract the data from the model itself.

Usage

.get_data_from_model(which_fn, form = NULL, by = NULL)

Arguments

which_fn

Identifies the calling function, either "weights" or "assigned"; this helps separate the logic for the two functions.

form

Formula on which to apply model.frame(). See details

by

translation of unit of assignment/unitid/cluster ID names, passed down from weights.

Details

The form specifies what columns of the data are needed. For current use cases (ate()/ett() and assigned()), this will be only the unit of assignment variables, so e.g. form = ~ uoavar, to enable merging of UOA level variables to the model data. However, this can easily be expanded if other variables are needed.

Value

data.frame

Get the degrees of freedom of a sandwich variance estimate associated with a teeMod fit

Description

Get the degrees of freedom of a sandwich variance estimate associated with a teeMod fit

Usage

.get_dof(x, vcov_type, ell, cluster = NULL, cls = NULL, ...)

(Internal) Calculate grave{phi}

Description

(Internal) Calculate grave{phi}

Usage

.get_phi_tilde(x, ...)

Arguments

x

a fitted teeMod model

(Internal) Locate a `StudySpecification` in the call stack

Description

assigned()/ate()/ett()/cov_adj() all need the StudySpecification to operate. If any are called in the model without a specification= argument, this function sees if it can find the StudySpecification in another of these functions.

Usage

.get_spec(NULL_on_error = FALSE)

Arguments

NULL_on_error

if TRUE, returns NULL if a StudySpecification object is not found rather than throwing an error.

Details

Note that it will never look inside assigned() (gets complicated in formulas), only in weights or cov_adj(). E.g.

lm(y ~ assigned(), weights = ate(spec), offset = cov_adj(mod1))

lm(y ~ assigned(), weights = ate(), offset = cov_adj(mod1, specification = spec))

will both work, but

lm(y ~ assigned(spec), weights = ate(), offset = cov_adj(mod1))

will fail.

Value

A StudySpecification, or NULL if NULL_on_error is TRUE and the StudySpecification can't be found.

(Internal) Expand unit of assignment level weights to the level of the data

Description

Helper function called during creation of the weights via ate() or ett()

Usage

.join_spec_weights(
  weights,
  specification,
  target,
  weightAlias,
  data,
  dichotomy
)

Arguments

weights

a vector of weights sorted according to the StudySpecification

specification

a StudySpecification

target

One of "ate" or "ett"

weightAlias

Any currently supported alias

data

New data

dichotomy

formula used to specify a dichotomy of a non-binary treatment variable. The output WeightedStudySpecification object will store this as its dichotomy slot, unless it is NULL, in which case it will be translated to an empty formula.

Value

a WeightedStudySpecification

(Internal) Get covariance adjustments and their gradient with respect to covariance adjustment model coefficients

Description

.make_PreSandwichLayer() takes a fitted covariance adjustment model passed to the model argument and generates adjustments to outcomes for observations in the newdata argument. It also evaluates the gradient of the adjustments taken with respect to the coefficients at the coefficient estimates.

Usage

.make_PreSandwichLayer(model, newdata = NULL, ...)

Arguments

model

a fitted model to use for generating covariance adjustment values

newdata

a dataframe with columns called for in model

...

additional arguments to pass on to model.frame and model.matrix. These cannot include na.action, xlev, or contrasts.arg: the former is fixed to be na.pass, while the latter two are provided by elements of the model argument.

Value

A PreSandwichLayer object

Make a dataframe that links units of assignment with clusters

Description

Make a dataframe that links units of assignment with clusters

Usage

.make_uoa_cluster_df(spec, cluster = NULL)

Arguments

spec

A StudySpecification object.

cluster

A character vector of column names to use as clusters. Columns must exist in the dataframe used to create the StudySpecification object. Defaults to NULL, in which case the column names specified in the unitid(), unit_of_assignment(), or cluster() function in the StudySpecification formula will be used.

Value

A dataframe where the number of rows coincides with the number of distinct unit of assignment or cluster combinations (depending on whether cluster is a more or less granular level than the assignment level) and the columns correspond to the unit of assignment columns and a "cluster" column

Make ID's to pass to the `cluster` argument of `vcov_tee()`

Description

.make_uoa_ids() returns a factor vector of cluster ID's that align with the order of the units of observations' contributions in estfun.teeMod(). This is to ensure that when vcov_tee() calls sandwich::meatCL(), the cluster argument aggregates the correct contributions to estimating equations within clusters.

Usage

.make_uoa_ids(x, vcov_type, cluster = NULL, ...)

Arguments

x

a fitted teeMod object

vcov_type

a string indicating model-based or design-based covariance estimation. Currently, "MB", "CR", and "HC" are the only strings registered as indicating model-based estimation.

cluster

character vector or list; optional. Specifies column names that appear in both the covariance adjustment and direct adjustment model dataframes. Defaults to NULL, in which case unit of assignment columns indicated in the StudySpecification will be used for clustering. If there are multiple clustering columns, they are concatenated together for each row and separated by "_".

...

arguments passed to methods

Value

A vector with length equal to the number of unique units of observation used to fit the two models. See Details of estfun.teeMod() for the method for determining uniqueness.
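
The concatenation rule for multiple clustering columns (values joined with "_" row by row) can be sketched in base R. The helper name and the example data are hypothetical, and the sketch ignores the alignment with estfun.teeMod() that the real function also handles.

```r
make_cluster_ids_sketch <- function(data, cluster) {
  # Paste the clustering columns together row by row, separated by "_".
  ids <- do.call(paste, c(unname(as.list(data[cluster])), sep = "_"))
  factor(ids)
}

dat <- data.frame(school = c(1, 1, 2), class = c("a", "b", "a"))
ids <- make_cluster_ids_sketch(dat, c("school", "class"))
```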

(Internal) Merge multiple block IDs

Description

(Internal) Merge multiple block IDs

Usage

.merge_block_id_cols(df, ids)

Arguments

df

a data frame

ids

a vector of block IDs, column names of df

Details

merge multiple block ID columns by the value combinations and store the new block ID in the column ids[1]

Value

a data frame with a column that contains unique block number IDs
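
A minimal sketch of merging several block ID columns into one numeric ID stored in ids[1], as described in Details. The helper name is hypothetical, and this version keeps the remaining ID columns untouched, which may differ from what the package does.

```r
merge_block_ids_sketch <- function(df, ids) {
  # Each distinct combination of values across the ID columns is one block.
  combo <- interaction(df[ids], drop = TRUE)
  df[[ids[1]]] <- as.integer(combo)
  df
}

df <- data.frame(site = c("A", "A", "B", "B"), stratum = c(1, 2, 1, 1))
out <- merge_block_ids_sketch(df, c("site", "stratum"))
```

Rows 3 and 4 share the combination (B, 1) and so receive the same block number.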

(Internal) Merge `data.frame`s ensuring order of first `data.frame` is maintained

Description

(Internal) Merge data.frames ensuring order of first data.frame is maintained

Usage

.merge_preserve_order(x, ...)

Arguments

x

data.frame whose ordering is to be maintained

...

Additional arguments to merge(), particularly a second data.frame and a by= argument.

Value

Merged data.frame with the same ordering as x.
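
One common way to preserve the row order of x through merge() is to tag rows with their position, merge, and sort back. The sketch below shows that technique with a hypothetical helper name and a hypothetical temporary column .row_order; it is not necessarily how the package implements it.

```r
merge_preserve_order_sketch <- function(x, ...) {
  x$.row_order <- seq_len(nrow(x))   # remember the original positions
  merged <- merge(x, ...)            # merge() may reorder rows
  merged <- merged[order(merged$.row_order), , drop = FALSE]
  merged$.row_order <- NULL
  rownames(merged) <- NULL
  merged
}

x <- data.frame(id = c(3, 1, 2), y = c("c", "a", "b"))
lookup <- data.frame(id = 1:3, z = c(10, 20, 30))
res <- merge_preserve_order_sketch(x, lookup, by = "id")
```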

(Internal) Create a new `StudySpecification` object.

Description

Helper function to create a new StudySpecification. Called internally from rct_spec(), rd_spec() or obs_spec().

Usage

.new_StudySpecification(
  form,
  data,
  type,
  subset = NULL,
  call = NULL,
  na.fail = TRUE,
  called_from_lmitt = FALSE
)

Arguments

form

Formula to create StudySpecification, see help for rct_spec(), rd_spec() or obs_spec() for details.

data

The data set

type

One of "RCT", "RD", or "Obs"

subset

Any subset information

call

The call generating the StudySpecification.

na.fail

Should it error on NA's (TRUE) or remove them (FALSE)?

called_from_lmitt

Logical; was this called inside lmitt(), or was it called from *_spec() (default).

Value

A new StudySpecification object

(Internal) Order observations used to fit a `teeMod` model and a prior covariance adjustment model

Description

(Internal) Order observations used to fit a teeMod model and a prior covariance adjustment model

Usage

.order_samples(x, by = NULL, ...)

Arguments

x

a fitted teeMod model

by

character vector of columns to get ID's for ordering from. Default is NULL, in which case unit of assignment ID's are used for ordering.

...

arguments passed to methods

Details

.order_samples() underpins the ordering for .make_uoa_ids() and estfun.teeMod(). This function orders the outputs of those functions, but also informs how the original matrices of contributions to estimating equations need to be indexed to align units of observations' contributions to both sets of estimating equations.

When a by argument is provided to cov_adj(), it is used to construct the order of .order_samples().

Value

A list of four named vectors. The Q_not_C element holds the ordering for units of observation in the direct adjustment sample but not the covariance adjustment samples; Q_in_C and C_in_Q, the ordering for units in both; and C_not_Q, the ordering for units in the covariance adjustment sample only. Q_in_C and C_in_Q differ in that the names of the Q_in_C vector correspond to row indices of the original matrix of estimating equations for the direct adjustment model, while the names of C_in_Q correspond to row indices of the matrix of estimating equations for the covariance adjustment model. Similarly, the names of Q_not_C and C_not_Q correspond to row indices of the direct adjustment and covariance adjustment samples, respectively. Ultimately, the order of .make_uoa_ids() and estfun.teeMod() is given by concatenating the vectors stored in Q_not_C, Q_in_C, and C_not_Q.

(Internal) Helper function for design-based meat matrix calculation

Description

(Internal) Helper function for design-based meat matrix calculation

Usage

.prepare_spec_matrix(x, ...)

Arguments

x

a fitted teeMod model

Value

An m \times (p+2) matrix of cluster sums of design-based estimating equations scaled by \sqrt{m_{b0}m_{b1}}/m_{b}. Here m is the number of clusters and p is the number of covariates used in the prior covariance adjustment (excluding the intercept).

Bias correct residuals contributing to standard errors of a `teeMod`

Description

Bias correct residuals contributing to standard errors of a teeMod

Usage

.rcorrect(resids, x, model, type, ...)

Arguments

resids

numeric vector of residuals to correct

x

teeMod object

model

string indicating which model the residuals are from. "itt" indicates correction to the residuals of x, and "cov_adj" indicates correction to the residuals of the covariance adjustment model. This informs whether corrections should use information from x or the fitted_covariance_model slot of the SandwichLayer object in the offset for corrections

type

string indicating the desired bias correction. Can be one of "(HC/CR/MB)0", "(HC/CR/MB)1", or "(HC/CR/MB)2"

...

additional arguments passed from up the call stack; in particular, the cluster_cols argument, which informs whether to cluster and provide CR2 corrections instead of HC2 corrections, as well as the correction for the number of clusters in the CR1 correction. This may also include a by argument.
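
For orientation, the unclustered HC0, HC1 and HC2 residual corrections can be sketched for an ordinary lm fit in base R. The helper correct_resids() is hypothetical and uses the textbook definitions; it is not the package's .rcorrect(), and it does not cover the clustered (CR) or model-based (MB) variants.

```r
# Hypothetical helper: rescale the residuals of an lm fit by correction type
correct_resids <- function(fit, type = c("HC0", "HC1", "HC2")) {
  type <- match.arg(type)
  r <- residuals(fit)
  n <- length(r)           # number of observations
  p <- length(coef(fit))   # number of estimated coefficients
  switch(type,
         HC0 = r,                              # no correction
         HC1 = r * sqrt(n / (n - p)),          # degrees-of-freedom scaling
         HC2 = r / sqrt(1 - hatvalues(fit)))   # leverage-based scaling
}

fit <- lm(dist ~ speed, data = cars)
r0 <- correct_resids(fit, "HC0")
r1 <- correct_resids(fit, "HC1")
r2 <- correct_resids(fit, "HC2")
```

Squaring the corrected residuals gives the weights that enter the meat of the corresponding heteroskedasticity-consistent covariance estimate.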

(Internal) Removes the forcing column entirely from a `StudySpecification`

Description

In preparation for converting an RD StudySpecification to another StudySpecification, this will strip the forcing variable entirely. It is removed from the data (both @structure and @column_index), as well as from the formula stored in @call.

Usage

.remove_forcing(spec)

Arguments

spec

A StudySpecification

Details

Note that the output StudySpecification will fail a validity check (with validObject()) due to an RD StudySpecification requiring a forcing variable, so change the @type immediately.

Value

The StudySpecification without any forcing variable

(Internal) Rename columns to strip function calls

Description

After calling model.frame() on the formula input to .new_StudySpecification(), the names of the columns will include function names, e.g. "block(blockvar)". This function strips all these.

Usage

.rename_model_frame_columns(modframe)

Arguments

modframe

A data.frame.

Value

The data.frame with function calls removed
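
A minimal base R sketch of this kind of renaming, assuming each column name wraps a single variable in one function call. The helper strip_calls() and the example data frame are hypothetical; the package's own function may handle more cases, such as calls with several variables.

```r
# A data frame whose names mimic model.frame() output for a specification formula
modframe <- data.frame(1:3, c("a", "b", "c"), c(1, 1, 2))
names(modframe) <- c("z", "unit_of_assignment(id)", "block(blockvar)")

# Hypothetical helper: turn "fun(var)" into "var", leaving plain names untouched
strip_calls <- function(nms) {
  sub("^[A-Za-z_.][A-Za-z0-9_.]*\\((.*)\\)$", "\\1", nms)
}

names(modframe) <- strip_calls(names(modframe))
names(modframe)
```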

(Internal) Return ID's used to order observations in the covariance adjustment sample

Description

(Internal) Return ID's used to order observations in the covariance adjustment sample

Usage

.sanitize_C_ids(x, id_col = NULL, sorted = FALSE, ...)

Arguments

x

a SandwichLayer object

id_col

character vector or list; optional. Specifies column names that appear in both the covariance adjustment and direct adjustment samples. Defaults to NULL, in which case unit of assignment columns in the SandwichLayer's StudySpecification slot will be used to generate ID's.

...

arguments passed to methods

Value

A vector of length equal to the number of units of observation used to fit the covariance adjustment model

(Internal) Return ID's used to order observations in the direct adjustment sample

Description

(Internal) Return ID's used to order observations in the direct adjustment sample

Usage

.sanitize_Q_ids(x, id_col = NULL, ...)

Arguments

x

a fitted teeMod model

id_col

character vector; optional. Specifies column(s) whose ID's will be returned. The column must exist in the data that created the StudySpecification object. Default is NULL, in which case unit of assignment columns indicated in the specification will be used to generate ID's.

...

arguments passed to methods

Value

A vector with length equal to the number of units of observation in the direct adjustment sample

(Internal) `show` helper for `PreSandwichLayer`/`SandwichLayer`

Description

(Internal) show helper for PreSandwichLayer/SandwichLayer

Usage

.show_layer(object)

Arguments

object

PreSandwichLayer or SandwichLayer

Value

object, invisibly

(Internal) Checks newdata/by argument for specification accessors

Description

(Internal) Checks newdata/by argument for specification accessors

Usage

.specification_accessors_newdata_validate(newdata, by)

Arguments

newdata

newdata argument from e.g. treatment(), blocks(), etc

by

from e.g. treatment(), blocks(), etc. See .check_by()

Value

Invisibly TRUE. Warns or errors as appropriate.

(Internal) Use `by` to update `StudySpecification` with new variable names

Description

Helper function used to update the variable names of a StudySpecification when user passes a by= argument to align variable names between data sets.

Usage

.update_by(specification, data, by)

Arguments

specification

A StudySpecification

data

Data set

by

Value

A StudySpecification with updated variable names

(Internal) Add columns for merging covariance adjustment and direct adjustment samples to model formula

Description

(Internal) Add columns for merging covariance adjustment and direct adjustment samples to model formula

Usage

.update_ca_model_formula(model, by = NULL, specification = NULL)

Arguments

model

any model that inherits from a glm, lm, or robustbase::lmrob object

by

optional; a string or named vector of unique identifier columns in the data used to create specification and the data used to fit the covariance adjustment model. Default is NULL, in which case unit of assignment columns are used for identification (even if they do not uniquely identify units of observation). If a named vector is provided, names should represent variables in the data used to create specification, while values should represent variables in the covariance adjustment data.

specification

a StudySpecification object. Default is NULL, in which case a StudySpecification object is sought from higher up the call stack.

Details

This function is typically used prior to .get_data_from_model() and incorporates information provided in a by argument to ensure the necessary columns for merging the two samples are included in any model.frame() calls.

Value

formula

(Internal) Updates `spec@call`'s formula with the currently defined variable names.

Description

Helper function to update the call with the appropriate variable names after they've been modified. Called within StudySpecification replacers.

Usage

.update_call_formula(specification)

Arguments

specification

A StudySpecification

Details

Its return value should be inserted into the specification via spec@call$formula <- .update_call_formula(spec)

Value

An updated formula

(Internal) Rename cluster/unitid/uoa in a formula to unit_of_assignment for internal consistency

Description

Internally, we always refer to uoa/cluster/unitid as "unit_of_assignment"

Usage

.update_form_to_unit_of_assignment(form)

Arguments

form

A formula passed to .new_StudySpecification()

Value

The formula with "cluster"/"unitid"/"uoa" replaced with "unit_of_assignment"
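
One way to picture this renaming is a text substitution on the deparsed formula. The helper to_unit_of_assignment() below is a hypothetical base R sketch, not the package's implementation.

```r
# Hypothetical helper: rewrite cluster()/unitid()/uoa() calls in a formula
to_unit_of_assignment <- function(form) {
  txt <- paste(deparse(form), collapse = " ")
  # \\b keeps the pattern from matching inside longer names such as my_cluster(
  txt <- gsub("\\b(cluster|unitid|uoa)\\(", "unit_of_assignment(", txt, perl = TRUE)
  stats::as.formula(txt, env = environment(form))
}

f <- to_unit_of_assignment(z ~ cluster(uoa1, uoa2) + block(bid))
f
```

Variables such as uoa1 are left alone because the pattern requires an opening parenthesis directly after the function name.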

(Internal) Replaces `type` columns in `specification` with `new`

Description

Assumes .convert_to_data.frame() has already been called on new

Usage

.update_structure(specification, new, type)

Arguments

specification

A StudySpecification

new

A named data.frame with the replacement, should be the output of .convert_to_data.frame().

type

One of "t", "f", "u" or "b".

Value

The updated StudySpecification

Valid Weights

Description

These track and access the weights we support as well as any aliases.

Usage

.validWeights

.isValidWeightTarget(target)

.isValidWeightAlias(alias)

.listValidWeightTargets()

.listValidWeightAliases()

Arguments

target

String

alias

String

Format

An object of class list of length 2.

Details

"target" refers to weight calculations we support.

"alias" refers to all possible names. Every "target" has at least one "alias" (itself) and may have more.

.isValidWeightTarget() and .isValidWeightAlias() identify whether a given input (from a user) is a valid weighting name.

.listValidWeightTargets() and .listValidWeightAliases() are for returning nicely formatted strings for messages to users.

IMPORTANT: Adding new aliases MUST correspond to new functions defined in weights_exported.R, with possible adjustments to calculations in weights_internal.R.

Value

Logical for .isValid*, and a string for .listValid*.
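
The target/alias distinction can be sketched as a two-element list with simple lookups. The element names and helper functions below are assumptions made for illustration; only the values "ate", "ett" and "att" are taken from this manual.

```r
# Hypothetical stand-in for the internal list of supported weights
valid_weights <- list(target = c("ate", "ett"),
                      alias  = c("ate", "ett", "att"))

# Hypothetical lookups mirroring the .isValid*() helpers
is_valid_target <- function(target) target %in% valid_weights$target
is_valid_alias  <- function(alias)  alias %in% valid_weights$alias

# Hypothetical formatter mirroring the .listValid*() helpers
list_valid <- function(values) paste0('"', values, '"', collapse = ", ")

is_valid_alias("att")    # an alias only
is_valid_target("att")   # not a target in its own right
list_valid(valid_weights$alias)
```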

(Internal) Validate a dichotomy against other dichotomies found in the call stack

Description

(Internal) Validate a dichotomy against other dichotomies found in the call stack

Usage

.validate_dichotomy(possible_dichotomy)

Arguments

possible_dichotomy

a list or a formula, the former coming from .find_dichotomies()

Value

formula for a dichotomy or NULL

(Internal) Worker function for weight calculation

Description

Called from ate() or ett().

Usage

.weights_calc(specification, target, weightAlias, dichotomy, by, data)

Arguments

specification

a StudySpecification object created by one of rct_spec(), rd_spec(), or obs_spec().

target

One of "ate" or "ett"; ate() and ett() choose these automatically.

weightAlias

An alias for the weight target, currently one of "ate", "ett", "att". Chosen by ate() and ett() automatically.

dichotomy

optional; a formula defining the dichotomy of the treatment variable if it isn't already 0/1. See the Details section of the help for ate() or ett().

by

optional; named vector or list connecting names of unit of assignment/cluster variables in specification to unit of assignment/cluster variables in data. Names represent variables in the StudySpecification; values represent variables in the data. Only needed if variable names differ.

data

optionally the data for the analysis to be performed on. May be excluded if these functions are included as the weights argument of a model.

Value

a WeightedStudySpecification object

Cluster-randomized experiment data on voter turnout in cable system markets

Description

This dataset is a toy example derived from a cluster-randomized field experiment that evaluates the effect of “Rock the Vote” TV advertisements on voter turnout rate. The original study included 23,869 first-time voters across 85 cable television markets in 12 states. These markets were grouped into matched sets based on their past voter turnout rates and then randomly assigned to either a treatment or control condition. This toy dataset is constructed by randomly sampling 10% of individuals from selected cable television markets in the original dataset.

Usage

GV_data

Format

A data.frame with 248 rows and 7 columns.

age Age of participant
vote_04 Outcome variable indicating whether participant voted
tv_company Cable system serving participant's residential area
treatment Binary variable denoting treatment assignment
pairs A numeric indicator for the strata or matched pair group to which a cable system belongs (1-3)
population_size Total population size of residential area served by cable system
sample_size Number of individuals sampled from the cable system cluster

Details

The original dataset was drawn from a randomized controlled trial in which 85 cable system areas were first grouped into 40 matched sets based on historical voter turnout. Within each matched set, one cable system area was randomly assigned to the treatment condition, while the others served as controls.

This toy dataset includes a subset of the original replication data, specifically individuals from matched sets 1–3, which encompass 7 of the 85 cable system areas. Within these selected clusters, a 10% random sample of individuals was taken.

The fuller Green-Vavreck dataset that this derives from bears a Creative Commons BY-NC-ND license (v3.0) and is housed in Yale University's Institution for Social and Policy Studies (ID: D005).

Source

https://isps.yale.edu/research/data/d005

References

Green, Donald P. & Lynn Vavreck (2008) "Analysis of Cluster-Randomized Experiments: A Comparison of Alternative Estimation Approaches." Political Analysis 16(2):138-152.

(Internal) model predictions with some model artifacts, as S4 object

Description

(Internal) model predictions with some model artifacts, as S4 object

Slots

.Data: numeric vector of predictions
fitted_covariance_model: model standing behind the predictions
prediction_gradient: matrix, predictions gradient w/r/t model params

STAR participants plus nonexperimental controls

Description

Data from Tennessee’s Project STAR study. This data frame describes student participants in the Project STAR (Student-Teacher Achievement Ratio) field experiment conducted in Tennessee, USA beginning in the mid-1980s, as well as an external control group consisting of the contemporaneous cohort of students attending a matched sample of Tennessee schools that did not participate in the STAR experiment. Variables are as described in Project STAR data documentation (see references), with five exceptions. Three *_at_entry variables were constructed as follows: grade_at_entry indicates the grade of the student's first participation, while school_at_entry and cond_at_entry reflect the school ID and classroom type corresponding to the student's grade at entry to the study. Additionally, read_yr1 and math_yr1 capture a student's scaled scores on the Stanford Achievement Test (SAT) administered to them during their grade_at_entry, i.e. their earliest available post-treatment SAT measurements.

Usage

STARplus

Format

A data.frame with 13,382 rows and 56 columns.

stdntid Student ID
gender Student gender
race Student race
birthmonth Student month of birth
birthday Student day of birth
birthyear Student year of birth
read_yr1 SAT reading scaled score from grade at which student entered the study
math_yr1 SAT math scaled score from grade at which student entered the study
gktreadss Kindergarten reading scaled score (RCT participants only)
gktmathss Kindergarten math scaled score (RCT participants only)
gktlistss Kindergarten listening scaled score (RCT participants only)
gkwordskillss Kindergarten word study skills scaled score (RCT participants only)
g1schid Grade 1 School ID
g1tchid Grade 1 Teacher ID
g1classsize Class size of Grade 1
g1treadss Grade 1 SAT reading scaled score
g1tmathss Grade 1 SAT math scaled score
g1tlistss Grade 1 total listening scale score in SAT
g1wordskillss Grade 1 word study skills scale score in SAT
g1readbsraw Grade 1 reading raw score in Basic Skills First (BSF) tests
g1mathbsraw Grade 1 math raw score in BSF
g1readbsobjpct Grade 1 reading percent objectives mastered in BSF tests
g1mathbsobjpct Grade 1 math percent objectives mastered in BSF tests
g2schid Grade 2 School ID
g2tchid Grade 2 Teacher ID
g2classsize Class size of Grade 2
g2treadss Grade 2 total reading scale score in SAT
g2tmathss Grade 2 total math scale score in SAT
g2tlistss Grade 2 total listening scale score in SAT
g2wordskillss Grade 2 word study skills scale score in SAT
g2readbsraw Grade 2 reading raw score in BSF tests
g2mathbsraw Grade 2 math raw score in BSF test
g2readbsobjpct Grade 2 reading percent objectives mastered in BSF tests
g3schid Grade 3 School ID
g3tchid Grade 3 Teacher ID
g3classsize Class size of Grade 3
g3treadss Grade 3 total reading scale score in SAT
g3tmathss Grade 3 total math scale score in SAT
g3langss Grade 3 total language scale score in SAT
g3tlistss Grade 3 total listening scale score in SAT
g3socialsciss Grade 3 social science scale score in SAT
g3spellss Grade 3 spelling scale score in SAT
g3vocabss Grade 3 vocabulary scale score in SAT
g3mathcomputss Grade 3 math computation scale score in SAT
g3mathnumconcss Grade 3 concept of numbers scale score in SAT
g3mathapplss Grade 3 math applications scale score in SAT
g3wordskillss Grade 3 word study skills scale score in SAT
g3readbsraw Grade 3 reading raw score in BSF tests
g3mathbsraw Grade 3 math raw score in BSF tests
g3readbsobjpct Grade 3 reading percent objectives mastered in BSF tests
g3mathbsobjpct Grade 3 math percent objectives mastered in BSF tests
dob Date of birth (with NAs imputed to the RCT participant median)
dobNA Date of birth not recorded
grade_at_entry Grade at which each student first entered the study
school_at_entry School ID corresponding to the student's grade at entry into the study
cond_at_entry Classroom type corresponding to the student's grade at entry into the study

Details

Note: This dataset bears a Creative Commons Zero license (v1.0).

Source

doi:10.7910/DVN/SIWH9F

References

C.M. Achilles; Helen Pate Bain; Fred Bellott; Jayne Boyd-Zaharias; Jeremy Finn; John Folger; John Johnston; Elizabeth Word, 2008, "Tennessee's Student Teacher Achievement Ratio (STAR) project", Harvard Dataverse, V1, https://doi.org/10.7910/DVN/SIWH9F UNF:3:Ji2Q+9HCCZAbw3csOdMNdA

(Internal) model predictions with more model artifacts, as S4 object

Description

Contains PreSandwichLayer class. Only additional slots listed here.

Slots

keys: a data.frame
StudySpecification: a StudySpecification

(Internal) Modeling weights with an accompanying StudySpecification

Description

(Internal) Modeling weights with an accompanying StudySpecification

Details

@target is used for calculation purposes, defining which weight to calculate. @weightAlias only stores the alias used in creating the weights, in case we want to report it later.

Slots

.Data: numeric vector of modeling weights
StudySpecification: a StudySpecification
target: character string, e.g. "ate"
weightAlias: alias for target appearing in an originating call
dichotomy: formula describing a treatment/comparison dichotomy

Group-center akin to Stata's `areg`

Description

From the Stata documentation: areg begins by recalculating Y and X to have mean 0 within the groups specified by absorb(). The overall mean of each variable is then added back in.

Usage

areg.center(mm, grp, wts = NULL, grand_mean_center = FALSE)

Arguments

mm

Matrix of variables to center

grp

Group to center on

wts

Optional weights

grand_mean_center

Optional; center output at mean(var)

Value

Vector of group-centered values
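
The centering described in the Stata quotation can be sketched for a single unweighted variable in base R. The helper group_center() and its add_grand_mean argument are hypothetical; this is not the body of areg.center(), and weights are ignored.

```r
# Hypothetical helper: remove group means, optionally adding the overall mean back
group_center <- function(x, grp, add_grand_mean = TRUE) {
  centered <- x - stats::ave(x, grp)   # mean 0 within each group
  if (add_grand_mean) centered + mean(x) else centered
}

x   <- c(1, 3, 10, 14)
grp <- c("a", "a", "b", "b")

group_center(x, grp, add_grand_mean = FALSE)   # -1  1 -2  2
group_center(x, grp)                           #  6  8  5  9
```

After centering, every group has the same mean, so between-group differences no longer contribute to a regression fit on the centered variables.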

Convert a `PreSandwichLayer` to a `SandwichLayer` with a `StudySpecification` object

Description

as.SandwichLayer() uses the StudySpecification object passed to the specification argument to populate the slots in a SandwichLayer object that a PreSandwichLayer does not have sufficient information for.

Usage

as.SandwichLayer(x, specification, by = NULL, Q_data = NULL)

Arguments

x

a PreSandwichLayer object

specification

a StudySpecification object

by

Q_data

dataframe of direct adjustment sample, which is needed to generate the keys slot of the SandwichLayer object. Defaults to NULL, in which case if by is NULL, the data used to create specification is used, and if by is not NULL, appropriate data further up the call stack (passed as arguments to cov_adj() or lmitt.formula(), for example) is used.

Value

a SandwichLayer object

Convert `lm` object into `teeMod`

Description

Converts the output of lm() into a teeMod object, for standard errors that account for block and cluster information carried with the lm's weights, and/or an offset incorporating predictions of the outcome from a separate model.

Usage

as.lmitt(x, specification = NULL)

as.teeMod(x, specification = NULL)

Arguments

x

lm object with weights containing a WeightedStudySpecification, or an offset from cov_adj().

specification

Optional, explicitly specify the StudySpecification to be used. If the StudySpecification is specified elsewhere in x (e.g. passed as an argument to any of ate(), ett(), cov_adj() or assigned()) it will be found automatically and does not need to be passed here as well. (If different StudySpecification objects are passed (either through the lm in weights or covariance adjustment, or through this argument), an error will be produced.)

Details

The formula with which x was created must include a treatment identifier (e.g. assigned()). If a model-based offset is incorporated, the model's predictions would have to have been extracted using cov_adj() (as opposed to predict()) in order for teeMod standard error calculations to reflect propagation of error from these predictions. This mechanism only supports treatment main effects: to estimate interactions of treatment assignment with a moderator variable, use lmitt() instead of lm() and as.lmitt().

Value

teeMod object
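
A short usage sketch, assembled from calls that appear in other examples in this manual (the simdata dataset, rct_spec(), ate() and assigned()). It assumes propertee is installed and is meant as an illustration rather than an official example.

```r
library(propertee)
data(simdata)

# Specification with units of assignment identified by uoa1 and uoa2
spec <- rct_spec(z ~ unit_of_assignment(uoa1, uoa2), data = simdata)

# Ordinary lm() fit: treatment enters via assigned(), weights carry the specification
mod <- lm(y ~ assigned(), data = simdata, weights = ate(spec))

# Convert to a teeMod so that standard errors reflect the specification
tm <- as.lmitt(mod)
class(tm)
```

Because the specification travels with the weights, it does not need to be passed to as.lmitt() a second time.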

Convert `StudySpecification` between types

Description

Convert a StudySpecification between an observational study, a randomized controlled trial, and a regression discontinuity (created from obs_spec, rct_spec and rd_spec respectively).

Usage

as_rct_spec(StudySpecification, ..., loseforcing = FALSE)

as_obs_spec(StudySpecification, ..., loseforcing = FALSE)

as_rd_spec(StudySpecification, data, ..., forcing)

Arguments

StudySpecification

a StudySpecification to convert

...

Ignored.

loseforcing

converting from RD to another StudySpecification type will error to avoid losing the forcing variable. Setting loseforcing = TRUE allows the conversion to automatically drop the forcing variable. Default FALSE.

data

converting to an RD requires adding a forcing variable, which requires access to the original data.

forcing

converting to an RD requires adding a forcing variable. This should be entered as a formula which would be passed to update, e.g. forcing = . ~ . + forcing(forcevar).

Value

StudySpecification of the updated type

Examples

spec <- rct_spec(z ~ unit_of_assignment(uoa1, uoa2), data = simdata)
spec
as_obs_spec(spec)
as_rd_spec(spec, simdata, forcing = ~ . + forcing(force))
spec2 <- rd_spec(o ~ uoa(uoa1, uoa2) + forcing(force), data = simdata)
spec2
# as_rct_spec(spec2) # this will produce an error
as_rct_spec(spec2, loseforcing = TRUE)

Obtain Treatment from StudySpecification

Description

When passing a lm object to lmitt(), extract and use the treatment variable specified in the StudySpecification.

Usage

assigned(specification = NULL, data = NULL, dichotomy = NULL)

adopters(specification = NULL, data = NULL, dichotomy = NULL)

a.(specification = NULL, data = NULL, dichotomy = NULL)

z.(specification = NULL, data = NULL, dichotomy = NULL)

Arguments

specification

Optional StudySpecification. If the StudySpecification can't be identified in the model (usually because neither weights (ate() or ett()) nor a covariate adjustment model (cov_adj()) are found), the StudySpecification can be passed directly.

data

Optional data set. By default assigned() will attempt to identify the appropriate data; if this fails (or you want to override it), you can pass the data here.

dichotomy

optional; a formula defining the dichotomy of the treatment variable if it isn't already 0/1. See details for more information. If ett() or ate() is called within a lmitt() call that specifies a dichotomy argument, that dichotomy will be used if the argument here has not been specified.

Details

When passing a lm object to lmitt(), the treatment variable in the formula passed to lm() needs to be identifiable. Rather than placing the treatment variable directly in the formula, use one of these functions, to allow lmitt() to identify the treatment variable.

To keep the formula in the lm() call concise, instead of passing specification and data arguments to these functions, one can pass a WeightedStudySpecification object to the weights argument of the lm() call or a SandwichLayer object to the offset argument.

Alternatively, you can pass the specification and data arguments.

While assigned() can be used in any situation, it is most useful for scenarios where the treatment variable is non-binary and the StudySpecification contains a Dichotomy. For example, say q is a 3-level ordinal treatment variable, and the binary comparison of interest is captured in dichotomy = q == 3 ~ q < 3. If you were to fit a model including q as a predictor, e.g. lm(y ~ q, ...), lm would treat q as the full ordinal variable. On the other hand, by calling lm(y ~ assigned(), weights = ate(spec), ...), assigned() will generate the appropriate binary variable to allow estimation of treatment effects.

If called outside of a model call and without a data argument, this will extract the treatment from the specification. If this is the goal, the treatment() function is better suited for this purpose.
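
To make the dichotomy idea concrete, the base R sketch below codes a three-level treatment as binary. It assumes the left-hand side of the formula marks the treatment group, the right-hand side marks the comparison group, and units matching neither stay NA. The helper apply_dichotomy() is hypothetical, does not handle the `.` shorthand, and is not how assigned() is implemented.

```r
# Hypothetical helper: evaluate both sides of a dichotomy formula within the data
apply_dichotomy <- function(dichotomy, data) {
  treated <- eval(dichotomy[[2]], data)   # left-hand side condition
  control <- eval(dichotomy[[3]], data)   # right-hand side condition
  out <- rep(NA_real_, nrow(data))
  out[which(control)] <- 0
  out[which(treated)] <- 1
  out
}

d <- data.frame(q = c(1, 2, 3, 3, 2))
apply_dichotomy(q == 3 ~ q < 3, d)   # 0 0 1 1 0
```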

Value

The treatment variable to be placed in the regression formula.

Examples

data(simdata)
spec <- obs_spec(z ~ uoa(uoa1, uoa2), data = simdata)
mod <- lm(y ~ assigned(), data = simdata, weights = ate(spec))
lmittmod <- lmitt(mod)
summary(lmittmod, vcov.type = "CR0")

Adjust residuals for both-sides absorption

Description

Adjust residuals for both-sides absorption

Usage

block_center_residuals(x)

Arguments

x

a fitted teeMod model

Details

This function subtracts off the block residual mean function \hat \alpha(v_b, \theta) for each observation from model residuals

Value

the fitted teeMod with updated block center residuals.

Extract bread matrix from a `teeMod` model fit

Description

An S3method for sandwich::bread that extracts the bread of the direct adjustment model sandwich covariance matrix.

Usage

## S3 method for class 'teeMod'
bread(x, ...)

Arguments

x

a fitted teeMod model

...

arguments passed to methods

Details

This function is a thin wrapper around .get_tilde_a22_inverse().

Value

A variance-covariance matrix with row and column entries for the estimated coefficients in x, the marginal mean outcome in the control condition, the marginal mean offset in the control condition (if an offset is provided), and if a moderator variable is specified in the formula for x, the mean interaction in the control condition of the outcome and offset with the moderator variable
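
For readers new to the terminology, the roles of "bread" and estimating-equation contributions in a sandwich covariance estimate can be sketched for a plain lm fit in base R. This uses the standard HC0 construction for ordinary least squares; it does not reproduce the additional rows and columns that bread.teeMod() returns.

```r
fit <- lm(dist ~ speed, data = cars)
X <- model.matrix(fit)
r <- residuals(fit)
n <- nrow(X)

ef   <- X * r                      # n x p contributions to the estimating equations
br   <- solve(crossprod(X)) * n    # "bread": scaled inverse of X'X
meat <- crossprod(ef) / n          # "meat": average outer product of contributions

V <- br %*% meat %*% br / n        # HC0 sandwich covariance of the coefficients
sqrt(diag(V))                      # robust standard errors
```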

Concatenate weights

Description

Given several variations of weights generated from a single StudySpecification, combine into a single weight.

Usage

## S4 method for signature 'WeightedStudySpecification'
c(x, ..., warn_dichotomy_not_equal = FALSE)

Arguments

x

a WeightedStudySpecification object, typically created from ate() or ett()

...

any number of additional WeightedStudySpecification objects with equivalent StudySpecification to x and each other

warn_dichotomy_not_equal

if FALSE (default), WeightedStudySpecifications are considered equivalent even if their dichotomy differs. If TRUE, a warning is produced.

Details

Concatenating WeightedStudySpecification objects with c() requires all individual WeightedStudySpecification objects to come from the same StudySpecification and have the same target (e.g., all created with ate() or all created with ett(), no mixing-and-matching). All arguments to c() must be WeightedStudySpecification objects.

WeightedStudySpecification objects may be concatenated together even without having the same @dichotomy slot. This procedure only prompts a warning for differing dichotomies if the argument warn_dichotomy_not_equal is set to TRUE.

Value

A numeric vector with the weights concatenated in the input order.

Examples

data(simdata)
spec <- rct_spec(z ~ unit_of_assignment(uoa1, uoa2), data = simdata)
w1 <- ate(spec, data = simdata[1:30,])
w2 <- ate(spec, data = simdata[31:40,])
w3 <- ate(spec, data = simdata[41:50,])
c_w <- c(w1, w2, w3)
c(length(w1), length(w2), length(w3), length(c_w))

spec <- rct_spec(dose ~ unit_of_assignment(uoa1, uoa2), data = simdata)
w1 <- ate(spec, data = simdata[1:10, ], dichotomy = dose >= 300 ~ .)
w2 <- ate(spec, data = simdata[11:30, ], dichotomy = dose >= 200 ~ .)
w3 <- ate(spec, data = simdata[31:50, ], dichotomy = dose >= 100 ~ .)
c_w <- c(w1, w2, w3)

Use properties of idempotent matrices to cheaply compute inverse symmetric square roots of cluster-specific subsets of projection matrices

Description

Use properties of idempotent matrices to cheaply compute inverse symmetric square roots of cluster-specific subsets of projection matrices

Usage

cluster_iss(
  tm,
  cluster_unit,
  cluster_ids = NULL,
  cluster_var = NULL,
  exclude = na.action(tm),
  tol = 1e-09,
  ...
)

Arguments

exclude

index of units to exclude from computing the correction; for example, if they're NA's

Confidence intervals with standard errors provided by `vcov.teeMod()`

Description

An S3method for stats::confint that uses standard errors computed using vcov.teeMod(). Additional arguments passed to this function, such as cluster and type, specify the arguments of the vcov.teeMod() call.

Usage

## S3 method for class 'teeMod'
confint(object, parm, level = 0.95, ...)

Arguments

object

a fitted teeMod model

parm

a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered.

level

the confidence level required.

...

additional arguments to pass to vcov.teeMod()

Details

Rather than call stats::confint.lm(), confint.teeMod() calls .confint_lm(), a function internal to the propertee package that ensures additional arguments in the ... of the confint.teeMod() call are passed to the internal vcov() call.

Value

A matrix (or vector) with columns giving lower and upper confidence limits for each parameter. These will be labelled as (1-level)/2 and 1 - (1-level)/2 in % (by default 2.5% and 97.5%)
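
A usage sketch, built from calls shown in other examples in this manual and assuming propertee is installed. Passing type = "CR0" relies on the statement above that extra arguments are forwarded to vcov.teeMod().

```r
library(propertee)
data(simdata)

spec <- rct_spec(z ~ unit_of_assignment(uoa1, uoa2), data = simdata)
mod  <- lm(y ~ assigned(), data = simdata, weights = ate(spec))
tm   <- lmitt(mod)

# Default 95% intervals
ci95 <- confint(tm)

# 90% intervals, with the variance estimator chosen via the forwarded type argument
ci90 <- confint(tm, level = 0.90, type = "CR0")
```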

Covariance adjustment of `teeMod` model estimates

Description

cov_adj() takes a fitted covariance model and returns the information necessary for adjusting direct adjustment model estimates and associated standard errors for covariates. Standard errors will reflect adjustments made to the outcomes as well as contributions to sampling variability arising from the estimates of the covariance adjustment model coefficients.

Usage

cov_adj(model, newdata = NULL, specification = NULL, by = NULL)

Arguments

model

any model that inherits from a glm, lm, or robustbase::lmrob object

newdata

a dataframe of new data. Default is NULL, in which case a dataframe is sought from higher up the call stack.

specification

a StudySpecification object. Default is NULL, in which case a StudySpecification object is sought from higher up the call stack.

by

Details

Prior to generating adjustments, cov_adj() identifies the treatment variable specified in the StudySpecification object passed to specification and replaces all values with a reference level. If the treatment has logical type, this reference level is FALSE, and if it has numeric type, this is the smallest non-negative value (which means 0 for 0/1 binary). Factor treatments are not currently supported for StudySpecification objects.

The values of the output vector represent adjustments for the outcomes in newdata if newdata is provided; adjustments for the outcomes in the data used to fit a teeMod model if cov_adj() is called within the offset argument of the model fit; or they are the fitted values from model if no relevant dataframe can be extracted from the call stack. The length of the output of cov_adj() will match the number of rows of the dataframe used.
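
The reference-level substitution described above can be imitated in base R on simulated data. The helper reference_level() and all variable names are hypothetical; this illustrates the idea only and is not the code cov_adj() runs.

```r
set.seed(1)
dat <- data.frame(x = rnorm(20), z = rep(c(0, 1), 10))
dat$y <- 1 + 2 * dat$x + 0.5 * dat$z + rnorm(20)

# A covariance model that happens to include the treatment variable z
ca_model <- lm(y ~ x + z, data = dat)

# Hypothetical helper: FALSE for logical treatments, else the smallest non-negative value
reference_level <- function(trt) {
  if (is.logical(trt)) FALSE else min(trt[trt >= 0], na.rm = TRUE)
}

# Predict for every unit as if it were at the reference level of treatment
newdat <- dat
newdat$z <- reference_level(dat$z)
adjustments <- predict(ca_model, newdata = newdat)
```

Setting the treatment to its reference level keeps the fitted treatment coefficient of the covariance model out of the adjustments, so they describe predicted outcomes absent treatment.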

Value

A SandwichLayer if specification is not NULL or a StudySpecification object is found in the call stack, otherwise a PreSandwichLayer object

Examples

data("STARplus")

##' A prognostic model fitted to experimental + non-experimental controls
y0hat_read <- lm(read_yr1 ~ gender*dob +dobNA + race,
                 data = STARplus,
                 subset = cond_at_entry!="small")

STARspec <- rct_spec(cond_at_entry ~ unit_of_assignment(stdntid) +
                         block(grade_at_entry, school_at_entry),
                     subset=!is.na(grade_at_entry),# excludes non-experimentals
                     data = STARplus)
ett_wts    <- ett(STARspec, data = STARplus,
                  dichotomy= cond_at_entry =="small" ~.)

ett_read <- lm(read_yr1 ~ assigned(dichotomy= cond_at_entry =="small" ~.),
               offset = cov_adj(y0hat_read),
               data = STARplus,
               weights = ett_wts)
coef(ett_read)
ett_read |> as.lmitt() # brings in control-group means of outcome, predictions

ate_read <- lmitt(read_yr1 ~ 1, STARspec, STARplus,
                  dichotomy= cond_at_entry =="small" ~.,
                  offset = cov_adj(y0hat_read),
                  weights = "ate")
show(ate_read)
vcov(ate_read, type = "HC0", cov_adj_rcorrect = "HC0") |> unname()

ate_read_loc <-
    lmitt(read_yr1 ~ race, STARspec, STARplus,
          dichotomy= cond_at_entry =="small" ~.,
          offset = cov_adj(y0hat_read, newdata = STARplus),
          weights = "ate")
show(ate_read_loc)

Extract empirical estimating equations from a `glmrob` model fit

Description

Extract empirical estimating equations from a glmbrob model fit

Extract bread matrix from an lmrob() fit

Usage

## S3 method for class 'glmrob'
estfun(x, ...)

## S3 method for class 'glmrob'
bread(x, ...)

Arguments

x

a fitted glmrob object

...

arguments passed to methods

Value

matrix, estimating functions evaluated at data points and fitted parameters

matrix, inverse Hessian of loss as evaluated at fitted parameters

Generate matrix of estimating equations for `lmrob()` fit

Description

Generate matrix of estimating equations for lmrob() fit

Extract bread matrix from an lmrob() fit

Usage

## S3 method for class 'lmrob'
estfun(x, ...)

## S3 method for class 'lmrob'
bread(x, ...)

Arguments

x

An lmrob object produced using an MM/SM estimator chain

...

Additional arguments to be passed to bread

Details

This is part of a workaround for an issue in the robustbase code affecting sandwich covariance estimation. The issue in question is issue #6471, robustbase project on R-Forge. This function contributes to providing sandwich estimates of covariance-adjusted standard errors for robust linear covariance adjustment models.

Value

An n\times (p+1) matrix where the first column corresponds to the scale estimate and the remaining p columns correspond to the coefficients

A p\times (p+1) matrix where the first column corresponds to the scale estimate and the remaining p columns correspond to the coefficients

Author(s)

Ben B. Hansen

Extract empirical estimating equations from a `teeMod` model fit

Description

An S3 method for sandwich::estfun that produces a matrix of contributions to the direct adjustment estimating equations.

Usage

## S3 method for class 'teeMod'
estfun(x, ...)

Arguments

x

a fitted teeMod model

...

arguments passed to methods, most importantly those that define the bias corrections for the residuals of x and, if applicable, a fitted_covariance_model stored in its offset

Details

If a prior covariance adjustment model has been passed to the offset argument of the teeMod model using cov_adj(), estfun.teeMod() incorporates contributions to the estimating equations of the covariance adjustment model.

The covariance adjustment sample may not fully overlap with the direct adjustment sample, in which case estfun.teeMod() returns a matrix with the same number of rows as the number of unique units of observation used to fit the two models. Uniqueness is determined by matching units of assignment used to fit the covariance adjustment model to units of assignment in the teeMod model's StudySpecification slot; units of observation within units of assignment that do not match are additional units that add to the row count.

The by argument in cov_adj() can provide a column or a pair of columns (a named vector where the name specifies a column in the direct adjustment sample and the value a column in the covariance adjustment sample) that uniquely specifies units of observation in each sample. This information can be used to align each unit of observation's contributions to the two sets of estimating equations. If no by argument is provided and units of observation cannot be uniquely specified, contributions are aligned up to the unit of assignment level. If standard errors are clustered no finer than that, they will provide the same result as if each unit of observation's contributions were aligned exactly.

This method incorporates bias corrections made to the residuals of x and, if applicable, the covariance model stored in its offset. When its crossproduct is taken (perhaps after suitable summing across rows within clusters), it provides a heteroskedasticity- (or cluster-) robust estimate of the meat matrix of the variance-covariance of the parameter estimates in x.
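The role of the crossproduct can be seen with an ordinary lm fit in base R. This is a sketch of the general sandwich construction only, on simulated data with hypothetical names; it omits the covariance model contributions and the bias corrections that estfun.teeMod() adds, and it does not apply the scaling by n used in the sandwich package.

```r
set.seed(1)
n <- 40
cluster <- rep(1:10, each = 4)
x <- rnorm(n)
y <- 1 + x + rnorm(n)
fit <- lm(y ~ x)

X <- model.matrix(fit)
# Estimating-function contributions: one row per observation.
ef <- X * residuals(fit)

# Heteroskedasticity-robust (HC0-type) meat: crossproduct of the rows.
meat_hc0 <- crossprod(ef)

# Cluster-robust meat: sum rows within clusters first, then crossproduct.
ef_cl <- rowsum(ef, group = cluster)
meat_cl <- crossprod(ef_cl)

# Sandwich: bread %*% meat %*% bread, with bread = (X'X)^{-1}.
bread <- solve(crossprod(X))
vcov_hc0 <- bread %*% meat_hc0 %*% bread
vcov_cl <- bread %*% meat_cl %*% bread
```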

Value

An n\times k matrix of empirical estimating equations for x. k includes the model intercept, main effects of treatment and moderator variables, any moderator effects, and marginal and conditional means of the outcome (and offset, if provided) in the control condition. See Details for definition of n.

Generate Direct Adjusted Weights for Treatment Effect Estimation

Description

These should primarily be used inside models. See Details.

Usage

ett(specification = NULL, dichotomy = NULL, by = NULL, data = NULL)

att(specification = NULL, dichotomy = NULL, by = NULL, data = NULL)

ate(specification = NULL, dichotomy = NULL, by = NULL, data = NULL)

etc(specification = NULL, dichotomy = NULL, by = NULL, data = NULL)

atc(specification = NULL, dichotomy = NULL, by = NULL, data = NULL)

ato(specification = NULL, dichotomy = NULL, by = NULL, data = NULL)

olw(specification = NULL, dichotomy = NULL, by = NULL, data = NULL)

owt(specification = NULL, dichotomy = NULL, by = NULL, data = NULL)

pwt(specification = NULL, dichotomy = NULL, by = NULL, data = NULL)

Arguments

specification

optional; a StudySpecification object created by one of rct_spec(), rd_spec(), or obs_spec().

dichotomy

optional; a formula defining the treatment and control groups when the treatment variable is not already binary. See Details.

by

optional; named vector or list connecting names of unit of assignment/unitid/cluster variables in specification to unit of assignment/unitid/cluster variables in data. Names represent variables in the StudySpecification; values represent variables in the data. Only needed if variable names differ.

data

optional; the data for the analysis to be performed on. May be excluded if these functions are included as the weights argument of a model.

Details

These functions should primarily be used in the weights argument of lmitt() or lm(). All arguments are optional if used within those functions. If used on their own, specification and data must be provided.

ate - Average treatment effect. Aliases: ate().
ett - Effect of treatment on the treated. Aliases: ett(), att().
etc - Effect of treatment on controls. Aliases: etc(), atc().
ato - Overlap-weighted average effect. Aliases: ato(), olw(), owt(), pwt().

In a StudySpecification with blocks, the weights are generated as a function of the ratio of the number of treated units in a block versus the total number of units in a block.

In any blocks where that ratio is 0 or 1 (that is, all units in the block have the same treatment status), the weights will be 0. In effect this removes from the target population any block in which there is no basis for estimating either means under treatment or means under control.

If block is missing for a given observation, a weight of 0 is applied.
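How weights can depend on the within-block proportion treated is sketched below in base R. The exact formulas used by propertee are not stated on this page, so the sketch assumes the textbook inverse-probability forms; the toy data and variable names are hypothetical.

```r
# Toy data: block "c" has no controls, so it cannot contribute.
block <- c("a", "a", "a", "a", "b", "b", "b", "b", "c", "c")
z     <- c(1, 0, 0, 0, 1, 1, 0, 0, 1, 1)

# Proportion treated within each block, repeated for each unit.
p <- ave(z, block, FUN = mean)

# Assumed textbook forms (not taken from the propertee source):
#   ATE weight: 1/p for treated units, 1/(1 - p) for controls
#   ETT weight: 1 for treated units, p/(1 - p) for controls
usable <- p > 0 & p < 1
w_ate <- ifelse(usable, z / p + (1 - z) / (1 - p), 0)
w_ett <- ifelse(usable, z + (1 - z) * p / (1 - p), 0)

data.frame(block, z, p, w_ate, w_ett)
```

Units in block "c" receive weight 0 under both schemes, matching the rule that blocks with a ratio of 0 or 1 drop out of the target population.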

A dichotomy is specified by a formula consisting of a conditional statement on both the left-hand side (identifying treatment levels associated with "treatment") and the right hand side (identifying treatment levels associated with "control"). For example, if your treatment variable was called dose and doses above 250 are considered treatment, you might write:

ate(..., dichotomy = dose > 250 ~ dose <= 250)

The period (.) can be used to assign all other units of assignment. For example, we could have written the same treatment regime as either

etc(..., dichotomy = dose > 250 ~ .)

olw(..., dichotomy = . ~ dose <= 250)

The dichotomy formula supports Relational Operators (see Comparison), Logical Operators (see Logic), and %in% (see match()).

The conditionals need not assign all values of treatment to control or treatment, for example, dose > 300 ~ dose < 200 does not assign 200 <= dose <= 300 to either treatment or control. This would be equivalent to manually generating a binary variable with NA whenever dose is between 200 and 300. Standard errors will reflect the sizes of the comparison groups specified by the dichotomy.
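The equivalence with a manually generated binary variable can be written out in base R. This is a small illustration with a hypothetical dose vector; it does not call propertee.

```r
dose <- c(100, 150, 200, 250, 300, 350)

# dose > 300 ~ dose < 200: treatment if dose > 300, control if dose < 200,
# and NA (neither group) for 200 <= dose <= 300.
z_partial <- ifelse(dose > 300, 1, ifelse(dose < 200, 0, NA))

# dose > 250 ~ . : everything not assigned to treatment becomes control.
z_full <- as.numeric(dose > 250)

data.frame(dose, z_partial, z_full)
```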

Tim Lycurgus contributed code for the computation of weights. The ‘overlap weight’ concept is due to Li, Morgan and Zaslavsky (2018), although the current implementation differs from that discussed in their paper in that it avoids estimated propensity scores.

Value

a WeightedStudySpecification object, which is a vector of numeric weights

References

Li, Fan, Kari Lock Morgan, and Alan M. Zaslavsky. "Balancing covariates via propensity score weighting." Journal of the American Statistical Association 113, no. 521 (2018): 390-400.

Examples

data(simdata)
spec <- rct_spec(z ~ unit_of_assignment(uoa1, uoa2), data = simdata)
summary(lmitt(y ~ 1, data = simdata, specification = spec, weights = ate()), vcov.type = "CR0")

`StudySpecification` Structure Information

Description

Obtains a data.frame that encodes the specification information.

Usage

get_structure(specification)

## S4 method for signature 'StudySpecificationStructure'
show(object)

Arguments

specification

a StudySpecification object

object

a StudySpecificationStructure object, typically the output of get_structure

Value

A StudySpecificationStructure object containing the structure of the specification as a data.frame.

Examples

data(simdata)
spec <- rct_spec(z ~ uoa(uoa1, uoa2) + block(bid), data = simdata)
get_structure(spec)

Check whether treatment stored in a `StudySpecification` object is binary

Description

Check whether treatment stored in a StudySpecification object is binary

Usage

has_binary_treatment(spec)

Arguments

spec

StudySpecification object

Value

logical vector of length 1

Test equality of two `StudySpecification` objects

Description

Check whether two StudySpecification objects are identical.

Usage

identical_StudySpecifications(x, y)

Arguments

x

A StudySpecification object.

y

A StudySpecification object.

Value

Logical, are x and y identical?

Identify fine strata

Description

Identify blocks in a StudySpecification with exactly one treated or one control unit of assignment.

Usage

identify_small_blocks(spec)

Arguments

spec

A StudySpecification object.

Value

Logical vector with length given by the number of blocks in StudySpecification
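The criterion can be sketched in base R as follows, assuming one row per unit of assignment. The vectors are hypothetical, and this is not the package's implementation.

```r
block <- c(1, 1, 2, 2, 2, 3, 3, 3, 3)
z     <- c(1, 0, 1, 0, 0, 1, 1, 0, 0)

n_treated <- tapply(z == 1, block, sum)
n_control <- tapply(z == 0, block, sum)

# TRUE for blocks with exactly one treated or exactly one control unit.
small <- n_treated == 1 | n_control == 1
small
```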

Linear Model for Intention To Treat

Description

Generates a linear model object to estimate a treatment effect, with proper estimation of variances accounting for the study specification.

Usage

lmitt(obj, specification, data, ...)

## S3 method for class 'formula'
lmitt(
  obj,
  specification,
  data,
  absorb = FALSE,
  offset = NULL,
  weights = NULL,
  ...
)

## S3 method for class 'lm'
lmitt(obj, specification = NULL, ...)

Arguments

obj

A formula or a lm object. See Details.

specification

The StudySpecification to be used. Alternatively, a formula creating a specification (of the type that would be passed as the first argument to rd_spec(), rct_spec(), or obs_spec()). If the formula includes a forcing() element, an RD specification is created. Otherwise an observational specification is created. An RCT specification must be created manually using rct_spec().

data

A data.frame such as would be passed into lm().

...

Additional arguments passed to lm() and other functions. An example of the latter is dichotomy=, a formula passed to assigned() and, as appropriate, ate(), att(), atc() or ato(). It is used to dichotomize a non-binary treatment variable in specification. See the Details section of the ate() help page for examples.

absorb

If TRUE, fixed effects are included for blocks identified in the StudySpecification. If FALSE, they are excluded. Default is FALSE. The estimates of these fixed effects are suppressed from the returned object.

offset

Offset of the kind which would be passed into lm(). Ideally, this should be the output of cov_adj().

weights

Which weights should be generated? Options are "ate" or "ett". Alternatively, the output of a manually run ate() or ett() can be used.

Details

The first argument to lmitt() should be a formula specifying the outcome on the left hand side. The right hand side of the formula can be any of the following:

1: Estimates a main treatment effect.
a subgroup variable: Estimates a treatment effect within each level of your subgrouping variable.
a continuous moderator: Estimates a main treatment effect as well as a treatment by moderator interaction. The moderator is not automatically centered.

Alternatively, obj can be a pre-created lm object. No modification is made to the formula of the object. See the help for as.lmitt() for details of this conversion.

The lmitt() function's subset= argument governs the subsetting of data prior to model fitting, just as with lm(). Functions such as rct_spec() that create StudySpecifications also take an optional subset= argument, but its role differs from that of the subset= argument of lm() or lmitt(). The subset= argument when creating a StudySpecification restricts the data used to generate the StudySpecification, but has no direct impact on the future lm() or lmitt() calls using that StudySpecification. (It can have an indirect impact by excluding particular units from receiving a treatment assignment or weight. When treatment assignments or weights are reconstructed from the StudySpecification, these units will receive NAs, and will be excluded from the lm() or lmitt() fit under typical na.action settings.)

To avoid variable name collision, the treatment variable defined in the specification will have a "." appended to it. For example, if you request a main treatment effect (with a formula of ~ 1) with a treatment variable named "txt", you can obtain its estimate from the returned teeMod object via $coefficients["txt."].

lmitt() will produce a message if the StudySpecification designates treatment assignment by block but the blocking structure appears not to be reflected in the weights, nor in a block fixed effect adjustment (via absorb=TRUE). While not an error, this is at odds with intended uses of propertee, so lmitt() flags it as a potential oversight on the part of the analyst. To disable this message, run options("propertee_message_on_unused_blocks" = FALSE).

lmitt() returns objects of class ‘teeMod’, for Treatment Effect Estimate Model, extending the lm class to add a summary of the response distribution under control (the coefficients of a controls-only regression of the response on an intercept and any moderator variable). teeMod objects also record the underlying StudySpecification and information about any externally fitted models mod that may have been used for covariance adjustment by passing offset=cov_adj(mod). In the latter case, responses are offset by predictions from mod prior to treatment effect estimation, but estimates of the response variable distribution under control are calculated without reference to mod.

The response distribution under control is also characterized when treatment effects are estimated with block fixed effects, i.e. for lmitt() with a formula first argument with option absorb=TRUE. Here as otherwise, the supplementary coefficients describe a regression of the response on an intercept and moderator variables, to which only control observations contribute; but in this case the weights are modified for this supplementary regression.

The treatment effect estimates adjusted for block fixed effects can be seen to coincide with estimates calculated without block effects but with weights multiplied by an additional factor specific to the combination of block and treatment condition. For block s containing units with weights w_i and binary treatment assignments z_i, define \hat{\pi}_s by \hat{\pi}_s \sum_{i \in s} w_i = \sum_{i \in s} z_i w_i. If \hat{\pi}_s is 0 or 1, the block doesn't contribute to effect estimation and the additional weighting factor is 0; if 0 < \hat{\pi}_s < 1, the additional weighting factor is 1 - \hat{\pi}_s for treatment group members and \hat{\pi}_s for controls.

When estimating a main effect only or a main effect with continuous moderator, supplementary coefficients under option absorb=TRUE reflect regressions with additional weighting factor equal to 0 or \hat{\pi}_s, respectively, for treatment or control group members of block s. With a categorical moderator and absorb=TRUE, this additional weighting factor determining supplementary coefficients is calculated separately for each level \ell of the moderator variable, with the sums defining \hat{\pi}_{s\ell} restricted not only to block s but also to observations with moderator equal to \ell.
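The stated coincidence between block fixed effects and reweighting can be checked numerically in base R. The sketch below uses simulated data with hypothetical names and plain lm() rather than lmitt(); it covers only the main effect case with a binary treatment.

```r
set.seed(7)
n_per_block <- 15
n_treated   <- c(3, 5, 8, 10)           # treated units per block
block <- rep(seq_along(n_treated), each = n_per_block)
z <- unlist(lapply(n_treated,
                   function(k) rep(c(1, 0), times = c(k, n_per_block - k))))
w <- runif(length(z), 0.5, 2)           # arbitrary positive weights
y <- block + 0.7 * z + rnorm(length(z))

# pi_hat_s solves pi_hat_s * sum_s(w_i) = sum_s(z_i * w_i) within block s.
pi_hat <- ave(z * w, block, FUN = sum) / ave(w, block, FUN = sum)

# Additional factor: 1 - pi_hat_s for treated, pi_hat_s for controls.
# Blocks with pi_hat_s equal to 0 or 1 get a factor of 0 automatically.
extra <- ifelse(z == 1, 1 - pi_hat, pi_hat)

est_fe <- coef(lm(y ~ z + factor(block), weights = w))["z"]
est_rw <- coef(lm(y ~ z, weights = w * extra))["z"]

all.equal(unname(est_fe), unname(est_rw))
```

Both estimates reduce to an average of within-block differences in weighted means, with block s contributing in proportion to \hat{\pi}_s (1 - \hat{\pi}_s) times its total weight.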

Value

teeMod object (see Details)

Examples

data(simdata)
spec <- rct_spec(z ~ unit_of_assignment(uoa1, uoa2), data = simdata)
mod1 <- lmitt(y ~ 1, data = simdata, specification = spec, weights = "ate")
mod2 <- lmitt(y ~ as.factor(o), data = simdata, specification = spec, weights = "ate")
mod3 <- lmitt(y ~ 1, data = simdata,
              specification = z ~ uoa(uoa1, uoa2) + forcing(force))

Synthetic Regression Discontinuity Data

Description

The data for this example were randomly simulated using the synthpop package in R based on data originally collected by Lindo, Sanders, and Oreopoulos (2010).

Usage

lsoSynth

Format

A data.frame with 40,403 rows and 11 columns.

R
lhsgrade_pct
nextGPA
probation_year1
totcredits_year1
male
loc_campus1
loc_campus2
bpl_north_america
english
age_at_entry

Details

See the "Regression Discontinuity StudySpecifications" vignette on the propertee website for more details on the original data, a link to the code used to generate this synthetic data, and a detailed example.

Intervention data from a pair-matched study of schools in Michigan

Description

Michigan high schools, with a plausible cluster RCT

Usage

michigan_school_pairs

Format

A data.frame with 14 rows and 13 columns.

schoolid school id
blk block
z treatment variable
MALE_G11_PERC percentage of G11 male students
FEMALE_G11_PERC percentage of G11 female students
AM_G11_PERC percentage of G11 American Indian/Alaska Native students
ASIAN_G11_PERC percentage of G11 Asian students
HISP_G11_PERC percentage of G11 Hispanic students
BLACK_G11_PERC percentage of G11 Black students
WHITE_G11_PERC percentage of G11 White students
PACIFIC_G11_PERC percentage of G11 Hawaiian Native/Pacific Islander students
TR_G11_PERC percentage of G11 Two or More Races students
G11 Number of G11 students

Details

Grade 11 demographics for all Michigan high schools in 2013, with mock block and treatment assignments for 14 high schools within a large county in the metro Detroit area. These schools were selected for this demonstration based on their similarity to the 14 high schools from an adjacent Michigan county that participated in the Pane et al. (2014) study. As a result, they serve as an example of what one might expect to find as the state-specific school-level subsample in a multi-state paired cluster randomized trial featuring random assignment at the school level.

The mock experimental schools were selected by optimal matching of experimental schools to adjacent county schools, with substitute schools grouped into the same pairs or triples (‘fine strata’) as were their experimental counterparts. The original pairs and triples had been selected to reduce variation in baseline variables predictive of outcomes, and the blocking structure the substitute sample inherits may be expected to do this as well. The treatment/control distinction is also inherited from the experimental sample, but there is of course no treatment effect within the mock experiment.

The selection of mock experimental schools was based on both demographic and student achievement variables, but the present data frame includes only the demographic variables (as sourced from the Common Core of Data [CCD; U.S. Department of Education]). School average outcomes in student test scores are available separately, from Michigan's Center for Education Performance Information. See the vignette ‘Real-data demonstration with a finely stratified cluster RCT and a broader administrative database’, available on the package website.

References

Pane, John F., et al. "Effectiveness of cognitive tutor algebra I at scale." Educational Evaluation and Policy Analysis 36.2 (2014): 127-144.

U.S. Department of Education. Public Elementary/Secondary School Universe Survey Data, v.2a. Institute of Education Sciences, National Center for Education Statistics.

Examples

data(michigan_school_pairs)
mi_spec <- rct_spec(z ~ uoa(schoolid) + block(blk),
                    data = michigan_school_pairs)
mi_spec
table(is.na(michigan_school_pairs$blk))
specification_table(mi_spec, "block", "treatment")

Generates a `StudySpecification` object with the given specifications.

Description

Generate a randomized controlled trial StudySpecification (rct_spec()), an observational StudySpecification (obs_spec()), or a regression discontinuity StudySpecification (rd_spec()).

Usage

rct_spec(formula, data, subset = NULL, na.fail = TRUE)

rd_spec(formula, data, subset = NULL, na.fail = TRUE)

obs_spec(formula, data, subset = NULL, na.fail = TRUE)

rct_specification(formula, data, subset = NULL, na.fail = TRUE)

rd_specification(formula, data, subset = NULL, na.fail = TRUE)

obs_specification(formula, data, subset = NULL, na.fail = TRUE)

obsstudy_spec(formula, data, subset = NULL, na.fail = TRUE)

obsstudy_specification(formula, data, subset = NULL, na.fail = TRUE)

Arguments

formula

a formula defining the StudySpecification components. See Details for specification.

data

the data set from which to build the StudySpecification. Note that this data need not be the same as used to estimate the treatment effect; rather the data passed should contain information about the units of treatment assignment (as opposed to the units of analysis).

subset

optional, subset the data before creating the StudySpecification object

na.fail

If TRUE (default), any missing data found in the variables specified in formula (excluding treatment) will trigger an error. If FALSE, non-complete cases will be dropped before the creation of the StudySpecification

Details

The formula should include exactly one unit_of_assignment() to identify the units of assignment (one or more variables). (uoa, cluster, or unitid are synonyms for unit_of_assignment; the choice of which has no impact on the analysis. See below for a limited exception in which the unit_of_assignment specification may be omitted.) If defining an rd_spec, the formula must also include a forcing() entry. The formula may optionally include a block() as well. Each of these can take in multiple variables, e.g. to pass both a household ID and individual ID as unit of assignment, use uoa(hhid, iid) and not uoa(hhid) + uoa(iid).

The treatment variable passed into the left-hand side of formula can be logical, numeric, or character. If it is anything else, conversion to one of those types is attempted (for example, factor and ordered are converted to numeric if the levels are numeric, otherwise to character). If the treatment is not logical or numeric with only values 0 and 1, then in order to generate weights with ate() or ett(), the dichotomy argument must be used in those functions to identify the treatment and control groups. See ett() for more details on specifying a dichotomy.
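The conversion rule described above can be sketched in base R. Both helpers below are hypothetical, written only to illustrate the stated rule; they are not functions from propertee and may differ from its internal handling.

```r
# Hypothetical helper illustrating the stated rule; not propertee's own code.
coerce_treatment <- function(trt) {
  # Logical, numeric and character treatments are kept as they are.
  if (is.logical(trt) || is.numeric(trt) || is.character(trt)) {
    return(trt)
  }
  labs <- as.character(trt)
  as_num <- suppressWarnings(as.numeric(labs))
  # Use numbers only if every non-missing label parsed as a number.
  if (anyNA(as_num[!is.na(labs)])) labs else as_num
}

# Hypothetical check for whether a dichotomy would be needed for weights.
needs_dichotomy <- function(trt) {
  !(is.logical(trt) || (is.numeric(trt) && all(trt %in% c(0, 1, NA))))
}

coerce_treatment(factor(c("0", "1", "1")))   # numeric 0/1 values
coerce_treatment(factor(c("low", "high")))   # character labels
needs_dichotomy(c(0, 1, 1))                  # FALSE
needs_dichotomy(c(50, 250, 300))             # TRUE
```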

There are a few aliases for each version.

If the formula excludes a unit_of_assignment(), data merges are performed on row order. Such formulas can also be passed as the specification argument to lmitt(), and that is their primary intended use case. It is recommended that each formula argument passed to *_specification() include a unit_of_assignment(), uoa() or cluster() term identifying the key variable(s) with which StudySpecification data is to be merged with analysis data. Exceptions to this rule will be met with a warning. To disable the warning, run options("propertee_warn_on_no_unit_of_assignment" = FALSE).

The units of assignment, blocks, and forcing variables must be numeric or character. If they are otherwise, an attempt is made to cast them into character.

Value

a StudySpecification object of the requested type for use in further analysis.

Examples

data(simdata)
spec <- rct_spec(z ~ unit_of_assignment(uoa1, uoa2) + block(bid),
                  data = simdata)

data(schooldata)
spec <- obs_spec(treatment ~ unit_of_assignment(schoolid) + block(state),
                  data = schooldata)

Student data

Description

An example of data sets stored at two levels.

Usage

schooldata

studentdata

Format

Two data.frames, one with school-level data (schooldata) including treatment assignment and a second with student-level data (studentdata). schooldata:

schoolid Unique school ID variable.
treatment Was this school in the intervention group?
state State which the school is in.
pct_disadvantage Percent of student body flagged as "disadvantaged".

studentdata:

id Unique student ID.
schoolid Unique school ID variable.
grade Student's grade, 3-5.
gpa Student GPA in prior year.
math Standardized math score (out of 100).

An object of class data.frame with 8713 rows and 5 columns.

Details

In this hypothetical data, schools were randomly assigned to treatment status, but the unit of analysis is students. Thus there are two data sets, one encoding school information (including treatment status) and one encoding student information (which does not include treatment status).
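The relationship between the two levels can be illustrated in base R with miniature stand-ins for the two tables. The data frames below are hypothetical and much smaller than schooldata and studentdata; the lookup is a plain base R device, not how propertee performs its merge.

```r
# Hypothetical miniature versions of the two tables.
schools <- data.frame(schoolid = c(1, 2, 3),
                      treatment = c(1, 0, 1))
students <- data.frame(id = 1:6,
                       schoolid = c(1, 1, 2, 2, 3, 3),
                       math = c(70, 75, 64, 68, 81, 79))

# Look up each student's school and copy over its treatment status.
students$treatment <- schools$treatment[match(students$schoolid,
                                              schools$schoolid)]

# Difference in mean scores between treated and control students.
diff_means <- with(students,
                   mean(math[treatment == 1]) - mean(math[treatment == 0]))
diff_means
```

A StudySpecification keyed on schoolid serves the same purpose, so that treatment status need not be copied into the student-level data by hand.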

Examples

spec <- obs_spec(treatment ~ uoa(schoolid), data = schooldata)

# Treatment effect
mod1 <- lmitt(math ~ 1, specification = spec, data = studentdata)

# Treatment effect by grade
mod2 <- lmitt(math ~ as.factor(grade), specification = spec, data = studentdata)

Show a `PreSandwichLayer` or `SandwichLayer`

Description

Display information about a PreSandwichLayer or SandwichLayer object

Usage

## S4 method for signature 'PreSandwichLayer'
show(object)

Arguments

object

PreSandwichLayer or SandwichLayer object

Value

an invisible copy of object

Show a `StudySpecification`

Description

Display information about a StudySpecification object

Usage

## S4 method for signature 'StudySpecification'
show(object)

Arguments

object

StudySpecification object, usually a result of a call to rct_spec(), obs_spec(), or rd_spec().

Value

object, invisibly.

Show a `WeightedStudySpecification`

Description

Prints out the weights from a WeightedStudySpecification

Usage

## S4 method for signature 'WeightedStudySpecification'
show(object)

Arguments

object

a WeightedStudySpecification object

Value

an invisible copy of object

Show a `teeMod`

Description

Display information about a teeMod object

Usage

## S4 method for signature 'teeMod'
show(object)

Arguments

object

teeMod object, usually a result of a call to lmitt().

Value

object, invisibly.

Simulated data

Description

Simulated data to use with the propertee package with unit of assignment level treatment assignment

Usage

simdata

Format

A data.frame with 100 rows and 9 columns.

uoa1 First level unit of assignment ID
uoa2 Second level unit of assignment ID
bid Block ID
force Forcing variable
z Binary treatment indicator
o 4-level ordered treatment variable
dose Dose treatment variable
x Some predictor
y Some outcome

Check for variable agreement within units of assignment

Description

Useful for debugging purposes to ensure that there is concordance between variables in the StudySpecification and data.

Usage

specification_data_concordance(
  specification,
  data,
  by = NULL,
  warn_on_nonexistence = TRUE
)

Arguments

specification

a StudySpecification object

data

a new data set, presumably not the same used to create specification.

by

optional; named vector or list connecting names of variables in specification to variables in data. Names represent variables in specification; values represent variables in data. Only needed if variable names differ.

warn_on_nonexistence

default TRUE. If a variable does not exist in data, should this be flagged? If FALSE, silently move on if a variable doesn't exist in data.

Details

Consider the following scenario: A StudySpecification is generated from some dataset, "data1", which includes a block variable "b1". Within each unique unit of assignment/unitid/cluster of "data1", it must be the case that "b1" is constant. (Otherwise the creation of the StudySpecification will fail.)

Next, a model is fit which includes weights generated from the StudySpecification, but on dataset "data2". In "data2", the block variable "b1" also exists, but due to some issue with data cleaning, does not agree with "b1" in "data1".

This could cause problems, either directly (via actual error messages) or by silently producing nonsense results. specification_data_concordance() is designed to help debug these scenarios by reporting whether variables in the data used to create specification ("data1" in the above example) and in some new dataset, data ("data2" in the above example), have any inconsistencies.
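The kind of check involved can be sketched in base R. The two data frames below are hypothetical stand-ins for "data1" and "data2", and the code illustrates the idea only; it is not the function's implementation.

```r
# Hypothetical data: block b1 for unit 3 differs between the two data sets.
data1 <- data.frame(uoa = c(1, 2, 3), b1 = c("x", "x", "y"))
data2 <- data.frame(uoa = c(1, 1, 2, 3, 3), b1 = c("x", "x", "x", "x", "x"))

# Is b1 constant within each unit of assignment in data2?
constant_within <- tapply(data2$b1, data2$uoa,
                          function(v) length(unique(v)) == 1)

# Does b1 in data2 agree with the value recorded for that unit in data1?
expected <- data1$b1[match(data2$uoa, data1$uoa)]
mismatch_units <- unique(data2$uoa[data2$b1 != expected])
mismatch_units
```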

Value

invisibly TRUE if no warnings are produced, FALSE if any warnings are produced.

Table of elements from a `StudySpecification`

Description

Produces a table (1-dimensional, or 2-dimensional if y is specified) of the elements of the StudySpecification.

Usage

specification_table(
  specification,
  x,
  y = NULL,
  sort = FALSE,
  decreasing = TRUE,
  use_var_names = FALSE,
  ...
)

stable(
  specification,
  x,
  y = NULL,
  sort = FALSE,
  decreasing = TRUE,
  use_var_names = FALSE,
  ...
)

Arguments

specification

A StudySpecification object

x

One of "treatment", "unit of assignment" (synonym "uoa"), or "block". Abbreviations are accepted. "unit of assignment" can be replaced by "unitid" or "cluster" if the StudySpecification was created with that element.

y

Optionally, another string similar to x. A 1-dimensional table is produced if y is left at its default, NULL.

sort

Ignored if y is not NULL. If FALSE (default), one-way table is sorted according to "names" of levels. If set to TRUE, one-way table is sorted according to values.

decreasing

If sort is TRUE, choose whether to sort descending (TRUE, default) or ascending (FALSE).

use_var_names

If TRUE, name dimensions of table returned by variable names. If FALSE (default), name by their function (e.g. "treatment" or "blocks"). Passing the dnn argument in ... (an argument of table()) overrides whatever is requested here.

...

additional arguments passed to table()

Value

A table of the requested variables.

Examples

data(simdata)
spec <- obs_spec(z ~ unit_of_assignment(uoa1, uoa2) + block(bid),
                  data = simdata)
specification_table(spec, "treatment")
specification_table(spec, "treatment", "block", sort = TRUE, use_var_names = TRUE)

`PreSandwichLayer` and `SandwichLayer` subsetting

Description

Return subset of a PreSandwichLayer or SandwichLayer which meets conditions.

Usage

## S4 method for signature 'PreSandwichLayer'
subset(x, subset)

## S4 method for signature 'PreSandwichLayer'
x[i]

Arguments

x

PreSandwichLayer or SandwichLayer object

subset

Logical vector identifying values to keep or drop

i

indices specifying elements to extract or replace. See help("[") for further details.

Value

x subset by subset or i

`WeightedStudySpecification` subsetting

Description

Provides functionality to subset the weights of a WeightedStudySpecification object.

Usage

## S4 method for signature 'WeightedStudySpecification'
subset(x, subset)

## S4 method for signature 'WeightedStudySpecification'
x[i]

Arguments

x

WeightedStudySpecification object

subset

Logical vector identifying values to keep or drop

i

indices specifying elements to extract or replace. See help("[") for further details.

Value

A WeightedStudySpecification object which is a subsetted version of x.

Summarizing `StudySpecification` objects

Description

summary() method for class StudySpecification.

Usage

## S3 method for class 'StudySpecification'
summary(object, ..., treatment_binary = TRUE)

## S3 method for class 'summary.StudySpecification'
print(x, ..., max_unit_print = 3)

Arguments

object

StudySpecification object, usually a result of a call to rct_spec(), obs_spec(), or rd_spec().

...

Ignored

treatment_binary

Should the treatment be dichotomized if object contains a dichotomy? Ignored if object does not contain a dichotomy.

x

summary.StudySpecification object, usually as a result of a call to summary.StudySpecification()

max_unit_print

Maximum number of treatment levels to print in treatment table

Value

The StudySpecification or summary.StudySpecification object, invisibly

Summarizing `teeMod` objects

Description

summary() method for class teeMod

Usage

## S3 method for class 'teeMod'
summary(object, vcov.type = "HC0", ...)

## S3 method for class 'summary.teeMod'
print(
  x,
  digits = max(3L, getOption("digits") - 3L),
  signif.stars = getOption("show.signif.stars"),
  ...
)

Arguments

object

teeMod object

vcov.type

A string indicating the desired variance estimator. See vcov_tee() for details on accepted types.

...

Additional arguments to vcov_tee(), such as the desired finite sample heteroskedasticity-robust standard error adjustment.

x

summary.teeMod object

digits

the number of significant digits to use when printing.

signif.stars

logical. If ‘TRUE’, ‘significance stars’ are printed for each coefficient.

Details

If a teeMod object is fit with a SandwichLayer offset, then the usual stats::summary.lm() output is enhanced by the use of covariance-adjusted sandwich standard errors, with t-test values recalculated to reflect the new standard errors.

Value

object of class summary.teeMod
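
Examples

A short sketch of requesting a non-default variance estimator when summarizing a fit. It assumes lmitt() accepts specification and data arguments and that simdata contains an outcome column y; neither is shown in this help page, so treat the call as illustrative rather than canonical.

```r
library(propertee)
data(simdata)

spec <- rct_spec(z ~ unit_of_assignment(uoa1, uoa2), data = simdata)

# Assumed interface: lmitt(formula, specification = , data = )
mod <- lmitt(y ~ 1, specification = spec, data = simdata)

# vcov.type is forwarded to vcov_tee(); "HC1" is one of the types listed there
s <- summary(mod, vcov.type = "HC1")

# The print method takes the usual digits / signif.stars controls
print(s, digits = 4, signif.stars = FALSE)
```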

Accessors and Replacers for `StudySpecification` objects

Description

Allows access to the elements which define a StudySpecification, enabling their extraction or replacement.

Usage

treatment(x, newdata = NULL, dichotomy = NULL, by = NULL, ...)

## S4 method for signature 'StudySpecification'
treatment(x, newdata = NULL, dichotomy = NULL, by = NULL, ...)

treatment(x) <- value

## S4 replacement method for signature 'StudySpecification'
treatment(x) <- value

units_of_assignment(x, newdata = NULL, by = NULL)

## S4 method for signature 'StudySpecification'
units_of_assignment(x, newdata = NULL, by = NULL)

units_of_assignment(x) <- value

## S4 replacement method for signature 'StudySpecification'
units_of_assignment(x) <- value

clusters(x, newdata = NULL, by = NULL)

## S4 method for signature 'StudySpecification'
clusters(x, newdata = NULL, by = NULL)

clusters(x) <- value

## S4 replacement method for signature 'StudySpecification'
clusters(x) <- value

unitids(x)

## S4 method for signature 'StudySpecification'
unitids(x)

unitids(x) <- value

## S4 replacement method for signature 'StudySpecification'
unitids(x) <- value

blocks(x, newdata = NULL, by = NULL, ...)

## S4 method for signature 'StudySpecification'
blocks(x, newdata = NULL, by = NULL, ..., implicit = FALSE)

blocks(x) <- value

## S4 replacement method for signature 'StudySpecification'
blocks(x) <- value

has_blocks(x)

forcings(x, newdata = NULL, by = NULL)

## S4 method for signature 'StudySpecification'
forcings(x, newdata = NULL, by = NULL)

forcings(x) <- value

## S4 replacement method for signature 'StudySpecification'
forcings(x) <- value

Arguments

x

a StudySpecification object

newdata

optional; an additional data.frame. If passed, and the unit of assignment variable is found in newdata, then the requested variable type for each unit of newdata is returned. See by argument if the name of the unit of assignment differs.

dichotomy

optional; a formula specifying how to dichotomize a non-binary treatment variable. See the Details section of the ett() or att() help pages for information on specifying this formula

by

optional; named vector or list connecting names of unit of assignment/unitid/cluster variables in x to unit of assignment/unitid/cluster variables in newdata. Names represent variables in x; values represent variables in newdata. Only needed if variable names differ.

...

ignored.

value

replacement. Either a vector/matrix of appropriate dimension, or a named data.frame if renaming variable as well. See Details.

implicit

Should a block-less StudySpecification return a constant 1 when extracting blocks?

Details

For treatment(), when argument binary is FALSE, the treatment variable passed into the StudySpecification is returned as a one-column data.frame regardless of whether it is binary or x has a dichotomy

If a dichotomy is passed, a binary one-column data.frame will be returned. If not and binary is TRUE, unless the StudySpecification has a binary treatment, treatment() will error. If binary is "ifany", it will return the original treatment in this case.

The one-column data.frame returned by treatment() is named as entered in the StudySpecification creation, but if a dichotomy is passed, the column name is "__z" to try to avoid any name conflicts.

For the value when using replacers, the replacement must have the same number of rows as the StudySpecification (the same number of units of assignment). The number of columns can differ (e.g., if the StudySpecification was defined with two variables uniquely identifying blocks, you can replace them with a single variable uniquely identifying blocks, as long as it respects other restrictions).

If the replacement value is a data.frame, the names of its columns are used as the new variable names. If the replacement is a matrix or vector, the original names are retained. If reducing the number of variables (e.g., moving from two identifying variables to a single variable), the appropriate number of variable names is retained. If increasing the number of variables, a data.frame with names must be provided.

Value

data.frame containing requested variable, or an updated StudySpecification. treatment() works slightly differently, see Details.

Examples

data(simdata)
spec <- obs_spec(z ~ unit_of_assignment(uoa1, uoa2), data = simdata)
blocks(spec) # empty
blocks(spec) <- data.frame(blks = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5))
blocks(spec)
blocks(spec) <- c(5, 5, 4, 4, 3, 3, 2, 2, 1, 1)
blocks(spec) # notice that variable is not renamed

Special terms in `StudySpecification` creation formula

Description

These are special functions used only in the definition of StudySpecification objects. They identify the units of assignment, blocks and forcing variables. They should never be used outside of the formula argument to obs_spec, rct_spec, or rd_spec.

Usage

unit_of_assignment(...)

unitid(...)

cluster(...)

uoa(...)

block(...)

forcing(...)

Arguments

...

any number of variables of the same length.

Details

These functions have no use outside of the formula in creating a StudySpecification.

unit_of_assignment, uoa, cluster and unitid are synonyms; you must include one and only one in each StudySpecification. The choice of which to use has no impact on any analysis; it affects only some output and the name of the stored element in the StudySpecification. Accessors/replacers (units_of_assignment, unitids, clusters) respect the choice made when the StudySpecification was created, and only the matching function will work.

See rct_spec, obs_spec, or rd_spec for examples of their usage.

Value

the variables with appropriate labels. No use outside of their inclusion in the formula argument to obs_spec, rct_spec, or rd_spec
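
Examples

A minimal sketch of how the choice of synonym determines which accessor applies. The specification below is created with cluster(), so clusters() is the matching accessor. The call to units_of_assignment() is wrapped in tryCatch() because the Details above state that only the appropriate function will work; the exact message it produces is not documented here, so none is shown.

```r
library(propertee)
data(simdata)

# Declare the units of assignment with the cluster() synonym
spec_cl <- rct_spec(z ~ cluster(uoa1, uoa2), data = simdata)

# The matching accessor returns the stored identifying variables
cl <- clusters(spec_cl)
head(cl)

# A non-matching accessor is expected not to work; capture the outcome
# instead of letting a possible error stop the script.
mismatch <- tryCatch(units_of_assignment(spec_cl),
                     error = function(e) conditionMessage(e))
mismatch
```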

Extract Variable Names from `StudySpecification`

Description

Methods to extract the names of the variables that fill each element of the structure of the StudySpecification (e.g., treatment, unit of assignment, etc.).

Usage

var_table(specification, compress = TRUE, report_all = FALSE)

var_names(specification, type, implicitBlocks = FALSE)

Arguments

specification

a StudySpecification object

compress

should multiple variables be compressed into a comma-separated string? Default TRUE. If FALSE, multiple columns can be created instead.

report_all

should we report all possible structures even if they don't exist in the StudySpecification? Default FALSE.

type

one of "t", "u", "b", "f"; for "treatment", "unit_of_assignment", "block", and "forcing" respectively

implicitBlocks

If the StudySpecification is created without blocks, setting this to TRUE will return ".blocks_internal" as the variable name corresponding to the blocks.

Details

When compress is TRUE, the result will always have two columns. When FALSE, the result will have number of columns equal to the largest number of variables in a particular role, plus one. E.g., a call such as rct_spec(z ~ unitid(a, b, c, d) ... will have 4+1=5 columns in the output matrix with compress = FALSE.

When report_all is TRUE, the matrix is guaranteed to have 3 rows (when the specification is an RCT or Obs) or 4 rows (when the specification is an RD), with empty variable entries as appropriate. When FALSE, the matrix will have a minimum of 2 rows (treatment and unit of assignment/unitid/cluster), with additional rows for blocks and forcing if included in the StudySpecification.

Value

var_table returns the requested table. var_names returns a vector of variable names.

Examples

spec <- rct_spec(z ~ uoa(uoa1, uoa2) + block(bid), data = simdata)
var_table(spec)
var_table(spec, compress = FALSE)
var_names(spec, "t")
var_names(spec, "u")
var_names(spec, "b")

Compute variance-covariance matrix for fitted `teeMod` model

Description

An S3 method for stats::vcov that computes standard errors for teeMod models using vcov_tee().

Usage

## S3 method for class 'teeMod'
vcov(object, ...)

Arguments

object

a fitted teeMod model

...

additional arguments to vcov_tee().

Details

vcov.teeMod() wraps around vcov_tee(), so additional arguments passed to ... will be passed to the vcov_tee() call. See documentation for vcov_tee() for information about necessary arguments.

Value

Variance/Covariance for `teeMod` objects

Description

Compute robust sandwich variance estimates with optional covariance adjustment

Usage

vcov_tee(x, type = NULL, cluster = NULL, ...)

.vcov_DB0(x, ...)

.vcov_DB(x, ...)

Arguments

x

a fitted teeMod model

type

a string indicating the desired bias correction for the residuals of x. Default makes no bias correction. See Details for supported types

cluster

a vector indicating the columns that define clusters. The default is the unit of assignment columns in the StudySpecification stored in x. These columns should appear in the dataframe used for fitting x as well as the dataframe passed to the covariance model fit in the case of prior covariance adjustment. See Details

...

arguments to be passed to the internal variance estimation function, such as cov_adj_rcorrect and loco_residuals. If x has a SandwichLayer object in its offset, the former specifies the bias correction to the residuals of the covariance model, and the latter indicates whether the offset should be replaced with predictions from leave-one-cluster-out fits of the covariance adjustment model. See Details.

Details

Variance estimates will be clustered on the basis of the columns provided to cluster (or obtained by the default behavior). As a result, providing "HCx" or "CRx" to type will produce the same variance estimate given that cluster remains the same.

With prior covariance adjustment, the column(s) provided to cluster must appear in the dataframes passed as the data arguments of both the covariance model fit and the fit of x, even if the clustering structure does not exist, per se, in the covariance adjustment sample. The one exception is when the two data arguments are the same and the StudySpecification of x was created with a formula of the form trt_col ~ 1. For instance, in a finely stratified randomized trial, one might desire standard errors clustered at the block level, but the covariance adjustment model may include auxiliary units that did not participate in the trial. In this case, in the data argument of the fitted covariance model, the column(s) passed to cluster should contain the block IDs for rows overlapping with the data argument used for fitting x, and NAs for any auxiliary units. vcov_tee() will treat each row with an NA as its own cluster.
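
The data preparation described above can be sketched in base R as follows. All data frames, column names and values here (trial, aux, covmod_data, id, x, y) are hypothetical and serve only to show where the NA entries go; no propertee functions are called.

```r
# Hypothetical trial sample: three blocks of two units each
trial <- data.frame(
  id  = 1:6,
  bid = c(1, 1, 2, 2, 3, 3),
  x   = c(0.2, 1.1, -0.4, 0.7, 1.5, -0.9),
  y   = c(1.0, 2.3, 0.1, 1.8, 2.9, -0.2)
)

# Hypothetical auxiliary units that did not take part in the trial
aux <- data.frame(
  id = 7:10,
  x  = c(0.3, -1.2, 0.8, 0.0),
  y  = c(1.1, -0.7, 1.9, 0.4)
)

# Auxiliary rows belong to no block, so the clustering column is NA
aux$bid <- NA_real_

# Data for the covariance model: trial rows keep their block IDs,
# auxiliary rows carry NA in the clustering column
covmod_data <- rbind(trial, aux[, names(trial)])

table(covmod_data$bid, useNA = "ifany")
```

A covariance model fit on covmod_data would then carry the bid column that a block-level cluster argument needs, with each NA row treated as its own cluster as described above.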

For ITT effect estimates without covariance adjustment, type corresponds to the variance estimate desired. Supported options include:

"MB0", "HC0", and "CR0" for model-based HC/CR0 standard errors
"MB1", "HC1", and "CR1" for model-based HC/CR1 standard errors (for "MB1" and "HC1", the correction factor is n/(n - 2), and for "CR1", it is g(n - 1)/((g - 1)(n - 2)), where g is the number of clusters in the sample used for fitting x)
"MB2", "HC2", and "CR2" for model-based HC/CR2 standard errors
"DB0" for design-based HC0 variance estimates
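
As a quick arithmetic check of the two type-1 factors quoted in the list above, the following base R sketch evaluates them for a hypothetical sample of n = 50 observations in g = 10 clusters. The values of n and g are assumptions chosen for illustration, and the snippet shows only the factors themselves, not how vcov_tee() applies them internally.

```r
n <- 50  # hypothetical number of observations used for fitting x
g <- 10  # hypothetical number of clusters

# Factor quoted for "MB1" and "HC1"
hc1_factor <- n / (n - 2)

# Factor quoted for "CR1"
cr1_factor <- g * (n - 1) / ((g - 1) * (n - 2))

round(c(HC1 = hc1_factor, CR1 = cr1_factor), 4)
```

With few clusters the "CR1" factor is noticeably larger than the "HC1" factor, and both approach 1 as n and g grow.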

The type argument does not correspond to existing variance estimators in the literature in the case of prior covariance adjustment. It specifies the bias correction to the residuals of x, but the residuals of the covariance model are corrected separately based on the cov_adj_rcorrect argument. The cov_adj_rcorrect argument takes the same options as type except "DB0". When the covariance model includes rows in the treatment condition for fitting, the residuals of x are further corrected by having the values of offset replaced by predictions that use coefficient estimates that leave out rows in the same cluster (as defined by the cluster argument).

The design-based variance estimates can be calculated for teeMod models satisfying the following requirements:

The model uses rct_spec as StudySpecification
The model only estimates a main treatment effect
Inverse probability weighting is incorporated

Value

Extract Weights from `WeightedStudySpecification`

Description

A WeightedStudySpecification object contains a numeric vector along with a few additional slots; this method extracts only the numeric vector.

Usage

## S4 method for signature 'WeightedStudySpecification'
weights(object, ...)

Arguments

object

a WeightedStudySpecification object

...

Ignored

Value

A numeric vector of the weights

WeightedStudySpecification Operations

Description

Usage

Arguments

Details

Value

Return ..uoa.. column

Description

Usage

Arguments

Value

(Internal) Helper function for design-based meat matrix calculation

Description

Usage

(Internal) Aggregate weights and outcomes to cluster level

Description

Usage

Arguments

Details

Value

(Internal) Align the dimensions and rows of direct adjustment and covariance adjustment model estimating equations matrices

Description

Usage

Arguments

Details

Value

(Internal) Applies dichotomy to treatment

Description

Usage

Arguments

Value

Convert object to data.frame or produce meaningful error

Description

Usage

Arguments

Value

(Internal) Extract empirical estimating equations from a teeMod model using the S3 method associated with its .S3Class slot

Description

Usage

Arguments

Value

(Internal) Extracts treatment as binary vector

Description

Usage

Arguments

Details

Value

(Internal) A few checks to ensure by= is valid

Description

Usage

Arguments

Value

(Internal) Replace standard errors for moderator effect estimates with insufficient degrees of freedom with NA

Description

Usage

Arguments

Value

(Internal) Perform checks on formula for creation of StudySpecification.

Description

Usage

Arguments

Value

Compute the degrees of freedom of a sandwich standard error with HC2 correction

Description

Usage

References

Compute residuals for a teeMod object with leave-one-out estimates of the offset

Description

Usage

Arguments

Details

Produce confidence intervals for linear models

Description

Usage
