---
title: "TestREnlme: Testing Random Effects in Linear and Nonlinear Mixed-Effects Models"
author: "Germaine Uwimpuhwe, Reza Drikvandi, Shelley A. Blozis"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{TestREnlme}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment   = "#>",
  fig.width  = 7,
  fig.height = 4,
  warning   = FALSE,
  message   = FALSE
)
```

# Introduction
**TestREnlme** provides a nonparametric framework for testing random effects in
linear mixed-effects models (LMEMs) and nonlinear mixed-effects models (NLMEMs)
with additive random error. The package implements the permutation tests of
Uwimpuhwe, Drikvandi and Blozis (2026, *Statistics inMedicine*, 
doi:[10.1002/sim.70605](https://doi.org/10.1002/sim.70605)), allowing users to 
test all random effects jointly or selected subsets of random effects without
relying on distributional assumptions. The package provides three distribution-free 
variance component estimators:

- **VLS (Variance Least Squares, default)**: does not require subject-specific
  model fitting and is robust to convergence issues associated with individual-level estimation.
- **MM (Method of Moments, two-stage)**: a two-stage estimator that uses subject-specific 
  nonlinear least-squares fits in the first stage.
- **MMF (Method of Moments with two-stage and First-Order Approximation)**: a two-stage estimator
  incorporating a first-order approximation while retaining the two-stage estimation framework.

Key features of **TestREnlme**:

- No distributional assumptions on random effects or errors.
- Tests all random effects jointly or any user-specified subset.
- Supports the inclusion of subject-level covariates when applicable.
- All plots are `ggplot2` objects, directly customisable.

The package can be installed from CRAN using:

```{r install, eval = FALSE}
install.packages("TestREnlme")
```

```{r load}
library(TestREnlme)
```


# Model Formulation

The NLMEM is
$$
Y_i = f_i(a_i, \gamma) + \varepsilon_i, \qquad a_i = A_i\beta + b_i,
$$
where $f_i(\cdot)$ is a known nonlinear function, $\gamma$ is a vector of 
common fixed parameters, and $\beta$ contains fixed effects paired with 
random effects structure. The random effects $b_i$ have mean zero and 
covariance $D_* = \sigma^2 D$, and the random errors $\varepsilon_i$ have
mean zero and variance $\sigma^2$. if $f_i(\cdot)=X_i\beta+Z_ib_i$ the
NLMEM reduces to the LMEM.

The model is defined through the following arguments:

- `Expr`: the formula for $f_i(a_i, \gamma)$, using subject-specific parameter
   names (e.g.\ `ai1`, `ai2`, `ai3`) together with any common parameters $\gamma$
- `random`: a character vector of two-sided formula strings mapping each
  subject-specific parameter to its fixed + random structure
  (e.g.\ `"ai1 ~ B1 + bi1"` meaning $a_{i1} = \beta_1 + b_{i1}$)
- `start`: a named numeric vector of starting values for all parameters in 
  `Expr`. If not supplied, the package automatically computes suitable 
  initial values.
  
  
# Theophylline Pharmacokinetic Data

## Data and Model

The built-in `Theoph` dataset contains serum concentration measurements for
12 patients. The one-compartment pharmacokinetic model is:
$$
Y_{ij} = 
 \frac{D_i \exp(a_{i2} + a_{i3} - a_{i1}) \big[\exp(-e^{a_{i3}} T_{ij}) - \exp(-e^{a_{i2}} T_{ij})\big]}
           {\exp(a_{i2}) - \exp(a_{i3})} + \varepsilon_{ij}
$$
where $a_{i1} = \beta_1 + b_{i1}$, $a_{i2} = \beta_2 + b_{i2}$, and
$a_{i3} = \beta_3 + b_{i3}$ are the subject-specific log clearance,
log absorption rate, and log elimination rate, respectively.

## Exploratory Plot

```{r theoph-profiles, fig.cap = "Individual concentration profiles with mean (black and bold)."}
d      <- as.data.frame(Theoph)
p <- plot_profiles(d, group = "Subject", time = "Time",
                   response = "conc",
                   title = "Theophylline: individual profiles")
## Colour by subject using ggplot2 customisation
p + ggplot2::aes(colour = .data$`.group`) +
    ggplot2::scale_colour_manual(values = rainbow(12), guide = "none")
```


## Testing All Random Effects

### Step 1: Model preparation 
```{r theoph-setup}
Expr   <- conc ~ Dose * exp(ai2 + ai3 - ai1) *
            (exp(-Time * exp(ai3)) - exp(-Time * exp(ai2))) /
            (exp(ai2) - exp(ai3))

random <- c("ai1 ~ B1 + bi1",
            "ai2 ~ B2 + bi2", 
            "ai3 ~ B3 + bi3")

start  <- c(ai1 = -3.22, ai2 = 0.47, ai3 = -2.45)

```




### Step 2: Variance components estimation

```{r theoph-vls}
## VLS (default: no subject-specific estimation)
DVLS <- Dmethod(d, Expr, group = "Subject",
                random = random, start = start,
                verbose = 0)
summary(DVLS)
```

When both MM and MMF are needed on the same data, the expensive first-stage NLS
fits can be computed once via `MM_base()` and shared between both methods,
substantially reducing computation time:

```{r theoph-mm}
mb   <- MM_base(d, Expr, group = "Subject",
                random = random, start = start,
                kappa_max = 1e4, verbose = 1)
DMM  <- Dmethod(d, Expr, group = "Subject",
                random = random, start = start,
                method = "MM", MM_base_obj = mb, verbose = 0)


DMMF <- Dmethod(d, Expr, group = "Subject",
                random = random, start = start,
                method = "MMF", MM_base_obj = mb, verbose = 0)
summary(DMM)
summary(DMMF)
## Compare variance component estimates across methods
plot_variance(list(VLS = DVLS, MM = DMM, MMF = DMMF), component = "diagonal")
```


### Step 3: Bootstrap standard errors.
Once variance components have been estimated, uncertainty in all model
parameters can be assessed using `bootstrap_se()`:

```{r theoph-boot, eval = FALSE}
BVLS <- bootstrap_se(DVLS, nboot = 200, seed = 42)
print(BVLS)
```



### Step 4: Permutation tests

We consider three permutation tests: (i) testing all random effects jointly,
(ii) testing $b_{i1}$ and $b_{i3}$ while retaining $b_{i2}$, and (iii) testing
$b_{i3}$ alone while retaining $b_{i1}$ and $b_{i2}$. Since all tests use the
same variance estimate, `Dhatt` is computed once via `Dmethod()` and reused
across all calls. 

```{r theoph-tests}
## Test 1: H0: D = 0 (all RE absent)
HVLS <- Dhypothesis_test(d, Expr, group = "Subject",
                          random = random, start = start,
                          Dhatt = DVLS, nperm = 200, seed = 1,
                          verbose = 0)
cat("Test 1 - p-value:", HVLS$pvalue, "| Decision:", HVLS$Decision, "\n")

## Test 2: H0: d11 = d13 = d33 = 0 given  bi2 retained in the model
HVLS13 <- Dhypothesis_test(d, Expr, group = "Subject",
                            random = random, start = start,
                            bi_out = c("bi1", "bi3"),
                            Dhatt = DVLS, nperm = 200, seed = 1,
                            verbose = 0)
cat("Test 2 - p-value:", HVLS13$pvalue, "| Decision:", HVLS13$Decision, "\n")

## Test 3: H0: d33 = 0 given bi1 and bi2 retained in the model
HVLS3 <- Dhypothesis_test(d, Expr, group = "Subject",
                           random = random, start = start,
                           bi_out = "bi3",
                           Dhatt = DVLS, nperm = 200, seed = 1,
                           verbose = 0)
cat("Test 3 - p-value:", HVLS3$pvalue, "| Decision:", HVLS3$Decision, "\n")
```

###  Permutation histograms

```{r theoph-hist, fig.width = 10, fig.height = 4, fig.cap = "Permutation null distributions for the three tests."}
library(ggpubr)
ggarrange(
  plot_perm_hist(HVLS,   title = "H0: D = 0"),
  plot_perm_hist(HVLS13, title = "H0: d11=d13=d33=0"),
  plot_perm_hist(HVLS3,  title = "H0: d33=0"),
  nrow = 1
)
```

Test 1 and Test 2 strongly reject $H_0$ ($p < 0.001$). Test 3 fails to reject
($p =$ `r HVLS3$pvalue`), suggesting the elimination rate $K_{ei}$ can be
treated as a common fixed parameter across subjects.


## Reduced Random Effect Strureture.
Since $b_{i3}$ is not required, we refit the model retaining only $b_{i1}$
and $b_{i2}$. The parameter $B_3$ now appears directly in `Expr2` as a common
fixed parameter.

### Step 1: Model preparation 
```{r theoph-reduced-step1}
Expr2   <- conc ~ Dose * exp(ai2 + B3 - ai1) *
             (exp(-Time * exp(B3)) - exp(-Time * exp(ai2))) /
             (exp(ai2) - exp(B3))
random2 <- c("ai1 ~ B1 + bi1",
             "ai2 ~ B2 + bi2")
start2 <- c(ai1 = -3.22, ai2 = 0.47, B3 = -2.45)
```

### Step 2: Variance components estimation
```{r theoph-reduced-step2}
DVLS_C <- Dmethod(d, Expr2, group = "Subject",
                  random = random2, start = start2,
                  verbose = 0)
summary(DVLS_C)
```

#### Diagnostic plots
Observed versus fitted values can be visualised using `plot_fitted()`:
In the example below, we set `subjects = NULL` (the default), meaning that all
subjects are plotted. Alternatively, a subset of subjects can be selected for
visualisation.

```{r theoph-fitted, fig.cap = "Observed vs fitted values for all 12 subjects (reduced model)."}
plot_fitted(DVLS_C, time = "Time", subjects = NULL, overlay = TRUE)
```

Calling `plot()` on a `Dmethod` object produces all relevant diagnostic plots
(profiles, fitted values, raw and standardised residuals; and condition-number
plot for VLS based  objects):


```{r s3-dmethod, eval = FALSE, }
## Shows: profiles, fitted, residuals (raw + standardised)
## For MM/MMF also shows condition-number plot
plot(x=DVLS_C, time = "Time")
```

#### Condition-number diagnostics
For MM or MMF fits, `plot_condition()` shows the per-subject condition numbers
with the exclusion threshold. In the reduced model, all 12 subjects are well
below the threshold `kappa_max = 1e4`:

```{r theoph-cond, fig.cap = "Condition numbers for the reduced MM model. All subjects are below the exclusion threshold."}
DMM_C <- Dmethod(d, Expr2, group = "Subject",
                 random = random2, start = start2,
                 method = "MM", verbose = 0)
plot_condition(DMM_C, kappa_max = 1e4)
```


## Step 3: Bootstrap standard errors.
Bootstrap standard errors follow the same codes as in the full model, 
replacing `DVLS` with `DVLS_C` in the `bootstrap_se()` call.


## Step 4: Permutation

Both retained random effects are highly significant:

```{r theoph-reduced-test}
## Test for the need of all random effects in the reduced model
HC <- Dhypothesis_test(d, Expr2, group = "Subject",
                        random = random2, start = start2,
                        Dhatt = DVLS_C, nperm = 200, seed = 1,
                        verbose = 0)
cat("Testing of all RE (bi1, bi2) - p-value:", HC$pvalue, "| Decision:", HC$Decision, "\n")

# ## Test for the need of bi1 given the presence of bi2 in the reduced model
HC1 <- Dhypothesis_test(d, Expr2, group = "Subject",
                        random = random2, start = start2,bi_out = "bi1",
                        Dhatt = DVLS_C, nperm = 200, seed = 1,
                        verbose = 0)
cat("Testing bi1 give bi2 in the model- p-value:", HC1$pvalue, "| Decision:", HC1$Decision, "\n")
```



## Skill Acquisition Data 

The `SkillAcq` dataset, available in the **TestREnlme** Package, contains 
log-transformed response latencies for $N = 204$ subjects across $J = 11$
trial blocks, with a subject-level working memory (WM) covariate `wm2`. 
The nonlinear learning model is:
$$
Y_{ij} = a_{i1} - (a_{i1} + a_{i0})\exp(a_{i2}\,T_{ij}) + \varepsilon_{ij}
$$
where $a_{i0}$ is the initial offset, $a_{i1}$ the asymptote, and $a_{i2}$
the learning rate. Each parameter has a WM regression component:
$a_{ik} = \beta_{0k} + \beta_{1k} WM2_i + b_{ik}$.

```{r skill-setup}
## Reshape from wide to long format
library(reshape2)
data("SkillAcq")
qrt1 <- melt(SkillAcq, id.vars = c("id", "wm2"),
             variable.name = "ly", value.name = "Y")
qrt1$t <- as.numeric(sub("ly", "", qrt1$ly))


#model formulation
Expr_learn  <- Y ~ ai1 - (ai1 + ai0) * exp(ai2 * t)
random_learn <- c("ai0 ~ B0 + B01 * wm2 + bi0",
                  "ai1 ~ B1 + B11 * wm2 + bi1",
                  "ai2 ~ B2 + B21 * wm2 + bi2")
```

We first fit the full with MM to assess subject-specific convergence, 
and the starting values are automatically compute.

```{r skill-full}
## MM estimator 
DMM_all <- Dmethod(data=qrt1, Expr=Expr_learn, group = "id",
                   random = random_learn, method = "MM")
summary(DMM_all)
```

## Full Model (Three Random Effects)

The full model excludes 61 subjects (24 high-kappa + 37 non-converged);
this is also the case for the MMF approach. The figure below, produced
using `plot_condition()`, displays the condition-number diagnostic, which
illustrates this practical situation where the default method **VLS**
should be used (as it uses all 204 subjects) and not MM nor MMF. The
subsequent analysis should therefore proceed using VLS following Steps 1 to 4
as shown in the theophylline example above.


```{r skill-full-plot , fig.cap="Skill acquisition data: the per-subject condition numbers from MM for the full three-random-effects model, sorted in decreasing order for converged subjects. The subjects above the dashed line (red dots) were excluded from MM second-stage estimation, and the subjects below (blue dots) were retained and used in the analysis."}

## Condition number plot -- subjects above threshold excluded
set.seed(123)
NS <- sort(sample(1:204, 20))
plot_condition(DMM_all, kappa_max = 1e4)
```





# Summary of Key Functions

| Function | Purpose |
|---|---|
| `Dmethod()` | Estimate $\hat{D}_*$ and $\hat{\sigma}^2$ (VLS, MM, MMF) |
| `Tstat()` | Compute observed test statistic $T_{\rm obs}$ |
| `Dhypothesis_test()` | Run permutation test; returns $p$-value and histogram |
| `MM_base()` | Pre-compute shared first-stage for MM and MMF |
| `bootstrap_se()` | Bootstrap standard errors for all estimates |
| `plot_profiles()` | Individual profile plot with mean |
| `plot_fitted()` | Fitted curves overlaid on observed data |
| `plot_residuals()` | Residual diagnostic plots |
| `plot_condition()` | Condition number diagnostic (MM/MMF) |
| `plot.Dmethod()` |S3 plotting method for\code{Dmethod} objects: produces profiles, fitted values, residuals, and condition numbers|
| `plot_variance()` | Compare variance estimates across methods |
| `plot_perm_hist()` | Permutation null distribution histogram |

# References

Uwimpuhwe, G., Drikvandi, R., and Blozis, S.A. (2026).
Testing random effects in nonlinear mixed-effects models.
*Statistics in Medicine*.
<https://doi.org/10.1002/sim.70605>

Uwimpuhwe, G., Drikvandi, R. and Blozis, S. A. (in preparation).
TestREnlme: An R Package for Testing Random Effects in Nonlinear
Mixed-Effects Models.

Demidenko, E. (2013). *Mixed Models: Theory and Applications with R*. Wiley.

Drikvandi, R., Verbeke, G., Khodadadi, A. and Nia, V. P. (2013).
Testing multiple variance components in linear mixed-effects models.
*Biostatistics*, **14**(1), 144--159.
