The PMLE4SCR package implements two-stage pseudo maximum
likelihood estimation (PMLE) for copula-based regression models with
semi-competing risks data. In semi-competing risks data, a non-terminal
event (e.g., disease relapse) is subject to dependent censoring by a
terminal event (e.g., death), but not vice versa.
The package provides two main functions:
PMLE4SCR(): Two-stage pseudo maximum likelihood
estimationMLE4SCR(): Simultaneous maximum likelihood
estimationWe illustrate the package using the Bone Marrow Transplant (BMT)
dataset from the SemiCompRisks package. This dataset
contains 137 patients with acute leukemia aged 7 to 52. The non-terminal
event is leukemia relapse and the terminal event is death. Three disease
groups are considered: AML-low risk, AML-high risk, and ALL (acute
lymphoblastic leukemia), with AML-low as the baseline group.
library(PMLE4SCR)
data(BMT, package = "SemiCompRisks")
BMT$g <- factor(BMT$g, levels = c(2, 3, 1),
labels = c("AML-low", "AML-high", "ALL"))The key variables in the dataset are described below:
| Variable | Description |
|---|---|
| T1 | Time to death (terminal event) |
| T2 | Time to leukemia relapse (non-terminal event) |
| delta1 | Death indicator (1 = observed, 0 = censored) |
| delta2 | Relapse indicator (1 = observed, 0 = censored) |
| g | Disease group (AML-low, AML-high, ALL) |
A preview of the first six observations:
We fit a copula-based model using the Clayton copula, with
proportional hazards (PH) marginals for both relapse and death. The
disease group g is included as a covariate for both
marginals and the copula parameter.
The estimated copula parameters and corresponding Kendall’s \(\tau\) values measure the dependence between relapse and death times for each disease group. A larger \(\tau\) indicates stronger dependence.
| para | est | se |
|---|---|---|
| (Intercept) | 2.169 | 0.553 |
| gAML-high | -0.271 | 0.812 |
| gALL | -0.282 | 0.791 |
The estimated regression coefficients for the marginal distribution of relapse time. Positive values indicate higher risk of relapse compared to the AML-low baseline group.
| Group | Estimate | SE |
|---|---|---|
| AML-high | 1.168 | 0.311 |
| ALL | 0.710 | 0.324 |
| Group | Estimate | SE |
|---|---|---|
| AML-high | 1.022 | 0.276 |
| ALL | 0.611 | 0.285 |
For comparison, we also fit the simultaneous MLE and compare the estimates from both methods side by side.
myfit_mle <- MLE4SCR(BMT, time = "T2", death = "T1",
status_time = "delta2", status_death = "delta1",
T.fmla = ~ g, D.fmla = ~ g,
copula.family = "Clayton",
copula.control = list(formula = ~ g))| Parameter | Estimate | SE | Estimate | SE |
|---|---|---|---|---|
| Copula (Intercept) | 2.169 | 0.553 | 2.213 | 0.537 |
| Copula (AML-high) | -0.271 | 0.812 | -0.270 | 0.803 |
| Copula (ALL) | -0.282 | 0.791 | -0.293 | 0.770 |
| Relapse: AML-high | 1.168 | 0.311 | 1.116 | 0.306 |
| Relapse: ALL | 0.710 | 0.324 | 0.669 | 0.320 |
| Death: AML-high | 1.022 | 0.276 | 1.022 | 0.276 |
| Death: ALL | 0.611 | 0.285 | 0.611 | 0.285 |
The two methods produce similar estimates. The PMLE is computationally more efficient and robust against copula misspecification for the marginal parameters.
Arachchige, S. J., Chen, X., and Zhou, Q. M. (2025). Two-stage pseudo maximum likelihood estimation of semiparametric copula-based regression models for semi-competing risks data. Lifetime Data Analysis, 31, 52-75.