| Type: | Package |
| Title: | Sparse Varying Coefficient BART with Global-Local Priors" |
| Version: | 1.0.0 |
| Date: | 2026-05-13 |
| Description: | Fits sparse linear varying coefficient models (VCMs), which assert a linear relationship between an outcome and several covariates that is allowed to change as functions of additional variables known as effect modifiers. Designed for high-dimensional settings where the number of covariates (i.e., number of slopes) is comparable to or larger than the number of observations. Approximates the coefficient functions using a version of Bayesian Additive Regression Trees that can perform global-local shrinkage. For more details see Ghosh, Bhogale, and Deshpande (2026+) <doi:10.48550/arXiv.2510.08204>. |
| URL: | https://github.com/ghoshstats/sparseVCBART |
| License: | GPL (≥ 3) |
| LinkingTo: | Rcpp, RcppArmadillo |
| Imports: | Rcpp, MASS |
| NeedsCompilation: | yes |
| Packaged: | 2026-05-13 15:39:34 UTC; sameer |
| Author: | Soham Ghosh [aut],
Sameer K. Deshpande
|
| Maintainer: | Sameer K. Deshpande <sameer.deshpande@wisc.edu> |
| Repository: | CRAN |
| Date/Publication: | 2026-05-19 07:30:02 UTC |
Compute posterior predictive evaluates of covariate effect functions.
Description
Given an object returned by VCBART_ind or VCBART_cs and matrices of continuous and categorical modifiers, returns MCMC samples of the coefficient functions evaluated the provided points.
Usage
predict_betas(fit,
Z_cont = matrix(0, nrow = 1, ncol = 1),
Z_cat = matrix(0, nrow = 1, ncol = 1),
verbose = TRUE)
Arguments
fit |
A list returned by |
Z_cont |
Matrix of continuous modifiers at which you wish to evaluate the covariate effect functions. Default is a 1x1 matrix, which signals that no continuous modifiers are required for these evaluations. |
Z_cat |
Integer matrix of categorical modifiers at which you wish to evaluate the covariate effect functions. Default is a 1x1 matrix, which signals that no continuous modifiers are required for these evaluations. |
verbose |
Boolean indicating whether the code should print its progress ( |
Value
An array of size nd x N x (p+1) where nd is the total number of MCMC draws, N is the total number of points at which you are evaluating the covariate effect functions (i.e. nrow(Z_cont) or nrow(Z_cat)), and p is the number of covariates.
Note that the intercept function is included as the first slice in the third dimension.
Fit a sparse VCBART model with compound symmetry error structure
Description
Fit a sparse varying coefficient model by approximating each coefficient function with an ensemble of regression trees. Global-local priors on the regression tree outputs facilitate adaptive shrinkage of the coefficient functions. Assumes a compound symmetry error structure in which the residual errors for a given subject are equally correlated. This is equivalent to assuming that there is a normally distributed random effect per subject.
Usage
sparseVCBART_cs(Y_train,subj_id_train, ni_train,X_train,
Z_cont_train = matrix(0, nrow = 1, ncol = 1),
Z_cat_train = matrix(0L, nrow = 1, ncol = 1),
X_test = matrix(0, nrow = 1, ncol = 1),
Z_cont_test = matrix(0, nrow = 1, ncol = 1),
Z_cat_test = matrix(0, nrow = 1, ncol = 1),
unif_cuts = rep(TRUE, times = ncol(Z_cont_train)),
cutpoints_list = NULL,
cat_levels_list = NULL,
edge_mat_list = NULL,
graph_split = rep(FALSE, times = ncol(Z_cat_train)),
sparse = TRUE,
rho = 0.9,
M = 50,
mu0 = NULL, tau = NULL, nu = NULL, lambda = NULL,
nd = 1000, burn = 1000, thin = 1,
save_samples = TRUE, save_trees = TRUE,
verbose = TRUE, print_every = floor( (nd*thin + burn)/10))
Arguments
Y_train |
Vector of continous responses for training data |
ni_train |
Vector containing the number of observations per subject in the training data. |
subj_id_train |
Vector of length |
X_train |
Matrix of covariates for training observations. Do not include intercept as the first column. |
Z_cont_train |
Matrix of continuous modifiers for training data. Note, modifiers must be rescaled to lie in the interval [-1,1]. Default is a 1x1 matrix, which signals that there are no continuous modifiers in the training data. |
Z_cat_train |
Integer matrix of categorical modifiers for training data. Note categorical levels should be 0-indexed. That is, if a categorical modifier has 10 levels, the values should run from 0 to 9. Default is a 1x1 matrix, which signals that there are no categorical modifiers in the training data. |
X_test |
Matrix of covariate for testing observations. Default is a 1x1 matrix, which signals that testing data is not provided. |
Z_cont_test |
Matrix of continuous modifiers for testing data. Default is a 1x1 matrix, which signals that testing data is not provided. |
Z_cat_test |
Integer matrix of categorical modifiers for testing data. Default is a 1x1 matrix, which signals that testing data is not provided. |
unif_cuts |
Vector of logical values indicating whether cutpoints for each continuous modifier should be drawn from a continuous uniform distribution ( |
cutpoints_list |
List of length |
cat_levels_list |
List of length |
edge_mat_list |
List of adjacency matrices if any of the categorical modifiers are network-structured. Default is |
graph_split |
Vector of logicals indicating whether each categorical modifier is network-structured. Default is |
sparse |
Logical, indicating whether or not to perform variable selection in each tree ensemble based on a sparse Dirichlet prior rather than uniform prior; see Linero 2018. Default is |
rho |
Initial auto-correlation parameter for compound symmetry error structure. Must be between 0 and 1. Default is 0.9. |
M |
Number of trees in each ensemble. Default is 50. |
mu0 |
Prior mean for jumps/leaf parameters. Default is 0 for each beta function. If supplied, must be a vector of length |
tau |
Prior standard deviation for jumps/leaf parameters. Default is |
nu |
Degrees of freedom for scaled-inverse chi-square prior on sigma^2. Default is 3. |
lambda |
Scale hyperparameter for scaled-inverse chi-square prior on sigma^2. Default places 90% prior probability that sigma is less than |
nd |
Number of posterior draws to return. Default is 1000. |
burn |
Number of MCMC iterations to be treated as "warmup" or "burn-in". Default is 1000. |
thin |
Number of post-warmup MCMC iteration by which to thin. Default is 1. |
save_samples |
Logical, indicating whether to return all posterior samples. Default is |
save_trees |
Logical, indicating whether or not to save a text-based representation of the tree samples. This representation can be passed to |
verbose |
Logical, inciating whether to print progress to R console. Default is |
print_every |
As the MCMC runs, a message is printed every |
Details
Given p covariates X_{1}, \ldots, X_{p} and r effect modifiers Z_{1}, \ldots, Z_{r}, the varying coefficient model asserts that
E[Y \vert X = x, Z = ] = \beta_0(z) + \beta_1(z) * x_1 + ... \beta_p(z) * X_p.
That is, for any r-vector Z, the relationships between X and Y is linear.
However, the specific relationship is allowed to vary with respect tp Z.
This function approximates the functions \beta_{j}(z) using ensembles of regression trees and specifies global-local priors on the regression tree outputs.
Thanks to the ensuing adaptive shrinkage, this function can estimate sparse varying coefficient models, which are useful in settings where p is comparable to or greater than n.
This function assumes that the within-subject errors are equi-correlated (i.e. a compound symmetry error structure).
Value
A list containing
y_mean |
Mean of the training observations (needed by |
y_sd |
Standard deviation of the training observations (needed by |
x_mean |
Vector of means of columns of |
x_sd |
Vector of standard deviations of |
yhat.train.mean |
Vector containing posterior mean of evaluations of regression function E[y|x,z] on training data. |
betahat.train.mean |
Matrix with |
yhat.train |
Matrix with |
betahat.train |
Array of dimension with |
yhat.test.mean |
Vector containing posterior mean of evaluations of regression function E[y|x,z] on testing data. |
betahat.test.mean |
Matrix with |
yhat.test |
Matrix with |
betahat.test |
Array of size |
sigma |
Vector containing ALL samples of the residual standard deviation, including warmup. |
rho |
Vector containing ALL samples of the auto-correlation parameter rho, including warmup. |
varcounts |
Array of size |
theta |
If |
trees |
A list (of length |
Fit a sparse VCBART model with independent error structure
Description
Fit a sparse varying coefficient model by approximating each coefficient function with an ensemble of regression trees. Global-local priors on the regression tree outputs facilitate adaptive shrinkage of the coefficient functions. Assumes residual errors are independent within and between subjects.
Usage
sparseVCBART_ind(Y_train,subj_id_train, ni_train,X_train,
Z_cont_train = matrix(0, nrow = 1, ncol = 1),
Z_cat_train = matrix(0L, nrow = 1, ncol = 1),
X_test = matrix(0, nrow = 1, ncol = 1),
Z_cont_test = matrix(0, nrow = 1, ncol = 1),
Z_cat_test = matrix(0, nrow = 1, ncol = 1),
unif_cuts = rep(TRUE, times = ncol(Z_cont_train)),
cutpoints_list = NULL,
cat_levels_list = NULL,
edge_mat_list = NULL,
graph_split = rep(FALSE, times = ncol(Z_cat_train)),
sparse = TRUE,
M = 50,
mu0 = NULL, tau = NULL, nu = NULL, lambda = NULL,
nd = 1000, burn = 1000, thin = 1,
save_samples = TRUE, save_trees = TRUE,
verbose = TRUE, print_every = floor( (nd*thin + burn)/10))
Arguments
Y_train |
Vector of continous responses for training data |
ni_train |
Vector containing the number of observations per subject in the training data. |
subj_id_train |
Vector of length |
X_train |
Matrix of covariates for training observations. Do not include intercept as the first column. |
Z_cont_train |
Matrix of continuous modifiers for training data. Note, modifiers must be rescaled to lie in the interval [-1,1]. Default is a 1x1 matrix, which signals that there are no continuous modifiers in the training data. |
Z_cat_train |
Integer matrix of categorical modifiers for training data. Note categorical levels should be 0-indexed. That is, if a categorical modifier has 10 levels, the values should run from 0 to 9. Default is a 1x1 matrix, which signals that there are no categorical modifiers in the training data. |
X_test |
Matrix of covariate for testing observations. Default is a 1x1 matrix, which signals that testing data is not provided. |
Z_cont_test |
Matrix of continuous modifiers for testing data. Default is a 1x1 matrix, which signals that testing data is not provided. |
Z_cat_test |
Integer matrix of categorical modifiers for testing data. Default is a 1x1 matrix, which signals that testing data is not provided. |
unif_cuts |
Vector of logical values indicating whether cutpoints for each continuous modifier should be drawn from a continuous uniform distribution ( |
cutpoints_list |
List of length |
cat_levels_list |
List of length |
edge_mat_list |
List of adjacency matrices if any of the categorical modifiers are network-structured. Default is |
graph_split |
Vector of logicals indicating whether each categorical modifier is network-structured. Default is |
sparse |
Logical, indicating whether or not to perform variable selection in each tree ensemble based on a sparse Dirichlet prior rather than uniform prior; see Linero 2018. Default is |
M |
Number of trees in each ensemble. Default is 50. |
mu0 |
Prior mean for jumps/leaf parameters. Default is 0 for each beta function. If supplied, must be a vector of length |
tau |
Prior standard deviation for jumps/leaf parameters. Default is |
nu |
Degrees of freedom for scaled-inverse chi-square prior on sigma^2. Default is 3. |
lambda |
Scale hyperparameter for scaled-inverse chi-square prior on sigma^2. Default places 90% prior probability that sigma is less than |
nd |
Number of posterior draws to return. Default is 1000. |
burn |
Number of MCMC iterations to be treated as "warmup" or "burn-in". Default is 1000. |
thin |
Number of post-warmup MCMC iteration by which to thin. Default is 1. |
save_samples |
Logical, indicating whether to return all posterior samples. Default is |
save_trees |
Logical, indicating whether or not to save a text-based representation of the tree samples. This representation can be passed to |
verbose |
Logical, inciating whether to print progress to R console. Default is |
print_every |
As the MCMC runs, a message is printed every |
Details
Given p covariates X_{1}, \ldots, X_{p} and r effect modifiers Z_{1}, \ldots, Z_{r}, the varying coefficient model asserts that
E[Y \vert X = x, Z = ] = \beta_0(z) + \beta_1(z) * x_1 + ... \beta_p(z) * X_p.
That is, for any r-vector Z, the relationships between X and Y is linear.
However, the specific relationship is allowed to vary with respect tp Z.
This function approximates the functions \beta_{j}(z) using ensembles of regression trees and specifies global-local priors on the regression tree outputs.
Thanks to the ensuing adaptive shrinkage, this function can estimate sparse varying coefficient models, which are useful in settings where p is comparable to or greater than n.
This function assumes that the within-subject errors are independent.
Value
A list containing
y_mean |
Mean of the training observations (needed by |
y_sd |
Standard deviation of the training observations (needed by |
x_mean |
Vector of means of columns of |
x_sd |
Vector of standard deviations of |
yhat.train.mean |
Vector containing posterior mean of evaluations of regression function E[y|x,z] on training data. |
betahat.train.mean |
Matrix with |
yhat.train |
Matrix with |
betahat.train |
Array of dimension with |
yhat.test.mean |
Vector containing posterior mean of evaluations of regression function E[y|x,z] on testing data. |
betahat.test.mean |
Matrix with |
yhat.test |
Matrix with |
betahat.test |
Array of size |
sigma |
Vector containing ALL samples of the residual standard deviation, including warmup. |
varcounts |
Array of size |
theta |
If |
trees |
A list (of length |
References
Ghosh, S., Bhogale, S., and Deshpande, S.K (2026+). Fitting sparse high-dimensional varying-coefficient models with Bayesian regression tree ensembles. arXiv preprint arXiv:2510.08204. doi:10.48550/arXiv.2510.08204
Examples
############
# True beta functions
beta0_true <- function(Z){
tmp_Z <- (Z+1)/2
return( 3 * tmp_Z[,1] +
(2 - 5 * (tmp_Z[,2] > 0.5)) * sin(pi * tmp_Z[,1]) -
2 * (tmp_Z[,2] > 0.5))
}
beta1_true <- function(Z){
tmp_Z <- (Z+1)/2
return(sin(2*tmp_Z[,1] + 0.5)/(4*tmp_Z[,1] + 1) + (2*tmp_Z[,1] - 0.5)^3)
}
beta2_true <- function(Z){
tmp_Z <- (Z+1)/2
return( (3 - 3*cos(6*pi*tmp_Z[,1]) * tmp_Z[,1]^2) * (tmp_Z[,1] > 0.6) -
(10 * sqrt(tmp_Z[,1])) * (tmp_Z[,1] < 0.25) )
}
beta3_true <- function(Z){
return(rep(1, times = nrow(Z)))
}
beta4_true <- function(Z){
tmp_Z <- (Z+1)/2
return(10 * sin(pi * tmp_Z[,1] * tmp_Z[,2]) +
20 * (tmp_Z[,3] - 0.5)^2 +
10 * tmp_Z[,4] + 5 * tmp_Z[,5])
}
beta5_true <- function(Z){
tmp_Z <- (Z+1)/2
return(exp(sin((0.9 * (tmp_Z[,1] + 0.48))^10)) +
tmp_Z[,2] * tmp_Z[,3] + tmp_Z[,4])
}
beta6_true <- function(Z){
return(rep(0, times = nrow(Z)))
}
beta7_true <- function(Z){
return(rep(0, times = nrow(Z)))
}
beta8_true <- function(Z){
return(rep(0, times = nrow(Z)))
}
beta9_true <- function(Z){
return(rep(0, times = nrow(Z)))
}
## Set problem dimensions
n_train <- 250
n_test <- 25
sigma <- 1
set.seed(417)
n_all <- n_train + n_test
ni_all <- rep(4, times = n_all) # 4 observations per subject
subj_id_all <- rep(1:n_all, each = 4) # give every subject an id number
N_all <- sum(ni_all) # total number of observations
p <- 9 # number of covariates
R_cont <- 20 # number of continuous modifiers
R_cat <- 0 # number of categorical modifiers
R <- R_cont + R_cat
## Generate data
Sigma_X <- (0.5)^(abs(outer(1:p, 1:p, FUN = "-"))) # covariates are all correlated
X_all <- MASS::mvrnorm(N_all, mu = rep(0, times = p), Sigma = Sigma_X)
Z_cont_all <- matrix(runif(N_all * R_cont, min = -1, max = 1), nrow = N_all, ncol = R_cont)
beta0_all <- beta0_true(Z_cont_all)
beta1_all <- beta1_true(Z_cont_all)
beta2_all <- beta2_true(Z_cont_all)
beta3_all <- beta3_true(Z_cont_all)
beta4_all <- beta4_true(Z_cont_all)
beta5_all <- beta5_true(Z_cont_all)
beta6_all <- beta6_true(Z_cont_all)
beta7_all <- beta7_true(Z_cont_all)
beta8_all <- beta8_true(Z_cont_all)
beta9_all <- beta9_true(Z_cont_all)
beta_all <-
cbind(
beta0_all, beta1_all, beta2_all, beta3_all, beta4_all,
beta5_all,beta6_all, beta7_all, beta8_all, beta9_all)
mu_all <-
beta0_all + X_all[,1] * beta1_all + X_all[,2] * beta2_all + X_all[,3] * beta3_all +
X_all[,4] * beta4_all + X_all[,5] * beta5_all + X_all[,6] * beta6_all +
X_all[,7] * beta7_all + X_all[,8] * beta8_all + X_all[,9] * beta9_all
Y_all <- mu_all + sigma * rnorm(n = N_all, mean = 0, sd = 1)
## Token run to ensure installation works
fit <-
sparseVCBART_ind(
Y_train = Y_all,
subj_id_train = subj_id_all,
ni_train = ni_all,
X_train = X_all,
Z_cont_train = Z_cont_all,
nd = 5, burn = 5,
verbose = FALSE)
#############
# Create a training/testing split
# since we randomly generated X and Z,
# we can just use first n_train subjects as train
############
train_subjects <- 1:n_train
test_subjects <- (n_train+1):(n_train + n_test)
test_index <- which(subj_id_all %in% test_subjects)
train_index <- which(subj_id_all %in% train_subjects)
ni_train <- ni_all[train_subjects]
ni_test <- ni_all[test_subjects]
N_train <- sum(ni_train)
N_test <- sum(ni_test)
X_train <- X_all[train_index,]
X_test <- X_all[test_index,]
Z_cont_train <- Z_cont_all[train_index,]
Z_cont_test <- Z_cont_all[test_index,]
subj_id_train <- rep(1:n_train, times = ni_train)
Y_train <- Y_all[train_index]
Y_test <- Y_all[test_index]
beta_train <- beta_all[train_index,]
beta_test <- beta_all[test_index,]
fit <-
sparseVCBART_ind(
Y_train = Y_train,
subj_id_train = subj_id_train,
ni_train = ni_train,
X_train = X_train,
Z_cont_train = Z_cont_train,
X_test = X_test,
Z_cont_test = Z_cont_test,
verbose = FALSE)
plot(c(beta_train, beta_test),
c(fit$betahat.train.mean, fit$betahat.test.mean),
xlab = "Actual", ylab = "Fitted")
abline(a = 0, b = 1, col = 'blue')
Compute posterior mean and 95% credible interval for evaluations of each coefficient function.
Description
Given an array of posterior samples of coefficient function evaluations, returns the posterior mean and 95% credible interval for each evaluation.
Usage
summarize_beta(beta_samples,
level = 0.95,
include_median = FALSE,
na.rm = FALSE)
Arguments
beta_samples |
An array, returned by |
level |
Default is |
include_median |
Default is |
na.rm |
Default is |
Value
An array of size N x 3 x p where N is the number of inputs at which the coefficient functions are evaluated (i.e. N = dim(beta_samples)[2]) and p is the total number of coefficient functions including the intercept (i.e. p = dim(beta_samples)[3]). The j-th slice is an N x 3 matrix where the columns correspond to the posterior mean, 2.5% quantile, and 97.5% quantile of each evaluation of the (j-1)-th coefficient function. Note the effect of predictor X_j is the (j+1)-st coefficient function.