matrixCorr computes correlation and related association
matrices from small to high-dimensional data using simple, consistent
functions and sensible defaults. It includes shrinkage and robust
options for noisy or p ≥ n settings, plus convenient
print/plot methods. Performance-critical paths are implemented in C++
with BLAS/OpenMP and memory-aware symmetric updates. The API accepts
base matrices and data frames and returns standard R objects via a
consistent S3 interface.
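
To illustrate why shrinkage matters in p ≥ n settings, here is a minimal base-R sketch. It does not use matrixCorr, and the fixed weight `lambda` is a hypothetical value chosen for illustration, not the data-driven intensity a shrinkage estimator would normally compute. It shows that the sample correlation matrix is rank-deficient when p > n, and that pulling it toward the identity makes it positive definite.

```r
set.seed(10)
n <- 20; p <- 50                       # more variables than observations
X <- matrix(rnorm(n * p), n, p)

R <- cor(X)                            # p x p sample correlation matrix
rank_R <- qr(R)$rank                   # cannot reach p when p > n

# Linear shrinkage toward the identity with a fixed weight.
# lambda = 0.2 is an assumption made only for this sketch.
lambda <- 0.2
R_shrunk <- (1 - lambda) * R + lambda * diag(p)

# Smallest eigenvalue before and after shrinkage
min_eig_raw    <- min(eigen(R, symmetric = TRUE, only.values = TRUE)$values)
min_eig_shrunk <- min(eigen(R_shrunk, symmetric = TRUE, only.values = TRUE)$values)

cat("rank of R:", rank_R, "of", p, "\n")
cat("min eigenvalue (raw):   ", signif(min_eig_raw, 3), "\n")
cat("min eigenvalue (shrunk):", signif(min_eig_shrunk, 3), "\n")
```

Because the raw matrix has eigenvalues at (numerically) zero, it cannot be inverted; the shrunk matrix keeps a unit diagonal and has all eigenvalues bounded below by roughly `lambda`.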
Contributions from other researchers who want to add new correlation
methods are very welcome. A central goal of matrixCorr is
to keep efficient correlation and agreement estimation in one package
with a common interface and consistent outputs, so methods can be
extended, compared, and used without repeated translation across
packages.
Supported measures include Pearson, Spearman, Kendall, distance correlation, partial correlation, and robust biweight mid-correlation; agreement tools cover Bland–Altman (two-method and repeated-measures) and Lin’s concordance correlation coefficient (including repeated-measures LMM/REML extensions).
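
As a rough guide to what the two-method agreement measures quantify, the following base-R sketch computes the Bland–Altman bias with 95% limits of agreement and Lin’s concordance correlation coefficient from their textbook formulas. The helper name `lin_ccc_sketch()` is hypothetical, the 1/n moment estimates are an assumption of this sketch, and none of this is meant to reproduce the package's internal implementation or its output format.

```r
set.seed(11)
x <- rnorm(100, 100, 10)               # method 1
y <- x + 0.5 + rnorm(100, 0, 8)        # method 2: small offset plus noise

# Bland-Altman: mean difference and 95% limits of agreement
d    <- x - y
bias <- mean(d)
loa  <- bias + c(-1, 1) * qnorm(0.975) * sd(d)

# Lin's CCC: 2 * cov / (var_x + var_y + squared mean difference),
# using 1/n moment estimates (hypothetical helper, for illustration only)
lin_ccc_sketch <- function(x, y) {
  mx  <- mean(x); my <- mean(y)
  sxy <- mean((x - mx) * (y - my))
  sx2 <- mean((x - mx)^2)
  sy2 <- mean((y - my)^2)
  2 * sxy / (sx2 + sy2 + (mx - my)^2)
}

ccc_hat <- lin_ccc_sketch(x, y)

cat("bias:", round(bias, 2), " LoA:", round(loa, 2), "\n")
cat("CCC:", round(ccc_hat, 3), " Pearson r:", round(cor(x, y), 3), "\n")
```

The CCC penalises both location and scale shifts between methods, so for positively correlated data it never exceeds the Pearson correlation.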

- C++ backend via `Rcpp`
- Pearson, Spearman, and Kendall correlation (`pearson_corr()`, `spearman_rho()`, `kendall_tau()`)
- Robust biweight mid-correlation (`biweight_mid_corr()`)
- Distance correlation (`distance_corr()`)
- Partial correlation (`partial_correlation()`)
- Shrinkage correlation (`schafer_corr()`)
- Agreement: Bland–Altman (`bland_altman()` and repeated-measures `bland_altman_repeated()`), Lin’s CCC (`ccc()`, repeated-measures LMM/REML `ccc_lmm_reml()` and non-parametric `ccc_pairwise_u_stat()`)

Installation:

```r
# Install from GitHub
# install.packages("devtools")
devtools::install_github("Prof-ThiagoOliveira/matrixCorr")
```

Pearson, Spearman, and Kendall correlation matrices:

```r
library(matrixCorr)
set.seed(1)
X <- as.data.frame(matrix(rnorm(300 * 6), ncol = 6))
names(X) <- paste0("V", 1:6)
R_pear <- pearson_corr(X)
R_spr <- spearman_rho(X)
R_ken <- kendall_tau(X)
print(R_pear, digits = 2)
plot(R_spr) # heatmap
```

Robust biweight mid-correlation on data with outliers:

```r
set.seed(2)
Y <- X
# inject outliers: shift 8 randomly chosen values of V1 by +8
idx <- sample.int(nrow(Y), 8)
Y$V1[idx] <- Y$V1[idx] + 8
R_bicor <- biweight_mid_corr(Y)
print(R_bicor, digits = 2)
```

Shrinkage correlation when p > n:

```r
set.seed(3)
n <- 60; p <- 200
Xd <- matrix(rnorm(n * p), n, p)
colnames(Xd) <- paste0("G", seq_len(p))
R_shr <- schafer_corr(Xd)
print(R_shr, digits = 2, max_rows = 6, max_cols = 6)
```

Partial correlation:

```r
R_part <- partial_correlation(X)
print(R_part, digits = 2)
```

Distance correlation:

```r
R_dcor <- distance_corr(X)
print(R_dcor, digits = 2)
```

`distance_corr()` uses an unbiased estimator with a fast
univariate (O(n log n)) dispatch and an exact (O(n^2)) fallback for
robustness.
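
For readers who want to see what an exact O(n^2) unbiased estimator looks like, here is a minimal base-R sketch of the bias-corrected squared distance correlation based on U-centered distance matrices. The helper names (`u_center()`, `dcov2_u()`, `dcor2_u()`) are hypothetical, and this is an illustration of the general technique under stated assumptions (n > 3, univariate inputs), not the package's C++ implementation.

```r
# U-center a symmetric n x n distance matrix (requires n > 3).
u_center <- function(d) {
  n <- nrow(d)
  a <- sweep(d, 1, rowSums(d) / (n - 2))      # subtract scaled row sums
  a <- sweep(a, 2, colSums(d) / (n - 2))      # subtract scaled column sums
  a <- a + sum(d) / ((n - 1) * (n - 2))       # add back scaled grand sum
  diag(a) <- 0                                # diagonal is zero by definition
  a
}

# Unbiased squared distance covariance from two U-centered matrices.
dcov2_u <- function(A, B) {
  n <- nrow(A)
  sum(A * B) / (n * (n - 3))
}

# Bias-corrected squared distance correlation; it can be slightly
# negative when x and y are (close to) independent.
dcor2_u <- function(x, y) {
  A <- u_center(as.matrix(dist(x)))
  B <- u_center(as.matrix(dist(y)))
  dcov2_u(A, B) / sqrt(dcov2_u(A, A) * dcov2_u(B, B))
}

set.seed(12)
x <- rnorm(50)
y_dep <- x^2 + rnorm(50, 0, 0.1)   # nonlinear dependence on x
y_ind <- rnorm(50)                 # independent of x

cat("dependent:  ", round(dcor2_u(x, y_dep), 3), "\n")
cat("independent:", round(dcor2_u(x, y_ind), 3), "\n")
```

Both the time and the memory cost of this route grow quadratically with n because the full pairwise distance matrices are formed, which is the usual motivation for a faster univariate path.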

Two-method Bland–Altman analysis:

```r
set.seed(4)
x <- rnorm(120, 100, 10)
y <- x + 0.5 + rnorm(120, 0, 8)
ba <- bland_altman(x, y)
print(ba)
plot(ba)
```

Repeated-measures Bland–Altman across three methods:

```r
set.seed(5)
S <- 20; Tm <- 6
subj <- rep(seq_len(S), each = Tm)
time <- rep(seq_len(Tm), times = S)
true <- rnorm(S, 50, 6)[subj] + (time - mean(time)) * 0.4
mA <- true + rnorm(length(true), 0, 2)
mB <- true + 1.0 + rnorm(length(true), 0, 2.2)
mC <- 0.95 * true + rnorm(length(true), 0, 2.5)
dat <- rbind(
  data.frame(y = mA, subject = subj, method = "A", time = time),
  data.frame(y = mB, subject = subj, method = "B", time = time),
  data.frame(y = mC, subject = subj, method = "C", time = time)
)
dat$method <- factor(dat$method, levels = c("A", "B", "C"))
ba_rep <- bland_altman_repeated(
  data = dat, response = "y", subject = "subject",
  method = "method", time = "time",
  include_slope = FALSE, use_ar1 = FALSE
)
summary(ba_rep)
# plot(ba_rep) # faceted BA scatter by pair
```

Repeated-measures CCC via LMM/REML:

```r
set.seed(6)
S <- 30; Tm <- 8
id <- factor(rep(seq_len(S), each = 2 * Tm))
method <- factor(rep(rep(c("A", "B"), each = Tm), times = S))
time <- rep(rep(seq_len(Tm), times = 2), times = S)
u <- rnorm(S, 0, 0.8)[as.integer(id)]
g <- rnorm(S * Tm, 0, 0.5)
g <- g[(as.integer(id) - 1L) * Tm + as.integer(time)]
y <- (method == "B") * 0.3 + u + g + rnorm(length(id), 0, 0.7)
dat_ccc <- data.frame(y, id, method, time)
fit_ccc <- ccc_lmm_reml(dat_ccc, response = "y", rind = "id",
                        method = "method", time = "time")
summary(fit_ccc) # overall CCC, variance components, SEs/CI
```

Issues and pull requests are welcome. Please see
`CONTRIBUTING.md` for guidelines and
`cran-comments.md`/`DESCRIPTION` for package
metadata.

See `inst/LICENSE` for the full MIT license text.