Translating lme4 models to sommer

The sommer package was developed to provide R users with a powerful and reliable multivariate mixed model solver for different genetic and non-genetic analyses in diploid and polyploid organisms. This package allows the user to estimate variance components for a mixed model with the advantages of specifying the variance-covariance structure of the random effects, specifying heterogeneous variances, and obtaining other parameters such as BLUPs, BLUEs, residuals, fitted values, variances for fixed and random effects, etc. The core algorithms of the package are coded in C++ using the Armadillo library to optimize dense matrix operations common in the derect-inversion algorithms. Although the vignette shows examples using the mmes function with the default direct inversion algorithm (henderson=FALSE) the Henderson’s approach can be faster when the number of records surpasses the number of coefficients to estimate and setting the henderson argument to TRUE can bring significant speed ups.

The purpose of this vignette is to show how to translate the syntax formula from lme4 models to sommer models. Feel free to remove the comment marks from the lme4 code so you can compare the results.

Random slopes with same intercept
Random slopes and random intercepts (without correlation)
Random slopes and random intercepts (with correlation)
Random slopes with a different intercept
Other models not available in lme4

1) Random slopes

This is the simplest model people use when a random effect is desired and the levels of the random effect are considered to have the same intercept.

# install.packages("lme4")
# library(lme4)
library(sommer)
data(DT_sleepstudy)
DT <- DT_sleepstudy
###########
## lme4
###########
# fm1 <- lmer(Reaction ~ Days + (1 | Subject), data=DT)
# summary(fm1) # or vc <- VarCorr(fm1); print(vc,comp=c("Variance"))
# Random effects:
#  Groups   Name        Variance Std.Dev.
#  Subject  (Intercept) 1378.2   37.12   
#  Residual              960.5   30.99   
# Number of obs: 180, groups:  Subject, 18
###########
## sommer
###########
fm2 <- mmes(Reaction ~ Days,
            random= ~ Subject, 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

##                VarComp VarCompSE   Zratio Constraint
## Subject:mu:mu 1378.158  505.6990 2.725254   Positive
## units:mu:mu    960.458  107.0498 8.972063   Positive

2) Random slopes and random intercepts (without correlation)

This is the a model where you assume that the random effect has different intercepts based on the levels of another variable. In addition the || in lme4 assumes that slopes and intercepts have no correlation.

###########
## lme4
###########
# fm1 <- lmer(Reaction ~ Days + (Days || Subject), data=DT)
# summary(fm1) # or vc <- VarCorr(fm1); print(vc,comp=c("Variance"))
# Random effects:
#  Groups    Name        Variance Std.Dev.
#  Subject   (Intercept) 627.57   25.051  
#  Subject.1 Days         35.86    5.988  
#  Residual              653.58   25.565  
# Number of obs: 180, groups:  Subject, 18
###########
## sommer
###########
fm2 <- mmes(Reaction ~ Days,
            random= ~ Subject + vsm(ism(Days), ism(Subject)), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

##                      VarComp VarCompSE   Zratio Constraint
## Subject:mu:mu      627.54087 283.52939 2.213319   Positive
## Days:Subject:mu:mu  35.86008  14.53187 2.467686   Positive
## units:mu:mu        653.58305  76.72711 8.518281   Positive

Notice that Days is a numerical (not factor) variable.

3) Random slopes and random intercepts (with correlation)

This is the a model where you assume that the random effect has different intercepts based on the levels of another variable. In addition a single | in lme4 assumes that slopes and intercepts have a correlation to be estimated.

###########
## lme4
###########
# fm1 <- lmer(Reaction ~ Days + (Days | Subject), data=DT)
# summary(fm1) # or # vc <- VarCorr(fm1); print(vc,comp=c("Variance"))
# Random effects:
#  Groups   Name        Variance Std.Dev. Corr
#  Subject  (Intercept) 612.10   24.741       
#           Days         35.07    5.922   0.07
#  Residual             654.94   25.592       
# Number of obs: 180, groups:  Subject, 18
###########
## sommer
###########
fm2 <- mmes(Reaction ~ Days, # henderson=TRUE,
            random= ~ covm( vsm(ism(Subject)) , vsm(ism(Days), ism(Subject)) ), 
            nIters = 200, data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

##                           VarComp VarCompSE    Zratio Constraint
## Subject:Days:ran1:ran1 612.081194 288.75050 2.1197581   Positive
## Subject:Days:ran1:ran2   9.604358  46.68381 0.2057321   Unconstr
## Subject:Days:ran2:ran2  35.073186  14.78720 2.3718604   Positive
## units:mu:mu            654.939143  77.18332 8.4855011   Positive

cov2cor(fm2$theta[[1]])

##            [,1]       [,2]
## [1,] 1.00000000 0.06555053
## [2,] 0.06555053 1.00000000

Notice that this last model require a new function called covm() which creates the two random effects as before but now they have to be encapsulated in covm() instead of just added.

4) Random slopes with a different intercept

This is the a model where you assume that the random effect has different intercepts based on the levels of another variable but there’s not a main effect. The 0 in the intercept in lme4 assumes that random slopes interact with an intercept but without a main effect.

###########
## lme4
###########
# fm1 <- lmer(Reaction ~ Days + (0 + Days | Subject), data=DT)
# summary(fm1) # or vc <- VarCorr(fm1); print(vc,comp=c("Variance"))
# Random effects:
#  Groups   Name Variance Std.Dev.
#  Subject  Days  52.71    7.26   
#  Residual      842.03   29.02   
# Number of obs: 180, groups:  Subject, 18
###########
## sommer
###########
fm2 <- mmes(Reaction ~ Days,
            random= ~ vsm(ism(Days), ism(Subject)), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

##                      VarComp VarCompSE   Zratio Constraint
## Days:Subject:mu:mu  52.70946  19.09984 2.759681   Positive
## units:mu:mu        842.02736  93.84640 8.972399   Positive

4) Other models available in sommer but not in lme4

One of the strengths of sommer is the availability of other variance covariance structures. In this section we show 4 models available in sommer that are not available in lme4 and might be useful.

library(orthopolynom)
## diagonal model
fm2 <- mmes(Reaction ~ Days,
            random= ~ vsm(dsm(Daysf), ism(Subject)), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

##                     VarComp VarCompSE    Zratio Constraint
## Daysf:Subject:0:0  139.5473  399.5095 0.3492967   Positive
## Daysf:Subject:1:1  196.8544  411.8262 0.4780037   Positive
## Daysf:Subject:2:2    0.0000  365.3178 0.0000000   Positive
## Daysf:Subject:3:3  556.0773  501.2665 1.1093445   Positive
## Daysf:Subject:4:4  855.2104  581.8190 1.4698910   Positive
## Daysf:Subject:5:5 1699.4269  820.4561 2.0713197   Positive
## Daysf:Subject:6:6 2910.8975 1175.7872 2.4757011   Positive
## Daysf:Subject:7:7 1539.6201  779.1437 1.9760413   Positive
## Daysf:Subject:8:8 2597.5337 1089.4522 2.3842568   Positive
## Daysf:Subject:9:9 3472.7108 1351.5702 2.5693899   Positive
## units:mu:mu        879.6958  247.4680 3.5547862   Positive

## unstructured model
fm2 <- mmes(Reaction ~ Days,
            random= ~ vsm(usm(Daysf), ism(Subject)), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

##                     VarComp VarCompSE    Zratio Constraint
## Daysf:Subject:0:0  404.2274  572.9406 0.7055311   Positive
## Daysf:Subject:0:1 1024.8277  395.1858 2.5932810   Unconstr
## Daysf:Subject:1:1  418.9669  288.1527 1.4539753   Positive
## Daysf:Subject:0:2  542.0553  398.5048 1.3602229   Unconstr
## Daysf:Subject:1:2  830.0421  412.4896 2.0122739   Unconstr
## Daysf:Subject:2:2    0.0000  518.4224 0.0000000   Positive
## Daysf:Subject:0:3  800.6213  567.2558 1.4113938   Unconstr
## Daysf:Subject:1:3 1139.4846  491.8231 2.3168586   Unconstr
## Daysf:Subject:2:3 1058.4964  578.5524 1.8295602   Unconstr
## Daysf:Subject:3:3  761.4044  692.1520 1.1000538   Positive
## Daysf:Subject:0:4  759.8438  522.0343 1.4555437   Unconstr
## Daysf:Subject:1:4 1041.4806  326.7026 3.1878552   Unconstr
## Daysf:Subject:2:4  912.2462  445.2663 2.0487653   Unconstr
## Daysf:Subject:3:4 1592.3784  448.6967 3.5488967   Unconstr
## Daysf:Subject:4:4  957.7023  549.5803 1.7426068   Positive
## Daysf:Subject:0:5  935.2038  585.0229 1.5985763   Unconstr
## Daysf:Subject:1:5 1182.0541  494.0087 2.3927801   Unconstr
## Daysf:Subject:2:5  861.0084  594.1198 1.4492167   Unconstr
## Daysf:Subject:3:5 1675.4551  705.2855 2.3755700   Unconstr
## Daysf:Subject:4:5 2005.1568  510.5543 3.9274112   Unconstr
## Daysf:Subject:5:5 2069.9991  386.7921 5.3517102   Positive
## Daysf:Subject:0:6  668.3500  378.6722 1.7649827   Unconstr
## Daysf:Subject:1:6  853.0252  441.5719 1.9317924   Unconstr
## Daysf:Subject:2:6  917.6183  504.8746 1.8175175   Unconstr
## Daysf:Subject:3:6 1787.8367  426.9689 4.1872761   Unconstr
## Daysf:Subject:4:6 2079.1716  487.9486 4.2610462   Unconstr
## Daysf:Subject:5:6 2605.7361  551.2976 4.7265505   Unconstr
## Daysf:Subject:6:6 3124.3139  437.2096 7.1460324   Positive
## Daysf:Subject:0:7  934.8337  567.6481 1.6468543   Unconstr
## Daysf:Subject:1:7  929.1933  665.6637 1.3958900   Unconstr
## Daysf:Subject:2:7  925.8402  752.0828 1.2310348   Unconstr
## Daysf:Subject:3:7 1284.6133  584.4937 2.1978223   Unconstr
## Daysf:Subject:4:7 1551.3176  719.2232 2.1569347   Unconstr
## Daysf:Subject:5:7 1943.7604  799.3245 2.4317539   Unconstr
## Daysf:Subject:6:7 2307.7377  364.2914 6.3348672   Unconstr
## Daysf:Subject:7:7 1670.4190  739.9904 2.2573523   Positive
## Daysf:Subject:0:8  922.9089  823.1582 1.1211806   Unconstr
## Daysf:Subject:1:8 1047.3727  644.6257 1.6247766   Unconstr
## Daysf:Subject:2:8  833.2198  806.9536 1.0325498   Unconstr
## Daysf:Subject:3:8 1609.3713  894.2295 1.7997295   Unconstr
## Daysf:Subject:4:8 2031.1247  554.0902 3.6656935   Unconstr
## Daysf:Subject:5:8 3061.0333 1036.7583 2.9525042   Unconstr
## Daysf:Subject:6:8 2929.9489  812.7286 3.6050765   Unconstr
## Daysf:Subject:7:8 2435.3269 1095.3333 2.2233661   Unconstr
## Daysf:Subject:8:8 2949.0476 1198.3224 2.4609801   Positive
## Daysf:Subject:0:9 1443.6078 1049.7042 1.3752520   Unconstr
## Daysf:Subject:1:9 1517.7329  952.6090 1.5932380   Unconstr
## Daysf:Subject:2:9  969.8886 1179.0801 0.8225808   Unconstr
## Daysf:Subject:3:9 1745.3735 1186.7845 1.4706743   Unconstr
## Daysf:Subject:4:9 2200.7039  612.3551 3.5938363   Unconstr
## Daysf:Subject:5:9 3240.0574  959.0399 3.3784387   Unconstr
## Daysf:Subject:6:9 2213.3201 1029.3177 2.1502788   Unconstr
## Daysf:Subject:7:9 2401.9426  845.6411 2.8403806   Unconstr
## Daysf:Subject:8:9 3850.1122 1393.5787 2.7627519   Unconstr
## Daysf:Subject:9:9 3948.8494 1229.9178 3.2106611   Positive
## units:mu:mu        884.1034  579.1145 1.5266470   Positive

## random regression (legendre polynomials)
fm2 <- mmes(Reaction ~ Days,
            random= ~ vsm(dsm(leg(Days,1)), ism(Subject)), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

##                          VarComp  VarCompSE   Zratio Constraint
## Days:Subject:leg0:leg0 2817.4048 1011.23903 2.786092   Positive
## Days:Subject:leg1:leg1  473.4608  199.53635 2.372805   Positive
## units:mu:mu             654.9433   77.18822 8.485016   Positive

## unstructured random regression (legendre)
fm2 <- mmes(Reaction ~ Days,
            random= ~ vsm(usm(leg(Days,1)), ism(Subject)), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

##                          VarComp  VarCompSE   Zratio Constraint
## Days:Subject:leg0:leg0 2817.4056 1011.24151 2.786086   Positive
## Days:Subject:leg0:leg1  869.9595  381.02552 2.283205   Unconstr
## Days:Subject:leg1:leg1  473.4608  199.53619 2.372807   Positive
## units:mu:mu             654.9429   77.18771 8.485067   Positive

Literature

Covarrubias-Pazaran G. 2016. Genome assisted prediction of quantitative traits using the R package sommer. PLoS ONE 11(6):1-15.

Covarrubias-Pazaran G. 2018. Software update: Moving the R package sommer to multivariate mixed models for genome-assisted prediction. doi: https://doi.org/10.1101/354639

Bernardo Rex. 2010. Breeding for quantitative traits in plants. Second edition. Stemma Press. 390 pp.

Gilmour et al. 1995. Average Information REML: An efficient algorithm for variance parameter estimation in linear mixed models. Biometrics 51(4):1440-1450.

Henderson C.R. 1975. Best Linear Unbiased Estimation and Prediction under a Selection Model. Biometrics vol. 31(2):423-447.

Kang et al. 2008. Efficient control of population structure in model organism association mapping. Genetics 178:1709-1723.

Lee, D.-J., Durban, M., and Eilers, P.H.C. (2013). Efficient two-dimensional smoothing with P-spline ANOVA mixed models and nested bases. Computational Statistics and Data Analysis, 61, 22 - 37.

Lee et al. 2015. MTG2: An efficient algorithm for multivariate linear mixed model analysis based on genomic information. Cold Spring Harbor. doi: http://dx.doi.org/10.1101/027201.

Maier et al. 2015. Joint analysis of psychiatric disorders increases accuracy of risk prediction for schizophrenia, bipolar disorder, and major depressive disorder. Am J Hum Genet; 96(2):283-294.

Rodriguez-Alvarez, Maria Xose, et al. Correcting for spatial heterogeneity in plant breeding experiments with P-splines. Spatial Statistics 23 (2018): 52-71.

Searle. 1993. Applying the EM algorithm to calculating ML and REML estimates of variance components. Paper invited for the 1993 American Statistical Association Meeting, San Francisco.

Yu et al. 2006. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Genetics 38:203-208.

Tunnicliffe W. 1989. On the use of marginal likelihood in time series model estimation. JRSS 51(1):15-27.