This vignette provides an overview of the linear molecular and genomic eigen selection index methods, which integrate marker trait data and genome-wide marker values into the calculation. Methods covered include the Molecular Eigen Selection Index Method (MESIM), Genomic Eigen Selection Index Method (GESIM), Genome-Wide Linear Eigen Selection Index Method (GW-ESIM), Restricted Genomic Eigen Selection Index Method (RGESIM), and Predetermined Proportional Gain Genomic Eigen Selection Index Method (PPG-GESIM).
library(selection.index)
# Load standard phenotype dataset
data(maize_pheno)
# Define traits and design variables
traits <- c("Yield", "PlantHeight", "DaysToMaturity")
env_col <- "Environment"
genotype_col <- "Genotype"
# Phenotypic variance-covariance matrix (P)
pmat <- phen_varcov(maize_pheno[, traits], maize_pheno[[genotype_col]], maize_pheno[[env_col]])
# Genetic variance-covariance matrix (G)
gmat <- gen_varcov(maize_pheno[, traits], maize_pheno[[genotype_col]], maize_pheno[[env_col]])
# For the sake of demonstration within this vignette, we simulate the required molecular/genomic variance components:
set.seed(42)
# Simulate Gamma: Covariance between phenotypes and GEBVs
Gamma <- gmat * 0.85
# Molecular matrices for MESIM
S_M <- gmat * 0.75 # Covariance between phenotypic values and marker scores
S_Mg <- gmat * 0.70 # Covariance between genotypic values and marker scores
S_var <- gmat * 0.80 # Variance-covariance of marker scores
# Genomic matrices for GW-ESIM
G_M <- gmat * 0.82 # Covariance between true genotypic values and marker values
M <- gmat * 0.90 # Variance-covariance matrix of markersThe MESIM index integrates marker scores directly:
\[ I = \beta'_{y}\mathbf{y} + \beta'_{s}\mathbf{s} = \begin{bmatrix} \beta'_{y} & \beta'_{s} \end{bmatrix} \begin{bmatrix} \mathbf{y} \\ \mathbf{s} \end{bmatrix} = \beta'\mathbf{t} \]
Its optimum coefficients are calculated effectively from the primary eigenvector resulting from this maximization function:
\[ (\mathbf{T}^{-1}\mathbf{\Psi} - \lambda^2_{M}\mathbf{I}_{2t})\beta_M = \mathbf{0} \]
Using the mesim function:
mes_index <- mesim(pmat, gmat, S_M, S_Mg, S_var)
summary(mes_index)
#>
#> ==============================================================
#> MOLECULAR EIGEN SELECTION INDEX METHOD (MESIM)
#> Ceron-Rojas & Crossa (2018) - Chapter 8, Section 8.1
#> ==============================================================
#>
#> Selection intensity (k_I): 2.063
#> Number of traits: 3
#>
#> -------------------------------------------------------------
#> INDEX METRICS
#> -------------------------------------------------------------
#> lambda_M^2 (h^2_I): 1.000798
#> Accuracy (r_HI): 1.000399
#> Index Std Dev (sigma_I): 6.315343
#> Selection Response (R_M): 13.028553
#>
#> -------------------------------------------------------------
#> PHENOTYPE COEFFICIENTS (b_y)
#> -------------------------------------------------------------
#> Trait b_y
#> Yield 0.000162
#> PlantHeight -0.010260
#> DaysToMaturity 0.007350
#>
#> -------------------------------------------------------------
#> MARKER SCORE COEFFICIENTS (b_s)
#> -------------------------------------------------------------
#> Trait b_s
#> Yield -0.012821
#> PlantHeight 0.812791
#> DaysToMaturity -0.582277
#>
#> -------------------------------------------------------------
#> EXPECTED GENETIC GAINS PER TRAIT (E_M)
#> -------------------------------------------------------------
#> Trait E_M
#> Yield 165.154061
#> PlantHeight 16.311970
#> DaysToMaturity -0.538044
#>
#> ==============================================================
#> SUMMARY TABLE
#> ==============================================================
#>
#> b_y.1 b_y.2 b_y.3 b_s.1 b_s.2 b_s.3 hI2 rHI
#> 0.000162 -0.01026 0.00735 -0.012821 0.812791 -0.582277 1.000798 1.000399
#> sigma_I R_M lambda2
#> 6.315343 13.02855 1.000798The GESIM incorporates Genomic Estimated Breeding Values (GEBVs, denoted as \(\gamma\)):
\[ I = \beta'_{y}\mathbf{y} + \beta'_{\gamma}\mathbf{\gamma} = \begin{bmatrix} \beta'_{y} & \beta'_{\gamma} \end{bmatrix} \begin{bmatrix} \mathbf{y} \\ \mathbf{\gamma} \end{bmatrix} = \beta'\mathbf{f} \]
The optimum index coefficients are the first eigenvector of:
\[ (\mathbf{\Phi}^{-1}\mathbf{A} - \lambda^2_{G}\mathbf{I}_{2t})\beta_G = \mathbf{0} \]
Using the gesim function:
ges_index <- gesim(pmat, gmat, Gamma)
summary(ges_index)
#>
#> ==============================================================
#> LINEAR GENOMIC EIGEN SELECTION INDEX METHOD (GESIM)
#> Ceron-Rojas & Crossa (2018) - Chapter 8, Section 8.2
#> ==============================================================
#>
#> Selection intensity (k_I): 2.063
#> Number of traits: 3
#>
#> -------------------------------------------------------------
#> INDEX METRICS
#> -------------------------------------------------------------
#> lambda_G^2 (h^2_I): 1.000000
#> Accuracy (r_HI): 1.000000
#> Index Std Dev (sigma_I): 18221.926505
#> Selection Response (R_G): 37591.834380
#>
#> -------------------------------------------------------------
#> PHENOTYPE COEFFICIENTS (b_y)
#> -------------------------------------------------------------
#> Trait b_y
#> Yield 0
#> PlantHeight 0
#> DaysToMaturity 0
#>
#> -------------------------------------------------------------
#> GEBV COEFFICIENTS (b_gamma)
#> -------------------------------------------------------------
#> Trait b_gamma
#> Yield 1
#> PlantHeight 0
#> DaysToMaturity 0
#>
#> -------------------------------------------------------------
#> EXPECTED GENETIC GAINS PER TRAIT (E_G)
#> -------------------------------------------------------------
#> Trait E_G
#> Yield 37591.8344
#> PlantHeight 985.3145
#> DaysToMaturity 547.5454
#>
#> -------------------------------------------------------------
#> IMPLIED ECONOMIC WEIGHTS (w_G)
#> -------------------------------------------------------------
#> Trait Implied_w
#> Yield 0.000255
#> PlantHeight -0.011029
#> DaysToMaturity 0.002325
#>
#> ==============================================================
#> SUMMARY TABLE
#> ==============================================================
#>
#> b_y.1 b_y.2 b_y.3 b_gamma.1 b_gamma.2 b_gamma.3 hI2 rHI sigma_I R_G
#> 0 0 0 1 0 0 1 1 18221.93 37591.83
#> lambda2
#> 1When high-density markers covering the whole genome are available, the GW-ESIM formulation estimates genetic potential uniformly across the entire marker panel.
Using the gw_esim function:
gw_index <- gw_esim(pmat, gmat, G_M, M)
summary(gw_index)
#>
#> ==============================================================
#> GENOME-WIDE LINEAR EIGEN SELECTION INDEX METHOD (GW-ESIM)
#> Ceron-Rojas & Crossa (2018) - Chapter 8, Section 8.3
#> ==============================================================
#>
#> Selection intensity (k_I): 2.063
#> Number of traits: 3
#> Number of markers: 3
#>
#> -------------------------------------------------------------
#> INDEX METRICS
#> -------------------------------------------------------------
#> lambda_W^2 (h^2_I): 1.000000
#> Accuracy (r_HI): 1.000000
#> Index Std Dev (sigma_I): 18750.207685
#> Selection Response (R_W): 38681.678454
#>
#> -------------------------------------------------------------
#> PHENOTYPE COEFFICIENTS (b_y)
#> -------------------------------------------------------------
#> Trait b_y
#> Yield 0
#> PlantHeight 0
#> DaysToMaturity 0
#>
#> -------------------------------------------------------------
#> MARKER COEFFICIENTS (b_m) - First 10
#> -------------------------------------------------------------
#> Marker b_m
#> 1 1
#> 2 0
#> 3 0
#>
#> -------------------------------------------------------------
#> EXPECTED GENETIC GAINS PER TRAIT (E_W)
#> -------------------------------------------------------------
#> Trait E_W
#> Yield 35243.3070
#> PlantHeight 923.7575
#> DaysToMaturity 513.3379
#>
#> ==============================================================
#> SUMMARY TABLE
#> ==============================================================
#>
#> n_traits n_markers hI2 rHI sigma_I R_W lambda2
#> 3 3 1 1 18750.21 38681.68 1RGESIM enables breeders to impose constraints on particular traits such that their expected genetic advancements are zero, while employing the genomic capability of the GESIM.
Using the rgesim function to restrict improvements to
Ear_Height (second trait):
# Restrict the second trait (PlantHeight)
U_mat <- matrix(c(0, 1, 0), nrow = 1)
rges_index <- rgesim(pmat, gmat, Gamma, U_mat)
summary(rges_index)
#>
#> ==============================================================
#> RESTRICTED LINEAR GENOMIC EIGEN SELECTION INDEX (RGESIM)
#> Ceron-Rojas & Crossa (2018) - Chapter 8, Section 8.4
#> ==============================================================
#>
#> Selection intensity (k_I): 2.063
#> Number of traits: 3
#> Number of restrictions: 1
#>
#> -------------------------------------------------------------
#> INDEX METRICS
#> -------------------------------------------------------------
#> lambda_RG^2 (h^2_I): 1.000000
#> Accuracy (r_HI): 1.000000
#> Index Std Dev (sigma_I): 6.999294
#> Selection Response (R_RG): 14.439543
#>
#> -------------------------------------------------------------
#> PHENOTYPE COEFFICIENTS (b_y)
#> -------------------------------------------------------------
#> Trait b_y
#> Yield 0
#> PlantHeight 0
#> DaysToMaturity 0
#>
#> -------------------------------------------------------------
#> GEBV COEFFICIENTS (b_gamma)
#> -------------------------------------------------------------
#> Trait b_gamma
#> Yield -0.026207
#> PlantHeight 0.999657
#> DaysToMaturity 0.000000
#>
#> -------------------------------------------------------------
#> EXPECTED GENETIC GAINS PER TRAIT (E_RG)
#> -------------------------------------------------------------
#> Trait E_RG
#> Yield -550.97005
#> PlantHeight 0.00000
#> DaysToMaturity -10.82068
#>
#> -------------------------------------------------------------
#> CONSTRAINT VERIFICATION
#> -------------------------------------------------------------
#> Constrained response (should be near zero):
#> [1] 0
#>
#> -------------------------------------------------------------
#> IMPLIED ECONOMIC WEIGHTS (w_RG)
#> -------------------------------------------------------------
#> Trait Implied_w
#> Yield -0.011032
#> PlantHeight 0.476755
#> DaysToMaturity -0.100501
#>
#> ==============================================================
#> SUMMARY TABLE
#> ==============================================================
#>
#> b_y.1 b_y.2 b_y.3 b_gamma.1 b_gamma.2 b_gamma.3 hI2 rHI sigma_I R_RG
#> 0 0 0 -0.026207 0.999657 0 1 1 6.999294 14.43954
#> lambda2
#> 1This approach aims to acquire customized genetic gains according to a relative priority proportion assigned distinctly for each trait.
\[ (\mathbf{T}_{PG} - \lambda^2_{PG}\mathbf{I}_{2t})\beta_{PG} = \mathbf{0} \]
Using the ppg_gesim function with relative gains heavily
weighted towards Yield (trait 3) and
Days_To_Silking (trait 4):
# Desired genetic gain proportions: 1 for Yield, 0.5 for DaysToMaturity, 0 for others
d <- c(1, 0, 0.5)
ppg_ges_index <- ppg_gesim(pmat, gmat, Gamma, d)
summary(ppg_ges_index)
#>
#> ==============================================================
#> PREDETERMINED PROPORTIONAL GAIN GENOMIC EIGEN INDEX (PPG-GESIM)
#> Ceron-Rojas & Crossa (2018) - Chapter 8, Section 8.5
#> ==============================================================
#>
#> Selection intensity (k_I): 2.063
#> Number of traits: 3
#>
#> -------------------------------------------------------------
#> DESIRED PROPORTIONAL GAINS (d)
#> -------------------------------------------------------------
#> Trait d
#> Yield 1.0
#> PlantHeight 0.0
#> DaysToMaturity 0.5
#>
#> -------------------------------------------------------------
#> INDEX METRICS
#> -------------------------------------------------------------
#> lambda_PG^2 (h^2_I): 3.500000
#> Accuracy (r_HI): 1.870829
#> Index Std Dev (sigma_I): 2056.464206
#> Selection Response (R_PG): 4242.485656
#>
#> -------------------------------------------------------------
#> PHENOTYPE COEFFICIENTS (b_y)
#> -------------------------------------------------------------
#> Trait b_y
#> Yield -0.001616
#> PlantHeight 0.019492
#> DaysToMaturity -0.610526
#>
#> -------------------------------------------------------------
#> GEBV COEFFICIENTS (b_gamma)
#> -------------------------------------------------------------
#> Trait b_gamma
#> Yield 0.111810
#> PlantHeight 0.002665
#> DaysToMaturity 0.783816
#>
#> -------------------------------------------------------------
#> EXPECTED GENETIC GAINS PER TRAIT (E_PG)
#> -------------------------------------------------------------
#> Trait E_PG
#> Yield 37151.3513
#> PlantHeight 973.7697
#> DaysToMaturity 541.1297
#>
#> -------------------------------------------------------------
#> PROPORTIONAL GAIN VERIFICATION
#> -------------------------------------------------------------
#> Gain ratios (E_PG / d) - should be approximately constant:
#> Trait Ratio
#> Yield 37151.351
#> PlantHeight NA
#> DaysToMaturity 1082.259
#> Standard deviation of ratios: 25504.7
#>
#> -------------------------------------------------------------
#> IMPLIED ECONOMIC WEIGHTS (w_PG)
#> -------------------------------------------------------------
#> Trait Implied_w
#> Yield 0.046996
#> PlantHeight -0.015539
#> DaysToMaturity 0.004212
#>
#> ==============================================================
#> SUMMARY TABLE
#> ==============================================================
#>
#> b_y.1 b_y.2 b_y.3 b_gamma.1 b_gamma.2 b_gamma.3 hI2 rHI
#> -0.001616 0.019492 -0.610526 0.11181 0.002665 0.783816 3.5 1.870829
#> sigma_I R_PG lambda2
#> 2056.464 4242.486 3.5Just as with phenotypic and standard molecular selection indices, the efficiency of Genomic Eigen Selection Indices can be evaluated via accuracy (\(\rho_{HI}\)), maximized selection response (\(R\)), and expected genetic gain per trait (\(E\)).
For the MESIM index, the parameters are formulated by estimating the standard deviations of the true genetic merit (\(\sigma_H\)) and the given index (\(\sigma_I\)). The maximum accuracy obtained through canonical correlation (\(\lambda_M\)) corresponds directly to the square root of the primary eigenvalue:
\[ \rho_{H_MI_M} = \frac{\sqrt{\beta'_{M}\mathbf{T}_M\beta_M}}{\sqrt{\beta'_{M}\mathbf{T}_M\mathbf{\Psi}_M^{-1}\mathbf{T}_M\beta_M}} = \frac{\sigma_{I_M}}{\sigma_{H_M}} \]
The selection response is obtained using a standardized selection intensity factor (\(k_I\)):
\[ R_M = k_I \sqrt{\beta'_{M_1}\mathbf{T}_M\beta_{M_1}} \]
Expected genetic gain per trait can then be calculated accordingly:
\[ \mathbf{E}_M = k_I \frac{\mathbf{\Psi}_M\beta_{M_1}}{\sqrt{\beta'_{M_1}\mathbf{T}_M\beta_{M_1}}} \]