BeviMed, which stands for Bayesian Evaluation of Variant Involvement in Mendelian Disease, is an association test which estimates both the probability of an association between a given set of variants and a case/control disease label, and, given the association, the probability that each individual variant is pathogenic with respect to the disease.
The inference is carried out based on the inputs:
y, a length N (number of samples) logical vector,
G, an N by k integer matrix of allele counts for N individuals at k rare variant sites,
min_ac, representing a mode of inheritance hypothesis (i.e. minimum number of pathogenic variants required to be considered to have a pathogenic configuration of variants).
Then, depending on the quantity of interest, the inference procedure can be invoked simply by passing the above arguments to the functions:
prob_association - returning the probability of association between configurations of variants represented in G and the case-control label y (optionally broken down by mode of inheritance). By default, the prior probability of association is 0.01, and the prior probabilities of dominant and recessive inheritance given that there is an association are each 0.5.
log_BF - the log Bayes factor between the association model and no-association model.
prob_pathogenic - the probabilities of pathogenicity for the individual variants.
The inference is performed by the function bevimed, an MCMC sampling procedure with many parameters, including those listed above and others determining the sampling management and prior distributions of the model parameters.
The bevimed function returns a list of traces for the sampled parameters in an object of class BeviMed. This object can take up a lot of memory, so it may be preferable to store only the summarised version obtained by passing it to summary.
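To see how a prior probability of association and a log Bayes factor combine into a posterior probability, the following base R sketch may help. The helper posterior_from_log_bf is hypothetical and not part of BeviMed. It assumes a natural-log Bayes factor comparing a single association model with the no-association model, so it ignores the split of the prior across modes of inheritance.

```r
# Hypothetical helper (not a BeviMed function): convert a natural-log Bayes
# factor and a prior probability of association into a posterior probability.
posterior_from_log_bf <- function(log_bf, prior = 0.01) {
  # posterior odds = Bayes factor * prior odds, computed on the log scale
  log_posterior_odds <- log_bf + log(prior) - log1p(-prior)
  # plogis maps log odds to a probability in a numerically stable way
  plogis(log_posterior_odds)
}

posterior_from_log_bf(0)      # a Bayes factor of 1 leaves the prior unchanged
posterior_from_log_bf(13.91)  # strong evidence moves the posterior close to 1
```

Working on the log scale avoids overflow when the Bayes factor is very large.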
Here we demonstrate a simple application of BeviMed for some simulated data.
library(BeviMed)
set.seed(0)
Firstly, we’ll generate some random data consisting of an allele-count matrix G for 100 samples at 20 variant sites (each with an allele frequency of 0.02) and an independently generated case-control label, y_random.
G <- matrix(rbinom(size=2, prob=0.02, n=100*20), nrow=100, ncol=20)
y_random <- runif(n=nrow(G)) < 0.1
prob_association(G=G, y=y_random)
## [1] 0.002853161
The result indicates that there is a low probability of association. We now generate a new case-control label y_dependent which depends on G - specifically, we treat variants 1 to 3 as ‘pathogenic’, and label any samples harbouring alleles for any of these variants as cases.
y_dependent <- apply(G, 1, function(variants) sum(variants[1:3]) > 0)
prob_association(G=G, y=y_dependent)
## [1] 0.9997962
Notice that the estimated probability of association is now much higher: close to 1, compared with about 0.003 for the independently generated label.
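The reason for the difference can be checked directly from the simulated data, without BeviMed. The sketch below regenerates G and y_dependent as above and counts, for each variant, how many controls and how many cases carry at least one allele. The names carrier and carrier_counts are introduced here for illustration only.

```r
# Regenerate the simulated data exactly as in the example above.
set.seed(0)
G <- matrix(rbinom(size=2, prob=0.02, n=100*20), nrow=100, ncol=20)
y_dependent <- apply(G, 1, function(variants) sum(variants[1:3]) > 0)

# TRUE where a sample carries at least one allele at a variant site.
carrier <- G > 0

# Number of carriers per variant, split by case-control label.
carrier_counts <- data.frame(
  variant  = seq_len(ncol(G)),
  controls = colSums(carrier[!y_dependent, , drop = FALSE]),
  cases    = colSums(carrier[y_dependent, , drop = FALSE])
)
head(carrier_counts)
```

By construction, variants 1 to 3 have no carriers among the controls, which is the pattern the association test picks up.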
By default, prob_association integrates over mode of inheritance (e.g. are at least 1 or 2 pathogenic variants required for a pathogenic configuration?). The probabilities of association with each mode of inheritance can be shown by passing the option by_MOI=TRUE (for more details, including how to set the ploidy of the samples within the region, see ?prob_pathogenic).
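The difference between requiring at least 1 and at least 2 pathogenic alleles can be illustrated with a minimal base R sketch. The helper has_pathogenic_configuration is hypothetical and not part of BeviMed. It assumes that a configuration is assessed by summing allele counts across the variants flagged as pathogenic, and it treats variants 1 to 3 as pathogenic as in the simulation above.

```r
# Regenerate the simulated allele counts exactly as in the example above.
set.seed(0)
G <- matrix(rbinom(size=2, prob=0.02, n=100*20), nrow=100, ncol=20)

# Hypothetical helper (not a BeviMed function): TRUE for samples carrying at
# least min_ac alleles in total across the variants flagged as pathogenic.
has_pathogenic_configuration <- function(G, pathogenic, min_ac = 1L) {
  rowSums(G[, pathogenic, drop = FALSE]) >= min_ac
}

pathogenic <- seq_len(ncol(G)) <= 3   # assumption: variants 1 to 3 are pathogenic
dominant  <- has_pathogenic_configuration(G, pathogenic, min_ac = 1L)
recessive <- has_pathogenic_configuration(G, pathogenic, min_ac = 2L)

# Every sample that qualifies under min_ac = 2 also qualifies under min_ac = 1.
c(dominant = sum(dominant), recessive = sum(recessive))
```

With min_ac = 1 the helper reproduces the rule used to build y_dependent, while min_ac = 2 selects a subset of those samples.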
For more detailed output, the bevimed function can be used, and its returned value can be summarised and stored or printed.
output <- summary(bevimed(G=G, y=y_dependent))
output
## ---------------------------------------------------------------------------
## The probability of association is 1 [prior: 0.01]
##
## The expected number of variants involved in explained cases is: 2.96
##
## Log Bayes factor between gamma 1 model and gamma 0 model is 13.91
## A confidence interval for the log Bayes factor is:
## 2.5% 97.5%
## 12.22 14.74
## ---------------------------------------------------------------------------
## Estimated probabilities of pathogenicity of individual variants
## (conditional on gamma = 1)
##
## Variant Controls Cases P(z_j=1|y,gamma=1) Bar Chart
## 1 0 2 1.00 [=================== ]
## 2 0 4 1.00 [=================== ]
## 3 0 2 0.94 [================== ]
## 4 9 0 0.00 [ ]
## 5 2 1 0.00 [ ]
## 6 4 0 0.00 [ ]
## 7 3 0 0.00 [ ]
## 8 2 1 0.00 [ ]
## 9 4 1 0.00 [ ]
## 10 7 0 0.00 [ ]
## 11 3 0 0.00 [ ]
## 12 1 0 0.01 [ ]
## 13 4 0 0.00 [ ]
## 14 1 2 0.01 [ ]
## 15 3 0 0.00 [ ]
## 16 5 0 0.00 [ ]
## 17 7 0 0.00 [ ]
## 18 4 0 0.00 [ ]
## 19 7 2 0.00 [ ]
## 20 6 0 0.00 [ ]