% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/QC.mppData.R
\name{QC.mppData}
\alias{QC.mppData}
\title{Quality control for \code{mppData} objects}
\usage{
QC.mppData(mppData, mk.miss = 0.1, gen.miss = 0.25, n.lim = 15,
  MAF.pop.lim = 0.05, MAF.cr.lim = NULL, MAF.cr.miss = TRUE,
  MAF.cr.lim2 = NULL, verbose = TRUE, n.cores = 1)
}
\arguments{
\item{mppData}{An object of class \code{mppData} formed with
\code{\link{create.mppData}}.}

\item{mk.miss}{\code{Numeric} maximum marker missing rate at the whole
population level, between 0 and 1. Default = 0.1.}

\item{gen.miss}{\code{Numeric} maximum genotype missing rate at the whole
population level, between 0 and 1. Default = 0.25.}

\item{n.lim}{\code{Numeric} value specifying the minimum cross size.
Default = 15.}

\item{MAF.pop.lim}{\code{Numeric} minimum marker minor allele frequency at
the population level. Default = 0.05.}

\item{MAF.cr.lim}{\code{Numeric vector} specifying the critical within-cross
MAF. Markers with a MAF below the critical value in at least
one cross are either set as missing within the problematic cross
(\code{MAF.cr.miss = TRUE}) or removed from the marker matrix
(\code{MAF.cr.miss = FALSE}). For the default values, see Details.}

\item{MAF.cr.miss}{\code{Logical} value specifying whether markers with a too
low segregation rate within a cross (\code{MAF.cr.lim}) should be set as
missing (\code{MAF.cr.miss = TRUE}) or discarded (\code{MAF.cr.miss = FALSE}).
Default = TRUE.}

\item{MAF.cr.lim2}{\code{Numeric}. Alternative option for marker MAF
filtering. Only markers segregating with a MAF larger than \code{MAF.cr.lim2}
in at least one cross will be kept for the analysis. Default = NULL.}

\item{verbose}{\code{Logical} value indicating if the steps of the QC should
be printed. Default = TRUE.}

\item{n.cores}{\code{Numeric}. Number of cores to use. Default = 1.}
}
\value{
A filtered \code{mppData} object containing the same elements
as \code{\link{create.mppData}} after filtering. It also contains the
following new elements:

\item{geno.id}{\code{Character} vector of genotype identifiers.}

\item{ped.mat}{Four-column \code{data.frame}: 1) the type of genotype:
"offspring" for the last generation and "founder" for the genotypes above
the offspring in the pedigree; 2) the genotype indicator; 3-4) parent 1
and parent 2 of each line.}

\item{geno.par.clu}{Parent marker matrix without monomorphic or completely
missing markers.}

\item{haplo.map}{Genetic map corresponding to the list of markers of the
\code{geno.par.clu} object.}

\item{parents}{List of parents.}

\item{n.cr}{Number of crosses.}

\item{n.par}{Number of parents.}

\item{rem.mk}{Vector of markers that have been removed.}

\item{rem.geno}{Vector of genotypes that have been removed.}
}
\description{
Performs several quality control (QC) operations on the marker data of an
\code{mppData} object.
}
\details{
The different operations of the quality control are the following:

\enumerate{

\item{Remove markers with more than two alleles.}

\item{Remove markers that are monomorphic or fully missing in the parents.}

\item{Remove markers with a missing rate higher than \code{mk.miss}.}

\item{Remove genotypes with a missing marker rate higher than \code{gen.miss}.}

\item{Remove crosses with fewer than \code{n.lim} genotypes.}

\item{Keep only the most polymorphic marker when multiple markers map to the
same position.}

\item{Check marker minor allele frequency (MAF). Two strategies can be
used to control marker MAF:

A) A first possibility is to filter markers based on MAF at the whole
population level using \code{MAF.pop.lim}, and/or on MAF within crosses using
\code{MAF.cr.lim}.

Users can supply their own vector of critical within-cross MAF values
using \code{MAF.cr.lim}. By default, the critical values are defined by the
following function of the cross size n.ci: MAF(n.ci) = 0.5 if n.ci is in
[0, 10] and MAF(n.ci) = (4.5/n.ci) + 0.05 if n.ci > 10. This means that up
to 10 genotypes, the critical within-cross MAF is set to 50\%. It then
decreases as the number of genotypes increases, approaching a lower bound of
5\%. For example, a cross of 45 genotypes gets a critical MAF of
4.5/45 + 0.05 = 0.15.

If the within-cross MAF is below the limit in at least one cross, the marker
scores of the problematic cross are either set as missing
(\code{MAF.cr.miss = TRUE}) or the whole marker is discarded
(\code{MAF.cr.miss = FALSE}). By default, \code{MAF.cr.miss = TRUE}, which
retains a larger number of markers and covers a wider genetic
diversity.

B) An alternative is to select only markers that segregate in at least
one cross at the \code{MAF.cr.lim2} rate.

}

}
}
\examples{

data(mppData_init)

mppData <- QC.mppData(mppData = mppData_init, n.lim = 15, MAF.pop.lim = 0.05,
                      MAF.cr.miss = TRUE, mk.miss = 0.1,
                      gen.miss = 0.25, verbose = TRUE)      
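
# A minimal sketch of the default within-cross MAF limit described in the
# details section. This is an illustration only, not the package's internal
# code: the helper name `default_maf_cr_lim` and the cross sizes used below
# are hypothetical.
default_maf_cr_lim <- function(n.ci) {
  # 0.5 for crosses of up to 10 genotypes, then (4.5 / n.ci) + 0.05,
  # which decreases toward 0.05 as the cross size grows
  ifelse(n.ci <= 10, 0.5, (4.5 / n.ci) + 0.05)
}

# Critical MAF for a few hypothetical cross sizes
maf_lim <- default_maf_cr_lim(c(10, 15, 45, 90))
print(maf_lim)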

}
\seealso{
\code{\link{create.mppData}}
}
\author{
Vincent Garin
}
