% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/individualQC.R
\name{run_check_relatedness}
\alias{run_check_relatedness}
\title{Run PLINK IBD estimation}
\usage{
run_check_relatedness(
  indir,
  name,
  qcdir = indir,
  highIBDTh = 0.185,
  mafThRelatedness = 0.1,
  path2plink = NULL,
  genomebuild = "hg19",
  showPlinkOutput = TRUE,
  verbose = FALSE
)
}
\arguments{
\item{indir}{[character] /path/to/directory containing the basic PLINK data
files name.bim, name.bed and name.fam.}

\item{name}{[character] Prefix of PLINK files, i.e. name.bed, name.bim,
name.fam.}

\item{qcdir}{[character] /path/to/directory to save name.genome as returned
by plink --genome. The user needs write permission for qcdir. By default,
qcdir=indir.}

\item{highIBDTh}{[double] Threshold for the acceptable proportion of IBD
between pairs of individuals; only pairwise relationship estimates larger than
this threshold will be recorded.}

\item{mafThRelatedness}{[double] Threshold of minor allele frequency filter
for selecting variants for IBD estimation.}

\item{path2plink}{[character] Absolute path to the PLINK executable
(\url{https://www.cog-genomics.org/plink/1.9/}), i.e.
plink should be accessible as path2plink -h. The full name of the executable
should be specified: for Windows, this means path/plink.exe; for Unix
platforms, this is path/plink. If not provided, it is assumed that PATH is set
up correctly and PLINK will be found by \code{\link[sys]{exec}}('plink').}

\item{genomebuild}{[character] Name of the genome build of the PLINK file
annotations, i.e. mappings in the name.bim file. Will be used to remove
high-LD regions based on the coordinates of the respective build. Options
are hg18, hg19 and hg38. See Details.}

\item{showPlinkOutput}{[logical] If TRUE, plink log and error messages are
printed to standard out.}

\item{verbose}{[logical] If TRUE, progress info is printed to standard out.}
}
\description{
Run LD pruning on the dataset with plink --exclude range highldfile
--indep-pairwise 50 5 0.2, where highldfile contains regions of high LD as
provided by Anderson et al. (2010) Nature Protocols. Subsequently, plink
--genome is run on the LD-pruned, MAF-filtered data. plink --genome
calculates identity by state (IBS) for each pair of individuals based on the
average proportion of alleles shared at genotyped SNPs. The degree of recent
shared ancestry, i.e. the identity by descent (IBD), can be estimated from the
genome-wide IBS. The proportion of IBD between two individuals is returned by
--genome as PI_HAT.
}
\details{
Both \code{\link{run_check_relatedness}} and its evaluation via
\code{\link{evaluate_check_relatedness}} can simply be invoked by
\code{\link{check_relatedness}}.

The IBD estimation is conducted on LD-pruned data; in a first step, high-LD
regions are excluded. The regions were derived from the high-LD-regions file
provided by Anderson et al. (2010) Nature Protocols. These regions are in
NCBI36 (hg18) coordinates and were lifted to GRCh37 (hg19) and GRCh38 (hg38)
coordinates using the liftOver tool available here:
\url{https://genome.ucsc.edu/cgi-bin/hgLiftOver}. The 'Minimum ratio of bases
that must remap' was set to 0.5 and the 'Allow multiple output regions' box
was ticked; for all other parameters, the default options were selected.
LiftOver files were generated on July 9, 2019. The commands for formatting
the files are provided in system.file("extdata", 'liftOver.cmd',
package="plinkQC").
}
\examples{
indir <- system.file("extdata", package="plinkQC")
name <- 'data'
qcdir <- tempdir()
# the following code is not run on package build, as the path2plink on the
# user system is not known.
\dontrun{
run <- run_check_relatedness(indir=indir, qcdir=qcdir, name=name)
}
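
# A minimal sketch (not part of the package workflow) of how PI_HAT relates to
# the IBD sharing probabilities Z1 = P(IBD=1) and Z2 = P(IBD=2) reported by
# plink --genome. The data frame below holds made-up toy values for
# illustration only; it is not output from the example data, and the object
# names toy_genome and toy_related are hypothetical.
toy_genome <- data.frame(
  IID1 = c("A", "A", "B"),
  IID2 = c("B", "C", "C"),
  Z0 = c(0.0, 0.5, 1.0),
  Z1 = c(1.0, 0.5, 0.0),
  Z2 = c(0.0, 0.0, 0.0)
)
# PI_HAT = P(IBD=2) + 0.5 * P(IBD=1)
toy_genome$PI_HAT <- toy_genome$Z2 + 0.5 * toy_genome$Z1
# Keep only pairs whose estimated IBD proportion exceeds the default highIBDTh
toy_related <- toy_genome[toy_genome$PI_HAT > 0.185, ]
print(toy_related)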
}
