% Generated by roxygen2 (4.1.1): do not edit by hand
% Please edit documentation in R/cpt.R, R/inputCPT.R
\name{cpt}
\alias{cpt}
\alias{cpt.formula}
\alias{cpt.list}
\alias{inputCPT}
\alias{inputCPT.formula}
\alias{inputCPT.list}
\title{Compute a conditional probability table for a factor given other factors}
\usage{
cpt(x, data, wt, ...)

\method{cpt}{formula}(formula, data, wt, ...)

\method{cpt}{list}(x, data, wt, ...)

inputCPT(x, factorLevels, reduce = TRUE, ...)

\method{inputCPT}{formula}(formula, factorLevels, reduce = TRUE, ...)

\method{inputCPT}{list}(x, factorLevels, reduce = TRUE, ...)
}
\arguments{
\item{x}{a list containing the names of the variables used to compute
the conditional probability table. See details.}

\item{data}{a data frame containing all the factors named in \code{x} or
\code{formula}.}

\item{wt}{(optional) a numeric vector of observation weights.}

\item{...}{Additional arguments to be passed to other methods.}

\item{formula}{a formula specifying the relationship between the dependent and
independent variables.}

\item{factorLevels}{(optional) a named list. The names of the list elements
are the names of the factors given in \code{x} or \code{formula}, and each
list element is a character vector containing the levels of the respective
factor. See examples.}

\item{reduce}{set to \code{TRUE} if \code{inputCPT()} is to compute probabilities
for the first level of the dependent variable as the complement of the
inputted probabilities corresponding to the other levels of the dependent
variable. For example, \code{reduce = TRUE} with a binary dependent variable
\code{y} (say, with levels \code{'no'} and \code{'yes'}) will ask for the
probabilities of \code{'yes'} at each combination of the independent variables,
and compute the probability of \code{'no'} as their respective complements.
See details.}
}
\description{
The function \code{cpt} operates on sets of factors. Specifically,
  it computes the conditional probability distribution of one of the factors
  given other factors, and stores the result in a multidimensional \code{array}.

  \code{inputCPT()} is a utility function aimed at facilitating the process of
  populating small conditional probability distributions, i.e., those for which
  the response variable doesn't have too many levels, there are relatively few
  independent variables, and the independent variables also don't have too many
  levels.
}
\details{
If a \code{formula} object is passed as the first argument, the
  formula must have the following structure: \emph{response ~ var1 + var2 + etc.}
  The other option is to pass a named \code{list} containing two elements \code{y}
  and \code{x}. Element \code{y} is a character string containing the name of the
  factor variable in \code{data} to be used as the dependent variable, and
  element \code{x} is a character vector containing the name(s) of the factor
  variable(s) to be used as independent (or conditioning) variables.

  In \code{inputCPT()}, when the parameter \code{reduce} is set to \code{FALSE},
  any non-negative number (e.g., cell counts) is accepted as input. Conditional
  probabilities are then calculated via a normalization procedure. However, when
  \code{reduce} is set to \code{TRUE}, (a) only probabilities in [0,1] are
  accepted, and (b) the inputted probabilities for each combination of
  independent variable values must sum to at most 1 (otherwise the calculated
  probability for the first level of the dependent variable would be negative).

  The \code{cpt()} function with a weight vector passed to parameter \code{wt}
  works analogously to \code{inputCPT(reduce = FALSE)}, i.e., it accepts any
  non-negative vector, and computes the conditional probability array by
  normalizing sums of weights.
}
\examples{
# a very imbalanced dice example

n <- 50000
data <- data.frame(
  di1 = as.factor(1:6 \%*\% rmultinom(n,1,prob=c(.4,.3,.15,.10,.03,.02))),
  di2 = as.factor(1:6 \%*\% rmultinom(n,1,prob=rev(c(.4,.3,.15,.10,.03,.02)))),
  di3 = as.factor(1:6 \%*\% rmultinom(n,1,prob=c(.15,.10,.02,.3,.4,.03)))
)

cpt1 <- cpt(di3 ~ di1 + di2, data)
cpt1[di1 = 1, di2 = 4, ]  # Pr(di3 | di1 = 1, di2 = 4)
cpt1["1","4",]
cpt1[1,4,]

plyr::aaply(cpt1, c(1,2), sum) # card(di1)*card(di2) matrix of ones

l <- list(y = "di3", x = c("di1","di2"))
all(cpt(l, data) == cpt1)
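
# A minimal base-R sketch of the weighted case described in the details:
# sums of weights are normalized within each level of the conditioning
# variable. It only illustrates the arithmetic and is not necessarily how
# cpt() is implemented. The names toy, wtSums and wtCPT are hypothetical
# and exist only for this illustration.
toy <- data.frame(
  a = factor(c("lo", "lo", "hi", "hi", "lo", "hi")),
  b = factor(c("no", "yes", "no", "yes", "yes", "yes")),
  w = c(1, 3, 2, 2, 4, 6)                      # non-negative observation weights
)
wtSums <- xtabs(w ~ a + b, data = toy)         # sum of weights in each (a, b) cell
wtCPT  <- prop.table(wtSums, margin = 1)       # normalize within each level of a
wtCPT["lo", ]                                  # Pr(b | a = "lo")
rowSums(wtCPT)                                 # each row sums to 1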

\dontrun{
inputCPT(wetGrass ~ rain + morning)

inputCPT(wetGrass ~ rain + morning,
         factorLevels = list(wetGrass = c("dry","moist","VeryWet"),
                             rain     = c("nope","yep"),
                             morning  = c("NO","YES")),
         reduce = FALSE)
}
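
# A minimal base-R sketch of the arithmetic behind reduce = TRUE: the
# probabilities for all levels except the first are entered, and the first
# level is filled in as the complement. This is an illustration under
# assumed inputs, not the internals of inputCPT(). The names entered and
# redCPT, and the probabilities themselves, are hypothetical.
entered <- rbind(nope = c(moist = 0.15, VeryWet = 0.05),
                 yep  = c(moist = 0.30, VeryWet = 0.60))
# inputs must lie in [0,1] and sum to at most 1 within each row
stopifnot(all(entered >= 0), all(entered <= 1), all(rowSums(entered) <= 1))
redCPT <- cbind(dry = 1 - rowSums(entered), entered)  # complement for first level
redCPT                                                # Pr(wetGrass | rain)
rowSums(redCPT)                                       # each row sums to 1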
}
\author{
Jarrod Dalton and Benjamin Nutter
}

