% Generated by roxygen2 (4.1.1): do not edit by hand
% Please edit documentation in R/vtreat.R
\name{designTreatmentsN}
\alias{designTreatmentsN}
\title{Build all treatments for a data frame to predict a numeric outcome}
\usage{
designTreatmentsN(dframe, varlist, outcomename, ..., weights = c(),
  minFraction = 0.02, smFactor = 0, rareCount = 2, rareSig = 0.3,
  maxMissing = 0.04, collarProb = 0, returnXFrame = FALSE,
  scale = FALSE, doCollar = TRUE, verbose = TRUE,
  parallelCluster = NULL)
}
\arguments{
\item{dframe}{Data frame to learn treatments from (training data), must have at least 1 row.}

\item{varlist}{Names of columns to treat (effective variables).}

\item{outcomename}{Name of column holding outcome variable. dframe[[outcomename]] must contain only finite, non-missing values, and there must be a cut such that at least two values lie above the cut and at least two lie below it.}

\item{...}{no additional arguments, declared to force named binding of later arguments}

\item{weights}{optional training weights for each row}

\item{minFraction}{optional minimum frequency a categorical level must have to be converted to an indicator column.}

\item{smFactor}{optional smoothing factor for impact coding models.}

\item{rareCount}{optional integer, suppress direct effects of levels with this count or fewer.}

\item{rareSig}{optional numeric, suppress direct effects of levels with a significance value greater than this.  Set to one to turn off effect.}

\item{maxMissing}{optional maximum fraction (by data weight) of a categorical variable that is allowed before switching from indicators to impact coding.}

\item{collarProb}{what fraction of the data (pseudo-probability) to collar data at (<0.5).}

\item{returnXFrame}{optional, if TRUE return the out-of-sample transformed frame.}

\item{scale}{optional logical, controls scaling for scoring and returnXFrame.}

\item{doCollar}{optional logical, controls collaring for scoring and returnXFrame.}

\item{verbose}{if TRUE print progress.}

\item{parallelCluster}{(optional) a cluster object created by package parallel or package snow}
}
\value{
treatment plan (for use with prepare)
}
\description{
Function to design variable treatments for prediction of a
numeric outcome.  Data frame is assumed to have only atomic columns
except for dates (which are converted to numeric).
Note: each column is processed independently of all others.
}
\details{
The main fields of the returned treatment plan are mostly vectors with names (all with the same names in the same order):

- vars : (character array without names) names of variables (in same order as names on the other diagnostic vectors)
- varMoves : logical, TRUE if the variable varied during hold-out scoring; only variables that move will be in the treated frame
- sig : an estimate of the significance of the effect

See the vtreat vignette for a bit more detail and a worked example.
}
\examples{
dTrainN <- data.frame(x=c('a','a','a','a','b','b','b'),
    z=c(1,2,3,4,5,6,7),y=c(0,0,0,1,0,1,1))
dTestN <- data.frame(x=c('a','b','c',NA),
    z=c(10,20,30,NA))
treatmentsN <- designTreatmentsN(dTrainN,colnames(dTrainN),'y')
dTrainNTreated <- prepare(treatmentsN,dTrainN,pruneSig=0.99)
dTestNTreated <- prepare(treatmentsN,dTestN,pruneSig=0.99)
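# A hedged sketch, not part of the original examples: inspect the
# diagnostic fields described in Details. This assumes the treatment
# plan is a list-like object whose fields can be read with `$`;
# a field that is absent would simply print NULL.
print(treatmentsN$vars)
print(treatmentsN$varMoves)
print(treatmentsN$sig)
# Arguments after `...` must be passed by name, as the usage shows.
# Here verbose is named explicitly to silence progress output.
treatmentsNQuiet <- designTreatmentsN(dTrainN,c('x','z'),'y',
    verbose=FALSE)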
}
\seealso{
\code{\link{prepare}} \code{\link{designTreatmentsC}}
}

