% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/nestfs-package.R
\docType{package}
\name{nestfs-package}
\alias{nestfs}
\alias{nestfs-package}
\title{Cross-validated (nested) forward selection}
\description{
This package provides an implementation of forward selection based on linear
and logistic regression which adopts cross-validation as a core component of
the selection procedure.
}
\details{
The engine of the package is \code{\link[=fs]{fs()}}, whose aim is
to select a set of variables out of those available in the dataset. The
selection of variables can be done according to two main criteria:
by paired-test p-value or by largest decrease in validation log-likelihood.
A combined criterion is also available.

The role of \code{\link[=nested.fs]{nested.fs()}} is to allow the
evaluation of the selection method by providing an unbiased estimate of the
performance of the selected variables on withdrawn data.

Forward selection is an inherently slow approach, as for each variable a
model needs to be fitted. In our implementation, this issue is further
aggravated by the fact that an inner cross-validation happens at each
iteration, with the aim of guiding the selection towards variables that
have better generalization properties.

The code is parallelized over the inner folds, thanks to the \strong{parallel}
package. User time therefore depends on the number of available cores, but
there is no advantage in using more cores than inner folds. The number of
cores assigned to computations must be registered before starting by setting
the \code{"mc.cores"} option.
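
As a minimal sketch, the option can be set once at the start of a session,
before any selection is run. The value of 2 used here is only an illustrative
assumption, and should be adapted to the machine and to the number of inner
folds in use:
\preformatted{
# register two cores for the parallel computations (illustrative value)
options(mc.cores = 2)

# check the value currently registered
getOption("mc.cores")
}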

The main advantage of forward selection is that it provides an immediately
interpretable model, and the panel of variables obtained is in some sense
the least redundant one, particularly if the number of variables to choose
from is not too large (in our experience, up to about 30-40 variables).

However, when the number of variables is much larger than that, forward
selection, besides being unbearably slow, may be more subject to
overfitting, which is in the nature of its greedy-like design. These
undesirable effects can be somewhat remedied by applying some filtering
(see the \code{num.filter} argument to \code{\link[=fs]{fs()}}), thus
reducing the number of variables entering the selection phase.
}
\seealso{
Useful links:
\itemize{
  \item \url{https://github.com/mcol/nestfs}
  \item Report bugs at \url{https://github.com/mcol/nestfs/issues}
}

}
\author{
Marco Colombo \email{mar.colombo13@gmail.com}
}
