% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/optim_tdv_simul_anne.R
\name{optim_tdv_simul_anne}
\alias{optim_tdv_simul_anne}
\title{Total Differential Value optimization using a Simulated Annealing (and GRASP)
algorithm(s)}
\usage{
optim_tdv_simul_anne(
  m_bin,
  k,
  p_initial = NULL,
  n_runs = 10,
  n_sol = 1,
  t_inic = 0.3,
  t_final = 1e-06,
  alpha = 0.05,
  n_iter = 1000,
  use_grasp = TRUE,
  thr = 0.95,
  full_output = FALSE
)
}
\arguments{
\item{m_bin}{A matrix. A phytosociological table of 0s (absences) and
1s (presences), where rows correspond to taxa and columns correspond to
relevés.}

\item{k}{A numeric giving the number of desired groups.}

\item{p_initial}{A vector of integer numbers with the partition of the
relevés (i.e., a \code{k}-partition, consisting of a vector with values from 1
to \code{k}, with length equal to the number of columns of \code{m_bin}, ascribing
each relevé to one of the \code{k} groups), to be used as the initial partition in
the Simulated Annealing. For a random partition use \code{p_initial = "random"}.
This argument is ignored if \code{use_grasp = TRUE}.}

\item{n_runs}{A numeric giving the number of runs. Defaults to 10.}

\item{n_sol}{A numeric giving the number of best solutions to keep in the
final output (only used if \code{full_output} is \code{FALSE}; if \code{full_output} is
\code{TRUE} all runs will produce an output). Defaults to 1.}

\item{t_inic}{A numeric giving the initial temperature. Must be greater
than 0 and at most 1. Defaults to 0.3.}

\item{t_final}{A numeric giving the final temperature. Must be bounded
between 0 and 1. Usually very low values are needed to ensure convergence.
Defaults to 0.000001.}

\item{alpha}{A numeric giving the fraction of temperature drop to be used
in the temperature reduction scheme (see Details). Must be bounded between
0 and 1. Defaults to 0.05.}

\item{n_iter}{A numeric giving the number of iterations. Defaults to 1000.}

\item{use_grasp}{A logical. Defaults to \code{TRUE}. If \code{TRUE}, a GRASP is used
to obtain the initial partitions for the Simulated Annealing. If \code{FALSE},
the user should provide an initial partition or use
\code{p_initial = "random"} for a random one.}

\item{thr}{A numeric giving a threshold value (from 0 to 1) with the
probability used to compute the sample quantile, in order to get the best
\code{m_bin} columns from which to select one to be included in the GRASP
solution (in each step of the procedure). Only needed if \code{use_grasp} is
\code{TRUE}.}

\item{full_output}{A logical. Defaults to \code{FALSE}. If \code{TRUE} extra
information is presented in the output. See Value.}
}
\value{
If \code{full_output = FALSE} (the default), a list with the following
components (the GRASP component is only returned if \code{use_grasp = TRUE}):

\describe{
\item{GRASP}{A list with at most \code{n_sol} components, each one
also containing a list with two components:
\describe{
\item{par}{A vector with the partition of highest TDV obtained by
GRASP;}
\item{tdv}{A numeric with the TDV of \code{par}.}
}
}
\item{SANN}{A list with at most \code{n_sol} components, each one also
containing a list with two components:
\describe{
\item{par}{A vector with the partition of highest TDV obtained by the
(GRASP +) SANN algorithm(s);}
\item{tdv}{A numeric with the TDV of \code{par}.}
}
}
}

If \code{full_output = TRUE}, a list with the following components (the GRASP
component is only returned if \code{use_grasp = TRUE}):

\describe{
\item{GRASP}{A list with \code{n_runs} components, each one also containing a
list with two components:
\describe{
\item{par}{A vector with the partition of highest TDV obtained by
GRASP.}
\item{tdv}{A numeric with the TDV of \code{par}.}
}
}
\item{SANN}{A list with \code{n_runs} components, each one also containing a
list with six components:
\describe{
\item{current.tdv}{A vector of length \code{n_iter} with the current TDV of
each SANN iteration.}
\item{alternative.tdv}{A vector of length \code{n_iter} with the alternative
TDV used in each SANN iteration.}
\item{probability}{A vector of length \code{n_iter} with the probability
used in each SANN iteration.}
\item{temperature}{A vector of length \code{n_iter} with the temperature of
each SANN iteration.}
\item{par}{A vector with the partition of highest TDV obtained by the
(GRASP +) SANN algorithm(s).}
\item{tdv}{A numeric with the TDV of \code{par}.}
}
}
}
}
\description{
This function searches for \code{k}-partitions of the columns of a given matrix
(i.e., a partition of the columns in \code{k} groups), optimizing the Total
Differential Value (TDV) using a stochastic global optimization method
called the Simulated Annealing (SANN) algorithm. Optionally, a Greedy
Randomized Adaptive Search Procedure (GRASP) can be used to find an initial
partition (seed) to be passed to the SANN algorithm.
}
\details{
Given a phytosociological table (\code{m_bin}, with rows corresponding to
taxa and columns corresponding to relevés) this function searches for a
\code{k}-partition (\code{k}, defined by the user) optimizing TDV, i.e., searches,
using a SANN algorithm (optionally working upon GRASP solutions), for the
global maximum of TDV (by rearranging the relevés into \code{k} groups).

This function uses two main algorithms:
\enumerate{
\item An optional GRASP, which is used to obtain initial solutions
(partitions of \code{m_bin}) using function \code{\link[=partition_tdv_grasp]{partition_tdv_grasp()}}.
Such initial solutions are then submitted to the SANN algorithm.
\item The (main) SANN algorithm, which is used to search for the global
maximum of TDV. The initial partition for each run of SANN can be a
partition obtained from GRASP (if \code{use_grasp = TRUE}) or (if
\code{use_grasp = FALSE}) a partition given by the user (using \code{p_initial}) or
a random partition (using \code{p_initial = "random"}).
}

The SANN algorithm decreases the temperature by multiplying the current
temperature by \code{1 - alpha} according to a predefined schedule, which is
automatically calculated from the given values for \code{t_inic}, \code{t_final},
\code{alpha} and \code{n_iter}.
Specifically, the cooling schedule is obtained by calculating the number of
times that the temperature has to be decreased in order to approximate
\code{t_final} starting from \code{t_inic}. The number of times that the temperature
decreases, say \code{nt}, is calculated by the expression:

\verb{floor(n_iter/((n_iter * log(1 - alpha)) / (log((1 - alpha) * t_final / }
\verb{t_inic))))}.

Finally, these decreasing stages are scattered through the desired
iterations (\code{n_iter}) homogeneously, by calculating the indices of the
iterations that will experience a decrease in temperature using
\code{floor(n_iter / nt * (1:nt))}.

SANN is often seen as an exploratory technique where the temperature
settings are challenging and dependent on the problem. This function tries
to restrict temperature values taking into account that TDV is always
between 0 and 1. Even so, obtaining values of temperature that allow
convergence can be challenging. \code{full_output = TRUE} allows the user to
inspect the behaviour of \code{current.tdv} and check if convergence fails.
Generally, convergence failure can be spotted when final SANN TDV values
are similar to the initial \code{current.tdv}, especially when coming from random
partitions. In such cases, as a rule of thumb, it is advisable to decrease
\code{t_final}.
}
\examples{
# Getting the Taxus baccata forests data set
data(taxus_bin)

# Removing taxa occurring in only one relevé in order to
# reproduce the example in the original article of the data set
taxus_bin_wmt <- taxus_bin[rowSums(taxus_bin) > 1, ]

# Obtaining a partition that maximizes TDV using the Simulated Annealing
# algorithm
result <- optim_tdv_simul_anne(
  m_bin = taxus_bin_wmt,
  k = 3,
  p_initial = "random",
  n_runs = 5,
  n_sol = 5,
  use_grasp = FALSE,
  full_output = TRUE
)

# Inspect the result
# The TDV of each run
sapply(result[["SANN"]], function(x) x$tdv)
# The best partition that was found (i.e., with highest TDV)
result[["SANN"]][[1]]$par

# A TDV of 0.1958471 indicates you are probably reproducing the three
# groups (Estrela, Gerês and Galicia) from the original article. A solution
# with TDV = 0.2005789 might also occur, but note that one group has only two
# elements. For now, a minimum group size is not implemented in function
# optim_tdv_simul_anne() as it is in the function optim_tdv_hill_climb().

# Inspect how the optimization progressed (should increase towards the right)
plot(
  result[["SANN"]][[1]]$current.tdv,
  type = "l",
  xlab = "Iteration number",
  ylab = "TDV of the currently accepted solution"
)
for (run in 2:length(result[["SANN"]])) {
  lines(result[["SANN"]][[run]]$current.tdv)
}
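
# A rough convergence check, following the rule of thumb given in Details:
# compare the first and the last value of current.tdv in each run. The name
# tdv_gain is hypothetical and used only for illustration. Gains close to
# zero suggest that a run did not converge, in which case decreasing t_final
# may help.
tdv_gain <- sapply(
  result[["SANN"]],
  function(x) x$current.tdv[length(x$current.tdv)] - x$current.tdv[1]
)
tdv_gain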

# Plot the sorted (or tabulated) phytosociological table, using the best
# partition that was found
tabul <- tabulation(
  m_bin = taxus_bin_wmt,
  p = result[["SANN"]][[1]]$par,
  taxa_names = rownames(taxus_bin_wmt),
  plot_im = "normal"
)
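
# A minimal sketch (not the package's internal code) of the cooling schedule
# described in Details, using the default argument values. All sketch_* names
# are hypothetical; exactly how the function applies each temperature drop
# internally is an assumption made here for illustration.
sketch_t_inic <- 0.3
sketch_t_final <- 1e-06
sketch_alpha <- 0.05
sketch_n_iter <- 1000

# Number of temperature drops (nt), using the expression given in Details
sketch_nt <- floor(
  sketch_n_iter / ((sketch_n_iter * log(1 - sketch_alpha)) /
    (log((1 - sketch_alpha) * sketch_t_final / sketch_t_inic)))
)

# Indices of the iterations at which the temperature is decreased
sketch_drop_at <- floor(sketch_n_iter / sketch_nt * (1:sketch_nt))

# Temperature at each iteration, assuming that each scheduled drop multiplies
# the current temperature by (1 - alpha)
sketch_n_drops <- cumsum(tabulate(sketch_drop_at, nbins = sketch_n_iter))
sketch_temp <- sketch_t_inic * (1 - sketch_alpha)^sketch_n_drops

# The temperature should start at t_inic and end close to t_final
range(sketch_temp)
plot(
  sketch_temp,
  type = "l",
  log = "y",
  xlab = "Iteration number",
  ylab = "Temperature (sketch)"
)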

}
\author{
Jorge Orestes Cerdeira and Tiago Monteiro-Henriques.
E-mail: \email{tmh.dev@icloud.com}.
}
