Package: randomSurvivalForest
Version: 3.6.1
BUILD: bld20100127

---------------------------------------------------------------------------------
CHANGES TO RELEASE 3.6.1

RELEASE 3.6.1 represents a critical update of the product
resolving platform portability on 64-bit Windows machines.  Key
changes are as follows:

o Fix to 64-bit Windows build resulting in naming conflict with
  variable reserved for implementation.  No other platforms affected.
  Thanks to Brian Ripley for this.

---------------------------------------------------------------------------------

RELEASE 3.6.0 represents the last and final major upgrade of this
product.  Current and future functionality will migrate to the new
CRAN package, Random Forests for Survival, Regression, and
Classification, that will be released in the coming months.  Key
changes to the current release are as follows:

o The ability to fully analyze competing risk data including ensemble
  estimation, error rates, and VIMP by event type.  Can also predict
  on test data.  Missing data imputation is also available. See rsf()
  and competing.risk() for details.

o Automatic variable selection using minimal depth theory implemented
  in the new function varSel().  Also see the core function
  max.subtree().

o Pairwise interactions between variables using minimal depth theory.
  See find.interaction() for more details.

o The ability to analyze mean minimal depth information by individual
  (for variable selection at the individual level). See the
  documentation for rsf() for more information.

o Impute only mode for extremely fast imputation of data.  See the
  documentation for impute.rsf() fro more details.

o The ability to export forests to a Java (JUNG) based plug-in.  This
  allows one to visualize individual trees in the forest. See the
  documentation for rsf2rfz() for more information.

o Modification to random splitting to grow deeper trees. 

o Fix of memory leak in INTERACTION mode.  Thanks to Xi Chen for
  finding this feature.  Other miscellaneous bug fixes.

---------------------------------------------------------------------------------

Release 3.5.1 represents a minor upgrade of the product, and will not
affect most users of the prior version of the product.  Key changes
are as follows:

o An adaptive upper bound for the number of splits attempted is now in
force.  It is not greater than the number of data points in a node.

---------------------------------------------------------------------------------

Release 3.5.0 represents a significant upgrade in the functionality of
the product.  Key changes are as follows:

o A random version of all computed split rules is now available.  See
the documentation for more details.

o The ability to handle factors has been implemented.  If the factor
is ordered, then splits are similar to real valued variables.  If the
factor is unordered, a split will move a subset of the levels in the
parent node to the left daughter, and the complementary subset to the
right daughter.  All possible complementary pairs are considered and
apply to factors with an unlimited number of levels.  However, for
deterministic splitting there is an optimization check to ensure that
the number of splits is not greater than the number of complementary
pairs in a node (this internal check will be overridden in random
splitting mode if nsplit is set high enough).  Note that when
predicting on test data involving factors, the factor labels in the
test data must be the same as in the grow (training) data.  Consider
setting labels that are unique in the test data to missing.

---------------------------------------------------------------------------------

Release 3.2.3 represents a critical update of the product
resolving unexpected program termination under certain conditions.
Key changes are as follows:

o In the case of missing data, it is possible for imputation to fail
catastrophically in PREDICT or INTERACTION mode.  This can occur when
the GROW data set has no missing data, but the non-GROW data set does
have missingness.  Thanks to Andy J. Minn for finding this feature.

o In the case of a single predictor, PREDICT mode can fail to
dimension the data properly causing unpredictable results.  Thanks to
Eric A. Macklin for finding this feature.

---------------------------------------------------------------------------------

Release 3.2.2 represents a critical update of the product
resolving unexpected program termination under certain conditions.
Key changes are as follows:

o In the case of missing data, it is possible for naive imputation to
fail catastrophically in PREDICT or INTERACTION mode.  This can occur
when the dimension of the mode specific data set is less than the GROW
data set.  Thanks to Andy J. Minn for finding this feature.

---------------------------------------------------------------------------------
 
Release 3.2.1 represents a minor upgrade of the product, and will not
affect most users of the prior version of the product.  Key changes
are as follows:

o An additional option in INTERACTION mode can now be specified.  The
option 'rough' can significantly decrease the processing time
necessary to determine variable importance for some large data sets.
This is accomplished by replacing the cumulative hazard function that
is based on the Nelson-Aalen estimate with the mean death time in a
node as the measure of mortality.  See the documentation for more
details.

o A minor but longstanding (since Release 1.0.0) issue in booking
potential split points has been discovered.  The problem manifests
itself by slightly favoring splits on continuous covariates under
certain conditions such as low values of 'mtry'.  However, it is not
expected that users will notice significant changes.  The issue has
been resolved.

---------------------------------------------------------------------------------

Release 3.2.0 represents a significant upgrade in the functionality of
the product.  Key changes are as follows:

o A second method of perturbing the data set in order to calculate
variable importance (VIMP) has been implemented.  In addition to
permuting the values for a single variable, a random split approach
has been taken in which a data point is randomly assigned to the left
or right daughter node when a split occurs on the specified variable.

o The joint VIMP among multiple variables of a (potentially proper)
subset of the GROW data can now be calculated using the new function
interaction.rsf().  This represents a third mode of operation
(INTERACTION) for the application, and follows rsf.default (GROW) and
predict.rsf (PREDICT).  See the documentation for details.

o An additional option in GROW mode can now be specified.  The option
'varUsed' allows users to quantify which variables have been split
upon within a single tree or over the entire forest.  See the
documentation for more details.

o The ability to multiply impute data has been implemented.  This
involves imputing data while growing a forest and using the results to
grow a new forest in order to better impute the data.

o In GROW mode, the application now outputs both the in-bag and OOB
summary imputed values.

o An additional split rule 'randomsplit' has been implemented.
See the documentation for more details.

o The split rule 'logrankscore' is now calculated correctly.  

o The split rule 'logrankapprox' has been removed and replaced by
the new split rule 'logrankrandom'.  See the documentation for more details.

---------------------------------------------------------------------------------

Release 3.0.1 represents a minor upgrade of the product, and will not affect
most users of the prior version of the product.  Key changes are as follows:

o A slight adjustment has been made to the variance used in the "logrankapprox"
splitting rule.  This fixes the issue where a sizable fraction of trees
in the forest were being stumped in some examples.

o Illegal syntax fixes to C-code that manifest themselves on some compilers.
  Thanks to Brian Ripley for pointing this out.

---------------------------------------------------------------------------------

Release 3.0.0 represents a major upgrade in the functionality of the product.
Key changes are as follows:

o Missing data can be imputed in both GROW and PREDICT mode.  This
  applies to variables as well as time and censoring outcome values.
  Values are imputed dynamically as the tree is grown using a new tree
  imputation methodology.  This produces an imputed forest which can be
  used for prediction purposes on test data sets with missing data.

o Importance values for variables are returned in PREDICT mode when test
  data contains outcomes as well as variables.

o Fixed some bugs in plot.variable().  Thanks to Andy J. Minn for pointing this out.

o Minor modification of PMML representation of RSF forest output to accomodate
  imputation.  The method of random seed chain recovery has been altered.
  Note that forests produced with prior releases will have to be 
  regenerated using this release.  We apologize for the inconvenience.

---------------------------------------------------------------------------------

Release 2.1.0 represents a minor upgrade of the product, and will not affect
most users of the prior version of the product.  Key changes are as follows:

o R 2.5.0 compliance issues and necessitated modifications.

o Modification of PMML representation of RSF forest output.  The RSF custom
  extension has been moved from the DataDictionary node to a new
  MiningBuildTask node.  Note that forests produced with Release 2.0.0 will
  have to be regenerated using Release 2.1.0.  We apologize for the 
  inconvenience.

o Fast processing of data involving large numbers of predictors (as in
  many genomic examples) by using the option big.data=TRUE.  This
  option bypasses the huge overhead needed by R in creating design
  matrices and parsing formula.  However, users should be aware of
  some side effects.  See the RSF help file for more details.  Thanks
  to Steven (Xi) Chen for pointing out the problem.

o Only the top 100 predictors are now printed to the terminal when
  calling plot.error().  This deals with settings as above when one
  might have thousands of predictors.

o Introduced a new wrapper "find.interaction()" for testing of
  pairwise interactions between predictors.  

---------------------------------------------------------------------------------

Release 2.0.0 represents a major upgrade in the functionality and stability
of the original 1.0.0 release.  Key changes are as follows:

o Two new splitting rules, 'logrankscore' and 'logrankapprox', added.  

o Expanded output from 'rsf()'.  Now out-of-bag objects 'oob.ensemble' and
  'oob.mortality' are included in addition to the full ensemble objects 
  'ensemble' and 'mortality'.

o Importance values for predictors can now be calculated (set 'importace = TRUE' 
  in the initial 'rsf()' call).  Extended 'plot.error()' to print, as well as plot,
  such values.

o Prediction on test data can now be implemented using 'rsf.predict()' (set
  'forest = TRUE' in the initial 'rsf()' call).

o Included option 'predictorWt' used for weighted sampling of predictors when
  growing a tree.

o Formula no longer restricted to main effects.  Formula for 'rsf' interpreted
  as in typical R applications.  However, users should be aware that including
  interactions or higher order terms in a formula may not be an optimal way to
  grow a forest.

o Three types of objects are generated in an RSF analysis: '(rsf, grow)', 
  '(rsf, predict)' and '(rsf, forest)'.  Wrappers handle each type of object
  in different ways.

o Improved error checking in all wrappers.

o Extended 'plot.variable()' wrapper to generate parial plots for predictors.

o Improved control over trace output.  See the 'do.trace' option in 'rsf()'.

o Implements the Predictive Model Markup Language specification for an 
  '(rsf, forest)' forest object.  PMML is an XML based language which
  provides a way for applications to define statistical and data mining
  models and to share models between PMML compliant applications.  More
  information about PMML and the Data Mining Group can be found at
  http://www.dmg.org.  Our implementation gives the user the ability to
  save the geometry of a forest as a PMML XML document for export or
  later retrieval.

---------------------------------------------------------------------------------
