MLModel is a function supplied by the MachineShop package. It allows for the integration of statistical and machine learning models supplied by other R packages with the MachineShop model fitting, prediction, and performance assessment tools.
The following are guidelines for writing model constructor functions that are wrappers around the MLModel function.
In this context, the term “constructor” refers to the wrapper function and “source package” to the package supplying the original model implementation.
The constructor should produce a valid model if called without any arguments; i.e., not have any required arguments.
The source package defaults will be used for parameters with NULL values.
Model formula, data, and weights are separate from model parameters and should not be defined as constructor arguments.
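One way to meet the first two guidelines is to default every constructor argument to NULL and drop the NULL entries before they reach the source package, so that its own defaults apply. The sketch below is a base-R approximation of that idea and not the MachineShop implementation; `drop_null_params` and `custom_params` are hypothetical names, and the `ntree`, `mtry`, and `nodesize` parameters are illustrative only.

```r
# Hypothetical helper: keep only the arguments that were given non-NULL values.
drop_null_params <- function(envir) {
  args <- as.list(envir)
  args[!vapply(args, is.null, logical(1))]
}

# Hypothetical constructor-style function with no required arguments.
# Formula, data, and weights are deliberately not among its arguments.
custom_params <- function(ntree = NULL, mtry = NULL, nodesize = NULL) {
  drop_null_params(environment())
}

custom_params()             # empty list, so source package defaults would apply
custom_params(ntree = 100)  # list(ntree = 100)
```

Defaulting to NULL keeps the constructor callable without arguments while leaving the choice of default values to the source package.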
Include all packages whose functions are called directly from within the constructor.
Use :: or ::: to reference source package functions.
Include all response types ("binary", "factor", "matrix", "numeric", "ordered", and/or "Surv") that can be analyzed with the model.
Obtain the parameter list with MachineShop:::params(environment()) if all arguments are to be passed to the source package fit function as supplied. Additional steps may be needed to pass the constructor arguments to the source package in a different format; e.g., when some model parameters must be passed in a control structure, as in C50Model and CForestModel.
Provide a function with a data argument that represents a model frame and returns its number of analytic predictor variables.
The first three arguments of the fit function should be formula, data, and weights followed by an ellipsis (...).
If weights are not supported, the following should be included in the function:
Only add elements to the resulting fit object if they are needed and will be used in the predict or varimp functions.
Return the fit object.
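The fit function guidelines above can be sketched in base R as follows. This is a minimal illustration, assuming `stats::lm` as a stand-in for a source package fit function that supports case weights; `fit_sketch` is a hypothetical name and is not part of MachineShop.

```r
# Hypothetical fit function; stats::lm stands in for a source package function.
fit_sketch <- function(formula, data, weights, ...) {
  # Point the formula at this frame so that lm() can find the weights argument
  # when it builds the model frame.
  environment(formula) <- environment()
  model_fit <- stats::lm(formula, data = as.data.frame(data),
                         weights = weights, ...)
  # Return the fit object as is; nothing extra is attached to it.
  model_fit
}

model_fit <- fit_sketch(mpg ~ wt + hp, data = mtcars,
                        weights = rep(1, nrow(mtcars)))
coef(model_fit)
```

Resetting the formula environment is one way to work around the non-standard evaluation of `weights` in `lm()`; other source packages may accept the weights vector directly.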
The predict function arguments are a model fit object, a newdata frame, optionally times for prediction at survival time points, and an ellipsis.
The predict function should return one of the following, depending on the response type: a vector or column matrix of probabilities for the second level of binary factors; a matrix whose columns contain the probabilities for factors with more than two levels; a matrix of predicted responses for matrix responses; a vector or column matrix of predicted responses for numeric responses; a matrix whose columns contain survival probabilities at times if supplied; or a vector of survival predictions if times are not supplied.
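For the binary case, a predict function might look like the sketch below. It assumes a logistic regression fitted with `stats::glm`, for which `type = "response"` yields the probability of the second factor level; `predict_sketch` is a hypothetical name, and the optional times argument is left out because it applies only to survival responses.

```r
# Hypothetical predict function for a binary response fitted with stats::glm.
predict_sketch <- function(object, newdata, ...) {
  # For a binomial glm with a factor response, the first level is treated as
  # failure, so these are probabilities of the second level.
  stats::predict(object, newdata = as.data.frame(newdata), type = "response")
}

# Recode the 0/1 engine shape variable as a two-level factor.
train <- transform(mtcars, vs = factor(vs, labels = c("V", "S")))
model_fit <- stats::glm(vs ~ mpg, data = train, family = stats::binomial())

probs <- predict_sketch(model_fit, newdata = head(train))
probs
```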
The varimp function should have a single model fit object argument followed by an ellipsis.
Variable importance results should generally be returned as a vector with elements named after the corresponding predictor variables. The package will handle conversions to a data frame and VarImp object. If there is more than one set of relevant variable importance measures, they can be returned as a matrix or data frame with predictor variable names as the row names.
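A varimp function returning a named vector could be sketched as below. The choice of absolute t statistics from a `stats::lm` fit is an assumption made for illustration only, as is the name `varimp_sketch`; with numeric predictors the coefficient names coincide with the predictor variable names.

```r
# Hypothetical varimp function: absolute t statistics from a stats::lm fit.
varimp_sketch <- function(object, ...) {
  coef_table <- stats::coef(summary(object))
  # Drop the intercept so that only predictor variables remain.
  keep <- rownames(coef_table) != "(Intercept)"
  importance <- abs(coef_table[keep, "t value"])
  # Set names explicitly; they are dropped when only one predictor remains.
  names(importance) <- rownames(coef_table)[keep]
  importance
}

model_fit <- stats::lm(mpg ~ wt + hp, data = mtcars)
vi <- varimp_sketch(model_fit)
vi
```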
Include the first sentences of the parameter descriptions from the source package.
Start sentences with the parameter value type (logical, numeric, character, etc.).
Start sentences with lowercase.
Omit indefinite articles (a, an, etc.) from the starting sentences.
Include response types (binary, factor, matrix, numeric, ordered, and/or Surv).
Include the following sentence:
Default values for the arguments and further model details can be found in the source link below.
Describe the return value as an MLModel class object.
Link to the source package fit function and related package functions: \code{\link[<source package>]{<fit function>}}, \code{\link{fit}}, \code{\link{resample}}, \code{\link{tune}}
If adding a new model to the package, save its source code in a file whose name begins with “ML_” followed by the model name and ends with a .R extension; e.g., "R/ML_CustomModel.R".
Export the model in NAMESPACE.
Add any required packages to the “Suggests” section of DESCRIPTION.
Add the model to R/MachineShop-package.R.
Add the model to R/modelinfo.R.
Add a unit testing file to tests/testthat.
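The documentation conventions described earlier might come together as in the sketch below, which would live in a file such as "R/ML_CustomModel.R". It assumes roxygen2-style comments, which the text does not confirm, and uses a hypothetical `CustomModel` constructor with `stats::glm` standing in for the source package. The two parameter descriptions are adapted from the `stats::glm.control` help page, and the constructor body is omitted.

```r
#' Custom Model
#'
#' Hypothetical model wrapper; stats::glm stands in for the source package.
#'
#' @param maxit integer giving the maximal number of IWLS iterations.
#' @param trace logical indicating if output should be produced for each
#'   iteration.
#'
#' @details
#' Response types: \code{binary}, \code{numeric}
#'
#' Default values for the arguments and further model details can be found in
#' the source link below.
#'
#' @return \code{MLModel} class object.
#'
#' @seealso \code{\link[stats]{glm}}, \code{\link{fit}},
#' \code{\link{resample}}, \code{\link{tune}}
#'
CustomModel <- function(maxit = NULL, trace = NULL) {
  # Constructor body omitted in this sketch; it would wrap MLModel().
  stop("not implemented in this sketch")
}
```

Each parameter description starts in lowercase with the value type and omits the leading indefinite article.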