News
Version Updates
2.5.0
- Add options argument to step_lincomp() and step_sbf().
- CRAN release.
2.4.3
- Add recipe step_sbf() function for variable selection by filtering.
- Inherit step_kmedoids objects from step_sbf, and refactor methods.
- Support user-specified center and scale functions.
- Append prefix to selected variable names.
- Rename tidy() column medoids to selected.
- Rename tidy() column names to name.
- Set tidy() non-selected variable names to NA.
- Add recipe step_lincomp() function for linear components variable reduction.
- Inherit step_kmeans objects from step_lincomp, and refactor methods.
- Support user-specified center and scale functions.
- Rename tidy() column names to name.
- Inherit step_spca objects from step_lincomp, and refactor methods.
- Support user-specified center and scale functions.
- Rename tidy() column value to weight.
- Rename tidy() column component to name.
- Set GBMModel distribution to bernoulli, instead of multinomial, for binary responses.
2.4.2
- Add global setting RHS.formula for listing of operators and functions allowed on right-hand side of traditional formulas.
- Add clara clustering method to step_kmedoids().
- Support Cox and accelerated failure time regression for survival responses in XGBModel, XGBDARTModel, XGBLinearModel, and XGBTreeModel.
2.4.1
- Set NNetModel linout argument automatically according to the response variable type (numeric: TRUE, other: FALSE). Previously, linout had a default value of FALSE as defined in the nnet package.
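The linout rule in the 2.4.1 entry can be illustrated with a minimal base R sketch. The helper name default_linout is hypothetical and is not part of MachineShop or nnet; it only mirrors the stated mapping (numeric: TRUE, other: FALSE).

```r
# Hypothetical helper, for illustration only; not a MachineShop or nnet function.
# Returns TRUE (linear output units) for numeric responses and FALSE otherwise.
default_linout <- function(y) {
  is.numeric(y)
}

linout_numeric <- default_linout(c(1.2, 3.4, 5.6))        # numeric response
linout_factor <- default_linout(factor(c("a", "b", "a"))) # factor response
```

Under this assumption, a numeric response selects linear output units, while a factor response keeps the nnet default of FALSE.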
2.4.0
2.3.2
- Display progress bars for sequential resampling iterations.
2.3.1
- R 4.0 data.frame compatibility updates for calibration curves.
- Fix recipe prediction with StackedModel and SuperModel.
2.3.0
- Display progress messages for any foreach parallel backend.
2.2.5
- Show all error messages when resample selection stops.
- Preserve predictor names in NNetModel fit() method.
- Fix aggregation of performance curves with infinite values.
- Add progress bar and verbose output options for resample() methods.
- Get non-negative probabilities for survival confusion matrix.
- Update Using webpages and vignette.
2.2.4
- Fix BARTMachineModel to predict highest binary response level.
- Grid tune BARTMachineModel nu parameter for numeric responses only.
2.2.3
- Extend ModeledInput() to SelectedModelFrame, SelectedModelRecipe, and TunedModelRecipe.
2.2.2
- Fix updating of recipe parameters in TunedInput().
2.2.1
- Print StackedModel and SuperModel training information.
- Fix missing case names when resampling with recipes.
2.2.0
2.1.4
- Add cost-complexity pruning parameters to TreeModel.
- Perform stratified resampling automatically for ModeledInput() and SelectedInput() objects constructed with formulas and matrices.
2.1.3
- Revise some fit() methods to ensure that unprepped recipes are passed to models, such as TunedModel, StackedModel, SelectedModel, and SuperModel, that need to replicate preprocessing steps in their resampling routines.
- Extend GLMModel to factor and matrix responses.
- Use fun instead of deprecated fun.y in ggplot2 functions.
- Capture user-supplied parameters passed in to the ellipsis of model constructor functions that have them.
2.1.2
- Compatibility fix for tibble 3.0.0.
- Include missing values in model matrices created internally from formulas.
2.1.1
- Improve specificity of metricinfo() results for factor responses.
- Correct SplitControl() to train on the split sample instead of the full dataset.
- Perform stratified resampling automatically when fit() formula and matrix methods are called with meta-models.
2.1.0
2.0.4
- Extend print() argument n to data frame and matrix columns for more concise display of large data structures.
- Add preprocessing recipe functions step_kmeans(), step_kmedoids(), and step_spca().
2.0.3
- Internal changes:
- Remove MLModel slot y.
- Rename ModelFrame and ModelRecipe columns (casenames) to (names).
- Register ModelFrame inheritance from data.frame.
- Define Terms S4 classes for ModelFrame slot terms.
2.0.2
- Implement ModeledInput, SelectedInput and TunedInput classes and methods.
- Deprecate SelectedFormula(), SelectedMatrix(), SelectedModelFrame(), SelectedRecipe(), and TunedRecipe().
- Remove deprecated tune().
- Rename global setting stat.Curves to stat.Curve.
2.0.1
- Rename global setting stat.Train to stat.train.
- Add print methods for SelectedModel, StackedModel, SuperModel, and TunedModel.
- Revise training methods to ensure nested resampling of SelectedRecipe and TunedRecipe.
- Return list of all training steps in MLModel trainbits slot.
2.0.0
- Rename global setting stat.Tune to stat.Train.
- Enable selection of formulas, design matrices, and model frames with SelectedFormula(), SelectedMatrix(), and SelectedModelFrame().
- Rename discrete variable classes: BinomialMatrix → BinomialVariate, DiscreteVector → DiscreteVariate, NegBinomialVector → NegBinomialVariate, and PoissonVector → PoissonVariate.
- Add global setting require for user-specified packages to load during parallel execution of resampling algorithms.
- Rename recipe role case_strata to case_stratum.
- Rename object argument to data in ConfusionMatrix(), SurvEvents(), and SurvProbs().
- Add c methods for BinomialVariate, DiscreteVariate, ListOf, and SurvMatrix.
- Add role_binom(), role_case(), role_surv(), and role_term() to set recipe roles.
- Support base argument to varimp() for log-transformed p-values.
- Rename ParamSet to ParameterGrid.
- Add option to reset global settings individually.
- Add as.data.frame methods for Performance, Performance summary, PerformanceDiff, PerformanceDiffTest, and Resamples.
1.99.0
- Implement DiscreteVector class and subclasses BinomialVector, NegBinomialVector, and PoissonVector for discrete response variables.
- Extend model support to DiscreteVector classes as follows.
  - DiscreteVector: all models applicable to numeric responses.
  - BinomialVector/NegBinomialVector/PoissonVector: BlackBoostModel, GAMBoostModel, GLMBoostModel, GLMModel, and GLMStepAICModel.
  - BinomialVector/PoissonVector: GLMNetModel.
  - PoissonVector: GBMModel and XGBModel.
- Add support for offset terms in formulas, model matrices, and recipes.
- Add recipe tune information to fitted MLModel.
- Replace Calibration(), Confusion(), Curves(), Lift(), and Resamples() with c methods.
- Redefine Confusion S3 class as ConfusionList S4 class.
- Remove support for one-element list to metricinfo() and modelinfo().
- Remove deprecated expand.model().
- Expire deprecated tune().
1.6.4
- Calculate regression variable importance as negative log p-values.
- Support empty vectors in metricinfo() and modelinfo().
- Add support for dials package parameter sets with ParamSet().
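The 1.6.4 entry on regression variable importance can be illustrated with a rough base R sketch. This is not the package's varimp() implementation; the lm() fit on the built-in mtcars data and the choice of the natural logarithm are assumptions made only for the example.

```r
# Illustrative only: base R, not MachineShop's varimp() implementation.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient p-values, dropping the intercept row
pvals <- summary(fit)$coefficients[-1, "Pr(>|t|)"]

# Negative log p-values: smaller p-values give larger importance values
importance <- -log(pvals)
importance
```

On this scale, predictors with stronger evidence against a zero coefficient rank higher, and values stay non-negative because p-values do not exceed 1.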
1.6.3
- Add as.MLModel() for coercing MLModelFit to MLModel.
- Deprecate tune(); call fit() with a SelectedModel or TunedModel instead.
1.6.2
- Implement optimism-corrected cross-validation (CVOptimismControl).
- Fix BootOptimismControl error with 2D responses.
- Add global option max.print for the number of models and data frame rows to show with print methods.
- Enable recipe selection with SelectedRecipe().
- Refactor tune() methods.
- Replace MLModelFit element fitbits (MLFitBits object) with mlmodel (MLModel object).
- Rename VarImp slot center to shift.
1.6.1
- Use tibbles for parameter grids.
- Add random sampling option to expand_model(), expand_params(), and expand_steps().
- Display information for model functions and objects more compactly.
1.6.0
- Add global setting for default cutoff threshold value.
- Add option to reset all global settings.
- Enable recipe tuning with TunedRecipe().
- Add expand_model() for model expansion over tuning parameters.
- Add expand_params() for model parameters expansion.
- Add expand_steps() for recipe step parameters expansion.
- Implement MLModelFunction and MLModelList classes.
- Add fit methods for MLModel, MLModelFunction, and MLModelList.
- Fix NNetModel fit error with binary and factor responses.
- Fix modelinfo() function not found error.
1.5.2
- Implement exception handling of tune() resampling failures.
- Remove deprecated types and design arguments from MLModel().
1.5.1
- Implement global settings for default resampling control, performance metrics, summary statistics, and tuning grid.
- Support vector arguments in metricinfo() and modelinfo().
- Update package documentation.
1.5.0
- Implement model: SelectedModel.
- Remove maximize argument from tune() and TunedModel.
- Support lists as arguments to StackedModel() and SuperModel.
1.4.2
- Revert renaming of expand.model().
- Exclude 0 distance from KNNModel tuning grid.
- Improve random tuning grid coverage.
1.4.1
- Implement model: TunedModel.
- Remove deprecated na.action argument from ModelFrame methods.
- Rename MLModel() argument types to response_types.
- Rename MLModel() argument design to predictor_encoding.
- Rename expand.model() to expand_model().
1.4.0
1.3.3
- Implement optimism-corrected bootstrap resampling (BootOptimismControl).
- Store case names in ModelFrame and ModelRecipe and save to Resamples.
1.3.2
- Add BinaryConfusionMatrix and OrderedConfusionMatrix classes.
- Export ConfusionMatrix constructor.
- Extend metricinfo() to confusion matrices.
- Refactor performance metrics methods code.
1.3.1
- Check and convert ordered factors in response methods.
- Check consistency of extracted variables in response methods.
- Add metrics methods for Resamples.
1.3.0
- Improve compatibility with preprocessing recipes.
- Allow base math functions and operators in ModelFrame formulas.
1.2.5
- Save ModelFrame response in first column.
- Unexport response formula method.
- Add ICHomes dataset.
- Add center and scale slots to VarImp.
1.2.4
- Prohibit in-line functions in ModelFrame formulas.
- Rename response function argument from data to newdata.
1.2.3
- Add fit, resample, and tune methods for design matrices.
- Reduce computational overhead for design matrices and recipes.
- Rename ModelFrame() argument na.action to na.rm.
1.2.2
- Implement parametric ("exponential", "rayleigh", "weibull") estimation of baseline survival functions.
- Set "weibull" as the default distribution for survival mean estimation.
- Add extract method for Resamples.
- Add na.rm argument to calibration(), confusion(), performance(), and performance_curve().
- Add loess span argument to calibration().
- Change SurvMatrix from S4 to S3 class.
1.2.1
- Add method option to predict() for Breslow, Efron (default), or Fleming-Harrington estimation of survival curves for Cox proportional hazards-based models.
- Add dist option to predict() for exponential or Weibull approximation to estimated survival curves.
- Add dist option to calibration() for distributional estimation of observed mean survival.
- Add dist option to r2() for distributional estimation of the total sum of squares mean.
- Handle unnamed arguments in metricinfo() and modelinfo().
1.2.0
- Implement metrics: auc, fnr, fpr, rpp, tnr, tpr.
- Implement performance curves, including ROC and precision recall.
- Implement SurvMatrix classes for predicted survival events and probabilities to eliminate need for separate times arguments in calibration, confusion, metrics, and performance functions.
- Add calibration curves for predicted survival means.
- Add lift curves for predicted survival probabilities.
- Add recipe support for survival and matrix outcomes.
- Rename MLControl argument surv_times to times.
- Fix identification of recipe case_weight and case_strata variables.
- Launch package website.
- Bring Introduction vignette up to date with package features.
1.1.0
- Implement model: BARTModel.
- Implement model tuning over automatically generated grids of parameter values and random sampling of grid points.
- Add metrics for predicted survival times: accuracy, f_score, kappa2, npv, ppv, pr_auc, precision, recall, roc_index, sensitivity, specificity.
- Add metrics for predicted survival means: cindex, gini, mae, mse, msle, r2, rmse, rmsle.
- Add performance and metric methods for ConfusionMatrix.
- Add confusion matrices for predicted survival times.
- Standardize predict functions to return mean survival when times are not specified.
- Replace MLModel slot and constructor argument nvars with design.
1.0.0
- Implement models: BARTMachineModel, LARSModel.
- Implement performance metrics: gini, multi-class pr_auc and roc_auc, multivariate rmse, msle, rmsle.
- Implement smooth calibration curves.
- Implement MLMetric class for performance metrics.
- Add as.data.frame method for ModelFrame.
- Add expand.model function.
- Add label slot to MLModel.
- Expand metricinfo/modelinfo support for mixed argument types.
- Rename calibration argument n to breaks.
- Rename modelmetrics function to performance.
- Rename ModelMetrics/Diff classes to Performance/Diff.
- Change MLModelTune slot resamples to performance.
0.4.0
- Implement models: AdaBagModel, AdaBoostModel, BlackBoostModel, EarthModel, FDAModel, GAMBoostModel, GLMBoostModel, MDAModel, NaiveBayesModel, PDAModel, RangerModel, RPartModel, TreeModel.
- Implement user-specified performance metrics in modelmetrics function.
- Implement metrics: accuracy, brier, cindex, cross_entropy, f_score, kappa2, mae, mse, npv, ppv, pr_auc, precision, r2, recall, roc_auc, roc_index, sensitivity, specificity, weighted_kappa2.
- Add cutoff argument to confusion function.
- Add modelinfo and metricinfo functions.
- Add modelmetrics method for Resamples.
- Add ModelMetrics class with print and summary methods.
- Add response method for recipe.
- Export Calibration constructor.
- Export Confusion constructor.
- Export Lift constructor.
- Extend calibration arguments to observed and predicted responses.
- Extend confusion arguments to observed and predicted responses.
- Extend lift arguments to observed and predicted responses.
- Extend metrics and stats function arguments to accept function names.
- Extend Resamples to arguments with multiple models.
- Change CoxModel, GLMModel, and SurvRegModel constructor definitions so that model control parameters are specified directly instead of with a separate control argument/structure.
- Change predict(..., times = numeric()) function calls to survival model fits to return predicted values in the same direction as survival times.
- Change predict(..., times = numeric()) function calls to CForestModel fits to return predicted means instead of medians.
- Change tune function argument metrics to be defined in terms of a user-specified metric or metrics.
- Deprecate MLControl arguments cutoff, cutoff_index, na.rm, and summary.
0.3.0
- Implement linear models (LMModel), linear discriminant analysis (LDAModel), and quadratic discriminant analysis (QDAModel).
- Implement confusion matrices.
- Support matrix response variables.
- Support user-specified stratification variables for resampling via the strata argument of ModelFrame or the role of "case_strata" for recipe variables.
- Support user-specified case weights for model fitting via the role of "case_weight" for recipe variables.
- Provide fallback for models with undefined variable importance.
- Update the importing of prepper due to its relocation from rsample to recipes.
0.2.0
- Implement partial dependence, calibration, and lift estimation and plotting.
- Implement k-nearest neighbors model (KNNModel), stacked regression models (StackedModel), super learner models (SuperModel), and extreme gradient boosting (XGBModel).
- Implement resampling constructors for training resubstitution (TrainControl) and split training and test sets (SplitControl).
- Implement ModelFrame class for general model formula and dataset specification.
- Add multi-class Brier score to modelmetrics().
- Extend predict() to automatically preprocess recipes and to use training data as the newdata default.
- Extend tune() to lists of models.
- Extend summary() argument stats to functions.
- Fix survival probability calculations in GBMModel and GLMNetModel.
- Change MLControl argument na.rm default from FALSE to TRUE.
- Remove na.rm argument from modelmetrics().
0.1