A number of changes to comply with section 1.1.3.1 of "Writing R Extensions" were made.

Changes in version 6.0-34

For the input data x to train, we now respect the class of the input value to accommodate other data types (such as sparse matrices). There are some complications though; for pre-processing, we throw a warning if the data are not simple matrices or data frames, since some infrastructure (e.g. complete.cases) does not exist for other classes. We also throw a warning if returnData = TRUE and the data cannot be converted to a data frame. This allows sparse matrices, text corpora and other classes to be used as inputs to that function.

plsRglm was added.

From the frbs package, the following rule-based models were added: ANFIS, DENFIS, FH.GBML, FIR.DM, FRBCS.CHI, FRBCS.W, FS.HGD, GFS.FR.MOGAL, GFS.GCCL, GFS.LTS, GFS.THRIFT, HYFIS, SBC and WM. Thanks to Lala Riza for suggesting these and facilitating their addition to the package.

prcomp.resamples now passes ... to prcomp. Also, the call to prcomp uses the formula method so that na.action can be used.

The function resamples was enhanced so that, for train and rfe models that used returnResamp = "all", the resamples are subset to the appropriate values and a warning is issued. The function also fills in missing model names if one or more are not given.

Changes in version 6.0-21

When creating the grid of tuning parameter values, the column
names no longer need to be preceded by a period. Periods can still be
used as before but are not required. This change is not expected to
break backwards compatibility, but it may in some cases.

trainControl now has a method = "none" resampling
option that bypasses model tuning and fits the model to the entire
training set. Note that if more than one model is specified an error
will occur.
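As a minimal sketch of the new option (the data set and tuning value here are only illustrative):

```r
library(caret)

# Fit one model with fixed tuning parameters and no resampling.
# tuneGrid must contain exactly one row, since tuning is bypassed.
fit <- train(Species ~ ., data = iris,
             method = "rpart",
             tuneGrid = data.frame(cp = 0.01),
             trControl = trainControl(method = "none"))
```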

panel.lift2 and xyplot.lift now have an argument
called values that shows the percentages of samples found for
the specified percentages of samples tested.

train, rfe and sbf should no longer throw
a warning whose message begins "executing ...".

A ggplot method for train was added.

Imputation via medians was added to preProcess by Zachary Mayer.

A small change was made to rpart models. Previously, when the
final model was determined, it would be fit by specifying the model using the
cp argument of rpart.control. This could lead to duplicated Cp
values in the final list of possible Cp values. The current version fits the
final model slightly differently: an initial model is fit using cp = 0,
then it is pruned using prune.rpart to the desired depth. This
shouldn't make a difference for the vast majority of data sets. Thanks to Jeff
Evans for pointing this out.
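The revised strategy can be sketched with rpart directly (the data set and cp values here are only illustrative):

```r
library(rpart)

# Grow a full tree with cp = 0, then prune back to the desired
# complexity value instead of refitting at that cp.
full   <- rpart(Species ~ ., data = iris, control = rpart.control(cp = 0))
pruned <- prune(full, cp = 0.05)
```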

The method for estimating sigma for SVM and RVM models was slightly
changed to make them consistent with how ksvm and rvm do the
estimation.

The default behavior for returnResamp in rfeControl and
sbfControl is now returnResamp = "final".

cluster was added as a general class with a specific method
for resamples objects.

The refactoring of model code resulted in a number of packages being
eliminated from the depends field. Additionally, a few were moved to exports.

Changes in version 5.17-07

A bug in spatialSign was fixed for data frames with
a single column.

Pre-processing was not applied to the training data set
prior to grid creation. This is now done but only for models
that use the data when defining the grid. Thanks to Brad Buchsbaum
for finding the bug.

Some code was added to rfe to truncate the subset
sizes in case the user over-specified them.

A bug was fixed in gamFuncs for the rfe
function.

Options in trainControl, rfeControl and
sbfControl were added so that the user can set the
seed at each resampling iteration (most useful for parallel
processing). Thanks to Allan Engelhardt for the recommendation.

Some internal refactoring of the data was done to prepare
for some upcoming resampling options.

predict.train now has an explicit na.action
argument defaulted to na.omit. If imputation is used in
train, then na.action = na.pass is recommended.
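For example, a sketch where median imputation is requested (the model and data are illustrative):

```r
library(caret)

fit <- train(Sepal.Length ~ ., data = iris,
             method = "lm",
             preProcess = "medianImpute")

# Pass incomplete rows through so train's imputation can fill them in,
# instead of having them dropped by the default na.omit.
preds <- predict(fit, newdata = iris, na.action = na.pass)
```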

A bug was fixed in dummyVars that occurred when
missing data were in newdata. The function
contr.dummy is now deprecated and contr.ltfr
should be used (if you are using it at all). Thanks to
stackexchange user mchangun for finding the bug.

A check is now done inside dummyVars when
levelsOnly = TRUE to see if any predictors share common
levels.

A new option fullRank was added to dummyVars.
When true, contr.treatment is used. Otherwise,
contr.ltfr is used.
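A small sketch of the two parameterizations (the toy data are illustrative):

```r
library(caret)

df <- data.frame(gender = c("male", "female", "female"),
                 dose   = c("low", "high", "low"))

# Full set of indicators for every factor level (contr.ltfr, the default):
full <- predict(dummyVars(~ ., data = df), newdata = df)

# Full-rank set (contr.treatment), dropping one level per factor:
fr <- predict(dummyVars(~ ., data = df, fullRank = TRUE), newdata = df)
```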

A bug in train was fixed with gbm models
(thanks to stackoverflow user screechOwl for finding it).

Changes in version 5.16-24

The protoclass function in the protoclass
package was added. The model uses a distance matrix as input and
the train method also uses the proxy package to
compute the distance using the Minkowski distance. The two tuning
parameters are the neighborhood size (eps) and the Minkowski
distance parameter (p).

A bug was (hopefully) fixed that occurred when some type of
parallel processing was used with train. The problem is
that the methods package was not being loaded in the workers.
While reproducible, it is unknown why this occurs and why it is
only for some technologies and systems. The methods package
is now a formal dependency and we coerce the workers to load it
remotely.

A bug was fixed where some calls were printed twice.

For rpart, C5.0 and ksvm, cost-sensitive
versions of these models for two classes were added to train.
The method values are rpartCost, C5.0Cost and
svmRadialWeights.

The prediction code for the ksvm models was changed. There
are some cases where the class predictions and the predicted class
probabilities disagree. This usually happens when the probabilities are
close to 0.50 (in the two class case). A kernlab bug has been
filed. In the meantime, if the ksvm model uses a probability
model, the class probabilities are generated first and the predicted
class is assigned to the probability with the largest value. Thanks to
Kjell Johnson for finding that one.

print.train was changed so that tuning parameters that are
logicals are printed properly.

Changes in version 5.16-13

Added a few exemptions to the logic that determines whether a model call should be scrubbed.

An error trap was created to catch issues with missing importance scores in rfe.

Changes in version 5.16-03

A function twoClassSim was added for benchmarking classification models.
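For example (the argument values are illustrative):

```r
library(caret)

# Simulate a two-class data set with informative linear and noise predictors;
# the outcome is returned in a factor column named Class.
set.seed(1)
dat <- twoClassSim(n = 200, linearVars = 5, noiseVars = 5)
table(dat$Class)
```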

A bug was fixed in predict.nullModel related to predicted class probabilities.

For bagged trees using ipred, the code in train
defaults to keepX = FALSE to save space. Pass in keepX =
TRUE to use out-of-bag sampling for this model.

Changes were made to support vector machines for classification
models due to bugs with class probabilities in the latest version of
kernlab. The prob.model will default to the value of
classProbs in the trControl function. If
prob.model is passed in as an argument to train, this
specification over-rides the default. In other words, to avoid
generating a probability model, set either classProbs = FALSE
or prob.model = FALSE.

Changes in version 5.15-052

A few bugs were fixed in bag, thanks to Keith
Woolner. Most notably, out-of-bag estimates are now computed when the
prediction function includes a column called pred.

Parallel processing was implemented in bag and
avNNet; it can be turned off using an optional argument.

train, rfe, sbf, bag and
avNNet were given an additional argument in their respective
control objects called allowParallel that defaults to
TRUE. When TRUE, the code will be executed in parallel
if a parallel backend (e.g. doMC) is registered. When
allowParallel = FALSE, the parallel backend is always
ignored. The use case is when rfe or sbf calls
train. If a parallel backend with P processors is being used,
the combination of these functions would create P^2 processes. Since
some operations benefit more from parallelization than others, the
user has the ability to concentrate computing resources for specific
functions.
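One way to use this, sketched under the assumption that a parallel backend such as doMC has been registered:

```r
library(caret)

# Parallelize the outer rfe loop but keep the inner train calls
# sequential, so P workers are used instead of P^2 processes.
outer <- rfeControl(functions = caretFuncs, allowParallel = TRUE)
inner <- trainControl(allowParallel = FALSE)
```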

A new resampling function called createTimeSlices was
contributed by Tony Cooper that generates cross-validation indices for
time series data.

A few more options were added to
trainControl. initialWindow, horizon and
fixedWindow are applicable for when method =
"timeslice". Another, indexOut is an optional list of
resampling indices for the hold-out set. By default, these values are
the unique set of data points not in the training set.
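For example (the window sizes are illustrative):

```r
library(caret)

y <- rnorm(20)

# Rolling-origin resampling: each training window has 10 points and each
# hold-out horizon has the next 3.
slices <- createTimeSlices(y, initialWindow = 10, horizon = 3,
                           fixedWindow = TRUE)
slices$train[[1]]  # the first training window
```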

A bug was fixed in multiclass glmnet models when
generating class probabilities (thanks to Bradley Buchsbaum for
finding it).

To be more consistent with recent versions of lattice,
the parallel.resamples function was changed to
parallelplot.resamples.

Since ksvm now allows probabilities when class weights
are used, the default behavior in train is to set
prob.model = TRUE unless the user explicitly sets it to
FALSE. However, I have reported a bug in ksvm that gives
inconsistent results with class weights, so this is not advised at
this point in time.

Bugs were fixed in predict.bagEarth and
predict.bagFDA.

When using rfeControl(saveDetails = TRUE) or
sbfControl(saveDetails = TRUE) an additional column is
added to object$pred called rowIndex. This indicates the
row from the original data that is being held-out.

Changes in version 5.15-045

A bug was fixed that induced NA values in SVM model predictions.

Changes in version 5.15-042

Many examples were wrapped in dontrun to speed up CRAN checking.

The scrda methods were removed from the package (on
6/30/12, R Core sent an email that "since we haven't got fixes for
long standing warnings of the rda packages since more than half a year
now, we set the package to ORPHANED.")

The output of train with verboseIter = TRUE was
modified to show the resample label as well as logging when the worker
started and stopped the task (better when using parallel processing).

Added a long-hidden function, downSample, for class imbalances.

An upSample function was added for class imbalances.

A new file, aaa.R, was added to be compiled first that tries to
eliminate the dreaded 'no visible binding for global variable' false
positives. Specific namespaces were used with several functions to
avoid similar warnings.

A bug was fixed with icr.formula that was so ridiculous,
I now know that nobody has ever used that function.

Fixed a bug when using method = "oob" with train.

Some exceptions were added to plot.train so that some
tuning parameters are better labeled.

dotplot.resamples and bwplot.resamples now order
the models using the first metric.

A few of the lattice plots for the resamples class were
changed so that, when only one metric is shown, the strip is not
shown and the x-axis label displays the metric.

When using trainControl(savePredictions = TRUE) an
additional column is added to object$pred called
rowIndex. This indicates the row from the original data that is
being held-out.

A variable importance function for nnet objects was
created based on Gevrey, M., Dimopoulos, I., and Lek, S. (2003). Review
and comparison of methods to study the contribution of variables in
artificial neural network models. Ecological Modelling, 160(3),
249-264.

The predictor function for glmnet was updated and a
variable importance function was also added.

Raghu Nidagal found a bug in predict.avNNet that was
fixed.

sensitivity and specificity were given an
na.rm argument.

A first attempt at fault tolerance was added to train. If
a model fit fails, the predictions are set to NA and a warning
is issued (e.g. "model fit failed for Fold04: sigma=0.00392,
C=0.25"). When verboseIter = TRUE, the warning is also printed
to the log. Resampled performance is calculated on only the
non-missing estimates. This can also be done during predictions, but
must be done on a model by model basis. Fault tolerance was added for
kernlab models only at this time.

lift was modified in two ways. First, cuts is no
longer an argument. The function always uses cuts based on the number
of unique probability estimates. Second, a new argument called
label is available to use alternate names for the models
(e.g. names that are not valid R variable names).

A bug in print.bag was fixed.

A bug was fixed where class probabilities were not being generated
for sparseLDA models.

Bugs were fixed in the new varImp methods for PART and RIPPER.

Started using namespaces for ctree and cforest to
avoid conflicts between duplicate function names in the party
and partykit packages.

A set of functions for RFE and logistic regression
(lrFuncs) was added.

A bug in train with method="glmStepAIC" was fixed
so that direction and other stepAIC arguments were
honored.

A bug was fixed in preProcess where the number of ICA
components was not specified. (thanks to Alexander Lebedev)

Another bug was fixed for oblique random forest methods in
train. (thanks to Alexander Lebedev)

Changes in version 5.15-023

The list of models that can accept factor inputs directly was
expanded to include the RWeka models, ctree,
cforest and custom models.

Added model lda2, which tunes by the number of functions
used during prediction.

confusionMatrix.train was updated to use the default
confusionMatrix code when norm = "none" and only a
single hold-out was used.

Added variable importance metrics for PART and RIPPER in the
RWeka package.

vignettes were moved from /inst/doc to /vignettes

Changes in version 5.14-023

The model details in ?train were changed to be more
readable.

Added two models from the RRF package. RRF uses a
penalty for each predictor based on the scaled variable importance
scores from a prior random forest fit. RRFglobal sets a common,
global penalty across all predictors.

Added two models from the KRLS package: krlsRadial
and krlsPoly. Both have kernel parameters (sigma and
degree) and a common regularization parameter
lambda. The default for lambda is NA, letting the
krls function estimate it internally. lambda can also be
specified via tuneGrid.

twoClassSummary was modified to wrap the call to
pROC:::roc in a try command. In cases where the hold-out
data are only from one class, this produced an error. Now it generates
NA values for the AUC when this occurs and a general warning is
issued.

The underlying workflows for train were modified so that
missing values for performance measures would not throw an error (but
will issue a warning).

update.train was added so that tuning parameters
can be manually set if the automated approach to setting their
values is insufficient.

Changes in version 5.11-006

When using method = "pls" in train, the
plsr function used the default PLS algorithm
("kernelpls"). Now, the full orthogonal scores method is used. This
results in the same model, but a more extensive set of values is
calculated, enabling VIP calculations (without much loss in
computational efficiency).

A check was added to preProcess to ensure valid
values of method were used.

A new method, kernelpls, was added.

residuals and summary methods were added to
train objects that pass the final model to their
respective functions.

Changes in version 5.11-006

Bugs were fixed that prevented hold-out predictions from being
returned.

Changes in version 5.11-003

A bug in roc was found when the classes were completely
separable.

The ROC calculations for twoClassSummary and
filterVarImp were changed to use the pROC
package. This, and other changes, have increased efficiency. For
filterVarImp on the cell segmentation data, this led to a
54-fold decrease in execution time. For the Glass data in the
mlbench package, the speedup was 37-fold. Warnings were
added for roc, aucRoc and
rocPoint regarding their deprecation.

A weird bug was fixed that occurred when some models were run with
sequential parameters that were fixed to single values (thanks to
Antoine Stevens for finding this issue).

Another bug was fixed where pre-processing with train could fail.

Changes in version 5.03-003

Pre-processing in train did not occur for the final model fit.

Changes in version 5.02-011

A function, lift, was added to create lattice
objects for lift plots.

Several models were added from the obliqueRF package:
'ORFridge' (linear combinations created using L2 regularization),
'ORFpls' (using partial least squares), 'ORFsvm' (linear support
vector machines), and 'ORFlog' (using logistic regression). As of
now, the package only supports classification.

Added regression models simpls and
widekernelpls. These are new models since both
train and plsr have an argument
called method, so the computational algorithm could not be
passed through using the three dots.

Model rpart was added that uses cp as the tuning
parameter. To make the model codes more consistent, rpart
and ctree correspond to the nominal tuning parameters
(cp and mincriterion, respectively) and rpart2
and ctree2 are the alternate versions using maxdepth.

The text for ctree's tuning parameter was changed to '1 -
P-Value Threshold'

The argument controls was not being properly passed
through in models ctree and ctree2.

Changes in version 5.01-001

controls was not being set properly for cforest
models in train

The print methods for train, rfe and
sbf did not recognize LOOCV

avNNet sometimes failed with categorical outcomes with bag = FALSE

A bug in preProcess was fixed that was triggered by matrices without
dimnames (found by Allan Engelhardt)

bagged MARS models with factor outcomes now work

cforest was using the argument control instead of controls

A few bugs for class probabilities were fixed for slda, hdda,
glmStepAIC, nodeHarvest, avNNet and sda

When looping over models and resamples, the foreach
package is now being used. Now, when using parallel processing, the
caret code stays the same and parallelism is invoked using
one of the "do" packages (e.g. doMC, doMPI, etc.). This
affects train, rfe and
sbf. Their respective man pages have been revised to
illustrate this change.

The order of the results produced by defaultSummary were changed
so that the ROC AUC is first

A few man and C files were updated to eliminate R CMD check warnings

Now that we are using foreach, the verbose option in trainControl,
rfeControl and sbfControl are now defaulted to FALSE

rfe now returns the variable ranks in a single data frame (previously
there were data frames in lists of lists) for ease of use. This will
break code from previous versions. The built-in RFE functions
were also modified.

confusionMatrix methods for rfe and sbf were added

NULL values of 'method' in preProcess are no longer allowed

A model for ridge regression was added (method = 'ridge') based on enet.

Changes in version 4.98

A bug was fixed in a few of the bagging aggregation
functions (found by Harlan Harris).

Fixed a bug spotted by Richard Marchese Robinson in createFolds
when the outcome was numeric. The issue is that
createFolds tries to randomize n/4 numeric
samples to k folds. With fewer than 40 samples, it could not
always do this and would generate fewer than k folds in some
cases. The change adjusts the number of groups based on
n and k. For small sample sizes, it will not use
stratification. For larger data sets, it will at most group the
data into quartiles.

A function confusionMatrix.train was added to get the average
confusion matrix across resampled hold-outs when using the
train function for classification.

Added another model, avNNet, that fits several neural networks
via the nnet package using different seeds, then averages the
predictions of the networks. There is an additional bagging
option.

The default value of the 'var' argument of bag was changed.

As requested, most options can be passed from
train to preProcess. The
trainControl function was re-factored and several
options (e.g. k, thresh) were combined into a single
list option called preProcOptions. The default is consistent
with the original configuration: preProcOptions = list(thresh
= 0.95, ICAcomp = 3, k = 5)
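For example, to change the PCA threshold and the number of imputation neighbors (the values are illustrative):

```r
library(caret)

ctrl <- trainControl(
  preProcOptions = list(thresh = 0.90, ICAcomp = 3, k = 10))
```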

Another option was added to preProcess. The pcaComp
option can be used to set exactly how many components are used
(as opposed to just a threshold). It defaults to NULL so that
the threshold method is still used by default, but a non-null
value of pcaComp over-rides thresh.

When created within train, the call for preProcess is now
modified to be a text string ("scrubed") because the call could
be very large.

Removed two deprecated functions: applyProcessing and
processData.

A new version of the cell segmentation data was saved and the
original version was moved to the package website (see
segmentationData for location). First, several
discrete versions of some of the predictors (with the suffix
"Status") were removed. Second, there are several skewed
predictors with minimum values of zero (that would benefit from
some transformation, such as the log). A constant value of 1 was
added to these fields: AvgIntenCh2, FiberAlign2Ch3,
FiberAlign2Ch4, SpotFiberCountCh4 and
TotalIntenCh2.

Changes in version 4.92

Some tweaks were made to plot.train in an effort to get the group
key to look less horrid.

train, rfe and sbf are
now able to estimate the time that these models take to predict new
samples. Their respective control objects have a new option,
timingSamps, that indicates how many of the training set samples
should be used for prediction (the default of zero means do not
estimate the prediction time).

xyplot.resamples was modified. A new argument,
what, has values: "scatter" plots the resampled
performance values for two models; "BlandAltman" plots the
difference between two models against their average (aka an MA
plot); "tTime", "mTime" and "pTime" plot the total
model building and tuning time ("t"), the final model
building time ("m") or the time to produce predictions
("p") against a confidence interval for the average
performance. Two or more models can be used.

Three new model types were added to train using
regsubsets in the leaps package:
"leapForward", "leapBackward" and "leapSeq". The
tuning parameter, nvmax, is the maximum number of terms in the
subset.

The seed was accidentally set when preProcess used ICA (spotted
by Allan Engelhardt)

preProcess was always being called (even to do nothing)
(found by Guozhu Wen)

Changes in version 4.91

Added a few new models associated with the bst package: bstTree,
bstLs and bstSm.

A model denoted "M5" was added that combines M5P and M5Rules from the
RWeka package. This new model uses either of these functions
depending on the tuning parameter "rules".

Changes in version 4.90

Fixed a bug with train and method = "penalized". Thanks to
Fedor for finding it.

Changes in version 4.89

A new tuning parameter was added for M5Rules controlling smoothing.

The Laplace correction value for Naive Bayes was also added as a
tuning parameter.

varImp.RandomForest was updated to work. It now requires a recent
version of the party package.

Changes in version 4.88

Changes in version 4.87

A new option to trainControl was added to allow
users to constrain the possible predicted values of the model to the
range seen in the training set or a user-defined range. One-sided
ranges are also allowed.

Changes in version 4.85

Two typos fixed in print.rfe and
print.sbf (thanks to Jan Lammertyn)

Changes in version 4.83

dummyVars failed with formulas using "."
(all.vars does not handle this well)

tree2 was failing for some classification models

When SVM classification models are used with class.weights, the
option prob.model is automatically set to FALSE (otherwise, it
is always set to TRUE). A warning is issued that the model will
not be able to create class probabilities.

Also for SVM classification models, there are cases when the
probability model generates negative class probabilities. In
these cases, we assign a probability of zero then coerce the
probabilities to sum to one.

Several typos in the help pages were fixed (thanks to Andrew Ziem).

Added a new model, svmRadialCost, that fits the SVM model
and estimates the sigma parameter for each resample (to
properly capture the uncertainty).

preProcess has a new method called "range" that scales the predictors
to [0, 1] (which is approximate for new samples if the training set
range is narrow in comparison).
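A minimal sketch:

```r
library(caret)

pp <- preProcess(mtcars, method = "range")
scaled <- predict(pp, newdata = mtcars)
# Each column of 'scaled' lies in [0, 1] for the training data;
# new samples outside the training range can fall outside [0, 1].
```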

A check was added to train to make sure that, when the user passes
a data frame to tuneGrid, the names are correct and complete.

print.train prints the number of classes and levels for classification
models.

Changes in version 4.78

Added a few bagging modules. See ?bag.

Added basic timings of the entire call to train, rfe and sbf
as well as the fit time of the final model. These are stored in an element
called "times".

The data files were updated to use better compression, which added a
higher R version dependency.

plot.train was pretty much re-written to more effectively use trellis theme
defaults and to allow arguments (e.g. axis labels, keys, etc) to be passed
in to over-ride the defaults.

Bug fix for lda bagging function

Bug fix for print.train when preProc is NULL

predict.BoxCoxTrans would go all klablooey if there were missing
values

varImp.rpart was failing with some models (thanks to Maria Delgado)

Changes in version 4.77

A new class, BoxCoxTrans, was added for estimating and applying the
Box-Cox transformation to data. This is also included as an
option to transform predictor variables. Although the Box-Tidwell
transformation was invented for this purpose, the Box-Cox transformation
is more straightforward, less prone to numerical issues and just as
effective. This method was also added to preProcess.

Fixed mis-labelled x axis in plot.train when a
transformation is applied for models with three tuning parameters.

When plotting a train object with method ==
"gbm" and multiple values of the shrinkage parameter, the ordering of
panels was improved.

Fixed bugs for regression prediction using partDSA and
qrf.

Another bug, reported by Jan Lammertyn, related to
extractPrediction with a single predictor was also
fixed.

Changes in version 4.76

Fixed a bug where linear SVM models were not working for classification

Changes in version 4.75

A model, 'gcvEarth', was added; it is the basic MARS model. The pruning
procedure is the nominal one based on GCV; only the degree is tuned by train.

Changes in version 4.74

Some changes to print.train: the call is not automatically
printed (but can be when print.train is explicitly invoked); the
"Selected" column is also not automatically printed (but can be);
non-table text now respects options("width"); only significant
digits are now printed when tuning parameters are kept at a
constant value

Changes in version 4.73

Bug fixes to preProcess related to complete.cases and a single predictor.

Changes in version 4.72

"Down-sampling" was implemented with bag so that, for
classification models, each class has the same number of samples
as the smallest class.

Added a new class, dummyVars, that creates an entire set of
binary dummy variables (instead of the reduced, full rank set).
The initial code was suggested by Gabor Grothendieck on R-Help.
The predict method is used to create dummy variables for any
data set.

A new function, createMultiFolds, is
used to generate indices for repeated CV.

The various resampling functions now have *named* lists
as output (with prefixes "Fold" for cv and repeated cv
and "Resample" otherwise)

Pre-processing has been added to train with the
preProcess argument. This has been tested when caret
function are used with rfe and sbf (via
caretFuncs and caretSBF, respectively).

When preProcess(method = "spatialSign"), centering and
scaling is done automatically too. Also, a bug was fixed
that stopped the transformation from being executed.

knn imputation was added to preProcess. The RANN package
is used to find the neighbors (the knn impute function in
the impute library was consistently generating segmentation
faults, so we wrote our own).

Changed the behavior of preProcess in situations where
scaling is requested but there is no variation in the
predictor. Previously, the method would fail. Now a
warning is issued and the value of the standard
deviation is coerced to be one (so that scaling has
no effect).

Changes in version 4.62

Added gam from mgcv (with smoothing splines and feature
selection) and gam from gam (with basic splines and loess
smoothers). For these models, a formula is derived
from the data where "near zero variance" predictors
(see nearZeroVar) are excluded and predictors with
fewer than 10 distinct values are entered as linear
(i.e. unsmoothed) terms.

Changes in version 4.61

Changed earth fit for classification models to use the
glm argument with a binomial family.

Added varImp.multinom, which is based on the absolute
values of the model coefficients

Changes in version 4.60

The feature selection vignette was updated slightly (again).

Changes in version 4.59

Updated rfe and sbf to include class probabilities
in performance calculations.

Also, the names of the resampling indices were harmonized
across train, rfe and sbf.

The feature selection vignette was updated slightly.

Changes in version 4.58

Added the ability to include class probabilities in
performance calculations. See trainControl and
twoClassSummary.

Updated and restructured the main vignette.

Changes in version 4.57

Internal changes related to how predictions from models are
stored and summarized. With the exception of loo, the model
performance values are calculated by the workers instead of
the main program. This should reduce i/o and lay some
groundwork for upcoming changes.

The default grid for relaxo models was changed based on
an initial model fit.

partDSA model predictions were modified; there were cases
where the user might request X partitions, but the model
only produced Y < X. In these cases, the partitions for
missing models were replaced with the largest model
that was fit.

The function modelLookup was put in the namespace and
a man file was added.

The names of the resample indices are automatically
reset, even if the user specified them.

Changes in version 4.56

Fixed a bug introduced a few versions ago where varImp
for plsda and fda objects crashed.

Changes in version 4.55

When computing the scale parameter for RBF kernels, the
option to automatically scale the data was changed to TRUE

Changes in version 4.48

Changes in version 4.47

Changes in version 4.46

Cleaned up some of the man files for the resamples class
and added parallel.resamples.

Fixed a bug in diff.resamples where ... were
not being passed to the test statistic function.

Added more log messages in train when running verbose.

Added the German credit data set.

Changes in version 4.45

Added a general framework for bagging models via the
bag function. Also, model type "hdda" from the
HDclassif package was added.

Changes in version 4.44

Added neuralnet, quantregForest and rda
(from rda) to train. Since there is a naming
conflict with rda from mda, the rda model was
given a method value of "scrda".

Changes in version 4.43

The resampling estimate of the standard deviation given
by train since v 4.39 was wrong.

A new field was added to varImp.mvr called
"estimate". In cases where the mvr model had multiple
estimates of performance (e.g. training set, CV, etc) the user can
now select which estimate they want to be used in the importance
calculation (thanks to Sophie Bréand for finding this)

Changes in version 4.42

Added predict.sbf and modified the structure of
the sbf helper functions. The "score" function
only computes the metric used to filter and the filter function does
the actual filtering. This was changed so that FDR corrections or
other operations that use all of the p-values can be computed.

Also, the formatting of p-values in print.confusionMatrix
was changed

An argument was added to maxDissim
so that the variable name is returned instead of the index.

Independent component analysis was added to the list of
pre-processing operations and a new model ("icr") was
added to fit a pcr-like model with the ICA components.

Changes in version 4.40

Changes in version 4.39

Added several classes for examining the resampling results. There
are methods for estimating pair-wise differences and lattice
functions for visualization. The training vignette has a new
section describing the new features.

Changes in version 4.29

Changes in version 4.28

Also added another column to the output of
extractProb and extractPrediction that
saves the name of the model object so that you can have multiple
models of the same type and tell which predictions came from which
model.

Changes were made to plotClassProbs: new parameters were added
and densityplots can now be produced.

Changes in version 4.27

Changes in version 4.26

Fixed a bug in caretFunc that produced NaN variable rankings,
which caused the first k terms to always be selected.

Changes in version 4.25

Added parallel processing functionality for rfe

Changes in version 4.24

Added the ability to use custom metrics with rfe

Changes in version 4.22

Many Rd changes to work with updated parser.

Changes in version 4.21

Re-saved data in more compressed format

Changes in version 4.20

Added pcr as a method

Changes in version 4.19

A weights argument was added to train for models that accept case weights.

Also, a bug was fixed for lasso regression (wrong lambda
specification) and another for prediction in naive Bayes models
with a single predictor.

Changes in version 4.18

Fixed bug in new nearZeroVar and updated format.earth so that it
does not automatically print the formula

Changes in version 4.17

Added a new version of nearZeroVar from Allan Engelhardt that is
much faster
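
Typical usage is to drop the flagged columns before modeling; a small
self-contained illustration:

```r
library(caret)

## column b is constant, so it should be flagged
training <- data.frame(a = rnorm(50), b = rep(0, 50))

nzv <- nearZeroVar(training)  # indices of near-zero-variance columns
if (length(nzv) > 0)
  training <- training[, -nzv, drop = FALSE]
```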

Changes in version 4.16

Fixed bugs in extractProb (for glmnet) and filterVarImp.

For glmnet, the user can now pass in their own value of family to
train (otherwise train will set it depending on the mode of the
outcome). However, glmnet doesn't have much support for families at
this time, so you can't change links or try other distributions.

Changes in version 4.15

Fixed bug in createFolds when the smallest y value is more than 25%
of the data.

Changes in version 4.07

Changes in version 4.06

Changes in version 4.05

Updated again for more stringent R CMD check tests in R-devel 2.9

Changes in version 4.04

Updated for more stringent R CMD check tests in R-devel 2.9

Changes in version 4.03

Significant internal changes were made to how the models are
fit. Now, the function used to compute the models is passed in as a
parameter (defaulting to lapply). In this way, users can use
their own parallel processing software without new versions of
caret. Examples are given in train.
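
A schematic of the idea; the argument name computeFunction is
illustrative (see ?train in this version for the actual name), and the
snow cluster is just one possible backend:

```r
library(caret)
library(snow)

cl <- makeCluster(4, type = "SOCK")

## any function with lapply's signature can drive the model-fitting loop
parLapplyFun <- function(X, FUN, ...) parLapply(cl, X, FUN, ...)

fit <- train(x, y, method = "rf",
             computeFunction = parLapplyFun)  # hypothetical argument name

stopCluster(cl)
```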

Also, fixed a bug where the MSE (instead of RMSE) was reported
for random forest OOB resampling

There are more examples in train.

Changes to confusionMatrix, sensitivity,
specificity and the predictive value functions: each was made
more generic with default and table methods;
confusionMatrix "extractor" functions for matrices and tables
were added; the pos/neg predicted value computations were changed to
incorporate prevalence; prevalence was added as an option to several
functions; detection rate and prevalence statistics were added to
confusionMatrix; and the examples were expanded in the help
files.
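
A small self-contained illustration of the new table method with a
user-supplied prevalence (the factor levels are invented):

```r
library(caret)

pred <- factor(c("yes", "no", "yes", "yes"), levels = c("yes", "no"))
obs  <- factor(c("yes", "no", "no",  "yes"), levels = c("yes", "no"))

## prevalence feeds into the positive/negative predictive values
cm <- confusionMatrix(table(pred, obs), prevalence = 0.1)
cm$byClass  # includes the detection rate and prevalence statistics
```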

This version of caret will break compatibility with
caretLSF and caretNWS. However, these packages will not be
needed now and will be deprecated.

Changes in version 3.51

Updated the man files and manuals.

Changes in version 3.50

Added qda, mda and pda.

Changes in version 3.49

Fixed bug in resampleHist. Also added a check in the train functions
that traps errors when glm models are used with more than two classes.

Changes in version 3.48

Added glm models. Also, added varImp.bagEarth to the
namespace.

Changes in version 3.47

Added sda from the sda package. There was a naming
conflict between sda::sda and sparseLDA::sda. The
method value for sparseLDA was changed from "sda" to
"sparseLDA".

Changes in version 3.44

Also, a bug was fixed where the ellipses were not passed into a
few of the newer models (such as penalized and ppr)

Changes in version 3.43

Added the penalized model from the penalized package. In
caret, it is regression only although the package allows for
classification via glm models. However, it does not allow the user to
pass the classes in (just an indicator matrix). Because of this, it
doesn't really work with the rest of the classification tools in the
package.

Changes in version 3.42

Added a little more formatting to print.train

Changes in version 3.41

For gbm, let the user over-ride the default value of the
distribution argument (brought to us by Peter Tait via R-help).

Changes in version 3.40

Changed predict.preProcess so that it doesn't crash if
newdata does not have all of the variables used to originally
pre-process *unless* PCA processing was requested.

Changes in version 3.39

Fixed bug in varImp.rpart when the model had only primary
splits.

Minor changes to the Affy normalization code

Changed typo in predictors man page

Changes in version 3.38

Added a new class called predictors that returns the
names of the predictors that were used in the final model.

Changes in version 3.37

Added the ability of train to use custom made performance
functions so that the tuning parameters can be chosen on the basis of
things other than RMSE/R-squared and Accuracy/Kappa.

A new argument was added to trainControl called
"summaryFunction" that is used to specify the function used to
compute performance metrics. The default function preserves the
functionality prior to this new version.

A new argument to train is "maximize", a logical
indicating whether the performance measure specified in the "metric"
argument to train should be maximized or minimized.

The selection function specified in trainControl carries
the maximize argument with it so that customized performance
metrics can be used.
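
The pieces fit together roughly as below; madSummary and the data
objects are invented for illustration, but the (data, lev, model)
signature is what trainControl expects of a summary function:

```r
library(caret)

## custom metric: median absolute deviation of the residuals
madSummary <- function(data, lev = NULL, model = NULL) {
  c(MAD = median(abs(data$obs - data$pred)))
}

ctrl <- trainControl(summaryFunction = madSummary)
fit <- train(x, y, method = "lm", trControl = ctrl,
             metric = "MAD", maximize = FALSE)  # smaller MAD is better
```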

A bug was fixed in confusionMatrix (thanks to Gabor
Grothendieck).

Another bug was fixed related to predictions from least squares
SVMs.

Changes in version 3.36

Added superpc from the superpc package. One note:
the data argument that is passed to superpc is saved in
the object that results from superpc.train. This is used later
in the prediction function.

Changes in version 3.35

Changes in version 3.34

Changes in version 3.33

Added xyplot.train, densityplot.train,
histogram.train and stripplot.train. These are all
functions to plot the resampling points. There is some overlap between
these functions, plot.train and
resampleHist. plot.train gives the average metrics only
while these plot all of the resampled performance
metrics. resampleHist could plot all of the points, but only
for the final optimal set of predictors.

To use these functions, there is a new argument in
trainControl called returnResamp which should have
values "none", "final" and "all". The default is "final" to be
consistent with previous versions, but "all" should be specified to
use these new functions to their fullest.
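
For example (a sketch; x and y are placeholders):

```r
library(caret)

## keep every resampled metric so the lattice methods have data to plot
ctrl <- trainControl(method = "cv", number = 10, returnResamp = "all")
fit <- train(x, y, method = "rpart", trControl = ctrl)

densityplot(fit)  # distributions of the resampled performance metrics
xyplot(fit)       # resampled performance across tuning parameter values
```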

Changes in version 3.32

The functions predict.train and predict.list were
added as alternatives to the extractPrediction and
extractProb functions.
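
A quick sketch of the new interface (the fitted objects and testX are
placeholders):

```r
library(caret)

fit <- train(x, y, method = "gbm")

predict(fit, newdata = testX)                 # predicted classes or values
predict(fit, newdata = testX, type = "prob")  # class probabilities
predict(list(fit1, fit2), newdata = testX)    # one prediction set per model
```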

Also added logitBoost from the caTools
package. This package doesn't have a namespace and RWeka has a
function with the same name. It was suggested to use the "::" prefix
to differentiate them (but we'll see how this works).