Bootstrap Cross Validation

Bootstrap Methods

Bootstrapping is a powerful tool for estimating the accuracy of a model and its test error.

It can estimate the likely future performance of a given modeling procedure on new data that has not yet been observed.

The Algorithm

We have a training dataset of size N.

Draw a random sample of size N with replacement. This gives a new dataset in which some observations may be repeated and others may not appear at all.

Create B such new datasets. These are called bootstrap datasets.

Build the model on each of these B datasets. We can then test each model on the original training dataset.
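The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic, "the model" is a simple least-squares line, B is set to 200, and the helper names (bootstrap_datasets, fit_line, mse) are hypothetical and chosen just for this sketch.

```python
import random

def bootstrap_datasets(data, B, seed=0):
    """Return B bootstrap datasets, each drawn with replacement
    and each the same size as the original data."""
    rng = random.Random(seed)
    n = len(data)
    return [[data[rng.randrange(n)] for _ in range(n)] for _ in range(B)]

def fit_line(points):
    """Least-squares fit of y = a + b*x (an assumed stand-in for 'the model')."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx if sxx > 0 else 0.0
    a = mean_y - b * mean_x
    return a, b

def mse(model, points):
    """Mean squared error of a fitted line on a set of points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in points) / len(points)

# Synthetic training data of size N = 100 (assumption for illustration).
data_rng = random.Random(42)
train = []
for _ in range(100):
    x = data_rng.uniform(0, 10)
    train.append((x, 2.0 * x + 1.0 + data_rng.gauss(0, 1)))

B = 200  # number of bootstrap datasets (arbitrary choice)
boot_sets = bootstrap_datasets(train, B)

# Fit one model per bootstrap dataset, then test it on the original training data.
errors = [mse(fit_line(d), train) for d in boot_sets]
mean_error = sum(errors) / B
print("Mean error over", B, "bootstrap models:", round(mean_error, 3))
```

The spread of the values in errors, not only their mean, gives an idea of how much the model's performance varies from one resampled dataset to another.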

Bootstrap Example

Example

We have a training dataset of size 500.

Bootstrap Data-1:

Create a dataset of size 500. To create it, draw a random point from the training data, note it down, then put it back. Draw another point in the same way. Repeat this process 500 times to get a dataset of size 500. Call this Bootstrap Data-1.
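The draw-note-replace procedure for Bootstrap Data-1 might look like this in Python. It is a minimal sketch: the original data is represented by the row indices 0 to 499, and the seed and variable names are assumptions made for the illustration.

```python
import random
from collections import Counter

N = 500
original = list(range(N))  # stand-in for the 500 training observations

rng = random.Random(1)  # fixed seed only so the sketch is repeatable

# Draw one point, note it down, put it back; repeat N times.
boot_1 = [rng.choice(original) for _ in range(N)]

counts = Counter(boot_1)
n_unique = len(counts)          # observations that appear at least once
n_missing = N - n_unique        # observations that never appear
n_repeated = sum(1 for c in counts.values() if c > 1)  # appear more than once

print("Size of Bootstrap Data-1:", len(boot_1))
print("Distinct observations:", n_unique)
print("Observations never drawn:", n_missing)
print("Observations drawn more than once:", n_repeated)
```

Because each draw is made with replacement, the chance that a given observation is never picked in N draws is (1 - 1/N)^N, which is close to 0.368 for large N. So on average roughly 63% of the original observations appear in a bootstrap dataset, and the rest are left out.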