RANSAC (RANdom SAmple Consensus) is an iterative algorithm for the robust
estimation of parameters from a subset of inliers from the complete data set.
More information, including a detailed description of the algorithm, can be
found in the documentation of the linear_model sub-package.

score(X, y): Returns the score of the model on the given test data
(for regressors such as LinearRegression this is the coefficient of
determination R^2, not a classification accuracy). The score is used
for the stop criterion defined by stop_score. Additionally, the score
is used to decide which of two equally large consensus sets is chosen
as the better one.

predict(X): Returns predicted values using the linear model, which
are used to compute the residual error with the loss function.

If base_estimator is None, then
base_estimator=sklearn.linear_model.LinearRegression() is used for
target values of dtype float.

Note that the current implementation only supports regression
estimators.

min_samples:int (>= 1) or float ([0, 1]), optional

Minimum number of samples chosen randomly from the original data. Treated
as an absolute number of samples for min_samples >= 1, and as a
relative number ceil(min_samples * X.shape[0]) for
min_samples < 1. This is typically chosen as the minimal number of
samples necessary to estimate the given base_estimator. By default a
sklearn.linear_model.LinearRegression() estimator is assumed and
min_samples is chosen as X.shape[1] + 1.
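
The rule above can be sketched in a few lines. This is only an illustration of the description, not the library's code; the helper name resolve_min_samples is hypothetical.

```python
import math


def resolve_min_samples(min_samples, n_samples, n_features):
    """Turn a min_samples setting into an absolute sample count.

    Hypothetical helper that mirrors the rule described in the text.
    """
    if min_samples is None:
        # Default described above: X.shape[1] + 1 for a linear model.
        return n_features + 1
    if 0 < min_samples < 1:
        # Relative value: a fraction of the number of rows, rounded up.
        return math.ceil(min_samples * n_samples)
    if min_samples >= 1:
        # Absolute value: used as given.
        return int(min_samples)
    raise ValueError("min_samples must be positive")


print(resolve_min_samples(None, n_samples=100, n_features=3))
print(resolve_min_samples(0.25, n_samples=10, n_features=3))
```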

residual_threshold:float, optional

Maximum residual for a data sample to be classified as an inlier.
By default the threshold is chosen as the MAD (median absolute
deviation) of the target values y.
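
As a rough illustration of the default threshold, the MAD of the target values can be computed with the standard library alone. The function name mad and the sample values are made up for this sketch.

```python
from statistics import median


def mad(y):
    """Median absolute deviation of y around its median."""
    center = median(y)
    return median([abs(value - center) for value in y])


# One gross outlier (100.0) barely affects the MAD.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
default_threshold = mad(y)
```

Because the MAD is built from medians, a single extreme target value does not inflate the threshold the way a standard deviation would.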

is_data_valid:callable, optional

This function is called with the randomly selected data before the
model is fitted to it: is_data_valid(X, y). If its return value is
False the current randomly chosen sub-sample is skipped.

is_model_valid:callable, optional

This function is called with the estimated model and the randomly
selected data: is_model_valid(model, X, y). If its return value is
False the current randomly chosen sub-sample is skipped.
Rejecting samples with this function is computationally costlier than
with is_data_valid. is_model_valid should therefore only be used if
the estimated model is needed for making the rejection decision.
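
A minimal sketch of what such callables might look like. The specific rejection rules (a constant feature, a negative first coefficient) are arbitrary assumptions chosen for illustration, and fake_model is a stand-in object that only mimics the coef_ attribute of a fitted linear model.

```python
from types import SimpleNamespace

import numpy as np


def is_data_valid(X, y):
    """Reject sub-samples in which some feature takes a single value."""
    X = np.asarray(X, dtype=float)
    return bool(np.all(X.max(axis=0) > X.min(axis=0)))


def is_model_valid(model, X, y):
    """Accept only models whose first coefficient is non-negative."""
    return bool(np.ravel(model.coef_)[0] >= 0.0)


# Stand-in for a fitted linear model; only `coef_` is used here.
fake_model = SimpleNamespace(coef_=np.array([-1.5]))
```

Both functions could then be passed as the is_data_valid and is_model_valid arguments. The first check needs only the sampled data, so it runs before any fitting; the second needs the fitted model, which is why it is the costlier of the two.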

max_trials:int, optional

Maximum number of iterations for random sample selection.

max_skips:int, optional

Maximum number of iterations that can be skipped due to finding zero
inliers or invalid data defined by is_data_valid or invalid models
defined by is_model_valid.

New in version 0.19.

stop_n_inliers:int, optional

Stop iteration if at least this number of inliers are found.

stop_score:float, optional

Stop iteration if the score is greater than or equal to this threshold.

stop_probability:float in range [0, 1], optional

RANSAC iteration stops once enough sub-samples have been drawn that, with
this probability, at least one of them is free of outliers. This requires
generating at least N samples (iterations):

N >= log(1 - probability) / log(1 - e**m)

where the probability (confidence) is typically set to a high value such
as 0.99 (the default), e is the current fraction of inliers w.r.t. the
total number of samples, and m is the number of samples drawn per
iteration (min_samples).
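
The bound can be evaluated directly. The sketch below takes m to be the sub-sample size (min_samples); the function name required_trials and the handling of the degenerate cases are assumptions made for this example.

```python
import math


def required_trials(inlier_fraction, min_samples, probability=0.99):
    """Smallest integer N with N >= log(1 - probability) / log(1 - e**m)."""
    # Chance that one random sub-sample of size m contains only inliers.
    p_clean = inlier_fraction ** min_samples
    if probability <= 0.0 or p_clean >= 1.0:
        return 0  # limit of the formula: no further trials are required
    if probability >= 1.0 or p_clean <= 0.0:
        return math.inf  # no finite number of trials can guarantee it
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - p_clean))


# Half of the data are inliers and two points define the model.
print(required_trials(0.5, min_samples=2))
```

The bound grows quickly as the inlier fraction drops or as more samples are needed per fit, which is one reason to keep min_samples as small as the estimator allows.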

loss:string, callable, optional, default “absolute_loss”

The string inputs “absolute_loss” and “squared_loss” are supported; they
compute the absolute loss and the squared loss per sample,
respectively.

If loss is a callable, it should be a function that takes two arrays
as inputs, the true and the predicted values, and returns a 1-D
array whose i-th value is the loss on X[i].

If the loss on a sample is greater than the residual_threshold,
then this sample is classified as an outlier.
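
A small sketch of per-sample loss callables and the resulting inlier decision, assuming 1-D targets. The helper classify_inliers is hypothetical, and its "loss <= residual_threshold" comparison simply follows the sentence above rather than any particular library version.

```python
import numpy as np


def absolute_loss(y_true, y_pred):
    """Per-sample absolute error, returned as a 1-D array."""
    return np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))


def squared_loss(y_true, y_pred):
    """Per-sample squared error, returned as a 1-D array."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return diff ** 2


def classify_inliers(y_true, y_pred, residual_threshold, loss=absolute_loss):
    """Boolean mask: True where the loss does not exceed the threshold."""
    return loss(y_true, y_pred) <= residual_threshold


mask = classify_inliers([1.0, 2.0, 3.0, 10.0], [1.1, 2.0, 2.5, 4.0],
                        residual_threshold=1.0)
```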

random_state:int, RandomState instance or None, optional

The generator used to draw the random sub-samples. If int, random_state is
the seed used by the random number generator; if RandomState instance,
random_state is the random number generator; if None, the random number
generator is the RandomState instance used by np.random.

Attributes:

estimator_:object

Best fitted model (copy of the base_estimator object).

n_trials_:int

Number of random selection trials until one of the stop criteria is
met. It is always <= max_trials.

inlier_mask_:bool array of shape [n_samples]

Boolean mask of inliers classified as True.

n_skips_no_inliers_:int

Number of iterations skipped due to finding zero inliers.

New in version 0.19.

n_skips_invalid_data_:int

Number of iterations skipped due to invalid data defined by
is_data_valid.

New in version 0.19.

n_skips_invalid_model_:int

Number of iterations skipped due to an invalid model defined by
is_model_valid.
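
To make the counters concrete, here is a toy line-fitting loop in plain Python that tracks a trial count and an invalid-data skip count. It is a simplified sketch, not the estimator's implementation: the function toy_ransac_line, its bookkeeping, and the choice to stop once the skip count exceeds max_skips are all assumptions made for illustration.

```python
import random


def toy_ransac_line(points, residual_threshold, max_trials=100, max_skips=10,
                    stop_n_inliers=None, is_data_valid=None, seed=0):
    """Fit y = slope * x + intercept from random pairs of (x, y) points."""
    rng = random.Random(seed)
    stats = {"n_trials": 0, "n_skips_invalid_data": 0}
    best_model, best_inliers = None, []
    while stats["n_trials"] < max_trials:
        if stats["n_skips_invalid_data"] > max_skips:
            break  # too many rejected sub-samples
        stats["n_trials"] += 1
        (x0, y0), (x1, y1) = rng.sample(points, 2)
        rejected = is_data_valid is not None and not is_data_valid((x0, y0), (x1, y1))
        if x0 == x1 or rejected:
            # A vertical pair cannot define the line; count it as a skip.
            stats["n_skips_invalid_data"] += 1
            continue
        slope = (y1 - y0) / (x1 - x0)
        intercept = y0 - slope * x0
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= residual_threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (slope, intercept), inliers
        if stop_n_inliers is not None and len(best_inliers) >= stop_n_inliers:
            break  # consensus set is large enough
    return best_model, best_inliers, stats


# Ten points lying exactly on y = 2x + 1.
points = [(x, 2 * x + 1) for x in range(10)]
model, inliers, stats = toy_ransac_line(points, 0.5, stop_n_inliers=10)
```

With clean data the first valid pair already explains every point, so the loop stops after a single trial; with a validity check that always fails, only the skip counter advances until the max_skips limit ends the loop.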

The method works on simple estimators as well as on nested objects
(such as pipelines). The latter have parameters of the form
<component>__<parameter> so that it’s possible to update each
component of a nested object.