Transformers are usually combined with classifiers, regressors or other
estimators to build a composite estimator. The most common tool is a
Pipeline. Pipeline is often used in combination with
FeatureUnion which concatenates the output of
transformers into a composite feature space. TransformedTargetRegressor deals with transforming the target
(e.g. log-transforming y). In contrast, Pipelines only transform the
observed data (X).

Pipeline can be used to chain multiple estimators
into one. This is useful as there is often a fixed sequence
of steps in processing the data, for example feature selection, normalization
and classification. Pipeline serves multiple purposes here:

Convenience and encapsulation

You only have to call fit and predict once on your
data to fit a whole sequence of estimators.

Joint parameter selection

You can grid search
over parameters of all estimators in the pipeline at once.

Safety

Pipelines help avoid leaking statistics from your test data into the
trained model in cross-validation, by ensuring that the same samples are
used to train the transformers and predictors.
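As a minimal sketch of joint parameter selection, the snippet below searches over parameters of two steps at once. The step names ("reduce_dim", "clf"), the iris dataset, and the grid values are illustrative assumptions rather than anything prescribed by the text. Parameters of a step are addressed as the step name, two underscores, then the parameter name.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Step names are arbitrary identifiers chosen for this sketch.
pipe = Pipeline([("reduce_dim", PCA()), ("clf", SVC())])

# <step name>__<parameter name> reaches into the individual estimators.
param_grid = {
    "reduce_dim__n_components": [2, 3],
    "clf__C": [0.1, 1, 10],
}

# Each candidate refits the whole pipeline on the training folds only,
# so the PCA never sees the held-out fold.
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
best_params = search.best_params_
```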

All estimators in a pipeline, except the last one, must be transformers
(i.e. must have a transform method).
The last estimator may be any type (transformer, classifier, etc.).

Calling fit on the pipeline is the same as calling fit on
each estimator in turn, transforming the input and passing it on to the next step.
The pipeline has all the methods that the last estimator in the pipeline has,
i.e. if the last estimator is a classifier, the Pipeline can be used
as a classifier. If the last estimator is a transformer, again, so is the
pipeline.
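The sequence mentioned earlier (feature selection, normalization, classification) might look like the sketch below. The synthetic data, the step names, and the particular estimators are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only to make the sketch runnable.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=10)),   # transformer
    ("scale", StandardScaler()),                # transformer
    ("clf", LogisticRegression(max_iter=1000)), # final estimator
])

# One fit call fits every step in order; one predict call transforms
# the input through the first two steps and predicts with the last.
pipe.fit(X, y)
predictions = pipe.predict(X)

# The pipeline exposes the methods of its last step.
has_predict_proba = hasattr(pipe, "predict_proba")
```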

Fitting transformers may be computationally expensive. With its
memory parameter set, Pipeline will cache each transformer
after calling fit.
This avoids refitting the transformers within a pipeline
when the parameters and input data are identical. A typical example is
a grid search, in which the transformers can be fitted only once and reused for
each configuration.

The parameter memory is needed in order to cache the transformers.
memory can be either a string containing the path of the directory in which to
cache the transformers or a joblib.Memory
object:
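A minimal sketch of both forms follows. The temporary directory, the dataset, and the helper make_steps are assumptions introduced for this example; the cache directory is removed at the end.

```python
from shutil import rmtree
from tempfile import mkdtemp

from joblib import Memory
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)


def make_steps():
    # Hypothetical helper: fresh estimator instances for each pipeline.
    return [("reduce_dim", PCA(n_components=2)), ("clf", SVC())]


cachedir = mkdtemp()
try:
    # Form 1: memory given as a directory path.
    pipe_from_path = Pipeline(make_steps(), memory=cachedir)
    pipe_from_path.fit(X, y)

    # Form 2: memory given as a joblib.Memory object.
    memory = Memory(location=cachedir, verbose=0)
    pipe_from_memory = Pipeline(make_steps(), memory=memory)
    pipe_from_memory.fit(X, y)

    scores = (pipe_from_path.score(X, y), pipe_from_memory.score(X, y))
finally:
    # Clear the cache directory when it is no longer needed.
    rmtree(cachedir)
```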

Enabling caching triggers a clone of the transformers before fitting.
Therefore, the transformer instance given to the pipeline cannot be
inspected directly.
For example, if a PCA instance pca2 is passed to a cached pipeline, accessing
its fitted attributes after fitting the pipeline will raise an AttributeError,
since pca2 itself remains an unfitted transformer.
Instead, use the attribute named_steps to inspect estimators within
the pipeline:
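The behaviour described above can be sketched as follows. The dataset, the step names, and the temporary cache directory are assumptions for illustration.

```python
from shutil import rmtree
from tempfile import mkdtemp

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

cachedir = mkdtemp()
pca2 = PCA(n_components=2)
try:
    cached_pipe = Pipeline(
        [("reduce_dim", pca2), ("clf", SVC())], memory=cachedir
    )
    cached_pipe.fit(X, y)

    # The instance passed in was cloned before fitting, so it has no
    # fitted attributes; pca2.components_ would raise AttributeError.
    original_is_fitted = hasattr(pca2, "components_")

    # The fitted clone is reachable through named_steps.
    fitted_pca = cached_pipe.named_steps["reduce_dim"]
    components_shape = fitted_pca.components_.shape
finally:
    rmtree(cachedir)
```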

TransformedTargetRegressor transforms the targets y before fitting
a regression model. The predictions are mapped back to the original space via
an inverse transform. It takes as an argument the regressor that will be used
for prediction, and the transformer that will be applied to the target
variable:
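A minimal sketch with a log transform of the target is shown below. The synthetic data is constructed so that log(y) is exactly linear in X; the choice of LinearRegression and FunctionTransformer is an assumption for illustration, and any transformer with an inverse_transform could take its place.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import FunctionTransformer

# Synthetic data: log(y) = 1.5 * x + 0.5, so y is strictly positive.
rng = np.random.RandomState(0)
X = rng.uniform(0, 2, size=(100, 1))
y = np.exp(1.5 * X[:, 0] + 0.5)

# The transformer maps y to log(y) before fitting; its inverse maps
# predictions back to the original scale.
log_transformer = FunctionTransformer(func=np.log, inverse_func=np.exp)

regr = TransformedTargetRegressor(
    regressor=LinearRegression(), transformer=log_transformer
)
regr.fit(X, y)

# Predictions are returned in the original (untransformed) space.
y_pred = regr.predict(X)

# The fitted inner regressor is available as regressor_.
slope = regr.regressor_.coef_.ravel()[0]
```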

FeatureUnion combines several transformer objects into a new
transformer that combines their output. A FeatureUnion takes
a list of transformer objects. During fitting, each of these
is fit to the data independently. The transformers are applied in parallel,
and the feature matrices they output are concatenated side-by-side into a
larger matrix.

(A FeatureUnion has no way of checking whether two transformers
might produce identical features. It only produces a union when the
feature sets are disjoint, and making sure they are is the caller’s
responsibility.)

A FeatureUnion is built using a list of (key, value) pairs,
where the key is the name you want to give to a given transformation
(an arbitrary string; it only serves as an identifier)
and the value is an estimator object:
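For instance, the sketch below concatenates two PCA components with one feature chosen by univariate selection. The names "pca" and "select", the dataset, and the transformers themselves are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import FeatureUnion

X, y = load_iris(return_X_y=True)

# Each (key, value) pair names one transformer.
combined = FeatureUnion([
    ("pca", PCA(n_components=2)),
    ("select", SelectKBest(k=1)),
])

# Both transformers are fit on the same X independently; their outputs
# (2 columns and 1 column) are stacked side by side.
X_combined = combined.fit_transform(X, y)
```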

Many datasets contain features of different types, say text, floats, and dates,
where each type of feature requires separate preprocessing or feature
extraction steps. Often it is easiest to preprocess data before applying
scikit-learn methods, for example using pandas.
Processing your data before passing it to scikit-learn might be problematic for
one of the following reasons:

Incorporating statistics from test data into the preprocessors makes
cross-validation scores unreliable (known as data leakage),
for example in the case of scalers or imputing missing values.

You may want to include the parameters of the preprocessors in a
parameter search.

compose.ColumnTransformer applies different transformations to different
columns inside scikit-learn. Suppose the data has a 'city' column, a 'title'
column, and some rating columns. We might want to encode the 'city' column as a
categorical variable using preprocessing.OneHotEncoder but apply a
feature_extraction.text.CountVectorizer to the 'title' column.
As we might use multiple feature extraction methods on the same column, we give
each transformer a unique name, say 'city_category' and 'title_bow'.
By default, the remaining rating columns are ignored (remainder='drop'):

The CountVectorizer expects a 1D array as
input, and therefore its column is specified as a string ('title').
However, preprocessing.OneHotEncoder,
like most other transformers, expects 2D data, so in that case you need
to specify the column as a list of strings (['city']).

Apart from a scalar or a single item list, the column selection can be specified
as a list of multiple items, an integer array, a slice, or a boolean mask.
Strings can reference columns if the input is a DataFrame; integers are always
interpreted as positional columns.
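The selection forms listed above can be compared on a small example. The DataFrame and the choice of StandardScaler are assumptions; all four transformers below are meant to select the same two columns.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric data with three columns.
X = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [10.0, 20.0, 30.0],
    "c": [5.0, 5.0, 6.0],
})

# Four ways of selecting columns "a" and "b".
by_names = ColumnTransformer([("scale", StandardScaler(), ["a", "b"])])
by_positions = ColumnTransformer([("scale", StandardScaler(), [0, 1])])
by_slice = ColumnTransformer([("scale", StandardScaler(), slice(0, 2))])
by_mask = ColumnTransformer(
    [("scale", StandardScaler(), np.array([True, True, False]))]
)

results = [
    ct.fit_transform(X)
    for ct in (by_names, by_positions, by_slice, by_mask)
]
```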

We can keep the remaining rating columns by setting
remainder='passthrough'. The values are appended to the end of the
transformation:
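A sketch of the passthrough behaviour is shown below, again on hypothetical data with made-up rating column names. The result is converted to a dense array only so that the appended columns are easy to inspect.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data; the rating columns are not listed in any transformer.
X = pd.DataFrame({
    "city": ["London", "London", "Paris", "Sallisaw"],
    "title": [
        "His Last Bow",
        "How Watson Learned the Trick",
        "A Moveable Feast",
        "The Grapes of Wrath",
    ],
    "expert_rating": [5, 3, 4, 5],
    "user_rating": [4, 5, 4, 3],
})

column_trans = ColumnTransformer(
    [
        ("city_category", OneHotEncoder(), ["city"]),
        ("title_bow", CountVectorizer(), "title"),
    ],
    remainder="passthrough",  # keep the unlisted columns unchanged
)

X_trans = column_trans.fit_transform(X)

# The output may be sparse or dense depending on its density.
X_dense = X_trans.toarray() if sparse.issparse(X_trans) else np.asarray(X_trans)

# The passed-through rating columns sit at the end of the output.
ratings = X_dense[:, -2:]
```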