Correlation

Abstract

Correlation quantifies the relationship between features in order to identify the feature candidates that may be best suited to achieve desired effects. Linear correlation methods are robust and computationally efficient but detect only linear dependencies. Nonlinear correlation methods can detect nonlinear dependencies but need to be carefully parametrized. As a popular example of nonlinear correlation, we present the chi-square test for independence, which is based on histogram counts. Nonlinear correlation can also be quantified by the regression validation error. Correlation does not imply causality, so correlation analysis may reveal spurious correlations. If the underlying features are known, then spurious correlations may be handled with partial correlation methods.
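The contrast between linear correlation and the histogram-based chi-square statistic can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the helper names pearson_r and chi_square_statistic, the synthetic data, and the choice of 5 bins per feature. It shows a quadratic dependency that a linear correlation coefficient largely misses, while the chi-square statistic computed from histogram counts is far larger than for an independent pair.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def chi_square_statistic(x, y, bins=5):
    """Chi-square statistic for independence from a 2-D histogram.

    The number of bins is the parameter that has to be chosen with care;
    bins=5 is an arbitrary illustrative choice.
    """
    observed, _, _ = np.histogram2d(x, y, bins=bins)
    row = observed.sum(axis=1, keepdims=True)   # marginal counts of x
    col = observed.sum(axis=0, keepdims=True)   # marginal counts of y
    expected = row * col / observed.sum()       # counts under independence
    mask = expected > 0                         # skip empty rows/columns
    chi2 = float(((observed[mask] - expected[mask]) ** 2 / expected[mask]).sum())
    dof = (np.count_nonzero(row) - 1) * (np.count_nonzero(col) - 1)
    return chi2, dof


# Synthetic data (assumed for illustration only)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=2000)
y_dep = x ** 2 + 0.05 * rng.normal(size=x.size)   # nonlinear dependency on x
y_ind = rng.uniform(-1.0, 1.0, size=x.size)       # independent of x

r_dep = pearson_r(x, y_dep)
chi2_dep, dof_dep = chi_square_statistic(x, y_dep)
chi2_ind, dof_ind = chi_square_statistic(x, y_ind)

print(f"Pearson r (quadratic pair):    {r_dep:.3f}")
print(f"chi-square (quadratic pair):   {chi2_dep:.1f} with {dof_dep} degrees of freedom")
print(f"chi-square (independent pair): {chi2_ind:.1f} with {dof_ind} degrees of freedom")
```

To turn the statistic into a test decision, it would be compared against a quantile of the chi-square distribution with the reported degrees of freedom. Changing the bins argument changes both the statistic and the degrees of freedom, which is one concrete sense in which the method needs careful parametrization.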

References

K. Pearson. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine, 50(302):157–175, 1900.