
Compute the pairwise covariance among the series of a DataFrame.
The returned data frame is the covariance matrix of the columns
of the DataFrame.
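
As a minimal sketch of the call described above (the column names and values are made up for illustration), the result is a square DataFrame indexed by the original column names on both axes:

```python
import pandas as pd

# Hypothetical data: three short numeric columns with no missing values.
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [2.0, 4.0, 6.0, 8.0],
        "c": [4.0, 3.0, 2.0, 1.0],
    }
)

# Rows and columns of the result are both labelled "a", "b", "c".
cov = df.cov()
print(cov)
```

The diagonal holds each column's variance, and the off-diagonal entries hold the covariance of each pair of columns, so the matrix is symmetric.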

Both NA and null values are automatically excluded from the
calculation. (See the note below about bias from missing values.)
A threshold can be set for the minimum number of
observations required for each value computed. Column pairs with fewer
observations than this threshold are returned as NaN.
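
The following sketch illustrates the threshold on a small, made-up frame. The data and the choice of `min_periods=3` are assumptions for the example only; each pair of columns is evaluated on the rows where both values are present.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values scattered across columns.
df = pd.DataFrame(
    {
        "x": [1.0, 2.0, np.nan, 4.0, 5.0],
        "y": [2.0, np.nan, 3.0, 8.0, 10.0],
        "z": [np.nan, np.nan, np.nan, 1.0, 2.0],
    }
)

# Default: every pair with enough overlapping rows gets a value.
pairwise = df.cov()

# Require at least 3 overlapping observations per pair.
# "x" and "y" overlap on 3 rows, so that entry survives;
# any pair involving "z" overlaps on only 2 rows and becomes NaN.
strict = df.cov(min_periods=3)

print(pairwise)
print(strict)
```

Note that the threshold also applies to the diagonal: with only two observed values, the variance of `z` is reported as NaN under `min_periods=3`.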

This method is generally used for the analysis of time series data to
understand the relationship between different measures
across time.

Parameters

min_periods : int, optional

Minimum number of observations required per pair of columns
to have a valid result.

Notes

Returns the covariance matrix of the DataFrame’s time series.
The covariance is normalized by N-1, where N is the number of
observations available for each pair of columns.
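
To make the N-1 normalization concrete, the sketch below recomputes the matrix by hand for a small frame with no missing values and compares it with the method's output. The data and variable names are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data with no missing values, so N is simply the row count.
df = pd.DataFrame({"p": [1.0, 2.0, 4.0, 7.0], "q": [1.0, 3.0, 2.0, 6.0]})

n = len(df)

# Subtract each column's mean, then form the cross-product matrix
# and divide by N-1 (the sample covariance).
centered = df - df.mean()
manual = centered.T @ centered / (n - 1)

# Both should agree up to floating-point rounding.
matches = np.allclose(manual.to_numpy(), df.cov().to_numpy())
print(matches)
```

Dividing by N instead of N-1 would give the population covariance, which is smaller by a factor of (N-1)/N.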

For DataFrames whose Series have missing data (assuming that
data is missing at random),
the returned covariance matrix will be an unbiased estimate
of the variance and covariance between the member Series.

However, for many applications this estimate may not be acceptable
because the estimated covariance matrix is not guaranteed to be positive
semi-definite. This could lead to estimated correlations having
absolute values which are greater than one, and/or a non-invertible
covariance matrix. See Estimation of covariance matrices for more details.
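
The failure mode can be reproduced with a deliberately contrived frame. In this sketch (the data is an assumption chosen to exaggerate the effect), each pair of columns overlaps on a different block of rows, so the pairwise estimates are mutually inconsistent:

```python
import numpy as np
import pandas as pd

nan = np.nan

# Hypothetical data: "a" and "b" overlap on rows 0-2, "b" and "c" on
# rows 3-5, and "a" and "c" on rows 6-8, with contradictory relationships.
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, nan, nan, nan, 1.0, 2.0, 3.0],
        "b": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, nan, nan, nan],
        "c": [nan, nan, nan, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0],
    }
)

cov = df.cov()

# A positive semi-definite matrix has no negative eigenvalues.
min_eigenvalue = np.linalg.eigvalsh(cov.to_numpy()).min()

# Correlations implied by the covariance matrix itself:
# cov[i, j] / (std[i] * std[j]).
std = np.sqrt(np.diag(cov.to_numpy()))
implied_corr = cov / np.outer(std, std)

print(min_eigenvalue)             # negative, about -1.2
print(implied_corr.loc["a", "b"])  # exceeds 1, about 1.25
```

Each variance here is estimated from six observations while each covariance comes from only three, which is why the off-diagonal entries can exceed what the diagonal allows. If a positive semi-definite matrix is required, one common workaround is to drop incomplete rows first with `df.dropna().cov()`, at the cost of discarding data.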