When performing time series analysis, most statistical forecasting methods assume that the time series is approximately stationary. How can you determine whether a time series is stationary? The Augmented Dickey-Fuller (ADF) test is a well-known statistical test that can help you decide. In this article I will show you how to perform the ADF test in Python.

Stationary vs. Non-Stationary

In a stationary time series, statistical properties such as mean and variance are constant over time. In a non-stationary series, these properties are dependent on time.
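To make the distinction concrete, here is a minimal sketch that builds one series of each kind using only the standard library. The data is hypothetical: white noise stands in for a stationary series and a random walk for a non-stationary one. The name stationary_series and the seed are assumptions chosen for illustration.

```python
import random
from itertools import accumulate
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 200

# White noise: mean and variance do not depend on time.
stationary_series = [rng.gauss(0, 1) for _ in range(n)]

# Random walk: the running sum of white-noise steps, a classic unit-root process.
steps = [rng.gauss(0, 1) for _ in range(n)]
non_stationary_series = list(accumulate(steps))

# Compare the mean of the first and second half of each series.
half = n // 2
for name, series in [("stationary", stationary_series),
                     ("non-stationary", non_stationary_series)]:
    print(name, round(mean(series[:half]), 3), round(mean(series[half:]), 3))
```

For white noise the two half-sample means should sit close together, while for a random walk they will usually drift apart. This is only an informal check, not a statistical test.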

Augmented Dickey-Fuller (ADF) Test

To determine whether a time series is stationary, we will use the ADF test, which is a type of unit root test. A unit root is one cause of non-stationarity, and the ADF test checks whether one is present.

A time series is stationary if a shift in time doesn't change its statistical properties, in which case a unit root does not exist.

The null and alternative hypotheses of the Augmented Dickey-Fuller test are defined as follows:

Null hypothesis: a unit root is present.

Alternative hypothesis: there is no unit root. In other words, the series is stationary.
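For readers who prefer the regression form, the hypotheses above are usually written against the standard textbook ADF regression shown below. The symbols are not from this article: y_t is the series, p is the number of lagged differences, and the trend term βt is optional (it is included only when a trend is requested).

```latex
\[
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
           + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t
\]
\[
H_0 : \gamma = 0 \quad \text{(unit root)}
\qquad
H_1 : \gamma < 0 \quad \text{(no unit root)}
\]
```

The test statistic is the t-ratio of the estimated γ, which is why strongly negative values count as evidence against a unit root.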

ADF Python Code

To implement the ADF test in Python, we will be using the statsmodels implementation. Statsmodels is a Python module that provides functions and classes for the estimation of many statistical models. The function to perform ADF is called adfuller.

First, import the required dependencies. Import the adfuller function from the statsmodels.tsa.stattools module. We will also need pandas.

We will create a class called StationarityTests to hold the ADF function. Our class constructor accepts a significance level as a parameter, which defaults to 5%. The class also contains an isStationary variable that will hold the result of the Augmented Dickey-Fuller test. If the time series is stationary, isStationary will be True; otherwise it will be False.

Lastly, we add the ADF implementation via a function called ADF_Stationarity_Test. This function takes a 1-D array as input, along with a flag (defaulting to True) that controls whether the full ADF results are printed. The Akaike Information Criterion (AIC) is used to select the lag length.

The adfuller function returns a tuple of statistics from the ADF test: the test statistic, the p-value, the number of lags used, the number of observations used for the ADF regression, and a dictionary of critical values.

If the P-Value is less than the Significance Level defined, we reject the Null Hypothesis that the time series contains a unit root. In other words, by rejecting the Null hypothesis, we can conclude that the time series is stationary.

If the pValue is very close to your significance level, you can use the Critical Values to help you reach a conclusion regarding the stationarity of your time series.
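The critical-value rule can be sketched as a small helper: reject the unit root hypothesis when the test statistic is more negative than the critical value at the chosen level. The function name rejects_unit_root is hypothetical, and the numbers are the ones quoted in the results discussed later in this article.

```python
def rejects_unit_root(test_statistic, critical_values, level="5%"):
    """Return True when the statistic is below the critical value at `level`."""
    return test_statistic < critical_values[level]


# 5% critical value quoted in this article.
critical_values = {"5%": -2.89}

print(rejects_unit_root(-0.75, critical_values))    # False: fail to reject the null
print(rejects_unit_root(-10.458, critical_values))  # True: reject the null
```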

Testing Time Series for Stationarity

With our class now defined, it's easy to test a time series for stationarity through the Augmented Dickey-Fuller test. We will test the first two series defined in this article for stationarity.

First, let's see if our non_stationary_series is stationary. Instantiate the class and pass non_stationary_series to the ADF_Stationarity_Test function. Then print the result, which is held in the class's isStationary variable.

In this case, it is easy to see that the series is not stationary. The p-value of 0.83 is greater than our 5% significance level, so we fail to reject the null hypothesis that a unit root exists.

Another way to interpret this test is to compare the test statistic, which comes to -0.75, with the critical values. It is greater than the 5% critical value of -2.89 (or the critical value for whichever significance level you need), and therefore we fail to reject the null hypothesis.

For the second series, as you would expect, the results show that it is stationary. In this case, the p-value from our ADF test is much smaller than our 5% significance level, so we can reject the null hypothesis and instead accept the alternative hypothesis that stationarity exists.

Comparing against the critical values yields the same conclusion. The test statistic comes to -10.458, which is much smaller than the 5% critical value of -2.89, so we have enough evidence to conclude that a unit root does not exist. In other words, the series is stationary.

Conclusion

You have now learned how to test for stationarity using the Augmented Dickey-Fuller (ADF) test and how to interpret it using either the p-value or the critical values it returns. We created our own class, which wraps the ADF test from the statsmodels Python package. Knowledge of this statistical test will greatly help you when you are building time series forecasting models, where stationarity is often a strong underlying assumption.

MJ

Advanced analytics professional currently practicing in the healthcare sector.
Passionate about Machine Learning, Operations Research and Programming. Enjoys
the outdoors and extreme sports.