In this article, we discuss **how to test for autocorrelation in R**, especially in the context of regression models.

The non-existence of autocorrelation among residuals is one of the main assumptions of a regression model. If autocorrelation does exist, the outcomes of the model might be unreliable. Therefore, it’s essential to check this assumption.

**In R, the easiest way to test for autocorrelation among residuals is with the acf() function. This function computes and plots the autocorrelation of the residuals of a regression model, which makes your analysis straightforward. Alternatively, you can perform the Durbin-Watson test or the Breusch-Godfrey test.**

In this article, we show how to use the acf() function and perform both tests. We use examples and R code that you can use directly in your own project.

## What is Autocorrelation?

Autocorrelation occurs when the residuals of a regression model are not independent of each other. In other words, the value of residual $e_i$ depends on the value of residual $e_{i-1}$.

Autocorrelation, or lagged correlation, can be measured in different forms, of which lag-1 is the most common. However, you also have lag-2, lag-3, etc.

- Lag-1: Checks the correlation between $e_i$ and $e_{i-1}$.
- Lag-2: Checks the correlation between $e_i$ and $e_{i-2}$.
- Lag-3: Checks the correlation between $e_i$ and $e_{i-3}$.
- etc.
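To make the lag definitions concrete, here is a minimal sketch in base R. It assumes that a simulated AR(1) series stands in for regression residuals; the helper name *lag_cor()* and the coefficient 0.7 are illustrative choices, not part of any package.

```r
# Simulate 200 residual-like values from an AR(1) process
# (the coefficient 0.7 is an arbitrary, illustrative choice)
set.seed(42)
e <- as.numeric(arima.sim(model = list(ar = 0.7), n = 200))

# Hypothetical helper: correlation between e_i and e_{i-k},
# obtained by pairing the series with a copy of itself shifted by k
lag_cor <- function(e, k) {
  n <- length(e)
  cor(e[(k + 1):n], e[1:(n - k)])
}

lag_cors <- sapply(1:3, function(k) lag_cor(e, k))
names(lag_cors) <- paste0("lag-", 1:3)
print(round(lag_cors, 3))

# acf() uses a slightly different estimator (one overall mean, divisor n),
# so its values should be close to, but not identical with, the ones above
acf_vals <- acf(e, lag.max = 3, plot = FALSE)$acf[2:4]
print(round(acf_vals, 3))
```

The two sets of numbers usually agree to within a few hundredths; the small gap comes from the different estimators, not from a different definition of the lag.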

Autocorrelation occurs mainly in time series, for example, when we measure the height of the same person at different moments in time. A person's current height is in general highly correlated with their height at the previous measurement. However, autocorrelation can also occur in other circumstances.

*Autocorrelation typically leads to underestimation of the standard errors of the regression coefficients, which in turn can make you think that predictors are significant when they are not. Therefore, you should always check your regression model for autocorrelation.*

**Do you know: 3 Ways to Check the Homoscedasticity assumption** and **3 Ways to Check for Multicollinearity**.

## 3 Ways to Check for Autocorrelation

**In the sections below we show 3 ways to test for autocorrelation in R**. We cover the ACF plot, the Durbin-Watson test, and the Breusch-Godfrey test. For each method, we include two examples.

In every example, we test the assumption that the residuals are not autocorrelated. We use two regression models: the residuals of the first are highly autocorrelated, while the second meets the assumption of no autocorrelation. Comparing the outcomes of both examples will help you draw the right conclusion in your own analysis.

The first model estimates the daily closing prices of the UK stock exchange (FTSE) based on the German stock exchange (DAX). As you might expect, the closing price of a stock exchange is highly correlated with the closing price of the previous day. Therefore, the residuals might show autocorrelation.

The second model estimates the fuel efficiency (MPG) of a car based on the rear axle ratio (DRAT). In this case, we don't expect autocorrelation of the residuals.

### 1. Test for Autocorrelation with the ACF Plot

**The first way to check for autocorrelation in R is by using the acf() function.** This function is part of the *stats* package and computes and plots estimates of the autocorrelation.

The acf() function requires just one argument, namely a numeric vector with the residuals of the regression model. Optionally, you can set the *type* argument to specify what you want to calculate: *"correlation"* (the default), *"covariance"*, or *"partial"*.

**Syntax**

acf(residuals,type="correlation")

**Example with Autocorrelation**

```
library(stats)
model <- lm(FTSE~DAX, data = EuStockMarkets)
acf(model$residuals, type = "correlation")
```

The image above shows the ACF plot of a regression model with **highly autocorrelated residuals**.

**The interpretation of an ACF plot is simple.** The x-axis corresponds to the different lags of the residuals (i.e., lag-0, lag-1, lag-2, etc.), whereas the y-axis shows the correlation at each lag. Finally, the dashed blue lines mark the approximate 95% confidence bounds: a correlation outside these lines differs significantly from zero.

The first vertical bar (i.e., lag-0) shows the correlation of a residual with itself and therefore is always one. In the absence of autocorrelation, the subsequent vertical bars would drop quickly to almost zero or at least between (or near) the dashed blue lines.

The example above clearly doesn’t show a quick drop in the correlations. So, we can conclude that the residuals are autocorrelated.

**Example without Autocorrelation**

```
library(stats)
model <- lm(mpg~drat, data = mtcars)
acf(model$residuals, type = "correlation")
```

The image above shows the ACF plot of residuals that are not **autocorrelated**.

After the lag-0 correlation, the subsequent correlations drop quickly to zero and stay (mostly) between the limits of the significance level (dashed blue lines). Therefore, we can conclude that the residuals of this model meet the assumption of no autocorrelation.
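If you prefer numbers over reading the plot by eye, the sketch below counts how many lags fall outside the dashed lines for both models. It assumes the default 95% bounds of roughly ±1.96/√n that the ACF plot draws; the helper name *count_outside()* is hypothetical.

```r
# Hypothetical helper: count the non-zero lags whose autocorrelation
# falls outside the approximate 95% bounds shown in the ACF plot
count_outside <- function(model) {
  res <- residuals(model)
  n <- length(res)
  bound <- qnorm(0.975) / sqrt(n)        # about 1.96 / sqrt(n)
  r <- acf(res, plot = FALSE)$acf[-1]    # drop lag-0, which is always 1
  c(lags = length(r), outside = sum(abs(r) > bound), bound = round(bound, 3))
}

# EuStockMarkets is a time-series matrix, so convert it to a data frame first
stock_model <- lm(FTSE ~ DAX, data = as.data.frame(EuStockMarkets))
cars_model  <- lm(mpg ~ drat, data = mtcars)

print(count_outside(stock_model))
print(count_outside(cars_model))
```

A model whose residuals are autocorrelated should show many lags outside the bounds, whereas a few isolated exceedances can occur by chance alone.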

### 2. Perform the Durbin-Watson Test to Check Autocorrelation

**The second method to measure the autocorrelation of residuals in R is by performing the Durbin-Watson test.** More specifically, it checks the first-order autocorrelation (i.e., lag-1).

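**The hypotheses of the Durbin-Watson test are:**

- $H_0$: there is no first-order autocorrelation among the residuals.
- $H_1$: there is first-order autocorrelation among the residuals.

The Durbin-Watson test uses the following test statistic to test the hypothesis:

$$DW = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$

where:

- $n$ is the number of observations.
- $e_i$ is the residual of the $i$-th observation.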

The Durbin-Watson test statistic always has a value between 0 and 4, where:

- [0, 2): means positive autocorrelation
- **2: means no autocorrelation**
- (2, 4]: means negative autocorrelation

**As a rule of thumb, we assume that the residuals are not correlated when the Durbin-Watson test statistic has a value between 1.5 and 2.5**. If the statistic is below 1 or above 3, this is a strong sign of autocorrelation among the residuals.

In R, you can use the dwtest() function from the *lmtest* package to perform the Durbin-Watson test. The function requires just one argument, namely a fitted regression model (i.e., an "*lm*" object) or a model formula.
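The statistic itself is easy to compute by hand: it is the sum of squared differences between successive residuals, divided by the sum of squared residuals. The base R sketch below does exactly that and applies the rule of thumb; the helper names *dw_stat()* and *dw_label()* are hypothetical, and the sketch does not compute a p-value.

```r
# Hypothetical helper: Durbin-Watson statistic computed directly from residuals
dw_stat <- function(model) {
  e <- residuals(model)
  sum(diff(e)^2) / sum(e^2)
}

# Hypothetical helper: apply the 1.5 to 2.5 rule of thumb from the text
dw_label <- function(dw) {
  if (dw < 1.5) {
    "possible positive autocorrelation"
  } else if (dw > 2.5) {
    "possible negative autocorrelation"
  } else {
    "no strong sign of first-order autocorrelation"
  }
}

# EuStockMarkets is a time-series matrix, so convert it to a data frame first
stock_model <- lm(FTSE ~ DAX, data = as.data.frame(EuStockMarkets))
cars_model  <- lm(mpg ~ drat, data = mtcars)

dw_stock <- dw_stat(stock_model)
dw_cars  <- dw_stat(cars_model)
cat("FTSE ~ DAX:", round(dw_stock, 3), "->", dw_label(dw_stock), "\n")
cat("mpg ~ drat:", round(dw_cars, 3), "->", dw_label(dw_cars), "\n")
```

Because the statistic is approximately 2(1 − r), where r is the lag-1 autocorrelation of the residuals, a value near 0 corresponds to r close to 1 and a value near 4 to r close to −1.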

**Example with Autocorrelation**

```
library(lmtest)
model <- lm(FTSE~DAX, data = EuStockMarkets)
lmtest::dwtest(model)
```

The image above shows the output of the Durbin-Watson test in R. Since the test statistic is lower than one (*DW = 0.012*) and the p-value (*< 2.2e-16*) is significant, we reject the null hypothesis and conclude that the residuals are autocorrelated.

**Example without Autocorrelation**

```
library(lmtest)
model <- lm(mpg~drat, data = mtcars)
lmtest::dwtest(model)
```

The output of the Durbin-Watson test above shows a test statistic near 2 (DW = 2.08) and a non-significant p-value (*0.55*). Therefore, we conclude that there is no autocorrelation among the residuals.
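By default, dwtest() tests against the alternative of positive autocorrelation (*alternative = "greater"*). The sketch below shows how you might also run the two-sided version and pull the numbers out of the returned test object. It assumes the *lmtest* package is installed and skips the test otherwise.

```r
cars_model <- lm(mpg ~ drat, data = mtcars)

if (requireNamespace("lmtest", quietly = TRUE)) {
  # Default alternative: positive first-order autocorrelation
  dw_pos <- lmtest::dwtest(cars_model)
  # Two-sided alternative: positive or negative autocorrelation
  dw_two <- lmtest::dwtest(cars_model, alternative = "two.sided")

  # The result is a standard "htest" object, so its parts can be accessed by name
  cat("DW statistic:       ", round(unname(dw_pos$statistic), 3), "\n")
  cat("p-value (greater):  ", round(dw_pos$p.value, 3), "\n")
  cat("p-value (two-sided):", round(dw_two$p.value, 3), "\n")
} else {
  message("Package 'lmtest' is not installed; run install.packages('lmtest') first.")
}
```

Extracting the statistic and p-value this way is handy when you want to check many models in a loop instead of reading printed output.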

### 3. Perform the Breusch-Godfrey Test to Check Autocorrelation

**Lastly, you can perform a Breusch-Godfrey test to check the no autocorrelation assumption in R.**

In contrast to the Durbin-Watson test, the Breusch-Godfrey test checks for autocorrelation among residuals of the first-order, second-order, third-order, etc. Although you normally only need to check the first-order correlation, the Breusch-Godfrey test can still be useful.

The Breusch-Godfrey test uses hypotheses similar to those of the Durbin-Watson test, extended to higher orders:

- $H_0$: there is no autocorrelation among the residuals at any order up to $p$.
- $H_1$: there is autocorrelation among the residuals at one or more orders up to $p$.

In R, you can perform the Breusch-Godfrey test by using the bgtest() function from the *lmtest* package. This function takes a fitted regression model and a positive integer (*order*) that indicates up to which order you want to test. If you omit *order*, the function tests the first order only.

**Syntax of the bgtest() function:**

bgtest(model, order)

In the examples below, we use the bgtest() function to test for autocorrelation up to the third order.

**Example with Autocorrelation**

```
library(lmtest)
model <- lm(FTSE~DAX, data = EuStockMarkets)
lmtest::bgtest(model, order = 3)
```

The image above shows the output of the Breusch-Godfrey test in R. Because the p-value is significant (*< 2.2e-16*) we reject the null hypothesis and conclude that the residuals are autocorrelated.

**Example without Autocorrelation**

```
library(lmtest)
model <- lm(mpg~drat, data = mtcars)
lmtest::bgtest(model, order = 3)
```

In contrast, the residuals of this model don't show autocorrelation. Because the p-value is not significant, we do not reject the null hypothesis.
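If you are curious what the Breusch-Godfrey test computes, the base R sketch below builds the statistic by hand: it regresses the residuals on the original predictors plus the lagged residuals and multiplies the R² of that auxiliary regression by the number of observations. The helper name *bg_by_hand()* is hypothetical, and filling the unobserved starting lags with zeros is an assumption of this sketch, so treat it as an illustration rather than a replacement for bgtest().

```r
# Hypothetical helper: Breusch-Godfrey LM statistic built from an auxiliary regression
bg_by_hand <- function(model, order = 1) {
  e <- as.numeric(residuals(model))
  n <- length(e)
  X <- model.matrix(model)

  # One column per lag; the first k values of lag k are unobserved
  # and are set to zero (an assumption of this sketch)
  Z <- sapply(seq_len(order), function(k) c(rep(0, k), e[seq_len(n - k)]))

  # Auxiliary regression: residuals on the original predictors plus lagged residuals
  aux <- lm.fit(cbind(X, Z), e)

  # R^2 of the auxiliary regression (residuals have mean zero when the model has an intercept)
  r2 <- 1 - sum(aux$residuals^2) / sum(e^2)
  stat <- n * r2

  c(statistic = stat, df = order,
    p.value = pchisq(stat, df = order, lower.tail = FALSE))
}

# EuStockMarkets is a time-series matrix, so convert it to a data frame first
stock_model <- lm(FTSE ~ DAX, data = as.data.frame(EuStockMarkets))
cars_model  <- lm(mpg ~ drat, data = mtcars)

print(bg_by_hand(stock_model, order = 3))
print(bg_by_hand(cars_model, order = 3))
```

Under the null hypothesis, the statistic approximately follows a chi-squared distribution with as many degrees of freedom as the order tested, which is where the p-value comes from.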