In this article, we demonstrate 3 ways to calculate the Mean Absolute Error (MAE) in R.
The Mean Absolute Error is one of the most widely used metrics to assess the performance of a regression model. It measures the average absolute difference between the predicted values and the actual (i.e., observed) values. But how do you calculate the Mean Absolute Error in R?
The easiest way to calculate the Mean Absolute Error (MAE) in R is by using the mae() function. A function with this name is available in both the Metrics and the ie2misc packages. It requires one vector with the predicted values and one vector with the actual values, and it returns the MAE.
In this article, we show how to use the mae() function from both packages, as well as a third option to calculate the Mean Absolute Error.
If you want a relative metric rather than an absolute one, we recommend the Mean Absolute Percentage Error (MAPE), Weighted Absolute Percentage Error (WAPE), or Weighted Mean Absolute Percentage Error (WMAPE) instead.
The Mean Absolute Error (MAE)
As mentioned before, the Mean Absolute Error calculates the absolute difference between the predicted and actual values, sums these differences, and divides the sum by the number of observations.
In formula form, the MAE looks like this:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
In this formula:
- n: represents the number of observations
- yi: represents the actual value
- ŷi: represents the predicted value
In assessing the performance of your (regression) model, the lower the MAE, the better your model. A Mean Absolute Error of 0 means a perfect model. In other words, for all observations, the actual and predicted values are the same.
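To make the formula concrete, here is a minimal sketch in base R that works through it step by step. The four values in y_small and y_hat_small are made up purely for illustration and are not part of the data used later in this article.

```r
# Hypothetical toy data, chosen only to illustrate the formula
y_small     <- c(3, 5, 2, 7)    # actual values
y_hat_small <- c(2.5, 5, 4, 8)  # predicted values

# Step 1: absolute difference for each observation
abs_errors <- abs(y_small - y_hat_small)
abs_errors
# [1] 0.5 0.0 2.0 1.0

# Step 2: sum the differences and divide by the number of observations
mae_small <- sum(abs_errors) / length(y_small)
mae_small
# [1] 0.875

# A perfect model (predictions equal to the actual values) has an MAE of 0
mae_perfect <- sum(abs(y_small - y_small)) / length(y_small)
mae_perfect
# [1] 0
```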
An advantage of the MAE over other metrics is that its outcome is in the same units as the variable of interest. That means, if your Mean Absolute Error is 2, then the actual and predicted values differ by 2 units on average.
A possible disadvantage of the MAE is that every unit of error carries the same weight, so large errors receive no extra penalty. In other words, a difference of 5 units between the actual and predicted value counts exactly five times as much as a difference of 1 unit, and no more. If higher errors should be penalized more heavily than smaller errors, you could use the Root Mean Squared Error (RMSE) instead.
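The difference between the two metrics can be sketched with two made-up error vectors that have the same total absolute error. The vectors errors_even and errors_outlier are hypothetical and exist only for this comparison; the RMSE is computed here as the square root of the mean squared error.

```r
# Hypothetical errors (actual minus predicted) with the same total absolute error
errors_even    <- c(1, 1, 1, 1)  # four small errors
errors_outlier <- c(0, 0, 0, 4)  # one large error

# MAE: mean of the absolute errors, identical for both vectors
mae_even    <- mean(abs(errors_even))
mae_outlier <- mean(abs(errors_outlier))
c(mae_even, mae_outlier)
# [1] 1 1

# RMSE: square root of the mean squared error, larger when one error dominates
rmse_even    <- sqrt(mean(errors_even^2))
rmse_outlier <- sqrt(mean(errors_outlier^2))
c(rmse_even, rmse_outlier)
# [1] 1 2
```

The MAE cannot tell the two situations apart, whereas the RMSE doubles when the same total error is concentrated in a single observation.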
3 Ways to Calculate the Mean Absolute Error
Before we demonstrate 3 different ways to calculate the Mean Absolute Error, we create two numeric vectors. We call these vectors y and y_hat, and they represent the actual values and the predicted values from a regression model.
We create these vectors of random integers with the sample.int() function.
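# actual values: 100 random integers between 1 and 100
set.seed(123)
y <- sample.int(100, 100, replace = TRUE)
y
# predicted values: a second, independent draw
set.seed(321)
y_hat <- sample.int(100, 100, replace = TRUE)
y_hat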
1. Use the mae() Function from the Metrics Package
The quickest way to calculate the Mean Absolute Error in R is by using the mae() function from the Metrics package. You only need to provide a vector with the actual values and a vector with the predicted values, and the mae() function returns the Mean Absolute Error.
The Metrics package is an implementation of many evaluation metrics that are frequently used in machine learning. It has no dependencies on other packages and, besides the Mean Absolute Error, also provides functions to calculate the Mean Absolute Percentage Error (MAPE) and Mean Absolute Scaled Error (MASE).
The R code below shows how to load the Metrics package and use the mae() function.
library(Metrics)
mae(actual = y, predicted = y_hat)
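Instead of using the mae() function from the Metrics package, you can also use the MAE() function from the MLmetrics package. Note that R is case-sensitive and that the MLmetrics version names its arguments y_pred and y_true. Both functions return the same result.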
2. Use the mae() Function from the ie2misc Package
A second option to find the MAE in R is by using the mae() function from the ie2misc package.
Although the ie2misc package provides fewer evaluation metrics than the Metrics package, it is still a useful package. Like its counterpart in the Metrics package, the mae() function from the ie2misc package requires just the vectors with the predicted values and the actual values to calculate the Mean Absolute Error.
library(ie2misc)
# if Metrics is attached as well, the package loaded last masks the other's mae()
mae(predicted = y_hat, observed = y)
Though you might not need it frequently, the mae() function from the ie2misc package has an advantage over the same function from the Metrics package. Namely, it can ignore missing values.
The example below shows the default behavior of both functions when the vectors with predicted or actual values contain missing values. Both functions return NA.
# copy the vectors and insert one missing value in each (positions are arbitrary)
y_new <- y
y_new[1] <- NA
y_hat_new <- y_hat
y_hat_new[2] <- NA
Metrics::mae(actual = y_new, predicted = y_hat_new)
ie2misc::mae(predicted = y_hat_new, observed = y_new)
However, in contrast to the Metrics package, the mae() function from the ie2misc package has the useful optional parameter na.rm. By default, this parameter is set to FALSE, but if you use na.rm = TRUE, then missing values are ignored.
ie2misc::mae(predicted = y_hat_new, observed = y_new, na.rm = TRUE)
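If you prefer not to depend on a package, the same idea can be sketched in base R with the na.rm argument of mean(). This is an illustration of the concept and not the ie2misc implementation; the vectors y_na and y_hat_na are hypothetical toy data.

```r
# Hypothetical toy data with one missing value in each vector
y_na     <- c(10, NA, 30, 40)  # actual values
y_hat_na <- c(12, 18, NA, 44)  # predicted values

# Any pair with a missing value produces an NA difference
abs_diff <- abs(y_na - y_hat_na)
abs_diff
# [1]  2 NA NA  4

# Default behavior: a single NA makes the result NA
mae_default <- mean(abs_diff)
mae_default
# [1] NA

# With na.rm = TRUE, only the complete pairs are averaged
mae_complete <- mean(abs_diff, na.rm = TRUE)
mae_complete
# [1] 3
```

Keep in mind that dropping missing values changes the number of observations the MAE is based on (here, 2 instead of 4).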
3. Use Functions from Base R
Lastly, you could also calculate the Mean Absolute Error with some basic R functions.
By using the sum(), abs(), and length() functions, you can calculate the Mean Absolute Error without applying functions from packages such as Metrics or ie2misc. Although this requires slightly more R code, you might prefer this method because it makes every step of the calculation visible.
This is the R code you could use:
sum(abs(y - y_hat))/length(y)
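If you need this calculation more than once, you could wrap it in a small helper function. The sketch below is one possible way to do so; the name mae_base() is hypothetical and not part of any package. It uses mean(), which is equivalent to dividing the sum by the number of observations, and adds a length check so that R's vector recycling cannot silently distort the result.

```r
# Recreate the example vectors so this block runs on its own
set.seed(123)
y <- sample.int(100, 100, replace = TRUE)
set.seed(321)
y_hat <- sample.int(100, 100, replace = TRUE)

# Hypothetical helper: Mean Absolute Error using base R only
mae_base <- function(actual, predicted) {
  # guard against vectors of unequal length being recycled
  stopifnot(length(actual) == length(predicted))
  mean(abs(actual - predicted))
}

# Same result as the sum()/length() version
result_helper <- mae_base(actual = y, predicted = y_hat)
result_manual <- sum(abs(y - y_hat)) / length(y)
all.equal(result_helper, result_manual)
# [1] TRUE
```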