This article demonstrates **3 methods to compare two R data frames and check if they are equal**.

Making sure that data frames are identical might be useful when you compare the outcome of two algorithms or compare two identically named tables from different environments.

As with many operations in R, you can check if two data frames are the same in different ways. However, **the all_equal() function from the dplyr package provides the most convenient method to check if data frames are identical**.

Besides the *all_equal()* function, you can also use the *identical()* function or the *all.equal()* function (both part of base R) to compare two data frames.

**The table below summarizes the 3 functions considering:**

- The library the function belongs to.
- The information the function returns. Does the function return only *TRUE/FALSE* or also additional information?
- How columns are compared. Does the function compare columns based on their position or their name?

| Method | Library | Information returned | Compares columns by |
|---|---|---|---|
| identical() | R base | TRUE / FALSE | Position |
| all.equal() | R base | TRUE / additional information (if not equal) | Position |
| all_equal() | dplyr | TRUE / additional information (if not equal) | Name |

Next, we will demonstrate how to check if data frames are equal using the following 4 tables:

- **my_df_1**: The table we will compare with all other tables.
- **my_df_2**: The exact same table as *my_df_1*.
- **my_df_3**: A completely different table.
- **my_df_4**: The same table as *my_df_1*, but with the *column names in a different order*.

```
# my_df_1: Base data frame
my_df_1 <- data.frame(x1 = letters[1:5],
                      x2 = c(1:5),
                      x3 = c("@", "#", "$", "%", "&"))

# my_df_2: Same data frame
my_df_2 <- my_df_1

# my_df_3: Different data frame
my_df_3 <- data.frame(x1 = letters[22:26],
                      x2 = seq(20, 100, 20),
                      x3 = c(1, 0, 1, 0, 1))

# my_df_4: Same data frame, but different column order
my_df_4 <- data.frame(x3 = c("@", "#", "$", "%", "&"),
                      x2 = c(1:5),
                      x1 = letters[1:5])
```

**METHOD 1: Check if Data Frames are Identical with the *identical()* Function**

The first method to check if two data frames in R are equal uses **the identical() function**.

The *identical()* function is part of base R and takes the two data frames as arguments. The function **returns either TRUE or FALSE**.

The *identical()* function **compares the data frames based on the position of the columns**. In other words, it compares the first column of table A with the first column of table B, the second column of table A with the second column of table B, etc.

See the R code below for an example.

```
# METHOD 1: identical() function
identical(my_df_1, my_df_2)  # TRUE: exact copy
identical(my_df_1, my_df_3)  # FALSE: different values
identical(my_df_1, my_df_4)  # FALSE: same columns, different order
```
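To make the strictness of *identical()* more concrete, here is a minimal sketch using small, hypothetical data frames (*df_int*, *df_dbl*, and *df_swapped* are illustrative names, not part of the tables above). It suggests that both the column order and the storage type of a column affect the result.

```r
# Hypothetical data frames, for illustration only
df_int     <- data.frame(id = 1:3, score = c(10, 20, 30))        # id is integer
df_dbl     <- data.frame(id = c(1, 2, 3), score = c(10, 20, 30)) # id is double
df_swapped <- data.frame(score = c(10, 20, 30), id = 1:3)        # columns swapped

same_copy  <- identical(df_int, df_int)      # TRUE
same_types <- identical(df_int, df_dbl)      # FALSE: integer vs. double
same_order <- identical(df_int, df_swapped)  # FALSE: column order differs
```

In other words, *identical()* answers the question "are these two objects exactly the same?", which is stricter than "do they contain the same values?".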

**METHOD 2: Check if Data Frames are Equal with the *all.equal()* Function**

The second method to check if two data frames are the same uses **the all.equal() function**.

**This function from base R compares all values in two data frames and returns TRUE if they are equal (numeric values only need to match within a small tolerance). Otherwise, instead of FALSE, it returns a character vector that describes the differences.**

Like the *identical()* function, the *all.equal()* function also **compares the data frames based on the position of the columns**. Therefore, it treats *my_df_1* and *my_df_4* as different.

See the example below.

```
# METHOD 2: all.equal() function
all.equal(my_df_1, my_df_2)  # TRUE
all.equal(my_df_1, my_df_3)  # character vector describing the differences
all.equal(my_df_1, my_df_4)  # character vector describing the differences
```
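Because *all.equal()* returns a character vector rather than *FALSE* when it finds differences, its result should not be used directly in an *if* statement. A common idiom, sketched below with hypothetical data frames *df_a*, *df_b*, and *df_c*, is to wrap the call in *isTRUE()*. The sketch also illustrates that *all.equal()* tolerates tiny floating-point differences, whereas *identical()* does not.

```r
# Hypothetical data frames, for illustration only
df_a <- data.frame(value = c(0.1 + 0.2, 1))
df_b <- data.frame(value = c(0.3, 1))
df_c <- data.frame(value = c(0.5, 1))

exact_match <- identical(df_a, df_b)          # FALSE: 0.1 + 0.2 is not exactly 0.3
near_match  <- isTRUE(all.equal(df_a, df_b))  # TRUE: within the default tolerance

diff_report <- all.equal(df_a, df_c)          # a character vector, not FALSE
different   <- !isTRUE(diff_report)           # TRUE

# Safe to use inside a condition
if (isTRUE(all.equal(df_a, df_b))) {
  message("The data frames are (nearly) equal.")
}
```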

**METHOD 3: Check if Data Frames are Equal with the *all_equal()* Function (from *dplyr*)**

Lastly, the **best method to compare if data frames are identical in R is the all_equal() function** from the *dplyr* package.

In contrast to the other methods, the *all_equal()* function compares data frames based on the column names. In other words, as long as the columns of two data frames have the same names, **the column order does not matter**.

Moreover, **the all_equal() function also provides additional information if the data frames are not identical**. For example, it tells you that the data types are different or reports the mean difference (in the case of numeric data).

Note that **the all_equal() function is case-sensitive, also for the column names**. In other words, it considers *x1* and *X1* as two different column names, and therefore it will return that the two data frames are different.

We’ve written a separate article where we discuss how to change the case of the column names. This might be a necessary step before you can compare two data frames.
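As a minimal sketch of that preparation step, the column names of both data frames can be converted to lower case with base R's *tolower()* before comparing. The data frames *df_lower* and *df_upper* are hypothetical, and the comparison uses base R's *all.equal()* so that the snippet runs without extra packages.

```r
# Hypothetical data frames whose column names differ only in case
df_lower <- data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
df_upper <- data.frame(X1 = 1:3, X2 = c("a", "b", "c"))

before <- isTRUE(all.equal(df_lower, df_upper))  # FALSE: names differ in case

# Normalize the case of the column names in both data frames
names(df_lower) <- tolower(names(df_lower))
names(df_upper) <- tolower(names(df_upper))

after <- isTRUE(all.equal(df_lower, df_upper))   # TRUE
```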

For example:

```
# METHOD 3: all_equal() function from dplyr
library("dplyr")
all_equal(my_df_1, my_df_2)  # TRUE
all_equal(my_df_1, my_df_3)  # message describing the differences
all_equal(my_df_1, my_df_4)  # TRUE: column order is ignored
```
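Note that *all_equal()* is deprecated in recent dplyr releases (since version 1.1.0), so the calls above may emit a warning. If that is a concern, a name-based comparison can be approximated in base R. The sketch below is one possible approach, not dplyr's implementation: *equal_ignoring_order()* is a hypothetical helper that sorts the columns by name and the rows by value before calling *all.equal()*.

```r
# Hypothetical helper: TRUE if two data frames hold the same data,
# regardless of column order and row order.
equal_ignoring_order <- function(a, b) {
  # Column names must match as sets (case-sensitive)
  if (!setequal(names(a), names(b))) return(FALSE)

  normalize <- function(df) {
    df <- df[sort(names(df))]                      # order columns by name
    row_order <- do.call(order, unname(as.list(df)))
    df <- df[row_order, , drop = FALSE]            # order rows by their values
    rownames(df) <- NULL                           # reset row names
    df
  }

  isTRUE(all.equal(normalize(a), normalize(b)))
}

# Same example tables as in this article, redefined so the snippet is self-contained
my_df_1 <- data.frame(x1 = letters[1:5], x2 = 1:5,
                      x3 = c("@", "#", "$", "%", "&"))
my_df_3 <- data.frame(x1 = letters[22:26], x2 = seq(20, 100, 20),
                      x3 = c(1, 0, 1, 0, 1))
my_df_4 <- my_df_1[c("x3", "x2", "x1")]           # same data, columns reversed

res_cols <- equal_ignoring_order(my_df_1, my_df_4)         # TRUE
res_rows <- equal_ignoring_order(my_df_1, my_df_1[5:1, ])  # TRUE: row order ignored
res_diff <- equal_ignoring_order(my_df_1, my_df_3)         # FALSE
```

The helper returns only *TRUE* or *FALSE*. To see why two data frames differ, call *all.equal()* on the normalized data frames directly.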