3 Ways to Check if Data Frames are Equal in R [Examples]

This article demonstrates 3 methods to compare two R data frames and check if they are equal.

Making sure that data frames are identical might be useful when you compare the outcome of two algorithms or compare two identically named tables from different environments.

As with many operations in R, you can check if two data frames are the same in different ways. However, the all_equal() function from the dplyr package provides the most convenient method to check if data frames are identical.

Besides the all_equal() function, you can also use the identical() function or the all.equal() function (both part of base R) to compare two data frames.

The table below summarizes the 3 functions considering:

  • The library the function belongs to.
  • The information the function returns. Does the function return only TRUE/FALSE or also additional information?
  • How columns are compared. Does the function compare columns based on their position or their name?

| Method | Library | Information returned | Compares columns by |
|---|---|---|---|
| identical() | Base R | TRUE / FALSE | Position |
| all.equal() | Base R | TRUE / additional information (if false) | Position |
| all_equal() | dplyr | TRUE / additional information (if false) | Name |

Next, we will demonstrate how to check if data frames are equal using the following 4 data frames:

  1. my_df_1: The table we will compare with all other tables.
  2. my_df_2: The exact same table as my_df_1.
  3. my_df_3: A completely different table.
  4. my_df_4: The same table as my_df_1, but with the columns in a different order.
# my_df_1: Base data frame
my_df_1 <- data.frame(x1 = letters[1:5],
                      x2 = c(1:5),
                      x3 = c("@", "#", "$", "%", "&"))


# my_df_2: Same data frame
my_df_2 <- my_df_1


# my_df_3: Different data frame
my_df_3 <- data.frame(x1 = letters[22:26],
                      x2 = seq(20,100,20),
                      x3 = c(1,0,1,0,1))


# my_df_4: Same data frame, but different column order
my_df_4 <- data.frame(x3 = c("@", "#", "$", "%", "&"),
                      x2 = c(1:5),
                      x1 = letters[1:5])

[Image: Compare data frames in R]

METHOD 1: Check if Data Frames are Identical with the identical() Function

The first method to check if two data frames in R are equal uses the identical() function.

The identical() function is part of base R and takes the two data frames as arguments. The function returns either TRUE or FALSE.

The identical() function compares the data frames based on the position of the columns. In other words, it compares the first column of table A with the first column of table B, the second column of table A with the second column of table B, etc.

See the R code below for an example.

# METHOD 1: identical() function
identical(my_df_1, my_df_2)
identical(my_df_1, my_df_3)
identical(my_df_1, my_df_4)

[Image: Check if two data frames in R are identical]
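
To make the strictness of identical() more concrete, here is a minimal sketch using small illustrative data frames (df_a, df_b, df_int, and df_dbl are hypothetical names, not part of the examples above). It shows that reordering the columns by name makes two data frames identical again, and that an integer column is not identical to a double column holding the same values.

```r
# Two data frames with the same columns in a different order
df_a <- data.frame(x1 = c("a", "b"), x2 = 1:2)
df_b <- data.frame(x2 = 1:2, x1 = c("a", "b"))

identical(df_a, df_b)                # FALSE: the column order differs
identical(df_a, df_b[names(df_a)])   # TRUE: columns reordered to match df_a

# Same values, different storage type
df_int <- data.frame(x = 1:3)         # integer column
df_dbl <- data.frame(x = c(1, 2, 3))  # double column
identical(df_int, df_dbl)             # FALSE: the column types differ
```

If only the column order is expected to differ, reordering with df_b[names(df_a)] before the call is one simple way to keep using identical().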

METHOD 2: Check if Data Frames are Equal with the all.equal() Function

The second method to check if two data frames are the same uses the all.equal() function.

This base R function compares two data frames and returns TRUE if they are equal, where numeric values are compared within a small tolerance. If the data frames differ, it does not return FALSE but a description of the differences.

Like the identical() function, the all.equal() function compares the data frames based on the position of the columns. Therefore, all.equal() considers my_df_1 and my_df_4 to be different.

See the example below.

# METHOD 2: all.equal() function
all.equal(my_df_1, my_df_2)
all.equal(my_df_1, my_df_3)
all.equal(my_df_1, my_df_4)

[Image: Check if two data frames in R are the same.]
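
When the inputs differ, the result of all.equal() is a character vector, so using it directly in an if() condition raises an error. A common idiom is to wrap the call in isTRUE(). The sketch below uses small illustrative data frames (df_a, df_b, and df_c are hypothetical names) and also contrasts the numeric tolerance of all.equal() with the exact comparison of identical().

```r
df_a <- data.frame(v = 0.1 + 0.2)
df_b <- data.frame(v = 0.3)

identical(df_a, df_b)          # FALSE: 0.1 + 0.2 is not exactly 0.3
isTRUE(all.equal(df_a, df_b))  # TRUE: within the default numeric tolerance

df_c <- data.frame(v = 5)
res <- all.equal(df_a, df_c)   # a character vector describing the difference
is.character(res)              # TRUE
isTRUE(res)                    # FALSE, so this form is safe inside if()
```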

METHOD 3: Check if Data Frames are Equal with the all_equal() Function (from dplyr)

Lastly, the best method to check whether data frames are identical in R is the all_equal() function from the dplyr package.

In contrast to the other methods, the all_equal() function compares data frames based on the column names. In other words, as long as the columns of two data frames have the same names, the column order does not matter.

Moreover, the all_equal() function also provides additional information if the data frames are not identical. For example, it tells you that the data types are different or the mean difference (in the case of numeric data).

Note that the all_equal() function is case-sensitive, including for column names. In other words, it treats x1 and X1 as two different column names and therefore reports that the two data frames are different.

We’ve written a separate article where we discuss how to change the case of the column names. This might be a necessary step before you can compare two data frames.
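
As a minimal sketch of that preparation step, the base R function tolower() can normalize the column names before the comparison. The data frames df_lower and df_upper are hypothetical, and the check uses all.equal() from base R so that the snippet runs without additional packages.

```r
df_lower <- data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
df_upper <- data.frame(X1 = 1:3, X2 = c("a", "b", "c"))

isTRUE(all.equal(df_lower, df_upper))   # FALSE: the column names differ in case

# Convert the column names of both data frames to lower case
names(df_lower) <- tolower(names(df_lower))
names(df_upper) <- tolower(names(df_upper))

isTRUE(all.equal(df_lower, df_upper))   # TRUE
```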

For example:

# METHOD 3: all_equal() function from dplyr
library("dplyr")

all_equal(my_df_1, my_df_2)
all_equal(my_df_1, my_df_3)
all_equal(my_df_1, my_df_4)

[Image: Check if data frames are identical in R using the all_equal() function from the dplyr package.]
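
Note that newer dplyr releases (1.1.0 and later, as far as we are aware) flag all_equal() as deprecated, so the calls above may emit a warning. As a hedged alternative, the sketch below approximates the name-based comparison in base R by reordering the columns before calling all.equal(). The helper equal_by_name() and the data frames df_a, df_b, and df_c are hypothetical names introduced for illustration. Unlike all_equal(), this helper does not ignore the row order.

```r
equal_by_name <- function(x, y) {
  # Data frames with different sets of column names can never match
  if (!setequal(names(x), names(y))) {
    return(FALSE)
  }
  # Reorder the columns of y to follow x, then compare
  isTRUE(all.equal(x, y[names(x)]))
}

df_a <- data.frame(x1 = c("a", "b"), x2 = 1:2)
df_b <- data.frame(x2 = 1:2, x1 = c("a", "b"))
df_c <- data.frame(x1 = c("a", "b"), x3 = 1:2)

equal_by_name(df_a, df_b)  # TRUE: same columns, different order
equal_by_name(df_a, df_c)  # FALSE: the column names differ
```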
