How to Duplicate Columns in R [Examples]

In this article, we discuss how to duplicate columns in an R data frame.

Normally, you want to avoid or remove duplicated columns (i.e., columns with the same information). However, sometimes you need them. So, how do you duplicate columns?

In R, the easiest way to create duplicated columns is with the cbind() function. This function combines the original columns of a data frame with the new, duplicated columns. Alternatively, you can use the dplyr package to generate duplicated columns.

In this article, we show different ways to copy columns from a data frame. We discuss how to duplicate (i.e., create one copy) and replicate (i.e., create multiple copies) one or more columns.

Before we start, we create a data frame that we will use in our examples. This data frame has 3 columns (x, y, and z) and four rows.

my_df <- data.frame(x=c(1,2,3,4),
Example data
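The values of y and z are not shown in the definition above, so the sketch below uses made-up values for those two columns. Only the x values are taken from the article; everything else is an assumption for illustration.

```r
# Hypothetical example data: the x values come from the article,
# the y and z values are made up for illustration.
my_df <- data.frame(x = c(1, 2, 3, 4),
                    y = c(10, 20, 30, 40),
                    z = c(100, 200, 300, 400))

dim(my_df)    # number of rows and number of columns
names(my_df)  # column names
```

Any other values work equally well, as long as each column has four entries.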

3 Ways to Duplicate a Column in R

Next, we demonstrate 3 ways to duplicate one column from a data frame. In other words, we will create a copy of a column and add it to the original data.

1. Duplicate a Column with Base R Code

The easiest way to duplicate one column is with base R code. These are the steps:

  1. Define a new column in a data frame with the $-sign.
  2. Use the assignment operator (i.e., <-) to assign the new column a value.
  3. Specify the new value.

For example, with the R code below, we create a new variable x_dup and assign it the value of the original column x. Hence, we duplicate column x.

my_df$x_dup <- my_df$x
Duplicate one column in R
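To check that the new column really is a copy, you can compare both columns and then change one of them. The sketch below uses a data frame with made-up y and z values (an assumption, only x comes from the article).

```r
# Hypothetical example data (y and z values are made up)
my_df <- data.frame(x = c(1, 2, 3, 4),
                    y = c(10, 20, 30, 40),
                    z = c(100, 200, 300, 400))

# Duplicate column x
my_df$x_dup <- my_df$x

# Both columns hold the same values
identical(my_df$x, my_df$x_dup)  # TRUE

# Changing the copy leaves the original column untouched
my_df$x_dup[1] <- 99
my_df$x[1]                       # 1
```

Because the copy is independent of the original, you can safely transform x_dup without losing the values in x.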

The advantage of this method is that it is fast and easy to understand. However, if you want to duplicate several columns at once or replicate a column many times, this method is impractical because you need one line of code per copy.

2. Duplicate a Column with the cbind() Function

The second method to duplicate columns in R is by using the cbind() function.

The cbind() function, short for column bind, combines multiple columns into one data frame. Therefore, it is a convenient function to create duplicated columns.

Because the function combines two or more sets of columns, it takes at least two arguments: the original columns and the columns you want to add (i.e., duplicate).

For example, the next R code uses the cbind() function to combine the data frame my_df with the column x from the same data frame.

cbind(my_df, x_dup = my_df$x)
Duplicate columns in R with cbind

In the code above, we gave the new, duplicated column a new name, namely x_dup. If we had omitted the name of the new column, R would have used my_df$x as the new column name instead.
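The naming behavior can be checked with a small sketch. The data frame below uses made-up y and z values (an assumption), and the object names named and unnamed are hypothetical.

```r
# Hypothetical example data (y and z values are made up)
my_df <- data.frame(x = c(1, 2, 3, 4),
                    y = c(10, 20, 30, 40),
                    z = c(100, 200, 300, 400))

# With an explicit name for the duplicated column
named <- cbind(my_df, x_dup = my_df$x)
names(named)

# Without a name, R falls back on the expression my_df$x
unnamed <- cbind(my_df, my_df$x)
names(unnamed)

# cbind() returns a new data frame; my_df itself is unchanged
ncol(my_df)
```

Note that cbind() does not modify my_df in place, so you need to store the result in an object if you want to keep the duplicated column.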

3. Duplicate a Column with the dplyr Package

The third way to create a duplicated column uses the mutate() function from the dplyr package.

To copy a column with the dplyr package, you call the mutate() function and pass it the name of the new column followed by the name of the original column (i.e., the column you want to duplicate). This method is convenient if you want to copy a column and directly use it in subsequent operations.

For example:


library(dplyr)  # provides mutate() and the %>% pipe

my_df %>% 
  mutate(x_dup = x)
Duplicate columns in R with dplyr

How to Duplicate Multiple Columns in R

Instead of duplicating one column, you might want to duplicate multiple columns.

To copy several columns with a single line of code, you use the cbind() function. First, you specify the name of the original data frame. Then, you use the bracket notation to select and add multiple columns (from the same data frame). The cbind() function combines all the selected columns into a single data frame.

For example, the R code below duplicates the columns x and z.

cbind(my_df, my_df[,c(1,3)])
Duplicate multiple columns in R
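Copying columns this way keeps their original names, so the result contains repeated column names. One possible way around this is to rename the selected columns before binding them, for instance with setNames(). The sketch below assumes made-up y and z values and a hypothetical "_dup" suffix.

```r
# Hypothetical example data (y and z values are made up)
my_df <- data.frame(x = c(1, 2, 3, 4),
                    y = c(10, 20, 30, 40),
                    z = c(100, 200, 300, 400))

# Columns to duplicate, selected by name instead of position
cols <- c("x", "z")

# Give the copies new names before binding them to the original data
copies <- setNames(my_df[, cols], paste0(cols, "_dup"))
result <- cbind(my_df, copies)

names(result)
```

Selecting by name rather than by position is a design choice: the code keeps working if the column order of my_df changes.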

Alternatively, you can use the mutate() function from the dplyr package to duplicate multiple columns at once. However, this code is more complex, so we won’t explain it in detail.

my_df %>%
  mutate(across(all_of(c(1,3)), ~ ., .names = "{col}2"))
Duplicate multiple columns in R with dplyr
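For readers who want a rough idea of how this works: across() applies a function to a selection of columns, the formula ~ .x returns each column unchanged, and .names controls how the new columns are called. The sketch below is a variant of the code above that assumes dplyr 1.0.0 or later, made-up y and z values, and a hypothetical "_dup" suffix.

```r
library(dplyr)

# Hypothetical example data (y and z values are made up)
my_df <- data.frame(x = c(1, 2, 3, 4),
                    y = c(10, 20, 30, 40),
                    z = c(100, 200, 300, 400))

dup_df <- my_df %>%
  mutate(across(all_of(c("x", "z")),      # columns to copy
                ~ .x,                     # identity: keep the values as they are
                .names = "{.col}_dup"))   # new name = original name + "_dup"

names(dup_df)
```

The new columns are added to the right of the existing ones, in the order in which they were selected.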

How to Replicate a Column n Times in R

Instead of copying a column once (i.e., duplication), you might want to copy a column multiple times (i.e., replication). Unfortunately, the methods explained above either don’t support this or require you to repeat the same operation for every copy. So, how do you replicate a column effectively?

The best way to replicate columns in R is by using the cbind() function and the rep() function. First, you use the rep() function to select a column and create one or more copies. Then, you use the cbind() function to combine the original dataset and the replicated columns into a single data frame.

In this method of replicating columns, the rep() function plays a vital role. To make this function work, you need to provide two arguments, namely:

  1. The column you want to replicate.
  2. The number of copies you want.

For example, with the following R code, we replicate the second column of the original data frame three times.

n <- 3
cbind(my_df, rep(my_df[2],n))
Replicate one column multiple times

Unlike when you use the cbind() function to create a single duplicate, it is not possible to directly give new names to the replicated columns.
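A possible workaround is to store the output of rep() first, rename its elements, and only then bind it to the original data. The sketch below assumes made-up y and z values and a hypothetical naming scheme (y_rep1, y_rep2, ...).

```r
# Hypothetical example data (y and z values are made up)
my_df <- data.frame(x = c(1, 2, 3, 4),
                    y = c(10, 20, 30, 40),
                    z = c(100, 200, 300, 400))

n <- 3

# rep() on a one-column data frame returns a list with n copies of that column
copies <- rep(my_df["y"], n)
length(copies)

# Rename the copies, then bind them to the original data
names(copies) <- paste0("y_rep", seq_len(n))
result <- cbind(my_df, as.data.frame(copies))

names(result)
```

The call to as.data.frame() is optional in this sketch, but it makes explicit that the renamed list is turned into columns before binding.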

How to Replicate Multiple Columns n Times in R

Besides replicating one column, you can also use the cbind() function and the rep() function to replicate multiple columns at once. In fact, this is very easy.

Instead of selecting just one column as the first argument of the rep() function, you can select multiple columns, for instance with the bracket notation.

Below we provide an example of how to replicate the first and third columns three times each.

n <- 3
cbind(my_df, rep(my_df[,c(1,3)],n))
Duplicate multiple columns multiple times
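The same renaming idea can be applied when replicating several columns. The sketch below assumes made-up y and z values and a hypothetical naming scheme that appends the copy number to the original column name.

```r
# Hypothetical example data (y and z values are made up)
my_df <- data.frame(x = c(1, 2, 3, 4),
                    y = c(10, 20, 30, 40),
                    z = c(100, 200, 300, 400))

n <- 3
cols <- c("x", "z")

# rep() repeats the selected columns as a block: x, z, x, z, x, z
copies <- rep(my_df[cols], n)

# Build matching names: x_rep1, z_rep1, x_rep2, z_rep2, x_rep3, z_rep3
names(copies) <- paste0(rep(cols, times = n),
                        "_rep",
                        rep(seq_len(n), each = length(cols)))

result <- cbind(my_df, as.data.frame(copies))
names(result)
```

The names must follow the order in which rep() returns the copies, which is why the column names are repeated with times and the copy numbers with each.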
