In this article, we discuss 3 ways to repeat rows in R.
Normally, you want to avoid and remove duplicated rows from a data frame. But, sometimes you might need them. So, how do you create identical observations?
In R, the easiest way to repeat rows is with the REP() function (written rep() in code, because R is case-sensitive). This function replicates the elements of a vector; applied to row numbers, it lets you select one or more observations from a data frame and create one or more copies of them. Alternatively, you can use the SLICE() function from the dplyr package to repeat rows.
Next, we will show 3 different ways to repeat rows using the REP() and/or SLICE() function.
However, before we do so, we first create a data frame that we will use in the supporting examples. The data frame contains 2 columns and 4 rows, of which we will repeat some (or all).

    ## Example Data Frame
    x <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(1, 2, 3, 4))
    x
1. Repeat Rows with the REP() Function (Basic R Code)
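The first way to repeat rows in R uses the REP() function.
The REP() function is a generic function that replicates the values in x one or more times. Only x is mandatory; the number of repetitions is set with times or each:
- x: A vector. For example, the row numbers of a data frame.
- times: A non-negative integer. The whole vector x is repeated times times. This is the second positional argument, which the examples in this article use.
- each: A non-negative integer. Each element of x is repeated each times.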
For example, below we repeat the vector “A”, “B”, “C” three times.
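A minimal sketch of that call, with each shown alongside times for contrast. The variable names are illustrative only.

```r
letters_vec <- c("A", "B", "C")

# times: repeat the whole vector 3 times
whole <- rep(letters_vec, times = 3)
print(whole)
# [1] "A" "B" "C" "A" "B" "C" "A" "B" "C"

# each: repeat every element 3 times before moving on to the next one
per_element <- rep(letters_vec, each = 3)
print(per_element)
# [1] "A" "A" "A" "B" "B" "B" "C" "C" "C"
```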
If you use the REP() function in combination with the square brackets to index a data frame, you can easily create duplicated rows.
Repeat One Row
In the example below, we use the REP() function to replicate the first row 3 times. All other rows in the original data frame will be ignored and therefore not appear in the output data set.
    # Repeat One Row
    row <- 1
    times <- 3
    x[rep(row, times), ]
Repeat Multiple Rows
Instead of repeating just one row, you can also use the REP() function to replicate multiple rows.
In this case, the first argument of the REP() function must be a vector with the row numbers you want to duplicate. For example, to repeat rows 1 and 3 of a data frame, you use c(1,3).
    # Repeat Multiple Rows
    rows <- c(1, 3)
    times <- 3
    x[rep(rows, times), ]
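The order of the repeated rows depends on which argument of rep() receives the count. The sketch below, which rebuilds the example data frame so it runs on its own, compares times (the argument used above) with each. The names cycled and grouped are illustrative.

```r
x <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(1, 2, 3, 4))
rows <- c(1, 3)

# times = 3 cycles through the selected rows: 1, 3, 1, 3, 1, 3
cycled <- x[rep(rows, times = 3), ]

# each = 3 keeps the copies of a row together: 1, 1, 1, 3, 3, 3
grouped <- x[rep(rows, each = 3), ]

print(cycled$x1)   # 1 3 1 3 1 3
print(grouped$x1)  # 1 1 1 3 3 3
```

Both results contain the same six rows; only their order differs.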
Repeat a Complete Data Frame
It is also possible to repeat all the rows from a data frame with the REP() function.
Again, the first argument of the REP() function specifies the rows you want to duplicate. To select all rows, you can use c(1:nrow(x)), where nrow(x) returns the number of rows in data frame x.
The R code below shows how to duplicate a complete data frame.
    # Repeat All Rows
    rows <- c(1:nrow(x))
    times <- 2
    x[rep(rows, times), ]

Alternatively, you can use the rbind() function to “paste” one data frame below another data frame, for example rbind(x, x). The resulting rows are the same as in the previous example; only the row names differ.
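A short sketch of the rbind() approach next to the REP() approach, assuming the example data frame from the start of the article. The names stacked and indexed are illustrative.

```r
x <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(1, 2, 3, 4))

# Stack the data frame on top of itself
stacked <- rbind(x, x)

# Same rows via indexing with rep()
indexed <- x[rep(seq_len(nrow(x)), times = 2), ]

# The values match, but the row names do not
print(rownames(stacked))  # "1" "2" "3" "4" "5" "6" "7" "8"
print(rownames(indexed))  # "1" "2" "3" "4" "1.1" "2.1" "3.1" "4.1"
```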
2. Repeat Rows with the SLICE() and REP() Functions (dplyr Package)
The second method to repeat rows in R uses the dplyr package.
An advantage of this method is that you can create duplicated rows as part of a longer sequence of operations. In other words, you can use the replicated rows directly as input for other operations.
Before you can use this method, you need to (install and) load the dplyr package.
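A typical way to do this is sketched below. The install line is commented out on the assumption that the package only needs installing once.

```r
# Install once if needed (commented out so the script can be rerun safely)
# install.packages("dplyr")

# Load dplyr; this also makes the %>% pipe operator available
library(dplyr)
```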
Like the first method, we will use the REP() function to repeat rows. However, to make this function work with the pipe operator, we need the SLICE() function, too. The SLICE() function lets you index rows by their position in a data frame.
Repeat One Row
In the example below, we use the pipe operator, the SLICE() function, and the REP() function to replicate the first row 3 times.
    # Repeat One Row
    row <- 1
    times <- 3
    x %>% slice(rep(row, times))

As the output shows, the difference between using the REP() function in basic R code and as part of tidy code is the row numbers. If you use the SLICE() and REP() functions, the row numbers are consecutive. For example, 1, 2, 3 instead of 1, 1.1, and 1.2.
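If you want the same plain row numbers in base R, one option (an addition here, not part of the original method) is to reset the row names after indexing. The names repeated and names_before are illustrative.

```r
x <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(1, 2, 3, 4))

# Base R indexing keeps the original row name and makes the copies unique
repeated <- x[rep(1, 3), ]
names_before <- rownames(repeated)
print(names_before)  # "1" "1.1" "1.2"

# Setting the row names to NULL restores the default numbering
rownames(repeated) <- NULL
print(rownames(repeated))  # "1" "2" "3"
```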
Repeat Multiple Rows
You can also use the dplyr package to repeat multiple rows from a data frame.
To do so, you need a vector that specifies the rows you want to replicate based on their position. For example, this R code selects rows 1 and 3 and duplicates each one 3 times.
    # Repeat Multiple Rows
    rows <- c(1, 3)
    times <- 3
    x %>% slice(rep(rows, times))
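Because SLICE() returns a data frame, the repeated rows can feed straight into further dplyr steps. The sketch below adds a hypothetical copy_id column (not part of the original example) that numbers the output rows.

```r
library(dplyr)

x <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(1, 2, 3, 4))
rows <- c(1, 3)
times <- 3

result <- x %>%
  slice(rep(rows, times)) %>%      # rows 1, 3, 1, 3, 1, 3
  mutate(copy_id = row_number())   # hypothetical column: position in the output

print(result)
```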
Repeat All Rows
You can also repeat a complete data frame with the dplyr package.
Instead of using the SLICE() and REP() functions, you can directly use the BIND_ROWS() function. This function efficiently binds many data frames into one. For example:
    # Repeat All Rows (Once)
    x %>% bind_rows(x)

Alternatively, you can also use the SLICE() and REP() functions to repeat all rows from a data frame. An advantage of this method is that you can duplicate one data frame multiple times by changing a single number, whereas the BIND_ROWS() function needs the data frame passed once per copy.
For example, the R code below repeats all the observations in a data frame 3 times.
    # Repeat All Rows (Multiple)
    x %>% slice(rep(1:n(), 3))
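For comparison, BIND_ROWS() also accepts a list of data frames, so several copies can be stacked by building that list first. This is a sketch of one way to do it, with base R's replicate() producing the list; copies, via_bind, and via_slice are illustrative names.

```r
library(dplyr)

x <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(1, 2, 3, 4))
copies <- 3

# Build a list holding `copies` copies of x, then bind them into one data frame
via_bind <- bind_rows(replicate(copies, x, simplify = FALSE))

# Same result with slice() and rep()
via_slice <- x %>% slice(rep(1:n(), copies))

print(nrow(via_bind))   # 12
print(nrow(via_slice))  # 12
```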
3. Repeat Rows with the LAPPLY() Function
The third method to repeat rows in a data frame uses the LAPPLY() function.
Although this method requires more code, it has one useful feature. Instead of repeating each (selected) row the same number of times, you can use the LAPPLY() function to repeat each row a specific number of times.
To illustrate this, we first create a new data frame. This data frame has the same columns x1 and x2 as before, but we add a third column called nb_times. This column indicates how many times we want to repeat each row.
    x <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(1, 2, 3, 4), nb_times = c(2, 0, 1, 3))
    x

As the nb_times column shows, we only want to repeat rows 1, 3, and 4. We will replicate these rows 2, 1, and 3 times, respectively. Row 2 has nb_times = 0, so it is dropped from the output.
To do so, we call the LAPPLY() function with 3 arguments:
- The input data frame, e.g., x.
- The REP() function, i.e., rep.
- The column that indicates how many times each row should be repeated, i.e., x$nb_times.
See the example below.
    # Repeat Each Row a Specific Number of Times
    data.frame(lapply(x, rep, x$nb_times))
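To see why this works, it helps to look at what LAPPLY() does to a single column: it calls rep() on the column with x$nb_times as the times argument, so each value is repeated its own number of times. The sketch below shows that step, followed by a base R alternative that indexes the rows directly. The alternative is an addition for comparison, not part of the original method.

```r
x <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(1, 2, 3, 4), nb_times = c(2, 0, 1, 3))

# What lapply() computes for one column: element i is repeated nb_times[i] times
one_column <- rep(x$x1, x$nb_times)
print(one_column)  # 1 1 3 4 4 4

# The full call applies the same rep() to every column, including nb_times
via_lapply <- data.frame(lapply(x, rep, x$nb_times))

# Alternative: repeat the row numbers and index the data frame
via_index <- x[rep(seq_len(nrow(x)), times = x$nb_times), ]

print(nrow(via_lapply))  # 6
```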
In this article, we discussed 3 methods to repeat one or more rows from a data frame in R. The table below summarizes each method with its main advantage/disadvantage.
| Method | Function(s) | Advantage | Disadvantage |
|---|---|---|---|
| R base code (I) | REP() function | Quick and easy to understand. | Can’t be used as part of a large sequence of operations. |
| dplyr code | SLICE() and REP() functions | Can be used in combination with other dplyr functions. | Requires an additional function (i.e., SLICE()). |
| R base code (II) | LAPPLY() function | Can repeat each row a specific number of times. | Requires extra code. |