3 Ways to Remove the Last N Characters from a String in R [Examples]

This article discusses 3 ways to remove the last N characters from a string in R. This string can be a single variable, an element of a vector, or a value in a data frame column.

Even though you can use basic R code for this manipulation, the best option to remove the last N characters from a string is by using the str_sub() function from the stringr package. This function is intuitive, fast, and compatible with functions from the dplyr package.

Irrespective of the method you choose, removing the last N characters from a string is the same as keeping its first L - N characters, where L is the length of the string.

For example, if you have a string of 10 characters, then removing the last 3 characters is the same as reading characters 1 to 7.
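The arithmetic above can be checked with a minimal base R sketch. The string "ABCDEFGHIJ" and the variable name ten_chars are made up for illustration and are not part of the original examples.

```r
# A hypothetical 10-character string, chosen only for illustration
ten_chars <- "ABCDEFGHIJ"

# Removing the last 3 characters is the same as keeping characters 1 to 7
kept <- substr(ten_chars, 1, 7)
kept  # "ABCDEFG"
```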

The next 3 methods will all use this trick to remove the last N characters from a string in R.

METHOD 1: Remove the Last N Characters with R Base Code

Although it is not the most convenient method, you can eliminate the last N characters from a string with base R alone. In other words, you don’t need to install additional packages.

You need the following R base functions:

  1. substr() (or substring())
  2. nchar()

The substr() function reads a substring from a larger string and has the following syntax.

substr(x, start, stop)

where:

  • x is a single string, a vector of strings, or a column with character data.
  • start is the position of the first character to read.
  • stop is the position of the last character to read.

For example, to drop the last 3 characters from the text “My String” (9 characters long), you need to read all the characters up to the 6th character.

substr("My String", start = 1, stop = 6)

Notice that, if the length of your string changes, you need to modify the stop argument to still eliminate the last N characters.

For example, to remove the last 2 characters of a string of length 7, you need to read until the 5th character. However, if your string has 8 characters, you need to read until the 6th character, and so on.

Fortunately, you can use the nchar() function to retrieve the number of characters in a string and compute the stop argument as the string length minus N, instead of changing it manually.
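As a quick illustration of what nchar() returns, here is a small sketch. The variable names single_length and vector_lengths are assumptions introduced for this example only.

```r
# nchar() counts the characters in a single string
single_length <- nchar("My String")
single_length  # 9

# It is vectorised, so it returns one length per element of a vector
vector_lengths <- nchar(c("My String", "Two Words", "12345"))
vector_lengths  # 9 9 5
```

Because nchar() returns one value per element, subtracting N from it gives a separate stop position for every string.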

Hence, this is the code to remove the last N characters with only R base code.

my_string <- "My String"

remove_last_n <- 3
substr(my_string, 1, nchar(my_string) - remove_last_n)

This code also works for vectors of strings.

vector_of_strings <- c("My String", "Two Words", "12345")

remove_last_n <- 3
substr(vector_of_strings, 1, nchar(vector_of_strings) - remove_last_n)
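If you need this operation more than once, the same base R expression can be wrapped in a small helper. The sketch below is only one possible way to do it, and the function name remove_last_chars() is hypothetical. It also shows what happens when a string is shorter than N: substr() returns an empty string rather than an error.

```r
# Hypothetical helper: keep everything except the last n characters
remove_last_chars <- function(x, n) {
  substr(x, 1, nchar(x) - n)
}

trimmed <- remove_last_chars(c("My String", "Two Words", "12345"), 3)
trimmed  # "My Str" "Two Wo" "12"

# A string shorter than n yields an empty string
too_short <- remove_last_chars("ab", 3)
too_short  # ""
```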

METHOD 2: Remove the Last N Characters with the str_sub() Function

The easiest method to drop the last N characters from a string in R is by using the str_sub() function. This function is part of the stringr package, which offers a wide range of functions to manipulate strings.

This is the syntax of the str_sub() function.

str_sub(string, start, end)

where:

  • string is a character vector (i.e., a single string, multiple strings, or a data frame column with character data).
  • start is the position of the first character to read.
  • end is the position of the last character to read. A negative value counts from the end of the string, so -1 is the last character.

In contrast to the substr() function mentioned above, the str_sub() function allows the end argument to be negative. A negative value counts positions from right to left, so end = -1 is the last character, end = -2 is the second to last character, and so on.

Hence, you can set end = -(N + 1) to remove the last N characters without needing to know the number of characters in the string.
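To make the counting from the right concrete, here is a short sketch that assumes the stringr package is already installed. The variable names whole, minus_one, and minus_three are made up for this illustration.

```r
library(stringr)

x <- "My String"

# end = -1 points at the last character, so nothing is removed
whole <- str_sub(x, start = 1, end = -1)
whole        # "My String"

# end = -2 stops at the second to last character: 1 character removed
minus_one <- str_sub(x, start = 1, end = -2)
minus_one    # "My Strin"

# end = -4 stops at the fourth to last character: 3 characters removed
minus_three <- str_sub(x, start = 1, end = -4)
minus_three  # "My Str"
```

In general, removing the last N characters corresponds to end = -(N + 1).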

For example, to eliminate the last 3 characters we use end = -4, because end = -1 refers to the last character.

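install.packages("stringr")  # only needed once
library("stringr")

my_string <- "My String"

remove_last_n <- 3
# end = -1 is the last character, so end = -(N + 1) drops the last N characters
str_sub(my_string, start = 1, end = -(remove_last_n + 1))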

The next example shows even more clearly that str_sub() can remove the last N characters of strings irrespective of their lengths.

vector_of_strings <- c("My String", "These are 4 Words", "12345")

remove_last_n <- 3
str_sub(vector_of_strings, start = 1, end = -(remove_last_n + 1))
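As a sanity check, the sketch below compares the str_sub() approach with the base R approach from Method 1 on the same vector. It assumes stringr is installed and uses end = -(N + 1). The names with_stringr and with_base are introduced only for this comparison.

```r
library(stringr)

vector_of_strings <- c("My String", "These are 4 Words", "12345")
remove_last_n <- 3

# stringr: count from the right, no string lengths needed
with_stringr <- str_sub(vector_of_strings, start = 1, end = -(remove_last_n + 1))

# base R: compute the stop position from each string's length
with_base <- substr(vector_of_strings, 1, nchar(vector_of_strings) - remove_last_n)

with_stringr                    # "My Str" "These are 4 Wo" "12"
all(with_stringr == with_base)  # TRUE
```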

METHOD 3: Remove the Last N Characters with the dplyr Package

In R, a well-known package for data manipulation is the dplyr package. This package provides many functions for selecting, filtering, and transforming data sets. Moreover, this package is, like the stringr package, part of the tidyverse.

Therefore, you can use the str_sub() function inside a dplyr pipeline to remove characters from a string.

For example, here we use the mutate() function and the str_sub() function to create a new column. This column contains all but the last 3 characters from the original string.

install.packages("tidyverse")  # only needed once
library("tidyverse")

my_data <- data.frame(name = c("Mr. Peter", "Ms. Anna", "Ms. Maria"))

remove_last_n <- 3

# end = -(N + 1) keeps everything except the last N characters
my_data %>% 
  mutate(remove_last_3_chars = str_sub(name, start = 1, end = -(remove_last_n + 1)))
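For comparison, the same column can also be built without dplyr by applying the base R approach from Method 1 to the data frame column. This is only a sketch of an equivalent: stringsAsFactors = FALSE is added as a precaution, on the assumption that the code might run on an R version older than 4.0, where data.frame() converts strings to factors by default.

```r
# Same data as above; keep the names as character strings, not factors
my_data <- data.frame(
  name = c("Mr. Peter", "Ms. Anna", "Ms. Maria"),
  stringsAsFactors = FALSE
)

remove_last_n <- 3

# Compute the stop position per row from the length of each name
my_data$remove_last_3_chars <- substr(
  my_data$name, 1, nchar(my_data$name) - remove_last_n
)

my_data$remove_last_3_chars  # "Mr. Pe" "Ms. A" "Ms. Ma"
```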
