Replace specific values in column using regex in R
Last Updated: 30 Jun, 2021
In this article, we will discuss how to replace specific values in a dataframe column using regular expressions in the R Programming Language.
Method 1 : Using sub() method
The sub() function in R replaces the first match of a pattern in each element of a character vector with another string. It works on a dataframe column or on any vector, and it is vectorized, so a large column can be processed without an explicit loop. The pattern can be a single character, a word, a string of several words, or a regular expression.
Syntax:
sub(pattern, new_string, df$col_name)
Parameters:
- pattern - a regular expression or a character string to search for. In a regular expression, . matches any single character and * repeats the preceding element zero or more times, so .* matches any run of characters.
- new_string - the string to replace the match with
- df$col_name - the dataframe column (character vector) to search in
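To make the pattern syntax concrete, here is a minimal sketch on a plain character vector. The vector `words` and the replacement "X" are made-up illustration values, not part of the article's data. It contrasts `*` applied to a single letter with `.*`.

```r
# Hypothetical sample vector used only for illustration
words <- c("geeks", "g", "age")

# "ge*" : a "g" followed by zero or more "e" characters
res_star <- sub("ge*", "X", words)

# "g.*" : a "g" followed by any run of characters (possibly empty)
res_dot_star <- sub("g.*", "X", words)

print(res_star)      # "Xks" "X" "aX"
print(res_dot_star)  # "X"   "X" "aX"
```

Note that "g" on its own still matches both patterns, because `*` allows zero repetitions.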
Example 1:
R
# declaring dataframe
data_frame <- data.frame(col1 = c("geeks", "for", "geek", "friends"))
print("Original DataFrame")
print(data_frame)
# replace every value that starts with "ge" with a new string
data_frame$col1 <- sub("^ge.*", "new_String", data_frame$col1)
print("Modified DataFrame")
print(data_frame)
Output:
[1] "Original DataFrame"
col1
1 geeks
2 for
3 geek
4 friends
[1] "Modified DataFrame"
col1
1 new_String
2 for
3 new_String
4 friends
The sub() function replaces only the first occurrence of the pattern in each string. Any later occurrences in the same string are left unchanged.
Example 2:
R
# declaring dataframe
data_frame <- data.frame(col1 = c("geeks for geeks interviews",
                                  "suitable 4 placements",
                                  "interviews placements interviews"))
print("Original DataFrame")
print(data_frame)
# only the first "interviews" in each value is replaced
data_frame$col1 <- sub("interviews", "programming", data_frame$col1)
print("Modified DataFrame")
print(data_frame)
Output:
[1] "Original DataFrame"
col1
1 geeks for geeks interviews
2 suitable 4 placements
3 interviews placements interviews
[1] "Modified DataFrame"
col1
1 geeks for geeks programming
2 suitable 4 placements
3 programming placements interviews
Method 2 : Using gsub() method
The gsub() function takes the same arguments as sub() and accepts regular expressions in the same way. The difference is that gsub() replaces every match of the pattern in each string, not just the first one.
Syntax:
gsub(pattern, new_string, df$col_name)
Parameters:
- pattern - a regular expression or a character string to search for
- new_string - the string to replace the matches with
- df$col_name - the dataframe column (character vector) to search in
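The difference between the two functions is easiest to see when both are run on the same input. The following is a small sketch using one of the strings from the sub() example above. The variable names `text`, `first_only` and `all_matches` are chosen here for illustration.

```r
# A string that contains the target word twice
text <- "interviews placements interviews"

# sub() stops after the first match
first_only <- sub("interviews", "programming", text)

# gsub() keeps going and replaces every match
all_matches <- gsub("interviews", "programming", text)

print(first_only)   # "programming placements interviews"
print(all_matches)  # "programming placements programming"
```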
Example 1:
R
# declaring dataframe
data_frame <- data.frame(col1 = c("geeks", "for", "friends", "gap", "geek"))
print("Original DataFrame")
print(data_frame)
# replace every value that starts with "ge"; "gap" does not match
data_frame$col1 <- gsub("^ge.*", "new_String", data_frame$col1)
print("Modified DataFrame")
print(data_frame)
Output:
[1] "Original DataFrame"
col1
1 geeks
2 for
3 friends
4 gap
5 geek
[1] "Modified DataFrame"
col1
1 new_String
2 for
3 friends
4 gap
5 new_String
The gsub() function can also be used to modify every value in a column, for example by adding a prefix at the start of each string.
Example 2:
R
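# declaring dataframe
data_frame <- data.frame(col1 = c("geeks", "for", "friends", "gap", "geek"))
print("Original DataFrame")
print(data_frame)
# "^" matches the empty position at the start of each string,
# so the replacement is inserted in front of the existing value
data_frame$col1 <- gsub("^", "GFG ", data_frame$col1)
print("Modified DataFrame")
print(data_frame)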
Output:
[1] "Original DataFrame"
col1
1 geeks
2 for
3 friends
4 gap
5 geek
[1] "Modified DataFrame"
col1
1 GFG geeks
2 GFG for
3 GFG friends
4 GFG gap
5 GFG geek
It can also be used to remove digits from the values of a column.
Example 3:
R
# declaring dataframe
data_frame <- data.frame(col1 = c("geeks12 is good",
                                  "suitable 4 placements",
                                  "love you 2 much"))
print("Original DataFrame")
print(data_frame)
# "[0-9]+" matches one or more digits; every such run is removed
data_frame$col1 <- gsub("[0-9]+", "", data_frame$col1)
print("Modified DataFrame")
print(data_frame)
Output:
[1] "Original DataFrame"
col1
1 geeks12 is good
2 suitable 4 placements
3 love you 2 much
[1] "Modified DataFrame"
col1
1 geeks is good
2 suitable  placements
3 love you  much
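Removing a digit that stands between two spaces leaves both of those spaces in the string. If that matters, a second gsub() call can collapse them. The following is a minimal sketch of one way to do this, assuming the same three strings as in the example. The names `x`, `no_digits` and `cleaned` are illustrative, and trimws() is the base R function for stripping leading and trailing whitespace.

```r
# Same values as in the example above, as a plain vector
x <- c("geeks12 is good", "suitable 4 placements", "love you 2 much")

# Step 1: remove every run of digits
no_digits <- gsub("[0-9]+", "", x)

# Step 2: collapse runs of spaces into one space, then trim the ends
cleaned <- trimws(gsub(" +", " ", no_digits))

print(cleaned)
# "geeks is good" "suitable placements" "love you much"
```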