Remove All Whitespace in Each DataFrame Column in R
Last Updated: 02 Jun, 2022
In this article, we will learn how to remove all whitespace in each dataframe column in R programming language.
Sample dataframe in use:
c1 c2
1 geeks for geeks
2 cs f
3 r -lang g
Method 1: Using gsub()
In this approach, we use the apply() function with margin 2 to apply a function to each column of the data frame. The function applied is gsub(), which replaces every match of a pattern in a string. Here the pattern is the regular expression "\\s+", which matches one or more whitespace characters (spaces, tabs, newlines); each match is replaced by "", which removes the whitespace.
Note: The entire call is wrapped in as.data.frame() because apply() returns a matrix, which has to be converted back into a data frame.
Syntax: as.data.frame(apply(df,margin, function(x) gsub("\\s+", "", x)))
Parameters:
df: Dataframe object
margin: dimension along which the function is applied (1 for rows, 2 for columns)
function(x): operation to be applied, gsub() in this case.
gsub(): replaces every match of "\\s+" with ""
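To see what the margin argument changes, here is a small sketch on a data frame like the sample one (the names strip_ws, by_col and by_row are only illustrative). With margin 2 the function receives one column at a time and the result keeps the original shape; with margin 1 it receives one row at a time and apply() hands the result back transposed, which is usually not what is wanted here.

```r
df <- data.frame(c1 = c(" geeks for", " cs", "r -lang "),
                 c2 = c("geeks ", "f ", " g"))

# Illustrative helper: remove every run of whitespace from a character vector
strip_ws <- function(x) gsub("\\s+", "", x)

# margin = 2: the helper is called once per column, shape stays 3 x 2
by_col <- apply(df, 2, strip_ws)
dim(by_col)

# margin = 1: the helper is called once per row, result comes back as 2 x 3
by_row <- apply(df, 1, strip_ws)
dim(by_row)
```

Because gsub() works element by element, both calls clean the same values; only the layout of the returned matrix differs.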
Example: R program to remove whitespaces using gsub()
R
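# Sample data frame with leading, trailing and inner whitespace
df <- data.frame(c1 = c(" geeks for", " cs", "r -lang "),
                 c2 = c("geeks ", "f ", " g"))

# Apply gsub() to every column (margin = 2), then convert the matrix back
df_new <- as.data.frame(
  apply(df, 2, function(x) gsub("\\s+", "", x)))
df_new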
Output:
c1 c2
1 geeksfor geeks
2 cs f
3 r-lang g
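As the note above says, apply() goes through a matrix, so on a data frame that also holds numeric columns every column would come back as character. A minimal base R sketch of one way around this, assuming a hypothetical data frame df_mixed with one character and one numeric column, is to use lapply() on the character columns only:

```r
# Hypothetical data frame with one character and one numeric column
df_mixed <- data.frame(name = c(" a b", "c  d "), score = c(1.5, 20),
                       stringsAsFactors = FALSE)

# Find the character columns and strip whitespace from those only
df_clean <- df_mixed
is_chr <- vapply(df_clean, is.character, logical(1))
df_clean[is_chr] <- lapply(df_clean[is_chr],
                           function(col) gsub("\\s+", "", col))

# The numeric column keeps its type
str(df_clean)
```

This is an alternative to the apply() approach shown in this article, not a replacement for it; it is mainly useful when the data frame is not all text.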
Method 2: Using str_remove_all()
We first need to install the "stringr" package with the install.packages() command and then load it with the library() function.
The str_remove_all() function takes two arguments: the string (or character vector) on which the removal is performed, and the pattern whose every occurrence is to be removed. By default the pattern is interpreted as a regular expression.
Syntax: str_remove_all(string, pattern)
Parameters:
string: input string or character vector
pattern: pattern (a regular expression by default) to be removed from the string
Example: R program to remove whitespaces using str_remove_all()
R
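library("stringr")
# Sample string; named text to avoid masking the base function str()
text <- " Welcome to Geeks for Geeks "
# Remove every space character
str_remove_all(text, " ")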
Output:
[1] "WelcometoGeeksforGeeks"
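Since the second argument is a pattern rather than a single character, the choice of pattern matters. The sketch below uses an illustrative string (messy is a made-up name) to show that " " removes only space characters, while the regular expression "\\s" also removes tabs and newlines:

```r
library(stringr)

# Illustrative string containing a space, a tab and a newline
messy <- "a b\tc\nd"

# A literal space pattern removes only the space character
only_spaces <- str_remove_all(messy, " ")
only_spaces

# The regex class \\s also removes the tab and the newline
all_ws <- str_remove_all(messy, "\\s")
all_ws
```

If the data may contain tabs or line breaks, "\\s" is the safer pattern; for data that only contains ordinary spaces both give the same result.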
Now that we have seen how str_remove_all() works on a single string, let's apply it to every column of the data frame.
Syntax: as.data.frame(apply(df,margin, str_remove_all, " "))
Parameters:
df: Dataframe object
margin: dimension along which the function is applied (1 for rows, 2 for columns)
str_remove_all: operation to be applied
In this approach, we use apply() with margin 2 to apply str_remove_all() to each column of the data frame. The pattern " " is passed as an extra argument, so every space character is removed from each value. Note that " " matches only the space character; tabs and newlines are left in place unless a pattern such as "\\s" is used.
Note: The entire call is wrapped in as.data.frame() because apply() returns a matrix, which has to be converted back into a data frame.
Example: R program to remove whitespaces from dataframe using str_remove_all()
R
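library("stringr")
# Sample data frame with leading, trailing and inner whitespace
df <- data.frame(c1 = c(" geeks for", " cs", "r -lang "),
                 c2 = c("geeks ", "f ", " g"))

# Remove every space from each column (margin = 2)
df_new <- as.data.frame(apply(df, 2, str_remove_all, " "))
df_new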
Output:
c1 c2
1 geeksfor geeks
2 cs f
3 r-lang g
Method 3: Using str_replace_all()
The str_replace_all() function takes three arguments: the input string on which the operation is performed, the pattern to be replaced, and the replacement value. Here the pattern " " is replaced by "".
Syntax: as.data.frame(apply(df, margin, function(x) str_replace_all(string=x, pattern=" ", replacement="")))
Parameters:
df: Dataframe object
margin: dimension along which the function is applied (1 for rows, 2 for columns)
function(x): operation to be applied, str_replace_all() in this case.
str_replace_all(): replaces all the occurrences of " " with ""
In this approach, we use apply() with margin 2 to apply a function to each column of the data frame. The function applied is str_replace_all(), which replaces every match of a pattern in a string. Here it finds each space character (" ") and replaces it with "", which removes the spaces.
Note: The entire call is wrapped in as.data.frame() because apply() returns a matrix, which has to be converted back into a data frame.
Example: R program to remove whitespaces using str_replace_all()
R
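library(stringr)
# Sample data frame with leading, trailing and inner whitespace
df <- data.frame(c1 = c(" geeks for", " cs", "r -lang "),
                 c2 = c("geeks ", "f ", " g"))

# Replace every space with "" in each column (margin = 2);
# the third argument is spelled out in full as replacement
df_new <- as.data.frame(apply(df, 2,
  function(x) str_replace_all(string = x,
                              pattern = " ", replacement = "")))
df_new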
Output:
c1 c2
1 geeksfor geeks
2 cs f
3 r-lang g
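Whichever method is used, it can be worth checking that no whitespace is left afterwards. A minimal base R sketch follows; has_ws is a hypothetical helper name, and the cleaning step uses gsub() only to keep the example free of extra packages:

```r
df <- data.frame(c1 = c(" geeks for", " cs", "r -lang "),
                 c2 = c("geeks ", "f ", " g"))
df_new <- as.data.frame(apply(df, 2, function(x) gsub("\\s+", "", x)))

# Hypothetical helper: TRUE for every column that still contains
# at least one whitespace character
has_ws <- function(data) {
  vapply(data, function(col) any(grepl("\\s", col)), logical(1))
}

has_ws(df)      # checks the original columns
has_ws(df_new)  # checks the cleaned columns
```

For the sample data, both columns contain whitespace before cleaning and neither does after.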