
Remove Multiple Rows from R Data Frame using dplyr Package
Sometimes a data set contains unnecessary information that needs to be removed. This could be a single case, multiple cases, a whole variable, or anything else that does not help us achieve our analytical objective. To remove such rows from an R data frame with the help of the dplyr package, the anti_join function can be used: it returns the rows of the first data frame that have no match in the second.
Example
Consider the following data frame:
> set.seed(2514)
> x1 <- rnorm(20, 5)
> x2 <- rnorm(20, 5, 0.05)
> df1 <- data.frame(x1, x2)
> df1
Output
         x1       x2
1  5.567262 4.998607
2  5.343063 4.931962
3  2.211267 5.034461
4  5.092191 5.075641
5  3.883282 4.997900
6  5.950218 5.038626
7  4.903268 5.010087
8  7.462286 4.974513
9  5.056762 5.097812
10 6.031768 5.002989
11 3.814416 4.990552
12 3.359167 4.891964
13 5.304671 4.950883
14 4.768564 4.953290
15 3.842797 4.950219
16 5.270018 4.995953
17 6.344269 5.008545
18 5.366249 4.905290
19 5.547608 5.098554
20 5.266844 5.003416
Loading the dplyr package:
> library(dplyr)
Removing rows 1 to 5 from df1:
> anti_join(df1, df1[1:5, ])
Joining, by = c("x1", "x2")
         x1       x2
1  5.950218 5.038626
2  4.903268 5.010087
3  7.462286 4.974513
4  5.056762 5.097812
5  6.031768 5.002989
6  3.814416 4.990552
7  3.359167 4.891964
8  5.304671 4.950883
9  4.768564 4.953290
10 3.842797 4.950219
11 5.270018 4.995953
12 6.344269 5.008545
13 5.366249 4.905290
14 5.547608 5.098554
15 5.266844 5.003416
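The "Joining, by" message in the output above appears because no join columns were specified, so dplyr falls back to all columns the two data frames share. The following is a minimal sketch, assuming the same df1 as above, that names the columns explicitly through the by argument; the variable name kept is introduced here only for illustration.

```r
library(dplyr)

# Recreate the data frame used in the example
set.seed(2514)
x1 <- rnorm(20, 5)
x2 <- rnorm(20, 5, 0.05)
df1 <- data.frame(x1, x2)

# Naming the join columns suppresses the "Joining, by = ..." message
kept <- anti_join(df1, df1[1:5, ], by = c("x1", "x2"))

# Number of rows left after dropping the rows that match df1[1:5, ]
nrow(kept)
```

Spelling out by also makes the matching rule explicit, which helps if the data frame later gains columns that should not take part in the comparison.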
Removing rows 11 to 18 from df1:
> anti_join(df1, df1[11:18, ])
Joining, by = c("x1", "x2")
         x1       x2
1  5.567262 4.998607
2  5.343063 4.931962
3  2.211267 5.034461
4  5.092191 5.075641
5  3.883282 4.997900
6  5.950218 5.038626
7  4.903268 5.010087
8  7.462286 4.974513
9  5.056762 5.097812
10 6.031768 5.002989
11 5.547608 5.098554
12 5.266844 5.003416
Removing rows 6 to 12 from df1:
> anti_join(df1, df1[6:12, ])
Joining, by = c("x1", "x2")
         x1       x2
1  5.567262 4.998607
2  5.343063 4.931962
3  2.211267 5.034461
4  5.092191 5.075641
5  3.883282 4.997900
6  5.304671 4.950883
7  4.768564 4.953290
8  3.842797 4.950219
9  5.270018 4.995953
10 6.344269 5.008545
11 5.366249 4.905290
12 5.547608 5.098554
13 5.266844 5.003416
Removing rows 15 to 20 from df1:
> anti_join(df1, df1[15:20, ])
Joining, by = c("x1", "x2")
         x1       x2
1  5.567262 4.998607
2  5.343063 4.931962
3  2.211267 5.034461
4  5.092191 5.075641
5  3.883282 4.997900
6  5.950218 5.038626
7  4.903268 5.010087
8  7.462286 4.974513
9  5.056762 5.097812
10 6.031768 5.002989
11 3.814416 4.990552
12 3.359167 4.891964
13 5.304671 4.950883
14 4.768564 4.953290
Removing rows 5 to 18 from df1:
> anti_join(df1, df1[5:18, ])
Joining, by = c("x1", "x2")
        x1       x2
1 5.567262 4.998607
2 5.343063 4.931962
3 2.211267 5.034461
4 5.092191 5.075641
5 5.547608 5.098554
6 5.266844 5.003416
Removing rows 11 to 20 from df1:
> anti_join(df1, df1[11:20, ])
Joining, by = c("x1", "x2")
         x1       x2
1  5.567262 4.998607
2  5.343063 4.931962
3  2.211267 5.034461
4  5.092191 5.075641
5  3.883282 4.997900
6  5.950218 5.038626
7  4.903268 5.010087
8  7.462286 4.974513
9  5.056762 5.097812
10 6.031768 5.002989
Removing rows 1 to 10 from df1:
> anti_join(df1, df1[1:10, ])
Joining, by = c("x1", "x2")
         x1       x2
1  3.814416 4.990552
2  3.359167 4.891964
3  5.304671 4.950883
4  4.768564 4.953290
5  3.842797 4.950219
6  5.270018 4.995953
7  6.344269 5.008545
8  5.366249 4.905290
9  5.547608 5.098554
10 5.266844 5.003416
Removing rows 2 to 11 from df1:
> anti_join(df1, df1[2:11, ])
Joining, by = c("x1", "x2")
         x1       x2
1  5.567262 4.998607
2  3.359167 4.891964
3  5.304671 4.950883
4  4.768564 4.953290
5  3.842797 4.950219
6  5.270018 4.995953
7  6.344269 5.008545
8  5.366249 4.905290
9  5.547608 5.098554
10 5.266844 5.003416
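One point worth keeping in mind with the approach above: anti_join compares column values, not row positions, so any other row that happens to hold the same values as a removed row is dropped as well. The sketch below illustrates this with a small hypothetical data frame (dup_df and its columns id and score are not part of the original example) and contrasts it with dplyr's slice(), which drops rows purely by position when given negative indices.

```r
library(dplyr)

# Hypothetical data frame in which rows 1 and 3 hold identical values
dup_df <- data.frame(id = c(1, 2, 1, 3), score = c(10, 20, 10, 30))

# anti_join matches on values, so removing "row 1" also removes row 3
by_value <- anti_join(dup_df, dup_df[1, ], by = c("id", "score"))

# slice() with a negative index drops only the row at that position
by_position <- slice(dup_df, -1)

nrow(by_value)     # 2 rows remain
nrow(by_position)  # 3 rows remain
```

For data such as df1, where the values are continuous random draws and duplicates are very unlikely, both approaches should give the same rows; slice(df1, -(1:5)) is simply the position-based counterpart of the first removal shown earlier.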