
Subset R Data Frame by Numerical Column and Category
Subsetting is a commonly used technique that serves different purposes depending on the objective of the analysis. To subset a data frame to the rows where a numerical column is greater than a certain value within a particular category of a grouping column, follow the steps below −
- Creating a data frame.
- Subsetting the data frame with the filter function of the dplyr package.
Create the data frame
Let's create a data frame as shown below −
x <- rnorm(20, 10, 0.25)
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
df <- data.frame(x, Gender)
df
Running the script above produces output like the following (the values will vary on your system because the data is random) −
           x Gender
1   9.401786   Male
2  10.219677   Male
3  10.126467   Male
4  10.260641   Male
5  10.685478   Male
6  10.006628   Male
7   9.912915   Male
8  10.206531   Male
9  10.366212 Female
10  9.746924   Male
11 10.092994   Male
12 10.291531   Male
13 10.398257   Male
14  9.441365   Male
15  9.479788   Male
16  9.670627 Female
17 10.249913 Female
18  9.718280   Male
19 10.007886   Male
20  9.976768   Male
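Because rnorm and sample draw random values, each run gives a different data frame. One way to make the example repeatable is to fix the random seed before generating the data. The sketch below assumes an arbitrary seed of 123 and a hypothetical helper named make_df; neither is part of the original script.

```r
# Hypothetical helper: builds the example data frame from a fixed seed.
# The seed value 123 is an arbitrary assumption, used only for repeatability.
make_df <- function(seed = 123) {
  set.seed(seed)                       # fix the random number generator state
  x <- rnorm(20, 10, 0.25)             # 20 values, mean 10, standard deviation 0.25
  Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
  data.frame(x, Gender)
}

df_first  <- make_df()
df_second <- make_df()

# Both calls start from the same seed, so the two data frames match
identical(df_first, df_second)
```

With a fixed seed, the data frame and any subset taken from it stay the same from one run to the next, which makes it easier to compare results.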
Subsetting the data frame
Load the dplyr package and subset df to the rows where Gender is Male and x is greater than 10 −
library(dplyr)
# df is the data frame created in the previous step; regenerating it here
# would draw new random values and change the result
df %>% filter(x > 10, Gender == "Male")
Output
          x Gender
1  10.21968   Male
2  10.12647   Male
3  10.26064   Male
4  10.68548   Male
5  10.00663   Male
6  10.20653   Male
7  10.09299   Male
8  10.29153   Male
9  10.39826   Male
10 10.00789   Male
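The comma-separated conditions passed to filter are combined with a logical AND. For comparison, the same kind of subset can be taken in base R without dplyr. This is a minimal sketch, assuming an arbitrary seed of 42 and illustrative variable names (males_above_10, same_result) that do not appear in the original.

```r
# Build a repeatable example data frame (seed 42 is an arbitrary assumption)
set.seed(42)
x <- rnorm(20, 10, 0.25)
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
df <- data.frame(x, Gender)

# Logical indexing: keep rows where both conditions hold
males_above_10 <- df[df$x > 10 & df$Gender == "Male", ]

# subset() expresses the same condition using bare column names
same_result <- subset(df, x > 10 & Gender == "Male")

males_above_10
```

Unlike filter, both base R forms keep the original row numbers, so the row names of the result show where each row sat in df.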