
Select Columns of an R Data Frame and Skip Those That Do Not Exist
Sometimes a data frame has a large number of columns, and we know the names of some of them, but a few of the names we expect do not actually exist in the data frame. To select the columns that do exist and skip the ones that do not, we can subset the data frame with a logical vector built by the %in% operator, which marks each column name as TRUE if it appears in the list of requested names.
For example, suppose a data frame called df contains twenty columns and we believe that x, y, and z exist in df, but z is not actually there. The columns x, y, and z can be selected, with z skipped, by using the command below −
df[,names(df) %in% c("x","y","z")]
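One caveat with the command above: when only one of the requested names matches, the [ , ] subsetting returns a plain vector instead of a data frame unless drop = FALSE is passed. The following is a minimal sketch, assuming a small hypothetical data frame df with columns x, y, and w (the data and the names wanted, sel, and one are illustrative and not part of the original example) −

```r
# Hypothetical data frame: x and y exist, z is deliberately absent.
df <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5), w = c("a", "b", "c"))

wanted <- c("x", "y", "z")

# names(df) %in% wanted is TRUE for each column of df that was requested.
# drop = FALSE keeps the result a data frame in every case.
sel <- df[, names(df) %in% wanted, drop = FALSE]
names(sel)           # "x" "y"

# Only x matches here; without drop = FALSE this would be a bare vector.
one <- df[, names(df) %in% c("x", "z"), drop = FALSE]
is.data.frame(one)   # TRUE
```

Adding drop = FALSE is a defensive choice: the result has the same type no matter how many of the requested columns exist.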
Example 1
Consider the below data frame −
x1<-rnorm(20)
x3<-rnorm(20)
x4<-rnorm(20)
df1<-data.frame(x1,x3,x4)
df1
The following data frame is created −
            x1          x3          x4
1   0.39242697  1.94369518 -0.36692667
2  -2.87236253  0.63008900 -1.06281211
3  -0.65349377 -0.88442286 -0.01778122
4  -1.17954360 -1.12290165  1.22420677
5   0.12765932 -2.47906508 -0.36339964
6   1.00167594  0.98720588  0.26306844
7  -0.45533660 -0.61367430  0.59131906
8   0.10805656  0.70099416 -1.25835396
9   0.41539962 -0.34988934 -1.16621416
10  1.69208586 -0.08883033  0.25785287
11  0.14335867 -1.67958251 -0.45326409
12 -0.69518421 -1.50169655 -0.32216638
13  0.29088005 -1.30874972 -0.28515476
14 -0.01994773  0.19276681 -0.36537207
15 -0.61455895 -0.59203646  0.09349088
16  0.34339425  0.86884825  1.04326014
17  1.71791754  0.88276790  0.66905104
18  2.06755011 -0.64288995 -0.09404691
19 -1.54713973  0.73062146 -2.27962611
20  1.33430182 -1.03840560  0.94347980
To select the columns x1, x2, x3, and x4 from df1, skipping x2 because it does not exist in the data frame, add the following code to the above snippet −
x1<-rnorm(20)
x3<-rnorm(20)
x4<-rnorm(20)
df1<-data.frame(x1,x3,x4)
df1[,names(df1) %in% c("x1","x2","x3","x4")]
Output
If you execute all the above snippets as a single program, it generates the following output. The requested column x2 is skipped without an error −
            x1          x3          x4
1   0.39242697  1.94369518 -0.36692667
2  -2.87236253  0.63008900 -1.06281211
3  -0.65349377 -0.88442286 -0.01778122
4  -1.17954360 -1.12290165  1.22420677
5   0.12765932 -2.47906508 -0.36339964
6   1.00167594  0.98720588  0.26306844
7  -0.45533660 -0.61367430  0.59131906
8   0.10805656  0.70099416 -1.25835396
9   0.41539962 -0.34988934 -1.16621416
10  1.69208586 -0.08883033  0.25785287
11  0.14335867 -1.67958251 -0.45326409
12 -0.69518421 -1.50169655 -0.32216638
13  0.29088005 -1.30874972 -0.28515476
14 -0.01994773  0.19276681 -0.36537207
15 -0.61455895 -0.59203646  0.09349088
16  0.34339425  0.86884825  1.04326014
17  1.71791754  0.88276790  0.66905104
18  2.06755011 -0.64288995 -0.09404691
19 -1.54713973  0.73062146 -2.27962611
20  1.33430182 -1.03840560  0.94347980
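Because the selection skips missing columns silently, it can help to check which requested names were not found. The following is a sketch using setdiff(), where set.seed(1) and the variable names wanted and skipped are assumptions added for illustration −

```r
set.seed(1)  # assumed seed, only so the sketch is repeatable
x1 <- rnorm(20)
x3 <- rnorm(20)
x4 <- rnorm(20)
df1 <- data.frame(x1, x3, x4)

wanted <- c("x1", "x2", "x3", "x4")

# Requested names that are not columns of df1.
skipped <- setdiff(wanted, names(df1))
skipped   # "x2"

# The selection itself, unchanged from the example above.
selected <- df1[, names(df1) %in% wanted, drop = FALSE]
names(selected)   # "x1" "x3" "x4"
```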
Example 2
Consider the data frame given below −
y1<-rpois(20,5)
y2<-rpois(20,2)
y4<-rpois(20,5)
y6<-rpois(20,2)
df2<-data.frame(y1,y2,y4,y6)
df2
The following data frame is created −
   y1 y2 y4 y6
1   4  3  5  2
2   4  2  4  3
3   4  0  9  3
4   3  5  6  1
5   4  2  3  3
6   7  1  8  0
7   7  3  4  2
8   5  2  8  1
9   3  4  8  1
10  4  2 10  1
11  3  2  4  1
12  5  1  5  2
13  3  2  8  2
14  4  2  9  0
15  5  0  2  3
16  4  0  6  1
17  5  2  7  1
18  6  0  6  2
19  5  2  5  2
20  6  1  4  1
To select the columns y1, y2, y3, y4, y5, and y6 from df2, skipping y3 and y5 because they do not exist in the data frame, add the following code to the above snippet −
y1<-rpois(20,5)
y2<-rpois(20,2)
y4<-rpois(20,5)
y6<-rpois(20,2)
df2<-data.frame(y1,y2,y4,y6)
df2[,names(df2) %in% c("y1","y2","y3","y4","y5","y6")]
Output
If you execute all the above snippets as a single program, it generates the following output. The requested columns y3 and y5 are skipped without an error −
   y1 y2 y4 y6
1   4  3  5  2
2   4  2  4  3
3   4  0  9  3
4   3  5  6  1
5   4  2  3  3
6   7  1  8  0
7   7  3  4  2
8   5  2  8  1
9   3  4  8  1
10  4  2 10  1
11  3  2  4  1
12  5  1  5  2
13  3  2  8  2
14  4  2  9  0
15  5  0  2  3
16  4  0  6  1
17  5  2  7  1
18  6  0  6  2
19  5  2  5  2
20  6  1  4  1
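The %in% mask always returns columns in the order they appear in the data frame, not in the order they were requested. If the requested order matters, one alternative is to index by name with intersect(). The following is a sketch of that variant, where set.seed(42) and the names wanted, by_mask, and by_name are assumptions added for illustration −

```r
set.seed(42)  # assumed seed, only so the sketch is repeatable
y1 <- rpois(20, 5)
y2 <- rpois(20, 2)
y4 <- rpois(20, 5)
y6 <- rpois(20, 2)
df2 <- data.frame(y1, y2, y4, y6)

# Requested in a different order than the data frame; y3 does not exist.
wanted <- c("y6", "y3", "y1")

# Logical mask: keeps the data frame's own column order.
by_mask <- df2[, names(df2) %in% wanted, drop = FALSE]
names(by_mask)   # "y1" "y6"

# intersect() keeps the order of its first argument, so the
# existing columns come back in the order they were requested.
by_name <- df2[, intersect(wanted, names(df2)), drop = FALSE]
names(by_name)   # "y6" "y1"
```

Both forms skip y3 without an error; they differ only in the column order of the result.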