
Deal with Error: Undefined Columns Selected in R Data Frame Subsetting
The error “undefined columns selected” means that R was asked for columns that do not exist in the data frame. It usually appears when the comma is left out while subsetting with single square brackets: with only one index inside the brackets, R treats the expression as a column selection rather than a row selection.
Example
Consider the following data frame:
> set.seed(99)
> x1<-rnorm(20,0.5)
> x2<-rpois(20,2)
> x3<-runif(20,2,10)
> x4<-rnorm(20,0.2)
> x5<-rpois(20,5)
> df<-data.frame(x1,x2,x3,x4,x5)
> df
           x1 x2       x3          x4 x5
1   0.7139625  4 9.321058  0.33297863  4
2   0.9796581  2 4.298837 -1.47926432 11
3   0.5878287  3 7.389898 -0.07847958  5
4   0.9438585  4 7.873764 -1.35241100  6
5   0.1371621  2 5.534758 -1.17969925  4
6   0.6226740  4 8.786676 -1.15705659  5
7  -0.3638452  1 6.407712 -0.72113718  5
8   0.9896243  2 9.374095 -0.66681774  9
9   0.1358831  2 2.086996  1.85664439  3
10 -0.7942420  0 8.730721  0.04492028  3
11 -0.2457690  3 2.687042 -1.37655243  2
12  1.4215504  3 7.075115  0.82408260  4
13  1.2500544  3 5.373809  0.53022068  5
14 -2.0085540  5 5.287499 -0.19812226 12
15 -2.5409341  1 6.217131 -0.88139693  5
16  0.5002658  3 2.723290  0.12307794  6
17  0.1059810  0 6.288451 -0.32553662  4
18 -1.2450277  2 2.942365  0.59128965  5
19  0.9986315  4 7.012492 -0.48045326  6
20  0.7709538  1 7.801093 -0.54869693  5
Now suppose you want to select the rows where x2 is greater than 2 and you type the following code:
> df[df$x2>2]
Error in `[.data.frame`(df, df$x2 > 2) : undefined columns selected
R throws the undefined columns error because the comma after the row condition is missing, so the condition is applied to the columns instead of the rows. The correct way to select the rows where x2 is greater than 2 is shown below:
> df[df$x2>2,]
           x1 x2       x3          x4 x5
1   0.7139625  4 9.321058  0.33297863  4
3   0.5878287  3 7.389898 -0.07847958  5
4   0.9438585  4 7.873764 -1.35241100  6
6   0.6226740  4 8.786676 -1.15705659  5
11 -0.2457690  3 2.687042 -1.37655243  2
12  1.4215504  3 7.075115  0.82408260  4
13  1.2500544  3 5.373809  0.53022068  5
14 -2.0085540  5 5.287499 -0.19812226 12
16  0.5002658  3 2.723290  0.12307794  6
19  0.9986315  4 7.012492 -0.48045326  6
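The reason the comma matters can be checked directly. The sketch below rebuilds the same data frame and compares the length of the condition with the number of columns. The names keep, msg and rows are illustrative and do not come from the original example. tryCatch() is used only so that the script keeps running after the failing call.

```r
# Rebuild the data frame from the example
set.seed(99)
x1 <- rnorm(20, 0.5)
x2 <- rpois(20, 2)
x3 <- runif(20, 2, 10)
x4 <- rnorm(20, 0.2)
x5 <- rpois(20, 5)
df <- data.frame(x1, x2, x3, x4, x5)

# The condition is a logical vector with one element per row
keep <- df$x2 > 2
length(keep)   # number of rows
ncol(df)       # number of columns

# Without a comma the vector is used as a column index, so the call fails;
# tryCatch() captures the error message instead of stopping the script
msg <- tryCatch(
  {
    df[keep]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With a comma the same vector selects rows and keeps every column
rows <- df[keep, ]
nrow(rows) == sum(keep)
```

Because keep has 20 elements and df has only 5 columns, any TRUE value beyond the fifth position points at a column that does not exist, which is what the error message reports.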
Similarly, the rows where x2 is less than 2 can be selected as follows:
> df[df$x2<2,]
           x1 x2       x3          x4 x5
7  -0.3638452  1 6.407712 -0.72113718  5
10 -0.7942420  0 8.730721  0.04492028  3
15 -2.5409341  1 6.217131 -0.88139693  5
17  0.1059810  0 6.288451 -0.32553662  4
20  0.7709538  1 7.801093 -0.54869693  5
In the same way, the rows where x2 is greater than 1 are selected as follows:
> df[df$x2>1,]
           x1 x2       x3          x4 x5
1   0.7139625  4 9.321058  0.33297863  4
2   0.9796581  2 4.298837 -1.47926432 11
3   0.5878287  3 7.389898 -0.07847958  5
4   0.9438585  4 7.873764 -1.35241100  6
5   0.1371621  2 5.534758 -1.17969925  4
6   0.6226740  4 8.786676 -1.15705659  5
8   0.9896243  2 9.374095 -0.66681774  9
9   0.1358831  2 2.086996  1.85664439  3
11 -0.2457690  3 2.687042 -1.37655243  2
12  1.4215504  3 7.075115  0.82408260  4
13  1.2500544  3 5.373809  0.53022068  5
14 -2.0085540  5 5.287499 -0.19812226 12
16  0.5002658  3 2.723290  0.12307794  6
18 -1.2450277  2 2.942365  0.59128965  5
19  0.9986315  4 7.012492 -0.48045326  6
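If the trailing comma is easy to forget, base R's subset() is one possible alternative, since it takes the row condition as a separate argument. The following is a minimal sketch on a small made-up data frame; small_df, by_bracket, by_subset and some_cols are hypothetical names used only for illustration.

```r
# Small made-up data frame, used only for illustration
small_df <- data.frame(id = 1:6, x2 = c(4, 2, 3, 0, 1, 5))

# Bracket form: row condition, comma, then an empty column slot
by_bracket <- small_df[small_df$x2 > 2, ]

# subset() form: the condition is its own argument, so no comma is needed
by_subset <- subset(small_df, x2 > 2)

# Both approaches return the same rows
print(rownames(by_bracket))
print(rownames(by_subset))

# Rows and columns can be chosen together by filling both slots
some_cols <- small_df[small_df$x2 > 2, c("id", "x2")]
print(some_cols)
```

One difference worth knowing is that subset() drops rows where the condition is NA, while the bracket form returns a row of NA values for them, so the two are interchangeable only when the column has no missing values.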