
Select First Row for Each Level of a Factor Variable in R Data Frame
Comparing rows is an important part of data analysis: we may compare one variable with another, one value with another, one case or row with another, or even a complete data set with another data set. Such comparisons help check the accuracy and consistency of the data values. To make them, we first need to select the required rows or columns. To select the first row for each level of a factor variable, we can use the duplicated function negated with the ! operator.
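To see why this works, here is a minimal sketch on a small made-up vector (the vector `v` is purely illustrative and is not part of the original example). `duplicated()` returns `TRUE` for every element that has already appeared earlier, so negating it with `!` keeps only the first occurrences.

```r
# Hypothetical vector used only for illustration
v <- c("a", "b", "a", "c", "b")

# TRUE marks values that already appeared earlier in the vector
dup <- duplicated(v)
print(dup)
# [1] FALSE FALSE  TRUE FALSE  TRUE

# Negating the mask leaves the positions of the first occurrences
first_idx <- which(!dup)
print(first_idx)
# [1] 1 2 4
```

Used as a row index on a data frame, the negated mask selects the first row in which each value appears.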
Example
Consider the below data frame −
> x1<-rep(c(1,2,3,4,5,6,7,8,9,10),each=5)
> x2<-1:50
> x3<-rep(c(LETTERS[1:10]),times=5)
> df<-data.frame(x1,x2,x3)
> head(df,20)
   x1 x2 x3
1   1  1  A
2   1  2  B
3   1  3  C
4   1  4  D
5   1  5  E
6   2  6  F
7   2  7  G
8   2  8  H
9   2  9  I
10  2 10  J
11  3 11  A
12  3 12  B
13  3 13  C
14  3 14  D
15  3 15  E
16  4 16  F
17  4 17  G
18  4 18  H
19  4 19  I
20  4 20  J
> tail(df,20)
   x1 x2 x3
31  7 31  A
32  7 32  B
33  7 33  C
34  7 34  D
35  7 35  E
36  8 36  F
37  8 37  G
38  8 38  H
39  8 39  I
40  8 40  J
41  9 41  A
42  9 42  B
43  9 43  C
44  9 44  D
45  9 45  E
46 10 46  F
47 10 47  G
48 10 48  H
49 10 49  I
50 10 50  J
Selecting first rows based on each level of factor variable x1 −
> df[!duplicated(df$x1),]
   x1 x2 x3
1   1  1  A
6   2  6  F
11  3 11  A
16  4 16  F
21  5 21  A
26  6 26  F
31  7 31  A
36  8 36  F
41  9 41  A
46 10 46  F
Selecting first rows based on each level of factor variable x3 −
> df[!duplicated(df$x3),]
   x1 x2 x3
1   1  1  A
2   1  2  B
3   1  3  C
4   1  4  D
5   1  5  E
6   2  6  F
7   2  7  G
8   2  8  H
9   2  9  I
10  2 10  J
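The example above builds x1 as a numeric column rather than an explicit factor. As a hedged sketch, the same selection can be run after converting x1 with `factor()` and then checked to confirm that exactly one row per level is returned. The conversion step and the name `first_rows` are assumptions added here for illustration, not part of the original example.

```r
# Rebuild the data frame from the example above
x1 <- rep(1:10, each = 5)
x2 <- 1:50
x3 <- rep(LETTERS[1:10], times = 5)
df <- data.frame(x1, x2, x3)

# Assumption: store x1 explicitly as a factor (it is numeric in the example)
df$x1 <- factor(df$x1)

# Keep the first row in which each level of x1 appears
first_rows <- df[!duplicated(df$x1), ]

# Sanity check: one selected row per level of the factor
nrow(first_rows) == nlevels(df$x1)

# The selected rows keep their original row names
rownames(first_rows)
```

Note that `duplicated()` only sees values that actually occur in the column, so a factor level with no observations would not produce a row.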