
Subset Non-Duplicate Values from an R Data Frame Column
By default, duplicated() marks a value as a duplicate only from its second occurrence onward, but the first occurrence is just as much a duplicate of the later ones. To keep only the values that occur exactly once, the first occurrence has to be excluded as well.
To subset these non-duplicate values from an R data frame column, combine duplicated() with duplicated(fromLast=TRUE) using the | operator and negate the result with !, as shown in the examples below.
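To see why both calls to duplicated() are needed, here is a minimal sketch on a small hand-written vector. The vector v and the names later and earlier are illustrative assumptions, not part of the examples that follow. duplicated(v) flags only the second and later occurrences, duplicated(v, fromLast = TRUE) scans from the end and flags everything except the last occurrence, and combining the two with | flags every value that occurs more than once.

```r
# Illustrative vector (an assumption for this sketch): 2 and 5 repeat
v <- c(2, 7, 2, 5, 9, 5, 5)

later   <- duplicated(v)                   # TRUE for 2nd and later occurrences
earlier <- duplicated(v, fromLast = TRUE)  # TRUE for all but the last occurrence

v[!later]              # first occurrences are kept: 2 7 5 9
v[!(later | earlier)]  # only values that occur exactly once: 7 9
```

Negating duplicated() alone still returns one copy of each repeated value. The combined mask drops every copy.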
Example 1
Following snippet creates a sample data frame −
x<-rpois(20,10)
df1<-data.frame(x)
df1
The following dataframe is created
    x
1  16
2   5
3  17
4   7
5   6
6   7
7  14
8  10
9   7
10 13
11 11
12 15
13  4
14 10
15 16
16 11
17 10
18 11
19  9
20 11
To subset the values of x that occur exactly once, excluding the first occurrence of each repeated value as well, add the following code to the above snippet. Do not call rpois again, because that would draw a new random sample −
df1$x[!(duplicated(df1$x)|duplicated(df1$x,fromLast=TRUE))]
Output
For the data frame shown above, executing the snippets as a single program generates the following output (rpois draws random values, so a fresh run will give different numbers) −
[1] 5 17 6 14 13 15 4 9
Example 2
Following snippet creates a sample data frame −
y<-sample(1:10,20,replace=TRUE)
df2<-data.frame(y)
df2
The following dataframe is created
    y
1   8
2  10
3   1
4   5
5   5
6   2
7   1
8   2
9   6
10  7
11 10
12  5
13  7
14  4
15  2
16  1
17  6
18  5
19 10
20  7
To subset the values of y that occur exactly once, excluding the first occurrence of each repeated value as well, add the following code to the above snippet. Do not call sample again, because that would draw a new random sample −
df2$y[!(duplicated(df2$y)|duplicated(df2$y,fromLast=TRUE))]
Output
For the data frame shown above, executing the snippets as a single program generates the following output (sample draws random values, so a fresh run will give different numbers) −
[1] 8 4
Example 3
Following snippet creates a sample data frame −
z<-sample(501:510,20,replace=TRUE)
df3<-data.frame(z)
df3
The following dataframe is created
     z
1  509
2  507
3  504
4  508
5  502
6  510
7  508
8  506
9  503
10 508
11 507
12 508
13 502
14 508
15 506
16 510
17 505
18 510
19 510
20 505
To subset the values of z that occur exactly once, excluding the first occurrence of each repeated value as well, add the following code to the above snippet. Do not call sample again, because that would draw a new random sample −
df3$z[!(duplicated(df3$z)|duplicated(df3$z,fromLast=TRUE))]
Output
For the data frame shown above, executing the snippets as a single program generates the following output (sample draws random values, so a fresh run will give different numbers) −
[1] 509 504 503
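The three examples repeat the same expression on randomly generated data, so their results cannot be reproduced exactly. A minimal sketch of a reusable, repeatable version follows. The helper name non_duplicated, the data frame df and the seed value 123 are assumptions made for this illustration and do not come from the examples above. table() is used only as an independent cross-check, since it counts how often each value occurs.

```r
# Hypothetical helper (not part of base R): keep values that occur exactly once
non_duplicated <- function(v) {
  v[!(duplicated(v) | duplicated(v, fromLast = TRUE))]
}

set.seed(123)  # arbitrary seed, assumed here only to make the sample repeatable
df <- data.frame(x = rpois(20, 10))

singles <- non_duplicated(df$x)

# Cross-check: values whose table() count is 1 should be the same set
counts   <- table(df$x)
expected <- as.numeric(names(counts)[counts == 1])
stopifnot(setequal(singles, expected))

# Keep the matching rows of the data frame instead of a bare vector
df[df$x %in% singles, , drop = FALSE]
```

The duplicated() approach preserves the original order of the values, whereas table() returns them sorted, which is why the cross-check compares the two results as sets.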