
Check for Duplicate Values in Data Frame Column in R
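To check whether a data frame column contains duplicate values, we can use the duplicated function together with any. The duplicated function returns a logical vector that marks every element repeating an earlier one, and any returns TRUE if at least one element is marked. For example, if a data frame called df contains a column ID, we can check whether ID contains duplicate values by using the command −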
any(duplicated(df$ID))
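To see what each function contributes, here is a minimal sketch on a small data frame. The data frame and its ID values are hypothetical and chosen only so that one value repeats.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(ID = c(101, 102, 103, 102, 104))

# duplicated() flags each element that repeats an earlier one
dup_flags <- duplicated(df$ID)
dup_flags
# [1] FALSE FALSE FALSE  TRUE FALSE

# any() collapses the flags into a single TRUE/FALSE answer
has_dups <- any(dup_flags)
has_dups
# [1] TRUE
```

Note that the first occurrence of 102 is not flagged; only the later repeat is.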
Example 1
Consider the following data frame. Because x is drawn at random with rpois, the values will differ from run to run −
ID<-1:20
x<-rpois(20,1)
df1<-data.frame(ID,x)
df1
Output
   ID x
1   1 4
2   2 1
3   3 2
4   4 2
5   5 1
6   6 0
7   7 1
8   8 1
9   9 0
10 10 1
11 11 1
12 12 2
13 13 1
14 14 3
15 15 1
16 16 0
17 17 0
18 18 3
19 19 2
20 20 2
Checking whether x contains any duplicate values −
any(duplicated(df1$x))
[1] TRUE
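A TRUE result only says that some duplicate exists. As a possible follow-up, the sketch below uses base R's anyDuplicated and unique to locate the first repeat and list the repeated values. The x values are copied by hand from the output above so that the result is reproducible; this is an illustration, not part of the original example.

```r
# x values copied from the printed output above (rpois() is random,
# so regenerating the column would give different numbers)
x <- c(4, 1, 2, 2, 1, 0, 1, 1, 0, 1, 1, 2, 1, 3, 1, 0, 0, 3, 2, 2)

# Index of the first element that repeats an earlier one (0 if none)
first_dup <- anyDuplicated(x)
first_dup
# [1] 4

# Distinct values that occur more than once
repeated_values <- unique(x[duplicated(x)])
repeated_values
# [1] 2 1 0 3
```

Here the value 4 appears only once, so it is the only value missing from the list of repeats.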
Example 2
S.No<-1:20
y<-round(rnorm(20,5,3),1)
df2<-data.frame(S.No,y)
df2
Output
   S.No    y
1     1  5.1
2     2  5.8
3     3  4.4
4     4 10.1
5     5  3.3
6     6  6.1
7     7  4.8
8     8 12.6
9     9  6.4
10   10  8.7
11   11  1.5
12   12  2.5
13   13  2.1
14   14  8.7
15   15  5.5
16   16  2.0
17   17  2.1
18   18  5.5
19   19  5.4
20   20  3.4
Checking whether y contains any duplicate values −
any(duplicated(df2$y))
[1] TRUE
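For contrast, a column of unique values returns FALSE. The sketch below rebuilds df2 with the y values copied by hand from the output above (an assumption made for reproducibility, since rnorm is random), checks the S.No column, and then finds every row whose y value is shared with another row. The variable names no_dups_in_sno and shared_rows are introduced here for illustration.

```r
# df2 rebuilt from the printed output above
S.No <- 1:20
y <- c(5.1, 5.8, 4.4, 10.1, 3.3, 6.1, 4.8, 12.6, 6.4, 8.7,
       1.5, 2.5, 2.1, 8.7, 5.5, 2.0, 2.1, 5.5, 5.4, 3.4)
df2 <- data.frame(S.No, y)

# S.No runs from 1 to 20 without repeats, so no duplicates are found
no_dups_in_sno <- any(duplicated(df2$S.No))
no_dups_in_sno
# [1] FALSE

# Scanning from both ends flags the first occurrence as well as the repeats
shared <- duplicated(df2$y) | duplicated(df2$y, fromLast = TRUE)
shared_rows <- which(shared)
shared_rows
# [1] 10 13 14 15 17 18

# Show the rows that share a y value with another row
df2[shared, ]
```

Combining duplicated with its fromLast = TRUE variant is useful when the goal is to inspect all rows involved in a duplicate rather than only the later repeats.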