
Convert Strings in R Data Frame to Unique Integers
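To convert the strings in an R data frame to unique integers, first extract the unique strings across all columns of the data frame. Then rebuild the data frame with data.frame(), passing each column through factor() with those unique strings as its levels and wrapping the result in as.numeric(). Each distinct string then maps to the same integer in every column.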
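As a minimal sketch of this approach, assuming a small hand-written data frame (the names df, a, b, codes and df_int are illustrative and do not appear in the examples that follow):

```r
# Small data frame with fixed values (illustrative only)
df <- data.frame(a = c("Hot", "Cold", "Hot"),
                 b = c("Normal", "Hot", "Cold"),
                 stringsAsFactors = FALSE)

# Unique strings across both columns, in order of first appearance
codes <- unique(c(df$a, df$b))

# Using the shared vector as factor levels gives each string one integer
df_int <- data.frame(a = as.numeric(factor(df$a, levels = codes)),
                     b = as.numeric(factor(df$b, levels = codes)))
df_int
```

Here "Hot" becomes 1, "Cold" becomes 2 and "Normal" becomes 3 in both columns, because the codes are simply positions in the shared vector.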
The examples below show how this works.
Example 1
Consider the following data frame −
x1 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
x2 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
x3 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3)
df1
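The following data frame is created (sample() draws at random, so the values will differ between runs unless a seed is fixed with set.seed()) −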
       x1     x2     x3
1     Hot Normal    Hot
2     Hot    Hot Normal
3     Hot Normal    Hot
4     Hot   Cold Normal
5     Hot   Cold    Hot
6     Hot    Hot   Cold
7     Hot   Cold   Cold
8  Normal   Cold   Cold
9     Hot Normal   Cold
10    Hot    Hot    Hot
11 Normal Normal    Hot
12 Normal Normal Normal
13    Hot    Hot   Cold
14 Normal   Cold   Cold
15    Hot    Hot    Hot
16   Cold    Hot Normal
17    Hot    Hot    Hot
18    Hot    Hot   Cold
19    Hot   Cold   Cold
20   Cold    Hot    Hot
To extract the unique values across all columns of df1, add the following code to the above snippet −
x1 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
x2 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
x3 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3)
Unique_df1 <- unique(c(as.character(df1$x1), as.character(df1$x2), as.character(df1$x3)))
Unique_df1
Output
Running the above snippet as a single program produces the following output −
[1] "Hot" "Normal" "Cold"
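The order of this vector matters, because it is used as the factor levels in the next step. As a small illustration with a made-up vector v (the names v, default_codes and ordered_codes are hypothetical): without an explicit levels argument, factor() sorts the levels, so the integer codes no longer follow the order of first appearance.

```r
v <- c("Hot", "Normal", "Cold", "Hot")

# Default: levels are sorted, so Cold = 1, Hot = 2, Normal = 3
default_codes <- as.numeric(factor(v))

# Explicit levels: codes follow first appearance, so Hot = 1, Normal = 2, Cold = 3
ordered_codes <- as.numeric(factor(v, levels = unique(v)))

default_codes
ordered_codes
```

Passing the same levels vector to every column is also what keeps the codes consistent across columns, even when a column happens not to contain every string.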
To convert the string values in df1 to unique numeric values, add the following code to the above snippet −
x1 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
x2 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
x3 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3)
# Unique strings across all three columns, in order of first appearance
Unique_df1 <- unique(c(as.character(df1$x1), as.character(df1$x2), as.character(df1$x3)))
# Replace each string with its position in Unique_df1
df1 <- data.frame(x1 = as.numeric(factor(df1$x1, levels = Unique_df1)),
                  x2 = as.numeric(factor(df1$x2, levels = Unique_df1)),
                  x3 = as.numeric(factor(df1$x3, levels = Unique_df1)))
df1
Output
Running the above snippet as a single program produces the following output −
   x1 x2 x3
1   1  2  1
2   1  1  2
3   1  2  1
4   1  3  2
5   1  3  1
6   1  1  3
7   1  3  3
8   2  3  3
9   1  2  3
10  1  1  1
11  2  2  1
12  2  2  2
13  1  1  3
14  2  3  3
15  1  1  1
16  3  1  2
17  1  1  1
18  1  1  3
19  1  3  3
20  3  1  1
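When a data frame has many columns, naming each one by hand becomes tedious. One possible variation, sketched here with lapply() on a small fixed data frame (the names df and lvls are illustrative, and this is an alternative to the code above rather than a replacement for it):

```r
# Illustrative data frame with fixed values
df <- data.frame(x1 = c("Hot", "Normal", "Cold"),
                 x2 = c("Cold", "Hot", "Hot"),
                 stringsAsFactors = FALSE)

# Unique strings across every column, in order of first appearance
lvls <- unique(unlist(lapply(df, as.character), use.names = FALSE))

# Convert each column with the shared levels; df[] keeps the data frame shape
df[] <- lapply(df, function(col) as.numeric(factor(col, levels = lvls)))
df
```

Assigning into df[] replaces the column contents while keeping the column names and the data frame class.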
Example 2
The following snippet creates a sample data frame −
y1 <- sample(c("Summer", "Rainy", "Winter", "Spring"), 20, replace = TRUE)
y2 <- sample(c("Summer", "Rainy", "Winter", "Spring"), 20, replace = TRUE)
y3 <- sample(c("Summer", "Rainy", "Winter", "Spring"), 20, replace = TRUE)
df2 <- data.frame(y1, y2, y3)
df2
The following data frame is created −
       y1     y2     y3
1   Rainy Winter  Rainy
2  Summer  Rainy Summer
3  Summer Spring Summer
4  Summer Spring Winter
5  Winter Winter  Rainy
6  Summer  Rainy Winter
7  Winter Winter  Rainy
8  Winter Summer Spring
9  Spring Summer Winter
10 Summer Summer Spring
11  Rainy  Rainy Spring
12  Rainy Winter Summer
13 Summer Spring Spring
14 Summer Summer Winter
15 Spring Spring Winter
16 Spring Spring Spring
17 Winter Spring Spring
18 Winter  Rainy Summer
19 Winter Spring Winter
20 Winter Summer Summer
To extract the unique values across all columns of df2, add the following code to the above snippet −
y1 <- sample(c("Summer", "Rainy", "Winter", "Spring"), 20, replace = TRUE)
y2 <- sample(c("Summer", "Rainy", "Winter", "Spring"), 20, replace = TRUE)
y3 <- sample(c("Summer", "Rainy", "Winter", "Spring"), 20, replace = TRUE)
df2 <- data.frame(y1, y2, y3)
Unique_df2 <- unique(c(as.character(df2$y1), as.character(df2$y2), as.character(df2$y3)))
Unique_df2
Output
Running the above snippet as a single program produces the following output −
[1] "Rainy" "Summer" "Winter" "Spring"
To convert the string values in df2 to unique numeric values, add the following code to the above snippet −
y1 <- sample(c("Summer", "Rainy", "Winter", "Spring"), 20, replace = TRUE)
y2 <- sample(c("Summer", "Rainy", "Winter", "Spring"), 20, replace = TRUE)
y3 <- sample(c("Summer", "Rainy", "Winter", "Spring"), 20, replace = TRUE)
df2 <- data.frame(y1, y2, y3)
# Unique strings across all three columns, in order of first appearance
Unique_df2 <- unique(c(as.character(df2$y1), as.character(df2$y2), as.character(df2$y3)))
# Replace each string with its position in Unique_df2
df2 <- data.frame(y1 = as.numeric(factor(df2$y1, levels = Unique_df2)),
                  y2 = as.numeric(factor(df2$y2, levels = Unique_df2)),
                  y3 = as.numeric(factor(df2$y3, levels = Unique_df2)))
df2
Output
Running the above snippet as a single program produces the following output −
   y1 y2 y3
1   1  3  1
2   2  1  2
3   2  4  2
4   2  4  3
5   3  3  1
6   2  1  3
7   3  3  1
8   3  2  4
9   4  2  3
10  2  2  4
11  1  1  4
12  1  3  2
13  2  4  4
14  2  2  3
15  4  4  3
16  4  4  4
17  3  4  4
18  3  1  2
19  3  4  3
20  3  2  2
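Because each integer is a position in the vector of unique strings, the conversion can be reversed by indexing that vector with the codes. A brief sketch, assuming the level order printed for Unique_df2 in this example and a made-up vector of codes (lvls, codes and decoded are illustrative names):

```r
# Same order as the Unique_df2 output shown in this example
lvls <- c("Rainy", "Summer", "Winter", "Spring")

# Illustrative integer codes, as they might appear in a converted column
codes <- c(1, 3, 2, 4, 4)

# Indexing the level vector by the codes recovers the original strings
decoded <- lvls[codes]
decoded
```

It is therefore worth keeping the vector of unique strings alongside the converted data frame if the original labels may be needed later.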