
Find Correlation Matrix of Groups for Data Table Object in R
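To find the correlation between groups, we can use the cor function, but it cannot be applied to the data.table object directly because the values of all groups are stored in a single column.
For this purpose, we first need to set the key on the group column of the data.table object and then extract each group's values by key. Since cor pairs the values by position, both groups must contain the same number of observations. For example, if we have a data.table DT with one numerical column x and one group column Group having 4 groups a, b, c, and d, then the correlation of the numerical values for groups a and b can be found as −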
setkey(DT,Group)
cor(DT["a"]$x,DT["b"]$x)
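The keyed lookup above can be sketched end to end as follows. This is a minimal illustration, not the article's own data: the table DT and its values are hypothetical, and the final line shows an equivalent subset that does not rely on the key.

```r
library(data.table)

# Hypothetical example data (not from the article): 4 groups, 5 values each
DT <- data.table(
  x     = c(2.1, 3.4, 1.9, 4.0, 2.8, 3.1, 2.2, 4.4, 3.0, 3.3,
            2.5, 3.9, 3.6, 2.9, 2.0, 4.1, 2.7, 3.5, 2.4, 3.8),
  Group = rep(c("a", "b", "c", "d"), 5)
)

setkey(DT, Group)   # sorts the table by Group and marks Group as the key

xa <- DT["a"]$x     # keyed lookup: rows where Group == "a"
xb <- DT["b"]$x     # keyed lookup: rows where Group == "b"

# cor() pairs values by position, so both groups must have the same length
stopifnot(length(xa) == length(xb))
r_keyed <- cor(xa, xb)

# Equivalent subset without using the key
r_plain <- cor(DT[Group == "a", x], DT[Group == "b", x])
```

Both approaches return the same value; the key mainly makes the character lookup DT["a"] possible and speeds up repeated subsetting on larger tables.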
Loading data.table package −
library(data.table)
Example
Consider the below data.table object −
x<-rnorm(20,1,0.04)
Class<-rep(LETTERS[1:2],10)
DT1<-data.table(x,Class)
DT1
Output
            x Class
 1: 1.0315869     A
 2: 1.0240505     B
 3: 0.9820461     A
 4: 1.0095865     B
 5: 1.0025895     A
 6: 1.0076078     B
 7: 1.0266381     A
 8: 0.9735519     B
 9: 1.0457029     A
10: 1.0407300     B
11: 1.0384560     A
12: 0.9798408     B
13: 0.9810080     A
14: 1.0602431     B
15: 0.9968140     A
16: 1.0239540     B
17: 0.9675810     A
18: 1.0723230     B
19: 0.9705898     A
20: 1.0713552     B
Finding the correlation between classes A and B −
Example
setkey(DT1,Class)
cor(DT1["A"]$x,DT1["B"]$x)
Output
[1] -0.6282066
Example
y<-rpois(20,5)
Group<-rep(c("S1","S2","S3","S4"),5)
DT2<-data.table(y,Group)
DT2
Output
    y Group
 1: 3    S1
 2: 3    S2
 3: 5    S3
 4: 7    S4
 5: 9    S1
 6: 6    S2
 7: 7    S3
 8: 6    S4
 9: 4    S1
10: 5    S2
11: 6    S3
12: 4    S4
13: 9    S1
14: 6    S2
15: 4    S3
16: 6    S4
17: 8    S1
18: 5    S2
19: 2    S3
20: 1    S4
Example
setkey(DT2,Group)
cor(DT2["S1"]$y,DT2["S2"]$y)
Output
[1] 0.8502303
Example
cor(DT2["S1"]$y,DT2["S3"]$y)
Output
[1] -0.1984965
Example
cor(DT2["S1"]$y,DT2["S4"]$y)
Output
[1] -0.1962715
Example
cor(DT2["S2"]$y,DT2["S3"]$y)
Output
[1] 0.1061191
Example
cor(DT2["S2"]$y,DT2["S4"]$y)
Output
[1] -0.1709964
Example
cor(DT2["S3"]$y,DT2["S4"]$y)
Output
[1] 0.6423677
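The six pairwise calls above can also be collapsed into a single correlation matrix. The sketch below is one possible approach rather than the article's method: it re-enters the y values printed for DT2 by hand, spreads each group into its own column with split, and passes the result to cor. The names wide and cor_mat are introduced here for illustration, and the approach assumes every group has the same number of observations.

```r
library(data.table)

# y values copied from the DT2 output shown above
y     <- c(3, 3, 5, 7, 9, 6, 7, 6, 4, 5, 6, 4, 9, 6, 4, 6, 8, 5, 2, 1)
Group <- rep(c("S1", "S2", "S3", "S4"), 5)
DT2   <- data.table(y, Group)

# One column per group; within each group the original row order is kept
wide <- as.data.table(split(DT2$y, DT2$Group))

# 4 x 4 correlation matrix of the groups
cor_mat <- cor(wide)
round(cor_mat, 7)
```

The off-diagonal entries correspond to the individual results above, for example cor_mat["S1","S2"] matches the value 0.8502303 obtained for S1 and S2.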