How to Calculate Correlation Between Multiple Variables in R?
Last Updated :
19 Dec, 2021
In this article, we will discuss how to calculate the correlation between multiple variables in the R programming language. Correlation measures the strength and direction of the linear relationship between two variables. The coefficient always lies between -1 and 1:
- The result is 0 if there is no linear correlation between the two variables
- The result is 1 if there is a perfect positive correlation between the two variables
- The result is -1 if there is a perfect negative correlation between the two variables
- Values in between indicate a weaker positive or negative relationship
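To make the coefficient concrete, here is a minimal sketch that computes Pearson's correlation by hand and compares it with the built-in cor() function. The vectors x and y are illustrative values chosen for this sketch, not data from the article.

```r
# Two small numeric vectors (illustrative values, assumed for this sketch)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Pearson's r: sum of cross-deviations divided by the square root of
# the product of the two sums of squared deviations
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Built-in equivalent; cor() uses method = "pearson" by default
r_builtin <- cor(x, y)

print(r_manual)
print(r_builtin)
```

Both values should agree up to floating-point rounding, since cor() implements the same formula by default.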
Let’s create an initial dataframe:
R
data <- data.frame(col1 = 1:10, col2 = 11:20,
                   col3 = 21:30, col4 = 1:10)
data
Output:
   col1 col2 col3 col4
1     1   11   21    1
2     2   12   22    2
3     3   13   23    3
4     4   14   24    4
5     5   15   25    5
6     6   16   26    6
7     7   17   27    7
8     8   18   28    8
9     9   19   29    9
10   10   20   30   10
Method 1: Correlation Between Two Variables
To calculate the correlation between two variables, call the cor() function from base R and pass the two variables as arguments. The function returns the correlation coefficient between them.
Syntax:
cor(dataframe$column1, dataframe$column2)
where,
- dataframe is the input dataframe
- column1 is the first column, which is correlated with column2
- column2 is the second column
Example:
In this example, we create a dataframe with 4 columns and 10 rows and use the cor() function to find the correlation between col1 and col2, between col1 and col3, between col1 and col4, and between col3 and col4.
R
data <- data.frame(col1 = 1:10, col2 = 11:20,
                   col3 = 21:30, col4 = 1:10)
print(cor(data$col1, data$col2))
print(cor(data$col1, data$col3))
print(cor(data$col1, data$col4))
print(cor(data$col3, data$col4))
Output:
[1] 1
[1] 1
[1] 1
[1] 1
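All four results are 1 because every column in this dataframe is an exact increasing linear function of the others. The sketch below shows how the result changes for other kinds of relationship. Reversing a column and cubing a column are choices made for this illustration only. The method argument of cor() accepts "pearson" (the default), "kendall", and "spearman".

```r
data <- data.frame(col1 = 1:10, col2 = 11:20)

# Reversing col2 makes it decrease as col1 increases:
# a perfect negative linear relationship
neg <- cor(data$col1, rev(data$col2))
print(neg)

# col1^3 always increases with col1, but not along a straight line
cubed <- data$col1^3

# Pearson measures linear association, so it falls below 1 here
pearson <- cor(data$col1, cubed, method = "pearson")
print(pearson)

# Spearman works on ranks, so any strictly increasing relationship gives 1
spearman <- cor(data$col1, cubed, method = "spearman")
print(spearman)
```

Pearson is the usual choice for linear relationships, while Spearman may be a better fit when the relationship is monotonic but curved.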
Method 2: Correlation Between Multiple Variables
To get the correlation among several variables at once, select the columns of interest by passing a vector of column names to the dataframe, then pass the result to the cor() function. The function returns a correlation matrix with one row and one column per selected variable.
Syntax:
cor(dataframe[, c('column1', 'column2', ..., 'columnN')])
Example:
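In this example, we use the cor() function to compute three correlation matrices: one for col1, col3, and col2; one for col1, col4, and col2; and one for col2, col3, and col4. Here col4 is no longer a linear function of the other columns, so its correlation with them is below 1.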
R
data <- data.frame(col1 = 1:10, col2 = 11:20,
                   col3 = 21:30,
                   col4 = c(1:5, 34, 56, 32, 23, 45))
print(cor(data[, c('col1', 'col3', 'col2')]))
print(cor(data[, c('col1', 'col4', 'col2')]))
print(cor(data[, c('col2', 'col3', 'col4')]))
Output:
     col1 col3 col2
col1    1    1    1
col3    1    1    1
col2    1    1    1
         col1     col4     col2
col1 1.000000 0.787662 1.000000
col4 0.787662 1.000000 0.787662
col2 1.000000 0.787662 1.000000
         col2     col3     col4
col2 1.000000 1.000000 0.787662
col3 1.000000 1.000000 0.787662
col4 0.787662 0.787662 1.000000
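The matrix returned by cor() can be indexed by row and column name when only one pair is of interest. The following sketch assumes the same dataframe as the example above. The variable name m and the rounding to 3 decimal places are choices made for this illustration.

```r
data <- data.frame(col1 = 1:10, col2 = 11:20,
                   col3 = 21:30,
                   col4 = c(1:5, 34, 56, 32, 23, 45))

# Correlation matrix for three selected columns
m <- cor(data[, c('col1', 'col4', 'col2')])

# Pick out one pair by row and column name
print(m['col1', 'col4'])

# The matrix is symmetric, so the order of the two names does not matter
print(m['col4', 'col1'])

# Round the whole matrix for a more compact display
print(round(m, 3))
```

The diagonal is always 1 because every variable is perfectly correlated with itself.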
Method 3: Correlation between all variables
To compute the correlation between all the variables in a data frame, pass the entire data frame to the cor() function. The function returns a correlation matrix covering every pair of columns. All columns must be numeric for this to work.
Syntax:
cor(dataframe)
Example:
In this example, we find the correlation between all the columns of the data frame.
R
data <- data.frame(col1 = 1:10, col2 = 11:20,
                   col3 = 21:30,
                   col4 = c(1:5, 34, 56, 32, 23, 45))
print(cor(data))
Output:
         col1     col2     col3     col4
col1 1.000000 1.000000 1.000000 0.787662
col2 1.000000 1.000000 1.000000 0.787662
col3 1.000000 1.000000 1.000000 0.787662
col4 0.787662 0.787662 0.787662 1.000000
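Passing a whole data frame to cor() only works when every column is numeric. Real data frames often contain text columns as well, so the sketch below shows one possible way to keep only the numeric columns first. The label column is hypothetical and added purely for this illustration.

```r
data <- data.frame(col1 = 1:10, col2 = 11:20,
                   col4 = c(1:5, 34, 56, 32, 23, 45),
                   label = letters[1:10])  # hypothetical non-numeric column

# sapply() returns one TRUE/FALSE per column; use it to keep numeric columns
numeric_cols <- data[sapply(data, is.numeric)]

# cor() now sees only col1, col2, and col4
result <- cor(numeric_cols)
print(result)
```

Calling cor(data) directly on this data frame would stop with an error because of the character column.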