Python statistics.correlation() Function



The Python statistics.correlation() function returns Pearson's correlation coefficient for two inputs. The coefficient measures the strength and direction of the linear relationship between two variables, x and y, and always lies between -1 and +1.

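A coefficient greater than zero indicates a positive relationship, a coefficient less than zero indicates a negative relationship, and a coefficient of zero indicates no linear relationship. With method='ranked' (available since Python 3.12), the function instead computes Spearman's rank correlation, which measures the strength of a monotonic relationship.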

Both inputs must have the same length (at least two data points), and neither input may be constant; otherwise, a StatisticsError is raised. A value close to zero indicates a weak linear relationship between the two variables being compared. Pearson's correlation coefficient is defined mathematically as follows −

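$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} $$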

Syntax

Following is the basic syntax for the statistics.correlation() function.

statistics.correlation(x, y, /, *, method='linear')

Parameters

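This function takes two positional-only parameters, x and y, which are the two sequences of numeric data to be compared. The optional keyword-only parameter method selects the coefficient: 'linear' (the default) for Pearson's correlation, or 'ranked' for Spearman's rank correlation.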

Return Value

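The correlation() function returns the correlation coefficient for the given inputs as a float between -1 and +1. With the default method, this is Pearson's correlation coefficient.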

Example 1

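This example determines Pearson's correlation coefficient for two inputs, x and y. Note that the code uses NumPy's np.corrcoef rather than statistics.correlation(): np.corrcoef returns the full correlation matrix, whose off-diagonal entries hold the coefficient between x and y.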

import numpy as np

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [5, 6, 7, 8, 9, 1, 2, 3, 4]

# np.corrcoef returns the 2x2 correlation matrix of x and y
z = np.corrcoef(x, y)
print("The Pearson's coefficient inputs are:", z)

Output

The result is produced as follows −

The Pearson's coefficient inputs are: [[ 1.  -0.5]
 [-0.5  1. ]]
 

Example 2

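In the example below, each input contains a single value. NumPy's np.corrcoef returns NaN (Not a Number) in this case, and the warnings it would emit are suppressed. Note that statistics.correlation() itself does not return NaN here; it raises a StatisticsError because it requires at least two data points.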

import numpy as np
import warnings

# Suppress the runtime warnings NumPy emits for a single data point
warnings.filterwarnings('ignore')

x = [2]
y = [4]
z = np.corrcoef(x, y)
print("The Pearson's coefficient inputs are:", z)

Output

The output is obtained as follows −

The Pearson's coefficient inputs are: [[nan nan]
 [nan nan]]

Example 3

Here, we calculate the correlation between two NumPy arrays using a hand-written Pearson_correlation function that applies Pearson's formula directly, rather than calling statistics.correlation().

import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([10, 20, 30, 40, 50, 60])

def Pearson_correlation(x, y):
    # Pearson's formula needs paired observations
    if len(x) == len(y):
        # Sum of the products of deviations from the means
        sum_xy = sum((x - x.mean()) * (y - y.mean()))
        # Sums of squared deviations for each input
        sum_x_squared = sum((x - x.mean()) ** 2)
        sum_y_squared = sum((y - y.mean()) ** 2)
        corr = sum_xy / np.sqrt(sum_x_squared * sum_y_squared)
        return corr
    else:
        return "Arrays must be of the same length."

print(Pearson_correlation(x, y))
print(Pearson_correlation(x, x))

Output

We will get the output as follows −

1.0
1.0