Python statistics.correlation() Function



The Python statistics.correlation() function returns Pearson's correlation coefficient for two inputs. The coefficient measures the strength and direction of the linear relationship between two variables, x and y, and ranges from -1 to +1.

A coefficient greater than zero indicates a positive relationship, and a coefficient less than zero indicates a negative relationship. From Python 3.12 onward, passing method='ranked' computes Spearman's rank correlation instead, which measures the strength of a monotonic relationship rather than a strictly linear one.

Both inputs must be of the same length (at least two data points), and neither input may be constant; otherwise, a StatisticsError is raised. A value close to zero indicates a weak relationship between the two variables being compared. The mathematical representation of Pearson's correlation coefficient is as follows −

$$ r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \, \sum_{i}(y_i - \bar{y})^2}} $$

Syntax

Following is the basic syntax for the statistics.correlation() function.

statistics.correlation(x, y, /, *, method='linear')

Parameters

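This function takes two positional parameters, x and y, which are the sequences of numeric data whose correlation is to be determined. The optional keyword-only method parameter (available from Python 3.12) accepts 'linear' (the default, Pearson's coefficient) or 'ranked' (Spearman's rank correlation).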

Return Value

The correlation() function returns the correlation coefficient of the given inputs as a float between -1 and +1.

Example 1

The following example uses the statistics.correlation() function to determine Pearson's correlation coefficient for two inputs, x and y.

import statistics
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [5, 6, 7, 8, 9, 1, 2, 3, 4]
z = statistics.correlation(x, y)
print("The Pearson's correlation coefficient is:", z)

Output

The result is produced as follows −

The Pearson's correlation coefficient is: -0.5

Example 2

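In the example below, each input contains only a single value. Because statistics.correlation() requires at least two data points, it raises a StatisticsError. (NumPy's corrcoef() function, by comparison, returns NaN for the same inputs.)

import statistics
x = [2]
y = [4]
try:
    z = statistics.correlation(x, y)
    print("The Pearson's correlation coefficient is:", z)
except statistics.StatisticsError:
    # Raised because each input has fewer than two data points
    print("StatisticsError: at least two data points are required")

Output

The output is obtained as follows −

StatisticsError: at least two data points are required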

Example 3

Here, we calculate Pearson's correlation coefficient manually with NumPy, applying the same formula that statistics.correlation() evaluates.

import numpy as np
x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([10, 20, 30, 40, 50, 60])

def pearson_correlation(x, y):
    # The formula is only defined for inputs of equal length
    if len(x) != len(y):
        raise ValueError("Arrays must be of the same length.")
    # Deviation of each value from its mean
    dx = x - x.mean()
    dy = y - y.mean()
    # Sum of cross-products divided by the product of the spreads
    return np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

print(pearson_correlation(x, y))
print(pearson_correlation(x, x))

Output

We will get the output as follows −

1.0
1.0