Covariance Matrix

Last Updated : 23 Jul, 2025

A covariance matrix is a square matrix that holds the covariance between every pair of elements of a random vector. It is also known as the variance-covariance matrix because the variance of each element lies along the matrix's main diagonal and the covariances occupy the off-diagonal positions.

It is particularly important in fields such as data science, machine learning, and finance, where understanding relationships between multiple variables is crucial. It is also used in stochastic modeling and principal component analysis.

It helps us identify the direction of the relationship (positive or negative) between variables. Covariance matrices are also essential for understanding high-dimensional datasets.

Table of Content

Mathematical Definition
Covariance Matrix Example
Covariance Matrix Formula

2 ⨯ 2 Covariance Matrix
3 ⨯ 3 Covariance Matrix

How to Find the Covariance Matrix?
Properties of the Covariance Matrix
Solved Examples on Covariance Matrix

Mathematical Definition

The variance-covariance matrix is a square matrix with diagonal elements that represent the variance and the non-diagonal components that express covariance.

The covariance between two variables can take any real value: positive, negative, or zero.
A positive covariance suggests that the two variables tend to increase or decrease together, whereas a negative covariance indicates that one tends to decrease as the other increases.
If two variables have no linear relationship, their covariance is zero.
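The sign behaviour described above can be seen in a minimal sketch. The data lists and the helper name `sample_cov` are illustrative assumptions, not taken from the article:

```python
def sample_cov(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Made-up data for illustration only
hours = [1, 2, 3, 4]
score = [50, 60, 70, 80]   # rises as hours rise
errors = [8, 6, 4, 2]      # falls as hours rise
noise = [5, 7, 7, 5]       # no linear trend with hours

print(sample_cov(hours, score))   # positive value
print(sample_cov(hours, errors))  # negative value
print(sample_cov(hours, noise))   # 0.0
```

Only the sign is interpreted here. The magnitude depends on the units of the variables, so it does not by itself say how strong the relationship is.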

Covariance Matrix Example

Let's say there are two data sets, X = [10, 5] and Y = [3, 9]. The sample variance of Set X is 12.5 and the sample variance of Set Y is 18. The sample covariance between both variables is -15. The covariance matrix is as follows:

\begin{bmatrix} Variance~of~Set~X & Covariance~of~Both~Sets\\ Covariance~of~Both~Sets& Variance~of~Set~Y \end{bmatrix}=\begin{bmatrix} 12.5 & -15\\ -15& 18 \end{bmatrix}

Covariance Matrix Formula

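The general form of a covariance matrix for k variables x_1, x_2, ..., x_k is given as follows:

\begin{bmatrix} \mathrm{var}(x_1) & \mathrm{cov}(x_1, x_2) & \cdots & \mathrm{cov}(x_1, x_k) \\ \mathrm{cov}(x_2, x_1) & \mathrm{var}(x_2) & \cdots & \mathrm{cov}(x_2, x_k) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}(x_k, x_1) & \mathrm{cov}(x_k, x_2) & \cdots & \mathrm{var}(x_k) \end{bmatrix}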

Where,

Sample Variance: var(x) = \frac{\sum_{i=1}^{n}\left ( x_{i} -\overline{x}\right )^{2} }{n-1}
Sample Covariance: cov(x, y) = \frac{\sum_{i=1}^{n}\left (x_{i} -\overline{x}\right )\left(y_{i}-\overline{y}\right)}{n-1}
Population Variance: var(x) = \frac{\sum_{i=1}^{n}\left ( x_{i} -\mu\right )^{2} }{n}
Population Covariance: cov(x, y) = \frac{\sum_{i=1}^{n}\left ( x_{i} -\mu_{x}\right )\left ( y_{i}-\mu_{y} \right ) }{n}

Here,

μ is the Mean of the Population (μ_x and μ_y for variables x and y)
\overline{x} is the Mean of the Sample (\overline{y} for variable y)
n is the Number of Observations
x_i and y_i are the i-th Observations of variables x and y
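The four formulas differ only in the divisor, which a short sketch can make concrete. The function names `variance` and `covariance`, the `sample` flag, and the data are assumptions chosen for illustration:

```python
def variance(xs, sample=True):
    """Variance of xs: divide by n - 1 for a sample, by n for a population."""
    n = len(xs)
    mean = sum(xs) / n
    squared = sum((x - mean) ** 2 for x in xs)
    return squared / (n - 1 if sample else n)

def covariance(xs, ys, sample=True):
    """Covariance of xs and ys with the same choice of divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return products / (n - 1 if sample else n)

# Made-up data for illustration only
x = [2, 4, 6]
y = [1, 5, 6]

print(variance(x), variance(x, sample=False))            # sample vs population
print(covariance(x, y), covariance(x, y, sample=False))  # sample vs population
```

For the same data, the sample value is always the larger of the two in magnitude, because it divides the same sum by n - 1 instead of n.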

Let's see the format of the Covariance Matrix of 2 ⨯ 2 and 3 ⨯ 3.

2 ⨯ 2 Covariance Matrix

We know that in a 2 ⨯ 2 matrix there are two rows and two columns. Hence, the 2 ⨯ 2 Covariance Matrix can be expressed as \begin{bmatrix}\mathrm{var(x)}& \mathrm{cov(x,y)} \\\mathrm{cov(x,y)} &\mathrm{var(y)}\end{bmatrix}

3 ⨯ 3 Covariance Matrix

In a 3⨯3 Matrix, there are 3 rows and 3 columns. We know that in a Covariance Matrix, the diagonal elements are variance, and the non-diagonal elements are covariance. Hence, a 3⨯3 Covariance Matrix can be given as \begin{bmatrix}\mathrm{var(x)}&\mathrm{cov(x,y)} &\mathrm{cov(x,z)} \\\mathrm{cov(x,y)} &\mathrm{var(y)} &\mathrm{cov(y,z)} \\\mathrm{cov(x,z)} &\mathrm{cov(y,z)} &\mathrm{var(z)} \\\end{bmatrix}
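The same layout extends to any number of variables. Below is a rough sketch of a builder for the full matrix. The function name `covariance_matrix` and the three data columns are hypothetical and chosen only to show the structure:

```python
def covariance_matrix(columns, sample=True):
    """Return the p x p covariance matrix for a list of p equal-length columns."""
    n = len(columns[0])
    divisor = n - 1 if sample else n
    means = [sum(col) / n for col in columns]
    p = len(columns)
    matrix = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            total = sum(
                (a - means[i]) * (b - means[j])
                for a, b in zip(columns[i], columns[j])
            )
            # Fill both halves at once: cov(x, y) equals cov(y, x)
            matrix[i][j] = matrix[j][i] = total / divisor
    return matrix

# Made-up data for illustration only
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]   # moves with x
z = [4, 3, 2, 1]   # moves against x

m = covariance_matrix([x, y, z])
for row in m:
    print(row)
```

The diagonal entries come from the case i = j, where the product of deviations becomes a squared deviation, so the variance needs no separate code path.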

How to Find the Covariance Matrix?

The dimensions of a covariance matrix are determined by the number of variables in a given data set. If there are only two variables in a set, then the covariance matrix would have two rows and two columns. Similarly, if a data set has three variables, then its covariance matrix would have three rows and three columns.

Example: The table below gives the marks scored by Anna, Caroline, and Laura in Psychology and History. Construct the sample covariance matrix.

Student	Psychology(X)	History(Y)
Anna	80	70
Caroline	63	20
Laura	100	50

The following steps have to be followed:

Step 1: Find the mean of variable X. Sum up all the observations in variable X and divide the sum by the number of observations. Thus, (80 + 63 + 100)/3 = 81.
Step 2: Subtract the mean from all observations. (80 - 81), (63 - 81), (100 - 81).
Step 3: Take the squares of the differences obtained above and then add them up. Thus, (80 - 81)² + (63 - 81)² + (100 - 81)².
Step 4: Find the variance of X by dividing the value obtained in Step 3 by 1 less than the total number of observations. var(X) = [(80 - 81)² + (63 - 81)² + (100 - 81)²] / (3 - 1) = 343.
Step 5: Similarly, repeat steps 1 to 4 to calculate the variance of Y. The mean of Y is (70 + 20 + 50)/3 ≈ 46.67, and var(Y) = 633.333.
Step 6: Choose a pair of variables, here X and Y.
Step 7: Subtract the mean of the first variable (X) from all observations; (80 - 81), (63 - 81), (100 - 81).
Step 8: Repeat the same for variable Y; (70 - 46.67), (20 - 46.67), (50 - 46.67).
Step 9: Multiply the corresponding terms: (80 - 81)(70 - 46.67), (63 - 81)(20 - 46.67), (100 - 81)(50 - 46.67).
Step 10: Find the covariance by adding these values and dividing the sum by (n - 1). Cov(X, Y) = [(80 - 81)(70 - 46.67) + (63 - 81)(20 - 46.67) + (100 - 81)(50 - 46.67)]/(3 - 1) = 260.
Step 11: Use the general formula for the covariance matrix to arrange the terms. The matrix becomes: \begin{bmatrix} 343 & 260\\ 260& 633.333 \end{bmatrix}
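The steps above can be sketched in plain Python for the Psychology and History marks. The variable names are assumptions made for this illustration, and the code uses the unrounded mean of Y:

```python
psychology = [80, 63, 100]  # X
history = [70, 20, 50]      # Y
n = len(psychology)

# Steps 1-2: means and deviations from the mean
mean_x = sum(psychology) / n
mean_y = sum(history) / n
dev_x = [x - mean_x for x in psychology]
dev_y = [y - mean_y for y in history]

# Steps 3-5: sample variances (divide by n - 1)
var_x = sum(d ** 2 for d in dev_x) / (n - 1)
var_y = sum(d ** 2 for d in dev_y) / (n - 1)

# Steps 6-10: sample covariance of the pair (X, Y)
cov_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y)) / (n - 1)

# Step 11: arrange the terms in the 2 x 2 layout
matrix = [[var_x, cov_xy], [cov_xy, var_y]]
print([[round(value, 3) for value in row] for row in matrix])
```

Rounded to three decimals, the printed entries should agree with the matrix obtained in Step 11.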

Properties of the Covariance Matrix

The Properties of the Covariance Matrix are mentioned below:

A covariance matrix is always square, implying that the number of rows in a covariance matrix is always equal to the number of columns in it.
A covariance matrix is always symmetric, implying that the transpose of a covariance matrix is always equal to the original matrix.
A covariance matrix is always positive semi-definite.
The eigenvalues of a covariance matrix are always real and non-negative.
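As a quick check of these properties, the sketch below tests the 2 ⨯ 2 matrix from the marks example. It relies on the closed-form eigenvalues of a symmetric 2 ⨯ 2 matrix, which is a shortcut assumed here for illustration. Larger matrices would normally need a numerical linear algebra routine.

```python
import math

sigma = [[343.0, 260.0], [260.0, 633.333]]  # matrix from the marks example

# Square: as many rows as columns
is_square = all(len(row) == len(sigma) for row in sigma)

# Symmetric: the transpose equals the original
is_symmetric = sigma[0][1] == sigma[1][0]

# Eigenvalues of a symmetric matrix [[a, b], [b, d]]:
# lambda = (a + d)/2 ± sqrt(((a - d)/2)^2 + b^2)
a, b, d = sigma[0][0], sigma[0][1], sigma[1][1]
centre = (a + d) / 2
radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
eigenvalues = (centre - radius, centre + radius)

# Positive semi-definite: no eigenvalue is negative
is_psd = all(value >= 0 for value in eigenvalues)

print(is_square, is_symmetric, is_psd)
print(eigenvalues)
```

The expression under the square root is a sum of squares, which is why the eigenvalues of a symmetric matrix are real.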

Real-Life Applications of the Covariance Matrix

Stock Market - Investors use a covariance matrix to see how different stocks move together.
Weather Forecasting - Meteorologists use it to study how temperature, humidity, and wind speed are related over time.
Image Processing - In facial recognition, covariance helps identify patterns and differences between images by analyzing relationships between pixels.
Economics - Economists use it to analyse how different economic indicators move together.

Solved Examples on Covariance Matrix

Example 1: The marks scored by 3 students in Physics and Biology are given below:

Student	Physics(X)	Biology(Y)
A	92	80
B	60	30
C	100	70

Calculate the Covariance Matrix from the above data.

Solution:

Sample variance is given by \frac{\sum_{i=1}^{n}\left ( x_{i} -\overline{x}\right )^{2} }{n-1} .
Here, \overline{x} = 84, n = 3
var(x) = [(92 - 84)² + (60 - 84)² + (100 - 84)²] / (3 - 1) = 448
Also, \overline{y} = 60, n = 3
var(y) = [(80 - 60)² + (30 - 60)² + (70 - 60)²] / (3 - 1) = 700
Now, cov(x, y) = cov(y, x) = [(92 - 84)(80 - 60) + (60 - 84)(30 - 60) + (100 - 84)(70 - 60)] / (3 - 1) = 520.
The sample covariance matrix is given as: \begin{bmatrix} 448 & 520\\ 520& 700 \end{bmatrix}

Example 2. Prepare the population covariance matrix from the following table:

Age	Number of People
29	68
26	60
30	58
35	40

Solution:

Let x be the Number of People and y be the Age.
Population variance is given by \frac{\sum_{i=1}^{n}\left ( x_{i} -\mu\right )^{2} }{n} .
Here, μ_x = 56.5, n = 4
var(x) = [(68 - 56.5)² + (60 - 56.5)² + (58 - 56.5)² + (40 - 56.5)²] / 4 = 104.75
Also, μ_y = 30, n = 4
var(y) = [(29 - 30)² + (26 - 30)² + (30 - 30)² + (35 - 30)²] / 4 = 10.5
Now, cov(x, y) = \frac{\sum_{i=1}^{4}\left ( x_{i} -\mu_{x}\right )\left ( y_{i}-\mu_{y} \right ) }{4}
cov(x, y) = [(68 - 56.5)(29 - 30) + (60 - 56.5)(26 - 30) + (58 - 56.5)(30 - 30) + (40 - 56.5)(35 - 30)] / 4 = -27
The population covariance matrix is given as: \begin{bmatrix} 104.75 &-27 \\ -27& 10.5 \end{bmatrix}
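The population version of the calculation can be sketched as follows. The only change from the sample case is that every sum is divided by n. The variable names are assumptions for this illustration, with x taken as the Number of People and y as the Age:

```python
people = [68, 60, 58, 40]  # x: Number of People
age = [29, 26, 30, 35]     # y: Age
n = len(people)

mu_x = sum(people) / n
mu_y = sum(age) / n

# Population formulas divide by n, not n - 1
var_x = sum((x - mu_x) ** 2 for x in people) / n
var_y = sum((y - mu_y) ** 2 for y in age) / n
cov_xy = sum((x - mu_x) * (y - mu_y) for x, y in zip(people, age)) / n

matrix = [[var_x, cov_xy], [cov_xy, var_y]]
print(matrix)  # [[104.75, -27.0], [-27.0, 10.5]]
```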

Example 3. Interpret the following covariance matrix:

\begin{bmatrix} & X & Y & Z\\ X & 60 & 32 & -4\\ Y & 32 & 30 & 0\\ Z & -4 & 0 & 80 \end{bmatrix}

Solution:

The diagonal elements 60, 30, and 80 indicate the variance in data sets X, Y, and Z respectively. Y shows the lowest variance whereas Z displays the highest variance.
The covariance for X and Y is 32. As this is a positive number, it means that when X increases (or decreases), Y also tends to increase (or decrease).
The covariance for X and Z is -4. As it is a negative number, it implies that when X increases, Z tends to decrease, and vice versa.
The covariance for Y and Z is 0. This means that there is no linear relationship between the two data sets.
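The same reading of the matrix can be automated. The sketch below is only one possible way to do it. The helper name `describe` and its wording are assumptions made for this illustration:

```python
names = ["X", "Y", "Z"]
sigma = [
    [60, 32, -4],
    [32, 30, 0],
    [-4, 0, 80],
]

def describe(sigma, names):
    """List the variances (diagonal) and the sign of each covariance (upper triangle)."""
    lines = [f"var({name}) = {sigma[i][i]}" for i, name in enumerate(names)]
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            value = sigma[i][j]
            if value > 0:
                trend = "tend to move in the same direction"
            elif value < 0:
                trend = "tend to move in opposite directions"
            else:
                trend = "show no linear relationship"
            lines.append(
                f"cov({names[i]}, {names[j]}) = {value}: "
                f"{names[i]} and {names[j]} {trend}"
            )
    return lines

summary = describe(sigma, names)
for line in summary:
    print(line)
```

Only the upper triangle is visited because the matrix is symmetric, so the lower triangle would repeat the same statements.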

Example 4. Find the sample covariance matrix for the following data:

X	Y	Z
75	10.5	45
65	12.8	65
22	7.3	74
15	2.1	76
18	9.2	56

Solution:

Sample variance is given by \frac{\sum_{i=1}^{n}\left ( x_{i} -\overline{x}\right )^{2} }{n-1} and sample covariance by \frac{\sum_{i=1}^{n}\left ( x_{i} -\overline{x}\right )\left ( y_{i}-\overline{y}\right ) }{n-1} .
n = 5,
\overline{x} = 39, var(X) = 3278 / (5 - 1) = 819.5
\overline{y} = 8.38, var(Y) = 65.308 / 4 = 16.327
\overline{z} = 63.2, var(Z) = 666.8 / 4 = 166.7
Now, cov(X, Y) = \frac{\sum_{i=1}^{5}\left ( x_{i} -39\right )\left ( y_{i}-8.38\right ) }{5-1} = 343.1 / 4 = 85.775
⇒ cov(X, Z) = \frac{\sum_{i=1}^{5}\left ( x_{i} -39\right )\left ( z_{i}-63.2 \right ) }{5-1} = -948 / 4 = -237
⇒ cov(Y, Z) = \frac{\sum_{i=1}^{5}\left ( y_{i} -8.38\right )\left ( z_{i}-63.2 \right ) }{5-1} = -128.58 / 4 = -32.145
The covariance matrix is given as:
\begin{bmatrix} 819.5 & 85.775 & -237 \\ 85.775 & 16.327 & -32.145\\ -237 & -32.145 & 166.7 \end{bmatrix}

Practice Problems on Covariance Matrix

Problem 1: Given two sets of data points: X = [2, 4, 6, 8, 10] and Y = [1, 3, 5, 7, 9], calculate the covariance between X and Y.

Problem 2: Calculate the covariance matrix for the following dataset:

X₁	X₂	X₃
4	2	0
4	5	6
8	10	12
12	9	6

Problem 3: Given the covariance matrix:

\Sigma = \begin{bmatrix} 4 & -2 & 0 \\ -2 & 3 & 1 \\ 0 & 1 & 5 \\ \end{bmatrix}

Identify the variances and covariances between the variables.

Problem 4: Handling Missing Data: How do you compute the covariance matrix when some data points are missing? Provide an example with missing values.

Problem 5: Covariance Matrix in Multivariate Normal Distribution: Explain the role of the covariance matrix in defining the shape of the multivariate normal distribution.

Problem 6: Visualizing Covariance Matrix: Plot the covariance matrix for a dataset using Python. Provide a code example and explain the visualization.

Problem 7: Covariance Matrix for Time Series Data: How do you calculate and interpret the covariance matrix for time series data? Provide a sample time series dataset and the corresponding covariance matrix.

Conclusion

The covariance matrix is a fundamental concept in statistics and data analysis, providing insight into the relationships between multiple variables.
It captures the extent to which pairs of variables change together and is essential for applications such as Principal Component Analysis (PCA), financial analysis, and multivariate statistics.
By understanding and interpreting the covariance matrix, one can make more informed decisions and uncover deeper patterns in data.


prabhjotkushparmar
