
How To Create A Correlation Matrix Using Pandas - Data To Fish

This document provides a 4-step process to create and visualize a correlation matrix using Pandas, Seaborn, and Matplotlib in Python. Step 1 involves collecting the data into a dataset. Step 2 creates a DataFrame from the data. Step 3 generates the correlation matrix using df.corr(). Step 4 (optional) produces a visual heatmap of the correlation matrix using Seaborn and Matplotlib.

Uploaded by

intluser
Copyright
© All Rights Reserved

How to Create a Correlation Matrix using Pandas

datatofish.com/correlation-matrix-pandas

In this short guide, I’ll show you how to create a Correlation Matrix using Pandas. I’ll also
review the steps to display the matrix using Seaborn and Matplotlib.

To start, here is a template that you can apply in order to create a correlation matrix using
pandas:

df.corr()

Step 1: Collect the Data


For example, I collected the following data about 3 variables:

A    B    C
45   38   10
37   31   15
42   26   17
35   28   21
39   33   12

Step 2: Create a DataFrame using Pandas


Next, create a DataFrame in order to capture the above dataset in Python:

import pandas as pd

data = {'A': [45,37,42,35,39],
        'B': [38,31,26,28,33],
        'C': [10,15,17,21,12]
        }

df = pd.DataFrame(data,columns=['A','B','C'])
print(df)

Once you run the code, you’ll get the following DataFrame:

    A   B   C
0  45  38  10
1  37  31  15
2  42  26  17
3  35  28  21
4  39  33  12

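Correlation is only defined for numeric columns, so it can be worth checking the column types before moving on. The sketch below is an illustration rather than part of the original example: it adds a hypothetical text column named label, then uses select_dtypes to keep only the numeric columns. Recent pandas versions raise an error when corr() meets a text column, while older versions dropped such columns silently, so filtering explicitly behaves the same either way.

```python
import pandas as pd

data = {'A': [45, 37, 42, 35, 39],
        'B': [38, 31, 26, 28, 33],
        'C': [10, 15, 17, 21, 12]}
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

# Hypothetical extra column, added only to illustrate mixed column types
df_mixed = df.assign(label=['v', 'w', 'x', 'y', 'z'])

# Keep only the numeric columns before computing correlations
numeric_df = df_mixed.select_dtypes(include='number')

print(df_mixed.dtypes)
print(list(numeric_df.columns))
```

In this guide's dataset all three columns are already integers, so this step is a no-op there.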
Step 3: Create a Correlation Matrix using Pandas

Now, create a correlation matrix using this template:

df.corr()
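By default, df.corr() computes the Pearson correlation coefficient. The method parameter also accepts 'spearman' and 'kendall' for rank-based correlation. Here is a minimal sketch on the same example data; the variable names pearson and spearman are only illustrative:

```python
import pandas as pd

data = {'A': [45, 37, 42, 35, 39],
        'B': [38, 31, 26, 28, 33],
        'C': [10, 15, 17, 21, 12]}
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

# Pearson is the default, so this is the same as df.corr(method='pearson')
pearson = df.corr()

# Spearman correlates the ranks of the values instead of the raw values
spearman = df.corr(method='spearman')

print(pearson)
print(spearman)
```

Rank-based methods are less sensitive to outliers and to relationships that are monotonic but not linear, so they can be a reasonable choice when the data is not roughly linear.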

This is the complete Python code that you can use to create the correlation matrix for our
example:

import pandas as pd

data = {'A': [45,37,42,35,39],
        'B': [38,31,26,28,33],
        'C': [10,15,17,21,12]
        }

df = pd.DataFrame(data,columns=['A','B','C'])

corrMatrix = df.corr()
print(corrMatrix)

Run the code in Python, and you’ll get the following matrix:

          A         B         C
A  1.000000  0.518457 -0.701886
B  0.518457  1.000000 -0.860941
C -0.701886 -0.860941  1.000000

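Each entry in the matrix is a Pearson correlation coefficient: the covariance of two columns divided by the product of their standard deviations. As a sanity check, the sketch below recomputes the A/B entry with NumPy and compares it with the value pandas reports. The helper name pearson_r is an illustrative assumption, not a pandas function.

```python
import numpy as np
import pandas as pd

data = {'A': [45, 37, 42, 35, 39],
        'B': [38, 31, 26, 28, 33],
        'C': [10, 15, 17, 21, 12]}
df = pd.DataFrame(data, columns=['A', 'B', 'C'])
corrMatrix = df.corr()

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviations from the mean of x
    dy = y - y.mean()   # deviations from the mean of y
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

manual_ab = pearson_r(df['A'], df['B'])

# Look up a single coefficient in the matrix by row and column label
pandas_ab = corrMatrix.loc['A', 'B']

print(round(manual_ab, 6), round(pandas_ab, 6))
```

The two numbers should agree up to floating-point rounding. The same .loc lookup works for any pair of columns, and because the matrix is symmetric, corrMatrix.loc['B', 'A'] gives the same value.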

Step 4 (optional): Get a Visual Representation of the Correlation Matrix using Seaborn and Matplotlib

You can use the seaborn and matplotlib packages in order to get a visual representation of
the correlation matrix.

First import the seaborn and matplotlib packages:

import seaborn as sn
import matplotlib.pyplot as plt

Then, add the following syntax at the bottom of the code:

sn.heatmap(corrMatrix, annot=True)
plt.show()

So the complete Python code would look like this:

import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt

data = {'A': [45,37,42,35,39],
        'B': [38,31,26,28,33],
        'C': [10,15,17,21,12]
        }

df = pd.DataFrame(data,columns=['A','B','C'])

corrMatrix = df.corr()
sn.heatmap(corrMatrix, annot=True)
plt.show()

Run the code, and you’ll get a heatmap of the correlation matrix, with each cell annotated with its correlation coefficient.
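Because correlation coefficients always fall between -1 and 1, it can help to pin the color scale to that range so that colors stay comparable from one plot to the next. The sketch below is one possible variation, not part of the original guide: it passes vmin, vmax, a diverging colormap, and a number format to heatmap, and wraps the plotting in a hypothetical helper named plot_corr_heatmap.

```python
import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt

data = {'A': [45, 37, 42, 35, 39],
        'B': [38, 31, 26, 28, 33],
        'C': [10, 15, 17, 21, 12]}
df = pd.DataFrame(data, columns=['A', 'B', 'C'])
corrMatrix = df.corr()

def plot_corr_heatmap(corr):
    """Draw an annotated heatmap with the color scale fixed to [-1, 1]."""
    fig, ax = plt.subplots()
    sn.heatmap(corr, annot=True, fmt='.2f',      # two decimals per cell
               vmin=-1, vmax=1, cmap='coolwarm',  # fixed, diverging scale
               ax=ax)
    ax.set_title('Correlation matrix')
    return ax

ax = plot_corr_heatmap(corrMatrix)
# Call plt.show() to display the figure, or ax.figure.savefig(...) to save it
```

With a diverging colormap and a scale centered on zero, positive and negative correlations get visibly different hues, which makes the sign of each relationship easier to read at a glance.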
