Visualization

Once we have cleaned the dataset, we will perform exploratory data analysis (EDA) to understand what the data reveals and find relationships between features and patterns. EDA uses pair plots with histograms along the diagonal to show each variable's distribution and scatter plots to depict relationships between two variables, with categorical variables represented by color. The pair plots help provide a clear understanding of variable distributions and their relationships to each other and reveal which features are highly correlated.

Uploaded by

Pravin

VISUALISATION EXPLANATION

Once we have a clean dataset, our next step is Exploratory Data
Analysis (EDA). EDA is the process of understanding what the data is
trying to tell us. It is used to find relationships between features and
patterns in the data. We also look for anomalies in the data, since
handling them can improve accuracy. There are many ways to carry out
EDA; we chose pair plots. Pair plots give us a clear understanding of
the distribution of variables and the relationships between them.

Pair plots consist of two basic representations, namely the histogram
and the scatter plot. The histograms on the diagonal give us information
about the distribution of single variables, while the scatter plots show
the relationship between two variables. Points are coloured by a
categorical variable.

From the plot in the second row, last column, we can see that the
features f2 and f5 are highly correlated. The overall grid of plots
suggests that all of these features are highly correlated with one
another.
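
A visual impression of correlation can be checked numerically. The sketch below computes the Pearson correlation coefficient in plain Python for synthetic data; the values generated for f2, f5 and f1 are assumptions made for illustration and do not reproduce the original features.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic stand-in features (hypothetical values).
random.seed(0)
f2 = [random.gauss(0, 1) for _ in range(200)]
f5 = [2.0 * x + random.gauss(0, 0.2) for x in f2]  # built from f2
f1 = [random.gauss(0, 1) for _ in range(200)]      # independent of f2

r_f2_f5 = pearson(f2, f5)  # expected to be close to 1
r_f2_f1 = pearson(f2, f1)  # expected to be close to 0

print(f"corr(f2, f5) = {r_f2_f5:.3f}")
print(f"corr(f2, f1) = {r_f2_f1:.3f}")
```

A coefficient near +1 or -1 corresponds to a tight linear band in the scatter plot, while a value near 0 corresponds to a shapeless cloud.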
