Module-1 Part 2 Data Visualization
January 1, 2024
Topics to be covered¹
▶ Exploring the data distribution
▶ Percentiles and boxplots
▶ Frequency tables and histograms
▶ Density Plots and Estimates
▶ Exploring Binary and Categorical Data
▶ Mode
▶ Expected Value
▶ Probability
▶ Correlation
▶ Scatterplots
▶ Exploring two or more variables
▶ Hexagonal Binning and Contours
▶ Two categorical variables
▶ Categorical and Numeric Data
▶ Visualizing multiple variables
¹ The Instructor acknowledges the authors of various articles available on the web and
of other resources from which some of the material presented here is taken.
Exploring Data Distribution
▶ Distribution of data
▶ Better presentation of data
▶ Inclusion of finer details
▶ Drawing conclusions and inferences!
▶ Analyze
▶ Identify patterns, trends, etc.
▶ Formulate/Test Hypothesis
▶ Provides evidence and support!
Frequency Table
Letter frequencies based on passages taken from newspapers and novels, with a
total sample of 100,362 alphabetic characters (Beker and Piper, 1982).
Frequency Table - Examples!
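A letter-frequency table like the one described above could be built along the following lines. This is only a minimal sketch: the function name `letter_frequency_table` and the sample sentence are illustrative assumptions, not the data or method used in the cited study.

```python
from collections import Counter


def letter_frequency_table(text):
    """Return (letter, count, relative frequency) rows, most frequent first.

    Only ASCII alphabetic characters are counted; case is ignored.
    """
    letters = [ch.lower() for ch in text if ch.isascii() and ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    # most_common() sorts by descending count; an empty input gives an empty table
    return [(letter, n, n / total) for letter, n in counts.most_common()]


# Hypothetical sample text, chosen only for illustration
sample = "The quick brown fox jumps over the lazy dog"
for letter, n, share in letter_frequency_table(sample)[:5]:
    print(f"{letter}  {n:>3}  {share:6.1%}")
```

Each row pairs a category (here, a letter) with its count and its share of the total, which is the content a frequency table presents.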
Mode
Most commonly occurring category or value in a dataset
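As a rough illustration, the mode can be read off a count of each category. The list `observations` below is made-up data, and the use of `statistics.multimode` to report ties is one possible choice rather than a prescribed method.

```python
from collections import Counter
from statistics import multimode

# Hypothetical categorical observations (illustrative only)
observations = ["red", "blue", "red", "green", "blue", "red"]

# Count every category, then take the most frequent one
counts = Counter(observations)
mode_value, mode_count = counts.most_common(1)[0]

# multimode returns every value tied for the highest count
modes = multimode(observations)
tied_modes = multimode(["a", "b", "a", "b", "c"])

print(mode_value, mode_count)
print(modes, tied_modes)
```

A dataset can have more than one mode, which is why the tied case is shown separately.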
Expected Value
Associating the categories with numerical values, E[X] gives the
average based on the probability of occurrence of the category
E[X] = \sum_{x=1}^{n} p(x)\, x
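The expected value formula can be sketched in code as a probability-weighted sum. The helper `expected_value` and the fair-die distribution are assumptions made for illustration; the categories are taken to be the numeric values 1 to 6, each with probability 1/6.

```python
import math


def expected_value(distribution):
    """Compute E[X] = sum of p(x) * x over all values x.

    `distribution` maps each numeric value x to its probability p(x).
    """
    total = sum(distribution.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for x, p in distribution.items())


# Hypothetical example: a fair six-sided die, p(x) = 1/6 for x = 1..6
fair_die = {x: 1 / 6 for x in range(1, 7)}
die_mean = expected_value(fair_die)
print(die_mean)
```

For the fair die the weighted sum is (1 + 2 + ... + 6) / 6 = 3.5, a value that is not itself a possible outcome.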
https://onlinestatbook.com/2/advanced_graphs/contour.html
Categorical and Numerical Data - Violin Plots