Visualization Summarization S25 Lec6,7
• missing data?
• replace with mean
• remove
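The two options above can be sketched in plain Python. The scores below are made-up illustration data, and using `None` to mark a missing value is an assumption; real data sets may use NaN or a sentinel code instead.

```python
from statistics import mean

# Hypothetical test scores; None marks a missing value (an assumption).
scores = [62.0, None, 71.0, 58.0, None, 69.0]

# Values that were actually observed.
observed = [s for s in scores if s is not None]

# Option 1: replace each missing value with the mean of the observed values.
fill_value = mean(observed)
imputed = [fill_value if s is None else s for s in scores]

# Option 2: remove the missing values entirely.
removed = list(observed)

print(imputed)
print(removed)
```

Replacing with the mean keeps the sample size but shrinks the spread of the data, while removing keeps the observed spread but leaves fewer values to work with.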
Continuous Categorical
What information does it give?
Outline
• Visualization
• why we visualise
• how to pick a plot
• initial data vs final results visualization (some examples)
• bad designs and misleading graphs
• Summarization
• measures of central tendency & dispersion
• which measure to pick
Data Visualisation
Mean Mid-Sem Test Score = 65.5
https://www.cardinalpath.com/blog/makes-good-visualization
What makes a good visualisation
• labelling
• label the axes correctly and consistently across all your charts.
• avoid acronyms that are not widely understood.
• make the chart title as concise and descriptive as possible.
• whenever possible, label the lines in your line chart directly rather than using a legend.
• be consistent in formatting; if you use currency symbols, percentage signs and decimal values, keep them the same across all your charts.
Market Share of Film Studios
So which visualisation was best?
Tufte’s Graphical Theory
• maximise the data-ink ratio
• keep the lie factor close to 1 (i.e. increase graphical integrity)
• minimise chart junk
• use proper scales and labelling
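Tufte's lie factor is the size of the effect shown in the graphic divided by the size of the effect in the data, so a value near 1 means the graphic represents the data faithfully. A minimal sketch follows; the function names and the numbers are hypothetical and chosen only for illustration.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(data_first: float, data_second: float,
               drawn_first: float, drawn_second: float) -> float:
    """Effect shown in the graphic divided by the effect in the data."""
    return effect_size(drawn_first, drawn_second) / effect_size(data_first, data_second)


# Hypothetical example: the data grow from 100 to 150 (a 50% increase),
# but the bars are drawn 2 cm and 5 cm tall (a 150% increase).
lf = lie_factor(100, 150, 2, 5)
print(lf)  # 3.0
```

Here the graphic exaggerates the change threefold. A truncated axis is a common way to end up with a lie factor well above 1.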
The good, the bad, & the ugly
How to choose the right plot?
BAR CHART
Pie vs Bar Charts
• use pie charts when
• there are only a few categories
• readers can tell the slices apart (unless you are making a point)
• you don’t need many colors or labels to explain the proportions
• the total adds up to 100%
• use bar charts when
• you have more categories (but not too many)
• you need to compare numbers side-by-side (caution: more than two bars are hard for readers to compare)
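The rules of thumb above can be written as a small helper. This is only a sketch: `suggest_chart` is a hypothetical function, and the cut-off of five slices is an assumed threshold, not a value from the lecture.

```python
import math


def suggest_chart(percentages, max_pie_slices=5):
    """Suggest 'pie' or 'bar' from the rules of thumb above.

    max_pie_slices is an assumed cut-off for "a small number of categories".
    """
    few_categories = len(percentages) <= max_pie_slices
    # A pie chart only makes sense when the parts add up to the whole.
    parts_of_whole = math.isclose(sum(percentages), 100.0, abs_tol=0.5)
    return "pie" if few_categories and parts_of_whole else "bar"


print(suggest_chart([45, 30, 25]))                      # pie
print(suggest_chart([20, 15, 15, 10, 10, 10, 10, 10]))  # bar (too many slices)
print(suggest_chart([60, 55, 40]))                      # bar (not parts of one whole)
```

The helper cannot check whether readers can tell slices apart; two slices of 33% and 34% would still pass, so the final choice remains a judgment call.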
AREA PLOTS: TREE MAP
AREA PLOTS: WAFFLE CHART
Data Visualisation
• mosaic plots
• allow you to observe the relationship among two or more categorical variables
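In a mosaic plot, the area of each tile is proportional to the share of records that fall into that combination of categories. The counts behind such a plot can be sketched as a contingency table; the records and category names below are hypothetical.

```python
from collections import Counter

# Hypothetical records: (department, employment type).
records = (
    [("Sales", "full-time")] * 3
    + [("Sales", "part-time")] * 1
    + [("Support", "full-time")] * 1
    + [("Support", "part-time")] * 3
)

# Count every combination of the two categorical variables.
counts = Counter(records)
total = sum(counts.values())

# Share of all records in each cell; a mosaic plot maps this to tile area.
shares = {cell: n / total for cell, n in counts.items()}

for (dept, kind), share in sorted(shares.items()):
    print(f"{dept:<10}{kind:<12}{share:.1%}")
```

If the two variables were unrelated, the full-time and part-time split would look the same in both departments; here it clearly does not.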
AREA PLOTS: MOSAIC PLOT
So which visualisation was best?
VIOLIN PLOT (wider regions: higher probability; narrower regions: lower probability)
RAINDROP PLOT (Combo)
Visualizing Results
Funnel Plots (Regression Results)
SPIDER PLOT / RADAR CHART
Describing Data
• temporal changes
• proportions
• data distributions
• group differences
• relationships between variables
• geographical data
Temporal
Area plots
Data Distribution
• Storytelling
• Reduce Cognitive Load
• Less is more
• Missing data
• Color Consistency
• Labelling
https://www.cardinalpath.com/blog/makes-good-visualization
To do or not to do
• Provide the necessary context around visuals
• Keep information simple and clear
• Be brief and leave out unnecessary information
• Use simple, easy-to-understand color palettes
• Make sure graphics are visually appealing
• Where possible, bring in originality by relating seemingly unrelated data and subjects
To do or not to do
• Avoid putting too many variables in a single image; they distract viewers
• Do not use an unsuitable or incorrect visualization format for the data
• When a scale is used to show differences between data points, keep that scale consistent
• Avoid poor color choices:
• avoid colors with negligible contrast
• avoid too many colors
• avoid using conventional colors to convey opposite meanings
• consider colorblind readers (also check the chart in grayscale)
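One way to run the grayscale check is to convert each color to an approximate brightness and compare. The sketch below uses the Rec. 601 luma weights; the helper names and the `min_gap` threshold of 50 are assumptions chosen for illustration, not a standard.

```python
def to_gray(rgb):
    """Approximate brightness (0-255) using the Rec. 601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def distinguishable_in_gray(color_a, color_b, min_gap=50):
    """True if two colors stay clearly apart in grayscale.

    min_gap is an assumed, arbitrary threshold.
    """
    return abs(to_gray(color_a) - to_gray(color_b)) >= min_gap


red = (255, 0, 0)
green = (0, 130, 0)
navy = (0, 0, 128)
yellow = (255, 255, 0)

print(distinguishable_in_gray(red, green))    # False: nearly the same gray
print(distinguishable_in_gray(navy, yellow))  # True: dark vs light
```

A red and green pair like this one looks distinct in color but collapses to almost the same gray, which is why such a palette fails in grayscale print.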
Bad Designs & Improvements
https://nandeshwar.info/data-visualization/pie-chart-vs-bar-chart/
What if we want to compare genders
within the job categories and
ethnicities/races?
Descriptive Statistics
• Common descriptive statistics are:
• Measure of central tendency: the most typical value of a given group of values
• Measure of dispersion: how much all the other values in the group vary around the typical value
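Both kinds of measure can be computed with Python's standard `statistics` module. The values below are hypothetical, and the choice of mean, median, mode, range and sample standard deviation is just one common set of measures.

```python
from statistics import mean, median, mode, stdev

# Hypothetical group of values.
values = [4, 7, 7, 8, 9, 10, 11]

# Measures of central tendency: the most typical value.
print("mean:  ", mean(values))    # 8
print("median:", median(values))  # 8
print("mode:  ", mode(values))    # 7

# Measures of dispersion: how much the values vary around the typical value.
value_range = max(values) - min(values)
print("range: ", value_range)     # 7
print("stdev: ", stdev(values))   # sample standard deviation
```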
Measures of central tendency
Central Tendency for Variable Types
Mean
• Advantages: a sensitive and exact measure of the centre point of a group of values
• Disadvantages: a single extreme value in one direction can seriously distort the mean
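The effect of a single extreme value on the mean can be seen with a small example. The scores are hypothetical; the median is shown alongside only for comparison.

```python
from statistics import mean, median

# Hypothetical scores without and with one extreme value.
scores = [60, 62, 65, 67, 70]
with_outlier = scores + [300]

print(mean(scores), median(scores))              # 64.8 65
print(mean(with_outlier), median(with_outlier))  # 104 66.0
```

One extreme value moves the mean from 64.8 to 104, above every ordinary score in the group, while the median moves only from 65 to 66.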
Advantages
• Fundamental to significance testing, and forms the basis of Analysis of Variance (ANOVA)
• Enables population parameters to be estimated from a sample of people
Disadvantages
?