
Lecture 1: Introduction to Data Visualization

Data visualization is crucial for transforming complex information into clear visuals that enhance understanding, tell data stories, and support decision-making. It utilizes digital tools for interactivity and accessibility while also addressing the potential for misinformation. Key concepts include understanding preattentive attributes, types of data, and effective use of colors in visualizations.


Introduction to Data Visualization

By Dr. Mourad Raafat
Why Data Visualization?

Data visualization is essential because it transforms complex information into clear, meaningful, and compelling visual representations. Here are the key takeaways:

1. Enhances Understanding – Just as sentences are more persuasive with supporting evidence, visualizations make data-driven arguments more insightful and compelling.
2. Tells Data Stories – While words narrate stories, visualizations show data stories by converting numerical, relational, or spatial patterns into images.
3. Highlights Key Insights – A well-designed visualization draws attention to what is most important in the data, making it easier to grasp than raw text alone.
4. Uses Digital Tools – The text mentions a variety of free and easy-to-use tools that allow users to create effective charts, tables, and maps.
5. Supports Decision-Making – Data visualizations help new learners and professionals decide on the best way to present information, whether through tables, charts, or maps.
6. Encourages Interactivity – Unlike static visuals, modern data visualizations can be interactive, allowing users to explore, download, and share insights easily.
7. Increases Accessibility – With the rise of digital content, data visualizations reach a wider audience on the web, engaging them in ways that traditional print materials cannot.
8. Combats (or Contributes to) Misinformation – While visualizations help uncover the truth, they can also be manipulated to deceive. Therefore, it’s essential to critically evaluate the sources and accuracy of data stories.
What Can You Believe?
Example I-1. Economic inequality has sharply risen in the United States since the 1970s.

Example I-2. In 1970, the top 10% of US adults received an average income of about $135,000 in today’s dollars, compared to the bottom 50%, who earned around $16,500. This inequality gap grew sharply over the next five decades, as the top-tier income climbed to about $350,000, while the bottom half barely moved to about $19,000, according to the World Inequality Database.
Group A         Group B         Group C         Group D
x     y         x     y         x     y         x     y
10    8.04      10    9.14      10    7.46      8     6.58
8     6.95      8     8.14      8     6.77      8     5.76
13    7.58      13    8.74      13    12.74     8     7.71
9     8.81      9     8.77      9     7.11      8     8.84
11    8.33      11    9.26      11    7.81      8     8.47
14    9.96      14    8.1       14    8.84      8     7.04
6     7.24      6     6.13      6     6.08      8     5.25
4     4.26      4     3.1       4     5.39      8     5.56
12    10.84     12    9.13      12    8.15      8     6.89
7     4.82      7     7.26      7     6.42      8     7.91
5     5.68      5     4.74      5     5.73      19    12.5

Why Do We Visualize Data?

• Seeing the numbers in a chart shows you something that tables and some statistical measures cannot.
• We visualize data to harness the incredible power of our visual system to spot relationships and trends.
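
The point about tables and statistical measures can be checked on the four groups in the table above, which match the well-known Anscombe's quartet. The sketch below uses plain Python and computes the mean of x, the mean of y, and the Pearson correlation for each group. The helper names (`mean`, `pearson_r`, `summary`) are illustrative choices, not part of the lecture.

```python
from math import sqrt

# The four x/y groups, copied from the table above.
groups = {
    "A": ([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "B": ([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
          [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74]),
    "C": ([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
          [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "D": ([8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 19],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 5.56, 6.89, 7.91, 12.5]),
}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# One (mean x, mean y, correlation) triple per group.
summary = {
    name: (mean(xs), mean(ys), pearson_r(xs, ys))
    for name, (xs, ys) in groups.items()
}

for name, (mx, my, r) in summary.items():
    print(f"Group {name}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

All four groups come out with nearly the same means and a correlation of roughly 0.82, so these summary numbers alone cannot tell the groups apart. Plotting each group as a scatter chart is what reveals how different they are.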
How Do We Visualize Data?
• To visualize data, we have to understand three things:
1. Preattentive attributes
2. Types of data
3. Dealing with colors
Preattentive Attributes

• What are preattentive attributes?
• Preattentive attributes are things our brains process in milliseconds, before we pay attention to everything else.
• Types of preattentive attributes
• There are many different types of preattentive attributes, such as:
• Color
• Size
• Position
• ………
• How many 9s are in this table of digits?
Coloring every digit is nearly as bad as having no color.
Conclusion:
• Using one color on a visualization is highly effective for making one category stand out.
• Using a few colors, such as black, blue, and red, to distinguish a small number of categories is fine too.
• Using many colors makes it hard to distinguish many categories.
Solution:
To count each digit, we need to aggregate.
• Visualization is, at its core, about encoding aggregations, such as frequency, in order to gain insight.
• We need to move away from the table entirely and encode the frequency of each digit.
• Since the task is to count the 9s in the data source, the bar chart is one of the best ways to see the results.
• Remember: length and position are best for quantitative comparisons.
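
The aggregation step can be sketched in a few lines of Python. The digit rows below are hypothetical, since the digits from the slide are not reproduced here, and `text_bar_chart` is an illustrative helper that encodes each frequency as the length of a bar made of `#` characters.

```python
from collections import Counter

# Hypothetical table of digits; the slide's actual digits are not available.
digit_rows = [
    "7295018364",
    "9140927365",
    "3859210947",
]

# Aggregate: count how often each digit occurs across all rows.
counts = Counter(ch for row in digit_rows for ch in row)


def text_bar_chart(counts):
    """Render one line per digit, with bar length equal to its frequency."""
    lines = []
    for digit in "0123456789":
        n = counts.get(digit, 0)
        lines.append(f"{digit} | {'#' * n} {n}")
    return "\n".join(lines)


print(text_bar_chart(counts))
```

Reading the count of 9s off the last bar takes a glance, whereas scanning the raw rows requires checking every digit.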
Types of Data

There are three types of data:
• Categorical Data (aka Nominal Data)
• Ordinal Data
• Quantitative Data (aka Numerical Data)
Categorical Data

A categorical variable (sometimes called a nominal variable) is:
One that has two or more categories, with no intrinsic ordering to the categories.
For example:
• A binary variable (e.g., yes/no, or male/female) is a categorical variable with two categories and no intrinsic ordering between them.
• Hair color is also a categorical variable with a number of categories (blonde, brown, brunette, red), and again, there is no agreed way to order these from highest to lowest.

If the variable has a clear ordering, then it is an ordinal variable.
Ordinal Data
An ordinal variable is:
A categorical variable with a clear ordering to its categories.
Examples:
• Age category (early 20s, mid 30s, late 50s)
• Salary scale (Low, Mid, High)
• Academic rank (Assistant, Associate, Full Professor)
• Educational experience (elementary school graduate, high school graduate, some college, and college graduate)
Note: The difference between some categories (elementary and high school) is probably much bigger than the difference between others (high school and some college). The spacing between the values may not be the same across the levels of the variable.
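
One way to handle an ordinal variable in code is to store the category order explicitly and compare by rank. The sketch below uses the educational experience levels from the example above. The `responses` list and the `at_least` helper are hypothetical, added only for illustration.

```python
# Category order for the ordinal variable, lowest to highest.
EDUCATION_LEVELS = [
    "elementary school graduate",
    "high school graduate",
    "some college",
    "college graduate",
]

# Rank of each level. Ranks only express order, not distance:
# the gap between rank 0 and 1 need not equal the gap between 1 and 2.
RANK = {level: i for i, level in enumerate(EDUCATION_LEVELS)}

# Hypothetical survey responses.
responses = [
    "some college",
    "elementary school graduate",
    "college graduate",
    "high school graduate",
    "some college",
]

# Sorting by rank gives the intended order (alphabetical order would not).
sorted_responses = sorted(responses, key=RANK.__getitem__)


def at_least(level, minimum):
    """Ordinal comparison: True if `level` is at or above `minimum`."""
    return RANK[level] >= RANK[minimum]


print(sorted_responses)
print(at_least("some college", "high school graduate"))
```

Averaging the ranks would treat the levels as evenly spaced, which the note above warns against.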
Quantitative Data
A quantitative variable (sometimes called a numerical variable) is:
Numbers (can be measured and aggregated).
Examples:
• Sales
• Profit
• Exam scores
• Page views
• Number of patients in a hospital
(All are numbers and can be aggregated, e.g., on a weekly, monthly, or yearly basis.)
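
As a minimal sketch of what aggregating on a monthly basis means, the snippet below sums sales per calendar month. The `daily_sales` records and the `total_by_month` helper are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records as (date, amount) pairs.
daily_sales = [
    (date(2024, 1, 5), 120.0),
    (date(2024, 1, 19), 80.5),
    (date(2024, 2, 2), 200.0),
    (date(2024, 2, 14), 49.5),
    (date(2024, 3, 1), 310.0),
]


def total_by_month(records):
    """Sum the amounts for each (year, month) pair."""
    totals = defaultdict(float)
    for day, amount in records:
        totals[(day.year, day.month)] += amount
    return dict(totals)


monthly = total_by_month(daily_sales)
for (year, month), total in sorted(monthly.items()):
    print(f"{year}-{month:02d}: {total:.2f}")
```

The same pattern works for weekly or yearly totals by changing the grouping key.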
Quantitative Data

• Quantitative data can be expressed in two ways:
• Discrete data:
• Takes predefined, exact values. There’s no “in between” (e.g., number of classes per semester: 4, 5, 6, ...; there is no such thing as 4.5 classes).
• Continuous data:
• Allows for the "in between," as there is an infinite number of possible intermediate values (e.g., temperatures of 30, 30.4, 20.45, ...; weights of 150 g, 120.4 g, 11.23 g).
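
The distinction matters when counting values. Discrete values can be counted directly, while continuous values usually have to be grouped into intervals first. The sketch below illustrates this with hypothetical numbers; `bin_continuous` and the bin width of 5 are assumptions made for the example.

```python
from collections import Counter

# Hypothetical observations.
classes_per_semester = [4, 5, 5, 6, 4, 5]              # discrete
temperatures = [30.0, 30.4, 20.45, 25.1, 29.9, 21.0]   # continuous

# Discrete data: each exact value is its own category.
discrete_counts = Counter(classes_per_semester)


def bin_continuous(values, bin_width):
    """Count values per interval [start, start + bin_width)."""
    bins = Counter()
    for v in values:
        start = (v // bin_width) * bin_width  # lower edge of the interval
        bins[start] += 1
    return dict(sorted(bins.items()))


# Continuous data: group into 5-degree-wide intervals before counting.
temperature_bins = bin_continuous(temperatures, 5.0)

print(dict(discrete_counts))
print(temperature_bins)
```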
How to Use Colors Properly

Color is one of the most important things to understand in data visualization. It is frequently misused.
• You should not use color just to spice up a boring visualization.
• Many great data visualizations don’t use color at all and are still informative and beautiful.
Color should be used in data visualization in three primary ways:
• Sequential
• Diverging
• Categorical
In addition, there is often the need to:
• Highlight data
• Alert the reader to something important.
• Using color in a sequential way
– Color is ordered from low to high.
• Using color in a diverging way
– Two sequential colors with a midpoint.
• Using color in a categorical way
– Contrasting colors for individual comparison.
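
A scheme built from two sequential colors with a midpoint can be sketched as two color ramps that meet at a neutral color. The function below is an illustration in plain Python. The RGB endpoints (a red, a near-white, and a blue) and the names `mix` and `diverging_color` are assumptions, not colors prescribed by the lecture.

```python
def mix(c1, c2, t):
    """Linearly interpolate between two RGB tuples; t=0 gives c1, t=1 gives c2."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c1, c2))


def diverging_color(value, low, mid, high,
                    low_rgb=(178, 24, 43),      # assumed color for the low end
                    mid_rgb=(247, 247, 247),    # assumed neutral midpoint
                    high_rgb=(33, 102, 172)):   # assumed color for the high end
    """Map a value to a hex color that diverges from the midpoint."""
    if value <= mid:
        span = mid - low
        t = 0.0 if span == 0 else (mid - value) / span
        rgb = mix(mid_rgb, low_rgb, min(1.0, t))
    else:
        span = high - mid
        t = 0.0 if span == 0 else (value - mid) / span
        rgb = mix(mid_rgb, high_rgb, min(1.0, t))
    return "#%02x%02x%02x" % rgb


# Hypothetical use: percentage change around zero.
for change in (-10, -5, 0, 5, 10):
    print(change, diverging_color(change, low=-10, mid=0, high=10))
```

Values at the midpoint get the neutral color, and the color grows more saturated the further a value lies from the midpoint in either direction.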


• Examples of utilizing Sequential Color
• Remember: Sequential color is the use of a single color from light to dark.
• One example is encoding the unemployment rate by state using a sequential color scheme.
• Another example is encoding the total amount of sales by state in blue, where a darker blue shows higher sales and a lighter blue shows lower sales.
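
The sales example can be sketched as a function that maps each value to a shade between a light and a dark blue. This is a minimal illustration in plain Python. The state names, the sales figures, the two RGB endpoints, and the name `sequential_color` are all assumptions.

```python
def sequential_color(value, low, high,
                     light=(222, 235, 247),  # assumed light blue for low values
                     dark=(8, 81, 156)):     # assumed dark blue for high values
    """Map a value in [low, high] to a hex color from light to dark."""
    t = 0.0 if high == low else (value - low) / (high - low)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    r, g, b = (round(l + (d - l) * t) for l, d in zip(light, dark))
    return f"#{r:02x}{g:02x}{b:02x}"


# Hypothetical total sales by state.
sales_by_state = {"State A": 1200, "State B": 4800, "State C": 3000}

low = min(sales_by_state.values())
high = max(sales_by_state.values())
colors = {
    state: sequential_color(sales, low, high)
    for state, sales in sales_by_state.items()
}

for state, color in colors.items():
    print(state, sales_by_state[state], color)
```

The state with the lowest sales gets the lightest shade and the state with the highest sales gets the darkest, with everything else in between.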
Examples of utilizing Highlight Color
• Remember: Highlight color is used when there is something that needs to stand out to the reader, but not to alert or alarm them.
Highlights can be used in a number of ways:
• Highlighting a certain data point
• Highlighting text in a table
• Highlighting a certain line on a line chart
• Highlighting a specific bar in a bar chart
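
Highlighting a specific bar can be sketched as building a color list in which one item gets the highlight color and all others get a muted gray. The color values, the bar labels, and the `highlight_colors` helper below are hypothetical.

```python
HIGHLIGHT = "#d95f02"  # assumed highlight color
MUTED = "#bdbdbd"      # assumed neutral gray for everything else


def highlight_colors(labels, target):
    """Return one color per label, highlighting only the target label."""
    return [HIGHLIGHT if label == target else MUTED for label in labels]


# Hypothetical bar chart categories.
bar_labels = ["North", "South", "East", "West"]
bar_colors = highlight_colors(bar_labels, "East")

print(list(zip(bar_labels, bar_colors)))
```

A list like `bar_colors` can then be handed to a plotting library that accepts one color per bar, for example through the `color` argument of matplotlib's `bar` function.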
Color Vision Deficiency (CVD)
• Based on research in 1993, approximately 8 percent of males have color vision deficiency (CVD), compared to only 0.4 percent of females.
• The deficiency is commonly referred to as "color blindness," but that term isn’t entirely accurate.
• People with CVD can in fact see color, but they cannot distinguish colors in the same way as the rest of the population.
• The more accurate term is "color vision deficiency."
• Thank you
