04 Exploring+Data+Visually Combined Lms

The document discusses exploring data visually through data visualization. It covers key aspects of the data exploration process including understanding the available data, determining what questions you want to answer, choosing appropriate visualization methods, and interpreting what you see in the visualizations. The purpose of data visualization is to gain insights and new perspectives from the data. An iterative process is recommended to fully explore the data through creating and comparing multiple visualizations from different angles.

Uploaded by

Rami Haidar

“The greatest value of a picture is when it forces us to notice what we never expected to see.”
John W. Tukey, Exploratory Data Analysis (1977)

Exploring Data Visually and Graph Types

MSBA 325
Fouad Zablith, PhD
Purpose

https://www.theguardian.com/data
Process

Things to consider when you explore your data visually:

• What data do you have?
• What do you want to know about your data?
• What visualization methods should you use?
• What do you see, and does it make sense?

Exploring data visually is an iterative process.

What Data Do You Have?

• If your dataset has only a handful of observations, there is little to find in it, and few visualization methods will be useful.

• If you have a lot of data, what you see when you visualize one aspect of it can spark curiosity about other dimensions, which in turn leads to different graphics.
What Data Do You Have?

It is important to get the data before designing the visual.

Getting the data that you need is often the hardest and most time-consuming part.

Programming or click-and-play applications can help you manage data.

Things to consider:

• What do the values represent?
• Where is the data from?
• How are the variables measured?

What Do You Want to Know About Your Data?

When you have a dataset with thousands or millions of observations, it can be challenging to
figure out what to look at first.

This is where the phrase “drowning in data” comes from. You stare at a bunch of numbers on
your computer screen, and values start to blur together the longer you stare. Soon all you see is
a blob of data that feels suffocating, but wait; there’s hope.

Take a step back. Breathe.

To avoid drowning in data, you learn to swim. When you learn to swim, you start at the shallow
end and work your way toward the deep end.
What Do You Want to Know About Your Data?

Your answer doesn’t need to be complex or profound. The more specific you are, the more
direction you get.

For example, if you have time series data, you might want to know if something has improved
or gotten worse over the past decade.
What Visualization Methods Should You Use?

It is important to see your data from different angles and to drill down to what matters for your project.

• Make multiple charts, compare all your variables, and see if there are interesting bits that are worth a closer look.

• Try different scales, colors, shapes, sizes, and geometries, and you might find a graphic worth pursuing further.
What Visualization Methods Should You Use?

You can go beyond the norms: the figure shows an interactive exploration of article deletions on Wikipedia.

http://notabilia.net
What Visualization Methods Should You Use?

• If you are designing a dashboard that shows the status of a system at a glance, you must visualize the data in a straightforward way.

• If the goal is to encourage reflection or to evoke emotions, efficiency might not be your main concern.

• Traditional visualizations such as bar graphs and line charts can be made easily and read quickly.
What Do You See and Does it Make Sense?

After you visualize your data, there are certain things to look for.
Types of Visuals
Do We Always Need to Visualize Graphically?

• When you have just one or two numbers to share, use the numbers themselves to highlight their importance.
• The fact that you have numbers doesn’t mean that you need to use a graph.

For example, in the adjacent figure, the graph does little to help interpret the numbers.
Tables

• Very useful when communicating to a mixed audience, since each member will look for their particular row of interest.

• It is easier to communicate multiple units of measure with a table than with a graph.

• To place the emphasis on the data, use light borders and let the design of the table fade into the background.
A Special Case of Tables: Heatmap

A heatmap is a way to visualize data in tabular format where you color cells to show
the relative magnitude of the numbers.

For example, in the heatmap above, the higher the saturation of blue, the higher the number.

This makes it faster to pick out the tails (the lowest and highest numbers).
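
As a minimal sketch of the coloring step, the snippet below rescales each cell to a saturation between 0 and 1, so the lowest number gets the lightest shade and the highest gets the darkest. The table values and the function name `saturation` are hypothetical; a plotting library would then map each saturation to a shade of blue.

```python
# Hypothetical table: rows are categories, columns are measures.
table = [
    [15, 22, 42],
    [40, 36, 20],
    [35, 17, 34],
]

def saturation(table):
    """Rescale every cell to [0, 1]: 0 = lowest value, 1 = highest value."""
    cells = [v for row in table for v in row]
    lo, hi = min(cells), max(cells)
    span = (hi - lo) or 1  # avoid dividing by zero when all cells are equal
    return [[(v - lo) / span for v in row] for row in table]

shades = saturation(table)
for row, shade_row in zip(table, shades):
    # Print each value next to the saturation it would be drawn with.
    print("  ".join(f"{v:>3} ({s:.2f})" for v, s in zip(row, shade_row)))
```

The cells printed with 0.00 and 1.00 are the tails that the coloring makes easy to spot.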
Visualizing Categorical Data

The bar graph, of course, is one of the most common ways to show categorical data.

For example, the results of a survey of approximately 2,200 people about how they use the
Internet, social networking sites such as Facebook and Twitter, and whether politics was a
regular occurrence on those sites can be reported using a bar graph.

The figure shows the results for four of the fifty questions.
Visualizing Categorical Data

The same results can be presented using a different scale and different shapes.

The figure shows the same poll results with squares sized by area.
Visualizing Categorical Data

The differences among categories don’t look as dramatic in the symbol plots as they do in the bar graphs.

For example, the bar for Google looks a lot longer than the rest in the search engine bar graph. The square for Google also looks bigger than the other squares, but not by the same magnitude.
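
The weaker contrast follows from the geometry: when a value is encoded as area, the side of the square grows only with the square root of the value. The sketch below illustrates this with two hypothetical shares (not the poll's actual numbers).

```python
import math

# Hypothetical survey shares in percent; not taken from the poll.
shares = {"A": 80, "B": 20}

# Bar graph: length is proportional to the value.
bar_ratio = shares["A"] / shares["B"]

# Symbol plot: area is proportional to the value, so the side is its square root.
sides = {name: math.sqrt(value) for name, value in shares.items()}
side_ratio = sides["A"] / sides["B"]

print(f"bar length ratio:  {bar_ratio:.1f}")
print(f"square side ratio: {side_ratio:.1f}")
```

A fourfold difference in value appears as a fourfold difference in bar length, but only as a twofold difference in the side of the square.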
Bar Chart Note
Because of how our eyes compare the relative end points of the
bars, bar charts must have a zero baseline.

Non-zero Baseline vs. Zero Baseline
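
To see why the baseline matters, the following sketch compares the ratio of the drawn bar heights under a zero baseline and under a truncated one. The two values and the baseline of 65 are hypothetical, and `bar_height_ratio` is an illustrative helper.

```python
def bar_height_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights when the axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)

# Hypothetical values for two categories.
a, b = 75, 70

zero_based = bar_height_ratio(a, b)              # matches the ratio of the values
truncated = bar_height_ratio(a, b, baseline=65)  # axis starts at 65

print(f"zero baseline: bar A is {zero_based:.2f}x bar B")
print(f"baseline 65:   bar A is {truncated:.2f}x bar B")
```

With the truncated axis, a difference of about 7% is drawn as a bar twice as tall.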

Graph Axis vs. Data Labels

In making this decision, consider the level of specificity needed:

• If you want your audience to focus on big-picture trends, think about preserving the axis but deemphasizing it by making it grey.

• If the specific numerical values are important, it may be better to label the data points directly. In this latter case, it’s usually best to omit the axis to avoid the inclusion of redundant information.

Always consider how you want your audience to use the visual and construct it accordingly.
Types of Graphs: Bars

In general, the bars should be wider than the white space between the bars.
Types of Graphs: Bars

Vertical Bar Charts

• Vertical bar charts can be single series, two series, or multiple series.

Note: as you add more series of data, it becomes more difficult to focus on one at a
time.
Types of Graphs: Bars

Stacked Vertical Bar Chart

• Meant to allow you to compare totals across categories and also see the
subcomponent pieces within a given category.

• It is hard to compare the subcomponents across the various categories once you get
beyond the bottom series because you no longer have a consistent baseline to use to
compare.
Types of Graphs: Bars

Waterfall Chart

The waterfall chart can be used to pull apart the pieces of a stacked bar chart to focus
on one at a time, or to show a starting point, increases and decreases, and the
resulting ending point.

Example: Imagine that you are an HR business partner and want to understand and
communicate how employee headcount has changed over the past year for the client
group you support.
Types of Graphs: Bars
Waterfall Chart Example

• The first column shows the employee headcount at the beginning of the year.
• The final column represents employee headcount at the end of the year, after the
additions and deductions have been applied to the beginning of year headcount.
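
One way to sketch the arithmetic behind such a chart: each change is drawn as a floating bar between the running total before and after it. The headcount numbers and category labels below are hypothetical, and `waterfall` is an illustrative helper.

```python
# Hypothetical headcount figures; the labels are illustrative only.
start = 100
changes = [("Hires", 30), ("Transfers in", 8), ("Transfers out", -12), ("Exits", -10)]

def waterfall(start, changes):
    """Return (label, bottom, top) for each bar, including start and end totals."""
    bars = [("Beginning", 0, start)]
    running = start
    for label, delta in changes:
        nxt = running + delta
        # A floating bar spans the running total before and after the change.
        bars.append((label, min(running, nxt), max(running, nxt)))
        running = nxt
    bars.append(("Ending", 0, running))
    return bars

bars = waterfall(start, changes)
for label, bottom, top in bars:
    print(f"{label:<14} from {bottom:>3} to {top:>3}")
```

The first and last bars start at zero, while the bars in between float at the level where the previous bar ended.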
Types of Graphs: Bars

Horizontal Bar Chart

• Very useful for categorical data

• Extremely easy to read
• Can be single series, two series, or multiple series
Types of Graphs: Bars

Stacked Horizontal Bar Chart

• Can be used to show the totals across different categories but also give
information about the subcomponents.

• Can be structured to show either absolute values or sum to 100%.

Types of Graphs: Bars

Stacked Horizontal Bar Chart

For example, a stacked horizontal bar chart can work well for visualizing survey data
collected along a Likert scale.
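
As a rough sketch of the "sum to 100%" option, the snippet below converts response counts into shares and computes where each stacked segment starts. The survey items and counts are hypothetical, and the helper names are illustrative.

```python
# Hypothetical Likert responses per survey item (counts, not real data).
scale = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
responses = {
    "Item 1": [5, 10, 20, 40, 25],
    "Item 2": [10, 30, 30, 20, 10],
}

def to_percent(counts):
    """Convert absolute counts to shares that sum to 100%."""
    total = sum(counts)
    return [100 * c / total for c in counts]

def segments(values):
    """Return (left, width) for each stacked segment of one horizontal bar."""
    left, out = 0.0, []
    for v in values:
        out.append((left, v))
        left += v
    return out

percent = {item: to_percent(counts) for item, counts in responses.items()}
for item, shares in percent.items():
    print(item, [(round(left), round(width)) for left, width in segments(shares)])
```

Skipping `to_percent` and passing the raw counts to `segments` gives the absolute-value version of the same bar.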
Parts of a Whole

When you put categories together, the sum of the parts can equal a whole.

This is when the pie chart comes into the picture.

Returning to the example of the survey on Internet usage, the figure shows breakdowns of
awareness of targeted advertising online.
Subcategories

Subcategories, the categories within categories (within categories), are often more revealing
than the main categories.

As you drill down, there can be higher variability.

Showing subcategories can make it easier to browse your data, because you can visually jump
to the areas that you care most about.

You can use a treemap with the survey data.

Example: http://linked.aub.edu.lb/apps/charts/courses_concepts_treemap.php
Subcategories

The figure shows the proportion of people in the survey who said they were the parent or
guardian of a child younger than 18 living in the household.

The plot looks like one column from a stacked bar graph.

The bigger a section, the more people gave that answer.

In this case, a mosaic plot is better than a treemap.
Subcategories

As shown in the figure, you can introduce another dimension.

More area means a higher percentage.
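
A minimal sketch of how the tile sizes of a two-variable mosaic plot can be computed, assuming the first variable sets the width and the second splits the height. The cross-tabulation below is hypothetical and does not reproduce the survey.

```python
# Hypothetical cross-tabulation: education level -> (parent, not a parent).
counts = {
    "High school": (30, 70),
    "College": (40, 60),
    "Graduate": (10, 40),
}

def mosaic_tiles(counts):
    """Return {(group, answer): (width, height)} on a unit square.

    Width is the group's share of all respondents; height is the share of
    each answer within that group, so area is the share of the whole sample.
    """
    grand_total = sum(sum(pair) for pair in counts.values())
    tiles = {}
    for group, pair in counts.items():
        width = sum(pair) / grand_total
        for answer, n in zip(("parent", "not parent"), pair):
            tiles[(group, answer)] = (width, n / sum(pair))
    return tiles

tiles = mosaic_tiles(counts)
for key, (w, h) in tiles.items():
    print(key, f"width={w:.2f} height={h:.2f} area={w * h:.2f}")
```

Because each tile's area equals its share of all respondents, the areas add up to 1.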

Subcategories

You can keep going and bring in a third variable.

The orientation of education and parenting is the same, but you can also see e-mail usage.

• Notice the vertical split on the subsection in the figure.

• You could keep on adding variables, but as you can see, the plot becomes more challenging to read as it grows, so proceed with caution.
What to Look for

With categorical data, you often look for the minimum and maximum right away.

• This gives you a sense of the range of the dataset, and is easily found with a quick sort of the values.

• After that, look at the distribution of the parts. Are most values high? Low?

• Look for structure and patterns.

• If a couple of categories have the same value or highly differing ones, it’s worth asking why.
Visualizing Time Series Data

When you visualize time series data, your goal is to see what has passed, what is different, what is the same, and by how much it has changed.
Visualizing Time Series Data

The bar chart is a straightforward way to look at data over time, except instead of categories
on one of the axes, you use time.

The figure shows the unemployment rate in the United States from 1948 to 2012, according to
the Bureau of Labor Statistics.
Visualizing Time Series Data

A dot plot can be used in the same way, as shown in the figure.

The data and axes are the same, but the visual cue is different.
Visualizing Time Series Data

Like bar charts, dots put focus on each value, and trends can be harder to see.

When you connect the sparse dots with a line, the focus of the plot shifts again.
Visualizing Time Series Data

If you care more about the overall trend than about the specific monthly variability, you can fit a smoothing curve to the dots instead of connecting every dot, as shown in the figure.
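
The slides do not say which smoother the figure uses. As a simple stand-in, the sketch below applies a centered moving average, which is one basic way to expose a trend; the monthly rates are hypothetical.

```python
def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Hypothetical monthly rates in percent; not Bureau of Labor Statistics data.
rates = [5.0, 5.4, 4.9, 5.6, 6.1, 5.8, 6.5, 7.0, 6.6, 7.2]
trend = moving_average(rates, window=3)
print([round(t, 2) for t in trend])
```

A wider window gives a smoother curve but hides more of the month-to-month variability.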
Cycles

A number of factors feed into the economy and affect the unemployment rate, so there aren’t regular intervals between significant increases.

However, many things repeat at regular intervals. For example:
• More people travel during the summer months.
• More people leave work around 5 in the afternoon and head home.
• More accidents occur on Saturday than on any other day of the week.
Cycles

Flight data from the Bureau of Transportation Statistics shows a similar cycle, as shown in the figure.

The chart shows a weekly cycle, with the fewest flights on Saturdays and typically the most flights on Fridays.
Cycles

Because the data repeats itself, it makes sense to compare like days of the week to each other.
For example, compare all Mondays.

In order to do this, you can split the days into weekly segments so that you can directly
compare cycles, as shown in the figure, with both the line chart and star plot.
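
The weekly split can be sketched as follows: group a daily series by day of the week so that all Mondays, all Tuesdays, and so on sit side by side. The flight counts and the start date are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def by_weekday(start, daily_values):
    """Group a daily series by day of the week so like days can be compared."""
    groups = defaultdict(list)
    for offset, value in enumerate(daily_values):
        day = start + timedelta(days=offset)
        groups[DAYS[day.weekday()]].append(value)  # weekday(): Monday is 0
    return dict(groups)

# Hypothetical daily flight counts for two weeks starting on a Monday.
start = date(2024, 1, 1)
flights = [90, 92, 93, 95, 97, 70, 85,
           91, 93, 92, 96, 98, 72, 84]

groups = by_weekday(start, flights)
for name, values in groups.items():
    print(f"{name:<9} {values}")
```

Each list can then be drawn as its own line, or the weeks can be overlaid, to compare one cycle against the next.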
Cycles

The figure shows the data in a familiar calendar format. The first column is Sunday, the second is Monday, and so on, to Saturday at the end.

With the calendar heat map, along with seeing cycles as you scan top to bottom, it’s easy to see specific days in rows and columns, so it’s easier to reference what day of the year each value is for.

A disadvantage of the calendar is that color is the visual cue, and it can be hard to see small differences.
Cycles

Figure annotations: Memorial Day (May 30), Independence Day (July 4), Labor Day (September 5), Thanksgiving, Christmas (December 25)
Types of Graphs: Slopegraph

Slopegraphs are useful when you have two time periods or points of comparison and
want to quickly show relative increases and decreases or differences across various
categories between the two data points.

• In addition to the points, the lines that connect them give you the visual increase or
decrease in rate of change (via the slope or direction).
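
As a small sketch of what the connecting lines encode, the snippet below computes the change between two periods for each category and labels its direction. The category names and scores are hypothetical.

```python
# Hypothetical survey scores (percent favorable) for two periods.
scores = {
    "Category A": (85, 91),
    "Category B": (80, 96),
    "Category C": (76, 75),
    "Category D": (59, 45),
}

def slopes(scores):
    """Return (category, change) pairs, sorted from largest drop to largest gain."""
    changes = [(name, after - before) for name, (before, after) in scores.items()]
    return sorted(changes, key=lambda item: item[1])

changes = slopes(scores)
for name, change in changes:
    direction = "up" if change > 0 else "down" if change < 0 else "flat"
    print(f"{name}: {change:+d} ({direction})")
```

On the slopegraph, the sign of each change is the direction of the line and its size is the steepness.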
Types of Graphs: Slopegraph

Imagine that you are analyzing and communicating data from a recent employee feedback survey.

To show the relative change in survey categories from 2014 to 2015, the slopegraph might look something like the adjacent figure.

Types of Graphs: Slopegraph

We can also draw attention to the single category that decreased over time from
the preceding example.
Visualizing Spatial Data

There is a natural hierarchy to spatial data that allows, and often requires, you to explore at different granularities.

The most obvious way to explore spatial data is with maps, which place values within a geographic coordinate system.

The figure shows some options.
Visualizing Spatial Data

If you care only about individual locations, you can place dots on a map.
Visualizing Spatial Data

The figure uses bubbles for the airports, sized by the number of outgoing flights.

With the addition of area as a visual cue, you don’t just see where the busiest airports are, but also how busy they are relative to each other.
Visualizing Spatial Data

Rather than separate locations, you might want to explore connections between locations.

The brighter a line is, the more flights went between those two airports.
Visualizing Spatial Data

But there’s more you can take away from this data by splitting it into categories.

For example, map flights by airline, and you see the data with a new dimension.
Regions

To maintain the privacy of individuals and to keep personal addresses hidden, it’s common to
aggregate spatial data before releasing it.

Choropleth maps are the most common way to visualize regional data in a spatial context.
Regions

The map shows the unemployment rate by county during August 2012.

You can see high rates on the West Coast and in the Southeast and lower unemployment in the Midwest.
Regions

There are also times when you’re more interested in the aggregates.

For example, the figure shows all recorded UFO sightings between 1906 and 2007, according to the National UFO Reporting Center.
Regions

The figure shows the same data, but as a filled contour map.

• A color scale is used to show sightings density, where white means more sightings, black means none, and varying shades of red cover everything in between.
Cartograms

A challenge with mapping regions, the choropleth map in particular, is that larger regions
always get more visual attention regardless of the data.

Cartograms are one way to remedy this. Location is somewhat preserved, but geographical
areas and boundaries are not.

For example, the figure shows the UFO sighting data as a cartogram. Notice the shrinking of
Texas and swelling of California.
Cartograms

The upside of cartograms is that areas fill the appropriate amount of space, but the trade-off is less geographic accuracy.

When your data is for larger regions with a wide range of sizes, this trade-off is worth it, but when regions are uniform in size, a choropleth map is most likely a better fit.

What to Look for

Spatial data is a lot like categorical data, but with a geographic component.

• You should know the range of the data to start with.

• Then look for regional patterns.

Multiple Variables

A Few Variables

The dot plots discussed previously placed time on the horizontal axis and a variable on the vertical axis.

A scatter plot replaces time with a different variable, so you have two variables plotted against each other, as shown in the figure.
Multiple Variables

A Few Variables

A statistical relationship in which two variables tend to change together is called correlation.

The correlation strength can vary, as shown in the figure.
Multiple Variables

A Few Variables

For a more defined view of how two variables are related, you can fit a line through the points, as shown in the figure.
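
A minimal sketch of both ideas, assuming an ordinary least-squares line and the Pearson correlation coefficient (the slides do not specify which fit the figure uses). The paired measurements are hypothetical.

```python
import math

def pearson_and_line(xs, ys):
    """Return (r, slope, intercept) for paired observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)        # Pearson correlation coefficient
    slope = sxy / sxx                     # least-squares slope
    intercept = mean_y - slope * mean_x   # line passes through the means
    return r, slope, intercept

# Hypothetical paired measurements.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, slope, intercept = pearson_and_line(xs, ys)
print(f"r = {r:.3f}, line: y = {slope:.2f}x + {intercept:.2f}")
```

Values of r near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.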
Don’t mix up causal and correlative relationships.

They look the same when you visualize them, but the former is more difficult to prove than the latter.

http://www.tylervigen.com/spurious-correlations
Multiple Variables
A Few Variables

The figure shows two ways to incorporate a third variable in a scatter plot.

In the scatter plot on the left, the area of a circle represents assists per game.

The scatter plot on the right uses color to show the same thing. The darker the shade, the more assists per game.
Multiple Variables
A Few Variables

In the figure, you see assist leaders closer toward the right corner of higher usage percentage
and points, but there’s high variability and there isn’t a clear trend.
Multiple Variables
A Few Variables

The figure shows the same values on the axes, usage percentage and points per game, but uses
area for rebounds and color for assists.
Multiple Variables
Many Variables

A heat map, as shown in the figure, can be used to translate a table to a set of colors.

It shows the same basketball player data, in addition to several other variables (number of games played, field goal percentage, and three-point percentage).

Each row represents a player, and darker shades represent relatively higher values.
Multiple Variables
Many Variables

With players sorted alphabetically, it’s hard to see patterns, but if you sort by a column, say,
points per game, as shown in the figure, relationships are easier to see.
Multiple Variables

Many Variables

Parallel coordinates plots also arrange variables horizontally, but instead of using color like a
heat map, you use vertical position, as shown in the figure.

On each vertical axis, the highest value is plotted at the top and the lowest at the bottom.

Then lines are drawn left to right, positioned by the variables of each observation.
Multiple Variables

Many Variables

The figure shows a few more relationships.

When there aren’t clear relationships across the board, it can be hard to see patterns.

There’s high variability from player to player in the figure, so you end up with a jumble of lines.
Multiple Variables

Many Variables

If you highlight players who averaged five assists or more and gray out everyone else, it’s easier to see how these types of players perform in other categories.
Multiple Variables

Many Variables

Whereas the heat map and parallel coordinates plot provide an overview of the data, you might
also want to look at individual data points more closely.

Star plots present data separately.

That is, you represent each row of data with its own plot.
Multiple Variables

Using Multiple Views

It’s also often useful to look at data with different views at the same time.
Multiple Variables
Using Multiple Views

A scatter plot matrix can show similar relationships.

Multiple Variables

Using Multiple Views

If you have multiple variables that might be categorical, temporal, and spatial, it is often better to use multiple charts.

The figure explores the time series component of the flight data. Each line represents flight volume for an airline.
Distributions

Imagine there are 100 adults in a room. These 100 people have different heights, as shown in the figure.

They range from 4 feet 10 inches to 6½ feet tall, and the average height for the group is 5 feet 4 inches.

It’s hard to determine how many people there are in various height ranges without counting each dot.
Distributions

You can get a better idea if you sort everyone from shortest to tallest, as shown in the figure.

The median line at 64 inches is in the middle, where 50 people are shorter and 50 people are
taller.
Distributions

A better way to see the distribution is to group the heights into categories or bins, such as those between 4 feet and 4½ feet.

However, the dot plot can take up a lot of space, especially if you have many more heights to show.

Instead of dots, you could use bars (a histogram).
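
The binning step can be sketched as below, using half-foot (6-inch) bins starting at 4 feet. The heights are hypothetical rather than the 100 heights in the figure, and `bin_counts` is an illustrative helper.

```python
from collections import Counter

def bin_counts(values, width, start):
    """Count how many values fall in each bin [lower, lower + width)."""
    counts = Counter(start + width * ((v - start) // width) for v in values)
    return dict(sorted(counts.items()))

# Hypothetical heights in inches; not the 100 heights from the figure.
heights = [58, 60, 61, 63, 63, 64, 64, 65, 66, 68, 70, 72, 78]

hist = bin_counts(heights, width=6, start=48)  # half-foot bins from 4 feet
for lower, count in hist.items():
    # Draw a text bar whose length is the number of people in the bin.
    print(f"{lower}-{lower + 6} in: {'#' * count}")
```

Changing `width` regroups the same heights into coarser or finer bins, which can change the shape the histogram appears to have.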

Distributions

As shown in the figure, you can visualize distributions with varying levels of granularity.

Some views, such as the median, show only summary statistics, whereas other views, such as the histogram, show the distribution in greater detail.
Distributions

The box plot, as shown in the figure, is an overview visualization that provides a general sense of distribution.

The box in the middle is defined by the lower and upper quartiles.

• The lower quartile is the value below which one-quarter of the values fall.

• The upper quartile is the value above which one-quarter of the values fall.
Distributions

The range between the upper and lower quartiles is called the interquartile range.

• The outer lines are the lower and upper fences, defined by subtracting 1½ times the interquartile range from the lower quartile and adding it to the upper quartile, respectively.

• If the maximum and minimum values are within the upper and lower fences, the outer lines are drawn only to the extremes.

• Otherwise, dots are used to represent any points that fall outside the upper and lower fences; these are considered outliers.
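
The definitions above can be sketched with Python's standard library. Quartile conventions differ between tools, so the `"inclusive"` method used here is an assumption, and the heights are hypothetical.

```python
import statistics

def box_plot_stats(values):
    """Quartiles, fences, whisker ends, and outliers for a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "fences": (lower_fence, upper_fence),
        # Outer lines stop at the most extreme values still inside the fences.
        "whiskers": (min(inside), max(inside)),
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }

# Hypothetical heights in inches.
heights = [58, 60, 61, 63, 63, 64, 64, 65, 66, 68, 70, 72, 90]
stats = box_plot_stats(heights)
print(stats)
```

Here the single value beyond the upper fence would be drawn as a dot, while the outer lines stop at the smallest and largest values inside the fences.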
Distributions

You can also use multiple box plots to compare distributions.

Distributions

The figure shows how the same height data can be represented with different bins.
Distributions
You can also use multiple histograms to compare distributions.

Returning to the flight data, the figure shows the distributions of arrival delays for major airlines.

Delays of more than 15 minutes are highlighted in orange.

Regardless of the type of visualization you use to explore distributions, look for peaks and valleys, range, and the spread of your data, which tell you a lot more than the mean and median.
Summary
Take Home Points
Visualization can be a great tool to explore your data.

The key to getting the most out of your data isn’t so much finding the right software as learning how to use the tools you have and knowing what questions to ask:

- Consider what data you have and what you can get
- Where the data is from
- How it was derived
- What all the variables mean
- Let that extra information guide your visual exploration.

Even if your goal is to visualize data for presentation, exploration can lead to unexpected
insights, which makes for better graphics.
