Module 3 - Time Oriented Data-1
• Visual Analysis of data from various domains [12 Hrs] [Bloom’s Level Selected: Apply]
• Time-series data analysis is increasingly important in many industries, including finance,
pharmaceuticals, social media, web services, and research.
Examples of Time Series Data:
• Electrical activity in the brain
• Rainfall measurements
• Stock prices
• Number of sunspots
• Annual retail sales
• Monthly subscribers
• Heartbeats per minute
How is Time measured?
• Scale
• When are the data measurements/samples taken?
• Ordinal -- only relative order is known (before, during, after)
• Discrete -- clear intervals (seconds, minutes, hours, ...)
• Continuous -- time maps to the real numbers; values between discrete samples can be interpolated
• Scope
• The range of time associated with a measurement/sample
• Point -- the sample is from a point in time that has no duration
• Interval-based -- there is a duration; a start and end
• These time primitives can be anchored (absolute) or unanchored (relative)
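The scale and scope ideas above can be sketched with Python's standard `datetime` module. This is a minimal illustration, not a prescribed representation: the dates, the temperature readings, and the helper names `contains` and `interpolate` are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Point: an anchored instant with no duration.
point = datetime(2024, 3, 1, 9, 0)

# Interval: an anchored start and end, so it has a duration.
interval = (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 17, 0))

# Unanchored (relative) primitive: a span with no fixed position in time.
span = timedelta(hours=8)


def contains(iv, t):
    """Return True if instant t falls inside the closed interval iv."""
    start, end = iv
    return start <= t <= end


def interpolate(t0, v0, t1, v1, t):
    """Linearly interpolate a value at time t between two discrete samples."""
    fraction = (t - t0) / (t1 - t0)  # timedelta / timedelta gives a float
    return v0 + fraction * (v1 - v0)


# Hypothetical hourly temperature samples; estimate the value at 09:30.
t0, t1 = datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 10, 0)
estimate = interpolate(t0, 12.0, t1, 14.0, datetime(2024, 3, 1, 9, 30))

print(interval[1] - interval[0] == span)  # anchoring the span at 09:00 gives the interval
print(contains(interval, point))
print(estimate)
```

Anchoring the relative span at a start time produces the absolute interval, which is the distinction the last bullet draws.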
• Arrangement
• Time often has a cyclical nature, compared to the linear nature described above
• Hourly Cycle
• 24 hours in a Daily Cycle
• 7 days in a Weekly Cycle (Mon -> Tue -> ... -> Sun -> Mon)
• About 30 days in a Monthly Cycle
• Lunar cycle
• Quarterly/Seasonal Cycle (financial, astronomical, meteorological)
• 365 days, 52 weeks, 12 months in a Yearly Cycle
• Decades
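As a rough sketch of the cyclical arrangement, the snippet below groups samples by their position in the weekly cycle and measures distance around a cycle, so that Sunday and Monday come out as neighbours. The readings and the helper name `circular_distance` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def circular_distance(a, b, period):
    """Shortest distance between two positions on a cycle of the given length."""
    d = abs(a - b) % period
    return min(d, period - d)


# Hypothetical readings: (timestamp, value).
samples = [
    (datetime(2024, 1, 1, 8), 10.0),  # a Monday
    (datetime(2024, 1, 7, 8), 30.0),  # a Sunday
    (datetime(2024, 1, 8, 8), 20.0),  # the following Monday
]

# Group by position in the weekly cycle (0 = Monday ... 6 = Sunday).
by_weekday = defaultdict(list)
for ts, value in samples:
    by_weekday[ts.weekday()].append(value)

weekday_means = {day: sum(v) / len(v) for day, v in by_weekday.items()}

print(weekday_means)
print(circular_distance(6, 0, 7))    # Sunday and Monday are adjacent in the cycle
print(circular_distance(23, 1, 24))  # 23:00 and 01:00 are two hours apart
```

On a linear time axis the two Mondays are a week apart; on the weekly cycle they fall on the same position and are averaged together.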
• Viewpoint
• Branching -- There may be a partial ordering; events may occur in parallel streams.
Text Data Visualization Examples
• Word Cloud
• Slope chart
• Sankey chart
• Collocate Clouds
• Word Art
Word Cloud
• A word cloud is a grouping of keywords or tags that uses color and font size to form a
shape or figure you can easily recognize.
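A minimal sketch of the sizing step behind a word cloud is shown below. It assumes that font size scales linearly with word frequency, which is a common convention rather than something stated above; the function name `word_sizes`, the point sizes, and the sample sentence are illustrative. Stop words are not filtered here.

```python
import re
from collections import Counter


def word_sizes(text, min_pt=10, max_pt=48, top_n=5):
    """Map the most frequent words to font sizes, scaled linearly by count."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    spread = (highest - lowest) or 1  # avoid division by zero when all counts match
    return {
        word: round(min_pt + (count - lowest) / spread * (max_pt - min_pt))
        for word, count in counts
    }


sample = "time series data shows how data changes over time as time passes"
sizes = word_sizes(sample, top_n=3)
print(sizes)
```

A layout step, not shown, would then place each word at its computed size inside the chosen shape.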
Slope chart
• To highlight transitions, absolute values, rankings, and long-term variations, slope charts
(or slope graphs) are the right text data visualization.
• Slope charts are well suited to comparing two time periods or other points of reference
when you want to underline rises and drops across diverse categories between the two
data points.
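One way to draw a slope chart is sketched below with matplotlib. The categories, values, and year labels are made up for illustration, and coloring rises green and drops red is a design choice of this sketch rather than a rule.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical values for three categories at two points in time.
values = {"Product A": (40, 65), "Product B": (55, 50), "Product C": (30, 45)}
labels = ("2020", "2023")

fig, ax = plt.subplots(figsize=(4, 5))
for name, (start, end) in values.items():
    color = "tab:green" if end >= start else "tab:red"  # rise vs. drop
    ax.plot([0, 1], [start, end], marker="o", color=color)
    ax.text(-0.05, start, f"{name} {start}", ha="right", va="center")
    ax.text(1.05, end, f"{end}", ha="left", va="center")

ax.set_xticks([0, 1])
ax.set_xticklabels(labels)
ax.set_xlim(-0.6, 1.3)
ax.yaxis.set_visible(False)  # values are printed next to the points instead
for side in ("left", "right", "top"):
    ax.spines[side].set_visible(False)

fig.savefig("slope_chart.png", bbox_inches="tight")
```

Each category becomes one line segment between the two reference points, so the steepness and direction of the segment show the size and sign of the change.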
Sankey Chart
• With a Sankey chart, you can visualize how one group of values flows to the next group.
The interconnected groups are called ‘nodes’ and the connections between them are ‘links’.
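The node and link structure can be sketched in plain Python before any drawing happens. The flow records below are hypothetical; the resulting layout (a list of node labels plus links that refer to nodes by index) resembles what plotting libraries such as Plotly's `go.Sankey` accept, but check the library's documentation before relying on that.

```python
from collections import defaultdict

# Hypothetical flows: (source group, target group, value).
flows = [
    ("Search", "Home page", 120),
    ("Social", "Home page", 80),
    ("Home page", "Sign-up", 90),
    ("Home page", "Exit", 110),
]

# Nodes are the distinct groups, in order of first appearance.
nodes = list(dict.fromkeys(name for src, dst, _ in flows for name in (src, dst)))
index = {name: i for i, name in enumerate(nodes)}

# Links connect node indices and carry the flow size (drawn as band width).
links = [{"source": index[s], "target": index[t], "value": v} for s, t, v in flows]

# Total flow into and out of each node.
inflow, outflow = defaultdict(int), defaultdict(int)
for s, t, v in flows:
    outflow[s] += v
    inflow[t] += v

print(nodes)
print(links[0])
print(inflow["Home page"], outflow["Home page"])
```

Comparing inflow and outflow per node is a quick sanity check: for an intermediate node they should match if nothing is lost between stages.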
MULTIVARIATE DATA VISUALIZATION
Introduction
• Multivariate data visualization is a way to display and analyze data with more
than two variables. It allows you to see relationships between multiple
variables at once and can help identify patterns and trends in the data.
• There are several types of multivariate data visualizations, including scatter
plots, heatmaps, parallel coordinate plots, and treemaps. Each type of
visualization has its strengths and weaknesses, and the choice of which one to
use depends on the data being analyzed and the questions being asked.
Types of Multivariate Data Visualization
1. Scatter Plots
2. Parallel coordinate plots
3. Tree maps
4. Line Graphs
5. Region Based Techniques
1. Scatter Plots
• Scatterplots and scatterplot matrices -- take pairs of attributes, generally ordinal but not
necessarily, and plot the values for immediate determination of relationships.
• Selected data is shown in red. A raw data sample from the iris dataset:
  Sepal Len   Sepal Width   Petal Len   Petal Width
  5.1         3.5           1.4         0.2
  4.9         3.0           1.4         0.2
  4.7         3.2           1.3         0.2
  4.6         3.1           1.5         0.2
  5.0         3.6           1.4         0.2
  …
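A scatterplot matrix for the five sample rows above might be built as follows. This sketch uses matplotlib directly; which rows count as "selected" is an assumption made here to show the red highlighting.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt

columns = ["Sepal Len", "Sepal Width", "Petal Len", "Petal Width"]
rows = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (4.7, 3.2, 1.3, 0.2),
    (4.6, 3.1, 1.5, 0.2),
    (5.0, 3.6, 1.4, 0.2),
]
selected = {0, 4}  # hypothetical selection, highlighted in red
colors = ["red" if i in selected else "gray" for i in range(len(rows))]

n = len(columns)
fig, axes = plt.subplots(n, n, figsize=(8, 8))
for r in range(n):
    for c in range(n):
        ax = axes[r, c]
        if r == c:
            # Diagonal cells only name the attribute.
            ax.text(0.5, 0.5, columns[r], ha="center", va="center",
                    transform=ax.transAxes)
            ax.set_xticks([])
            ax.set_yticks([])
        else:
            # Off-diagonal cells plot one attribute pair: column c on x, row r on y.
            xs = [row[c] for row in rows]
            ys = [row[r] for row in rows]
            ax.scatter(xs, ys, c=colors)

fig.savefig("scatter_matrix.png")
```

Every off-diagonal cell shows one attribute pair, and the same rows keep the same color in every cell, so a selection can be followed across all pairs.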
2. Parallel coordinate plots
• A parallel coordinate plot maps each row in the data table as a line, or profile. Each
attribute of a row is represented by a point on the line. This makes parallel coordinate
plots similar in appearance to line charts, but the way data is translated into a plot is
substantially different.
• Consider, for example, a data table where a laboratory has measured the amount of
various carbohydrates contained in various fruit and vegetables.
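The mapping from table rows to profiles can be sketched as below. The carbohydrate figures are invented placeholders, not laboratory measurements, and the sample names are hypothetical. Rescaling each attribute to the range 0 to 1 is one common choice for putting differently scaled axes side by side.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical carbohydrate content (g per 100 g); placeholder values only.
attributes = ["Glucose", "Fructose", "Sucrose", "Starch"]
table = {
    "Sample A": [2.0, 6.0, 2.5, 0.0],
    "Sample B": [5.0, 5.0, 2.5, 5.0],
    "Sample C": [0.5, 0.5, 0.0, 15.0],
}


def normalize_columns(rows):
    """Rescale each attribute to [0, 1] so axes with different ranges are comparable."""
    cols = list(zip(*rows))
    out = []
    for row in rows:
        scaled = []
        for value, col in zip(row, cols):
            lo, hi = min(col), max(col)
            scaled.append((value - lo) / (hi - lo) if hi > lo else 0.5)
        out.append(scaled)
    return out


profiles = normalize_columns(list(table.values()))

fig, ax = plt.subplots()
positions = list(range(len(attributes)))
for name, profile in zip(table, profiles):
    ax.plot(positions, profile, marker="o", label=name)  # one line per table row
for x in positions:
    ax.axvline(x, color="lightgray", linewidth=1)        # one vertical axis per attribute
ax.set_xticks(positions)
ax.set_xticklabels(attributes)
ax.legend()
fig.savefig("parallel_coordinates.png")
```

Unlike a line chart, the x positions here are unordered attributes rather than a continuous variable, so the slope between two axes has no meaning beyond linking the two values of the same row.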
4. Line Graphs
• A line graph consists of data points connected by line segments, showing the relationship
between the x and y variables. The x-axis usually holds time intervals, while the y-axis
holds a numeric variable whose changes we want to track over time.
5. Region Based Techniques
• Bar charts and histograms. In a bar chart the width of the bar is usually not significant;
in a histogram it represents the bin range.
• Stacked bar charts, clustered bar charts, 3D bar charts, and pie charts.
• Cityscape charts on geospatial plots.
Case Study 1: Causes of Death in the United States (1999–2015)
This case study will try to answer the following questions:
• What was the total number of deaths in the United States from 1999 to 2015?
• What is the number of deaths for each year from 1999 to 2015?
• What were the top causes of death in the United States during this period?
Data Gathering
The data set in this case study comes from U.S. government open data, which can be
accessed through
https://data.gov
You can download it from here:
https://catalog.data.gov/dataset/age-adjusted-death-rates-for-the-top-10-leading-causes-of-death-united-states-2013
Data Analysis
• What is the total number of recorded death cases?
• What were the causes of death in this dataset?
Unique States in the Study
• What was the total number of deaths in the United States from 1999 to 2015?
• What is the number of deaths for each year from 1999 to 2015?
• Which ten states had the highest number of deaths?
• What were the top causes of death in the United States during this period?
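The questions above could be answered with pandas roughly as follows. This is a sketch on a tiny made-up extract, not the real file: the column names `Year`, `Cause Name`, `State`, and `Deaths` are assumptions about the downloaded CSV and should be checked against it. If the file contains aggregate rows (for example a national total alongside the states, or an all-causes row), they need to be excluded first to avoid double counting.

```python
import io

import pandas as pd

# Tiny hypothetical extract; a real analysis would call pd.read_csv on the downloaded file.
# Column names are assumed and should be checked against the actual CSV.
csv_text = """Year,Cause Name,State,Deaths
1999,Heart disease,Alabama,100
1999,Cancer,Alabama,80
1999,Heart disease,Texas,300
2000,Heart disease,Alabama,90
2000,Cancer,Texas,250
2000,All causes,Texas,900
2000,Cancer,United States,5000
"""
df = pd.read_csv(io.StringIO(csv_text))

# Drop aggregate rows, if present, so deaths are not counted twice.
detail = df[(df["Cause Name"] != "All causes") & (df["State"] != "United States")]

unique_states = sorted(detail["State"].unique())
total_deaths = int(detail["Deaths"].sum())
deaths_per_year = detail.groupby("Year")["Deaths"].sum()
top_states = detail.groupby("State")["Deaths"].sum().nlargest(10)
top_causes = detail.groupby("Cause Name")["Deaths"].sum().sort_values(ascending=False)

print(unique_states)
print(total_deaths)
print(deaths_per_year)
print(top_states)
print(top_causes)
```

Each grouped result is a pandas Series indexed by year, state, or cause, so it can be passed straight to a bar or line plot.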
Findings