Data visualization
Presented By
Dr. Deepa A
Topics Overview
Data Visualization Introduction – What is DV?
Why DV?
Benefits
Who Uses DV
Steps in Data Visualization
DV Techniques
Data Visualization Tools
Techniques in Programming
Examples
INTRODUCTION
What is DV?
Data visualization (DV) is the practice of translating information into a
visual context, such as a map or graph.
It is one of the steps of the data science process: after data has been
collected, processed and modeled, it must be visualized for conclusions
to be made.
The term is often used interchangeably with others, including
information graphics, information visualization and statistical
graphics.
Why DV?
The main goal of data visualization is to make it easier to identify
patterns, trends and outliers in large data sets.
DV aims to identify, locate, manipulate, format and deliver data
in the most efficient way possible.
Benefits of DV
Simple and fast way to transmit and interpret business information.
Better understanding of operational performance and business activities.
Rapid identification of trends and areas that need attention.
Analysis of patterns and understanding the impact of strategies implemented.
Direct and custom interaction with data and scenario prediction.
Meaningful interpretation of a large volume of data.
Process optimization and decision making based on facts.
Increases productivity and ROI.
Streamlines processes and ensures strategy assertiveness.
Risk reduction and optimization of time and resources.
Who uses DV
Data visualization is important for almost every career.
DV in Education:
To monitor students' learning progress throughout the semester.
Advisors can take prompt action to help students who are failing.
It can be used by teachers to display student test results, by
computer scientists exploring advancements in artificial intelligence
(AI) or by executives looking to share information with stakeholders.
DV in Business
Provides businesses with a vastly improved decision-making process because it
enhances the available information and represents it in a pictorial format.
Visualization charts with actionable data can display and help to communicate
the message effectively.
Data visualization can support organization leaders in identifying patterns and
gaps and interpreting the information in a meaningful manner.
DV in Military
For the military, clear and actionable data is critical.
It allows accurate information to be shared quickly in the most concise structure.
A better understanding of past data can make predictions more accurate.
DV Steps
Develop your research question.
Get or create your data.
Clean your data.
Choose a chart type.
Choose your tool.
Prepare the data.
Create the chart.
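The steps above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the research question, the CSV text, the column names and the output file name are all made up, and matplotlib is assumed as the chosen tool.

```python
import csv
import io
from collections import defaultdict

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Steps 1-2: research question and data. Hypothetical question:
# "Which region sold the most units?" The CSV text below is made up.
raw = io.StringIO(
    "region,units\n"
    "North,120\n"
    "South,95\n"
    "North,80\n"
    "East,\n"
    "South,60\n"
)

# Step 3: clean the data by dropping rows with a missing value (the East row).
rows = [r for r in csv.DictReader(raw) if r["units"].strip()]

# Steps 4-6: a bar chart suits a comparison across categories, matplotlib
# is the tool, and the data is prepared by summing units per region.
totals = defaultdict(int)
for r in rows:
    totals[r["region"]] += int(r["units"])

# Step 7: create the chart.
fig, ax = plt.subplots()
ax.bar(list(totals), list(totals.values()))
ax.set_xlabel("Region")
ax.set_ylabel("Units sold")
fig.savefig("units_by_region.png")
```

In practice the cleaning and preparation steps usually take most of the effort; the plotting call itself is short.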
DV Techniques
Histogram
A chart that shows the frequency distribution of data points across a continuous
range of data values.
The bars of a histogram represent ranges (bins) along a continuous, quantifiable spectrum.
It displays numeric data in ranges.
A histogram resembles a bar chart, but it is used when the variable takes continuous
numeric values.
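A minimal sketch of a histogram with matplotlib, assuming made-up height values and an arbitrary choice of 10 bins. The `hist` call returns the count in each bin and the bin edges, which are the "ranges" described above.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

rng = random.Random(0)
heights = [rng.gauss(170, 8) for _ in range(200)]  # made-up heights in cm

fig, ax = plt.subplots()
# Each bar covers one range of values; its height is the number of points in it.
counts, edges, _ = ax.hist(heights, bins=10, edgecolor="black")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Frequency")
fig.savefig("histogram.png")
```

Changing `bins` changes how coarse or fine the ranges are, which can noticeably change how the distribution looks.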
Line Chart
A Line Chart is a Graphical representation of information that changes over time.
It is a visual comparison of how two variables shown on X and Y Axis are related or vary
with each other.
A Line Graph helps to determine the relationship between two sets of values.
When changes are minor, it is better to use a line chart than a bar graph.
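A minimal line chart sketch with matplotlib. The monthly visitor counts are made up for illustration; time goes on the X axis and the measured value on the Y axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
visitors = [120, 135, 150, 145, 170, 190]  # made-up values

fig, ax = plt.subplots()
# Points are joined in order, so the slope between them shows the change over time.
(line,) = ax.plot(months, visitors, marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Visitors")
fig.savefig("line_chart.png")
```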
Area Plot
An area chart displays quantitative data graphically.
It is based on the line chart.
The area between the axis and the line is commonly emphasized with colors and textures.
It is used to showcase data that depicts a time series relationship.
It is a great chart to visualize a volume change over a period of time.
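An area plot can be sketched as a line chart with the region under the line filled in. The yearly volumes below are made up; `fill_between` is the matplotlib call that shades the area between the line and the axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

years = [2018, 2019, 2020, 2021, 2022]
volume = [10, 14, 9, 18, 22]  # made-up values

fig, ax = plt.subplots()
ax.plot(years, volume, color="tab:blue")
# Shade the region between the line and y = 0 to emphasize volume.
area = ax.fill_between(years, volume, color="tab:blue", alpha=0.3)
ax.set_xticks(years)
ax.set_xlabel("Year")
ax.set_ylabel("Volume")
fig.savefig("area_plot.png")
```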
Scatter Chart
A scatter plot (or scatter chart) uses dots to represent values for two different numeric variables.
Scatter plots are used to observe relationships between variables.
The position of each dot on the horizontal and vertical axis indicates the values for an
individual data point.
It is particularly useful for researchers, economists, scientists and journalists.
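A minimal scatter plot sketch with matplotlib, using made-up study hours and test scores. Each dot is one data point, placed by its value on each axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

hours = [1, 2, 3, 4, 5, 6, 7, 8]           # made-up values
scores = [52, 55, 61, 64, 70, 74, 79, 85]  # made-up values

fig, ax = plt.subplots()
# One dot per (hours, score) pair; an upward drift suggests a positive relationship.
points = ax.scatter(hours, scores)
ax.set_xlabel("Hours studied")
ax.set_ylabel("Test score")
fig.savefig("scatter.png")
```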
Pie Chart
A circular chart with multiple divisions, where each division shows the
contribution of one value to the total.
It is a graphical representation of information that shows parts of a whole.
One can easily see the biggest or smallest share of the total.
It displays the relative proportions of multiple classes of data.
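The part-to-whole idea can be sketched by computing each share as a percentage of the total and passing the raw values to matplotlib's `pie`. The budget categories and amounts are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

budget = {"Rent": 900, "Food": 450, "Transport": 150, "Other": 300}  # made-up values

# Each slice's share of the whole, in percent.
total = sum(budget.values())
shares = {name: 100 * amount / total for name, amount in budget.items()}

fig, ax = plt.subplots()
# pie() scales the values to a full circle; autopct prints each share on its wedge.
wedges, labels, percents = ax.pie(
    list(budget.values()), labels=list(budget), autopct="%1.0f%%"
)
fig.savefig("pie_chart.png")
```

Pie charts tend to work best with a handful of categories; with many small slices the shares become hard to compare.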
Bar Chart
A Bar chart represents categorical Data with rectangular bars with
heights proportional to the values that they represent.
It is used when you want to show a distribution of data points or
perform a comparison of metric values across different subgroups of
your data.
From a bar chart, we can see which groups are highest or most
common.
A bar chart can be of two types: horizontal or vertical.
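Both bar chart types can be sketched side by side with matplotlib: `bar` draws vertical bars and `barh` draws horizontal ones. The categories and counts are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

languages = ["Python", "R", "JavaScript"]
respondents = [48, 21, 31]  # made-up values

fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(8, 3))
vertical = ax_v.bar(languages, respondents)     # bar height encodes the value
horizontal = ax_h.barh(languages, respondents)  # bar width encodes the value
ax_v.set_ylabel("Respondents")
ax_h.set_xlabel("Respondents")
fig.tight_layout()
fig.savefig("bar_charts.png")
```

Horizontal bars are often easier to read when the category labels are long.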
Box plots
A box plot summarizes large amounts of data and makes outlier values visible.
A box-and-whisker plot displays the five-number summary of a set of data.
The five-number summary is the minimum, first quartile, median, third quartile
and maximum.
In Box Plot, we draw a box from the first quartile to the third quartile.
A vertical line goes through the box at the median.
The whiskers go from each quartile to the minimum or maximum.
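The five-number summary can be computed with Python's standard `statistics` module and then drawn with matplotlib. The scores are made up, and the "inclusive" quartile method is an assumption: tools differ in how they interpolate quartiles, so other software may report slightly different values for the same data.

```python
import statistics

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

scores = [2, 4, 4, 5, 7, 8, 9, 11, 12]  # made-up values

# Quartiles split the sorted data into four parts.
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
five_number = (min(scores), q1, median, q3, max(scores))
print(five_number)  # (2, 4.0, 7.0, 9.0, 12)

fig, ax = plt.subplots()
# whis=(0, 100) makes the whiskers reach the minimum and maximum, as described above.
ax.boxplot(scores, whis=(0, 100))
fig.savefig("box_plot.png")
```

By default matplotlib instead limits the whiskers to 1.5 times the interquartile range and draws points beyond them individually, which is how outliers become visible.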
Pair Plot
pairplot is a function of the Seaborn library, which provides a high-level
interface for drawing attractive and informative statistical graphics.
It visualizes the given data to find relationships between variables,
which can be continuous or categorical.
A pair plot is a representation that plots the pairwise relationships in
the data.
It is used to find the best set of features to explain a
relationship between two variables or to form the most separated
clusters.
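Seaborn's `pairplot` draws this grid in a single call. To show what the grid contains, the sketch below builds a similar one by hand with matplotlib only: each variable's distribution on the diagonal and a scatter plot for every pair elsewhere. The three variables and their values are made up.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

rng = random.Random(1)
# Made-up measurements for three numeric variables.
height = [rng.gauss(170, 8) for _ in range(100)]
weight = [0.9 * h - 85 + rng.gauss(0, 5) for h in height]
age = [rng.uniform(18, 60) for _ in range(100)]
data = {"height": height, "weight": weight, "age": age}

names = list(data)
fig, axes = plt.subplots(len(names), len(names), figsize=(7, 7))
for row, y_name in enumerate(names):
    for col, x_name in enumerate(names):
        ax = axes[row, col]
        if row == col:
            ax.hist(data[x_name], bins=15)  # diagonal: one variable's distribution
        else:
            ax.scatter(data[x_name], data[y_name], s=8)  # pairwise relationship
        if row == len(names) - 1:
            ax.set_xlabel(x_name)
        if col == 0:
            ax.set_ylabel(y_name)
fig.tight_layout()
fig.savefig("pair_plot.png")
```

In this made-up data, the height/weight panels should show a clear trend while the panels involving age should look like an unstructured cloud.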
KDE Chart
A Kernel Density Estimation (KDE) plot is used for visualizing the
probability density of a continuous variable.
Interpretation of the density curve:
If the density curve is left skewed, the mean is typically less than the median.
If the density curve is right skewed, the mean is typically greater than the median.
If the density curve is symmetric (no skew), the mean is equal to the median.
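A small sketch of the idea, assuming a made-up sample and a fixed bandwidth of 1.0 (plotting libraries normally choose the bandwidth automatically). The `density` helper is a hand-written Gaussian kernel estimate for illustration, not a library function, and the mean/median comparison is a rule of thumb rather than a formal skewness test.

```python
import math
import statistics

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Made-up, right-skewed sample: most values are small, a few are large.
sample = [1, 2, 2, 3, 3, 3, 4, 10, 15]
bandwidth = 1.0  # assumed smoothing width

def density(x):
    """Gaussian kernel density estimate of `sample` at point x."""
    scale = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))
    return scale * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample)

mean = statistics.mean(sample)
median = statistics.median(sample)
# The long tail on the right pulls the mean above the median.
skew = "right" if mean > median else "left" if mean < median else "none"

xs = [i / 10 for i in range(-30, 200)]
fig, ax = plt.subplots()
ax.plot(xs, [density(x) for x in xs])
ax.axvline(mean, linestyle="--", label="mean")
ax.axvline(median, linestyle=":", label="median")
ax.legend()
fig.savefig("kde.png")
```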
Hex Bin Plot
A Hexbin plot is used to represent the relationship between two numerical variables
when many data points are present.
In a hexbin plot, the points do not overlap; instead, the plot area is split into several
hexagonal bins (hexbins).
It shows the density of data points in a 2D space.
The colour of each bin represents the number of data points within that bin.
It uses hexagons to split the area into several parts and assigns a colour to each.
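A minimal hexbin sketch with matplotlib's `hexbin`, using 5,000 made-up points. The `gridsize` of 25 is an arbitrary choice that controls how many hexagons span the X axis.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

rng = random.Random(2)
# Made-up data: enough points that a plain scatter plot would overlap heavily.
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [xi + rng.gauss(0, 1) for xi in x]

fig, ax = plt.subplots()
# Each hexagon is coloured by how many points fall inside it.
hexes = ax.hexbin(x, y, gridsize=25, cmap="viridis")
fig.colorbar(hexes, ax=ax, label="Points per hexagon")
fig.savefig("hexbin.png")

counts = hexes.get_array()  # one count per hexagon
```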
Heat Maps
A heat map is a graphical representation of data that uses a system of colour coding to
represent different values.
It is commonly used to show user behaviour on specific web pages.
It is a two-dimensional data visualization that represents the magnitude of individual
values within a data set as a colour.
Heat maps are applicable in A/B testing and are helpful in redesigning websites, content
marketing, etc.
Common website heat maps include click heatmaps, mouse tracking, eye tracking and
scroll maps.
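A minimal heat map sketch with matplotlib's `imshow`. The grid of click counts is made up, with each cell standing for one region of a hypothetical web page.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Made-up click counts per page region (3 rows x 4 columns).
clicks = [
    [5, 12, 30, 8],
    [2, 25, 60, 14],
    [1, 9, 22, 4],
]

fig, ax = plt.subplots()
# Each cell's colour encodes its value; the colour bar gives the scale.
image = ax.imshow(clicks, cmap="hot")
fig.colorbar(image, ax=ax, label="Clicks")
fig.savefig("heat_map.png")
```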
Data Visualization Tools
• Tableau
• Infogram
• ChartBlocks
• D3.js
• Google Charts
• Fusion Charts
• Chart.js
Tableau
Tableau has a variety of options available, including a desktop app,
server and hosted online versions, and a free public option.
There are hundreds of data import options available, from CSV files to
Google Ads and Analytics data to Salesforce data.
Output options include multiple chart formats as well as mapping
capability. That means designers can create color-coded maps that
showcase geographically important data in a format that’s much easier to
digest than a table or chart could ever be.
Qlik Sense
Qlik Sense is a data visualization platform that helps companies become
data-driven enterprises by providing an associative data analytics engine
and a sophisticated artificial intelligence system. It can be deployed as any
combination of SaaS, on-premises, or a private cloud.
You can easily combine, load, visualize, and explore your data on Qlik
Sense, no matter its size.
All the data charts, tables, and other visualizations are interactive and
instantly update themselves according to the current data context.
The Qlik Sense AI can even provide you with data insights and help you
create analytics using just drag and drop.
Sisense
Sisense provides tools that allow data analysts to simplify complex data and
obtain insights for the organization.
Sisense believes that eventually, every company will be a data-driven company
and every product will be related to data in some way.
It tries its best to provide various data analytics tools to business teams and
data analysts so that they can help make their companies the data-driven
companies of the future.
It is very easy to set up and learn Sisense.
It can be easily installed within a minute and data analysts can get their work ...
Zoho Analytics
Zoho Analytics is a business intelligence and data analytics software that can help you
create wonderful-looking data visualizations based on your data in a few minutes.
You can obtain data from multiple sources and mesh it together to create
multidimensional data visualizations that allow you to view your business data
across departments.
In case you have any questions, you can use Zia, which is a smart assistant created using
artificial intelligence, machine learning, and natural language processing.
Zoho Analytics allows you to share or publish your reports with your colleagues and add
comments or engage in conversations as required.
You can export Zoho Analytics files in any format such as Spreadsheet, MS Word, Ex...
Datawrapper
Datawrapper is an online tool that can help you create good-looking data
visualizations based on your data in a few minutes.
If you are writing articles online and need to quickly insert
interactive charts, maps or tables, Datawrapper is a good choice.
It can be used on both mobile devices and computers.
Datawrapper has free hosting, making it easy for you to upload data.
A limitation is that Datawrapper gives simpler representations and can handle
only a few data sets simultaneously.
Infogram
• Infogram is a fully-featured drag-and-drop visualization tool that allows
even non-designers to create effective visualizations of data for marketing
reports, infographics, social media posts, maps, dashboards, and more.
• Finished visualizations can be exported into a number of formats: .PNG,
.JPG, .GIF, .PDF, and .HTML. Interactive visualizations are also possible,
perfect for embedding into websites or apps.
• Infogram also offers a WordPress plugin that makes embedding
visualizations even easier for WordPress users.
Fusion Charts
• FusionCharts is another JavaScript-based option for creating web and
mobile dashboards. It includes over 150 chart types and 1,000 map
types.
• It can integrate with popular JS frameworks (including React, jQuery,
Ember, and Angular) as well as with server-side languages and
frameworks (including PHP, Java, Django, and Ruby on Rails).
• FusionCharts gives ready-to-use code for all of the chart and map
variations, making it easier to embed in websites even for those
designers with limited programming knowledge.
Visualization using Programming
Python
• Python is considered one of the top programming languages
for data visualization because of its many libraries, which allow
for greater flexibility, and its large and active scientific
computing community.
• It also gives control over the specific elements of the created graphics and
makes the specifications repeatable through code.
• Python is also very good at processing data. Its open-source community and
rich third-party libraries allow continuous optimization for data visualization:
– matplotlib
– seaborn
– plotly
– pylab
R
• R is an open-source software environment for statistical computing and
graphics.
• R is designed for data analysis.
• Although Python is becoming more and more popular, especially in
the areas of machine learning and deep learning, the R language
still has strong advantages in data analysis and visualization. The
ggplot2 package and its extension packages, with their readable grammar
of graphics, are favored by users, especially bioinformatics and medical
researchers.
Power BI
• Power BI is able to extract data from a variety of data sources in
addition to supporting Microsoft's own products.
• The drag-and-drop graphical development model used by Power BI
frees data analysts from visual chores so they can put more effort into data
management, algorithm research, and business communication.
Features of Tableau
Tableau products
Tableau Reports
Tableau file types…
Tableau can connect to:
Connecting to Data
Sample Data
Importing Data to tableau
Data in Tableau
Tableau Worksheet
Making Charts
Changing the axes
Multiple Dimensions in Charts
Multiple Measures in Charts
Various Types of Charts
Tree Map
Mark Options in Tableau
Color
Size
Label
Detail
Scatter Plot
Sorting
Filters
Types of Aggregation
Count
Results in Percentages
Column-wise Percentage
Row-wise Percentage
Calculated Fields
Creating Calculated Fields
Viewing Calculated Fields
Using Calculated Fields
Sets
Using Sets
Hierarchy
Maps
Creating Maps
Making Dashboards in Tableau
Conclusion
This tutorial covers the basic functionality of Tableau. Many more
options are available, which you can explore on your own once you
get a feel for the software.
THANK YOU