Module 3 - Data Visualization 1
◦ Statistical Analysis
FLEX Course Material
BUSINESS ANALYTICS
Data Visualization and Measures of Central Tendency, Variation, Skewness, and Kurtosis
What is Data Visualization?
By using visual elements such as pie charts, bar graphs, and line graphs, data visualization
tools provide an accessible way to see and understand trends, outliers, and
patterns in data. They also give employees and business owners a clear way to
present data to non-technical audiences without confusion.
Common Data Visualizations Used in Business Analytics
1. Bar graph
A bar graph is a pictorial representation of data that is used to compare quantities. It shows the
quantities as bars, which can be either horizontal or vertical.
2. Pie chart
A pie chart is a pictorial representation of data that is used to show each category's percentage of
the whole. It displays the data in an easy-to-read "pie-slice" format, with the size of each slice
showing what percent of the total that category represents.
3. Line graph
A line graph is a pictorial representation of data that is used to show trends.
Trend analysis involves collecting information from multiple periods and
plotting it against a horizontal time axis to find actionable patterns
in the given information.
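All three chart types start from the same kind of table: a category (or period) and a value. As a minimal sketch, assuming hypothetical sales figures that are not part of the course material, the plain Python snippet below computes the percentage that each pie slice would represent and prints a rough text version of a bar graph. The names `pie_shares` and `text_bar_chart` are illustrative only.

```python
# Hypothetical data (an assumption for illustration, not from the course material).
sales = {"Product A": 50, "Product B": 30, "Product C": 20}


def pie_shares(data):
    """Return the percentage of the total that each category represents."""
    total = sum(data.values())
    return {name: 100 * value / total for name, value in data.items()}


def text_bar_chart(data, unit=5):
    """Return one text line per category; one '#' stands for `unit` items."""
    width = max(len(name) for name in data)
    return [f"{name.ljust(width)} | {'#' * (value // unit)} {value}"
            for name, value in data.items()]


shares = pie_shares(sales)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")   # size of each pie slice
print("\n".join(text_bar_chart(sales)))  # one bar per category
```

In Excel the same table would simply be selected and inserted as a chart; the sketch only shows what the bars and slices encode.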
Sample Problems: Data Visualization in Business Analytics
Example 1
Construct the bar graph and the pie chart of the given data using Excel.
[Figure: bar graph]
[Figure: pie chart]
Example 2
PRODUCTS SOLD BY XYZ COMPANY
Construct the line graph of the given data.
◦ Statistics
◦ A collection of mathematical techniques to characterize and interpret data
◦ Descriptive Statistics
◦ Describing the data (as it is)
◦ Inferential Statistics
◦ Drawing inferences about the population based on sample data
◦ Descriptive Statistics
Measures of Central Tendency
◦ Arithmetic mean
◦ The sum of the observations divided by the number of observations
◦ Median
◦ The middle value when the observations are sorted
◦ Mode
◦ The most frequent observation
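The three measures above can be sketched with Python's standard `statistics` module. The `daily_sales` values are hypothetical and chosen only for illustration.

```python
from statistics import mean, median, mode

# Hypothetical daily unit sales (an assumption for illustration).
daily_sales = [12, 15, 15, 18, 20, 22, 45]

average = mean(daily_sales)      # sum of the values divided by their count
middle = median(daily_sales)     # middle value of the sorted data
most_common = mode(daily_sales)  # most frequent value

print(average, middle, most_common)
```

In this sample the single large value (45) pulls the mean above the median, which is why the median is often preferred when the data contain outliers.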
◦ Descriptive Statistics
Measures of Dispersion
◦ Dispersion
◦ Degree of variation in a given variable
◦ Range
◦ Max - Min
◦ Variance and Standard Deviation
◦ Box-and-Whiskers Plot
◦ a.k.a. box-plot
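A minimal sketch of these dispersion measures, again using the standard `statistics` module on hypothetical data. It treats the data as a full population (`pvariance`, `pstdev`); for a sample, `variance` and `stdev` would divide by n - 1 instead. The quartiles use the module's "inclusive" method, which is one of several conventions, so other tools may report slightly different values.

```python
from statistics import pvariance, pstdev, quantiles

# Hypothetical observations (an assumption for illustration).
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)   # Max - Min
variance = pvariance(data)           # average squared deviation from the mean
std_dev = pstdev(data)               # square root of the variance

# Quartiles feed the box-and-whiskers plot: the box spans q1 to q3,
# with a line at the median (q2).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                        # interquartile range (width of the box)

print(data_range, variance, std_dev)
print(min(data), q1, q2, q3, max(data))  # five-number summary
```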
◦ Descriptive Statistics
Shape of a Distribution
◦ Histogram – frequency chart
◦ Skewness
◦ Measure of asymmetry
◦ Kurtosis
◦ How peaked (tall and skinny) and heavy-tailed the distribution is
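A histogram is built by grouping values into bins and counting how many fall into each bin. The sketch below does this in plain Python; the `values` list and the bin width of 10 are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical order values (an assumption for illustration).
values = [3, 7, 8, 12, 13, 14, 18, 21, 22, 29]
bin_width = 10  # assumed bin width

# Map each value to the lower edge of its bin, then count the values per bin.
counts = Counter((v // bin_width) * bin_width for v in values)

# Print a rough text histogram, one row per bin.
for lower in sorted(counts):
    upper = lower + bin_width - 1
    print(f"{lower:>2}-{upper:<2} | {'#' * counts[lower]}")
```

The shape of the printed bars is what skewness and kurtosis summarize as single numbers.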
Relationship Between Dispersion and Shape Properties
Technology Insights 2.1 – Descriptive Statistics in Excel
Creating a box plot in Microsoft Excel
◦ Regression Modeling for Inferential Statistics
◦ Regression
◦ A part of inferential statistics
◦ The most widely known and used analytics technique in statistics
◦ Used to characterize the relationship between explanatory (input) variables and the response (output) variable
◦ It can be used for
◦ Hypothesis testing (explanation)
◦ Forecasting (prediction)
◦ Regression Modeling
◦ Correlation versus Regression
◦ What is the difference (or relationship)?
◦ Simple Regression versus Multiple Regression
◦ Based on the number of input variables
◦ How do we develop linear regression models?
◦ Scatter plots (visualization—for simple regression)
◦ Find the straight line that passes as closely as possible through the plotted dots.
◦ Ordinary least squares method
◦ A line that minimizes the sum of squared vertical distances between the dots and the line
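The ordinary least squares idea for simple regression (one input x, one output y) can be sketched in plain Python as follows. The function names `fit_line` and `correlation` and the advertising data are hypothetical, introduced only for illustration. The correlation coefficient is included to show the contrast with regression: correlation gives the strength of the linear relationship, while regression gives the line used for prediction.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for one input variable: return (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                    # minimizes the sum of squared errors
    intercept = mean_y - slope * mean_x  # line passes through (mean_x, mean_y)
    return slope, intercept


def correlation(xs, ys):
    """Pearson correlation coefficient r between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: advertising spend (x, input) and sales (y, output).
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

slope, intercept = fit_line(ad_spend, sales)
r = correlation(ad_spend, sales)
predicted = intercept + slope * 6  # forecast for a new input value x = 6

print(f"y = {intercept:.2f} + {slope:.2f}x, r = {r:.3f}, forecast = {predicted:.2f}")
```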
Regression Modeling
◦ x: input, y: output
https://www.youtube.com/watch?v=zPG4NjIk
Time Series Forecasting
[Figure: deployment chart linking data repositories and departments (DEPT 2, DEPT 3, DEPT 4; steps 1 to 5) to information (reporting) for the decision maker]
Types of Business Reports
Measures of Dispersion
Measures of dispersion are summary statistics that represent the amount of spread in a set of numerical data.
Measures of skewness
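Skewness
Skewness is a measure of the asymmetry of a distribution. A symmetrical dataset (for example, normally
distributed data) will have a skewness equal to 0. Skewness essentially measures the relative size of
the two tails: positive skewness means a longer right tail, and negative skewness means a longer left tail.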
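As a minimal sketch, skewness can be computed from the second and third central moments. The function below uses the population (moment-based) formula; spreadsheet and statistics packages often apply a sample-size correction, so their values may differ slightly. The three datasets are hypothetical.

```python
def skewness(data):
    """Population (moment-based) skewness: m3 / m2 ** 1.5."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical datasets (assumptions for illustration).
symmetric = [1, 2, 3, 4, 5]      # mirror image around the mean
right_tailed = [1, 2, 2, 3, 10]  # one large value stretches the right tail
left_tailed = [1, 8, 9, 9, 10]   # one small value stretches the left tail

for name, sample in [("symmetric", symmetric),
                     ("right-tailed", right_tailed),
                     ("left-tailed", left_tailed)]:
    print(f"{name}: skewness = {skewness(sample):.3f}")
```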
Symmetrical Dataset with Skewness = 0
Dataset with Positive Skewness
Dataset with Negative Skewness
Kurtosis
Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution.
The value is often compared to the kurtosis of the normal distribution, which is equal to 3. If the kurtosis
is greater than 3, the dataset has heavier tails than a normal distribution (more data in the tails). If
the kurtosis is less than 3, the dataset has lighter tails than a normal distribution (less data in the
tails).
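The comparison against 3 can be sketched with the moment-based formula below. Subtracting 3 gives the "excess kurtosis", which is positive for heavier tails and negative for lighter tails than a normal distribution. The function names and datasets are hypothetical, and packaged functions may apply a sample-size correction that changes the values slightly.

```python
def kurtosis(data):
    """Population (moment-based) kurtosis: m4 / m2 ** 2 (normal distribution = 3)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2


def excess_kurtosis(data):
    """Kurtosis relative to the normal distribution (kurtosis minus 3)."""
    return kurtosis(data) - 3


# Hypothetical datasets (assumptions for illustration).
flat = [1, 2, 3, 4, 5]                         # evenly spread, light tails
heavy_tailed = [5, 5, 5, 5, 5, 5, 5, 5, -5, 15]  # mostly central, two extremes

print(kurtosis(flat), excess_kurtosis(flat))
print(kurtosis(heavy_tailed), excess_kurtosis(heavy_tailed))
```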
Dataset with Negative Excess Kurtosis (kurtosis less than 3)
Dataset with Positive Excess Kurtosis (kurtosis greater than 3)
Regression
◦ Simple Linear Regression
https://www.youtube.com/watch?v=zPG4NjIk