Unit - III Univariate Analysis

Unit 3 focuses on univariate analysis, which examines a single variable to identify patterns through numerical summaries such as mean, median, and mode. It discusses techniques for visualizing data distributions, including bar charts, pie charts, and histograms, as well as measures of spread like range and standard deviation. Additionally, the unit covers the conversion of interval variables to ordinal variables and the importance of quartiles and percentiles in data analysis.

UNIT III UNIVARIATE ANALYSIS
Introduction to Single Variable: Distributions and Variables - Numerical Summaries of Level and Spread - Scaling and Standardizing - Inequality - Smoothing Time Series.
UNIVARIATE ANALYSIS
• Univariate analysis is the most basic kind of statistical data analysis.
• "Uni" means one: the data contain just one variable.
• For example, consider a survey of a classroom in which the analyst counts the number of boys and girls in the room.
• The data here describe a single variable (whether each pupil is a boy or a girl) and how often each of its values occurs.
• The main objective of univariate analysis is to describe the data and find patterns in them.
• This is done by looking at the mean, median, mode, standard deviation and other measures of dispersion.
DISTRIBUTION AND VARIABLES?
VARIABLES ON HOUSEHOLD SURVEY
• 1 - Hardly drink at all
• 3 - Drink a moderate amount
• 5 - Drink
REDUCING THE NUMBER OF DIGITS
• 134, 121, 167 - Two varying digits
• 0.034, 0.045, 0.062 - Two varying digits
• 0.67, 1.31, 0.92 - Three varying digits
• There are two techniques for reducing the number of digits:
1. Rounding (8.47 becomes 8.5)
2. Cutting off / truncating (899.945 becomes 899)
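The two techniques can be tried out with a minimal Python sketch. It reuses the two numbers from the slide; the variable names are illustrative only.

```python
import math

value = 8.47
rounded = round(value, 1)       # rounding to one decimal place

other = 899.945
truncated = math.trunc(other)   # cutting off everything after the decimal point

print(rounded)
print(truncated)
```

Rounding moves to the nearest retained digit, while truncating always drops the extra digits, so truncated values never exceed the original for positive numbers.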
BAR CHARTS AND PIE CHARTS
• Bar charts and pie charts are used to visualize how a variable is distributed across our cases.
• How can a nominal or ordinal variable, such as the drinking classification, be represented pictorially?
1. Bar chart
2. Pie Chart
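Both charts are drawn from the same frequency table: the bar heights are the counts and the pie slices are the percentages. The sketch below builds that table for a small set of made-up responses (the responses and category labels are assumptions for illustration, not survey data) and prints a text version of the bar chart.

```python
from collections import Counter

# Hypothetical responses for an ordinal drinking classification.
responses = ["none", "moderate", "moderate", "heavy", "moderate",
             "none", "heavy", "moderate", "moderate", "none"]
order = ["none", "moderate", "heavy"]   # keep the ordinal order of the categories

counts = Counter(responses)
total = len(responses)

for category in order:
    n = counts[category]
    share = 100 * n / total   # size of the pie slice, as a percentage
    bar = "#" * n             # length of the bar, one mark per case
    print(f"{category:<9} {bar:<6} {n:>2} ({share:.0f}%)")
```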
FEATURES VISIBLE IN HISTOGRAMS
• A histogram allows inspection of four important aspects of any distribution:
1. Level
2. Spread
3. Shape
4. Outliers
• Level - What are typical values in the
distribution?
• Spread - Do the values differ much from one
another?
• Shape - Is the distribution flat or peaked?
• Outliers - Are there any unusual values?
• Unimodal - Distributions with one peak
• Bimodal - Distributions with two peaks
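A histogram is built by grouping an interval variable into equal-width bins and counting the cases in each bin. The sketch below does this for the ages list used in the percentile example of this unit. The bin width of 20 years is an arbitrary choice, and the value 61 is read as a single number that was split by a line wrap.

```python
ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82,
        32, 2, 8, 6, 25, 36, 27, 61, 31]

bin_width = 20   # assumed bin width; other widths give a different picture
bins = {}
for age in ages:
    lower = (age // bin_width) * bin_width   # lower edge of the bin holding this age
    bins[lower] = bins.get(lower, 0) + 1

# Text histogram: one '#' per case in each bin.
for lower in sorted(bins):
    upper = lower + bin_width - 1
    print(f"{lower:>2}-{upper:<2} {'#' * bins[lower]} {bins[lower]}")
```

Reading the printed bars from top to bottom gives a rough view of level, spread, shape and possible outliers.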

FROM INTERVAL LEVEL TO ORDINAL LEVEL VARIABLES - RECODING
• A variable can be recorded in a survey at an interval level and later recoded into ordered categories.
• For example, the maximum recommended weekly intake of alcohol is 21 units for men and 14 units for women.
• For men, this interval variable is converted to an ordinal variable as:
• No alcohol drunk (none)
• 1 - 21 units drunk (moderate drinking)
• Over 21 units drunk (heavy drinking)
• For women, this interval variable is converted to an ordinal variable as:
• No alcohol drunk (none)
• 1 - 14 units drunk (moderate drinking)
• Over 14 units drunk (heavy drinking)
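The recoding rule above can be sketched as a small Python function. The function and dictionary names are hypothetical. Treating any amount above zero and up to the limit as "moderate" is an assumption, since the slides do not say how fractions of a unit are handled.

```python
# Recommended weekly limits quoted in the text (units of alcohol).
WEEKLY_LIMIT = {"men": 21, "women": 14}

def recode_drinking(units, sex):
    """Recode weekly units (interval level) into an ordinal category."""
    limit = WEEKLY_LIMIT[sex]
    if units == 0:
        return "none"
    if units <= limit:   # assumption: anything above 0 up to the limit is moderate
        return "moderate"
    return "heavy"

for units in [0, 10, 18, 22]:
    print(units, recode_drinking(units, "men"), recode_drinking(units, "women"))
```

The same 18 units is recoded as moderate drinking for a man and heavy drinking for a woman, because the cut-off differs.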

NUMERICAL SUMMARIES OF
LEVEL AND SPREAD IN
UNIVARIATE ANALYSIS
• In univariate analysis, numerical summaries are used to
describe the distribution of a single variable.
• Two important aspects of the distribution are its:
• Central tendency (or level)
• Variability (or spread)
LEVEL (CENTRAL TENDENCY)
MEASURES

• Mean

• Median

• Mode
LEVEL (CENTRAL TENDENCY) MEASURES - MEAN
• The mean is the average of all values in the dataset.
• It is calculated by summing up all values and dividing by the number of values.
• The mean represents the center of the data.
LEVEL (CENTRAL TENDENCY) MEASURES - MEDIAN
• The median is the middle value of a dataset when it is sorted in numerical order.
• It separates the higher half of the data from the lower half and is less sensitive to extreme values (outliers) than the mean.
• For an odd number of observations: The median is the middle value.
• For an even number of observations: The median is the average of the two middle values.
LEVEL (CENTRAL TENDENCY) MEASURES - MODE
• The mode is the value that appears most frequently in the dataset.
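The three measures of level can be compared using Python's standard statistics module. The data values are made up for illustration. An extreme value is then added to show how the mean and the median react differently.

```python
import statistics

data = [2, 3, 3, 5, 7]   # hypothetical values, odd number of observations

mean = statistics.mean(data)       # sum of values divided by their count
median = statistics.median(data)   # middle value of the sorted data
mode = statistics.mode(data)       # most frequent value

# Add one extreme value (even number of observations now).
with_outlier = data + [100]
mean_out = statistics.mean(with_outlier)
median_out = statistics.median(with_outlier)   # average of the two middle values

print(mean, median, mode)
print(mean_out, median_out)
```

With the outlier included, the mean jumps from 4 to 20 while the median only moves from 3 to 4.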
SPREAD (VARIABILITY)
MEASURES
• Range
• Interquartile Range (IQR)
• Variance
• Standard Deviation
SPREAD (VARIABILITY) MEASURES - RANGE
• The range is the difference between the maximum and minimum values in the dataset.
• It provides a simple measure of the spread of the data.
SPREAD (VARIABILITY) MEASURES - INTERQUARTILE RANGE (IQR)
• The IQR is the difference between the third quartile (Q3) and the first quartile (Q1).
• It is a measure of the dispersion of the middle 50% of the data and is less affected by outliers than the range.
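SPREAD (VARIABILITY) MEASURES - VARIANCE
• Variance measures how much the numbers in the dataset differ from the mean.
• It involves squaring the differences from the mean, summing these squares, and dividing by the number of observations.
• Dividing by the number of observations gives the population variance; the sample variance divides by one less than the number of observations.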
SPREAD (VARIABILITY) MEASURES - STANDARD DEVIATION
• Standard deviation is the square root of the variance.
• It measures the typical distance between each data point and the mean, in the same units as the data.
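The four measures of spread can be computed as follows. This is a sketch on made-up data. It uses the population forms (dividing by the number of observations, as described above) and the "inclusive" quartile method of the statistics module, which is only one of several quartile conventions in use.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values

data_range = max(data) - min(data)

# Quartiles with the "inclusive" method (an assumed convention).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(data)   # mean of squared deviations (divides by n)
std_dev = statistics.pstdev(data)       # square root of the variance

print(data_range, iqr, variance, std_dev)
```

For sample-based estimates, statistics.variance and statistics.stdev divide by n - 1 instead.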
PERCENTILE
• A percentile is the percentage of the data that is equal to or less than a given data point.
• It is useful for describing where a data point stands within the data set.
• If the percentile is close to zero, then the data point is one of the smallest in the data set.
• If the percentile is close to 100, then the data point is one of the largest in the data set.
PERCENTILE
ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82, 32, 2, 8, 6, 25, 36, 27, 61, 31]

What is the 75th percentile?

• The answer is 43, meaning that 75% of the people are 43 or younger.
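The answer can be checked with a short sketch. The percentile function below is a hypothetical helper that uses linear interpolation between the closest ranks, which is one common convention; other conventions can give slightly different answers on other data.

```python
def percentile(data, p):
    """Return the p-th percentile using linear interpolation between ranks."""
    ordered = sorted(data)
    position = (len(ordered) - 1) * p / 100   # fractional index into the sorted data
    lower = int(position)
    fraction = position - lower
    if fraction == 0:
        return ordered[lower]
    return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])

ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82,
        32, 2, 8, 6, 25, 36, 27, 61, 31]

print(percentile(ages, 75))
```

With 21 values, the 75th percentile falls exactly on the 16th value of the sorted list, which is 43.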
QUARTILES
• Quartiles describe both the center and the spread of the data, which makes them especially useful for skewed data. Quartiles are values that separate the data into four equal parts.
• Together with the minimum and maximum they form the five-number summary:
• Minimum
• 25th percentile (lower quartile)
• 50th percentile (median)
• 75th percentile (upper quartile)
• 100th percentile (maximum)
QUARTILES
• The quartiles (Q0,Q1,Q2,Q3,Q4) are
the values that separate each
quarter.
• Between Q0 and Q1 are the 25%
lowest values in the data. Between
Q1 and Q2 are the next 25%. And so
on.
• Q0 is the smallest value in the data.
• Q1 is the value separating the first
quarter from the second quarter of
the data.
• Q2 is the middle value (median),
separating the bottom from the top
half.
• Q3 is the value separating the third
quarter from the fourth quarter.
• Q4 is the largest value in the data.
• A boxplot is one good way to plot the five-number summary and explore the data set.
• The bottom end of the boxplot represents the minimum; the first horizontal line (the bottom of the box) represents the lower quartile; the line inside the box is the median; the next line (the top of the box) is the upper quartile; and the top end is the maximum.
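The five numbers that a boxplot draws can be computed directly. The sketch below uses the ages list from the percentile example and the "inclusive" quartile method of Python's statistics module, which is an assumed convention; plotting libraries may use a slightly different rule.

```python
import statistics

ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82,
        32, 2, 8, 6, 25, 36, 27, 61, 31]

q0 = min(ages)                                                   # minimum
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")  # quartiles
q4 = max(ages)                                                   # maximum

five_number_summary = (q0, q1, q2, q3, q4)
print(five_number_summary)
```

The box of the boxplot spans Q1 to Q3, so its height is the interquartile range.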
PROPORTION
• A proportion is the fraction of observations in the data set that satisfy some requirement.
• It is often expressed as a percentage.
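As a small illustration, the sketch below computes the proportion of people in the ages list (from the percentile example) who are 43 or younger. The requirement "43 or younger" is chosen here only as an example.

```python
ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82,
        32, 2, 8, 6, 25, 36, 27, 61, 31]

matching = [age for age in ages if age <= 43]   # observations meeting the requirement
proportion = len(matching) / len(ages)

print(f"{len(matching)} of {len(ages)} = {proportion:.1%}")
```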
CORRELATION
• Defines the strength and direction of the association between two
quantitative variables. It ranges between -1 and 1.
• Positive correlations mean that one variable increases as the other
variable increases.
• Negative correlations mean that one variable decreases as the other
increases.
• When the correlation is zero, there is no linear association between the two variables.
• The closer the result is to either extreme (-1 or 1), the stronger the association between the two variables.
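The slides do not name a specific coefficient. A common choice is the Pearson correlation coefficient, sketched below on made-up data; the function name and the data are assumptions for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation divided by the product of the spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # increases as x increases
y_down = [10, 8, 6, 4, 2]   # decreases as x increases

print(pearson_r(x, y_up))
print(pearson_r(x, y_down))
```

The first pair gives a perfect positive correlation and the second a perfect negative one, matching the two extremes described above.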
