6 DATA Analysis 2

Uploaded by

Rochelle Gaanan
QUANTITATIVE DATA ANALYSIS
HAS TWO BRANCHES:
DESCRIPTIVE STATISTICS AND
INFERENTIAL STATISTICS
What are Descriptive and Inferential
Statistics?

• Descriptive statistics is used to
describe data, while inferential
statistics is used to draw conclusions
and make predictions about a
population from sample data.
• Descriptive and inferential statistics
have different tools that can be
used to draw conclusions about the
data.
What are Descriptive and Inferential
Statistics?
• The purpose of descriptive and
inferential statistics is to analyze
different types of data using different
tools.
• Descriptive statistics helps to describe
and organize known data using charts,
bar graphs, etc., while inferential
statistics aims at making inferences
and generalizations about the
population data.
What is Descriptive Statistics?
• It is used to summarize the
attributes of a sample in such a
way that a pattern can be
drawn from the group.
• It enables researchers to
present data in a more
meaningful way such that easy
interpretations can be made.
TOOLS TO ORGANIZE AND DESCRIBE
DATA
1. Measures of Central Tendency
• These help to describe the
central position of the data by
using measures such as mean,
median, and mode.
• They help you find the middle, or
the average, of a dataset.
MEAN
• The arithmetic mean of a
dataset is the sum of all values
divided by the total number of
values.
• It’s the most commonly used
measure of central tendency
because all values are used in
the calculation.
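As a minimal sketch of the definition above, the mean can be computed by hand and checked against Python's standard `statistics` module. The scores are made-up illustrative values, not data from these slides.

```python
import statistics

# Hypothetical quiz scores (illustrative values only)
scores = [4, 8, 6, 5, 7]

# Sum of all values divided by the total number of values
mean_by_hand = sum(scores) / len(scores)

# The same calculation using the standard library
mean_stdlib = statistics.mean(scores)

print(mean_by_hand, mean_stdlib)
```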
MEDIAN
• The median of a dataset is the
value that’s exactly in the
middle when the values are ordered
from low to high. With an even
number of values, it is the average
of the two middle values.
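A short sketch of finding the median with Python's standard `statistics` module. With an even number of values there is no single middle value; the usual convention, which `statistics.median` follows, is to average the two middle values. Both lists are hypothetical.

```python
import statistics

# Hypothetical datasets (illustrative values only)
odd_count = [3, 9, 1, 7, 5]        # ordered: 1, 3, 5, 7, 9
even_count = [3, 9, 1, 7, 5, 11]   # ordered: 1, 3, 5, 7, 9, 11

# statistics.median orders the values internally before picking the middle
median_odd = statistics.median(odd_count)     # the single middle value
median_even = statistics.median(even_count)   # average of the two middle values

print(median_odd, median_even)
```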
MODE
• The mode is the most frequently occurring
value in the dataset. It’s possible to have no
mode, one mode, or more than one mode.
• To find the mode, sort your dataset
numerically or categorically and select the
response that occurs most frequently.
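The mode can be sketched with `statistics.multimode`, which returns every value tied for the highest frequency. The data are hypothetical. Note one assumption about convention: when every value occurs exactly once, `multimode` returns all of them, which corresponds to what these slides call having no mode.

```python
import statistics

# Hypothetical datasets (illustrative values only)
one_mode = ["red", "blue", "red", "green"]   # categorical data works too
two_modes = [1, 2, 2, 3, 3, 4]
all_unique = [10, 20, 30]

# multimode returns a list of all most-frequent values
print(statistics.multimode(one_mode))
print(statistics.multimode(two_modes))
print(statistics.multimode(all_unique))  # every value ties, so all are returned
```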
TOOLS TO ORGANIZE AND DESCRIBE
DATA
• Measures of Dispersion - The term “dispersion”
refers to how spread out a set of data is.
• The measure of dispersion is always a non-
negative real number that starts at zero when all
the data is the same and rises as the data gets
more varied.
• The homogeneity or heterogeneity of the scattered
data is defined by dispersion measures. It also
refers to how data differs from one another.
TOOLS TO ORGANIZE AND DESCRIBE
DATA
• Measures of Dispersion - As the name
suggests, a measure of dispersion
shows the scattering of the data.
• It shows how the data values vary from
one another and gives a clear idea about
the distribution of the data.
• The measure of dispersion shows the
homogeneity or the heterogeneity of the
distribution of the observations.
TOOLS TO ORGANIZE AND DESCRIBE
DATA
• Measures of Dispersion - These
measures help to see how spread
out the data is in a distribution with
respect to a central point.
• Range, standard deviation,
variance, quartiles, and absolute
deviation are the measures of
dispersion.
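A minimal sketch of three of the measures listed above (variance, quartiles, and mean absolute deviation), using Python's standard `statistics` module. The data are hypothetical, the variance shown is the population version, and the `"inclusive"` quartile method is one of several conventions, chosen here only for illustration.

```python
import statistics

# Hypothetical dataset (illustrative values only)
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)

# Population variance: average squared distance from the mean
variance = statistics.pvariance(data)

# Quartiles: the three cut points that split the ordered data into four parts
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
interquartile_range = q3 - q1

# Mean absolute deviation: average absolute distance from the mean
mean_abs_dev = sum(abs(x - mean) for x in data) / len(data)

print(variance, (q1, q2, q3), interquartile_range, mean_abs_dev)
```

All of these are zero when every value is the same and grow as the data become more varied.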
RANGE
• Range quantifies the spread or variability of a
set of values.
• The range is calculated as the
difference between the maximum and minimum
values in a dataset.
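As a small sketch, the range of a hypothetical dataset is the maximum minus the minimum:

```python
# Hypothetical dataset (illustrative values only)
values = [12, 7, 3, 15, 9]

# Range = maximum value - minimum value
data_range = max(values) - min(values)

print(data_range)
```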
STANDARD
DEVIATION
• It is the statistical measure of how spread out the
values of a data set are from the mean or
average number.
• In short, it measures the variation of the values
from the mean.
• Population Standard Deviation
• Sample Standard Deviation
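The two versions differ only in the divisor: the population standard deviation divides the sum of squared deviations by N, while the sample standard deviation divides by n - 1. The sketch below computes both by hand on hypothetical data; the standard library's `statistics.pstdev` and `statistics.stdev` give the same results.

```python
import math
import statistics

# Hypothetical dataset (illustrative values only)
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)

# Sum of squared deviations from the mean
squared_deviations = sum((x - mean) ** 2 for x in data)

# Population standard deviation: divide by N
population_sd = math.sqrt(squared_deviations / len(data))

# Sample standard deviation: divide by n - 1
sample_sd = math.sqrt(squared_deviations / (len(data) - 1))

print(population_sd, sample_sd)
```

The sample version is always slightly larger for the same data, because dividing by n - 1 compensates for estimating the mean from the sample itself.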
STANDARD DEVIATION
EXAMPLE #1: POPULATION
STANDARD DEVIATION
EXAMPLE #2: SAMPLE
What is Inferential Statistics?
• Inferential statistics can be
defined as a field of statistics that
uses analytical tools for drawing
conclusions about a population by
examining random samples.
• The goal of inferential statistics is
to make generalizations about a
population.
What is Inferential Statistics?
• In inferential statistics, a statistic
is taken from the sample data
(e.g., the sample mean) and is used
to make inferences about the
population parameter (e.g., the
population mean).
Types of Inferential
Statistics
A. Hypothesis Testing
B. Regression Analysis
Hypothesis Testing

▪ A type of inferential statistics that is
used to test assumptions and draw
conclusions about the population
from the available sample data.
▪ It involves setting up a null
hypothesis and an alternative
hypothesis followed by conducting a
statistical test of significance.
Z Test

▪ A z test is used on data that follows
a normal distribution and has a sample
size greater than or equal to 30.
▪ It is used to test if the means of the
sample and population are equal when
the population variance is known.
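A minimal sketch of a one-sample z test using only the standard library. The function name `z_statistic`, the numbers, and the 0.05 significance level are all assumptions made for illustration; this version tests whether the means are equal, so it uses a two-sided p-value.

```python
import math
from statistics import NormalDist

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """z = (sample mean - population mean) / (population sd / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

# Hypothetical numbers: sample of 36, known population standard deviation of 6
z = z_statistic(sample_mean=52.0, pop_mean=50.0, pop_sd=6.0, n=36)

# Two-sided p-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # assumed significance level
reject_null = p_value < alpha

print(z, p_value, reject_null)
```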
Z Test

The right-tailed hypothesis test can be set up as follows:
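A standard right-tailed formulation, given here as an assumption rather than as the slide's original formula, uses the sample mean x̄, the hypothesized population mean μ₀, the known population standard deviation σ, the sample size n, and the critical value z_α for the chosen significance level:

```latex
\begin{aligned}
H_0 &: \mu = \mu_0 \\
H_1 &: \mu > \mu_0 \\
z &= \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \\
\text{Reject } H_0 &\text{ if } z > z_{\alpha}
\end{aligned}
```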


T Test

▪ A t test is used when the test
statistic follows a Student's t
distribution, typically when the sample
size is less than 30. It is used to
compare the sample and population
mean when the population variance is
unknown.
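A minimal sketch of a one-sample t test using only the standard library. The function name `t_statistic` and the measurements are hypothetical. Because the standard library has no t distribution, the critical value 1.833 (right-tailed, significance level 0.05, 9 degrees of freedom) is taken from a standard t table and is an assumption of this example.

```python
import math
import statistics

def t_statistic(sample, pop_mean):
    """Return (t, degrees of freedom) for a one-sample t test."""
    n = len(sample)
    sample_sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (statistics.mean(sample) - pop_mean) / (sample_sd / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements (n = 10, so fewer than 30)
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7, 5.3, 5.5]

t, df = t_statistic(sample, pop_mean=5.0)

# Critical value from a standard t table: right-tailed, alpha = 0.05, df = 9
t_critical = 1.833
reject_null = t > t_critical

print(t, df, reject_null)
```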
T Test

The hypothesis test for the t test is set up as follows:
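A standard right-tailed setup for the one-sample t test, offered as an assumption rather than as the slide's original formula, replaces the population standard deviation with the sample standard deviation s and uses n - 1 degrees of freedom:

```latex
\begin{aligned}
H_0 &: \mu = \mu_0 \\
H_1 &: \mu > \mu_0 \\
t &= \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1 \\
\text{Reject } H_0 &\text{ if } t > t_{\alpha,\, n-1}
\end{aligned}
```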


F Test

▪ An F test is used to check if there is a
difference between the variances of two
samples or populations.
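A minimal sketch of the F statistic for comparing two sample variances, using only the standard library. The function name `f_statistic` and both groups are hypothetical, and placing the larger variance in the numerator is a common convention assumed here. The critical value 6.39 (significance level 0.05, 4 and 4 degrees of freedom) is taken from a standard F table.

```python
import statistics

def f_statistic(sample_a, sample_b):
    """Return (F, numerator df, denominator df), larger variance on top."""
    var_a = statistics.variance(sample_a)  # sample variance (divides by n - 1)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical groups (illustrative values only)
group_a = [10, 12, 9, 15, 14]
group_b = [11, 11, 12, 10, 11]

f, df_num, df_den = f_statistic(group_a, group_b)

# Critical value from a standard F table: alpha = 0.05, df = (4, 4)
f_critical = 6.39
reject_null = f > f_critical

print(f, df_num, df_den, reject_null)
```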
F Test

The hypothesis test for the F test is set up as follows:
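A standard right-tailed setup for the F test, offered as an assumption rather than as the slide's original formula, compares the two population variances through the ratio of the sample variances s₁² and s₂², with n₁ - 1 and n₂ - 1 degrees of freedom:

```latex
\begin{aligned}
H_0 &: \sigma_1^2 = \sigma_2^2 \\
H_1 &: \sigma_1^2 > \sigma_2^2 \\
F &= \frac{s_1^2}{s_2^2} \\
\text{Reject } H_0 &\text{ if } F > F_{\alpha;\, n_1 - 1,\, n_2 - 1}
\end{aligned}
```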


Regression Analysis

▪ Regression analysis is used to
quantify how one variable will
change with respect to another
variable.
▪ There are many types of regressions
available such as simple linear,
multiple linear, nominal, logistic, and
ordinal regression.
Regression Analysis

▪ The most commonly used regression
in inferential statistics is linear
regression. Linear regression checks
the effect of a unit change in the
independent variable on the
dependent variable.
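The effect of a unit change can be read off the slope of the fitted line. A common way to write the simple linear model, with symbols chosen here for illustration (x is the independent variable, y the dependent variable, ε the error term), together with the usual least-squares estimates:

```latex
\begin{aligned}
y &= \beta_0 + \beta_1 x + \varepsilon \\
\hat{\beta}_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \\
\hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x}
\end{aligned}
```

Here the slope β₁ is the expected change in y for a one-unit increase in x.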
Linear Regression

Simple linear regression is used to estimate the
relationship between two quantitative variables.
You can use simple linear regression when you want to
know:
1. How strong the relationship is between two variables
(e.g., the relationship between rainfall and soil erosion).
2. The value of the dependent variable at a certain value
of the independent variable (e.g., the amount of soil
erosion at a certain level of rainfall).
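Both questions can be sketched with a least-squares fit written out by hand. The function name `fit_line`, the rainfall and erosion figures, and their units are all made up for illustration; the correlation coefficient r measures the strength of the linear relationship, and the fitted line gives the predicted value.

```python
import math

def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; also returns Pearson's r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical observations (illustrative values and units only)
rainfall = [10, 20, 30, 40, 50]        # independent variable
erosion = [1.2, 1.9, 3.2, 3.8, 5.1]    # dependent variable

slope, intercept, r = fit_line(rainfall, erosion)

# 1. Strength of the relationship: r close to 1 or -1 means a strong linear link
# 2. Value of the dependent variable at a chosen rainfall level
predicted_erosion = intercept + slope * 35

print(slope, intercept, r, predicted_erosion)
```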
Linear Regression

Example:
You are a social researcher interested in the relationship
between income and happiness. You survey 500 people
whose incomes range from 15k to 75k and ask them to
rank their happiness on a scale from 1 to 10.
Your independent variable (income) and dependent
variable (happiness) are both quantitative, so you can do
a regression analysis to see if there is a linear relationship
between them.
THANK YOU!
