
Final Term Lectures 1


MS 102

QUANTITATIVE
METHODS
Module3-UNIT 1

Population VS
Sample
Definition
• A population is the entire group
that you want to draw conclusions
about
• A sample is the specific group
that you will collect data from.
• The size of the sample is always
less than the total size of the
population
Meaning of Population
In research, a population doesn’t
always refer to people. It can
mean a group containing elements
of anything you want to study,
such as objects, events,
organisations, countries, species,
organisms, etc.
Meaning of Population
Population vs sample
Population | Sample
Advertisements for IT jobs in the Netherlands | The top 50 search results for advertisements for IT jobs in the Netherlands on May 1, 2020
Songs from the Eurovision Song Contest | Winning songs from the Eurovision Song Contest that were performed in English
Undergraduate students in the ASCOT | 300 undergraduate students from the School of Information Technology
All countries of the world | Countries with published data available on birth rates and GDP since 2000
Collecting Data from a
Population
• Populations are used when your
research question requires, or
when you have access to, data from
every member of the population.
• Usually, it is only straightforward
to collect data from a whole
population when it is small,
accessible and cooperative.
Collecting Data from a
Population
• Example: Collecting data from a
population
A high school administrator wants to
analyze the final exam scores of all
graduating seniors to see if there is a
trend.
Since they are only interested in
applying their findings to the graduating
seniors in this high school, they use the
whole population dataset.
Collecting Data from a
Population
• For larger and more dispersed
populations, it is often difficult or
impossible to collect data from every
individual.
• For example, every 10 years, the US
federal government aims to count every person
living in the country using the US
Census. This data is used to distribute
funding across the nation.
Collecting Data from a
Population
• However, marginalized and low-income
groups have historically been
difficult to locate and contact, and
their participation hard to secure.
Because of non-responses, the
population count is incomplete and
biased towards some groups, which
results in disproportionate funding
across the country.
Collecting Data from a Sample
When your population is large in
size, geographically dispersed, or
difficult to contact, it’s
necessary to use a sample. With
statistical analysis, you can use
sample data to make estimates or
test hypotheses about population
data.
Collecting Data from a Sample
Example: Collecting data from a sample
You want to study political attitudes in young people.
Your population is the 300,000 undergraduate
students in the Philippines.
Because it’s not practical to collect data from
all of them, you use a sample of 300
undergraduate volunteers from three
universities – this is the group who will
complete your online survey.
Collecting Data from a Sample
Ideally, a sample should be randomly
selected and representative of the
population.
Using probability sampling methods
(such as simple random sampling or
stratified sampling) reduces the
risk of sampling bias and enhances
both internal and external validity.
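The two probability sampling methods named above can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the population of 3,000 students, the university labels "A", "B" and "C", and the 10% sampling fraction are all hypothetical.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n units so that every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from each stratum (subgroup)."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 3,000 students spread evenly over three universities.
students = [(i, "ABC"[i % 3]) for i in range(3000)]

srs = simple_random_sample(students, 300)
strat = stratified_sample(students, lambda s: s[1], 0.10)
print(len(srs), len(strat))
```

The stratified version guarantees that each university contributes in proportion to its size, whereas a simple random sample achieves this only on average.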
Collecting Data from a Sample
For practical reasons, researchers often
use non-probability sampling methods.
Non-probability samples are chosen for
specific criteria; they may be more
convenient or cheaper to access.
Because of non-random selection methods,
any statistical inferences about the
broader population will be weaker than
with a probability sample.
Reasons for Sampling
Necessity: Sometimes it’s simply not possible to
study the whole population due to its size or
inaccessibility.
Practicality: It’s easier and more efficient to
collect data from a sample.
Cost-effectiveness: There are fewer participant,
laboratory, equipment, and researcher costs
involved.
Manageability: Storing and running statistical
analyses on smaller datasets is easier and
more reliable.
Module3-UNIT 2

CORRELATION
ANALYSIS
Definitions
The simplest methods of measuring relationships
existing between economic variables are
correlation analysis and regression analysis.
Correlation can be defined as the degree of
relationship between two or more variables.
The degree of relationship between two
variables is called simple correlation.
The degree of relationship existing among three
or more variables is called multiple
correlation.
Definitions
Correlation may be linear, when the points on a
scatter diagram of two variables (X and Y) are
clustered near a straight line, or nonlinear,
when the points on the scatter lie near a curve.
Two variables may have a positive correlation or
a negative correlation, or they may be
unrelated.
These correlations are represented as:
1. Positive linear correlation
2. Positive nonlinear correlation
Linear Correlation
We can determine the kind of correlation between two
variables by direct observations.
If the points lie close to the line, the correlation
is strong. A greater dispersion of points about the
line implies weaker correlation.
Simple linear correlation is a measure of the degree
to which two variables vary together, or a measure
of the intensity of the association between two
variables.
Correlation is often abused. A correlation alone does
not show that one variable affects another; that must
be demonstrated separately.
Linear Correlation
The parameter being measured is ρ
(rho), and it is estimated by the
statistic r, the correlation
coefficient.
r can range from -1 to 1, and is
independent of units of
measurement.
Linear Correlation
The strength of the association increases as the
absolute value of r approaches 1.0.
A value of 0 indicates there is no linear association
between the two variables tested.
A better estimate of r usually can be obtained by
calculating r on treatment means averaged across
replicates.
Correlation does not have to be performed only
between independent and dependent variables.
Correlation can be done on two dependent
variables.
Linear Correlation
We use a parameter referred to as
the correlation coefficient. The
sample estimate of this parameter
is referred to as r.
Linear Correlation
The rank correlation coefficient is
used for qualitative variables,
which cannot be measured numerically
but can be ranked.
Examples of such variables include
profession, education, preferences
for a particular brand of commodity
and the like.
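As a sketch of how a rank correlation can be computed, the snippet below uses Spearman's formula, r_s = 1 - 6Σd² / (n(n² - 1)), which is assumed here to be the coefficient the slide refers to. The two rankings of five brands are hypothetical, and the formula in this form applies only when there are no tied ranks.

```python
def spearman_rank_correlation(ranks_a, ranks_b):
    """Spearman's rank correlation for two rankings without ties."""
    n = len(ranks_a)
    # d is the difference between the two ranks given to the same item.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Hypothetical data: two consumers rank the same five brands (1 = most preferred).
ranks_consumer_1 = [1, 2, 3, 4, 5]
ranks_consumer_2 = [2, 1, 4, 3, 5]

rs = spearman_rank_correlation(ranks_consumer_1, ranks_consumer_2)
print(round(rs, 2))  # 0.8
```

A value close to 1 means the two rankings largely agree; a value close to -1 means they are nearly reversed.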
Linear Correlation
A partial correlation coefficient
measures the relationship between
any two variables, keeping other
variables constant.
Linear Correlation
The limitations of linear correlations as a
technique for the study of economic
relations are as follows:
1. The formula for correlation coefficient
applies only to linear relationships between
variables.
2. The correlation coefficient, as a measure
of co-variability of variables, does not
imply any functional relationship between
the variables concerned.
Linear Correlation
The X and Y in the equation to
determine r do not necessarily
correspond to an independent
and a dependent variable,
respectively.
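The equation for r is not reproduced in the text of the slides. The standard Pearson product-moment form, which is presumably the one intended, is shown below for n paired observations (X_i, Y_i) with sample means X̄ and Ȳ:

```latex
r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\;\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}
```

Swapping X and Y leaves the expression unchanged, which is why neither variable has to be treated as the independent one.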
Linear Correlation
Example 1:
X Y
41 52
73 95
67 72
37 52
58 96
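The correlation coefficient for the five (X, Y) pairs of Example 1 can be checked with a short Python sketch. It assumes the standard Pearson formula; the function name `pearson_r` is only illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Data from Example 1.
x = [41, 73, 67, 37, 58]
y = [52, 95, 72, 52, 96]

r = pearson_r(x, y)
print(round(r, 3))  # 0.818
```

An r of about 0.82 indicates a fairly strong positive linear association between X and Y in this small sample.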
Partial Correlations
Example 1:
A partial correlation coefficient measures
the relationship between any two variables,
keeping other variables constant.
Assume a multiple relationship between three
variables, X1, X2, and X3.
To measure the true correlation between X1
and X2, we find the partial correlation
coefficient between X1 and X2, keeping X3
constant.
Partial Correlations
r12 = correlation coefficient between X1
and X2
r13 = correlation coefficient between X1
and X3
r23 = correlation coefficient between X2
and X3.
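The formula for the partial correlation itself does not survive in the text of the slides. A minimal sketch, assuming the standard first-order formula r12.3 = (r12 - r13·r23) / sqrt((1 - r13²)(1 - r23²)), is given below. Here r12, r13 and r23 denote the three simple correlation coefficients between the pairs of variables, and the numeric values are hypothetical.

```python
from math import sqrt

def partial_correlation(r12, r13, r23):
    """Correlation between X1 and X2 with X3 held constant."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Hypothetical simple correlations among X1, X2 and X3.
r12, r13, r23 = 0.8, 0.6, 0.5

r12_3 = partial_correlation(r12, r13, r23)
print(round(r12_3, 3))
```

In this made-up case the partial correlation is lower than the simple correlation of 0.8, because part of the association between X1 and X2 is shared with X3.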
Module3-UNIT 3

SIMPLE LINEAR
REGRESSION
Simple linear regression
Simple linear regression is a
statistical method you can use to
understand the relationship between
two variables, x and y.
One variable, x, is known as the
predictor variable. The other
variable, y, is known as the response
variable.
Simple linear regression
Suppose we have the following dataset
that shows the weight and height of
players on a basketball team:
Simple linear regression
To make a scatterplot, we place the height
along the x-axis and the weight along the
y-axis. Each player is then represented as a dot
on the scatterplot:
Simple linear regression
From the scatterplot we can clearly see that as
weight increases, height tends to increase as
well, but to actually quantify this
relationship between weight and height, we need
to use linear regression.
Using linear regression, we can find the line
that best “fits” our data. This line is known
as the least squares regression line and it can
be used to help us understand the relationships
between weight and height.
Usually, you would use software like Microsoft
Excel, SPSS, or a graphing calculator to
find this line.
Simple linear regression
The formula for the line of best fit is
written as:
ŷ = b0 + b1*x
Where
ŷ is the predicted value of the response
variable,
b0 is the y intercept,
b1 is the regression coefficient, and x is
the value of the predictor variable.
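A least squares line can also be computed by hand. The sketch below writes the intercept as b0 and the slope as b1 (names assumed for illustration) and uses the usual closed-form estimates b1 = Sxy / Sxx and b0 = ȳ - b1·x̄. Because the basketball dataset is not reproduced in these slides, the five (X, Y) pairs from Example 1 of the correlation unit stand in as data.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1, r_squared) for the line y-hat = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx                 # slope (regression coefficient)
    b0 = mean_y - b1 * mean_x      # y intercept
    # R squared = 1 - (unexplained variation / total variation).
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return b0, b1, 1 - sse / sst

# Stand-in data: Example 1 from the correlation unit.
x = [41, 73, 67, 37, 58]
y = [52, 95, 72, 52, 96]

b0, b1, r_squared = least_squares_line(x, y)
print(round(b0, 4), round(b1, 4), round(r_squared, 4))
```

For a simple linear regression, R squared equals the square of the correlation coefficient r between x and y.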
Simple linear regression
For this example, we can simply plug our
data into the Statology Linear Regression
Calculator and hit Calculate:
https://www.statology.org/linear-regression-calculator/
The calculator automatically finds the
least squares regression line:
ŷ = -53.5937 + (3.4108)*x
Simple linear regression
Linear Regression Equation:
ŷ = -53.5937 + (3.4108)*x

Goodness of Fit:
R Square: 0.6449
Simple linear regression
Interpretation:
When the predictor variable is equal to 0, the
average value for the response variable is
-53.5937.
Each one-unit increase in the predictor
variable is associated with an average change
of 3.4108 in the response variable.
64.49% of the variation in the response
variable can be explained by the predictor
variable.
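The fitted equation ŷ = -53.5937 + (3.4108)*x can be turned into a small prediction function. The sketch below is illustrative only, and the input value of 70 is a hypothetical predictor value, not a figure from the dataset.

```python
INTERCEPT = -53.5937   # value of y-hat when x = 0
SLOPE = 3.4108         # average change in y-hat per one-unit increase in x

def predict(x):
    """Predicted response from the fitted least squares line."""
    return INTERCEPT + SLOPE * x

# Hypothetical predictor value.
y_hat = predict(70)
print(round(y_hat, 4))  # 185.1623

# Raising x by one unit changes the prediction by exactly the slope.
print(round(predict(71) - predict(70), 4))  # 3.4108
```

The intercept of -53.5937 is a mathematical anchor for the line rather than a realistic prediction, since a predictor value of 0 lies far outside the range of the data.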
Homework
1. Using correlation analysis.
A. Plot the following data (scatter plot).
B. Calculate the correlation coefficient.
C. What is the relationship between level of
education and lifetime earnings?
X (Education) Y (Income)
8 3.4
7 4.4
6 2.5
5 2.1
4 1.6
3 1.5
2 1.2
1 1
Homework
2. Researchers who measure reaction time for human participants often
observe a relationship between the reaction time scores and the number of
errors that the participants commit. This relationship is known as the
speed-accuracy tradeoff. The following data are from a reaction time study
where the researcher recorded the average reaction time (milliseconds) and
the total number of errors for each individual in a sample of 8
participants.
A. Plot the following data (scatter plot).
B. Calculate the correlation coefficient.
C. What is the relationship between reaction time and number of errors?
Reaction Time Errors
184 10
213 6
234 2
197 7
189 13
221 10
237 4
192 9
Homework
3. Using linear regression.
A. Find the linear regression equation.
B. Find the goodness of fit.
C. Plot the data in a scatter plot with the
linear regression equation.
X (Education) Y (Income)
8 3.4
7 4.4
6 2.5
5 2.1
4 1.6
3 1.5
2 1.2
1 1
Homework
By pair. Printed or handwritten.
You may use Excel or any software you deem
necessary.
Deadline is on Monday, December 4, 2023.
