
Q4 Lesson 1 2 Pearson R and T Test


Statistics and Probability

Quarter IV
Lesson 1 – 2

Lesson 1: Pearson Product-Moment Correlation

Describes the nature of bivariate data.

GENERALIZATION:

Bivariate data are data that involve two variables that are taken from a sample or population.
Univariate data are data that involve only a single variable.
Correlation analysis is a statistical method used to determine whether a relationship exists
between two variables.

Examples:
In a class, IQ scores and Math scores in a long exam can be collected.
In Chemistry, a confined gas can have different volumes and corresponding pressure.
In motoring, we can collect the ages and mileages of different cars.

Exercise I:
1. Cite three examples of bivariate data that are correlated.

Constructs the scatterplot for a set of bivariate data and estimates the strength of association
between two variables based on a scatterplot.

GENERALIZATION:

A scatterplot shows how the points of bivariate data are scattered. The arrangement of these
points is important in making analysis.

The line that is closest to the points is called the trend line. It indicates the direction of the
relationship, whether positive or negative, as denoted by the slope of the line.

In a positive correlation, high values in one variable correspond to high values in the other variable.

In a negative correlation, high values in one variable correspond to low values in the other variable.

In the analysis of a scatterplot, two elements should be considered: the direction and the strength
of the correlation or relationship.

The closeness of the points to the trend line determines the strength of the association. The
closer the points are to the line, the stronger is the correlation.
A perfect correlation exists when all the points fall on the trend line. A perfect correlation may be
positive or negative. A perfect correlation happens only when other variables that may affect the
relationship between the two variables are controlled.


Example: Construct the scatterplot for the following bivariate data and tell
whether the relationship is positive or negative.
Age of a car, in years 1 2 3 4 5 6 7
Mileage in km/liter 15 12 12 10 11 10 8

Answer: Since the mileage of the car decreases as its age increases, there is a negative
relationship between the two variables.
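
The direction of the trend line can also be checked numerically. The sketch below is only an illustration: it assumes the trend line is fitted by ordinary least squares (the lesson does not say how the line is drawn), and the function name `trend_line` is hypothetical.

```python
def trend_line(xs, ys):
    """Return (slope, intercept) of the least-squares trend line.

    Assumption: the trend line is the ordinary least-squares line.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Data from the example: age of a car (years) and mileage (km/liter)
age = [1, 2, 3, 4, 5, 6, 7]
mileage = [15, 12, 12, 10, 11, 10, 8]

slope, intercept = trend_line(age, mileage)
direction = "negative" if slope < 0 else "positive"
print(f"slope = {slope:.2f} km/liter per year, so the relationship is {direction}")
```

A negative slope means the line falls from left to right, which agrees with the answer given for this example.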

Exercise II:
For each case, determine the two variables and tell whether the relationship is
positive or negative.
1. The more students enroll in a school, the more teachers are needed.
2. As a person ages, his memory decreases.
3. The more time Jack spends studying his lessons, the higher his average grade.


Pearson Correlation Coefficient

The most common coefficient of correlation is known as the Pearson product-moment
correlation coefficient, or Pearson’s r. It is a measure of the linear correlation (dependence)
between two variables X and Y, giving a value between −1 and +1. It was developed by Karl
Pearson from a related idea introduced by Francis Galton in the 1880s.

When conducting a statistical test between two variables, it is a good idea to compute the
Pearson correlation coefficient to determine how strong the relationship between the two
variables is. If the coefficient is negative, the variables are negatively correlated: as one
value increases, the other decreases. If the coefficient is positive, the variables are
positively correlated: both values increase or decrease together.

The Meaning of the Correlation Coefficient


1. If the trend line contains all the points in the scatterplot and rises from left to right, we
conclude that there is a perfect positive correlation between the two variables. The computed
r is 1.
2. If all the points fall on a trend line that falls from left to right, then there exists a perfect
negative correlation between the pair of variables. The computed value of r is -1.
3. If no trend line exists, there is no linear correlation between the pair of variables. This is
confirmed by a computed value of r equal to 0.
4. The absolute value of r indicates the strength of correlation between the two variables. The
direction of correlation is indicated by the sign (positive or negative) of r.

The following table can be used to interpret the strength of the correlation from the absolute
value of r. The sign of r gives the direction.

Value of r (±)     Strength of Correlation
1                  Perfect positive/negative correlation
0.71 to 0.99       Strong positive/negative correlation
0.51 to 0.70       Moderate positive/negative correlation
0.31 to 0.50       Weak positive/negative correlation
0.01 to 0.30       Negligible positive/negative correlation
0                  No correlation
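
The interpretation table can be written as a small lookup. This is a minimal sketch, not part of the lesson: the function name `interpret_r` is hypothetical, and values that fall between the printed ranges (for example 0.705) are assigned to the higher category, which is an assumption.

```python
def interpret_r(r):
    """Describe a correlation coefficient using the lesson's table.

    Assumption: values between the printed ranges (e.g. 0.705) are
    assigned to the higher category.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength comes from the absolute value
    if size == 0:
        return "No correlation"
    sign = "positive" if r > 0 else "negative"  # direction comes from the sign
    if size == 1:
        strength = "Perfect"
    elif size > 0.70:
        strength = "Strong"
    elif size > 0.50:
        strength = "Moderate"
    elif size > 0.30:
        strength = "Weak"
    else:
        strength = "Negligible"
    return f"{strength} {sign} correlation"


print(interpret_r(0.96))   # Strong positive correlation
print(interpret_r(-0.63))  # Moderate negative correlation
```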


Example 1. The table below shows the time in hours spent studying (x) by six
grade 11 students and their scores on a test (y). Solve for the Pearson Product
Moment Correlation Coefficient, r.

x 1 2 3 4 5 6
y 5 10 15 15 25 35

Answer:

Steps and Solution:

1. Construct the table shown below. Complete the entries in each column. Get the sum of all
entries below the columns.

X        Y         XY         X²        Y²
1        5         5          1         25
2        10        20         4         100
3        15        45         9         225
4        15        60         16        225
5        25        125        25        625
6        35        210        36        1,225
ΣX = 21  ΣY = 105  ΣXY = 465  ΣX² = 91  ΣY² = 2,425

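2. Substitute the values obtained in the formula:

$$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}}$$

$$r = \frac{6(465) - (21)(105)}{\sqrt{\left[6(91) - (21)^2\right]\left[6(2{,}425) - (105)^2\right]}}$$

$$r = \frac{585}{\sqrt{(105)(3{,}525)}}$$

$$r = \frac{585}{\sqrt{370{,}125}}$$

r = 0.96157
r = 0.96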
3. Interpret the value of r. The value r = 0.96 is between +0.71 and +0.99 in the table of r,
which indicates a strong positive correlation between the time in hours spent studying and the
scores on a test. It may be concluded that a student who spends more time studying is expected
to have a higher score on the test.
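
The computation can be reproduced with a short script. This is a sketch under the assumption that the computational formula r = [nΣXY − ΣXΣY] / √([nΣX² − (ΣX)²][nΣY² − (ΣY)²]) is the one intended by the lesson; the function name `pearson_r` is hypothetical. The second call uses the heights and weights of the six teachers from Example 2.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from raw data, using the sums of X, Y, XY, X^2 and Y^2."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Example 1: hours spent studying (x) and test scores (y)
hours = [1, 2, 3, 4, 5, 6]
scores = [5, 10, 15, 15, 25, 35]
r_study = pearson_r(hours, scores)

# Example 2: heights (cm) and weights (kg) of six teachers
heights = [160, 162, 167, 158, 167, 170]
weights = [50, 59, 63, 52, 65, 68]
r_teachers = pearson_r(heights, weights)

print(round(r_study, 2), round(r_teachers, 2))  # 0.96 0.96
```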

Example 2.

Listed below are the heights in centimeters and weights in kilograms of six teachers. Solve for the
Pearson Product-Moment Correlation Coefficient.

Teacher A B C D E F
Heights (in cm) 160 162 167 158 167 170
Weights (in kg) 50 59 63 52 65 68

Answer:
Teacher   x (height)   y (weight)   xy            x²             y²
A         160          50           8,000         25,600         2,500
B         162          59           9,558         26,244         3,481
C         167          63           10,521        27,889         3,969
D         158          52           8,216         24,964         2,704
E         167          65           10,855        27,889         4,225
F         170          68           11,560        28,900         4,624
          ∑x = 984     ∑y = 357     ∑xy = 58,710  ∑x² = 161,486  ∑y² = 21,503

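$$r = \frac{6(58{,}710) - (984)(357)}{\sqrt{\left[6(161{,}486) - (984)^2\right]\left[6(21{,}503) - (357)^2\right]}}$$

$$r = \frac{972}{\sqrt{(660)(1{,}569)}}$$

$$r = \frac{972}{\sqrt{1{,}035{,}540}}$$

r = 0.95517
r = 0.96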
It indicates a strong positive correlation between the height and weight of the
six teachers. It shows that as the height of the teachers increases, the weight
also increases.
Example 3. The following data show the scores of 5 students in Statistics and
General Mathematics. Determine if there is a relationship between the scores in
Statistics and General Mathematics.

Student    Score in Statistics (X)    Score in Gen Math (Y)
Alvin      3                          5
Johnny     9                          8
Efren      10                         10
Gemma      12                         9
Cecilia    7                          8


Answer:
Steps:
1. Construct a table and complete the columns X², Y², and XY.
2. Get the sum of all entries: ΣX, ΣY, ΣX², ΣY², ΣXY.
3. Substitute the values obtained in Step 2 in the formula

Student    X     Y     X²     Y²     XY
Alvin      3     5     9      25     15
Johnny     9     8     81     64     72
Efren      10    10    100    100    100
Gemma      12    9     144    81     108
Cecilia    7     8     49     64     56
Total      41    40    383    334    351

ΣX = 41   ΣY = 40   ΣX² = 383   ΣY² = 334   ΣXY = 351

Substituting these sums into the formula with n = 5 gives r = 115 / √16,380 ≈ 0.90.

Interpretation: There is a strong positive correlation between the scores in Statistics and General
Mathematics. A student who gets a high score in Statistics is also expected to get a
high score in General Mathematics.

Exercise III:
Compute r for each of the following:
a. ΣX = 225          b. ΣX = 32
   ΣY = 22              ΣY = 1,105
   ΣX² = 9,653          ΣX² = 220
   ΣY² = 143            ΣY² = 364,525
   ΣXY = 651            ΣXY = 3,402
   n = 6                n = 6

Answers to Exercises:
Exercise I: (Answers may vary)
Ex. Arm span and height of a person; savings and expenditures of an individual
Exercise II:
1. Student enrollment and number of teachers – Positive
2. Person’s age and memory – Negative
3. Time spent in studying and Average grade – Positive
Exercise III:
a. −0.63
b. −0.88
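
When only the sums are given, as in Exercise III, r can be computed directly from them. The sketch below assumes the quantities listed in the exercise are the sums ΣX, ΣY, ΣX², ΣY² and ΣXY; the function name `pearson_r_from_sums` is hypothetical.

```python
import math


def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r computed from the five sums and the number of pairs n."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Exercise III, item a
r_a = pearson_r_from_sums(6, 225, 22, 9653, 143, 651)
# Exercise III, item b
r_b = pearson_r_from_sums(6, 32, 1105, 220, 364525, 3402)

print(round(r_a, 2), round(r_b, 2))  # -0.63 -0.88
```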


Lesson 2: t-Test

INTRODUCTION TO THE T-TEST

The t-test is a statistical test procedure that tests whether there is a significant difference
between the means of two groups.

Types of t-test

There are three different types of t-tests: the one sample t-test, the independent-sample t-test,
and the paired-sample t-test.

1. One sample t-Test

When do we use the one sample t-test (simple t-test)? We use the one sample t-test when
we want to compare the mean of a sample with a known reference mean.

Example of a one sample t-test

A manufacturer of chocolate bars claims that its chocolate bars weigh 50 grams on
average. To verify this, a sample of 30 bars is taken and weighed. The mean value of this sample
is 48 grams.


We can now perform a one sample t-test to see if the mean of 48 grams is significantly
different from the claimed 50 grams.

2. t-test for independent samples

When to use the t-test for independent samples? We use the t-test for independent
samples when we want to compare the means of two independent groups or samples. We want to
know if there is a significant difference between these means.

Example of a t-test for independent samples

We would like to compare the effectiveness of two painkillers, drug A and drug B.

To do this, we randomly divide 60 test subjects into two groups. The first group
receives drug A, the second group receives drug B. With an independent t-test we can now test
whether there is a significant difference in pain relief between the two drugs.

3. Paired samples t-Test

When to use the t-test for dependent samples (paired t-test)? The t-test for dependent
samples is used to compare the means of two dependent groups.

Example of the t-test for paired samples


We want to know how effective a diet is. To do this, we weigh 30 people before the diet
and exactly the same people after the diet.

Now we can see for each person how big the weight difference is
between before and after. With a dependent t-test we can now check whether there is a
significant difference.

Calculate t-test

How do you calculate a t-test? First the t-value is needed:

$$t = \frac{\text{difference of the means}}{\text{standard error}}$$

To calculate the t-value, we need two values. First, we need the difference of the means
and second, the standard deviation of the mean. This value is called the standard error.

In the sample t-test, we calculate the difference between the sample mean and the known
reference mean. s is the standard deviation of the data collected and n is the number of cases.

s divided by the square root of n is then the standard deviation of the mean, that is, the
standard error.

In the t-test for independent samples, the difference is simply calculated from the
difference of the two sample means.

To calculate the standard error, we need the standard deviation and the number of cases
of the first and the second sample.
Depending on whether we can assume equal or unequal variances for our data, there are
different formulas for the standard error.

With a paired samples t-test, we only need to calculate the difference of the paired values
and calculate the mean from this. The standard error is then the same as in the t-test for one
sample.
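
The link between the paired test and the one sample test can be sketched in code. The weights below are hypothetical illustration data, not from the lesson, and the function names `one_sample_t` and `paired_t` are assumptions.

```python
import math
import statistics


def one_sample_t(values, reference_mean):
    """t-value and degrees of freedom for a one sample t-test."""
    n = len(values)
    # Standard error = sample standard deviation / sqrt(n)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    t = (statistics.mean(values) - reference_mean) / standard_error
    return t, n - 1


def paired_t(before, after):
    """Paired t-test: a one sample t-test on the differences against 0."""
    differences = [b - a for b, a in zip(before, after)]
    return one_sample_t(differences, 0)


# Hypothetical weights (kg) of four people before and after a diet
weight_before = [80, 75, 90, 68]
weight_after = [78, 74, 85, 66]

t_value, df = paired_t(weight_before, weight_after)
print(f"t({df}) = {t_value:.2f}")  # t(3) = 2.89
```

Writing the paired test as a call to the one sample test mirrors the statement above that the two share the same standard error formula.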

The t-value and the null hypothesis

We now want to use the t-test to find out whether we reject the null hypothesis or not. To
do this, we can use the t-value in two ways. Either we read the so-called critical t-value from a
table or we simply calculate the p-value with the help of the t-value.

Let's start with the method involving the critical t-value, which we can read from a table.
To do this, we first need the table of critical t-values, which we can find on datatab.net, under
"Tutorials" and "t-distribution". Let's start with the two-sided case, which corresponds to an
undirected (non-directional) hypothesis. Below we see the table.


First we have to determine which significance level we want to use. Here we choose a
significance level of 0.05, i.e. 5%. Then we have to look in the column at 1-0.05, so at 0.95.

Now we need the degrees of freedom. In the one sample t-test and the dependent-sample
t-test, the degrees of freedom are simply the number of cases minus 1. So if we have a sample of
10 people, we have 9 degrees of freedom.

In the independent samples t-test, we add the number of people from the two samples and
calculate minus 2 because we have two samples. It should be noted that the degrees of freedom
can also be determined in other ways, depending on whether one assumes equal or unequal
variance.

So if we have a significance level of 5% and 9 degrees of freedom, we get a critical
t-value of 2.262.

We now have two values: the t-value calculated with the t-test and the critical t-value. If the
absolute value of the calculated t-value is greater than the critical t-value, we reject the null
hypothesis. Suppose we have calculated a t-value of 2.5. This value is greater than 2.262, and
thus the two means are so far apart that we can reject the null hypothesis.

Alternatively, we can calculate the p-value for the t-value we obtained. If we enter 2.5 for the
t-value and 9 for the degrees of freedom into the p-value calculator (the green marked region of
the image), we get a p-value of 0.034. The p-value is smaller than 0.05, and thus we also reject
the null hypothesis in this way.
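
Without an online calculator, the two-sided p-value can be approximated by integrating the density of the t-distribution numerically. This is only a sketch: Simpson's rule and the function names `t_pdf` and `two_sided_p` are choices made here for illustration, not the method used by datatab.net.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: area in both tails beyond |t|.

    Uses Simpson's rule on [0, |t|]; steps must be even.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and +|t|
    return 1 - central_area


p_value = two_sided_p(2.5, 9)
print(f"p = {p_value:.3f}")  # about 0.034
print("reject null" if p_value < 0.05 else "fail to reject null")
```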


As a check, if we enter the t-value of 2.262, we get a p-value of 0.05, which is
exactly the limit.

EXAMPLE:

A researcher wants to determine if there is a significant difference in the mean exam
scores between two different teaching methods. The researcher randomly selects two groups of
students: Group A, which receives traditional lectures, and Group B, which receives interactive
online tutorials. The exam scores (out of 100) for each group are as follows:

Group A: 78, 82, 75, 85, 79, 88, 72, 80, 83, 76

Group B: 85, 90, 88, 92, 86, 89, 82, 91, 87, 84

Perform a t-test to determine if there is a significant difference in the mean exam scores
between the two groups. Use a significance level of α = 0.05.

Solution:

Step 1: State the null and alternative hypotheses.


Ho: There is no significant difference in the mean exam scores between two groups.
Ha: There is a significant difference in the mean exam scores between two groups.

Step 2. Determine the test statistic.


Since we are comparing two independent groups, we use the t-test for independent samples.

Step 3: Determine the critical t-value from the t-distribution table for a
significance level of α = 0.05 and
df = n₁ + n₂ - 2 = 10 + 10 – 2 = 18
The critical t-value is approximately ±2.101 (two-tailed test).

Step 4. Compute for the t-test value.

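Calculate the sample means, sample standard deviations, and sample sizes for each
group.

For Group A:
Mean (x̄₁) = 798 / 10 = 79.8
Sample standard deviation (s₁) ≈ 4.85
Sample size (n₁) = 10

For Group B:
Mean (x̄₂) = 874 / 10 = 87.4
Sample standard deviation (s₂) ≈ 3.20
Sample size (n₂) = 10

Calculate the t-statistic using the formula below. Because the two sample sizes are equal, the
pooled and unpooled standard errors give the same value.

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{79.8 - 87.4}{\sqrt{\dfrac{23.51}{10} + \dfrac{10.27}{10}}} = \frac{-7.6}{1.838} \approx -4.14$$

Compare the calculated t-value to the critical t-value.

Step 5. Decision.

Since |-4.14| > 2.101, we reject the null hypothesis.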

Step 6. Interpretation.

There is sufficient evidence to conclude that there is a significant difference in the mean
exam scores between the two teaching methods at the α = 0.05 level.
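
The whole procedure can be recomputed from the raw scores with Python's standard library. This is a sketch that assumes equal variances (pooled standard error, matching df = n₁ + n₂ − 2); the function name `independent_t` is hypothetical, and the critical value 2.101 is taken from Step 3.

```python
import math
import statistics

group_a = [78, 82, 75, 85, 79, 88, 72, 80, 83, 76]
group_b = [85, 90, 88, 92, 86, 89, 82, 91, 87, 84]


def independent_t(sample_1, sample_2):
    """t-value and degrees of freedom, assuming equal variances (pooled)."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_1 = statistics.mean(sample_1)
    mean_2 = statistics.mean(sample_2)
    var_1 = statistics.variance(sample_1)  # sample variance (n - 1)
    var_2 = statistics.variance(sample_2)
    pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean_1 - mean_2) / standard_error
    return t, n1 + n2 - 2


t_value, df = independent_t(group_a, group_b)
critical_t = 2.101  # two-tailed, alpha = 0.05, df = 18 (from the t-table)
reject_null = abs(t_value) > critical_t

print(f"mean A = {statistics.mean(group_a)}, mean B = {statistics.mean(group_b)}")
print(f"s1 = {statistics.stdev(group_a):.2f}, s2 = {statistics.stdev(group_b):.2f}")
print(f"t({df}) = {t_value:.2f}, reject null: {reject_null}")
```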

Examples using SPSS

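I. One Sample t-Test – Used to test whether the mean of a single variable differs from a
specified constant.

For instance, a researcher may want to test whether the average IQ score of a group of
students differs from 100. In the example below, the question is whether the average grade on
Assignment 1 differs from 23.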

Step 1: State the Null and Alternate Hypotheses


Ho: The average grade on Assignment 1 is equal to 23.
Ha: The average grade on Assignment 1 is not equal to 23.
Is this a directional or nondirectional Ha?
Step 2: Input each student’s grade into SPSS
Step 3: Run the Analysis.
Analyze → Compare Means → One Sample T-test
Test variable = assign1
Test value = 23
Click OK


Step 4: Make a decision regarding the null


M = 21.03, SD = 1.54
t (14) = -4.944, where df = n - 1 = 15 - 1 = 14
p < .001
What is the decision regarding the null?

Using the level of significance = .05, do we reject or fail to reject the null?
If p < .05, we reject the null
If p > .05, we fail to reject the null

According to SPSS, p < .001


Since p < .001 is less than .05, we reject the null.

Step 5: Write up your results.


The null hypothesis stated that the average grade on Assignment 1 is equal to 23. A one
sample t-test revealed that the average grade on Assignment 1 (M = 21.03, SD = 1.54) differed
significantly from 23, t (14) = -4.944, p < .001. Consequently, the null hypothesis was rejected.
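
As a rough check of the SPSS output, the t-value can be recomputed from the reported summary statistics. This sketch uses the rounded values M = 21.03 and SD = 1.54, so the result differs slightly from the -4.944 that SPSS computes from the unrounded data; the function name `one_sample_t` is hypothetical.

```python
import math


def one_sample_t(sample_mean, sample_sd, n, reference_mean):
    """t-value and degrees of freedom from summary statistics."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - reference_mean) / standard_error
    return t, n - 1


# Rounded summary statistics from the SPSS output, test value = 23
t_value, df = one_sample_t(21.03, 1.54, 15, 23)
print(f"t({df}) = {t_value:.2f}")  # t(14) = -4.95
```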

II. Independent t-Test. The independent samples t-test is used to test comparative research
questions. That is, it tests for differences in two group means or compares means for two
groups of cases.

Suppose the stats professor wanted to determine whether the average score on
Assignment 1 in one stats class differed significantly from the average score on Assignment 1 in
her second stats class.

Step 1: State the Null and Alternate Hypotheses


Ho: There is no difference between class 1 and class 2 on Assignment 1.
Ha: There is a difference between class 1 and class 2 on Assignment 1.
Is this a directional or nondirectional Ha?
Step 2: Input each student’s grade into SPSS, along with which class they are in

Step 3: Run the Analysis.


Analyze → Compare Means → Independent Samples T-test
Test variable = assign1
Grouping variable = class

Define Groups:
Type “1” next to Group 1
Type “2” next to Group 2
Click Continue
Click OK

Step 4: Make a decision regarding the null


Class 1 (M = 21.18, SD = 1.49)
Class 2 (M = 21.90, SD = 1.94)

Which row do we look at on the output?

Step 5: Levene’s Test for equal variances


Ho: The variances of the two groups are equal.
Ha: The variances of the two groups are not equal.

Here we read the "Equal variances not assumed" row (the bottom row) of the output.

Make a decision regarding the null


t (22.5) = -1.086
p = .289

Using the level of significance = .05, do we reject or fail to reject the null?
Remember
If p < .05, we reject the null
If p > .05, we fail to reject the null
According to SPSS, p = .289
.289 > .05, therefore, we fail to reject the null.

Step 6: Write up your results.


The null hypothesis stated that there is no difference between class 1 and class 2 on
Assignment 1. An independent samples t-test revealed that the average grades on Assignment 1
did not differ significantly between Class 1 (M = 21.18, SD = 1.49) and Class 2
(M = 21.90, SD = 1.94), t (22.5) = -1.086, p = .289.
Consequently, the researcher failed to reject the null hypothesis.


III. Paired Samples t-Test. Used to compare the means of two variables for a single group.
The procedure computes the differences between values of the two variables for each
case and tests whether the average differs from 0.

A researcher wanted to know the effects of a reading program. The researcher gave the
students a pretest, implemented the reading program, then gave the students a post test.

Step 1: State the Null and Alternate Hypotheses


Ho: There is no difference in students’ performance between the pretest and the posttest.
Ha: Students will perform better on the posttest than on the pretest.

Is this a directional or nondirectional Ha?


Remember: when we have a directional hypothesis, we conduct a one-tailed test.
When we have a non-directional hypothesis, we conduct a two-tailed test.
Unless given the choice, SPSS automatically runs a two-tailed test. If you have a directional
alternate hypothesis and a two-tailed test was run, you must divide the p-value by 2 to obtain
the correct p-value, provided the observed difference is in the hypothesized direction.

Step 2: Set up data

Step 3: Analyze the Results


Analyze → Compare Means → Paired Samples t-Test
Paired variables: pre – post

Step 4: Make a decision regarding the null


Pretest (M = 19.83, SD = 1.17)
Posttest (M = 23.83, SD = 1.17)
t (5) = -15.49, p < .001 (two-tailed)

We have a directional alternate, therefore we have to divide the two-tailed p-value by 2.
SPSS displays the two-tailed p-value as .000, and .000/2 is still below .001, so we report
p < .001.

What is the decision regarding the null?

Using the level of significance = .05, do we reject or fail to reject the null?
If p < .05, we reject the null
If p > .05, we fail to reject the null
According to SPSS, p < .001
.001 < .05, therefore, we reject the null.

Step 5: Write up your results.


The null hypothesis stated that there is no difference in students’ performance
between the pretest and the posttest. A paired samples t-test revealed that students scored
significantly higher on the posttest (M = 23.83, SD = 1.17) than they did on the pretest
(M = 19.83, SD = 1.17), t (5) = -15.49, p < .001. Consequently, the null hypothesis was
rejected.
