Course Content: St. Paul University Philippines
Formulas (Ungrouped Data)
4. Variance
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
5. Standard deviation
$$s = \sqrt{s^2}$$
Exercise:
Given the following data, find the range, MAD,
variance and the standard deviation.
20, 26, 40, 39, 35
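As a check on hand computation, the exercise data can be run through a short Python sketch. It assumes the ungrouped MAD divides the sum of absolute deviations by n (mirroring the grouped-data formula) and that the variance uses the n - 1 divisor; the variable names are illustrative only.

```python
from statistics import mean, stdev, variance

data = [20, 26, 40, 39, 35]

x_bar = mean(data)
data_range = max(data) - min(data)                    # highest minus lowest value
mad = sum(abs(x - x_bar) for x in data) / len(data)   # assumed divisor: n
s2 = variance(data)                                   # sample variance, divisor n - 1
s = stdev(data)                                       # square root of s2

print(f"range = {data_range}, MAD = {mad:.2f}, s^2 = {s2:.2f}, s = {s:.2f}")
```

The same four lines of arithmetic can be applied to the grades of Student A and Student B in the application that follows.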
Application:
Two seemingly equally excellent students are
vying for an academic honor, but only one can
be chosen to receive the award. Their grades,
which serve as the basis for giving the award,
are as follows.
Student A: 90, 92, 92, 94, 95
Student B: 90, 91, 93, 94, 95
Who do you think deserves the award? Why?
Guiding Principle
The smaller the value of a measure of variability,
the more consistent, the more homogeneous and
the less scattered are the observations in the
set of data.
Formulas (Grouped Data)
1. Range
$$R = HOV - LOV$$
2. Mean absolute deviation
$$MAD = \frac{\sum f\,|X_m - \bar{X}|}{n}$$
3. Semi-interquartile range / quartile deviation
$$QD = \frac{Q_3 - Q_1}{2}$$
4. Variance
$$s^2 = \frac{\sum f\,(X_m - \bar{X})^2}{n - 1}$$
5. Standard deviation
$$s = \sqrt{s^2}$$
Exercise:
Using the frequency distribution below, find:
1. Range
2. MAD
3. QD
4. Variance
5. Standard deviation

X          F
56 - 62    6
49 - 55    9
42 - 48    10
35 - 41    12
28 - 34    10
21 - 27    8
14 - 20    6
7 - 13     4
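The grouped-data formulas for the MAD, variance and standard deviation can be sketched in Python for the frequency distribution in this exercise. The sketch assumes X_m is the class midpoint (class mark) and leaves out the range and QD, whose conventions for class boundaries vary between textbooks.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each class in the exercise
classes = [(56, 62, 6), (49, 55, 9), (42, 48, 10), (35, 41, 12),
           (28, 34, 10), (21, 27, 8), (14, 20, 6), (7, 13, 4)]

freqs = [f for _, _, f in classes]
mids = [(lo + hi) / 2 for lo, hi, _ in classes]   # class marks X_m (assumed)
n = sum(freqs)

x_bar = sum(f * m for f, m in zip(freqs, mids)) / n
mad = sum(f * abs(m - x_bar) for f, m in zip(freqs, mids)) / n
s2 = sum(f * (m - x_bar) ** 2 for f, m in zip(freqs, mids)) / (n - 1)
s = sqrt(s2)

print(f"n = {n}, mean = {x_bar:.2f}, MAD = {mad:.2f}, s^2 = {s2:.2f}, s = {s:.2f}")
```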
Tests of Hypothesis
Hypothesis
A statement or tentative theory which aims to
explain facts about the real world
An educated guess
It is subject to testing. If the evidence shows it to be
statistically tenable, it is accepted. Otherwise, it is
rejected.
Kinds of Hypotheses
1. Null Hypothesis (Ho)
It serves as the working hypothesis
It is that which one hopes to accept or reject
It must always express the idea of no
significant difference
2. Alternative Hypothesis (H1 or Ha)
It generally represents the hypothetical
statement that the researcher wants to prove.
Types of Alternative Hypotheses (Ha)
1. Directional hypothesis
expresses direction
one-tailed
uses the order relation "greater than" (>) or "less than" (<)
2. Non-directional hypothesis
does not express direction
two-tailed
uses the "not equal to" relation (≠)
Type I and Type II Errors
When making a decision about a proposed
hypothesis based on the sample data, one runs the
risk of making an error. The two kinds of error
are defined as follows:
A Type I error is the mistake of rejecting the null
hypothesis when it is true.
The symbol α (alpha) is used to represent the probability
of a Type I error.
A Type II error is the mistake of failing to reject the null
hypothesis when it is false.
The symbol β (beta) is used to represent the probability of
a Type II error.
Level of Significance
The probability of making a Type I error (alpha
error) in a test is called the significance level of the
test. It is the maximum probability of rejecting the
null hypothesis (Ho) when in fact it is true.
Critical Region
The critical region (or rejection region) is the set of all values
of the test statistic that cause us to reject the null hypothesis.
[Figure: sampling distribution showing the region of acceptance, the region of rejection, the critical value and the P-value]
Critical Value
A critical value is any value that separates the
critical region (where we reject the null
hypothesis) from the values of the test statistic
that do not lead to rejection of the null
hypothesis. The critical values depend on the nature
of the null hypothesis, the sampling distribution that
applies, and the significance level α.
P - Value
The P-value (probability value) is the probability of
getting a value of the test statistic that is at least as
extreme as the one representing the sample data,
assuming that the null hypothesis is true. The null
hypothesis is rejected if the P-value is very small,
such as 0.05 or less.
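For a test statistic that follows the standard normal distribution, the P-value can be sketched with Python's standard library. The function name `p_value` and its `tail` argument are hypothetical names chosen for this illustration.

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """P-value of a z statistic; tail is 'two', 'right' or 'left'."""
    cdf = NormalDist().cdf(z)          # area to the left of z
    if tail == "right":
        return 1 - cdf
    if tail == "left":
        return cdf
    return 2 * min(cdf, 1 - cdf)       # both tails, at least as extreme as z

print(round(p_value(1.96), 4))             # two-tailed
print(round(p_value(1.645, "right"), 4))   # right-tailed
```

Both calls give a P-value of about 0.05, which is why 1.96 and 1.645 are the usual cut-offs at that level.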
Two-tailed, Right-tailed and Left-tailed Tests
The tails in a distribution are the extreme
regions bounded by critical values.
Two-tailed Tests
Given: H0: = ; H1: ≠
Right-tailed Tests
Given: H0: = ; H1: >
Left-tailed Tests
Given: H0: = ; H1: <
Steps in Hypothesis Testing
1. Formulate the null hypothesis (Ho) that there is no
significant difference between the items compared. State
the alternative hypothesis (Ha) which is used in case Ho
is rejected.
2. Set the level of significance of the test, α.
3. Determine the test to be used.
Z-test: used if the population standard deviation is given
T-test: used if the sample standard deviation is given
4. Determine the tabular value of the test.
For a Z-test, the table below summarizes the
critical values at varying significance levels:

Type of Test    Level of Significance
                0.10     0.05     0.025    0.01
One-tailed      1.28     1.645    1.96     2.33
Two-tailed      1.645    1.96     2.24     2.58
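Critical values of z can also be computed rather than read from a table. The sketch below uses the inverse of the standard normal CDF; `z_critical` is a hypothetical helper name, and a two-tailed test is assumed to split α equally between the two tails.

```python
from statistics import NormalDist

def z_critical(alpha, tails=1):
    """Positive critical z for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.10, 0.05, 0.025, 0.01):
    one = z_critical(alpha, tails=1)
    two = z_critical(alpha, tails=2)
    print(f"alpha = {alpha}: one-tailed {one:.3f}, two-tailed {two:.3f}")
```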
For a T-test, one must first compute the
degrees of freedom (df), then look up the tabular
value in the table of Student's t distribution.
i. For a single sample
df = n - 1
ii. For two samples
df = n1 + n2 - 2
5. Compute z or t as needed, choosing the formula that
fits the case:
For z test
i. Sample mean compared with a population mean
ii. Comparing two sample means
iii. Comparing two sample proportions
For t test
i. Sample mean compared with a population mean
ii. Comparing two sample means
6. Compare the computed value with its
corresponding tabular value, then state your
conclusions based on the following guidelines:
Reject Ho if the absolute computed value is
equal to or greater than the absolute tabular value.
Accept Ho if the absolute computed value is less
than the absolute tabular value.
Decision Criterion
Traditional Method:
Reject H0 (accept H1) if the test
statistic falls within the critical region.
Fail to reject H0 (accept H0) if the
test statistic does not fall within the critical
region.
P-value method:
Reject H0 (accept H1) if P-value ≤ α
(where α is the significance level, such as 0.05).
Fail to reject H0 (accept H0) if P-value > α.
Another option:
Instead of using a significance level
such as 0.05, simply identify the P-value and
leave the decision to the reader.
Z - TEST
1. Sample Mean ($\bar{X}$) Compared with a Population Mean ($\mu$)
$$Z = \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma}$$
Where:
$\bar{X}$ = sample mean
$\mu$ = population mean
$n$ = number of items in the sample
$\sigma$ = population standard deviation
2. Comparing Two Sample Means ($\bar{X}_1$ and $\bar{X}_2$)
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
Where:
$\bar{X}_1$ = mean of the first sample
$\bar{X}_2$ = mean of the second sample
$n_1$ = number of items in the first sample
$n_2$ = number of items in the second sample
$\sigma$ = population standard deviation
3. Comparing Two Sample Proportions ($p_1$ and $p_2$)
$$Z = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$$
Where:
$p_1$ = proportion of the first sample
$p_2$ = proportion of the second sample
$n_1$ = number of items in the first sample
$n_2$ = number of items in the second sample
$q_1 = 1 - p_1$
$q_2 = 1 - p_2$
T - TEST
4. Sample Mean ($\bar{X}$) Compared with a Population Mean ($\mu$)
$$t = \frac{(\bar{X} - \mu)\sqrt{n - 1}}{s}$$
Where:
$\bar{X}$ = sample mean
$\mu$ = population mean
$n$ = number of items in the sample
$s$ = sample standard deviation
5. Comparing Two Sample Means ($\bar{X}_1$ and $\bar{X}_2$)
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left[\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\right]\left[\frac{1}{n_1} + \frac{1}{n_2}\right]}}$$
Where:
$\bar{X}_1$ = mean of the first sample
$\bar{X}_2$ = mean of the second sample
$n_1$ = number of items in the first sample
$n_2$ = number of items in the second sample
$s_1$ = standard deviation of the first sample
$s_2$ = standard deviation of the second sample
Example 1
Data from a school census show that the
mean weight of college students is 45 kilos with a
standard deviation of 3 kilos. A sample of 100
college students was found to have a mean of 47
kilos. Are the college students in the sample really heavier
than the rest at the 0.05 level of significance?
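A minimal sketch of Example 1 in Python, assuming a right-tailed test (the question asks whether the sample is heavier) and the one-tailed critical value 1.645 at the 0.05 level. The function name `z_one_sample` is hypothetical.

```python
from math import sqrt

def z_one_sample(x_bar, mu, sigma, n):
    """Z = (x_bar - mu) * sqrt(n) / sigma."""
    return (x_bar - mu) * sqrt(n) / sigma

z = z_one_sample(x_bar=47, mu=45, sigma=3, n=100)
z_crit = 1.645                    # one-tailed, alpha = 0.05
reject_h0 = abs(z) >= z_crit      # decision rule from step 6

print(f"z = {z:.2f}, reject Ho: {reject_h0}")
```

Under these assumptions the computed z (about 6.67) far exceeds the tabular value, so Ho is rejected.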
Example 2
A researcher wishes to find out whether or not there
is a significant difference in the monthly allowance of
morning and afternoon students in his school. By random
sampling, he took a sample of 239 students in the morning
session. The students were found to have a mean monthly
allowance of P142.00. The researcher also took a sample of
209 students in the afternoon session. They were found to
have a mean monthly allowance of P148.00. The population
of students in that school has a standard deviation of
P40.00. Is there a significant difference between the two
samples at the 0.01 level?
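Example 2 can be sketched the same way, assuming a two-tailed test (the question asks only whether the samples differ) and the two-tailed critical value 2.58 at the 0.01 level. `z_two_means` is a hypothetical helper name.

```python
from math import sqrt

def z_two_means(x1, x2, sigma, n1, n2):
    """Z = (x1 - x2) / (sigma * sqrt(1/n1 + 1/n2))."""
    return (x1 - x2) / (sigma * sqrt(1 / n1 + 1 / n2))

z = z_two_means(x1=142, x2=148, sigma=40, n1=239, n2=209)
z_crit = 2.58                     # two-tailed, alpha = 0.01
reject_h0 = abs(z) >= z_crit

print(f"z = {z:.3f}, reject Ho: {reject_h0}")
```

Here the absolute computed value (about 1.58) is below 2.58, so under these assumptions Ho is not rejected.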
Example 3
A sample survey of television programs in
Metro Manila shows that 80 out of 200 men and 75
out of 250 women dislike the program "May Bukas Pa".
One would like to know whether the difference
between the two sample proportions, 80/200 = 0.40
and 75/250 = 0.30, is significant at the 0.05
level.
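A sketch of Example 3 using the two-proportion Z formula, assuming a two-tailed test with the critical value 1.96 at the 0.05 level. `z_two_proportions` is a hypothetical name.

```python
from math import sqrt

def z_two_proportions(p1, p2, n1, n2):
    """Z = (p1 - p2) / sqrt(p1*q1/n1 + p2*q2/n2), with q = 1 - p."""
    q1, q2 = 1 - p1, 1 - p2
    return (p1 - p2) / sqrt(p1 * q1 / n1 + p2 * q2 / n2)

z = z_two_proportions(p1=80 / 200, p2=75 / 250, n1=200, n2=250)
z_crit = 1.96                     # two-tailed, alpha = 0.05
reject_h0 = abs(z) >= z_crit

print(f"z = {z:.3f}, reject Ho: {reject_h0}")
```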
Example 4
A researcher knows that the average height of
Filipino women is 1.525 meters. A random sample
of 26 women was taken and was found to have a
mean height of 1.56 meters, with a standard
deviation of 0.10 meters. Is there reason to believe
that the 26 women are significantly taller than the
rest using the 0.05 level of significance?
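A sketch of Example 4 with the one-sample t formula given earlier, which uses sqrt(n - 1). It assumes a right-tailed test and a critical value of 1.708 for df = 25 at the 0.05 level, taken from a standard Student's t table rather than from this document. `t_one_sample` is a hypothetical name.

```python
from math import sqrt

def t_one_sample(x_bar, mu, s, n):
    """t = (x_bar - mu) * sqrt(n - 1) / s, the form used in this document."""
    return (x_bar - mu) * sqrt(n - 1) / s

n = 26
t = t_one_sample(x_bar=1.56, mu=1.525, s=0.10, n=n)
df = n - 1
t_crit = 1.708                    # assumed: one-tailed, alpha = 0.05, df = 25
reject_h0 = abs(t) >= t_crit

print(f"t = {t:.2f}, df = {df}, reject Ho: {reject_h0}")
```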
Example 5
Beta company is manufacturing steel wire
with an average tensile strength of 50 kilos. The
laboratory tests 16 pieces and finds that the mean is
47 kilos with a standard deviation of 15 kilos. Are
the results in accordance with the hypothesis that
the population mean is 50 kilos?
Example 6
It is known from the records of the city
schools that the standard deviation of math test
scores on ABC test is 5. A sample of 200 students
from the system was taken and it was found out that
the sample mean is 75. Previous tests showed the
population mean to be 70. Is it safe to conclude that
the sample is significantly different from the
population at 0.01 level?
Example 7
Two rice varieties are being considered for
yield and a comparison is needed. Thirty hectares were
planted with the rice varieties exposed to fairly uniform
conditions. The results are tabulated below:

                   Variety A           Variety B
Average yield      80 sacks/hectare    85 sacks/hectare
Sample variance    5.90                12.10

Is there a significant difference in the yield of the two
varieties at the 0.05 level of significance?
Example 8
A manufacturer of flashlight batteries claims
that the average life of his product will exceed 40
hours. A company is willing to buy a very large
shipment of batteries provided the claim is true. A
random sample of 36 batteries is tested, and it was
found out that the sample mean is 45 hours. If the
population of batteries has a standard deviation of 5
hours, is it likely that the batteries will be bought?
Example 9
A company is trying to decide which of two brands of
tires to buy for its trucks. It would like to adopt Brand
C unless there is some evidence that Brand D is better. An
experiment was conducted in which 16 tires of each brand were
used. The tires were run under uniform conditions until they
wore out. The results are:
Brand C: $\bar{X}_1$ = 40,000 km, $s_1$ = 5,400 km
Brand D: $\bar{X}_2$ = 38,000 km, $s_2$ = 3,200 km
What conclusion can be drawn?
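The two-sample t formula can be sketched for the Example 9 figures. The example states no significance level, so the sketch stops at the computed t and the degrees of freedom; `t_two_samples` is a hypothetical name.

```python
from math import sqrt

def t_two_samples(x1, x2, s1, s2, n1, n2):
    """Two-sample t using the pooled variance of the two samples."""
    pooled = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (x1 - x2) / sqrt(pooled * (1 / n1 + 1 / n2))

n1 = n2 = 16
t = t_two_samples(x1=40_000, x2=38_000, s1=5_400, s2=3_200, n1=n1, n2=n2)
df = n1 + n2 - 2

print(f"t = {t:.3f}, df = {df}")
```

The computed t would then be compared with the tabular t for df = 30 at whichever significance level is chosen.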
Example 10
All freshmen in a particular school were
found to have a variability in grades expressed as a
standard deviation of 3. Two samples among these
freshmen, made up of 20 and 50 students,
were found to have means of 88 and 85, respectively.
Based on their grades, is the first group really
brighter than the second group at the 0.01 level of
significance?
Analysis of Variance (F - Test)
-A test that was developed by Ronald A. Fisher
-A technique in inferential statistics designed to test
whether or not more than two samples (or groups)
are significantly different from each other
Analysis of Variance
Steps:
1. Compute the sums of squares
$$TSS = \sum x^2 - \frac{(\sum x)^2}{N}$$
$$SSB = \frac{1}{r}\sum_{j}\Big(\sum_{i} x_{ij}\Big)^2 - \frac{(\sum x)^2}{N}$$
$$SSW = TSS - SSB$$
2. Compute the degrees of freedom
dft = rk - 1 = N - 1
dfb = k - 1
dfw = dft - dfb
3. Compute the mean sums of squares
$$MSSB = \frac{SSB}{dfb}$$
$$MSSW = \frac{SSW}{dfw}$$
4. Compute the F ratio
$$F = \frac{MSSB}{MSSW}$$
Contingency Table for ANOVA

Sources of Variation    Sum of Squares    Degrees of Freedom (df)    Mean Sum of Squares    F Ratio
Between Columns         SSB               dfb                        MSSB                   F
Within Columns          SSW               dfw                        MSSW
Total                   TSS               dft
Exercise
1. The weights in kilograms of three groups of 5 members
each are shown in the table below. Is there unusual
variation among the groups? (Use α = 0.05.)

Members    Group
           A     B     C
1          50    60    53
2          48    40    55
3          55    50    40
4          50    60    40
5          46    52    47
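The ANOVA steps can be traced in Python for this exercise. The sketch assumes r is the number of members per group (rows) and k the number of groups (columns), so that N = rk; all variable names are illustrative.

```python
groups = {
    "A": [50, 48, 55, 50, 46],
    "B": [60, 40, 50, 60, 52],
    "C": [53, 55, 40, 40, 47],
}
r = 5                         # members per group (assumed meaning of r)
k = len(groups)               # number of groups (assumed meaning of k)
N = r * k

values = [x for g in groups.values() for x in g]
correction = sum(values) ** 2 / N                 # (sum of x)^2 / N

tss = sum(x * x for x in values) - correction
ssb = sum(sum(g) ** 2 for g in groups.values()) / r - correction
ssw = tss - ssb

dft, dfb = N - 1, k - 1
dfw = dft - dfb

mssb, mssw = ssb / dfb, ssw / dfw
f_ratio = mssb / mssw

print(f"TSS = {tss:.2f}, SSB = {ssb:.2f}, SSW = {ssw:.2f}")
print(f"dfb = {dfb}, dfw = {dfw}, F = {f_ratio:.3f}")
```

The computed F is then compared with the tabular F for (dfb, dfw) degrees of freedom at the stated level.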
Exercise
2. The following are the mileages obtained after several road tests were
run using 5 different kinds of gasoline on a Toyota car.
Is there a significant difference among the mileage yields at the 1% level?

Road Test    Type of Gasoline
             A     B     C     D     E
1st          35    61    38    65    56
2nd          31    63    54    60    69
3rd          42    50    47    57    70
4th          48    42    60    55    50
5th          40    49    55    60    48
Exercise
3. Below are the bowling scores of four groups of four
members each. At the 5% significance level, find out if there
is unusual variation among the groups.

Members    Group
           A      B      C      D
1          98     100    87     90
2          78     95     92     93
3          95     90     105    95
4          110    85     88     97
Chi-Square Test ($\chi^2$)
- Used to test significant difference or relationship
- Used if data are in frequencies (enumeration data)
USES:
1. to test the goodness of fit of a normal curve; that is to
find out whether or not a sample distribution conforms
with the hypothetical normal distribution
2. to find out whether or not an observed proportion is
equal to some given ideal or expected proportion
3. to test the independence of one variable from another
variable.
Formulas:
i. For a 2 x 2 table (with Yates' correction for continuity)
$$\chi^2 = \sum \frac{(|OF - EF| - 0.5)^2}{EF}$$
ii. For a non-2 x 2 table
$$\chi^2 = \sum \frac{(OF - EF)^2}{EF}$$
Exercise
1. Test the hypothesis that educational attainment does not
depend on socio-economic status for the following 100
persons in a particular community.

Socio-economic status    Educational Attainment
                         Finished College    Did Not Finish College
Poor                     18                  10
Middle Class             28                  25
Rich                     14                  5
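For a table larger than 2 x 2, the chi-square statistic can be sketched as below. Two things are assumptions not stated in the text: each expected frequency (EF) is taken as row total times column total divided by the grand total, and the degrees of freedom as (rows - 1)(columns - 1), both of which are the usual conventions for a test of independence.

```python
# Observed frequencies: rows = Poor, Middle Class, Rich;
# columns = Finished College, Did Not Finish College
observed = [[18, 10],
            [28, 25],
            [14, 5]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, of in enumerate(row):
        ef = row_totals[i] * col_totals[j] / n    # expected frequency (assumed rule)
        chi2 += (of - ef) ** 2 / ef

df = (len(observed) - 1) * (len(observed[0]) - 1)  # assumed df rule

print(f"chi-square = {chi2:.3f}, df = {df}")
```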
Exercise
2. At the 1% significance level, does college academic grade
depend on the high school NSAT results for the following
200 students?

Academic Grade    NSAT Rating
                  Low    Average    High
Above 85          13     25         21
75 - 85           18     31         38
Below 75          14     20         20
Exercise
3. At ABC Company, there are 28 males and 32
females. Out of the 28 males, 10 hold executive
posts and the others do clerical work. Of the 32
females, only 5 hold executive positions and the
others do clerical work. Prepare a contingency
table, then test the hypothesis that position is
independent of sex.
Exercise
4. To determine whether type of personality is related to
academic performance, a random sample of 180 high
school students from a certain college was taken, and the
data are as follows:

             Low Average    Average    High Average
Introvert    35             30         25
Extrovert    31             23         36

Is there a significant relationship between personality type
and academic performance?
Correlation and Regression Analysis
Regression Analysis
- concerned with the problem of estimation and
forecasting
FORMULA:
$$y = a + bx$$
Where:
y = predicted score
a = y-intercept
b = slope of the line
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$$
$$a = \bar{Y} - b\bar{X}$$
Where:
$\bar{Y}$ = mean of the y values
$\bar{X}$ = mean of the x values
Correlation Analysis
- concerned with the relationship between the changes in
the variables
Formula: Pearson Product-Moment Correlation (r)
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$
Range of values: $-1 \le r \le 1$
(+) r shows a direct or positive relationship
(-) r shows a negative or inverse relationship
r = 0 indicates no linear relationship
r = 1 perfect positive relationship
r = -1 perfect negative relationship
Interpretation:

Pearson r         Qualitative Description
±1                Perfect Correlation
±0.91 - ±0.99     Very High
±0.71 - ±0.90     High
±0.41 - ±0.70     Marked
±0.21 - ±0.40     Slight/Low
0 - ±0.20         Negligible
Testing the Significance of r
$$t = r\sqrt{\frac{n - 2}{1 - r^2}}$$
Exercise
1. It is generally known that the number of road accidents is inversely
proportional to road width. The following data show the result of
a study indicating the number of accidents occurring per hundred
thousand vehicles.
a. Draw a scatter diagram.
b. Find the equation of the least-squares regression line (LSRL).
c. Predict the accident frequency for a road whose width is 55 feet;
48 feet.
d. Find the degree of relationship between road width and
accident frequency.

Road width (in feet) (x)    75    52    60    33    22
Number of accidents (y)     40    84    55    92    90
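Parts b to d of this exercise can be checked with a short sketch that applies the formulas for b, a, r and the t test of r directly to the column sums. The variable names and the `predict` helper are illustrative only.

```python
from math import sqrt

x = [75, 52, 60, 33, 22]      # road width in feet
y = [40, 84, 55, 92, 90]      # accidents per hundred thousand vehicles
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)
syy = sum(yi * yi for yi in y)

b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)     # slope
a = sy / n - b * sx / n                           # intercept: Y-bar minus b times X-bar
r = (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
t = r * sqrt((n - 2) / (1 - r ** 2))              # significance test of r

def predict(width):
    """Predicted accident frequency from the fitted line y = a + b*x."""
    return a + b * width

print(f"y = {a:.2f} + ({b:.4f})x, r = {r:.3f}, t = {t:.2f}")
print(f"width 55 ft: {predict(55):.1f}, width 48 ft: {predict(48):.1f}")
```

The negative slope and the value of r near -0.9 agree with the stated inverse relationship between road width and accidents.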
Exercise
2. The following table shows the final grades of ten students
in Algebra and Statistics.
a. Draw a scatter diagram.
b. Find the equation of the LSRL.
c. Predict the grade in Statistics if the grade in
Algebra is 78; 82; 89; 95; 100.
d. Find the degree of relationship between grades in
Algebra and Statistics.

Algebra (x)       75    80    93    65    87    71
Statistics (y)    82    78    86    72    91    80
Pilar B. Acorda
Email Address : [email protected]
Mobile Number: 09359547319