Workbook


University of Eastern Philippines

University Town, Northern Samar


College of Engineering

WORK BOOK
(ENGINEERING DATA ANALYSIS)

Submitted by:
Reyes Lorwel C.
BSEE

Submitted to:
ENGR. FELIX S. LICAS
Professor
MODULE 1

"INTRODUCTION TO STATISTICS"

Problem No. 1

Direction: Refer to the previous questions regarding the heights of family members.
Instructions:
• Select 3 families in your barangay.
• Using a tape measure or a meter stick, measure the height of each member of the family. Use centimeters and round off to the nearest centimeter.
• Group the data so that each family forms one group. List all the raw data and present it in the best way you can.

Family No. 1
Respondent     Measurement
Mother         167 cm
Father         170 cm
Child no. 1    155 cm
Child no. 2    146 cm
Child no. 3    168 cm

Family No. 2
Respondent     Measurement
Mother         152 cm
Father         165 cm
Child no. 1    171 cm
Child no. 2    155 cm

Family No. 3
Respondent     Measurement
Mother         155 cm
Father         158 cm
Child no. 1    145 cm
Child no. 2    163 cm
Questions:

1. What do these numbers represent?
• The heights of the members of the three families that were measured.
2. How can you get clear and precise information from the numbers?
• I got clear and precise measurements of each family member by conducting a survey and personally measuring their heights.
3. Are the numbers meaningful for everyone? Why?
• Yes, because the respondents learn their own height and the heights of the others. Height is also a requirement for some jobs, so it is useful to know your own measurements.

Problem No. 2
Instruction: survey from your street or specific area how many members are there in their
family.
No. Family’s Name No. of Family Members
1 Tan 5
2 Cu 8
3 Siervo 7
4 Morillo 3
5 Fernandez 6
6 Sosing 5
7 Magluyoan 4
8 Gumarao 4
9 Cabili 2
10 Mapao 4

Questions:
1. What do these numbers represent?
• These numbers represent the number of members in each family in a specific area of Victoria.
2. Is this information precise?
• Yes. The number of family members is usually small and is always a natural number, so it can be counted exactly.

Problem No.3.

Instruction:

 Select 20 person as your respondent.


 Ask them their recent weight.
 Present your data in the best way possible.

Name No. of family member


Vincent Muncada 4
Cynric Flores 5
Rupert Neil Espina 4
Rojie Adesas 4
Kyle Kristian Magdaraog 4
CJ Calixtro 7
Clifford Adlawan 4
Alexiz Escarilla 5
Cyril Robiato 4
Riza Munez 5
Yanee Verano 5
Shannah Muncada 6
Kristel Muncada 4
Bea Rose Segun 5
Thailee Martires 6
Jia Marquez 4
Maria Julia Calonge 5
Sweet Joy Rubenecia 4
April Olchondra 4
Lizea Diaz 6
Kent Gordo 3
Questions:

 Who has the highest number of family members?


-CJ Calixtro has the highest number of family members.
 Who has the smallest number of family member?
-Kent Gordo has the smallest number of family member.
Kean Loyogoy has the highest number of family members.
 What do these data represent?
-These data represent the number of family members in which each respondent belongs
to.
 How can you get the precise and exact information?
-To get the precise and exact information, using the appropriate measuring tool is
necessary. The measuring tool used is a digital scale.

Problem No. 4

Instruction:

 Select 20 person as your respondent.


 Ask them on how many kdrama have they watched.
 Present your data in the best way possible.

Name No. of Anime


Vincent Muncada 26
Cynric Flores 15
Rupert Neil Espina 16
Rojie Adesas 5
Kyle Kristian Magdaraog 10
CJ Calixtro 2
Clifford Adlawan 20
Alexiz Escarilla 16
Cyril Robiato 21
Riza Munez 3
Yanee Verano 1
Shannah Muncada 5
Kristel Muncada 10
Bea Rose Segun 12
Thailee Martires 9
Jia Marquez 3
Maria Julia Calonge 7
Sweet Joy Rubenecia 15
April Olchondra 16
Lizea Diaz 21
Kent Gordo 3
Questions:

 Who watches anime the most?


Vincent Muncada watches anime the most.
 Who watches anime least?
The one who watches anime the least is Yanee Verano.
 What do these data represent?
These data represent the no. of anime the respondents had watched.
 How can you get the precise and exact information?
To get the precise and exact information, you simply count the anime you had watched.

Problem No. 5

Direction: Conduct a survey of people aged 18-22 in your barangay on how many hours they spend playing online games.

1. Who spends the most time playing online games?
2. What is the average number of hours spent playing online games in the data?
3. How did you get the data?
Name: Hours of playing o-games:


Alfred 4hours
Deo 2hours
Cristian 6hours
Diane 1hour
Ray 6hours
Todd 10hours
Vince 4hours
Jude 5hours
Angela 3hours
Ronald 3hours
1. Who spends the most time playing online games?
• Todd spends more time playing online games than the rest, at 10 hours.
2. What is the average number of hours spent playing online games in the data?
• The average is 4.4 hours (44 hours in total divided by 10 respondents).
3. How did you get the data?
• I got the data by surveying people I know who play online games and asking them questions directly related to the topic.
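
The average in item 2 can be checked with a short script. This is a minimal Python sketch that re-enters the hours from the table above; the variable names (`hours`, `average`, `top_player`) are illustrative and not part of the workbook.

```python
# Hours spent playing online games, re-entered from the survey table.
hours = {
    "Alfred": 4, "Deo": 2, "Cristian": 6, "Diane": 1, "Ray": 6,
    "Todd": 10, "Vince": 4, "Jude": 5, "Angela": 3, "Ronald": 3,
}

# Arithmetic mean: total hours divided by the number of respondents.
average = sum(hours.values()) / len(hours)

# Respondent with the largest number of hours.
top_player = max(hours, key=hours.get)

print(f"Average hours: {average}")
print(f"Most hours: {top_player} ({hours[top_player]} hours)")
```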
MODULE 2

"DATA COLLECTION"

LESSON 1: METHODS OF COLLECTING DATA


Problem No. 1
Instruction: Using the interview method, interview at least one student from the University of Eastern Philippines about how he/she budgets his/her one-week allowance.
I interviewed one of the students from the University of Eastern Philippines and asked how much his allowance is and how he spends that money in one week. During the interview, I wrote down his answers.

One week budget:


Allowance: P1,000.00
Transportation: P174.00 (round trip)
Food: P500.00
School-related miscellaneous: P100.00
Cell phone load: P70.00
Others (such as soap, toothpaste, shampoo, etc.): P50.00
Savings: P106.00

Problem No. 2
Instruction: Make a survey from 15 students if they are learning from online class/modular-
type learning.
Yes No
1 /
2 /
3 /
4 /
5 /
6 /
7 /
8 /
9 /
10 /
11 /
12 /
13 /
14 /
15 /
Total: 2 total: 13

• Based on the table presented, most of the students (13 out of 15) are not learning from this blended type of learning.

Problem No. 3.

What is the best method to collect data about the voter’s information?

Answer: Registration method.

Problem No. 4.

What is the kind of method used in the scenario: Consider someone on the busy street of a
New York neighborhood asking random people that pass by how many pets they have, then
taking this data and using it to decide if there should be more pet food stores in that area.

Answer: Observation Method

Problem No 5.

A company wants to have the feedbacks of their clients to be gathered in order for their
services to be improved. What is the best way to gather data?

Answer: Questionnaire Method

LESSON 2. POPULATION AND SAMPLE


Problem No. 1
Instruction: Make a grouped frequency table of the students' scores in a summative test in ESP. Use a class interval of 5.

12 21 30 19 12 20 24 16 15 20
9 27 23 20 8 28 21 25 7 29
16 26 22 22 17 30 17 5 19 18
Score tally Frequency
1-5 / 1
6-10 /// 3
11-15 /// 3
16-20 /////-///// 10
21-25 /////-// 7
26-30 /////-/ 6

Question:
1. Based on the grouped frequency data, which score interval has the highest frequency?
• The interval 16 to 20 has the highest frequency, with 10 scores.
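
The tally above can be reproduced programmatically. The following is a small Python sketch, assuming classes of width 5 that start at 1 (1-5, 6-10, and so on) as in the table; the helper name `class_limits` is hypothetical.

```python
from collections import Counter

# Scores from the summative test, re-entered row by row.
scores = [
    12, 21, 30, 19, 12, 20, 24, 16, 15, 20,
    9, 27, 23, 20, 8, 28, 21, 25, 7, 29,
    16, 26, 22, 22, 17, 30, 17, 5, 19, 18,
]

WIDTH = 5  # class interval used in the workbook


def class_limits(score, width=WIDTH):
    """Return the (lower, upper) limits of the class containing score."""
    lower = ((score - 1) // width) * width + 1
    return lower, lower + width - 1


# Count how many scores fall in each class.
table = Counter(class_limits(s) for s in scores)

for lower, upper in sorted(table):
    print(f"{lower}-{upper}: {table[(lower, upper)]}")
```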

Problem No. 2

Identify the population and sample in this setting: A factory overseer selects 40 threaded rods at random from those produced that week at the factory, then she tests their tensile strength.

Answer: The population is the threaded rods produced at the factory that week; the sample is the 40 threaded rods selected.

Problem No 3

Identify the population and sample in this setting: A researcher conducted an experiment on
a randomly selected group of 50 positive Covid patients.

Answer: The population is all the positive Covid patients; the sample is the 50 patients
selected.

Problem No. 4

Instruction: Construct a pie chart for the following. Show computations of per cent and angle
distribution using the given data on preferred strand of Grade 10 when going to Senior High.
Strand frequency
STEM 14
HUMSS 13
ABM 10
GAS 8
Others 5

[Pie chart of preferred strands: STEM, HUMSS, ABM, GAS, Others]

STEM: 14/50 (100) = 28%, 0.28 (360) = 100.8°
HUMSS: 13/50 (100) = 26%, 0.26 (360) = 93.6°
ABM: 10/50 (100) = 20%, 0.20 (360) = 72°
GAS: 8/50 (100) = 16%, 0.16 (360) = 57.6°
Others: 5/50 (100) = 10%, 0.10 (360) = 36°
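
The percent and angle of each slice follow the same two steps, so they can be computed in a loop. This is a minimal Python sketch using the strand frequencies from the table; the dictionary names are illustrative.

```python
# Preferred strand frequencies from the table.
strands = {"STEM": 14, "HUMSS": 13, "ABM": 10, "GAS": 8, "Others": 5}

total = sum(strands.values())

# For each strand: percent share and central angle of its pie slice.
shares = {
    name: (100 * f / total, 360 * f / total)
    for name, f in strands.items()
}

for name, (percent, angle) in shares.items():
    print(f"{name}: {percent:.0f}%, {angle:.1f} degrees")
```

The angles should add up to 360 degrees, which is a quick way to check the computations.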

Problem No. 5
Instruction: Make a survey from 15 students if they are learning from online class/modular-
type learning. Present it with bar graph
Yes No
1 /
2 /
3 /
4 /
5 /
6 /
7 /
8 /
9 /
10 /
11 /
12 /
13 /
14 /
15 /
Total: 2 total: 13

[Bar graph: number of students per response (No: 13, Yes: 2)]

LESSON 3: TYPE OF DATA

Problem No. 1

Between gender and height, which is an example of a qualitative data?

Answer: Gender is an example of qualitative data because it cannot be expressed in numbers.

Problem No. 2

Between weight and religion, which is an example of a quantitative data?

Answer: Weight can be expressed in numbers and it represents an amount, so it is an example of quantitative data.

Problem No. 3
The painting is 14 inches wide and 12 inches long. What type of data?
 Quantitative data

Problem No. 4
Notes from classroom observations. What type of data?
 Qualitative data
Problem No. 5
Feedback from a teacher about a student's progress. What type of data?
 Qualitative data

LESSON 4: PRESENTATION OF DATA

Problem No.1
Construct a pie chart for the following. Show computations of percent using the given data
on a family budget.
Item        Budget    Percent
Food        9000      30
Rent        7500      25
Kids        6000      20
Leisure     1500      5
Savings     3500      12
Gasoline    2500      8
Total       30000     100%

• The pie chart shows that food has the largest share of the budget and leisure has the smallest.

Problem No. 2
Make a grouped frequency table of the ages of the participants in the “Laugh out loud” event for our city. Use a class interval of 3.

16 37 20 21 45 20 22 19 18
43 37 34 35 21 19 38 24 18
31 29 32 27 18 22 23 21 19
37
Ages of the 28 Participants in the “Laugh out loud” Event
Age      Tally        Frequency
15-17    I            1
18-20    IIIII-III    8
21-23    IIIII-I      6
24-26    I            1
27-29    II           2
30-32    II           2
33-35    II           2
36-38    IIII         4
39-41                 0
42-44    I            1
45-47    I            1
N=28
• The table shows that the 18-20 age group had the most participants in the “Laugh out loud” event (8), while the 39-41 age group had no participants.

Problem No. 3
The final grades in Engineering Data Analysis of 80 students at UEP are recorded in
the accompanying table. Construct a relative frequency distribution table.

68 84 75 82 68 90 62 88 76 93
73 79 88 73 60 93 71 59 85 75
61 65 75 87 74 62 95 78 63 72
66 78 82 75 94 77 69 74 68 60
96 78 89 61 75 95 60 79 83 71
79 62 67 97 78 85 76 65 71 75
65 80 73 57 88 78 62 76 53 74
86 67 73 81 72 63 76 75 85 77

Class       Class          Midpoint   Frequency   Cumulative   Cumulative
Interval    Boundary                              Frequency    Percentage
50 – 54     49.5 – 54.5    52         1           1            1.25
55 – 59     54.5 – 59.5    57         2           3            3.75
60 – 64     59.5 – 64.5    62         11          14           17.50
65 – 69     64.5 – 69.5    67         10          24           30.00
70 – 74     69.5 – 74.5    72         12          36           45.00
75 – 79     74.5 – 79.5    77         21          57           71.25
80 – 84     79.5 – 84.5    82         6           63           78.75
85 – 89     84.5 – 89.5    87         9           72           90.00
90 – 94     89.5 – 94.5    92         4           76           95.00
95 – 99     94.5 – 99.5    97         4           80           100.00

PROBLEM No. 4

Given data is a sample of the accounts receivable of a small merchandising firm.


Construct relative frequency distribution table.

29 79 75 66 63 58 50 85 81 72
54 42 80 74 68 67 59 48 86 80
60 56 44 78 72 69 64 60 52 88
67 61 55 47 82 71 66 64 62 90
73 65 62 53 46 83 76 70 68 92

Class       Class          Midpoint   Frequency   Cumulative   Cumulative
Interval    Boundaries                            Frequency    Percentage
35 – 39     34.5 – 39.5    37         1           1            2
40 – 44     39.5 – 44.5    42         2           3            6
45 – 49     44.5 – 49.5    47         3           6            12
50 – 54     49.5 – 54.5    52         4           10           20
55 – 59     54.5 – 59.5    57         4           14           28
60 – 64     59.5 – 64.5    62         8           22           44
65 – 69     64.5 – 69.5    67         8           30           60
70 – 74     69.5 – 74.5    72         6           36           72
75 – 79     74.5 – 79.5    77         4           40           80
80 – 84     79.5 – 84.5    82         5           45           90
85 – 89     84.5 – 89.5    87         3           48           96
90 – 94     89.5 – 94.5    92         2           50           100
Problem No.5
Forbes magazine published data on the best small firms in 2012. These were firms
which have been publicly traded for at least a year, have a stock price of at least $5 per
share, and have reported annual revenue between $5 million and$1 billion. Complete
the frequency distribution of ages of the chief executive officers for the first 60 ranked
firms.
Age Frequency

40-44 3
45-49 11
50-54 13
55-59 16
60-64 10
65-69 6
70-74 1
Answer:
Age      Frequency   Midpoint   Relative frequency   Cumulative relative frequency
40-44    3           42         0.05                 0.05
45-49    11          47         0.18                 0.23
50-54    13          52         0.22                 0.45
55-59    16          57         0.27                 0.72
60-64    10          62         0.17                 0.88
65-69    6           67         0.10                 0.98
70-74    1           72         0.02                 1.00

Questions:
a. What is the frequency for CEO ages between 54 and 65?
• The frequency for CEO ages between 54 and 65 is 16 + 10 = 26 (the classes 55-59 and 60-64).

b. What is the relative frequency of ages under 50?
• The relative frequency of ages under 50 is (3 + 11)/60 = 0.23.

c. What is the cumulative relative frequency for CEOs younger than 55?
• The cumulative relative frequency for CEOs younger than 55 is (3 + 11 + 13)/60 = 0.45.
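
The relative and cumulative relative frequencies can be generated from the frequency column alone. The sketch below is a minimal Python illustration using the CEO age frequencies given in the problem; the list names are illustrative. Cumulative values are computed from running counts, which avoids adding up already rounded relative frequencies.

```python
# CEO age classes and frequencies from the problem statement.
ages = ["40-44", "45-49", "50-54", "55-59", "60-64", "65-69", "70-74"]
freq = [3, 11, 13, 16, 10, 6, 1]

total = sum(freq)

# Relative frequency: class frequency divided by the total.
relative = [f / total for f in freq]

# Cumulative relative frequency: running count divided by the total.
cumulative = []
running = 0
for f in freq:
    running += f
    cumulative.append(running / total)

for label, f, r, c in zip(ages, freq, relative, cumulative):
    print(f"{label}: f={f}, relative={r:.2f}, cumulative={c:.2f}")
```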

MODULE 3

"MEASURES OF CENTRAL TENDENCY"


LESSON 1: MEASURES OF CENTRAL TENDENCY FOR UNGROUPED DATA

Problem No. 1

Given the following set of ungrouped measurements: 3, 5, 6, 6, 7, and 9. Determine the mean, median, and mode.

3 5 6 6 7 9

• For the mean:

\[ \bar{X} = \frac{\sum x}{n} = \frac{3+5+6+6+7+9}{6} = \frac{36}{6} = 6 \]

The mean of the ungrouped measurements is 6.

• For the median:

There are two middle values, so we take the average of the two.

\[ Md = \frac{6+6}{2} = 6 \]

Thus, the median is 6.

• For the mode:

The measurement 6 appears twice, more often than any other value. Therefore, the mode is 6.

Problem No.2
The observations below are the body temperatures in degrees Celsius of five patients who have fever in Ward B of Hospital E. Find the mean body temperature of the patients.
Patient      Temperature (°C)
Dianne       40.5
Lily         38.9
Antonio      41
Catherine    38.8
Will         39.6

Answer:

\[ \bar{X} = \frac{\sum X}{n} = \frac{198.8}{5} = 39.76\ ^{\circ}\mathrm{C} \]

Problem No.3
Find the mean of the following scores:
22 25 22 20 23 24 23 21 20 20
22 22 20 28 29 30
Answer:

\[ \bar{X} = \frac{\sum X}{n} = \frac{22+25+22+20+23+24+23+21+20+20+22+22+20+28+29+30}{16} = \frac{371}{16} = 23.188 \]

• The mean of the scores is 23.188.

Problem No 4

The intelligence quotients of 10 boys are recorded. Find the mean, median and mode.

70 83 88 88 98 100 101 105 110 120

• For the mean:

\[ \bar{X} = \frac{\sum x}{n} = \frac{70+83+88+88+98+100+101+105+110+120}{10} = \frac{963}{10} = 96.3 \]

The mean of the intelligence quotients of the 10 boys is 96.3.

• For the median:

\[ Md = \frac{98+100}{2} = 99 \]

There are 10 observations, so the median lies between the fifth and sixth items in ascending order, which gives 99.

• For the mode:

The value 88 appears two times, more often than any other value. Thus, the mode is 88.

Problem No. 5

Alex timed 21 people in the sprint race, to the nearest second: 59, 65, 61, 62, 53, 55,
60, 70, 64, 56, 58, 58, 62, 62, 68, 65, 56, 59, 68, 61, and 67. Find the mean, median and
mode.

53, 55, 56, 56, 58, 58, 59, 59, 60, 61, 61, 62, 62, 62, 64, 65, 65, 67, 68, 68, 70

• For the mean:

\[ \bar{X} = \frac{\sum x}{n} = \frac{53+55+56+56+58+58+59+59+60+61+61+62+62+62+64+65+65+67+68+68+70}{21} = \frac{1289}{21} = 61.38 \]

The mean of the sprint times is 61.38 seconds.

• For the median:

There are 21 observations, so the median is the 11th item in ascending order, which is 61.

• For the mode:

The value 62 appears three times, more often than any other value. Thus, the mode is 62.
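
The three measures for ungrouped data can be checked with Python's standard `statistics` module. This is a minimal sketch using the 21 sprint times from Problem No. 5; the variable names are illustrative.

```python
import statistics

# Sprint times in seconds, as listed in the problem (unsorted).
times = [59, 65, 61, 62, 53, 55, 60, 70, 64, 56, 58,
         58, 62, 62, 68, 65, 56, 59, 68, 61, 67]

mean = statistics.mean(times)      # sum of the values divided by n
median = statistics.median(times)  # middle item of the sorted values
mode = statistics.mode(times)      # most frequent value

print(f"n = {len(times)}, sum = {sum(times)}")
print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```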
LESSON 2: MEASURES OF CENTRAL TENDENCY FOR GROUPED DATA

Problem No. 1

Alex timed 21 people in the sprint race, to the nearest second: 59, 65, 61, 62, 53, 55, 60,
70, 64, 56, 58, 58, 62, 62, 68, 65, 56, 59, 68, 61, and 67. Construct a grouped frequency table.
Find the mean, median and mode.

Seconds    X     f     fX      Class Boundaries    <CF
51-55      53    2     106     50.5 – 55.5         2
56-60      58    7     406     55.5 – 60.5         9
61-65      63    8     504     60.5 – 65.5         17
66-70      68    4     272     65.5 – 70.5         21
i=5        Total 21    1288

• For the mean:

\[ \bar{X} = \frac{\sum fX}{N} = \frac{1288}{21} = 61.33 \]

The mean is 61.33 seconds.

• For the median:

\[ Md = L_{Md} + \left( \frac{\frac{N}{2} - Cf_b}{f_{Md}} \right) i = 60.5 + \left( \frac{\frac{21}{2} - 9}{8} \right) 5 = 60.5 + 0.9375 = 61.4375 \text{ seconds} \]

• For the mode:

\[ Mo = L_{Mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i \]

\[ \Delta_1 = 8 - 7 = 1, \qquad \Delta_2 = 8 - 4 = 4 \]

\[ Mo = 60.5 + \left( \frac{1}{1+4} \right) 5 = 60.5 + 1 = 61.5 \text{ seconds} \]
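
The grouped-data formulas for the mean, median, and mode can be sketched as one function. This is an illustrative Python sketch, not a definitive implementation: the function name `grouped_stats` is hypothetical, class limits are assumed to be whole numbers (so the lower boundary is the lower limit minus 0.5), and classes must be listed in ascending order.

```python
def grouped_stats(classes, width):
    """Mean, median and mode of grouped data.

    classes: list of (lower_limit, upper_limit, frequency), ascending.
    width:   class size i.
    """
    n = sum(f for _, _, f in classes)

    # Mean: sum of f * midpoint, divided by N.
    mean = sum(((lo + hi) / 2) * f for lo, hi, f in classes) / n

    # Median: first class whose cumulative frequency reaches N/2.
    cum = 0
    for lo, hi, f in classes:
        if cum + f >= n / 2:
            median = (lo - 0.5) + ((n / 2 - cum) / f) * width
            break
        cum += f

    # Mode: class with the highest frequency; a missing neighbour counts as 0.
    k = max(range(len(classes)), key=lambda j: classes[j][2])
    f_mode = classes[k][2]
    f_prev = classes[k - 1][2] if k > 0 else 0
    f_next = classes[k + 1][2] if k < len(classes) - 1 else 0
    d1, d2 = f_mode - f_prev, f_mode - f_next
    mode = (classes[k][0] - 0.5) + (d1 / (d1 + d2)) * width

    return mean, median, mode


# Sprint-time classes from Problem No. 1.
sprint = [(51, 55, 2), (56, 60, 7), (61, 65, 8), (66, 70, 4)]
mean, median, mode = grouped_stats(sprint, width=5)
print(f"mean = {mean:.2f}, median = {median:.4f}, mode = {mode:.1f}")
```

The sketch does not handle ties for the modal class or a distribution in which every class has the same frequency.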

Problem No. 2

A farmer grew fifty baby carrots using special soil. He dig them up and measure their
lengths (to the nearest mm) and group the results:

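Length (mm)    X      f     fX      Class Boundaries    <Cf
150 – 154      152    5     760     149.5 – 154.5       5
155 – 159      157    2     314     154.5 – 159.5       7
160 – 164      162    6     972     159.5 – 164.5       13
165 – 169      167    8     1336    164.5 – 169.5       21
170 – 174      172    9     1548    169.5 – 174.5       30
175 – 179      177    11    1947    174.5 – 179.5       41
180 – 184      182    6     1092    179.5 – 184.5       47
185 – 189      187    3     561     184.5 – 189.5       50
i=5            Total  50    8530

• For the mean:

\[ \bar{X} = \frac{\sum fX}{N} = \frac{8530}{50} = 170.6 \]

The mean is 170.6 mm.

• For the median:

\[ Md = L_{Md} + \left( \frac{\frac{N}{2} - Cf_b}{f_{Md}} \right) i = 169.5 + \left( \frac{\frac{50}{2} - 21}{9} \right) 5 = 169.5 + 2.2222 = 171.7 \text{ mm} \]

• For the mode:

The modal class is 175 – 179, which has the highest frequency (11), so the lower boundary is 174.5.

\[ Mo = L_{Mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i \]

\[ \Delta_1 = 11 - 9 = 2, \qquad \Delta_2 = 11 - 6 = 5 \]

\[ Mo = 174.5 + \left( \frac{2}{2+5} \right) 5 = 174.5 + 1.43 = 175.93 \text{ mm} \]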

Problem No. 3

The ages of the 112 people who live on a tropical island are grouped below.
Determine the mean, median and mode.

Age        x     f      fx      Class Boundaries    <CF
0 – 9      5     20     100     0 – 10              20
10 – 19    15    21     315     10 – 20             41
20 – 29    25    23     575     20 – 30             64
30 – 39    35    16     560     30 – 40             80
40 – 49    45    11     495     40 – 50             91
50 – 59    55    10     550     50 – 60             101
60 – 69    65    7      455     60 – 70             108
70 – 79    75    3      225     70 – 80             111
80 – 89    85    1      85      80 – 90             112
i = 10     Total 112    3360

• For the mean:

\[ \bar{X} = \frac{\sum fx}{N} = \frac{3360}{112} = 30 \]

The mean is 30.

• For the median:

\[ Md = L_{Md} + \left( \frac{\frac{N}{2} - Cf_b}{f_{Md}} \right) i = 20 + \left( \frac{\frac{112}{2} - 41}{23} \right) 10 = 20 + 6.52 = 26.52 \]

• For the mode:

\[ Mo = L_{Mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i \]

\[ \Delta_1 = 23 - 21 = 2, \qquad \Delta_2 = 23 - 16 = 7 \]

\[ Mo = 20 + \left( \frac{2}{2+7} \right) 10 = 20 + 2.22 = 22.22 \]

Problem No. 4
What is the value of mean, median and mode for the data in the following frequency
distribution below?

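Class Limits    f     x      f(x)    Class Boundaries    <Cumulative Frequency
9-10            5     9.5    47.5    8.5 – 10.5          37
7-8             10    7.5    75      6.5 – 8.5           32
5-6             15    5.5    82.5    4.5 – 6.5           22
3-4             5     3.5    17.5    2.5 – 4.5           7
1-2             2     1.5    3       0.5 – 2.5           2

N = 37, ∑ f(x) = 225.5, i = 2

• For the mean:

\[ \bar{X} = \frac{\sum f(x)}{N} = \frac{225.5}{37} = 6.09 \]

Thus, the mean is 6.09.

• For the median:

\[ Md = L_{Md} + \left( \frac{\frac{N}{2} - Cf_b}{f_{Md}} \right) i = 4.5 + \left( \frac{\frac{37}{2} - 7}{15} \right) 2 = 4.5 + \left( \frac{11.5}{15} \right) 2 = 4.5 + 1.53 = 6.03 \]

• For the mode:

\[ Mo = L_{Mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i \]

\[ \Delta_1 = 15 - 5 = 10, \qquad \Delta_2 = 15 - 10 = 5 \]

\[ Mo = 4.5 + \left( \frac{10}{10+5} \right) 2 = 4.5 + 1.33 = 5.83 \]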

Problem No. 5
Determine the mean, median and mode for the grouped data given below.

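Class      f     X       f(x)      Class Boundaries    <CF
73 – 78    22    75.5    1661      72.5 – 78.5         100
67 – 72    24    69.5    1668      66.5 – 72.5         78
61 – 66    33    63.5    2095.5    60.5 – 66.5         54
55 – 60    15    57.5    862.5     54.5 – 60.5         21
49 – 54    6     51.5    309       48.5 – 54.5         6

N = 100, ∑ f(x) = 6596, i = 6

• For the mean:

\[ \bar{X} = \frac{\sum f(x)}{N} = \frac{6596}{100} = 65.96 \]

The mean is 65.96.

• For the median:

\[ Md = L_{Md} + \left( \frac{\frac{N}{2} - Cf_b}{f_{Md}} \right) i = 60.5 + \left( \frac{\frac{100}{2} - 21}{33} \right) 6 = 60.5 + \left( \frac{29}{33} \right) 6 = 65.77 \]

• For the mode:

\[ Mo = L_{Mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i \]

\[ \Delta_1 = 33 - 15 = 18, \qquad \Delta_2 = 33 - 24 = 9 \]

\[ Mo = 60.5 + \left( \frac{18}{18+9} \right) 6 = 60.5 + \left( \frac{2}{3} \right) 6 = 64.5 \]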
LESSON 3: FRACTILES FOR UNGROUPED DATA

Problem No. 1

The following data lists the number of calories in 30 manufacturers of vanilla flavored
ice cream bars. Solve for the 2nd quartile, 6th decile and 80th percentile.

111 132 151 182 197 209 255 295 337 377
126 147 179 185 200 234 286 310 353 377
131 151 180 190 201 255 294 319 365 439

• For the 2nd quartile:

\[ Q_2 = \frac{2N}{4} = \frac{2(30)}{4} = 15\text{th item, which is } 201 \]

• For the 6th decile:

\[ D_6 = \frac{6N}{10} = \frac{6(30)}{10} = 18\text{th item, which is } 255 \]

• For the 80th percentile:

\[ P_{80} = \frac{80N}{100} = \frac{80(30)}{100} = 24\text{th item, which is } 319 \]

Problem No. 2

Calculate the Q3, D7 and P28 for the following test scores of 10 students.

10 22 24 27 32 36 40 41 50 90

• For the 3rd quartile:

\[ Q_3 = \frac{3N}{4} = \frac{3(10)}{4} = 7.5\text{th item, which is } 41 \]

• For the 7th decile:

\[ D_7 = \frac{7N}{10} = \frac{7(10)}{10} = 7\text{th item, which is } 40 \]

• For the 30th percentile:

\[ P_{30} = \frac{30N}{100} = \frac{30(10)}{100} = 3\text{rd item, which is } 24 \]

Problem No. 3

Determine the 2nd quartile, 8th decile, and 25th percentile of the data given below.

22 30 36 41 53
23 33 36 42 54
24 33 37 49 54
28 35 38 53 56
• For the 2nd quartile:

\[ Q_2 = \frac{2N}{4} = \frac{2(20)}{4} = 10\text{th item, which is } 36 \]

• For the 8th decile:

\[ D_8 = \frac{8N}{10} = \frac{8(20)}{10} = 16\text{th item, which is } 53 \]

• For the 25th percentile:

\[ P_{25} = \frac{25N}{100} = \frac{25(20)}{100} = 5\text{th item, which is } 30 \]

Problem No.4

Calculate Quartile-2, Deciles-6, and Percentiles-45 from the following data


85, 96, 76, 108, 85, 80, 100, 85, 70, 95.

Arranging Observations in the ascending order, we get:


70, 76, 80, 85, 85, 85, 95, 96, 100, and 108

• For the 2nd quartile:

\[ Q_2 = \frac{2N}{4} = \frac{2(10)}{4} = 5\text{th item, which is } 85 \]

• For the 6th decile:

\[ D_6 = \frac{6N}{10} = \frac{6(10)}{10} = 6\text{th item, which is } 85 \]

• For the 45th percentile:

\[ P_{45} = \frac{45N}{100} = \frac{45(10)}{100} = 4.5\text{th item, which is } 85 \]

Problem No. 5

Calculate the Q1, D5 and P70 for the following IQ scores.

87 90 95 96 97 98 98 99
100 100 100 100 100 101 101 102
102 102 103 104 105 107 110

• For the 1st quartile:

\[ Q_1 = \frac{N}{4} = \frac{23}{4} = 5.75\text{th item, which is } 98 \]

• For the 5th decile:

\[ D_5 = \frac{5N}{10} = \frac{5(23)}{10} = 11.5\text{th item, which is } 100 \]

• For the 70th percentile:

\[ P_{70} = \frac{70N}{100} = \frac{70(23)}{100} = 16.1\text{th item, which is } 102 \]
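
The positional rule used in this lesson can be sketched in a few lines. This is a minimal Python illustration under one stated assumption: when kN/m is not a whole number, the position is rounded up to the next item, which is the convention the answers in this lesson appear to follow (other textbooks interpolate instead). The function name `fractile` is hypothetical.

```python
def fractile(values, k, parts):
    """Return the k-th fractile out of `parts` (4, 10 or 100).

    The item position is k*N/parts, rounded up to the next whole
    item when it is fractional (an assumed convention).
    """
    data = sorted(values)
    # Ceiling of k*N/parts using integer arithmetic only.
    position = -(-k * len(data) // parts)
    return data[position - 1]  # positions are 1-based


# Calories data from Problem No. 1.
calories = [111, 132, 151, 182, 197, 209, 255, 295, 337, 377,
            126, 147, 179, 185, 200, 234, 286, 310, 353, 377,
            131, 151, 180, 190, 201, 255, 294, 319, 365, 439]

print("Q2  =", fractile(calories, 2, 4))
print("D6  =", fractile(calories, 6, 10))
print("P80 =", fractile(calories, 80, 100))
```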

LESSON 4: FRACTILE FOR GROUPED DATA

Problem No. 1

Calculate the 1st quartile, 8th decile and 65th percentile of the Engineering Data
Analysis test score of 50 students.

Scores     f     Class Boundary    <Cf
70 – 74    3     69.5 – 74.5       3
75 – 79    16    74.5 – 79.5       19
80 – 84    14    79.5 – 84.5       33
85 – 89    10    84.5 – 89.5       43
90 – 94    7     89.5 – 94.5       50
N = 50

• For the 1st quartile:

\[ \frac{N}{4} = \frac{50}{4} = 12.5\text{th item} \]

Thus, the 1st quartile class is 75 – 79, since it is where the 12.5th item is found.

L_Q1 = 74.5, N = 50, f_Q1 = 16, i = 5, <Cf_b = 3

\[ Q_1 = L_{Q_1} + \left( \frac{\frac{N}{4} - Cf_b}{f_{Q_1}} \right) i = 74.5 + \left( \frac{\frac{50}{4} - 3}{16} \right) 5 = 74.5 + 2.97 = 77.47 \]

• For the 8th decile:

\[ \frac{8N}{10} = \frac{8(50)}{10} = 40\text{th item} \]

Thus, the 8th decile class is 85 – 89, since it is where the 40th item is found.

L_D8 = 84.5, N = 50, f_D8 = 10, i = 5, <Cf_b = 33

\[ D_8 = L_{D_8} + \left( \frac{\frac{8N}{10} - Cf_b}{f_{D_8}} \right) i = 84.5 + \left( \frac{\frac{8(50)}{10} - 33}{10} \right) 5 = 84.5 + 3.5 = 88 \]

• For the 65th percentile:

\[ \frac{65N}{100} = \frac{65(50)}{100} = 32.5\text{th item} \]

Thus, the 65th percentile class is 80 – 84, since it is where the 32.5th item is found.

L_P65 = 79.5, N = 50, f_P65 = 14, i = 5, <Cf_b = 19

\[ P_{65} = L_{P_{65}} + \left( \frac{\frac{65N}{100} - Cf_b}{f_{P_{65}}} \right) i = 79.5 + \left( \frac{\frac{65(50)}{100} - 19}{14} \right) 5 = 79.5 + 4.82 = 84.32 \]

Problem No. 2

Calculate the 1st quartile, 3rd decile and 50th percentile of the Differential test score of
50 students.

Test Score f Class Boundaries <Cf


20 – 24 2 19.5 – 24.5 2
25 – 29 6 24.5 – 29.5 8
30 – 34 9 29.5 – 34.5 17
35 – 39 10 34.5 – 39.5 27
40 – 44 12 39.5 – 44.5 39
45 – 49 7 44.5 – 49.5 46
50 – 54 4 49.5 – 54.5 50
N = 50
• For the 1st quartile:

\[ \frac{N}{4} = \frac{50}{4} = 12.5\text{th item} \]

Thus, the 1st quartile class is 30 – 34, since it is where the 12.5th item is found.

L_Q1 = 29.5, N = 50, f_Q1 = 9, i = 5, <Cf_b = 8

\[ Q_1 = L_{Q_1} + \left( \frac{\frac{N}{4} - Cf_b}{f_{Q_1}} \right) i = 29.5 + \left( \frac{\frac{50}{4} - 8}{9} \right) 5 = 29.5 + 2.5 = 32 \]

• For the 3rd decile:

\[ \frac{3N}{10} = \frac{3(50)}{10} = 15\text{th item} \]

Thus, the 3rd decile class is 30 – 34, since it is where the 15th item is found.

L_D3 = 29.5, N = 50, f_D3 = 9, i = 5, <Cf_b = 8

\[ D_3 = L_{D_3} + \left( \frac{\frac{3N}{10} - Cf_b}{f_{D_3}} \right) i = 29.5 + \left( \frac{\frac{3(50)}{10} - 8}{9} \right) 5 = 29.5 + 3.89 = 33.39 \]

• For the 50th percentile:

\[ \frac{50N}{100} = \frac{50(50)}{100} = 25\text{th item} \]

Thus, the 50th percentile class is 35 – 39, since it is where the 25th item is found.

L_P50 = 34.5, N = 50, f_P50 = 10, i = 5, <Cf_b = 17

\[ P_{50} = L_{P_{50}} + \left( \frac{\frac{50N}{100} - Cf_b}{f_{P_{50}}} \right) i = 34.5 + \left( \frac{\frac{50(50)}{100} - 17}{10} \right) 5 = 34.5 + 4 = 38.5 \]

Problem No. 3

In a work study investigation, the time taken by 20 men in a firm to do a particular job
were tabulated below. Determine the 2nd quartile, 7th decile, and 30th Percentile.

Time taken 8 – 10 11 – 13 14 – 16 17 – 19 20 – 22 23 – 25
Frequencies 2 4 6 4 3 1

x f Class Boundaries <Cf


8 – 10 2 7.5 – 10.5 2
11 – 13 4 10.5 – 13.5 6
14 – 16 6 13.5 – 16.5 12
17 – 19 4 16.5 – 19.5 16
20 – 22 3 19.5 – 22.5 19
23 – 25 1 22.5 – 25.5 20
Total 20

• For the 2nd quartile:

\[ \frac{2N}{4} = \frac{2(20)}{4} = 10\text{th item} \]

Thus, the 2nd quartile class is 14 – 16, since it is where the 10th item is found.

L_Q2 = 13.5, N = 20, f_Q2 = 6, i = 3, <Cf_b = 6

\[ Q_2 = L_{Q_2} + \left( \frac{\frac{2N}{4} - Cf_b}{f_{Q_2}} \right) i = 13.5 + \left( \frac{\frac{2(20)}{4} - 6}{6} \right) 3 = 13.5 + 2 = 15.5 \]

• For the 6th decile:

\[ \frac{6N}{10} = \frac{6(20)}{10} = 12\text{th item} \]

Thus, the 6th decile class is 14 – 16, since it is where the 12th item is found.

L_D6 = 13.5, N = 20, f_D6 = 6, i = 3, <Cf_b = 6

\[ D_6 = L_{D_6} + \left( \frac{\frac{6N}{10} - Cf_b}{f_{D_6}} \right) i = 13.5 + \left( \frac{\frac{6(20)}{10} - 6}{6} \right) 3 = 13.5 + \left( \frac{6}{6} \right) 3 = 16.5 \]

• For the 30th percentile:

\[ \frac{30N}{100} = \frac{30(20)}{100} = 6\text{th item} \]

Thus, the 30th percentile class is 11 – 13, since it is where the 6th item is found.

L_P30 = 10.5, N = 20, f_P30 = 4, i = 3, <Cf_b = 2

\[ P_{30} = L_{P_{30}} + \left( \frac{\frac{30N}{100} - Cf_b}{f_{P_{30}}} \right) i = 10.5 + \left( \frac{\frac{30(20)}{100} - 2}{4} \right) 3 = 10.5 + 3 = 13.5 \]

Problem No. 4

Calculate the 2nd quartile, 7th decile and 45th percentile of the mathematics test score of
50 students.

Scores f <Cf Class Boundaries


46 – 50 4 50 45.5 – 50.5
41 – 45 8 46 40.5 – 45.5
36 – 40 11 38 35.5 – 40 .5
31 – 35 9 27 30.5 – 35.5
26 – 30 12 18 25.5 – 30.5
21 – 25 6 6 20.5 – 25.5
• For the 2nd quartile:

\[ \frac{2N}{4} = \frac{2(50)}{4} = 25\text{th item} \]

Thus, the 2nd quartile class is 31 – 35, since it is where the 25th item is found.

L_Q2 = 30.5, N = 50, f_Q2 = 9, i = 5, <Cf_b = 18

\[ Q_2 = L_{Q_2} + \left( \frac{\frac{2N}{4} - Cf_b}{f_{Q_2}} \right) i = 30.5 + \left( \frac{\frac{2(50)}{4} - 18}{9} \right) 5 = 30.5 + \left( \frac{7}{9} \right) 5 = 30.5 + \frac{35}{9} = 34.39 \]

• For the 7th decile:

\[ \frac{7N}{10} = \frac{7(50)}{10} = 35\text{th item} \]

Thus, the 7th decile class is 36 – 40, since it is where the 35th item is found.

L_D7 = 35.5, N = 50, f_D7 = 11, i = 5, <Cf_b = 27

\[ D_7 = L_{D_7} + \left( \frac{\frac{7N}{10} - Cf_b}{f_{D_7}} \right) i = 35.5 + \left( \frac{\frac{7(50)}{10} - 27}{11} \right) 5 = 35.5 + \frac{40}{11} = 39.14 \]

• For the 33rd percentile:

\[ \frac{33N}{100} = \frac{33(50)}{100} = 16.5\text{th item} \]

Thus, the 33rd percentile class is 26 – 30, since it is where the 16.5th item is found.

L_P33 = 25.5, N = 50, f_P33 = 12, i = 5, <Cf_b = 6

\[ P_{33} = L_{P_{33}} + \left( \frac{\frac{33N}{100} - Cf_b}{f_{P_{33}}} \right) i = 25.5 + \left( \frac{\frac{33(50)}{100} - 6}{12} \right) 5 = 25.5 + \left( \frac{7}{8} \right) 5 = 29.88 \]

Problem No. 5

The airborne speeds in kilometer per hour of 26 planes are shown below. Find the 1 st
quartile, 6th decile and 95th percentile.

Class f <Cf Class Boundaries


526 – 545    5    26    525.5 – 545.5
506 – 525    4    21    505.5 – 525.5
486 – 505    3    17    485.5 – 505.5
466 – 485    2    14    465.5 – 485.5
446 – 465    1    12    445.5 – 465.5
426 – 445    2    11    425.5 – 445.5
406 – 425    3    9     405.5 – 425.5
386 – 405    2    6     385.5 – 405.5
366 – 385    4    4     365.5 – 385.5

• For the 1st quartile:

\[ \frac{N}{4} = \frac{26}{4} = 6.5\text{th item} \]

Thus, the 1st quartile class is 406 – 425, since it is where the 6.5th item is found.

L_Q1 = 405.5, N = 26, f_Q1 = 3, i = 20, <Cf_b = 6

\[ Q_1 = L_{Q_1} + \left( \frac{\frac{N}{4} - Cf_b}{f_{Q_1}} \right) i = 405.5 + \left( \frac{\frac{26}{4} - 6}{3} \right) 20 = 405.5 + \left( \frac{1}{6} \right) 20 = 405.5 + \frac{10}{3} = 408.83 \]

• For the 6th decile:

\[ \frac{6N}{10} = \frac{6(26)}{10} = 15.6\text{th item} \]

Thus, the 6th decile class is 486 – 505, since it is where the 15.6th item is found.

L_D6 = 485.5, N = 26, f_D6 = 3, i = 20, <Cf_b = 14

\[ D_6 = L_{D_6} + \left( \frac{\frac{6N}{10} - Cf_b}{f_{D_6}} \right) i = 485.5 + \left( \frac{\frac{6(26)}{10} - 14}{3} \right) 20 = 485.5 + \left( \frac{8}{15} \right) 20 = 485.5 + \frac{32}{3} = 496.17 \]

• For the 95th percentile:

\[ \frac{95N}{100} = \frac{95(26)}{100} = 24.7\text{th item} \]

Thus, the 95th percentile class is 526 – 545, since it is where the 24.7th item is found.

L_P95 = 525.5, N = 26, f_P95 = 5, i = 20, <Cf_b = 21

\[ P_{95} = L_{P_{95}} + \left( \frac{\frac{95N}{100} - Cf_b}{f_{P_{95}}} \right) i = 525.5 + \left( \frac{\frac{95(26)}{100} - 21}{5} \right) 20 = 525.5 + \left( \frac{37}{50} \right) 20 = 525.5 + \frac{74}{5} = 540.3 \]
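
Every quartile, decile, and percentile in this lesson uses the same interpolation formula, so one function covers all of them. The following Python sketch is illustrative only: the name `grouped_fractile` is hypothetical, classes must be given in ascending order with whole-number limits (lower boundary = lower limit minus 0.5), and the fourth class of the airspeed table is entered as 466 – 485, which is an assumption based on the class width of 20.

```python
def grouped_fractile(classes, k, parts, width):
    """k-th fractile (out of `parts`) of grouped data.

    classes: list of (lower_limit, upper_limit, frequency), ascending.
    Applies L + ((k*N/parts - Cf_b) / f) * i to the fractile class.
    """
    n = sum(f for _, _, f in classes)
    target = k * n / parts  # position of the fractile item
    cum = 0                 # cumulative frequency before the class
    for lo, hi, f in classes:
        if cum + f >= target:
            return (lo - 0.5) + ((target - cum) / f) * width
        cum += f
    raise ValueError("fractile position exceeds the total frequency")


# Airborne speeds (km/h) from Problem No. 5, lowest class first.
speeds = [(366, 385, 4), (386, 405, 2), (406, 425, 3),
          (426, 445, 2), (446, 465, 1), (466, 485, 2),
          (486, 505, 3), (506, 525, 4), (526, 545, 5)]

q1 = grouped_fractile(speeds, 1, 4, width=20)
d6 = grouped_fractile(speeds, 6, 10, width=20)
p95 = grouped_fractile(speeds, 95, 100, width=20)
print(f"Q1 = {q1:.2f}, D6 = {d6:.2f}, P95 = {p95:.2f}")
```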
Module 4

"Measures of Variation"

LESSON 1: RANGE

Problem No.1

Given the measurements 20, 26, 40, 39, 25, 36, 21, 34, 33, and 37. Find the range.
R=Highest observation−Lowest observation

Answer:

R=40−20= 20 - The range of the given measurements is 20.

Problem No. 2

A group of adventurers went to the mountain range in Sierra Madre, Philippines to enjoy the
great view and to explore the mountain in the area. The ages of the adventurers are 34, 30, 27,
50, 45, 35, 38, 47, 52, 31, 38, and 40. What is the range of their ages?

R=Highest observation−Lowest observation

Answer:

R=52−27= 25 - The range of the ages of the adventurers is 25.

Problem No. 3

Find the range:

a) 150, 250, 825, 400, 180, 500

b) 2.2, 1.8, 5.1, 0.6

R=Highest observation−Lowest observation

Answer:

Ra =825−150 = 675 - The range of the given number is 675.

Rb =5.1−0.6 = 4.5 - The range of the given number is 4.5.

Problem No. 4

You take 7 statistics tests over the course of a semester. You score 94, 88, 73, 84, 91,
87, and 79. What is the range of your scores?

Range = 94 – 73 = 21

The range is 21 which is the difference between the highest and lowest observation.
Problem No. 5

Find the range: 150, 250, 825, 400, 200, 500.

Range = 825 – 150 = 675

The range is 675 which is the difference between the highest and lowest observation.
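
The range computations in this lesson all follow R = highest observation − lowest observation. A minimal Python sketch (the function name `value_range` is illustrative):

```python
def value_range(values):
    """Range = highest observation - lowest observation."""
    return max(values) - min(values)


print(value_range([20, 26, 40, 39, 25, 36, 21, 34, 33, 37]))
print(value_range([150, 250, 825, 400, 200, 500]))
```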

LESSON 2: AVERAGE OR MEAN DEVIATION, STANDARD DEVIATION and


VARIANCE, and Coefficient of Variation

Problem No. 1

Calculate the mean deviation about the mean for the following data.

Class 1-4 5-8 9-12 13-16 17-20


Interval
Frequency 4 6 8 5 2
Answer:

Class Frequency x fx x- f(x-x́/)


Interval x́/
1-4 9 2.5 22.5 8.98 89.82
5-8 6 6.5 39 4.98 29.88
9-12 12 10.5 126 0.98 11.76
13-16 8 14.5 116 3.02 24.16
17-20 14 18.5 259 7.02 98.28
f=49 fx=562.5  f(x-
x/)=253.9

∑ fX
: 562.5
X́ = ❑
= =11.48
n 49

253.9
Mean Deviation: MAD¿ ∑ f ¿ ¿ ¿ MAD¿
= 5.18
❑ 49
- The mean deviation of the following data is 5.18.

Problem No.2
In a pancake-eating competition, the numbers of pancakes eaten by seven contestants in an hour are as follows:
12, 18, 21, 26, 17, 20, 18
Find the mean deviation.
Answer:
Score (x)    |x − x̄|
12           6.86
18           0.86
21           2.14
26           7.14
17           1.86
20           1.14
18           0.86
∑x = 132     ∑|x − x̄| = 20.86

\[ \bar{X} = \frac{\sum X}{n} = \frac{132}{7} = 18.86 \]

\[ MAD = \frac{\sum |x - \bar{X}|}{n} = \frac{20.86}{7} = 2.98 \]

- The mean deviation of the number of pancakes eaten by the seven contestants is 2.98.

Problem No. 3

Find the mean deviation, standard deviation and variance, coefficient of variation,
skewness and kurtosis of the sample observations 2, 5, 7, 9 and 12.

• For the mean:

\[ \bar{X} = \frac{\sum x}{n} = \frac{2+5+7+9+12}{5} = \frac{35}{5} = 7 \]

X     X − X̄          |X − X̄|    (X − X̄)²    (X − X̄)³    (X − X̄)⁴
2     2 – 7 = -5     5          25          -125        625
5     5 – 7 = -2     2          4           -8          16
7     7 – 7 = 0      0          0           0           0
9     9 – 7 = 2      2          4           8           16
12    12 – 7 = 5     5          25          125         625
Sums:                14         58          0           1282

• For mean deviation:

\[ MD = \frac{\sum |X - \bar{X}|}{n} = \frac{14}{5} = 2.8 \]

• For standard deviation:

\[ S = \sqrt{\frac{\sum (X - \bar{X})^2}{n-1}} = \sqrt{\frac{58}{5-1}} = 3.81 \]

• For variance (the variance is the square of the standard deviation):

\[ S^2 = \frac{\sum (X - \bar{X})^2}{n-1} = \frac{58}{5-1} = \frac{58}{4} = 14.5 \]

• For coefficient of variation:

\[ V = \frac{S}{\bar{X}} (100) = \frac{3.81}{7} (100) = 54.43\% \]

• For the skewness:

\[ \text{Skewness} = \frac{\sum (X - \bar{X})^3}{(n-1)(S^3)} = \frac{0}{(5-1)(3.81)^3} = 0 \]

• For the kurtosis:

\[ \text{Kurtosis} = \frac{\sum (X - \bar{X})^4}{(n-1)(S^4)} = \frac{1282}{(5-1)(3.81)^4} \approx 1.52 \]

Problem No. 4

Given the values 4, 4, 6, 7 and 9. Compute the mean deviation, standard deviation and
variance, coefficient of variation, skewness and kurtosis.

• For the mean:

\[ \bar{X} = \frac{\sum x}{n} = \frac{4+4+6+7+9}{5} = \frac{30}{5} = 6 \]

X    X − X̄         |X − X̄|    (X − X̄)²    (X − X̄)³    (X − X̄)⁴
4    4 – 6 = -2    2          4           -8          16
4    4 – 6 = -2    2          4           -8          16
6    6 – 6 = 0     0          0           0           0
7    7 – 6 = 1     1          1           1           1
9    9 – 6 = 3     3          9           27          81
Sums:              8          18          12          114

• For mean deviation:

\[ MD = \frac{\sum |X - \bar{X}|}{n} = \frac{8}{5} = 1.6 \]

• For standard deviation:

\[ S = \sqrt{\frac{\sum (X - \bar{X})^2}{n-1}} = \sqrt{\frac{18}{5-1}} = 2.12 \]

• For variance (the variance is the square of the standard deviation):

\[ S^2 = \frac{\sum (X - \bar{X})^2}{n-1} = \frac{18}{5-1} = \frac{18}{4} = 4.5 \]

• For coefficient of variation:

\[ V = \frac{S}{\bar{X}} (100) = \frac{2.12}{6} (100) = 35.33\% \]

• For the skewness:

\[ \text{Skewness} = \frac{\sum (X - \bar{X})^3}{(n-1)(S^3)} = \frac{12}{(5-1)(2.12)^3} \approx 0.31 \]

• For the kurtosis:

\[ \text{Kurtosis} = \frac{\sum (X - \bar{X})^4}{(n-1)(S^4)} = \frac{114}{(5-1)(2.12)^4} \approx 1.41 \]

Problem No. 5
Compute the average deviation for the ages at which men in a Chataqua bowling club scored their first game over 175. Also solve for the standard deviation, variance, coefficient of variation, skewness, and kurtosis.

29, 36, 42, 48, 49, 56, 59, 62, 64, 65

 For the mean:



∑ x 510
29+36+ 42+48+ 49+56+59+62+64 +65 ¿ =¿ 51
X́ = ❑
= 10
n 10

$X$    $X-\bar{X}$     $|X-\bar{X}|$   $(X-\bar{X})^2$   $(X-\bar{X})^3$   $(X-\bar{X})^4$
29    29 – 51 = -22    22    484    -10648    234256
36    36 – 51 = -15    15    225    -3375     50625
42    42 – 51 = -9     9     81     -729      6561
48    48 – 51 = -3     3     9      -27       81
49    49 – 51 = -2     2     4      -8        16
56    56 – 51 = 5      5     25     125       625
59    59 – 51 = 8      8     64     512       4096
62    62 – 51 = 11     11    121    1331      14641
64    64 – 51 = 13     13    169    2197      28561
65    65 – 51 = 14     14    196    2744      38416
Totals: $\sum |X-\bar{X}| = 102$, $\sum (X-\bar{X})^2 = 1378$, $\sum (X-\bar{X})^3 = -7878$, $\sum (X-\bar{X})^4 = 377878$

 For Mean Deviation

$MD = \frac{\sum |X-\bar{X}|}{n} = \frac{22+15+9+3+2+5+8+11+13+14}{10} = \frac{102}{10} = 10.2$

 For Standard Deviation:



$S = \sqrt{\frac{\sum (X-\bar{X})^2}{n-1}} = \sqrt{\frac{1378}{10-1}} = \sqrt{153.11} = 12.37$

 For variance: (Variance is the square of standard deviation)



$S^2 = \frac{\sum (X-\bar{X})^2}{n-1} = \frac{1378}{10-1} = \frac{1378}{9} = 153.11$
 For coefficient of variation:

$V = \frac{S}{\bar{X}}(100) = \frac{12.37}{51}(100) = 24.25\%$
 For the skewness:

$\text{Skewness} = \frac{\sum (X-\bar{X})^3}{(n-1)(S^3)} = \frac{-7878}{(10-1)(12.37)^3} = -0.46$

 For the kurtosis:



$\text{Kurtosis} = \frac{\sum (X-\bar{X})^4}{(n-1)(S^4)} = \frac{377878}{(10-1)(12.37)^4} = 1.79$

LESSON 3: QUARTILE DEVIATION

Problem No. 1

Get the quartile deviation of the observation below in ascending order: 70, 76, 80, 83, 85,
85, 95, 96, 100, 110

$Q_1 = \frac{N+1}{4} = \frac{11}{4} = 2.75$th item

= 2nd item + 0.75 (3rd – 2nd)

= 76 + 0.75 (80 – 76)

= 76 + 3

= 79

$Q_3 = \frac{3(N+1)}{4} = \frac{3(11)}{4} = 8.25$th item

= 8th item + 0.25 (9th – 8th)

= 96 + 0.25 (100 – 96)

= 96 + 1

= 97

$Q = \frac{Q_3-Q_1}{2} = \frac{97-79}{2} = \frac{18}{2} = 9$
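
The positional rule used above, where a fractional position such as 2.75 is resolved by linear interpolation between neighbouring items, can be sketched as follows. The helper names `item_at` and `quartile_deviation` are illustrative, and the sketch assumes at least three observations.

```python
def item_at(sorted_values, position):
    """Value at a 1-based, possibly fractional, position (linear interpolation)."""
    whole = int(position)
    fraction = position - whole
    value = sorted_values[whole - 1]
    if fraction > 0:
        value += fraction * (sorted_values[whole] - value)
    return value

def quartile_deviation(values):
    """Return (Q1, Q3, Q) using the (N + 1)/4 and 3(N + 1)/4 positions."""
    data = sorted(values)
    n = len(data)
    q1 = item_at(data, (n + 1) / 4)
    q3 = item_at(data, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2

q1, q3, q = quartile_deviation([70, 76, 80, 83, 85, 85, 95, 96, 100, 110])
print(q1, q3, q)
```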

Problem No. 2

The age at which men in a Chataqua bowling club scored their first game over 175 are
recorded below in order. Get the quartile deviation

29, 36, 42, 48, 49, 56, 59, 62, 64, 65

$Q_1 = \frac{N+1}{4} = \frac{11}{4} = 2.75$th item

= 2nd item + 0.75 (3rd – 2nd)

= 36 + 0.75 (42 – 36)

= 36 + 4.5

= 40.5

$Q_3 = \frac{3(N+1)}{4} = \frac{3(11)}{4} = 8.25$th item

= 8th item + 0.25 (9th – 8th)

= 62 + 0.25 (64 – 62)

= 62 + 0.5

= 62.5

$Q = \frac{Q_3-Q_1}{2} = \frac{62.5-40.5}{2} = \frac{22}{2} = 11$
Problem No. 3

Get the quartile deviation of the sample observations 85, 96, 76, 108, 85, 80, 100, 85,
70, 95, 106, 70, 99, 79, 88.

70 70 76 79 80 85 85 85 88 95 96 99 100 106 108

$Q_1 = \frac{N+1}{4} = \frac{16}{4} = 4$th item = 79

$Q_3 = \frac{3(N+1)}{4} = \frac{3(16)}{4} = 12$th item = 99

$Q = \frac{Q_3-Q_1}{2} = \frac{99-79}{2} = \frac{20}{2} = 10$

Problem No. 4

Harry Ltd. is a textile manufacturer. They want to know how much their production spread
is. Use the quartile deviation formula to help the management find the dispersion with the
data collected for the last 10 days per employee.

140, 145, 150, 155, 156, 169, 175, 177, 188, 190

$Q_1 = \frac{N+1}{4} = \frac{11}{4} = 2.75$th item

= 2nd item + 0.75 (3rd – 2nd)

= 145 + 0.75 (150 – 145)

= 145 + 3.75

= 148.75

$Q_3 = \frac{3(N+1)}{4} = \frac{3(11)}{4} = 8.25$th item

= 8th item + 0.25 (9th – 8th)

= 177 + 0.25 (188 – 177)

= 177 + 2.75

= 179.75

$Q = \frac{Q_3-Q_1}{2} = \frac{179.75-148.75}{2} = \frac{31}{2} = 15.5$

Problem No. 5

Solve the quartile deviation of the sample observations 10, 3, 13, 11, 15, 5, 4, 2, 3, 2.

In order: 2 2 3 3 4 5 10 11 13 15

$Q_1 = \frac{N+1}{4} = \frac{11}{4} = 2.75$th item

= 2nd item + 0.75 (3rd – 2nd)

= 2 + 0.75 (3 – 2)

= 2 + 0.75

= 2.75

$Q_3 = \frac{3(N+1)}{4} = \frac{3(11)}{4} = 8.25$th item

= 8th item + 0.25 (9th – 8th)

= 11 + 0.25 (13 – 11)

= 11 + 0.5

= 11.5

$Q = \frac{Q_3-Q_1}{2} = \frac{11.5-2.75}{2} = \frac{8.75}{2} = 4.375$
Lesson 4: Coefficient of Variation

Problem No. 1

The following table gives the values of mean and variance of heights and weights of the 10th
standard students of a school.

Answer:

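Convert the variance $\sigma^2$ to the standard deviation $\sigma$:

$\sigma = \sqrt{72.25} = 8.5$

$CV = \frac{8.5}{155} \times 100\% = 5.48\%$ (height)

$\sigma = \sqrt{28.09} = 5.3$

$CV = \frac{5.3}{46.50} \times 100\% = 11.40\%$ (weight)

- The weights of the students are more varied than their heights, since the weights have the larger coefficient of variation (11.40%).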
Problem No.2
Given the mean and standard deviation of the number of bananas and apples consumed by
one family in a week.
                      Apple   Banana
MEAN                  4.29    4.29
STANDARD DEVIATION    1.01    2.81
Answer:

Apple: $CV = \frac{1.01}{4.29} \times 100\% = 0.2354 \times 100\% = 23.54\%$

Banana: $CV = \frac{2.81}{4.29} \times 100\% = 0.6550 \times 100\% = 65.50\%$

- This shows that the consumption of bananas is more varied than that of apples.

Problem No.3
The standard deviation and mean of a data are 6.5 and 12.5 respectively. Find the coefficient
of variation.
Answer:
$CV = \frac{6.5}{12.5} \times 100\% = 0.52 \times 100\% = 52\%$

-The coefficient of variation of the given data is 52%.

Problem No.4-5

Calculate the coefficient of variations and construct a frequency distribution.

Class Interval   1-4   5-8   9-12   13-16   17-20
Frequency        4     6     8      5       2
Answer:

Class Interval   Frequency (f)   x      fx      $|x-\bar{x}|$   $(x-\bar{x})^2$   $f(x-\bar{x})^2$
1-4              9               2.5    22.5    8.98            80.6404           725.7636
5-8              6               6.5    39      4.98            24.8004           148.8024
9-12             12              10.5   126     0.98            0.9604            11.5248
13-16            8               14.5   116     3.02            9.1204            72.9632
17-20            14              18.5   259     7.02            49.2804           689.9256
Totals: $\sum f = 49$, $\sum fx = 562.5$, $\sum f(x-\bar{x})^2 = 1648.9796$

$\bar{X} = \frac{\sum fX}{n} = \frac{562.5}{49} = 11.48$

$s = \sqrt{\frac{\sum f(x-\bar{x})^2}{n-1}} = \sqrt{\frac{1648.9796}{49-1}} = 5.86$

$CV = \frac{s}{\bar{X}} \times 100\% = \frac{5.86}{11.48} \times 100\% = 0.5105 \times 100\% = 51.05\%$

-The coefficient of variation of the given data is 51.05%.
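
The grouped-data steps above (class marks, weighted mean, standard deviation with $n-1$, then CV) can be sketched in a few lines. The function name `grouped_cv` is illustrative, and the frequencies passed in are the ones that appear in the answer table above, not a separate data source.

```python
import math

def grouped_cv(class_marks, frequencies):
    """Mean, sample standard deviation and CV (%) for grouped data."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(class_marks, frequencies)) / n
    squared = sum(f * (x - mean) ** 2 for x, f in zip(class_marks, frequencies))
    s = math.sqrt(squared / (n - 1))  # n - 1, as in the worked answer
    return mean, s, s / mean * 100

mean, s, cv = grouped_cv([2.5, 6.5, 10.5, 14.5, 18.5], [9, 6, 12, 8, 14])
print(f"mean={mean:.2f}, s={s:.2f}, CV={cv:.2f}%")
```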


Lesson 5: Percentile Range

Problem No.1

In solving percentile range what is the equation to be used?

o $PR = P_{90} - P_{10}$

Problem No.2

Calculate the percentile range of the runs scored by a batsman in last 20 matches:

34 39 63 64 67 70 75 76 81 82 84 85 86 88 89 90 90 96 96 100
Answer:
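$P_{90}$ position $= \frac{90N}{100} = \frac{90(20)}{100} = 18$th item, so $P_{90} = 96$

$P_{10}$ position $= \frac{10N}{100} = \frac{10(20)}{100} = 2$nd item, so $P_{10} = 39$

$PR = P_{90} - P_{10} = 96 - 39 = 57$

-The percentile range in the given data is 57.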
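
The positional rule used in this lesson can be sketched as below. The helper names are illustrative, and rounding a fractional position up to the next whole item is an assumption taken from how the worked answers in this lesson treat positions such as 13.5 and 1.5.

```python
import math

def percentile_item(sorted_values, k):
    """Item at position kN/100, rounding a fractional position up."""
    position = math.ceil(k * len(sorted_values) / 100)
    return sorted_values[position - 1]

def percentile_range(values):
    """PR = P90 - P10 for ungrouped data."""
    data = sorted(values)
    return percentile_item(data, 90) - percentile_item(data, 10)

runs = [34, 39, 63, 64, 67, 70, 75, 76, 81, 82,
        84, 85, 86, 88, 89, 90, 90, 96, 96, 100]
print(percentile_range(runs))
```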

Problem No. 3

Calculate the percentile range of the scores of students in a Post-test examination. The scores
are as follows:

77 56 89 90 76 72 65 92 83 84 71 94 64
64 80 86 74 90 64 88

Answer:

56 64 64 64 65 71 72 74 76 77 80 83 84 86 88
89 90 90 92 94

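$P_{90}$ position $= \frac{90N}{100} = \frac{90(20)}{100} = 18$th item, so $P_{90} = 90$

$P_{10}$ position $= \frac{10N}{100} = \frac{10(20)}{100} = 2$nd item, so $P_{10} = 64$

$PR = P_{90} - P_{10} = 90 - 64 = 26$

-The percentile range of the scores is 26.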

Problem No. 4

Calculate the percentile range of the weight of students in section A. The following data are
weights:

56 49 70 63 58 62 67 51 69 72 64
67 61 54 59

Answer:

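49 51 54 56 58 59 61 62 63 64
67 67 69 70 72

$P_{90}$ position $= \frac{90N}{100} = \frac{90(15)}{100} = 13.5$th item, rounded up to the 14th item, so $P_{90} = 70$

$P_{10}$ position $= \frac{10N}{100} = \frac{10(15)}{100} = 1.5$th item, rounded up to the 2nd item, so $P_{10} = 51$

$PR = P_{90} - P_{10} = 70 - 51 = 19$

-The percentile range of the weights is 19.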

Problem No. 5
Calculate the percentile range of frequency distribution table:
Class Interval   Frequency   Class mark (x)   fx      Cumulative Frequency (cf)
54-57            3           55.5             166.5   3
58-61            2           59.5             119     5
62-65            11          63.5             698.5   16
66-69            12          67.5             810     28
70-73            9           71.5             643.5   37
74-77            8           75.5             604     45
78-81            4           79.5             318     49
82-85            1           83.5             83.5    50
Answer:

$P_{90}$ position $= \frac{90N}{100} = \frac{90(50)}{100} = 45$th item; $lb_{P_{90}} = 77.5$, ${<}cf = 45$, $i = 4$, $f = 4$

$P_{90} = L_{P_{90}} + \left[\frac{\frac{90N}{100} - {<}cf}{f}\right] \times i = 77.5 + \left[\frac{45-45}{4}\right] \times 4 = 77.5$

$P_{10}$ position $= \frac{10N}{100} = \frac{10(50)}{100} = 5$th item; $lb_{P_{10}} = 61.5$, ${<}cf = 5$, $i = 4$, $f = 11$

$P_{10} = L_{P_{10}} + \left[\frac{\frac{10N}{100} - {<}cf}{f}\right] \times i = 61.5 + \left[\frac{5-5}{11}\right] \times 4 = 61.5$

$PR = P_{90} - P_{10} = 77.5 - 61.5 = 16$

-The percentile range of the given frequency distribution is 16.
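
A minimal sketch of the grouped percentile formula follows. The function name `grouped_percentile` is illustrative. It locates the first class whose cumulative frequency reaches the position $kN/100$, which is an assumption about class selection; for this table that choice gives the same $P_{90}$ and $P_{10}$ as the answer above.

```python
def grouped_percentile(k, lower_boundaries, frequencies, width):
    """k-th percentile of grouped data: L + ((kN/100 - <cf) / f) * i."""
    n = sum(frequencies)
    target = k * n / 100
    cumulative = 0  # cumulative frequency before the current class
    for boundary, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= target:
            return boundary + (target - cumulative) * width / f
        cumulative += f
    raise ValueError("k must be between 0 and 100")

boundaries = [53.5, 57.5, 61.5, 65.5, 69.5, 73.5, 77.5, 81.5]
frequencies = [3, 2, 11, 12, 9, 8, 4, 1]
p90 = grouped_percentile(90, boundaries, frequencies, 4)
p10 = grouped_percentile(10, boundaries, frequencies, 4)
print(p90, p10, p90 - p10)
```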
MODULE 5

PROBABILITIES
Lesson 1: Sample space and event
Problem No.1
There are 6! permutations of the 6 letters of the word "square." In how many of them is r the
second letter?
Solution
Let r be the second letter. Then there are 5 ways to fill the first spot, 4 ways to fill the third, 3
to fill the fourth, and so on. There are 5! = 120 such permutations.
Problem No.2
Five different books are on a shelf. In how many different ways could you arrange them?
Solution
The five books can be arranged in 5·4·3·2·1 = 5! = 120 ways
Problem No. 3

Two coins are tossed, find the probability that two heads are obtained. 

 The sample space S is given by.


S = {(H,T), (H,H), (T,H), (T,T)}
 Let E be the event "two heads are obtained".
E = {(H,H)}

We use the formula of the classical probability.

P(E) = n(E) / n(S) = 1 / 4

Problem No.4

Two dice are rolled, find the probability that the sum is equal to 1.

 The sample space S of two dice is shown below.


S = { (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (3,1),
(3,2), (3,3), (3,4), (3,5), (3,6), (4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (5,1), (5,2), (5,3),
(5,4), (5,5), (5,6), (6,1), (6,2), (6,3), (6,4), (6,5), (6,6) }
 Let E be the event "sum equal to 1". There are no outcomes which correspond to a
sum equal to 1, hence:
P(E) = n(E) / n(S) = 0 / 36 = 0
Problem No. 5

A die is rolled and a coin is tossed, find the probability that the die shows an odd number
and the coin shows a head.

 Let H be the head and T be the tail of the coin. The sample space S of this experiment
is as follows:
S = { (1,H),(2,H),(3,H),(4,H),(5,H),(6,H),(1,T),(2,T),(3,T),(4,T),(5,T),(6,T)}
 Let E be the event "the die shows an odd number and the coin shows a head". Event E
may be described as follows:
E={(1,H),(3,H),(5,H)}
 The probability P(E) is given by
P(E) = n(E) / n(S) = 3 / 12 = 1 / 4
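
The classical probability formula P(E) = n(E) / n(S) can be checked by enumerating the sample space. This is a small sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: every (die face, coin side) pair.
sample_space = list(product(range(1, 7), "HT"))

# Event E: the die shows an odd number and the coin shows a head.
event = [(die, coin) for die, coin in sample_space
         if die % 2 == 1 and coin == "H"]

probability = Fraction(len(event), len(sample_space))
print(len(sample_space), len(event), probability)
```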

Lesson 2: Laws of probability


Problem No. 1.
QUESTION: Describe the sample space and all 16 events for a trial in which two coins are
thrown and each shows either a head or a tail.
SOLUTION: The sample space is S = {hh, ht, th, tt}. As this has 4 elements there are 2^4 = 16
subsets, namely ∅, {hh}, {ht}, {th}, {tt}, {hh, ht}, {hh, th}, {hh, tt}, {ht, th}, {ht, tt}, {th, tt},
{hh, ht, th}, {hh, ht, tt}, {hh, th, tt}, {ht, th, tt} and finally {hh, ht, th, tt}.
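
The count of events can be confirmed by listing every subset of the sample space. A minimal sketch, with illustrative variable names:

```python
from itertools import chain, combinations

outcomes = ["hh", "ht", "th", "tt"]

# Every subset of the sample space is an event, from the empty set to S itself.
events = list(chain.from_iterable(
    combinations(outcomes, size) for size in range(len(outcomes) + 1)
))

print(len(events))
for event in events:
    print(set(event) if event else "empty set")
```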
Problem No. 2.
QUESTION: A fair coin is tossed, and a fair die is thrown. Write down sample spaces for
(a) the toss of the coin; (b) the throw of the die; (c) the combination of these experiments.
Let A be the event that a head is tossed, and B be the event that an odd number is thrown.
Directly from the sample space, calculate P(A∩B) and P(A∪B).
SOLUTION:
(a) {Head, Tail} (b) {1, 2, 3, 4, 5, 6} (c) {(1∩Head), (1∩Tail), ..., (6∩Head), (6∩Tail)}. Clearly
P(A) = 1/2 = P(B). We can assume that the two events are independent, so
P(A∩B) = P(A)P(B) = 1/4
Alternatively, we can examine the sample space above and deduce that three of the twelve
equally likely outcomes comprise A∩B. Also, P(A∪B) = P(A) + P(B) − P(A∩B) = 3/4, where
this probability can also be determined by noticing from the sample space that nine of the twelve
equally likely outcomes comprise A∪B.

Problem No. 3.
QUESTION: A bag contains fifteen balls distinguishable only by their colours; ten are blue
and five are red. I reach into the bag with both hands and pull out two balls (one with each
hand) and record their colours.
(a) What is the random phenomenon? (b) What is the sample space? (c) Express the event
that the ball in my left hand is red as a subset of the sample space.
SOLUTION:
(a) The random phenomenon is (or rather the phenomena are) the colours of the two balls. (b)
The sample space is the set of all possible colours for the two balls, which is {(B,B),(B,R),
(R,B),(R,R)}. (c) The event is the subset {(R,B),(R,R)}.
Problem No. 4.
QUESTION: M&M sweets are of varying colours and the different colours occur in different
proportions. The table below gives the probability that a randomly chosen M&M has each
colour, but the value for tan candies is missing.
Colour        Brown   Red   Yellow   Green   Orange   Tan
Probability   0.3     0.2   0.2      0.1     0.1      ?
(a) What value must the missing probability be? (b) You draw an M&M at random from a
packet. What is the probability of each of the following events? i. You get a brown one or a
red one. ii. You don’t get a yellow one. iii. You don’t get either an orange one or a tan one.
iv. You get one that is brown or red or yellow or green or orange or tan.
SOLUTION:
(a) The probabilities must sum to 1.0. Therefore, the answer is 1−0.3−0.2−0.2−0.1−0.1 =
1−0.9 = 0.1.
(b) Simply add and subtract the appropriate probabilities.
i. 0.3+0.2 = 0.5, since it can’t be brown and red simultaneously (the events are incompatible).
ii. 1−P(yellow) = 1−0.2 = 0.8.
iii. 1−P(orange or tan) = 1−P(orange)−P(tan) = 1−0.1−0.1 = 0.8 (since orange and tan
are incompatible events).
iv. This must happen; the probability is 1.0.
Problem No.5.
QUESTION: You consult Joe the bookie as to the form in the 2.30 at Ayr. He tells you that,
of 16 runners, the favourite has probability 0.3 of winning, two other horses each have
probability 0.20 of winning, and the remainder each have probability 0.05 of winning,
excepting Desert Pansy, which has a worse than no chance of winning. What do you think of
Joe’s advice?
SOLUTION: Assume that the sample space consists of a win for each of the 16 different
horses. Joe’s probabilities for these sum to 0.3 + 2(0.20) + 12(0.05) = 1.3 (rather than unity),
so Joe is incoherent, albeit profitable! Additionally, even Desert Pansy must have a non-negative
probability of winning.
LESSON 3. PROBABILITY DISTRIBUTION

Problem No. 1

A fair coin is tossed twice. Let X be the number of heads that are observed.

 Construct the probability distribution of X.

X 0 1 2
P(X) 0.25 0.50 0.25
 Find the probability that at least one head is observed.
P(X ≥ 1) = P (1) + P (2) = 0.50 + 0.25 = 0.75

Problem No. 2

A pair of fair dice is rolled. Let X denote the sum of the number of dots on the top faces.

 Construct the probability distribution of x.

 Find P(X ≥ 9)
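
The distribution asked for above can be built by counting the 36 equally likely outcomes. This is a sketch under the usual assumption of two fair six-sided dice; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
distribution = {total: Fraction(count, 36)
                for total, count in sorted(counts.items())}

for total, p in distribution.items():
    print(total, p)

# P(X >= 9) adds the probabilities of the sums 9, 10, 11 and 12.
p_at_least_9 = sum(p for total, p in distribution.items() if total >= 9)
print(p_at_least_9)
```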

Problem No. 3

A service organization in a large town organizes a raffle each month. One thousand
raffle tickets are sold for $1 each. Each has an equal chance of winning. First prize is
$300, second prize is $200, and third prize is $100. Let X denote the net gain from the
purchase of one ticket.

 Construct the probability distribution of X.


 Find the probability of winning any money in the purchase of one ticket.

P(W) = P(299) + P(199) + P(99) = 0.001 + 0.001 + 0.001 = 0.003

Problem No. 4

A life insurance company will sell a $200,000 one-year term life insurance policy to an
individual in a particular risk group for a premium of $195. Find the expected value to the
company of a single policy if a person in this risk group has a 99.97% chance of surviving
one year.

x       195      −199,805
P(x)    0.9997   0.0003
Therefore,
E(X) = Σ x P(x) = 195(0.9997) + (−199,805)(0.0003) = 135

Problem No. 5

Question:. In the Arizona lottery called Pick 3, a player pays $1 and then picks a three-digit
number. If those three numbers are picked in that specific order the person wins $500. What
is the expected value in this game?
Solution: To find the expected value, you need to first create the probability distribution. In
this case, the random variable x = winnings. If you pick the right numbers in the right order,
then you win $500, but you paid $1 to play, so you actually win $499. If you didn’t pick the
right numbers, you lose the $1, the x value is −$1. You also need the probability of winning
and losing. Since you are picking a three-digit number, and for each digit there are 10
numbers you can pick with each independent of the others, you can use the multiplication
rule. To win, you have to pick the right numbers in the right order. The first digit, you pick 1
number out of 10, the second digit you pick 1 number out of 10, and the third digit you pick 1
number out of 10. The probability of picking the right number in the right order is 1/10 ×
1/10 × 1/10 = 1/1000 = 0.001. The probability of losing (not winning) would be 1 − 1/1000 =
999/1000 = 0.999. Putting this information into a table will help to calculate the expected
value.
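
The expected value then follows from E(X) = Σ x P(x) applied to the two outcomes described above (net win of $499 with probability 1/1000, net loss of $1 with probability 999/1000). A minimal sketch, with illustrative variable names:

```python
from fractions import Fraction

# Net winnings x mapped to their probabilities P(x).
distribution = {
    499: Fraction(1, 1000),    # right numbers in the right order
    -1: Fraction(999, 1000),   # any other pick loses the $1 stake
}

expected_value = sum(x * p for x, p in distribution.items())
print(expected_value, float(expected_value))
```

A negative expected value means that, on average, the player loses money on each play.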
LESSON 4. DEPENDENT AND INDEPENDENT EVENTS

Problem No. 1

A purse contains four $5 bills, five $10 bills and three $20 bills. Two bills are selected
without the first selection being replaced. Find P($5, then $5).

Solution:
There are four $5 bills.
There are a total of twelve bills.
P ($5) = 4/12

The result of the first draw affected the probability of the second draw.

There are three $5 bills left.


There are a total of eleven bills left.
P ($5 after $5) = 3/11

P ($5, then $5) = P ($5) · P ($5 after $5) = (4/12) X (3/11) = 1/11

The probability of drawing a $5 bill and then a $5 bill is 1/11.

Problem No. 2

If a die is thrown twice, find the probability of getting two 5’s.

Solution: The two throws are independent, so P(5, then 5) = P(5) · P(5) = 1/6 × 1/6 = 1/36.
Problem No. 3

Two sets of cards with a letter on each card as follows are placed into separate bags.
Sara randomly picked one card from each bag. Find the probability that she picked the letters
‘J’ and ‘R’.

Solution: Probability that she picked J and R = 1/5 × 1/6 = 1/30

Problem No. 4

When we have just got 6 heads in a row, what is the probability that the next toss is
also a head?

Answer: ½, as the previous tosses don't affect the next toss

Problem No. 5.

A bag contains 6 red, 5 blue and 4 yellow marbles. Two marbles are drawn, but the first
marble drawn is not replaced. Find P(red, then blue).

 There are 6 red marbles. There are a total of 15 marbles.


P (red) = 6/15
The result of the first draw affected the probability of the second draw.
 There are 5 blue marbles. There are a total of 14 marbles left.
P(blue after red) = 5/14

P(red, then blue) = P(red) · P(blue after red) = 6/15 x 5/14 = 1/7

The probability of drawing a red marble and then a blue marble is 1/7
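
For dependent events the second probability is taken from the reduced bag. The helper below is a sketch of that multiplication rule for two draws without replacement; the name `p_then` and its parameters are illustrative.

```python
from fractions import Fraction

def p_then(first_count, second_count, total):
    """P(first, then second) when the first item is not replaced.

    second_count is the number of favourable items left for the second draw.
    """
    return Fraction(first_count, total) * Fraction(second_count, total - 1)

# Problem No. 1: four $5 bills among twelve, then three $5 bills among eleven.
p_five_then_five = p_then(4, 3, 12)

# Problem No. 5: six red among fifteen, then five blue among fourteen.
p_red_then_blue = p_then(6, 5, 15)

print(p_five_then_five, p_red_then_blue)
```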
LESSON 5. PROBABILITY AND COMBINATORIAL ANALYSIS

Problem No. 1.

In a lottery you have to guess 6 out of 49 numbers. What is the probability that you
get all of them right? If you submit 100 guesses every week, how long on average will it take
you to win?

There are 49C6 = 13,983,816 possible outcomes of the lottery, so the probability of getting the
right solution is 1 / 49C6 = 0.000000072.

On average it will also take 13,983,816 attempts to win. If we submit 100 guesses every week
this corresponds to 139,838 weeks, which is the same as 2,689 years.

Problem No. 2.

In a certain state’s lottery, 48 balls numbered 1 through 48 are placed in a machine


and six of them are drawn at random. If the six numbers drawn match the numbers that a
player had chosen, the player wins $1,000,000. In this lottery, the order the numbers are
drawn in doesn’t matter. Compute the probability that you win the million-dollar prize if you
purchase a single lottery ticket.

In order to compute the probability, we need to count the total number of ways six
numbers can be drawn, and the number of ways the six numbers on the player’s ticket could
match the six numbers drawn from the machine.

Since there is no stipulation that the numbers be in any particular order, the number of
possible outcomes of the lottery drawing is
48C6 = 12,271,512.

Of these possible outcomes, only one would match all six numbers on the player’s
ticket, so the probability of winning the grand prize is: (6C6)/(48C6) = 1/12,271,512 ≈ 0.0000000815
Problem No. 3.

Compute the probability of randomly drawing five cards from a deck and getting
exactly two Aces.

P(two Aces) = (4C2)(48C3)/(52C5) = 103776/2598960 ≈ 0.0399

Problem No. 4.

Compute the probability of randomly drawing five cards from a deck and getting exactly
one Ace.

Putting this all together, we have

P(one Ace)=(4C1)(48C4)/(52C5)= 778320/2598960


≈ 0.299

Problem No. 5.

Compute the probability of randomly drawing five cards from a deck of cards and getting
three Aces and two Kings.

P(three Aces and two Kings)= (4C3)(4C2)/(52C5)


= 24/2598960
≈ 0.0000092
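
The card probabilities in this lesson all have the form (ways to choose the wanted cards) / (52C5). They can be checked with `math.comb` from the Python standard library; the variable names are illustrative.

```python
from math import comb

hands = comb(52, 5)  # number of five-card hands

# Exactly two Aces: choose 2 of the 4 Aces and 3 of the 48 other cards.
p_two_aces = comb(4, 2) * comb(48, 3) / hands

# Exactly one Ace: choose 1 Ace and 4 of the 48 other cards.
p_one_ace = comb(4, 1) * comb(48, 4) / hands

# Three Aces and two Kings.
p_aces_kings = comb(4, 3) * comb(4, 2) / hands

print(hands, round(p_two_aces, 4), round(p_one_ace, 3), p_aces_kings)
```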
MODULE 6

"NORMAL DISTRIBUTION"

Lesson 1: Z-Score
Problem No. 1.
Convert the following scores to z-scores, where µ= 75 and 𝜎= 5
a. 75
b. 80
c. 58
Solution
a. x = 75
$z = \frac{75-75}{5} = \frac{0}{5} = 0$
b. x = 80
$z = \frac{80-75}{5} = \frac{5}{5} = 1$
c. x = 58
$z = \frac{58-75}{5} = \frac{-17}{5} = -3.4$
Problem No. 2.
Convert the following scores to z-scores, where µ= 100 and 𝜎= 5
a. 110
b. 101
c. 95
Solution
a. x = 110
$z = \frac{110-100}{5} = \frac{10}{5} = 2$
b. x = 101
$z = \frac{101-100}{5} = \frac{1}{5} = 0.2$
c. x = 95
$z = \frac{95-100}{5} = \frac{-5}{5} = -1$
Problem No.3.
Convert the following scores to z-scores, where µ= 25 and 𝜎= 2.75
a. 31
b. 16
c. 26.5
Solution
a. x = 31
$z = \frac{31-25}{2.75} = \frac{6}{2.75} = 2.18$
b. x = 16
$z = \frac{16-25}{2.75} = \frac{-9}{2.75} = -3.27$
c. x = 26.5
$z = \frac{26.5-25}{2.75} = \frac{1.5}{2.75} = 0.55$
Problem No. 4
Convert the following scores to z-scores, where µ= 50 and 𝜎= 5
a. 60
b. 49.5
c. 51.5
Solution
a. x = 60
$z = \frac{60-50}{5} = \frac{10}{5} = 2$
b. x = 49.5
$z = \frac{49.5-50}{5} = \frac{-0.5}{5} = -0.1$
c. x = 51.5
$z = \frac{51.5-50}{5} = \frac{1.5}{5} = 0.3$

Problem No. 5
The following are the scores of 10 students in Math Quiz Bee:
14 10 15 12 17 19 13 11 9 20
Convert scores to z-scores where µ= 15 and 𝜎= 4.5
X Z
9 -1.33
10 -1.11
11 -0.89
12 -0.67
13 -0.44
14 -0.22
15 0
17 0.44
19 0.89
20 1.11
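
The table above applies $z = \frac{x-\mu}{\sigma}$ to each score. A minimal sketch of the same conversion, with the illustrative helper name `z_score`:

```python
def z_score(x, mu, sigma):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

scores = [14, 10, 15, 12, 17, 19, 13, 11, 9, 20]
z_table = {x: round(z_score(x, mu=15, sigma=4.5), 2) for x in sorted(scores)}

for x, z in z_table.items():
    print(x, z)
```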

LESSON 2. AREAS UNDER THE NORMAL CURVE

Problem No. 1

X is a normally distributed variable with mean μ = 30 and standard deviation σ = 4.


Find P(x < 40).

Solution:
z = (40 - 30) / 4 = 2.5
P(x < 40) = P (z < 2.5) = [area to the left of 2.5] = 0.9938

Problem No. 2

A radar unit is used to measure speeds of cars on a motorway. The speeds are
normally distributed with a mean of 90 km/hr and a standard deviation of 10 km/hr. What is
the probability that a car picked at random is travelling at more than 100 km/hr?

Solution:

Let x be the random variable that represents the speed of cars. X has μ = 90 and σ =
10. We have to find the probability that x is higher than 100 or P(x > 100).

For x = 100, z = (100 - 90) / 10 = 1


P(x > 100) = P(z > 1)

= [total area] - [area to the left of z = 1]

= 1 - 0.8413 = 0.1587

The probability that a car selected at a random has a speed greater than 100 km/hr is
equal to 0.1587.
Problem No. 3

For a certain type of computer, the length of time between charges of the battery is
normally distributed with a mean of 50 hours and a standard deviation of 15 hours. John
owns one of these computers and wants to know the probability that the length of time will be
between 50 and 70 hours.

Solution:

Let x be the random variable that represents the length of time. It has a mean of 50
and a standard deviation of 15.

For x = 50 , z = (50 - 50) / 15 = 0


For x = 70 , z = (70 - 50) / 15 = 1.33 (rounded to 2 decimal places)
P(50 < x < 70) = P(0 < z < 1.33) = [area to the left of z = 1.33] - [area to the left of z = 0]
= 0.9082 - 0.5 = 0.4082
The probability that John's computer has a length of time between 50 and 70 hours is equal to
0.4082.

Problem No. 4

Entry to a certain University is determined by a national test. The scores on this test are
normally distributed with a mean of 500 and a standard deviation of 100. Tom wants to be
admitted to this university and he knows that he must score better than at least 70% of the
students who took the test. Tom takes the test and scores 585. Will he be admitted to this
university?

Solution:

Let x be the random variable that represents the scores. X is normally distributed with a mean
of 500 and a standard deviation of 100. The total area under the normal curve represents the
total number of students who took the test. If we multiply the values of the areas under the
curve by 100, we obtain percentages.

For x = 585, z = (585 - 500) / 100 = 0.85


The proportion P of students who scored below 585 is given by:
P = [area to the left of z = 0.85] = 0.8023 = 80.23%
Tom scored better than 80.23% of the students who took the test and he will be admitted to
this University.

Problem No. 5

The length of similar components produced by a company are approximated by a normal


distribution model with a mean of 5 cm and a standard deviation of 0.02 cm. If a component
is chosen at random

a) What is the probability that the length of this component is between 4.98 and 5.02
cm?
b) What is the probability that the length of this component is between 4.96 and 5.04
cm?

Solution:

a) P (4.98 < x < 5.02) = P (-1 < z < 1) = 0.6826


b) P (4.96 < x < 5.04) = P (-2 < z < 2) = 0.9544
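
The table look-ups in this lesson can be cross-checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch; the results differ from the table values in the fourth decimal place because the table rounds z to two decimals.

```python
from statistics import NormalDist

# Problem No. 1: P(x < 40) with mean 30 and standard deviation 4.
p1 = NormalDist(30, 4).cdf(40)

# Problem No. 2: P(x > 100) with mean 90 and standard deviation 10.
p2 = 1 - NormalDist(90, 10).cdf(100)

# Problem No. 5: probabilities of lying within 1 and 2 standard deviations.
lengths = NormalDist(5, 0.02)
p5a = lengths.cdf(5.02) - lengths.cdf(4.98)
p5b = lengths.cdf(5.04) - lengths.cdf(4.96)

print(round(p1, 4), round(p2, 4), round(p5a, 4), round(p5b, 4))
```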

Lesson 3: Skewness
Problem No. 1
Calculate the degree of skewness of a distribution if the mean is 45, the median is 40, and
standard deviation is 5
Solution:
$Sk = \frac{3(\bar{X}-Md)}{S} = \frac{3(45-40)}{5} = \frac{3(5)}{5} = 3$; hence, the distribution is positively skewed.
Problem No. 2
Calculate the degree of skewness of a distribution if the mean is 60, the median is 50, and
standard deviation is 2
Solution:
$Sk = \frac{3(\bar{X}-Md)}{S} = \frac{3(60-50)}{2} = \frac{3(10)}{2} = 15$; hence, the distribution is positively skewed.
Problem No. 3
Calculate the degree of skewness of a distribution if the mean is 100, the median is 120, and
standard deviation is 5
Solution:
$Sk = \frac{3(\bar{X}-Md)}{S} = \frac{3(100-120)}{5} = \frac{3(-20)}{5} = -12$; hence, the distribution is negatively skewed.
Problem No. 4
Calculate the degree of skewness of a distribution if the mean is 7.5, the median is 6.3, and
standard deviation is 0.7

Solution:
$Sk = \frac{3(\bar{X}-Md)}{S} = \frac{3(7.5-6.3)}{0.7} = \frac{3(1.2)}{0.7} = 5.14$; hence, the distribution is positively skewed.
Problem No. 5
Calculate the degree of skewness of a distribution if the mean is 1050, the median is 995, and
standard deviation is 25.7
Solution:
$Sk = \frac{3(\bar{X}-Md)}{S} = \frac{3(1050-995)}{25.7} = \frac{3(55)}{25.7} = 6.42$; hence, the distribution is positively skewed.
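
The formula used throughout this lesson, $Sk = \frac{3(\bar{X}-Md)}{S}$, is short enough to wrap in a helper. A minimal sketch, with the illustrative name `pearson_skewness`:

```python
def pearson_skewness(mean, median, std_dev):
    """Skewness from the mean, median and standard deviation: 3(mean - median)/S."""
    return 3 * (mean - median) / std_dev

def describe(sk):
    """Sign of the coefficient gives the direction of skew."""
    if sk > 0:
        return "positively skewed"
    if sk < 0:
        return "negatively skewed"
    return "symmetric"

for mean, median, s in [(45, 40, 5), (60, 50, 2), (100, 120, 5),
                        (7.5, 6.3, 0.7), (1050, 995, 25.7)]:
    sk = pearson_skewness(mean, median, s)
    print(round(sk, 2), describe(sk))
```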

LESSON 4. KURTOSIS

Problem No. 1

In a work study investigation, the time taken by 20 men in a firm to do a particular


job were tabulated below. Determine the quartile deviation.

Time taken (x)   f    Class Boundaries   <Cf
8 – 10           2    7.5 – 10.5         2
11 – 13          4    10.5 – 13.5        6
14 – 16          6    13.5 – 16.5        12
17 – 19          4    16.5 – 19.5        16
20 – 22          3    19.5 – 22.5        19
23 – 25          1    22.5 – 25.5        20
Total            20
 For the 1st quartile:

$Q_1$ position $= \frac{N}{4} = \frac{20}{4} = 5$th item

Thus, the 1st quartile class is 11 – 13 since it is where the 5th item is found.

$L_{Q_1} = 10.5$, $N = 20$, $f_{Q_1} = 4$, $i = 3$, ${<}cf = 2$

$Q_1 = L_{Q_1} + \left(\frac{\frac{N}{4} - {<}Cf_b}{f_{Q_1}}\right) i = 10.5 + \left(\frac{5-2}{4}\right) 3 = 10.5 + 2.25 = 12.75$

 For the 3rd quartile:

$Q_3$ position $= \frac{3N}{4} = \frac{3(20)}{4} = 15$th item

Thus, the 3rd quartile class is 17 – 19 since it is where the 15th item is found.

$L_{Q_3} = 16.5$, $N = 20$, $f_{Q_3} = 4$, $i = 3$, ${<}cf = 12$

$Q_3 = L_{Q_3} + \left(\frac{\frac{3N}{4} - {<}Cf_b}{f_{Q_3}}\right) i = 16.5 + \left(\frac{15-12}{4}\right) 3 = 16.5 + 2.25 = 18.75$

$Q = \frac{Q_3-Q_1}{2} = \frac{18.75-12.75}{2} = \frac{6}{2} = 3$

 For the 90th percentile:

$P_{90}$ position $= \frac{90N}{100} = \frac{90(20)}{100} = 18$th item

Thus, the 90th percentile class is 20 – 22 since it is where the 18th item is found.

$L_{P_{90}} = 19.5$, $f_{P_{90}} = 3$, ${<}Cf = 16$, $i = 3$, $N = 20$

$P_{90} = L_{P_{90}} + \left(\frac{\frac{90N}{100} - {<}Cf_b}{f_{P_{90}}}\right) i = 19.5 + \left(\frac{18-16}{3}\right) 3 = 19.5 + 2 = 21.5$

 For the 10th percentile:

$P_{10}$ position $= \frac{10N}{100} = \frac{10(20)}{100} = 2$nd item

Thus, the 10th percentile class is 8 – 10 since it is where the 2nd item is found.

$L_{P_{10}} = 7.5$, $f_{P_{10}} = 2$, ${<}Cf = 0$, $i = 3$, $N = 20$

$P_{10} = L_{P_{10}} + \left(\frac{\frac{10N}{100} - {<}Cf_b}{f_{P_{10}}}\right) i = 7.5 + \left(\frac{2-0}{2}\right) 3 = 7.5 + 3 = 10.5$

$k = \frac{Q}{P_{90}-P_{10}} = \frac{3}{21.5-10.5} = \frac{3}{11} = 0.27$
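
The whole procedure for this problem (locate the class, interpolate, then form $k = \frac{Q}{P_{90}-P_{10}}$) can be sketched as below. The function names are illustrative, and the class is chosen as the first one whose cumulative frequency reaches the required position, which is an assumption about how ties at a class boundary are resolved.

```python
def grouped_percentile(percent, lower_boundaries, frequencies, width):
    """Percentile of grouped data: L + ((percent*N/100 - <cf) / f) * i."""
    n = sum(frequencies)
    target = percent * n / 100
    cumulative = 0  # cumulative frequency before the current class
    for boundary, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= target:
            return boundary + (target - cumulative) * width / f
        cumulative += f
    raise ValueError("percent must be between 0 and 100")

def percentile_kurtosis(lower_boundaries, frequencies, width):
    """Return (Q, P90, P10, k) with k = Q / (P90 - P10)."""
    q1 = grouped_percentile(25, lower_boundaries, frequencies, width)
    q3 = grouped_percentile(75, lower_boundaries, frequencies, width)
    p10 = grouped_percentile(10, lower_boundaries, frequencies, width)
    p90 = grouped_percentile(90, lower_boundaries, frequencies, width)
    q = (q3 - q1) / 2
    return q, p90, p10, q / (p90 - p10)

# Time-taken table: lower class boundaries, frequencies, class width 3.
q, p90, p10, k = percentile_kurtosis(
    [7.5, 10.5, 13.5, 16.5, 19.5, 22.5], [2, 4, 6, 4, 3, 1], 3)
print(q, p90, p10, round(k, 4))
```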

Problem No. 2

Calculate the quartile deviation of the Differential test score of 50 students.

Test Score f Class Boundaries <Cf


20 – 24 2 19.5 – 24.5 2
25 – 29 6 24.5 – 29.5 8
30 – 34 9 29.5 – 34.5 17
35 – 39 10 34.5 – 39.5 27
40 – 44 12 39.5 – 44.5 39
45 – 49 7 44.5 – 49.5 46
50 – 54 4 49.5 – 54.5 50
N = 50
 For the 1st quartile:

$Q_1$ position $= \frac{N}{4} = \frac{50}{4} = 12.5$th item

Thus, the 1st quartile class is 30 – 34 since it is where the 12.5th item is found.

$L_{Q_1} = 29.5$, $N = 50$, $f_{Q_1} = 9$, $i = 5$, ${<}cf = 8$

$Q_1 = L_{Q_1} + \left(\frac{\frac{N}{4} - {<}Cf_b}{f_{Q_1}}\right) i = 29.5 + \left(\frac{12.5-8}{9}\right) 5 = 29.5 + 2.5 = 32$
 For the 3rd quartile:

$Q_3$ position $= \frac{3N}{4} = \frac{3(50)}{4} = 37.5$th item

Thus, the 3rd quartile class is 40 – 44 since it is where the 37.5th item is found.

$L_{Q_3} = 39.5$, $N = 50$, $f_{Q_3} = 12$, $i = 5$, ${<}cf = 27$

$Q_3 = L_{Q_3} + \left(\frac{\frac{3N}{4} - {<}Cf_b}{f_{Q_3}}\right) i = 39.5 + \left(\frac{37.5-27}{12}\right) 5 = 39.5 + 4.375 = 43.875$

$Q = \frac{Q_3-Q_1}{2} = \frac{43.875-32}{2} = \frac{11.875}{2} = 5.938$

 For the 90th percentile:


$P_{90}$ position $= \frac{90N}{100} = \frac{90(50)}{100} = 45$th item

Thus, the 90th percentile class is 45 – 49 since it is where the 45th item is found.

$L_{P_{90}} = 44.5$, $f_{P_{90}} = 7$, ${<}Cf = 39$, $i = 5$, $N = 50$

$P_{90} = L_{P_{90}} + \left(\frac{\frac{90N}{100} - {<}Cf_b}{f_{P_{90}}}\right) i = 44.5 + \left(\frac{45-39}{7}\right) 5 = 44.5 + 4.29 = 48.79$

 For the 10th percentile:


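$P_{10}$ position $= \frac{10N}{100} = \frac{10(50)}{100} = 5$th item

Thus, the 10th percentile class is 25 – 29 since it is where the 5th item is found.

$L_{P_{10}} = 24.5$, $f_{P_{10}} = 6$, ${<}Cf = 2$, $i = 5$, $N = 50$

$P_{10} = L_{P_{10}} + \left(\frac{\frac{10N}{100} - {<}Cf_b}{f_{P_{10}}}\right) i = 24.5 + \left(\frac{5-2}{6}\right) 5 = 24.5 + 2.5 = 27$

$k = \frac{Q}{P_{90}-P_{10}} = \frac{5.938}{48.79-27} = \frac{5.938}{21.79} = 0.2725$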

Problem No. 3

Calculate the quartile deviation of the Engineering Data Analysis test score of 50
students.

Scores    f    Class Boundaries   <Cf
70 – 74   3    69.5 – 74.5        3
75 – 79   16   74.5 – 79.5        19
80 – 84   14   79.5 – 84.5        33
85 – 89   10   84.5 – 89.5        43
90 – 94   7    89.5 – 94.5        50
N = 50
 For the 1st quartile:
$Q_1$ position $= \frac{N}{4} = \frac{50}{4} = 12.5$th item

Thus, the 1st quartile class is 75 – 79 since it is where the 12.5th item is found.

$L_{Q_1} = 74.5$, $N = 50$, $f_{Q_1} = 16$, $i = 5$, ${<}cf = 3$

$Q_1 = L_{Q_1} + \left(\frac{\frac{N}{4} - {<}Cf_b}{f_{Q_1}}\right) i = 74.5 + \left(\frac{12.5-3}{16}\right) 5 = 74.5 + 2.97 = 77.47$
 For the 3rd quartile:

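$Q_3$ position $= \frac{3N}{4} = \frac{3(50)}{4} = 37.5$th item

Thus, the 3rd quartile class is 85 – 89 since it is where the 37.5th item is found.

$L_{Q_3} = 84.5$, $N = 50$, $f_{Q_3} = 10$, $i = 5$, ${<}cf = 33$

$Q_3 = L_{Q_3} + \left(\frac{\frac{3N}{4} - {<}Cf_b}{f_{Q_3}}\right) i = 84.5 + \left(\frac{37.5-33}{10}\right) 5 = 84.5 + 2.25 = 86.75$

$Q = \frac{Q_3-Q_1}{2} = \frac{86.75-77.47}{2} = \frac{9.28}{2} = 4.64$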

 For the 90th percentile:


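$P_{90}$ position $= \frac{90N}{100} = \frac{90(50)}{100} = 45$th item

Thus, the 90th percentile class is 90 – 94 since it is where the 45th item is found (the cumulative frequency up to 85 – 89 is 3 + 16 + 14 + 10 = 43).

$L_{P_{90}} = 89.5$, $f_{P_{90}} = 7$, ${<}Cf = 43$, $i = 5$, $N = 50$

$P_{90} = L_{P_{90}} + \left(\frac{\frac{90N}{100} - {<}Cf_b}{f_{P_{90}}}\right) i = 89.5 + \left(\frac{45-43}{7}\right) 5 = 89.5 + 1.43 = 90.93$

 For the 10th percentile:

$P_{10}$ position $= \frac{10N}{100} = \frac{10(50)}{100} = 5$th item

Thus, the 10th percentile class is 75 – 79 since it is where the 5th item is found.

$L_{P_{10}} = 74.5$, $f_{P_{10}} = 16$, ${<}Cf = 3$, $i = 5$, $N = 50$

$P_{10} = L_{P_{10}} + \left(\frac{\frac{10N}{100} - {<}Cf_b}{f_{P_{10}}}\right) i = 74.5 + \left(\frac{5-3}{16}\right) 5 = 74.5 + 0.625 = 75.125$

$k = \frac{Q}{P_{90}-P_{10}} = \frac{4.64}{90.93-75.125} = \frac{4.64}{15.805} = 0.2936$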


Problem No. 4

Get the quartile deviation from the following grouped data.

Class     f    Class Boundaries   <Cf
2 – 4     10   1.5 – 4.5          10
5 – 7     4    4.5 – 7.5          14
8 – 10    9    7.5 – 10.5         23
11 – 13   7    10.5 – 13.5        30
N = 30

 For the 1st quartile:

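$Q_1$ position $= \frac{N}{4} = \frac{30}{4} = 7.5$th item

Thus, the 1st quartile class is 2 – 4 since it is where the 7.5th item is found (its cumulative frequency, 10, already exceeds 7.5).

$L_{Q_1} = 1.5$, $N = 30$, $f_{Q_1} = 10$, $i = 3$, ${<}cf = 0$

$Q_1 = L_{Q_1} + \left(\frac{\frac{N}{4} - {<}Cf_b}{f_{Q_1}}\right) i = 1.5 + \left(\frac{7.5-0}{10}\right) 3 = 1.5 + 2.25 = 3.75$

 For the 3rd quartile:

$Q_3$ position $= \frac{3N}{4} = \frac{3(30)}{4} = 22.5$th item

Thus, the 3rd quartile class is 8 – 10 since it is where the 22.5th item is found.

$L_{Q_3} = 7.5$, $N = 30$, $f_{Q_3} = 9$, $i = 3$, ${<}cf = 14$

$Q_3 = L_{Q_3} + \left(\frac{\frac{3N}{4} - {<}Cf_b}{f_{Q_3}}\right) i = 7.5 + \left(\frac{22.5-14}{9}\right) 3 = 7.5 + 2.83 = 10.33$

$Q = \frac{Q_3-Q_1}{2} = \frac{10.33-3.75}{2} = \frac{6.58}{2} = 3.29$

 For the 90th percentile:

$P_{90}$ position $= \frac{90N}{100} = \frac{90(30)}{100} = 27$th item

Thus, the 90th percentile class is 11 – 13.

$L_{P_{90}} = 10.5$, $f_{P_{90}} = 7$, ${<}Cf = 23$, $i = 3$, $N = 30$

$P_{90} = L_{P_{90}} + \left(\frac{\frac{90N}{100} - {<}Cf_b}{f_{P_{90}}}\right) i = 10.5 + \left(\frac{27-23}{7}\right) 3 = 10.5 + 1.71 = 12.21$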

 For the 10th percentile:


10 N 10 ( 30 ) rd
P10= = = 3 item
100 100
Thus, the 3rd percentile class is 2 – 4.

LP = 1.5
10
f P =10
10
<Cf = 0 i=3 N = 30

90 N
P10=L P +
100
10 (
−¿ Cf b
fP
i
10
)
Class f Class Boundaries <Cf
10 – 19 8 9.5 – 19.5 8
20 – 29 16 19.5 – 29.5 24
30 – 39 21 29.5 – 39.5 55
40 – 49 11 39.5 – 49.5 66
50 – 59 4 49.5 – 59.5 70
70

P10=1.5+ ( 3−0
10 )
3

¿ 1.5+0.9

¿ 2.4

Q 3.85 3.85
k= = = =0.3925
P 90−P10 12.21−2.4 9.81

Problem No. 5

A sample of college students was asked how much they spent monthly on a cellphone
phone plan. Solve for quartile deviation.
 For the 1st quartile:

N ( 70 )
Q 1= = = 17.5TH item
4 4

Thus, the 1st quartile class is 20 - 29 since it is where the 17.5th item is found.

LQ1 = 19.5 N = 70 fQ1 = 16 i = 10 <cf = 8

N
Q 1=LQ +
4
1
fQ (
−¿ Cf b

1
i )
¿ 19.5+ ( 17.5−8
16 )
10

¿ 19.5+5.94.
¿ 25.44

• For the 3rd quartile:

Q3 position = 3N / 4 = 3(70) / 4 = 52.5th item

Thus, the 3rd quartile class is 30 – 39 since it is where the 52.5th item is found.

LQ3 = 29.5    N = 70    fQ3 = 31 (the cumulative frequency rises from 24 to 55)    i = 10    <Cf = 24

Q3 = LQ3 + ((3N/4 − <Cf) / fQ3)(i)
   = 29.5 + ((52.5 − 24) / 31)(10)
   = 29.5 + 9.19
   = 38.69

Q = (Q3 − Q1) / 2 = (38.69 − 25.44) / 2 = 13.25 / 2 = 6.625

• For the 90th percentile:

P90 position = 90N / 100 = 90(70) / 100 = 63rd item

Thus, the 90th percentile class is 40 – 49.

LP90 = 39.5    fP90 = 11    <Cf = 55    i = 10    N = 70

P90 = LP90 + ((90N/100 − <Cf) / fP90)(i)
    = 39.5 + ((63 − 55) / 11)(10)
    = 39.5 + 7.27
    = 46.77

• For the 10th percentile:

P10 position = 10N / 100 = 10(70) / 100 = 7th item

Thus, the 10th percentile class is 10 – 19.

LP10 = 9.5    fP10 = 8    <Cf = 0    i = 10    N = 70

P10 = LP10 + ((10N/100 − <Cf) / fP10)(i)
    = 9.5 + ((7 − 0) / 8)(10)
    = 9.5 + 8.75
    = 18.25

k = Q / (P90 − P10) = 6.625 / (46.77 − 18.25) = 6.625 / 28.52 = 0.2323
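
The grouped-data steps of Problem No. 5 can be checked with a short script. This is only an illustrative sketch: the function name grouped_percentile and the (lower boundary, frequency) layout are assumptions made here, and the frequency of the 30 – 39 class is taken as 31, the value implied by the cumulative frequencies (24 to 55) and by N = 70.

```python
def grouped_percentile(classes, width, percent):
    """Interpolated percentile for grouped data.

    classes: list of (lower_boundary, frequency), in ascending order
    width:   class size i
    percent: 25 for Q1, 75 for Q3, 90 for P90, and so on
    """
    n = sum(freq for _, freq in classes)
    position = percent * n / 100      # e.g. 90N/100
    cum_before = 0                    # <Cf of the classes below the current one
    for lower, freq in classes:
        if cum_before + freq >= position:
            return lower + (position - cum_before) / freq * width
        cum_before += freq
    raise ValueError("percent must be between 0 and 100")


# Problem No. 5 table (frequency 31 for 30-39 is an assumption, see above)
classes = [(9.5, 8), (19.5, 16), (29.5, 31), (39.5, 11), (49.5, 4)]

q1 = grouped_percentile(classes, 10, 25)
q3 = grouped_percentile(classes, 10, 75)
p10 = grouped_percentile(classes, 10, 10)
p90 = grouped_percentile(classes, 10, 90)

quartile_deviation = (q3 - q1) / 2
k = quartile_deviation / (p90 - p10)

print(round(q1, 2), round(q3, 2), round(p10, 2), round(p90, 2))
print(round(quartile_deviation, 3), round(k, 4))
```

Because the script keeps full precision in Q1, Q3, P10 and P90, its last digits may differ slightly from a hand computation that rounds each intermediate value to two decimals.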
MODULE 7

"HYPOTHESIS TESTING"

Lesson 1: The null and alternative hypothesis


Problem No. 1
A health-care actuary has been investigating the cost of maintaining the cancer patients
within its plan. These people have typically been running up costs at the rate of $1,240 per
month. A sample of 15 cases for November (the first 15 for which complete records were
available) showed an average cost of $1,080, with a standard deviation of $180. Is there any
evidence of a significant change?
SOLUTION: Let’s examine the steps to a standard solution.
Step 1: The hypothesis statement is H0: μ = $1,240 versus H1: μ ≠ $1,240.
Observe that μ represents the true-but-unknown mean for November. The comparison value
$1,240 is the known traditional value to which you want to compare μ.
Do not be tempted into using H1: μ < $1,240. The value in the data should not prejudicially
influence your choice of H1. Also, you should not attempt to second-guess the researcher’s
motives; that is, you shouldn’t try to create a story that suggests that the researcher was
looking for smaller costs. In general, you’d prefer to stay away from one-sided alternative
hypotheses.
Step 2: Level of significance α = 0.05.
The story gives no suggestion as to the value of α. The choice 0.05 is the standard default.
Step 3: The test statistic will be t = √n (x̄ − μ0) / s. The null hypothesis will be rejected if | t | ≥
tα/2;n-1. If | t | < tα/2;n-1 then H0 will be accepted or judgment will be reserved.
At this point it would be helpful to recognize that the sample size is small; we should state
the assumption that the data are sampled from a normal population.
In using this formula, we’ll have n = 15, μ0 = $1,240 (the comparison value), and x̄ = $1,080
and s = $180 will come from the sample. The value tα/2;n-1 is t0.025;14 = 2.145.
The “judgment will be reserved” phrase allows for the possibility that you might end up
accepting H0 without really believing H0. This happens frequently when the sample size is
small.

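Step 4: Compute t = √15 ($1,080 − $1,240) / $180 ≈ -3.443
Step 5: Since | -3.443 | = 3.443 ≥ 2.145, the null hypothesis is rejected. The results are
significant: the November mean cost differs from the traditional $1,240.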
Problem No. 2.
The hourly French fried potato output by the Krisp-o-Matic fry machine is advertised to be
150 pounds. For the new machine purchased by the Burger Heaven drive-in, tests were run
for 22 different one-hour periods, producing an average production of 143 pounds, with a
standard deviation of 17 pounds. At the 5% level of significance, does the Burger Heaven
management have grounds for complaints?
SOLUTION: Here are the steps for this problem.
Step 1: The hypothesis statement is H0: μ = 150 versus H1: μ ≠ 150.
Observe that μ represents the true-but-unknown mean for the new Krisp-o-Matic machine.
The comparison value 150 is the numerical claim, and we want to compare μ to 150.
It might seem that the whole problem was set up with H1: μ < 150 in mind. After all, the test
could not possibly be designed to detect a machine that was performing better than
advertised. However, in the absence of a blatant statement that the experiment was designed
with a one-sided motive, we should use the two-sided alternative. As before, we should not
let the value in the data influence the choice of H1. Also as before, you should not attempt to
second-guess the researcher’s motives. In general, we really like to stay away from one-sided
alternative hypotheses.
Step 2: Level of significance α = 0.05. The value 0.05 is requested. If the α value were left
vague or unspecified, most users would take 0.05 as the default.
Step 3: The test statistic will be t = √n (x̄ − μ0) / s. The null hypothesis will be rejected if | t | ≥
tα/2;n-1. If | t | < tα/2;n-1 then H0 will be accepted or judgment will be reserved.
At this point it would be helpful to recognize that the sample size is small; we should state
the assumption that the data are sampled from a normal population.
In using this formula, we’ll have n = 22, μ0 = 150 (the comparison value). The numbers x̄ = 143
and s = 17 will come from the sample. The value tα/2;n-1 is t0.025;21 = 2.080.
The “judgment will be reserved” phrase allows for the possibility that you might end up
accepting H0 without really believing H0. This happens frequently when the sample size is
small.
Step 4: Compute t = √22 (143 − 150) / 17 ≈ -1.931
Step 5: Since | -1.931 | = 1.931 < 2.080, the null hypothesis is accepted. The results are not
significant. The Krisp-o-Matic would be declared not significantly different from the claim.
The phrase not significant means that the null hypothesis has been accepted. This does not
mean that we really believe H0 ; we might simply reserve judgment until we get more data.
The p-value would be reported as p > 0.05 (NS). The NS stands for not significant.
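
Steps 3 to 5 of this problem can be reproduced numerically. The sketch below is only an illustration: the function name one_sample_t is an assumption, and the critical value is typed in from the t table as in Step 3 rather than computed.

```python
import math


def one_sample_t(sample_mean, mu0, s, n):
    """t = sqrt(n) * (sample_mean - mu0) / s, with n - 1 degrees of freedom."""
    return math.sqrt(n) * (sample_mean - mu0) / s


# Problem No. 2: n = 22, sample mean 143, s = 17, advertised mean 150
t = one_sample_t(143, 150, 17, 22)
critical = 2.080   # t(0.025; 21), read from the t table as in Step 3

reject_h0 = abs(t) >= critical
print(round(t, 3), reject_h0)
```

The same function applies to Problem No. 1 by passing one_sample_t(1080, 1240, 180, 15) and comparing against 2.145.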

Problem No. 3.
 Researchers are interested in the mean age of a certain population.
 A random sample of 10 individuals drawn from the population of interest has a
mean of 27.
 Assuming that the population is approximately normally distributed with variance
20,can we conclude that the mean is different from 30 years ? (α=0.05) .
 If the p - value is 0.0340 how can we use it in making a decision?

Solution
1-Data: variable is age, n = 10, x̄ = 27, σ² = 20, α = 0.05
2-Assumptions: the population is approximately normally distributed with variance 20
3-Hypotheses:
• H0: μ = 30
• HA: μ ≠ 30

4-Test Statistic:
• Z = (x̄ − μ0) / (σ / √n) = (27 − 30) / (√20 / √10) = -2.12

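5-Decision Rule:
• The alternative hypothesis is HA: μ ≠ 30.
• Hence we reject H0 if Z > Z1-0.025 = Z0.975 or Z < -Z1-0.025 = -Z0.975.
• Z0.975 = 1.96 (from Table D)
6-Decision: Since Z = -2.12 < -1.96, H0 is rejected. Equivalently, the p-value 0.0340 is less
than α = 0.05, so H0 is rejected.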

Problem No. 4

A contractor wishes to lower heating bills by using a special type of insulation in houses. If
the average of the monthly heating bills is $78, her hypotheses about heating costs will be.

Answer:

H o : µ ≥ $78

H a : µ < $78

Problem No. 5

A chemist invents an additive to increase the life of an automobile battery. If the mean
lifetime of the battery is 36 months, then his hypotheses are.

Answer:

H o: µ ≤ 36

H a: µ > 36

LESSON 2: Z-TEST ON THE COMPARISON BETWEEN THE POPULATION MEAN AND THE SAMPLE MEAN

Problem No. 1

The school nurse thinks the average height of 7th graders has increased. The average height of
a 7th grader five years ago was 145 cm with a standard deviation of 20 cm. She takes a random
sample of 200 students and finds that the average height of her sample is 147 cm. Are 7th graders
now taller than they were before? Conduct a one-tailed hypothesis test using the 0.05 level of
significance.

Answer:

H o: µ ≤ 145

H a: µ > 145

α = 0.05, one tailed

z = ((x̄ − μ) / σ) √n = ((147 − 145) / 20) √200 = 1.414

The calculated value, 1.414, is smaller than the tabulated value, 1.645. Therefore, the null
hypothesis is not rejected.

Problem No. 2

A researcher reports that the average salary of assistant professors is more than $42,000. A
sample of 30 assistant professors has a mean salary of $43,260. At α = 0.05, test the claim
that assistant professors earn more than $42,000 a year. The standard deviation of the
population is $5230.

Answer:

H o: µ ≤ $42,000

H a: µ > 42,000

α = 0.05, one tailed

z = ((x̄ − μ) / σ) √n = ((43,260 − 42,000) / 5,230) √30 = 1.32

The calculated value, 1.32, is smaller than the tabulated value, 1.645. Therefore, the null
hypothesis is not rejected.

Problem No. 3

A national magazine claims that the average college student watches less television than the
general public. The national average is 29.4 hours per week, with a standard deviation of 2
hours. A sample of 30 college students has a mean of 27 hours. Is there enough evidence to
support the claim at α = 0.01?

Answer:

H o: µ ≥ 29.4

H a: µ <29.4

α = 0.01, one tailed

z = ((x̄ − μ) / σ) √n = ((27 − 29.4) / 2) √30 = -6.57

The calculated value, -6.57, lies beyond the tabulated value, -2.33. Therefore, the null
hypothesis is rejected.

Problem No. 4

The Medical Rehabilitation Education Foundation reports that the average cost of
rehabilitation for stroke victims is $24,672. To see if the average cost of rehabilitation is
different at a large hospital, a researcher selected a random sample of 35 stroke victims and
found that the average cost of their rehabilitation is $25,226. The standard deviation of the
population is $3,251. At α = 0.01, can it be concluded that the average cost at a large hospital
is different from $24,672?

Answer:

H o: µ = $24,672

H a: µ ≠$24,672

α = 0.01, two tailed

z = ((x̄ − μ) / σ) √n = ((25,226 − 24,672) / 3,251) √35 = 1.01

The calculated value, 1.01, is smaller than the tabulated value, 2.576. Therefore, the null
hypothesis is not rejected.

Problem No. 5

A researcher wishes to test the claim that the average age of lifeguards in Ocean City is
greater than 24 years. She selects a sample of 36 guards and finds the mean of the sample to
be 24.7 years, with a standard deviation of 2 years. Is there evidence to support the claim at
α= 0.05?

Answer:

H o: µ ≤ 24

H a: µ >24

α = 0.05, one tailed

z = ((x̄ − μ) / σ) √n = ((24.7 − 24) / 2) √36 = 2.10

The calculated value, 2.10, is greater than the tabulated value, 1.645. Therefore, the null
hypothesis is rejected.
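
The z computations in this lesson all follow the same pattern, so they can be checked with a small helper. The sketch below is illustrative: the function names z_statistic and critical_z are assumptions, and the critical value is obtained from Python's standard-library normal distribution instead of a printed table.

```python
import math
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """z = (sample_mean - mu0) / sigma * sqrt(n)."""
    return (sample_mean - mu0) / sigma * math.sqrt(n)


def critical_z(alpha, two_tailed=False):
    """Positive critical value of the standard normal distribution."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


# Problem No. 5 (lifeguards): right-tailed test at alpha = 0.05
z = z_statistic(24.7, 24, 2, 36)
z_crit = critical_z(0.05)
print(round(z, 2), round(z_crit, 3), z > z_crit)
```

For a two-tailed problem such as Problem No. 4, critical_z(0.01, two_tailed=True) gives the value to compare against the absolute value of z.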

LESSON 3: T-TEST CONCERNING MEANS OF INDEPENDENT SAMPLES

Problem No. 1

An investigator thinks that people under the age of forty have vocabularies that are different
than those of people over sixty years of age. The investigator administers a vocabulary test to
a group of 31 younger subjects and to a group of 31 older subjects. Higher scores reflect
better performance. The mean score for younger subjects was 14.0 and the standard deviation
of younger subject's scores was 5.0. The mean score for older subjects was 20.0 and the
standard deviation of older subject's scores was 6.0. Does this experiment provide evidence
for the investigator's theory? The level of significance is 0.05.

Answer:

H o: There is no significant difference between the vocabularies of people under the age of forty
and those of people over sixty years of age.

H a: There is a significant difference between the vocabularies of people under the age of forty
and those of people over sixty years of age.

α = 0.05, two tailed

t = (x̄1 − x̄2) / √{ [((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)] [(n1 + n2) / (n1 n2)] }

x̄1 = 14    n1 = 31    s1² = 25

x̄2 = 20    n2 = 31    s2² = 36

t = (14 − 20) / √{ [((31 − 1)25 + (31 − 1)36) / (31 + 31 − 2)] [(31 + 31) / (31 × 31)] } = -4.28

df = n1 + n2 − 2 = 31 + 31 − 2 = 60

The absolute calculated value, 4.28, is greater than the tabulated value, 2.000. Therefore, the null hypothesis is rejected.
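
The pooled-variance t statistic used in this lesson can be computed directly from the summary statistics. The following is a minimal sketch, assuming equal population variances as the formula does; the function name pooled_t is hypothetical.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic with a pooled variance (equal variances assumed)."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (n1 + n2) / (n1 * n2))
    return (mean1 - mean2) / std_error, n1 + n2 - 2


# Problem No. 1: younger group (mean 14, s = 5) vs older group (mean 20, s = 6)
t, df = pooled_t(14.0, 5.0 ** 2, 31, 20.0, 6.0 ** 2, 31)
print(round(t, 2), df)
```

The function takes variances, not standard deviations, so each standard deviation from the problem statement is squared before it is passed in.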

Problem No. 2

An investigator predicts that dog owners in the country spend more time walking their dogs
than do dog owners in the city. The investigator gets a sample of 21 country owners and 23
city owners. The mean number of hours per week that city owners spend walking their dogs
is 10.0. The standard deviation of hours spent walking the dog by city owners is 3.0. The
mean number of hours country owners spent walking their dogs per week was 15.0. The
standard deviation of the number of hours spent walking the dog by owners in the country
was 4.0. Do dog owners in the country spend more time walking their dogs than do dog
owners in the city? Use 0.01 level of significance.

Answer:

H o: Dog owners in the country do not spend more time walking their dogs than dog owners in
the city (μ1 ≤ μ2).

H a: Dog owners in the country spend more time walking their dogs than dog owners in the
city (μ1 > μ2).

α = 0.01, one tailed

Let:

x̄1 = 15    n1 = 21    s1² = 16    (country)

x̄2 = 10    n2 = 23    s2² = 9    (city)

t = (15 − 10) / √{ [((21 − 1)16 + (23 − 1)9) / (21 + 23 − 2)] [(21 + 23) / (21 × 23)] } = 4.72

df = n1 + n2 − 2 = 21 + 23 − 2 = 42

The calculated value, 4.72, is greater than the tabulated value, 2.418. Therefore, the null
hypothesis is rejected.

Problem No. 3
An investigator theorizes that people who participate in a regular program of exercise will
have levels of systolic blood pressure that are significantly different from that of people who
do not participate in a regular program of exercise. To test this idea the investigator randomly
assigns 21 subjects to an exercise program for 10 weeks and 21 subjects to a non-exercise
comparison group. After ten weeks the mean systolic blood pressure of subjects in the
exercise group is 137 and the standard deviation of blood pressure values in the exercise
group is 10. After ten weeks, the mean systolic blood pressure of subjects in the non-exercise
group is 127 and the standard deviation on subjects in the non-exercise group is 9.0. Please
test the investigator's theory using an alpha level of .05.

Answer:

H o: There is no significant difference between the systolic blood pressure of people who
participate in a regular program of exercise and that of people who do not.

H a: There is a significant difference between the systolic blood pressure of people who
participate in a regular program of exercise and that of people who do not.

α = 0.05, two tailed

Let:

x̄1 = 137    n1 = 21    s1² = 100

x̄2 = 127    n2 = 21    s2² = 81

t = (137 − 127) / √{ [((21 − 1)100 + (21 − 1)81) / (21 + 21 − 2)] [(21 + 21) / (21 × 21)] } = 3.41

df = n1 + n2 − 2 = 21 + 21 − 2 = 40

The calculated value, 3.41, is greater than the tabulated value, 2.021. Therefore, the null hypothesis is rejected.

Problem No. 4

A statistics teacher wants to compare his two classes to see if they performed any differently
on tests he gave that semester. Class F had 25 students with an average score of 70 and a standard
deviation of 15. Class H had 20 students with an average score of 74 and a standard deviation of 25.
The level of significance is 0.05. Did these classes perform differently on the tests?

Answer:

H o: μClass F = μClass H

H a: μClass F ≠ μClass H

α = 0.05, two tailed

Let:

x̄1 = 70    n1 = 25    s1² = 225

x̄2 = 74    n2 = 20    s2² = 625

t = (70 − 74) / √{ [((25 − 1)225 + (20 − 1)625) / (25 + 20 − 2)] [(25 + 20) / (25 × 20)] } = -0.67

df = n1 + n2 − 2 = 25 + 20 − 2 = 43

The absolute calculated value, 0.67, is smaller than the tabulated value, 2.017. Therefore, the null
hypothesis is not rejected.

Problem No. 5

Leo grows tomatoes in two separate fields. When the tomatoes are ready to be picked, he is
curious as to whether the sizes of his tomato plants differ between the two fields. He takes a
random sample of plants from each field and measures the heights of the plants. Here is a
summary of the results. Use 0.05 as the level of significance.

Field A Field B
Mean 1.3m 1.6m
Standard deviation 0.5m 0.3m
Number of plants 22 24
Answer:

H o: μa = μb
H a: μa ≠ μb

α = 0.05, two tailed

Let:

x̄1 = 1.3    n1 = 22    s1² = 0.25

x̄2 = 1.6    n2 = 24    s2² = 0.09

t = (1.3 − 1.6) / √{ [((22 − 1)0.25 + (24 − 1)0.09) / (22 + 24 − 2)] [(22 + 24) / (22 × 24)] } = -2.49

df = n1 + n2 − 2 = 22 + 24 − 2 = 44

The absolute calculated value, 2.49, is greater than the tabulated value, 2.015. Therefore, the null
hypothesis is rejected.

LESSON 4: T-TEST ON THE SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO CORRELATED MEANS

Problem No. 1

The English teacher conducts a vocabulary quiz in the first meeting and the last meeting of the
class each year to assess whether the students learned something during class hours. In the first
meeting the students scored 142 overall and in the last meeting the students scored 173 on a
200-item quiz, with a sample variance of 42. Determine if the students' performance improved.
The level of significance is 0.01.

Answer:

H o: μ1=μ 2

H a : μ1 ≠ μ2

α = 0.01, one tail

sd = √42 = 6.481

t = ((∑d / n) / sd) √n = (((142 − 173) / 200) / 6.481) √200 = -0.34
df =n−1=200−1=199

The calculated value is smaller than the tabular value. Therefore, the null hypothesis is not
rejected.

Problem No. 2
The following are fear ratings administered to five subjects before and after exposure to “fear
of the dark therapy”:
Subject Before After
Shaggy 8 4
Scooby 9 6
Fred 4 3
Velma 2 2
Daphne 5 3
Answer:

Subject   Before   After   d   d²
Shaggy    8        4       4   16
Scooby    9        6       3   9
Fred      4        3       1   1
Velma     2        2       0   0
Daphne    5        3       2   4

H o: μ1 = μ2

H a: μ1 ≠ μ2

α = 0.01, two tailed

Let:

∑d = 10    n = 5    ∑d² = 30

sd = √[ (n∑d² − (∑d)²) / (n(n − 1)) ] = √[ (5(30) − (10)²) / (5(4)) ] = 1.581

t = ((∑d / n) / sd) √n = ((10 / 5) / 1.581) √5 = 2.83

df = n − 1 = 5 − 1 = 4
The calculated value, 2.83, is smaller than the tabular value, 4.604. Therefore, the null
hypothesis is not rejected.

Problem No. 3

Suppose a sample of n students was given a diagnostic test before studying a particular
module and then again after completing the module. We want to find out if, in general, our
teaching leads to improvements in students’ knowledge/skills (i.e. test scores). We can use
the results from our sample of students to draw conclusions about the impact of this module
in general.

Answer:

Subject Pre score Post score d d2


1 18 22 -4 16
2 21 25 -4 16
3 16 17 -1 1
4 22 24 -2 4
5 19 16 3 9
6 24 29 -5 25
7 17 20 -3 9
8 21 23 -2 4
9 23 19 4 16
10 18 20 -2 4
11 14 15 -1 1
12 16 15 1 1
13 16 18 -2 4
14 19 26 -7 49
15 18 18 0 0
H o: μ1=μ 2

H a : μ1 ≠ μ2

α = 0.01, two tailed

Let: ∑d = -25    n = 15    ∑d² = 159

sd = √[ (n∑d² − (∑d)²) / (n(n − 1)) ] = √[ (15(159) − (-25)²) / (15(14)) ] = 2.89

t = ((∑d / n) / sd) √n = ((-25 / 15) / 2.89) √15 = -2.23

df = n − 1 = 15 − 1 = 14

The absolute calculated value, 2.23, is smaller than the tabular value, 2.977. Therefore, the null
hypothesis is not rejected.
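
The correlated-means computation above can be reproduced from the raw pre and post scores. This is a sketch under the same convention as the table (d = pre score minus post score); the function name paired_t is an assumption.

```python
import math


def paired_t(before, after):
    """Paired t statistic using d = before - after for each subject."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    sum_d = sum(d)
    sum_d2 = sum(x * x for x in d)
    # Standard deviation of the differences from the raw-sum formula
    s_d = math.sqrt((n * sum_d2 - sum_d ** 2) / (n * (n - 1)))
    t = (sum_d / n) / s_d * math.sqrt(n)
    return t, s_d, n - 1


pre = [18, 21, 16, 22, 19, 24, 17, 21, 23, 18, 14, 16, 16, 19, 18]
post = [22, 25, 17, 24, 16, 29, 20, 23, 19, 20, 15, 15, 18, 26, 18]

t, s_d, df = paired_t(pre, post)
print(round(s_d, 2), round(t, 2), df)
```

Note that s_d is the standard deviation of the differences; it is divided by √n only once, inside the t formula.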

Problem No. 4

We could have conducted the charter school study in a different way—by comparing
teachers’ satisfaction ratings before and after a school was converted to a privately operated
school. This design could be classified as a single-group pretest-posttest design. I have used
the same numbers as in the first between-subjects example given in class to illustrate a point,
but this is completely different example where we have two scores for each of 5 teachers.
Notice that in this design we only are using half the number of cases. Each teacher has two
scores.

Answer:

Teacher   Public (pretest)   Charter (posttest)   d     d²
1         2                  7                    -5    25
2         4                  8                    -4    16
3         6                  10                   -4    16
4         8                  8                    0     0
5         10                 12                   -2    4

∑d = -15    ∑d² = 61

H o: μ1 = μ2

H a: μ1 ≠ μ2

α = 0.01, two tailed

Let: ∑d = -15    n = 5    ∑d² = 61

sd = √[ (n∑d² − (∑d)²) / (n(n − 1)) ] = √[ (5(61) − (-15)²) / (5(4)) ] = 2.00

t = ((∑d / n) / sd) √n = ((-15 / 5) / 2.00) √5 = -3.35

df = n − 1 = 5 − 1 = 4

The absolute calculated value, 3.35, is smaller than the tabular value, 4.604. Therefore, the null
hypothesis is not rejected.

Problem No. 5

A researcher is studying the influence of noise on one’s ability to solve statistics problems.
The researcher randomly selects n = 10 students and exposes them to a noisy condition for 10
minutes and then a quiet condition for 10 minutes. In each condition, students are given a set
of statistics problems to solve. The dependent variable is the number of mistakes made on the
statistics problems during the ten minutes. Here, the researcher is testing a non-directional
hypothesis, because she wants to know if there is any effect of noise on performance (errors).

Student Noisy (XN) Quiet (XQ) d d2


A 9 6 3 9
B 9 7 2 4
C 6 7 -1 1
D 7 5 2 4
E 6 4 2 4
F 7 4 3 9
G 9 6 3 9
H 11 9 2 4
I 7 5 2 4
J 9 7 2 4

Answer:

H o: μNoise = μQuiet
H a: μNoise ≠ μQuiet

α = 0.01, two tailed

Let: ∑d = 20    n = 10    ∑d² = 52

sd = √[ (n∑d² − (∑d)²) / (n(n − 1)) ] = √[ (10(52) − (20)²) / (10(9)) ] = 1.155

t = ((∑d / n) / sd) √n = ((20 / 10) / 1.155) √10 = 5.48

df = n − 1 = 10 − 1 = 9

The calculated value, 5.48, is greater than the tabular value, 3.250. Therefore, the null
hypothesis is rejected.

LESSON 5: Z-TEST ON THE SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO INDEPENDENT PROPORTIONS

Problem No. 1

You’re testing two flu drugs A and B. Drug A works on 41 people out of a sample of 195.
Drug B works on 351 people in a sample of 605. Are the two drugs comparable? Use a
5% alpha level.

Answer:

H o: P1=P2

H a : P1 ≠ P2
α = 0.05, two tailed

z = (p1 − p2) / √[ p1q1/n1 + p2q2/n2 ]

  = (41/195 − 351/605) / √[ (41/195)(154/195)/195 + (351/605)(254/605)/605 ]

  = -10.45

The absolute calculated value, 10.45, is greater than the tabulated value, 1.96. Therefore, the null
hypothesis is rejected.

Problem No. 2

Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The
company states that the drug is equally effective for men and women. To test this claim, they
choose a simple random sample of 100 women and 200 men from a population of 100,000
volunteers. At the end of the study, 38% of the women caught a cold; and 51% of the men
caught a cold. Based on these findings, can we reject the company's claim that the drug is
equally effective for men and women? Use a 0.05 level of significance.

Answer:

H o: P1=P2

H a : P1 ≠ P2

α = 0.05, two tailed

z = (p1 − p2) / √[ p1q1/n1 + p2q2/n2 ]

  = (38/100 − 102/200) / √[ (38/100)(31/50)/100 + (102/200)(49/100)/200 ]

  = -2.16

The absolute calculated value, 2.16, is greater than the tabulated value, 1.96. Therefore, the null
hypothesis is rejected.
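
The z formula for two independent proportions can be checked with a few lines of code. The sketch below uses the same unpooled standard error as the formula written in this lesson; the function name two_proportion_z is an assumption, and the critical value 1.96 is typed in from the normal table.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """z for the difference of two independent proportions (unpooled standard error)."""
    q1, q2 = 1 - p1, 1 - p2
    std_error = math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    return (p1 - p2) / std_error


# Problem No. 2: 38% of 100 women and 51% of 200 men caught a cold
z = two_proportion_z(0.38, 100, 0.51, 200)
reject_h0 = abs(z) > 1.96   # two-tailed critical value at alpha = 0.05
print(round(z, 2), reject_h0)
```

Some textbooks pool the two samples into a single proportion when estimating the standard error under H0; that variant gives a slightly different z and is not what this sketch computes.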

Problem No. 3

Two types of medication for hives are being tested to determine if there is a difference in the
proportions of adult patient reactions. Twenty out of a random sample of 200 adults given
medication A still had hives 30 minutes after taking the medication. Twelve out of
another random sample of 200 adults given medication B still had hives 30 minutes after
taking the medication. Test at a 1% level of significance.

Answer:

H o: P A =PB

Ha : PA ≠ PB

α = 0.01, two tailed

z = (p1 − p2) / √[ p1q1/n1 + p2q2/n2 ]

  = (20/200 − 12/200) / √[ (20/200)(9/10)/200 + (12/200)(47/50)/200 ]

  = 1.48

The calculated value, 1.48, is smaller than the tabulated value, 2.576. Therefore, the null
hypothesis is not rejected.

Problem No. 4

A research study was conducted about gender differences in “sexting.” The researcher
believed that the proportion of girls involved in “sexting” is less than the proportion of boys
involved. The data collected in the spring of 2010 among a random sample of middle and
high school students in a large school district in the southern United States is summarized in
the table. Is the proportion of girls sending sexts less than the proportion of boys “sexting?”
Test at a 1% level of significance.

Males Females
Sent “sexts” 183 156
Total number surveyed 2231 2169
Answer:

H o:  P F=  P M
H a:  P F < P M
α = 0.01, one tailed

Let p1 be the proportion of males and p2 the proportion of females.

z = (p1 − p2) / √[ p1q1/n1 + p2q2/n2 ]

  = (183/2231 − 156/2169) / √[ (183/2231)(2048/2231)/2231 + (156/2169)(671/723)/2169 ]

  = 1.26

The calculated value, 1.26, is smaller than the tabulated value, 2.33. Therefore, the null
hypothesis is not rejected.

Problem No. 5

Researchers conducted a study of smartphone use among adults. A cell phone company
claimed that iPhone smartphones are more popular with whites (non-Hispanic) than with
African Americans. The results of the survey indicate that of the 232 African American cell
phone owners randomly sampled, 5% have an iPhone. Of the 1,343 white cell phone owners
randomly sampled, 10% own an iPhone. Test at the 5% level of significance. Is the
proportion of white iPhone owners greater than the proportion of African American iPhone
owners?

H o: Pw =  P A
H a: Pw >  P A
α = 0.05, one tailed

Let p1 be the proportion of African American owners and p2 the proportion of white owners.

z = (p1 − p2) / √[ p1q1/n1 + p2q2/n2 ]

  = (11.6/232 − 134.3/1343) / √[ (11.6/232)(19/20)/232 + (134.3/1343)(9/10)/1343 ]

  = -3.03

The calculated value, -3.03, lies beyond the tabulated value, -1.645. Therefore, the null
hypothesis is rejected.
MODULE 8

"CORRELATION ANALYSIS"

LESSON 1. PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT

Problem No. 1
Find the value of the correlation coefficient from the following table below and test the
hypothesis that there is no significant correlation between the age and glucose level at 5%
level of significance.

Subject Age (X) Glucose Level


1 43 99
2 21 65
3 25 79
4 42 75
5 57 87
6 59 81

Subject Age (X) Glucose Level (Y) XY X² Y²
1 43 99 4257 1849 9801
2 21 65 1365 441 4225
3 25 79 1975 625 6241
4 42 75 3150 1764 5625
5 57 87 4959 3249 7569
6 59 81 4779 3481 6561
∑ 247 486 20485 11409 40022

a. H0 = There is no significant correlation between the age and glucose level.


b. α = 5%
c. Pearson r will be used to test the hypothesis.
d. Computation:

r = [n∑XY − (∑X)(∑Y)] / √{ [n∑X² − (∑X)²][n∑Y² − (∑Y)²] }

  = [6(20485) − (247)(486)] / √{ [6(11409) − (247)²][6(40022) − (486)²] }

r = 0.529809

e. df = N – 2 = 6 – 2 = 4
f. Tabular Value = ±0.811
g. The null is accepted since the computed value does not fall in the critical region. It
falls between the critical values.
h. There is no significant linear relationship between the age and glucose level. The
verbal interpretation of r shows that there is moderate correlation.
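
The computation in step d can be verified from the raw data. The following is a minimal sketch of the raw-sum formula; the function name pearson_r is an assumption, and the tabular value 0.811 is copied from step f rather than computed.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


age = [43, 21, 25, 42, 57, 59]
glucose = [99, 65, 79, 75, 87, 81]

r = pearson_r(age, glucose)
significant = abs(r) > 0.811   # tabular value for df = 4 at the 5% level
print(round(r, 6), significant)
```

The denominator uses the squares of the sums, (∑X)² and (∑Y)², not the sums of squares; mixing the two is the most common slip when applying this formula by hand.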

Problem No. 2

Marls obtained the scores of 5 students in algebra and trigonometry. Calculate the
Pearson correlation coefficient and test the hypothesis that there is no significant correlation
between the scores in algebra and trigonometry level at 5% level of significance.

Algebra 15 16 12 10 8
Trigonometry 18 11 10 20 17

Subject Algebra (X) Trigonometry (Y) XY X² Y²
1 15 18 270 225 324
2 16 11 176 256 121
3 12 10 120 144 100
4 10 20 200 100 400
5 8 17 136 64 289
∑ 61 76 902 789 1234

a. H0 = There is no significant correlation between the scores in algebra and


trigonometry.
b. α = 5%
c. Pearson r will be used to test the hypothesis.
d. Computation:

r = [n∑XY − (∑X)(∑Y)] / √{ [n∑X² − (∑X)²][n∑Y² − (∑Y)²] }

  = [5(902) − (61)(76)] / √{ [5(789) − (61)²][5(1234) − (76)²] }

r = -0.424

e. df = N – 2 = 5 – 2 = 3
f. Tabular Value = ± 0.878
g. The null is accepted since the computed value does not fall in the critical region. It
falls between the critical values.
h. There is no significant linear relationship between the scores in trigonometry and
algebra. The verbal interpretation of r shows that there is moderate correlation.

Problem No. 3

Calculate the Pearson correlation coefficient of the age of husbands and wives below and
test the hypothesis that there is no significant correlation between the ages of husbands and
wives at 5% level of significance.

Husband (X) 36 72 37 36 51 50 47 50 37 41
Wife (Y) 35 67 33 35 50 46 47 42 36 41

Subject Husband (X) Wife (Y) XY X² Y²
1 36 35 1260 1296 1225
2 72 67 4824 5184 4489
3 37 33 1221 1369 1089
4 36 35 1260 1296 1225
5 51 50 2550 2601 2500
6 50 46 2300 2500 2116
7 47 47 2209 2209 2209
8 50 42 2100 2500 1764
9 37 36 1332 1369 1296
10 41 41 1681 1681 1681
∑ 457 432 20737 22005 19594
a. H0 = There is no significant correlation between the ages of husbands and wife.
b. α = 5%
c. Pearson r will be used to test the hypothesis.
d. Computation:

r = [n∑XY − (∑X)(∑Y)] / √{ [n∑X² − (∑X)²][n∑Y² − (∑Y)²] }

  = [10(20737) − (457)(432)] / √{ [10(22005) − (457)²][10(19594) − (432)²] }

r = 0.974
e. df = 10 – 2 = 8
f. Tabular Value = ± 0.632
g. Reject the null hypothesis because the computed value, 0.974, is greater than the
tabular value, 0.632.
h. There is a significant linear relationship between the ages of the husbands and wives.
The verbal interpretation of r shows that there is a very high correlation.

Problem No. 4.

The statics and differential calculus scores of engineering students of the University of
Eastern Philippines were recorded. Test the hypothesis that there is no significant correlation
between the scores of engineering students in statics and differential calculus at 5% level of
significance.

Statics 10 12 15 14 12 10 9 11 14 10 16
Calculus 32 46 62 60 51 40 38 42 56 30 65

Subject (X) (Y) XY X2 Y2


1 10 32 320 100 1024
2 12 46 552 144 2116
3 15 62 930 225 3844
4 14 60 840 196 3600
5 12 51 612 144 2601
6 10 40 400 100 1600
7 9 38 342 81 1444
8 11 42 462 121 1764
9 14 56 784 196 3136
10 10 30 300 100 900
11 16 65 1040 256 4225
∑ 133 522 6582 1663 26254
a. H0 = There is no significant correlation between the scores in statics and differential calculus.
b. α = 5%
c. Pearson r will be used to test the hypothesis.
d. Computation:

r = [n∑XY − (∑X)(∑Y)] / √{ [n∑X² − (∑X)²][n∑Y² − (∑Y)²] }

  = [11(6582) − (133)(522)] / √{ [11(1663) − (133)²][11(26254) − (522)²] }

  = 2976 / √[ (604)(16310) ]

r = 0.948

e. df = 11 – 2 = 9
f. Tabular Value = ± 0.602
g. Reject the null hypothesis because the computed value, 0.948, is greater than the
tabular value, 0.602.
h. There is a significant linear relationship between the scores of engineering students in
statics and differential calculus. The verbal interpretation of r shows that there is a
very high correlation.

Problem No. 5

Below are the data for six participants giving their number of years in college (X) and
their subsequent monthly income in thousands (Y). Calculate the Pearson correlation
coefficient and test the hypothesis that there is no significant correlation between the
participant’s number of years in college and their subsequent monthly income at 5% level of
significance.

X 0 1 3 4 4 6
Y 15 15 20 25 30 35

Subject (X) (Y) XY X2 Y2


1 0 15 0 0 225
2 1 15 15 1 225
3 3 20 60 9 400
4 4 25 100 16 625
5 4 30 120 16 900
6 6 35 210 36 1225
∑ 18 140 505 78 3600
a. H0 = There is no significant correlation between the number of years in college and the
subsequent monthly income.
b. α = 5%
c. Pearson r will be used to test the hypothesis.
d. Computation:

r = [n∑XY − (∑X)(∑Y)] / √{ [n∑X² − (∑X)²][n∑Y² − (∑Y)²] }

  = [6(505) − (18)(140)] / √{ [6(78) − (18)²][6(3600) − (140)²] }

  = 510 / √[ (144)(2000) ]

r = 0.950

e. df = 6 – 2 = 4
f. Tabular Value = ± 0.811
g. Reject the null hypothesis because the computed value, 0.950, is greater than the
tabular value, 0.811.
h. There is a significant linear relationship between the participant’s number of years in
college and their subsequent monthly income. The verbal interpretation of r shows
that there is a very high correlation.

LESSON 2. REGRESSION ANALYSIS

Problem No. 1

There are two variables that need to be studied: weight loss and days spent exercising in one
month. You are given a data set in which individuals have been asked the number of days
they exercise for more than half an hour in one month. Predict the weight loss in 50 days.

Exercise Days Weight Loss (kg)


0 4
4 1
8 1.5
12 2
16 4
20 5
24 2

No. Exercise Days (X) Weight Loss (Y) XY X² Y²
1 0 4 0 0 16
2 4 1 4 16 1
3 8 1.5 12 64 2.25
4 12 2 24 144 4
5 16 4 64 256 16
6 20 5 100 400 25
7 24 2 48 576 4
∑ 84 19.5 252 1456 68.25

a = [(∑Y)(∑X²) − (∑X)(∑XY)] / [n∑X² − (∑X)²]
  = [19.5(1456) − (84)(252)] / [7(1456) − (84)²]
a = 2.30

b = [n∑XY − (∑X)(∑Y)] / [n∑X² − (∑X)²]
  = [7(252) − (84)(19.5)] / [7(1456) − (84)²]
b = 0.040

y = a + bx

y = 2.30 + 0.040x

y = 2.30 + (0.040) (50)

y = 4.3

Problem No. 2

Predict the demand if the price is 20.

Price 10 12 13 12 16 15
Amount Demanded 48 38 43 45 37 43

No. Price (X) Amount Demanded (Y) XY X² Y²
1 10 48 480 100 2304
2 12 38 456 144 1444
3 13 43 559 169 1849
4 12 45 540 144 2025
5 16 37 592 256 1369
6 15 43 645 225 1849
∑ 78 254 3272 1038 10840

a = [(∑Y)(∑X²) − (∑X)(∑XY)] / [n∑X² − (∑X)²]
  = [254(1038) − (78)(3272)] / [6(1038) − (78)²]
a = 58.58

b = [n∑XY − (∑X)(∑Y)] / [n∑X² − (∑X)²]
  = [6(3272) − (78)(254)] / [6(1038) − (78)²]
b = -1.25

y = a + bx

y = 58.58 - 1.25x

y = 58.58 - (1.25) (20)

y = 33.58
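
The intercept and slope formulas used in this lesson translate directly into code. The sketch below recomputes Problem No. 2 from the raw data; the function name least_squares_line is an assumption.

```python
def least_squares_line(xs, ys):
    """Intercept a and slope b of y = a + bx from the raw-sum formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


price = [10, 12, 13, 12, 16, 15]
demand = [48, 38, 43, 45, 37, 43]

a, b = least_squares_line(price, demand)
predicted = a + b * 20   # predicted demand at a price of 20
print(round(a, 2), round(b, 2), round(predicted, 2))
```

A price of 20 lies outside the observed range of 10 to 16, so the prediction is an extrapolation and should be read with that in mind.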

Problem No. 3

Predict the grade of a student in Calculus if his grade in statics is 98.

Calculus (X) 85 82 79 85 85 90 85 84 74 81
Statics (Y) 75 76 90 78 92 90 95 85 82 82

No. (X) (Y) XY X² Y²
1 85 75 6375 7225 5625
2 82 76 6232 6724 5776
3 79 90 7110 6241 8100
4 85 78 6630 7225 6084
5 85 92 7820 7225 8464
6 90 90 8100 8100 8100
7 85 95 8075 7225 9025
8 84 85 7140 7056 7225
9 74 82 6068 5476 6724
10 81 82 6642 6561 6724
∑ 830 845 70192 69058 71847

a = [(∑Y)(∑X²) − (∑X)(∑XY)] / [n∑X² − (∑X)²]
  = [845(69058) − (830)(70192)] / [10(69058) − (830)²]
a = 56.34

b = [n∑XY − (∑X)(∑Y)] / [n∑X² − (∑X)²]
  = [10(70192) − (830)(845)] / [10(69058) − (830)²]
b = 0.339

y = a + bx

y = 56.34 + 0.339x

y = 56.34 + 0.339 (98)

y = 89.56

Problem No. 4

Predict the glucose level if the age is 30.

Subject Age (X) Glucose Level


1 43 99
2 21 65
3 25 79
4 42 75
5 57 87
6 59 81
Subject Age (X) Glucose Level (Y) XY X² Y²
1 43 99 4257 1849 9801
2 21 65 1365 441 4225
3 25 79 1975 625 6241
4 42 75 3150 1764 5625
5 57 87 4959 3249 7569
6 59 81 4779 3481 6561
∑ 247 486 20485 11409 40022

a = [(∑Y)(∑X²) − (∑X)(∑XY)] / [n∑X² − (∑X)²]
  = [486(11409) − (247)(20485)] / [6(11409) − (247)²]
a = 65.14

b = [n∑XY − (∑X)(∑Y)] / [n∑X² − (∑X)²]
  = [6(20485) − (247)(486)] / [6(11409) − (247)²]
b = 0.39

y = a + bx

y = 65.14 + 0.39x

y = 65.14 + 0.39 (30)

y = 76.84

Problem No. 5

Predict the score of a student in Trigonometry if his grade in Algebra is 20.

Algebra 15 16 12 10 8
Trigonometry 18 11 10 20 17

Subject Algebra (X) Trigonometry (Y) XY X² Y²
1 15 18 270 225 324
2 16 11 176 256 121
3 12 10 120 144 100
4 10 20 200 100 400
5 8 17 136 64 289
∑ 61 76 902 789 1234

a = [(∑Y)(∑X²) − (∑X)(∑XY)] / [n∑X² − (∑X)²]
  = [76(789) − (61)(902)] / [5(789) − (61)²]
a = 22.06

b = [n∑XY − (∑X)(∑Y)] / [n∑X² − (∑X)²]
  = [5(902) − (61)(76)] / [5(789) − (61)²]
b = -0.56

y = a + bx

y = 22.06 - 0.56x

y = 22.06 - 0.56 (20)

y = 10.86
