
Basic Concepts in Statistics-Aggie

This document provides an overview of basic statistical concepts including:
- Descriptive statistics, which describes data, and inferential statistics, which draws inferences from data.
- Probability and the concepts of sample space, sample points, and events.
- The multiplication principle for determining total outcomes of multiple experiments.
- Permutations and combinations for arranging and selecting objects with or without order.
- Calculating probabilities of events using formulas like P(A union B) = P(A) + P(B) - P(A intersection B).
- Conditional probability and the multiplication law for finding P(A intersection B).

Uploaded by

Espee

Basic Concept in Statistics

BERLITA Y. DISCA
Math/Physics Dept, CNSM
MSU-General Santos City

I. Introduction

One of the important areas of applied mathematics is statistics. Statistics deals with
scientific methods of collecting, organizing, summarizing, presenting, and analyzing data. It
helps in drawing valid conclusions and making reasonable decisions on the basis of that analysis.
The things known or assumed are called data, and these serve as the basis for inference.
Thus, we have statistics on board exam passers and topnotchers, arson statistics, vehicular
accident statistics, crime statistics, etc.

The basic concern in the study of statistics is the presentation and interpretation of chance
outcomes that occur in a planned or scientific investigation.

There are two major areas in statistics. Descriptive statistics comprises those methods concerned
with collecting and describing a set of data so as to yield meaningful information. It provides
information only about the collected data and in no way draws inferences or conclusions
concerning a larger set of data. The tables, charts, graphs, and other relevant computations
found in various newspapers and magazines usually fall under this method. The other area is
statistical inference, or inferential statistics. This method deals with the analysis of
a subset of data leading to predictions or inferences about the entire set of data. The main
functions of statistical inference are the estimation of population parameters and tests of
hypotheses.

II. Probability
Probability is a measure of how likely something is to happen. The concepts of probability theory
originated in the study of games of chance. They are now used extensively in areas
such as statistics, mathematics, science, and philosophy to draw conclusions about the likelihood
of potential events and the underlying mechanics of complex systems.

In probability we need to define the following:
- Sample Space: the set of all possible outcomes of a statistical experiment, denoted S.
- Sample Point: each outcome in a sample space, also called an element.
- Event: a subset of a sample space.
- Null Space: the subset of the sample space that contains no elements, denoted $\emptyset$.

We also need the following operations on events:
- Intersection of events: $A \cap B$ is the event containing all elements common to both A and B.
- Mutually exclusive events: A and B are mutually exclusive if $A \cap B = \emptyset$, that is,
  A and B have no common elements.
- Union of two events: $A \cup B$ is the event containing all the elements that belong to A or
  to B or to both.
- Complement of an event: the complement of A (symbolized by A’) with respect to S is the set of
  all elements in S outside of A.

The Multiplication Principle

If one experiment has m outcomes and another experiment has n outcomes, then there are mn
possible outcomes for the two experiments.

Example 1. If an experiment consists of drawing a letter from the English alphabet and tossing a
pair of coins, find the total number of sample points.

There are 26 letters and $2 \times 2 = 4$ outcomes for the pair of coins, so there are
$26 \times 4 = 104$ sample points.


Extended Multiplication Principle

If there are p experiments and the first has $n_1$ possible outcomes, the second has $n_2$ possible
outcomes, the third $n_3$, . . . and the pth $n_p$, then there are a total of
$n_1 \cdot n_2 \cdot n_3 \cdots n_p$ possible outcomes for the p experiments.

Example 2. A 4-bit binary word is a sequence of 4 digits, each of which may be either a 0 or a 1.
How many different 4-bit words are there?

(2)(2)(2)(2) = 16
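
The two counts above can be checked by listing the sample spaces outright. The following is a minimal sketch, assuming the letters are the 26 uppercase English letters and each coin shows H or T; the variable names are illustrative only.

```python
from itertools import product
from string import ascii_uppercase

# Example 1: draw one letter, then toss a pair of coins.
letters = ascii_uppercase                      # 26 outcomes
coin_pairs = list(product("HT", repeat=2))     # HH, HT, TH, TT: 4 outcomes
sample_space = list(product(letters, coin_pairs))
print(len(sample_space))                       # 26 * 4 sample points

# Example 2: every 4-bit binary word.
words = ["".join(bits) for bits in product("01", repeat=4)]
print(len(words))                              # 2 * 2 * 2 * 2 words
```

Enumerating is only practical for small experiments; the multiplication principle gives the same counts without building the list.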

Permutations and Combinations

A permutation is an ordered arrangement of objects.

Example 2. How many different ordered samples of size 5 can we form from the letters A, B, C, D, E?

$5! = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 120$

The number of permutations of n distinct objects taken r at a time is given by the formula

$$ {}_nP_r = P(n, r) = \frac{n!}{(n - r)!} $$

Example 3. How many different ordered samples of size 3 can we form from the letters A, B, C, D, E?

$$ {}_5P_3 = P(5, 3) = \frac{5!}{(5 - 3)!} = 60 $$

The number of distinct arrangements of n objects of which $n_1$ are alike of one kind, $n_2$ are
alike of another kind, and so forth, is

$$ \frac{n!}{n_1!\, n_2!\, n_3! \cdots n_k!} \quad \text{where } n_1 + n_2 + n_3 + \cdots + n_k = n. $$

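Example 4. How many distinct permutations can be made from the letters of the word
“agriculturist”?

The word has 13 letters, and each of the letters r, i, u, and t appears twice, so the number is

$$ \frac{13!}{2!\, 2!\, 2!\, 2!} = 389{,}188{,}800. $$

A circular permutation is an arrangement of objects in a circle. The number of circular
permutations of n distinct objects is $(n - 1)!$.

Example 5. In how many ways can 5 people be arranged at a round table?

$$ (n - 1)! = (5 - 1)! = 4! = 24 $$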
A combination is a collection of objects with no definite arrangement. It is the process of
selecting r objects from n without regard to order.

$$ {}_nC_r = C(n, r) = \frac{n!}{r!\,(n - r)!} $$

Example 6. In how many ways can a committee of 4 be selected at random from a group of 6 men
and 8 women

a. with no restriction?

$$ C(n, r) = C(14, 4) = \frac{14!}{4!\,10!} = 1001 $$

b. with two men and two women, if a certain man must be on the committee?

$C(1, 1) = \frac{1!}{1!\,0!} = 1$ (the certain man who must be on the committee)
$C(5, 1) = \frac{5!}{1!\,4!} = 5$ (one other man chosen from the remaining 5 men)
$C(8, 2) = \frac{8!}{2!\,6!} = 28$ (two women chosen from the 8 women)

By the multiplication rule, there are (1)(5)(28) = 140 committees of size 4.
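
The permutation and combination counts in this section can be verified with Python's standard library. This is a sketch only: `math.perm` and `math.comb` (available from Python 3.8) are used here in place of the hand formulas, and the variable names are illustrative.

```python
from collections import Counter
from math import comb, factorial, perm

all_five = factorial(5)            # arrangements of A, B, C, D, E
three_of_five = perm(5, 3)         # P(5, 3) = 5! / (5 - 3)!

# Distinct permutations of a word with repeated letters: n! / (n1! n2! ... nk!)
letter_counts = Counter("agriculturist")
word_arrangements = factorial(sum(letter_counts.values()))
for count in letter_counts.values():
    word_arrangements //= factorial(count)

round_table = factorial(5 - 1)     # circular permutations of 5 people

committees_any = comb(14, 4)                                   # Example 6a
committees_2m_2w = comb(1, 1) * comb(5, 1) * comb(8, 2)         # Example 6b

print(all_five, three_of_five, word_arrangements, round_table)
print(committees_any, committees_2m_2w)
```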

Probability of an Event:

If an experiment can result in any one of N different equally likely outcomes, and if exactly n
of these outcomes correspond to event A, then the probability of event A is

$$ P(A) = \frac{n}{N} $$

Properties:
1. $0 \le P(A) \le 1$
2. $P(\emptyset) = 0$
3. $P(S) = 1$
4. If A and B are disjoint, then $P(A \cup B) = P(A) + P(B)$.
   In general, $P(A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n)$
   for mutually exclusive events.
5. $P(A^c) = 1 - P(A)$, where $A^c$ is the complement of A
6. The Addition Law: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Examples:
7. A coin is tossed twice. What is the probability that at least 1 head occurs?

8. If a card is drawn out of an ordinary deck of cards, find the probability that it is a heart.

9. From a box containing 3 green apples, 5 red apples, and 6 yellow apples, an apple is
drawn. What is the probability that the apple is
a. red
b. green or yellow
10. The probability that a student will pass his Statistics course is $\frac{2}{3}$ and the
probability that he passes English 10 is $\frac{4}{9}$. If the probability of passing both
courses is $\frac{14}{45}$, what is the probability that he will pass at least one course?

11. A pair of dice is tossed. Find the probability that the sum is
c. 6
d. 8 or 10
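
One way to check answers to the exercises above is to enumerate each sample space and apply $P(A) = n/N$ directly. The sketch below assumes fair coins, fair dice, a standard 52-card deck, and equally likely apples; the helper `prob` is hypothetical and not part of the original notes.

```python
from fractions import Fraction
from itertools import product

def prob(event, space):
    """P(A) = n / N for a sample space of equally likely outcomes."""
    favorable = sum(1 for outcome in space if event(outcome))
    return Fraction(favorable, len(space))

# 7. A coin tossed twice: at least one head.
tosses = list(product("HT", repeat=2))
p_at_least_one_head = prob(lambda t: "H" in t, tosses)

# 8. One card from an ordinary deck: a heart.
deck = [(rank, suit) for rank in range(1, 14) for suit in "SHDC"]
p_heart = prob(lambda card: card[1] == "H", deck)

# 9. One apple from 3 green, 5 red, and 6 yellow apples.
apples = ["green"] * 3 + ["red"] * 5 + ["yellow"] * 6
p_red = prob(lambda a: a == "red", apples)
p_green_or_yellow = prob(lambda a: a in ("green", "yellow"), apples)

# 10. Addition law: P(A or B) = P(A) + P(B) - P(A and B).
p_at_least_one_course = Fraction(2, 3) + Fraction(4, 9) - Fraction(14, 45)

# 11. A pair of dice: sum is 6; sum is 8 or 10.
rolls = list(product(range(1, 7), repeat=2))
p_sum_6 = prob(lambda r: sum(r) == 6, rolls)
p_sum_8_or_10 = prob(lambda r: sum(r) in (8, 10), rolls)

for name, value in [
    ("at least one head", p_at_least_one_head),
    ("heart", p_heart),
    ("red apple", p_red),
    ("green or yellow apple", p_green_or_yellow),
    ("passes at least one course", p_at_least_one_course),
    ("sum is 6", p_sum_6),
    ("sum is 8 or 10", p_sum_8_or_10),
]:
    print(f"{name}: {value}")
```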
Conditional Probability

Let A and B be two events with $P(B) \neq 0$. The conditional probability of A given B is defined
to be

$$ P(A/B) = \frac{P(A \cap B)}{P(B)} $$

Example 12. A math teacher gave her class two tests. 25% of the class passed both tests and 42%
of the class passed the first test. What percent of those who passed the first test also passed the
second test?

Solution: Let A be the event that a student passed the second test and let B be the event that a
student passed the first test. Then A/B is the event that a student passed the second test given
that he or she passed the first test.

$P(A \cap B) = 0.25$ and $P(B) = 0.42$. Find $P(A/B)$.

$$ P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{0.25}{0.42} \approx 0.60 = 60\% $$

Multiplication Law

Let A and B be events and assume $P(B) \neq 0$. Then

$$ P(A \cap B) = P(A/B)\,P(B). $$

The multiplication law is often useful in finding the probabilities of intersections.

Example 13. A box contains 3 red balls and 2 blue balls. Two balls are selected without
replacement. What is the probability that they are both red?

Solution: Let A and B denote the events that a red ball is drawn on the first and second trial,
respectively. Then $P(A) = \frac{3}{5}$. If a red ball has been removed on the first trial, there
are 2 red and 2 blue balls left, so $P(B/A) = \frac{2}{4} = \frac{1}{2}$. By the multiplication
law, with the roles of A and B interchanged,

$$ P(A \cap B) = P(B/A)\,P(A) = \left(\frac{1}{2}\right)\left(\frac{3}{5}\right) = \frac{3}{10}. $$

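Examples 12 and 13 can be reproduced with exact fractions. This is a minimal sketch: the ball labels `R1`, `B1`, and so on are invented for illustration, and the enumeration assumes every ordered pair of draws is equally likely.

```python
from fractions import Fraction
from itertools import permutations

# Example 12: P(second | first) = P(both) / P(first)
p_both = Fraction(25, 100)
p_first = Fraction(42, 100)
p_second_given_first = p_both / p_first
print(float(p_second_given_first))         # about 0.595, i.e. roughly 60%

# Example 13: two balls drawn without replacement from 3 red and 2 blue.
balls = ["R1", "R2", "R3", "B1", "B2"]
draws = list(permutations(balls, 2))        # all ordered pairs of distinct balls
both_red = [d for d in draws if d[0].startswith("R") and d[1].startswith("R")]
p_enumerated = Fraction(len(both_red), len(draws))

# Same result from the multiplication law: P(A and B) = P(B/A) P(A)
p_multiplication_law = Fraction(2, 4) * Fraction(3, 5)
print(p_enumerated, p_multiplication_law)
```
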
III. Sampling and Collection of Data

When we do research, we need to know the sampling procedures and how to go about collecting the
data. Some terms need to be defined first. A Population consists of the totality of the
observations with which we are concerned, while a subset of it is known as a Sample. A numerical
value that describes a characteristic of the population is called a Parameter, and a Statistic is
any numerical value that describes a characteristic of the sample.

A statistical technique may be classified as univariate, bivariate, or multivariate depending on
the number of variables involved in the analysis. The technique is univariate if it applies to a
single variable, bivariate if it applies to two, and multivariate when more than two variables are
involved. The technique is inferential when it involves estimation of population parameters and
tests of hypotheses.

MEASUREMENT

Measurement refers to the process of determining the value or label, either qualitatively or
quantitatively, of a particular variable for a particular unit of analysis.

Levels of Measurement

1. Nominal
- numbers or symbols are used simply to classify an object, person, or characteristic into
  categories
- the categories must be distinct, non-overlapping, and exhaustive
- weakest level of measurement
Example: Religious Affiliation: Roman Catholic, Iglesia Ni Cristo, Baptist, Alliance, Islam,
Others

2. Ordinal
- contains the property of the nominal level
- the numbers assigned to categories of any variable may be ranked or ordered in some
  low-to-high manner
Example: Blood Pressure: High, Normal, Low/Anemic

3. Interval
- contains the properties of the ordinal level
- the distances between any two numbers on the scale are of known sizes
- characterized by a common and constant unit of measurement
- units of measurement are arbitrary
- the number zero does not imply the absence of the characteristic under consideration (thus,
  the zero point is arbitrary)
Examples: Temperature in °C and °F; Intelligence quotient (75, 100, etc.)

4. Ratio
- contains the properties of the interval level
- has a true zero point, that is, the number zero indicates the absence of the characteristic
  under consideration
- strongest level of measurement
Examples: Scores in a test; Height in meters, feet, etc.

CENSUS vs. SURVEY SAMPLING

Census or Complete Enumeration: the process of gathering information from every unit in the
population

Survey Sampling: the process of obtaining information from only a subset of the units in the
population

Advantages of Survey Sampling

- reduced cost
- greater speed
- greater scope
- greater accuracy

PROBABILITY and NONPROBABILITY SAMPLING

Probability Sampling: a procedure wherein every element of the population is given a (known)
nonzero chance of being selected in the sample

Nonprobability Sampling: a procedure wherein not all the elements in the population are given
a chance of being included in the sample

Methods of Nonprobability Sampling

Purposive Sampling: sets out to make a sample agree with the population in regard to certain
characteristics

Quota Sampling: a specific number of particular types of elements are selected

Convenience Sampling: chooses units which come to hand or are convenient

Judgment Sampling: selects a sample in accordance with an expert’s judgment

Cases wherein Nonprobability Sampling is Useful

- Only a few are willing to be interviewed
- Extreme difficulties are experienced in locating or identifying subjects
- Lower budget (probability sampling is more expensive to implement)

When nonprobability sampling is used:

- results apply only to the sample
- one cannot project results to the whole population

Methods of Probability Sampling

1. Simple Random Sampling (SRS) - the process of selecting a sample of size n, giving
each sampling unit an equal chance of being included in the sample. An SRS of n
observations of the population is a sample that is chosen in such a way that each subset of
n observations of the population has an equal chance of being selected.

Random sampling may be with replacement (SRSWR) or without replacement (SRSWOR). In
SRSWR, a chosen element is always replaced before the next selection is made, so that an
element may be chosen more than once.

Advantages
- The theory involved is easier to understand than the theory behind other sampling designs.
- Estimation methods are simpler and easier.

Disadvantages
- The sample chosen may be widely spread, thus entailing high transportation costs.
- A population list, or frame, is needed.
- The sample chosen may not be truly typical of the population if the population is
  heterogeneous with respect to the characteristic under study.
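
The difference between SRSWOR and SRSWR can be sketched with Python's `random` module. The frame of 50 numbered units and the seed are hypothetical, chosen only so the sketch is repeatable.

```python
import random

# Hypothetical sampling frame: 50 numbered units (not from the notes).
frame = list(range(1, 51))
n = 5
rng = random.Random(2024)   # fixed seed only so the run is repeatable

# SRSWOR: every subset of n distinct units is equally likely; no unit repeats.
srswor = rng.sample(frame, n)

# SRSWR: each chosen unit is replaced before the next draw, so repeats can occur.
srswr = rng.choices(frame, k=n)

print("SRSWOR:", srswor)
print("SRSWR: ", srswr)
```

Both calls need the full list of units, which is the "frame is needed" disadvantage noted above.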

2. Stratified Sampling - There are cases wherein the population consists of items that are
heterogeneous with respect to the characteristic under study. In such situations the
population should be divided, or stratified, into more or less homogeneous
subpopulations or strata before sampling is done.

Stratified random sampling then consists of selecting an SRS from each of the strata into which
the population has been divided.

Advantages
- Stratification may bring about a gain in precision of the estimates of characteristics of the
  population.
- It allows a more comprehensive data analysis since information is provided for each
  stratum.
- It is administratively convenient.

Disadvantages
- A listing of the population for each stratum is needed.
- The stratification of the population may require additional prior information about the
  population and its strata.
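
A stratified random sample can be sketched as one SRS per stratum. The strata, unit labels, and per-stratum sample sizes below are hypothetical; the notes do not say how the sample should be allocated among strata, so the sizes are simply supplied by the caller.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw an SRS without replacement of sizes[name] units from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

# Hypothetical population split into two strata (a list is needed for each one).
strata = {
    "freshmen": [f"F{i}" for i in range(1, 41)],   # 40 units
    "seniors": [f"S{i}" for i in range(1, 21)],    # 20 units
}
sizes = {"freshmen": 4, "seniors": 2}              # assumed allocation

sample = stratified_sample(strata, sizes, seed=1)
print(sample)
```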

3. (1-in-k) Systematic Sampling - Systematic sampling with a random start is a method of
selecting a sample by taking every kth element from an ordered population, the first unit
being selected at random. Here k is called the sampling interval and 1/k the sampling
fraction.

Advantages
- Drawing of the sample is administratively easy.
- It is possible to select a sample in the field without a frame.

Disadvantage
- If periodic patterns are found in the list, a systematic sample may consist only of
  similar types.
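
A 1-in-k systematic sample with a random start can be sketched in a few lines. The ordered population of 100 numbered units and the helper name `systematic_sample` are assumptions made for illustration.

```python
import random

def systematic_sample(ordered_population, k, seed=None):
    """Take every kth unit, starting from a random unit among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)              # index 0 .. k-1, chosen at random
    return ordered_population[start::k]

population = list(range(1, 101))          # hypothetical ordered list of 100 units
sample = systematic_sample(population, k=10, seed=7)
print(sample)                             # 10 units spaced 10 apart
```

With 100 units and k = 10 the sampling fraction is 1/10, so the sample always has 10 units.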

4. Cluster Sampling - a method of selecting a sample of distinct groups, or clusters, of
smaller units called elements. The sample clusters may be chosen by SRS or by
systematic sampling. Similar to strata in stratified sampling, clusters are mutually
exclusive subpopulations which together comprise the entire population. Unlike strata,
clusters are preferably formed with heterogeneous elements so that each cluster will be
typical of the population.

The number of clusters in the population, C, is the size of the population of clusters. Clusters
may be of equal or unequal sizes.

Advantages
- A population list is not needed.
- Listing cost is reduced.

Disadvantages
- The costs and problems of statistical analysis are greater.
- Estimation procedures are difficult.

5. Multi-Stage Sampling - a sampling scheme characterized by sampling being done in stages
before the ultimate sampling units are selected.

IV. Presentation of Data
After deciding which sampling procedure to use and collecting the data, the next step is
to present the data.

1. MEASURE OF CENTRAL LOCATION

One of the important ways of describing a group of measurements, whether it be a sample or a
population, is by the use of an average. An average is a measure of the center of a set of data
when the data are arranged in increasing or decreasing order of magnitude.

Measure of Central Location or Measure of Central Tendency - any measure indicating the center
of a set of data arranged in increasing or decreasing order of magnitude. The most commonly
used are the mean, median, and mode.

POPULATION MEAN AND SAMPLE MEAN

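If the set of data $x_1, x_2, \ldots, x_N$, not necessarily all distinct, represents a finite
population of size N, then the population mean is

$$ \mu = \frac{\sum_{i=1}^{N} x_i}{N} $$

If the set of data $x_1, x_2, \ldots, x_n$ represents a finite sample of size n, then the sample
mean is

$$ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} $$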

MEDIAN:

The median of a set of observations arranged in increasing or decreasing order of magnitude
is the middle value when the number of observations is odd, or the arithmetic mean of the two
middle values when the number of observations is even.

MODE

The mode of a set of observations is that value which occurs most often or with the greatest
frequency. The mode does not always exist. This is certainly true when all observations occur
with the same frequency.

Remarks:

1. The mean is the most commonly used measure of location in statistics. It is easy to
calculate and it employs all the available information. The disadvantage of the mean is that it
is adversely affected by extreme values.

2. The median is easy to compute if the number of observations is relatively small. It is not
affected by extreme values.

3. The mode is the least used measure of the three. Its value is almost useless for small sets of
data. It requires no calculation. It can be used for both quantitative and qualitative data.
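
The remarks above can be illustrated with Python's `statistics` module. The data set is hypothetical; `multimode` (Python 3.8+) returns every most-frequent value, so a result listing all the values corresponds to the case where the mode does not exist.

```python
from statistics import mean, median, multimode

data = [2, 3, 3, 5, 7, 10]            # hypothetical observations
with_extreme = data + [90]            # same data plus one extreme value

print(mean(data), median(data), multimode(data))
print(mean(with_extreme), median(with_extreme), multimode(with_extreme))

# All observations occur equally often: every value is returned, i.e. no single mode.
print(multimode([1, 2, 3, 4]))
```

Adding the extreme value 90 pulls the mean from 5 to about 17.1, while the median moves only from 4 to 5 and the mode stays at 3.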

2. MEASURES OF VARIATION

The measures of central location do not give an adequate description of our data. We need to
know how the observations spread out from the average. The most important statistics for
measuring the variability of a set of data are the range and the variance:

RANGE

The range of a set of data is the difference between the largest and smallest numbers in the set.
The range is very simple to compute. However, it is a poor measure of variation, particularly if
the size of the sample is large. It considers only the extreme values and tells us nothing about
the distribution of the numbers in between.

POPULATION VARIANCE:

Given the finite population $x_1, x_2, \ldots, x_N$, the population variance is

$$ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} $$

SAMPLE VARIANCE

Given a random sample $x_1, x_2, \ldots, x_n$, the sample variance is

$$ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} $$

COMPUTING FORMULA FOR SAMPLE VARIANCE:

$$ s^2 = \frac{n \sum x_i^2 - \left(\sum x_i\right)^2}{n(n - 1)} $$

The standard deviation, the positive square root of the variance, expresses the variation in the
same units as the data, so that it can be more easily understood and compared. The variance and
standard deviation are the most informative measures of variation because they consider the value
of each score.

Remarks:

1. The range is the least reliable of the measures and is used only when one is in a hurry to
get a measure of variability. It may be used for ordinal, interval, or ratio data.

2. The most important measures of variability are the standard deviation and its square, the
variance. The variance is the average of the squared deviations around the mean.

3. The standard deviation is used whenever a distribution approximates a normal
distribution. It is the basis for most statistics used in the analysis of data. It is used with
interval and ratio data.
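
The range, the two variance definitions, and the computing formula can be compared on a small data set. The five observations below are hypothetical; `statistics.variance` and `statistics.pvariance` are used only as a cross-check on the hand formulas.

```python
from math import sqrt
from statistics import pvariance, variance

data = [3, 5, 7, 9, 11]                  # hypothetical observations
n = len(data)
mean = sum(data) / n

data_range = max(data) - min(data)

# Definitions: divide by N for a population, by n - 1 for a sample.
pop_var = sum((x - mean) ** 2 for x in data) / n
samp_var = sum((x - mean) ** 2 for x in data) / (n - 1)

# Computing formula: [n * sum(x^2) - (sum x)^2] / [n (n - 1)]
samp_var_computing = (n * sum(x * x for x in data) - sum(data) ** 2) / (n * (n - 1))

samp_sd = sqrt(samp_var)

print(data_range, pop_var, samp_var, samp_var_computing, round(samp_sd, 3))
print(pvariance(data), variance(data))   # library cross-check
```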

V. Frequency Distributions and Graphical Representation of Data

Important characteristics of a large mass of data can be readily assessed by grouping the data into
different classes and then determining the number of observations that fall in each class. Such an
arrangement in tabular form is called a frequency distribution.

Data that are presented in the form of a frequency distribution are called grouped data. The data
of a sample are often grouped into intervals to produce a better overall picture of the unknown
population, but in so doing the identity of the individual observations is lost.

The smallest and largest values that can fall in a class interval are called class limits. The
number of observations falling in a particular class interval is called the class frequency. The
numerical difference between the upper and lower class boundaries of a class interval is defined
to be the class width. The midpoint of the class interval, called the class mark, is the average
of the class limits.

The total frequency of all values less than the upper class boundary of a given class interval is
called the cumulative frequency up to and including that class.

The steps in grouping a large set of data into a frequency distribution may be summarized as
follows:

1. Decide on the number of class intervals required. (We can choose between 5 and 20 class
intervals)
2. Determine the range.
3. Divide the range by the number of classes to estimate the approximate width of the
interval.
4. List the lower class limit of the bottom interval and then the lower class boundary. Add
the class width to the lower class boundary to obtain the upper class boundary. Write
down the upper class limit.
5. List all the class limits and class boundaries.
6. Determine the class marks.
7. Tally the frequencies for each class.
8. Sum the frequency column and check against the total number of observations

Example 1: The following are the final examination scores in an elementary statistics course:

85 49 23 60 79 32 57 74 52 70 82 36
80 77 81 95 41 65 92 85 55 76 33 92
52 10 64 75 78 25 80 98 81 67 68 24
41 71 83 54 64 72 88 62 74 43 79 83
60 78 89 76 84 48 84 90 15 79 84 66
34 67 17 82 69 74 63 80 85 61 89 67

Using 9 intervals with the lowest starting at 10,


a. set up the frequency distribution.
b. construct a cumulative frequency distribution.
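
The grouping steps listed earlier can be sketched for these scores as follows. The class width of 10 is an assumption: the range is 98 - 10 = 88, and 88 / 9 is about 9.8, which is rounded up to 10 here. Class boundaries are taken 0.5 below and above the class limits.

```python
scores = [
    85, 49, 23, 60, 79, 32, 57, 74, 52, 70, 82, 36,
    80, 77, 81, 95, 41, 65, 92, 85, 55, 76, 33, 92,
    52, 10, 64, 75, 78, 25, 80, 98, 81, 67, 68, 24,
    41, 71, 83, 54, 64, 72, 88, 62, 74, 43, 79, 83,
    60, 78, 89, 76, 84, 48, 84, 90, 15, 79, 84, 66,
    34, 67, 17, 82, 69, 74, 63, 80, 85, 61, 89, 67,
]

num_classes = 9
lowest_limit = 10
class_width = 10      # assumed: range 88 / 9 classes, rounded up

rows = []
cumulative = 0
for i in range(num_classes):
    lower = lowest_limit + i * class_width          # lower class limit
    upper = lower + class_width - 1                 # upper class limit
    frequency = sum(lower <= s <= upper for s in scores)
    cumulative += frequency
    class_mark = (lower + upper) / 2
    rows.append((lower, upper, lower - 0.5, upper + 0.5, class_mark, frequency, cumulative))

print("limits   boundaries    mark   f   cf")
for lower, upper, lb, ub, mark, f, cf in rows:
    print(f"{lower}-{upper}    {lb}-{ub}    {mark}   {f}   {cf}")

# Step 8: the frequencies must add up to the number of observations.
assert sum(row[5] for row in rows) == len(scores)
```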

GRAPHICAL REPRESENTATIONS
The information provided by a frequency distribution in tabular form is easier to grasp if
presented graphically. A visual picture is beneficial in understanding the essential features of
frequency distribution.

Bar Chart – plotting the class frequency against the class limits.

Frequency Histogram – frequency against the class boundaries.

Frequency Polygon – frequency against the class marks

Cumulative Frequency Polygon or Ogive – cumulative frequency against the upper class
boundaries

GROUPED DATA: FORMULAS FOR MEASURES OF CENTRAL TENDENCY

$$ \text{MEAN} = \frac{\sum f_i x_i}{n} $$

where: $f_i$ = the frequency of class interval i
       $x_i$ = the midpoint of class interval i
       $\sum f_i x_i$ = the sum of the products of the frequency and midpoint of the class
       intervals

or by the use of codes:

$$ \text{MEAN} = A + \left(\frac{\sum f_i u_i}{n}\right) c $$

where: A = the midpoint of the class interval assigned a code of zero
       $f_i$ = the frequency of class interval i
       $u_i$ = the code of class interval i
       c = the class width
       $\sum f_i u_i$ = the sum of the products of the frequency and the code of the class
       intervals

$$ \text{MEDIAN} = L_1 + \left(\frac{\frac{n}{2} - cf}{f}\right) c $$

where: $L_1$ = the lower class boundary of the median class
       n = the sample size
       cf = the cumulative frequency of the class right below (next lower than) the median class
       f = the frequency of the median class
       c = the class width
       median class = the class interval where the $\left(\frac{n}{2}\right)$th observation falls

$$ \text{MODE} = L_1 + \left(\frac{d_1}{d_1 + d_2}\right) c $$

where: $L_1$ = the lower class boundary of the modal class
       $d_1$ = the difference between the frequencies of the modal class and the class right
       below (next lower than) the modal class
       $d_2$ = the difference between the frequencies of the modal class and the class right
       above (next higher than) the modal class
       c = the class width
       modal class = the class interval having the highest frequency

GROUPED DATA: FORMULAS FOR MEASURES OF VARIATION

$$ s^2 = \frac{n \sum f_i x_i^2 - \left(\sum f_i x_i\right)^2}{n(n - 1)} $$

where: $f_i$ = the frequency of class interval i
       $x_i$ = the midpoint of class interval i
       $\sum f_i x_i$ = the sum of the products of the frequency and midpoint of the class
       intervals

or by the use of codes:

$$ s^2 = c^2 \left[\frac{n \sum f_i u_i^2 - \left(\sum f_i u_i\right)^2}{n(n - 1)}\right] $$

where: $f_i$ = the frequency of class interval i
       $u_i$ = the code of class interval i
       c = the class width
       $\sum f_i u_i$ = the sum of the products of the frequency and the code of the class
       intervals

The Quartile Deviation

The quartile deviation is used when the median is used as the average, or when the data
depart noticeably from the normal distribution. It is used for ordinal data.

The quartile deviation, Q, is frequently called the semi-interquartile range. It is half of
the distance between the two quartile points, $Q_1$ and $Q_3$.

In symbols:

$$ Q = \frac{Q_3 - Q_1}{2} $$

where

$$ Q_1 = L_1 + \left(\frac{\frac{n}{4} - cf}{f}\right) c, \qquad Q_3 = L_1 + \left(\frac{\frac{3n}{4} - cf}{f}\right) c $$

with $L_1$, cf, and f taken for the class in which the respective quartile falls.

Example 2. Given the following data,

Class interval    f
36-40             2
31-35             8
26-30            12
21-25            18
16-20            10

Complete the table and find the mean, median, mode, variance, standard deviation, $Q_1$, $Q_3$,
and Q. Make the bar chart, histogram, frequency polygon, and ogive.

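Solution:

Class      f    Class        Class     u     fu    fu^2    fx     fx^2     cf
interval        boundaries   mark x
36-40      2    35.5-40.5    38        2      4      8      76     2888    50
31-35      8    30.5-35.5    33        1      8      8     264     8712    48
26-30     12    25.5-30.5    28        0      0      0     336     9408    40
21-25     18    20.5-25.5    23       -1    -18     18     414     9522    28
16-20     10    15.5-20.5    18       -2    -20     40     180     3240    10
Total     50                                -26     74    1270    33770

a) Mean

$$ \text{MEAN} = \frac{\sum f_i x_i}{n} = \frac{1270}{50} = 25.4 $$

$$ \text{MEAN} = A + \left(\frac{\sum f_i u_i}{n}\right) c = 28 + \left(\frac{-26}{50}\right)(5) = 28 - 2.6 = 25.4 $$

b) Median (the median class is 21-25, since n/2 = 25)

$$ \text{MEDIAN} = L_1 + \left(\frac{\frac{n}{2} - cf}{f}\right) c = 20.5 + \left(\frac{\frac{50}{2} - 10}{18}\right)(5) = 24.67 $$

c) Mode (the modal class is 21-25)

$$ \text{MODE} = L_1 + \left(\frac{d_1}{d_1 + d_2}\right) c = 20.5 + \left(\frac{8}{8 + 6}\right)(5) = 23.36 $$

d) Variance

$$ s^2 = \frac{n \sum f_i x_i^2 - \left(\sum f_i x_i\right)^2}{n(n - 1)} = \frac{50(33770) - (1270)^2}{50(49)} = \frac{1688500 - 1612900}{2450} = \frac{75600}{2450} = 30.86 $$

$$ s^2 = c^2 \left[\frac{n \sum f_i u_i^2 - \left(\sum f_i u_i\right)^2}{n(n - 1)}\right] = 5^2 \left[\frac{50(74) - (-26)^2}{50(49)}\right] = 25 \left[\frac{3700 - 676}{2450}\right] = 30.86 $$

e) Standard Deviation

$$ s = \sqrt{30.86} \approx 5.555 $$

f) First quartile (n/4 = 12.5 falls in class 21-25)

$$ Q_1 = L_1 + \left(\frac{\frac{n}{4} - cf}{f}\right) c = 20.5 + \left(\frac{\frac{50}{4} - 10}{18}\right)(5) = 20.5 + 0.69 = 21.19 $$

g) Third quartile (3n/4 = 37.5 falls in class 26-30)

$$ Q_3 = L_1 + \left(\frac{\frac{3n}{4} - cf}{f}\right) c = 25.5 + \left(\frac{\frac{3}{4}(50) - 28}{12}\right)(5) = 25.5 + 3.96 = 29.46 $$

h) Quartile deviation

$$ Q = \frac{Q_3 - Q_1}{2} = \frac{29.46 - 21.19}{2} = 4.14 $$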
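
The grouped-data formulas can be checked against Example 2 with a short script. This is a sketch under the example's own setup (class width 5, boundaries 0.5 beyond the class limits); the helper `quantile` is hypothetical and covers the median and quartile formulas with a single interpolation rule.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for Example 2, lowest class first
classes = [(16, 20, 10), (21, 25, 18), (26, 30, 12), (31, 35, 8), (36, 40, 2)]
c = 5                                             # class width
freqs = [f for _, _, f in classes]
marks = [(lo + hi) / 2 for lo, hi, _ in classes]  # class marks
n = sum(freqs)

sum_fx = sum(f * x for f, x in zip(freqs, marks))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, marks))

mean = sum_fx / n
variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
sd = sqrt(variance)

def quantile(p):
    """L1 + ((p*n - cf) / f) * c for the class containing the (p*n)th observation."""
    target = p * n
    cf = 0                                        # cumulative frequency below the class
    for lower_limit, _, f in classes:
        if cf + f >= target:
            return (lower_limit - 0.5) + (target - cf) / f * c
        cf += f

median = quantile(0.50)
q1, q3 = quantile(0.25), quantile(0.75)
q = (q3 - q1) / 2     # about 4.13 unrounded; 4.14 in the text comes from rounded quartiles

# Mode: d1 and d2 compare the modal class with the next lower and next higher class.
i = max(range(len(freqs)), key=freqs.__getitem__)
d1 = freqs[i] - (freqs[i - 1] if i > 0 else 0)
d2 = freqs[i] - (freqs[i + 1] if i + 1 < len(freqs) else 0)
mode = (classes[i][0] - 0.5) + d1 / (d1 + d2) * c

for name, value in [("mean", mean), ("median", median), ("mode", mode),
                    ("variance", variance), ("sd", sd), ("Q1", q1), ("Q3", q3), ("Q", q)]:
    print(f"{name}: {value:.2f}")
```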

THE MEASURE OF SKEWNESS

The Measure of Skewness is a value that measures the degree of departure of the distribution
from symmetry. The coefficient of skewness (SK) is defined as

$$ SK = \frac{3(\mu - Md)}{\sigma} $$

where $\mu$ is the mean, Md is the median, and $\sigma$ is the standard deviation.

Remark:
The distribution of the data is symmetric when SK = 0. It is negatively skewed when SK
< 0 and it is positively skewed when SK > 0.

THE MEASURE OF KURTOSIS

The Measure of Kurtosis is a value that measures the flatness or peakedness of a frequency
distribution. For data grouped into k classes it is computed using the formula

$$ K = \frac{\frac{1}{n} \sum_{i=1}^{k} (x_i - \bar{x})^4 f_i}{s^4} $$

For a normal distribution K = 3, and the distribution is mesokurtic. If K > 3, the distribution is
leptokurtic; and if K < 3, the distribution is platykurtic.
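
As an illustration, both measures can be evaluated for the grouped data of Example 2. The sketch assumes that the sample mean, the grouped median, and the sample standard deviation s from that example are the values plugged into the formulas; the notes do not work this case out.

```python
from math import sqrt

marks = [18, 23, 28, 33, 38]          # class marks from Example 2
freqs = [10, 18, 12, 8, 2]            # class frequencies from Example 2
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, marks)) / n
s2 = sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks)) / (n - 1)
s = sqrt(s2)
median = 20.5 + (n / 2 - 10) / 18 * 5  # grouped median of Example 2

sk = 3 * (mean - median) / s           # coefficient of skewness
fourth_moment = sum(f * (x - mean) ** 4 for f, x in zip(freqs, marks)) / n
k = fourth_moment / s ** 4             # measure of kurtosis

print(f"SK = {sk:.3f}")                # positive: skewed to the right
print(f"K  = {k:.3f}")                 # below 3: platykurtic
```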

VI. Hypothesis Testing


MEASURES OF ASSOCIATION

There is a need to measure the degree of association of variables to express quantitatively the
extent to which they are related.
In determining the association, two sets of measurements are obtained on the same individuals or
units, or on pairs of individuals or units that are matched on some basis. The association is then
tested for significance.

Null Hypothesis: There is no association or correlation between the variables.

Alternative Hypothesis: There exists a significant association or correlation between the
variables.

1. PHI Coefficient - an appropriate measure of association between two sets of attributes
measured on a nominal scale, each of which may take only two values (e.g., Yes/No,
Present/Absent, Positive/Negative).

Data Layout: 2x2 contingency table

                 Variable X
Variable Y     0       1       Total
0              A       B       A+B
1              C       D       C+D
Total          A+C     B+D     N

The phi coefficient is computed by the formula

$$ \phi = \frac{AD - BC}{\sqrt{(A + B)(C + D)(A + C)(B + D)}} $$

Testing for Significance

To test the null hypothesis of no association, we compute the test statistic

$$ \chi^2 = \frac{N(AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)} $$

which follows a chi-square distribution with one degree of freedom.

Decision Rule

Reject the null hypothesis and conclude that there is a significant association when the p-value
associated with the statistic is less than or equal to the prescribed level of significance.

Example: A significant relation between students’ career choices in college and their fathers’
occupations has been hypothesized by a number of researchers. Do the following data support the
hypothesis? What is the index of relationship?

                    Student’s Career Choice
Father’s            Professional    Vocational/    Total
Occupation                          Technical
White Collar        19              1              20
Blue Collar         11              9              20
Total               30              10             40

Solving for the index,

$$ \phi = \frac{AD - BC}{\sqrt{(A + B)(C + D)(A + C)(B + D)}} = \frac{171 - 11}{\sqrt{(20)(20)(30)(10)}} = 0.462 $$

For testing the significance of this measure of association, we compute

$$ \chi^2 = \frac{N(AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)} = \frac{40(160)^2}{120000} = 8.53 $$

The critical value at the 5% level of significance and 1 degree of freedom is 3.841. Since
8.53 > 3.841, the null hypothesis of no association is rejected. We conclude that there is a
significant relation between students’ career choices in college and their fathers’ occupations.
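
The phi coefficient and its chi-square statistic can be computed directly from the four cell counts. A minimal sketch follows; the function name `phi_and_chi_square` is hypothetical, and the critical value 3.841 is the one quoted in the text.

```python
from math import sqrt

def phi_and_chi_square(a, b, c, d):
    """Phi coefficient and chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margin_product = (a + b) * (c + d) * (a + c) * (b + d)
    phi = (a * d - b * c) / sqrt(margin_product)
    chi2 = n * (a * d - b * c) ** 2 / margin_product
    return phi, chi2

# White collar: 19 professional, 1 vocational; blue collar: 11 and 9.
phi, chi2 = phi_and_chi_square(19, 1, 11, 9)
critical_value = 3.841                     # 5% level, 1 degree of freedom (from the text)

print(f"phi = {phi:.3f}, chi-square = {chi2:.2f}")
print("reject null hypothesis:", chi2 > critical_value)
```

For a 2x2 table the two quantities are tied together by $\chi^2 = N\phi^2$, which gives a quick consistency check.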

2. Chi-Square Test For Independence

- The chi-square test is used to determine whether two categorical variables are dependent or not.
- It tests the association of two variables A and B, A having r categories and B having k
  categories.

Remarks:
- There should be no empty cells.
- No more than 20% of the cells should have an expected frequency less than 5.

Testing for Significance

Null Hypothesis

There is no association between variables A and B.

Test statistic (Chi-square statistic)

$$ \chi^2 = \sum_i \sum_j \frac{(n_{ij} - E_{ij})^2}{E_{ij}} $$

where: $n_{ij}$ = observed frequency of the ijth cell
       $E_{ij}$ = expected frequency of the ijth cell
                = $\frac{R_i C_j}{N}$ = (Row Total x Column Total) / N

Decision Rule

Reject Ho and conclude that there exists a significant association between A and B if the
value of the test statistic is greater than the critical value obtained using (r-1)(k-1) degrees
of freedom at a given level of significance.

Example: A random sample of 200 people, all retired, was classified according to education and
number of children. The results are:

                  Number of Children
Education      0-1     2-3     4 or more    Total
Elementary     14      37      32           83
Secondary      19      42      17           78
College        12      17      10           39
Total          45      96      59           200

We first determine the expected frequencies using the row and column totals. We obtain the
following:

               0-1       2-3       4 or more
Elementary     18.68     39.84     24.49
Secondary      17.55     37.44     23.01
College         8.78     18.72     11.51

We compute the chi-square statistic by constructing the table below.

Observed    Expected    n_ij - E_ij    (n_ij - E_ij)^2 / E_ij
14          18.675      -4.68          1.17
19          17.55        1.45          0.12
12           8.775       3.23          1.19
37          39.84       -2.84          0.20
42          37.44        4.56          0.56
17          18.72       -1.72          0.16
32          24.485       7.52          2.31
17          23.01       -6.01          1.57
10          11.505      -1.51          0.20
                        chi-square =   7.46

At the 5% level of significance and (3-1)(3-1) = 4 degrees of freedom, the critical value is 9.488.
Since 7.46 < 9.488, the null hypothesis is not rejected; there is not sufficient evidence to show
that family size is associated with the level of education of the father.
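
The expected frequencies and the chi-square statistic for this example can be reproduced as follows. This is a from-scratch sketch using only the formulas in the text; the function name is hypothetical, and the critical value 9.488 is taken from the text rather than computed.

```python
def chi_square_independence(observed):
    """Return (chi-square, degrees of freedom, expected table) for an r x k table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, n_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n   # row total x column total / N
            expected_row.append(e_ij)
            chi2 += (n_ij - e_ij) ** 2 / e_ij
        expected.append(expected_row)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, expected

# Rows: Elementary, Secondary, College; columns: 0-1, 2-3, 4 or more children.
observed = [[14, 37, 32], [19, 42, 17], [12, 17, 10]]
chi2, df, expected = chi_square_independence(observed)
critical_value = 9.488                     # 5% level, 4 degrees of freedom (from the text)

print(f"chi-square = {chi2:.2f} with {df} degrees of freedom")
print("reject null hypothesis:", chi2 > critical_value)
```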

3. Cramer Coefficient C

 It measures the degree of association between two nominal variables set in an r x k


contingency table. The variable A has r categories and the other variable B, k categories.

2
The Cramer Coefficient C is computed as C 
NL  1

where 2 is the value of a chi-square statistic given above and L is the smallest between the
number of rows and the number of columns

Testing for Significance

To test the null hypothesis that the true association is zero, the chi-square statistic $\chi^2$
obtained above is used. This statistic follows a chi-square distribution with
(r-1) x (k-1) degrees of freedom.

Decision Rule

Reject Ho and conclude that there exists a significant association between A and B if the
value of the test statistic is greater than the critical value obtained from the table of critical values of
a chi-square distribution at the specified level of significance.

Example : Is a parent’s political orientation related to his/her disciplinary method? Suppose
data on this research question are summarized as shown below. Find the index of association.

Disciplinary     Political Orientation
Method           Conservative   Moderate   Liberal   Total
Permissive       7              9          14        30
Moderate         10             10         8         28
Authoritarian    15             11         5         31
Total            32             30         27        89

To compute the value of the index, we need the chi-square statistic, so we
determine the expected frequencies first.

Expected Frequencies (E)
10.79   10.11   9.10
10.07   9.44    8.49
11.15   10.45   9.40

We may construct the following table:

Observed   Expected   n_ij - E_ij   (n_ij - E_ij)^2 / E_ij
7          10.79      -3.79         1.33
10         10.07      -0.07         0.00
15         11.15      3.85          1.33
9          10.11      -1.11         0.12
10         9.44       0.56          0.03
11         10.45      0.55          0.03
14         9.10       4.90          2.64
8          8.49       -0.49         0.03
5          9.40       -4.40         2.06
                                    χ² = 7.58

The Cramer Coefficient C is computed as

$$C = \sqrt{\frac{\chi^2}{N(L - 1)}} = \sqrt{\frac{7.58}{89(3 - 1)}} = 0.21$$

Note: r = k = 3 so L = 3

At, say, the 5% level of significance, the tabulated value of chi-square at (r-1)(k-1) = 4 degrees of
freedom is 9.488. Since the chi-square value obtained (7.58) is less than this, the hypothesis is
not rejected. Thus there is no significant evidence that a parent’s political orientation is related
to his/her disciplinary method.
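
As a rough check, the index of association can be recomputed directly from the observed table. The sketch below assumes only the counts given in the example; `cramer_c` is a hypothetical helper name.

```python
import math

# Observed counts: rows = disciplinary method, columns = political orientation.
observed = [
    [7, 9, 14],    # Permissive
    [10, 10, 8],   # Moderate
    [15, 11, 5],   # Authoritarian
]


def cramer_c(table):
    """Return (chi-square statistic, Cramer coefficient C) for an r x k table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / n  # expected frequency of cell (i, j)
            chi2 += (obs - exp) ** 2 / exp
    smaller = min(len(rows), len(cols))  # L in the formula
    return chi2, math.sqrt(chi2 / (n * (smaller - 1)))


chi2, c = cramer_c(observed)
print(round(chi2, 2), round(c, 2))  # approximately 7.58 and 0.21
```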

4. CORRELATION

Correlation is the extent of linear association between two variables.

 The correlation of two variables is usually expressed in terms of an index or coefficient
which is within the interval –1.0 to +1.0. These two limits indicate the perfect negative and
perfect positive relationship, respectively.

 A graph of the paired scores will enable the researcher to judge the nature of the correlation.
Such graphical presentation of scores on the two variables is called a scatterplot.

 A correlation coefficient equal to zero implies the absence of a linear relationship, but not the
total absence of a relationship.

Spearman Rank Order Correlation Coefficient

 The Spearman coefficient determines the degree of correlation between two variables
measured in at least an ordinal scale.

Data Layout: Let (xi, yi) be the ranks of the ith unit on variables X and Y

Variable X x1 x2 . . . . . . . xn
Variable Y y1 y2 . . . . . . . yn

The Spearman correlation coefficient is

$$r_s = 1 - \frac{6\sum (x_i - y_i)^2}{n(n^2 - 1)}$$

Testing for Significance
To test the null hypothesis that the true correlation is zero, we compute the test statistic

$$t = \frac{r_s\sqrt{n - 2}}{\sqrt{1 - r_s^2}}$$ ; for sample size less than 30

$$z = r_s\sqrt{n - 1}$$ ; for sample size of at least 30

Decision Rule

Reject the null hypothesis and conclude significant correlation when t or z is greater than
the critical value obtained from the t or z distribution at a specified level of significance. For the
critical value t the number of degrees of freedom is n-2.

Example : Suppose ten presidential candidates are ranked by two civic action groups as to
their platforms on the issue of unemployment. The data are given below.

Civic        Candidate
Group        A   B    C   D   E   F   G   H    I   J    Total
X            5   10   7   2   3   4   1   9    8   6
Y            3   9    8   1   2   5   4   10   7   6
(xi - yi)2   4   1    1   1   1   1   9   1    1   0    20

Is there a significant correlation between the rankings of the two groups?

The value of the correlation coefficient is

$$r_s = 1 - \frac{6\sum (x_i - y_i)^2}{n(n^2 - 1)} = 1 - \frac{6(20)}{10(100 - 1)} = 1 - 0.121 = 0.879$$

and

$$t = \frac{r_s\sqrt{n - 2}}{\sqrt{1 - r_s^2}} = \frac{0.879\sqrt{10 - 2}}{\sqrt{1 - 0.879^2}} = 5.21$$

At 5% level and 8 degrees of freedom, the tabulated t-value is 1.860. The t-statistic obtained
exceeds the critical value. Hence the hypothesis is rejected and we conclude that there is a
significant correlation between the rankings of the two civic groups.
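
The Spearman computation can be sketched in a few lines of Python, working directly from the two sets of ranks in the example. The helper name `spearman_from_ranks` is an assumption made for illustration, and the sketch assumes there are no tied ranks.

```python
import math

# Ranks given by the two civic groups to candidates A through J.
x = [5, 10, 7, 2, 3, 4, 1, 9, 8, 6]
y = [3, 9, 8, 1, 2, 5, 4, 10, 7, 6]


def spearman_from_ranks(x, y):
    """Return (sum of squared rank differences, r_s, t) for untied ranks."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    rs = 1 - 6 * d2 / (n * (n ** 2 - 1))
    # t statistic with n - 2 degrees of freedom (small-sample case, n < 30)
    t = rs * math.sqrt(n - 2) / math.sqrt(1 - rs ** 2)
    return d2, rs, t


d2, rs, t = spearman_from_ranks(x, y)
print(d2, round(rs, 3), round(t, 2))
```

Summing the squared differences in code is a useful guard against addition slips in the hand-built table.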

Pearson Product Moment Correlation Coefficient

This coefficient, more commonly referred to as Pearson’s r, measures the degree of
correlation of two continuous variables, X and Y, measured in at least an interval scale. The
formula for the Pearson correlation is given by

$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}}$$

Testing for Significance

To test the null hypothesis that the true correlation is zero, we compute the test statistic

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$ ; for sample size less than 30

$$z = r\sqrt{n - 1}$$ ; for sample size of at least 30

Decision Rule

Reject the null hypothesis and conclude significant correlation when t or z statistic is
greater than the critical value obtained from the t or z distribution at a specified level of
significance. For the critical value t the number of degrees of freedom is n-2.

Example : An anthropologist wishes to determine if there is any correlation between
children’s spelling ability and the size of their feet. Twelve children were chosen, and data on
foot size (X) and number of correctly spelt words (Y, out of a chosen hundred) are given as follows.
Child Foot size (X) Words (Y) X^2 Y^2 XY
1 6.25 16 39.06 256 100
2 6.75 28 45.56 784 189
3 6.75 46 45.56 2116 310.5
4 7 14 49.00 196 98
5 7.25 41 52.56 1681 297.25
6 7.5 10 56.25 100 75
7 7.75 56 60.06 3136 434
8 8 43 64.00 1849 344
9 8.25 15 68.06 225 123.75
10 8.75 21 76.56 441 183.75
11 9 50 81.00 2500 450
12 9.25 28 85.56 784 259
Total 92.5 368 723.25 14068 2864.25

$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}} = \frac{12(2864.25) - (92.5)(368)}{\sqrt{\left[12(723.25) - 92.5^2\right]\left[12(14068) - 368^2\right]}}$$

r = 0.163

and

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{0.163\sqrt{12 - 2}}{\sqrt{1 - 0.163^2}} = 0.522$$

The t statistic obtained does not exceed the tabulated value at 5% level of significance and n-2 =
10 degrees of freedom. Hence the hypothesis is not rejected.
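
The same result can be reproduced from the raw data with a short Python sketch. It follows the computational formula given above; `pearson_r` and `t_statistic` are hypothetical helper names used only for this illustration.

```python
import math

# Data from the example: foot size (X) and correctly spelt words (Y).
foot_size = [6.25, 6.75, 6.75, 7, 7.25, 7.5, 7.75, 8, 8.25, 8.75, 9, 9.25]
words = [16, 28, 46, 14, 41, 10, 56, 43, 15, 21, 50, 28]


def pearson_r(x, y):
    """Pearson r from the sums n, sum x, sum y, sum x^2, sum y^2, sum xy."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    sxy = sum(a * b for a, b in zip(x, y))
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator


def t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = pearson_r(foot_size, words)
t = t_statistic(r, len(words))
print(round(r, 3), round(t, 2))  # 0.163 0.52
```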

SPECIAL MEASURES OF ASSOCIATION

Point Biserial Coefficient

 The point-biserial coefficient measures the association between a continuous variable X and
a dichotomous variable Y. Here, the dichotomous variable is coded as either 0 or 1. The
coefficient is computed as

$$r_{pb} = \frac{x_p - x_t}{s}\sqrt{\frac{p}{q}}$$

where s = standard deviation of the continuous variable
xp = mean of the continuous variable when the dichotomous Y is
equal to 1
xt = mean of the continuous variable for the entire data set
p = proportion of units with Y values equal to 1
q = proportion of units with Y values equal to 0

Testing for Significance

To test the null hypothesis of no association, we compute the test statistic

$$t = \frac{r_{pb}\sqrt{n - 2}}{\sqrt{1 - r_{pb}^2}}$$

Decision Rule

Reject the null hypothesis and conclude significant correlation when t statistic is greater
than the tabulated t-value with n-2 degrees of freedom.

Example: Given a small class of seven students, three boys and four girls, determine if there
exists a relationship between math scores and sex based on the data below.

Student      Don   Rita   Ben   Joy   Mylene   Jay   May
Math Score   14    17     12    19    14       9     6
Sex          0     1      0     1     1        0     1

xp = (17+19+14+6)/4 = 14,  xt = (14+17+12+19+14+9+6)/7 = 13

s = 4.47, p = 4/7 = 0.5714, q = 3/7 = 0.4286

$$r_{pb} = \frac{x_p - x_t}{s}\sqrt{\frac{p}{q}} = \frac{14 - 13}{4.47}\sqrt{\frac{0.5714}{0.4286}} = 0.258$$

and

$$t = \frac{r_{pb}\sqrt{n - 2}}{\sqrt{1 - r_{pb}^2}} = \frac{0.258\sqrt{7 - 2}}{\sqrt{1 - 0.258^2}} = 0.597$$

At 0.05 level the tabulated t-value with 5 degrees of freedom is 2.571. Since the computed t is
less than this value, the null hypothesis is not rejected.
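
A small Python sketch of the point-biserial computation follows. It rests on two assumptions that should be checked against the intended definition: the factor involving p and q is taken as the square root of p/q, and s is taken as the sample standard deviation (divisor n-1), which reproduces the s = 4.47 quoted in the example. The function name `point_biserial` is hypothetical.

```python
import math
import statistics

scores = [14, 17, 12, 19, 14, 9, 6]   # math scores
sex = [0, 1, 0, 1, 1, 0, 1]           # dichotomous variable coded 0/1


def point_biserial(x, y):
    """Return (r_pb, t) for continuous x and 0/1-coded y."""
    n = len(x)
    s = statistics.stdev(x)            # assumption: sample SD (n - 1 divisor)
    x_t = statistics.mean(x)           # mean of all scores
    group = [xi for xi, yi in zip(x, y) if yi == 1]
    x_p = statistics.mean(group)       # mean of scores where y == 1
    p = len(group) / n
    q = 1 - p
    r_pb = (x_p - x_t) / s * math.sqrt(p / q)
    t = r_pb * math.sqrt(n - 2) / math.sqrt(1 - r_pb ** 2)
    return r_pb, t


r_pb, t = point_biserial(scores, sex)
print(round(r_pb, 3), round(t, 3))
```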

Kendall’s Coefficient of Concordance

 The Kendall’s coefficient of concordance measures the extent of agreement among several
judges in rating a set of n different objects. The ratings given by judges are usually of
ordinal scale.

The coefficient is computed as

$$W = \frac{12\sum (R_i - \bar{R})^2}{k^2 n (n^2 - 1)}$$

where $R_i$ = sum of ranks per object
$\bar{R} = \sum R_i / n$ = average sum of ranks of the n objects
k = no. of raters; n = no. of objects

Data Layout : Assume there are k=3 judges and n=6 objects

Object   Judge A   Judge B   Judge C   Sum of ranks (Ri)   (Ri – R)2
1        5         4         6         15                  20.25
2        1         1         3         5                   30.25
3        4         3         2         9                   2.25
4        2         6         5         13                  6.25
5        3         5         1         9                   2.25
6        6         2         4         12                  2.25
                                       ΣRi = 63            Σ(Ri – R)2 = 63.5
R = 63/6 = 10.5

Testing for Significance

Null Hypothesis

There is no concordance among the judges, that is, the ratings given by the judges
have no agreement.
Test Statistic
$\chi^2 = k(n-1)W$, which follows a chi-square distribution with n-1
degrees of freedom

Decision Rule

Reject the null hypothesis and conclude a significant concordance among the judges if
the chi-square statistic exceeds the tabulated value at a specified level of significance and n-1
degrees of freedom.

Example: (using the data set in the data layout above)

$$W = \frac{12\sum (R_i - \bar{R})^2}{k^2 n (n^2 - 1)} = \frac{12(63.5)}{9(6)(36 - 1)} = 0.403$$

and $\chi^2 = k(n-1)W = 3(5)(0.403) = 6.045$.

For a level of significance of, say, α = 0.05 (5%), with n-1 = 5 degrees of freedom, the value obtained
from the table is 11.07. The test statistic is less than the critical value so the hypothesis is not
rejected. Hence there is no significant concordance among the judges.
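
The concordance example can be verified with the following Python sketch, which takes the rank table from the data layout as input. The name `kendall_w` is a hypothetical helper, and the sketch assumes each judge assigns untied ranks 1 to n.

```python
# Ranks from the data layout: one row per object, one column per judge (A, B, C).
ranks = [
    [5, 4, 6],
    [1, 1, 3],
    [4, 3, 2],
    [2, 6, 5],
    [3, 5, 1],
    [6, 2, 4],
]


def kendall_w(table):
    """Return (sum of squared deviations, W, chi-square) for an n x k rank table."""
    n = len(table)        # number of objects
    k = len(table[0])     # number of judges
    rank_sums = [sum(row) for row in table]
    mean_sum = sum(rank_sums) / n
    ss = sum((r - mean_sum) ** 2 for r in rank_sums)
    w = 12 * ss / (k ** 2 * n * (n ** 2 - 1))
    chi2 = k * (n - 1) * w  # compared with chi-square on n - 1 degrees of freedom
    return ss, w, chi2


ss, w, chi2 = kendall_w(ranks)
print(ss, round(w, 3), round(chi2, 2))  # 63.5 0.403 6.05
```

The chi-square value differs slightly from 6.045 in the third decimal because the code carries W unrounded.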

VII. Review Problems


1. If an experiment consists of drawing a letter from an English alphabet and tossing a pair of
coins, find the total sample points.

2. In how many ways can a judge award first, second, and third places in a contest with eight
entries?

3. Ten people are to sit at a round table. Find the number of seating arrangements if the host and
the hostess must always sit together.

4. Six married couples have bought 12 seats in a row for a premiere show of Spiderman III. In
how many ways can they be seated if
a) each couple is to sit together? b) all the men are to sit to the right of all the women?

5. How many ways are there to hire 3 applicants from 5 equally qualified recent graduates for a
teaching job in a certain state university?

6. In how many ways can 4 boys and 5 girls sit in a row?

7. Five different mathematics books, four different physics books and two different chemistry
books are to be arranged on a shelf so that the mathematics books stand together, physics books
stand together, and chemistry books stand together. How many such arrangements are possible?

8. From 4 mathematicians and 3 physicists find the number of committees that can be formed
consisting of 2 mathematicians and 1 physicist.

9. If a letter is chosen at random from the English alphabet, find the probability that the letter is a
consonant.

10. A pair of dice is tossed. Find the probability of getting


a. a total of 8 b. at least a total of 10

11. A card is drawn from a deck of 52 cards. Find the probability that it will be a heart or a king.

12. A call center urgently needs four new graduates of business courses. These 4 applicants are
selected at random from 5 accountancy graduates, 3 economics graduates and 1 management
graduate. What is the probability that the management graduate will be hired?

13. A coin and a die are tossed, what is the probability of obtaining a tail and at least a 4?

14. From 6 male and 4 female examinees, a committee of 3 persons is selected at random, what
is the probability of having 2 male and 1 female in a committee?

15. What is the probability of randomly selecting the letter M in the word COLUMN?

16. A bag contains 8 red marbles, 6 blue marbles, and 10 white marbles. A marble is drawn from
the bag. What is the probability that the marble
a. is red? b. is not blue? c. is blue or red?

17. If a letter is chosen at random from the word PARKING; find the probability that it is a
a. vowel? b. consonant?

18. Choose a number at random from 1 to 5. What is the probability of each outcome? What is
the probability that the number chosen is even? What is the probability that the number chosen is
odd?

19. If a card is drawn from a deck of cards, what is the probability that it is
a. a spade? b. a heart or diamond?

20. If a pair of dice is tossed, what is the probability that the sum is
a. 10? b. 6 or 11?

21. The probability that a student will go to the library is 0.35, the probability that he will attend
his class is 0.75, and the probability that he will go to the library and attend his class is 0.68.
What is the probability that he will
a. not go to the library? b. either go to the library or attend his class?

22. The probability that when a student visits his girlfriend, he will bring chocolates is 0.6, and
the probability that he will bring flowers is 0.7. If the probability that he will bring either
chocolates or flowers is 0.5, what is the probability that he will
a. bring both chocolates and flowers? b. not bring either chocolates or flowers?

23. A jar contains black and white marbles. Two marbles are chosen without replacement. The
probability of selecting a black marble and then a white marble is 0.34, and the probability of
selecting a black marble on the first draw is 0.47. What is the probability of selecting a white
marble on the second draw, given that the first marble drawn was black?

24. The probability that it is Friday and that a student is absent is 0.03. Since there are 5 school
days in a week, the probability that it is Friday is 0.2. What is the probability that a student is
absent given that today is Friday?

25. At Kennedy Middle School, the probability that a student takes Technology and Spanish is
0.087. The probability that a student takes Technology is 0.68. What is the probability that a
student takes Spanish given that the student is taking Technology?

SAMPLE PROBLEMS
Directions: Read each question below. Encircle the letter of your answer.

1. Which of the following is an experiment?

a. tossing a coin c. rolling a single 6-sided die


b. choosing a marble from a jar d. all of the above

2. Which of the following is an outcome?

a. rolling a pair of dice c. landing on red


b. choosing 2 marbles from a jar d. none of the above.

3. Which of the following experiments does NOT have equally likely outcomes?

a. choose a number at random from 1 to 7 c. toss a coin


b. choose a letter at random from the word SCHOOL d. none of the above

4. What is the probability of choosing a vowel from the alphabet?

a. 21/26 b. 1/21 c. 5/21 d. none of the above

5. A number from 1 to 11 is chosen at random. What is the probability of choosing an odd
number?

a. 1/11 b. 5/11 c. 6/11 d. none of the above

6. In New York State, 48% of all teenagers own a skateboard and 39% of all teenagers own a
skateboard and roller blades. What is the probability that a teenager owns roller blades given that
the teenager owns a skateboard?

a. 87% b. 81% c. 123% d. none of the above.


7. At a middle school, 18% of all students play football and basketball and 32% of all students
play football. What is the probability that a student plays basketball given that the student plays
football?

a. 56% b. 178% c. 50% d. none of the above.

8. In the United States, 56% of all children get an allowance and 41% of all children get an
allowance and do household chores. What is the probability that a child does household chores
given that the child gets an allowance?

a. 137% b. 97% c. 73% d. none of the above.

9. In Europe, 88% of all households have a television. 51% of all households have a television
and a VCR. What is the probability that a household has a VCR given that it has a television?

a. 173% b. 58% c. 42% d. none of the above

10. In New England, 84% of the houses have a garage and 65% of the houses have a garage and
a back yard. What is the probability that a house has a backyard given that it has a garage?

a) 77% b) 109% c) 19% d) none of the above

11. Which of the following is the sample space when 2 coins are tossed?

a) {H, T, H, T} b) {H, T} c) {HH, HT, TH, TT} d) none of the above

12. At Kennedy Middle School, 3 out of 5 students make honor roll. What is the probability that
a student does not make honor roll?

a) 65% b) 40% c) 60% d) none of the above

13. A large basket of fruit contains 3 oranges, 2 apples and 5 bananas. If a piece of fruit is chosen
at random, what is the probability of getting an orange or a banana?

a) 4/5 b) 1/2 c) 7/10 d) none of the above

14. A pair of dice is rolled. What is the probability of getting a sum of 2?

a) 1/6 b) 1/3 c) 1/36 d) none of the above

15. In a class of 30 students, there are 17 girls and 13 boys. Five are A students, and three of
these students are girls. If a student is chosen at random, what is the probability of choosing a
girl or an A student?

a) 19/30 b) 11/15 c) 17/180 d) none of the above

16. In the United States, 43% of people wear a seat belt while driving. If two people are chosen
at random, what is the probability that both of them wear a seat belt?

a) 86% b)18% c) 57% d) none of the above.

17. Three cards are chosen at random from a deck without replacement. What is the probability
of getting a jack, a ten and a nine in order?
a) 8/16,575 b)1/2197 c) 6/35152 d) none of the above

18. A city survey found that 47% of teenagers have a part time job. The same survey found that
78% plan to attend college. If a teenager is chosen at random, what is the probability that the
teenager has a part time job and plans to attend college?

a) 60% b) 63% c) 37% d) none of the above

19. In a school, 14% of students take drama and computer classes, and 67% take drama class.
What is the probability that a student takes computer class given that the student takes drama
class?

a) 81% b) 21% c) 53% d) none of the above

20. In a shipment of 100 televisions, 6 are defective. If a person buys two televisions from that
shipment, what is the probability that both are defective?

a) 3/100 b) 9/2500 c) 1/330 d) none of the above

21. It is the graphical representation that is very useful for illustrating set operations.

a) histogram b) Venn diagram c) pentagram d) pie chart

22. A committee of 5 pupils is to be selected randomly from a group of 5 male and 10 female
pupils. Find the probability that the committee will consist of 2 male and 3 female pupils.

a) 0.75 b) 0.26 c) 0.33 d) 0.40

23. During 4 successive years, Mr. Santos purchased oil for his furnace at respective costs of
1.83, 1.92, 1.25 and 1.45 cents per gallon. What was the average cost of oil over the 4-year
period?

a) 3.2 cent/gal b) 3.5 cent/gal c) 4.25 cent/gal d) 2.55 cent/gal

24. The number of permutations of letters in the word STATISTICS is

a) 50,000 b) 50,075 c) 50,300 d) 50,400

25. Which of the following is not correct? A frequency distribution can be presented as

a. an estimate c. scatter plot


b. a histogram d. a stem-leaf display

26. The kurtosis of a symmetrical curve is 2.56. The curve therefore is:

a) normal b) platykurtic c) mesokurtic d) leptokurtic

27. Which of the following is a quantitative data?

a) male vs female c) blue eyes vs green eyes


b) left-handed vs right-handed d) salaries of government employees

28. The following figures represent the ages of the members of De Guzman family. What is the
average age for the family?
Father Mother Maggie Brian
45 yrs 41 yrs and 7 mos 14 yrs and 3 mos 12yrs and 6 mos

a) 28 yrs and 4 mos b) 30 yrs c) 28 yrs and 6 mos d) 35 yrs and 4 mos

29. You have calculated the correlation coefficients between two variables to be -.95. This would
indicate

a) a weak linear relationship c) a strong linear relationship


b) no relationship d) no linear relationship

30. In a certain city, the distribution of the 1st and 2nd-born children of 2 children families by
gender is shown below

Second born
First born Female Male Total

Female 540 512 1052


Male 492 456 948
Total 1032 968 2000

What is the probability that a family with 2 children selected at random in this city has
children of the same gender?

a. 0.500 b. 0.270 c. 0.498 d. 0.288

STATISTICS AND PROBABILITY

1. The score of 8 students were 75, 78, 73, 82, 87, 89, 93, 95. What was the average score of
the students?
a. 82 b. 83 c. 84 d. 44

2. The mean of 7 numbers is 63. What is the sum of the numbers?


a. 9 b. 44.1 c. 0.9 d. 441

3. Out of 100 numbers. Sixteen were 5’s, twenty-one were 6’s, thirty were 7’s and the rest
were 8’s. Find the arithmetic mean of the numbers.
a. 6.5 b. 6.8 c. 7.0 d. 7.1

4. Thirty students in a class averaged 80% on a certain exam. Twenty others averaged 90%.
What is the class average?
a. 84 b. 85 c. 86 d. 87

5. If the average annual income of 15 workers is P66,000 and six of the workers made
P30,000 each for the year, what is the average annual income of the remaining 9 workers?
a. P66,000 b. P50,000 c. P30,000 d. P90,000

6. Anna has an average of 87% on 5 exams in Statistics. What must she get in the sixth
exam to average 88% on the six exams?
a. 87 b. 88 c. 93 d. 95

7. The average of 15 numbers is 7. Angelo is adding these numbers and mistakenly reads 6
of the numbers as 9 instead of 7, what average will he get?
a. 7.6 b. 7.8 c. 8 d. 8.2

8. Find the range of the set of numbers: 7, 3, 9, 8, 1, 17


a. 10 b. 7 c. 16 d. 16

9. Two coins are tossed. How many possible outcomes are there?
a. 2 b. 4 c. 8 d. 16

10. Two dice are tossed. How many possible outcomes are there?
a. 12 b. 24 c. 36 d. 42

11. How many numbers of 4 different digits each greater than 5000 can be formed from the
digits 1, 2, 3, 4, 5, 6, 7?
a. 30 b. 60 c. 90 d. 360

12. How many 2-digit numbers of two different digits can be formed from the numbers 0, 2,
4, 6, 8?
a. 12 b. 16 c. 20 d. 25
13. A committee of 7 is to be selected from 8 seniors and 5 juniors. In how many ways can this
be done if the committee must be composed of 4 seniors?
a. 8C4 • 5C4 b. 8C7 • 5C0 c. 8C4 • 5C3 d. 5C4 • 8C3

14. In how many ways can the judges in the Bb. Pilipinas pageant choose the Philippine
representatives to the Miss Universe and Miss World beauty contests from among 5
finalists?
a. 5 b. 10 c. 20 d. 25

15. In how many ways can the first, second and third places be chosen from a group of 10
contestants?
a. 27 b. 30 c. 720 d. 1000
16. In how many ways can 4 persons be seated in a round table?
a. 4 b. 6 c. 24 d. 30

17. A multiple choice test consists of 5 questions, each with 3 possible answers, only one of
which is correct. In how many ways can a student answer the 5 questions and get them all
correct?
a. 1 b. 2 c. 5 d. 15

18. How many straight lines can be drawn, given 5 points no three of which are collinear?
a. 5 b. 10 c. 15 d. 20

19. A pair of dice is tossed. Find the probability of getting a total of 8.

a. 31/36 b. 5/36 c. 29/36 d. 7/36

20. A group of 3 boys and 2 girls are seated in a row of 5 chairs. Find the probability that they will
be seated alternately.
a. 3!2!/5! b. 2/5 c. 3/5 d. 1/5

21. Find the probability of getting a head or a tail in a single toss of a coin?
a. 1 b. 1 c. 1 d. 1
4 2 8

22. A pair of dice is tossed. If one die shows a 5, what is the probability that the other die
shows a
a. 1/36 b. 1/6 c. 1/11 d. 1/12

23. A card is drawn from a deck. What is the probability that the card drawn is an ace?
a. 2/13 b. 1/2 c. 1/13 d. 5/26
24. One card is drawn at random from 100 cards numbered 1 to 100. What is the probability that
the number on the card is divisible by 5?
a. 1/2 b. 1/3 c. 1/4 d. 1/5

25. A basket contains 20 apples, three of which are rotten. If an apple is selected at random, what is the
probability that it is good?
a. 17/20 b. 3/20 c. 17/23 d. 3/23

26. A card is drawn from a well shuffled deck of 52 cards. What is the probability that a card
drawn is a face card?
a. 1/4 b. 3/13 c. 3/52 d. 1/2

27. A box contains 5 red balls, 3 green balls and 4 blue balls. If two balls are drawn in succession
without replacement, what is the probability that both are red?
a. (5/12)(3/11) c. (5/12)(4/11)
b. (5/12)(5/12) d. (4/12)(3/11)

28. A student has 4 different books. In how many ways can he arrange them in a bookshelf?

a. 4 b. 8 c. 12 d. 24

29. In how many ways can 3 boys and 4 girls sit in a row of 7 chairs if the boys and the girls
alternate?
a. 3! 4! b. 2! c. 2! 3! 4! d. 8!
30. Eight children are to be seated in a round table. In how many ways can they be seated?
a. 5! b. 6! c. 7! d. 8!

31. Given S = {2, 5, 9, 7, 8, 8} consider the following statistics


I. The mean score is 6.5.
II. The median is 4.
III. The mode is 8.
Which of the following is true?
a. I only b. I and II only c. I and III only d. II and III only

32. There are 100 envelopes in a box. Of these, 40 contain P50, 30 contain P100, 20 contain P500
and 10 contain P1000. If one draws an envelope at random from the box, what is his expectation?
a. P16 b. P412 c. P200 d. P250

33. Find the value of $\sum_{i=1}^{10} i^2$.

a. 1 b. 10 c. 11 d. 55

34. Find the value of $\sum_{i=1}^{4} i^2$.

a. 1 b. 4 c. 17 d. 30

35. If $\sum_{i=1}^{10} x_i = 34$, $x_1 = 3$, $x_{10} = 7$, find $\sum_{i=1}^{9} x_i$.

a. 31 b. 24 c. 28 d. can’t be determined

36. If P(A) = 0.7, P(B) = 0.5 and P(A∩B) = 0.4, find P(A∪B).
a. 1.6 b. 0.8 c. 0.3 d. 0.1

37. A pair of dice are rolled. Find the probability that the total on the two dice is not 8.
a. 31/36 b. 5/36 c. 29/36 d. 7/36

38. A group of six members gathered for special meeting. Each member has to shake hands with all the
other members. Find the total number of handshakes made?
a. 6 b. 12 c. 10 d. 15

39. From five married couples, 4 people are selected at random. Find the probability of selecting one
woman and 3 men.

P 5 P3 C1  5 C 3 10C 4
a. 10
b. 10! c. 5
d.
10
P4 5!5! 10
C4 10!

40. In how many ways can 10 different marbles be divided among Anton, Bobby and Charles so that 5 are
given to Anton, 3 to Bobby and 2 to Charles?

a. 10P5 • 10P2 • 10P2 c. 10!/(5!3!2!)
b. 10C5 • 10C3 • 10C2 d. 10P10

For numbers 41, 42 and 43


In a survey of 100 students, the following data were obtained as to the number of students
enrolled in Math (M), Science (S) and History (H).

Subjects No. of Students Enrolled

Math only 7
Science only 9
History only 14
Math and Science only 8
Math and History only 3
Science and History only 26
All subjects 14

41. How many students are not taking any of the subjects?
a. 14 b. 19 c. 26 d. 31

42. How many are enrolled in Math?


a. 22 b. 32 c. 7 d. 18
43. How many are enrolled in any of the three subjects?
a. 14 b. 81 c. 85 d. 90

44. A player rolls two dice. He wins if and only if the first die shows an even number or if the two dice
show a sum of 9. Find his probability of winning?
a. 1/2 b. 2/3 c. 5/9 d. 1/3

45. In a board exam, the probability that an examinee will pass each of the three subjects is 0.60. What is
the probability that an examinee will pass at most three subjects?
a. 0.064 b. 0.216 c. 0.784 d. 0.936

46. Find the sum of ways of two 1-peso coin and eight 5-peso coins can be given to street children if
each child get a coins.
a. 15P15 b. 15P15 c. 15! d. 15!
2!5!8! 2!8!

47. The number of permutations of letter a, b, c and d taken 3 at a time is


a. 12 b. 24 c. 4 d. 30

48. A box contains 5 red, 6 white and 5 blue balls. Two balls are chosen at random. What is the
probability that they are both white?
a. 1/8 b. 3/11 c. 5/11 d. 16/11

49. Twenty-one tickets numbered from 1 to 21 are in a box. If two tickets are drawn at random, determine
the probability that both are odd.
a. 11/42 b. 121/441 c. 16/49 d. 10/21

50. Seventeen tickets numbered from 1 to 17 are in box. If two tickets are drawn at random, determine
the probability that the first one is odd and the second one is given.
a. 1 b. 9 c. 1 d. 81
34 4 289

24. A student received a grade of 84 on a final exam in Math for which the average grade was 76
and the standard deviation is 10. On the final exam in Physics, for which the mean grade was 82
and the standard deviation was 16, she received a grade of 90. In which subject was her relative
standing higher?

A. Physics B. Mathematics C. Either of the two D. Neither of the two

7. The number of 911 emergency calls classified as domestic disturbance calls in a large
metropolitan location was sampled for 30 randomly selected 24-hour periods, with the following
results:

25 46 34 45 37 36 40 30 29 37
44 56 50 47 23 40 30 27 38 47
58 22 29 56 46 46 38 19 49 50

Find the average number of calls for the 24-hour period.

A. 38.89 B. 40.00 C. 37.50 D. 45.00

8. Out of 100 numbers, 20 were 4’s, 40 were 5’s and 30 were 6’s. The remainder were 7’s. Find
the average value of the numbers.

A. 5.76 B. 7.26 C. 5.30 D. 5.40

9. Over the years, it has been observed that, of the total number of samples covered in the annual survey
of the Philippine Business Industry, 70% are small (with employment size of at most 9 workers).
It is further noted that 90% of small establishments submit their report while 70% of the large
ones do not submit. What is the probability that a sample for the survey will submit its report?

A. 0.75 B. 0.84 C. 0.80 D. 0.90

10. A fair die is tossed twice. Find the probability of getting a 4, 5 or 6 on the first toss and a 1,
2, 3, or 4 on the second toss.

A. 0.90 B. 0.60 C. 0.33 D. 0.48

11. It is required to seat 5 men and 4 women in a row so that the women occupy the even places.
How many such arrangements are possible?

A. 3200 B. 3000 C. 2880 D. 4000

12. In a given business venture, a lady can make a profit of P300 with probability 0.6 or take a
loss of P100 with probability 0.4. Determine her expectation.

A.140 B.160 C. 170 D.190


13. If 20% of the bolts produced by a machine are defective, determine the probability that out of
4 bolts chosen at random, one will be defective.

A.0.4096 B.0.5120 C.0.6300 D.0.7000

14. The probability that an entering college student will graduate is 0.4. Determine the probability
that out of 5 students, all will graduate.

A.0.50 B.0.01 C. 0.30 D.0.60

15. If the probability of a defective bolt is 0.1, find the (a) mean and the (b) standard deviation
for the distribution of defective bolts in a total of 400.

A.mean = 40, standard deviation = 6 B.mean = 6, standard deviation = 40


C. mean = 40, standard deviation = 40 D. mean = 6, standard deviation = 6

15th PSQ Regional Elimination

Round 1

1. What is the probability of randomly selecting the letter T in the word COMPUTE?

a. 1/6 b. 1/7 c. 6/7 d. 7/1

2. What is the probability of selecting a vowel in the word COMPUTE?

a. 1/7 b. 4/7 c. 3/7 d. 7/1

3. Which of the following statements is not correct in constructing histograms?

a. All class intervals should be of equal widths.


b. The bars of the histograms are centered over the class marks or midpoints.
c. The first and the last classes should be open-ended to account for extreme
points.
d. There should be no spaces between bars.

4. Ben will be given 4 chances to pick up toys at random from a gift bag containing Pinoy
superhero toys: 2 Captain Barbel, 3 Darna and 2 Panday. How many possible collections
of toys can he pull out?

a. 48 b. 840 c. 35 d. cannot be determined

5. Standard deviation of scores in a statistics exam is 10. However, since the exam was
very difficult, the statistics teacher later decided to give a bonus of 5 points to all the
students. What is the standard deviation of the new scores?

a. 10/25 b. 10 c. 15 d. 25

6. In a student body election, where 1800 students voted, the votes for Vi, Guy and Glo are
in the ratio of 4:3:2. If they were the only candidates and none of the 1800 students cast
more than 1 vote, how many votes did Guy receive?

a. 300 b. 400 c. 600 d. 900

7. Which of the following is not correct? A frequency distribution can be presented as


a. an estimate c. scatter plot
b. a histogram d. a stem-leaf display

8. In clinical trials of a new skin astringent, 100 women experienced 1st degree burns,
nausea or both. Of these, a total of 35 women experienced 1st degree burns and 25
experienced both burns and nausea. How many experienced nausea?

a. 25 b. 65 c. 90 d. cannot be determined

9. Which of the following statements is true about a truly normal frequency distribution?

a. The median is always the same as the standard deviation.


b. The mean is never the same as the mode.
c. The mode is never the same as the median.
d. The mean is always the same as the median.

10. The Philippine National Statistics Office reports the following median family income (in
thousand pesos) by income decile for the year 2003. An estimate of the 1st quartile is

DECILE 1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th
MEDIAN INCOME 28 43 55 69 86 106 134 175 242 419

a. P 43,000 b. P 55,000 c. P 28,000 d. P 69,000

11. Which of the following can be attributed only to sample?

a. Range b. Estimate c. Standard Deviation d. Parameter

2nd Round

1. Fill in the missing words to this quote: “Statistical inference methods may be described
as methods for drawing conclusions about______ based on _______ computed from
the_______ “.

a. population, parameter, sample


b. statistics, parameter, sample
c. parameter, statistics, sample
d. population, statistics, sample

2. You have calculated the correlation coefficient between two variables to be -0.95. This
would indicate

a. a strong linear relationship


b. a weak linear relationship
c. no relationship
d. no linear relationship

3. A candy company claims that 10% of its candies are blue. A random sample of 200 of
these candies is taken, and 16 are found to be blue. Which of the following tests would
be the most appropriate for establishing whether the candy company needs to change its
claim?

a. a matched-pairs t-test
b. one-sample proportion z-test
c. two-sample proportion z-test
d. chi-square test of association

4. Consider the set of test scores from the last year’s Phil. Statistics Quiz elimination round.
Supposing that we double all the scores, which of the following sample statistics will be
changed? The mean, the median, or the standard deviation

a. only the mean


b. only the median
c. only the mean and median
d. the mean, median and standard deviation

5. In a certain city, the distribution of the first-born and second-born children of two-child
families by gender is shown below.

                 Second born
First born    Female    Male    Total
Female        540       512     1052
Male          492       456     948
Total         1032      968     2000

What is the probability that a family with 2 children selected at random in this city has
children of the same gender?

a. 0.500 b. 0.270 c. 0.498 d. 0.288
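
As a check on the item above, here is a minimal sketch that reads the two same-gender cells off the table and divides by the total number of families. The variable names are illustrative only.

```python
# Counts taken from the two-way table above (first born by second born).
female_female = 540
male_male = 456
total_families = 2000

# Same-gender families are the two diagonal cells of the table.
p_same_gender = (female_female + male_male) / total_families
print(round(p_same_gender, 3))
```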

6. The Philippine National Statistics Office reports the following median family income (in thousand
pesos) by income decile for the year 2003.

DECILE 1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th
MEDIAN INCOME 28 43 55 69 86 106 134 175 242 419

Approximately what percentage of the families have incomes between P55,000 and P
134,000?

a. 30% b. 40% c. at least 75% d. 95%

7. Let us define a new statistic as the distance between the 70th sample percentile and
30th sample percentile. This new statistic would give us information concerning

a. central tendency c. skewness


b. variability d. symmetry

8. Workplace accidents are categorized in 3 groups: minor, moderate and severe. The
probability that a given accident is minor is 0.5, that it is moderate is 0.4, and that it is
severe is 0.1. Two accidents occur independently in one month. Calculate the probability
that neither accident is severe and at most one is moderate.

a. 0.25 b. 0.40 c. 0.45 d. 0.65
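
One way to work this item, sketched under the assumption that the event means "neither accident is severe and at most one is moderate": start from the probability that both accidents are non-severe, then subtract the case where both are moderate.

```python
# Probabilities stated in the item.
p_minor, p_moderate, p_severe = 0.5, 0.4, 0.1

# Both accidents are independent, so probabilities multiply.
p_neither_severe = (1 - p_severe) ** 2   # both are minor or moderate
p_both_moderate = p_moderate ** 2        # the case excluded by "at most one moderate"

p_event = p_neither_severe - p_both_moderate
print(round(p_event, 2))
```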

Final Round
1. The central limit theorem is important in statistics because

a. it tells us that large samples do not need to be selected.


b. it tells us that, when it applies, the samples that are drawn are always randomly
selected.
c. it enables reasonably accurate probabilities to be determined for events
involving the sample average when the sample size is large, regardless of the
distribution of the data.
d. none of the above

2. The average time that it takes for a person to experience pain relief from aspirin is 25
minutes. A new ingredient is added to help speed up pain relief. Let U denote the
average time to obtain pain relief. A study is conducted to verify if the new product is
better. Which of the following are the most appropriate null and alternative hypotheses
for this experiment?

a. Ho: U= 25 vs. Ha: U< 25


b. Ho: U< 25 vs. Ha: U= 25
c. Ho: U< 25 vs. Ha: U> 25
d. Ho: U= 25 vs. Ha: U> 25

3. Which of the following statements best describes statistical inference?

a. A decision, prediction or generalization about a sample based on information
contained in a population.
b. A statement made about a sample based on the measurements in that sample.
c. A set of data selected from a larger set of data.
d. A decision, estimate, prediction or generalization about the population
based on information contained in a sample.

4. Which of the following statements about confidence interval is correct?

a. If the sample size is fixed, the confidence interval gets wider as we increase the
confidence coefficient.
b. If the population standard deviation increases, the confidence interval
decreases in width.
c. A confidence interval for a mean always contains the sample mean.
d. If the confidence coefficient is fixed, the confidence interval gets narrower as we
increase the sample size

5. You have measured the systolic blood pressure of a random sample of 25 employees of
NSO. A 95% confidence interval for the mean systolic blood pressure for the employees
is computed to be (122, 138). Which of the following statements gives a valid
interpretation of this interval?

a. About 95% of the sample of 25 employees have a systolic blood pressure between
122 and 138.
b. About 25% of all NSO employees have a systolic blood pressure between 122 and
138.
c. If the sampling procedure were repeated many times, approximately 95%
of the resulting confidence intervals would contain the mean systolic blood
pressure of all NSO employees.
d. The probability that the sample mean falls between 122 and 138 is equal to 0.95

6. Ben and Eddie were enrolled in different sections of the same statistics course. Both of
them scored 80% in the finals. However, Eddie’s teacher said that he is in the 60th
percentile in his section while Ben is in the 80th percentile. Which of the following is
correct?

a. Ben’s section generally performed better on the test than Eddie’s section.
b. A person at the 30th percentile in Eddie’s class will be at the 40th percentile in
Ben’s section.
c. A person at the 30th percentile in Ben’s section will be at the 40th percentile in
Eddie’s section.
d. Individuals in Ben’s section generally scored lower on the test than those in
Eddie’s section.

7. The sample mean is an unbiased estimator for the population mean. This means that

a. The sample mean always equals the population mean.


b. The average sample mean, over all possible samples equals the population
mean.
c. The sample mean is always very close to the population mean.
d. The sample mean will only vary a little from the population mean.

8. Some descriptive statistics for a set of test scores that follow a normal distribution are
shown in the table below. For this test, a certain student has a standardized score of Z = -1.2.
What is the raw score of this student?

Variable N Mean Minimum Maximum
score 50 1045.7 628.9 1577.1

a. 266.28 b. 779.42 c. 1008.02 d. 1083.38

9. Suppose the examination grades of a large group of statistics students are approximately
normally distributed with mean 70 and standard deviation 10. The instructor wishes to
award quality point grades 1.0, 1.5, 2.0, 2.5 and 3.0 so that the top 10% of the
students would get 1.0’s. What numerical grade should divide the 1.0’s and the 1.5’s?
( Z – table needed)

a. 0.25 b. 0.40 c. 0.45 d. 0.65
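
In place of a printed Z-table, the cutoff can be sketched with Python's standard library. This assumes the grades are exactly normal with mean 70 and standard deviation 10, as stated in the item; the cutoff is the 90th percentile of that distribution, which comes out near 82.8.

```python
from statistics import NormalDist

# Distribution of examination grades as described in the item.
grades = NormalDist(mu=70, sigma=10)

# The top 10% of students lie above the 90th percentile.
cutoff = grades.inv_cdf(0.90)
print(round(cutoff, 1))
```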

STATISTICS (regional champ 2004 – Adrian quilatan)

1. What law in statistics states that if an experiment is repeated a large number of times, the
relative frequency of an outcome approaches the probability of the outcome?

a. Law of Independent Segregation


b. Law of Large Numbers
c. Law of Theoretical Probability
d. None of these

2. This is a written record of a statistical survey of England ordered by William the
Conqueror. The survey, made in 1086, was an attempt to register the landed wealth of the
country in a systematic fashion, to determine the revenues due to the king. What is this?

a. Das Kapital
b. Domesday Book
c. Principia Mathematica
d. None of these

3. This is the analytic study of the hereditary probabilities of any organism.

a. Genetics
b. Taxonomy
c. Biometrics
d. None of these

4. There are 15 points on a plane. Given that no three points are collinear, how many
triangles can be formed with these points as vertices?

a. 210
b. 105
c. 455
d. 273

5. Which of the following is true about the set:


{ 5, 6, 7, 7, 8, 8, 8, 9, 10, 10, 10, 11, 11}

a. The mean is 9.2


b. The standard deviation is 2.53
c. The mode is 8
d. The variance is 3.32
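
The choices can be checked with the standard library. The sketch below treats the set as a whole population (divisor n), which is an assumption; with the sample formula (divisor n - 1) the variance and standard deviation would be slightly larger.

```python
import statistics

data = [5, 6, 7, 7, 8, 8, 8, 9, 10, 10, 10, 11, 11]

mean = statistics.mean(data)                      # arithmetic mean
modes = statistics.multimode(data)                # every value tied for most frequent
population_variance = statistics.pvariance(data)  # divisor n
population_sd = statistics.pstdev(data)           # square root of the above

print(round(mean, 2), modes, round(population_variance, 2), round(population_sd, 2))
```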

6. The MMDA reports that in the rainy season, there is an increase in the number of
recorded road accidents especially in Quezon City and Caloocan City. Which of the
following is the best conclusion about this report?

a. That the temperature of the rainy season tends to disrupt the psycho-motor control
of the drivers in Quezon City and Caloocan City.
b. That roads become more slippery, causing the increase of accidents.
c. That trailer trucks in those cities tend not to see signs when the rainy season
comes.
d. That MMDA officers are much more efficient during the sunny days.

7. Seven people are seated at a round table. If two people refuse to be moved from their initial
seats, in how many ways can the round table be set up?

a. 720
b. 120
c. 24
d. None of these

8. What do you call the graph of a cumulative frequency distribution?

a. bar graph
b. frequency polygon
c. ogive
d. density graph

9. How many three-digit numbers which are divisible by 5 can be formed from the digits
from 0 to 9 if no repetitions are allowed?

a. 162
b. 810
c. 180
d. 200

10. How many three-digit numbers greater than 800 can be formed from the digits from 0 to
9, if a digit may be repeated?

a. 81
b. 162
c. 200
d. 100

11. On the average, the number of days that an institution closes because of extreme rains
is 5. What is the probability that this institution would close for 4 days?

a. 0.1462
b. 0.1606
c. 0.1456
d. 0.1342

12. At what point on the plot of the cumulative frequency must the two ogives intersect?

a. at the mean frequency


b. at the median frequency
c. at the mode frequency
d. cannot be determined

13. Which of the following is true about the given table below?

Time Slot Network 1 Network 2 Network 3


7:00 – 7:30 15.49 12.24 6.01
7:30 – 8:00 12.98 13.56 7.90

Table 1. TV show ratings from 3 networks at 2 given time slots

a. 12.24 percent of the audiences watch Network 2’s program at 7:30.


b. Network 1 is the modal choice from 7:00 to 8:00.
c. 6.01 percent of the entire country watches Network 3’s program from 7:00 to 7:30.
d. Approximately 13.56 percent of the entire audience population of the country
watches Network 2’s program from 7:30 to 8:00.

14. The kurtosis of a symmetrical curve is 2.43. The curve therefore is:

a. platykurtic
b. mesokurtic
c. leptokurtic
d. normal

15. In how many ways can the valedictorian, salutatorian and first honorable mention be
chosen from a senior high school class of 36 students?

a. 1332
b. 630
c. 46620
d. 7140

16. Ars Conjectandi (The Art of Guessing), published in 1713, is a book that deals with
probabilities concerning gambling, economics and politics. Who wrote this
book?

a. Gregor Mendel
b. Wilhelm Leibniz
c. Jacob Bernoulli
d. Karl Marx

17. If 5 cards are dealt from a standard deck of 52 playing cards, what is the probability that
3 will be spades?

a. 0.0815
b. 0.0961
c. 0.0588
d. 0.7154
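
A minimal sketch of the counting argument, assuming the item asks for exactly 3 spades: choose 3 of the 13 spades and 2 of the 39 non-spades, then divide by the number of 5-card hands.

```python
from math import comb

spades, non_spades, hand_size = 13, 39, 5

# Hypergeometric probability of exactly 3 spades in a 5-card hand.
favorable = comb(spades, 3) * comb(non_spades, hand_size - 3)
possible = comb(spades + non_spades, hand_size)

p_three_spades = favorable / possible
print(round(p_three_spades, 4))
```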

18. Given is a table about the contestants of the 2004 Philippine Statistics Quiz. Assuming
that all contestants are equally intelligent, what is the probability that the PSQ champion
is a male and is an engineering student?

2004 PSQ NATIONAL FINALISTS


REGION SEX COURSE
NCR Male BS Engineering Management
CAR Male BS Electronics & Communications Engineering
1 Female BS Nursing
2 Male BS Civil Engineering
3 Male BS Electronics & Communications Engineering
4A Male BS Electronics & Communications Engineering
4B Male BS Accountancy
5 Male BS Chemical Engineering
6 Male BS Applied Mathematics
7 Female BS Electronics & Communications Engineering
8 Female BS Computer Science
9 Male BS Physics
10 Female Associate in Health Science Education
11 Female BS Applied Mathematics
12 Male BS Civil Engineering
ARMM Male BS Chemistry
CARAGA Male BS Computer Science

Table 2. 2004 Philippine Statistics Quiz National Finals [Regions, Sexes and Courses]

a. 6/17
b. 7/17
c. 8/17
d. 9/17

19. There are 15 balls in a box. 3 are red, 5 are blue and 7 are yellow. If 6 balls are to be
taken from the box, what is the probability that 1 is red, 2 are blue and 3 are yellow?

a. 0.0316
b. 0.2879
c. 0.2098

d. 0.3122
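
The same counting approach applies here, assuming the 6 balls are drawn without replacement: pick 1 of 3 red, 2 of 5 blue and 3 of 7 yellow, and divide by the number of ways to pick any 6 of the 15 balls.

```python
from math import comb

# Ways to draw 1 red, 2 blue and 3 yellow balls.
favorable = comb(3, 1) * comb(5, 2) * comb(7, 3)
# Ways to draw any 6 of the 15 balls.
possible = comb(15, 6)

p_one_two_three = favorable / possible
print(round(p_one_two_three, 4))
```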

20. The Philippine export and import costs in the year 2000 are estimated to be P
12,243,000.00 and P 10,106,000.00 respectively. This data implies that

a. There is a P 2,137,000.00 surplus cost


b. There is a P 2,137,000.00 deficit cost
c. There is a P 3,167,000.00 surplus cost
d. There is a P 3,167,000.00 deficit cost

21. What is the Pearsonian coefficient of skewness of a data from a population whose mean,
median and standard deviation are 23, 22, and 0.9 respectively?

a. 3.45
b. 3.33
c. 0.03
d. 2.89
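
A short sketch of the computation, assuming the median-based form of Pearson's coefficient, 3(mean - median) / standard deviation:

```python
mean, median, standard_deviation = 23, 22, 0.9

# Pearson's median-based coefficient of skewness.
skewness = 3 * (mean - median) / standard_deviation
print(round(skewness, 2))
```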

22. If the letters A, B, …, I and J are placed on a certain line in alphabetical order, in how
many ways can this line be named?

a. 24
b. 120
c. 45
d. 10

23. The scores of a student in a series of five 20 item tests are: 12, 18, 15, 15 and 17.
Accidentally, the instructor replaced the 17 with a 16. By how much will the standard
deviation of the student’s score change after the replacement?
a. 0.18
b. 0.12
c. 0.13
d. 0.15
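
The item does not say whether the sample or the population formula is intended, and the two give different answers. The sketch below computes the change both ways so the difference can be seen.

```python
import statistics

original_scores = [12, 18, 15, 15, 17]
replaced_scores = [12, 18, 15, 15, 16]  # the 17 recorded as 16

# Sample standard deviation (divisor n - 1).
sample_change = statistics.stdev(original_scores) - statistics.stdev(replaced_scores)
# Population standard deviation (divisor n).
population_change = statistics.pstdev(original_scores) - statistics.pstdev(replaced_scores)

print(round(sample_change, 2), round(population_change, 2))
```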
Trial Question

At a pre-school class in Sto. Niño Parochial School, the top three students are considered as
honor students. If the class is composed of 15 students, what portion of the class will not make it
to the honor roll?
a. 60 % b. 40% c. 80% d. 20%

ROUND 1

1. (With material) 5sec


Which of the four diagrams is best used to depict the presence of outliers or extreme
values in the data set?
a. Figure 1 b. Figure 2 c. Figure 3 d. Figure 4

2. (5 sec) The Bureau of Agricultural Statistics collects data on retail prices of selected
agricultural commodities in major trading centers and reports these prices in the BAS media
service. The report includes the prevailing price, lowest price and highest price. Which of the
following gives a quick measure of the variability of the price data?
a. variance b. range c. fractile d. mean

3. (5 sec) In the memorandum of the secretary of the Department of Agriculture regarding the
weekly updates on prices of palay, rice, and corn, Bureau of Agricultural Statistics reports the
prevailing prices as the modal price of each commodity. If the prices of well-milled rice per
kilogram are P22, P25, P24, P22, P23, and P22, what will be reported as the prevailing price of
well-milled rice?
a. P25 b. P23 c. P24 d. P22

4. (10 sec) If the mean absolute deviation of a data set is zero, what does this suggest?
a. All observations are necessarily equal
b. All observations are necessarily equal to 1.
c. All observations are necessarily equal to 0.
d. There is an equal number of positive and negative observations.

5. (30 sec w/ material) The Speedo Wheel Company makes custom wheels for Toyota. The
wheel sizes tend to follow a normal distribution. The mean wheel width is 6.05 inches and the
standard deviation is 0.05 inch. In order for the wheels to fit properly, the diameter of the wheels
must be between 5.90 and 6.10 inches. What is the probability that a randomly chosen wheel
manufactured by this company will fit properly?
a. 68% b. 84% c. 90% d. 95%
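
Without the printed material, the probability can be sketched from the normal distribution directly. This assumes the quoted mean (6.05) and standard deviation (0.05) describe the same dimension as the 5.90 to 6.10 inch tolerance, so the limits sit 3 standard deviations below and 1 standard deviation above the mean.

```python
from statistics import NormalDist

wheel_size = NormalDist(mu=6.05, sigma=0.05)

# Probability that a wheel falls inside the tolerance limits.
p_fits = wheel_size.cdf(6.10) - wheel_size.cdf(5.90)
print(round(p_fits, 2))
```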

6. (40 sec w/ material) The following are monthly salaries of a group of call center employees in
Manila. (Refer to the given table.) What best describes the mean salary of a call center agent?
a. Mean salary is in the modal class.
b. Mean salary is in the median class.
c. Mean salary is in the class interval with 100 employees.
d. Mean salary is in the class interval with 50 employees.

7. (5 sec) Suppose a person throws a dart at a circular board having a radius of one foot. We
assume that he is bound to hit somewhere on the board. Let x denote the distance (in inches) by
which he misses the center of the board. Describe the random variable X.

A. Random variable X is always equal to one foot


B. Random variable X is always equal to zero
C. Random variable X assumes any value of x where 0 ≤ x ≤ 12
D. Random variable X assumes any value of x where 0 ≤ x ≤ 24

8. (30 sec) The committee system is the core of Congress's lawmaking, investigative and oversight
functions. This is so because much of the business of Congress is done in committee. The
Philippine Senate is composed of 24 senators who make up the membership of the senate
permanent committees. Suppose 9 senators were to be picked from 12 opposition senators and
nine administration senators to form a committee, find the probability that 5 opposition senators
will be on the committee.

A. 0.21 B. 0.34 C. 0.47 D. 0.60
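
A sketch of the calculation, assuming the committee is drawn only from the 12 opposition and 9 administration senators named in the item (21 in all) and that exactly 5 opposition members are wanted:

```python
from math import comb

opposition, administration, committee_size = 12, 9, 9

# Choose 5 opposition senators and fill the other 4 seats from the administration.
favorable = comb(opposition, 5) * comb(administration, committee_size - 5)
possible = comb(opposition + administration, committee_size)

p_five_opposition = favorable / possible
print(round(p_five_opposition, 2))
```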

9. (10 sec w/ data) Jean was a census enumerator in the 2007 Census of Population conducted by
the NSO in August. In one of her interviews with the households in the barangay in Manila, she
was able to collect the following variables

Table

Jean coded the variables’ relationship to head, sex and highest grade completed with numerical
figures to facilitate data processing. What type of variable do these coded data represent?

A. Columns 3, 6, and 8 are all nominal variables
B. Columns 3, 6, and 8 are all ordinal variables
C. Columns 3 and 6 are nominal variables while 8 is an ordinal variable
D. Columns 3 and 6 are ordinal variables while 8 is a nominal variable

10. (5 sec) The birth certificate of a child is proof of the state's recognition of a newborn individual's
importance and of his status under the law. Which of the following sets of information consists of
quantitative data?

A. Information about the child name, sex, place of birth and date of the birth
B. Information about the mother: total number of children born alive, number of
children still living including birth
C. Information about the child name, place of birth, weight at birth
D. Information about the father, name, height, occupation, age at the time of this birth

ROUND 2

1. (5 sec with material) After the International Astronomical Union (IAU) had declassified Pluto
as a dwarf planet in 2006, there remained only eight known planets based on the new definition.
The following table shows the distances of these different planets from the sun.

Planet Distance from the sun (million kilometers)
Mercury 57.9
Venus 108.2
Earth 149.6
Mars 227.9
Jupiter 778.3
Saturn 1,426.9
Uranus 2,871.0
Neptune 4,497.1

If you were to compute the mean and the median distances of the eight planets which of the
following statements would be true about the mean and the median?

A. The mean and the median are values that belong in the data set.
B. The mean and the median are values that do not belong in the data set.
C. The mean and the median are computed using all the values in the data set.
D. The mean and the median are computed using only the middle value.

2. (5 sec) Suppose you are certain that 3 of the 5 choices to a particular question are wrong but have to
guess randomly between the remaining choices. What is the probability that you guess the right
answer?

A. 1/2 B. 1/4 C. 1 D. 1/8

3. (5 sec) If a particular male college student in a class got a percentile score of 35 in an
examination, this means that he

A. answered 35% of the questions on the test correctly
B. knew 35% of the material covered by the exam
C. has earned a score equal to or better than 35 students in this class
D. has earned a score equal to or better than 35% of the students in this class

4. (10 sec) Consider the following frequency distribution on the height of 200 college varsity
players. If the frequencies in the first and last class intervals are increased, which of the
following will result?
Class Interval Frequency
59-62 4
63-66 26
67-70 18
71-74 36
75-78 56
79-82 38
83-86 16
87-90 6

A. The standard deviation is reduced.


B. The standard deviation is not affected.
C. The standard deviation is increased.
D. The standard deviation is not affected as long as the increases are balanced on each side
of the mean.

5. (5 sec) A survey was conducted on 60 TV viewers in an urban barangay of a province in
Mindanao. They were asked about the most watched TV show between 7:00 and 10:00 pm every day
in the past week. The survey allowed multiple responses for the most watched TV shows. The
following were the results:

Type of TV Shows Number of TV Viewers


Drama 58
News 54
Sports 26
Comedy 35
Others (cartoons, talk show) 33
Which among the figures presented below is the most appropriate graph for these data?

A. Figure 1 B. Figure 2 C. Figure 3 D. Figure 4

6. (5 sec) Which of the following statements is not a property of a binomial experiment?


A. The experiment consists of a sequence of identical trials.
B. Two outcomes are possible on each trial. We refer to one as success and the other as
failure.
C. The probability of success is p and the probability of failure is (1 - p).
D. The trials are dependent.

7. (40 sec) A rock concert producer has organized an outdoor performance for the benefit of cancer
patients of a certain hospital. He has invited 6 bands to perform at an outdoor stage in Manila.
The earnings will be used to purchase medicines, wheelchairs and other materials and equipment
of the hospital.
The producer expects to earn P 290,000 if the weather is fine on the scheduled day. If it is
a rainy day he expects to earn only P 120,000 and if it is a stormy day he expects a P 270,000
loss. Based on the historical record at that time of the year, the weather office has estimated the
chance of a fine day to be 0.60, the chance of a rainy day to be 0.25 and the chance of a stormy
day to be 0.15.
If the producer decides to pursue the concert, will you agree with his decision?

A. Yes, because his expected profit is P 290,000


B. Yes, because his expected profit is P 163,500
C. Yes, because his expected profit is P 140,000
D. No, because he expects to lose P 270,000
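
The expected profit is the probability-weighted sum of the three outcomes given in the item. A minimal sketch, with the stormy-day loss entered as a negative earning:

```python
# (probability, earnings in pesos) for each kind of weather in the item.
outcomes = {
    "fine": (0.60, 290_000),
    "rainy": (0.25, 120_000),
    "stormy": (0.15, -270_000),  # a loss is a negative earning
}

expected_profit = sum(p * earnings for p, earnings in outcomes.values())
print(round(expected_profit))
```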

8. (5 sec) One of the regular surveys of Bureau of Agricultural Statistics is the Agricultural
Labor Survey. In this survey, data are collected semi-annually for palay and corn and annually
for coconut and sugarcane. In conducting the coconut survey, each data collector seeks coconut
farmers who hired laborers during the reference period until he/she has interviewed ten farmers.
This type of sampling procedure is called:

A. Simple Random Sampling C. Quota Sampling


B. Cluster Sampling D. Stratified Random Sampling

9. (10 sec) Suppose a sample of 50 bus commuters travels to work in an average of 30 minutes a
day, with a population standard deviation of 2.5 minutes. A 95% confidence interval for the true mean is
29.3 minutes to 30.7 minutes. The confidence interval of a population mean is obtained by
$\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.
What would happen to the confidence interval at the same 95% confidence level if the number
of bus commuters in the sample is increased by 100?

A. The confidence interval will be narrower


B. The confidence interval will be wider
C. The confidence interval remains the same
D. The confidence interval cannot be determined
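
The effect of sample size can be sketched numerically. This assumes "increased by 100" means the sample grows from 50 to 150 commuters, with the same z-value of 1.96 and the same population standard deviation of 2.5.

```python
import math

z = 1.96      # z-value for 95% confidence
sigma = 2.5   # population standard deviation from the item

def half_width(sample_size):
    """Margin of error z * sigma / sqrt(n) for a mean with known sigma."""
    return z * sigma / math.sqrt(sample_size)

half_width_50 = half_width(50)
half_width_150 = half_width(150)
print(round(half_width_50, 3), round(half_width_150, 3))
```

Because n sits under a square root in the denominator, any increase in sample size shrinks the margin of error.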

10. (5 sec) There are several methods of measuring degree of association or relationship between
variables. Which of the following is not a tool used in analyzing relationship of variables?

A. Exponential distribution
B. Regression Analysis
C. Pearson’s coefficient of correlation
D. Scatter diagram

ROUND 3 (FINAL ROUND)

1. (15 sec) In a bus terminal, one passenger bus is scheduled to arrive every 4 hours on the hour.
In practice, however, the scheduled bus may not arrive on the exact hour, but it is equally likely
to arrive at any time during the next 60 minutes. Suppose you arrive at the terminal at 5 pm, what
is the probability that you have to wait for more than 10 minutes for the 5 pm bus?

A. 0.456 B. 0.500 C. 0.667 D. 0.833

P(x > 10) = (60 – 10) / (60 – 0) = 0.833

2. (25 sec) Suppose a symbol is used to represent a couple, and a pictogram of the number of
children of these couples is shown below

Table

Number of children

If the data is to be represented in a pie chart, which of the following is best suited to illustrate the
above information?

A. Figure 1 B. Figure 2 C. Figure 3 D. Figure 4

3. (60 sec) Three doughnuts for Floyd’s merienda are to be selected at random from a box of a
dozen doughnuts that contains three Bavarians, four choco honey dip and five choco butternuts.
What is the probability that Floyd will get one choco butternut and two choco honey dip?

A. 0.27 B. 0.18 C. 0.14 D. 0.07
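
A sketch of the counting argument, assuming the three doughnuts are drawn without replacement: choose 1 of the 5 choco butternuts and 2 of the 4 choco honey dips, out of all ways to choose 3 of the 12 doughnuts.

```python
from math import comb

bavarian, honey_dip, butternut = 3, 4, 5

# One choco butternut, two choco honey dip, no Bavarian.
favorable = comb(butternut, 1) * comb(honey_dip, 2) * comb(bavarian, 0)
possible = comb(bavarian + honey_dip + butternut, 3)

p_selection = favorable / possible
print(round(p_selection, 2))
```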

4. (5 sec) Glutathione is one of the more controversial phytochemicals being traded. It is a
sulfur-containing amino acid which manufacturers claim detoxifies our body of free radicals that
could damage cell membranes. What makes it even more popular is its claimed side effect of
producing a lighter, porcelain-like complexion.

Manufacturer A claimed that their brand of 250 mg Glutathione skin whitening pills could
change the skin tone of a brown-skinned Filipina weighing 70 kilos or more in less than 7 weeks
on the average.

Due to its popularity, various manufacturers of Glutathione supplements compete in the market today.
Suppose manufacturers of 250 mg Glutathione skin whitening pills are concerned with how soon the
whitening effect of the pills would be seen. Where will your critical region be located?

A. H0: μ = 70 kg vs Ha: μ < 70 kg; critical region lies on the left side
B. H0: μ = 70 kg vs Ha: μ ≠ 70 kg; critical region lies on both sides
C. H0: μ = 7 weeks vs Ha: μ < 7 weeks; critical region lies on the left side
D. H0: μ = 7 weeks vs Ha: μ > 7 weeks; critical region lies on the right side

5. (60 sec) A barangay official hypothesized that the proportion of the residents who favored the
construction of a public utility terminal in their subdivision is 75%. A survey was then conducted
in the subdivision with 600 residents to elicit their responses. Out of 100 respondents, 80% were
in favor of having a jeepney terminal within this subdivision. At the 95% confidence level, is
there a reason to accept the barangay official's hypothesis?

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n-1}}\,\sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{.8(.2)}{99}}\,\sqrt{\frac{600-100}{600-1}}$$

and the z-value for $\alpha/2$ is 1.96 ($\alpha = .05$)

A. Yes, because the barangay official's hypothesized proportion lies within the 95%
confidence interval
B. Yes, because the barangay official's hypothesized proportion is 75% and is close to 80%
C. Yes, because the barangay official should know his business
D. No, because 75% is not equal to 80%
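
The interval can be sketched from the quantities given in the item. This assumes the standard error uses the sample proportion 0.8 with divisor n - 1 and the finite population correction (N - n) / (N - 1), and that the interval is the sample proportion plus or minus 1.96 standard errors.

```python
import math

N, n = 600, 100          # residents in the subdivision, respondents
p_hat = 0.80             # sample proportion in favor
p_hypothesized = 0.75    # the official's claim
z = 1.96                 # z-value for alpha / 2 with alpha = 0.05

# Standard error with the finite population correction.
standard_error = math.sqrt(p_hat * (1 - p_hat) / (n - 1)) * math.sqrt((N - n) / (N - 1))

lower = p_hat - z * standard_error
upper = p_hat + z * standard_error
claim_inside_interval = lower <= p_hypothesized <= upper

print(round(lower, 3), round(upper, 3), claim_inside_interval)
```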

6. (40 sec) XYZ, a leading national TV broadcasting company, claims to have a 60% share of household
viewers for its daily noontime show. Suppose ZTE, a rival broadcasting company, wants to verify
the claim by conducting a survey of 20 households which begins at the time of the noontime
show.

If the claim of XYZ broadcasting company is true, what is the probability that exactly 10
households are tuned in to their noontime show?
A. 0.12 B. 0.18 C. 0.59 D. 0.87
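
Treating each of the 20 households as an independent trial with success probability 0.6 (an assumption that makes this a binomial experiment), the probability of exactly 10 tuned-in households can be sketched as:

```python
from math import comb

households, p_tuned_in, successes = 20, 0.6, 10

# Binomial probability mass function: C(n, k) * p^k * (1 - p)^(n - k).
p_exactly_ten = (
    comb(households, successes)
    * p_tuned_in ** successes
    * (1 - p_tuned_in) ** (households - successes)
)
print(round(p_exactly_ten, 2))
```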

7. (5 sec) Based on a study by the Livestock and Poultry Division of the Bureau of Agricultural
Statistics, a regression analysis was used to determine if the demand for broiler in metric tons (Y)
can be explained by the following variables: retail price of chicken (X1), retail price of beef (X2),
and the personal consumption expenditure (X3), all in pesos.

The regression equation is

$Y_t = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + e$

The results of the study gave the following regression equation

Broiler Demand = 5.2 – 0.4 * (Retail Price of Chicken) +
                 1.2 * (Retail Price of Beef) +
                 0.5 * (Personal Consumption Expenditure)

If the retail price of chicken increased by P2.00 and all other variables are held constant, what is
the change in the broiler demand?

A. Broiler demand decreased by 0.4 metric tons


B. Broiler demand increased by 5.2 metric tons
C. Broiler demand increased by 1.2 metric tons
D. Broiler demand decreased by 0.8 metric tons
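
Since the other variables are held constant, the change in demand is the chicken-price coefficient times the price change. The sketch below shows this with the fitted equation; the baseline prices are hypothetical and cancel out in the difference.

```python
def broiler_demand(chicken_price, beef_price, consumption_expenditure):
    """Fitted regression equation quoted in the item (demand in metric tons)."""
    return 5.2 - 0.4 * chicken_price + 1.2 * beef_price + 0.5 * consumption_expenditure

# Hypothetical baseline values, used only to illustrate the difference.
before = broiler_demand(100.0, 200.0, 50.0)
after = broiler_demand(102.0, 200.0, 50.0)   # chicken price up by P2.00

change_in_demand = after - before
print(round(change_in_demand, 2))
```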

8. (20 sec) The 2003 Functional Literacy, Education and Mass Media Survey (FLEMMS) conducted by NSO
aims to measure, among other things, the simple or basic and the functional literacy rates of the
population.

Simple or basic literacy rate is defined as the proportion of the population aged 10 and over who
can read and write with understanding in a language or dialect. “Functional literacy,” on the
other hand, refers to the proportion of the population aged 10 to 64 yrs old who have reading,
writing, and numeric skills.

Using the table below, what can be said about the results of the 2003 FLEMMS in the National
Capital Region (NCR)?

Number and Percent of the Population 10 Years Old and Over by Literacy and Sex, NCR: 1994 and 2003

NCR                                          1994                     2003
                                             Total   Male   Female    Total   Male   Female
Population 10 yrs and over (in thousands)    6,706   3,194  3,513     8,318   3,994  4,324
Population 10-64 yrs old (in thousands)      6,069   2,879  3,190     7,711   3,693  4,019
Simple Literacy Rate (%)                     98.8    98.9   98.8      99.0    98.9   99.1
Functional Literacy Rate (%)                 92.4    91.8   93.0      94.6    94.0   95.2
Source: 2003 FLEMMS, NSO

A. Total number of persons with basic literacy in 1994 is around 6,628


B. The number of persons with basic literacy grew by 0.2% from 1994 to 2003
C. There are more than 3.8 million functionally literate women in 2003
D. Literacy rate in the Philippines is higher in 2003 than in 1994

9. (60 sec) If the quality grade-point averages of a random sample of college seniors are
normally distributed with a mean μ and standard deviation of 0.3, how large a sample is
required if we want to be 95% confident that our estimate of μ will not be off by more than 0.05?

A. 139 B. 564 C. 239 D. 98
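
The required sample size follows from solving z * sigma / sqrt(n) <= E for n and rounding up. A minimal sketch, assuming z = 1.96 for 95% confidence:

```python
import math

z = 1.96                 # z-value for 95% confidence
sigma = 0.3              # standard deviation of the grade-point averages
margin_of_error = 0.05   # largest acceptable error in the estimate of the mean

# n = (z * sigma / E)^2, rounded up to the next whole student.
required_sample_size = math.ceil((z * sigma / margin_of_error) ** 2)
print(required_sample_size)
```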

10. (40 sec) A researcher assigned a number from 1 to 500 to his population of college students.
He needs a sample of 20 students for his study. If he uses systematic sampling, who will be the
10th member of his sample given that the first member is the 25th student in the population?

A. 250th student C. 275th student


B. 274th student D. 290th student
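
In systematic sampling the sampling interval is k = N / n, and every k-th unit after the starting point is taken. A sketch under that assumption, with the first member fixed at the 25th student as stated in the item:

```python
population_size = 500
sample_size = 20
first_member = 25   # starting point given in the item

# Sampling interval: every k-th student is selected.
interval = population_size // sample_size

# Positions of all 20 sampled students in the population list.
sample_positions = [first_member + i * interval for i in range(sample_size)]
tenth_member = sample_positions[9]
print(interval, tenth_member)
```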

ADDITIONAL TOPICS

Blaise Pascal and Pierre de Fermat, French mathematicians, through correspondence in 1654
on a problem in gambling, began the mathematical study of probability.

An unpublished piece by Pascal on gambling stimulated Dutch scientist Christiaan Huygens to
publish a small work in 1657 on the probabilities in dice games. Swiss
mathematician Jakob Bernoulli reprinted this work in his Ars Conjectandi (Art of Conjecturing)
published in 1713. Both Bernoulli and French-English mathematician Abraham De Moivre, in
his Doctrine of Chances (1718), applied the newly discovered calculus to probability. They thus
made advances in probability theory, which by then had important applications in the rapidly
developing insurance industry.

Carl Friedrich Gauss, German mathematician, is noted for his wide-ranging contributions to
physics, particularly the study of electromagnetism. In probability, he developed the
important method of least squares and the fundamental laws of probability distributions. The
normal probability graph is still called the Gaussian curve.

Bayes’ Theorem - a theorem that relates the probability of particular events taking place to the
probability that events conditional upon them have occurred.

Karl Pearson, British mathematician and philosopher of science, is best known for
developing some of the central techniques of modern statistics, and for applying these techniques
to the problem of biological inheritance or heredity. Pearson's research laid much of the
foundation for 20th-century statistics, defining the meanings of correlation, regression analysis,
and standard deviation.

Francis Galton, British scientist, attempted to find statistical relationships to explain how
biological characteristics were passed down through succeeding generations.

The Roman Empire was the first government to gather extensive data about the population,
area, and wealth of the territories that it controlled.

Census is a term usually referring to an official count by a national government of its country’s
population. A population census determines the size of a country’s population and the
characteristics of its people, such as their age, sex, ethnic background, marital status, and income.
National governments also conduct other types of censuses, particularly of economic activity. An
economic census collects information on the number and characteristics of farms, factories,
mines, or businesses.

Uses of Census Information

Governments use census information in almost all aspects of public policy.

1. The population census is used to determine the number of representatives each area within
the country is legally entitled to elect to the national legislature.
2. In the United States, the Constitution provides that seats in the House of Representatives
should be apportioned to the states according to the number of their inhabitants.
3. Census counts determine how many seats each state should have in the House and in the
electoral college, the body that nominally elects the president and vice president of the
United States. This process is known as reapportionment.
4. States use population census figures as a basis for allocating delegates to the state
legislatures and for redrawing district boundaries for seats in the House, in state
legislatures, and in local legislative districts.
5. In Canada, census population data are similarly used to apportion seats among the provinces
and territories in the House of Commons and to draw electoral districts.
6. Governments also find population census information of great value in planning public
services because the census tells how many people of each age live in different areas. These
governments use census data to determine how many children an educational system must serve, to
allocate funds for public buildings such as schools and libraries, and to plan public
transportation systems. They can also determine the best locations for new roads, bridges,
police departments, fire departments, and services for the elderly.

Besides governments, many others use census data. Private businesses analyze population and
economic census data to determine where to locate new factories, shopping malls, or banks; to
decide where to advertise particular products; or to compare their own production or sales
against the rest of their industry. Community organizations use census information to develop
social service programs and child-care centers. Censuses make a huge variety of general
statistical information about society available to researchers, journalists, educators, and the
general public.

Consumer Price Index (CPI)

COMPONENTS

A. Market Basket

The CPI market basket contains a sample of goods and services commonly purchased by a group
of households in a particular area. The 1988-based CPI series has 13 regional market baskets
which are the combined market baskets of the bottom-30% and upper-70% income groups for
each region.

Computation of the provincial CPI uses the market basket for the region where the province
belongs.

The number of items comprising the market basket for all-income group for each region is
shown below:

Metro Manila = 384      Region 7  = 500
Region 1     = 548      Region 8  = 524
Region 2     = 565      Region 9  = 549
Region 3     = 526      Region 10 = 619
Region 4     = 651      Region 11 = 635
Region 5     = 525      Region 12 = 582
Region 6     = 645

B. Weighting System

Weights used in the current CPI series were derived from the results of the 1988 Family Income
and Expenditures Survey (FIES). The weight is computed as the proportion of expenditure on a
specific group of items to total expenditure.

Aggregated weights for the six major item groups are shown below:

Weights for the CPI (1988 = 100)

Commodity Group                Philippines    Metro Manila    Areas Outside Metro Manila
All Items                           100.00           24.42                         75.58
Food, Beverage and Tobacco           58.47           11.84                         46.63
Clothing                              4.35            0.87                          3.49
Housing and Repairs                  13.30            5.40                          7.90
Fuel, Light and Water                 5.36            1.45                          3.91
Services                             10.90            3.32                          7.58
Miscellaneous                         7.59            1.53                          6.06

C. Base Period

The CPI series constructed by NSO since 1945 has undergone several revisions. The 1988-based
CPI, the current series, is the sixth rebasing.

D. Index Formula

The construction of the CPI basically uses a Laspeyres Formula (fixed base year weights).

The formula is modified as the weighted arithmetic mean of price relatives. That is,


        Sum (Pn / Po)(Po*Qo)
Index = --------------------
             Sum Po*Qo

where:

Pn = current price
Po = base year price or base price
Po*Qo = base year weight
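The formula above can be sketched in a few lines of Python. This is only an illustration: the three-item basket and its prices and quantities are invented, and the result is multiplied by 100 on the assumption that the index is expressed with the base period equal to 100.

```python
def laspeyres_index(current_prices, base_prices, base_quantities):
    """Weighted arithmetic mean of price relatives with base-year weights."""
    if not (len(current_prices) == len(base_prices) == len(base_quantities)):
        raise ValueError("all three lists must have the same length")
    # Base-year weight of each item: Po * Qo
    weights = [p0 * q0 for p0, q0 in zip(base_prices, base_quantities)]
    # Price relative of each item: Pn / Po
    relatives = [pn / p0 for pn, p0 in zip(current_prices, base_prices)]
    weighted_sum = sum(r * w for r, w in zip(relatives, weights))
    # Scaled by 100 (assumption: base period index = 100).
    return 100.0 * weighted_sum / sum(weights)


# Hypothetical basket of three items.
base_prices = [20.0, 50.0, 10.0]      # Po
base_quantities = [10, 2, 30]         # Qo
current_prices = [25.0, 55.0, 12.0]   # Pn

index = laspeyres_index(current_prices, base_prices, base_quantities)
print(round(index, 2))
```

Because the weights Po*Qo are fixed at base-year values, only the current prices change from one period to the next.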

E. Sample Outlets

Sample outlets are establishments where prices of sample commodities are quoted. There are
about 9,000 outlets nationwide.

The outlets were selected according to the following criteria:

1. Popularity of the establishment along the line of goods to be priced
2. Permanency of outlet
3. Consistency or completeness of stock
4. Accessibility of outlet

The selected outlets are permanent sources of price data. An outlet may be replaced only when
necessary, for one of the following reasons:

1. Closing of business
2. Disappearance of item from the stock for more than three consecutive months or permanently

An outlet may be abandoned completely or only partly, i.e., when one or more items in the survey
list have disappeared from its stock. It is replaced with the nearest retail outlet, that is, one
within the vicinity of the replaced outlet. The choice of substitute outlet is left to the
discretion of the price canvasser, using the criteria for regular outlet selection. Once a
substitute outlet has been selected, it becomes a permanent outlet for the succeeding survey rounds.

DEFINITION OF TERMS

A. Consumer Price Index (CPI)

Consumer price index (CPI) is a measure of change in the average retail prices of goods
and services commonly purchased by a particular group of people in a particular area.

B. Market Basket

Market basket refers to a sample of goods and services used to represent all goods and
services bought by a particular group of consumers in a particular area.

C. Base Period

Base period, usually a year, is the reference period of the index number. It is the period at
which the index is set to 100.

D. Sample Outlets

Sample outlets are outlets or establishments where prices of sample commodities are
quoted.

E. Weight

Weight is a value attached to a commodity or group of commodities to indicate the
relative importance of that commodity or group of commodities in the market basket.

F. Inflation Rate

Inflation rate (IR) is the annual rate of change or year-on-year change in CPI. That is,

                      CPIn - CPIo
Inflation Rate (IR) = ----------- x 100
                         CPIo

where:

CPIn = current month's index for all items
CPIo = same month last year's index for all items
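A minimal sketch of this computation, using made-up index values rather than actual published figures:

```python
def inflation_rate(cpi_current, cpi_year_ago):
    """Year-on-year percentage change in the all-items CPI."""
    if cpi_year_ago == 0:
        raise ValueError("last year's index must be non-zero")
    return (cpi_current - cpi_year_ago) / cpi_year_ago * 100.0


# Hypothetical values: the index moved from 150.0 to 162.0 over twelve months.
rate = inflation_rate(162.0, 150.0)
print(round(rate, 2))
```

A negative result would indicate that the index fell over the twelve months.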

G. Purchasing Power of the Peso

Purchasing Power of the Peso (PPP) shows how much the peso in the base period is
worth in another period. It gives an indication of the real value of the peso in a given
period relative to the peso value in the base period.

Purchasing Power of the Peso (PPP) = (1 / CPI(All Items)) x 100
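For illustration, the short sketch below applies the formula to a hypothetical all-items CPI of 125. The function name is an assumption made for this example.

```python
def purchasing_power_of_peso(cpi_all_items):
    """Value of one peso relative to the base period, where CPI = 100."""
    if cpi_all_items <= 0:
        raise ValueError("the CPI must be positive")
    return 1.0 / cpi_all_items * 100.0


# Hypothetical all-items CPI of 125 (base period = 100).
ppp = purchasing_power_of_peso(125.0)
print(round(ppp, 2))
```

With a CPI of 125, one peso buys what 80 centavos bought in the base period.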

H. General Wholesale Price Index

Wholesale price index (WPI) measures the monthly changes in the general price level of
commodities (usually in large quantities) that flow into the wholesale trading system.

The 1978-based WPI series has a total of 376 commodities or items traded in the
wholesale market. These items include producer's materials, consumer goods and capital
goods which may either be raw materials, intermediate products or finished goods.
Moreover, they may also be domestically produced (including exports) or imported for
resale. These items are grouped according to the Philippine Standard Commodity
Classification (PSCC).

The weights of the current WPI utilize the value of sales of commodities traded in the
wholesale market in 1978 as derived from the 1974 Input-Output tables. The index covers only
the National Capital Region or Metro Manila. The weighted average of relatives method,
basically the Laspeyres formula, is used in the construction of the WPI.

I. Retail Price Index

Retail price index (RPI) is a measure of the changes in the retail price at which retailers
dispose of their goods to consumers or end-users.

The current RPI still uses 1978 as the base year and covers only the National Capital
Region (NCR) or Metro Manila.

While the 1972-based series was computed using the geometric mean without any
weighting pattern, the present series is constructed using the weights based on the 1974
Input-Output tables on the values of expenditures of goods and services of consumers
from the retail sector, estimated at 1978 prices. The weighted average of relatives method,
basically the Laspeyres formula, is used in the computation of the index.

The present market basket has a total of 479 commodities grouped according to the Philippine
Standard Commodity Classification (PSCC).

Agricultural production – the growing of field crops, fruits, nuts, seeds, tree nurseries (except
those of forest trees), bulb vegetables and flowers, both in the open and under glass; and
the production of coffee, tea, cocoa, rubber; and the production of livestock and livestock
products, honey, rabbits, fur-bearing animals, silkworm cocoons, etc. Forestry and fishery
production carried on as an ancillary activity on an agricultural holding is also considered
as agricultural production.

Constant Prices (at constant prices) – valuation of transactions, wherein the influence of price
changes from the base year to the current year has been removed.

Gross Domestic Product – the value of all goods and services produced domestically; the sum
of gross value added of all resident institutional units engaged in production (plus any
taxes, and minus any subsidies, on products not included in the values of their outputs).

Gross Regional Domestic Product - aggregate of the gross value added or income from each
industry or economic activity of the regional economy.

Gross National Product – the Gross Domestic Product adjusted with the net factor income from
the rest of the world. It refers to the aggregate earnings of the factors of production
(nationals) plus indirect taxes (net) and capital consumption allowance.

Gross Value Added – the difference between gross output and intermediate inputs. The gross
output of a production unit during a given period is equal to the gross value of the goods
and services produced during the period and recorded at the moment they are produced,
regardless of whether or not there is a change of ownership. Intermediate inputs refer to
the value of goods and services used in the production process during the accounting
period.

Personal Consumption Expenditures - consist of actual and imputed expenditures of
households for the purpose of acquiring individual consumption goods and services.

Basic or Simple Literacy - the ability to read and write with understanding simple messages in
any language or dialect

Functional Literacy – represents a significantly higher level of literacy which includes not only
reading and writing skills but also numeracy skills. The skills must be sufficiently
advanced to enable the individual to participate fully and effectively in activities
commonly occurring in his life situation that require a reasonable capability beyond oral
and written communication.

Base Period - usually a year, is the reference period of the index number. It is the period at
which the index is set to 100.

Consumer Price Index (CPI) - measure of the average changes in the prices of a fixed basket of
goods and services usually purchased by households for their consumption.

Wholesale Price Index (WPI) - measure of the changes in the price level of commodities that
flow into the wholesale trade intermediaries.

Retail Price Index (RPI) - measure of the changes of the prices at which retailers dispose of
their goods to consumers and end-users.

Family expenditures - refer to the expenses or disbursements made by the family purely for
personal consumption during the calendar year 1997. They exclude all expenses in
relation to farm or business operation, investment ventures, purchase of real property and
other disbursements which do not involve personal consumption.

Income from other sources - include imputed rental values of owner-occupied dwelling units,
interests, rentals including landowner's share of agricultural products, pensions, support and
the value of food and non-food items received as gifts by the family (as well as the imputed
value of services rendered free of charge to the family).

Per capita income - is obtained by dividing the total family income by the total number of
family members.

Primary income - includes salaries and wages, commissions, tips, bonuses, family and clothing
allowance, transportation and representation allowances, honoraria, and other forms of
compensation and net receipts derived from the operation of family-operated
enterprises/activities and the practice of a profession or trade.

Total family income - includes primary income and receipts from other sources received by all
family members during the calendar year 1991 as participants in any economic activity or as
recipients of transfers, pensions, grants, etc.

Magnitude of the Poor - the number of families or the population whose annual per capita
income falls below the subsistence/poverty threshold.

Poverty Incidence - proportion of families/population whose annual per capita income falls
below the annual per capita poverty threshold to the total number of families/population

Poverty Threshold – annual per capita income required or the amount to be spent to satisfy
nutritional requirements (2,000 Kcal) and other basic needs
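The definitions of per capita income, magnitude of the poor and poverty incidence can be combined in a short sketch. The family records and the threshold value below are invented for illustration and are not official figures.

```python
def per_capita_income(total_family_income, family_size):
    """Total family income divided by the number of family members."""
    return total_family_income / family_size


def poverty_measures(families, poverty_threshold):
    """Return (magnitude of the poor, poverty incidence) for a list of families.

    `families` holds (total annual family income, number of members) pairs.
    """
    poor = [
        (income, size)
        for income, size in families
        if per_capita_income(income, size) < poverty_threshold
    ]
    magnitude = len(poor)                  # number of poor families
    incidence = magnitude / len(families)  # proportion of all families
    return magnitude, incidence


# Hypothetical data: four families and an annual per capita threshold of 14,000.
families = [(60000, 5), (120000, 4), (45000, 6), (200000, 5)]
magnitude, incidence = poverty_measures(families, 14000)
print(magnitude, incidence)
```

Here two of the four families have a per capita income below the threshold, so the incidence among families is 0.5, or 50 percent.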

Subsistence Incidence - proportion of families/population whose annual per capita income falls
below the annual per capita food/subsistence threshold to the total number of families/population

Subsistence and Food Threshold – annual per capita income required or the amount to be spent
to satisfy nutritional requirements (2,000 Kcal)

Average Total Employment – the sum of total employment during the pay periods nearest the
middle of each quarter (Feb. 15, May 15, Aug. 15, and Nov. 15), divided by four (the number of
quarters).
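As a quick illustration with made-up employment counts for the four mid-quarter pay periods:

```python
def average_total_employment(quarterly_employment):
    """Mean of the four mid-quarter employment counts."""
    if len(quarterly_employment) != 4:
        raise ValueError("expected one employment count per quarter")
    return sum(quarterly_employment) / 4


# Hypothetical counts for the pay periods nearest Feb. 15, May 15, Aug. 15, Nov. 15.
counts = [120, 130, 125, 135]
average = average_total_employment(counts)
print(average)
```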
Balance of Payments (BOP) – statistical statement that systematically summarizes, for a
specific period, the economic transactions of a country with the rest of the world. Transactions,
for the most part between residents and non-residents, consist of those involving goods, services
and income; those involving financial claims on and liabilities to the rest of the world; and those
(such as gifts) classified as transfers which are real resources and financial claims provided to, or
received from the rest of the world without the corresponding resources and financial claims
received or given in exchange.

What are the indicators used to measure the attainment of MDGs?

MDG – MILLENNIUM DEVELOPMENT GOALS

The United Nations Secretariat, specialized agencies of the UN system, and representatives of
the International Monetary Fund (IMF), the World Bank and Organization for Economic Co-
operation and Development (OECD) as well as international experts identified and selected
the 48 MDG indicators.

The millennium indicators for each goal are as follows:

Goal 1. Eradicate Extreme Poverty and Hunger

• Proportion of population below $1 (PPP) per day
• Poverty gap ratio
• Share of poorest quintile in national consumption
• Prevalence of underweight children under 5 years of age
• Proportion of population below minimum level of dietary energy consumption

Goal 2. Achieve Universal Primary Education

• Net enrolment ratio in primary education
• Proportion of pupils starting grade 1 who reach grade 5
• Literacy rate of 15-24 year-olds

Goal 3. Promote Gender Equality And Empower Women

• Ratios of girls to boys in primary, secondary and tertiary education
• Ratio of literate females to males of 15-24 year-olds
• Share of women in wage employment in the non-agricultural sector
• Proportion of seats held by women in national parliament

Goal 4. Reduce Child Mortality

• Under-five mortality rate
• Infant mortality rate
• Proportion of 1 year-old children immunized against measles

Goal 5. Improve Maternal Health

• Maternal mortality ratio
• Proportion of births attended by skilled health personnel

Goal 6. Combat HIV/AIDS, Malaria and Other Diseases

• HIV prevalence among 15-24 year old pregnant women
• Condom use rate of the contraceptive prevalence rate
• Number of children orphaned by HIV/AIDS (to be measured by the ratio or proportion of
orphans to non-orphans aged 10-14 who are attending school)
• Prevalence and death rates associated with malaria
• Proportion of population in malaria risk areas using effective malaria prevention and
treatment measures
• Prevalence and death rates associated with tuberculosis
• Proportion of tuberculosis cases detected and cured under directly observed treatment
short course (DOTS)

Goal 7. Ensure Environmental Sustainability

• Proportion of land area covered by forest
• Ratio of area protected to maintain biological diversity to surface area
• Energy use (kg oil equivalent) per $1 GDP (PPP)
• Carbon dioxide emissions (per capita) and consumption of ozone-depleting CFCs (ODP
tons)
• Proportion of population using solid fuels
• Proportion of population with sustainable access to improved water source, urban and
rural
• Proportion of urban population with access to improved sanitation
• Proportion of households with access to secure tenure (owned or rented)

Goal 8. Develop A Global Partnership For Development

• Proportion of total bilateral, sector-allocable ODA of OECD/DAC donors to basic social
services (basic education, primary health care, nutrition, safe water and sanitation)
• Proportion of bilateral ODA of OECD/DAC donors that is untied
• ODA received in landlocked countries as proportion of their GNIs
• ODA received in small island developing States as proportion of their GNIs
• Market access
• Proportion of total developed country imports (by value and excluding arms) from
developing countries and LDCs, admitted free of duties
• Average tariffs imposed by developed countries on agricultural products and textiles and
clothing from developing countries
• Agricultural support estimate for OECD countries as percentage of their GDP
• Proportion of ODA provided to help build trade capacity (OECD and WTO are collecting
data that will be available from 2001 onwards)
• Debt sustainability
• Total number of countries that have reached their HIPC decision points and number that
have reached their HIPC completion points (cumulative)
• Debt relief committed under HIPC initiative, US$
• Debt service as a percentage of exports of goods and services
• Unemployment rate of 15-24 year olds, each sex and total (an improved measure of the
targets is under development by ILO for future years)
• Proportion of population with access to affordable essential drugs on a sustainable basis
• Telephone lines and cellular subscribers per 100 population
• Personal computers in use per 100 population and internet users per 100 population
