
Stat For Comp (CH 1-5)

Uploaded by

ruthnsr066816
Copyright
© © All Rights Reserved

CHAPTER ONE

INTRODUCTION

Introduction
Statistical figures are part of daily life. You may hear them on radio and television or read them in
newspapers and magazines.
Statistics is used in almost all fields of human endeavor. In sports, for example, a statistician may keep
records of the number of yards a running back gains during a football game, or the number of hits a
baseball player gets in a season. In other areas, such as public health, an administrator might be concerned
with the number of residents who contract a new strain of flu virus during a certain year. In education, a
researcher might want to know if new methods of teaching are better than old ones. These are only a few
examples of how statistics can be used in various occupations. Furthermore, statistics is used to analyze
the results of surveys and as a tool in scientific research to make decisions based on controlled
experiments.

We begin the course with some concepts of basic data analysis. Before undertaking a more sophisticated
analysis, we must first know how to understand, display and summarize large amounts of qualitative and
quantitative information. Thus, this chapter begins by defining and classifying statistics and by stating
and explaining the stages in any statistical investigation. It then defines some basic terms such as
population, sample, sampling, sample size, census, sample survey, parameter and statistic, discusses the
applications, uses and limitations of statistics, and defines and identifies the types of variables and
measurement scales.

1.1. Definition and classification of Statistics

Definition: We can define statistics in two ways.


1. Plural sense (layman's definition): In this sense, statistics is an aggregate or collection of
numerical facts.
2. Singular sense (formal definition): In this sense we can define statistics as the science of collecting,
organizing, presenting, analyzing and interpreting numerical data for the purpose of assisting in
making a more effective decision.

Classifications: Depending on how data are used, statistics is sometimes divided into two main areas
or branches.
1. Descriptive Statistics: consists of the collection, organization, summarization, and presentation of
data. For example, the Ethiopian national census, which is conducted by the government every 10 years,
gives the average age, income, and other characteristics of the Ethiopian population. To obtain this
information, the Census Bureau must have some means to collect relevant data. Once data are collected,
the bureau must organize, summarize and present them in some meaningful form, such as charts, graphs,
or tables.
2. Inferential Statistics: is a method used to generalize from a sample to a population. Here, statistical
techniques based on probability theory are required because the data used to generalize about a
population usually come from a sample. For example, the average income of all families (the
population) in Ethiopia can be estimated from figures obtained from a few hundred (the sample) families.

ACTIVITY 1.1

1. Classify the following sentences as belonging to the area of descriptive statistics or inferential
statistics.
a) As a result of recent cutbacks by oil-producing nations, we can expect the price of gasoline to
double in the next year.
b) At least 5% of all killings reported last year in city X were due to tourists.
c) Of all patients who received this particular type of drug at a clinic Y, 75% later developed
significant side effect.
d) Adane concludes that his chance of passing the first year this academic year is at least 80% based
on the statistics that 75% of the freshmen passed last year.

1.2. Stages in statistical investigation


There are five stages or steps in any statistical investigation.
1. Collection of data: the process of measuring, gathering and assembling the raw data upon which the
statistical investigation is to be based. Data can be collected in a variety of ways; one of the most
common is the survey. Surveys can themselves be carried out in different ways, three of the most
common being the telephone survey, the mailed questionnaire and the personal interview.
2. Organization of data: Summarization of data in some meaningful way, e.g. in table form.
3. Presentation of data: The process of re-organization, classification, compilation, and summarization
of data to present it in a meaningful form.
4. Analysis of data: The process of extracting relevant information from the summarized data, mainly
through the use of elementary mathematical operations.
5. Inference of data: The interpretation and further observation of the various statistical measures
obtained through the analysis of the data, by implementing those methods by which conclusions are
formed and inferences made. In this stage statistical techniques based on probability theory are required.
1.3. Definition of Some Basic terms
a. Statistical Population: It is the collection of all possible observations of a specified characteristic of
interest (possessing certain common property) and being under study.
b. Sample: It is a subset of the population, selected using some sampling technique in such a way that
they represent the population.
c. Sampling: The process or method of sample selection from the population.
d. Sample size: The number of elements or observations to be included in the sample.
e. Census: Complete enumeration or observation of the elements of the population. Or it is the
collection of data from every element in a population.
f. Parameter: Characteristic or measure obtained from a population.
g. Statistic: Characteristic or measure obtained from a sample.
h. Variable: It is an item of interest that can take on many different numerical values.
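The difference between a parameter and a statistic can be illustrated with a small sketch. The population below is hypothetical (randomly generated incomes, not real data), and the names `population`, `sample` and `sample_size` are chosen only for this illustration.

```python
import random
import statistics

# Hypothetical population: incomes of 1,000 families (made-up numbers).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(1000, 9000) for _ in range(1000)]

# Parameter: a measure obtained from every element of the population (a census).
population_mean = statistics.mean(population)

# Sampling: select 100 elements at random, without replacement.
sample_size = 100
sample = rng.sample(population, sample_size)

# Statistic: the same measure obtained from the sample only.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```

The two means will usually be close but not equal, which is why inferential techniques are needed to judge how far a statistic may lie from the parameter it estimates.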
1.4. Applications, uses and limitations of Statistics
Applications of statistics:
• In almost all fields of human endeavor.
• In daily life, where almost all human beings come across numerical facts.
• In particular processes, e.g. the invention of certain drugs or measuring the extent of environmental pollution.
• In industry, especially in the area of quality control.

Uses of statistics:
The main function of statistics is to enlarge our knowledge of complex phenomena. The following are
some uses of statistics:
1. It presents facts in a definite and precise form.
2. Data reduction.
3. Measuring the magnitude of variations in data.
4. Furnishes a technique of comparison
5. Estimating unknown population characteristics.
6. Testing and formulating of hypothesis.
7. Studying the relationship between two or more variable.
8. Forecasting future events.
Limitations of statistics:
As a science, statistics has its own limitations. The following are some of them:
• It deals only with quantitative information.
• It deals only with aggregates of facts and not with individual data items.
• Statistical data are only approximately, not mathematically, correct.
• Statistics can easily be misused and therefore should be used by experts.

1.5. Types of variables and measurement scales

Types of variables:
Qualitative Variables are nonnumeric variables and can't be measured. Examples include gender,
religious affiliation, and state of birth.
Quantitative Variables are numerical variables and can be measured. Examples include balance in
checking account, number of children in family. Note that quantitative variables are either discrete (which
can assume only certain values, and there are usually "gaps" between the values, such as the number of
bedrooms in your house) or continuous (which can assume any value within a specific range, such as the
air pressure in a tire).

Scales of measurement
Proper knowledge about the nature and type of data to be dealt with is essential in order to specify and
apply the proper statistical method for their analysis and inferences. Measurement scale refers to the
property of value assigned to the data based on the properties of order, distance and fixed zero.
The goal of measurement systems is to structure the rule for assigning numbers to objects in such a way
that the relationship between the objects is preserved in the numbers assigned to the objects. The different
kinds of relationships preserved are called properties of the measurement system.

Order: The property of order exists when an object that has more of the attribute than another object, is
given a bigger number by the rule system. This relationship must hold for all objects in the "real world".
Distance: The property of distance is concerned with the relationship of differences between objects. If a
measurement system possesses the property of distance it means that the unit of measurement means the
same thing throughout the scale of numbers. That is, an inch is an inch, no matter whether it falls
immediately ahead or a mile down the road. More precisely, an equal difference between two numbers
reflects an equal difference in the "real world" between the objects that were assigned the numbers.

Fixed Zero: A measurement system possesses a rational zero (fixed zero) if an object that has none of the
attribute in question is assigned the number zero by the system of rules. The object does not need to really
exist in the "real world", as it is somewhat difficult to visualize a "man with no height". The requirement
for a rational zero is this: if objects with none of the attribute did exist would they be given the value zero.
The property of fixed zero is necessary for ratios between numbers to be meaningful.

Scales type: Measurement is the assignment of numbers to objects or events in a systematic fashion. Four
levels of measurement scales are commonly distinguished: nominal, ordinal, interval, and ratio, and each
possesses different properties of measurement systems.

Nominal Scales: Nominal scales are measurement systems that possess none of the three properties stated
above.
• Level of measurement which classifies data into mutually exclusive, all-inclusive categories in which
no order or ranking can be imposed on the data.
• No arithmetic or relational operations can be applied.
Examples:
• Political party preference (Republican, Democrat, or Other)
• Sex (Male or Female)
• Marital status (married, single, widowed, divorced)
• Regional differentiation of Ethiopia.
Ordinal Scales: Ordinal scales are measurement systems that possess the property of order, but not the
property of distance. The property of fixed zero is not important if the property of distance is not satisfied.
• Level of measurement which classifies data into categories that can be ranked. Differences between
the ranks are not meaningful.
• Arithmetic operations are not applicable but relational operations are applicable.
• Ordering is the sole property of the ordinal scale.
Examples:
• Letter grades (A, B, C, D, F)
• Rating scales (Excellent, Very good, Good, Fair, Poor)
• Military status
Interval Scales: Interval scales are measurement systems that possess the properties of order and
distance, but not the property of fixed zero.
• Level of measurement which classifies data that can be ranked and differences are meaningful.
However, there is no meaningful zero, so ratios are meaningless.
• All arithmetic operations except division are applicable.
• Relational operations are also possible.
Examples:
• IQ
• Temperature in °F

Ratio Scales: Ratio scales are measurement systems that possess all three properties: order, distance, and
fixed zero. The added power of a fixed zero allows ratios of numbers to be meaningfully interpreted; i.e.
the ratio of Bekele's height to Martha's height is 1.32, whereas this is not possible with interval scales.
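The way the three properties determine the scale type can be summarized in a short sketch. The function name `measurement_scale` and the example variables are illustrative assumptions; the function also assumes the properties are nested as in the text (distance only matters when order holds, fixed zero only when distance holds).

```python
def measurement_scale(order: bool, distance: bool, fixed_zero: bool) -> str:
    """Name the scale implied by the properties a measurement system possesses."""
    if order and distance and fixed_zero:
        return "ratio"      # all three properties
    if order and distance:
        return "interval"   # order and distance, no fixed zero
    if order:
        return "ordinal"    # order only
    return "nominal"        # none of the three properties


# (order, distance, fixed zero) for a few variables mentioned in this section.
examples = {
    "blood type": (False, False, False),
    "letter grade": (True, False, False),
    "temperature in °F": (True, True, False),
    "height": (True, True, True),
}
for name, properties in examples.items():
    print(f"{name}: {measurement_scale(*properties)}")
```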

ACTIVITY 1.2

1. Classify the following list of different attributes and rules for assigning numbers to objects into first
as qualitative or quantitative and second as one of the four types of measurement scales.
a) Your checking account number as a name for your account; (Ans. nominal)
b) Your score on the first statistics test as a measure of your knowledge of statistics; (Ans. interval)
c) Your score on an individual intelligence test as a measure of your intelligence; (Ans. interval)
d) Times for swimmers to complete a 50-meter race; (Ans. ratio)
e) Socioeconomic status of a family when classified as low, middle and upper classes; (Ans. ordinal)
f) Blood type of individuals, A, B, AB and O; (Ans. nominal)
g) Region numbers of Ethiopia (1, 2, 3 etc); (Ans. nominal)
h) The number of students in a college; (Ans. ratio)
i) The net wages of a group of workers; (Ans. ratio)

CHECKLIST 1.1

Put a tick mark (√) for each of the following questions if you can solve the problems and an X otherwise.
Can you
1. Define statistics?
2. Discuss the two areas (branches) of statistics?
3. State and explain the five stages in statistical investigation?
4. Write some applications and uses of statistics?
5. Differentiate between Qualitative Variable and Quantitative Variable?
6. Identify the type of measurement scale for a given variable?
Exercise 1.1

1. Distinguish the difference between descriptive and inferential statistics.


2. Distinguish the difference between parameter and statistic.
3. Discuss why we study and use statistics.
4. Compare the use of census data and sample survey data for investigation.
5. Because of a recent increase in the number of neck injuries incurred by motorists, the Department of
Statistics designed a study to evaluate the strength of helmets worn by motorists in Ethiopia. A total of
540 helmets were collected from the five companies that currently produce helmets. The agency then
sent the helmets to an independent testing agency to evaluate the impact cushioning of the helmet and
the amount of shock transmitted to the neck when the face mask was twisted.
a. What is the population of interest?
b. What is the sample?
c. What variables should be measured?

CHAPTER TWO

METHODS OF DATA COLLECTION AND PRESENTATION

2.1 Methods of data collection


2.1.1 Sources of data
There are two sources of data: primary and secondary.
• Primary sources: sources where the data are measured or collected directly by the investigator.
• Secondary sources: sources where the data are not measured or collected directly by the
investigator.
Based on the source, data can be categorized as primary or secondary.
1. Primary Data: data measured or collected by the investigator or the user directly from the source. Two
activities are involved: planning and measuring.
a) Planning:
• Identify the source and elements of the data.
• Decide whether to consider a sample or a census.
• If sampling is preferred, decide on the sample size, selection method, etc.
• Decide on the measurement procedure.
• Set up the necessary organizational structure.
b) Measuring: there are different options.
• Focus groups, telephone interviews, questionnaires, door-to-door surveys, registration,
personal interviews and experiments are some of the methods for collecting primary data.
2. Secondary Data: data gathered or compiled from published and unpublished sources or files. When
our source is secondary data, check that:
• The type and objective of the original situation are known.
• The purpose for which the data were collected is compatible with the present problem.
• The nature and classification of the data are appropriate to our problem.
• There are no biases or misreporting in the published data.
Note: Data which are primary for one user may be secondary for another.
2.1.2 Methods of collection
Taking a step back: if we are starting from scratch, how do we collect data and where do they come from?
There are many methods used to collect or obtain data for statistical analysis. Three of the most popular
are:
• Direct observation, which relies on the researchers' ability to gather data through their senses
and allows researchers to document actual behaviour rather than responses related to behaviour.
• Experimentation, which explores cause-and-effect relationships by manipulating independent variables
in order to see if there is a corresponding effect on a dependent variable.
• Surveying, which involves gathering information from individuals using a questionnaire.
ACTIVITY 2.1

1. Distinguish the difference between primary and secondary sources of data.


2. Discuss the advantage and disadvantage of using primary and secondary data.
3. Discuss the most popular methods used to collect data.

2.2 Methods of Data Presentation
After the measurements of interest have been collected, ideally the data are organized, displayed, and
examined by using various presentation techniques. That means presenting the data in a readily
comprehensible, condensed form that helps in drawing inferences from them. As a general rule, the data
should be arranged into categories so that each measurement is classified into one, and only one, of the
categories.
The process of arranging data into classes or categories according to similarities is technically called
classification. Classification is a preliminary step that prepares the ground for proper presentation of data.
Presentation of data is broadly classified into the following categories: tabular presentation, graphic
presentation and diagrammatic presentation.
2.2.1 Tabular presentation of data

In this form of presentation, data are tabulated or arranged in some properly selected classes and the
arrangement is described by a title and sub-titles. Such tables can list the original raw scores as well as
percentages, means, standard deviations, etc. Table 2.1 sketches such a table for illustration; data can
also be tabulated in the form of a frequency distribution.

Table 2.1: Pass Percentage of High Schools in Wolkite Town in Examination

Name of High School            Pass percentage   Girls pass percentage   Boys pass percentage
High School of Wolkite Town
Yaberus high sch.
Jamaika high sch.
Abafransa high sch.

Frequency distributions:

In this form of presentation, we group the quantitative data into some arbitrarily chosen classes. For this
purpose, usually, the raw scores are distributed into classes and each score is allotted a place in the
respective class. We also record how many times a particular score occurs in the given data set. To have a
good understanding of this concept, let us see the following definitions:

Definitions:
• Raw data: recorded information in its original collected form, whether counts or
measurements.
• Frequency: the number of values in a specific class of the distribution.
• Frequency distribution: the organization of raw data in tabular form using classes and
frequencies.
There are three basic types of frequency distributions:
• Categorical frequency distribution
• Ungrouped frequency distribution
• Grouped frequency distribution
There are specific procedures for constructing each type.

1) Categorical frequency distribution: used for data that can be placed in specific categories, such as nominal
or ordinal data, e.g. marital status.
Example: A social worker collected the following data on marital status for 25 persons. (M=married,
S=single, W=widowed, D=divorced)
M S D W D
S S W M M
W D S M M
W D D S S
S W W D D
Solution: Since the data are categorical, discrete classes can be used. There are four types of marital status: M,
S, D, and W. These types will be used as the classes for the distribution.
We follow the procedure below to construct the frequency distribution.
Step 1: Make a table as shown.
Class Tally Frequency Percent
(1) (2) (3) (4)
M
S
D
W

Step 2: Tally the data and place the result in column (2).
Step 3: Count the tally and place the result in column (3).
Step 4: Find the percentage of values in each class by using $\% = \frac{f}{n} \times 100$, where f = frequency of the
class and n = total number of values. Percentages are not normally a part of a frequency distribution, but they can be
added since they are used in certain types of diagrams such as pie charts.
Step 5: Find the totals for columns (3) and (4).
Combining all the steps, one can construct the following frequency distribution.
Class Tally Frequency Percent
(1) (2) (3) (4)
M //// 5 20
S //// // 7 28
D //// // 7 28
W //// / 6 24
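The tallying and percentage steps can be sketched in a few lines. This is only an illustration of the procedure, using the 25 marital-status observations from the example; the variable names are chosen for this sketch.

```python
from collections import Counter

# The 25 observations from the example (M=married, S=single, W=widowed, D=divorced).
data = """
M S D W D
S S W M M
W D S M M
W D D S S
S W W D D
""".split()

counts = Counter(data)  # tallies each class (steps 2 and 3)
n = len(data)           # total number of values

table = {}
for cls in ["M", "S", "D", "W"]:
    f = counts[cls]
    percent = 100 * f / n  # step 4: (f / n) * 100
    table[cls] = (f, percent)
    print(f"{cls}  {f:2d}  {percent:.0f}%")
print(f"Total  {n}  100%")  # step 5
```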
2) Ungrouped frequency distribution: a table of all the potential raw score values that could possibly
occur in the data, along with the number of times each actually occurred. It is often constructed for a small
data set or for data on a discrete variable.
Constructing an ungrouped frequency distribution:
• First find the smallest and largest raw score in the collected data.
• Arrange the data in order of magnitude and count the frequency.
• To facilitate counting one may include a column of tallies.
Example: The following data represent the mark of 20 students.
80 76 90 85 80
70 60 62 70 85
65 60 63 74 75
76 70 70 80 85
Construct a frequency distribution, which is ungrouped.

Solution:
Step 1: Find the range, Range=Max-Min=90-60=30.
Step 2: Make a table as shown
Step 3: Tally the data.
Step 4: Compute the frequency.
Mark Tally Frequency
60 // 2
62 / 1
63 / 1
65 / 1
70 //// 4
74 / 1
75 / 1
76 // 2
80 /// 3
85 /// 3
90 / 1
Each individual value is presented separately, that is why it is named ungrouped frequency distribution.
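As a cross-check, the frequencies can be counted directly from the 20 raw marks. This is a minimal sketch; the variable names are chosen for this illustration.

```python
from collections import Counter

# The 20 marks from the example.
marks = [80, 76, 90, 85, 80,
         70, 60, 62, 70, 85,
         65, 60, 63, 74, 75,
         76, 70, 70, 80, 85]

data_range = max(marks) - min(marks)  # step 1: range = max - min
frequency = Counter(marks)            # steps 3 and 4: tally and count

print(f"Range = {data_range}")
for mark in sorted(frequency):        # order of magnitude
    print(f"{mark}  {'/' * frequency[mark]:<5} {frequency[mark]}")
```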
3) Grouped frequency distribution: When the range of the data is large, the data must be grouped into
classes that are more than one unit in width.
Definitions:
• Grouped frequency distribution: a distribution in which several numbers are grouped into one class.
• Class limits: separate one class in a grouped frequency distribution from another. The limits can
appear in the data, and there are gaps between the upper limit of one class and the lower limit of the next.
• Unit of measurement (U): the distance between two possible consecutive measures. It is usually
taken as 1, 0.1, 0.01, 0.001, ....
• Class boundaries: separate one class in a grouped frequency distribution from another. The
boundaries have one more decimal place than the raw data and therefore do not appear in the data.
There is no gap between the upper boundary of one class and the lower boundary of the next class. The
lower class boundary is found by subtracting U/2 from the corresponding lower class limit and the
upper class boundary is found by adding U/2 to the corresponding upper class limit.
• Class width: the difference between the upper and lower class boundaries of any class. It is also the
difference between the lower limits of any two consecutive classes or the difference between any two
consecutive class marks.
• Class mark (midpoint): the average of the lower and upper class limits or the average of the
upper and lower class boundaries.
• Cumulative frequency: the number of observations less than/more than or equal to a specific
value.
• Cumulative frequency above: the total frequency of all values greater than or equal to the
lower class boundary of a given class.
• Cumulative frequency below: the total frequency of all values less than or equal to the upper
class boundary of a given class.
• Cumulative frequency distribution (CFD): the tabular arrangement of class intervals together
with their corresponding cumulative frequencies. It can be of the more than or the less than type,
depending on the type of cumulative frequency used.

• Relative frequency (rf): the frequency divided by the total frequency.
• Relative cumulative frequency (rcf): the cumulative frequency divided by the total frequency.

Guidelines for classes


1. There should be between 5 and 20 classes.
2. The classes must be mutually exclusive. This means no data value can fall into two different classes
3. The classes must be all inclusive or exhaustive. This means that all data values must be included.
4. The classes must be continuous. There are no gaps in a frequency distribution.
5. The classes must be equal in width. The exception here is the first or last class. It is possible to have a
"below ..." or "... and above" class. This is often used with ages.
Steps for constructing Grouped frequency Distribution
1. Find the largest and smallest values
2. Compute the Range(R) = Maximum - Minimum
3. Select the number of classes desired, usually between 5 and 20, or use Sturges' rule
$k = 1 + 3.322 \log_{10} n$, where k is the number of classes desired and n is the total number of observations.
4. Find the class width by dividing the range by the number of classes and rounding up, not off: $w = \frac{R}{k}$.
5. Pick a suitable starting point less than or equal to the minimum value. The starting point is called the
lower limit of the first class. Continue to add the class width to this lower limit to get the rest of the
lower limits.
6. To find the upper limit of the first class, subtract U from the lower limit of the second class. Then
continue to add the class width to this upper limit to find the rest of the upper limits.
7. Find the boundaries by subtracting U/2 units from the lower limits and adding U/2 units to the
upper limits. The boundaries are also half-way between the upper limit of one class and the lower
limit of the next class. It may not be necessary to find the boundaries.
8. Tally the data.
9. Find the frequencies.
10. Find the cumulative frequencies. Depending on what you're trying to accomplish, it may not be
necessary to find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative frequencies
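Steps 3 to 6 can be sketched as two small helper functions. The names `sturges_classes` and `class_limits` are hypothetical, and the sketch assumes integer data with U = 1. It follows the steps literally, so when the range is an exact multiple of k the last class may not reach the maximum and an extra class (or a wider width) may be needed.

```python
import math


def sturges_classes(n: int) -> int:
    """Step 3: number of classes from Sturges' rule, rounded up."""
    return math.ceil(1 + 3.322 * math.log10(n))


def class_limits(minimum: int, maximum: int, k: int, unit: int = 1):
    """Steps 4 to 6: class width and (lower, upper) limits, assuming integer data."""
    width = math.ceil((maximum - minimum) / k)            # step 4: round up, not off
    lowers = [minimum + i * width for i in range(k)]      # step 5: start at the minimum
    uppers = [low + width - unit for low in lowers]       # step 6: next lower limit minus U
    return width, list(zip(lowers, uppers))


k = sturges_classes(20)
width, limits = class_limits(6, 39, k)
print(k, width, limits)
```

With n = 20, minimum 6 and maximum 39 this gives 6 classes of width 6, starting at 6-11 and ending at 36-41.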
Example*: Construct a frequency distribution for the following data.
11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27
Solutions:
Step 1: Find the highest and the lowest value H=39, L=6
Step 2: Find the range; R=H-L=39-6=33
Step 3: Select the number of classes desired using Sturges' formula;
$k = 1 + 3.322 \log_{10} n = 1 + 3.322 \log_{10} 20 = 5.32$, which is rounded up to 6.
Step 4: Find the class width; w = R/k = 33/6 = 5.5, which is rounded up to 6.
Step 5: Select the starting point; let it be the minimum observation.
• 6, 12, 18, 24, 30, 36 are the lower class limits.
Step 6: Find the upper class limits; e.g. the first upper class limit = 12 - U = 12 - 1 = 11
• 11, 17, 23, 29, 35, 41 are the upper class limits.

So combining step 5 and step 6, one can construct the following classes.
Class limits: 6-11, 12-17, 18-23, 24-29, 30-35 and 36-41
Step 7: Find the class boundaries;
E.g. for class 1, lower class boundary = 6 - U/2 = 5.5 and upper class boundary = 11 + U/2 = 11.5
• Then continue adding w to both boundaries to obtain the remaining boundaries. By doing so one can
obtain the following classes.
Class boundary
5.5 – 11.5
11.5 – 17.5
17.5 – 23.5
23.5 – 29.5
29.5 – 35.5
35.5 – 41.5
Step 8: tally the data.
Step 9: Write the numeric values for the tallies in the frequency column.
Step 10: Find cumulative frequency.
Step 11: Find relative frequency or/and relative cumulative frequency.
The complete frequency distribution follows:

Class limit   Class boundary   Class mark   Tally    Freq.   Cf (less than type)   Cf (more than type)   rf     rcf (less than type)
6 – 11        5.5 – 11.5       8.5          //       2       2                     20                    0.10   0.10
12 – 17       11.5 – 17.5      14.5         //       2       4                     18                    0.10   0.20
18 – 23       17.5 – 23.5      20.5         //////   7       11                    16                    0.35   0.55
24 – 29       23.5 – 29.5      26.5         ////     4       15                    9                     0.20   0.75
30 – 35       29.5 – 35.5      32.5         ///      3       18                    5                     0.15   0.90
36 – 41       35.5 – 41.5      38.5         //       2       20                    2                     0.10   1.00
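The columns of such a table can be computed directly from the raw data. The sketch below assumes the six class limits found in steps 5 and 6 and U = 1; the dictionary keys are names chosen only for this illustration.

```python
# The 20 observations from the example and the class limits from steps 5 and 6.
data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]
unit = 1  # unit of measurement U (whole numbers)
n = len(data)

rows = []
cf_less = 0
for low, high in limits:
    freq = sum(low <= x <= high for x in data)  # number of values in the class
    cf_less += freq                             # less than type cumulative frequency
    cf_more = n - (cf_less - freq)              # more than type cumulative frequency
    rows.append({
        "limits": (low, high),
        "boundaries": (low - unit / 2, high + unit / 2),
        "mark": (low + high) / 2,
        "freq": freq,
        "cf_less": cf_less,
        "cf_more": cf_more,
        "rf": freq / n,
        "rcf_less": cf_less / n,
    })

for row in rows:
    print(row)
```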

ACTIVITY 2.2

1. Twenty-five army inductees were given a blood test to determine their blood type. Construct a
frequency distribution for the data set which is given below.
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
2. Discuss the difference between ungrouped and grouped frequency distribution.
3. Given below is raw data on ages of 40 employees of a certain organization. Construct a grouped
frequency distribution including the class boundaries, class marks the relative frequencies, the less
than and more than cumulative frequencies using 8 classes.
62 58 53 27 30 31 26 34 49 47 48 41
50 61 40 47 41 43 50 45 43 32 37 31
35 38 29 65 58 43 44 41 37 27 62 65
36 42 63 50

2.2.2 Graphical presentation of data

The histogram, frequency polygon and cumulative frequency graph (ogive) are the most commonly applied
graphical representations for continuous data.

Procedures for constructing statistical graphs:

• Draw and label the X and Y axes.
• Choose a suitable scale for the frequencies or cumulative frequencies and label it on the Y axis.
• Represent the class boundaries for the histogram or ogive, or the midpoints for the frequency polygon,
on the X axis.
• Plot the points.
• Draw the bars or lines to connect the points.
Histogram:
It is a graph which displays the data by using vertical bars of various heights to represent frequencies.
Class boundaries are placed along the horizontal axis. Class marks and class limits are sometimes used as
the quantity on the X axis.
Example: Construct a histogram to represent the following frequency distribution
Class boundary Freq.
5.5 – 11.5 2
11.5 – 17.5 2
17.5 – 23.5 7
23.5 – 29.5 4
29.5 – 35.5 3
35.5 – 41.5 2
Solution
Step 1 Draw and label the x and y axes.
Step 2 Represent the frequency on the y axis and the class boundaries on the x axis.
Step 3 Using the frequencies as the heights, draw vertical bars for each class.

As the histogram shows, the class with the greatest number of data values (7) is 17.5–23.5, followed by 4
for 23.5–29.5. The graph also has one peak with the data clustering around it.
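Where no plotting tool is at hand, the shape of the histogram can be previewed as text. The sketch below is only a rough stand-in: it draws horizontal rather than vertical bars, and the function name `text_histogram` is hypothetical.

```python
boundaries = [5.5, 11.5, 17.5, 23.5, 29.5, 35.5, 41.5]
frequencies = [2, 2, 7, 4, 3, 2]


def text_histogram(boundaries, frequencies, symbol="#"):
    """Return one line per class: its boundaries, a bar of symbols, and the frequency."""
    lines = []
    for low, high, freq in zip(boundaries, boundaries[1:], frequencies):
        lines.append(f"{low:>4} - {high:<4} | {symbol * freq} {freq}")
    return lines


lines = text_histogram(boundaries, frequencies)
print("\n".join(lines))
```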

Frequency Polygon:
It is a line graph. The frequency is placed along the vertical axis and the class midpoints are placed along
the horizontal axis. It is customary to add the next lower and next higher class intervals, each with a
corresponding frequency of zero, to make it a complete polygon.
Example: Using the frequency distribution given in the above example, construct a frequency polygon.

Solutions:
Step 1: Find the midpoints of each class. Recall that midpoints are found by adding the upper and lower
boundaries and dividing by 2: e.g.
Class boundary Freq. Class Mark
5.5 – 11.5 2 8.5
11.5 – 17.5 2 14.5
17.5 – 23.5 7 20.5
23.5 – 29.5 4 26.5
29.5 – 35.5 3 32.5
35.5 – 41.5 2 38.5
Step 2: Draw the x and y axes. Label the x axis with the midpoint of each class, and then use a suitable
scale on the y axis for the frequencies.
Step 3: Using the midpoints for the x values and the frequencies as the y values, plot the points.
Step 4: Connect adjacent points with line segments. Draw a line back to the x axis at the beginning and
end of the graph, at the same distance that the previous and next midpoints would be located, as shown
below.
[Figure: frequency polygon. Horizontal axis: class midpoints (2.5, 8.5, 14.5, 20.5, 26.5, 32.5, 38.5, 44.5); vertical axis: frequency (0 to 8).]

The frequency polygon and the histogram are two different ways to represent the same data set. The
choice of which one to use is left to the discretion of the researcher.
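The points of the frequency polygon, including the two zero-frequency end points that close it, can be computed as follows. This is a minimal sketch that assumes equal class widths; the variable names are chosen for this illustration.

```python
boundaries = [5.5, 11.5, 17.5, 23.5, 29.5, 35.5, 41.5]
frequencies = [2, 2, 7, 4, 3, 2]

width = boundaries[1] - boundaries[0]  # equal class width is assumed

# Step 1: midpoint of each class = (lower boundary + upper boundary) / 2.
midpoints = [(low + high) / 2 for low, high in zip(boundaries, boundaries[1:])]

# Close the polygon with one extra midpoint of frequency zero on each side.
xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
ys = [0] + frequencies + [0]

points = list(zip(xs, ys))
print(points)
```

Joining consecutive points with line segments gives the polygon; any plotting tool can be used for the drawing itself.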

Ogive curve (cumulative frequency polygon):


It is a graph showing the cumulative frequency (less than or more than type) plotted against upper or
lower class boundaries respectively. That is class boundaries are plotted along the horizontal axis and the
corresponding cumulative frequencies are plotted along the vertical axis. The points are joined by a free
hand curve.

Example: Construct an ogive for the frequency distribution described in the above Examples.

Solutions:
Step 1: Find the cumulative frequency (here we find the less than type) for each class
Cf (less than type)
Less than 5.5 0
Less than 11.5 2
Less than 17.5 4
Less than 23.5 11
Less than 29.5 15
Less than 35.5 18
Less than 41.5 20
Step 2: Draw the x and y axes. Label the x axis with the class boundaries. Use an appropriate scale for the
y axis to represent the cumulative frequencies.
Step 3: Plot the cumulative frequency at each upper class boundary, as shown in Figure 2.3. Upper
boundaries are used since the cumulative frequencies represent the number of data values accumulated up
to the upper boundary of each class.
Step 4: Starting with the first upper class boundary, 11.5, connect adjacent points with line segments, as
shown in Figure 2.3. Then extend the graph to the first lower class boundary, 5.5, on the x axis.
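As a rough sketch, the less than cumulative frequencies and the points of the ogive can be computed as follows. The variable names are assumptions made for this illustration.

```python
from itertools import accumulate

# Upper class boundaries and frequencies from the example above.
upper_boundaries = [11.5, 17.5, 23.5, 29.5, 35.5, 41.5]
freqs = [2, 2, 7, 4, 3, 2]

# Step 1: running total of the frequencies (less than type).
cum_freqs = list(accumulate(freqs))

# Steps 3 and 4: one point per upper boundary, plus the starting point
# at the first lower class boundary with cumulative frequency 0.
ogive_points = [(5.5, 0)] + list(zip(upper_boundaries, cum_freqs))
print(ogive_points)
```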

2.2.3 Diagrammatic presentation of data


These are techniques for presenting data in visual displays using geometric figures and pictures.
Importance:
 They have greater attraction.
 They facilitate comparison.
 They are easily understandable.
Diagrams are appropriate for presenting discrete data. The three most commonly used diagrammatic
presentations for discrete as well as qualitative data are:
 Pie charts
 Pictogram
 Bar charts
Pie chart: A pie chart is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution. The angle of each sector is obtained using:
\[ \text{Angle of sector} = \frac{\text{Value of the part}}{\text{The whole quantity}} \times 360^\circ \]

Example: Draw a suitable diagram to represent the following population in a town.
Men Women Girls Boys
2500 2000 4000 1500
Solutions:
Step 1: Find the percentage.
Step 2: Find the number of degrees for each class.
Step 3: Using a protractor and compass, draw each section and write its name and corresponding percentage.
Class Frequency Percent Degree
Men 2500 25 90
Women 2000 20 72
Girls 4000 40 144
Boys 1500 15 54
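The percentage and degree columns of this table can be checked with a short Python sketch. The dictionary layout is an assumption made for illustration.

```python
# Population of the town by category, from the example above.
population = {"Men": 2500, "Women": 2000, "Girls": 4000, "Boys": 1500}
total = sum(population.values())

# For each category: (percentage of the whole, angle of the sector in degrees).
rows = {name: (100 * value / total, 360 * value / total)
        for name, value in population.items()}

for name, (percent, degrees) in rows.items():
    print(name, percent, degrees)
```

The angles add up to 360 degrees, which is a quick check that no category was missed.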

Pictogram
In this diagram, we represent data by means of picture symbols. We decide on a suitable picture to
represent a definite number of units in which the variable is measured.
Example: draw a pictogram to represent the following population of a town.
Year 1989 1990 1991 1992
Population 2000 3000 5000 7000

Bar Charts: A set of bars (thick lines or narrow rectangles) representing some magnitude over time or
space. They are useful for comparing aggregates over time or space. Bars can be drawn either vertically or
horizontally. There are different types of bar charts. The most common are:
 Simple bar chart
 Component or sub divided bar chart.
 Multiple bar charts.

Simple Bar Chart: Simple bar charts are used to display data on one variable. They are thick lines
(narrow rectangles) having the same breadth. The magnitude of a quantity is represented by the height
/length of the bar.

Example: The following data represent sale by product, 1957- 1959 of a given company for three
products A, B, C.
Product Sales($) Sales($) Sales($)
In 1957 In 1958 In 1959
A 12 14 18
B 24 21 18
C 24 35 54
Solutions:
[Figure: simple bar chart, "Sales by product in 1957". Horizontal axis: product (A, B, C); vertical axis: sales in $ (0 to 30).]

Component Bar chart:


When there is a desire to show how a total (or aggregate) is divided in to its component parts, we use
component bar chart. The bars represent total value of a variable with each total broken in to its
component parts and different colors or designs are used for identifications.
Example: Draw a component bar chart to represent the sales by product from 1957 to 1959.
Solutions:
[Figure: component bar chart, "Sales by product 1957-1959". Horizontal axis: year of production (1957, 1958, 1959); vertical axis: sales in $ (0 to 100); each bar is subdivided into Product A, Product B and Product C.]

Multiple Bar charts: These are used to display data on more than one variable. They are used for
comparing different variables at the same time.

Example: Draw a multiple bar chart to represent the sales by product from 1957 to 1959.
Solutions:
[Figure: multiple bar chart, "Sales by product 1957-1959". Horizontal axis: year of production (1957, 1958, 1959); vertical axis: sales in $ (0 to 60); one bar each for Product A, Product B and Product C in every year.]

ACTIVITY 2.3

1. Suppose the following data give the number of currently married men and women who knew a
modern method of contraception in a certain country, for each of five regions.
Region Women Men
A 394 176
B 386 186
C 747 268
D 1144 349
E 894 277
a) present each of the findings in a simple bar chart
b) present the results for men and women in a multiple bar chart
c) construct a composite bar chart to show this information
d) present the same information using a pie diagram.

CHECKLIST 2.1

Put a tick mark (√) for each of the following questions if you can solve the problems and an X otherwise.
Can you
1. Explain the primary and secondary sources of data?
2. List the most popular methods used to collect data?
3. List and explain the possible classes of data presentation?
4. Explain the reason why we organize data into a frequency distribution?
5. Name the three types of frequency distributions, and explain when each should be used?
6. Present different types of data using appropriate graph/diagram?
Exercise 2.1

1. The marks obtained by 50 students in a Statistics test are given below:


62, 21, 26, 32, 56, 36, 37, 39, 53, 40, 54, 42, 44, 61, 68, 28, 33, 56, 57, 37, 52, 39, 40, 54, 43, 43, 63,
30, 34, 58, 35, 38, 50, 38, 52, 41, 51, 44, 41, 42, 43, 45, 46, 45, 47, 48, 49, 45, 46, 48 Tabulate these
scores in frequency distribution by clearly explaining the various steps.
2. Construct a frequency distribution of weights of miniature poodles if the class marks are 6.5, 8.5,
10.5, 12.5 and 14.5 kgs with corresponding frequencies 8, 12, 22, 17 and 3.
3. Suppose data collected for heights (in cms) of 390 cows were tabulated in a frequency distribution
and the following results were obtained.
fi: 6 25 48 72 116 60 38 22 3
CM1 =112, CM2=117 where CMi is the ith class mark
Determine:
a) the class interval size (class width), class limits, class boundaries and class marks
b) the less than cumulative frequency distribution
c) the class intervals having the highest frequency
d) Above which height do we find 50% of the cows?
e) Below which height do we get 25% of the cows?
f) Draw a histogram, frequency polygon and less than ogive for the above data.

CHAPTER THREE
DESCRIPTIVE STATISTICS

3.1 Measures of Central Tendency

In this section, we shall study measures of central tendency: the mathematical measures
(arithmetic, geometric and harmonic means) and the positional measures
(mode, median and quantiles). When we want to compare groups of numbers,
it is good to have a single value that is considered to be a good representative of each group. This single
value is called the average of the group. Averages are also called measures of central tendency. An
average which is representative is called a typical average, and an average which is not representative and
has only a theoretical value is called a descriptive average.

A typical average should possess the following:

 It should be rigidly defined.
 It should be based on all observations under investigation.
 It should be affected as little as possible by extreme observations.
 It should be capable of further algebraic treatment.
 It should be affected as little as possible by fluctuations of sampling.
 It should be easy to calculate and simple to understand.

Objectives of measures of central tendency


 To comprehend the data easily.
 To facilitate comparison.
 To make further statistical analysis.

3.2 Types of measures of central tendency

There are several different types of measures of central tendency; each has its advantage and
disadvantage. These are the mean like Arithmetic, Geometric and Harmonic, the mode, median and
quantiles such as quartiles, deciles and percentiles.
3.2.1 The mean
Arithmetic Mean: is defined as the sum of the magnitude of the items divided by the number of items.
Usually the mean of X1, X2, X3, …, Xn is denoted by A.M, m or \( \bar{X} \) and is given by:
\[ \bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n} \]
 If X1 occurs f1 times
 If X2 occurs f2 times
 .
 .
 If Xk occurs fk times
Then the mean will be
\[ \bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i} \]
where k is the number of classes and \( \sum_{i=1}^{k} f_i = n \).

Example: Obtain the mean of the following number: 2, 7, 8, 2, 7, 3, 7


Solution:
Xi fi Xifi
2 2 4
3 1 3
7 3 21
8 1 8
Total 7 36
\[ \bar{X} = \frac{\sum_{i=1}^{4} f_i X_i}{\sum_{i=1}^{4} f_i} = \frac{36}{7} \approx 5.14 \]
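The same calculation can be sketched in Python. Here `Counter` builds the frequency table from the raw numbers; the variable names are assumptions made for this illustration.

```python
from collections import Counter

data = [2, 7, 8, 2, 7, 3, 7]

# Frequency table: each distinct value X_i with its frequency f_i.
freq = Counter(data)

n = sum(freq.values())                        # sum of the f_i
total = sum(x * f for x, f in freq.items())   # sum of the f_i * X_i
mean = total / n

print(sorted(freq.items()), total, n, round(mean, 2))
```

Dividing 36 by 7 gives 5.142857..., which rounds to 5.14.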

Arithmetic Mean for Grouped Data


If data are given in the shape of a continuous frequency distribution, then the mean is obtained as
\[ \bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i} \]
where \( X_i \) = the class mark of the ith class and \( f_i \) = the frequency of the ith class.

Example: calculate the mean for the following age distribution.


Class Frequency
6- 10 35
11- 15 23
16- 20 15
21- 25 12
26- 30 9
31- 35 6
Solutions:
 First find the class marks
 Find the product of frequency and class marks
 Find mean using the formula.
Class fi Xi Xifi
6- 10 35 8 280
11- 15 23 13 299
16- 20 15 18 270
21- 25 12 23 276
26- 30 9 28 252
31- 35 6 33 198
Total 100 1575
\[ \bar{X} = \frac{\sum_{i=1}^{6} f_i X_i}{\sum_{i=1}^{6} f_i} = \frac{1575}{100} = 15.75 \]
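A minimal Python sketch of the three steps (class marks, products, mean) for this age distribution is given below. The tuple layout (lower limit, upper limit, frequency) is an assumption made for illustration.

```python
# (lower class limit, upper class limit, frequency) from the example above.
classes = [(6, 10, 35), (11, 15, 23), (16, 20, 15),
           (21, 25, 12), (26, 30, 9), (31, 35, 6)]

# Step 1: class marks.
marks = [(lo + hi) / 2 for lo, hi, _ in classes]

# Step 2: sum of the products f_i * X_i, and the total frequency.
n = sum(f for _, _, f in classes)
total = sum(m * f for m, (_, _, f) in zip(marks, classes))

# Step 3: the mean.
mean = total / n
print(marks, total, n, mean)
```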

ACTIVITY 3.1

1. Marks of 75 students are summarized in the following frequency distribution:


Marks No. of students
40-44 7
45-49 10
50-54 22
55-59 f4
60-64 f5
65-69 6
70-74 3
If 20% of the students have marks between 55 and 59, then
a) Find the missing frequencies f4 and f5.
b) Find the mean.

Special properties of Arithmetic mean


1. The sum of the deviations of a set of items from their mean is always zero, i.e.
\[ \sum_{i=1}^{n} (X_i - \bar{X}) = 0. \]

2. The sum of the squared deviations of a set of items from their mean is the minimum, i.e.
\[ \sum_{i=1}^{n} (X_i - \bar{X})^2 < \sum_{i=1}^{n} (X_i - A)^2, \quad A \neq \bar{X} \]

3. If \( \bar{X}_1 \) is the mean of \( n_1 \) observations,
\( \bar{X}_2 \) is the mean of \( n_2 \) observations,
.
.
\( \bar{X}_k \) is the mean of \( n_k \) observations, then the mean of all the observations in all groups, often called
the combined mean, is given by:
\[ \bar{X}_c = \frac{\bar{X}_1 n_1 + \bar{X}_2 n_2 + \dots + \bar{X}_k n_k}{n_1 + n_2 + \dots + n_k} = \frac{\sum_{i=1}^{k} \bar{X}_i n_i}{\sum_{i=1}^{k} n_i} \]

Example: In a class there are 30 females and 70 males. If the females averaged 60 in an examination and the males
averaged 72, find the mean for the entire class.
Solutions:
Females: \( \bar{X}_1 = 60 \), \( n_1 = 30 \)
Males: \( \bar{X}_2 = 72 \), \( n_2 = 70 \)
\[ \bar{X}_c = \frac{\bar{X}_1 n_1 + \bar{X}_2 n_2}{n_1 + n_2} = \frac{30(60) + 70(72)}{30 + 70} = \frac{6840}{100} = 68.40 \]
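The combined mean formula can be written as a small helper. The function name `combined_mean` is a hypothetical name chosen for this sketch.

```python
def combined_mean(means, sizes):
    """Combined mean of several groups, given each group's mean and size."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)


# Females: mean 60, n = 30; males: mean 72, n = 70.
class_mean = combined_mean([60, 72], [30, 70])
print(class_mean)
```

Note that the result is closer to 72 than to 60 because the group of males is larger.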

4. If a wrong figure has been used when calculating the mean, the correct mean can be obtained without
repeating the whole process using:
\[ \text{Correct Mean} = \text{Wrong Mean} + \frac{\text{Correct Value} - \text{Wrong Value}}{n} \]
where n is the total number of observations.
Example: The average weight of 10 students was calculated to be 65 kg. Later it was discovered that one
weight was misread as 40 kg instead of 80 kg. Calculate the correct average weight.
Solutions:
\[ \text{Correct Mean} = \text{Wrong Mean} + \frac{\text{Correct Value} - \text{Wrong Value}}{n} = 65 + \frac{80 - 40}{10} = 65 + 4 = 69 \text{ kg} \]
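As a quick check, the correction formula can be compared with the longer route of repairing the total and dividing again. The variable names are assumptions made for this sketch.

```python
n = 10
wrong_mean = 65
wrong_value, correct_value = 40, 80

# Correction formula from property 4.
corrected = wrong_mean + (correct_value - wrong_value) / n

# Cross-check: rebuild the total, swap the misread value, divide again.
wrong_total = n * wrong_mean
recomputed = (wrong_total - wrong_value + correct_value) / n

print(corrected, recomputed)
```

Both routes give 69, which is why the shortcut is safe to use.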
5. The effect of transforming the original series on the mean.
a) If a constant k is added to (or subtracted from) every observation, then the new mean will be the old
mean plus (or minus) k.
b) If every observation is multiplied by a constant k, then the new mean will be k times the old mean.
Example:
1. The mean of n Tetracycline capsules X1, X2, …, Xn is known to be 12 gm. A new set of capsules of
another drug is obtained by the linear transformation Yi = 2Xi – 0.5 (i = 1, 2, …, n). What will
be the mean of the new set of capsules?
Solutions:
New Mean = 2 * Old Mean - 0.5 = 2 * 12 - 0.5 = 23.5
2. The mean of a set of numbers is 500.
a) If 10 is added to each of the numbers in the set, then what will be the mean of the new set?
b) If each of the numbers in the set are multiplied by -5, then what will be the mean of the new set?
Solutions:
a) New Mean = Old Mean + 10 = 500 + 10 = 510
b) New Mean = -5 * Old Mean = -5 * 500 = -2500
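These transformation rules can be checked numerically. The data sets below are hypothetical; they were chosen only because their means are 12 and 500, as in the two examples.

```python
import statistics

# Hypothetical capsule weights with mean 12 (illustrative values only).
x = [10, 12, 14]
y = [2 * xi - 0.5 for xi in x]          # Y_i = 2 X_i - 0.5
mean_y = statistics.mean(y)

# Hypothetical numbers with mean 500 (illustrative values only).
numbers = [480, 500, 520]
mean_added = statistics.mean([v + 10 for v in numbers])   # add 10 to each
mean_scaled = statistics.mean([-5 * v for v in numbers])  # multiply each by -5

print(mean_y, mean_added, mean_scaled)
```

Any other data set with the same means would give the same three results, which is the point of the property.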
Weighted Mean:
 When a proper importance is desired to be given to different data a weighted mean is appropriate.
 Weights are assigned to each item in proportion to its relative importance.
 Let X1, X2, …Xn be the value of items of a series and W1, W2, …Wn their corresponding weights ,
then the weighted mean, denoted \( \bar{X}_w \), is defined as:
\[ \bar{X}_w = \frac{\sum_{i=1}^{n} X_i W_i}{\sum_{i=1}^{n} W_i} \]

Example: A student obtained the following percentages in an examination: English 60, Biology 75,
Mathematics 63, Physics 59, and Chemistry 55. Find the student's weighted arithmetic mean if weights 1,
2, 1, 3, 3 respectively are allotted to the subjects.
Solutions:
\[ \bar{X}_w = \frac{\sum_{i=1}^{5} X_i W_i}{\sum_{i=1}^{5} W_i} = \frac{60(1) + 75(2) + 63(1) + 59(3) + 55(3)}{1 + 2 + 1 + 3 + 3} = \frac{615}{10} = 61.5 \]
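A short Python sketch of the same weighted mean follows. The dictionary layout is an assumption made for illustration.

```python
# Marks and weights per subject, from the example above.
marks = {"English": 60, "Biology": 75, "Mathematics": 63,
         "Physics": 59, "Chemistry": 55}
weights = {"English": 1, "Biology": 2, "Mathematics": 1,
           "Physics": 3, "Chemistry": 3}

weighted_sum = sum(marks[s] * weights[s] for s in marks)   # sum of X_i * W_i
weighted_mean = weighted_sum / sum(weights.values())       # divided by sum of W_i

# For comparison: the ordinary (unweighted) mean of the five marks.
simple_mean = sum(marks.values()) / len(marks)

print(weighted_sum, weighted_mean, simple_mean)
```

The weighted mean (61.5) is lower than the simple mean (62.4) because the heaviest weights fall on the two lowest marks.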

Merits and Demerits of Arithmetic Mean
Merits:
 It is rigidly defined.
 It is based on all observation.
 It is suitable for further mathematical treatment.
 It is stable average, i.e. it is not affected by fluctuations of sampling to some extent.
 It is easy to calculate and simple to understand.
Demerits:
 It is affected by extreme observations.
 It cannot be used in the case of open end classes.
 It cannot be determined by the method of inspection.
 It cannot be used when dealing with qualitative characteristics, such as intelligence, honesty, beauty.
 It can be a number which does not exist in the series.
 Sometimes it leads to a wrong conclusion if the details of the data from which it is obtained are not
available.
 It gives high weight to high extreme values and less weight to low extreme values.

Geometric mean: The geometric mean of a set of n observations is the nth root of their product. The
geometric mean of X1, X2, X3, …, Xn is denoted by G.M and it is given by:
\[ G.M = \sqrt[n]{X_1 \cdot X_2 \cdots X_n} \]
Taking the logarithms of both sides:
\[ \log(G.M) = \log\left( (X_1 \cdot X_2 \cdots X_n)^{1/n} \right) = \frac{1}{n} \log(X_1 \cdot X_2 \cdots X_n) = \frac{1}{n} (\log X_1 + \log X_2 + \dots + \log X_n) \]
\[ \log(G.M) = \frac{1}{n} \sum_{i=1}^{n} \log X_i \]
That is, the logarithm of the G.M of a set of observations is the arithmetic mean of their logarithms, so
\[ G.M = \text{Antilog}\left( \frac{1}{n} \sum_{i=1}^{n} \log X_i \right) \]
Example: Find the G.M of the numbers 2, 4, 8.
Solutions:
\[ G.M = \sqrt[n]{X_1 \cdot X_2 \cdots X_n} = \sqrt[3]{2 \cdot 4 \cdot 8} = \sqrt[3]{64} = 4 \]
Remark: The Geometric Mean is useful and appropriate for finding averages of ratios.
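The two routes to the geometric mean (nth root of the product, and antilog of the mean logarithm) can be compared in Python. This is only a sketch; natural logarithms are used here, which is an arbitrary choice since any base gives the same G.M.

```python
import math

values = [2, 4, 8]
n = len(values)

# Route 1: nth root of the product.
gm_root = math.prod(values) ** (1 / n)

# Route 2: antilog of the arithmetic mean of the logarithms.
gm_log = math.exp(sum(math.log(x) for x in values) / n)

print(gm_root, gm_log)
```

Both results equal 4 up to floating-point rounding. The logarithm route is the more practical one for long series, where the product itself can become very large.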
Harmonic mean: The harmonic mean of X1, X2, X3, …, Xn is denoted by H.M and given by:
\[ H.M = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}} \]
This is called the simple harmonic mean.

In the case of a frequency distribution:
\[ H.M = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{X_i}}, \quad n = \sum_{i=1}^{k} f_i \]

If observations X1, X2, …, Xn have weights W1, W2, …, Wn respectively, then their harmonic mean is given
by
\[ H.M = \frac{\sum_{i=1}^{n} W_i}{\sum_{i=1}^{n} \frac{W_i}{X_i}} \]
This is called the weighted harmonic mean.

Remark: The Harmonic Mean is useful and appropriate in finding average speeds and average rates.
Example: A cyclist pedals from his house to his college at speed of 10 km/hr and back from the college to
his house at 15 km/hr. Find the average speed.
Solution: Here the distance is the same in both directions, so the simple H.M is appropriate for this problem.
X1 = 10 km/hr, X2 = 15 km/hr
\[ H.M = \frac{2}{\frac{1}{10} + \frac{1}{15}} = 12 \text{ km/hr} \]
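To see why the harmonic mean is the right average here, the sketch below compares it with total distance divided by total time. The one-way distance of 30 km is a hypothetical value; any distance gives the same answer.

```python
from fractions import Fraction

speeds = [10, 15]   # km/hr, going and returning

# Simple harmonic mean: n divided by the sum of the reciprocals.
hm = len(speeds) / sum(Fraction(1, s) for s in speeds)

# Cross-check with a hypothetical one-way distance of 30 km.
distance = 30
total_time = sum(Fraction(distance, s) for s in speeds)   # 3 h + 2 h
avg_speed = 2 * distance / total_time                     # total distance / total time

print(hm, avg_speed)
```

The arithmetic mean of the two speeds would be 12.5 km/hr, which overstates the true average because more time is spent at the slower speed.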
3.2.2 The mode: merits and demerits
Mode is a value which occurs most frequently in a set of values. The mode may not exist and even if it
does exist, it may not be unique. In the case of a discrete distribution, the value having the maximum frequency
is the modal value.
Examples:
1. Find the mode of 5, 3, 5, 8, 9
Mode =5
2. Find the mode of 8, 9, 9, 7, 8, 2, and 5.
It is a bimodal Data: 8 and 9
3. Find the mode of 4, 12, 3, 6, and 7.
No mode for this data.
The mode of a set of numbers X1, X2, …Xn is usually denoted by X̂ .
Mode for Grouped data
If data are given in the shape of a continuous frequency distribution, the mode is defined as:
\[ \hat{X} = L_{mo} + w \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) \]
Where:
\( \hat{X} \) = the mode of the distribution
\( L_{mo} \) = the lower class boundary of the modal class
\( w \) = the size of the modal class
\( \Delta_1 = f_{mo} - f_1 \)
\( \Delta_2 = f_{mo} - f_2 \)
\( f_{mo} \) = frequency of the modal class
\( f_1 \) = frequency of the class preceding the modal class
\( f_2 \) = frequency of the class following the modal class
Note: The modal class is a class with the highest frequency.
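As a sketch of how the formula is applied, the code below uses the farm-size distribution from the example that follows. Treating a missing neighbouring class as having frequency zero is an assumption of this sketch, not something stated in the text.

```python
# (lower boundary, upper boundary, frequency) for the farm-size example.
classes = [(5, 15, 8), (15, 25, 12), (25, 35, 17), (35, 45, 29),
           (45, 55, 31), (55, 65, 5), (65, 75, 3)]
freqs = [f for _, _, f in classes]

# Modal class: the class with the highest frequency.
idx = freqs.index(max(freqs))
lower, upper, f_mo = classes[idx]

# Neighbouring frequencies (assumed 0 if the modal class is first or last).
f1 = freqs[idx - 1] if idx > 0 else 0
f2 = freqs[idx + 1] if idx < len(freqs) - 1 else 0

d1, d2 = f_mo - f1, f_mo - f2
mode = lower + (upper - lower) * d1 / (d1 + d2)
print(round(mode, 2))
```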

Example: Following is the distribution of the size of certain farms selected at random from a district.
Calculate the mode of the distribution.
Size of farms No. of farms
5-15 8
15-25 12
25-35 17
35-45 29
45-55 31
55-65 5
65-75 3
Solutions:
45 – 55 is the modal class, since it is the class with the highest frequency.
\( L_{mo} = 45 \), \( w = 10 \), \( f_{mo} = 31 \), \( f_1 = 29 \), \( f_2 = 5 \)
\( \Delta_1 = f_{mo} - f_1 = 2 \)
\( \Delta_2 = f_{mo} - f_2 = 26 \)
\[ \hat{X} = 45 + 10 \left( \frac{2}{2 + 26} \right) = 45.71 \]

Merits and Demerits of Mode


Merits:
 It is not affected by extreme observations.
 Easy to calculate and simple to understand.
 It can be calculated for distribution with open end class
Demerits:
 It is not rigidly defined.
 It is not based on all observations
 It is not suitable for further mathematical treatment.
 It is not stable average, i.e. it is affected by fluctuations of sampling to some extent.
 Often its value is not unique.
Note: Being the point of maximum density, mode is especially useful in finding the most popular size in
studies relating to marketing, trade, business, and industry. It is the appropriate average to be used to find
the ideal size.
3.2.3 The median: merits and demerits of median
In a distribution, the median is the value of the variable which divides it into two equal halves. In an ordered
series of data, the median is the observation lying exactly in the middle of the series. It is the middle most
value in the sense that the number of values less than the median is equal to the number of values greater
than it. If X1, X2, …, Xn are the observations, then the numbers arranged in ascending order will be X[1],
X[2], …, X[n], where X[i] is the ith smallest value, so that X[1] ≤ X[2] ≤ … ≤ X[n].
The median is denoted by \( \tilde{X} \).
Median for ungrouped data
\[ \tilde{X} = \begin{cases} X_{[(n+1)/2]}, & \text{if } n \text{ is odd} \\ \frac{1}{2} \left( X_{[n/2]} + X_{[(n/2)+1]} \right), & \text{if } n \text{ is even} \end{cases} \]

Example: Find the median of the following numbers.
a) 6, 5, 2, 8, 9, 4.
b) 2, 1, 8, 3, 5.
Solutions:
a) First order the data: 2, 4, 5, 6, 8, 9; here n = 6.
\[ \tilde{X} = \frac{1}{2} \left( X_{[n/2]} + X_{[(n/2)+1]} \right) = \frac{1}{2} \left( X_{[3]} + X_{[4]} \right) = \frac{1}{2} (5 + 6) = 5.5 \]
b) Order the data: 1, 2, 3, 5, 8; here n = 5.
\[ \tilde{X} = X_{[(n+1)/2]} = X_{[3]} = 3 \]
Example: After the third-grade classes in a school district received low overall scores on a statewide
reading test, a supplemental reading program was implemented in order to provide extra help to those
students who were below expectations with respect to their reading proficiency. Six months after
implementing the program, the 10 third-grade classes in the district were reexamined. For each of the 10
schools, the percentage of students reading above the statewide standard was determined. These data are
shown here.
95 86 78 90 62 73 89 92 84 76
Determine the median percentage of the 10 schools.
Solution: First we must arrange the percentage in order of magnitude.
62 73 76 78 84 86 89 90 92 95
Because there is an even number of measurements, the median is the average of the two middle scores: (84 + 86)/2 = 85.
Example: An experiment was conducted to measure the effectiveness of a new procedure for pruning
grapes. Each of 13 workers was assigned the task of pruning an acre of grapes. The productivity,
measured in worker-hours/acre, is recorded for each person.
4.4 4.9 4.2 4.4 4.8 4.9 4.8 4.5 4.3 4.8 4.7 4.4 4.2
Determine the mode and median productivity for the group.
Solution: First arrange the measurements in order of magnitude:
4.2 4.2 4.3 4.4 4.4 4.4 4.5 4.7 4.8 4.8 4.8 4.9 4.9
For these data, we have two measurements appearing three times each. Hence, the data are bimodal, with
modes of 4.4 and 4.8. The median for the odd number of measurements is the middle score, 4.5.
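Both worked examples can be checked with Python's standard `statistics` module. This is a sketch; `multimode` requires Python 3.8 or later, and the list names are assumptions.

```python
import statistics

# Pruning productivity (worker-hours/acre) for the 13 workers.
hours = [4.4, 4.9, 4.2, 4.4, 4.8, 4.9, 4.8, 4.5, 4.3, 4.8, 4.7, 4.4, 4.2]
median_hours = statistics.median(hours)          # middle value, n is odd
modes = sorted(statistics.multimode(hours))      # every most-frequent value

# Reading percentages for the 10 third-grade classes.
reading = [95, 86, 78, 90, 62, 73, 89, 92, 84, 76]
median_reading = statistics.median(reading)      # mean of the two middle values

print(median_hours, modes, median_reading)
```

`statistics.median` sorts the data internally, so the values do not need to be ordered first.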
Median for grouped data: If data are given in the shape of a continuous frequency distribution, the
median is defined as:
\[ \tilde{X} = L_{med} + \frac{w}{f_{med}} \left( \frac{n}{2} - c \right) \]
Where:
\( L_{med} \) = lower class boundary of the median class.
\( w \) = the size of the median class.
\( n \) = total number of observations.
\( c \) = the cumulative frequency (less than type) preceding the median class.
\( f_{med} \) = the frequency of the median class.

Remark: The median class is the class with the smallest cumulative frequency (less than type) greater than
or equal to n/2.
Example: Find the median of the following distribution.
Class Frequency
40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3
Solutions:
 First find the less than cumulative frequency.
 Identify the median class.
 Find median using formula.
Class Frequency Cumu.Freq(less than type)
40-44 7 7
45-49 10 17
50-54 22 39
55-59 15 54
60-64 12 66
65-69 6 72
70-74 3 75
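The cumulative frequency column and the median formula can be sketched in Python as follows. The code assumes, as in this table, that consecutive class limits differ by one unit, so that class boundaries lie 0.5 below and above the limits.

```python
from itertools import accumulate

# (lower limit, upper limit, frequency) from the table above.
classes = [(40, 44, 7), (45, 49, 10), (50, 54, 22), (55, 59, 15),
           (60, 64, 12), (65, 69, 6), (70, 74, 3)]

cum = list(accumulate(f for _, _, f in classes))   # less than cumulative frequency
n = cum[-1]

# Median class: first class whose cumulative frequency is >= n/2.
idx = next(i for i, cf in enumerate(cum) if cf >= n / 2)
lower_limit, upper_limit, f_med = classes[idx]

L_med = lower_limit - 0.5                 # lower class boundary (assumed unit gap)
w = (upper_limit + 0.5) - L_med           # class size
c = cum[idx - 1] if idx > 0 else 0        # cumulative frequency before the class

median = L_med + (w / f_med) * (n / 2 - c)
print(round(median, 2))
```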
\[ \frac{n}{2} = \frac{75}{2} = 37.5 \]
39 is the first cumulative frequency to be greater than or equal to 37.5, so 50 – 54 is the median class.
\( L_{med} = 49.5 \), \( w = 5 \), \( n = 75 \), \( c = 17 \), \( f_{med} = 22 \)
\[ \tilde{X} = L_{med} + \frac{w}{f_{med}} \left( \frac{n}{2} - c \right) = 49.5 + \frac{5}{22} (37.5 - 17) = 54.16 \]
Merits and Demerits of Median
Merits:
 Median is a positional average and hence not influenced by extreme observations.
 Median can be calculated in the case of open end intervals.
 Median can be located even if the data are incomplete.
Demerits:
 It is not a good representative of data if the number of items is small.
 It is not amenable to further algebraic treatment.
 It is susceptible to sampling fluctuations.

ACTIVITY 3.2

1. Compute the mean, median, and mode for the following data:
55 85 90 50 110 115 75 85 8 23
70 65 50 60 90 90 55 70 5 31
2. Refer to the first question above, with the measurements 110 and 115 replaced by 345 and
467. Recompute the mean, median, and mode. Discuss the impact of these extreme measurements on
the three measures of central tendency.
3. Determine the mean, median, and mode for the data presented in the following frequency table.
Class Interval Frequency
2.0 – 4.9 5
5.0 –7.9 13
8.0 –10.9 16
11.0 –13.9 9
14.0 –16.9 4
17.0 –19.9 2
20.0 –22.9 2

3.2.4 The quantiles: quartiles, deciles and percentiles


When a distribution is arranged in order of magnitude of items, the median is the value of the middle
term. Other measures that depend on their positions in the distribution (quartiles, deciles, and percentiles)
are collectively called quantiles.
Quartiles: Quartiles are measures that divide the frequency distribution in to four equal parts. The value
of the variables corresponding to these divisions are denoted Q1, Q2, and Q3 often called the first, the
second and the third quartile respectively. Q1 is a value which has 25% items which are less than or equal
to it. Similarly Q2 has 50%items with value less than or equal to it and Q3 has 75% items whose values
are less than or equal to it.
To find Qi (i = 1, 2, 3) we count iN/4 of the observations, beginning from the lowest class.
For grouped data we have the following formula:
\[ Q_i = L_{Q_i} + \frac{w}{f_{Q_i}} \left( \frac{iN}{4} - c \right), \quad i = 1, 2, 3 \]
Where:
\( L_{Q_i} \) = lower class boundary of the quartile class.
\( w \) = the size of the quartile class.
\( N \) = total number of observations.
\( c \) = the cumulative frequency (less than type) preceding the quartile class.
\( f_{Q_i} \) = the frequency of the quartile class.

Remark: The quartile class (the class containing Qi) is the class with the smallest cumulative frequency (less
than type) greater than or equal to iN/4.
Deciles: Deciles are measures that divide the frequency distribution in to ten equal parts. The values of
the variables corresponding to these divisions are denoted D1, D2,.. D9 often called the first, the second,…,
the ninth decile respectively.
To find Di (i = 1, 2, …, 9) we count iN/10 of the observations, beginning from the lowest class.

For grouped data we have the following formula:
\[ D_i = L_{D_i} + \frac{w}{f_{D_i}} \left( \frac{iN}{10} - c \right), \quad i = 1, 2, \dots, 9 \]
Where:
\( L_{D_i} \) = lower class boundary of the decile class.
\( w \) = the size of the decile class.
\( N \) = total number of observations.
\( c \) = the cumulative frequency (less than type) preceding the decile class.
\( f_{D_i} \) = the frequency of the decile class.

Remark: The decile class is the class with the smallest cf (less than type) greater than or equal to iN/10.
Percentiles: Percentiles are measures that divide the frequency distribution in to hundred equal parts. The
values of the variables corresponding to these divisions are denoted P1, P2,.. P99 often called the first, the
second,…, the ninety-ninth percentile respectively.
To find Pi (i = 1, 2, …, 99) we count iN/100 of the observations, beginning from the lowest class.
For grouped data we have the following formula:
\[ P_i = L_{P_i} + \frac{w}{f_{P_i}} \left( \frac{iN}{100} - c \right), \quad i = 1, 2, \dots, 99 \]
Where:
\( L_{P_i} \) = lower class boundary of the percentile class.
\( w \) = the size of the percentile class.
\( N \) = total number of observations.
\( c \) = the cumulative frequency (less than type) preceding the percentile class.
\( f_{P_i} \) = the frequency of the percentile class.

Remark: The percentile class is the class with the smallest cf (less than type) greater than or equal to iN/100.
Example: Considering the following distribution calculate:
 All quartiles.
 The 7th decile.
 The 90th percentile.
Values Frequency
140- 150 17
150- 160 29
160- 170 42
170- 180 72
180- 190 84
190- 200 107
200- 210 49
210- 220 34
220- 230 31
230- 240 16
240- 250 12
Solutions:
 First find the less than cumulative frequency.
 Use the formula to calculate the required quantile.

Values Frequency Cum.Freq(less than type)
140- 150 17 17
150- 160 29 46
160- 170 42 88
170- 180 72 160
180- 190 84 244
190- 200 107 351
200- 210 49 400
210- 220 34 434
220- 230 31 465
230- 240 16 481
240- 250 12 493
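Because quartiles, deciles and percentiles share one formula, a single helper can compute all of them. The function name `grouped_quantile` and its `parts` argument (4, 10 or 100) are hypothetical names used only in this sketch.

```python
from itertools import accumulate

# (lower boundary, upper boundary, frequency) for the distribution above.
classes = [(140, 150, 17), (150, 160, 29), (160, 170, 42), (170, 180, 72),
           (180, 190, 84), (190, 200, 107), (200, 210, 49), (210, 220, 34),
           (220, 230, 31), (230, 240, 16), (240, 250, 12)]


def grouped_quantile(classes, i, parts):
    """i-th quantile when the distribution is divided into `parts` equal parts."""
    cum = list(accumulate(f for _, _, f in classes))
    target = i * cum[-1] / parts                  # iN/4, iN/10 or iN/100
    # Quantile class: smallest cumulative frequency >= target.
    idx = next(k for k, cf in enumerate(cum) if cf >= target)
    lower, upper, freq = classes[idx]
    c = cum[idx - 1] if idx > 0 else 0            # cumulative frequency before it
    return lower + (upper - lower) / freq * (target - c)


for label, i, parts in [("Q1", 1, 4), ("Q2", 2, 4), ("Q3", 3, 4),
                        ("D7", 7, 10), ("P90", 90, 100)]:
    print(label, round(grouped_quantile(classes, i, parts), 2))
```

Calling the helper with `parts=4` and `i=2` gives the median, since the second quartile and the median coincide.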
Q1: Determine the class containing the first quartile.
\[ \frac{N}{4} = \frac{493}{4} = 123.25 \]
170 – 180 is the class containing the first quartile.
\( L_{Q_1} = 170 \), \( w = 10 \), \( N = 493 \), \( c = 88 \), \( f_{Q_1} = 72 \)
\[ Q_1 = L_{Q_1} + \frac{w}{f_{Q_1}} \left( \frac{N}{4} - c \right) = 170 + \frac{10}{72} (123.25 - 88) = 174.90 \]

Q2: Determine the class containing the second quartile.
\[ \frac{2N}{4} = \frac{2 \times 493}{4} = 246.5 \]
190 – 200 is the class containing the second quartile.
\( L_{Q_2} = 190 \), \( w = 10 \), \( N = 493 \), \( c = 244 \), \( f_{Q_2} = 107 \)
\[ Q_2 = L_{Q_2} + \frac{w}{f_{Q_2}} \left( \frac{2N}{4} - c \right) = 190 + \frac{10}{107} (246.5 - 244) = 190.23 \]

Q3: Determine the class containing the third quartile.
\[ \frac{3N}{4} = \frac{3 \times 493}{4} = 369.75 \]
200 – 210 is the class containing the third quartile.
\( L_{Q_3} = 200 \), \( w = 10 \), \( N = 493 \), \( c = 351 \), \( f_{Q_3} = 49 \)
\[ Q_3 = L_{Q_3} + \frac{w}{f_{Q_3}} \left( \frac{3N}{4} - c \right) = 200 + \frac{10}{49} (369.75 - 351) = 203.83 \]

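D7: Determine the class containing the 7th decile.
\[ \frac{7N}{10} = \frac{7 \times 493}{10} = 345.1 \]
190 – 200 is the class containing the seventh decile.
\( L_{D_7} = 190 \), \( w = 10 \), \( N = 493 \), \( c = 244 \), \( f_{D_7} = 107 \)
\[ D_7 = L_{D_7} + \frac{w}{f_{D_7}} \left( \frac{7N}{10} - c \right) = 190 + \frac{10}{107} (345.1 - 244) = 199.45 \]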

P90: Determine the class containing the 90th percentile.
\[ \frac{90N}{100} = \frac{90 \times 493}{100} = 443.7 \]
220 – 230 is the class containing the 90th percentile.
\( L_{P_{90}} = 220 \), \( w = 10 \), \( N = 493 \), \( c = 434 \), \( f_{P_{90}} = 31 \)
\[ P_{90} = L_{P_{90}} + \frac{w}{f_{P_{90}}} \left( \frac{90N}{100} - c \right) = 220 + \frac{10}{31} (443.7 - 434) = 223.13 \]

3.3 Measures of Dispersion (Variation)


The scatter or spread of items of a distribution is known as dispersion or variation. In other words the
degree to which numerical data tend to spread about an average value is called dispersion or variation of
the data.
Measures of dispersions are statistical measures which provide ways of measuring the extent in which
data are dispersed or spread out.
3.3.1 Types of measures of Variation
We can classify measures of variation as absolute and relative measures of variation. The measures of
dispersion which are expressed in terms of the original unit of a series are termed as absolute measures.
Such measures are not suitable for comparing the variability of two distributions which are expressed in
different units of measurement and different average size.
Example: Range, Quartile deviation, Mean Deviation, Variance and Standard deviation.
Relative measures of variation are a ratio or percentage of a measure of absolute variation to an
appropriate measure of central tendency and are thus pure numbers independent of the units of
measurement. They are used for comparing the variability of two distributions (even if they are measured
in the same unit).
Example: Relative range, Coefficient of Mean deviation, Coefficient of standard deviation, Standard
Scores (Z-scores) and Coefficient of Variation (C.V).
The Range (R): The range is the largest score minus the smallest score. It is a quick and dirty measure of
variability, although when a test is given back to students they very often wish to know the range of
scores. Because the range is greatly affected by extreme scores, it may give a distorted picture of the
scores. The following two distributions have the same range, 13, yet appear to differ greatly in the amount
of variability.
Distribution 1: 32 35 36 36 37 38 40 42 42 43 43 45
Distribution 2: 32 32 33 33 33 34 34 34 34 34 35 45
For this reason, among others, the range is not the most important measure of variability.
\( R = L - S \), where L is the largest observation and S is the smallest observation.
Range for grouped data: If data are given in the shape of a continuous frequency distribution, the range is
computed as:
\( R = UCL_k - LCL_1 \), where \( UCL_k \) is the upper class limit of the last class and
\( LCL_1 \) is the lower class limit of the first class.
This is sometimes expressed as:
\( R = X_k - X_1 \), where \( X_k \) is the class mark of the last class and
\( X_1 \) is the class mark of the first class.

Relative Range (RR): It is also sometimes called the coefficient of range and is given by:
\[ RR = \frac{L - S}{L + S} = \frac{R}{L + S} \]
Example:
1. The largest and smallest values for certain raw data are 100 and 60 respectively. What are the
range and relative range for the data?
Solution (1):
Largest (L) = 100, Smallest (S) = 60. Then R = L - S = 100 - 60 = 40, and
\[ RR = \frac{L - S}{L + S} = \frac{R}{L + S} = \frac{40}{100 + 60} = 0.25 \]
2. If the range and relative range of a series are 4 and 0.25 respectively, what are the values of the
smallest and largest observations?
Solution (2):
R = 4 gives L - S = 4 … equation (1), and RR = 0.25 gives (L - S)/(L + S) = 0.25, so L + S = 16 … equation (2).
Solving (1) and (2) simultaneously gives L = 10 and S = 6.
Quartile deviation and Coefficient of Quartile deviation
The quartile deviation is also called the semi-inter quartile range. The inter quartile range is the difference
between the third and the first quartiles of a set of items, and the semi-inter quartile range is half of the inter
quartile range.
\[ Q.D = \frac{Q_3 - Q_1}{2} \]
Coefficient of Quartile Deviation (C.Q.D):
\[ C.Q.D = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{2 \times Q.D}{Q_3 + Q_1} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]
It gives the average amount by which the two quartiles differ from the median.
Example: Compute Q.D and its coefficient for the following distribution
Values Frequency
140- 150 17
150- 160 29
160- 170 42
170- 180 72
180- 190 84
190- 200 107
200- 210 49
210- 220 34
220- 230 31
230- 240 16
240- 250 12
Solutions: In the previous section we obtained the values of all quartiles as:
Q1 = 174.90, Q2 = 190.23, Q3 = 203.83
\[ Q.D = \frac{Q_3 - Q_1}{2} = \frac{203.83 - 174.90}{2} = 14.47 \]
\[ C.Q.D = \frac{2 \times Q.D}{Q_3 + Q_1} = \frac{2 \times 14.47}{203.83 + 174.90} = 0.076 \]
Remark: Q.D or C.Q.D includes only the middle 50% of the observation.

Mean deviation and Coefficient of Mean deviation
The mean deviation of a set of items is defined as the arithmetic mean of the values of the absolute
deviations from a given average. Depending up on the type of averages used we have different mean
deviations.
a) Mean Deviation about the mean: It is denoted by \( M.D(\bar{X}) \) and given by:
\[ M.D(\bar{X}) = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n} \]
For the case of a frequency distribution it is given as:
\[ M.D(\bar{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \bar{X}|}{n} \]
Steps to calculate \( M.D(\bar{X}) \):
1. Find the arithmetic mean, \( \bar{X} \).
2. Find the deviations of each reading from \( \bar{X} \).
3. Find the arithmetic mean of the deviations, ignoring sign.
b) Mean Deviation about the median: It is denoted by \( M.D(\tilde{X}) \) and given by:
\[ M.D(\tilde{X}) = \frac{\sum_{i=1}^{n} |X_i - \tilde{X}|}{n} \]
For the case of a frequency distribution it is given as:
\[ M.D(\tilde{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \tilde{X}|}{n} \]
Steps to calculate \( M.D(\tilde{X}) \):
1. Find the median, \( \tilde{X} \).
2. Find the deviations of each reading from \( \tilde{X} \).
3. Find the arithmetic mean of the deviations, ignoring sign.
c) Mean Deviation about the mode: It is denoted by \( M.D(\hat{X}) \) and given by:
\[ M.D(\hat{X}) = \frac{\sum_{i=1}^{n} |X_i - \hat{X}|}{n} \]
For the case of a frequency distribution it is given as:
\[ M.D(\hat{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \hat{X}|}{n} \]
Steps to calculate \( M.D(\hat{X}) \):
1. Find the mode, \( \hat{X} \).
2. Find the deviations of each reading from \( \hat{X} \).
3. Find the arithmetic mean of the deviations, ignoring sign.

Examples:
1. The following are the numbers of visits made by ten mothers to the local doctor's surgery: 8, 6, 5, 5, 7,
4, 5, 9, 7, 4. Find the mean deviation about the mean, median and mode.
Solutions: First calculate the three averages
~
X  6, X  5.5, Xˆ  5
Then take the deviations of each observation from these averages.
Xi 4 4 5 5 5 6 7 7 8 9 total
X 6
i
2 2 1 1 1 0 1 1 2 3 14

X i  5 .5 1.5 1.5 0.5 0.5 0.5 0.5 1.5 1.5 2.5 3.5 14

Xi  5 1 1 0 0 0 1 2 2 3 4 14

10

 X i  6)
14
 M .D ( X )  i 1
  1.4
10 10
10

~ X i  5.5
14
M .D( X )  i 1
  1.4
10 10
10

X i  5)
14
M .D( Xˆ )  i 1
  1.4
10 10
Remark: Mean deviation is always minimum about the median.
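As a check on the example above, the three mean deviations can be computed with Python's standard `statistics` module. This is a minimal sketch; the helper name `mean_deviation` is an assumption made here for illustration.

```python
from statistics import mean, median, mode

def mean_deviation(data, center):
    """Arithmetic mean of the absolute deviations from `center`."""
    return sum(abs(x - center) for x in data) / len(data)

visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]

md_mean = mean_deviation(visits, mean(visits))      # about the mean (6)
md_median = mean_deviation(visits, median(visits))  # about the median (5.5)
md_mode = mean_deviation(visits, mode(visits))      # about the mode (5)

print(md_mean, md_median, md_mode)  # 1.4 1.4 1.4
```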
Coefficient of Mean Deviation (C.M.D):
\[ C.M.D = \frac{M.D}{\text{Average about which deviations are taken}} \]
\[ C.M.D(\bar{X}) = \frac{M.D(\bar{X})}{\bar{X}}, \quad C.M.D(\tilde{X}) = \frac{M.D(\tilde{X})}{\tilde{X}}, \quad C.M.D(\hat{X}) = \frac{M.D(\hat{X})}{\hat{X}} \]
Example: Calculate the C.M.D about the mean, median and mode for the data in example 1 above.
Solution:
\[ C.M.D(\bar{X}) = \frac{M.D(\bar{X})}{\bar{X}} = \frac{1.4}{6} = 0.233 \]
\[ C.M.D(\tilde{X}) = \frac{M.D(\tilde{X})}{\tilde{X}} = \frac{1.4}{5.5} = 0.255 \]
\[ C.M.D(\hat{X}) = \frac{M.D(\hat{X})}{\hat{X}} = \frac{1.4}{5} = 0.28 \]
Population Variance:
If we divide the sum of the squared deviations from the mean by the number of values in the population, we get the population variance. This variance is the "average squared deviation from the mean".
\[ \text{Population variance} = \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2 \]
For the case of a frequency distribution it is expressed as:
\[ \text{Population variance} = \sigma^2 = \frac{1}{N}\sum_{i=1}^{k} f_i (X_i - \mu)^2 \]

Sample Variance:
One would expect the sample variance to simply be the population variance with the population mean replaced by the sample mean. However, one of the major uses of a sample statistic is to estimate the corresponding population parameter, and dividing by n gives a value that tends to underestimate the population variance. To counteract this, the sum of the squares of the deviations is divided by one less than the sample size.
\[ \text{Sample variance} = S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 \]
For the case of a frequency distribution it is expressed as:
\[ \text{Sample variance} = S^2 = \frac{1}{n-1}\sum_{i=1}^{k} f_i (X_i - \bar{X})^2 \]
We usually use the following shortcut formulas:
\[ S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1} \quad \text{(raw data)}, \qquad S^2 = \frac{\sum_{i=1}^{k} f_i X_i^2 - n\bar{X}^2}{n-1} \quad \text{(frequency distribution)} \]
Standard deviation: There is a problem with variances. Recall that the deviations were squared, which means that the units were also squared. To get the units back to those of the original data values, the square root must be taken.
\[ \text{Population standard deviation} = \sigma = \sqrt{\sigma^2} \]
\[ \text{Sample standard deviation} = S = \sqrt{S^2} \]
The following steps are used to calculate the sample variance:
1. Find the arithmetic mean.
2. Find the difference between each observation and the mean.
3. Square these differences.
4. Sum the squared differences.
5. Since the data is a sample, divide the number from step 4 by the number of observations minus one, i.e. n - 1 (where n is the number of observations in the data set).
Example: Find the variance and standard deviation of the following sample data: 5, 17, 12, 10
Solution: \(\bar{X} = 11\)
X_i                  5    10   12   17   Total
(X_i - \bar{X})^2    36   1    1    36   74
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{74}{3} = 24.67 \]
\[ S = \sqrt{S^2} = \sqrt{24.67} = 4.97 \]
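The five steps above, the shortcut formula and the library routines should all agree on the sample 5, 17, 12, 10. The following sketch compares them; the variable names are illustrative only.

```python
from statistics import mean, stdev, variance

sample = [5, 17, 12, 10]
n = len(sample)
x_bar = mean(sample)  # step 1: arithmetic mean

# Steps 2-5: squared deviations, summed, divided by n - 1.
s2_definition = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

# Shortcut formula: (sum of X^2 - n * mean^2) / (n - 1).
s2_shortcut = (sum(x * x for x in sample) - n * x_bar ** 2) / (n - 1)

# Library versions (both use the n - 1 divisor).
s2_library = variance(sample)
s = stdev(sample)

print(round(s2_definition, 2), round(s2_shortcut, 2), round(s2_library, 2), round(s, 2))
```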
Special properties of the standard deviation
1. The standard deviation computed about the mean is smaller than the same quantity computed about any other value A:
\[ \sqrt{\frac{\sum (X_i - \bar{X})^2}{n-1}} < \sqrt{\frac{\sum (X_i - A)^2}{n-1}}, \quad A \neq \bar{X} \]
2. For a normal (symmetric) distribution the following holds:
- Approximately 68.27% of the data values fall within one standard deviation of the mean, i.e. within \((\bar{X} - S, \bar{X} + S)\).
- Approximately 95.45% of the data values fall within two standard deviations of the mean, i.e. within \((\bar{X} - 2S, \bar{X} + 2S)\).
- Approximately 99.73% of the data values fall within three standard deviations of the mean, i.e. within \((\bar{X} - 3S, \bar{X} + 3S)\).
3. Chebyshev's Theorem: For any data set, no matter what the pattern of variation, the proportion of the values that fall within k standard deviations of the mean, i.e. within \((\bar{X} - kS, \bar{X} + kS)\), will be at least \(1 - \frac{1}{k^2}\), where k is any number greater than 1.
That is, the proportion of items falling beyond k standard deviations of the mean is at most \(\frac{1}{k^2}\).
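Example: Suppose a distribution has mean 50 and standard deviation 6. What percent of the numbers are:
a) Between 38 and 62
b) Between 32 and 68
c) Less than 38 or more than 62
d) Less than 32 or more than 68
Solutions:
a) 38 and 62 are at equal distance from the mean, 50, and this distance is 12.
\[ kS = 12 \Rightarrow k = \frac{12}{S} = \frac{12}{6} = 2 \]
Applying the above theorem, at least \((1 - \frac{1}{k^2}) \times 100\% = 75\%\) of the numbers lie between 38 and 62.
b) Done similarly: the distance from the mean is 18, so k = 18/6 = 3 and at least \((1 - \frac{1}{9}) \times 100\% \approx 88.9\%\) of the numbers lie between 32 and 68.
c) It is just the complement of a), i.e. at most \(\frac{1}{k^2} \times 100\% = 25\%\) of the numbers are less than 38 or more than 62.
d) Done similarly: it is the complement of b), i.e. at most \(\frac{1}{9} \times 100\% \approx 11.1\%\) of the numbers are less than 32 or more than 68.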
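The Chebyshev bounds in this example can be sketched as a small function. The name `chebyshev_bound` and its argument layout are assumptions for illustration; the function only handles intervals that are symmetric about the mean, as in the example.

```python
def chebyshev_bound(mean, sd, lower, upper):
    """Minimum proportion of values inside (lower, upper) by Chebyshev's theorem.

    Assumes the interval is symmetric about the mean and wider than one
    standard deviation on each side (k > 1).
    """
    if abs((mean - lower) - (upper - mean)) > 1e-12:
        raise ValueError("interval must be symmetric about the mean")
    k = (upper - mean) / sd
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

inside_a = chebyshev_bound(50, 6, 38, 62)   # part a), k = 2
inside_b = chebyshev_bound(50, 6, 32, 68)   # part b), k = 3
outside_c = 1 - inside_a                    # part c), at most this proportion
outside_d = 1 - inside_b                    # part d)

print(inside_a, round(inside_b, 4), outside_c, round(outside_d, 4))
```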
If the standard deviation of \(X_1, X_2, \dots, X_n\) is S, then the standard deviation of
a) \(X_1 + k, X_2 + k, \dots, X_n + k\) will also be S
b) \(kX_1, kX_2, \dots, kX_n\) will be \(|k|S\)
c) \(a + kX_1, a + kX_2, \dots, a + kX_n\) will be \(|k|S\)
Exercise: Verify each of the above relationships, considering k and a as constants.
Examples:
1. The mean and standard deviation of n Tetracycline capsules \(X_1, X_2, \dots, X_n\) are known to be 12 gm and 3 gm respectively. A new set of capsules of another drug is obtained by the linear transformation \(Y_i = 2X_i - 0.5\) (i = 1, 2, …, n). What will be the standard deviation of the new set of capsules?
2. The mean and the standard deviation of a set of numbers are respectively 500 and 10.
a. If 10 is added to each of the numbers in the set, then what will be the variance and standard deviation of the new set?
b. If each of the numbers in the set is multiplied by -5, then what will be the variance and standard deviation of the new set?
Solutions:
1. Using c) above, the new standard deviation = \(|k|S = 2 \times 3 = 6\) gm.
2.
a. They will remain the same: the variance is \(10^2 = 100\) and the standard deviation is 10.
b. New standard deviation = \(|k|S = |-5| \times 10 = 50\), so the new variance is \(50^2 = 2500\).
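The effect of shifting and scaling on the standard deviation can be checked numerically. The data values below are hypothetical (they are not from the text); any sample would do, since the relationships hold in general.

```python
from statistics import stdev

# Hypothetical sample, chosen only to illustrate the rules.
data = [9.0, 12.0, 15.0, 10.5, 13.5]
s = stdev(data)

sd_shifted = stdev([x + 10 for x in data])     # X + k   -> S
sd_scaled = stdev([-5 * x for x in data])      # kX      -> |k| S
sd_linear = stdev([2 * x - 0.5 for x in data]) # a + kX  -> |k| S

print(s, sd_shifted, sd_scaled, sd_linear)
```

Adding a constant leaves the spread unchanged, while multiplying by k stretches it by |k|, which is why the sign of -5 does not matter in solution 2b.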
Coefficient of Variation (C.V):
- It is defined as the ratio of the standard deviation to the mean, usually expressed as a percentage.
\[ C.V = \frac{S}{\bar{X}} \times 100 \]
- The distribution having the smaller C.V is said to be less variable or more consistent.
Examples:
1. An analysis of the monthly wages paid (in Birr) to workers in two firms A and B belonging to the same industry gives the following results:
Value          Firm A   Firm B
Mean wage      52.5     47.5
Median wage    50.5     45.5
Variance       100      121
In which firm, A or B, is there greater variability in individual wages?
Solution:
Calculate the coefficient of variation for both firms.
\[ C.V_A = \frac{S_A}{\bar{X}_A} \times 100 = \frac{10}{52.5} \times 100 = 19.05\% \]
\[ C.V_B = \frac{S_B}{\bar{X}_B} \times 100 = \frac{11}{47.5} \times 100 = 23.16\% \]
Since C.V_A < C.V_B, there is greater variability in individual wages in firm B.
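The comparison of the two firms can be reproduced as follows. This is a sketch under the assumption that the table gives variances, so the standard deviations are their square roots; the function name is illustrative.

```python
from math import sqrt

def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100

# Firm data from the table: (mean wage, variance).
cv_a = coefficient_of_variation(sqrt(100), 52.5)
cv_b = coefficient_of_variation(sqrt(121), 47.5)

more_variable = "B" if cv_b > cv_a else "A"
print(f"C.V_A = {cv_a:.2f}%, C.V_B = {cv_b:.2f}%, more variable firm: {more_variable}")
```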
3.3.2 Standard Scores (Z-scores):
- If X is a measurement from a distribution with mean \(\bar{X}\) and standard deviation S, then its value in standard units is
\[ Z = \frac{X - \mu}{\sigma} \ \text{(for a population)}, \qquad Z = \frac{X - \bar{X}}{S} \ \text{(for a sample)} \]
- Z gives the deviation from the mean in units of standard deviation.
- Z gives the number of standard deviations a particular observation lies above or below the mean.
- It is used to compare two observations coming from different groups.
Examples:
1. Two sections were given an introduction to statistics examination. The following information was given:
Value                Section 1   Section 2
Mean                 78          90
Standard deviation   6           5
Student A from section 1 scored 90 and student B from section 2 scored 95. Relatively speaking, who performed better?
Solution:
Calculate the standard score of both students.
\[ Z_A = \frac{X_A - \bar{X}_1}{S_1} = \frac{90 - 78}{6} = 2 \]
\[ Z_B = \frac{X_B - \bar{X}_2}{S_2} = \frac{95 - 90}{5} = 1 \]
Student A performed better relative to his section, because the score of student A is two standard deviations above the mean score of his section, while the score of student B is only one standard deviation above the mean score of his section.
2. Two groups of people were trained to perform a certain task and tested to find out which group is faster to learn the task. For the two groups the following information was given:
Value                Group one   Group two
Mean                 10.4 min    11.9 min
Standard deviation   1.2 min     1.3 min

Relatively speaking:
a) Which group is more consistent in its performance?
b) Suppose person A from group one takes 9.2 minutes while person B from group two takes 9.3 minutes. Who was faster in performing the task? Why?
Solutions:
a) Use the coefficient of variation.
\[ C.V_1 = \frac{S_1}{\bar{X}_1} \times 100 = \frac{1.2}{10.4} \times 100 = 11.54\% \]
\[ C.V_2 = \frac{S_2}{\bar{X}_2} \times 100 = \frac{1.3}{11.9} \times 100 = 10.92\% \]
Since C.V_2 < C.V_1, group 2 is more consistent.
b) Calculate the standard scores of A and B.
\[ Z_A = \frac{X_A - \bar{X}_1}{S_1} = \frac{9.2 - 10.4}{1.2} = -1 \]
\[ Z_B = \frac{X_B - \bar{X}_2}{S_2} = \frac{9.3 - 11.9}{1.3} = -2 \]
Person B is faster, because the time taken by person B is two standard deviations shorter than the average time taken by group 2, while the time taken by person A is only one standard deviation shorter than the average time taken by group 1.
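The standard scores in the two examples above can be computed with a one-line function. The sketch below uses the figures from both examples; `z_score` is an illustrative name.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Example 1: exam scores in two sections.
z_student_a = z_score(90, 78, 6)
z_student_b = z_score(95, 90, 5)

# Example 2: task completion times (a lower time is better, so a more
# negative z-score means a relatively faster person).
z_person_a = z_score(9.2, 10.4, 1.2)
z_person_b = z_score(9.3, 11.9, 1.3)

print(z_student_a, z_student_b, round(z_person_a, 2), round(z_person_b, 2))
```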
CHAPTER FOUR
ELEMENTARY PROBABILITY
4.1 Introduction
In our daily life, it is not uncommon to hear words which express doubt or uncertainty about the happening of certain events. To mention some instances: "If by chance you meet her, please convey my heartfelt greeting", "Probably, he might not take the class today", etc. These statements show uncertainty about the happening of the event in question. In Statistics, however, sensible numerical statements can be made about uncertainty, and different approaches can be applied to calculate probabilities.
Probability theory is the foundation upon which the logic of inference is built. It helps us to cope with uncertainty. In general, probability is the chance of an outcome of an experiment. It is the measure of how likely an outcome is to occur. In any random experiment there is always uncertainty as to whether a particular event will or will not occur. As a measure of the chance, or probability, with which we can expect the event to occur, it is convenient to assign a number between 0 and 1. If we are sure or certain that the event will occur, we say that its probability is 100% or 1, but if we are sure that the event will not occur, we say that its probability is zero.
4.2 Definition and some Concepts
1. Experiment: Any process of observation or measurement, or any process which generates well-defined outcomes.
2. Probability Experiment: An experiment that can be repeated any number of times under similar conditions, and for which it is possible to enumerate all the possible outcomes without being able to predict an individual outcome. It is also called a random experiment.
Example: If a fair die is rolled once, it is possible to list all the possible outcomes, i.e. 1, 2, 3, 4, 5, 6, but it is not possible to predict which outcome will occur.
3. Outcome: The result of a single trial of a random experiment.
4. Sample Space: The set of all possible outcomes of a probability experiment.
5. Event: A subset of the sample space. It is a statement about one or more outcomes of a random experiment. Events are denoted by capital letters.
Example: Considering the above experiment,
let A be the event of odd numbers,
B be the event of even numbers, and
C be the event of number 8.
A = {1, 3, 5}
B = {2, 4, 6}
C = { } = \(\emptyset\), the empty set or impossible event
Remark: If S (the sample space) has n members, then there are exactly \(2^n\) subsets or events.
6. Equally Likely Events: Events which have the same chance of occurring.
7. Complement of an Event: The complement of an event A means the non-occurrence of A and is denoted by \(A'\), \(A^c\), or \(\bar{A}\). It contains those points of the sample space which do not belong to A.
8. Elementary Event: An event having only a single element or sample point.
9. Mutually Exclusive Events: Two events which cannot happen at the same time.
10. Independent Events: Two events are independent if the occurrence of one does not affect the probability of the other occurring.
11. Dependent Events: Two events are dependent if the occurrence of the first event changes the probability of the second event occurring.
Example: What is the sample space for each of the following experiments?
a) Toss a die one time.
b) Toss a coin two times.
c) A light bulb is manufactured. It is tested for its life length (time).
Solution:
a) S = {1, 2, 3, 4, 5, 6}
b) S = {HH, HT, TH, TT}
c) S = {t : t ≥ 0}
4.3 Counting Rules
In order to calculate probabilities, we have to know:
- The number of elements of an event and the number of elements of the sample space.
In order to determine the number of outcomes, one can use several rules of counting:
- The addition rule
- The multiplication rule
- The permutation rule
- The combination rule
To list the outcomes of a sequence of events, a useful device called a tree diagram is used.
Example: A student goes to the nearest snack bar to have breakfast. He can take tea, coffee, or milk with bread, cake, or a sandwich. How many possibilities does he have?
Solution:
Tea      Bread
         Cake
         Sandwich
Coffee   Bread
         Cake
         Sandwich
Milk     Bread
         Cake
         Sandwich
There are nine possibilities.
The Multiplication Rule: If a choice consists of k steps, of which the first can be made in \(n_1\) ways, the second in \(n_2\) ways, …, and the kth in \(n_k\) ways, then the whole choice can be made in \(n_1 \times n_2 \times \dots \times n_k\) ways.
Example: The digits 0, 1, 2, 3, and 4 are to be used in a 4-digit identification card. How many different cards are possible if a) repetitions are permitted, b) repetitions are not permitted?
Solutions:
a)
1st digit   2nd digit   3rd digit   4th digit
5           5           5           5
There are four steps:
1. Selecting the 1st digit, which can be done in 5 ways.
2. Selecting the 2nd digit, which can be done in 5 ways.
3. Selecting the 3rd digit, which can be done in 5 ways.
4. Selecting the 4th digit, which can be done in 5 ways.
So 5 × 5 × 5 × 5 = 625 different cards are possible.
b)
1st digit   2nd digit   3rd digit   4th digit
5           4           3           2
There are four steps:
1. Selecting the 1st digit, which can be done in 5 ways.
2. Selecting the 2nd digit, which can be done in 4 ways.
3. Selecting the 3rd digit, which can be done in 3 ways.
4. Selecting the 4th digit, which can be done in 2 ways.
So 5 × 4 × 3 × 2 = 120 different cards are possible.
Permutation: An arrangement of n objects in a specified order is called a permutation of the objects.
Permutation Rules:
1. The number of permutations of n distinct objects taken all together is n!, where \(n! = n(n-1)(n-2)\cdots 3 \cdot 2 \cdot 1\).
2. The arrangement of n objects in a specified order using r objects at a time is called the permutation of n objects taken r at a time. It is written as \({}_nP_r\) and the formula is
\[ {}_nP_r = \frac{n!}{(n-r)!} \]
3. The number of permutations of n objects in which \(k_1\) are alike, \(k_2\) are alike, …, \(k_m\) are alike is
\[ \frac{n!}{k_1!\,k_2!\cdots k_m!} \]
Examples:
1. Suppose we have the letters A, B, C, D.
a) How many permutations are there taking all four?
b) How many permutations are there taking two letters at a time?
2. How many different permutations can be made from the letters in the word "CORRECTION"?
Solutions:
1. a) Here n = 4; there are four distinct objects.
There are 4! = 24 permutations.
b) Here n = 4, r = 2.
There are \({}_4P_2 = \frac{4!}{(4-2)!} = \frac{24}{2} = 12\) permutations.
2. Here n = 10, of which 2 are C, 2 are O, 2 are R, and there is 1 each of E, T, I and N.
So \(k_1 = 2, k_2 = 2, k_3 = 2, k_4 = k_5 = k_6 = k_7 = 1\).
Using the 3rd rule of permutation, there are
\[ \frac{10!}{2!\,2!\,2!\,1!\,1!\,1!\,1!} = 453600 \text{ permutations.} \]
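The counts obtained from the multiplication rule and the permutation rules can be cross-checked by brute-force enumeration. The sketch below uses only the Python standard library (`math.perm` and `math.prod` need Python 3.8 or later); the helper `arrangements_with_repeats` is a name introduced here for illustration.

```python
from collections import Counter
from itertools import permutations, product
from math import factorial, perm, prod

digits = [0, 1, 2, 3, 4]

# Multiplication rule: 4-digit cards with and without repetition.
cards_with_repetition = len(list(product(digits, repeat=4)))   # 5*5*5*5
cards_without_repetition = len(list(permutations(digits, 4)))  # 5*4*3*2

# Rule 2: permutations of 4 letters taken 2 at a time.
two_letter_arrangements = perm(4, 2)

def arrangements_with_repeats(word):
    """Rule 3: n! divided by the factorial of each letter's multiplicity."""
    counts = Counter(word)
    return factorial(len(word)) // prod(factorial(c) for c in counts.values())

correction = arrangements_with_repeats("CORRECTION")

print(cards_with_repetition, cards_without_repetition,
      two_letter_arrangements, correction)  # 625 120 12 453600
```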
Combination: A selection of objects without regard to order is called a combination.
Example: Given the letters A, B, C, and D, list the permutations and combinations for selecting two letters.
Solution:
Permutations          Combinations
AB  BA  CA  DA        AB  BC
AC  BC  CB  DB        AC  BD
AD  BD  CD  DC        AD  DC
Note that in a permutation AB is different from BA, but in a combination AB is the same as BA.
Combination Rule: The number of combinations of r objects selected from n objects is denoted by \({}_nC_r\) or \(\binom{n}{r}\) and is given by the formula:
\[ \binom{n}{r} = \frac{n!}{(n-r)!\,r!} \]
Examples:
1. In how many ways can a committee of 5 people be chosen out of 9 people?
Solution: n = 9, r = 5
\[ \binom{9}{5} = \frac{9!}{4!\,5!} = 126 \text{ ways} \]
2. Among 15 clocks there are two defectives. In how many ways can an inspector choose three of the clocks for inspection so that:
a) There is no restriction.
b) None of the defective clocks is included.
c) Only one of the defective clocks is included.
d) Both of the defective clocks are included.
Solutions: n = 15, of which 2 are defective and 13 are non-defective; r = 3.
a) If there is no restriction, select three clocks from the 15 clocks. This can be done in
\[ \binom{15}{3} = \frac{15!}{12!\,3!} = 455 \text{ ways.} \]
b) None of the defective clocks is included.
This is equivalent to zero defective and three non-defective, which can be done in
\[ \binom{2}{0}\binom{13}{3} = 286 \text{ ways.} \]
c) Only one of the defective clocks is included.
This is equivalent to one defective and two non-defective, which can be done in
\[ \binom{2}{1}\binom{13}{2} = 156 \text{ ways.} \]
d) Both of the defective clocks are included.
This is equivalent to two defective and one non-defective, which can be done in
\[ \binom{2}{2}\binom{13}{1} = 13 \text{ ways.} \]
4.4 Approaches in Probability definition
There are different procedures by means of which we can define or estimate the probability of an event. These procedures are the classical approach, the frequentist approach, the axiomatic approach and the subjective approach, which are discussed below.
The classical approach: This approach is used when:
- All outcomes are equally likely.
- The total number of outcomes is finite, say N.
Definition: If a random experiment with N equally likely outcomes is conducted and, out of these, \(N_A\) outcomes are favourable to the event A, then the probability that event A occurs, denoted by P(A), is defined as:
\[ P(A) = \frac{N_A}{N} = \frac{\text{No. of outcomes favourable to } A}{\text{Total number of outcomes}} = \frac{n(A)}{n(S)} \]
Examples:
1. A fair die is tossed once. What is the probability of getting
a) Number 4? b) An odd number? c) An even number? d) Number 8?
Solutions:
First identify the sample space, say S:
S = {1, 2, 3, 4, 5, 6}, so N = n(S) = 6.
a) Let A be the event of number 4.
A = {4}, so \(N_A = n(A) = 1\) and
\[ P(A) = \frac{n(A)}{n(S)} = \frac{1}{6} \]
b) Let A be the event of odd numbers.
A = {1, 3, 5}, so \(N_A = n(A) = 3\) and
\[ P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = 0.5 \]
c) Let A be the event of even numbers.
A = {2, 4, 6}, so \(N_A = n(A) = 3\) and
\[ P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = 0.5 \]
d) Let A be the event of number 8.
A = \(\emptyset\), so \(N_A = n(A) = 0\) and
\[ P(A) = \frac{n(A)}{n(S)} = \frac{0}{6} = 0 \]
2. A box of 80 candles consists of 30 defective and 50 non-defective candles. If 10 of these candles are selected at random, what is the probability that
a) All will be defective? b) 6 will be non-defective? c) All will be non-defective?
Solutions: Total number of selections = \(\binom{80}{10} = N = n(S)\)
a) Let A be the event that all will be defective.
Total ways in which A occurs = \(\binom{30}{10}\binom{50}{0} = N_A = n(A)\)
\[ P(A) = \frac{n(A)}{n(S)} = \frac{\binom{30}{10}\binom{50}{0}}{\binom{80}{10}} = 0.00001825 \]
b) Let A be the event that 6 will be non-defective (and hence 4 defective).
Total ways in which A occurs = \(\binom{30}{4}\binom{50}{6} = N_A = n(A)\)
\[ P(A) = \frac{n(A)}{n(S)} = \frac{\binom{30}{4}\binom{50}{6}}{\binom{80}{10}} \approx 0.2645 \]
c) Let A be the event that all will be non-defective.
Total ways in which A occurs = \(\binom{30}{0}\binom{50}{10} = N_A = n(A)\)
\[ P(A) = \frac{n(A)}{n(S)} = \frac{\binom{30}{0}\binom{50}{10}}{\binom{80}{10}} = 0.00624 \]
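The combination counts in the clock example and the classical probabilities in the candle example can be verified with `math.comb` (Python 3.8 or later). This is a minimal sketch; the function `prob_defective` and its default arguments are illustrative assumptions that mirror the candle problem.

```python
from math import comb

# Clock example: 15 clocks, 2 defective, choose 3.
# Number of selections containing exactly d defective clocks.
clock_counts = {d: comb(2, d) * comb(13, 3 - d) for d in range(3)}
total_clock_selections = comb(15, 3)

def prob_defective(k, n_def=30, n_good=50, draws=10):
    """Classical probability of exactly k defective candles in the draw."""
    return comb(n_def, k) * comb(n_good, draws - k) / comb(n_def + n_good, draws)

p_all_defective = prob_defective(10)      # part a)
p_six_non_defective = prob_defective(4)   # part b): 4 defective, 6 good
p_none_defective = prob_defective(0)      # part c)

print(clock_counts, total_clock_selections)
print(p_all_defective, p_six_non_defective, p_none_defective)
```

The three clock cases are mutually exclusive and cover every selection, so their counts add up to the unrestricted total.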
Shortcomings of the classical approach:
This approach is not applicable when:
- The total number of outcomes is infinite.
- Outcomes are not equally likely.
The Frequentist Approach: This is based on the relative frequencies of outcomes belonging to an event.
Definition: The probability of an event A is the proportion of outcomes favourable to A in the long run, when the experiment is repeated under the same conditions, i.e.
\[ P(A) = \lim_{N \to \infty} \frac{N_A}{N} \]
Example: Records show that 60 out of 100,000 bulbs produced are defective. What is the probability that a newly produced bulb is defective?
Solution:
Let A be the event that the newly produced bulb is defective.
\[ P(A) = \lim_{N \to \infty} \frac{N_A}{N} \approx \frac{60}{100{,}000} = 0.0006 \]
Axiomatic Approach: Let E be a random experiment and S be a sample space associated with E. With each event A we associate a real number P(A), called the probability of A, which satisfies the following properties, called axioms of probability or postulates of probability.
1. \(P(A) \ge 0\)
2. \(P(S) = 1\); S is the sure event.
3. If A and B are mutually exclusive events, the probability that one or the other occurs equals the sum of the two probabilities, i.e. \(P(A \cup B) = P(A) + P(B)\).
4. \(P(A') = 1 - P(A)\)
5. \(0 \le P(A) \le 1\)
6. \(P(\emptyset) = 0\); \(\emptyset\) is the impossible event.
Remark: Venn diagrams can be used to solve probability problems.
[Figure: Venn diagrams illustrating A ∪ B, A ∩ B and A]
In general, \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\).
4.5 Conditional probability and independency
4.5.1 Conditional probability
The conditional probability of an event A, given that event B has occurred with P(B) > 0, is denoted by P(A|B) and is defined by:
\[ P(A|B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) \neq 0 \]
Similarly, the conditional probability of an event B, given that event A has occurred with P(A) > 0, is defined as:
\[ P(B|A) = \frac{P(A \cap B)}{P(A)} \]
It follows that
\[ P(A \cap B) = P(A|B)\,P(B) = P(B|A)\,P(A) \]
Example 4.1: A box contains four black and six white balls. What is the probability of getting two black balls in drawing one after the other under the following conditions?
a) The first ball drawn is not replaced. b) The first ball drawn is replaced.
Solution: Let A = the first ball drawn is black and B = the second ball drawn is black.
Required: \(P(A \cap B)\)
a. \(P(A \cap B) = P(B|A)\,P(A) = \frac{3}{9} \times \frac{4}{10} = \frac{2}{15}\)
b. \(P(A \cap B) = P(A)\,P(B) = \frac{4}{10} \times \frac{4}{10} = \frac{4}{25}\)
4.5.2 Independent Event
Two events \(A_1\) and \(A_2\) are said to be independent (statistically or stochastically or in the probability sense) if \(P(A_1 \cap A_2) = P(A_1)\,P(A_2)\), whereas \(A_1\) and \(A_2\) are said to be dependent when \(P(A_1 \cap A_2) \neq P(A_1)\,P(A_2)\). In other words, two events \(A_1\) and \(A_2\) are independent when the occurrence of \(A_1\) is not affected by the occurrence or non-occurrence of \(A_2\), and vice versa.
Remark:
- If two events A and B are independent, then P(B|A) = P(B) for P(A) > 0, and P(A|B) = P(A) for P(B) > 0.
Example: A box contains four black and six white balls. What is the probability of getting two black balls in drawing one after the other under the following conditions?
a. The first ball drawn is not replaced.
b. The first ball drawn is replaced.
Solution: Let A = the first ball drawn is black
B = the second ball drawn is black
Required: \(P(A \cap B)\)
a. Without replacement the draws are dependent: \(P(A \cap B) = P(B|A)\,P(A) = (3/9)(4/10) = 2/15\)
b. With replacement the draws are independent: \(P(A \cap B) = P(A)\,P(B) = (4/10)(4/10) = 4/25\)
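The two answers in the ball-drawing example can be confirmed by listing every ordered pair of draws and counting the favourable ones. The sketch below uses exact fractions; the function name `prob_two_black` is an illustrative choice.

```python
from fractions import Fraction
from itertools import permutations, product

balls = ["black"] * 4 + ["white"] * 6

def prob_two_black(pairs):
    """Proportion of ordered (first, second) draws in which both are black."""
    pairs = list(pairs)
    favourable = sum(1 for first, second in pairs if first == second == "black")
    return Fraction(favourable, len(pairs))

# Without replacement: ordered pairs of two different balls (10 * 9 pairs).
without_replacement = prob_two_black(permutations(balls, 2))

# With replacement: the same ball may be drawn twice (10 * 10 pairs).
with_replacement = prob_two_black(product(balls, repeat=2))

print(without_replacement, with_replacement)  # 2/15 4/25
```

Only in the with-replacement case does the result equal P(A) × P(B) = (4/10)(4/10), which is the defining property of independent events.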
CHAPTER FIVE
PROBABILITY DISTRIBUTIONS

Introduction
In this chapter, we study the meaning of random variables, which may be discrete or continuous, and the distributions of these random variables, including their properties.

5.1 Definition and distribution function of random variables

Definition: A random variable is a numerical description of the outcomes of an experiment, or a numerical-valued function defined on the sample space. Random variables are usually denoted by capital letters.
Example: If X is a random variable, then it is a function from the elements of the sample space to the set of real numbers, i.e. X is a function \(X: S \to \mathbb{R}\).
A random variable takes a possible outcome and assigns a number to it.
Example: Flip a coin three times and let X be the number of heads in three tosses.
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
X(HHH) = 3,
X(HHT) = X(HTH) = X(THH) = 2,
X(HTT) = X(THT) = X(TTH) = 1,
X(TTT) = 0
So X takes the values 0, 1, 2, 3, and it assumes each of these values with some probability.
Random variables are of two types:
1. Discrete random variables: variables which can assume only a specific number of values. They have values that can be counted.
Examples: The number of heads when a coin is tossed n times, the number of children in a family, the number of car accidents per week, the number of defective items in a given company, the number of bacteria per two cubic centimeters of water, etc.
2. Continuous random variables: variables that can assume all values between any two given values.
Examples: The height of students at a certain college, the mark of a student, the lifetime of light bulbs, the length of time required to complete a given training, etc.

Definition: A probability distribution consists of the values a random variable can assume and the corresponding probabilities of those values.

Example: Consider the experiment of tossing a coin three times. Let X be the number of heads. Construct the probability distribution of X.
Solution:
- First identify the possible values that X can assume.
- Calculate the probability of each possible distinct value of X and express the result in the form of a frequency distribution.
X = x       0     1     2     3
P(X = x)    1/8   3/8   3/8   1/8
A probability distribution is denoted by P for a discrete random variable and by f for a continuous random variable.
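The probability distribution of the number of heads can be built by enumerating the eight equally likely outcomes. The following is a minimal sketch using exact fractions; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 equally likely outcomes of three tosses, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# X maps each outcome to its number of heads; count how often each value occurs.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Classical probability: favourable outcomes / total outcomes.
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
```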
Properties of Probability Distribution:
1. \(P(x) \ge 0\), if X is discrete.
   \(f(x) \ge 0\), if X is continuous.
2. \(\sum_{x} P(X = x) = 1\), if X is discrete.
   \(\int_{x} f(x)\,dx = 1\), if X is continuous.
Note:
1. If X is a continuous random variable then
\[ P(a < X < b) = \int_a^b f(x)\,dx \]
2. The probability of a fixed value of a continuous random variable is zero. Hence
\[ P(a < X < b) = P(a \le X < b) = P(a < X \le b) = P(a \le X \le b) \]
3. If X is a discrete (integer-valued) random variable then
\[ P(a < X < b) = \sum_{x=a+1}^{b-1} P(x) \]
\[ P(a \le X < b) = \sum_{x=a}^{b-1} P(x) \]
\[ P(a < X \le b) = \sum_{x=a+1}^{b} P(x) \]
\[ P(a \le X \le b) = \sum_{x=a}^{b} P(x) \]
4. Probability means area for a continuous random variable.
5.2 Introduction to expectation – mean and variance of a random variable


Definition:
1. Let a discrete random variable X assume the values \(X_1, X_2, \dots, X_n\) with the probabilities \(P(X_1), P(X_2), \dots, P(X_n)\) respectively. Then the expected value of X, denoted by E(X), is defined as:
\[ E(X) = X_1 P(X_1) + X_2 P(X_2) + \dots + X_n P(X_n) = \sum_{i=1}^{n} X_i P(X_i) \]
2. Let X be a continuous random variable assuming values in the interval (a, b) such that \(\int_a^b f(x)\,dx = 1\). Then
\[ E(X) = \int_a^b x\,f(x)\,dx \]
Examples:
1. What is the expected value of a random variable X obtained by tossing a coin three times, where X is the number of heads?
Solution: First construct the probability distribution of X.
X = x       0     1     2     3
P(X = x)    1/8   3/8   3/8   1/8
\[ E(X) = X_1 P(X_1) + X_2 P(X_2) + \dots + X_n P(X_n) = 0 \times \tfrac{1}{8} + 1 \times \tfrac{3}{8} + 2 \times \tfrac{3}{8} + 3 \times \tfrac{1}{8} = 1.5 \]
2. Suppose a charity organization is mailing printed return-address stickers to over one million homes in Ethiopia. Each recipient is asked to donate either $1, $2, $5, $10, $15, or $20. Based on past experience, the amount a person donates is believed to follow the following probability distribution:
X = x       $1    $2    $5    $10   $15   $20
P(X = x)    0.1   0.2   0.3   0.2   0.15  0.05
How much is an average donor expected to contribute?
Solution:
X = x        $1    $2    $5    $10   $15   $20   Total
P(X = x)     0.1   0.2   0.3   0.2   0.15  0.05  1
xP(X = x)    0.1   0.4   1.5   2     2.25  1     7.25
\[ E(X) = \sum_{i=1}^{6} x_i P(X = x_i) = \$7.25 \]
Mean and Variance of a random variable
Let X be a given random variable.
1. The expected value of X is its mean:
Mean of X = E(X)
2. The variance of X is given by:
\[ \text{Variance of } X = Var(X) = E(X^2) - [E(X)]^2 \]
where
\[ E(X^2) = \sum_{i=1}^{n} x_i^2\,P(X = x_i), \ \text{if X is discrete}; \qquad E(X^2) = \int_{x} x^2 f(x)\,dx, \ \text{if X is continuous.} \]
Examples:
1. Find the mean and the variance of the random variable X in example 2 above.
Solution:
X = x          $1    $2    $5    $10   $15    $20   Total
P(X = x)       0.1   0.2   0.3   0.2   0.15   0.05  1
xP(X = x)      0.1   0.4   1.5   2     2.25   1     7.25
x^2 P(X = x)   0.1   0.8   7.5   20    33.75  20    82.15
\[ E(X) = 7.25 \]
\[ Var(X) = E(X^2) - [E(X)]^2 = 82.15 - 7.25^2 = 29.59 \]
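The mean and variance of the donation amounts can be computed directly from the definition. The sketch below is illustrative; the helper `expectation` and its `power` argument are names introduced here, not notation from the text.

```python
# Donation amounts (in dollars) and their probabilities from the table.
values = [1, 2, 5, 10, 15, 20]
probs = [0.1, 0.2, 0.3, 0.2, 0.15, 0.05]

def expectation(values, probs, power=1):
    """E(X**power) for a discrete random variable."""
    return sum(x ** power * p for x, p in zip(values, probs))

mean = expectation(values, probs)               # E(X)
second_moment = expectation(values, probs, 2)   # E(X^2)
variance = second_moment - mean ** 2            # Var(X) = E(X^2) - [E(X)]^2

print(round(mean, 2), round(second_moment, 2), round(variance, 2))
```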

There are some general rules for mathematical expectation. Let X and Y be random variables and k be a constant.
RULE 1: E(k) = k
RULE 2: Var(k) = 0
RULE 3: E(kX) = kE(X)
RULE 4: Var(kX) = k²Var(X)
RULE 5: E(X + Y) = E(X) + E(Y)
5.3 Common discrete distributions and their properties
In this section, we study the common discrete probability distributions, such as the Binomial and Poisson distributions, including their properties.
5.3.1 Binomial Distribution
A binomial experiment is a probability experiment that satisfies the following four requirements, called the assumptions of a binomial distribution.
1. The experiment consists of n identical trials.
2. Each trial has only one of two possible mutually exclusive outcomes, a success or a failure.
3. The probability of each outcome does not change from trial to trial.
4. The trials are independent; thus we must sample with replacement.
Examples of binomial experiments:
- Tossing a coin 20 times to see how many tails occur.
- Asking 200 people if they watch BBC news.
- Registering a newly produced product as defective or non-defective.
- Asking 100 people if they favour the ruling party.
Definition: The outcomes of the binomial experiment and the corresponding probabilities of these outcomes are called the Binomial Distribution.
Let p = the probability of success and
q = 1 - p = the probability of failure on any given trial.
Then the probability of getting x successes in n trials becomes:
\[ P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \dots, n \]
This is sometimes written as \(X \sim Bin(n, p)\).
When using the binomial formula to solve problems, we have to identify three things:
- The number of trials (n)
- The probability of a success on any one trial (p), and
- The number of successes desired (X).
Examples:
1. What is the probability of getting three heads by tossing a fair coin four times?
Solution: Let X be the number of heads in tossing a fair coin four times.
\(X \sim Bin(n = 4, p = 0.50)\)
\[ P(X = x) = \binom{n}{x} p^x q^{n-x} = \binom{4}{x} 0.5^x\,0.5^{4-x} = \binom{4}{x} 0.5^4, \quad x = 0, 1, 2, 3, 4 \]
\[ P(X = 3) = \binom{4}{3} 0.5^4 = 0.25 \]
2. Suppose that an examination consists of six true and false questions, and assume that a student has no knowledge of the subject matter. The probability that the student will guess the correct answer to the first question is 30%. Likewise, the probability of guessing each of the remaining questions correctly is also 30%.
a) What is the probability of getting more than three correct answers?
b) What is the probability of getting at least two correct answers?
c) What is the probability of getting at most three correct answers?
d) What is the probability of getting less than five correct answers?
Solution: Let X = the number of correct answers that the student gets.
\(X \sim Bin(n = 6, p = 0.30)\)
\[ P(X = x) = \binom{n}{x} p^x q^{n-x} = \binom{6}{x} 0.3^x\,0.7^{6-x}, \quad x = 0, 1, 2, \dots, 6 \]
a) P(X > 3) = ?
P(X > 3) = P(X = 4) + P(X = 5) + P(X = 6)
= 0.060 + 0.010 + 0.001
= 0.071
Thus, if each question is guessed correctly with probability 0.30, the probability is about 0.071 (or 7.1%) that more than three of the questions are answered correctly by the student.
b) P(X ≥ 2) = ?
P(X ≥ 2) = P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6)
= 0.324 + 0.185 + 0.060 + 0.010 + 0.001
= 0.58
c) P(X ≤ 3) = ?
P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
= 0.118 + 0.303 + 0.324 + 0.185
= 0.93
d) P(X < 5) = ?
P(X < 5) = 1 - P(X ≥ 5)
= 1 - {P(X = 5) + P(X = 6)}
= 1 - (0.010 + 0.001)
= 0.989
Exercises:
1. Suppose that 4% of all TVs made by A&B Company in 2000 are defective. If eight of these TVs are randomly selected from across the country and tested, what is the probability that exactly three of them are defective? Assume that each TV is made independently of the others.
2. An allergist claims that 45% of the patients she tests are allergic to some type of weed. What is the probability that
a) Exactly 3 of her next 4 patients are allergic to weeds?
b) None of her next 4 patients are allergic to weeds?
Remark: If X is a binomial random variable with parameters n and p, then E(X) = np and Var(X) = npq.
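The binomial probabilities used in the two worked examples, and the remark about the mean and variance, can be checked with a short script. This is a sketch using only `math.comb`; the function name `binom_pmf` is an illustrative choice. Because the worked solution adds rounded terms, its totals can differ from the unrounded values in the third decimal place.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Example 1: three heads in four tosses of a fair coin.
p_three_heads = binom_pmf(3, 4, 0.5)

# Example 2: six questions, each guessed correctly with probability 0.30.
n, p = 6, 0.30
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

more_than_three = sum(pmf[4:])   # a) P(X > 3)
at_least_two = sum(pmf[2:])      # b) P(X >= 2)
at_most_three = sum(pmf[:4])     # c) P(X <= 3)
less_than_five = sum(pmf[:5])    # d) P(X < 5)

# Remark: E(X) = np and Var(X) = npq.
mean = sum(x * q for x, q in enumerate(pmf))
variance = sum(x ** 2 * q for x, q in enumerate(pmf)) - mean ** 2

print(p_three_heads)
print([round(q, 3) for q in pmf])
print(round(mean, 4), round(variance, 4))
```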
5.3.2 Poisson distribution
A random variable X is said to have a Poisson distribution if its probability distribution is given by:

49
x e  
P( X  x)  , x  0,1,2,......
x!
Where   the average number.
 Poisson distribution depends only on the average number of occurrences per unit time space.
 The Poisson distribution is used as a distribution of rare events, such as:
 Number of misprints, Natural disasters like earth quake, Accidents, Hereditary, Arrivals and etc.
 The process that gives rise to such events is called Poisson process.
Examples:
1. If 1.6 accidents can be expected at an intersection on any given day, what is the probability that there
will be 3 accidents on any given day?
Solution: Let X = the number of accidents, $\lambda = 1.6$.
$$X \sim \mathrm{Poisson}(1.6) \Rightarrow P(X = x) = \frac{1.6^{x} e^{-1.6}}{x!}$$
$$P(X = 3) = \frac{1.6^{3} e^{-1.6}}{3!} = 0.1378$$
2. On average, five smokers pass a certain street corner every ten minutes. What is the probability
that during a given 10 minutes the number of smokers passing will be
a) 6 or fewer b) 7 or more c) Exactly 8 (Exercise)
If X is a Poisson random variable with parameter $\lambda$, then $E(X) = \lambda$ and $\mathrm{Var}(X) = \lambda$.
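The Poisson formula and the equality of mean and variance can be illustrated with a short Python sketch. It assumes $\lambda = 1.6$ (the accident example) and uses a hypothetical helper `poisson_pmf`; the infinite sums are truncated at x = 60, where the remaining terms are negligible for this value of $\lambda$.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return lam**x * exp(-lam) / factorial(x)

lam = 1.6
p3 = poisson_pmf(3, lam)  # probability of exactly 3 accidents in a day

# Truncated sums stand in for the infinite series (an approximation).
xs = range(61)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(f"P(X = 3) = {p3:.4f}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}")
```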
Note: The Poisson probability distribution provides a close approximation to the binomial probability
distribution when n is large and p is quite small or quite large, with $\lambda = np$:
$$P(X = x) \approx \frac{(np)^{x} e^{-np}}{x!}, \quad x = 0, 1, 2, \ldots$$
where $\lambda = np$ = the average number.
Usually we use this approximation if $np \le 5$. In other words, if $n \ge 20$ and $np \le 5$ [or $n(1 - p) \le 5$],
then we may use the Poisson distribution as an approximation to the binomial distribution.
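To get a feel for how good the approximation is, one can compare the two distributions term by term. The sketch below is only an illustration: the helper names and the three (n, p) pairs are assumptions chosen for demonstration, not cases taken from the notes. The Poisson probability is computed through logarithms so that large factorials do not overflow.

```python
from math import comb, exp, lgamma, log

def binom_pmf(x, n, p):
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Poisson probability, via logs: lgamma(x + 1) = log(x!)."""
    return exp(x * log(lam) - lam - lgamma(x + 1))

def max_abs_error(n, p):
    """Largest |binomial - Poisson| over x = 0..n, using lam = n * p."""
    lam = n * p
    return max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam))
               for x in range(n + 1))

# Illustrative cases: large n with small p, a moderate case, and small n with p = 0.5.
for n, p in [(200, 0.01), (20, 0.05), (10, 0.50)]:
    print(f"n = {n:3d}, p = {p:.2f}, np = {n * p:.1f}, "
          f"max error = {max_abs_error(n, p):.4f}")
```

The printed errors shrink as n grows and p moves toward 0, which is the situation the rule of thumb describes.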
Example: Find the binomial probability P(X = 3) by using the Poisson distribution if p = 0.01 and n = 200.
Solution:
Using Poisson, $\lambda = np = 0.01 \times 200 = 2$:
$$P(X = 3) = \frac{2^{3} e^{-2}}{3!} = 0.1804$$
Using binomial, $n = 200$, $p = 0.01$:
$$P(X = 3) = \binom{200}{3} (0.01)^{3} (0.99)^{197} = 0.1814$$
5.4 Common continuous probability distributions and their properties
5.4.1 Normal Distribution
A random variable X is said to have a normal distribution if its probability density function is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$$
where $\mu = E(X)$ and $\sigma^{2} = \mathrm{Variance}(X)$.
$\mu$ and $\sigma^{2}$ are the parameters of the normal distribution.
Properties of Normal Distribution:
1. It is bell shaped, symmetrical about its mean, and mesokurtic. The maximum ordinate is at
$x = \mu$ and is given by $f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}$.
2. It is asymptotic to the horizontal axis, i.e., it extends indefinitely in either direction from the mean.
3. It is a continuous distribution.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation defines a different
normal distribution. Thus, the normal distribution is completely described by two parameters: mean
and standard deviation.
5. Total area under the curve sums to 1, i.e., the area of the distribution on each side of the mean is 0.5.
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
6. It is unimodal, i.e., values mound up only in the center of the curve.
7. Mean = Median = Mode = $\mu$
8. The probability that a random variable will have a value between any two points is equal to the area
under the curve between those points.
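Several of these properties can be checked numerically. The sketch below is a rough illustration, assuming $\mu = 80$ and $\sigma = 4.8$ as example values; `normal_pdf` and `area` are hypothetical helper names, and the area is a trapezoidal approximation over $\mu \pm 10\sigma$ rather than an exact integral.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Normal density f(x) with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def area(a, b, mu, sigma, steps=20_000):
    """Trapezoidal approximation of the area under the density between a and b."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    total += sum(normal_pdf(a + i * h, mu, sigma) for i in range(1, steps))
    return total * h

mu, sigma = 80.0, 4.8  # illustrative parameter values

peak = normal_pdf(mu, mu, sigma)                             # maximum ordinate (property 1)
total_area = area(mu - 10 * sigma, mu + 10 * sigma, mu, sigma)  # property 5
left_area = area(mu - 10 * sigma, mu, mu, sigma)             # area on one side of the mean

print(f"peak = {peak:.5f}, 1/(sigma*sqrt(2*pi)) = {1 / (sigma * sqrt(2 * pi)):.5f}")
print(f"total area = {total_area:.6f}, area left of mean = {left_area:.6f}")
```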
Note: To facilitate the use of the normal distribution, the following distribution, known as the standard normal
distribution, was derived by using the transformation
$$Z = \frac{X - \mu}{\sigma} \quad \Rightarrow \quad f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^{2}}$$
Properties of the Standard Normal Distribution:
Same as a normal distribution, but the mean is zero, the variance is one and the standard deviation is one.
• Areas under the standard normal distribution curve have been tabulated in various ways. The most
common ones are the areas between Z = 0 and a positive value of Z.
• Given a normally distributed random variable X with mean $\mu$ and standard deviation $\sigma$:
$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) \Rightarrow P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right)$$
Note: $P(a < X < b) = P(a \le X < b) = P(a < X \le b) = P(a \le X \le b)$
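In place of a printed table, the standard normal areas can be computed from the error function. The following is a minimal sketch using only Python's `math` module; the function names `phi`, `table_area` and `normal_prob` are illustrative, and the numbers in the usage lines (mean 100, standard deviation 10) are made up for demonstration.

```python
from math import erf, sqrt

def phi(z):
    """Cumulative standard normal probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def table_area(z):
    """Area between 0 and a positive z, the quantity usually tabulated."""
    return phi(z) - 0.5

def normal_prob(a, b, mu, sigma):
    """P(a < X < b) for X normal(mu, sigma), by standardizing both limits."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

print(round(table_area(1.0), 4))                 # area between Z = 0 and Z = 1
print(round(normal_prob(90, 110, 100, 10), 4))   # within one standard deviation of the mean
```

Because the distribution is continuous, the same function serves whether the limits are strict or inclusive.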
Examples:
1. Find the area under the standard normal distribution which lies
a) Between Z = 0 and Z = 0.96
Solution: Area = $P(0 < Z < 0.96) = 0.3315$
b) Between Z = -1.45 and Z = 0
Solution:
$$\begin{aligned}
\text{Area} &= P(-1.45 < Z < 0) \\
&= P(0 < Z < 1.45) \\
&= 0.4265
\end{aligned}$$
c) To the right of Z = -0.35
Solution:
$$\begin{aligned}
\text{Area} &= P(Z > -0.35) \\
&= P(-0.35 < Z < 0) + P(Z > 0) \\
&= P(0 < Z < 0.35) + P(Z > 0) \\
&= 0.1368 + 0.50 = 0.6368
\end{aligned}$$
d) To the left of Z = -0.35
Solution:
$$\begin{aligned}
\text{Area} &= P(Z < -0.35) \\
&= 1 - P(Z > -0.35) \\
&= 1 - 0.6368 = 0.3632
\end{aligned}$$
e) Between Z = -0.67 and Z = 0.75
Solution:
$$\begin{aligned}
\text{Area} &= P(-0.67 < Z < 0.75) \\
&= P(-0.67 < Z < 0) + P(0 < Z < 0.75) \\
&= P(0 < Z < 0.67) + P(0 < Z < 0.75) \\
&= 0.2486 + 0.2734 = 0.5220
\end{aligned}$$
f) Between Z = 0.25 and Z = 1.25
Solution:
$$\begin{aligned}
\text{Area} &= P(0.25 < Z < 1.25) \\
&= P(0 < Z < 1.25) - P(0 < Z < 0.25) \\
&= 0.3944 - 0.0987 = 0.2957
\end{aligned}$$
2. Find the value of Z if
a) The normal curve area between 0 and z (positive) is 0.4726
Solution:
$P(0 < Z < z) = 0.4726$, and from the table $P(0 < Z < 1.92) = 0.4726$
$\therefore z = 1.92$, by uniqueness of the area.
b) The area to the left of z is 0.9868
Solution:
$$\begin{aligned}
P(Z < z) = 0.9868 &= P(Z < 0) + P(0 < Z < z) \\
&= 0.50 + P(0 < Z < z) \\
\Rightarrow P(0 < Z < z) &= 0.9868 - 0.50 = 0.4868
\end{aligned}$$
and from the table $P(0 < Z < 2.22) = 0.4868$
$\therefore z = 2.22$
3. A random variable X has a normal distribution with mean 80 and standard deviation 4.8. What is the
probability that it will take a value
a) Less than 87.2 b) Greater than 76.4 c) Between 81.2 and 86.0
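Solution: X is normal with mean $\mu = 80$ and standard deviation $\sigma = 4.8$.
a)
$$\begin{aligned}
P(X < 87.2) &= P\left(\frac{X - \mu}{\sigma} < \frac{87.2 - \mu}{\sigma}\right) \\
&= P\left(Z < \frac{87.2 - 80}{4.8}\right) \\
&= P(Z < 1.5) \\
&= P(Z < 0) + P(0 < Z < 1.5) \\
&= 0.50 + 0.4332 = 0.9332
\end{aligned}$$
b)
$$\begin{aligned}
P(X > 76.4) &= P\left(\frac{X - \mu}{\sigma} > \frac{76.4 - \mu}{\sigma}\right) \\
&= P\left(Z > \frac{76.4 - 80}{4.8}\right) \\
&= P(Z > -0.75) \\
&= P(Z > 0) + P(0 < Z < 0.75) \\
&= 0.50 + 0.2734 = 0.7734
\end{aligned}$$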
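c)
$$\begin{aligned}
P(81.2 < X < 86.0) &= P\left(\frac{81.2 - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{86.0 - \mu}{\sigma}\right) \\
&= P\left(\frac{81.2 - 80}{4.8} < Z < \frac{86.0 - 80}{4.8}\right) \\
&= P(0.25 < Z < 1.25) \\
&= P(0 < Z < 1.25) - P(0 < Z < 0.25) \\
&= 0.3944 - 0.0987 = 0.2957
\end{aligned}$$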
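4. A normal distribution has mean 62.4. Find its standard deviation if 20.05% of the area under the
normal curve lies to the right of 72.9.
Solution:
$$\begin{aligned}
P(X > 72.9) = 0.2005 &\Rightarrow P\left(\frac{X - \mu}{\sigma} > \frac{72.9 - \mu}{\sigma}\right) = 0.2005 \\
&\Rightarrow P\left(Z > \frac{72.9 - 62.4}{\sigma}\right) = 0.2005 \\
&\Rightarrow P\left(Z > \frac{10.5}{\sigma}\right) = 0.2005 \\
&\Rightarrow P\left(0 < Z < \frac{10.5}{\sigma}\right) = 0.50 - 0.2005 = 0.2995
\end{aligned}$$
And from the table $P(0 < Z < 0.84) = 0.2995$
$$\Rightarrow \frac{10.5}{\sigma} = 0.84 \Rightarrow \sigma = 12.5$$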
5. A random variable has a normal distribution with $\sigma = 5$. Find its mean if the probability that the
random variable will assume a value less than 52.5 is 0.6915.
Solution:
$$P(Z < z) = P\left(Z < \frac{52.5 - \mu}{5}\right) = 0.6915$$
$$\Rightarrow P(0 < Z < z) = 0.6915 - 0.50 = 0.1915$$
But from the table
$$P(0 < Z < 0.5) = 0.1915 \Rightarrow z = \frac{52.5 - \mu}{5} = 0.5 \Rightarrow \mu = 50$$
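The table lookups in the worked normal-distribution examples can be cross-checked in software. The sketch below assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the variable names are illustrative. Results may differ from table-based answers in the fourth decimal place because the table values are rounded.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Example 3: X is normal with mean 80 and standard deviation 4.8
X = NormalDist(mu=80, sigma=4.8)
p_less = X.cdf(87.2)                    # P(X < 87.2)
p_greater = 1 - X.cdf(76.4)             # P(X > 76.4)
p_between = X.cdf(86.0) - X.cdf(81.2)   # P(81.2 < X < 86.0)

# Example 2 b): z such that the area to its left is 0.9868
z_left = Z.inv_cdf(0.9868)

# Example 4: mean 62.4 and 20.05% of the area to the right of 72.9
z4 = Z.inv_cdf(1 - 0.2005)
sigma = (72.9 - 62.4) / z4

# Example 5: standard deviation 5 and P(X < 52.5) = 0.6915
z5 = Z.inv_cdf(0.6915)
mu = 52.5 - 5 * z5

print(f"{p_less:.4f} {p_greater:.4f} {p_between:.4f}")
print(f"z = {z_left:.2f}, sigma = {sigma:.2f}, mu = {mu:.2f}")
```

Here `inv_cdf` plays the role of reading the table backwards, from an area to the corresponding z value.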
CHECKLIST
Put a tick mark (√) for each of the following questions if you can solve the problems and an X otherwise:
Can you
1 State the assumptions underlying the binomial distribution?
2 Write down the mathematical formula of the binomial distribution?
3 Compute probabilities of events in a binomial distribution?
4 Write down the pdf of the Normal Distribution (N.D)?
5 State and verify the properties of the normal curve?
6 Define the standard N.D with its properties?
7 Compute probabilities of a normal random variable?
Exercise
1. The average sodium content in a certain brand of low-salt microwave frozen dinners is 660 mg and
the standard deviation is 35 mg. Assume the variable is normally distributed.
a) If a single dinner is selected, find the probability that the sodium content will be more than 670 mg.
b) If a sample of 10 dinners is selected, find the probability that the mean of the sample will be larger
than 670 mg.
2. In a certain university, 5 percent of the students are overweight. A sample of 20 students is selected at
random and observed for weight. Using both the binomial and Poisson distributions, find the
probability that there are exactly 3 students in the sample who are overweight.
3. Suppose that the probability of suffering a side effect from a certain flu vaccine is 0.05. If 1000
persons are inoculated, find approximately the probability that
a) at most one person suffers b) 5 or 6 persons suffer.
4. A production engineer finds that, on average, mechanics working in a machine shop complete a
certain task in 15 minutes. The time required to complete the task is approximately normally
distributed with a standard deviation of 3 minutes. Find the probability that the task is completed
a) in less than 8 minutes b) in more than 9 minutes c) between 10 and 12 minutes