Chapter One: 1. Basic Concepts, Methods of Data Collection and Presentation

This document provides an introduction to basic statistics concepts. It discusses how statistics is used to collect, analyze, interpret and draw conclusions from quantitative data. The objectives of understanding key terms, stages of statistical investigations, types of variables, and applications and limitations of statistics are also outlined. Specifically, it defines descriptive and inferential statistics, and the nominal, ordinal, interval and ratio scales of measurement.

Uploaded by

Gebru Getachew


Chapter one

1. Basic Concepts, Methods of data collection and presentation

Introduction:-

Statistics is a very broad subject, with applications in a vast number of different fields. In general,
one can say that statistics is the methodology for collecting, analyzing, interpreting and drawing
conclusions from information. Put another way, statistics is the methodology which
scientists and mathematicians have developed for interpreting and drawing conclusions from
collected data. Everything that deals even remotely with the collection, processing, interpretation
and presentation of data belongs to the domain of statistics, and so does the detailed planning
that precedes all these activities.
Objectives of the chapter
After studying this chapter, you should be able to:

• Understand the definition and classification of statistics.
• Know important statistical terms.
• Know the stages of statistical investigations.
• Identify types of variables.
• Understand the applications, uses and limitations of statistics.
• Know the methods of data collection and presentation.

1.1 Definition and classifications of statistics

Definition:-
We can define statistics in two ways.
1. Plural sense (layman's definition).
In this sense statistics is defined as aggregates of numerically expressed facts or figures collected
in a systematic manner for a predetermined purpose.
2. Singular sense (formal definition).
In this sense statistics is defined as the science of collecting, organizing, presenting, analyzing
and interpreting numerical data for the purpose of assisting in making more effective decisions.

Classifications:-

Depending on how data can be used, statistics can be classified into two broad classes.
1. Descriptive Statistics:
- This part of statistics deals only with describing some characteristics of the data collected
without going beyond the data. In other words, it deals only with describing the sample
data without going any further, that is, without attempting to infer (conclude) anything
about the population.
- Descriptive statistics deals with the collection of data, its presentation in various forms, such
as tables, graphs and diagrams, and finding averages and other measures which would
describe the data.
- Descriptive statistics refers only to the actual data, that is, the data at hand. It is the kind of
statistics used to describe the features of the data gathered by the researcher.
Examples:
• Classification of computers based on their generation.
• Average score of students in a given semester.
2. Inferential Statistics:
- This type of statistics is concerned with drawing statistically valid conclusions about the
characteristics of the population (large group) based on information obtained from a
sample (small group). That is, this part of statistics is concerned with generalizing the
results of a sample or small group using probabilities, performing hypothesis testing,
determining relationships between variables, and making predictions.
Examples:
• Out of 50 computer science students, 10 students are randomly selected, and they have the last
name Abebe.
• About 20% of all people living in Ethiopia have the last name Abebe.
1.2 Stages in Statistical Investigation
There are five stages or steps in any statistical investigation.
1. Collection of data: This is the first stage of a statistical investigation. The data should be collected
with a specific and well-defined purpose so that the conclusions drawn are not misleading.
There are two methods of data collection: primary and secondary.
The primary method of data collection refers to obtaining original, first-hand data, and the secondary
method of data collection involves obtaining data from other sources.
2. Organization of data: This is a methodology for classifying and describing the properties of
data in summary form. Editing, coding and classification are the three steps in the organization of
data.
3. Presentation of data: The process of re-organization, classification, compilation, and
summarization of data to present it in a meaningful form.
4. Analysis of data: The process of extracting relevant information from the summarized data,
mainly through the use of elementary mathematical operations.
5. Interpretation of data: The process of examining the statistical measures obtained from the
analysis and applying those methods by which conclusions are formed and inferences
made.
1.3 Definitions of some statistical terms
a. Statistical Population: The collection of all possible observations of a specified
characteristic of interest (possessing a certain common property) that is under study.
An example is all of the students in Adigrat University.
b. Sample: A subset of the population, selected using some sampling technique in such
a way that it represents the population.
c. Sampling: The process or method of selecting a sample from the population.
d. Sample size: The number of elements or observations to be included in the sample.
e. Census: Complete enumeration or observation of the elements of the population, that is,
the collection of data from every element in a population.
f. Data: A collection of related facts and figures from which conclusions may be
drawn.
g. Parameter: A characteristic or measure obtained from a population.
h. Statistic: A characteristic or measure obtained from a sample.
i. Variable: An item of interest that can take on many different values.
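The difference between a parameter and a statistic can be illustrated with a short sketch. The population of marks below is invented purely for illustration, and simple random sampling with Python's random.sample is only one assumed sampling technique among many.

```python
import random
from statistics import mean

# Hypothetical population: marks of all 10 students in a small class.
population = [60, 62, 65, 70, 70, 74, 76, 80, 85, 90]

# Parameter: a measure obtained from the whole population (a census).
population_mean = mean(population)

# Sample: a subset chosen by simple random sampling.
# The seed is fixed only so that repeated runs pick the same sample.
random.seed(1)
sample_size = 4
sample = random.sample(population, sample_size)

# Statistic: the same measure obtained from the sample only.
sample_mean = mean(sample)

print("Parameter (population mean):", population_mean)
print("Sample:", sample)
print("Statistic (sample mean):", sample_mean)
```

The sample mean usually differs from the population mean, which is why inferential statistics attaches probability statements to conclusions drawn from samples.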

1.4 Applications, Uses and Limitations of statistics
Applications of statistics:
• In almost all fields of human endeavor.
• Almost all human beings in their daily life are subjected to obtaining numerical facts, e.g.
about prices.
• Applicable in some processes, e.g. the invention of certain drugs, the extent of environmental
pollution.
• In industries, especially in the quality control area.

Uses of statistics:
The main function of statistics is to enlarge our knowledge of complex phenomena. The
following are some uses of statistics:
1. It presents facts in a definite and precise form.
2. Data reduction.
3. Measuring the magnitude of variations in data.
4. Furnishing a technique of comparison.
5. Estimating unknown population characteristics.
6. Testing and formulating hypotheses.
7. Studying the relationship between two or more variables.
8. Forecasting future events.
Limitations of statistics
- As a science, statistics has its own limitations. The following are some of the limitations:
• It deals with only quantitative information.
• It deals with only aggregates of facts and not with individual data items.
• Statistical data are only approximately, not mathematically, correct.
• Statistics can be easily misused and therefore should be used by experts.
1.5 Types of Variables
1. Qualitative Variables are nonnumeric variables and can't be measured. Examples include
gender, religious affiliation, and state of birth.

2. Quantitative Variables are numerical variables and can be measured. Examples include
balance in checking account, number of children in family. Note that quantitative variables are
either discrete (which can assume only certain values, and there are usually "gaps" between the
values, such as the number of bedrooms in your house) or continuous (which can assume any
value within a specific range, such as the air pressure in a tire.)

SCALE TYPES:-
Normally, when one hears the term measurement, one may think in terms of measuring the length
of something (e.g. the length of a piece of wood) or measuring a quantity of something (e.g. a cup
of flour). This represents a limited use of the term measurement. In statistics, the term
measurement is used more broadly and is more appropriately termed scales of measurement.
Scales of measurement refer to the ways in which variables or numbers are defined and categorized.
Each scale of measurement has certain properties which in turn determine the appropriateness of
certain statistical analyses. The four scales of measurement are nominal, ordinal, interval,
and ratio.
The various measurement scales result from the fact that measurement may be carried out under
different sets of rules.
Nominal Scale: Consists of ‘naming’ observations or classifying them into various mutually
exclusive categories. Sometimes the variable under study is classified by some quality it possesses
rather than by an amount or quantity. In such cases, the variable is called an attribute.
Example
• Religion: Christianity, Islam, Hinduism, etc.
• Sex: Male, Female
• Eye color: brown, black, etc.
• Blood type: A, B, AB and O.
Ordinal Scale: Used whenever observations are not only different from category to category, but can
also be ranked according to some criterion. The variables deal with relative differences rather than
with quantitative differences.
Ordinal data are data which can have meaningful inequalities. The inequality signs < or > may
assume any meaning like ‘stronger, softer, weaker, better than’, etc.
Example
• Patients may be characterized as unimproved, improved and much improved.
• Letter grading system, authority, career, etc.
• Individuals may be classified according to socio-economic status as low, medium and high.
Interval Scale: With this scale it is not only possible to order measurements; the
distance between any two measurements is also known. Quotients, however, are not meaningful. There is no true
zero point, only an arbitrary zero point. Interval data are the types of information in which an increase
from one level to the next always reflects the same increase. It is possible to add or subtract interval
data, but they may not be multiplied or divided.
Example: A temperature of zero degrees does not indicate a lack of heat. Consider the two common
temperature scales, Celsius (C) and Fahrenheit (F). The same difference exists
between 10°C (50°F) and 20°C (68°F) as between 25°C (77°F) and 35°C (95°F), i.e. the
measurement scale is composed of equal-sized intervals. But we cannot say that a temperature of
20°C is twice as hot as a temperature of 10°C, because the zero point is arbitrary.
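The temperature example can be checked numerically. The sketch below is only an illustration; the helper name celsius_to_fahrenheit is an assumption introduced here, built on the standard conversion F = C × 9/5 + 32. It shows that equal Celsius intervals stay equal in Fahrenheit, while the ratio of two readings changes with the scale and so carries no meaning.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius reading to Fahrenheit (F = C * 9/5 + 32)."""
    return c * 9 / 5 + 32

# Equal intervals: both pairs differ by 10 degrees Celsius.
diff_f_first = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
diff_f_second = celsius_to_fahrenheit(35) - celsius_to_fahrenheit(25)
print("10C -> 20C in Fahrenheit degrees:", diff_f_first)
print("25C -> 35C in Fahrenheit degrees:", diff_f_second)

# Ratios are not preserved: "twice as hot" depends on the arbitrary zero.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)
print("Ratio of the readings in Celsius:", ratio_c)
print("Ratio of the same readings in Fahrenheit:", ratio_f)
```

Because the two ratios disagree, statements such as "twice as hot" are not valid on an interval scale, whereas statements about differences are.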
Ratio Scale: Characterized by the fact that equality of ratios as well as equality of intervals may
be determined. Fundamental to ratio scales is a true zero point. Typical examples of ratio scales
are measures of time or space. For example, as the Kelvin temperature scale is a ratio scale, not
only can we say that a temperature of 200 degrees is higher than one of 100 degrees; we can
correctly state that it is twice as high. Interval scales do not have the ratio property. Most
statistical data analysis procedures do not distinguish between the interval and ratio properties of
the measurement scales.
Example: Variables such as age, height, length, volume, rate, time, amount of rainfall, etc.
require a ratio scale.

1.6 Methods of data collection & presentation

A) Introduction to methods of data collection

We have already explained what is meant by statistical data. Numerical facts or measurements
obtained in the course of an enquiry into a phenomenon, marked by uncertainty, constitute statistical
data. The statistical data may be already available or may have to be collected by an investigator
or an agency. Data are termed primary when they are collected for the first time by the
investigator, and secondary when they are taken from records or data already
available.
Method of primary data collection

In primary data collection, you collect the data yourself using methods such as interviews,
observations, laboratory experiments and questionnaires. The key point here is that the data you
collect is unique to you and your research and, until you publish, no one else has access to it.
There are many methods of collecting primary data and the main methods include:

Questionnaire: It is a popular means of collecting data, but it is difficult to design and often requires
many rewrites before an acceptable questionnaire is produced.

Advantages:

• Can be used as a method in its own right or as a basis for interviewing or a telephone
survey.
• Can be posted, e-mailed or faxed.
• Can cover a large number of people or organizations.
• Wide geographic coverage.
• Relatively cheap.
• No prior arrangements are needed.
• Avoids embarrassment on the part of the respondent.
• Respondent can consider responses.
• Possible anonymity of respondent.
• No interviewer bias.

Disadvantages:
• Historically low response rate (although inducements may help).
• Time delay whilst waiting for responses to be returned.
• Requires a return deadline.
• Several reminders may be required.
• Assumes no literacy problems.
• No control over who completes it.
• Not possible to give assistance if required.
• Replies are not spontaneous or independent of each other.
• Respondent can read all questions beforehand and then decide whether to complete it or not,
for example because it is too long, too complex, uninteresting, or too personal.

Interviewing is a technique that is primarily used to gain an understanding of the underlying
reasons and motivations for people’s attitudes, preferences or behavior. Interviews can be
undertaken on a personal one-to-one basis or in a group. They can be conducted at work, at home,
in the street or in a shopping center, or some other agreed location.
Advantages:
• Serious approach by respondent resulting in accurate information.
• Good response rate.
• Complete and immediate.
• Possible in-depth questions.
• Interviewer in control and can give help if there is a problem.
• Can investigate motives and feelings.
• Can use recording equipment.
• Characteristics of respondent assessed: tone of voice, facial expression, hesitation, etc.
• If one interviewer is used, uniformity of approach.
• Used to pilot other methods.
Disadvantages:
• Need to set up interviews.
• Time consuming.
• Geographic limitations.
• Can be expensive.
• Normally need a set of questions.
• Respondent bias: tendency to please or impress, create a false personal image, or end the
interview quickly.
• Embarrassment possible if personal questions are asked.
• Transcription and analysis can present problems, such as subjectivity.
• If many interviewers are used, training is required.

Observation: It involves recording the behavioral patterns of people, objects and events in a
systematic manner.
Diaries: A diary is a way of gathering information about the way individuals spend their time on
professional activities. They are not about records of engagements or personal journals of
thought! Diaries can record either quantitative or qualitative data, and in management research
can provide information about work patterns and activities.
Laboratory experiment: Conducting laboratory experiments in fields such as the chemical and biological
sciences.

Methods of secondary data collection

Secondary data analysis can be literally defined as second-hand analysis and is the analysis of
data or information that was either gathered by someone else (e.g., researchers, institutions, other
NGOs, etc.) or for some other purpose than the one currently being considered, or often a
combination of the two.
Some of the sources of secondary data are government documents, official statistics, technical
reports, scholarly journals, trade journals, review articles, reference books, research institutes,
universities, hospitals, libraries, library search engines, computerized databases and the World Wide
Web (WWW).

Advantages of secondary data

• Secondary data may help to clarify or redefine the definition of the problem as part of the
exploratory research process.
• Time saving.
• Does not involve collecting data.
• Provides a larger database as compared to primary data.

Disadvantages of secondary data

• Lack of availability.
• Lack of relevance.
• Inaccurate data.
• Insufficient data.

B) METHODS OF DATA PRESENTATION
Having collected and edited the data, the next important step is to organize it, that is, to present it
in a readily comprehensible, condensed form that helps in drawing inferences from it. It is
also necessary that like items be separated from unlike ones.
The presentation of data is broadly classified into the following three categories:
• Tabular presentation
• Diagrammatic presentation and
• Graphic presentation.
The process of arranging data into classes or categories according to similarities is technically
called classification.
Classification is a preliminary step, and it prepares the ground for proper presentation of data.

Definitions:
• Raw data: recorded information in its original collected form, whether counts
or measurements, is referred to as raw data.
• Frequency: the number of values in a specific class of the distribution.
• Frequency distribution: the organization of raw data in table form using classes and
frequencies.
- There are three basic types of frequency distributions:
• Categorical frequency distribution
• Ungrouped frequency distribution
• Grouped frequency distribution
There are specific procedures for constructing each type.
1) Categorical frequency Distribution:
Used for data that can be placed in specific categories, such as nominal or ordinal data, e.g. marital
status.

Example: A social worker collected the following data on marital status for 25
persons (M = married, S = single, W = widowed, D = divorced).

M S D W D
S S M M M
W D S M M
W D D S S
S W W D D

Solution:
Since the data are categorical, discrete classes can be used. There are four types of marital status:
M, S, D, and W. These types will be used as the classes for the distribution. We follow the procedure below to
construct the frequency distribution.
Step 1: Make a table as shown.
Class   Tally      Frequency   Percent
(1)     (2)        (3)         (4)
M
S
D
W

Step 2: Tally the data and place the result in column (2).
Step 3: Count the tally and place the result in column (3).
Step 4: Find the percentage of values in each class by using
% = (f/n) × 100, where f = frequency of the class and n = total number of values.
Percentages are not normally a part of a frequency distribution, but they can be added since
they are used in certain types of diagrams, such as pie charts.
Step 5: Find the totals for columns (3) and (4).
Combining all the steps, one can construct the following frequency distribution.

Class   Tally      Frequency   Percent
(1)     (2)        (3)         (4)
M       //// /     6           24
S       //// //    7           28
D       //// //    7           28
W       ////       5           20
Total              25          100
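The tallying and percentage steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method; it uses collections.Counter from the standard library to do the tallying, and the variable names are chosen here for illustration.

```python
from collections import Counter

# The 25 marital-status observations from the example, row by row.
data = ("M S D W D "
        "S S M M M "
        "W D S M M "
        "W D D S S "
        "S W W D D").split()

n = len(data)            # total number of values
counts = Counter(data)   # Steps 2 and 3: tally and count each class

# Step 4: percentage of each class, % = f / n * 100.
table = {}
for cls in ["M", "S", "D", "W"]:
    f = counts[cls]
    table[cls] = (f, f * 100 / n)

# Step 5: totals for the frequency and percent columns.
total_f = sum(f for f, _ in table.values())
total_pct = sum(p for _, p in table.values())

print(f"{'Class':<8}{'Frequency':>10}{'Percent':>10}")
for cls, (f, pct) in table.items():
    print(f"{cls:<8}{f:>10}{pct:>10.0f}")
print(f"{'Total':<8}{total_f:>10}{total_pct:>10.0f}")
```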

2) Ungrouped frequency Distribution:

- It is a table of all the potential raw score values that could possibly occur in the data, along with
the number of times each actually occurred.
- It is often constructed for small data sets or for data on a discrete variable.
Constructing an ungrouped frequency distribution:
• First find the smallest and largest raw scores in the collected data.
• Arrange the data in order of magnitude and count the frequency.
• To facilitate counting, one may include a column of tallies.
Example:
The following data represent the mark of 20 students.
80 76 90 85 80
70 60 62 70 85
65 60 63 74 75
76 70 70 80 85

Construct an ungrouped frequency distribution.

Solution:
Step 1: Find the range, Range=Max-Min=90-60=30.
Step 2: Make a table as shown
Step 3: Tally the data.
Step 4: Compute the frequency.

Mark    Tally    Frequency
60      //       2
62      /        1
63      /        1
65      /        1
70      ////     4
74      /        1
75      /        1
76      //       2
80      ///      3
85      ///      3
90      /        1

Each individual value is presented separately, which is why it is named an ungrouped frequency
distribution.
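As a cross-check, the frequencies can be tallied directly from the 20 raw marks. The following is a small sketch using collections.Counter; the variable names are illustrative assumptions.

```python
from collections import Counter

# Marks of the 20 students, as listed in the example.
marks = [80, 76, 90, 85, 80,
         70, 60, 62, 70, 85,
         65, 60, 63, 74, 75,
         76, 70, 70, 80, 85]

# Step 1: range = maximum - minimum.
data_range = max(marks) - min(marks)

# Steps 3 and 4: tally each distinct mark, then list in order of magnitude.
distribution = sorted(Counter(marks).items())

print("Range:", data_range)
print(f"{'Mark':<6}{'Tally':<8}{'Frequency'}")
for mark, f in distribution:
    print(f"{mark:<6}{'/' * f:<8}{f}")
```

The frequencies must add up to the number of observations (20), which is a quick way to catch tallying slips.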
3) Grouped frequency Distribution:
- When the range of the data is large, the data must be grouped into classes that are more
than one unit in width.
Definitions:
- Grouped Frequency Distribution: a frequency distribution in which several numbers are
grouped in one class.
- Class limits: Separate one class in a grouped frequency distribution from another. The
limits could actually appear in the data, and there are gaps between the upper limit of one
class and the lower limit of the next.
- Units of measurement (U): the distance between two possible consecutive measures. It
is usually taken as 1, 0.1, 0.01, 0.001, and so on.
- Class boundaries: Separate one class in a grouped frequency distribution from another.
The boundaries have one more decimal place than the raw data and therefore do not
appear in the data. There is no gap between the upper boundary of one class and the lower
boundary of the next class. The lower class boundary is found by subtracting U/2 from
the corresponding lower class limit, and the upper class boundary is found by adding U/2
to the corresponding upper class limit.
- Class width: the difference between the upper and lower class boundaries of any class. It
is also the difference between the lower limits of any two consecutive classes or the
difference between any two consecutive class marks.
- Class mark (midpoint): the average of the lower and upper class limits or the
average of the upper and lower class boundaries.
- Cumulative frequency: the number of observations less than or equal to, or more than or equal to, a
specific value.
- Cumulative frequency above: the total frequency of all values greater than or equal
to the lower class boundary of a given class.
- Cumulative frequency below: the total frequency of all values less than or equal to
the upper class boundary of a given class.
- Cumulative Frequency Distribution (CFD): the tabular arrangement of class
intervals together with their corresponding cumulative frequencies. It can be of the more than or
less than type, depending on the type of cumulative frequency used.
- Relative frequency (rf): the frequency divided by the total frequency.
- Relative cumulative frequency (rcf): the cumulative frequency divided by the total
frequency.
Guidelines for classes:-
1. There should be between 5 and 20 classes.
2. The classes must be mutually exclusive. This means that no data value can fall into two
different classes
3. The classes must be all inclusive or exhaustive. This means that all data values must be
included.
4. The classes must be continuous. There are no gaps in a frequency distribution.
5. The classes must be equal in width. The exception here is the first or last class. It is
possible to have a "below ..." or "... and above" class. This is often used with ages.

Steps for constructing Grouped frequency Distribution:-

1. Find the largest and smallest values.
2. Compute the range: R = Maximum – Minimum.
3. Select the number of classes desired, usually between 5 and 20, or use Sturges' rule
k = 1 + 3.32 log(n) (logarithm to base 10), where k is the number of classes desired and n is the total number of
observations.
4. Find the class width by dividing the range by the number of classes and rounding up, not
off: w = R/k.
5. Pick a suitable starting point less than or equal to the minimum value. The starting point
is called the lower limit of the first class. Continue to add the class width to this lower
limit to get the rest of the lower limits.
6. To find the upper limit of the first class, subtract U from the lower limit of the second
class. Then continue to add the class width to this upper limit to find the rest of the upper
limits.
7. Find the boundaries by subtracting U/2 units from the lower limits and adding U/2 units
to the upper limits. The boundaries are also halfway between the upper limit of one
class and the lower limit of the next class. It may not be necessary to find the boundaries.
8. Tally the data.
9. Find the frequencies.
10. Find the cumulative frequencies. Depending on what you are trying to accomplish, it may
not be necessary to find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative frequencies.
Example:(***)
Construct a frequency distribution for the following data.
11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27
Solutions:
Step 1: Find the highest and the lowest value H=39, L=6
Step 2: Find the range; R=H-L=39-6=33
Step 3: Select the number of classes desired using Sturges' formula:

k = 1 + 3.32 log n = 1 + 3.32 log(20) = 5.32 ≈ 6 (rounding up)

Step 4: Find the class width: w = R/k = 33/6 = 5.5 ≈ 6 (rounding up)
Step 5: Select the starting point; let it be the minimum observation.
• 6, 12, 18, 24, 30, 36 are the lower class limits.
Step 6: Find the upper class limits; e.g. the first upper class limit = 12 – U = 12 – 1 = 11.
• 11, 17, 23, 29, 35, 41 are the upper class limits.
So, combining step 5 and step 6, one can construct the following classes.
Class limits
6 – 11
12 – 17
18 – 23
24 – 29
30 – 35
36 – 41
Step 7: Find the class boundaries.
E.g. for class 1: Lower class boundary = 6 – U/2 = 5.5
Upper class boundary = 11 + U/2 = 11.5
• Then continue adding w to both boundaries to obtain the rest of the boundaries. By doing so
one can obtain the following classes.
Class boundary
5.5 – 11.5
11.5 – 17.5
17.5 – 23.5
23.5 – 29.5
29.5 – 35.5
35.5 – 41.5
Step 8: tally the data.
Step 9: Write the numeric values for the tallies in the frequency column.
Step 10: Find cumulative frequency.
Step 11: Find relative frequency or/and relative cumulative frequency.

The complete frequency distribution follows:

Class     Class          Class   Tally      Freq.   Cf (less     Cf (more     rf     rcf (less
limit     boundary       mark                       than type)   than type)          than type)
6 – 11    5.5 – 11.5     8.5     //         2       2            20           0.10   0.10
12 – 17   11.5 – 17.5    14.5    //         2       4            18           0.10   0.20
18 – 23   17.5 – 23.5    20.5    //// //    7       11           16           0.35   0.55
24 – 29   23.5 – 29.5    26.5    ////       4       15           9            0.20   0.75
30 – 35   29.5 – 35.5    32.5    ///        3       18           5            0.15   0.90
36 – 41   35.5 – 41.5    38.5    //         2       20           2            0.10   1.00
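The whole table can be reproduced programmatically. The sketch below is one possible way to carry out steps 5 to 11, assuming the values found earlier (k = 6 classes, width w = 6, unit of measurement U = 1, and the minimum observation as the starting point); the dictionary keys are illustrative names.

```python
data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]

U = 1             # unit of measurement (data are whole numbers)
k = 6             # number of classes (Step 3)
w = 6             # class width (Step 4)
start = min(data) # lower limit of the first class (Step 5)
n = len(data)

rows = []
running_total = 0
for i in range(k):
    lower = start + i * w          # lower class limit
    upper = lower + w - U          # upper class limit
    freq = sum(lower <= x <= upper for x in data)
    running_total += freq
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - U / 2, upper + U / 2),
        "mark": (lower + upper) / 2,
        "freq": freq,
        "cf_less": running_total,             # values <= upper boundary
        "cf_more": n - running_total + freq,  # values >= lower boundary
        "rf": freq / n,
        "rcf_less": running_total / n,
    })

for r in rows:
    print(r["limits"], r["boundaries"], r["mark"], r["freq"],
          r["cf_less"], r["cf_more"],
          f"{r['rf']:.2f}", f"{r['rcf_less']:.2f}")
```

Two quick checks are worth making on any such table: the frequencies should sum to n, and the last less-than cumulative frequency should equal the first more-than cumulative frequency.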
Diagrammatic and graphic presentation of data.
These are techniques for presenting data in visual displays using geometric figures and pictures.
Importance:
• They have greater attraction.
• They facilitate comparison.
• They are easily understandable.
- Diagrams are appropriate for presenting discrete data.
- The three most commonly used diagrammatic presentations for discrete as well as qualitative
data are:
• Pie charts
• Pictogram
• Bar charts
Pie chart
- A pie chart is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution. The angle of the sector is obtained using:

Angle of sector = (value of the category / total) × 360°

Example: The following table gives the details of monthly budget of a family. Represent these
figures by a suitable diagram.

Items of expenditure   Family budget
Food                   $600
Clothing               $100
House rent             $400
Fuel and lighting      $100
Miscellaneous          $300
Total                  $1500

Solutions:
Step 1: Find the percentage.
Step 2: Find the number of degrees for each class.
Step 3: Using a protractor and compass, graph each section and write its name and corresponding
percentage.
Items of expenditure   Family budget   Angle of sector   Percentage
Food                   $600            144°              40%
Clothing               $100            24°               6.67%
House rent             $400            96°               26.67%
Fuel and lighting      $100            24°               6.67%
Miscellaneous          $300            72°               20%
Total                  $1500           360°              100%
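The angles and percentages in the table can be computed as follows. This is a minimal sketch that assumes each sector's angle is its share of the total multiplied by 360 degrees; the variable names are illustrative.

```python
# Monthly family budget from the example, in dollars.
budget = {
    "Food": 600,
    "Clothing": 100,
    "House rent": 400,
    "Fuel and lighting": 100,
    "Miscellaneous": 300,
}

total = sum(budget.values())

sectors = {}
for item, amount in budget.items():
    percent = amount * 100 / total   # Step 1: percentage of the total
    angle = amount * 360 / total     # Step 2: degrees for the sector
    sectors[item] = (angle, percent)

print(f"{'Item':<20}{'Budget':>8}{'Angle':>8}{'Percent':>10}")
for item, (angle, percent) in sectors.items():
    print(f"{item:<20}{budget[item]:>8}{angle:>8.0f}{percent:>9.2f}%")
print(f"{'Total':<20}{total:>8}{sum(a for a, _ in sectors.values()):>8.0f}")
```

The angles should always add up to 360 degrees, which serves as a check before drawing the chart.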

Pictogram
In this diagram, we represent data by means of some picture symbols. We decide about a suitable
picture to represent a definite number of units in which the variable is measured.
Example: draw a pictogram to represent the following population of a town.

Bar Charts:
- A set of bars (thick lines or narrow rectangles) representing some magnitude over time or space.
- They are useful for comparing aggregates over time or space.
- Bars can be drawn either vertically or horizontally.
- There are different types of bar charts, the most common being:
• Simple bar chart
• Deviation or two-way bar chart
• Broken bar chart
• Component or subdivided bar chart
• Multiple bar chart
Simple Bar Chart
- Simple bar charts are used to display data on one variable.
- They are thick lines (narrow rectangles) having the same breadth. The magnitude of a
quantity is represented by the height or length of the bar.
- Example: Draw a simple bar diagram to represent the profits of a bank for 5 years.
Year                 1989   1990   1991   1992   1993
Profit (million $)   10     12     18     25     42

Component Bar chart

- When there is a desire to show how a total (or aggregate) is divided into its component parts,
we use a component bar chart.
- The bars represent the total value of a variable, with each total broken into its component parts, and
different colours or designs are used for identification.
Example: The table below shows the quantity, in hundred kgs, of wheat, barley and oats produced
on a certain farm during the years 1991 to 1994. Draw a component bar chart.

Year   Wheat   Barley   Oats   Total
1991   34      18       27     79
1992   43      14       24     81
1993   43      16       27     86
1994   45      13       34     92

Solution: To make the component bar chart, first of all we have to take the year-wise
total production.
The required diagram is given below:
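Before drawing, it can help to compute where each component starts and ends within its bar. The sketch below works out the year-wise totals and the stacking positions of each crop; the stacking order (wheat at the bottom, then barley, then oats) is an assumption made for illustration.

```python
# Production in hundred kgs, from the table in the example.
production = {
    1991: {"Wheat": 34, "Barley": 18, "Oats": 27},
    1992: {"Wheat": 43, "Barley": 14, "Oats": 24},
    1993: {"Wheat": 43, "Barley": 16, "Oats": 27},
    1994: {"Wheat": 45, "Barley": 13, "Oats": 34},
}

totals = {}
segments = {}
for year, crops in production.items():
    bottom = 0
    parts = []
    for crop, quantity in crops.items():
        # Each component is drawn from `bottom` up to `bottom + quantity`.
        parts.append((crop, bottom, bottom + quantity))
        bottom += quantity
    segments[year] = parts
    totals[year] = bottom   # height of the whole bar

for year in production:
    print(year, "total =", totals[year], segments[year])
```

The height of each complete bar equals the year's total, and each segment's length equals that crop's quantity.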

Multiple Bars
When two or more interrelated series of data are depicted by a bar diagram, then
such a diagram is known as a multiple-bar diagram. Suppose we have export and
import figures for a few years.
We can display them by two bars close to each other, one representing exports and the
other representing imports. The figure shows such a diagram based on hypothetical
data.

[Figure: Multiple Bars]
It should be noted that multiple bar diagrams are particularly suitable where some comparison is
involved.

Graphical Presentation of data

- The histogram, frequency polygon and cumulative frequency graph (ogive) are the most
commonly applied graphical representations for continuous data.
Procedures for constructing statistical graphs:
• Draw and label the X and Y axes.
• Choose a suitable scale for the frequencies or cumulative frequencies and label it on the
Y axis.
• Represent the class boundaries for the histogram or ogive, or the midpoints for the
frequency polygon, on the X axis.
• Plot the points.
• Draw the bars or lines to connect the points.

Histogram

A graph which displays the data by using vertical bars of various heights to represent frequencies.
Class boundaries are placed along the horizontal axis. Class marks and class limits are
sometimes used as the quantity on the X axis.
Example: Construct a histogram to represent the data given in the previous example (***).
Frequency Polygon:
- A line graph. The frequency is placed along the vertical axis and the class midpoints are placed
along the horizontal axis. It is customary to add the next higher and lower class intervals with a
corresponding frequency of zero; this is to make it a complete polygon.
Example: Draw a frequency polygon for the above data (example ***).
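The points to be plotted for the frequency polygon of example (***) can be listed with a short sketch. It takes the class marks and frequencies from the grouped frequency distribution of that example and, following the custom described above, adds one class of zero frequency at each end; the variable names are illustrative.

```python
# Class marks and frequencies from the grouped frequency distribution (example ***).
class_marks = [8.5, 14.5, 20.5, 26.5, 32.5, 38.5]
frequencies = [2, 2, 7, 4, 3, 2]
w = 6  # class width

# Add the next lower and next higher class marks with frequency zero
# so that the polygon closes on the horizontal axis.
points = (
    [(class_marks[0] - w, 0)]
    + list(zip(class_marks, frequencies))
    + [(class_marks[-1] + w, 0)]
)

for x, y in points:
    print(f"class mark = {x:>5}, frequency = {y}")
```

Joining these points in order with straight line segments gives the frequency polygon.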
Ogive (cumulative frequency polygon)
- A graph showing the cumulative frequency (less than or more than type) plotted against the upper
or lower class boundaries respectively. That is, class boundaries are plotted along the horizontal
axis and the corresponding cumulative frequencies are plotted along the vertical axis. The points
are joined by a freehand curve.
Example: Draw an ogive curve (less than type) for the above data in example ***.
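The points of the less than type ogive for example (***) can be obtained by accumulating the class frequencies. In the sketch below, starting the curve at the lower boundary of the first class with a cumulative frequency of zero is a common convention and is assumed here; itertools.accumulate performs the running sum.

```python
from itertools import accumulate

# Class boundaries and frequencies from the grouped frequency distribution (example ***).
boundaries = [5.5, 11.5, 17.5, 23.5, 29.5, 35.5, 41.5]
frequencies = [2, 2, 7, 4, 3, 2]

# Less than type cumulative frequencies: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Each cumulative frequency is plotted against the upper boundary of its class.
points = [(boundaries[0], 0)] + list(zip(boundaries[1:], cumulative))

for x, y in points:
    print(f"upper boundary = {x:>5}, cumulative frequency = {y}")
```

The last point always reaches the total number of observations, here 20.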

1.7. Review Exercises
2. Classify the following statements as Descriptive or Inferential Statistics.
a. The average age of the students in this class is 21 years.
b. At least 5% of the killings reported last year in city X were due to tourists.
c. Of the students enrolled in Adigrat University in this year 74% are male and 26% are
female.
d. The chance of winning the Ethiopian National Lottery in any day is 1 out of 167000.
3. Classify each of the following as Qualitative and Quantitative and if it is quantitative
classify as Discrete and Continuous.
a. Color of automobiles in a dealer’s show room.
b. Number of seats in a movie theater.
c. Classification of patients based on nursing care needed (complete, partial or seafarer)
d. Number of tomatoes on each plant on a field.
e. Weight of newly born babies.
4. Marks of 50 students (out of 40):
16 21 26 24 11 17 25 26 13 27 24 26 3 27 23 24 15 22 22 12 22 29 18 22
28 25 7 17 22 28 19 23 23 22 3 19 13 31 23 28 24 9 20 33 30 23 20 8 21
24
a) Construct a grouped frequency distribution.
b) Construct a histogram for the above data.
c) Construct a frequency polygon and ogives.

CHAPTER 2
MEASURES OF CENTRAL TENDENCY

Objectives of the chapter

By the end of this chapter, students should be able to:

- Understand the use of measures of central tendency.
- Calculate the different measures of central tendency.
- Understand and calculate the measures of variation.

Introduction
- When we want to make comparison between groups of numbers it is good to have a
single value that is considered to be a good representative of each group. This single
value is called the average of the group. Averages are also called measures of central
tendency.
- An average which is representative is called typical average and an average which is not
representative and has only a theoretical value is called a descriptive average.
A typical average should possess the following properties:
- It should be rigidly defined.
- It should be based on all observations under investigation.
- It should be affected as little as possible by extreme observations.
- It should be capable of further algebraic treatment.
- It should be affected as little as possible by fluctuations of sampling.
- It should be easy to calculate and simple to understand.
Objectives of measures of central tendency:
 To comprehend the data easily.
 To facilitate comparison.
 To make further statistical analysis.

The Summation Notation:
- Let $X_1, X_2, X_3, \dots, X_N$ be a set of measurements, where N is the total number of
observations and $X_i$ is the ith observation.
- Very often in statistics an algebraic expression of the form $X_1 + X_2 + X_3 + \dots + X_N$ is used in
a formula to compute a statistic. It is tedious to write an expression like this repeatedly,
so mathematicians have developed a shorthand notation to represent a sum of scores,
called the summation notation.
- The symbol $\sum_{i=1}^{N} X_i$ is mathematical shorthand for $X_1 + X_2 + X_3 + \dots + X_N$.

The expression is read, "the sum of X sub i from i equals 1 to N." It means "add up all the
numbers."
Example: Suppose the following were scores made on the first homework assignment for five
students in the class: 5, 7, 7, 6, and 8. In this example set of five numbers, where N=5, the
summation could be written:

$\sum_{i=1}^{5} X_i = X_1 + X_2 + X_3 + X_4 + X_5 = 5 + 7 + 7 + 6 + 8 = 33$

The "i=1" at the bottom of the summation notation tells where to begin the sequence of
summation. If the expression were written with "i=3", the summation would start with the third
number in the set. For example:

$\sum_{i=3}^{N} X_i = X_3 + X_4 + \dots + X_N$

In the example set of numbers, this would give the following result:

$\sum_{i=3}^{5} X_i = X_3 + X_4 + X_5 = 7 + 6 + 8 = 21$

The "N" in the upper part of the summation notation tells where to end the sequence of
summation. If there were only three scores, then the summation and example would be:

$\sum_{i=1}^{3} X_i = X_1 + X_2 + X_3 = 5 + 7 + 7 = 19$

Sometimes the summation notation is used in an expression that must be written
a number of times, as in a proof; then a shorthand notation for the shorthand notation is
employed. When the summation sign "∑" is used without additional notation, then "i=1" and "N"
are assumed.
For example:

$\sum X = \sum_{i=1}^{N} X_i$

PROPERTIES OF SUMMATION

∑ k nk where k is any constant

2. ∑ kxi = k ∑ xi where k is any constant
3. ∑ (a + bxi) = na +b∑ xi where a and b are any constants
4. ∑ (xi + yi = ∑ xi + ∑ yi

The sum of the product of the two variables could be written:

Example: Considering the following data, determine:

X Y
5 6
7 7
7 8
6 7
8 8
a) $\sum x_i$
b) $\sum y_i$
c) $\sum 10$
d) $\sum (x_i + y_i)$
e) $\sum (x_i - y_i)$
f) $\sum x_i y_i$
g) $\sum x_i^2$
h) $(\sum x_i)(\sum y_i)$
Solutions:
a) $\sum x_i = 5 + 7 + 7 + 6 + 8 = 33$

b) $\sum y_i = 6 + 7 + 8 + 7 + 8 = 36$

c) $\sum 10 = 5 \times 10 = 50$

d) $\sum (x_i + y_i) = \sum x_i + \sum y_i = 33 + 36 = 69$

e) $\sum (x_i - y_i) = \sum x_i - \sum y_i = 33 - 36 = -3$

f) $\sum x_i y_i = 30 + 49 + 56 + 42 + 64 = 241$

g) $\sum x_i^2 = 25 + 49 + 49 + 36 + 64 = 223$

h) $(\sum x_i)(\sum y_i) = 33 \times 36 = 1188$
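The eight sums in this example can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are chosen here and are not part of the original notes. It also shows that the sum of products in (f) is not the same thing as the product of sums in (h).

```python
# Paired values from the X/Y table in the example.
x = [5, 7, 7, 6, 8]
y = [6, 7, 8, 7, 8]
n = len(x)

sum_x = sum(x)                                     # (a) sum of x_i
sum_y = sum(y)                                     # (b) sum of y_i
sum_const = sum(10 for _ in range(n))              # (c) a constant summed n times = n * 10
sum_x_plus_y = sum(a + b for a, b in zip(x, y))    # (d) equals sum_x + sum_y
sum_x_minus_y = sum(a - b for a, b in zip(x, y))   # (e) equals sum_x - sum_y
sum_xy = sum(a * b for a, b in zip(x, y))          # (f) sum of the products
sum_x_sq = sum(a ** 2 for a in x)                  # (g) sum of the squares
product_of_sums = sum_x * sum_y                    # (h) product of the two sums

print(sum_x, sum_y, sum_const, sum_x_plus_y,
      sum_x_minus_y, sum_xy, sum_x_sq, product_of_sums)
```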

2.1 Types of measures of central tendency
There are several different measures of central tendency; each has its advantage and
disadvantage.
 The Mean (Arithmetic, Geometric and Harmonic)
 The Mode
 The Median
 Quantiles (Quartiles, Deciles and Percentiles)
The choice among these averages depends on which one best fits the property under discussion.

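A. The Arithmetic Mean
- It is defined as the sum of the magnitudes of the items divided by the number of items.
- The mean of $X_1, X_2, X_3, \dots, X_n$ is denoted by A.M, m or $\bar{x}$ and is given by:

$\bar{x} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$

If $x_1$ occurs $f_1$ times,
$x_2$ occurs $f_2$ times,
.
.
$x_k$ occurs $f_k$ times,

then the mean will be $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$, where k is the number of classes (distinct values) and $\sum_{i=1}^{k} f_i = n$.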

Example: Obtain the mean of the following numbers:

2, 7, 8, 2, 7, 3, 7

Solution:

Xi      fi   Xifi
2       2    4
3       1    3
7       3    21
8       1    8
Total   7    36

$\bar{x} = \frac{\sum f_i X_i}{\sum f_i} = \frac{36}{7} \approx 5.14$

Arithmetic Mean for Grouped Data

If data are given in the shape of a continuous frequency distribution, then the mean is obtained as
follows:

$\bar{x} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$, where $X_i$ = the class mark of the ith class and $f_i$ = the frequency
of the ith class.

Example: Calculate the mean for the following age distribution.

Class    Frequency
6-10     35
11-15    23
16-20    15
21-25    12
26-30    9
31-35    6

Solutions:
1. First find the class marks.
2. Find the product of frequency and class mark for each class.
3. Find the mean using the formula.

Class    fi    Xi   Xifi
6-10     35    8    280
11-15    23    13   299
16-20    15    18   270
21-25    12    23   276
26-30    9     28   252
31-35    6     33   198
Total    100        1575

$\bar{x} = \frac{\sum_{i=1}^{6} f_i X_i}{\sum_{i=1}^{6} f_i} = \frac{1575}{100} = 15.75$
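The three solution steps for the grouped mean can be sketched in Python as follows. The function name `grouped_mean` and its argument layout are assumptions made for this illustration, not something defined in the notes.

```python
def grouped_mean(class_limits, frequencies):
    """Mean of grouped data: sum(f_i * X_i) / sum(f_i), where X_i is the class mark."""
    # Step 1: class mark = midpoint of the lower and upper class limits.
    class_marks = [(lower + upper) / 2 for lower, upper in class_limits]
    # Step 2: products of frequency and class mark, summed over all classes.
    total = sum(f * mark for f, mark in zip(frequencies, class_marks))
    # Step 3: divide by the total frequency.
    return total / sum(frequencies)


# Age distribution from the example.
classes = [(6, 10), (11, 15), (16, 20), (21, 25), (26, 30), (31, 35)]
freqs = [35, 23, 15, 12, 9, 6]
print(grouped_mean(classes, freqs))  # 15.75
```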

- If the values in a series or the mid values of the classes are large, coding of the values is a
good device to simplify the calculations.
- For raw data, suppose we use the following coding system:

$d_i = X_i - A$, so that $X_i = d_i + A$

$\bar{x} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{\sum_{i=1}^{n} (d_i + A)}{n} = A + \frac{\sum_{i=1}^{n} d_i}{n}$

$\bar{x} = A + \bar{d}$

where A is an assumed mean and $\bar{d}$ is the mean of the coded data.
- If the data are expressed in terms of an ungrouped frequency distribution:

$d_i = X_i - A$, so that $X_i = d_i + A$

$\bar{x} = \frac{\sum f_i X_i}{n} = \frac{\sum f_i (d_i + A)}{n} = A + \frac{\sum f_i d_i}{n}$

$\bar{x} = A + \bar{d}$
- In both cases the true mean is the assumed mean plus the average of the deviations from
the assumed mean.
- Suppose the data are given in the shape of a continuous frequency distribution with a
constant class size of w; then the following coding is appropriate:

$d_i = \frac{X_i - A}{w}$, so that $X_i = w d_i + A$

$\bar{x} = \frac{\sum f_i X_i}{n} = \frac{\sum f_i (w d_i + A)}{n} = A + w \frac{\sum f_i d_i}{n}$

$\bar{x} = A + w\bar{d}$

Where: $X_i$ is the original class mark for the ith class.

$d_i$ is the transformed class mark for the ith class.
A is an assumed mean, usually the mean of the class marks.
(i = 1, 2, …, k)

Example:
1. Suppose the deviations of the observations from an assumed mean of 7 are:
1, -1, -2, -2, 0, -3, -2, 2, 0, -3.
a) Find the true mean.
b) Find the original observations.
Solutions:
A = 7, $\sum d_i = -10$, n = 10
a) $\bar{d} = -10/10 = -1$
$\bar{x} = A + \bar{d} = 7 - 1 = 6$
The true mean is 6.
b) Using $X_i = A + d_i$ we obtain the following original observations:
8, 6, 5, 5, 7, 4, 5, 9, 7, 4.
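A short Python sketch of the assumed-mean method applied to this example is given below. The helper name `mean_from_deviations` is hypothetical and introduced only for illustration.

```python
def mean_from_deviations(assumed_mean, deviations):
    """True mean = assumed mean A + mean of the deviations d_i = X_i - A."""
    d_bar = sum(deviations) / len(deviations)
    return assumed_mean + d_bar


A = 7
d = [1, -1, -2, -2, 0, -3, -2, 2, 0, -3]

true_mean = mean_from_deviations(A, d)   # part (a)
observations = [A + di for di in d]      # part (b): X_i = A + d_i

print(true_mean)      # 6.0
print(observations)   # [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]
```

Averaging the recovered observations directly gives the same value, which is a quick way to confirm the coding did not change the mean.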

Special properties of Arithmetic mean

1. The sum of the deviations of a set of items from their mean is always zero, i.e.
$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$
2. The sum of the squared deviations of a set of items from their mean is the minimum, i.e.
$\sum (x_i - \bar{x})^2 < \sum (x_i - A)^2$, for any $A \neq \bar{x}$
3. If $\bar{x}_1$ is the mean of $n_1$ observations,
$\bar{x}_2$ is the mean of $n_2$ observations,
.
.
$\bar{x}_k$ is the mean of $n_k$ observations,
then the mean of all the observations in all groups, often called the combined mean, is given by:

$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \dots + n_k \bar{x}_k}{n_1 + n_2 + \dots + n_k} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$

Example: In a class there are 30 females and 70 males. If the females averaged 60 in an
examination and the males averaged 72, find the mean for the entire class.

Solutions:

Females                 Males
$\bar{x}_1 = 60$        $\bar{x}_2 = 72$
$n_1 = 30$              $n_2 = 70$

$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{(30 \times 60) + (70 \times 72)}{30 + 70} = \frac{6840}{100} = 68.40$
4. If a wrong figure has been used when calculating the mean, the correct mean can be obtained
without repeating the whole process using:

Correct Mean = Wrong Mean + (Correct Value − Wrong Value) / n

where n is the total number of observations.

Example: The average weight of 10 students was calculated to be 65 kg. Later it was discovered
that one weight was misread as 40 kg instead of 80 kg. Calculate the correct average weight.
Solutions:
Correct Mean = Wrong Mean + (Correct Value − Wrong Value) / n
Correct Mean = 65 + (80 − 40) / 10 = 65 + 4 = 69 kg

5. The effect of transforming original series on the mean.

a) If a constant k is added to (or subtracted from) every observation, then the new mean will
be the old mean plus (or minus) k.
b) If every observation is multiplied by a constant k, then the new mean will be k times the old
mean.
Example:
1. The mean of n Tetracycline capsules X1, X2, …, Xn is known to be 12 gm. A new set of
capsules of another drug is obtained by the linear transformation Yi = 2Xi – 0.5 (i = 1, 2, …, n).
What will be the mean of the new set of capsules?
Solutions:
New Mean = 2 × Old Mean − 0.5 = 2 × 12 − 0.5 = 23.5 gm
2. The mean of a set of numbers is 500.
a) If 10 is added to each of the numbers in the set, then what will be the mean of the new
set?
b) If each of the numbers in the set is multiplied by -5, then what will be the mean of the
new set?
Solutions:
a) New Mean = Old Mean + 10 = 500 + 10 = 510
b) New Mean = −5 × Old Mean = −5 × 500 = −2500
Weighted mean
- When different items of data are to be given different degrees of importance, a weighted mean is
appropriate.
- Weights are assigned to each item in proportion to its relative importance.
- Let X1, X2, …, Xn be the values of the items of a series and W1, W2, …, Wn their
corresponding weights; then the weighted mean, denoted $\bar{x}_w$, is defined as:
$\bar{x}_w = \frac{\sum_{i=1}^{n} W_i X_i}{\sum_{i=1}^{n} W_i}$

Example:
A student obtained the following percentages in an examination: English 60, Biology 75,
Mathematics 63, Physics 59, and Chemistry 55. Find the student's weighted arithmetic mean if
weights 1, 2, 1, 3, 3 respectively are allotted to the subjects.
Solutions:
$\bar{x}_w = \frac{\sum W_i X_i}{\sum W_i} = \frac{(1 \times 60) + (2 \times 75) + (1 \times 63) + (3 \times 59) + (3 \times 55)}{1 + 2 + 1 + 3 + 3} = \frac{615}{10} = 61.5$
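The weighted mean of this example can be reproduced with the following Python sketch. The function name `weighted_mean` is an assumption of this illustration.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(W_i * X_i) / sum(W_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# English, Biology, Mathematics, Physics, Chemistry
marks = [60, 75, 63, 59, 55]
weights = [1, 2, 1, 3, 3]

print(weighted_mean(marks, weights))  # 61.5
```

With all weights equal, the function reduces to the ordinary arithmetic mean, which for these marks is 62.4.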

Merits and Demerits of Arithmetic Mean

Merits:
 It is rigidly defined.
 It is based on all observation.
 It is suitable for further mathematical treatment.
 It is stable average, i.e. it is not affected by fluctuations of sampling to some extent.
 It is easy to calculate and simple to understand.
Demerits:
 It is affected by extreme observations.
 It cannot be used in the case of open end classes.
 It cannot be determined by the method of inspection.

 It cannot be used when dealing with qualitative characteristics, such as intelligence,
honesty, beauty.
 It can be a number which does not exist in the series.
 Sometimes it leads to wrong conclusion if the details of the data from which it is obtained
are not available.
 It gives high weight to high extreme values and less weight to low extreme values.
The Geometric Mean
- The geometric mean of a set of n observations is the nth root of their product.
- The geometric mean of $X_1, X_2, X_3, \dots, X_n$ is denoted by G.M and given by:
$G.M = \sqrt[n]{x_1 x_2 \cdots x_n}$
- Taking the logarithms of both sides:

$\log(G.M) = \log\left(\sqrt[n]{x_1 x_2 \cdots x_n}\right) = \log\left((x_1 x_2 \cdots x_n)^{1/n}\right)$

$\log(G.M) = \frac{1}{n} \log(x_1 x_2 \cdots x_n)$
$= \frac{1}{n} (\log x_1 + \log x_2 + \dots + \log x_n)$

$= \frac{1}{n} \sum_{i=1}^{n} \log x_i$

- The logarithm of the G.M of a set of observations is the arithmetic mean of their
logarithms:
$G.M = \text{antilog}\left(\frac{1}{n} \sum_{i=1}^{n} \log x_i\right)$

Example: Find the G.M of the numbers 2, 4, 8.

Solutions:
$G.M = \sqrt[n]{x_1 x_2 \cdots x_n} = \sqrt[3]{2 \times 4 \times 8} = \sqrt[3]{64} = 4$
Remark: The Geometric Mean is useful and appropriate for finding averages of ratios.
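The logarithm form of the geometric mean translates directly into code. The sketch below uses natural logarithms (any base works, as long as the antilog uses the same base); the function name `geometric_mean` and the restriction to positive values are assumptions of this illustration.

```python
import math


def geometric_mean(values):
    """G.M = antilog of the arithmetic mean of the logarithms of the values."""
    if any(v <= 0 for v in values):
        # The logarithm is undefined for zero or negative observations.
        raise ValueError("geometric mean requires positive values")
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)


print(round(geometric_mean([2, 4, 8]), 6))  # 4.0
```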
The harmonic mean
The harmonic mean of $X_1, X_2, X_3, \dots, X_n$ is denoted by H.M and given by:
$H.M = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}}$; this is also known as the simple harmonic mean.

In the case of a frequency distribution:

$H.M = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{X_i}}$, where $\sum f_i = n$, i.e. the total number of observations.

If observations X1, X2, …, Xn have weights W1, W2, …, Wn respectively, then their harmonic
mean is given by

$H.M = \frac{\sum_{i=1}^{n} W_i}{\sum_{i=1}^{n} \frac{W_i}{X_i}}$. This is called the weighted harmonic mean.

Remark: The harmonic mean is useful and appropriate in finding average speeds and average
rates.
Example: A cyclist pedals from his house to his college at a speed of 10 km/hr and back from the
college to his house at 15 km/hr. Find the average speed.
Solution: Here the distance is constant, so the simple H.M is appropriate for this problem.
X1 = 10 km/hr, X2 = 15 km/hr

$H.M = \frac{2}{\frac{1}{10} + \frac{1}{15}} = \frac{2}{1/6} = 12$ km/hr
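The cyclist example can be checked with a small Python sketch covering both the simple and the weighted harmonic mean. The function name `harmonic_mean` and the optional `weights` argument are assumptions made for this illustration.

```python
def harmonic_mean(values, weights=None):
    """Weighted harmonic mean: sum(W_i) / sum(W_i / X_i).

    With no weights given, every W_i is 1 and this is the simple harmonic mean.
    """
    if weights is None:
        weights = [1] * len(values)
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


speeds = [10, 15]  # km/hr, same distance in each direction
print(round(harmonic_mean(speeds), 2))  # 12.0
```

As a sanity check, suppose the one-way distance were 30 km (a made-up figure): the trip would take 3 + 2 = 5 hours for 60 km, which is again 12 km/hr.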

B. Mode
- The mode is the value which occurs most frequently in a set of values.
- The mode may not exist, and even if it does exist, it may not be unique.
- In the case of a discrete distribution, the value having the maximum frequency is the modal value.
Examples:
1. Find the mode of 5, 3, 5, 8, 9.
Mode = 5
2. Find the mode of 8, 9, 9, 7, 8, 2, and 5.
The data are bimodal: the modes are 8 and 9.
3. Find the mode of 4, 12, 3, 6, and 7.
There is no mode for this data.
- The mode of a set of numbers X1, X2, …, Xn is usually denoted by $\hat{x}$.
Mode for Grouped data
If data are given in the shape of a continuous frequency distribution, the mode is defined as:

$\hat{x} = L_{mo} + w \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right)$

Where:
$\hat{x}$ = the mode of the distribution
$L_{mo}$ = the lower class boundary of the modal class
w = the size of the modal class
$\Delta_1 = f_{mo} - f_1$
$\Delta_2 = f_{mo} - f_2$
$f_{mo}$ = frequency of the modal class
$f_1$ = frequency of the class preceding the modal class
$f_2$ = frequency of the class following the modal class
Note: The modal class is a class with the highest frequency.
Example: Following is the distribution of the size of certain farms selected at random from a
district. Calculate the mode of the distribution.

Size of farms No. of farms

5-15 8
15-25 12
25-35 17
35-45 29
45-55 31
55-65 5
65-75 3

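Solutions:
45-55 is the modal class, since it is the class with the highest frequency.
$L_{mo} = 45$
w = 10
$\Delta_1 = f_{mo} - f_1 = 31 - 29 = 2$
$\Delta_2 = f_{mo} - f_2 = 31 - 5 = 26$
$f_{mo} = 31$
$f_1 = 29$
$f_2 = 5$

$\hat{x} = L_{mo} + w \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) = 45 + 10 \left( \frac{2}{2 + 26} \right) = 45.71$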
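The grouped-mode calculation can be sketched in Python as below. The function name `grouped_mode` is hypothetical. The sketch assumes a single modal class and treats a missing neighbouring class as having frequency zero; both are assumptions of this illustration rather than rules stated in the notes.

```python
def grouped_mode(boundaries, frequencies):
    """Mode of grouped data: L_mo + w * d1 / (d1 + d2)."""
    # Index of the modal class (first class with the highest frequency).
    i = max(range(len(frequencies)), key=lambda j: frequencies[j])
    f_mo = frequencies[i]
    f1 = frequencies[i - 1] if i > 0 else 0                      # preceding class
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0   # following class
    lower, upper = boundaries[i]
    d1, d2 = f_mo - f1, f_mo - f2
    return lower + (upper - lower) * d1 / (d1 + d2)


# Size of farms from the example.
farm_classes = [(5, 15), (15, 25), (25, 35), (35, 45), (45, 55), (55, 65), (65, 75)]
farm_counts = [8, 12, 17, 29, 31, 5, 3]
print(round(grouped_mode(farm_classes, farm_counts), 2))  # 45.71
```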

Merits and Demerits of Mode
Merits:
 It is not affected by extreme observations.
 Easy to calculate and simple to understand.
 It can be calculated for distribution with open end class
Demerits:
 It is not rigidly defined.
 It is not based on all observations
 It is not suitable for further mathematical treatment.
 It is not stable average, i.e. it is affected by fluctuations of sampling to some extent.
 Often its value is not unique.
Note: being the point of maximum density, mode is especially useful in finding the most popular
size in studies relating to marketing, trade, business, and industry. It is the appropriate average to
be used to find the ideal size.

C. The Median
- In a distribution, median is the value of the variable which divides it in to two equal
halves.
- In an ordered series of data median is an observation lying exactly in the middle of the
series. It is the middle most value in the sense that the number of values less than the
median is equal to the number of values greater than it.
- If X1, X2, …, Xn are the observations, then the numbers arranged in ascending order are
X[1], X[2], …, X[n], where X[i] is the ith smallest value, so that
X[1] ≤ X[2] ≤ … ≤ X[n].
- The median is denoted by $\tilde{x}$, read as "x tilde".
Median for ungrouped data

$\tilde{x} = X_{[(n+1)/2]}$, if n is odd

$\tilde{x} = \frac{1}{2}\left( X_{[n/2]} + X_{[n/2 + 1]} \right)$, if n is even.

Example: Find the median of the following numbers.
a. 6, 5, 2, 8, 9, 4.
b. 2, 1, 8, 3, 5.
Solutions:
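a) First order the data: 2, 4, 5, 6, 8, 9
Here n = 6, which is even.
$\tilde{x} = \frac{1}{2}\left( X_{[n/2]} + X_{[n/2 + 1]} \right) = \frac{1}{2}\left( X_{[3]} + X_{[4]} \right)$

= ½(5 + 6) = 5.5 is the median value.

b) Order the data: 1, 2, 3, 5, 8
Here n = 5, which is odd.
$\tilde{x} = X_{[(n+1)/2]}$
$= X_{[(5+1)/2]} = X_{[3]}$
= 3 is the median value.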
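The odd and even cases of the ungrouped median can be sketched in Python as follows. The function name `median` is chosen for this illustration; note that Python lists are indexed from 0, so the ith smallest value X[i] is `ordered[i - 1]`.

```python
def median(values):
    """Median of ungrouped data: middle value (n odd) or mean of the two middle values (n even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # X[(n+1)/2] in 1-based notation
    return (ordered[mid - 1] + ordered[mid]) / 2   # (X[n/2] + X[n/2 + 1]) / 2


print(median([6, 5, 2, 8, 9, 4]))  # 5.5
print(median([2, 1, 8, 3, 5]))     # 3
```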
Median for grouped data
If data are given in the shape of a continuous frequency distribution, the median is defined as:
$\tilde{x} = L_{med} + \frac{w}{f_{med}} \left( \frac{n}{2} - c \right)$

Where:
$L_{med}$ = lower class boundary of the median class.
w = the size of the median class.
n = total number of observations.
c = the cumulative frequency (less than type) preceding the median class.
$f_{med}$ = the frequency of the median class.
Remark:
The median class is the class with the smallest cumulative frequency (less than type) greater than
or equal to $\frac{n}{2}$.

Example: Find the median of the following distribution.

class frequency

40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3

Solutions:
1. First find the less than cumulative frequency.
2. Identify the median class.
3. Find median using formula.

Class    Frequency   Less than cum. freq.
40-44    7           7
45-49    10          17
50-54    22          39
55-59    15          54
60-64    12          66
65-69    6           72
70-74    3           75

$\frac{n}{2} = \frac{75}{2} = 37.5$

39 is the first cumulative frequency to be greater than or equal to 37.5,

so 50-54 is the median class.
$L_{med} = 49.5$
w = 5
n = 75, c = 17, $f_{med} = 22$

$\tilde{x} = L_{med} + \frac{w}{f_{med}} \left( \frac{n}{2} - c \right)$

$= 49.5 + \frac{5}{22} (37.5 - 17)$

= 54.16
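The same steps (accumulate frequencies, locate the median class, apply the formula) can be sketched in Python. The function name `grouped_median` is hypothetical, and the class boundaries are written out explicitly (39.5 to 44.5, and so on) because the formula needs boundaries rather than class limits.

```python
def grouped_median(boundaries, frequencies):
    """Median of grouped data: L_med + (w / f_med) * (n/2 - c)."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    target = n / 2
    cumulative = 0  # less-than cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, frequencies):
        if cumulative + f >= target:
            # First class whose cumulative frequency reaches n/2: the median class.
            return lower + (upper - lower) / f * (target - cumulative)
        cumulative += f


mark_boundaries = [(39.5, 44.5), (44.5, 49.5), (49.5, 54.5), (54.5, 59.5),
                   (59.5, 64.5), (64.5, 69.5), (69.5, 74.5)]
mark_counts = [7, 10, 22, 15, 12, 6, 3]
print(round(grouped_median(mark_boundaries, mark_counts), 2))  # 54.16
```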
Merits and Demerits of Median
Merits:
 Median is a positional average and hence not influenced by extreme observations.
 Can be calculated in the case of open end intervals.
 Median can be located even if the data are incomplete.
Demerits:
 It is not a good representative of data if the number of items is small.
 It is not amenable to further algebraic treatment.
 It is susceptible to sampling fluctuations.
D. Quantiles
When a distribution is arranged in order of magnitude of items, the median is the value of the
middle term. Other measures that depend on position in the distribution, namely quartiles, deciles
and percentiles, are collectively called quantiles.
i. Quartiles:
- Quartiles are measures that divide the frequency distribution into four equal parts.
- The values of the variable corresponding to these divisions are denoted Q1, Q2, and Q3,
often called the first, the second and the third quartile respectively.
- Q1 is a value which has 25% of the items less than or equal to it.
- Similarly, Q2 has 50% of the items with values less than or equal to it and Q3 has 75% of the items
with values less than or equal to it.

- To find Qi (i = 1, 2, 3) we count $\frac{iN}{4}$ of the observations, beginning from the lowest class.

- For grouped data we have the following formula:

$Q_i = L_{Q_i} + \frac{w}{f_{Q_i}} \left( \frac{iN}{4} - c \right)$, i = 1, 2 and 3.

Where:
$L_{Q_i}$ = lower class boundary of the quartile class.
w = the size of the quartile class.
N = total number of observations.
c = the cumulative frequency (less than type) preceding the quartile class.
$f_{Q_i}$ = frequency of the quartile class.
Remark:
The quartile class (the class containing Qi) is the class with the smallest cumulative frequency (less
than type) greater than or equal to $\frac{iN}{4}$.

ii. Deciles:
- Deciles are measures that divide the frequency distribution in to ten equal parts.
- The values of the variables corresponding to these divisions are denoted D1, D2,.. D9
often called the first, the second,…, the ninth decile respectively.

- To find Di (i = 1, 2, …, 9) we count $\frac{iN}{10}$ of the observations, beginning from the lowest class.

- For grouped data we have the following formula:

$D_i = L_{D_i} + \frac{w}{f_{D_i}} \left( \frac{iN}{10} - c \right)$, i = 1, 2, …, 9.

Where:
$L_{D_i}$ = lower class boundary of the decile class.
w = the size of the decile class.
N = total number of observations.
c = the cumulative frequency (less than type) preceding the decile class.
$f_{D_i}$ = the frequency of the decile class.
Remark:
- The decile class (the class containing Di) is the class with the smallest cumulative frequency
(less than type) greater than or equal to $\frac{iN}{10}$.

iii. Percentiles:
- Percentiles are measures that divide the frequency distribution in to hundred equal parts.
- The values of the variables corresponding to these divisions are denoted P1, P2,.. P99 often
called the first, the second,…, the ninety-ninth percentile respectively.

- To find Pi (i = 1, 2, …, 99) we count $\frac{iN}{100}$ of the observations, beginning from the lowest class.

- For grouped data we have the following formula:

$P_i = L_{P_i} + \frac{w}{f_{P_i}} \left( \frac{iN}{100} - c \right)$, i = 1, 2, …, 99.

Where:
$L_{P_i}$ = lower class boundary of the percentile class.
w = the size of the percentile class.
N = total number of observations.
c = the cumulative frequency (less than type) preceding the percentile class.
$f_{P_i}$ = the frequency of the percentile class.
Remark:
The percentile class (the class containing Pi) is the class with the smallest cumulative frequency (less
than type) greater than or equal to $\frac{iN}{100}$.

Example: Considering the following distribution

Calculate:
a) All quartiles.
b) The 7th decile.
c) The 90th percentile.

values Frequency
140-150 17
150-160 29
160-170 42
170-180 72
180-190 84
190-200 107
200-210 49
210-220 34
220-230 31
230-240 16
240-250 12

Solutions:
 First find the less than cumulative frequency.
 Use the formula to calculate the required quantile.
Values     Frequency   Cum. freq. (less than type)
140-150 17 17
150-160 29 46
160-170 42 88
170-180 72 160
180-190 84 244
190-200 107 351
200-210 49 400
210-220 34 434
220-230 31 465
230-240 16 481
240-250 12 493

a) Quartiles:
i. Q1
- Determine the class containing the first quartile.
$\frac{N}{4} = \frac{493}{4} = 123.25$

170-180 is the class containing the first quartile.

$L_{Q_1} = 170$, w = 10
N = 493, c = 88, $f_{Q_1} = 72$
$Q_1 = L_{Q_1} + \frac{w}{f_{Q_1}} \left( \frac{N}{4} - c \right)$

$= 170 + \frac{10}{72} (123.25 - 88)$

= 174.90
ii. Q2
- Determine the class containing the second quartile.
$\frac{2N}{4} = \frac{2 \times 493}{4} = 246.5$

190-200 is the class containing the second quartile.
$L_{Q_2} = 190$, w = 10
N = 493, c = 244, $f_{Q_2} = 107$
$Q_2 = L_{Q_2} + \frac{w}{f_{Q_2}} \left( \frac{2N}{4} - c \right)$

$= 190 + \frac{10}{107} (246.5 - 244)$

= 190.23
iii. Q3
- Determine the class containing the third quartile.
$\frac{3N}{4} = \frac{3 \times 493}{4} = 369.75$

200-210 is the class containing the third quartile.

$L_{Q_3} = 200$, w = 10
N = 493, c = 351, $f_{Q_3} = 49$

$Q_3 = L_{Q_3} + \frac{w}{f_{Q_3}} \left( \frac{3N}{4} - c \right)$

$= 200 + \frac{10}{49} (369.75 - 351)$

= 203.83
b) D7
- Determine the class containing the 7th decile.
$\frac{7N}{10} = \frac{7 \times 493}{10} = 345.1$

190-200 is the class containing the 7th decile.
$L_{D_7} = 190$, w = 10
N = 493, c = 244, $f_{D_7} = 107$
$D_7 = L_{D_7} + \frac{w}{f_{D_7}} \left( \frac{7N}{10} - c \right)$

$= 190 + \frac{10}{107} (345.1 - 244)$

= 199.45
c) P90
- Determine the class containing the 90th percentile.
$\frac{90N}{100} = \frac{90 \times 493}{100} = 443.7$

220-230 is the class containing the 90th percentile.
$L_{P_{90}} = 220$, w = 10
N = 493, c = 434, $f_{P_{90}} = 31$
$P_{90} = L_{P_{90}} + \frac{w}{f_{P_{90}}} \left( \frac{90N}{100} - c \right)$

$= 220 + \frac{10}{31} (443.7 - 434)$

= 223.13

2.2. Measures of Dispersion (Variation)

Introduction and objectives of measuring Variation
-The scatter or spread of items of a distribution is known as dispersion or variation. In other
words the degree to which numerical data tend to spread about an average value is called
dispersion or variation of the data.
-Measures of dispersion are statistical measures which provide ways of measuring the extent to
which data are dispersed or spread out.
Objectives of measuring Variation:
 To judge the reliability of measures of central tendency
 To control variability itself.
 To compare two or more groups of numbers in terms of their variability.
 To make further statistical analysis.
Absolute and Relative Measures of Dispersion
The measures of dispersion which are expressed in terms of the original unit of a series are
termed as absolute measures. Such measures are not suitable for comparing the variability of two
distributions which are expressed in different units of measurement and different average size.
Relative measures of dispersions are a ratio or percentage of a measure of absolute dispersion to
an appropriate measure of central tendency and are thus pure numbers independent of the units
of measurement. For comparing the variability of two distributions (even if they are measured in
the same unit), we compute the relative measure of dispersion instead of absolute measures of
dispersion.

2.2.1. Types of Measures of Dispersion
Various measures of dispersions are in use. The most commonly used measures of dispersions
are:
1. Range and relative range
2. variance
3. Standard deviation
4. Coefficient of variation and standard score.
The Range (R)
The range is the largest score minus the smallest score. It is a quick and dirty measure of
variability, although when a test is given back to students they very often wish to know the range
of scores. Because the range is greatly affected by extreme scores, it may give a distorted picture
of the scores. The following two distributions have the same range, 13, yet appear to differ
greatly in the amount of variability.
Distribution 1: 32 35 36 36 37 38 40 42 42 43 43 45
Distribution 2: 32 32 33 33 33 34 34 34 34 34 35 45
For this reason, among others, the range is not the most important measure of variability.
R = L − S, where L = largest observation
and S = smallest observation.
Range for grouped data:
If data are given in the shape of a continuous frequency distribution, the range is computed as:
$R = UCB_k - LCB_1$, where $UCB_k$ is the upper class boundary of the last class and
$LCB_1$ is the lower class boundary of the first class.
This is sometimes expressed as:

$R = X_k - X_1$, where $X_k$ is the class mark of the last class and

$X_1$ is the class mark of the first class.
Merits and Demerits of range
Merits:
 It is rigidly defined.
 It is easy to calculate and simple to understand.

Demerits:
 It is not based on all observation.
 It is highly affected by extreme observations.
 It is affected by fluctuation in sampling.
 It is not liable to further algebraic treatment.
 It cannot be computed in the case of open end distribution.
 It is very sensitive to the size of the sample.
Relative Range (RR)
- It is also sometimes called the coefficient of range and is given by:

$RR = \frac{L - S}{L + S} = \frac{R}{L + S}$

Example:
1. Find the relative range of the above two distributions. (Exercise!)

2. If the range and relative range of a series are 4 and 0.25 respectively, then what is the value
of:
a) the smallest observation
b) the largest observation
Solutions (2):
R = 4, so L − S = 4 .................(1)
RR = 0.25, so R/(L + S) = 0.25 and hence L + S = 16 .................(2)
Solving (1) and (2) simultaneously, one obtains L = 10 and S = 6.
The Variance
Population Variance
If we divide the variation (the sum of the squared deviations from the mean) by the number of
values in the population, we get the population variance. This variance is the "average squared
deviation from the mean".
$\text{Population Variance} = \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$
For the case of a frequency distribution it is expressed as:
$\text{Population Variance} = \sigma^2 = \frac{1}{N} \sum_{i=1}^{k} f_i (x_i - \mu)^2$

Sample Variance
One would expect the sample variance to simply be the population variance with the population
mean replaced by the sample mean. However, one of the major uses of statistics is to estimate
the corresponding parameter, and that formula has the problem that the estimated value is not, on
average, the same as the parameter. To counteract this, the sum of the squares of the deviations
is divided by one less than the sample size.
$\text{Sample Variance} = S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

For the case of a frequency distribution it is expressed as:

$\text{Sample Variance} = S^2 = \frac{1}{n - 1} \sum_{i=1}^{k} f_i (x_i - \bar{x})^2$

We usually use the following shortcut formulas:

$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}$, for raw data.
$S^2 = \frac{\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2}{n - 1}$, for a frequency distribution.

Standard Deviation
There is a problem with variances. Recall that the deviations were squared.
That means that the units were also squared. To get the units back the same as the original data
values, the square root must be taken.
The following steps are used to calculate the sample variance:
a. Find the arithmetic mean.
b. Find the difference between each observation and the mean.
c. Square these differences.
d. Sum the squared differences.
e. Since the data is a sample, divide the number (from step d above) by the
number of observations minus one, i.e., n-1 (where n is equal to the number of
observations in the data set).
The standard deviation is then the square root of the variance:
Population standard deviation = $\sigma = \sqrt{\sigma^2}$
Sample standard deviation = $S = \sqrt{S^2}$

Examples: Find the variance and standard deviation of the following sample data.
1. 5, 17, 12, 10.
2. The data is given in the form of a frequency distribution.

Class    Frequency
40-44    7
45-49    10
50-54    22
55-59    15
60-64    12
65-69    6
70-74    3

Solutions:
1. $\bar{x} = 11$

Xi                     5    10   12   17   Total
$(X_i - \bar{x})^2$    36   1    1    36   74

$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = 74/3 = 24.67$

$S = \sqrt{S^2} = \sqrt{24.67} = 4.97$

2. $\bar{x} = 55$

Xi (C.M)                  42     47    52    57   62    67    72    Total
$f_i (X_i - \bar{x})^2$   1183   640   198   60   588   864   867   4400

$S^2 = \frac{\sum_{i=1}^{7} f_i (x_i - \bar{x})^2}{n - 1} = 4400/74 = 59.46$

$S = \sqrt{S^2} = \sqrt{59.46} = 7.71$
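Both worked examples follow the same steps (mean, squared deviations, divide by n − 1, square root), which can be sketched in Python as below. The function name `sample_variance` and the optional `frequencies` argument are assumptions made for this illustration.

```python
import math


def sample_variance(values, frequencies=None):
    """Sample variance: sum(f_i * (x_i - mean)^2) / (n - 1), with n = sum(f_i).

    For raw data, leave `frequencies` as None so that every f_i is 1.
    """
    if frequencies is None:
        frequencies = [1] * len(values)
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / n
    squared_dev = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies))
    return squared_dev / (n - 1)


# Example 1: raw data.
var1 = sample_variance([5, 17, 12, 10])
print(round(var1, 2), round(math.sqrt(var1), 2))  # 24.67 4.97

# Example 2: frequency distribution, using the class marks as the x values.
class_marks = [42, 47, 52, 57, 62, 67, 72]
counts = [7, 10, 22, 15, 12, 6, 3]
var2 = sample_variance(class_marks, counts)
print(round(var2, 2), round(math.sqrt(var2), 2))  # 59.46 7.71
```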

Special properties of Standard deviations

1. $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} < \frac{\sum_{i=1}^{n} (x_i - A)^2}{n - 1}$, for any $A \neq \bar{x}$

2. For a normal (symmetric) distribution the following holds:

- Approximately 68.27% of the data values fall within one standard
deviation of the mean, i.e. within $(\bar{x} - S, \bar{x} + S)$.
- Approximately 95.45% of the data values fall within two standard
deviations of the mean, i.e. within $(\bar{x} - 2S, \bar{x} + 2S)$.
- Approximately 99.73% of the data values fall within three standard
deviations of the mean, i.e. within $(\bar{x} - 3S, \bar{x} + 3S)$.
3. Chebyshev's Theorem
For any data set, no matter what the pattern of variation, the proportion of the values that
fall within k standard deviations of the mean, i.e. within $(\bar{x} - kS, \bar{x} + kS)$, will be at least
$1 - \frac{1}{k^2}$, where k is any number greater than 1; i.e. the proportion of items falling beyond k standard
deviations of the mean is at most $\frac{1}{k^2}$.

Example: Suppose a distribution has mean 50 and standard deviation 6.What percent of the
numbers are:
a) Between 38 and 62
b) Between 32 and 68
c) Less than 38 or more than 62.
d) Less than 32 or more than 68.
Solutions:
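a) 38 and 62 are at equal distance from the mean, 50, and this distance is 12.
kS = 12
k = 12/S = 12/6 = 2
Applying the above theorem, at least $(1 - \frac{1}{k^2}) \times 100\% = 75\%$ of the numbers lie between 38 and 62.

b) Similarly, 32 and 68 are both at a distance of 18 from the mean, so k = 18/6 = 3 and at least
$(1 - \frac{1}{9}) \times 100\% \approx 88.89\%$ of the numbers lie between 32 and 68.

c) This is just the complement of a), i.e. at most $\frac{1}{k^2} \times 100\% = 25\%$ of the numbers are less than 38 or
more than 62.

d) Similarly, this is the complement of b): at most $\frac{1}{9} \times 100\% \approx 11.11\%$ of the numbers are less than 32
or more than 68.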
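The four parts of this example reduce to computing k and then 1 − 1/k². A minimal Python sketch follows; the function name `chebyshev_lower_bound` is hypothetical, and the sketch only handles intervals that are symmetric about the mean, as in this example.

```python
def chebyshev_lower_bound(mean, sd, low, high):
    """Minimum proportion of values in (low, high) by Chebyshev's theorem.

    Assumes the interval is symmetric about the mean and more than one
    standard deviation wide on each side (k > 1).
    """
    if abs((mean - low) - (high - mean)) > 1e-12:
        raise ValueError("interval must be symmetric about the mean")
    k = (high - mean) / sd
    if k <= 1:
        raise ValueError("Chebyshev's theorem needs k > 1")
    return 1 - 1 / k ** 2


inside_a = chebyshev_lower_bound(50, 6, 38, 62)   # part a), k = 2
inside_b = chebyshev_lower_bound(50, 6, 32, 68)   # part b), k = 3
print(inside_a, 1 - inside_a)                      # 0.75 0.25  (parts a and c)
print(round(inside_b, 4), round(1 - inside_b, 4))  # 0.8889 0.1111  (parts b and d)
```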
- If the standard deviation of X1, X2, …, Xn is S, then the standard deviation of
a) X1 + k, X2 + k, …, Xn + k will also be S
b) kX1, kX2, …, kXn will be |k|S
c) a + kX1, a + kX2, …, a + kXn will be |k|S
Examples:
1. The mean and standard deviation of n Tetracycline capsules
X1, X2, …, Xn are known to be 12 gm and 3 gm respectively. A new set of capsules of
another drug is obtained by the linear transformation Yi =
2Xi – 0.5 (i = 1, 2, …, n). What will be the standard deviation of the new set of
capsules?
2. The mean and the standard deviation of a set of numbers are respectively 500 and 10.
a. If 10 is added to each of the numbers in the set, then what will be the variance and
standard deviation of the new set?
b. If each of the numbers in the set is multiplied by -5, then what will be the variance
and standard deviation of the new set?

Solutions:
1. Using c) above, the new standard deviation = |k|S = 2 × 3 = 6 gm
2. a. They will remain the same: the variance is 10² = 100 and the standard deviation is 10.
b. New standard deviation = |k|S = |−5| × 10 = 50, so the new variance = 50² = 2500.
Coefficient of Variation (C.V)
- It is defined as the ratio of the standard deviation to the mean, usually expressed as a percentage:
$C.V = \frac{S}{\bar{x}} \times 100\%$

- The distribution having the smaller C.V is said to be less variable or more consistent.
Examples:
1. An analysis of the monthly wages paid (in Birr) to workers in two firms A and B belonging to
the same industry gives the following results:
Values       Firm A   Firm B
Mean wage    52.5     47.5
Variance     100      121
In which firm, A or B, is there greater variability in individual wages?
Solutions:
Calculate the coefficient of variation for both firms.
$C.V_A = \frac{S_A}{\bar{x}_A} \times 100\% = \frac{10}{52.5} \times 100\% = 19.05\%$
$C.V_B = \frac{S_B}{\bar{x}_B} \times 100\% = \frac{11}{47.5} \times 100\% = 23.16\%$

Since C.VA < C.VB, in firm B there is greater variability in individual wages.
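The comparison of the two firms can be reproduced with the short Python sketch below. The function name `coefficient_of_variation` is an assumption of this illustration; it takes the variance as input because that is what the table provides.

```python
import math


def coefficient_of_variation(mean, variance):
    """C.V = (standard deviation / mean) * 100, expressed as a percentage."""
    return math.sqrt(variance) / mean * 100


cv_a = coefficient_of_variation(52.5, 100)   # Firm A
cv_b = coefficient_of_variation(47.5, 121)   # Firm B
print(round(cv_a, 2), round(cv_b, 2))        # 19.05 23.16
print("Firm B" if cv_b > cv_a else "Firm A", "has the greater relative variability")
```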

Standard Scores (Z-scores)
 If X is a measurement from a distribution with mean X and standard deviation S, then its
value in standard units is

x µ
Z= , for population.
σ
x x
Z= , for sample.
s

 Z gives the deviations from the mean in units of standard deviation

 Z gives the number of standard deviation a particular observation lie above or below the
mean.
 It is used to compare two observations coming from different groups.
Examples:
1. Two sections were given introduction to statistics examinations. The following
information was given.
Value section 1 section 2
Mean 78 90
St.devation 6 5

Student A from section 1 scored 90 and student B from section 2 scored 95.Relatively speaking
who performed better?
Solutions:
Calculate the standard score of both students.
xA x1 90 78
ZA = = =2
s1 6
xB x2 95 90
ZB = = =1
s2 5

- Student A performed better relative to his section, because the score of student A is two
standard deviations above the mean score of his section, while the score of student B is
only one standard deviation above the mean score of his section.
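As a quick check of the standard scores in this example, here is a minimal Python sketch. The helper name z_score is a hypothetical choice for illustration; it implements Z = (x − x̄) / s for a sample.

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_a = z_score(90, 78, 6)  # student A, section 1
z_b = z_score(95, 90, 5)  # student B, section 2

print("Z of student A:", z_a)
print("Z of student B:", z_b)
print("Better relative performance:", "A" if z_a > z_b else "B")
```

Because both scores are expressed in units of their own section's standard deviation, they can be compared directly even though the sections have different means.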
2. Two groups of people were trained to perform a certain task and tested to find out which
group is faster to learn the task. For the two groups the following information was given:

Value group 1 group 2
Mean 10.4 min 11.9 min
St.dev 1.2 min 1.3 min
Relatively speaking:
a) Which group is more consistent in its performance?
b) Suppose person A from group one takes 9.2 minutes while person B from group two takes 9.3
minutes. Who was faster in performing the task? Why?
Solutions:
a) Use coefficient of variation.

C.V1 = (s1 / x̄1) × 100% = (1.2 / 10.4) × 100% = 11.54%
C.V2 = (s2 / x̄2) × 100% = (1.3 / 11.9) × 100% = 10.92%

Since C.V2 < C.V1, group 2 is more consistent.

b) Calculate the standard score of A and B
ZA = (xA − x̄1) / s1 = (9.2 − 10.4) / 1.2 = −1
ZB = (xB − x̄2) / s2 = (9.3 − 11.9) / 1.3 = −2

 Person B is faster, because the time taken by person B is two standard deviations shorter than
the average time taken by group 2, while the time taken by person A is only one standard
deviation shorter than the average time taken by group 1.

2.3. Review Exercises
1. By considering the given raw data
12, 13, 17, 12, 13, 14, 16, 12, 15 and 15.
Find: i) mean
ii) mode
iii) median
iv) variance and standard deviation
v) coefficient of variation.
2. The Addis Ababa city municipality police traffic control department has observed
the number of car accidents (per month) to be categorized as shown in the table
below.
No. of car accidents frequency
10-14 5
15-19 6
20-24 3
25-29 4
30-34 2

Then calculate:
a) the average number of accidents
b) The median and mode
c) The variance, standard deviation and coefficient of variation
3. Marks of 75 students are summarized in the following frequency distribution:

Marks No. of students

40-44 7
45-49 10
50-54 22
55-59 f4
60-64 f5
65-69 6
70-74 3

If 20% of the students have marks between 55 and 59
i) Find the missing frequencies f4 and f5.
ii) Find the mean.
4. A teacher assigns weights of 2 to the Quiz, 3 to the Mid-term and 5 to the Final exam. If a
student gets 90, 50 and 60 for the Quiz, Mid-term and Final exam respectively, what is his/her
average academic performance?
5. The mean weight of 50 women workers in a factory is 48 kg. The mean weight of
75 men working in the same factory is 58 kg. Find the mean weight of all workers
in the factory.
6. The mean of 200 items was found to be 40. Later on it was discovered that two
items were wrongly read as 92 and 8 instead of 192 and 88 respectively. Find the
correct mean.
7. The mean salary of 100 laborers working in a factory , running in two shifts of 40
and 60 workers respectively is birr 380. The mean salary of the 40 laborers
working in the morning shift is birr 350. Find the mean salary of the 60 laborers
working in the evening shift.
8. Find the geometric mean of A) 1, 2, 3, 4, and 5. B) 1, 2, 3, 4, 100. Is there a great
difference between the GM of A and that of B?
9. The price of a commodity increased by 5% from 1989 to 1990, 8% from 1990 to
1991 and by 77% from 1991 to 1992. Find the average price increase.
10. Find the harmonic mean of A) 1, 2, 3, 4, 5. B) 1, 2, 3, 4, 100. Is there a great
difference between the HM of A and that of B?
11. A driver traveled 400 km per day for three days at a speed of 60, 50 and 40
kilometers per hour. Find the average speed of the driver.
12. A student reads the first 100 pages of a book at a rate of 5 pages per hour, the next
100 pages at a rate of 8 pages per hour. What is the student’s average reading
speed?
13. In a certain investigation, 460 persons were involved in the study, and based on an
enquiry on their age, it was known that 75% of them were 22 or more. The
following frequency distribution shows the age composition of the persons under
study.

Mid-age in years 13 18 23 28 33 38 43 48

Number of persons 24 f1 90 122 f2 56 20 33

a. Find the median and modal age of the persons and interpret them.
b. Find the values of all quartiles.
c. Compute the 5th decile, 25th percentile, 50th percentile and the 75th
percentile and interpret the results.
14. The mean annual salary of all employees in a company is 2500. The mean salary
of male and female is 2700 and 1700 respectively. Find the percentage of males
and females employed in the company.
15. Given the following FD.
Mid-price of a commodity 15 25 35 45 55

Number of items sold 27 A 28 B 19

a. If 75% of the items were sold in birr 45 or less and most items were sold in
birr 34, find the missing frequencies.
b. If 25% of the items were sold in less than or equal to birr 45 and most items
were sold in birr 34, find the missing frequencies.

CHAPTER 3
ELEMENTARY PROBABILITY
Introduction
In chapter one we discussed the difference between descriptive and inferential statistics. Much
statistical analysis is inferential, and probability is the cornerstone of inferential statistics.
Recall that inferential statistics involves taking a sample from the population, computing a sample
value (a statistic) from the sample, and inferring from the statistic the value of the corresponding
population value (a parameter). The reason for doing so is that it is difficult, and sometimes
impossible, to get the population parameter directly.
Objectives of the chapter
By the end of this chapter students must be able to:
- Understand the concept behind probabilistic models, random experiment, sample space,
events and fundamental axioms of probability.
- Justify the different types of outcomes, sample spaces, events and probability approaches.
- Solve various probability problems.
- Calculate permutation and combination; determine whether you should use combination
or permutation to calculate the number of outcomes.
Definition
 Probability theory is the foundation upon which the logic of inference is built.
 It helps us to cope with uncertainty.
 In general, probability is the chance of an outcome of an experiment. It is the measure of
how likely an outcome is to occur.
3.1 Definitions of some probability terms
1. Experiment: Any process of observation or measurement, or any process which generates
well-defined outcomes.
2. Probability Experiment: It is an experiment that can be repeated any number of times under
similar conditions and for which it is possible to enumerate all the possible outcomes without
being able to predict an individual outcome. It is also called a random experiment.
Example: If a fair die is rolled once, it is possible to list all the possible outcomes, i.e. 1, 2, 3, 4, 5,
6, but it is not possible to predict which outcome will occur.
3. Outcome: The result of a single trial of a random experiment.

4. Sample Space: Set of all possible outcomes of a probability experiment
5. Event: It is a subset of the sample space. It is a statement about one or more outcomes of a
random experiment. Events are denoted by capital letters.
Example: Considering the above experiment, let A be the event of odd numbers, B be the event of
even numbers, and C be the event of number 8.
A = {1, 3, 5}
B = {2, 4, 6}
C = { } = Ø, the empty set or impossible event
Remark:
If S (sample space) has n members then there are exactly 2^n subsets or events.
6. Equally Likely Events: Events which have the same chance of occurring.
7. Complement of an Event: The complement of an event A means non-occurrence of A and
is denoted by A′, Aᶜ, or Ā. It contains those points of the sample space which don’t
belong to A.
8. Elementary Event: an event having only a single element or sample point.
9. Mutually Exclusive Events: Two events which cannot happen at the same time.
10. Independent Events: Two events are independent if the occurrence of one does not affect
the probability of the other occurring.
11. Dependent Events: Two events are dependent if the first event affects the outcome or
occurrence of the second event in a way the probability is changed.
Example: What is the sample space for each of the following experiments?
a) Toss a die one time.
b) Toss a coin two times.
c) A light bulb is manufactured. It is tested for its life length by time.
Solution
a) S={1,2,3,4,5,6}
b) S={(HH),(HT),(TH),(TT)}
c) S={t /t≥0}
Sample space can be
 Countable ( finite or infinite)
 Uncountable.

3.2 Counting Rules
In order to calculate probabilities, we have to know
 The number of elements of an event
 The number of elements of the sample space.
That is, in order to judge what is probable, we have to know what is possible.
 In order to determine the number of outcomes, one can use several rules of counting.
- The addition rule
- The multiplication rule
- Permutation rule
- Combination rule
 To list the outcomes of the sequence of events, a useful device called tree diagram is
used.
Example: A student goes to the nearest snack bar to have breakfast. He can take tea, coffee, or milk
with bread, cake, or a sandwich. How many possibilities does he have?
Solutions:
Tea
   - Bread
   - Cake
   - Sandwich
Coffee
   - Bread
   - Cake
   - Sandwich
Milk
   - Bread
   - Cake
   - Sandwich
 There are 3 × 3 = 9 possibilities.
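The tree diagram can also be listed by program. The following is a minimal Python sketch using the standard itertools.product function; the variable names drinks, foods and breakfasts are illustrative assumptions.

```python
from itertools import product

drinks = ["tea", "coffee", "milk"]
foods = ["bread", "cake", "sandwich"]

# Each (drink, food) pair corresponds to one branch end of the tree diagram.
breakfasts = list(product(drinks, foods))

for drink, food in breakfasts:
    print(drink, "with", food)
print("Number of possibilities:", len(breakfasts))
```

The count equals the number of drinks times the number of foods, which is the idea behind the multiplication rule.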
The Multiplication Rule:
If a choice consists of k steps of which the first can be made in n1 ways, the second can be made
in n2 ways, ..., the kth can be made in nk ways, then the whole choice can be made in
(n1 * n2 * ... * nk) ways.
Example: The digits 0, 1, 2, 3, and 4 are to be used in 4 digit identification card.
How many different cards are possible if
a) Repetitions are permitted.
b) Repetitions are not permitted.
Solutions
a)

1st digit   2nd digit   3rd digit   4th digit
    5           5           5           5

There are four steps:

1. Selecting the 1st digit; this can be done in 5 ways.
2. Selecting the 2nd digit; this can be done in 5 ways.
3. Selecting the 3rd digit; this can be done in 5 ways.
4. Selecting the 4th digit; this can be done in 5 ways.
 5 × 5 × 5 × 5 = 625 different cards are possible.
b)

1st digit   2nd digit   3rd digit   4th digit
    5           4           3           2

There are four steps:

1. Selecting the 1st digit; this can be done in 5 ways.
2. Selecting the 2nd digit; this can be done in 4 ways.
3. Selecting the 3rd digit; this can be done in 3 ways.
4. Selecting the 4th digit; this can be done in 2 ways.
 5 × 4 × 3 × 2 = 120 different cards are possible.
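Both counts in this example (5 × 5 × 5 × 5 = 625 with repetition, 5 × 4 × 3 × 2 = 120 without) can be confirmed by brute-force enumeration. The sketch below is one possible check in Python, assuming the digits are represented as characters; the variable names are illustrative.

```python
from itertools import permutations, product

digits = "01234"

# a) Repetitions permitted: every position can hold any of the 5 digits.
with_repetition = list(product(digits, repeat=4))

# b) Repetitions not permitted: ordered selections of 4 distinct digits.
without_repetition = list(permutations(digits, 4))

print("Cards with repetition:", len(with_repetition))
print("Cards without repetition:", len(without_repetition))
```

Enumeration is only practical for small cases like this one; the multiplication rule gives the same counts without listing anything.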
Permutation
An arrangement of n objects in a specified order is called permutation of the objects.
Permutation Rules:
1. The number of permutations of n distinct objects taken all together is n!,
where n! = n × (n − 1) × (n − 2) × ... × 3 × 2 × 1.
2. The arrangement of n objects in a specified order using r objects at a time is called the
permutation of n objects taken r objects at a time. It is written as nPr and the formula is
nPr = n! / (n − r)!

3. The number of permutations of n objects in which k1 are alike, k2 are alike, ..., km are alike is
n! / (k1! × k2! × ... × km!)
Example:
1. Suppose we have the letters A, B, C, D.
a) How many permutations are there taking all the four?
b) How many permutations are there taking two letters at a time?
2. How many different permutations can be made from the letters in the word “CORRECTION”?
Solutions:
1.
a) Here n = 4; there are four distinct objects.
 There are 4! = 24 permutations.
b) Here n = 4, r = 2.
There are 4P2 = 4! / (4 − 2)! = 12 permutations.

2.
Here n = 10,
of which 2 are C, 2 are O, 2 are R, 1 is E, 1 is T, 1 is I and 1 is N.
k1 = 2, k2 = 2, k3 = 2, k4 = k5 = k6 = k7 = 1
Using the 3rd rule of permutation, there are
10! / (2! × 2! × 2! × 1! × 1! × 1! × 1!) = 453600 permutations.
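The three permutation rules can be checked with Python's standard math module (math.perm requires Python 3.8 or later). The helper arrangements_with_repeats is a hypothetical name introduced here; it applies the third rule, n! divided by the factorial of each letter's count.

```python
from collections import Counter
from math import factorial, perm


def arrangements_with_repeats(word):
    """Permutations of all letters of word when some letters are alike."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)  # divide by k! for each group of alike letters
    return total


print("4! =", factorial(4))
print("4P2 =", perm(4, 2))
print("CORRECTION:", arrangements_with_repeats("CORRECTION"))
```

For a word with all letters distinct, the function reduces to the first rule, n!.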

Combination
- A selection of objects without regard to order is called combination.

Example: Given the letters A, B, C, and D, list the permutations and combinations for selecting two
letters.
Solutions:
Permutation Combination
AB BA CA DA AB BC
AC BC CB DB AC BD
AD BD CD DC AD DC
Note that in permutation AB is different from BA. But in combination AB is the same as BA.
Combination Rule
- The number of combinations of r objects selected from n objects is denoted by
nCr or C(n, r) and given by:

nCr = n! / ((n − r)! × r!)

Examples:
1. In how many ways can a committee of 5 people be chosen out of 9 people?
Solutions:
n = 9, r = 5
nCr = n! / ((n − r)! × r!) = 9! / ((9 − 5)! × 5!) = 126 ways.

2. Among 15 clocks there are two defectives. In how many ways can an inspector choose
three of the clocks for inspection so that:
A. There is no restriction.
B. None of the defective clocks is included.
C. Only one of the defective clocks is included.
D. Two of the defective clocks are included.
Solutions:
n = 15, of which 2 are defective and 13 are non-defective.
r = 3
a) If there is no restriction, select three clocks from 15 clocks; this can be done in:
n = 15, r = 3
nCr = n! / ((n − r)! × r!) = 15! / ((15 − 3)! × 3!) = 455 ways.

b) None of the defective clocks is included.
This is equivalent to zero defective and three non-defective, which can be done in:
2C0 × 13C3 = 1 × 286 = 286 ways.
c) Only one of the defective clocks is included.
This is equivalent to one defective and two non-defective, which can be done in:
2C1 × 13C2 = 2 × 78 = 156 ways.
d) Two of the defective clocks are included.
This is equivalent to two defective and one non-defective, which can be done in:
2C2 × 13C1 = 1 × 13 = 13 ways.
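The clock example can be verified with math.comb from the Python standard library (Python 3.8 or later). This is only a checking sketch; the variable names are illustrative. It also confirms that the three restricted cases together account for every unrestricted selection.

```python
from math import comb

defective, good, chosen = 2, 13, 3

no_restriction = comb(defective + good, chosen)
none_defective = comb(defective, 0) * comb(good, 3)
one_defective = comb(defective, 1) * comb(good, 2)
two_defective = comb(defective, 2) * comb(good, 1)

print(no_restriction, none_defective, one_defective, two_defective)
# Every selection contains 0, 1 or 2 defectives, so the cases must add up.
print(none_defective + one_defective + two_defective == no_restriction)
```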
3.3 Approaches to measuring Probability
There are four different conceptual approaches to the study of probability theory. These are:
 The classical approach.
 The frequentist approach.
 The axiomatic approach.
 The subjective approach.
The classical approach
This approach is used when:
- All outcomes are equally likely.
- Total number of outcome is finite, say N.
Definition: If a random experiment with N equally likely outcomes is conducted and out of these
NA outcomes are favourable to the event A, then the probability that event A occurs, denoted P(A),
is defined as:
P(A) = NA / N = (No. of outcomes favourable to A) / (total number of outcomes) = n(A) / n(S)

Examples:
1. A fair die is tossed once. What is the probability of getting?
a) Number 4?
b) An odd number?
c) An even number?
d) Number 8?
Solutions:
First identify the sample space, say S
S = {1, 2, 3, 4, 5, 6}
N = n(S) = 6
a) Let A be the event of number 4
A = {4}
NA = n(A) = 1
P(A) = n(A) / n(S) = 1/6

b) Let A be the event of odd numbers
A = {1, 3, 5}
NA = n(A) = 3
P(A) = n(A) / n(S) = 3/6 = 0.5

c) Let A be the event of even numbers
A = {2, 4, 6}
NA = n(A) = 3
P(A) = n(A) / n(S) = 3/6 = 0.5

d) Let A be the event of number 8
A = Ø
NA = n(A) = 0
P(A) = n(A) / n(S) = 0/6 = 0

2. A box of 80 candles consists of 30 defective and 50 non-defective candles. If 10 of these
candles are selected at random, what is the probability that:
a) All will be defective.
b) 6 will be non-defective.
c) All will be non-defective.
Solutions:
Total selection = N = n(S) = 80C10
a) Let A be the event that all will be defective.
Total ways in which A occurs = NA = n(A) = 30C10 × 50C0
P(A) = NA / N = (30C10 × 50C0) / 80C10 = 0.00001825

b) Let A be the event that 6 will be non-defective.
Total ways in which A occurs = NA = n(A) = 50C6 × 30C4
P(A) = NA / N = (50C6 × 30C4) / 80C10 ≈ 0.2645

c) Let A be the event that all will be non-defective.
Total ways in which A occurs = NA = n(A) = 50C10 × 30C0
P(A) = NA / N = (50C10 × 30C0) / 80C10 ≈ 0.0062
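The three candle probabilities follow the same pattern: choose the required number of non-defective candles from 50, the rest from the 30 defective ones, and divide by the total number of selections of 10 from 80. A minimal Python sketch of that calculation, with the hypothetical helper name prob_non_defective, might look like this:

```python
from math import comb

DEFECTIVE, NON_DEFECTIVE, SELECTED = 30, 50, 10
TOTAL_WAYS = comb(DEFECTIVE + NON_DEFECTIVE, SELECTED)  # n(S) = 80C10


def prob_non_defective(k):
    """Classical probability that exactly k of the 10 selected candles are non-defective."""
    favourable = comb(NON_DEFECTIVE, k) * comb(DEFECTIVE, SELECTED - k)
    return favourable / TOTAL_WAYS


p_all_defective = prob_non_defective(0)
p_six_non_defective = prob_non_defective(6)
p_all_non_defective = prob_non_defective(10)

print(f"a) {p_all_defective:.8f}")
print(f"b) {p_six_non_defective:.4f}")
print(f"c) {p_all_non_defective:.4f}")
```

Summing prob_non_defective(k) over k = 0, ..., 10 gives 1, since exactly one of these counts must occur.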

Shortcomings of the classical approach:

This approach is not applicable when:
- The total number of outcomes is infinite.
- Outcomes are not equally likely.
The Frequentist Approach
This is based on the relative frequencies of outcomes belonging to an event.
Definition: The probability of an event A is the proportion of outcomes favourable to A in the
long run when the experiment is repeated under the same conditions.
P(A) = lim (N→∞) NA / N

Example: Records show that 60 out of 100,000 bulbs produced are defective.
What is the probability that a newly produced bulb is defective?
Solution:
Let A be the event that the newly produced bulb is defective.

P(A) = lim (N→∞) NA / N ≈ 60 / 100,000 = 0.0006
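The long-run idea behind the frequentist approach can be illustrated by simulation. The sketch below rolls a simulated fair die many times and reports the relative frequency of the number 4, which should settle near 1/6. The function name relative_frequency and the fixed seed are assumptions made so that the run is repeatable.

```python
import random


def relative_frequency(target, n_trials, seed=0):
    """Proportion of n_trials simulated fair-die rolls that show target."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == target)
    return hits / n_trials


for n in (100, 10_000, 100_000):
    print(n, relative_frequency(4, n))
print("Classical value:", 1 / 6)
```

With few trials the relative frequency can be far from 1/6; it is only in the long run that it stabilises.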

Axiomatic Approach:
Let E be a random experiment and S be a sample space associated with E. With each event A we
associate a real number P(A), called the probability of A, which satisfies the following
properties, called the axioms of probability or postulates of probability.
1. P(A) ≥ 0
2. P(S) = 1, S is the sure event.
3. If A and B are mutually exclusive events, the probability that one or the other occurs equals the
sum of the two probabilities, i.e.
P(A ∪ B) = P(A) + P(B)
4. P(A′) = 1 − P(A)
5. 0 ≤ P(A) ≤ 1
6. P(ø) = 0, ø is the impossible event.
Remark: Venn-diagrams can be used to solve probability problems.

[Figure: Venn diagrams showing A ∪ B, A ∩ B and A′]

In general p(A ∪ B) = p(A) + p(B) − p(A ∩ B)

3.4. Review Exercises

1. Six different statistics books, seven different physics books, and 3 different Economics books
are arranged on a shelf. How many different arrangements are possible if;
i. The books in each particular subject must all stand together
ii. Only the statistics books must stand together
2. If the permutation of the word WHITE is selected at random, how many of the permutations
i. Begins with a consonant?
ii. Ends with a vowel?

iii. Has a consonant and vowels alternating?
3. Out of 5 mathematicians and 7 statisticians, a committee consisting of 2 mathematicians and 3
statisticians is to be formed. In how many ways can this be done if
a) There is no restriction
b) One particular Statistician should be included
c) Two particular Mathematicians cannot be included on the committee.
4. If 3 books are picked at random from a shelf containing 5 novels, 3 books of poems, and a
dictionary, in how many ways can this be done if
a) There is no restriction.
b) The dictionary is selected?
c) 2 novels and 1 book of poems are selected?
5. What is the probability that a waitress will refuse to serve alcoholic beverages to only three
minors if she randomly checks the I.D’s of five students from among ten students of which four
are not of legal age?
6. If 3 books are picked at random from a shelf containing 5 novels, 3 books of poems, and a
dictionary, what is the probability that
a) The dictionary is selected?
b) 2 novels and 1 book of poems are selected?
7. There are 5 cooperative, 3 management, and 2 accounting students. In how many ways can
these students be arranged if:
a. students from the same department must sit together?
b. no restriction?

CHAPTER FOUR
Conditional probability and Independency
Objectives of the chapter
By the end of the chapter; you should be able to:-
- Know basic concept and important facts about conditional probability and independency.
- Determine whether two events are dependent or independent and whether one event is
conditional on another or not.
- Solve problems related to conditional probability and independence.
Conditional Events: If the occurrence of one event has an effect on the occurrence of the
other event, then the two events are conditional or dependent events.
The conditional probability of an event B in relation to an event A is the probability that event B
occurs given that the event A has already occurred. The notation for this conditional probability
is p(B/A), where the slash (/) is read as “given” and the whole symbol is referred to as
“the probability of B given A” or “the conditional probability of B given A”.
Example: Suppose we have two red and three white balls in a bag
1. Draw a ball with replacement

Let A = the event that the first draw is red; p(A) = 2/5

B = the event that the second draw is red; p(B) = 2/5

A and B are independent.

2. Draw a ball without replacement

Let A = the event that the first draw is red; p(A) = 2/5

B = the event that the second draw is red; p(B) depends on the outcome of the first draw.

This is conditional.
Let B/A = the event that the second draw is red given that the first draw is
red; p(B/A) = 1/4
4.1 Conditional probability of an event
The conditional probability of an event A given that B has already occurred, denoted p(A/B), is

p(A/B) = p(A ∩ B) / p(B), p(B) ≠ 0
Examples
1. For a student enrolling as a freshman at a certain university, the probability is 0.25 that he/she
will get a scholarship and 0.75 that he/she will graduate. The probability is 0.2 that he/she will
get a scholarship and will also graduate. What is the probability that a student who gets a
scholarship will graduate?
Solution:
Let A = the event that a student will get a scholarship
B = the event that a student will graduate
Given: p(A) = 0.25, p(B) = 0.75, p(A ∩ B) = 0.20
Required: p(B/A)
p(B/A) = p(A ∩ B) / p(A) = 0.20 / 0.25 = 0.80

2. If the probability that a research project will be well planned is 0.60 and the probability that it
will be well planned and well executed is 0.54, what is the probability that it will be well
executed given that it is well planned?
Solution:
Let A = the event that a research project will be well planned
B = the event that a research project will be well executed
Given: p(A) = 0.60, p(A ∩ B) = 0.54
Required: p(B/A)
p(B/A) = p(A ∩ B) / p(A) = 0.54 / 0.60 = 0.9
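Both worked examples apply the same formula, p(B/A) = p(A ∩ B) / p(A). A small Python sketch of that calculation follows; the function name conditional_probability is an illustrative assumption, and the zero check mirrors the requirement that the conditioning event has non-zero probability.

```python
import math


def conditional_probability(p_joint, p_condition):
    """Return p(B/A) = p(A and B) / p(A); p(A) must be non-zero."""
    if p_condition == 0:
        raise ValueError("the conditioning event must have non-zero probability")
    return p_joint / p_condition


p_graduate_given_scholarship = conditional_probability(0.20, 0.25)
p_executed_given_planned = conditional_probability(0.54, 0.60)

print(round(p_graduate_given_scholarship, 2))
print(round(p_executed_given_planned, 2))
print(math.isclose(p_graduate_given_scholarship, 0.80))
```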

Remark: 1. p(A′/B) = 1 − p(A/B)

2. p(B′/A) = 1 − p(B/A)

Note: for any two events A and B the following relation holds.
p(B) = p(B/A) × p(A) + p(B/A′) × p(A′)

4.2. Probability of Independent Events
Two events A and B are said to be independent if the occurrence of event A in a
probability experiment does not affect the probability of event B. That is, if we know that A has
occurred, then B still occurs with its usual probability. Similarly, if B has occurred, then A
occurs with its usual probability. In other words, events A and B are independent if
the conditional probability of A given B is the same as the unconditional probability of A, i.e.
P(A/B) = P(A) (B does not affect event A), and in the same way if the conditional probability of B
given A is the same as the unconditional probability of B, i.e. P(B/A) = P(B) (A does not affect
event B). Otherwise, the events are dependent. This leads to a useful formula, which is also our
definition of independence.
P(A ∩ B) = P(A) × P(B) if and only if A and B are independent events.
Here p(A/B) = p(A) and p(B/A) = p(B).
Example: A box contains four black and six white balls. What is the probability of getting two
black balls in drawing one after the other under the following conditions?
a. The first ball drawn is not replaced
b. The first ball drawn is replaced
Solution: Let A = the first ball drawn is black
B = the second ball drawn is black
Required: p(A ∩ B)
a. p(A ∩ B) = p(B/A) × p(A) = (3/9) × (4/10) = 2/15
b. p(A ∩ B) = p(A) × p(B) = (4/10) × (4/10) = 4/25
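The two cases differ only in whether the second draw sees a changed box. Using exact fractions from Python's standard fractions module, the calculation can be sketched as follows (the variable names are illustrative):

```python
from fractions import Fraction

black, white = 4, 6
total = black + white

# a) Without replacement: after one black ball is removed, 3 black remain among 9.
p_without_replacement = Fraction(black, total) * Fraction(black - 1, total - 1)

# b) With replacement: the box is unchanged, so the draws are independent.
p_with_replacement = Fraction(black, total) * Fraction(black, total)

print("Without replacement:", p_without_replacement)
print("With replacement:", p_with_replacement)
```

Fractions avoid rounding, so the results can be compared directly with the hand calculation.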

4.3. Review exercises
1. A coin is flipped 3 times. Each of the 8 outcomes is equally likely.
Let A: heads occur on each of the first 2 flips,
B: a tail occurs on the third flip, and
C: exactly 2 tails occur in the 3 flips.
Show that A and B are independent, but B and C are dependent.
2. Let A and B be independent events with P(A) = 0.3 and P(B) = 0.4. What is P(A or
B)?
3. Suppose that a fair die is tossed 2 times. Let
A: be the event that the first toss shows an even number, and
B: be the event that the second toss shows a 5 or 6.
Are events A and B independent?
4. Suppose that A and B are independent events associated with an experiment. If the
probability that A or B occurs equal to 0, while the probability that A occurs equals 0.4,
determine the probability that B occurs?

Chapter five
5. ONE- DIMENSIONAL RANDOM VARIABLE

Learning Objectives:

At the end of this chapter, you will be able to:

 Define the term random variable
 Differentiate a discrete and continuous random variables
 Discuss probability distribution
 Describe the characteristics of probability mass function and probability
density function.
 Find the probability mass function and probability density function.
 Find the cumulative distribution function (CDF) of a random variable.
5.1. Random Variables
Definition: Let S be the sample space containing all outcomes of a given experiment,
and let X be a real-valued function defined over the sample space; then X is said to be a
random variable. Put differently, a random variable is a variable whose values are determined by
chance, each with some probability.

The type of a random variable is determined by the outcomes of the experiment. If the
outcomes of the experiment are finite or countably infinite, then the random
variable defined over this experiment is a discrete random variable, whereas if the
outcomes cannot be counted but can take any real value
between two defined points, then the random variable defined over this experiment is
a continuous random variable.

Depending on the number of variables that determine the outcome of a given experiment, we can
categorize a random variable as a one-dimensional random variable if the outcome of the
experiment is determined by one variable, a two-dimensional random variable if it
is determined by two variables, and a k-dimensional random variable if it
is determined by k variables. In general, the type of experiment
determines the type of random variable, because the random variable is defined over that
experiment.

5.1.1. Discrete random variables
A discrete random variable arises in situations where the possible outcomes are
finite or countably infinite. In other words, it is a random variable that assumes only certain
clearly separated values, such as whole numbers.

Example: Toss a coin 3 times, then

S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Let the variable of interest, X, be the number of heads observed. Then the relevant events
would be
{X = 0} = {TTT}
{X = 1} = {HTT, THT, TTH}
{X = 2} = {HHT, HTH, THH}
{X = 3} = {HHH}.
The relevant question is to find the probability of each of these events.
Note that X takes integer values even though the sample space consists of H’s and T’s. The
variable X transforms the problem of calculating probabilities from one of set theory to one of calculus.
Definition: A random variable (r.v) is a rule that assigns a numerical value to each possible
outcome of a random experiment.
Interpretation:
- Random: the value of the r.v. is unknown until the outcome is observed.
- Variable: it takes a numerical value.
Probability mass function: The probability mass function of a discrete r.v. X assigns a
probability p(xi) to each possible value xi such that
(i) 0 ≤ p(xi) ≤ 1, and
(ii) ∑ p(xi) = 1, where the summation is over all possible values of X.
Example: A coin is tossed twice and the number x of heads is observed. Find the probability distribution
(probability mass function) for x.
Solution:
S = {HH, HT, TH, TT}, where x is the number of heads.
So the probability distribution is as follows:

No. of heads (x)      0     1     2
Probability p(x)     1/4   1/2   1/4
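A probability mass function of this kind can be built by listing the sample space and counting. The sketch below does this in Python for any number of tosses of a fair coin; the function name heads_pmf is a hypothetical choice, and equally likely outcomes are assumed.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_pmf(n_tosses):
    """Map each possible number of heads to its probability for a fair coin."""
    outcomes = list(product("HT", repeat=n_tosses))  # the sample space S
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


pmf_two_tosses = heads_pmf(2)
for x, p in pmf_two_tosses.items():
    print(x, p)
print("Sum of probabilities:", sum(pmf_two_tosses.values()))
```

Calling heads_pmf(3) gives the distribution asked for in the first exercise below, so it can be used to check a hand calculation.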
Exercise:
1. Construct a probability distribution for the number of heads observed in tossing a coin three times.
2. Construct a probability distribution for the number of girls if a family plans to have three children
3. f(X)= 2/9x where x=1,2,4,5

5.1.2. Continuous random variable

If the outcome of a given experiment is not determined by counting but can be any real number
within a defined region, then the random variable defined on such an
experiment is a continuous random variable.
Example: The resistance of a certain cable during a certain time interval, or the age of
randomly selected students lying between certain years. In these cases the random
variable can assume any real value within the defined interval.
Probability density function: Let X be a continuous random variable. Then f(x) is said to be a
probability density function for the random variable X if it satisfies the following conditions.
1. f(x) ≥ 0 for every value of x in the defined region
2. ∫_(−∞)^(∞) f(x) dx = 1
3. p(a < X < b) = ∫_a^b f(x) dx, where −∞ < a < b < ∞
Example 1: The following function is a probability density function
a) f(x) = 3x²,  0 < x < 1
          0,    otherwise
because f(x) ≥ 0 for every x and ∫_0^1 3x² dx = x³ |_0^1 = 1.
Exercise 1: For each of the following functions, identify whether it is a pdf or not.
b) f(x) = e^(−x),  x > 0
          0,       otherwise
c) f(x) = 2x,  0 < x < 1
          0,   otherwise
Example 2: For the following probability density function, find
f(x) = 3x²,  0 < x < 1
       0,    otherwise
a) p(0 < x < 1/2)
Solution: p(0 < x < 1/2) = ∫_0^(1/2) 3x² dx = x³ |_0^(1/2) = (1/2)³ = 0.125
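For f(x) = 3x² on 0 < x < 1, the antiderivative is x³, so the total area is 1 and p(0 < x < 1/2) = (1/2)³ = 0.125. These values can be checked numerically. The sketch below uses a simple midpoint rule written in plain Python; the helper name integrate and the number of steps are assumptions for illustration, not a prescribed method.

```python
def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f from a to b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def f(x):
    """The pdf from the example: 3x^2 on (0, 1), zero elsewhere."""
    return 3 * x**2 if 0 < x < 1 else 0.0


total_area = integrate(f, 0, 1)
p_lower_half = integrate(f, 0, 0.5)

print("Total area:", round(total_area, 6))
print("p(0 < x < 1/2):", round(p_lower_half, 6))
```

The same helper can be applied to the functions in Exercise 1 to see whether their total area is 1, provided a finite upper limit is chosen that is large enough for the tail to be negligible.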
Exercise 2: The length X (in inches) of a steel rod is assumed to be a random variable with
probability density function given by

3 − ,0 < <2
fX(x) =
0, otherwise
A rod is considered defective if its length is less than 1 inch. Find the probability that a
steel rod is defective.
5.2. Cumulative distribution function
Definition: Let X be a random variable, continuous or discrete. The cumulative distribution function
of X is defined by F(x) = P(X ≤ x). The way we find the CDF differs for the two types of random
variable.
 If X is a discrete random variable, F(x) is given by F(x) = P(X ≤ x) = ∑ p(xi), summed over all xi ≤ x.
 If X is a continuous random variable, F(x) is given by F(x) = P(X ≤ x) = ∫_(−∞)^x f(t) dt, where f(t)
is the probability density function of the random variable.
Example 1: A student is given three true/false questions but does not know the answer to any of
them, so each question is answered at random. Let X represent the number of questions the student
answers correctly. Construct the cumulative distribution of X.
Solution:
Here X takes the values 0, 1, 2, 3. To find the CDF at each value we first need the
probability mass function.
xi       0     1     2     3
P(xi)   1/8   3/8   3/8   1/8

F(0)= P(X≤ 0)= P(X=0)=1/8

F(1)= P(X≤ 1)= P(X= 0)+ P(X= 1)=1/8+3/8=1/2
F(2)= P(X≤ 2)= P(X= 0)+ P(X= 1)+P(X= 2)= 1/8+3/8+3/8=7/8
F(3)= P(X≤ 3)= P(X= 0) + P(X= 1)+P(X= 2) +P(X= 3)= 1/8+3/8+3/8+1/8=1
Finally the CDF becomes
       0,    x < 0
       1/8,  0 ≤ x < 1
F(x) = 4/8,  1 ≤ x < 2
       7/8,  2 ≤ x < 3
       1,    x ≥ 3
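For a discrete random variable the CDF is just a running total of the probability mass function. A minimal Python sketch for the distribution with probabilities 1/8, 3/8, 3/8, 1/8 on the values 0, 1, 2, 3 is given below; the names values, pmf and cdf are illustrative assumptions.

```python
from fractions import Fraction
from itertools import accumulate

values = [0, 1, 2, 3]
pmf = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]
cumulative = list(accumulate(pmf))  # running totals: 1/8, 1/2, 7/8, 1


def cdf(x):
    """F(x) = P(X <= x): the total probability of all values not exceeding x."""
    result = Fraction(0)
    for value, total in zip(values, cumulative):
        if x >= value:
            result = total
    return result


for x in (-1, 0, 0.5, 1, 2, 3, 10):
    print(x, cdf(x))
```

The printed values stay constant between the jump points 0, 1, 2 and 3, which is the step shape of a discrete CDF.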

Example: Assume that the resistance of a certain cable is a continuous random variable with the
following pdf:
f(x) = 3x²,  0 < x < 1
       0,    otherwise
Compute the CDF of the resistance of the cable.
Solution:
F(x) = P(X ≤ x) = ∫_(−∞)^x f(t) dt, where f(t) is the pdf f(x) written in terms of t.
For 0 < x < 1:
F(x) = ∫_0^x 3t² dt
     = x³
Then the general form of the CDF is given by

       0,   x ≤ 0
F(x) = x³,  0 < x < 1
       1,   x ≥ 1

The general properties of the CDF

 F(x) is a non-decreasing function.
 F(x) is right-continuous.
 F(x) approaches 0 as x goes to −∞ (to the left) and approaches 1 as x goes to +∞.

5.3. Uniformly distributed random variables

Definition: Suppose that x is a continuous random variable assuming all values in the interval
[a, b], where a and b are finite. If the pdf of x is given by f(x) = 1 / (b − a), a ≤ x ≤ b, and 0
elsewhere, we say that x is uniformly distributed over the interval [a, b].

Example: A point is chosen at random on the number line segment [0, 2]. What is the probability that
the chosen point lies between 1 and 3/2?
Let x represent the coordinate of the chosen point. The pdf of x is given by f(x) = 1/2 for
0 ≤ x ≤ 2, and 0 otherwise. Hence p(1 ≤ x ≤ 3/2) = ∫_1^(3/2) (1/2) dx = 1/4.
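Because the uniform pdf is constant, the probability of a sub-interval is its length divided by the length of the whole interval. The sketch below expresses this in Python; the function name uniform_probability and its argument names are hypothetical, and the query interval is clipped to [a, b] since the pdf is zero outside it.

```python
def uniform_probability(a, b, lower, upper):
    """P(lower <= X <= upper) for X uniformly distributed over [a, b]."""
    if not a < b:
        raise ValueError("the interval [a, b] must have positive length")
    low = max(a, lower)   # clip the query interval to [a, b]
    high = min(b, upper)
    return max(high - low, 0) / (b - a)


p_between = uniform_probability(0, 2, 1, 1.5)
print("p(1 <= x <= 3/2) =", p_between)
```

For the point chosen on [0, 2], the interval from 1 to 3/2 has length 1/2, so the probability is (1/2) / 2 = 1/4.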

5.4.Review Exercises
1. Let X be a continuous random variable with pdf
f(x) =    ,  0 ≤ x ≤ 1
       0,  otherwise
where c is a constant.
a. Determine the value of the constant c.
b. Find the cumulative distribution function of the random variable X, that is,
F(x).
2. The length of time (in minutes) y that a certain woman speaks on a mobile telephone is a
random variable with probability density function (pdf)
f(y) =    ,  y ≥ 0
       0,  otherwise
Determine
a. The value of constant c.
b. The cumulative distribution function (CDF)
c. What is the probability that the duration of her conversation will be between 10 and 15
minutes?
3. For the random variable X with probability mass function P(x)= aX,
X=1,2,3,4 and Y=2X+1. Find
a) The value of a.
b) CDF of Y.
4. From the following joint pmf

PX,Y(x, y)     Y = 0    Y = 1    Y = 2
X = 1          0.1      0.1      0.1
X = 2          0.1      0.2      0.05
X = 3          0.2      0.1      0.05

a) Calculate PX(x).
b) Find PY(y).

CHAPTER SIX

6. TWO- DIMENSIONAL RANDOM VARIABLES

Learning Objectives:
At the end of this chapter, you will be able to:
 Distinguish the difference between one dimensional random variable and
two dimensional random variables.
 Discuss the joint cumulative distribution functions.
 Distinguish marginal probability mass functions and marginal probability
density functions.
 Understand the properties of discrete and continuous distributions for two
and above dimensional random variables.
 Discuss about conditional distributions for discrete and continuous random
variables.
 Define independence of random variables.
 Test independence of random variables for discrete and continuous case.

6.1. Basic Concepts

Often, a single random variable cannot adequately provide all of the information needed about
the outcome of an experiment. For example, tomorrow’s weather is really best described by an
array of random variables that includes wind speed, wind direction, atmospheric pressure,
relative humidity and temperature. It would be neither easy nor desirable to attempt to combine
all of this information into a single measurement.
We would like to extend the notion of a random variable to deal with an experiment that results in
several observations each time the experiment is run. For example, let T be a random variable
representing tomorrow’s maximum temperature and let R be a random variable representing
tomorrow’s total rainfall. It would be reasonable to ask for the probability that tomorrow’s
temperature is greater than 70℃ and tomorrow’s total rainfall is less than 0.1 inch. In other
words, we wish to determine the probability of the event
A = {T > 70; R < 0.1}
Another question that we might like to have answered is, “What is the probability that the
temperature will be greater than 70℃ regardless of the rainfall?” To answer this question, we
would need to compute the probability of the event B = {T > 70}
In this chapter, we will build on our probability model and extend our definition of a random
variable to permit such calculations.
Definition 6.1. Let Ω be a sample space. An n dimensional random variable or random
vector, $(X_1(\cdot), X_2(\cdot), \ldots, X_n(\cdot))$, is a vector of functions that assigns to each point $s \in \Omega$ a point
$(X_1(s), X_2(s), \ldots, X_n(s))$ in n dimensional Euclidean space.

Example: Consider an experiment where a die is rolled twice. Let X1 denote the number of the
first roll, and X2 the number of the second roll. Then (X1, X2) is a two dimensional random
vector.
6.2. Joint Distributions
Now that we know the definition of a random vector, we can begin to use it to assign
probabilities to events. For any random vector, we can define a joint cumulative distribution
function for all of the components as follows:
Definition 6.2. Let $(X_1, X_2, \ldots, X_n)$ be a random vector. The joint cumulative distribution function for
this random vector is given by
$$F_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \ldots, X_n \le x_n).$$
In the two dimensional case, the joint cumulative distribution function for the random vector
$(X_1, X_2)$ evaluated at the point $(x_1, x_2)$, namely $F_{X_1,X_2}(x_1, x_2)$, is the probability that the
experiment results in a two dimensional value in the region $\{X_1 \le x_1,\ X_2 \le x_2\}$.
Every joint cumulative distribution function must possess the following properties:

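1. $\lim_{x_i \to -\infty} F_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n) = 0$ for each i.
2. $\lim_{x_1 \to \infty, \ldots, x_n \to \infty} F_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n) = 1$.
3. As $x_i$ varies, with all other $x_j$ ($j \ne i$) fixed, $F_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n)$ is a non
decreasing function of $x_i$.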
As in the case of one dimensional random variables, we shall identify two major classifications
of vector valued random variables: discrete and continuous. Although there are many common
properties between these two types, we shall discuss each separately.
6.2.1. Discrete Distributions
A random vector that can assume at most a countable collection of discrete values is said to
be discrete. As an example, consider once more the experiment where a die is rolled twice. The possible values for
either X1 or X2 are in the set {1, 2, 3, 4, 5, 6}. Hence, the random vector (X1, X2) can only take
on one of 36 values. If the die is fair, then each of the points can be considered to have a
probability mass of 1/36. This prompts us to define a joint probability mass function for this type
of random vector, as follows:
Definition 6.3. Let $(X_1, X_2, \ldots, X_n)$ be a discrete random vector. Then
$$p_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$$
is the joint probability mass function for the random vector $(X_1, X_2, \ldots, X_n)$.
Referring again to the example, we find that the joint probability mass function for $(X_1, X_2)$ is given by
$p_{X_1,X_2}(x_1, x_2) = 1/36$ for $x_1 = 1, 2, \ldots, 6$ and $x_2 = 1, 2, \ldots, 6$.
Note that for any joint probability mass function,
$$F_{X_1,X_2,\ldots,X_n}(b_1, b_2, \ldots, b_n) = \sum_{x_1 \le b_1} \sum_{x_2 \le b_2} \cdots \sum_{x_n \le b_n} p_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n).$$
Therefore, if we wished to evaluate $F_{X_1,X_2}(3, 4.5)$ we would sum all of the probability mass in
the specified region. There are 3 × 4 = 12 such points, so $F_{X_1,X_2}(3, 4.5) = 12 \times 1/36 = 1/3$.
This is the probability that the first roll is less than or equal to 3 and the second roll is less than
or equal to 4.5.
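The summation above can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original notes; the names `pmf` and `joint_cdf` are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Joint pmf of two fair die rolls: every pair (x1, x2) has mass 1/36.
pmf = {(x1, x2): Fraction(1, 36) for x1, x2 in product(range(1, 7), repeat=2)}

def joint_cdf(b1, b2):
    """Sum the probability mass over all points with x1 <= b1 and x2 <= b2."""
    return sum(p for (x1, x2), p in pmf.items() if x1 <= b1 and x2 <= b2)

cdf_value = joint_cdf(3, 4.5)
print(cdf_value)  # 1/3
```

Exact fractions are used so that the result can be compared directly with the hand calculation.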
Every joint probability mass function must have the following properties:
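1. $p_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n) \ge 0$ for any $(x_1, x_2, \ldots, x_n)$.
2. $\sum_{\text{all } x_1} \sum_{\text{all } x_2} \cdots \sum_{\text{all } x_n} p_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n) = 1$.
3. $P(E) = \sum_{(x_1, x_2, \ldots, x_n) \in E} p_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n)$ for any event E.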

You should compare these properties with those of probability mass functions for single valued
discrete random variables.
6.2.2. Continuous Distributions
Definition 6.4. Let $(X_1, X_2, \ldots, X_n)$ be a continuous random vector with joint cumulative
distribution function $F_{X_1,\ldots,X_n}(x_1, \ldots, x_n)$. The function $f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)$ that satisfies the
equation
$$F_{X_1,\ldots,X_n}(b_1, \ldots, b_n) = \int_{-\infty}^{b_1} \int_{-\infty}^{b_2} \cdots \int_{-\infty}^{b_n} f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)\,dx_n \cdots dx_2\,dx_1$$
for all $(b_1, \ldots, b_n)$ is called the joint probability density function for the random vector $(X_1, \ldots, X_n)$.
Every joint probability density function must have the following properties:
1. $f_{X_1,\ldots,X_n}(x_1, \ldots, x_n) \ge 0$ for any $(x_1, \ldots, x_n)$.
2. $\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n = 1$.
3. $P(E) = \int \cdots \int_{E} f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n$ for any event E.
You should compare these properties with those of probability density functions for single
valued continuous random variables.
In the one dimensional case, we had the handy formula $P(a < X \le b) = F_X(b) - F_X(a)$. This
worked for any type of probability distribution.

Example: Let (X, Y) be a two dimensional random variable with the following joint probability
density function:
$$f_{X,Y}(x, y) = \begin{cases} 2 - y, & 0 \le x \le 2,\ 1 \le y \le 2 \\ 0, & \text{elsewhere} \end{cases}$$
Note that $\int_{0}^{2} \int_{1}^{2} (2 - y)\,dy\,dx = 1$.
Suppose we would like to compute P(X ≤ 1, Y ≤ 1.5). To do this, we calculate the volume under
the surface $f_{X,Y}(x, y)$ over the region {(x, y) : x ≤ 1, y ≤ 1.5}.
Performing the integration, we get
$$P(X \le 1, Y \le 1.5) = \int_{-\infty}^{1} \int_{-\infty}^{1.5} f_{X,Y}(x, y)\,dy\,dx = \int_{0}^{1} \int_{1}^{1.5} (2 - y)\,dy\,dx = \frac{3}{8}.$$
Exercise: What is P(X < 0.5, Y < 2)?
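As a rough numerical check of this example, the double integral can be approximated with a midpoint rule. This is only a sketch: it assumes the density 2 − y is supported on 0 ≤ x ≤ 2, 1 ≤ y ≤ 2 (the support on which it integrates to 1), and the helper names `density` and `integrate` are hypothetical.

```python
def density(x, y):
    """Joint pdf 2 - y on the assumed support 0 <= x <= 2, 1 <= y <= 2."""
    return 2.0 - y if (0.0 <= x <= 2.0 and 1.0 <= y <= 2.0) else 0.0

def integrate(f, x_lo, x_hi, y_lo, y_hi, n=200):
    """Midpoint-rule approximation of a double integral over a rectangle."""
    dx = (x_hi - x_lo) / n
    dy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx
        for j in range(n):
            y = y_lo + (j + 0.5) * dy
            total += f(x, y)
    return total * dx * dy

total_mass = integrate(density, 0.0, 2.0, 1.0, 2.0)  # should be close to 1
prob = integrate(density, 0.0, 1.0, 1.0, 1.5)        # should be close to 3/8
print(round(total_mass, 6), round(prob, 6))
```

Because the integrand is linear in y, the midpoint rule is essentially exact here; for a general density the grid size n would need to be chosen with more care.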

6.3. Marginal Distributions

Given the probability distribution for a vector valued random variable (X1,. . . Xn), we might ask
the question; “Can we determine the distribution of X1, disregarding the other components?” The
answer is yes, and the solution requires the careful use of English rather than mathematics. For
example, in the two dimensional case, we may be given a random vector (X,Y ) with joint
cumulative distribution function FX,Y (x, y). Suppose we would like to find the cumulative
distribution function for X alone, i.e., FX(x)? We know that
$F_{X,Y}(x, y) = P(X \le x, Y \le y)$, and we are asking for
(1) $F_X(x) = P(X \le x)$.
But in terms of both X and Y, expression (1) can be read: "the probability that X takes on a value
less than or equal to x and Y takes on any value." Therefore, it would make sense to say
$$F_X(x) = P(X \le x) = P(X \le x, Y \le \infty) = \lim_{y \to \infty} F_{X,Y}(x, y).$$

Definition 6.5: Let $(X_1, \ldots, X_n)$ be a random vector with joint cumulative distribution function
$F_{X_1,\ldots,X_n}(x_1, \ldots, x_n)$. The marginal cumulative distribution function for $X_1$ is given by
$$F_{X_1}(x_1) = \lim_{x_2 \to \infty} \lim_{x_3 \to \infty} \cdots \lim_{x_n \to \infty} F_{X_1,\ldots,X_n}(x_1, x_2, \ldots, x_n).$$
Notice that we can renumber the components of the random vector and call any one of them X1.
So we can use the above definition to find the marginal cumulative distribution function for any
of the Xi's.
Although Definition 6.5 is a nice definition, it is more useful to examine marginal probability
mass functions and marginal probability density functions. For example, suppose we have a
discrete random vector (X, Y) with joint probability mass function $p_{X,Y}(x, y)$. To find $p_X(x)$, we
ask "What is the probability that X = x regardless of the value that Y takes on?" This can be
written as
$$p_X(x) = P(X = x) = P(X = x,\ Y = \text{any value}) = \sum_{\text{all } y} p_{X,Y}(x, y).$$

Example: In the die example, $p_{X_1,X_2}(x_1, x_2) = 1/36$ for $x_1 = 1, 2, \ldots, 6$ and $x_2 = 1, 2, \ldots, 6$.
To find $p_{X_1}(2)$, for example, we compute
$$p_{X_1}(2) = P(X_1 = 2) = \sum_{k=1}^{6} p_{X_1,X_2}(2, k) = \frac{1}{6}.$$

Example: Table 5.1: Joint pmf for daily production

pX,Y(x, y)    y = 1    y = 2    y = 3    y = 4    y = 5
x = 1         0.05     0        0        0        0
x = 2         0.15     0.10     0        0        0
x = 3         0.05     0.05     0.10     0        0
x = 4         0.05     0.025    0.025    0        0
x = 5         0.10     0.10     0.10     0.10     0

Let X be the total number of items produced in a day’s work at a factory, and let Y be the
number of defective items produced. Suppose that the probability mass function for (X,Y ) is
given by Table 5.1. Using this joint distribution, we can see that the probability of producing 2
items with exactly 1 of those items being defective is
PX,Y (2, 1) = 0.15

To find the marginal probability mass function for the total daily production, X, we sum the
probabilities over all possible values of Y for each fixed x:
$p_X(1) = p_{X,Y}(1, 1) = 0.05$
$p_X(2) = p_{X,Y}(2, 1) + p_{X,Y}(2, 2) = 0.15 + 0.10 = 0.25$
$p_X(3) = p_{X,Y}(3, 1) + p_{X,Y}(3, 2) + p_{X,Y}(3, 3) = 0.05 + 0.05 + 0.10 = 0.20$
etc.
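The remaining marginal values follow the same pattern. A minimal sketch of the row and column sums, assuming the entry for (x, y) = (1, 1) is 0.05 as used in the calculation of the marginal for x = 1; the dictionary layout and the function name `marginal` are illustrative:

```python
# Joint pmf from Table 5.1, keyed by (x, y); zero entries are omitted.
# Assumption: p(1, 1) = 0.05, the value used in the text for the marginal
# at x = 1, which also makes the table sum to 1.
joint = {
    (1, 1): 0.05,
    (2, 1): 0.15, (2, 2): 0.10,
    (3, 1): 0.05, (3, 2): 0.05, (3, 3): 0.10,
    (4, 1): 0.05, (4, 2): 0.025, (4, 3): 0.025,
    (5, 1): 0.10, (5, 2): 0.10, (5, 3): 0.10, (5, 4): 0.10,
}

def marginal(joint_pmf, axis):
    """Sum the joint pmf over the other coordinate (axis 0 = X, axis 1 = Y)."""
    out = {}
    for key, p in joint_pmf.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

p_x = marginal(joint, 0)  # total daily production
p_y = marginal(joint, 1)  # number of defective items
for x in sorted(p_x):
    print(x, round(p_x[x], 3))
```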
The procedure is similar for obtaining marginal probability density functions. Recall that a
density $f_X(x)$ is not itself a probability, but $f_X(x)\,dx$ is. So, speaking a little loosely in
integration notation, we should be able to compute
$$f_X(x)\,dx = P(x \le X < x + dx) = P(x \le X < x + dx,\ Y = \text{any value}) = \left[ \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy \right] dx,$$
where y is the variable of integration in the above integral. Looking at this relationship, it would
seem reasonable to define
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy.$$
Definition 6.6. Let $(X_1, \ldots, X_n)$ be a continuous random vector with joint probability density
function $f_{X_1,\ldots,X_n}$. The marginal probability density function for the random variable $X_1$ is
given by
$$f_{X_1}(x_1) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)\,dx_2 \cdots dx_n.$$
Notice that in both the discrete and continuous cases, we sum (or integrate) over all possible
values of the unwanted random variable components.
Example: Let (X, Y) be a two dimensional continuous random variable with joint probability
density function
$$f_{X,Y}(x, y) = \begin{cases} x + y, & 0 \le x \le 1 \text{ and } 0 \le y \le 1 \\ 0, & \text{elsewhere} \end{cases}$$
Find the marginal probability density functions for X and Y.
Solution: Let's consider the case where we fix x so that 0 ≤ x ≤ 1. We compute
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy = \int_{0}^{1} (x + y)\,dy = \left[ xy + \frac{y^2}{2} \right]_{y=0}^{y=1} = x + \frac{1}{2}.$$
Therefore, $f_X(x) = x + \frac{1}{2}$ for 0 ≤ x ≤ 1, and 0 elsewhere.

Exercise: find the marginal distribution for y.

6.4. Conditional distributions

Case I: Discrete
Let X and Y be random variables with joint probability mass function $p_{X,Y}(x, y)$
and let $p_Y(y)$ be the marginal probability mass function for Y.
We define the conditional probability mass function of X given Y as
$$p_{X|Y}(x \mid y) = \frac{p_{X,Y}(x, y)}{p_Y(y)} \quad \text{whenever } p_Y(y) > 0.$$
Case II: Continuous
Let X and Y be random variables with joint probability density function $f_{X,Y}(x, y)$ and let $f_Y(y)$
be the marginal probability density function for Y.
We define the conditional probability density function of X given Y as
$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} \quad \text{whenever } f_Y(y) > 0.$$
Example 1: A company produces two types of compressors, grade A and grade B.

Let X denote the number of grade A compressors produced on a given day. Let Y denote the
number of grade B compressors produced on the same day. Suppose that the joint probability
mass function
$p_{X,Y}(x, y) = P(X = x, Y = y)$ is given by the following table:

pX,Y(x, y)    y = 0    y = 1    pX(x)
x = 0         0.1      0.3      0.4
x = 1         0.2      0.1      0.3
x = 2         0.2      0.1      0.3
pY(y)         0.5      0.5      1

Given that no grade B compressors were produced on a given day, what is the probability that 2
grade A compressors were produced?
$$P(X = 2 \mid Y = 0) = \frac{P(X = 2, Y = 0)}{P(Y = 0)} = \frac{0.2}{0.5} = \frac{2}{5}.$$

Example 2: Suppose an electronic circuit contains two transistors. Let X be the time to failure of
transistor 1 and let Y be the time to failure of transistor 2, with joint density
$$f_{X,Y}(x, y) = \begin{cases} 4e^{-2(x+y)}, & x \ge 0,\ y \ge 0 \\ 0, & \text{otherwise.} \end{cases}$$
Given that the total life time for the two transistors is less than two hours, what is
the probability that the first transistor lasted more than one hour?

Solution:
$$P(X > 1 \mid X + Y < 2) = \frac{P(X > 1,\ X + Y < 2)}{P(X + Y < 2)}$$
$$P(X > 1,\ X + Y < 2) = \int_{1}^{2} \int_{0}^{2-x} 4e^{-2(x+y)}\,dy\,dx = e^{-2} - 3e^{-4}$$
$$P(X + Y < 2) = \int_{0}^{2} \int_{0}^{2-x} 4e^{-2(x+y)}\,dy\,dx = 1 - 5e^{-4}$$
$$P(X > 1 \mid X + Y < 2) = \frac{e^{-2} - 3e^{-4}}{1 - 5e^{-4}} \approx 0.0885$$
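The two integrals in this solution can be approximated numerically and compared with the closed form. The sketch below uses a simple midpoint rule over the triangular region; the function names `pdf` and `triangle_prob` are illustrative and not from the notes.

```python
import math

def pdf(x, y):
    """Joint density 4*exp(-2(x + y)) for x, y >= 0."""
    return 4.0 * math.exp(-2.0 * (x + y))

def triangle_prob(x_lo, x_hi, n=300):
    """Midpoint-rule integral of pdf over x_lo < x < x_hi, 0 < y < 2 - x."""
    dx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx
        dy = (2.0 - x) / n          # inner range shrinks as x grows
        for j in range(n):
            y = (j + 0.5) * dy
            total += pdf(x, y) * dx * dy
    return total

numerator = triangle_prob(1.0, 2.0)    # P(X > 1, X + Y < 2)
denominator = triangle_prob(0.0, 2.0)  # P(X + Y < 2)
conditional = numerator / denominator

closed_form = (math.exp(-2) - 3 * math.exp(-4)) / (1 - 5 * math.exp(-4))
print(round(conditional, 4), round(closed_form, 4))
```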

6.5. Independence of Random Variable

Definition 6.7. A sequence of n random variables $X_1, X_2, \ldots, X_n$ is independent if and
only if
$$F_{X_1,X_2,\ldots,X_n}(b_1, b_2, \ldots, b_n) = F_{X_1}(b_1) F_{X_2}(b_2) \cdots F_{X_n}(b_n)$$
for all values $b_1, b_2, \ldots, b_n$.
Recall: An event A is independent of an event B if and only if
P(A∩B) = P(A) P(B).
Theorem 6.1. If X and Y are independent random variables then any event A
involving X alone is independent of any event B involving Y alone.

Testing for independence:

Case I: Discrete
A discrete random variable X is independent of a discrete random variable Y if and
only if
pX,Y (x, y) = [pX(x)][pY (y)] for all possible values of x and y

Case II: Continuous

A continuous random variable X is independent of a continuous random variable
Y if and only if
fX,Y (x, y) = [fX(x)][fY (y)] for all x and y.
Example: Consider the compressor problem again.
pX,Y(x, y)    y = 0    y = 1    pX(x)
x = 0         0.1      0.3      0.4
x = 1         0.2      0.1      0.3
x = 2         0.2      0.1      0.3
pY(y)         0.5      0.5      1

The random variables X and Y are not independent. Note that

$p_{X,Y}(0, 0) = 0.1 \ne p_X(0)\,p_Y(0) = (0.4)(0.5) = 0.2$
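The discrete independence test can be applied to every cell of the table at once. A minimal sketch, with illustrative variable names, that recomputes the marginals and lists the cells where the joint mass differs from the product of the marginals:

```python
# Joint pmf of (X, Y) for the compressor example, keyed by (x, y).
joint = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.1,
    (2, 0): 0.2, (2, 1): 0.1,
}

# Marginals obtained by summing over the other variable.
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

# X and Y are independent only if p(x, y) = p_X(x) * p_Y(y) in every cell.
violations = [(x, y) for (x, y), p in joint.items()
              if abs(p - p_x[x] * p_y[y]) > 1e-9]
independent = not violations
print(independent, violations)
```

A single violating cell is enough to rule out independence, which is why the text only needs to exhibit the cell (0, 0).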

Example: Suppose an electronic circuit contains two transistors. Let X be the time to failure of
transistor 1 and let Y be the time to failure of transistor 2, with joint density
$$f_{X,Y}(x, y) = \begin{cases} 4e^{-2(x+y)}, & x \ge 0,\ y \ge 0 \\ 0, & \text{otherwise.} \end{cases}$$
The marginal densities are:
$$f_X(x) = \begin{cases} 2e^{-2x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases} \qquad f_Y(y) = \begin{cases} 2e^{-2y}, & y \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
We must check the probability density functions for (X, Y), X and Y for all values
of (x, y).
For x ≥ 0 and y ≥ 0: $f_{X,Y}(x, y) = 4e^{-2(x+y)} = (2e^{-2x})(2e^{-2y}) = f_X(x) f_Y(y)$
For x ≥ 0 and y < 0: $f_{X,Y}(x, y) = 0 = (2e^{-2x})(0) = f_X(x) f_Y(y)$
For x < 0 and y ≥ 0: $f_{X,Y}(x, y) = 0 = (0)(2e^{-2y}) = f_X(x) f_Y(y)$
For x < 0 and y < 0: $f_{X,Y}(x, y) = 0 = (0)(0) = f_X(x) f_Y(y)$
So the random variables X and Y are independent.

6.6. Review Exercise:

1. A candy company distributes boxes of chocolates with a mixture of creams, coffees, and nuts
coated in both light and dark chocolate. For a randomly selected box, let X and Y, respectively, be
the proportions of the light and dark chocolates that are creams, and suppose that the joint density
function is
$$f(x, y) = \begin{cases} (2x + 3y), & 0 < x < 1,\ 0 < y < 1 \\ 0, & \text{otherwise} \end{cases}$$
a. Verify whether or not f(x, y) is a joint probability density function of X and Y.
b. Find P(X < 1/2, 1/4 < Y < 1/2).
c. Find the marginal probability densities of X and Y.

2. Two electronic components of a missile system work in harmony for success of the total system.
Let X and Y denote the life in hours of the two components. The joint density of X and Y is

( )
, ≥ 0
( , ) =
0, ℎ .
a. Find the marginal density functions for both random variables.
b. What is the probability that both components will exceed 2 hours?

Chapter Seven

7. Expectation

Learning outcomes

At the end of the chapter students are expected to:

• know what is meant by expectation, variance and correlation;
• be familiar with functions of random variables and derive their probability distributions;
• compute the expectation, variance and correlation of random variables.

7.1. Expectation of a random variable

The expected value of a random variable X is the average value of the random variable over an
infinite number of repetitions of the experiment (repeated samples); it is denoted E[X].
Expectation of a Discrete Random Variable:

If X is a discrete random variable which can take the values $x_1, x_2, \ldots, x_n$ with probability
mass values $f(x_1), f(x_2), \ldots, f(x_n)$, the expected value of X is
$$E(X) = \mu = \sum_{i} x_i f(x_i) = x_1 f(x_1) + x_2 f(x_2) + \cdots + x_n f(x_n).$$

Example: Consider one roll of a die. Let X be the number that turns up. To find E(X), we
must get the probability distribution of X.

Solution:

x       f(x)
1       1/6
2       1/6
3       1/6
4       1/6
5       1/6
6       1/6

$$\mu = E(X) = 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + 3\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 5\left(\tfrac{1}{6}\right) + 6\left(\tfrac{1}{6}\right) = \frac{7}{2}$$

Expectations of Continuous Random Variables

Let X be a continuous random variable taking values in [a, b] with probability
density function f(x). Then the expected value of X is
$$E(X) = \mu = \int_{a}^{b} x f(x)\,dx.$$

Example: The probability density function for a continuous random variable X is
$$f(x) = \begin{cases} \dfrac{x + 2}{18}, & -2 \le x \le 4 \\ 0, & \text{otherwise} \end{cases}$$
Find E[X].

Solution:
$$E(X) = \int_{-2}^{4} x\,\frac{x + 2}{18}\,dx = \int_{-2}^{4} \left( \frac{x^2}{18} + \frac{x}{9} \right) dx = \left[ \frac{x^3}{54} + \frac{x^2}{18} \right]_{-2}^{4} = 2$$

Exercise

1. X is a continuous random variable with pdf f(x) = 2x, 0 ≤ x ≤ 1. Find E(X).

2. Suppose a coin is tossed 5 times. Find the average and variance of the number of heads.
3. Find the average number of girls if a family plans to have four children.

7.1.1. Expectation of a Function of a Random Variable

If X is a discrete random variable and g(X) is a function of it, then
$$E[g(X)] = \sum_{x} g(x) f(x).$$
If X is a continuous random variable taking values in [a, b], then
$$E[g(X)] = \int_{a}^{b} g(x) f(x)\,dx.$$

Let X and Y be random variables with joint probability distribution f(x, y). The mean or
expected value of the random variable g(X, Y) is
$$E[g(X, Y)] = \sum_{x} \sum_{y} g(x, y) f(x, y), \quad \text{if X and Y are discrete}$$
$$E[g(X, Y)] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x, y) f(x, y)\,dx\,dy, \quad \text{if X and Y are continuous}$$

Example: Let X and Y be random variables with the joint probability distribution indicated in the
table. Find the expected value of g(X, Y) = XY.

f(x, y)          x = 0    x = 1    x = 2    Row totals
y = 0            3/28     9/28     3/28     15/28
y = 1            3/14     3/14     0        3/7
y = 2            1/28     0        0        1/28
Column totals    5/14     15/28    3/28

Solution:
$$E(XY) = \sum_{x} \sum_{y} x y f(x, y)$$
$$= (0)(0)f(0,0) + (0)(1)f(0,1) + (0)(2)f(0,2) + (1)(0)f(1,0) + (1)(1)f(1,1) + (1)(2)f(1,2) + (2)(0)f(2,0) + (2)(1)f(2,1) + (2)(2)f(2,2)$$
$$= f(1, 1) = \frac{3}{14}$$
Every other term vanishes because either xy = 0 or f(x, y) = 0.
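The double sum can also be evaluated mechanically. A small sketch using exact fractions, with the table stored as a dictionary keyed by (x, y); the name `expect` is a hypothetical helper, not something defined in the notes:

```python
from fractions import Fraction as Fr

# Nonzero entries of the joint distribution f(x, y) from the table.
f = {
    (0, 0): Fr(3, 28), (1, 0): Fr(9, 28), (2, 0): Fr(3, 28),
    (0, 1): Fr(3, 14), (1, 1): Fr(3, 14),
    (0, 2): Fr(1, 28),
}

def expect(g):
    """E[g(X, Y)] = sum over all (x, y) of g(x, y) * f(x, y)."""
    return sum(g(x, y) * p for (x, y), p in f.items())

e_xy = expect(lambda x, y: x * y)
e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
print(e_xy, e_x, e_y)  # 3/14 3/4 1/2
```

The same helper gives E(X) and E(Y) by choosing g(x, y) = x or g(x, y) = y, which matches the marginal-based formulas.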
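Example: The joint pdf of two random variables X and Y is given by
$$f_{X,Y}(x, y) = \begin{cases} \frac{1}{4} x y, & 0 \le x \le 2,\ 0 \le y \le 2 \\ 0, & \text{otherwise} \end{cases}$$
Find the joint expectation of $g(X, Y) = X^2 Y$.

$$E[g(X, Y)] = E[X^2 Y] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f_{X,Y}(x, y)\,dx\,dy$$
$$= \int_{0}^{2} \int_{0}^{2} x^2 y \cdot \frac{1}{4} x y\,dx\,dy = \frac{1}{4} \int_{0}^{2} x^3\,dx \int_{0}^{2} y^2\,dy$$
$$= \frac{1}{4} \times \frac{2^4}{4} \times \frac{2^3}{3} = \frac{8}{3}$$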

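Note:

If $g(X, Y) = X$, then
$$E(X) = \begin{cases} \sum_{x} \sum_{y} x f(x, y) = \sum_{x} x\,h_X(x), & \text{discrete case} \\ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x f(x, y)\,dy\,dx = \int_{-\infty}^{\infty} x\,h_X(x)\,dx, & \text{continuous case} \end{cases}$$
where $h_X(x)$ is the marginal distribution of X.

If $g(X, Y) = Y$, then
$$E(Y) = \begin{cases} \sum_{y} \sum_{x} y f(x, y) = \sum_{y} y\,h_Y(y), & \text{discrete case} \\ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y f(x, y)\,dx\,dy = \int_{-\infty}^{\infty} y\,h_Y(y)\,dy, & \text{continuous case} \end{cases}$$
where $h_Y(y)$ is the marginal distribution of Y.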

Properties of expectation
1. If c is a constant, then $E[c] = c$.
2. If c is a constant and X is a random variable, then $E[cX] = cE[X]$.
3. If a and c are constants, then $E[a + cX] = a + cE[X]$.
7.2. Variance of a random variable

Variance of a discrete random variable:

Let $x_1, x_2, \ldots, x_n$ be all the possible values of the discrete random variable X and let f(x) be
the probability distribution. Let $\mu = E(X)$ be the expected value of X. Then the variance of
the discrete random variable X is
$$\mathrm{Var}(X) = \sigma^2 = E\left[ (X - E(X))^2 \right] = \sum_{i} (x_i - \mu)^2 f(x_i)$$
$$= (x_1 - \mu)^2 f(x_1) + (x_2 - \mu)^2 f(x_2) + \cdots + (x_n - \mu)^2 f(x_n)$$

Example: Consider one roll of a die. Let X be the number that turns up. To find V(X), we must get
the expected value of X. This is
$$\mu = E(X) = 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + 3\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 5\left(\tfrac{1}{6}\right) + 6\left(\tfrac{1}{6}\right) = \frac{7}{2}$$
To find the variance of X, we form the new random variable $(X - \mu)^2$ and compute its
expectation. We can easily do this using the following table.

x       f(x)    (x − 7/2)²
1       1/6     25/4
2       1/6     9/4
3       1/6     1/4
4       1/6     1/4
5       1/6     9/4
6       1/6     25/4

From this table we find $E((X - \mu)^2)$ as
$$\mathrm{Var}(X) = \frac{1}{6} \left( \frac{25}{4} + \frac{9}{4} + \frac{1}{4} + \frac{1}{4} + \frac{9}{4} + \frac{25}{4} \right) = \frac{35}{12}$$

Variance of a continuous random variable:

Let X be a continuous random variable taking values in [a, b] with probability
density function f(x). Let $\mu = E(X)$ be the expected value of X. Then the variance of the continuous
random variable X is
$$\mathrm{Var}(X) = \sigma^2 = E\left[ (X - E(X))^2 \right] = \int_{a}^{b} (x - \mu)^2 f(x)\,dx$$

Standard Deviation

The standard deviation of X, denoted by SD(X), is $SD(X) = \sqrt{\mathrm{Var}(X)}$. We often write σ for SD(X)
and σ² for Var(X).

Example 5.1.2: Find the variance of the continuous random variable X with probability
density function
$$f(x) = \begin{cases} \dfrac{x + 2}{18}, & -2 \le x \le 4 \\ 0, & \text{otherwise} \end{cases}$$

Solution:
$$E(X) = \int_{-2}^{4} x\,\frac{x + 2}{18}\,dx = \int_{-2}^{4} \left( \frac{x^2}{18} + \frac{x}{9} \right) dx = \left[ \frac{x^3}{54} + \frac{x^2}{18} \right]_{-2}^{4} = 2$$
Since
$$E(X^2) = \int_{-2}^{4} x^2\,\frac{x + 2}{18}\,dx = \int_{-2}^{4} \left( \frac{x^3}{18} + \frac{x^2}{9} \right) dx = \left[ \frac{x^4}{72} + \frac{x^3}{27} \right]_{-2}^{4} = 6,$$
$$\mathrm{Var}(X) = E\left[ (X - \mu)^2 \right] = E(X^2) - \mu^2 = 6 - 2^2 = 2$$
The standard deviation is $SD(X) = \sqrt{2} \approx 1.414$.
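The moments in this example can be double-checked numerically. The sketch below uses composite Simpson's rule, which is exact for the cubic integrands that arise here; `simpson` and the other names are illustrative helpers rather than anything defined in the notes.

```python
def simpson(f, a, b, n=100):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def pdf(x):
    """Density (x + 2)/18 on -2 <= x <= 4."""
    return (x + 2) / 18

mean = simpson(lambda x: x * pdf(x), -2, 4)
second_moment = simpson(lambda x: x ** 2 * pdf(x), -2, 4)
variance = second_moment - mean ** 2
std_dev = variance ** 0.5
print(round(mean, 6), round(second_moment, 6), round(variance, 6), round(std_dev, 4))
```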

Theorem: If X is any random variable with E(X) = µ, then
$$\mathrm{Var}(X) = E(X^2) - \mu^2 = E(X^2) - [E(X)]^2$$
Proof. We have
$$\mathrm{Var}(X) = E\left( (X - \mu)^2 \right) = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2 = E(X^2) - [E(X)]^2.$$

Using the theorem, we can compute the variance of the outcome of a roll of a die by first
computing
$$E(X^2) = 1\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 9\left(\tfrac{1}{6}\right) + 16\left(\tfrac{1}{6}\right) + 25\left(\tfrac{1}{6}\right) + 36\left(\tfrac{1}{6}\right) = \frac{91}{6}$$
and
$$\mathrm{Var}(X) = E(X^2) - \mu^2 = \frac{91}{6} - \left( \frac{7}{2} \right)^2 = \frac{35}{12},$$
in agreement with the value obtained directly from the definition of Var(X).
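Both routes to the variance of a die roll can be compared in a few lines. A minimal sketch with exact fractions; the variable names are illustrative:

```python
from fractions import Fraction

values = range(1, 7)
p = Fraction(1, 6)  # each face of a fair die is equally likely

mean = sum(x * p for x in values)
var_by_definition = sum((x - mean) ** 2 * p for x in values)   # E[(X - mu)^2]
second_moment = sum(x * x * p for x in values)                 # E[X^2]
var_by_shortcut = second_moment - mean ** 2                    # E[X^2] - mu^2

print(mean, second_moment, var_by_definition, var_by_shortcut)  # 7/2 91/6 35/12 35/12
```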

Properties of Variance

The variance has properties very different from those of the expectation. If c is any
constant, E(cX) = cE(X) and E(X + c) = E(X) + c. These two statements imply that the
expectation is a linear function. However, the variance is not linear, as seen in the next
theorem. If X is any random variable and c is any constant, then

1. $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$, and

2. $\mathrm{Var}(X + c) = \mathrm{Var}(X)$.

The proof is left as an exercise.
7.3. Covariance of random variables

The covariance between the random variables X and Y, denoted as cov(X, Y) or $\sigma_{XY}$, is
$$\sigma_{XY} = E\left[ (X - \mu_X)(Y - \mu_Y) \right] = E(XY) - \mu_X \mu_Y = E(XY) - E(X)E(Y)$$

7.4. Correlation of random variables

The correlation between random variables X and Y, denoted as $\rho_{XY}$, is
$$\rho_{XY} = \frac{\mathrm{cov}(X, Y)}{\sqrt{V(X)\,V(Y)}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$
Because $\sigma_X > 0$ and $\sigma_Y > 0$, if the covariance between X and Y is positive, negative, or
zero, the correlation between X and Y is positive, negative, or zero, respectively. The
following result can be shown.

For any two random variables X and Y,
$$-1 \le \rho_{XY} \le +1$$

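Example: The joint pdf of the random variables X and Y is given by
$$f(x, y) = \begin{cases} \frac{2}{5}(2x + 3y), & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}$$
Find
a. $E(X)$ and $V(X)$
b. $E(Y)$ and $V(Y)$
c. The covariance of X and Y, $\mathrm{cov}(X, Y)$
d. The correlation coefficient between X and Y, $\rho_{XY}$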
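Solution:
a. $E(X) = \int_{0}^{1} x\,h_X(x)\,dx$, where $h_X(x)$ is the marginal density of X:
$$h_X(x) = \int_{0}^{1} \frac{2}{5}(2x + 3y)\,dy = \frac{2}{5} \left[ 2xy + \frac{3y^2}{2} \right]_{y=0}^{y=1} = \frac{2}{5} \left( 2x + \frac{3}{2} \right)$$
$$E(X) = \int_{0}^{1} x \cdot \frac{2}{5} \left( 2x + \frac{3}{2} \right) dx = \frac{2}{5} \left[ \frac{2x^3}{3} + \frac{3x^2}{4} \right]_{0}^{1} = \frac{2}{5} \left( \frac{2}{3} + \frac{3}{4} \right) = \frac{8 + 9}{30} = \frac{17}{30}$$
$$E(X^2) = \int_{0}^{1} x^2 \cdot \frac{2}{5} \left( 2x + \frac{3}{2} \right) dx = \frac{2}{5} \left[ \frac{2x^4}{4} + \frac{3x^3}{6} \right]_{0}^{1} = \frac{2}{5} \left( \frac{2}{4} + \frac{3}{6} \right) = \frac{2}{5}$$
$$V(X) = E(X^2) - [E(X)]^2 = \frac{2}{5} - \left( \frac{17}{30} \right)^2 = \frac{360 - 289}{900} = \frac{71}{900}$$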

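b. $E(Y) = \int_{0}^{1} y\,h_Y(y)\,dy$, where $h_Y(y)$ is the marginal density of Y:
$$h_Y(y) = \int_{0}^{1} \frac{2}{5}(2x + 3y)\,dx = \frac{2}{5} \left[ \frac{2x^2}{2} + 3xy \right]_{x=0}^{x=1} = \frac{2}{5}(1 + 3y)$$
$$E(Y) = \int_{0}^{1} y \cdot \frac{2}{5}(1 + 3y)\,dy = \frac{2}{5} \int_{0}^{1} (y + 3y^2)\,dy = \frac{2}{5} \left[ \frac{y^2}{2} + \frac{3y^3}{3} \right]_{0}^{1} = \frac{2}{5} \left( \frac{1}{2} + 1 \right) = \frac{3}{5}$$
$$E(Y^2) = \int_{0}^{1} y^2 \cdot \frac{2}{5}(1 + 3y)\,dy = \frac{2}{5} \int_{0}^{1} (y^2 + 3y^3)\,dy = \frac{2}{5} \left[ \frac{y^3}{3} + \frac{3y^4}{4} \right]_{0}^{1} = \frac{2}{5} \cdot \frac{4 + 9}{12} = \frac{13}{30}$$
$$V(Y) = E(Y^2) - [E(Y)]^2 = \frac{13}{30} - \left( \frac{3}{5} \right)^2 = \frac{65 - 54}{150} = \frac{11}{150}$$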

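c. $\sigma_{XY} = E\left[ (X - \mu_X)(Y - \mu_Y) \right] = E(XY) - \mu_X \mu_Y = E(XY) - E(X)E(Y)$
$$E(XY) = \int_{0}^{1} \int_{0}^{1} x y\,f(x, y)\,dx\,dy = \frac{2}{5} \int_{0}^{1} \int_{0}^{1} x y (2x + 3y)\,dx\,dy = \frac{2}{5} \int_{0}^{1} \int_{0}^{1} (2x^2 y + 3x y^2)\,dx\,dy$$
$$= \frac{2}{5} \int_{0}^{1} \left[ \frac{2x^3 y}{3} + \frac{3x^2 y^2}{2} \right]_{x=0}^{x=1} dy = \frac{2}{5} \int_{0}^{1} \left( \frac{2y}{3} + \frac{3y^2}{2} \right) dy$$
$$= \frac{2}{5} \left[ \frac{2y^2}{6} + \frac{3y^3}{6} \right]_{0}^{1} = \frac{2}{5} \cdot \frac{2 + 3}{6} = \frac{1}{3}$$
Since we have $E(XY) = \frac{1}{3}$, $E(Y) = \frac{3}{5}$ and $E(X) = \frac{17}{30}$, therefore
$$\sigma_{XY} = E(XY) - E(X)E(Y) = \frac{1}{3} - \frac{3}{5} \cdot \frac{17}{30} = \frac{50 - 51}{150} = -\frac{1}{150}$$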
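d. $\rho_{XY} = \dfrac{\mathrm{cov}(X, Y)}{\sqrt{V(X)\,V(Y)}} = \dfrac{\sigma_{XY}}{\sigma_X \sigma_Y}$

First we have to find
$$V(X) = \sigma_X^2 = \frac{71}{900}, \qquad V(Y) = \sigma_Y^2 = \frac{11}{150}$$
Therefore,
$$\rho_{XY} = \frac{\mathrm{cov}(X, Y)}{\sqrt{V(X)\,V(Y)}} = \frac{-\frac{1}{150}}{\sqrt{\frac{71}{900} \cdot \frac{11}{150}}} \approx -0.0877$$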
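All of the moments in this example come from one family of integrals, so they can be verified together. The sketch below relies on the fact that the integral of $x^m y^n$ over the unit square is $1/((m+1)(n+1))$; the function name `moment` is a hypothetical helper.

```python
from fractions import Fraction
import math

def moment(a, b):
    """E[X^a Y^b] for f(x, y) = (2/5)(2x + 3y) on the unit square.

    Expands x^a y^b (2x + 3y) and integrates each monomial term by term.
    """
    return Fraction(2, 5) * (Fraction(2, (a + 2) * (b + 1))
                             + Fraction(3, (a + 1) * (b + 2)))

e_x, e_y = moment(1, 0), moment(0, 1)
var_x = moment(2, 0) - e_x ** 2
var_y = moment(0, 2) - e_y ** 2
cov_xy = moment(1, 1) - e_x * e_y
rho = float(cov_xy) / math.sqrt(float(var_x * var_y))

print(e_x, var_x, e_y, var_y, cov_xy)  # 17/30 71/900 3/5 11/150 -1/150
print(round(rho, 4))
```

The small negative correlation indicates that X and Y are only weakly (and inversely) related under this density.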

7.5. Review Exercises
1. Find the mean, variance and standard deviation of the following probability
distributions.
a. $f(x) = 1$ for $0 \le x \le 1$, and 0 otherwise
b. $f(x) = 3x^2$ for $0 \le x \le 1$, and 0 otherwise
2. It is given that E(x) = 3, V(x) = 16, E(y) = 4, V(y) = 9 and that x and y are independent, find

a) E(x+y) b) V(x+y) c) E(4x-3y) d) V(4x-3y) e) E( x + y) f) V( x + y)

3. The pdf of a continuous random variable x is given by

+ ,0 ≤ ≤ 1
f(x) =
0, ℎ
if E(x) = find “a” and “b”.

Chapter Eight

Important Discrete and Continuous Distributions

Learning Objectives: At the end of the chapter students are expected to:

• know what is meant by a probability distribution;
• use standard statistical tables for the normal distribution;
• be familiar with functions of random variables and derive their probability distributions;
• be familiar with standard discrete and continuous probability distributions and their
application;
• be familiar with calculating probabilities using different probability distributions.

8.1 The Binomial Distribution

Binomial distribution is one of the simplest and most frequently used discrete probability
distribution and is very useful in many practical situations involving either /or types of events.

Properties of Binomial Experiment

1. Each trial has only two mutually exclusive outcomes or outcomes that can be reduced to
two. One of the outcomes is labeled as Success and the other as Failure.
2. The outcome of each trial is independent.
3. The probability of Success remains the same from trial to trial.
4. The experiment (trial) is performed for fixed number of times, say n.
Let X be the number of successes. Then X follows a binomial distribution with parameters n,
the number of trials performed, and p, the probability of success, written X ~ Bin(n, p).
The probability of getting exactly x successes in n trials is given by
$$P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n,$$
where p is the probability of success,
q = 1 − p is the probability of failure,
n is the number of trials, and
x is the number of successes.
This is called the Binomial Distribution.

The mean of a binomial distribution is E(X) = np and the variance is V(X) = npq, so the standard deviation is $\sqrt{npq}$.

Example

A coin is tossed five times. (This is the same as a sample size of five). What is the
probability of obtaining exactly two heads in the five tosses?

Solution:

It is known, by prior knowledge, that the probability of a single success (probability of a
head in one toss of a coin) is fifty percent. The question asks for two successes, that is,
two heads in five tosses of a coin. A success is the outcome that is desired to occur.

For this example:

• The number of trials: n = 5

• The probability of success in a single trial: p = 1/2

• The number of successes that the question is seeking: x = 2

To arrive at the answer to the question the values are entered in the binomial formula.
$$P(2) = \binom{5}{2} \left( \frac{1}{2} \right)^2 \left( \frac{1}{2} \right)^3 = 10 \times \frac{1}{4} \times \frac{1}{8} = 0.3125 \text{ or } 31.25\%$$

Example

A company produces electronic chips by a process that normally averages 2% defective
products. A sample of four chips is selected at random and the parts are tested for certain
characteristics.

a. What is the probability that exactly one chip is defective?

Solution:
$$P(1 \text{ defective chip}) = P(1) = \binom{4}{1} (0.02)^1 (0.98)^3 = 4(0.02)(0.9412) = 0.0753$$

b. What is the probability that more than one chip is defective?

Solution:

More than one defective chip in a sample of four means two, three or four
defective chips. The probability of each may be calculated using the binomial
formula.

P(more than 1 defective chip) = P(2) or P(3) or P(4) = P(2) + P(3) + P(4)

In any trial or sample, the sum of the probabilities of the individual events always
equals one. In this problem: P(0) + P(1) + P(2) + P(3) + P(4) = 1, where $P(0) = (0.98)^4 = 0.9224$.

P(more than 1 defective) = 1 - [P(0) + P(1)] = 1 - [0.9224 + 0.0753] = 0.0023
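The binomial calculations in the two examples can be reproduced with the standard library. A minimal sketch; `binom_pmf` is an illustrative helper name:

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Coin example: exactly two heads in five tosses.
two_heads = binom_pmf(2, 5, 0.5)

# Chip example: n = 4 chips, each defective with probability 0.02.
p0 = binom_pmf(0, 4, 0.02)
p1 = binom_pmf(1, 4, 0.02)
more_than_one = 1 - (p0 + p1)

print(two_heads, round(p0, 4), round(p1, 4), round(more_than_one, 4))
```

Using the complement 1 − [P(0) + P(1)] avoids summing P(2), P(3) and P(4) separately, exactly as in the worked solution.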

Exercise:
1. Suppose a coin is tossed 10 times. What is the probability of getting
a. Exactly 3 heads
b. At most 3 heads
c. At least 3 heads
d. More than 3 heads
e. No head
Find the average and variance of the number of heads.
2. The probability of a man kicking into the goal is 2/3. If a person kicks 5 times, what is
the probability of scoring
a. At least one goal.
b. At most 3 goals.
Find the average, variance and standard deviation of the number of goals.

8.2 The Poisson distribution

Properties
1. The probability of success, p, is very small.
2. The experiment is performed indefinitely (n is very large).
3. The average number of events per unit of time (λ) is known.

Thus, the random variable X (number of successes) has a Poisson distribution with parameter λ,
written X ~ Poisson(λ), and the probability of getting x successes is given by
$$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$
If X is a Poisson random variable then E(X) = λ and V(X) = λ.

Example:

In making switches, it has been determined by empirical studies that there is, on average,
one defect per switch. What is the probability of selecting a sample of five switches that
contains zero defects?

Solution:

There are two methods to solve this problem. The first method is to use the above
formula with x = 0 and λ = np, where n = 5 is the number of switches and p = 1 is the
average number of defects per switch, therefore

λ = np = 5 × 1 = 5.
$$P(0) = \frac{e^{-5} (5)^0}{0!} = \frac{0.00674}{1} = 0.00674 \text{ or } 0.674\%$$

The second and most widely used method is to use the Poisson tables that are published
in most statistics books. To use the tables, find the value of x in the leftmost column, then
find the value of np on the top row and read P(x) at the intersection of the two values.

The Poisson table value for P(0) = .006738 or .674%
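In place of a printed table, the Poisson formula can be evaluated directly. A short sketch, with `poisson_pmf` as an illustrative helper name:

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam): exp(-lam) * lam^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 5 * 1  # five switches, one defect per switch on average
p_zero = poisson_pmf(0, lam)
print(round(p_zero, 6))  # about 0.006738
```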

Exercise:
1. On average a typist commits 3 errors per page. Find the probability that she will make
a. No mistake.
b. More than one mistake.
2. Customers arrive at a photocopying machine at an average rate of two every 10 minutes.
What is the probability that there will be
a. No arrivals during any period of ten minutes.
b. Exactly one arrival during this time period.
c. More than two arrivals during this time period.

8.3 The Normal Distribution

The normal distribution is a continuous probability distribution and plays a very
important and pivotal role in statistical theory and practice, particularly in the area of
statistical inference and statistical quality control. Its importance is due to the fact that
in practice, the experimental results, very often seem to follow the normal distribution
or bell shaped curve. The normal curve is symmetrical and is defined by its mean µ and
its standard deviation σ. The normal curve is not just one curve but a family of curves.
Just as the equation for a circle describes the family of circles, some small and some
large, the equation for the normal curve describes a family of such curves which may
differ only with regard to the values of µ and σ, but have the same characteristics.

Characteristics of the Normal Curve

1. The normal curve is symmetrical about the mean. This means that the number of
units in the data below the mean is the same as the number of units above the
mean. This means the mean and median have the same value.
2. The height of the normal curve is maximum at the mean value. Thus, the mean and
mode coincide. This means that the normal distribution has the same value of
mean, median and mode.
3. The curve declines as we go in either direction from the mean, but never touches
the base (X-axis) so that the tails of the curve on both sides extend indefinitely.
4. The corresponding deciles, quartiles and percentiles are equi-distant from the
mean.
The height of the normal curve Y at any value of the random variable X is given by
$$Y = f(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}, \quad -\infty < x < \infty,$$
and we write $X \sim N(\mu, \sigma^2)$, where µ is the mean of the distribution and
σ is the standard deviation of the distribution.

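Standard Normal Distribution

If $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma}$ is called the standard normal variate, with mean
zero and variance one, written $Z \sim N(0, 1)$, and
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^2}, \quad -\infty < z < \infty$$
Therefore,
$$P(a < X < b) = \int_{a}^{b} f(x)\,dx = \int_{a}^{b} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} dx = P\left( \frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma} \right) = \int_{(a - \mu)/\sigma}^{(b - \mu)/\sigma} f(z)\,dz$$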


Area under the Normal Curve

The total area under the standard normal curve is one. Then the area to right and left
from the central point (µ=0) is 0.5 each.
Example: Find the area under the standard normal distribution which lies
a) $P(0 \le Z \le 1.96)$
b) $P(Z \ge 1.96)$
Solution:

a) $P(0 \le Z \le 1.96) = 0.4750 = P(-1.96 \le Z \le 0)$

b) $P(Z \ge 1.96) = P(Z \ge 0) - P(0 \le Z \le 1.96) = 0.5 - 0.4750 = 0.025$
Example:

A random variable X has a normal distribution with mean 80 and standard deviation 4.8.
What is the probability that it will take a value

a) Less than 87.2

b) Greater than 76.4
c) Between 81.2 and 86.0
Solution:

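X is normal with mean µ = 80 and standard deviation σ = 4.8.

a) P(X < 87.2) = P((X − µ)/σ < (87.2 − µ)/σ)
   = P(Z < (87.2 − 80)/4.8)
   = P(Z < 1.5)
   = P(Z < 0) + P(0 < Z < 1.5)
   = 0.5 + 0.4332
   = 0.9332

b) P(X > 76.4) = P((X − µ)/σ > (76.4 − µ)/σ)
   = P(Z > (76.4 − 80)/4.8)
   = P(Z > −0.75)
   = P(Z > 0) + P(0 < Z < 0.75), since by symmetry P(−0.75 < Z < 0) = P(0 < Z < 0.75)
   = 0.5 + 0.2734
   = 0.7734

c) P(81.2 < X < 86) = P((81.2 − µ)/σ < (X − µ)/σ < (86 − µ)/σ)
   = P((81.2 − 80)/4.8 < Z < (86 − 80)/4.8)
   = P(0.25 < Z < 1.25)
   = P(0 < Z < 1.25) − P(0 < Z < 0.25)
   = 0.3944 − 0.0987
   = 0.2957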
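The three answers in this example can be cross-checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are assumptions, and the limits in part c are taken from the problem statement (81.2 and 86.0).

```python
from statistics import NormalDist

# X ~ N(80, 4.8^2), as given in the example.
X = NormalDist(mu=80, sigma=4.8)

p_a = X.cdf(87.2)                 # a) P(X < 87.2)
p_b = 1.0 - X.cdf(76.4)           # b) P(X > 76.4)
p_c = X.cdf(86.0) - X.cdf(81.2)   # c) P(81.2 < X < 86.0)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

The computed values match the table-based answers to about three decimal places. Small differences in the fourth decimal can arise because each table entry is itself rounded to four places before subtraction.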

Exercise:
The IQ score of students is normally distributed with a mean of 120 and variance
400. What is the probability that a student will have an IQ
a) Between 100 and 130.
b) Below 150.
c) Above 140.

d) Between 140 and 150.

8.4. Review Exercises
1. Suppose a coin is tossed 10 times. What is the probability of getting
a. Exactly 3 heads
b. At most 3 heads
c. At least 3 heads
d. More than 3 heads
e. No head
Find the average and variance of the number of heads.
2. The probability that a player scores a goal with a single kick is 2/3. If he kicks 5 times, what is the
probability of scoring
a. At least one goal.
b. At most 3 goals.
Find the average, variance and standard deviation of the number of goals.
3. On average, a typist commits 3 errors per page. Find the probability that she will make
a. No mistake.
b. More than one mistake.
4. Customers arrive at a photocopying machine at an average rate of two every 10 minutes. What is
the probability that there will be
a. No arrivals during any period of ten minutes.
b. Exactly one arrival during this time period.
c. More than two arrivals during this time period.
5. The IQ score of students is normally distributed with a mean of 120 and variance 400. What is
the probability that a student will have an IQ
a. Between 100 and 130.
b. Above 140.
c. Below 150.
d. Between 140 and 150.

6. A student is given 4 true or false questions. The student does not know the answer to any of the
questions, so he tosses a coin for each one and selects true whenever he gets a head. What is the
probability that he will get
a. Only one correct answer.
b. At most 2 correct answers.
c. At least 3 correct answers.
d. All correct answers.
7. Cars pull into a petrol station at an average rate of 3 every 10 minutes. What is the
probability that exactly 2 cars will arrive in the next 10 minutes?
8. The price of a one-gallon bottle of milk is normally distributed with an average price of $2 and a
standard deviation of 20 cents. A family stops at a booth to buy a gallon of milk. What is the
probability that they will pay
a. More than $2.10
b. Less than $1.75
c. Between $1.85 and $2.15
d. Between $1.85 and $1.95

8.5. Assignment from all Chapters (1-8)

1. Construct a frequency distribution table for the following data, including class limits,
class boundaries, class marks, frequencies, LCF and MCF. Then calculate the range and
standard deviation for the constructed frequency distribution.
42 62 46 54 41 37 54 44 32 45
47 50 58 49 51 42 46 37 42 39
56 38 45 52 46 54 39 51 58 47
64 43 48 49 48 49 61 41 40 58
49 59 57 57 34 40 63 41 51 41
2. Suppose a train moves 100 km with a speed of 40 km per hour, then 150 km with a speed
of 50 km per hour and the next 135 km with a speed of 45 km per hour. Calculate the
average speed of the train.
3. In Adigrat University, 120 students were involved in the study, and based on an enquiry
on their age, it was known that 35% of them were 20 or less. The following frequency
distribution shows the age composition of the students under study.
Mid-age in years Number of persons
16 15
19 f1
22 f2
25 23
28 15
31 9
Find the mean, median and mode.

4. A random variable X has the following probability function:

X          0    1    2    3    4    5    6     7
P(X = x)   0    k    2k   2k   3k   k²   2k²   2k² + k

i. Find k.
ii. If P(X ≤ k) > ½, find the minimum value of k.
iii. Determine the distribution function of X.
iv. Find mean and variance.
5. The IQ score of students is normally distributed with a mean of 120 and variance 400.
What is the probability that a student will have an IQ
a) Between 100 and 130.
b) Below 150.
c) Above 140.
d) Between 140 and 150.
6. On average a typist commits 3 errors per page. Find the probability that she will make
a. No mistake.
b. More than one mistake.
7. Customers arrive at a photocopying machine at an average rate of two every 10 minutes.
What is the probability that there will be
a. No arrivals during any period of ten minutes.
b. Exactly one arrival during this time period.

8. From the following joint pmf

PX,Y(x, y)    Y = 0    Y = 1    Y = 2
X = 1         0.1      0.1      0.1
X = 2         0.1      0.2      0.05
X = 3         0.2      0.1      0.05

c) Calculate PX(x).
d) Calculate PY(y).
e) Find P(X = 2 | Y = 1).
9. From the following joint probability density function (4pt)

, 0<x<2; 0<y<1

fX,Y(x, y) = 0, otherwise

a) Determine the value of A.
b) Find E(X) and E(Y).
10. Let (X; Y) be a two dimensional continuous random variable with joint probability
density function given by
fX,Y(x, y) = 2 + …,  0 < x < 1, 0 < y < 1
fX,Y(x, y) = 0,  otherwise
Find the marginal probability density functions of X and Y.

11. If E[g(Y)] = E{E[g(Y) | X]} then,
a) E(Y)=E{E[Y|X]}
b) Var(Y)=E[Var(Y|X)]+Var(E[Y|X])
12. Graduated Statistics Aptitude Test (GSAT) scores are widely used by graduate schools of
Engineering and Technology as an entrance requirement. Suppose that in one particular year, the
mean GSAT score was 476 with a standard deviation of 107. Assuming that the GSAT scores are
normally distributed, answer the following questions.
i. What is the probability that a randomly selected score falls between 476 and 650?
ii. What is the probability of receiving a score greater than 750?
iii. What is the probability of receiving a score of 540 or less?
iv. What is the probability of receiving a score between 330 and 440?
