
AMTH - 107 - All Lectures

This document provides an overview of basic statistics and probability concepts: 1) It defines statistics as the collection, organization, and analysis of data to draw conclusions about large data sets from representative samples. Data can come from primary or secondary sources. 2) Variables are characteristics that can take different values and are classified as qualitative or quantitative, with quantitative variables further divided into discrete and continuous. Measurement scales include nominal, ordinal, interval, and ratio scales. 3) Random variables have unknown values. A population is the entire data set of interest, while a sample is a subset used to make inferences about the population. Measurement involves assigning numerical values to variables using an appropriate scale.

Basic Statistics and Probability

Lecture-01 Date:14/01/2020
◼ Statistics: Statistics is a field of study that deals with
1) the collection, organization, summarization and analysis of data,
and
2) drawing inferences (conclusions) about a large data set from a representative
part of it.
◼ Data: The information about which we are concerned is called data. In statistics, data are
expressed in numbers. Information can be transformed into numbers by
1) Using the measuring instruments,
2) The counting process,
3) Classifying into numbered classes/groups/categories.
Example:
I. Weight of a patient,
II. Temperature of a place in a day,
III. Number of patients discharged from a clinic,
IV. Economic status of an individual country/company.
#Remark:
Data are numbers, numbers contain information and the purpose of statistics is to investigate
and explore the nature and meaning of the information.

◼ Source of Data: Source of data can be classified into


a) Primary data
b) Secondary data (BBS, newspapers, …)
◼ Primary Data: Primary data are collected for the first time by the user for the purpose of
addressing his/her own problem. Primary data can be obtained by
1) Keeping records routinely,
2) Conducting survey (When information exists, but is not available in routinely kept
record),
3) Conducting experiment (When information does not exist).
◼ Secondary Data: Secondary data are second-hand information that was collected by
someone other than the user, for a purpose not related to the user's current problem.
Example:
I. Census, government publications;
II. Internal records of organization;
III. Published reports, books, journals;
IV. Website.
Lecture-02 Date:16/01/2020
✓ Note: The concepts, theories and tools of statistics are used to analyze data derived from
many fields of study such as
a) Business,
b) Psychology,
c) Economics,
d) Biological science (biostatistics),
e) Medicine.
◼ Entity: An entity is a person, place or object from which information is obtained.
Example: Number of students in a class. (Here the class is the entity.)
◼ Variable: Variable is a characteristic (About which we want information) which takes on
different values (numbers) in different entities.
Example:
I. Height of a student,
II. Age of an individual student/patient,
III. Number of calls received in a day by a person,
IV. Education level of an individual student.

Variable can broadly be classified as


a) Qualitative Variable (usually summarized by a percentage)
b) Quantitative Variable (usually summarized by an average)
◼ Qualitative Variable: A variable is said to be a qualitative variable if numbers can be
assigned to it only by classifying entities into numbered classes/groups/categories. The numbers
associated with a qualitative variable convey information regarding an attribute.
Example:
I. Gender,
II. Religion,
III. Eye color,
IV. Place of residence,
V. Birth place,
VI. Letter Grade.
◼ Quantitative Variable: A variable is called a quantitative variable if numbers can be assigned
to it by using a measuring instrument or by the process of counting. The numbers of a
quantitative variable convey information regarding amount.
Example:
I. Height,
II. Weight,
III. Age,
IV. Blood glucose concentration,
V. Heart-beat.
Quantitative variables can be classified into two groups:
a) Discrete Variable,
b) Continuous Variable.
◼ Discrete Variable: A quantitative variable is a discrete variable if there are gaps or
interruptions in the values (numbers) that the variable can assume. These gaps or
interruptions indicate the absence of values (numbers) between particular values.
Example:
I. Number of tumors of a patient,
II. Number of accidents occurred in a place in a day,
III. Number of students in a class,
IV. Heart-beat,
V. Letter Grade Point.
◼ Continuous Variable: A quantitative variable is continuous if it can assume
any value within a specified interval of values.
Example:
I. Age,
II. Height,
III. Weight,
IV. Blood glucose concentration,
V. Marks obtained in-course by a student,
VI. GPA (Average),
VII. Time to reach department by a student.
Classification of variables:
Variable
  - Qualitative
  - Quantitative
      - Discrete
      - Continuous
#Remark:
1) The average (or mean) is usually used to summarize the values of a quantitative variable.
2) The percentage of entities belonging to each class/group/category is usually used for a
qualitative variable.
Lecture-03 Date:19/01/2020
◼ Random Variable: A variable is said to be a random variable when all possible values of
variable are known in advance but it is not possible to predict exactly what the values will be
for an entity.
◼ Population: Population is the largest collection of entities for which we have an interest at a
particular time.
Example:
If we are interested in the weight of under-5 children of Dhaka city, the population is all under-5 children of Dhaka city.

A population can be finite or infinite.

◼ Census: A survey is called a census when information is collected from all entities of a
population. A census is not always feasible because
a) it takes a long time,
b) it is expensive,
c) it requires a large number of skilled persons to obtain the information.
◼ Sample: Sample is a representative part of a population.
Example:
Suppose that the average birth weight in Bangladesh was found to be 2394 gm in 2016, whereas
a study of 6139 new-borns revealed that the average birth weight was 2405 gm. Here, the 6139 new-borns form a sample.
◼ Measurement and Measurement Scale: Measurement is a scientific rule of assigning values
(numbers) to a variable by using an appropriate scale.
Four scales are available in statistics for this purpose and those are
a) Nominal Scale,
b) Ordinal Scale,
c) Interval Scale,
d) Ratio Scale.
A. Nominal Scale: The measurement scale in which numbers are assigned to the
categories/classes/groups only for the identification is called Nominal Scale.
The numbers cannot be ordered or ranked.
Example:
I. Jersey number of a player (0-50)
II. Gender (Male/Female)
III. Blood Group (A+, A-, B+…)
IV. Home District (Dhaka, Rangpur, …)
V. Religion (Islam, Hindu, …)
VI. Smoking Behavior (Yes/No)
VII. Nationality (Bangladeshi, Indian, …)
VIII. Place of residence.

B. Ordinal Scale: The measurement scale in which numbers are assigned to the
categories/classes/groups for identification as well as ranking is called an Ordinal
Scale.
An ordinal scale has the identification property of a nominal scale and, in addition, an order.
Example:
I. Education level [0 No class, 1 Primary, 2 Secondary, 3 Higher]
II. Economic status [1 Poor, 2 Middle, 3 Rich]
III. Level of satisfaction
IV. Level of usefulness
V. Level of opinion
VI. Smoking status
Lecture-04 Date:26/01/2020
C. Interval Scale: The measurement scale in which numbers are assigned to the variable in
such a way that the scale is divided into equal units and the zero value on the
scale is not an absolute zero (zero doesn’t mean the absence of the characteristic) is called an
Interval Scale.
Example:
I. Temperature,
II. IQ Scale,
III. Marks obtained in a course,
IV. Calendar time,
V. Watch time
In this case, the arithmetic operations (addition and subtraction) are appropriate.

D. Ratio Scale: The measurement scale in which numbers are assigned to the variable in such a
way that the scale is divided into equal units and the zero value on the scale
is an absolute zero (zero means the absence of the characteristic) is called a Ratio Scale.
Example:
I. Height,
II. Weight,
III. Age,
IV. Income,
V. Number of accident occurred in a city in a day,
VI. Blood glucose concentration.
Measurement scales by type of variable:
Variable
  - Qualitative: Nominal, Ordinal
  - Quantitative: Interval, Ratio

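#Remark: Temperature in ℃ is measured on an interval scale and money on a ratio scale, so only some of the following statements are valid.
1) "20℃ is twice as warm as 10℃" is not valid (ratios are not meaningful on an interval scale).
2) "20℃ is 10℃ warmer than 10℃" is valid.
3) "0℃ means absence of heat" is not valid (0℃ is not an absolute zero).
4) "$30 is $10 more than $20" is valid.
5) "$30 is three times as much as $10" is valid.
6) "$0 means no money" is valid.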
Problem: For each of the following variables, indicate whether it is quantitative or qualitative. Also
specify the measurement scale used to assign the numbers.
a) Class standing position of students
b) Admitting diagnosis of a patient
c) Weight of a baby
d) Gender of a child
e) Body temperature of a new born.
Solution:
a. Qualitative, Ordinal
b. Qualitative, Nominal
c. Quantitative, Ratio (Continuous)
d. Qualitative, Nominal
e. Quantitative, Interval (Continuous).

Problem: For each of the following situations, answer questions (a) to (d).
a) What is the variable of interest? Type of variable?
b) What measurement scale was used?
c) What is the population?
d) What is the sample?
Situation-01: Study of 300 households in a small town reveals the fact that 20% of
households had school-going children.
Situation-02: A study on 250 patients admitted to a specific hospital in last year revealed
that the patients, on the average, lived 15 miles away from hospital.
Solution:
#Situation-01:
a. Presence of school going children in household. (Qualitative variable)
b. Nominal.
c. All households in the small town.
d. 300 households.
#Situation-02:
a) Distance between the residence of a patient and the hospital. (Quantitative and continuous)
b) Ratio.
c) All patients admitted into the hospital last year.
d) 250 patients.
Lecture-05: Date:28/01/2020
◼ Summarization of Data: (Frequency Distribution)
Summarization of data is used to know the meaning, nature or pattern of data obtained on
a variable of interest. One of the tools for summarization is the frequency distribution of data.
◼ Classification:
Classification means grouping related facts (numbers or values) into different
non-overlapping (or mutually exclusive) classes. The number of values belonging to each class is known
as the frequency of that class.
◼ Frequency Distribution:
A set of classes along with the corresponding frequencies, presented in tabular form, is called a
Frequency Distribution. The minimum number of classes is two. The ideal number of classes in a
frequency distribution is a number between 4 and 8, inclusive.
◼ Types of Frequency Distribution:
There are two types of frequency distribution. These are-
1) Ungrouped or Simple Frequency Distribution
-Qualitative Variable
2) Grouped Frequency Distribution
-Quantitative Variable.
#Remark:
In practice, if a quantitative variable took on few distinct numbers, one may use simple
frequency distribution.
◼ Simple Frequency Distribution:
To construct a simple frequency distribution, one needs to follow steps given below-
a) Consider each numbered category/group of a qualitative variable or each distinct
number of a quantitative variable as a class. Arrange the classes in ascending order
(small to big).
b) Consider the first value (observation), examine which class it belongs to and
put a tally mark against that class. Repeat this process for the remaining
observations.
c) Count the number of observations belonging to each class, which is known as the frequency
of the class. Note that the sum of the frequencies equals the size of the data.
d) Compute the Relative Frequency (RF) for each class as
RF = (Frequency of class) / (Size of data)
e) Compute the Percentage Frequency (PF) as
PF = RF × 100
Table-01: (Title)
Class     Tally   Frequency   RF   PF
Male
Female

Lecture-06 Date:02/02/2020
#Example: The following data represent the number of children of women:
3 5 2 4 0 1 3 5 2 3
2 3 3 2 4 1
Construct a suitable frequency distribution and interpret the result.
Solution:
The variable of interest is
x = Number of children of a mother (Quantitative, discrete, ratio)
n = data size =16
Since the distinct numbers are few, one may construct a simple frequency distribution.
Table-01: Number of children of women, n=16.

Class   Tally   Frequency   RF       PF
0               1           0.0625    6.25
1               2           0.1250   12.50
2               4           0.2500   25.00
3               5           0.3125   31.25
4               2           0.1250   12.50
5               2           0.1250   12.50

It is clear from the table that only 6.25% of the women do not have any children
and 12.5% have 5 children. The largest share of the women (31.25%) have 3 children.
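
The steps for a simple frequency distribution can be sketched in Python as follows. This is only an illustrative sketch: the function name `simple_frequency_table` is an assumption made for this example, and the data are the 16 observations given above.

```python
from collections import Counter

# Number of children of 16 women (data from the example above).
children = [3, 5, 2, 4, 0, 1, 3, 5, 2, 3, 2, 3, 3, 2, 4, 1]


def simple_frequency_table(data):
    """Return rows (class, frequency, RF, PF) with classes in ascending order.

    Hypothetical helper written for this illustration.
    """
    n = len(data)
    counts = Counter(data)          # frequency of each distinct value
    rows = []
    for value in sorted(counts):    # classes arranged from small to big
        freq = counts[value]
        rf = freq / n               # relative frequency
        pf = rf * 100               # percentage frequency
        rows.append((value, freq, rf, pf))
    return rows


table = simple_frequency_table(children)
for value, freq, rf, pf in table:
    print(f"{value:>5} {freq:>9} {rf:>8.4f} {pf:>6.2f}")
```

The printed rows should agree with the Frequency, RF and PF columns of Table-01.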

#Example: The following data provide information on the level of internet use (1: 0
hours; 2: <1 hour; 3: 1-4 hours; 4: 4-10 hours; 5: ≥10 hours).
1 5 3 3 2 3 2 3 1 4 2 1
5 1 1 1 3 2 1 5 1 1 2 4
4 2 3 2 1 4 4 1 2 3 1 1
14.
Solution:
The variable of interest is
x= Level of internet use (Qualitative, Ordinal)
n= Data size (Sample/Population size)
Table-02: Level of internet use, n=50.

Class             Tally   Frequency   RF     PF
No internet use           19          0.38   38
<1 hour                   11          0.22   22
1-4 hours                 09          0.18   18
4-10 hours                08          0.16   16
≥10 hours                 03          0.06    6

Table-02 reveals that the largest group of students (38%) do not use the internet. A few students
(6%) use the internet for 10 hours or more, and 22% of the students use the internet for
less than 1 hour.

◼ Graphical Representation: (Simple Frequency Distribution)


There are two types of chart or diagram for the graphical representation of a
simple frequency distribution. These are-
1) Bar Diagram,
2) Pie Chart.

1. Bar Diagram:
To draw a bar diagram, the classes (categories, distinct values) are
placed on the X-axis and the frequency/RF/PF on the Y-axis. Then, a rectangle of equal
width is drawn over each class, where the height of the rectangle is proportional to the
frequency/RF/PF of the corresponding class. Note that there will be a gap between
adjacent rectangles.

[Figure: Example of a bar chart. X-axis: classes (categories, distinct values); Y-axis: frequency/RF/PF.]

2. Pie Chart:
A pie chart/diagram is drawn using a circle, where the circle is divided into different
segments/slices. Each segment represents a class of the simple frequency distribution.
The area of a segment is proportional to the frequency of the class. The angle at the
centre of a segment is determined as
Angle of segment/class = RF × 360°
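
As a small illustration of the angle rule, the sketch below converts the frequencies of Table-02 into pie-chart angles. The function name `pie_angles` and the class labels used as dictionary keys are assumptions made for this example.

```python
# Frequencies taken from Table-02 (level of internet use, n = 50).
frequencies = {
    "No internet use": 19,
    "<1 hour": 11,
    "1-4 hours": 9,
    "4-10 hours": 8,
    ">=10 hours": 3,
}


def pie_angles(freqs):
    """Angle of each segment in degrees: RF x 360 (hypothetical helper)."""
    n = sum(freqs.values())
    return {label: f / n * 360 for label, f in freqs.items()}


angles = pie_angles(frequencies)
for label, angle in angles.items():
    print(f"{label:<16} {angle:6.1f} degrees")
```

The angles add up to 360°, because the relative frequencies add up to 1.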

[Figure: Example of a pie chart with four segments.]


◼ Grouped Frequency Distribution:


Unlike simple frequency distribution, few related numbers or values are
considered to create a class having a lower limit (L) and upper limit (U). Then,
the width (W) of the class is defined as
W=U-L
There are many approaches to construct the grouped frequency distribution.
One of the approaches is described below.
a) Determine the maximum and minimum values of the data set. Let $x_{\max}$
and $x_{\min}$ be the maximum and minimum values. Then, define the range of
the data as
$$R = x_{\max} - x_{\min}$$
b) Choose a value for the class width W such that
$$m = \frac{R}{W}$$
is a number between 4 and 8 inclusive.
c) Find the lower limit (L) and upper limit (U) for each class as
Class-01: $L_1 = \text{Floor}(x_{\min})$, $U_1 = L_1 + W$
Class-02: $L_2 = U_1$, $U_2 = L_2 + W$
Class-03: $L_3 = U_2$, $U_3 = L_3 + W$
and so on. Note that $x_{\min}$ belongs to the first class and $x_{\max}$ belongs to the last
class.
d) Find the midpoint of each class as
$$\text{midpoint} = \frac{L + U}{2}$$
The midpoint of a class is considered the typical or
representative value of all values belonging to that class.
e) Consider the first observation and check which class it belongs to.
Then, put a tally mark against that class. An observation
is said to belong to a class if it is equal to or larger than the lower limit but less than
the upper limit. Repeat this for the remaining observations.
f) Count the number of observations in each class, which is the frequency
of the class.
g) Compute the Relative Frequency (RF) as
$$RF = \frac{\text{Frequency}}{\text{Total number of observations in the data}}$$
h) Compute the Percentage Frequency (PF) as
$$PF = RF \times 100.$$
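
The approach described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `grouped_frequency_table` and the eight data values are made up for illustration (they are not the lecture's data), and the width is taken as a whole number to avoid floating-point drift in the class limits.

```python
import math


def grouped_frequency_table(data, width):
    """Rows (L, U, midpoint, frequency, RF, PF) for classes [L, U).

    Hypothetical helper: the first lower limit is Floor(x_min) and classes are
    added until x_max falls inside the last class.
    """
    n = len(data)
    x_max = max(data)
    lower = math.floor(min(data))
    rows = []
    while lower <= x_max:
        upper = lower + width
        # An observation belongs to a class if L <= x < U.
        freq = sum(1 for x in data if lower <= x < upper)
        rf = freq / n
        rows.append((lower, upper, (lower + upper) / 2, freq, rf, rf * 100))
        lower = upper
    return rows


# Illustrative values only (assumed for this sketch).
sample = [13.5, 14.2, 15.1, 15.9, 16.0, 16.4, 17.8, 18.3]
rows = grouped_frequency_table(sample, width=1.0)
for lower, upper, mid, freq, rf, pf in rows:
    print(f"{lower:g}-{upper:g}  midpoint={mid:g}  f={freq}  RF={rf:.3f}  PF={pf:.1f}")
```

Note that the value 16.0 is counted in the class 16-17 and not in 15-16, in line with the rule in step (e).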

#Example: The hemoglobin concentrations (gm/dl) obtained from 50 patients are
given below. Construct a suitable frequency distribution. Comment on your
result.
17.0 17.7 15.9 15.2 16.2 17.1 15.7 14.6 15.8 15.3 16.4
13.7 16.2 16.4 14.0 16.2 16.4 14.9 17.8 16.1 15.5 15.9
15.3 13.9 16.8 15.9 16.3 17.4 14.2 16.1 15.7 15.1 17.4
16.5 14.4 13.5 17.0 15.8 17.5 17.3 16.3 15.9 16.7 16.1
15.8
Solution:
x = Hemoglobin concentration of a patient (Quantitative, Continuous, Ratio)
A grouped frequency distribution is appropriate.
$x_{\max}$ = 18.3
$x_{\min}$ = 13.5
Range, R = 18.3 − 13.5 = 4.8
Let width (W) = 1.0. Then
$$m = \frac{R}{W} = \frac{4.8}{1.0} = 4.8$$
Table-01: Frequency distribution for hemoglobin concentration, n=50

Class   Midpoint   Tally   Frequency   RF     PF
13-14   13.5               03          0.06    6
14-15   14.5               05          0.10   10
15-16   15.5               15          0.30   30
16-17   16.5               16          0.32   32
17-18   17.5               10          0.20   20
18-19   18.5               01          0.02    2

Comment: It is clear from Table-01 that only 6% of the observations fall in the class 13-14
(midpoint 13.5) and only 2% fall in the class 18-19 (midpoint 18.5). The largest share (32%)
falls in the class 16-17 (midpoint 16.5).

#Problem: The grouped frequency distribution for 30 students in a course is given
below.
Class       12-17  17-22  22-27  27-32  32-37  37-42  42-47  47-52
Frequency   03     03     ?      04     10     02     03     02
Complete the frequency distribution table and comment on the result.

Lecture-07 Date:04/02/2020
◼ Graphical Representation: (Grouped Frequency Distribution)
There are two types of chart/diagram available to represent a grouped
frequency distribution graphically. These are
a) Histogram
b) Polygon.
◼ Histogram: To draw a histogram, the class limits are
considered on the x-axis and the frequency/RF/PF is considered on the y-axis. Then, a
rectangle is drawn for each class so that the height of the rectangle is proportional to the
frequency/RF/PF of the corresponding class.
✓ Note that there will be no gap between the rectangles.
#Example:
[Figure-01: Histogram of hemoglobin concentration, n=50. X-axis: hemoglobin concentration; Y-axis: PF.]

◼ Polygon:
To draw a polygon, the midpoints of the classes are considered on the x-axis and the
frequency/RF/PF on the y-axis. Then, a point is plotted for each class and the
adjacent points are joined by straight lines. It is widely used to represent more than one
grouped frequency distribution on the same graph.
[Figure-02: Frequency polygon of hemoglobin concentration, n=50. X-axis: midpoint; Y-axis: PF.]

#Problem: Marks obtained in a course are distributed by gender as

Class    Frequency (Male)   Frequency (Female)
40-50    04                 06
50-60    15                 20
60-70    10                 13
70-80    14                 11

Graphically represent the data on the same graph after constructing
frequency distributions for male and female students.

[Figure: frequency polygons for male and female students on the same graph.]

◼ Symmetric Curve: A curve is called symmetric about a point 𝜃 if the right
side of 𝜃 mirrors the left side. A bell-shaped curve is an example of a symmetric curve.

Mathematically, symmetry about 𝜃 means
Area (under the curve from 𝜃 to 𝜃 + 𝑎) = Area (under the curve from 𝜃 − 𝑎 to 𝜃), for all a > 0.

#Positively Skewed (lack of symmetry)/Skewed to the Right: the curve has a longer tail on the right side.

#Negatively Skewed (lack of symmetry)/Skewed to the Left: the curve has a longer tail on the left side.

[Figures: bell-shaped symmetric curve; curve skewed to the right; curve skewed to the left. Y-axis: frequency.]


Lecture-08 Date:09/02/2020
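◼ Notation: Summation:
Let X and Y be two quantitative variables. Suppose that $n_1$ observations are
obtained for X and $n_2$ observations for Y. Let a and b be any constants.
Suppose that $x_1, x_2, \dots, x_i, \dots, x_{n_1}$ are the $n_1$ observations obtained on X
and $y_1, y_2, \dots, y_i, \dots, y_{n_2}$ are the $n_2$ observations obtained on Y.
The symbol $\sum$ denotes summation. In rules (05) to (09), both variables have n observations.

(01) $\sum_{i=1}^{n_1} x_i = x_1 + x_2 + \dots + x_i + \dots + x_{n_1}$
     $\sum_{i=3}^{n_1} x_i = x_3 + x_4 + \dots + x_i + \dots + x_{n_1}$
     $\sum_{i=5}^{11} y_i = y_5 + y_6 + \dots + y_i + \dots + y_{11}$
(02) $\sum_{i=1}^{n_1} a = a n_1$
     Let $x_i = a$ for all i. Then
     $\sum_{i=1}^{n_1} x_i = x_1 + x_2 + \dots + x_{n_1} = a + a + \dots + a = a(1 + 1 + \dots + 1) = a n_1$
     For example, $\sum_{i=4}^{7} a = 4a$.
(03) $\sum_{i=1}^{n_1} a x_i = a \sum_{i=1}^{n_1} x_i$
     Let $z_i = a x_i$. Then
     $\sum_{i=1}^{n_1} z_i = z_1 + \dots + z_i + \dots + z_{n_1} = a x_1 + \dots + a x_i + \dots + a x_{n_1} = a(x_1 + \dots + x_i + \dots + x_{n_1}) = a \sum_{i=1}^{n_1} x_i$
(04) $\sum_{i=1}^{n_2} (y_i + b) = \sum_{i=1}^{n_2} y_i + \sum_{i=1}^{n_2} b = \sum_{i=1}^{n_2} y_i + n_2 b$
(05) $\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$
(06) $\sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \dots + x_i^2 + \dots + x_n^2$
(07) $\left(\sum_{i=1}^{n} x_i\right)^2 = (x_1 + x_2 + \dots + x_i + \dots + x_n)^2$
(08) $\sum_{i=1}^{n} (x_i + a)^2 = \sum_{i=1}^{n} (x_i^2 + 2 a x_i + a^2) = \sum_{i=1}^{n} x_i^2 + 2a \sum_{i=1}^{n} x_i + n a^2$
(09) $\sum_{i=1}^{n} (x_i + y_i)^2 = \sum_{i=1}^{n} (x_i^2 + 2 x_i y_i + y_i^2) = \sum_{i=1}^{n} x_i^2 + 2 \sum_{i=1}^{n} x_i y_i + \sum_{i=1}^{n} y_i^2$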
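
Some of the summation rules can be checked numerically. The short Python sketch below uses small made-up lists `x` and `y` and a constant `a` (all assumed for illustration) to confirm rules (03), (08) and (09), and to show that (06) and (07) are different quantities.

```python
# Illustrative values only (assumed for this sketch).
x = [2, 5, 1, 4]
y = [3, 0, 6, 2]
a = 3
n = len(x)

# Rule (03): a constant factor can be taken outside the summation.
lhs_03 = sum(a * xi for xi in x)
rhs_03 = a * sum(x)

# Rule (08): sum of (x_i + a)^2 expanded term by term.
lhs_08 = sum((xi + a) ** 2 for xi in x)
rhs_08 = sum(xi ** 2 for xi in x) + 2 * a * sum(x) + n * a ** 2

# Rule (09): sum of (x_i + y_i)^2 expanded term by term.
lhs_09 = sum((xi + yi) ** 2 for xi, yi in zip(x, y))
rhs_09 = (sum(xi ** 2 for xi in x)
          + 2 * sum(xi * yi for xi, yi in zip(x, y))
          + sum(yi ** 2 for yi in y))

# Rules (06) and (07): sum of squares is not the square of the sum.
sum_of_squares = sum(xi ** 2 for xi in x)
square_of_sum = sum(x) ** 2

print(lhs_03, rhs_03)
print(lhs_08, rhs_08)
print(lhs_09, rhs_09)
print(sum_of_squares, square_of_sum)
```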

◼ Descriptive Measure:
It may be required to describe the data using a single value, which is
called a descriptive measure of the data. This measure can be computed
using
i. Population Data
ii. Sample Data
A descriptive measure computed from population data is called a parameter.
A descriptive measure computed from sample data is called a statistic.
◼ Central Tendency:
Every data set has a tendency to group around or cluster around a certain data
point (value). This tendency is known as central tendency. This value is
usually the central value of a data set.
A value is said to be approximately the central value of a data set if 40%-60% of the
observations in the data set are greater than that value.
The central value is interpreted as the typical, approximate or representative
value of all values that are in the data set.
The central value is broadly known as the average value.
◼ Measure of Central Tendency:
A numerical measure of central tendency is called a Measure of Central Tendency. In other
words, finding the central value of the data is known as measuring central
tendency.
There are different measures of central tendency.
These are-
a. Arithmetic Mean or Mean
b. Median
c. Mode
d. Geometric Mean
e. Harmonic Mean.
◼ Arithmetic Mean (AM) or Mean:
The most popular measure of central tendency is the arithmetic mean (AM) or
simply the mean. The AM of a data set is obtained by adding all observations first
and then dividing the sum by the number of observations.
Let the variable of interest be x. Suppose that n observations were collected on
x and these are
x1, …, xi, …, xn
The AM, denoted by $\bar{x}$, is defined as
$$\bar{x} = \frac{x_1 + \dots + x_i + \dots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
#Example: Suppose that one is interested in the lifetime of an electronic
component produced by a company. For this purpose, a random sample of 10 components
was drawn and the lifetimes (in days) were found to be as follows-
123 116 122 110 140 120 125 111 118 117
Find the AM. Justify your answer. Interpret.
Solution:
The AM is
$$\bar{x} = \frac{123 + 116 + \dots + 117}{10} = \frac{1202}{10} = 120.2$$
Ascending order-
110 111 116 117 118 120 122 123 125 140
Since 40% of the observations are greater than 120.2, which is within the 40%-60% range, one may
treat it as a measure of central tendency. The approximate lifetime of an electronic component
produced by this company is 120.2 days.
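
The calculation and its justification can be reproduced with a few lines of Python. This is an illustrative sketch using the ten lifetimes from the example; the variable names are chosen for this example only.

```python
# Lifetimes (in days) of the 10 sampled components from the example.
lifetimes = [123, 116, 122, 110, 140, 120, 125, 111, 118, 117]

n = len(lifetimes)
mean = sum(lifetimes) / n                      # arithmetic mean

# Share of observations greater than the mean, used for the 40%-60% check.
share_above = sum(1 for x in lifetimes if x > mean) / n

print(f"AM = {mean}")
print(f"Share of observations above the AM = {share_above:.0%}")
```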

Lecture-09 Date:11/02/2020
#Remark-01: Suppose that data on a quantitative variable x are presented in a
simple (ungrouped) frequency distribution table with k classes. If $x_j$ is the value for the jth
(j=1, 2, 3, …, k) class and $f_j$ is the corresponding frequency, the mean of this data
set is
$$\bar{x} = \frac{\sum_{j=1}^{k} x_j f_j}{\sum_{j=1}^{k} f_j}$$

Class   Frequency
x_1     f_1
...     ...
x_j     f_j
...     ...
x_k     f_k

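#Example: A sample survey was conducted among 100 families to know the
family size and the data are given as
x (family size)   2    3    4    5    6
f (frequency)     15   25   20   15   25
Find $\bar{x}$.
Solution:
$$\bar{x} = \frac{(2 \times 15) + (3 \times 25) + (4 \times 20) + (5 \times 15) + (6 \times 25)}{15 + 25 + 20 + 15 + 25} = \frac{410}{100} = 4.1$$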
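
The frequency-table formula for the mean can be written as a small Python function. This is a sketch only; the function name `mean_from_frequencies` is an assumption made for this example, and the data are the family sizes and frequencies given above.

```python
def mean_from_frequencies(values, freqs):
    """AM from a simple frequency table: sum(x_j * f_j) / sum(f_j).

    Hypothetical helper written for this illustration.
    """
    total = sum(x * f for x, f in zip(values, freqs))
    return total / sum(freqs)


family_size = [2, 3, 4, 5, 6]
frequency = [15, 25, 20, 15, 25]

family_mean = mean_from_frequencies(family_size, frequency)
print(family_mean)
```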
#Remark-02: If data on a quantitative variable x are arranged in a grouped
frequency distribution, the approximate AM of this data set is
$$\bar{x} = \frac{\sum_{j=1}^{k} x_j f_j}{\sum_{j=1}^{k} f_j}$$
where $x_j$ is the midpoint (mid value) of the jth class and $f_j$ is its frequency.

#Example: Ages collected from a group of individuals are presented in the
following table-
Group       25-30  30-35  35-40  40-45  45-50  50-55
Frequency   3      9      15     12     7      4
Find the approximated AM of this data set.
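
One way to carry out this calculation is sketched below in Python: each class is replaced by its midpoint and the frequency-table formula is applied. The variable names are assumptions made for this illustration.

```python
# Class limits and frequencies from the age table above.
class_limits = [(25, 30), (30, 35), (35, 40), (40, 45), (45, 50), (50, 55)]
frequencies = [3, 9, 15, 12, 7, 4]

# Midpoint of each class: (L + U) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Approximate AM: sum(midpoint * frequency) / sum(frequency).
weighted_total = sum(m * f for m, f in zip(midpoints, frequencies))
approx_mean = weighted_total / sum(frequencies)

print(midpoints)
print(approx_mean)
```

The result is only an approximation of the AM, because every observation in a class is treated as if it were equal to the class midpoint.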

#Remark-03: Let there be n observations on a quantitative variable x and let these
values be x1, x2, …, xn. Then, the sum of deviations of the observations from the AM is
always zero.
Proof:
The AM is
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
The deviation of the ith (i=1,2, …, n) observation $x_i$ from the AM is defined as $x_i - \bar{x}$.
Therefore, the sum of deviations is
$$\sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \bar{x} = \sum_{i=1}^{n} x_i - n\bar{x} = n \frac{\sum_{i=1}^{n} x_i}{n} - n\bar{x} = n\bar{x} - n\bar{x} = 0.$$
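#Example: If $\sum_{i=1}^{n} (x_i - 3.5) = 0$, then
$\sum_{i=1}^{n} x_i - \sum_{i=1}^{n} 3.5 = 0$
Or, $\sum_{i=1}^{n} x_i - 3.5n = 0$
Or, $\frac{\sum_{i=1}^{n} x_i}{n} = 3.5$
That is, the AM of the data set is 3.5.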
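#Remark-04: The sum of squared deviations of observations from the AM is less
than the sum of squared deviations of observations from any other number.
Proof:
Let there be n observations on a quantitative variable x and let these values be
x1,… ,xi, …, xn.
The AM is
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
The squared deviation of the ith (i=1,…,n) observation $x_i$ from the AM $\bar{x}$ is $(x_i - \bar{x})^2$.
Let a be any other value such that $a \neq \bar{x}$. The squared deviation of the ith observation from a is $(x_i - a)^2$. The
sum of squared deviations of the observations from the AM is $\sum_{i=1}^{n} (x_i - \bar{x})^2$ and from a is
$\sum_{i=1}^{n} (x_i - a)^2$.
Now, writing $u = x_i - \bar{x}$ and $v = \bar{x} - a$,
$$\begin{aligned}
\sum_{i=1}^{n} (x_i - a)^2 &= \sum_{i=1}^{n} (x_i - \bar{x} + \bar{x} - a)^2 \\
&= \sum_{i=1}^{n} \left[ (x_i - \bar{x})^2 + 2 (x_i - \bar{x})(\bar{x} - a) + (\bar{x} - a)^2 \right] \\
&= \sum_{i=1}^{n} (x_i - \bar{x})^2 + 2 (\bar{x} - a) \sum_{i=1}^{n} (x_i - \bar{x}) + n (\bar{x} - a)^2 \\
&= \sum_{i=1}^{n} (x_i - \bar{x})^2 + 2 (\bar{x} - a) \times 0 + n (\bar{x} - a)^2 \\
&= \sum_{i=1}^{n} (x_i - \bar{x})^2 + n (\bar{x} - a)^2
\end{aligned}$$
Since $n (\bar{x} - a)^2 > 0$ as $a \neq \bar{x}$,
$$\sum_{i=1}^{n} (x_i - a)^2 > \sum_{i=1}^{n} (x_i - \bar{x})^2.$$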
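
Both properties of the AM (zero sum of deviations and minimum sum of squared deviations) can be illustrated numerically. The Python sketch below uses a small made-up data set and a helper `sum_sq_dev`, both assumed for this example.

```python
# Illustrative values only (assumed for this sketch).
x = [2, 4, 6, 8, 10]
n = len(x)
mean = sum(x) / n


def sum_sq_dev(data, a):
    """Sum of squared deviations of the observations from the value a."""
    return sum((xi - a) ** 2 for xi in data)


# Remark-03: the deviations from the AM add up to zero.
sum_dev = sum(xi - mean for xi in x)

# Remark-04: squared deviations from the AM versus from another value a = 5.
s_mean = sum_sq_dev(x, mean)
s_other = sum_sq_dev(x, 5)

print(sum_dev)            # deviations from the AM
print(s_mean, s_other)    # the first should be the smaller one
# Identity from the proof: S(a) = S(mean) + n * (mean - a)^2
print(s_mean + n * (mean - 5) ** 2)
```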

Lecture-10 Date:13/02/2020
◼ The Change of Origin (location) and Scale:
Let x be a variable with n observations x1,…,xi,…,xn. The change of origin (or
location) for x is the change of x to y=x+a, where a is any constant. Since
x is a variable, y is also a variable with n observations y1,…,yi,…,yn where
yi=xi+a ; i=1,…,n
The change of scale for x is the change of x to y=bx, where b is any constant. The n observations
on y are defined as y1,…,yi,…,yn where
yi=bxi ; i=1,…,n
The change of both origin and scale for x is the change of x to y=a+bx.
The n observations on y are defined as y1,…,yi,…,yn where
yi=a+bxi ; i=1,…,n
#Remark: The AM of a data set is dependent on the change of both origin
and scale.
Proof:
Let there be n observations on a variable x. Let a and b be any constants. After
changing both origin and scale, x becomes y=a+bx.
If xi (i=1,…,n) is the ith observation for x, then the ith observation for y will be yi=a+bxi ;
i=1,…,n.
Taking the summation on both sides,
$$\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} (a + b x_i)$$
Or, $$\sum_{i=1}^{n} y_i = na + b \sum_{i=1}^{n} x_i$$
Or, $$\frac{\sum_{i=1}^{n} y_i}{n} = a + b \frac{\sum_{i=1}^{n} x_i}{n}$$
Or, $$\bar{y} = a + b\bar{x}$$ [a = change of origin, b = change of scale]
It implies that the AM is dependent on the change of both origin and scale.
#Example: Suppose that the AM of a set of observations is 23. Find the AM if
a) 5 is added to each observation. Ans: 23 + 5 = 28
b) each observation is multiplied by 3. Ans: 3 × 23 = 69
c) 5 is added to each observation and the result is then multiplied by 3.
Here y = 3(x + 5) = 15 + 3x, so a = 15 and b = 3. Hence,
$$\bar{y} = a + b\bar{x} = 15 + 3 \times 23 = 84.$$
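
The effect of a change of origin and scale on the AM can be checked with a small Python sketch. The five observations below are made up so that their AM is 23, as in the example; they are an assumption for illustration only.

```python
def mean(data):
    """Arithmetic mean of a list of numbers."""
    return sum(data) / len(data)


# Illustrative observations with AM = 23 (assumed for this sketch).
x = [20, 22, 23, 24, 26]

shifted = [xi + 5 for xi in x]        # a) change of origin: y = x + 5
scaled = [3 * xi for xi in x]         # b) change of scale:  y = 3x
both = [3 * (xi + 5) for xi in x]     # c) y = 3(x + 5) = 15 + 3x

print(mean(x), mean(shifted), mean(scaled), mean(both))
```

In each case the AM of the transformed data equals the same transformation applied to the original AM.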
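#Problem: Let there be n observations in a data set whose AM is 43.4. If 52 and 60
are added to this data set, the AM rises to 47. What is the value of n?
[Hints:
$$\frac{\sum_{i=1}^{n} x_i}{n} = 43.4$$
Or, $$\sum_{i=1}^{n} x_i = n \times 43.4$$
After adding the two observations,
$$\frac{\sum_{i=1}^{n} x_i + 52 + 60}{n + 2} = 47$$
Or, 43.4n + 112 = 47n + 94
So, n = 5 ]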

#Remark: The AM of the sum (or difference) of two variables is the sum (or
difference) of the AMs of the two variables.
Proof:
Let X and Y be two given variables. Then, the transformed variable Z is
Z = X ± Y
Let there be n observations on X and Y and these are
X: x1,…,xi,…,xn
Y: y1,…,yi,…,yn
Therefore, Z has n observations which are
Z: z1,…,zi,…,zn
where
zi = xi ± yi ; [i=1,…,n]
Or, $$\sum_{i=1}^{n} z_i = \sum_{i=1}^{n} (x_i \pm y_i)$$
Or, $$\frac{\sum_{i=1}^{n} z_i}{n} = \frac{\sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i}{n}$$
Or, $$\bar{z} = \bar{x} \pm \bar{y}$$
It can be generalized as
$$\bar{z} = \bar{x}_1 \pm \bar{x}_2 \pm \dots \pm \bar{x}_k$$
#Example: The arithmetic means of expenditure (in thousand) of a student in the
different months of a year are given as 2.5, 2.2, 2.7, 2.3, 2.9, 3.1, 2.3, 3.0, 2.9, 2.4,
2.7, 3.2. These AMs were computed for 10 students. Find the AM of the yearly
expenditure of a student.
Solution:
AM of yearly expenditure = 2.5 + 2.2 + … + 3.2
= 32.2 (thousand). (Ans.)
#Problem: Suppose that there are 3 earning members in a family. The monthly
mean incomes (in thousand) of these 3 members are 45, 50 and 62. What is the monthly
mean income of this family?
Solution: For each month (January to December), the family income is FI = M1 + M2 + M3,
where M1, M2 and M3 are the incomes of the three members in that month. Since the AM of a
sum is the sum of the AMs,
Monthly mean FI = 45 + 50 + 62
= 157 (thousand).
Lecture-11 Date:16/02/2020

◼ Combined/Overall/Pooled AM:
Suppose that there are two groups of individuals with group sizes $n_1$ and $n_2$, respectively.
Also, suppose that the same variable x is observed on these two groups and it is found that the
AMs are $\bar{x}_1$ and $\bar{x}_2$, respectively. Then, the combined/overall/pooled AM (mean) of
these two groups is
$$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$
Proof:
Let $x_{11}, x_{12}, \dots, x_{1i}, \dots, x_{1n_1}$ be the observations for group (1) and $x_{21}, \dots, x_{2i}, \dots, x_{2n_2}$ be the
observations for group (2). The AMs of these two groups are
$$\bar{x}_1 = \frac{\sum_{i=1}^{n_1} x_{1i}}{n_1} \quad \text{and} \quad \bar{x}_2 = \frac{\sum_{i=1}^{n_2} x_{2i}}{n_2}$$
Then, $\sum_{i=1}^{n_1} x_{1i} = n_1 \bar{x}_1$ and $\sum_{i=1}^{n_2} x_{2i} = n_2 \bar{x}_2$.
There will be $(n_1 + n_2)$ observations after combining these two groups. If $\bar{x}_c$ is the AM of the
combined data, it can be shown that
$$\bar{x}_c = \frac{x_{11} + x_{12} + \dots + x_{1n_1} + x_{21} + \dots + x_{2n_2}}{n_1 + n_2} = \frac{\sum_{i=1}^{n_1} x_{1i} + \sum_{i=1}^{n_2} x_{2i}}{n_1 + n_2} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$
This formula can be extended to k groups as
$$\bar{x}_c = \frac{n_1 \bar{x}_1 + \dots + n_k \bar{x}_k}{n_1 + n_2 + \dots + n_k} = \frac{\sum_{j=1}^{k} n_j \bar{x}_j}{\sum_{j=1}^{k} n_j}$$
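#Example: A family consists of 6 males and 4 females. The AM of age is 48 years for the males and
37 years for the females. What is the AM of age of a member?
$$\bar{x}_c = \frac{(6 \times 48) + (4 \times 37)}{6 + 4} = \frac{436}{10} = 43.6 \text{ years}$$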
#Example: The mean birth-weight (in grams) in Bangladesh by economic status is given in the
following table.
Economic Status   Number of Births   Mean Birth-weight
Poor              3050               2107
Middle            2978               2339
Rich              3001               2518
Interpret the results. Also find the mean birth-weight for Bangladesh.
$$\bar{x}_c = \frac{(3050 \times 2107) + (2978 \times 2339) + (3001 \times 2518)}{3050 + 2978 + 3001} = \frac{20948410}{9029}$$
= 2320.13 gm [$\bar{x}_c$ = mean for the whole of Bangladesh]
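
The combined (pooled) AM for k groups can be sketched in Python as below. The function name `combined_mean` is an assumption made for this example; the group sizes and group means are those of the birth-weight table.

```python
def combined_mean(sizes, means):
    """Pooled AM: sum(n_j * mean_j) / sum(n_j) (hypothetical helper)."""
    total = sum(n * m for n, m in zip(sizes, means))
    return total / sum(sizes)


# Number of births and mean birth-weight (gm) for Poor, Middle, Rich.
births = [3050, 2978, 3001]
mean_weights = [2107, 2339, 2518]

pooled = combined_mean(births, mean_weights)
print(round(pooled, 2))
```

Because the three groups are of different sizes, the pooled value differs from the simple average of the three group means.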
#Remark: AM of a data set is always unique. That is a data set has one and only one
AM.
▪ Extreme Observations/Outliers:
An observation in a data set is said to be an extreme observation or outlier if this
observation is too small or too large compared to the other observations in the data set.
#Remark: The AM is adversely (badly) affected by the presence of an extreme value or
outlier. In the presence of an extreme value, the AM is no longer a central or
representative value of the data set.
#Example: Compute AM for 4,2,8,10,99.
AM = 24.6
▪ Weighted AM (WAM):
In computing the AM, it is assumed that each observation has equal importance. But, in
practice, it may happen that the relative importance of the observations is not the same. In this
case, one needs to calculate the weighted AM (WAM). The ‘weight’ stands for the relative
importance of an observation.
Let x1,…,xi,…,xn be n observations with corresponding weights w1,…,wi,…,wn.
The WAM of the observations on x is
$$\bar{x}_w = \frac{x_1 w_1 + \dots + x_i w_i + \dots + x_n w_n}{w_1 + \dots + w_i + \dots + w_n} = \frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i}$$
The WAM reduces to the AM if $w_i = c$ for all i:
$$\text{WAM} = \frac{\sum_{i=1}^{n} x_i c}{\sum_{i=1}^{n} c} = \frac{c \sum_{i=1}^{n} x_i}{nc} = \frac{\sum_{i=1}^{n} x_i}{n} = \text{AM}$$
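#Example: Suppose that a student took exams in five courses with different credit
hours. The scores and credits are given below. Find the average score of this student.
Score   Credit
70      3
60      4
75      3
80      2
90      2
Solution: Taking the credits as weights,
$$\text{WAM} = \frac{(70 \times 3) + (60 \times 4) + (75 \times 3) + (80 \times 2) + (90 \times 2)}{3 + 4 + 3 + 2 + 2} = \frac{1015}{14} = 72.5.$$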
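
The weighted AM can be computed with a short Python function. This is a sketch under stated assumptions: the function name `weighted_mean` is chosen for this example, and the scores and credits are those of the table above.

```python
def weighted_mean(values, weights):
    """WAM: sum(x_i * w_i) / sum(w_i) (hypothetical helper)."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


scores = [70, 60, 75, 80, 90]
credits = [3, 4, 3, 2, 2]

wam = weighted_mean(scores, credits)
print(wam)

# With equal weights the WAM reduces to the ordinary AM.
print(weighted_mean(scores, [1, 1, 1, 1, 1]), sum(scores) / len(scores))
```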
Lecture-12 Date:20/02/20
#Example: In course AMTH-107, a student is evaluated as follows. Class attendance is
5% of the total score, the AM of the scores in two in-courses is 25% of the total score and the final
exam is 70% of the total score. Suppose that a student got 85 out of 100 in attendance, 90 and 96
in the two in-courses, and 93 in the final exam. Find the average score of this student.
Solution:
             Score               Weight
Attendance   85                  5 (0.05)
In-course    (90 + 96)/2 = 93    25 (0.25)
Final        93                  70 (0.70)

$$\text{WAM} = \frac{(85 \times 5) + (93 \times 25) + (93 \times 70)}{5 + 25 + 70} = \frac{9260}{100} = 92.6$$

#Problem: Suppose that the grade points of a student in different courses are given as 4 (4
credits), 4 (3 credits), 3.74 (4 credits), 3.75 (3 credits), 3.5 (4 credits) and 4 (2
credits). Find the GPA.
Solution: Take the grade points as the observations and the credits as the weights.
Grade point   Weight (credit)

#Example: Suppose that Mr. X wants to buy a camera, giving emphasis to different
components as follows: image quality (50%), battery life (30%) and zoom range (20%). Companies
A and B produce cameras, and their scores on these components out of 10 are given below.
Company   Image   Battery life   Zoom
A         06      07             07
B         08      06             05
Which one will Mr. X buy?
Solution:
WAM Company A= 6.5
WAM Company B= 6.8
Mr. X will purchase the camera produced by company B.

◼ Median:
The median of a set of observations is a value that divides the ordered observations into two
equal parts. In other words, it is the middle value of a data set. If $\tilde{x}$ is the median of a data
set,
#Ordered observations < $\tilde{x}$ = #Ordered observations > $\tilde{x}$
◼ Finding the Median:
1. Arrange the observations in the data set in ascending or descending order.
2. Identify whether the total number of observations in the data set, n, is odd or even.
3. If n is odd, the median ($\tilde{x}$) will be
\[ \tilde{x} = \left(\frac{n+1}{2}\right)\text{th ordered observation} \]
If n is even,
\[ \tilde{x} = \text{AM of the } \left(\frac{n}{2}\right)\text{th and } \left(\frac{n}{2}+1\right)\text{th ordered observations} \]

#Problem: Ages of 7-member family are given below


12,7,2,34,17,21,19.
Find the Median age.
Solution:
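Ordered observations: 2, 7, 12, 17, 19, 21, 34
n = 7 (odd), so the median is the ((7+1)/2)th = 4th ordered observation, i.e. $\tilde{x}$ = 17.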
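The steps for finding the median can be sketched in Python as follows. The function name `median` is an illustrative choice; Python's built-in `statistics.median` follows the same rule.

```python
def median(observations):
    """Middle value for odd n; mean of the two middle values for even n."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # the ((n+1)/2)-th ordered observation, counting from 1
        return ordered[mid]
    # AM of the (n/2)-th and (n/2 + 1)-th ordered observations
    return (ordered[mid - 1] + ordered[mid]) / 2


ages = [12, 7, 2, 34, 17, 21, 19]
print(median(ages))                # 17
print(median([4, 2, 8, 10, 99]))   # 8
```

The second call uses the outlier example 4, 2, 8, 10, 99: the median stays at 8 while the AM is 24.6.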

◼ Properties of the Median:
a) The median is a unique value for a data set.
b) Unlike the mean, the median is not adversely affected by an extreme value
or outlier.
Example: 4,2,8,10,99
AM=24.6, Median=8
#Remark:
If the mean is approximately equal to the median, the AM is preferred over the median as a
measure of central tendency. This is because
a) In computing the AM, all observations contribute directly, whereas for the median only the
middle one or two observations contribute directly and all other observations
contribute indirectly.
b) Further mathematical operations with the AM are comparatively easier than those with the
median.
Lecture-13 Date:23/02/2020
◼ Cumulative Frequency:
Let there be k classes in a frequency distribution (ungrouped or grouped). Let $x_j$ be the
class value for the jth class and $f_j$ be the frequency of that class. Note that in a
frequency distribution table the $x_j$'s are arranged in ascending order. The cumulative frequency
(CF) of the jth class is the sum of the frequency of the jth class and the frequencies of all classes
that come before it. Mathematically,
\[ CF_j = f_1 + f_2 + \cdots + f_j = \sum_{i=1}^{j} f_i ; \quad j = 1, 2, \ldots, k \]
i.e. $CF_1 = f_1$, $CF_2 = f_1 + f_2$, …, $CF_k = f_1 + \cdots + f_k = n$.
Similarly, the cumulative relative frequency (CRF) can be defined as
\[ CRF_j = \sum_{i=1}^{j} RF_i \]
and the cumulative percentage frequency (CPF) as
\[ CPF_j = \sum_{i=1}^{j} PF_i \]

◼ Median: Ungrouped Frequency:


Let there be k classes in an ungrouped (simple) frequency distribution with class value $x_j$ for
the jth class (j=1,2,…,k) and frequency $f_j$. Let n be the total number of observations in the data
set, i.e. $n = \sum_{j=1}^{k} f_j$.

Class value   Frequency
x1            f1
⋮             ⋮
xj            fj
⋮             ⋮
xk            fk

When n is odd and $CF_{j-1} < \frac{n+1}{2} \le CF_j$, the median is $\tilde{x} = x_j$.
If n is even,
1. when $CF_{j-1} < \frac{n}{2} < CF_j$, the median is $\tilde{x} = \frac{x_j + x_j}{2} = x_j$;
2. when $\frac{n}{2} = CF_j$, the median is $\tilde{x} = \frac{x_j + x_{j+1}}{2}$.
◼ Median (Grouped Frequency):
The procedure for obtaining the median from an ungrouped frequency distribution can be extended
to a grouped frequency distribution to obtain an approximate median by treating $x_j$ as the mid-point
of the jth class, i.e. $x_j = \frac{L_j + U_j}{2}$; j=1,2,…,k, where $L_j$ and $U_j$ are the lower and upper limits of the jth class.
#Example: Consider the ‘AGE’ data,
Class 25-30 30-35 35-40 40-45 45-50 50-55
Frequency 03 09 15 12 07 04
Find the approximated median.
#Solution:
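Class   Mid-point (x)   Frequency (f)   CF   CRF = CF_j/n   CPF
25-30   27.5            03              03   0.06           06
30-35   32.5            09              12   0.24           24
35-40   37.5            15              27   0.54           54
40-45   42.5            12              39   0.78           78
45-50   47.5            07              46   0.92           92
50-55   52.5            04              50   1.00           100
n=50 (even), so n/2 = 25. Since the CF passes 25 in the class 35-40 (12 < 25 < 27), the approximate median is the mid-point of that class:
$\tilde{x}$ = 37.5 (Ans).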
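The cumulative-frequency rule for the median can be sketched in Python as below. The function name `median_from_frequency` is illustrative, and the sketch assumes that the class values are in ascending order and that every frequency is positive. Applied to the 'AGE' data with class mid-points, it gives 37.5.

```python
from itertools import accumulate


def median_from_frequency(class_values, frequencies):
    """Median of a frequency distribution using cumulative frequencies.

    Assumes class_values are sorted ascending and all frequencies are > 0.
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the distribution has no observations")
    cumulative = list(accumulate(frequencies))
    if n % 2 == 1:
        position = (n + 1) // 2
        for x, cf in zip(class_values, cumulative):
            if position <= cf:        # CF_{j-1} < (n+1)/2 <= CF_j
                return x
    half = n // 2
    for j, cf in enumerate(cumulative):
        if half < cf:                 # CF_{j-1} < n/2 < CF_j
            return class_values[j]
        if half == cf:                # n/2 == CF_j
            return (class_values[j] + class_values[j + 1]) / 2


midpoints = [27.5, 32.5, 37.5, 42.5, 47.5, 52.5]
age_frequencies = [3, 9, 15, 12, 7, 4]
print(median_from_frequency(midpoints, age_frequencies))  # 37.5
```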

◼ Ogive: Grouped Frequency Distribution:


The graphical representation of the CF, CRF or CPF of grouped frequency data is known
as an ogive. To draw an ogive, the upper limits of the classes are placed on the X-axis and
the CF/CRF/CPF on the Y-axis. A point is then plotted for each class
and adjacent points are connected with straight lines.
Example: 'AGE' data
[Figure: CPF ogive of the 'AGE' data; X-axis: upper limit, Y-axis: CPF]

◼ Determining Median: Graphically for Grouped Frequency:


The median for grouped frequency data can be determined approximately from the ogive. For this
purpose,
1. Draw a horizontal line parallel to the X-axis through the point n/2 (for a CF ogive),
0.5 (for a CRF ogive) or 50% (for a CPF ogive) on the Y-axis.
2. From the point of intersection of the line drawn in (1) and the ogive, draw a vertical line
parallel to the Y-axis.
3. The point at which the line drawn in (2) cuts the X-axis is the approximate
median.
Lecture-14 Date:23/02/2020
◼ Mode:
The mode of a set of observations is the observation that occurs most frequently. If all
observations are different, the data set is said to have no mode. A data set may have
more than one mode. When data are presented in a frequency distribution table, the class
with the highest frequency is called the modal class.
Since a data set may have no mode or may have more than one mode, the
mode is not that popular in practice.
The mode is widely used for qualitative variables.
#Example: Find the mode for following data sets.
a) 1 3 5 5 7 8 8 8 8 8 8
9 9 9 9 9 10 10 10 10 (Mode=8)
b) 20, 21, 20, 20, 34, 22, 24, 27, 27, 27 (Mode=20 and 27)
c) 10, 21, 33, 53, 54 (Mode=No Mode)
#Problem: Find the mean, median and mode for the data given below.
9 8 7 3 4 5 81 10
Which one do you think is appropriate as a measure of central tendency? Justify. (Mean?
Median? Mode?)

◼ Type of Variable:
         Quantitative   Qualitative   Remark
Mean     ✓              ✗             In the absence of outliers
Median   ✓              ✓*            In the presence of outliers
Mode     ✓              ✓             More appropriate for qualitative variables
*Ordinal qualitative variable with an odd number of observations.

◼ Geometric Mean:
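Suppose that x1,…,xi,…,xn are positive values of a variable of interest x. The
geometric mean (GM) of these observations is defined as
\[ GM = \sqrt[n]{x_1 \times \cdots \times x_i \times \cdots \times x_n} \]
Taking logarithms,
\[ \ln GM = \frac{1}{n}\left[\ln x_1 + \cdots + \ln x_i + \cdots + \ln x_n\right] \]
or,
\[ GM = \text{Antilog}\left[\frac{1}{n}\sum_{i=1}^{n} \ln x_i\right] = e^{\text{AM of the logs of the observations}} \]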


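For data presented in a frequency distribution table with class values $x_j$ and frequencies $f_j$,
\[ GM = \left(\prod_{j=1}^{k} x_j^{f_j}\right)^{1/n}, \quad n = \sum_{j=1}^{k} f_j \]
For a grouped frequency distribution, $x_j = \frac{L_j + U_j}{2}$; j=1,2,…,k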

#Remark:
The GM is most useful as a measure of central tendency when the observations of a data set
are not independent of each other.
Applications of the GM are most common in computing the average of rates of change/returns
expressed as percentages.
If 100r1%, 100r2%, …, 100rn% are n return/change rates, the GM of the rates is computed as
\[ GM = \sqrt[n]{(1 + r_1) \times \cdots \times (1 + r_i) \times \cdots \times (1 + r_n)} - 1 \]
This follows from compounding. With initial value $P_0$,
$P_1 = P_0(1+r_1)$
$P_2 = P_1(1+r_2) = P_0(1+r_1)(1+r_2)$, …
so that $P_n = P_0(1+r)^n$, where r is the GM of the rates.
#Example: Suppose that the price of a product was increased by 50% in the first year and
by 20% in the second year. Find the average rate of increase. Justify your
answer.
Solution:
Since the observations (rates) are dependent, the GM is appropriate. Given that
r1 = 50% = 50/100 = 0.5
r2 = 20% = 0.2
the GM is
\[ GM = \sqrt{(1 + 0.5) \times (1 + 0.2)} - 1 = 1.342 - 1 = 0.342 = 34.2\% \]
Justification:
Let the price go 100 → 150 → 180. Then
$P_n = P_0(1+r)^n$
or, $180 = 100(1+r)^2$
so, r = 0.342
It implies that the price of the product increased by 34.2% per year on average.
Lecture-15 Date:27/02/2020

#Example: Suppose that a stock grows by 10% in the first year, declines by 20% in the
second year and grows by 30% in the third year. What is the average return rate?
Hints:
\[ GM = \sqrt[3]{(1 + r_1)(1 + r_2)(1 + r_3)} - 1 \]
r1=0.1, r2=-0.2, r3=0.3
GM=0.046=4.6%
#Example: Suppose that a stock goes from $100 to $110 in the first year; then declines
to $80 in the second year and up to $150 in the third year. Find the average return rate.
Justify.
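Solution:
The return rate in the first year, r1 = (110−100)/100 = 10% = 0.1
The return rate in the second year, r2 = (80−110)/110 = −27.27% = −0.2727
The return rate in the third year, r3 = (150−80)/80 = 87.5% = 0.875
\[ GM = \sqrt[3]{(1 + 0.1)(1 - 0.2727)(1 + 0.875)} - 1 = 0.1447 = 14.47\% \]
Justification:
$P_3 = P_0(1+r)^3$
or, $150 = 100(1+r)^3$
or, $(1+r)^3 = 1.5$
or, $1+r = (1.5)^{1/3} = 1.1447$, so r = 0.1447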
#Example: Return rates of a stock for last five years are 90%,10%,20%, 30%, -90%
respectively. Find the average return rate.
GM=-0.2008=-20.08%
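The GM of return rates in the examples above can be reproduced with the short sketch below. The function name `geometric_mean_return` is an illustrative choice; it works with the growth factors (1 + r) through their logarithms, as in the "Antilog" form of the GM.

```python
import math


def geometric_mean_return(rates):
    """GM of rates r_1..r_n, i.e. (product of (1 + r_i))^(1/n) - 1."""
    growth_factors = [1 + r for r in rates]
    if not growth_factors:
        raise ValueError("at least one rate is required")
    if any(g <= 0 for g in growth_factors):
        raise ValueError("every growth factor 1 + r must be positive")
    mean_log = sum(math.log(g) for g in growth_factors) / len(growth_factors)
    return math.exp(mean_log) - 1


print(round(geometric_mean_return([0.5, 0.2]), 3))                  # 0.342
print(round(geometric_mean_return([0.1, -0.2, 0.3]), 3))            # 0.046
print(round(geometric_mean_return([0.9, 0.1, 0.2, 0.3, -0.9]), 4))  # -0.2008
```

Note that the AM of the last set of rates is positive (12%), while the GM shows that the stock actually lost value on average.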

◼ Harmonic Mean (HM):


The harmonic mean (HM) of a set of non-zero observations is defined as the reciprocal of
the arithmetic mean (AM) of the reciprocals of the individual observations.
Let x1, x2,…,xi,…,xn be n non-zero observations on the variable X.
Then the HM is
\[ HM = \frac{1}{\frac{1}{n}\left(\frac{1}{x_1} + \cdots + \frac{1}{x_i} + \cdots + \frac{1}{x_n}\right)} = \frac{1}{\frac{1}{n}\sum_{i=1}^{n}\frac{1}{x_i}} = \frac{n}{\sum_{i=1}^{n}\frac{1}{x_i}} \]

#Example: Find the HM for 4, 7, 12, 19, 25


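Solution:
\[ HM = \frac{5}{\frac{1}{4} + \frac{1}{7} + \frac{1}{12} + \frac{1}{19} + \frac{1}{25}} = 8.79 \]
AM=13.4, GM=10.98
AM>GM>HM
For frequency distribution data,
\[ HM = \frac{\sum_{j=1}^{k} f_j}{\sum_{j=1}^{k} \frac{f_j}{x_j}} \]
For a grouped frequency distribution,
\[ x_j = \frac{L_j + U_j}{2}; \quad j = 1, \ldots, k \]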

#Remark: The HM is generally used to find the average of values expressed as a ratio
of two different measuring units, where the numerator is constant, e.g. speed.

Ratio: a/b    Proportion: a/(a+b)


#Example: A driver travels from place 'A' to place 'B' at a speed of 20 km/h and returns
at a speed of 8 km/h. What is the average speed? Justify.
Solution:
The HM is appropriate for finding the average:
\[ HM = \frac{2}{\frac{1}{20} + \frac{1}{8}} = 11.43 \text{ km/h} \]
For comparison, AM = 14 and GM = $\sqrt{160}$ = 12.65.
Justification:
Let the distance between A and B be x km.
Time required to reach B from A = x/20
Time required to reach A from B = x/8
Total time required = x/20 + x/8
Then the average speed = total distance / total time
\[ = \frac{2x}{\frac{x}{20} + \frac{x}{8}} = \frac{80}{7} = 11.43 \text{ km/h} \]
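The speed example can be verified numerically. The sketch below assumes a one-way distance of 40 km purely for illustration (the result does not depend on that choice); `harmonic_mean` is an illustrative function name.

```python
def harmonic_mean(values):
    """HM = n / sum(1 / x_i) for non-zero observations."""
    if not values:
        raise ValueError("at least one observation is required")
    if any(v == 0 for v in values):
        raise ValueError("observations must be non-zero")
    return len(values) / sum(1 / v for v in values)


hm_speed = harmonic_mean([20, 8])

# Direct check: total distance divided by total time
distance_km = 40.0                      # assumed one-way distance
total_time_h = distance_km / 20 + distance_km / 8
average_speed = 2 * distance_km / total_time_h

print(round(hm_speed, 2), round(average_speed, 2))   # 11.43 11.43
print(round(harmonic_mean([4, 7, 12, 19, 25]), 2))   # 8.79
```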

#Example: To travel from A to B, Mr. X drives half of the way at a speed of 10 km/h
and the rest of the way at a speed of 15 km/h. What is the average speed?
\[ HM = \frac{2}{\frac{1}{10} + \frac{1}{15}} = 12 \text{ km/h} \]
Do the justification…
#Example: Three families have equal expenditure. The per capita expenditure of these
families are 4000, 5000 and 10000. Find the average per capita expenditure.
Solution:
These observations are of ratio type, where the numerator is constant. Therefore, the HM is
appropriate, which is
\[ HM = \frac{3}{\frac{1}{4000} + \frac{1}{5000} + \frac{1}{10000}} \approx 5454 \text{ tk.} \]
Justification:
Let x be the expenditure of a family.
Total members of family 1 = x/4000
Total members of family 2 = x/5000
Total members of family 3 = x/10000
Total expenditure of the 3 families = 3x
Total members of the 3 families = x/4000 + x/5000 + x/10000
\[ \text{Per capita expenditure} = \frac{3x}{\frac{x}{4000} + \frac{x}{5000} + \frac{x}{10000}} = \frac{3}{\frac{1}{4000} + \frac{1}{5000} + \frac{1}{10000}} = \frac{3}{0.00055} \approx 5454 \text{ tk.} \]
Lecture-16 Date:05/03/2020

➢ Theorem:
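For two positive values,
AM × HM = GM²
or, GM = $\sqrt{AM \times HM}$
Proof:
Let x1 and x2 be two positive numbers, i.e. $x_i > 0$, i = 1, 2.
Then the AM, GM and HM are
\[ AM = \frac{x_1 + x_2}{2}, \quad GM = \sqrt{x_1 x_2}, \quad HM = \frac{2}{\frac{1}{x_1} + \frac{1}{x_2}} = \frac{2 x_1 x_2}{x_1 + x_2} \]
Now,
\[ AM \times HM = \frac{x_1 + x_2}{2} \times \frac{2 x_1 x_2}{x_1 + x_2} = x_1 x_2 = \left(\sqrt{x_1 x_2}\right)^2 = GM^2 \]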


#Problem: For two positive values, AM=6.5 and GM=6. Find these values. Hence, find
the HM.
➢ Theorem:
For any set of positive values
AM≥GM≥HM
Proof:
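Assume that there are two positive numbers in the set, i.e. n=2.
When x1≠x2 (x1, x2 > 0):
\[ \begin{aligned} AM - GM &= \frac{x_1 + x_2}{2} - \sqrt{x_1 x_2} \\ &= \frac{1}{2}\left(x_1 + x_2 - 2\sqrt{x_1 x_2}\right) \\ &= \frac{1}{2}\left[(\sqrt{x_1})^2 + (\sqrt{x_2})^2 - 2\sqrt{x_1}\sqrt{x_2}\right] \\ &= \frac{1}{2}\left(\sqrt{x_1} - \sqrt{x_2}\right)^2 > 0 \end{aligned} \]
i.e. AM−GM>0
or, AM>GM…(1)
Again,
\[ \begin{aligned} GM - HM &= \sqrt{x_1 x_2} - \frac{2 x_1 x_2}{x_1 + x_2} \\ &= \frac{\sqrt{x_1 x_2}}{x_1 + x_2}\left[x_1 + x_2 - 2\sqrt{x_1}\sqrt{x_2}\right] \\ &= \frac{\sqrt{x_1 x_2}}{x_1 + x_2}\left[(\sqrt{x_1})^2 + (\sqrt{x_2})^2 - 2\sqrt{x_1}\sqrt{x_2}\right] \\ &= \frac{\sqrt{x_1 x_2}}{x_1 + x_2}\left(\sqrt{x_1} - \sqrt{x_2}\right)^2 > 0 \end{aligned} \]
i.e. GM−HM>0
or, GM>HM…(2)
From (1) and (2), AM>GM>HM…(3)
When x1=x2, let x1=x2=c (c>0). Then
\[ AM = \frac{c + c}{2} = c, \quad GM = \sqrt{c \times c} = c, \quad HM = \frac{2 \times c \times c}{c + c} = c \]
i.e. AM=GM=HM…(4)
From (3) and (4), for any positive x1 and x2,
AM≥GM≥HM
By the method of induction, it can be proved that AM≥GM≥HM for any integer n>0.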
◼ Shape of distribution of Data and Measure of Central Tendency:
For population data, if
o Mean (AM) = Median or AM≅Median
The shape of the distribution is symmetric about the mean.
o Mean > Median
The shape of the distribution is skewed to the right, or positively skewed.
o Mean < Median
The shape of the distribution is skewed to the left, or negatively skewed.
◼ Percentile:
For a set of observations or variable the pth percentile (0<p<100), denoted by xp , is a
value such that p percent or less of observations are less than xp and (100-p) percent or
less of observations are greater than xp.
◼ Quartiles:
Quartiles are values that divide the observations into four equal parts. There are three
quartiles and these are,
o First quartile (Q1) = 25th percentile
o Second quartile (Q2) = 50th percentile = median
o Third quartile (Q3) = 75th percentile
Q1<Q2<Q3

[Figure: ordered observations from min to max with Q1, Q2 and Q3 marked, annotated with 25%, 25% and 75%]
◼ Steps to Compute Percentile:
a) Arrange the observations in ascending order.
b) Compute k as k = (p/100) × n, where n is the size of the data set.
c) If k is not an integer, the pth percentile is the observation at the [ceiling(k)]th
position in the ordered arrangement.
d) If k is an integer, the pth percentile is the arithmetic mean of the values at the
kth and (k+1)th positions in the ordered arrangement.
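The percentile steps translate directly into code. The sketch below is one possible reading of the rule; the function name `percentile` is illustrative, and p is assumed to be a whole number so that "k is an integer" can be tested exactly with integer arithmetic.

```python
def percentile(observations, p):
    """p-th percentile (p a whole number, 0 < p < 100) by the steps above."""
    if not (isinstance(p, int) and 0 < p < 100):
        raise ValueError("p must be an integer with 0 < p < 100")
    ordered = sorted(observations)            # step (a)
    n = len(ordered)
    if n == 0:
        raise ValueError("the data set is empty")
    k, remainder = divmod(p * n, 100)         # step (b): k = p*n/100
    if remainder != 0:
        # step (c): ceiling(k)-th position, which is index k (0-based)
        return ordered[k]
    # step (d): mean of the k-th and (k+1)-th ordered values
    return (ordered[k - 1] + ordered[k]) / 2


scores = [19, 17, 14, 27, 23, 32, 40, 49, 59, 54, 80, 71]
print(percentile(scores, 25), percentile(scores, 50), percentile(scores, 75))
# 21.0 36.0 56.5
```

Calling the function with p = 25, 50 and 75 gives the three quartiles, so together with the minimum and maximum it yields a five-number summary.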
Lecture-17 Date:08/03/2020

#Problem: Compare the following results on the course AMTH-102 taken in 2017 and
2018.
Year   Percentiles        A+ (>79)   F (<40)
2017   X90=79, X18=40     10%        18%
2018   X82=79, X10=40     18%        10%

◼ Five-Number Summary:
One may describe a set of observations using five numbers which are
a) Minimum value;
b) First quartile;
c) Second quartile;
d) Third quartile;
e) Maximum value.
These numbers are called the five-number summary of a data set.
#Example: 12 customer satisfaction scores are given below:
19,17,14,27,23,32,40,49,59,54,80,71
Find the 5-number summary.
Solution:
Ordered observations: 14 17 19 23 27 32 40 49 54 59 71 80
Minimum = 14
Q1 = P25 = 21
Q2 = P50 = 36
Q3 = P75 = 56.5
Maximum = 80
◼ Shape of Distribution and 5-Number Summary:
o Symmetric
Q2-min=max-Q2
o Positive skewed/Right skewed
Q2-min<max-Q2
o Negative skewed/Left skewed
Q2-min>max-Q2
◼ Box-Plot: Graph of 5-Number Summary:
[Figure: box-plot with Q1, Q2 and Q3 marked]
◼ Dispersion:
The various measures of central tendency (averages) give us a single value to represent
all observations in a data set, but they fail to give an idea of how the
observations are spread. For this purpose, dispersion must be studied along with the averages.
Data Set 1      Data Set 2
34,35,36        10,35,60
x̄=35            x̄=35
*Dispersion means the scatter, spread or variability of observations among themselves in
a data set.
◼ Measure of Dispersion:
The measure of dispersion indicates how observations in a data set are scattered among
themselves. It conveys information regarding amount of variability present in the data set.
o When all observations are same, there is no dispersion.
o When observations are not same, but close together, the dispersion will be small.
o When observations are not same, but widely scattered, the dispersion will be
large.
There are two commonly used measures of dispersion. These are-
1. Range
2. Variance/Standard Deviation
◼ Range:
Range of a data set of observations is defined as difference between maximum value and
minimum value. That is, if R is range,
R = maximum value – minimum value
= X max -X min
#Remark: R≥0
Lecture-18 Date:10/03/2020

#Remark:
a) The value of range is always non-negative, i.e. R≥0
b) The unit of range is exactly same as the unit of observation.
c) Only two observations are directly used to compute range. For this reason, it is not
widely used as a measure of dispersion.
◼ Variance:
To compute variance of a set of observation the arithmetic mean (mean) is used as a
reference point.
o When observations are close to mean, the variance will be smaller.
o When observations are spread out from mean, the variance will be larger.
o When observations are same as mean value, the variance will be zero.
◼ Population Variance:
Let x1,x2,…,xi,…,xN be N population observations with population mean defined as
\[ \mu = \frac{1}{N}\sum_{i=1}^{N} x_i \]
The population variance, denoted by $\sigma^2$, is the arithmetic mean of the squared distances of the
observations from the population mean $\mu$.
That is,
\[ \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 \]
o It is always non-negative, i.e. 𝜎2≥0


o In practice, it is usually unknown
o It is a parameter
o The unit of 𝜎2 is the squared unit of value.
◼ Sample Variance:
Let x1,…,xi,…,xn be n sample observations on variable X with the sample mean defined as
\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \]
The sample variance, denoted by $S^2$, $\hat{\sigma}^2$ or $s^2$, is the sum of squared distances of the observations
from the sample mean $\bar{x}$, divided by (n−1). That is,
\[ S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \]
o It is always non-negative, i.e. S²≥0
o It can be used as an approximate value for the unknown $\sigma^2$
o It is a statistic
o The unit of S2 is the squared unit of observation.
#Remark:
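It can be shown that
\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \]
so that
\[ S^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right] \]
Proof:
\[ \begin{aligned} \sum_{i=1}^{n} (x_i - \bar{x})^2 &= \sum_{i=1}^{n} \left(x_i^2 - 2 x_i \bar{x} + \bar{x}^2\right) \\ &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\ &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\ &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n} x_i^2 - n\left(\frac{\sum_{i=1}^{n} x_i}{n}\right)^2 \\ &= \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \end{aligned} \]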

◼ Sample Variance: Frequency Distribution Data:


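Let there be k classes in a frequency distribution. Let $x_j$ be the jth class value and $f_j$ be the
corresponding frequency. Then the sample variance is
\[ S^2 = \frac{1}{n-1}\sum_{j=1}^{k} f_j (x_j - \bar{x})^2, \quad \text{where } \bar{x} = \frac{\sum_{j=1}^{k} f_j x_j}{\sum_{j=1}^{k} f_j} \]
\[ \phantom{S^2} = \frac{1}{n-1}\left[\sum_{j=1}^{k} f_j x_j^2 - \frac{\left(\sum_{j=1}^{k} f_j x_j\right)^2}{n}\right], \quad n = \sum_{j=1}^{k} f_j \]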

◼ Standard Deviation:
Variance is not an appropriate measure of dispersion when one wants to express the
measure of dispersion with the unit of value. In such case, standard deviation (SD) is
widely used, which is the square root of variance.
Population SD: 𝜎 = √𝜎2
Sample SD: S = √S2
#Problem: Compute variance and SD for the following data,
x 3 5 7 8 9
f 2 3 2 2 1
Solution:

x f x2 fx fx2
3 2 9 6 18
5 3 25 15 75
7 2 49 14 98
8 2 64 16 128
9 1 81 9 81
Total:10 Total:60 Total:400
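Using the column totals in the table (n = 10, Σfx = 60, Σfx² = 400), the computational formula can be evaluated as in the sketch below. The function name `sample_variance_from_frequency` is an illustrative choice.

```python
import math


def sample_variance_from_frequency(class_values, frequencies):
    """S^2 = [sum(f*x^2) - (sum(f*x))^2 / n] / (n - 1), with n = sum(f)."""
    n = sum(frequencies)
    if n < 2:
        raise ValueError("at least two observations are required")
    sum_fx = sum(f * x for x, f in zip(class_values, frequencies))
    sum_fx2 = sum(f * x * x for x, f in zip(class_values, frequencies))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


x_values = [3, 5, 7, 8, 9]
f_values = [2, 3, 2, 2, 1]

variance_value = sample_variance_from_frequency(x_values, f_values)
sd_value = math.sqrt(variance_value)
print(round(variance_value, 2), round(sd_value, 2))  # 4.44 2.11
```

Here S² = (400 − 60²/10)/9 = 40/9 ≈ 4.44 and S ≈ 2.11.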
Lecture-19 Date:19/03/2020
➢ Theorem:
Variance or Standard Deviation is independent of change of origin, but dependent on the
change of scale.
Proof:
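Let x1,…,xi,…,xn be n observations on a variable X. Then the sample variance is
\[ S_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \quad \text{where } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \]
After changing the origin and scale, the variable X becomes Y = a + bX, where a and b are any constants. The
observations on Y are y1,…,yi,…,yn with $y_i = a + b x_i$; i=1,2,…,n
The sample mean of Y is
\[ \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\sum_{i=1}^{n} (a + b x_i) = a + b\bar{x} \]
The sample variance of Y is
\[ S_y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{1}{n-1}\sum_{i=1}^{n} (a + b x_i - a - b\bar{x})^2 = \frac{b^2}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = b^2 S_x^2 \]
Therefore, the sample SD of Y is
\[ S_y = |b| S_x \]
That is, the variance and SD do not depend on a (change of origin) but do depend on b (change of scale).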
#Remark:
1) If Y = a + x ; Sy2 = Sx2 and Sy = Sx

2) If Y = -a + x ; Sy2 = Sx2 and Sy = Sx

3) If Y=bx; Sy2 = b2Sx2 and Sy = |𝑏|Sx

4) If Y=-bx; Sy2 = b2Sx2 and Sy = |𝑏|Sx

#Problem: The sample variance of a set of observations is 9. Find the sample variance and SD if
i. 10 is added to each observation. Ans: 9, 3
ii. 10 is subtracted from each observation. Ans: 9, 3
iii. each observation is multiplied by 2 and then 5 is subtracted. Ans: 36, 6
iv. 5 is subtracted from each observation and the result is divided by 3. Ans: 1, 1
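The answers above can be checked on a small made-up data set. The values 1, 4, 7 are an assumption chosen only because their sample variance is 9; the helper `transformed` is likewise illustrative. The sketch uses `variance` and `stdev` from Python's standard `statistics` module, both of which divide by n − 1.

```python
from statistics import stdev, variance

x = [1, 4, 7]   # hypothetical data with sample variance 9


def transformed(data, a, b):
    """Return y_i = a + b * x_i for every observation."""
    return [a + b * value for value in data]


cases = {
    "10 added": (10, 1),
    "10 subtracted": (-10, 1),
    "multiplied by 2, then 5 subtracted": (-5, 2),
    "5 subtracted, then divided by 3": (-5 / 3, 1 / 3),
}
for label, (a, b) in cases.items():
    y = transformed(x, a, b)
    print(label, round(variance(y), 6), round(stdev(y), 6))
```

In each case the variance equals b² × 9 and the SD equals |b| × 3, whatever the value of a.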
➢ Theorem:
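For a population of size 2, the population SD is half of the range.
Proof:
Let x1 and x2 be the two observations in the population, where x1>x2. The population mean is
𝜇 = ½ (x1+x2)
and the population variance is
\[ \begin{aligned} \sigma^2 &= \frac{1}{2}\sum_{i=1}^{2} (x_i - \mu)^2 = \frac{1}{2}\left[\left(x_1 - \frac{x_1 + x_2}{2}\right)^2 + \left(x_2 - \frac{x_1 + x_2}{2}\right)^2\right] \\ &= \frac{1}{2}\left[\left(\frac{x_1 - x_2}{2}\right)^2 + \left(\frac{x_2 - x_1}{2}\right)^2\right] = \frac{1}{2} \times 2\left(\frac{x_1 - x_2}{2}\right)^2 = \frac{R^2}{4} \end{aligned} \]
where R = x1 − x2 is the range. Therefore, $\sigma = R/2$.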
#Problem: If a population consists of 2 observations with mean 6 and variance 16. Find the
numbers.
➢ Theorem:
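If X and Y are two variables with sample variances $S_x^2$ and $S_y^2$, respectively, the sample
variance of Z=X+Y will be
\[ S_z^2 = S_x^2 + S_y^2 + 2\,Cov(X,Y), \quad \text{where } Cov(X,Y) = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \]
Proof:
Let x1,…,xi,…,xn and y1,…,yi,…,yn be n observations on X and Y, respectively. If z1,…,zi,…,zn are
the n observations on Z,
$z_i = x_i + y_i$ ; i=1,2,…,n
$\sum z_i = \sum (x_i + y_i)$
or, $\sum z_i = \sum x_i + \sum y_i$
or, $\bar{z} = \bar{x} + \bar{y}$
Now,
\[ \begin{aligned} S_z^2 &= \frac{1}{n-1}\sum (z_i - \bar{z})^2 = \frac{1}{n-1}\sum (x_i + y_i - \bar{x} - \bar{y})^2 = \frac{1}{n-1}\sum \left((x_i - \bar{x}) + (y_i - \bar{y})\right)^2 \\ &= \frac{1}{n-1}\sum \left((x_i - \bar{x})^2 + 2(x_i - \bar{x})(y_i - \bar{y}) + (y_i - \bar{y})^2\right) \\ &= \frac{1}{n-1}\sum (x_i - \bar{x})^2 + \frac{1}{n-1}\sum (y_i - \bar{y})^2 + 2\,\frac{1}{n-1}\sum (x_i - \bar{x})(y_i - \bar{y}) \\ &= S_x^2 + S_y^2 + 2\,Cov(X,Y) \end{aligned} \]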
◼ Co-efficient of Variation (CV):
Variance and SD are useful for measuring the variability/dispersion of a set of observations.
These measures are not appropriate for comparing the variability/dispersion of two or more data
sets of observations
1) on two or more different variables, as the measuring units of the variables are
different;
2) on the same variable, as the arithmetic means of the data sets may not be the same.
For this reason, the CV is widely used to compare the variability of two or more data sets.
Lecture-20 Date:15/03/2020
◼ Co-efficient of Variation:
The co-efficient of variation (CV) is a relative measure of dispersion that describes the
variability of the observations of a data set compared to the arithmetic mean. It is usually
expressed as a percentage. Let $\bar{x}$ and S be the mean and standard deviation of a data set,
respectively. Then, the CV is computed as $CV = \frac{S}{\bar{x}} \times 100\%$, provided that $\bar{x} > 0$. This
measure is unitless. Therefore, it can be used to compare the variability of two or more
data sets.
#Remark:
i. If a data set has a larger CV, it indicates that the data set is more variable and less stable
or less uniform.
ii. If a data set has a smaller CV, it indicates that the data set is less variable and more stable or
more uniform.
Data Set   Mean (x̄)   Standard Deviation (S)   CV = (S/x̄)×100%
1
2
3


#Example: Suppose that a company produces 75-gm soap and 150-gm soap. The standard
deviations of these soaps are 5 gm and 8 gm, respectively. Which soap has larger variability?
Data Set   x̄     S    CV = (S/x̄)×100%
75 gm      75    05   (5/75)×100% = 6.67%
150 gm     150   08   (8/150)×100% = 5.33%
The 75-gm soap has the larger CV and hence the larger relative variability.

#Example: Compare the variability


Sample-01 Sample-02
Age 25 year 11-year
Mean 145 pounds 80 pounds
SD 10 pounds 10 pounds
#Example: An investor analyzed mean returns and standard deviations over 5 years for three
companies in a stock exchange.
Company   Mean    SD       CV
A         5.47%   14.68%   268.37%
B         6.88%   21.31%   309.74%
C         7.16%   19.46%   271.79%
In which company should the investor invest?
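The CV values in the soap and stock examples can be recomputed with the sketch below. The function name `coefficient_of_variation` is an illustrative choice; it simply evaluates (S / mean) × 100.

```python
def coefficient_of_variation(mean, sd):
    """CV in percent: (sd / mean) * 100, defined here only for mean > 0."""
    if mean <= 0:
        raise ValueError("the mean must be positive")
    return sd / mean * 100


# Soap example: (mean, SD) in grams
print(round(coefficient_of_variation(75, 5), 2))    # 6.67
print(round(coefficient_of_variation(150, 8), 2))   # 5.33

# Stock example: (mean return, SD of return) in percent
companies = {"A": (5.47, 14.68), "B": (6.88, 21.31), "C": (7.16, 19.46)}
cv_by_company = {
    name: round(coefficient_of_variation(mean, sd), 2)
    for name, (mean, sd) in companies.items()
}
print(cv_by_company)  # {'A': 268.37, 'B': 309.74, 'C': 271.79}
```

On this measure company A has the smallest relative variability and company B the largest.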

◼ Inter-quartile Range (IQR):


The IQR reflects the variability among the middle 50% of the observations in a data set. If Q1
and Q3 are the first and third quartiles, respectively, the IQR is defined as
IQR = Q3-Q1
◼ Extreme Value/Outlier:
Mathematically, an observation x is said to be an extreme value or outlier if
x>Q3+(1.5×IQR)
or
x<Q1-(1.5×IQR)
Such an outlying observation is also known as a mild extreme value or mild outlier. An
observation is a strong outlier or strong extreme value if
x>Q3+(3.0×IQR)
or
x<Q1-(3.0×IQR)
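The fence rule can be sketched as a small classifier. Two things here are assumptions rather than part of the notes: the function name `classify_outlier`, and the multiplier for a strong outlier, which is taken to be the conventional 3.0 × IQR. The quartiles Q1 = 21 and Q3 = 56.5 come from the customer satisfaction example.

```python
def classify_outlier(x, q1, q3, mild_factor=1.5, strong_factor=3.0):
    """Classify x against the IQR fences (strong_factor = 3.0 is assumed)."""
    iqr = q3 - q1
    if x > q3 + strong_factor * iqr or x < q1 - strong_factor * iqr:
        return "strong outlier"
    if x > q3 + mild_factor * iqr or x < q1 - mild_factor * iqr:
        return "mild outlier"
    return "not an outlier"


q1, q3 = 21, 56.5          # IQR = 35.5
for value in (80, 120, 200):
    print(value, classify_outlier(value, q1, q3))
# 80 not an outlier
# 120 mild outlier
# 200 strong outlier
```

With these quartiles the mild fences are −32.25 and 109.75, so none of the 12 original scores is an outlier.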
Empirical Rule (Rule 68-95-99.7)

If the distribution of observations in a data set appears to be bell-shaped and symmetric
about the population mean, it is expected that approximately the middle

• 68% of the observations fall in the interval (μ − σ, μ + σ)
(within one standard deviation of the mean)

• 95% of the observations fall in the interval (μ − 2σ, μ + 2σ)
(within two standard deviations of the mean)

• 99.7% of the observations fall in the interval (μ − 3σ, μ + 3σ)
(within three standard deviations of the mean)

μ: Population Mean; σ: Population Standard Deviation


Example 1: A research was performed on the IQ scores of the employees of a
private firm. The scores are noted to be bell shaped symmetric. The mean of the
distribution is 100 and standard deviation is 15. Find the percentage of the scores
that fall
(a) between 70 and 130;
(b) above 100;
(c) between 85 and 130;
(d) less than 85.

Example 2: The scores of an entrance test for the high school pass-outs in a
particular year were bell shaped symmetric. If the mean and standard deviation
were 490 and 100, then

(a) What percentage students scored between 390 and 590 on this test?
(b) The score of a student was 795. What can you say about his
performance as compared to rest of the scores?

Example 3: If the average age of retirement for the entire population in a country
is 64 years and the distribution is bell shaped symmetric with a standard deviation
of 3.5 years, what is the approximate age range in which middle 95% of people
retire?
Practice Problem

Suppose that observations in a data set are distributed as bell-shaped symmetric


about mean 100 with standard deviation 10. Find the percentage of observations
that fall

(i) Between 80 and 100; Ans. 47.5%


(ii) Between 70 and 100; Ans. 49.85%
(iii) Between 100 and 120; Ans. 47.5%
(iv) Between 90 and 120; Ans. 81.5%
(v) Between 70 and 110; Ans. 83.85%
(vi) More than 110; and Ans. 16%
(vii) Less than 120. Ans. 97.5%
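The practice answers follow from a small lookup of the percentage of observations lying below μ + kσ for k = −3, …, 3, as implied by the 68-95-99.7 rule and symmetry. The sketch below encodes that lookup; the names `CUMULATIVE_PERCENT`, `percent_below` and `percent_between` are illustrative, and the bounds are assumed to sit exactly on a whole number of standard deviations from the mean.

```python
# Percent of observations below mean + k*sd under the empirical rule
CUMULATIVE_PERCENT = {-3: 0.15, -2: 2.5, -1: 16.0, 0: 50.0,
                      1: 84.0, 2: 97.5, 3: 99.85}


def percent_below(value, mean, sd):
    """Percent of observations below value (value must be mean + k*sd)."""
    z = (value - mean) / sd
    k = round(z)
    if abs(z - k) > 1e-9 or k not in CUMULATIVE_PERCENT:
        raise ValueError("value must lie 0, 1, 2 or 3 SDs from the mean")
    return CUMULATIVE_PERCENT[k]


def percent_between(lower, upper, mean, sd):
    return percent_below(upper, mean, sd) - percent_below(lower, mean, sd)


mean, sd = 100, 10
print(round(percent_between(80, 100, mean, sd), 2))   # 47.5
print(round(percent_between(70, 110, mean, sd), 2))   # 83.85
print(round(100 - percent_below(110, mean, sd), 2))   # 16.0
print(round(percent_below(120, mean, sd), 2))         # 97.5
```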
Remark:

• Empirical rule is usually used for population data

• This rule can also be used for large data set by replacing population mean
with sample mean and population standard deviation with sample standard
deviation

If the distribution of sample observations appears bell-shaped and symmetric, one may
expect that approximately the middle

• 68% of the observations fall in the interval (x̄ − s, x̄ + s)
(within one standard deviation of the sample mean)
• 95% of the observations fall in the interval (x̄ − 2s, x̄ + 2s)
(within two standard deviations of the sample mean)
• 99.7% of the observations fall in the interval (x̄ − 3s, x̄ + 3s)
(within three standard deviations of the sample mean)

x̄ = Sample Mean; s = Sample Standard Deviation


Chebyshev’s Theorem

• Similar to the Empirical Rule, but it can be applied even when the distribution
is not bell-shaped and symmetric.

• For any value k > 1, at least (minimum) $100\left(1 - \frac{1}{k^2}\right)\%$ of the population
measurements lie in the interval (μ − kσ, μ + kσ).

Example: A population data set of size N = 500 has mean μ = 5.2 and standard
deviation σ = 1.1. Find the minimum number of observations in the data set that
must lie:

(a) between 3 and 7.4;


(b) between 1.9 and 8.5.
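A possible way to work this example: the interval (3, 7.4) is μ ± 2σ and (1.9, 8.5) is μ ± 3σ, so k = 2 and k = 3. The sketch below evaluates the bound with exact fractions to avoid rounding surprises; the function name `chebyshev_minimum_count` is illustrative, and rounding the bound up to a whole observation is one reading of "minimum number".

```python
import math
from fractions import Fraction


def chebyshev_minimum_count(population_size, k):
    """Smallest whole number of observations within k SDs of the mean."""
    k = Fraction(k)
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    proportion = 1 - 1 / k ** 2          # at least this share lies inside
    return math.ceil(population_size * proportion)


print(chebyshev_minimum_count(500, 2))   # 375  (at least 75%)
print(chebyshev_minimum_count(500, 3))   # 445  (at least 88.9%)
```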
Correlation Analysis

• Correlation analysis is a bivariate analysis, which is used to quantify the
relationship between two quantitative variables (preferably two continuous
variables). In bivariate data, values on two variables are observed from the same
entity, i.e., from the same individual, object, place etc.

o Height and weight of an individual


o Heights of father and son from a family
o Time spent on internet and total marks obtained in exam of a
student
o Amount of income and expenditure of a family
o Amount of time spent on walking and blood glucose
concentration of an individual

• For the purpose of quantification, the Pearson correlation coefficient is widely
used to determine the linear relationship between two quantitative variables.

• The sample Pearson correlation coefficient is usually denoted by r; it is unitless
and its value ranges between −1 and +1, inclusive, i.e., −1 ≤ r ≤ +1.

• The value of r indicates both the direction and the strength of the linear relationship
between two quantitative variables.
o The sign of r indicates the direction of the relationship.
o The magnitude of r, i.e. |r|, indicates the strength of the relationship.

• The correlation coefficient is negative if higher values of one variable are
related to lower values of the other variable.

• The correlation coefficient is positive if higher values of one variable are
related to higher values of the other variable.

Rules for Interpreting the Magnitude of the Correlation Coefficient, |r|: 0 ≤ |r| ≤ +1

Magnitude           Type of Linear Relationship
|r| = 1             Perfect
0.9 ≤ |r| < 1       Very High
0.7 ≤ |r| < 0.9     High
0.5 ≤ |r| < 0.7     Moderate
0.3 ≤ |r| < 0.5     Low
0.0 < |r| < 0.3     Very Low
|r| = 0             No
Scatter Plot
The graphical representation of bivariate data is known as a scatter plot. It helps to
visualize the linear relationship between two quantitative variables measured on the same
set of individuals. Suppose that (X, Y) is a pair of two quantitative variables measured on
n independent individuals. Let $(x_1, y_1), \cdots, (x_i, y_i), \cdots, (x_n, y_n)$ be the n paired
observations. To draw the scatter plot, values of one variable are placed on the
X-axis and values of the other variable on the Y-axis; finally, a
point is plotted on the graph for each $(x_i, y_i)$, i = 1, ⋯, n.

Data Layout:

Value on X   Value on Y
x1           y1
x2           y2
⋮            ⋮
xn           yn
Scatter Plot:
[Figure 1: example scatter plot (suitable title). Source: Internet]
Scatter Plot and Correlation Coefficient, r:
[Figure: scatter plots for different values of r. Source: Book]
Pearson Correlation Coefficient, r:

Suppose that (X, Y) is a pair of two quantitative variables measured on n independent
individuals in a sample. Let $(x_1, y_1), \cdots, (x_i, y_i), \cdots, (x_n, y_n)$ be the n paired observations.
To determine the linear relationship between X and Y, Karl Pearson developed a formula
based on the n paired sample observations. The value obtained from this formula is known as the
Pearson correlation coefficient, denoted by r. The formula for r is given by

\[ r = \frac{Cov(X, Y)}{s_x s_y}, \]

where
o Cov(X, Y) = sample covariance between X and Y
\[ = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \]
o $s_x$ = sample standard deviation of X
\[ = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2} \]
o $s_y$ = sample standard deviation of Y
\[ = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2} \]

Therefore,
\[ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\left\{\sum_{i=1}^{n} (x_i - \bar{x})^2\right\}\left\{\sum_{i=1}^{n} (y_i - \bar{y})^2\right\}}}. \]
It can be shown that
\[ \begin{aligned} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) &= \sum_{i=1}^{n} (x_i y_i - x_i \bar{y} - \bar{x} y_i + \bar{x}\bar{y}) \\ &= \sum_{i=1}^{n} x_i y_i - \bar{y}\sum_{i=1}^{n} x_i - \bar{x}\sum_{i=1}^{n} y_i + n\bar{x}\bar{y} \\ &= \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} - n\bar{x}\bar{y} + n\bar{x}\bar{y} \\ &= \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} \end{aligned} \]
Again,
\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \]
\[ \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 \]

Therefore, the formula for r can be re-written as

\[ r = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sqrt{\left\{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right\}\left\{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right\}}}. \]

Layout for Calculation of r

Table #: Calculation for Pearson correlation coefficient
         x       y       x²       y²       xy
         x1      y1      x1²      y1²      x1y1
         x2      y2      x2²      y2²      x2y2
         ⋮       ⋮       ⋮        ⋮        ⋮
         xn      yn      xn²      yn²      xnyn
Total    Σxi     Σyi     Σxi²     Σyi²     Σxiyi
Remark:
o The above formula measures only the linear relationship between two quantitative
variables.
o If the relationship is not linear, it is not appropriate to use the above formula.

Problem:

Following data were obtained on the age and blood glucose concentration (BGC)
collected from 6 independent individuals.
(a) Draw the scatter plot and comment on it.
(b) Find the Pearson correlation coefficient between age and BGC.
Comment on the result.

Age 43 21 25 42 57 59
BGC 99 65 79 75 87 81
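Part (b) can be checked with the re-written (computational) formula for r. The sketch below is one way to do it; the function name `pearson_r` is illustrative, and it assumes that neither variable is constant (otherwise the denominator is zero).

```python
import math


def pearson_r(x, y):
    """r = (sum xy - n*xbar*ybar) / sqrt((sum x^2 - n*xbar^2)(sum y^2 - n*ybar^2))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum(a * b for a, b in zip(x, y)) - n * mean_x * mean_y
    s_xx = sum(a * a for a in x) - n * mean_x ** 2
    s_yy = sum(b * b for b in y) - n * mean_y ** 2
    return s_xy / math.sqrt(s_xx * s_yy)


age = [43, 21, 25, 42, 57, 59]
bgc = [99, 65, 79, 75, 87, 81]

r = pearson_r(age, bgc)
print(round(r, 2))  # 0.53
```

By the interpretation table for |r|, a value of about 0.53 indicates a moderate positive linear relationship between age and BGC in this sample.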
Online Class 2
Sunday, July 12, 2020 8:54 PM

Theorem
The Pearson correlation coefficient, r, always lies between −1 and +1,
i.e. −1 ≤ r ≤ +1.

Proof:
Suppose that n paired observations $(x_1, y_1), \cdots, (x_i, y_i), \cdots, (x_n, y_n)$ are
observed on two quantitative variables (X, Y). By definition, the Pearson
correlation coefficient is given by

\[ r = \frac{Cov(X, Y)}{s_x s_y}, \]

where
\[ Cov(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}), \quad s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}, \quad s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2}. \]

Therefore, one may write

\[ r = \frac{1}{(n-1) s_x s_y}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n-1}\sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right) \]

Let $u_i = \frac{x_i - \bar{x}}{s_x}$ and $v_i = \frac{y_i - \bar{y}}{s_y}$. Therefore,

\[ r = \frac{1}{n-1}\sum_{i=1}^{n} u_i v_i. \]

It can be shown that

\[ \sum_{i=1}^{n} u_i^2 = \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)^2 = \frac{1}{s_x^2}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{(n-1) s_x^2}{s_x^2} = (n-1) \]

and

\[ \sum_{i=1}^{n} v_i^2 = \sum_{i=1}^{n} \left(\frac{y_i - \bar{y}}{s_y}\right)^2 = \frac{1}{s_y^2}\sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{(n-1) s_y^2}{s_y^2} = (n-1) \]

Assume that $w_i = u_i \pm v_i$. Then

\[ \begin{aligned} & \sum_{i=1}^{n} w_i^2 \ge 0 \\ \Rightarrow\; & \sum_{i=1}^{n} (u_i \pm v_i)^2 \ge 0 \\ \Rightarrow\; & \sum_{i=1}^{n} u_i^2 + \sum_{i=1}^{n} v_i^2 \pm 2\sum_{i=1}^{n} u_i v_i \ge 0 \\ \Rightarrow\; & (n-1) + (n-1) \pm 2(n-1) r \ge 0 \\ \Rightarrow\; & 1 + 1 \pm 2r \ge 0 \\ \Rightarrow\; & 1 \pm r \ge 0 \end{aligned} \]

Therefore, from 1 + r ≥ 0 one gets r ≥ −1. Again, from 1 − r ≥ 0
one gets r ≤ +1. Hence, in general,

−1 ≤ r ≤ +1.
Problem:

Let there be n paired observations $(x_1, y_1), \cdots, (x_i, y_i), \cdots, (x_n, y_n)$
observed on two quantitative variables (X, Y), where $y_i = a + b x_i$, a
being any constant and b > 0. Find the Pearson correlation coefficient
between X and Y.

Solution:

Given that $y_i = a + b x_i$, i = 1, ⋯, n. Therefore, using the rule of
changing the location and scale, one may express the sample mean ($\bar{y}$),
variance ($s_y^2$) and standard deviation ($s_y$) of Y as

\[ \bar{y} = a + b\bar{x}, \quad s_y^2 = b^2 s_x^2, \quad \text{and} \quad s_y = |b| s_x = b s_x \text{ as } b > 0. \]

By definition, the Pearson correlation coefficient is given by

\[ r = \frac{Cov(X, Y)}{s_x s_y}, \quad \text{where } Cov(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}). \]

Then,
\[ Cov(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(a + b x_i - a - b\bar{x}) = \frac{b}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = b s_x^2. \]
Therefore,

\[ r = \frac{b s_x^2}{s_x (b s_x)} = \frac{b s_x^2}{b s_x^2} = +1. \]

That is, there exists a perfect positive correlation between X and Y.

Problem:

Let there be $n$ paired observations $(x_1, y_1), \cdots, (x_i, y_i), \cdots, (x_n, y_n)$
observed on two quantitative variables $(X, Y)$, where $y_i = a + b x_i$, $a$
being any constant and $b < 0$. Find the Pearson correlation coefficient
between $X$ and $Y$.

Solution:

Given that $y_i = a + b x_i$, $i = 1, \cdots, n$. Therefore, using the rule of
changing the location and scale, one may express the sample mean ($\bar{y}$),
variance ($s_y^2$), and standard deviation ($s_y$) of $Y$ as

\[ \bar{y} = a + b\bar{x}, \]

\[ s_y^2 = b^2 s_x^2, \quad \text{and} \]

\[ s_y = |b| s_x = -b s_x \quad \text{as } b < 0. \]

By definition, the Pearson correlation coefficient is given by

\[ r = \frac{\mathrm{Cov}(X, Y)}{s_x s_y}, \]

where $\mathrm{Cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$.

Then,

\[ \mathrm{Cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(a + b x_i - a - b\bar{x}) = \frac{b}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = b s_x^2. \]

Therefore,

\[ r = \frac{b s_x^2}{s_x (-b s_x)} = \frac{b s_x^2}{-b s_x^2} = -1. \]

That is, there exists a perfect negative correlation between $X$ and $Y$.
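
The two results above, and the bound on $r$, can be checked numerically. The following is a minimal sketch, not part of the lecture: the function name `pearson_r` and the data values are illustrative assumptions, and the formula is the sample definition with $n - 1$ divisors used in these notes.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation r = Cov(X, Y) / (s_x * s_y), with n - 1 divisors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)

# Illustrative x values (assumed, not taken from the lecture)
xs = [1.0, 2.0, 4.0, 7.0, 11.0]

r_pos = pearson_r(xs, [3 + 2 * x for x in xs])    # y = a + b x with b > 0
r_neg = pearson_r(xs, [3 - 2 * x for x in xs])    # y = a + b x with b < 0
r_mid = pearson_r(xs, [2.0, 1.0, 5.0, 4.0, 9.0])  # points not on a straight line

print(r_pos, r_neg, r_mid)
```

Up to floating-point rounding, the first two values are $+1$ and $-1$, and the third lies strictly between them.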
Simple Linear Regression Model

Regression analysis is a statistical technique to study the dependence of one
random variable (called the dependent variable) on one or more other
variables (called independent variables).

• The dependent random variable is also known as the outcome or response variable.

• An independent variable is also known as an explanatory, predictor, covariate,
or regressor variable.

Example:

1. The amount of sale of a product may depend on the amount of money spent on
advertising the product.

2. The height of a son may depend on the height of his father.

3. The blood glucose concentration of an individual may depend on his/her age.

4. The GPA of a student may depend on the amount of time spent on study.


Purpose of Regression Analysis

1. To examine the relationship between dependent and independent variables.

2. To determine the change in the mean of the response variable for a
change in the independent variables.

3. To predict the mean of the response variable for given values of the independent
variables.

Remark
• A regression model is a mathematical formulation that describes how a
dependent random variable is connected with one or more independent
variables.

• If one and only one independent variable is considered in the regression
model, the model is called a simple regression model. If more than one
independent variable is considered, the model is known as a multiple
regression model.

• If there exists a linear relationship between the dependent variable and
the set of independent variables, the regression model is known as a linear
regression model.

• In a linear regression model, the dependent variable is always a
quantitative continuous random variable.
Simple Linear Regression Model

In the simple linear regression model, the dependent variable is a continuous random
variable and one and only one quantitative independent variable is considered. It is
assumed that there is a linear relationship between the dependent and independent
variables. The independent variable is treated as non-random, so its values are
assumed to be exactly known in advance.

Let $(Y, X)$ be a paired variable, where $Y$ is the dependent random variable and $X$ is
the independent variable. Let $\mu$ be the unknown population mean (i.e. a
parameter) of the dependent variable $Y$. Under the linear regression model, it is
assumed that for a given value of $X$, say $x$, the parameter $\mu$ can be expressed as

\[ \mu = \alpha + \beta x, \tag{1} \]

where $\alpha$ and $\beta$ are the unknown regression parameters. More specifically, $\alpha$
is called the intercept of the model and $\beta$ is called the regression parameter (the
slope). The model given in (1) is known as a mathematical or deterministic model,
because $\mu$ always takes the single value $\alpha + \beta x$ for a given value $x$ of the
independent variable.

Interpretation of Regression Parameters

• The intercept $\alpha$ is the population mean of the dependent variable $Y$
when $x = 0$. In many applications $x = 0$ is not a meaningful value, so $\alpha$ may
have no physical interpretation, but it is necessary to include $\alpha$ to fully
specify the regression model.
• The regression parameter $\beta$ measures the change (amount of increase or
decrease) in the population mean of the dependent variable $Y$ for a one-unit
increase in the value of the independent variable $x$.

o When $\beta > 0$, the mean value of $Y$ increases by $\beta$ units for a one-unit
increase in $x$. That is, the mean value of $Y$ increases as the value of $x$
increases. In other words, there is a positive correlation
between the dependent and independent variables.

o When $\beta < 0$, the mean value of $Y$ decreases by $|\beta|$ units for a one-unit
increase in $x$. That is, the mean value of $Y$ decreases as the value of $x$
increases. In other words, there is a negative correlation
between the dependent and independent variables.

o For a $k$-unit increase in the value of $x$, the population mean of $Y$ changes
by $k\beta$ units.

In practice, it is almost impossible to observe the population mean value of the
dependent variable $Y$. The observed value of $Y$, say $y$, for a given value $x$ tends to
deviate from the mean $\mu$. Let $\epsilon$ be the deviation between the observed value $y$ and
the mean $\mu$, i.e. $\epsilon = y - \mu$. Therefore, one can express the observed value $y$ as

\[ y = \alpha + \beta x + \epsilon, \tag{2} \]

since $\mu = \alpha + \beta x$. The term $\epsilon$ is also called the random error associated with the
response $y$. The model given in (2) is known as a probabilistic model.
Remark

• In the simple linear regression model, the regression parameters $\alpha$ and $\beta$ are
unknown. Therefore, to study the model, it is necessary
to quantify (estimate or approximate) these unknown parameters using the
available data on $(Y, X)$. This procedure is known as fitting a simple linear
regression model.

• Suppose that $n$ independent sample paired observations were obtained on $(Y, X)$
and these are $(y_1, x_1), \cdots, (y_i, x_i), \cdots, (y_n, x_n)$.

• Ordinary Least Squares (OLS) is one of the estimation techniques that can be
used to quantify the unknown parameters using available paired data.

Note
1. Estimate: An estimate is a function of sample observations (i.e. a statistic) that
provides an approximate value for an unknown parameter.

2. Estimation: Estimation is a statistical technique to obtain estimates for
unknown parameters.
Ordinary Least Squares (OLS) Estimation Technique

Suppose that $n$ independent sample paired observations were collected on $(Y, X)$ to
obtain estimates for the regression parameters $\alpha$ and $\beta$. Let $\hat{\alpha}$ and $\hat{\beta}$ be the
estimates of $\alpha$ and $\beta$, respectively. Then, the estimate of the population mean of $Y$ can
be obtained as

\[ \hat{\mu} = \hat{\alpha} + \hat{\beta} x, \]

which is also known as the predicted value of the response, denoted by $\hat{y}$, i.e. $\hat{\mu} = \hat{y}$
and

\[ \hat{y} = \hat{\alpha} + \hat{\beta} x. \]

Hence, the estimated random error, denoted by $\hat{\epsilon}$, is defined as

\[ \hat{\epsilon} = y - \hat{y} = y - \hat{\mu} = y - \hat{\alpha} - \hat{\beta} x, \]

which is also called the residual.

Let $(y_1, x_1), \cdots, (y_i, x_i), \cdots, (y_n, x_n)$ be the $n$ independent sample paired
observations. Following equation (2), the $i$-th ($i = 1, \cdots, n$) response can be
expressed as

\[ y_i = \alpha + \beta x_i + \epsilon_i \]

and the corresponding estimated random error, or residual, is expressed as

\[ \hat{\epsilon}_i = y_i - \hat{\alpha} - \hat{\beta} x_i. \]

In the ordinary least squares (OLS) estimation technique, the regression parameters
are estimated in such a way that the sum of squared residuals is as small as
possible. To obtain the OLS estimates $\hat{\alpha}$ and $\hat{\beta}$, the sum of squares of
residuals, i.e. $\sum_{i=1}^{n} \hat{\epsilon}_i^2$, is minimized with respect to $\hat{\alpha}$ and $\hat{\beta}$. That is,

\[ (\hat{\alpha}, \hat{\beta}) = \arg\min_{(\hat{\alpha}, \hat{\beta})} \sum_{i=1}^{n} \hat{\epsilon}_i^2 = \arg\min_{(\hat{\alpha}, \hat{\beta})} \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i)^2. \]

Note that the values of $\hat{\alpha}$ and $\hat{\beta}$ can be obtained numerically from the above
expression. To express $\hat{\alpha}$ and $\hat{\beta}$ as functions of the sample observations, one has to
solve the following estimating equations simultaneously.

\[ \frac{\partial}{\partial \hat{\alpha}} \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i)^2 = 0 \tag{3} \]

\[ \frac{\partial}{\partial \hat{\beta}} \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i)^2 = 0. \tag{4} \]

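From equation (3), one may obtain

\[ -2 \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \]

\[ \Rightarrow \sum_{i=1}^{n} y_i - n\hat{\alpha} - \hat{\beta} \sum_{i=1}^{n} x_i = 0 \]

\[ \Rightarrow n\hat{\alpha} = \sum_{i=1}^{n} y_i - \hat{\beta} \sum_{i=1}^{n} x_i \]

\[ \Rightarrow \hat{\alpha} = \frac{1}{n} \sum_{i=1}^{n} y_i - \frac{\hat{\beta}}{n} \sum_{i=1}^{n} x_i \]

\[ \Rightarrow \hat{\alpha} = \bar{y} - \hat{\beta} \bar{x}. \tag{5} \]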


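Again, from equation (4), one may get

\[ -2 \sum_{i=1}^{n} x_i (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \]

\[ \Rightarrow \sum_{i=1}^{n} x_i y_i - \hat{\alpha} \sum_{i=1}^{n} x_i - \hat{\beta} \sum_{i=1}^{n} x_i^2 = 0 \]

\[ \Rightarrow \sum_{i=1}^{n} x_i y_i - n\bar{x}\hat{\alpha} - \hat{\beta} \sum_{i=1}^{n} x_i^2 = 0 \]

\[ \Rightarrow \sum_{i=1}^{n} x_i y_i - n\bar{x}(\bar{y} - \hat{\beta}\bar{x}) - \hat{\beta} \sum_{i=1}^{n} x_i^2 = 0 \quad \text{[using equation (5)]} \]

\[ \Rightarrow \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} + \hat{\beta} n \bar{x}^2 - \hat{\beta} \sum_{i=1}^{n} x_i^2 = 0 \]

\[ \Rightarrow \hat{\beta} \sum_{i=1}^{n} x_i^2 - \hat{\beta} n \bar{x}^2 = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} \]

\[ \Rightarrow \hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}. \tag{6} \]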

That is, the OLS estimates of $\alpha$ and $\beta$ are

\[ \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} \quad \text{and} \quad \hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}, \]

respectively.

Layout for Calculation of $\hat{\alpha}$ and $\hat{\beta}$

Table #: Calculation for regression parameters

         $x$          $y$          $x^2$          $y^2$          $xy$
         $x_1$        $y_1$        $x_1^2$        $y_1^2$        $x_1 y_1$
         $x_2$        $y_2$        $x_2^2$        $y_2^2$        $x_2 y_2$
         ⋮            ⋮            ⋮              ⋮              ⋮
         $x_n$        $y_n$        $x_n^2$        $y_n^2$        $x_n y_n$
Total    $\sum x_i$   $\sum y_i$   $\sum x_i^2$   $\sum y_i^2$   $\sum x_i y_i$

(All sums run over $i = 1, \cdots, n$.)
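
As a rough illustration of formulas (5) and (6), the column totals of the layout above can be turned into the two estimates with a few lines of Python. This is only a sketch: the function name `ols_fit` and both small data sets are assumptions made for illustration, not part of the lecture.

```python
def ols_fit(xs, ys):
    """Return (alpha_hat, beta_hat) from the closed-form OLS formulas (5) and (6)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # total of the xy column
    sum_x2 = sum(x * x for x in xs)              # total of the x^2 column
    beta_hat = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)  # equation (6)
    alpha_hat = y_bar - beta_hat * x_bar                                 # equation (5)
    return alpha_hat, beta_hat

# Illustrative data lying exactly on the line y = 1 + 2x
alpha_hat, beta_hat = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(alpha_hat, beta_hat)

# Illustrative noisy data: the residuals satisfy the two estimating equations
xs2, ys2 = [1, 2, 3, 4, 5], [2, 1, 4, 3, 6]
a2, b2 = ols_fit(xs2, ys2)
residuals = [y - a2 - b2 * x for x, y in zip(xs2, ys2)]
sum_resid = sum(residuals)                                 # equation (3): close to 0
sum_x_resid = sum(x * e for x, e in zip(xs2, residuals))   # equation (4): close to 0
print(sum_resid, sum_x_resid)
```

The second check reflects why (3) and (4) are called estimating equations: at the OLS solution, the residuals sum to zero and are uncorrelated with the $x$ values, up to rounding.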
Problem

The following data on age (in years) and blood glucose concentration (BGC)
were collected from 6 independent individuals.

Age   43   21   25   42   57   59

BGC   99   65   79   75   87   81

Identify the dependent and independent variables. Fit a simple linear regression
model. Interpret the results. Also,

1. Predict the value of BGC when the age of an individual is 45 years.

2. Compare the BGC between individuals of ages
(i) 35 years and 30 years
(ii) 40 years and 48 years.

Solution:

In the given problem, the dependent variable is blood glucose concentration (BGC)
and the independent variable is the age of the individual, i.e. $Y$ = BGC and $X$ = Age.
Note that both dependent and independent variables are continuous variables.
Under a simple linear regression model, the observed value of $Y$ can be expressed
as

\[ y = \alpha + \beta x + \epsilon \]

and the predicted value of the dependent variable is

\[ \hat{y} = \hat{\alpha} + \hat{\beta} x, \]

where the OLS estimates of the regression parameters $\alpha$ and $\beta$ are

\[ \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} \quad \text{and} \quad \hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}, \]

respectively.

Table 1: Calculation of regression parameters

         $x$     $y$     $x^2$     $y^2$     $xy$
         43      99      1849      9801      4257
         21      65       441      4225      1365
         25      79       625      6241      1975
         42      75      1764      5625      3150
         57      87      3249      7569      4959
         59      81      3481      6561      4779
Total   247     486     11409     40022     20485

Here, $n = 6$ and

\[ \bar{x} = \frac{247}{6} = 41.17; \quad \bar{y} = \frac{486}{6} = 81.0; \quad \sum_{i=1}^{6} x_i y_i = 20485; \quad \sum_{i=1}^{6} x_i^2 = 11409. \]

Therefore,

\[ \hat{\beta} = \frac{20485 - 6 \times 41.17 \times 81.0}{11409 - 6 \times 41.17^2} = 0.384 \]

and $\hat{\alpha} = 81.0 - 0.384 \times 41.17 = 65.191$.

Interpretation of Results

At age zero ($x = 0$), the fitted mean BGC is approximately 65.191 units. Since age
zero lies far outside the observed ages (21 to 59 years), this intercept has no
practical interpretation. For a one-unit (one-year) increase in the age of an individual, the mean
BGC increases by approximately 0.384 units. It indicates that there is a positive
correlation between BGC and age of individuals.

The predicted value of the response variable (also known as the fitted linear regression
model) is

\[ \hat{y} = 65.191 + 0.384 \times x. \]

The predicted value of BGC of an individual of age 45 years is

\[ \hat{y} = 65.191 + 0.384 \times 45 = 82.471. \]

It indicates that the mean BGC value for individuals of age 45 years is
approximately 82.471 units.
Comparison:

(i) The mean BGC of individuals of age 35 years is approximately
$(35 - 30) \times 0.384 = 1.92$ units higher than that of individuals of age
30 years.

(ii) Individuals of age 40 years have approximately $(48 - 40) \times 0.384 = 3.072$
units lower mean BGC compared to individuals of age 48 years.
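
The hand calculation for the BGC problem can be cross-checked with a short script. This is a sketch using only the six data pairs from the problem; the variable names and the helper `predict` are illustrative assumptions. Because the script keeps the unrounded mean $\bar{x} = 247/6$ rather than 41.17, its slope comes out near 0.385 instead of the hand-computed 0.384, and the intercept and predictions shift slightly in the same way.

```python
ages = [43, 21, 25, 42, 57, 59]   # x: age in years (from the problem)
bgc = [99, 65, 79, 75, 87, 81]    # y: blood glucose concentration (from the problem)

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(bgc) / n
sum_xy = sum(x * y for x, y in zip(ages, bgc))
sum_x2 = sum(x * x for x in ages)

# OLS estimates from the closed-form formulas
beta_hat = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
alpha_hat = y_bar - beta_hat * x_bar

def predict(age):
    """Estimated mean BGC at a given age under the fitted line."""
    return alpha_hat + beta_hat * age

pred_45 = predict(45)
diff_35_vs_30 = predict(35) - predict(30)   # equals 5 * beta_hat
diff_40_vs_48 = predict(40) - predict(48)   # equals -8 * beta_hat

print(round(beta_hat, 3), round(alpha_hat, 3))
print(round(pred_45, 3), round(diff_35_vs_30, 3), round(diff_40_vs_48, 3))
```

Note that the comparisons depend only on the slope: the intercept cancels when two predicted means are subtracted.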
Probability

Example

Probability is frequently used in everyday communication.

• A physician says that a patient has a 50% chance of surviving a certain
operation.
• There is a 75% chance that it will rain tomorrow.
• A team has 25% chance of winning the next game.
• The chance of having flood in this year is 80%.
• The chance of passing the first year for a student is 96%.
• There is a 90% chance of developing cancer for a smoker.

Remark

• The probability of the occurrence of an event is measured by a number
between zero and one, inclusive.
• Most people express probabilities as percentages. In dealing with
probabilities mathematically, it is more convenient to express them
as fractions (percentages result from multiplying the fractions by 100).
• The more likely the event is, the closer the number is to one; the more
unlikely the event is, the closer the number is to zero.
• An event that cannot occur has a probability of zero (this event is called
an impossible event); an event that is certain to occur has a probability of
one (this event is called a sure event).
Experiment

An experiment is a process of generating outcomes through trials, where all
possible outcomes are known in advance, but a single outcome cannot be predicted
with certainty.

Example
• Tossing a coin
• Tossing a die
• Selecting a ball from a box containing three red balls and four
green balls
• Tossing two fair coins one by one (or at a time)
• Observing lifetime of an electric bulb
• Observing heart beat of a student in a class

Sample Space

The set of all possible outcomes of an experiment is called the sample space.
The sample space is usually denoted by $S$ or $\Omega$. Each element (i.e. outcome) is known
as a sample point. Sample points in a sample space are mutually exclusive or
disjoint (i.e. one and only one outcome is obtained from a single trial of an
experiment; it is not possible to get two or more outcomes at once).

Example
Experiment: Tossing a coin.
Sample space: $S = \{H, T\}$.

Experiment: Tossing a die.
Sample space: $S = \{1, 2, 3, 4, 5, 6\}$.

Experiment: Observing lifetime of an electric bulb.
Sample space: $S = \{t \mid t \ge 0\}$.

Tree Diagram
The tree diagram is a systematic way to list all possible outcomes of an
experiment. Using tree diagram, one can easily construct the sample space of an
experiment.

Example
Experiment: Toss a coin first. If a head appears, flip it again; if a tail
appears, toss a die.

Then, the sample space is $S = \{HH, HT, T1, T2, T3, T4, T5, T6\}$.
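
A tree diagram amounts to walking every branch of the first stage and, for each branch, every branch of the second stage. A minimal sketch of that walk for the coin-then-coin-or-die experiment is given below; the function name `two_stage_sample_space` is an illustrative assumption.

```python
def two_stage_sample_space():
    """List the outcomes of: toss a coin; on H toss the coin again, on T toss a die."""
    outcomes = []
    for first in ["H", "T"]:                      # first-stage branches
        if first == "H":
            second_stage = ["H", "T"]             # second coin toss
        else:
            second_stage = ["1", "2", "3", "4", "5", "6"]  # die toss
        for second in second_stage:               # second-stage branches
            outcomes.append(first + second)
    return outcomes

S = two_stage_sample_space()
print(S)
```

Each appended string corresponds to one leaf of the tree, so the list has as many entries as the sample space has sample points.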
Exercise: Draw tree diagram and hence write the sample space

(a) Tossing two fair coins one by one (or at a time).

(b) Suppose that a bag contains three white, four green, and five yellow
marbles. Mr. Paul randomly picks a marble, records its color,
and returns it to the bag. He mixes the marbles well and again
randomly picks a marble and records its color.

Event
Any subset of the sample space of an experiment is known as an event. An event is usually
denoted by a capital English letter other than $S$. An event is said to occur if any
one element (outcome) of the corresponding subset is obtained from the experiment.
Probability is always computed for an event.

Since the sample space and the null set (the set without any elements, denoted by $\phi$) are
always subsets of the sample space, they are events. The sample space is known as the sure
event, whereas the null set is known as the impossible event.

Example
Experiment: Toss a coin first. If a head appears, flip it again; if a tail
appears, toss a die.
The sample space is $S = \{HH, HT, T1, T2, T3, T4, T5, T6\}$.
• One may be interested in the event of having at least one head. Then, the event
is $A = \{HH, HT\}$.
• One may be interested in the event of having a tail and an even number. In this case, the
event is $B = \{T2, T4, T6\}$.

Venn Diagram
The Venn diagram is a pictorial way to represent the sample space and related
events of an experiment. The sample space is represented by a rectangle and events
are represented by closed curves (usually circles).

Example
Consider the previous example. The Venn diagram associated with this experiment
is given below.

Complement of an Event
The complement of an event $A$, denoted by $A^c$ or $A'$, with respect to the sample
space $S$, is the subset that contains all elements of $S$ that are not in $A$. In other words,
the complement of an event $A$ indicates that the event $A$ does not occur.
Example
Consider the previous example, where $B = \{T2, T4, T6\}$. The complement of event $B$ is
$B^c = \{HH, HT, T1, T3, T5\}$.

Intersection of Events
The intersection of two events $A$ and $B$, denoted by $A \cap B$, in an experiment, is the
subset of the sample space that contains the elements common to both $A$ and $B$. In
other words, it is also an event, indicating that both events $A$ and $B$ occur
simultaneously (or at the same time).

Example
Consider an experiment of tossing a die. Let $A$ be the event of getting a number less
than 3 and $B$ be the event of getting an odd number. Then, the sample space and events
are $S = \{1, 2, 3, 4, 5, 6\}$, $A = \{1, 2\}$, and $B = \{1, 3, 5\}$. Therefore,
$A \cap B = \{1\}$.
This event indicates getting a number that is both less than 3 and
odd.

Note
• $A \cap B$ is also denoted by $AB$.
• Events $A$ and $B$ are said to be mutually exclusive or disjoint
events if $A \cap B = \phi$. It implies that events $A$ and $B$ cannot occur
simultaneously (or at the same time).
Union of Events
The union of two events $A$ and $B$, denoted by $A \cup B$, in an experiment, is the subset
of the sample space that contains all elements belonging to $A$ or $B$ or both. In
other words, it is also an event, which indicates that at least one of $A$ and $B$ occurs.

Example
Consider an experiment of tossing a die. Let $A$ be the event of getting a number less
than or equal to 3 and $B$ be the event of getting an even number. Then, the sample space and
events are $S = \{1, 2, 3, 4, 5, 6\}$, $A = \{1, 2, 3\}$, and $B = \{2, 4, 6\}$. Therefore,
$A \cup B = \{1, 2, 3, 4, 6\}$.
This event indicates that by tossing a die, one gets a number which is less than or
equal to 3 or an even number.

Note
• Events $A_1, \cdots, A_i, \cdots, A_n$ are said to be exhaustive events if
$A_1 \cup A_2 \cup \cdots \cup A_i \cup \cdots \cup A_n = S$.

• Events $A_1, \cdots, A_i, \cdots, A_n$ are said to be exhaustive and pairwise mutually
exclusive (or disjoint) events if
$A_1 \cup A_2 \cup \cdots \cup A_i \cup \cdots \cup A_n = S$
and
$A_j \cap A_k = \phi$, $j \ne k$; $j, k = 1, \cdots, n$.
Some Important Results

Suppose that $S$, $A$, and $B$ are the sample space and two events, respectively, in an
experiment. Let $\phi$ be the null set.

• $A \cap \phi = \phi$
• $A \cup \phi = A$
• $A \cap A^c = \phi$
• $A \cup A^c = S$
• $(A^c)^c = A$
• $S^c = \phi$
• $\phi^c = S$
• $(A \cap B)^c = A^c \cup B^c$ [De Morgan’s Law]
• $(A \cup B)^c = A^c \cap B^c$ [De Morgan’s Law]
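
These identities can be checked on a concrete case with Python's built-in sets. The sketch below reuses the die example ($A = \{1, 2, 3\}$, $B = \{2, 4, 6\}$); the helper `complement` and the dictionary `identities` are illustrative names, and a single example does not prove the identities in general.

```python
S = {1, 2, 3, 4, 5, 6}   # sample space of one die toss
A = {1, 2, 3}            # number less than or equal to 3
B = {2, 4, 6}            # even number
EMPTY = set()            # the null set

def complement(event, sample_space=S):
    """Elements of the sample space that are not in the event."""
    return sample_space - event

identities = {
    "A & empty == empty": (A & EMPTY) == EMPTY,
    "A | empty == A": (A | EMPTY) == A,
    "A & A^c == empty": (A & complement(A)) == EMPTY,
    "A | A^c == S": (A | complement(A)) == S,
    "(A^c)^c == A": complement(complement(A)) == A,
    "S^c == empty": complement(S) == EMPTY,
    "empty^c == S": complement(EMPTY) == S,
    "De Morgan 1": complement(A & B) == (complement(A) | complement(B)),
    "De Morgan 2": complement(A | B) == (complement(A) & complement(B)),
}

for name, holds in identities.items():
    print(name, holds)
```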
Probability

Probability of an Event
An event is said to occur if one of the sample points contained in the set defined
for that event is obtained from an experiment. The probability of the event is a
number between zero and one, inclusive, that measures how likely the event is to
occur when the experiment is performed.

Note
• If a probability can be assigned to each sample point in the sample space,
the sum of the probabilities of all sample points in the sample space is always
one.

Axioms Related to Probability


Suppose that in an experiment, $S$ is the sample space; $A$ is an event; and
$B_1, \cdots, B_i, \cdots, B_k$ are $k$ pairwise mutually exclusive (disjoint) events.

Axiom 1:
$0 \le \Pr(A) \le 1$.
Axiom 2:
$\Pr(S) = 1$.
Axiom 3:
$\Pr(B_1 \cup \cdots \cup B_i \cup \cdots \cup B_k) = \Pr(B_1) + \cdots + \Pr(B_i) + \cdots + \Pr(B_k) = \sum_{i=1}^{k} \Pr(B_i)$.

Note that Axiom 3 is also known as the additive rule of probability.


Assigning Probability to an Event
• Suppose that $A$ is an event of an experiment. If $S$ is the sample space, the
set associated with event $A$ is a subset of $S$. Since elements (sample
points) in $S$ are mutually exclusive, elements in the set for event $A$ are
also mutually exclusive. Therefore, following Axiom 3, the probability of
the event $A$ is the sum of the probabilities of the elements that are in the set
defined for event $A$. Example: Let $a_1, \cdots, a_j, \cdots, a_r$ be the elements of the set
defined for event $A$, i.e. $A = \{a_1, \cdots, a_j, \cdots, a_r\}$. If the probabilities of these
elements are known, the probability of event $A$ is given as
$\Pr(A) = \Pr(a_1) + \cdots + \Pr(a_j) + \cdots + \Pr(a_r)$.

• If the sample points in the sample space are equally likely (i.e. all elements
have the same probability of occurring), then the probability of event $A$ is
defined as the number of elements in the set defined for $A$ divided by the
number of elements in the sample space. Consider the above example. In
the set associated with the event $A$, the number of elements is $r$. If there
are $N$ elements in the sample space $S$, the probability of event $A$ is given
as
\[ \Pr(A) = \frac{r}{N}. \]

Remark
Suppose that $A$ and $B$ are two events of interest. If the occurrence of event $A$ does
not influence the occurrence of event $B$, the events $A$ and $B$ are said to be
independent. When events $A$ and $B$ are independent,
$\Pr(A \cap B) = \Pr(AB) = \Pr(A) \times \Pr(B)$.
Problem
Using the Axioms of Probability, prove that
1. $\Pr(\phi) = 0$.
2. $\Pr(A^c) = 1 - \Pr(A)$.
3. If event $A$ is a proper subset of event $B$, $\Pr(A) < \Pr(B)$.
4. If $A$ and $B$ are two non-disjoint events,
$\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$.
What would happen if $A$ and $B$ are disjoint? Independent?
5. If $A$ and $B$ are two non-disjoint events,
$\Pr(AB^c \cup A^cB) = \Pr(A \cup B) - \Pr(AB)$.

Proof 1: $\Pr(\phi) = 0$

Let $S$ and $\phi$ be the sample space and the null set in an experiment. It can be shown that
$S \cup \phi = S$.
The sample space and the null set are mutually exclusive events since
$S \cap \phi = \phi$.
By applying Axiom 3, one can then write
$\Pr(S \cup \phi) = \Pr(S) + \Pr(\phi)$
$\Rightarrow \Pr(S) = \Pr(S) + \Pr(\phi)$
$\Rightarrow \Pr(\phi) = 0$.

Proof 2: $\Pr(A^c) = 1 - \Pr(A)$

Let $S$ and $A$ be the sample space and an event in an experiment. Let $A^c$ be the
complement of event $A$.
From the Venn diagram (or directly from the definition of the complement), it is clear that
$A \cup A^c = S$.
The event $A$ and its complement $A^c$ are mutually exclusive events since
$A \cap A^c = \phi$.
By applying Axiom 3, one can then write
$\Pr(A \cup A^c) = \Pr(A) + \Pr(A^c)$
$\Rightarrow \Pr(S) = \Pr(A) + \Pr(A^c)$
$\Rightarrow \Pr(A) + \Pr(A^c) = 1$, using Axiom 2, $\Pr(S) = 1$
$\Rightarrow \Pr(A^c) = 1 - \Pr(A)$.

Proof 3: If event $A$ is a proper subset of event $B$, $\Pr(A) < \Pr(B)$
Let $S$, $A$, and $B$ be the sample space and two events in an experiment, where $A$ is a
proper subset of $B$, i.e. $A \subset B$.

From the Venn diagram, it is clear that
$B = A \cup (A^c \cap B)$.
It is also clear that the events $A$ and $(A^c \cap B)$ are mutually exclusive as
$A \cap (A^c \cap B) = \phi$.
By applying Axiom 3, one can then write
$\Pr(B) = \Pr(A) + \Pr(A^c \cap B)$.
Since $A^c \cap B \ne \phi$, and assuming that every sample point has positive probability,
$\Pr(A^c \cap B) > 0$. Therefore,
$\Pr(A) < \Pr(B)$.
(Without that assumption, Axiom 1 gives only $\Pr(A^c \cap B) \ge 0$, and hence $\Pr(A) \le \Pr(B)$.)

Proof 4: $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$

Let $A$ and $B$ be two non-disjoint events in an experiment, i.e. $A \cap B \ne \phi$.

From the Venn diagram, it is clear that
$A \cup B = A \cup (A^c \cap B)$.
It is also clear that the events $A$ and $(A^c \cap B)$ are mutually exclusive as
$A \cap (A^c \cap B) = \phi$.
By applying Axiom 3, one can then write
\[ \Pr(A \cup B) = \Pr(A) + \Pr(A^c \cap B). \tag{1} \]

From the Venn diagram, the event $B$ can be expressed as
$B = (A \cap B) \cup (A^c \cap B)$,
where the events $(A \cap B)$ and $(A^c \cap B)$ are mutually exclusive as
$(A \cap B) \cap (A^c \cap B) = \phi$.
Using Axiom 3, therefore,
\[ \Pr(B) = \Pr(A \cap B) + \Pr(A^c \cap B) \tag{2} \]
\[ \Rightarrow \Pr(A^c \cap B) = \Pr(B) - \Pr(A \cap B). \tag{3} \]
Using equation (3) in equation (1), one may obtain
\[ \Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B). \tag{4} \]

When events $A$ and $B$ are mutually exclusive, i.e. $A \cap B = \phi$,
$\Pr(A \cap B) = 0$.
Therefore, equation (4) reduces to
\[ \Pr(A \cup B) = \Pr(A) + \Pr(B). \tag{5} \]
When events $A$ and $B$ are independent,
$\Pr(A \cap B) = \Pr(A) \times \Pr(B)$.
Therefore, equation (4) reduces to
\[ \Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A) \times \Pr(B). \tag{6} \]
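
The general addition rule and its two special cases can be illustrated on a fair die, where all six sample points are equally likely. The sketch below is an assumed example, not from the lecture: the helper `pr` and the particular events are chosen for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # fair die: equally likely sample points

def pr(event):
    """Classical probability: (number of points in the event) / (number of points in S)."""
    return Fraction(len(event & S), len(S))

# General (non-disjoint) case
A, B = {1, 2, 3}, {2, 4, 6}
general_lhs = pr(A | B)
general_rhs = pr(A) + pr(B) - pr(A & B)

# Disjoint case: the intersection term vanishes
C, D = {1, 2}, {5, 6}
disjoint_lhs = pr(C | D)
disjoint_rhs = pr(C) + pr(D)

# Independent case: Pr(E and F) = Pr(E) * Pr(F)
E, F = {1, 2}, {2, 4, 6}
is_independent = pr(E & F) == pr(E) * pr(F)
independent_lhs = pr(E | F)
independent_rhs = pr(E) + pr(F) - pr(E) * pr(F)

print(general_lhs, general_rhs)
print(disjoint_lhs, disjoint_rhs)
print(is_independent, independent_lhs, independent_rhs)
```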

Proof 5: Exercise (please try).


Probability_3

Problem

1. A fair die is tossed and up face is observed. If the face is even, you will win, otherwise you lose. What is
probability that you will win?

2. Consider an experiment of tossing two unfair coins with 𝑃(𝐻) = ⎯ and 𝑃(𝑇) = ⎯. Find the probability that
(i) exactly one head is observed; (ii) at least one head is observed; (iii) at most one head is observed.

3. Consider an experiment consisting of six sample points with $P(x) = \frac{2}{9}$, $x = 1, 2, 3$; $P(x) = \frac{1}{9}$, $x = 4, 5, 6$. Suppose
that there are two events defined as $A = \{2, 3, 5\}$ and $B = \{4, 6\}$. Draw a Venn diagram. Find the probability that (i)
A will occur; (ii) B will occur; (iii) A and B will occur.

4. Consider an experiment composed of one roll of a fair die followed by one toss of a fair coin. List all sample
points and assign a probability to each sample point. Find the probability that (i) 6 on the die and head on the coin;
(ii) even number on die and tail on the coin; (iii) even number on the die; (iv) tail on the coin.

5. A fair coin is tossed 10 times and the up face is reported after each toss. What is the probability that at least one
head is observed?

6. Ideal number of children in a family is two. If having a boy and having a girl are equally likely independent events,
find the probability that a family has (i) two girls; (ii) one boy and one girl; (iii) at least one girl.

Problem 7
The following table shows the frequency distribution of mothers by age (in years) and
race.

Age        Race
           White    Black
≤ 17          20       20
18-19         30       20
20-29        410      120
≥ 30         330       50

If a mother is selected randomly, find the probability that (i) the mother is white; (ii)
the mother is a teenager; (iii) the mother is white and a teenager; (iv) the mother is white or
a teenager.

Solution
Given that

Age        Race               Total
           White    Black
≤ 17          20       20        40
18-19         30       20        50
20-29        410      120       530
≥ 30         330       50       380
Total        790      210      1000

It is clear that there are 1000 sample points in the sample space. Assume that all
sample points are equally likely.
(i) Let $A$ be the event defined as
$A$ = mother is white.
From the table, it is clear that there are 790 sample points in the event $A$.
Therefore,
\[ \Pr(A) = \frac{790}{1000} = 0.79. \]

It implies that 79% of mothers are white.

(ii) Let $B$ be the event defined as
$B$ = mother is a teenager.
From the table, it is clear that there are 90 (= 40 + 50) sample points in the event
$B$. Therefore,
\[ \Pr(B) = \frac{90}{1000} = 0.09. \]
It implies that 9% of mothers are teenagers.

(iii) Let $C$ be the event defined as
$C$ = mother is white and a teenager = $A \cap B$.
It is clear from the table that there are 50 (= 20 + 30) sample points in the event $C$.
Therefore,
\[ \Pr(C) = \frac{50}{1000} = 0.05. \]
It implies that 5% of mothers are white and teenagers.

(iv) The event $A \cup B$ indicates that a mother is white or a teenager. Therefore,
$\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$
$= \Pr(A) + \Pr(B) - \Pr(C)$
$= 0.79 + 0.09 - 0.05 = 0.83$.
That is, 83% of mothers are white or teenagers.
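
The same four probabilities can be read off the frequency table programmatically. The following sketch stores the eight cell counts from the problem in a dictionary; the key labels, the helper `pr`, and the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Cell counts from the table: (age group, race) -> number of mothers
counts = {
    ("<=17", "White"): 20,   ("<=17", "Black"): 20,
    ("18-19", "White"): 30,  ("18-19", "Black"): 20,
    ("20-29", "White"): 410, ("20-29", "Black"): 120,
    (">=30", "White"): 330,  (">=30", "Black"): 50,
}
total = sum(counts.values())
TEEN_GROUPS = {"<=17", "18-19"}

def pr(condition):
    """Probability that a randomly selected mother satisfies condition(age, race)."""
    favourable = sum(c for (age, race), c in counts.items() if condition(age, race))
    return Fraction(favourable, total)

p_white = pr(lambda age, race: race == "White")
p_teen = pr(lambda age, race: age in TEEN_GROUPS)
p_white_and_teen = pr(lambda age, race: race == "White" and age in TEEN_GROUPS)
p_white_or_teen = pr(lambda age, race: race == "White" or age in TEEN_GROUPS)

print(float(p_white), float(p_teen), float(p_white_and_teen), float(p_white_or_teen))
```

Counting the union directly, as in the last line of the block, gives the same value as the addition rule used in part (iv).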

Problem 8

Hospital records show that 12% of all patients are admitted for surgical treatment, 16% for obstetrics,
and 2% for both surgical treatment and obstetrics. If a new patient is admitted to the hospital,
what is the probability that (i) the patient is admitted for either surgical treatment or
obstetrics; (ii) the patient is admitted for neither surgical treatment nor obstetrics?

Solution
Let events $A$ and $B$ be defined as
$A$ = admitted for surgical treatment
$B$ = admitted for obstetrics
Given that
$\Pr(A) = 0.12$
$\Pr(B) = 0.16$
$\Pr(A \cap B) = 0.02$
(i) The probability that the patient is admitted for either surgical treatment or obstetrics is
given by
$\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$
$= 0.12 + 0.16 - 0.02 = 0.26$.
That is, 26% of patients are admitted for either surgical treatment or obstetrics.
(ii) The probability that the patient is admitted for neither surgical treatment nor obstetrics is
given by
$\Pr(A^c \cap B^c) = \Pr((A \cup B)^c) = 1 - \Pr(A \cup B)$
$= 1 - 0.26 = 0.74$.
That is, 74% of patients are admitted for neither surgical treatment nor obstetrics.

Problem 9
A fair coin is tossed three times and events A and B are defined as
A: at least one head is observed;
B: number of heads observed is odd.
List the elements of $A$, $B$, $A \cup B$, $A \cap B$, $A^c$, and $A^c \cup B^c$. Also, find the probability of
each event. Do you think that A and B are mutually exclusive? Justify.

Solution
The sample space associated with this experiment is given by
$S = \{HHH, HHT, HTH, THH, THT, TTH, HTT, TTT\}$.
Since the coin is fair, i.e. $\Pr(H) = \Pr(T) = \frac{1}{2}$, and the outcome of one toss does not
influence the outcome of the other tosses, the probability of each sample point is given
by
\[ \Pr(\text{each sample point}) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}. \]
Events $A$ and $B$ can be defined as
$A$ = at least one head is observed
$= \{HHH, HHT, HTH, THH, THT, TTH, HTT\}$
$B$ = number of heads observed is odd
$= \{HHH, THT, TTH, HTT\}$.
Therefore,
$A \cup B = \{HHH, HHT, HTH, THH, THT, TTH, HTT\} = A$
$A \cap B = \{HHH, THT, TTH, HTT\} = B$
$A^c = \{TTT\}$
$A^c \cup B^c = (A \cap B)^c = \{HHT, HTH, THH, TTT\}$

The probabilities of the corresponding events are given below.

\[ \Pr(A) = 1 - \Pr(A^c) = 1 - \Pr(TTT) = 1 - \frac{1}{8} = \frac{7}{8} \]
\[ \Pr(B) = \Pr(HHH) + \Pr(THT) + \Pr(TTH) + \Pr(HTT) = \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{4}{8} = \frac{1}{2} \]
\[ \Pr(A \cup B) = \Pr(A) = \frac{7}{8} \]
\[ \Pr(A \cap B) = \Pr(B) = \frac{1}{2} \]
\[ \Pr(A^c) = \Pr(TTT) = \frac{1}{8} \]
\[ \Pr(A^c \cup B^c) = \Pr(HHT) + \Pr(HTH) + \Pr(THH) + \Pr(TTT) = \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{4}{8} = \frac{1}{2} \]

Events $A$ and $B$ are not mutually exclusive since
$A \cap B \ne \phi$.
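
For a small experiment like this one, the sample space and the events can be enumerated by brute force. The sketch below is an illustrative check, assuming equally likely outcomes as in the solution; the helper `pr` and the set-comprehension definitions of the events are not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# All 2^3 outcomes of three tosses, written as strings such as "HTH"
S = {"".join(tosses) for tosses in product("HT", repeat=3)}

def pr(event):
    """Equally likely outcomes: each sample point has probability 1/8."""
    return Fraction(len(event), len(S))

A = {s for s in S if s.count("H") >= 1}        # at least one head
B = {s for s in S if s.count("H") % 2 == 1}    # odd number of heads

A_c = S - A
A_c_or_B_c = (S - A) | (S - B)

print(sorted(A | B), pr(A | B))
print(sorted(A & B), pr(A & B))
print(sorted(A_c), pr(A_c))
print(sorted(A_c_or_B_c), pr(A_c_or_B_c))
print("mutually exclusive:", len(A & B) == 0)
```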
Problem 10

A bag contains 2 red and 2 white balls. If 2 balls are drawn at a time, find the
probability that (i) both balls are red; (ii) one is red and one is white; (iii) at least
one is white.

Solution
The total number of balls in the bag is 4 (= 2 + 2). The possible number of ways of
choosing 2 balls from these 4 balls is
\[ \binom{4}{2} = \frac{4!}{2!\,2!} = 6. \]
That is, the number of sample points in the sample space associated with this
experiment is 6. Assume that all sample points are equally likely to be selected.

(i) Let $A$ be the event defined as
$A$ = both balls are red.
The possible number of ways of choosing 2 red balls (and 0 white balls) from these 4 balls is
\[ \binom{2}{2} \times \binom{2}{0} = 1 \times 1 = 1. \]
Therefore,
\[ \Pr(A) = \frac{1}{6}. \]
(ii) Let $B$ be the event defined as
$B$ = one is red and one is white.
The possible number of ways of choosing 1 red ball and 1 white ball from these 4
balls is
\[ \binom{2}{1} \times \binom{2}{1} = 2 \times 2 = 4. \]
Therefore,
\[ \Pr(B) = \frac{4}{6} = \frac{2}{3}. \]

(iii) Let $C$ be the event defined as
$C$ = at least 1 white ball.
The possible number of ways of choosing at least 1 white ball from these 4 balls is
equal to the number of ways of choosing 2 white balls plus
the number of ways of choosing 1 red ball and 1 white ball. That is, the possible number of ways is
\[ \binom{2}{0} \times \binom{2}{2} + \binom{2}{1} \times \binom{2}{1} = 1 \times 1 + 2 \times 2 = 1 + 4 = 5. \]
Therefore,
\[ \Pr(C) = \frac{5}{6}. \]
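
The counting argument above maps directly onto `math.comb` from the Python standard library. The following is a minimal sketch for this bag only (2 red, 2 white, 2 balls drawn); the helper `pr_red_white` and the constant names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

RED, WHITE, DRAWN = 2, 2, 2
total_ways = comb(RED + WHITE, DRAWN)   # number of equally likely sample points

def pr_red_white(r, w):
    """Probability of drawing exactly r red and w white balls (r + w must equal DRAWN)."""
    return Fraction(comb(RED, r) * comb(WHITE, w), total_ways)

p_both_red = pr_red_white(2, 0)
p_one_each = pr_red_white(1, 1)
p_at_least_one_white = pr_red_white(1, 1) + pr_red_white(0, 2)

print(total_ways, p_both_red, p_one_each, p_at_least_one_white)
```

Part (iii) can also be obtained through the complement: "at least one white" is the complement of "both red", so its probability equals one minus the probability of part (i).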

Exercise 11
A box contains 4 red, 5 black and 6 yellow balls. If 3 balls are drawn from the box,
find the probability that (i) all balls are yellow; (ii) 2 balls are red and 1 is black;
(iii) one ball of each color.
Problem 12

A letter is chosen at random from the English alphabet. Find the probability that the letter
is (i) a vowel; (ii) a letter before j; (iii) a letter after g.

Solution
Note that there are 26 letters in the English alphabet. Therefore, the possible number
of ways of choosing a letter from these 26 letters is
\[ \binom{26}{1} = \frac{26!}{1!\,25!} = 26. \]
That is, there are 26 sample points in the sample space corresponding to this
experiment. Assume that all sample points are equally likely to be selected.

(i) Let the event $A$ be defined as
$A$ = a vowel is selected.
Since there are 5 vowels in the English alphabet, the possible number of ways of
selecting a vowel is
\[ \binom{5}{1} = \frac{5!}{1!\,4!} = 5. \]
Therefore,
\[ \Pr(A) = \frac{5}{26}. \]
(ii) Let the event $B$ be defined as
$B$ = a letter before j is selected.
Since there are 9 letters before j in the English alphabet, the possible number of ways of
selecting a letter before j is
\[ \binom{9}{1} = \frac{9!}{1!\,8!} = 9. \]
Therefore,
\[ \Pr(B) = \frac{9}{26}. \]

(iii) Let the event $C$ be defined as
$C$ = a letter after g is selected.
Since there are 19 letters after g in the English alphabet, the possible number of ways of
selecting a letter after g is
\[ \binom{19}{1} = \frac{19!}{1!\,18!} = 19. \]
Therefore,
\[ \Pr(C) = \frac{19}{26}. \]
Problem 13
In a high school of 100 students, 54 studied mathematics, 69 studied history, and
35 studied both. If one is selected randomly from this school, find the probability
that the student (i) took mathematics or history; (ii) did not take either of these
subjects; (iii) took only history.

Solution:
The sample space associated with this experiment contains 100 equally likely sample points. Let $M$ and $H$ be two events defined as
$M$ = a student studied mathematics
$H$ = a student studied history
Therefore, the event $M \cap H$ denotes
$M \cap H$ = a student took both mathematics and history
Given that
Number of sample points in $M$ is $n(M) = 54$.
Number of sample points in $H$ is $n(H) = 69$.
Number of sample points in $M \cap H$ is $n(M \cap H) = 35$.
Therefore,
$$\Pr(M) = \frac{54}{100} = 0.54$$
$$\Pr(H) = \frac{69}{100} = 0.69$$
$$\Pr(M \cap H) = \frac{35}{100} = 0.35$$
(i) The event that a student took mathematics or history is defined as $M \cup H$. Then,
$$\Pr(M \cup H) = \Pr(M) + \Pr(H) - \Pr(M \cap H) = 0.54 + 0.69 - 0.35 = 0.88$$
That is, 88% of students took mathematics or history.

(ii) The event that a student did not take either of these subjects is defined as
$$M^c \cap H^c = (M \cup H)^c.$$
Then,
$$\Pr(M^c \cap H^c) = \Pr\left((M \cup H)^c\right) = 1 - \Pr(M \cup H) = 1 - 0.88 = 0.12$$
That is, 12% of students did not take either mathematics or history.

(iii) The event that a student took only history is defined as $H \cap M^c$. It can be shown that
$$H = (H \cap M^c) \cup (H \cap M)$$
and events $(H \cap M^c)$ and $(H \cap M)$ are mutually exclusive. Therefore,
$$\Pr(H) = \Pr(H \cap M^c) + \Pr(H \cap M)$$
$$\Rightarrow \Pr(H \cap M^c) = \Pr(H) - \Pr(H \cap M)$$
$$\Rightarrow \Pr(H \cap M^c) = 0.69 - 0.35 = 0.34$$
That is, 34% of students took only history.
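The three parts of Problem 13 follow from the addition rule and the complement rule applied to counts. A short sketch, with variable names chosen here only for illustration, recomputes them exactly:

```python
from fractions import Fraction

total = 100   # students in the school
n_math = 54   # studied mathematics
n_hist = 69   # studied history
n_both = 35   # studied both subjects

# Addition rule on counts, then divide by the size of the sample space.
n_either = n_math + n_hist - n_both
p_either = Fraction(n_either, total)
p_neither = 1 - p_either                          # complement rule
p_only_history = Fraction(n_hist - n_both, total)

print(float(p_either), float(p_neither), float(p_only_history))
```

Note that the probabilities in (i) and (ii) must add to 1, which gives a quick consistency check on the arithmetic.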
Problem 14
A committee of two students is to be formed from the Departments of Mathematics
and Economics. One student is to be selected from the Department of Mathematics,
which has 15 third-year and 5 fourth-year students. The other is to be selected
from the Department of Economics, which has 20 third-year and 10 fourth-year
students. Find the probability that (i) both students are third-year students; (ii)
both students are fourth-year students; (iii) one is a third-year and the other is a
fourth-year student.

Solution:
Let us define the following events.
$M_3$ = Third-year student from Mathematics
$M_4$ = Fourth-year student from Mathematics
The numbers of sample points in these events are
$$n(M_3) = 15; \quad n(M_4) = 5.$$
The total number of students in Mathematics is
$$n(M_3) + n(M_4) = 15 + 5 = 20.$$
Therefore,
$$\Pr(M_3) = \frac{15}{20}, \qquad \Pr(M_4) = \frac{5}{20}.$$
Similarly, one may define
$E_3$ = Third-year student from Economics
$E_4$ = Fourth-year student from Economics
The numbers of sample points in these events are
$$n(E_3) = 20; \quad n(E_4) = 10.$$
The total number of students in Economics is
$$n(E_3) + n(E_4) = 20 + 10 = 30.$$
Therefore,
$$\Pr(E_3) = \frac{20}{30}, \qquad \Pr(E_4) = \frac{10}{30}.$$
Note that the selection of a student from Mathematics does not influence the selection
of a student from Economics. Therefore, the events corresponding to the selection of
students are independent.

(i) The event that both selected students are third-year students is $M_3 \cap E_3$. Therefore,
$$\Pr(M_3 \cap E_3) = \Pr(M_3) \times \Pr(E_3) = \frac{15}{20} \times \frac{20}{30} = \frac{1}{2} = 0.50.$$
That is, in 50% of cases, both students in the committee are third-year students.

(ii) The event that both selected students are fourth-year students is $M_4 \cap E_4$. Therefore,
$$\Pr(M_4 \cap E_4) = \Pr(M_4) \times \Pr(E_4) = \frac{5}{20} \times \frac{10}{30} = \frac{1}{12} = 0.083.$$
That is, in 8.3% of cases, both students in the committee are fourth-year students.

(iii) The event that the selected students are a third-year and a fourth-year student is
$$(M_3 \cap E_4) \cup (M_4 \cap E_3).$$
Since events $(M_3 \cap E_4)$ and $(M_4 \cap E_3)$ are mutually exclusive,
$$\Pr[(M_3 \cap E_4) \cup (M_4 \cap E_3)] = \Pr(M_3 \cap E_4) + \Pr(M_4 \cap E_3)$$
$$= \Pr(M_3) \times \Pr(E_4) + \Pr(M_4) \times \Pr(E_3)$$
$$= \frac{15}{20} \times \frac{10}{30} + \frac{5}{20} \times \frac{20}{30} = \frac{1}{4} + \frac{1}{6} = \frac{5}{12} = 0.417$$
That is, in 41.7% of cases, the committee consists of one third-year student and
one fourth-year student.

Conditional Probability

Suppose that $A$ and $B$ are two events defined from an experiment. One may define
an event as
Event $A$ will occur when (or if) event $B$ has already occurred.
Such an event is called a conditional event and is usually denoted by $A|B$. The
probability of a conditional event is known as the conditional probability.
Mathematically,
$$\Pr(A|B) = \frac{\Pr(A \cap B)}{\Pr(B)} \quad \text{with } \Pr(B) \neq 0.$$
Similarly,
$$\Pr(B|A) = \frac{\Pr(A \cap B)}{\Pr(A)} \quad \text{with } \Pr(A) \neq 0.$$

Remark:
Marginal Probability: $\Pr(A)$, $\Pr(B)$
Joint Probability: $\Pr(A \cap B)$
Conditional Probability: $\Pr(A|B)$, $\Pr(B|A)$.

Conditional Probability and Dependency:

1. If $\Pr(A) > \Pr(A|B)$ $\Longrightarrow$ Events $A$ and $B$ are dependent
2. If $\Pr(A) < \Pr(A|B)$ $\Longrightarrow$ Events $A$ and $B$ are dependent
3. If $\Pr(A) = \Pr(A|B)$ $\Longrightarrow$ Events $A$ and $B$ are independent

Note
If events $A$ and $B$ are independent, $\Pr(A \cap B) = \Pr(A) \times \Pr(B)$.

Proof:
Since events $A$ and $B$ are independent,
$$\Pr(A|B) = \Pr(A)$$
$$\Rightarrow \frac{\Pr(A \cap B)}{\Pr(B)} = \Pr(A)$$
$$\Rightarrow \Pr(A \cap B) = \Pr(A) \times \Pr(B).$$

Remark
If sample points in an experiment are equally likely, to compute $\Pr(A|B)$, the
event $B$ can be considered as the sample space. Therefore,
$$\Pr(A|B) = \frac{n(A \cap B)}{n(B)},$$
where $n(A \cap B)$ is the number of sample points of $B$ that also belong to $A$.
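The remark above says that, with equally likely sample points, conditioning on $B$ amounts to counting inside $B$. The sketch below checks that this agrees with the definition of conditional probability. The example (one roll of a fair die, with the events "even" and "greater than 3") is an assumption made here for illustration and does not come from the notes.

```python
from fractions import Fraction

# Hypothetical example (not from the text): one roll of a fair die.
S = {1, 2, 3, 4, 5, 6}
A = {x for x in S if x % 2 == 0}   # an even number appears
B = {x for x in S if x > 3}        # a number greater than 3 appears

def pr(event):
    """Probability of an event under equally likely sample points."""
    return Fraction(len(event), len(S))

by_definition = pr(A & B) / pr(B)            # Pr(A ∩ B) / Pr(B)
by_counting = Fraction(len(A & B), len(B))   # B treated as the sample space

print(by_definition, by_counting)
```

Both routes give the same value because the factor $1/n(S)$ cancels between numerator and denominator.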
Problem
Prove that
(i) $\Pr(A^c|B) = 1 - \Pr(A|B)$.
(ii) The sample space and an event $A$ are independent.
(iii) If events $A$ and $B$ are independent, $A^c$ and $B^c$ are also independent.

Proof:

(i) From the definition of conditional probability,
$$\Pr(A^c|B) = \frac{\Pr(A^c \cap B)}{\Pr(B)}.$$
It can be shown that
$$\Pr(B) = \Pr(A^c \cap B) + \Pr(A \cap B).$$
Therefore,
$$\Pr(A^c \cap B) = \Pr(B) - \Pr(A \cap B).$$
Finally,
$$\Pr(A^c|B) = \frac{\Pr(B) - \Pr(A \cap B)}{\Pr(B)} = 1 - \frac{\Pr(A \cap B)}{\Pr(B)} = 1 - \Pr(A|B).$$

(ii) Let $S$ be the sample space. By the axiom of probability, $\Pr(S) = 1$. Now,
$$\Pr(A|S) = \frac{\Pr(A \cap S)}{\Pr(S)} = \frac{\Pr(A)}{\Pr(S)} = \Pr(A).$$
Therefore, the sample space and any event $A$ are always independent.

(iii) Given that $A$ and $B$ are two independent events. Therefore,
$$\Pr(A \cap B) = \Pr(A) \times \Pr(B).$$
Now,
$$\Pr(A^c \cap B^c) = \Pr\left((A \cup B)^c\right) = 1 - \Pr(A \cup B)$$
$$= 1 - \Pr(A) - \Pr(B) + \Pr(A \cap B)$$
$$= 1 - \Pr(A) - \Pr(B) + \Pr(A) \times \Pr(B)$$
$$= 1 - \Pr(A) - \Pr(B)[1 - \Pr(A)]$$
$$= [1 - \Pr(A)] \times [1 - \Pr(B)]$$
$$= \Pr(A^c) \times \Pr(B^c),$$
i.e. $\Pr(A^c \cap B^c) = \Pr(A^c) \times \Pr(B^c)$.

Therefore, events $A^c$ and $B^c$ are independent.
Problem 15

The following table shows the joint probabilities of smoking status and of developing cancer.

Smoker    Developing cancer
          Yes     No
Yes       0.05    0.20
No        0.03    0.72

If an individual is selected randomly, find the probability (i) of developing cancer
if he is a smoker; (ii) of developing cancer if he is a non-smoker. (iii) Do you think
that smoking is a cause of cancer?

Solution:
Let us define the following events.
$S$ = An individual is a smoker
$S^c$ = An individual is a non-smoker
$C$ = An individual develops cancer
$C^c$ = An individual does not develop cancer
In the table, the entries are the joint probabilities. These are
$$P(S \cap C) = 0.05$$
$$P(S \cap C^c) = 0.20$$
$$P(S^c \cap C) = 0.03$$
$$P(S^c \cap C^c) = 0.72$$
(i) Developing cancer if he is a smoker
This event is defined as $C|S$. The probability of this event is the conditional
probability defined as
$$P(C|S) = \frac{P(S \cap C)}{P(S)}$$
Now the event $S$ can be defined as
$$S = (S \cap C) \cup (S \cap C^c).$$
Note that events $(S \cap C)$ and $(S \cap C^c)$ are mutually exclusive. Hence
$$P(S) = P(S \cap C) + P(S \cap C^c) = 0.05 + 0.20 = 0.25$$
Therefore,
$$P(C|S) = \frac{0.05}{0.25} = 0.20.$$
It implies that among smokers, 20% develop cancer.

(ii) Developing cancer if he is a non-smoker
This event is defined as $C|S^c$. The probability of this event is the conditional
probability defined as
$$P(C|S^c) = \frac{P(S^c \cap C)}{P(S^c)}$$
Now the event $S^c$ can be defined as
$$S^c = (S^c \cap C) \cup (S^c \cap C^c).$$
Note that events $(S^c \cap C)$ and $(S^c \cap C^c)$ are mutually exclusive. Hence
$$P(S^c) = P(S^c \cap C) + P(S^c \cap C^c) = 0.03 + 0.72 = 0.75$$
Therefore,
$$P(C|S^c) = \frac{0.03}{0.75} = 0.04.$$
It implies that among non-smokers, 4% develop cancer.

(iii)
Irrespective of the smoking status of an individual, the event of developing cancer
can be defined as
$$C = (C \cap S) \cup (C \cap S^c).$$
Since events $(C \cap S)$ and $(C \cap S^c)$ are mutually exclusive,
$$P(C) = P(C \cap S) + P(C \cap S^c) = 0.05 + 0.03 = 0.08.$$
That is, 8% of individuals develop cancer. But among the smokers, this rate is 20%
[from (i)], whereas it is 4% among the non-smokers [from (ii)]. Since
$P(C|S) \neq P(C)$, the events $C$ and $S$ are dependent. Because the rate of
developing cancer among the smokers is much higher than among the non-smokers,
smoking can be considered as a cause of cancer.
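The calculations of Problem 15 all start from the four joint probabilities in the table. A small sketch (the dictionary layout and function names are assumptions made for illustration) derives the marginal and conditional probabilities from them:

```python
# Joint probabilities from the table, keyed by (is_smoker, develops_cancer).
joint = {
    (True, True): 0.05,
    (True, False): 0.20,
    (False, True): 0.03,
    (False, False): 0.72,
}

def marginal_smoker(s):
    """P(S) for s=True or P(S^c) for s=False: sum the matching row."""
    return sum(p for (smoker, _), p in joint.items() if smoker == s)

def cancer_given_smoker(s):
    """P(C | S) for s=True or P(C | S^c) for s=False."""
    return joint[(s, True)] / marginal_smoker(s)

p_c_given_s = cancer_given_smoker(True)
p_c_given_not_s = cancer_given_smoker(False)
p_c = joint[(True, True)] + joint[(False, True)]   # marginal P(C)

print(round(p_c_given_s, 2), round(p_c_given_not_s, 2), round(p_c, 2))
```

Comparing `p_c_given_s` with `p_c` is the dependency check described in the section on conditional probability and dependency.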

Problem 16
The probability that a doctor correctly diagnoses a particular disease is 0.7. Given that
the doctor makes an incorrect diagnosis, the probability that the patient enters a law suit
is 0.9. What is the probability that the doctor makes an incorrect diagnosis and the patient sues?

Solution:

Let us define the following events.
$D$ = Doctor correctly diagnoses the disease
$D^c$ = Doctor incorrectly diagnoses the disease
$L$ = Patient enters a law suit
Given that
$$P(D) = 0.7 \quad \text{and} \quad P(L|D^c) = 0.9$$
The event that the doctor makes an incorrect diagnosis and the patient enters a law suit
is $D^c \cap L$. It is asked to find $P(D^c \cap L)$. By the definition of conditional probability,
$$P(L|D^c) = \frac{P(D^c \cap L)}{P(D^c)}$$
$$\Rightarrow P(D^c \cap L) = P(L|D^c) \times P(D^c) = P(L|D^c) \times [1 - P(D)]$$
$$= 0.9 \times (1 - 0.7) = 0.9 \times 0.3 = 0.27.$$
That is, in 27% of cases, the doctor makes an incorrect diagnosis and the patient enters
a law suit.

Problem 17
A training program consists of two consecutive parts. To pass the program, the
trainee must pass both parts of the program. From the past experience, it is known
that 90% of the trainees pass the first part and 80% of those who pass the first part
pass the second part. What is the percentage of trainees who pass the program?

Exercise.

Problem 18
A town has 2 fire engines operating independently. The probability that a specific
engine is available when needed is 0.96. Find the probability that (i) both are
available when needed; (ii) at least one is available; (iii) neither is available when
needed.
Solution:
Let us define the following events.
$A$ = Engine 1 is available
$B$ = Engine 2 is available
Given that events $A$ and $B$ are independent with
$$P(A) = P(B) = 0.96.$$

(i) Both are available when needed
This event can be defined as $A \cap B$. Since events $A$ and $B$ are independent,
$$P(A \cap B) = P(A) \times P(B) = 0.96 \times 0.96 = 0.9216$$
Interpret the result.

(ii) At least one is available
This event is defined as $A \cup B$. Therefore,
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.96 + 0.96 - 0.9216 = 0.9984$$
Interpret the result.

(iii) Neither is available
This event is defined as $A^c \cap B^c$. Since events $A$ and $B$ are independent,
$A^c$ and $B^c$ are also independent, and the probability of this event is
$$P(A^c \cap B^c) = P(A^c) \times P(B^c) = [1 - P(A)] \times [1 - P(B)]$$
$$= (1 - 0.96) \times (1 - 0.96) = 0.04 \times 0.04 = 0.0016$$
Interpret the result.
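The three answers of Problem 18 can be reproduced in a few lines. In this sketch (variable names are chosen here for illustration) part (ii) is obtained through the complement of "neither is available", which is an alternative route to the addition rule used in the solution and should give the same value.

```python
p_available = 0.96  # availability of each engine (given)

# Independence lets the joint probabilities factor into products.
p_both = p_available * p_available
p_neither = (1 - p_available) * (1 - p_available)
p_at_least_one = 1 - p_neither   # complement of "neither is available"

print(round(p_both, 4), round(p_at_least_one, 4), round(p_neither, 4))
```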
Problem 19
A pair of fair dice is thrown. Find the probability that sum of the points on the two
dice is greater than or equal to 10 if 5 appears on the first die.

Solution:
The number of sample points in the sample space associated with this experiment
is $6^2 = 36$ and these are
$$S = \left\{ \begin{array}{cccccc}
(1,1), & (1,2), & (1,3), & (1,4), & (1,5), & (1,6), \\
(2,1), & (2,2), & (2,3), & (2,4), & (2,5), & (2,6), \\
(3,1), & (3,2), & (3,3), & (3,4), & (3,5), & (3,6), \\
(4,1), & (4,2), & (4,3), & (4,4), & (4,5), & (4,6), \\
(5,1), & (5,2), & (5,3), & (5,4), & (5,5), & (5,6), \\
(6,1), & (6,2), & (6,3), & (6,4), & (6,5), & (6,6)
\end{array} \right\}.$$
Assume that the outcome of one die does not influence the outcome of the other die.
Since the dice are fair, the probability of each sample point is
$$P(\text{each sample point}) = \frac{1}{6} \times \frac{1}{6} = \frac{1}{36}.$$
In other words, sample points are equally likely. Let us define the following events.
$A$ = sum of the points on the two dice is greater than or equal to 10
$B$ = 5 appears on the first die
That is,
$$A = \{(4,6), (5,5), (5,6), (6,4), (6,5), (6,6)\}$$
$$B = \{(5,1), (5,2), (5,3), (5,4), (5,5), (5,6)\}$$
Then, the event that the sum of the points on the two dice is greater than or equal to 10
if 5 appears on the first die can be defined as $A|B$, and the probability of this event is
$$P(A|B) = \frac{P(A \cap B)}{P(B)}.$$
Here,
$$A \cap B = \{(5,5), (5,6)\}$$
Therefore,
$$P(A \cap B) = \frac{2}{36} = \frac{1}{18}, \qquad P(B) = \frac{6}{36} = \frac{1}{6}.$$
Hence,
$$P(A|B) = \frac{1}{18} \times 6 = \frac{1}{3}.$$
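Problem 19 is small enough to verify by listing the 36 outcomes. The sketch below builds the sample space and the two events as sets of ordered pairs, then counts inside $B$, which is valid here because the sample points are equally likely.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))    # 36 ordered outcomes (first die, second die)
A = {s for s in S if s[0] + s[1] >= 10}     # sum is at least 10
B = {s for s in S if s[0] == 5}             # 5 appears on the first die

p_a_given_b = Fraction(len(A & B), len(B))  # B acts as the reduced sample space

print(len(S), sorted(A & B), p_a_given_b)
```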

Problem 20
The probability that a married man watches a certain TV show is 0.4, whereas it is
0.5 for a married woman. The probability that a man watches the show given that
his wife does is 0.7. Find the probability that (i) a married couple watches the
show; (ii) a wife watches the show given that her husband does; (iii) at least one
person of a married couple will watch the show.

Solution:
Let us define the following events.
$H$ = Husband watches the TV show
$W$ = Wife watches the TV show
Given that
$$P(H) = 0.4; \quad P(W) = 0.5; \quad P(H|W) = 0.7$$
(i) A married couple watches the show
This event is defined as $H \cap W$. The probability of this event can be computed from
$$P(H|W) = \frac{P(H \cap W)}{P(W)}$$
$$\Rightarrow P(H \cap W) = P(H|W) \times P(W) = 0.7 \times 0.5 = 0.35$$
It implies that in 35% of families, both the husband and the wife watch the show.

(ii) A wife watches the show given that her husband does
This event is defined as $W|H$. The probability of this event can be computed as
$$P(W|H) = \frac{P(H \cap W)}{P(H)} = \frac{0.35}{0.4} = 0.875$$
Interpret the result.

(iii) At least one person of a married couple will watch the show
This event is defined as $H \cup W$. The probability of this event can be computed as
$$P(H \cup W) = P(H) + P(W) - P(H \cap W) = 0.4 + 0.5 - 0.35 = 0.55$$
Interpret the result.

Exercise: Also, find the probability that (iv) only the husband watches the show; (v)
only the wife watches the show; (vi) neither of them watches the show.
Problem 21

A box contains 7 red and 3 black marbles. Three marbles are drawn one after
another without replacement. Find the probability that (i) first two are red and third
is black; (ii) at least two are red; (iii) all are of same color; (iv) at least one is red.

Solution:
It is clear that at each draw all marbles in the box have the same probability of being
drawn, i.e. marbles are equally likely to be selected. There are 10, 9, and 8 marbles
in the box at the time of the first, second and third draw, since a single marble is
removed at each draw. The tree diagram associated with this experiment is given below.

(i) The event that the first two are red and the third is black is
$$R \cap R \cap B = RRB.$$
The probability of this event is
$$P(RRB) = P(R) \times P(R|R) \times P(B|RR) = \frac{7}{10} \times \frac{6}{9} \times \frac{3}{8} = 0.175.$$
Interpret the result.

(ii) The event that at least two are red is
$$RRR \cup RRB \cup RBR \cup BRR.$$
Therefore,
$$P(\text{at least two are red}) = P(RRR) + P(RRB) + P(RBR) + P(BRR)$$
$$= \frac{7}{10} \times \frac{6}{9} \times \frac{5}{8} + \frac{7}{10} \times \frac{6}{9} \times \frac{3}{8} + \frac{7}{10} \times \frac{3}{9} \times \frac{6}{8} + \frac{3}{10} \times \frac{7}{9} \times \frac{6}{8} = \; ?$$
Interpret the result.

(iii) The event that all are of the same color is
$$RRR \cup BBB.$$
Therefore,
$$P(\text{all are of same color}) = P(RRR) + P(BBB) = \frac{7}{10} \times \frac{6}{9} \times \frac{5}{8} + \frac{3}{10} \times \frac{2}{9} \times \frac{1}{8} = \; ?$$
Interpret the result.

(iv) The event that at least one is red is
$$(BBB)^c.$$
Therefore,
$$P(\text{at least one is red}) = P\left((BBB)^c\right) = 1 - P(BBB) = 1 - \frac{3}{10} \times \frac{2}{9} \times \frac{1}{8} = \; ?$$
Interpret the result.
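The products in Problem 21 can be checked, and the parts left open evaluated, by listing every ordered draw of three marbles. This is a sketch under the assumption that all ordered triples of distinct marbles are equally likely, which is what drawing without replacement means here; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

marbles = ["R"] * 7 + ["B"] * 3             # 7 red and 3 black marbles
draws = list(permutations(range(10), 3))    # ordered draws of 3 distinct marbles

def prob(event):
    """Fraction of ordered draws whose colour sequence satisfies the event."""
    hits = sum(1 for d in draws if event([marbles[i] for i in d]))
    return Fraction(hits, len(draws))

p_rrb = prob(lambda c: c == ["R", "R", "B"])
p_at_least_two_red = prob(lambda c: c.count("R") >= 2)
p_same_colour = prob(lambda c: len(set(c)) == 1)
p_at_least_one_red = prob(lambda c: "R" in c)

for name, p in [("RRB", p_rrb), ("at least two red", p_at_least_two_red),
                ("same colour", p_same_colour), ("at least one red", p_at_least_one_red)]:
    print(name, p, float(p))
```

Running the sketch prints each probability both as an exact fraction and as a decimal, so the results can be compared with the hand calculation.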
Problem 22
A fair coin is tossed until a head appears or it has been tossed three times. Given
that head does not occur on the first toss, what is the probability that the coin is
tossed three times?

Solution:
Since the coin is fair,
$$P(H) = P(T) = \frac{1}{2}.$$
The sample space associated with this experiment is
$$S = \{H, TH, TTH, TTT\}.$$
Since the outcome obtained in a trial does not influence the outcome of the other
trials,
$$P(H) = \frac{1}{2}; \quad P(TH) = P(T) \times P(H) = \frac{1}{4}; \quad P(TTH) = P(T) \times P(T) \times P(H) = \frac{1}{8};$$
$$P(TTT) = P(T) \times P(T) \times P(T) = \frac{1}{8}.$$
Let us define the following events.
$A$ = Head does not appear in the first toss = $\{TH, TTH, TTT\}$
$B$ = Coin is tossed three times = $\{TTH, TTT\}$
We need to find
$$P(B|A) = \frac{P(A \cap B)}{P(A)}$$
Now,
$$A \cap B = \{TTH, TTT\}.$$
Hence,
$$P(A \cap B) = P(TTH) + P(TTT) = \frac{1}{8} + \frac{1}{8} = \frac{1}{4}$$
$$P(A) = P(TH) + P(TTH) + P(TTT) = \frac{1}{4} + \frac{1}{8} + \frac{1}{8} = \frac{1}{2}$$
Finally,
$$P(B|A) = \frac{1}{4} \times 2 = \frac{1}{2}.$$
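In Problem 22 the sample points are not equally likely, so events must be weighted by their probabilities rather than counted. A minimal sketch, with the outcomes written as strings of tosses:

```python
from fractions import Fraction

half = Fraction(1, 2)
# Each outcome's probability is a product over independent tosses of a fair coin.
S = {"H": half, "TH": half**2, "TTH": half**3, "TTT": half**3}

A = {s for s in S if s[0] == "T"}   # head does not appear on the first toss
B = {s for s in S if len(s) == 3}   # coin is tossed three times

def pr(event):
    """Probability of an event as the sum of its outcome probabilities."""
    return sum(S[s] for s in event)

p_b_given_a = pr(A & B) / pr(A)

print(pr(A), pr(A & B), p_b_given_a)
```

Counting alone would give 2/3 here, which is wrong; summing the outcome probabilities gives the value obtained in the solution.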

Problem 23
Suppose that events $A$ and $B$ are exhaustive events. It is given that $P(A|B) = \frac{1}{4}$
and $P(B) = \frac{2}{3}$. Find the probability that $A$ will occur.

Solution:
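Given that events $A$ and $B$ are exhaustive. That is,
$$A \cup B = S.$$
Therefore,
$$P(A \cup B) = P(S)$$
$$\Rightarrow P(A) + P(B) - P(AB) = 1 \qquad (1)$$
It is also given that $P(B) = \frac{2}{3}$ and
$$P(A|B) = \frac{1}{4}$$
$$\Rightarrow \frac{P(AB)}{P(B)} = \frac{1}{4}$$
$$\Rightarrow P(AB) = \frac{1}{4} \times P(B) = \frac{1}{4} \times \frac{2}{3} = \frac{1}{6}.$$
From equation (1),
$$P(A) + \frac{2}{3} - \frac{1}{6} = 1$$
$$\Rightarrow P(A) + \frac{3}{6} = 1$$
$$\Rightarrow P(A) = 1 - \frac{1}{2} = \frac{1}{2}.$$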
Odds of an Event
The odds of an event $A$ is defined as the ratio of the probability that $A$ occurs
to the probability that $A$ does not occur, i.e.
$$Odds(A) = \frac{\Pr(A)}{\Pr(A^c)} = \frac{\Pr(A)}{1 - \Pr(A)}.$$
Similarly, the odds of the event $A$ not occurring is defined as
$$Odds(A^c) = \frac{\Pr(A^c)}{\Pr(A)} = \frac{1 - \Pr(A)}{\Pr(A)}.$$
The value of the odds of an event ranges from $0$ to $\infty$.

Interpretation of odds of an event $A$:

• If $odds(A) = 1$, the occurrence and the non-occurrence of the event $A$ are
equally likely.

• If $odds(A) > 1$, the event $A$ is $100 \times [odds(A) - 1]\%$ more likely to
occur than not to occur.

• If $odds(A) < 1$, the event $A$ is $100 \times [1 - odds(A)]\%$ less likely to
occur than not to occur.

Finding the Probability of an Event from its Odds

Suppose that the odds of an event $A$ is given as $x$. That is,
$$odds(A) = x$$
$$\Rightarrow \frac{P(A)}{P(A^c)} = x$$
$$\Rightarrow P(A) = x \times [1 - P(A)]$$
$$\Rightarrow P(A) = x - x \times P(A)$$
$$\Rightarrow P(A)(1 + x) = x$$
$$\Rightarrow P(A) = \frac{x}{1 + x}.$$
Hence,
$$P(A^c) = 1 - P(A) = \frac{1}{1 + x}.$$
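The two formulas above are inverses of each other, which a short round trip makes concrete. In this sketch the function names and the probability 0.75 are assumptions chosen for illustration.

```python
def odds(p):
    """Odds of an event with probability p (requires 0 <= p < 1)."""
    return p / (1 - p)

def prob_from_odds(x):
    """Probability of an event whose odds are x (requires x >= 0)."""
    return x / (1 + x)

x = odds(0.75)             # hypothetical probability, chosen for illustration
print(x, prob_from_odds(x))
```

Converting a probability to odds and back returns the original probability, as the derivation $P(A) = x/(1+x)$ requires.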

Problem 24
In a class, there are 15 boys and 25 girls. Find the odds of choosing a boy. Interpret
the result.

Solution
The sample space associated with this experiment consists of $(15 + 25) = 40$
sample points, which are equally likely. Let $A$ be an event defined as
$A$ = Choosing a boy.
Therefore,
$$P(A) = \frac{15}{40}.$$
Hence,
$$P(A^c) = 1 - \frac{15}{40} = \frac{25}{40}.$$
Therefore, the odds of choosing a boy from that class is
$$odds(A) = \frac{15}{40} \times \frac{40}{25} = \frac{3}{5} = 0.6.$$
It implies that a boy is $100(1 - 0.6)\% = 40\%$ less likely to be chosen than a girl.
Problem 25
If a political candidate has 35% chance of winning an election, find the odds of
losing the election for the candidate. Interpret the result.

Solution
Let $B$ be an event defined as
$B$ = winning an election.
$B^c$ = losing an election.
Given that
$$P(B) = 0.35.$$
Therefore,
$$P(B^c) = 1 - P(B) = 1 - 0.35 = 0.65.$$
The odds of losing the election for the candidate is
$$odds(B^c) = \frac{P(B^c)}{P(B)} = \frac{0.65}{0.35} = 1.86.$$
It implies that this candidate is $100(1.86 - 1)\% = 86\%$ more likely to lose the
election than to win it.
Bayes’ Theorem
Statement
Suppose that an experiment consists of $k$ mutually exclusive and exhaustive
events $A_1, \cdots, A_i, \cdots, A_k$. Also, suppose that $B$ is another event such that
$B \neq \phi$ and $A_i \cap B \neq \phi$, $\forall i = 1, \cdots, k$. If $P(A_i)$ and $P(B|A_i)$, $\forall i = 1, \cdots, k$, are known
in advance, the probability of the event $A_i|B$ can be computed as
$$P(A_i|B) = \frac{P(A_i)\,P(B|A_i)}{P(A_1)\,P(B|A_1) + \cdots + P(A_i)\,P(B|A_i) + \cdots + P(A_k)\,P(B|A_k)}$$
$$= \frac{P(A_i)\,P(B|A_i)}{\sum_{j=1}^{k} P(A_j)\,P(B|A_j)}; \quad i = 1, \cdots, k.$$
The information on $P(A_i)$ and $P(B|A_i)$ is known as prior information and
information on $P(A_i|B)$ is known as posterior information.

Proof:
Given that events $A_1, \cdots, A_i, \cdots, A_k$ are mutually exclusive and exhaustive, i.e.
$$A_1 \cup \cdots \cup A_i \cup \cdots \cup A_k = S$$
$$A_i \cap A_j = \phi, \quad \forall i \neq j.$$
It is also given that $B$ is an event such that $B \neq \phi$ and $A_i \cap B \neq \phi$, $\forall i = 1, \cdots, k$.
Under this setup, the Venn diagram is given below.

The event $B$ can be expressed as
$$B = (A_1 \cap B) \cup \cdots \cup (A_i \cap B) \cup \cdots \cup (A_k \cap B).$$
Since events $A_1, \cdots, A_i, \cdots, A_k$ are mutually exclusive, the events
$(A_1 \cap B), \cdots, (A_i \cap B), \cdots, (A_k \cap B)$ are also mutually exclusive. Then using Axiom 3, one may write
$$P(B) = P(A_1 \cap B) + \cdots + P(A_i \cap B) + \cdots + P(A_k \cap B) = \sum_{j=1}^{k} P(A_j \cap B).$$

Using the information given on $P(A_i)$ and $P(B|A_i)$, one may compute $P(A_i \cap B)$ as
follows. Following the definition of conditional probability,
$$P(B|A_i) = \frac{P(A_i \cap B)}{P(A_i)}$$
$$\Rightarrow P(A_i \cap B) = P(A_i)\,P(B|A_i).$$
Hence,
$$P(B) = \sum_{j=1}^{k} P(A_j)\,P(B|A_j).$$

Following the definition of conditional probability, the posterior information
$P(A_i|B)$ can be obtained as
$$P(A_i|B) = \frac{P(A_i \cap B)}{P(B)} = \frac{P(A_i)\,P(B|A_i)}{\sum_{j=1}^{k} P(A_j)\,P(B|A_j)},$$
i.e.
$$P(A_i|B) = \frac{P(A_i)\,P(B|A_i)}{\sum_{j=1}^{k} P(A_j)\,P(B|A_j)}; \quad i = 1, \cdots, k.$$
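The two steps of the proof (total probability for $P(B)$, then division) translate directly into a small function. This is a sketch: the name `posterior` and the two-event input values are assumptions made for illustration and do not come from the notes.

```python
def posterior(priors, likelihoods):
    """Return the list [P(A_i | B)] from priors P(A_i) and likelihoods P(B | A_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A_i ∩ B) for each i
    total = sum(joint)                                    # P(B) by total probability
    return [j / total for j in joint]

# Hypothetical two-event illustration (values not from the text).
post = posterior([0.5, 0.5], [0.2, 0.6])
print(post)
```

The function assumes the priors describe mutually exclusive and exhaustive events, so they should sum to 1; the returned posterior probabilities then also sum to 1.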
Problem 26
A survey was conducted among students of Dhaka University (DU), Rajshahi
University (RU), and Chittagong University (CU) to know their opinion regarding
a 4-year honors course. The percentages of students favoring the 4-year honors course
were: DU 21%; RU 45%; and CU 75%. If a university is chosen at random and a student is
selected from this university also at random, what is the probability that the selected
student will be in favor of the 4-year honors course? Given that the student is in favor
of the 4-year course, what is the probability that he/she comes from DU? From CU?
From RU?

Solution
Let us define the following events.
$D$ = Selecting a student from DU
$C$ = Selecting a student from CU
$R$ = Selecting a student from RU
$F$ = Favoring 4-year course by a student
Given that
$$P(D) = \frac{1}{3}; \quad P(C) = \frac{1}{3}; \quad P(R) = \frac{1}{3}.$$
It was also given that
$$P(F|D) = 0.21; \quad P(F|C) = 0.75; \quad P(F|R) = 0.45.$$
One may redefine the event $F$ as
$$F = DF \cup CF \cup RF.$$
Hence,
$$P(F) = P(DF) + P(CF) + P(RF)$$
$$= P(D) \times P(F|D) + P(C) \times P(F|C) + P(R) \times P(F|R)$$
$$= \frac{1}{3} \times 0.21 + \frac{1}{3} \times 0.75 + \frac{1}{3} \times 0.45 = 0.47.$$
It implies that 47% of students from all three universities favor the 4-year course.

The event that the student comes from DU, given that he/she is in favor of the 4-year
course, is defined as $D|F$. The probability of this event is
$$P(D|F) = \frac{P(DF)}{P(F)} = \frac{P(D) \times P(F|D)}{P(D) \times P(F|D) + P(C) \times P(F|C) + P(R) \times P(F|R)}$$
$$= \frac{\frac{1}{3} \times 0.21}{0.47} = 0.15.$$
That is, among the students who favor the 4-year course, 15% of students are from
DU.

Try: $P(C|F) = 0.53$.

Try: $P(R|F) = 0.32$.

Problem 27
Suppose that a bin contains many small plastic eggs. Some eggs are painted red
and some are painted blue. Among the eggs, 40% contain a pearl and 60% contain nothing.
Among the eggs that contain a pearl, 30% are painted blue. Among the eggs that do not
contain a pearl, 10% are painted blue. What is the probability that an egg contains
a pearl if it is blue?

Hints:
$P$: an egg contains a pearl
$N$: an egg contains nothing
$B$: an egg is painted blue
Given that
$$\Pr(P) = 0.40; \quad \Pr(N) = 0.60$$
$$\Pr(B|P) = 0.30; \quad \Pr(B|N) = 0.10$$
$$\Pr(B) = \Pr(BP) + \Pr(BN) = \Pr(P)\Pr(B|P) + \Pr(N)\Pr(B|N) = 0.40 \times 0.30 + 0.60 \times 0.10 = 0.18$$
$$\Pr(P|B) = \frac{\Pr(BP)}{\Pr(B)} = \frac{0.12}{0.18} = 0.667. \quad \text{(Answer)}$$

Problem 28
Suppose that there are two websites, A and B, for renting books. Site A
receives 60% of all orders. Among the orders placed on site A, 75% arrive on time.
Among the orders placed on site B, 90% arrive on time. What is the probability
that an order arrives on time? Given that an order arrives on time, find the
probability that it was placed on website B.

Hints:
$A$ = Order placed on website A
$B$ = Order placed on website B
$T$ = Order arrives on time
Given that
$$P(A) = 0.60; \quad P(B) = 0.40$$
$$P(T|A) = 0.75; \quad P(T|B) = 0.90$$

$$T = TA \cup TB$$
$$P(T) = P(TA) + P(TB) = P(A)P(T|A) + P(B)P(T|B) = 0.60 \times 0.75 + 0.40 \times 0.90 = 0.81$$
$$P(B|T) = \frac{P(TB)}{P(T)} = \frac{P(B)P(T|B)}{P(A)P(T|A) + P(B)P(T|B)} = \frac{0.36}{0.81} = 0.4444$$
Problem 29
Suppose that 1% of a population have a certain disease. A test is
used to detect this disease. This test is positive in 95% of the people with the
disease and is also positive in 2% of the people free from the disease. Find the
probability that the test is positive. If a person, selected at random from this
population, has tested positive, what is the probability that he/she has the disease?

Hints:
$D$ = an individual has this disease
$D^c$ = an individual does not have this disease
$T^+$ = a test is positive
Given that
$$P(D) = 0.01; \quad P(D^c) = 1 - 0.01 = 0.99$$
$$P(T^+|D) = 0.95; \quad P(T^+|D^c) = 0.02$$

$$T^+ = T^+D \cup T^+D^c$$
$$P(T^+) = P(T^+D) + P(T^+D^c) = P(D)P(T^+|D) + P(D^c)P(T^+|D^c)$$
$$= 0.01 \times 0.95 + 0.99 \times 0.02 = 0.0293$$
i.e. 2.93% of individuals in the population will have positive test results.

$$P(D|T^+) = \frac{P(T^+D)}{P(T^+)} = \frac{P(D)P(T^+|D)}{P(D)P(T^+|D) + P(D^c)P(T^+|D^c)} = \frac{0.01 \times 0.95}{0.0293} = 0.3242$$
i.e. among the individuals having positive test results, 32.42% of them have this
disease.
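The hints for Problem 29 apply the total probability rule and then Bayes' theorem with two groups, diseased and disease-free. A minimal sketch, with variable names chosen here for illustration:

```python
p_d = 0.01             # prevalence, P(D)
p_pos_given_d = 0.95   # P(T+ | D)
p_pos_given_nd = 0.02  # P(T+ | D^c)

# Total probability: a positive test comes from either group.
p_pos = p_d * p_pos_given_d + (1 - p_d) * p_pos_given_nd
# Bayes' theorem: share of positive tests that come from diseased individuals.
p_d_given_pos = p_d * p_pos_given_d / p_pos

print(round(p_pos, 4), round(p_d_given_pos, 4))
```

The posterior probability is far below 0.95 because the disease is rare, so most positive results come from the much larger disease-free group.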
Problem 30
In a market, 60% of light bulbs are produced by company A and the rest are
produced by company B. Among the bulbs produced by company A, 4% are found
to be defective, whereas 5% of the bulbs produced by company B are defective. If you
buy a light bulb from this market, what is the probability that it is defective? If a
light bulb is defective, what is the probability that it was produced by company A?

Hints:
$A$ = a light bulb produced by company A
$B$ = a light bulb produced by company B
$D$ = a light bulb is defective
Given that
$$P(A) = 0.60; \quad P(B) = 0.40$$
$$P(D|A) = 0.04; \quad P(D|B) = 0.05$$
$$D = DA \cup DB$$
$$P(D) = P(DA) + P(DB) = P(A)P(D|A) + P(B)P(D|B) = 0.60 \times 0.04 + 0.40 \times 0.05 = 0.044$$
$$P(A|D) = \frac{P(DA)}{P(D)} = \frac{0.024}{0.044} = 0.5455$$

Problem 31
Party A or Party B will win the next election with probability 0.4 and 0.6,
respectively. If Party A wins, it will pass the equity bill with probability 0.8. If
Party B wins, it will pass the equity bill with probability 0.3. What is the
probability that the equity bill will be passed? If the equity bill is passed, what is
the probability that Party A has won the election? If the equity bill is passed, what
is the probability that Party A has not won the election?
Hints:
$A$ = Party A will win the election
$B$ = Party B will win the election
$E$ = equity bill is passed
Given that
$$P(A) = 0.4; \quad P(B) = 0.6$$
$$P(E|A) = 0.8; \quad P(E|B) = 0.3$$
$$E = EA \cup EB$$
$$P(E) = P(EA) + P(EB) = P(A)P(E|A) + P(B)P(E|B) = 0.4 \times 0.8 + 0.6 \times 0.3 = 0.50$$
$$P(A|E) = \frac{0.32}{0.50} = 0.64$$
$$P(A^c|E) = P(B|E) = \frac{0.18}{0.50} = 0.36$$
Problem 32
Two fair dice are thrown. Calculate the probability that (i) sum of two numbers is
six; (ii) difference between two numbers is zero; (iii) sum of two numbers is
greater than eight.

Hints:
Since the dice are fair, for each die
$$P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = \frac{1}{6}.$$
There are $6^2 = 36$ sample points in the sample space. Assume that the outcome of one
die does not influence the outcome of the other die. Therefore, the probability of each
sample point is
$$P(\text{each sample point}) = \frac{1}{6} \times \frac{1}{6} = \frac{1}{36}.$$
(i) The event that the sum of the two numbers is six is
$$A = \{(1,5), (2,4), (3,3), (4,2), (5,1)\}$$
$$P(A) = P(1,5) + P(2,4) + P(3,3) + P(4,2) + P(5,1) = \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} = \frac{5}{36}$$
(ii) The event that the difference between the two numbers is zero is
$$B = \{(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)\}$$
$$P(B) = \frac{6}{36}$$

(iii) The event that the sum of the two numbers is greater than eight is
$$C = \{(3,6), (4,5), (4,6), (5,4), (5,5), (5,6), (6,3), (6,4), (6,5), (6,6)\}$$
$$P(C) = \frac{10}{36}.$$
Problem 33
For an unfair die, $P(x) = \frac{x}{21}$; $x = 1, 2, 3, 4, 5, 6$. The die is thrown twice. What is the
probability that you will get a ‘3’ on the first toss and a ‘6’ on the second toss?
Calculate the probability that (i) the sum of the two numbers is five; (ii) the difference
between the two numbers is zero; (iii) the sum of the two numbers is greater than or equal to
ten.

Hints:
The die is unfair with
$$P(1) = \frac{1}{21}; \; P(2) = \frac{2}{21}; \; P(3) = \frac{3}{21}; \; P(4) = \frac{4}{21}; \; P(5) = \frac{5}{21}; \; P(6) = \frac{6}{21}.$$
There are $6^2 = 36$ sample points in the sample space. Assume that the outcome of one
toss does not influence the outcome of the other toss.

The event that a ‘3’ appears on the first toss and a ‘6’ on the second toss is
$$A = \{(3,6)\}$$
$$P(A) = P(3,6) = P(3) \times P(6) = \frac{3}{21} \times \frac{6}{21} = 0.041$$
(i) The event that the sum of the two numbers is five is
$$B = \{(1,4), (2,3), (3,2), (4,1)\}$$
$$P(B) = P(1,4) + P(2,3) + P(3,2) + P(4,1)$$
$$= P(1) \times P(4) + P(2) \times P(3) + P(3) \times P(2) + P(4) \times P(1)$$
$$= \frac{1}{21} \times \frac{4}{21} + \frac{2}{21} \times \frac{3}{21} + \frac{3}{21} \times \frac{2}{21} + \frac{4}{21} \times \frac{1}{21}$$
(ii) The event that the difference between the two numbers is zero is
$$C = \{(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)\}$$
$$P(C) = \; ?$$
(iii) The event that the sum of the two numbers is greater than or equal to ten is
$$D = \{(4,6), (5,5), (5,6), (6,4), (6,5), (6,6)\}$$
$$P(D) = \; ?$$
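With an unfair die the 36 sample points are no longer equally likely, so each ordered pair must carry its own probability $P(a) \times P(b)$. The following sketch (helper names are hypothetical) builds that table under the independence assumption stated in the hints and sums it over each event:

```python
from fractions import Fraction
from itertools import product

p = {x: Fraction(x, 21) for x in range(1, 7)}   # P(x) = x/21 for one toss
# Independent tosses: the probability of an ordered pair is a product.
pair = {(a, b): p[a] * p[b] for a, b in product(p, repeat=2)}

def prob(event):
    """Sum the probabilities of all ordered pairs that satisfy the event."""
    return sum(q for outcome, q in pair.items() if event(*outcome))

p_3_then_6 = pair[(3, 6)]
p_sum_5 = prob(lambda a, b: a + b == 5)
p_diff_0 = prob(lambda a, b: a == b)
p_sum_ge_10 = prob(lambda a, b: a + b >= 10)

print(p_3_then_6, p_sum_5, p_diff_0, p_sum_ge_10)
```

The printed fractions can be used to check the hand calculations for parts (i) to (iii).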

Problem 34

Of flights from Dhaka to London, 89.5% leave on time and arrive on time; 3.5%
leave on time and arrive late; 1.5% leave late and arrive on time; and 5.5% leave
late and arrive late. What is the probability that a flight arrives on time, given that
it leaves late?

Hints
Let us define following events.
𝐿𝐿 = 𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙 𝑜𝑜𝑜𝑜 𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡
𝐿𝐿𝑐𝑐 = 𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙 𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙
𝐴𝐴 = 𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎 𝑜𝑜𝑜𝑜 𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡
𝐴𝐴𝑐𝑐 = 𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎 𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙
Given that
𝑃𝑃(𝐿𝐿𝐿𝐿) = 0.895
𝑃𝑃(𝐿𝐿𝐴𝐴𝑐𝑐 ) = 0.035
𝑃𝑃(𝐿𝐿𝑐𝑐 𝐴𝐴) = 0.015
𝑃𝑃(𝐿𝐿𝑐𝑐 𝐴𝐴𝑐𝑐 ) = 0.055
The event that given that a flight leave late, it will arrive on time is
𝐴𝐴|𝐿𝐿𝑐𝑐 .
The probability of this event is
𝑐𝑐 )
𝑃𝑃(𝐿𝐿𝑐𝑐 𝐴𝐴)
𝑃𝑃(𝐴𝐴|𝐿𝐿 =
𝑃𝑃(𝐿𝐿𝑐𝑐 )
Here,
𝐿𝐿𝑐𝑐 = 𝐿𝐿𝑐𝑐 𝐴𝐴 ∪ 𝐿𝐿𝑐𝑐 𝐴𝐴𝑐𝑐
𝑃𝑃(𝐿𝐿𝑐𝑐 ) = 𝑃𝑃(𝐿𝐿𝑐𝑐 𝐴𝐴) + 𝑃𝑃(𝐿𝐿𝑐𝑐 𝐴𝐴𝑐𝑐 ) = 0.015 + 0.055 = 0.070

0.015
𝑃𝑃(𝐴𝐴|𝐿𝐿𝑐𝑐 ) = = 0.214.
0.070
It implies that among the flights leaving late, 21.4% of them arrive on time.
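
As a rough check, the same conditional probability can be read off a joint table. This is a sketch only; the dictionary layout and the string labels are assumptions made for illustration.

```python
# Joint probabilities keyed by (departure status, arrival status).
joint = {
    ("on_time", "on_time"): 0.895,
    ("on_time", "late"): 0.035,
    ("late", "on_time"): 0.015,
    ("late", "late"): 0.055,
}

# Marginal probability of leaving late: sum over both arrival outcomes.
p_leave_late = sum(pr for (leave, _), pr in joint.items() if leave == "late")

# Conditional probability P(arrive on time | leave late).
p_on_time_given_late = joint[("late", "on_time")] / p_leave_late

print(round(p_leave_late, 3), round(p_on_time_given_late, 3))
```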

Problem 35
Three independent missiles are launched simultaneously from different launchers.
The success rates of hitting the target for these launchers are 80%, 75%, and 80%,
respectively. What is the probability that at least two missiles will hit the target?

Hints:
Let us define the following events.
$M_1$ = Missile 1 hitting target
$M_2$ = Missile 2 hitting target
$M_3$ = Missile 3 hitting target
Given that
$P(M_1) = 0.8$, $P(M_2) = 0.75$, $P(M_3) = 0.80$
$P(M_1^c) = 0.2$, $P(M_2^c) = 0.25$, $P(M_3^c) = 0.20$

The event that at least two missiles will hit the target is
$A = M_1 M_2 M_3^c \cup M_1 M_2^c M_3 \cup M_1^c M_2 M_3 \cup M_1 M_2 M_3$
Since these four events are mutually exclusive,
$P(A) = P(M_1 M_2 M_3^c) + P(M_1 M_2^c M_3) + P(M_1^c M_2 M_3) + P(M_1 M_2 M_3)$
Since the missiles operate independently,
$P(M_1 M_2 M_3^c) = P(M_1) \times P(M_2) \times P(M_3^c)$
$= 0.8 \times 0.75 \times 0.20 = 0.12$
Similarly,
$P(M_1 M_2^c M_3) = 0.16$
$P(M_1^c M_2 M_3) = 0.12$
$P(M_1 M_2 M_3) = 0.48$
Hence,
$P(A) = 0.12 + 0.16 + 0.12 + 0.48 = 0.88$
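
The enumeration of hit/miss patterns can be sketched in code as follows. The function name `p_at_least` is hypothetical, and independence of the launchers is assumed as in the problem.

```python
from itertools import product

hit_probs = [0.80, 0.75, 0.80]  # success rates of the three launchers

def p_at_least(k, probs):
    """Probability that at least k of the independent events occur."""
    total = 0.0
    # Each outcome is a tuple of booleans: True means that missile hits.
    for outcome in product([True, False], repeat=len(probs)):
        if sum(outcome) >= k:
            pr = 1.0
            for hit, p in zip(outcome, probs):
                pr *= p if hit else 1 - p
            total += pr
    return total

p_two_or_more = p_at_least(2, hit_probs)
print(round(p_two_or_more, 2))
```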

Problem 36
For a particular unfair die, the probability of a ‘5’ is 0.2. The die is thrown twice.
What is the probability that ‘5’ occurs exactly once?

Hints:
Let us define the following events.
$5$ = 5 occurs
$5^c$ = 5 does not occur
Given that
$P(5) = 0.2$
$P(5^c) = 0.8$
It can be assumed that the outcome of one toss does not influence the outcome of
the other toss. Let $A$ be an event defined as
$A$ = 5 occurs exactly once in 2 tosses
$= (5, 5^c) \cup (5^c, 5)$
$P(A) = P(5, 5^c) + P(5^c, 5)$
$= P(5) \times P(5^c) + P(5^c) \times P(5)$
$= 0.2 \times 0.8 + 0.8 \times 0.2 = 0.32$

Problem 37
There are two tributaries in a watershed. From past experience, the probability
that water in tributary 1 will overflow during a major storm is 0.5, whereas the
probability that tributary 2 will overflow is 0.4. The probability of tributary 2
overflowing is 0.6 given that tributary 1 overflows. Calculate the probability that
(i) at least one tributary would overflow in a storm; (ii) neither of them would
overflow in a storm.

Hints:
Let
$T_1$ = Tributary 1 overflows
$T_2$ = Tributary 2 overflows
Given that
$P(T_1) = 0.5$ and $P(T_2) = 0.4$
$P(T_2|T_1) = 0.6$
That is,
$\frac{P(T_1 T_2)}{P(T_1)} = 0.6$
$\Rightarrow P(T_1 T_2) = 0.6 \times P(T_1) = 0.6 \times 0.5 = 0.30$
(i) The event that at least one tributary would overflow in a storm is
$T_1 \cup T_2$
$P(T_1 \cup T_2) = P(T_1) + P(T_2) - P(T_1 T_2)$
$= 0.5 + 0.4 - 0.3 = 0.6$

(ii) The event that neither of them would overflow in a storm is
$T_1^c \cap T_2^c = (T_1 \cup T_2)^c$
$P(T_1^c \cap T_2^c) = P((T_1 \cup T_2)^c) = 1 - P(T_1 \cup T_2) = 1 - 0.6 = 0.4$

Problem 38
A smoke-detector system uses two devices, A and B. If smoke is present, the
probability that it will be detected by device A is 0.95; by device B, 0.98; and by both
devices, 0.94. If smoke is present, find the probability that (i) the smoke will be
detected by device A or device B or both devices; (ii) the smoke will not be
detected.

Hints
Let
$A$ = Device A detects the smoke
$B$ = Device B detects the smoke
Given that
$P(A) = 0.95$; $P(B) = 0.98$; $P(AB) = 0.94$

(i) The event that the smoke will be detected by device A or device B or both
devices is
$A \cup B$
$P(A \cup B) = P(A) + P(B) - P(AB) = 0.95 + 0.98 - 0.94 = 0.99$

(ii) The event that the smoke will not be detected is
$A^c B^c = (A \cup B)^c$
$P(A^c B^c) = P((A \cup B)^c) = 1 - P(A \cup B) = 1 - 0.99 = 0.01$

Problem 39
A survey of people in a given region showed that 20% were smokers. The
probability of death due to lung cancer, given that a person smoked, was roughly
10 times the probability of death due to lung cancer, given that a person did not
smoke. If the probability of death due to lung cancer in the region is 0.006, what is
the probability of death due to lung cancer given that a person is a smoker?

Hints:
Let
$D$ = a death due to lung cancer
$S$ = an individual is a smoker
$S^c$ = an individual is not a smoker

Given that
$P(S) = 0.20$; $P(S^c) = 0.80$; $P(D) = 0.006$
$P(D|S) \approx 10 \times P(D|S^c)$
It is asked to find
$P(D|S) = \ ?$
The event $D$ can be expressed as
$D = DS \cup DS^c$
$P(D) = P(DS) + P(DS^c)$
$\Rightarrow P(D) = P(S) P(D|S) + P(S^c) P(D|S^c)$
$\Rightarrow P(S) P(D|S) = P(D) - P(S^c) P(D|S^c)$
$\Rightarrow P(S) P(D|S) \approx P(D) - P(S^c) \times \frac{P(D|S)}{10}$
$\Rightarrow 0.20 \, P(D|S) \approx 0.006 - \frac{0.8}{10} P(D|S)$
$\Rightarrow 0.20 \, P(D|S) \approx 0.006 - 0.08 \, P(D|S)$
$\Rightarrow 0.28 \, P(D|S) \approx 0.006$
$\Rightarrow P(D|S) \approx \frac{0.006}{0.28} = 0.021$
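
The rearrangement above amounts to solving the total-probability equation for one unknown. A short sketch, treating the "roughly 10 times" ratio as exact (an assumption made only for the calculation); the variable names are illustrative.

```python
p_s = 0.20      # P(S): proportion of smokers
p_d = 0.006     # P(D): overall probability of death due to lung cancer
ratio = 10      # assumed exact: P(D|S) = ratio * P(D|S^c)

# P(D) = P(S) * p + P(S^c) * p / ratio, solved for p = P(D|S).
p_d_given_s = p_d / (p_s + (1 - p_s) / ratio)
p_d_given_not_s = p_d_given_s / ratio

# Plugging both values back in should reproduce P(D).
check = p_s * p_d_given_s + (1 - p_s) * p_d_given_not_s
print(round(p_d_given_s, 3), round(check, 3))
```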
Problem 40
A student prepares for an exam by studying a list of ten problems. The student can
solve six of them. For the exam, the instructor selects five problems at random
from the ten on the list given to the students. What is the probability that the
student can solve all five problems on the exam?
Hints:
In the sample space, the number of sample points is
$\binom{10}{5} = 252$.
Assume that all sample points are equally likely to be selected.
Let
$A$ = student can solve all 5 problems
The number of sample points in $A$ is
$\binom{6}{5} \times \binom{4}{0} = 6$.
Therefore,
$P(A) = \frac{6}{252} = 0.0238$.
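
The counting argument can be reproduced with the standard-library function `math.comb`. This is a minimal sketch; the variable names are illustrative.

```python
from math import comb

total = comb(10, 5)                   # ways to choose 5 of the 10 problems
favourable = comb(6, 5) * comb(4, 0)  # all 5 chosen from the 6 solvable ones
p_all_five = favourable / total

print(total, favourable, round(p_all_five, 4))
```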
Exercise
Probability of developing a certain disease in a community is 6%. A couple has 3
children. If it is known that at least one of these children has this disease, what is
the probability that all children developed this disease?
Probability Distribution Function

Random Variable
A variable is said to be a random variable if all possible values that the variable can
assume are known in advance, but it is not possible to predict with
certainty what the value will be for a given entity (from which information on the
variable is collected). Two types of random variable are widely used in statistics.
These are
• Discrete Random Variable
• Continuous Random Variable
Note
• A random variable is usually denoted by a capital letter, say,
$X, Y, Z, W, \ldots$
• A value of a random variable is usually denoted by a small letter, say,
$x, y, z, w, \ldots$

Probability Distribution Function


A statistical function that describes the distribution of the population values that a
random variable can assume is called a probability distribution function. It is a
function of the value of a random variable.
• It is widely used to determine the population characteristics of a random
variable of interest.
• It is also used to find the probability of an event defined by using the
values of a random variable. In this case, the sample space is the set of all
possible values of the random variable and each value is treated as a sample
point. Hence, the values of a random variable are mutually exclusive or disjoint.
• Depending on the type of random variable (discrete or continuous), a
probability distribution function falls into one of two categories. These
are
o Probability mass function (pmf), when the random variable is of discrete type
o Probability density function (pdf), when the random variable is of continuous type.

Probability Mass Function: Discrete Random Variable

Suppose that $X$ is a discrete random variable and $x$ is a value of $X$. Let $R$ be the set of
all possible values of $X$, i.e. $R$ is the sample space and all $x$'s in $R$ are mutually
exclusive. Also, let $f(x)$ be a function defined on $R$. The function $f(x)$ is said to
be the probability mass function (pmf) of the discrete random variable $X$ if the
following conditions are satisfied.
1. $f(x) > 0, \ \forall x \in R$.
2. $\sum_{x \in R} f(x) = 1$.
Note that
$f(x) = \begin{cases} \Pr(X = x), & \forall x \in R \\ 0, & \forall x \notin R. \end{cases}$
Remark:
1. Let $A$ be an event defined using the values of $X$, i.e. $A \subset R$. Then the
probability of event $A$ can be defined as
$\Pr(A) = \sum_{x \in A} f(x)$.
2. The pmf of a discrete random variable $X$ can also be presented by a table or
by a graph showing all possible values of $X$ along with their respective
probabilities.

Problem
Consider an experiment of tossing two fair coins. Define a random variable that
represents the number of heads obtained from the experiment. Determine the
probability distribution function of the random variable. Also, present the function
in tabular form and graphically. Hence, find the probability that (i) no head will be
observed and (ii) at least one head will be observed.

Solution
The sample space associated with this experiment is given by
$S = \{HH, HT, TH, TT\}$.
Let $X$ be the random variable representing the number of heads obtained from this
experiment. Therefore, the possible values of $X$ are 0, 1, and 2, i.e. $R = \{x \mid x = 0,1,2\}$. Now,
$\Pr(X = 0) = \Pr(TT) = \Pr(T) \times \Pr(T) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$
$\Pr(X = 1) = \Pr(HT \cup TH) = \Pr(HT) + \Pr(TH)$
$= \Pr(H) \times \Pr(T) + \Pr(T) \times \Pr(H) = \frac{1}{2} \times \frac{1}{2} + \frac{1}{2} \times \frac{1}{2} = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$
$\Pr(X = 2) = \Pr(HH) = \Pr(H) \times \Pr(H) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$
Therefore, the pmf of $X$ is given by
$f(x) = \binom{2}{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{2-x}; \quad x = 0,1,2.$
This pmf can also be presented by using a table or a graph. In tabular form,
$x$            0     1     2
$\Pr(X = x)$   1/4   1/2   1/4

(i) The event that no head is observed can be defined as
$X = 0$.
Therefore,
$\Pr(X = 0) = \frac{1}{4}$.
(ii) The event that at least one head is observed can be defined as
$X \ge 1$.
Therefore,
$\Pr(X \ge 1) = \Pr(X = 1 \cup X = 2) = \Pr(X = 1) + \Pr(X = 2) = \frac{1}{2} + \frac{1}{4} = \frac{3}{4}$.
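
The pmf above can also be obtained by listing the sample space and counting heads. A minimal sketch, assuming equally likely outcomes (fair coins); the names `outcomes` and `pmf` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses: ('H','H'), ('H','T'), ('T','H'), ('T','T').
outcomes = list(product("HT", repeat=2))

# Count how many sample points give each number of heads.
counts = Counter(o.count("H") for o in outcomes)

# Equally likely sample points, so each has probability 1/4.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

p_no_head = pmf[0]
p_at_least_one = sum(pr for x, pr in pmf.items() if x >= 1)
print(pmf, p_no_head, p_at_least_one)
```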
Problem
Suppose that the random variable $X$ represents the number of times an individual in a
community had colds last winter. The pmf of $X$ is given as
$x$          0        1    2        3
$P(X = x)$   0.6250   ?    0.1250   0.0625
The entry for $x = 1$ is not given; it follows from the requirement that the probabilities sum to 1.
Find
(i) $P(X = 1)$. Ans.: 0.1875
(ii) $P(X > 2)$. Ans.: 0.0625
(iii) $P(1 \le X < 3)$. Ans.: 0.3125
(iv) $P(X \le 2)$. Ans.: 0.9375
(v) $P(X = 3.5)$. Ans.: 0
(vi) $P(-1.5 \le X < 1.3)$. Ans.: 0.8125
(vii) $P(X \le 10)$. Ans.: 1.0
Also, find the probability that the selected person had
(viii) at least 2 colds. Ans.: 0.1875
(ix) at most 2 colds. Ans.: 0.9375
(x) more than 2 colds. Ans.: 0.0625
(xi) less than 2 colds. Ans.: 0.8125

Problem
Suppose that the random variable $X$ has the following pmf.
$x$          10    11    12    13    14
$P(X = x)$   0.2   0.3   0.2   0.1   0.2
Find $P(X \le 11 \text{ or } X > 12)$. Ans.: 0.8.
Note
1. $1 + r + r^2 + r^3 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}$, provided that $r \ne 1$
2. $1 + r + r^2 + r^3 + \cdots = \frac{1}{1 - r}$, provided that $|r| < 1$

Problem
Suppose that the random variable $X$ can take the values $1, 2, 3, 4, \cdots$. Show that the following
function can be used as a pmf for $X$.
$f(x) = 0.23 \times 0.77^{x-1}$.
Also, find $P(X = 1)$, $P(X \le 5)$, and $P(X \ge 2)$.

Solution
Given that
$R = \{x \mid x = 1,2,3,4, \cdots\}$
It can be shown that
$f(x) > 0, \ \forall x \in R$.
For example,
$f(1) = 0.23 > 0$; $f(2) = 0.23 \times 0.77 > 0$; $f(3) = 0.23 \times 0.77^2 > 0$.
Again,
$\sum_{x \in R} f(x) = f(1) + f(2) + f(3) + f(4) + \cdots$
$= 0.23 + 0.23 \times 0.77 + 0.23 \times 0.77^2 + 0.23 \times 0.77^3 + \cdots$
$= 0.23\,[1 + 0.77 + 0.77^2 + 0.77^3 + \cdots] = 0.23 \times \frac{1}{1 - 0.77} = 0.23 \times \frac{1}{0.23} = 1$
Therefore, the function
$f(x) = 0.23 \times 0.77^{x-1}, \quad x = 1,2,3,4, \cdots$
can be used as a pmf for $X$.
Now,
$P(X = 1) = f(1) = 0.23$

$P(X \le 5) = f(1) + f(2) + f(3) + f(4) + f(5)$
$= 0.23 + 0.23 \times 0.77 + 0.23 \times 0.77^2 + 0.23 \times 0.77^3 + 0.23 \times 0.77^4$
$= 0.23\,[1 + 0.77 + 0.77^2 + 0.77^3 + 0.77^4]$
$= 0.23 \times \frac{1 - 0.77^5}{1 - 0.77} = 0.729$

$P(X \ge 2) = f(2) + f(3) + f(4) + \cdots$
$= 0.23 \times 0.77 + 0.23 \times 0.77^2 + 0.23 \times 0.77^3 + \cdots$
$= (0.23 \times 0.77)[1 + 0.77 + 0.77^2 + \cdots] = (0.23 \times 0.77) \times \frac{1}{1 - 0.77} = 0.77$
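
A numerical check of these results might look like the sketch below. The infinite sum is truncated at 200 terms, which is an assumption of the sketch (the neglected tail is far below floating-point precision); the names are illustrative.

```python
p, q = 0.23, 0.77

def f(x):
    """pmf f(x) = 0.23 * 0.77**(x - 1) for x = 1, 2, 3, ..."""
    return p * q ** (x - 1)

# Truncated version of the infinite sum; the tail beyond 200 terms is negligible.
total = sum(f(x) for x in range(1, 201))

p_eq_1 = f(1)
p_le_5 = sum(f(x) for x in range(1, 6))
p_ge_2 = 1 - f(1)   # complement of X = 1, since X takes values 1, 2, 3, ...

print(round(total, 6), p_eq_1, round(p_le_5, 3), round(p_ge_2, 2))
```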

Problem
Suppose that an unfair coin with $P(H) = p$ is tossed until a head appears for the
first time. If $Y$ is a random variable indicating the number of tosses required, find
the probability distribution function for $Y$. Justify your answer. If the coin is fair,
find $P(2 \le Y < 5)$.

Solution
Let $R$ be the set of all possible values of $Y$, i.e.
$R = \{y \mid y = 1,2,3,4, \cdots\}$.
Given that the coin is unfair with $P(H) = p$. Hence, $P(T) = 1 - p = q$. Now,
$P(Y = 1) = P(H) = p$
$P(Y = 2) = P(TH) = q \times p = qp$
$P(Y = 3) = P(TTH) = q \times q \times p = q^2 p$
$P(Y = 4) = P(TTTH) = q \times q \times q \times p = q^3 p$
$\vdots$
$P(Y = 15) = q \times q \times \cdots \times q \times p = q^{14} p$

In general, the pmf of $Y$ is
$f(y) = P(Y = y) = q^{y-1} p; \quad y = 1,2,3, \cdots; \text{ and } q = 1 - p$

Justification
It is clear from the above expression that
$f(y) > 0, \ \forall y \in R$.
Again,
$\sum_{y \in R} f(y) = f(1) + f(2) + f(3) + \cdots$
$= p\,[1 + q + q^2 + q^3 + \cdots] = p \times \frac{1}{1 - q} = p \times \frac{1}{p} = 1.$
When the coin is fair, $p = q = \frac{1}{2}$. Therefore,

$P(2 \le Y < 5) = P(Y = 2) + P(Y = 3) + P(Y = 4)$
$= p\,(q + q^2 + q^3) = \frac{1}{2}\left(\frac{1}{2} + \frac{1}{4} + \frac{1}{8}\right) = \frac{7}{16}.$

Exercise
Find the value of $c$ so that the following function can be treated as a pmf.
(i) $f(x) = c x^2; \quad x = 1,2,3.$
(ii) $f(y) = c\,(0.25)^y; \quad y = 1,2,3, \cdots$
Exercise
The pmf of $X$ is given as $f(x) = \alpha \left(\frac{3}{4}\right)^x, \ x = 0,1,2,3, \cdots$ Find the value of $\alpha$
and hence find $P(X < 4)$.

Exercise
Suppose that the random variable $Y$ has the following pmf.
$y$          −3     −2     −1     0      1
$P(Y = y)$   0.10   0.25   0.30   0.15   $k$
Find the value of $k$. Also, find $P(-3 < Y < 0)$ and $P(Y \ge 1)$.

Problem
A shipment of 7 television sets contains 2 defective sets. A hotel makes a random
purchase of 3 of these sets. If $X$ denotes the number of defective sets purchased by
the hotel, find the probability distribution function for $X$.
Hints
Number of non-defective sets = 5; Number of defective sets = 2
$R = \{x \mid x = 0,1,2\}$
$P(X = 0) = \frac{\binom{2}{0}\binom{5}{3}}{\binom{7}{3}} = \frac{2}{7}$; $P(X = 1) = \frac{\binom{2}{1}\binom{5}{2}}{\binom{7}{3}} = \frac{4}{7}$; $P(X = 2) = \frac{\binom{2}{2}\binom{5}{1}}{\binom{7}{3}} = \frac{1}{7}$

In general,
$f(x) = P(X = x) = \frac{\binom{2}{x}\binom{5}{3-x}}{\binom{7}{3}}; \quad x = 0,1,2.$
In tabular form,
$x$          0     1     2
$P(X = x)$   2/7   4/7   1/7
Cumulative Distribution Function: Discrete Random Variable

Suppose that $X$ is a discrete random variable with probability mass function
(pmf) $f(x)$. Let $R$ be the set of all possible values of $X$ and $t$ be any real number.
The cumulative distribution function (cdf) of $X$ at $t$, denoted by $F(t)$, provides
the probability that the random variable $X$ takes the value $t$ or less. Mathematically,
$F(t) = P(X \le t) = \sum_{x \le t} f(x).$

Remark:
• For any real number $t$, $0 \le F(t) \le 1$.
• $F(t)$ is a non-decreasing function of $t$, i.e. $F(a) \le F(b), \ \forall a < b$.
• $\lim_{t \to -\infty} F(t) = 0$ and $\lim_{t \to \infty} F(t) = 1$.
• The graph of $F(t)$ of a discrete random variable is a step function.
• The cdf is also known as the distribution function.

Problem
For the following pmf of $X$, find and draw the cdf.
$x$       0        1        2        3
$f(x)$    20/120   60/120   36/120   4/120
Solution
When $t < 0$,
$F(t) = P(X \le t) = \sum_{x \le t} f(x) = 0$

When $0 \le t < 1$,
$F(t) = P(X \le t) = \sum_{x \le t} f(x) = \frac{20}{120}$
When $1 \le t < 2$,
$F(t) = P(X \le t) = \sum_{x \le t} f(x) = \frac{20}{120} + \frac{60}{120} = \frac{80}{120}$

When $2 \le t < 3$,
$F(t) = P(X \le t) = \sum_{x \le t} f(x) = \frac{20}{120} + \frac{60}{120} + \frac{36}{120} = \frac{116}{120}$

When $t \ge 3$,
$F(t) = P(X \le t) = \sum_{x \le t} f(x) = \frac{20}{120} + \frac{60}{120} + \frac{36}{120} + \frac{4}{120} = \frac{120}{120} = 1$

That is,
$F(t) = \begin{cases} 0, & t < 0 \\ \frac{20}{120}, & 0 \le t < 1 \\ \frac{80}{120}, & 1 \le t < 2 \\ \frac{116}{120}, & 2 \le t < 3 \\ 1.0, & t \ge 3 \end{cases}$
The Graph: CDF
Note
Computing the probability of an event using the cdf
Let $X$ be a discrete random variable; $a$ and $b$ be real numbers; $a^-$ be a real number
just below $a$; and $b^-$ be a real number just below $b$.
1. $P(X \le a) = F(a)$
2. $P(X = a) = P(X \le a) - P(X \le a^-) = F(a) - F(a^-)$
3. $P(X < a) = P(X \le a^-) = F(a^-)$
4. $P(X > a) = 1 - P(X \le a) = 1 - F(a)$
5. $P(X \ge a) = 1 - P(X < a) = 1 - F(a^-)$
6. $P(a < X < b) = P(X < b) - P(X \le a) = F(b^-) - F(a)$
7. $P(a \le X < b) = P(X < b) - P(X < a) = F(b^-) - F(a^-)$
8. $P(a < X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a)$
9. $P(a \le X \le b) = P(X \le b) - P(X < a) = F(b) - F(a^-)$

Example
Using the cdf obtained in the previous problem, compute $P(X = 2)$, $P(X < 2)$,
$P(X > 1)$, $P(X \ge 1)$, $P(1 < X < 3)$, $P(1 \le X \le 3)$.

Solution
$P(X = 2) = F(2) - F(2^-) = \frac{116}{120} - \frac{80}{120} = \frac{36}{120}$
$P(X < 2) = F(2^-) = \frac{80}{120}$
$P(X > 1) = 1 - P(X \le 1) = 1 - F(1) = 1 - \frac{80}{120} = \frac{40}{120}$
$P(X \ge 1) = 1 - P(X < 1) = 1 - F(1^-) = 1 - \frac{20}{120} = \frac{100}{120}$
$P(1 < X < 3) = P(X < 3) - P(X \le 1) = F(3^-) - F(1) = \frac{116}{120} - \frac{80}{120} = \frac{36}{120}$

$P(1 \le X \le 3) = P(X \le 3) - P(X < 1) = F(3) - F(1^-) = 1 - \frac{20}{120} = \frac{100}{120}.$
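
The rules for $F(a)$ and $F(a^-)$ can be sketched in code for this pmf. The function names `F` and `F_left` are hypothetical; `F_left(t)` plays the role of $F(t^-)$ by summing over values strictly below $t$.

```python
from fractions import Fraction

pmf = {
    0: Fraction(20, 120),
    1: Fraction(60, 120),
    2: Fraction(36, 120),
    3: Fraction(4, 120),
}

def F(t):
    """cdf: P(X <= t)."""
    return sum(pr for x, pr in pmf.items() if x <= t)

def F_left(t):
    """Left limit of the cdf at t: P(X < t)."""
    return sum(pr for x, pr in pmf.items() if x < t)

p_eq_2 = F(2) - F_left(2)
p_lt_2 = F_left(2)
p_gt_1 = 1 - F(1)
p_ge_1 = 1 - F_left(1)
p_open = F_left(3) - F(1)      # P(1 < X < 3)
p_closed = F(3) - F_left(1)    # P(1 <= X <= 3)
print(p_eq_2, p_lt_2, p_gt_1, p_ge_1, p_open, p_closed)
```

Note that `Fraction` reduces automatically, so 36/120 is printed as 3/10.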

Exercise
For the following pmf
$f(y) = \binom{2}{y} \left(\frac{1}{2}\right)^{y} \left(\frac{1}{2}\right)^{2-y}; \quad y = 0,1,2$
find and draw the cdf. Hence, compute $P(Y = 1)$, $P(Y > 0)$, and $P(Y \le 1)$.

Exercise
For the following pmf of $X$, find and draw the cdf. Hence, find $P(X < 2)$ and $P(X > 1)$.
$x$       0     1     2     3
$f(x)$    1/8   3/8   3/8   1/8
Expectation of Discrete Random Variable
Suppose that $X$ is a discrete random variable with probability mass function
(pmf) $f(x)$. Let $R$ be the set of all possible values of $X$. The expectation (or
expected value) of the random variable, denoted by $E(X)$ or $\mu$, is defined
as
$E(X) = \mu = \sum_{x \in R} x \times f(x).$

Suppose that $g(X)$ is a function of $X$. Hence, $g(X)$ is also a random variable. The
expectation of $g(X)$ is defined as
$E[g(X)] = \sum_{x \in R} g(x) \times f(x).$

Note
The expected value of a random variable gives the population mean, i.e. the mean of the
population observations on that random variable.

Properties of Expectation
1. If $c$ is any constant, $E(c) = c$.
Proof:
Let $g(X) = c$. Then
$E[g(X)] = \sum_{x \in R} g(x) \times f(x)$
$\Rightarrow E(c) = \sum_{x \in R} c \times f(x) = c \sum_{x \in R} f(x)$
Since $f(x)$ is a pmf and hence $\sum_{x \in R} f(x) = 1$,
$E(c) = c$.
2. If $c$ is any constant, $E(cX) = c\,E(X)$.
Proof:
Let $g(X) = cX$. Then
$E[g(X)] = \sum_{x \in R} g(x) \times f(x)$
$\Rightarrow E(cX) = \sum_{x \in R} cx \times f(x) = c \sum_{x \in R} x \times f(x) = c\,E(X)$
i.e. $E(cX) = c\,E(X)$.

3. If $c$ is any constant, $E(X + c) = E(X) + c$.
Proof:
Let $g(X) = X + c$. Then
$E[g(X)] = \sum_{x \in R} g(x) \times f(x)$
$\Rightarrow E(X + c) = \sum_{x \in R} (x + c) \times f(x) = \sum_{x \in R} x \times f(x) + c \sum_{x \in R} f(x)$
$\Rightarrow E(X + c) = E(X) + c$.

4. If $a$ and $b$ are any two constants, $E(a + bX) = a + b\,E(X)$.
Proof:
Let $g(X) = a + bX$. Then
$E[g(X)] = \sum_{x \in R} g(x) \times f(x)$
$\Rightarrow E(a + bX) = \sum_{x \in R} (a + bx) \times f(x) = \sum_{x \in R} a \times f(x) + \sum_{x \in R} bx\, f(x)$
$= a \sum_{x \in R} f(x) + b \sum_{x \in R} x\, f(x) = a + b\,E(X)$
i.e. $E(a + bX) = a + b\,E(X)$.


Problem
The pmf of $X$ is given below.
$x$       −1    0     1
$f(x)$    0.2   0.3   0.5
Compute $E(X)$, $E(2X)$, and $E(3X + 1)$.
Solution:
$E(X) = \sum_{x \in R} x \times f(x) = (-1) \times f(-1) + (0) \times f(0) + (1) \times f(1)$
$= (-1) \times 0.2 + (0) \times 0.3 + (1) \times 0.5 = 0.3$

$E(2X) = 2 \times E(X) = 2 \times 0.3 = 0.6$

$E(3X + 1) = 3 \times E(X) + 1 = 3 \times 0.3 + 1 = 1.9$

Exercise
Suppose that $X$ represents the number of defective teeth of a patient visiting a
certain dental clinic. The pmf of $X$ is given as
$x$       1      2      3      4      5
$f(x)$    0.25   0.35   0.20   0.15   $k$

(a) Find the value of $k$.
(b) Find and draw the cdf.
(c) Find the probability that a patient has at least 4 defective teeth.
(d) Find the probability that a patient has at most 2 defective teeth.
(e) Find the expected number of defective teeth of a patient.
Population Variance of a Random Variable

Let $X$ be any random variable (discrete or continuous). Note that the properties of
expectation given in the last class hold for discrete as well as continuous
random variables.

The variance of the population observations on the random variable $X$, i.e. the
population variance of $X$, denoted by $Var(X)$, is defined as
$Var(X) = E[X - \mu]^2, \quad \text{with } \mu = E(X).$
It can be shown that
$Var(X) = E(X^2 - 2X\mu + \mu^2)$
$= E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - 2\mu^2 + \mu^2$
$= E(X^2) - \mu^2 = E(X^2) - [E(X)]^2$
That is,
$Var(X) = E(X^2) - [E(X)]^2.$

Properties of Variance for both Discrete and Continuous Random Variables

1. If $c$ is any constant, $Var(c) = 0$.
Proof:
Let $Y = c$. Then
$Var(Y) = E(Y^2) - [E(Y)]^2$
$\Rightarrow Var(c) = E(c^2) - [E(c)]^2$
$\Rightarrow Var(c) = c^2 - (c)^2 = c^2 - c^2 = 0.$
2. If $c$ is any constant, $Var(cX) = c^2\, Var(X)$.
Proof:
Let $Y = cX$. Then
$Var(Y) = E(Y^2) - [E(Y)]^2$
$\Rightarrow Var(cX) = E(c^2 X^2) - [E(cX)]^2$
$\Rightarrow Var(cX) = c^2 E(X^2) - [c\,E(X)]^2$
$\Rightarrow Var(cX) = c^2 E(X^2) - c^2 [E(X)]^2$
$\Rightarrow Var(cX) = c^2 \{E(X^2) - [E(X)]^2\}$
$\Rightarrow Var(cX) = c^2\, Var(X).$

3. If $c$ is any constant, $Var(X + c) = Var(X)$.
Proof:
Let $Y = X + c$. Then
$Var(Y) = E[Y - E(Y)]^2$
$\Rightarrow Var(X + c) = E[X + c - E(X + c)]^2$
$\Rightarrow Var(X + c) = E[X + c - E(X) - c]^2$
$\Rightarrow Var(X + c) = E[X - E(X)]^2$
$\Rightarrow Var(X + c) = Var(X).$

4. If $a$ and $b$ are any constants, $Var(a + bX) = b^2\, Var(X)$.
Proof:
Let $Y = a + bX$. Then
$Var(Y) = E[Y - E(Y)]^2$
$\Rightarrow Var(a + bX) = E[a + bX - E(a + bX)]^2$
$\Rightarrow Var(a + bX) = E[a + bX - a - b\,E(X)]^2$
$\Rightarrow Var(a + bX) = E[bX - b\,E(X)]^2$
$\Rightarrow Var(a + bX) = b^2 E[X - E(X)]^2$
$\Rightarrow Var(a + bX) = b^2\, Var(X).$

Problem
The pmf of $X$ is given below.
$x$       −1    0     1
$f(x)$    0.2   0.3   0.5
Compute $Var(X)$, $Var(2X)$, and $Var(3X + 1)$.

Solution:
$E(X) = \sum_{x \in R} x \times f(x) = (-1) \times f(-1) + (0) \times f(0) + (1) \times f(1)$
$= (-1) \times 0.2 + (0) \times 0.3 + (1) \times 0.5 = 0.3$

$E(X^2) = \sum_{x \in R} x^2 \times f(x) = (-1)^2 \times f(-1) + (0)^2 \times f(0) + (1)^2 \times f(1)$
$= 1 \times 0.2 + 0 \times 0.3 + 1 \times 0.5 = 0.7$

Therefore,
$Var(X) = 0.7 - (0.3)^2 = 0.61.$
$Var(2X) = 2^2 \times Var(X) = 4 \times 0.61 = 2.44$
$Var(3X + 1) = 3^2 \times Var(X) = 9 \times 0.61 = 5.49.$
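
The properties $Var(cX) = c^2 Var(X)$ and $Var(a + bX) = b^2 Var(X)$ can be checked numerically by computing each variance directly from its definition. A minimal sketch; the helper names `expect` and `variance` are illustrative.

```python
pmf = {-1: 0.2, 0: 0.3, 1: 0.5}

def expect(g, pmf):
    """E[g(X)] = sum of g(x) * f(x) over the possible values of X."""
    return sum(g(x) * pr for x, pr in pmf.items())

def variance(g, pmf):
    """Var[g(X)] = E[(g(X) - E[g(X)])^2], computed from the definition."""
    mean = expect(g, pmf)
    return expect(lambda x: (g(x) - mean) ** 2, pmf)

var_x = variance(lambda x: x, pmf)
var_2x = variance(lambda x: 2 * x, pmf)
var_3x_plus_1 = variance(lambda x: 3 * x + 1, pmf)

print(round(var_x, 2), round(var_2x, 2), round(var_3x_plus_1, 2))
```

Because each variance is computed from the transformed values rather than from the shortcut formulas, agreement with 4 × 0.61 and 9 × 0.61 is an independent check.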

Problem
Suppose that the random variable $X$ takes natural-number values and the set of possible
values of $X$ is $R = \{x \mid x = 1, 2, \cdots, N\}$. The pmf of $X$ is given by
$f(x) = \frac{1}{N}, \quad x = 1,2, \cdots, N.$
Find the population mean and variance of $X$.

Solution
Population mean
$E(X) = \sum_{x \in R} x\, f(x) = \frac{1}{N} \sum_{x \in R} x = \frac{1}{N}(1 + 2 + \cdots + N)$
$= \frac{1}{N} \times \frac{N(N + 1)}{2} = \frac{N + 1}{2}.$
Population variance
$Var(X) = E(X^2) - [E(X)]^2$
$E(X^2) = \sum_{x \in R} x^2 f(x) = \frac{1}{N} \sum_{x \in R} x^2 = \frac{1}{N}(1^2 + 2^2 + \cdots + N^2)$
$= \frac{1}{N} \times \frac{N(N + 1)(2N + 1)}{6} = \frac{(N + 1)(2N + 1)}{6}$
Therefore,
$Var(X) = \frac{(N + 1)(2N + 1)}{6} - \left(\frac{N + 1}{2}\right)^2 = \frac{N^2 - 1}{12}.$
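
The closed forms $(N+1)/2$ and $(N^2-1)/12$ can be verified by brute force for a few values of $N$. A sketch with exact fractions; the function name `uniform_mean_var` is hypothetical.

```python
from fractions import Fraction

def uniform_mean_var(n):
    """Mean and variance of X with pmf f(x) = 1/n on x = 1, ..., n."""
    pr = Fraction(1, n)
    mean = sum(x * pr for x in range(1, n + 1))
    second_moment = sum(x * x * pr for x in range(1, n + 1))
    return mean, second_moment - mean ** 2

for n in (1, 6, 10):
    mean, var = uniform_mean_var(n)
    print(n, mean, var)
```

For $N = 6$ (a fair die) this gives mean 7/2 and variance 35/12.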

Problem
Suppose that you bought one $10 raffle ticket for a new car valued at $15000. If
2000 tickets were sold, what is your expected gain in buying this ticket?
Solution
Let the random variable $X$ denote
$X$ = Gain in buying a ticket in dollars
Therefore, the possible values of $X$ are 14990 and −10. Since 2000 tickets were
sold,
$P(X = 14990) = P(\text{winning the draw}) = \frac{1}{2000}$
$P(X = -10) = P(\text{losing the draw}) = 1 - \frac{1}{2000} = \frac{1999}{2000}.$
The pmf of $X$ is
$x$       −10         14990
$f(x)$    1999/2000   1/2000
The expected gain is
$E(X) = (-10) \times \frac{1999}{2000} + 14990 \times \frac{1}{2000} = -2.5.$
It implies that if someone participates in this draw many times, on average
he/she will lose $2.5 per draw.

Problem
Consider an experiment of tossing three fair coins. You will get one point for each
head and lose one point for each tail. What is your expected point?

Solution
The sample space associated with this experiment is
$S = \{HHH, HHT, HTH, THH, TTH, THT, HTT, TTT\}$
Since the three coins are fair, the probability of each sample point is $\frac{1}{8}$. Therefore,

Sample Point      HHH   HHT   HTH   THH   TTH   THT   HTT   TTT
Probability       1/8   1/8   1/8   1/8   1/8   1/8   1/8   1/8
Points Obtained   3     1     1     1     −1    −1    −1    −3
Let $X$ be a random variable indicating the points obtained in an experiment. Therefore,
the possible values of $X$ are −3, −1, 1, 3. Now,
$P(X = -3) = P(TTT) = \frac{1}{8}$
$P(X = -1) = P(TTH \cup THT \cup HTT) = P(TTH) + P(THT) + P(HTT) = \frac{3}{8}$
$P(X = 1) = P(HHT \cup HTH \cup THH) = P(HHT) + P(HTH) + P(THH) = \frac{3}{8}$
$P(X = 3) = P(HHH) = \frac{1}{8}$
That is, the pmf of $X$ is
$x$       −3    −1    1     3
$f(x)$    1/8   3/8   3/8   1/8
Therefore, the expected point is
$E(X) = (-3) \times \frac{1}{8} + (-1) \times \frac{3}{8} + 1 \times \frac{3}{8} + 3 \times \frac{1}{8} = 0.$

Problem
How many heads would you expect if you flip a fair coin twice?

Hints
The sample space associated with this experiment is
$S = \{HH, HT, TH, TT\}$
Let $X$ denote the number of heads observed. Therefore,
$R = \{x \mid x = 0,1,2\}$
Now,
$P(X = 0) = P(TT) = \frac{1}{4}$
$P(X = 1) = P(HT \cup TH) = P(HT) + P(TH) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$
$P(X = 2) = P(HH) = \frac{1}{4}$
That is, the pmf of $X$ is
$x$       0     1     2
$f(x)$    1/4   1/2   1/4
Hence,
$E(X) = 0 \times \frac{1}{4} + 1 \times \frac{1}{2} + 2 \times \frac{1}{4} = 1.$
Problem
Roll a fair die. If the side that turns up is odd, you will get the points that the side
shows; otherwise you will lose 4 points. What is your expected point?

Hints
The sample space associated with this experiment is
$S = \{1,2,3,4,5,6\}$
Let
$X$ = points obtained
$R = \{x \mid x = -4,1,3,5\}$
$P(X = -4) = P(2 \cup 4 \cup 6) = P(2) + P(4) + P(6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$
$P(X = 1) = P(1) = \frac{1}{6}$
$P(X = 3) = P(3) = \frac{1}{6}$
$P(X = 5) = P(5) = \frac{1}{6}$
The pmf of $X$ is
$x$       −4    1     3     5
$f(x)$    1/2   1/6   1/6   1/6

The expected point is
$E(X) = (-4) \times \frac{1}{2} + 1 \times \frac{1}{6} + 3 \times \frac{1}{6} + 5 \times \frac{1}{6}$
$= -2 + \frac{1}{6} \times 9 = -2 + 1.5 = -0.5.$
Moment and Moment Generating Function

Moment
The $r$th ($r = 1,2,3, \cdots$) moment about zero, or simply the $r$th moment, of a random
variable (discrete or continuous) $X$, denoted by $\mu_r'$, is defined as
$\mu_r' = E[X^r].$
It is also known as the $r$th raw moment.

Moment Generating Function

The moment generating function (mgf) of a random variable (discrete or
continuous) $X$, denoted by $M_X(t)$, is defined as
$M_X(t) = E[e^{tX}],$
where $t$ is a constant taking values from an interval that contains zero. $M_X(t)$ is always
non-negative for any value of $t$, and when $t = 0$,
$M_X(0) = E[e^{0 \times X}] = E[1] = 1.$
Note that all moments about zero can be obtained by successively differentiating
$M_X(t)$ and evaluating at $t = 0$, i.e.
$\mu_r' = \left[\frac{\partial^r}{\partial t^r} M_X(t)\right]_{t=0}, \quad r = 1,2,3, \cdots$

Because of this, $M_X(t)$ is known as the moment generating function.

Proof
By the definition of the moment generating function of the random variable $X$,
$M_X(t) = E[e^{tX}]$
$\Rightarrow M_X(t) = E\left[1 + \frac{tX}{1!} + \frac{(tX)^2}{2!} + \frac{(tX)^3}{3!} + \cdots + \frac{(tX)^r}{r!} + \cdots\right]$
$= 1 + t\,E(X) + \frac{t^2}{2!} E(X^2) + \frac{t^3}{3!} E(X^3) + \cdots + \frac{t^r}{r!} E(X^r) + \cdots$
Therefore,
$\frac{\partial}{\partial t} M_X(t) = E(X) + \frac{2t}{2!} E(X^2) + \frac{3t^2}{3!} E(X^3) + \cdots + \frac{r t^{r-1}}{r!} E(X^r) + \cdots$
Then,
$\left[\frac{\partial}{\partial t} M_X(t)\right]_{t=0} = E(X) = \mu_1'$

Again,
$\frac{\partial^2}{\partial t^2} M_X(t) = \frac{\partial}{\partial t}\left[\frac{\partial}{\partial t} M_X(t)\right]$
$= \frac{2}{2!} E(X^2) + \frac{6}{3!}\, t\, E(X^3) + \cdots + \frac{r(r-1) t^{r-2}}{r!} E(X^r) + \cdots$
Then,
$\left[\frac{\partial^2}{\partial t^2} M_X(t)\right]_{t=0} = E(X^2) = \mu_2'$

Again,
$\frac{\partial^3}{\partial t^3} M_X(t) = \frac{\partial}{\partial t}\left[\frac{\partial^2}{\partial t^2} M_X(t)\right]$
$= \frac{6}{3!} E(X^3) + \cdots + \frac{r(r-1)(r-2) t^{r-3}}{r!} E(X^r) + \cdots$
Then,
$\left[\frac{\partial^3}{\partial t^3} M_X(t)\right]_{t=0} = E(X^3) = \mu_3'$

Therefore, in general, one may write
$\mu_r' = E(X^r) = \left[\frac{\partial^r}{\partial t^r} M_X(t)\right]_{t=0}, \quad r = 1,2,3, \cdots$
Problem
The pmf of $X$ is given below.
$x$       −1    0     1
$f(x)$    0.2   0.3   0.5
Compute the mgf of $X$. Hence find the population mean and variance.
Solution:
The mgf of $X$ is
$M_X(t) = E[e^{tX}] = \sum_{x \in R} e^{tx} \times f(x)$
$= e^{t \times (-1)} f(-1) + e^{t \times (0)} f(0) + e^{t \times (1)} f(1)$
$= 0.2\, e^{-t} + 0.3 + 0.5\, e^{t}$
i.e. $M_X(t) = 0.3 + 0.2\, e^{-t} + 0.5\, e^{t}$
The population mean and variance of $X$ are
$E(X) = \mu_1'$
$Var(X) = E(X^2) - [E(X)]^2 = \mu_2' - [\mu_1']^2,$
respectively.
Now,
$\mu_1' = \left[\frac{\partial}{\partial t} M_X(t)\right]_{t=0} = [-0.2\, e^{-t} + 0.5\, e^{t}]_{t=0} = -0.2 + 0.5 = 0.3$

$\mu_2' = \left[\frac{\partial^2}{\partial t^2} M_X(t)\right]_{t=0} = \left[\frac{\partial}{\partial t} \frac{\partial}{\partial t} M_X(t)\right]_{t=0} = [0.2\, e^{-t} + 0.5\, e^{t}]_{t=0} = 0.7$

Therefore,
$Var(X) = 0.7 - (0.3)^2 = 0.7 - 0.09 = 0.61.$
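
The derivatives of the mgf at $t = 0$ can be approximated numerically as a sanity check. The sketch below uses central finite differences, which is a substitute for the symbolic differentiation in the solution; the step size `h` is an assumed value and the results are approximate.

```python
from math import exp

pmf = {-1: 0.2, 0: 0.3, 1: 0.5}

def mgf(t):
    """M_X(t) = E[exp(t X)] for the discrete pmf above."""
    return sum(exp(t * x) * pr for x, pr in pmf.items())

h = 1e-4  # assumed step size for the finite differences

# Central-difference approximations of M'(0) and M''(0).
mu1 = (mgf(h) - mgf(-h)) / (2 * h)
mu2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2

var = mu2 - mu1 ** 2
print(round(mu1, 4), round(mu2, 4), round(var, 4))
```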
Probability Distribution Function of Continuous Random Variable

The probability distribution function of a continuous random variable is known as the
probability density function (pdf). Let $X$ be a continuous random variable with set
of possible values $R$. Then the function $f(x)$ is said to be the pdf of $X$ if
1. $f(x) \ge 0 \ \ \forall x \in R$
2. $\int_{x \in R} f(x)\,dx = 1.$

Let $A$ be a subset of $R$, i.e. $A \subset R$. Then,
$P(A) = \int_{x \in A} f(x)\,dx.$

Note
1. If $a$ is any number, $P(X = a) = 0$.
2. $f(x) = 0 \ \ \forall x \notin R$.

Problem
Suppose that $X$ is a continuous random variable such that $-1 < x < 2$. Let $f(x)$
be a function of $x$ given as
$f(x) = x^2/3.$
Show that $f(x)$ is the pdf of $X$. Also, find $P(0 < X \le 1)$.

Solution
Let $R$ be the set of all possible values of $X$. Therefore,
$R = \{x \mid -1 < x < 2\}.$
It can be shown that
$f(x) \ge 0, \ \forall x \in R.$
Again,
$\int_{x \in R} f(x)\,dx = \int_{-1}^{2} \frac{x^2}{3}\,dx = \frac{1}{3}\left[\frac{x^3}{3}\right]_{-1}^{2} = \frac{1}{9}[8 + 1] = 1.$
Therefore,
$f(x) = \frac{x^2}{3}; \quad -1 < x < 2$
can be considered as a pdf of $X$.

Let $A = \{x \mid 0 < x \le 1\}$. That is, $A \subset R$. Then
$P(0 < X \le 1) = \int_{0}^{1} f(x)\,dx = \int_{0}^{1} \frac{x^2}{3}\,dx = \frac{1}{3}\left[\frac{x^3}{3}\right]_{0}^{1} = \frac{1}{9} = 0.1111$
It implies that 11.11% of the population observations on the continuous random
variable $X$ lie between 0 and 1.

Cumulative Distribution Function: Continuous Random Variable


Let $X$ be a continuous random variable with pdf $f(x)$ and $t$ be any real number.
The cumulative distribution function (cdf) of $X$ at $t$, denoted by $F(t)$, provides
the probability that the random variable $X$ takes the value $t$ or less. Mathematically,
$F(t) = P(X \le t) = \int_{x \le t} f(x)\,dx.$

Remark:
• For any real number $t$, $0 \le F(t) \le 1$.
• $F(t)$ is a non-decreasing function of $t$, i.e. $F(a) \le F(b), \ \forall a < b$.
• $\lim_{t \to -\infty} F(t) = 0$ and $\lim_{t \to \infty} F(t) = 1$.
• The cdf is also known as the distribution function.
Note
The pdf of the random variable $X$ can be obtained from the cdf of $X$ as
$f(x) = \frac{\partial}{\partial x} F(x).$
Note
Computing the probability of an event using the cdf
Let $X$ be a continuous random variable; $a$ and $b$ be real numbers.
1. $P(X \le a) = F(a)$
2. $P(X = a) = 0$
3. $P(X < a) = P(X \le a) = F(a)$
4. $P(X > a) = 1 - P(X \le a) = 1 - F(a)$
5. $P(X \ge a) = 1 - P(X < a) = 1 - P(X \le a) = 1 - F(a)$
6. $P(a < X < b) = P(X < b) - P(X \le a) = F(b) - F(a)$
7. $P(a \le X < b) = P(X < b) - P(X < a) = F(b) - F(a)$
8. $P(a < X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a)$
9. $P(a \le X \le b) = P(X \le b) - P(X < a) = F(b) - F(a)$

Problem
The pdf of the random variable $X$ is given as
$f(x) = k\sqrt{x}, \quad 0 < x < 1.$
Find the value of $k$. Find the cdf of $X$ and hence find $P(0.3 < X < 0.6)$.

Hints
$R = \{x \mid 0 < x < 1\}$
Since $f(x)$ is a pdf of $X$,
$\int_{x \in R} f(x)\,dx = 1$
$\Rightarrow k \int_{0}^{1} x^{1/2}\,dx = 1$
$\Rightarrow k \times \frac{2}{3}\left[x^{3/2}\right]_{0}^{1} = 1$
$\Rightarrow k = \frac{3}{2}$
Therefore,
$f(x) = \frac{3}{2}\sqrt{x}; \quad 0 < x < 1.$
Now, if $t \le 0$,
$F(t) = P(X \le t) = \int_{x \le t} f(x)\,dx = 0$

When $0 < t < 1$,
$F(t) = P(X \le t) = \int_{x \le t} f(x)\,dx = \int_{0}^{t} f(x)\,dx = \frac{3}{2}\int_{0}^{t} \sqrt{x}\,dx = \frac{3}{2} \times \frac{2}{3}\left[x^{3/2}\right]_{0}^{t} = t^{3/2}$

When $t \ge 1$,
$F(t) = P(X \le t) = \int_{x \le t} f(x)\,dx = \int_{0}^{t} f(x)\,dx = \int_{0}^{1} f(x)\,dx + \int_{1}^{t} f(x)\,dx$
$= 1 + 0 = 1$
Therefore,
$F(t) = \begin{cases} 0, & \text{if } t \le 0 \\ t^{3/2}, & \text{if } 0 < t < 1 \\ 1, & \text{if } t \ge 1 \end{cases}$
Now,
$P(0.3 < X < 0.6) = F(0.6) - F(0.3) = (0.6)^{3/2} - (0.3)^{3/2} = 0.3004.$
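
The normalising constant and the interval probability can be cross-checked by numerical integration. The sketch below uses a simple midpoint rule, which is an assumption of the example rather than a method from the notes; the function name `integrate` and the number of sub-intervals are illustrative, and the results are approximate.

```python
from math import sqrt

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# k is fixed by requiring the pdf to integrate to 1 over (0, 1).
k = 1 / integrate(sqrt, 0.0, 1.0)

def pdf(x):
    return k * sqrt(x) if 0 < x < 1 else 0.0

def cdf(t):
    """Closed-form cdf from the solution: t**1.5 on (0, 1)."""
    if t <= 0:
        return 0.0
    return t ** 1.5 if t < 1 else 1.0

p_numeric = integrate(pdf, 0.3, 0.6)
p_from_cdf = cdf(0.6) - cdf(0.3)
print(round(k, 3), round(p_numeric, 4), round(p_from_cdf, 4))
```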
Exercise
The cdf of a continuous random variable $X$ is given as
$F(t) = \begin{cases} 0, & \text{if } t < -1 \\ \frac{t^3 + 1}{9}, & \text{if } -1 \le t < 2 \\ 1, & \text{if } t \ge 2 \end{cases}$
(a) Find $P(-1 < X \le 1)$. Ans.: $\frac{2}{9}$.
(b) Find the pdf of $X$ and hence find $P(-1 < X \le 1)$. Ans.: $f(x) = \frac{x^2}{3}; \ -1 \le x < 2$; $\frac{2}{9}$.
Problem
Suppose that $X$ is a continuous random variable. A function $f(x)$ is defined as
\[ f(x) = \begin{cases} x, & 0 < x < 1 \\ 2 - x, & 1 \le x < 2. \end{cases} \]
(a) Show that $f(x)$ is a pdf of $X$.
(b) Find $P(X < 1.2)$ and $P(0.5 < X < 1)$.
(c) Find the cdf of $X$ and hence find the probabilities given in (b).

Solution
(a)
Here $R = \{x \mid 0 < x < 2\}$. It is clear that
\[ f(x) \ge 0, \quad \forall\, x \in R. \]
Now,
\[ \int_{x \in R} f(x)\,dx = \int_0^1 x\,dx + \int_1^2 (2 - x)\,dx = \left[\frac{x^2}{2}\right]_0^1 + \left[2x - \frac{x^2}{2}\right]_1^2 = \frac{1}{2} + 4 - \frac{4}{2} - 2 + \frac{1}{2} = 1. \]
Therefore, $f(x)$ is a pdf of $X$.

(b)
\[ P(X < 1.2) = \int_0^{1.2} f(x)\,dx = \int_0^1 x\,dx + \int_1^{1.2} (2 - x)\,dx = \left[\frac{x^2}{2}\right]_0^1 + \left[2x - \frac{x^2}{2}\right]_1^{1.2} = 0.68 \]
\[ P(0.5 < X < 1) = \int_{0.5}^1 f(x)\,dx = \int_{0.5}^1 x\,dx = \left[\frac{x^2}{2}\right]_{0.5}^1 = 0.375 \]

(c)
The cdf
When $t \le 0$,
\[ F(t) = P(X \le t) = \int_{x \le t} f(x)\,dx = 0. \]
When $0 < t < 1$,
\[ F(t) = P(X \le t) = \int_0^t f(x)\,dx = \int_0^t x\,dx = \frac{t^2}{2}. \]
When $1 \le t < 2$,
\[ F(t) = \int_0^1 x\,dx + \int_1^t (2 - x)\,dx = \left[\frac{x^2}{2}\right]_0^1 + \left[2x - \frac{x^2}{2}\right]_1^t = \frac{1}{2} + 2t - \frac{t^2}{2} - 2 + \frac{1}{2} = 2t - \frac{t^2}{2} - 1. \]
When $t \ge 2$,
\[ F(t) = \int_0^2 f(x)\,dx + \int_2^t f(x)\,dx = 1 + 0 = 1. \]
That is, the cdf of $X$ is given by
\[ F(t) = \begin{cases} 0, & t \le 0 \\ \dfrac{t^2}{2}, & 0 < t < 1 \\ 2t - \dfrac{t^2}{2} - 1, & 1 \le t < 2 \\ 1, & t \ge 2. \end{cases} \]
Now,
\[ P(X < 1.2) = P(X \le 1.2) = F(1.2) = 2 \times 1.2 - \frac{(1.2)^2}{2} - 1 = 0.68 \]
\[ P(0.5 < X < 1) = F(1) - F(0.5) = \left(2 - \frac{1}{2} - 1\right) - \frac{(0.5)^2}{2} = 0.375. \]
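A piecewise cdf such as the one derived in part (c) translates directly into a function with one branch per interval. The sketch below is illustrative only (the function name `cdf` and the variable names are not from the notes); it reproduces the two probabilities of part (b) from the cdf.

```python
def cdf(t):
    """Piecewise cdf of the triangular density on (0, 2) from the problem."""
    if t <= 0:
        return 0.0
    if t < 1:
        return t * t / 2            # rising branch, 0 < t < 1
    if t < 2:
        return 2 * t - t * t / 2 - 1  # falling branch, 1 <= t < 2
    return 1.0

p_less_than_1_2 = cdf(1.2)               # P(X < 1.2)
p_between = cdf(1.0) - cdf(0.5)          # P(0.5 < X < 1)
print(round(p_less_than_1_2, 4), round(p_between, 4))
```

Checking that the two branches agree at the break point ($F(1) = 1/2$ from both sides) is a useful sanity test for any piecewise cdf.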

Exercise
Let $X$ be a continuous random variable. If the function $f(x)$ is defined as
\[ f(x) = \frac{2}{27}(1 + x), \quad 2 < x < 5, \]
show that $f(x)$ is a pdf of $X$. Find the cdf of $X$. Hence find $P(X < 3)$ and $P(X > 4)$.

Exercise
Let $X$ be a continuous random variable with pdf $f(x)$ defined as
\[ f(x) = \frac{k}{(1 + x)^2}, \quad x > 0. \]
Find the value of $k$. Find the cdf of $X$. Hence find $P(X > 1)$ and $P(X < 1)$.

Hints
\[ \int_0^\infty \frac{k}{(1 + x)^2}\,dx = 1 \]
Let $y = 1 + x$, so that $dy = dx$; as $x$ runs from $0$ to $\infty$, $y$ runs from $1$ to $\infty$. Then
\[ \int_1^\infty \frac{k}{y^2}\,dy = 1 \implies k = 1. \]
\[ F(t) = \begin{cases} 0, & t \le 0 \\ \dfrac{t}{1 + t}, & t > 0 \end{cases} \]
\[ P(X < 1) = 0.5 \quad \text{and} \quad P(X > 1) = 0.5. \]

Exercise
Let $X$ be a continuous random variable with pdf $f(x)$ defined as
\[ f(x) = \frac{1}{k}, \quad a \le x \le b. \]
Find the value of $k$. Find the cdf of $X$.

Answer
\[ k = b - a \]
\[ F(t) = \begin{cases} 0, & t < a \\ \dfrac{t - a}{b - a}, & a \le t \le b \\ 1, & t > b \end{cases} \]
Exercise
Suppose that the cdf of a continuous random variable $X$ is given as
\[ F(t) = \begin{cases} 0, & t \le 0 \\ \dfrac{t}{8}, & 0 < t < 2 \\ \dfrac{t^2}{16}, & 2 \le t < 4 \\ 1, & t \ge 4. \end{cases} \]
Find the pdf of $X$. Also, find $P(X > 1.5)$, $P(0.5 < X < 2.5)$, and $P(X > 3.5)$.

Answer
\[ f(x) = \begin{cases} \dfrac{1}{8}, & 0 < x < 2 \\ \dfrac{x}{8}, & 2 \le x < 4 \end{cases} \]
\[ P(X > 1.5) = 1 - F(1.5) = 1 - \frac{1.5}{8} \]
\[ P(0.5 < X < 2.5) = F(2.5) - F(0.5) = \frac{(2.5)^2}{16} - \frac{0.5}{8} \]
\[ P(X > 3.5) = 1 - F(3.5) = 1 - \frac{(3.5)^2}{16} \]
Exercise
Suppose that the pdf of a continuous random variable $X$ is given as
\[ f(x) = k \exp(-2x), \quad x > 0. \]
Find the value of $k$. Find the cdf of $X$. Hence find $P(1 < X < 2)$.

Hints and Answer:
\[ \int_0^\infty k \exp(-2x)\,dx = 1 \implies k = 2. \]
The cdf
\[ F(t) = \begin{cases} 0, & t \le 0 \\ 1 - \exp(-2t), & t > 0 \end{cases} \]
\[ P(1 < X < 2) = F(2) - F(1) = 1 - \exp(-4) - 1 + \exp(-2) = 0.117. \]

Exercise
Suppose that the pdf of a continuous random variable $X$ is given as
\[ f(x) = \frac{x}{8}, \quad 0 < x < 4. \]
Find the values of $t$ such that
(a) $P(X \le t) = \frac{1}{4}$
(b) $P(X > t) = \frac{1}{2}$
Answer
(a) $t = 2$ and (b) $t = \sqrt{8}$.

Expectation and Variance of a Continuous Random Variable

Suppose that $X$ is a continuous random variable with pdf $f(x)$ and set of possible values $R$. Then the expectation, or expected value, of $X$ (the population mean of the observations obtained on $X$) is defined as
\[ E(X) = \int_{x \in R} x\, f(x)\,dx. \]

If $g(X)$ is a function of $X$, then the expectation, or expected value, of $g(X)$ (the population mean of the observations obtained on $g(X)$) is defined as
\[ E[g(X)] = \int_{x \in R} g(x)\, f(x)\,dx. \]

The population variance of $X$ is given as
\[ Var(X) = E[X - E(X)]^2 = E(X^2) - [E(X)]^2. \]
Note
The properties of expectation and variance of a continuous random variable are exactly the same as those given for a discrete random variable.
If $a$, $b$, and $c$ are any constants, then
1. $E(c) = c$
2. $E(cX) = cE(X)$
3. $E(a + X) = a + E(X)$
4. $E(a + bX) = a + bE(X)$
5. $Var(c) = 0$
6. $Var(cX) = c^2\, Var(X)$
7. $Var(a + X) = Var(X)$
8. $Var(a + bX) = b^2\, Var(X)$.
Continuous Random Variable

Problem
Suppose that the continuous random variable $X$ has the following pdf:
\[ f(x) = 2(1 - x), \quad 0 < x < 1. \]
Find the population mean and variance of $X$.

Solution
The population mean is
\[ E(X) = \int_{x \in R} x\, f(x)\,dx = \int_0^1 x \times 2(1 - x)\,dx = 2\left[\frac{x^2}{2} - \frac{x^3}{3}\right]_0^1 = \frac{1}{3}. \]
That is, the average of all population observations obtained on $X$ is $\frac{1}{3}$.

The population variance is
\[ Var(X) = E(X^2) - [E(X)]^2. \]
Now,
\[ E(X^2) = \int_{x \in R} x^2 f(x)\,dx = \int_0^1 x^2 \times 2(1 - x)\,dx = 2\left[\frac{x^3}{3} - \frac{x^4}{4}\right]_0^1 = \frac{1}{6}. \]
Therefore,
\[ Var(X) = \frac{1}{6} - \left(\frac{1}{3}\right)^2 = \frac{1}{18}. \]
Problem
Suppose that the continuous random variable $X$ has the following pdf:
\[ f(x) = \exp(-x), \quad x > 0. \]
Find the population mean of $\exp\left(\frac{2}{3}X\right)$.

Hints
The population mean is
\[ E\left[\exp\left(\tfrac{2}{3}X\right)\right] = \int_{x \in R} \exp\left(\tfrac{2}{3}x\right) f(x)\,dx = \int_0^\infty \exp\left(\tfrac{2}{3}x\right)\exp(-x)\,dx = \int_0^\infty \exp\left(-\tfrac{1}{3}x\right)dx = 3\left[-\exp\left(-\tfrac{1}{3}x\right)\right]_0^\infty = 3. \]

Note
Gamma Function:
\[ \Gamma(n) = \int_0^\infty x^{n-1} \exp(-x)\,dx. \]

• $\Gamma(n) = (n - 1)!$ for a positive integer $n$
• $\Gamma(n + 1) = n\,\Gamma(n)$
• $\Gamma(1) = 1$
• $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$

Problem
Suppose that the continuous random variable $X$ has the following pdf:
\[ f(x) = \lambda \exp\left(-\tfrac{1}{3}x\right), \quad x > 0. \]
Find the value of $\lambda$. Hence find the population mean and variance.
Hints
Since $f(x)$ is a pdf,
\[ \int_{x \in R} f(x)\,dx = 1 \implies \lambda \int_0^\infty \exp\left(-\tfrac{1}{3}x\right)dx = 1 \implies 3\lambda\left[-\exp\left(-\tfrac{1}{3}x\right)\right]_0^\infty = 1 \implies 3\lambda = 1 \implies \lambda = \frac{1}{3}. \]
Therefore,
\[ f(x) = \frac{1}{3}\exp\left(-\tfrac{1}{3}x\right), \quad x > 0. \]
The population mean is
\[ E(X) = \int_{x \in R} x\, f(x)\,dx = \frac{1}{3}\int_0^\infty x \exp\left(-\tfrac{1}{3}x\right)dx. \]
Let $y = \frac{1}{3}x$, so that $x = 3y$ and $dx = 3\,dy$; as $x$ runs from $0$ to $\infty$, so does $y$. Then,
\[ E(X) = \frac{1}{3}\int_0^\infty 3y \exp(-y)\, 3\,dy = 3\int_0^\infty y \exp(-y)\,dy = 3 \times \Gamma(2) = 3. \]
The population variance is
\[ Var(X) = E(X^2) - [E(X)]^2. \]
Now, with the same substitution $y = \frac{1}{3}x$,
\[ E(X^2) = \frac{1}{3}\int_0^\infty x^2 \exp\left(-\tfrac{1}{3}x\right)dx = \frac{1}{3}\int_0^\infty (3y)^2 \exp(-y)\, 3\,dy = 9\int_0^\infty y^2 \exp(-y)\,dy = 9 \times \Gamma(3) = 18. \]
Then
\[ Var(X) = 18 - (3)^2 = 9. \]
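The gamma-function shortcut used above can be reproduced with `math.gamma` from the Python standard library. This is a small illustrative sketch; the variable names are chosen for this example and `scale` denotes the factor 3 that appears after substituting $y = x/3$.

```python
import math

scale = 3.0                              # from the substitution y = x / 3
mean = scale * math.gamma(2)             # E(X)   = 3 * Gamma(2)
second = scale ** 2 * math.gamma(3)      # E(X^2) = 9 * Gamma(3)
variance = second - mean ** 2            # Var(X) = E(X^2) - [E(X)]^2

# Property from the note on the gamma function: Gamma(1/2) = sqrt(pi)
half_check = math.gamma(0.5) - math.sqrt(math.pi)
print(mean, second, variance, abs(half_check) < 1e-12)
```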

Exercise
The hospital stay, in days, for patients following treatment for a certain type of kidney disorder is a random variable defined as $Y = X + 4$, where the probability density function of $X$ is
\[ f(x) = \frac{32}{(x + 4)^3}, \quad x > 0. \]
Find the expected number of days that a patient has to stay at the hospital after taking such treatment.
Hints
\[ E(Y) = E(X) + 4. \]
\[ E(X) = \int_0^\infty x\,\frac{32}{(x + 4)^3}\,dx = 4 \]
\[ E(Y) = 8. \]
Exercise
For the given pdf
\[ f(x) = k(x - x^2), \quad 0 < x < 1, \]
find the value of $k$. Hence, find the population mean and variance.

Answer:
$k = 6$; Mean $= \frac{1}{2}$; Variance $= \frac{1}{20}$.
Exercise
Suppose that the continuous random variable $X$ denotes the time (in years) to develop a certain software with pdf
\[ f(x) = \frac{x}{2}, \quad 0 < x < 2. \]
(a) Compute the probability that it will take more than 6 months to develop the software.
(b) Find the expected (population mean) time to develop the software.

Hints
(a)
\[ P\left(X > \tfrac{1}{2}\right) = \int_{1/2}^2 \frac{x}{2}\,dx = \frac{1}{4}\left[x^2\right]_{1/2}^2 = 0.9375 \]
That is, in 93.75% of cases it takes more than 6 months to develop the software.
(b)
\[ E(X) = \int_0^2 x \times \frac{x}{2}\,dx = \frac{4}{3} = 1.333 \]
On average, it takes approximately 1.333 years to develop the software.
Exercise
Suppose that the continuous random variable $X$ denotes the time (in years) to develop a certain software with pdf
\[ f(x) = 5x^4, \quad 0 < x < 1. \]
(a) Compute the probability that it will take more than 3 months to develop the software. Answer: 0.9990.
(b) Find the expected (population mean) time to develop the software. Answer: 0.8333 years, or 10 months.

Moment Generating Function: Continuous Random Variable


Let $X$ be a continuous random variable with pdf $f(x)$, and let $R$ be the set of all possible values of $X$. Then the moment generating function (mgf) of $X$, denoted by $M_X(t)$, is defined as
\[ M_X(t) = E[\exp(tX)] = \int_{x \in R} \exp(tx)\, f(x)\,dx. \]

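Problem
Suppose that $X$ is a continuous random variable with pdf
\[ f(x) = \frac{1}{\lambda}\exp\left(-\frac{1}{\lambda}x\right), \quad x > 0, \]
where $\lambda > 0$ is a constant. This is known as the exponential distribution with parameter $\lambda$. Find the mgf of $X$. Hence find the population mean and variance.

Solution:
Let $R = \{x \mid x > 0\}$. The mgf of $X$ is
\[ M_X(t) = E[\exp(tX)] = \int_{x \in R} \exp(tx)\, f(x)\,dx = \frac{1}{\lambda}\int_0^\infty \exp(tx)\exp\left(-\frac{1}{\lambda}x\right)dx = \frac{1}{\lambda}\int_0^\infty \exp\left[-x\left(\frac{1}{\lambda} - t\right)\right]dx. \]
Note that the above integral exists only if $\frac{1}{\lambda} > t$. Therefore, for $t < \frac{1}{\lambda}$,
\[ M_X(t) = \frac{1}{\lambda}\left[-\frac{\exp\left[-x\left(\frac{1}{\lambda} - t\right)\right]}{\frac{1}{\lambda} - t}\right]_0^\infty = \frac{1}{\lambda} \cdot \frac{0 + 1}{\frac{1}{\lambda} - t} = \frac{1}{1 - \lambda t}. \]
That is,
\[ M_X(t) = \frac{1}{1 - \lambda t}, \quad t < \frac{1}{\lambda}. \]
The population mean is
\[ E(X) = \left[\frac{\partial}{\partial t} M_X(t)\right]_{t=0}. \]
Now,
\[ \frac{\partial}{\partial t} M_X(t) = \frac{\partial}{\partial t}(1 - \lambda t)^{-1} = (-1) \times (1 - \lambda t)^{-2}(-\lambda) = \lambda(1 - \lambda t)^{-2} \]
\[ \left[\frac{\partial}{\partial t} M_X(t)\right]_{t=0} = \lambda. \]
That is,
\[ E(X) = \lambda. \]
The population variance is
\[ Var(X) = E(X^2) - [E(X)]^2, \]
where
\[ E(X^2) = \left[\frac{\partial^2}{\partial t^2} M_X(t)\right]_{t=0} = \left[\frac{\partial}{\partial t}\frac{\partial}{\partial t} M_X(t)\right]_{t=0}. \]
\[ \frac{\partial}{\partial t}\frac{\partial}{\partial t} M_X(t) = \frac{\partial}{\partial t}\,\lambda(1 - \lambda t)^{-2} = (-2)\lambda \times (1 - \lambda t)^{-3}(-\lambda) = 2\lambda^2(1 - \lambda t)^{-3} \]
\[ \left[\frac{\partial}{\partial t}\frac{\partial}{\partial t} M_X(t)\right]_{t=0} = 2\lambda^2. \]
Therefore,
\[ E(X^2) = 2\lambda^2 \]
and
\[ Var(X) = 2\lambda^2 - \lambda^2 = \lambda^2. \]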
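The "differentiate the mgf at $t = 0$" recipe can also be checked numerically with finite differences. The sketch below is an illustration under stated assumptions: the value $\lambda = 2$ and the step size `h` are arbitrary choices, and the helper names are not from the notes.

```python
def mgf(t, lam):
    """M_X(t) = 1 / (1 - lam * t), valid only for t < 1 / lam."""
    if t >= 1.0 / lam:
        raise ValueError("mgf is defined only for t < 1/lam")
    return 1.0 / (1.0 - lam * t)

def first_two_moments(m, h=1e-4):
    """Central-difference estimates of M'(0) = E(X) and M''(0) = E(X^2)."""
    d1 = (m(h) - m(-h)) / (2 * h)
    d2 = (m(h) - 2 * m(0.0) + m(-h)) / (h * h)
    return d1, d2

lam = 2.0                                   # illustrative parameter value
ex, ex2 = first_two_moments(lambda t: mgf(t, lam))
var = ex2 - ex ** 2                         # should be close to lam**2
print(round(ex, 4), round(ex2, 4), round(var, 4))
```

The estimates should be close to $\lambda$, $2\lambda^2$, and $\lambda^2$; finite differences carry a small truncation and rounding error, so exact equality is not expected.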

Exercise
Prove that the mgf of the following pdf
\[ f(x) = \frac{1}{2}\exp\left(-\frac{1}{2}x\right), \quad x > 0, \]
is given by $M_X(t) = \frac{1}{1 - 2t}$, $t < \frac{1}{2}$. Hence, find the population mean and variance.

Answer: 2; 4.

Exercise
Justify that each of the following functions is the mgf of a continuous random variable $X$. Hence find the population mean, variance, and standard deviation.
(a) $M_X(t) = \left(\frac{1}{1 - 2t}\right)^4$, $t < \frac{1}{2}$; (b) $M_X(t) = \frac{4}{4 - t}$, $t < 4$;
(c) $M_X(t) = \exp\left(\mu t + \frac{1}{2}t^2\sigma^2\right)$, where $\mu$ and $\sigma^2$ are constants; (d) $M_X(t) = \exp\left(\frac{1}{2}t^2\right)$.
Binomial Distribution

The binomial distribution is the probability distribution of a discrete random variable; that is, it is a probability mass function (pmf). The binomial random variable is generated through a binomial experiment, which can be described as follows.

(a) The experiment has $n$ independent trials. That is, the outcome of a trial does not influence the outcome of other trials.

(b) Each trial has one of two outcomes, say success or failure.

(c) The probability of success, denoted by $p$, and the probability of failure, denoted by $q = 1 - p$, remain the same for each trial.

The binomial random variable, denoted by $X$, is then defined as

$X$ = number of successes in $n$ trials.

If $R$ is the set of all possible values of $X$, one may define $R$ as
\[ R = \{x \mid x = 0, 1, 2, \ldots, n\}. \]

The binomial distribution of $X$ is then defined as
\[ f(x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n. \]
Note that $n$ and $p$ are known as the parameters of the binomial distribution. If a random variable $X$ has a binomial distribution with parameters $n$ and $p$, it is denoted as
\[ X \sim Binomial(n, p). \]
Note
Binomial Expansion
\[ (a + b)^n = \binom{n}{0} a^0 b^{n-0} + \binom{n}{1} a^1 b^{n-1} + \binom{n}{2} a^2 b^{n-2} + \cdots + \binom{n}{n} a^n b^{n-n} = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}. \]

Is Binomial distribution a pmf?

Suppose that $X$ is a binomial discrete random variable with parameters $n$ and $p$, i.e.
\[ X \sim Binomial(n, p). \]
Therefore, $R = \{x \mid x = 0, 1, 2, \ldots, n\}$ and
\[ f(x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n; \quad q = 1 - p. \]
It is clear that
\[ f(x) \ge 0, \quad \forall\, x \in R. \]
Again,
\[ \sum_{x \in R} f(x) = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} = (p + q)^n = (p + 1 - p)^n = 1. \]

Therefore, the binomial distribution $f(x)$ is a pmf.

Note
\[ P(X = x) = f(x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n. \]
Moment Generating Function, Population Mean and Variance

Suppose that $X$ is a random variable distributed as $X \sim Binomial(n, p)$.

Moment Generating Function (mgf)

The moment generating function (mgf) of $X$, denoted by $M_X(t)$, is defined as
\[ M_X(t) = E[\exp(tX)] = \sum_{x \in R} \exp(tx)\, f(x) = \sum_{x=0}^{n} e^{tx}\binom{n}{x} p^x q^{n-x} = \sum_{x=0}^{n} \binom{n}{x} (pe^t)^x q^{n-x} = (pe^t + q)^n, \]
i.e. $M_X(t) = (pe^t + q)^n$.
Population Mean
The population mean of $X$ is
\[ E(X) = \left[\frac{\partial}{\partial t} M_X(t)\right]_{t=0}. \]
Now,
\[ \frac{\partial}{\partial t} M_X(t) = n(pe^t + q)^{n-1}\, pe^t = np\,e^t (pe^t + q)^{n-1}. \]
Therefore,
\[ E(X) = \left[np\,e^t (pe^t + q)^{n-1}\right]_{t=0} = np(p + q)^{n-1} = np. \]

Population Variance
The population variance of $X$ is
\[ Var(X) = E(X^2) - [E(X)]^2. \]
Now,
\[ E(X^2) = \left[\frac{\partial^2}{\partial t^2} M_X(t)\right]_{t=0} = \left[\frac{\partial}{\partial t}\frac{\partial}{\partial t} M_X(t)\right]_{t=0} \]
\[ \frac{\partial}{\partial t}\frac{\partial}{\partial t} M_X(t) = \frac{\partial}{\partial t}\left[np\,e^t (pe^t + q)^{n-1}\right] = np\,\frac{\partial}{\partial t}\left[e^t (pe^t + q)^{n-1}\right] \]
\[ = np\left[e^t (pe^t + q)^{n-1} + e^t (n - 1)(pe^t + q)^{n-2}\, pe^t\right] = np\left[e^t (pe^t + q)^{n-1} + (n - 1)p\,e^{2t} (pe^t + q)^{n-2}\right]. \]
Therefore,
\[ E(X^2) = np\left[(p + q)^{n-1} + (n - 1)p(p + q)^{n-2}\right] = np[1 + (n - 1)p] = np + n(n - 1)p^2. \]
Finally,
\[ Var(X) = np + n(n - 1)p^2 - (np)^2 = np + n^2p^2 - np^2 - n^2p^2 = np(1 - p) = npq, \]
i.e. $Var(X) = npq$.
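Note
1. Since $0 \le q \le 1$, a binomial random variable satisfies
\[ Var(X) = npq \le np = E(X), \]
with strict inequality $E(X) > Var(X)$ whenever $p > 0$.
2. The variance of $X$ attains its maximum value at $p = 0.5$. This is because
\[ Var(X) = np(1 - p) = np - np^2 \]
\[ \frac{\partial}{\partial p} Var(X) = n - 2np. \]
Then,
\[ n - 2np = 0 \iff p = 0.5. \]
Again,
\[ \frac{\partial^2}{\partial p^2} Var(X) = -2n < 0. \]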
Remark
The binomial distribution is known as the Bernoulli distribution if the number of trials in the binomial experiment is only one. Therefore, the Bernoulli random variable is denoted as
\[ X \sim Binomial(1, p). \]
Therefore,
\[ f(x) = p^x q^{1-x}, \quad x = 0, 1; \quad q = 1 - p \]
\[ E(X) = p \]
\[ Var(X) = pq. \]
Problem
In a community, 15% of individuals have high blood pressure. Six individuals are selected randomly and the number of them who have high blood pressure is observed. Find the probability that
(a) Exactly three individuals have high blood pressure.
(b) No one has high blood pressure.
(c) At least one individual has high blood pressure.
(d) At most five individuals have high blood pressure.
(e) Fewer than two individuals have high blood pressure.
Also, find the expected number of individuals having high blood pressure, together with the variance.
Solution
Let the random variable $X$ be defined as
$X$ = number of individuals having high blood pressure out of 6 individuals.
Then
\[ X \sim Binomial(n = 6,\ p = 0.15), \quad q = 1 - p = 0.85 \]
\[ f(x) = \binom{6}{x} (0.15)^x (0.85)^{6-x}, \quad x = 0, 1, 2, 3, 4, 5, 6. \]

(a) Probability that exactly 3 individuals have high blood pressure
\[ P(X = 3) = f(3) = \binom{6}{3} (0.15)^3 (0.85)^{6-3} = 0.0415 \]
That is, in about 4.15% of randomly selected groups of 6 individuals, exactly 3 individuals have high blood pressure.

(b) Probability that no one has high blood pressure
\[ P(X = 0) = f(0) = \binom{6}{0} (0.15)^0 (0.85)^{6-0} = 0.3771 \]
That is, in about 37.71% of groups of 6 individuals, no individual has high blood pressure.

(c) Probability that at least one individual has high blood pressure
\[ P(X \ge 1) = 1 - P(X < 1) = 1 - f(0) = 1 - 0.3771 = 0.6229 \]
That is, in about 62.29% of groups of 6 individuals, at least one individual has high blood pressure.
(d) Probability that at most five individuals have high blood pressure
\[ P(X \le 5) = 1 - P(X > 5) = 1 - f(6) = 1 - \binom{6}{6} (0.15)^6 (0.85)^{6-6} \approx 0.99999 \]
That is, in about 99.999% of groups of 6 individuals, at most 5 individuals have high blood pressure.

(e) Probability that fewer than two individuals have high blood pressure
\[ P(X < 2) = f(0) + f(1) = \binom{6}{0} (0.15)^0 (0.85)^{6-0} + \binom{6}{1} (0.15)^1 (0.85)^{6-1} = 0.7765 \]
That is, in about 77.65% of groups of 6 individuals, fewer than 2 individuals have high blood pressure.

The expected number of individuals having high blood pressure out of 6 individuals is
\[ E(X) = np = 6 \times 0.15 = 0.9. \]
That is, in each group of 6 individuals, one may expect to find 0.9 individuals with high blood pressure on average. In other words, in a group of 60 individuals, 9 individuals are expected to have high blood pressure.
The variance is given by
\[ Var(X) = npq = 6 \times 0.15 \times 0.85 = 0.765. \]
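The binomial probabilities in this problem are easy to recompute with `math.comb`. The following is a minimal sketch (the function name `binom_pmf` and the variable names are illustrative, not from the notes) that evaluates parts (a) to (e) and obtains the mean and variance directly from the pmf.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 6, 0.15
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

exactly_three = pmf[3]              # (a) P(X = 3)
none = pmf[0]                       # (b) P(X = 0)
at_least_one = 1 - pmf[0]           # (c) P(X >= 1)
at_most_five = 1 - pmf[6]           # (d) P(X <= 5)
fewer_than_two = pmf[0] + pmf[1]    # (e) P(X < 2)

# Mean and variance computed from the definition, to compare with np and npq
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum(x * x * f for x, f in enumerate(pmf)) - mean ** 2
print(round(exactly_three, 4), round(fewer_than_two, 4), round(mean, 4), round(variance, 4))
```

Using the complement for "at least" and "at most" events, as in (c) and (d), avoids summing many terms.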
Problem
Suppose that 25% of the people in a population have a low hemoglobin level (LHL). In a study, five people are selected randomly and the number of them who have LHL is observed. Find
(a) The probability that at least two people have LHL.
(b) The probability that at most three of them have LHL.
(c) The expected number of people having LHL, together with the variance.
Hints
$X$ = number of people having LHL out of 5
\[ X \sim Binomial(n = 5,\ p = 0.25), \quad q = 1 - 0.25 = 0.75 \]
\[ f(x) = \binom{5}{x} (0.25)^x (0.75)^{5-x}, \quad x = 0, 1, 2, 3, 4, 5. \]
(a)
\[ P(X \ge 2) = 1 - P(X < 2) = 1 - [P(X = 0) + P(X = 1)] = 0.3672 \]
(b)
\[ P(X \le 3) = 1 - P(X > 3) = 1 - [P(X = 4) + P(X = 5)] = 0.9844 \]
(c)
\[ E(X) = np = 1.25 \]
\[ Var(X) = npq = 0.9375. \]

Problem
Suppose that the recovery rate from a certain disease is 40%. In a study, 15 patients suffering from this disease are selected and the number of them who recover is observed. Find the probability that
(a) At least 10 of them recover.
(b) The number of patients recovering from this disease is between 3 and 8, inclusive.
(c) Exactly 5 of them recover.
Hints
$X$ = number of patients recovering from this disease out of 15
\[ X \sim Binomial(n = 15,\ p = 0.40), \quad q = 1 - 0.40 = 0.60 \]
\[ f(x) = \binom{15}{x} (0.40)^x (0.60)^{15-x}, \quad x = 0, 1, 2, \ldots, 15. \]
(a)
\[ P(X \ge 10) = 0.0338 \]
(b)
\[ P(3 \le X \le 8) = P(X = 3) + P(X = 4) + \cdots + P(X = 8) = 0.8779 \]
(c)
\[ P(X = 5) = 0.1859 \]

Problem
Suppose that $X$ is a binomial random variable with mean 4 and variance 3. Determine the parameters of the distribution. Hence, find $P(X > 3)$, $P(2 < X < 7)$, and $P(X > 14)$.
Hints
Given that
\[ E(X) = 4 \implies np = 4 \]
\[ Var(X) = 3 \implies npq = 3 \implies np(1 - p) = 3. \]
Dividing the second equation by the first gives $q = \frac{3}{4}$, so
\[ p = \frac{1}{4}; \quad q = \frac{3}{4}; \quad n = 16. \]
\[ f(x) = \binom{16}{x} \left(\frac{1}{4}\right)^x \left(\frac{3}{4}\right)^{16-x}, \quad x = 0, 1, \ldots, 16. \]
Problem
Suppose that 10% of inmates in a large prison are known to be innocent. A non-profit organization randomly selects 20 inmates from the prison. Find the probability that the organization will find at least 3 innocent inmates. Ans. 0.323.

Problem
From a previous study, it is found that during a severe thunderstorm, 4% of transmission lines are damaged. A city with 75 transmission lines is hit with a severe thunderstorm. What is the probability that at least 5 of them get damaged?
Ans. 0.185. (This value matches the Poisson approximation with $\lambda = np = 3$; the exact binomial probability is about 0.181.)

Problem
An exam consists of 10 multiple choice questions, each with 4 possible answers. Suppose that a student did not prepare for this exam and answered all questions randomly. Find the probability that he/she
(a) Answered 6 questions correctly.
(b) Answered all questions wrongly.
(c) Answered at least one question correctly.
Hints
$X$ = number of correct answers out of 10 questions
\[ p = \frac{1}{4} = 0.25; \quad q = 1 - 0.25 = 0.75 \]
\[ X \sim Binomial(n = 10,\ p = 0.25) \]
\[ f(x) = \binom{10}{x} (0.25)^x (0.75)^{10-x}, \quad x = 0, 1, 2, \ldots, 10. \]
Poisson Distribution

The Poisson distribution is the probability distribution of a discrete random variable; that is, it is a probability mass function (pmf). The Poisson random variable represents the number of occurrences of an event in a specific interval of time or region of space.

Let $X$ be a Poisson random variable and let $R$ be the set of all possible values of $X$. Then one may define $R$ as
\[ R = \{x \mid x = 0, 1, 2, \ldots\}. \]

The Poisson distribution of $X$ is defined as
\[ f(x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots; \quad \lambda > 0. \]
Note that $\lambda$ is known as the parameter of the Poisson distribution. If a random variable $X$ has a Poisson distribution with parameter $\lambda$, it is denoted as
\[ X \sim Poisson(\lambda). \]
Note
\[ e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^k}{k!} + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!}. \]

Is Poisson distribution a pmf?

Suppose that $X$ is a Poisson discrete random variable with parameter $\lambda$, i.e.
\[ X \sim Poisson(\lambda). \]
Therefore, $R = \{x \mid x = 0, 1, 2, \ldots\}$ and
\[ f(x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots; \quad \lambda > 0. \]
It is clear that
\[ f(x) > 0, \quad \forall\, x \in R. \]
Again,
\[ \sum_{x \in R} f(x) = \sum_{x=0}^{\infty} \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda}\sum_{x=0}^{\infty} \frac{\lambda^x}{x!} = e^{-\lambda} e^{\lambda} = 1. \]

Therefore, the Poisson distribution $f(x)$ is a pmf.

Note
\[ P(X = x) = f(x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots; \quad \lambda > 0. \]

Example: Poisson Random Variable


• Number of patients waiting in a clinic in an hour
• Number of serious injuries occurred in a factory in a month
• Number of telephone calls received by a student in a day
• Number of typographical errors on a page of a book

Moment Generating Function, Population Mean and Variance


Suppose that $X$ is a random variable distributed as $X \sim Poisson(\lambda)$.

Moment Generating Function (mgf)

The moment generating function (mgf) of $X$, denoted by $M_X(t)$, is defined as
\[ M_X(t) = E[\exp(tX)] = \sum_{x \in R} \exp(tx)\, f(x) = \sum_{x=0}^{\infty} e^{tx}\,\frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda}\sum_{x=0}^{\infty} \frac{(\lambda e^t)^x}{x!} = e^{-\lambda} e^{\lambda e^t} = e^{\lambda(e^t - 1)}, \]
i.e. $M_X(t) = e^{\lambda(e^t - 1)}$.
Population Mean
The population mean of $X$ is
\[ E(X) = \left[\frac{\partial}{\partial t} M_X(t)\right]_{t=0}. \]
Now,
\[ \frac{\partial}{\partial t} M_X(t) = e^{\lambda(e^t - 1)}\,\lambda e^t. \]
Therefore,
\[ E(X) = \left[e^{\lambda(e^t - 1)}\,\lambda e^t\right]_{t=0} = \lambda. \]

Population Variance
The population variance of $X$ is
\[ Var(X) = E(X^2) - [E(X)]^2. \]
Now,
\[ E(X^2) = \left[\frac{\partial^2}{\partial t^2} M_X(t)\right]_{t=0} = \left[\frac{\partial}{\partial t}\frac{\partial}{\partial t} M_X(t)\right]_{t=0} \]
\[ \frac{\partial}{\partial t}\frac{\partial}{\partial t} M_X(t) = \frac{\partial}{\partial t}\left[e^{\lambda(e^t - 1)}\,\lambda e^t\right] = \lambda\,\frac{\partial}{\partial t}\left[e^{t + \lambda(e^t - 1)}\right] = \lambda\left[e^{t + \lambda(e^t - 1)}(1 + \lambda e^t)\right]. \]
Therefore,
\[ E(X^2) = \lambda(1 + \lambda) = \lambda + \lambda^2. \]
Finally,
\[ Var(X) = \lambda + \lambda^2 - \lambda^2 = \lambda. \]
Note
1. For a Poisson random variable,
\[ E(X) = Var(X). \]
2. The parameter $\lambda$ of a Poisson distribution is the population mean as well as the population variance.

Remark
Suppose that $X$ is a Poisson random variable with parameter $\lambda$, i.e. $X \sim Poisson(\lambda)$, where $X$ represents, for example,
$X$ = number of patients seen in the emergency room in a day.
The parameter is rescaled in proportion to the length of the observation interval:
• If $Y$ = number of patients seen in the emergency room in a month, then $Y \sim Poisson(\lambda^*)$ with $\lambda^* = 30\lambda$.
• If $Z$ = number of patients seen in the emergency room in a year, then $Z \sim Poisson(\lambda^*)$ with $\lambda^* = 365\lambda$.
• If $U$ = number of patients seen in the emergency room in an hour, then $U \sim Poisson(\lambda^*)$ with $\lambda^* = \frac{\lambda}{24}$.
• If $V$ = number of patients seen in the emergency room in a week, then $V \sim Poisson(\lambda^*)$ with $\lambda^* = 7\lambda$.
Problem
The average number of calls received by a telephone operator between 5 pm and 5:10 pm daily is 3. Find the probability that tomorrow, during the same time interval, the operator receives
(a) No call; (b) Exactly one call; and (c) At least two calls.

Solution
Let $X$ be a random variable defined as
$X$ = number of calls received during the given time interval.
Therefore,
\[ X \sim Poisson(\lambda = 3). \]
That is,
\[ f(x) = \frac{e^{-3}\, 3^x}{x!}, \quad x = 0, 1, 2, \ldots. \]
(a) Probability that no call will be received
\[ P(X = 0) = \frac{e^{-3}\, 3^0}{0!} = e^{-3} = 0.0498. \]
That is, on about 4.98% of days the operator will not receive any call between 5 pm and 5:10 pm.

(b) Probability that exactly one call will be received
\[ P(X = 1) = \frac{e^{-3}\, 3^1}{1!} = 3 \times e^{-3} = 0.1494. \]
That is, on about 14.94% of days the operator will receive exactly one call between 5 pm and 5:10 pm.
(c) Probability that at least two calls will be received
\[ P(X \ge 2) = 1 - P(X < 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - (0.0498 + 0.1494) = 0.8008 \]
That is, on about 80.08% of days the operator will receive at least two calls between 5 pm and 5:10 pm.
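The three Poisson probabilities in this problem can be recomputed in a few lines. The sketch below is illustrative only; `poisson_pmf` is a helper name introduced for this example.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 3                                   # average calls in the 10-minute window
no_call = poisson_pmf(0, lam)             # (a) P(X = 0)
one_call = poisson_pmf(1, lam)            # (b) P(X = 1)
at_least_two = 1 - no_call - one_call     # (c) P(X >= 2) via the complement
print(round(no_call, 4), round(one_call, 4), round(at_least_two, 4))
```

Because the code subtracts unrounded values, part (c) may differ in the fourth decimal place from a hand calculation that uses the rounded values of (a) and (b).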

Problem
The average number of emergency patients seen in a private clinic in a day is 10. Find the probability that on a specific day, the clinic will receive
(a) Exactly 5 emergency patients.
(b) At least 3 emergency patients.
(c) Between 5 and 10 emergency patients, inclusive.
Hints
$X$ = number of emergency patients coming to the clinic in a day
\[ X \sim Poisson(10). \]
(a) $P(X = 5) = 0.038$; (b) $P(X \ge 3) = 0.997$; (c) $P(5 \le X \le 10) = 0.554$.

Problem
In a medical study, it was found that a patient suffering from a certain eye disease
has on average 4 tumors per eye. Find the proportion of patients suffering from this
certain eye disease who have
(a) Exactly 5 tumors per eye.
(b) More than 5 tumors per eye.
(c) Fewer than 5 tumors per eye.
(d) Between 5 and 7 tumors per eye, inclusive.
Problem
The average number of snake bite cases seen at the DMC in a year is 6. Find the probability that
(a) The number of snake bite cases will be 7 in a year.
(b) The number of snake bite cases will be less than 2 in a year.
(c) The number of snake bite cases will be 10 in a 2-year period.
(d) There will be no snake bite case in a month.

Hints
$X$ = number of snake bite cases in a year
\[ X \sim Poisson(6). \]
(a) $P(X = 7) = 0.1377$.
That is, there will be 7 snake bite cases in 13.77% of years.
(b) $P(X < 2) = 0.0174$.
(c) Let
$Y$ = number of snake bite cases in 2 years
\[ Y \sim Poisson(\lambda^*), \quad \lambda^* = 6 \times 2 = 12 \]
\[ P(Y = 10) = 0.1048. \]
(d) Let
$Z$ = number of snake bite cases in a month
\[ Z \sim Poisson(\lambda^*), \quad \lambda^* = \frac{6}{12} = 0.5 \]
\[ P(Z = 0) = 0.6065. \]
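Rescaling the Poisson parameter to a different time window, as in parts (c) and (d), amounts to multiplying the yearly rate by the length of the window in years. The sketch below illustrates this; the helper `poisson_pmf` and the variable names are introduced only for this example.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)

yearly_rate = 6.0                                          # average cases per year

p_seven_in_year = poisson_pmf(7, yearly_rate)              # (a)
p_fewer_than_two = (poisson_pmf(0, yearly_rate)
                    + poisson_pmf(1, yearly_rate))         # (b)
p_ten_in_two_years = poisson_pmf(10, 2 * yearly_rate)      # (c) rate 6 * 2 = 12
p_none_in_month = poisson_pmf(0, yearly_rate / 12)         # (d) rate 6 / 12 = 0.5
print(round(p_seven_in_year, 4), round(p_fewer_than_two, 4),
      round(p_ten_in_two_years, 4), round(p_none_in_month, 4))
```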
Problem
Suppose that $X$ is a Poisson random variable with parameter $\lambda$. If $Var(X) = 3$, find $P(X \ge 1)$.

Hints
Given that
\[ X \sim Poisson(\lambda). \]
That is,
\[ f(x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots; \quad \lambda > 0. \]
For the Poisson distribution, it is known that
\[ E(X) = Var(X) = \lambda. \]
Therefore, $\lambda = 3$.
\[ P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0). \]

Problem
Suppose that $X$ is a Poisson random variable with parameter $\lambda$. If
\[ P(X = 2) = 2P(X = 0), \]
find $P(X = 1)$ and $P(X > 1)$.
Hints
Given that
\[ P(X = 2) = 2P(X = 0) \implies \frac{e^{-\lambda}\lambda^2}{2!} = 2 \times \frac{e^{-\lambda}\lambda^0}{0!} \implies \lambda = 2. \]
\[ P(X = 1) = \frac{e^{-2}\, 2^1}{1!} = 0.2707 \]
\[ P(X > 1) = 1 - P(X \le 1) = 0.5940. \]
Normal Distribution

The normal distribution is the probability distribution of a continuous random variable; that is, it is a probability density function (pdf).

o Many naturally occurring phenomena can be described well by the normal distribution.
o It is the most widely used probability distribution in statistics, because a good number of statistical theories and techniques have been developed based on the normal distribution.

Example: Normal Random Variable

• Height or weight of an individual
• Blood glucose concentration of an individual
• Marks obtained by a student in a test
• Amount of rainfall in a day during the rainy season
• Temperature recorded in a day
Let $X$ be a normal random variable and let $R$ be the set of all possible values of $X$. Then one may define $R$ as
\[ R = \{x \mid -\infty < x < \infty\}. \]

The normal distribution of $X$ is defined as
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0. \]
Note that $\mu$ and $\sigma$ are known as the parameters of the normal distribution. If a random variable $X$ has a normal distribution with parameters $\mu$ and $\sigma$, it is denoted as
\[ X \sim N(\mu, \sigma^2). \]
Note
• $\mu$ = population mean; $\sigma^2$ = population variance; $\sigma$ = population standard deviation.
• The normal distribution curve (or simply normal curve) is bell-shaped and symmetric about $\mu$. Therefore,
\[ P(\mu - a < X < \mu) = P(\mu < X < \mu + a), \quad \forall\, a > 0. \]
• The population mean, median, and mode of a normally distributed random variable are the same. That is,
\[ \text{Mean} = \text{Median} = \text{Mode} = \mu. \]
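Is normal distribution a pdf?
Suppose that $X$ is a normal continuous random variable with parameters $\mu$ and $\sigma$, i.e.
\[ X \sim N(\mu, \sigma^2). \]
Therefore, $R = \{x \mid -\infty < x < \infty\}$ and
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0. \]
It is clear that
\[ f(x) \ge 0, \quad \forall\, x \in R. \]
Again,
\[ \int_{x \in R} f(x)\,dx = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx. \]
Let
\[ z = \frac{x - \mu}{\sigma} \iff x = \mu + \sigma z. \]
Hence, $dx = \sigma\,dz$, and as $x$ runs from $-\infty$ to $\infty$, so does $z$. Therefore,
\[ \int_{x \in R} f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}z^2}\,\sigma\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}z^2}\,dz. \]
Since the integrand is an even function, one may write
\[ \int_{x \in R} f(x)\,dx = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} e^{-\frac{1}{2}z^2}\,dz. \]
Again, let
\[ y = \frac{1}{2}z^2 \iff z = \sqrt{2y}, \]
so that
\[ dy = \frac{1}{2} \times 2z\,dz = z\,dz \implies dz = \frac{1}{\sqrt{2y}}\,dy, \]
and as $z$ runs from $0$ to $\infty$, so does $y$. That is,
\[ \int_{x \in R} f(x)\,dx = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} e^{-y}\,\frac{1}{\sqrt{2y}}\,dy = \frac{1}{\sqrt{\pi}} \int_0^{\infty} y^{\frac{1}{2} - 1} e^{-y}\,dy = \frac{1}{\sqrt{\pi}}\,\Gamma\left(\frac{1}{2}\right) = \frac{\sqrt{\pi}}{\sqrt{\pi}} = 1, \]
i.e.
\[ \int_{x \in R} f(x)\,dx = 1. \]

Therefore, the normal distribution $f(x)$ is a pdf.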
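The analytical result can be confirmed numerically for any particular choice of parameters. The sketch below is illustrative only: the values $\mu = 16$ and $\sigma = 0.9$ are arbitrary, the integration range of ten standard deviations on each side stands in for $(-\infty, \infty)$, and the helper names are not from the notes.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2.0 * pi))

def trapezoid(f, a, b, n=20_000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    inner = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))

mu, sigma = 16.0, 0.9      # arbitrary illustrative parameters
area = trapezoid(lambda x: normal_pdf(x, mu, sigma),
                 mu - 10 * sigma, mu + 10 * sigma)
print(round(area, 6))
```

The area outside ten standard deviations is negligible, so the truncated integral should agree with 1 to many decimal places.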

Moment Generating Function, Population Mean and Variance


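Suppose that $X$ is a continuous random variable distributed as $X \sim N(\mu, \sigma^2)$.

Moment Generating Function (mgf)

The moment generating function (mgf) of $X$, denoted by $M_X(t)$, is defined as
\[ M_X(t) = E[\exp(tX)] = \int_{x \in R} e^{tx} f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{tx}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx. \]
Let
\[ z = \frac{x - \mu}{\sigma} \iff x = \mu + \sigma z. \]
Hence, $dx = \sigma\,dz$, and as $x$ runs from $-\infty$ to $\infty$, so does $z$. Hence,
\[ M_X(t) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{t(\mu + \sigma z)}\, e^{-\frac{1}{2}z^2}\,\sigma\,dz = \frac{e^{\mu t}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(z^2 - 2t\sigma z)}\,dz \]
\[ = \frac{e^{\mu t}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(z^2 - 2t\sigma z + t^2\sigma^2)}\, e^{\frac{1}{2}t^2\sigma^2}\,dz = \frac{e^{\mu t + \frac{1}{2}t^2\sigma^2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(z - t\sigma)^2}\,dz. \]
Let
\[ y = z - t\sigma \iff z = y + t\sigma, \]
so that $dy = dz$, and as $z$ runs from $-\infty$ to $\infty$, so does $y$. Then,
\[ M_X(t) = \frac{e^{\mu t + \frac{1}{2}t^2\sigma^2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}y^2}\,dy. \]
Since the integrand is an even function, one may write
\[ M_X(t) = 2\,\frac{e^{\mu t + \frac{1}{2}t^2\sigma^2}}{\sqrt{2\pi}} \int_0^{\infty} e^{-\frac{1}{2}y^2}\,dy. \]
Again, let
\[ u = \frac{1}{2}y^2 \iff y = \sqrt{2u}, \]
so that
\[ du = \frac{1}{2} \times 2y\,dy = y\,dy \implies dy = \frac{1}{\sqrt{2u}}\,du, \]
and as $y$ runs from $0$ to $\infty$, so does $u$. That is,
\[ M_X(t) = 2\,\frac{e^{\mu t + \frac{1}{2}t^2\sigma^2}}{\sqrt{2\pi}} \int_0^{\infty} e^{-u}\,\frac{1}{\sqrt{2u}}\,du = \frac{e^{\mu t + \frac{1}{2}t^2\sigma^2}}{\sqrt{\pi}} \int_0^{\infty} u^{\frac{1}{2} - 1} e^{-u}\,du = \frac{e^{\mu t + \frac{1}{2}t^2\sigma^2}}{\sqrt{\pi}}\,\Gamma\left(\frac{1}{2}\right) \]
\[ = \frac{e^{\mu t + \frac{1}{2}t^2\sigma^2}}{\sqrt{\pi}}\,\sqrt{\pi} = e^{\mu t + \frac{1}{2}t^2\sigma^2}, \]
i.e. $M_X(t) = e^{\mu t + \frac{1}{2}t^2\sigma^2}$.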
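Population Mean
The population mean of $X$ is
\[ E(X) = \left[\frac{\partial}{\partial t} M_X(t)\right]_{t=0}. \]
Now,
\[ \frac{\partial}{\partial t} M_X(t) = e^{\mu t + \frac{1}{2}t^2\sigma^2}(\mu + t\sigma^2). \]
Therefore,
\[ E(X) = \left[e^{\mu t + \frac{1}{2}t^2\sigma^2}(\mu + t\sigma^2)\right]_{t=0} = \mu. \]

Population Variance
The population variance of $X$ is
\[ Var(X) = E(X^2) - [E(X)]^2. \]
Now,
\[ E(X^2) = \left[\frac{\partial^2}{\partial t^2} M_X(t)\right]_{t=0} = \left[\frac{\partial}{\partial t}\frac{\partial}{\partial t} M_X(t)\right]_{t=0} \]
\[ \frac{\partial}{\partial t}\frac{\partial}{\partial t} M_X(t) = \frac{\partial}{\partial t}\left[e^{\mu t + \frac{1}{2}t^2\sigma^2}(\mu + t\sigma^2)\right] = e^{\mu t + \frac{1}{2}t^2\sigma^2}(\mu + t\sigma^2)^2 + e^{\mu t + \frac{1}{2}t^2\sigma^2}\,\sigma^2 \]
\[ = e^{\mu t + \frac{1}{2}t^2\sigma^2}\left[(\mu + t\sigma^2)^2 + \sigma^2\right]. \]
Therefore,
\[ E(X^2) = \mu^2 + \sigma^2. \]
Finally,
\[ Var(X) = \mu^2 + \sigma^2 - \mu^2 = \sigma^2. \]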

Note: Standard Normal Distribution

A normal distribution with population mean $\mu = 0$ and variance $\sigma^2 = 1$ is known as the standard normal distribution. If a continuous random variable $X$ has a normal distribution with parameters $\mu$ and $\sigma$, i.e. $X \sim N(\mu, \sigma^2)$, the standard normal variable, denoted by $Z$, can be obtained from $X$ as
\[ Z = \frac{X - \mu}{\sigma}. \]
That is, $Z \sim N(0, 1)$. The pdf of $Z$ is given as
\[ f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}, \quad -\infty < z < \infty. \]
The population mean and variance of a standard normal random variable are 0 and 1, respectively. The moment generating function of a standard normal random variable can be expressed as
\[ M_Z(t) = e^{\frac{1}{2}t^2}. \]
Remark
• The population mean, median, and mode of the standard normal distribution coincide at 0. The population variance and standard deviation are both 1.
• The standard normal distribution is bell-shaped and symmetric about 0. That is,
\[ P(-a < Z < 0) = P(0 < Z < a), \quad \forall\, a > 0. \]

• The probability of an event defined in terms of a normally distributed random variable with mean $\mu$ and variance $\sigma^2$ has no closed-form expression, so its computation requires special probability tables. Since the probability depends on the values of $\mu$ ($-\infty < \mu < \infty$) and $\sigma$ ($\sigma > 0$), no single table can cover the general normal distribution.

• Since $\mu = 0$ and $\sigma = 1$ in the standard normal distribution, a probability table is available for it. The table usually provides the cumulative probability of a standard normal random variable.
• Therefore, to compute the probability of an event for a normal random variable, the random variable needs to be converted into the standard normal random variable.

Computation of Probability of an Event: $X \sim N(\mu, \sigma^2)$

Let $a$ and $b$ ($a < b$) be any real numbers.

• $P(X = a) = 0$.
• $P(X < a) = P(X \le a) = P\left( \frac{X - \mu}{\sigma} \le \frac{a - \mu}{\sigma} \right) = P\left( Z \le \frac{a - \mu}{\sigma} \right)$.
• $P(X > a) = P(X \ge a) = 1 - P(X \le a) = 1 - P\left( Z \le \frac{a - \mu}{\sigma} \right)$.
• $P(a < X < b) = P(a \le X < b) = P(a < X \le b) = P(a \le X \le b) = P(X \le b) - P(X \le a) = P\left( Z \le \frac{b - \mu}{\sigma} \right) - P\left( Z \le \frac{a - \mu}{\sigma} \right)$.
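The rules above translate directly into code. The following is a minimal sketch that uses `statistics.NormalDist` from the Python standard library in place of a printed table. The helper names and the example $X \sim N(10, 2^2)$ are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

# Standard normal distribution, playing the role of the printed Z table.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def prob_at_most(a, mu, sigma):
    """P(X <= a) = P(X < a) for X ~ N(mu, sigma^2), computed via Z = (X - mu) / sigma."""
    z = (a - mu) / sigma
    return STANDARD_NORMAL.cdf(z)


def prob_greater(a, mu, sigma):
    """P(X > a) = P(X >= a) = 1 - P(X <= a)."""
    return 1.0 - prob_at_most(a, mu, sigma)


def prob_between(a, b, mu, sigma):
    """P(a < X < b) = P(X <= b) - P(X <= a), assuming a < b."""
    return prob_at_most(b, mu, sigma) - prob_at_most(a, mu, sigma)


# Hypothetical example: X ~ N(10, 2^2), so P(8 < X < 12) = P(-1 < Z < 1).
p = prob_between(8.0, 12.0, mu=10.0, sigma=2.0)
print(round(p, 4))  # roughly 0.6827
```

Because a continuous random variable puts zero probability on any single point, the same functions cover both the strict and non-strict inequalities.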
Normal Distribution
Problem
Suppose that the hemoglobin concentration of healthy males is normally distributed
with mean 16 and variance 0.81. Find the probability (proportion) that a randomly
chosen healthy male has a hemoglobin concentration less than 14.

Solution
Let
\[ X = \text{hemoglobin concentration of a healthy male} . \]
Given that
\[ X \sim N(16, 0.81) . \]
That is,
\[ \text{Mean} = \mu = 16 \quad \text{and} \quad \text{Variance} = \sigma^2 = 0.81 . \]
Therefore,
\[ \text{Standard Deviation} = \sigma = \sqrt{0.81} = 0.90 . \]
Now,
\[ P(X < 14) = P\left( Z \le \frac{14 - 16}{0.90} \right) = P(Z \le -2.22) . \]
Using the standard normal table,
\[ P(X < 14) = 0.0132 . \]
This implies that 1.32% of healthy males have a hemoglobin
concentration less than 14.
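As a cross-check on the table lookup, the sketch below evaluates the same probability with `statistics.NormalDist`. It computes the value both with the $z$-score rounded to two decimals, as a printed table requires, and with the unrounded $z$-score. The variable names are illustrative assumptions.

```python
import math
from statistics import NormalDist

mu = 16.0
sigma = math.sqrt(0.81)   # the problem gives the variance, not the standard deviation

z_exact = (14.0 - mu) / sigma   # unrounded z-score
z_table = round(z_exact, 2)     # two decimals, as used with a printed table

p_table = NormalDist().cdf(z_table)        # P(Z <= rounded z)
p_exact = NormalDist(mu, sigma).cdf(14.0)  # P(X < 14) without rounding z

print(z_table, round(p_table, 4), round(p_exact, 4))
```

The two probabilities differ slightly in the fourth decimal place, which is the usual price of rounding $z$ before the table lookup.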
Problem
Suppose that the birth weight of newborns has a normal distribution with mean 3.4 kg
and standard deviation 0.35 kg. A newborn is considered to have low birth
weight if his/her birth weight is less than 2.4 kg. Find the low birth weight rate.
Also find the probability that a randomly chosen newborn has a birth weight between
3.0 kg and 4.0 kg.

Hints
\[ X = \text{birth weight of a newborn} , \quad X \sim N(3.4, 0.35^2) , \]
\[ \mu = 3.4 , \quad \sigma^2 = 0.35^2 , \quad \sigma = 0.35 . \]
Low Birth Weight Rate
\[ P(X < 2.4) = P\left( Z \le \frac{2.4 - 3.4}{0.35} \right) = P(Z \le -2.86) = 0.0021 . \]
That is, the low birth weight rate is 0.21%. In other words, 0.21% of newborns
will have a birth weight less than 2.4 kg.

\[
\begin{aligned}
P(3.0 < X < 4.0) &= P(X \le 4.0) - P(X \le 3.0) \\
&= P\left( Z \le \frac{4.0 - 3.4}{0.35} \right) - P\left( Z \le \frac{3.0 - 3.4}{0.35} \right) \\
&= P(Z \le 1.71) - P(Z \le -1.14) = 0.9564 - 0.1271 = 0.8293 .
\end{aligned}
\]
That is, 82.93% of newborns have a birth weight between 3.0 kg and 4.0 kg.
Problem
Suppose that the GPAs of 80 students were found to have a normal distribution with
mean 2.1 and standard deviation 0.60. How many of these students are expected to
have a GPA between 2.5 and 3.5?

Hints
\[ X = \text{GPA of a student} , \quad X \sim N(2.1, 0.60^2) , \]
\[ \mu = 2.1 , \quad \sigma^2 = 0.60^2 , \quad \sigma = 0.60 . \]

\[
\begin{aligned}
P(2.5 < X < 3.5) &= P(X \le 3.5) - P(X \le 2.5) \\
&= P\left( Z \le \frac{3.5 - 2.1}{0.60} \right) - P\left( Z \le \frac{2.5 - 2.1}{0.60} \right) \\
&= P(Z \le 2.33) - P(Z \le 0.67) = 0.9901 - 0.7486 = 0.2415 .
\end{aligned}
\]
That is, 24.15% of students have a GPA between 2.5 and 3.5. Therefore, the expected
number of students having such a GPA is
\[ 80 \times 0.2415 = 19.32 \approx 19 . \]
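The expected count is simply the group size times the interval probability. A minimal sketch of that calculation follows, again assuming `statistics.NormalDist` as a stand-in for the table. It uses unrounded $z$-values, so the result can differ slightly from the table-based figure. The variable names are illustrative.

```python
from statistics import NormalDist

gpa = NormalDist(mu=2.1, sigma=0.60)   # X ~ N(2.1, 0.60^2)
n_students = 80

# P(2.5 < X < 3.5) = P(X <= 3.5) - P(X <= 2.5)
p_interval = gpa.cdf(3.5) - gpa.cdf(2.5)

# Expected number of students in the interval: n * p
expected_count = n_students * p_interval

print(round(p_interval, 4), round(expected_count, 2))
```

The product is an expected value rather than an observed count, so it need not be a whole number. It lies between 19 and 20 students.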

Problem
Suppose that a random variable $X$ is normally distributed with mean 18 and standard
deviation 2.5. Find the value of $k$ such that
(i) $P(X < k) = 0.2578$.
(ii) $P(X > k) = 0.1539$.
Hints
\[ X \sim N(18, 2.5^2) , \quad \mu = 18 , \quad \sigma^2 = 2.5^2 , \quad \sigma = 2.5 . \]
(i)
\[ P(X < k) = 0.2578 \implies P\left( Z \le \frac{k - 18}{2.5} \right) = 0.2578 . \]
From the standard normal table,
\[ \frac{k - 18}{2.5} = -0.65 \implies k = (-0.65) \times 2.5 + 18 = 16.375 . \]
(ii)
\[
\begin{aligned}
P(X > k) = 0.1539 &\implies 1 - P(X \le k) = 0.1539 \\
&\implies P(X \le k) = 1 - 0.1539 = 0.8461 \\
&\implies P\left( Z \le \frac{k - 18}{2.5} \right) = 0.8461 .
\end{aligned}
\]
From the standard normal table,
\[ \frac{k - 18}{2.5} = 1.02 \implies k = 1.02 \times 2.5 + 18 = 20.55 . \]
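Reading the table "backwards" amounts to evaluating the inverse of the cumulative distribution function. The sketch below does this with `NormalDist.inv_cdf` from the Python standard library. The variable names are assumptions made for illustration.

```python
from statistics import NormalDist

x = NormalDist(mu=18.0, sigma=2.5)   # X ~ N(18, 2.5^2)

# (i) P(X < k) = 0.2578  =>  k is the 0.2578 quantile of X.
k_lower = x.inv_cdf(0.2578)

# (ii) P(X > k) = 0.1539  =>  P(X <= k) = 1 - 0.1539, then invert.
k_upper = x.inv_cdf(1.0 - 0.1539)

print(round(k_lower, 3), round(k_upper, 3))
```

The values agree with the table-based answers to about two decimal places. Small differences arise because the table only resolves $z$ to two decimals.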
Problem
If $X$ is a normally distributed random variable with mean $\mu$ and standard deviation
5 and $P(X > 25) = 0.0526$, find the population mean.

Hints
\[ X \sim N(\mu, 5^2) , \quad \mu = \, ? , \quad \sigma^2 = 5^2 , \quad \sigma = 5 . \]
\[
\begin{aligned}
P(X > 25) = 0.0526 &\implies P(X \le 25) = 1 - 0.0526 = 0.9474 \\
&\implies P\left( Z \le \frac{25 - \mu}{5} \right) = 0.9474 .
\end{aligned}
\]
From the standard normal table,
\[ \frac{25 - \mu}{5} = 1.62 \implies \mu = 25 - (1.62 \times 5) = 16.90 . \]
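The same steps can be written as a small function: look up the $z$-value for the cumulative probability, then solve $\frac{x_0 - \mu}{\sigma} = z$ for $\mu$. This is only a sketch. The function name `mean_from_upper_tail` and its parameters are hypothetical.

```python
from statistics import NormalDist


def mean_from_upper_tail(x0, sigma, upper_tail_prob):
    """Solve P(X > x0) = upper_tail_prob for mu, where X ~ N(mu, sigma^2)."""
    # P(X <= x0) = 1 - upper_tail_prob, so (x0 - mu) / sigma is that quantile of Z.
    z = NormalDist().inv_cdf(1.0 - upper_tail_prob)
    return x0 - z * sigma


mu = mean_from_upper_tail(x0=25.0, sigma=5.0, upper_tail_prob=0.0526)
print(round(mu, 2))

# Check: with this mu, the upper-tail probability at 25 should be recovered.
print(round(1.0 - NormalDist(mu, 5.0).cdf(25.0), 4))
```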

Exercise
If $X$ is a normally distributed random variable with mean 25 and standard
deviation $\sigma$ and $P(X < 10) = 0.0778$, find the population standard deviation and
variance.

Problem
Attendance at a football game at a certain stadium is normally distributed with
mean 45000 and standard deviation 3000. Find the probability that
(i) Attendance is between 44000 and 48000.
(ii) Attendance exceeds 50000.
Also find the minimum attendance $k$ that is reached or exceeded with probability 0.80.

Hints
\[ X = \text{attendance in a game} , \quad X \sim N(45000, 3000^2) , \]
\[ \mu = 45000 , \quad \sigma^2 = 3000^2 , \quad \sigma = 3000 . \]
(i)
\[
\begin{aligned}
P(44000 < X < 48000) &= P(X \le 48000) - P(X \le 44000) \\
&= P\left( Z \le \frac{48000 - 45000}{3000} \right) - P\left( Z \le \frac{44000 - 45000}{3000} \right) \\
&= P(Z \le 1) - P(Z \le -0.33) = 0.8413 - 0.3707 = 0.4706 .
\end{aligned}
\]
That is, in 47.06% of games, the attendance is between 44000 and 48000.
(ii)
\[
\begin{aligned}
P(X > 50000) &= 1 - P(X \le 50000) \\
&= 1 - P\left( Z \le \frac{50000 - 45000}{3000} \right) = 1 - P(Z \le 1.67) \\
&= 1 - 0.9525 = 0.0475 .
\end{aligned}
\]
That is, in 4.75% of games, the attendance exceeds 50000.

Let $k$ be the minimum attendance that is reached or exceeded with probability 0.80.
That is,
\[
\begin{aligned}
P(X \ge k) = 0.80 &\implies 1 - P(X \le k) = 0.80 \\
&\implies P(X \le k) = 0.20 \\
&\implies P\left( Z \le \frac{k - 45000}{3000} \right) = 0.20 .
\end{aligned}
\]
From the standard normal table,
\[ \frac{k - 45000}{3000} = -0.84 \implies k = (-0.84) \times 3000 + 45000 = 42480 . \]
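The three answers can also be checked by simulation rather than by table. The sketch below draws a large number of hypothetical game attendances from $N(45000, 3000^2)$ and compares the empirical proportions with the values derived above. The seed, the sample size, and the variable names are arbitrary assumptions, and the simulated values only approximate the table-based ones.

```python
import random

random.seed(0)        # fixed seed so the run is reproducible
n_games = 100_000     # number of simulated games (arbitrary choice)

# Simulated attendances: X ~ N(45000, 3000^2).
draws = [random.gauss(45000, 3000) for _ in range(n_games)]

# (i) Proportion of games with attendance between 44000 and 48000.
frac_between = sum(44000 < x < 48000 for x in draws) / n_games

# (ii) Proportion of games with attendance above 50000.
frac_above = sum(x > 50000 for x in draws) / n_games

# Attendance level exceeded in about 80% of games: the empirical 20th percentile.
k_sim = sorted(draws)[int(0.20 * n_games)]

print(round(frac_between, 3), round(frac_above, 3), round(k_sim))
```

With this many draws, the proportions should land within roughly a percentage point of 0.4706 and 0.0475, and `k_sim` should fall close to 42480.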
Problem
If $Z$ is a standard normal random variable and $P(-k < Z < k) = 0.95$, find the
value of $k$.

Solution
Given that
\[ Z \sim N(0, 1) . \]
Since the standard normal distribution is symmetric about 0,
\[ P(-k \le Z \le 0) = P(0 \le Z \le k) . \]
Therefore,
\[
\begin{aligned}
P(-k < Z < k) = 0.95 &\implies P(-k \le Z \le k) = 0.95 \\
&\implies P[(-k \le Z \le 0) \cup (0 \le Z \le k)] = 0.95 \\
&\implies P(-k \le Z \le 0) + P(0 \le Z \le k) = 0.95 \\
&\implies 2 \times P(0 \le Z \le k) = 0.95 \\
&\implies P(0 \le Z \le k) = 0.475 \\
&\implies P(Z \le 0) + P(0 \le Z \le k) = 0.475 + P(Z \le 0) \\
&\implies P[(Z \le 0) \cup (0 \le Z \le k)] = 0.475 + 0.5 \\
&\implies P(Z \le k) = 0.975 .
\end{aligned}
\]
From the standard normal table,
\[ k = 1.96 . \]
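The symmetry argument reduces to a one-line rule: $P(-k < Z < k) = p$ is equivalent to $P(Z \le k) = 0.5 + p/2$. A minimal sketch using `NormalDist.inv_cdf` follows. The function name `symmetric_cutoff` is hypothetical.

```python
from statistics import NormalDist


def symmetric_cutoff(central_prob):
    """Return k > 0 such that P(-k < Z < k) = central_prob for Z ~ N(0, 1)."""
    # By symmetry, P(Z <= k) = 0.5 + central_prob / 2.
    return NormalDist().inv_cdf(0.5 + central_prob / 2.0)


k95 = symmetric_cutoff(0.95)
print(round(k95, 2))  # 1.96
```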
Exercise
Suppose that $Z$ is a standard normal random variable. Find the value of $k$, if
(i) $P(-k < Z < k) = 0.90$. Answer: $k = 1.65$ (approximately).
(ii) $P(-k < Z < k) = 0.99$. Answer: $k = 2.58$ (approximately).
