
Department of Mathematics

Faculty of Engineering,
University of Moratuwa

Methods of Mathematics - Probability and Statistics

MA1020 (Level 1/Semester 2)

by

Dr. T S G Peiris
Department of Mathematics

Note: In these handouts only the important points are given. Students should attend all classes to acquire more detail and to gain exposure to tackling different statistical problems related to engineering applications.

Course Content:
Introduction to probability using set theory, Conditional probability and independence, Applications of Bayes' theorem, Discrete and continuous random variables, Properties of probability distributions (Binomial, Normal, Standard Normal, Student's t, Poisson and Exponential) and their applications, Descriptive statistics and Introduction to Minitab for data analysis

Duration: 7-8 weeks (2 hours/week)

Learning Outcomes:
Upon successful completion of this course, students should be able to

• Apply knowledge of the fundamental probability concepts to various applications
• Use probability distributions for various engineering applications
• Compute various statistical indicators in practical problems
• Apply descriptive statistics for decision making
• Use inferential statistics for decision making
• Interpret the results of data analysis

Methodology: Lectures and tutorials

Scheme of Evaluation: Assignments + Mid-semester examination - 25%
End-of-semester examination - 75%

Recommended Readings:
• Mathematics for Engineers - J M J A Cooray
• Business Statistics: Concepts and Applications (Many books on "Business Statistics" are available in the library. All of them have very good practical applications.)
1. BASIC DEFINITIONS IN SET THEORY

1.1 Set: Any well-defined collection of objects is called a set.

1.2 Element: The objects comprising the set are called its elements.

If A is a set and p is an element of A, then this is denoted by p ∈ A.

Eg. 9 is an element of the set A = {1, 3, 5, 7, 9} consisting of the odd numbers less than 10.

1.3 Universal Set: The universal set is the set which contains all objects under consideration and is denoted by U.

1.4 Null Set: The set that contains no elements is called the null set and is denoted by ∅.

1.5 Set Operations:

Let A and B be arbitrary sets.
• The union of A and B, denoted by A ∪ B, is the set of elements that belong to A or to B:
  A ∪ B = {x | x ∈ A or x ∈ B}
• The intersection of A and B, denoted by A ∩ B, is the set of elements that belong to both A and B:
  A ∩ B = {x | x ∈ A and x ∈ B}
• The difference of A and B (or the relative complement of B with respect to A) is the set of elements of A that are not in B:
  A − B = {x | x ∈ A and x ∉ B}
• If A ∩ B = ∅ (A and B have no common element) then A and B are said to be disjoint sets.

1.6 Laws of Set Theory

Associative law: (A ∪ B) ∪ C = A ∪ (B ∪ C) and (A ∩ B) ∩ C = A ∩ (B ∩ C)

Commutative law: A ∪ B = B ∪ A and A ∩ B = B ∩ A

Distributive law: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) and A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

Identity law: A ∪ ∅ = A and A ∩ U = A

Idempotent law: A ∪ A = A and A ∩ A = A

De Morgan's law: (A ∪ B)^c = A^c ∩ B^c and (A ∩ B)^c = A^c ∪ B^c

2. FUNDAMENTAL PRINCIPLE OF COUNTING

2.1 Factorial Notation

Factorial n is denoted by n! and is defined as n! = 1 · 2 · 3 · … · (n − 1) · n.

Note: If one procedure can be performed in n1 different ways, a second procedure can be performed in n2 ways, a third procedure in n3 ways, and so forth, then the number of ways the procedures can be performed in the order indicated is n1 × n2 × n3 × …

2.2 Permutation
An arrangement of a set of n objects in a given order is called a permutation. The number of arrangements of any r (r ≤ n) objects taken from n objects is denoted by

nPr = P(n, r) = n! / (n − r)!

2.3 Permutation with Repetitions

The number of permutations of n objects of which n1 are alike, n2 are alike and n3 are alike is given by

n! / (n1! n2! n3!)

Eg. The number of different signals, each consisting of 8 flags in a vertical line, formed from a set of 4 indistinguishable red flags, 3 indistinguishable white flags and one blue flag is 8!/(4! 3! 1!) = 280.

2.4 Combinations
A combination of n objects taken r at a time is denoted by C(n, r), where

nCr = C(n, r) = P(n, r) / r! = n! / ((n − r)! r!)

Eg. The number of committees of 3 that can be formed from 8 persons = 8C3 = 8! / (3! (8 − 3)!) = 56.

Comparison between combinations and permutations of the four letters a, b, c and d taken 3
at a time
Table 1
Combinations Permutations
abc abc, acb, bca, bac, cab, cba
abd abd, adb, bad, bda, dab, dba
acd acd, adc, cad, cda, dac, dca
bcd bcd, bdc, cbd, cdb, dbc, dcb
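The counting rules above are easy to check in a few lines of Python (standard library only; math.perm and math.comb require Python 3.8+):

import math
from itertools import combinations

# Permutations of the 4 letters taken 3 at a time: P(4, 3) = 4!/1! = 24
print(math.perm(4, 3))   # 24
# Combinations of the 4 letters taken 3 at a time: C(4, 3) = 4
print(math.comb(4, 3))   # 4
# Flags example: 8 flags with 4 red alike, 3 white alike and 1 blue
print(math.factorial(8) // (math.factorial(4) * math.factorial(3)))  # 280
# Enumerate the combinations column of Table 1 explicitly
print([''.join(c) for c in combinations('abcd', 3)])  # ['abc', 'abd', 'acd', 'bcd']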

3. PROBABILITY THEORY

3.1 Probability
Probability is the likelihood or chance that a particular event will occur.

Eg. Chance of picking a black card from a deck of cards;
Chance of winning a 50-over game;
Chance of selecting the Electronics stream

In a scientific sense, probability is defined as the study of random (nondeterministic) experiments.

The theory of probability provides a mathematical foundation (a numerical measure) for uncertainty. It enables us to make decisions under conditions of uncertainty. The theory of probability is useful in day-to-day life and has many applications in all fields of engineering. There are three approaches to the subject of probability:
(i) a priori classical probability,
(ii) empirical classical probability,
(iii) subjective probability.
Probability theory is based on the paradigm of a random experiment.

3.2 Random Experiment

A random experiment is a process which is conducted repeatedly under homogeneous conditions; that is, an experiment whose outcome cannot be predicted with certainty before the experiment is run. We usually assume that the experiment can be repeated infinitely under essentially the same conditions.

3.3 Random Variable

In reality all variables are non-deterministic. Variables whose exact value cannot be predicted (determined) are known as random variables. For a given experiment, many random variables can be defined.

3.4 Trial
The performance of a random experiment is called a trial. Many random variables can be associated with a trial.
Eg. Throwing a die and throwing a coin three times are trials.

3.5 Event
An outcome of a trial is an event.
Let A be the event that two or more heads appear consecutively in an experiment of throwing a coin three times. Then A = {HHH, HHT, THH}.

3.6 Sample Space - S

The set of all possible outcomes of a given experiment (or random variable) is known as the sample space and is generally denoted by S. The values which belong to S are known as the elements of the sample space.
Let the random variable X = the number that appears when tossing a die; then
S = {1, 2, 3, 4, 5, 6}
3.7 Sample Point
A particular outcome of the experiment is known as a sample point. An event may consist of one or more sample points. In the above example the number of sample points in A is three, denoted by n(A) = 3.

3.8 Simple Event

An event is known as a simple event if it corresponds to a single possible outcome.
Eg. In tossing a die, getting a 3 is a simple event.

3.9 Compound (Joint) Event

An event is known as a compound event if it corresponds to more than a single possible outcome. In tossing a die, getting an odd number is a compound event.

3.10 Favourable Event

The outcomes corresponding to a desired event are called favourable events.
3.11 Mutually Exclusive Events
Events are mutually exclusive if they cannot happen at the same time.
• If we toss a coin, either heads or tails might turn up, but not heads and tails at the same time.
• In a single throw of a die, we can only have one number shown at the top face. The numbers on the faces are mutually exclusive events.

If A and B are mutually exclusive events then the probability of A happening OR B happening is P(A) + P(B). That is, P(A ∪ B) = P(A) + P(B).

3.12 Equally Likely Events

Two or more events are said to be equally likely if the chance of their happening is the same.
• Obtaining 1 or 2 or 3 by throwing an unbiased die

3.13 Independent Events

Two or more events are said to be independent if the happening of one does not influence the happening of the others. That is, A occurring does not affect the probability of B occurring.

• Choosing a marble from a jar AND landing on heads after tossing a coin
• Attending the Maths class and playing a tennis game

3.14 Probability of an Event

The probability of an event A in S is defined as P(A), where

P(A) = (number of sample points in the event) / (number of all sample points in S) = n(A) / n(S)

• A spinner has 4 equal sectors colored yellow, blue, green and red. After spinning the spinner, what is the probability of landing on each color?
The possible outcomes of this experiment are yellow, blue, green, and red.

P(yellow) = (number of ways to land on yellow) / (total number of colors) = 1/4

Find P(blue), P(green) and P(red).

3.15 Axioms of Probability
Let S be a sample space and A an event in S. Then

(1) 0 ≤ P(A) ≤ 1
(2) P(S) = 1
(3) The sum of the probabilities of all simple events must be 1.
(4) If A and B are mutually exclusive events then P(A ∪ B) = P(A) + P(B).
(5) If Ai (i = 1, 2, …, n) are mutually exclusive events then P(∪i Ai) = Σi P(Ai)

Useful Theorems in Probability

Theorem 1: If ∅ is the empty set then P(∅) = 0

Theorem 2: If A^c is the complementary event of A then P(A^c) = 1 − P(A)

Theorem 3: If A ⊆ B then P(A) ≤ P(B)

Theorem 4: If A and B are any two events then P(A − B) = P(A) − P(A ∩ B)

Addition Theorem
If A and B are any two events then the probability that at least one of them occurs (that is, A or B occurs) is denoted by P(A ∪ B) and given by
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

3.16 Simple (Marginal) Probability

Example 1. Suppose that you, as the President of a company, are interested in studying the intention of 1000 households to purchase a big-screen television in the next 12 months. In a follow-up survey the following results were observed. What is the probability that a household is planning to purchase a big-screen television in the next 12 months? What is the probability that a household is planning to purchase a big-screen television and actually purchases the television in the next 12 months?
Table 2
Planned to purchase    Actually purchased
                       Yes    No
Yes                    200    50
No                     100    650

P(planned to purchase) = (number who planned to purchase) / (total number of households) = 250/1000 = 0.25

Note: Simple probability is also called marginal probability, as the total number of successes (those who planned to purchase) can be obtained from the appropriate margin of the contingency table.

3.17 Joint Probability

Joint probability refers to situations involving two or more events.
Eg. P(planned to purchase and actually purchased a big-screen TV)

P(planned to purchase & actually purchased)
= (number who planned to purchase and actually purchased) / (total number of households)
= 200/1000 = 0.20

Example 2. Suppose in the follow-up study the following additional information was obtained from the 300 households that actually purchased a big-screen TV.
Table 3
Purchased HDTV    Purchased DVD
                  Yes    No
Yes               38     42
No                70     150

3.18 Conditional Probability

Let B be any event with P(B) > 0. The probability that an event A occurs once B has occurred is known as the conditional probability of A given B, denoted by P(A | B), and is given by

P(A | B) = P(A ∩ B) / P(B) ………………………. (1)

Eg. If a pair of fair dice is tossed, the probability that the sum is 6 given that at least one die shows a 2 is (2/36)/(11/36) = 2/11. (Discussed in class.)

3.19 Multiplication Theorem for Conditional Probability

For any two events A1 and A2, from equation (1) it is clear that
P(A1 ∩ A2) = P(A1) · P(A2 | A1) and P(A2 ∩ A1) = P(A2) · P(A1 | A2)
The same idea extends to any events A1, A2, …, An:

P(A1 ∩ A2 ∩ … ∩ An) = P(A1) · P(A2 | A1) · P(A3 | A1 ∩ A2) · … · P(An | A1 ∩ A2 ∩ … ∩ An−1)
3.20 Partitions and Bayes' Theorem

Suppose the events A1, A2, …, An form a partition of the sample space S, so that S = ∪_{i=1}^{n} Ai.

Let B be an event. Then B = B ∩ S = B ∩ (A1 ∪ A2 ∪ … ∪ An)
= (A1 ∩ B) ∪ (A2 ∩ B) ∪ … ∪ (An ∩ B)

P(B) = Σ_{i=1}^{n} P(Ai ∩ B) = Σ_{i=1}^{n} P(Ai) · P(B | Ai)

Bayes' Theorem
P(Ai | B) = P(Ai ∩ B) / P(B) = P(Ai) · P(B | Ai) / Σ_{i=1}^{n} P(Ai) · P(B | Ai)
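As a minimal sketch of Bayes' theorem in Python (the machines and rates below are hypothetical numbers, not from the handout): suppose three machines A1, A2, A3 produce 50%, 30% and 20% of the items, with defect rates 1%, 2% and 3%. Given a defective item (event B), we can compute P(Ai | B):

# Hypothetical illustration of the total probability rule and Bayes' theorem
prior = [0.50, 0.30, 0.20]        # P(Ai): share of production per machine
likelihood = [0.01, 0.02, 0.03]   # P(B | Ai): defect rate per machine

p_b = sum(p * l for p, l in zip(prior, likelihood))  # total probability P(B)
posterior = [p * l / p_b for p, l in zip(prior, likelihood)]
print(p_b)        # 0.017
print(posterior)  # [0.294..., 0.353..., 0.353...]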

3.21 Independence
An event B is said to be independent of an event A if the probability that B occurs is not influenced by whether A has or has not occurred.
That is, P(B) = P(B | A).
We know from eq. (1) that P(B | A) = P(B ∩ A) / P(A).
If A and B are independent, P(B | A) = P(B). Thus it is clear that if A and B are two independent events then P(A ∩ B) = P(A) · P(B).
4. Properties of Random Variables

In mathematics, random variables are used in the study of probability. They were developed
to assist in the analysis of games of chance, stochastic events, and the results of scientific
experiments by capturing only the mathematical properties necessary to answer probabilistic
questions. There are two types of random variables, discrete and continuous, depending on the type of measurement of the random variable.

4.1 Discrete Random Variables

A discrete random variable is one which may take on only a countable number of distinct values such as 0, 1, 2, 3, 4, … etc. Discrete random variables are usually counts.
Examples (a) the number of children in a family
(b) the number of calls received in an interval (0, t)
(c) the number of cars passing a traffic light in a specified period

4.2 Probability Distribution of a Discrete Random Variable

The probability distribution of a discrete random variable is a list of the probabilities associated with each of its possible values. It is also sometimes called the probability function or the probability mass function.

Eg. Consider the tossing of a pair of fair dice. The possible outcomes are
S = {(1,1), (1,2), …, (1,6), …, (6,1), (6,2), …, (6,6)}
Thus n(S) = 36. Let X be a RV such that X = max(a, b), where (a, b) is the outcome of the pair of dice. The possible values that X can take are X(S) = {1, 2, 3, 4, 5, 6} and n(X(S)) = 6. Then
P(X = 1) = P{(1,1)} = 1/36 = f(1) (say)
P(X = 2) = P{(1,2), (2,2), (2,1)}, thus f(2) = 3/36
Similarly f(3) = 5/36, f(4) = 7/36, f(5) = 9/36, f(6) = 11/36.
Thus we can form the table given below; it is called the probability distribution of X.
Table 4
xi     1     2     3     4     5     6
f(xi)  1/36  3/36  5/36  7/36  9/36  11/36
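A short Python sketch reproduces Table 4 by enumerating the 36 equally likely outcomes:

from itertools import product
from fractions import Fraction

# pmf of X = max(a, b) for a pair of fair dice, reproducing Table 4
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    x = max(a, b)
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 36)
print(pmf)  # {1: 1/36, 2: 1/12 (= 3/36), 3: 5/36, 4: 7/36, 5: 1/4 (= 9/36), 6: 11/36}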

Figure 1. Probability distribution of X (probability histogram)

4.3 Properties of the Distribution of a RV

Let X be a rv on a sample space S with finite image set X(S) = {x1, x2, x3, …, xn}. The function f on X(S), defined by f(xi) = P(X = xi), i = 1, 2, …, n, is given by the table below.
Table 5
xi      x1      x2      …    xn
f(xi)   f(x1)   f(x2)   …    f(xn)

The distribution f satisfies the conditions (a) f(xi) ≥ 0 and (b) Σ_{i=1}^{n} f(xi) = 1

4.4 Cumulative Probability

It is defined as the probability of observing less than or equal to a given number of successes (see Table 6).
Eg. Let Y be a RV of the above experiment such that Y = a + b, the sum of the pair (a, b).
Then Y takes values in {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}.

Table 6. Probability distribution and cumulative probability distribution of Y

Yi       2     3     4     5      6      7      8      9      10     11     12
P(Yi)    1/36  2/36  3/36  4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
Cum Pr   1/36  3/36  6/36  10/36  15/36  21/36  26/36  30/36  33/36  35/36  36/36

4.5 Distribution of Y

Figure 2. Probability distribution of Y

Figure 3. Cumulative distribution of Y

4.6 Continuous Random Variable

A continuous random variable is one which takes an infinite number of possible values. Continuous random variables are usually measurements.
Examples (a) Z-scores of the first year students in UOM
(b) Leaf areas of the 4th leaf of a given plant

Note: A continuous random variable is not defined at specific values. Instead, it is defined over an interval of values, and is represented by the area under a curve.

Thus if X is a continuous RV with pdf f(x), then P(a ≤ X ≤ b) = ∫_a^b f(x) dx.

As for the discrete case, f(x) ≥ 0 and ∫_{−∞}^{∞} f(x) dx = 1.

5. PARAMETERS OF A DISTRIBUTION

In order to compare different distributions, various parameters (statistical indicators) have been defined. The physical meaning of each indicator is explained in the class.

5.1 Expected Value (Mean) - µ

In probability theory the expected value (or expectation, or mean) of a discrete random variable is the sum, over each possible outcome of the experiment, of the outcome value multiplied by its probability. Thus,

if X is a RV with distribution f(x), then the mean or expected value of X is denoted by E(X) (= µX) and is given by

E(X) = Σ_{i=1}^{n} xi f(xi) if X is a discrete RV, and

E(X) = ∫ x f(x) dx if X is a continuous RV

Eg. (a) Consider the rv X = max(a, b), where (a, b) is the outcome of tossing two fair dice. The pdf of X is given by Table 7 (as shown above).
Table 7 - Pdf of X
xi     1     2     3     4     5     6
f(xi)  1/36  3/36  5/36  7/36  9/36  11/36

Hence E(X) = µX = 1(1/36) + 2(3/36) + 3(5/36) + 4(7/36) + 5(9/36) + 6(11/36) = 161/36 ≈ 4.47

Eg. (b) If X is a continuous rv with pdf f(x), where f(x) = kx^2(1 − x) for 0 < x < 1 and f(x) = 0 otherwise, then it can be shown that k = 12 and E(X) = 3/5, using the property ∫ f(x) dx = 1.

Properties of E(X)

• If c is a constant then E(c) = c
• If X and Y are random variables such that X ≤ Y then E(X) ≤ E(Y)
• E(X + Y) = E(X) + E(Y)
• E(X + c) = E(X) + c
• E(aX) = aE(X)

Note: Proofs are discussed in the class.

5.2 Variance of a Distribution (σ²) - Var(X)

Var(X) = Σi (xi − µ)² f(xi) = E[(X − µ)²], where µ = E(X), when X is discrete
       = Σi xi² f(xi) − µ² = E(X²) − [E(X)]²

V(X) = ∫ x² f(x) dx − µ² when X is continuous

Thus, using Table 7 above,

E(X²) = 1²(1/36) + 2²(3/36) + 3²(5/36) + 4²(7/36) + 5²(9/36) + 6²(11/36) = 791/36 ≈ 21.97

E(X) = 4.47 (shown above), so [E(X)]² ≈ 19.98 (using the rounded mean)

V(X) = 21.97 − 19.98 ≈ 1.99 (exactly 2555/1296 ≈ 1.97)

Properties of V(X)
• V(X + k) = V(X)
• V(aX) = a²V(X)
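The mean and variance above can be verified exactly with Python's fractions module:

from fractions import Fraction

# E(X), E(X^2) and V(X) for the pmf in Table 7 (X = max of two fair dice)
pmf = {1: Fraction(1, 36), 2: Fraction(3, 36), 3: Fraction(5, 36),
       4: Fraction(7, 36), 5: Fraction(9, 36), 6: Fraction(11, 36)}
mean = sum(x * p for x, p in pmf.items())          # 161/36 ≈ 4.47
mean_sq = sum(x**2 * p for x, p in pmf.items())    # 791/36 ≈ 21.97
var = mean_sq - mean**2                            # 2555/1296 ≈ 1.97 (1.99 with rounded inputs)
print(float(mean), float(mean_sq), float(var))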

5.3 Standard Deviation - σ

It is defined as the square root of the variance. This indicator is more useful than the variance in interpreting results.

Remark: Let Y be a random variable with mean µ and standard deviation σ. Then the standardized random variable Z is defined as Z = (Y − µ)/σ, so that E(Z) = 0 and V(Z) = 1. This is a very common transformation in statistics and is very useful in all applications. More details are discussed later in the class.

5.4 Covariance between X and Y - σxy
If X and Y are two RVs then the extent to which the two random variables vary together (co-vary) is measured by an indicator known as the covariance, denoted Cov(X, Y) and given by

Cov(X, Y) = E{[X − E(X)][Y − E(Y)]} = E(XY) − E(X)E(Y)

where, with h the joint distribution of X and Y,

Cov(X, Y) = Σi Σj xi yj h(xi, yj) − µX µY if X and Y are discrete

Cov(X, Y) = ∫∫ x y h(x, y) dx dy − µX µY if X and Y are continuous

Note:
Positive covariance: indicates that higher-than-mean values of one variable tend to be paired with higher-than-mean values of the other variable.
Negative covariance: indicates that higher-than-mean values of one variable tend to be paired with lower-than-mean values of the other variable.
Zero covariance: if the two random variables are independent then the covariance will be zero.
However, zero covariance does not imply that the two variables are independent.

Note:
V(X + Y) = V(X) + V(Y) + 2Cov(X, Y).

Example:
A pair of fair dice is tossed. Let X = max(a, b) and Y = a + b, where (a, b) is any ordered pair belonging to S.
Table 8
      Y:  2     3     4     5     6     7     8     9     10    11    12    Sum
X = 1     1/36  0     0     0     0     0     0     0     0     0     0     1/36
X = 2     0     2/36  1/36  0     0     0     0     0     0     0     0     3/36
X = 3     0     0     2/36  2/36  1/36  0     0     0     0     0     0     5/36
X = 4     0     0     0     2/36  2/36  2/36  1/36  0     0     0     0     7/36
X = 5     0     0     0     0     2/36  2/36  2/36  2/36  1/36  0     0     9/36
X = 6     0     0     0     0     0     2/36  2/36  2/36  2/36  2/36  1/36  11/36
Sum       1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

h(3, 5) = P(X = 3 and Y = 5) = 2/36

E(XY) = 1·2·(1/36) + 2·3·(2/36) + 2·4·(1/36) + … + 6·12·(1/36) = 1232/36 ≈ 34.2
E(X) = µX = 4.47, σX = 1.4
E(Y) = µY = 7.0, σY = 2.4

Thus Cov(X, Y) = 34.2 − (4.47)(7.0) ≈ 2.9, and ρXY = 2.9 / (1.4 × 2.4) ≈ 0.86

Properties of the Covariance
If X, Y, W and V are real-valued random variables and a, b, c, d are constants ("constant" in this context means non-random), then the following can be proved easily.

Notes:
(a) Cov(X, a) = 0
(b) Cov(X, Y) = Cov(Y, X)
(c) Cov(aX, bY) = ab Cov(X, Y)
(d) Cov(X + a, Y + b) = Cov(X, Y)
(e) Cov(aX + bY, cW + dV) = ac Cov(X, W) + ad Cov(X, V) + bc Cov(Y, W) + bd Cov(Y, V)

Note: Proofs of these properties and their applications are discussed in the class.

Limitation
Because the number represented by Cov(X, Y) depends on the units of the data, it is difficult to compare covariances among different data sets having different scales. As a solution, another very useful indicator, known as the correlation coefficient, has been introduced.

5.5 Correlation Coefficient - ρxy

The correlation coefficient ρxy between two random variables X and Y with expected values µX and µY and standard deviations σX and σY is defined as:

ρxy = σxy / (σX σY) = E[(X − µX)(Y − µY)] / (σX σY) = Corr(X, Y)

ρxy = [E(XY) − E(X)E(Y)] / [√(E(X²) − [E(X)]²) · √(E(Y²) − [E(Y)]²)]

5.6 Skewness - Sk: It is a measure of the symmetry of a pdf. Sk = E[(X − µX)³] / σX³

Skewness can come in the form of "negative skewness" or "positive skewness", depending on whether data points are skewed to the left (negative skew) or to the right (positive skew) of the data average.

1. Negative skew: The left tail is longer; the mass of the distribution is concentrated on the right of the figure. It has relatively few low values. The distribution is said to be left-skewed.
2. Positive skew: The right tail is longer; the mass of the distribution is concentrated on the left of the figure. It has relatively few high values. The distribution is said to be right-skewed.

(+)ve (right) skewed: mode < median < mean        (−)ve (left) skewed: mean < median < mode
Figure 4. Positive and negative skewness

5.7 Kurtosis - K
It is a measure of the tallness or flatness ("peakedness") of the pdf, i.e. of whether the data are peaked or flat relative to a normal distribution.
K = E[(X − µX)⁴] / σX⁴

Estimation of the above population parameters based on a sample

Table 9
Parameter                Population   Estimator from a sample
Mean                     µ            x̄ = Σi xi / n
Variance                 σ²           s² = Σi (xi − x̄)² / (n − 1)
Standard deviation       σ            s
Covariance               σxy          Σi (xi − x̄)(yi − ȳ) / (n − 1)
Correlation coefficient  ρxy          rXY = Σi (xi − x̄)(yi − ȳ) / ((n − 1) sX sY)
Skewness                 Sk           Σi (xi − x̄)³ / ((n − 1) s³)
Kurtosis                 K            Σi (xi − x̄)⁴ / ((n − 1) s⁴)

Note: More details with applications are discussed in the class.

6. DESCRIPTIVE STATISTICS (Statistical Indicators)

One important use of descriptive statistics is to summarize a collection of data in a clear and understandable way. Collected data may be in either ungrouped or grouped form. In statistics there are various types of descriptive statistics, known as "statistical indicators". Statistical indicators play an important role in statistical data analysis.

6.1 Indicators to Measure Central Tendency

Table 10
Parameter        Ungrouped                        Grouped
Arithmetic mean  x̄ = Σi xi / n                    x̄ = Σi fi xi / Σi fi
Weighted mean    x̄ = Σi wi xi / Σi wi             x̄ = Σi fi wi xi / Σi fi wi
                 (= Σi αi xi with Σi αi = 1)
Median           the (n + 1)/2-th ranked          L1 + [(N/2 − (Σf)1) / fmed] · c, where
                 observation                      L1 = lower class boundary of the median class,
                                                  N = total frequency, c = size of the median class,
                                                  fmed = frequency of the median class, and
                                                  (Σf)1 = sum of the frequencies of all classes
                                                  lower than the median class
Mode             the value of the data series     L1 + [Δ1 / (Δ1 + Δ2)] · c
                 that appears most frequently
6.2 Indicators to Measure Dispersion

The indicators of dispersion are important for describing the spread of the data. Some such indicators are as follows:

(a) Range: Max(xi) − Min(xi)

(b) Mean Deviation: MD = Σi (xi − x̄) / n

(c) Mean Absolute Deviation: MAD = Σi |xi − x̄| / n

(d) Sample variance: s² = Σi (xi − x̄)² / (n − 1) = (Σi xi² − n x̄²) / (n − 1)

(e) Standard deviation: s = √[Σi (xi − x̄)² / (n − 1)]

(f) Standard error: SE = s / √n
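These indicators take only a line each with the standard library (the data values below are hypothetical, purely for illustration):

import math
import statistics

# Dispersion indicators for a small hypothetical sample
data = [12, 15, 11, 18, 14, 16, 13, 17]
n = len(data)
mean = statistics.mean(data)
s = statistics.stdev(data)                   # sample SD, divisor n - 1
mad = sum(abs(x - mean) for x in data) / n   # mean absolute deviation
rng = max(data) - min(data)                  # range
se = s / math.sqrt(n)                        # standard error
print(mean, s, mad, rng, se)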

6.3 Indicators to Measure Percentiles

The pth percentile is a value such that roughly p percent of the data are smaller and (100 − p) percent of the data are larger. A percentile is a measure of relative standing against all other observations in the sample.

First quartile (Q1): the sample 25th percentile (P25)

Second quartile (Q2): the sample 50th percentile (P50), the "median"

Third quartile (Q3): the sample 75th percentile (P75)

Interquartile range: Q3 − Q1

Position of the desired percentile: Lp = (n + 1) · p / 100

6.4 Indicator to Measure Relative Dispersion

The Coefficient of Variation (CV) is a relative measure that indicates the magnitude of variation relative to the magnitude of the mean.

Coefficient of variation (CV) = (sd / x̄) × 100%

Eg. A manufacturer of television tubes has two types of tubes, A and B. The mean lifetimes of tubes A and B are 1495 hrs and 1875 hrs, and the SDs of the tubes are 280 hrs and 380 hrs respectively.

The CV of A = (280/1495) × 100 ≈ 18.7% and the CV of B = (380/1875) × 100 ≈ 20.3%

6.5 Indicator to Measure Linear Association between Variables

The correlation coefficient is one of the most common and most useful statistical indicators to describe the association (degree of linear relationship) between two variables.

Pearson correlation coefficient

If X and Y are two (continuous) random variables following the same bivariate distribution and the paired values of X and Y in a sample of size n are (x1, y1), (x2, y2), …, (xn, yn), then the correlation coefficient between Y and X is defined by

rXY = Σ (xi − x̄)(yi − ȳ) / √[Σ (xi − x̄)² · Σ (yi − ȳ)²] = sXY / (sX sY) = r (say), with −1 ≤ r ≤ +1

    = [Σ xi yi − n x̄ ȳ] / [√(Σ xi² − n x̄²) · √(Σ yi² − n ȳ²)]

Figure 5
Table 11

number   x    y
1        20   35
2        25   45
3        30   50
4        35   65
5        40   60
6        45   65
7        50   70
8        60   65

Figure 6. Scatterplot of y vs x
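For the data in Table 11, r can be computed directly (statistics.correlation requires Python 3.10+):

import statistics

# Pearson r for the paired data in Table 11
x = [20, 25, 30, 35, 40, 45, 50, 60]
y = [35, 45, 50, 65, 60, 65, 70, 65]
r = statistics.correlation(x, y)
print(round(r, 2))  # 0.86 (a strong positive linear association)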

Note 1:
Positive correlation: If x and y have a strong positive linear correlation, r is close to +1. An r value of exactly +1 indicates a perfect positive fit. That is, the relationship between the variables is such that y increases as x increases.

Negative correlation: If x and y have a strong negative linear correlation, r is close to −1. An r value of exactly −1 indicates a perfect negative fit. Negative values indicate a relationship between x and y such that as values for x increase, values for y decrease.

No correlation: If there is no linear correlation or a weak linear correlation, r is close to 0. A value near zero means that there is a random or nonlinear relationship between the two variables.

Note 2: A correlation greater than 0.8 is generally described as strong, whereas a correlation less than 0.5 is generally described as weak. These values can vary based upon the "type" of data being examined and the size of the sample. THIS IS A VERY SUBJECTIVE CRITERION.

7. Chebyshev's Theorem and the Empirical Rule

7.1 Chebyshev's Theorem:

For any number k > 1, at least 1 − 1/k² of the values of any distribution lie within k standard deviations of the mean:

P(µ − kσ < X < µ + kσ) ≥ 1 − 1/k²

7.2 Empirical Rule:

For a normal distribution (mode = median = mean):
68% of the data lies within 1 standard deviation of the mean.
95% of the data lies within 2 standard deviations of the mean.
99.7% of the data lies within 3 standard deviations of the mean.
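A quick simulation (hypothetical normal data, not from the handout) illustrates both results at once:

import random
import statistics

# Empirical check of Chebyshev's bound and the empirical rule on simulated
# normal data with mean 50 and SD 10
random.seed(1)
data = [random.gauss(50, 10) for _ in range(100_000)]
mu, sd = statistics.mean(data), statistics.pstdev(data)
for k in (1, 2, 3):
    frac = sum(abs(x - mu) < k * sd for x in data) / len(data)
    print(k, round(frac, 3), ">=", round(1 - 1 / k**2, 3))
# prints roughly 0.683, 0.954, 0.997, matching the empirical rule; each
# fraction is also at least the Chebyshev lower bound (0, 0.75, 0.889)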

Ex.
a) According to Chebyshev's theorem, at least what % of any set of observations will be within 1.8 standard deviations of the mean?
b) The mean income of a group of sample observations is Rs. 500 and the sd = Rs. 40. According to Chebyshev's theorem, at least what % of the incomes will lie between Rs. 400 and Rs. 600?
c) If a group of data has a mean of 54 and a standard deviation of 78.5, what is the interval that should contain at least 93.8% of the data?
d) Given a data set comprised of 4117 measurements that is bell-shaped with a mean of 862: if 99.7% of the data lies between 580 and 1144, what is the standard deviation?
e) Given a group of data with mean 40 and standard deviation 15, at least what percent of the data will fall between 10 and 70?

8. Discrete (Binomial - B, Poisson - P) and
Continuous (Normal - N, Exponential - E) Distributions

8.1 Binomial Distribution

The binomial distribution describes the behaviour of a count variable if the number of observations n (say) is fixed and each observation has only two outcomes ("success" or "failure"). Further, each observation is assumed to be independent, and the probability of "success", p, is the same for each observation. Then we say Y is distributed binomially, i.e. Y has a binomial distribution.

Examples.
• An engineer is interested in the number of broken-down buses in a sample lot of 100.
• A doctor studies the numbers of survivors vs deaths after treatment for a sample of 200 patients.
• A teacher may be interested in how many heads occur when 60 coins are thrown.

The conditions for a binomial distribution are:

• The experiment consists of n identical trials,
• each trial has only one of two possible mutually exclusive outcomes, success or failure,
• the probability of each outcome does not change from trial to trial.

If Y ~ B(n, p) then P(Y = r) = nCr p^r (1 − p)^(n − r), r = 0, 1, …, n

p = probability of success, n = number of trials

Ex. 1. A family has 6 children. Find the probability that there are (a) 3 boys and 3 girls and (b) fewer boys than girls. The probability of any child being a boy is 1/2.

Let Y = number of boys in the family; then Y ~ B(6, 1/2).

(a) P(3 boys) = 6C3 (1/2)^3 (1 − 1/2)^3 = 20/64 = 5/16

(b) P(fewer boys than girls) = P(no boys) + P(1 boy) + P(2 boys)

= 6C0 (1/2)^0 (1 − 1/2)^6 + 6C1 (1/2)^1 (1 − 1/2)^5 + 6C2 (1/2)^2 (1 − 1/2)^4

= 1/64 + 6/64 + 15/64 = 22/64 = 11/32
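A small Python helper confirms both answers:

from math import comb

def binom_pmf(r, n, p):
    # P(Y = r) for Y ~ B(n, p)
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Ex. 1: 6 children, P(boy) = 1/2
print(binom_pmf(3, 6, 0.5))                          # 0.3125  = 5/16
print(sum(binom_pmf(r, 6, 0.5) for r in range(3)))   # 0.34375 = 11/32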

Ex. 2. A multiple choice test has four possible answers to each of 16 questions. A student guesses the answer to each question, i.e., the probability of getting a correct answer on any given question is 0.25. What is the probability that at least 14 answers are correct?

Let Y = number of correct answers. Then Y ~ B(16, 1/4).

P(Y ≥ 14) = P(Y = 14) + P(Y = 15) + P(Y = 16)

= 16C14 (1/4)^14 (3/4)^2 + 16C15 (1/4)^15 (3/4)^1 + 16C16 (1/4)^16 (3/4)^0

Properties of the Binomial Distribution

Mean = np, Variance = np(1 − p) = npq where p + q = 1, and SD = √(np(1 − p))

Proof: Let Y be a rv distributed B(n, p).

Expectation - E(Y)

E(Y) = Σ_{k=0}^{n} k P(Y = k)
     = Σ_{k=0}^{n} k [n! / (k!(n − k)!)] p^k (1 − p)^(n − k)
     = Σ_{k=1}^{n} [n(n − 1)! / ((k − 1)!(n − k)!)] p^k (1 − p)^(n − k)
     = np Σ_{k=1}^{n} [(n − 1)! / ((k − 1)!(n − k)!)] p^(k − 1) (1 − p)^(n − k)
     = np Σ_{s=0}^{m} [m! / (s!(m − s)!)] p^s (1 − p)^(m − s)   (m = n − 1 and s = k − 1)
     = np × 1 = np
Variance - V(Y)

V(Y) = E(Y²) − [E(Y)]²

E(Y²) = Σ_{k=0}^{n} k² [n! / (k!(n − k)!)] p^k (1 − p)^(n − k)
      = np Σ_{k=1}^{n} k [(n − 1)! / ((k − 1)!(n − k)!)] p^(k − 1) (1 − p)^(n − k)
      = np Σ_{s=0}^{n−1} (s + 1) [(n − 1)! / (s!(n − s − 1)!)] p^s (1 − p)^(n − s − 1)   (let s = k − 1)
      = np Σ_{s=0}^{n−1} s [(n − 1)! / (s!(n − s − 1)!)] p^s (1 − p)^(n − s − 1) + np Σ_{s=0}^{n−1} [(n − 1)! / (s!(n − s − 1)!)] p^s (1 − p)^(n − s − 1)
      = np[(n − 1)p + 1] = np(np + q)

[The first sum is the mean of B(n − 1, p) and the second is the sum of the probabilities of B(n − 1, p).]

V(Y) = np(np + q) − n²p² = npq

Note:
If X ~ B(n, p) and Y ~ B(m, p) and X and Y are independent, then X + Y also has a binomial distribution, with parameters (n + m, p). (Proof is not required.)

8.2 Normal Distribution
The normal distribution is the most important family of continuous probability distributions in statistics and is widely applicable in all fields. The distribution is defined by two parameters, the mean ("average", µ) and the variance ("variability", σ²). The normal distribution is also known as the Gaussian distribution and is denoted by Y ~ N(µ, σ²). The pdf of the normal distribution is given by

f(x) = [1 / √(2πσ²)] exp[−(x − µ)² / (2σ²)] ……………….. [1]

Standard normal distribution

In general, normal random variables are converted to the standard normal. If X ~ N(µ, σ²), then Z = (X − µ)/σ has a standard normal distribution.

It can be shown that E(Z) = 0 and V(Z) = 1. Thus it is written as Z ~ N(0, 1).

The density of the standard normal distribution is given by

f(x) = [1 / √(2π)] exp[−x² / 2] …..…………….. [2]

One of the useful properties of the standard normal distribution is shown below.

Note: If Y ~ N(µ, σ²) then E(Y) = µ and V(Y) = σ²

Proof:
E(Y) = ∫_{−∞}^{∞} y · [1 / √(2πσ²)] exp[−(y − µ)² / (2σ²)] dy

Let t = (y − µ)/σ; then dy = σ dt and

E(Y) = [1 / √(2π)] ∫_{−∞}^{∞} (σt + µ) e^(−t²/2) dt

     = [σ / √(2π)] ∫_{−∞}^{∞} t e^(−t²/2) dt + µ · [1 / √(2π)] ∫_{−∞}^{∞} e^(−t²/2) dt

     = 0 + µ

[it can easily be shown that [1 / √(2π)] ∫ e^(−t²/2) dt = 1 and ∫ t e^(−t²/2) dt = 0]

     = µ

E(Y²) = ∫_{−∞}^{∞} y² · [1 / √(2πσ²)] exp[−(y − µ)² / (2σ²)] dy

      = [1 / √(2π)] ∫_{−∞}^{∞} (σt + µ)² e^(−t²/2) dt

      = [σ² / √(2π)] ∫ t² e^(−t²/2) dt + [2µσ / √(2π)] ∫ t e^(−t²/2) dt + [µ² / √(2π)] ∫ e^(−t²/2) dt

      = [σ² / √(2π)] ∫ t² e^(−t²/2) dt + µ²

Integrating the remaining integral by parts with u = t and dv = t e^(−t²/2) dt, it can be shown that the first term equals σ².

Thus E(Y²) = σ² + µ², and V(Y) = E(Y²) − [E(Y)]² = σ²

Applications of the Normal Distribution

(1) (a) P(Z ≤ 1.2) = ∫_{−∞}^{1.2} [1 / √(2π)] e^(−x²/2) dx

= ∫_{−∞}^{0} [1 / √(2π)] e^(−x²/2) dx + ∫_{0}^{1.2} [1 / √(2π)] e^(−x²/2) dx

= 0.5000 + 0.3849 = 0.8849

(b) P(Z ≥ 1.13) = 1 − P(Z < 1.13)

= 1 − [0.5000 + P(0 ≤ Z ≤ 1.13)] = 1 − [0.5000 + 0.3708] = 0.1292

(c) P(−1.37 ≤ Z ≤ 2.01) = P(−1.37 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 2.01)

= P(0 ≤ Z ≤ 1.37) + P(0 ≤ Z ≤ 2.01) = 0.4147 + 0.4778 = 0.8925

(2) Let T be the temperature (°F) in May of a given year, distributed normally with mean 68 and SD 6. Find the probability that the temperature is between 70 and 80.

T ~ N(68, 6²)

P(70 ≤ T ≤ 80) = P((70 − 68)/6 ≤ (T − 68)/6 ≤ (80 − 68)/6) = P(0.33 ≤ Z ≤ 2.0)

= P(0 ≤ Z ≤ 2.0) − P(0 ≤ Z ≤ 0.33) = 0.4772 − 0.1293 = 0.3479
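Normal probabilities like these can be computed without tables using the error function from Python's math module:

from math import erf, sqrt

def norm_cdf(x, mu=0.0, sigma=1.0):
    # P(X <= x) for X ~ N(mu, sigma^2), via the error function
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Example (2): T ~ N(68, 6^2), P(70 <= T <= 80)
print(norm_cdf(80, 68, 6) - norm_cdf(70, 68, 6))  # ≈ 0.3466 (0.3479 above uses Z rounded to 0.33)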

(3) The radii of a sample of 800 nails are normally distributed with mean 66 mm and variance 25. Find the number of nails with radius between 65 and 70 mm.

P(65 ≤ R ≤ 70) = P((65 − 66)/5 ≤ Z ≤ (70 − 66)/5) = P(−0.20 ≤ Z ≤ 0.80)

= P(−0.20 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.80) = 0.0793 + 0.2881 = 0.3674

Required number of nails = 800 × 0.3674 ≈ 294

Normal Approximation for the Binomial

If n is large enough (n > 30) and the skewness of the distribution is not too great, then if Y ~ B(n, p), approximately Y ~ N(np, npq) for large n.

Note: To use the normal approximation to calculate such probabilities, we apply the continuity correction when converting the discrete variable to a continuous one.

Ex.
1. A fair die is tossed 180 times. Find the probability that the face 6 will appear between 29 and 32 times inclusive.
Let Y = number of times a six appears.
Thus Y ~ B(180, 1/6).
Using the normal approximation to the binomial distribution,
Y ~ N(np, npq) where n = 180, p = 1/6 and q = 1 − p = 5/6
E(Y) = np = 180 × 1/6 = 30, V(Y) = npq = 180 × 1/6 × 5/6 = 25
SD(Y) = 5

P(29 ≤ Y ≤ 32) = P((29 − 30)/5 ≤ Z ≤ (32 − 30)/5)
= P(−0.2 ≤ Z ≤ 0.4) = 0.0793 + 0.1554 = 0.2347

With the continuity correction we are interested in

P(28.5 ≤ Y ≤ 32.5) = P((28.5 − 30)/5 ≤ Z ≤ (32.5 − 30)/5)
= P(−0.3 ≤ Z ≤ 0.5)
= 0.1179 + 0.1915 = 0.3094
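The exact binomial answer can be compared with the continuity-corrected approximation in a few lines:

from math import comb, erf, sqrt

def binom_pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

def norm_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Y ~ B(180, 1/6): number of sixes in 180 tosses
n, p = 180, 1/6
exact = sum(binom_pmf(r, n, p) for r in range(29, 33))
mu, sigma = n * p, sqrt(n * p * (1 - p))
approx = norm_cdf(32.5, mu, sigma) - norm_cdf(28.5, mu, sigma)
print(exact, approx)  # both ≈ 0.31; the continuity correction is what makes them agree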

2. Consider the voters among the first year undergraduates of the University of Moratuwa. The true proportion of voters who favour candidate A is 0.40. Given a sample of 200 voters, what is the probability that more than half of the voters support candidate A?

3. The grades on a short quiz in statistics were 0, 1, 2, …, 10 points, depending on the number of the 10 questions answered correctly. The mean grade was 6.7 and the sd was 1.2. Assuming the grades to be normally distributed, determine (a) the % of students scoring 6 points, (b) the maximum grade of the lowest 10% and (c) the minimum grade of the highest 10% of the class.

8.3 Poisson Distribution

The Poisson distribution is also a discrete distribution; it is used to model the number of events occurring within a given time interval.

Examples: (1) Y = number of accidents between time a and time b.

Since the Poisson is a discrete distribution, the probability distribution of a Poisson variable is given by

P(Y = y) = e^(−λ) λ^y / y!  for y = 0, 1, 2, …, denoted by Y ~ P(λ)

λ is the average number of events in the given time interval and is known as the "shape parameter" of the distribution.

Properties of the Poisson Distribution

If Y ~ P(λ) then E(Y) = λ = V(Y)

Proof:

E(Y) = Σ_{k=0}^{∞} k λ^k e^(−λ)/k! = λ Σ_{k=1}^{∞} λ^(k−1) e^(−λ)/(k − 1)! = λ Σ_{s=0}^{∞} λ^s e^(−λ)/s!   (let k − 1 = s)

     = λ × 1 = λ   (since Σ_{s} λ^s e^(−λ)/s! = 1)

E(Y²) = Σ_{k=0}^{∞} k² λ^k e^(−λ)/k! = λ Σ_{k=1}^{∞} k λ^(k−1) e^(−λ)/(k − 1)! = λ Σ_{s=0}^{∞} (s + 1) λ^s e^(−λ)/s!   (k − 1 = s)

      = λ [Σ_{s=0}^{∞} s λ^s e^(−λ)/s! + Σ_{s=0}^{∞} λ^s e^(−λ)/s!] = λ[E(Y) + 1] = λ² + λ

V(Y) = E(Y²) − [E(Y)]² = λ² + λ − λ² = λ

Relation between B(n, p) and P(λ)

In the binomial, if n is large and p is close to 0 then the event is called a rare event. In practice we shall consider an event to be rare if n ≥ 50 while np is close to 5 or less. In such situations the binomial is closely approximated by the Poisson with λ = np.

Note:
Since there is a relationship between the Binomial and the Normal, there is also a relation between the Poisson and the Normal. It is given by
P(λ) ≈ N(µ, σ²) = N(λ, λ)

Eg. 1. If the probability that an individual suffers a bad reaction from an injection of a given type is 0.001, determine the probability that out of 2000 individuals (a) exactly 3 and (b) more than 2 will suffer a bad reaction.
Let Y = number of individuals who suffer a bad reaction.
As n is large and p is small, it can be assumed that Y ~ P(λ) where λ = np = 2000 × 0.001 = 2.

P(Y = 3) = 2³ e^(−2)/3! = 0.1804

Assuming Y ~ B(2000, 0.001) instead, P(Y = 3) = 2000C3 (0.001)³(1 − 0.001)^1997 = 0.1805
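The agreement between the two models is easy to confirm in Python:

from math import comb, exp, factorial

# Rare event: n = 2000, p = 0.001, lam = np = 2
n, p, lam = 2000, 0.001, 2.0
poisson = lam**3 * exp(-lam) / factorial(3)
binomial = comb(n, 3) * p**3 * (1 - p)**(n - 3)
print(round(poisson, 4), round(binomial, 4))  # 0.1804 0.1805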

8.4 Exponential Distribution

This is a continuous distribution which is useful on many occasions, particularly in business. It is widely used in waiting-line (or queuing) theory to model the length of time between arrivals in processes such as customers at bank ATMs, customers at a fast-food restaurant, or patients entering an accident ward. It is defined by a single parameter λ.
Thus if X ~ Exp(λ) then f(x) = λe^(−λx), x ≥ 0.

H/E: Prove that E(X) = 1/λ and V(X) = 1/λ²

Ex. Suppose that customers arrive at a bank's ATM at the rate of 20 per hour. If a customer has just arrived, what is the probability that the next customer arrives within 6 minutes?

X ~ Exp(20), and 6 minutes = 0.1 hour
P(X < 0.1) = 1 − e^(−20 × 0.1) = 1 − e^(−2) = 0.8647
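In Python the same calculation reads:

from math import exp

# X ~ Exp(lam): P(X < t) = 1 - exp(-lam * t)
lam = 20       # arrivals per hour
t = 6 / 60     # 6 minutes, in hours
print(1 - exp(-lam * t))  # 0.8646...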

Note: More exercises on applications are done in the class

9. STATISTICAL INFERENCE

It is generally not possible to compute the parameters (mean, variance, etc.) of a population directly, for obvious reasons. Thus we compute a value (or range) that represents a "good" guess for the true value of the parameter, in order to draw conclusions (inferences) about the population based on a sample. There are two types of estimators, namely
(a) point estimators and
(b) interval estimators.

9.1 Point Estimator

A point estimate of a parameter is a single number, based on the sample data, that we can consider to be the most plausible value of the parameter.

9.2 Interval Estimator

An estimate of a population parameter given by two numbers, at a given confidence level, is called an interval estimate.

For example:
We want to know the average salary of a chemical engineering graduate, so we select 25 people at random. The mean annual income is 60,000/=. This is a point estimate. Using an interval estimate we say that the mean annual income is between 40,000 and 85,000/= with 95% confidence.

How do we judge the confidence?

Let µ and σ² be the population mean and variance (irrespective of the distribution). These two parameters are estimated by

µ̂ = sample mean = x̄ = Σ_{i=1}^{n} xi / n  and  σ̂² = sample variance = s² = Σ_{i=1}^{n} (xi − x̄)² / n

How accurate are these estimators?

Basically, if we knew the population parameters, the sample estimates should be very close to them. Let µ be the population mean. Then we want µ − x̄ ≈ 0, where x̄ is the mean of the sample. But different samples give different values, so x̄ is a random variable. Thus we want µ − E(x̄) = 0.

Note: The difference µ − E(x̄) is known as the "bias" of the parameter estimated from the sample.

Thus, to obtain a more precise estimator for the population parameter, the bias should be zero. That is, the estimator should be unbiased.

9.3 Unbiased Estimator

If a statistic θ̂ is used as an estimator for a population parameter θ and E(θ̂) = θ, then θ̂ is said to be an unbiased estimator.

Note 1. x̄ is an unbiased estimator for the population mean µ.

Proof: E(x̄) = E(Σ_{i=1}^{n} xi / n) = (1/n) Σ_{i=1}^{n} E(xi) = (1/n) · nµ = µ

Note 2: The sample variance s² = Σ(xi − x̄)²/n is not an unbiased estimator for the population variance σ².

Proof: s² = Σ_{i=1}^{n} (xi − x̄)² / n = [Σ_{i=1}^{n} xi² − n x̄²] / n

E(s²) = [Σ_{i=1}^{n} E(xi²)] / n − E(x̄²)
      = [n(σ² + µ²)] / n − [V(x̄) + {E(x̄)}²]
      = σ² + µ² − σ²/n − µ²
      = (n − 1)σ² / n

[V(x̄) = V(Σ xi / n) = (1/n²) Σ V(xi) = σ²/n]

Hence Σ_{i=1}^{n} (xi − x̄)² / n is not an unbiased estimator for σ².

Thus E[Σ_{i=1}^{n} (xi − x̄)²] = (n − 1)σ², and therefore

Σ_{i=1}^{n} (xi − x̄)² / (n − 1) is an unbiased estimator for the population variance σ².

9.4 Central Limit Theorem (without proof)

For large n (n > 30) the distribution of the sample mean is approximately normal with mean µ and variance σ²/n (irrespective of the population, provided the mean and variance are finite). The Central Limit Theorem is the foundation for many statistical procedures: the distribution of an average tends to be normal as the sample size increases, regardless of the distribution from which the average is taken.
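A small simulation (hypothetical uniform data) makes the theorem concrete: averages of n = 40 draws from a decidedly non-normal distribution cluster around the population mean with SD σ/√n:

import random
import statistics

# Means of 10,000 samples, each of n = 40 draws from Uniform(0, 1)
random.seed(42)
n, reps = 40, 10_000
means = [statistics.mean(random.uniform(0, 1) for _ in range(n)) for _ in range(reps)]
print(statistics.mean(means))   # ≈ 0.5, the population mean
print(statistics.stdev(means))  # ≈ sqrt(1/12)/sqrt(40) ≈ 0.0456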

9.5 Confidence Interval

Instead of a point estimator, we compute a range to represent the parameter of the population. We construct this interval with some confidence; thus it is known as a "confidence interval". That is, it is a range within which the population value is likely to fall. "Likely" is usually taken to be "95% of the time," and the range is then called the 95% confidence interval. The values at each end of the interval are called the confidence limits.

Confidence Interval for the Mean under Normality

Case 1: σ is known
To explain how confidence intervals are constructed, we work backwards and begin by assuming characteristics of the population.

Let X ~ N(µ, σ²); then X̄ ~ N(µ, σ²/n) and

Z = (X̄ − µ) / (σ/√n) ~ N(0, 1)

From the standard normal distribution, an observation lies in the intervals µ ± σ, µ ± 2σ and µ ± 3σ (here µ = 0 and σ² = 1) with about 68%, 95% and 99.7% confidence. We use this idea to compute a CI for the mean instead of a point estimator, since
P(−1.96 ≤ Z ≤ +1.96) = 95.0% (as shown below)

Figure 7

P(−1.96 ≤ (X̄ − µ)/(σ/√n) ≤ +1.96) = 95%, i.e. P(X̄ − 1.96 σ/√n ≤ µ ≤ X̄ + 1.96 σ/√n) = 95%

Thus the 95% CI for the mean is given by X̄ ± 1.96 σ/√n.

Similarly, the 99% CI for the mean is X̄ ± 2.58 σ/√n and the 90% CI for the mean is X̄ ± 1.65 σ/√n.

Thus the 100(1 − α)% CI for the mean (when σ is known) is given by X̄ ± z_{α/2} σ/√n.

Example 1: Suppose that we found that the mean mark (out of 20) of 50 students in the mid-term test is 12, with a standard deviation of 6. What can we conclude about the average mark of students with 95% confidence?

In this example n (> 30) is large and we can use the normal approximation.

Let X = marks of students.

Assume X ~ N(µ, σ²); then the 95% CI for the population mean is given by

x̄ ± 1.96 σ/√n = 12 ± 1.96 × 6/√50 = 12 ± 1.66 = [10.34, 13.66] ≈ [10, 14]

Thus we are 95% confident that the mean mark lies between 10.34 and 13.66. We can also say that 1.66 is the margin of error.
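The same interval in Python:

from math import sqrt

# 95% CI for the mean with sigma known (or n large): x_bar ± z * sigma/sqrt(n)
x_bar, sigma, n, z = 12, 6, 50, 1.96
margin = z * sigma / sqrt(n)
print(x_bar - margin, x_bar + margin)  # ≈ 10.34 .. 13.66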

Example 2. The blood cholesterol levels of a population of teachers have mean 202 and SD 14. If a sample of 30 teachers is selected, approximate the probability that the sample mean of their blood cholesterol levels will lie between 198 and 206. Repeat for a sample size of 64. (Class exercise)
Case 2: Confidence interval when the population variance is not known

When σ is unknown (i.e. σ is estimated by the sample standard deviation s) and n is large, the 100(1 − α)% CI for the mean is given by X̄ ± z_{α/2} s/√n

Case 3: Confidence interval when the population variance is not known and n < 30

When σ is unknown (i.e. σ is estimated from the sample) and the sample size is small (< 30), the 100(1 − α)% CI for the mean is given by X̄ ± t_{α/2,n−1} s/√n

where t_{α/2,n−1} is the critical value of the t distribution with n − 1 degrees of freedom.

Note: In this case it is assumed that (x̄ − µ)/(s/√n) ~ t_{n−1}

Table 12. Summary - selecting the confidence interval for a mean

Variance      Sample size      95% confidence interval
σ² known      Large or small   x̄ ± 1.96 σ/√n
σ² unknown    Large            x̄ ± 1.96 s/√n
σ² unknown    Small            x̄ ± t_{0.025,n−1} s/√n

Eg. Given the following GPAs for 6 students: 2.80, 3.20, 3.75, 3.10, 2.95, 3.40, calculate a 95% confidence interval for the population mean GPA.

Ans: 3.2 ± 2.57 × 0.339/√6 = [2.84, 3.56] (based on Case 3)

If we assume normality, the CI = 3.2 ± 1.96 × 0.339/√6 = [2.93, 3.47]

Note: You can try this using the Minitab software.
10. POINT ESTIMATOR AND CONFIDENCE INTERVAL FOR A PROPORTION

Suppose we wish to estimate p, the proportion of an event in a population, based on a sample of size n. Let X = number of successes in the sample. Then an estimator for the proportion of successes (p, say) is p̂ = x/n.

X ~ B(n, p) ⟹ X ≈ N(np, npq), where q = 1 − p

E(x/n) = (1/n) E(x) = np/n = p. Thus p̂ is an unbiased estimator for p.

V(x/n) = (1/n²) V(x) = npq/n² = pq/n

Assuming p̂ = X/n ≈ N(p, p(1 − p)/n), a CI for the proportion is given by

p̂ ± z_{α/2} SD(p̂) = p̂ ± z_{α/2} √(p̂(1 − p̂)/n)

Example: Out of a random sample of 100 boxes coming from a particular machine, 82 were non-defective. Construct a 99% CI estimate of the proportion of non-defectives.

Estimator of the proportion non-defective: p̂ = 82/100 = 0.82

Required CI = 0.82 ± 2.576 √(0.82(1 − 0.82)/100) = [0.721, 0.919], i.e. between 72.1% and 91.9%
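In Python:

from math import sqrt

# 99% CI for a proportion: p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)
x, n, z = 82, 100, 2.576
p_hat = x / n
margin = z * sqrt(p_hat * (1 - p_hat) / n)
print(p_hat - margin, p_hat + margin)  # ≈ 0.721 .. 0.919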

Example. The Ceylon Daily News reported that in a poll in Jaffna, 46% of the population was in favour of the present paddy prices, with a margin of error of 3%. How many people were questioned?

Margin of error = z_{α/2} √(p̂(1 − p̂)/n)

Thus 1.96² × 0.46 × (1 − 0.46) / n = 0.03², giving n ≈ 1060

Example: A sample poll of 100 voters chosen at random from all voters in a given district indicated that 55% of them were in favour of a particular candidate. Find the (a) 95% and (b) 99% confidence limits for the proportion of all voters in favour of this candidate. How large a sample of voters should we take in order to be 99% confident that the candidate will be elected?
Home Exercises (Tutorial) - Applying Concepts
1. Using the company records of the last 500 working days, the manager of a company has summarized the number of cars sold per day as follows.
Number of cars sold/day    Frequency
0                          40
1                          100
2                          142
3                          66
4                          36
5                          30
6                          26
7                          20
8                          16
9                          14
10                         8
11                         2

(a) Construct the pdf and cdf of the number of cars sold (say, Y).
(b) Compute the expected value of Y and its SD.
(c) Find the number of cars sold corresponding to the 50th percentile (the median).

2. Prove the following.

If X and Y are two continuous rvs then
(a) E(X + Y) = E(X) + E(Y)
(b) Cov(aX + bY, cU + dV) = ac Cov(X, U) + ad Cov(X, V) + bc Cov(Y, U) + bd Cov(Y, V)
(c) V(Y) = pq if the rv Y takes the values 1 and 0 with probabilities p and q respectively.

3. The following data are the estimated market values (in Rs. 100,000) of 50 companies.
26.8  8.6   6.5   30.6  15.4  18.0  7.6   21.5
11.0  10.2  28.3  15.5  31.4  23.4  4.3   20.2
33.5  7.9   11.2  1.0   11.7  18.5  6.8   22.3
12.9  29.8  1.3   14.1  29.7  18.7  6.7   31.4
30.4  20.6  5.2   37.8  13.4  18.3  27.1  32.7
6.1   0.9   9.6   35.0  17.1  1.9   1.2   16.6
31.1  16.1

(a) Determine the mean, standard deviation and the median of the market values and
interpret.
(b) Using the empirical rule about 95% of the values would occur between what values.
(c) Determine the coefficient of variation and interpret.
(d) Estimate Q1 and Q3 values and interpret.
(e) Draw a Box plot and write brief report of the variability of data.

4. Consider the following joint distribution of X and Y.

         Y
X        −2     −1     4      3      Total
1        0.1    0.2    0.0    0.3    0.6
2        0.2    0.1    0.1    0.0    0.4
Total    0.3    0.3    0.1    0.3    1.0

Find E(X), E(Y), Cov(X, Y) and ρXY.

5. The portfolio expected return and portfolio risk of two asset investments X and Y are given by E(P) = wE(X) + (1 − w)E(Y) and SD(P) = √[w²V(X) + (1 − w)²V(Y) + 2w(1 − w)Cov(X, Y)].

For the two investments, E(X) = Rs. 50, E(Y) = Rs. 100, V(X) = 9000, V(Y) = 15,000 and Cov(X, Y) = 7500. The weight assigned to investment X in the portfolio is 0.4. Compute the portfolio expected mean and risk.
