
UNIT 10 UNIVARIATE DISTRIBUTIONS

Structure
10.1 Introduction
     Objectives
10.2 Distribution Functions
10.3 Density Functions
10.4 Expectation and Variance
10.5 Moments and Moment Generating Function
10.6 Functions of a Random Variable
10.7 Summary
10.8 Solutions and Answers

10.1 INTRODUCTION

In this unit, we first introduce the concept of a distribution function of a random variable. Random variables taking values in either a finite or a countably infinite set have been studied in Unit 6. Our main emphasis here is on random variables taking values in a set which is possibly uncountable. Most often, we consider random variables whose values fall in an interval, finite or infinite, on the real line. A special class of distributions, namely, absolutely continuous distributions, plays a major role in practical problems. Throughout this unit, this class of distributions is the base of our study. We will discuss the notions of the density function, expectation and variance of a random variable in Secs. 10.3 and 10.4. The concept of moments of a random variable and a method of obtaining moments using the moment generating function are given in Sec. 10.5. Different approaches useful in finding the probability distribution function of functions of a given random variable are discussed in Sec. 10.6.
The facts covered in this unit will be used constantly in the rest of the course. Therefore we suggest that you do all the exercises in the unit as you come to them. We will use some facts from Blocks 1, 2, 3 and 4 of MTE-01 and Block 1 of MTE-07. So keep them handy while studying, so that you can refer to them easily. Further, please do not go to the next unit till you are sure that you have achieved the following objectives.
Objectives
After reading this unit you should be able to:

• define the distribution function for a random variable and a density function for an absolutely continuous distribution, and establish their interrelations;
• check whether a given function is a distribution function;
• check whether a given function is a density function;
• compute the distribution function and the density function when it exists;
• compute the moments and moment generating function of a random variable when they exist;
• derive the distribution function of a function of a random variable.
10.2 DISTRIBUTION FUNCTIONS

In Block 2, we have discussed the concept of probability on discrete sample spaces


at length. If you remember, we had started our discussion with the definitions of
random experiments and their sample spaces. We had then remarked that sample
spaces can be classified as discrete and continuous. Since the treatment for these two
categories is slightly different, we had then focussed our attention only on the
discrete case. Now we take up the case of general spaces and define probabilities.
Where do we begin? As before, corresponding to a random experiment, we have a sample space. We call each of its elements (sample points) an outcome or an elementary event. But what about an event?
In a discrete sample space S, we say that any subset of S is an event. In other words, the collection of all subsets of S is precisely the collection of all events. Now, in a general sample space Ω, it is not always possible to consider all subsets of Ω as events. There are some difficulties in doing this which we shall not explain to you as the technicalities are beyond the level of this course. So we are forced to take a smaller collection of subsets of the sample space as the collection of all events. But, at the same time, we would like this collection of events to have certain "reasonable" properties. For example, we would like

i) Ω to be an event.

ii) If A is an event, then A' should also be an event.

iii) If A1, A2, ... are events, then the union A1 ∪ A2 ∪ ... should also be an event.

We are sure you will have no problem in agreeing to properties (i) and (ii) above. What about the third one? If there are only a finite number of events in Ω, then even this property seems reasonable. (In fact, you have already come across it in Block 2.) At this stage, we can only say that the third condition is an important axiom, which is crucial to the development of the probability concept. An important point to note here is that in (iii) above, we are taking only countably infinite unions and not uncountably infinite unions.
To take into account the properties (i), (ii) and (iii) above, we define a collection ℱ of subsets of the sample space Ω which has the following properties:

i) Ω ∈ ℱ
ii) If A ∈ ℱ, then A' ∈ ℱ.
iii) If each of A1, A2, ... belongs to ℱ, then A1 ∪ A2 ∪ ... ∈ ℱ.

σ is the Greek letter 'sigma'.

Remark 1 : Note that property (iii) guarantees only that the union of a countable number of sets belongs to ℱ. It does not say anything about the union of an uncountable number of sets. The above collection ℱ is called a σ-field of events of Ω.
We say that A is an event in the sample space Ω if A ∈ ℱ.
Now that we have defined events, let us talk about their probabilities. In the discrete
case we had associated probabilities to each outcome and then added these up to
calculate probabilities of events. But it is not always possible to do this in general
(not necessarily discrete) sample spaces. We can give you a glimpse of the kind of
difficulties that we may encounter.
Let us consider the random experiment of choosing a number x, at random, from the interval [0, 1]. This means that the probability assigned to each value in [0, 1] should be the same. But the total probability assigned to [0, 1] is one. This leads us to assign a probability zero to each individual value in [0, 1]. If the aggregate of the probability of each individual value x, x ∈ [0, 1], is taken to be the probability of [0, 1] (if such an aggregation in the sense of summation of individual values is possible), then we encounter a problem in that although P([0, 1]) = 1, the individual terms in the aggregate are all zero. In fact the same difficulty would be faced with a discrete sample space having countably infinite sample points, for example when one desires to draw an integer at random with equal probability from the set of all positive integers. This raises the question: What do we mean by an aggregate of an uncountable number of values? Is such an aggregation at all possible?
We take care of this in the following definition.
Definition 1 : Let P be a real-valued function defined on ℱ, the collection of events on a sample space Ω. Suppose P has the following properties:

i) P(Ω) = 1
ii) 0 ≤ P(E) ≤ 1 for every E ∈ ℱ
iii) P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ... if Ai ∈ ℱ for i ≥ 1 and Ai ∩ Aj = ∅ for i ≠ j.

Then, P is called a probability function.

The problem that we encountered when we were taking the aggregate of individual probabilities to obtain the probability of the union is bypassed by the above definition, because we have conveniently disregarded uncountable unions in (iii) of the above definition. As such, with this definition of probability, it becomes meaningless to talk about an aggregation of probabilities over an uncountable set.
From Definition 1 we have P(Ω) = 1. Can you deduce the value of P(∅) from this? Do you agree that P(∅) = 0? Suppose P(∅) ≠ 0; then it equals some positive number, say r, in ]0, 1].
Now Ω = Ω ∪ ∅ and Ω ∩ ∅ = ∅.
Therefore, by (iii) in Definition 1,
P(Ω) = P(Ω) + P(∅) = 1 + r > 1.
This is a contradiction. Hence P(∅) = 0.
In Unit 5 we had listed some examples of sample spaces which are not discrete. You might have noticed that in each of these examples, the outcome is expressed in numerical terms. In most other practical situations as well, we can assign a real number to each outcome in the continuous sample space. This observation allows us to consider only those sample spaces which are subsets of R, the set of real numbers.

Now, let us consider a continuous sample space S. We saw that we can associate a real number to each outcome of S. Does this correspondence have any significance? Before answering this question let us go back for a moment to the discrete case. Recall from Unit 7 (Block 2) that, if the sample space is discrete, then we can associate a number to each outcome, and this association defines a real-valued function on the discrete sample space. This function is called a discrete random variable.

(If X denotes a random variable taking values x1, x2, ..., then the probability mass function is defined by f(xj) = P[X = xj].)

We have also seen in Unit 7 that the importance of a random variable lies in the fact that using it we can define another function called the probability mass function. Thus the probability mass function gives the probability of occurrence of the elements in the range of X, which in turn can be used to compute the probability of occurrence of any event defined by the observed values of X.
Now can we define a continuous random variable in the same way as in the discrete case? From the preceding discussion we know that the definition of a random variable should conform with the definition of the probability mass function. But in the continuous case, there are some constraints induced on what kind of subsets of Ω can be assigned a probability. This imposes certain conditions on the definition of random variables. More clearly, suppose X is a random variable (r.v.) and we want to evaluate the probability that the random variable X takes values in a set A ⊆ R, i.e. P[X ∈ A]. Then we are actually concerned with the set B = {ω : X(ω) ∈ A} ⊆ Ω, and we want to evaluate the probability of this subset of Ω. Now we know that we can obtain the probability of B only if B belongs to the special class ℱ of sets we defined earlier. So naturally, we need to modify the definition of a random variable. We thus have the following definition of an r.v.
Definition 2 : Let E be an experiment and Ω the sample space associated with it. Let ℱ be the collection of events in Ω. A real-valued function X, defined on Ω, is called a random variable (r.v.) if, for every real x, the set {ω : X(ω) ≤ x} belongs to ℱ.

If we study one such real-valued function defined on Ω, we have a univariate problem under study. If we simultaneously study two such real-valued functions on Ω, we have a bivariate problem, and so on. Bivariate distributions will be studied in Unit 12.

Next we shall define another function related to random variables which can be used to evaluate probabilities of events.

Definition 3 : The distribution function F for a random variable X is a function defined on the real line by

F(x) = P[X ≤ x], x ∈ R.

The definition makes sense because if X is a random variable, then [X ≤ x] is an event in Ω. Therefore P[X ≤ x] is well-defined. This function is sometimes called the cumulative univariate distribution function. You know why this is called univariate, isn't it? This is because the corresponding random variable is one variable. Our discussion from now on deals with random variables and their distributions. So you won't have to worry about the nature of the σ-field ℱ.
Let us now try to understand the distribution function by looking at some of its properties.
Properties of a distribution function F(x)
a) 0 ≤ F(x) ≤ 1 for all x ∈ R.
This property is a consequence of the property (ii) of the probability function, since every event [X ≤ x] should have a number between 0 and 1 as its probability.
b) F(x) is a non-decreasing function of x; that is, if x ≤ y, then F(x) ≤ F(y).
To obtain this property, we write
[X ≤ y] = [X ≤ x] ∪ [x < X ≤ y].
Since the events [X ≤ x] and [x < X ≤ y] are disjoint, by the property (iii) of the probability function P, we have
P[X ≤ y] = P[X ≤ x] + P[x < X ≤ y].
But by the property (ii) of the probability function, the last term, namely P[x < X ≤ y], is ≥ 0. Hence we get
P[X ≤ y] ≥ P[X ≤ x],
that is, F(y) ≥ F(x).
Did you notice that the above argument also proves that P[x < X ≤ y] = F(y) − F(x)?

Next we shall state two more properties. We have omitted the verifications of these properties as they are too technical.

c) lim F(x) = 1 as x → ∞, and lim F(x) = 0 as x → −∞.
(You may recall that we have defined limits as x → ∞ or x → −∞ for a real-valued function of one variable in the Calculus course, MTE-01, Unit 2, Block 1. In short, F(∞) = 1 and F(−∞) = 0.)

Now, you may note that [ω : X(ω) < ∞] = Ω and [ω : X(ω) < −∞] = ∅, and therefore P[ω : X(ω) < ∞] = P(Ω) = 1 and P[ω : X(ω) < −∞] = 0.
d) F(x) is right continuous.
Recall from your Calculus course (MTE-01, Unit 3) that F(x) is right continuous means that F(x + h) → F(x) as h → 0⁺.
Now, on the basis of these properties, can you visualise the distribution function of a random variable graphically?
Let us first look at some graphs of distribution functions of discrete random variables. Here is an example.

Example 1 : Suppose the random variable X takes the values 0 and 1 with probabilities p and 1 − p, respectively. Then let us obtain the graph of the distribution function of X.
We first note that F(x) is defined for all real x, so we must compute P[X ≤ x] for both positive and negative real numbers x. Also the smallest value that X can take is 0. Then for any x < 0, the event [X ≤ x] = ∅. That is,
F(x) = P[X ≤ x] = 0 if x < 0.
Now consider any real number x greater than or equal to 0 and less than 1. Then the event [X ≤ x] for 0 ≤ x < 1 occurs only if X = 0. That is,
F(x) = P[X ≤ x] = P[X = 0] = p if 0 ≤ x < 1.
Likewise, if x is a real number greater than or equal to 1, then the event [X ≤ x] occurs if X = 0 or 1. Therefore
F(x) = P[X ≤ x] = P[X = 0] + P[X = 1] = p + 1 − p = 1 if x ≥ 1. Hence the distribution function F(x) is given by

F(x) = P[X ≤ x] = 0   if x < 0
                = p   if 0 ≤ x < 1
                = 1   if x ≥ 1.
Fig. 1
Now you can easily draw the graph of F(x) as in Fig. 1. What can you say about the continuity of this function? We leave this as an exercise for you to check (see E1).
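If you have Python at hand, the step nature of F is easy to see numerically. The fragment below is only an illustrative sketch: the value p = 0.3 and the evaluation points are arbitrary choices, not taken from the unit.

```python
# A minimal sketch of the step CDF of Example 1, assuming p = 0.3 for illustration.
def F(x, p=0.3):
    """Distribution function of X taking values 0 and 1 with prob. p and 1 - p."""
    if x < 0:
        return 0.0
    elif x < 1:
        return p
    return 1.0

for x in (-0.5, 0.0, 0.5, 1.0, 2.0):
    print(x, F(x))          # 0.0, 0.3, 0.3, 1.0, 1.0 -- two jumps, at 0 and at 1
```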

E1) What are the points at which the function F(x) given in Example 1 is continuous?
E2) Suppose X is a random variable taking the values 1, 2 and 3 with probabilities 1/6, 2/6 and 3/6, respectively. Obtain the distribution function of X and graph it. Also discuss the continuity of the distribution function.

While doing E2 you must have observed the following facts:

i) The graph of F is a step function.
ii) The jump discontinuities of F are at the points at which the random variable has positive probabilities.
Now let us look at the graphs of distribution functions in the continuous case.
Example 2 : Suppose the distribution function of a continuous random variable is given by

F(x) = 0   for x < 0

       1   for x > 1

Fig. 2

The graph of F is shown in Fig. 2.


Do you see any difference in the continuity of the functions graphed in Fig. 1 and Fig. 2? The graph in Fig. 2 is continuous whereas the graph in Fig. 1 is discontinuous.
Why don't you try some exercises now ?

E3) Graph the following distribution functions and check whether they are
continuous or not.

       0 ,  x < 0

In E3 (b) you must have seen that the function is neither a pure step function (as in Example 1 and E2), nor a purely continuous function (as in Example 2). The function F has a discontinuity at 0, with a jump of size 2/3 at that point, and it is continuous everywhere else.
Now we state the following formulas for computation of probabilities in terms of the distribution function F. The proofs of these formulas are beyond the level of this course.
e) For any x and y,
i) P[X ≤ x] = F(x),
ii) P[X < x] = F(x − 0),
iii) P[X = x] = F(x) − F(x − 0),
iv) P[x < X ≤ y] = F(y) − F(x),
v) P[x ≤ X < y] = F(y − 0) − F(x − 0),
vi) P[x < X < y] = F(y − 0) − F(x), and
vii) P[x ≤ X ≤ y] = F(y) − F(x − 0).
If the distribution function F is continuous at a point x, then the limits of F at x from the right and left exist and are both equal to F(x) (see MTE-01, Unit 2). That is, F(x − 0) = F(x + 0) = F(x), where F(x + 0) is the right-hand limit of F at x.
Hence in this case P[X = x] = F(x) − F(x − 0) = 0.
Thus, if the distribution function F of a random variable X is continuous at a point x, then
P[X = x] = 0.
In particular, if the distribution function F is continuous everywhere, then the probability of every singleton {x} is zero. In spite of this,
P[a < X ≤ b] = F(b) − F(a) can be positive
for any a, b ∈ R.
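The formulas in (e) are easy to check numerically for a step distribution function. The sketch below reuses the F of Example 1 (with an assumed p = 0.3) and approximates the left-hand limit F(x − 0) by evaluating F just below x; it is an illustration, not part of the unit's exercises.

```python
# Sketch: formulas (e) with the step CDF of Example 1, assuming p = 0.3.
p, eps = 0.3, 1e-9

def F(x):
    return 0.0 if x < 0 else (p if x < 1 else 1.0)

def left(x):                # numerical stand-in for F(x - 0)
    return F(x - eps)

print(F(1) - left(1))       # P[X = 1]       = 1 - p = 0.7   (formula iii)
print(left(1))              # P[X < 1]       = p     = 0.3   (formula ii)
print(F(1) - F(0))          # P[0 < X <= 1]  = 0.7           (formula iv)
print(F(1) - left(0))       # P[0 <= X <= 1] = 1.0           (formula vii)
```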
Now, suppose the random variable X is discrete and takes the values xi with P[X = xi] = pi for i ≥ 1. Then from Example 1 and E2 you can see that the distribution function F of X is given by

F(x) = Σ pi,

where the summation extends over all indices i such that xi ≤ x. This distribution function F is a step function (as in Example 1 and E2). In such a case, F is called a discrete distribution and the random variable X is said to be of discrete type.
On the other hand, suppose that F is a distribution function of a random variable X whose graph is continuous. For example, the distribution functions in Example 2 and E3 (a) are continuous. Let us look closely at those graphs. Is there any difference between the graphs? You might have noticed that the graph in E3 (a) is smooth compared to that in Example 2. In mathematical language we say that the distribution function in E3 (a) is not only continuous but differentiable.
Note : Henceforth, in this course we will consider only those distribution functions which are differentiable and whose derivatives are also continuous (except possibly at a discrete set of points, having no effect on any probabilities computed).
That means, there exists a function f defined on the real line such that

f(x) = F′(x)

for all real x. (We shall ignore the points at which the function is not differentiable.) Recall from your Calculus course that such a function F(x) is called an antiderivative of f(x). Then, since F(x) is continuous, by the fundamental theorem of calculus (see Block 3, Unit 1, MTE-01), we have

F(x) − F(a) = ∫_{a}^{x} f(t) dt.

Taking limits on both sides as a → −∞, we get

F(x) − F(−∞) = ∫_{-∞}^{x} f(t) dt.

But we have seen that F(−∞) = 0. Therefore we have

F(x) = ∫_{-∞}^{x} f(t) dt.                          ...(1)

Note that f is non-negative since F is non-decreasing.

Summarising our discussion, we can say that if F is a distribution function whose derivative exists and is continuous (almost everywhere) on the real line, then there exists a non-negative function f defined on the real line such that

F(x) = ∫_{-∞}^{x} f(t) dt.

Such distribution functions are called absolutely continuous distribution functions, or continuous distributions for short. Therefore, the distribution functions which we shall deal with in this course are either discrete or (absolutely) continuous. Occasionally, we might consider distribution functions which are neither discrete nor (absolutely) continuous but a mixture of the two, as in E3 (b). With some abuse of terminology, hereafter we shall write continuous distribution for an absolutely continuous distribution.
You have already studied discrete distributions in Block 2. In the later part of this block, we shall mainly study continuous distributions. The function f(x) which appears in (1) is called a density function. In the next section we shall discuss density functions in detail.
Before we conclude this section, here is an important remark.
Remark 3 : There can be two distinct random variables with the same distribution function. For instance, let us consider the random experiment of tossing an unbiased coin. Define X = 1 if a "head" appears, and X = 0 otherwise. Let Y = 1 if a "tail" appears, and Y = 0 otherwise. Obviously, X and Y are distinct random variables. You can check that both X and Y have the same distribution function.
Now you can check whether you have followed the ideas discussed in this section by attempting the following exercises.

E4) Given the distribution function

F(x) = 0   for x < −1
           for −1 ≤ x < 1

sketch the graph of F and compute

(b) P[X = 0]

(d) P[2 < X ≤ 3]


E5) A random variable X has the distribution function F as shown in the graph given below.

Fig. 3

Find
(a) P[X = 1/2]   (b) P[X = 1]   (c) P[X < 1]

Next we shall talk about density functions.

10.3 DENSITY FUNCTIONS

In the last section we said that a distribution function F is absolutely continuous if there is a function f such that

F(x) = ∫_{-∞}^{x} f(t) dt.

The function f in this expression is called a density function of X. In this section we shall study this density function in detail. We start with its formal definition.
Definition 3 : A function f defined on the real line is called a density function of a random variable X if

(i) f(x) ≥ 0 for all x, and

(ii) P[a < X ≤ b] = ∫_{a}^{b} f(y) dy, for all a, b ∈ R with a ≤ b.

In particular, observe that

∫_{-∞}^{∞} f(y) dy = 1.

Now, suppose that F is a distribution function such that F′ exists and F′ is continuous. Then we know that

F(x) = ∫_{-∞}^{x} f(y) dy

for some non-negative real-valued function f. Now let us verify whether f satisfies (i) and (ii) in Definition 3. (i) is automatically satisfied. To verify (ii), note that

P[a < X ≤ b] = F(b) − F(a) = ∫_{a}^{b} f(y) dy.

Therefore, f satisfies the conditions (i) and (ii).
Conversely, if f is a density function of a random variable X, then define

F(x) = ∫_{-∞}^{x} f(y) dy.

Then, by (ii), F(x) = P[X ≤ x]. Therefore, F is a distribution function of the r.v. X. Also, by using the Fundamental Theorem of Calculus (Theorem 7, Unit 10, MTE-01), F is differentiable and dF/dx = f(x). Further,

F(b) − F(a) = ∫_{a}^{b} f(y) dy = P[a < X ≤ b]

for any pair of real numbers a and b.

Again, from Unit 15, MTE-01, you know that the integral of a function f between the limits a and b can be interpreted as the area bounded by the curve y = f(x), the x-axis and the ordinates x = a and x = b. Hence this area is equal to the probability that the random variable X takes values between a and b.
Note that the area enclosed between the curve y = f(x) and the line y = 0 is unity, since it is equal to P[−∞ < X < ∞].
Let us now look at some examples of density functions and their corresponding
distribution functions.
Example 3 : Let X be a random variable with density function

f(x) = 1   for 0 ≤ x ≤ 1
     = 0 , otherwise.

Then

F(x) = P[X ≤ x] = ∫_{-∞}^{x} f(y) dy,

i.e.  F(x) = 0   for x ≤ 0
           = x   for 0 < x < 1
           = 1   for x ≥ 1.

You can see the graphs of f and F in Fig. 4.

Fig. 4 : Graph of (a) density function and (b) corresponding distribution function

This distribution is called the standard uniform distribution or rectangular distribution.
Note that, for any a and b with 0 ≤ a < b ≤ 1,

P[a ≤ X ≤ b] = ∫_{a}^{b} dy = b − a.

Hence the probability that the number selected under this probability model lies in [a, b] is b − a. It is just the length of the interval [a, b].
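A quick numerical check of this "probability equals length" property is given below; the endpoints a = 0.2 and b = 0.7 and the use of scipy are illustrative assumptions, not part of the unit.

```python
# Sketch: for the standard uniform density, P[a <= X <= b] equals b - a.
from scipy.integrate import quad

f = lambda x: 1.0 if 0 <= x <= 1 else 0.0
a, b = 0.2, 0.7                     # assumed endpoints for illustration
prob, _ = quad(f, a, b)
print(prob, b - a)                  # both 0.5
```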
Let us consider another example.
Example 4 : Let X be a random variable with density function f which is a constant over an interval [α, β] and equal to zero outside the interval [α, β]. In other words,

f(x) = C   for α ≤ x ≤ β
     = 0 , otherwise,

where C is a constant. From the properties of a density function, we get C ≥ 0. Further, the equation

∫_{-∞}^{∞} f(x) dx = 1

implies that

C(β − α) = 1.

This relation leads to C = 1/(β − α), and we have

f(x) = 1/(β − α)   if α ≤ x ≤ β
     = 0 ,          otherwise.

This is called the uniform density function on the interval [α, β]. The corresponding distribution function is

F(x) = 0                 if x < α
     = (x − α)/(β − α)   if α ≤ x ≤ β
     = 1                 if x > β,

called the uniform distribution on [α, β]. Did you notice that Example 3 is a particular case of Example 4? We trust you will be able to check the calculation of F from f very easily.
In the next example we discuss another distribution which is frequently used as a model for describing the life time of a light bulb.
Example 5 : Suppose X denotes the life time of a bulb and X has density function

f(x) = e^{-x} ,  x ≥ 0
     = 0 ,       x < 0.

Check that f(x) ≥ 0 and that ∫_{-∞}^{∞} f(x) dx = 1. We now claim that the distribution function F corresponding to f is

F(x) = 0            for x < 0
     = 1 − e^{-x}   for x ≥ 0.

Do you agree? Check that dF/dx = f(x), and you will be convinced. This distribution is known as the standard exponential distribution.
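As a sanity check, the claimed F can also be recovered by integrating f numerically; the sketch below (assuming scipy is available) compares the integral with 1 − e^{-x} at a few arbitrary points.

```python
# Sketch: F(x) for the standard exponential, obtained by numerical integration.
import math
from scipy.integrate import quad

f = lambda t: math.exp(-t) if t >= 0 else 0.0
for x in (0.5, 1.0, 2.0):
    F_num, _ = quad(f, 0, x)            # f vanishes below 0, so integrating from 0 suffices
    print(x, F_num, 1 - math.exp(-x))   # the two columns agree
```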
Another distribution which is by far the single most important distribution in Statistics is the normal distribution. It is sometimes referred to as the Gaussian distribution. We take this up in our next example.
Example 6 : Suppose X is a random variable with density function

φ(x) = (1/√(2π)) e^{-x²/2} ,  −∞ < x < ∞.

It is obvious that φ is a non-negative function. It needs some effort to show that

∫_{-∞}^{∞} φ(x) dx = 1.

We will postpone this proof until Unit 11. The distribution function Φ corresponding to this density function is

Φ(x) = ∫_{-∞}^{x} φ(t) dt.

(φ and Φ are the Greek letters small phi and capital phi.)

We have sketched the graphs of φ and Φ in Fig. 5 below.

Fig. 5 : Graphs of the (a) density function, (b) distribution function of a standard normal distribution

We will study this graph in detail in the next unit.

The distribution discussed in the above example is known as the standard normal distribution.
See if you can solve these exercises now.

E6) Suppose that a random variable X has the density function

f(x) = (1/2) e^{-|x|} ,  −∞ < x < ∞.

Find the value x0 such that F(x0) = .5.

E7) A random variable X has the distribution function

F(x) = 0    for x < 0
     = x²   for 0 ≤ x ≤ 1
     = 1    for x ≥ 1.

Show that X is of continuous type, and determine its density function.
E8) Show that the function

f(x) = 1/(π(1 + x²)) ,  −∞ < x < ∞

is a density function.
E9) Buses arrive at a specified stop at 15-minute intervals starting at 8.00 A.M. That is, they arrive at 8.00 A.M., 8.15 A.M., 8.30 A.M., and so on. If a passenger arrives at the stop at a time that is uniformly distributed between 8.00 A.M. and 8.30 A.M., find the probability that she waits less than 5 minutes for a bus.
E10) Consider the function

f(x) =
     = 0 ,  otherwise.

Can f be a probability density function? If so, calculate the constant C.

By now you must have become quite familiar with the density and distribution
functions of a random variable. In the next section we take up the study of the
expectation, variance and other related concepts for a r.v.

10.4 EXPECTATION AND VARIANCE

In Block 1 you have calculated the mean (expected value), variance and other moments of a frequency distribution of a quantitative character. In Block 2, again, you have studied these very concepts in the context of discrete probability distributions. If you remember, over there you had replaced relative frequencies by probabilities. Now we are going to study these concepts again, this time for a continuous random variable. Since you are already familiar with the interpretations and interrelationships of these concepts, here we shall go over them quickly. Quite often we'll only state the all-too-familiar results and expect you to prove them. Let us start with the definition.
Definition 4 : The expectation of a r.v. X with density function f is defined to be

E(X) = ∫_{-∞}^{∞} x f(x) dx,

provided ∫_{-∞}^{∞} |x| f(x) dx < ∞.

We denote the expectation or expected value of X by E(X) whenever it exists.

In general, if g is a function of the r.v. X, we define

E[g(X)] = ∫_{-∞}^{∞} g(x) f(x) dx,

provided ∫_{-∞}^{∞} |g(x)| f(x) dx < ∞.

(We will discuss a function of a r.v. in more detail in Sec. 10.6.)
With this general definition we'll be able to write down the expressions for the variance and the moments of a r.v. X.
As you know from Block 1, the variance is the mean of the squared deviations about the mean. Therefore, we write

Var(X) = E[(X − μ)²], where μ = E(X),

that is,

Var(X) = ∫_{-∞}^{∞} (x − μ)² f(x) dx,

provided the integral on the R.H.S. is finite.

Then, using some algebraic properties of expectation, we can show that Var(X) = E(X²) − [E(X)]² = E(X²) − μ². We shall discuss this at the end of this section. Variance is also denoted by σ², and we are sure you remember that σ is called the standard deviation. Have you noted the similarities and the dissimilarities between these definitions and those given in Blocks 1 and 2? A major point of dissimilarity is that here we have defined the expected values as integrals, whereas earlier we had used summations. But hadn't you expected this? Since our random variable now varies continuously, instead of taking only discrete values, it is quite natural that we use integrals and not summations. Another change is that the density function f(x) now takes the place of the p.m.f. But these differences apart, don't you agree that the basic concept remains the same?
Before we talk about the algebraic properties of expectation and variance, we give a
few examples. These will familiarise you with the calculations of mean and variance.
Example 7 : Let us calculate the expected value and the variance of X, where X is a r.v. with the uniform distribution on [α, β] described in Example 4.
Now,

f(x) = 1/(β − α)   if α ≤ x ≤ β
     = 0 ,          otherwise.

By definition,

E(X) = ∫_{-∞}^{∞} x f(x) dx = ∫_{α}^{β} x/(β − α) dx, since f(x) = 0 outside [α, β],
     = (α + β)/2.

Thus, the expected value of X is the mid-point of the interval [α, β].

(If X is a discrete r.v., then E(X) = Σ x p(x).)

Now, Var(X) = E(X²) − [(α + β)/2]²,

and E(X²) = ∫_{α}^{β} x²/(β − α) dx = (β³ − α³)/(3(β − α)) = (α² + αβ + β²)/3.

Hence, Var(X) = (α² + αβ + β²)/3 − (α + β)²/4 = (β − α)²/12.
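The closed forms (α + β)/2 and (β − α)²/12 can be confirmed by numerical integration. The interval [2, 5] in the sketch below is an arbitrary illustrative choice, and scipy is assumed to be available.

```python
# Sketch: mean and variance of the uniform distribution on an assumed interval [2, 5].
from scipy.integrate import quad

alpha, beta = 2.0, 5.0
f = lambda x: 1.0 / (beta - alpha)

mean, _ = quad(lambda x: x * f(x), alpha, beta)
ex2,  _ = quad(lambda x: x**2 * f(x), alpha, beta)
print(mean, (alpha + beta) / 2)                  # 3.5 and 3.5
print(ex2 - mean**2, (beta - alpha)**2 / 12)     # 0.75 and 0.75
```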
Let us consider another example.
Example 8 : Suppose X is a r.v. with an exponential distribution. This means that the density function f of X is given by

f(x) = λ e^{-λx}   for x > 0
     = 0 ,          for x ≤ 0,

where λ is some positive constant.

Recall that we have seen the case λ = 1 in Example 5.
Let's compute the mean and variance of X.

E(X) = ∫_{0}^{∞} x λ e^{-λx} dx = (1/λ) ∫_{0}^{∞} y e^{-y} dy,  if we put y = λx.

Do you agree that ∫_{0}^{∞} y e^{-y} dy = 1? Note that this follows by the method of integration by parts. Thus E(X) = 1/λ. If X is interpreted as the life-time of an electric bulb (see Example 5), then the mean or expected life time is 1/λ. Now, to calculate Var(X), we begin by computing E(X²).

E(X²) = ∫_{0}^{∞} x² λ e^{-λx} dx = (1/λ²) ∫_{0}^{∞} y² e^{-y} dy,  where y = λx.

You may not have come across this integral before. It is the value of the gamma function at 3. The gamma function, Γ, is defined as

Γ(α) = ∫_{0}^{∞} y^{α−1} e^{-y} dy,  where α > 0.

Then it is known that Γ(n + 1) = n!, where n is any non-negative integer. Without going into the how and why of this, we shall only use this fact to evaluate E(X²). Since Γ(3) = 2! = 2, we get E(X²) = 2/λ².
Thus, Var(X) = E(X²) − [E(X)]² = 2/λ² − 1/λ² = 1/λ².
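The same gamma-function computation can be done symbolically. The following sketch, assuming sympy is available, evaluates the integrals for a general rate λ; it is only an aid to the calculation above, not a prescribed method.

```python
# Sketch: exponential moments via the gamma integrals, done symbolically.
import sympy as sp

x, lam = sp.symbols('x lam', positive=True)
f = lam * sp.exp(-lam * x)                      # exponential density on (0, oo)

EX  = sp.integrate(x * f, (x, 0, sp.oo))        # 1/lam       (Gamma(2) = 1)
EX2 = sp.integrate(x**2 * f, (x, 0, sp.oo))     # 2/lam**2    (Gamma(3) = 2)
print(EX, EX2, sp.simplify(EX2 - EX**2))        # variance 1/lam**2
```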

In all the examples considered so far, the random variables turned out to have finite expectations. But you should note that there are cases of r.v.s whose expectations do not exist. You can see one such r.v. in the next example.
Example 9 : Let X be a r.v. with density function

f(x) = 1/(π(1 + x²)) ,  −∞ < x < ∞.

In E8 you must have proved that the function f above is a density function. This distribution is called the (standard) Cauchy distribution.
Let us check whether E(X) exists in this case.
We have

∫_{0}^{x} t f(t) dt = (1/(2π)) ln(1 + x²),

and (1/(2π)) ln(1 + x²) → ∞ as x → ∞. Hence ∫_{-∞}^{∞} |x| f(x) dx is not finite. Therefore, E(X) does not exist.
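One way to "see" the non-existence of E(X) is by simulation: running averages of Cauchy samples keep wandering instead of settling near a fixed value. The sample sizes and random seed below are arbitrary illustrative choices.

```python
# Sketch: the running mean of standard Cauchy samples does not stabilise.
import numpy as np

rng = np.random.default_rng(0)
samples = rng.standard_cauchy(10**6)
for k in (10**2, 10**4, 10**6):
    print(k, samples[:k].mean())    # wanders with k; there is no finite expectation to converge to
```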
From the above examples you must have got a pretty good idea of the computations required to evaluate the expectation and the variance of a r.v. Now we shall list some of their algebraic properties. The proofs of these properties depend on some elementary properties of integrals. We have proved some of them and we are sure you will be able to prove the rest (see E11).

i) If Y = aX + b, where a and b are any two constants, and if X has a finite expectation, then E(Y) exists and
E(Y) = a E(X) + b.
ii) If X is a r.v. taking non-negative values with probability one, and if E(X) < ∞, then E(X) ≥ 0.
iii) If X and Y are r.v.'s with finite expectations and a and b are constants, then
E(aX + bY) = a E(X) + b E(Y).
We will come back to the proof of this property in Unit 12. But note that this result can be extended to three or more variables.
iv) Var(X) = 0 if and only if X is a constant with probability one, i.e. iff P[X = C] = 1 for some constant C.
One-way implication in this statement is easy. Suppose P[X = C] = 1. Then we can consider X as a discrete r.v. So let's apply the definition of the expected value of a discrete r.v. to get E(X). You will see that we get E(X) = C. Then Var(X) = E[(X − C)²] = 0, since (X − C)² = 0 with probability one.
To prove the converse, we need to use Chebyshev's inequality. So, we postpone the proof till Unit 14, where we are going to discuss this inequality.

v) For any two constants a and b,
Var(aX + b) = a² Var(X).

vi) Var(X) = E(X²) − μ², where μ = E(X).

We have already mentioned this property earlier.

Now attempt the following exercises to complete the discussion of the algebraic properties of E(X) and Var(X).

E11) Prove the properties i), v) and vi) above.

E12) Compute the expectation and variance of a random variable Y whose density function is

f(y) = 1 − |y|   for |y| ≤ 1
     = 0 ,        otherwise.
E13) Let X be a random variable with density function

f(x) = 2/x³   for x ≥ 1
     = 0 ,     otherwise.

Show that E(X) exists and E(X) = 2 but Var(X) does not exist.

E14) Let X be a random variable such that E[(X − a)²] exists for all real numbers a. Show that E[(X − a)²] is minimum when a = μ = E(X).
So far we have seen how to calculate the expectation and variance of a random variable as long as they exist. As we have seen, both the expectation and variance are specified by the values E(X) and E(X²), the expected values of X raised to the first and second powers. But these two expected values describe only two particular aspects: "the middle value" and the measure of relative variability about the middle value of the probability distribution corresponding to the random variable. These two numbers are not sufficient to describe the distribution completely. To get more information about the distribution we need to study its moments, which are specified by the values of E(X^k), k = 1, 2, 3, .... We shall take this up in the next section.

10.5 MOMENTS AND MOMENT GENERATING FUNCTION

In Unit 7, we have discussed moments and the m.g.f. of a discrete r.v. X. The discussion in the case of a continuous probability distribution runs parallel to that in the case of discrete probability distributions.
If k is any integer greater than or equal to one, and if b is any real number, then if E[(X − b)^k] exists, it is called the kth moment of X about the point b.
The kth moments about b = 0, given by

μk′ = E(X^k), k = 1, 2, ...,

are called raw moments or simply moments. Now if we take b = μ = E(X), then

μk = E[(X − μ)^k], k = 1, 2, ...,

which are the moments about the mean μ, are called central moments.
k is called the order of the moment of (X − b)^k.
Do you agree with the following observations?

μ1′ = μ
μ1 = 0
μ2 = σ² = Var(X)

In Blocks 1 and 2, we had derived the relations between raw and central moments of X. The same relations hold good here.
At the end of the last section we said that we need to study moments to get more information about the distribution of a r.v. You may think, what additional information can the moments give?
To see that, let us look at the following expressions:

γ1 = μ3/σ³  and  β2 = μ4/σ⁴.

Now what is the significance of this? Aren't the expressions for γ1 and β2 familiar? γ1 measures the skewness and β2 the kurtosis of a density function. (Compare these with the measures of skewness and kurtosis of a frequency distribution, discussed in Unit 3.)
Let us see an example.
Example 10 : Suppose X has the uniform distribution on [α, β]. Let us compute the raw moments for this distribution.
We have

E(X^r) = ∫_{α}^{β} x^r/(β − α) dx = (β^{r+1} − α^{r+1})/((r + 1)(β − α)),  r = 1, 2, ....

In particular,

E(X) = (α + β)/2  and  E(X²) = (α² + αβ + β²)/3.
Why don't you try some exercises now ?

E15) Suppose X has the uniform distribution on [α, β]. Find E(X^r) and E[(X − μ)^r] for r ≥ 1, where μ is the mean of X.
E16) Suppose X has the standard exponential density. Find the coefficient of skewness.
E17) If Y = aX + b, show that Y has the same coefficients of skewness and kurtosis as X, whenever they exist.

After doing these exercises you would have realised that calculation of the moments of a random variable is cumbersome even when they exist. Alternatively, we can use the moment generating function, whenever it exists, to obtain the moments.

Let X be a random variable such that M_X(t) = E[e^{tX}] exists for some t ≠ 0. M_X(t) is called the moment generating function (m.g.f.) of the random variable X, whenever it is well-defined.
Note that M_X(0) = 1 for any random variable X. Let us expand M_X(t) by Maclaurin's series expansion (see Unit 6, Block 2 of MTE-01). Then we have

M_X(t) = M_X(0) + t [dM_X(t)/dt]_{t=0} + ... + (t^n/n!) [d^n M_X(t)/dt^n]_{t=0} + ....      ...(2)

On the other hand, suppose the following computation is justified:

M_X(t) = E[e^{tX}] = E[1 + tX + (t²/2!)X² + ...] = 1 + t E(X) + (t²/2!) E(X²) + ... + (t^n/n!) E(X^n) + ....      ...(3)

Comparing the coefficients of t^n for every n = 1, 2, 3, ... in (2) and (3), we have the relation

E(X^n) = [d^n M_X(t)/dt^n]_{t=0}.

This relation implies that the nth moment about zero of the random variable X can be obtained by differentiating the m.g.f. M_X(t) of X exactly n times, and then evaluating the nth derivative at zero. This is why M_X(t) is called a "moment generating function" of X. We can justify the above arguments under some conditions on the existence of moments of X. But this discussion is beyond the scope of this course.
We now show you how to calculate the m.g.f. for the uniform and the exponential distributions.
Example 11 : Suppose X has a uniform distribution on [α, β]. Let us compute the m.g.f. of this distribution.
We have

M_X(t) = E[e^{tX}] = ∫_{α}^{β} e^{tx}/(β − α) dx = (e^{βt} − e^{αt})/(t(β − α))  for t ≠ 0,

and
M_X(0) = 1.

Hence the m.g.f. M_X(t) exists for all t.

Now, why don't you check your answers to E15 by calculating the moments from the m.g.f. obtained in this example? (See E18.)
Example 12 : Suppose X has the exponential density

f(x) = λ e^{-λx} ,  x > 0
     = 0 ,           x ≤ 0,

where λ > 0. Let us compute the m.g.f.

M_X(t) = E[e^{tX}] = ∫_{0}^{∞} e^{tx} λ e^{-λx} dx = λ/(λ − t)  for t < λ.

This m.g.f. M_X(t) does not exist for t ≥ λ, since in that case e^{(t−λ)x} is unbounded as x → ∞.
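Differentiating this m.g.f. at t = 0 reproduces the moments found in Example 8. The sketch below does this symbolically; the use of sympy is an assumption for illustration, not a requirement of the unit.

```python
# Sketch: moments of the exponential distribution from M_X(t) = lam/(lam - t).
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = lam / (lam - t)

m1 = sp.diff(M, t, 1).subs(t, 0)            # E(X)   = 1/lam
m2 = sp.diff(M, t, 2).subs(t, 0)            # E(X^2) = 2/lam**2
print(m1, m2, sp.simplify(m2 - m1**2))      # variance 1/lam**2, as in Example 8
```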
Try to solve these exercises now.

E18) Calculate the first and second moments of the uniform distribution using the m.g.f. of the distribution.
E19) Let X be a random variable with density function

f(x) = x       if 0 ≤ x ≤ 1
     = 2 − x   if 1 ≤ x ≤ 2
     = 0 ,      otherwise.

Determine the m.g.f. M_X(t) of X whenever it exists.

E20) Suppose X has the density function

f(x) = x e^{-x}   if x > 0
     = 0 ,         otherwise.

Find its moment generating function whenever it exists.
E21) Let Y = aX + b. Show that
M_Y(t) = e^{bt} M_X(at).

Here we make an important remark.

Remark 4 : You may have got the wrong impression that if two r.v.s X and Y are such that all the moments of X are respectively equal to the moments of Y, then X and Y are equal. This is not so. In fact, moments do not even determine a distribution uniquely. The same is the case, in general, with m.g.f.s. However, the following theorem is valid. We will not discuss its proof as it is beyond the scope of this course.
Theorem 1 : Suppose X and Y are random variables with m.g.f.s M_X(t) and M_Y(t), respectively. Suppose M_X(t) and M_Y(t) exist in an open interval containing zero and M_X(t) = M_Y(t) in that interval. Then X and Y have the same distribution.
In the next section, we take up one last topic, the distribution of a function of a random variable.

10.6 FUNCTIONS OF A RANDOM VARIABLE

In the earlier sections of the unit we were concerned with various aspects of the distribution function of a random variable. In many applications we may have to consider not only the distribution function of a random variable, but the distribution function of a function of a random variable. In this section we first try to understand what is meant by a function of a random variable and then discuss how to find its distribution function.
Let us first consider this situation:
Suppose we want to know the volume V = (4/3)πr³ of a spherical object, say a ball bearing, manufactured by a company. Due to manufacturing defects, the radii of different spheres may be different. We suppose that the radius of a sphere is a continuous random variable X having density function f. Then we can consider V as a function of the random variable X, say

V = g(X) = (4/3)πX³.

Here we would expect that we can derive the density function of V from the knowledge of the density function of X. In such situations we are concerned with the concept of a function of a random variable and its density function. Formally we define the concept as follows:
Definition 5 : Let X be an r.v. defined on Ω and g : R → R. Then the real-valued function Y defined on Ω by

Y(ω) = g(X(ω)), ω ∈ Ω,

is called a function of the random variable X.


For example, if X is an r.v. and g(x) = ax + b, then Y = aX + b is a function of the
r.v. X.
You have already come across some functions of r.v.s in the earlier sections like X 2,
x3,.... . In this unit we consider only the continuous case. Here we make some
remarks.
Remark 5 : a) In general a function of an r.v. need not be an r.v. But it turns out that whenever g has nice properties, some of which are continuity and monotonicity, then Y becomes an r.v. So, in this course, whenever we deal with functions of r.v.s, we assume that g has nice properties by which Y becomes an r.v.
b) Another question which comes to our mind is: suppose X is a continuous (discrete) r.v., is it true that Y is also continuous (discrete)? In general we cannot conclude from the definition that Y is of the same type as X. For instance, if g(x) = C, a constant, for all x ∈ R, then P[Y = C] = 1 and Y is a degenerate r.v. (Recall the definition of a degenerate r.v. from Unit 7, Block 2.)
Next we shall see how we find the distribution of Y = g(X). Let us first consider a simple case:
Suppose X is a continuous r.v. and g(x) = ax + b, where a, b ∈ R, a > 0. Then Y = aX + b.
To get the distribution of Y, we consider

P[Y ≤ y] = P[aX + b ≤ y] = P[X ≤ (y − b)/a].

Now if F_Y and F_X denote the distribution functions of Y and X respectively, then we have

F_Y(y) = F_X((y − b)/a).

Differentiating both sides with respect to y, we get the density of Y as

f_Y(y) = (1/a) f_X((y − b)/a).

In this case we could easily derive the density function because the inverse of g exists (given by g⁻¹(y) = a⁻¹(y − b)), and the inverse is differentiable. So, in the general case we expect that if g has properties similar to those in the above case, then we can find the density function.
Now let us start with a real-valued function g defined on R. Let Y = g(X).
We call D = {x : f(x) > 0} the support of f.
Let us now suppose that g is a continuous and strictly increasing function on the support of f. Since g is strictly increasing, there exists a function s such that

s(g(x)) = x and g(s(y)) = y

for all x and y concerned. s is called the inverse function of g. Since g is continuous, s is also continuous.
Further, g(x) ≤ y if and only if x ≤ s(y). Hence
P[Y ≤ y] = P[g(X) ≤ y] = P[X ≤ s(y)].

Therefore, if F_X and F_Y denote the distribution functions of X and Y respectively, then

F_Y(y) = F_X(s(y)).

In particular, suppose X has a density function f_X and s(y) is differentiable in y. Then Y = g(X) has a density function f_Y and

f_Y(y) = f_X(s(y)) (d/dy) s(y).

Now, on the other hand, if g is continuous and strictly decreasing on the support of f, then also we can use a similar argument as in the earlier case. To see this, first note that g has a unique inverse function s(y) which is continuous and strictly decreasing. Hence g(x) ≤ y if and only if x ≥ s(y). Therefore,

F_Y(y) = P[Y ≤ y] = P[g(X) ≤ y]
       = P[X ≥ s(y)]
       = 1 − P[X < s(y)]
       = 1 − F_X[s(y) − 0].

If X has a density function f, then F_Y is continuous and

F_Y(y) = 1 − F_X(s(y)).

In fact, Y has a density function f_Y(y) given by

f_Y(y) = −f_X(s(y)) (d/dy) s(y).

Thus we have proved the following theorem.

Theorem 2 : Suppose X is a random variable with density function f_X. Let Y = g(X), where g is a continuous and either strictly increasing or strictly decreasing function. Let x = s(y) be the inverse function of g and suppose s is differentiable in y. Then Y has a density function f_Y(y) and

f_Y(y) = f_X(s(y)) |(d/dy) s(y)|.
Let us consider some examples.

Example 13 : Suppose X has a density function

f(x) = 1   if 0 < x < 1
     = 0 , otherwise.

Then the support of f is ]0, 1[.
Suppose Y = X². Here g(x) = x² in the earlier notation.
Note that g is strictly increasing and continuous on ]0, 1[ and s(y) = y^{1/2} is the inverse of g on ]0, 1[. Then s(y) is differentiable in ]0, 1[ and therefore, by Theorem 2, we have

f_Y(y) = f(s(y)) (d/dy)[s(y)] = (1/2) y^{-1/2} ,  0 < y < 1.

Thus, the density function of Y is

f_Y(y) = (1/2) y^{-1/2}   for 0 < y < 1
       = 0 ,               otherwise.
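A simulation check of this result is easy: F_Y(c) should equal √c for 0 < c < 1. The sample size, seed and test points below are arbitrary illustrative choices.

```python
# Sketch: empirical CDF of Y = X^2, X uniform on (0, 1), against sqrt(c).
import numpy as np

rng = np.random.default_rng(1)
y = rng.uniform(0.0, 1.0, 200_000) ** 2
for c in (0.25, 0.5, 0.81):
    print(c, (y <= c).mean(), c ** 0.5)     # the last two columns agree closely
```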
Example 14 : Suppose X is a random variable with the standard uniform density function, that is,

f_X(x) = 1 ,  0 < x < 1
       = 0 ,  otherwise.

Define Y = −(1/λ) ln X, where λ > 0. The function

g(x) = −(1/λ) ln x

maps the interval ]0, 1[ to ]0, ∞[. Further, g(x) is continuous and strictly decreasing on the interval ]0, 1[. The inverse function s(y) of g is given by

s(y) = e^{-λy},

which is differentiable. Therefore, by Theorem 2, Y has a density function given by

f_Y(y) = λ e^{-λy}   for y > 0
       = 0            for y ≤ 0.
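Read backwards, Example 14 is the inverse-transform method of simulation: applying −(1/λ) ln(·) to uniform variates produces exponential variates. The rate λ = 2, the seed and the sample size in the sketch are assumed for illustration.

```python
# Sketch: simulating an exponential variable from uniform variates (Example 14 in reverse).
import numpy as np

rng = np.random.default_rng(2)
lam = 2.0                                    # assumed rate
u = 1.0 - rng.random(200_000)                # uniform on (0, 1]
y = -np.log(u) / lam
print(y.mean(), 1 / lam)                     # about 0.5, the exponential mean
print((y > 1).mean(), np.exp(-lam))          # tail P[Y > 1] = e^{-lam}
```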
Sometimes we cannot apply Theorem 2, which we have used in the two examples above. The next example gives one such situation.
Example 15 : Suppose X has the standard normal density function. Let Y = X².
Here g(x) = x². This function is continuous but strictly increasing on [0, ∞[ and strictly decreasing on ]−∞, 0[. Further, g(x) is not one-to-one, since g(−x) = g(x). That means g(x) does not have an inverse. Therefore we cannot apply Theorem 2 in this case. So we try some other method. Since Y ≥ 0, we have

P[Y ≤ y] = P[X² ≤ y] = 0 for y < 0, and for y > 0,

P[Y ≤ y] = P[X² ≤ y] = P[|X| ≤ √y] = P[−√y ≤ X ≤ √y],

i.e.  F_Y(y) = ∫_{-√y}^{√y} f(x) dx,

where f is the standard normal density function. By the symmetry of this density function, for y > 0 we have

F_Y(y) = 2 ∫_{0}^{√y} f(x) dx.

This derivation proves that the random variable Y has a density function f_Y, where

f_Y(y) = 0                               for y ≤ 0
       = (2π)^{-1/2} y^{-1/2} e^{-y/2}   for y > 0.

Recalling that Γ(1/2) = √π, we can write f_Y(y) in the form

f_Y(y) = 0                                         for y ≤ 0
       = (1/(2^{1/2} Γ(1/2))) y^{-1/2} e^{-y/2}    for y > 0.

This density function is known as a chi-square density with 1 degree of freedom. We will study more about this distribution in Unit 13.
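A small simulation makes the result concrete: squares of standard normal draws should have mean 1 (see Example 16 below) and satisfy P[Y ≤ 1] = 2Φ(1) − 1. The seed and sample size are illustrative assumptions.

```python
# Sketch: Y = X^2 for standard normal X behaves like a chi-square(1) variable.
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)
y = rng.standard_normal(200_000) ** 2
Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))   # standard normal CDF
print(y.mean())                                    # close to 1
print((y <= 1).mean(), 2 * Phi(1) - 1)             # both about 0.6827
```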
Now, suppose we want to compute the expectation of Y = g(X), whenever it exists. We can either use the distribution of X or the distribution of Y. For instance, suppose X has a density f_X and Y has density f_Y.
Then

E[g(X)] = ∫_{-∞}^{∞} g(x) f_X(x) dx

and

E(Y) = ∫_{-∞}^{∞} y f_Y(y) dy.

It can be shown that E(Y) exists if and only if E[g(X)] exists, and both these methods of calculation lead to the same result. We do not give the reasoning here. The choice of the method depends on the complexity involved in finding the distribution or the density of Y = g(X).
Let us continue the discussion in Example 15 to illustrate this.

Example 16 : In Example 15, you have seen that E(X²) = 1. Let us compute E(Y), where Y = X², directly using the probability density of Y derived above.
Then

E(Y) = ∫_{0}^{∞} y f_Y(y) dy = 2 Γ(3/2)/Γ(1/2).

But Γ(3/2) = (1/2) Γ(1/2). Hence E(Y) = 1, as it should be.

Now it's time to do some exercises.

E22) Find the density function of Y = X² when X has a uniform density on [−1, 1].
E23) Suppose a random variable X has the density function

f(x) = x/2   for 0 ≤ x ≤ 2
     = 0 ,    otherwise.

Let Y = 4 − X³. Find the density function of Y.

We now end this discussion. We hope that by now you would have gained
reasonable knowledge about the various aspects related to the distribution of a
random variable. In the next unit we shall study some standard distributions.

10.7 SUMMARY

In this unit, we have

1) introduced the concepts of the distribution function of a random variable and


the density function for an (absolutely) continuous distribution;
2) studied properties of a distribution function and a density function;
3) defined the notions of moments of a random variable in general and the
expectation (mean) and the variance in particular ;
4) introduced the concept of a moment generating function for a random variable; and
5) given methods for finding the distribution function or the density function of a
function of a random variable.
You may now like to go back to Sec. 10.1 and go through the list of unit objectives
to see if you have achieved them. If you want to see what our solutions to the
exercises in the unit are, we have given them in the following section.

10.8 SOLUTIONS AND ANSWERS

E1) a) 0 and 1 are the points at which the function is discontinuous.

b) At 0, the function has a jump discontinuity of size p and at 1, the function has a jump discontinuity of size 1 − p.

E2) The distribution function is

F(x) = 0     for x < 1
     = 1/6   for 1 ≤ x < 2
     = 3/6   for 2 ≤ x < 3
     = 1     for x ≥ 3.

The graph of F is as in Fig. 6.

Fig. 6

The function is discontinuous at x = 1, at x = 2 and at x = 3.
E3) a) Fig. 7 shows the graph of F(x).

Fig. 7

The graph shows that F(x) is continuous for all x.


b) Fig. 8 shows the graph of F(x).

The graph shows that the function is discontinuous at x = 0 and is continuous everywhere else.
E4) a) By property (e), we have
E5) From the figure we get that

F(x) = 3/4   if 1 ≤ x < 2
     = x,    if 2 ≤ x < 3
     = 1,    if x ≥ 3.

Then by Property (e) we have

a) P[X = 1/2] = F(1/2) − F(1/2 − 0) = 0,

since the function is continuous at x = 1/2.

Similarly we get b).

c) P[X < 1] = 1/2.
E6) The density function of X is

f(x) = (1/2) e^{-|x|} ,  −∞ < x < ∞.

Then the distribution function F(x) is given by

F(x) = ∫_{-∞}^{x} (1/2) e^{-|t|} dt.

For x ≤ 0, F(x) = (1/2) e^{x}, and for x > 0,

F(x) = (1/2) [∫_{-∞}^{0} e^{t} dt + ∫_{0}^{x} e^{-t} dt] = 1 − (1/2) e^{-x}.

Then the point x0 satisfying F(x0) = .5 is x0 = 0.
E7) The graph of F(x) is shown in Fig. 10.

Fig. 10

The graph shows that the function F(x) is continuous for all x.
The density function f(x) is given by

f(x) = dF/dx = 2x ,  if 0 < x < 1
     = 0 ,            otherwise.
E8) The given function is f(x) = 1/(π(1 + x²)), −∞ < x < ∞.
Then f(x) satisfies the following conditions:
i) f(x) ≥ 0 for all x.
ii) ∫_{-∞}^{∞} f(x) dx = (1/π) [tan⁻¹ x]_{-∞}^{∞} = 1.
E9) Suppose X denotes the number of minutes past 8 that the passenger arrives at the bus stop. Since X is uniformly distributed over (0, 30), the density function of X is given by

f(x) = 1/30 ,  0 < x < 30
     = 0 ,     otherwise

(see Example 4).
Now the passenger will have to wait less than 5 minutes if and only if he or she arrives between 8.10 A.M. and 8.15 A.M. or between 8.25 A.M. and 8.30 A.M. Therefore the probability that the passenger waits less than 5 minutes is

P[10 < X ≤ 15] + P[25 < X ≤ 30].

But P[10 < X ≤ 15] = ∫_{10}^{15} f(x) dx = ∫_{10}^{15} (1/30) dx = 1/6,

and P[25 < X ≤ 30] = ∫_{25}^{30} (1/30) dx = 1/6.

Therefore the required probability = 1/3.
E10) f cannot be a density function since f(x) < 0 for √2 < x < 5/2.
E11) i) Property (i) : Let f(x) denote the density function of X. From Definition 4, we have

E(Y) = E(aX + b) = ∫_{-∞}^{+∞} (ax + b) f(x) dx
     = a ∫_{-∞}^{+∞} x f(x) dx + b ∫_{-∞}^{+∞} f(x) dx
     = a E(X) + b,

since ∫_{-∞}^{+∞} f(x) dx = 1.

Property (v) : Let E(X) = μ. Then E(aX + b) = aμ + b by property (i). Therefore

Var(aX + b) = E[(aX + b − aμ − b)²] = a² E[(X − μ)²] = a² Var(X).

Property (vi) :

Var(X) = E[(X − μ)²]
       = E[X² − 2μX + μ²]
       = E(X²) + E[−2μX] + E[μ²]
       = E(X²) − 2μ E(X) + μ²
       = E(X²) − μ².
E12) Expectation is zero and variance is 1/6.

E13) E(X²) involves ∫_{1}^{∞} (2/x) dx = 2 ln x evaluated at ∞, and ln x → ∞ as x → ∞. Hence E(X²) is not finite and therefore the variance does not exist.
E14) Let g(a) = E[(X − a)²]
         = E[(X − μ + μ − a)²], where μ = E(X),
         = E[(X − μ)²] + 2(μ − a) E[(X − μ)] + (μ − a)²
         = E[(X − μ)²] + (μ − a)², since E[(X − μ)] = 0.
This shows that g(a) will be minimum when a = μ = E(X).

E15) E(X^r) = ∫_{α}^{β} x^r/(β − α) dx = (β^{r+1} − α^{r+1})/((r + 1)(β − α)).
Further,
E[(X − μ)^r] = ((β − μ)^{r+1} − (α − μ)^{r+1})/((r + 1)(β − α)),
where μ = (α + β)/2.
E16) By definition, the skewness is given by

γ1 = E[(X − μ)³]/σ³.

Then show that E(X) = 1, E(X²) = 2 and E(X³) = 6.
Hence σ = 1 and μ = 1, so that E[(X − μ)³] = E(X³) − 3μ E(X²) + 2μ³ = 2.
Hence we have γ1 = 2.
E17) Let E(X) = μX, Var(X) = σX². Denote the coefficients of skewness and kurtosis of X by γ1X and γ2X respectively. Then μY = E(Y) = aμX + b and σY² = a²σX².
Further,

γ1Y = E[(Y − μY)³]/σY³ = E[(aX + b − (aμX + b))³]/(a³σX³) = a³ E[(X − μX)³]/(a³σX³) = γ1X.

Similarly we can show that γ2Y = γ2X. Hence the result.

E18) The first moment m1 = E(X) = [d M_X(t)/dt]_{t=0}.
Here

M_X(t) = (e^{βt} − e^{αt})/(t(β − α)).

But when we substitute t = 0, the expression on the R.H.S. is of the 0/0 form. Therefore, by applying L'Hopital's rule for the 0/0 form (see MTE-07, Block 1, Unit 2), we get

E(X) = (α + β)/2.

A similar argument shows that

E(X²) = (α² + αβ + β²)/3,

and M_X(0) = 1.

E22) Let Y = g(X) = X² and f_Y denote the density function of Y. Then

f_Y(y) = f_X(s(y)) |(d/dy) s(y)|,

where f_X denotes the density function of X and s(y) denotes the inverse function of g(x). This gives

f_Y(y) = 1/(2√y)   for 0 < y < 1
       = 0 ,        otherwise.

E23) f_Y(y) = 1/(6(4 − y)^{1/3})   for −4 ≤ y < 4
            = 0 ,                   otherwise.
