Lecture 11
Expectation
Reza Abdolmaleki
Moments
In statistics, the mathematical expectations defined here, called the moments of the distribution of a random variable or simply the moments of a random variable, are of special importance.
Definition. The $r$-th moment about the origin of a random variable $X$, denoted by $\mu'_r$, is the expected value of $X^r$; symbolically,
$$\mu'_r = E(X^r) = \sum_x x^r f(x)$$
for $r = 0, 1, 2, \ldots$ when $X$ is discrete, and
$$\mu'_r = E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\, dx$$
when $X$ is continuous.
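As a quick illustration of this definition, the following sketch computes the first few moments about the origin for an assumed discrete distribution, a fair six-sided die with $f(x) = \frac{1}{6}$ for $x = 1, \ldots, 6$; the die is chosen only to make the sums concrete.

```python
from fractions import Fraction

# Assumed discrete distribution for illustration: a fair six-sided die, f(x) = 1/6 for x = 1..6.
f = {x: Fraction(1, 6) for x in range(1, 7)}

def moment_about_origin(f, r):
    """r-th moment about the origin: mu'_r = E(X^r) = sum_x x^r f(x)."""
    return sum(x**r * p for x, p in f.items())

for r in range(4):
    print(f"mu'_{r} =", moment_about_origin(f, r))
# prints: mu'_0 = 1, mu'_1 = 7/2, mu'_2 = 91/6, mu'_3 = 147/2
```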
It is of interest to note that the term “moment” comes from the field of physics: If the quantities $f(x)$ in the discrete case were point masses acting perpendicularly to the $x$-axis at distances $x$ from the origin, $\mu'_1$ would be the $x$-coordinate of the center of gravity, that is, the first moment divided by $\sum_x f(x) = 1$, and $\mu'_2$ would be the moment of inertia. This also explains why the moments $\mu'_r$ are called moments about the origin: In the analogy to physics, the length of the lever arm is in each case the distance from the origin. The analogy applies also in the continuous case, where $\mu'_1$ and $\mu'_2$ might be the $x$-coordinate of the center of gravity and the moment of inertia of a rod of variable density.
When $r = 0$, we have $\mu'_0 = E(X^0) = E(1) = 1$ by Corollary 2 of Theorem 2 in the previous lecture. When $r = 1$, we have $\mu'_1 = E(X)$, which is just the expected value of the random variable $X$, and, in view of its importance in statistics, we give it a special symbol and a special name.
Definition. $\mu'_1$ is called the mean of the distribution of $X$, or simply the mean of $X$, and it is denoted simply by $\mu$.
The special moments we shall define next are of importance in statistics because they serve to describe the
shape of the distribution of a random variable, that is, the shape of the graph of its probability distribution
or probability density.
Definition. The $r$-th moment about the mean of a random variable $X$, denoted by $\mu_r$, is the expected value of $(X - \mu)^r$; symbolically,
$$\mu_r = E\!\left[(X - \mu)^r\right] = \sum_x (x - \mu)^r f(x)$$
for $r = 0, 1, 2, \ldots$ when $X$ is discrete, and
$$\mu_r = E\!\left[(X - \mu)^r\right] = \int_{-\infty}^{\infty} (x - \mu)^r f(x)\, dx$$
when $X$ is continuous.
It is easy to see that $\mu_0 = 1$ and $\mu_1 = 0$ for any random variable for which $\mu$ exists.
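A small numerical check of these two facts, again using the fair die as an assumed distribution:

```python
from fractions import Fraction

f = {x: Fraction(1, 6) for x in range(1, 7)}   # assumed fair-die distribution
mu = sum(x * p for x, p in f.items())          # the mean

def moment_about_mean(f, mu, r):
    """r-th moment about the mean: mu_r = E[(X - mu)^r]."""
    return sum((x - mu)**r * p for x, p in f.items())

print(moment_about_mean(f, mu, 0))   # 1, as claimed
print(moment_about_mean(f, mu, 1))   # 0, as claimed
print(moment_about_mean(f, mu, 2))   # the variance, 35/12
```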
The second moment about the mean is of special importance in statistics because it is indicative of
the spread or dispersion of the distribution of a random variable; thus, it is given a special symbol
and a special name.
Definition. $\mu_2$ is called the variance of the distribution of $X$, or simply the variance of $X$, and it is denoted by $\sigma^2$, $\operatorname{var}(X)$, or $V(X)$. The positive square root of the variance, $\sigma$, is called the standard deviation of $X$.
The following figure shows how the variance reflects the spread or dispersion of the distribution of a random variable. Here we show the histograms of the probability distributions of four random variables with the same mean $\mu$ but different variances. As can be seen, a small value of $\sigma^2$ suggests that we are likely to get a value close to the mean, and a large value of $\sigma^2$ suggests that there is a greater probability of getting a value that is not close to the mean. This will be discussed further in the next lecture.
Let us derive the following computing formula for $\sigma^2$:
Theorem 1. $\sigma^2 = \mu'_2 - \mu^2$.
Proof.
$$\sigma^2 = E\!\left[(X - \mu)^2\right] = E\!\left(X^2 - 2\mu X + \mu^2\right) = E(X^2) - 2\mu\, E(X) + \mu^2 = \mu'_2 - 2\mu \cdot \mu + \mu^2 = \mu'_2 - \mu^2.$$
Example 1. Use Theorem 1 to calculate the variance of $X$, representing the number of points rolled with a balanced die.
Solution.
First we compute
$$\mu = E(X) = 1 \cdot \tfrac{1}{6} + 2 \cdot \tfrac{1}{6} + 3 \cdot \tfrac{1}{6} + 4 \cdot \tfrac{1}{6} + 5 \cdot \tfrac{1}{6} + 6 \cdot \tfrac{1}{6} = \tfrac{7}{2}.$$
Now,
$$\mu'_2 = E(X^2) = 1 \cdot \tfrac{1}{6} + 4 \cdot \tfrac{1}{6} + 9 \cdot \tfrac{1}{6} + 16 \cdot \tfrac{1}{6} + 25 \cdot \tfrac{1}{6} + 36 \cdot \tfrac{1}{6} = \tfrac{91}{6},$$
and it follows from Theorem 1 that
$$\sigma^2 = \tfrac{91}{6} - \left(\tfrac{7}{2}\right)^2 = \tfrac{35}{12}.$$
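The arithmetic in Example 1 can be double-checked with a few lines of exact-fraction Python; nothing beyond the die's distribution is assumed.

```python
from fractions import Fraction

outcomes = range(1, 7)
p = Fraction(1, 6)                              # balanced die

mu = sum(x * p for x in outcomes)               # E(X)   = 7/2
mu2_prime = sum(x**2 * p for x in outcomes)     # E(X^2) = 91/6
variance = mu2_prime - mu**2                    # Theorem 1: sigma^2 = mu'_2 - mu^2

print(mu, mu2_prime, variance)                  # 7/2 91/6 35/12
```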
Example 2. With reference to Example 2 of Lecture 10, find the standard deviation of the random variable $X$.
Solution.
In Example 2 of Lecture 10 we found the mean $\mu = E(X)$ of this random variable. Computing $\mu'_2 = E(X^2)$ in the same way and applying Theorem 1, we obtain $\sigma^2 = \mu'_2 - \mu^2$, and the standard deviation is $\sigma = \sqrt{\mu'_2 - \mu^2}$.
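Since the numerical details of Example 2 of Lecture 10 are not reproduced here, the sketch below shows only the generic computation of a standard deviation for a continuous random variable by numerical integration; the density $f(x) = 2x$ on $0 < x < 1$ is a hypothetical stand-in, and the pipeline, not the particular density, is the point.

```python
from math import sqrt

def density(x):
    """Hypothetical density used only for illustration: f(x) = 2x on 0 < x < 1."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def integrate(g, a, b, n=100_000):
    """Simple midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mu        = integrate(lambda x: x * density(x), 0.0, 1.0)      # E(X)   ~ 2/3
mu2_prime = integrate(lambda x: x**2 * density(x), 0.0, 1.0)   # E(X^2) ~ 1/2
sigma     = sqrt(mu2_prime - mu**2)                            # Theorem 1, ~ 0.2357

print(mu, mu2_prime, sigma)
```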
The following is another theorem that is of importance in work connected with standard deviations or variances:
Theorem 2. If $X$ has the variance $\sigma^2$, then
$$\operatorname{var}(aX + b) = a^2\sigma^2.$$
Proof. Since $E(aX + b) = a\mu + b$, we have
$$\operatorname{var}(aX + b) = E\!\left[\big(aX + b - a\mu - b\big)^2\right] = E\!\left[a^2 (X - \mu)^2\right] = a^2\, E\!\left[(X - \mu)^2\right] = a^2\sigma^2.$$
Note that for $a = 1$ we find that the addition of a constant $b$ to the values of a random variable, resulting in a shift of all the values of $X$ to the left or to the right, in no way affects the spread of its distribution; for $b = 0$ we find that if the values of a random variable are multiplied by a constant $a$, the variance is multiplied by the square of that constant, $a^2$, resulting in a corresponding change in the spread of the distribution.
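A quick numerical check of Theorem 2, using the balanced die as an assumed distribution and the arbitrary illustrative constants $a = 3$, $b = -5$:

```python
from fractions import Fraction

p = Fraction(1, 6)
xs = range(1, 7)                                # assumed balanced-die values

def var(values):
    """Variance of an equally likely list of values (each with probability p)."""
    mean = sum(v * p for v in values)
    return sum((v - mean)**2 * p for v in values)

a, b = 3, -5                                    # arbitrary illustrative constants
lhs = var([a * x + b for x in xs])              # var(aX + b)
rhs = a**2 * var(xs)                            # a^2 * var(X)
print(lhs, rhs, lhs == rhs)                     # 105/4 105/4 True
```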
Chebyshev’s Theorem
Theorem 3 (Chebyshev’s Theorem). If $\mu$ and $\sigma$ are the mean and the standard deviation of a random variable $X$, then for any positive constant $k$ the probability is at least $1 - \frac{1}{k^2}$ that $X$ will take on a value within $k$ standard deviations of the mean; symbolically,
$$P\!\left(|X - \mu| < k\sigma\right) \geq 1 - \frac{1}{k^2}, \qquad \sigma \neq 0.$$
Proof (continuous case). By definition,
$$\sigma^2 = E\!\left[(X - \mu)^2\right] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx.$$
Then, dividing the integral into three parts as shown in the figure, we get
$$\sigma^2 = \int_{-\infty}^{\mu - k\sigma} (x - \mu)^2 f(x)\, dx + \int_{\mu - k\sigma}^{\mu + k\sigma} (x - \mu)^2 f(x)\, dx + \int_{\mu + k\sigma}^{\infty} (x - \mu)^2 f(x)\, dx.$$
Since the integrand $(x - \mu)^2 f(x)$ is nonnegative, deleting the second integral gives
$$\sigma^2 \geq \int_{-\infty}^{\mu - k\sigma} (x - \mu)^2 f(x)\, dx + \int_{\mu + k\sigma}^{\infty} (x - \mu)^2 f(x)\, dx,$$
and since $(x - \mu)^2 \geq k^2\sigma^2$ whenever $x \leq \mu - k\sigma$ or $x \geq \mu + k\sigma$, it follows that
$$\frac{1}{k^2} \geq \int_{-\infty}^{\mu - k\sigma} f(x)\, dx + \int_{\mu + k\sigma}^{\infty} f(x)\, dx,$$
provided $\sigma^2 \neq 0$. Since the sum of the two integrals on the right-hand side is the probability that $X$ will take on a value less than or equal to $\mu - k\sigma$ or greater than or equal to $\mu + k\sigma$, we have thus shown that
$$P\!\left(|X - \mu| \geq k\sigma\right) \leq \frac{1}{k^2}, \quad \text{and hence} \quad P\!\left(|X - \mu| < k\sigma\right) \geq 1 - \frac{1}{k^2}.$$
Example 3. For a random variable $X$ with a given probability density, find the probability that it will take on a value within two standard deviations of the mean and compare this probability with the lower bound provided by Chebyshev’s theorem.
Solution.
Straightforward integration gives the values of $\mu$ and $\sigma^2$ for the given density, and hence $\sigma$. Thus, the probability that $X$ will take on a value within two standard deviations of the mean is the probability that it will take on a value between $\mu - 2\sigma$ and $\mu + 2\sigma$, that is, $P(\mu - 2\sigma < X < \mu + 2\sigma)$, which is obtained by integrating the density over this interval.
Observe that the exact probability obtained in this way is a much stronger statement than “the probability is at least $1 - \frac{1}{2^2} = 0.75$,” which is provided by Chebyshev’s theorem.
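The comparison made in Example 3 can be reproduced numerically for any density one has in hand. The sketch below uses an assumed density $f(x) = 6x(1 - x)$ on $0 < x < 1$, purely for illustration, computes $P(|X - \mu| < 2\sigma)$, and compares it with Chebyshev's bound of $0.75$.

```python
from math import sqrt

def density(x):
    """Assumed density for illustration only: f(x) = 6x(1 - x) on 0 < x < 1."""
    return 6.0 * x * (1.0 - x) if 0.0 < x < 1.0 else 0.0

def integrate(g, a, b, n=100_000):
    """Simple midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mu    = integrate(lambda x: x * density(x), 0.0, 1.0)
sigma = sqrt(integrate(lambda x: (x - mu)**2 * density(x), 0.0, 1.0))

k = 2
exact = integrate(density, mu - k * sigma, mu + k * sigma)   # P(|X - mu| < 2 sigma)
bound = 1 - 1 / k**2                                         # Chebyshev's lower bound

print(f"exact probability ~ {exact:.3f}, Chebyshev bound = {bound:.2f}")
```

For this particular assumed density the exact probability is close to $0.98$, comfortably above the guaranteed $0.75$, which illustrates how conservative Chebyshev's bound usually is.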
Moment-Generating Functions
Although the moments of most distributions can be determined directly by evaluating the necessary
integrals or sums, an alternative procedure sometimes provides considerable simplifications. This
technique utilizes moment-generating functions.
Definition. The moment-generating function of a random variable $X$, where it exists, is given by
$$M_X(t) = E\!\left(e^{tX}\right) = \sum_x e^{tx} f(x)$$
when $X$ is discrete, and
$$M_X(t) = E\!\left(e^{tX}\right) = \int_{-\infty}^{\infty} e^{tx} f(x)\, dx$$
when $X$ is continuous.
The independent variable is $t$, and we are usually interested in values of $t$ in the neighborhood of $0$.
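To make the definition concrete, the following sketch evaluates $M_X(t) = E(e^{tX})$ at a few values of $t$ near $0$, once more using the fair die as an assumed distribution.

```python
from math import exp

f = {x: 1 / 6 for x in range(1, 7)}          # assumed fair-die distribution

def mgf(t):
    """Moment-generating function M_X(t) = E(e^{tX}) = sum_x e^{tx} f(x)."""
    return sum(exp(t * x) * p for x, p in f.items())

for t in (-0.1, 0.0, 0.1):
    print(t, mgf(t))                         # M_X(0) = 1 for every distribution
```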
To explain why we refer to this function as a “moment-generating” function, let us substitute for $e^{tx}$ its Maclaurin’s series expansion, that is,
$$e^{tx} = 1 + tx + \frac{t^2 x^2}{2!} + \frac{t^3 x^3}{3!} + \cdots.$$
For the discrete case we then get
$$M_X(t) = \sum_x \left[1 + tx + \frac{t^2 x^2}{2!} + \cdots\right] f(x) = \sum_x f(x) + t \sum_x x f(x) + \frac{t^2}{2!} \sum_x x^2 f(x) + \cdots = 1 + \mu t + \mu'_2 \frac{t^2}{2!} + \mu'_3 \frac{t^3}{3!} + \cdots,$$
and it can be seen that in the Maclaurin’s series of the moment-generating function of $X$ the coefficient of $\frac{t^r}{r!}$ is $\mu'_r$, the $r$-th moment about the origin. In the continuous case, the argument is the same.
Example 4. Find the moment-generating function of the random variable $X$ whose probability density is given by
$$f(x) = \begin{cases} e^{-x} & \text{for } x > 0 \\ 0 & \text{elsewhere.} \end{cases}$$
Solution.
By definition,
$$M_X(t) = E\!\left(e^{tX}\right) = \int_0^{\infty} e^{tx} \cdot e^{-x}\, dx = \int_0^{\infty} e^{-x(1 - t)}\, dx = \frac{1}{1 - t}$$
for $t < 1$.
As is well known, when $|t| < 1$ the Maclaurin’s series for this moment-generating function is
$$M_X(t) = 1 + t + t^2 + t^3 + \cdots = 1 + 1 \cdot t + 2! \cdot \frac{t^2}{2!} + 3! \cdot \frac{t^3}{3!} + \cdots,$$
so that $\mu'_r = r!$ for $r = 0, 1, 2, \ldots$.
Theorem 4.
$$\mu'_r = \left.\frac{d^r M_X(t)}{dt^r}\right|_{t = 0}.$$
This follows from the fact that if a function is expanded as a power series in $t$, the coefficient of $\frac{t^r}{r!}$ is the $r$-th derivative of the function with respect to $t$ at $t = 0$.
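Theorem 4 can be checked numerically by approximating the derivatives of $M_X(t)$ at $t = 0$ with finite differences; the fair die is again used as an assumed distribution, for which $\mu'_1 = \frac{7}{2}$ and $\mu'_2 = \frac{91}{6} \approx 15.17$.

```python
from math import exp

f = {x: 1 / 6 for x in range(1, 7)}          # assumed fair-die distribution

def mgf(t):
    return sum(exp(t * x) * p for x, p in f.items())

h = 1e-4                                     # step for the finite differences
first  = (mgf(h) - mgf(-h)) / (2 * h)                 # ~ mu'_1 = 3.5
second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2     # ~ mu'_2 = 91/6 ~ 15.1667

print(first, second)
```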
Example 5. Given that $X$ has the probability distribution $f(x) = \frac{1}{8}\binom{3}{x}$ for $x = 0, 1, 2,$ and $3$, find the moment-generating function of this random variable and use it to determine $\mu'_1$ and $\mu'_2$.
Solution.
In accordance with the definition,
$$M_X(t) = E\!\left(e^{tX}\right) = \frac{1}{8} \sum_{x=0}^{3} e^{tx} \binom{3}{x} = \frac{1}{8}\left(1 + e^t\right)^3.$$
Then, differentiating twice with respect to $t$, we get
$$M'_X(t) = \frac{3}{8}\, e^t \left(1 + e^t\right)^2 \qquad \text{and} \qquad M''_X(t) = \frac{3}{8}\, e^t \left(1 + e^t\right)^2 + \frac{3}{4}\, e^{2t} \left(1 + e^t\right),$$
and, substituting $t = 0$, we get $\mu'_1 = M'_X(0) = \frac{3}{2}$ and $\mu'_2 = M''_X(0) = 3$.
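Assuming the distribution used in Example 5 above, the closed form $M_X(t) = \frac{1}{8}(1 + e^t)^3$ and the two moments can be cross-checked as follows.

```python
from math import comb, exp

f = {x: comb(3, x) / 8 for x in range(4)}     # distribution from Example 5 above (an assumption)

def mgf(t):
    return sum(exp(t * x) * p for x, p in f.items())

def mgf_closed(t):
    return (1 + exp(t))**3 / 8                # closed form claimed in Example 5

# The two expressions agree at several values of t ...
print(all(abs(mgf(t) - mgf_closed(t)) < 1e-12 for t in (-1.0, -0.5, 0.0, 0.5, 1.0)))

# ... and finite differences at t = 0 recover mu'_1 = 3/2 and mu'_2 = 3 (Theorem 4).
h = 1e-4
print((mgf(h) - mgf(-h)) / (2 * h))                  # ~ 1.5
print((mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2)      # ~ 3.0
```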
Often the work involved in using moment-generating functions can be simplified by making use of the following theorem.
Theorem 5. If $a$ and $b$ are constants, then
1. $M_{X + a}(t) = E\!\left[e^{(X + a)t}\right] = e^{at} \cdot M_X(t)$;
2. $M_{bX}(t) = E\!\left(e^{bXt}\right) = M_X(bt)$;
3. $M_{\frac{X + a}{b}}(t) = E\!\left[e^{\left(\frac{X + a}{b}\right)t}\right] = e^{\frac{a}{b}t} \cdot M_X\!\left(\frac{t}{b}\right)$.
The proof of this theorem is left as an exercise. The first part of the theorem is of special importance when $a = -\mu$, and the third part is of special importance when $a = -\mu$ and $b = \sigma$, in which case
$$M_{\frac{X - \mu}{\sigma}}(t) = e^{-\frac{\mu t}{\sigma}} \cdot M_X\!\left(\frac{t}{\sigma}\right).$$
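The three identities in Theorem 5 can be sanity-checked numerically; the sketch below uses the fair die as an assumed distribution with the illustrative constants $a = 2$ and $b = 3$.

```python
from math import exp, isclose

f = {x: 1 / 6 for x in range(1, 7)}           # assumed fair-die distribution
a, b = 2.0, 3.0                               # illustrative constants

def mgf_of(g, t):
    """MGF of the transformed variable g(X): E[e^{t g(X)}]."""
    return sum(exp(t * g(x)) * p for x, p in f.items())

M = lambda t: mgf_of(lambda x: x, t)          # M_X(t)

for t in (-0.5, 0.25, 1.0):
    assert isclose(mgf_of(lambda x: x + a, t), exp(a * t) * M(t))          # part 1
    assert isclose(mgf_of(lambda x: b * x, t), M(b * t))                   # part 2
    assert isclose(mgf_of(lambda x: (x + a) / b, t),
                   exp(a * t / b) * M(t / b))                              # part 3
print("all three identities hold numerically")
```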