Ch4 Random Variables

The document covers fundamental concepts in probability, including expectation, variance, covariance, and correlation. It provides definitions, examples, and theorems related to discrete random variables and their properties. Additionally, it discusses various distributions such as binomial, Poisson, and geometric random variables.

[ ] - Probability

Probability
Stefano Bonaccorsi

Table of Contents
Introduction

• Introduction
  Expectation
  Variance

• Covariance and correlation
  Independent random variables

• Binomial and Poisson

• Geometric, negative binomial, and hypergeometric random variables

Discrete random variables
Introduction

Learning Goals
1. Know how to compute the expected value (mean) of a discrete random variable.
2. Know the expected value of Bernoulli, binomial and geometric random variables.
3. Be able to compute the variance and standard deviation of a random variable.
   Understand that standard deviation is a measure of scale or spread.
4. Introduce the Poisson distribution.

Expectation
Introduction

Example
Suppose we have a six-sided die marked with five 3's and one 6. What would you expect
the average of 6000 rolls to be?
By a relative frequencies approach, if we knew the value of each roll, we could compute
the average by summing all the values and dividing by the number of rolls. Without
knowing the exact values, we can compute the expected average as follows.
Since there are five 3's and one 6, we expect roughly 5/6 of the rolls (around 5000) to
land a 3, and 1/6 of the rolls (around 1000) to land a 6.
Assuming this to be exactly true, we have the following table of values and counts:

value:            3     6
expected counts:  5000  1000

The average of these values is then

(5000 · 3 + 1000 · 6) / 6000 = (5/6) · 3 + (1/6) · 6 = 3.5
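A minimal numerical check of this computation (not part of the original slides), writing the expected average as a probability-weighted sum:

```python
# Expected average for a die with five 3's and one 6.
values = [3, 6]
probs = [5/6, 1/6]

expected = sum(v * p for v, p in zip(values, probs))
print(expected)  # 3.5
```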
Expectation
Introduction

Expectation
Given a random variable X with range R and probability distribution p we define the
expectation of X to be
E[X] = ∑_{j=1}^{n} x_j p_j .

If the range of X is infinite, we shall assume in this chapter that X is such that the infinite
series above converges.
Notes:
1. The expected value is also called the mean or average of X and often denoted by µ ("mu").
2. As seen in the above examples, the expected value need not be a possible value of the random variable. Rather it is a weighted average of the possible values.
3. Expected value is a summary statistic, providing a measure of the location or central tendency of a random variable.
4. If all the values are equally probable then the expected value is just the usual average of the values.
Expectation
Introduction

Example .
Find E[X] when
(a) X is Bernoulli,
(b) X is uniform,
(c) X = xk with certainty.

Expectation
Introduction

Example .
Find E[X] when
(a) X is Bernoulli: E[X] = 1 · p + 0 · (1 − p) = p,
(b) X is uniform: E[X] = ∑_{j=1}^{n} (1/n) x_j,
(c) X = x_k with certainty: E[X] = ∑_{j=1}^{n} δ_{x_j, x_k} x_j = x_k.

Algebraic properties of expectation
Introduction
• If X is a random variable with distribution p then

E[f(X)] = ∑_{j=1}^{n} f(x_j) p_ij

• If X and Y are two random variables with joint distribution p_ij, it makes sense to define

E[X + Y] = ∑_{i=1}^{n} ∑_{j=1}^{m} (x_i + y_j) p_ij

and

E[XY] = ∑_{i=1}^{n} ∑_{j=1}^{m} x_i y_j p_ij

• A random variable X is said to be non-negative if R ⊂ [0, ∞), that is, no value in its
range is negative. Note that X² is always non-negative irrespective of whether X is.
Theorem .
Let X and Y be two random variables defined on the same sample space and ε ∈ R.
(a) E[X + Y] = E[X] + E[Y]
(b) E[εX] = εE[X]
(c) If X is a non-negative random variable, then E[X] ≥ 0

It is possible to see that (c) is a special case of the following more general result:
min{R} ≤ E[X] ≤ max{R}, i.e., the mean (or average, or expected value) of X is
contained between the minimum and the maximum of the elements of its range.

Proof
By direct computation.

Variance
Introduction

Now, take f(X) = (X − µ)²; then f(X) is non-negative, so E[(X − µ)²] ≥ 0. We define the
variance of X, V(X), by

V(X) = E[(X − µ)²] = ∑_{j=1}^{n} (x_j − µ)² p_j

The expected value (mean) of a random variable is a measure of location or central


tendency.
The interpretation of V(X) is that it measures the extent to which we expect the result of
the experiment represented by X to spread, on average, from the mean µ.

Variance
Introduction

The only problem with this definition is the practical one that if the values of X are measured
in (units) then V(X) is measured in (units)². For this reason we find it useful to introduce the
standard deviation σ(X) of the random variable X by

σ(X) = √V(X)

We will sometimes simply denote σ(X) as σ ('sigma') and V(X) as σ².

Variance
Introduction

The following result gives some useful properties of the variance.


Theorem
(a) V(X) = E[X²] − (E[X])²
(b) V(εX) = ε² V(X) for all ε ∈ R

Proof
Expand (X − µ)² = X² − 2µX + µ² and use the linearity of expectation.
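A minimal numerical check (not from the slides) of the identity V(X) = E[X²] − (E[X])² for an arbitrary discrete distribution:

```python
# Check that the two expressions for the variance agree.
values = [1, 2, 3, 4]
probs = [0.1, 0.2, 0.3, 0.4]

mean = sum(x * p for x, p in zip(values, probs))
var_def = sum((x - mean) ** 2 * p for x, p in zip(values, probs))     # E[(X - mu)^2]
var_alt = sum(x ** 2 * p for x, p in zip(values, probs)) - mean ** 2  # E[X^2] - (E[X])^2

assert abs(var_def - var_alt) < 1e-9
print(mean, var_def)  # 3.0 and 1.0 (up to rounding)
```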

Example .
Compute the variance of the following random variables:
(a) X is Bernoulli,
(b) X is uniform in R = {1, . . . , n},
(c) X = xk with certainty,
(d) X is the sum of scores on two fair dice.

Solution
(a) X is Bernoulli: V(X) = p(1 − p),
(b) X is uniform: E[X] = (n + 1)/2, V(X) = (n² − 1)/12,
(c) X = xk with certainty: V(X) = 0,

Solution
(d) S is the sum of scores on two fair dice:
Let S = X + Y with X and Y the scores on the first and second die respectively. Then
E[X] = E[Y] = 7/2, hence E[S] = 7.

x_k (sum):   2     3     4      5      6      7      8      9      10     11     12
p_k:         1/36  2/36  3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
x_k p_k:     2/36  6/36  12/36  20/36  30/36  42/36  40/36  36/36  30/36  22/36  12/36
x_k² p_k:    4/36  18/36 48/36  100/36 180/36 294/36 320/36 324/36 300/36 242/36 144/36

Summing the last row gives E[S²] = 1974/36 = 54 + 5/6, hence

V(S) = (54 + 5/6) − 7² = 35/6
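The same result can be checked by brute-force enumeration of the 36 equally likely outcomes (a sketch, not part of the original slides):

```python
# E[S] and V(S) for the sum of two fair dice, by direct enumeration.
from itertools import product

outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]
n = len(outcomes)  # 36 equally likely outcomes

mean = sum(outcomes) / n
var = sum((s - mean) ** 2 for s in outcomes) / n

print(mean)         # 7.0
print(var, 35 / 6)  # both 5.8333...
```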

Variance
Introduction
Consider the following random variables:
X ∼ {(1, 1/5), (2, 1/5), (3, 1/5), (4, 1/5), (5, 1/5)}
Y ∼ {(1, 1/10), (2, 2/10), (3, 4/10), (4, 2/10), (5, 1/10)}
Z ∼ {(1, 1/2), (2, 0), (3, 0), (4, 0), (5, 1/2)}
W ∼ {(1, 0), (2, 0), (3, 1), (4, 0), (5, 0)}
Order them from the largest to the smallest variance.

https://app.wooclap.com/ADCJQM
Table of Contents
Covariance and correlation

• Introduction
  Expectation
  Variance

• Covariance and correlation
  Independent random variables

• Binomial and Poisson

• Geometric, negative binomial, and hypergeometric random variables

Dependence
Covariance and correlation

Suppose that X and Y are two random variables with means µX and µY respectively. We
say that they are linearly related if we can find constants m and c such that Y = mX + c,
so for each yk , 1 ≤ k ≤ n, we can find xk such that yk = mxk + c.
Each of the points (x1, y1), . . . , (xn, yn) lies on a straight line.


[Figure: the points (x1, y1), (x2, y2), (x3, y3) plotted in the plane, all lying on a single straight line.]

Covariance
Covariance and correlation
A quantity that enables us to measure how ‘close’ X and Y are to being linearly related is
the covariance Cov(X, Y).
This is defined by
Cov(X, Y) = E[(X−µX )(Y−µY )]
Note that Cov(X, Y) = Cov(Y, X) and that Cov(X, X) = V(X).
Furthermore, if X and Y are linearly related, then Cov(X, Y) = mV(X).
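As a sketch of how the covariance can be computed in practice (not from the slides; the joint table below is an arbitrary illustration, not an example from the lecture):

```python
# Covariance of (X, Y) from a joint pmf p[(x, y)] = P(X = x, Y = y).
joint = {
    (1, 1): 0.2, (1, 2): 0.1,
    (2, 1): 0.1, (2, 2): 0.2,
    (3, 1): 0.2, (3, 2): 0.2,
}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
print(mu_x, mu_y, cov)  # 2.1, 1.5, 0.05 (up to rounding)
```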


Covariance and random vectors
Covariance and correlation

Suppose that X = (X1 , X2 ) is a random vector with range R = {(ai , bj )} and distribution
pij = P(X1 = ai , X2 = bj ). We know that the expectation of X is a vector
E[X] = (E[X1], E[X2]); what about the variance? It is natural to replace the square of
(X − E[X]) with the matrix product

(X − E[X])ᵀ · (X − E[X]) =
    [ (X1 − µ1)²            (X1 − µ1)(X2 − µ2) ]
    [ (X1 − µ1)(X2 − µ2)    (X2 − µ2)²         ]

Taking the expectation we recognize, on the diagonal, the variances V(X1 ) and V(X2 ), and
off-diagonal the covariance between X1 and X2 .
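A brief NumPy sketch (not from the slides) of the same object estimated from simulated data; np.cov returns the 2×2 sample covariance matrix, with the variances on the diagonal and the covariance off-diagonal:

```python
# Sample covariance matrix of a two-dimensional random vector.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=10_000)
x2 = 0.5 * x1 + rng.normal(scale=0.5, size=10_000)  # correlated with x1

cov_matrix = np.cov(np.vstack([x1, x2]))
print(cov_matrix)  # diagonal ~ variances, off-diagonal ~ Cov(X1, X2) ~ 0.5
```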

Correlation coefficient
Covariance and correlation

Cov(X, Y) is measured in the product of the units of X and those of Y; however, a


dimensionless number should measure the strength of the relationship between X and Y.
For this reason, we define the correlation coefficient ρ(X, Y) between X and Y by

ρ(X, Y) = Cov(X, Y) / (σX σY)
where σX and σY are the standard deviations of X and Y respectively.
If ρ(X, Y) = 0, we say that X and Y are uncorrelated.

Remark: if Y = mX + b, then σ(Y) = |m| σ(X) and
Cov(X, Y) = E[(X − µX)(mX + b − (mµX + b))] = m E[(X − µX)²] = m V(X),
hence ρ(X, Y) = m V(X) / (σ(X) · |m| σ(X)) = sgn(m) = ±1.
Example .
Consider a symmetric binary channel, with q0|0 = q1|1 = 0.8 and q0|1 = q1|0 = 0.2
Suppose that the input is described by a symmetric Bernoulli random variable X (hence
p0 = P(X = 0) = 1/2 and p1 = 1/2 as well).
Notice that the random variable Y describing the output is also symmetric Bernoulli. The
error probability is given by ε = 0.2. Find the correlation ρ(X, Y) between input and
output.
[Figure: diagram of the binary symmetric channel; input 1 is received as 1 with probability 0.8 and as 0 with probability 0.2, input 0 is received as 0 with probability 0.8 and as 1 with probability 0.2.]
Example .
Consider a symmetric binary channel, with q0|0 = q1|1 = 0.8 and q0|1 = q1|0 = 0.2
Suppose that the input is described by a symmetric Bernoulli random variable X (hence
p0 = P(X = 0) = 1/2 and p1 = 1/2 as well).
Notice that the random variable Y describing the output is also symmetric Bernoulli. The
error probability is given by ε = 0.2. Find the correlation ρ(X, Y) between input and
output.
We have µX = µY = 1/2 and σX = σY = 1/2.
The joint probabilities are
p(0, 0) = p0 q0|0 = 0.4, p(1, 1) = 0.4, p(0, 1) = p0 q1|0 = 0.1, p(1, 0) = 0.1
hence

Cov(X, Y) = E[XY] − µX µY = 0.4 − (0.5)² = 0.15

so ρ(X, Y) = 0.15 / (0.5)² = 0.6
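A quick numerical check of this example (not from the slides), enumerating the joint distribution of input and output:

```python
# Correlation between input and output of the binary symmetric channel
# with crossover probability 0.2 and symmetric Bernoulli input.
from math import sqrt

joint = {(0, 0): 0.4, (1, 1): 0.4, (0, 1): 0.1, (1, 0): 0.1}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())
var_x = sum((x - mu_x) ** 2 * p for (x, y), p in joint.items())
var_y = sum((y - mu_y) ** 2 * p for (x, y), p in joint.items())
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

print(cov / sqrt(var_x * var_y))  # 0.6
```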
The Cauchy-Schwarz inequality
Covariance and correlation

The Cauchy-Schwarz inequality


(E[XY])² ≤ E[X²] E[Y²]

Proof
For every t ∈ R, the random variable (X + tY)² is ≥ 0, hence its mean is ≥ 0, i.e.,

0 ≤ E[(X + tY)²] = E[X²] + 2t E[XY] + t² E[Y²]

If we consider the right-hand side as a polynomial in t, we see that a necessary condition
for the inequality to hold for all t is that its discriminant is non-positive, i.e.,
(E[XY])² − E[X²] E[Y²] ≤ 0.

The Cauchy-Schwarz inequality
Covariance and correlation


The Cauchy-Schwarz inequality


(E[XY])² ≤ E[X²] E[Y²]

Corollary
(i) | Cov(X, Y)| ≤ σX σY
(ii) −1 ≤ ρ(X, Y) ≤ 1

Exercise .
Show that, if X, Y and Z are arbitrary random variables and ε and β are real numbers,
then
(i) V(X + ε) = V(X)
(ii) Cov(X, Y) = E[XY] − E[X]E[Y]
(iii) Cov(X, εY + βZ) = ε Cov(X, Y) + β Cov(X, Z)
(iv) V(X + Y) = V(X) + 2 Cov(X, Y) + V(Y)

Linearly dependent random variables
Covariance and correlation

Theorem
X and Y are linearly dependent: Y = aX + b
if and only if the correlation is ±1

Proof
If Y = aX + b then µY = aµX + b, V(Y) = a² V(X) and

E[XY] = ∑_{i=1}^{n} x_i (a x_i + b) p_X(i) = a E[X²] + b E[X]

hence Cov(X, Y) = a V(X), so that |Cov(X, Y)| = √(V(X) · a² V(X)) = σX σY and ρ(X, Y) = sgn(a) = ±1.

Linearly dependent random variables
Covariance and correlation

Theorem
X and Y are linearly dependent: Y = aX + b
if and only if the correlation is ±1

Proof
Conversely, let X′ = X/σX and Y′ = Y/σY; then

Cov(X′, Y′) = ρ(X, Y) = ±1

and

V(X′ ± Y′) = 2 ± 2ρ(X, Y)

so if ρ(X, Y) = ±1 then either V(X′ − Y′) = 0 or V(X′ + Y′) = 0, and we know that a
random variable has variance 0 if and only if it is degenerate.

Independent random variables
Covariance and correlation

Two random variables X and Y are said to be (probabilistically) independent if, for every j
and k, the events (X = xj) and (Y = yk) are independent:

P(X = xj , Y = yk ) = P(X = xj )P(Y = yk )

Example
Choose a number at random between and . Let X be the remainder of the division of
the number by 2, and Y be the remainder of the division of the number by 3. Then X is
uniformly distributed in {0, 1} and Y is uniformly distributed in {0, 1, 2}.
Moreover, X and Y are independent.
However, if we take the number at random between 0 and , then X and Y are no longer
independent!

Theorem .
If X and Y are independent, then
(a) E[XY] = E[X]E[Y]
(b) Cov(X, Y) = ρ(X, Y) = 0
(c) V(X + Y) = V(X) + V(Y).

Proof

E[XY] = ∑_{j=1}^{n} ∑_{k=1}^{m} x_j y_k P(X = x_j, Y = y_k) = ∑_{j=1}^{n} ∑_{k=1}^{m} x_j y_k P(X = x_j) P(Y = y_k)

      = ( ∑_{j=1}^{n} x_j P(X = x_j) ) · ( ∑_{k=1}^{m} y_k P(Y = y_k) ) = E[X] E[Y]

as required.
Theorem .
If X and Y are independent, then
(a) E[XY] = E[X]E[Y]
(b) Cov(X, Y) = ρ(X, Y) = 0
(c) V(X + Y) = V(X) + V(Y).

Proof
Since
Cov(X, Y) = E[XY] − E[X]E[Y]
it follows from previous computation that Cov(X, Y) = 0 and, a fortiori, ρ(X, Y) = 0.

Theorem .
If X and Y are independent, then
(a) E[XY] = E[X]E[Y]
(b) Cov(X, Y) = ρ(X, Y) = 0
(c) V(X + Y) = V(X) + V(Y).

Proof
Since
V(X + Y) = V(X) + V(Y) + 2 Cov(X, Y)
the claim follows from the previous computation.

Table of Contents
Binomial and Poisson

• Introduction
  Expectation
  Variance

• Covariance and correlation
  Independent random variables

• Binomial and Poisson

• Geometric, negative binomial, and hypergeometric random variables

Let X1 , X2 , ..., Xn be i.i.d. Bernoulli random variables:

P(Xj = 1) = p = 1 − P(Xj = 0), for all j = 1, . . . , n

The sum S(n) = X1 + · · · + Xn is called a binomial random variable with parameters n and
p, S(n) ∼ B(n, p).
The range of S(n) is {0, 1, . . . , n};
Lemma .
The probability law of S(n) is given by p(k) = \binom{n}{k} p^k (1 − p)^{n−k} for 0 ≤ k ≤ n

By the results of Example . , Theorem . , Example . and Theorem . we obtain that


E[S(n)] = E[X1 ] + · · · + E[Xn ] = np and V(S(n)) = V(X1 ) + · · · + V(Xn ) = np(1 − p)
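A short sketch (not from the slides) of the binomial probability law and its moments, using only the standard library; the parameter values are arbitrary:

```python
# Binomial(n, p): pmf, mean and variance computed from the formulas above.
from math import comb

def binomial_pmf(k, n, p):
    """P(S(n) = k) = C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 6, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

mean = sum(k * q for k, q in enumerate(pmf))
var = sum(k ** 2 * q for k, q in enumerate(pmf)) - mean ** 2
print(abs(mean - n * p) < 1e-9, abs(var - n * p * (1 - p)) < 1e-9)  # True True
```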

Example .
An information source emits a six-digit message into a channel in binary code. Each digit is
chosen independently of the others and is a one with probability . . Calculate the
probability that the message contains
(i) three ones,
(ii) between two and four ones (inclusive),
(iii) no less than two zeros.
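The probability of a one is elided in the extracted slide, so the sketch below (not from the slides) leaves it as a parameter p; the three requested probabilities are sums of binomial(6, p) terms:

```python
# Number of ones in a six-digit message: the three requested probabilities.
from math import comb

def binomial_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def message_probabilities(p, n=6):
    exactly_three = binomial_pmf(3, n, p)                                  # (i)
    two_to_four = sum(binomial_pmf(k, n, p) for k in (2, 3, 4))            # (ii)
    at_least_two_zeros = sum(binomial_pmf(k, n, p) for k in range(n - 1))  # (iii): at most 4 ones
    return exactly_three, two_to_four, at_least_two_zeros

print(message_probabilities(p=0.3))  # p = 0.3 chosen arbitrarily for illustration
```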

From Binomial to Poisson distribution

Having dealt with a finite number of i.i.d. Bernoulli random variables it is natural (if you
are a mathematician) to inquire about the behavior of an infinite number of these. Of
course, the passage to the infinite generally involves taking some kind of limit and this
needs to be carried out with care.
We will take the limit of the probability law
p(k) = \binom{n}{k} p^k (1 − p)^{n−k}

as n → ∞ and as p → 0.
In order to obtain a sensible answer we will assume that n increases and p decreases in
such a way that λ = np remains fixed.

We denote Y as the corresponding random variable, which is called a Poisson random
variable with parameter λ. The range of Y is N.
To obtain the probability law of Y we take the limit

p_Y(k) = lim_{n→∞} \binom{n}{k} (λ/n)^k (1 − λ/n)^{n−k} = e^{−λ} λ^k / k!

for every k ≥ 0.
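A small numerical illustration of this limit (not from the slides): for large n and small p with λ = np fixed, the binomial pmf is already very close to the Poisson pmf.

```python
# Poisson approximation of the binomial: n = 1000, p = 0.002, lambda = 2.
from math import comb, exp, factorial

n, p = 1000, 0.002
lam = n * p

for k in range(6):
    binom = comb(n, k) * p ** k * (1 - p) ** (n - k)
    poisson = exp(-lam) * lam ** k / factorial(k)
    print(k, round(binom, 5), round(poisson, 5))  # the two columns nearly agree
```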

Remember the well-known fact that

∑_{k=0}^{∞} λ^k / k! = e^λ

in order to prove that pY (k) is, indeed, a probability distribution.

Prove (either by taking the limit in E[S(n)] = np and V(S(n)) = np(1 − p) or by a direct
computation) that
E[Y] = λ, V(Y) = λ.

Example
A typesetter makes, on average, one mistake per 1000 words. Assume that he is setting a
book with 100 words to a page. Let S100 be the number of mistakes that he makes on a
single page. Then the exact probability distribution for S100 would be obtained by
considering S100 as the result of 100 Bernoulli trials with p = 1/1000. The expected value of
S100 is λ = 100 · (1/1000) = 0.1. The exact probability that S100 takes a certain value j is
\binom{100}{j} p^j (1 − p)^{100−j}, and the Poisson approximation is e^{−0.1} (0.1)^j / j!. Numerically, the values
are:

Poisson λ = 0.1 . . . . .
Binomial n = 100, p = 0.001 . . . . .

Sum of Independent Poisson Random Variables
Let X and Y be independent Poisson random variables with respective means λ1 and λ2 .
Calculate the distribution of X + Y.

Solution
Since the event {X + Y = n} may be written as the union of the disjoint events
{X = k, Y = n−k}, 0 ≤ k ≤ n, we have
P(X + Y = n) = ∑_{k=0}^{n} P(X = k, Y = n − k) = ∑_{k=0}^{n} P(X = k) P(Y = n − k)

             = ∑_{k=0}^{n} e^{−λ1} (λ1^k / k!) · e^{−λ2} (λ2^{n−k} / (n − k)!)

             = e^{−λ1−λ2} (1/n!) ∑_{k=0}^{n} (n! / (k! (n − k)!)) λ1^k λ2^{n−k} = e^{−λ1−λ2} (λ1 + λ2)^n / n!

In words, X + Y has a Poisson distribution with parameter λ1 + λ2 .
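A sketch (not from the slides) checking this numerically by convolving two Poisson pmfs:

```python
# Convolving Poisson(l1) with Poisson(l2) reproduces Poisson(l1 + l2).
from math import exp, factorial

def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

l1, l2 = 1.5, 2.5
for n in range(5):
    conv = sum(poisson_pmf(k, l1) * poisson_pmf(n - k, l2) for k in range(n + 1))
    print(n, round(conv, 6), round(poisson_pmf(n, l1 + l2), 6))  # columns agree
```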

Splitting a Poisson distribution
The number of precipitation phenomena during the month of January in a certain district
is Poisson distributed with mean µ = 14.3. Suppose that each phenomenon can be
classified as normal or extreme and that each of them, independently of the others, is an
extreme phenomenon with probability p = 0.03.
Find the joint distribution that n normal events and m extreme events occur in the next
month of January.

Solution
Let N be the total number of events, N1 the number of normal events and N2 the number of
extreme events, with N1 + N2 = N. Conditioning on N gives

P(N1 = n, N2 = m) = P(N1 = n, N2 = m | N = n + m)P(N = n + m)

Given that n + m events have occurred, the fact that n of them are classified as normal and the
remaining as extreme is just a binomial distribution, hence
P(N1 = n, N2 = m) = \binom{n+m}{n} (1 − p)^n p^m · e^{−λ} λ^{n+m}/(n + m)! = e^{−λ(1−p)} (λ(1 − p))^n / n! · e^{−λp} (λp)^m / m!

Because the preceding joint probability mass function factors into two products, one of which
depends only on n and the other only on m, it follows that N1 and N2 are independent and

P(N1 = n) = e^{−λ(1−p)} (λ(1 − p))^n / n!,    P(N2 = m) = e^{−λp} (λp)^m / m!

so we can conclude that N1 and N2 are independent Poisson random variables with respective means
λ(1 − p) and λp.
Splitting a Poisson distribution
Therefore, this example establishes the important result that when each of a Poisson
number of events is independently classified either as being of type 1 with probability p or
of type 2 with probability 1 − p, then the numbers of type 1 and type 2 events are
independent Poisson random variables.
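A simulation sketch of this thinning property (not from the slides, using NumPy); the empirical means of the two counts should be close to λ(1 − p) and λp, and their empirical correlation close to zero:

```python
# Splitting (thinning) a Poisson count: classify each of N ~ Poisson(lam) events
# independently as "extreme" with probability p.
import numpy as np

rng = np.random.default_rng(1)
lam, p, trials = 14.3, 0.03, 100_000

n_total = rng.poisson(lam, size=trials)
n_extreme = rng.binomial(n_total, p)           # extreme events among each total
n_normal = n_total - n_extreme

print(n_normal.mean(), lam * (1 - p))          # close to 13.871
print(n_extreme.mean(), lam * p)               # close to 0.429
print(np.corrcoef(n_normal, n_extreme)[0, 1])  # close to 0, consistent with independence
```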

Table of Contents
Geometric, negative binomial, and hypergeometric random variables

• Introduction
  Expectation
  Variance

• Covariance and correlation
  Independent random variables

• Binomial and Poisson

• Geometric, negative binomial, and hypergeometric random variables

Example. Geometric distribution
A coin that lands heads with probability p is repeatedly tossed. What is the probability p(r) of getting the first
head on the r-th try?
By independence we have

p(r) = P(X1 = T) P(X2 = T) · · · P(X_{r−1} = T) P(X_r = H) = (1 − p)^{r−1} p

Now define a random variable Y taking values in N by: Y is the first try on which H appears. Y
is called a geometric random variable and we have p_Y(r) = p(1 − p)^{r−1}.
Verify that ∑_{r=1}^{∞} p_Y(r) = 1 and that E[Y] = 1/p, V(Y) = (1 − p)/p².
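A numerical sketch of these facts (not from the slides), truncating the infinite series at a large cutoff:

```python
# Geometric distribution: check that the pmf sums to 1, E[Y] = 1/p, V(Y) = (1 - p)/p^2.
p = 0.3
cutoff = 10_000  # large enough that the truncated tail is negligible

support = range(1, cutoff + 1)
pmf = [(1 - p) ** (r - 1) * p for r in support]

total = sum(pmf)
mean = sum(r * q for r, q in zip(support, pmf))
var = sum(r ** 2 * q for r, q in zip(support, pmf)) - mean ** 2

print(round(total, 6), round(mean, 6), round(var, 6))
print(1, 1 / p, (1 - p) / p ** 2)  # 1, 3.333..., 7.777...
```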

Example
In printing a book, an error can occur in every character with probability p = 0.1.
Find the probability that no misprint occurs in the first 10 characters.

We want to compute the probability that Y > 10, where Y is a geometric random variable
of parameter p.
This gives us a useful opportunity to develop a general formula to compute the
cumulative distribution FY (n) = P(Y ≤ n)
We have that (Y > n) means that n successive failures have occurred in the first n tries,
hence
F_Y(n) = 1 − (1 − p)^n
In our case we have (with n = 10 and p = 0.1) P(Y > 10) = (0.9)^10 ≈ 0.3487

We will remain in the same context as above with our sequence X1 , X2 , ... of i.i.d.
Bernoulli random variables. We saw that the geometric random variable could be
interpreted as a 'waiting time' until the first 1 is registered.
Now we will consider a more general kind of waiting time, namely, we fix r ∈ N and ask
how long we have to wait (i.e. how many Xj's have to be emitted) until we observe r 1's.
To be specific we define a random variable N, called the negative binomial random
variable with parameters r and p and range {r, r + 1, r + 2, ...} by: N is the smallest value
of n for which X1 + X2 + · · · + Xn = r
We have
P(N = n) = P({r − 1 of the r.v.'s X1, . . . , X_{n−1} take the value 1} ∩ {Xn = 1})
         = P({r − 1 of the r.v.'s X1, . . . , X_{n−1} take the value 1}) P({Xn = 1})
         = \binom{n − 1}{r − 1} p^r (1 − p)^{n−r}
You should try to prove and convince yourself that N is the sum of r independent
geometric random variables.
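A simulation sketch (not from the slides) illustrating that last remark: summing r independent geometric waiting times reproduces the negative binomial pmf.

```python
# Negative binomial as a sum of r independent geometric waiting times.
from math import comb
import numpy as np

rng = np.random.default_rng(2)
r, p, trials = 3, 0.4, 200_000

# numpy's geometric() returns the index of the first success (support 1, 2, ...)
waits = rng.geometric(p, size=(trials, r)).sum(axis=1)

def neg_binom_pmf(n, r, p):
    return comb(n - 1, r - 1) * p ** r * (1 - p) ** (n - r)

for n in range(r, r + 5):
    print(n, round((waits == n).mean(), 4), round(neg_binom_pmf(n, r, p), 4))
```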
Suppose we have a supply of n binary symbols (i.e. 0's and 1's), m of which take the value 1
(so that n − m take the value 0). Suppose that we wish to form a 'codeword of length r',
that is, a sequence of r binary symbols out of our supply of n symbols, and that this
codeword is chosen at random.
We define a random variable: H = number of 1's in the codeword
so that H has range {0, 1, 2, ..., p}, where p is the minimum of m and r; H is called the
hypergeometric random variable with parameters n, m and r.
We have

P(H = x) = \binom{m}{x} \binom{n − m}{r − x} / \binom{n}{r}
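A short sketch of this pmf (not from the slides); it checks that the probabilities sum to 1 and, as a standard side fact not stated on the slide, that E[H] = rm/n:

```python
# Hypergeometric pmf P(H = x) = C(m, x) C(n - m, r - x) / C(n, r).
from math import comb

def hypergeom_pmf(x, n, m, r):
    return comb(m, x) * comb(n - m, r - x) / comb(n, r)

n, m, r = 20, 8, 5  # arbitrary parameter values
support = range(0, min(m, r) + 1)
pmf = [hypergeom_pmf(x, n, m, r) for x in support]

print(abs(sum(pmf) - 1) < 1e-9)                             # True: a probability law
print(sum(x * q for x, q in zip(support, pmf)), r * m / n)  # both 2.0 (up to rounding)
```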

[ ] - Probability
Thank you for listening!
Any questions?

Application: elementary statistical inference
In statistics we are interested in gaining information about a population which for
reasons of size, time or cost is not directly accessible to measurement.
We are interested in some quality of the members of the population which can be
measured numerically.
Statisticians attempt to learn about the population by studying a sample taken from it at
random.
Clearly, the properties of the population will be reflected in the properties of the sample.
Suppose that we want to gain information about the population mean µ. If our sample is
{x1 , x2 , ..., xn }, we might calculate the sample mean
x̄ = (1/n) ∑_{j=1}^{n} x_j

and take this as an approximation of µ.


We justify this choice by using random variables. Let X1 be the random variable whose
values are all possible choices x1 of the first member of the sample, so E[X1 ] = µ and
V(X1 ) = σ 2 .
Now let X2 be the random variable whose values are all possible choices x2 of the second
member of the sample. In practice, we are usually sampling without replacement, so the
range of X2 should be one less than the range of X1 , but statisticians are usually content to
fudge this issue by arguing that if the population is sufficiently large it is a ‘reasonable’
approximation to take X1 and X2 to be identically distributed. As the choice of the value of
X1 should not affect that of X2 , we also assume that these random variables are
independent.
We continue in this way to obtain n i.i.d. random variables X1 , X2 , ..., Xn
Now consider the random variable X̄(n) defined by

X̄(n) = (1/n) ∑_{k=1}^{n} X_k

such that the mean of X̄(n) is µ and the variance is V(X̄(n)) = σ²/n. Further information
about the law of X̄(n) will be obtained in later chapters.
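A simulation sketch (not from the slides) of the fact that V(X̄(n)) = σ²/n, with an arbitrary choice of population distribution:

```python
# The variance of the sample mean shrinks like sigma^2 / n.
import numpy as np

rng = np.random.default_rng(3)
sigma2 = 4.0           # population variance (arbitrary choice)
n, trials = 25, 100_000

samples = rng.normal(loc=10.0, scale=np.sqrt(sigma2), size=(trials, n))
sample_means = samples.mean(axis=1)

print(sample_means.var(), sigma2 / n)  # both close to 0.16
```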
