Density of The Ratio of Two Normal Random Variables and Applications

This document presents the density of the ratio of two normal random variables (X/Y) where X and Y are normal. It provides the exact closed form expression of this density in terms of Hermite and confluent hypergeometric functions. It considers all cases where the variables are standardized or non-standardized, independent or correlated. Examples of applications are given and it discusses generalizing the ratio to variables from scale mixtures of bivariate normal distributions.

Communications in Statistics - Theory and Methods

ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: https://www.tandfonline.com/loi/lsta20

Density of the Ratio of Two Normal Random Variables and Applications

T. Pham-Gia, N. Turkkan & E. Marchand

To cite this article: T. Pham-Gia, N. Turkkan & E. Marchand (2006) Density of the Ratio of
Two Normal Random Variables and Applications, Communications in Statistics - Theory and
Methods, 35:9, 1569-1591, DOI: 10.1080/03610920600683689

To link to this article: https://doi.org/10.1080/03610920600683689

Published online: 15 Feb 2007.

Communications in Statistics—Theory and Methods, 35: 1569–1591, 2006
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0926 print/1532-415X online
DOI: 10.1080/03610920600683689

Distributions and Models

Density of the Ratio of Two Normal Random Variables and Applications

T. PHAM-GIA¹, N. TURKKAN², AND E. MARCHAND³

¹Department of Mathematics and Statistics, Université de Moncton, Moncton, New Brunswick, Canada
²Department of Civil Engineering, Université de Moncton, Moncton, New Brunswick, Canada
³Department of Mathematics, Université de Sherbrooke, Sherbrooke, Canada

In reply to a question raised in the literature, and to settle an argument debated in the last decades, we give the exact closed form expression of the density of $X/Y$, where $X$ and $Y$ are normal random variables, in terms of Hermite and confluent hypergeometric functions. All cases will be considered: standardized and nonstandardized variables, independent or correlated variables. Examples in applied disciplines are presented, and generalizations to ratios of variables from scale mixtures of bivariate normal distributions show the potential of further new applications in applied statistics and operations research.

Keywords Bivariate normal; Finite sampling; Hermite function; Integral representation; Kummer confluent hypergeometric function; Normal; Ratio.

AMS Classification 62E15; 62N05.

Received January 3, 2005; Accepted January 20, 2006

Address correspondence to T. Pham-Gia, Department of Mathematics and Statistics, Université de Moncton, Moncton, New Brunswick, E1A 3E9 Canada; E-mail: [email protected]

1. Introduction
The density of $W = X/Y$, where $X$ and $Y$ are normal random variables, has attracted the interest of researchers since as early as 1930, because it is encountered in some basic problems in statistics. Although the cases where both $X$ and $Y$ are standard normal, and where $(X, Y)$ is standard bivariate normal, are fairly simple, the general cases are much more complex. An unexpected result is that the related density can only be either unimodal or bimodal. Geary (1930) was the first to investigate this question, and Fieller (1932) presented another approach to evaluate this probability density. In the 1960s, two important papers (Hinkley, 1969; Marsaglia, 1965) addressed this concern, but from different viewpoints. Recently, demand for this expression resurfaced in a number of important applications, and an active exchange on the Web page: file stat 97ratio.html (Marsaglia, 2001; Startz, 1997; Ward, 1997) has rekindled the need for a convenient expression of this distribution. Adopting another approach, Springer (1984, p. 139), using Mellin transform methods, obtained this density in terms of an infinite series. But the inversion of the Mellin transform in the complex plane requires advanced computation that is not always easy to handle, and the result is not a closed form expression.
In this article we use special mathematical functions, the Hermite function,
which is a generalization of the Hermite polynomials, and Kummer’s confluent
hypergeometric function, to give a convenient closed form expression to the density
of W . But this expression can also be obtained by using a different approach,
based on conditional expectations. Some particular cases have simpler expressions,
expressed as Hermite or other common functions.
In Sec. 2, we first recall the Hermite function $H_\nu(z)$ and give its basic properties and its integral representation when $\nu$ is a negative real number. In Secs. 3 and 4, the density of $W$ is given for all cases. Sec. 5 discusses the interesting shapes that the density of $W$ can take, while Sec. 6 presents some applications with related discussions. Finally, Sec. 7 gives another look at the problem, from a cumulative distribution viewpoint, and puts it in its wider context of ratios of variables from scale mixtures of bivariate normal distributions, a topic with potential applications in several directions. At the same time, we wish to settle an argument, in favor of Marsaglia, that the ratio of two correlated normal variables can be represented by the quotient of two sums of independent standard normal variables, with respective appropriate constants.

2. The Hermite Function and the Power-Quadratic Exponential Family

2.1.
Although Hermite polynomials are well utilized in statistics, for instance in the
Gram–Charlier expansion of a density, the Hermite function has only timid
encounters with distribution theory (Pham-Gia, 1994). But the use of special
functions, already very widespread in mathematical physics, is gaining ground in
statistics, where they provide powerful tools to complement the classical common
functions. Dickey (1983) championed such a use already a few decades ago.
The Hermite function with parameter $\nu$, $H_\nu(z)$, can be derived from the parabolic cylinder function $D_\nu$ by the relation $H_\nu(z) = 2^{\nu/2} \exp(z^2/2)\, D_\nu(z\sqrt{2})$, where $D_\nu$ itself is related to the $\nu$th derivative of the function $\exp(-z^2/2)$ by the relation

\[ D_\nu(z) = (-1)^\nu \exp\left(\frac{z^2}{4}\right) \frac{d^\nu}{dz^\nu}\, e^{-z^2/2}, \quad \nu > 1. \]

For any value of $\nu$, $H_\nu$ can be directly defined by the infinite series (Lebedev, 1972, p. 289)

\[ H_\nu(z) = \frac{1}{2\Gamma(-\nu)} \sum_{m=0}^{\infty} \frac{(-1)^m\, \Gamma\left(\frac{m-\nu}{2}\right)}{m!}\, (2z)^m, \tag{1} \]


with the gamma function for negative values obtained by repeatedly applying the
relation
   1−     
− −1 3
− =   +12√   > 0 and   = −
2 2  2 2

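When $\nu$ is a positive integer, we have the corresponding Hermite polynomial, while with $\nu < 0$, $H_\nu(z)$ has an integral representation of the form

\[ H_\nu(z) = \frac{1}{\Gamma(-\nu)} \int_0^\infty e^{-t^2 - 2tz}\, t^{-(\nu+1)}\, dt, \tag{2} \]

which shows that $H_\nu(x)$ is a positive function on the whole real line. The Hermite function $H_{-2}(z)$ is of particular interest in this article. We have

\[ H_{-2}(z) = \int_0^\infty t\, e^{-t^2 - 2tz}\, dt, \tag{3} \]

with $H_{-2}(0) = 1/2$, and from the general relation (Lebedev, 1972, p. 297)

\[ H_\nu(z) = \frac{2^{\nu+1}}{\Gamma(-\nu/2)}\, z \int_0^\infty e^{-t^2}\, t^{-(\nu+1)}\, (t^2 + z^2)^{(\nu-1)/2}\, dt, \quad \nu < 0, \]

we also have

\[ H_{-2}(z) = \frac{z}{2} \int_0^\infty \frac{t\, e^{-t^2}}{(t^2 + z^2)^{3/2}}\, dt. \]

Kummer's classical confluent hypergeometric function of the first kind, ${}_1F_1$, is defined by

\[ {}_1F_1(\alpha; \gamma; z) = \sum_{k=0}^{\infty} \frac{(\alpha)_k}{(\gamma)_k} \cdot \frac{z^k}{k!}, \quad \gamma \neq 0, -1, -2, \ldots, \]

with the ascending factorials, or Pochhammer coefficients, $(\alpha)_k$, defined by $(\alpha)_k = \alpha(\alpha + 1) \cdots (\alpha + k - 1) = \Gamma(\alpha + k)/\Gamma(\alpha)$ with $(\alpha)_0 = 1$. An important identity relates it to the Hermite function:

\[ H_\nu(z) = 2^\nu \sqrt{\pi} \left[ \frac{1}{\Gamma\left(\frac{1-\nu}{2}\right)} \cdot {}_1F_1\left(-\frac{\nu}{2}; \frac{1}{2}; z^2\right) - \frac{2z}{\Gamma\left(-\frac{\nu}{2}\right)} \cdot {}_1F_1\left(\frac{1-\nu}{2}; \frac{3}{2}; z^2\right) \right]. \tag{4} \]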

2.2.
As presented in Pham-Gia (1994), the power-quadratic exponential family of
distributions has the property that its hazard rates can be ordered under certain
conditions, and densities of the ratios of two members of this family are given
in Pham-Gia and Turkkan (2005). Here, it is used when one of the normal
variables can be approximated by a member of this family, which happens when the
coefficient of variation of this variable is small.
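
Definition. The family $PQE(\lambda, \beta, \alpha)$ of distributions consists of positive continuous random variables with densities of the form $f(t; \lambda, \beta, \alpha) = C(\lambda, \beta, \alpha)\, t^{\lambda} \exp[-(\beta t + \alpha t^2)]$, where the domain of the parameters is

\[ \{(\lambda, \beta, \alpha): \lambda > -1,\ -\infty < \beta < \infty,\ \alpha > 0\} \cup \{\lambda > -1,\ \beta > 0,\ \alpha = 0\}. \]

We write $X \sim PQE(\lambda, \beta, \alpha)$, and, as given in Pham-Gia and Turkkan (2005), most common distributions for positive random variables are its special cases. In particular, the following cases will be considered in subsequent sections:
Case $\lambda = 0$:

(a) $\beta, \alpha \neq 0$. We have the normal density, $N(\mu, \sigma^2)$, truncated from below at 0, denoted $N_{Tr}(\mu, \sigma^2)$, with $\alpha = (2\sigma^2)^{-1}$ and $\beta = -\mu/\sigma^2$. Hence

\[ f(t; 0, \beta, \alpha) = C(0, \beta, \alpha) \exp[-(\beta t + \alpha t^2)], \quad t \ge 0, \tag{5} \]

with

\[ C(0, \beta, \alpha) = \frac{\exp(-\beta^2/(4\alpha))}{\sqrt{\pi/\alpha} \cdot \left[1 - \Phi\left(\beta/\sqrt{2\alpha}\right)\right]}, \]

where $\Phi$ is the cumulative distribution function of the standard normal. We also have $C(0, \beta, \alpha) = \sqrt{\alpha}/H_{-1}\left(\beta/(2\sqrt{\alpha})\right)$, by using the relation

\[ H_{-1}(z) = \sqrt{\pi}\, e^{z^2} \left[1 - \Phi(z\sqrt{2})\right]. \tag{6} \]

(b) If $\beta = 0$, $\alpha > 0$, we have the half-normal distribution, denoted by $N_H(0, \sigma^2)$, defined for $t \ge 0$, with $\sigma^2 = (2\alpha)^{-1}$, and $f(t; 0, 0, \alpha) = C(0, 0, \alpha) \exp(-\alpha t^2)$, $t \ge 0$, with $C(0, 0, \alpha) = 2\sqrt{\alpha/\pi}$.

(c) If $\alpha = 0$ but $\beta > 0$, then the distribution is exponential, of the form $f(t; 0, \beta, 0) = \beta \exp(-\beta t)$, $t \ge 0$.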
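
The normalizing constant of case (a) can be checked numerically. The sketch below is only an illustration: it assumes the parametrization $\alpha = (2\sigma^2)^{-1}$, $\beta = -\mu/\sigma^2$, the function names are hypothetical, and the values of $\mu$ and $\sigma$ are arbitrary rather than taken from the article.

```python
import math
from statistics import NormalDist

def truncated_normal_constant(beta, alpha):
    """C(0, beta, alpha): normalizing constant of exp(-(beta*t + alpha*t^2)) on t >= 0."""
    phi = NormalDist().cdf(beta / math.sqrt(2.0 * alpha))
    return math.exp(-beta ** 2 / (4.0 * alpha)) / (math.sqrt(math.pi / alpha) * (1.0 - phi))

def integrate_density(beta, alpha, upper=40.0, n=40000):
    """Trapezoid rule for C * exp(-(beta*t + alpha*t^2)) on [0, upper]."""
    c = truncated_normal_constant(beta, alpha)
    h = upper / n

    def f(t):
        return c * math.exp(-(beta * t + alpha * t * t))

    inner = sum(f(i * h) for i in range(1, n))
    return h * (0.5 * f(0.0) + inner + 0.5 * f(upper))

mu, sigma = 1.0, 1.0                      # illustrative values only
alpha, beta = 1.0 / (2.0 * sigma ** 2), -mu / sigma ** 2
total = integrate_density(beta, alpha)    # should be close to 1
print(total)
```

Setting `beta = 0` in `truncated_normal_constant` gives the half-normal constant of case (b), $2\sqrt{\alpha/\pi}$.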

3. Ratio of Two Independent Normal Variables

3.1.
Works on the derivation of the density of $W = X/Y$, where $X$ and $Y$ are normal variables, dependent or independent, have generally followed the approach of finding the cumulative distribution of the ratio

\[ W^* = \frac{\sigma_Y}{\sigma_X}\, W = \frac{U_1 + \delta_X}{U_2 + \delta_Y}, \]

where $U_i$, $i = 1, 2$, are dependent standard normal variables with the same correlation coefficient $\rho$ as $(X, Y)$, and $\delta_X = \mu_X/\sigma_X$ and $\delta_Y = \mu_Y/\sigma_Y$, with $P(W \le t) = P(W^* \le (\sigma_Y/\sigma_X)\, t)$ (Geary, 1930; Fieller, 1932; Hinkley, 1969). But Marsaglia (1965) required further that $U_i$, $i = 1, 2$, be independent, denoted by $V_i$, and hence, $W^*$ becomes

\[ T = \frac{V_1 + a}{V_2 + b}, \quad \text{with } a = \frac{1}{\sqrt{1 - \rho^2}} \left( \frac{\mu_X}{\sigma_X} - \rho\, \frac{\mu_Y}{\sigma_Y} \right) \text{ and } b = \frac{\mu_Y}{\sigma_Y}. \tag{7} \]

Section 7 establishes the relation between $W$ and $T$. Naturally, the tabulated values of the standard bivariate normal distribution,

\[ L(h, k; \rho) = \frac{1}{2\pi\sqrt{1 - \rho^2}} \int_h^\infty \int_k^\infty \exp\left( -\frac{x^2 - 2\rho x y + y^2}{2(1 - \rho^2)} \right) dx\, dy, \]

are used in the expression of the cumulative distribution function by the first three authors, while Marsaglia (1965) also expressed it in terms of the Nicholson (1943) $V$ function, $V(h, q) = \int_0^h \int_0^{qx/h} \varphi(x)\varphi(y)\, dy\, dx$, where $\varphi$ is the standard normal density. Differentiating the cumulative distribution function, he obtained the density, which contains, however, an integral of $\varphi$.
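
To make the constants of Eq. (7) concrete, the following sketch computes $a$ and $b$ and checks one way of going back from $T$ to $W$. The relation $W = (\sigma_X/\sigma_Y)\left(\rho + \sqrt{1-\rho^2}\, T\right)$ used here is an assumption of this sketch: it is the standard decomposition obtained by writing $X = \mu_X + \sigma_X(\rho V_2 + \sqrt{1-\rho^2}\, V_1)$ and $Y = \mu_Y + \sigma_Y V_2$, not a formula quoted from Sec. 7. Function names and numerical values are hypothetical.

```python
import math

def marsaglia_constants(mu_x, mu_y, sd_x, sd_y, rho):
    """Constants a and b of Eq. (7)."""
    a = (mu_x / sd_x - rho * mu_y / sd_y) / math.sqrt(1.0 - rho ** 2)
    b = mu_y / sd_y
    return a, b

def ratio_from_t(t, sd_x, sd_y, rho):
    """Assumed affine map from T = (V1 + a)/(V2 + b) back to W = X/Y."""
    return (sd_x / sd_y) * (rho + math.sqrt(1.0 - rho ** 2) * t)

# Deterministic check on one pair of standard normal "draws" (arbitrary values).
mu_x, mu_y, sd_x, sd_y, rho = 1.0, 2.0, 1.5, 0.5, 0.3
v1, v2 = 0.4, -0.7
a, b = marsaglia_constants(mu_x, mu_y, sd_x, sd_y, rho)
x = mu_x + sd_x * (rho * v2 + math.sqrt(1.0 - rho ** 2) * v1)
y = mu_y + sd_y * v2
w_direct = x / y
w_via_t = ratio_from_t((v1 + a) / (v2 + b), sd_x, sd_y, rho)
print(w_direct, w_via_t)
```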
Since our focus here is on providing a convenient, direct closed form expression for the density of $W$, as is often required in applications, and is also to be used in a few examples in this article, we will not consider the ratio $(U_1 + a)/(U_2 + b)$ approach in our main theorems. But a generalization of our problem to ratios of variables from scale mixtures of bivariate normals, as presented in Sec. 7, is essentially based on that ratio. In Theorem 1, to derive such a density for $W$, we will adopt Springer's approach at the first step (1984, p. 118) but will use Hermite functions instead of Mellin transforms at the following step.
It is recalled that Pitman (1939) noticed that, for binormal $(X, Y)$, the two random variables

\[ S_1 = \frac{X - \mu_X}{\sigma_X} + \frac{Y - \mu_Y}{\sigma_Y} \quad \text{and} \quad S_2 = \frac{X - \mu_X}{\sigma_X} - \frac{Y - \mu_Y}{\sigma_Y} \]

are independent normal variates, with zero means and variances $2(1 + \rho)$ and $2(1 - \rho)$, respectively. This argument is particularly useful in deriving results on confidence intervals for $\sigma_X^2/\sigma_Y^2$, as we will see in Sec. 6.

Lemma 1. We have the following relation between the Hermite function and Kummer's confluent hypergeometric function,

\[ H_{-2}(z) + H_{-2}(-z) = {}_1F_1(1; 1/2; z^2), \quad \forall z. \tag{8} \]

Proof. Using Relation (4) for $\nu = -2$, and adding the two expressions $H_{-2}(z)$ and $H_{-2}(-z)$, we obtain the above identity.
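
Identity (8) is easy to confirm numerically. The sketch below evaluates $H_{-2}$ from its integral representation with a composite Simpson rule and ${}_1F_1$ from its truncated power series; both helpers are minimal illustrations written for this check (hypothetical names, fixed truncation choices), not library routines.

```python
import math

def hermite_m2(z, upper=12.0, n=20000):
    """H_{-2}(z) = int_0^inf t*exp(-t^2 - 2tz) dt, composite Simpson on [0, upper]."""
    h = upper / n  # n must be even for Simpson's rule

    def g(t):
        return t * math.exp(-t * t - 2.0 * t * z)

    total = g(0.0) + g(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0

def kummer_1f1(a, b, x, terms=400):
    """Truncated series sum_k (a)_k/(b)_k * x^k/k! for moderate x."""
    term, total = 1.0, 1.0
    for k in range(terms):
        term *= (a + k) / (b + k) * x / (k + 1)
        total += term
    return total

z = 0.7  # arbitrary test point
lhs = hermite_m2(z) + hermite_m2(-z)
rhs = kummer_1f1(1.0, 0.5, z * z)
print(lhs, rhs)
```

The same helpers also reproduce $H_{-2}(0) = 1/2$, which follows from Eq. (3) by direct integration.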

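Theorem 1. Let $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$ be independent normal variables. Then $W = X/Y$ has density

\[ f(w; \mu_X, \mu_Y, \sigma_X, \sigma_Y) = \frac{K_1}{\sigma_Y^2 w^2 + \sigma_X^2} \cdot {}_1F_1(1; 1/2; \theta_1(w)), \quad -\infty < w < \infty, \tag{9} \]

where

\[ \theta_1(w) = \frac{1}{2\sigma_X^2 \sigma_Y^2} \cdot \frac{\left(\sigma_Y^2 \mu_X w + \mu_Y \sigma_X^2\right)^2}{\sigma_Y^2 w^2 + \sigma_X^2} \ge 0 \tag{10} \]

and

\[ K_1 = \frac{\sigma_X \sigma_Y}{\pi} \exp\left[ -\frac{1}{2} \left( \frac{\mu_X^2}{\sigma_X^2} + \frac{\mu_Y^2}{\sigma_Y^2} \right) \right]. \tag{11} \]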

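Proof. The density of $W$ is in fact given by the very simple integral

\[ f(w) = \int_{-\infty}^{\infty} |y|\, f_1(wy)\, f_2(y)\, dy, \quad -\infty < w < \infty. \]

In order to use Hermite functions, which are obtained as integrals over $(0, \infty)$, we first reparametrize the normal $X \sim N(\mu_X, \sigma_X^2)$ to $N^*(\alpha_1, \beta_1)$, where $\alpha_1 = (2\sigma_X^2)^{-1}$ and $\beta_1 = -\mu_X/\sigma_X^2$, with $\alpha_1 > 0$ and $-\infty < \beta_1 < \infty$. For $X \sim N^*(\alpha_1, \beta_1)$, its density is $f(x; \alpha_1, \beta_1) = C_1 \exp(-\alpha_1 x^2 - \beta_1 x)$, $-\infty < x < \infty$, with $C_1 = \sqrt{\alpha_1/\pi}\, \exp(-\beta_1^2/(4\alpha_1))$, and hence, it is a generalization of (5). Similarly, $Y \sim N(\mu_Y, \sigma_Y^2)$ is reparametrized into $N^*(\alpha_2, \beta_2)$.
Let us now decompose the two normal densities $f_1$ and $f_2$ into their positive and negative parts, i.e., $f_i = f_i^+ + f_i^-$, $i = 1, 2$, where $f_i^+ = f_i$ for $t > 0$, and $f_i^+ = 0$ for $t < 0$. Similarly, $f_i^- = f_i$ for $t < 0$, and $f_i^- = 0$ for $t > 0$. With the normalizing constants $C_1$ and $C_2$ factored out into $K_0$, so that $f_i^{\pm}$ below stand for the corresponding kernels $\exp(-\alpha_i t^2 - \beta_i t)$, we then have, for the density of $W$,

\[ f(w; \alpha_1, \alpha_2, \beta_1, \beta_2) = K_0 \left[ \int_0^\infty x_2\, f_1^+(w x_2)\, f_2^+(x_2)\, dx_2 + \int_0^\infty x_2\, f_1^-(-w x_2)\, f_2^-(-x_2)\, dx_2 \right] \quad \text{for } w > 0, \]

and similarly

\[ f(w; \alpha_1, \alpha_2, \beta_1, \beta_2) = K_0 \left[ \int_0^\infty x_2\, f_1^-(w x_2)\, f_2^+(x_2)\, dx_2 + \int_0^\infty x_2\, f_1^+(-w x_2)\, f_2^-(-x_2)\, dx_2 \right] \quad \text{for } w < 0, \]

where

\[ K_0 = C_1 C_2 = \frac{\sqrt{\alpha_1 \alpha_2}}{\pi} \exp\left[ -\left( \frac{\beta_1^2}{4\alpha_1} + \frac{\beta_2^2}{4\alpha_2} \right) \right]. \]

Let us consider the case $w > 0$. Making the change of variable $t = x_2 \sqrt{\alpha_1 w^2 + \alpha_2}$, the first integral $I_1$ becomes

\[ I_1 = \frac{1}{\alpha_1 w^2 + \alpha_2} \int_0^\infty t \exp\left( -t^2 - \frac{\beta_1 w + \beta_2}{\sqrt{\alpha_1 w^2 + \alpha_2}}\, t \right) dt. \]

Using Relation (3), we have

\[ K_0 I_1 = \frac{K_0}{\alpha_1 w^2 + \alpha_2}\, H_{-2}(\eta_1(w)), \quad \text{where } \eta_1(w) = \frac{\beta_1 w + \beta_2}{2\sqrt{\alpha_1 w^2 + \alpha_2}}. \tag{12} \]

Similarly, the second integral gives $K_0 I_2 = \dfrac{K_0}{\alpha_1 w^2 + \alpha_2}\, H_{-2}(-\eta_1(w))$, and hence, for $w > 0$,

\[ f(w; \alpha_1, \alpha_2, \beta_1, \beta_2) = \frac{K_0}{\alpha_1 w^2 + \alpha_2} \left[ H_{-2}(\eta_1(w)) + H_{-2}(-\eta_1(w)) \right]. \]

Replacing $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$ by their values in terms of $\mu_X$, $\mu_Y$, $\sigma_X$, and $\sigma_Y$, and using Lemma 1, we obtain Expressions (9) to (11), where $\theta_1(w) = \eta_1(w)^2$.
For the case $w < 0$, we have its first integral equal to $I_2$, while its second integral equals $I_1$. Hence we have the same expression for $f(w; \alpha_1, \alpha_2, \beta_1, \beta_2)$ for $w < 0$.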
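
As an illustration of Theorem 1, the sketch below evaluates the density (9) to (11) and checks that it integrates to one. Two choices here are assumptions of the sketch rather than part of the article: ${}_1F_1(1; 1/2; x)$ is evaluated through the elementary identity ${}_1F_1(1; 1/2; x) = 1 + \sqrt{\pi x}\, e^{x} \operatorname{erf}(\sqrt{x})$ for $x \ge 0$, and the integral over the real line uses the substitution $w = \tan u$ with a midpoint rule. The parameter values are arbitrary, and this direct form can overflow when the means are many standard deviations from zero.

```python
import math

def kummer_1_half(x):
    """1F1(1; 1/2; x) for x >= 0 via 1 + sqrt(pi*x) * exp(x) * erf(sqrt(x))."""
    r = math.sqrt(x)
    return 1.0 + math.sqrt(math.pi) * r * math.exp(x) * math.erf(r)

def ratio_density_indep(w, mu_x, mu_y, sd_x, sd_y):
    """Density of W = X/Y for independent normals, Eqs. (9)-(11)."""
    denom = sd_y ** 2 * w ** 2 + sd_x ** 2
    theta = (sd_y ** 2 * mu_x * w + mu_y * sd_x ** 2) ** 2 / (2.0 * sd_x ** 2 * sd_y ** 2 * denom)
    k1 = sd_x * sd_y / math.pi * math.exp(-0.5 * (mu_x ** 2 / sd_x ** 2 + mu_y ** 2 / sd_y ** 2))
    return k1 / denom * kummer_1_half(theta)

def total_mass(mu_x, mu_y, sd_x, sd_y, n=20000):
    """Integral of the density over the real line, using w = tan(u)."""
    h = math.pi / n
    s = 0.0
    for i in range(n):
        w = math.tan(-math.pi / 2.0 + (i + 0.5) * h)
        s += ratio_density_indep(w, mu_x, mu_y, sd_x, sd_y) * (1.0 + w * w)
    return s * h

mass = total_mass(1.0, 2.0, 1.0, 1.0)                      # arbitrary parameters
cauchy_gap = ratio_density_indep(0.8, 0.0, 0.0, 1.0, 1.0) - 1.0 / (math.pi * (1.0 + 0.8 ** 2))
print(mass, cauchy_gap)
```

With zero means and equal variances the function reduces to the standard Cauchy density, which is what `cauchy_gap` measures.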

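Remarks. 1. The above results are valid for all values of $\sigma_X, \sigma_Y > 0$ and of $\mu_X, \mu_Y$ in $\mathbb{R}$. For the particular case of $\mu_X = \mu_Y = 0$, since ${}_1F_1(1; 1/2; 0) = 1$, we have

\[ f(w; \sigma_X, \sigma_Y) = \frac{\sigma_X \sigma_Y}{\pi \left( \sigma_Y^2 w^2 + \sigma_X^2 \right)}, \quad -\infty < w < \infty, \]

a generalized form of the Cauchy distribution. Moreover, when $\sigma_X = \sigma_Y$, we have the standard Cauchy distribution $f(w) = 1/[\pi(w^2 + 1)]$, $-\infty < w < \infty$, as is well known.
2. When the coefficient of variation of a normal variable is small enough, the area of the related distribution, truncated at zero, either from the left or from the right, depending on the case, is near unity, and this truncated distribution can practically be taken as the original normal distribution. Hence, under the hypothesis that both normal variables have coefficients of variation less than 0.25, for example, one of the contributions to the sum $H_{-2}(\eta_1(w)) + H_{-2}(-\eta_1(w))$ becomes negligible, and we can just use the other, with the corresponding adjustment in the normalizing coefficient to

\[ C(0, \beta, \alpha) = \frac{\exp(-\beta^2/(4\alpha))}{\sqrt{\pi/\alpha} \cdot \left[1 - \Phi\left(\beta/\sqrt{2\alpha}\right)\right]}, \]

as given by (5). Geary (1930) considered this case for $Y$ and showed that $\dfrac{\mu_X - \mu_Y W}{\sqrt{\sigma_X^2 + \sigma_Y^2 W^2}}$ is nearly standard normal.
For example, let both $X$ and $Y$ have small coefficients of variation, so that their densities can be almost taken as truncated (from below, at the origin) normal densities. For $X$, we have

\[ f(t; 0, \mu_X, \sigma_X) = C(0, \beta_1, \alpha_1) \exp\left[-(\beta_1 t + \alpha_1 t^2)\right], \quad t \ge 0, \]

with $\alpha_1$ and $\beta_1$ defined as previously, and

\[ C(0, \beta_1, \alpha_1) = \frac{\exp\left(-\beta_1^2/(4\alpha_1)\right)}{\sqrt{\pi/\alpha_1} \cdot \left[1 - \Phi\left(\beta_1/\sqrt{2\alpha_1}\right)\right]}, \]

as given by (5), while a similar expression holds for $Y$. Then the density of $W = X/Y$ on $(0, \infty)$ can now be obtained as a Hermite function:

\[ f_W(w; \alpha_1, \beta_1, \alpha_2, \beta_2) = \left[ \prod_{i=1}^{2} C(0, \beta_i, \alpha_i) \right] \cdot \frac{1}{\alpha_1 w^2 + \alpha_2}\, H_{-2}(\eta_1(w)). \]

Expressed in terms of $(\mu_i, \sigma_i)$, $i = 1, 2$, we have

\[ f(w; \mu_X, \mu_Y, \sigma_X, \sigma_Y) = \frac{\bar{K}_1}{\sigma_Y^2 w^2 + \sigma_X^2} \cdot H_{-2}(\bar{\eta}_1(w)), \quad 0 \le w < \infty, \]

where

\[ \bar{\eta}_1(w) = -\frac{1}{\sigma_X \sigma_Y \sqrt{2}} \cdot \frac{\sigma_Y^2 \mu_X w + \mu_Y \sigma_X^2}{\sqrt{\sigma_Y^2 w^2 + \sigma_X^2}} \]

and

\[ \bar{K}_1 = \frac{\sigma_X \sigma_Y}{\pi \prod_{i=1}^{2} \left[1 - \Phi(-\mu_i/\sigma_i)\right]} \exp\left[ -\frac{1}{2} \left( \frac{\mu_X^2}{\sigma_X^2} + \frac{\mu_Y^2}{\sigma_Y^2} \right) \right]. \]

3. For the case of the inverse of a normal variable, $W = 1/Y$, the numerator $X \equiv 1$ can be considered as a degenerate variable. However, the density of $W$,

\[ f_W(w) = \frac{1}{w^2 \sigma_Y \sqrt{2\pi}} \exp\left[ -\frac{(1 - w\mu_Y)^2}{2\sigma_Y^2 w^2} \right], \quad -\infty < w < \infty, \]

has to be established directly, and $f_W$ is not defined at 0 and is symmetrical w.r.t. the vertical axis when $\mu_Y = 0$.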

4. Ratio of Two Dependent Normal Variables


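Two dependent variables can have normal marginal distributions while their joint density is either bivariate normal or of another form. Let $(X, Y) \sim BVN(\mu_X, \mu_Y; \sigma_X, \sigma_Y; \rho)$, i.e., let $(X, Y)$ have the bivariate normal density of the form

\[ f_{X,Y}(x, y; \mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho) = A \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left( \frac{x - \mu_X}{\sigma_X} \right)^2 - 2\rho \left( \frac{x - \mu_X}{\sigma_X} \right) \left( \frac{y - \mu_Y}{\sigma_Y} \right) + \left( \frac{y - \mu_Y}{\sigma_Y} \right)^2 \right] \right\}, \quad -\infty < x, y < \infty, \]

where $A = \left[ 2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2} \right]^{-1}$, with $\sigma_X, \sigma_Y > 0$, $-\infty < \mu_X, \mu_Y < \infty$, and $-1 < \rho < 1$.

Theorem 2. Let $(X, Y) \sim BVN(\mu_X, \mu_Y; \sigma_X, \sigma_Y; \rho)$. Then the density of $W = X/Y$ is

\[ f_W(w; \mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho) = K_2 \cdot \frac{2(1 - \rho^2)\, \sigma_X^2 \sigma_Y^2}{\sigma_Y^2 w^2 - 2\rho \sigma_X \sigma_Y w + \sigma_X^2} \cdot {}_1F_1(1; 1/2; \theta_2(w)), \tag{13} \]

$-\infty < w < \infty$, where

\[ \theta_2(w) = \frac{\left[ -\sigma_Y^2 \mu_X w + \rho \sigma_X \sigma_Y (\mu_Y w + \mu_X) - \mu_Y \sigma_X^2 \right]^2}{2\sigma_X^2 \sigma_Y^2 (1 - \rho^2) \left( \sigma_Y^2 w^2 - 2\rho \sigma_X \sigma_Y w + \sigma_X^2 \right)} \ge 0 \]

and

\[ K_2 = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \cdot \exp\left[ -\frac{\sigma_Y^2 \mu_X^2 - 2\rho \sigma_X \sigma_Y \mu_X \mu_Y + \mu_Y^2 \sigma_X^2}{2(1 - \rho^2)\, \sigma_X^2 \sigma_Y^2} \right]. \]

Proof. The proof uses exactly the same arguments as before, after reparametrizing the bivariate density, as we did in the previous case, and decomposing $f(x, y)$ according to the signs of $x$ and $y$ in the four quadrants of the plane, as $f_{XY}^{++}$, $f_{XY}^{+-}$, $f_{XY}^{-+}$, and $f_{XY}^{--}$. We will not reproduce the proof here, but another proof, using a different approach, can be found in Sec. 7.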

Figure 1. Fieller’s ratio distribution for Egyptian archaeological data.

With the above closed form expression, several questions related to the distribution of $W$ can now be easily handled. For example, Korhonen and Narula (1989) used a complex approach to compute $P(|W| \le w_0)$, but the same result can be obtained by using the density $g(w) = f_W(w; \mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho) + f_W(-w; \mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho)$, with $f_W$ given by (13), and by computing $\int_0^{w_0} g(w)\, dw$.
As another application of the above expression, let us consider Fieller's archaeological study. Fieller (1932, p. 436) compiled two measurements, made on the temporal bone, $Y$, and the parietal bone, $X$, on the left-hand side of 787 Egyptian skulls, and gave the following data: $\bar{x} = 111.207$, $\bar{y} = 86.019$, $s_x = 5.788$, $s_y = 3.845$, and $r_{xy} = 0.174$. He was interested in finding the distribution of $X/Y$. Using these values as parameter values in Eq. (13), we obtain Fig. 1, and the distribution obtained has mean 1.295 and variance $6.575 \times 10^{-3}$.
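
The Fieller example can be reproduced approximately with a few lines of code. The sketch below evaluates Eq. (13), again through the identity ${}_1F_1(1; 1/2; x) = 1 + \sqrt{\pi x}\, e^{x} \operatorname{erf}(\sqrt{x})$, with the exponential factor of $K_2$ merged into the exponent to avoid overflow (both are implementation choices of this sketch). The moments are computed by a midpoint rule on the interval $[0.8, 1.9]$, which is an assumption: it holds essentially all of the probability mass for these parameter values, so the heavy but negligible tails are ignored.

```python
import math

def ratio_density_bvn(w, mu_x, mu_y, sd_x, sd_y, rho):
    """Density of W = X/Y for bivariate normal (X, Y), Eq. (13), in a stable form."""
    q = 1.0 - rho ** 2
    d = sd_y ** 2 * w ** 2 - 2.0 * rho * sd_x * sd_y * w + sd_x ** 2
    num = -sd_y ** 2 * mu_x * w + rho * sd_x * sd_y * (mu_y * w + mu_x) - mu_y * sd_x ** 2
    theta = num ** 2 / (2.0 * sd_x ** 2 * sd_y ** 2 * q * d)
    # c is the exponent appearing in K2; theta <= c always holds.
    c = (sd_y ** 2 * mu_x ** 2 - 2.0 * rho * sd_x * sd_y * mu_x * mu_y
         + mu_y ** 2 * sd_x ** 2) / (2.0 * q * sd_x ** 2 * sd_y ** 2)
    front = sd_x * sd_y * math.sqrt(q) / (math.pi * d)
    r = math.sqrt(theta)
    return front * (math.exp(-c) + math.sqrt(math.pi) * r * math.exp(theta - c) * math.erf(r))

def moments(params, lo=0.8, hi=1.9, n=4000):
    """Mass, mean and variance on [lo, hi] by the midpoint rule."""
    h = (hi - lo) / n
    m0 = m1 = m2 = 0.0
    for i in range(n):
        w = lo + (i + 0.5) * h
        p = ratio_density_bvn(w, *params) * h
        m0 += p
        m1 += w * p
        m2 += w * w * p
    mean = m1 / m0
    return m0, mean, m2 / m0 - mean ** 2

fieller = (111.207, 86.019, 5.788, 3.845, 0.174)  # values quoted in the text
mass, mean, var = moments(fieller)
print(mass, mean, var)
```

The same function with `-w` added gives the density $g(w)$ used for $P(|W| \le w_0)$.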

Remarks. 1. When $\mu_X = \mu_Y = 0$ and $\sigma_X = \sigma_Y = 1$, we have the standard bivariate normal density with correlation coefficient $\rho$. Then (13) reduces to

\[ f(w; \rho) = \frac{\sqrt{1 - \rho^2}}{\pi \left( 1 - 2\rho w + w^2 \right)}, \quad -\infty < w < \infty, \]

which is the same result obtained by Fieller (1932) and Springer (1984, p. 156). Similarly, Geary (1930) considered the two standardized variables and arrived at the same result. Naturally, if $\rho = 0$, this density reduces to the Cauchy one, as expected.

2. When $\rho = 0$, the two variables $X$ and $Y$ are independent, and we can verify that Theorem 2 reduces to Theorem 1, i.e., $\theta_2(w) = \theta_1(w)$ and $2\sigma_X^2 \sigma_Y^2 K_2 = K_1$, and the same density for $W = X/Y$ is obtained. The cases $\rho = \pm 1$ lead to a degenerate bivariate distribution and are not considered.

Again, as in the case where the variables are independent, when the coefficient of variation of either variable, or of both, is small, such a variable can be taken as positive, or better as a normal variable truncated from below at the origin, and a simpler expression of the approximate density for $W$ can be obtained. Hinkley (1969) studied this case, and showed that $F(w)$, the cumulative distribution of $W$, converges to

\[ \Phi\left( \frac{\mu_Y w - \mu_X}{\sigma_X \sigma_Y\, a(w)} \right), \quad \text{where } a(w) = \left( \frac{w^2}{\sigma_X^2} - \frac{2\rho w}{\sigma_X \sigma_Y} + \frac{1}{\sigma_Y^2} \right)^{1/2}. \]

He also studied the accuracy of approximating $F(w)$ by the above expression. Shanmurgalingam (1982) presented a Monte Carlo study of the density of $W$, with its shape varying from mildly normal to very skewed, depending on the two coefficients of variation $C_X$ and $C_Y$. A similar study along these lines can be carried out here, using expression (13), but there is much less need to approximate the cumulative distribution function of $W$ now that the closed form expression of its density is available.
3. The approach adopted here to derive the density of the ratio $W$, based on a decomposition into positive components and the use of Hermite and Kummer functions, can be applied to the joint distribution studied by Ruymgaart (1973), where $f(x, y)$ is not bivariate normal, but the marginal distributions of $X$ and $Y$ are normal. This is a discrete mixture, with equal weights, of two bivariate normal densities, with mean vectors zero and correlation coefficients $-1/2$ and $1/2$, respectively:

\[ f(x, y) = \frac{1}{2} \left[ \varphi_{-1/2}(x, y) + \varphi_{1/2}(x, y) \right], \]

where

\[ \varphi_\rho(x, y) = \frac{1}{2\pi\sqrt{1 - \rho^2}} \exp\left[ -\frac{x^2 - 2\rho x y + y^2}{2(1 - \rho^2)} \right], \quad -\infty < x, y < \infty. \]

Similarly, the results can be applied to the conditional variables $X \mid Y = y$ and $Y \mid X = x$, when they are both normal, with the joint distribution of $(X, Y)$ bivariate normal, or not. In the first case, we then have $X \mid Y = y \sim N\left(\mu_X + \rho(\sigma_X/\sigma_Y)(y - \mu_Y),\ \sigma_X^2(1 - \rho^2)\right)$ and similarly $Y \mid X = x \sim N\left(\mu_Y + \rho(\sigma_Y/\sigma_X)(x - \mu_X),\ \sigma_Y^2(1 - \rho^2)\right)$, and the density of their ratio $(X \mid Y = y)/(Y \mid X = x)$ can be obtained from Theorem 1, since, for determined values $x_0$ and $y_0$ of $x$ and $y$, $X \mid Y = y_0$ and $Y \mid X = x_0$ are conditionally independent. An example for the second case is provided by the non-bivariate normal density

\[ f(x, y) = C \exp\left\{ -\left[ x^2 + y^2 + 2xy(x + y + xy) \right] \right\}, \quad -\infty < x, y < \infty, \]

given by Castillo and Galambos (1989). We then have

\[ (X \mid Y = y_0) \sim N\left( \frac{-y_0^2}{1 + 2y_0 + 2y_0^2},\ \frac{1}{2 + 4y_0 + 4y_0^2} \right) \quad \text{and} \quad (Y \mid X = x_0) \sim N\left( \frac{-x_0^2}{1 + 2x_0 + 2x_0^2},\ \frac{1}{2 + 4x_0 + 4x_0^2} \right), \]

and the distribution of their ratio $(X \mid Y = y_0)/(Y \mid X = x_0)$ can be obtained from Theorem 1, again by conditional independence. The case of ratios of variables from discrete, as well as continuous, scale mixtures of bivariate normal distributions will be further discussed in Sec. 7.

5. Shapes of the Density of W


The various shapes that the density of $W$ can take make this topic of particular interest. $W$ has either a unimodal or a bimodal density, sometimes with the second mode barely noticeable and quite distant from the first. Here, it is more convenient that $W = X/Y$, with $(X, Y) \sim BVN(\mu_1, \mu_2; \sigma_1, \sigma_2; \rho)$, be put under a form related to $T = (V_1 + a)/(V_2 + b)$, with $V_1$ and $V_2$ being independent standard normal variates $N(0, 1)$, and $a$ and $b$ given by (7). We can see that it suffices to study the case $a, b \ge 0$ for the distribution of $T$, since other cases can be obtained from this case by an orthogonal transformation. In fact, with $V_1$ and $V_2$ independent, defining

\[ T(a, b) = \frac{V_1 + a}{V_2 + b}, \quad V_1, V_2 \sim N(0, 1),\ a, b > 0, \]

we have, in distribution, $T(a, b) = -T(a, -b) = T(-a, -b) = -T(-a, b)$.

Lemma 2. For $c \ge d > 0$ the function $R(z) = {}_1F_1(c; d; z)/{}_1F_1(c + 1; d + 1; z)$, $z > 0$, is nondecreasing in $z$ and is bounded below by 1.

Proof. We have first $R(0) = 1$. Writing $R(z) = (c/d)\, E_z\left[(d + K)/(c + K)\right]$, where $K$ is a discrete r.v. with parameter $z$ and mass function $P_z(k) \propto \dfrac{(c+1)_k}{(d+1)_k} \dfrac{z^k}{k!}$, we can see that this family of probability mass functions has an increasing monotone likelihood ratio. The result follows since the ratio $\dfrac{d + K}{c + K}$ is nondecreasing in $K$.
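
Lemma 2 can be illustrated numerically for the pair $c = 1$, $d = 1/2$ that is used later in the proof of Proposition 1. The sketch below relies on a truncated power series for ${}_1F_1$, which is adequate for the moderate arguments shown; the helper names and the grid are choices made for this illustration only.

```python
def kummer_1f1(a, b, x, terms=400):
    """Truncated series sum_k (a)_k/(b)_k * x^k/k! for moderate x >= 0."""
    term, total = 1.0, 1.0
    for k in range(terms):
        term *= (a + k) / (b + k) * x / (k + 1)
        total += term
    return total

def lemma2_ratio(c, d, z):
    """R(z) = 1F1(c; d; z) / 1F1(c+1; d+1; z)."""
    return kummer_1f1(c, d, z) / kummer_1f1(c + 1.0, d + 1.0, z)

grid = [0.5 * i for i in range(21)]                 # z = 0, 0.5, ..., 10
values = [lemma2_ratio(1.0, 0.5, z) for z in grid]
for z, r in zip(grid, values):
    print(f"z = {z:4.1f}  R(z) = {r:.6f}")
```

For this pair the printed values start at 1 and increase with $z$, staying below $c/d = 2$.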

Proposition 1. The density of $T$ has the following properties:

(a) For $t > 0$, $f_T(t)$ has a mode at $m_0$, with $0 < m_0 \le a/b$, and $f_T(0)$ is a decreasing function of $a$, when $b$ is fixed.
(b) For either $a = 0$ or $b = 0$ (but not both), we have $f_T(t)$ symmetrical w.r.t. the vertical axis.
(c) In the general case, with $0 < a$ and $0 < b$, $f_T(t; a, b)$ is asymmetric w.r.t. the vertical axis, and higher to the right, i.e., for any value $t_0 > 0$, we have $f_T(t_0) \ge f_T(-t_0)$, and is either unimodal or bimodal.

Proof. By Theorem 1, $T$ has as density

\[ f_T(t; a, b) = \frac{K_1}{t^2 + 1} \cdot {}_1F_1(1; 1/2; \theta_1(t)), \tag{14} \]

where $\theta_1(t) = (at + b)^2/[2(t^2 + 1)] \ge 0$ and $K_1 = (1/\pi) \exp[-(a^2 + b^2)/2]$.

(a) We always have: $a^2/2 \le \theta_1(t) \le (a^2 + b^2)/2$ for $0 \le t$, and hence $f_T(0) = K_1 \cdot {}_1F_1(1; 1/2; b^2/2) > 0$ has the above property. Using the properties of the function $\theta_1(t)$, which is increasing in $(-b/a, a/b)$ and decreasing outside, for $t > 0$, $f_T(t)$ has a mode at $m_0$, with $0 < m_0 \le a/b$.
(b) In the general case, the derivative of $f_T(t)$ is

\[ f_T'(t) = 2K_1\, \frac{{}_1F_1(2; 3/2; \theta_1(t))}{(1 + t^2)^2} \left[ \frac{(b + at)(a - bt)}{1 + t^2} - t \cdot \frac{{}_1F_1(1; 1/2; \theta_1(t))}{{}_1F_1(2; 3/2; \theta_1(t))} \right]. \tag{15} \]

We have $f_T'(a/b) < 0$, while $f_T'(0) > 0$, and the density is always increasing at the origin. Also, the sign of $f_T'(t)$ depends only on the sign of the function

\[ \psi(t) = \frac{(b + at)(a - bt)}{1 + t^2} - t \cdot \frac{{}_1F_1(1; 1/2; \theta_1(t))}{{}_1F_1(2; 3/2; \theta_1(t))}, \]

where by Lemma 2 the ratio $\zeta(t) = \dfrac{{}_1F_1(1; 1/2; \theta_1(t))}{{}_1F_1(2; 3/2; \theta_1(t))}$ is an increasing function of $t > 0$ and takes the value 1 at the origin, $1 \le \zeta(t)$. Hence, for $a = 0$, $b > 0$, $f_T(t)$ is unimodal, since as a function of $t$, $\psi(t)/\zeta(t)$ decreases from 0. For $b = 0$, if $a \le 1$, $\psi(t)/\zeta(t)$ is negative for $t > 0$, and hence $f_T(t)$ decreases on $(0, \infty)$ and we have only one mode. For $a > 1$, $\psi(t)/\zeta(t)$ is positive, then negative, and hence $f_T(t)$ has two symmetric modes at $m_0$ and $-m_0$. Also, because $\psi\left(\sqrt{a^2 - 1}\right) < 0$, we have $m_0 < \sqrt{a^2 - 1}$. Although zero is a point of continuity, it is also a turning point, and the two values of $f_T'(t)$, to the left and the right of zero, can be obtained by direct computation.
(c) When $a \neq 0$ and $b \neq 0$, using the bounds of $\zeta(t)$ above, we can see that $\psi(t)$ has at most three roots, one of which can be shown to be always positive, while the two others, when they exist, are negative.
If $\psi(t)$ has two roots on the negative axis, there will be a second mode at $m_1 < 0$. Then we have $f_T(m_1) \le f_T(-m_1)$. Furthermore, using the increasing property of $\theta_1(t)$ in $(-b/a, 0)$, we have $f_T(t)$ increasing in this interval, and hence $m_1 < -b/a$.
In the case of no negative root, $f_T'(t)$ is always positive on the negative axis and there is only one mode at $m_0 > 0$. The intermediary case of a double negative root also gives a unimodal density, and the corresponding values of $a$ and $b$ determine the boundary values between the bimodal and the unimodal forms of the density, as given by Marsaglia (1965).
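
The unimodal and bimodal behavior described in the proof can be observed by counting local maxima of $f_T$ on a grid. The sketch below is a rough numerical illustration, not a proof: the grid spacing, the search window, and the use of the identity ${}_1F_1(1; 1/2; x) = 1 + \sqrt{\pi x}\, e^{x} \operatorname{erf}(\sqrt{x})$ are all choices made here, and a very shallow second mode could be missed by a coarse grid.

```python
import math

def f_t(t, a, b):
    """Density (14) of T = (V1 + a)/(V2 + b) for independent standard normals V1, V2."""
    theta = (a * t + b) ** 2 / (2.0 * (1.0 + t * t))
    r = math.sqrt(theta)
    k1 = math.exp(-(a * a + b * b) / 2.0) / math.pi
    kummer = 1.0 + math.sqrt(math.pi) * r * math.exp(theta) * math.erf(r)
    return k1 / (1.0 + t * t) * kummer

def count_modes(a, b, half_points=2000, step=0.01):
    """Number of strict local maxima of f_T on a symmetric grid around zero."""
    vals = [f_t((i - half_points) * step, a, b) for i in range(2 * half_points + 1)]
    return sum(
        1
        for i in range(1, len(vals) - 1)
        if vals[i] > vals[i - 1] and vals[i] > vals[i + 1]
    )

modes_small_a = count_modes(0.5, 0.0)   # b = 0, a <= 1: a single mode is expected
modes_large_a = count_modes(2.0, 0.0)   # b = 0, a > 1: two symmetric modes are expected
print(modes_small_a, modes_large_a)
```

For $a, b > 0$ the same function shows the asymmetry of part (c), for example `f_t(1.0, 1.0, 1.0) > f_t(-1.0, 1.0, 1.0)`.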

To investigate further the last point, we proceed as follows.

For $a > 0$ we compute the values of $b > 0$ so that the equation $\psi(t) = 0$ has a single positive root and a double negative root. The positive root corresponds to the mode for $w > 0$, but the double negative root, denoted $z_0$, will determine the values of $b$ for which there is no second mode. At the same time, we compute the values of $z_0$.
Numerical results show that, as already pointed out by Marsaglia (1965), in the $(a, b)$ plane, a curve $f_P(a, b)$ with vertical asymptote at about $a = 2.257$ delimits the values of $(a, b)$ for the two possible shapes of $f$: the density is unimodal on the left of that curve, while it is bimodal on its right. In Fig. 4, the space curve $f(a, b, z_0)$, defined in the one-eighth subspace $\mathbb{R}^3_{++-}$, has its projection $f_P(a, b)$ on the $(a, b)$ plane. Figures 2 and 3 give $f_T(t; a, b)$ for some values of $a$ and $b$.
The distribution of the ratio of two linear combinations of independent normal variables, coming from univariate or bivariate distributions, can now be obtained directly from Theorems 1 and 2. It is encountered in several operations research and engineering applications.

Figure 2. Unimodal densities for X/Y .

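Proposition 2. (a) Let $X_i$, $i = 1, \ldots, n$, and $Y_j$, $j = 1, \ldots, m$, be $m + n$ independent normal variables, $X_i \sim N(\mu_{X_i}, \sigma_{X_i}^2)$, $Y_j \sim N(\mu_{Y_j}, \sigma_{Y_j}^2)$, and let $V = T_1/T_2$, where $T_1 = \sum_{i=1}^{n} \alpha_i X_i$ and $T_2 = \sum_{j=1}^{m} \beta_j Y_j$. Then the density of $V$ is given by (9), where we have

\[ \mu_{T_1} = \sum_{i=1}^{n} \alpha_i \mu_{X_i}, \quad \mu_{T_2} = \sum_{j=1}^{m} \beta_j \mu_{Y_j}, \quad \sigma_{T_1}^2 = \sum_{i=1}^{n} \alpha_i^2 \sigma_{X_i}^2, \quad \text{and} \quad \sigma_{T_2}^2 = \sum_{j=1}^{m} \beta_j^2 \sigma_{Y_j}^2. \]

When $\mu_{T_1} = \mu_{T_2} = 0$, $V$ has a generalized Cauchy-type distribution.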

Figure 3. Bimodal densities for X/Y .


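Figure 4. Space curve $f(a, b, z_0)$ relating the boundary between unimodal and bimodal densities with the negative abscissa of the related point and its projection.

(b) Similarly, let $(X_1, Y_1), \ldots, (X_k, Y_k)$ be $k$ independent normal vectors, with $(X_i, Y_i) \sim N(\mu_{i1}, \mu_{i2}; \sigma_{i1}, \sigma_{i2}; \rho_i)$, $i = 1, \ldots, k$. Then the density of the ratio of the two linear combinations of corresponding components, $V = \sum_{i=1}^{k} \alpha_i X_i \big/ \sum_{i=1}^{k} \beta_i Y_i$, is given by (13), with

\[ \mu_1 = \sum_{i=1}^{k} \alpha_i \mu_{i1}, \quad \mu_2 = \sum_{i=1}^{k} \beta_i \mu_{i2}, \quad \sigma_1^2 = \sum_{i=1}^{k} \alpha_i^2 \sigma_{i1}^2, \quad \sigma_2^2 = \sum_{i=1}^{k} \beta_i^2 \sigma_{i2}^2, \quad \text{and} \quad \rho = \sum_{i=1}^{k} \alpha_i \beta_i \rho_i \sigma_{i1} \sigma_{i2} \Big/ (\sigma_1 \sigma_2). \]

Proof. Immediate by applying Theorems 1 and 2.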

6. Applications
There are several known applications of the density of X/Y . For example, in
linear regression, the ratio of the two least squares estimates of the regression
line, which are the intercept and the slope, has this density. Marsaglia (1965)
mentioned the distribution of red cells, which motivated his research on this topic,
and Shanmurgalingam (1982) mentioned digestibility, or the ratio of the weight of a
component of a plant to that of the whole plant, and presented a Monte Carlo study
related to this ratio. In operations research, the ratio of strength to stress, when
both factors are normally distributed, can now have its density studied in detail, and
in most cases, Remark 2 of Sec. 3 would apply since both are positive, with small
coefficients of variation. In what follows, we provide two other applications, one in
the domain of education, where evaluation of students’ performance over related
academic subjects for forecasting purposes is an important concern. The other, in
finite sampling theory, uses the distribution of this ratio and has been mentioned
frequently in the ratio estimating approach.

Figure 5. Density of X Y, with X = introduction to English, Y = English literature, and
box plots.

A. For example, in education, the joint distribution of the final grades of first-year English ($X$) and second-year English literature ($Y$) can be taken as approximately $BVN(75.25, 71.58; 6.25^2, 5.45^2; 0.76)$. These parameter values are, in fact, the corresponding sample values, obtained from a large sample of 427 students in the last 3 years, who took the two courses with the same professors. The density of $(X, Y)$ is given by Fig. 5, and the two box plots are also exhibited there.

1. If we look first into the two marginal distributions of X and Y , the ratio
W1 = X1 /Y1 , its density given by Theorem 1, is denoted f1 in Fig. 6 and reflects the
distribution of this ratio for X1 and Y1 , which have the same marginal distributions

Figure 6. Densities of X/Y , and of marginal and conditional ratios.


1584 Pham-Gia et al.

as X and Y but are considered independent. This distribution has mean 1.057 and
variance 0.014.
2. How academic achievements in English literature are related to the ones in
introductory English is better reflected by the distribution of W2 = X/Y as given
by Theorem 2. This density, denoted by f2 , is also given by Fig. 6 and gives the
distribution of this ratio for any value X Y of the above bivariate distribution.
It has mean 1.054 and variance 0.0034. So both the mean and the variance have
decreased, the variance significantly so.
3. To study further the homogeneity of the distributions, we can consider
the two conditional distributions $Y \mid X = 66$ and $X \mid Y = 63$, these two specific
values for $X$ and $Y$ being adopted as respective minimal passing grades for the
two courses (Fig. 5). Setting $\sigma^2_{X \mid y_0} = \sigma_X^2(1 - \rho^2)$ and
$\mu_{X \mid y_0} = \mu_X + \rho(\sigma_X/\sigma_Y)(y_0 - \mu_Y)$, where $y_0 = 63$, and similarly for
$\sigma^2_{Y \mid x_0}$ and $\mu_{Y \mid x_0}$, the density of the ratio
$W_3 = (Y \mid X = 66)/(X \mid Y = 63)$ is given by Theorem 1, since they can be considered
as conditionally independent, and its graph, denoted $f_3$, is given by Fig. 6. This
distribution has mean 1.031 and variance 0.0075. The mean has decreased and the
variance increased from the same measures of $f_2$, and various other conclusions can
be made.
4. When $X$ and $Y$ are independent, the ratio $\sigma_1^2/\sigma_2^2$ of their variances can
be estimated using the $F_{n_1-1,\,n_2-1}$-distribution, as is well-known. Although lesser-known,
a $(1 - \alpha)100\%$ confidence interval for this ratio can also be obtained for the
dependency case, based on Pitman's result (1939) already recalled. It is

$$B\,\frac{s_1^2}{s_2^2} \pm \frac{s_1^2}{s_2^2}\sqrt{B^2 - 1}, \quad \text{where } B = \frac{n - 2 + 2(1 - r^2)\,t_{\alpha/2}^2}{n - 2},$$

which is computed as $(1.186, 1.458)$, for the above $W_2$ variable, for example, at the
90% confidence level.
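The interval in item 4 can be reproduced numerically. The following is only a sketch under stated assumptions: the helper names are ours, the $t$ quantile is taken with $n - 2$ degrees of freedom (an assumption, since the text does not state them), and that quantile is approximated by a two-term Cornish-Fisher expansion around the normal quantile so that only the Python standard library is needed.

```python
import math
from statistics import NormalDist


def t_quantile_approx(p, df):
    """Approximate Student-t quantile (hypothetical helper).

    Uses the first two correction terms of the Cornish-Fisher expansion
    around the standard normal quantile; adequate for large df only.
    """
    z = NormalDist().inv_cdf(p)
    g1 = (z**3 + z) / 4.0
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96.0
    return z + g1 / df + g2 / df**2


def pitman_variance_ratio_ci(s1, s2, r, n, conf=0.90):
    """Interval B*s1^2/s2^2 +/- (s1^2/s2^2)*sqrt(B^2 - 1) from item 4.

    Assumption: the t quantile has n - 2 degrees of freedom.
    """
    alpha = 1.0 - conf
    t = t_quantile_approx(1.0 - alpha / 2.0, n - 2)
    ratio = s1**2 / s2**2
    big_b = (n - 2 + 2.0 * (1.0 - r**2) * t**2) / (n - 2)
    half_width = ratio * math.sqrt(big_b**2 - 1.0)
    return big_b * ratio - half_width, big_b * ratio + half_width


# Sample values of the English grades example: s1 = 6.25, s2 = 5.45, r = 0.76, n = 427.
lower, upper = pitman_variance_ratio_ci(6.25, 5.45, 0.76, 427, conf=0.90)
print(round(lower, 3), round(upper, 3))
```

Under these assumptions the limits agree with the interval quoted in the text to about two decimal places; the small remaining difference is of the size expected from rounding and the quantile approximation.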
B. The classical delta method for two nonnegative random variables gives

$$E(X/Y) \approx \frac{\mu_X}{\mu_Y}\left[1 + \left(\frac{\sigma_Y}{\mu_Y}\right)^2 - \frac{\mathrm{cov}(X, Y)}{\mu_X\mu_Y}\right]$$

and can be used for $(X, Y)$ binormal, when the two means are large compared to the
two standard deviations. Hence, depending on the magnitude of
$\delta = \frac{\sigma_Y}{\mu_Y}\left(\frac{\sigma_Y}{\mu_Y} - \rho\,\frac{\sigma_X}{\mu_X}\right)$,
we can use $E(W)$, with $W = X/Y$, to approximate $\mu_X/\mu_Y$ and vice-versa. On the other
hand, under the topic of ratio estimating, there are two intimately related problems.
First, in theoretical statistics, we wish to estimate the ratio of two unknown means,
$R = \mu_X/\mu_Y$, and the estimator is $T_n = \bar{X}/\bar{Y}$, the ratio of two sample means. Then,
if $\mu_Y$ is known, we can estimate $\mu_X$ by $\hat{\mu}_X = T_n\mu_Y$. This estimator is asymptotically
more efficient than $\bar{X}$ if and only if we have $2\rho\sigma_X\sigma_Y > T_n\sigma_Y^2$ (Shao, 2003, p. 205), i.e.,
the correlation between $X$ and $Y$ is large enough to pay off the variability caused
by using Y / Y instead of 1. An approximately unbiased estimator of X is  X +

X−W  /
Y  (Kendall et al., 1983, p. 236).
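The delta-method approximation at the start of part B is easy to evaluate. The sketch below uses a hypothetical helper name and writes $\delta$ for the quantity whose magnitude governs the approximation (the symbol is our labeling), so that the bracket in the formula equals $1 + \delta$. It is applied to the English grades parameters of part A and to the survey figures quoted later in this section.

```python
def delta_method_ratio_mean(mu_x, mu_y, sd_x, sd_y, rho):
    """Return (approximate E(X/Y), delta) from the delta-method formula.

    delta = (sd_y/mu_y) * (sd_y/mu_y - rho * sd_x/mu_x), so that
    E(X/Y) is approximated by (mu_x/mu_y) * (1 + delta).
    """
    cv_x = sd_x / mu_x  # coefficient of variation of X
    cv_y = sd_y / mu_y  # coefficient of variation of Y
    delta = cv_y * (cv_y - rho * cv_x)
    return (mu_x / mu_y) * (1.0 + delta), delta


# English grades example: independent marginals (rho = 0) and correlated case.
mean_indep, delta_indep = delta_method_ratio_mean(75.25, 71.58, 6.25, 5.45, 0.0)
mean_corr, delta_corr = delta_method_ratio_mean(75.25, 71.58, 6.25, 5.45, 0.76)

# Survey example: sample values used in place of the unknown parameters.
mean_survey, delta_survey = delta_method_ratio_mean(6.0, 8.0, 1.25, 2.15, 0.97)

print(round(mean_indep, 3), round(mean_corr, 3), round(delta_survey, 4))
```

These are first-order approximations only; the means reported for $f_1$ and $f_2$ in part A come from the exact densities, so agreement to about two decimal places is all that should be expected.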
Conversely, in classical finite sampling theory, with equal probabilities and
without replacement, we wish to estimate $\sum X$ and consider $\theta = \sum X/\sum Y$, the ratio
of the two totals of two populations, with the same number of elements $N$. We also
have, equivalently, $\theta = \mu_X/\mu_Y$, where $\mu_X$ and $\mu_Y$ are the two population means. If
we can estimate $\theta$ by $\hat{\theta}$, using, for example, the theoretical densities of $X$ and $Y$,
the estimate of $\mu_X$ is then $\hat{\mu}_X = \hat{\theta}\mu_Y$, and $N\hat{\mu}_X = N\hat{\theta}\mu_Y$ gives the estimation of $\sum X$,
supposing $\mu_Y$ is known.
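Paulson (1942), using Geary's results (1930) mentioned previously, suggested
a method to give an interval estimation of the ratio $\theta$, based on $\hat{\theta}$. Depending on
whether the parameters of the two theoretical distributions of $X$ and $Y$ are
known, we have two cases for the confidence limits of $\theta$, based on a sample of
size $n$, $(x_1, y_1), \ldots, (x_n, y_n)$. The following formulas apply, for the $(1 - \alpha)100\%$
confidence limits of $\theta$, which in turn will lead to those of the population total,

(a)
$$\frac{\left(n\bar{x}\bar{y} - z_{\alpha/2}^2\,\rho\sigma_X\sigma_Y\right) \pm \sqrt{\left(n\bar{x}\bar{y} - z_{\alpha/2}^2\,\rho\sigma_X\sigma_Y\right)^2 - \left(n\bar{y}^2 - z_{\alpha/2}^2\,\sigma_Y^2\right)\left(n\bar{x}^2 - z_{\alpha/2}^2\,\sigma_X^2\right)}}{n\bar{y}^2 - z_{\alpha/2}^2\,\sigma_Y^2}$$

in the first case and

(b)
$$\frac{\left(n\bar{x}\bar{y} - t_{\alpha/2,\,n-1}^2\,r s_X s_Y\right) \pm \sqrt{\left(n\bar{x}\bar{y} - t_{\alpha/2,\,n-1}^2\,r s_X s_Y\right)^2 - \left(n\bar{y}^2 - t_{\alpha/2,\,n-1}^2\,s_Y^2\right)\left(n\bar{x}^2 - t_{\alpha/2,\,n-1}^2\,s_X^2\right)}}{n\bar{y}^2 - t_{\alpha/2,\,n-1}^2\,s_Y^2}$$

when the values of the parameters are unknown, where

$$s_X^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}, \qquad s_Y^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1},$$

and $r$ is the sample correlation coefficient. Taking the mid-point as the estimate
$\hat{\theta}$ of $\theta$, we have $\hat{\mu}_X = \hat{\theta}\mu_Y$.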

Let us now consider $W = X/Y$. If its density is available, we can alternately
estimate $\theta$ by $E(W)$, if $\delta$ is small, as mentioned before. This density, when $X$
and $Y$ are normal, can be obtained from either Theorems 1 or 2, and having
computed $E(W)$ we estimate $\mu_X$ by $\hat{\mu}_X = E(W) \cdot \mu_Y$. As before, an approximate
unbiased estimator of $\mu_X$ is
 

W N −1 n

 + 
X −  if the coefficient is near 1
X

Y N n−1

For the Cauchy distribution, however, its mean theoretically does not exist. But,
in practice, as pointed out by Brown (2004), the mean of W is computable when the
Cauchy component of its density plays only a nonsignificant role in the density and
is called the pseudo-mean. It should be pointed out, too, that the principal value of
this mean exists (Stuart and Ord, 1987, p. 77).
For example, in a survey, we have found $\bar{x} = 6$, $\bar{y} = 8$, $s_X = 1.25$, $s_Y = 2.15$,
$r = 0.97$, while $\mu_Y$ has the value of 8.15. We wish to estimate $\mu_X$. Since $t_{4,\,0.025} =
2.7764$, Paulson's method gives a 95% confidence interval for $\theta$, the ratio of the two
means, as $0.773 \pm 0.086$. We then have $\hat{\mu}_X = 0.773(8.15) = 6.299$.
Since $\hat{\delta}$ can be computed to be 0.0179, using $\hat{\theta} = E(W)$, and the expression
of $f_W(w; 6, 8, 1.25, 2.15, 0.97)$, given by Theorem 2, is an approximate distribution
of $W$, we obtain its pseudo-mean $E(W) = 0.769$ and its variance 0.0122 by
numerical computation. We then have $\hat{\mu}_X = 0.769(8.15) = 6.267$. Hence the two
point estimates of $\mu_X$, according to the two approaches, are quite close.
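The first of these two estimates can be checked with a few lines of code. The sketch below evaluates formula (b), the unknown-parameter case of Paulson's limits, for the survey figures; the function name and argument names are ours, the sample size $n = 5$ is inferred from the 4 degrees of freedom quoted, and the critical value 2.7764 is taken directly from the text rather than computed.

```python
import math


def paulson_ratio_limits(n, x_bar, y_bar, s_x, s_y, r, t_crit):
    """Mid-point and half-width of the limits in formula (b).

    t_crit is the two-sided Student-t critical value with n - 1 degrees
    of freedom, supplied by the caller.
    """
    t2 = t_crit**2
    cross = n * x_bar * y_bar - t2 * r * s_x * s_y
    denom = n * y_bar**2 - t2 * s_y**2
    x_term = n * x_bar**2 - t2 * s_x**2
    disc = cross**2 - denom * x_term
    if denom <= 0 or disc < 0:
        # The limits do not form a finite interval in this situation.
        raise ValueError("limits are not a finite interval for these inputs")
    return cross / denom, math.sqrt(disc) / denom


# Survey example from the text: n = 5 (so 4 degrees of freedom), mu_Y = 8.15.
mid_point, half_width = paulson_ratio_limits(5, 6.0, 8.0, 1.25, 2.15, 0.97, 2.7764)
mu_x_estimate = mid_point * 8.15

print(round(mid_point, 3), round(half_width, 3), round(mu_x_estimate, 2))
```

The text multiplies the rounded mid-point 0.773 by 8.15, which is why it reports 6.299; carrying the unrounded mid-point through gives a value that differs only in the third decimal.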

7. Ratio of Variables from Mixtures of Normal Distributions

7.1.
The ratio of the two normal variables considered above is in fact a particular case
of the ratio of two variables from a bivariate normal distribution related by a
mixture process. To derive the general expression of its density, we first establish
an expression of the cumulative distribution function of W in the bivariate normal
case.

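Lemma 3. Let $W = X/Y$, where

$$(X, Y) \sim N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho). \quad (16)$$

Then we have

$$P(W \le t) = E_{U_2}\left[\Phi\left(\frac{(\tilde{t} - \rho)U_2 + \tilde{t}\eta_2 - \eta_1}{\sqrt{1 - \rho^2}} \cdot \mathrm{sgn}(U_2 + \eta_2)\right)\right], \quad (17)$$

where $\tilde{t} = (\sigma_2/\sigma_1)t$ and $(U_1, U_2) \sim BVN(0, 0, 1, 1, \rho)$, with $\eta_i = \mu_i/\sigma_i$, $i = 1, 2$.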

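Proof. Using the representation $W = \frac{\sigma_1}{\sigma_2} \cdot \frac{U_1 + \eta_1}{U_2 + \eta_2}$, as well as the relations
$U_1 \mid U_2 = u_2 \sim N(\rho u_2, 1 - \rho^2)$ and $U_2 \sim N(0, 1)$, we have

$$P(W \le t) = \int_{-\infty}^{\infty} \varphi(u_2)\, P\left(\frac{U_1 + \eta_1}{U_2 + \eta_2} \le \tilde{t} \,\Big|\, U_2 = u_2\right) du_2,$$

where $\varphi(y) = e^{-y^2/2}/\sqrt{2\pi}$ is the standard normal density. Considering the sign of
$U_2 + \eta_2$, we integrate in $(-\infty, -\eta_2)$ and $(-\eta_2, \infty)$ separately and obtain

$$P(W \le t) = \int_{-\infty}^{-\eta_2} \varphi(u_2)\, P\left(U_1 + \eta_1 \ge \tilde{t}(U_2 + \eta_2) \mid U_2 = u_2\right) du_2 + \int_{-\eta_2}^{\infty} \varphi(u_2)\, P\left(U_1 + \eta_1 \le \tilde{t}(U_2 + \eta_2) \mid U_2 = u_2\right) du_2$$

$$= \int_{-\infty}^{-\eta_2} \varphi(u_2)\, \Phi\left(\frac{-(\tilde{t} - \rho)u_2 - (\tilde{t}\eta_2 - \eta_1)}{\sqrt{1 - \rho^2}}\right) du_2 + \int_{-\eta_2}^{\infty} \varphi(u_2)\, \Phi\left(\frac{(\tilde{t} - \rho)u_2 + \tilde{t}\eta_2 - \eta_1}{\sqrt{1 - \rho^2}}\right) du_2,$$

and we have the above result.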

From the above proof, we can see that the same representation is valid for any
ratio of the form $(U_1 + a)/(U_2 + b)$. We can now obtain a representation of $W = X/Y$, in terms
of $T = (V_1 + a)/(V_2 + b)$, but with $V_1$ and $V_2$ independent, i.e., $(V_1, V_2) \sim BVN(0, 0, 1, 1, 0)$ and
$(a, b) = \left(\frac{\eta_1 - \rho\eta_2}{\sqrt{1 - \rho^2}},\ \eta_2\right)$.

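Proposition 3. Let $W = X/Y$, where $(X, Y) \sim BVN(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$. Then $W$ has the
same distribution as $T^* = \rho\,\frac{\sigma_1}{\sigma_2} + \frac{\sigma_1}{\sigma_2}\sqrt{1 - \rho^2}\; T$, where
$(a, b) = \left(\frac{\eta_1 - \rho\eta_2}{\sqrt{1 - \rho^2}},\ \eta_2\right)$, with $\eta_i = \mu_i/\sigma_i$,
$i = 1, 2$, as given in Lemma 3.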

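Proof. First, from Lemma 3, for $T$, we have, since $\rho = 0$ and $\sigma_1 = \sigma_2 = 1$ for
$(V_1, V_2)$,

$$P(T \le t) = E_{V_2}\left[\Phi\left((tV_2 + tb - a) \cdot \mathrm{sgn}(V_2 + b)\right)\right]. \quad (18)$$

Hence

$$P(T^* \le t) = P\left(T \le \frac{\tilde{t} - \rho}{\sqrt{1 - \rho^2}}\right) = E_{V_2}\left[\Phi\left(\left(\frac{\tilde{t} - \rho}{\sqrt{1 - \rho^2}}\,V_2 + \frac{(\tilde{t} - \rho)\,b}{\sqrt{1 - \rho^2}} - a\right) \cdot \mathrm{sgn}(V_2 + b)\right)\right]$$

and

$$P(T^* \le t) = E_{V_2}\left[\Phi\left(\frac{(\tilde{t} - \rho)V_2 + \tilde{t}\eta_2 - \eta_1}{\sqrt{1 - \rho^2}} \cdot \mathrm{sgn}(V_2 + \eta_2)\right)\right],$$

when replacing $a$ and $b$ by their values above. Since $V_2 \sim N(0, 1)$, this is also
expression (17) of $P(W \le t)$, as given by Lemma 3.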

It is hence worthwhile pointing out that Marsaglia’s claim (1965) that it suffices
to study the standard bivariate case is correct, and so is his relation (7).

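Lemma 4. Let $T = \frac{V_1 + a}{V_2 + b}$, with $(V_1, V_2) \sim BVN(0, 0, 1, 1, 0)$ and $a, b > 0$. Then the
density of $T$ is

$$f_T(t) = \frac{1}{\pi(1 + t^2)\exp\left(\frac{a^2 + b^2}{2}\right)} \cdot {}_1F_1\left(1; \frac{1}{2}; \frac{(at + b)^2}{2(1 + t^2)}\right), \quad -\infty < t < \infty.$$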

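Proof. This proof can be taken as another proof of Theorem 1, now using (18),
and arguments based on conditional expectations of random variables.
From (18), we have

$$f_T(t) = \int_{-\infty}^{\infty} (u_2 + b)\,\mathrm{sgn}(u_2 + b)\,\varphi\left(t(u_2 + b) - a\right)\varphi(u_2)\, du_2
= \int_{-\infty}^{\infty} u_2\,\mathrm{sgn}(u_2)\,\varphi(tu_2 - a)\,\varphi(u_2 - b)\, du_2,$$

by making the change of variable $u_2 \to u_2 - b$,

$$= \frac{1}{2\pi}\int_{-\infty}^{\infty} u_2\,\mathrm{sgn}(u_2)\exp\left(-\frac{(tu_2 - a)^2}{2}\right)\exp\left(-\frac{(u_2 - b)^2}{2}\right) du_2
= \frac{A}{2\pi}\int_{-\infty}^{\infty} u_2\,\mathrm{sgn}(u_2)\exp\left(-\frac{u_2^2}{2}(1 + t^2)\right)\exp\left(u_2(b + at)\right) du_2,$$

where

$$A = \exp\left(-\frac{a^2 + b^2}{2}\right),$$

$$= \frac{A}{2\pi}\int_{0}^{\infty} u_2\exp\left(-\frac{u_2^2}{2}(1 + t^2)\right)\left[\exp\left(u_2(b + at)\right) + \exp\left(-u_2(b + at)\right)\right] du_2.$$

Posing now $z = u_2^2(1 + t^2)/2$, we obtain

$$f_T(t) = \frac{A}{\pi(1 + t^2)}\int_{0}^{\infty} \exp(-z)\cosh\left((b + at)\sqrt{\frac{2z}{1 + t^2}}\right) dz.$$

Since $\cosh(c\sqrt{z}) = \sum_{k \ge 0} \frac{c^{2k}z^k}{(2k)!}$, and the Pochhammer coefficient $\left(\frac{1}{2}\right)_k = \frac{(2k)!}{2^{2k}k!}$,
we have

$$f_T(t) = \frac{A}{\pi(1 + t^2)}\int_{0}^{\infty} \exp(-z)\sum_{k \ge 0}\frac{1}{(2k)!}\left(\frac{2(b + at)^2}{1 + t^2}\right)^k z^k\, dz
= \frac{A}{\pi(1 + t^2)}\sum_{k \ge 0}\frac{(b + at)^{2k}}{2^k(1 + t^2)^k}\cdot\frac{1}{(1/2)_k}.$$

And hence

$$f_T(t) = \frac{A}{\pi(1 + t^2)} \cdot {}_1F_1\left(1; \frac{1}{2}; \frac{(at + b)^2}{2(1 + t^2)}\right).$$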
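Using Proposition 3, we have $T \equiv \frac{T^*\sigma_2/\sigma_1 - \rho}{\sqrt{1 - \rho^2}}$, and, hence, for $W = X/Y$, with
$(X, Y) \sim BVN(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$, we have

$$f_W(w) = \frac{1}{\pi\left(1 + t^*(w)^2\right)} \cdot \exp\left(-\frac{a^{*2} + b^{*2}}{2}\right) \cdot \frac{\sigma_2}{\sigma_1\sqrt{1 - \rho^2}} \cdot {}_1F_1\left(1; \frac{1}{2}; \frac{\left(a^*t^*(w) + b^*\right)^2}{2\left(1 + t^*(w)^2\right)}\right)$$

with

$$t^*(w) = \frac{w\sigma_2/\sigma_1 - \rho}{\sqrt{1 - \rho^2}}, \quad b^* = b/\sigma_2 \quad \text{and} \quad a^* = \frac{1}{\sqrt{1 - \rho^2}}\left(\frac{a}{\sigma_1} - \rho\,\frac{b}{\sigma_2}\right),$$

which gives the same expression as Theorem 2.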
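The density of Lemma 4 is straightforward to program. The sketch below is one possible implementation, not the authors' code: the function names are ours, and the Kummer function is evaluated through the identity ${}_1F_1(1; 1/2; x) = 1 + \sqrt{\pi x}\,e^{x}\,\mathrm{erf}(\sqrt{x})$ for $x \ge 0$, which can be checked term by term against the series $\sum_{k \ge 0} x^k/(1/2)_k$. As a sanity check, the total mass is computed after the substitution $t = \tan\theta$, which removes the heavy Cauchy-type tails from the integrand.

```python
import math


def kummer_1f1_one_half(x):
    """1F1(1; 1/2; x) for x >= 0, via the closed form with the error function."""
    if x < 0:
        raise ValueError("this closed form is used here for x >= 0 only")
    root = math.sqrt(x)
    return 1.0 + math.sqrt(math.pi) * root * math.exp(x) * math.erf(root)


def ratio_density(t, a, b):
    """Density of T = (V1 + a)/(V2 + b), V1 and V2 independent N(0, 1)."""
    scale = math.exp(-(a * a + b * b) / 2.0)
    x = (a * t + b) ** 2 / (2.0 * (1.0 + t * t))
    return scale / (math.pi * (1.0 + t * t)) * kummer_1f1_one_half(x)


def total_mass(a, b, n=2000):
    """Integral of the density over the real line (midpoint rule in theta)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2.0 + (i + 0.5) * h
        t = math.tan(theta)
        # dt = (1 + t^2) d(theta), which cancels the Cauchy factor.
        total += ratio_density(t, a, b) * (1.0 + t * t) * h
    return total


mass = total_mass(1.0, 2.0)
print(round(mass, 6))
```

For $a = b = 0$ the function reduces to the standard Cauchy density $1/(\pi(1 + t^2))$, which gives a second quick check.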

7.2.
We now consider the ratio $T = (U_1 + a)/(U_2 + b)$ for cases where the distribution of
$(U_1, U_2)$ admits the scale mixture bivariate normal representation $(U_1, U_2) \mid Z =
z \sim N_2(0, zI_2)$, where $I_2$ is the identity matrix, with $V = 1/Z$ having distribution
function $G(v)$. It is immediate that when $P(Z = 1) = 1$, we have essentially the case
considered in Theorem 1. Also, when the correlation $\rho(U_1, U_2 \mid z)$, between $U_1 \mid z$ and
$U_2 \mid z$ is $\rho \ne 0$, a change of variables, as in the above Proposition 3, is sufficient to
bring the problem back to the above case. As we will see, the density of $T$ is again
the product of the Cauchy density with another function.
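Theorem 3. Let $(U_1, U_2) \mid Z = z \sim N_2(0, zI_2)$. Then the density of $T = \frac{U_1 + a}{U_2 + b}$
is

$$f_T(t) = \frac{1}{\pi(1 + t^2)}\int_{0}^{\infty}\exp\left(-\frac{(a^2 + b^2)v}{2}\right) \cdot {}_1F_1\left(1; \frac{1}{2}; u(t)v\right) dG(v),
\quad \text{where } u(t) = \frac{(b + at)^2}{2(1 + t^2)}. \quad (19)$$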

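Proof. Conditioning on the scale, we have, by Lemma 4,

$$P(T \le t \mid Z = z) = P\left(\frac{U_1/\sqrt{z} + a/\sqrt{z}}{U_2/\sqrt{z} + b/\sqrt{z}} \le t \,\Big|\, Z = z\right)
= \int_{-\infty}^{t}\frac{1}{\pi(y^2 + 1)}\exp\left(-\frac{a^2 + b^2}{2z}\right) {}_1F_1\left(1; \frac{1}{2}; \frac{u(y)}{z}\right) dy.$$

Hence, setting $V = 1/Z$,

$$f_{T \mid V = v}(t) = \exp\left(-\frac{(a^2 + b^2)v}{2}\right) \cdot \frac{1}{\pi(t^2 + 1)}\, {}_1F_1\left(1; \frac{1}{2}; u(t)v\right),$$

and (19) follows.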



Remark. The case U1  U2 Z = z ∼ N2 0 zI2 , where zI2 = 0z 0z , with > 0, can
be treated similarly, and (19) becomes
√    2 
a + b2 /  v √
fT t = exp − · F 1 1/2 u tvdGv
1 + t2  0 2 2 1 1

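Applications. There are several potential applications of the density of $W$. In
particular, in the common cause failure topic of statistical reliability theory, we can
identify that stochastic cause with the r.v. $Z$, which affects the variances of the two
components of a system, which are related to each other by a bivariate normal
distribution, under either of the two covariance models above. Failure will occur if $W$ is less than
a certain threshold, and with the expression of the density $f_T$, we can compute the
related probability. For example, under $zI_2$,

(1) $Z$ is a discrete random variable, with mass function $P(Z = z_i) = p_i$,
$i = 1, 2, \ldots$. Then

$$f_T(t) = \frac{1}{\pi(1 + t^2)}\sum_{i \ge 1} p_i\, e^{-\frac{a^2 + b^2}{2z_i}}\, {}_1F_1\left(1; \frac{1}{2}; \frac{u(t)}{z_i}\right).$$

(2) $Z$ is a continuous random variable, e.g., $Z \sim \text{inv-Gamma}(\alpha, \beta)$ (or $V \sim
\text{Gamma}(\alpha, \beta)$). Then, by (19),

$$f_T(t) = \frac{1}{\pi(1 + t^2)}\cdot\frac{\beta^{\alpha}}{\Gamma(\alpha)}\int_{0}^{\infty}\sum_{k \ge 0}\frac{u(t)^k}{(1/2)_k}\, v^{\alpha + k - 1}\exp\left(-\frac{a^2 + b^2 + 2\beta}{2}\,v\right) dv$$

$$= \frac{1}{\pi(1 + t^2)}\left(\frac{2\beta}{a^2 + b^2 + 2\beta}\right)^{\alpha}\sum_{k \ge 0}\frac{(\alpha)_k (1)_k}{(1/2)_k\, k!}\left(\frac{2u(t)}{a^2 + b^2 + 2\beta}\right)^k$$

$$= \frac{K}{\pi(1 + t^2)} \cdot {}_2F_1\left(\alpha, 1; \frac{1}{2}; \frac{2u(t)}{a^2 + b^2 + 2\beta}\right),$$

where

$$K = \left(\frac{2\beta}{a^2 + b^2 + 2\beta}\right)^{\alpha} \quad \text{and} \quad u(t) = \frac{(b + at)^2}{2(1 + t^2)},$$

and ${}_2F_1$ is a Gauss hypergeometric function.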

For $\alpha = \beta = 1/2$, we have a more general form of the Cauchy distribution,
of the form $f_T(t) = \frac{c_1}{\pi\left[(t - c_2)^2 + c_1^2\right]}$, with scale parameter $c_1 = \frac{\sqrt{a^2 + b^2 + 1}}{b^2 + 1}$, and location
parameter $c_2 = \frac{ab}{b^2 + 1}$, since, in the above density, ${}_2F_1(\alpha, 1; 1/2; z)$ reduces to $\frac{1}{1 - z}$.
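The reduction to a Cauchy density can be confirmed numerically. The following sketch assumes that Gamma$(\alpha, \beta)$ is parameterized by shape $\alpha$ and rate $\beta$ (the reading consistent with the integral in case (2)); the function names are ours, and the Gauss hypergeometric function is summed directly from its power series, which converges here because its argument stays below 1.

```python
import math


def gauss_2f1(p, q, c, z, tol=1e-15, max_terms=100000):
    """Power series of 2F1(p, q; c; z); intended for 0 <= z < 1."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        term *= (p + k) * (q + k) / ((c + k) * (k + 1.0)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def mixture_ratio_density(t, a, b, alpha, beta):
    """Density of T when V = 1/Z is Gamma(shape alpha, rate beta)."""
    s = a * a + b * b + 2.0 * beta
    u = (b + a * t) ** 2 / (2.0 * (1.0 + t * t))
    k_const = (2.0 * beta / s) ** alpha
    return k_const / (math.pi * (1.0 + t * t)) * gauss_2f1(alpha, 1.0, 0.5, 2.0 * u / s)


def cauchy_form(t, a, b):
    """General Cauchy density with the scale c1 and location c2 given in the text."""
    c1 = math.sqrt(a * a + b * b + 1.0) / (b * b + 1.0)
    c2 = a * b / (b * b + 1.0)
    return c1 / (math.pi * ((t - c2) ** 2 + c1 ** 2))


# Compare the two expressions for alpha = beta = 1/2 at a few points.
gaps = [abs(mixture_ratio_density(t, 1.0, 2.0, 0.5, 0.5) - cauchy_form(t, 1.0, 2.0))
        for t in (-3.0, 0.0, 0.5, 2.0)]
print(max(gaps))
```

The values $a = 1$, $b = 2$ are arbitrary illustration choices; for other $\alpha$ and $\beta$ the same function evaluates the general density of case (2).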

8. Conclusion
The density of the ratio W = X/Y of two normal random variables has been shown
to have a convenient and simple closed form, when Hermite and Kummer functions
are used. These functions are easy to program on a computer and provide powerful
tools to deal with questions related to this ratio. The shapes of this density can be
determined from the representation of $W$ as a ratio of the form $(V_1 + a)/(V_2 + b)$, where $V_1$ and
$V_2$ are independent standard normal variates. Applications in other domains that
use this result can now be handled with ease. A generalization to ratios of variables
arising from scale mixtures of normal distributions is possible and presents a unified
approach to address this problem.

Acknowledgments
The research of Pham-Gia and Marchand was partially supported by NSERC of
Canada. The authors wish to thank a referee for some pertinent comments that have
helped to improve the presentation of the paper.

References
Brown, K. S. (2004). Ratio populations. http://www.mathpages.com/home/kmath042/kmath042.htm
Castillo, E., Galambos, J. (1989). Conditional distributions and the bivariate normal
distribution. Metrika 36:209–214.
Dickey, J. M. (1983). Multiple hypergeometric functions: probabilistic interpretations and
statistical uses. J. Amer. Statist. Asso. 78:628–637.
Fieller, E. C. (1932). The distribution of the index of a normal bivariate distribution.
Biometrika 24:428–440.
File stat 97ratio.html. http://www.pitt.edu/~wpilib/statfaq/97ratio.html.
Geary, R. C. (1930). The frequency distribution of the quotient of two normal variates.
J. Royal Statist. Soc. 97:442–446.
Hinkley, D. V. (1969). On the ratio of two correlated normal random variables. Biometrika
56:635–639.
Kendall, M., Stuart, A., Ord, J. K. (1983). The Advanced Theory of Statistics. Vol. 3. 4th ed.
New York: Macmillan.
Korhonen, P. J., Narula, S. C. (1989). The probability distribution of the ratio of the
absolute values of two normal variables. J. Statist. Comput. Simul. 33:173–182.
Lebedev, N. N. (1972). Special Functions and Their Applications. New York: Dover.
Marsaglia, G. (1965). Ratios of normal variables and ratios of sums of uniform variables.
J. Amer. Statist. Asso. 60:193–204.

Nicholson, C. (1943). The probability integral for two variables. Biometrika 33:59–72.
Paulson, E. (1942). A note on the estimation of some mean values for a bivariate
distribution. Ann. Math. Stat. XIII(4):440–445.
Pham-Gia, T. (1994). The hazard rate of the power-quadratic exponential family of
distributions. Statist. Prob. Lett. 20:375–382.
Pham-Gia, T., Turkkan, N. (2005). Distribution of ratios of random variables from the
power-quadratic exponential family and applications. Statistics 39(4):355–372.
Pitman, E. J. G. (1939). Note on normal correlation. Biometrika 31:9.
Ruymgaart, F. H. (1973). Non-normal bivariate densities with normal marginals and linear
regression functions. Statist. Neerlandica 27:11–17.
Shanmurgalingam, S. (1982). On the analysis of the ratio of two correlated normal variables.
Statistician 31:251–258.
Shao, J. (2003). Mathematical Statistics. New York: Springer-Verlag.
Springer, M. (1984). The Algebra of Random Variables. New York: John Wiley.
Stuart, A., Ord, K. (1987). Kendall's Advanced Theory of Statistics. Vol. 1. 5th ed. New York:
Oxford Univ. Press.
