
4 Coin Flipping as a Stochastic Process

Toss a coin repeatedly, recording 1 if it comes up Heads and 0 if it comes up Tails at each toss. Then we get a long sequence of 0s and 1s; let us call such a sequence a {0, 1}-sequence. It is random. In this chapter, we work with such random {0, 1}-sequences as material.
1.1 Mathematical model

For example, the concept ‘circle’ is obtained by abstracting an essence from various round objects in the real world. To deal with circles in mathematics, we consider an equation (x − a)² + (y − b)² = c² as a mathematical model. Namely, what we call a circle in mathematics is the set of all solutions of this equation, {(x, y) | (x − a)² + (y − b)² = c²}.
Similarly, to investigate random objects, since we cannot deal with them directly in mathematics, we consider their mathematical models. For example, when we say ‘n coin tosses’, it does not mean that we toss a real coin n times.

Fig. 1.1 Heads and Tails of 1 JPY coin

Example 1.1. Let {0, 1}³ denote the set of all {0, 1}-sequences of length 3:
{0, 1}³ := { ω = (ω1, ω2, ω3) | ωi ∈ {0, 1}, 1 ≤ i ≤ 3 }
= { (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1) }.
Let P({0, 1}³) be the power set†1 of {0, 1}³, i.e., the set of all subsets of {0, 1}³. A ∈ P({0, 1}³) is equivalent to A ⊂ {0, 1}³. Let #A denote the number of elements of A. Now, define a function P3 : P({0, 1}³) → [0, 1] := { x | 0 ≤ x ≤ 1 } by
P3(A) := #A / #{0, 1}³ = #A / 2³,   A ∈ P({0, 1}³)
(see Definition A.2), and functions ξi : {0, 1}³ → {0, 1}, i = 1, 2, 3, by
ξi(ω) := ωi,   ω = (ω1, ω2, ω3) ∈ {0, 1}³.   (1.2)

Each ξi is called a coordinate function. Then, we have
P3( { ω ∈ {0, 1}³ | ξ1(ω) = 1, ξ2(ω) = 0, ξ3(ω) = 1 } )
= P3( { ω ∈ {0, 1}³ | ω1 = 1, ω2 = 0, ω3 = 1 } )
= P3( { (1, 0, 1) } ) = 1/2³.   (1.3)
Under the correspondence
P ←→ P3,   {Xi}3i=1 ←→ {ξi}3i=1,
P3 and {ξi}3i=1 are a mathematical model of 3 coin tosses.
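The model ( {0, 1}³, P({0, 1}³), P3 ) is small enough to enumerate directly. Here is a minimal Python sketch (the names `P3` and `xi` are ours, mirroring the notation above) that reproduces the computation (1.3):

```python
from itertools import product

# Sample space {0, 1}^3: all {0, 1}-sequences of length 3.
omega_space = list(product([0, 1], repeat=3))

def P3(A):
    """P3(A) := #A / #{0, 1}^3 = #A / 2^3 for an event A, a subset of {0, 1}^3."""
    return len(A) / len(omega_space)

def xi(i, omega):
    """Coordinate function xi_i(omega) := omega_i, i = 1, 2, 3."""
    return omega[i - 1]

# The event of (1.3): xi_1 = 1, xi_2 = 0, xi_3 = 1, i.e. the single point (1, 0, 1).
A = {w for w in omega_space if (xi(1, w), xi(2, w), xi(3, w)) == (1, 0, 1)}
print(P3(A))  # 1/2^3 = 0.125
```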

The equation (x − a)² + (y − b)² = c² is not the unique mathematical model of ‘circle’. There are other models of it; e.g., a parametrized representation
x = c cos t + a,   y = c sin t + b,   0 ≤ t ≤ 2π.
You can select suitable mathematical models according to your particular purposes. The same is true for coin tosses. We can present another mathematical model of 3 coin tosses.

Example 1.2. (Borel’s model of coin tosses) For each x ∈ [0, 1) := { x | 0 ≤ x < 1 }, let di(x) ∈ {0, 1} denote the i-th digit of x in its binary expansion (Sec. A.2.2). We write the length of each semi-open interval [a, b) ⊂ [0, 1) as
P( [a, b) ) := b − a.
Here, the function P that returns the lengths of semi-open intervals is called the Lebesgue measure. Then, the length of the set of x ∈ [0, 1) for which d1(x), d2(x), d3(x) are 1, 0, 1, respectively, is
P( { x ∈ [0, 1) | d1(x) = 1, d2(x) = 0, d3(x) = 1 } )
= P( { x ∈ [0, 1) | 1/2 + 0/2² + 1/2³ ≤ x < 1/2 + 0/2² + 1/2³ + 1/2³ } )
= P( [5/8, 6/8) ) = 1/8.
In the number line with binary scale, it is expressed as the segment from 0.101 to 0.11:

0   0.001   0.01   0.011   0.1   0.101   0.11   0.111   1

Under the correspondence

P ←→ P, {Xi}3i=1 ←→ {di}3i=1,

P and {di}3i=1 are also a mathematical model of 3 coin tosses.
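Borel’s model can also be explored numerically. The following Python sketch (the helper `binary_digits` is ours) extracts binary digits of an x ∈ [0, 1) and confirms that the interval [5/8, 6/8) of x with digits 1, 0, 1 has length 1/8:

```python
def binary_digits(x, k):
    """First k binary digits d_1(x), ..., d_k(x) of x in [0, 1)."""
    digits = []
    for _ in range(k):
        x *= 2
        d = int(x)       # integer part is the next binary digit
        digits.append(d)
        x -= d
    return digits

# The set {x | d1(x)=1, d2(x)=0, d3(x)=1} is the dyadic interval [5/8, 6/8).
a = 1/2 + 0/4 + 1/8          # left endpoint, 0.101 in binary
b = a + 1/8                  # right endpoint, 0.110 in binary
print(b - a)                 # Lebesgue measure 0.125
print(binary_digits(0.625, 3))  # [1, 0, 1]
```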

Definition 1.1. (Probability distribution) Let Ω be a non-empty finite set, i.e., Ω ≠ ∅†3 and #Ω < ∞. Suppose that to each ω ∈ Ω there corresponds a real number 0 ≤ pω ≤ 1 so that
Σ_{ω∈Ω} pω = 1
(Sec. A.1.2). Then, we call the set of all pairs of ω and pω,
{(ω, pω) | ω ∈ Ω},
a probability distribution on Ω.

Proposition 1.1. Let (Ω, P(Ω), P) be a probability space. For A, B ∈ P(Ω), we have
(i) P(Aᶜ) = 1 − P(A),†5 in particular, P(∅) = 0,
(ii) A ⊂ B =⇒ P(A) ≤ P(B), and
(iii) P(A ∪ B) = P(A) + P(B) − P(A ∩ B), in particular, P(A ∪ B) ≤ P(A) + P(B).
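These identities are easy to verify computationally on the finite space of Example 1.1. A short Python sketch, with sample events A and B chosen arbitrarily for illustration:

```python
from itertools import product

# The probability space ({0,1}^3, P({0,1}^3), P3) of Example 1.1.
Omega = set(product([0, 1], repeat=3))

def P(A):
    return len(A) / len(Omega)

A = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}   # arbitrary sample events
B = {(0, 0, 1), (1, 1, 1)}

# (i) complement rule, (ii) monotonicity, (iii) inclusion-exclusion
assert P(Omega - A) == 1 - P(A)
assert P(A & B) <= P(A)                  # since A ∩ B ⊂ A
assert P(A | B) == P(A) + P(B) - P(A & B)
print("Proposition 1.1 holds for these events")
```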

Let us look closely at joint distributions and marginal distributions in the case of n = 2. Suppose that the joint distribution of random variables X1 and X2 is given by
P( X1 = a1i, X2 = a2j ) = pij,   i = 1, . . . , s1,   j = 1, . . . , s2.

Then, their marginal distributions are computed as
P( X1 = a1i ) = Σ_{j=1}^{s2} P( X1 = a1i, X2 = a2j ) = Σ_{j=1}^{s2} pij,   i = 1, . . . , s1,
P( X2 = a2j ) = Σ_{i=1}^{s1} P( X1 = a1i, X2 = a2j ) = Σ_{i=1}^{s1} pij,   j = 1, . . . , s2.
This situation is illustrated in the following table.
X2 \ X1                      | a11  ······  a1s1                            | marginal distribution of X2
a21                          | p11  ······  ps1,1                           | Σ_{i=1}^{s1} pi1
 ...                         |  ...   . . .   ...                           |  ...
a2s2                         | p1s2 ······  ps1,s2                          | Σ_{i=1}^{s1} pi,s2
marginal distribution of X1  | Σ_{j=1}^{s2} p1j  ······  Σ_{j=1}^{s2} ps1,j | Σ_{i=1,...,s1, j=1,...,s2} pij = 1
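The computation of marginals from a joint table can be sketched as follows, for a small hypothetical case with s1 = s2 = 2 (the values pij are made up for illustration):

```python
# p[i][j] = P(X1 = a1_{i+1}, X2 = a2_{j+1}); a hypothetical joint distribution.
p = [[0.1, 0.2],
     [0.3, 0.4]]
s1, s2 = len(p), len(p[0])

# Marginal of X1: sum over j.  Marginal of X2: sum over i.
marg_X1 = [sum(p[i][j] for j in range(s2)) for i in range(s1)]
marg_X2 = [sum(p[i][j] for i in range(s1)) for j in range(s2)]

print([round(v, 10) for v in marg_X1])  # [0.3, 0.7]
print([round(v, 10) for v in marg_X2])  # [0.4, 0.6]
```

Each marginal sums to 1, as it must for a probability distribution.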

Example 1.4. The coordinate functions ξi : {0, 1}³ → R, i = 1, 2, 3, defined on the probability space ( {0, 1}³, P({0, 1}³), P3 ) by (1.2) are random variables. They all have the same distribution:
{ (0, 1/2), (1, 1/2) }.
Their joint distribution is the uniform distribution on {0, 1}³:
{ ((0, 0, 0), 1/8), ((0, 0, 1), 1/8), ((0, 1, 0), 1/8), ((0, 1, 1), 1/8),
((1, 0, 0), 1/8), ((1, 0, 1), 1/8), ((1, 1, 0), 1/8), ((1, 1, 1), 1/8) }.

Example 1.5. A constant can be considered to be a random variable. Let


(Ω, P(Ω), P ) be a probability space and c ∈ R be a constant. Then, a
random variable X(ω) := c, ω ∈ Ω, has a distribution {(c, 1)}.

Random variables play the leading role in probability theory. “A ran-


dom variable X is defined on a probability space (Ω, P(Ω), P )” is inter-
preted as “ω is randomly chosen from Ω with probability P ({ω}), and ac-
cordingly the value X(ω) becomes random.” In general, choosing an ω ∈ Ω
and getting the value X(ω) is called sampling, and X(ω) is called a sample
value of X.

In probability theory, we always deal with random variables as functions, and we are indifferent to individual sample values or sampling methods. Consequently, random variables need not have the interpretation that they are random, and ω need not be chosen randomly. But in practical applications, such as mathematical statistics or the Monte Carlo method, sample values or sampling methods may become significant.

Generally, a probability space is just a stage that random variables enter. Given a distribution or a joint distribution, we often make a suitable probability space and define a random variable or several random variables on it, so that its distribution or their joint distribution coincides with the given one. For example, for any given distribution { (ai, pi) | i = 1, . . . , s }, define a probability space ( Ω, P(Ω), P ) and a random variable X by
Ω := {a1, . . . , as},   P({ai}) := pi,   X(ai) := ai,   i = 1, . . . , s.
Then, the distribution of X coincides with the given one. Similarly, for any given joint distribution
{ ((a1j1, . . . , anjn), pj1,...,jn) | j1 = 1, . . . , s1, . . . , jn = 1, . . . , sn },   (1.8)
define a probability space (Ω, P(Ω), P) and random variables X1, . . . , Xn by
Ω := {a11, . . . , a1s1} × · · · × {an1, . . . , ansn},
P({(a1j1, . . . , anjn)}) := pj1,...,jn,   j1 = 1, . . . , s1, . . . , jn = 1, . . . , sn,
Xi((a1j1, . . . , anjn)) := aiji,   i = 1, . . . , n.
Each Xi is a coordinate function. Then, the joint distribution of X1, . . . , Xn coincides with the given one (1.8). Such a realization of a probability space and random variable(s) is called the canonical realization. Example 1.4 shows the canonical realization of 3 coin tosses.
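A Python sketch of the canonical realization for a single given distribution (the distribution itself is a made-up example):

```python
# A hypothetical given distribution {(a_i, p_i)}.
dist = {2: 0.5, 3: 0.25, 5: 0.25}

# Canonical realization: Omega := {a_1, ..., a_s}, P({a_i}) := p_i, X(a_i) := a_i.
Omega = list(dist)
P = {a: dist[a] for a in Omega}   # point probabilities P({a_i})

def X(a):                         # the coordinate (identity) function
    return a

# The distribution of X coincides with the given one.
realized = {X(a): P[a] for a in Omega}
print(realized == dist)  # True
```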
1.4 Monte Carlo method

The Monte Carlo method is a numerical method to solve mathematical


problems by computer-aided sampling of random variables.

When #Ω is small, sampling of a random variable X : Ω → R is easy. If #Ω = 10⁸, we have only to specify a number of at most 9 decimal digits to choose an ω ∈ Ω, but when Ω = {0, 1}^(10⁸), a computer is indispensable for sampling. Let us consider the following exercise.
Exercise I. When we toss a coin 100 times, what is the probability p that it comes up Heads at least 6 times in succession?

Repeat ‘100 coin tosses’ 10⁶ times, and let S be the number of occurrences of “the coin comes up Heads 6 times in succession” among the 10⁶ trials. Then, by the law of large numbers, S/10⁶ will be a good approximate value of p with high probability. To do this, the total number of coin tosses we need is 100 × 10⁶ = 10⁸. Of course, we do not toss a real coin.
Namely, if Alice chooses an ω from {0, 1}^(10⁸) and computes the value of S(ω)/10⁶, she can get an approximate value of p with error less than 1/200 with probability at least 0.99.
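A computational sketch of this procedure, with 10⁴ rather than 10⁶ repetitions to keep it quick, and Python’s built-in pseudorandom generator standing in for genuine coin tosses (the seed is an arbitrary choice of ours):

```python
import random

def has_run_of_heads(tosses, run=6):
    """True if the 0/1 sequence contains `run` consecutive 1s (Heads)."""
    count = 0
    for t in tosses:
        count = count + 1 if t == 1 else 0
        if count >= run:
            return True
    return False

random.seed(12345)   # assumed seed, for reproducibility only
trials = 10_000
S = sum(has_run_of_heads([random.randint(0, 1) for _ in range(100)])
        for _ in range(trials))
print(S / trials)    # an estimate of p, roughly 0.55
```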

Now, to give the inequality (1.15) a practical meaning, Alice should be equally likely to choose an ω ∈ {0, 1}^(10⁸). This means that she should choose ω mainly from among random numbers, because they account for nearly all elements of {0, 1}^(10⁸).†14 However, as we saw in Sec. 1.2, it is impossible to choose a random number even by computer.

In most practical Monte Carlo methods, pseudorandom numbers are used instead of random numbers. A program that produces pseudorandom numbers (mathematically speaking, a function
g : {0, 1}^l → {0, 1}^n,   l < n)
is called a pseudorandom generator. For practical use, l is assumed to be small enough for Alice to be equally likely to choose ω′ ∈ {0, 1}^l, and on the other hand, n is assumed to be too large for her to be equally likely to choose ω ∈ {0, 1}^n. Namely, l ≪ n. The program produces g(ω′) ∈ {0, 1}^n from the ω′ ∈ {0, 1}^l that Alice has chosen. Here, g(ω′) is called a pseudorandom number, and ω′ is called its seed.
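As an illustration only, here is a toy pseudorandom generator expanding a 32-bit seed into 1000 bits by a linear congruential recurrence. It is not the generator referred to in this text, and a serious Monte Carlo computation would use a well-tested generator instead:

```python
def g(seed_bits):
    """Toy pseudorandom generator g: {0,1}^32 -> {0,1}^1000.

    Expands a 32-bit seed with a linear congruential recurrence
    (classic LCG constants). Illustrative only.
    """
    state = int("".join(map(str, seed_bits)), 2)
    out = []
    for _ in range(1000):
        state = (1103515245 * state + 12345) % 2**31  # LCG step
        out.append((state >> 16) & 1)                 # extract one bit
    return out

bits = g([0, 1] * 16)   # a 32-bit seed
print(len(bits))        # 1000
```

The same seed always yields the same output, which is exactly why g(ω′) is not a random number.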

For any ω′, the pseudorandom number g(ω′) is not a random number. Nevertheless, it is useful in some situations. In fact, in the case of Exercise I, there exists a suitable pseudorandom generator g : {0, 1}^238 → {0, 1}^(10⁸) such that an inequality
1.5 Infinite coin tosses
Borel’s model of coin tosses (Example 1.2) can give a mathematical model of not only 3 coin tosses but also arbitrarily many coin tosses. Furthermore, the sequence of functions {di}∞i=1 defined in Example 1.2 can be regarded as infinite coin tosses. Of course, there do not exist infinite coin tosses in the real world, but for several reasons it is important to consider them.

The contents of this section slightly exceed the level of this book.
Borel’s normal number theorem

Rational numbers are sufficient for practical computation, but to make calculus available, real numbers are essential. Just like this, the primary reason why we consider infinite coin tosses is that they are useful when we study the limit behavior of n coin tosses as n → ∞. Indeed, the fact that the probability space for n coin tosses varies as n varies is awkward, ugly, and inconvenient for advanced study of probability theory.
For example, Borel’s normal number theorem
P( { x ∈ [0, 1) | lim_{n→∞} (1/n) Σ_{i=1}^{n} di(x) = 1/2 } ) = 1   (1.16)
asserts just one of the analytic properties of the sequence of functions {di}∞i=1, but it is interpreted in the context of probability theory as “when we toss a coin infinitely many times, the asymptotic limit of the relative frequency of Heads is 1/2 with probability 1.” It is known that Borel’s normal number theorem implies Bernoulli’s theorem (1.13). Note that it is not easy to comprehend the exact meaning of (1.16). Intuitively, it means that the ‘length’ of the set
A := { x ∈ [0, 1) | lim_{n→∞} (1/n) Σ_{i=1}^{n} di(x) = 1/2 } ⊂ [0, 1)
is equal to 1, but since A is not a simple set like a semi-open interval, how to define its ‘length’ and how to compute it come into question. To solve these problems, we need measure theory.
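Although (1.16) concerns a limit that no finite computation can reach, we can illustrate the tendency it describes: for a ‘typical’ x, the running average of its binary digits approaches 1/2. A sketch, with pseudorandom bits standing in for the digits di(x) and an arbitrary seed of ours:

```python
import random

# Stand-ins for the binary digits d_1(x), d_2(x), ... of a 'typical' x in [0, 1).
random.seed(1)  # assumed seed, for reproducibility only
digits = [random.randint(0, 1) for _ in range(100_000)]

# Running averages (1/n) * sum_{i=1}^{n} d_i approach 1/2 as n grows.
for n in (10, 100, 1_000, 100_000):
    print(n, sum(digits[:n]) / n)
```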
Construction of Brownian motion

The second reason why we consider infinite coin tosses is that we can construct a random variable with an arbitrary distribution from infinite coin tosses. What is more, except for very special cases†15, any probabilistic object can be created from them. For example, we here construct from them a Brownian motion, the most important stochastic process both in theory and in practice. Define a function F : R → (0, 1) := { x | 0 < x < 1 } ⊂ R as follows.†16
F(t) := ∫_{−∞}^{t} (1/√(2π)) exp(−u²/2) du,   t ∈ R.
†15 For example, the construction of uncountably many (cf. p. 25) independent random variables.
†16 ∫_{−∞}^{t} is an abbreviation of lim_{R→∞} ∫_{−R}^{t}, which is also called an improper integral. See Remark 3.7.
Then, putting
X(x) := −∞ (x = 0),   X(x) := F⁻¹(x) (0 < x < 1),
it holds that
P(X < t) := P( { x ∈ [0, 1) | X(x) < t } )
= P( { x ∈ [0, 1) | x < F(t) } )
= P( [0, F(t)) ) = F(t),   t ∈ R.
Namely, X obeys the standard normal distribution.
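This is the inverse-transform construction of a normal random variable. A Python sketch, with `statistics.NormalDist().inv_cdf` playing the role of F⁻¹ and pseudorandom uniforms standing in for x (the seed is an arbitrary choice of ours):

```python
import random
from statistics import NormalDist

random.seed(7)  # assumed seed, for reproducibility only
xs = [random.random() for _ in range(50_000)]            # uniform x in [0, 1)
samples = [NormalDist().inv_cdf(x) for x in xs if 0.0 < x < 1.0]

# The empirical mean and variance should be close to 0 and 1.
mean = sum(samples) / len(samples)
var = sum(s * s for s in samples) / len(samples) - mean ** 2
print(round(mean, 2), round(var, 2))
```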
Here is an amazing idea: if we put
X1 := X(2⁻¹d1 + 2⁻²d3 + 2⁻³d6 + 2⁻⁴d10 + 2⁻⁵d15 + · · · ),
X2 := X(2⁻¹d2 + 2⁻²d5 + 2⁻³d9 + 2⁻⁴d14 + · · · ),
X3 := X(2⁻¹d4 + 2⁻²d8 + 2⁻³d13 + · · · ),
X4 := X(2⁻¹d7 + 2⁻²d12 + · · · ),
X5 := X(2⁻¹d11 + · · · ),
 ...
then each Xn obeys the standard normal distribution. We emphasize that each dk appears in only one Xn, which means that the value of each Xn does not influence any other Xn′ (n′ ≠ n). Namely, {Xn}∞n=1 are ‘independent’.
Now, we are in a position to define a Brownian motion {Bt}0≤t≤π (Fig. 1.6):
Bt := (t/√π) X1 + √(2/π) Σ_{n=1}^{∞} ((sin nt)/n) Xn+1,   0 ≤ t ≤ π.   (1.17)

Since we cannot sum infinitely many digits in practice, put
X̂1 := X(2⁻¹d1 + 2⁻²d2 + 2⁻³d3 + · · · + 2⁻³¹d31),
X̂2 := X(2⁻¹d32 + 2⁻²d33 + 2⁻³d34 + · · · + 2⁻³¹d62),
X̂3 := X(2⁻¹d63 + 2⁻²d64 + 2⁻³d65 + · · · + 2⁻³¹d93),
X̂4 := X(2⁻¹d94 + 2⁻²d95 + 2⁻³d96 + · · · + 2⁻³¹d124),
 ...
X̂1000 := X(2⁻¹d30970 + 2⁻²d30971 + 2⁻³d30972 + · · · + 2⁻³¹d31000),
and using these, define {B̂t}0≤t≤π by
B̂t := (t/√π) X̂1 + √(2/π) Σ_{n=1}^{999} ((sin nt)/n) X̂n+1,   0 ≤ t ≤ π.   (1.18)
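A sketch of (1.18) in Python, using Python’s pseudorandom bits in place of the digits dk, 31 bits per X̂n as above, and `NormalDist().inv_cdf` as F⁻¹ (the seed and the evaluation grid are arbitrary choices of ours):

```python
import math
import random
from statistics import NormalDist

random.seed(2024)  # assumed seed, for reproducibility only

def X_hat():
    """X applied to 31 pseudorandom binary digits: 2^-1 d + ... + 2^-31 d."""
    u = sum(random.randint(0, 1) * 2.0 ** -(k + 1) for k in range(31))
    return NormalDist().inv_cdf(u) if 0.0 < u < 1.0 else 0.0

N = 1000                          # number of X-hat terms, as in (1.18)
Xs = [X_hat() for _ in range(N)]

def B_hat(t):
    """Approximate Brownian motion (1.18) on [0, pi]."""
    s = (t / math.sqrt(math.pi)) * Xs[0]
    s += math.sqrt(2.0 / math.pi) * sum(
        math.sin(n * t) / n * Xs[n] for n in range(1, N))
    return s

# Evaluate a sample path on a grid, as in Fig. 1.6.
path = [B_hat(k * math.pi / 200) for k in range(201)]
print(path[0])  # B_0 = 0.0
```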
Fig. 1.6 A sample path of Brownian motion
