
Pontificia Universidad Catolica

Faculty of Physics

Notes on thermodynamics

Author: LPT
Professor: Marco Aurelio Diaz

March 2021
Contents

1 Appendix
1.1 Language of set theory
1.1.1 Other logical connectives
1.2 Axioms
1.2.1 Axiom of extensionality
1.2.2 Axiom of comprehension
1.2.3 Sets that can be defined by the comprehension axiom
1.2.4 Axiom of pairs
1.2.5 Axiom of union
1.2.6 Axiom of the parts
1.2.7 Classic sets defined in the ZF language
1.2.8 Axiom of infinity
1.2.9 Principle of induction

2 Combinatorics and theory of probability
2.0.1 Complementary principle
2.0.2 Principle of inclusion-exclusion
2.1 Counting with bijections
2.1.1 Boundary properties
2.2 Counting with recurrences
2.3 Variations
2.3.1 Case with repetition
2.3.2 Case without repetition
2.4 Combinations
2.5 Combinations with repetition
2.6 Permutations
2.7 Permutations with repetition
2.8 Theory of probability
2.8.1 Probability space
2.8.2 Properties of probabilities
2.8.3 Types of events
2.8.4 Probability function in finite sample spaces (theorem)
2.8.5 Classical probability (corollary)
2.8.6 Conditional probability
2.8.7 Random variable
2.8.8 Borel sets
2.8.9 Types of random variable
2.8.10 Properties of the distribution function
2.8.11 Types of random variables and their density function
2.8.12 Independent random variables
2.8.13 Moments of a random variable (and expectation)
2.8.14 Theorem of change of variable in distribution functions
2.8.15 Expectation in functions of a random variable (theorem)
2.8.16 Variance
2.8.17 Discrete uniform distribution
2.8.18 Bernoulli distribution
2.8.19 Binomial distribution
2.8.20 Geometric distribution
2.8.21 Poisson distribution

3 Basic definitions of thermodynamics


Chapter 1

Appendix

In this part of the document we will analyze, in an introductory way, some useful
concepts of axiomatic set theory (assuming that the student has knowledge of the
intuitive theory and of first-order logic) that will help in understanding the concepts
of combinatorics and probability theory.

1.1 Language of set theory


Axiomatic set theory is a system built on well-established principles
called axioms, which have the following properties:

1) The axioms are independent of each other.

2) It is not necessary to define new axioms to prove properties (completeness).
3) The axioms and their consequences do not produce contradictions (consistency).

In this way, the formation of a rigorous theory is ensured, one free of
circular reasoning and well-founded at every step. Any axiomatic theory also presents a
language with logical and non-logical symbols; in particular, set theory works
with a first-order language, which is formed by:

1) Logical symbols:
-Individual variables, commonly represented with the letters x, y, z, among others
-Logical connectives (basic): ¬, ∨, ∃
-Binary equality predicate: =

2) Non-logical symbols:

-Symbol of membership: ∈

Thus, an expression in the theory is generated from a sequence of symbols;
however, not just any sequence is a useful expression. For example, the following
sequence is an expression that does not say anything in particular:

∃∧ ∈=


The latter generates the need for a rule that allows one to obtain valid
expressions (in the underlying logic).

Definition 1.0 (Formula): A formula is defined recursively by three rules:

1) If x, y are variables, then x = y and x ∈ y are formulas.


2) If φ and γ are formulas, then: ¬φ and φ ∨ γ are formulas.
3) If x is a variable and φ is a formula, then (∃x)φ is a formula.

With this, we can easily differentiate between formulas and expressions without
logical sense, for example, in the following question:

Exercise: Determine whether the following expression is a formula (substantiate your
answer):

∃x ((x ∈ y) ∨ (z = x))
Answer: By 1, x ∈ y and z = x are formulas; we will call x ∈ y φ and z = x γ.
Thus, it is easy to see that φ ∨ γ is a formula by 2; call it ψ. Finally, (∃x)ψ
is a formula by 3; therefore, the latter expression is a formula.
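Since Definition 1.0 is purely recursive, it translates directly into a mechanical check. The following is a minimal illustrative sketch (not part of the original notes; all class and function names are invented) that represents expressions as small syntax trees and applies the three rules:

```python
# Hypothetical AST representation of ZF formulas; illustrative only.
from dataclasses import dataclass

@dataclass
class Var:        # an individual variable, e.g. Var("x")
    name: str

@dataclass
class Eq:         # x = y
    left: Var
    right: Var

@dataclass
class In:         # x ∈ y
    left: Var
    right: Var

@dataclass
class Not:        # ¬φ
    sub: object

@dataclass
class Or:         # φ ∨ γ
    left: object
    right: object

@dataclass
class Exists:     # (∃x)φ
    var: Var
    sub: object

def is_formula(e) -> bool:
    """Return True iff e is built by the three recursive rules of Definition 1.0."""
    if isinstance(e, (Eq, In)):          # rule 1: atomic formulas
        return True
    if isinstance(e, Not):               # rule 2: negation
        return is_formula(e.sub)
    if isinstance(e, Or):                # rule 2: disjunction
        return is_formula(e.left) and is_formula(e.right)
    if isinstance(e, Exists):            # rule 3: quantification
        return is_formula(e.sub)
    return False                         # anything else is not a formula

# The exercise's expression: (∃x)((x ∈ y) ∨ (z = x))
x, y, z = Var("x"), Var("y"), Var("z")
print(is_formula(Exists(x, Or(In(x, y), Eq(z, x)))))   # True
```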

1.1.1 Other logical connectives


With the basic logical symbols we can form the other common connectives of first-
order logic, as for example:

1) x ≠ y ≡ ¬(x = y)
2) φ ∧ γ ≡ ¬(¬φ ∨ ¬γ)
3) φ → γ ≡ ¬φ ∨ γ
4) φ ↔ γ ≡ (φ → γ) ∧ (γ → φ)
5) (∀x)φ ≡ ¬((∃x)¬φ)
6) ∃x ∈ y (φ) ≡ ∃x(x ∈ y ∧ φ)
7) x ∉ y ≡ ¬(x ∈ y)

Definition 1.1 (Bound and free variables): If there is an occurrence of the
variable x in φ, x is said to be bound if it appears anywhere in φ inside a formula
of the form (∃x)γ; otherwise, the variable is said to be free.

A common notation for formulas with free variables is φ(x₁, x₂, ..., xₙ), where
x₁, x₂, ..., xₙ represent these variables. Another commonly used terminology is
that of open or closed formulas: if a formula has no quantifiers it is said to be open,
and if it has no free variables it is called closed.

1.2 Axioms
The axioms presented below are those used in Zermelo–Fraenkel theory
(ZF); they will be introduced in order, trying to present the motivation behind
each one.

1.2.1 Axiom of extensionality


This axiom is motivated by the need to support the proposition, ”two sets are equal
if and only if they have the same elements”, since this assertion is independent of
the rules of equality proposed by Leibniz, which are:

Identity:

∀x(x = x)
Substitution:

∀x∀y(x = y ∧ φ(x) → φ(y))

From these rules it can be deduced that:

∀a∀b [a = b → ∀x(x ∈ a ↔ x ∈ b)]


but not its converse, so the converse is proposed as an axiom:

Axiom of extensionality

∀a∀b [∀x(x ∈ a ↔ x ∈ b) → a = b]

1.2.2 Axiom of comprehension


In intuitive set theory, a set could be defined by extension or comprehension. On the
one hand, extension refers to writing the elements one by one, while comprehension
designates the set as containing all the elements x that satisfy a given formula
φ(x). The problem arises when the English mathematician Bertrand Russell
postulated the following set:

R = {x : x ∉ x}

With the definition given at the time, asking the question "R ∈ R?" would lead
to an obvious contradiction:

R ∈ R ↔ R ∉ R
Therefore, mathematical objects such as these cannot be allowed to be sets; thus
the concept of class was born to designate them, and the axiom
of comprehension was created to differentiate them from sets. (It should be noted
that, since every element of the theory represents a set, classes of the form
A = {x : φ(x)} turn out to be just a notation to refer to a grouping of sets with a
common property.)

Axiom of comprehension

∀x∃y∀z (z ∈ y ↔ z ∈ x ∧ φ(z))

Thus, any class of type A = {x : φ(x)} is a set if and only if there is a set
containing it; in this way a set is written (by the axiom) with the notation
y = {z ∈ x : φ(z)}

Notations: Let A = {x : φ(x)} and B = {x : γ(x)} be classes; then:

1) A = B ≡ ∀x(φ(x) ↔ γ(x))
2) A ⊂ B ≡ ∀x(φ(x) → γ(x))
3) A ∩ B ≡ {x : x ∈ A ∧ x ∈ B}
4) A ∪ B ≡ {x : x ∈ A ∨ x ∈ B}
5) A − B ≡ {x : x ∈ A ∧ x ∉ B}

The classes that do not belong to any set are called proper classes; Russell's class
is evidently a proper class. However, there is another class that must be
taken into account as proper: the universe, that is, the class that contains
all the mathematical elements, defined by:

U ≡ {x : x = x}
This is a proper class, because if it were a set, then Russell's class could be
obtained from it by comprehension and would therefore also be a set, which would
generate the contradiction again.
Another consequence of the axiom of comprehension is that it allows us to define
the empty set, since given any set x, the axiom produces a set of the form:

∅ ≡ {z ∈ x : z ≠ z}
Exercise: Prove that the empty set is unique using the extensionality and com-
prehension axioms.
Answer: Suppose a and b are two empty sets, different from each other, defined
from sets z and y by:

a = {x ∈ z : x ≠ x}

b = {x ∈ y : x ≠ x}
By the comprehension axiom, for any x:

x ∈ a ↔ x ∈ z ∧ x ≠ x ↔ x ∈ z ∧ F ↔ F
x ∈ b ↔ x ∈ y ∧ x ≠ x ↔ x ∈ y ∧ F ↔ F

Finally:
x ∈ a ↔ F ↔ x ∈ b
→ a = b (by extensionality)

Therefore the empty set is unique (by contradiction).

1.2.3 Sets that can be defined by the comprehension axiom

If a and b are sets, then by the comprehension axiom it is possible to define the
intersection a ∩ b by choosing φ(x) = x ∈ b:

a ∩ b ≡ {x ∈ a : x ∈ b}
Another set that can be defined, choosing in this case φ(x) = x ∉ b, is the
difference set:

a − b ≡ {x ∈ a : x ∉ b}
Finally, we can define an arbitrary intersection of sets. If F ≠ ∅ and x ∈ F,
then:

⋂F ≡ {z ∈ x : ∀y ∈ F (z ∈ y)}

1.2.4 Axiom of pairs


The axiom of pairs states that given any two sets x and y, there will be a set whose
only elements are x and y.
Axiom of pairs

(∀x, y)(∃z)(∀a)[a ∈ z ↔ a = x ∨ a = y]
With this axiom the pairs can be defined:

{x, y} ≡ {a : a = x ∨ a = y}

{x, x} ≡ {x}
These help to state the definition of the ordered pair:

(x, y) ≡ {{x}, {x, y}}


Note: There are other ways to name the ordered pair, however, this is the most
common, but any definition that complies with the fundamental property of ordered
pairs is equally valid:

(a, b) = (c, d) ↔ a = c ∧ b = d

1.2.5 Axiom of union


This axiom, as its name indicates, is used to support the existence of the set union:

Axiom of union

∀a∃b∀x(x ∈ b ↔ ∃y ∈ a(x ∈ y))


From the latter, b can be understood as the arbitrary union of the sets in a:

⋃a ≡ {x : ∃y ∈ a(x ∈ y)}
and in the case a = {m, n}, we can see that:

⋃{m, n} ≡ {x : x ∈ m ∨ x ∈ n} = m ∪ n

1.2.6 Axiom of the parts


The axiom of the parts is defined to support the existence of the set of parts, and
says that:

∀x∃y∀z(z ∈ y ↔ z ⊂ x)

Note: It is important to remember that the definition of subset is described by


the following formula:

z ⊂ x ≡ ∀u(u ∈ z → u ∈ x)
By the axiom, the set of parts of x can be denoted by:

P(x) ≡ {z : z ⊂ x}

1.2.7 Classic sets defined in the ZF language


Since it is considered that the student has a deep understanding of intuitive set
theory, we will proceed to write as a review, all the classic sets as Cartesian product,
relations and functions, in the language defined above.

Definition 1.2 (Cartesian product): Given a and b, the cartesian prod-


uct a × b is defined as the set of all ordered pairs (x,y) such that x ∈ a and
y ∈ b:

a × b = {(x, y) : x ∈ a ∧ y ∈ b}

Note: The latter definition is a set since a × b ⊂ P (P (a ∪ b)), and P (P (a ∪ b))


is a set by the axiom of the parts.

Definition 1.3 (Binary relation): Given a and b, a binary relation R is
defined as a subset of the cartesian product a × b:

R = {(x, y) ∈ a × b : φ(x, y)}

Notation:

(x, y) ∈ R ≡ xRy

R⊂a×b≡R:a→b

Definition 1.4 (Domain): Given a binary relation R : a → b, Dom(R) is
defined as the set of elements x in a such that ∃y ∈ b(xRy):

Dom(R) = {x ∈ a : ∃y ∈ b(xRy)}

Definition 1.5 (Range): Given a binary relation R : a → b, Rang(R) is
defined as the set of elements y in b such that ∃x ∈ a(xRy):

Rang(R) = {y ∈ b : ∃x ∈ a(xRy)}

Definition 1.6 (Inverse relation): Given a binary relation R : a → b, the
inverse relation R⁻¹ is defined as the set of pairs (y, x) ∈ b × a such that xRy,
that is:

R⁻¹ = {(y, x) ∈ b × a : xRy}

Definition 1.7 (Compound relation): Let R : a → b and S : b → c;
then the compound relation S ◦ R is defined as:

S ◦ R = {(x, z) ∈ a × c : ∃y ∈ b(xRy ∧ ySz)}



Properties of homogeneous and heterogeneous binary relations

Homogeneous: Let R : a → a; then it is said that:

- R is reflexive ↔ ∀x ∈ a(xRx)
- R is symmetrical ↔ ∀x, y ∈ a(xRy → yRx)
- R is antisymmetric ↔ ∀x, y ∈ a(xRy ∧ yRx → x = y)
- R is transitive ↔ ∀x, y, z ∈ a(xRy ∧ yRz → xRz)
- R is total ↔ ∀x, y ∈ a(xRy ∨ yRx)
- R is irreflexive ↔ ∀x ∈ a(¬xRx)

Heterogeneous: Let R : a → b; then it is said that:

- R is IE (everywhere defined) ↔ ∀x ∈ a, ∃y ∈ b(xRy)
- R is OE (onto) ↔ ∀y ∈ b, ∃x ∈ a(xRy)
- R is UI (unique image) ↔ ∀x ∈ Dom(R), ∀y₁, y₂ ∈ b, [xRy₁ ∧ xRy₂ → y₁ = y₂]
- R is UO (unique origin) ↔ ∀y ∈ Rang(R), ∀x₁, x₂ ∈ a, [x₁Ry ∧ x₂Ry → x₁ = x₂]

Definition 1.8 (Application): A relation f : a → b is said to be an
application if f is UI and IE; therefore:

∀x ∈ a, ∃!y ∈ b, xf y

Notation:

xf y ≡ y = f (x)
Application properties

Let an application f : a → b; then:

1) f is injective ↔ ∀x₁, x₂ ∈ a(f(x₁) = f(x₂) → x₁ = x₂)
2) f is surjective ↔ ∀y ∈ b, ∃x ∈ a(f(x) = y)
3) f is bijective ↔ ∀y ∈ b, ∃!x ∈ a(f(x) = y)

Definition 1.9 (Inverse application): Let a bijective application f : a →
b; its inverse application is defined as the f⁻¹ : b → a such that:

∀x ∈ a, ∀y ∈ b (f⁻¹(y) = x ↔ f(x) = y)

Definition 2.0 (Compound application): Let f : a → b and g : b → c;
then the compound application g ◦ f : a → c is the one such that:

∀x ∈ a, (g ◦ f)(x) = g(f(x))



Order relations
Let a homogeneous relation R : a → a; then:
1) R is a preorder ↔ R is reflexive and transitive.
2) R is a partial order ↔ R is a preorder and antisymmetric.
3) R is a total order ↔ R is a partial order and total.

Notation: When the relation R : a → a is an order, it is written (a, ≤),
specifying which type it is.

Distinguished elements
Let a partial order (a, ≤); it is said that:
1) M is maximum ↔ M ∈ a ∧ ∀x ∈ a (x ≤ M)
2) m is minimum ↔ m ∈ a ∧ ∀x ∈ a (m ≤ x)

Notation:

≤⁻¹ ≡ ≥

3) p is maximal ↔ p ∈ a ∧ ∀x ∈ a (p ≤ x → p = x)
4) q is minimal ↔ q ∈ a ∧ ∀x ∈ a (q ≥ x → q = x)
5) b ⊂ a → [c is an upper bound of b ↔ c ∈ a ∧ ∀x ∈ b (x ≤ c)]
6) b ⊂ a → [c is a lower bound of b ↔ c ∈ a ∧ ∀x ∈ b (x ≥ c)]

Set of bounds
Let a partial order (a, ≤) and b ⊂ a, then:
1) U B = {c ∈ a : ∀x ∈ b (x ≤ c)}
2) LB = {c ∈ a : ∀x ∈ b (x ≥ c)}

Infimum and supremum

Let a partial order (a, ≤) and b ⊂ a; then:
1) s is the supremum of b ↔ s ∈ UB ∧ ∀x ∈ UB (s ≤ x)
2) i is the infimum of b ↔ i ∈ LB ∧ ∀x ∈ LB (i ≥ x)

Strict orders
An order (a, <) is strict if:
- < is irreflexive
- < is antisymmetric
- < is transitive

1.2.8 Axiom of infinity


This axiom is generated with the idea of supporting the existence of at least one
set whose number of elements is infinite, but to introduce it, it is first necessary to
understand two important concepts, the successor set and the definition of inductive
set.

Definition 2.1 (Successor): Given a set X we define the successor of X,


S(X) as the set formed by the union of X with the set that has X as an element,
i.e. {X}:

S(X) = X ∪ {X}

With this, if we call the empty set:

∅ ≡ 0
then the successor of 0 is a set with one element:

S(0) = {0} ≡ 1
and the successor of 1 is a set with two elements:

S(1) = {0, 1} ≡ 2
and so on...
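As an aside, this successor construction can be imitated with nested frozensets; the sketch below is purely illustrative (not part of the notes) and assumes nothing beyond S(X) = X ∪ {X}:

```python
# Von Neumann-style numerals built from the successor operation; illustrative only.
def successor(x: frozenset) -> frozenset:
    """S(X) = X ∪ {X}."""
    return x | frozenset({x})

zero = frozenset()            # ∅ ≡ 0
one = successor(zero)         # {0} ≡ 1
two = successor(one)          # {0, 1} ≡ 2
three = successor(two)        # {0, 1, 2} ≡ 3

# each numeral n is a set with exactly n elements
print(len(zero), len(one), len(two), len(three))   # 0 1 2 3
print(zero in one, one in two, zero in two)        # True True True
```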

Definition 2.2 (Inductive set): Given a set X, it is said that X is an
inductive set if it satisfies the following property (called Ind(X)):

∅ ∈ X ∧ ∀y (y ∈ X → S(y) ∈ X)

Axiom of infinity:

∃X (Ind(X))
With this we can make an important observation: any inductive set
must contain the numbers 0, 1, 2, ... etc, since by definition 0 will always belong to
it, and therefore 1, 2, ... and so on. This allows us to define the naturals as the
following set:

Definition 2.3 (Natural set (N)): By the axiom of infinity, there is at least
one set J that is inductive; therefore, we can define the natural set as the set
of elements n such that n ∈ J and, for every inductive X, n ∈ X:

N = {n ∈ J : ∀X (Ind(X) → n ∈ X)}

Understanding this, the following can be noted:

1) N is inductive
2) N is the minimum inductive set with respect to the order ⊂

Proof 1:
First, by the definition of inductive set we can obtain that:

∀X (Ind(X) → 0 ∈ X)

∴ 0 ∈ N
And in second place, by definition of natural set, given a n arbitrary:

n∈N

↔ ∀X (Ind(X) → n ∈ X)

→ ∀X (Ind(X) → S(n) ∈ X)

→ S(n) ∈ N

∴ ∀n (n ∈ N → S(n) ∈ N)
Finally, putting these two facts together, we get that Ind(N) is a true proposition.

Proof 2:
By the definition of the natural set, for any inductive set X, the elements n of N also
belong to X, that is:

∀X (Ind(X) → ∀n ∈ N (n ∈ X))

∴ ∀X (Ind(X) → N ⊂ X)

1.2.9 Principle of induction


Since the naturals form a set, then, by the comprehension axiom we can create a new
set of the following form:

A = {n ∈ N : φ(n)}
This set is a subset of N whose elements satisfy the property φ(n). Now,
what would happen if A were also an inductive set? Then the naturals would also
be a subset of A, therefore:

A ⊂ N ∧ Ind(A) → A = N

This is what is known as the principle of induction, although it may not
seem so at first. Its most common form can be found by interpreting the
proposition in the following way:
If A is a subset of N, then it can be written as:

A = {n ∈ N : φ(n)}
Now, by definition, if A is an inductive set, then:

0 ∈ A ∧ ∀n (n ∈ A → S(n) ∈ A)
Implicitly understanding n as a natural number, the above proposition
becomes equivalent to:

φ(0) ∧ ∀n (φ(n) → φ(S(n)))


This implies that:

∀n (n ∈ A ↔ n ∈ N)
But since n is a natural number, the right-hand side is trivially true, obtaining
thus:

∀n (φ(n))
Therefore:

The principle of induction

φ(0) ∧ ∀n (φ(n) → φ(S(n))) → ∀n (φ(n)) : n ∈ N

Note: In spite of all the above construction, a convention can be
chosen as to whether 0 is or is not a natural number; in our case it will not be, for
convenience of notation, and the naturals together with 0 will be denoted
simply N₀.
Chapter 2

Combinatorics and theory of


probability

In this chapter, the basic concepts of combinatorics and probability theory will be
presented, assuming that the student has knowledge of finite sets (definition and
cardinality); this way, they can start with a solid foundation in the course.
Combinatorics is the branch of discrete mathematics that studies the number
of elements in finite sets under different orderings. It is based on two principles
(demonstrable in set theory, but which in most documents, including this one, are
treated practically as axioms): the additive and the multiplicative principle.

Definition 2.4 (Additive principle): If A and B are disjoint finite sets,
then the cardinality of their union is equal to the sum of the cardinalities of A
and B. Symbolically:

A ∩ B = ∅ → |A ∪ B| = |A| + |B|

Definition 2.5 (Multiplicative principle): Let A and B be finite sets; the
cardinality of their cartesian product is equal to the product of their cardinalities.

|A × B| = |A| ∗ |B|

To deepen these ideas, the following exercise is proposed:


The alphabet is a finite set that contains the letters of a language; it is denoted
with the symbol Σ, and in English:

Σ = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z}

On the other hand, a word is a finite ordered sequence formed with letters of
the alphabet. Knowing all this, determine:
1) The number of words with at most n letters.
2) The number of words with n letters that do not start with the letter b.


Answer 1: An ordered sequence with n letters can be understood as an n-tuple;
for example, if n = 3 the word mom = (m, o, m). With this, we know that the
total number of words with n letters is equal to the number of n-tuples, that is to
say, the cardinality of Σⁿ, but by the multiplicative principle:

|\Sigma^n| = \prod_{i=1}^{n} |\Sigma| = \prod_{i=1}^{n} 26 = 26^n

But we need the number of words with at most n letters, this is, the number of
words with n letters, or n−1 letters, or ..., or one letter, that is to say, the cardinality
of ⋃_{i=1}^{n} Σ^i, so that, applying the additive principle, we obtain:

\left| \bigcup_{i=1}^{n} \Sigma^i \right| = \sum_{i=1}^{n} |\Sigma^i| = \sum_{i=1}^{n} 26^i = \sum_{i=0}^{n} 26^i - 1 = \frac{26^{n+1} - 26}{25}

Answer 2: For this case we need the number of n-tuples (x_i)_{i=1}^{n} such that
x₁ ∈ Σ − {b} and ∀i ∈ [2; n], x_i ∈ Σ; in other words:

|(\Sigma - \{b\}) \times \Sigma^{n-1}| = |\Sigma - \{b\}| \cdot |\Sigma|^{n-1} = 25 \cdot 26^{n-1}
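Both answers can be spot-checked by brute force on a smaller alphabet, so the enumeration stays tiny; the sketch below is illustrative only, with Σ replaced by a hypothetical 3-letter alphabet:

```python
# Brute-force check of both counting formulas on a small alphabet.
from itertools import product

sigma = "abc"          # pretend alphabet with |Σ| = 3 instead of 26
s, n = len(sigma), 4

# words with at most n letters, counted by enumeration vs. the closed form
at_most = sum(len(list(product(sigma, repeat=i))) for i in range(1, n + 1))
print(at_most, (s**(n + 1) - s) // (s - 1))        # both 120

# words of length n that do not start with "b"
no_b = [w for w in product(sigma, repeat=n) if w[0] != "b"]
print(len(no_b), (s - 1) * s**(n - 1))             # both 54
```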

2.0.1 Complementary principle


Let A and B be finite sets such that A ⊂ B; then:
|B − A| = |B| − |A|
Proof:
|B| = |(B − A) ∪ A| (for A ⊂ B)
↔ |B| = |B − A| + |A|
↔ |B| − |A| = |B − A|

2.0.2 Principle of inclusion exclusion


Let A and B be finite sets; it is true that:
|A ∪ B| = |A| + |B| − |A ∩ B|
Proof
|A ∪ B| = |A ∪ (B − A)|
= |A| + |B − A|
= |A| + |B − (A ∩ B)|
= |A| + |B| − |A ∩ B| (for Complementary principle)

2.1 Counting with bijections

Definition 1.2 (Set of functions): The set B^A is defined as the set whose
elements are the functions from A to B, that is:

B^A = {f ⊂ A × B : f is a function}

From the definition of a finite set it follows that, for finite sets A and B:

∃f ∈ B^A, f is bijective → |A| = |B|

So it is possible to conjecture that the cardinality of a set A is n, and verify
it by demonstrating the existence of a bijection f : A → B with |B| = n; we will
delve into this with the following exercise.

-Let A and B be finite sets; prove that |B^A| = |B|^{|A|}

Proof: We will define the functions φ and Ψ such that:

φ : B^A → B^{|A|}, φ(f) = (f(a_i))_{i=1}^{n}
Ψ : B^{|A|} → B^A, Ψ((y_i)_{i=1}^{n}) = f, where f(a_i) = y_i

and where the a_i are the elements of A. We can prove that φ = Ψ⁻¹: knowing that
φ ◦ Ψ : B^{|A|} → B^{|A|} and Ψ ◦ φ : B^A → B^A, it is enough to verify that:

(φ ◦ Ψ)((f(a_i))_{i=1}^{n}) = φ(Ψ((f(a_i))_{i=1}^{n})) = φ(f) = (f(a_i))_{i=1}^{n}, so φ ◦ Ψ = Id_{B^{|A|}}
(Ψ ◦ φ)(f) = Ψ(φ(f)) = Ψ((f(a_i))_{i=1}^{n}) = f, so Ψ ◦ φ = Id_{B^A}

With this we confirm that φ = Ψ⁻¹; then φ is bijective, thus |B^A| = |B^{|A|}| =
|B|^{|A|}.
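The identity |B^A| = |B|^{|A|} and the bijection φ of the proof can be illustrated by direct enumeration; the sketch below is illustrative only, with small hypothetical sets A and B:

```python
# Counting the functions from A to B and comparing with |B|^|A|.
from itertools import product

A = ["a1", "a2", "a3"]
B = [0, 1]

# a function f : A -> B corresponds exactly to the tuple (f(a1), f(a2), f(a3))
functions = list(product(B, repeat=len(A)))
print(len(functions), len(B) ** len(A))   # 8 8

# the bijection φ of the proof: each f maps to its tuple of images
f = {"a1": 0, "a2": 1, "a3": 1}
phi_f = tuple(f[a] for a in A)            # (0, 1, 1) ∈ B^{|A|}
print(phi_f)
```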

2.1.1 Boundary properties


Let A and B be finite sets; then:
1) A ⊂ B → |A| ≤ |B|
Proof:
By the complementary principle:
|B − A| = |B| − |A|
↔ |B| = |A| + |B − A| ≥ |A|

2) f : A → B is injective → |A| ≤ |B|

Proof:
f : A → B is injective
→ f : A → f(A) is bijective
→ |A| = |f(A)| ≤ |B| (since f(A) ⊂ B, by property 1)

3) f : A → B is surjective → |A| ≥ |B|

Proof: Suppose that B = {b_i}_{i=1}^{m}; then:

f is surjective
↔ ∀i ∈ [1; m], f⁻¹({b_i}) ≠ ∅
→ ∀i ∈ [1; m], |f⁻¹({b_i})| ≠ 0
↔ ∀i ∈ [1; m], |f⁻¹({b_i})| ≥ 1
→ Σ_{i=1}^{m} |f⁻¹({b_i})| ≥ m
↔ |⋃_{i=1}^{m} f⁻¹({b_i})| ≥ m (the preimages are disjoint)
↔ |f⁻¹(⋃_{i=1}^{m} {b_i})| ≥ m
↔ |f⁻¹(B)| ≥ m
↔ |A| ≥ m
↔ |A| ≥ |B|

2.2 Counting with recurrences

Definition 2.6 (Recurrence): Let a sequence a : N → R; it is said to be
defined by recurrence if a_n is a function of n and of its previously defined terms;
mathematically:

a_n = \begin{cases} φ(a_{n-1}, a_{n-2}, ..., a_{n-k}, n) & \text{if } n > k \\ A_i & \text{if } n = i ∈ [1; k] \end{cases}

where φ : R^k × N → R is called the recurrence equation and the A_i the initial
conditions.

This type of relation helps to solve different types of problems, some of them
very well known, as for example the problem of calculating the minimal number of
moves to solve the game of the Towers of Hanoi with n disks. In its traditional
form, the game consists of three vertical poles, on which are stacked an indeterminate
number of disks of decreasing radius, each with a central hole of greater diameter
than the poles (so that they can be threaded onto them). At the beginning, all the
disks are stacked on one of the poles in decreasing size from the base up; the goal is
to move the tower to another pole following two simple rules:

1) Only one disk can be moved at a time (to another pole).


2) A disk of larger radius cannot be on top of a smaller one

Figure 2.1: Towers of Hanoi

Solution for n disks:

First of all, it should be noted that in order to move the disk of largest radius it
is necessary to build a tower of the n−1 smaller pieces on one of the other poles,
and after moving the large piece, the tower of n−1 pieces must be built again, this
time on top of the largest disk. Knowing all this, if we define a_n as the number of
moves needed to solve a tower with n pieces, then:

a_n = 2a_{n-1} + 1
This is a recurrence whose initial condition a₁ is equal to 1. The sequence
generated by this formula is a₁ = 1; a₂ = 3; a₃ = 7; a₄ = 15; a₅ = 31; ... etc.
If we add one to each term, then a₁ + 1 = 2; a₂ + 1 = 4; a₃ + 1 = 8; a₄ + 1 = 16;
a₅ + 1 = 32; ... etc, so it is not unreasonable to conjecture that

a_n = 2^n − 1
To check this we will use mathematical induction:
B.S: a₁ = 1 = 2¹ − 1
I.H: a_{n−1} = 2^{n−1} − 1
I.S:
a_{n−1} = 2^{n−1} − 1
↔ 2a_{n−1} = 2^n − 2
↔ 2a_{n−1} + 1 = 2^n − 1
↔ a_n = 2^n − 1
Therefore, the minimal number of moves to solve the problem is 2^n − 1.
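The recurrence and the closed form can be compared numerically; a quick illustrative sketch (not part of the original notes):

```python
# Checking the closed form 2^n − 1 against the recurrence a_n = 2a_{n-1} + 1.
def hanoi_moves(n: int) -> int:
    """Minimal number of moves via a_1 = 1, a_n = 2*a_{n-1} + 1."""
    a = 1
    for _ in range(2, n + 1):
        a = 2 * a + 1
    return a

for n in range(1, 8):
    assert hanoi_moves(n) == 2**n - 1
print([hanoi_moves(n) for n in range(1, 8)])   # [1, 3, 7, 15, 31, 63, 127]
```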

Proposed exercise: Calculate how many binary sequences of length twenty
exist such that there are no two consecutive zeros.
Hint: Create a recurrence with the cardinalities of two disjoint sets (Fibonacci
sequence).

2.3 Variations
If B is a finite set with elements b_i, i ∈ [1; n], a variation of n objects
taken r at a time is any ordered sequence of elements b_i with length r; this can also
be understood as an r-tuple formed with elements of B. Sometimes it is necessary
to know the number of variations of a set; this problem can be solved both for the
case in which the elements of B can be repeated and for the case in which they
cannot.

2.3.1 Case with repetition


Since a variation is an r-tuple formed with elements of B, if there is no restriction
on the coordinates that prevents repeating elements, the set of all r-tuples is none
other than B^r, by definition of the cartesian product; then the number of variations
with repetition, denoted VR_n^r, is:

VR_n^r = |B^r| = n^r
This is equal to the cardinality of the set B^A when |A| = r (see Section 2.1);
therefore, a variation with repetition can also be represented as a function f : A → B.

2.3.2 Case without repetition


In this case, to know the number of variations it is necessary to use the multiplicative
principle, observing that, if there are n possibilities to choose v₁, then there are n−1
possibilities to choose v₂, n−2 possibilities to choose v₃, ... and so on, down to
n−r+1 possibilities for the last coordinate (with r ≤ n); therefore, the number of
variations is:

V_n^r = \prod_{i=0}^{r-1} (n-i) = \prod_{i=0}^{r-1} (n-i) \cdot \frac{(n-r)!}{(n-r)!} = \frac{n!}{(n-r)!}

This number is equal to the cardinality of the set Inj(A, B) = {f ∈ B^A :
f is injective}, since if A = {a_i}_{i=1}^{r}, then there are n ways to choose f(a₁), n−1
ways to choose f(a₂), ... and so on, so that these functions are in correspondence
with the r-tuples.

Exercise: In how many ways can 2 rooks be placed attacking each other in chess?
Answer: As rooks move in straight lines, a rook attacks another if it is positioned
in the same column or row; therefore, once the position of the first rook is chosen
(64 options), there remain only seven options in the same column or seven in the
same row, so that the number n of ways is:

n = 64 ∗ 7 + 64 ∗ 7 = 896
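The count 896, which treats the two rooks as an ordered pair (first rook, then second), can be confirmed by brute force over all pairs of squares; dividing by 2 would give 448 indistinguishable placements. An illustrative sketch:

```python
# Brute-force confirmation of the 896 ordered attacking placements.
from itertools import product

count = 0
squares = list(product(range(8), range(8)))      # (row, column) pairs
for r1, c1 in squares:
    for r2, c2 in squares:
        if (r1, c1) != (r2, c2) and (r1 == r2 or c1 == c2):
            count += 1                            # ordered pair of attacking rooks
print(count)   # 896
```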
Proposed exercise: In how many ways can n rooks be placed on an n×n board
so that no two attack each other?

2.4 Combinations
A combination is a selection of r elements of a set A = {a_i}_{i=1}^{n} in which the
order does not matter; this can be understood as a subset of cardinality r (in the
case where elements are not repeated). In this way, the number of combinations is
equal to the number of subsets of A with cardinality r. To find this number, one
must take into account that any variation (without repetition) can be obtained by
ordering the elements of the combinations in all possible ways; then, by the
multiplicative principle, since there are \binom{n}{r} subsets of A with cardinality r and r!
ways of ordering each such subset, the number of variations is:

\binom{n}{r} \cdot r! = \frac{n!}{(n-r)!} \;\leftrightarrow\; \binom{n}{r} = \frac{n!}{(n-r)!\, r!}
This value is called the combinatorial number, and it appears on countless occasions,
for example, in the nth powers of a binomial, expressed by the formula:

(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k

The previous expression is called Newton's binomial in honor of its discoverer,
and can be proved by induction with the help of Pascal's identity, which indicates
that:

\binom{n}{k} + \binom{n}{k+1} = \binom{n+1}{k+1}

Proof of Pascal's identity:

\binom{n}{k} + \binom{n}{k+1} = \frac{n!}{(n-k)!\,k!} + \frac{n!}{(n-k-1)!\,(k+1)!}

= \frac{n!}{(n-k)(n-k-1)!\,k!} + \frac{n!}{(n-k-1)!\,(k+1)k!}

= \frac{n!}{(n-k-1)!\,k!} \left( \frac{1}{n-k} + \frac{1}{k+1} \right)

= \frac{n!}{(n-k-1)!\,k!} \cdot \frac{n+1}{(n-k)(k+1)}

= \frac{(n+1)!}{(n-k)!\,(k+1)!} = \binom{n+1}{k+1}

It is necessary to clarify that equivalent expressions can be obtained by making
a change of variable in the formula; for example, in the demonstration we will use
the identity:

\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}

Proof of Newton's binomial

B.S: (x+y)^1 = \sum_{k=0}^{1} \binom{1}{k} x^{1-k} y^k = x + y

I.H: (x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k

I.S: By I.H., if we multiply by (x+y), then:

(x+y)^{n+1} = \sum_{k=0}^{n} \binom{n}{k} x^{n-k+1} y^k + \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^{k+1}

Applying the index change \sum_{k=0}^{n} a_k = \sum_{k=1}^{n+1} a_{k-1} in the right sum:

= \sum_{k=0}^{n} \binom{n}{k} x^{n-k+1} y^k + \sum_{k=1}^{n+1} \binom{n}{k-1} x^{n-(k-1)} y^{(k-1)+1}

Recalling that by definition of combination \binom{n}{n+1} = \binom{n}{-1} = 0, if we add these zero terms:

= \left( \sum_{k=0}^{n} \binom{n}{k} x^{n-k+1} y^k + \binom{n}{n+1} x^0 y^{n+1} \right) + \left( \sum_{k=1}^{n+1} \binom{n}{k-1} x^{n-k+1} y^k + \binom{n}{-1} x^{n+1} y^0 \right)

= \sum_{k=0}^{n+1} \binom{n}{k} x^{n-k+1} y^k + \sum_{k=0}^{n+1} \binom{n}{k-1} x^{n-k+1} y^k

= \sum_{k=0}^{n+1} \left[ \binom{n}{k} + \binom{n}{k-1} \right] x^{n-k+1} y^k

= \sum_{k=0}^{n+1} \binom{n+1}{k} x^{n+1-k} y^k

Thus proving the proposed theorem.
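The theorem and Pascal's identity can be spot-checked numerically for small exponents; an illustrative sketch using only the standard library:

```python
# Numeric spot-check of Newton's binomial and Pascal's identity.
from math import comb

def binomial_expansion(x: int, y: int, n: int) -> int:
    """Right-hand side: sum of C(n,k) * x^(n-k) * y^k."""
    return sum(comb(n, k) * x**(n - k) * y**k for k in range(n + 1))

for n in range(8):
    assert binomial_expansion(3, 5, n) == (3 + 5) ** n

# Pascal's identity, used in the inductive step
assert all(comb(n, k) + comb(n, k + 1) == comb(n + 1, k + 1)
           for n in range(10) for k in range(n))
print("checks passed")
```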

Proposed exercises:
1) Prove the following identity:

\sum_{k=0}^{n} \binom{n}{k} = 2^n

2) Prove that:

\binom{n}{k} = \binom{n}{n-k}

3) Prove the following identity and find its relation with 1):

|P([1; n])| = 2^n

2.5 Combinations with repetition


To work with this type of combination, the concept of multiset is usually defined.

Definition 2.7 (Multiset): A multiset m of A = {a_i}_{i=1}^{n} is a function m :
A → N₀, where m(a_i) represents the number of repetitions, or multiplicity, of a_i;
mathematically we use the notation:

{(a_i, m(a_i))}_{i=1}^{n} = {a_i^{m(a_i)}}_{i=1}^{n}

For example, the multiset {(a, 3), (b, 1), (c, 4)} can be denoted simply as {a³, b, c⁴}.
Thus a combination with repetition of the elements of A can be understood
as a multiset m : A → N₀. It can be observed that the cardinality of the set m is
not the same as the one represented by the multiset, therefore:

Definition 2.8 (Cardinal of multiset): If m is a multiset of the set
A = {a_i}_{i=1}^{n}, we define the cardinality |m| as the sum of the numbers m(a_i) for
all i in [1; n], that is:

|m| = \sum_{i=1}^{n} m(a_i)

If m(a_i) is called x_i and |m| = k, then the multiset m can be represented as an
equation together with its respective natural solutions; therefore, the number of multisets
with cardinality k is equal to the number of solutions of the equation, that is to say:

\left| \{ (x_i)_{i=1}^{n} \in \mathbb{N}_0^n : \sum_{i=1}^{n} x_i = k \} \right| = \left(\!\!\binom{n}{k}\!\!\right)

But this cardinal is equal to the number of ways that k elements can be placed
in n different containers x_i:

Figure 2.2: Intuitive representation of the solutions of the equation

However, it is not necessary that the compartments be boxes as in the previous
image; the containers can also be defined as divisions of the objects through
vertical lines, as in the following image:

Figure 2.3: Representation with vertical lines

Formally, we can say that there exists a bijection:

φ : \{ (x_i)_{i=1}^{n} \in \mathbb{N}_0^n : \sum_{i=1}^{n} x_i = k \} \to \{ \bullet^k, |^{n-1} \}^{n+k-1}

φ((x_i)_{i=1}^{n}) = \underbrace{\bullet ... \bullet}_{x_1} \,|\, ... \,|\, \underbrace{\bullet ... \bullet}_{x_n}

where \{ \bullet^k, |^{n-1} \}^{n+k-1} is the set of sequences with k points (•) and n−1 vertical
lines (|) (i.e., (n+k−1)-tuples). A sequence is defined simply by choosing the
coordinates of the (n+k−1)-tuple that carry a vertical line; if the coordinates are
considered as elements of a set, the number of sequences is equal to the number of
selections of n−1 coordinates out of a total of n+k−1, this is:

\binom{n+k-1}{n-1} = \binom{n+k-1}{k} = \left(\!\!\binom{n}{k}\!\!\right)
Exercise: Calculate the number of solutions of the following inequality:

x₁ + x₂ + x₃ ≤ 7, with x₁, x₂, x₃ ∈ N₀
Answer: knowing that:

x₁ + x₂ + x₃ ≤ 7 ↔ \bigvee_{k=0}^{7} (x₁ + x₂ + x₃ = k)

the number of solutions (x₁, x₂, x₃) is equal to the cardinality of the next set:

\bigcup_{k=0}^{7} \{ (x_i)_{i=1}^{3} \in \mathbb{N}_0^3 : x_1 + x_2 + x_3 = k \}

These sets are disjoint; therefore, applying the additive principle, the cardinality
is equal to:

\sum_{k=0}^{7} \left| \{ (x_i)_{i=1}^{3} \in \mathbb{N}_0^3 : x_1 + x_2 + x_3 = k \} \right| = \sum_{k=0}^{7} \binom{3+k-1}{k} = 120
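The result can be confirmed by direct enumeration; an illustrative sketch comparing brute force with the stars-and-bars sum:

```python
# Brute-force count of nonnegative solutions of x1 + x2 + x3 <= 7.
from itertools import product
from math import comb

solutions = [t for t in product(range(8), repeat=3) if sum(t) <= 7]
print(len(solutions))                                      # 120

# stars-and-bars total, summed over k = 0..7
stars_and_bars = sum(comb(3 + k - 1, k) for k in range(8))
print(stars_and_bars)                                      # 120
```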

Proposed exercise:
Prove the following property:

\sum_{i=0}^{k} \binom{n+i}{i} = \binom{n+k+1}{k}

2.6 Permutations
The permutation is the specific case of the variation without repetition when n = r;
therefore, the cardinality of the set of all permutations is:

P_n = V_n^n = n!
This can also be interpreted as the cardinality of the set Bij(A, B) = {f ∈ B^A :
f is bijective}, since in this case the selection uses all the elements of B, so that
every element of B corresponds to a unique f(a_i) (unique due to the restriction
of non-repetition of the elements).

2.7 Permutations with repetition


We define a permutation with repetition of a multiset m as any ordered sequence
of length |m|. Since every multiset is a function from A to N₀, the permutation
is also a sequence of the elements of A in which a_i repeats m(a_i) times.
To know the number of permutations of a multiset, we will interpret the sequence
as an n-tuple (v_i)_{i=1}^{n} in which, for every i, the element a_i appears m(a_i) times.
We now differentiate between two cases:
1) ∀i ∈ [1; n], m(a_i) = 1
2) ∃i ∈ [1; n], m(a_i) ≠ 1
In the first case, as every a_i has multiplicity equal to 1, the multiset can
be interpreted as the set A, and the permutation as one without repetition. In the
second case, since some a_i has multiplicity different from 1, there exists an a_i
with multiplicity 0 and another with multiplicity greater than 1 (by definition of |m|);
therefore we say that the multiset has r distinguishable elements (where r < n in
this case, and r = n in the first), and we define the set of those elements as:

m_d = {a_i ∈ A : m(a_i) ≠ 0} = {a_{d_j}}_{j=1}^{r}

Thus, a permutation with repetition is defined as:

(v_i)_{i=1}^{n} : ∀i ∈ [1; n], ∀j ∈ [1; r], i ∈ I_j → v_i = a_{d_j}

with I = {I_j}_{j=1}^{r} a partition of [1; n] such that |I_j| = m(a_{d_j}); this repetition
number of the value a_{d_j} we call n_j to simplify the notation. With all this, it is
possible to note that the number of permutations with repetition is equal to the
number of combinations for the partitions I. To calculate this value we count
the number of combinations for the elements I_j: if for I₁ there are \binom{n}{n_1} possible
results, then for I₂ there will be \binom{n-n_1}{n_2} possible results (the indexes already chosen for
I₁ cannot be chosen), and for I₃ \binom{n-n_1-n_2}{n_3} possible results, ... and so on; in general,
the number of permutations with repetition of a multiset with n
elements and r distinguishable elements a_{d_j}, each repeated n_j times, is:

\prod_{j=1}^{r} \binom{n - \sum_{k=1}^{j-1} n_k}{n_j} = \prod_{j=1}^{r} \frac{(n - \sum_{k=1}^{j-1} n_k)!}{(n - \sum_{k=1}^{j} n_k)!\; n_j!}

= \frac{1}{\prod_{j=1}^{r} n_j!} \cdot \prod_{j=1}^{r} \frac{(n - \sum_{k=1}^{j-1} n_k)!}{(n - \sum_{k=1}^{j} n_k)!}

But, by the telescoping property \prod_{j=1}^{r} \frac{a_{j-1}}{a_j} = \frac{a_0}{a_r}, and recalling that \sum_{k=1}^{r} n_k = n,
we finally obtain:

= \frac{1}{\prod_{j=1}^{r} n_j!} \cdot \frac{n!}{(n - \sum_{k=1}^{r} n_k)!} = \frac{n!}{\prod_{j=1}^{r} n_j!}

Notation:

\binom{n}{n_1, n_2, ..., n_r} = \frac{n!}{\prod_{j=1}^{r} n_j!}

Exercise: if there are 52! different permutations of the cards in a deck, how
many permutations are there of two identical decks together?
Answer: if there are 52! different permutations of a deck, then the number of
cards is 52; therefore, the number of cards in two decks is 104. In addition, as the
decks are identical, the cards are repeated 2 by 2 a total of 52 times; thus, the
number of permutations is:

\frac{104!}{2^{52}}
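The multinomial count n!/(n₁!·…·n_r!) is easy to evaluate directly; the sketch below is illustrative (the helper name is invented) and applies it to the two-deck exercise:

```python
# The multinomial count applied to the two-deck exercise.
from math import factorial, prod

def multiset_permutations(multiplicities: list[int]) -> int:
    """Permutations of a multiset with the given repetition numbers n_j."""
    n = sum(multiplicities)
    return factorial(n) // prod(factorial(nj) for nj in multiplicities)

# two identical decks: 52 distinguishable cards, each repeated twice
two_decks = multiset_permutations([2] * 52)
assert two_decks == factorial(104) // 2**52
print(two_decks == factorial(104) // 2**52)   # True
```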

2.8 Theory of probability


The theory of probability is a mathematical model that studies random events;
these arise from experiments that have several possible outcomes under fairly similar
initial conditions, for example a coin toss, a phenomenon with two possible
outcomes, heads or tails. The amazing thing is that the number that expresses the
frequency with which some event occurs stabilizes once the experiment has been
repeated a large number of times; this empirical principle, on which the theory is
based, is called statistical regularity.

2.8.1 Probability space


The construction of the model makes use of the definition of a probability space, i.e.
a triple of the form:

(Ω, Σ, P)
where Ω is called the sample space, an arbitrary set interpreted as the
one that contains all possible outcomes of a random experiment; on the other hand,
Σ is the so-called σ-algebra, a family of subsets of Ω defined by the following
rules:

1- Ω ∈ Σ
2- A ∈ Σ → Aᶜ ∈ Σ
3- ∀i ∈ N, A_i ∈ Σ → ⋃_{i∈N} A_i ∈ Σ

These rules guarantee that the sets generated by the basic operations also belong to
Σ.
Exercise: Prove the next proposition:

∀i ∈ N, A_i ∈ Σ → ⋂_{i∈N} A_i ∈ Σ

Proof:
∀i ∈ N, A_i ∈ Σ
→ ∀i ∈ N, A_iᶜ ∈ Σ
→ ⋃_{i∈N} A_iᶜ ∈ Σ
→ (⋂_{i∈N} A_i)ᶜ ∈ Σ (by De Morgan)
→ ⋂_{i∈N} A_i ∈ Σ

∴ ∀i ∈ N, A_i ∈ Σ → ⋂_{i∈N} A_i ∈ Σ

This mathematical structure also allows us to separate from the sample space
those sets whose probability we are interested in calculating; these sets are called
events, and when a result w of the experiment is measured, it is said that the event
A happened if and only if w ∈ A, so probability theory is intimately related to
set theory.
Finally, P is known as the probability measure, and is a function governed by
Kolmogorov's axioms; formally:

P : Σ → R :
1) 0 ≤ P(A) for all A ∈ Σ
2) P(Ω) = 1
3) ∀i ≠ j ∈ N, A_i ∩ A_j = ∅ → P(⋃_{i=1}^{∞} A_i) = \sum_{i=1}^{∞} P(A_i)

With all this in mind, we can move on to some important properties and defini-
tions that will be constantly addressed in the course of this document.

2.8.2 Properties of probabilities


Given a probability space (Ω, Σ, P), the following properties are satisfied:
1) P(∅) = 0
2) ∀i ≠ j ∈ [1; n], A_i ∩ A_j = ∅ → P(⋃_{i=1}^{n} A_i) = \sum_{i=1}^{n} P(A_i)
3) P(Aᶜ) = 1 − P(A)
4) A ⊂ B → P(B − A) = P(B) − P(A)
5) A ⊂ B → P(A) ≤ P(B)
6) 0 ≤ P(A) ≤ 1
7) P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
8) P(A ∪ B) ≤ P(A) + P(B)

Proofs:
1) It is possible to write the empty set as a sequence of sets of the following
form:

∅ = ∅ ∪ ∅ ∪ ...
We can observe that all the elements of the sequence are disjoint; therefore, using
the third axiom of Kolmogorov we can obtain that:

P(∅) = \sum_{i=1}^{∞} P(∅)
But by definition of probability this series must converge, which will occur only
when P(∅) = 0.

2) Given a sequence of disjoint sets {A_i} where ∀i > n, A_i = ∅, it can be noted
(using the first property) that:

⋃_{i=1}^{∞} A_i = ⋃_{i=1}^{n} A_i ∪ ⋃_{i=n+1}^{∞} A_i = ⋃_{i=1}^{n} A_i ∪ ∅ = ⋃_{i=1}^{n} A_i

therefore:

P(⋃_{i=1}^{n} A_i) = P(⋃_{i=1}^{∞} A_i) = \sum_{i=1}^{∞} P(A_i) = \sum_{i=1}^{n} P(A_i) + \sum_{i=n+1}^{∞} P(A_i) = \sum_{i=1}^{n} P(A_i)

3) Since A and Aᶜ are disjoint, using the second property we can obtain that:

P(A ∪ Aᶜ) = P(A) + P(Aᶜ)
But A ∪ Aᶜ = Ω and P(Ω) = 1 by the second axiom of Kolmogorov; therefore:

P(A) + P(Aᶜ) = 1 ↔ P(Aᶜ) = 1 − P(A)



4)
P(B) = P((B − A) ∪ A) = P(B − A) + P(A)

↔ P(B − A) = P(B) − P(A)

5) By the first axiom of Kolmogorov:

0 ≤ P(B − A)
Adding P(A):

P(A) ≤ P(B − A) + P(A)
Using property 4:

P(A) ≤ P(B)

6) By the first axiom of Kolmogorov:

0 ≤ P(A)
And by definition of Σ, ∀A ∈ Σ, A ⊂ Ω; then, using property 5:

P(A) ≤ P(Ω)

↔ P(A) ≤ 1

7) For the demonstration we will define the following 3 sets:

1) C = A ∩ Bᶜ
2) D = A ∩ B
3) E = Aᶜ ∩ B
It can be noted that:

C ∩ D = ∅

C ∩ E = ∅

D ∩ E = ∅
In addition:
C ∪ D ∪ E = (A ∩ Bᶜ) ∪ (A ∩ B) ∪ (Aᶜ ∩ B)
= (A ∩ (Bᶜ ∪ B)) ∪ (Aᶜ ∩ B)
= (A ∩ Ω) ∪ (Aᶜ ∩ B)
= A ∪ (Aᶜ ∩ B)
= (A ∪ Aᶜ) ∩ (A ∪ B)
= Ω ∩ (A ∪ B)
= A ∪ B

So that:

P(C ∪ D ∪ E) = P(C) + P(D) + P(E) = P(A ∪ B)

But it should also be noted that:

C ∪ D = A

D ∪ E = B
Therefore:

P(A ∪ B) = P(C ∪ D) + P(D ∪ E) − P(D) = P(A) + P(B) − P(A ∩ B)

8) By the first axiom of Kolmogorov:

P(A ∩ B) ≥ 0
Adding P(A ∪ B):

P(A ∩ B) + P(A ∪ B) ≥ P(A ∪ B)
Using property 7:

P(A) + P(B) ≥ P(A ∪ B)

2.8.3 Types of events

Based on the interpretation of the sets belonging to the σ-algebra as events, each
set is associated with the name of its representative event; these are usually denoted
as follows:
1- Sure event ≡ Ω
2- Impossible event ≡ ∅
3- Event A or B ≡ A ∪ B
4- Event A and B ≡ A ∩ B
5- Event not A ≡ Ac
6- Mutually exclusive event ≡ (A ∩ B = ∅)
7- Implicated events ≡ A ⊂ B

2.8.4 Probability function in finite sample spaces (theorem)


Given a sample space Ω = {w_i}_{i=1}^{n}, the function P : 2^Ω → R (with 2^Ω equal to the
power set of omega) such that:

1) ∀i ∈ [1; n], 0 ≤ P({w_i}) ≤ 1

2) \sum_{i=1}^{n} P({w_i}) = 1

3) P(A) = \sum_{w∈A} P({w})

is a probability measure on (Ω, 2^Ω).

Proof:
1) Kolmogorov's first axiom is fulfilled by property 1.

2) P(Ω) = \sum_{w∈Ω} P({w}) = \sum_{w ∈ ⋃_{i=1}^{n} {w_i}} P({w}) = \sum_{i=1}^{n} \sum_{w∈{w_i}} P({w}) = \sum_{i=1}^{n} P({w_i}) = 1

3) Since no element of the sample space belongs to ∅, then by properties two
and three we can state that P(∅) = 0. Thus, we can define the sequence {A_i} such
that ∀i > n, A_i = ∅ and, for i ≤ n, the A_i are disjoint elements of 2^Ω; then, if we
calculate the probability of the union of the elements belonging to the image of the
sequence, we can notice that:

P(⋃_{i=1}^{∞} A_i) = \sum_{w ∈ ⋃_{i=1}^{∞} A_i} P({w}) = \sum_{i=1}^{∞} \sum_{w∈A_i} P({w}) = \sum_{i=1}^{∞} P(A_i)

2.8.5 Classical Probability (Corollary)


The classical probability is a consequent case of the previous theorem, where for
every element of the sample space Ω = {w_i}_{i=1}^{n} we have a function P that satisfies
the latter conditions and:

∀i ∈ [1; n], P({w_i}) = p

and which is a probability measure on (Ω, 2^Ω). Some properties can be noticed
quickly, for example:

∀i ∈ [1; n], p = 1/n

∀A ∈ 2^Ω, P(A) = |A| / |Ω|

where the first is a consequence of the normalization \sum_{i=1}^{n} P({w_i}) = 1 and the
second is a consequence of the third property of the theorem (the latter is called
Laplace's rule in honor of its discoverer).
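Laplace's rule is straightforward to mechanize on a finite sample space; the following illustrative sketch (not part of the notes) uses exact fractions for one fair die:

```python
# Classical probability: Laplace's rule P(A) = |A| / |Ω| for one fair die.
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}          # sample space of one die

def prob(event: set) -> Fraction:
    """Laplace's rule, valid when every outcome is equally likely."""
    return Fraction(len(event & omega), len(omega))

even = {2, 4, 6}
print(prob(even))                    # 1/2
print(prob(omega), prob(set()))      # 1 (sure event), 0 (impossible event)

# finite additivity on disjoint events
assert prob({1, 2}) + prob({5}) == prob({1, 2, 5})
```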

2.8.6 Conditional probability


Suppose you want to know the probability that, when rolling a die twice, the result
of the sum is 10. It is intuitive to think that, when rolling the first die, its result
will influence the probability of obtaining the expected result, so in this case
the probability is not static, since it is conditioned by the result of the first die; this is
an example of what will be seen next.
To understand how the probability of conditioned events is expressed mathemat-
ically, we must delve into statistical regularity. Suppose an experiment of a similar
nature to that of the die example is performed, but a large number of times. If we
call f(B) the frequency with which the result was B and f(B ∩ A) the frequency with
which the results A and B both happened, then mathematically, if the experiment
was repeated n times, we have that:

f(B) = *B / n

f(B ∩ A) = *(B ∩ A) / n
where * represents the number of times the result was obtained. On
the other hand, the frequency of A given that B occurred should be the number
of times that A and B occurred among all the times that B occurred; this can be
understood intuitively as if, upon the occurrence of B, the sample space were restricted
to only the values that can intersect with it, so it is only necessary to focus on the
repetition of these values among all the times that event B occurred; therefore:

f(A|B) = *(B ∩ A) / *B = (*(B ∩ A)/n) / (*B/n) = f(B ∩ A) / f(B)
Finally, since the probability is the frequency to which an event tends when the
experiment has been repeated a large number of times (P(A) = lim_{n→∞} *A/n), then
the probability of A happening given B is:

P(A|B) = P(B ∩ A) / P(B)
Exercise: Given a probability space (Ω, Σ, P), prove that the triple (Ω, Σ, P(·|B)),
where B ∈ Σ and P(B) ≠ 0, is a probability space.

Answer:
1) ∀X ∈ Σ, P(B ∩ X) ≥ 0 ∧ P(B) > 0

→ ∀X ∈ Σ, P(X|B) = P(B ∩ X) / P(B) ≥ 0

2) P(Ω|B) = P(B ∩ Ω) / P(B) = P(B) / P(B) = 1

3) Given a disjoint sequence of sets {X_i} of Σ, then:

P(⋃_{i=1}^{∞} X_i | B) = P(⋃_{i=1}^{∞} X_i ∩ B) / P(B) = P(⋃_{i=1}^{∞} (X_i ∩ B)) / P(B)

= \sum_{i=1}^{∞} P(X_i ∩ B) / P(B) = \sum_{i=1}^{∞} P(X_i|B)
As a corollary of the above formula, it follows that:

P (B ∩ A) = P (A|B) ∗ P (B) : P (B) 6= 0

Definition 2.9 (Mutually independent events): Given a probability
space (Ω, Σ, P), it is said that A and B (with P(B) ≠ 0) are mutually independent
events if:

P(A|B) = P(A)

Exercise: Prove that:

A and B are mutually independent ↔ P(A ∩ B) = P(A) ∗ P(B)
Proof: By definition:
A and B are mutually independent
↔ P(A|B) = P(A)
↔ P(B ∩ A) / P(B) = P(A)
↔ P(B ∩ A) = P(A) ∗ P(B)

Total probability theorem

Given a probability space (Ω, Σ, P) and a partition C = {C_i}_{i=1}^{∞} of Ω (with
P(C_i) > 0 for all i), then, for any A ∈ Σ:

P(A) = \sum_{i=1}^{∞} P(A|C_i) ∗ P(C_i)

Proof: By definition of partition we know that ⋃_{i=1}^{∞} C_i = Ω and, for all i ≠ j,
C_i and C_j are disjoint; in addition, we can prove that:

P(A|Ω) = P(A ∩ Ω) / P(Ω) = P(A)
Therefore:

P(A) = P(A|Ω) = P(A | ⋃_{i=1}^{∞} C_i) = P(⋃_{i=1}^{∞} C_i ∩ A) / P(Ω) = P(⋃_{i=1}^{∞} (C_i ∩ A))

= \sum_{i=1}^{∞} P(C_i ∩ A) = \sum_{i=1}^{∞} \frac{P(C_i ∩ A)}{P(C_i)} ∗ P(C_i) = \sum_{i=1}^{∞} P(A|C_i) ∗ P(C_i)

Bayes' theorem
Given a probability space (Ω, Σ, P) and a partition C = {C_i}_{i=1}^{∞} of Ω (with
P(C_i) > 0 for all i), then, for all B ∈ Σ such that P(B) > 0:

∀i ∈ N, P(C_i|B) = \frac{P(B|C_i) ∗ P(C_i)}{\sum_{j=1}^{∞} P(B|C_j) ∗ P(C_j)}

Proof: By definition, given an arbitrary i:

P(C_i|B) = P(B ∩ C_i) / P(B)
But remembering the total probability theorem, we have that:

P(B) = \sum_{j=1}^{∞} P(B|C_j) ∗ P(C_j)

And by the product formula:

P(B ∩ C_i) = P(B|C_i) ∗ P(C_i)
Then, replacing:

P(C_i|B) = \frac{P(B|C_i) ∗ P(C_i)}{\sum_{j=1}^{∞} P(B|C_j) ∗ P(C_j)}
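The total probability theorem and Bayes' theorem can be checked on a small made-up example; the sketch below assumes a hypothetical two-set partition C₁, C₂ with invented numbers:

```python
# Bayes' theorem on a two-part partition of Ω; the numbers are invented.
from fractions import Fraction

P_C = [Fraction(1, 4), Fraction(3, 4)]          # P(C1), P(C2): a partition
P_B_given_C = [Fraction(1, 2), Fraction(1, 10)] # P(B|C1), P(B|C2)

# total probability theorem: P(B) = sum_j P(B|Cj) * P(Cj)
P_B = sum(pb * pc for pb, pc in zip(P_B_given_C, P_C))

# Bayes: P(Ci|B) = P(B|Ci) * P(Ci) / P(B)
posterior = [pb * pc / P_B for pb, pc in zip(P_B_given_C, P_C)]
print(P_B)               # 1/5
print(posterior)         # [5/8, 3/8]
assert sum(posterior) == 1
```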

2.8.7 Random variable


In order to introduce the definition of a random variable, it is necessary previously
to deepen some of the properties of σ-algebras and their consequences:

Property: The arbitrary intersection of σ-algebras is a σ-algebra.

Proof:
Suppose we have any set of indices I such that ∀i ∈ I, Σ_i is a σ-algebra on Ω;
if F is the following class:

F = {Σ_i : i ∈ I}
then the arbitrary intersection of the elements of F is:

⋂F = {A : ∀i ∈ I, A ∈ Σ_i}
1) ∀i ∈ I, Σ_i is a σ-algebra on Ω
→ ∀i ∈ I, Ω ∈ Σ_i
↔ Ω ∈ ⋂F

2) A ∈ ⋂F
↔ ∀i ∈ I, A ∈ Σ_i
→ ∀i ∈ I, Aᶜ ∈ Σ_i
↔ Aᶜ ∈ ⋂F

3) ∀j ∈ N, A_j ∈ ⋂F
↔ ∀j ∈ N, ∀i ∈ I, A_j ∈ Σ_i
↔ ∀i ∈ I, ∀j ∈ N, A_j ∈ Σ_i
→ ∀i ∈ I, ⋃_{j∈N} A_j ∈ Σ_i
↔ ⋃_{j∈N} A_j ∈ ⋂F

With this property already demonstrated, we can define the following set:

Definition 3.0 (Generated σ-algebra): Given a collection B of subsets of Ω,
the σ-algebra generated by B, denoted σ(B), is defined as the arbitrary
intersection of all the σ-algebras Σ containing B, that is:

σ(B) = ⋂ {Σ : Σ is a σ-algebra ∧ B ⊂ Σ}

It can be observed from the above definition that:

∀Σ ∈ {Σ : Σ is a σ-algebra ∧ B ⊂ Σ}, σ(B) ⊂ Σ

The latter is a consequence of the fact that the intersection of sets will always be
a subset of the sets that compose the intersection; therefore, σ(B) is the minimal
σ-algebra containing the elements of B.
Example: If A ⊂ Ω (with Ω a sample space), find σ({A}).
Answer: Based on what has been previously described, the set can be found
simply by seeing which elements are minimally necessary to form a σ-algebra
containing A; then:
1) Ac ∈ σ({A}) (by property two)
2) Ω ∈ σ({A}) (by property one)
3) ∅ ∈ σ({A}) (by property one and two)

∴ σ({A}) = {A, Ac , Ω, ∅}

2.8.8 Borel sets


If the sample space of an experiment is the set of real numbers, i.e.
Ω = R, then the set of all intervals of the form (−∞, x] in R can be extracted from
it and, based on the latter, its generated σ-algebra can be found; such a σ-algebra is
known as the Borel σ-algebra, and is denoted by:

B(R) = σ({(−∞, x] : x ∈ R})


Note: The elements in B(R) are called Borelians or Borel sets.

Exercise: Prove that the following intervals are Borelians:

1- (x, ∞)
2- (−∞, x)
3- [x, ∞)
4- [x, y]
5- [x, y)
6- (x, y]
7- (x, y)
8- {x}
9- N, Z, Q, Qᶜ
10- R

Proofs: Given arbitrary elements x, y ∈ R:

1) (−∞, x] ∈ B(R)
→ (−∞, x]ᶜ ∈ B(R)
↔ (x, ∞) ∈ B(R)

2) ∀n ∈ N, (−∞, x − 1/n] ∈ B(R)

→ ⋃_{n=1}^{∞} (−∞, x − 1/n] ∈ B(R)

But ∀n ∈ N, x − 1/n < x; then ⋃_{n=1}^{∞} (−∞, x − 1/n] ⊂ (−∞, x).

On the other hand, we know that lim_{n→∞} x − 1/n = x; then, for an arbitrary
y ∈ (−∞, x) there exists n₀ such that y ≤ x − 1/n₀, so:

y ∈ (−∞, x − 1/n₀]

→ y ∈ ⋃_{n=1}^{∞} (−∞, x − 1/n]

∴ ⋃_{n=1}^{∞} (−∞, x − 1/n] = (−∞, x) ∈ B(R)

3) By property 2:
(−∞, x) ∈ B(R)
→ (−∞, x)ᶜ ∈ B(R)
↔ [x, ∞) ∈ B(R)

4) If x ≤ y, then as the sets (−∞, y] and [x, ∞) are Borelians:

(−∞, y]ᶜ, [x, ∞)ᶜ ∈ B(R)
→ (−∞, y]ᶜ ∪ [x, ∞)ᶜ ∈ B(R)
↔ ((−∞, y] ∩ [x, ∞))ᶜ ∈ B(R)
→ (−∞, y] ∩ [x, ∞) ∈ B(R)
↔ [x, y] ∈ B(R)

5) If y > x, then as [x, ∞) and (−∞, y) are Borelians:

[x, ∞) ∩ (−∞, y) ∈ B(R)
↔ [x, y) ∈ B(R)

6) If x < y, then as (−∞, y] and (x, ∞) are Borelians:

(−∞, y] ∩ (x, ∞) ∈ B(R)
↔ (x, y] ∈ B(R)

7) If x < y, then as (−∞, y) and (x, ∞) are Borelians:

(−∞, y) ∩ (x, ∞) ∈ B(R)
↔ (x, y) ∈ B(R)

8) If x = y, by property 4:
[x, y] ∈ B(R)
↔ [x, x] ∈ B(R)
↔ {x} ∈ B(R)

9) By property 8:

⋃_{i=1}^{∞} {i} ∈ B(R)

↔ N ∈ B(R)
On the other hand, we can define the index sets E = {2, 4, ...} and O = {1, 3, ...},
such that:

F = {z_i : ∀i ∈ E, z_i = {i/2} ∧ ∀i ∈ O, z_i = {−(i − 1)/2}}

Then, since the sets E and O are a partition of the naturals, we have that:

⋃F = ⋃_{i=1}^{∞} z_i ∈ B(R)

But:

⋃_{i=1}^{∞} z_i = Z

Therefore, Z ∈ B(R).

Based on the above, we will define the elements m_i such that ∀i ∈ N, m_i ∈ z_i;
then, if j ∈ N:

{m_i / j} ∈ B(R)

→ ⋃_{j=1}^{∞} {m_i / j} ∈ B(R)

→ ⋃_{i=1}^{∞} ⋃_{j=1}^{∞} {m_i / j} ∈ B(R)

But:

⋃_{i=1}^{∞} ⋃_{j=1}^{∞} {m_i / j} = Q

Therefore Q ∈ B(R), and in consequence Qᶜ ∈ B(R) (the irrationals).

10) Finally, R = Ω; then R ∈ B(R) by the first property of the σ-algebra.

Definition 3.1 (Random variable): Given a probability space (Ω, Σ, P),
a random variable X is defined as a function of the following form:

X : Ω → R, with X⁻¹(B) ∈ Σ for every B ∈ B(R).

The condition proposed in the definition of a random variable is known as mea-
surability and is of great importance for the development of the theory, since, by
defining a probability space (Ω, Σ, P), it is possible to form a space (R, B(R), P_X),
where P_X is a probability measure of the form:

P_X : B(R) → [0, 1], P_X(B) = P(X⁻¹(B))

Notation: X⁻¹(B) is equal to the set of preimages of B under the
random variable X, i.e.:

X⁻¹(B) = {w ∈ Ω : X(w) ∈ B}

Then, the latter can be denoted as X ∈ B, and we say that the probability that the
event X⁻¹(B) occurs is equal to the probability that X ∈ B, since, knowing that for
an arbitrary result w, w ∈ Ω, it follows that:

X⁻¹(B) happened ↔ w ∈ X⁻¹(B) ↔ X(w) ∈ B

So that if, for example, B were a Borelian of the form (−∞, x] (with x ∈ R),
then:

P(X⁻¹((−∞, x])) = P(X ∈ (−∞, x]) = P(X ≤ x)
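The pushforward measure P_X(B) = P(X⁻¹(B)) can be made concrete on a finite space; an illustrative sketch with two fair dice and X the sum of the faces (all names invented):

```python
# P_X(B) = P(X⁻¹(B)) on a finite sample space: two fair dice, X = sum.
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))    # 36 equiprobable outcomes

def X(w):                                        # the random variable
    return w[0] + w[1]

def P_X(B: set) -> Fraction:
    """Pushforward measure: P(X ∈ B) = P({w : X(w) ∈ B})."""
    preimage = [w for w in omega if X(w) in B]
    return Fraction(len(preimage), len(omega))

print(P_X({10}))                 # 1/12, i.e. 3 of the 36 outcomes sum to 10
print(P_X(set(range(2, 8))))     # P(X ≤ 7) = 21/36 = 7/12
```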

Proposition: Given a probability space (Ω, Σ, P) and a function X : Ω → R:

(∀B ∈ B(R), X⁻¹(B) ∈ Σ) ↔ (∀x ∈ R, (X ≤ x) ∈ Σ)

Proof:
⇒) As {(−∞, x] : x ∈ R} ⊂ B(R), if X⁻¹(B) ∈ Σ for every Borelian B, then in
particular, for all x ∈ R, (X ≤ x) ∈ Σ.

⇐) Let's define the following sets:

β = {B ∈ B(R) : X⁻¹(B) ∈ Σ}

γ = {(−∞, x] : x ∈ R}

Because of the hypothesis, for any x ∈ R, X⁻¹((−∞, x]) ∈ Σ; then the intervals are
also contained in β, therefore γ ⊂ β. On the other hand, by definition of β, β ⊂ B(R) =
σ(γ). Thus, in order to prove what has been proposed, we must make sure that
β = B(R); with the above deduction this is equivalent to saying that B(R) ⊂ β.
However, we know that γ ⊂ β; therefore, if we prove that β is a σ-algebra, then
the σ-algebra generated by γ (i.e. B(R)) is a subset of β, since σ(γ) is minimal by
definition.

1) We know that R ∈ β is equivalent to X⁻¹(R) ∈ Σ, but as X is a function,
this is the same as Ω ∈ Σ, which is true; then R ∈ β is true.

2) Given an arbitrary B:
B ∈ β
↔ B ∈ B(R) ∧ X⁻¹(B) ∈ Σ
→ Bᶜ ∈ B(R) ∧ (X⁻¹(B))ᶜ ∈ Σ

However, X⁻¹(Bᶜ) = (X⁻¹(B))ᶜ; then:

Bᶜ ∈ B(R) ∧ X⁻¹(Bᶜ) ∈ Σ
↔ Bᶜ ∈ β

3) Given a sequence {B_i}, then:

∀i ∈ N, B_i ∈ β
↔ ∀i ∈ N, B_i ∈ B(R) ∧ X⁻¹(B_i) ∈ Σ
→ ⋃_{i=1}^{∞} B_i ∈ B(R) ∧ ⋃_{i=1}^{∞} X⁻¹(B_i) ∈ Σ
↔ ⋃_{i=1}^{∞} B_i ∈ B(R) ∧ X⁻¹(⋃_{i=1}^{∞} B_i) ∈ Σ
↔ ⋃_{i=1}^{∞} B_i ∈ β

This theorem is of utmost importance, since it is much more convenient to prove
that a single type of interval satisfies the condition of measurability than to prove
it for all Borel sets, and it can be shown analogously that the result holds for:
1) (X < x) ∈ Σ
2) (X > x) ∈ Σ
3) (X ≥ x) ∈ Σ
4) (a < X < b) ∈ Σ
since the Borel σ-algebra can also be written as generated by these intervals.

2.8.9 Types of random variable


Given a probability space (Ω, Σ, P), if X and Y are random variables, C is a real
constant, and Z : Ω → R is a function, then the following functions are random
variables:

1) Z = C
2) CX
3) X + Y
4) XY
5) X/Y, with Y ≠ 0
6) |X|

Proofs:
1) Since Z is a function and Z = C, the preimage of any Borelian containing
C must be the domain of the function; that is, given a B ∈ B(R):

C ∈ B → Z⁻¹(B) = Ω
On the other hand, if C does not belong to the Borelian, then no domain value
will be associated with it; therefore:

C ∉ B → Z⁻¹(B) = ∅

Finally, since these are the only values that the preimages can take, and both belong
to Σ, Z is a random variable.

2) By the proposition it suffices to prove that:

∀x ∈ R, (CX ≤ x) ∈ Σ
(or one of its equivalent expressions). Given an arbitrary real x, we have 3 cases:
1- C > 0:
(CX ≤ x) ↔ (X ≤ x/C) ∈ Σ (because X is a random variable)
2- C < 0:
(CX ≤ x) = (X ≥ x/C) ∈ Σ
3- C = 0:
→ CX = 0
Therefore CX is a random variable (by property 1).

3) To prove this we will make use of the following identity:

(X + Y > x) = ⋃_{r∈Q} (X > r) ∩ (Y > x − r)

Since X and Y are random variables, (X > r) and (Y > x − r) are events; then
their intersection is also an event, and finally so is their countable union (since the
rationals are a countable set, as seen in the previous section), so it suffices to
demonstrate this identity to prove the proposed property.
⊂): Given an arbitrary w ∈ Ω:
X(w) + Y(w) > x
↔ X(w) > x − Y(w)
→ X(w) > r₀ ∧ r₀ > x − Y(w) for some r₀ ∈ Q (by the density of the rationals)
↔ X(w) > r₀ ∧ Y(w) > x − r₀
→ ∃r ∈ Q, X(w) > r ∧ Y(w) > x − r
↔ w ∈ ⋃_{r∈Q} (X > r) ∩ (Y > x − r)

⊃): Given an arbitrary w ∈ Ω:

w ∈ ⋃_{r∈Q} (X > r) ∩ (Y > x − r)
↔ ∃r ∈ Q, X(w) > r ∧ Y(w) > x − r
→ X(w) > r₀ ∧ Y(w) > x − r₀
→ X(w) + Y(w) > x (adding up the inequalities)

4) In the case X = Y we can prove that (XY < x) is an event:

(XY < x)
= (X² < x)
= (−√x < X < √x) ∈ Σ for x > 0 (and the empty set, also an event, for x ≤ 0)
If X = C or Y = C, then it is trivial by property 2.
If X ≠ Y:

XY = \frac{(X+Y)^2 - (X-Y)^2}{4}

and this latter is a random variable.

5) By property 4, it is sufficient to show that 1/Y is a random variable, and to
prove this property we will show that (1/Y ≤ x) is an event:
1) Case x > 0:

(1/Y ≤ x)
= ((1/Y ≤ x) ∩ (Y > 0)) ∪ ((1/Y ≤ x) ∩ (Y < 0))
= ((Y ≥ 1/x) ∩ (Y > 0)) ∪ (Y < 0)   [if Y < 0, then 1/Y < 0 ≤ x always holds]
= (Y ≥ 1/x) ∪ (Y < 0) ∈ Σ

2) Case x < 0:

(1/Y ≤ x)
= ((1/Y ≤ x) ∩ (Y > 0)) ∪ ((1/Y ≤ x) ∩ (Y < 0))
= ∅ ∪ ((Y ≥ 1/x) ∩ (Y < 0))
= (Y ∈ [1/x, 0)) ∈ Σ

3) Case x = 0:

(1/Y ≤ 0)
= ((1/Y ≤ 0) ∩ (Y > 0)) ∪ ((1/Y ≤ 0) ∩ (Y < 0))
= (1/Y ≤ 0) ∩ (Y < 0)
= (Y < 0) ∈ Σ

6) Case x ≥ 0:
(|X| ≤ x)
= (−x ≤ X ≤ x) ∈ Σ

Case x < 0:
Since |X(w)| ≥ 0 for all w ∈ Ω, (|X| ≤ x) = ∅ ∈ Σ, therefore it is an event.

Definition 3.2 (Distribution function): Given a probability space (Ω, Σ, P) and a random variable X : Ω → R, the distribution function F is defined as:

F : R → [0, 1]
F(x) = P(X ≤ x)

The distribution function can also be defined through its characteristic properties, which are as follows:
1) lim_{x→+∞} F(x) = 1
2) lim_{x→−∞} F(x) = 0
3) x₁ ≤ x₂ → F(x₁) ≤ F(x₂)
4) ∀x₀ ∈ R, lim_{x→x₀⁺} F(x) = F(x₀)

Proofs:
1) To prove this we will use the following property (continuity from below): given an increasing sequence of events {A_n}, then:

P(⋃_{n=1}^∞ A_n) = lim_{n→∞} P(A_n)

To prove this we define the sequence {B_n} such that:

B_1 = A_1
B_n = A_n − A_{n−1}, ∀n > 1

The above sequence has the following properties:
1) ∀i ≠ j, B_i ∩ B_j = ∅
2) ⋃_{n=1}^∞ B_n = ⋃_{n=1}^∞ A_n

Proof 1: By trichotomy there are two cases, j < i or i < j. In the first case it follows that A_{j−1} ⊂ A_j ⊂ A_{i−1} ⊂ A_i, since the sequence is increasing; then:
B_i ∩ B_j
= A_i ∩ A_{i−1}ᶜ ∩ A_j ∩ A_{j−1}ᶜ
= A_i ∩ A_{j−1}ᶜ ∩ (A_j ∩ A_{i−1}ᶜ)
= ∅
since A_j ⊂ A_{i−1} implies A_j ∩ A_{i−1}ᶜ = ∅. On the other hand, the case i < j implies A_{i−1} ⊂ A_i ⊂ A_{j−1} ⊂ A_j, yielding the same result analogously.

Proof 2 (by induction):
B.S:
⋃_{n∈{1}} A_n = ⋃_{n∈{1}} B_n = B_1 = A_1

I.H:
⋃_{n∈[1;m]} A_n = ⋃_{n∈[1;m]} B_n

I.S:
⋃_{n∈[1;m+1]} B_n = ⋃_{n∈[1;m]} B_n ∪ B_{m+1} = ⋃_{n∈[1;m]} A_n ∪ B_{m+1}
= ⋃_{n∈[1;m]} A_n ∪ (A_{m+1} ∩ A_mᶜ)
= (⋃_{n∈[1;m+1]} A_n) ∩ (⋃_{n∈[1;m]} A_n ∪ A_mᶜ)
= (⋃_{n∈[1;m+1]} A_n) ∩ Ω
= ⋃_{n∈[1;m+1]} A_n
where the last steps use that ⋃_{n∈[1;m]} A_n = A_m, since the sequence is increasing.

Then:
∀m ∈ N, ⋃_{n∈[1;m]} B_n = ⋃_{n∈[1;m]} A_n
∴ ⋃_{n∈N} A_n = ⋃_{n∈N} B_n
Thus:

P(⋃_{n=1}^∞ A_n) = P(B_1 ∪ ⋃_{n=2}^∞ B_n) = P(B_1) + Σ_{n=2}^∞ P(B_n) = P(A_1) + Σ_{n=2}^∞ P(A_n − A_{n−1})

= P(A_1) + lim_{m→∞} Σ_{n=2}^m [P(A_n) − P(A_{n−1})] = P(A_1) + lim_{m→∞} [P(A_m) − P(A_1)]

= lim_{m→∞} P(A_m)
Finally, given an increasing sequence of real numbers {x_n} diverging to +∞:

lim_{n→∞} F(x_n) = lim_{n→∞} P(X ≤ x_n) = lim_{n→∞} P(X⁻¹((−∞, x_n])) = P(⋃_{n=1}^∞ X⁻¹((−∞, x_n])) = P(Ω) = 1

∴ lim_{x→+∞} F(x) = 1

2) Thanks to the auxiliary property used in the previous demonstration, it can be verified that, given a decreasing sequence of events {A_n}:

P(⋂_{n=1}^∞ A_n) = lim_{n→∞} P(A_n)

Since, if {A_n} is decreasing, then {A_nᶜ} is increasing, therefore:

P(⋃_{n=1}^∞ A_nᶜ) = lim_{n→∞} P(A_nᶜ)
↔ P((⋂_{n=1}^∞ A_n)ᶜ) = lim_{n→∞} (1 − P(A_n))
↔ 1 − P(⋂_{n=1}^∞ A_n) = lim_{n→∞} (1 − P(A_n))
∴ P(⋂_{n=1}^∞ A_n) = lim_{n→∞} P(A_n)

With this, given a decreasing sequence of real numbers {x_n} diverging to −∞:

lim_{n→∞} F(x_n) = lim_{n→∞} P(X ≤ x_n) = P(⋂_{n=1}^∞ X⁻¹((−∞, x_n])) = P(∅) = 0

∴ lim_{x→−∞} F(x) = 0

3) x₁ ≤ x₂
→ (X ≤ x₁) ⊂ (X ≤ x₂)
→ P(X ≤ x₁) ≤ P(X ≤ x₂)
↔ F(x₁) ≤ F(x₂)

4) Given a sequence of real numbers {x_n} decreasing to 0, it can be seen that:
F(x₀ + x_n) = P(X ≤ x₀ + x_n)
= P((X ≤ x₀) ∪ (x₀ < X ≤ x₀ + x_n))
= P(X ≤ x₀) + P(x₀ < X ≤ x₀ + x_n)
= F(x₀) + P(x₀ < X ≤ x₀ + x_n)

Then:

lim_{n→∞} F(x₀ + x_n) = F(x₀) + lim_{n→∞} P(x₀ < X ≤ x₀ + x_n)

= F(x₀) + P(⋂_{n=1}^∞ (x₀ < X ≤ x₀ + x_n)) = F(x₀) + P(∅) = F(x₀)

∴ lim_{x→x₀⁺} F(x) = F(x₀)

The importance of characterizing the distribution function by these properties is that every function satisfying 1, 2, 3 and 4 has an associated random variable X defined on some probability space (Ω, Σ, P). We can therefore construct distribution functions and work with them while ignoring the underlying space, which is much more versatile.
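As an informal illustration (not part of the original development), the following Python sketch builds F for a small discrete random variable and spot-checks the four characteristic properties numerically; the probability table is an assumed toy example.

# Sketch: distribution function of an assumed discrete law
# P(X=0)=0.2, P(X=1)=0.5, P(X=3)=0.3.
pmf = {0: 0.2, 1: 0.5, 3: 0.3}

def F(x):
    # F(x) = P(X <= x): add the point masses at or below x.
    return sum(p for value, p in pmf.items() if value <= x)

# Properties 1 and 2: limits at +inf / -inf (checked at large finite points).
assert abs(F(1e9) - 1.0) < 1e-12 and F(-1e9) == 0.0
# Property 3: monotonicity on a grid of points.
grid = [-1, 0, 0.5, 1, 2, 3, 4]
assert all(F(a) <= F(b) for a, b in zip(grid, grid[1:]))
# Property 4: right-continuity at the jump x0 = 1.
assert abs(F(1 + 1e-9) - F(1)) < 1e-6
print(F(0), F(1), F(2), F(3))  # ~ 0.2 0.7 0.7 1.0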

2.8.10 Properties of the distribution function

Given a random variable X with distribution function F(x), if a, b are real numbers such that a < b, then:
1) P(X < a) = lim_{x→a⁻} F(x)
2) P(X = a) = F(a) − lim_{x→a⁻} F(x)
3) P(a < X ≤ b) = F(b) − F(a)
4) P(a ≤ X ≤ b) = F(b) − lim_{x→a⁻} F(x)
5) P(a < X < b) = lim_{x→b⁻} F(x) − F(a)
6) P(a ≤ X < b) = lim_{x→b⁻} F(x) − lim_{x→a⁻} F(x)

Proofs:
1) Given a real sequence {x_n} decreasing to 0, {a − x_n} is a real sequence increasing to a, therefore:

lim_{x→a⁻} F(x) = lim_{n→∞} F(a − x_n) = lim_{n→∞} P(X ≤ a − x_n) = P(⋃_{n=1}^∞ (X ≤ a − x_n)) = P(X < a)

2) By property 1:

P(X = a) = P((X ≤ a) ∩ (X ≥ a)) = P((X ≤ a) ∩ (X < a)ᶜ) = P((X ≤ a) − (X < a))

= P(X ≤ a) − P(X < a) = F(a) − lim_{x→a⁻} F(x)

3) Since a < b:

P(a < X ≤ b) = P((X ≤ b) ∩ (X ≤ a)ᶜ) = P((X ≤ b) − (X ≤ a))

= P(X ≤ b) − P(X ≤ a) = F(b) − F(a)

4) By property 1:

P(a ≤ X ≤ b) = P((X ≤ b) ∩ (X < a)ᶜ) = P((X ≤ b) − (X < a))

= P(X ≤ b) − P(X < a) = F(b) − lim_{x→a⁻} F(x)

5)
P(a < X < b) = P((X < b) − (X ≤ a)) = P(X < b) − P(X ≤ a)
= lim_{x→b⁻} F(x) − F(a)

6)
P(a ≤ X < b) = P((X < b) − (X < a)) = P(X < b) − P(X < a)
= lim_{x→b⁻} F(x) − lim_{x→a⁻} F(x)
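A self-contained numerical check of properties 2 and 3 on an assumed toy law, approximating the left limit by evaluating F just below the point:

# Assumed discrete law P(X=1)=0.4, P(X=2)=0.6.
pmf = {1: 0.4, 2: 0.6}

def F(x):
    return sum(p for v, p in pmf.items() if v <= x)

eps = 1e-9  # stand-in for the left limit lim_{x->a-} F(x)
a, b = 1, 2
print(F(a) - F(a - eps))  # property 2: P(X = 1) -> 0.4
print(F(b) - F(a))        # property 3: P(1 < X <= 2) -> 0.6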

2.8.11 Types of random variables and their density function

Definition 3.3 (Discrete random variable): Given a probability space (Ω, Σ, P), the random variable X is said to be discrete if Rang(X) is finite or countable.

Definition 3.4 (Continuous random variable): Given a probability space (Ω, Σ, P), the random variable X is said to be continuous if Rang(X) is a subinterval of the real numbers.

Density function (discrete random variable)

Given a discrete random variable X, the density function f : R → R is defined as:

f(x) = P(X = x) : ∀x ∈ Rang(X)
f(x) = 0 : otherwise

On the basis of the latter, it can be noted that the distribution function F can be written as:

F(x) = Σ_{t≤x} f(t)

Figure 2.4: Graph of the density function of a discrete random variable



Figure 2.5: Graph of the distribution function of a discrete random variable

Density function (continuous random variable)

Given a continuous random variable X with continuous distribution function F, the density function f : R → R is defined as the non-negative, integrable function such that:

F(x) = ∫_{−∞}^{x} f(t) dt

Figure 2.6: Graph of the density function of a continuous random variable

Note: First, since the density is non-negative, the probability of an event can be interpreted as the area under the curve of the density function. On the other hand, since F is continuous, the properties of F described above simplify drastically, giving:

P(a ≤ X ≤ b) = P(a < X < b) = P(a < X ≤ b) = P(a ≤ X < b)


2.8.12 Independent random variables

Two random variables X and Y are said to be independent if:

∀x, y ∈ R, P(X ≤ x ∩ Y ≤ y) = P(X ≤ x)·P(Y ≤ y)

In the discrete case this is equivalent to:

∀x, y ∈ R, P(X = x ∩ Y = y) = P(X = x)·P(Y = y)

As can be noticed, the probability measure now takes values from two random variables X and Y; in such a case a joint distribution function can be associated, depending on the values taken by both variables:

F_XY : R² → R
F_XY(x, y) = P(X ≤ x ∩ Y ≤ y)

In the discrete case the joint density and distribution functions are:

f(x, y) = P(X = x ∩ Y = y) : ∀(x, y) ∈ Rang(X) × Rang(Y)
f(x, y) = 0 : otherwise

→ F(x, y) = Σ_{t≤x} Σ_{r≤y} f(t, r)

and in the continuous case:

F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(t, r) dr dt

Note: As with multiparticle mechanics, theorems will generally be proved for the discrete case only, since the continuous proofs are analogous, with integrals in place of sums.

Notation:

P(X = x ∩ Y = y) ≡ P(X = x, Y = y)

Total probability: Given two random variables X and Y (with Y discrete), then:

P(X = x) = Σ_y P(X = x, Y = y)
P(Y = y) = Σ_x P(X = x, Y = y)

Proof:

P(X = x) = P((X = x) ∩ Ω) = P((X = x) ∩ ⋃_y (Y = y)) = Σ_y P(X = x, Y = y)

since the events (Y = y) are pairwise disjoint and their union over the countable range of Y is Ω. In particular, if X and Y are independent, Σ_y P(X = x, Y = y) = Σ_y P(X = x)·P(Y = y) = P(X = x)·Σ_y P(Y = y) = P(X = x), which is consistent.
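A tiny sketch of marginalization over an assumed joint table:

# Marginal P(X = x) as a sum of an assumed joint pmf over y.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def marginal_X(x):
    return sum(p for (xv, yv), p in joint.items() if xv == x)

print(marginal_X(0), marginal_X(1))  # ~ 0.4 0.6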

2.8.13 Moments of a random variable (and expectation)

Given a discrete random variable X, its r-th moment is defined as:

μ_r(X) = Σ_x x^r f(x)

On the other hand, if the random variable is continuous it is defined as:

μ_r(X) = ∫_R x^r f(x) dx

Note: Those who have taken a course in statistics will notice that the first moment is nothing more than the expectation of the random variable, which represents its average value (when it converges). Why the expectation expresses an average is best understood in the discrete case; the continuous case essentially replaces the summation by an integral. First, recall the frequency interpretation:

f(x) = P(X = x) = lim_{n→∞} (X = x)*/n

where (X = x)* denotes the number of runs, out of n, in which the outcome x occurs. Then, in the discrete case, the expectation E(X) is:

E(X) = Σ_x x f(x) = Σ_x x P(X = x) = lim_{n→∞} Σ_x x·(X = x)*/n

Since x is the value taken by the random variable, (X = x)* the frequency with which it is repeated, and n the number of runs of the experiment, E(X) is a weighted average of the values of X, dominated by the values that are repeated the most times.
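The frequency interpretation can be illustrated by simulation (assumed experiment: a fair six-sided die, so E(X) = 3.5):

# Sketch: the sample mean of many runs approaches E(X).
import random

random.seed(0)
n = 100_000
runs = [random.randint(1, 6) for _ in range(n)]
print(sum(runs) / n)  # close to 3.5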

2.8.14 Theorem of change of variable in density functions

Given a continuous random variable X with density function f_X(x), suppose there is a continuous, strictly increasing or strictly decreasing function g : R → R having a differentiable inverse g⁻¹. In that case, given the random variable Y = g(X), its density function can be written as:

f_Y(y) = f_X(g⁻¹(y))·|d/dy g⁻¹(y)| : ∀y ∈ Rang(g(X))
f_Y(y) = 0 : otherwise

Proof: In the increasing case, given an arbitrary y ∈ Rang(g(X)):

F_Y(y) = P(Y ≤ y)
= P(g(X) ≤ y)
= P(X ≤ g⁻¹(y))
= F_X(g⁻¹(y))

But we know that:

d/dy F_Y(y) = f_Y(y)
↔ f_Y(y) = f_X(g⁻¹(y))·d/dy g⁻¹(y) (by the chain rule)
↔ f_Y(y) = f_X(g⁻¹(y))·|d/dy g⁻¹(y)| (since (g⁻¹)′ > 0)

In the decreasing case, given an arbitrary y ∈ Rang(g(X)):

F_Y(y) = P(Y ≤ y)
= P(g(X) ≤ y)
= P(X ≥ g⁻¹(y))
= P((X < g⁻¹(y))ᶜ)
= 1 − P(X < g⁻¹(y))
= 1 − P(X ≤ g⁻¹(y)) (since X is continuous)
= 1 − F_X(g⁻¹(y))

→ f_Y(y) = −f_X(g⁻¹(y))·d/dy g⁻¹(y)
↔ f_Y(y) = f_X(g⁻¹(y))·(−d/dy g⁻¹(y))
↔ f_Y(y) = f_X(g⁻¹(y))·|d/dy g⁻¹(y)| (since (g⁻¹)′ < 0)
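A simulation sketch of the theorem under an assumed example: X uniform on (0, 1) and g(x) = x², so g⁻¹(y) = √y and f_Y(y) = 1/(2√y) on (0, 1).

# Empirical density of Y = X**2 vs the change-of-variable formula.
import random

random.seed(1)
ys = [random.random() ** 2 for _ in range(200_000)]

y0, h = 0.25, 0.01  # estimate f_Y(y0) by counting points in a small bin
empirical = sum(y0 <= y < y0 + h for y in ys) / (len(ys) * h)
formula = 1.0 / (2.0 * y0 ** 0.5)  # f_X(sqrt(y)) * |d/dy sqrt(y)| with f_X = 1
print(empirical, formula)  # both close to 1.0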

2.8.15 Expectation of functions of a random variable (theorem)
Given a random variable X, suppose there is a function g : R → R such that g(X) = Y is a random variable with E(g(X)) < ∞. Then, if X is discrete:

E(g(X)) = Σ_x g(x) f(x)

On the other hand, if X is continuous:

E(g(X)) = ∫_R g(x) f(x) dx
Proof:
E(Y) = Σ_y y f_Y(y)
= Σ_y y P(Y = y)

Knowing that:

g⁻¹({y}) = {x : g(x) = y}

Then:

X ∈ g⁻¹({y}) ↔ Y = y

Thus:
E(Y) = Σ_y y P(X ∈ g⁻¹({y}))
= Σ_y y Σ_{x∈g⁻¹({y})} P(X = x)
= Σ_y Σ_{x∈g⁻¹({y})} y P(X = x)
= Σ_y Σ_{x∈g⁻¹({y})} g(x) P(X = x)
= Σ_x g(x) P(X = x)

In the case of a function of two random variables, these results extend as follows:

E(g(X, Y)) = Σ_{x,y} g(x, y) f(x, y)

E(g(X, Y)) = ∫_R ∫_R g(x, y) f(x, y) dy dx

Properties of the expectation:

Given random variables X and Y and a real constant C, then:
1) E(C) = C
2) E(C·X) = C·E(X)
3) E(X + Y) = E(X) + E(Y)
4) X, Y independent → E(X·Y) = E(X)·E(Y)

Proofs:
1) E(C) = Σ_x x P(C = x)
= C·P(C = C)
= C

2) E(C·X) = Σ_x C·x P(X = x)
= C·Σ_x x P(X = x)
= C·E(X)

3) E(X + Y) = Σ_{x,y} (x + y) P(X = x, Y = y)
= Σ_{x,y} x P(X = x, Y = y) + Σ_{x,y} y P(X = x, Y = y)
= Σ_x x Σ_y P(X = x, Y = y) + Σ_y y Σ_x P(X = x, Y = y)
= Σ_x x P(X = x) + Σ_y y P(Y = y)
= E(X) + E(Y)

4) E(X·Y) = Σ_{x,y} x·y P(X = x, Y = y)
= Σ_{x,y} x P(X = x)·y P(Y = y) (by independence)
= Σ_x x P(X = x)·Σ_y y P(Y = y)
= E(X)·E(Y)
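A quick exact check of properties 3 and 4 on an assumed pair of independent fair dice (summing over the joint pmf, which is 1/36 per cell):

vals = range(1, 7)
EX = sum(x / 6 for x in vals)                            # 3.5
EXplusY = sum((x + y) / 36 for x in vals for y in vals)
EXY = sum(x * y / 36 for x in vals for y in vals)
print(EXplusY, 2 * EX)  # ~7.0 ~7.0
print(EXY, EX * EX)     # ~12.25 ~12.25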

2.8.16 Variance
The variance is a measure of the dispersion of the values taken by the random variable around its mean, and is defined as:

V(X) = E[(X − E(X))²]

Note: Mathematically, the variance is the average of the squared deviations. Why squared? If the deviations were not squared, the negative differences would cancel the positive ones when averaged; in fact, with the properties already seen one can check that E[X − E(X)] = 0 always. One could instead use the absolute value, but the quadratic term is more versatile for calculation, since the absolute value is not differentiable everywhere.
Thanks to the properties of expectation, the square can be expanded to obtain:

V(X) = E[(X − E(X))²]
= E[X² − 2X·E(X) + E(X)²]
= E[X²] − E[2X·E(X)] + E[E(X)²]
= E[X²] − 2E(X)·E[X] + E(X)²
= E[X²] − 2E(X)² + E(X)²
= E[X²] − E[X]²
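A one-line check of the expanded formula on an assumed fair die, whose variance is 35/12 ≈ 2.9167:

# V(X) = E[X^2] - E[X]^2 for an assumed fair die.
vals = range(1, 7)
EX = sum(x / 6 for x in vals)
EX2 = sum(x * x / 6 for x in vals)
print(EX2 - EX**2, 35 / 12)  # ~2.9167 both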

Properties of the variance

1) V(C) = 0
2) V(C·X) = C²·V(X)
3) V(X + C) = V(X)
4) X, Y independent → V(X + Y) = V(X) + V(Y)

Proofs:
1) V(C) = E(C²) − E(C)²
= C² − C²
= 0

2) V(C·X) = E((C·X)²) − E(C·X)²
= C²·E(X²) − (C·E(X))²
= C²·E(X²) − C²·E(X)²
= C²·(E(X²) − E(X)²)
= C²·V(X)

3) V(X + C) = E((X + C)²) − E(X + C)²
= E(X² + 2XC + C²) − (E(X) + E(C))²
= E(X²) + 2C·E(X) + C² − (E(X) + C)²
= E(X²) + 2C·E(X) + C² − (E(X)² + 2C·E(X) + C²)
= E(X²) − E(X)²
= V(X)

4) V(X + Y) = E((X + Y)²) − E(X + Y)²
= E(X² + 2XY + Y²) − (E(X) + E(Y))²
= E(X²) + 2E(XY) + E(Y²) − (E(X)² + 2E(X)·E(Y) + E(Y)²)
= E(X²) + 2E(X)·E(Y) + E(Y²) − (E(X)² + 2E(X)·E(Y) + E(Y)²) (by independence)
= E(X²) − E(X)² + E(Y²) − E(Y)²
= V(X) + V(Y)

Note: It is important to mention that squaring the deviations yields a measure of dispersion whose units are also squared; to return to the units of interest, the square root of the variance is taken, which is known as the standard deviation:

s = √(V(X))

2.8.17 Discrete uniform distribution

Given a discrete random variable X such that Rang(X) = {x_i}_{i=1}^n, it is said that X ∼ Uniform({x_i}_{i=1}^n) (X has a uniform distribution on the discrete set {x_i}_{i=1}^n) if it has an equiprobable density function, that is:

f(x) = 1/n : ∀x ∈ {x_i}_{i=1}^n
f(x) = 0 : otherwise

In this way it can also be noted that:

E[X] = Σ_x x·(1/n) = (1/n)·Σ_{i=1}^n x_i = A.M({x_i}_{i=1}^n)

V(X) = E[(X − E[X])²] = Σ_x (x − A.M({x_i}_{i=1}^n))²·(1/n) = (1/n)·Σ_{i=1}^n (x_i − A.M({x_i}_{i=1}^n))²

In addition, if the set {x_i}_{i=1}^n is ordered so that x_i < x_j whenever i < j, then, for i ∈ [1; n − 1]:

F(x) = 0 : ∀x < x_1
F(x) = i/n : ∀x ∈ [x_i, x_{i+1})
F(x) = 1 : ∀x ≥ x_n
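A quick check of these formulas on an assumed value set:

# E, V and F for a uniform variable on an assumed set of values.
xs = [2, 3, 5, 7, 11]
n = len(xs)
mean = sum(xs) / n                          # A.M = 5.6
var = sum((x - mean) ** 2 for x in xs) / n  # 10.24
F = lambda x: sum(v <= x for v in xs) / n   # F(x) = (# of values <= x) / n
print(mean, var, F(5))  # 5.6 10.24 0.6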

Figure 2.7: Graph of the density and distribution functions of the discrete uniform distribution



2.8.18 Bernoulli distribution

A Bernoulli trial is defined as an experiment with only two possible outcomes, called success (S) and failure (F). Defining the random variable X : {S, F} → R such that X(S) = 1 and X(F) = 0, it is said that X ∼ Ber(p) with p ∈ (0, 1) and density function:

f(x) = p^x (1 − p)^{1−x} : ∀x ∈ {0, 1}
f(x) = 0 : otherwise

So that P(X = 1) = p and P(X = 0) = 1 − p. As for the distribution function, it can be deduced that:

F(x) = 0 : ∀x < 0
F(x) = 1 − p : ∀x ∈ [0, 1)
F(x) = 1 : ∀x ≥ 1

Figure 2.8: Graph of the density and distribution functions of the Bernoulli distribution

In addition:

E(X) = 0·(1 − p) + 1·p = p

V(X) = E[X²] − E[X]² = p − p² = p·(1 − p)



2.8.19 Binomial distribution

Suppose n Bernoulli trials are performed independently, each with the same probability p of success. In this case, define the random variable:

X : {S, F}^n → R
such that X indicates the number of trials that were successful. It can be seen that Rang(X) = [0; n]. On the other hand, to find the probability P(X = k) with k ∈ [0; n], we perform the following analysis. First, note that the trials are mutually independent, so the probability of an intersection is the product of the probabilities; hence the probability of obtaining first k consecutive successes and then n − k failures is:

p^k (1 − p)^{n−k}

However, since only the number of successes matters, we must multiply by the number of ways a sequence with k successes can occur, therefore:

P(X = k) = C(n, k) p^k (1 − p)^{n−k}

where C(n, k) = n!/[k!(n − k)!] denotes the binomial coefficient,
thus obtaining the density function:

f(x) = C(n, x) p^x (1 − p)^{n−x} : ∀x ∈ [0; n]
f(x) = 0 : otherwise

And with the latter, its distribution function (with i ∈ [0; n − 1]):

F(x) = 0 : ∀x < 0
F(x) = Σ_{k=0}^{i} C(n, k) p^k (1 − p)^{n−k} : ∀x ∈ [i, i + 1)
F(x) = 1 : ∀x ≥ n

Figure 2.9: Graph of the density and distribution functions of the Binomial distribution



When a random variable has a Binomial distribution, it is denoted as follows:

X ∼ Bin(n, p)
And as for the expectation and variance of this variable, they can be calculated as follows:

E(X) = Σ_{x=0}^{n} x·C(n, x) p^x (1 − p)^{n−x} = Σ_{x=1}^{n} x·C(n, x) p^x (1 − p)^{n−x}

= Σ_{x=1}^{n} x·n!/[(n − x)! x!] p^x (1 − p)^{n−x} = Σ_{x=1}^{n} n!/[(n − x)!(x − 1)!] p^x (1 − p)^{n−x}

= n·Σ_{x=1}^{n} (n − 1)!/[(n − x)!(x − 1)!] p^x (1 − p)^{n−x} = n·Σ_{x=1}^{n} (n − 1)!/[((n − 1) − (x − 1))!(x − 1)!] p^x (1 − p)^{n−x}

= n·Σ_{x=1}^{n} C(n − 1, x − 1) p^x (1 − p)^{n−x} = n·Σ_{x=0}^{n−1} C(n − 1, x) p^{x+1} (1 − p)^{n−(x+1)}

= np·Σ_{x=0}^{n−1} C(n − 1, x) p^x (1 − p)^{(n−1)−x} = np·(p + (1 − p))^{n−1} = np

V(X) = E(X²) − E(X)²
= E(X(X − 1) + X) − (np)²
= E(X(X − 1)) + np − (np)²

But by the property of section 2.8.15:

E(X(X − 1)) = Σ_{x=0}^{n} x(x − 1)·C(n, x) p^x (1 − p)^{n−x}
= Σ_{x=2}^{n} x(x − 1)·C(n, x) p^x (1 − p)^{n−x}
= n(n − 1)·Σ_{x=2}^{n} C(n − 2, x − 2) p^x (1 − p)^{n−x}
= n(n − 1) p²·Σ_{x=0}^{n−2} C(n − 2, x) p^x (1 − p)^{(n−2)−x}
= n(n − 1) p²

Then:
V(X) = (np)² − np² + np − (np)²
= np − np²
= np(1 − p)
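A short exact check of E = np and V = np(1 − p) by summing over the pmf (assumed parameters n = 10, p = 0.3):

# Exact E(X) and V(X) for an assumed Bin(10, 0.3) via its pmf.
from math import comb

n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
EX = sum(k * pk for k, pk in enumerate(pmf))
EX2 = sum(k * k * pk for k, pk in enumerate(pmf))
print(EX, n * p)                     # ~3.0 3.0
print(EX2 - EX**2, n * p * (1 - p))  # ~2.1 2.1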

2.8.20 Geometric distribution

Like the previous distribution, the geometric distribution is also based on a sequence of independent Bernoulli trials; however, in this case the random variable X models the number of failures before the first success. It is intuitive that the range of the random variable is N₀, because success can come immediately, or after one failure, or after two, and so on. On the other hand, since the trials are independent, the probability of achieving a success after exactly x failures is given by the following density function:

f(x) = (1 − p)^x p : ∀x ∈ N₀
f(x) = 0 : otherwise

As for its distribution function, we first deduce F(x) for x ∈ N₀ and then complete it with the remaining pieces:

F(x) = Σ_{t≤x} f(t) = Σ_{t=0}^{x} (1 − p)^t p = p·Σ_{t=0}^{x} (1 − p)^t = p·[1 − (1 − p)^{x+1}]/[1 − (1 − p)]

= 1 − (1 − p)^{x+1}

Therefore, for i ∈ N₀:

F(x) = 0 : ∀x < 0
F(x) = 1 − (1 − p)^{i+1} : ∀x ∈ [i, i + 1)

Figure 2.10: Graph of the density and distribution functions of the Geometric distribution

The expectation and variance of the random variable can be calculated as follows:

E(X) = Σ_{x=0}^{∞} x (1 − p)^x p
= p(1 − p)·Σ_{x=0}^{∞} x (1 − p)^{x−1}
= −p(1 − p)·Σ_{x=0}^{∞} d/dp (1 − p)^x
= −p(1 − p)·d/dp Σ_{x=0}^{∞} (1 − p)^x
= −p(1 − p)·d/dp [1/(1 − (1 − p))]
= −p(1 − p)·d/dp (1/p)
= −p(1 − p)·(−1/p²)
= (1 − p)/p

V(X) = E(X²) − E(X)²
= E(X(X − 1)) + E(X) − [(1 − p)/p]²
= E(X(X − 1)) + (1 − p)/p − [(1 − p)/p]²

But:
E(X(X − 1)) = Σ_{x=0}^{∞} x(x − 1) (1 − p)^x p
= p(1 − p)²·Σ_{x=0}^{∞} x(x − 1) (1 − p)^{x−2}
= p(1 − p)²·d²/dp² Σ_{x=0}^{∞} (1 − p)^x
= p(1 − p)²·d²/dp² (1/p)
= p(1 − p)²·(2/p³)
= 2(1 − p)²/p²

Then:
V(X) = (1 − p)/p + 2(1 − p)²/p² − (1 − p)²/p²
= (1 − p)/p + (1 − p)²/p²
= (1 − p)/p²
Note: The above distribution can also be described over the naturals, taking the random variable X as the number of attempts needed to obtain the first success, so that:

f(x) = (1 − p)^{x−1} p : ∀x ∈ N
f(x) = 0 : otherwise

And under the same reasoning as above, it can also be shown that:

E(X) = 1/p ; V(X) = (1 − p)/p²

F(x) = 0 : ∀x < 1
F(x) = 1 − (1 − p)^i : ∀x ∈ [i, i + 1)

with i ∈ N.

Notation: X ∼ Geo(p).
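A numerical check of E(X) = (1 − p)/p for the failures-before-success version (assumed p = 0.25; the infinite series is truncated far into its negligible tail):

# E(X) for an assumed Geo(0.25), failures before the first success.
p = 0.25
EX = sum(x * (1 - p) ** x * p for x in range(2000))
print(EX, (1 - p) / p)  # ~3.0 3.0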

2.8.21 Poisson distribution

The Poisson distribution models the number of events occurring in a fixed interval of time or space. It can be constructed from the random variable X ∼ Bin(n, p) by letting n tend to infinity and p tend to 0 in such a way that the expected value E(X) = np converges to a finite value; this limit is denoted λ, and the model can be adapted to time- or space-based settings by appropriate substitutions (shown later).

Deduction: Given X ∼ Bin(n, p), if n tends to infinity and p to 0 with np = λ, the following substitution can be made in its density function (for x ∈ N₀):

f(x) = lim_{n→∞, p→0} n!/[(n − x)! x!] p^x (1 − p)^{n−x}
= lim_{n→∞} (1/x!) [Π_{k=0}^{x−1} (n − k)] (λ/n)^x (1 − λ/n)^{n−x}
= (λ^x/x!)·lim_{n→∞} [Π_{k=0}^{x−1} (n − k)/n^x]·lim_{n→∞} (1 − λ/n)^n·lim_{n→∞} (1 − λ/n)^{−x}
= λ^x e^{−λ}/x!

Then, under such conditions, the random variable is said to have a Poisson distribution with parameter lambda, denoted X ∼ Poi(λ) (with λ > 0). Its density function is:

f(x) = λ^x e^{−λ}/x! : ∀x ∈ N₀
f(x) = 0 : otherwise

And the distribution function (with i ∈ N₀):

F(x) = 0 : ∀x < 0
F(x) = Σ_{k=0}^{i} λ^k e^{−λ}/k! : ∀x ∈ [i, i + 1)

Figure 2.11: Graph of the density and distribution functions of the Poisson distribution

As for the expectation and variance, they can be calculated as follows:

E(X) = Σ_{x=0}^{∞} x λ^x e^{−λ}/x!
= e^{−λ}·Σ_{x=0}^{∞} x λ^x/x!
= e^{−λ} λ·Σ_{x=0}^{∞} x λ^{x−1}/x!
= e^{−λ} λ·Σ_{x=0}^{∞} d/dλ (λ^x/x!)
= e^{−λ} λ·d/dλ Σ_{x=0}^{∞} λ^x/x!
= e^{−λ} λ·d/dλ (e^λ)
= e^{−λ} λ e^λ = λ

V(X) = E(X²) − E(X)²
= E(X²) − λ²
= E(X(X − 1) + X) − λ²
= E(X(X − 1)) + λ − λ²

But:
E(X(X − 1)) = Σ_{x=0}^{∞} x(x − 1) e^{−λ} λ^x/x!
= e^{−λ}·Σ_{x=0}^{∞} x(x − 1) λ^x/x!
= e^{−λ} λ²·Σ_{x=0}^{∞} x(x − 1) λ^{x−2}/x!
= e^{−λ} λ²·Σ_{x=0}^{∞} d²/dλ² (λ^x/x!)
= e^{−λ} λ²·d²/dλ² Σ_{x=0}^{∞} λ^x/x!
= e^{−λ} λ²·d²/dλ² e^λ
= λ²

Then:
V(X) = λ² + λ − λ² = λ
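The limiting construction can be sketched numerically (assumed λ = 2, evaluating P(X = 3)): the Bin(n, λ/n) probabilities approach the Poisson ones as n grows.

# Bin(n, lam/n) pmf converging to the Poisson pmf at x = 3.
from math import comb, exp, factorial

lam, x = 2.0, 3
for n in (10, 100, 10_000):
    p = lam / n
    print(n, comb(n, x) * p**x * (1 - p) ** (n - x))
print("Poisson:", lam**x * exp(-lam) / factorial(x))  # ~0.1804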
Chapter 3

Basic definitions of
thermodynamics

In these notes the thermodynamics of systems in equilibrium will be analyzed, so it is prudent to formally define what a system is and which characteristics allow its mathematical modeling.
A three-dimensional thermodynamic system is a portion of volume V under study, contained in an environment U and whose boundary ∂V is a closed surface. Sometimes the modeling is reduced to two dimensions, where the volume is replaced by an area and its boundary by a closed curve. The union of the system with its environment is known as the thermodynamic universe, which "encloses" the entire region of interest for the study of the physical properties of the system.

Figure 3.1: Thermodynamic universe


An example of a thermodynamic system is the classic cylinder with a piston, where inside the cylinder there is a gas with a certain volume and pressure that keeps the system at rest unless an external force acts on it.

Figure 3.2: Example of a thermodynamic system

In this example, the system is usually the volume of gas, so the thermodynamic universe can be reduced to the gas together with the walls of the cylinder and the piston. The environment, then, consists of the regions that delimit the gas and keep it from escaping, i.e. the base of the piston and the inner walls of the cylinder.
Systems are mainly classified into 3 types: open, closed, and isolated. An open system is one that exchanges both energy and matter with the environment; an example is a pot of boiling water, where energy is exchanged in the form of heat and matter in the form of water molecules that evaporate and escape to the environment.

Figure 3.3: Example of open system

On the other hand, a closed system exchanges only energy; for example, if we cover the pot of boiling water, the gas can no longer escape, yet heat still enters the system through the walls of the pot.

Figure 3.4: Example of closed system



Finally, an isolated system is one that exchanges neither matter nor energy. Although perfectly isolated systems do not exist, some devices greatly reduce the transfer of energy, such as thermos flasks.

Figure 3.5: Example of isolated system
