Martin Andreas Väth - Nonstandard Analysis (2007, Birkhauser) PDF

Martin Väth
Nonstandard
Analysis
Birkhäuser Verlag
Basel · Boston · Berlin
Contents
Preface
1 Preliminaries
§1 Introduction
1.1 General Remarks
1.2 Archimedean Fields and Infinitesimals
§2 Superstructures, Sentences, and Interpretations
2.1 Superstructures
2.2 Formal Language
2.3 Interpretations
2 Nonstandard Models
§3 The Three Fundamental Principles
3.1 Elementary Embeddings and the Transfer Principle
3.2 The Standard Definition Principle
3.3 The Internal Definition Principle
3.4 Existence of External Sets
§4 Nonstandard Ultrapower Models
4.1 Ultrafilters
4.2 Ultrapowers
4.3 Embedding in a Superstructure
3 Nonstandard Real Analysis
§5 Hyperreal Numbers
5.1 Hyperreal and Hypernatural Numbers
5.2 Interpretation of the Standard Part Homomorphism
§6 The Permanence Principle and ∗-finite Sets
§7 Calculus
7.1 Sequences
7.2 Sets
7.3 Functions
4 Enlargements and Saturated Models
§8 Enlargements, Saturation, and Concurrency
§9 Saturated Models
9.1 Models for Enlargements
9.2 Compact Enlargements
9.3 Polysaturated Models
5 Functionals, Generalized Limits, and Additive Measures
§10 Normed Spaces
10.1 Linear Functionals and Operators
10.2 Hahn-Banach and Banach-Mazur Limits
§11 Additive Measures
6 Nonstandard Topology and Functional Analysis
§12 Topologies and Filters
12.1 Topological Spaces
12.2 Filters in Nonstandard Analysis
12.3 Topologies in Nonstandard Analysis
12.4 Functions in Nonstandard Topology
§13 Uniform Structures
13.1 Uniform Spaces
13.2 Nonstandard Hulls
§14 Topological Vector Spaces
7 Miscellaneous
§15 Loeb Measures
§16 Distributions
A Some Important ∗-Values
B Solutions to the Exercises
Bibliography
Index
Preface
Historically, the idea of nonstandard analysis was to rigorously justify
calculations with infinitesimal numbers. For example, formally, the chain rule of
Leibniz’ calculus for the function F = f(g(x)) can be written as
\[
\frac{dF}{dx} = \frac{dF}{dg} \cdot \frac{dg}{dx},
\]
and for a formal proof, one may just divide numerator and denominator by the
“infinitesimally small number” dg.
Nowadays, nonstandard analysis has gone far beyond the realm of infinites-
imals. In fact, it provides a machinery which enables one to describe “explicitly”
mathematical concepts which by standard methods can only be described “implic-
itly” and in a cumbersome way. In the above example the “standard” notion of
a limit is in a certain sense replaced by the “nonstandard” notion of an infinites-
imal. If one applies a similar approach to other objects than the real numbers
(like topological spaces or Banach spaces etc.), one has a tool which provides “ex-
plicit” definitions for objects which can in principle not be described explicitly
by standard methods. Examples of such objects are sets which are not Lebesgue
measurable, or functionals with certain properties like so-called Hahn–Banach lim-
its. Since it is possible in nonstandard analysis to simply “calculate” with such
objects, one can obtain results about them which are extremely hard to obtain by
standard methods.
This book is an introduction to nonstandard analysis. In contrast to some
other textbooks on this topic, it is not meant as an introduction to basic calculus
by nonstandard analysis. Instead, the above mentioned applications in analysis
(which are not easily accessible by standard methods) are our main motivation.
The infinitesimals are only described as an elementary example for the provided
machinery.
Consequently, the reader is supposed to be already familiar with (standard)
basic calculus. For deeper understanding, also experience with (basic) topology and
Chapter 1
Preliminaries
§ 1 Introduction
1.1 General Remarks
Historically, the idea of nonstandard analysis was to find a rigorous justification
for calculations with infinitesimal numbers. However, in the author’s opinion, this
is not the most important property of nonstandard analysis. Instead, it appears to
the author that it is more essential that so-called concurrent relations are satisfied.
We will make this more precise later, but we already mention that this means,
roughly speaking, the following.
If there is a statement which holds for any finite subset of a given set, then
it holds for the whole set in nonstandard analysis.
Consider, for example, for any set M of positive real numbers the statement
“there is some c > 0 with c < ε for all ε ∈ M ”. Clearly, this statement is true
for any finite set M of positive real numbers (in our later terminology, we denote
such a fact by “concurrency”). This implies that the statement is also true for the
set of all positive numbers in nonstandard analysis and so there indeed exists an
infinitesimal c > 0 which is less than any positive real number. In other words,
nonstandard analysis allows us to conclude that “true for each finite number”
implies “true for all”.
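The finite case of this example can be checked by direct computation. The following Python sketch is only an illustration: the function name `witness_below` is an assumption made here, and exact rationals stand in for real numbers. It returns half of the smallest element of a finite set M, which is a positive number below every ε ∈ M. No such computation can work for the set of all positive reals; that step is what nonstandard analysis supplies.

```python
from fractions import Fraction

def witness_below(finite_set):
    """Return some c > 0 with c < eps for every eps in the finite set."""
    values = list(finite_set)
    if any(eps <= 0 for eps in values):
        raise ValueError("all elements must be positive")
    if not values:
        return Fraction(1)          # any positive number works for the empty set
    return min(values) / 2          # half the minimum is below every element

M = [Fraction(1, 2), Fraction(1, 1000), Fraction(3)]
c = witness_below(M)                # 1/2000
```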
The formulation of the above considerations in precise mathematical terms
is rather involved. For this reason (and to have a further motivation up to this
point), we will first concentrate on the “classical” topic of nonstandard analysis:
This is Leibniz’ idea which may be described as follows. Leibniz’ program is to join
“infinitesimals” to the system ℝ of real numbers such that the enlarged system
obeys the same “rules” as ℝ. As we shall see, this program cannot be carried out
directly, because the system of real numbers is uniquely determined by these rules
(up to an isomorphism).
The solution proposed by A. Robinson and W. A. J. Luxemburg out of this
dilemma is the following: Consider together with ℝ a nonstandard real line ∗ℝ
which contains ℝ and also infinitesimal numbers as elements and which satisfies the
following: Any so-called transitively bounded sentence about ℝ can be transferred
into an analogous sentence about ∗ℝ, and the latter sentence is true if and only
if the sentence about ℝ is true. The crucial point in this concept is that more
sentences can be formulated about ∗ℝ than those transferred from a sentence
about ℝ (and many of these additional sentences are true). Later, these additional
sentences will be called sentences about nonstandard objects. A fundamental point
is that true sentences about nonstandard objects can be combined to give a true
sentence ∗α about ∗ℝ which can be obtained by transferring some sentence α
about ℝ. This allows us to conclude that α is true.
To make this approach precise, one has of course to define what is meant by
a “sentence α about ℝ”. Then one has to define what is meant by the transferred
sentence ∗α. This is the first problem we shall attack.
After this is done, there arises the fundamental question: Does there actually
exist an object ∗ℝ with the required properties? Or does in contrast the assumption
that such an object ∗ℝ exists even lead to a contradiction?
The answer to the first question is “yes” if one assumes the axiom of choice
(which we therefore do throughout). For this reason the answer to the second
question is “no” (even if one rejects the axiom of choice). However, the axiom of
choice really is essential.
Applying the above ideas, one can “explicitly construct” objects (in the non-
standard world) which in principle cannot be constructed in the standard world.
Such objects are e.g. sets which are not Lebesgue-measurable or so-called Hahn-
Banach limits: It is possible to prove the existence of such objects in the standard
world by means of the axiom of choice, but it is not possible to give explicit for-
mulas for them without the axiom of choice (even if one allows a weaker form
of this axiom which allows countable recursive or nonrecursive choices). In fact,
assuming the consistency of a so-called inaccessible cardinal, this was first proved
in the famous paper [Sol70]. Since in the nonstandard world we can really “calcu-
late” with such objects, it is easy to obtain results which cannot be obtained by
standard methods, or only with very abstract applications of the axiom of choice.
Thus, in a sense, nonstandard analysis might just be considered as a ma-
chinery to simplify such abstract applications of the axiom of choice by providing
objects which implicitly contain this application. Of course, nonstandard analysis
means actually much more, but in the author’s opinion this is the most impor-
tant advantage of nonstandard analysis over standard analysis: To have convenient
(almost “explicit”) representations of certain objects like Hahn–Banach limits for
which by standard methods more or less only their mere existence can be proved
with the axiom of choice.
Of course, the above property means that the axiom of choice must actually
be involved in the definition of the nonstandard world (or ∗ℝ); we will see that (a
rather strong form of) this axiom comes into play by the choice of an appropriate
so-called filter. Due to this crucial role of the axiom of choice in nonstandard
analysis, we will assume it throughout. Somebody who rejects the general axiom
of choice always has to replace phrases like “. . . then . . . is true” by a phrase like
“. . . then it does not lead to a contradiction to assume that . . . is true”.
The study of nonstandard analysis naturally divides into two parts: One
part is to define ∗ℝ, and the other part is to “work” with ∗ℝ by using the above
described transferring of sentences. The first part belongs to the realm of so-called
model theory while the second part can be considered as the actual nonstandard
analysis (or also just as an application of the first part). It turns out that the second
part can be done to a large extent without appealing to the first part, i.e. without
explicitly knowing how ∗ℝ is defined. There even is an approach to nonstandard
analysis (Nelson’s internal set theory [Nel77]; see also e.g. [vdB87, LG81, Ric82,
Rob88]) which completely hides the definition of ∗ℝ, and only uses some axioms
to describe a new set theory for ∗ℝ. However, we shall use Robinson’s approach
([Rob70], see also [AFHKL86, Cut88, Dav77, Gol98, HL85, LR94, Lux73, Lux69b,
SL76, SB86]) which is also concerned with the definition of ∗ℝ.
This has not only the advantage that we work with “more concrete” objects;
it also has an important practical advantage: In the definition of ∗ℝ one has
many choices, more than can be described by any axiomatic system (this is related
to the axiom of choice which we use for the construction: Roughly speaking, we
can fix any finite number of choices in a way that we like). Thus, by choosing an
“appropriate” definition of ∗ℝ, we can get some additional properties. This is why
the Robinson/Luxemburg approach is actually more powerful than the Nelson ap-
proach to nonstandard analysis. There are some applications where this difference
really plays a role. A comparison of the two approaches can be found in [DS88]
(see also [CK90, LR94]). There is another more algebraic approach to nonstandard
analysis via the so-called “Ω-calculus” (see [Lau86]) which, however, is in essence
contained in the Robinson/Luxemburg approach. For another approach due to
Hrbacek (which is similar to Nelson’s approach) and several other approaches and
comparisons, we refer the reader to the monograph [RK04].
The crucial point for further applications of nonstandard analysis is that the
definition of a nonstandard object ∗X is not only possible for the case X = ℝ but
also for any other object X, for example a topological space.
1.2 Archimedean Fields and Infinitesimals

As mentioned in Section 1.1, the idea that the set ∗ℝ (containing infinitesimals)
should have the same properties as ℝ soon leads to severe difficulties. We shall
discuss these difficulties now in more detail.
Recall that a relation ≤ on a set X is called an order , if
1. a ≤ a.
2. a ≤ b and b ≤ a implies a = b.
3. a ≤ b and b ≤ c implies a ≤ c.
The order is called total, if for each two elements a, b of the set we have either
a ≤ b or b ≤ a. We write a < b to denote that a ≤ b and a ≠ b. The order is called
a well-order if each nonempty subset has a smallest element. Each well-order is a
total order (consider the set {a, b} to see this).
A set X with two operations + and · is called a (commutative) field, if X is
a commutative group with respect to + (we denote the neutral element by 0_X),
and X \ {0_X} is a commutative group with respect to · (we denote the neutral
element by 1_X), 0_X · a = a · 0_X = 0_X, and if furthermore the distributive law
a(b + c) = ab + ac holds. X is a totally ordered field if it is equipped with a total
order such that a ≤ b implies a + c ≤ b + c for each c ∈ X, and such that the
relations a ≤ b and c ≥ 0_X imply ac ≤ bc.
Note that this implies a^2 ≥ 0_X for each a ∈ X: For a ≥ 0_X, this is clear, and for
a < 0_X, we have 0_X = a − a ≤ 0_X − a = −a, and so −a ≥ 0_X which implies
a^2 = (−a)^2 ≥ 0_X, as claimed.
Each totally ordered field X contains a “canonical copy” of the set ℕ, namely
{1_X, 1_X + 1_X, 1_X + 1_X + 1_X, . . .} (we write ℕ_X := {1_X, 2_X, . . .}). Note that
ℕ_X is indeed infinite, since 1_X < 2_X < 3_X < · · · because 1_X^2 = 1_X > 0_X, and
so e.g. 2_X < 2_X + 1_X = 3_X, and so on. Similarly, X contains “canonical copies”
ℤ_X and ℚ_X of the sets ℤ and ℚ of integer and rational numbers. By “canonical
copy”, we mean that there is an isomorphism, i.e. a bijection f : ℚ → ℚ_X which
preserves the order and the arithmetic operations (e.g. f(x + y) = f(x) + f(y)).
Theorem 1.1. In a totally ordered field X, the following statements are equivalent:
1. X has the Archimedean property: For each x ∈ X there is some n ∈ ℕ_X
such that n > x.
2. X has the Eudoxos property: For each ε ∈ X, ε > 0_X, there is some n ∈ ℕ_X
such that n^{-1} < ε.
3. ℚ_X is dense in X, i.e. for each x < y there is some q ∈ ℚ_X with x < q < y.
Proof. Assume that X has the Archimedean property, and x < y. Then we find
some n ∈ ℕ_X such that n > (y − x)^{-1} and some m ∈ ℕ_X with m > nx and
m > −nx. Let z be the smallest number of {−m, . . . , m} which satisfies z > nx,
hence z − 1_X ≤ nx. Then q := z/n ∈ ℚ_X satisfies q > x. Moreover, q < y, since
otherwise z ≥ ny = n(y − x) + nx > 1_X + nx, which contradicts z − 1_X ≤ nx.
Hence, ℚ_X is dense in X.
If ℚ_X is dense in X, then ε > 0_X implies that we find some q ∈ ℚ_X such
that ε > q > 0_X. Then q = z/n for z, n ∈ ℕ_X, and so ε > n^{-1} > 0_X. Hence, X
has the Eudoxos property.
Let X have the Eudoxos property and x ∈ X. If x ≤ 0_X, we have 1_X > x.
Otherwise ε := x^{-1} > 0_X, and we find some n ∈ ℕ_X with n^{-1} < ε. In both cases,
we find some n ∈ ℕ_X with n > x, and so X has the Archimedean property.
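The first step of this proof is constructive and can be traced in the Archimedean field ℚ itself. The following Python sketch is an illustration under assumptions: the function name `rational_between` is chosen here, exact `Fraction` values stand in for elements of X, and `math.floor` replaces the search for the smallest z in {−m, . . . , m}.

```python
import math
from fractions import Fraction

def rational_between(x, y):
    """Return q = z/n with x < q < y, following the proof of Theorem 1.1."""
    if not x < y:
        raise ValueError("need x < y")
    n = math.floor(1 / (y - x)) + 1   # a natural number n > (y - x)^(-1)
    z = math.floor(n * x) + 1         # smallest integer z with z > n*x, so z - 1 <= n*x
    return Fraction(z, n)

q = rational_between(Fraction(1, 3), Fraction(1, 2))   # 3/7
```

In ℚ one could of course simply take the midpoint of x and y; the sketch follows the proof because only that argument carries over to an arbitrary Archimedean field.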
If a totally ordered field X has the Archimedean property, we simply call X
an Archimedean field.
A totally ordered field X is called (Dedekind) complete if each nonempty
subset A ⊆ X which is bounded from above has a smallest upper bound (i.e. if
the set B of upper bounds for A is nonempty, this set has a minimum).
Given some totally ordered field X, we define the Dedekind completion X̄ of
X as follows:
A nonempty set A ⊆ ℚ_X is called a Dedekind cut, if
1. A is bounded from above,
2. A does not possess a largest element, and
3. the relations a ∈ A, b ∈ ℚ_X and b ≤ a imply b ∈ A.
(We emphasize that this definition is only useful if X is a totally ordered field as
in our situation; in more general situations of so-called Riesz spaces, one has to
use more sophisticated definitions, see e.g. [LZ71, §32]). The set X̄ of all Dedekind
cuts is ordered by inclusion, i.e. A ≤ B if and only if A ⊆ B. Note that this is a
total order, and in particular
A < B ⇐⇒ B \ A ≠ ∅.
We define an addition + on X̄ by
A + B = {a + b : a ∈ A and b ∈ B}.
Observe that A + B ∈ X̄: If A is bounded by a and B is bounded by b, then A + B
is bounded by a + b. The set A + B does not contain a largest element a + b with
a ∈ A, b ∈ B, since we find some a_0 ∈ A with a_0 > a, and so a + b < a_0 + b ∈ A + B.
Finally, we have for any a ∈ A, b ∈ B, c ∈ ℚ_X with c ≤ a + b that a_0 = c − b ∈ A,
and so c = a_0 + b ∈ A + B.

We define ℚ_X̄ as the set of all cuts of the form
q_X̄ := {x ∈ ℚ_X : x < q} (q ∈ ℚ_X).
Moreover, we put
\[
-A := \begin{cases}
\{b \in \mathbb{Q}_X : b < -a \text{ for all } a \in A\} = \{a \in \mathbb{Q}_X : -a \notin A\} & \text{if } A \notin \mathbb{Q}_{\overline{X}},\\
(-q)_{\overline{X}} & \text{if } A = q_{\overline{X}}.
\end{cases}
\]
For the equality in the above definition, observe that if b := −a ∉ A, then b > x
for all x ∈ A, and so a = −b < −x for all x ∈ A which implies that a ∈ −A;
conversely, if b < −a for all a ∈ A, then −b ∉ A. Note that −A is indeed a
Dedekind cut: The only nontrivial property is that if A ∉ ℚ_X̄, then −A ∉ ℚ_X̄
has no largest element. But if b were such a largest element, then A = (−b)_X̄, a
contradiction: Indeed, if a ∈ A, then b < −a by definition of −A, i.e. a ∈ (−b)_X̄.
Conversely, if x ∈ (−b)_X̄, then a := −x > b, and we must have x ∈ A: Otherwise
−a ∉ A which means a ∈ −A and contradicts the assumption that b < a is the
largest element of −A.
For A, B > 0_X̄, we define a multiplication
A · B := {a · b : a ∈ A, b ∈ B, a, b > 0_X} ∪ {x ∈ ℚ_X : x ≤ 0_X}
which defines a Dedekind cut: If a ∈ A, b ∈ B with a, b > 0_X and c ∈ ℚ_X satisfy
c < ab, then either c ∈ A · B because c ≤ 0_X, or a_0 := c/b < a, and so a_0 ∈ A
and a_0 > 0_X imply c = a_0 b ∈ A · B. No number a · b with a ∈ A, b ∈ B, a, b > 0_X
can be a largest element of A · B, since there is some a_0 ∈ A with a_0 > a, and so
ab < a_0 b ∈ A · B.
For A < 0_X̄ < B, we define A · B := −((−A) · B), for B < 0_X̄ < A, we define
A · B := −(A · (−B)), for A, B < 0_X̄, we put A · B := (−A) · (−B), and if A = 0_X̄
or B = 0_X̄, we put A · B := 0_X̄.
The name “Dedekind completion” is indeed justified for X̄:
Theorem 1.2 (Dedekind). The Dedekind completion X̄ of a totally ordered field X
is a complete Archimedean field with ℚ_X̄ as the canonical copy of ℚ_X.
If X is an Archimedean field, then X̄ contains a canonical copy of X. Moreover,
if X is a complete Archimedean field, then this canonical copy is X̄, i.e.
X is isomorphic to X̄ (i.e. there is a bijection which preserves the order and the
arithmetic operations).
Proof. The fact that X̄ is a totally ordered field is a straightforward verification
of the axioms. We leave the details to the reader.
To see that X̄ is Archimedean, we prove that ℚ_X̄ is dense in X̄: If A < B,
we find by definition some b ∈ B \ A. Since b is not a largest element of B, we
find some q ∈ B with q > b. Then A ≤ b_X̄ < q_X̄ < B, and so we have found some
q_X̄ ∈ ℚ_X̄ with A < q_X̄ < B, as desired.
Let us now prove that X̄ is complete. Thus, let 𝒜 ⊆ X̄ be a nonempty
subset which is bounded from above by some B ∈ X̄. We claim that S = ⋃𝒜 is
a Dedekind cut:
Indeed, S is bounded from above: Since B + 1_X̄ > B, we find some b ∈
(B + 1_X̄) \ B. Then b_X̄ ≥ A for each A ∈ 𝒜 which implies that b is an upper bound
for S. Moreover, S has no largest element, since for each a ∈ S we find some
A ∈ 𝒜 with a ∈ A and some a_0 > a with a_0 ∈ A ⊆ S. Finally, the relations s ∈ S
and s > a ∈ ℚ_X trivially imply a ∈ S.
Hence, S ∈ X̄. Since A ⊆ S for any A ∈ 𝒜, S is an upper bound for 𝒜.
But S is the smallest upper bound, since each upper bound for 𝒜 must at least
contain any set A ∈ 𝒜 as a subset.
Let now X be Archimedean. Then each of the sets
x_X̄ = {q ∈ ℚ_X : q < x} (x ∈ X) (1.1)
belongs to X̄: Since X is Archimedean, each x_X̄ is nonempty and bounded from
above. Moreover, since ℚ_X is dense (Theorem 1.1), x_X̄ has no maximal element.
Now the map x ↦ x_X̄ is the desired embedding of X into X̄: Since ℚ_X is dense,
we have x_X̄ < y_X̄ if and only if x < y (and so in particular the mapping is
one-to-one). Moreover, (x ± y)_X̄ = x_X̄ ± y_X̄, and similarly for the multiplication.
It remains to prove that this embedding is onto, if X is complete, i.e. that
any Dedekind cut A has the form (1.1) with some x ∈ X. But since X is Dedekind
complete, any Dedekind cut A has a smallest upper bound x = sup A in X which
implies A = x_X̄.
Theorem 1.3. Any Archimedean field X is isomorphic to a subfield of ℝ. If X is
Dedekind complete, it is even isomorphic to ℝ.
Proof. Let ℚ̄ denote the Dedekind completion of ℚ. We show that any
Archimedean field X is isomorphic to a subfield of ℚ̄ and isomorphic to ℚ̄
if X is complete. Since then in particular ℝ is isomorphic to ℚ̄, no matter which
definition for ℝ the reader wants to choose (for any possible definition, ℝ is a
Dedekind complete Archimedean field), the statement follows.
The isomorphism is defined as follows: Let f : ℚ → ℚ_X be the canonical
isomorphism. Then f induces an isomorphism F : ℚ̄ → X̄, defined by
F(A) = {q ∈ ℚ_X : f^{-1}(q) ∈ A}
(it is evident that this is a bijection, but it is also easily verified
that the algebraic operations and the order structure are preserved). Hence, ℚ̄
and X̄ are isomorphic. Since Theorem 1.2 implies that X̄ contains X as a subfield
(resp. is isomorphic to X if X is complete) we may conclude that X is isomorphic
to a subfield of ℚ̄ (resp. isomorphic to ℚ̄ if X is complete). This is what we had
to prove.
We thus have the problem that it actually is not possible to join “infinitesimals”
to the set ℝ such that we get a set ∗ℝ in which the same rules hold as in ℝ:
If ∗ℝ were in particular a Dedekind complete Archimedean field, we would have
∗ℝ = ℝ (up to an isomorphism). Thus, the main goal of introducing a set ∗ℝ
containing infinitesimals cannot be achieved! Then what is the rest of this book
about?
The trick is that although ∗ℝ is neither Archimedean nor Dedekind complete,
it shares all properties of ℝ! This appears to be a contradiction, but of course
depends on the definition of the term “property”: Of course, if one considers e.g.
Dedekind completeness as a property of ℝ, this is not true. However, Dedekind
completeness is actually not a property of ℝ or of the real numbers (i.e. of the
elements of ℝ) but a property of the subsets of ℝ. This may be considered as a
“higher type” property: Let us for a moment call statements concerning relations
of real numbers (such as e.g. the distributive law) properties of type 0, while we call
a statement concerning subsets of real numbers (such as Dedekind completeness)
a property of type 1; a statement about sets of subsets of real numbers could be
called a property of type 2, and so on. It turns out that ∗ℝ completely shares the
properties of ℝ of type 0. But it does not satisfy all properties of higher type.
However, the theory would not be very useful if ∗ℝ completely violated properties
of higher type. In a weak sense ∗ℝ also satisfies the properties of higher type of
ℝ, but with some restrictions. As remarked above, ∗ℝ is not Dedekind complete,
i.e. not every subset of ∗ℝ which is bounded from above has a smallest upper
bound. However, this holds for the so-called internal subsets of ∗ℝ which, roughly
speaking, are sets which can be described by properties of type 0.
Actually, one could be more precise than to define a type as a number;
instead one could choose a certain set which “represents the type” in a more
detailed way; such a theory of types was Robinson’s original approach. But this
is a rather technical procedure. Instead, we follow the approach of Luxemburg in
which one does not have to care much about types: It turns out that any bounded
sentence which holds for ℝ also holds for ∗ℝ. Properties of type 0 in the sense
sketched above can always be formulated as bounded sentences.
Let us also mention other ways of introducing infinitesimals by dropping
some of the axioms of Dedekind complete Archimedean fields. These approaches
are, however, not too useful for our purpose, since the constructed sets X have a
rather different structure than ℝ.
Example 1.4. Let X be the set of all rational functions (i.e. x(t) = p(t)/q(t) where
p and q are polynomials and q is not the zero-polynomial). Addition and
multiplication are defined in the evident way (pointwise), and an order is introduced
as follows: We say x ≤ y if there is some t_0 such that x(t) ≤ y(t) for t > t_0, and
x < y if x ≤ y and x ≠ y.
Then X is a totally ordered field which contains a copy of ℝ as the subset
of constant functions. Moreover, X contains nonzero infinitesimals, i.e. elements x
which satisfy 0 < x < ε for any real number ε > 0. Indeed, the element x(t) = 1/t
has this property.
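The order of Example 1.4 can be decided from leading coefficients alone, since the sign of a rational function for all sufficiently large t is the product of the signs of the leading coefficients of numerator and denominator. The following Python sketch is an illustration under assumptions: a rational function is represented as a pair (numerator, denominator) of coefficient lists, lowest degree first, and all helper names are chosen here.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (coefficients are stored lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def mul(p, q):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1) if p and q else []
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def sub(p, q):
    """Difference p - q of two polynomials."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def eventual_sign(num, den):
    """Sign of num(t)/den(t) for all sufficiently large t."""
    num, den = trim(num), trim(den)
    if not den:
        raise ZeroDivisionError("denominator is the zero polynomial")
    if not num:
        return 0
    return (1 if num[-1] > 0 else -1) * (1 if den[-1] > 0 else -1)

def less(x, y):
    """x < y in the order of Example 1.4: y - x is eventually positive."""
    (px, qx), (py, qy) = x, y
    return eventual_sign(sub(mul(py, qx), mul(px, qy)), mul(qx, qy)) > 0

zero = ([], [1])                      # the constant function 0
inv_t = ([1], [0, 1])                 # x(t) = 1/t
eps = ([Fraction(1, 10**9)], [1])     # a small positive constant
is_infinitesimal_here = less(zero, inv_t) and less(inv_t, eps)   # True
```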
Exercise 1. Prove the statements of Example 1.4. Is the field X defined there
Dedekind complete? Is it Archimedean?
Example 1.5. Let
\[
X := \left\{ x_{a,b} := \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} : a, b \in \mathbb{R} \right\}.
\]
Addition and multiplication are defined in the obvious way (matrix
multiplication). We define an order by saying that x_{a_1,b_1} < x_{a_2,b_2} if either
a_1 < a_2 or if simultaneously a_1 = a_2 and b_1 < b_2 (i.e. we order the pairs (a, b)
lexicographically). A straightforward calculation shows that X is a totally ordered
ring, i.e. it satisfies all axioms of a totally ordered field with the possible exception
of the existence of inverses with respect to multiplication. The set X contains a
copy of ℝ (namely {x_{a,0} : a ∈ ℝ}). Moreover, X contains nonzero infinitesimals:
The element x_{0,1} satisfies x_{0,0} < x_{0,1} < x_{ε,0} for each ε > 0.
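The arithmetic and the order of Example 1.5 are simple enough to experiment with. The following Python sketch is an illustration under assumptions: the class name `TriMatrix` is chosen here, exact rationals stand in for real entries, and `order=True` makes the dataclass compare the pairs (a, b) lexicographically.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True, order=True)
class TriMatrix:
    """The matrix x_{a,b} = [[a, b], [0, a]]; comparison is lexicographic in (a, b)."""
    a: Fraction
    b: Fraction

    def __add__(self, other):
        return TriMatrix(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # [[a, b], [0, a]] times [[c, d], [0, c]] equals [[ac, ad + bc], [0, ac]]
        return TriMatrix(self.a * other.a, self.a * other.b + self.b * other.a)

zero = TriMatrix(Fraction(0), Fraction(0))
x01 = TriMatrix(Fraction(0), Fraction(1))
eps = TriMatrix(Fraction(1, 10**6), Fraction(0))   # the copy of a small real number
between = zero < x01 < eps                         # True
```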
Exercise 2. Prove the statements of Example 1.5. Is the totally ordered ring X
defined there even a field? Is it Dedekind complete? Is it Archimedean?
The following example has some relation to the way in which ∗ℝ will later
actually be defined. To motivate this example, recall that ℝ can alternatively be
defined from ℚ by Cauchy sequences: Call two rational Cauchy sequences x_n and
y_n equivalent if x_n − y_n → 0. The set of all such equivalence classes, equipped
with the natural operations, is isomorphic to ℝ. One might try to introduce an
infinitely large number into ℝ by a similar method: For example, one may define
+∞ as the equivalence class of all sequences x_n → ∞.
Example 1.6. Let X_0 be the set of all sequences with values in ℝ, i.e. of all
mappings x : ℕ → ℝ. We call two elements of X_0 equivalent, if they differ at
most on finitely many points of ℕ. Let X be the set of all equivalence classes [x]
where x ∈ X_0. Addition and multiplication in X_0 are defined in an evident way
(pointwise). The addition and multiplication in X is defined by [x] + [y] := [x + y]
and [x] · [y] := [x · y], and we write [x] ≤ [y] if and only if x does not exceed
y at all except possibly finitely many points. It is straightforward to check that
these notions are well-defined, i.e. that they actually do not depend on the choice
of the representatives x and y. Moreover, with these notions and the convention
[x] < [y] if and only if [x] ≤ [y] and [x] ≠ [y], the set X becomes an ordered ring,
i.e. it satisfies the axioms of a totally ordered ring with the exception that the
order need not be total. The equivalence classes of constant sequences constitute
a canonical copy of ℝ in X. The equivalence class of the sequence x : n ↦ n^{-1}
is a nonzero infinitesimal in X, since for any ε ∈ ℝ, ε > 0, we have n^{-1} < ε for all
except finitely many n, and so 0 < [x] < ε.
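The phrase “for all except finitely many n” in the last sentence can be made explicit: the inequality n^{-1} < ε fails exactly for the indices n ≤ 1/ε. The following Python sketch is an illustration under assumptions: the function name `last_exceptional_index` is chosen here, and ε is taken to be a positive rational so that the arithmetic is exact.

```python
import math
from fractions import Fraction

def last_exceptional_index(eps):
    """Largest n >= 1 with 1/n >= eps, or 0 if 1/n < eps holds for every n >= 1."""
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.floor(1 / eps)        # 1/n >= eps holds exactly when n <= 1/eps

eps = Fraction(3, 100)
N = last_exceptional_index(eps)       # 33
```

For every n > N the value 1/n lies below ε, so the sequence n ↦ 1/n and the constant sequence ε compare the same way at all but the first N indices.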
Exercise 3. Prove the statements of Example 1.6. Is the ordered ring X defined
there even totally ordered? Or a field? Is it Dedekind complete? Is it Archimedean?
The disappointing properties of Example 1.6 are due to the fact that we
chose the “wrong” definition for the equivalence of sequences. Later, we will call
two sequences equivalent, if they are equal “almost everywhere”. However, the
term “almost everywhere” will be defined in a very tricky manner by means of an
ultrafilter. The fact that (nontrivial) ultrafilters cannot be constructed is a deeper
reason why the model ∗ℝ defined later is not very explicit (but the axiom of choice
will imply its existence).
Exercise 4. (Very difficult). Let X_0 consist of all measurable functions
x : [0, 1] → ℝ. Call two functions equivalent, if they differ only on a (Lebesgue)
null set, and let X be the set of the corresponding equivalence classes. Addition,
multiplication and the order on X are introduced analogously to Example 1.6. Prove that X is
an ordered ring. The reals are embedded into X as equivalence classes of constant
functions. Prove that X contains also “infinitesimals” in the sense that there is
some x > 0 in X which is not larger than any real ε > 0. Is X totally ordered? Or
a field? Is it Dedekind complete? Is it Archimedean?
The difficulty of nonstandard analysis lies not only in the mere introduction
of infinitesimals. Think, for example, of a natural infinitesimal description
of the statement “The function f : ℝ → ℝ is continuous at 0”. The “intuitive”
description will read: Whenever x is “infinitely close” to 0, then f(x) is “infinitely
close” to f(0). Even if we know some extension ∗ℝ of ℝ with infinitesimals, there
arises the problem that we are given a function f : ℝ → ℝ (and not a function
f : ∗ℝ → ∗ℝ): So how should f(x) be defined if x is an infinitesimal?
For this reason, a proper theory of infinitesimals should also associate to each
function f : ℝ → ℝ an extension ∗f : ∗ℝ → ∗ℝ. How could such an extension
be defined? If f(x) = a_0 + a_1 x + · · · + a_n x^n is a polynomial, it is clear how the
extension should be defined. Similarly, if f is a rational function or a power series.
The reader who has some knowledge in complex analysis will find an analogy with
analytic functions: All “basic” analytic real functions have a unique canonical
extension into (a large part of) the complex plane. But how should functions like
the Dirichlet function (f(x) = 1 if x ∈ ℚ and f(x) = 0 otherwise) be extended?
Of course, one would like that the extension ∗f of f has the same “formal”
properties as f. For example, for f(x) = sin(x), one would like to have that
|∗f(x)| ≤ 1 for all x ∈ ∗ℝ and that ∗f(x + π) = −∗f(x) for x ∈ ∗ℝ.
In this connection, the approach of Example 1.6 is very promising: Any x ∈ X
is an equivalence class of sequences. Of course, any function f : ℝ → ℝ, no matter
how complicated, maps any sequence x : ℕ → ℝ into another sequence y : ℕ → ℝ
in a canonical way by the formula y(n) = f(x(n)). If the equivalence class of y
depends only on the equivalence class of x, one may consider this as a mapping
∗f : X → X. This is the idea that we will actually use.
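The formula y(n) = f(x(n)) can be tried out directly. The following Python sketch is an illustration under assumptions: the names `lift`, `f`, `x` and `x_alt` are chosen here, sequences are modelled as Python functions on the positive integers, and only a finite window of indices is inspected, so this is a spot check and not a proof. It shows that two representatives which differ at finitely many indices have images which differ at most at those indices, which is why the class of y depends only on the class of x under the equivalence of Example 1.6.

```python
from fractions import Fraction

def lift(f):
    """Turn f into a map on sequences: (lift(f)(x))(n) = f(x(n))."""
    def f_star(x):
        return lambda n: f(x(n))
    return f_star

def f(t):
    return t * t + 1

def x(n):
    return Fraction(1, n)

def x_alt(n):
    # Equivalent to x in the sense of Example 1.6: differs from x only at n = 1, 2, 3.
    return Fraction(7) if n <= 3 else Fraction(1, n)

y, y_alt = lift(f)(x), lift(f)(x_alt)
differing = [n for n in range(1, 51) if y(n) != y_alt(n)]   # [1, 2, 3]
```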
However, such a direct approach will lead to many technical difficulties: For
example, what is to be done if f is only defined on a subset A ⊆ ℝ? Moreover,
how could we discuss functions which are constructed by means of infinitesimals
like f(x) = sin(x/c) where c > 0 is infinitesimally small (later, such functions
are called internal)? Of course, one could try to discuss such special cases in all
detail, but we will use another more axiomatic approach: We will define a mapping
∗ : S → ∗S on a very large set S which contains ℝ, all subsets of ℝ, all functions
on ℝ, etc., and which associates to each such element a a nonstandard element ∗a.
Moreover, this mapping will be defined in such a way that the truth of sentences
is preserved in the following sense: If α is a sentence, and if we put a ∗ at each
element of the sentence, then the corresponding sentence ∗ α is true if and only if
α is a true sentence. This will be called the transfer principle. Observe that one
may conversely also conclude that α is true if ∗ α is true: In this way it is possible
to prove “standard” results with “nonstandard” methods.
The plan for the beginning is as follows: In §2, we describe more precisely
how S and sentences α are defined. The mapping ∗ is axiomatically introduced
in §3, without discussing how such a mapping might be defined. One way to define
such a mapping (similarly to Example 1.6) will be discussed in §4. We already
point out that without the axiom of choice the existence of such a mapping ∗
cannot be proved, i.e. the existence proofs of ∗ must always be nonconstructive.
In particular, even the simplest results of nonstandard analysis contain implicitly
a nonconstructive element. This is one of the main arguments against the use of
nonstandard analysis. On the other hand, physicists like to think of infinitesimals
as “existing” objects, in particular in string theory. In this sense it is perhaps
not false to think of nonstandard analysis as an “extreme idealization” of reality
which, however, might be even “too idealized”; so one should treat the results with
care.
The reader is asked to be patient in these first sections: The existence proof
of ∗ and the fundamental properties are rather complicated and technical; the
calculation with infinitesimals which has been briefly discussed above appears
after this as a simple exercise (and is in fact not much more than a special case of
general properties of ∗). However, the main power of nonstandard analysis becomes
clear if one considers more complicated structures than ℝ, such as Banach spaces
etc. These applications are the topic of the second part of this book.
§ 2 Superstructures, Sentences, and Interpretations

Before we can define the details of nonstandard analysis, we need some fundamen-
tal concepts of model theory which we describe in this section.
2.1 Superstructures
As we have mentioned before, we intend to define a map ∗ which maps each object
(number, function, set, etc.) of the standard world into a corresponding object of
the nonstandard world. If one tries to map actually any set into a nonstandard
set, one is in the realm of category theory, and serious fundamental difficulties
arise (for example, ∗ cannot be a map in the sense of set theory). The easiest way
to overcome these difficulties is to work with a “restricted universe” which is still
a set. This has the disadvantage that we have to work with atoms which however
is not a big problem from the viewpoint of applications. We define such a universe
now:
Let S be some set in which we are interested. For example, S = ℝ, or S is
the point set of a topological space. If we speak of “statements about S”, we are
actually interested in statements about elements of S, subsets of S, functions and
relations on such subsets, sets of such functions, functions on such sets of functions,
etc. All these objects can be found in a set which is called the superstructure of
S. This superstructure is defined in the following way:
If A is some set, we denote by P(A) the powerset of A, i.e. the system of
all subsets of A. Let S0 := S, and for n = 1, 2, . . . define inductively Sn :=
S0 ∪ P(Sn−1). Then Ŝ := ⋃ Sn (the union over all n) is the superstructure of S.
The elements of S are called individuals or atoms, and the elements of Ŝ which
are not atoms are called entities.
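For a finite set of atoms the first levels Sn can be computed explicitly. The following Python sketch is only an illustration under assumptions: atoms are modelled as strings (so that nothing is an element of an atom), sets as `frozenset`s, and the helper names `powerset`, `levels` and `type_of` are hypothetical.

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, each as a frozenset."""
    xs = list(xs)
    return {frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)}

def levels(atoms, depth):
    """Return [S0, S1, ..., S_depth] with S0 = atoms and Sn = S0 ∪ P(S_{n-1})."""
    s0 = frozenset(atoms)
    result = [s0]
    for _ in range(depth):
        result.append(s0 | powerset(result[-1]))
    return result

def type_of(x, lvls):
    """Smallest n with x in Sn; atoms have type 0."""
    for n, level in enumerate(lvls):
        if x in level:
            return n
    raise ValueError("element does not occur in the computed levels")

S = levels({"a", "b"}, 2)
print([len(level) for level in S])  # [2, 6, 66]
```

The sizes grow like iterated powers of two, which is why only very small examples can be enumerated; the superstructure itself is the union of all levels and is never materialised here.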
The notion “atom” is chosen, because the sentence a ∈ s should always be
false for an atom s ∈ S and a ∈ Ŝ (at least, we will assume this throughout: If
necessary, we have to “rename” the elements of S to achieve this). For example, if
we are interested in statements about S = ℝ, this means that e.g. the statement
a ∈ 1 is always false (for a ∈ Ŝ). Thus, the choice S = ℝ means that we do not
care how the real numbers might be constructed; we just assume them as given.
If we are also interested in the definition of the real numbers from the natural
numbers, we could start with the set S = ℕ instead. The only restriction is that
we should start with an infinite set S (because otherwise things degenerate and
no nonstandard analysis is possible, as we shall soon see).
To see that also functions and relations are contained in the superstructure,
we briefly have to recall how functions and relations are defined in set theory:
A pair (a, b) is defined by the formula (a, b) := {{a}, {a, b}}. By induction, an
n-tuple (a1 , . . . , an ) is defined as the pair ((a1 , . . . , an−1 ), an ) for n ≥ 2 where we
put (a1 ) := a1 for n = 1. The Cartesian product A1 × · · · × An of finitely many
sets A1 , . . . , An is the set of all n-tuples (a1 , . . . , an ) where ai ∈ Ai (i = 1, . . . , n).
An (n-ary) relation Φ over A1 , . . . , An is a subset of A1 × · · · × An . If Φ is a binary
relation over A, B, we write dom(Φ) for the domain of Φ (i.e. dom(Φ) is the set of
all a ∈ A such that there is some b ∈ B with (a, b) ∈ Φ); similarly, rng(Φ) denotes
the range of Φ (i.e. rng(Φ) is the set of all b ∈ B such that there is some a ∈ A
with (a, b) ∈ Φ). If for each a ∈ A we find precisely one b ∈ B with (a, b) ∈ Φ,
then Φ is called a function. In this case, the usual notation Φ : A → B is used,
and Φ(a) denotes the unique b ∈ B with (a, b) ∈ Φ. Note that by this convention,
a function f is not merely determined by its graph {(x, f(x)) : x ∈ dom(f)};
it actually is its graph!
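The set-theoretic definitions above translate almost literally into code. The following Python sketch uses `frozenset`s for sets; the names `pair`, `tuple_n`, `graph` and `is_function` are hypothetical helpers chosen for this illustration, and `is_function` only checks the uniqueness condition, not that the relation is a subset of A × B.

```python
def pair(a, b):
    """Kuratowski pair (a, b) := {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def tuple_n(*xs):
    """n-tuple via (a1, ..., an) := ((a1, ..., a_{n-1}), an), with (a1) := a1."""
    result = xs[0]
    for x in xs[1:]:
        result = pair(result, x)
    return result

def graph(f, domain):
    """The function f on the given domain, as the set of pairs (x, f(x))."""
    return frozenset(pair(x, f(x)) for x in domain)

def is_function(phi, A, B):
    """True if for each a in A there is precisely one b in B with (a, b) in phi."""
    return all(sum(pair(a, b) in phi for b in B) == 1 for a in A)

A = {0, 1, 2}
square = graph(lambda x: x * x, A)                      # a function A -> {0, 1, 4}
less = frozenset(pair(a, b) for a in A for b in A if a < b)  # a relation, not a function
```

Here `square` literally is a set of pairs, in line with the remark that a function is its graph.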
Now it is easy to see that all functions mentioned in the beginning are entities.
More generally, if A and B are entities, then any mapping f : A → B (and any
relation R ⊆ A × B) is also an entity. Roughly speaking, Ŝ is closed under all
natural set operations:
Theorem 2.1. The following holds in any superstructure Ŝ:
1. S0 ∈ S1 ∈ S2 ∈ · · · ∈ Ŝ and S0 ⊆ S1 ⊆ · · · ⊆ Ŝ. In particular, Sn are entities.
Also ∅ is an entity.
2. Each Sn is transitive, i.e. each element of Sn which is not an atom is a subset
of Sn. The same holds for Ŝ. In other words: If A is an entity and x ∈ A,
then x is either an entity or an atom.
3. If A is an entity and B ⊆ A, then B is also an entity. In particular, if B ⊆ Sn
for some n, then B is an entity.
4. If A is an entity, then P(A) is an entity.
5. Let A be a set of entities. If A ≠ ∅, then ⋂A := ⋂_{x∈A} x is an entity. If
A ∈ Ŝ, then ⋃A := ⋃_{x∈A} x is an entity.
6. If x1, . . . , xk ∈ Ŝ, then {x1, . . . , xk} is an entity.
7. If A1, . . . , Ak are entities, then ⋃_j Aj = A1 ∪ · · · ∪ Ak is an entity.
8. If A1 , . . . , An are entities, then A1 × · · · × An is an entity.
9. All n-ary relations on entities are entities. In particular, all functions acting
between entities are themselves entities.
Theorem 2.1 implies that the superstructure Ŝ is built of “levels” Sn: If we
can show that a set belongs to some level (either as an element or as a subset), this
set belongs to the superstructure. Conversely, each element of the superstructure
is contained in some level (both as an element and as a subset). For this reason,
the elements of Sn \ Sn−1 are said to be of type n; the atoms are said to be of type
0. As we shall see in the following proof, the “operations” P and × may increase
the type, but they do not lead out of the superstructure.
Proof of Theorem 2.1. 1. The inclusion Sn−1 ⊆ Sn follows by induction: This is
true for n = 1, and if it is true for some n, then Sn = S0 ∪ P(Sn−1) ⊆ S0 ∪ P(Sn) =
Sn+1. The relations Sn−1 ∈ Sn and Sn ⊆ Ŝ follow immediately from the defini-
tion. Finally, ∅, Sn ∈ Sn+1 ⊆ Ŝ.
2. If A ∈ Sn = S0 ∪ P(Sn−1) is not an atom, we must have A ∈ P(Sn−1), i.e.
A ⊆ Sn−1 ⊆ Sn by 1. If A ∈ Ŝ is not an atom, then A ∈ Sn for some n, i.e. by
what we just proved A ⊆ Sn for some n.
3. A ∈ Sn for some n which implies A ⊆ Sn by 2. Hence, any B ⊆ A is a subset
of Sn, i.e. B ∈ P(Sn) ⊆ Sn+1 ⊆ Ŝ.
4. As in 3., we have A ⊆ Sn for some n. This means P(A) ⊆ P(Sn) ⊆ Sn+1.
Hence, P(A) is an entity by 3.
5. If A ≠ ∅, choose some x ∈ A. Then x is an entity, and so ⋂A ⊆ x is an entity
by 3. If A is an entity, we have A ∈ Sn for some n ≥ 1, which implies A ⊆ Sn
by 2., i.e. A ⊆ S0 ∪ P(Sn−1). Since A contains no atoms, we have A ⊆ P(Sn−1),
i.e. all elements of A are subsets of Sn−1. Consequently, ⋃A ⊆ Sn−1. Hence, the
statement follows from 3.
6. By 1., we have x1, . . . , xk ∈ Sn for some n, and so {x1, . . . , xk} ∈ Sn+1 ⊆ Ŝ.
7. By 6., the set A := {A1, . . . , Ak} is an entity, and so ⋃_j Aj = ⋃A is an entity
by 5.
8. For n = 2, it suffices to observe that by definition of pairs A1 × A2 ⊆
P(P(A1 ∪ A2)) and to apply 3., 4., and 7. The general case follows by a trivial
induction, since by definition A1 × · · · × An = (A1 × · · · × An−1) × An.
9. If Φ is an n-ary relation on entities A1, . . . , An, then Φ ⊆ A1 × · · · × An. Since
A1 × · · · × An is an entity by 8., also Φ is an entity by 3.
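The remark that × may increase the type can be made concrete by counting levels in step 8. The following LaTeX computation is a sketch that uses only the definitions S_n = S_0 ∪ P(S_{n−1}) and (a, b) = {{a}, {a, b}}; the index n is any level containing both sets as subsets.

```latex
\begin{align*}
a, b \in S_n
  &\;\Longrightarrow\; \{a\},\ \{a,b\} \in \mathcal{P}(S_n) \subseteq S_{n+1} \\
  &\;\Longrightarrow\; (a,b) = \{\{a\},\{a,b\}\} \in \mathcal{P}(S_{n+1}) \subseteq S_{n+2}, \\
A_1, A_2 \subseteq S_n
  &\;\Longrightarrow\; A_1 \times A_2 \subseteq S_{n+2}
  \;\Longrightarrow\; A_1 \times A_2 \in \mathcal{P}(S_{n+2}) \subseteq S_{n+3}.
\end{align*}
```

So forming a binary Cartesian product raises the level by at most a fixed finite amount, which is why it never leads out of the superstructure.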
More surprising than the above operations which are allowed in Ŝ is that some
important operations are not allowed: In general, subsets of Ŝ are not entities. For
example Ŝ itself is no entity; more generally, the subsets of Ŝ which are not entities
are precisely those sets which contain elements of infinitely many types (i.e. which
contain an infinite subset of elements of pairwise different type).
In other words: A set which is the “collection” of elements of Ŝ is not an
entity, in general. However, finite collections of such elements are entities, as we
have seen.
For the same reason as above, the union of a set A of entities need not be
an entity (a counterexample is A := {{x} : x ∈ Ŝ}). However, by what we have
proved, this is the case if either A itself is an entity or if A is finite.
Thus, roughly speaking, the operations “union”, and “collecting to some set”
are in general not admissible. An exception can be made if we either consider only
finitely many entities or if the whole collection forms an entity.
The reader familiar with set theory might find some analogies of these re-
strictions to usual set theory: Recall e.g. that there is no set containing all sets in
the universe (this is analogous to the fact that Ŝ is no entity). Moreover, it is in
general not allowed to “collect” arbitrary elements to some set. In particular, the
“union” of a class A of arbitrary sets need not be a set, in general. However, we
have the union axiom: If A is a set, then also ⋃A is a set.
In this sense, we might consider the superstructure Ŝ as a model of a set
theory (with so-called “urelements” which are the atoms, i.e. the elements of S):
The superstructure Ŝ serves as a model, if we just interpret the system of entities
as the class of all sets in the universe. Ŝ satisfies all axioms of set theory with the
exception of the infinity axiom. Note that if we eliminate the urelements, i.e. if we
consider S := ∅, the corresponding superstructure Ŝ indeed contains only finite
sets. The same is true if S is finite. Thus, the only “interesting” case is the one
when S is infinite.
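The degenerate case S = ∅ is small enough to enumerate. The following Python sketch (with a hypothetical helper `powerset`) computes the first levels; since there are no atoms, each level is just the powerset of the previous one.

```python
from itertools import combinations

def powerset(xs):
    """The set of all subsets of xs, as a frozenset of frozensets."""
    xs = list(xs)
    return frozenset(frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r))

level = frozenset()          # S0 = ∅ (no atoms)
sizes = [len(level)]
for _ in range(4):
    level = powerset(level)  # Sn = S0 ∪ P(S_{n-1}) = P(S_{n-1})
    sizes.append(len(level))

print(sizes)  # [0, 1, 2, 4, 16]
```

Every level is finite, so every entity (being a subset of some level) is a finite set, in line with the remark above.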
There is one axiom in set theory which in a certain sense allows us to “collect”
elements to some set: This is the axiom of choice. Since we assume the axiom
of choice throughout, it turns out that the axiom of choice holds also in the
superstructure Ŝ:
Let A ∈ Ŝ be a set whose elements are entities, and ∅ ∉ A. By the axiom of
choice, there is a function f : A → ⋃A such that f(x) ∈ x for each x ∈ A. If A is
an entity, then f is an entity, because A and ⋃A are entities.
The superstructure Ŝ is large enough to represent all structures that typi-
cally occur in problems of analysis, topology, . . . if S is chosen appropriately. For
example, if one is interested in a continuous mapping f : X → Y of topological
spaces, one may choose S := X ∪ Y. Note that then also the topology of X and
Y is contained in the superstructure Ŝ.
For real analysis an appropriate choice is S := ℕ or S := ℝ (it suffices to
consider S := ℕ, since ℝ may be constructed from the natural numbers in one of
the usual well-known manners: This construction can completely be carried out
within the superstructure; in this sense, ℕ ∈ Ŝ already implies ℝ ∈ Ŝ). Note that
also the algebraic structure and the order structure of ℝ are contained in Ŝ, since
e.g. the addition is represented by the entity {(x, y, z) ∈ ℝ³ : x + y = z}.
As another example, if one is interested in a mapping f : X → Y in normed
spaces X and Y, the choice S := X ∪ Y ∪ ℕ or S := X ∪ Y ∪ ℝ is appropriate.
We already point out an important logical observation: If one is interested
in a statement like “any mapping f : {0} → A with a topological space A is
continuous”, one cannot find a superstructure in which this statement makes
sense (because the class of all topological spaces A is not a set). However, given
any topological space A, we may consider the superstructure Ŝ corresponding to
S = A ∪ {0} and can verify the statement in Ŝ. If we can do this for any given
instance of A, we may conclude that the general statement is true.
Now observe that all statements occurring in analysis, topology, . . . have such
a form that it suffices to verify them for given instances, and so superstructures
are actually large enough to represent the corresponding statements. Later, we
will tacitly make use of the above mentioned reasoning.
Let us introduce now a notational convention that we shall use throughout:
Let i : X → Y be a map. Then the image of a point x ∈ X under this
map is usually denoted by i(x). However, in nonstandard analysis it is sometimes
more convenient to write ⁱx for the value (with i as a left superscript). We shall
use both conventions. The
value of a set A ⊆ X is defined as i(A) = {i(x) : x ∈ A}. This definition may be
ambiguous: If e.g. X = Ŝ is a superstructure and A ∈ Ŝ is an entity, it is not clear
whether i(A) means the image of the element A, or of the set A (which consists
of elements from Ŝ). By ⁱA we always mean the image of the element A.
2.2 Formal Language

To properly formulate the transfer principle which is the heart of nonstandard
analysis, we have to work with a formal language which is defined as follows:
The formal language L has the following atomic symbols (=the “letters”
from which our language is built):
1. The variables which are symbols from a countable set (in practice, letters
from the Roman alphabet are used, but one can add more letters if required:
the number of variables is not restricted). We occasionally underline variables
to distinguish them from constants.
2. The constants which form a set of symbols sufficiently large to be put in
one-to-one correspondence with the elements of whatever structures are un-
der consideration. For example, if we consider the superstructure of S = ℝ,
each real number is such a constant; also each set of numbers, etc. When we
explicitly write down a formula, we use of course the traditional notation for
the constants whenever there are some (e.g. 1, 2, . . ., e, π, {1/2, ∅}). The set
of all constants is denoted by cns(L ). We already point out that in all our
applications, constants will represent sets (or atoms).
3. The basic predicates ∈ and =.
4. The separation symbols “:”, “(” and “)”.
5. The logical connectives ∧, ∨, =⇒ , ⇐⇒ and ¬.
6. The quantifiers ∀ and ∃.
We note that in the general theory of formal languages, one allows a general
collection of basic predicates (not only ∈ and =) and also allows basic functors;
however, we confine ourselves throughout to languages of the above type.
Not all combinations of the above symbols are admissible in our language.
Only the so-called well-formed formulas (wffs) are admissible. These are defined
inductively:
The smallest wffs are the atomic formulas a = b and a ∈ b where a and b are
variables or constants. If α and β are wffs, then also (α ∨ β), (α ∧ β), (α =⇒ β),
(α ⇐⇒ β), and (¬α) are wffs. Moreover, if α does not already contain one of ∀x
or ∃x, then (∀x : α) and (∃x : α) are wffs. In these formulas, α is called the scope
of the quantifier ∀x resp. ∃x.
For simpler notation, we sometimes add or eliminate parentheses in a wff, if there
is no ambiguity.
An occurrence of a variable x in a wff α is called free, if it is not the occurrence
in a quantifier ∀x or ∃x and not within the scope of such a quantifier. All other
occurrences of x are called bound . For example, in the formula (∀x : x = x) ∧
(x = y) the first three occurrences are bound while the last one is free.
If x1, . . . , xn occur freely in a wff α, we sometimes point out this fact by
writing α(x1 , . . . , xn ). In this case we mean by α(a1 , . . . , an ) the string where each
free occurrence of xi is replaced by ai . We use this notation also if not all of the
variables xi actually occur (freely) in α. In any case, the above notation does not
mean that there are no other free variables in α (unless we state this explicitly).
If all occurrences of all variables in α are bound, then α is called a sentence.
Otherwise, α is called a predicate. In other words: α is a sentence if variables
occur only in quantifiers or the scope of quantifiers; all other objects of sentences
are constants (here, we do not count e.g. logical connectives as an “object” of
a sentence). We point this out, since this means that although we think of a
sentence as concerning sets, it actually concerns only constants: In a sentence
like ∀x : (x ∈ {∅} =⇒ ∀y ∈ x : (¬y = y)) the symbol {∅} is a constant
(otherwise this would not be a sentence in our formal language). To solve this
“paradox”, one has to think of {∅} as a constant which represents the set which
contains the empty set as its only element; under this interpretation, the above
sentence should be true. However, the pure “symbol” {∅} is not this set but just a
name (the “constant”) which traditionally is interpreted as this set. We will make
the term interpretation precise in Section 2.3. However, we already point out that
later we will use constants of e.g. the form {x : α(x)} where α(x) is some formula
of our language; under the traditional interpretation it is clear what we mean by
such constants.
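The notions of free occurrence and sentence introduced above can be computed mechanically on a syntax tree. The following Python sketch represents wffs by small dataclasses; all class and function names (`Var`, `Const`, `Atom`, `Not`, `BinOp`, `Quant`, `free_vars`, `is_sentence`) are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    name: str

@dataclass(frozen=True)
class Atom:            # atomic formula: a = b or a ∈ b
    pred: str          # "=" or "∈"
    left: Union[Var, Const]
    right: Union[Var, Const]

@dataclass(frozen=True)
class Not:
    body: object

@dataclass(frozen=True)
class BinOp:           # one of ∧, ∨, ⟹, ⟺
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Quant:           # (∀x : α) or (∃x : α); body is the scope
    kind: str          # "∀" or "∃"
    var: str
    body: object

def free_vars(w) -> set:
    """Names of all variables that occur freely in the wff w."""
    if isinstance(w, Atom):
        return {t.name for t in (w.left, w.right) if isinstance(t, Var)}
    if isinstance(w, Not):
        return free_vars(w.body)
    if isinstance(w, BinOp):
        return free_vars(w.left) | free_vars(w.right)
    if isinstance(w, Quant):
        return free_vars(w.body) - {w.var}
    raise TypeError(f"not a wff: {w!r}")

def is_sentence(w) -> bool:
    """A wff is a sentence if no variable occurs freely in it."""
    return not free_vars(w)

x, y = Var("x"), Var("y")
# (∀x : x = x) ∧ (x = y): the last occurrence of x, and y, are free.
example = BinOp("∧", Quant("∀", "x", Atom("=", x, x)), Atom("=", x, y))
# ∀x : (x ∈ {∅} ⟹ ∀y : (y ∈ x ⟹ ¬ y = y)), with {∅} as a constant.
sentence = Quant("∀", "x", BinOp("⟹", Atom("∈", x, Const("{∅}")),
                 Quant("∀", "y", BinOp("⟹", Atom("∈", y, x), Not(Atom("=", y, y))))))
```

With these definitions `free_vars(example)` yields the names x and y, while `sentence` has no free variables because `{∅}` is treated as a constant, exactly as discussed above.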
For most applications, it suffices to consider so-called bounded quantifiers:
The quantifiers ∀x resp. ∃x are transitively bounded if they occur in the form
∀x : (x ∈ A =⇒ (α)) resp. ∃x : (x ∈ A ∧ (α)) where A is either a constant or
a variable of the language. In this connection, we use the shortcut ∀x ∈ A : α
resp. ∃x ∈ A : α for the above formulas. If A is a constant, we call the quantifier
bounded. We call a wff (transitively) bounded, if all its quantifiers are (transitively)
bounded.
The term “bounded” is not used uniformly in the literature: Sometimes bounded
wffs are meant, and sometimes transitively bounded wffs are meant. The reason
for our term “transitively bounded” in this connection will become clear later on.
Without further mention, we make use of the common abbreviations, e.g.
(x ∉ y) means (¬x ∈ y), and ∀x, y ∈ A, z ∈ B : α means ∀x ∈ A : ∀y ∈ A : ∀z ∈
B : α (here, we already eliminated some superfluous parentheses).
Remark 2.2. We already point out that the symbols ⊆ (subset) and P (powerset)
are not atomic symbols. Instead, we define A ⊆ B as the shortcut of the formula
∀x : (x ∈ A =⇒ x ∈ B). The symbol P will not even be used as some shortcut:
When A is a constant (other cases will not occur), we always consider P(A) as a
constant (the meaning of this constant is evident).
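As a worked illustration of how the shortcuts unfold, consider the abbreviated sentence ∀x ∈ A : x ⊆ B with constants A and B. The expansion below, written in LaTeX, is a sketch that applies only the two shortcuts defined above.

```latex
\begin{align*}
\forall x \in A : x \subseteq B
  \quad&\text{abbreviates}\quad
  \forall x \in A : \bigl(\forall y : (y \in x \implies y \in B)\bigr), \\
  &\text{which in turn abbreviates}\quad
  \forall x : \bigl(x \in A \implies (\forall y : (y \in x \implies y \in B))\bigr).
\end{align*}
```

The outer quantifier is bounded by the constant A, while the inner quantifier has the form ∀y ∈ x with a variable x; hence the sentence is transitively bounded but not bounded.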
2.3 Interpretations
So far, we have defined only the syntax of a formal language L. Now we are going
to define the semantics, i.e. we associate a truth value to each of its sentences.
The reader is advised to read this section carefully, because the interpretation of
a sentence is not what it appears to be at first glance.
The reader should note that a formal language L is defined through its
constants cns(L ). An abstract interpretation map I is a one-to-one map of a
subset dom(I) ⊆ cns(L ) into a set S which is equipped with two binary relations
∈∗ and =∗. Each interpretation map gives rise to an interpretation of all those
sentences of L whose constants all belong to dom(I):
Given such a sentence α, we define the interpreted sentence I α as follows:
We replace all occurrences of constants c ∈ dom(I) by the interpreted value I(c),
and all occurrences of ∈ and = by ∈∗ and =∗ , respectively. The truth value of the
sentence α under the interpretation map I is then defined as the truth value of the
interpreted sentence I α which in turn is defined in the obvious way by induction
on the structure of α:
The formula x ∈∗ y resp. x =∗ y is true if and only if the pair (x, y) belongs
to the relation ∈∗ resp. =∗; the logical connectives have their usual meaning, e.g.
α ⇐⇒ β is true if and only if α and β have the same truth value. Finally, the
quantified expressions ∀x : α resp. ∃x : α are true if α is true for every resp. at
least one value of x ∈ S (i.e. if we replace the free occurrences of x in α by the
corresponding value).
This is explained best by means of an example: Let M be a constant of the
language, and let α be the sentence
∃x : ∀y : (y ∈ M =⇒ x ∈ y).
Then I α is the sentence
∃x : ∀y : (y ∈∗ I(M ) =⇒ x ∈∗ y).
By definition, this sentence is true (i.e. α is true under the interpretation map I)
if and only if there is an element x ∈ S such that for each element y ∈ S for
which the pair (y, I(M )) belongs to the relation ∈∗ also the pair (x, y) belongs to
the relation ∈∗ .
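For a finite set S the inductive definition of truth can be executed directly. The following Python sketch is an illustration under assumptions: formulas are encoded as nested tuples, a term `("val", v)` stands for an already interpreted constant I(c), a bare string stands for a variable, and the function name `holds` as well as the tiny structure below are hypothetical.

```python
def holds(phi, S, mem, eq, env=None):
    """Truth value of the interpreted formula phi in (S, mem, eq).

    mem and eq are sets of pairs playing the roles of ∈* and =*;
    env assigns elements of S to the free variables of phi.
    """
    env = env or {}

    def value(term):
        # ("val", v) is an interpreted constant I(c); a bare string is a variable.
        return term[1] if isinstance(term, tuple) else env[term]

    op = phi[0]
    if op == "in":
        return (value(phi[1]), value(phi[2])) in mem
    if op == "eq":
        return (value(phi[1]), value(phi[2])) in eq
    if op == "not":
        return not holds(phi[1], S, mem, eq, env)
    if op == "and":
        return holds(phi[1], S, mem, eq, env) and holds(phi[2], S, mem, eq, env)
    if op == "or":
        return holds(phi[1], S, mem, eq, env) or holds(phi[2], S, mem, eq, env)
    if op == "implies":
        return (not holds(phi[1], S, mem, eq, env)) or holds(phi[2], S, mem, eq, env)
    if op == "iff":
        return holds(phi[1], S, mem, eq, env) == holds(phi[2], S, mem, eq, env)
    if op in ("forall", "exists"):
        _, x, body = phi
        # Quantified variables range over ALL of S, not only over rng(I).
        results = (holds(body, S, mem, eq, {**env, x: v}) for v in S)
        return all(results) if op == "forall" else any(results)
    raise ValueError(f"unknown symbol: {op!r}")

S = {"p", "u", "v", "m"}
eq = {(s, s) for s in S}
mem = {("p", "u"), ("p", "v"), ("u", "m"), ("v", "m")}
M = ("val", "m")  # the interpreted value I(M)
alpha = ("exists", "x", ("forall", "y", ("implies", ("in", "y", M), ("in", "x", "y"))))

print(holds(alpha, S, mem, eq))                 # True (take x = "p")
print(holds(alpha, S, mem - {("p", "v")}, eq))  # False
```

In the first call the element "p" belongs to both "u" and "v", the only elements related to I(M); after removing the pair ("p", "v") no such common element exists.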
The point which causes difficulties here is that the interpretation takes place
in S and not only in rng(I), i.e. for quantified variables (x and y in the above
example) we actually take all elements of S into account and not only the elements
arising as the image of a constant. In the above example, this means in particular
that x (and similarly y) only has to exist in S and need not necessarily be of the
form x = I(c) with some c ∈ cns(L ) (if I is not onto). If I is not onto, this leads
to strange effects, as we shall see soon.
The case most important for us is that S is a superstructure, and that ∈∗
and =∗ are defined in the set-theoretical way, i.e. for elements x, y ∈ S the relation
x ∈∗ y resp. x =∗ y holds if and only if (in the set-theoretical meaning) x ∈ y resp.
x = y. In this special case, we say that I is an interpretation map in set theory.
The essential point is that if I is not onto, we do not get the usual interpre-
tation of L as a set theoretical formula.
Let us give an example: Assume that the constants of the language cns(L)
are the elements of a superstructure Ŝ. Let I be an interpretation map in set
theory, say I : cns(L) → T̂. If α denotes the formula “A ⊆ B” (with constants
A, B ∈ Ŝ) which we use as a shortcut for
∀x : (x ∈ A =⇒ x ∈ B),
the interpreted formula I α becomes
∀x : (x ∈∗ I(A) =⇒ x ∈∗ I(B)),
and so (since I is an interpretation map in set theory, i.e. ∈∗ has the usual meaning
of the element relation in T̂) α becomes defined as “true” (under the interpretation
map I) if and only if I(A) ⊆ I(B) in the usual set-theoretical sense.
This sounds natural at first glance, but if the set I(A) contains elements x
which do not belong to rng(I), this is strange because the truth of the sentence
“A ⊆ B” then depends on elements which cannot be described by the language L,
because L has no constants for them: In the above example the variable x runs in
I α over all elements of I(A), not only over the elements of the form I(c) with
a constant c ∈ cns(L).
We point this out once more: Even if some sentence α(c) holds for any con-
stant c this does not imply that the sentence ∀x : α(x) is true; similarly, if ∃x : α(x)
is true, it does not follow that there is some constant c such that α(c) is true. The
reason is that it might happen that the language has no constant for the object
in question (an exception to this rule is when I is onto).
Let us summarize: If a set theoretical interpretation map I is onto, then the
interpretation has the “canonical” meaning. However, if I is not onto, then also
elements may play a role which cannot be named in the language.
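A toy example makes the effect of a non-surjective interpretation map visible. The Python sketch below is purely illustrative: the five-element target set `T`, the relation `mem` (playing the role of ∈∗), the map `I` and the element named "ghost" are all hypothetical, and the predicate α(x) is taken to be x ∈ A =⇒ x ∈ B.

```python
# Target set with an abstract membership relation; "ghost" is not named by any constant.
T = {"a1", "a2", "ghost", "A*", "B*"}
mem = {("a1", "A*"), ("a2", "A*"), ("ghost", "A*"), ("a1", "B*"), ("a2", "B*")}

# A one-to-one interpretation map that is NOT onto: rng(I) misses "ghost".
I = {"c1": "a1", "c2": "a2", "A": "A*", "B": "B*"}

def alpha(x):
    """Interpreted predicate: x ∈* I(A)  =⇒  x ∈* I(B)."""
    return (x, I["A"]) not in mem or (x, I["B"]) in mem

# alpha(c) holds for every constant c of the language ...
true_for_every_constant = all(alpha(I[c]) for c in I)

# ... but the sentence ∀x : alpha(x) quantifies over all of T and fails.
true_for_all_x = all(alpha(x) for x in T)

# The counterexamples are exactly the elements without a name.
unnamed_witnesses = [x for x in T if not alpha(x)]
```

Here the sentence “A ⊆ B” is false under I although no constant of the language can serve as a counterexample; the only witness is the unnamed element.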
The above observation is the heart of nonstandard analysis: For example, let
L denote the “natural language” over the superstructure Ŝ of S = ℝ (in par-
ticular, the constants of the language are the elements of this superstructure, i.e.
cns(L) = Ŝ). Assume that I : cns(L) → T̂ is an interpretation map into another
superstructure T̂ with the property that rng(I) does not contain all elements of
I(ℝ) (here, ℝ stands for the constant in the language L). Then the expression
∀x ∈ ℝ becomes interpreted under I in such a way that x actually runs through
all elements of I(ℝ), and thus through more elements than just the images I(r) of
the real numbers r ∈ ℝ. If the interpretation I is “canonical”
in the sense that I(ℝ) has the same “structure” as ℝ (we will make this more
precise, soon), the interpretation thus adds additional (“nonstandard”) elements
to ℝ (later we will call these elements internal; it turns out that the elements
added in this way may be considered as the “infinitesimals”).
Chapter 2
Nonstandard Models
§ 3 The Three Fundamental Principles

3.1 Elementary Embeddings and the Transfer Principle
We shall now make use of non-surjective interpretation maps in set theory (see Sec-
tion 2.3) to add nonstandard elements to a given superstructure:
Let S be a given set of atoms, and Ŝ be the corresponding superstructure.
Consider a language L together with a surjective interpretation map I : dom(I) →
Ŝ where dom(I) ⊆ cns(L). Then the set K0 of true sentences whose constants
are taken from dom(I) “describes” the superstructure Ŝ in the canonical set-
theoretical way.
Assume now that we have another interpretation map I′ : dom(I′) → T̂ for
L in set theory, where T̂ denotes the superstructure of some set T of atoms.
Definition 3.1. We call the mapping ∗ = I′ ◦ I⁻¹ : Ŝ → T̂ elementary (or elemen-
tary embedding) if the following holds:
1. Each transitively bounded sentence from K0 (true under the interpretation I)
is a true sentence under the interpretation I′.
2. ∗S = T.
Since T = ∗S, the set T̂ is the superstructure of ∗S; we denote it by ∗Ŝ and will
often simply write ∗ : Ŝ → ∗Ŝ. In this connection, Ŝ is
called the standard universe. We shall see that ∗Ŝ contains Ŝ in a certain sense,
but moreover (for “interesting” maps ∗) it may contain also additional (“nonstan-
dard”) elements: Indeed, if C ∈ dom(I), the expression ∀x ∈ C : . . . becomes
interpreted under I′ in such a way that x runs actually through all elements of
I′(C) and thus possibly through more elements than just the elements I′(c) with c ∈ C.
However, ∗Ŝ is even “too” large in a certain sense, and so we will reserve
the name “nonstandard universe” for a certain subset of ∗Ŝ which will be defined
later. The mapping ∗ is considered as the embedding of the standard world into
the nonstandard world.
For this reason, we call every element in the range of ∗ a standard element.
Hence, an element of ∗Ŝ is standard, if it can be written in the form ∗A with A ∈ Ŝ.
This notion may be slightly confusing, since one would expect that “standard
elements” are elements of the standard world Ŝ. However, since ∗ is injective (this
follows from the definition or also from Lemma 3.5 below), the map ∗ provides a
one-to-one correspondence between the elements of Ŝ and the standard elements.
We call a standard element a standard entity, if it is not an atom in ∗Ŝ (no
confusion will be possible: It follows from Lemma 3.5 below that standard entities
are precisely those elements ∗a where a ∈ Ŝ is not an atom in Ŝ).
If A ∈ Ŝ is an entity in the standard world, we are also interested in the
standard copy of A which is denoted by
σA = {∗a : a ∈ A}.
As already pointed out at the end of Section 2.3, an interpretation may add more
elements to a set. For this reason, it is not surprising that we always have σ A ⊆ ∗ A
(Lemma 3.5), but that the inclusion may be strict. It is a good idea to think of ∗
as a “blow-up functor”.
Definition 3.2. An elementary map ∗ : Ŝ → ∗Ŝ is called a nonstandard map (or
nonstandard embedding) if σA ≠ ∗A for each infinite entity A ∈ Ŝ.
The existence of nonstandard embeddings will be the topic of §4. The re-
quirement that σA ≠ ∗A for each infinite entity A ∈ Ŝ appears rather restrictive.
However, we will see later (Theorem 3.22) that already the existence of some
countably infinite set A with σA ≠ ∗A implies that ∗ is a nonstandard embedding.
Sometimes in literature, it is additionally required that S ⊆ T and ∗ s = s
for each s ∈ S, see e.g. [LR94]. However, this is more or less a formal restriction,
since it follows from Lemma 3.5 below that ∗ maps S into T and is one-to-one.
Thus, the question whether ∗ s = s (s ∈ S) is just a question of naming the atoms
in T .
Mathematically, we still have to prove that the predicate “elementary” in
Definition 3.1 is actually well-defined, i.e. that it depends only on the map ∗:
Proposition 3.3. Definition 3.1 depends only on ∗ and not on the particular choice
of the language L or the interpretation maps I and I ′ .
Proof. Let L0 be another language, and I0 : dom(I0) → Ŝ and I0′ : dom(I0′) →
T̂ be other interpretation maps with dom(I0), dom(I0′) ⊆ cns(L0) such that
I0′ ◦ I0⁻¹ = ∗ = I′ ◦ I⁻¹ (in particular, I0 is onto). Assuming that ∗ is elemen-
tary with respect to L, I, I′ we shall prove now that ∗ is elementary with respect
to L0, I0, I0′. Exchanging the roles of L, I, I′ with L0, I0, I0′ in this conclusion, we
find the desired equivalence.
Thus, let a transitively bounded sentence α0 of the language L0 with constants
from dom(I0) be given which is true under the interpretation map I0. This means
that the interpreted sentence I0 α0 is a true sentence of set theory. Then the
sentence α = I⁻¹ (I0 α0) from L, obtained by replacing each value in I0 α0 by its
constant under I⁻¹, is true under the interpretation map I, because I α = I0 α0.
Since ∗ is elementary with respect to L, I, I′, we may conclude that α is true under
the interpretation map I′, i.e. I′ α is a true sentence of set theory. But we have
I′ α = (I′ ◦ I⁻¹ ◦ I0) α0 = I0′ α0. Hence, α0
is true under the interpretation I0′, and the property 1. of Definition 3.1 is satisfied
with respect to L0, I0, I0′.
Let us now discuss the requirements of Definition 3.1 step by step. One
requirement is apparently that I ′ ◦ I −1 is a function, i.e. that dom(I ′ ) ⊇ dom(I).
However, this is already a consequence of the requirement 1.: Indeed, K0 contains
all sentences of the form c = c where c ∈ dom(I). Since this sentence is bounded
and thus true under the interpretation map I ′ by 1. (in particular, it has an
interpretation), we must have c ∈ dom(I ′ ). Hence, dom(I ′ ) ⊇ dom(I), as claimed.
But the requirement 1. of Definition 3.1 implies much more: Actually, this is
the key property of nonstandard analysis:
Proposition 3.4 (Transfer principle, First version). Let ∗ = I ′ ◦ I −1 be an ele-
mentary map. A transitively bounded sentence whose constants are taken from
dom(I) is true under the interpretation map I if and only if it is true under the
interpretation map I ′ .
Proof. Let α be a sentence whose constants are taken from dom(I). If α is true,
then α ∈ K0 , and by property 1., α has a true interpretation under I ′ . Conversely,
if α is false, then ¬α is true, i.e. ¬α ∈ K0 , and so ¬α has a true interpretation
under I ′ which means that α has a false interpretation under I ′ .
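As a small worked instance of the transfer principle, take two constants A, B ∈ dom(I) which denote entities a = I(A) and b = I(B). The sentence ∀x ∈ A : x ∈ B is bounded, hence transitively bounded, so the principle applies. The following LaTeX chain is a sketch under the assumption that the images of a and b are again entities; this is not proved at this point of the text.

```latex
\begin{align*}
a \subseteq b
  &\iff \text{``}\forall x \in A : x \in B\text{'' is true under } I \\
  &\iff \text{``}\forall x \in A : x \in B\text{'' is true under } I' \\
  &\iff {}^{*}a \subseteq {}^{*}b,
  \qquad\text{where } a = I(A),\ b = I(B),\ {}^{*}a = I'(A),\ {}^{*}b = I'(B).
\end{align*}
```

The middle equivalence is the transfer principle; the outer two are just the set-theoretical reading of the interpreted sentence under I and I′.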
The transfer principle is sometimes also called Leibniz’s principle. The reason
is that this principle implies, as we will discuss later, that the hyperreal numbers
(with infinitesimals) satisfy the same “formal” properties as the real numbers:
This was Leibniz’s demand which we mentioned in Section 1.1.
We note that in older references on nonstandard analysis like [SL76, Lux73],
the first property of Definition 3.1 (i.e. the transfer principle) is required only
for bounded sentences (not for transitively bounded sentences). In contrast, in
the book [LR94], the transfer principle is assumed to hold for even more general
formulas (the range of a variable in a quantifier can be a so-called term). We shall
discuss this later. However, we stress that the transfer principle does not hold for
all sentences if ∗ is a nonstandard map (if we exclude the case that S is finite
which becomes rather trivial as we shall see).
The second requirement in Definition 3.1 is of minor importance: This prop-

erty implies together with the transfer principle that ∗ maps entities of S into
entities of T. Such a requirement is necessary since we are formally working with
a set theory with atoms: The transfer principle alone is not sufficient to e.g. deter-
mine the value ∗ ∅. Indeed, the transfer principle just implies that ∗ ∅ is “something
which contains no elements”; however from this property we do not know whether
this happens because ∗ ∅ = ∅ or because ∗ ∅ is an atom. Of course, we want to
have the first alternative.
Lemma 3.5. Let ∗ : S → T be an elementary map. Then for any x, y ∈ S the
following holds:
1. We have $x = y$, $x \in y$, resp. $x \subseteq y$ if and only if ${}^{*}x = {}^{*}y$, ${}^{*}x \in {}^{*}y$, resp.
${}^{*}x \subseteq {}^{*}y$.
2. $x$ is an atom resp. an entity in $S$ if and only if ${}^{*}x$ is an atom resp. an entity
in $T = {}^{*}S$.
3. ${}^{\sigma}A \subseteq {}^{*}A$ for any entity $A \in S$.
Proof. Let c and d be the constants from the language L which denote x resp. y,
i.e. c = I −1 (x) and d = I −1 (y). Then we have x = y, x ∈ y, resp. x ⊆ y if and only
if the sentence c = d, c ∈ d, resp. ∀z ∈ c : z ∈ d is true. This is the case if and only
if these sentences are true under the interpretation I ′ , i.e. if and only if (in the
usual set-theoretical sense) I ′ (c) = I ′ (d), I ′ (c) ∈ I ′ (d), resp. ∀z ∈ I ′ (c) : z ∈ I ′ (d).
But this means ∗ x = ∗ y, ∗ x ∈ ∗ y resp. ∗ x ⊆ ∗ y.
By definition, ${}^{*}x$ is an atom in $T$ if and only if ${}^{*}x \in T$. Since $T = {}^{*}S$, this is
the case if and only if ${}^{*}x \in {}^{*}S$. But by what we just proved this is equivalent to
$x \in S$, i.e. to the fact that $x$ is an atom in $S$. Thus, ${}^{*}x$ is an atom if and only if $x$
is an atom. But this means also that ${}^{*}x$ is an entity if and only if $x$ is an entity.
The last statement follows from the fact that $b \in {}^{\sigma}A$ means $b = {}^{*}a$ for some
$a \in A$. By what we proved before, $a \in A$ implies ${}^{*}a \in {}^{*}A$, i.e. $b \in {}^{*}A$.
From now on, we will assume without loss of generality that I is the identity
(by just renaming the constants in cns(L ) if necessary). This means that each
element of S is simultaneously a constant in the language L , and each formula
in L whose constants are in dom(I) = S is simultaneously a formula in standard
set theory. By this convention we will henceforth not have to distinguish between
such a formula and its interpretation in set theory. Since this can be confusing on some
occasions, we will sometimes use the symbol I anyway.
With the above convention, it makes now sense to consider all the other
convenient shortcuts commonly used in set theory as part of our language. Let us
give a small list:
Proposition 3.6. If ϕ, X, X1 , . . . , Xn are constants or variables which do not repre-

sent atoms, and x1 , . . . , xn are variables or constants, then the following sentences
resp. predicates are abbreviations of transitively bounded formulas:
1. X1 ⊆ X2 , X = X1 \ X2 , X = X1 ∪ X2 , X = X1 ∩ X2 .
2. X = {x1 , . . . , xn }.
3. X = (x1 , . . . , xn ), (x1 , . . . , xn ) ∈ X.
4. X = X1 × · · · × Xn .
5. ϕ ⊆ X1 × · · · × Xn . (i.e. ϕ is a relation on X1 , . . . , Xn ).
6. ϕ : X1 → X2 , x2 = ϕ(x1 ).
7. X is transitive.
The mentioned variables/constants are the only free variables or constants occur-
ring in these formulas. Moreover, if X, X1 , X2 and x1 , . . . , xn are constants, then
the formulas for 1. and 2. are even bounded.
Proof. The formula $X_1 \subseteq X_2$ is a shortcut for $(\forall x \in X_1 : x \in X_2)$, and $X = X_1 \setminus X_2$
is a shortcut for $((\forall x \in X : (x \in X_1 \wedge x \notin X_2)) \wedge (\forall x \in X_1 : (x \notin X_2 \implies x \in X)))$;
the other formulas in 1. are similar, and their formulation is left to the
reader. The formula X = {x1 , x2 } is a shortcut for (x1 ∈ X ∧ x2 ∈ X ∧ (∀x ∈ X :
(x = x1 ∨ x = x2 ))). Using this shortcut, we may treat the formula X = (x1 , x2 )
as a shortcut for (∃x ∈ X : ∃y ∈ X : (x = {x1 } ∧ y = {x1 , x2 }) ∧ X = {x, y}).
The sentence (x1 , x2 ) ∈ X can then be considered as a shortcut for (∃x ∈ X : x =
(x1 , x2 )). Using also this shortcut, we may treat X = X1 × X2 as a shortcut for
((∀x ∈ X : ∃x1 ∈ X1 , x2 ∈ X2 : x = (x1 , x2 )) ∧ (∀x1 ∈ X1 , x2 ∈ X2 : (x1 , x2 ) ∈
X)). Now the formula ϕ ⊆ X1 × X2 may be written with the previous shortcuts.
The cases n ≥ 3 in the previous shortcuts are similar and left to the reader (one
may use an induction, if one wants to be extraordinarily precise). The sentence
ϕ : X1 → X2 is a shortcut for ((ϕ ⊆ X1 × X2 ) ∧ (∀x ∈ X1 , y1 , y2 ∈ X2 : (((x, y1 ) ∈
ϕ ∧ (x, y2 ) ∈ ϕ) =⇒ y1 = y2 ))). In this case, ϕ(x1 ) = x2 is a shortcut for
(x1 , x2 ) ∈ ϕ. “X is transitive” is a shortcut for ∀y ∈ X : ∀x ∈ y : x ∈ X.
The requirement that X, X1 , . . . , Xn do not represent atoms is not essential

for all of the formulas, but e.g. for X = X1 \ X2 : If e.g. X1 = X2 , then any atom
X would satisfy the formula given in the above proof.
Note that the formula in the above proof given for X = (x1 , x2 ) is not
bounded, even if x1 , x2 , and X are constants (you have to write down the formula
more explicitly to see this). However, this formula is transitively bounded.
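To make the last remark concrete, the following sketch writes out the formula for $X = (x_1, x_2)$ without the set-brace shortcuts. It assumes, as the examples in this section suggest, that a formula is bounded when every quantifier ranges over a constant, and transitively bounded when a quantifier may also range over a previously bound variable. The auxiliary variable $u$ is introduced here for illustration only.

```latex
% Expansion of X = (x_1, x_2), unfolding the shortcuts for {x_1}, {x_1, x_2}
% and {x, y} from the proof of Proposition 3.6 (u is an auxiliary variable).
\begin{align*}
\exists x \in X : \exists y \in X : \bigl(
  & (x_1 \in x \wedge \forall u \in x : u = x_1) \\
  \wedge\ & (x_1 \in y \wedge x_2 \in y \wedge \forall u \in y : (u = x_1 \vee u = x_2)) \\
  \wedge\ & (x \in X \wedge y \in X \wedge \forall u \in X : (u = x \vee u = y)) \bigr)
\end{align*}
% The quantifiers "\forall u \in x" and "\forall u \in y" range over the bound
% variables x and y, not over constants.
```

Even when $x_1$, $x_2$, and $X$ are constants, the two inner quantifiers over $x$ and $y$ keep the formula from being bounded in the above sense, while it remains transitively bounded.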
Of course, the list from Proposition 3.6 is far from complete. Nevertheless, we will
tacitly make use of simple extensions of this proposition; for example we consider a
formula like {x1 , . . . , xn } ∈ X as an abbreviation of (∃x ∈ X : x = {x1 , . . . , xn }),
and so on.
The $*$-transform of a transitively bounded formula $\alpha$ with constants in $S$
is the formula ${}^{*}\alpha$ where any occurring constant $c$ is replaced by ${}^{*}c$ (in contrast,
variables remain unchanged!). In other words: If $\alpha$ is a set-theoretical formula
about the superstructure $S$, then ${}^{*}\alpha$ is the formula which arises from $\alpha$ if each
element of $S$ in the formula is replaced by its image under the mapping $*$.
Now we are in a position to formulate the main principle of nonstandard
analysis:
Theorem 3.7 (Transfer principle). Let $* : S \to {}^{*}S$ be elementary. A transitively
bounded formula $\alpha$ with constants in $S$ is true if and only if its $*$-transform ${}^{*}\alpha$ is
true.
Proof. Since $I$ is the identity, the formula $\alpha$ is true if and only if it is true under
the interpretation map $I$. By the first version of the transfer principle (Proposition 3.4)
this is the case if and only if $\alpha$ is true under the interpretation map $I'$, i.e. if and
only if ${}^{I'}\alpha$ is a true formula in set theory. Now observe that ${}^{I'}\alpha = {}^{*}\alpha$.
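As an illustration of how the $*$-transform is formed and used, consider the following sketch. It rests on assumptions not made in the text above: that the atoms include the real numbers, that $\mathbb{N} \subseteq \mathbb{R}$, and that the order is available as an entity $L$ (a name chosen here for illustration).

```latex
% Assumptions (illustrative): \mathbb{R} is contained in the atoms, and
% L := \{(x, n) \in \mathbb{R} \times \mathbb{N} : x < n\} is an entity of the superstructure.
\begin{align*}
\alpha:\quad & \forall x \in \mathbb{R} : \exists n \in \mathbb{N} : (x, n) \in L \\
{}^{*}\alpha:\quad & \forall x \in {}^{*}\mathbb{R} : \exists n \in {}^{*}\mathbb{N} : (x, n) \in {}^{*}L
\end{align*}
% Only the constants \mathbb{R}, \mathbb{N}, L receive a star; the variables x, n are unchanged.
```

Since $\alpha$ is true (it is the Archimedean property), the transfer principle yields that ${}^{*}\alpha$ is true. Note that $n$ now ranges over ${}^{*}\mathbb{N}$ rather than over $\mathbb{N}$, so ${}^{*}\alpha$ does not assert that ${}^{*}\mathbb{R}$ is Archimedean in the usual sense.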
As remarked earlier, the definition of elementary maps (and thus the transfer
principle) slightly varies in literature: While e.g. in [SL76, Lux73] only bounded
sentences are considered, in the book [LR94] even sentences which are slightly
more general than transitively bounded sentences are considered.
Since already the simple formula x = (x1 , x2 ) has no simple bounded formu-
lation, the first approach appears rather restrictive.
On the other hand, all results in this section are of course based on the
assumption that there exists a nontrivial elementary map ∗. In the mentioned
literature (and also in §4), this map is constructed in the same way. Only the proof
that the map has the required properties varies. Of course, the more properties
one wants to have, the more technical this proof becomes. Our restriction to transitively
bounded formulas is reasonably mild in practice and still allows us to give a proof
where the main ideas are not hidden behind technical details. The reader who
needs a (slightly) more general form of the transfer principle is referred to [LR94].
3.2 The Standard Definition Principle

The transfer principle implies that the mapping $*$ preserves much of the structure
of $S$. In fact, we have:
Theorem 3.8 (Standard Definition Principle). An entity A ∈ ∗ S is standard if and
only if it can be written in the form
A = {x ∈ B : α(x)}
where α is a transitively bounded predicate with x as its only free variable, and B
and all elements (=constants) occurring in α are standard elements.
More precisely, if $B = {}^{*}B_0$, and the elements occurring in $\alpha$ are ${}^{*}B_1, \ldots, {}^{*}B_n$
(we write $\alpha = \alpha(x, {}^{*}B_1, \ldots, {}^{*}B_n)$ to point this out), then
$$A = \{x \in {}^{*}B_0 : \alpha(x, {}^{*}B_1, \ldots, {}^{*}B_n)\} = {}^{*}\{x \in B_0 : \alpha(x, B_1, \ldots, B_n)\}$$
where $\alpha(x, B_1, \ldots, B_n)$ denotes the formula $\alpha$ where all occurrences of ${}^{*}B_i$ are
replaced by $B_i$ ($i = 1, \ldots, n$).
Proof. Necessity is clear, since we have A = {x ∈ B : x = x} for the choice
B = A. Sufficiency follows from the second statement. To prove the latter, let
A0 := {x ∈ B0 : α(x, B1 , . . . , Bn )} and note that
∀x ∈ B0 : (x ∈ A0 ⇐⇒ α(x, B1 , . . . , Bn ))
is a true sentence which is transitively bounded. Hence, the transfer principle
implies that its $*$-transform is true. But the latter reads
$$\forall x \in {}^{*}B_0 : (x \in {}^{*}A_0 \iff \alpha(x, {}^{*}B_1, \ldots, {}^{*}B_n)).$$
Since ${}^{*}A_0 \subseteq {}^{*}B_0$ by Lemma 3.5, this means that
${}^{*}A_0 = \{x \in {}^{*}B_0 : \alpha(x, {}^{*}B_1, \ldots, {}^{*}B_n)\}$, i.e. ${}^{*}A_0 = A$, as claimed.
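A small worked instance may help to see how the standard definition principle is applied. The sketch below assumes (beyond what the text states here) that $\mathbb{N}$ is contained in the atoms and that the doubling relation is available as an entity; the names $D$ and $E$ are chosen for illustration.

```latex
% Assumptions (illustrative): \mathbb{N} is contained in the atoms, and
% D := \{(k, x) \in \mathbb{N} \times \mathbb{N} : x = 2k\} is an entity of the superstructure.
\begin{align*}
E &:= \{x \in \mathbb{N} : \exists k \in \mathbb{N} : (k, x) \in D\}
   && \text{(the even numbers; here } B_0 = \mathbb{N},\ B_1 = \mathbb{N},\ B_2 = D\text{)} \\
{}^{*}E &= \{x \in {}^{*}\mathbb{N} : \exists k \in {}^{*}\mathbb{N} : (k, x) \in {}^{*}D\}
   && \text{(by the standard definition principle)}
\end{align*}
```

In words: to compute ${}^{*}E$, one keeps the defining predicate and replaces each standard constant by its $*$-image. The predicate $\exists k \in \mathbb{N} : (k, x) \in D$ is transitively bounded by Proposition 3.6, which is what the principle requires.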
Together with Proposition 3.6 we find that each elementary embedding is a
superstructure monomorphism:
Definition 3.9. A map ∗ : S → T is a superstructure monomorphism if it is one-
to-one, and if for all entities A, B ∈ S and all x1 , . . . , xn ∈ S the following holds:
1. $*$ preserves $\in$: $a \in A$ implies ${}^{*}a \in {}^{*}A$.
2. $*$ preserves finite sets:
${}^{*}\{x_1, \ldots, x_n\} = \{{}^{*}x_1, \ldots, {}^{*}x_n\}$.
3. $*$ preserves grouping:
${}^{*}(x_1, \ldots, x_n) = ({}^{*}x_1, \ldots, {}^{*}x_n)$.
4. $*$ preserves basic set operations: ${}^{*}\emptyset = \emptyset$, ${}^{*}(A \cup B) = {}^{*}A \cup {}^{*}B$, ${}^{*}(A \cap B) =
{}^{*}A \cap {}^{*}B$, ${}^{*}(A \setminus B) = {}^{*}A \setminus {}^{*}B$, ${}^{*}(A \times B) = {}^{*}A \times {}^{*}B$.
5. $*$ preserves domains and ranges of $n$-ary relations $\Phi$ in $X$ and commutes with
permutations of the variables:
For example, if $\Phi$ is a binary relation, then ${}^{*}\Phi$ is a relation, and
$\operatorname{dom}({}^{*}\Phi) = {}^{*}\operatorname{dom}(\Phi)$, $\operatorname{rng}({}^{*}\Phi) = {}^{*}\operatorname{rng}(\Phi)$. If another relation $\Psi$ on $B, A$ has
the property that $(y, x) \in \Phi$ if and only if $(x, y) \in \Psi$, then $(z, w) \in {}^{*}\Phi$ if and
only if $(w, z) \in {}^{*}\Psi$ (i.e. ${}^{*}(\Phi^{-1}) = ({}^{*}\Phi)^{-1}$). An analogous statement holds for
relations of more than two variables.
6. $*$ preserves the equality relation:
${}^{*}\{(x, x) : x \in A\} = \{(x, x) : x \in {}^{*}A\}$.
7. $*$ preserves atomic standard definitions of sets:
${}^{*}\{(x, y) : x \in y \in A\} = \{(x, y) : x \in y \in {}^{*}A\}$.
We note that many of the properties in Definition 3.9 are actually redundant.
For example, the relation ${}^{*}(A \setminus B) = {}^{*}A \setminus {}^{*}B$ implies for the choice $A = B$ that
${}^{*}\emptyset = \emptyset$.
Theorem 3.10. Each elementary map ∗ is a superstructure monomorphism.
Proof. The injectivity of ∗ has already been proved in Lemma 3.5. Concerning the
other properties:
1. This was proved in Lemma 3.5.
2. Let C denote the entity {x1 , . . . , xn }. Then C = {x1 , . . . , xn } is a true and
bounded sentence by Proposition 3.6 with C and x1 , . . . , xn as the only constants,
and so its ∗-transform ∗ C = {∗ x1 , . . . , ∗ xn } is true.
3. Analogously with C = (x1 , . . . , xn ). The only difference is that this statement
is only transitively bounded.
4. Let C denote the entity A \ B. Then C = A \ B is a true and transitively
bounded sentence by Proposition 3.6, and so its ∗-transform ∗ C = ∗ A \ ∗ B is
true. Note that ∗ C is actually an entity by Lemma 3.5. The proof of the other
statements is similar. To see that ∗ ∅ = ∅, we may argue as remarked before, or
apply the transfer principle to the bounded true sentence ∀x ∈ ∅ : x = x.
5. Let $\Phi$ be an entity in $S$ which is a binary relation. We have $\Phi \in S_n$ for some
$n$. Since $S_n$ is transitive (Theorem 2.1), the relation $(x, y) \in \Phi$ implies $x, y \in S_n$.
Consequently, $\Phi \subseteq S_n \times S_n$. By Lemma 3.5 and 4., we have ${}^{*}\Phi \subseteq {}^{*}(S_n \times S_n) =
{}^{*}S_n \times {}^{*}S_n$.
If $C := \operatorname{dom}(\Phi)$, then $C = \{x \in S_n : (\exists y \in S_n : (x, y) \in \Phi)\}$ where
$(x, y) \in \Phi$ is the shortcut of Proposition 3.6. The standard definition principle
implies ${}^{*}C = \{x \in {}^{*}S_n \mid \exists y \in {}^{*}S_n : (x, y) \in {}^{*}\Phi\}$, i.e. ${}^{*}\operatorname{dom}(\Phi) = \operatorname{dom}({}^{*}\Phi)$. The formula
${}^{*}\operatorname{rng}(\Phi) = \operatorname{rng}({}^{*}\Phi)$ is proved analogously.
If $\Psi$ is a relation which satisfies $(y, x) \in \Psi$ if and only if $(x, y) \in \Phi$, then
$$\Psi = \{z \in S_n \times S_n \mid \exists x, y \in S_n : (z = (x, y) \wedge (y, x) \in \Phi)\},$$
and the standard definition principle implies
$${}^{*}\Psi = \{z \in {}^{*}(S_n \times S_n) \mid \exists x, y \in {}^{*}S_n : (z = (y, x) \wedge (x, y) \in {}^{*}\Phi)\},$$
which in view of ${}^{*}(S_n \times S_n) = {}^{*}S_n \times {}^{*}S_n$ implies that ${}^{*}\Psi = \{(y, x) : (x, y) \in {}^{*}\Phi\}$,
as claimed.
The case of relations which are not binary is similar and left to the reader.
6. Let $C := \{(x, x) : x \in A\} = \{y \in A \times A : (\exists x \in A : y = (x, x))\}$ where
$y = (x, x)$ is the transitively bounded predicate from Proposition 3.6. The standard
definition principle thus implies ${}^{*}C = \{y \in {}^{*}(A \times A) : (\exists x \in {}^{*}A : y = (x, x))\}$.
Since ${}^{*}(A \times A) = {}^{*}A \times {}^{*}A$ by 4., we find ${}^{*}C = \{(x, x) : x \in {}^{*}A\}$, as claimed.
7. At first glance, one might suspect that for the proof of 7., one might argue
similarly to the proof of 6. However, from the definition of the set C := {(x, y) :
x ∈ y ∈ A} we may not apply the standard definition principle, since it is not clear
from which entity the elements (x, y) are taken: We need a “universal” entity U
with (x, y) ∈ U whenever x ∈ y ∈ A.
Such an entity indeed exists: Since A is an entity, we find some n with A ∈ Sn
(with Sn as in Section 2.1). Since Sn is transitive (Theorem 2.1, 2.), the relation
y ∈ A implies y ∈ Sn ; using the transitivity of Sn once more, we find that x ∈
y also implies x ∈ Sn . Consequently, (x, y) ∈ Sn × Sn whenever x ∈ y ∈ A.
Hence, U := Sn × Sn is the required universal entity (U actually is an entity by
Theorem 2.1).
Now the proof is straightforward: By our choice of U , we have ∀y ∈ A : ∀x ∈
y : (x, y) ∈ U , and so C = {(x, y) ∈ U : x ∈ y ∈ A} which implies by the standard
definition principle that ∗ C = {(x, y) ∈ ∗ U : x ∈ y ∈ ∗ A}. Now observe that an
application of the transfer principle to the above mentioned transitively bounded
true sentence implies that ∀y ∈ ∗ A : ∀x ∈ y : (x, y) ∈ ∗ U is true. Using this fact
in the above formula for ∗ C, we have ∗ C = {(x, y) : x ∈ y ∈ ∗ A}.
We note that, conversely, each superstructure monomorphism satisfies the

transfer principle for bounded sentences and also the standard definition prin-
ciple for bounded predicates; a proof of this result can be found in [RZ69]. It
appears rather reasonable that even superstructure monomorphisms must satisfy
the transfer principle for transitively bounded sentences, but we did not try to
prove this.
Corollary 3.11. Let ∗ be a nonstandard embedding. Then for any entity A in the
standard universe we have σ A ⊆ ∗ A with equality if and only if A is finite.
Proof. The inclusion σ A ⊆ ∗ A has been proved in Lemma 3.5, and the equality
for finite sets follows from the fact that ∗ is a superstructure monomorphism. The
fact that we have inequality for infinite sets is the definition of a nonstandard
embedding.
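The finite case in the above proof can be written out in one line. The following sketch uses a two-element entity $A = \{a, b\}$ chosen purely for illustration and recalls that ${}^{\sigma}A$ consists of the elements ${}^{*}x$ with $x \in A$ (as used in the proof of Lemma 3.5).

```latex
% Illustrative two-element case; the general finite case is the same with n elements.
\begin{align*}
{}^{*}A = {}^{*}\{a, b\}
  &= \{{}^{*}a, {}^{*}b\}            && \text{($*$ preserves finite sets, Definition 3.9, 2.)} \\
  &= \{{}^{*}x : x \in A\} = {}^{\sigma}A && \text{(definition of ${}^{\sigma}A$)}
\end{align*}
```

For an infinite entity no such finite listing is available, and the definition of a nonstandard embedding requires ${}^{\sigma}A \neq {}^{*}A$ in that case.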
In particular, if $S$ is finite and so all entities of the standard universe are finite,
we always have ${}^{\sigma}A = {}^{*}A$. In this case, everything is trivial: After a renaming of
the atoms in ${}^{*}S = {}^{\sigma}S$, we may assume that ${}^{*}S = S$ and that $*$ is the
identity on the atoms. Since ${}^{\sigma}A = {}^{*}A$ for all sets $A$, we may conclude that $*$ is the
identity on the whole superstructure over $S$.
Corollary 3.12 (Standard Definition Principle for Relations). An $n$-ary relation
$A \in {}^{*}S$ is standard if and only if it can be written in the form
$$A = \{(x_1, \ldots, x_n) \in B_1 \times \cdots \times B_n : {}^{*}\alpha(x_1, \ldots, x_n)\}$$
where ${}^{*}\alpha$ is a transitively bounded predicate with $x_1, \ldots, x_n$ as its only free
variables, and $B_1, \ldots, B_n$ and all elements (=constants) occurring in ${}^{*}\alpha$ are standard elements.
More precisely, if $B_k = {}^{*}A_k$, then
$$A = {}^{*}\{(x_1, \ldots, x_n) \in A_1 \times \cdots \times A_n : \alpha(x_1, \ldots, x_n)\}$$
where $\alpha$ denotes the formula ${}^{*}\alpha$ where all occurrences of constants ${}^{*}B$ are replaced
by $B$.
Proof. Put $B_0 = A_1 \times \cdots \times A_n$. Then ${}^{*}B_0 = {}^{*}A_1 \times \cdots \times {}^{*}A_n = B_1 \times \cdots \times B_n$, because $*$ is a
superstructure monomorphism. Putting $B = {}^{*}B_0$, we thus have
$$A = \{y \in B \mid \exists x_1 \in B_1, \ldots, x_n \in B_n : (y = (x_1, \ldots, x_n) \wedge {}^{*}\alpha(x_1, \ldots, x_n))\}.$$
The standard definition principle implies
$$A = {}^{*}\{y \in B_0 \mid \exists x_1 \in A_1, \ldots, x_n \in A_n : (y = (x_1, \ldots, x_n) \wedge \alpha(x_1, \ldots, x_n))\},$$
and so the statement follows.

Exercise 5. Prove that it is not possible to describe nonstandard elements by a
standard predicate. More precisely, if A is a standard entity and c ∈ ∗ A \ σ A is
a nonstandard element, prove that there is no standard predicate α(x) (i.e. all
constants in α are standard) such that α(x) is true for x ∈ ∗ A if and only if x = c.
In nonstandard analysis it is an important fact that each function in the
standard world may be “extended” to a function acting on the corresponding
nonstandard sets:
Theorem 3.13. Let ∗ be elementary. If A, B are standard entities and f : A → B,
then ∗ f : ∗ A → ∗ B. Moreover:
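1. $f$ is one-to-one if and only if ${}^{*}f$ is one-to-one, i.e. if and only if ${}^{*}f$ is invertible
on its range $\operatorname{rng}({}^{*}f) = {}^{*}\operatorname{rng}(f)$, and we have $({}^{*}f)^{-1}|_{\operatorname{rng}({}^{*}f)} = {}^{*}(f^{-1}|_{\operatorname{rng}(f)})$.
2. $f$ is onto if and only if ${}^{*}f$ is onto.
3. ${}^{*}(f(a)) = ({}^{*}f)({}^{*}a)$ for each $a \in A$.
4. If $C \subseteq A$, then ${}^{*}(f|_C) = ({}^{*}f)|_{{}^{*}C}$.
5. ${}^{*}(f(C)) = {}^{*}f({}^{*}C)$ ($C \subseteq A$) and ${}^{*}(f^{-1}(D)) = ({}^{*}f)^{-1}({}^{*}D)$ ($D \subseteq B$).
6. If $g : B \to C$, then ${}^{*}(g \circ f) = ({}^{*}g) \circ ({}^{*}f)$.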
Proof. The transfer principle applied to the formula f : A → B from Proposi-
tion 3.6 shows that ∗ f : ∗ A → ∗ B.
1. $f$ is one-to-one if and only if the transitively bounded sentence
$$\forall x_1, x_2 \in A : (x_1 \neq x_2 \implies f(x_1) \neq f(x_2))$$
is true, where $f(x_1) \neq f(x_2)$ is a shortcut for $\exists y_1, y_2 \in \operatorname{rng}(f) : (y_1 = f(x_1) \wedge y_2 =
f(x_2) \wedge y_1 \neq y_2)$ (where we used the shortcut from Proposition 3.6). The $*$-transform
of this sentence reads
$$\forall x_1, x_2 \in {}^{*}A : (x_1 \neq x_2 \implies {}^{*}f(x_1) \neq {}^{*}f(x_2))$$
which is true if and only if ${}^{*}f$ is one-to-one. By the transfer principle the sentences
are either both true or both false. The formulas for the range and the inverse have
been proved in Theorem 3.10.
2. $f$ is onto if and only if $\operatorname{rng}(f) = B$ which by the injectivity of $*$ is the case if
and only if ${}^{*}\operatorname{rng}(f) = {}^{*}B$. By Theorem 3.10, this means ${}^{*}B = \operatorname{rng}({}^{*}f)$, i.e. ${}^{*}f$ is
onto.
3. Let $c$ denote the constant $f(a)$. Then the sentence $(a, c) \in f$ is true, and so by
the transfer principle ${}^{*}(a, c) \in {}^{*}f$. Since $*$ is a superstructure monomorphism, we
have $({}^{*}a, {}^{*}c) = {}^{*}(a, c)$, and so $({}^{*}a, {}^{*}c) \in {}^{*}f$, i.e. ${}^{*}c = {}^{*}f({}^{*}a)$.
4. We have
$$f|_C = \{(x, y) \in f : x \in C\}.$$
Hence, the standard definition principle for relations implies
$${}^{*}(f|_C) = \{(x, y) \in {}^{*}f : x \in {}^{*}C\}$$
which means that ${}^{*}(f|_C) = ({}^{*}f)|_{{}^{*}C}$, as claimed.
5. ${}^{*}(f(C)) = {}^{*}\operatorname{rng}(f|_C) = \operatorname{rng}({}^{*}(f|_C)) = \operatorname{rng}({}^{*}f|_{{}^{*}C}) = {}^{*}f({}^{*}C)$. Since
$$f^{-1}(D) = \{x \in A : f(x) \in D\},$$
the standard definition principle implies
$${}^{*}(f^{-1}(D)) = \{x \in {}^{*}A : {}^{*}f(x) \in {}^{*}D\} = ({}^{*}f)^{-1}({}^{*}D).$$
6. Let $h = g \circ f$. Then the sentence $\forall x \in A : h(x) = g(f(x))$ with the evident
abbreviations is transitively bounded and true. The transfer principle implies
$\forall x \in {}^{*}A : {}^{*}h(x) = {}^{*}g({}^{*}f(x))$ which means ${}^{*}h = ({}^{*}g) \circ ({}^{*}f)$.
From the previous proofs, the reader might get the wrong impression that
all useful sentences are transitively bounded so that the whole structure of S is
preserved by an elementary map ∗. This impression is false:
Although it actually is the case that any useful sentence can be formulated as
a transitively bounded formula, their “natural” formulation often is not a transi-
tively bounded formula. But the transfer principle only applies to the transitively
bounded formulation.
Let us give a key example: Let $\alpha$ denote the sentence
$$\forall A \subseteq B : \beta(A)$$
with some infinite set $B \in S$, and assume that $\alpha$ is true. The sentence $\alpha$ is in
general not bounded (unless $B$ or $\beta$ are rather trivial). However, $\alpha$ is equivalent
to the sentence
$$\forall A \in \mathcal{P}(B) : \beta(A).$$
This sentence is bounded (here, $\mathcal{P}(B)$ is considered as a constant). Hence, the
transfer principle implies that
$$\forall A \in {}^{*}\mathcal{P}(B) : \beta(A)$$
is true. In contrast, the $*$-transform of the original sentence $\alpha$ would read
$\forall A \subseteq {}^{*}B : \beta(A)$ which can be rephrased as $\forall A \in \mathcal{P}({}^{*}B) : \beta(A)$. The transfer principle
does not imply that the latter sentence is true. In fact, this sentence may fail,
because we have
$${}^{*}\mathcal{P}(B) \neq \mathcal{P}({}^{*}B)$$
for any infinite set $B \in S$ if $*$ is a nonstandard map, as we shall see later.
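A classical instance of this phenomenon can be sketched as follows. It is not taken from the text above: it assumes that $\mathbb{N}$ is contained in the atoms, that $*$ is a nonstandard map, and it uses elementary properties of ${}^{*}\mathbb{N}$ (obtained by transfer) that are only developed later; the names $\beta$ and $M$ are chosen for illustration.

```latex
% Assumptions (illustrative): \mathbb{N} is contained in the atoms; "\le" and "c - 1"
% abbreviate the transferred order relation and predecessor on {}^{*}\mathbb{N}.
\begin{align*}
\beta(A):\quad & A \neq \emptyset \implies \exists m \in A : \forall a \in A : m \le a
   && \text{(least element property)} \\
\text{transfer:}\quad & \forall A \in {}^{*}\mathcal{P}(\mathbb{N}) : \beta(A)
   && \text{(true)} \\
M :=\ & {}^{*}\mathbb{N} \setminus {}^{\sigma}\mathbb{N} \neq \emptyset, \quad
   c \in M \implies c - 1 \in M
   && \text{(so $M$ has no least element)}
\end{align*}
% Consequently \beta(M) fails, and M \in \mathcal{P}({}^{*}\mathbb{N}) \setminus {}^{*}\mathcal{P}(\mathbb{N}).
```

Here $M$ is nonempty because ${}^{\sigma}\mathbb{N} \neq {}^{*}\mathbb{N}$ for a nonstandard map, and $c - 1$ cannot be standard for $c \in M$, since otherwise $c = (c - 1) + 1$ would be standard as well. Thus the sentence $\forall A \in \mathcal{P}({}^{*}\mathbb{N}) : \beta(A)$ is false although its bounded counterpart over ${}^{*}\mathcal{P}(\mathbb{N})$ is true.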
3.3 The Internal Definition Principle

The reader might have observed that almost all of the arguments in the previous
proofs followed the same scheme. Only in the proof of Theorem 3.10 in step 7.,
an essentially different argument was needed. Actually, this argument is the key
to the definition of the nonstandard universe and the internal definition principle
which we shall introduce now.
Let us first point out that transitive subsets of ${}^{*}S$ play a special role in connection
with transitively bounded sentences (this is the reason for our terminology
transitively bounded):
Consider a simple transitively bounded sentence like ∀x ∈ C : ∀y ∈ x : ∃z ∈
y : α under an interpretation map I ′ : cns(L ) → ∗ S where C is interpreted as
an element from a transitive subset T ⊂ ∗ S. Since T is transitive, this element
is a subset of T , and so x runs under the interpretation actually only through

elements of T . For the same reason, also y runs only through elements of T , and
hence also z runs only through elements of T . The same argument holds for any
transitively bounded sentence. Thus, roughly speaking, for transitively bounded
sentences with constants from T , it suffices to know the “universe” T to find the
correct interpretation: It is not necessary to know the whole superstructure ∗ S.
Since we are mainly interested in the ∗-transform of sentences, i.e. sentences
with only standard constants, it suffices for us to consider the smallest transitive
set which contains all the standard constants. This set will be defined as the
nonstandard universe. It turns out that this set has a simple characterization: The
elements of this set are precisely all elements of all standard entities. To see this,
we need some preparation:
Lemma 3.14. Let ∗ : S → ∗ S be elementary. For the sets Sn from Section 2.1, we
have
1. ${}^{*}S_n$ is transitive.
2. ${}^{*}S_0 \subseteq {}^{*}S_1 \subseteq \cdots \subseteq {}^{*}S$.
3. ${}^{*}S_0 \in {}^{*}S_1 \in \cdots \in {}^{*}S$.
Proof. By Theorem 2.1, we know that $S_n$ is transitive, $S_0 \subseteq S_1 \subseteq \cdots \subseteq S$, and
$S_0 \in S_1 \in \cdots \in S$. Since “$S_n$ is transitive” is a transitively bounded sentence
(Proposition 3.6), the transfer principle implies that ${}^{*}S_n$ is transitive. The inclusions
${}^{*}S_n \subseteq {}^{*}S_{n+1}$ and ${}^{*}S_n \in {}^{*}S_{n+1}$ follow from Lemma 3.5, and ${}^{*}S_n \in {}^{*}S$
follows from $* : S \to {}^{*}S$. This also implies ${}^{*}S_n \subseteq {}^{*}S$, since the superstructure ${}^{*}S$
is transitive by Theorem 2.1.
It is important to know that the sets ${}^{*}S_n$ are not the level sets of the
superstructure ${}^{*}S$ as in Section 2.1: The superstructure ${}^{*}S$ is much larger than the
union of the sets ${}^{*}S_n$, as we shall see. Instead, it turns out that the elements of
the sets ${}^{*}S_n$ form what we call the nonstandard universe:
Definition 3.15. Let $* : S \to {}^{*}S$ be elementary. The elements of standard sets are
called internal. The nonstandard universe $I$ is the system of all internal elements,
i.e.
$$I := \bigcup \{{}^{*}A : A \text{ is an entity in } S\}.$$
An element $x \in {}^{*}S$ which is not internal is called external.
In particular, each atom of ${}^{*}S$ is an internal element (but not each entity, as
we shall see).
Proposition 3.16. The nonstandard universe $I$ is a transitive subset of ${}^{*}S$. Each
standard element is internal. Conversely, each internal entity is a subset of some
standard entity. Moreover,
$$I = \bigcup_{n=0}^{\infty} {}^{*}S_n \tag{3.1}$$
where $S_n$ are defined as in Section 2.1. Hence, $I$ is the smallest transitive subset
of ${}^{*}S$ which contains all sets ${}^{*}S_n$ as subsets (or, equivalently, which contains all
standard elements).
Proof. If $x \in S$, then $x \in S_n$ for some $n$, and so ${}^{*}x \in {}^{*}S_n$ by Lemma 3.5. Since
$S_n$ is an entity of $S$ (Theorem 2.1, 1.), $I$ contains all sets ${}^{*}S_n$ as subsets (which
proves one inclusion of (3.1)), and hence in particular ${}^{*}x \in I$.
If $x$ is an internal element, then $x \in {}^{*}A$ for some entity $A \in S$. Then $A \in
S_n$ for some $n$. Recall that $S_n$ is transitive, and so $A \subseteq S_n$, hence ${}^{*}A \subseteq {}^{*}S_n$
by Lemma 3.5, i.e. $x \in {}^{*}S_n$. The formula (3.1) and $I \subseteq {}^{*}S$ now follow. If
additionally $x$ is no atom, the transitivity of ${}^{*}S_n$ (Lemma 3.14) implies $x \subseteq {}^{*}S_n \subseteq
I$ which shows that each internal entity is a subset of some standard entity and,
moreover, that $I$ is transitive.
Roughly speaking, all elements which can be “described” by transitively

bounded formulas are internal (we will make this more precise and give a rig-
orous proof in the internal definition principle below). At this point, we only want
to point out that in a sense the internal sets are precisely those sets which can be
“explicitly” constructed within the language of the nonstandard universe. For this
reason, internal sets are sometimes called “definable”. Nevertheless, many impor-
tant sets will turn out to be external. The distinction of internal and external sets
is one of the most useful concepts in nonstandard analysis (this will become clear
later in the applications). Thus, one can hardly overestimate the value of theorems
stating that certain sets are internal or external.
Now that we know the above described intuitive “transfer principle for in-
ternal sets”, it is not surprising that we also have an analogue of the standard
definition principle for internal sets which can be formulated more rigorously.
Recall that by Proposition 3.3 we have much freedom in the choice of our
language L : As long as dom(I) and the restriction of I to dom(I) remains un-
changed, we may add or delete any constants to cns(L ). In particular, it is no loss
of generality to assume that rng(I ′ ) = I : We just have to establish a one-to-one
correspondence between the constants from cns(L )\dom(I) (we may assume that
this set has the right cardinality) and the nonstandard elements from I . We call
a formula α in this language internal .
In other words: A (set-theoretical) formula containing as constants only ele-
ments from ∗ S is internal if and only if the elements occurring in the formula all
belong to I .
Theorem 3.17 (Internal Definition Principle). An entity $A \in {}^{*}S$ is internal if and
only if it can be written in the form
$$A = \{x \in B : \alpha(x)\}$$
where $B$ is an internal entity, and $\alpha$ is a transitively bounded internal predicate
with $x$ as its only free variable.
Proof. Necessity is trivial: Put B = A, and let α(x) be x = x. To prove sufficiency,

let B be internal, and α(x) be a transitively bounded internal predicate with x as
its only free variable. Let B1 , . . . , Bk be the constants (internal elements) which
occur in α. The essential step in the proof is to observe that there is some n such
that B =: B0 , B1 , . . . , Bk ∈ ∗ S n : Indeed, by Proposition 3.16, we find for any i
some n with Bi ∈ ∗ S n . Since ∗ S 0 ⊆ ∗ S 1 ⊆ · · · by Lemma 3.14, we may assume
that n is independent of i. We denote by α(x, y1 , . . . , yk ) the formula which arises
from α(x) if we replace any occurrence of Bi by yi (i = 1, . . . , k) (we assume
that the variables yi did not occur before in α). Now observe that the transitively
bounded sentence
∀y1 , . . . , yk , y ∈ Sn : ∃z ∈ Sn+1 : ∀x ∈ Sn+1 :

(x ∈ z ⇐⇒ (x ∈ y ∧ α(x, y1 , . . . , yk )))
is true: In fact, if y1 , . . . , yk , y ∈ Sn are given, the set z := {x ∈ y : α(x, y1 , . . . , yk )}

exists by the comprehension axiom of formal set theory. Since Sn is transitive, we
have z ⊆ Sn , and so z ∈ Sn+1 (a more careful analysis shows that we even have
z ∈ Sn , but we do not need this fact). The transfer principle thus implies that
∀y1 , . . . , yk , y ∈ ∗ S n : ∃z ∈ ∗ S n+1 : ∀x ∈ ∗ S n+1 :

(x ∈ z ⇐⇒ (x ∈ y ∧ α(x, y1 , . . . , yk )))
is true. For the choice yi = Bi ∈ ∗ S n , y = B ∈ ∗ S n , we thus find that there is a

set Z := z ∈ ∗ S n+1 which satisfies Z ∩ ∗ S n+1 = A ∩ ∗ S n+1 . Noting that ∗ S n+1
is transitive by Lemma 3.14, we have Z ⊆ ∗ S n+1 and A ⊆ B ⊆ ∗ S n ⊆ ∗ S n+1 .
Hence, A = Z ∈ ∗ S n+1 is an internal set.
Corollary 3.18. An entity $A \in {}^{*}S$ is internal if and only if it can be written in the
form
A = {x ∈ ∗ B : α(x)}
where B ∈ S, and α is a transitively bounded internal predicate with x as its only
free variable.
Proof. Since ∗ B is internal by Proposition 3.16, the internal definition principle

implies that each set A of the described form is internal. Conversely, if A is internal,
then Proposition 3.16 implies that A ⊆ ∗ B for some standard entity ∗ B. Hence,
we have the required representation A = {x ∈ ∗ B : x ∈ A}.
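To see how such a representation is used in practice, consider the following sketch. It assumes (beyond what is stated here) that $\mathbb{N}$ is contained in the atoms and that the order on $\mathbb{N}$ is available as an entity $L$; the names $L$, $c$, and $A_c$ are chosen for illustration.

```latex
% Assumptions (illustrative): \mathbb{N} is contained in the atoms, and
% L := \{(x, y) \in \mathbb{N} \times \mathbb{N} : x \le y\} is an entity of the superstructure.
\begin{align*}
A_c &:= \{x \in {}^{*}\mathbb{N} : (x, c) \in {}^{*}L\}, \qquad c \in {}^{*}\mathbb{N} \text{ arbitrary} \\
& B = \mathbb{N} \text{ (so ${}^{*}B = {}^{*}\mathbb{N}$)}, \qquad
  \alpha(x) := \bigl((x, c) \in {}^{*}L\bigr) \\
& \text{constants of } \alpha:\ c \in {}^{*}\mathbb{N} \text{ and } {}^{*}L, \text{ both internal}
\end{align*}
% Hence A_c is internal by Corollary 3.18, even if c is a nonstandard element.
```

The constant $c$ is internal because it is an element of the standard entity ${}^{*}\mathbb{N}$, and ${}^{*}L$ is internal because it is standard (Proposition 3.16). The predicate $(x, c) \in {}^{*}L$ is transitively bounded by Proposition 3.6, so the initial segment $A_c$ is internal whether or not $c$ is standard.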
Theorem 3.19. The nonstandard universe I satisfies:
1. If $A, B \in I$ are entities, then $A \cup B$, $A \cap B$, $A \setminus B$, $A \times B \in I$.
2. If $A$ is an internal system of entities, then $\bigcup A$ and $\bigcap A$ are internal.
3. For binary internal relations ϕ, the sets dom(ϕ) and rng(ϕ) are internal. Also
ϕ−1 is internal.
4. Images and preimages of internal sets under internal functions are internal.
Proof. 1. By Proposition 3.16 and Lemma 3.14, we find some index $n$ with $A, B \in
{}^{*}S_n$, and so $A, B \subseteq {}^{*}S_n$. Hence, $A \cup B = \{x \in {}^{*}S_n : x \in A \vee x \in B\}$ is internal by
the internal definition principle. $A \cap B \in I$ and $A \setminus B \in I$ follow analogously.
Since ${}^{*}S_n \times {}^{*}S_n = {}^{*}(S_n \times S_n)$ is internal (because $*$ is a superstructure
monomorphism), also
$$A \times B = \{z \in {}^{*}S_n \times {}^{*}S_n \mid \exists x \in A, y \in B : z = (x, y)\}$$
(Proposition 3.6) is internal by the internal definition principle.
2. If $A$ is an internal system of entities, then Proposition 3.16 implies that there
is some $n$ with $A \in {}^{*}S_n$, and then $A \subseteq {}^{*}S_n$. Since ${}^{*}S_n$ is transitive, we have
$$\bigcup A = \{x \in {}^{*}S_n \mid \exists y \in A : x \in y\}$$
which is internal by the internal definition principle. Analogously,
$$\bigcap A = \{x \in {}^{*}S_n \mid \forall y \in A : x \in y\}$$
is internal.
3. If $\phi$ is a binary internal relation, we find some $n$ with $\phi \in {}^{*}S_n$. Since ${}^{*}S_n$ is
transitive, it follows that $(x, y) \in \phi$ implies $x, y \in {}^{*}S_n$. Hence, $\operatorname{dom}(\phi) = \{x \in
{}^{*}S_n \mid \exists y \in {}^{*}S_n : (x, y) \in \phi\}$, which implies by the internal definition principle that
$\operatorname{dom}(\phi)$ is internal. $\operatorname{rng}(\phi) \in I$ is proved analogously. If $\psi = \{(y, x) : (x, y) \in \phi\}$,
then $\psi = \{(y, x) \in {}^{*}S_n \times {}^{*}S_n : (x, y) \in \phi\}$ is internal by the internal definition
principle.
4. If $\phi : A \to B$ is internal and $M \subseteq A$ is internal, then the image of $M$ is $\{y \in
\operatorname{rng}(\phi) \mid \exists x \in M : (x, y) \in \phi\}$ which is internal by the internal definition principle.
An analogous proof shows that preimages of internal sets are internal.
We have some analogue to Theorem 2.1: A union of a system of internal sets
need not be internal; but it is internal if the system is finite or if the system is
internal.
However, in contrast to Theorem 2.1, subsets of internal sets are not internal,
in general (and so we have a similar restriction as for the union also for the
intersection of internal sets).
Corollary 3.20 (Internal Definition Principle for Relations). An $n$-ary relation
$\phi \in {}^{*}S$ is internal if and only if it can be written in the form
$$\phi = \{(x_1, \ldots, x_n) \in B_1 \times \cdots \times B_n : \alpha(x_1, \ldots, x_n)\}$$
where $B_1, \ldots, B_n$ are internal entities, and $\alpha$ is a transitively bounded internal
predicate with $x_1, \ldots, x_n$ as its only free variables.
Proof. We have
$$\phi = \{z \in B_1 \times \cdots \times B_n \mid \exists x_1 \in B_1, \ldots, x_n \in B_n : (z = (x_1, \ldots, x_n) \wedge \alpha(x_1, \ldots, x_n))\}.$$
Since $B_1 \times \cdots \times B_n$ is internal by Theorem 3.19, the internal definition principle
implies that $\phi$ is internal.
Exercise 6. If x1 , . . . , xn are internal, prove that {x1 , . . . , xn } and (x1 , . . . , xn ) are
internal. Conclude that any external subset of an internal entity is infinite.
Exercise 7. Let f : A → B and g : B → C be internal functions. Prove that g ◦ f
is an internal function.
Exercise 8. Prove the following:
1. Let A1 , . . . , An be pairwise disjoint internal entities, and A := A1 ∪ · · · ∪ An .
Let fi (i = 1, . . . , n) be internal functions with dom(fi ) ⊇ Ai and rng(fi ) ⊆ B.
Then there exists an internal function f : A → B satisfying f (x) = fi (x) for
x ∈ Ai (i = 1, . . . , n).
2. Let A, B be internal entities, B ≠ ∅, A0 ⊆ A, and let f : A0 → B be an
internal function. Then f may be extended to an internal function F : A → B,
i.e. F (x) = f (x) for all x ∈ A0 .
3. The restriction of an internal function to an internal set is internal.
From the above observations one might guess that I is the same as ∗ S,
because many “natural” operations appear to remain within the nonstandard uni-
verse I . In fact, the earlier mentioned approach to nonstandard analysis by Nelson
only “knows” internal sets: This approach is more or less an axiomatic description
of set theory within I , the so-called internal set theory (this is not quite precise,
but gives a rather good idea of Nelson’s approach).
However, the above impression is misleading: ∗ S is actually much larger than
I . Although I has many properties analogous to Theorem 2.1, one important
property is missing: The relation A ⊆ B for an internal set B does not imply
that A is internal. In particular, the powerset of an internal set is not internal, in

general. For this reason one has to take extreme care if the symbols ⊆ and P are
involved.
For some purposes it is very convenient to calculate with ∗ S in place of I ;
this possibility gets lost in Nelson’s nonstandard analysis. On the other hand,
some other properties can be described more consistently by internal set theory. A
comparison of the two approaches can be found in [DS88] (see also [CK90, LR94]).
We will not discuss Nelson’s nonstandard analysis any more.
Now that we know internal sets, we can “explicitly” calculate the ∗-value of
P(A) and of the set B^A of all functions f : A → B:
Theorem 3.21. Let ∗ : S → ∗ S be elementary. If A ∈ S is an entity, then
∗ P(A) = {M ⊆ ∗ A : M is internal}.
If A, B ∈ S are entities, then
∗ (B^A ) = {f | f : ∗ A → ∗ B, and f is internal}.
Proof. Note that C := P(A) and D := B^A belong to S. Hence, we find some
index n with C, D ⊆ Sn (recall that Sn is transitive), and so ∗ C, ∗ D ⊆ ∗ S n
(Lemma 3.5). Using the shortcuts from Proposition 3.6, the sentences ∀x ∈ Sn :
(x ∈ C ⇐⇒ x ⊆ A) and ∀x ∈ Sn : (x ∈ D ⇐⇒ (x : A → B)) are true.
The transfer principle implies that ∀x ∈ ∗ S n : (x ∈ ∗ C ⇐⇒ x ⊆ ∗ A) and
∀x ∈ ∗ S n : (x ∈ ∗ D ⇐⇒ (x : ∗ A → ∗ B)) are true. In view of ∗ C, ∗ D ⊆ ∗ S n , this
implies the statement.
Now we know all definitions and results which are needed to understand
the appendix, where ∗-values of other important classes of sets are calculated.
Occasionally, some of these values are needed in the later sections. However,
reading the appendix now may appear boring and not too easy at the moment.
Thus, although from a mathematical point of view the appendix should be read
now, the reader is advised to read it at a later time, after having more experience
with nonstandard analysis (e.g. when reference to a result in the appendix is
made). It is a good idea to read the appendix parallel to §6.
3.4 Existence of External Sets

Let us now prove that there are a lot of external sets. Simultaneously, we show that
all “nontrivial” elementary embeddings ∗ are actually nonstandard embeddings:
Theorem 3.22. Let ∗ : S → ∗ S be elementary. If S contains an infinite entity, then
∗ is a nonstandard embedding if and only if σ B ≠ ∗ B for some infinite countable
entity B ∈ S. In this case, for any entity A ∈ S the following holds:
1. If A is infinite, then σ A is external.
2. If A is infinite, then P(∗ A) is external, and
σ P(A) ⊊ ∗ P(A) ⊊ P(∗ A).
3. If S is infinite, then ∗ S \ σ S is nonempty and contains only elements which
are internal but not standard.
Proof. If S contains an infinite entity B0 and ∗ is a nonstandard embedding, choose
some infinite countable B ⊆ B0 . Then B ∈ S by Theorem 2.1, and by definition
σ B ≠ ∗ B.
Conversely, suppose that there is some infinite countable B ∈ S such that
σ B ≠ ∗ B. We shall show 1. from this assumption. This implies that ∗ is a non-
standard embedding, since we must have σ A ≠ ∗ A, because ∗ A is internal by
Proposition 3.16, but σ A is not.
1. Recall that σ B ⊊ ∗ B by Corollary 3.11. We show first that C := ∗ B \ σ B is
external:
Assume the contrary, that C is internal. Let b1 , b2 , . . . be an enumeration of
the elements of B. Define a relation ϕ ⊆ B × B by (bn , bk ) ∈ ϕ ⇐⇒ n ≤ k.
Then ϕ defines a well-order on B, i.e. each nonempty subset of B has a smallest
element:
∀x ∈ P(B) : (x ≠ ∅ =⇒ ∃y ∈ x : ∀z ∈ x : (y, z) ∈ ϕ).
The transfer principle thus implies
∀x ∈ ∗ P(B) : (x ≠ ∗ ∅ =⇒ ∃y ∈ x : ∀z ∈ x : (y, z) ∈ ∗ ϕ).
Since C is internal by assumption, Theorem 3.21 implies C ∈ ∗ P(B). Since C ≠
∅ = ∗ ∅, the above statement for x = C thus implies that there is an element
y ∈ C such that z ∈ C implies (y, z) ∈ ∗ ϕ.
We shall lead this to a contradiction. We note first that an induction on
n implies (∗ bn , y) ∈ ∗ ϕ: Indeed, the transfer principle applied to the transitively
bounded sentence
∀y ∈ B : ((y ≠ b1 ∧ · · · ∧ y ≠ bn−1 ) =⇒ (bn , y) ∈ ϕ)
(Proposition 3.6) shows that the sentence
∀y ∈ ∗ B : ((y ≠ ∗ b1 ∧ · · · ∧ y ≠ ∗ bn−1 ) =⇒ (∗ bn , y) ∈ ∗ ϕ)
is true. Since our element y belongs to ∗ B \ σ B, it differs from ∗ b1 , . . . , ∗ bn−1 ,
and we thus have (∗ bn , y) ∈ ∗ ϕ (the case n = 1 is analogous).
Now we consider the map p : B → B defined by p(bn ) := bn−1 (n ≥ 2) and
p(b1 ) := b1 and note that
∀y ∈ B : (y ≠ b1 =⇒ (y, p(y)) ∉ ϕ).
Here, (y, p(y)) ∉ ϕ is a shortcut for ∃z ∈ rng p : (z = p(y) ∧ (y, z) ∉ ϕ) (henceforth,
we will no longer write down such shortcuts). Since y ≠ ∗ b1 , the transfer principle
now implies that (y, ∗ p(y)) ∉ ∗ ϕ. We thus have a contradiction if we can prove that
z := ∗ p(y) belongs to C = ∗ B \ σ B. Since ∗ p : ∗ B → ∗ B (Theorem 3.13), we have
z ∈ ∗ B, so it suffices to prove z ∉ σ B. But z ∈ σ B would mean z = ∗ bn for some n.
The transfer of the sentence
∀y ∈ B : (p(y) = bn =⇒ (y = bn+1 ∨ y = b1 ))
thus implies y = ∗ bn+1 or y = ∗ b1 which are both not possible, since y ∉ σ B. This
contradiction shows that C = ∗ B \ σ B is indeed external.
Now it follows that σ B is external as well, because otherwise C = ∗ B \ σ B would be
the difference of two internal sets (Proposition 3.16) and thus internal by
Theorem 3.19.
Since A is infinite, there is a function f : A → B which is onto B. Then
∗ f : ∗ A → ∗ B, and ∗ f maps σ A onto σ B: Indeed, for any a ∈ A and any b ∈ B for
which the sentence f (a) = b is true, the transfer principle implies ∗ f (∗ a) = ∗ b. In
particular ∗ f (∗ a) ∈ σ B for all ∗ a ∈ σ A, and for each ∗ b ∈ σ B, we find indeed some
preimage ∗ a ∈ σ A, since f is onto. We may now conclude that σ A is external,
because otherwise the image of σ A under the internal map ∗ f (Proposition 3.16)
would be internal by Theorem 3.19; but this image is σ B and thus external. This
contradiction shows that σ A is external.
2. The inclusion σ P(A) ⊊ ∗ P(A) follows from Corollary 3.11. Now we note that
P(∗ A) consists of all subsets of ∗ A, while ∗ P(A) consists of all internal such
subsets. Hence, ∗ P(A) ⊆ P(∗ A), and the inclusion is strict since by 1. there
actually is a subset of ∗ A which is not internal, namely σ A.
3. From Corollary 3.11, we conclude that ∗ S \ σ S is nonempty. Let a ∈ ∗ S. Then
a is an internal element and also an atom by Lemma 3.5. Using Lemma 3.5, we
find that a is a standard element if and only if a = ∗ b for some atom b ∈ S, i.e. if
and only if a ∈ σ S.
The previous proof might appear rather artificial to the reader: What we did
in fact was to identify B = ℕ and then to prove that every nonempty internal subset
of ∗ ℕ has a smallest element, but that ∗ ℕ \ σ ℕ contains no smallest element. We will
repeat this argument in §5.
To give the reader an impression about internal sets, let us already note the
following theorem, although we have to postpone the proof:
Theorem 3.23. Let ∗ : S → ∗ S be a nonstandard embedding. Then an internal

entity is either finite or has at least the cardinality of the continuum. In particular,
there are no countable internal sets.
Theorem 3.23 can be proved by purely model theoretic considerations [CK90].
However, we will give a real “nonstandard” proof for this result in Section 7.1.
Corollary 3.24. If ∗ : S → ∗ S is a nonstandard embedding, then each infinite
internal entity has an external subset.
Proof. Just choose some countably infinite subset; by Theorem 3.23 it cannot be internal.
The previous results already give a rather good impression of the nature of
internal sets: They may not be “too discrete”.
For example if S = ℝ, the subset σ ℝ of the internal set ∗ ℝ is “too dis-
crete” to be internal; also the complement ∗ ℝ \ σ ℝ is not internal (why?). In this
connection it is useful to think of ∗ ℝ as a “smooth continuum”; the “natural”
subsets are always “thick” in the sense that with each point they have to contain
a “continuous neighborhood of infinitesimals”. These “natural” sets are the inter-
nal ones. Of course, it is also “natural” to speak of a single real number (and in
fact, sets containing only a single number are also internal (why?)). However, it
is not admissible to collect infinitely many single numbers into some set: Already
for a countable collection, we get an external set by Theorem 3.23. However, if
the collection is “good enough” (in some sense a “smooth continuum”), we have
to deal with an internal set.
Exercise 9. All previous examples of external sets have been subsets of standard
sets. Is it always true that an external set is a subset of some standard set or at
least a subset of some internal set?
§ 4 Nonstandard Ultrapower Models

The aim of this section is to describe a nonstandard map “explicitly”. There is only
one nonconstructive step in the proof, namely the choice of a so-called ultrafilter.
Since there are many possible ultrafilters (recall that we assume the axiom of
choice), there are many different ways to define nonstandard maps which usually
have different additional properties. Nevertheless, the process of defining the model
will be independent of the particular choice of the ultrafilter.
We should point out that the method we present here is not the only possible
approach, but the model we obtain has particularly nice properties. Moreover, the
fact that the approach is “almost constructive” has the advantage that in special
cases one can better see what happens: Up to the ultrafilter one can “calculate” the
nonstandard embedding ∗. In particular, the role of the internal sets will become
evident in the course of the construction.
The plan of the proof is as follows: Let a superstructure S and a language L
be given whose constants are in a one-to-one relation with the elements of S (as
before, we denote this relation by I). We first determine an abstract model S of L
which is in a certain sense nonstandard and which satisfies the transfer principle
even for sentences which are not necessarily bounded; let I0 denote the corresponding
interpretation map. In a second step, we embed S into a superstructure
∗ S by a map ϕ such that I ′ = ϕ ◦ I0 is the desired interpretation map. This map
ϕ does not satisfy the transfer principle for all sentences but only for a certain
subclass (containing the transitively bounded sentences). In the construction of ϕ
the role of the internal sets will become evident.
To construct the abstract model S , we need some facts about ultrafilters.
4.1 Ultrafilters
Let J be some set. Probably, the reader has already heard the notion that a
property holds “almost everywhere” on J: By this, one means that the set of all
points j ∈ J with this property is “large” in a certain specified sense. For example,
one may mean that the complement of this set is finite (if J is infinite); if J is a
measure space, one can also mean that the complement of this set is a null set.
(Recall Exercises 3 and 4).
If we want to introduce a general definition of the term “almost everywhere”
which contains the two cases above, we should fix a family F of subsets of J
and say that a property holds almost everywhere if the set of all points j ∈ J
with this property is an element of F . It is natural to require that for any set
A ∈ F , F should also contain all sets which are larger than A. Moreover, we
should require that if a property P1 holds almost everywhere and a property P2
also holds almost everywhere, then P1 ∧ P2 also holds almost everywhere. To avoid
trivial cases, we also require that if a property holds nowhere, then it does not
hold almost everywhere. These requirements may be formulated in terms of F as
follows:
Definition 4.1. A set F of subsets of J is called a filter on J, if it has the following
properties:
1. If A ∈ F and A ⊆ B ⊆ J, then B ∈ F .
2. If A, B ∈ F , then A ∩ B ∈ F .
3. ∅ ∉ F .
We say that a property holds almost everywhere on J (with respect to F ), if the
set of all j ∈ J with this property belongs to F ; we briefly say that this property
holds for almost every j ∈ J.
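The three properties of Definition 4.1 can be checked mechanically when J is finite. The following is only a minimal sketch under that assumption; the helper names `subsets` and `is_filter` are ours and not part of the text, and the interesting filters of this section (on infinite J) are of course out of reach of such a brute-force test.

```python
from itertools import combinations

def subsets(J):
    """Return all subsets of the finite set J as frozensets."""
    items = sorted(J)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, J):
    """Check properties 1-3 of Definition 4.1 for a family F of subsets of J."""
    J = frozenset(J)
    F = {frozenset(A) for A in F}
    if not all(A <= J for A in F):
        return False  # F must consist of subsets of J
    upward = all(B in F for A in F for B in subsets(J) if A <= B)  # property 1
    meets = all((A & B) in F for A in F for B in F)                # property 2
    proper = frozenset() not in F                                  # property 3
    return upward and meets and proper

J = {0, 1, 2}
F_good = [A for A in subsets(J) if {0, 1} <= A]  # all supersets of {0, 1}
F_bad = [frozenset({0}), frozenset({0, 1})]      # not closed under supersets
```

Here `F_good` passes the test, while `F_bad` fails property 1 and the full powerset of J fails property 3.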
The examples mentioned above come from the following filters:
Example 4.2. Let J be infinite, and F be the system of all sets of the form J \ J0
where J0 ⊆ J is finite. Then F is a filter.
Example 4.3. Let J be a measure space (mes J ≠ 0), and F be the system of all
sets of the form J \ N where N is a subset of some set of measure 0. Then F is a
filter.
Recall that not every subset of a set of measure 0 need be measurable; but if
we define a null set as an arbitrary subset of a set of measure 0, we get a filter in
the sense of Definition 4.1. In this sense, we can say that the complements of sets
of measure 0 generate the filter from Example 4.3:
Definition 4.4. A system 𝓑 of sets has the finite intersection property, if for any
finitely many sets of 𝓑 the intersection is nonempty. If 𝓑 is a system of subsets
of J with the finite intersection property, then the filter generated by 𝓑 is the
system F of all subsets B ⊆ J for which there exist finitely many A1 , . . . , An ∈ 𝓑
with B ⊇ A1 ∩ · · · ∩ An .
The proof of the following observation is straightforward and left to the
reader.
Proposition 4.5. Let F be the filter generated by 𝓑. Then F is a filter in the
sense of Definition 4.1. Moreover, F is the smallest filter which contains all sets
from 𝓑.
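For a finite set J and a finite generating system, Definition 4.4 can be turned directly into a computation. The sketch below assumes exactly this finite setting; the function names (`finite_intersections`, `has_fip`, `generated_filter`) are hypothetical helpers introduced for illustration.

```python
from itertools import combinations

def subsets(J):
    """Return all subsets of the finite set J as frozensets."""
    items = sorted(J)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def finite_intersections(system):
    """All intersections A1 & ... & An of finitely many members (n >= 1)."""
    system = [frozenset(A) for A in system]
    return {frozenset.intersection(*fam)
            for r in range(1, len(system) + 1)
            for fam in combinations(system, r)}

def has_fip(system):
    """Finite intersection property: no finite intersection is empty."""
    return all(core for core in finite_intersections(system))

def generated_filter(system, J):
    """Filter generated by the system: all supersets of finite intersections."""
    if not system or not has_fip(system):
        raise ValueError("need a nonempty system with the finite intersection property")
    cores = finite_intersections(system)
    return {C for C in subsets(J) if any(core <= C for core in cores)}

J = {0, 1, 2, 3}
system = [{0, 1, 2}, {1, 2, 3}]
F = generated_filter(system, J)
```

In this example F consists of the four supersets of {1, 2}, which is the smallest filter containing both generating sets.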
If we can say that a property holds almost everywhere, it also makes sense to
say that a property holds almost nowhere if this property fails almost everywhere.
Since a property is either true or false, we have that a property holds almost
nowhere on J (with respect to F ), if the complement of the set of all points
j ∈ J with this property belongs to F . It is a natural question whether there
exists a filter F such that each property holds either almost everywhere or almost
nowhere. The filters with this property are called ultrafilters:
Definition 4.6. A filter U on J is an ultrafilter if for any A ⊆ J with A ∉ U , we
have J \ A ∈ U .
Note that A ∈ U implies J \ A ∉ U , since the intersection of these sets is
empty and thus cannot be contained in the filter U .
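On a small finite set one can simply enumerate all families of subsets and test Definition 4.6. The following sketch does this for a three-element set (an assumption made purely so that brute force is feasible; all function names are ours). It illustrates that on a finite set every ultrafilter consists of the sets containing one fixed point.

```python
from itertools import combinations

def subsets(J):
    """Return all subsets of the finite set J as frozensets."""
    items = sorted(J)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, J):
    """Properties 1-3 of Definition 4.1 for a set F of frozensets."""
    return (frozenset() not in F
            and all(B in F for A in F for B in subsets(J) if A <= B)
            and all((A & B) in F for A in F for B in F))

def is_ultrafilter(U, J):
    """Definition 4.6: for each A, either A or its complement belongs to U."""
    J = frozenset(J)
    return is_filter(U, J) and all(A in U or (J - A) in U for A in subsets(J))

def principal(j0, J):
    """The system of all subsets of J containing the point j0."""
    return {A for A in subsets(J) if j0 in A}

J = frozenset({0, 1, 2})
all_subsets = subsets(J)
# Enumerate all 2^8 families of subsets of J and keep the ultrafilters.
ultrafilters = [set(fam)
                for r in range(len(all_subsets) + 1)
                for fam in combinations(all_subsets, r)
                if is_ultrafilter(set(fam), J)]
```

The search finds exactly three ultrafilters, one for each point of J.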
Proposition 4.7. A filter U on J is an ultrafilter, if and only if it is not contained
in a strictly larger filter on J.
Proof. Assume that U is an ultrafilter. If U is contained in a strictly larger filter
F , we find some A ∈ F which does not belong to U . By assumption, the set
B = J \ A belongs to U and thus to F . Since F is a filter, we must have
A ∩ B ∈ F . But this means ∅ ∈ F , a contradiction.
Conversely, suppose that U is not contained in a strictly larger filter. If U
is not an ultrafilter, we find some A ⊆ J such that A ∉ U and J \ A ∉ U . Then
the system 𝓑 = U ∪ {A} has the finite intersection property: To see this, it suffices to
prove that A ∩ B ≠ ∅ for any B ∈ U , because U is a filter. But if A ∩ B = ∅ for
some B ∈ U , we have B ⊆ J \ A, and so J \ A ∈ U , contradicting our assumption
on A. Hence, 𝓑 has the finite intersection property and thus generates a filter
which is strictly larger than U , contradicting our hypothesis on U .
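The enlargement step of this proof can be traced on a finite example. The sketch below assumes J = {0, 1, 2} and takes for U the filter of all supersets of {0, 1}, which is not an ultrafilter; the helper `generated_filter` is a hypothetical implementation of Definition 4.4 for finite systems.

```python
from itertools import combinations

def subsets(J):
    """Return all subsets of the finite set J as frozensets."""
    items = sorted(J)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def generated_filter(system, J):
    """Filter generated by a finite system with the finite intersection property."""
    system = [frozenset(A) for A in system]
    cores = {frozenset.intersection(*fam)
             for r in range(1, len(system) + 1)
             for fam in combinations(system, r)}
    if not cores or frozenset() in cores:
        raise ValueError("system lacks the finite intersection property")
    return {C for C in subsets(J) if any(core <= C for core in cores)}

J = frozenset({0, 1, 2})
U = {C for C in subsets(J) if {0, 1} <= C}   # a filter, but not an ultrafilter
A = frozenset({0, 2})                        # neither A nor J \ A lies in U
larger = generated_filter(list(U) + [A], J)  # filter generated by U together with A
```

The resulting filter contains A, properly contains U, and here turns out to be the system of all sets containing the point 0.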
A trivial example of an ultrafilter is the following: Fix some element j0 ∈ J,
and let U be the system of all subsets of J which contain the element j0 . Then U
is a filter which contains A ⊆ J if and only if j0 ∈ A; if A ∉ U , we have j0 ∈ J \ A,
and thus J \ A ∈ U . We want to exclude this example:
Definition 4.8. A filter F is called free, if ⋂F = ∅.
Exercise 10. Prove that an ultrafilter U is free if and only if it does not have the
form described above, i.e. if and only if we do not have
U = {U ⊆ J : j0 ∈ U }
for some j0 ∈ J.
It is not obvious whether there exist free ultrafilters. Consider, for example,
the filter F of all subsets of an infinite set J with finite complements. Then F is
free. Thus, any ultrafilter containing F is free. However, even for J = ℕ it is not
possible to describe an ultrafilter containing F explicitly. Nevertheless, the axiom of choice
implies that such an ultrafilter exists:
Theorem 4.9. Each filter F is contained in some ultrafilter.
The proof is a straightforward application of e.g. Zorn’s Lemma (or, alter-
natively, of Hausdorff’s maximality principle or of the well-ordering theorem) and
is left to the reader. We remark that, in contrast to Zorn’s lemma, the statement
of Theorem 4.9 is not equivalent to the axiom of choice [Pin73]. In other words, if
we take Theorem 4.9 as an axiom, this axiom is strictly less restrictive than the
axiom of choice (usually, this axiom is called the maximal ideal theorem because it
is equivalent to the fact that on any Boolean algebra there exists a maximal ideal
[Sik64, Lux69c]).
Now we can prove that over each infinite set J there exists a free ultrafilter
(using the axiom of choice):
Exercise 11. Prove that for an ultrafilter U over an infinite set J the following
statements are equivalent:
1. U is free.
2. U contains the filter of Example 4.2 (observe that by Theorem 4.9 such
ultrafilters exist for each infinite set J).
Actually, we do not really need free ultrafilters but δ-incomplete ultrafilters:
Definition 4.10. A filter F is called δ-incomplete if there is a countable subset
F0 ⊆ F with ⋂F0 ∉ F .
Note that for a filter F , the intersection of finite subsets of F belongs to
F , so that F0 must actually be infinite.
Exercise 11 (and Theorem 4.9) imply that at least over countable sets J there
exist δ-incomplete ultrafilters:
Proposition 4.11. If F is a free filter on a countable set J, then F is δ-incomplete.

Proof. Let J = {j1 , j2 , . . .}. Since ⋂F = ∅, we find for each n some set Fn ∈ F
with jn ∉ Fn . Hence, ⋂_n Fn = ∅ ∉ F .
The importance of δ-incomplete filters lies in the following fact:
Proposition 4.12. If a filter F on J is δ-incomplete then there is a partition of J
into countably many pairwise disjoint sets J0 , J1 , . . . such that none of these sets
belongs to F .
Conversely, if U is an ultrafilter for which such a partition exists, then U
is δ-incomplete and free.
Proof. If F is δ-incomplete, there exist countably many A1 , A2 , . . . ∈ F such that
J0 := ⋂_n An ∉ F . Then we define by induction Jn := J \ (J0 ∪ · · · ∪ Jn−1 ∪ An )
(n = 1, 2, . . .). By construction, we have Jn ∩ Jk = ∅ for k < n, and
J \ Jn = J0 ∪ · · · ∪ Jn−1 ∪ An . Consequently, a point which belongs to none of the
sets J0 , J1 , . . . would have to belong to every An and thus to ⋂_n An = J0 , which is
impossible. Hence, J \ ⋃_{n≥1} Jn = ⋂_{n≥1} (J \ Jn ) ⊆ J0 , and so J0 , J1 , . . . is indeed
a partition of J. Since the set J0 ∪ · · · ∪ Jn−1 ∪ An belongs to F (because F is a
filter which contains An ), the complement Jn does not belong to F .
Conversely, if J0 , J1 , . . . is a partition of J into (at most) countably many
pairwise disjoint sets with Jn ∉ U , then J \ Jn ∈ U , because U is an ultrafilter.
Hence, U0 = {J \ Jn : n = 0, 1, . . .} is a countable subset of U with ⋂U0 = ∅ ∉ U .
Hence, U is δ-incomplete. Since ⋂U ⊆ ⋂U0 = ∅, U is free.
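The set algebra in the first half of this proof can be replayed on a finite truncation. The sketch below is an illustration only: it uses finitely many sets A1, ..., Am in a finite J, so the filter-theoretic part (that no Jn belongs to F) has no meaningful analogue here, and the function name `partition_from_sets` is ours. What it does show is that the recursively defined sets J0, J1, ... are pairwise disjoint and exhaust J.

```python
def partition_from_sets(J, A):
    """Return [J0, J1, ..., Jm] with J0 the intersection of all A_n and
    J_n = J minus (J0 | ... | J_{n-1} | A_n), as in the proof of Proposition 4.12."""
    J = frozenset(J)
    A = [frozenset(a) for a in A]
    J0 = frozenset.intersection(*A)
    parts = [J0]
    covered = set(J0)            # J0 | ... | J_{n-1} so far
    for An in A:
        Jn = J - (covered | An)
        parts.append(Jn)
        covered |= Jn
    return parts

J = range(6)
A = [{1, 2, 3, 4, 5}, {0, 2, 3, 4, 5}, {0, 1, 3, 4, 5}]
parts = partition_from_sets(J, A)
```

For this input the parts are {3, 4, 5}, {0}, {1}, {2}; some parts may be empty for other inputs, which does no harm in the argument.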
Corollary 4.13. Any δ-incomplete ultrafilter is free.
On any uncountable set J, there is a free filter which fails to be δ-incomplete,
namely the filter of all sets with countable complements. However, it is not clear
whether any free ultrafilter must be δ-incomplete (i.e. whether the converse of
Corollary 4.13 holds). This question is known as “Ulam’s measure problem”. It
is consistent with the axioms of ZF set theory to assume that any free ultrafilter is
δ-incomplete, but it is not provable that it is consistent to assume the contrary:
If there actually should exist a free ultrafilter on J which fails to be δ-incomplete,
then J must have an extremely large cardinality, namely the cardinality of at
least a so-called measurable cardinal which in turn has the cardinality of at least a
so-called inaccessible cardinal. It is not provable that inaccessible cardinals exist.
Moreover, it is not even provable that it is consistent (with the axioms of ZF set
theory) to assume that such cardinals exist (see [Jec97]).
4.2 Ultrapowers
Let S be a superstructure, and L a language with a surjective interpretation map
I onto S.
Let J be an infinite set, and U an ultrafilter on J. We define now an abstract
model S of the language L , the so-called ultrapower of S modulo U .
We start with the set S^J of all functions f : J → S. On this set, we introduce
a natural equivalence relation: We call two such functions f, g equivalent, if f (j) =
g(j) for almost all j ∈ J, i.e. if Jf,g = {j ∈ J : f (j) = g(j)} ∈ U . This is indeed an
equivalence relation: Reflexivity and symmetry are trivial. For the transitivity, if f is
equivalent to g and g is equivalent to h, then Jf,g , Jg,h ∈ U which implies Jf,g ∩ Jg,h ∈ U .
Since Jf,h ⊇ Jf,g ∩ Jg,h , this implies Jf,h ∈ U .
Let S be the set of equivalence classes of S^J with respect to this equivalence
relation. The interpretation map I0 : S → S is simply the map which associates
to each constant whose interpretation under I is c ∈ S the class which contains
the constant function fc , defined by fc (j) = c (j ∈ J). To speak of an abstract
model we have to equip S with two relations ∈U and =U . These are defined as
follows:
Denote the class containing a function f : J → S by [f ]. Then we define
[f ] ∈U [g] ⇐⇒ f (j) ∈ g(j) for almost all j ∈ J,
and
[f ] =U [g] ⇐⇒ f (j) = g(j) for almost all j ∈ J.
We have to prove that this definition is independent of the particular choice of

the representing elements f, g: Concerning =U , this is trivial, since =U is just
the usual equality relation of equivalence classes. Concerning ∈U , assume that
[f1 ] = [f2 ] and [g1 ] = [g2 ] and f1 (j) ∈ g1 (j) for almost all j. We have to prove
that f2 (j) ∈ g2 (j) for almost all j. By assumption, the sets Jf1 ,f2 , Jg1 ,g2 , and
{j ∈ J : f1 (j) ∈ g1 (j)} belong to U , and so also their intersection. This intersec-
tion is contained in the set {j ∈ J : f2 (j) ∈ g2 (j)} which thus also belongs to U .
Hence, f2 (j) ∈ g2 (j) for almost all j, as claimed.
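The definitions of =U and ∈U only use the filter properties, so they can be tried out on a finite index set with an explicit filter. The following sketch assumes J = {0, 1, 2} and the filter of all supersets of {0, 1}; functions J → S are modelled as dictionaries, sets as frozensets and atoms as integers. All names are hypothetical, and the example is a reduced power rather than a genuinely nonstandard ultrapower.

```python
def equal_mod(f, g, J, F):
    """[f] =U [g]: f and g agree on a set of indices belonging to F."""
    return frozenset(j for j in J if f[j] == g[j]) in F

def member_mod(f, g, J, F):
    """[f] ∈U [g]: f(j) is an element of g(j) on a set of indices belonging to F."""
    return frozenset(j for j in J
                     if isinstance(g[j], frozenset) and f[j] in g[j]) in F

J = (0, 1, 2)
F = {frozenset({0, 1}), frozenset({0, 1, 2})}   # all supersets of {0, 1}

f1 = {0: 1, 1: 2, 2: 3}
f2 = {0: 1, 1: 2, 2: 99}                        # differs from f1 only at j = 2
g1 = {0: frozenset({1}), 1: frozenset({2, 5}), 2: frozenset()}
g2 = {0: frozenset({1}), 1: frozenset({2, 5}), 2: frozenset({7})}
h = {0: 1, 1: 0, 2: 3}                          # agrees with f1 only on {0, 2}
```

Since f1, f2 and g1, g2 are pairwise equivalent, the membership test gives the same answer for both pairs of representatives, as the independence argument above predicts; h is not equivalent to f1.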
So far, we have only used the fact that U is a filter (if U is not necessarily
an ultrafilter, one calls S also the reduced power of S). But the fact that U is an
ultrafilter is needed for the following important theorem:
Theorem 4.14 (Łoś and Luxemburg). A sentence in L is true under the interpre-
tation map I if and only if it is true under the interpretation map I0 .
Proof. Let α be a formula in L (not necessarily a sentence!). Let x1 , . . . , xn be
all free variables of α (n = 0 is not excluded). For fi ∈ S^J , let I0 α([f1 ], . . . , [fn ])
denote the formula where all free occurrences of xi are replaced by [fi ] (i =
1, . . . , n), all constants are replaced by their image under the interpretation map
I0 , and the symbols ∈ and = are replaced by ∈U and =U , respectively. Similarly,
let I α(f1 (j), . . . , fn (j)) denote the formula where all free occurrences of xi are
replaced by fi (j), and all constants are replaced by their image under the canonical
interpretation map I. We will show that
I α(f1 (j), . . . , fn (j)) is true for almost all j ⇐⇒ I0 α([f1 ], . . . , [fn ]) is true. (4.1)
Then the statement follows from the special case n = 0: If α is a sentence, then
α is true under the canonical interpretation map I if and only if I α holds (for all
j, since it is independent of j). By (4.1), this is equivalent to the fact that I0 α is
true which means that α is true under the interpretation map I0 .
Let us now prove (4.1). By the usual logical transformations, we may assume
that the only logical connectives used in α are ¬ and ∧. Moreover, replacing the
formula ∀x : β by the equivalent formula ¬∃x : ¬β, we may assume that ∃ is
the only quantifier used in α. We now prove the statement by induction over the
structure of the formula α (i.e. on the number of the symbols ¬, ∧, and ∃). For the
induction base, we only have to consider the elementary formulas x = y and
x ∈ y where x and y are either free variables or constants. In this case, (4.1) follows
immediately from the definition of the relations =U and ∈U . For the induction
step, we have to consider three cases:
1. α has the form ∃x : β(x, x1 , . . . , xn ): (Note that even if α is a sentence, β(x)
might be a formula with a free variable: For this reason, even though we are
actually only interested in (4.1) for sentences (n = 0), we have to consider more
general formulas with free variables for our induction proof).
Assume that I α(f1 (j), . . . , fn (j)) is true for almost all j. Then we find for
almost all j some f0 (j) such that I β(f0 (j), f1 (j), . . . , fn (j)) is true. We may consider
f0 as a function f0 : J → S (axiom of choice!), choosing f0 (j) arbitrarily for the
remaining j. By induction assumption, this implies that
I0 β([f0 ], [f1 ], . . . , [fn ]) is true, and so I0 α([f1 ], . . . , [fn ]) is true.
Conversely, if I0 α([f1 ], . . . , [fn ]) is true, then there is some [f0 ] ∈ S such
that I0 β([f0 ], [f1 ], . . . , [fn ]) is true. By induction assumption, this implies that
I β(f0 (j), f1 (j), . . . , fn (j)) is true for almost all j. For all these j, we thus have
that I α(f1 (j), . . . , fn (j)) is true.
2. α has the form ¬β(x1 , . . . , xn ):
If I α(f1 (j), . . . , fn (j)) is true for almost all j, then I β(f1 (j), . . . , fn (j))
is false for almost all j. In particular, we do not have for almost all j
that I β(f1 (j), . . . , fn (j)) is true. By induction assumption, this means that
I0 β([f1 ], . . . , [fn ]) is not true, i.e. I0 α([f1 ], . . . , [fn ]) is true.
Conversely, assume that I0 α([f1 ], . . . , [fn ]) is true. Then I0 β([f1 ], . . . , [fn ])
is false, and by induction assumption it is not the case that, for almost all
j, I β(f1 (j), . . . , fn (j)) is true. Since U is an ultrafilter, we may conclude that
I β(f1 (j), . . . , fn (j)) is true for almost no j, i.e. I β(f1 (j), . . . , fn (j)) is false for
almost all j. Hence, I α(f1 (j), . . . , fn (j)) is true for almost all j.
3. α has the form β1 ∧ β2 :
I α(f1 (j), . . . , fn (j)) is true for almost all j if and only if I βi (f1 (j), . . . , fn (j))
(i = 1, 2) are both true for almost all j (because A, B ∈ U implies A ∩ B ∈ U ).
This is the case if and only if I0 βi ([f1 ], . . . , [fn ]) (i = 1, 2) are both true, i.e. if and
only if I0 α([f1 ], . . . , [fn ]) is true.
We note that we used the axiom of choice in the previous proof to find the
function f0 ; however, if e.g. J is countable, only a countable form of the axiom of
choice is needed.
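The negation step is the only place in the proof where the ultrafilter property is used. This can be seen on a toy example: the sketch below (finite index set, hypothetical names, and a principal ultrafilter since no free one can be written down) evaluates a property coordinatewise and tests its truth set against a family of subsets. With a mere filter, both a property and its negation may fail to hold almost everywhere; with an ultrafilter exactly one of them holds.

```python
def truth_set(J, predicate):
    """Set of all indices j for which the property holds."""
    return frozenset(j for j in J if predicate(j))

def holds_almost_everywhere(J, predicate, family):
    """The property holds for almost all j with respect to the given family."""
    return truth_set(J, predicate) in family

J = frozenset({0, 1, 2})
all_subsets = [frozenset(s) for s in
               ([], [0], [1], [2], [0, 1], [0, 2], [1, 2], [0, 1, 2])]
F = {A for A in all_subsets if {0, 1} <= A}   # a filter which is not an ultrafilter
U = {A for A in all_subsets if 0 in A}        # the ultrafilter of all sets containing 0

f = {0: "a", 1: "b", 2: "a"}

def beta(j):
    """The property f(j) = "a"."""
    return f[j] == "a"

def not_beta(j):
    """The negated property."""
    return not beta(j)
```

With respect to F neither `beta` nor `not_beta` holds almost everywhere, so the equivalence (4.1) would break down for ¬β; with respect to U exactly one of the two holds.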
We call the interpretation map I0 nonstandard , if for any constant A whose
interpretation under I is an infinite entity I A, the sets A∗ := {c ∈ S : c ∈U I0 A}
and Aσ := {I0 a : a ∈ A is true} differ.
Theorem 4.15 (Luxemburg). If the ultrafilter U is δ-incomplete, then the above
defined interpretation map I0 is nonstandard. More precisely, we have for any
constant A with infinite I A that A∗ ⊋ Aσ .
Conversely, if there is a constant A with countably infinite I A such that
A∗ ≠ Aσ , then U is δ-incomplete.
Proof. Note that A∗ consists of the equivalence classes of the functions f : J → S
such that f (j) ∈ I A for almost all j, and Aσ consists of the equivalence classes
of the constant functions f : J → S with values in I A. Hence, Aσ ⊆ A∗ , and we
have A∗ ≠ Aσ if and only if there is a function f : J → I A which does not belong
to the equivalence class of a constant function.
If U is δ-incomplete, we find a partition J1 , J2 , . . . of J such that no Jn
belongs to U . If A is a constant with infinite I A, we find a sequence of pairwise
distinct elements a1 , a2 , . . . ∈ I A. Putting f (j) := an for j ∈ Jn , the function
f is not almost everywhere equal to a constant function, since Jn ∉ U . Hence,
[f ] ∈ A∗ \ Aσ , and the interpretation map I0 is nonstandard.
Conversely, assume there is a countably infinite I A and a function f : J → I A
which does not belong to the equivalence class of a constant function. Let an be
an enumeration of the elements of the image, and Jn be the preimage of an . Then
Jn is an (at most) countable partition of J into pairwise disjoint sets which do
not belong to U : If Jn ∈ U , then f (j) = an for almost all j, a contradiction.
Proposition 4.12 thus implies that U is δ-incomplete.
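As a concrete instance of the first half of this proof, one may take the simplest possible data. The following is only an illustration under assumptions of ours: J = ℕ, U a free ultrafilter on ℕ (so that U contains no finite set, see Exercise 11), a constant A with I A = ℕ, the partition Jn = {n} and the elements an = n.

```latex
% Illustrative assumptions: J = \mathbb{N}, \mathcal{U} a free ultrafilter on J,
% {}^{I}\!A = \mathbb{N}, J_n = \{n\}, a_n = n.
\[
  f(j) := j \quad (j \in J), \qquad
  \{\, j \in J : f(j) = c \,\} = \{c\} \notin \mathcal{U}
  \quad \text{for every } c \in \mathbb{N},
\]
\[
  \{\, j \in J : f(j) \in \mathbb{N} \,\} = J \in \mathcal{U}
  \quad\Longrightarrow\quad
  [f] \in A^{*} \setminus A^{\sigma}.
\]
```

In other words, the class of the identity function is an element of A∗ which is not the class of any constant function.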
4.3 Embedding in a Superstructure

So far, we were only able to build an abstract nonstandard model, i.e. the relations
∈ and = are not interpreted in the usual set-theoretic sense but instead by the
relations ∈U and =U . But to have a nonstandard embedding in the sense of
Definition 3.2, we want the interpretation of ∈ and = in the set-theoretic sense,
and in fact we want an interpretation map I ′ with values in a superstructure ∗ S
(and not just in an abstract set S of equivalence classes).
As long as we restrict our attention to transitively bounded sentences, we may
embed the abstract model S defined in Section 4.2 into a superstructure. More
precisely, we are going to define a set ∗ S and a map ϕ of a subset SI ⊆ S into ∗ S
such that any transitively bounded sentence which is true under the interpretation
map I is also true under the interpretation map I ′ = ϕ ◦ I0 . Moreover, SI is
sufficiently large so that ∗ S will be a nonstandard model if S was a nonstandard
model.
Let Sn be the levels of the superstructure S as defined in Section 2.1. We
consider the sets
In := {[f ] | f : J → Sn }
and note that S0 ⊆ S1 ⊆ · · · implies I0 ⊆ I1 ⊆ · · · ; put I ′ := ⋃_n In .
Lemma 4.16. We have
In = {[f ] ∈ S : There is some [g] ∈ In+1 with [f ] ∈U [g]}.
Proof. If [f ] ∈ In is given, consider the constant function g : J → Sn+1 , defined

by g(j) := Sn . Then [g] ∈ In+1 , and [f ] ∈U [g].
Conversely, if [g] ∈ In+1 , and [f ] ∈U [g], we have f (j) ∈ g(j) ∈ Sn+1 for
almost all j. Since Sn+1 = S0 ∪ P(Sn ), this implies f (j) ∈ Sn for almost all j, and
so we may assume f : J → Sn . Hence, [f ] ∈ In .
We now let ∗ S := I0 be the set of atoms of our new superstructure ∗ S.
Let Tn denote the level sets of that superstructure, i.e. in particular T0 = ∗ S.
The function ϕ that we are looking for is an injection ϕ : I ′ → ∗ S such that
ϕ([g]) = [g] if [g] ∈ I0 = ∗ S,
ϕ([g]) = {ϕ([f ]) : [f ] ∈U [g]} if [g] ∉ I0 . (4.2)
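The recursion (4.2) replaces an abstract membership relation by genuine set membership. On a finite toy relation this can be carried out literally. The sketch below is an illustration only: the dictionary `members` is a hypothetical stand-in for the relation ∈U (it lists, for each non-atom, the nodes related to it), and the recursion terminates because this toy relation has no cycles, just as the levels I0 ⊆ I1 ⊆ · · · guarantee in the text.

```python
def collapse(node, members, atoms, cache=None):
    """Return node itself if it is an atom, and otherwise the set of the
    collapsed nodes which are 'abstract elements' of node, as in (4.2)."""
    if cache is None:
        cache = {}
    if node in cache:
        return cache[node]
    if node in atoms:
        result = node
    else:
        result = frozenset(collapse(m, members, atoms, cache)
                           for m in members.get(node, ()))
    cache[node] = result
    return result

atoms = {"a", "b"}
members = {"x": ["a", "b"],   # x has the abstract elements a and b
           "y": ["a"],
           "z": ["x", "y"]}
image_of_z = collapse("z", members, atoms)
```

Here the node z is mapped to the honest set {{a, b}, {a}}. Two distinct non-atoms with the same abstract elements would be mapped to the same set, which is why an extensionality statement such as Lemma 4.17 is needed for injectivity.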
It turns out that the range of ϕ consists precisely of the internal sets. To prove that
the function we are going to construct is actually injective, we need the following
lemma:
Lemma 4.17. If elements [f ], [g] ∈ S \ I0 satisfy
{[h] ∈ S : [h] ∈U [f ]} = {[h] ∈ S : [h] ∈U [g]}, (4.3)
then [f ] = [g].
Proof. Let k : J → S be the constant function k(j) = S. The relation [f ] ∉ I0
means that f (j) ∈ S0 = k(j) does not hold almost everywhere, i.e. [f ] ∈U [k] is
not true. Hence, we have [f ] ∉U [k]. Analogously, [g] ∉U [k].
Note now that the sentence (in the language L )
∀x, y ∉ S : ((∀z : (z ∈ x ⇐⇒ z ∈ y)) =⇒ x = y)
is true under the canonical interpretation map I. By the theorem of
Łoś/Luxemburg (Theorem 4.14), this sentence must also be true in the abstract
model S . (Recall that it is not required in Theorem 4.14 that the sentence
be transitively bounded). In the model S the above sentence reads
∀x, y ∉U [k] : ((∀z : (z ∈U x ⇐⇒ z ∈U y)) =⇒ x = y).
Putting x = [f ] and y = [g] in this sentence, the statement follows.

Exercise 12. Give a straightforward proof of Lemma 4.17, i.e. without applying
Theorem 4.14.
Now we are going to define by induction on n sets ∗ S n and functions ϕn
such that the following holds. (The idea is that ϕn is the restriction ϕ|In where
ϕ satisfies (4.2); however, we still have to prove that such a function ϕ actually
exists).
1. ϕn : In → ∗ S n is bijective.
2. ∗ S n ⊆ Tn .
3. We have

ϕn([g]) = [g] if [g] ∈ I0 = ∗S,
ϕn([g]) = {ϕn([f]) : [f] ∈U [g]} if [g] ∈ In \ I0. (4.4)
4. If n ≥ 1, we have ∗ S n−1 ⊆ ∗ S n , and moreover, ϕn−1 is the restriction of ϕn

to the set In−1 .
Then we may define the desired function ϕ by putting ϕ([g]) := ϕn ([g]) for [g] ∈
In . Note that ϕ must be injective, since each ϕn is injective and I0 ⊆ I1 ⊆ · · · .
We stress that the inclusion ∗ S n ⊆ Tn is in general strict (which is a deeper
reason for the existence of external sets).
For n = 0, we observe that I0 = ∗ S = T0 , and so we may put ∗ S 0 := ∗ S,
and ϕ0 ([g]) := [g] for [g] ∈ I0 .
Assume now by induction hypothesis that the set ∗Sn and the map ϕn :
In → ∗Sn with the additional properties have already been defined. Then we let
∗Sn+1 consist of all elements of ∗S and furthermore of all elements A ∈ P(∗Sn)
with the following property: There is some function [g] ∈ In+1 \ I0 such that
A = {x ∈ ∗Sn : ϕn^{−1}(x) ∈U [g]}. In this case, we put ϕn+1([g]) := A. For
[g] ∈ I0 = ∗S, we put ϕn+1([g]) := [g].

By construction, ∗ S n+1 ⊆ ∗ S ∪ P(∗ S n ), which in view of the induction as-
sumption ∗ S n ⊆ Tn implies ∗ S n+1 ⊆ T0 ∪ P(Tn ) = Tn+1 . Since by induction
assumption (4.4), ∗ S n−1 ⊆ ∗ S n , and since any [g] ∈ In belongs to In+1 , our
construction implies that ϕn is indeed a restriction of ϕn+1 and thus that (4.4)
holds for n + 1.
It is also clear from the construction that ϕn+1 is onto. It remains to prove
that ϕn+1 is one-to-one. Assume that ϕn+1 ([f ]) = ϕn+1 ([g]). We prove that this
implies [f ] = [g] by distinguishing three cases: If [f ] ∈ I0 , then ϕn+1 ([f ]) = [f ] ∈
I0 ; hence ϕn+1 ([g]) = [f ] is an equivalence class (and not a subset of Tn ), which
by construction of ϕn+1 is only possible if [g] ∈ I0 in which case we must have
[f ] = ϕn+1 ([g]) = [g]. In the case [g] ∈ I0 , it follows analogously that [f ] = [g]. In
the remaining case, we have [f], [g] ∉ I0. Then the construction of ϕn+1 implies
{x ∈ ∗Sn : ϕn^{−1}(x) ∈U [f]} = {x ∈ ∗Sn : ϕn^{−1}(x) ∈U [g]}.
Since ϕn is a bijection onto ∗ S n , we thus find
{[h] ∈ In : [h] ∈U [f ]} = {[h] ∈ In : [h] ∈U [g]}.
Lemma 4.16 now implies that (4.3) holds, which by Lemma 4.17 implies that [f] =
[g], as claimed.

The function ϕ can now be defined, and the range of ϕ is the set I := ⋃n ∗Sn.
Roughly speaking, it is now clear that ϕ preserves the truth of transitively
bounded sentences which deal with internal sets: It “preserves” the relations ∈
and = (for ∈ observe (4.2), and for = use Lemma 4.17). Moreover, this mapping
is onto, and thus only provides a “renaming” of the constants. A more rigorous
proof reads as follows:
Theorem 4.18. Let α be a transitively bounded formula in the language whose
constants are taken from I ′ , and let x1 , . . . , xn denote the free variables of α
(n = 0 is not excluded). For elements x1 , . . . , xn ∈ I ′ , let α0 denote the formula
in which every free occurrence of the variable xi is replaced by the constant xi
(i = 1, . . . , n). Then α0 is true under the interpretation map ϕ if and only if it
is true interpreted by the inclusion i into S .
In particular, a transitively bounded sentence with constants taken from I ′
is true under the interpretation map ϕ if and only if it is true under the inclusion
i into S .
Proof. Similarly as in the proof of Theorem 4.14, we may assume that the only
logical connectives used in α are ¬ and ∧ and that the only quantifier used is ∃
(in a transitively bounded form). The proof is by induction on the structure of the
formula α. For induction assumption, we have to consider the elementary formulas
x = y and x ∈ y where x and y are either free variables or constants. In general,
we have to distinguish the following cases:
1. α has the form x = y: If α0 is true under the interpretation map ϕ, then it is
also true under the interpretation map i, since ϕ is one-to-one. The converse is
trivial.
2. α has the form x ∈ y: Then the statement follows immediately from (4.2).
3. α has the form ¬β or β1 ∧ β2 : These cases are trivial, since only the constants
are exchanged.
4. α has the form ∃x ∈ y : β(x) where y is either a free variable or a constant.
Then α0 has the form ∃x ∈ y : β0 (x) where β0 is derived from β by replacing the
free occurrences of x1 , . . . , xn by x1 , . . . , xn , respectively.
If ϕ α0 is true, then there is some x in the set ϕ y such that ϕ β 0 (x) holds.
By 2., the element c := ϕ^{−1}(x) then satisfies c ∈U y, and by induction assumption
i β 0 (c) holds. Hence, i α0 is true.
Conversely, if i α0 is true, we find some x ∈ I ′ such that x ∈U y and i β 0 (x)
holds. By 2., we then have c := ϕ x ∈ ϕ y, and by induction assumption ϕ β 0 (c) is
true. Hence, ϕ α0 is true.
Theorem 4.18 is the reason why we can prove the transfer principle only for
transitively bounded sentences. If one needs the transfer principle for a particular
class of sentences which are not transitively bounded, one “just” has to check
whether Theorem 4.18 can be generalized to this class.
Proposition 4.19. Let I ′ := ϕ ◦ I0 . Then ∗ := I ′ ◦ I^{−1} : S → ∗S has the following
property: We have x ∈ ∗ A for some entity A ∈ S if and only if x = ϕ([f ]) for
some function f : J → A. Moreover, Sn is actually mapped into the set ∗ S n as
defined above.
Proof. Let some entity A ∈ S be given, and consider the constant function fA :
J → S, defined by fA (j) := A. Note that I0 ◦ I −1 maps A into [fA ]. If f : J → A,
then [f ]∈U [fA ], and so Theorem 4.18 (or also (4.2)) implies ϕ([f ]) ∈ ϕ([fA ]) = ∗ A.
Conversely, if x ∈ ∗ A, then x belongs to the range of ϕ, i.e. x = ϕ([f ]) for some
f : J → S. Since ϕ([f ]) ∈ ∗ A = ϕ([fA ]), Theorem 4.18 implies [f ] ∈U [fA ], i.e.
f (j) ∈ A for almost all j. By choosing a different representative if necessary, we
may assume that f (j) ∈ A even for all j, i.e. f : J → A.
For the second statement, apply what we just proved for A := Sn : We have
∗A = {ϕ([f]) | f : J → Sn} = {ϕ(x) : x ∈ In}.
But by construction, we have that ϕn : In → ∗ S n is onto where ϕn is the restric-

tion of ϕ to In . Hence, ∗ A = {y : y ∈ ∗ S n } = ∗ S n , as claimed.
Let us collect the main result of §4:
Theorem 4.20. Let I ′ := ϕ ◦ I0 . Then ∗ := I ′ ◦ I −1 : S → ∗S is an elementary
embedding which maps Sn into ∗ S n .
If the ultrafilter U is δ-incomplete, then ∗ is a nonstandard embedding. Con-
versely, if ∗ is a nonstandard embedding and S is infinite, then U is δ-incomplete.
Proof. By Proposition 4.19, S0 is mapped into ∗ S 0 = ∗ S = T0 , as required in
Definition 3.1.
If α is a transitively bounded sentence which is true under the interpretation
map I, then I0 α is true by Theorem 4.14. Interpreting this sentence by ϕ, we thus
get a true sentence by Theorem 4.18. But this interpreted sentence is just I ′ α.
Let the ultrafilter U be δ-incomplete. By Theorem 4.15, we have for each
constant A with infinite I A that the inclusion Aσ ⊆ A∗ is strict. For each such A,
we find some c ∈U I0 A which cannot be written in the form I0 a where the sentence
a ∈ A is true. Note that c belongs to I ′ , since I0 A is the equivalence class of
a constant mapping f : J → S. Similarly, all constants I0 a with a ∈ A belong
to I ′ . Thus each of the sentences c ≠ I0 a can be formulated in the language of
Theorem 4.18 and is true. Theorem 4.18 thus implies σA ≠ ∗A.

Conversely, if S is infinite and ∗ is a nonstandard embedding, let A be an
infinite countable subset of S. Then ∗A ≠ σA, i.e. we find some c ∈ ∗A which
cannot be written in the form ∗a with a ∈ A. Since c is an internal element, it
occurs in the image of ϕ. With a similar argument as above, Theorem 4.18 now
implies Aσ ≠ A∗ , and Theorem 4.15 shows that the ultrafilter U is δ-incomplete.

Together with Proposition 4.19, we find a natural characterization of internal

sets in our model:
Corollary 4.21. With the above notation, we have: An element x is internal if and
only if it arises from a map f : J → A with an entity A ∈ S in the sense that
x = ϕ([f ]). Moreover, x = ∗ a is standard if and only if f can be chosen constant
f (j) = a.
Example 4.22. For the map ∗ from our ultrapower model (Theorem 4.20), one has
a simple interpretation for standard relations ∗ Φ where Φ ⊆ X1 × · · · × Xn :
Note that ∗Φ = ϕ([fΦ]) where fΦ : J → S denotes the constant function
fΦ(j) := Φ. In view of Proposition 4.19, each element xk ∈ ∗Xk has a representation
xk = ϕ([fk]) where fk : J → Xk. We claim that
(x1, . . . , xn) ∈ ∗Φ ⇐⇒ (f1(j), . . . , fn(j)) ∈ Φ for almost all j.
Indeed, Theorem 4.18 implies that (x1, . . . , xn) ∈ ∗Φ is true if and only if the
corresponding formalized sentence is true if interpreted in the abstract model S .
But this just means that (f1(j), . . . , fn(j)) ∈ fΦ(j) for almost all j.
Example 4.23. Let us illustrate the importance of Example 4.22 for a standard
function ∗ f when f : X → Y :
If x = ϕ([fx ]) with fx : J → X, then ∗ f (x) = ϕ([fy ]) where fy : J → Y is
given by fy (j) = f (fx (j)). Thus, the extension of a function f to a function ∗ f is
indeed defined in the canonical way announced in Section 1.1.
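The pointwise description of ∗f can be sketched in Python. This is only an illustration under assumptions that are not part of the text: J is taken to be the set of positive integers, a representative fx : J → X is modelled as a Python function, and the names `star` and `const` are hypothetical helpers. The sketch works with representatives only; the passage to the class [fy] and to ϕ([fy]) depends on the ultrafilter U and is not modelled.

```python
from fractions import Fraction

def star(f):
    """Lift f : X -> Y to representatives: fx : J -> X is sent to fy with fy(j) = f(fx(j))."""
    def lifted(fx):
        return lambda j: f(fx(j))
    return lifted

def const(a):
    """Constant representative j -> a; its class corresponds to the standard element *a."""
    return lambda j: a

def square(t):
    return t * t

# Representative of some x in *R, assuming J = {1, 2, 3, ...}.
fx = lambda j: Fraction(1, j)
# Representative of (*square)(x), computed index by index.
fy = star(square)(fx)

print([fy(j) for j in range(1, 5)])
print(star(square)(const(3))(10))
```

On a constant representative the lift is again constant, which matches the rule ∗f(∗a) = ∗(f(a)) for standard arguments.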
Although the model of Theorem 4.20 is rather “constructive”, the reader
should be aware that actually the ultrafilter U is a rather “unknown” element:
Except for very special cases it is impossible to decide from the representation
of two nonstandard elements x, y whether they satisfy e.g. a simple sentence like
x ∈ y or x = y:
Exercise 13. Consider the model of Theorem 4.20 with a countable infinite set J.
Let A ∈ S, and x, y ∈ ∗ A, i.e. x = ϕ([fx ]) and y = ϕ([fy ]) where fx , fy : J → A.
Give a necessary and sufficient condition on fx and fy such that the identity x = y
holds for any choice of a δ-incomplete ultrafilter U .
For our particular map ∗, we can prove a special case of Theorem 3.23 already.
(Actually, the model theoretic proof of Theorem 3.23 which can be found in [CK90]
reduces the general result to a variation of the following special case; our proof
in Section 7.1 will be rather different).
Exercise 14. Prove from the definition of the map ∗ in Theorem 4.20 that
1. ∗ A = σ A if A is a finite entity.
2. ∗A is uncountable if A is infinite and U is δ-incomplete (in particular,
∗A ≠ σA for countable A).
Hint: If f1 , f2 , . . . : J → A, construct a function f : J → A such that
f (j) = f1 (j) everywhere, f (j) = f2 (j) almost everywhere, f (j) = f3 (j) on a
smaller set but still almost everywhere, etc.
In particular, standard entities are either finite or uncountable if the mapping ∗
in Theorem 4.20 is a nonstandard embedding.
Chapter 3
Nonstandard Real Analysis
Throughout this chapter, let ∗ : S → ∗S be a nonstandard embedding, and ℝ ∈ S
be an entity.
§ 5 Hyperreal Numbers
5.1 Hyperreal and Hypernatural Numbers
Let ∗ℝ be the value of the ∗-transform of ℝ. The elements of ∗ℝ are called the
hyperreal numbers.
We first introduce the notation for the most important functions on ∗ℝ:
These are + : ℝ² → ℝ (defined in the usual way), and similarly subtraction,
multiplication, division, and exponentiation. These functions are mapped by ∗
into functions ∗(ℝ²) → ∗ℝ. Note that ∗(ℝ²) = (∗ℝ)², and so in particular, e.g.
∗+ : (∗ℝ)² → ∗ℝ. Instead of writing ∗+(a, b) for a, b ∈ ∗ℝ, we use the traditional
notation a ∗+ b. For the sake of convenience, we will later also drop the symbol ∗
in this connection and just write a + b. However, for the beginning and to avoid
confusion, we will keep this symbol.
Proposition 5.1. If a, b ∈ ℝ, then ∗(a + b) = ∗a ∗+ ∗b; similarly for multiplication,
division, and exponentiation.
Proof. If c denotes the constant a + b, then c = a + b is a true bounded sentence,
and so its ∗-transform ∗c = ∗a ∗+ ∗b is true by the transfer principle.
Thus, ∗ is an isomorphism of ℝ into σℝ. In particular:
Corollary 5.2. σℝ is a field.
However, the hyperreal numbers would not be useful if we had the field
property only for the copy σℝ of ℝ: We want to have the same rules even for
the larger set ∗ℝ. The essential point in the following result is that the statement
holds not only for elements of the form ∗a where a is from the standard universe,
but even for nonstandard elements:
Proposition 5.3. The set ∗ℝ equipped with the relations ∗+ and ∗· is a field. The
neutral element of addition and multiplication is ∗0 and ∗1, respectively. The
inverse element of a ∈ ∗ℝ for addition is ∗0 ∗− a, and the inverse element of
a ∈ ∗ℝ \ {∗0} for multiplication is ∗1 ∗/ a.
Proof. The commutative law for addition follows by applying the transfer principle
for the transitively bounded true sentence
∀x, y ∈ ℝ : x + y = y + x.
The commutative law for multiplication, and the associative and the distributive
laws are proved analogously. The transfer of the formula ∀x ∈ ℝ : x + 0 = x shows
that ∗0 is the neutral element of the addition ∗+, and the transfer of the formula
∀x ∈ ℝ : x + (0 − x) = 0
with the evident shortcuts implies
∀x ∈ ∗ℝ : x ∗+ (∗0 ∗− x) = ∗0
which means that ∗0 ∗− a is the inverse element of addition. The proof concerning
multiplication is similar, if one observes that ∗(ℝ \ {0}) = ∗ℝ \ ∗{0} = ∗ℝ \ {∗0}.

We note that the inverse element of a with respect to addition and
multiplication is usually denoted by −a resp. a⁻¹. One might thus define functions
f1(a) = −a and f2(a) = a⁻¹ on ℝ. The ∗-transform gives us hyperreal functions
∗fi : ∗ℝ → ∗ℝ. One may ask whether ∗f1(a) is the inverse element of a with
respect to addition even for all hyperreal numbers a ∈ ∗ℝ. The transfer of the
formula ∀x ∈ ℝ : x + f1(x) = 0 shows that this is indeed the case. Similarly,
∗f2(a) is the inverse of a with respect to multiplication for any a ∈ ∗ℝ \ {∗0}.
One might also interpret f2(a) as the result of the exponential function
e(a, b) := a^b with b := −1 and may ask whether the ∗-transform of e yields
the same function, i.e. whether ∗f2(a) = ∗e(a, ∗(−1)) for any hyperreal number
a ∈ ∗ℝ. This is indeed the case, as follows by the transfer principle from the
sentence
∀x ∈ ℝ \ {0} : f2(x) = e(x, −1)
in view of the fact that ∗(ℝ \ {0}) = ∗ℝ \ {∗0} (recall the proof of Proposition 5.3).
Thus the ∗-transform of a ↦ a⁻¹ yields always the same nonstandard function,
no matter how the symbol a⁻¹ is interpreted.
We hope that the reader already has the impression that all elementary
properties of ℝ carry over to ∗ℝ in the canonical way. The limitations of this
transfer will be made clear later.
Let us now also consider the order properties of ∗ℝ: The relation ≤ on ℝ is
described by a subset of ℝ², namely ≤ := {(a, b) : a ≤ b}. The ∗-transform of this
set is a relation on ∗ℝ. We write a ∗≤ b (or later more briefly a ≤ b), if the pair
(a, b) ∈ (∗ℝ)² belongs to ∗≤. Similarly, we define the meaning of symbols like ∗<
or ∗> for hyperreal numbers.
The transfer principle of course implies that the relation a ≤ b for elements
a, b of the standard universe gives ∗a ∗≤ ∗b. In particular, ∗≤ is a total order on
the standard copy σℝ of ℝ. However, ∗≤ is even a total order in the nonstandard
universe, as we shall show now. Observe that this does not follow from the above
argument, since σℝ is a strict subset of ∗ℝ.
We note that we could alternatively have defined ∗< on ∗ℝ by
a ∗< b ⇐⇒ (a ∗≤ b ∧ a ≠ b).
The following result implies that these two possible definitions coincide. An
analogous remark holds for ∗> and ∗≥.
Proposition 5.4. ∗≤ defines a total order on ∗ℝ. Moreover, for hyperreal numbers
a, b ∈ ∗ℝ the following holds true:
1. We have a ∗≤ b if and only if a ∗< b or a = b.
2. We have a ∗≥ b if and only if a ∗> b or a = b.
3. Precisely one of the three relations a ∗< b, a ∗> b, a = b holds.
Proof. Let α be the bounded sentence ∀x ∈ ℝ : x ≤ x. This sentence is true, and
so the transfer principle implies that ∀x ∈ ∗ℝ : x ∗≤ x, i.e. a ∗≤ a for any hyperreal
number a ∈ ∗ℝ. Similarly, the transfer of the sentence ∀x, y ∈ ℝ : ((x ≤ y ∧ y ≤
x) =⇒ x = y) shows that the relations a ∗≤ b and b ∗≤ a for hyperreal numbers
a, b imply a = b. The proof of the other properties is similar.
Let us now summarize:

Theorem 5.5. With the above notation, σℝ and ∗ℝ are ordered fields, and σℝ is
isomorphic to ℝ.
Proof. The statement for σℝ is trivial, since by the transfer principle, ∗ : ℝ → σℝ
is an isomorphism. Concerning ∗ℝ, we have proved already everything up to the
relations between arithmetic and order operations. But these follow by the transfer
principle from the sentences
∀x, y, z ∈ ℝ : x < y =⇒ (x + z < y + z)
and
∀x, y, z ∈ ℝ : (x < y ∧ z > 0) =⇒ (x · z < y · z)
where we used evident abbreviations (henceforth, we will use such shortcuts
without further mention).
We also use notation like ∗|·|, ∗max(·, ·) for the transfer of the functions with
their evident meanings.
All elementary formulas like ∗|a| = ∗max(a, ∗−a) (for hyperreal numbers
a ∈ ∗ℝ) follow immediately from the transfer principle and will be used henceforth
without further mention. Moreover, we will henceforth drop the symbol ∗ on such
simple functions and on simple constants, if no confusion arises. Thus, 0 may either
mean the element 0 in the set ℝ, or the element ∗0 in the set ∗ℝ (or in σℝ).
Example 5.6. For the map ∗ from our ultrapower model (Theorem 4.20), one has
a simple interpretation for the elementary operations: Recall that any element
x ∈ ∗ℝ has the form x = ϕ([f]) with a function f : J → ℝ (recall Proposition 4.19).
To simplify notation, we write fx for such an f. Note that if x ∈ σℝ, then x is
standard, and so fx may be chosen constant. We claim that
z = x + y ⇐⇒ fz(j) = fx(j) + fy(j) for almost all j,
z = x · y ⇐⇒ fz(j) = fx(j) · fy(j) for almost all j,
x ≤ y ⇐⇒ fx(j) ≤ fy(j) for almost all j.
To see this, recall that +, · and ≤ are just standard relations, and apply
Example 4.22.
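The pointwise rules of Example 5.6 can be tried out on a small decidable fragment. The following Python sketch rests on assumptions that go beyond the text: J is the set of positive integers, U is a free ultrafilter (so every cofinite subset of J holds "almost everywhere"), and only polynomial representatives f(j) = c0 + c1·j + … are considered. For such f the set {j : f(j) ≤ g(j)} is either finite or cofinite, so the comparison does not depend on which free ultrafilter is chosen. The class name `PolyRep` and its methods are hypothetical.

```python
from fractions import Fraction

class PolyRep:
    """Polynomial representative f(j) = c[0] + c[1]*j + ... of a hyperreal phi([f])."""

    def __init__(self, *coeffs):
        cs = [Fraction(c) for c in coeffs]
        while cs and cs[-1] == 0:      # drop trailing zeros; [] is the zero polynomial
            cs.pop()
        self.c = cs

    def __call__(self, j):
        return sum((c * j**k for k, c in enumerate(self.c)), Fraction(0))

    def __add__(self, other):          # f_z(j) = f_x(j) + f_y(j)
        n = max(len(self.c), len(other.c))
        a = self.c + [Fraction(0)] * (n - len(self.c))
        b = other.c + [Fraction(0)] * (n - len(other.c))
        return PolyRep(*(s + t for s, t in zip(a, b)))

    def __neg__(self):
        return PolyRep(*(-c for c in self.c))

    def __mul__(self, other):          # f_z(j) = f_x(j) * f_y(j)
        out = [Fraction(0)] * max(len(self.c) + len(other.c) - 1, 0)
        for i, s in enumerate(self.c):
            for k, t in enumerate(other.c):
                out[i + k] += s * t
        return PolyRep(*out)

    def ae_equal(self, other):
        # Distinct polynomials agree at only finitely many j.
        return self.c == other.c

    def ae_le(self, other):
        # other - self is >= 0 for all large j iff it is zero
        # or its leading coefficient is positive.
        d = other + (-self)
        return not d.c or d.c[-1] > 0

h = PolyRep(0, 1)                      # representative j -> j
thousand = PolyRep(1000)               # constant representative of the standard number 1000

print(thousand.ae_le(h), h.ae_le(thousand))
print(((h + PolyRep(1)) * (h + PolyRep(-1))).ae_equal(h * h + PolyRep(-1)))
```

Under these assumptions the class of j ↦ j lies above every constant representative, so the corresponding hyperreal is infinite. Representatives that oscillate, such as j ↦ (−1)^j, fall outside this fragment, because their comparison really does depend on U.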
We define the hypernatural numbers as the elements of ∗ℕ, and similarly the
hyperinteger numbers and the hyperrational numbers as the elements of ∗ℤ and ∗ℚ,
respectively. Note that the relations ℕ ⊆ ℤ ⊆ ℚ ⊆ ℝ imply ∗ℕ ⊆ ∗ℤ ⊆ ∗ℚ ⊆ ∗ℝ.
There are two natural definitions for the order on ∗ℕ: Either, we can define
the order as the restriction of the order on ∗ℝ to ∗ℕ, or we can use the ∗-transform
of the order of ℕ. As the reader might have expected, the two definitions actually
coincide: One may apply the transfer principle to see this, or just has to recall
Theorem 3.13.
We will show soon that ∗ℝ contains, besides the copy σℝ of ℝ, also infinite
and infinitesimal numbers:
Definition 5.7. A hyperreal number x ∈ ∗ℝ is called
1. finite, if there is some n ∈ σℕ such that |x| < n,
2. infinite, if for any n ∈ σℕ we have |x| > n,
3. infinitesimal, if |x| < n⁻¹ for any n ∈ σℕ.
We use the notation
fin(∗ℝ) := {x ∈ ∗ℝ : x is finite},
inf(∗ℝ) := {x ∈ ∗ℝ : x is infinitesimal}.
The notation inf(∗ℝ) is of course ambiguous, since the symbol inf is usually
reserved for the infimum; however, we hope that no confusion will arise.
It will be convenient to use the notation ℝ+ := {x ∈ ℝ : x > 0}. Then we
have σℝ+ = {x ∈ σℝ : x > 0} and ∗ℝ+ = {x ∈ ∗ℝ : x > 0}.
Any x ∈ ∗ℝ is either finite or infinite.
Proposition 5.8. A number x ∈ ∗ℝ is
1. finite, if and only if |x| < y for some y ∈ σℝ,
2. infinite, if and only if |x| > y for any y ∈ σℝ,
3. infinitesimal, if and only if |x| < ε for any ε ∈ σℝ+.
Proof. One implication follows immediately from σℕ ⊆ σℝ. For the converse
implication note that σℝ is Archimedean, i.e. for any y ∈ σℝ we find some n ∈ σℕ
such that n > y. Thus, if x is infinite, we have |x| > n > y. Similarly, if x is
infinitesimal and ε ∈ σℝ+, we put y := ε⁻¹ and then find some n ∈ σℕ with
|x| < n⁻¹ < y⁻¹ = ε.
Note that there is of course no x ∈ ∗ℝ+ which satisfies x < ε for any hyperreal
number ε ∈ ∗ℝ+ (It is not necessary to invoke the transfer principle to see this:
The choice ε := x is enough).
Although ∗ℝ contains both infinite numbers and infinitesimals, one might
expect that ∗ℕ contains, besides σℕ, only infinite numbers. This is indeed the
case. Moreover, there are infinite numbers in ∗ℕ:
Recall that, since ∗ is a nonstandard embedding, we have σℕ ≠ ∗ℕ. For
evident reasons, we put
ℕ∞ := ∗ℕ \ σℕ.
Proposition 5.9. If h ∈ ℕ∞, then h is infinite, i.e.
∗ℕ ∩ fin(∗ℝ) = σℕ.
Proof. Given some N ∈ ℕ, consider the sentence
∀x ∈ ℕ : (x ≠ 1 ∧ x ≠ 2 ∧ · · · ∧ x ≠ N =⇒ x > N).
The transfer principle implies that any h ∈ ∗ℕ which does not have the form
h = ∗k with k = 1, 2, . . . , N (in particular, any h ∈ ℕ∞) satisfies h > ∗N. Since
N ∈ ℕ was arbitrary, the statement follows.
Corollary 5.10. In ∗ℝ there are infinite numbers and nonzero infinitesimal
numbers.
Proof. By Proposition 5.9, there is an infinite number h, and so 1/h is infinitesimal
(and different from 0).
Exercise 15. Show that for each x ∈ ∗ℝ there is precisely one h ∈ ∗ℕ with
h ≤ |x| < h + 1. Moreover, prove that x is infinite if and only if h ∉ σℕ.
Example 5.11. For the map ∗ from our ultrapower model (Theorem 4.20), it is easy
to characterize the finite numbers: Recall that x ∈ ∗ℝ if and only if x = ϕ([f])
where f : J → ℝ. By definition, x is finite if and only if |x| ≤ ∗n for some n ∈ ℕ.
Example 5.6 implies that the relation |x| ≤ ∗n for x = ϕ([f]) is equivalent to
|f(j)| ≤ n for almost all j. Hence, x = ϕ([f]) is finite if and only if [f] has a
representing function which is bounded on J.
Now we come to the limitations of the transfer principle:
Theorem 5.12. The set ∗ℕ is not well-ordered. More precisely, ℕ∞ has no smallest
element. However, any nonempty internal subset of ∗ℕ has a smallest element.
Proof. Assume by contradiction that ℕ∞ has a smallest element h. Then h > ∗n
for each n ∈ ℕ, and so h − 1 > ∗n for each n ∈ ℕ. Since no element n0 ∈ ℕ
satisfies ∗n0 > ∗n for all n ∈ ℕ, we may conclude that h − 1 ∉ σℕ. Hence, the
element h0 := h − 1 belongs to ℕ∞ and is strictly smaller than h, a contradiction.
Since ℕ is well-ordered, the sentence
∀x ∈ P(ℕ) : (x ≠ ∅ =⇒ ∃y ∈ x : ∀z ∈ x : y ≤ z)
is true. The transfer of this sentence implies that any nonempty x ∈ ∗P(ℕ) has a
smallest element. By Theorem 3.21, the set ∗P(ℕ) consists precisely of all internal
subsets of ∗ℕ.
Theorem 5.12 implies that ∗ℕ \ σℕ is external, and so σℕ is external. In the
light of this proof, the reader might want to reconsider the proof of Theorem 3.22:
This proof is actually just a repetition of the arguments that we used above.
The reason why the transfer principle does not apply for the sentence “ℕ
is well-ordered” is that the natural formalization of this sentence is unbounded
(namely it has the form ∀x ⊆ ℕ : . . .); recall in this connection the remark
following Theorem 3.13.
Proposition 5.9 shows another limitation of the transfer principle:
Theorem 5.13. The ordered field ∗ℝ is not Archimedean and not isomorphic to a
subfield of ℝ.
Proof. Using the notation of Section 1.2 with X := ∗ℝ, we evidently have
ℚX = σℚ and ℕX = σℕ. Choose some h ∈ ∗ℕ \ σℕ. Since ∗ℕ ⊆ ∗ℝ, we then have
h ∈ ∗ℝ, but by Proposition 5.9 there is no n ∈ σℕ with n > h. This means
that X = ∗ℝ is not Archimedean. The second statement follows from the first by
Theorem 1.3.
Theorem 5.14. The set ∗ℝ is not Dedekind complete. However, any nonempty
internal subset of ∗ℝ which is bounded from above has a least upper bound.
Proof. The subset σℕ ⊆ ∗ℝ is bounded from above by Proposition 5.9. However,
σℕ has no least upper bound: If x ∈ ∗ℝ is an upper bound for σℕ, i.e. x > n for
each n ∈ σℕ, then also x − 1 > n for each n ∈ σℕ, i.e. x − 1 ∈ ∗ℝ is an upper
bound for σℕ which is strictly smaller than x.
For the second statement, observe that ℝ is Dedekind complete which means
that
∀x ∈ P(ℝ) : (x ≠ ∅ ∧ α(x, ℝ)) =⇒ β(x, ℝ)
is true where α(x, ℝ) and β(x, ℝ) are transitively bounded formulas with the
meaning “x has an upper bound in ℝ” and “x has a smallest upper bound in
ℝ”, respectively (we leave the precise formulation of the formulas to the reader).
The transfer of the above statement means that any internal subset of ∗ℝ which
is nonempty and bounded from above in ∗ℝ possesses a smallest upper bound in
∗ℝ (recall that ∗∅ = ∅ and that ∗P(ℝ) consists by Theorem 3.21 precisely of all
internal subsets of ∗ℝ).
The fact that ∗ℝ is not Dedekind complete should not be too surprising to the
reader, since the transfer principle simply does not provide much information on
external sets: The “natural” formalization of the sentence that a field is Dedekind
complete is not transitively bounded.
However, the reader might be surprised that ∗ℝ is not Archimedean, because
the sentence that ℝ is Archimedean can in a natural way be formalized by the
transitively bounded sentence
∀x ∈ ℝ : ∃y ∈ ℕ : y > x.
Of course, the ∗-transform of this sentence must be true:
∀x ∈ ∗ℝ : ∃y ∈ ∗ℕ : y > x.
However, this sentence does not mean that ∗ℝ is Archimedean, because for
X = ∗ℝ the set ℕX from Section 1.2 is σℕ and not ∗ℕ.
Proposition 5.15. fin(∗ℝ) is an Archimedean subring of ∗ℝ without zero-divisors
which contains σℝ and inf(∗ℝ).
fin(∗ℝ) is not a field. More precisely, we have 1/x ∈ fin(∗ℝ) for some x ∈ ∗ℝ
(x ≠ 0) if and only if x ∉ inf(∗ℝ).
Proof. σℝ ⊆ fin(∗ℝ) follows from Proposition 5.8, and the fact that fin(∗ℝ) is
Archimedean follows from the definition (note that for X := fin(∗ℝ), we have
ℕX = σℕ). If fin(∗ℝ) had zero-divisors, we would have x · y = 0 for some nonzero
x, y ∈ fin(∗ℝ) ⊆ ∗ℝ, contradicting the fact that ∗ℝ is a field. fin(∗ℝ) is a subring:
If x, y ∈ fin(∗ℝ), then x + y, x − y, and xy also belong to fin(∗ℝ): Indeed, there
are n, m ∈ σℕ with |x| ≤ n, |y| ≤ m which implies |x ± y| ≤ |x| + |y| ≤ n + m ∈ σℕ,
and similarly |xy| ≤ nm ∈ σℕ.
If x ∈ inf(∗ℝ), then |x| < n⁻¹ ≤ 1 for all n ∈ σℕ. Hence, x ∈ fin(∗ℝ) and
y := 1/x ∉ fin(∗ℝ) (if x ≠ 0) because |y| > n for all n ∈ σℕ.
Conversely, if x ∉ inf(∗ℝ), then we find some n with |x| ≥ n⁻¹. Hence,
y := 1/x satisfies |y| ≤ n and thus belongs to fin(∗ℝ).
Definition 5.16. We say that two hyperreal numbers x, y ∈ ∗ℝ are infinitely close
to each other, if x − y ∈ inf(∗ℝ). We then write x ≈ y.
Proposition 5.17. ≈ is an equivalence relation on ∗ℝ. Moreover, if x1 ≈ y1 and
x2 ≈ y2 we have:
1. x1 ± x2 ≈ y1 ± y2.
2. x1 · x2 ≈ y1 · y2 if x1 and x2 are finite.
3. x1/x2 ≈ y1/y2 if x1 is finite and x2 ≉ 0.
Proof. Since 0 ∈ inf(∗ℝ), we have x ≈ x. Since |x − y| = |y − x|, the relation x ≈ y
implies y ≈ x. Finally, if x ≈ y and y ≈ z, then |x − z| ≤ |x − y| + |y − z| < 2ε for
any ε ∈ σℝ+ which implies x ≈ z.
For each n ∈ σℕ we have |xi − yi| < n⁻¹ (i = 1, 2). Hence,
|(x1 ± x2) − (y1 ± y2)| ≤ 2n⁻¹ for each n ∈ σℕ. If x1 and x2 are finite, we find
some N ∈ σℕ with |xi| ≤ N (i = 1, 2). Since |x2 − y2| ≤ 1, we find |y2| ≤ N + 1,
and so |x1 · x2 − y1 · y2| = |x1(x2 − y2) + (x1 − y1)y2| ≤ N n⁻¹ + n⁻¹(N + 1) =
(2N + 1)n⁻¹ for each n ∈ σℕ. Finally, if x2 ≉ 0, then we find some N ∈ σℕ with
|x2| ≥ N⁻¹. Since |x2 − y2| ≤ N⁻¹/2, we also have |y2| ≥ N⁻¹/2. Consequently,
|1/x2 − 1/y2| = |y2 − x2| / |x2 y2| ≤ n⁻¹ / (N⁻¹ · N⁻¹/2)
for each n ∈ σℕ which proves x2⁻¹ ≈ y2⁻¹. Since x2⁻¹ is finite, it follows by what
we just proved that x1/x2 = x1 x2⁻¹ ≈ y1 y2⁻¹ = y1/y2.
The reader might have observed that the above proof essentially repeats the
argument of the classical limit rules like lim(xn ± yn ) = lim xn ± lim yn , etc. In
fact, we will see later that Proposition 5.17 implies these limit rules.
Corollary 5.18. For any x ∈ inf(∗ℝ), y ∈ fin(∗ℝ), we have x · y ∈ inf(∗ℝ).
Proof. x ≈ 0 and y ≈ y implies x · y ≈ 0 · y = 0.
Exercise 16. What can be said about the “equations” x² = 2 and x² ≈ 2 in ∗ℚ?
Exercise 17. Show that for any x ∈ ∗ℝ there is some q ∈ ∗ℚ with x ≈ q.
Theorem 5.19. For each x ∈ fin(∗ℝ) there is precisely one x̂ ∈ ℝ with x ≈ ∗x̂.
Proof. Uniqueness: If x̂, ŷ ∈ ℝ satisfy ∗x̂ ≈ x ≈ ∗ŷ, then |∗x̂ − ∗ŷ| < ∗ε for any
∗ε ∈ σℝ+. The inverse transfer principle implies |x̂ − ŷ| < ε for any ε ∈ ℝ+, and so
x̂ = ŷ.
Existence: Consider the set A := {y ∈ σℝ : y < x}. Since x ∈ fin(∗ℝ), the
set A is nonempty and bounded from above, and since σℝ is Dedekind complete
(because it is isomorphic to ℝ), it has a least upper bound s ∈ σℝ, i.e. s = ∗x̂
for some x̂ ∈ ℝ. Given some n ∈ σℕ, we have s ≥ x − n⁻¹ since otherwise
s + n⁻¹ ∈ A would contradict the fact that s is an upper bound for A. But we also
have s ≤ x + n⁻¹, since otherwise s − n⁻¹ would be an upper bound for A which is
strictly smaller than s. Hence |s − x| ≤ n⁻¹ for each n ∈ σℕ, i.e. x ≈ s = ∗x̂.
We emphasize that the proof of Theorem 5.19 made essential use of the
Dedekind completeness of ℝ.
Definition 5.20. Let st : fin(∗ℝ) → ℝ be the map x ↦ x̂ from Theorem 5.19.
We call st(x) = x̂ the standard part of x ∈ fin(∗ℝ), and st the standard part
homomorphism.
Theorem 5.21. st : fin(∗ℝ) → ℝ is a surjective order-preserving ring-homomorphism
with kernel inf(∗ℝ), i.e. for all x, x1, x2 ∈ fin(∗ℝ) we have
1. st(x) = 0 if and only if x ∈ inf(∗ℝ),
2. st(x1 ± x2) = st(x1) ± st(x2),
3. st(x1 · x2) = st(x1) · st(x2),
4. st(x1/x2) = st(x1)/st(x2) if st(x2) ≠ 0, and
5. x1 ≤ x2 implies st(x1) ≤ st(x2).
Hence, st induces an order-preserving ring-isomorphism
fin(∗ℝ)/inf(∗ℝ) ≅ ℝ.
Since ℝ is a field, also fin(∗ℝ)/inf(∗ℝ) is a field (and inf(∗ℝ) is a maximal ideal
in fin(∗ℝ)).
Proof. Let y1 := st(x1) and y2 := st(x2). Then xi ≈ yi, and Proposition 5.17
implies x1 + x2 ≈ y1 + y2. Since y1 + y2 ∈ σℝ, we find st(x1 + x2) = y1 + y2, and so
st commutes with the addition. Analogously, st commutes with multiplication and
thus is a ring homomorphism. If x1 ≤ x2, then we have for any n ∈ σℕ in view of
|xi − yi| ≤ n⁻¹ that y1 ≤ x1 + n⁻¹ ≤ x2 + n⁻¹ ≤ y2 + 2n⁻¹. Since y1, y2 ∈ σℝ ≅ ℝ,
this implies y1 ≤ y2.
st is onto σℝ, since st(x) = x for x ∈ σℝ. Moreover, st(x1) = 0 if and only
if x1 ≈ 0, i.e. if and only if x1 ∈ inf(∗ℝ).
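In the ultrapower model, the standard part can be computed explicitly for simple representatives. The following Python sketch makes assumptions that are not in the text: J is the set of positive integers, U is a free ultrafilter, and representatives are rational functions f(j) = p(j)/q(j) given by coefficient lists (values at the finitely many roots of q may be chosen arbitrarily without changing [f]). If f(j) converges to L, then |f(j) − L| < 1/n for all but finitely many j, so ϕ([f]) ≈ ∗L and the standard part is L. All function names are hypothetical.

```python
from fractions import Fraction

def trim(coeffs):
    """Coefficient list [c0, c1, ...] as Fractions without trailing zeros."""
    cs = [Fraction(c) for c in coeffs]
    while cs and cs[-1] == 0:
        cs.pop()
    return cs

def poly_mul(a, b):
    a, b = trim(a), trim(b)
    out = [Fraction(0)] * max(len(a) + len(b) - 1, 0)
    for i, s in enumerate(a):
        for k, t in enumerate(b):
            out[i + k] += s * t
    return out

def poly_add(a, b):
    a, b = trim(a), trim(b)
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return trim([s + t for s, t in zip(a, b)])

def standard_part(p, q):
    """st(phi([f])) for f(j) = p(j)/q(j); returns None if phi([f]) is infinite."""
    p, q = trim(p), trim(q)
    if not q:
        raise ValueError("denominator must not be the zero polynomial")
    if not p or len(p) < len(q):
        return Fraction(0)             # f(j) -> 0: infinitesimal
    if len(p) > len(q):
        return None                    # |f(j)| -> infinity: not in fin(*R)
    return p[-1] / q[-1]               # equal degrees: ratio of leading coefficients

def add(x, y):
    (p1, q1), (p2, q2) = x, y
    return poly_add(poly_mul(p1, q2), poly_mul(p2, q1)), poly_mul(q1, q2)

def mul(x, y):
    (p1, q1), (p2, q2) = x, y
    return poly_mul(p1, p2), poly_mul(q1, q2)

x1 = ([1, 2], [3, 1])                  # f1(j) = (1 + 2j) / (3 + j)
x2 = ([0, 1], [1, 3])                  # f2(j) = j / (1 + 3j)

print(standard_part(*x1), standard_part(*x2))
print(standard_part(*add(x1, x2)), standard_part(*mul(x1, x2)))
```

For these two finite elements the sketch reproduces properties 2 and 3 of Theorem 5.21: the standard part of the sum and of the product agrees with the sum and the product of the standard parts.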
Definition 5.22. For x ∈ ℝ, we define the monad of x as the set
mon(x) := {y ∈ ∗ℝ : y ≈ ∗x}.
We have
mon(x) = ∗x + inf(∗ℝ) = {y : st(y) = x}.
Indeed, y ≈ ∗x if and only if y − ∗x ∈ inf(∗ℝ). Now observe that inf(∗ℝ) is the
kernel of the standard part homomorphism st which satisfies st(∗x) = x for x ∈ ℝ.
Up to now we know that ∗ℝ contains more elements than σℝ ≅ ℝ; we also
know that among these new elements are the infinite and the nonzero infinitesimal
elements. However, ∗ℝ contains even more elements, namely all which are infinitely
close to some x ∈ σℝ, i.e. all which belong to some monad. The question arises
whether there are other “exotic” elements contained in ∗ℝ. The answer is “no”:

Proposition 5.23. The set fin(∗ℝ) is the disjoint union of all monads. The elements
of ∗ℝ \ fin(∗ℝ) are the inverses of the nonzero elements of inf(∗ℝ) = mon(0).
Proof. For any y ∈ fin(∗ℝ), we have y ∈ mon(st(y)), and so fin(∗ℝ) is contained in
the union of all monads. Conversely, each monad mon(x) is contained in fin(∗ℝ),
since ∗x ∈ fin(∗ℝ) and y ≈ ∗x implies y ∈ fin(∗ℝ) (we have |x| ≤ n for some
n ∈ ℕ, and so |y| ≤ |∗x| + 1 ≤ ∗n + 1). To see that the monads are disjoint,
assume that for x1, x2 ∈ ℝ we find some y ∈ mon(x1) ∩ mon(x2). Then st(y) = x1
and st(y) = x2, i.e. x1 = x2.
The second statement is a reformulation of the second part of
Proposition 5.15.
Most of the objects we considered so far are actually external. In this
connection recall that ∗A is internal and σA is external for infinite sets A (in
particular, for A = ℕ, ℤ, ℚ, ℝ).

Theorem 5.24. The sets inf(∗ℝ), fin(∗ℝ), and mon(x) are external. Moreover,
also the mapping s : x ↦ ∗(st(x)) is external.
Proof. The set inf(∗ℝ) ⊆ ∗ℝ has no least upper bound (and so Theorem 5.14
implies that it is external): Assume that x ∈ ∗ℝ+ is such a bound. If x ∈ inf(∗ℝ),
then 2x ∈ inf(∗ℝ) (Corollary 5.18) contradicts the fact that x is an upper bound.
Hence, x ∉ inf(∗ℝ) which implies x/2 ∉ inf(∗ℝ) (since otherwise x = 2(x/2) ∈
inf(∗ℝ)). This contradicts the fact that x is the least upper bound.
If fin(∗ℝ) were internal, then the internal definition principle would imply
that
inf(∗ℝ) = {x ∈ ∗ℝ : x = 0 ∨ 1/x ∉ fin(∗ℝ)}
is internal, a contradiction. Similarly, if mon(x) were internal, the internal
definition principle would imply that inf(∗ℝ) = mon(x) − ∗x were internal.
If s were internal, then dom(s) = fin(∗ℝ) were internal by Theorem 3.19
(and even rng(s) = σℝ were internal).
Exercise 18. Prove that {y ∈ ∗ℝ : y ≈ x} is external for any x ∈ ∗ℝ.

Exercise 19. Prove that any h ∈ ∗ℕ is either even or odd (and also not both), i.e.
precisely one of the following alternatives holds:
1. There is some n ∈ ∗ℕ such that h = 2n, or
2. There is some n ∈ ∗ℕ such that h = 2n − 1.
Prove the statement also for the map ∗ of Theorem 4.20 directly from the definition.
Let us now point out that some “curiosities” of the set ∗ℝ correspond to
the classical paradoxes which have been associated with the “continuum” (e.g. by
Leibniz):
Historically, the continuum has been considered as a line. It was possible to
divide this line at any point, but it did not make much sense to speak of a “point”
of the line: The “point” could only be considered as an “endpoint” of e.g. a line
segment; if there is no line segment, then there is also no “point” to speak of. This
intuitive idea is reflected by the fact that monads are external (if we consider only
internal subsets of ∗ℝ as “reasonable”) and that infinite internal sets are always
“large” in a certain sense (recall the remarks following Theorem 3.23).
Another correspondence to Leibniz’s intuitive ideas is the treatment of infinitesimals:
An infinitesimal dx > 0 was considered by Leibniz only as a “placeholder
for many possibilities” (with the only requirement that it be less than any
positive number). This is reflected by the fact that the standard formal language
has no symbols for infinitesimals: We can describe infinitesimals only by means of
variables within quantified expressions like ∀x ∈ ∗ℝ : . . .. In particular, although
we can say in a proof “Fix some infinitesimal c”, we actually do not know precisely
which infinitesimal will be fixed: Although we can specify even more properties
of the infinitesimal, e.g. that it be of the form 1/h with h ∈ ℕ∞, we cannot give
a property which describes it completely: Indeed, assume that such a property
existed. Formally, this means that there is a standard predicate α(x) such
that {x ∈ ∗ℝ : α(x)} = {c}. But by the standard definition principle, this would
imply that {c} is a standard set which is not the case (Exercise 5).
5.2 Interpretation of the Standard Part Homomorphism

The standard part homomorphism st is one of the most important functions in
nonstandard analysis. Let us interpret this function in the situation of the ultrapower
model (Theorem 4.20) (with 𝒰 being δ-incomplete).
Recall (Example 5.6) that x ∈ fin(∗ℝ) if and only if x = ϕ([f]) where f :
J → ℝ is bounded.
Proposition 5.25. If in the above situation J = ℕ and lim_{j→∞} f(j) = c, then
st(x) = c.
Proof. Given ε ∈ ℝ+, the set N = {j : |f(j) − c| > ε} is finite, and so its
complement belongs to 𝒰 by Exercise 11. Hence |f(j) − c| ≤ ε for almost all j
which in view of Example 5.6 means |x − ∗c| ≤ ∗ε. Hence, x ≈ ∗c which in view
of c ∈ ℝ actually implies c = st(x).
Thus, st is equal to “lim” (if it makes sense to speak of “lim”). Moreover,
Theorem 5.21 shows that st has also many properties analogous to “lim”. However,
st is even defined whenever f : J → ℝ is just bounded. Thus, in a certain sense
we may consider st as a generalization of a limit to all bounded functions. We will
discuss such limits later on.
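To see concretely how st extends “lim”, here is a hedged worked example (the sequence f and the set E are our own illustration, not taken from the text). Take J = ℕ and f(j) = (−1)^j, which is bounded but has no limit. Since 𝒰 is an ultrafilter, exactly one of E (the even numbers) and its complement belongs to 𝒰, and f is constant on each of them, so the argument in the proof of Proposition 5.25 applies on that set:

```latex
% Assumptions: J = \mathbb{N}, f(j) = (-1)^j, E = \{ j : j \mbox{ even} \},
% and x = \varphi([f]) as in the text.
\[
  \mathrm{st}(x) =
  \left\{
  \begin{array}{rl}
     1 & \mbox{if } E \in \mathcal{U}, \\
    -1 & \mbox{if } \mathbb{N} \setminus E \in \mathcal{U}.
  \end{array}
  \right.
\]
```

Which of the two values occurs depends on the choice of the ultrafilter 𝒰, in contrast to the situation of Proposition 5.25.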
At the moment we will just recall a definition of “limit” from general topology:
The reader who is not familiar with topology may in the following just consider
X = ℝ; this is the only case that is actually needed at the moment. However,
since it makes no essential difference, we formulate the following results already in
a more general context (the reader may want to reread the following after having
read §12).
For the rest of this section, let X be some Hausdorff space, i.e. a topological
space with the property that any two points x ≠ y have disjoint open neighborhoods.
Let ℱ be a filter over some set J, and f : J → X be a function. One calls
the filter generated by {f(F) : F ∈ ℱ} the image filter of ℱ and denotes this
filter by f(ℱ). One says that f converges with respect to ℱ to some point x ∈ X
if any open neighborhood U ⊆ X of x is an element of the image filter f(ℱ). If f
converges with respect to ℱ, we write
lim_{j→ℱ} f(j) = x.
This notation is justified:

Proposition 5.26. If X is a Hausdorff space and ℱ some filter, then f converges
to at most one point with respect to ℱ.
Proof. Assume that f converges to x and to y with respect to ℱ, and that x ≠ y.
Since X is a Hausdorff space, the points x and y have disjoint open neighborhoods,
say Ux and Uy. Then Ux, Uy ∈ f(ℱ), and so ∅ = Ux ∩ Uy ∈ f(ℱ), contradicting
the fact that f(ℱ) is a filter.
Lemma 5.27. f(ℱ) consists precisely of those sets U ⊆ X for which f⁻¹(U) =
{j : f(j) ∈ U} belongs to ℱ.
Proof. If F = f⁻¹(U) belongs to ℱ, then U ⊇ f(F) ∈ f(ℱ). Since f(ℱ) is a
filter, this implies U ∈ f(ℱ). Conversely, if U ∈ f(ℱ), then there are finitely many
F1, . . . , Fn ∈ ℱ such that U ⊇ f(F1) ∩ · · · ∩ f(Fn). The set F := F1 ∩ · · · ∩ Fn belongs
to ℱ (because ℱ is a filter), and the set f(F) is contained in f(F1) ∩ · · · ∩ f(Fn),
and so U ⊇ f(F), i.e. f⁻¹(U) ⊇ F. This implies f⁻¹(U) ∈ ℱ.
The definition of convergence with respect to a filter contains the usual no-
tions of convergence as special cases:
Proposition 5.28. Let J = ℕ, X be a topological space, and ℱ be the filter of
Example 4.2. Then
lim_{j→ℱ} f(j) = x ⇐⇒ lim_{j→∞} f(j) = x.
Proof. We have lim_{j→∞} f(j) = x if and only if for any open neighborhood U of
x we have f(j) ∈ U for all except finitely many j. The latter means f⁻¹(U) =
{j : f(j) ∈ U} ∈ ℱ which by Lemma 5.27 is equivalent to U ∈ f(ℱ). Hence,
lim_{j→∞} f(j) = x if and only if any open neighborhood U of x belongs to
f(ℱ), i.e. if and only if lim_{j→ℱ} f(j) = x.
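The equivalence can be unpacked in one chain, assuming (as the proof indicates) that the filter of Example 4.2 consists of the subsets of ℕ with finite complement. This is a sketch of the bookkeeping rather than a new result:

```latex
% Assumption: \mathcal{F} is the filter of all subsets of \mathbb{N} with finite complement.
\[
  U \in f(\mathcal{F})
  \iff f^{-1}(U) \in \mathcal{F}
  \iff \mathbb{N} \setminus f^{-1}(U) \mbox{ is finite}
  \iff \exists N \in \mathbb{N} \ \forall j \ge N : f(j) \in U .
\]
```

The first equivalence is Lemma 5.27; the last condition is the usual definition of convergence of a sequence, applied to the neighborhood U.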
Exercise 20. Let J, X be topological spaces, j0 ∈ J, and let ℱ be the filter generated
by all sets of the form J0 \ {j0} where J0 is an open neighborhood of j0 (we
assume that none of these sets is empty which implies that the family of these sets
indeed has the finite intersection property). Prove that
lim_{j→ℱ} f(j) = x ⇐⇒ lim_{j→j0} f(j) = x.
The finer the filter, the more convergent functions exist:

Proposition 5.29. Let ℱ1, ℱ2 be filters over J with ℱ1 ⊆ ℱ2. If f converges to x
with respect to ℱ1, then f converges to x with respect to ℱ2.
Proof. We have f(ℱ1) ⊆ f(ℱ2). Hence, if any open neighborhood of x belongs
to f(ℱ1), it also belongs to f(ℱ2).
The crucial point for ultrafilters is that all bounded functions have a limit:
Theorem 5.30. If 𝒰 is an ultrafilter over J and f : J → ℝ is bounded, then f
converges with respect to 𝒰 to some point x ∈ ℝ.
Proof. Since f is bounded, the image f(J) is contained in a compact interval [a, b].
Let ℬ be the system of all open sets which do not belong to f(𝒰). We claim
that the union of ℬ does not contain [a, b]: Indeed, since [a, b] is compact, we find otherwise
finitely many B1, . . . , Bn ∈ ℬ with [a, b] ⊆ B1 ∪ · · · ∪ Bn. Put Ai := {j : f(j) ∈ Bi}
(i = 1, . . . , n). Then Ai ∉ 𝒰, since otherwise f(Ai) ∈ f(𝒰) which would imply
the contradiction Bi ∈ f(𝒰). Since 𝒰 is an ultrafilter, the sets Ji := J \ Ai
belong to 𝒰. Hence, C := J1 ∩ · · · ∩ Jn = J \ (A1 ∪ · · · ∪ An) belongs to 𝒰. But
A1 ∪ · · · ∪ An ⊇ {j : f(j) ∈ [a, b]} = J which implies C = ∅, a contradiction.
This contradiction shows that there is indeed some point x ∈ [a, b] which is
not contained in the union of ℬ, i.e. x is not contained in any open set which does not belong
to f(𝒰). But this means that all open neighborhoods of x belong to f(𝒰), i.e. f
converges to x with respect to 𝒰.
An inspection of the proof of Theorem 5.30 shows that it is actually not
required that the image space of f be ℝ; in fact, it suffices that f(J) is contained
in a compact space.
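For orientation, here is a hedged example with a principal (non-free) ultrafilter. This is not the case of interest for the ultrapower model, but it makes the definition of the limit transparent; the point j_0 ∈ J is chosen for illustration only:

```latex
% Assumption: \mathcal{U} = \{ A \subseteq J : j_0 \in A \} for a fixed j_0 \in J.
\[
  U \in f(\mathcal{U})
  \iff f^{-1}(U) \in \mathcal{U}
  \iff j_0 \in f^{-1}(U)
  \iff f(j_0) \in U ,
  \qquad \mbox{hence} \qquad
  \lim_{j \to \mathcal{U}} f(j) = f(j_0) .
\]
```

Here the limit exists without any boundedness or compactness assumption; the content of Theorem 5.30 lies in the case of free ultrafilters.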
The limit value depends only on the equivalence class of f :
Lemma 5.31. If ℱ is a filter and f1, f2 : J → X are such that f1(j) = f2(j) for almost
all j, then f1 converges to x if and only if f2 converges to x.
Proof. By assumption, F := {j : f1(j) = f2(j)} belongs to ℱ. We claim that
this implies f1(ℱ) = f2(ℱ) which of course implies the statement. Thus, given
A ∈ f1(ℱ), we have to prove that A ∈ f2(ℱ) (the converse follows analogously).
We have A ⊇ f1(F0) for some F0 ∈ ℱ (Lemma 5.27). But then A ⊇ f1(F0 ∩ F) =
f2(F0 ∩ F) which in view of F0 ∩ F ∈ ℱ implies that A ∈ f2(ℱ), as claimed.
Theorem 5.32. Consider the map ∗ of the ultrapower model of Theorem 4.20. If
x = ϕ([f]) is finite, then
st(x) = lim_{j→𝒰} f(j),
where the limit on the right-hand side exists and is independent of the particular
choice of f.
Proof. First note that, since x is finite, we have x = ϕ([f]) for some bounded f :
J → ℝ (Example 5.11) so that the limit x0 = lim_{j→𝒰} f(j) exists by Theorem 5.30.
Moreover, by Lemma 5.31, the limit exists also (and has the same value) if we
choose another representative f. Given ε ∈ ℝ+, the open neighborhood U :=
(x0 − ε, x0 + ε) belongs to f(𝒰). By Lemma 5.27, this means that {j : f(j) ∈ U}
belongs to 𝒰. But this means |f(j) − x0| < ε almost everywhere, and so x ≈ ∗x0
in view of Example 5.6. We thus must have x0 = st(x).
Proposition 5.25 is a special case of Theorem 5.32: Indeed, if J = ℕ and
lim_{j→∞} f(j) = x0 exists, then lim_{j→ℱ} f(j) = x0 for the filter ℱ of Example 4.2
by Proposition 5.28, and so lim_{j→𝒰} f(j) = x0 by Proposition 5.29 (observe that
ℱ ⊆ 𝒰 by Exercise 11, since 𝒰 is δ-incomplete and thus free).
§ 6 The Permanence Principle and ∗ -finite Sets

One of the most important principles in nonstandard analysis is the so-called
permanence principle which is a simple consequence of the fact that σℕ is external
and that all elements in ℕ∞ are infinite:
Theorem 6.1 (Permanence Principle). Let α(n) be an internal predicate with n as
its only free variable.
1. If α(n) holds for all sufficiently large finite n ∈ σℕ, n ≥ n0, then there is
some infinite h ∈ ℕ∞ such that α(n) holds for all n ∈ ∗ℕ with n0 ≤ n ≤ h.
In particular, α(h) holds for some infinite h ∈ ℕ∞.
2. If α(h) holds for all infinite h ∈ ℕ∞, then there is some n0 ∈ σℕ such that
α(n) holds for all n ∈ ∗ℕ with n ≥ n0. In particular, α(n0) holds for some
finite n0 ∈ σℕ.
Proof. 1. Let
M := {n ∈ ∗ℕ | ∀y ∈ ∗ℕ : (n0 ≤ y ≤ n =⇒ α(y))}.
Then M is internal by the internal definition principle, and the assumption implies
that any n ∈ σℕ belongs to M: Indeed, any n1 ∈ ∗ℕ with n1 ≤ n is finite and thus
belongs to σℕ by Proposition 5.9. Hence, σℕ ⊆ M. Since M is internal and σℕ is
external, we have M ≠ σℕ. Hence, there is
some h ∈ M \ σℕ, which thus has the required properties.
2. Let
M := {n ∈ ∗ℕ | ∀y ∈ ∗ℕ : (y ≥ n =⇒ α(y))}.
If h ∈ ℕ∞ is infinite (Proposition 5.9), then no h1 ∈ ∗ℕ with h1 ≥ h belongs
to σℕ, and so the assumption implies h ∈ M. Hence, ℕ∞ ⊆ M. Since σℕ is
external, also ℕ∞ is external (Theorem 3.19), and so ℕ∞ ≠ M. Hence, there is
some n ∈ M \ ℕ∞ = M ∩ σℕ.
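A typical way to apply the first part is sketched below, under the assumption that x : ∗ℕ → ∗ℝ is some internal sequence (our example, not from the text) which is bounded by 1 at all standard indices. The predicate α(n) ≡ |x_n| ≤ 1 is internal, and n0 = 1:

```latex
% Assumption: x is an internal sequence; \alpha(n) is the internal predicate |x_n| \le 1.
\[
  \bigl( \forall n \in {}^{\sigma}\mathbb{N} : |x_n| \le 1 \bigr)
  \ \Longrightarrow\
  \exists h \in \mathbb{N}_{\infty} \ \forall n \in {}^{*}\mathbb{N} :
  \bigl( n \le h \Longrightarrow |x_n| \le 1 \bigr) .
\]
```

So a bound that holds at all finite indices necessarily persists up to some infinite index.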
The second part of the following consequence is also called the Cauchy principle.
The name “Cauchy principle” is due to the fact that it allows us to formulate
properties which hold for infinitesimals (which have been used by Leibniz) in an
ε-δ-type manner as was first advocated by Cauchy (and which is the only reasonable
definition in standard analysis).
Corollary 6.2 (Permanence Principle for ∗ℝ). Let α(ε) be an internal predicate
with ε as its only free variable.
1. If α(ε) holds for all sufficiently small standard ε ∈ σℝ+, ε < ε0, then α(c)
holds also for some infinitesimal c ∈ inf(∗ℝ), c > 0.
2. If α(c) holds for all infinitesimals c ∈ inf(∗ℝ), c > 0, then there is some
standard ε0 ∈ σℝ+ such that α(c) holds for all standard or nonstandard c ∈ ∗ℝ
with 0 < c ≤ ε0.
Moreover, if α(c) holds for all infinitesimals c ∈ inf(∗ℝ), then there is
some standard ε0 ∈ σℝ+ such that α(c) holds for all standard or nonstandard
c ∈ ∗ℝ with |c| ≤ ε0.
3. If α(x) holds for all sufficiently large standard x ∈ σℝ, x > x0, then α(c)
holds also for some infinite c > 0.
4. If α(c) holds for all infinite c > 0, then there is some finite standard x0 ∈ σℝ
such that α(x) holds for all standard or nonstandard x ∈ ∗ℝ, x ≥ x0.
Proof. 1. Since α(1/n) holds for all sufficiently large n ∈ σℕ, we have α(1/h) for some
h ∈ ℕ∞ by Theorem 6.1. By Proposition 5.15, we have c := 1/h ∈ inf(∗ℝ).
2. Let β(ε) denote the internal predicate
∀y ∈ ∗ℝ : (0 < y ≤ 1/ε =⇒ α(y)).
Then β(h) holds for all infinite h ∈ ℕ∞, and Theorem 6.1 thus implies that β(n)
holds for some finite n ∈ σℕ. Now the claim follows with ε0 := 1/n. The second
part follows analogously by the predicate
∀y ∈ ∗ℝ : (|y| ≤ 1/ε =⇒ α(y)).
The remaining claims follow by applying the above proved statements for the
predicate α(1/ε) in place of α; recall that c ≠ 0 belongs to inf(∗ℝ) if and only if 1/c is infinite
(Proposition 5.23).
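The following sketch shows the Cauchy principle at work. It assumes a standard function f : ℝ → ℝ with ∗f(c) ≈ 0 for every infinitesimal c, and a fixed standard η > 0; these names are illustrative and not part of the text. Applying the second half of part 2 to the internal predicate α(c) ≡ |∗f(c)| < ∗η turns the infinitesimal statement into an ε-δ statement:

```latex
% Assumptions: f is a standard function, \eta > 0 is a fixed standard number,
% and {}^{*}f(c) \approx 0 for all infinitesimal c.
\[
  \forall c \in \inf({}^{*}\mathbb{R}) : |{}^{*}f(c)| < {}^{*}\eta
  \quad \Longrightarrow \quad
  \exists \varepsilon_0 \in {}^{\sigma}\mathbb{R},\ \varepsilon_0 > 0 \ \
  \forall c \in {}^{*}\mathbb{R} :
  \bigl( |c| \le \varepsilon_0 \Longrightarrow |{}^{*}f(c)| < {}^{*}\eta \bigr) .
\]
```

Restricting the conclusion to standard c gives |f(c)| < η whenever |c| ≤ ε0, which is the usual ε-δ formulation of f(c) → 0 as c → 0.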
Exercise 21. Prove the following generalizations of the permanence principle: Let
α(x) be an internal predicate with x as its only free variable.
1. If there is some h0 ∈ ℕ∞ such that α(h) holds for all h ∈ ℕ∞ with h < h0,
then there is some n0 ∈ σℕ such that α(n) holds for all n ∈ σℕ with n ≥ n0.
2. If there is some infinitesimal c ∈ inf(∗ℝ), c > 0, such that α(d) holds for all
infinitesimals d ∈ inf(∗ℝ) with d > c, then there is some standard ε0 ∈ σℝ+
such that α(ε) holds for all standard or nonstandard ε ∈ ∗ℝ with c < ε ≤ ε0.
Exercise 22. Prove Robinson’s sequential lemma: If x : ∗ℕ → ∗ℝ is an internal
sequence such that xn ≈ 0 for all n ∈ σℕ, then there is some h ∈ ℕ∞ such that
xn ≈ 0 for all n ∈ ∗ℕ with n ≤ h.
As a sample application of the permanence principle, let us give a simpler
proof of Theorem 5.24:
Example 6.3. inf(∗ℝ) is external. Indeed, if inf(∗ℝ) were internal, then the predicate
α(x) ≡ ∀y ∈ inf(∗ℝ) : y < x would be internal. Since α(ε) holds for all standard
numbers ε > 0, the permanence principle implies that we have α(c) for some
infinitesimal c ∈ inf(∗ℝ), a contradiction.

Note, however, that we used the fact that ℕ∞ is external for the proof of the
permanence principle. Thus, in a sense, the permanence principle is equivalent to
the fact that certain entities are external. This is not accidental: Many deep results
of nonstandard analysis depend on the fact that certain entities are external.
All nonstandard phenomena we observed so far are based on the fact that
∗ℕ ≠ σℕ. The transfer principle implies, roughly speaking, that ∗ℕ plays in the
nonstandard universe the same role as the set ℕ plays in the standard universe.
But then the word “finite” should be interpreted differently in the nonstandard
universe:
Definition 6.4. A set A is called finite, if it is in a one-to-one correspondence with
a set {1, . . . , n} of natural numbers.
A set A is called Dedekind finite, if it is not in a one-to-one correspondence
with a proper subset A0 ⊊ A.
We recall that a countable form of the axiom of choice implies the following
well-known result (of the standard world). For the reader unfamiliar with such
results, we provide a (standard) proof:
Proposition 6.5. A set A is finite if and only if it is Dedekind finite.
Proof. If A is finite and A0 ⊊ A, then the cardinality of A0 is strictly smaller
than that of A. Since bijections preserve the cardinality, there is no bijection
f : A → A0. Conversely, if A is infinite, we may define inductively an injection
f : ℕ → A in the following way: Choose f(1) ∈ A arbitrary, and if f(1), . . . , f(n)
are already defined, choose f(n + 1) ∈ A such that f(n + 1) ∉ {f(1), . . . , f(n)}.
Such a value exists, since otherwise we have a bijection witnessing that A is finite.
Then we may define a bijection F : A → A0 where A0 = A \ {f(1)} by

F(x) = x          if x ∉ f(ℕ),
F(x) = f(n + 1)   if x = f(n).

Hence, A is not Dedekind finite.
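In the simplest case A = ℕ with f the identity (an illustrative choice on our part), the construction in the proof reduces to the shift map:

```latex
% Assumption: A = \mathbb{N} and f(n) = n, so f(\mathbb{N}) = A and A_0 = \mathbb{N} \setminus \{1\}.
\[
  F : \mathbb{N} \to \mathbb{N} \setminus \{1\}, \qquad F(n) = f(n+1) = n + 1 ,
\]
\[
  1 \mapsto 2, \quad 2 \mapsto 3, \quad 3 \mapsto 4, \quad \ldots
\]
```

For a general infinite A, the same shift is performed along the chosen sequence f(1), f(2), . . . while all other points of A stay fixed.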

In the nonstandard world, we have now:
Definition 6.6. An entity A ∈ ∗S is called ∗-finite or hyperfinite if there is some
h ∈ ∗ℕ and an internal bijection f : {1, . . . , h} → A. In this case, we define #A as
the hyperfinite number h; otherwise, we put #A := ∞.

Here and in the following, {1, . . . , h} is used as a shortcut for {x ∈ ∗ℕ : x ≤
h} if h ∈ ∗ℕ. Although this notation is intuitive, the reader should take care:
For h ∈ ℕ∞ the set {1, . . . , h} is even uncountable (Theorem 3.23). Moreover,
this set is not well-ordered: The set {1, . . . , h} ∩ ℕ∞ has no smallest element
by Theorem 5.12 (but internal nonempty subsets of {1, . . . , h} have a smallest
element by Theorem 5.12).
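The claim that {1, . . . , h} ∩ ℕ∞ has no smallest element can also be checked by hand. The following is a short sketch of that argument, where k is a hypothetical smallest element (note that k ≥ 2, since k is infinite):

```latex
% Assumption: k is the smallest element of \{1,\ldots,h\} \cap \mathbb{N}_{\infty}.
\[
  k \in \mathbb{N}_{\infty}
  \ \Longrightarrow\ k - 1 \in {}^{*}\mathbb{N}, \quad k - 1 < k \le h ,
\]
\[
  k - 1 \in {}^{\sigma}\mathbb{N}
  \ \Longrightarrow\ k = (k - 1) + 1 \in {}^{\sigma}\mathbb{N}
  \quad \mbox{(impossible)},
  \qquad \mbox{so } k - 1 \in \{1,\ldots,h\} \cap \mathbb{N}_{\infty} .
\]
```

Thus k − 1 is a smaller element of the same set, contradicting the choice of k.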
Note that if A is hyperfinite, then A = rng f is internal by Theorem 3.19.
The transfer principle implies that a standard entity ∗ A is ∗ -finite if and
only if A is finite (see Exercise 27 below). However, the crucial point in the above
definition is that this definition applies also for nonstandard (internal) sets.
For the rest of this section, we discuss the above notion. The following results
all sound rather natural. However, the proofs are surprisingly technical. The rea-
son is that sentences about internal sets cannot easily be formulated such that the
transfer principle can be applied, i.e. such that all constants are standard objects
(and not only internal objects): To do this, we always have to formulate the sen-
tences as sentences about objects which contain the given internal sets as elements.
In other words: We must consider objects of a higher type. It may be a good idea
if the reader works through the appendix parallel to this section. Alternatively,
the reader may also want to skip the proofs of this section at the first reading and
to consider the proofs after more experience.
We have to prove that #A is well-defined. To do so, we first show the following
lemma which we will need later:
Lemma 6.7. Let h1, h2 ∈ ∗ℕ, and f : {1, . . . , h1} → {1, . . . , h2} be internal. If f
is onto, then h1 ≥ h2. If f is one-to-one, then h1 ≤ h2.
Proof. By Exercise 8, we find an internal function F : ∗ℕ → ∗ℕ such that F(h) =
f(h) for any h ∈ {1, . . . , h1}.
For the first statement, consider the sentence
∀x ∈ ℕ^ℕ, n1, n2 ∈ ℕ : α(x, n1, n2) =⇒ n1 ≥ n2
where α(x, n1, n2) is a shortcut for
∀z2 ∈ ℕ : z2 ≤ n2 =⇒ (∃z1 ∈ ℕ : z1 ≤ n1 ∧ (z1, z2) ∈ x)
which may more intuitively be written as {1, . . . , n2} ⊆ x({1, . . . , n1}). Thus,
the sentence means that whenever {1, . . . , n2} ⊆ x({1, . . . , n1}) for some function
x : ℕ → ℕ, we must have n1 ≥ n2. This is evidently true. Hence, the ∗-transform
of this sentence is true. Since ∗(ℕ^ℕ) contains F by Theorem 3.21, we may conclude
that the relation {1, . . . , h2} ⊆ F({1, . . . , h1}) implies h1 ≥ h2.
For the second statement, consider analogously the sentence
∀x ∈ ℕ^ℕ, n ∈ ℕ : “x one-to-one on {1, . . . , n}” =⇒ ∃m ∈ ℕ : (m ≤ n ∧ x(m) ≥ n)
which is true since any injection f : {1, . . . , n} → ℕ attains at least one value
which is ≥ n. Hence, the ∗-transform of this sentence is true. Since ∗(ℕ^ℕ) contains F,
we may conclude that F({1, . . . , h1}) must contain some value h ≥ h1. In
view of F({1, . . . , h1}) = f({1, . . . , h1}) ⊆ {1, . . . , h2}, this implies h2 ≥ h1.
Proposition 6.8. #A is well-defined.
Proof. Let f1 : {1, . . . , h1} → A and f2 : {1, . . . , h2} → A be two internal bijections.
By Theorem 3.19, also f2⁻¹ is internal, and by Exercise 7 also the composition
g = f2⁻¹ ◦ f1 : {1, . . . , h1} → {1, . . . , h2}. Since g is a bijection, Lemma 6.7 now
implies h1 = h2 which means that #A is well-defined.
Of course, one expects from the term “∗-finite” that all finite sets are ∗-finite:
Proposition 6.9. If A ∈ ∗S is finite, then A is ∗-finite, and #A is the number of
elements.
Proof. Let a1, . . . , an be all elements of A. Put f := {(1, a1), . . . , (n, an)}. Then
f : {1, . . . , n} → A is the desired internal bijection.
It is a natural question whether we are led to a different definition of ∗-finite
sets, if we start with Dedekind finite sets. The answer is negative:
Call an internal entity A ∈ ∗S Dedekind ∗-finite if there is no internal bijection
f : A → A0 where A0 ⊊ A. Observe that in this definition we require both A
and f to be internal.
Theorem 6.10. An internal entity A is ∗-finite if and only if it is Dedekind ∗-finite.
Proof. Let α be the sentence
∃x ∈ A^A : “x is one-to-one” ∧ ∃y ∈ A : ∀z ∈ A : (z, y) ∉ x,
i.e. α is true if and only if there is a bijection f of A onto a proper subset A0 of A.
Let β be the sentence
∃x ∈ A^ℕ, n ∈ ℕ : “x maps {1, . . . , n} bijectively onto A”,
i.e. β is true if and only if A is finite. Then the sentence
(¬α) ⇐⇒ β
is true by Proposition 6.5. Since ∗(A^A) and ∗(A^ℕ) consist of all internal functions
f : ∗A → ∗A resp. f : ∗ℕ → ∗A, the ∗-transform of the above sentence means that
∗A is Dedekind ∗-finite if and only if there is an internal function f : ∗ℕ → ∗A
which maps {1, . . . , h} bijectively onto ∗A for some h ∈ ∗ℕ. Since any internal function
f : {1, . . . , h} → ∗A may be extended to a function f : ∗ℕ → ∗A by Exercise 8,
the claim follows.
One might also choose the following characterization as the definition of
∗-finite sets:
Theorem 6.11. An entity A ∈ ∗S is ∗-finite if and only if there is some entity
𝒜 ∈ S whose elements are finite entities such that A ∈ ∗𝒜.
Moreover, if 𝒜 ∈ S consists of finite entities and f : 𝒜 → ℕ is the mapping
which associates to each B ∈ 𝒜 its number of elements, then ∗f(A) = #A for
each A ∈ ∗𝒜.
It may be arranged that all elements of 𝒜 have the same type as A.
Proof. Assume that 𝒜 ∈ S is an entity whose elements are finite entities. Note
that U := ⋃𝒜 ∈ S by Theorem 2.1. Then the following sentence is true:
∀x ∈ 𝒜 : ∃y ∈ U^ℕ : ∃n ∈ ℕ : (α(y, n, x) ∧ n = f(x))
where α(y, n, x) is a shortcut for a transitively bounded sentence with the meaning
“y maps {1, . . . , n} bijectively onto x”: This sentence is a reformulation of the fact
that each B ∈ 𝒜 is finite and has f(B) elements. The transfer of this sentence
implies for any A ∈ ∗𝒜 that there is some y ∈ ∗(U^ℕ) (i.e. some internal function
y : ∗ℕ → ∗U by Theorem 3.21) and some z ∈ ∗ℕ such that y maps {1, . . . , z}
bijectively onto A and z = ∗f(A). Hence, any A ∈ ∗𝒜 is ∗-finite, and #A = z =
∗f(A).

Conversely, let A be ∗-finite, i.e. there is some h ∈ ∗ℕ and some internal
bijection f : {1, . . . , h} → A. Since A is an internal entity, we find by Corollary A.3
an entity ℬ ∈ S which consists only of entities such that A ∈ ∗ℬ (we may even
assume that all elements of ℬ have the same type as A). Put U := ⋃ℬ, and
observe that A ⊆ ∗U by Theorem A.4. Let 𝒜 ⊆ ℬ be the collection of all finite
entities B ∈ ℬ. Then the sentence
∀x ∈ ℬ : ((∃y ∈ U^ℕ, n ∈ ℕ : α(y, n, x)) =⇒ x ∈ 𝒜)
is true, where α is defined as before. The transfer of this sentence means that ∗𝒜
contains all elements A0 ∈ ∗ℬ for which we find an internal function y : ∗ℕ → ∗U
(Theorem 3.21) and some h ∈ ∗ℕ such that y maps {1, . . . , h} bijectively onto
A0. But A is such an element: Indeed, since A ⊆ ∗U, we may extend the given
function f to an internal function y : ∗ℕ → ∗U. Hence, we have A ∈ ∗𝒜.

If a set A is infinite, then there exists an injection f : ℕ → A (by a countable
form of the axiom of choice). We have an analogue in the nonstandard world:
Theorem 6.12. For each internal entity A precisely one of the following alternatives
holds:
1. A is ∗-finite, or
2. There is an internal injection f : ∗ℕ → A.

Proof. By Corollary A.3, we find some entity 𝒜 ∈ S whose elements are all entities
such that A ∈ ∗𝒜. Let U := ⋃𝒜. Then the transitively bounded sentence
∀x ∈ 𝒜 : ((∃y ∈ U^ℕ : α(y, x, U, ℕ)) ⇐⇒ ¬(∃y ∈ U^ℕ, n ∈ ℕ : β(y, n, x, U, ℕ)))
is true, where α(y, x, U, ℕ) is a shortcut of a transitively bounded formula with
the meaning “y maps ℕ injectively into x”, and β(y, n, x, U, ℕ) means similarly
“y maps {1, . . . , n} bijectively onto x”. The transfer of the above sentence reads:
∀x ∈ ∗𝒜 : ((∃y ∈ ∗(U^ℕ) : α(y, x, ∗U, ∗ℕ)) ∨̇ (∃y ∈ ∗(U^ℕ), n ∈ ∗ℕ : β(y, n, x, ∗U, ∗ℕ))).
By Theorem 3.21, ∗(U^ℕ) consists of all internal functions f : ∗ℕ → ∗U. For
the choice x = A ∈ ∗𝒜, the above sentence thus means in view of A ⊆ ∗U
(by transfer, because each element of 𝒜 is a subset of U): Precisely one of the following
alternatives holds: Either there is
an internal injection f : ∗ℕ → A, or there is some h ∈ ∗ℕ and an internal function
g : ∗ℕ → ∗U such that g : {1, . . . , h} → A is bijective. Since the restriction of such
an internal function g to {1, . . . , h} is internal and since conversely any internal
bijection g : {1, . . . , h} → A can be extended to an internal function g : ∗ℕ → ∗U
(both follow from Exercise 8), the statement follows.
Recall that we agreed to write #A = ∞ if A is not ∗-finite. In this connection,
we define ∞ > h for any h ∈ ∗ℕ.
Theorem 6.13. Let A ∈ ∗S be an internal entity.
1. If B ⊆ A is internal, then #B ≤ #A.
2. If B ⊊ A is internal and A is ∗-finite, then #B ≤ #A − 1.
3. If f : A → B is an internal injection, then #A ≤ #B.
4. If f : A → B is an internal surjection, then #A ≥ #B.
Proof. 1. If #A = ∞, there is nothing to prove. Thus, assume that A is ∗-finite.
This implies that B is ∗-finite: Otherwise, Theorem 6.12 would imply that there
is an internal injection f : ∗ℕ → B. But then f is also an internal injection from
∗ℕ into A which by Theorem 6.12 contradicts our assumption that A is ∗-finite.
Hence, we may assume that h := #A and k := #B both belong to ∗ℕ, and
that there are internal bijections f : {1, . . . , h} → A and g : {1, . . . , k} → B. Then
f⁻¹ is internal by Theorem 3.19, and thus G := f⁻¹ ◦ g : {1, . . . , k} → {1, . . . , h}
is an internal injection (Exercise 7). Lemma 6.7 implies h ≥ k, i.e. #A ≥ #B.
2. With G : {1, . . . , k} → {1, . . . , h} as in 1., the function G is not onto, i.e. there
is some j ∈ {1, . . . , h} with j ∉ rng G. Put C := {1, . . . , k}, and observe that the
sets C1 := {x ∈ C : G(x) < j} and C2 := {x ∈ C : G(x) > j} are internal by the
internal definition principle. Now define a function F : C → {1, . . . , h − 1} by
putting F(x) := G(x) for x ∈ C1 and F(x) := G(x) − 1 for x ∈ C2. In view of
Exercise 8, the function F is internal. Since F : {1, . . . , k} → {1, . . . , h − 1} is
one-to-one, Lemma 6.7 implies k ≤ h − 1, i.e. #B ≤ #A − 1.
3. In case #B = ∞, there is nothing to prove. Thus, assume #B < ∞. Since B0 :=
rng(f) is an internal subset of B, we find by 1. that h := #B0 ≤ #B. Hence, there
is an internal bijection g : {1, . . . , h} → B0. The bijection f⁻¹ ◦ g : {1, . . . , h} → A
is internal by Exercise 7 and Theorem 3.19. Hence, #A = h ≤ #B.
4. In case #A = ∞, there is nothing to prove. Thus, assume h := #A < ∞, i.e.
there is an internal bijection g : {1, . . . , h} → A. Assume first that B is ∗-finite.
Then k := #B < ∞, and there is an internal bijection g1 : {1, . . . , k} → B. Since
g1⁻¹ ◦ f ◦ g : {1, . . . , h} → {1, . . . , k} is an internal surjection, Lemma 6.7 implies
h ≥ k, i.e. #A ≥ #B.
To see that B must be ∗-finite, we apply Theorem 6.11: There is some 𝒜 ∈
S whose elements are finite entities such that A ∈ ∗𝒜. Since B is internal by
Theorem 3.19, we find some 𝒞 ∈ S such that B ∈ ∗𝒞 and such that 𝒞 contains
only entities (Corollary A.3). Let ℬ ⊆ 𝒞 be the collection of all finite entities
of 𝒞. Putting U := ⋃𝒜, V := ⋃𝒞, we have U, V ∈ S by Theorem 2.1. Let ℱ
denote the system of all functions f with dom(f) ⊆ U and rng(f) ⊆ V. Since
functions map finite sets into finite sets, we have
∀x ∈ ℱ : (∃y ∈ 𝒜 : dom(x) ⊆ y) =⇒ (∃z ∈ ℬ : rng(x) = z).
The reader should take care that the shortcuts dom(x) and rng(x) used here are
not transitively bounded, but we may take the quantifiers over the sets U and V.
Since ∗ℱ consists of all internal functions x with dom(x) ⊆ ∗U and rng(x) ⊆ ∗V
(Exercise 83) and thus in particular f ∈ ∗ℱ, we conclude from the ∗-transform
of the above sentence that B = rng(f) ∈ ∗ℬ (concerning ∗U and ∗V observe
Theorem A.4). Applying the converse direction of Theorem 6.11, this shows that
B is indeed ∗-finite, as claimed.
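The renumbering used in part 2 of the proof is easiest to follow in a finite instance. The following sketch uses made-up values h = 5, k = 3 and one particular injection G; it only illustrates how F closes the gap at j:

```latex
% Assumptions: h = 5, k = 3, G(1) = 4, G(2) = 1, G(3) = 5, and j = 2 \notin \mathrm{rng}\, G.
\[
  C_1 = \{ x : G(x) < 2 \} = \{2\}, \qquad
  C_2 = \{ x : G(x) > 2 \} = \{1, 3\},
\]
\[
  F(2) = G(2) = 1, \qquad F(1) = G(1) - 1 = 3, \qquad F(3) = G(3) - 1 = 4 ,
\]
\[
  F : \{1, 2, 3\} \to \{1, \ldots, 4\} \mbox{ is one-to-one, so } k = 3 \le 4 = h - 1 .
\]
```

In the hyperfinite case the same two-case definition is used, and it is the internal definition principle that guarantees that C1, C2, and hence F are internal.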
Exercise 23. (Difficult). Prove the following generalization of Theorem 6.13:
1. If f : A → B is an internal surjection, then there exists an internal injection
g : B → A.
2. If there exist internal injections f1 : A → B and f2 : B → A, then there exists
an internal bijection g : A → B.
Hint: Use that the above facts are true in the standard world, i.e. without the
term “internal”. In the second case, this fact (in the standard world) can be found
in the literature on set theory under the name Schröder-Bernstein theorem.
Theorem 6.14. If A and B are ∗-finite, then A ∪ B and A × B are ∗-finite and
satisfy
#(A ∪ B) = #A + #B if A ∩ B = ∅, (6.1)
#(A × B) = #A · #B. (6.2)
Also the system PA of all internal subsets of A is ∗-finite, and
#PA = 2^{#A}.
Proof. By Theorem 6.11, there are entities 𝒜, ℬ ∈ S consisting of finite entities
such that A ∈ ∗𝒜, B ∈ ∗ℬ. Let 𝒞, 𝒟, ℰ ∈ S denote the collection of all sets of
the form X × Y, X ∪ Y resp. P(X), with X ∈ 𝒜, Y ∈ ℬ. Note that
A × B ∈ ∗𝒞, A ∪ B ∈ ∗𝒟, PA ∈ ∗ℰ, (6.3)
as follows from Exercise 84, Corollary A.5, respectively Theorem A.8.
Put C := ⋃𝒞, D := ⋃𝒟, and E := ⋃ℰ. Let f1 : 𝒜 → ℕ and f2 : ℬ → ℕ
be the functions which associate to each set the number of elements. Then we have
∀x ∈ 𝒜, y ∈ ℬ, z ∈ 𝒞 : z = x × y =⇒ #z = f1(x)f2(y)
where the last expression is a shortcut for
∃w ∈ C^ℕ : “w maps {1, . . . , f1(x)f2(y)} bijectively onto z”.
The transfer of this sentence implies in view of (6.3) that there is some w ∈
∗(C^ℕ) (i.e. some internal function w : ∗ℕ → ∗C by Theorem 3.21) such that
w : {1, . . . , ∗f1(A) · ∗f2(B)} → z = A × B is bijective. Since ∗f1(A) = #A and
∗f2(B) = #B by Theorem 6.11, this proves (6.2).
Analogously, the transfer of the sentence
∀x ∈ 𝒜, y ∈ ℬ, z ∈ 𝒟 : z = x ∪ y =⇒ α
where α is a shortcut for
∃w ∈ D^ℕ, v ∈ ℕ : “w maps {1, . . . , v} bijectively onto z”
implies in view of (6.3) that A ∪ B is ∗-finite. Similarly, the transfer of the sentence
∀x ∈ 𝒜, y ∈ ℬ, z ∈ 𝒟 : (z = x ∪ y ∧ x ∩ y = ∅) =⇒ #z = f1(x) + f2(y)
where the last expression is a shortcut for
∃w ∈ D^ℕ : “w maps {1, . . . , f1(x) + f2(y)} bijectively onto z”
proves (6.1).
For the last statement, consider the sentence
∀x ∈ ℰ, y ∈ 𝒜 : (α(x, y) =⇒ β(x, y, E))
where α(x, y) and β(x, y, E) are shortcuts for
∀z ∈ x : z ⊆ y
and
∃w ∈ E^ℕ : “w maps {1, . . . , 2^{f1(y)}} bijectively onto x”.
Note that for x ∈ ℰ and y ∈ 𝒜 the statement α(x, y) can be interpreted as
x = P(y); now apply the ∗-transform of the above sentence with x = PA ∈ ∗ℰ
and y = A ∈ ∗𝒜.
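For the special case A = {1, . . . , h} and B = {1, . . . , k} with h, k ∈ ∗ℕ, formula (6.2) can be made explicit. The map v below is our own illustration and not taken from the text; it is internal by the internal definition principle, and it is a bijection by transfer of the corresponding statement for finite h and k:

```latex
% Assumptions: A = \{1,\ldots,h\}, B = \{1,\ldots,k\} with h, k \in {}^{*}\mathbb{N};
% v is a hypothetical name for the enumeration of A \times B row by row.
\[
  v : \{1,\ldots,h\} \times \{1,\ldots,k\} \to \{1,\ldots,hk\}, \qquad
  v(a, b) = k\,(a - 1) + b ,
\]
\[
  {}^{\#}(A \times B) = hk = {}^{\#}A \cdot {}^{\#}B .
\]
```

The inverse of v is then an internal bijection of {1, . . . , hk} onto A × B, as required by Definition 6.6.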
Exercise 24. Prove that for any ∗-finite entities A, B the formula
#(A ∪ B) = #A + #B − #(A ∩ B)
holds.
Given some ∗-finite sequence x : {1, . . . , h} → X, we denote by #_X(x) the
number h.
Proposition 6.15. Given some internal entity X, the system
X^{<∗ℕ} = {x | ∃n ∈ ∗ℕ : x : {1, . . . , n} → X is internal}
is internal. Moreover, the function #_X(·) is internal.
Proof. By Exercise 82, the set ℱ of all internal functions x with dom(x) ⊆ ∗ℕ
and rng(x) ⊆ X is internal. Hence,
X^{<∗ℕ} = {x ∈ ℱ | ∃n ∈ ∗ℕ : x : {1, . . . , n} → X is internal}
is internal by the internal definition principle. Similarly,
ϕ := {z | ∃x ∈ ℱ, n ∈ ∗ℕ : (x : {1, . . . , n} → X is internal ∧ z = (x, n))}
is internal; but ϕ = #_X(·).
We use the notation #_X(x) also for sequences in the standard world, i.e. if
x : {1, . . . , n} → X, then #_X(x) = n.
Exercise 25. Let X^{<ℕ} denote the set of all functions of the form x : {1, . . . , n} → X
where n ∈ ℕ may depend on x. Prove that ∗(X^{<ℕ}) = (∗X)^{<∗ℕ} and ∗(#_X(·)) =
#_{∗X}(·).
Let Σ : ℝ^{<ℕ} → ℝ be the mapping which associates to each finite sequence
its sum. Then ∗Σ : ∗(ℝ^{<ℕ}) → ∗ℝ. For any ∗-finite (internal) sequence
x ∈ (∗ℝ)^{<∗ℕ} of length h, we define
Σ_{n=1}^{h} x(n) := ∗Σ(x).
We use the more intuitive notation x1, . . . , xh with xn = x(n) in place of x.

Corollary 6.16. If x and y are ∗-finite sequences of length h in ∗ℝ with x_n ≤ y_n,
then
\[ \sum_{n=1}^{h} x_n \le \sum_{n=1}^{h} y_n. \]
Similarly, |x_n| ≤ y_n implies
\[ \Bigl| \sum_{n=1}^{h} x_n \Bigr| \le \sum_{n=1}^{h} y_n. \]
Moreover,
\[ \sum_{n=1}^{h} 1 = h. \]
Proof. The first statement follows in view of the previous results by the transfer
of
∀x, y ∈ ℝ^{<ℕ} : (∀n ∈ ℕ : (n ≤ #_ℝ(x) =⇒ x(n) ≤ y(n))) =⇒ Σ(x) ≤ Σ(y).
The proof of the other statements is similar.

Exercise 26. (Difficult). Let R be a totally ordered internal entity (the order re-
lation also being internal), and A be an internal system of ∗ -finite subsets of R.
Prove that the function max : A → R (with the obvious meaning) is well-defined
and internal.
Exercise 27. Let 𝒜 ∈ S be a nonempty entity which contains no atoms, and
c : 𝒜 → ℕ ∪ {∞} be the function which associates to each A ∈ 𝒜 the number
of its elements. We write ∞ := ∗∞. Prove that ∗c : ∗𝒜 → ∗ℕ ∪ {∞} satisfies
∗c(B) = #B for each B ∈ ∗𝒜. Moreover, prove that a standard entity B = ∗A is
∗-finite if and only if A is finite.
§ 7 Calculus
The basic calculus is very easily described by nonstandard methods. As remarked
in Section 1.1, this was historically one of the main motivations of nonstandard
analysis.
However, the use of nonstandard analysis has the drawback that even the
simplest results make use of the axiom of choice: Recall that without the axiom
of choice (more precisely: Without the existence of δ-free ultrafilters) we were not
able to construct nonstandard embeddings. We will see later that this restriction is
essential: Indeed, with nonstandard methods one can “construct” so-called Hahn-
Banach limits and also nonmeasurable functions, as we will see; without the axiom
of choice it is for fundamental reasons not possible to prove the existence of such
objects.
The above observation is an essential disadvantage since this means in the
author’s opinion that nonstandard analysis is not a good model for “real-world”
phenomena.
On the other hand, if one is particularly interested in such objects whose
existence can only be proved by the axiom of choice, nonstandard analysis is
a much more convenient tool than classical analysis. We will see this later in
particular in our discussion of Hahn-Banach limits.
7.1 Sequences
We first discuss real sequences. Recall that a sequence is a mapping x : ℕ → ℝ;
as usual, we write x_n instead of x(n). The essential point of nonstandard analysis
here is that x, as a mapping, has a ∗-transform ∗x : ∗ℕ → ∗ℝ. We will also write
∗x_n in place of ∗x(n). For n ∈ ℕ, we have ∗x_{∗n} = ∗(x_n) (Theorem 3.13), i.e.
the sequence ∗x_n may be identified with the sequence x_n for standard numbers.
However, for h ∈ ℕ∞, we get additional values of ∗x_h on “infinite” places. One
will suspect that these values have something to do with the limit of the sequence
x_n. This is indeed the case:
Theorem 7.1. Let x_n be a real sequence. Then we have for x ∈ ℝ:
1. x_n → x if and only if ∗x_h ≈ ∗x for each infinite h ∈ ℕ∞.
2. x_n has the accumulation point x if and only if ∗x_h ≈ ∗x for some infinite
h ∈ ℕ∞.
Proof. 1. If x_n → x, then for any ε ∈ ℝ+, we have
∀n ∈ ℕ : n ≥ n_0 =⇒ |x_n − x| < ε
for some n_0 ∈ ℕ. The transfer principle implies
∀n ∈ ∗ℕ : n ≥ ∗n_0 =⇒ |∗x_n − ∗x| < ∗ε.
In particular, |∗x_h − ∗x| ≤ ∗ε for any h ∈ ℕ∞. Since the latter holds for any
ε ∈ ℝ+, we have ∗x_h ≈ ∗x.
Conversely, if ∗x ≈ ∗x_h for each infinite h ∈ ℕ∞, then for any ε ∈ ℝ+
the internal formula |∗x_n − ∗x| < ∗ε holds true for each infinite n ∈ ℕ∞. The
permanence principle implies that there is some ∗n_0 ∈ σℕ such that |∗x_n − ∗x| < ∗ε
holds for all n ∈ ∗ℕ with n ≥ ∗n_0, i.e.
∀n ∈ ∗ℕ : n ≥ ∗n_0 =⇒ |∗x_n − ∗x| < ∗ε.
The reverse form of the transfer principle implies that |x_n − x| < ε for all n ∈ ℕ
with n ≥ n_0, and so x_n → x.
2. If x_n has the accumulation point x, then the transfer principle immediately
shows that
∀ε ∈ ∗ℝ+ : ∀n ∈ ∗ℕ : ∃m ∈ ∗ℕ : m ≥ n ∧ |∗x_m − ∗x| < ε.
Choosing ε ∈ inf(∗ℝ), ε > 0, and n ∈ ℕ∞, we thus find some m ∈ ℕ∞ with ∗x_m ≈ ∗x.
Conversely, if ∗x ≈ ∗x_h for some infinite h ∈ ℕ∞, then we have for each
ε ∈ ℝ+ and each n_0 ∈ ℕ that
∃n ∈ ∗ℕ : (n ≥ ∗n_0 ∧ |∗x_n − ∗x| < ∗ε).
Applying the converse direction of the transfer principle, we find
∃n ∈ ℕ : (n ≥ n_0 ∧ |x_n − x| < ε).
Since n_0 and ε were arbitrary, x is an accumulation point of x_n.
Also boundedness is easily characterized:
Theorem 7.2. Let x_n be a real sequence.
1. x_n is bounded if and only if ∗x_h is finite for each h ∈ ∗ℕ (or, equivalently,
for each h ∈ ℕ∞).
2. x_n → ±∞ if and only if ∗x_h is infinite and positive/negative for each infinite
h ∈ ℕ∞.
Proof. 1. If x_n is bounded, say |x_n| ≤ c ∈ ℝ, then we have by the transfer principle
∀n ∈ ∗ℕ : |∗x_n| ≤ ∗c,
i.e. ∗x_h is finite for each h ∈ ∗ℕ.
Conversely, if ∗x_h is finite for each h ∈ ∗ℕ, then the internal predicate
∀n ∈ ∗ℕ : |∗x_n| ≤ m
holds for each infinite m ∈ ℕ∞. The permanence principle implies that it also
holds for some finite m = ∗m_0 ∈ σℕ. An application of the converse direction of
the transfer principle implies that x_n is bounded by m_0.
2. If x_n → ∞, then we have for any N ∈ ℕ that there is some n_0 ∈ ℕ such that
(using the transfer principle)
∀n ∈ ∗ℕ : n ≥ ∗n_0 =⇒ ∗x_n > ∗N.
In particular, ∗x_h > ∗N for each infinite h ∈ ℕ∞. Since ∗x_h > ∗N for any N ∈ ℕ,
this means that ∗x_h is infinite and positive.
Conversely, if ∗x_h is infinite and positive for any h ∈ ℕ∞, then for any N ∈ ℕ
the internal formula ∗x_n > ∗N holds true for each infinite n ∈ ℕ∞. The permanence
principle implies that there is some ∗n_0 ∈ σℕ such that ∗x_n > ∗N holds for all n ∈ ∗ℕ
with n ≥ ∗n_0; in particular x_n > N for all sufficiently large n ∈ ℕ.
It may be slightly astonishing to the reader that there is some relation between
boundedness and finiteness. However, if one thinks of ∗x_h for infinite h ∈ ℕ∞
as “generalized accumulation points”, this is not surprising. This interpretation
indeed makes sense:
Corollary 7.3. Let x_n be a real sequence. Then its set of accumulation points is
{st(∗x_h) : h ∈ ℕ∞, ∗x_h finite}.
Proof. If ∗x_h is finite, then ∗(st(∗x_h)) ≈ ∗x_h and st(∗x_h) ∈ ℝ, and so Theorem 7.1
implies that st(∗x_h) is an accumulation point of x_n. Conversely, if x is an
accumulation point of x_n, then Theorem 7.1 implies ∗x ≈ ∗x_h for some
h ∈ ℕ∞, and so x = st(∗x_h).
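The following Python sketch is not part of the argument above; it only illustrates Corollary 7.3 under a loudly stated assumption: an infinite index h is imitated by an increasing index sequence k ↦ h(k), and st(∗x_h) is approximated by evaluating the sequence far along that index sequence. All names (`x`, `approx_st`, `even`, `odd`) are hypothetical and chosen for illustration only.

```python
def x(n):
    """Example sequence x_n = (-1)^n * (1 + 1/n); its accumulation points are -1 and 1."""
    return (-1) ** n * (1 + 1 / n)

def approx_st(seq, h, k=10**6):
    """Finite stand-in for st(*x_h): evaluate seq at the k-th entry of the index sequence h.

    Assumption: h is an increasing map from naturals to naturals that imitates an infinite index.
    """
    return seq(h(k))

def even(k):
    return 2 * k        # index sequence running through the even numbers

def odd(k):
    return 2 * k + 1    # index sequence running through the odd numbers

print(approx_st(x, even))  # a value close to 1
print(approx_st(x, odd))   # a value close to -1
```

Different index sequences pick out different accumulation points, which mirrors the statement that the accumulation points are exactly the standard parts of the finite values ∗x_h.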
Now we can give a simple nonstandard proof for a standard fact:

Corollary 7.4. If a bounded real sequence has at most one accumulation point, then
it converges.
Proof. If the sequence x_n is bounded, ∗x_h is finite for each h ∈ ℕ∞. Hence, st(∗x_h)
is an accumulation point of x_n by Theorem 7.1. The assumption thus implies that
x = st(∗x_h) ∈ ℝ is independent of h, i.e. ∗x_h ≈ ∗x for all h ∈ ℕ∞ which implies
x_n → x by Theorem 7.1.
The proof of the classical Bolzano-Weierstraß theorem is even simpler:

Corollary 7.5 (Bolzano-Weierstraß). Any bounded real sequence has an accumulation
point.
Proof. If the sequence x_n is bounded, ∗x_h is finite for each h ∈ ℕ∞, in particular
finite for some h ∈ ℕ∞. Then st(∗x_h) is an accumulation point of x_n.
We emphasize once more that despite the simplicity of the above proof, the
nonstandard method has in contrast to the standard method the disadvantage
that it relies on the axiom of choice.
Also for the well-known limit rules we have simple proofs:
Corollary 7.6. For real convergent sequences x_n → x and y_n → y, we have
x_n ± y_n → x ± y, x_n · y_n → x · y and x_n/y_n → x/y (if y_n, y ≠ 0).
Proof. For each h ∈ ℕ∞, we have ∗x_h ≈ ∗x and ∗y_h ≈ ∗y by Theorem 7.1,
and so ∗x_h ± ∗y_h ≈ ∗x ± ∗y by Proposition 5.17 which by Theorem 7.1 implies
x_n ± y_n → x ± y; the other statements are proved analogously.
Exercise 28. Prove that for any bounded real sequence
lim sup x_n = sup {st(∗x_h) : h ∈ ℕ∞}
and
lim inf x_n = inf {st(∗x_h) : h ∈ ℕ∞}.
What can be said if the sequence is unbounded?
Exercise 29. Find a real sequence x_n with more than one accumulation point such
that you are able to calculate ∗x_h for any h ∈ ℕ∞.
Interpret your example also for the map ∗ of Theorem 4.20.
Now we are in a position to prove that internal sets in nonstandard embed-
dings are either finite or have at least the cardinality of the continuum:
Proof of Theorem 3.23. Let ∗ : S → ∗ S be a nonstandard embedding. If S is
finite, then all sets in S are finite, and ∗ is a bijection (recall the remarks following
Corollary 3.11); in this case, all entities of ∗ S are finite, and the claim is trivial.
Thus, assume that S is infinite. Then it is no loss of generality to assume that
⊆ S (just rename the atoms). We prove first that {1, . . . , h} has the cardinality

of the continuum for each h ∈ ℕ∞. To this end, consider the real sequence defined
by x_{2^n+k} := k/2^n (k = 0, . . . , 2^n − 1), i.e. x_1 = x_2 = 0, x_3 = 1/2, x_4 = 0, x_5 = 1/4,
x_6 = 2/4, x_7 = 3/4, x_8 = 0, . . .. Then we have for any x ∈ ℝ, 0 ≤ x ≤ 1, that
∀m ∈ ℕ : ∃n ∈ ℕ : (n ≤ 2^{m+1} ∧ |x_n − x| ≤ 2^{−m}).
The transfer principle implies
∀m ∈ ∗ℕ : ∃n ∈ ∗ℕ : (n ≤ 2^{m+1} ∧ |∗x_n − ∗x| ≤ 2^{−m}). (7.1)
Observe now that, by the transfer principle,
∀n ∈ ∗ℕ : (n ≥ 4 =⇒ ∃m ∈ ∗ℕ : 2^{m+1} ≤ n < 2^{m+2}).
Hence, since h ∈ ℕ∞, we find some m ∈ ∗ℕ such that 2^{m+1} ≤ h < 2^{m+2}. We
cannot have m ∈ σℕ, since this would imply h ∈ σℕ in view of Proposition 5.9.
Thus, m ∈ ℕ∞, and so 2^{−m} ≈ 0 (apply e.g. Theorem 7.1), i.e. 2^{−m} ∈ inf(∗ℝ).
By (7.1), we find for each x ∈ ℝ, 0 ≤ x ≤ 1, some n(x) ∈ ∗ℕ with n(x) ≤ 2^{m+1} ≤ h
and |∗x_{n(x)} − ∗x| ≤ 2^{−m} ∈ inf(∗ℝ), and so ∗x_{n(x)} ≈ ∗x. By the axiom of choice,
we may assume that n : [0, 1] → {1, . . . , h} is a map. Moreover, n is one-to-one,
since for real numbers s, t ∈ [0, 1] with n(s) = n(t), we have ∗s ≈ ∗x_{n(s)} =
∗x_{n(t)} ≈ ∗t, i.e. ∗s ≈ ∗t which for real numbers implies s = t. We thus
have established for any h ∈ ℕ∞ an injection n : [0, 1] → {1, . . . , h}, and so any
{1, . . . , h} (h ∈ ℕ∞) has at least the cardinality of the continuum.
Now let A be an internal entity. Theorem 6.12 implies that we either find an
internal injection f : ∗ℕ → A or an internal bijection f : {1, . . . , h} → A where
h ∈ ∗ℕ. Thus, we find either an internal bijection f : {1, . . . , n} → A for some
n ∈ σℕ in which case A is finite, or an internal injection f : {1, . . . , h} → A
for some h ∈ ℕ∞. The latter implies that A has at least the cardinality of the
continuum by what we proved above.
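The dyadic sequence used in the proof above can be checked numerically in the standard world. The Python sketch below is an illustration only, with hypothetical helper names `dyadic_term` and `close_index`; it computes x_n and exhibits, for a rational x in [0, 1] and a given m, an index n ≤ 2^{m+1} with |x_n − x| ≤ 2^{−m}, which is the standard statement that is transferred to obtain (7.1).

```python
from fractions import Fraction

def dyadic_term(n):
    """x_n for n >= 1, where x_{2^j + k} = k / 2^j and 0 <= k < 2^j."""
    j = n.bit_length() - 1            # the unique j with 2^j <= n < 2^(j+1)
    k = n - (1 << j)
    return Fraction(k, 1 << j)

def close_index(x, m):
    """Some n <= 2^(m+1) with |x_n - x| <= 2^(-m), for a rational x in [0, 1]."""
    k = min(int(x * 2**m), 2**m - 1)  # grid point k / 2^m at or just below x
    return 2**m + k

# First terms of the sequence, then one approximation on level m = 5.
print([str(dyadic_term(n)) for n in range(1, 9)])
n = close_index(Fraction(1, 3), 5)
print(n, dyadic_term(n))
```

Exact rational arithmetic is used so that the inequality can be tested without rounding error.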
Recall that a sequence x_n ∈ ℝ is called a Cauchy sequence, if for each ε > 0
there is some n_0 such that |x_n − x_m| < ε for n, m ≥ n_0.
Exercise 30. Prove, without using the fact that Cauchy sequences converge, that a
real sequence x_n is a Cauchy sequence if and only if ∗x_h ≈ ∗x_k for each h, k ∈ ℕ∞.
With Exercise 30, we find another nonstandard proof of a well-known stan-
dard fact:
Corollary 7.7. A real sequence converges if and only if it is a Cauchy sequence
(i.e., ℝ is complete).
Proof. If x_n → x converges, then ∗x_h ≈ ∗x ≈ ∗x_k for each h, k ∈ ℕ∞ by Theorem 7.1.
Conversely, if x_n is a Cauchy sequence, then x_n is bounded, and so ∗x_h
is finite for any h ∈ ℕ∞ by Theorem 7.2. Put x := st(∗x_h) for some h ∈ ℕ∞.
Then Exercise 30 implies ∗x_k ≈ ∗x_h ≈ ∗x for each k ∈ ℕ∞, and it follows from
Theorem 7.1 that x_n → x.
It is known that the completeness of ℝ is equivalent to its Dedekind completeness:
We used the Dedekind completeness of ℝ in the previous proof implicitly
when we made use of the function st (recall the proof of Theorem 5.19).
7.2 Sets
While the fact that boundedness (in the classical sense) and finiteness (in the
nonstandard sense) are related for sequences is rather intuitive, the reader may be
surprised to see the same relation for sets:
Theorem 7.8. A set A ⊆ ℝ is bounded if and only if ∗A contains only finite
elements, i.e. if and only if
∗A ⊆ fin(∗ℝ). (7.2)
More precisely, A is unbounded from above if and only if ∗A contains an infinite
positive element, and A is unbounded from below if and only if ∗A contains an
infinite negative element.
Proof. If A ⊆ ℝ is bounded from above, we find some c ∈ ℝ+ with
∀x ∈ A : x ≤ c.
The transfer reads ∀x ∈ ∗A : x ≤ ∗c which implies that all elements of ∗A are
either finite or negative. Conversely, if each x ∈ ∗A is either finite or negative, then
any sequence x_n ∈ A is bounded from above. Indeed, otherwise there were some
positive infinite ∗x_h by Theorem 7.2. The transfer of the sentence ∀n ∈ ℕ : x_n ∈ A
implies in particular that ∗x_h ∈ ∗A, i.e. ∗A contains an infinite positive element.
Thus the map A → ∗ A reflects the boundedness of A by joining some positive
resp. negative infinite elements to ∗ A if A is unbounded. It is not very surprising
that it also reflects the local properties of A. In particular, we have:
Theorem 7.9. A set A ⊆ ℝ is closed if and only if each finite point of ∗A is
infinitely close to some (standard) point of σA, i.e. if and only if
st(∗A ∩ fin(∗ℝ)) = A. (7.3)
Proof. Let A be closed, and x ∈ ∗A be finite. We claim that st(x) ∈ A. Indeed,
put y := st(x). For each ε ∈ ℝ+, we have
∃z ∈ ∗A : |z − ∗y| < ∗ε,
because ∗A ∋ x ≈ ∗y. The converse form of the transfer principle implies that we
find some z ∈ A with |z − y| < ε. Since ε ∈ ℝ+ was arbitrary and A is closed, this
implies y ∈ A, as claimed.
Conversely, if each finite point of ∗A is infinitely close to some point from
σA, and x_n ∈ A is a sequence with x_n → x, we have x ∈ A: Indeed, the transfer
principle implies ∗x_h ∈ ∗A for each h ∈ ∗ℕ, and by Theorem 7.1, we have ∗x_h ≈ ∗x
for each h ∈ ℕ∞. Fix such an h. Since ∗x_h is a finite point of ∗A, it is by assumption
infinitely close to some standard point of σA. Since ∗x is itself standard and
∗x ≈ ∗x_h, it must be that standard point of σA, i.e. ∗x ∈ σA, and so x ∈ A.
Hence, A is closed.
We call a set A ⊆ ℝ compact if A is closed and bounded. The previous results
imply the following consequence:
Corollary 7.10. A set A ⊆ ℝ is compact if and only if each point from ∗A is
infinitely close to some (standard) point of σA.
Proof. If A is compact, then a combination of (7.2) and (7.3) shows that st(∗A) = A,
i.e. each point of ∗A is infinitely close to some point of σA. Conversely, if each
x ∈ ∗A is infinitely close to some standard point of σA, then
A is closed by Theorem 7.9, and each x ∈ ∗A is finite, whence A is bounded by
Theorem 7.8.
We will see later another (deeper) reason why Corollary 7.10 is true. For
later applications, Corollary 7.10 is one of the most essential tools. In fact, all
results which are typically proved by the Heine-Borel compactness criterion (a set
is compact if and only if each open covering has a finite subcovering) can usually
more easily be proved by an application of Corollary 7.10.
Exercise 31. Prove that a point x ∈ A is an interior point of A if and only if the
relation y ≈ ∗x implies y ∈ ∗A, i.e. if mon(x) ⊆ ∗A. Thus, A is open if and only if
⋃_{x∈A} mon(x) ⊆ ∗A.
Exercise 32. Give a standard characterization of those sets A ⊆ ℝ with the property
that the relations x ∈ ∗A and y ≈ x imply y ∈ ∗A.
Why does the answer not contradict Exercise 31?
Exercise 33. Give a standard characterization of those sets A ⊆ ℝ satisfying
∗A = ⋃_{x∈A} mon(x).
The following exercises are easier to prove if one makes use of Theorem 7.12
below. However, the author recommends solving them now (without appealing to
Theorem 7.12).
Exercise 34. Give a standard characterization of those sets A ⊆ ℝ with the property
that each finite point x ∈ ∗A satisfies x = ∗(st(x)).
Exercise 35. Recall that a set A ⊆ ℝ is called perfect, if each point x ∈ A is an
accumulation point of A \ {x}. Give a nonstandard characterization of perfect sets.
Theorem 7.11. A point x ∈ ℝ belongs to the closure of some set A ⊆ ℝ if and
only if x is the standard part of some point from ∗A, i.e. if and only if x ∈ st(∗A).
Proof. If x belongs to the closure of A, then there is a sequence x_n ∈ A with
x_n → x. Then ∗x_h ≈ ∗x for some h ∈ ℕ∞ by Theorem 7.1. Since the transfer
principle implies ∗x_h ∈ ∗A and since x = st(∗x_h), we have the required
representation of x. Conversely, if x = st(y) for some y ∈ ∗A, then y ∈ ∗Ā, where Ā
denotes the closure of A, since the transfer principle implies ∗A ⊆ ∗Ā. By
Theorem 7.9, y is infinitely close to some standard point of σĀ. Since ∗x is such a
standard point, we must have ∗x ∈ σĀ, i.e. x ∈ Ā.
Theorem 7.12. Let A ⊆ ℝ. A point x ∈ A is isolated if and only if ∗A contains
no point y ≠ ∗x with y ≈ ∗x.
Proof. If x is isolated, there is some ε ∈ ℝ+ such that
∀y ∈ A : (y ≠ x =⇒ |y − x| > ε).
The transfer implies that all points y ∈ ∗A \ {∗x} satisfy |y − ∗x| > ∗ε, and so
y ≉ ∗x.
Conversely, if x is not isolated, there is a sequence x_n ∈ A, x_n ≠ x with
x_n → x. We have ∗x_h ≈ ∗x for some h ∈ ℕ∞ by Theorem 7.1, and the transfer
principle implies ∗x_h ∈ ∗A and ∗x_h ≠ ∗x. Hence, y = ∗x_h ∈ ∗A satisfies y ≠ ∗x and
y ≈ ∗x.
7.3 Functions
Throughout, we consider functions f : D → ℝ where D ⊆ ℝ. Then ∗f defines
a function ∗f : ∗D → ∗ℝ which extends f. One might expect that ∗f reflects
properties like continuity of f in nonstandard terms. This is indeed true.
Theorem 7.13. Let x_0 be an accumulation point of D. Then for c ∈ ℝ the following
statements are equivalent:
1. lim_{x→x_0, x∈D} f(x) = c.
2. For any x ∈ ∗D with ∗x_0 ≠ x ≈ ∗x_0 we have ∗f(x) ≈ ∗c.
Proof. Let f(x) → c as x → x_0. For any ε ∈ ℝ+, we find some δ ∈ ℝ+ with
∀x ∈ D : (0 < |x − x_0| < δ =⇒ |f(x) − c| < ε).
The transfer principle implies
∀x ∈ ∗D : (0 < |x − ∗x_0| < ∗δ =⇒ |∗f(x) − ∗c| < ∗ε).
In particular, |∗f(x) − ∗c| < ∗ε whenever ∗x_0 ≠ x ≈ ∗x_0. Since this holds for all
ε ∈ ℝ+, we even have ∗f(x) ≈ ∗c.
Conversely, if ∗x_0 ≠ x ≈ ∗x_0 implies ∗f(x) ≈ ∗c, let ε ∈ ℝ+ be given. Then
the following internal predicate is true for all infinitesimal d ∈ inf(∗ℝ), d > 0:
∀x ∈ ∗D : (0 < |x − ∗x_0| < d =⇒ |∗f(x) − ∗c| < ∗ε).
By the permanence principle for ∗ℝ (Cauchy principle), the predicate then also
holds for some d = ∗δ where δ ∈ ℝ+. The inverse direction of the transfer principle
implies
∀x ∈ D : (0 < |x − x_0| < δ =⇒ |f(x) − c| < ε).
But this means that f(x) → c as x → x_0.
Corollary 7.14. f : D → ℝ is continuous at x_0 ∈ D if and only if the relation
x ∈ ∗D, x ≈ ∗x_0, implies ∗f(x) ≈ ∗(f(x_0)) = ∗f(∗x_0).
Proof. If x_0 ∈ D is not isolated, then f is continuous at x_0 if and only if
lim_{x→x_0, x∈D} f(x) = f(x_0). Thus, the statement follows from Theorem 7.13.
If x_0 ∈ D is isolated, any function f : D → ℝ is continuous at x_0. Moreover,
by Theorem 7.12 the only point x ∈ ∗D which satisfies x ≈ ∗x_0 is x = ∗x_0, and so
∗f(x) ≈ ∗f(∗x_0) is always satisfied.
As an application, let us give a simple proof of the following fact whose proof
is much more complicated by standard methods (recall that we defined compact
subsets of ℝ simply as the closed and bounded subsets):
Corollary 7.15. If D ⊆ ℝ is compact and f : D → ℝ is continuous, then f(D) is
compact.
Proof. Put B := f(D). By Corollary 7.10, we have to prove that each point y ∈ ∗B
is infinitely close to some standard point of σB. Thus, let y ∈ ∗B be given. Since
∗f : ∗D → ∗B is onto (Theorem 3.13), we have y = ∗f(x) for some x ∈ ∗D. Since
D is compact, Corollary 7.10 implies that x is infinitely close to some point ∗x_0
with x_0 ∈ D. Since f is continuous at x_0 and x ≈ ∗x_0, Corollary 7.14 implies
y = ∗f(x) ≈ ∗f(∗x_0) = ∗(f(x_0)) ∈ σB, as desired.
As a further application, we prove that continuous functions map intervals

into intervals:
Corollary 7.16 (Intermediate Value Theorem). If f : [a, b] → ℝ is continuous,
then f attains all values between f(a) and f(b).
Proof. Without loss of generality, let f(a) < c < f(b), and we have to prove that
c ∈ B := f([a, b]). Choose h ∈ ℕ∞, and let x_n := ∗a + n(∗b − ∗a)/h (n = 0, . . . , h)
be an infinite equidistant partition of ∗[a, b]. Let n_0 ∈ ∗ℕ be the first index with
∗f(x_{n_0}) > ∗c, i.e. ∗f(x_{n_0−1}) ≤ ∗c. By Corollary 7.10, the point x_{n_0} is infinitely
close to some standard point from σ[a, b], i.e. x_{n_0} ≈ ∗x for some x ∈ [a, b]. Since
∗x ≈ x_{n_0} ≈ x_{n_0−1} and since f is continuous at x, we have ∗f(∗x) ≈ ∗f(x_{n_0}) > ∗c
and ∗f(∗x) ≈ ∗f(x_{n_0−1}) ≤ ∗c which implies ∗f(∗x) ≈ ∗c and so f(x) = c (since
all points are standard points).
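The partition argument of the proof can be imitated with a large finite h. The following Python sketch is a finite stand-in, not the nonstandard argument itself: with finite h one only gets an approximation of the point x, whereas the proof takes h infinite and passes to the standard part. The function name `first_crossing` and the test function are assumptions made for this illustration.

```python
import math

def first_crossing(f, a, b, c, h):
    """Return (x_{n0-1}, x_{n0}) for the first index n0 with f(x_{n0}) > c
    on the equidistant partition x_n = a + n*(b - a)/h, n = 0, ..., h.

    Assumes f(a) < c < f(b), so that 1 <= n0 <= h.
    """
    for n in range(h + 1):
        x_n = a + n * (b - a) / h
        if f(x_n) > c:
            return a + (n - 1) * (b - a) / h, x_n
    raise ValueError("no index with f(x_n) > c; is f(b) > c?")

# Example: f(x) = x^2 on [0, 2] and c = 2; the two neighbours enclose sqrt(2).
left, right = first_crossing(lambda x: x * x, 0.0, 2.0, 2.0, 10**5)
print(left, right, math.sqrt(2))
```

The two returned neighbours satisfy f(left) ≤ c < f(right) and differ by (b − a)/h, which is the finite counterpart of x_{n_0−1} ≈ x_{n_0}.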
Exercise 36. In the previous proof, we used that there is a first index n_0 ∈ ∗ℕ
with ∗f(x_{n_0}) > ∗c. Why does such an index exist?
Theorem 7.17. A function f : D → ℝ is uniformly continuous if and only if the
relations x, y ∈ ∗D and x ≈ y imply ∗f(x) ≈ ∗f(y).
Proof. Let f : D → ℝ be uniformly continuous. For any ε ∈ ℝ+, we find some
δ ∈ ℝ+ such that, in view of the transfer principle,
∀x, y ∈ ∗D : (|x − y| < ∗δ =⇒ |∗f(x) − ∗f(y)| < ∗ε).
In particular, the relation x ≈ y for hyperreal numbers x, y ∈ ∗D implies
|∗f(x) − ∗f(y)| < ∗ε. Since this holds for any ε ∈ ℝ+, we even have ∗f(x) ≈ ∗f(y).
Conversely, if x ≈ y implies ∗f(x) ≈ ∗f(y), then we have for any ε ∈ ℝ+
that the internal predicate
∀x, y ∈ ∗D : (|x − y| < c =⇒ |∗f(x) − ∗f(y)| < ∗ε)
holds for any infinitesimal c ∈ inf(∗ℝ), c > 0. By the Cauchy principle, this
predicate holds also for some c = ∗δ with δ ∈ ℝ+. The converse direction of the
transfer principle now shows that the relation |x − y| < δ for x, y ∈ D implies
|f(x) − f(y)| < ε. Hence, f is uniformly continuous.
Theorem 7.17 might appear strange at first glance, because it is not clear how
the uniformity comes into play, compared to e.g. Corollary 7.14: The only difference
to the characterization of continuous functions by Corollary 7.14 is that we want
the relation ∗f(x) ≈ ∗f(y) for x ≈ y even if y is a nonstandard point. In this
sense, Theorem 7.17 is a “local” (nonstandard) characterization
of uniform continuity, which is somewhat paradoxical.
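A concrete case may help to see where nonstandard points matter. The function f(x) = x² on ℝ is continuous but not uniformly continuous: for x = h and y = h + 1/h one has f(y) − f(x) = 2 + 1/h², so for infinite h the points x ≈ y are infinitely close while ∗f(x) and ∗f(y) are not. The Python sketch below (the helper name `gap` is hypothetical) only checks this algebraic identity for finite h with exact rational arithmetic; the statement for infinite h is then a matter of the transfer principle.

```python
from fractions import Fraction

def gap(f, h):
    """Return (y - x, f(y) - f(x)) for x = h and y = h + 1/h."""
    x = Fraction(h)
    y = x + Fraction(1, h)
    return y - x, f(y) - f(x)

def square(t):
    return t * t

for h in (10, 10**3, 10**6):
    dx, df = gap(square, h)
    print(float(dx), float(df))   # dx shrinks, df stays above 2
```

The distance y − x tends to 0 as h grows, but the difference of the function values never drops below 2.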
Employing the above paradox, we get a simple proof for another well-known
standard result:
Corollary 7.18. If D ⊆ ℝ is compact, then any continuous f : D → ℝ is uniformly
continuous.
Proof. Let x, y ∈ ∗D with x ≈ y. By Theorem 7.17, we have to prove that ∗f(x) ≈
∗f(y). But since D is compact, we find by Corollary 7.10 some x_0 ∈ D with
x ≈ ∗x_0. Since x, y ≈ ∗x_0, Corollary 7.14 implies ∗f(x) ≈ ∗(f(x_0)) ≈ ∗f(y), as
claimed.
We now come to the real calculus:
Theorem 7.19. Let x_0 ∈ D be an accumulation point of D ⊆ ℝ. Then f : D → ℝ
is differentiable in x_0 with derivative c ∈ ℝ if and only if for each x ∈ ∗D with
x ≈ ∗x_0 and x ≠ ∗x_0 the relation
\[ \frac{{}^*f(x) - {}^*f({}^*x_0)}{x - {}^*x_0} \approx {}^*c \]
holds. Equivalently: For each 0 ≠ dx ≈ 0 with ∗x_0 + dx ∈ ∗D the relation
\[ \frac{{}^*f({}^*x_0 + dx) - {}^*f({}^*x_0)}{dx} \approx {}^*c \tag{7.4} \]
holds.
Proof. Put g(x) := (f(x) − f(x_0))/(x − x_0). Then we have ∗g(x) =
(∗f(x) − ∗(f(x_0)))/(x − ∗x_0) (why?). Now the first statement follows by
Theorem 7.13. The second statement follows by putting dx := x − ∗x_0 resp.
x := ∗x_0 + dx.
Theorem 7.19 implies that we can really do calculus with infinitesimals in

the sense of Leibniz by just dropping infinitesimal terms:
Example 7.20. Let us determine the derivative of the function f(x) = x^2. For
x ∈ ∗ℝ and dx ∈ inf(∗ℝ), dx ≠ 0, we have
\[ \frac{{}^*f(x + dx) - {}^*f(x)}{dx} = \frac{x^2 + 2x\,dx + dx^2 - x^2}{dx} = 2x + dx \approx 2x, \]
where the “loss” of the infinitesimal dx is justified, because we wrote “≈” instead
of “=”. A comparison with (7.4) for x = ∗x_0 now shows that f is differentiable in
x_0 with derivative 2x_0.
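The bookkeeping of Example 7.20, expanding in dx and dropping the higher-order term, can be imitated on a computer with dual numbers a + b·dx subject to dx² = 0. This is a swapped-in technique and only an algebraic analogy: dual numbers are not hyperreals (dx is not invertible there), and the class `Dual` and the function `derivative` below are hypothetical names introduced for this sketch.

```python
class Dual:
    """Number a + b*dx with the rule dx*dx = 0 (a first-order stand-in, not a hyperreal)."""

    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b dx)(c + d dx) = ac + (ad + bc) dx, because the dx^2 term is dropped
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__

def derivative(f, x0):
    """Coefficient of dx in f(x0 + dx), for f built from + and *."""
    return f(Dual(x0, 1.0)).b

print(derivative(lambda x: x * x, 3.0))  # prints 6.0
```

For f(x) = x² the coefficient of dx in (x_0 + dx)² is 2x_0, in agreement with the example.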
Corollary 7.21. If f is differentiable at x0 , then f is continuous at x0 .
Proof. For x ≈ ∗x_0, we have ∗f(x) − ∗f(∗x_0) ≈ (x − ∗x_0)∗c ≈ 0, and so ∗f(x) ≈
∗f(∗x_0).
Exercise 37. Prove by nonstandard methods that f (x) = |x| is not differentiable
at 0.
We get now a clear proof for the chain rule of the calculus. As usual, we write
f ′ (x0 ) for the derivative of f at x0 (if it exists).
Corollary 7.22 (Chain rule). We have for real functions f, g:
(f ◦ g)′ (x0 ) = f ′ (g(x0 ))g ′ (x0 ),
provided the derivatives on the right-hand side exist.
Proof. Put F(x) := f(g(x)). Given dx ≈ 0 with dx ≠ 0, we define
dg := ∗g(∗x_0 + dx) − ∗g(∗x_0) and df := ∗F(∗x_0 + dx) − ∗F(∗x_0) =
∗f(∗g(∗x_0) + dg) − ∗f(∗g(∗x_0)). We have by Theorem 7.19
\[ \frac{dg}{dx} \approx {}^*(g'(x_0)), \]
in particular also dg ≈ ∗(g′(x_0)) dx ≈ 0. Thus, in case dg ≠ 0, Theorem 7.19 implies
\[ \frac{df}{dg} \approx {}^*(f'(g(x_0))), \]
which in view of the above formula shows that
\[ \frac{df}{dx} = \frac{df}{dg} \cdot \frac{dg}{dx} \approx {}^*(f'(g(x_0))\,g'(x_0)). \]
If dg = 0, we have df = 0, and so the previous formula holds also (because
∗(g′(x_0)) ≈ dg/dx = 0 and df/dx = 0). Now the statement follows by Theorem 7.19.
Thus, essentially, the chain rule follows by just multiplying numerator and
denominator of df/dx by dg: The crucial point here is that we may in fact calculate
with the infinitesimals df, dx, and dg as if they were real numbers. As for real
numbers, one only has to take care of the special case dg = 0.
Exercise 38. Give a nonstandard proof (as intuitive as possible) of the product
formula
(f · g)′ (x0 ) = f ′ (x0 )g(x0 ) + f (x0 )g ′ (x0 ).
Exercise 39. Let f : [a, b] → ℝ be differentiable on (a, b) with derivative f′. Derive
from the mean value theorem that for each x, y ∈ ∗ℝ with ∗a ≤ x < y ≤ ∗b there
is some ξ ∈ ∗ℝ, x < ξ < y such that
\[ \frac{{}^*f(x) - {}^*f(y)}{x - y} = {}^*f'(\xi). \]
We now turn to the integral:

Theorem 7.23. If f : [a, b] → ℝ is Riemann-Stieltjes integrable with respect to
some function ϕ : [a, b] → ℝ, then the integral may be infinitely closely
approximated by a Riemann-Stieltjes sum to an infinitely fine internal partition, i.e.
\[ {}^*\Bigl( \int_a^b f(x)\,d\varphi(x) \Bigr) \approx \sum_{n=1}^{h} {}^*f(x_{n-1})\bigl({}^*\varphi(x_n) - {}^*\varphi(x_{n-1})\bigr) \tag{7.5} \]
where x_0 = ∗a, x_h = ∗b, and 0 < x_n − x_{n−1} < δ for n = 1, . . . , h where 0 <
δ ∈ inf(∗ℝ), and the ∗-finite sequence x_n is internal. Conversely, if the right-hand
side of (7.5) is finite and has the same standard part for all infinitely fine internal
partitions, then f is Riemann-Stieltjes integrable with respect to ϕ.
Proof. If f is Riemann-Stieltjes integrable with integral c, then we find for any
ε ∈ ℝ+ some δ ∈ ℝ+ such that
\[ \forall x \in \mathbb{R}^{\mathbb{N}} : \forall n \in \mathbb{N} : \alpha(x, n, \delta) \Longrightarrow \Bigl| c - \sum_{k=1}^{n} f(x_{k-1})\bigl(\varphi(x_k) - \varphi(x_{k-1})\bigr) \Bigr| < \varepsilon, \]
where α(x, n, δ) is a shortcut for
x_0 = a ∧ x_n = b ∧ ∀k ∈ ℕ : (0 < k ≤ n =⇒ 0 < x_k − x_{k−1} < δ).
Now we apply the transfer principle, observing that ∗(ℝ^ℕ) consists of all internal
sequences, and that any ∗-finite partition may be extended to such a sequence.
We thus find that for all infinitely fine internal partitions x_0, . . . , x_h, the relation
\[ \Bigl| {}^*c - \sum_{n=1}^{h} {}^*f(x_{n-1})\bigl({}^*\varphi(x_n) - {}^*\varphi(x_{n-1})\bigr) \Bigr| < {}^*\varepsilon \]
holds. Since ε ∈ ℝ+ was arbitrary, we have
\[ \sum_{n=1}^{h} {}^*f(x_{n-1})\bigl({}^*\varphi(x_n) - {}^*\varphi(x_{n-1})\bigr) \approx {}^*c, \]
as claimed.
For the second statement, assume that the right-hand side of (7.5) is infinitely
close to ∗c for some c ∈ ℝ whenever x_n is an infinitely fine internal partition. Then
we have for any ε ∈ ℝ+ that the internal predicate
\[ \forall x \in {}^*(\mathbb{R}^{\mathbb{N}}) : \forall n \in {}^*\mathbb{N} : {}^*\alpha(x, n, z) \Longrightarrow \Bigl| {}^*c - \sum_{k=1}^{n} {}^*f(x_{k-1})\bigl({}^*\varphi(x_k) - {}^*\varphi(x_{k-1})\bigr) \Bigr| < {}^*\varepsilon \]
holds for any z ∈ inf(∗ℝ), z > 0. By the permanence principle (Cauchy principle),
the above internal predicate holds for some z = ∗δ, δ ∈ ℝ+. Then the inverse
direction of the transfer principle implies that the Riemann-Stieltjes sum for
any finite δ-fine partition differs from c by less than ε. Hence, f is Riemann-Stieltjes
integrable.
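The left-endpoint sums on the right-hand side of (7.5) are easy to evaluate for a finite partition. The Python sketch below is a standard-world illustration with a finite number h of subintervals, an assumption that replaces the infinitely fine partition of the theorem; the function name `rs_sum` and the choice f(x) = x, ϕ(x) = x² on [0, 1] are made up for this example, where the integral equals ∫ 2x² dx = 2/3.

```python
def rs_sum(f, phi, a, b, h):
    """Left-endpoint Riemann-Stieltjes sum of f with respect to phi on the
    equidistant partition x_n = a + n*(b - a)/h, n = 0, ..., h."""
    xs = [a + n * (b - a) / h for n in range(h + 1)]
    return sum(f(xs[n - 1]) * (phi(xs[n]) - phi(xs[n - 1])) for n in range(1, h + 1))

# f(x) = x and phi(x) = x^2 on [0, 1]; the sums approach 2/3 as h grows.
for h in (10, 10**2, 10**4):
    print(h, rs_sum(lambda x: x, lambda x: x * x, 0.0, 1.0, h))
```

As h grows, the sums settle near 2/3; for infinite h the theorem states that the ∗-finite sum is infinitely close to the integral.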
Exercise 40. If f : [a, b] → ℝ is continuous and ϕ : [a, b] → ℝ is monotone, it is
well-known that f is Riemann-Stieltjes integrable with respect to ϕ. Prove that in
this case
\[ {}^*\Bigl( \int_a^b f(x)\,d\varphi(x) \Bigr) \approx \sum_{n=1}^{h} {}^*f(y_n)\bigl({}^*\varphi(x_n) - {}^*\varphi(x_{n-1})\bigr) \]
whenever x_n is an infinitely fine internal partition and y_n is an arbitrary internal
sequence with x_{n−1} ≤ y_n ≤ x_n.
Proof. Since f is uniformly continuous (Corollary 7.18), Theorem 7.17 implies
∗f(y_n) ≈ ∗f(x_{n−1}) and so |∗f(y_n) − ∗f(x_{n−1})| < ε for any n ∈ ∗ℕ, n ≤ h,
and any ε ∈ σℝ+. Hence,
\[ \Bigl| \sum_{n=1}^{h} \bigl({}^*f(y_n) - {}^*f(x_{n-1})\bigr)\bigl({}^*\varphi(x_n) - {}^*\varphi(x_{n-1})\bigr) \Bigr| \le \varepsilon \sum_{n=1}^{h} \bigl|{}^*\varphi(x_n) - {}^*\varphi(x_{n-1})\bigr| = \varepsilon\,\bigl|{}^*\varphi({}^*b) - {}^*\varphi({}^*a)\bigr| \]
for any ε ∈ σℝ+. Thus, the left-hand side is infinitesimal which means that
\[ \sum_{n=1}^{h} {}^*f(y_n)\bigl({}^*\varphi(x_n) - {}^*\varphi(x_{n-1})\bigr) \approx \sum_{n=1}^{h} {}^*f(x_{n-1})\bigl({}^*\varphi(x_n) - {}^*\varphi(x_{n-1})\bigr). \]
Now the claim follows from Theorem 7.23.
As an application, we prove one part of the fundamental theorem of calculus:

Corollary 7.24. If f : [a, b] → ℝ is continuously differentiable, then
\[ \int_a^b f'(x)\,dx = f(b) - f(a). \]
Proof. Let x_0, . . . , x_h be an infinitely fine partition. Since f is differentiable, we
have by Exercise 39 that
\[ \frac{{}^*f(x_n) - {}^*f(x_{n-1})}{x_n - x_{n-1}} = {}^*f'(\xi_n) \]
where x_{n−1} < ξ_n < x_n (n = 1, . . . , h). Multiplying by (x_n − x_{n−1}) and summing up, we find
\[ {}^*f({}^*b) - {}^*f({}^*a) = \sum_{n=1}^{h} {}^*f'(\xi_n)(x_n - x_{n-1}). \]
By Exercise 40, the right-hand side is infinitely close to ∗(∫_a^b f′(x) dx), and the
left-hand side is equal to ∗(f(b) − f(a)). Since both numbers are standard, they
are equal.
Concerning functions of more variables, we restrict ourselves to a nonstan-

dard continuity criterion which is somewhat surprising, since it suffices to consider
continuity in each variable separately. However, the point is that this continuity is
even needed at nonstandard points (recall the remarks following Theorem 7.17).
Theorem 7.25. Let D = [a, b] × [c, d]. Then f : D → ℝ is continuous at (x_0, y_0) ∈ D
if and only if for each (x, y) ∈ ∗D with x ≈ ∗x_0 and y ≈ ∗y_0 we have ∗f(x, y) ≈
∗f(x, ∗y_0) and ∗f(x, y) ≈ ∗f(∗x_0, y).
Proof. Let f have the properties of the statement. Then we actually have
∗f(x, y) ≈ ∗f(x, ∗y_0) ≈ ∗f(∗x_0, ∗y_0) whenever (x, y) ∈ ∗D satisfy x ≈ ∗x_0 and
y ≈ ∗y_0. Hence, for each ε ∈ ℝ+ the following predicate holds for each
infinitesimal z > 0:
∀x ∈ ∗[a, b], y ∈ ∗[c, d] :
(|x − ∗x_0| < z ∧ |y − ∗y_0| < z) =⇒ |∗f(x, y) − ∗f(∗x_0, ∗y_0)| < ∗ε. (7.6)
By the permanence principle (Cauchy principle), the predicate holds also for some
z = ∗δ with δ ∈ ℝ+. The inverse direction of the transfer principle shows that f
is continuous at (x_0, y_0).
Conversely, let f be continuous at (x_0, y_0). Then we find for all ε ∈ ℝ+ some
δ ∈ ℝ+ such that by the transfer principle the sentence (7.6) is true for z = ∗δ. In
particular, if x ≈ ∗x_0 and y ≈ ∗y_0, then |∗f(x, y) − ∗f(∗x_0, ∗y_0)| < ∗ε. Since this holds
for all ε ∈ ℝ+, we even have ∗f(x, y) ≈ ∗f(∗x_0, ∗y_0). But since then ∗f(∗x_0, y) ≈
∗f(∗x_0, ∗y_0) ≈ ∗f(x, ∗y_0), we have also ∗f(x, y) ≈ ∗f(∗x_0, y) ≈ ∗f(x, ∗y_0).
We shall now use nonstandard analysis to define explicitly a nonmeasurable

function. Since this is not possible without the (uncountable) axiom of choice (at
least under the assumption that the existence of a so-called inaccessible cardinal is
consistent [Sol70]), we already see that the (uncountable) axiom of choice cannot
be avoided in the construction of the mapping ∗.
We need some preparation. At first, we make use of the following well-known
generalization of the fundamental theorem of calculus for the Lebesgue integral
(for a proof, see e.g. [Rud87, HS69]):
Proposition 7.26. If f : [a, b] → ℝ is integrable in the sense of Lebesgue, then
\[ F(x) = \int_a^x f(t)\,dt \]
is differentiable at almost all points of [a, b] (in the sense of Lebesgue) and satisfies
there F′(x) = f(x).
Using this fact, we can prove the following standard result:
Proposition 7.27. Let f : →Ê Ê be measurable on some nontrivial interval and
have arbitrarily small periods, i.e. there is a sequence Tn ↓ 0 with f (x + Tn ) = f (x)
Ê
(x ∈ ). Then f is almost everywhere constant, i.e. there is some c ∈ with Ê
Ê
f (x) = c for almost all x ∈ (in the sense of Lebesgue).
Proof. Without loss of generality, we may assume that |f(x)| ≤ 1: Indeed, if a
constant c as in the claim does not exist, then there is a constant c such that
A := {x : f(x) ≥ c} and its complement both have positive measure. Put g(x) :=
1 for x ∈ A and g(x) := 0 for x ∉ A. Then g is measurable with arbitrarily
small periods, and |g(x)| ≤ 1. If we can prove that g is a.e. constant, we have a
contradiction.
Put F(x) := ∫_0^x f(t) dt. We claim that f(x) = F(T1)/T1 for all x for which
F′(x) exists with F′(x) = f(x) (these are almost all x by Proposition 7.26). By
the definition of F′, it suffices to prove that
(F(x + Tn) − F(x)) / Tn = Tn^{−1} ∫_x^{x+Tn} f(t) dt → T1^{−1} F(T1)
for all those x. But since Tn is a full period of f, we may replace x in the integral
by any other number, and thus have to prove that
Tn^{−1} F(Tn) → T1^{−1} F(T1). (7.7)
Choosing kn ∈ ℕ such that kn Tn ≤ T1 < (kn + 1)Tn, we find
|F(kn Tn) − F(T1)| ≤ ∫_{kn Tn}^{T1} |f(t)| dt ≤ T1 − kn Tn < Tn → 0.
Since |kn Tn − T1| < Tn → 0, we may conclude that
|F(kn Tn)/(kn Tn) − F(T1)/T1| → 0.
But the periodicity of f implies F(kn Tn) = kn F(Tn), and so (7.7) is proved.

For the moment, we define [x] for x ∈ ℝ as the largest natural number which
is not larger than x. Then for each natural number n, the number
f(x) := |[2^n x] − 2[2^{n−1} x]|
might be interpreted as the n-th digit (after the binary point) of the binary
expansion of x.
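The digit formula can be checked on concrete numbers. The following is a minimal sketch, assuming x ≥ 0 is given as an exact fraction; the function name `digit` is only chosen for this illustration.

```python
from fractions import Fraction
from math import floor

def digit(n, x):
    """n-th binary digit of x >= 0 after the binary point: |[2^n x] - 2[2^(n-1) x]|."""
    x = Fraction(x)
    return abs(floor(2**n * x) - 2 * floor(2**(n - 1) * x))

x = Fraction(5, 8)  # 0.101 in binary
print([digit(n, x) for n in range(1, 6)])  # [1, 0, 1, 0, 0]
```

Exact fractions are used instead of floating point numbers so that the integer parts [2^n x] are computed without rounding errors.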
Theorem 7.28. Let h ∈ ℕ∞, and
f(x) := st(|∗[2^h ∗x] − 2 ∗[2^{h−1} ∗x]|) (x ∈ ℝ).
Then f : ℝ → {0, 1} is nonmeasurable (in the sense of Lebesgue) on each nontrivial
interval.
Proof. Put g(n, x) := |[2^n x] − 2[2^{n−1} x]|, and note that f(x) = st(∗g(h, ∗x)). Since
g : ℕ × ℝ → {0, 1}, we have ∗g : ∗ℕ × ∗ℝ → {∗0, ∗1}, and so f : ℝ → {0, 1}. The
transfer of the statement
∀k, n ∈ ℕ : (k < n =⇒ ∀x ∈ ℝ : g(n, x + 2^{−k}) = g(n, x))
implies that ∗g(h, ·) is periodic with 2^{−k} as a period for any k < h. In particular, f
has arbitrarily small periods. Thus, if f were measurable, Proposition 7.27 would
imply that we have either f (x) = 0 for almost all x, or f (x) = 1 for almost all x.
In particular, one of the sets A := {x ∈ [0, 1] : f (x) = 0} and B := {x ∈ [0, 1] :
f (x) = 1} has measure 0, and the other has measure 1. We prove that A and B
have the same measure and thus find a contradiction.
To see this, let α(x) for x ∈ [0, 1] be the predicate “x is not dyadic”, i.e.
∀n, k ∈ ℕ : x ≠ k 2^{−n}.
Since g(n, x) is the n-th digit of the binary expansion of x, we have
∀x ∈ [0, 1], n ∈ ℕ : α(x) =⇒ g(n, 1 − x) + g(n, x) = 1.

The transfer implies that ∗g(h, x) = 1 − ∗g(h, 1 − x) for all x ∈ ∗[0, 1] with ∗α(x). If
x ∈ [0, 1] is not dyadic, i.e. α(x) holds, then the transfer principle implies ∗ α(∗ x),
and so f (x) = 1 − f (1 − x). Thus, x ∈ A if and only if 1 − x ∈ B. Since the dyadic
numbers are countable and thus form a null set, we may conclude that A and B
have the same measure, as claimed.
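The two standard statements that are transferred in this proof (the periods 2^{−k} of g(n, ·) for k < n, and the symmetry g(n, 1 − x) + g(n, x) = 1 for non-dyadic x) can be tested on sample points. This is only a sanity check on finitely many rationals, not a proof; the helper names `is_dyadic` and `samples` are assumptions of this sketch.

```python
from fractions import Fraction
from math import floor

def g(n, x):
    """g(n, x) = |[2^n x] - 2[2^(n-1) x]| for x >= 0."""
    return abs(floor(2**n * x) - 2 * floor(2**(n - 1) * x))

def is_dyadic(x):
    # x = k / 2^n holds iff the reduced denominator is a power of two
    d = Fraction(x).denominator
    return d & (d - 1) == 0

samples = [Fraction(p, q) for q in (3, 5, 7, 12) for p in range(1, q)]
non_dyadic = [x for x in samples if not is_dyadic(x)]

# Periodicity: 2^-k is a period of g(n, .) whenever k < n.
periodic = all(g(n, x + Fraction(1, 2**k)) == g(n, x)
               for x in samples for n in range(1, 10) for k in range(n))

# Symmetry: the digits of x and 1 - x are complementary if x is not dyadic.
symmetric = all(g(n, 1 - x) + g(n, x) == 1
                for x in non_dyadic for n in range(1, 10))

print(periodic, symmetric)  # True True
```

For the dyadic number x = 1/2 the symmetry fails (both digits g(1, 1/2) equal 1), which is why the predicate α is needed.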
Theorem 7.28 is taken from [Lux73]. Another example of a nonmeasurable
function is given by
f(x) = st(∗sin(2πhx)) (x ∈ ℝ) (7.8)
with an appropriate h ∈ ℕ∞, see e.g. [SL76, Example 8.4.45] (see also [Tay69]).
The measurability of (7.8) in dependence of h is discussed in [BH02, Tay69].
Exercise 41. Let U be a free ultrafilter over ℕ. Apply Theorem 7.28 to prove that
the set
{x ∈ [0, 1] : lim_{n→U} ([2^n x] − 2[2^{n−1} x]) = 0}
is nonmeasurable (in the sense of Lebesgue).

It should be noted that the statement of Exercise 41 holds without the
(uncountable) axiom of choice. This is of some interest, because it means that the
mere existence of a free ultrafilter over ℕ (which is less restrictive than the axiom
of choice) implies the existence of a nonmeasurable set (recall that the classical
proofs on the existence of nonmeasurable sets require a more powerful form of the
axiom of choice). This fact was first observed by Sierpinski [Sie38] (Sierpinski used
standard arguments, of course).
Theorem 7.28 has another interesting consequence:
Recall that a map ‖·‖ : X → [0, ∞) on a linear space (= vector space) X over
𝕂 = ℝ or 𝕂 = ℂ is called a norm, if the following holds:
1. ‖x‖ = 0 if and only if x = 0.
2. ‖λx‖ = |λ| ‖x‖ for scalars λ ∈ 𝕂.
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖ (the triangle inequality).
Theorem 7.29. Let ℓ∞ denote the space of all bounded sequences with the natural
operations. There is a norm ‖·‖ on ℓ∞ which is additionally monotone (i.e. if
|xn| ≤ |yn|, then ‖(xn)n‖ ≤ ‖(yn)n‖) and a measurable function f : ℕ × ℝ → ℝ
such that
x → ‖f(·, x)‖
is nonmeasurable on any nontrivial interval.
Proof. Fix some h ∈ ℕ∞, and define the norm by the formula
‖(xn)n‖ = sup_{n∈ℕ} |xn| + |st(∗x_h)|.
It is easily checked that this indeed provides a norm (the first term is only needed
to have that ‖(xn)n‖ = 0 implies xn = 0 for all n). For f(n, x) := [2^n x] − 2[2^{n−1} x],
we have
‖f(·, x)‖ = 1 + |st(∗[2^h ∗x] − 2 ∗[2^{h−1} ∗x])| = 1 + st(|∗[2^h ∗x] − 2 ∗[2^{h−1} ∗x]|)
which is nonmeasurable by Theorem 7.28.

Theorem 7.29 answers a problem in the theory of ideal spaces which was
open for a long time. It was proved (by a slightly different argument but with the
same idea) in [Lux63]. Using a similar argument as in Exercise 41, one can give
a formula for a norm with the property of Theorem 7.29 which does not involve
nonstandard expressions, namely
‖(xn)n‖ = sup_{n∈ℕ} |xn| + |lim_{n→U} xn|,
where U denotes a free ultrafilter over ℕ. (Note that in view of Theorem 5.30,
the limit in this expression always exists when xn is a bounded sequence).
Chapter 4
Enlargements and Saturated Models
§ 8 Enlargements, Saturation, and Concurrency

Throughout this section, let ∗ : S → ∗S be elementary.
Definition 8.1. Let ∗ : S → ∗S be elementary. Then ∗ is called:
1. κ-enlargement (with some set κ) if for any nonempty system 𝒜 of entities
A ∈ S which has the finite intersection property and at most the cardinality
of κ, we have ⋂ σ𝒜 ≠ ∅, i.e.
⋂ {∗A : A ∈ 𝒜} ≠ ∅.
2. enlargement if it is a κ-enlargement for any κ (i.e. if the above condition holds
without any assumption on the cardinality of 𝒜).
3. κ-saturated (with some set κ) if for any nonempty system ℬ of internal entities
which has the finite intersection property and at most the cardinality of κ,
we have
⋂ ℬ ≠ ∅.
4. polysaturated if it is S-saturated.
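The finite intersection property used in this definition is stronger than pairwise intersection. A minimal sketch for finite families of finite sets follows; the function name `has_fip` is an assumption of this illustration. For a finite family the property just says that the whole intersection is nonempty, so the interesting cases in Definition 8.1 are the infinite systems.

```python
from itertools import combinations

def has_fip(family):
    """True if every nonempty finite subfamily has a nonempty intersection."""
    sets = [frozenset(s) for s in family]
    return all(frozenset.intersection(*sub)
               for r in range(1, len(sets) + 1)
               for sub in combinations(sets, r))

print(has_fip([{1, 2}, {2, 3}, {2, 4}]))  # True: 2 lies in every set
print(has_fip([{1, 2}, {2, 3}, {1, 3}]))  # False: pairwise intersections only
```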
The reader should be warned that the definitions of κ-enlargement, κ-saturated
and polysaturated differ slightly in literature: Usually, κ denotes a cardinal
number, and one requires that the cardinality of 𝒜 resp. ℬ be strictly less
than κ (e.g. in [CK90, SL76]). Thus, e.g. what we call ℕ-saturated, is in literature
usually called ℵ1-saturated where ℵ1 denotes the first uncountable ordinal.
Moreover, in e.g. [SL76] it is required that ℬ itself be an internal entity. (We shall see,
however, that it makes actually no difference if we would require that 𝒜 and ℬ
be entities). The above definition of polysaturated maps is used in [LR94], where
also enlargements are called “strong nonstandard embeddings”.
Proposition 8.2. If ∗ is κ-saturated, it is a κ-enlargement.
Proof. If 𝒜 is as in Definition 8.1, then σ𝒜 is a system of internal entities
(Proposition 3.16) with the finite intersection property which has the same cardinality as
𝒜. Hence ⋂ σ𝒜 ≠ ∅.
We shall see later that for any set S and any κ one can find κ-saturated maps
(and thus also κ-enlargements).
It looks rather non-symmetric that for the definition of enlargements no
restriction on the cardinality of 𝒜 is made while for the definition of polysaturated
maps a restriction is made. However, for enlargements, this restriction is
implicit, since (see Lemma 8.7 below) one may assume that 𝒜 ∈ S. Hence, each
S-enlargement is automatically an enlargement. In particular, each polysaturated
map is an enlargement.
It would not make sense to drop the assumptions on the cardinality of B in
the definition of polysaturated maps, since no such maps can exist:
Proposition 8.3. If A ∈ S is an infinite entity, then there is a nonempty system
ℬ of internal subsets of ∗A with the finite intersection property and ⋂ ℬ = ∅.
Proof. Let ℬ be the system of all sets of the form B_b = {x ∈ ∗A : x ≠ b}
(b ∈ ∗A). Each B_b is internal by the internal definition principle (more precisely,
by Corollary 3.18), and the other claims are evident.
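The "evident" claims can be seen on a small finite stand-in: removing finitely many points b_1, . . . , b_r from a set leaves exactly the complement of {b_1, . . . , b_r}, while removing every point leaves nothing. The sketch below uses a six-element set in place of the infinite set ∗A (an assumption made only to keep the check finite), so only subfamilies with fewer members than the set has points are guaranteed a common element.

```python
from itertools import combinations

A = frozenset(range(6))                    # finite stand-in for *A
family = {b: A - {b} for b in A}           # B_b = {x in A : x != b}

def intersection(indices):
    """Intersection of the sets B_b for b in indices (all of A if there are none)."""
    result = A
    for b in indices:
        result &= family[b]
    return result

# Omitting the points b_1, ..., b_r leaves A \ {b_1, ..., b_r}.
small_ok = all(intersection(c) == A - set(c)
               for r in range(len(A)) for c in combinations(A, r))
total = intersection(A)

print(small_ok, total == frozenset())  # True True
```

In an infinite set every finite subfamily is "small" in this sense, which gives the finite intersection property, while the total intersection is still empty.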
Corollary 8.4. If S is infinite and ∗ is κ-saturated, then κ has at most the cardi-
nality of ∗ S. In particular, there is no map ∗ which is κ-saturated for any κ.
This does of course not exclude that for any κ we can find a κ-saturated map
∗ (and, as remarked above, such maps indeed exist): Corollary 8.4 only implies
that ∗ then must depend on κ.
We already see that it cannot be too easy to find saturated maps: If we
use the construction of §4, then ∗S consists of the equivalence classes of maps
x : J → S. In particular, the cardinality of ∗S is at most the cardinality of S^J.
We thus need that S^J has a larger cardinality than S ⊇ S ∪ P(S) ∪ P(P(S)) ∪ · · · .
Thus, the cardinality of J must be rather large (in particular, the choice J := ℕ
is never sufficient). Hence, there exists a large class of nonstandard maps which are
not polysaturated.
Theorem 8.5. Let S be infinite. Then ∗ is an ℕ-enlargement if and only if ∗ is a
nonstandard embedding.
In particular, each enlargement (and so each polysaturated map) is a
nonstandard embedding.

Proof. Let ∗ be an ℕ-enlargement, and B ⊆ S be countably infinite. To see that
∗ is a nonstandard embedding, it suffices by Theorem 3.22 to prove that σB ≠ ∗B.
Let 𝒜 be the system of all sets of the form B \ {b} (b ∈ B). Since B is
infinite, 𝒜 has the finite intersection property. Moreover, 𝒜 is countable (since
B is countable). Hence, ⋂ {∗A : A ∈ 𝒜} ≠ ∅. Since ∗(B \ {b}) = ∗B \ {∗b}, this
means ∗B \ σB = ⋂_{b∈B} (∗B \ {∗b}) ≠ ∅, and so ∗ is a nonstandard embedding.
Conversely, assume that ∗ is a nonstandard embedding. We show that ∗ is
an ℕ-enlargement. Thus, let 𝒜 denote a nonempty countable system of entities
A ∈ S which has the finite intersection property. Let A1, A2, . . . be an enumeration
of all elements of 𝒜. By an obvious identification, we may assume that ℕ ⊆ S (just
rename the atoms). Define a function f : ℕ → P(A1) by f(n) := A1 ∩ · · · ∩ An.
Note that Theorem 2.1 implies P(A1) ∈ S and f ∈ S. Since 𝒜 has the finite
intersection property, we have f(n) ≠ ∅ for each n, i.e. ∀x ∈ ℕ : f(x) ≠ ∅.
The transfer principle implies ∀x ∈ ∗ℕ : ∗f(x) ≠ ∅. Since ∗ is a nonstandard
embedding, there is some h ∈ ∗ℕ \ σℕ, and we have ∗f(h) ≠ ∅. If we can prove
that ∗f(h) is contained in ∗An for each n ∈ ℕ, it follows that the intersection of
the sets ∗An (n ∈ ℕ) is nonempty, and so that ∗ is an ℕ-enlargement. To prove
∗f(h) ⊆ ∗An, let n ∈ ℕ be given. The transfer of the true sentence ∀x ∈ ℕ : (x >
n =⇒ f(x) ⊆ An) reads ∀x ∈ ∗ℕ : (x > ∗n =⇒ ∗f(x) ⊆ ∗An). Now we note
that h > ∗n by Proposition 5.9, and so ∗f(h) ⊆ ∗An, as claimed.
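The standard part of the second half of this proof rests on the decreasing chain f(n) = A1 ∩ · · · ∩ An. A minimal sketch with a hypothetical example family (the multiples of n inside {1, . . . , 5040}, chosen only for illustration) checks the two facts that are transferred: every f(n) is nonempty, and f(x) ⊆ An as soon as x is at least n.

```python
def prefix_intersections(sets):
    """Return [f(1), f(2), ...] with f(n) = A_1 ∩ ... ∩ A_n."""
    out, current = [], None
    for s in sets:
        current = set(s) if current is None else current & set(s)
        out.append(current)
    return out

# Hypothetical example: A_n = multiples of n inside {1, ..., 5040}, n = 1..7.
N = 5040
A = [set(range(n, N + 1, n)) for n in range(1, 8)]
f = prefix_intersections(A)

nonempty = all(f)                                   # f(n) is never empty
nested = all(f[x] <= A[n]                           # f(x) ⊆ A_n for x >= n
             for n in range(len(A)) for x in range(n, len(A)))
print(nonempty, nested, len(f[-1]))  # True True 12
```

The list indices are shifted by one against the text (f[0] is f(1), A[0] is A1).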

It is in general not true that each nonstandard embedding is ℕ-saturated.
There exist even enlargements which are not ℕ-saturated, see [CK90,
Exercise 4.4.29]. However, these examples are rather “exotic”. All elementary
embeddings that we discuss are “nicer”: We will see that all embeddings arising from the
ultrapower construction of §4 are ℕ-saturated. To see this, we need some auxiliary
notions.
If f : σ A → ∗ B is a function where A ∈ S is an infinite entity, then f
cannot be an internal function, since otherwise dom(f ) = σ A were internal by
Theorem 3.19, contradicting Theorem 3.22. One may ask whether f is just the
restriction of an internal function to the external set σ A. If this is the case for any
function, we call ∗ comprehensive:
Definition 8.6. An elementary map ∗ : S → ∗S is called comprehensive, if for all
entities A, B ∈ S and each function f : σA → ∗B, there is an internal function
F : ∗A → ∗B such that F(a) = f(a) for each a ∈ σA.
We will see in §9 that all embeddings arising from the ultrapower construction
of §4 are comprehensive. We intend to prove now a relation of comprehensive and
saturated maps. In particular, we will show that all comprehensive maps in turn
are ℕ-saturated and also a partial converse. To this end, we need a result which
is of independent interest:
Lemma 8.7. To verify that a map ∗ is a κ-enlargement resp. κ-saturated, it suffices
to consider systems 𝒜 resp. ℬ in Definition 8.1 which additionally are entities
in S resp. ∗S. Moreover, it additionally suffices to consider systems ℬ which are
subsets of standard entities.
Proof. Indeed, if 𝒜 is as in Definition 8.1, fix some A0 ∈ 𝒜 and consider the
set 𝒜₀ := {A ∩ A0 : A ∈ 𝒜} in place of 𝒜: Then 𝒜₀ ⊆ P(A0) is an entity by
Theorem 2.1. Moreover, ⋂ σ𝒜₀ ⊆ ⋂ σ𝒜, since ∗(A ∩ A0) ⊆ ∗A for each A ∈ 𝒜
(Lemma 3.5). It thus suffices to verify that ⋂ σ𝒜₀ ≠ ∅. Now the first statement
follows if we observe that 𝒜₀ has the finite intersection property and at most the
cardinality of 𝒜.
Concerning κ-saturated maps, we first argue similarly: If ℬ is given as in
Definition 8.1, fix some B0 ∈ ℬ and consider the set ℬ₀ := {B ∩ B0 : B ∈ ℬ}.
Then ℬ₀ consists of internal entities (Theorem 3.19) with the finite intersection
property whose cardinality is not larger than that of ℬ. Since ⋂ ℬ₀ ⊆ ⋂ ℬ, it
suffices to verify that ⋂ ℬ₀ ≠ ∅. Now observe that ℬ₀ consists only of internal
subsets of B0. The system P of all internal subsets of B0 is internal (Exercise 81), and so
ℬ₀ ⊆ P. Proposition 3.16 implies that there is some n with P ∈ ∗S_n and P ⊆ ∗S_n.
Consequently, ℬ₀ ⊆ ∗S_n. Since ∗S_n is an entity of ∗S, this inclusion implies also
that ℬ₀ is an entity of ∗S (Theorem 2.1).
Theorem 8.8. Let S be infinite. If ∗ is a comprehensive nonstandard embedding,
then ∗ is ℕ-saturated. Conversely, if ∗ is polysaturated, then it is comprehensive.
More precisely, if ∗ is κ-saturated, then for any internal entities A, B and for
each function f : A0 → B with A0 ⊆ A where A0 has at most the cardinality of κ,
there is an internal function F : A → B such that F (a) = f (a) for each a ∈ A0 .
Proof. We start to prove the last claim. Thus, let ∗ be κ-saturated, A, B be internal
entities, and f : A0 → B with A0 ⊆ A where A0 has at most the cardinality of κ.
Let ℱ denote the system of all internal functions g : A → B. Recall that ℱ is
internal by Exercise 82. Consider now the family ℬ of sets
B_a = {x ∈ ℱ : x(a) = f(a)} (a ∈ A0).
By the internal definition principle, each set B_a is internal. Moreover, the system
ℬ has the finite intersection property: Indeed, an element of B_{a1} ∩ · · · ∩ B_{an} can
be defined in view of Exercise 8 by redefining some function from ℱ at the finitely
many points a1, . . . , an ∈ A0. Since ℬ has at most the cardinality of A0, we find
some F ∈ ⋂ ℬ. Then F ∈ ℱ satisfies F(a) = f(a) for each a ∈ A0.
If ∗ is polysaturated, and A, B ∈ S and f : σA → ∗B are given, then ∗A, ∗B
are internal by Proposition 3.16, and σA ⊆ ∗A. Since σA has the cardinality
of A which has a strictly smaller cardinality than S, we may conclude by what
we proved above that an internal function F : ∗A → ∗B exists which satisfies
F(x) = f(x) for each x ∈ σA. Hence, ∗ is comprehensive.
Conversely, let ∗ be a comprehensive nonstandard embedding. Let ℬ be a
countable system of internal entities with the finite intersection property. By
Lemma 8.7, we may assume that ℬ ⊆ ∗C for some entity C ∈ S. Let B1, B2, . . . be
an enumeration of all elements of ℬ. Since S is infinite, we may assume that ℕ ⊆ S
(rename the atoms), and define a map f : σℕ → ∗C by f(∗n) = Bn. Since ∗ is
comprehensive, we thus find an internal map F : ∗ℕ → ∗C with F(∗n) = Bn for
each n ∈ ℕ. Note that U := ⋃ ∗C is internal by Theorem 3.19. The set
G := {x ∈ ∗ℕ | ∃y ∈ U : ∀z ∈ ∗ℕ : (z ≤ x =⇒ y ∈ F(z))}
is internal by the internal definition principle. For any n ∈ ℕ, we have ∗n ∈ G,
since ℬ has the finite intersection property and thus F(1) ∩ · · · ∩ F(∗n) ≠ ∅
(recall Proposition 5.9). Hence, by the permanence principle, G also contains some
h ∈ ∗ℕ \ σℕ. This means that there is some y ∈ U such that
y ∈ ⋂ {F(z) : z ∈ ∗ℕ ∧ z ≤ h} ⊆ ⋂ {F(z) : z ∈ σℕ} = ⋂ ℬ.
Thus, ⋂ ℬ ≠ ∅, and ∗ is ℕ-saturated.
Let us now come to the point why enlargements and polysaturated maps are
of particular interest.
Definition 8.9. Let ϕ be a binary relation. We say that ϕ is satisfied by b ∈ rng(ϕ)
on A ⊆ dom(ϕ), if (a, b) ∈ ϕ for each a ∈ A.
We call ϕ concurrent on A ⊆ dom(ϕ), if for each finite subset A0 ⊆ A there
is some b ∈ rng(ϕ) which satisfies ϕ on A0 .
In other words: ϕ is concurrent on A, if for each finitely many a1 , . . . , an ∈ A
there is some b with (a1 , b), . . . , (an , b) ∈ ϕ.
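A standard example may help: the relation < on the natural numbers is concurrent on ℕ, but no single natural number satisfies it on all of ℕ. The sketch below, with the helper names `satisfied_by` and `witness` assumed only for this illustration, checks this on a few finite subsets and on an initial segment.

```python
def satisfied_by(phi, b, A0):
    """b satisfies phi on A0: (a, b) belongs to phi for every a in A0."""
    return all(phi(a, b) for a in A0)

def less(a, b):
    """The relation < on the natural numbers, given as a predicate."""
    return a < b

def witness(A0):
    """A natural number lying above every element of the finite set A0."""
    return max(A0, default=0) + 1

finite_subsets = [[], [0], [2, 7, 5], list(range(50))]
concurrent_on_samples = all(satisfied_by(less, witness(A0), A0)
                            for A0 in finite_subsets)

# No b can satisfy < on all of N, because (b, b) does not belong to <.
never_global = all(not less(b, b) for b in range(1000))
print(concurrent_on_samples, never_global)  # True True
```

An element which satisfies the transferred relation on all standard naturals has to be an infinite hypernatural number, which is the kind of element the following theorems provide.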
Let us first discuss the connection of enlargements and concurrent binary

relations:
Theorem 8.10. The following statements are equivalent for an elementary map ∗:
1. ∗ is a κ-enlargement.
2. For any binary relation ϕ ∈ S for which dom(ϕ) has at most the cardinality
of κ the following holds: If ϕ is concurrent on dom(ϕ), then there is some
b ∈ rng(∗ ϕ) which satisfies ∗ ϕ on the set σ dom(ϕ).
3. For any entity A ∈ S which has at most the cardinality of κ there is some
∗-finite entity B with σA ⊆ B ⊆ ∗A.
Proof. Let ∗ be a κ-enlargement, and ϕ be a binary standard relation such that
dom(ϕ) has at most the cardinality of κ. Let 𝒜 be the system of all sets of the
form
A_d := {y ∈ rng(ϕ) | (d, y) ∈ ϕ}
where d ∈ dom(ϕ). If ϕ is concurrent on dom(ϕ), then 𝒜 has the finite intersection
property. Since 𝒜 has at most the cardinality of dom(ϕ), the set ⋂_{d∈dom(ϕ)} ∗A_d
contains some b. By the standard definition principle, we have
∗A_d = {y ∈ ∗rng(ϕ) | (∗d, y) ∈ ∗ϕ},
i.e. (∗d, b) ∈ ∗ϕ for any d ∈ dom(ϕ), i.e. b satisfies ∗ϕ on σdom(ϕ). Thus 2. holds.
To prove that 2. implies 3., let A ∈ S be an entity which has at most the
cardinality of κ. Consider the relation
ϕ := {(x, y) ∈ A × P(A) | x ∈ y ∧ “y is finite”}.
Then ϕ is concurrent on A. Thus, 2. implies that we find some B ∈ rng(∗ϕ) such
that (∗a, B) ∈ ∗ϕ for any a ∈ A. Note that, by the standard definition principle
for relations,
∗ϕ = {(x, y) ∈ ∗A × ∗P(A) | x ∈ y ∧ “y is ∗-finite”}.
Hence, B is a ∗-finite element of ∗P(A) (i.e. an internal subset of ∗A) which
contains σA.
Assume now that 3. holds. Then ∗ is a κ-enlargement. Indeed, let 𝒜 be a
nonempty system of entities A ∈ S which has the finite intersection property and
at most the cardinality of κ. By Lemma 8.7, we may assume that 𝒜 is an entity.
Then 3. implies that there is some ∗-finite entity B with σ𝒜 ⊆ B ⊆ ∗𝒜. Since
𝒜 has the finite intersection property, the transitively bounded sentence
∀x ∈ P(𝒜) : (“x is finite” =⇒ ⋂ x ≠ ∅)
is true. Its ∗-transform reads
∀x ∈ ∗P(𝒜) : (“x is ∗-finite” =⇒ ⋂ x ≠ ∅).
Since B ∈ ∗P(𝒜) is ∗-finite, we have ⋂ B ≠ ∅. Hence, σ𝒜 ⊆ B implies
⋂ σ𝒜 ⊇ ⋂ B ≠ ∅.
Thus, enlargements mean, roughly speaking, that whenever we can satisfy in

the standard universe each finite number of conditions, we can in the nonstandard
universe satisfy all conditions simultaneously.
The reader should observe the analogy with the permanence principle: If
something holds for all finite sets, then it also holds for all ∗-finite sets, and thus
in particular for an infinite set (actually, this is what we have used in the proof
of Theorems 8.5 and 8.8). The difference between the permanence principle and
enlargements is that for enlargements we are not forced to the special role of ℕ
but may instead use any other (standard) set as the index set.
Exercise 42. Prove directly that the property 2. of Theorem 8.10 implies that ∗ is
a κ-enlargement.
The following consequence is in a sense a generalization of Corollary 8.4 for
enlargements.
Proposition 8.11. If ∗ is a κ-enlargement, then ∗ℕ has at least the cardinality of
either S or κ (whichever is smaller).
Proof. If A ∈ S is an entity which has at most the cardinality of κ, then there is
some ∗-finite B ⊆ ∗A with σA ⊆ B. Since B is in one-to-one correspondence with
some set {1, . . . , h} ⊆ ∗ℕ, it follows that ∗ℕ has at least the cardinality of A.
Now we distinguish two cases: If there is some n such that S_n has at least
the cardinality of κ, choose some A ⊆ S_n which has precisely the cardinality of
κ (axiom of choice!), and by what we just proved, |∗ℕ| ≥ |κ|. If there is no such
entity, then we have |∗ℕ| ≥ |S_n| for each n which implies in view of S_0 ⊆ S_1 ⊆ · · ·
that |∗ℕ| ≥ |⋃ S_n| = |S|.
Proposition 8.11 was first observed in [Zak69].
Exercise 43. Let S be infinite. Prove that the ultrapower model of §4 with J = ℕ
does not provide an enlargement. Show more precisely that if κ has a strictly larger
cardinality than P(ℕ), then ∗ is not a κ-enlargement.
For applications in topology, the following compactness property of enlarge-
ments is crucial:
Exercise 44. Let ∗ be a κ-enlargement. Then for any system 𝒜 of entities A ∈ S
which has at most the cardinality of κ and any standard ∗A0 ⊆ ⋃ σ𝒜 there is a
finite 𝒜₀ ⊆ 𝒜 with
A0 ⊆ ⋃ 𝒜₀.
Exercise 45. Let ℝ ∈ S be an entity, and ∗ : S → ∗S be an ℝ-enlargement. Prove
that there is a number h ∈ ∗ℕ such that
∗sin(πhx) ≈ 0 (x ∈ σℝ).
In particular, there is a number h ∈ ∗ℕ such that (7.8) is constant and thus
measurable on ℝ.
Hint: Use without proof that for each finitely many x1, . . . , xn ∈ ℝ and any
ε ∈ ℝ+ there is some h ∈ ℕ such that the distance of h xk to an integer is at most
ε (for any k).
Actually, the statement of Exercise 45 holds if ∗ is just an arbitrary nonstan-
dard map [Tay69], but the proof is harder for this case.
The fact that ∗ ϕ is not satisfied on dom(∗ ϕ) = ∗ dom(ϕ) in Theorem 8.10 but
only on σ dom(ϕ) is rather disappointing. If we consider instead of κ-enlargements
even κ-saturated maps, we do not have this restriction. Moreover, ϕ can even be
an internal relation:
Theorem 8.12. The following statements are equivalent for an elementary map ∗:
1. ∗ is κ-saturated.
2. For any (not necessarily internal) binary relation ϕ and any (not necessarily
internal) A ⊆ dom(ϕ) which has at most the cardinality of κ and for which
each of the sets
ϕ(a) := {y ∈ rng(ϕ) : (a, y) ∈ ϕ} (a ∈ A)
is internal, the following holds: If ϕ is concurrent on A, then it is satisfied on

A.
Proof. Let ∗ be κ-saturated, ϕ be a binary relation, and A ⊆ dom(ϕ) have at most
the cardinality of κ. Let ℬ := {ϕ(a) : a ∈ A}. If ϕ is concurrent on A, then ℬ
has the finite intersection property. Since ∗ is κ-saturated and ℬ has at most the
cardinality of A, the set ⋂ ℬ contains some b, if ℬ consists only of internal sets.
This means that (a, b) ∈ ϕ for any a ∈ A, i.e. b satisfies ϕ on A.
Conversely, let 2. be satisfied, and ℬ be a nonempty system of internal
entities which has the finite intersection property and at most the cardinality of κ.
Put
ϕ := {(x, y) : y ∈ x ∈ ℬ}.
Since ℬ has the finite intersection property, ϕ is concurrent on ℬ. Moreover for
any B ∈ dom(ϕ) = ℬ, the set
{y ∈ rng(ϕ) : (B, y) ∈ ϕ} = B
is internal. Hence, the assumption implies that ϕ is satisfied on ℬ, i.e. there is
some b with (B, b) ∈ ϕ for any B ∈ ℬ, i.e. b ∈ ⋂ ℬ.
Corollary 8.13. If ∗ is κ-saturated, then for any internal binary relation ϕ for
which dom(ϕ) has at most the cardinality of κ, we have: If ϕ is concurrent on
dom(ϕ), then it is satisfied on dom(ϕ).
Proof. The sets ϕ(a) in Theorem 8.12 are internal by the internal definition
principle.
The difference between Theorem 8.12 and Corollary 8.13 corresponds to the
different definitions of κ-saturated maps which can be found in literature (e.g. in
[SL76]):
Exercise 46. Show that for elementary maps ∗ the property of Corollary 8.13 is
equivalent to the fact that ∗ is “κ-saturated” in the sense that for any nonempty
internal system ℬ of entities which has the finite intersection property and at
most the cardinality of κ, we have ⋂ ℬ ≠ ∅.
Roughly speaking, that a map ∗ is polysaturated means: Whenever it appears
possible that an internal relation can be satisfied (because there are not
finitely many elements witnessing the contrary), then it actually is satisfied (if the
cardinality of the domain is not too large).
The following result is a generalization of the compactness property of en-
largements (Exercise 44):
Exercise 47. Let ∗ be κ-saturated. Then for any system 𝒜 of internal entities and
any internal A0 ⊆ ⋃ 𝒜 there is some finite 𝒜₀ ⊆ 𝒜 with
A0 ⊆ ⋃ 𝒜₀.
There is a concept which is in between κ-enlargements and κ-saturated maps:

Definition 8.14. Let ∗ : S → ∗ S be elementary. Then ∗ is called a compact κ-
enlargement (with some set κ) if for any internal binary relation ϕ the following
holds:
If ϕ is concurrent on a (not necessarily internal) entity A which has at most
the cardinality of κ and the property that at most finitely many elements of A are
nonstandard, then ϕ is satisfied on A.
The embedding ∗ is called a compact enlargement if it is a compact κ-
enlargement for any κ (i.e. if the above condition holds without any assumption
on the cardinality of A).
Proposition 8.15. We have for each set κ:
1. Each κ-saturated map is a compact κ-enlargement, and each compact κ-
enlargement is a κ-enlargement.
2. Each polysaturated map is a compact enlargement, and each compact enlarge-
ment is an enlargement.
Proof. 1. The first statement follows from Theorem 8.12 by restricting dom(ϕ)
to the set A. For the second statement, we apply Theorem 8.10: Let ϕ ∈ S be
a binary relation such that A := dom(ϕ) has at most the cardinality of κ.
Assume that ϕ is concurrent on A. Then ∗ϕ is a standard relation
which is concurrent on σA. If ∗ is a compact κ-enlargement, then ∗ϕ is satisfied
on σA (because σA contains only standard elements and has the same cardinality
as A). Hence, Theorem 8.10 implies that ∗ is a κ-enlargement.
2. Let ∗ be polysaturated, and ϕ be an internal binary relation which is concurrent
on an entity A which has at most finitely many nonstandard elements. Since we
may assume that A is infinite (otherwise ϕ is trivially satisfied on A), also S is
infinite, and so A has at most the cardinality of S. By Theorem 8.12, it follows that
ϕ is satisfied, and so ∗ is a compact enlargement. The second statement follows
immediately from 1.
There exist enlargements which are not compact enlargements (even ultrafil-
ter models with this property exist [Lux69a]). For most practical purposes, com-
pact enlargements are sufficient, and as we shall see, they are considerably easier
to construct than saturated maps.
Compact enlargements also have a “finite subcovering property” which is “in
between” the corresponding property for enlargements and for saturated maps
(Exercise 44 resp. Exercise 47): In contrast to enlargements, the covered set A0
may be internal (and need not be standard). But in contrast to saturated maps,
the covering family must consist of standard sets (and not of internal sets).
Theorem 8.16. Let ∗ be a compact κ-enlargement. Then for any system 𝒜 of
entities A ∈ S which has at most the cardinality of κ and any internal A0 ⊆ ⋃ σ𝒜
there is a finite 𝒜₀ ⊆ 𝒜 with
A0 ⊆ ⋃ σ𝒜₀.
Proof. Consider the internal relation
ϕ := {(x, y) ∈ ∗𝒜 × A0 | y ∉ x}.
If there is no finite 𝒜₀ ⊆ 𝒜 with A0 ⊆ ⋃ σ𝒜₀, then ϕ is concurrent on σ𝒜. Since
σ𝒜 has at most the cardinality of κ, it follows that ϕ is satisfied by some a ∈ A0
on σ𝒜, i.e. a ∉ ⋃ σ𝒜, a contradiction to our assumption A0 ⊆ ⋃ σ𝒜.
§ 9 Saturated Models
In this section, we shall “construct” for any κ and any set S a model which pro-
vides a κ-saturated map ∗. (This of course implies that there exist polysaturated
maps for any S). To this aim, we first construct enlargements and then consider a
countable chain of enlargements (a so-called direct limit) which provides a compact
enlargement. If we consider even an uncountable chain, we obtain a κ-saturated
map. The model for the enlargement is a special ultrapower model. In contrast,
the direct limit models are not ultrapowers.
The way we proceed is not the only possible construction of polysaturated
maps. In fact, it is even possible to obtain polysaturated models as ultrapowers,
see e.g. [LR94, Lux69a]. However, the construction of the latter depends on the
existence of a certain type of ultrafilters which is extremely hard to prove. For
this reason, we chose the direct limit construction. The latter is essentially due to
W. A. J. Luxemburg [Lux69a, SL76].

However, for many applications, it suffices to have ℕ-saturated maps. For
these applications, we need no additional considerations at all, since each
nonstandard map ∗ as constructed in §4 by ultrapowers has this property:
Theorem 9.1. Let ∗ be a map arising from the ultrapower construction of §4. Then
∗ is comprehensive. In particular, if S is infinite and U is δ-incomplete (i.e. ∗ is
nonstandard), then ∗ is ℕ-saturated.
Proof. The second statement follows from the first by Theorem 8.8. Let entities
A, B ∈ S and a function f : σA → ∗B be given. Consider first the abstract model
S from Section 4.2. Then ∗A corresponds to the set of all equivalence classes
of maps x : J → A, and σA corresponds to the subset of classes which have
a constant function as their representative (we write in abuse of notation that
[a] is the equivalence class of the constant function with value a). Similarly, ∗B
corresponds to the set of all classes of functions x : J → B. Hence, for each a ∈ A,
the value f([a]) corresponds to the equivalence class of some function f_a : J → B.
Consider now the function F0 : J → B^A, defined by (F0(j))(a) := f_a(j). By
Proposition 4.19, the equivalence class [F0] is mapped by ϕ into an internal element
F. Again by Proposition 4.19, we have F ∈ ∗(B^A), i.e. F is an internal function from ∗A
into ∗B (Theorem 3.21). To prove that F is an extension of f, we have to prove
that (∗a, f(∗a)) ∈ F is a true sentence for any given a ∈ A. By Theorem 4.18, this
sentence is true if and only if ([a], [f_a]) ∈_U [F0] where the shortcut for pairing is
to be understood in the interpretation of Theorem 4.18. By the Łoś/Luxemburg
Theorem 4.14, this is equivalent to (a, f_a(j)) ∈ F0(j) for almost all j which in turn
is equivalent to (F0(j))(a) = f_a(j) for almost all j. The latter is true by construction,
and so we have indeed F(∗a) = f(∗a) for any given a ∈ A.
9.1 Models for Enlargements

We already know (Exercise 43) that not every ultrapower provides an enlargement.
However, for certain choices of J and an appropriate ultrafilter U over J, we
obtain enlargements. More precisely, we get enlargements if we choose a so-called
λ-adequate filter:
Definition 9.2. Let λ be a set. A filter ℱ over J is called λ-adequate, if for each
nonempty family 𝒜 of subsets of λ with the finite intersection property there is a
map f : J → λ such that for each A ∈ 𝒜 there is some F ∈ ℱ with f(F) ⊆ A.
Theorem 9.3. For each set λ there exists a λ-adequate ultrafilter U on an appro-
priate set J.
Proof. Let J be the system of all finite collections of subsets from λ, i.e. J consists
of all finite subsets of P(λ). For each A ⊆ λ define a subset of J by
F_A := {j ∈ J : A ∈ j}.
The collection ℱ₀ of all sets F_A (i.e. ℱ₀ := {F_A : A ⊆ λ}) has the finite
intersection property. Indeed, if A1, . . . , An ⊆ λ, then j := {A1, . . . , An} belongs to J, and
the intersection F_{A1} ∩ · · · ∩ F_{An} contains j. Hence, ℱ₀ generates a filter ℱ (recall
Proposition 4.5). By Theorem 4.9 (axiom of choice!), there exists an ultrafilter U
on J containing ℱ and thus ℱ₀.
We claim that U is λ-adequate. Thus, let 𝒜 be a nonempty family of subsets
of λ with the finite intersection property. Recall that each j ∈ J is a finite collection
of subsets of λ. Since 𝒜 has the finite intersection property, each of the sets
B_j := ⋂ {A : A ∈ j ∩ 𝒜} (j ∈ J with j ∩ 𝒜 ≠ ∅)
is nonempty, i.e. it contains some f(j). For j ∈ J with j ∩ 𝒜 = ∅, we define f(j)
arbitrarily.
Considering f as a function (axiom of choice!), we claim that f : J → λ
has the required property: If A ∈ 𝒜, then F_A ∈ ℱ₀ ⊆ ℱ ⊆ U. We claim that
f(F_A) ⊆ A. Indeed, for any j ∈ F_A we have A ∈ j ∩ 𝒜, and so f(j) ∈ B_j ⊆ A.
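For a very small set λ the sets J and F_A can be enumerated completely, and the inclusion f(F_A) ⊆ A can be checked by brute force. The following sketch assumes λ = {0, 1, 2} and a hypothetical family with the finite intersection property; all names are chosen for this illustration only. The choice function intersects only those members of j which belong to the family, so that the inclusion can be tested for every j ∈ F_A. The ultrafilter itself plays no role in this check, since the sets F_A already belong to the generating collection.

```python
from itertools import chain, combinations

lam = frozenset({0, 1, 2})

def powerset(s):
    """All subsets of the finite collection s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

subsets = powerset(lam)      # P(lambda): 8 sets
J = powerset(subsets)        # all finite subsets of P(lambda): 256 collections

# F_A = {j in J : A in j} for each A ⊆ lambda
F = {A: {j for j in J if A in j} for A in subsets}

# Hypothetical family with the finite intersection property (1 lies in every set).
family = [frozenset({0, 1}), frozenset({1, 2}), frozenset({1})]

def f(j):
    """Pick a point of the intersection of those members of j that lie in `family`."""
    relevant = [A for A in j if A in family]
    common = frozenset.intersection(lam, *relevant)
    return min(common)       # any element would do; min makes the choice explicit

adequate = all(f(j) in A for A in family for j in F[A])
generating_fip = all(j in F[A] for j in J for A in j)   # j lies in F_A for all A in j
print(len(J), adequate, generating_fip)  # 256 True True
```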

Theorem 9.4. If U is an S-adequate filter, then the ultrafilter model of §4 provides
an enlargement.
Proof. Let ∗ be the corresponding map. Let 𝒜 be a nonempty system of entities
A ∈ S which has the finite intersection property. We have to prove that there is
some internal element a which is contained in each of the sets ∗A where A ∈ 𝒜.
Recall that we have x ∈ ∗A if and only if x = ϕ([f]) for some f : J → A
(Proposition 4.19). We thus have to prove that there exists a function f : J → S
such that for any A ∈ 𝒜 the relation f(j) ∈ A holds for almost all j. But since
U is S-adequate, we find a function f : J → S such that for each A ∈ 𝒜 there is
some F ∈ U with f(F) ⊆ A. This is the required function, since f(F) ⊆ A means
f(j) ∈ A for all j ∈ F, i.e. (since F ∈ U) for almost all j.
9.2 Compact Enlargements

By the results of Section 9.1, we are now able to define for any set S an enlargement
∗ : S → ∗S. The idea to define a compact enlargement is to iterate this process.
Given a superstructure S, we put S0 := S and start the induction with an
enlargement of S0 , say ∗0 : S0 → S1 . Next, we choose an enlargement of S1 , i.e.
∗1 : S1 → S2 , and so on, i.e. ∗n : Sn → Sn+1 (n = 0, 1, 2, . . .) is an enlargement.
We define for n < m the composition $\ast_n^m := \ast_{m-1} \circ \cdots \circ \ast_n$, i.e. $\ast_n^m$ is the map
which sends Sn into Sm .
Now the reader should note that each Sn has a natural language Ln where
the set of constants cns(Ln ) is in a one-to-one correspondence to Sn (the
correspondence being the interpretation map In ). For n > 0, some of these
constants represent standard elements, i.e. elements which correspond to the image
of some element of Sn−1 under the map ∗n−1 . Thus, each ∗n induces an injection
in : cns(Ln ) → cns(Ln+1 ), and $\ast_n^m$ induces the injection $i_n^m := i_{m-1} \circ \cdots \circ i_n$ from
cns(Ln ) into cns(Lm ). We also let $\ast_n^n$ and $i_n^n$ be the identity maps of Sn resp.
cns(Ln ), and then have the crucial identities
$\ast_n^m = \ast_k^m \ast_n^k$ and $i_n^m = i_k^m i_n^k$ (n ≤ k ≤ m). (9.1)
A sentence α in the language Ln is transformed by $\ast_n^m$ into a sentence ${}^{i_n^m}\alpha$
in the language Lm which arises from α by replacing any occurrence of a constant
c in α by the constant $i_n^m(c)$.
The crucial point now is that a transitively bounded sentence α in the language
Ln is true (interpreted by In in Sn ) if and only if the sentence ${}^{i_n^m}\alpha$ in the
language Lm is true (interpreted by Im in Sm ): This is a reformulation of the
statement that each ∗n is elementary (after a trivial induction). Hence, we have:
Lemma 9.5. Each of the maps $\ast_n^m$ : Sn → Sm is elementary.
Proof. It only remains to verify the second condition of Definition 3.1, i.e. that
${}^{\ast_n^m}S_n = S_m$. But since ${}^{\ast_n}S_n = S_{n+1}$ (because ∗n is elementary), this follows by a
trivial induction on m.
To simplify notation, we assume that all of the sets Sn are pairwise disjoint
(which can be arranged by choosing the atom sets Sn pairwise disjoint). This
means that for each $x \in \bigcup_n S_n$ there is a unique n with x ∈ Sn . We denote this n
by nx .
To construct an abstract limit model, we introduce an equivalence relation
on the set $\bigcup_n S_n$: We call two elements x ∈ Sn , y ∈ Sm equivalent if $y = \ast_n^m(x)$
resp. $x = \ast_m^n(y)$ (depending on whether n = nx or m = ny is larger).
This is indeed an equivalence relation: By the convention ∗nn (x) = x, we have
x ∼ x. Moreover, x ∼ y means y ∼ x, since the definition is symmetric. To see the
transitivity (i.e. that x ∼ y and y ∼ z implies x ∼ z), one has to distinguish six
cases. In all cases, the transitivity follows from (9.1).
The abstract limit model Sω consists of all equivalence classes on the set
$\bigcup_n S_n$. The relations ∈ω and =ω are defined by:
$[x] =_\omega [y] \iff \ast_n^k(x) = \ast_m^k(y)$ for all k ≥ n, m where n = nx , m = ny . (9.2)
$[x] \in_\omega [y] \iff \ast_n^k(x) \in \ast_m^k(y)$ for all k ≥ n, m where n = nx , m = ny . (9.3)
Actually, =ω is the usual equality of equivalence classes:
Lemma 9.6. The relations =ω and ∈ω are well-defined on Sω , and moreover:
$[x] =_\omega [y] \iff \ast_n^k(x) = \ast_m^k(y)$ for some k ≥ n, m where n = nx , m = ny . (9.4)
$[x] \in_\omega [y] \iff \ast_n^k(x) \in \ast_m^k(y)$ for some k ≥ n, m where n = nx , m = ny . (9.5)
Proof. The fact that the right-hand side of (9.2) resp. of (9.3) is equivalent to the
right-hand side of (9.4) resp. of (9.5) follows by observing that for any K ≥ k ≥ n, m,
we have $\ast_n^K = \ast_k^K \ast_n^k$ and $\ast_m^K = \ast_k^K \ast_m^k$ and that $\ast_k^K$ preserves the equality and
element relation: The latter follows from Lemma 3.5, since $\ast_k^K$ is elementary by
Lemma 9.5. Once we know this equivalence, it follows straightforwardly that =ω
and ∈ω are well-defined.
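The bookkeeping behind the equivalence relation and Lemma 9.6 can be illustrated by a toy chain of injective maps. This is a sketch only: the "models" are just tagged integers, every step map doubles its argument, and the names `star`, `equivalent` and `equal_at` are hypothetical. Nothing about elementarity is modelled; only the identity (9.1) and the agreement of the "for all k" and "for some k" formulations are checked.

```python
def star(n, m, x):
    """Toy composite *_n^m : S_n -> S_m for n <= m; every single step doubles x."""
    if n > m:
        raise ValueError("star(n, m, x) needs n <= m")
    return x * 2 ** (m - n)

def equivalent(p, q):
    """(n, a) ~ (m, b): the lower-level element is mapped onto the other one."""
    (n, a), (m, b) = p, q
    return star(n, m, a) == b if n <= m else star(m, n, b) == a

def equal_at(p, q, k):
    """Compare the images of p and q in a common level k >= n, m."""
    (n, a), (m, b) = p, q
    return star(n, k, a) == star(m, k, b)

LEVELS, TOP = range(4), 6
# Elements are tagged with their level, which makes the sets S_n pairwise disjoint.
points = [(n, a) for n in LEVELS for a in range(9)]

# (9.1): *_n^m = *_k^m o *_n^k for n <= k <= m.
cocycle_ok = all(star(n, m, a) == star(k, m, star(n, k, a))
                 for n in LEVELS for k in range(n, 4) for m in range(k, 4)
                 for a in range(9))

# The "for all k" form (9.2) and the "for some k" form (9.4) agree with ~.
agree_ok = all(
    equivalent(p, q)
    == all(equal_at(p, q, k) for k in range(max(p[0], q[0]), TOP + 1))
    == any(equal_at(p, q, k) for k in range(max(p[0], q[0]), TOP + 1))
    for p in points for q in points)
print(cocycle_ok, agree_ok)
```

The agreement relies only on the injectivity of the step maps, which is the toy counterpart of the fact that the maps preserve equality.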
A sentence α in the language Ln will now be interpreted in the abstract
model Sω by mapping any constant c ∈ cns(Ln ) corresponding to some x ∈ Sn
to the equivalence class [x].
Theorem 9.7. A transitively bounded sentence α in the language Ln is true in the
abstract model Sω if and only if it is true under the interpretation map In .
Proof. It suffices to prove that whenever α is true under the interpretation map
In , it is also true in the abstract model Sω : Indeed, if we have proved this, and α
is false under the interpretation map In , then the transitively bounded sentence
β = ¬α is true. But then, by assumption, β is true in Sω , and so α is false in Sω .
Thus, without loss of generality, assume that α is true under the interpre-
tation map In . We may equivalently rewrite α in prenex normal form, i.e. in the
form
Q1 x1 : Q2 x2 : . . . Qk xk : β
where Qj stands for a quantifier ∀ or ∃, and where β contains no further quantifiers.
For the proof, we make use of so-called Herbrand-Skolem functors. These are
defined as follows: Let j1 < j2 < · · · < jp be those indices for which Qji is the
symbol ∃ (p = 0 is not excluded). In particular, α has the form
$\forall x_1 : \ldots \forall x_{j_1-1} : \exists x_{j_1} : Q_{j_1+1} x_{j_1+1} : \ldots Q_k x_k : \beta$
(j1 = 1 is not excluded). Note that In is onto, i.e. each possible value of xj is
actually represented by some constant in cns(Ln ). Then the statement
that α is true under the interpretation In means that for each possible value of
$x_1, \ldots, x_{j_1-1}$, we find a constant $c_1^n(x_1, \ldots, x_{j_1-1})$ such that the sentence
$Q_{j_1+1} x_{j_1+1} : \ldots Q_k x_k : \beta(c_1^n)$
is true, where $\beta(c_1^n)$ arises from β by replacing all free occurrences of $x_{j_1}$ by the
constant $c_1^n(x_1, \ldots, x_{j_1-1})$. By the axiom of choice, we may assume that $c_1^n$ is a
function. Note that conversely, the existence of such a function $c_1^n$ implies that α is
true under the interpretation map In . (However, the reader should be aware that
$c_1^n$ is not a function in the sense of the language Ln ). The function $c_1^n$ is called
the Herbrand-Skolem functor for the existential quantifier $Q_{j_1}$. By an induction,
we thus can eliminate all quantifiers, and find that α is true if and only if
$\beta(c_1^n, \ldots, c_p^n)$
holds for any values of the free variables xj (j ≠ j1 , . . . , jp ) where $c_k^n$ is a Herbrand-
Skolem functor depending only on the choice of the values of xj (j ≠ j1 , . . . , jp ).
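For a sentence of the simple form ∀x ∃y : β(x, y) over a finite domain, the role of a Herbrand-Skolem functor can be made concrete. The following is a sketch under assumptions: the domain `D` and the quantifier-free formulas `beta` and `gamma` are hypothetical examples, and the functor is found by plain search instead of the axiom of choice.

```python
from itertools import product

D = range(5)  # hypothetical finite domain

def beta(x, y):
    """Quantifier-free part of: for all x there is y with x + y = 0 (mod 5)."""
    return (x + y) % 5 == 0

def gamma(x, y):
    """Quantifier-free part of: for all x there is y with y*y = x (mod 5)."""
    return (y * y) % 5 == x

def is_true(formula):
    """Truth of the sentence 'for all x exists y: formula(x, y)' in D."""
    return all(any(formula(x, y) for y in D) for x in D)

def has_skolem_function(formula):
    """Is there a function c : D -> D with formula(x, c(x)) for every x?"""
    return any(all(formula(x, c[x]) for x in D)
               for c in product(D, repeat=len(D)))

# A concrete functor for beta, chosen by search.
c_beta = {x: next(y for y in D if beta(x, y)) for x in D}
print(is_true(beta), has_skolem_function(beta), c_beta)
print(is_true(gamma), has_skolem_function(gamma))
```

In both cases the sentence is true exactly when a functor exists, which is the equivalence used in the proof; the second sentence is false because 2 and 3 are not squares modulo 5.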
Note now that α is transitively bounded, and so it remains true under the
interpretation Im if m ≥ n, provided we replace all occurrences of constants c by $i_n^m(c)$.
Hence, we find for each m ≥ n Herbrand-Skolem functors $c_1^m, \ldots, c_p^m$ such that
${}^{i_n^m}\beta(c_1^m, \ldots, c_p^m)$ (9.6)
holds for all values of the free variables xj (j ≠ j1 , . . . , jp ) in Sm , where ${}^{i_n^m}\beta$ arises
from β by replacing all occurrences of constants c by $i_n^m(c)$. We claim that there
are even Herbrand-Skolem functors $c_1^\omega, \ldots, c_p^\omega$ such that
${}^{\omega}\beta(c_1^\omega, \ldots, c_p^\omega)$ (9.7)
is true for all values of the free variables xj (j ≠ j1 , . . . , jp ) in Sω . Here, ${}^{\omega}\beta$
arises from β by interpreting all constants c which represent some x ∈ Sn by the
corresponding equivalence class [x] ∈ Sω . The existence of such Herbrand-Skolem
functors then means in turn that α is true when interpreted in the model Sω .
Thus, let xj (j ≠ j1 , . . . , jp ) have fixed values in Sω , i.e. xj = [aj ] where
[aj ] ∈ Sω . Note that we have aj ∈ Snj for some nj . Fix m ≥ n such that m ≥ nj
for all j, and put $b_j := \ast_{n_j}^m a_j$. Then bj ∈ Sm , and [aj ] = [bj ]. In particular,
xj = [bj ]. Now let $c_k^\omega$ be the equivalence class containing the interpretation of $c_k^m$
under the values xj = bj . For the choice xj = bj the formula (9.6) is true in Sm ,
and it follows that also (9.7) is true for xj = [bj ] in Sω , as claimed:
Indeed, since β contains no quantifiers, it consists only of logical connectives
and elementary formulas a = b and a ∈ b. If we can prove that the elementary
formula in (9.6) is true if and only if the corresponding formula in (9.7) is true,
also the complete formulas in (9.6) resp. (9.7) must have the same truth value.
It thus remains to prove that for any a, b ∈ Sm we have a = b if and only if
[a] =ω [b], and a ∈ b if and only if [a] ∈ω [b]. But it follows from (9.2) that [a] =ω [b]
implies $\ast_m^m(a) = \ast_m^m(b)$ and so a = b (since we assume a, b ∈ Sm ); analogously,
[a] ∈ω [b] implies by (9.3) that $\ast_m^m(a) \in \ast_m^m(b)$, and so a ∈ b. For the converse
implication, observe that a = b implies $\ast_m^m(a) = \ast_m^m(b)$, and so [a] =ω [b] by (9.4);
analogously, a ∈ b implies $\ast_m^m(a) \in \ast_m^m(b)$, and so [a] ∈ω [b] by (9.5).
It follows from the proof, that Theorem 9.7 holds not only for transitively
bounded sentences but even for a larger class of sentences, if the corresponding
embeddings ∗n preserve the truth for the corresponding class of sentences.
Exercise 48. Prove that any sentence α can equivalently be rewritten in prenex
normal form.
As in Section 4.3, we now want to replace the abstract model Sω by some
superstructure Sω . Of course, this replacement should be done in such a way that
transitively bounded sentences α in any of the languages Ln should keep their
truth value.
If we want to proceed analogously to Section 4.3, we have to define sets
Ik ⊆ Sω which represent “internal objects of level at most k”, in particular I0
will represent the atoms.
The definition of Ik is rather straightforward: Recall that each superstructure
Sn is built of level sets Sn,k (i.e. Sn,0 represents the atoms of the superstructure,
and Sn,k+1 = Sn,0 ∪ P(Sn,k )). Then we just let
Ik := {[x] : x ∈ Sn,k for some n}.
It follows from the definition of the superstructures Sn that I0 ⊆ I1 ⊆ · · · and
$\bigcup_k I_k = S_\omega$.
As already announced, we put now Sω := I0 , i.e. Sω consists of all classes
[x] where x is an atom in some of the superstructures Sn .
To continue as in Section 4.3, we need an analogue to Lemma 4.16 and
Lemma 4.17:
Lemma 9.8. We have
Ik = {[x] ∈ Sω : There is some [y] ∈ Ik+1 with [x] ∈ω [y]}.
Proof. If [x] ∈ Ik is given, we have x ∈ Sn,k for some n. Then y := Sn,k is an
element of Sn,k+1 , and so [y] ∈ Ik+1 . By (9.5), we have [x] ∈ω [y].
Conversely, assume that [y] ∈ Ik+1 and [x] ∈ω [y]. Then x ∈ Sn,i , y ∈ Sm,k+1
for some n, m and some i, and we find some K ≥ n, m with $\ast_n^K(x) \in \ast_m^K(y)$. Since
$\ast_m^K$ is elementary (Lemma 9.5), we have ${}^{\ast_m^K}S_{m,k+1} \subseteq S_{K,k+1}$ (Theorem A.1), and
so $\ast_m^K(y) \in S_{K,k+1} = S_{K,0} \cup P(S_{K,k})$. In view of $\ast_n^K(x) \in \ast_m^K(y)$, this implies
$\ast_n^K(x) \in S_{K,k}$. Hence, $[x] = [\ast_n^K(x)] \in I_k$.
Lemma 9.9. If elements [x], [y] ∈ Sω \ I0 satisfy
{[z] ∈ Sω : [z] ∈ω [x]} = {[z] ∈ Sω : [z] ∈ω [y]}, (9.8)
then [x] = [y].

Proof. Let x ∈ Sm and y ∈ Sn , without loss of generality, n ≥ m: Putting $x_0 = \ast_m^n(x)$,
we have [x0 ] = [x] and x0 ∈ Sn . Hence, replacing x by x0 if necessary, it
is no loss of generality to assume that x and y belong to the same superstructure
Sn . Then we have [x] = [y] if and only if x = y. Moreover, x and y are not atoms
of the superstructure Sn , because [x], [y] ∉ I0 . Consequently, if [x] ≠ [y], then
the sets x and y are different, and so we find an element which belongs to one
of these sets but not to the other. Without loss of generality, z ∈ x \ y. In view of
Theorem 9.7, we then have [z] ∈ω [x] and ¬[z] ∈ω [y], a contradiction to (9.8).
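Now we may proceed analogously to Section 4.3: As in Section 4.3, we may
define an injection ϕω : Sω → Sω :
$\varphi_\omega([y]) = \begin{cases} [y] & \text{if } [y] \in I_0 = S_\omega, \\ \{\varphi_\omega([x]) : [x] \in_\omega [y]\} & \text{if } [y] \notin I_0 = S_\omega. \end{cases}$ (9.9)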
Indeed, for the construction of ϕω in Section 4.3, no particular properties of the

abstract model S have been used except for Lemma 4.16 and Lemma 4.17 for
which we just have proved corresponding replacements.
Also an analogue of Theorem 4.18 can be proved (with essentially the same
proof): In this connection the reader should note that the set I ′ from Section 4.3
corresponds in our case to the whole abstract model Sω . We only formulate the
analogue of the second part of Theorem 4.18:
Theorem 9.10. A transitively bounded sentence in the language whose constants
are taken from Sω is true (interpreted by the identity map) if and only if it is true
under the interpretation map ϕω .
We emphasize that the proof of Theorem 9.10 makes use of the fact that the
sentence is transitively bounded, but this restriction could be relaxed in the same
way in which this restriction in Theorem 4.18 could have been relaxed.
Now we define a new language Lω by taking the set of its constants, cns(Lω ),
in a one-to-one correspondence with Sω (the correspondence being the interpretation
map Iω ). Now we can define maps $\ast_n^\omega$ : Sn → Sω by
$\ast_n^\omega(x) = \varphi_\omega([x])$,
and also $i_n^\omega$ : cns(Ln ) → cns(Lω ) by $i_n^\omega = I_\omega^{-1} \circ \ast_n^\omega \circ I_n$, i.e. if c ∈ cns(Ln )
corresponds to the element x ∈ Sn , then $i_n^\omega(c)$ corresponds to the element $\ast_n^\omega(x)$.
For later reference, we observe by the way that the value $\ast_n^\omega(x)$ depends by
definition only on the equivalence class of x, and so it follows that
$\ast_n^\omega = \ast_k^\omega \ast_n^k$ (n ≤ k).
Hence, the relation (9.1) holds also for m = ω (and even for k = ω or n = ω if we
define $\ast_\omega^\omega(x) = x$).
Theorem 9.11. Each of the maps $\ast_n^\omega$ : Sn → Sω is elementary.
Moreover, if an element x ∈ Sω is internal under this map, then there is
some m < ω such that x is a standard entity under the map $\ast_m^\omega$ : Sm → Sω .
Proof. Let α be a transitively bounded sentence in the language Ln which is true
under the interpretation map In . By Theorem 9.7, α is true under the interpretation
map J ◦ In where J : Sn → Sω is the map sending each element x ∈ Sn
to the equivalence class [x]. Hence, Theorem 9.10 implies that α is true under
the interpretation map $I_n' = \varphi_\omega \circ J \circ I_n = \ast_n^\omega \circ I_n$.
Hence, condition 1. of Definition 3.1 is satisfied. The other condition reads
in our case ${}^{\ast_n^\omega}S_n = S_\omega$. To prove this, note that (9.9) implies
${}^{\ast_n^\omega}S_n = \varphi_\omega([S_n]) = \{\varphi_\omega([x]) : [x] \in_\omega [S_n]\}$. (9.10)
By Lemma 9.8, the relation [x] ∈ω [Sn ] implies [x] ∈ I0 . But also the converse
holds: If [x] ∈ I0 , then [x] ∈ω [Sn ].
To see the latter, let [x] ∈ I0 and note that we may by definition of I0
choose some representative x which is an atom x ∈ Sm for some m. If m ≤ n, then
${}^{\ast_m^n}S_m = S_n$, because $\ast_m^n$ is elementary (Lemma 9.5), and so $x_0 = \ast_m^n(x) \in S_n$.
Since x0 ∼ x, we have [x] = [x0 ] ∈ω [Sn ]. If m ≥ n, then the relation ${}^{\ast_n^m}S_n = S_m$
(because $\ast_n^m$ is elementary) implies that $[x] \in_\omega [{}^{\ast_n^m}S_n] = [S_n]$. Hence, in both cases
[x] ∈ω [Sn ], as claimed.
It thus follows from (9.10) that
${}^{\ast_n^\omega}S_n = \{\varphi_\omega([x]) : [x] \in I_0\} = I_0 = S_\omega$,
where we made use of (9.9). This completes the proof that $\ast_n^\omega$ is elementary.
If x is internal under the map $\ast_n^\omega$, then $x \in {}^{\ast_n^\omega}S_{n,k}$ for some k. By (9.9), we
have
${}^{\ast_n^\omega}S_{n,k} = \varphi_\omega([S_{n,k}]) = \{\varphi_\omega([y]) : [y] \in_\omega [S_{n,k}]\}$.
In particular, x = ϕω ([y]) for some [y] ∈ Sω . We must have y ∈ Sm for some m,
and so $x = \ast_m^\omega(y)$.
We thus have constructed a true limit model Sω for the sequence Sn . Since
each of the models Sn becomes successively “more saturated”, and since any tran-
sitively bounded sentence in the model Sn is also true in the limit model Sω , one
might expect that the limit model is “enormously saturated”. This is indeed the
case:
Theorem 9.12. The map $\ast_0^\omega$ : S → Sω is a compact enlargement.
Proof. By Theorem 9.11, the map $\ast = \ast_0^\omega$ is elementary. Let ϕ be an internal
binary relation which is concurrent on an entity A which contains at most finitely
many nonstandard elements.
By Theorem 9.11, we find some index mϕ such that ϕ is standard under
the map $\ast_{m_\varphi}^\omega$ : Smϕ → Sω . This means that there is some ϕ0 ∈ Smϕ such that
$\varphi = {}^{\ast_{m_\varphi}^\omega}\varphi_0$.
Similarly, we find for each a ∈ A some ma such that $a = {}^{\ast_{m_a}^\omega}c_a$ for some
ca ∈ Sma ; moreover, we may assume that ma = 0 for all except at most finitely
many a ∈ A (because A contains only finitely many nonstandard elements).
Hence, m := max({ma : a ∈ A} ∪ {mϕ }) exists. Consider in the model Sm
the relation $\psi := {}^{\ast_{m_\varphi}^m}\varphi_0$ and the set B consisting of the elements $b_a := {}^{\ast_{m_a}^m}c_a$.
We claim that ψ is concurrent on B: Indeed, if a1 , . . . , an ∈ A, then there is
some y ∈ rng(ϕ) such that (ak , y) ∈ ϕ for k = 1, . . . , n, because ϕ is concurrent on
A. Now note that $\operatorname{rng}(\varphi) = \operatorname{rng}({}^{\ast_m^\omega}\psi) = {}^{\ast_m^\omega}\operatorname{rng}(\psi)$ and $a_k = {}^{\ast_m^\omega}b_{a_k}$. We thus have
in Sω ,
$\exists y \in {}^{\ast_m^\omega}\operatorname{rng}(\psi) : ({}^{\ast_m^\omega}b_{a_1}, y), \ldots, ({}^{\ast_m^\omega}b_{a_n}, y) \in {}^{\ast_m^\omega}\psi$
Applying the inverse form of the transfer principle (for the map $\ast_m^\omega$), we find that
there is some y ∈ rng(ψ) such that (ba1 , y), . . . , (ban , y) ∈ ψ, i.e. ψ is concurrent
on B, as claimed.
Since $\ast_m^{m+1}$ : Sm → Sm+1 is an enlargement, we may conclude that there
is some y ∈ Sm+1 such that $({}^{\ast_m^{m+1}}b_a, y) \in {}^{\ast_m^{m+1}}\psi$ holds for each ba ∈ B. Hence,
$({}^{\ast_{m+1}^\omega}({}^{\ast_m^{m+1}}b_a), {}^{\ast_{m+1}^\omega}y) \in {}^{\ast_{m+1}^\omega}({}^{\ast_m^{m+1}}\psi)$ for each a ∈ A which means $(a, {}^{\ast_{m+1}^\omega}y) \in \varphi$
for each a ∈ A, i.e. ϕ is satisfied on A, as desired.
9.3 Polysaturated Models

To prove the existence of a polysaturated model, we have to work even more: We
have to repeat the construction of Section 9.2 a transfinite number of times.
More precisely, we construct a transfinite sequence of enlargements: By a
transfinite induction, we associate to each ordinal number α a model Sα and a
family of maps $\ast_\beta^\alpha$ : Sβ → Sα (β ≤ α) such that the following holds:
1. For all ordinals α1 ≤ α2 ≤ α3 ≤ α we have
$\ast_{\alpha_1}^{\alpha_3} = \ast_{\alpha_2}^{\alpha_3} \ast_{\alpha_1}^{\alpha_2}$,
and $\ast_\alpha^\alpha$ is the identity.
2. For each β < α the map $\ast_\beta^\alpha$ : Sβ → Sα is elementary.
3. If α is a successor ordinal, i.e. α = β + 1 for some ordinal β, then $\ast_\beta^\alpha$ : Sβ → Sα
is even an enlargement.
4. If α is a limit ordinal (i.e. not a successor ordinal), then each element which is
internal under some map $\ast_\beta^\alpha$ : Sβ → Sα with β < α is standard under another
such map.
Theorem 9.13. For each set S there exists a transfinite sequence as described above
such that S0 = S.
Proof. The induction start is clear: Put S0 := S, and let $\ast_0^0$ : S0 → S0 be the
identity. If α = β + 1, let $\ast_\beta^\alpha$ : Sβ → Sα be an enlargement, and for ordinals γ < β
define $\ast_\gamma^\alpha := \ast_\beta^\alpha \ast_\gamma^\beta$.
Thus, the only case which needs some care is if α is a limit ordinal. For
α = ω, we have proved the existence of a limit model. However, for the general
case, the proof is analogous: Actually, we have at no place used the fact that we
determined the limit of a countable sequence (only the relation (9.1) has been
used). For clearer representation we had used in the above proof languages Lβ ,
interpretation maps Iβ : cns(Lβ ) → Sβ , and maps $i_{\beta_1}^{\beta_2}$ : cns(Lβ1 ) → cns(Lβ2 ):
One may of course just put cns(Lβ ) := Sβ , and let Iβ and $i_{\beta_1}^{\beta_2}$ be the identity map.
Then, with the same proof as above, we can construct a limit model Sα such that
each of the embeddings $\ast_\beta^\alpha$ : Sβ → Sα is elementary.
The crucial point of the above construction is that we now indeed obtain κ-
saturated models. The proof is in most parts analogous to Theorem 9.12, but one
has to take care, since there need not exist a maximum for infinite sets of ordinals.
To ensure that the supremum is actually strictly smaller than the ordinal number
α of the highest model, we implicitly make use of a property of successor cardinals
α which is called “regularity” in literature on set theory:
Theorem 9.14. Let S and κ be arbitrary sets. Then in the transfinite sequence
from Theorem 9.13 the elementary map $\ast_0^\alpha$ : S → Sα is κ-saturated, if α is the
first ordinal with a strictly larger cardinality than κ.
Proof. We may assume that κ is infinite, since otherwise each elementary embedding
is κ-saturated.
Let ℬ be a nonempty system of internal entities which has the finite intersection
property and at most the cardinality of κ. We have to prove that $\bigcap \mathcal{B} \neq \emptyset$.
Since each B ∈ ℬ is an internal set, we find some index βB < α such that B is a
standard set under the embedding $\ast_{\beta_B}^\alpha$ : SβB → Sα , i.e. there is some CB ∈ SβB
such that $B = {}^{\ast_{\beta_B}^\alpha}C_B$.
Put $\beta := \bigcup \{\beta_B : B \in \mathcal{B}\}$. Then β is an ordinal number. Since βB < α, the
set βB has at most the cardinality of κ. Hence, β has at most the cardinality of
the set κ × ℬ. By assumption, ℬ has at most the cardinality of κ. Hence, β has
at most the cardinality of κ × κ which is the cardinality of κ, since κ is infinite.
Summarizing, β has at most the cardinality of κ. By our choice of α, we may
conclude that β < α.
Consider in the model Sβ the sets $A_B := {}^{\ast_{\beta_B}^\beta}C_B$. Then the system 𝒜 :=
{AB : B ∈ ℬ} has the finite intersection property: Indeed, if B1 , . . . , Bn ∈ ℬ,
then B1 ∩ · · · ∩ Bn ≠ ∅, since ℬ has the finite intersection property by assumption.
Now note that ${}^{\ast_\beta^\alpha}A_{B_i} = {}^{\ast_{\beta_{B_i}}^\alpha}C_{B_i} = B_i$, and so
${}^{\ast_\beta^\alpha}(A_{B_1} \cap \cdots \cap A_{B_n}) = {}^{\ast_\beta^\alpha}A_{B_1} \cap \cdots \cap {}^{\ast_\beta^\alpha}A_{B_n} = B_1 \cap \cdots \cap B_n \neq \emptyset$,
because $\ast_\beta^\alpha$ is a superstructure monomorphism. Hence, AB1 ∩ · · · ∩ ABn ≠ ∅, and
𝒜 has the finite intersection property, as claimed.
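Since $\ast_\beta^{\beta+1}$ : Sβ → Sβ+1 is an enlargement, we may conclude that there is
some d ∈ Sβ+1 which is contained in each of the sets $D_B := {}^{\ast_\beta^{\beta+1}}A_B$ (B ∈ ℬ). Since
${}^{\ast_{\beta+1}^\alpha}D_B = {}^{\ast_{\beta_B}^\alpha}C_B$, we find by Lemma 3.5 that $b := {}^{\ast_{\beta+1}^\alpha}d \in {}^{\ast_{\beta+1}^\alpha}D_B = {}^{\ast_{\beta_B}^\alpha}C_B = B$
(B ∈ ℬ). In particular, $b \in \bigcap \mathcal{B}$, and so $\ast_0^\alpha$ is κ-saturated.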
Corollary 9.15. For each set S there exists a polysaturated map ∗ : S → ∗S.
Our above construction of the transfinite sequence is similar to the construc-
tion of ultralimits in [SL76, Section 7.5] (see also [Lux69a, Section 1.7]): In these
books, a transfinite sequence of abstract (ultrapower) models is considered. It is
not discussed there, how the limit model might be embedded into a superstructure
(and it appears that this is a rather difficult task).
However, our construction is different: We consider at each step of the trans-
finite sequence already the embedding into some superstructure. This has the
advantage that one is not forced to use ultrapower models to construct the enlarge-
ments needed for the induction step α → α + 1. (As mentioned after Theorem 8.5,

there exist enlargements which are not -saturated. By Theorem 9.1, these en-
largements cannot be described by ultrapower models). However, even if one uses
ultrapowers, it is not clear whether we end up with the same model as in [SL76],
because the embedding of the abstract model into the superstructure in each step
does not preserve all sentences but only the transitively bounded sentences.
Chapter 5
Functionals, Generalized Limits, and Additive Measures
Throughout this chapter, we assume that ℝ ∈ S is an entity.
§10 Normed Spaces

10.1 Linear Functionals and Operators
Recall that a map ‖·‖ : X → ℝ+ on a linear space (=vector space) X over 𝕂 = ℝ
or 𝕂 = ℂ is called a norm, if the following holds:
1. ‖x‖ = 0 if and only if x = 0.
2. ‖λx‖ = |λ| ‖x‖ for scalars λ ∈ 𝕂.
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖ (the triangle inequality).
Each norm induces a metric by means of the formula d(x, y) = ‖x − y‖. A normed
linear space is called a Banach space, if it is complete with respect to the metric
induced by the norm (i.e. if any Cauchy sequence converges). A map f : X → Y
in normed spaces is called linear if f (x + y) = f (x) + f (y) and f (λx) = λf (x). If
Y = 𝕂, then f is called a (linear) functional . If X and Y are normed spaces, then
a linear map f : X → Y is called bounded , if there is a constant L ∈ ℝ+ with
‖f (x)‖ ≤ L ‖x‖ (x ∈ X),
or, equivalently
$\sup_{\|x\|=1} \|f(x)\| = \sup_{\|x\|\le 1} \|f(x)\| = \sup_{x \neq 0} \frac{\|f(x)\|}{\|x\|} < \infty.$
The minimum of all such constants L (which is the above supremum) is called the
norm ‖f‖ of f . In particular, a linear functional is bounded if and only if
$\|f\| = \sup_{\|x\|=1} |f(x)| < \infty.$
The space of all bounded linear functionals on X is usually denoted by X ∗ . It is

endowed with the canonical vector operations (i.e. (f + g)(x) := f (x) + g(x) and
(λf )(x) := λf (x)). It is well-known and can be easily checked that X ∗ is always
a Banach space with respect to the norm introduced above; it is called the dual
space to X.
A classical result of functional analysis is the theorem of Hahn-Banach for
which we give now a nonstandard proof:
Theorem 10.1 (Hahn-Banach). Let X0 be a linear subspace of a normed linear
space X. Let f ∈ X0∗ . Then f may be extended to a bounded linear functional
F ∈ X ∗ without increasing the norm, i.e. f (x) = F (x) for x ∈ X0 and ‖f‖ = ‖F‖.
The nonstandard proof of Theorem 10.1 is reduced to the following special
case in the standard world. This part of the proof is classical:
Lemma 10.2. Let f, X0 , X be as in Theorem 10.1 with 𝕂 = ℝ. Then for each
finite number of elements x1 , . . . , xn ∈ X one may extend f to a subspace U ⊆ X
which contains x1 , . . . , xn without increasing the norm.
Proof. Evidently, it suffices to consider n = 1 (because in the general case, we
may first extend f to a subspace which contains x1 , then to a subspace which
also contains x2 , etc.). If x1 ∈ X0 , the statement is trivial. Otherwise, put U :=
span(X0 ∪ {x1 }). Note that x1 is linearly independent of X0 , i.e. each u ∈ U may
uniquely be written in the form u = x + λx1 where x ∈ X0 and λ ∈ ℝ. The only
way to extend f linearly to U is by putting
F (x + λx1 ) := f (x) + λc
with some c ∈ ℝ. We have to prove that we may choose c in such a way that
‖F‖ ≤ ‖f‖, i.e. |F (u)| ≤ ‖f‖ ‖u‖ for all u ∈ U . The latter is equivalent to
|f (x0 ) + λc| ≤ ‖f‖ ‖x0 + λx1‖ (x0 ∈ X0 , λ ∈ ℝ).
After dividing by |λ| (the case λ = 0 is trivial), we find, putting x := x0 /λ ∈ X0 ,
that the above inequality is equivalent to
|f (x) + c| ≤ ‖f‖ ‖x + x1‖ (x ∈ X0 ).
Since 𝕂 = ℝ, we thus have to find some c ∈ ℝ such that
−‖f‖ ‖x + x1‖ − f (x) ≤ c ≤ ‖f‖ ‖x + x1‖ − f (x) (x ∈ X0 ).
Such a constant c exists, since for each x, y ∈ X0 the estimate
f (x) − f (y) = f (x − y) ≤ ‖f‖ ‖x − y‖ = ‖f‖ ‖(x + x1 ) − (y + x1 )‖
≤ ‖f‖ (‖x + x1‖ + ‖y + x1‖)
holds, which implies that
$\sup_{y \in X_0} \bigl(-\|f\| \|y + x_1\| - f(y)\bigr) \le \inf_{x \in X_0} \bigl(\|f\| \|x + x_1\| - f(x)\bigr).$
Thus, we may just choose some c in between these two quantities.
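The one-step extension can be tried out numerically in the plane. The following is a sketch under assumptions: X is ℝ² with the ℓ1 norm, X0 is the first coordinate axis, f (t, 0) = t (so ‖f‖ = 1) and x1 = (0, 1); these choices and all names are hypothetical. The supremum and infimum are only sampled on a finite dyadic grid, which happens to give the exact values in this example.

```python
def norm1(a, b):
    """l1 norm on R^2."""
    return abs(a) + abs(b)

def f(t):
    """The functional f(t, 0) = t on X0 = span{(1, 0)}; its norm is 1."""
    return t

f_norm = 1.0
grid = [k / 8 for k in range(-64, 65)]   # sample points of X0 (dyadic, exact floats)

# Sampled versions of the two bounds for c, with x1 = (0, 1).
lower = max(-f_norm * norm1(y, 1.0) - f(y) for y in grid)
upper = min(f_norm * norm1(x, 1.0) - f(x) for x in grid)

c = lower + 0.75 * (upper - lower)       # any value in [lower, upper] would do

def F(a, b):
    """Extension F(x + lambda * x1) := f(x) + lambda * c."""
    return f(a) + b * c

norm_kept = all(abs(F(a, b)) <= f_norm * norm1(a, b) + 1e-12
                for a in grid for b in grid)
print(lower, upper, c, norm_kept)
```

Here every c between the two bounds gives an admissible extension, which shows that the extension is in general not unique.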

Proof of Theorem 10.1. First, assume that 𝕂 = ℝ. Choose some S such that
ℝ, X ∈ S are entities. Let ∗ : S → ∗S be an enlargement. Consider the relation
ϕ := {(x, y) ∈ X × F (X, ℝ) | α(x, y)},
where F (X, ℝ) denotes the set of all functions y : X → ℝ with dom(y) ⊆ X
and rng(y) ⊆ ℝ, and α(x, y) is a transitively bounded sentence with the meaning
“y is linear, defined on a subspace U ⊇ X0 with x ∈ U , and its norm does
not exceed ‖f‖”. By Lemma 10.2, the relation ϕ is concurrent on X. Since ∗ is
an enlargement, Theorem 8.10 implies that there is some y ∈ ∗ F (X, ℝ) which
satisfies ∗ ϕ on σ dom(ϕ) = σ X, i.e. (∗ x, y) ∈ ∗ ϕ for any x ∈ X. To determine ∗ ϕ,
we may use the standard definition principle for relations. In view of Exercise 83,
y is an internal linear functional, defined on an internal subspace U ⊆ ∗ X with
∗ x ∈ U for each x ∈ X and such that |y(u)| ≤ ∗ (‖f‖) ∗‖u‖ for u ∈ U . Define now
F : X → ℝ by F (x) := st(y(∗ x)). Since st is linear (Theorem 5.21), also F is
linear. Moreover, since
|y(∗ x)| ≤ ∗‖y‖ ∗‖∗ x‖ ≤ ∗ (‖f‖) ∗‖∗ x‖ = ∗ (‖f‖ ‖x‖),
the function F is indeed defined on X (because y(∗ x) is finite), and |F (x)| ≤
‖f‖ ‖x‖, i.e. ‖F‖ ≤ ‖f‖.
The case 𝕂 = ℂ is reduced to the case 𝕂 = ℝ: Indeed, by considering
only multiplication with real scalars, we find that the function fℝ (x) := Re(f (x))
defines a linear functional over the (real) vector space X which satisfies ‖fℝ‖ ≤
‖f‖. By what we proved above, we may extend fℝ to some (real) linear Fℝ on
X without increasing the norm. Define now F (x) := Fℝ (x) − iFℝ (ix). Then F
is linear in the complex sense, since it is linear in the real sense and F (ix) =
Fℝ (ix) − iFℝ (−x) = Fℝ (ix) + iFℝ (x) = iF (x). Moreover,
$\|F\| = \sup_{\|x\|=1} \sup_{|\lambda|=1} \operatorname{Re}(\lambda F(x)) = \sup_{\|x\|=1} \sup_{|\lambda|=1} \operatorname{Re}(F(\lambda x)) \le \sup_{\|x\|=1} |F_{\mathbb{R}}(x)| = \|F_{\mathbb{R}}\|,$
and so ‖F‖ ≤ ‖Fℝ‖ = ‖fℝ‖ ≤ ‖f‖.
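The passage from a real-linear functional to a complex-linear one can be checked on ℂ². This is a sketch under assumptions: the coefficient tuple `r`, the test vector `z` and the scalar `lam` are hypothetical, and the identities are verified only at these sample values, up to floating-point tolerance.

```python
r = (1.5, -2.0, 0.25, 3.0)   # hypothetical real coefficients

def F_real(z):
    """A real-linear functional on C^2, viewed as R^4."""
    z1, z2 = z
    return r[0] * z1.real + r[1] * z1.imag + r[2] * z2.real + r[3] * z2.imag

def F(z):
    """F(z) := F_real(z) - i * F_real(i z)."""
    iz = tuple(1j * w for w in z)
    return F_real(z) - 1j * F_real(iz)

def scale(lam, z):
    """Scalar multiplication in C^2."""
    return tuple(lam * w for w in z)

z = (0.5 - 2j, 3 + 1j)
lam = 2 - 3j

i_ok = abs(F(scale(1j, z)) - 1j * F(z)) < 1e-9              # F(ix) = i F(x)
homogeneous_ok = abs(F(scale(lam, z)) - lam * F(z)) < 1e-9  # F(lam x) = lam F(x)
real_part_ok = abs(F(z).real - F_real(z)) < 1e-9            # Re F = F_real
print(i_ok, homogeneous_ok, real_part_ok)
```

The last check reflects that the real part of F gives back the real functional one started from.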

Some remarks are appropriate: The previous nonstandard proof is essentially

due to W. A. J. Luxemburg [Lux62]. The main advantage compared to the classical
(standard) proof is that a careful analysis of the underlying model shows that not
the full power of the axiom of choice is needed but only the existence of a certain
ultrafilter which is slightly less restrictive from the logical point of view [Pin73].
Actually, by a refinement of the method, it is proved in [Lux69c] that one does not
even need an ultrafilter but only an additive measure which is even less restrictive
[PS77]. Anyway, in the author’s opinion these weakenings of the axiom of choice
are not too interesting, since already the theorem of Hahn-Banach implies the most
counterintuitive consequences of the axiom of choice (for example, the existence of
nonmeasurable sets [FW91] or the Banach-Tarski paradox [Paw91] can be proved
by Hahn-Banach’s extension theorem without the axiom of choice).
Exercise 49. Prove the following form of the Hahn-Banach extension theorem: Let
X be a real linear space, and p : X → ℝ sublinear , i.e. p(x + y) ≤ p(x) + p(y)
and p(λx) = λp(x) for λ > 0. If X0 ⊆ X is a subspace and f : X0 → ℝ is linear
with f (x) ≤ p(x) on X0 , then f may be extended to a linear F : X → ℝ with
F (x) ≤ p(x) on X. Why is Theorem 10.1 for 𝕂 = ℝ a special case?
Exercise 50. A normed space X is called separable, if there is a countable set
C ⊆ X with $\overline{C} = X$. Give a standard proof of Theorem 10.1 for the case that X is
separable, using only Lemma 10.2 and a countable (recursive) form of the axiom
of choice.
Hint: Extend f first to the linear hull U of C ∪ X0 .
It is a well-known result of functional analysis that on any infinite-
dimensional normed space there exist linear functionals which are not bounded.
However, in the nonstandard world, it suffices to consider bounded (internal)
functionals:
Theorem 10.3. Let X be a normed linear space with dual space X ∗ . Let ∗ be an
X-enlargement. Then any linear functional f on X (not necessarily bounded) may
be written in the form
∗ (f (x)) = g(∗ x) (x ∈ X)
with some g ∈ ∗ (X ∗ ). In particular,
f (x) = st(g(∗ x)) (x ∈ X).
Proof. Consider the binary relation
ϕ := {(x, y) ∈ X × X ∗ | y(x) = f (x)}.
Then ϕ is concurrent on X: Indeed, if x1 , . . . , xn ∈ X are given, let X0 be the linear
hull of x1 , . . . , xn . Since X0 has finite dimension, it is isomorphic to some $\mathbb{K}^n$, and
so the restriction of f to X0 is bounded. By the Hahn-Banach extension theorem,
this restriction has an extension to some F ∈ X ∗ . Then (x1 , F ), . . . , (xn , F ) ∈ ϕ,
and so ϕ is concurrent. Since ∗ is an X-enlargement, ϕ is satisfied on σ X, i.e. we
find some g ∈ ∗ (X ∗ ) such that (∗ x, g) ∈ ∗ ϕ for any x ∈ X. The standard definition
principle for relations implies that for any x ∈ X the equality g(∗ x) = ∗ f (∗ x) = ∗ (f (x))
holds.
Recall that if X has finite dimension N , then any linear map A : X → Y
may be written in matrix form
$A(x) = \sum_{n=1}^{N} f_n(x) y_n$ (10.1)
where fn are linear functionals and yn ∈ Y . In nonstandard analysis, all operators
have matrix form:
Exercise 51. Let X be a normed linear space, Y be a linear space, and A : X → Y
be linear. If ∗ is an X-enlargement, prove that there are internal ∗ -finite sequences
y1 , . . . , yh ∈ ∗ Y and f1 , . . . , fh ∈ ∗ (X ∗ ) such that
${}^{\ast}(A(x)) = \sum_{n=1}^{h} f_n({}^{\ast}x)\, y_n$ (x ∈ X).
10.2 Hahn-Banach and Banach-Mazur Limits

The space ℓ∞ is the set of all bounded real sequences x = (ξn )n , equipped with
the norm
$\|x\|_\infty = \sup_n |\xi_n|.$
It is well-known (and easily checked) that ℓ∞ is a Banach space. By ℓp (1 ≤ p <
∞), we denote the set of all real sequences x = (ξn )n with finite norm
$\|x\|_p = \Bigl( \sum_{n=1}^{\infty} |\xi_n|^p \Bigr)^{1/p}.$
Also ℓp is a Banach space. It is well-known that in case 1 ≤ p < ∞ the space ℓp′
with $\frac{1}{p} + \frac{1}{p'} = 1$ is dual to ℓp in the sense that each element y = (ηn )n ∈ ℓp′ defines
a bounded linear functional fy on ℓp by means of the formula
$f_y(x) = \sum_{n=1}^{\infty} \eta_n \xi_n,$ (10.2)
and conversely all bounded linear functionals have this form.
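That fy is bounded on ℓp rests on Hölder's inequality |fy (x)| ≤ ‖y‖p′ ‖x‖p . A quick numerical sanity check for finitely supported sequences might look as follows; the exponent and the sample sequences are hypothetical, and the sketch checks one instance of the inequality, it does not prove it.

```python
def p_norm(x, p):
    """l_p norm of a finitely supported real sequence."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def pairing(y, x):
    """f_y(x) = sum of eta_n * xi_n, formula (10.2) for finite sequences."""
    return sum(eta * xi for eta, xi in zip(y, x))

p = 3.0
q = p / (p - 1.0)            # conjugate exponent p', so that 1/p + 1/p' = 1

x = [1.0, -2.0, 0.5, 4.0]    # hypothetical element of l_p
y = [0.3, 1.0, -2.0, 0.25]   # hypothetical element of l_p'

lhs = abs(pairing(y, x))
rhs = p_norm(y, q) * p_norm(x, p)
holder_ok = lhs <= rhs + 1e-12
print(lhs, rhs, holder_ok)
```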

For $\ell_\infty$ the situation is different: Any $y \in \ell_1$ defines a bounded linear
functional on $\ell_\infty$, but (if we assume the axiom of choice) there also exist bounded
linear functionals on $\ell_\infty$ which do not have this form, as we shall see soon.
Although much is known about the structure of the space $\ell_\infty^*$ of bounded linear
functionals on $\ell_\infty$, there are still some open problems. Nonstandard analysis is
the most convenient tool to study $\ell_\infty^*$. In fact, by nonstandard methods the space
$\ell_\infty^*$ is easily described, as was first observed in [Rob64]. To give the main result of
[Rob64], we need the following result of the standard world:
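Lemma 10.4. Let $f \in \ell_\infty^*$. Then for any finite number of elements $x_1, \ldots, x_K \in \ell_\infty$,
$x_k = (\xi_{k,n})_n$, and any $\varepsilon > 0$ there exist real numbers $\eta_1, \ldots, \eta_N \in \mathbb{R}$ such that
\[ f(x_k) = \sum_{n=1}^{N} \eta_n \xi_{k,n} \qquad (k = 1, \ldots, K) \tag{10.3} \]
and
\[ \sum_{n=1}^{N} |\eta_n| \le (1 + \varepsilon) \|f\| . \]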
Proof. If one of the vectors $x_1, \ldots, x_K$, say $x_K$, is a linear combination of the
others, it suffices to find the numbers $\eta_1, \ldots, \eta_N$ corresponding to $x_1, \ldots, x_{K-1}$,
since (10.3) then also holds for $x_K$ by linearity. Successively eliminating such
vectors, we may assume without loss of generality that $x_1, \ldots, x_K$ are linearly
independent. Following [Ban87, p. 42/43], we prove under this assumption that
there is some $N$ such that
\[ \|\lambda_1 x_1 + \cdots + \lambda_K x_K\|_\infty \le (1 + \varepsilon) \max_{n=1,\ldots,N} |\lambda_1 \xi_{1,n} + \cdots + \lambda_K \xi_{K,n}| \tag{10.4} \]
for any choice $\lambda_1, \ldots, \lambda_K \in \mathbb{R}$. Indeed, if this is not true, we find for each $N$
corresponding numbers $\lambda_{k,N}$ such that
\[ \|\lambda_{1,N} x_1 + \cdots + \lambda_{K,N} x_K\|_\infty > (1 + \varepsilon) \max_{n=1,\ldots,N} |\lambda_{1,N} \xi_{1,n} + \cdots + \lambda_{K,N} \xi_{K,n}| . \tag{10.5} \]
Dividing (10.5) by $\max\{|\lambda_{1,N}|, \ldots, |\lambda_{K,N}|\}$ (which is nonzero by (10.5)), we may assume
that $|\lambda_{k,N}| \le 1$, with equality for at least one $k$. Successively passing to subsequences,
we find a subsequence $N_j$ such that $\lambda_{k,N_j} \to \lambda_k$ converges as $j \to \infty$ for any
$k = 1, \ldots, K$. Observe now that
\[ \|(\lambda_{1,N_j} x_1 + \cdots + \lambda_{K,N_j} x_K) - (\lambda_1 x_1 + \cdots + \lambda_K x_K)\|_\infty \to 0 . \]
Hence, passing to the limit in (10.5) we find, for each $N$, that
\[ \|\lambda_1 x_1 + \cdots + \lambda_K x_K\|_\infty \ge (1 + \varepsilon) \max_{n=1,\ldots,N} |\lambda_1 \xi_{1,n} + \cdots + \lambda_K \xi_{K,n}| . \]
But this contradicts the definition of the norm in $\ell_\infty$ (note that the limit vector
$(\lambda_1, \ldots, \lambda_K)$ is nonzero, and so the right-hand side does not vanish for sufficiently
large $N$, since we assumed that the vectors $x_1, \ldots, x_K$ are linearly independent).
This contradiction shows that we find indeed some $N$ satisfying (10.4).
Together with $x_1, \ldots, x_K$, we consider the truncated vectors $y_1, \ldots, y_K \in \mathbb{R}^N$
where $y_k := (\xi_{k,1}, \ldots, \xi_{k,N})$. If we equip $\mathbb{R}^N$ with the max-norm, we may
read (10.4) as
\[ \|\lambda_1 x_1 + \cdots + \lambda_K x_K\|_\infty \le (1 + \varepsilon) \|\lambda_1 y_1 + \cdots + \lambda_K y_K\| . \tag{10.6} \]
Recalling that $x_1, \ldots, x_K$ are linearly independent, (10.6) implies that also the
vectors $y_1, \ldots, y_K$ are linearly independent. On the subspace of $\mathbb{R}^N$ spanned by
$y_1, \ldots, y_K$ we define a functional $g$ by
\[ g(\lambda_1 y_1 + \cdots + \lambda_K y_K) := f(\lambda_1 x_1 + \cdots + \lambda_K x_K) . \]
Since $y_1, \ldots, y_K$ are linearly independent, $g$ is well-defined. Moreover, using (10.6),
we find
\[ |g(\lambda_1 y_1 + \cdots + \lambda_K y_K)| \le \|f\| \, \|\lambda_1 x_1 + \cdots + \lambda_K x_K\|_\infty
   \le \|f\| (1 + \varepsilon) \|\lambda_1 y_1 + \cdots + \lambda_K y_K\| . \]
Hence $\|g\| \le (1 + \varepsilon) \|f\|$. By the Hahn-Banach extension theorem (actually
Lemma 10.2 suffices), we may extend $g$ to a linear functional on $\mathbb{R}^N$ without
increasing its norm. Let $e_1, \ldots, e_N$ be the canonical basis of $\mathbb{R}^N$. Put now $\eta_n := g(e_n)$.
Since $g$ is linear, we have
\[ f(x_k) = g(y_k) = g\Bigl( \sum_{n=1}^{N} \xi_{k,n} e_n \Bigr) = \sum_{n=1}^{N} \eta_n \xi_{k,n} , \]
and so (10.3) holds. In order to prove the norm estimate, we consider the vector
$x := (\operatorname{sgn}(\eta_1), \ldots, \operatorname{sgn}(\eta_N))$. Then
\[ \sum_{n=1}^{N} |\eta_n| = \sum_{n=1}^{N} \operatorname{sgn}(\eta_n) g(e_n) = g(x) \le \|g\| \, \|x\| \le (1 + \varepsilon) \|f\| . \]

Using Lemma 10.4, we obtain now:
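Theorem 10.5. Let $* : S \to {}^*S$ be a nonstandard embedding. If $*$ is even an
enlargement, then for any $f \in \ell_\infty^*$ there exists an internal $*$-finite sequence
$\eta_1, \ldots, \eta_h \in {}^*\mathbb{R}$ with
\[ {}^*(f(x)) = \sum_{n=1}^{h} \eta_n \, {}^*\xi_n \qquad (x = (\xi_n)_n \in \ell_\infty), \tag{10.7} \]
where
\[ \|f\| = \operatorname{st}\Bigl( \sum_{n=1}^{h} |\eta_n| \Bigr) . \tag{10.8} \]
Conversely, each $*$-finite internal sequence for which the right-hand side of (10.8)
is finite gives rise to a functional $f \in \ell_\infty^*$ defined by
\[ f(x) := \operatorname{st}\Bigl( \sum_{n=1}^{h} \eta_n \, {}^*\xi_n \Bigr) \qquad (x = (\xi_n)_n \in \ell_\infty), \tag{10.9} \]
which satisfies
\[ \|f\| \le \operatorname{st}\Bigl( \sum_{n=1}^{h} |\eta_n| \Bigr) . \]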
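Proof. Let $f \in \ell_\infty^*$ be given. Let $\varphi \in S$ be the following binary relation from
$\mathbb{R}_+ \times \ell_\infty$ into $\mathbb{R}^{<\mathbb{N}}$:
\[ \varphi = \Bigl\{ (\varepsilon, x, y) \in \mathbb{R}_+ \times \ell_\infty \times \mathbb{R}^{<\mathbb{N}} \Bigm|
   f(x) = \sum_{n=1}^{\#_{\mathbb{R}}(y)} y(n) x(n) \;\wedge\; \sum_{n=1}^{\#_{\mathbb{R}}(y)} |y(n)| \le (1 + \varepsilon) \|f\| \Bigr\} \]
(here, we consider sequences as functions). Lemma 10.4 implies that $\varphi$ is concurrent
on $\mathbb{R}_+ \times \ell_\infty$, i.e. for each finitely many $(\varepsilon_k, x_k) \in \mathbb{R}_+ \times \ell_\infty$ ($k = 1, \ldots, N$) there is
some $y \in \mathbb{R}^{<\mathbb{N}}$ such that $(\varepsilon_k, x_k, y) \in \varphi$ ($k = 1, \ldots, N$). Since $*$ is an enlargement,
Theorem 8.10 implies that there is some $y \in {}^*(\mathbb{R}^{<\mathbb{N}})$ which satisfies ${}^*\varphi$ on ${}^\sigma\operatorname{dom}(\varphi)$,
i.e. $({}^*\varepsilon, {}^*x, y) \in {}^*\varphi$ for any $\varepsilon \in \mathbb{R}_+$ and any $x \in \ell_\infty$. By Exercise 25, $y$ is a $*$-finite
internal sequence $\eta_1, \ldots, \eta_h$ where $h := ({}^*\#_{\mathbb{R}}(\cdot))(y)$. We thus have (10.7), and
\[ \sum_{n=1}^{h} |\eta_n| \le (1 + {}^*\varepsilon) \, {}^*(\|f\|) \qquad (\varepsilon \in \mathbb{R}_+). \]
Conversely, if $\eta_1, \ldots, \eta_h$ is a $*$-finite internal sequence and $f$ is given by (10.9),
then we have for each $x = (\xi_n)_n \in \ell_\infty$ that $|{}^*\xi_n| \le {}^*(\|x\|_\infty)$ ($n \in {}^*\mathbb{N}$) by the
transfer principle, and so
\[ {}^*(|f(x)|) \le {}^*\varepsilon + \sum_{n=1}^{h} |\eta_n| \, |{}^*\xi_n| \le {}^*\varepsilon + \Bigl( \sum_{n=1}^{h} |\eta_n| \Bigr) \, {}^*(\|x\|_\infty) \]
for each $\varepsilon \in \mathbb{R}_+$. Moreover (10.9) implies that $f$ is linear, since st is linear
(Theorem 5.21). Hence, $f \in \ell_\infty^*$ and
\[ {}^*(\|f\|) \le {}^*\varepsilon + \sum_{n=1}^{h} |\eta_n| \qquad (\varepsilon \in \mathbb{R}_+). \]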
Exercise 52. Let $X$ be a linear space (not necessarily normed) of real sequences.
Prove that any linear functional $f$ on $X$ can be written in the form
\[ {}^*(f(x)) = \sum_{n=1}^{h} \eta_n \, {}^*\xi_n \qquad (x = (\xi_n)_n \in X), \]
where $\eta_1, \ldots, \eta_h$ is a $*$-finite internal sequence.

A functional $f \in \ell_\infty^*$ is called a Hahn-Banach limit, if for any convergent
sequence $x = (\xi_n)_n$ the relation
\[ f(x) = \lim_{n \to \infty} \xi_n \]
holds. Thus, Hahn-Banach limits are generalizations of the lim-operator which
associate a “limit” to any bounded sequence.
Exercise 53. Prove that Hahn-Banach limits cannot be written in the form (10.2)
with some sequence $y = (\eta_n)_n$. Prove however, that Hahn-Banach limits exist (and
so $\ell_\infty^*$ indeed contains elements which cannot be written in the form (10.2)).
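Theorem 10.6. Let $* : S \to {}^*S$ be a nonstandard embedding. If $*$ is even an
enlargement, then for any Hahn-Banach limit $f$ there exist $h_0, h \in \mathbb{N}_\infty$, $h_0 < h$,
and an internal sequence $\eta_{h_0}, \ldots, \eta_h \in {}^*\mathbb{R}$ such that
\[ {}^*(f(x)) = \sum_{n=h_0}^{h} \eta_n \, {}^*\xi_n \qquad (x = (\xi_n)_n \in \ell_\infty), \]
\[ \sum_{n=h_0}^{h} \eta_n = 1, \]
and
\[ \|f\| = \operatorname{st}\Bigl( \sum_{n=h_0}^{h} |\eta_n| \Bigr) . \]
Conversely, each internal $*$-finite sequence $\eta_1, \ldots, \eta_h \in {}^*\mathbb{R}$ which satisfies $\eta_n \approx 0$
for any $n \in {}^\sigma\mathbb{N}$, $\eta_1 + \cdots + \eta_h \approx 1$, and for which $|\eta_1| + \cdots + |\eta_h|$ is finite defines
a Hahn-Banach limit $f$ by means of the formula
\[ f(x) := \operatorname{st}\Bigl( \sum_{n=1}^{h} \eta_n \, {}^*\xi_n \Bigr) \qquad (x = (\xi_n)_n \in \ell_\infty). \]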
Proof. Let $f$ be a Hahn-Banach limit, and let $\eta_1, \ldots, \eta_h$ be the internal $*$-finite
sequence of Theorem 10.5. The first statement follows if we can prove that there
is some $h_0 \in \mathbb{N}_\infty$ with $\eta_n = 0$ for $n < h_0$, and $\eta_1 + \cdots + \eta_h = 1$. To see the latter,
consider the constant sequence $x := (1)$ (i.e. $\xi_n := 1$). Since $f$ is a Hahn-Banach
limit, we must have $f(x) = 1$. But since ${}^*\xi_n = 1$ for each $n \in {}^*\mathbb{N}$ (this follows by the
transfer principle or by Theorem 7.1), the formula (10.7) implies $\eta_1 + \cdots + \eta_h = 1$,
as claimed. To see that $\eta_{{}^*n} = 0$ for $n \in \mathbb{N}$, we consider the particular sequence $x$
defined by $\xi_k := 0$ for $k \ne n$ and $\xi_n := 1$. The transfer principle implies ${}^*\xi_k = 0$
for $k \ne {}^*n$ and ${}^*\xi_{{}^*n} = 1$. Since $f(x) = 0$, the formula (10.7) implies $\eta_{{}^*n} = 0$,
as claimed. We thus have proved that the internal formula $\eta_n = 0$ holds for each
$n \in {}^\sigma\mathbb{N}$. By the permanence principle there is some $h_1 \in \mathbb{N}_\infty$ such that $\eta_n = 0$
holds for all $n \le h_1$. Thus, the first statement follows with $h_0 := h_1 + 1$.
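For the second statement, let $\eta_1, \ldots, \eta_h$ and $f$ be given as in the formulation
of the theorem. Theorem 10.5 implies that $f \in \ell_\infty^*$. It remains to prove that if
$x = (\xi_n)_n$ converges to some $l \in \mathbb{R}$, then $f(x) = l$, i.e. ${}^*l \approx \sum_{n=1}^{h} \eta_n \, {}^*\xi_n$.
Let $\varepsilon \in \mathbb{R}_+$ be given. Since $\eta_n \approx 0$ for each $n \in {}^\sigma\mathbb{N}$, the internal predicate
\[ \sum_{n=1}^{x} |\eta_n| < {}^*\varepsilon \]
is true for any $x \in {}^\sigma\mathbb{N}$ and by the permanence principle thus also for some
$x = h_0 \in \mathbb{N}_\infty$. For $n > h_0$, we have ${}^*\xi_n \approx {}^*l$ by Theorem 7.1, in particular
$|{}^*\xi_n - {}^*l| < {}^*\varepsilon$. Putting $c := \eta_1 + \cdots + \eta_h - 1$, we have by assumption $c \approx 0$ and
thus also $|c| < {}^*\varepsilon$. Now we may calculate
\[ \Bigl| \sum_{n=1}^{h} \eta_n \, {}^*\xi_n - {}^*l \Bigr|
   = \Bigl| \sum_{n=1}^{h} \eta_n \, {}^*\xi_n - \Bigl( \sum_{n=1}^{h} \eta_n - c \Bigr) {}^*l \Bigr|
   \le \Bigl| \sum_{n=1}^{h} \eta_n ({}^*\xi_n - {}^*l) \Bigr| + |c| \, |{}^*l| \]
\[ \le \sum_{n=1}^{h_0} |\eta_n| \, (|{}^*\xi_n| + |{}^*l|) + \sum_{n=h_0+1}^{h} |\eta_n| \, {}^*\varepsilon + {}^*\varepsilon \, |{}^*l| . \]
Now observe that the transfer principle implies $|{}^*\xi_n| \le {}^*(\|x\|_\infty)$ for all $n \in {}^*\mathbb{N}$
and that by assumption $|\eta_1| + \cdots + |\eta_h| \le {}^*M$ for some $M \in \mathbb{R}_+$. Hence, we have
proved
\[ \Bigl| \sum_{n=1}^{h} \eta_n \, {}^*\xi_n - {}^*l \Bigr|
   \le {}^*\varepsilon \, ({}^*(\|x\|_\infty) + |{}^*l|) + {}^*M \, {}^*\varepsilon + {}^*\varepsilon \, |{}^*l| . \]
Since this estimate holds for any $\varepsilon \in \mathbb{R}_+$, it follows that
$\sum_{n=1}^{h} \eta_n \, {}^*\xi_n \approx {}^*l$, which we had to prove.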
Theorem 10.6 slightly generalizes [Lux92, Theorem 4.4], using a refinement
of the technique from [Rob64].
Exercise 54. Does there exist a Hahn-Banach limit f such that for each x ∈ ℓ∞
the point f (x) is an accumulation point of the sequence x?
Another generalization of limits is the following: A linear functional f on ℓ∞

is called a Banach-Mazur limit, if it has the following properties:
1. If x = (ξn )n is the constant sequence ξn = c, then f (x) = c.
2. f is positive, i.e. if x = (ξn )n ∈ ℓ∞ satisfies ξn ≥ 0 for all n, then f (x) ≥ 0.
3. f is shift invariant, i.e. f ((ξn )n ) = f ((ξn+1 )n ).
Actually, we could have required the first property even for all convergent se-
quences ξn → c (which is apparently a more restrictive requirement):
Exercise 55. Prove that any Banach-Mazur limit $f$ is a Hahn-Banach limit. More
precisely, show that $f \in \ell_\infty^*$ with $\|f\| = 1$ and that for any $x = (\xi_n)_n \in \ell_\infty$ the
estimate
\[ \liminf_{n \to \infty} \xi_n \le f(x) \le \limsup_{n \to \infty} \xi_n \]
holds.
Exercise 56. Let $f$ be a Banach-Mazur limit. Calculate $f(x)$ for the sequence
$x = (\xi_n)_n$ which is given by $\xi_n := (-1)^n$.
Does there exist a Banach-Mazur limit which has the additional property
from Exercise 54, i.e. such that f (x) is always an accumulation point of the se-
quence x?
The standard proofs for the existence of Banach-Mazur limits are not very
constructive. We just mention one of the simplest standard approaches from
[Rud90, Chapter 3, Exercise 4]:
Exercise 57. Given a sequence x = (ξn )n ∈ ℓ∞ , put ζn := (ξ1 + · · · + ξn )/n. Apply
Exercise 49 for p(x) := lim sup ζn and f (x) := lim ζn (if ζn converges) to prove the
existence of a Banach-Mazur limit.
We now present a class of Banach-Mazur limits which can easily be charac-
terized by nonstandard methods. This result is taken from [Rob64]:
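Theorem 10.7. Let $* : S \to {}^*S$ be a nonstandard embedding, and $\eta_1, \ldots, \eta_h \in {}^*\mathbb{R}$
be an internal $*$-finite sequence. Assume:
1. $\eta_n \ge 0$ for $n = 1, \ldots, h$.
2. $\eta_1 + \cdots + \eta_h \approx 1$.
3. $\sum_{n=1}^{h} |\eta_n - \eta_{n-1}| \approx 0$ (put $\eta_0 := 0$).
Then a Banach-Mazur limit $f$ is given by the formula
\[ f(x) := \operatorname{st}\Bigl( \sum_{n=1}^{h} \eta_n \, {}^*\xi_n \Bigr) \qquad (x = (\xi_n)_n \in \ell_\infty). \]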

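Proof. An induction implies that $\eta_n \approx 0$ for each $n \in {}^\sigma\mathbb{N}$: Indeed, if $\eta_{n-1} \approx 0$, then
$|\eta_n - \eta_{n-1}| \approx 0$ (by assumption) shows that also $\eta_n \approx 0$. Hence, by Theorem 10.6,
$f$ is a Hahn-Banach limit. Moreover, positivity of $f$ is trivial, since $\eta_n \ge 0$. It
thus only remains to prove that $f$ is shift-invariant. This can be seen as follows:
If $x = (\xi_n)_n \in \ell_\infty$ is given, we have $|{}^*\xi_n| \le {}^*(\|x\|_\infty)$ by the transfer principle.
Note also that $|\eta_h| \le \sum_{n=1}^{h} |\eta_n - \eta_{n-1}| \approx 0$, because $\eta_0 = 0$.
Putting $y := (\xi_{n+1})_n$, we have
\[ {}^*(|f(x) - f(y)|) \approx \Bigl| \sum_{n=1}^{h} \eta_n \, {}^*\xi_n - \sum_{n=1}^{h} \eta_n \, {}^*\xi_{n+1} \Bigr|
   = \Bigl| \eta_1 \, {}^*\xi_1 + \sum_{n=2}^{h} (\eta_n - \eta_{n-1}) \, {}^*\xi_n - \eta_h \, {}^*\xi_{h+1} \Bigr| \]
\[ \le |\eta_1 \, {}^*\xi_1| + \sum_{n=2}^{h} |\eta_n - \eta_{n-1}| \, {}^*(\|x\|_\infty) + |\eta_h| \, {}^*(\|x\|_\infty) \approx 0, \]
and so $f(x) = f(y)$, as desired.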
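Example 10.8. Let $* : S \to {}^*S$ be a nonstandard embedding. Fix some $h \in \mathbb{N}_\infty$.
Then
\[ f(x) := \operatorname{st}\Bigl( \frac{1}{h} \sum_{n=1}^{h} {}^*\xi_n \Bigr) \qquad (x = (\xi_n)_n \in \ell_\infty) \]
defines a Banach-Mazur limit.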
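An infinite index $h$ is not available in ordinary arithmetic, so the following Python sketch only illustrates the standard shadow of Example 10.8: for a large finite $h$ the Cesàro mean is already almost shift-invariant, with an error of at most $2\|x\|_\infty / h$. The names `cesaro_mean`, `shift` and `alternating` are assumptions made for this illustration and do not come from the text.

```python
from fractions import Fraction

def cesaro_mean(xi, h):
    """Exact Cesaro mean (xi(1) + ... + xi(h)) / h of a sequence given as a function."""
    return sum(Fraction(xi(n)) for n in range(1, h + 1)) / h

def shift(xi):
    """Return the shifted sequence n -> xi(n + 1)."""
    return lambda n: xi(n + 1)

def alternating(n):
    """The bounded, non-convergent sequence xi_n = (-1)^n."""
    return (-1) ** n

h = 10_001  # a large finite stand-in for an infinite hypernatural number
mean = cesaro_mean(alternating, h)
mean_shifted = cesaro_mean(shift(alternating), h)
gap = abs(mean - mean_shifted)  # equals |xi_1 - xi_(h+1)| / h <= 2 * sup|xi_n| / h
print(mean, mean_shifted, gap)
```

In the nonstandard setting the same gap is infinitesimal for infinite $h$, so it vanishes after taking standard parts; the finite computation can only show that it becomes small.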
We remark that Theorem 10.7 has the following powerful converse:
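Theorem 10.9. Let $* : S \to {}^*S$ be polysaturated. Then for any Banach-Mazur limit
$f$ there exist $h_0, h \in \mathbb{N}_\infty$, $h_0 < h$, and an internal sequence $\eta_{h_0}, \ldots, \eta_h \in {}^*\mathbb{R}$ such
that
\[ f(x) = \operatorname{st}\Bigl( \sum_{n=h_0}^{h} \eta_n \, {}^*\xi_n \Bigr) \qquad (x = (\xi_n)_n \in \ell_\infty), \]
\[ \eta_n \ge 0, \qquad \sum_{n=h_0}^{h} \eta_n \approx 1, \]
and
\[ \sum_{n=h_0+1}^{h} |\eta_n - \eta_{n-1}| \approx 0 . \]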
The proof of Theorem 10.9 needs deeper facts about the geometry of Banach
spaces which are beyond the scope of this book. The proof can be found in [Lux92].
Theorem 10.9 suggests that the following Hahn-Banach limits are particularly
“natural”:
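Example 10.10. Let $* : S \to {}^*S$ be a nonstandard embedding. For any $h_0, h \in \mathbb{N}_\infty$
with $h_0 \le h$, the formula
\[ f(x) := \operatorname{st}\Bigl( \frac{1}{h - h_0 + 1} \sum_{n=h_0}^{h} {}^*\xi_n \Bigr) \qquad (x = (\xi_n)_n \in \ell_\infty) \]
defines a Banach-Mazur limit (by Theorem 10.7).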

The Banach-Mazur limits of Example 10.10 are said to be of Cesàro type.
For certain sequences x ∈ ℓ∞ , the value f (x) is independent of the choice
of the Banach-Mazur limit f . For example, if x converges to c, we must always
have f (x) = c. One may wonder whether this property holds for a larger class
of sequences x. This is indeed true: The sequences with this property have been
characterized by Lorentz [Lor48], but the (standard) proof is rather cumbersome.
A brief nonstandard proof of this result was given in [Lux92]. We present a variant
of this proof now. The essential observation is that it suffices to consider Banach-
Mazur limits of Cesàro type:
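Proposition 10.11. Let $* : S \to {}^*S$ be an enlargement. Let $x_0 \in \ell_\infty$ be such that
$f(x_0) = c$ for any Banach-Mazur limit $f$ of Cesàro type. Then $f(x_0) = c$ for any
shift-invariant Hahn-Banach limit $f$. In particular, $f(x_0) = c$ for any
Banach-Mazur limit $f$.
Proof. Let $S : \mathbb{N} \times \ell_\infty \to \ell_\infty$ be defined by
\[ S(n, x) := (\xi_{n+1}, \xi_{n+2}, \ldots) \qquad (x = (\xi_k)_k \in \ell_\infty). \]
Let $f$ be a shift-invariant Hahn-Banach limit. Then the value $c_0 := f(S(n, x_0))$ is
independent of $n \in \mathbb{N}$. The transfer principle implies
\[ \forall n \in {}^*\mathbb{N} : {}^*f({}^*S(n, {}^*x_0)) = {}^*c_0 . \]
By Theorem 10.6, we find an internal sequence $\eta_{h_0}, \ldots, \eta_h \in {}^*\mathbb{R}$ (with $h_0, h \in \mathbb{N}_\infty$)
with $\sum \eta_k = 1$ and $\sum |\eta_k| \le {}^*(\|f\|) + 1$ such that
\[ {}^*f({}^*x) = \sum_{k=h_0}^{h} \eta_k \, {}^*\xi_k \qquad (x = (\xi_n)_n \in \ell_\infty). \]
Since ${}^*f({}^*S(n, {}^*x_0)) = {}^*c_0$ for $n = h_0, \ldots, h$, we find, writing $x_0 = (\xi_n)_n$,
\[ (h - h_0 + 1) \, {}^*c_0 = \sum_{n=h_0}^{h} {}^*f({}^*S(n, {}^*x_0)) = \sum_{n=h_0}^{h} \sum_{k=h_0}^{h} \eta_k \, {}^*\xi_{n+k} , \]
and so
\[ {}^*c_0 = \sum_{k=h_0}^{h} \eta_k \, \frac{1}{h - h_0 + 1} \sum_{n=h_0}^{h} {}^*\xi_{n+k} = \sum_{k=h_0}^{h} \eta_k \, ({}^*c + \varepsilon_k) \]
where $\varepsilon_k \in \inf({}^*\mathbb{R})$. Since $\sum \eta_k = 1$, we find for any $\varepsilon \in \mathbb{R}_+$ that
\[ |{}^*c_0 - {}^*c| = \Bigl| \sum_{k=h_0}^{h} \eta_k \varepsilon_k \Bigr| \le \sum_{k=h_0}^{h} |\eta_k| \, {}^*\varepsilon \le {}^*(\|f\| + 1) \, {}^*\varepsilon , \]
and so ${}^*c_0 \approx {}^*c$ which implies $f(x_0) = c_0 = c$.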
A deeper reason why Proposition 10.11 is true is revealed in [KM92] (see also
the remarks in [Lux92]). However, we will not go into further detail here.
Definition 10.12. A sequence $x = (\xi_n)_n$ is almost convergent to $c$ if we have
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{m=1}^{n} \xi_{m+k} = c \]
uniformly in $k \in \mathbb{N}$.
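The uniformity in $k$ is the essential point of Definition 10.12. The following Python sketch probes it numerically on a finite range of shifts: it computes the worst deviation of the window means from a candidate value $c$. The function names, the finite cut-off `k_max` and the two sample sequences are assumptions for this illustration; a finite computation can only suggest, not prove, almost convergence.

```python
def window_mean(xi, n, k):
    """Mean of xi(k + 1), ..., xi(k + n)."""
    return sum(xi(m + k) for m in range(1, n + 1)) / n

def worst_deviation(xi, c, n, k_max):
    """Largest |window mean - c| over the shifts k = 0, ..., k_max."""
    return max(abs(window_mean(xi, n, k) - c) for k in range(k_max + 1))

def alternating(n):
    """xi_n = (-1)^n: every window of length n has mean of size at most 1/n."""
    return (-1) ** n

def blocks(n):
    """1 on [1,1], [4,7], [16,31], [64,127], ... and 0 elsewhere (ever longer runs)."""
    return n.bit_length() % 2

# The deviation for the alternating sequence shrinks like 1/n for every shift.
print(worst_deviation(alternating, 0.0, 1001, 200))
# Long runs of 0s and 1s keep some window means at 0 or 1 regardless of n.
print(worst_deviation(blocks, 0.5, 16, 200))
```

The second sequence has arbitrarily long constant runs, so for every window length some shift gives a window mean of exactly 0 or 1; it therefore cannot be almost convergent to any value.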
It can be proved by standard methods that any almost convergent sequence
is bounded. However, with nonstandard methods an easier proof can be given
[Lux92]:
Exercise 58. Let $*$ be a nonstandard embedding. Then $x = (\xi_n)_n$ is almost
convergent to $c$ if and only if
\[ \frac{1}{h} \sum_{n=1}^{h} {}^*\xi_{n+k} \approx {}^*c \qquad (h \in \mathbb{N}_\infty,\ k \in {}^*\mathbb{N}). \tag{10.10} \]
Moreover, in this case $x \in \ell_\infty$.
Hint: For the second statement, prove that ${}^*x \in {}^*\ell_\infty$.
Now we can prove the announced result:
Theorem 10.13. A sequence x = (ξn )n is almost convergent to c if and only if
x ∈ ℓ∞ and f (x) = c for any Banach-Mazur limit f .
Proof. Let $* : S \to {}^*S$ be an enlargement, and $x$ be almost convergent to $c$. Then
$x \in \ell_\infty$ by Exercise 58. By Proposition 10.11, it suffices to prove that $f(x) = c$ for
any Banach-Mazur limit $f$ of Cesàro type, i.e. we may assume
\[ f(x) = \operatorname{st}\Bigl( \frac{1}{h - h_0 + 1} \sum_{m=h_0}^{h} {}^*\xi_m \Bigr) \tag{10.11} \]
for some $h_0, h \in \mathbb{N}_\infty$ with $h_0 \le h$. Putting $n := h - h_0 + 1$ and $k := h_0 - 1$, we
have by (10.10)
\[ \frac{1}{h - h_0 + 1} \sum_{m=h_0}^{h} {}^*\xi_m = \frac{1}{n} \sum_{m=1}^{n} {}^*\xi_{m+k} \approx {}^*c , \]
and so $f(x) = c$, as desired.
Conversely, let $x \in \ell_\infty$ and $f(x) = c$ for any Banach-Mazur limit $f$. Let
$* : S \to {}^*S$ be a nonstandard embedding. Given $h \in \mathbb{N}_\infty$ and $k \in {}^*\mathbb{N}$, we find by
considering the Banach-Mazur limit
\[ f((\zeta_n)_n) := \operatorname{st}\Bigl( \frac{1}{h} \sum_{m=k+1}^{h+k} {}^*\zeta_m \Bigr) \]
(Theorem 10.7) that
\[ {}^*c \approx \frac{1}{h} \sum_{m=k+1}^{h+k} {}^*\xi_m = \frac{1}{h} \sum_{m=1}^{h} {}^*\xi_{m+k} , \]
and so $x$ is almost convergent to $c$ by Exercise 58.

Corollary 10.14. If x is convergent to c, then x is almost convergent to c.
The converse of Corollary 10.14 does not hold: For example, the sequence
$\xi_n := (-1)^n$ is almost convergent to 0 but not convergent. One may ask for
additional conditions which ensure that a sequence which is “almost convergent”
(e.g. in our sense) is actually convergent. Theorems giving such conditions are
called Tauberian theorems. In our case, it turns out that there is a simple Tauberian
theorem giving a condition which is even necessary and sufficient. This was already
observed in [Lor48]; the following easier nonstandard proof is taken from [Lux92]:
Theorem 10.15. The sequence $x = (\xi_n)_n$ is convergent to $c$ if and only if it is
almost convergent to $c$ and $\xi_n - \xi_{n+1} \to 0$.
In particular, a series $\sum a_n$ converges if and only if the partial sums are
almost convergent, and $a_n \to 0$.
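Proof. One implication follows from Corollary 10.14. For the converse implication,
let $x$ be almost convergent to $c$ and $\xi_n - \xi_{n+1} \to 0$. Let $* : S \to {}^*S$ be a
nonstandard embedding. Given $h \in \mathbb{N}_\infty$, consider the internal sequence given by
\[ \eta_n := {}^*\xi_h - {}^*\xi_{n+h} = \sum_{m=1}^{n} ({}^*\xi_{m+h-1} - {}^*\xi_{m+h}) . \]
Since ${}^*\xi_{m+h-1} - {}^*\xi_{m+h} \approx 0$ by Theorem 7.1, we have $\eta_n \approx 0$ for any finite $n \in {}^\sigma\mathbb{N}$.
Robinson's sequential lemma (Exercise 22) implies that there is some $h_0 \in \mathbb{N}_\infty$
with $\eta_n \approx 0$ for all $n \in {}^*\mathbb{N}$ with $n \le h_0$. In particular, we have for any $\varepsilon \in \mathbb{R}_+$
that
\[ \Bigl| \frac{1}{h_0} \sum_{n=1}^{h_0} \eta_n \Bigr| \le \frac{h_0 \, {}^*\varepsilon}{h_0} = {}^*\varepsilon . \]
In view of (10.10), this implies
\[ 0 + {}^*c \approx \frac{1}{h_0} \sum_{n=1}^{h_0} \eta_n + \frac{1}{h_0} \sum_{n=1}^{h_0} {}^*\xi_{n+h} = {}^*\xi_h , \]
and so $\xi_n \to c$ by Theorem 7.1.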

§ 11 Additive Measures
Let $S_0$ be a set, and $\Sigma$ be a set algebra over $S_0$, i.e. $\Sigma$ is a system of subsets
of $S_0$ with the property that $S_0 \in \Sigma$ and that $A, B \in \Sigma$ implies $A \cup B \in \Sigma$
and $S_0 \setminus A \in \Sigma$. A function $\mu : \Sigma \to [0, \infty]$ is called an additive measure if
$\mu(A \cup B) = \mu(A) + \mu(B)$ for any $A, B \in \Sigma$ with $A \cap B = \emptyset$. If $\mu(S_0) = 1$,
then $\mu$ is called an additive probability measure. If $\Sigma$ is even a $\sigma$-algebra, i.e.
additionally $\bigcup A_n \in \Sigma$ for countably many $A_n \in \Sigma$, and $\mu$ is even $\sigma$-additive, i.e.
$\mu(\bigcup A_n) = \sum \mu(A_n)$ whenever $A_n \in \Sigma$ are pairwise disjoint, then $\mu$ is called a
measure resp. a probability measure.
An additive measure µ is called singular , if µ(A) = 0 for any finite set A ∈ Σ.
At the moment, we are interested in additive probability measures on $\mathbb{N}$
where $\Sigma = \mathcal{P}(\mathbb{N})$. Such a measure $\mu$ is singular if and only if $\mu(\{n\}) = 0$ for any
$n \in \mathbb{N}$. It cannot be proved without the axiom of choice that such measures exist
(even a rather powerful countable version of the axiom of choice is not sufficient
[PS77]). In particular, it is not possible by standard methods to construct singular
additive measures on $\mathbb{N}$. However, many Hahn-Banach limits $f$ provide such a
measure: Given $A \subseteq \mathbb{N}$, we let $\chi_A$ denote the sequence $a_n \in \mathbb{R}$ defined by
\[ a_n := \begin{cases} 1 & \text{if } n \in A, \\ 0 & \text{if } n \notin A. \end{cases} \]
Then $\mu(A) := f(\chi_A)$ is additive, because if $A \cap B = \emptyset$, we have
\[ \mu(A \cup B) = f(\chi_{A \cup B}) = f(\chi_A + \chi_B) = f(\chi_A) + f(\chi_B) = \mu(A) + \mu(B) . \]
Moreover, $\mu(\{n\}) = 0$, because $\chi_{\{n\}}$ is a null sequence. The only property which
is not necessarily satisfied is that $\mu(A) \ge 0$. However, this holds if we choose some
$f$ which has a representation as in Theorem 10.5 with $\eta_n \ge 0$. Hence, we found a
rather large class of singular additive measures on $\mathbb{N}$.
In particular, if $f$ is a Banach-Mazur limit, then $\mu(A) = f(\chi_A)$ defines a
singular measure which additionally is translation invariant, i.e.
$\mu(\{n : n + 1 \in A\}) = \mu(A)$.
By an appropriate choice, we can satisfy certain other additional properties.
Let us give a sample application:
A set $A \subseteq \mathbb{N}$ is said to have a density $d$, if
\[ d := \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} a_k \qquad ((a_n)_n = \chi_A) \]
exists. The density (if it exists) may be considered as a “relative frequency” of the
occurrences of 1 in the sequence $(a_n)_n$. A singular measure $\mu$ with the property
that $\mu(A) = d$ whenever $A$ has the density $d$ may be considered as some sort
of “Laplace measure” on $\mathbb{N}$ (i.e. each number has in a certain sense the same
“weight” for the calculation of the probability).
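The density can be explored numerically through the running frequencies $\frac{1}{n}\sum_{k=1}^{n} a_k$. The Python sketch below is an illustration only; the function names and the two sample sets are assumptions, not taken from the text. The multiples of 3 have density $1/3$, while the second set has no density because its running frequencies keep oscillating between roughly $1/3$ and $2/3$.

```python
def running_frequency(indicator, n):
    """Relative frequency (a_1 + ... + a_n) / n of the indicator sequence."""
    return sum(indicator(k) for k in range(1, n + 1)) / n

def multiples_of_3(k):
    """Indicator of {3, 6, 9, ...}; this set has density 1/3."""
    return 1 if k % 3 == 0 else 0

def blocks(k):
    """Indicator of [1,1], [4,7], [16,31], [64,127], ...; this set has no density."""
    return k.bit_length() % 2

print(running_frequency(multiples_of_3, 3000))  # close to 1/3
print(running_frequency(blocks, 63))            # 21/63, the low end of the oscillation
print(running_frequency(blocks, 127))           # 85/127, near the high end
```

For a set without density, a measure obtained from a Banach-Mazur limit still assigns some value; the second sample only shows that the lower and upper bounds on that value can differ.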
Theorem 11.1. There is a singular additive translation invariant measure $\mu$ on $\mathbb{N}$
with the additional property that $\mu(A) = d$ whenever $A$ has the density $d$. Moreover,
for any $A \subseteq \mathbb{N}$ the sequence $\chi_A = (a_n)_n$ satisfies
\[ \liminf_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} a_k \le \mu(A) \le \limsup_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} a_k . \]
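Proof. Let $* : S \to {}^*S$ be a nonstandard embedding. Fix $h_0 \in \mathbb{N}_\infty$ and choose
some $h \in \mathbb{N}_\infty$ such that $h/h_0$ is infinite (put e.g. $h := h_0^2$). Consider the
Banach-Mazur limit
\[ f(x) := \operatorname{st}\Bigl( \frac{1}{h - h_0} \sum_{k=h_0}^{h} {}^*\xi_k \Bigr) \qquad (x = (\xi_k)_k) \]
of Cesàro type, and put $\mu(A) := f(\chi_A)$. Since $f$ is a Banach-Mazur limit, $\mu$ is a
singular translation invariant measure.
To see that the additional property holds, let $\chi_A = (a_n)_n$, and consider the
sequence
\[ b_n := \frac{1}{n} \sum_{k=1}^{n} a_k . \]
Exercise 28 implies that
\[ \liminf_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} a_k \le \operatorname{st}({}^*b_h) \le \limsup_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} a_k . \]
Hence, it suffices to prove that $f(\chi_A) = \operatorname{st}({}^*b_h)$. But we have
\[ {}^*(f(\chi_A)) \approx \frac{1}{h - h_0} \sum_{k=h_0}^{h} {}^*a_k = \frac{h}{h - h_0} \, {}^*b_h - \frac{1}{h - h_0} \sum_{k=1}^{h_0 - 1} {}^*a_k . \]
Note now that $h/(h - h_0) = \frac{h/h_0}{h/h_0 - 1} \approx 1$, and
\[ \Bigl| \frac{1}{h - h_0} \sum_{k=1}^{h_0 - 1} {}^*a_k \Bigr| \le \frac{h_0 - 1}{h - h_0} = \frac{h_0/h - 1/h}{1 - h_0/h} \approx 0 , \]
and so ${}^*(f(\chi_A)) \approx {}^*b_h$, as desired.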
Exercise 59. In the proof of Theorem 11.1, we have chosen a particular Banach-
Mazur limit f of Cesàro type. Does the conclusion µ(A) = d also hold for any
Banach-Mazur limit f of Cesàro type or even for any Banach-Mazur limit f ?
Hint: Apply Theorem 10.13.
It turns out that in the nonstandard world any singular measure is a Laplace
measure in another sense. This was first proved in [Hen72b]. We present a slightly
modified version:
Theorem 11.2. Let $* : S \to {}^*S$ be a nonstandard embedding, $S_0 \in S$ be an entity,
and $\Sigma$ be an algebra over $S_0$. Then for any nonempty $*$-finite $B \subseteq {}^*S_0$ the function
\[ \mu(A) := \operatorname{st}\Bigl( \frac{{}^\#({}^*A \cap B)}{{}^\#B} \Bigr) \qquad (A \subseteq S_0) \tag{11.1} \]
defines an additive probability measure (even for $\Sigma = \mathcal{P}(S_0)$). Moreover, if $B$ is
infinite, then $\mu$ is singular.
Conversely, if $*$ is even a $\Sigma$-enlargement, then any singular additive
probability measure $\mu$ can be written in the above form.
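Formula (11.1) is a normalized counting measure relative to the hyperfinite set $B$. The Python sketch below shows the finite analogue, under the assumption that $B$ is an ordinary finite set, so no standard part is needed; the name `counting_measure` and the sample sets are hypothetical.

```python
from fractions import Fraction

def counting_measure(B):
    """Return mu with mu(A) = |A & B| / |B| for a nonempty finite set B."""
    B = frozenset(B)
    if not B:
        raise ValueError("B must be nonempty")

    def mu(A):
        return Fraction(len(B & frozenset(A)), len(B))

    return mu

mu = counting_measure(range(1, 101))            # B = {1, ..., 100}
evens = {n for n in range(1, 201) if n % 2 == 0}
odds = {n for n in range(1, 201) if n % 2 == 1}

print(mu(evens), mu(odds), mu(evens | odds))    # additivity on disjoint sets
print(mu({7}))                                  # 1/100: positive for finite B
```

For finite $B$ a point has positive mass $1/|B|$. In Theorem 11.2 the set $B$ is infinite and $*$-finite, so the corresponding quotient is infinitesimal and its standard part is 0, which is where singularity comes from.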
Proof. Clearly, $\mu(A) \ge 0$, and $\mu(S_0) = 1$. If $A_1, A_2 \in \Sigma$ are disjoint, then also
$B_1 := {}^*A_1 \cap B$ and $B_2 := {}^*A_2 \cap B$ are disjoint, and Theorem 6.14 implies
${}^\#(B_1 \cup B_2) = {}^\#B_1 + {}^\#B_2$. Hence,
\[ \frac{{}^\#({}^*(A_1 \cup A_2) \cap B)}{{}^\#B} = \frac{{}^\#({}^*A_1 \cap B)}{{}^\#B} + \frac{{}^\#({}^*A_2 \cap B)}{{}^\#B} , \]
and the additivity of st implies that $\mu$ is additive. If $A$ is finite and $B$ is infinite,
then ${}^*A$ and thus ${}^\#({}^*A \cap B) \in {}^\sigma\mathbb{N}$ is finite and ${}^\#B \in \mathbb{N}_\infty$ is infinite, and so
$\mu(A) = 0$.
For the second statement, consider the relation
\[ \varphi := \{ (x, y) \in \Sigma \times \mathcal{P}(\Sigma) \mid \text{“$y$ is finite”} \wedge \alpha(y) \wedge \beta(x, y) \} \]
where $\alpha(y)$ is a shortcut for “each two different elements of $y$ are disjoint”, and
$\beta(x, y)$ is a shortcut for
\[ \exists z \in \mathcal{P}(\Sigma) : \bigl( z \subseteq y \wedge x = \textstyle\bigcup z \bigr) . \]
Then $\varphi$ is concurrent: Given $A_1, \ldots, A_n \in \Sigma$, let $\Sigma_1$ denote the system of all
finite intersections $A_{n_1} \cap \cdots \cap A_{n_k}$, and eliminate successively all elements which
can be written as the union of other elements. The resulting set $y = \Sigma_2$ satisfies
$(A_1, \Sigma_2), \ldots, (A_n, \Sigma_2) \in \varphi$. Theorem 8.10 implies that ${}^*\varphi$ is satisfied on ${}^\sigma\Sigma$. In
view of the standard definition principle for relations and Theorem 3.21, this
means that we find some $*$-finite subset $\Sigma_0 := y \subseteq {}^*\Sigma$ of pairwise disjoint sets
such that for each $A \in \Sigma$, we find an internal $\Sigma_A \subseteq \Sigma_0$ with ${}^*A = \bigcup \Sigma_A$.

Let c : P(S0 ) → ∪ {∞} be the function which associates to each A ⊆ S0 its
#
number of elements. By Exercise 27, we have ∗ c(A0 ) = (A0 ) for any A0 ∈ ∗ P(S0 ).

To each n ∈ and each infinite set z ∈ Σ, we can associate a subset x(z) ⊆ z
whose number of elements is the smallest integer which is at least nµ(z). Actually,
this holds also if z ∈ Σ is finite, because in this case µ(z) = 0. Hence,
∀n ∈ , y ∈ P(Σ) : ∃x ∈ P(S0)Σ : γ
where γ is a shortcut for
∀z ∈ Σ : (x(z) ⊆ z ∧ c(y(z)) ≤ nµ(z) < c(y(z)) + 1).
The transfer principle implies for the choice y := Σ0 in view of Theorem 3.21 that
we find for any h ∈ ∗ N some internal function f : ∗ Σ → ∗ P(S0 ) such that for any
A0 ∈ Σ0 the set f (A0 ) is a ∗ -finite subset of A0 with
# #
(f (A0 )) ≤ h∗ µ(A0 ) < (f (A0 )) + 1. (11.2)

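Fix some $h \in \mathbb{N}_\infty$ such that $h / {}^\#\Sigma_0$ is infinite, and let $f$ denote the corresponding
function. We claim that
\[ B := \bigcup \{ f(A_0) : A_0 \in \Sigma_0 \} \]
has the required properties. Indeed, given $A \in \Sigma$, we find by our construction
of $\Sigma_0$ an internal $\Sigma_A \subseteq \Sigma_0$ with ${}^*A = \bigcup \Sigma_A$. Theorem 6.13 implies that $\Sigma_A$ is
$*$-finite with ${}^\#\Sigma_A \le {}^\#\Sigma_0$. Since the transfer principle implies that ${}^*\mu$ is additive
on $*$-finite subsets of ${}^*\Sigma$ and since the elements of $\Sigma_A$ are pairwise disjoint, we
find
\[ {}^*(\mu(A)) = {}^*\mu({}^*A) = {}^*\mu\bigl( \textstyle\bigcup \Sigma_A \bigr) = \sum_{A_0 \in \Sigma_A} {}^*\mu(A_0) . \]
Since (11.2) implies
\[ \Bigl| \sum_{A_0 \in \Sigma_A} \bigl( {}^\#(f(A_0)) - h \, {}^*\mu(A_0) \bigr) \Bigr| \le {}^\#\Sigma_A \le {}^\#\Sigma_0 , \]
we find
\[ \Bigl| h \, {}^*(\mu(A)) - \sum_{A_0 \in \Sigma_A} {}^\#(f(A_0)) \Bigr| \le {}^\#\Sigma_0 . \]
Since $f(A_0) \subseteq A_0$ and since the sets $A_0 \in \Sigma_0$ are pairwise disjoint and ${}^*A = \bigcup \Sigma_A$,
we find in view of the definition of $B$ that ${}^*A \cap B = \bigcup \{ f(A_0) : A_0 \in \Sigma_A \}$ where
the union is pairwise disjoint. In particular,
\[ {}^\#({}^*A \cap B) = \sum_{A_0 \in \Sigma_A} {}^\#(f(A_0)) . \]
Summarizing,
\[ \bigl| h \, {}^*(\mu(A)) - {}^\#({}^*A \cap B) \bigr| \le {}^\#\Sigma_0 \qquad (A \in \Sigma). \tag{11.3} \]
Applying (11.3) for $A = S_0$, we find in view of $\mu(S_0) = 1$ that
\[ \bigl| h - {}^\#B \bigr| \le {}^\#\Sigma_0 . \]
Dividing this estimate by ${}^\#\Sigma_0$, we find that ${}^\#B / {}^\#\Sigma_0 \ge (h / {}^\#\Sigma_0) - 1$ is infinite.
(In particular, $B$ is actually infinite.)
Moreover, for any $A \in \Sigma$ we have by (11.3) in view of ${}^*(\mu(A)) \le 1$ that
\[ \Bigl| \frac{{}^\#({}^*A \cap B)}{{}^\#B} - {}^*(\mu(A)) \Bigr|
   = \frac{\bigl| \bigl( {}^\#({}^*A \cap B) - h \, {}^*(\mu(A)) \bigr) + (h - {}^\#B) \, {}^*(\mu(A)) \bigr|}{{}^\#B}
   \le \frac{2 \, {}^\#\Sigma_0}{{}^\#B} \approx 0 . \]
We thus have the required representation.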
Corollary 11.3 (Measure Extension Theorem). If µ is a singular additive probabil-
ity measure defined on an algebra Σ ⊆ S0 , then µ may be extended to a singular
additive probability measure on P(S0 ).
Proof. Let ∗ be a Σ-enlargement. Then we may write µ in the form (11.1), and
the latter defines even a singular additive probability measure on P(S0 ).
In particular, the Lebesgue measure (defined on the measurable subsets of
$[0, 1]$) has an extension to an additive measure which is defined on all subsets of
$[0, 1]$. A famous theorem of Banach states that the Lebesgue measure on $\mathbb{R}$ has
an extension to a translation invariant additive measure $\mu$ which is defined on all
subsets of $\mathbb{R}$ (translation invariance means $\mu(A + x) = \mu(A)$ for each $A \subseteq \mathbb{R}$ and
each $x \in \mathbb{R}$).
We intend to give a nonstandard proof for this result. By calculating modulo
1, it obviously suffices to extend the Lebesgue measure on $[0, 1)$ to a measure $\mu$
which has the property that $\mu(A) = \mu(A \oplus x)$ for any $A \subseteq [0, 1)$ and any $x \in \mathbb{R}$
(here, $A \oplus x := \{a \oplus x : a \in A\}$ and $a \oplus x := a + x + z$ where $z \in \mathbb{Z}$ is chosen such
that $a \oplus x \in [0, 1)$).
This result follows rather immediately from the following result of the
standard world.
Proposition 11.4. The operation $\oplus$ satisfies Følner's condition on $X = [0, 1)$,
i.e. for each $x_1, \ldots, x_n \in [0, 1)$ and each $\varepsilon \in \mathbb{R}_+$ there is a nonempty finite set
$A \subseteq [0, 1)$ such that
\[ \frac{|A \Delta (A \oplus x_j)|}{|A|} < \varepsilon \qquad (j = 1, \ldots, n) \tag{11.4} \]
where $\Delta$ denotes the symmetric difference $A \Delta B := (A \setminus B) \cup (B \setminus A)$.
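For a single translation $x$, a Følner set can be written down directly: the orbit segment $A = \{0, x, 2x, \ldots, (k-1)x\}$ taken modulo 1. The following Python sketch checks the ratio in (11.4) with exact rational arithmetic. The function names and the sample values of $x$ and $k$ are assumptions for this illustration; the proposition itself handles several translations at once by induction.

```python
from fractions import Fraction

def oplus(a, x):
    """Addition modulo 1 on [0, 1)."""
    return (a + x) % 1

def translate(A, x):
    """The set A (+) x = {a (+) x : a in A}."""
    return {oplus(a, x) for a in A}

def folner_ratio(A, x):
    """|A symmetric-difference (A (+) x)| / |A| as an exact fraction."""
    return Fraction(len(A ^ translate(A, x)), len(A))

def orbit_segment(x, k):
    """The set {0, x, 2x, ..., (k-1)x} modulo 1."""
    return {(j * x) % 1 for j in range(k)}

x = Fraction(1, 1000)
A = orbit_segment(x, 100)
print(folner_ratio(A, x))       # only the two end points change: 2/100

y = Fraction(1, 4)
print(folner_ratio(orbit_segment(y, 4), y))  # a full cycle is invariant: 0
```

The first case mirrors the second case of the proof, where the translates are disjoint and the ratio is at most $2/k$; the second mirrors the first case, where a multiple of $x$ returns to the starting point.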
Proof. Without loss of generality, let $x_1 = 0$. We define $kx := x \oplus \cdots \oplus x$ ($k$ times)
and $0x := 0$ and prove by induction on $n$ that the set $A$ may even be chosen such
that each $a \in A$ may be written in the form $a = k_1 x_1 \oplus \cdots \oplus k_n x_n$.
Since $x_1 = 0$, the induction start is trivial. Assume the claim has already
been proved for $n - 1$. Let $\varepsilon \in \mathbb{R}_+$ and $x_1, \ldots, x_n$ be given. We distinguish two
cases: First assume that there exist numbers $k_1, \ldots, k_n \in \mathbb{N} \cup \{0\}$, $k_n \ne 0$, with
$k_n x_n = k_1 x_1 \oplus \cdots \oplus k_{n-1} x_{n-1}$. By induction hypothesis, we find a nonempty finite
set $A$ satisfying (for any $y \in [0, 1)$):
\[ |(y \oplus A) \Delta (y \oplus A \oplus x_j)| < |A| \, \varepsilon_0 \qquad (j = 1, \ldots, n - 1) \tag{11.5} \]
where $\varepsilon_0 := \varepsilon / (k_1 + \cdots + k_{n-1})$. Put $A_0 := A \cup (A \oplus x_n) \cup \cdots \cup (A \oplus (k_n - 1) x_n)$.
Summing up (11.5) for $y = 0, x_n, \ldots, (k_n - 1) x_n$, we obtain
|A0 ∆(A0 ⊕ xj )| < kn |A| ε0 ≤ |A0 | ε (j = 1, . . . , n − 1).
Moreover, a successive application of (11.5) implies
|A0 ∆(A0 ⊕ kn xn )| = |A0 ∆(A0 ⊕ k1 x1 ⊕ · · · ⊕ kn−1 xn−1 )|

< (k1 + · · · + kn−1 ) |A| ε0 .
The definition of A0 implies A0 ∆(A0 ⊕ xn ) = A∆(A ⊕ kn xn ), and so
|A0 ∆(A0 ⊕ xn )| < (k1 + · · · + kn−1 ) |A| ε0 ≤ |A0 | ε.
Hence, we are done in this case.

If no numbers k1 , . . . , kn as assumed above exist, choose k > 2/ε. By in-
duction assumption, we find a nonempty finite set A such that (11.5) holds with
ε0 := ε. Put A0 := A ∪ (A ⊕ xn ) ∪ · · · ∪ (A ⊕ (k − 1)xn ). By assumption, this is a
union of disjoint sets, and so |A0 | = k |A|. From (11.5), it follows that
|A0 ∆(A0 ⊕ xj )| < k |A| ε0 = |A0 | ε (j = 1, . . . , n − 1).
Moreover, since A0 ∆(A0 ⊕ xn ) = A∆(A ⊕ kxn ), we have
|A0 ∆(A0 ⊕ xn )| ≤ 2 |A| = 2 |A0 | /k < |A0 | ε.
Theorem 11.5 (Measure Extension Theorem for $\mathbb{R}$). The Lebesgue measure has
an extension to a translation invariant additive measure which is defined on all
subsets of $\mathbb{R}$.
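Proof. As noted above, it suffices to extend the Lebesgue measure on $X = [0, 1)$.
By Corollary 11.3, we find some extension $\mu$ of the Lebesgue measure to all subsets
of $X$. Let $* : S \to {}^*S$ be an $X$-enlargement. Let $c : \mathcal{P}(X) \to \mathbb{N} \cup \{\infty\}$ be the
function which associates to each subset of $X$ its number of elements. Exercise 27
implies ${}^*c(B) = {}^\#B$ for any internal $B \subseteq {}^*X$. Consider the binary relation
\[ \varphi := \{ (n, x, y) \in (\mathbb{N} \times X) \times \mathcal{P}(X) \mid \text{“$y$ is finite”} \wedge c(y \Delta (y \oplus x)) / c(y) \le 1/n \} . \]
Proposition 11.4 implies that $\varphi$ is concurrent on $\mathbb{N} \times X$. By Theorem 8.10, ${}^*\varphi$ is
satisfied on ${}^\sigma(\mathbb{N} \times X)$. The standard definition principle for relations thus implies
that we find some $*$-finite $B \subseteq {}^*X$ such that
\[ \frac{{}^\#(B \Delta (B \; {}^*\!\oplus \, {}^*x))}{{}^\#B} \approx 0 \qquad (x \in X). \]
We claim that
\[ \mu_0(A) := \operatorname{st}\Bigl( \frac{1}{{}^\#B} \sum_{x \in B} {}^*\mu({}^*A \; {}^*\!\oplus \, x) \Bigr) \qquad (A \subseteq X) \]
is the desired measure. Note that $\mu_0$ is defined, because $|{}^*\mu(A_0)| \le 1$ for all
$A_0 \in {}^*\mathcal{P}(X)$ implies
\[ \sum_{x \in B} {}^*\mu({}^*A \; {}^*\!\oplus \, x) \le {}^\#B . \]
Moreover, if $A_1, A_2 \subseteq X$ are disjoint, then ${}^*A_1 \; {}^*\!\oplus \, x$ and ${}^*A_2 \; {}^*\!\oplus \, x$ are disjoint, and
the additivity of ${}^*\mu$ and st thus implies that $\mu_0$ is additive. If $A \subseteq X$ is Lebesgue
measurable, we have $\mu(A \oplus x) = \mu(A)$ for any $x \in X$, and so
\[ {}^*(\mu_0(A)) \approx \frac{1}{{}^\#B} \sum_{x \in B} {}^*\mu({}^*A) = {}^*\mu({}^*A) = {}^*(\mu(A)) \]
which implies that $\mu_0(A) = \mu(A)$.
To see that $\mu_0$ is translation invariant, let $A \subseteq X$ and $y \in X$ be given. We
have
\[ \mu_0(A) - \mu_0(A \oplus y) = \operatorname{st}\Bigl( \frac{1}{{}^\#B} \sum_{x \in B} \bigl( {}^*\mu({}^*A \; {}^*\!\oplus \, x) - {}^*\mu({}^*A \; {}^*\!\oplus \, {}^*y \; {}^*\!\oplus \, x) \bigr) \Bigr) . \]
Observe now that the sum in the above formula may be written as
\[ \sum_{x \in B} {}^*\mu({}^*A \; {}^*\!\oplus \, x) - \sum_{x \in B \oplus y} {}^*\mu({}^*A \; {}^*\!\oplus \, x)
   = \sum_{x \in B \setminus (B \oplus y)} {}^*\mu({}^*A \; {}^*\!\oplus \, x) - \sum_{x \in (B \oplus y) \setminus B} {}^*\mu({}^*A \; {}^*\!\oplus \, x) . \]
Since $0 \le {}^*\mu({}^*A \; {}^*\!\oplus \, x) \le 1$ and the total number of summands in these two sums
is ${}^\#(B \Delta (B \oplus y))$, we obtain the estimate
\[ |\mu_0(A) - \mu_0(A \oplus y)| \le \operatorname{st}\Bigl( \frac{1}{{}^\#B} \, {}^\#(B \Delta (B \; {}^*\!\oplus \, {}^*y)) \Bigr) = 0 . \]
Thus, $\mu_0$ is translation invariant.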
The previous proofs are essentially taken from [Wag86, p. 161] and [Hen72b].
The reader will have realized that for Proposition 11.4 we might have replaced
X by any commutative group. Moreover, the proof of Theorem 11.5 holds even for
any group X which satisfies Følner’s condition: Any translation invariant finitely
additive probability measure on such a group X may be extended to a translation
invariant finitely additive probability measure defined on all subsets of X. Groups
possessing such a measure are called amenable. In particular, our above proofs
show that commutative groups and, more generally, groups satisfying Følner’s
condition, are amenable. Følner has proved that conversely amenable groups must
satisfy Følner's condition. Readers more interested in amenable groups are referred
to [Wag86]. We only mention that the group of isometries in $\mathbb{R}^n$ is amenable if
and only if $n \le 2$.
Chapter 6
Nonstandard Topology and Functional Analysis
§ 12 Topologies and Filters

12.1 Topological Spaces
Definition 12.1. Let X be some set, and O a system of subsets of X. Then O is
called a topology (and the pair (X, O) is called a topological space), if:
1. ∅, X ∈ O.
2. A, B ∈ O implies A ∩ B ∈ O.

3. For any $\mathscr{A} \subseteq \mathscr{O}$ we have $\bigcup \mathscr{A} \in \mathscr{O}$.
The elements of O are called open subsets of X. The complements of open sets
are called closed .
Often, we do not mention O explicitly, and simply say (not very precisely)
that X is a topological space.
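On a finite set the three axioms of Definition 12.1 can be checked mechanically. The Python sketch below is an illustration under that finiteness assumption (for a finite system, closure under pairwise unions already gives closure under arbitrary unions); the function name `is_topology` and the sample systems are hypothetical.

```python
from itertools import combinations

def is_topology(X, opens):
    """Check the axioms of a topology for a finite system of subsets of a finite set X."""
    X = frozenset(X)
    opens = {frozenset(O) for O in opens}
    # Every member must be a subset of X, and the empty set and X must be open.
    if any(not O <= X for O in opens):
        return False
    if frozenset() not in opens or X not in opens:
        return False
    # Closure under pairwise intersections and (finite, hence all) unions.
    for A, B in combinations(opens, 2):
        if A & B not in opens or A | B not in opens:
            return False
    return True

X = {1, 2, 3}
chain = [set(), {1}, {1, 2}, X]
broken = [set(), {1}, {2}, X]      # the union {1, 2} is missing

print(is_topology(X, chain))
print(is_topology(X, broken))
```

The second system fails only the union axiom, which shows that the three conditions are independent requirements.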
The most important example of topological spaces are metric spaces:
Definition 12.2. Let X be some set, and d : X × X → [0, ∞). Then d is called a
pseudometric on X, if:
1. d(x, x) = 0.
2. d(x, y) = d(y, x) (symmetry).
3. d(x, y) ≤ d(x, z) + d(z, y) (triangle inequality).
Moreover, d is called a metric on X if additionally d(x, y) = 0 implies x = y.
Each (pseudo)metric induces a topology O on X in a canonical way, namely
O is the system of all sets O ⊆ X with the property that for each x ∈ O there is
150 Chapter 6. Nonstandard Topology and Functional Analysis
Ê
some ε ∈ + such that the open ball B(x, ε) := {y ∈ X : d(x, y) < ε} is contained
in O. If we speak of a (pseudo)metric space, we always mean that X is equipped
with this topology.
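The difference between a pseudometric and a metric can be seen on a small finite example. The sketch below assumes a finite point set with integer distances (so that all comparisons are exact); the names `is_pseudometric`, `is_metric`, `open_ball`, and the two sample distance functions are hypothetical and serve only as illustration.

```python
from itertools import product

def is_pseudometric(points, d):
    """Check the conditions of Definition 12.2 on a finite point set."""
    pts = list(points)
    return (all(d(x, x) == 0 for x in pts)
            and all(d(x, y) >= 0 for x, y in product(pts, repeat=2))
            and all(d(x, y) == d(y, x) for x, y in product(pts, repeat=2))
            and all(d(x, y) <= d(x, z) + d(z, y)
                    for x, y, z in product(pts, repeat=3)))

def is_metric(points, d):
    """A metric additionally satisfies: d(x, y) = 0 implies x = y."""
    pts = list(points)
    return is_pseudometric(pts, d) and all(
        x == y for x, y in product(pts, repeat=2) if d(x, y) == 0)

def open_ball(points, d, x, eps):
    """B(x, eps) = {y : d(x, y) < eps}, restricted to the given points."""
    return {y for y in points if d(x, y) < eps}

grid = [(i, j) for i in range(3) for j in range(3)]
first_coordinate = lambda p, q: abs(p[0] - q[0])   # ignores the second coordinate
taxicab = lambda p, q: abs(p[0] - q[0]) + abs(p[1] - q[1])
```

Here `first_coordinate` is a pseudometric but not a metric: distinct points in the same column have distance 0, so every open ball around a point contains its whole column.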
Any topology allows us to define neighborhoods of a point:
Definition 12.3. If X is a topological space and x ∈ X, then U ⊆ X is called a
neighborhood of x if there is an open set O ⊆ U with x ∈ O. The system U (x) of
all neighborhoods of x is called the neighborhood filter of x.
For example, in a (pseudo)metric space X, a set is a neighborhood of x if
and only if it contains some ball with positive radius and center x.
Proposition 12.4. Any neighborhood filter U (x) is a filter.
Proof. If U ∈ U(x), then x ∈ O ⊆ U for some open set O. Hence U ≠ ∅, and
whenever U ⊆ V ⊆ X, we have x ∈ O ⊆ V which implies V ∈ U(x). Finally, if
U1, U2 ∈ U(x), say x ∈ O1 ⊆ U1 and x ∈ O2 ⊆ U2 with open sets O1, O2, then
O := O1 ∩ O2 is open, and x ∈ O ⊆ U1 ∩ U2, i.e. U1 ∩ U2 ∈ U(x).
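On a finite space, Proposition 12.4 can be verified by brute force. The sketch below assumes that a filter is a nonempty system of nonempty sets which is closed under supersets and finite intersections (the three properties used in the proof above); all function names are illustrative.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    xs = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def neighborhood_filter(X, opens, x):
    """All U ⊆ X containing some open set O with x ∈ O (Definition 12.3)."""
    opens = [frozenset(O) for O in opens]
    return {U for U in powerset(X) if any(x in O and O <= U for O in opens)}

def is_filter(X, F):
    """Nonempty, does not contain ∅, closed under supersets and intersections."""
    if not F or frozenset() in F:
        return False
    subsets = powerset(X)
    upward = all(V in F for U in F for V in subsets if U <= V)
    meets = all(U & V in F for U in F for V in F)
    return upward and meets

X = {"a", "b", "c"}
opens = [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]
```

In this example the neighborhoods of `"b"` are exactly the supersets of `{"a", "b"}`, the smallest open set containing `"b"`.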
The neighborhoods determine the topology:
Proposition 12.5. A set U ⊆ X is open if and only if it is a neighborhood for each
of its elements.
Proof. Let A denote the system of all subsets of U which are open. By definition,
⋃A ⊆ U. If U is a neighborhood for each of its elements, then each x ∈ U belongs
to some open set O ⊆ U, i.e. to some O ∈ A, and thus
U ⊆ ⋃A. Hence, U = ⋃A is open. The converse implication is trivial.
12.2 Filters in Nonstandard Analysis

In view of Proposition 12.4, filters play an important role in the study of topologies.
We first describe filters in more detail by nonstandard methods.
For the rest of this section, let X be an entity of the standard world S, and
∗ : S → ∗S be a P(X)-enlargement. If F is a filter over X, we define its monad
as the set
mon(F) := ⋂ σF = ⋂{∗F : F ∈ F}.
Theorem 12.6. The monads of filters satisfy:
1. mon(F) ≠ ∅.
2. If A ⊆ mon(F ) is internal, then there is some B ∈ ∗ F with A ⊆ B ⊆
mon(F ).
3. If A ⊆ ∗ X is internal with mon(F ) ⊆ A, then A ∈ ∗ F .
4. If A ⊆ X with mon(F ) ⊆ ∗ A, then A ∈ F .
5. Finer filters have smaller monads: F1 ⊇ F2 if and only if mon(F1 ) ⊆
mon(F2 ).
6. The monad characterizes the filter: F1 = F2 if and only if mon(F1 ) =

mon(F2 ).
Proof. 1. Since F has the finite intersection property, mon(F) ≠ ∅ follows from
the definition of enlargements.
2. By Theorem 8.10, there is a ∗ -finite set F0 with σ F ⊆ F0 ⊆ ∗ F . Then
F1 := {F ∈ F0 : A ⊆ F } is an internal subset of F0 by the internal definition
principle. Theorem 6.13 thus implies that F1 is ∗ -finite. Since A ⊆ mon(F ), the

definition of mon(F) implies σF ⊆ F1 and thus also B := ⋂F1 ⊆ ⋂ σF = mon(F).
Moreover, A ⊆ B, since each element of F1 contains A.
Since F is a filter, the sentence
∀x ∈ P(F) : (“x is finite” =⇒ ⋂x ∈ F)
is true. The transfer principle implies that for any ∗-finite internal subset x of ∗F,
we have ⋂x ∈ ∗F. In particular, B = ⋂F1 is an element of ∗F.
3. By 2., we find some B ∈ ∗ F with B ⊆ mon(F ); in particular B ⊆ A. The
transfer of the sentence
∀x ∈ P(X) : ((∃y ∈ F : y ⊆ x) =⇒ x ∈ F )
implies that ∗ F contains all internal subsets of ∗ X which contain some element
of ∗ F as a subset. In particular, A ∈ ∗ F .
4. Since ∗ A ⊆ ∗ X is internal with mon(F ) ⊆ ∗ A, we find by 3. that ∗ A ∈ ∗ F
which by the inverse form of the transfer principle implies A ∈ F .
5. If F1 ⊇ F2 , then trivially mon(F1 ) ⊆ mon(F2 ). Conversely, suppose that
mon(F1 ) ⊆ mon(F2 ). For any F ∈ F2 , the definition of the monad implies
∗F ⊇ mon(F2) ⊇ mon(F1), and so F ∈ F1 by 4.
6. This follows by applying 5. twice, the second time with the roles of F1 and F2 swapped.
Exercise 60. Prove that a filter U on X is an ultrafilter if and only if for any filter
F on X we have either mon(U ) ⊆ mon(F ) or mon(U )∩mon(F ) = ∅. Prove also
that in this case, any different ultrafilter U ′ on X satisfies mon(U ) ∩ mon(U ′ ) =
∅.
Hint: Consider the system F0 := {F ∩ U : F ∈ F , U ∈ U }.
Theorem 12.7. If ∗ is even a compact P(X)-enlargement, then the following holds:
If F is a filter on X, then any internal set B ⊇ {F ∈ ∗ F : F ⊆ mon(F )}
contains some standard element F ∈ σ F .
Proof. Assume that B contains no standard element. Consider the internal binary
relation
ϕ := {(x, y) ∈ ∗F × ∗F | y ⊆ x ∧ y ∉ B}.
Then ϕ is concurrent on σF: Indeed, if x1 = ∗F1, . . . , xn = ∗Fn with Fk ∈ F
are given, put y := ∗(F1 ∩ · · · ∩ Fn) = ∗F1 ∩ · · · ∩ ∗Fn. Since y is a standard
element our assumption implies y ∉ B. Since y ∈ ∗F satisfies y ⊆ xk, we thus
have (x1, y), . . . , (xn, y) ∈ ϕ, as desired. Since ∗ is a compact P(X)-enlargement,
ϕ is satisfied on σF, i.e. there is some F0 ∈ ∗F such that (∗F, F0) ∈ ϕ for each
F ∈ F. But then F0 ⊆ mon(F) and F0 ∉ B, a contradiction to the definition of
B.
Definition 12.8. A filter F over X is called principal if there is some A ⊆ X such
that
F = {F ⊆ X : A ⊆ F };
otherwise F is called nonprincipal .

Example 12.9. A filter F is principal if and only if ⋂F ∈ F (consider A := ⋂F
to see this). An ultrafilter U is nonprincipal if and only if it is free (Exercise 10).
Theorem 12.10 (Luxemburg). If the filter F is principal, then mon(F) = ∗(⋂F)
is a standard entity. Otherwise, mon(F) is external.
Proof. (For the case that ∗ is a compact P(X)-enlargement):
The first statement is almost trivial: Let F be principal, i.e. F = {F ⊆
X : A ⊆ F} where A := ⋂F. The transfer principle implies ∗A ⊆ ∗F for any
F ∈ F , and so ∗ A ⊆ mon(F ). Since A ∈ F , we have mon(F ) ⊆ ∗ A, and so
mon(F ) = ∗ A is a standard entity.
Conversely, suppose that mon(F ) is internal. Then the set B := {A ∈ ∗ F :
A ⊆ mon(F )} is internal. Since we assume that ∗ is a compact enlargement,
Theorem 12.7 implies that B contains some standard element, i.e. there is some
A ∈ F with ∗A ⊆ mon(F), i.e. ∗A ⊆ ∗F for each F ∈ F. This means A ⊆ F
for each F ∈ F (Lemma 3.5), and so ⋂F = A ∈ F which means that F is
principal.
We intend to give a proof for Theorem 12.10 also for the case that ∗ is only
an enlargement (not necessarily a compact enlargement). The proof is rather deep
and is taken from [Lux69a]. We need some preparation:
A system B of subsets of X is called a subbase of the filter F , if F is the
filter generated by B. The monad of a filter is determined by any subbase:
Lemma 12.11. If the filter F is generated by B, then
mon(F) = ⋂ σB.
Proof. Since σB ⊆ σF, we have ⋂ σB ⊇ ⋂ σF = mon(F). For the converse
inclusion, let A ∈ ⋂ σB, i.e. A ∈ ∗B for any B ∈ B. If F ∈ F is given, we have
F ⊇ B1 ∩ · · · ∩ Bn for B1, . . . , Bn ∈ B, and so ∗F ⊇ ∗B1 ∩ · · · ∩ ∗Bn ∋ A. Thus,
A ∈ ∗F for any F ∈ F, i.e. A ∈ mon(F).
The smallest cardinality of all subbases of F is called the dimension of F .

Lemma 12.12. If the filter F is principal, its dimension is 1. Otherwise its dimension
is infinite.
Proof. If F is principal, then the single set ⋂F constitutes a subbase for F. If
F has finite dimension, then there is a finite subbase B = {B1 , . . . , Bn } for F .
Putting A := B1 ∩ · · · ∩ Bn , we have F ∈ F if and only if A ⊆ F ⊆ X, i.e. F is
principal.
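The second half of this proof can be traced on a finite set: a finite subbase B1, . . . , Bn generates exactly the supersets of A := B1 ∩ · · · ∩ Bn. The following sketch assumes a finite ground set and a subbase with nonempty intersection; the names `generated_filter` and `is_principal` are illustrative.

```python
from functools import reduce
from itertools import chain, combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    xs = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def generated_filter(X, subbase):
    """Return (A, F): A = B1 ∩ ... ∩ Bn and F = all supersets of A in X."""
    A = reduce(frozenset.intersection, (frozenset(B) for B in subbase))
    if not A:
        raise ValueError("subbase lacks the finite intersection property")
    return A, {F for F in powerset(X) if A <= F}

def is_principal(F):
    """Criterion of Example 12.9: the intersection of F belongs to F."""
    return reduce(frozenset.intersection, F) in F

X = {1, 2, 3, 4}
A, F = generated_filter(X, [{1, 2, 3}, {2, 3, 4}])
```

Since every filter on a finite set has a finite subbase, this also shows why nonprincipal filters can only occur on infinite sets.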
One might hope to obtain a “minimal subbase” for F by considering a sub-
base of F with smallest possible cardinality and then to generate from this subbase
a “minimal subbase” by successively choosing only those elements which are really
“needed” to generate F . Of course, if the subbase is uncountable, one has to use
a transfinite induction for this procedure. This idea leads to a minimal subbase in
the following sense:
Lemma 12.13. Let F be a filter of infinite dimension κ. Then there exists a subbase
B for F with the following property:
There is an injection i : B → κ such that whenever ℰ ⊆ B is finite and
⋂ℰ ⊆ E ∈ B, then i(E) ≤ max{i(F) : F ∈ ℰ} (with respect to the order in the
ordinal number κ).
Proof. Let B0 be a subbase with cardinality κ, and let Fα (0 ≤ α < κ) be a
corresponding enumeration of the elements of B0. Now we put Cα := {Fβ : 0 ≤
β < α}, and for 0 < α < κ, let ℱα denote the filter generated by Cα. Now we
define B as follows:
B := {Fα | α = 0 or Fα ∉ ℱα}.
The mapping i is defined by i(Fα) := α.
We prove by transfinite induction on α that the filter generated by Aα :=
{Fβ : 0 ≤ β ≤ α} = Cα ∪ {Fα} is a subset of the filter BF generated by B:
Indeed, this is true for α = 0. Suppose as induction assumption that this is true
for all α < α0, i.e. Cα0 ⊆ BF. By definition of B, we have either Fα0 ∈ B, or
Fα0 ∈ ℱα0. Since the induction assumption implies ℱα0 ⊆ BF, we have in both
cases Fα0 ∈ BF, and so Aα0 = Cα0 ∪ {Fα0} ⊆ BF, as required.
We thus have proved, in particular, that each Fα is contained in the filter
BF generated by B. Since the system of all Fα generates F, we have F ⊆ BF.
But in view of B ⊆ B0, we have BF ⊆ F, and so BF = F, i.e. B actually
generates the filter F.
The other property follows from our construction: Let ℰ ⊆ B be finite, say
ℰ = {Fα1, . . . , Fαn}, and ⋂ℰ ⊆ E ∈ B. Assume by contradiction that α :=
i(E) > max{i(F) : F ∈ ℰ} = max{α1, . . . , αn}. Then E = Fα belongs to the filter ℱα
generated by Cα ⊇ ℰ which contradicts our definition of B.
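The selection step of this construction can be imitated for a finite enumeration F0, . . . , Fm. This is only a sketch of the thinning rule, under the assumption that everything is finite: a finite subbase always generates a principal filter, so the setting of the lemma (infinite dimension) is not reproduced. The name `thin_subbase` is hypothetical.

```python
from functools import reduce

def thin_subbase(enumerated):
    """Keep F_alpha iff alpha = 0 or F_alpha does not belong to the filter
    generated by the earlier sets F_beta (beta < alpha)."""
    sets = [frozenset(F) for F in enumerated]
    kept = []
    for alpha, F in enumerate(sets):
        if alpha == 0:
            kept.append((alpha, F))
            continue
        # For finitely many earlier sets, F lies in the filter they generate
        # iff F contains their common intersection.
        core = reduce(frozenset.intersection, sets[:alpha])
        if not core <= F:
            kept.append((alpha, F))
    return kept

enumerated = [{0, 1, 2, 3}, {0, 1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3}, {2, 3, 4}]
kept = thin_subbase(enumerated)
kept_core = reduce(frozenset.intersection, [F for _, F in kept])
full_core = reduce(frozenset.intersection, [frozenset(F) for F in enumerated])
```

The kept sets have the same intersection as the full enumeration, which is the finite counterpart of the statement that B still generates F; the first component of each kept pair plays the role of the injection i.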
Proof of Theorem 12.10. The first statement has already been proved (the compactness
of ∗ was not needed in our previous proof concerning the first statement).
For the second statement, let F be nonprincipal. Then F has infinite dimension
κ by Lemma 12.12. Without loss of generality, we assume that κ is an
entity of S. Choose B and i as in Lemma 12.13, and put
P := {x ∈ ∗P(B) | “x is ∗-finite” ∧ ⋂x ⊆ mon(F)}.
If we assume by contradiction that mon(F) is internal, then P is internal by the
internal definition principle. By Theorem 8.10, there is some ∗-finite set B0 ⊆ ∗B
with B0 ⊇ σB. In particular, ⋂B0 ⊆ ⋂ σB = mon(F) (Lemma 12.11). Hence,
B0 ∈ P, and so P ≠ ∅.
Note that P consists of (internal) ∗ -finite entities A ⊆ ∗ B. Hence, we may
define an internal function j : P → ∗κ by j(A) := max{∗i(F) : F ∈ A} (Exercise 26).
In particular, rng(j) is an internal nonempty subset of ∗κ (Theorem 3.19).
The transfer of the sentence “κ is well-ordered” implies that rng(j) has a smallest
element α0 ∈ ∗ κ.
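We show first that we have for any standard element A ∈ σB the relation
∗i(A) ≤ α0: Indeed, choose some A0 ∈ P with j(A0) = α0. Since A is standard,
the definition of the monad implies ⋂A0 ⊆ mon(F) ⊆ A. Hence, the transfer of the
property of Lemma 12.13 (applied to the ∗-finite set A0 ⊆ ∗B and to A ∈ ∗B) implies
∗i(A) ≤ max{∗i(F) : F ∈ A0} = j(A0) = α0.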
It follows that α0 is not a standard number: Indeed, assume to the contrary
that α0 = ∗β0 for some β0 ∈ κ. Then we would have for any A ∈ B that ∗(i(A)) =
∗i(∗A) ≤ α0 = ∗β0, i.e. i(A) ≤ β0. Thus, i is an injection from B into the set
{β ∈ κ : β ≤ β0 } which has a strictly smaller cardinality than κ, a contradiction
to the fact that B has the cardinality of κ (because κ is the dimension of F ).
Hence α0 is not a standard number.
We thus have for any standard A ∈ σB that ∗i(A) ≠ α0; since we already
proved ∗i(A) ≤ α0, we even have ∗i(A) < α0. Consider now the set B1 := {x ∈
B0 : ∗i(x) < α0}. By the internal definition principle, the set B1 is an internal
subset of the ∗-finite set B0. Hence, B1 is ∗-finite (Theorem 6.13). Moreover, since
B0 ⊇ σB and since any A ∈ σB satisfies ∗i(A) < α0 (as we have proved above),
we have σB ⊆ B1. Consequently, ⋂B1 ⊆ ⋂ σB = mon(F) (Lemma 12.11).
Thus, B1 ∈ P. Since j(B1 ) < α0 (by definition of j and B1 and the transfer
principle), this contradicts the definition of α0 .
12.3 Topologies in Nonstandard Analysis

Now we return to the study of topological spaces. As before, let X be a topological
space where X ∈ S is an entity, and ∗ : S → ∗S is a P(X)-enlargement.
For x ∈ X, let U (x) denote its neighborhood filter. In practice, one usually
does not calculate with U (x) but only with a neighborhood base:
Definition 12.14. A system B ⊆ U (x) is called a neighborhood base for x, if for

each neighborhood U of x there is some B ∈ B with B ⊆ U .
Example 12.15. The system of all open neighborhoods of x is a neighborhood base
for x. Indeed, if U ∈ U (x), there is some open O ⊆ X with x ∈ O ⊆ U , i.e. O is
an open neighborhood of x with O ⊆ U .
Example 12.16. If X is a metric space with metric d, then a neighborhood base
for x is given by the system of open balls B(x, r) := {y ∈ X : d(x, y) < r}
(r ∈ ℝ₊) or also by the system of all closed balls B̄(x, r) := {y ∈ X : d(x, y) ≤ r}
(r ∈ ℝ₊). Another neighborhood base is given by the system B(x, 1/n) (n ∈ ℕ)
or by B̄(x, 1/n) (n ∈ ℕ).
Example 12.17. Let U ∈ U (x). Then the system {V ∈ U (x) : V ⊆ U } is a
neighborhood base for x (since for W ∈ U (x) the set V := W ∩U is a neighborhood
for x). Moreover, the system of all open neighborhoods O of x with O ⊆ U is also
a neighborhood base for x.
Definition 12.18. The monad of x, mon(x), is the monad of its neighborhood filter
U (x).
Proposition 12.19. If x is a point in a topological space X, we have

mon(x) = ⋂ σB = ⋂{∗U : U ∈ B}
for any neighborhood base B of x. In particular, y ∈ mon(x) if and only if y ∈ ∗ O

for each open O ⊆ X with x ∈ O. Moreover, there is some set B ∈ ∗ B with
B ⊆ mon(x).
Proof. It follows immediately from the definition that the neighborhood filter
U (x) of x is generated by any neighborhood base. Hence, the first statement
follows from Lemma 12.11. For the last statement, observe that Theorem 12.6 2.
implies that there is some F ∈ ∗U(x) with F ⊆ mon(x). The transfer of the
statement
∀u ∈ U(x) : ∃y ∈ B : y ⊆ u
implies that we find some B ∈ ∗B with B ⊆ F ⊆ mon(x).
Corollary 12.20. If X is a pseudometric space, we have

mon(x) = ⋂_{r∈ℝ₊} ∗B(x, r) = ⋂_{n∈ℕ} ∗B(x, 1/n),
where B(x, r) := {y ∈ X : d(x, y) < r}. In particular, for X = ℝ with the natural
topology, the definition of mon(x) coincides with our previous Definition 5.22.
We also want to define monads for points in the nonstandard world; we define
even monads of sets:
Definition 12.21. If A ⊆ ∗X is nonempty, we call the filter generated by the
system of all open sets O ⊆ X with A ⊆ ∗O the standard filter of A. Its monad is
called the monad of A and denoted by µ(A).
Note that the system B of all open sets O ⊆ X with A ⊆ ∗ O indeed generates
a filter, since it has the finite intersection property: If O1 , . . . , On ∈ B, then
O = O1 ∩ · · · ∩ On is open, and ∗O = ∗O1 ∩ · · · ∩ ∗On contains A and thus is
nonempty. Hence, O ≠ ∅. By Lemma 12.11, we have µ(A) = ⋂ σB. In other
words:
Proposition 12.22. We have y ∈ µ(A) if and only if y ∈ ∗ O for each open set
O ⊆ X with A ⊆ ∗ O.
Corollary 12.23. If x ∈ X, then µ({∗ x}) = mon(x).
Proof. By the transfer principle, {∗ x} ⊆ ∗ O if and only if x ∈ O. Now apply
Propositions 12.22 and 12.19.
Definition 12.24. We call two nonstandard points x, y ∈ ∗ X of a standard topolog-
ical space X infinitely close to each other if for each open set O ⊆ X with x ∈ ∗ O
we also have y ∈ ∗ O, i.e. if y ∈ µ({x}). In this case, we write y ≈O x.
In the case X = ℝ, we have mon(x) = {y ∈ ∗X : y ≈ ∗x} = {y ∈ ∗X : y ≈O ∗x}.
Indeed, Corollary 12.23 implies for any topological space X:
Corollary 12.25. We have for any x ∈ X that
mon(x) = {y ∈ ∗ X : y ≈O ∗ x}.
We emphasize that in the literature ≈O is sometimes defined only when x is a
standard point (e.g. in [LR94]).
The reader should be warned that ≈O is in general not an equivalence relation
and not even symmetric, i.e. x ≈O y does not imply that y ≈O x, even if x and y
are both standard points:
Example 12.26. Let X = {a, b} (a ≠ b) where only the three sets ∅, {b}, {a, b} are
open (this is a topology!). Since X is finite, we have σX = ∗X. Similarly, ∗O = σO
for each open set O ⊆ X. The only open set O with ∗a ∈ ∗O is thus O = {a, b}, and
so ∗b ≈O ∗a. However, for O = {b}, we have ∗b ∈ ∗O and ∗a ∉ ∗O which implies
∗a ≉O ∗b.
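Because X is finite in this example, every point of ∗X is standard and the monad of a point is simply the intersection of the open sets containing it, so the asymmetry can be checked directly. The sketch below relies on this finiteness assumption and on X itself being listed among the open sets; the names `monad` and `infinitely_close` are illustrative.

```python
def monad(opens, x):
    """Intersection of all open sets containing x (finite space only)."""
    nbhds = [frozenset(O) for O in opens if x in O]
    result = nbhds[0]            # exists because X itself is open
    for O in nbhds[1:]:
        result &= O
    return result

def infinitely_close(opens, y, x):
    """y ≈O x: every open set containing x also contains y."""
    return y in monad(opens, x)

opens = [set(), {"b"}, {"a", "b"}]
b_close_to_a = infinitely_close(opens, "b", "a")
a_close_to_b = infinitely_close(opens, "a", "b")
```

The two results differ, in agreement with the example: the monad of a is {a, b}, while the monad of b is {b}.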
There is another danger when dealing with ≈O for nonstandard points:

For X = ℕ with the natural (metric) topology, one might suspect that n ≈O m
only for n = m. This is indeed true if either n or m is finite, but fails for nonstandard
points:
Theorem 12.27. Let X = ℕ and h ∈ ℕ∞. Then there are infinitely many n ∈ ∗ℕ
with n ≈O h. Moreover, the relation n ≈O h implies that either n = h or that |n − h|
is infinite.
Proof. Recall that µ({h}) is the filter monad of the standard filter of {h}. This
filter is nonprincipal, since for any n ∈ ℕ the set ℕ \ {n} belongs to the filter
(because h ∈ ∗ℕ \ {∗n} = ∗(ℕ \ {n})). Consequently, µ({h}) ⊆ ∗ℕ is external by
Theorem 12.10. Hence, µ({h}) is infinite by Exercise 6.
Assume that k := |n − h| > 0 is finite, i.e. k ∈ σℕ. Put Fj := {2ki + j : i ∈ ℕ}.
Then F0 ∪ · · · ∪ F2k−1 = ℕ, and so ∗ℕ = ∗(F0) ∪ · · · ∪ ∗(F2k−1), i.e. we find
some j with h ∈ ∗Fj. Then Fj belongs to the standard filter of {h}, and so
n ∈ ∗Fj. By the standard definition principle, we have ∗Fj = {2ki + j : i ∈ ∗ℕ}.
In view of h, n ∈ ∗Fj, we thus find that n − h = (n − j) − (h − j) is a multiple of
2k, a contradiction to k = |n − h| > 0.
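The arithmetic at the heart of the second part is that two numbers at distance k > 0 never lie in the same residue class modulo 2k. The sketch below checks this for ordinary (standard) natural numbers only; the nonstandard statement then follows by transfer as in the proof. The function names are hypothetical.

```python
def residue_class(m, k):
    """Index j in {0, ..., 2k-1} of the class F_j = {2ki + j} containing m."""
    return m % (2 * k)

def separated(h, n):
    """True if h and n lie in different classes F_j for k = |n - h| > 0."""
    k = abs(n - h)
    return k > 0 and residue_class(h, k) != residue_class(n, k)

always_separated = all(separated(h, h + k)
                       for h in range(200) for k in range(1, 30))
```

So the set Fj containing h is an open set (ℕ is discrete) which contains h but not n, which is exactly what prevents n ≈O h.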
Corollary 12.28. On X = ℝ (with the natural topology) the relation y ≈ x is not
equivalent to y ≈O x.
However, we will see that
y ≈ ∗ x ⇐⇒ y ≈O ∗ x.
For the above reasons, ≈O may not appear a “natural” notion. For so-called
uniform spaces (like X = ℝ) we will later learn another relation which is more
natural and which for standard points coincides with ≈O; for X = ℝ this new
relation becomes the same as ≈.
One of the most useful concepts in real nonstandard analysis was the mapping
st. It appears natural to call x ∈ X the standard part of y ∈ ∗ X if y ≈O ∗ x, i.e.
mon(x) consists precisely of all those points y ∈ ∗ X whose standard part is x.
Recall that in case X = ℝ, the standard part mapping was not defined on ∗ℝ but
only on fin(∗ℝ). Hence, we cannot expect to define st on all of ∗X. Even worse, in
general it may happen that st(y) is not uniquely determined. Nevertheless, we can
define:
Definition 12.29. The standard part relation st is a relation on ∗ X × X, defined
by
(y, x) ∈ st ⇐⇒ y ≈O ∗ x ( ⇐⇒ y ∈ mon(x)).
Points y ∈ dom(st) are called nearstandard. The set of all nearstandard points of
∗X is denoted by ns(∗X).
Of course, in case X = ℝ, the relation st is a function, and we end up with
the old Definition 5.20. Recall that a topological space is called a Hausdorff space
if each two points x ≠ y have disjoint neighborhoods. For example, a pseudometric
space is a Hausdorff space if and only if it is a metric space (indeed, if two points
x ≠ y satisfy d(x, y) = 0, they have the same neighborhoods).
Proposition 12.30. For a topological space X the following three statements are
equivalent:
1. X is a Hausdorff space.
2. The relation st is a function.
3. Monads of different points are disjoint, i.e. x ≠ y implies mon(x) ∩ mon(y) = ∅.
Proof. The equivalence of the last two statements follows immediately from the
definition. If x ≠ y have disjoint monads, choose U ∈ ∗U(x), V ∈ ∗U(y) with
U ⊆ mon(x) and V ⊆ mon(y) (Proposition 12.19). Then the sentence
∃u ∈ ∗U(x), v ∈ ∗U(y) : u ∩ v = ∅
is true, and the inverse form of the transfer principle implies that x and y have
disjoint neighborhoods. Conversely, if x ≠ y have disjoint neighborhoods U and
V, respectively, then U ∩ V = ∅ implies ∗U ∩ ∗V = ∅, and since mon(x) ⊆ ∗U,
mon(y) ⊆ ∗V, it follows that mon(x) ∩ mon(y) = ∅.
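For finite spaces the equivalence of statements 1 and 3 can be tested directly, because there the monad of a point is the intersection of its open neighborhoods. This is a minimal sketch under that finiteness assumption, with X itself listed among the open sets; the function names are illustrative.

```python
from itertools import combinations

def monad(opens, x):
    """Intersection of all open sets containing x (finite space only)."""
    result = None
    for O in opens:
        if x in O:
            result = frozenset(O) if result is None else result & frozenset(O)
    return result

def is_hausdorff(X, opens):
    """Each two distinct points have disjoint open neighborhoods."""
    return all(any(x in U and y in V and not (set(U) & set(V))
                   for U in opens for V in opens)
               for x, y in combinations(X, 2))

def monads_disjoint(X, opens):
    """Statement 3 of Proposition 12.30."""
    return all(not (monad(opens, x) & monad(opens, y))
               for x, y in combinations(X, 2))

X = {"a", "b"}
two_point = [set(), {"b"}, {"a", "b"}]           # topology of Example 12.26
discrete = [set(), {"a"}, {"b"}, {"a", "b"}]
```

Both predicates agree on the two sample topologies: they fail for the topology of Example 12.26 and hold for the discrete one.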
For compact enlargements we have the following generalization of the Cauchy
principle:
Theorem 12.31 (Permanence principle for ∗ X (Cauchy principle)). Let ∗ be a
compact P(X)-enlargement. Let α(y) be an internal predicate with y as its only free
variable. If α(y) holds for all y ∈ mon(x), then there is some standard neighborhood
U of x such that α(y) holds for all y ∈ ∗ U.
Proof. The set
B := {u ∈ ∗ U (x) | ∀y ∈ u : α(y)}
is internal by the internal definition principle. Moreover, any F ∈ ∗ U (x) with
F ⊆ mon(x) belongs to B. Hence, Theorem 12.7 implies that B contains some
standard element V = ∗ U with U ∈ U (x).
As one might expect, monads can be used to characterize topological sets:
Definition 12.32. Let A ⊆ X. A point x ∈ A is called an interior point of A, if
A is a neighborhood of x. The closure Ā is the set of all points x ∈ X with the
property that any neighborhood of x intersects A.
We first recall some facts of the standard world:
Proposition 12.33. We have:
1. The set Ā is the smallest closed set which contains A. In particular, A is
closed if and only if A = Ā.
2. The set of all interior points of A is the largest open set which is contained
in A. In particular, A is open if and only if each x ∈ A is interior, i.e. if and
only if A is a neighborhood for each of its elements.
Proof. 1. We claim that C := X \ Ā is the union of all open sets which are
contained in X \ A: Then C is open, and thus Ā is closed, and moreover, if B ⊇ A
is closed, then X \ B ⊆ C, i.e. Ā ⊆ B.
If O ⊆ X \ A is open and x ∈ O, then O is a neighborhood of x which does
not intersect A, and so x ∈ C by definition of Ā. Conversely, if x ∈ C, then x has
a neighborhood U which does not intersect A, and so there is some open O ⊆ U
with x ∈ O. We have O ⊆ C by definition of Ā.
2. We prove that the set I of interior points of A is the union of all open sets
contained in A: If O ⊆ A is open and x ∈ O, then x ∈ I. Conversely, if x ∈ I,
then there is some open O ⊆ A with x ∈ O; hence, x is contained in an open set
O ⊆ A.
Now we turn to the nonstandard characterizations for topological properties
of sets:
Theorem 12.34. Let A ⊆ X. A point x ∈ A is an interior point of A if and only
if mon(x) ⊆ ∗A. The set A is open if and only if
⋃_{x∈A} mon(x) ⊆ ∗A.
Proof. The second statement follows from the first by Proposition 12.33. Let x ∈
A be interior, i.e. A ∈ U (x). The system B := {V ∈ U (x) : V ⊆ A} is a
neighborhood base for x such that any V ∈ B satisfies V ⊆ A, i.e. ∗ V ⊆ ∗ A.
Hence, Proposition 12.19 implies
mon(x) = ⋂ σB = ⋂{∗V : V ∈ B} ⊆ ∗A.
Conversely, suppose that mon(x) ⊆ ∗A. By Proposition 12.19, we find some U ∈
∗U(x) with U ⊆ mon(x). Hence,
∃u ∈ ∗U(x) : u ⊆ ∗A.
The inverse form of the transfer principle implies that U (x) contains a subset of
A, i.e. x is an interior point of A.
For the second statement, we could also have applied the Cauchy principle,
but this would require that ∗ be a compact enlargement.
We think of st as a multivalued function, and thus use for y ∈ ∗X the notation
st(y) := {x ∈ X : (y, x) ∈ st} = {x ∈ X : y ≈O ∗x}
which is slightly ambiguous if st is single-valued, because in this case st is a function,
i.e. st(y) was already defined as the unique point x ∈ X with y ≈O ∗x (and not
as the set {x}). However, we hope that no confusion will arise. We emphasize that
by the above definition, we have st(y) = ∅ if (and only if) y ∉ dom(st) = ns(∗X).
We also use for A ⊆ ∗X the corresponding notation
st(A) := ⋃_{y∈A} st(y) = {x ∈ X : There is some y ∈ A with y ≈O ∗x}.
Moreover, for the inverse relation st−1 := {(x, y) : (y, x) ∈ st} we write correspondingly
for x ∈ X,
st−1(x) := {y : (x, y) ∈ st−1} = {y ∈ ∗X : y ≈O ∗x}
and for A ⊆ X,
st−1(A) := ⋃_{x∈A} st−1(x) = {y ∈ ∗X : There is some x ∈ A with y ≈O ∗x}.
Theorem 12.35. Let A ⊆ X. Then
st(∗A) = Ā,
i.e. a point x ∈ X belongs to Ā if and only if there is some y ∈ ∗A with y ≈O ∗x.
In particular, A is closed if and only if
st(∗A) ⊆ A
(and then equality holds, because the converse inclusion is always true).
Proof. Let x ∈ Ā, i.e. for any U ∈ U(x), we have U ∩ A ≠ ∅. Then the system
𝒜 := {U ∩ A : U ∈ U(x)} has the finite intersection property. Since ∗ is a P(X)-enlargement,
this implies ⋂ σ𝒜 ≠ ∅, i.e. there is some y with y ∈ ∗(U ∩ A) =
∗U ∩ ∗A for all U ∈ U(x). Hence, y ∈ mon(x) ∩ ∗A, i.e. y ∈ ∗A satisfies y ≈O ∗x,
and so x ∈ st(∗A).
Conversely, let x ∈ st(∗A), i.e. there is some y ∈ ∗A with y ≈O ∗x, i.e.
y ∈ mon(x). For any U ∈ U(x), we thus have y ∈ ∗U, and so ∗U ∩ ∗A ≠ ∅ which
implies U ∩ A ≠ ∅ by the transfer principle. Consequently, x ∈ Ā.
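In a finite space, where ∗A = σA and monads are intersections of open neighborhoods, the theorem says that x lies in the closure of A exactly when the monad of x meets A. The following sketch compares this with the closure computed from closed sets; it assumes a finite space in which ∅ and X are listed among the open sets, and the function names are illustrative.

```python
def monad(opens, x):
    """Intersection of all open sets containing x (finite space only)."""
    result = None
    for O in opens:
        if x in O:
            result = frozenset(O) if result is None else result & frozenset(O)
    return result

def closure_by_closed_sets(X, opens, A):
    """Smallest closed set containing A (Proposition 12.33)."""
    X, A = frozenset(X), frozenset(A)
    result = X
    for O in opens:
        C = X - frozenset(O)          # complement of an open set is closed
        if A <= C:
            result &= C
    return result

def closure_by_monads(X, opens, A):
    """All x whose monad meets A: the finite analogue of st(*A)."""
    A = frozenset(A)
    return frozenset(x for x in X if monad(opens, x) & A)

X = {"a", "b", "c"}
opens = [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]
```

For instance, the closure of {b} computed either way is {b, c}, since both b and c have monads containing b, while the monad of a is {a}.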
Corollary 12.36. If X is a Hausdorff space, then A ⊆ X is closed if and only if
each y ∈ ∗A ∩ ns(∗X) is infinitely close to some (standard) point of σA, i.e. if and
only if
∗A ∩ ns(∗X) ⊆ st−1(A).
Proof. Since X is Hausdorff, any y ∈ ∗A is infinitely close to at most one point
∗x with x ∈ X (Proposition 12.30). Theorem 12.35 implies that A is closed if and
only if the points x arising in this way all belong to A. The latter means that for
any y ∈ ∗ A with the additional property that y is infinitely close to some standard
point x (i.e. y ∈ ns(∗ X)), we have x ∈ σ A.
For compact enlargements, we have a generalization of the previous result

for internal (not necessarily standard) sets:
Theorem 12.37. Let ∗ be a compact P(X)-enlargement. If A ⊆ ∗ X is internal,
then st(A) is closed.
Proof. Let x belong to the closure of st(A), and let B be the system of all open neighborhoods of x.
Consider the internal relation
ϕ := {(x, y) ∈ ∗ B × A | y ∈ x}.
We claim that ϕ is concurrent on σ B. Indeed, if O1 , . . . , On ∈ B, then O :=

O1 ∩ · · · ∩ On also belongs to B, and so O ∩ st(A) contains some point x0 . In view
of x0 ∈ st(A), there is some y ∈ A with y ∈ mon(x0 ). Since O is a neighborhood of
x0 , this implies y ∈ ∗ O. Hence, we found some y ∈ A with y ∈ ∗ O = ∗ O 1 ∩· · ·∩ ∗ O n ,
and so ϕ is concurrent on σ B, as claimed.
Since σ B consists only of standard sets (and has not a larger cardinality than
P(X)), ϕ is satisfied on σ B, i.e. there is some y ∈ A such that y ∈ ∗ O for any
O ∈ B. This means y ≈O ∗x. Hence x ∈ st(A), and so st(A) contains its closure, i.e. st(A) is closed.
Definition 12.38. A set A ⊆ X is compact if each open cover of A has a finite
subcover. This means that whenever C is a system of open sets with A ⊆ ⋃C,
there exist finitely many O1, . . . , On ∈ C with A ⊆ O1 ∪ · · · ∪ On.
One of the most important aspects of nonstandard topology is that compact
sets have a very natural characterization:
Theorem 12.39. A set A ⊆ X is compact if and only if each point of ∗A is infinitely
close to some (standard) point of σA, i.e. if and only if
∗A ⊆ st−1(A).
Proof. Let A be compact, and y ∈ ∗A. Assume by contradiction that we find no
x ∈ A with y ≈O ∗x. Then we find for each x ∈ A some open set O ⊆ X with
x ∈ O and y ∉ ∗O. Hence, the set C of all open sets O ⊆ X with y ∉ ∗O is an
open cover of A. Since A is compact, we have A ⊆ O1 ∪ · · · ∪ On with Ok ∈ C. But
since ∗ is a superstructure monomorphism, this implies y ∈ ∗A ⊆ ∗O1 ∪ · · · ∪ ∗On,
and so y ∈ ∗Ok for some Ok ∈ C, a contradiction to the definition of C.
Conversely, let each y ∈ ∗A be infinitely close to some standard point of σA,
and C be an open cover of A. For any y ∈ ∗A, we find some x ∈ A with y ≈O ∗x and
some O ∈ C with x ∈ O; hence, y ∈ ∗O. This proves ∗A ⊆ ⋃ σC. By Exercise 44,
we thus find a finite C0 ⊆ C with A ⊆ ⋃C0, i.e. A is compact.
For X = ℝ, we defined compact sets as closed and bounded. However, since
the nonstandard characterization of Theorem 12.39 is the same as the characterization
of Corollary 7.10, we may conclude that the definitions actually are equivalent.
This nonstandard argument implies in particular the classical Heine-Borel
theorem:
Corollary 12.40 (Heine-Borel). A subset A ⊆ ℝ is compact in the sense of Definition 12.38
if and only if it is closed and bounded.
In the literature, the term “compact” is sometimes reserved for compact Hausdorff
spaces, because these spaces have particularly convenient properties. This
appears natural from the nonstandard characterization:
Corollary 12.41. The topological space X is a compact Hausdorff space if and only
if
st : ∗ X → X,
i.e. if and only if for each y ∈ ∗ X there is precisely one x ∈ X with y ≈O ∗ x.
Proof. Combine Proposition 12.30 and Theorem 12.39.
Exercise 61. Prove the following statements by nonstandard methods:
1. A closed set which is contained in a compact set is compact.
2. Compact sets in Hausdorff spaces are closed.
Exercise 62. Let ∗ be a compact P(X)-enlargement. Prove that A ⊆ X is compact
if and only if st−1 (A) ∩ ∗ A is internal.
Hint: Apply Theorem 8.16.
Definition 12.42. A topological space is called a T3 space if points may be separated
from closed sets by open sets, i.e. whenever A is closed and x ∉ A, there exist
disjoint open sets O0, O1 with x ∈ O0 and A ⊆ O1.
Lemma 12.43. Let X be a T3 space. Then a set M ⊆ X is compact if and only if
for each open cover C of M there is a finite C0 ⊆ C such that M is contained in
the closure of ⋃C0.
Proof. Necessity is trivial. To prove sufficiency, consider the family C1 of all open
sets O with the property that there is some O′ ∈ C with Ō ⊆ O′. Then C1 is
an open cover of M. Indeed, for any x ∈ M there is some O ∈ C with x ∈ O.
Putting A := X \ O, we find disjoint open sets O0, O1 with x ∈ O0 and A ⊆ O1.
Then A0 := X \ O1 is closed and contains O0 which means Ō0 ⊆ A0 ⊆ O. Hence,
O0 ∈ C1, and so x ∈ O0 ⊆ ⋃C1.
By assumption, we find O1, . . . , On ∈ C1 such that M is contained in the closure
of O1 ∪ · · · ∪ On, which is Ō1 ∪ · · · ∪ Ōn. The definition of C1 implies that there
are Oi′ ∈ C with Oi′ ⊇ Ōi,
and so M ⊆ O1′ ∪ · · · ∪ On′.
Theorem 12.44. Let X be a T3 space, and A ⊆ X. Then Ā is compact if and only
if ∗A ⊆ ns(∗X).
Proof. If Ā is compact, then Theorem 12.39 implies ∗A ⊆ ∗(Ā) ⊆ st−1(Ā) ⊆
ns(∗X).
Conversely, let ∗A ⊆ ns(∗X), and let C be an open cover of Ā. For any
y ∈ ∗A we find some x ∈ X with y ≈O ∗x. By Theorem 12.35, we have x ∈ Ā, and
so there is some O ∈ C with x ∈ O; hence, y ∈ ∗O. This proves ∗A ⊆ ⋃ σC. By
Exercise 44, we thus find a finite C0 ⊆ C with A ⊆ ⋃C0. Then Ā is contained in
the closure of ⋃C0, and so
Ā is compact by Lemma 12.43.
For compact enlargements, we have a generalization to internal sets (similar

to Theorem 12.37):
Exercise 63. Let ∗ be a compact P(X)-enlargement. If X is T3 and A ⊆ ns(∗ X)
is internal, prove that st(A) is compact.
Hint: Apply Lemma 12.43 and Theorem 8.16.
We now discuss convergence in a topological space:
Definition 12.45. A sequence xn ∈ X converges to x ∈ X, if for each U ∈ U (x),
we have xn ∈ U for all except finitely many n. A filter F over X converges to x,
if U (x) ⊆ F . We write xn → x resp. F → x.
These definitions are related:
Proposition 12.46. A sequence xn converges to x if and only if the filter F generated
by the sets Fn = {xn, xn+1, xn+2, . . .} (n = 1, 2, . . .) converges to x.
Proof. We have U (x) ⊆ F if and only if for each U ∈ U (x) there are n1 , . . . , nk
with Fn1 ∩ · · · ∩ Fnk ⊆ U . Since Fn1 ∩ · · · ∩ Fnk = Fmax{n1 ,...,nk } , this is the case
if and only if for each U ∈ U (x) there is some n with Fn ⊆ U , i.e. if and only if
all except finitely many xn belong to U .
Theorem 12.6 implies:

Proposition 12.47. We have F → x if and only if mon(F ) ⊆ mon(x).
Proof. By Theorem 12.6, U (x) ⊆ F if and only if mon(F ) ⊆ mon(U (x)) =

mon(x).
Corollary 12.48. If X is a Hausdorff space, then F converges to at most one point.

Proof. The relations mon(F) ⊆ mon(x) and mon(F) ⊆ mon(y) imply
mon(x) ∩ mon(y) ≠ ∅ (because mon(F) ≠ ∅), and so x = y by Proposition 12.30.
Exercise 64. Prove by nonstandard methods that in a compact space each ultrafilter
converges. (Actually, this characterizes compact spaces, but the converse
implication is proved more easily by standard methods.)
We prove now that standard open sets can be characterized in terms of
monads:
Theorem 12.49. Let ∗ be a compact P(X)-enlargement, and A ⊆ ∗ X be internal.
1. If µ({a}) ⊆ A for some a ∈ A, then A contains some standard open set ∗ O
(i.e. O ⊆ X is open) with a ∈ ∗ O.
2. If µ({a}) ⊆ A for any a ∈ A, then A is a standard open set.
Proof. 1. Let F be the standard filter of {a}, and let B denote the
system of all F ∈ ∗ F with a ∈ F ⊆ A. By mon(F ) = µ({a}) ⊆ A, we have
B ⊇ {F ∈ ∗ F : F ⊆ mon(F )}, and so Theorem 12.7 implies that ∗ F ∈ B for
some F ∈ F , i.e. ∗ F ⊆ A. By definition of F , we have F ⊇ O1 ∩ · · · ∩ On = O for
open Ok ⊆ X with a ∈ ∗ O k . Then O is open with ∗ O = ∗ O1 ∩ · · · ∩ ∗ O n , and so
a ∈ ∗ O ⊆ ∗ F ⊆ A.
2. Let 𝒜 denote the system of all open sets O ⊆ X with ∗O ⊆ A. In view of 1.,
we find for any a ∈ A some open O ⊆ X with a ∈ ∗O ⊆ A, and so A ⊆ ⋃ σ𝒜.
Theorem 8.16 implies that there is a finite 𝒜0 ⊆ 𝒜 with A ⊆ ⋃ σ𝒜0. We have
𝒜0 = {O1, . . . , On} where Ok ⊆ X are open, and ∗Ok ⊆ A. Then O := O1 ∪ · · · ∪ On
is open, and ∗O = A: Indeed, A ⊆ ⋃ σ𝒜0 = ∗O1 ∪ · · · ∪ ∗On ⊆ A, and so
A = ∗O1 ∪ · · · ∪ ∗On = ∗O.

By choosing an appropriate topology, we can characterize standard elements
in terms of monads: The discrete topology on X is O = P(X), i.e. any subset of
X is open with respect to the discrete topology.
Corollary 12.50. Let X be equipped with the discrete topology. Let ∗ be a compact
P(X)-enlargement, and A ⊆ ∗ X be internal.
1. A contains a standard element if and only if µ({a}) ⊆ A for some a ∈ A.
2. A is standard if and only if µ({a}) ⊆ A for any a ∈ A.
Proof. Sufficiency follows immediately from Theorem 12.49. For the converse im-
plications observe that if ∗ x ∈ A for some x ∈ X, then we have for a = ∗ x that
O := {x} is open and a ∈ ∗ O, and so µ({a}) ⊆ ∗ O = {∗ x} = {a} ⊆ A. Similarly,
if A = ∗ O for some O ⊆ X, then O is open, and so we have for any a ∈ A in view
of a ∈ ∗ O that µ({a}) ⊆ ∗ O = A.
As a nice application of nonstandard methods, let us give a simple proof of
the famous Tychonoff compactness theorem.

Definition 12.51. Let X = ∏_{i∈I} Xi with topological spaces Xi (i ∈ I). Then the
following topology on X is called the product topology: Let B denote the system
of all sets of the form
∏_{i∈I} Oi
where Oi = Xi for all except finitely many i ∈ I, and Oi ⊆ Xi is open for all i ∈ I.
Then O is open in the product topology if and only if it is a union of sets from B.
The condition that Oi = Xi for all except finitely many i ∈ I appears rather
artificial. However, there are two main reasons why this should be included in the
above definition of the product topology (besides the fact that the above definition
has many applications):
First, it turns out that the above definition yields the smallest topology such that
each of the projections pi : X → Xi is continuous (continuity will be defined
in Section 12.4). In nonstandard terms, this fact reads as follows:
We will assume henceforth that we are given a family Xi (i ∈ I) of topological
spaces where Xi, I, and {Xi : i ∈ I} are all entities of S. Then also U := ⋃_{i∈I} Xi,
U^I, and thus also X := ∏_{i∈I} Xi are entities of S. As before, we assume that ∗ is a
P(X)-enlargement.
We use the notation of Corollary A.6, i.e. for i ∈ ∗I, we define ∗Xi := ∗f(i)
where f denotes the function i ↦ Xi. Recall (Exercise 80) that the elements of
∗X are precisely the internal elements of ∏_{i∈∗I} ∗Xi. In particular (Corollary A.6),
each x ∈ ∗X is an internal function
x : ∗I → ⋃_{i∈∗I} ∗Xi.
Theorem 12.52. In the product topology, we have y ≈O ∗x if and only if
y(∗i) ≈O ∗x(∗i) = ∗(x(i)) for each i ∈ I.
Proof. Suppose that y(∗i) ≈O ∗x(∗i) for each i ∈ I. Let O ⊆ X be open with
∗x ∈ ∗O. We have to prove that y ∈ ∗O. But ∗x ∈ ∗O implies x ∈ O, and so
we have x ∈ B ⊆ O for some member B of the system B of Definition 12.51. By
definition of B, we find finitely many i1, …, in ∈ I and open sets Oik ⊆ Xik
such that B = ∏_{i∈I} Oi, where we put Oi := Xi for i ≠ i1, …, in.
Exercise 80 implies that ∗B consists of all internal elements of ∏_{i∈∗I} ∗Oi. Note
that the transfer principle implies
∀x ∈ ∗I : ((x ≠ ∗i1 ∧ ··· ∧ x ≠ ∗in) ⟹ ∗O_x = ∗X_x).
Since y(∗ik) ≈O ∗x(∗ik) ∈ ∗(Oik), we have y(∗ik) ∈ ∗(Oik) = ∗O_{∗ik} (recall the
remarks preceding Corollary A.6). Thus, y is an internal function which satisfies
y(i) ∈ ∗Oi for all i ∈ ∗I (for i ≠ ∗i1, …, ∗in we have ∗Oi = ∗Xi, as we have shown
above). This proves y ∈ ∗B ⊆ ∗O, as desired.
Conversely, let y ≈O ∗x. If i0 ∈ I and some open Oi0 ⊆ Xi0 with x(i0) ∈ Oi0
are given, put Oi := Xi (i ≠ i0) and O := ∏_{i∈I} Oi. Then O is open with x ∈ O.
Hence, ∗x ∈ ∗O, and so our assumption implies y ∈ ∗O. Since ∗O consists of all
internal elements of ∏_{i∈∗I} ∗Oi and since ∗O_{∗i0} = ∗(Oi0), we must have y(∗i0) ∈
∗(Oi0). Hence, y(∗i0) ≈O ∗(x(i0)).

Corollary 12.53. X := ∏_{i∈I} Xi is a Hausdorff space if and only if each Xi is a
Hausdorff space.
Proof. If Xi0 is not a Hausdorff space, there are points a, b ∈ Xi0, a ≠ b, such that
mon(a) ∩ mon(b) contains some point c (Proposition 12.30). Choose some x ∈ X
with x(i0) = a, and put y(i) = x(i) (i ∈ I \ {i0}) and y(i0) = b. Consider the
function z(i) := ∗y(i) (i ∈ ∗I \ {∗i0}) and z(∗i0) := c. Then z is an internal function
(Exercise 8), and in view of Theorem 12.52, we have z ∈ mon(x) ∩ mon(y) although
x ≠ y. Hence, Proposition 12.30 implies that X is not a Hausdorff space.
Conversely, if X is not a Hausdorff space, we find elements x ≠ y in X such
that mon(x) ∩ mon(y) contains some element z. Choose some i0 with x(i0) ≠
y(i0). Then Xi0 is not a Hausdorff space, because Theorem 12.52 implies z(∗i0) ∈
mon(x(i0)) ∩ mon(y(i0)).
The other reason for the definition of the product topology is that the fol-
lowing important theorem of Tychonoff holds, which has many applications. We
note that all known proofs of the Tychonoff theorem are rather technical so that
the following nonstandard proof is an essential simplification:

Corollary 12.54 (Tychonoff). X := ∏_{i∈I} Xi is compact if and only if each Xi is
compact.
Proof. Let X be compact, and fix i0 ∈ I. A standard argument immediately implies
that Xi0 is compact (because each projection pi : X → Xi is continuous, as mentioned
above). However, for completeness we provide a nonstandard proof: By Theorem 12.39,
we have to prove that for each b ∈ ∗(Xi0) there is some a ∈ Xi0 with b ≈O ∗a.
There is some y ∈ ∗ X with y(∗ i0 ) = b. Since X is compact, we find some x ∈ X
with y ≈O ∗ x. Then y(∗ i0 ) ≈O ∗ (x(i0 )) by Theorem 12.52, and so a = x(i0 ) ∈ Xi0
satisfies b ≈O ∗ a.
The converse direction is the one which is hard to prove by standard methods:
Suppose that all Xi are compact. Let y ∈ ∗X. For each i ∈ I, we have y(∗i) ∈
∗X_{∗i} = ∗(Xi). Since Xi is compact, we find some x(i) ∈ st(y(∗i)). Then x ∈ X
(axiom of choice!), and Theorem 12.52 implies y ≈O ∗x.
Some notes are in order: It lies in the nature of things that we had to use the
axiom of choice in its full generality to prove the Tychonoff theorem. In fact, the
Tychonoff theorem is actually equivalent to the axiom of choice [Kel50] (this paper
contains a minor mistake which however can be corrected, see [LRN51]). However,
to prove Tychonoff’s theorem for Hausdorff spaces, one does not need the full
power of the axiom of choice: In fact, the Tychonoff theorem for Hausdorff spaces
is actually equivalent to the so-called maximal ideal theorem [LRN54] which in
turn is equivalent to Theorem 4.9 [Sik64]. The most difficult of these implications
follows from our above proof of the Tychonoff theorem: In fact, if all Xi are
compact Hausdorff spaces, then st is a function by Proposition 12.30, and so no
axiom of choice is required to define the function x in the above proof. So in
this case, we only made use of the axiom of choice in the construction of the
ultrapower model. A careful analysis shows that a map ∗ sufficient for our proof
may be defined by only applying Theorem 4.9 and no other form of the axiom of
choice.
12.4 Functions in Nonstandard Topology

Let X and Y be topological spaces. Assume that X, Y ∈ S are entities and that
∗ : S → ∗S is an enlargement (actually, it suffices for the following considerations
that ∗ is a P(X) ∪ P(Y )-enlargement).
Let f : X → Y . Recall that if F is a filter over X, then f (F ) denotes the
filter generated by {f (F ) : F ∈ F }.
Theorem 12.55. We always have mon(f (F )) ⊇ ∗ f (mon(F )). Moreover, equality
holds if ∗ is also a compact P(X)-enlargement.
Proof. Since B := {f(F) : F ∈ F} generates f(F), Lemma 12.11 implies in view
of ∗(f(F)) = ∗f(∗F) (Theorem 3.13) that
mon(f(F)) = ⋂ σB = ⋂ {∗(f(F)) : F ∈ F}
= ⋂ {∗f(F) : F ∈ σF} ⊇ ∗f(⋂ σF) = ∗f(mon(F)).
To see that we have equality if ∗ is a compact P(X)-enlargement, assume that
there is some y ∈ ⋂ {∗f(F) : F ∈ σF} which is not contained in ∗f(mon(F)).
Put By := {F ∈ ∗F : y ∉ ∗f(F)}. Then By is internal by the internal definition
principle, and By ⊇ {F ∈ ∗F : F ⊆ mon(F)}, because y ∉ ∗f(mon(F)).
Theorem 12.7 now implies that By contains some standard F ∈ σF, i.e. y ∉
∗f(F), a contradiction to our choice of y.
Definition 12.56. A function f : X → Y is called continuous at x ∈ X, if for each
neighborhood V ∈ U (f (x)) there is a neighborhood U ∈ U (x) with f (U ) ⊆ V .
The function f is called continuous, if it is continuous at each x ∈ X.

Theorem 12.57. For f : X → Y the following statements are equivalent:
1. f is continuous at x.
2. F → x implies f(F) → f(x) for any filter F over X.
3. U(f(x)) ⊆ f(U(x)).
4. mon(f(U(x))) ⊆ mon(U(f(x))).
5. ∗f(mon(x)) ⊆ mon(f(x)).
6. y ≈O ∗x implies ∗f(y) ≈O ∗(f(x)) = ∗f(∗x).
Proof. We first prove the equivalence of the first three statements: If f is contin-
uous at x, and F → x (i.e. U (x) ⊆ F ), then we find for each V ∈ U (f (x))
some U ∈ U (x) ⊆ F with f (U ) ⊆ V , and so V belongs to the filter generated by
{f (U ) : U ∈ F }. Thus 2. holds. If 2. holds, then we find for the particular choice
F = U (x) that f (U (x)) → f (x) which means that 3. is satisfied. Finally, assume
that 3. holds. Recall that by Lemma 5.27, f (U (x)) consists precisely of those sets
V ⊆ Y for which U := {x : f (x) ∈ V } ∈ U (x). Hence, if U (f (x)) ⊆ f (U (x)) and
V ∈ U (f (x)), we find some U ∈ U (x) with f (U ) ⊆ V , and so f is continuous at
x.
Theorem 12.6 5. implies that the inclusions 3. and 4. are equivalent.
Noting that mon(U(f(x))) = mon(f(x)), and that ∗f(mon(x)) =
∗f(mon(U(x))) ⊆ mon(f(U(x))) in view of Theorem 12.55, we see that 4. implies
5.; moreover, equivalence follows if ∗ is a compact enlargement. To see the
converse implication without this additional requirement, assume that 5. holds.
Choose some internal U ∈ ∗ U (x) with U ⊆ mon(x) (Proposition 12.19). Given
V ∈ U (f (x)), we have by assumption ∗ f (U ) ⊆ mon(f (x)) ⊆ ∗ V , i.e. we have
proved
∃u ∈ ∗ U (x) : ∗ f (u) ⊆ ∗ V .
The inverse form of the transfer principle implies that there is some U ∈ U (x)
with f (U ) ⊆ V . Hence, f is continuous at x. The equivalence of 6. with 5. is
trivial.
Corollary 12.58. If f is continuous at x, then xn → x implies f (xn ) → f (x).
Proof. Let xn → x, and F be the filter generated by the sets {xn , xn+1 , . . .}
(n = 1, 2, . . .). Then F → x (Proposition 12.46), and so f (F ) → f (x)
by Theorem 12.57. By definition, f (F ) is the filter generated by the sets
{f (xn ), f (xn+1 ), . . .} (n = 1, 2, . . .). Thus, Proposition 12.46 implies f (xn ) →
f (x).
We point out that the converse to Corollary 12.58 does not hold in general.
For counterexamples (and assumptions which imply that the converse implication
holds), we refer the reader to books on topology.
Exercise 65. Prove by nonstandard methods that a continuous function maps
compact sets into compact sets.
Using Proposition 12.5, one can prove that a function f is continuous if and
only if preimages of open sets are open. For one direction of this statement, one
can give an easier nonstandard proof:
Exercise 66. Prove by nonstandard methods that for any continuous function
f : X → Y preimages of open sets are open.
§ 13 Uniform Structures
13.1 Uniform Spaces
There are some concepts which appear topological but which actually cannot
be described in topological spaces: Uniform convergence, uniform continuity, or
Cauchy sequences. In (pseudo)metric spaces, one can define these concepts: For
example, a sequence fn of functions with values in a pseudometric space converges
uniformly to a function f, if for each ε ∈ ℝ+ one finds some index N such that
d(fn (x), f (x)) < ε (n ≥ N ) simultaneously for all x. This definition is possible,
since ε determines not only a neighborhood, but actually a system of neighbor-
hoods for each point in the space (in particular for each of the points f (x)). Thus,
if one intends to introduce such a concept in more general topological spaces, one
should consider families of neighborhoods. This is the motivation for the definition
of a so-called uniform structure.
Recall that sets U ⊆ X × X are relations. Hence, the notation
U −1 := {(y, x) : (x, y) ∈ U }
is natural, and if V ⊆ X × X and x ∈ X also
U ◦ V := {(x, z) | ∃y ∈ X : ((x, y) ∈ V ∧ (y, z) ∈ U )}
and
U (x) := {y : (x, y) ∈ U }.
We write also U² for U ◦ U. We already made use of these conventions for the
particular relation st (see the remarks in front of Theorem 12.35).
Similar arguments as in Theorem 3.13 show that (∗U)−1 = ∗(U−1), ∗U ◦ ∗V =
∗(U ◦ V), and ∗U(∗x) = ∗(U(x)).
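The three conventions for relations can be tried out on small finite relations. The following is only an illustrative sketch: the function names and the sample relations are assumptions made for this example and do not occur in the text. Note the order in the composition: U ◦ V applies V first and U second.

```python
def inverse(U):
    """U^{-1} := {(y, x) : (x, y) in U}."""
    return {(y, x) for (x, y) in U}

def compose(U, V):
    """U o V := {(x, z) : (x, y) in V and (y, z) in U for some y}."""
    return {(x, z) for (x, y) in V for (y2, z) in U if y2 == y}

def section(U, x):
    """U(x) := {y : (x, y) in U}."""
    return {y for (x2, y) in U if x2 == x}

# Hypothetical sample relations on the integers.
U = {(1, 2), (2, 3)}
V = {(0, 1), (1, 1)}

print(compose(U, V))   # V first, then U
print(compose(V, U))   # U first, then V: no pair can be continued
print(inverse(U))
print(section(U, 1))
```

In this sample, composition is not commutative, while the rule (U ◦ V)−1 = V−1 ◦ U−1 holds.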
Definition 13.1. A uniform structure over a space X is a filter U over X × X such
that each U ∈ U satisfies:
1. ∆ := {(x, x) : x ∈ X} ⊆ U .
2. U −1 ∈ U .
3. There is some V ∈ U with V 2 ⊆ U .
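On a finite set X every filter over X × X consists of all supersets of a single set E, so the three properties can be checked by brute force. The sketch below does this for three hypothetical choices of E; all names are assumptions made for this illustration. In this finite situation the properties hold exactly when E is an equivalence relation.

```python
from itertools import chain, combinations, product

def inverse(U):
    return frozenset((y, x) for (x, y) in U)

def compose(U, V):
    """U o V := {(x, z) : (x, y) in V and (y, z) in U for some y}."""
    return frozenset((x, z) for (x, y) in V for (y2, z) in U if y2 == y)

def principal_filter(E, X):
    """All sets U with E <= U <= X x X (the filter generated by E)."""
    E = frozenset(E)
    rest = [p for p in product(X, X) if p not in E]
    extras = chain.from_iterable(
        combinations(rest, k) for k in range(len(rest) + 1))
    return {E | frozenset(extra) for extra in extras}

def is_uniform_structure(members, X):
    diagonal = frozenset((x, x) for x in X)
    return all(
        diagonal <= U                                  # property 1
        and inverse(U) in members                      # property 2
        and any(compose(V, V) <= U for V in members)   # property 3
        for U in members
    )

X = [0, 1, 2]
DIAG = {(x, x) for x in X}
equivalence = DIAG | {(0, 1), (1, 0)}
not_symmetric = DIAG | {(0, 1)}
not_transitive = DIAG | {(0, 1), (1, 0), (1, 2), (2, 1)}

for E in (equivalence, not_symmetric, not_transitive):
    print(is_uniform_structure(principal_filter(E, X), X))
```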
Each uniform structure generates a topology in a canonical way: Given x ∈ X,
put U (x) := {U (x) : U ∈ U }.
Proposition 13.2. Let O be the system of all sets O with the property that O ∈
U (x) for any x ∈ O. Then O is a topology on X, and U (x) is the corresponding
neighborhood filter of x.
The topology is Hausdorff if and only if the relation (x, y) ∈ U for any U ∈ U
implies x = y.
Proof. We prove first that O is a topology: Clearly, ∅, X ∈ O. Let O1, O2 ∈ O
and O := O1 ∩ O2. For any x ∈ O, we find sets U1, U2 ∈ U with Oi = Ui(x).
Since U is a filter, we have U := U1 ∩ U2 ∈ U. Hence O = U(x) ∈ U(x), and
so O ∈ O. Similarly, if O0 ⊆ O the set O := ⋃ O0 belongs to O: For any x ∈ O,
there is some U ∈ U with U (x) ⊆ O. Putting V := U ∪ {(x, y) : y ∈ O}, we have
V ∈ U (because V ⊇ U ∈ U ) and O = V (x) ∈ U (x). Hence, O ∈ O, and so O is
a topology.
If W is a neighborhood of x, then there is some U ∈ U with W ⊇ U (x).
Then V := U ∪ {(x, y) : y ∈ W } belongs to U , and so W = V (x) ∈ U (x).
The only nontrivial part of the proof is the converse implication, i.e. that
any W ∈ U (x) actually is a neighborhood of x: Let W ∈ U (x), i.e. W = U (x)
for some U ∈ U . Let O be the set of all y ∈ X with the property that there is
some V ∈ U with V (y) ⊆ W . Then O ∈ O: Given y ∈ O, choose some V ∈ U
with V(y) ⊆ W. There is some V0 ∈ U with V0² ⊆ V. For any z ∈ V0(y), we have
V0(z) ⊆ V0²(y) ⊆ V(y) ⊆ W, and so z ∈ O. Hence, V0(y) ⊆ O, which implies that
O = V1 (y) for V1 := V0 ∪ {(y, z) : z ∈ O} ∈ U . Thus, O ∈ U (y) for any y ∈ O
which means O ∈ O, as claimed. Since y ∈ V (y) for V ∈ U , the definition of O
implies O ⊆ W . Finally, since U (x) ⊆ W , we have x ∈ O. Thus, W is indeed a
neighborhood of x.
If the topology is Hausdorff and x ≠ y, then there are U, V ∈ U with
U(x) ∩ V(y) = ∅. Since y ∈ V(y), this implies y ∉ U(x), and so (x, y) ∉ U.
Conversely, assume that (x, y) ∈ U for any U ∈ U implies x = y. Then the
topology is Hausdorff, since for any x ≠ y we find some U ∈ U with (x, y) ∉ U.
Choose some V ∈ U with V² ⊆ U. Then V(x) ∩ V−1(y) = {z : (x, z), (z, y) ∈
V} = ∅, since otherwise (x, y) ∈ V² ⊆ U, a contradiction. Hence, x and y have
disjoint neighborhoods V(x) and V−1(y).
For evident reasons, the above topology is called the topology generated by
the uniform structure U . We call X equipped with this topology a uniform space.
(Similarly as we did for topological spaces, we usually do not mention U explic-
itly).
Example 13.3. Let X be a (pseudo)metric space. Let U be the system of all sets
U ⊆ X × X for which there is some ε ∈ ℝ+ with
{(x, y) : d(x, y) < ε} ⊆ U.
Then U is a uniform structure. A set O is open in the generated topology if and
only if for any x ∈ O there is some ε ∈ ℝ+ such that {y : d(x, y) < ε} ⊆ O. Thus,
the topology generated by U is the topology of X in the usual sense.
From now on, we understand by a (pseudo)metric space always a space with
this uniform structure and the corresponding topology.
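That the sets {(x, y) : d(x, y) < ε} behave as Definition 13.1 requires comes down to d(x, x) = 0, the symmetry of d, and the triangle inequality: for the third property one may take V with radius ε/2. The following sketch checks this on a hypothetical finite sample of the real line with d(x, y) = |x − y|; the sample and all names are assumptions of this illustration, not part of the text.

```python
from itertools import product

# Hypothetical finite sample; multiples of 0.25 are exact floats.
POINTS = [k * 0.25 for k in range(9)]  # 0.0, 0.25, ..., 2.0

def d(x, y):
    return abs(x - y)

def ball_entourage(eps):
    """{(x, y) : d(x, y) < eps}, restricted to the sample."""
    return {(x, y) for x, y in product(POINTS, POINTS) if d(x, y) < eps}

def inverse(U):
    return {(y, x) for (x, y) in U}

def compose(U, V):
    """U o V := {(x, z) : (x, y) in V and (y, z) in U for some y}."""
    return {(x, z) for (x, y) in V for (y2, z) in U if y2 == y}

U = ball_entourage(1.0)
V = ball_entourage(0.5)
diagonal = {(x, x) for x in POINTS}

contains_diagonal = diagonal <= U            # property 1
is_symmetric = inverse(U) == U               # property 2
half_radius_works = compose(V, V) <= U       # property 3 with V of radius 1/2
same_radius_fails = not (compose(U, U) <= U) # U itself is too large for V
print(contains_diagonal, is_symmetric, half_radius_works, same_radius_fails)
```

The last check shows why the third property asks for some V rather than for U² ⊆ U: the pair (0, 1.5) lies in U² via the intermediate point 0.75, but not in U.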
The previous example is generalized by the following:

Example 13.4. Let D be a family of pseudometrics. Let U be the filter which is
generated by the system of all sets
{(x, y) : d(x, y) < ε} (d ∈ D, ε ∈ ℝ+).
Thus, we have U ∈ U if and only if there are finitely many d1, …, dn ∈ D and
some ε ∈ ℝ+ such that
{(x, y) : d1(x, y) < ε ∧ ··· ∧ dn(x, y) < ε} ⊆ U.
Then U is a uniform structure. We call U the uniform structure induced by the
family D.
It is a rather deep result of elementary topology that each uniform structure
is actually induced by a family of pseudometrics. This result is remarkable for
two reasons: First, it implies some criteria for the metrizability of uniform spaces.
Moreover, it establishes a connection between a general uniform space X and
the system ℝ of real numbers (because each pseudometric takes its values in ℝ).
However, we will not make use of this result and so do not provide a proof. The
reader who is interested in a proof is referred to books on elementary topology
(see e.g. [vQ79, 11.34]).
The importance of uniform structures becomes clear in the nonstandard de-
scription. As in the previous sections, we assume in the following that X ∈ S is an
entity and that ∗ : S → ∗S is a P(X)-enlargement. Then ∗ is even a P(X × X)-
enlargement (because X × X has the same cardinality as X if X is infinite).
Given a filter U over X × X, note that any U ∈ U satisfies ∗U ⊆ ∗(X × X) =
∗X × ∗X, and so mon(U) ⊆ ∗X × ∗X. Hence, it makes sense to define
x ≈U y ⇐⇒ (x, y) ∈ mon(U).
If U is a uniform structure, we call x infinitely U -close to y if x ≈U y. It makes

sense to consider only uniform structures in this connection:
Proposition 13.5. A filter U over X × X is a uniform structure if and only if ≈U
is an equivalence relation on ∗ X.
Proof. In fact, the three properties of Definition 13.1 correspond to the properties
of equivalence relations for ≈U :
The first property is equivalent to the reflexivity of ≈U . Indeed, we have
∆ := {(x, y) ∈ X × X | x = y} ⊆ U for any U ∈ U if and only if ∗ ∆ ⊆ ∗ U
for any U ∈ U . Since the standard definition principle for relations implies ∗ ∆ =
{(x, y) ∈ ∗ X × ∗ X | x = y} = {(x, x) : x ∈ ∗ X}, this is the case if and only if
x ≈U x for any x ∈ ∗ X, as claimed.
The second property of Definition 13.1 is equivalent to the symmetry of ≈U.
In fact, since ∗ is a superstructure monomorphism, we have ∗(U−1) = (∗U)−1 for
any U ∈ U . Using our notation for relations, we thus find that ≈U is symmetric
if and only if mon(U ) = mon(U )−1 . Note that U0 := {U −1 : U ∈ U } is a filter
with mon(U0 ) = mon(U )−1 . Hence, ≈U is symmetric if and only if mon(U0 ) =
mon(U ) which in view of Theorem 12.6 is equivalent to U = U0 and thus to the
second property of Definition 13.1.
The last property of Definition 13.1 is equivalent to the transitivity of ≈U. In
fact, let this property be satisfied. If x ≈U y and y ≈U z, we have for any U ∈ U that
(x, y), (y, z) ∈ ∗U. Choose some V ∈ U with V² ⊆ U. Then (∗V)² = ∗(V²) ⊆ ∗U,
and so (x, y), (y, z) ∈ ∗V implies (x, z) ∈ ∗U. Hence, x ≈U z, and so ≈U is
transitive.
Conversely, let ≈U be transitive. By Theorem 12.6, there is some V ∈ ∗U
with V ⊆ mon(U). Then we have for any (x, y) ∈ V that x ≈U y. Since ≈U
is transitive, we have for any (x, z) ∈ V² that x ≈U z and thus V² ⊆ mon(U).
Hence, given U ∈ U, we have V² ⊆ ∗U, i.e. the sentence
∃v ∈ ∗U : v² ⊆ ∗U
is true. The inverse form of the transfer principle implies that there is some V ∈ U
with V² ⊆ U.
Proposition 13.6. If the uniform structure on X is induced by a family D of
pseudometrics, then
x ≈U y ⇐⇒ ∗d(x, y) ≈ 0 for any d ∈ D.
In particular, if X is a pseudometric space, we have
x ≈U y ⇐⇒ ∗d(x, y) ≈ 0,
and for X = ℝ, we have
x ≈U y ⇐⇒ x ≈ y.
Proof. By Example 13.4, the sets of the form
Bε,d := {(x, y) ∈ X × X : d(x, y) < ε} (ε ∈ ℝ+, d ∈ D)
generate U. Hence, Lemma 12.11 implies mon(U) = ⋂ {∗(Bε,d) : ε ∈ ℝ+, d ∈ D}.
Note that, by the standard definition principle for relations,
∗(Bε,d) = {(x, y) ∈ ∗X × ∗X : ∗d(x, y) < ∗ε}.
Hence, (x, y) ∈ mon(U) if and only if ∗d(x, y) < ∗ε for any ε ∈ ℝ+ and any
d ∈ D, i.e. if and only if ∗d(x, y) ≈ 0 for any d ∈ D.
One might hope that x≈U y if and only if x≈O y with respect to the topology
generated by the uniform structure. Unfortunately, for nonstandard points this
may fail even in natural situations:
Example 13.7. Consider X = ℕ with the canonical (metric) uniform structure.
By Proposition 13.6, we have
n ≈U m ⇐⇒ n = m.
On the other hand, for h ∈ ℕ∞, we find some n ≠ h with n ≈O h by Theorem 12.27.
Nevertheless, for standard points the situation is good:
Proposition 13.8. Let X be a uniform space. Then we have
y ≈O ∗ x ⇐⇒ y ≈U ∗ x (x ∈ X, y ∈ ∗ X).
Proof. Let y ≈U ∗x. If O ⊆ X is open with ∗x ∈ ∗O, then x ∈ O, and the definition
of the topology implies that there is some U ∈ U with O = U(x). Since ≈U is
symmetric, we have ∗x ≈U y and thus (∗x, y) ∈ ∗U, i.e. y ∈ ∗U(∗x) = ∗(U(x)) =
∗O. Hence, y ≈O ∗x.
Conversely, if y ≈O ∗x and U ∈ U is given, then U(x) is a neighborhood of
x by Proposition 13.2, and so we find some open O ⊆ X with x ∈ O ⊆ U(x).
Then ∗x ∈ ∗O implies in view of y ≈O ∗x that y ∈ ∗O ⊆ ∗(U(x)) = ∗U(∗x), i.e.
(∗x, y) ∈ ∗U. Hence, (∗x, y) ∈ mon(U) which means ∗x ≈U y (or, equivalently,
y ≈U ∗x).
Proposition 13.8 explains why y ≈O x is sometimes defined in the literature only
for the case that x is a standard point.
In uniform spaces, it makes sense to speak of Cauchy sequences:
Definition 13.9. A sequence xn ∈ X is a Cauchy sequence if for each U ∈ U there
is some n0 such that (xn , xm ) ∈ U for n, m ≥ n0 . A filter F over X is a Cauchy
filter if for each U ∈ U there is some F ∈ F with F × F ⊆ U .
Proposition 13.10. A sequence xn ∈ X is a Cauchy sequence if and only if the filter
F generated by the sets Fn := {xn , xn+1 , xn+2 , . . .} (n = 1, 2, . . .) is a Cauchy
filter.
Proof. If xn is a Cauchy sequence and U ∈ U, then there is some n0 with Fn0 × Fn0 ⊆ U.

Conversely, if F is a Cauchy filter and U ∈ U , then there are n1 , . . . , nk such that
F := Fn1 ∩ · · · ∩ Fnk satisfies F × F ⊆ U . The latter means (xn , xm ) ∈ U for
n, m ≥ max{n1 , . . . , nk }.
Theorem 13.11. A filter F is a Cauchy filter if and only if x, y ∈ mon(F ) implies

x ≈U y.
Proof. If F is a Cauchy filter and x, y ∈ mon(F ), then we have for any U ∈ U

that there is some F ∈ F with F × F ⊆ U , and so (x, y) ∈ ∗ F × ∗ F ⊆ ∗ U . Hence,
x ≈U y.
Conversely, let x ≈U y for any x, y ∈ mon(F ). By Theorem 12.6, there is
some F ∈ ∗ F with F ⊆ mon(F ). Given U ∈ U , we have for any x, y ∈ F that
(x, y) ∈ ∗ U, i.e. F × F ⊆ ∗ U . Hence,
∃x ∈ ∗ F : x × x ⊆ ∗ U .
The inverse form of the transfer principle implies that there is some F ∈ F with
F × F ⊆ U , and so F is a Cauchy filter.
Definition 13.12. A uniform space is called complete, if each Cauchy filter con-
verges.
Propositions 12.46 and 13.10 together imply:
Corollary 13.13. In a complete space any Cauchy sequence converges.
Proof. If xn is a Cauchy sequence, then the filter F generated by the sets Fn :=
{xn , xn+1 , . . .} is a Cauchy filter by Proposition 13.10 and thus convergent to some
x. Proposition 12.46 implies xn → x.
We point out that the converse to Corollary 13.13 does not hold in general.
But the converse is true in (pseudo)metric spaces; although it is rather easy
to prove this fact by standard methods, one can also give a nonstandard proof
(Exercise 68).
It turns out that completeness of a uniform space X is related to another
important notion:
Definition 13.14. A point y ∈ ∗ X is called a pre-nearstandard point if for any
U ∈ U there is some x ∈ X with (∗ x, y) ∈ ∗ U . We write pns(∗ X) for the set of
pre-nearstandard points.
Since U ∈ U if and only if U−1 ∈ U and since (∗U)−1 = ∗(U−1), it is
equivalent to require that for any U ∈ U there is some x ∈ X
with (y, ∗x) ∈ ∗U.
Exercise 67. Prove that in a (pseudo)metric space X a point y ∈ ∗X is
pre-nearstandard if and only if for each ε ∈ ℝ+ there is some x ∈ X with
∗d(∗x, y) < ∗ε.
Lemma 13.15. If F is a Cauchy filter, then mon(F ) ⊆ pns(∗ X).
Proof. Let y ∈ mon(F ). Given U ∈ U , choose some F ∈ F with F × F ⊆ U . Fix
some x ∈ F . Then
(∗x, y) ∈ ∗F × ∗F = ∗(F × F) ⊆ ∗U.
Theorem 13.16. ns(∗ X) ⊆ pns(∗ X), i.e. each nearstandard point is a pre-
nearstandard point. Moreover, we have equality if and only if X is complete.
Proof. If y ∈ ns(∗ X), then there is some x ∈ X with y ≈O ∗ x which by Proposi-

tion 13.8 implies y ≈U ∗ x, i.e. (∗ x, y) ∈ ∗ U for any U ∈ U . Hence, y ∈ pns(∗ X).
Now, let X be complete, and y ∈ pns(∗ X). Let B be the system of all
sets of the form U (x) where U ∈ U and (∗ x, y) ∈ ∗ U , i.e. y ∈ ∗ U (∗ x). For
U1 (x1 ), . . . , Un (xn ) ∈ B, the set ∗ U 1 (∗ x1 ) ∩ · · · ∩ ∗ U n (∗ xn ) is not empty (because
it contains y), and so U1 (x1 ) ∩ · · · ∩ Un (xn ) is not empty. Hence, B has the
finite intersection property and thus generates some filter F . We claim that F
is a Cauchy filter: Given U ∈ U, choose some V ∈ U with V² ⊆ U. Then
U0 := V ∩ V −1 ∈ U . Since y ∈ pns(∗ X), we find some x ∈ X with (∗ x, y) ∈ ∗ U 0 .
Then
U0 (x) × U0 (x) ⊆ V −1 (x) × V (x) = {(a, b) : (a, x), (x, b) ∈ V } ⊆ V 2 ⊆ U.
Since U0(x) ∈ F, we may conclude that F is a Cauchy filter. Since X is complete,
we have F → x for some x ∈ X, i.e. mon(F) ⊆ mon(x) (Proposition 12.47).
Lemma 12.11 implies mon(F) = ⋂ σB; since we have for any B = U(x) ∈ B that
y ∈ ∗U(∗x) = ∗B, it follows that y ∈ mon(F) ⊆ mon(x), i.e. y ≈O ∗x.
Conversely, let pns(∗ X) = ns(∗ X), and let F be a Cauchy filter. Choose
some y ∈ mon(F ). Lemma 13.15 implies y ∈ pns(∗ X) = ns(∗ X). Hence, there
is some x ∈ X with y ≈U ∗ x. We claim that F → x: If U0 is a neighborhood
of x, Proposition 13.2 implies that there is some U ∈ U with U0 = U (x). For
any z ∈ mon(F ), Theorem 13.11 implies z ≈U y ≈U ∗ x, and so z ≈U ∗ x. Hence,
(∗ x, z) ∈ ∗ U, i.e. z ∈ ∗ U 0 . This proves mon(F ) ⊆ mon(U (x)) which means
F → x by Proposition 12.47. Thus, X is complete.
Exercise 68. Prove by applying the nonstandard Theorem 13.16 that a

(pseudo)metric space X is complete if and only if each Cauchy sequence converges.
If Y ⊆ X, we equip Y with the uniform structure
UY := {U ∩ (Y × Y ) : U ∈ U }
which we call the inherited uniform structure.

Exercise 69. Prove that UY is indeed a uniform structure on Y .
The definition immediately implies:
Proposition 13.17. Let Y ⊆ X. Then the neighborhoods of y ∈ Y with respect to
the inherited uniform structure UY are precisely the sets of the form U ∩ Y where
U ⊆ X is a neighborhood of y with respect to U.
If X is a topological space and Y ⊆ X, one can also define an inherited

topology on Y : The open sets in Y with respect to this topology are by defini-
tion precisely the sets of the form O ∩ Y where O ⊆ X is open. Fortunately,
Proposition 13.17 implies that there cannot arise any confusion if X is a uniform
space:
Corollary 13.18. Let Y ⊆ X. Then the inherited topology on Y is induced by the
inherited uniform structure UY .
Proposition 13.19. Let Y ⊆ X. For x, y ∈ ∗ Y , we have x ≈U y if and only if
x ≈UY y.
Proof. We have x ≈U y if and only if (x, y) ∈ ∗U for any U ∈ U, i.e. if and only if
(x, y) ∈ ∗U ∩ (∗Y × ∗Y) = ∗(U ∩ (Y × Y)) for any U ∈ U. This means (x, y) ∈ ∗V
for any V ∈ UY, i.e. x ≈UY y.
Exercise 70. Prove by nonstandard methods that a closed subset of a complete
uniform space with the inherited uniform structure is complete.
Definition 13.20. A subset A of a uniform space X is called precompact , if for each
U ∈ U there are finitely many x1 , . . . , xn ∈ X with A ⊆ U (x1 ) ∪ · · · ∪ U (xn ).
Example 13.21. A subset A of a (pseudo)metric space X is precompact if and
only if for each ε ∈ ℝ+ there is a finite ε-net for A in X, i.e. there are finitely many
x1, …, xn ∈ X such that the balls with center xi and radius ε cover A.
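For a finite sample of points, an ε-net can be produced greedily: a point becomes a new center exactly when it has distance at least ε from all centers chosen so far. The sketch below is only an illustration under assumptions made here (the sample A, the metric d(x, y) = |x − y|, and the function name are hypothetical, and the centers are taken from A itself, which is one admissible choice since A ⊆ X).

```python
def greedy_eps_net(points, d, eps):
    """Return centers such that every point has distance < eps to some center."""
    net = []
    for p in points:
        # p is not yet covered by any open eps-ball around a chosen center
        if all(d(p, c) >= eps for c in net):
            net.append(p)
    return net

def d(x, y):
    return abs(x - y)

# Hypothetical sample of the unit interval; eighths are exact floats.
A = [k / 8 for k in range(9)]
EPS = 0.25

net = greedy_eps_net(A, d, EPS)
covered = all(any(d(p, c) < EPS for c in net) for p in A)
print(net, covered)
```

Every point that is not chosen as a center was rejected because it already lies in the ball of an earlier center, so the balls around the chosen centers cover the sample.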
Theorem 13.22. A subset A of a uniform space X is precompact if and only if any
point of ∗A is pre-nearstandard, i.e. if and only if ∗A ⊆ pns(∗X).
Proof. Let A be precompact, and y ∈ ∗A. For any U ∈ U, there are finitely many
x1, …, xn ∈ X with y ∈ ∗A ⊆ ∗(U(x1) ∪ ··· ∪ U(xn)) = ∗U(∗x1) ∪ ··· ∪ ∗U(∗xn).
Hence, there is some xk with (∗xk, y) ∈ ∗U, and so y ∈ pns(∗X).
Conversely, if ∗A ⊆ pns(∗X) and U ∈ U, put 𝒜 := {U(x) : x ∈ X}. For
any y ∈ ∗A ⊆ pns(∗X) there is some x ∈ X with y ∈ ∗U(∗x) = ∗(U(x)), i.e.
∗A ⊆ ⋃ σ𝒜. By Exercise 44, there is a finite 𝒜0 ⊆ 𝒜 with A ⊆ ⋃ 𝒜0. But this
means that A is precompact.
As an application, we give an easy nonstandard proof of an important stan-
dard result:
Corollary 13.23. A uniform space X is compact if and only if it is precompact and
complete.
Proof. Theorems 13.16 and 13.22 imply that X is complete and precompact if and
only if ns(∗X) = pns(∗X) = ∗X. Since ns(∗X) ⊆ pns(∗X) ⊆ ∗X (Theorem 13.16),
this is the case if and only if ∗X = ns(∗X). By Theorem 12.39, this means that
X is compact.
Exercise 71. Let ∗ be a compact P(X)-enlargement. Prove that X is precompact

if and only if pns(∗ X) is internal.
Hint: σ X ⊆ pns(∗ X).
13.2 Nonstandard Hulls

In connection with nonstandard analysis, uniform structures have an important
advantage over general topologies: They allow us to define a so-called nonstandard
hull. To define this, we first have to consider a uniform structure on ∗ X:
Proposition 13.24. The system σU generates a filter U∗ which is a uniform
structure over ∗X. We have V ∈ U∗ if and only if there is some U ∈ U with
∗U ⊆ V ⊆ ∗X × ∗X.
Proof. Since U has the finite intersection property, also σU has the finite
intersection property, and so the set U∗ is a filter over ∗(X × X) = ∗X × ∗X. By
definition of a generated filter, we have V ∈ U∗ if and only if there are finitely
many U1, …, Un ∈ U with ∗U1 ∩ ··· ∩ ∗Un ⊆ V ⊆ ∗X × ∗X. Since U is a filter, we
have U1 ∩ ··· ∩ Un ∈ U, and so U∗ can be described as in the claim.
Since ∆ := {(x, y) ∈ X × X : x = y} ⊆ U for any U ∈ U, we have ∗∆ ⊆ ∗U for
any U ∈ U and thus ∗∆ ⊆ V for any V ∈ U∗. The standard
definition principle for relations implies
∗∆ = {(x, y) ∈ ∗X × ∗X : x = y},
and so the first property of Definition 13.1 is verified for U∗. If V ∈ U∗, choose
some U ∈ U with ∗U ⊆ V. Since U−1 ∈ U and ∗(U−1) = (∗U)−1 ⊆ V−1, we
have V−1 ∈ U∗. Moreover, there is some W ∈ U with W² ⊆ U. Then ∗W ∈ U∗
satisfies (∗W)² = ∗(W²) ⊆ ∗U ⊆ V, and so U∗ is a uniform structure.
Definition 13.25. If X is a uniform space, we equip ∗ X with the uniform structure

U∗ of Proposition 13.24 and the corresponding topology.
We point out that the uniform structure on ∗ X is not internal (except for
trivial cases). In particular, the uniform structure U∗ usually differs from ∗ U .
Theorem 13.26. Let ∗ be P(X)-saturated. Then ∗ X is complete.
Proof. Let F be some Cauchy filter over ∗ X. For each U ∈ U , there is some
FU ∈ F with FU × FU ⊆ ∗ U and some xU ∈ FU . Consider the system A :=
{∗ U (xU ) : U ∈ U } (axiom of choice!). We claim that A has the finite intersection
property: If U1 , . . . , Un ∈ U , then FU1 ∩ · · · ∩ FUn contains some element x. For
each k = 1, . . . , n, we have (xUk , x) ∈ FUk × FUk ⊆ ∗ U k , and so x ∈ ∗ U k (xUk ).
Moreover, A has at most the cardinality of P(X) (if X is infinite). Since ∗ is
P(X)-saturated, we thus find some x ∈ ⋂ A, i.e. x ∈ ∗U(xU) for any U ∈ U.
We claim that F → x: Indeed, by Proposition 13.2, any neighborhood of x

can be written in the form W (x) where W ∈ U∗ . We have to prove that W (x) ∈ F .
There is some U ∈ U with ∗U ⊆ W and some V ∈ U with V² ⊆ U. We may
assume that V = V−1 (otherwise replace V by V ∩ V−1). Since x ∈ ∗V(xV), we
have xV ∈ (∗V)−1(x) = ∗(V−1)(x) = ∗V(x). In particular, ∗V(xV) ⊆ (∗V)²(x).
In view of FV × FV ⊆ ∗V and xV ∈ FV, we obtain FV ⊆ ∗V(xV) ⊆ (∗V)²(x) =
∗(V²)(x) ⊆ ∗U(x) ⊆ W(x), and so W(x) ∈ F, as claimed.
Exercise 72. Assume that ∗ is comprehensive and a compact P(X)-enlargement.
Prove that ∗ X is complete.
If X is a (pseudo)metric space, then a weaker saturation property suffices for
the completeness. At this point, we recall that any nonstandard ultrapower model
is comprehensive and thus ℕ-saturated.
In the definition of a (pseudo)metric space, one may formally allow that
the metric d attains the value ∞. If the reader does not like this convention, an
alternative is to equivalently replace d by
d0 (x, y) = min{d(x, y), 1}
which is a (pseudo)metric that does not attain the value ∞ and generates the
same uniform structure as d.
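The claim about d0 can be checked numerically on a small example. The sketch below uses a hypothetical sample in which points of different "components" have infinite distance; the sample and all names are assumptions made for this illustration. It verifies the triangle inequality for d0, that d0 is finite, and that d and d0 define the same sets {(x, y) : distance < ε} for ε ≤ 1, which is why both generate the same uniform structure.

```python
import math
from itertools import product

# Hypothetical sample: points are (component, coordinate) pairs; the distance
# between points of different components is infinite.
POINTS = [("a", 0.0), ("a", 0.5), ("a", 3.0), ("b", 0.0), ("b", 10.0)]

def d(p, q):
    return abs(p[1] - q[1]) if p[0] == q[0] else math.inf

def d0(p, q):
    return min(d(p, q), 1.0)

def entourage(dist, eps):
    """{(p, q) : dist(p, q) < eps}, restricted to the sample."""
    return {(p, q) for p, q in product(POINTS, POINTS) if dist(p, q) < eps}

triangle_ok = all(d0(p, r) <= d0(p, q) + d0(q, r)
                  for p, q, r in product(POINTS, repeat=3))
finite_valued = all(d0(p, q) < math.inf
                    for p, q in product(POINTS, POINTS))
same_small_entourages = all(entourage(d, eps) == entourage(d0, eps)
                            for eps in (0.25, 0.5, 1.0))
print(triangle_ok, finite_valued, same_small_entourages)
```

For ε > 1 the two sets may differ, but this does not matter for the generated filter, since every such set for d0 still contains a set of the same form with ε ≤ 1.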
Theorem 13.27. Let the uniform structure on X be induced by a family D of
pseudometrics. Then the uniform structure U∗ on ∗X is induced by the family of
pseudometrics
d∗(x, y) := st(∗d(x, y)) if ∗d(x, y) is finite,
d∗(x, y) := ∞ if ∗d(x, y) is infinite
(d ∈ D). Moreover, if ∗ is (P(D) × ℕ)-saturated, then ∗X is complete.

Proof. Given $d \in D$, the transfer principle implies $\forall x \in {}^*X : {}^*d(x, x) = 0$, and $\forall x, y, z \in {}^*X : {}^*d(x, y) \le {}^*d(x, z) + {}^*d(z, y)$. Hence, $d_*(x, x) = 0$, and the triangle inequality for $d_*$ follows together with the additivity and monotonicity of $\operatorname{st}$. Thus, $d_*$ is a pseudometric.
By definition, $U \in \mathcal{U}$ if and only if $U$ contains a set of the form
\[
U_0 := \{(x, y) \in X \times X \mid d_1(x, y) < \varepsilon \wedge \dots \wedge d_n(x, y) < \varepsilon\}
\]
with $\varepsilon \in \mathbb{R}_+$ and $d_1, \dots, d_n \in D$. We have $V \in \mathcal{U}_*$ if and only if there is some $U \in \mathcal{U}$ with ${}^*U \subseteq V$, i.e. if and only if there is some $U_0$ of the above form such that (using the standard definition principle for relations)
\[
V \supseteq {}^*U_0 = \{(x, y) \in {}^*X \times {}^*X \mid {}^*d_1(x, y) < {}^*\varepsilon \wedge \dots \wedge {}^*d_n(x, y) < {}^*\varepsilon\}.
\]
In view of the monotonicity of $\operatorname{st}$, this means that there are $d_1, \dots, d_n \in D$ and $\varepsilon \in \mathbb{R}_+$ with
\[
V \supseteq \{(x, y) : (d_1)_*(x, y) \le \varepsilon \wedge \dots \wedge (d_n)_*(x, y) \le \varepsilon\}.
\]
But this means that the uniform structure $\mathcal{U}_*$ is induced by the family of pseudometrics $d_*$ ($d \in D$).
Let $\mathcal{F}$ be a Cauchy filter over ${}^*X$. Then we find for each $n \in \mathbb{N}$ and each finite $D_0 \subseteq D$ some $F_{D_0,n} \in \mathcal{F}$ such that
\[
F_{D_0,n} \times F_{D_0,n} \subseteq \{(x, y) \in {}^*X \times {}^*X \mid {}^*d(x, y) < 1/{}^*n \text{ for all } d \in D_0\} =: U_{D_0,n}.
\]
Choose some $x_{D_0,n} \in F_{D_0,n}$, and consider the system $\mathcal{A}$ of internal sets $U_{D_0,n}(x_{D_0,n})$ (axiom of choice!). Then $\mathcal{A}$ has the finite intersection property. Indeed, if $n_1, \dots, n_N \in \mathbb{N}$ and $D_1, \dots, D_N$ are finite subsets of $D$, then $F_{D_1,n_1} \cap \dots \cap F_{D_N,n_N}$ contains some element $x$. Then $x \in U_{D_1,n_1}(x_{D_1,n_1}) \cap \dots \cap U_{D_N,n_N}(x_{D_N,n_N})$, because for $n = n_k$ and $D_0 = D_k$ ($k = 1, \dots, N$), we have $(x, x_{D_0,n}) \in F_{D_0,n} \times F_{D_0,n} \subseteq U_{D_0,n}$. Since $*$ is $(\mathcal{P}(D) \times \mathbb{N})$-saturated, we may conclude that $\bigcap \mathcal{A}$ contains some element $x \in {}^*X$.
We claim that $\mathcal{F} \to x$: We have to prove that any neighborhood $V$ of $x$ belongs to $\mathcal{F}$. We have $V = U(x)$ for some $U \in \mathcal{U}_*$, and $U \supseteq U_{D_0,n}$ for some $n \in \mathbb{N}$ and some finite $D_0 \subseteq D$. Hence, $V \supseteq U_{D_0,n}(x)$. Since $x \in U_{D_0,2n}(x_{D_0,2n})$, we have ${}^*d(x, x_{D_0,2n}) < 1/(2\,{}^*n)$ for each $d \in D_0$, and so by the triangle inequality of ${}^*d$ that $V \supseteq U_{D_0,2n}(x_{D_0,2n})$. Hence, $F_{D_0,2n} \times F_{D_0,2n} \subseteq U_{D_0,2n}$ and $x_{D_0,2n} \in F_{D_0,2n}$ imply $F_{D_0,2n} \subseteq U_{D_0,2n}(x_{D_0,2n}) \subseteq V$, and so $V \in \mathcal{F}$, as claimed.
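The triangle-inequality step at the end of this proof can be spelled out as follows. This is only a sketch: $x_{D_0,2n}$ is the point chosen in $F_{D_0,2n}$, and $y$ denotes an arbitrary element of $U_{D_0,2n}(x_{D_0,2n})$, a name introduced here for illustration.

```latex
\begin{align*}
{}^*d(x, y) &\le {}^*d(x, x_{D_0,2n}) + {}^*d(x_{D_0,2n}, y) \\
            &< \frac{1}{2\,{}^*n} + \frac{1}{2\,{}^*n} = \frac{1}{{}^*n}
            \qquad (d \in D_0),
\end{align*}
% hence y \in U_{D_0,n}(x) \subseteq V.
```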
We note that for (pseudo)metric spaces, we needed only a countable version
of the axiom of choice in the previous proof.
Even if $X$ is a metric space, the space ${}^*X$ is usually not a Hausdorff space: If $x \approx_{\mathcal{U}} y$, then $x$ and $y$ do not have disjoint neighborhoods. To get a Hausdorff space, one identifies such elements: One may do this, since $\approx_{\mathcal{U}}$ is an equivalence relation. Thus, we put
\[
\widehat{X} := \{[x] : x \in {}^*X\},
\]
where $[x]$ denotes the equivalence class of $x$, i.e. the set of all $y \in {}^*X$ with $y \approx_{\mathcal{U}} x$. Let $\widehat{\mathcal{U}}$ be the system of all sets of the form
\[
\widehat{U} := \{([x], [y]) : (x, y) \in U\},
\]
where $U$ is an element of the uniform structure $\mathcal{U}_*$ of ${}^*X$ from Definition 13.25.
Some care is necessary: The relation $([x], [y]) \in \widehat{U}$ does not imply that $(x, y) \in U$. However, a weakening is true if $U \in {}^\sigma\mathcal{U}$:
Lemma 13.28. If $U = {}^*V$ for some $V \in \mathcal{U}$, then the relation $([x], [y]) \in \widehat{U}$ implies $(x, y) \in U^3$.
Proof. Since $([x], [y]) \in \widehat{U}$, there are elements $x_0, y_0 \in {}^*X$ with $x_0 \approx_{\mathcal{U}} x$, $y_0 \approx_{\mathcal{U}} y$ and $(x_0, y_0) \in U = {}^*V$. Since $x_0 \approx_{\mathcal{U}} x$ and $y_0 \approx_{\mathcal{U}} y$, we have $(x, x_0) \in {}^*V$ and $(y_0, y) \in {}^*V$, and so $(x, y) \in ({}^*V)^3$.
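The exponent 3 in this lemma comes from chaining three memberships. A minimal sketch, assuming (as in the rest of the section) that powers of a relation denote repeated composition:

```latex
\[
(x, x_0) \in {}^*V, \quad (x_0, y_0) \in {}^*V, \quad (y_0, y) \in {}^*V
\;\Longrightarrow\;
(x, y) \in {}^*V \circ {}^*V \circ {}^*V = ({}^*V)^3 .
\]
```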

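Theorem 13.29. $\widehat{\mathcal{U}}$ is a uniform structure on $\widehat{X}$. For $V \subseteq \widehat{X} \times \widehat{X}$ we have $V \in \widehat{\mathcal{U}}$ if and only if there is some $U \in \mathcal{U}$ with $V \supseteq \widehat{{}^*U}$.
The space $\widehat{X}$ is always a Hausdorff space. Moreover, if $X$ is Hausdorff, then the embedding $X \hookrightarrow \widehat{X}$ defined by $x \mapsto [{}^*x]$ is one-to-one. In the sense of this embedding, we have
\[
\mathcal{U} = \{\widehat{U} \cap (X \times X) : \widehat{U} \in \widehat{\mathcal{U}}\}, \tag{13.1}
\]
i.e. the uniform structure of $X$ is inherited from the uniform structure of $\widehat{X}$. Moreover, the closure of $X$ in $\widehat{X}$ is the set
\[
\overline{X} := \{[x] : x \in \operatorname{pns}({}^*X)\}.
\]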
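Proof. It follows by definition that $\widehat{\mathcal{U}}$ consists of subsets of $\widehat{X} \times \widehat{X}$ which contain $\Delta := \{([x], [x]) : [x] \in \widehat{X}\}$. We prove first that $\widehat{\mathcal{U}}$ is even a filter: If $\widehat{U} \in \widehat{\mathcal{U}}$ and $V \supseteq \widehat{U}$, let $V_0 := \{(x, y) : ([x], [y]) \in V\}$. Then $V_0 \supseteq U$, hence $V_0 \in \mathcal{U}_*$, and so $V = \widehat{V_0} \in \widehat{\mathcal{U}}$.
Let $\widehat{U}_1, \widehat{U}_2 \in \widehat{\mathcal{U}}$ be given, i.e. $U_1, U_2 \in \mathcal{U}_*$. Then $U := U_1 \cap U_2 \in \mathcal{U}_*$, and so $\widehat{U} \in \widehat{\mathcal{U}}$. Since
\[
\widehat{U} = \{([x], [y]) : (x, y) \in U_1 \cap U_2\} \subseteq \{([x], [y]) : (x, y) \in U_1\} \cap \{([x], [y]) : (x, y) \in U_2\} = \widehat{U}_1 \cap \widehat{U}_2,
\]
we have $\widehat{U}_1 \cap \widehat{U}_2 \in \widehat{\mathcal{U}}$, as desired.
Now we prove that $\widehat{\mathcal{U}}$ is even a uniform structure: If $\widehat{U} \in \widehat{\mathcal{U}}$, then $(\widehat{U})^{-1} = \{([x], [y]) : (y, x) \in U\} = \widehat{U^{-1}} \in \widehat{\mathcal{U}}$. Moreover, there is some $V \in \mathcal{U}_*$ with $V^6 \subseteq U$ and some $V_0 \in \mathcal{U}$ with $V \supseteq {}^*V_0$. Then $W := \widehat{{}^*V_0} \in \widehat{\mathcal{U}}$ satisfies $W^2 \subseteq \widehat{U}$. Indeed, if $([x], [y]) \in W^2$, we find some $z \in {}^*X$ with $([x], [z]), ([z], [y]) \in W$. By Lemma 13.28, we have $(x, z), (z, y) \in ({}^*V_0)^3$, and so $(x, y) \in ({}^*V_0)^6 \subseteq V^6 \subseteq U$ which implies $([x], [y]) \in \widehat{U}$, as desired.
It is now easily seen that $\widehat{\mathcal{U}}$ can be described as in the claim: If $V \supseteq \widehat{{}^*U}$ for some $U \in \mathcal{U}$, then $\widehat{{}^*U} \in \widehat{\mathcal{U}}$, and so $V \in \widehat{\mathcal{U}}$, since $\widehat{\mathcal{U}}$ is a filter. Conversely, if $V \in \widehat{\mathcal{U}}$, then $V = \widehat{W}$ for some $W \in \mathcal{U}_*$. There is some $U \in \mathcal{U}$ with $W \supseteq {}^*U$. Then $V = \widehat{W} \supseteq \widehat{{}^*U}$.
To prove that $\widehat{X}$ is Hausdorff, we apply Proposition 13.2: Let $[x], [y] \in \widehat{X}$ satisfy $([x], [y]) \in \widehat{U}$ for any $\widehat{U} \in \widehat{\mathcal{U}}$. For any $U \in \mathcal{U}$, we find some $V \in \mathcal{U}$ with $V^3 \subseteq U$. Since $([x], [y]) \in \widehat{{}^*V}$, Lemma 13.28 implies $(x, y) \in ({}^*V)^3 = {}^*(V^3) \subseteq {}^*U$. Hence, $(x, y) \in {}^*U$ for any $U \in \mathcal{U}$ which means $x \approx_{\mathcal{U}} y$, and so $[x] = [y]$, as desired.
Now, let $X$ be Hausdorff. If $x, y \in X$ satisfy $[{}^*x] = [{}^*y]$, we have ${}^*x \approx_{\mathcal{U}} {}^*y$, and so ${}^*x \approx_{\mathcal{O}} {}^*y$ which implies $x = \operatorname{st}({}^*y) = y$, since $\operatorname{st}$ is a function by Proposition 12.30. Thus, the map $x \mapsto [{}^*x]$ is one-to-one.
Now we prove (13.1): If $\widehat{U} \in \widehat{\mathcal{U}}$, there is some $V \in \mathcal{U}$ with $U \supseteq {}^*V$. Since $U \supseteq {}^\sigma V$, we have in the sense of the embedding
\[
V = \{([{}^*x], [{}^*y]) : (x, y) \in V\} \subseteq \widehat{U} \cap (X \times X),
\]
and so $\widehat{U} \cap (X \times X) \in \mathcal{U}$. Conversely, let $U \in \mathcal{U}$. There is some $V \in \mathcal{U}$ with $V^3 \subseteq U$. Then $\widehat{{}^*V} \in \widehat{\mathcal{U}}$, and $([x], [y]) \in \widehat{{}^*V}$ implies $(x, y) \in ({}^*V)^3 = {}^*(V^3)$ by Lemma 13.28. In particular, $([{}^*x], [{}^*y]) \in \widehat{{}^*V}$ implies $({}^*x, {}^*y) \in {}^*(V^3)$, and so $(x, y) \in V^3 \subseteq U$. In the sense of the embedding, this means
\[
\widehat{{}^*V} \cap (X \times X) \subseteq U.
\]
We thus find some $W \supseteq {}^*V$ (and so $W \in \mathcal{U}_*$) with $\widehat{W} \cap (X \times X) = U$, as desired.
If $y \in \operatorname{pns}({}^*X)$ and $\widehat{V} \in \widehat{\mathcal{U}}$, choose some $U \in \mathcal{U}$ with $V \supseteq {}^*U$. We find some $x \in X$ with $(y, {}^*x) \in {}^*U \subseteq V$. Hence, $([y], [{}^*x]) \in \widehat{V}$, i.e. $[y] \in \widehat{V}([{}^*x]) = \widehat{V}(x)$ (in the sense of our identification). Thus, $[y]$ belongs to the closure of $X$. Conversely, if $[y]$ belongs to the closure of $X$ and $U \in \mathcal{U}$ is given, choose some $V \in \mathcal{U}$ with $V^3 \subseteq U$. Since $[y]$ belongs to the closure of $X$, we find some element of $X$ in the neighborhood $\widehat{{}^*V}([y])$, i.e. there is some $x \in X$ with $([{}^*x], [y]) \in \widehat{{}^*V}$. Lemma 13.28 implies $({}^*x, y) \in ({}^*V)^3 \subseteq {}^*U$. Hence, $y \in \operatorname{pns}({}^*X)$.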

The set $\widehat{X}$ with the uniform structure $\widehat{\mathcal{U}}$ is called the nonstandard hull of the uniform space $X$.
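Theorem 13.30. Let $*$ be $\mathcal{P}(X)$-saturated. Then $\widehat{X}$ is complete.
Proof. Let $\mathcal{F}$ be a Cauchy filter over $\widehat{X}$. Let $\mathcal{F}_0$ be the system of all subsets of ${}^*X$ which contain a set of the form $\{x \in {}^*X : [x] \in F\}$ where $F \in \mathcal{F}$.
Then $\mathcal{F}_0$ is a filter: By definition, $\emptyset \notin \mathcal{F}_0$, and the relations $G_0 \in \mathcal{F}_0$ and $G_0 \subseteq G \subseteq {}^*X$ imply $G \in \mathcal{F}_0$. Moreover, for any $G_1, G_2 \in \mathcal{F}_0$, we find $F_1, F_2 \in \mathcal{F}$ with $G_i \supseteq \{x : [x] \in F_i\}$. Hence,
\[
G_1 \cap G_2 \supseteq \{x : [x] \in F_1\} \cap \{x : [x] \in F_2\} \supseteq \{x : [x] \in F_1 \cap F_2\} \in \mathcal{F}_0,
\]
and so $G_1 \cap G_2 \in \mathcal{F}_0$.
Moreover, $\mathcal{F}_0$ is a Cauchy filter: Given $U \in \mathcal{U}_*$, choose some $V \in \mathcal{U}_*$ with $V^3 \subseteq U$ and some $W \in \mathcal{U}$ with $V \supseteq {}^*W$. Since $\mathcal{F}$ is a Cauchy filter, we find some $F \in \mathcal{F}$ with $F \times F \subseteq \widehat{{}^*W}$. Putting $F_0 := \{x : [x] \in F\} \in \mathcal{F}_0$, we have for any $(x, y) \in F_0 \times F_0$ that $([x], [y]) \in \widehat{{}^*W}$ which by Lemma 13.28 implies $(x, y) \in ({}^*W)^3 \subseteq U$. Hence, $F_0 \times F_0 \subseteq U$.
By Theorem 13.26, ${}^*X$ is complete, and so $\mathcal{F}_0 \to x$ for some $x \in {}^*X$. We claim that $\mathcal{F} \to [x]$. Given $\widehat{U} \in \widehat{\mathcal{U}}$, we have to prove that $\widehat{U}([x]) \in \mathcal{F}$. Since $\mathcal{F}_0 \to x$, we have $U(x) \in \mathcal{F}_0$. By definition of $\mathcal{F}_0$, we find some $F \in \mathcal{F}$ with $\{y \in {}^*X : [y] \in F\} \subseteq U(x)$. Then
\[
\widehat{U}([x]) = \{[y] : ([x], [y]) \in \widehat{U}\} = \{[y] : (x, y) \in U\} = \{[y] : y \in U(x)\} \supseteq \{[y] : [y] \in F\} = F \in \mathcal{F},
\]
and so $\widehat{U}([x]) \in \mathcal{F}$, as desired.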
As can be seen from the proof, the saturation property is only needed for the
completeness of ∗ X. In particular, the same result holds if ∗ is comprehensive and
a compact P(X)-enlargement (Exercise 72). Moreover, if the uniform structure is
induced by a family D of pseudometrics, it suffices that ∗ is (P(D) × ℕ)-saturated
(Theorem 13.27).
Corollary 13.31. If $X$ is a Hausdorff space, then $X$ has a completion, i.e. there is a complete uniform Hausdorff space $\overline{X}$ such that $X \subseteq \overline{X}$ carries the inherited uniform structure and such that $\overline{X}$ is the closure of $X$.
Proof. By Exercise 70, the closure $\overline{X}$ of $X$ in the complete space $\widehat{X}$ is complete.
Proposition 13.32. Let the uniform structure on $X$ be induced by a family $D$ of pseudometrics. Then the uniform structure on $\widehat{X}$ is induced by the family of pseudometrics
\[
\widehat{d}([x], [y]) = \begin{cases} \operatorname{st}({}^*d(x, y)) & \text{if ${}^*d(x, y)$ is finite,} \\ \infty & \text{if ${}^*d(x, y)$ is infinite} \end{cases} \qquad (d \in D).
\]
If $X$ is a metric space, i.e. $D = \{d\}$, then $\widehat{d}$ is a metric (which might assume the value $\infty$). Moreover, $\widehat{X}$ is complete if $*$ is $(\mathcal{P}(D) \times \mathbb{N})$-saturated.
Proof. The last statement follows by the remarks following Theorem 13.30. To see that $\widehat{d}$ is well-defined, let $[x] = [x_0]$ and $[y] = [y_0]$. Then $x \approx_{\mathcal{U}} x_0$ and $y \approx_{\mathcal{U}} y_0$, and by Proposition 13.6, this implies ${}^*d(x, x_0) \approx 0 \approx {}^*d(y, y_0)$ for any $d \in D$. By the inverse triangle inequality for ${}^*d$, we thus have $|{}^*d(x, y) - {}^*d(x_0, y_0)| \le {}^*d(x, x_0) + {}^*d(y, y_0) \approx 0$, and so ${}^*d(x, y)$ and ${}^*d(x_0, y_0)$ are either both infinite or both finite with the same standard part.
If $X$ is a metric space, then $\widehat{X}$ is a pseudometric space. Since $\widehat{X}$ is Hausdorff (Theorem 13.29), the pseudometric on $\widehat{X}$ must even be a metric.
Corollary 13.33. The completion of a metric space is a metric space.

One should not make the mistake of thinking that the nonstandard hull $\widehat{X}$ is essentially the completion of $X$: The nonstandard hull is much larger and more useful (even if $X$ is complete). Consider, for example, $X = \mathbb{N}$. Then ${}^*X = {}^*\mathbb{N}$, and in the canonical way we have also $\widehat{X} = {}^*\mathbb{N}$ (Example 13.7). As another example consider $X = \mathbb{R}$: Then ${}^*X = {}^*\mathbb{R}$, but the nonstandard hull is not so easy to describe. However, from the previous consideration, we may conclude that $\widehat{X} \supseteq {}^*\mathbb{N}$ (in a canonical way). In this example, it appears that the subspace of “finite” elements of $\widehat{X}$ plays an important role, i.e. the subspace of all $[x]$ where $x \in {}^*X$ is “finite” in a certain sense. More generally, this “finiteness” plays a particular role in the context of topological vector spaces which we discuss now.
Unfortunately, there are two natural definitions of “finiteness” which differ
in general [Hen72a, HM72]. However, for all “important” spaces, these definitions
coincide: In particular, for the large class of so-called locally convex spaces (see the
remarks preceding Proposition 14.14), and also for so-called quasinormed spaces
(like ℓp or Lp with 0 < p ≤ ∞), the two definitions are identical.
Thus, instead of going into technical details, we present just one of these
definitions in the context of topological vector spaces in the following. The reader
who is interested in more details is referred to the original papers [Hen72a, HM72].
§ 14 Topological Vector Spaces

Definition 14.1. A linear space (=vector space) X over 𝕂 = ℝ or 𝕂 = ℂ with a
topology O is called a topological vector space, if the addition and multiplication
(by scalars) are continuous operations, i.e. if the mappings (x, y) ↦ x + y and
(λ, x) ↦ λx are continuous as mappings from the topological spaces X × X resp.
𝕂 × X (with the product topology) into X.
Sometimes, the term topological vector space is used for those topological
vector spaces which are additionally Hausdorff. If we assume that X is Hausdorff,
we will state this explicitly.
Theorem 14.2. Let X be a vector space endowed with some topology. Then X is a
topological vector space if and only if the following three conditions hold:
1. The neighborhoods of x ∈ X are precisely the sets U of the form U = x + V
where V is a neighborhood of 0.
In particular, the topology is translation invariant (and so the neighbor-
hoods of 0 determine the topology): If O is open, then x + O is a neighborhood
for each of its elements and thus open.
2. For each neighborhood U of 0 and each λ ∈ 𝕂 there is some neighborhood V
of 0 with V + V ⊆ U and some neighborhood Λ of λ with ΛV ⊆ U .
In particular, if U is a neighborhood of 0, then λ⁻¹U is a neighborhood
of 0 for all λ ≠ 0, and there is a neighborhood V of 0 with V − V ⊆ U .
3. For each neighborhood U of 0 and each x ∈ X there is some neighborhood Λ
of 0 with Λx ⊆ U .
Proof. Put A(x, y) := x + y and M (λ, x) := λx. Assume first that A and M are
continuous.
To prove 1., we show that for any x0 , x1 ∈ X and any neighborhood U of x0
the set U0 := (x1 − x0 ) + U is a neighborhood of x1 : For the choice x0 := 0 and
x1 := x, we obtain then that for any neighborhood U of 0 the set U0 = x + U is a
neighborhood of x; for the choice x0 := x and x1 := 0, we obtain conversely that
any neighborhood U of x has the form U = x + U0 for some neighborhood U0 of
0.
Thus, let x0 , x1 ∈ X, and U be a neighborhood of x0 . Since A(x1 , x0 − x1 ) =
x0 , we find a neighborhood V of (x1 , x0 − x1 ) with A(V ) ⊆ U . By definition of
the product topology, there are neighborhoods V1 of x1 and V2 of x0 − x1 such
that V1 × V2 ⊆ V . Then A(V1 × V2 ) ⊆ U ; in particular, V1 + (x0 − x1 ) ⊆ U . Hence,
U0 = U − (x0 − x1 ) contains V1 and thus is a neighborhood of x1 , as claimed.
For the proof of 2. and 3., let a neighborhood U of 0 and points x ∈ X and
λ ∈ 𝕂 be given. We find a neighborhood V0 of (λ, x) such that M (V0 ) ⊆ U . By
definition of the product topology this means that there are neighborhoods Λ of λ
and V1 of x such that Λ × V1 ⊆ V0 . In particular, ΛV1 ⊆ M (V0 ) ⊆ U . This already
proves 3. for the choice λ := 0.
To proceed with the proof of 2., consider the choice x := 0, and observe that
V1 is then a neighborhood of 0. In view of A(0, 0) = 0, we find a neighborhood W
of (0, 0) with A(W ) ⊆ U . There are neighborhoods W1 , W2 of 0 with W1 ×W2 ⊆ W .
Then the set V := W1 ∩ W2 ∩ V1 is a neighborhood of 0. As observed above, we
have ΛV ⊆ ΛV1 ⊆ U . Moreover, V + V ⊆ A(W ) ⊆ U . Thus, 2. holds.
Now let the conditions of the theorem be satisfied. To prove that A is continuous
at (x, y), let V be a neighborhood of z := x + y. Then U := V − z is
a neighborhood of 0. Choose a neighborhood V0 of 0 with V0 + V0 ⊆ U . Then
U1 := x + V0 and U2 := y + V0 are neighborhoods of x and y, and so U1 × U2 is a
neighborhood of (x, y) with A(U1 × U2 ) ⊆ x + y + U = V .
To prove that M is continuous at (λ, x), let V be a neighborhood of y := λx.
Then V − y is a neighborhood of 0, and so we find a neighborhood U of 0 with
U + U ⊆ V − y. There is a neighborhood Λ of 0 with Λx ⊆ U . Moreover, there are
neighborhoods Λ0 of λ and U0 of 0 with Λ0 U0 ⊆ U . Since Λ1 := Λ0 ∩ (λ + Λ) is
a neighborhood of λ, the set W := Λ1 × (x + U0 ) is a neighborhood of (λ, x). We
have
M (W ) ⊆ (λ + Λ)x + Λ0 U0 ⊆ y + U + U ⊆ V,
and so M is continuous at (λ, x).
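The inclusion for $M(W)$ in the continuity argument rests on a simple decomposition. A brief sketch, using the names $\Lambda$, $\Lambda_0$, $\Lambda_1$, $U_0$ from the proof and writing $\mu$ and $u$ (symbols introduced here) for arbitrary elements of $\Lambda_1$ and $U_0$:

```latex
% \mu \in \Lambda_1 = \Lambda_0 \cap (\lambda + \Lambda), \quad u \in U_0, \quad y = \lambda x
\begin{align*}
\mu (x + u) &= \lambda x + (\mu - \lambda) x + \mu u \\
            &\in y + \Lambda x + \Lambda_0 U_0
             \subseteq y + U + U \subseteq V .
\end{align*}
```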
If X is a topological vector space, we let 𝒰 denote the system of all sets
U ⊆ X × X which satisfy U ⊇ {(x, y) : x − y ∈ O} for some neighborhood O ⊆ X
of 0.
Theorem 14.3. If X is a topological vector space, then U is a uniform structure
which generates the topology O.
Proof. Let O0 denote the system of neighborhoods of 0. Given some O ∈ O0 , we

use the notation UO := {(x, y) : x − y ∈ O}. Then U is the system of all sets
U ⊆ X × X which contain some UO .
Observe that ∆ := {(x, x) : x ∈ X} is contained in each UO and thus in each
element of U . Moreover, it follows from the definition that the relation U ∈ U
and V ⊇ U implies V ∈ U . To see that U is a filter, let U1 , U2 ∈ U be given.
Then there are O1 , O2 ∈ O0 with Ui ⊇ UOi . Then
U1 ∩ U2 ⊇ UO1 ∩ UO2 ⊇ UO1 ∩O2 ∈ U ,
and so U1 ∩ U2 ∈ U , as desired.
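To see that $\mathcal{U}$ is a uniform structure, let $U \in \mathcal{U}$ be given. Then $U \supseteq U_O$ for some $O \in \mathcal{O}_0$. By Theorem 14.2 2., there is some $O_0 \in \mathcal{O}_0$ with $-O_0 \subseteq O$ and $O_0 + O_0 \subseteq O$. Then $U^{-1} \supseteq U_O^{-1} = U_{-O} \supseteq U_{O_0} \in \mathcal{U}$ implies $U^{-1} \in \mathcal{U}$. Moreover, for $V := U_{O_0}$, we have
\[
V^2 = \{(x, y) \mid \exists z \in X : x - z,\ z - y \in O_0\}.
\]
Since $x - z,\ z - y \in O_0$ implies $x - y = (x - z) + (z - y) \in O_0 + O_0 \subseteq O$, we find $V^2 \subseteq U_O$.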
We prove now that U generates the topology. In view of Proposition 12.5,
it suffices to consider neighborhoods: Given x ∈ X, we have to prove that the
neighborhoods of x are precisely the sets of the form U (x) with U ∈ U (recall
Proposition 13.2). If U ∈ U , we have U ⊇ UO for some O ∈ O0 , and so U (x) ⊇
UO (x) = x − O is a neighborhood of x by Theorem 14.2. Conversely, if V is a
neighborhood of x, we find by Theorem 14.2 some O ∈ O0 with V ⊇ x−O = UO (x).
Hence, U := UO ∪ {(x, y) : y ∈ V } ∈ U satisfies U (x) = V , as desired.
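As an illustration that is not taken from the text, assume that $X$ is a normed space with norm $\|\cdot\|$ and that $O$ is the open ball of radius $\varepsilon > 0$ around 0. Under these assumptions the sets $U_O$ are the familiar metric entourages:

```latex
\[
O = \{ z \in X : \|z\| < \varepsilon \}
\quad\Longrightarrow\quad
U_O = \{ (x, y) \in X \times X : \|x - y\| < \varepsilon \},
\qquad
U_O(x) = x - O = \{ y \in X : \|x - y\| < \varepsilon \}.
\]
```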
It can be proved that the uniform structure of Theorem 14.3 is the unique
uniform structure such that the addition is uniformly continuous. However, we
will not need this fact. If we speak of the uniform structure of a topological vector
space, we always mean the uniform structure of Theorem 14.3.
Note that if X is a vector space, then also ∗ X becomes equipped with an
addition and a scalar multiplication by the standard definition principle for relations. The
transfer principle implies that ∗ X is actually a vector space (with scalar multiplication
λx := (∗ λ) ∗ · x). We write + in place of ∗ + and 0 in place of ∗ 0 (noting
that ∗ 0 is also the neutral element of addition in ∗ X by the transfer principle).
Proposition 14.4. Let X be a topological vector space. Then
x ≈U y ⇐⇒ x − y ≈O 0 ⇐⇒ x − y ≈U 0 (x, y ∈ ∗ X). (14.1)
Moreover, V is a neighborhood of x ∈ ∗ X if and only if V ⊇ x + ∗ O where O ⊆ X
is a neighborhood of 0.
In particular, V is a neighborhood of x ∈ ∗ X if and only if V = x + U for
some neighborhood U of 0.
Proof. We have $x \approx_{\mathcal{U}} y$ if and only if for any $U \in \mathcal{U}$ the relation $(x, y) \in {}^*U$ holds. By Theorem 14.2, this holds if and only if $(x, y) \in {}^*U$ whenever
\[
U \supseteq U_O := \{(x, y) \in X \times X : x - y \in O\}
\]
where $O$ is a neighborhood of 0. By the standard definition principle for relations, this is the case if and only if
\[
(x, y) \in {}^*U_O = \{(x, y) \in {}^*X \times {}^*X : x - y \in {}^*O\}
\]
for any neighborhood $O \subseteq X$ of 0, i.e. if and only if $x - y \in {}^*O$ for any neighborhood $O$ of 0. But this means $x - y \approx_{\mathcal{O}} 0$. The second equivalence of (14.1) follows from Proposition 13.8.
For the second statement, note that $V$ is a neighborhood of $x$ if and only if $V \supseteq {}^*U^{-1}(x)$ for some $U \in \mathcal{U}$. This is the case if and only if $V \supseteq {}^*U_O^{-1}(x)$ for some neighborhood $O \subseteq X$ of 0. But this means $V \supseteq \{y \in {}^*X : y - x \in {}^*O\} = x + {}^*O$ for some neighborhood $O$ of 0.

For the last statement, observe that by what we just proved, U is a neigh-
borhood of 0 if and only if U ⊇ 0 + ∗ O = ∗ O for some neighborhood O ⊆ X of
0. Hence, if U is a neighborhood of 0, and V := x + U , then V ⊇ x + ∗ O for
some neighborhood O of 0, and so V is a neighborhood of x. Conversely, if V is
a neighborhood of x, then V ⊇ x + ∗ O for some neighborhood O of 0, and so
U := V − x ⊇ ∗ O is a neighborhood of 0 with V = x + U .
It is, however, not true that ∗ X is a topological vector space: In fact, by
condition 3. of Theorem 14.2, we would otherwise find for any x ∈ ∗ X and any
neighborhood U of 0 some λ ≠ 0 with λ⁻¹x ∈ U . Since U ⊇ ∗ O for some neighborhood
O of 0, this means that x is finite in the following sense:
Definition 14.5. Let X be a topological vector space. A point x ∈ ∗ X is called
finite, if for each neighborhood U ⊆ X of 0 there is some λ ∈ 𝕂 with x ∈ λ∗ U .
We use the notation
fin(∗ X) := {x ∈ ∗ X : x is finite}.
As we have mentioned at the end of §13, this is not the only natural definition
of the term “finite”: One could also use another definition which takes into account
only the uniform structure of the space (and not the multiplication operation).
Unfortunately, these two definitions may differ in certain cases. However, the above
definition appears to be the most natural one in the context of topological vector
spaces. For details, we refer the reader to [Hen72a, HM72].
Lemma 14.6. If X is a topological vector space, then any neighborhood U of 0
contains a balanced neighborhood O of 0, i.e. |λ| ≤ 1 implies λO ⊆ O.
Proof. By Theorem 14.2, we have Λ0 O0 ⊆ U where Λ0 ⊆ 𝕂 and O0 ⊆ X are
appropriate neighborhoods of 0. There is some ε > 0 such that |λ| ≤ ε implies
λ ∈ Λ0 . Then O := {λx : x ∈ O0 , |λ| ≤ ε} has the required properties.
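The "required properties" of $O$ can be verified in a few lines. The following is a sketch only; $\mu$ denotes an arbitrary scalar with $|\mu| \le 1$, a name introduced here for illustration.

```latex
% O = \{\lambda x : x \in O_0,\ |\lambda| \le \varepsilon\}
\begin{align*}
\text{balanced:}\quad & |\mu| \le 1,\ \lambda x \in O
  \;\Longrightarrow\; \mu(\lambda x) = (\mu\lambda) x,\ |\mu\lambda| \le \varepsilon
  \;\Longrightarrow\; \mu(\lambda x) \in O, \\
\text{contained in } U:\quad & O \subseteq \Lambda_0 O_0 \subseteq U, \\
\text{neighborhood of } 0:\quad & O \supseteq \varepsilon O_0 .
\end{align*}
```

The last line uses that $\varepsilon O_0$ is a neighborhood of 0, which follows from the remark in condition 2. of Theorem 14.2.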
Proposition 14.7. The following statements are equivalent:
1. x ∈ fin(∗ X).
2. For each balanced neighborhood U of 0 there is some λ ∈ 𝕂 with x ∈ λ∗ U .
3. For each neighborhood U of 0 there is some n ∈ ℕ with x ∈ n∗ U.
Proof. If U is a neighborhood of 0, then there is a balanced neighborhood U0 ⊆ U
of 0. If there is some λ ∈ 𝕂 with x ∈ λ∗ U 0 , we also have x ∈ λ∗ U . Thus, it
suffices to consider balanced neighborhoods. But if U is a balanced neighborhood
and x ∈ λ∗ U , then we have x ∈ n∗ U where n ∈ ℕ is such that n ≥ |λ|.
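The last step of this proof can be written out as follows. This is a hedged sketch: it assumes $n \ge 1$ (so that division by $n$ makes sense) and uses that ${}^*U$ is balanced whenever $U$ is, by the transfer principle.

```latex
\[
x \in \lambda\,{}^*U,\quad n \ge \max\{|\lambda|, 1\}
\;\Longrightarrow\;
x \in n \cdot \tfrac{\lambda}{n}\,{}^*U \subseteq n\,{}^*U,
\qquad\text{since } \bigl|\tfrac{\lambda}{n}\bigr| \le 1
\text{ and } {}^*U \text{ is balanced.}
\]
```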
We define the set of infinitesimals of X as
inf(∗ X) := {x ∈ ∗ X : x ≈O 0} = {x ∈ ∗ X : x ≈U 0} = mon(0).
Exercise 73. Prove that x ∈ fin(∗ X) if and only if for any c ∈ inf(∗ ℝ), c > 0, the
relation cx ∈ inf(∗ X) holds.
Lemma 14.8. The set inf(∗ X) is a linear subspace of ∗ X.
Proof. If x, y ∈ inf(∗ X), λ ∈ 𝕂, and U is a neighborhood of 0, we find by Theorem 14.2
some neighborhood O of 0 with O + O ⊆ U and λO ⊆ U . Since
x, y ∈ ∗ O, we have λx ∈ ∗ (λO) ⊆ ∗ U and x + y ∈ ∗ O + ∗ O = ∗ (O + O) ⊆ ∗ U .
Hence, λx, x + y ∈ mon(0).
Theorem 14.9. If X is a topological vector space, then fin(∗ X) ⊆ ∗ X is the largest
subspace of ∗ X which is a topological vector space (with
the inherited topology).
The uniform structure on the topological vector space fin(∗ X) is precisely the
uniform structure inherited from ∗ X.
Moreover, fin(∗ X) is closed in ∗ X. In particular, if ∗ is P(X)-saturated, then
fin(∗ X) is complete.
Proof. The fact that each topological vector subspace of ∗ X must be contained in
fin(∗ X) follows from our considerations preceding Definition 14.5.
Now we prove that fin(∗ X) is indeed a linear subspace: If x, y ∈ fin(∗ X) and
λ ∈ 𝕂 are given, then cx, cy ∈ inf(∗ X) for any c ∈ inf(∗ ℝ), c > 0 by Exercise 73.
By Lemma 14.8, we have c(x + y) = cx + cy ∈ inf(∗ X) and cλx = λ(cx) ∈ inf(∗ X)
for any c ∈ inf(∗ ℝ), c > 0, and so x + y, λx ∈ fin(∗ X) by Exercise 73.
To prove that fin(∗ X) is a topological vector space, we verify the three condi-
tions of Theorem 14.2. Condition 1. follows from Proposition 14.4: If x ∈ fin(∗ X),
and U0 ⊆ fin(∗ X) is a neighborhood of 0, i.e. U0 = U ∩ fin(∗ X) for some neigh-
borhood U ⊆ ∗ X of 0, then x + U is a neighborhood of x in ∗ X. Since fin(∗ X) is
a vector space, it follows that x + U0 = x + (U ∩ fin(∗ X)) = (x + U ) ∩ fin(∗ X) is a
neighborhood of x. Analogously, if V0 ⊆ fin(∗ X) is a neighborhood of x ∈ fin(∗ X),
then we find some neighborhood V ⊆ ∗ X of x with V0 = V ∩ fin(∗ X). Since V − x
is a neighborhood of 0 in ∗ X, it follows that U0 := V0 − x = (V − x) ∩ fin(∗ X) is
a neighborhood of 0 in fin(∗ X) with V0 = x + U0 .

For conditions 2. and 3., let a neighborhood U0 ⊆ fin(∗ X) of 0, λ ∈ 𝕂, and
x ∈ fin(∗ X) be given. We have U0 = U ∩ fin(∗ X) for some neighborhood U ⊆ ∗ X of
0, and U ⊇ ∗ V for some balanced neighborhood V ⊆ X of 0 (Proposition 14.4 and
Lemma 14.6). By Theorem 14.2, we find neighborhoods O ⊆ X of 0 and Λ ⊆ 𝕂 of
λ with O + O ⊆ V and ΛO ⊆ V . The transfer principle implies ∗ O + ∗ O ⊆ ∗ V and
Λ∗ O ⊆ ∗ V . Hence, 2. holds with the neighborhoods Λ and O0 := ∗ O ∩ fin(∗ X).
Indeed, ∗ O is a neighborhood of 0 in ∗ X by Proposition 14.4, and so O0 is a
neighborhood of 0 in fin(∗ X). Moreover, O0 + O0 ⊆ ∗ V ∩ fin(∗ X) ⊆ U0 , and
ΛO0 ⊆ ∗ V ∩ fin(∗ X) ⊆ U0 , since fin(∗ X) is a vector space. For condition 3.,
observe that we find some n ∈ ℕ with x ∈ n∗ V . For each µ in the neighborhood
Λ0 := {µ ∈ 𝕂 : |µ| < n⁻¹} of 0, we have µx ∈ ∗ V , because ∗ V is balanced, and
thus µx ∈ ∗ V ∩ fin(∗ X) ⊆ U0 . Hence, also condition 3. holds.
Let for a moment U0 denote the uniform structure corresponding to the

topological vector space fin(∗ X). We have to prove that U0 = Ufin(∗ X) where
Ufin(∗ X) := {U ∩ (fin(∗ X) × fin(∗ X)) | U ∈ U∗ }
with U∗ being the uniform structure of ∗ X introduced in Definition 13.25. By

definition, a set U ⊆ fin(∗ X) × fin(∗ X) satisfies U ∈ U0 if and only if there is a
neighborhood O ⊆ fin(∗ X) of 0 with
U ⊇ {(x, y) ∈ fin(∗ X) × fin(∗ X) | x − y ∈ O}.
Since the neighborhoods of 0 in fin(∗ X) are precisely those sets of the form
O ∩ fin(∗ X) where O ⊆ ∗ X is a neighborhood of 0, we find that U0 consists
of all sets of the form U ∩ (fin(∗ X) × fin(∗ X)) where U ⊆ ∗ X × ∗ X is such that
there is some neighborhood O ⊆ ∗ X of 0 with
U ⊇ UO := {(x, y) ∈ ∗ X × ∗ X | x − y ∈ O}.
In view of Proposition 14.4, this holds for U if and only if there is a neighborhood
O ⊆ X of 0 with U ⊇ U_{∗O} . The latter means by the inverse form of the standard
definition principle for relations that
U ⊇ ∗ {(x, y) ∈ X × X : x − y ∈ O}.
By Theorem 14.3, this is satisfied by U if and only if there is some W ∈ U with
U ⊇ ∗ W . In view of Proposition 13.24, this means U ∈ U∗ .
To see that fin(∗ X) is closed, let some x ∈ ∗ X be given with x ∉ fin(∗ X).
Then there is a neighborhood U of 0 such that x ∉ n∗ U for any n ∈ ℕ. By
Theorem 14.2 and Lemma 14.6, we find a balanced neighborhood O of 0 with
O − O ⊆ U (in particular, O ⊆ U ). By Proposition 14.4, V := x + ∗ O is a
neighborhood of x. Moreover, V ∩ fin(∗ X) = ∅ which implies that x does not belong
to the closure of fin(∗ X):
Indeed, assume that there is some y ∈ V ∩ fin(∗ X). Then we find some n ∈ ℕ with
y ∈ n∗ O, i.e. there are o1 , o2 ∈ ∗ O with no1 = y = x+o2 . But since ∗ O is balanced,
this implies x = no1 − o2 ∈ n∗ O − ∗ O ⊆ n(∗ O − ∗ O) ⊆ n∗ U , a contradiction.
The last statement follows from Theorem 13.26 and Exercise 70.
Definition 14.10. A subset A ⊆ X of a topological vector space is called bounded ,
if for each neighborhood U of 0 there is some λ ∈ ℝ+ with A ⊆ λU .
Now we can formulate a generalization of Theorem 7.8:
Exercise 74. Prove that A ⊆ X is bounded if and only if ∗ A contains only finite
elements, i.e. if and only if
∗ A ⊆ fin(∗ X).
Exercise 75. Prove the inclusion
pns(∗ X) ⊆ fin(∗ X),
and conclude that precompact subsets of topological vector spaces are bounded.
Exercise 76. Prove that the vector space inf(∗ X) is a closed subspace of ∗ X and
contained in fin(∗ X). In particular, inf(∗ X) is a closed subspace of fin(∗ X).
Recall that if U is a subspace of some vector space X, one defines the factor
space X/U as the set of all equivalence classes with respect to the equivalence
relation
x ≈ y ⇐⇒ x − y ∈ U.
The space X/U becomes a vector space with the operations [x] + [y] = [x + y] and
[λx] = λ[x] (which are well-defined). If X is a topological vector space and U is
a subspace, then one equips X/U with the following topology: The open sets are
the sets of the form {[x] : x ∈ O} where O is open.
Proposition 14.11. X/U is a topological vector space.
Proof. We prove first that the sets Õ := {[x] : x ∈ O} with open sets O form
a topology: Clearly, ∅, X/U are open. If {Õi : i} is a family of open sets, then
⋃i Õi = {[x] : x ∈ ⋃i Oi } is open. Finally, if Õ1 , Õ2 are open,
let 𝒜 be the family
of all open sets which are contained in A := Õ1 ∩ Õ2 . Then ⋃𝒜 ⊆ A, and if we
can prove that even A = ⋃𝒜, then A is open. Thus, let [x] ∈ A be given. By
choosing a proper representative, we may assume that x ∈ O1 , and we find some
u ∈ U with x+u ∈ O2 . Now we apply Theorem 14.2 several times: O3 = O2 −x−u
is a neighborhood of 0, and so x + O3 = O2 − u is a neighborhood of x. We thus
find an open set O ⊆ O1 with x ∈ O ⊆ O2 − u. Then Õ is open with [x] ∈ Õ ⊆ A,
and so [x] ∈ ⋃𝒜, as desired.
To see that the addition is continuous at ([x], [y]), let U be a neighborhood

of [x + y] = [x] + [y]. Without loss of generality, we may assume that U is open, i.e.
U = Õ for some open set O where x+y +u ∈ O for some u ∈ U . Since the addition
is continuous in X, we find open sets O1 , O2 with x ∈ O1 , y ∈ O2 and O1 + O2 ⊆
O − u. Then Õ1 and Õ2 are neighborhoods of [x] resp. [y], and Õ1 + Õ2 ⊆ Õ.
Hence, the addition is continuous. The continuity of the multiplication is proved
analogously.
Exercise 77. Prove that X/U is Hausdorff if and only if U is closed. In particular,
X is Hausdorff if and only if {0} is closed.
Definition 14.12. If X is a topological vector space, we define the nonstandard hull
X̆ as the factor space fin(∗ X)/ inf(∗ X) = fin(∗ X)/ mon(0).
The reader should be aware that the nonstandard hull of a topological vector
space X is in general different from the nonstandard hull of X considered as a
uniform space. However, one has an inclusion:
Theorem 14.13. The space $\breve{X}$ is a Hausdorff topological vector space. It is a subset of $\widehat{X}$ and has the inherited uniform structure. Moreover, $\breve{X}$ is closed in $\widehat{X}$. If $*$ is $\mathcal{P}(X)$-saturated, then $\breve{X}$ is complete.
If $X$ is Hausdorff, then $X \subseteq \breve{X}$ in the sense of the one-to-one embedding $x \mapsto [{}^*x]$.
Proof. The first statement follows from Exercise 77 and Exercise 76 or, alternatively, from the embedding $\breve{X} \subseteq \widehat{X}$ if one observes that $\widehat{X}$ is a Hausdorff space (Theorem 13.29) and that subspaces of Hausdorff spaces are Hausdorff.
By (14.1), we have $x - y \in \inf({}^*X)$ if and only if $x \approx_{\mathcal{U}} y$. Moreover, if additionally $x \in \operatorname{fin}({}^*X)$, then $x - y \in \inf({}^*X) \subseteq \operatorname{fin}({}^*X)$ implies that $y \in \operatorname{fin}({}^*X)$. Hence, for $x \in \operatorname{fin}({}^*X)$, the equivalence classes $[x]$ in the spaces $\breve{X}$ and $\widehat{X}$ consist of the same elements (namely those elements $y$ of $\operatorname{fin}({}^*X)$ for which $x \approx_{\mathcal{U}} y$). In particular, $\breve{X} = \{[x] \in \widehat{X} : x \in \operatorname{fin}({}^*X)\} \subseteq \widehat{X}$.
To see that $\breve{X}$ carries the inherited uniformity, let first $U$ be an element of the uniformity of $\breve{X}$. By Theorem 14.3, this means that we find some open neighborhood $V \subseteq \breve{X}$ of $[0]$ with
\[
U \supseteq U_V := \{([x], [y]) : x, y \in \operatorname{fin}({}^*X),\ [x] - [y] \in V\}.
\]
By definition of the topology in $\breve{X}$, we have $V = \{[x] : x \in O\}$ for some open neighborhood $O \subseteq {}^*X$ of 0. By Proposition 14.4, $O$ is a neighborhood of 0 if and only if $O \supseteq {}^*O_0$ for some neighborhood $O_0 \subseteq X$ of 0. Hence, whenever $x, y \in \operatorname{fin}({}^*X)$ satisfy $x - y \in {}^*O_0 \subseteq O$, we have $[x] - [y] = [x - y] \in V$, and so $([x], [y]) \in U_V \subseteq U$. This proves
\[
U \supseteq \{([x], [y]) : x, y \in \operatorname{fin}({}^*X) \wedge x - y \in {}^*O_0\}.
\]
By Theorem 14.3, the set $U_0 := \{(x, y) \in X \times X : x - y \in O_0\}$ belongs to the uniform structure $\mathcal{U}$ of $X$. The standard definition principle for relations thus implies
\[
\widehat{{}^*U_0} = \{([x], [y]) \mid x, y \in {}^*X \wedge x - y \in {}^*O_0\}.
\]
It follows that
\[
\widehat{{}^*U_0} \cap (\breve{X} \times \breve{X}) = \{([x], [y]) : x, y \in \operatorname{fin}({}^*X) \wedge x - y \in {}^*O_0\} \subseteq U,
\]
and so $U$ belongs to the uniform structure inherited from $\widehat{\mathcal{U}}$.
Conversely, let $U$ belong to the uniform structure inherited from $\widehat{\mathcal{U}}$. Then $U \supseteq \widehat{{}^*U_0} \cap (\breve{X} \times \breve{X})$ for some $U_0 \in \mathcal{U}$. By Theorem 14.3, there is some neighborhood $O_0 \subseteq X$ of 0 with
\[
U_0 \supseteq \{(x, y) \in X \times X \mid x - y \in O_0\}.
\]
In view of the standard definition principle, we find
\[
\widehat{{}^*U_0} \supseteq \{([x], [y]) \mid x, y \in {}^*X \wedge x - y \in {}^*O_0\},
\]
and so
\[
U \supseteq \widehat{{}^*U_0} \cap (\breve{X} \times \breve{X}) \supseteq \{([x], [y]) \mid x, y \in \operatorname{fin}({}^*X) \wedge x - y \in {}^*O_0\}. \tag{14.2}
\]
Since ${}^*O_0$ is a neighborhood of 0 (Proposition 14.4), we find some open neighborhood $O \subseteq {}^*O_0$ of 0. Then $O_1 := O \cap \operatorname{fin}({}^*X)$ is open in $\operatorname{fin}({}^*X)$, and so the set $V := \{[x] : x \in O_1\}$ is an open neighborhood of $[0]$. Theorem 14.3 implies that
\[
V_0 := \{([x], [y]) \in \breve{X} \times \breve{X} \mid [x] - [y] \in V\}
\]
belongs to the uniform structure of $\breve{X}$. If we can prove that $V_0 \subseteq U$, then also $U$ must belong to this uniform structure, and we are done. Thus, let $([x], [y]) \in V_0$ be given, i.e. $x, y \in \operatorname{fin}({}^*X)$ with $[x - y] = [x] - [y] \in V$. Then $x - y - u \in O \subseteq {}^*O_0$ for some $u \in \inf({}^*X) \subseteq \operatorname{fin}({}^*X)$. By (14.2), this implies $([x], [y + u]) \in U$. Since $[y] = [y + u]$, we find $([x], [y]) \in U$, as desired.
To see that $\breve{X}$ is closed, let $y \in \widehat{X} \setminus \breve{X}$ be given, i.e. $y = [x]$ for some $x \in {}^*X \setminus \operatorname{fin}({}^*X)$. Since $\operatorname{fin}({}^*X)$ is closed, we find some neighborhood $O \subseteq {}^*X$ of $x$ which is disjoint from $\operatorname{fin}({}^*X)$. By definition of the topology of ${}^*X$, we find some $U \in \mathcal{U}$ with $O \supseteq {}^*U(x)$. There is some $V \in \mathcal{U}$ with $V^3 \subseteq U$. Then $\widehat{{}^*V}([x])$ is a neighborhood of $[x]$ which is disjoint from $\operatorname{fin}({}^*X)$. Indeed, if $[y] \in \widehat{{}^*V}([x])$ then Lemma 13.28 implies $(x, y) \in ({}^*V)^3 \subseteq {}^*U$, and so $y \in {}^*U(x) \subseteq O$ which implies $y \notin \operatorname{fin}({}^*X)$, as desired.
If $X$ is Hausdorff, Theorem 13.29 implies that the embedding $X \hookrightarrow \widehat{X}$ defined by $x \mapsto [{}^*x]$ is one-to-one. Since ${}^*x$ is always finite, this is even an embedding into the subspace $\breve{X}$.
A map $\|\cdot\| : X \to [0, \infty)$ is called a seminorm if it has all properties of a
norm with the exception that $\|x\| = 0$ need not imply $x = 0$. Any seminorm
induces a pseudometric by $d(x, y) := \|x - y\|$. Thus, a family $\mathcal{N}$ of seminorms
induces a family of pseudometrics which in turn induces a uniform structure. It
can be verified straightforwardly that $X$ is a topological vector space with this
uniform structure. Not all topological vector spaces can be obtained in this way.
It turns out that these are precisely those topological vector spaces for which $0$
(and equivalently each point) has a neighborhood base of convex sets. For this
reason, such spaces are called locally convex.
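The passage from a seminorm to its pseudometric can be illustrated by a small sketch. The family below (the coordinate seminorms $p_0, p_1$ on $\mathbb{R}^3$) and all function names are hypothetical choices made only for this illustration; they show that a seminorm may vanish on a nonzero vector while the induced $d(x, y) = p(x - y)$ still obeys the triangle inequality.

```python
def coordinate_seminorm(k):
    """Return the seminorm p_k(x) = |x[k]| on tuples of reals."""
    return lambda x: abs(x[k])


def pseudometric(seminorm):
    """Pseudometric d(x, y) = p(x - y) induced by a seminorm p."""
    return lambda x, y: seminorm(tuple(a - b for a, b in zip(x, y)))


# Hypothetical family N = {p_0, p_1} on R^3; it ignores the third coordinate.
FAMILY = [coordinate_seminorm(0), coordinate_seminorm(1)]

x = (0.0, 0.0, 5.0)
zero = (0.0, 0.0, 0.0)

# Every seminorm of the family vanishes at x although x is not the zero vector.
vanishing = all(p(x) == 0.0 for p in FAMILY)

d0 = pseudometric(FAMILY[0])
```

Since this family does not separate `x` from `zero`, the uniform structure it induces on $\mathbb{R}^3$ is not Hausdorff; adding the third coordinate seminorm would repair that.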
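Proposition 14.14. Let $X$ be a locally convex space generated by the family $\mathcal{N}$ of
seminorms. Then
\[ \operatorname{fin}({}^*X) = \{x \in {}^*X : {}^*\|x\| \text{ is finite for each } \|\cdot\| \in \mathcal{N}\}, \]
and
\[ \operatorname{inf}({}^*X) = \{x \in {}^*X : {}^*\|x\| \text{ is infinitesimal for each } \|\cdot\| \in \mathcal{N}\}. \]
The uniform structure of $\operatorname{fin}({}^*X)$ is induced by the family of seminorms
\[ \|x\|_* := \operatorname{st}({}^*\|x\|) \qquad (\|\cdot\| \in \mathcal{N}), \]
and the uniform structure of $\breve{X}$ is induced by the family of seminorms
\[ \|[x]\| := \operatorname{st}({}^*\|x\|) \qquad (\|\cdot\| \in \mathcal{N}). \]
If $X$ is seminormed, i.e. $\mathcal{N} = \{\|\cdot\|\}$, then $\breve{X}$ is normed. Moreover, if $*$ is
$(\mathcal{P}(\mathcal{N}) \times \mathbb{N})$-saturated, then $\operatorname{fin}({}^*X)$ and $\breve{X}$ are complete.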
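Proof. We have $x \in \operatorname{fin}({}^*X)$ if and only if for any $\varepsilon \in \mathbb{R}_+$ and each finitely many
$\|\cdot\|_1, \ldots, \|\cdot\|_k \in \mathcal{N}$, we find some $n \in \mathbb{N}$ with
\[ x \in n\,{}^*\{y \in X : \|y\|_1 < \varepsilon \wedge \cdots \wedge \|y\|_k < \varepsilon\}. \]
By the standard definition principle, this means ${}^*\|x/n\|_1 < {}^*\varepsilon, \ldots, {}^*\|x/n\|_k < {}^*\varepsilon$,
i.e. ${}^*\|x\|_1 < n\,{}^*\varepsilon, \ldots, {}^*\|x\|_k < n\,{}^*\varepsilon$. Hence $x \in \operatorname{fin}({}^*X)$ if and only if ${}^*\|x\| \in \operatorname{fin}({}^*\mathbb{R})$
for each $\|\cdot\| \in \mathcal{N}$.
Similarly, $x \in \operatorname{inf}({}^*X)$ if and only if for any $\varepsilon \in \mathbb{R}_+$ and each finitely many
$\|\cdot\|_1, \ldots, \|\cdot\|_n \in \mathcal{N}$, we have
\[ x \in {}^*\{y \in X : \|y\|_1 < \varepsilon \wedge \cdots \wedge \|y\|_n < \varepsilon\} \]
which by the standard definition principle means ${}^*\|x\|_1, \ldots, {}^*\|x\|_n < {}^*\varepsilon$. Hence,
$x \in \operatorname{inf}({}^*X)$ if and only if ${}^*\|x\| \in \operatorname{inf}({}^*\mathbb{R})$ for each $\|\cdot\| \in \mathcal{N}$.
The transfer principle implies for any $\|\cdot\| \in \mathcal{N}$ that
\[ \forall x, y \in {}^*X : {}^*\|x + y\| \le {}^*\|x\| + {}^*\|y\|, \]
and since $\operatorname{st}$ is additive and monotone, the triangle inequality for $\|\cdot\|_*$ follows.
The equality $\|\lambda x\|_* = |\lambda|\,\|x\|_*$ is proved analogously. Hence, $\|\cdot\|_*$ ($\|\cdot\| \in \mathcal{N}$) is a
family of seminorms on $\operatorname{fin}({}^*X)$ which generates the same uniform structure as
the pseudometrics $d_*(x, y) := \operatorname{st}({}^*d(x, y))$ ($d \in D$) where $D$ is the family of all
pseudometrics $d(x, y) := \|x - y\|$ with $\|\cdot\| \in \mathcal{N}$. Hence, the statement concerning
$\operatorname{fin}({}^*X)$ follows from Theorem 13.27 (recall that $\operatorname{fin}({}^*X)$ is a closed subspace of
${}^*X$, and so the completeness of ${}^*X$ implies the completeness of $\operatorname{fin}({}^*X)$ by Exercise 70).
The proof of the statements concerning $\breve{X}$ follows analogously, using
Proposition 13.32.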
In general, also the nonstandard hull of a Hausdorff topological vector space
is much larger than its completion. Roughly speaking, the elements of the nonstandard
hull $\breve{X}$ which are not contained in the completion are some sort of “infinitesimals
for the dimensions”. In particular, $\breve{X}$ is the completion of a normed
space $X$ if and only if $X$ has finite dimension. To prove this, we need the following
result:
Theorem 14.15. Let $X$ be a Hausdorff topological vector space. Then $\breve{X}$ is the
closure of $X$ in $\hat{X}$ (in the sense of the embedding $x \mapsto [{}^*x]$) if and only if
$\operatorname{fin}({}^*X) = \operatorname{pns}({}^*X)$.
Proof. By Theorem 13.29, the closure of $X$ in $\hat{X}$ is equal to
\[ \overline{X} := \{[x] : x \in \operatorname{pns}({}^*X)\}. \]
Thus $\breve{X}$ is this closure if and only if
\[ \{[x] : x \in \operatorname{fin}({}^*X)\} = \breve{X} = \overline{X} = \{[x] : x \in \operatorname{pns}({}^*X)\}. \]
This is true if $\operatorname{fin}({}^*X) = \operatorname{pns}({}^*X)$. Conversely, if $\operatorname{fin}({}^*X) \ne \operatorname{pns}({}^*X)$, then Exercise 75
implies that we find some $x \in \operatorname{fin}({}^*X) \setminus \operatorname{pns}({}^*X)$. Then $[x] \in \breve{X}$, but
we have $[x] \notin \overline{X}$, since otherwise $[x] = [y]$ for some $y \in \operatorname{pns}({}^*X)$. But this implies
the contradiction $x \in \operatorname{pns}({}^*X)$: Indeed, given $U \in \mathcal{U}$, choose some $V \in \mathcal{U}$
with $V^2 \subseteq U$. Since $y \in \operatorname{pns}({}^*X)$, we find some $z \in X$ with $({}^*z, y) \in {}^*V$. Since
$(y, x) \in {}^*V$ (because $x \approx_{\mathcal{U}} y$), we have $({}^*z, x) \in {}^*V^2 \subseteq {}^*U$. Hence, $x \in \operatorname{pns}({}^*X)$,
as claimed.
A topological vector space is called a Montel space if all bounded subsets
are precompact. It follows from the well-known lemma of Riesz that the unit
ball of an infinite-dimensional normed space is not precompact, and so no
infinite-dimensional normed space is Montel. The following result thus implies in particular
that infinite-dimensional normed spaces always satisfy $\breve{X} \ne \overline{X}$:
Corollary 14.16. Let $X$ be a Hausdorff topological vector space. If $\breve{X}$ is the closure
of $X$ in $\hat{X}$, then $X$ is a Montel space.
Proof. If $A \subseteq X$ is bounded, then Exercise 74 implies ${}^*A \subseteq \operatorname{fin}({}^*X) = \operatorname{pns}({}^*X)$.
Hence, $A$ is precompact by Theorem 13.22.
It is not known whether a converse of Corollary 14.16 is true. At least, there
are Montel spaces for which $\breve{X} = \overline{X}$, namely all finite-dimensional spaces (and
there are even infinite-dimensional Montel spaces with this property, see [HM72,
Example 4.5]):
Theorem 14.17. If $X$ is a finite-dimensional Hausdorff topological vector space,
then $X \cong \breve{X}$.
Proof. It is well-known that finite-dimensional Hausdorff topological vector spaces
are Montel spaces and that the topology is generated by some norm (see e.g.
[Rud90] for a proof). Hence, $X$ is complete, and there is a bounded (and thus
precompact) neighborhood $U$ of $0$. Given $x \in \operatorname{fin}({}^*X)$, we find some $n \in \mathbb{N}$ with
$x \in n\,{}^*U$. Since $U$ is precompact, also $nU$ is precompact, and so ${}^*(nU) \subseteq \operatorname{pns}({}^*X)$
by Theorem 13.22. It follows that $x \in \operatorname{pns}({}^*X)$, and so we have proved $\operatorname{fin}({}^*X) \subseteq
\operatorname{pns}({}^*X)$. Exercise 75 now implies $\operatorname{fin}({}^*X) = \operatorname{pns}({}^*X)$, and by Theorem 14.15, we
find that $\breve{X}$ is the closure of $X$ in $\hat{X}$ (which is $X$, since $X$ is complete).
In particular, we have for $X = \mathbb{R}^n$ that $\operatorname{fin}({}^*\mathbb{R}^n)/\operatorname{inf}({}^*\mathbb{R}^n) = \breve{\mathbb{R}^n} \cong \mathbb{R}^n$. For
$n = 1$, this is a new proof of (a part of) Theorem 5.21. The previous results imply:
Corollary 14.18. If $X$ is normed, then $\breve{X}$ is the closure of $X$ in $\breve{X}$ if and only if
$X$ has finite dimension.
The reader who is interested in deeper results on the nonstandard theory
of topological vector spaces and normed spaces is referred to the papers [HM72,
HM74, HM83] and also to [Lux69a, Lux92].
Chapter 7
Miscellaneous
§15 Loeb Measures

To define nonstandard measures, we first recall the Carathéodory extension pro-
cedure from measure theory:
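Let $\Sigma_0$ be a set algebra over some set $S_0$, and $\mu : \Sigma_0 \to [0, \infty]$ a $\sigma$-additive
measure. It turns out that under this assumption $\mu$ may be extended to a
$\sigma$-additive measure $\mu^*$ on a $\sigma$-algebra $\Sigma \supseteq \Sigma_0$. This may be done constructively by
the Carathéodory extension procedure:
One first defines the so-called outer measure for $\mu$ on all subsets of $S_0$:
\[ \mu^*(E) = \inf\Bigl\{ \sum_{n=1}^{\infty} \mu(A_n) \Bigm| E \subseteq \bigcup_{n=1}^{\infty} A_n,\ A_n \in \Sigma_0 \Bigr\}. \]
Clearly, $\mu^*(A) = \mu(A)$ if $A \in \Sigma_0$. Moreover, $\mu^*$ is monotone (i.e. $D \subseteq E$ implies
$\mu^*(D) \le \mu^*(E)$) and $\sigma$-subadditive (i.e. $\mu^*(\bigcup E_n) \le \sum \mu^*(E_n)$). To see the latter,
note that, given $\varepsilon > 0$ and sets $E_n$, we find for any $n$ a sequence $A_{n,k} \in \Sigma_0$ with
$E_n \subseteq \bigcup_k A_{n,k}$ and $\mu^*(E_n) \ge \sum_k \mu(A_{n,k}) - 2^{-n}\varepsilon$. Hence,
\[ \mu^*\Bigl(\bigcup_{n=1}^{\infty} E_n\Bigr) \le \sum_{n,k=1}^{\infty} \mu(A_{n,k}) \le \sum_{n=1}^{\infty} \bigl(\mu^*(E_n) + 2^{-n}\varepsilon\bigr) = \varepsilon + \sum_{n=1}^{\infty} \mu^*(E_n). \]
In general, $\mu^*$ is not $\sigma$-additive, but one is only interested in finding a $\sigma$-algebra
$\Sigma \supseteq \Sigma_0$ such that the restriction of $\mu^*$ to $\Sigma$ is $\sigma$-additive. It turns out that such a
$\sigma$-algebra is given by the system $\Sigma$ of all sets $E \subseteq S_0$ which satisfy the estimate
\[ \mu^*(D) \ge \mu^*(D \cap E) + \mu^*(D \setminus E) \qquad (D \subseteq S_0). \tag{15.1} \]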
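On a finite set the whole procedure can be carried out by brute force, which may help to see what condition (15.1) selects. The following is a minimal sketch under stated assumptions: the four-point set, the two atoms and their weights are hypothetical, and the outer measure is computed with a single covering set, which suffices here only because $\Sigma_0$ is a finite algebra (a countable cover can be replaced by its union without increasing the sum).

```python
from itertools import combinations

S0 = frozenset({0, 1, 2, 3})
# Hypothetical atoms of the algebra Sigma_0 together with their measures.
ATOMS = [(frozenset({0, 1}), 1.0), (frozenset({2, 3}), 0.0)]


def powerset(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Sigma_0 consists of all unions of atoms; mu is additive on it.
SIGMA0 = {}
for r in range(len(ATOMS) + 1):
    for chosen in combinations(ATOMS, r):
        members = frozenset().union(*(atom for atom, _ in chosen))
        SIGMA0[members] = sum(weight for _, weight in chosen)


def outer(E):
    """Outer measure of E; on a finite algebra one covering set suffices."""
    return min(mu for A, mu in SIGMA0.items() if E <= A)


def is_measurable(E):
    """Check the Caratheodory condition (15.1) for every test set D."""
    return all(outer(D) + 1e-12 >= outer(D & E) + outer(D - E)
               for D in powerset(S0))


# The system Sigma of all sets satisfying (15.1).
SIGMA = [E for E in powerset(S0) if is_measurable(E)]
```

In this toy example `SIGMA` is strictly larger than `SIGMA0`: subsets of the null atom such as `{2}` satisfy (15.1), whereas `{0}` fails it for the test set `D = {0, 1}`.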
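Theorem 15.1 (Carathéodory). $\Sigma$ is a $\sigma$-algebra, and the restriction of $\mu^*$ to $\Sigma$ is
a measure extending $\mu$.
Proof. To prove that $\mu^*$ extends $\mu$, we only have to prove $\Sigma_0 \subseteq \Sigma$. Thus, let
$E \in \Sigma_0$ and $D \subseteq S_0$ be given. Given $\varepsilon > 0$, we find a sequence $A_n \in \Sigma_0$ with
$D \subseteq \bigcup A_n$ and $\mu^*(D) \ge \sum \mu(A_n) - \varepsilon$. Then the sets $A_n \cap E$ and $A_n \setminus E$ belong
to $\Sigma_0$, and since their union covers $D \cap E$ resp. $D \setminus E$, we find
\[ \mu^*(D \cap E) + \mu^*(D \setminus E) \le \sum_{n=1}^{\infty} \mu(A_n \cap E) + \sum_{n=1}^{\infty} \mu(A_n \setminus E) = \sum_{n=1}^{\infty} \mu(A_n) \le \mu^*(D) + \varepsilon. \]
Hence, (15.1) holds which by definition means $E \in \Sigma$.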

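For the proof that the restriction of $\mu^*$ to $\Sigma$ is a measure, we only need the
fact that $\mu^*$ is monotone and $\sigma$-subadditive. We divide this proof into two parts:
1. We prove first that $\Sigma$ is an algebra. Since $\mu^*(\emptyset) = 0$, we have $S_0 \in \Sigma$. If $E \in \Sigma$
and $D \subseteq S_0$, then
\[ \mu^*(D \cap (S_0 \setminus E)) + \mu^*(D \setminus (S_0 \setminus E)) = \mu^*(D \setminus E) + \mu^*(D \cap E) \le \mu^*(D), \]
and so $S_0 \setminus E \in \Sigma$. Finally, let $E, F \in \Sigma$. For $D \subseteq S_0$ we calculate, putting $D_E :=
D \setminus E$, that
\begin{align*}
\mu^*(D \cap (E \cup F)) + \mu^*(D \setminus (E \cup F)) &= \mu^*((D \cap E) \cup (D_E \cap F)) + \mu^*(D_E \setminus F) \\
&\le \mu^*(D \cap E) + \mu^*(D_E \cap F) + \mu^*(D_E \setminus F) \\
&\le \mu^*(D \cap E) + \mu^*(D_E) \le \mu^*(D).
\end{align*}
2. Now observe that for $E \in \Sigma$ and any $F, D \subseteq S_0$ with $F \cap E = \emptyset$ we have
\begin{align*}
\mu^*(D \cap (E \cup F)) &\ge \mu^*((D \cap (E \cup F)) \cap E) + \mu^*((D \cap (E \cup F)) \setminus E) \\
&= \mu^*(D \cap E) + \mu^*(D \cap F).
\end{align*}
An induction by $N$ thus implies that for any sequence $E_n \in \Sigma$ of pairwise disjoint
sets and all $D \subseteq S_0$,
\[ \mu^*\Bigl(D \cap \bigcup_{n=1}^{N} E_n\Bigr) \ge \sum_{n=1}^{N} \mu^*(D \cap E_n). \]
Since $\bigcup_{n \le N} E_n \in \Sigma$ by step 1, this implies
\[ \mu^*(D) \ge \mu^*\Bigl(D \cap \bigcup_{n=1}^{N} E_n\Bigr) + \mu^*\Bigl(D \setminus \bigcup_{n=1}^{N} E_n\Bigr) \ge \sum_{n=1}^{N} \mu^*(D \cap E_n) + \mu^*\Bigl(D \setminus \bigcup_{n=1}^{\infty} E_n\Bigr). \]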
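Letting $N \to \infty$, we thus have by the $\sigma$-subadditivity
\begin{align*}
\mu^*(D) &\ge \sum_{n=1}^{\infty} \mu^*(D \cap E_n) + \mu^*\Bigl(D \setminus \bigcup_{n=1}^{\infty} E_n\Bigr) \ge \mu^*\Bigl(\bigcup_{n=1}^{\infty} (D \cap E_n)\Bigr) + \mu^*\Bigl(D \setminus \bigcup_{n=1}^{\infty} E_n\Bigr) \\
&= \mu^*\Bigl(D \cap \bigcup_{n=1}^{\infty} E_n\Bigr) + \mu^*\Bigl(D \setminus \bigcup_{n=1}^{\infty} E_n\Bigr).
\end{align*}
In particular, $\bigcup E_n \in \Sigma$. Moreover, in view of the subadditivity of $\mu^*$, we even
have equality in the above estimate, and for the choice $D = \bigcup E_n$, this implies
that $\mu^*$ is $\sigma$-additive on $\Sigma$.
To prove that $\Sigma$ is a $\sigma$-algebra, it now suffices to prove that $E_n \in \Sigma$ implies
$\bigcup E_n \in \Sigma$. For pairwise disjoint sets $E_n$ we have just proved this, and the general
case reduces to this: Just replace $E_n$ by $E_n \setminus \bigcup_{k<n} E_k$.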
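The reduction to pairwise disjoint sets used at the end of the proof is a purely set-theoretic step. As a small sketch (the function name `disjointify` and the sample sets are hypothetical), each set is replaced by itself minus the union of its predecessors:

```python
def disjointify(sets):
    """Replace E_n by E_n minus the union of all earlier E_k."""
    seen = set()
    result = []
    for current in sets:
        result.append(set(current) - seen)
        seen |= set(current)
    return result


E = [{1, 2, 3}, {3, 4}, {1, 4, 5}, {6}]
B = disjointify(E)
```

The new sets are pairwise disjoint and have the same union as the original ones, which is all the proof needs.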
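Sets of finite measure may be approximated by elements of $\Sigma_0$ up to an
arbitrarily small error $\varepsilon \in \mathbb{R}_+$:
Theorem 15.2. Let $E \in \Sigma$ satisfy $\mu^*(E) < \infty$. Then for any $\varepsilon \in \mathbb{R}_+$ there is some
$D \in \Sigma_0$ such that $\mu^*(D \Delta E) < \varepsilon$. In particular, $\mu^*(D)$ and $\mu^*(E)$ differ at most
by $\varepsilon$.
Proof. By definition of $\mu^*$, we find a sequence $A_n \in \Sigma_0 \subseteq \Sigma$ with $E \subseteq A := \bigcup A_n$
and
\[ \mu^*(E) \ge \sum_{n=1}^{\infty} \mu(A_n) - \varepsilon/2 = \sum_{n=1}^{\infty} \mu^*(A_n) - \varepsilon/2 \ge \mu^*(A) - \varepsilon/2. \]
Put $D_n := A_1 \cup \cdots \cup A_n$ and $D_0 := \emptyset$. Then the sets $B_n := D_n \setminus D_{n-1}$ are pairwise
disjoint with $\bigcup B_n = \bigcup D_n = A$, and so
\begin{align*}
\mu^*(E) - \mu^*(E \setminus D_n) &= \mu^*(E \cap D_n) = \mu^*\Bigl(\bigcup_{k=1}^{n} (E \cap B_k)\Bigr) = \sum_{k=1}^{n} \mu^*(E \cap B_k) \\
&\to \sum_{k=1}^{\infty} \mu^*(E \cap B_k) = \mu^*\Bigl(\bigcup_{k=1}^{\infty} (E \cap B_k)\Bigr) = \mu^*(E \cap A) = \mu^*(E).
\end{align*}
Hence, we find some $n$ such that
\[ \mu^*(E \setminus D_n) \le \varepsilon/2. \]
For this $n$, we have
\begin{align*}
\mu^*(E \Delta D_n) &\le \mu^*(E \setminus D_n) + \mu^*(D_n \setminus E) \\
&\le \varepsilon/2 + \mu^*(A \setminus E) = \varepsilon/2 + (\mu^*(A) - \mu^*(E)) \le \varepsilon.
\end{align*}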

If $\mu^*(E) = \infty$, we still can approximate $E$ in a weaker sense by elements of $\Sigma_0$,
if $E$ is a so-called $\sigma$-finite set. The latter means that $E$ is the union of countably
many sets of finite measure.
Corollary 15.3. Let $E \in \Sigma$ be $\sigma$-finite. Then for any $\varepsilon > 0$ there is a countable
union $D$ of sets from $\Sigma_0$ such that $\mu^*(E \Delta D) \le \varepsilon$.
Proof. Since $E$ is $\sigma$-finite, it is the union of countably many sets $E_n \in \Sigma$
with $\mu^*(E_n) < \infty$. By Theorem 15.2, we find for each $n$ some $D_n \in \Sigma_0$ with
$\mu^*(E_n \Delta D_n) < 2^{-n}\varepsilon$. Since $E = \bigcup E_n$, we have for $D := \bigcup D_n$ that
\[ \mu^*(E \Delta D) \le \mu^*\Bigl(\bigcup_{n=1}^{\infty} (E_n \Delta D_n)\Bigr) \le \sum_{n=1}^{\infty} \mu^*(E_n \Delta D_n) < \varepsilon. \]

We point out that the previous results hold true if $\Sigma_0$ is not necessarily an
algebra: It suffices that $\Sigma_0$ is a so-called semi-ring. For generalizations in this
direction, we refer the reader to [Zaa67].
So far, we have only recalled some results of the standard world. The inter-
esting point in connection with nonstandard analysis is that any additive measure
µ is automatically σ-additive, if it is internal:
Through the rest of this section, we assume that $S_0 \in {}^*S$ is a nonstandard
entity, and that $[0, \infty] \in S$ is an entity. Let $\Sigma$ be an algebra over $S_0$. Then
an internal function $\mu : \Sigma \to {}^*[0, \infty]$ is called an internal additive measure, if
$\mu(A \cup B) = \mu(A) + \mu(B)$ for all $A, B \in \Sigma$ with $A \cap B = \emptyset$. Note that $\Sigma$ is internal
by Theorem 3.19, and so any $A \in \Sigma$ is internal, too (Proposition 3.16).
Example 15.4. If $\mu$ is an additive measure in the standard world, then ${}^*\mu$ is an
internal measure. This may be straightforwardly verified by the transfer principle.
Exercise 78. (Cumbersome). With $\Sigma$ and $\mu$ as above, prove that for any $*$-finite
sequence $A_1, \ldots, A_h \in \Sigma$ ($h \in {}^*\mathbb{N}$) the relation $A_1 \cup \cdots \cup A_h \in \Sigma$ holds. Moreover,
if the sets $A_k$ are pairwise disjoint, prove also
\[ \mu(A_1 \cup \cdots \cup A_h) = \mu(A_1) + \cdots + \mu(A_h). \]
Each internal additive measure $\mu$ gives rise to an additive measure with
standard values, defined by
\[ \mu_0(A) := \begin{cases} \operatorname{st}(\mu(A)) & \text{if } \mu(A) \text{ is finite,} \\ \infty & \text{if } \mu(A) \text{ is infinite} \end{cases} \qquad (A \in \Sigma). \]
The fact that this is indeed an additive measure follows immediately from the
additivity of $\operatorname{st}$ (Theorem 5.21). As mentioned above, even more holds true
automatically:

Proposition 15.5. If $*$ is $\mathbb{N}$-saturated then, for any sequence $A_n \subseteq S_0$ of pairwise
disjoint nonempty internal sets, the union $\bigcup A_n$ is not internal. In particular, with
the above notation:
1. $\Sigma$ is either finite or fails to be a $\sigma$-algebra.
2. $\mu$ and $\mu_0$ are $\sigma$-additive.
Proof. If $A := \bigcup A_n$ were internal, then each of the sets $B_n := A \setminus (A_1 \cup \cdots \cup A_n)$
were internal by Theorem 3.19. Then $\mathcal{B} := \{B_n : n \in \mathbb{N}\}$ is a countable family
of nonempty internal sets with the finite intersection property, and so $\bigcap \mathcal{B} \ne \emptyset$.
This contradicts the definition of $A$.
If $\Sigma$ is infinite, we find a sequence $B_n \in \Sigma$ of nonempty internal sets. Putting
$A_n := B_n \setminus \bigcup_{k<n} B_k$, we have $A_n \in \Sigma$ but not $\bigcup A_n \in \Sigma$. Hence, $\Sigma$ is not a
$\sigma$-algebra.
Finally, if $A_n \in \Sigma$ are pairwise disjoint with $\bigcup A_n \in \Sigma$, then all except finitely
many $A_n$ must be empty. Hence, the $\sigma$-additivity of $\mu$ and $\mu_0$ follows from their
additivity.
In this connection, we recall that any ultrapower model is automatically
comprehensive and thus $\mathbb{N}$-saturated.
Note that although $\mu_0$ takes values in the standard world, its domain of
definition is in the nonstandard world. Of course, by identifying $[0, \infty]$ with ${}^\sigma[0, \infty]$,
one might consider $\mu_0$ as a mapping of the nonstandard world. However, this
mapping must be external if it attains infinitely many values: This follows from
Theorem 3.19, because the range of this mapping is ${}^\sigma\operatorname{rng}(\mu_0)$ which is external by
Theorem 3.22.
The essential point of our above considerations is that µ0 is automatically
a σ-additive function defined on the algebra Σ. In particular, we may apply the
Carathéodory extension procedure as explained in the beginning:
Let µ∗0 denote the corresponding outer measure, and ΣL be the system of
all sets satisfying (15.1) (for µ∗0 ). Then ΣL is a σ-algebra, and the restriction of
µ∗0 to ΣL is a measure. This measure µL is called the Loeb measure for the given
additive internal measure µ.
For the rest of this section, we keep this notation.
In contrast to the general Carathéodory extension procedure, the Loeb mea-
sure has particularly nice properties. In particular, Theorem 15.2 and Corol-
lary 15.3 hold even with ε = 0.
The proof of this fact is not as easy as one might suspect at first glance,
because to apply the saturation property, one has to consider internal sets. In
particular, one has to consider µ in place of the Loeb measure µL .

Theorem 15.6. Let $*$ be $\mathbb{N}$-saturated. Let $E$ be measurable with respect to the
Loeb measure $\mu_L$ and such that $\mu_L(E) < \infty$. Then there is some $D \in \Sigma$ with
$\mu_L(D \Delta E) = 0$.
Proof. By Theorem 15.2, we find for any $n$ some set $D_n \in \Sigma$ with $\mu_L(D_n \Delta E) <
n^{-1}$. We may conclude that
\[ \mu_L(D_n \Delta D_k) \le \mu_L((D_n \Delta E) \cup (E \Delta D_k)) \le n^{-1} + k^{-1}. \]
It follows that the binary relation
\[ \varphi := \{((x, n), y) \in (\Sigma \times {}^\sigma\mathbb{N}) \times \Sigma \mid \mu(x \Delta y) \le 3/n\} \]
is concurrent on the set $M := \{(D_n, {}^*n) : n \in \mathbb{N}\}$: Indeed, given $n_1, \ldots, n_m \in \mathbb{N}$,
put $k := \max\{n_1, \ldots, n_m\}$. Then
\[ \mu(D_{n_i} \Delta D_k) \approx {}^*(\mu_L(D_{n_i} \Delta D_k)) \le n_i^{-1} + k^{-1} \le 2/n_i \qquad (i = 1, \ldots, m). \]
By Theorem 8.12, it follows that $\varphi$ is satisfied on $M$, i.e. there is some $D \in \Sigma$
such that $\mu(D_n \Delta D) \le 3/{}^*n$ for any $n \in \mathbb{N}$. Since $\operatorname{st}$ is monotone, we find that
$\mu_L(D_n \Delta D) \le 3/n$ for any $n \in \mathbb{N}$. Hence,
\[ \mu_L(D \Delta E) \le \mu_L((D \Delta D_n) \cup (D_n \Delta E)) \le 3/n + 1/n = 4/n. \]
Since $\mu_L(D \Delta E)$ is standard, this implies $\mu_L(D \Delta E) = 0$.
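The estimates for symmetric differences in this proof rest on the inclusion $A \Delta C \subseteq (A \Delta B) \cup (B \Delta C)$ together with monotonicity and subadditivity. The following sketch checks both exhaustively on a four-point set; the normalized counting measure `mu` is a hypothetical stand-in chosen only for illustration and is not the Loeb measure itself.

```python
from itertools import combinations

UNIVERSE = range(4)


def subsets(universe):
    """All subsets of the universe as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mu(A):
    """Normalized counting measure on UNIVERSE (an illustrative choice)."""
    return len(A) / len(UNIVERSE)


def triangle_holds(A, B, C):
    """Check the inclusion for symmetric differences and the resulting estimate."""
    inclusion = (A ^ C) <= ((A ^ B) | (B ^ C))
    estimate = mu(A ^ C) <= mu(A ^ B) + mu(B ^ C)
    return inclusion and estimate


ALL_OK = all(triangle_holds(A, B, C)
             for A in subsets(UNIVERSE)
             for B in subsets(UNIVERSE)
             for C in subsets(UNIVERSE))
```

In other words, $(A, B) \mapsto \mu(A \Delta B)$ behaves like a pseudometric on sets, which is why the sets $D_n$ in the proof form a kind of Cauchy sequence.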

Corollary 15.7. Let $*$ be $\mathbb{N}$-saturated, and $S_0$ be $\sigma$-finite with respect to the Loeb
measure. Then a set $E \subseteq S_0$ is Loeb measurable if and only if there is a countable
union $D$ of sets from $\Sigma$ with $\mu_0^*(D \Delta E) = 0$ (and in this case $\mu_L(D \Delta E) = 0$).
Proof. By assumption, we find Loeb measurable sets $F_n$ with $S_0 = \bigcup F_n$ and
$\mu_L(F_n) < \infty$. Hence, if $E$ is Loeb measurable, the sets $E_n := E \cap F_n$ satisfy
$\mu_L(E_n) < \infty$. By Theorem 15.6, we find for each $n$ some $D_n \in \Sigma$ with $\mu_L(D_n \Delta E_n) = 0$.
Since $E = \bigcup E_n$, we have for $D = \bigcup D_n$ that
\[ \mu_0^*(D \Delta E) \le \mu_0^*\Bigl(\bigcup (D_n \Delta E_n)\Bigr) \le \sum \mu_0^*(D_n \Delta E_n) = 0. \]
For the converse, recall that the Loeb measurable sets constitute a $\sigma$-algebra.
Hence, if $D$ is the countable union of sets from $\Sigma$, then $D$ is Loeb measurable.
Now, if $E \subseteq S_0$ satisfies $\mu_0^*(E \Delta D) = 0$, put $F := E \Delta D$. For any $C \subseteq S_0$, we have
\[ \mu_0^*(C) \ge \mu_0^*(C \setminus F) = \mu_0^*(C \cap F) + \mu_0^*(C \setminus F), \]
i.e. (15.1) holds for $\mu_0^*$ which by definition means that $F$ is Loeb measurable.
Hence, also $E = D \Delta F$ is Loeb measurable.
Also the following property does not hold for general Carathéodory extensions.
Recall that any nonstandard ultrapower model is $\mathbb{N}$-saturated and comprehensive.
Theorem 15.8. Let $*$ be $\mathbb{N}$-saturated and comprehensive. Let $E \subseteq S_0$ be contained
in a set of finite Loeb measure. Then $E$ is Loeb measurable if and only if for each
$\varepsilon > 0$ there are sets $C, D \in \Sigma$ such that $C \subseteq E \subseteq D$ and $\mu_L(D \setminus C) < \varepsilon$.
Proof. First, assume that $E$ is Loeb measurable. We find by definition of $\mu_0^*$ a
sequence $D_n \in \Sigma$ with $E \subseteq \bigcup D_n$ and
\[ \mu_L(E) = \mu_0^*(E) \ge \sum_{n=1}^{\infty} \mu_0(D_n) - \varepsilon. \]
It is no loss of generality to assume that the sets $D_n$ are pairwise disjoint: Otherwise
replace $D_n$ by $D_n \setminus \bigcup_{k<n} D_k$.
Define $f : {}^\sigma\mathbb{N} \to \Sigma$ by $f({}^*n) = D_n$. Since $*$ is comprehensive, we find an
extension of $f$ to an internal mapping $F : {}^*\mathbb{N} \to \Sigma$. Let
\[ M := \Bigl\{ n \in {}^*\mathbb{N} \Bigm| \bigl(\exists k \in {}^*\mathbb{N} : (k \le n \wedge F(n+1) \cap F(k) \ne \emptyset)\bigr) \vee \sum_{k=1}^{n+1} \mu(F(k)) > {}^*(\mu_L(E) + \varepsilon) \Bigr\}. \]
By the internal definition principle, $M$ is internal. Hence, either $M$ has some
smallest element $h$, or $M = \emptyset$ in which case we choose some arbitrary $h \in \mathbb{N}_\infty$:
In both cases the sets $F(1), \ldots, F(h)$ are pairwise disjoint, and we have
\[ \sum_{k=1}^{h} \mu(F(k)) \le {}^*(\mu_L(E) + \varepsilon). \]
Moreover, by construction, $h$ cannot be finite. The set $D := F(1) \cup \cdots \cup F(h)$
belongs to $\Sigma$ (Exercise 78). Since $h \in \mathbb{N}_\infty$, we have $D \supseteq \bigcup D_n \supseteq E$. Moreover,
since the sets $F(1), \ldots, F(h)$ are pairwise disjoint, we have
\[ {}^*(\mu_L(D)) \approx \mu(D) = \sum_{k=1}^{h} \mu(F(k)) \le {}^*(\mu_L(E) + \varepsilon), \]
and so $\mu_L(D) \le \mu_L(E) + \varepsilon$.
Replacing $E$ by $E_0 := D \setminus E$ in the above argument, we find a set $D_0 \in \Sigma$
with $D_0 \supseteq E_0$ and $\mu_L(D_0) \le \mu_L(E_0) + \varepsilon$. Putting $C := D \setminus D_0$, we have
then $C \subseteq E \subseteq D$ and
\[ \mu_L(D \setminus C) \le \mu_L(D_0) \le \mu_L(E_0) + \varepsilon = (\mu_L(D) - \mu_L(E)) + \varepsilon \le 2\varepsilon. \]
Conversely, assume that there are sequences $C_n, D_n \in \Sigma$ with $C_n \subseteq E \subseteq D_n$ and
$\mu_L(D_n \setminus C_n) \to 0$. For $C := \bigcup C_n$ and $D := \bigcap D_n$, we have then $C \subseteq E \subseteq D$ and
$\mu_L(D \setminus C) \le \inf_n \mu_L(D_n \setminus C_n) = 0$. Hence, $E \Delta D$ (and also $E \Delta C$) are null sets,
and so $E$ is Loeb measurable by Corollary 15.7.
Theorem 15.8 is essentially the main result from [Loe75].
§ 16 Distributions
In this section, we show how nonstandard analysis can be used to describe distri-
butions. We only sketch some ideas and leave the details to the reader.
Throughout this section, we assume that $\mathbb{R} \in S$ is an entity.
The idea of distributions goes back to Dirac’s $\delta$-function. The physical idea
was that this is a function $\delta$ with the property that $\delta(x) = 0$ for all $x \ne 0$ but
satisfying
\[ \int_{\mathbb{R}} \delta(x)\,dx = 1. \]
Of course, no function $\delta$ with this property exists. Nevertheless, a formal
calculation with the $\delta$-function in physics was successful, in particular due to the
essential property
\[ \int_{\mathbb{R}} \delta(x) f(x)\,dx = f(0). \]
An appropriate mathematical framework for the treatment of the $\delta$-function is the
following:
The closure of the set $\{x \in \mathbb{R} : f(x) \ne 0\}$ is called the support of $f$. One
identifies a locally integrable function $\varphi$ with the linear functional
\[ F_\varphi(f) = \int_{\mathbb{R}} \varphi(x) f(x)\,dx \]
defined on e.g. the system of all smooth functions $f$ with compact support. Then
the “$\delta$-function” can be identified with the linear functional
\[ F_\delta(f) = f(0). \]
If one now considers integration as an application of the corresponding linear
functional, we may indeed say that $F_\delta$ “is” the $\delta$-function. Note that there is no
function $\varphi$ such that $F_\delta = F_\varphi$.
In nonstandard analysis, we can define the “δ-function” as an actual (inter-
nal) function δ, and we really may consider (internal) integrals in place of abstract
linear functionals. The above mentioned fact that the “δ-function” is not a function
in the usual sense is explained by the fact that δ is not a standard function.
Definition 16.1. Let $L_1$ denote the system of all integrable functions $f : \mathbb{R} \to \mathbb{R}$.
Write $I : L_1 \to \mathbb{R}$ for the mapping which associates to each $f \in L_1$ its integral.
Then we call ${}^*L_1$ the system of internal integrable functions and define the internal
integral
\[ \int f(x)\,dx := {}^*I(f) \qquad (f \in {}^*L_1). \]
It follows from the transfer principle that the internal integral is linear in the
strong sense that
\[ \int (\lambda f(x) + \mu g(x))\,dx = \lambda \int f(x)\,dx + \mu \int g(x)\,dx \]
holds even for not necessarily standard numbers $\lambda, \mu \in {}^*\mathbb{R}$.

Let $L_1^{\mathrm{loc}}$ denote the system of locally integrable functions $f : \mathbb{R} \to \mathbb{R}$, and
$C_0$ denote the system of all continuous functions with compact support. Functions
from ${}^*L_1^{\mathrm{loc}}$ are called locally integrable internal functions. It follows from the
transfer principle that for all $\varphi \in {}^*L_1^{\mathrm{loc}}$ and $f \in {}^*C_0$ the function $g := \varphi f$ (defined
by $g(x) := \varphi(x) f(x)$) belongs to ${}^*L_1$. However, the integral is a value from ${}^*\mathbb{R}$ and
need not necessarily be finite. Let $L \subseteq {}^*L_1^{\mathrm{loc}}$ be the system of all locally integrable
internal functions $\varphi$ with the property that the integral $\int \varphi(x) f(x)\,dx$ is finite for
any $f \in {}^\sigma C_0$. Any $\varphi \in L$ defines a linear functional $F_\varphi$ on $C_0$ by means of the
formula
\[ F_\varphi(f) := \operatorname{st}\Bigl(\int \varphi(x)\,{}^*f(x)\,dx\Bigr) \qquad (f \in C_0). \tag{16.1} \]
We shall prove now that every linear functional on $C_0$ or on a linear subspace
$U \subseteq C_0$ has this form, even with a particular $\varphi$:
Let $C_0^\infty$ denote the system of all functions $f : \mathbb{R} \to \mathbb{R}$ which are differentiable
arbitrarily many times and which have a compact support. It is well-known that
$C_0^\infty$ is not trivial: It contains e.g. the “bump” function
\[ \Phi(x) := \begin{cases} e^{-1/(1 - x^2)} & \text{if } |x| < 1, \\ 0 & \text{if } |x| \ge 1. \end{cases} \]

Theorem 16.2. If $*$ is a $\mathcal{P}(\mathbb{N})$-enlargement, then any linear functional $F$ on a
linear subspace $U \subseteq C_0$ can be written in the form
\[ {}^*(F(f)) = \int \varphi(x)\,{}^*f(x)\,dx \qquad (f \in U) \tag{16.2} \]
with some $\varphi \in L \cap {}^*C_0^\infty$. In particular, (16.1) holds.
Proof. Using standard linear algebra (and the axiom of choice) one can extend
F to a linear (not necessarily bounded) functional on C0 . Hence, without loss of
generality, we can assume U = C0 .
The essential step is to prove that the binary relation
\[ \psi := \Bigl\{(x, y) \in C_0 \times C_0^\infty \Bigm| \int y(t) x(t)\,dt = F(x)\Bigr\} \]
is concurrent: To see this, let $f_1, \ldots, f_n \in C_0$ be given. We have to prove that
there is some $\varphi \in C_0^\infty$ such that
\[ \int \varphi(t) f_k(t)\,dt = F(f_k) \qquad (k = 1, \ldots, n). \tag{16.3} \]
It is no loss of generality to assume that the functions $f_k$ are linearly independent
(since all expressions in (16.3) are linear, we may successively eliminate those $f_i$
which are linear combinations of the remaining ones).
We prove by induction on $n$ that for any linearly independent $f_1, \ldots, f_n \in C_0$
there are $\varphi_1, \ldots, \varphi_n \in C_0^\infty$ satisfying
\[ \int \varphi_k(t) f_j(t)\,dt = \begin{cases} 1 & \text{if } k = j, \\ 0 & \text{if } k \ne j. \end{cases} \tag{16.4} \]
Then $\varphi := F(f_1)\varphi_1 + \cdots + F(f_n)\varphi_n$ has the required properties. Assume that the
claim has already been proved for $n - 1$. Under this assumption, we will construct
for any given linearly independent $f_1, \ldots, f_n \in C_0$ and any given $k \in \{1, \ldots, n\}$ a
function $\varphi_k \in C_0^\infty$ which satisfies (16.4) for $j = 1, \ldots, n$. Then the induction step
is complete. Renumbering the functions $f_j$ if necessary, it suffices to describe this
construction for the case $k = n$.
By induction hypothesis, we find functions $\psi_1, \ldots, \psi_{n-1} \in C_0^\infty$ with
\[ \int \psi_k(t) f_j(t)\,dt = \begin{cases} 1 & \text{if } k = j, \\ 0 & \text{if } k \ne j \end{cases} \qquad (k, j = 1, \ldots, n - 1). \]
Since $f_1, \ldots, f_n$ are linearly independent, the function
\[ f := f_n - \sum_{j=1}^{n-1} \Bigl(\int \psi_j(t) f_n(t)\,dt\Bigr) f_j \]
does not vanish. In particular, there is a nonempty interval on which $f$ is either
strictly positive or strictly negative. Choose a nontrivial nonnegative function of
$C_0^\infty$ which has its support in this interval (such a function can be obtained by
rescaling the “bump” function described above). Multiplying this function with a
constant, one finds some $\Phi \in C_0^\infty$ such that
\[ \int \Phi(t) f(t)\,dt = 1. \]
Putting $\lambda_k := \int \Phi(t) f_k(t)\,dt$ ($k = 1, \ldots, n - 1$), we claim that the function
\[ \varphi_n := \Phi - \sum_{k=1}^{n-1} \lambda_k \psi_k \]
has the required properties. Indeed, for $j = 1, \ldots, n - 1$, we have
\[ \int \varphi_n(t) f_j(t)\,dt = \int \Phi(t) f_j(t)\,dt - \lambda_j = 0. \]
Moreover, the definitions of $\varphi_n$, $\lambda_j$, and $f$ imply
\begin{align*}
\int \varphi_n(t) f_n(t)\,dt &= \int \Bigl(\Phi(t) - \sum_{j=1}^{n-1} \lambda_j \psi_j(t)\Bigr) f_n(t)\,dt \\
&= \int \Phi(s) f_n(s)\,ds - \sum_{j=1}^{n-1} \Bigl(\int \Phi(s) f_j(s)\,ds\Bigr) \Bigl(\int \psi_j(t) f_n(t)\,dt\Bigr) \\
&= \int \Phi(s) \Bigl(f_n(s) - \sum_{j=1}^{n-1} \Bigl(\int \psi_j(t) f_n(t)\,dt\Bigr) f_j(s)\Bigr)\,ds \\
&= \int \Phi(s) f(s)\,ds = 1,
\end{align*}
as desired.
Note that to determine a function from $C_0$, it suffices to determine it for
rational arguments. Hence, $C_0$ has the cardinality of $\mathbb{R}^{\mathbb{Q}}$ which in turn has the
cardinality $|\mathbb{R}^{\mathbb{Q}}| = |(2^{\mathbb{N}})^{\mathbb{N}}| = |2^{\mathbb{N} \times \mathbb{N}}| = |2^{\mathbb{N}}| = |\mathcal{P}(\mathbb{N})|$. Since $*$ is a $\mathcal{P}(\mathbb{N})$-enlargement,
we have that ${}^*\psi$ is satisfied on ${}^\sigma C_0$, i.e. we find some $\varphi \in {}^*C_0^\infty$ with $({}^*f, \varphi) \in {}^*\psi$
for any $f \in C_0$. By the standard definition principle, this means
\[ \int \varphi(t)\,{}^*f(t)\,dt = {}^*F({}^*f) = {}^*(F(f)) \qquad (f \in C_0). \]
Hence (16.2) holds, and since ${}^*(F(f))$ is always finite, this implies also $\varphi \in L$.
Considering the functional $F(f) := f(0)$, we thus find indeed a function
$\delta \in L \cap {}^*C_0^\infty$ satisfying
\[ \int \delta(x)\,{}^*f(x)\,dx = {}^*f(0) \qquad (f \in C_0). \]
If we replace here $=$ by $\approx$, it may even be arranged that $\delta(x) = 0$ for $x \not\approx 0$.
Moreover, we do not even need that $*$ is an enlargement to find such a function:
In fact, any locally integrable internal function $\delta$ with the following properties
will do:
1. $\delta \ge 0$.
2. There is some infinitesimal $c > 0$ with $\delta(x) = 0$ for $|x| > c$.
3. $\int \delta(t)\,dt = 1$.
If $*$ is a nonstandard map, we find such a function: Choose some $f \in C_0^\infty$ with
$f \ge 0$, $\int f(t)\,dt = 1$ and $f(x) = 0$ for $|x| > 1$. Then $\delta(x) := c^{-1}\,{}^*f(c^{-1}x)$ has the
required properties for any infinitesimal $c > 0$: The identity $\int \delta(x)\,dx = 1$ follows
by the substitution rule.
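A standard-world analogue of this construction can be tried numerically. The sketch below is only an illustration under stated assumptions: a small real number `c` stands in for the infinitesimal (no floating-point number is infinitesimal), the normalized bump function plays the role of $f$, and the helper names `bump`, `midpoint`, `delta` and `smeared` are hypothetical. It rescales the bump to $c^{-1} f(c^{-1} x)$ and integrates it against a test function with the midpoint rule.

```python
import math


def bump(x):
    """The bump function exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def midpoint(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# Normalizing constant, so that bump / NORM has integral 1.
NORM = midpoint(bump, -1.0, 1.0)


def delta(x, c):
    """Rescaled bump c^{-1} f(c^{-1} x) with f = bump / NORM."""
    return bump(x / c) / (c * NORM)


def smeared(g, c):
    """Approximate the integral of delta(x, c) * g(x) over [-c, c]."""
    return midpoint(lambda x: delta(x, c) * g(x), -c, c)
```

For a continuous test function `g`, the value `smeared(g, c)` approaches `g(0)` as `c` shrinks; the nonstandard statement replaces this limit by a single infinitesimal `c` and the relation $\approx$.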
Exercise 79. Prove that any function $\delta$ with the above properties satisfies
\[ \int \delta(x)\,{}^*f(x)\,dx \approx {}^*f(0) \qquad (f \in C_0), \]
i.e. for $\varphi = \delta$ the functional (16.1) satisfies $F_\varphi(f) = f(0)$ ($f \in C_0$).

Appendix A
Some Important ∗-Values
Theorem A.1. Let $* : S \to {}^*S$ be elementary. Let $S_n$ and $T_n$ denote the level sets
of the superstructure $S$ and ${}^*S$, respectively, i.e. $S_n$ is as in Section 2.1. Then
\[ {}^*S_n = \{x \in T_n : x \text{ is internal}\} \qquad (n = 0, 1, 2, \ldots). \tag{A.1} \]
Proof. The proof is by induction on $n$: For $n = 0$, we have ${}^*S_0 = {}^*S = T_0$.
If (A.1) is already proved for some $n$, recall that $S_{n+1} = S_0 \cup \mathcal{P}(S_n)$, and so
${}^*S_{n+1} = {}^*S_0 \cup {}^*\mathcal{P}(S_n)$. Thus, Theorem 3.21 implies
\[ {}^*S_{n+1} = {}^*S_0 \cup \{A \subseteq {}^*S_n : A \text{ is internal}\}. \]
Since by induction assumption ${}^*S_n \subseteq T_n$, this already implies
\[ {}^*S_{n+1} \subseteq T_0 \cup \mathcal{P}(T_n) = T_{n+1}, \]
and since all elements of ${}^*S_{n+1}$ are internal, we thus have even
\[ {}^*S_{n+1} \subseteq \{x \in T_{n+1} : x \text{ is internal}\}. \]
For the converse inclusion, observe that if $A \in T_{n+1} \setminus T_0$ is internal, then each $x \in A$
is internal, because $\mathcal{I}$ is transitive; moreover, since $A \in T_{n+1} = T_0 \cup \mathcal{P}(T_n)$, we
must have $x \in T_n$ which by induction assumption implies $x \in {}^*S_n$. Consequently,
\[ \{A \in T_{n+1} \setminus T_0 : A \text{ is internal}\} \subseteq \{A \subseteq {}^*S_n : A \text{ is internal}\}. \]
Since $T_0 = {}^*S_0$, we thus have
\[ \{A \in T_{n+1} : A \text{ is internal}\} \subseteq {}^*S_0 \cup \{A \subseteq {}^*S_n : A \text{ is internal}\} = {}^*S_{n+1}, \]
which proves the desired converse inclusion.

We intend to prove now, roughly speaking, that
\[ {}^*\Bigl(\bigcup \mathcal{A}\Bigr) = \bigcup {}^*\mathcal{A}. \]
However, since we work with a set theory with atoms, some care is needed:
Proposition A.2. Let $S_n$ and $T_n$ be as in Theorem A.1.
Given an entity $A \in S$, let $A_n$ be the collection of all elements $x \in A$
which are of type $n$, i.e. $x \in S_n \setminus S_{n-1}$. Then
\[ {}^*A_n = \{x \in {}^*A : x \in T_n \setminus T_{n-1}\}, \]
i.e. ${}^*A_n$ contains all elements from ${}^*A$ which are of type $n$. Here, we put $S_{-1} =
T_{-1} = \emptyset$.
Proof. We have
\[ A_n = \{x \in A : x \in S_n \setminus S_{n-1}\}, \]
and so, by the standard definition principle,
\[ {}^*A_n = \{x \in {}^*A : x \in {}^*S_n \setminus {}^*S_{n-1}\}. \]
Since Theorem A.1 implies that ${}^*S_n \setminus {}^*S_{n-1}$ contains all internal elements of
$T_n \setminus T_{n-1}$, the statement follows.
Corollary A.3. Let $x \in {}^*S$ be internal, and of type $n$. Then there is some entity
$A \in S$ with $x \in {}^*A$ such that all elements of $A$ are of type $n$.
In particular, any internal entity is contained in a set ${}^*A$ where $A$ is a set
consisting of entities.
Proof. Since $x$ is internal, there is an entity $B \in S$ with $x \in {}^*B$. Let $A$ be the
collection of all elements $y \in B$ of type $n$. By Proposition A.2, we have $x \in {}^*A$.
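Theorem A.4. Let $\mathcal{A} \in S$ be an entity. Let $\mathcal{A}_0$ be the collection of all elements of
$\mathcal{A}$ which are entities. Then
\[ {}^*\Bigl(\bigcup \mathcal{A}_0\Bigr) = \bigcup \{A : A \in {}^*\mathcal{A} \text{ is an entity}\} \]
and
\[ {}^*\Bigl(\bigcap \mathcal{A}_0\Bigr) = \bigcap \{A : A \in {}^*\mathcal{A} \text{ is an entity}\}. \]
Proof. Since entities are the elements of at least type 1, Proposition A.2 implies
that ${}^*\mathcal{A}_0 = \{A \in {}^*\mathcal{A} : A \text{ entity}\}$. Hence, it is no loss of generality to assume that
$\mathcal{A} = \mathcal{A}_0$.
Put $U := \bigcup \mathcal{A}$. The transitively bounded sentence
\[ \forall x \in U : \exists y \in \mathcal{A} : x \in y \]
is true, and so the transfer principle implies
\[ \forall x \in {}^*U : \exists y \in {}^*\mathcal{A} : x \in y, \]
i.e. ${}^*U \subseteq \bigcup {}^*\mathcal{A}$. Conversely, the transfer of
\[ \forall x \in \mathcal{A} : \forall y \in x : y \in U \]
implies analogously $\bigcup {}^*\mathcal{A} \subseteq {}^*U$. This proves the first equality. For the second
equality, put $D := \bigcap \mathcal{A}$, and observe that
\[ D = \{x \in U \mid \forall y \in \mathcal{A} : x \in y\}. \]
The standard definition principle shows
\[ {}^*D = \{x \in {}^*U \mid \forall y \in {}^*\mathcal{A} : x \in y\}. \]
Since ${}^*U = \bigcup {}^*\mathcal{A}$ (as we just have proved), we find ${}^*D = \bigcap {}^*\mathcal{A}$.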
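Corollary A.5. Let $\mathcal{A}, \mathcal{B} \in S$ be sets of entities. Then
\[ {}^*\{A \cup B : A \in \mathcal{A}, B \in \mathcal{B}\} = \{A \cup B : A \in {}^*\mathcal{A}, B \in {}^*\mathcal{B}\}, \]
\[ {}^*\{A \cap B : A \in \mathcal{A}, B \in \mathcal{B}\} = \{A \cap B : A \in {}^*\mathcal{A}, B \in {}^*\mathcal{B}\}, \]
and
\[ {}^*\{A \setminus B : A \in \mathcal{A}, B \in \mathcal{B}\} = \{A \setminus B : A \in {}^*\mathcal{A}, B \in {}^*\mathcal{B}\}. \]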

Proof. Put C := {A ∪ B : A ∈ A , B ∈ B}, and U := (A ∪ B). Then
C = {z ∈ U | ∃x ∈ A , y ∈ B : z = x ∪ y}.
The standard definition principle thus implies that

∗
C = {z ∈ ∗ U | ∃x ∈ ∗ A , y ∈ ∗ B : z = x ∪ y}.
∗
Since Theorem A.4 implies ∗ U = (∗ A ∪ ∗ B) = (A ∪ B), the first statement
follows. The proof of the other formulas is analogous (choose the same set U as
above).
Although Theorem A.4 is rather satisfactory from a theoretical point of view,
one often considers unions of indexed sets:
If $X_i$ ($i \in I$) is a family of sets, one would like to describe the $*$-value of
$\bigcup_{i \in I} X_i$. A natural conjecture is that this value is $\bigcup_{i \in {}^*I} {}^*X_i$. To make this more
precise, we have to define what we mean by ${}^*X_i$ when $i \in {}^*I$:
Let $X_i$, $I$ and $\mathcal{X} = \{X_i : i \in I\}$ all be entities of $S$. Then we may define
a bijection $f : I \to \mathcal{X}$ by $f(i) = X_i$. Then ${}^*f : {}^*I \to {}^*\mathcal{X}$. In a slight misuse of
notation, we define ${}^*X_i := {}^*f(i)$ ($i \in {}^*I$). The reader should not confuse ${}^*X_i$ with
${}^*(X_i)$. However, there is no real danger of such confusion, since for $i \in I$ we have
\[ {}^*X_{{}^*i} = {}^*f({}^*i) = {}^*(f(i)) = {}^*(X_i). \]
Corollary A.6. With the above notation, we have
\[ {}^*\Bigl(\bigcup_{i \in I} X_i\Bigr) = \bigcup_{i \in {}^*I} {}^*X_i. \]
Proof. The functions $f : I \to \mathcal{X}$ and ${}^*f : {}^*I \to {}^*\mathcal{X}$ are both onto (Theorem 3.13).
Hence, $\bigcup_{i \in I} X_i = \bigcup \mathcal{X}$ and $\bigcup_{i \in {}^*I} {}^*X_i = \bigcup {}^*\mathcal{X}$. Thus, the statement follows
from Theorem A.4.
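Definition A.7. With the above notation, one defines the Cartesian product
\[ \prod_{i \in I} X_i := \Bigl\{f : I \to \bigcup_{i \in I} X_i \Bigm| f(i) \in X_i \text{ for each } i \in I\Bigr\}. \]
Exercise 80. Prove that
\[ {}^*\Bigl(\prod_{i \in I} X_i\Bigr) = \Bigl\{x \in \prod_{i \in {}^*I} {}^*X_i \Bigm| x \text{ internal}\Bigr\}. \]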
Exercise 81. Show that for each internal entity X the system of all internal subsets
of X is an internal entity.
Exercise 82. Show that for each pair of internal entities A, B the system F of all
internal functions f : A → B is an internal entity. Is also the system F consisting
of all internal functions defined on subsets of A with values in B an internal entity?
We obtain similarly also results for sets of higher type:
Theorem A.8. Let $\mathcal{A} \in S$ be a system of entities. Then
\[ {}^*\{\mathcal{P}(A) : A \in \mathcal{A}\} = \{P_A : A \in {}^*\mathcal{A}\}, \]
where $P_A$ denotes the system of all internal subsets of $A$.

Proof. Let S_n, T_n be as in Theorem A.1, and put 𝒫 := {P(A) : A ∈ 𝒜}. We find
some n with 𝒜, 𝒫 ∈ S_n. Then 𝒫 ∈ S_{n+2}, and since each S_n is transitive, we have
𝒫 = {x ∈ S_{n+2} | ∃y ∈ 𝒜 : ∀z ∈ S_{n+1} : (z ∈ x ⇐⇒ z ⊆ y)}.
By the standard definition principle, we find
∗𝒫 = {x ∈ ∗S_{n+2} | ∃y ∈ ∗𝒜 : ∀z ∈ ∗S_{n+1} : (z ∈ x ⇐⇒ z ⊆ y)}.
For fixed y ∈ ∗𝒜, the following holds: Since ∗𝒜 ∈ ∗S_n ⊆ T_n and T_n is transitive,
any internal z ⊆ y satisfies z ⊆ T_n, and so z ∈ T_{n+1}. Since z is internal, we find
z ∈ ∗S_{n+1} by Theorem A.1. In other words: If z ⊆ y is internal, then z ∈ ∗S_{n+1}.
Conversely, each z ∈ ∗S_{n+1} is internal. We thus have
∗𝒫 = {P_A ∈ ∗S_{n+2} | A ∈ ∗𝒜}.
It remains to prove that P_A ∈ ∗S_{n+2} for any A ∈ ∗𝒜. Since ∗𝒜 ∈ ∗S_n ⊆ T_n
and since T_n is transitive, we have for any A ∈ ∗𝒜 that A ⊆ T_n, and so P_A ⊆
P(A) ∈ T_{n+1}. Since T_{n+1} is transitive, this proves P_A ⊆ T_{n+1}, and so P_A ∈ T_{n+2}.
Since P_A is internal by Exercise 81, we thus have P_A ∈ ∗S_{n+2} by Theorem A.1,
as claimed.
Exercise 83. (Cumbersome). Let A, B ∈ S, and let F(A, B) denote the system of
all functions f with dom(f) ⊆ A and rng(f) ⊆ B. Prove that
∗F(A, B) = {f ∈ F(∗A, ∗B) : f is internal}.
Exercise 84. Let 𝒜, ℬ ∈ S be systems of entities. Prove that
∗{A × B : A ∈ 𝒜, B ∈ ℬ} = {A × B : A ∈ ∗𝒜, B ∈ ∗ℬ}.
Appendix B
Solutions to the Exercises
The following solutions to the exercises are not complete, but all important ideas
are sketched.
Exercise 1: X is not Dedekind complete: Let f(t) := √t, and A ⊆ X consist of
all x ∈ X such that there is some t0 > 0 with x(t) ≤ f(t) for t > t0. Then A
is bounded (e.g. by x(t) := t), but A has no maximal element. To see this, we
will prove that any x ∈ A satisfies S := lim sup_{t→∞} x(t) < ∞. Then the constant
function y(t) := S + 1 belongs to A and is strictly larger than x.
We have for sufficiently large t that x(t²) ≤ f(t²) = t, and so the rational
function y(t) = x(t²)/t satisfies lim sup_{t→∞} y(t) ≤ 1. Since y(t) = p(t)/q(t) with
polynomials p, q this means that either y(t) → −∞ or that the degree deg p of
the polynomial p is at most as large as the degree deg q of q. In the first case,
x(t) ≤ 0 for sufficiently large t. In the second case, observe that deg p is even and
deg q is odd by the definition of y. Hence, deg p ≤ deg q − 1 which implies that
x(t²) = ty(t) converges as t → ∞. Hence, lim sup_{t→∞} x(t) < ∞ in both cases, as
claimed.
X is not Archimedean: The function x(t) := t belongs to X and satisfies
x ≥ n for any n ∈ ℕ. ⋆
Exercise 2: The set is not a field, since x_{0,1} has no inverse with respect to
multiplication. It is also not Dedekind complete, since the set {x_{0,b} : b > 0} has no
least upper bound. X is Archimedean, for if x_{a,b} ∈ X, then we find some n ∈ ℕ
with n > a, and so x_{n,0} ∈ X satisfies x_{n,0} > x_{a,b}. ⋆
Exercise 3: It is not totally ordered, since e.g. the equivalence class of the sequence
x : n ↦ (−1)^n cannot be compared with 0. It is not a field, because there is some
[x] ∈ X such that [x] ≠ 0 and x contains infinitely many 0's (e.g. x : n ↦
1 + (−1)^n); then [x] is not invertible. X is not Dedekind complete: Let A be the
set of all equivalence classes of sequences converging to 0. Then A is bounded from
above (e.g. by 1). However, A has no least upper bound: If [x] ∈ X were such a
bound, then [x] > 0 (because the equivalence class of n ↦ n^{−1} belongs to A).
Note that [y] ∈ A implies λ[y] ∈ A for any λ ∈ ℝ. Hence, in the case [x] ∈ A, we
find 2[x] ∈ A, and so [x] is no upper bound. But if [x] ∉ A, then [x]/2 provides a
strictly smaller upper bound for A. In both cases, we found a contradiction to the
fact that [x] is the smallest upper bound. X fails to be Archimedean, since for the
sequence x : n ↦ n, we even have [x] > n for any n. ⋆
Exercise 4: The function x(t) = t is an example of an “infinitesimal”. With the

exception of Dedekind completeness, the answers to the other questions are always
“no” (with a similar proof as in the previous exercise). However, the space X
is Dedekind complete. (In fact, an analogous statement holds for each σ-finite
measure space. This observation goes back to Riesz and Kantorovich, see e.g.
[Zaa67] for a general proof). Let M ⊆ X be bounded from above. We have to
show that M has a least upper bound. If M is countable, the least upper bound is
evidently s(t) := sup{x(t) : x ∈ M } where we identify x with some function of its
equivalence class (hence, s is only defined up to equivalence classes). However, if
M is uncountable, s would not be well-defined in this way. To avoid this difficulty,
we assume in addition that the functions in M are uniformly bounded. This is
no loss of generality, because we can replace M by {arctan ◦x : x ∈ M }; and
if y is a least upper bound for this set, then tan ◦y is a least upper bound for
M. Now, if the functions in M are uniformly bounded, then for each countable
subset A ⊆ M the least upper bound x_A := sup A is integrable, and the set
{∫_0^1 x_A(t) dt : A ⊆ M countable} is bounded and thus has a least upper bound
S. Choose a sequence of countable sets A_n ⊆ M such that ∫_0^1 x_{A_n}(t) dt → S.
The union A_∞ of these sets A_n is countable, and so x_{A_∞} = sup A_∞ satisfies
∫_0^1 x_{A_∞}(t) dt = S. Moreover, for each x ∈ M we must have x ≤ x_{A_∞} almost
everywhere, since otherwise
∫_0^1 x_{A_∞ ∪ {x}}(t) dt > ∫_0^1 x_{A_∞}(t) dt = S,
contradicting the definition of S. Hence, x_{A_∞} is an upper bound for M, and in
view of A_∞ ⊆ M, it must be the least upper bound for M. ⋆
Exercise 5: If such a predicate existed, then {c} = {x ∈ ∗A : α(x)} would
be a standard entity by the standard definition principle. But this means that
there is some standard entity B with {c} = ∗B. Since ∗B ⊆ ∗A, we have B ⊆ A
(Lemma 3.5), and so σB ⊆ σA. By Corollary 3.11, we have σB ⊆ ∗B = {c}.
Since σB ⊆ σA and c ∉ σA, this implies B = ∅, and Lemma 3.5 then gives the
contradiction {c} = ∗B = ∗∅ = ∅. ⋆
Exercise 6: By Proposition 3.16 and Lemma 3.14, we find some index k with
x1, . . . , xn ∈ ∗S_k. Then
{x1, . . . , xn} = {x ∈ ∗S_k : x = x1 ∨ · · · ∨ x = xn}
and
{(x1, . . . , xn)} = {(y1, . . . , yn) ∈ ∗S_k × · · · × ∗S_k : y1 = x1 ∧ · · · ∧ yn = xn}
are internal by the internal definition principle resp. by the internal definition
principle for relations. If B is an internal entity, and A ⊆ B is external, then A
must be infinite, since otherwise A = {x1, . . . , xn} with x_i ∈ B would be internal
by what we just proved (and since each x_i is internal, because I is transitive). ⋆
Exercise 7: We have
g ◦ f = {(x, y) ∈ dom(f ) × rng(g) | ∃z ∈ dom(g) : (x, z) ∈ f ∧ (z, y) ∈ g}.
This set is internal by the internal definition principle for relations. ⋆
Exercise 8: 1. By the internal definition principle, the sets
gi := {z ∈ fi | ∃x ∈ Ai , y ∈ rng(fi ) : z = (x, y)}
are internal. Hence, f = g1 ∪ · · · ∪ gn is internal.

2. Fix some b ∈ B, and let f1 : A → B be the constant function f1 (x) = b, i.e.
f1 = A × {b}. Then f1 is internal by the internal definition principle. Now define
F : A → B by F (x) = f (x) for x ∈ A0 and F (x) = f1 (x) for x ∈ A \ A0 . By 1., it
follows that F is internal (observe that A0 = dom(f ) and A \ A0 are internal by
Theorem 3.19).
3. Let f : A → B and A0 ⊆ A be internal. Then
f |A0 = {(x, y) ∈ f : x ∈ A0 }
is internal by the internal definition principle for relations. ⋆
Exercise 9: The answer is negative: Let A be some external set. Then the set
𝒜 = {A} cannot be a subset of some internal set B, since then we would have
A ∈ B, and so A is internal because I is transitive (Proposition 3.16). ⋆

Exercise 10: If 𝒰 has the above form, then j0 ∈ ⋂𝒰, and so 𝒰 is not free.
Conversely, if 𝒰 is not free, then there is some j0 ∈ J with j0 ∈ ⋂𝒰. For any
U ⊆ J precisely one of the sets U and J \ U belongs to 𝒰, because 𝒰 is an
ultrafilter. If j0 ∉ U, then U cannot belong to 𝒰 by our choice of j0. Conversely,
if j0 ∈ U, then j0 ∉ J \ U, and so J \ U cannot belong to 𝒰; but this implies
U ∈ 𝒰. Hence, 𝒰 contains precisely those sets U ⊆ J with j0 ∈ U. ⋆
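The finite case of this argument can be checked by brute force. The following Python sketch is not part of the book; the helper names `powerset`, `is_ultrafilter`, `ultrafilters` and `principal` are hypothetical and introduced only for illustration. It enumerates all families of subsets of a three-element set J, keeps those that are ultrafilters, and compares them with the families {U ⊆ J : j0 ∈ U}. On a finite set every ultrafilter is non-free, so the sketch only illustrates the direction "not free implies of the above form".

```python
from itertools import combinations


def powerset(universe):
    """All subsets of `universe`, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_ultrafilter(family, universe):
    """Check the filter axioms and the ultrafilter dichotomy by brute force."""
    universe = frozenset(universe)
    subsets = powerset(universe)
    if frozenset() in family or universe not in family:
        return False
    for a in family:
        for b in family:
            if (a & b) not in family:          # closed under intersections
                return False
        for s in subsets:
            if a <= s and s not in family:     # closed under supersets
                return False
    # precisely one of U and J \ U belongs to the family
    return all((s in family) != ((universe - s) in family) for s in subsets)


def ultrafilters(universe):
    """Enumerate every ultrafilter on a small finite set."""
    subsets = powerset(universe)
    found = []
    for mask in range(1 << len(subsets)):
        family = frozenset(s for i, s in enumerate(subsets) if (mask >> i) & 1)
        if is_ultrafilter(family, universe):
            found.append(family)
    return found


def principal(j0, universe):
    """The family of all subsets of `universe` containing j0."""
    return frozenset(s for s in powerset(universe) if j0 in s)


J = frozenset({0, 1, 2})
ALL = ultrafilters(J)
print(len(ALL), all(fam in {principal(j, J) for j in J} for fam in ALL))
```

The search is exponential in the number of subsets, so it is only meant for very small J.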
Exercise 11: If 𝒰 contains the filter of Example 4.2, then we have for any j0 ∈ J
that J \ {j0} ∈ 𝒰; hence 𝒰 is free. Conversely, assume that 𝒰 is free but that
J \ J0 ∉ 𝒰 for some finite set J0 ⊆ J. Since 𝒰 is an ultrafilter, we must have
J0 ∈ 𝒰. For each j ∈ J0 there is some U_j ∈ 𝒰 such that j ∉ U_j (otherwise 𝒰
would not be free). The finite intersection J0 ∩ ⋂_{j∈J0} U_j belongs to 𝒰, because
𝒰 is a filter. But this intersection is empty, a contradiction. ⋆
Exercise 12: Note first that the relation [f] ∉ I0 means that f(j) ∈ S0 does not
hold almost everywhere. Since 𝒰 is an ultrafilter, this means that f(j) ∈ S0 holds
almost nowhere. By choosing an appropriate representative f, we may thus assume
that f(j) ∉ S0 for all j. Similarly, we may assume that g(j) ∉ S0 for all j.
Assume now that [f] ≠ [g], i.e. f(j) = g(j) does not hold almost everywhere.
Since 𝒰 is an ultrafilter, we have f(j) = g(j) almost nowhere, i.e. f(j) ≠ g(j) for
all j ∈ U where U ∈ 𝒰. Let U1 denote the set of all j ∈ U for which f(j) \ g(j) ≠ ∅,
and U2 := U \ U1. For all j ∈ U2, we have g(j) \ f(j) ≠ ∅. Since 𝒰 is an
ultrafilter, one of the sets U1 or U2 must belong to 𝒰. Without loss of generality,
we assume that U1 ∈ 𝒰. We find a function h : J → S such that
h(j) ∈ f(j) \ g(j) for all j ∈ U1. Then [h] ∈_𝒰 [f] but not [h] ∈_𝒰 [g], a
contradiction to (4.3). ⋆
Exercise 13: The condition is f_x(j) = f_y(j) for all except at most finitely many
j, i.e. the set D = {j : f_x(j) = f_y(j)} has a finite complement.
Indeed, if J \ D is finite, then we have D ∈ 𝒰 for any δ-incomplete ultrafilter 𝒰
(because 𝒰 is free by Corollary 4.13, and so D ∈ 𝒰 by Exercise 11). Hence,
f_x(j) = f_y(j) holds for almost all j.
Conversely, if J \ D is infinite, we "construct" a δ-incomplete ultrafilter 𝒰
with D ∉ 𝒰 as follows: Let ℱ0 denote the filter of Example 4.2, i.e. F ∈ ℱ0
if and only if J \ F is finite. The system ℬ = ℱ0 ∪ {J \ D} has the finite
intersection property, since for any F ∈ ℱ0 the set F \ D is infinite (and in
particular nonempty). Hence, ℬ generates a filter ℱ. Let 𝒰 be an ultrafilter
containing ℱ (Theorem 4.9). Exercise 11 implies that 𝒰 is free. Hence, 𝒰 is
δ-incomplete by Proposition 4.11. Since J \ D ∈ ℬ ⊆ 𝒰, we have D ∉ 𝒰, as
claimed. Thus, f_x(j) ≠ f_y(j) for almost all j, and so x ≠ y. ⋆
Exercise 14: 1. Let A = {a1, . . . , an}. By Proposition 4.19, we have ∗A = ϕ([F])
where F : J → S is the constant function F(j) := A. By Proposition 4.19 and
Theorem 4.18, we have x ∈ ∗A if and only if x = ϕ([f]) where [f] ∈_𝒰 [F]. Of
course, the constant functions f_i(j) := a_i (i = 1, . . . , n) satisfy [f_i] ∈_𝒰 [F], and so
ϕ([f_i]) ∈ ∗A, i.e. ∗a_i ∈ ∗A (by Proposition 4.19, we have ∗a_i = ϕ([f_i])). It remains
to prove that any other function f satisfying [f] ∈_𝒰 [F] must satisfy [f] =_𝒰 [f_i] for
some i ∈ {1, . . . , n}, i.e. that M_i := {j : f(j) = a_i} belongs to 𝒰 for some i. But
since J is the disjoint union of the sets M1, . . . , Mn and since 𝒰 is an ultrafilter,
precisely one of these sets must belong to 𝒰: Otherwise their complements would
belong to 𝒰 and thus also the intersection of these complements which is empty,
a contradiction to ∅ ∉ 𝒰.
2. As in 1., we have x ∈ ∗A if and only if x = ϕ([f]) where f : J → A. So we
have to prove that there are uncountably many equivalence classes of functions
f : J → A, if A is infinite. Assume that [f1], [f2], . . . is an enumeration of all such
equivalence classes. Since 𝒰 is δ-incomplete, we find by Proposition 4.12 countably
many pairwise disjoint J1, J2, . . . ⊆ J with ⋃J_n = J. Define now f : J → A such
that f(j) ∈ A \ {f1(j), . . . , fn(j)} for j ∈ J_n (this is possible, since A is infinite).
Then f(j) = f_n(j) holds at most on the set J1 ∪ · · · ∪ J_n, i.e. almost nowhere.
Hence, [f] ≠ [f_n] for all n, and so the equivalence class of f : J → A is not
contained in the given enumeration. Hence, the set of all equivalence classes is not
countable. ⋆
Exercise 15: The first statement follows immediately from the transfer principle.
If x is finite, then n ≥ |x| ≥ h = |h|, and so h is finite and belongs to σℕ by
Proposition 5.9. If x is infinite, then h > |x| − 1 ≥ n − 1 for each n ∈ σℕ which is
not possible for h ∈ σℕ. ⋆
Exercise 16: The equation x² = 2 has no solution: This follows by the transfer
principle from ∀x ∈ ℚ : x² ≠ 2. However, there is some x ∈ ∗ℚ with x² ≈ 2: We
have
∀n ∈ ℕ : ∃x ∈ ℚ : |x² − 2| ≤ n^{−1},
and the transfer principle implies
∀n ∈ ∗ℕ : ∃x ∈ ∗ℚ : |x² − 2| ≤ n^{−1}.
Fixing some h ∈ ℕ∞, we thus find some x ∈ ∗ℚ with |x² − 2| ≤ h^{−1}, i.e. x² ≈ 2. ⋆
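The standard sentence that is transferred here, ∀n ∈ ℕ : ∃x ∈ ℚ : |x² − 2| ≤ n^{−1}, can be made concrete. The following Python sketch is an illustration only; the function name `rational_near_sqrt2` and the choice of denominator 3n are assumptions made here, not taken from the book. It produces for each standard n a rational witness x using exact arithmetic.

```python
from fractions import Fraction
from math import isqrt


def rational_near_sqrt2(n: int) -> Fraction:
    """Return a rational x with |x*x - 2| <= 1/n, for a positive integer n."""
    k = 3 * n                 # denominator; 3n is large enough, see below
    m = isqrt(2 * k * k)      # largest integer m with m*m <= 2*k*k
    # From m*m <= 2k^2 < (m+1)^2 we get 0 <= 2 - (m/k)^2 <= 2m/k^2 < 3/k = 1/n.
    return Fraction(m, k)


for n in (1, 10, 1000):
    x = rational_near_sqrt2(n)
    print(n, x, abs(x * x - 2) <= Fraction(1, n), x * x != 2)
```

No single rational works for all n at once; only after transfer can one fix an infinite h and obtain a single x ∈ ∗ℚ with x² ≈ 2.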
Exercise 17: Consider the transitively bounded sentence
∀x, y ∈ ℝ : (y > 0 =⇒ ∃z ∈ ℚ : |x − z| < y).
Since ℚ is dense in ℝ, this sentence is true, and so the transfer principle implies
∀x, y ∈ ∗ℝ : (y > 0 =⇒ ∃z ∈ ∗ℚ : |x − z| < y).
Hence, for each x ∈ ∗ℝ and some infinitesimal y > 0 (even for all), we find some
z ∈ ∗ℚ with |x − z| < y, i.e. x ≈ z. ⋆
Exercise 18: If this set (denote it by M_x) were internal, then inf(∗ℝ) = M_x − x
would be internal by the internal definition principle. ⋆
Exercise 19: The statement follows by transfer of
∀x ∈ ℕ : ((∃n ∈ ℕ : x = 2n) ⇐⇒ ¬(∃n ∈ ℕ : x = 2n − 1)).
Concerning the map ∗ of Theorem 4.20, the elements of ∗ℕ correspond to
equivalence classes of functions f : J → ℕ (recall Proposition 4.19). Either the set
M = {j : f(j) is even}
or its complement belongs to 𝒰. The first case occurs if and only if f may for
almost all j be written in the form f(j) = 2g(j) where g : J → ℕ, and the
second case occurs if and only if f may for almost all j be written in the form
f(j) = 2g(j) − 1 where g : J → ℕ. Thus, the statement follows in view of
Example 5.6. ⋆
Exercise 20: We have limj→j0 f (j) = x if and only if for any open neighborhood U
of x we find some open neighborhood JU of j0 with f (j) ∈ U for j ∈ JU \ {j0 }. The
latter means f −1 (U ) = {j : f (j) ∈ U } ∈ F which by Lemma 5.27 is equivalent to
U ∈ f (F ). Hence, limj→j0 f (j) = x if and only if any open neighborhood U of x
is contained in f (F ), i.e. if and only if limj→F f (j) = x. ⋆
Exercise 21: 1. Let β(x) be the internal formula (x ≤ h0 =⇒ α(x)). Then β(h)
holds for all infinite h ∈ ℕ∞. By the permanence principle, we thus find some
n0 ∈ σℕ such that β(n) holds for all n ∈ ∗ℕ with n0 ≤ n ≤ h. In particular, β(n)
and thus α(n) holds for any finite n ∈ σℕ with n ≥ n0.
2. The proof is analogously reduced to the permanence principle for ℝ: Let β(x)
denote the internal formula (x > c =⇒ α(x)). Then β(d) holds for all
infinitesimals d ∈ inf(∗ℝ), d > 0. By the permanence principle for ℝ, we thus find some
ε0 ∈ σℝ+ such that β(ε) holds for all standard or nonstandard ε ∈ ∗ℝ with
0 < ε ≤ ε0. In particular, α(ε) holds for all ε ∈ ∗ℝ with c < ε ≤ ε0. ⋆
Exercise 22: Let α(n) be the internal formula |x_n| ≤ n^{−1}. Then α(n) holds for all
n ∈ σℕ. By the permanence principle, there is some h ∈ ℕ∞ such that α(n) holds
for all n ∈ ∗ℕ with n ≤ h. In particular, |x_n| ≤ n^{−1} for all n ∈ ℕ∞ with n ≤ h
which implies that x_n ≈ 0 for those n. ⋆
Exercise 23: The assumptions imply in both cases in view of Theorem 3.19 that
A and B are internal, i.e. there are sets C, D ∈ S with A ∈ ∗C and B ∈ ∗D. In
view of Corollary A.3, we may assume that all elements of C and D are entities.
Let C0 ⊆ C and D0 ⊆ D be the subsets of entities of C resp. D, and let U := ⋃C0
and V := ⋃D0. By Theorem 2.1, we have U, V ∈ S. Let F denote the system of
all functions f with dom(f) ⊆ U and rng(f) ⊆ V and recall (Exercise 83) that
∗F consists of all internal functions f with dom(f) ⊆ ∗U and rng(f) ⊆ ∗V.
1. By the hint, the sentence
∀x ∈ F : ∃y ∈ F : dom(y) = rng(x) ∧ rng(y) = dom(y) ∧ “y is one-to-one”
is true. Here, the shortcuts rng and dom use quantifiers over U and V. The
∗-transform implies the statement for the choice x = f and g = y.
2. By the Schröder-Bernstein theorem, the sentence
∀x1 , x2 ∈ F : ((“x1 , x2 one-to-one” ∧ rng(x1 ) ⊆ dom(x2 ) ∧ rng(x2 ) ⊆ dom(x1 ))
=⇒ ∃y ∈ F : (“y one-to-one” ∧ dom(y) = dom(x1 ) ∧ rng(y) = dom(x2 )))
is true (as before, rng and dom use quantifiers over U and V ). The ∗-transform
implies the statement for the choice x1 = f1 , x2 = f2 , g = y. ⋆
Exercise 24: The sets I := A ∩ B and A0 := A \ B are internal subsets of the ∗-finite
set A and thus ∗-finite by Theorem 6.13. Moreover, A = A0 ∪ I and A ∪ B = A0 ∪ B
where the sets on the right-hand side are disjoint. Hence, Theorem 6.14 implies
#A = #A0 + #I and #(A ∪ B) = #A0 + #B. Combining these two equations, the
statement follows. ⋆
Exercise 25: We have
X^{<ℕ} = {x ∈ P(ℕ × X) | ∃n ∈ ℕ : (x : {1, . . . , n} → X ∧ dom(x) ⊆ {1, . . . , n})}
where k ∉ dom(x) can be formulated in a bounded form, if we use that x ⊆ ℕ × X.
Note now that ∗P(ℕ × X) consists of all internal subsets of ∗(ℕ × X) = ∗ℕ × ∗X.
Hence,
∗(X^{<ℕ}) = {x ∈ ∗P(ℕ × X) | ∃n ∈ ∗ℕ :
(x : {1, . . . , n} → ∗X ∧ dom(x) ⊆ {1, . . . , n})},
where dom(x) ⊆ {1, . . . , n} has the expected meaning in view of the fact that we
have x ⊆ ∗ℕ × ∗X. Thus, the first statement follows. For the second statement,
note that
#_X(·) = {z ∈ P(X^{<ℕ} × ℕ) | ∃x ∈ P(ℕ × X), ∃n ∈ ℕ :
x : {1, . . . , n} → X ∧ dom(x) ⊆ {1, . . . , n} ∧ z = (x, n)},
and apply similarly as above the standard definition principle, using the fact that
we already proved that ∗(X^{<ℕ}) = (∗X)^{<∗ℕ} and that ∗P(X^{<ℕ} × ℕ) consists of all
internal subsets of ∗(X^{<ℕ}) × ∗ℕ = (∗X)^{<∗ℕ} × ∗ℕ. ⋆
Exercise 26: There is some n such that R and the order relation on R both belong
to ∗ S n . By Proposition 3.16, there is some n such that R ∈ ∗ S n . Consider the
sentence
∀x ∈ Sn , y ∈ Sn , z ∈ P(Sn ) : (α(x, y, z) =⇒ ∃w ∈ P(P(Sn ) × Sn ) : β(x, y, z, w))
where α(x, y, z, Sn ) is a transitively bounded sentence with the meaning “y is a

total order on x, and the elements of z are finite subsets of x” and β(x, y, z, w)
means
w : z → x ∧ ∀u ∈ z : w(u) = max{u},
where the order y is used to define max{u}. Note that in order to formalize α,

it suffices to quantify over S_n and ℕ. The ∗-transform of α then becomes “y is
a total order on x, and the elements of z are ∗ -finite subsets of x”. Hence the
∗-transform of the above true sentence implies the statement. ⋆

Exercise 27: Put U := ⋃𝒜. Then ∗U = ⋃∗𝒜 by Theorem A.4. Using the
notation of Exercise 25, we have
∀x ∈ 𝒜 : (c(x) = ∞ ∨˙ ∃y ∈ U^{<ℕ} : (“y is a bijection onto x” ∧ c(x) = #_U(y)))
Hence, in view of Exercise 25,
∀x ∈ ∗𝒜 : (∗c(x) = ∞ ∨˙ ∃y ∈ (∗U)^{<∗ℕ} :
(“y is a bijection onto x” ∧ ∗c(x) = #_{∗U}(y))).
This implies that for any x = B ∈ ∗𝒜 one of the following alternatives holds:
Either ∗c(B) = ∞, or ∗c(B) = h where y : {1, . . . , h} → B is an internal bijection.
But this means ∗c(B) = #B. For the second statement, note that A is finite if and
only if #(∗A) = ∗c(∗A) = ∗(c(A)) ≠ ∞. ⋆

Exercise 28: If the sequence x_n is bounded, all ∗x_h with h ∈ ℕ∞ are finite by
Theorem 7.2. Corollary 7.3 thus shows that the set of accumulation points is
given by {st(∗x_h) : h ∈ ℕ∞}. The supremum and infimum of this set are lim sup x_n
and lim inf x_n, respectively. If the sequence is unbounded, ∗x_h is infinite for some
h ∈ ℕ∞, and so st(∗x_h) is not defined. However, if one chooses the natural notation
st(∗x_h) = ±∞ if ∗x_h is infinite and ±∗x_h > 0, then the formula still holds: Indeed,
±∞ is an accumulation point of x_n if and only if st(∗x_h) = ±∞ for some h ∈ ℕ∞: For
positive sign, the proof is analogous to Theorem 7.2 (just drop the absolute values
in the proof). For negative sign, the proof is similar or may be reduced to the case
of positive sign by considering the sequence −x_n. ⋆
Exercise 29: (One of many solutions): Put x_n := (−1)^n. Then ∗x_h = 1 if h is even,
and ∗x_h = −1 if h is odd (recall Exercise 19). For the map ∗ of Theorem 4.20, the
map h ↦ ∗x_h associates to each equivalence class [f_h] with f_h : J → ℕ the value
∗1 if the set {j : f_h(j) even} belongs to 𝒰, and ∗(−1) otherwise. ⋆
Exercise 30: If x_n is a Cauchy sequence, then we find for any ε ∈ ℝ+ some n0
such that, in view of the transfer principle,
∀n, m ∈ ∗ℕ : (n, m ≥ n0 =⇒ |∗x_n − ∗x_m| ≤ ∗ε).
In particular, we have for any h, k ∈ ℕ∞ that |∗x_h − ∗x_k| ≤ ε. Since ε > 0 is
arbitrary here, it follows that ∗x_h ≈ ∗x_k.
Conversely, if ∗x_h ≈ ∗x_k for each h, k ∈ ℕ∞, we have for any ε ∈ ℝ+ that
the internal predicate
∀n, m ∈ ∗ℕ : (n, m ≥ y =⇒ |∗x_n − ∗x_m| < ∗ε)
holds for all y ∈ ℕ∞. By the permanence principle, it also holds for some y ∈ σℕ,
i.e. for some y = ∗n0 with n0 ∈ ℕ. The converse direction of the transfer principle
implies that x_n is a Cauchy sequence. ⋆
Exercise 31: If x is an interior point of A, we find some ε ∈ ℝ+ such that
∀y ∈ ℝ : (|x − y| < ε =⇒ y ∈ A).
The transfer principle implies that ∗A contains all points y ∈ ∗ℝ which satisfy
|x − y| < ∗ε, in particular all points with y ≈ x. Conversely, if mon(x) ⊆ ∗A, then
the internal predicate
∀y ∈ ∗ℝ : (|x − y| < ε =⇒ y ∈ ∗A)
holds for any ε ∈ inf(∗ℝ), ε > 0. By the Cauchy principle (permanence principle),
the predicate holds also for some standard ε = ∗ε0, ε0 ∈ ℝ+. The inverse form of
the transfer principle implies that A contains all y ∈ ℝ with |x − y| < ε0, i.e. x is
an interior point of A. ⋆
Exercise 32: Evidently, A = ∅ and A = ℝ have this property. One might suspect
that all open sets have this property, but actually A = ∅ and A = ℝ are the only
sets: Assume that A has this property. Then the internal formula
∀x ∈ ∗A, y ∈ ∗ℝ : (|x − y| ≤ z =⇒ y ∈ ∗A)
holds for any z ∈ inf(∗ℝ), z > 0. By the Cauchy principle (permanence principle),
this formula also holds for some z = ∗ε with ε ∈ ℝ+. The inverse direction of the
transfer principle implies
∀x ∈ A, y ∈ ℝ : (|x − y| ≤ ε =⇒ y ∈ A).
Thus, if A contains some point x, then an induction by n shows that A contains
the intervals [x − nε, x + nε] which means that A = ℝ.
There is no contradiction to Exercise 31, because not every x ∈ ∗A is infinitely
close to some standard point (otherwise A would be compact, but compact sets
are not open unless A = ∅). ⋆
Exercise 33: A must be the empty set: Indeed, by Exercise 32, we must have either
A = ∅ or A = ℝ. But the union of all monads is fin(∗ℝ) ≠ ∗ℝ, so that A = ℝ is
not possible. ⋆
Exercise 34: These are precisely those sets which have no accumulation points,
or, equivalently, those sets A which have the property that the intersection of
A with any bounded set is finite (this equivalence follows immediately from the
Bolzano-Weierstraß theorem (Corollary 7.5)).
Indeed, if x ∈ ℝ is an accumulation point of A, choose a sequence x_n ∈ A,
x_n ≠ x, such that x_n → x. Then ∗x_h ≈ ∗x for some h ∈ ℕ∞ by Theorem 7.1. The
transfer principle implies ∗x_h ≠ ∗x. Hence, ∗x_h ∈ ∗A is finite with ∗x_h ≠ ∗x =
∗(st(∗x_h)).
Conversely, assume that the intersection of A with any bounded set is finite.
Then, for K_n := {x ∈ ℝ : |x| ≤ n}, the set A ∩ K_n is finite, and so σ(A ∩ K_n) =
∗(A ∩ K_n) = ∗A ∩ ∗K_n. By the standard definition principle, ∗K_n = {x ∈ ∗ℝ :
|x| ≤ ∗n}. Consequently,
{x ∈ ∗A : |x| ≤ ∗n} = {∗y : y ∈ A ∧ |y| ≤ n}.
Taking the union over all n, we then find for any x ∈ ∗A ∩ fin(∗ℝ) some y ∈ A
with x = ∗y. Since then y = st(x), the set A has the required property. ⋆
Exercise 35: The set A is perfect if and only if for any x ∈ A the monad mon(x)
contains some point of ∗ A which is different from ∗ x.
Indeed, if A is perfect and x ∈ A, we find a sequence x_n ∈ A such that
x_n ≠ x and x_n → x. By Theorem 7.1, we have ∗x_h ≈ ∗x for some h ∈ ℕ∞.
Hence ∗x_h ∈ mon(x), and the transfer principle implies ∗x_h ∈ ∗A and ∗x_h ≠ ∗x.
Conversely, assume that A is not perfect. Then there is some x ∈ A and some
ε ∈ ℝ+ such that
∀y ∈ A : (y ≠ x =⇒ |x − y| > ε).
The transfer principle implies that all y ∈ ∗A with y ≠ ∗x satisfy |∗x − y| > ∗ε
and so in particular y ≉ ∗x, i.e. y ∉ mon(x). ⋆
Exercise 36: Note first that x_n := a + n(b − a)/h is an internal sequence as a
composition of internal functions (Exercise 7). Hence, the set {n ∈ ∗ℕ : ∗f(x_n) >
∗c} is an internal subset of ∗ℕ by the internal definition principle. Thus, it contains
a smallest element by Theorem 5.12. ⋆
Exercise 37: Assume that c := f′(0) exists. Then we have for any 0 ≠ x ≈ 0 that
|x|/x ≈ ∗c, i.e. ∗c ≈ 1 (for x > 0) and simultaneously ∗c ≈ −1 (for x < 0), a
contradiction. ⋆
Exercise 38: Put F(x) := f(x)g(x). Given h ≈ 0, put dx := h, df :=
∗f(∗x0 + dx) − ∗f(∗x0), dg := ∗g(∗x0 + dx) − ∗g(∗x0), and dF :=
∗F(∗x0 + h) − ∗F(∗x0). Then
df · ∗g(∗x0 + dx) + ∗f(∗x0) · dg = dF,
and so
dF/dx = (df/dx) · ∗g(∗x0 + dx) + ∗f(∗x0) · (dg/dx) ≈ ∗(f′(x0)g(x0) + f(x0)g′(x0)),
where we used Proposition 5.17 and the fact that ∗ g(∗ x0 + dx) ≈ ∗ g(∗ x0 ) which
in turn follows from dg = g ′ (x0 )dx ≈ 0. ⋆
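The algebraic identity df · g(x0 + dx) + f(x0) · dg = dF behind this solution holds for every increment dx, not only for infinitesimal ones, so it can be checked with exact rational arithmetic. In the Python sketch below the polynomials `f` and `g`, the point x0 = 2 and the helper `increments` are arbitrary illustrative choices, not taken from the book; a small standard dx stands in for the infinitesimal.

```python
from fractions import Fraction


def f(x):
    return x**3 - 2 * x + 1       # sample function, f'(x) = 3x^2 - 2


def g(x):
    return 5 * x**2 + x - 7       # sample function, g'(x) = 10x + 1


def F(x):
    return f(x) * g(x)


def increments(x0, dx):
    """Return (df, dg, dF) for the increment dx at the point x0."""
    df = f(x0 + dx) - f(x0)
    dg = g(x0 + dx) - g(x0)
    dF = F(x0 + dx) - F(x0)
    return df, dg, dF


x0 = Fraction(2)
for dx in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    df, dg, dF = increments(x0, dx)
    # exact identity used in the solution
    assert df * g(x0 + dx) + f(x0) * dg == dF
    # difference quotient versus f'(x0) g(x0) + f(x0) g'(x0)
    exact = (3 * x0**2 - 2) * g(x0) + f(x0) * (10 * x0 + 1)
    print(dx, float(dF / dx - exact))
```

The printed differences shrink with dx, which is the standard counterpart of dF/dx ≈ ∗(f′(x0)g(x0) + f(x0)g′(x0)).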
Exercise 39: The statement follows from transfer of
∀x, y ∈ [a, b] : (x < y =⇒ ∃z ∈ [a, b] : x < z < y ∧ (f(x) − f(y))/(x − y) = f′(z))
which is true by the classical mean value theorem. ⋆
Exercise 41: By Proposition 4.11, 𝒰 is δ-incomplete over J := ℕ. Hence, the
corresponding ultrafilter model of Theorem 4.20 provides a nonstandard map ∗.
An element h ∈ ℕ∞ in this model is given by h := ϕ([h0]) where h0(n) := n. Note
that g(n, x) := |[2^n x] − 2[2^{n−1} x]| defines a relation on ℕ × ℝ × ℝ. By Example 4.22,
we have for x ∈ ℝ and y ∈ ∗ℝ, y = ϕ([f_y]), f_y : J → ℝ that
(h, ∗x, y) ∈ ∗g ⇐⇒ (h0(j), x, f_y(j)) ∈ g for almost all j,
i.e. ∗g(h, x) = ϕ([F]) where F(j) := g(h0(j), x), i.e. F(n) = g(n, x). By
Theorem 5.32, we thus have
f(x) = st(g(h, x)) = st(∗g(h, x)) = lim_{n→𝒰} g(n, x),
where f is the function from Theorem 7.28. Since this function attains only the
values 0 and 1 and is nonmeasurable on [0, 1], the statement follows. ⋆
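To see why g only takes the values 0 and 1, it may help to compute a few values. The Python sketch below assumes that the bracket [·] denotes the integer part (floor), which is an assumption about the book's notation; under it, g(n, x) is the n-th binary digit of the fractional part of x. The function name `g` mirrors the text, and the sample point 5/8 is an arbitrary choice.

```python
from fractions import Fraction
from math import floor


def g(n: int, x: Fraction) -> int:
    """|[2^n x] - 2 [2^(n-1) x]| with [.] read as floor, for n >= 1."""
    return abs(floor(2**n * x) - 2 * floor(2**(n - 1) * x))


x = Fraction(5, 8)                  # binary expansion 0.101
digits = [g(n, x) for n in range(1, 5)]
print(digits)
```

For a fixed x the sequence n ↦ g(n, x) need not converge in the usual sense, which is why the limit along the ultrafilter is used in the solution.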
Exercise 42: Let 𝒜 be a nonempty system of entities A ∈ S which has the finite
intersection property and at most the cardinality of κ. By Lemma 8.7, we may
assume that 𝒜 is an entity. Put U := ⋃𝒜, and consider the relation
ϕ := {(x, y) ∈ 𝒜 × U | y ∈ x},
i.e. (A, y) ∈ ϕ if and only if y ∈ A. Note that ϕ ∈ S. Since 𝒜 has the finite
intersection property, ϕ is concurrent on 𝒜, and so there is some b which satisfies
∗ϕ on σ𝒜. By the standard definition principle for relations, we have
∗ϕ = {(x, y) ∈ ∗𝒜 × ∗U | y ∈ x}.
Since b satisfies ∗ϕ on σ𝒜, we have b ∈ ⋂σ𝒜. ⋆
Exercise 43: Since S is infinite, we may assume that ℕ ∈ S is an entity. By
Proposition 4.19, we have x ∈ ∗ℕ if and only if x = ϕ([f]) for some f : J → ℕ.
Since J is countable, the system of all functions f : J → ℕ has at most the
cardinality of ℝ which is the cardinality of P(ℕ) (this can be seen e.g. by the
estimate |ℕ^ℕ| ≤ |(2^ℕ)^ℕ| = |2^{ℕ×ℕ}| = |2^ℕ| = |P(ℕ)|). Thus, ∗ℕ has at most the
cardinality of P(ℕ). Hence, Proposition 8.11 implies that ∗ is not a κ-enlargement
when κ has a strictly larger cardinality than P(ℕ). In particular, ∗ is not an
enlargement. ⋆
Exercise 44: By Theorem 8.10, there is some ∗-finite ℬ ⊆ ∗𝒜 with σ𝒜 ⊆ ℬ.
Then ∗A0 ⊆ ⋃σ𝒜 ⊆ ⋃ℬ, and so
∃x ∈ ∗P(𝒜) : (“x is ∗-finite” ∧ ∗A0 ⊆ ⋃x).
The inverse form of the transfer principle implies that there is some finite x =
𝒜0 ∈ P(𝒜) with A0 ⊆ ⋃𝒜0.
A completely different solution proceeds as follows: Let ℬ denote the system
of all sets of the form A0 \ A (A ∈ 𝒜). If one cannot find a finite 𝒜0 ⊆ 𝒜 with
A0 ⊆ ⋃𝒜0, then ℬ has the finite intersection property, and so ⋂σℬ ≠ ∅. This
means ∗A0 \ ⋃σ𝒜 ≠ ∅, a contradiction to the assumption. ⋆
Exercise 45: By Theorem 8.10, there is some ∗-finite set R0 with σℝ ⊆ R0 ⊆ ∗ℝ.
The transfer of the statement in the hint implies that
∀y ∈ ∗P(ℝ), ε ∈ ∗ℝ+ : (“y ∗-finite” =⇒ ∃n ∈ ∗ℕ : ∀x ∈ y : ∗α(x, n, ε))
where ∗α(x, n, ε) is a shortcut for
∃z ∈ ∗ℤ : |nx − z| < ε.
Let c ∈ inf(∗ℝ), c > 0. By the transfer principle, we find some ε0 ∈ ∗ℝ+ such that
∀x ∈ ∗ℝ : ((∃n ∈ ∗ℕ : ∗α(x, n, ε0)) =⇒ |sin(πnx)| < c).
Applying the former sentence with ε = ε0 and y = R0, we find some h =
n ∈ ∗ℕ such that |sin(πhx)| < c for all x ∈ R0. This implies sin(πhx) ≈ 0 for all
x ∈ σℝ. ⋆
Exercise 46: Necessity is proved as in Theorem 8.12. For sufficiency, let ℬ be an
internal nonempty system of entities with the finite intersection property and at
most the cardinality of κ. Observe that U := ⋃ℬ and P := ℬ × U are internal
by Theorem 3.19. Consider the relation
ϕ := {z ∈ P | ∃x ∈ ℬ : ∃y ∈ x : z = (x, y)}
which is internal by the internal definition principle. Note that dom(ϕ) ⊆ ℬ has
at most the cardinality of κ. We have (B, y) ∈ ϕ for some B ∈ ℬ if and only if
y ∈ B. Since ℬ has the finite intersection property, the relation ϕ is concurrent
on ℬ. Hence, it is satisfied on ℬ, i.e. ⋂ℬ ≠ ∅. ⋆

Exercise 47: Assume contrary that A \ A0 = ∅ for each finite A0 = ∅. Let
B be the system of all sets of the form A0 \ A with A ∈ A . Then B has the

finite intersection property, and so B = ∅. But this means that A ⊆ A , a
contradiction. ⋆
Exercise 49: The proof is analogous to Theorem 10.1 with p in place of ‖·‖. The
only difference is in the proof of the analogue to Lemma 10.2: We have to prove
that there is a constant c ∈ ℝ such that the functional F(x0 + λx1) := f(x0) + λc
(x0 ∈ X0, λ ∈ ℝ) satisfies F(x0 + λx1) ≤ p(x0 + λx1), i.e. f(x0) + λc ≤ p(x0 + λx1).
In the case λ > 0, we may divide by λ and need the estimate f(x) + c ≤ p(x + x1)
(x ∈ X0). In case λ < 0, we divide by −λ and need f(x) − c ≤ p(x − x1) (x ∈ X0).
The case λ = 0 is trivial. Thus, we have to find some c satisfying
f(x) − p(x − x1) ≤ c ≤ p(x + x1) − f(x) (x ∈ X0).
Since for x, y ∈ X0 the relation
f(x) + f(y) = f(x + y) ≤ p(x + y) = p((x + x1) + (y − x1)) ≤ p(x + x1) + p(y − x1)
holds, we find
sup_{y∈X0} (f(y) − p(y − x1)) ≤ inf_{x∈X0} (p(x + x1) − f(x)),
so that we may just choose some c in between these two quantities.
Theorem 10.1 is for 𝕂 = ℝ a special case of the above result, because if
f ∈ X0∗, we may put p(x) := ‖f‖‖x‖, and then find some linear F : X → ℝ
extending f such that F(x) ≤ p(x), i.e. F(x) ≤ ‖f‖‖x‖ (x ∈ X). Replacing x
by −x, we find also −F(x) = F(−x) ≤ ‖f‖‖−x‖ = ‖f‖‖x‖, so that we actually
have |F(x)| ≤ ‖f‖‖x‖, i.e. ‖F‖ ≤ ‖f‖, as required. ⋆
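The one-dimensional extension step can be tried out on a toy example. The Python sketch below is an illustration under assumptions chosen here, not taken from the book: X = ℝ², X0 = ℝ × {0}, f(t, 0) = t, p the ℓ¹ norm, and x1 = (0, 1). It evaluates the bounds f(x) − p(x − x1) and p(x + x1) − f(x) on a finite grid of points of X0 (so the supremum and infimum are only sampled) and checks that constants c between them give F ≤ p on the sampled points.

```python
from fractions import Fraction


def p(v):
    """Sublinear functional: the l1 norm on R^2 (illustrative choice)."""
    return abs(v[0]) + abs(v[1])


def f(t):
    """Linear functional on X0 = {(t, 0)}: f(t, 0) = t, dominated by p."""
    return t


X1 = (Fraction(0), Fraction(1))                    # the new direction x1
GRID = [Fraction(k, 4) for k in range(-40, 41)]    # sample points t and lambda

# sampled versions of sup (f(x) - p(x - x1)) and inf (p(x + x1) - f(x))
lower = max(f(t) - p((t - X1[0], -X1[1])) for t in GRID)
upper = min(p((t + X1[0], X1[1])) - f(t) for t in GRID)


def dominated(c):
    """Check F(x0 + lam*x1) = f(x0) + lam*c <= p(x0 + lam*x1) on the grid."""
    return all(f(t) + lam * c <= p((t + lam * X1[0], lam * X1[1]))
               for t in GRID for lam in GRID)


print(lower, upper, dominated(Fraction(0)), dominated(Fraction(2)))
```

In this example every c between the two sampled bounds yields an admissible extension, while a constant outside that range does not.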
Exercise 50: Assume C = {x1, x2, . . .}. We may successively extend f to a linear
functional F_n on the subspace U_n := span(X0 ∪ {x1, . . . , xn}) such that F_n is an
extension of F_{n−1} with ‖F_n‖ = ‖f‖ (trivial induction by n). Put U := ⋃U_n, and
F0(x) := F_n(x) for x ∈ U_n. Clearly, U is a subspace, and F0 is well-defined and
linear and satisfies |F0(u)| ≤ ‖f‖‖u‖ (u ∈ U).
Given some x ∈ X choose a sequence u_n ∈ U with u_n → x (such a sequence
exists, since the closure of U contains the closure of C, which is X). Then u_n is a
Cauchy sequence which implies that also F0(u_n) is a Cauchy sequence, because
|F0(u_n) − F0(u_j)| = |F0(u_n − u_j)| ≤ ‖f‖‖u_n − u_j‖.
Thus F0(u_n) → c converges. Moreover, c is independent of the particular choice of
the sequence u_n, since if v_n ∈ U is another sequence with v_n → x, then
|F0(u_n) − F0(v_n)| ≤ ‖f‖‖u_n − v_n‖ → 0.
Thus, we may put F(x) := c. The function F : X → ℝ defined in this way is
linear, because if u_n, v_n ∈ U satisfy u_n → x and v_n → y, then w_n := u_n + v_n ∈ U
converges to x + y, and F0(w_n) = F0(u_n) + F0(v_n) → F(x) + F(y) which implies
F(x + y) = F(x) + F(y); similarly, F0(λu_n) = λF0(u_n) → λF(x) implies F(λx) =
λF(x). Moreover, since |F0(u_n)| ≤ ‖f‖‖u_n‖ → ‖f‖‖x‖, we find |F(x)| ≤ ‖f‖‖x‖,
i.e. F ∈ X∗ satisfies ‖F‖ ≤ ‖f‖. ⋆
Exercise 51: Let ℱ denote the system of all maps A : X → Y in the form (10.1)
with finitely many f_n ∈ X∗ and y_n ∈ Y, i.e.
ℱ := {z ∈ Y^X | ∃y ∈ Y^{<ℕ}, f ∈ (X∗)^{<ℕ} : (#_Y(y) = #_{X∗}(f) ∧
∀x ∈ X : z(x) = Σ_{n=1}^{#_Y(y)} (f(n))(x) y(n))}.
The standard definition principle implies in view of Theorem 3.21 and Exercise 25
that ∗ℱ consists of all internal maps Z : ∗X → ∗Y for which there are ∗-finite
sequences y1, . . . , yh ∈ ∗Y and f1, . . . , fh ∈ ∗(X∗) such that
Z(x) = Σ_{n=1}^{h} f_n(x) y_n (x ∈ ∗X).
Consider now the binary relation
ϕ := {(x, y) ∈ X × ℱ | y(x) = A(x)}.
Then ϕ is concurrent on X: Given x1, . . . , xn ∈ X, let X0 be the linear hull of
x1, . . . , xn. The restriction A : X0 → Y may be written in the form (10.1), i.e. we
find some F ∈ ℱ with A(x) = F(x) for all x ∈ X0. This means
(x1, F), . . . , (xn, F) ∈ ϕ.
Since ∗ is an X-enlargement, ∗ϕ is satisfied on σX, i.e. we find some Z ∈ ∗ℱ such
that (∗x, Z) ∈ ∗ϕ for any x ∈ X. The standard definition principle for relations
implies that Z(∗x) = ∗A(∗x) = ∗(A(x)) for any x ∈ X. Since Z ∈ ∗ℱ, A has the
desired form. ⋆
Exercise 52: We first prove an analogue of Lemma 10.4: For any finite number of
elements x1, ..., xK ∈ X, xk = (ξk,n)n and any ε > 0 there exist real numbers
η1, ..., ηN ∈ ℝ such that
\[ f(x_k) = \sum_{n=1}^{N} \eta_n\, \xi_{k,n} \qquad (k = 1, \dots, K). \]
Indeed, as in the proof of Lemma 10.4, we may assume that x1, ..., xK are linearly
independent. Choose N such that the truncated vectors yk := (ξk,1, ..., ξk,N) are
linearly independent. Defining
g(λ1 y1 + ··· + λK yK) := f(λ1 x1 + ··· + λK xK)
and extending g to ℝ^N, we find as in the proof of Lemma 10.4 the required numbers
ηn := g(en) where e1, ..., eN are the canonical base vectors of ℝ^N.
Consider now the relation
\[ \varphi := \Bigl\{ (x, y) \in X \times \mathbb{R}^{<} \mid f(x) = \sum_{n=1}^{\#_{\mathbb{R}}(y)} y(n)\, x(n) \Bigr\}. \]
By what we just proved, ϕ is concurrent on X. Since ∗ is an enlargement,
Theorem 8.10 implies that there is some y ∈ ∗(ℝ^<) which satisfies ∗ϕ on σ dom(ϕ),
i.e. (∗x, y) ∈ ∗ϕ for any x ∈ X. By Exercise 25, y is a ∗-finite internal sequence
η1, ..., ηh where h := (∗#X(·))(y). ⋆
Exercise 53: Assume there is some y = (ηn )n such that (10.2) is a Hahn-Banach
limit. Given some n, consider the sequence defined by ξk := 0 (k ≠ n) and ξn := 1.
Since fy is a Hahn-Banach limit, its value for this sequence (which converges to
0) must be 0. On the other hand, by (10.2), this value must be ηn . Hence ηn = 0.
Since this argument holds for any n, we would have fy (x) = 0 for any x ∈ ℓ∞
by (10.2), contradicting the fact that fy is a Hahn-Banach limit.
Let c ⊆ ℓ∞ be the subspace of all convergent sequences. For x = (ξn )n ∈ c,
define f0(x) = lim ξn. Then f0 ∈ c∗ (with ‖f0‖ = 1), and so we may use the
Hahn-Banach extension theorem to extend f0 to an element f ∈ (ℓ∞)∗ which thus is
a Hahn-Banach limit. ⋆

Exercise 54: Fixing some h ∈ ℕ∞, the functional f((ξn)n) := st(∗ξh) is such
a Hahn-Banach limit. Indeed, Theorem 10.6 implies that f is in fact a Hahn-
Banach limit (put η1 = ··· = ηh−1 = 0 and ηh = 1). Moreover, st(∗ξh) is always
an accumulation point of a bounded sequence x = (ξn)n by Corollary 7.3 (note
that ∗ξh is finite, because |ξn| ≤ ‖x‖∞ implies by the transfer principle that
|∗ξh| ≤ ∗(‖x‖∞)). ⋆
Exercise 55: Given x = (ξn)n ∈ ℓ∞, put l := lim inf ξn and L := lim sup ξn. Given
ε ∈ ℝ+, there is some N ∈ ℕ such that l − ε ≤ ξn ≤ L + ε holds for each n ≥ N.
Putting ηn := ξn − (l − ε), we thus have ηn+N ≥ 0, and so 0 ≤ f ((ηn+N )n ) =
f ((ηn )n ) = f (x) − (l − ε) which implies f (x) ≥ l − ε. The estimate f (x) ≤ L + ε
follows analogously by putting ηn := (L + ε) − ξn in view of 0 ≤ f ((ηn+N )n ) =
f ((ηn )n ) = L + ε − f (x). We thus have proved l − ε ≤ f (x) ≤ L + ε. Let ε → 0
to obtain the estimate in the statement. This estimate then implies that f is a
Hahn-Banach limit: On the one hand, this estimate implies that the functional f
is bounded with ‖f‖ ≤ 1, because |f(x)| ≤ max{|l|, |L|} ≤ ‖x‖∞. On the other
hand, if ξn → c converges, then l = L = c, and so f(x) = c. To see that even
‖f‖ = 1, observe that f(x) = 1 where x is the constant sequence ξn = 1. ⋆
Exercise 56: Since (ξn+1 )n = −x, we must have f (x) = f ((ξn+1 )n ) = f (−x) =
−f (x), and so f (x) = 0. Since 0 is not an accumulation point of the sequence x,
a Banach-Mazur limit with the required properties does not exist. ⋆
Exercise 57: The set X0 of all x for which ζn converges is a linear subspace of
X := ℓ∞ . Since ζn depends linearly on x, it follows that f is linear and p is
sublinear. Hence, there is a linear functional F ∈ ℓ∗∞ which extends f and satisfies
F (x) ≤ p(x) for all x ∈ ℓ∞ . We claim that each such functional F is a Banach-
Mazur limit: If x is the constant sequence ξn = c, then ζn = c, and so F (x) =
f (x) = c. The estimate F (x) ≤ p(x) implies that F is positive: Indeed, −F (x) =
F (−x) ≤ p(−x) shows F (x) ≥ −p(−x). Thus, if x = (ξn )n with ξn ≥ 0, we have
F (x) ≥ −p(−x) ≥ 0, since the definition immediately implies p(−x) ≤ 0. Finally,
F is shift invariant: If x = (ξn )n and y = (ξn+1 )n , then
\[ F(x) - F(y) = F(x - y) \le p(x - y) = \limsup_{n\to\infty} (\xi_1 - \xi_n)/n = 0, \]
and analogously
\[ F(y) - F(x) = F(y - x) \le p(y - x) = \limsup_{n\to\infty} (\xi_n - \xi_1)/n = 0, \]
which together implies F(x) = F(y), as required. ⋆
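The telescoping behind the shift invariance in Exercise 57 can be checked numerically. The following is only a sketch under an assumption: the averages ζn are taken to be the Cesàro means (ξ1 + ··· + ξn)/n, which this solution does not restate, and the test sequence is a hypothetical bounded 0/1 sequence chosen for illustration. With this indexing the means of a sequence and of its shift differ by a single boundary term divided by n.

```python
from fractions import Fraction

def cesaro_mean(seq, n):
    """Return (seq[0] + ... + seq[n-1]) / n as an exact fraction."""
    return Fraction(sum(seq[:n]), n)

# Hypothetical bounded test sequence: xi_n = 1 if n is a multiple of 3, else 0.
N = 3000
x = [1 if n % 3 == 0 else 0 for n in range(1, N + 2)]  # xi_1 .. xi_{N+1}
y = x[1:]                                              # shifted sequence (xi_{n+1})_n

for n in (10, 100, 1000, 3000):
    diff = cesaro_mean(x, n) - cesaro_mean(y, n)
    # Telescoping: all inner terms cancel, one boundary term over n remains.
    assert diff == Fraction(x[0] - x[n], n)
    # Since the entries lie in {0, 1}, the difference is at most 1/n.
    assert abs(diff) <= Fraction(1, n)
```

Because the sequence is bounded, the boundary term divided by n tends to 0, which is the only property of p(x − y) that the argument uses.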

Exercise 58: If x is almost convergent to c, then we find for any ε ∈ ℝ+ some
n0 ∈ ℕ such that, in view of the transfer principle,
\[ \forall n \in {}^*\mathbb{N} : \Bigl( n \ge {}^*n_0 \implies \forall k \in {}^*\mathbb{N} : \Bigl| \frac{1}{n} \sum_{m=1}^{n} {}^*\xi_{m+k} - {}^*c \Bigr| < {}^*\varepsilon \Bigr). \]
Given h ∈ ℕ∞ and k ∈ ∗ℕ, we thus find that
\[ \Bigl| \frac{1}{h} \sum_{n=1}^{h} {}^*\xi_{n+k} - {}^*c \Bigr| < {}^*\varepsilon. \]
Since this holds for any ε ∈ ℝ+, (10.10) follows.
Conversely, let (10.10) be satisfied. Given ε ∈ ℝ+, the internal formula
\[ \forall k \in {}^*\mathbb{N} : \Bigl| \frac{1}{n} \sum_{m=1}^{n} {}^*\xi_{m+k} - {}^*c \Bigr| < {}^*\varepsilon \]
holds for all n ∈ ℕ∞. By the permanence principle, we find some n0 ∈ ℕ such
that this formula holds for all n = ∗n where n ∈ ℕ satisfies n ≥ n0. By the inverse
form of the transfer principle, we find
\[ \forall k \in \mathbb{N} : \Bigl| \frac{1}{n} \sum_{m=1}^{n} \xi_{m+k} - c \Bigr| < \varepsilon \]
for any n ∈ ℕ with n ≥ n0. But this means that x is almost convergent to c.
For the second statement, fix h ∈ ℕ∞. Applying (10.10) two times, we find
that
\[ \frac{1}{h}\,{}^*\xi_k \approx {}^*c - \frac{h-1}{h} \cdot \frac{1}{h-1} \sum_{n=1}^{h-1} {}^*\xi_{n+k+1} \approx {}^*c - \frac{h-1}{h}\,{}^*c = \frac{{}^*c}{h}. \]
Hence, |∗ξk|/h ≤ |∗c|/h + 1 which implies |∗ξk| ≤ |∗c| + h. In particular,
∃n ∈ ∗ℕ : ∀k ∈ ∗ℕ : |∗ξk| ≤ |∗c| + n.
The inverse form of the transfer principle implies that (ξk)k is bounded. ⋆
Exercise 59: The answer is negative: Choose A such that χA corresponds to the
sequence (0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, . . .). Then A has the density d = 1/2, but the
sequence χA = (an)n is not almost convergent to 1/2, because for any n we find
some k such that a1+k = ··· = an+k = 0, and so (1/n) ∑_{m=1}^{n} am+k cannot converge
to 1/2 uniformly in k. Theorem 10.13 implies that there is some Banach-Mazur
limit f with f (χA ) = 1/2, and by Proposition 10.11, we find even a Banach-Mazur
limit of Cesàro type with this property. ⋆
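The counterexample of Exercise 59 can be explored with a finite prefix of the sequence. This is a sketch only: the helper names block_sequence and window_mean are hypothetical, and a finite computation can illustrate but not prove the statement. It shows that the proportion of ones is exactly 1/2 at the end of each complete block, while for every window length n there is a shift k whose window average is 0.

```python
from fractions import Fraction

def block_sequence(num_blocks):
    """Prefix of 0,1, 0,0,1,1, 0,0,0,1,1,1, ... made of num_blocks blocks."""
    seq = []
    for j in range(1, num_blocks + 1):
        seq.extend([0] * j)  # j zeros ...
        seq.extend([1] * j)  # ... followed by j ones
    return seq

def window_mean(seq, n, k):
    """Return (a_{1+k} + ... + a_{n+k}) / n for the 1-based sequence a = seq."""
    return Fraction(sum(seq[k:k + n]), n)

a = block_sequence(60)  # 60 complete blocks, 60 * 61 = 3660 terms

# Proportion of ones at the end of the last complete block.
density = Fraction(sum(a), len(a))

# Blocks 1 .. n-1 occupy n * (n - 1) terms, so block n starts right after them
# and opens with n consecutive zeros.
n = 25
k = n * (n - 1)
zero_window = window_mean(a, n, k)      # average over n zeros
one_window = window_mean(a, n, k + n)   # average over the n ones that follow
```

The window averages thus take both values 0 and 1 for the same n, so they cannot be uniformly close to 1/2 in k.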
Exercise 60: If 𝒰 is not an ultrafilter, then there is some filter ℱ ⊋ 𝒰 (Propo-
sition 4.7). Then ∅ ≠ mon(ℱ) ⊊ mon(𝒰) by Theorem 12.6, and so ℱ satisfies
neither mon(𝒰) ⊆ mon(ℱ) nor mon(𝒰) ∩ mon(ℱ) = ∅.
Conversely, let 𝒰 be an ultrafilter and ℱ be some filter with
mon(𝒰) ∩ mon(ℱ) ≠ ∅. Then we have for any U ∈ 𝒰, F ∈ ℱ that ∗U ∩ ∗F ≠ ∅
and so U ∩ F ≠ ∅. Let ℱ0 denote the system of all sets of the form U ∩ F with
U ∈ 𝒰, F ∈ ℱ. By what we proved, ∅ ∉ ℱ0. It follows readily that ℱ0 is a
filter, and for the choice F = X resp. U = X, we find 𝒰, ℱ ⊆ ℱ0. Since 𝒰 is
an ultrafilter, we have 𝒰 = ℱ0, and so ℱ ⊆ ℱ0 = 𝒰, i.e. mon(𝒰) ⊆ mon(ℱ).
Moreover, if ℱ = 𝒰′ is an ultrafilter, then the relation 𝒰′ = ℱ ⊆ ℱ0 implies
𝒰′ = ℱ0 = 𝒰. ⋆
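The set-theoretic part of the converse direction in Exercise 60 (forming the system of all intersections U ∩ F) can be tried out on a finite set. This is a toy sketch under clearly limiting assumptions: on a finite set every ultrafilter is principal, monads play no role, and the names powerset and filter_generated_by are hypothetical helpers. The compatibility hypothesis is modelled by requiring that every member of the one filter meets every member of the other.

```python
from itertools import combinations

X = frozenset({0, 1, 2})

def powerset(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def filter_generated_by(base):
    """All subsets of X that contain `base` (a principal filter on X)."""
    return {s for s in powerset(X) if base <= s}

U = filter_generated_by(frozenset({0}))     # principal ultrafilter at the point 0
F = filter_generated_by(frozenset({0, 1}))  # a filter whose members all meet those of U

# The system of all sets of the form (member of U) ∩ (member of F).
F0 = {u & f for u in U for f in F}
```

In this example the empty set does not belong to F0, F0 coincides with U, and consequently F is contained in U, mirroring the chain of inclusions in the solution.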
Exercise 61: 1. Let A ⊆ X be compact, and A0 ⊆ X be closed with A0 ⊆ A. By

Theorem 12.39, we have to prove that for each y ∈ ∗ A0 there is some x ∈ A0 with
y ≈O ∗ x. Since A is compact and y ∈ ∗ A0 ⊆ ∗ A, we find some x ∈ A with y ≈O ∗ x,
i.e. x ∈ st(y). But since A0 is closed and y ∈ ∗ A0 , Theorem 12.35 implies x ∈ A0 .
2. If A ⊆ X is compact, then each y ∈ ∗ A is infinitely close to some x ∈ σ A by
Theorem 12.39. Corollary 12.36 implies that A is closed. ⋆
Exercise 62: If A is compact, then Theorem 12.39 implies that st−1 (A) ∩ ∗ A = ∗ A
is standard and thus internal. Conversely, let B := ∗ A∩st−1 (A) be internal, and let
𝒞 be an open cover of A. For any y ∈ B there is some x ∈ A with y ≈O ∗x. There is
some O ∈ 𝒞 with x ∈ O. Then y ∈ ∗O, and so y ∈ ⋃ σ𝒞. This proves B ⊆ ⋃ σ𝒞.
Theorem 8.16 implies that there is some finite 𝒞0 ⊆ 𝒞 with B ⊆ ⋃ σ𝒞0. In
particular, σA ⊆ B ⊆ ⋃ σ𝒞0 which implies A ⊆ ⋃ 𝒞0. Hence, A is compact. ⋆
Exercise 63: Let 𝒞 be an open cover of A0 := st(A). For any a ∈ A, we find in view
of A ⊆ ns(∗X) some a0 ∈ X with a ≈O ∗a0. We have a0 ∈ st(a) ⊆ st(A) = A0.
Hence, there is some O ∈ 𝒞 with a0 ∈ O, and so a ∈ ∗O. This proves A ⊆ ⋃ σ𝒞. By
Theorem 8.16, there is a finite 𝒞0 ⊆ 𝒞 with A ⊆ ⋃ σ𝒞0. Since 𝒞0 = {O1, ..., On}
is finite, we have
⋃ σ𝒞0 = ∗O1 ∪ ··· ∪ ∗On = ∗(O1 ∪ ··· ∪ On) = ∗(⋃ 𝒞0),
and so A ⊆ ∗(⋃ 𝒞0). Theorem 12.35 thus implies A0 = st(A) ⊆ st(∗(⋃ 𝒞0)) =
cl(⋃ 𝒞0), the closure of ⋃ 𝒞0, and so A0 is compact by Lemma 12.43. ⋆
Exercise 64: Let X be compact, and ℱ be an ultrafilter. Choose some y ∈
mon(ℱ). Since X is compact, there is some x ∈ X with y ≈O ∗x, i.e. y ∈
mon(x) = mon(𝒰(x)). Hence, mon(ℱ) ∩ mon(𝒰(x)) ≠ ∅ which implies
mon(ℱ) ⊆ mon(𝒰(x)) = mon(x) by Exercise 60. ⋆
Exercise 65: Let f : X → Y , and K ⊆ X be compact. We have to prove that

I := f (K) is compact. Let y ∈ ∗ I = ∗ f (∗ K) (Theorem 3.13). Then we find some
x ∈ ∗ K with y = ∗ f (x). Since K is compact, there is some x0 ∈ K with x ≈O ∗ x0
(Theorem 12.39). Hence, Theorem 12.57 implies that y = ∗f(x) ≈O ∗f(∗x0) =: y0.
Since y0 = ∗(f(x0)) ∈ σI, Theorem 12.39 implies that I is compact. ⋆
Exercise 66: If O ⊆ Y is open, then we have for any y ∈ O that mon(y) ⊆
∗O (Theorem 12.34). If x ∈ P := f⁻¹(O), then ∗f(mon(x)) ⊆ mon(f(x)) ⊆
∗O by Theorem 12.57. Hence, we have in view of Theorem 3.13 that mon(x) ⊆
(∗f)⁻¹(∗O) = ∗P, and so P is open by Theorem 12.34. ⋆
Exercise 67: Given U ∈ 𝒰, we find some ε ∈ ℝ+ such that
Bε := {(x, y) ∈ X × X | d(x, y) < ε}
is contained in U. The standard definition principle for relations implies
∗Bε = {(x, y) ∈ ∗X × ∗X | ∗d(x, y) < ∗ε}.
Hence, if we find for any ε ∈ ℝ+ some x ∈ X with ∗d(∗x, y) < ∗ε, we find
in particular for any U ∈ 𝒰 some x ∈ X with (∗x, y) ∈ ∗Bε ⊆ ∗U, and so
y ∈ pns(∗X). Conversely, if y ∈ pns(∗X) and ε ∈ ℝ+ are given, we find in view of
Bε ∈ 𝒰 some x ∈ X with (∗x, y) ∈ ∗Bε which means ∗d(∗x, y) < ∗ε. ⋆
Exercise 68: Necessity has been proved in Corollary 13.13. For sufficiency, suppose
that any Cauchy sequence converges. By Theorem 13.16, we have to prove that
pns(∗ X) ⊆ ns(∗ X). Thus, let y ∈ pns(∗ X) be given. By Exercise 67, we find
for each n some xn ∈ X with ∗d(∗(xn), y) < 1/∗n. Using the triangle inequality
for ∗d (which holds by the transfer principle), we thus find for any ε ∈ ℝ+ that
∗d(∗(xn), ∗(xm)) < ∗ε for all n, m ∈ ℕ with n, m ≥ 2/ε. But since this estimate
means by the inverse form of the transfer principle that d(xn, xm) < ε, we may
conclude that xn is a Cauchy sequence (xn is a sequence by a countable form of
the axiom of choice). By assumption, xn → x for some x ∈ X.
We claim that y ≈U ∗x. Given ε ∈ ℝ+, we have to prove that ∗d(∗x, y) < ∗ε
(Proposition 13.6). But choosing some n ∈ ℕ with n > 2/ε such that d(x, xn) <
ε/2, we find
∗d(∗x, y) ≤ ∗d(∗x, ∗(xn)) + ∗d(∗(xn), y) < ∗ε/2 + 1/∗n < ∗ε,
as desired. ⋆
Exercise 69: If U ∈ 𝒰 and V ⊆ Y × Y satisfies V ⊇ U ∩ (Y × Y), put W := U ∪ V.
Then W ∈ 𝒰 (because W ⊇ U), and V = W ∩ (Y × Y). Hence V ∈ 𝒰Y. If
U1, U2 ∈ 𝒰, then U := U1 ∩ U2 ∈ 𝒰, and thus (U1 ∩ (Y × Y)) ∩ (U2 ∩ (Y × Y)) =
U ∩ (Y × Y) ∈ 𝒰Y. Moreover, ∆Y := {(y, y) : y ∈ Y} ⊆ U ∩ (Y × Y) for any
U ∈ 𝒰, i.e. ∆Y ⊆ V for any V ∈ 𝒰Y. In particular, 𝒰Y is a filter which satisfies
the first property of Definition 13.1.
If U ∈ 𝒰, then U⁻¹ ∈ 𝒰, and so (U ∩ (Y × Y))⁻¹ = U⁻¹ ∩ (Y × Y) ∈ 𝒰Y.
Moreover, there is some V ∈ 𝒰 with V² ⊆ U. Then W := V ∩ (Y × Y) ∈ 𝒰Y, and
W² ⊆ V² ⊆ U and W² ⊆ (Y × Y)² = Y × Y. Hence, W² ⊆ U ∩ (Y × Y). ⋆
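The last step of Exercise 69 is a statement about composing relations, so it can be checked on a small finite example. The following sketch uses hypothetical data: X = {0, ..., 4}, Y = {0, 1, 2}, and a relation V that merely resembles an entourage (it is symmetric and contains the diagonal). The helper compose is an illustrative name for relational composition.

```python
def compose(r, s):
    """Relational composition: all (x, z) with (x, y) in r and (y, z) in s for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

X = range(5)
Y = {0, 1, 2}
Y_sq = {(a, b) for a in Y for b in Y}                   # Y x Y

# Hypothetical entourage-like relation on X: points at distance at most 1.
V = {(a, b) for a in X for b in X if abs(a - b) <= 1}

W = V & Y_sq          # W := V ∩ (Y × Y)
W2 = compose(W, W)    # W²
V2 = compose(V, V)    # V²

# W² is contained both in V² and in Y × Y, hence in their intersection.
assert W2 <= V2
assert W2 <= Y_sq
assert W2 <= (V2 & Y_sq)
```

Any U with V² ⊆ U then satisfies W² ⊆ U ∩ (Y × Y), which is the inclusion used in the solution.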
Exercise 70: Let X be complete, and Y ⊆ X be closed with the inherited uniform
structure. By Theorem 13.16, we have to prove that any y ∈ pns(∗ Y ) belongs
to ns(∗ Y ). It follows from the definition (and Proposition 13.19) that pns(∗ Y ) ⊆
pns(∗ X). Since X is complete, Theorem 13.16 implies that there is some x ∈ X
with y ≈O ∗ x. Since Y is closed, we find in view of Theorem 12.35 and y ∈ ∗ Y
that x ∈ Y . We are done if we can prove that y ∈ ns(∗ Y ), i.e. y ∈ ∗ V (∗ x) for any
V ∈ UY . By definition, V = U ∩ (Y × Y ) for some U ∈ U . Since y ≈O ∗ x, we have
(y, ∗ x) ∈ ∗ U ∩ (∗ Y × ∗ Y ) = ∗ V , and so y ∈ ∗ V (∗ x), as desired. ⋆
Exercise 71: If X is precompact, then Theorem 13.22 implies that pns(∗X) =
∗X is even standard. Conversely, if pns(∗X) is internal, consider the set 𝒜 :=
{U(x) : x ∈ X}. For any y ∈ pns(∗X), there is some x ∈ X with y ∈ ∗U(∗x) =
∗(U(x)), and so pns(∗X) ⊆ ⋃ σ𝒜. Since pns(∗X) is internal, Theorem 8.16 implies
that there is some finite 𝒜0 ⊆ 𝒜 with pns(∗X) ⊆ ⋃ σ𝒜0. Hence, there are
points x1, ..., xn ∈ X with pns(∗X) ⊆ ∗(U(x1)) ∪ ··· ∪ ∗(U(xn)). This implies
X ⊆ U(x1) ∪ ··· ∪ U(xn), since for any x ∈ X, we have ∗x ∈ pns(∗X), and so we
find some k with ∗x ∈ ∗(U(xk)), i.e. x ∈ U(xk). ⋆
Exercise 72: The only place in the proof of Theorem 13.26 where the saturation
property was used is in the proof that ⋂𝒜 ≠ ∅. However, if ∗ is comprehensive,
we may extend the function f : σ𝒰 → ∗X, defined by f(∗U) := xU (U ∈ 𝒰)
(axiom of choice!), to some internal function F : ∗𝒰 → ∗X. Consider the internal
binary relation
ϕ := {(x, y) ∈ ∗𝒰 × ∗X | y ∈ x(F(x))}.
For U ∈ 𝒰, we have (∗U, y) ∈ ϕ if and only if y ∈ ∗U(xU). Thus, since 𝒜 has the
finite intersection property, ϕ is concurrent on σ𝒰. Since ∗ is a compact
P(X)-enlargement, ϕ is satisfied on σ𝒰, i.e. there is some y ∈ ∗X with (∗U, y) ∈ ϕ
for any U ∈ 𝒰. But this means y ∈ ⋂𝒜. ⋆
Exercise 73: Let x ∈ fin(∗X) and O ⊆ X be open with 0 ∈ O. By Lemma 14.6,
there is a balanced neighborhood U ⊆ O of 0. We find some n with x ∈ n ∗U =
∗(nU). Since U is balanced, we find by the transfer principle that
∀x ∈ ∗(nU) : ∀y ∈ ∗ℝ : (y ≥ ∗n ⟹ x ∈ y ∗U).
This implies x ∈ y ∗U for any y ∈ ∗ℝ with y ≥ ∗n. Hence, cx ∈ ∗U ⊆ ∗O for any
c ∈ ∗ℝ+ with c ≤ 1/∗n. In particular, cx ∈ ∗O for any c ∈ inf(∗ℝ) with c > 0.
Conversely, assume that cx ∈ inf(∗X) for any c ∈ inf(∗ℝ), c > 0. If O is a
neighborhood of 0, then the internal formula εx ∈ ∗O holds for any ε ∈ inf(∗ℝ),
ε > 0. By the permanence principle (Cauchy principle), we have ∗εx ∈ ∗O for
some ε ∈ ℝ+, and so x ∈ λ ∗O for λ := 1/ε. ⋆
Exercise 74: If A is bounded and U is a neighborhood of 0, then there is some
n ∈ ℕ with A ⊆ nU, and so ∗A ⊆ n ∗U. In particular, for any x ∈ ∗A we have
x ∈ n ∗U, i.e. x ∈ fin(∗X).
Conversely, let ∗A ⊆ fin(∗X) and a neighborhood U of 0 be given. By
Exercise 73, the internal formula ε ∗A ⊆ ∗U holds for any ε ∈ inf(∗ℝ), ε > 0.
By the permanence principle (Cauchy principle), we have ∗ε ∗A ⊆ ∗U for some
ε ∈ ℝ+. The inverse form of the transfer principle implies A ⊆ ε⁻¹U, and so A is
bounded. ⋆
Exercise 75: Let y ∈ pns(∗X), and U be a neighborhood of 0. By Theorem 14.2
and Lemma 14.6, we find a balanced neighborhood O of 0 with O + O ⊆ U. By
Theorem 14.3, the set
UO := {(x, y) ∈ X × X : x − y ∈ O}
belongs to 𝒰, and so we find some x ∈ X with (∗x, y) ∈ ∗UO which by the
standard definition principle for relations means ∗x − y ∈ ∗O. By Theorem 14.2,
we find some λ ≠ 0 with λx ∈ O. Since O is balanced, it is no loss of generality to
assume that λ ∈ (0, 1). We have ∗x ∈ λ⁻¹ ∗O. Since ∗O is balanced and λ ∈ (0, 1),
we thus find
y ∈ λ⁻¹ ∗O − ∗O ⊆ λ⁻¹ ∗O + λ⁻¹ ∗O = λ⁻¹(∗O + ∗O) ⊆ λ⁻¹ ∗U.
Hence, y ∈ fin(∗X).
If A ⊆ X is precompact, then Theorem 13.22 implies ∗A ⊆ pns(∗X) ⊆
fin(∗X) which means that A is bounded by Exercise 74. ⋆
Exercise 76: inf(∗X) is a vector subspace of ∗X by Lemma 14.8. To see that
inf(∗X) is closed, let x ∉ inf(∗X). Then there is some neighborhood U of 0 with
x ∉ ∗U. We find a balanced neighborhood O of 0 with O + O ⊆ U. Then V =
x + ∗O is a neighborhood of x (Proposition 14.4), and V ∩ inf(∗X) = ∅: Indeed, if
y ∈ V ∩ inf(∗X), we have y ∈ x + ∗O and y ∈ ∗O, and so we find o1, o2 ∈ ∗O with
o1 = x + o2, i.e. x = o1 − o2 ∈ ∗O − ∗O = ∗O + ∗O ⊆ ∗U, a contradiction. To prove
inf(∗X) ⊆ fin(∗X), let x ∈ inf(∗X) be given. Then we have for any neighborhood
U of 0 that x ∈ ∗U, and so x ∈ fin(∗X). ⋆
Exercise 77: In view of Proposition 13.2 and Theorem 14.3, a topological vector
space X is Hausdorff if and only if the relation x−y ∈ O for any open neighborhood
O of 0 implies that x = y.
The relation [x] − [y] ∈ O for any open neighborhood O ⊆ X/U of [0] is
equivalent to the fact that for any neighborhood O0 ⊆ X of 0 there is some
u ∈ U with x − y + u ∈ O0. Putting z := y − x, this means that we find for any
neighborhood O0 ⊆ X of 0 some u ∈ U with u ∈ z + O0, i.e. any neighborhood of z
contains some element of U, i.e. z ∈ Ū (the closure of U). We thus have proved
that [x] − [y] ∈ O for any open neighborhood O ⊆ X/U of [0] if and only if
y − x ∈ Ū.
Together with the observation at the beginning, we obtain: X/U is Hausdorff
if and only if for any [x], [y] ∈ X/U the relation y − x ∈ Ū implies [x] = [y]. Since
the latter means y − x ∈ U, we have that X/U is Hausdorff if and only if the
relation y − x ∈ Ū for some y, x ∈ X implies y − x ∈ U. This is the case if and
only if Ū ⊆ U, i.e. if and only if U is closed.
For the second claim, note that X/{0} can in a canonical way be identified
with X such that also the open sets are in correspondence. ⋆
Exercise 78: Let F denote the system of all functions from subsets of P(S0 ) into
[0, ∞] (recall Exercise 83). Consider the sentence
∀x ∈ P(S0 ), y ∈ F , z ∈ P(S0 )< : α =⇒ β
where α and β are shortcuts with the meaning
“x is a set algebra” ∧ “y : x → [0, ∞] is a measure” ∧ rng(z) ⊆ x
and

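\[ \bigcup \operatorname{rng} z \in \Sigma \;\wedge\; \Bigl( \text{“Range of } z \text{ pairwise disjoint”} \implies y\bigl(\textstyle\bigcup \operatorname{rng} z\bigr) = \sum_{n} y(z(n)) \Bigr) \]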
respectively. The transfer of this sentence implies the statement for the choice
x := Σ, y := µ, z(n) := An (n = 1, . . . , h). ⋆
Exercise 79: Let c > 0 be infinitesimal with δ(x) = 0 for |x| > c. In view of the
transfer principle, we may estimate analogously as for standard integrals
\[ \Bigl| \int \delta(x)\,{}^*f(x)\,dx - {}^*f(0) \Bigr| = \Bigl| \int \delta(x)\,\bigl({}^*f(x) - {}^*f(0)\bigr)\,dx \Bigr| \le \int \delta(x)\,M\,dx = M \]
where M := sup{|∗f(x) − ∗f(0)| : x ∈ ∗ℝ, δ(x) ≠ 0} (note that the supremum
exists, since the considered set is internal by the internal definition principle). We
have
M ≤ sup{|∗f(x) − ∗f(0)| : x ∈ ∗ℝ ∧ |x| ≤ c} =: M0
with some infinitesimal c > 0. Since f is continuous at 0, we have M0 ≈ 0, and so
also M ≈ 0 which implies the statement. ⋆

Exercise 80: Putting U := ⋃_{i∈I} Xi and X := ∏_{i∈I} Xi, we have
\[ X = \{ y \in U^I \mid \forall x \in I : y(x) \in f(x) \}, \]
where f denotes the mapping i ↦ Xi. The standard definition principle implies
\[ {}^*X = \{ y \in {}^*(U^I) \mid \forall x \in {}^*I : y(x) \in {}^*f(x) \}. \]
By Theorem 3.21, we find
\[ {}^*X = \{ y : {}^*I \to {}^*U \mid \forall x \in {}^*I : y(x) \in {}^*f(x) \wedge y \text{ internal} \}. \]
By Corollary A.6, we have ∗U = ⋃_{i∈∗I} ∗Xi. Since ∗f(i) =: ∗Xi (i ∈ ∗I), we have
\[ {}^*X = \Bigl\{ y : {}^*I \to \bigcup_{i \in {}^*I} {}^*X_i \mid y(i) \in {}^*X_i \text{ for all } i \in {}^*I \text{ and } y \text{ is internal} \Bigr\}. \]
This implies the statement. ⋆

Exercise 81: By Proposition 3.16, X is a subset of some standard entity, say
X ⊆ ∗ B where B ∈ S. By Theorem 3.21, the system ∗ P(B) consists precisely
of all internal subsets of ∗ B. Hence, ∗ P(B) contains all internal subsets of X, and
conversely, all elements of ∗ P(B) which are subsets of X are internal. Hence, the
system of all internal subsets of X can be written in the form {x ∈ ∗ P(B) : x ⊆ X}
which is internal by Corollary 3.18. ⋆
Exercise 82: Observe that A × B is internal. Hence, by Exercise 81, the system P
of all internal subsets of A × B is internal. Hence,
F = {x ∈ P | x : A → B}
is internal by the internal definition principle (note that x : A → B is an internal

predicate in view of Proposition 3.6). Moreover, by Exercise 81, the system PA
of all internal subsets of A is internal. Note that if f : A0 → B is internal, then
A0 = dom(f ) is internal by Theorem 3.19. Hence, as above,
F = {x ∈ P | ∃y ∈ PA : (x : y → B)}
is internal by the internal definition principle. ⋆

Exercise 83: The set C := F(A, B) belongs to S. Hence, we find some index n
with C ⊆ Sn (since Sn is transitive), and so ∗C ⊆ ∗Sn (Lemma 3.5). The sentence
∀x ∈ Sn : (x ∈ C ⇐⇒ α(x)) is true, where α(x) is the predicate with the meaning

that x ⊆ A × B, and that x is a function (to formalize the latter quantify over A
and B). The transfer principle implies that ∗ C consists of all elements x ∈ ∗ S n
for which ∗ α(x) is true. The condition x ∈ ∗ S n may be dropped, since we already
know ∗ C ⊆ ∗ S n , and ∗ α(x) becomes: “x ⊆ ∗ (A × B) = ∗ A × ∗ B is true, and x is
a function”. ⋆

Exercise 84: Put 𝒞 := {A × B : A ∈ 𝒜, B ∈ ℬ}, U := ⋃𝒜 and V := ⋃ℬ. Then
A × B ⊆ U × V for each A ∈ 𝒜 and each B ∈ ℬ. Putting P := 𝒫(U × V), we
thus have
𝒞 = {z ∈ P | ∃x ∈ 𝒜, y ∈ ℬ : z = x × y}.
The standard definition principle thus implies
∗𝒞 = {z ∈ ∗P | ∃x ∈ ∗𝒜, y ∈ ∗ℬ : z = x × y}.
Now note that ∗P = ∗𝒫(U × V) consists by Theorem 3.21 of all internal subsets
of ∗(U × V) = ∗U × ∗V = (⋃ ∗𝒜) × (⋃ ∗ℬ) (for the equalities we used that ∗ is
a superstructure monomorphism and Theorem A.4). Since by Theorem 3.19 each
of the sets A × B is internal for A ∈ ∗𝒜 and B ∈ ∗ℬ, the statement follows. ⋆
Bibliography
[AFHKL86] Albeverio, S., Fenstad, J. E., Høegh-Krohn, R., and Lindstrøm,

T., Nonstandard methods in stochastic analysis and mathematical
physics, Academic Press, Orlando, San Diego, New York, 1986.
[Ban87] Banach, S., Introduction to linear operations, North-Holland, Amster-
dam, New York, Oxford, 1987, translation of Théorie des Opérations
Linéaires, Polish Scientific Publ., Warsaw, 1979.
[BH02] Borde, P. and Haddad, L., Etude de la mesurabilité de la function
cos(2πωx), Ital. J. Pure Appl. Math. 11 (2002), 19–31.
[CK90] Chang, C. C. and Keisler, H. J., Model theory, 3rd ed., North-Holland
Publ. Company, Amsterdam, London, New York, Tokyo, 1990.
[Cut88] Cutland, N. (ed.), Nonstandard analysis and its applications, Cam-
bridge Univ. Press, Cambridge, 1988.
[Dav77] Davis, M., Applied nonstandard analysis, John Wiley & Sons, New
York, London, Sydney, 1977.
[DS88] Diener, F. and Stroyan, K. D., Syntactical methods in infinitesimal
analysis, In Cutland [Cut88], 258–281.
[FL92] Fuchssteiner, B. and Luxemburg, W. A. J. (eds.), Analysis and ge-
ometry. Trends in research and teaching, Mannheim, Leipzig, Wien,
Zürich, BI Wissenschaftsverlag, 1992, dedicated to D. Laugwitz.
[FW91] Foreman, M. and Wehrung, F., The Hahn-Banach theorem implies the
existence of a non-Lebesgue measurable set, Fund. Math. 138 (1991),
no. 1, 13–19.
[Gol98] Goldblatt, R., Lectures on the hyperreals, Springer, New York, Berlin,
Heidelberg, 1998.
[Hen72a] Henson, C. W., The nonstandard hulls of a uniform space, Pacific J.
Math. 43 (1972), 115–137.
[Hen72b] Henson, C. W., On the nonstandard representation of measures,
Trans. Amer. Math. Soc. 172 (1972), 437–446.
[HL85] Hurd, A. and Loeb, P., An introduction to nonstandard real analysis,

Academic Press, Orlando, San Diego, New York, 1985.
[HM72] Henson, C. W. and Moore, L. C., Jr., The nonstandard theory of
topological vector spaces, Trans. Amer. Math. Soc. 172 (1972), 405–
435.
[HM74] Henson, C. W. and Moore, L. C., Jr., Subspaces of the nonstandard
hull of a normed space, Trans. Amer. Math. Soc. 197 (1974), 131–143.
[HM83] Henson, C. W. and Moore, L. C., Jr., Nonstandard analysis and the
theory of Banach spaces, Nonstandard Analysis and the Theory of
Banach Spaces (Heidelberg) (Hurd, A., ed.), Lect. Notes Math., no.
983, Springer, Heidelberg, 1983, 27–112.
[HS69] Hewitt, E. and Stromberg, K., Real and abstract analysis, 2nd ed.,
Springer, Berlin, Heidelberg, New York, 1969.
[Jec97] Jech, T. J., Set theory, 2nd ed., Springer, Berlin, Heidelberg, New
York, 1997.
[Kel50] Kelley, J. L., The Tychonoff product theorem implies the axiom of
choice, Fund. Math. 37 (1950), 75–76.
[KM92] Keller, G. and Moore, L. C., Jr., Invariant means on the group of in-
tegers, In Fuchssteiner and Luxemburg [FL92], dedicated to D. Laug-
witz, 1–18.
[Lau86] Laugwitz, D., Zahlen und Kontinuum, Wissensch. Buchgesellsch.
Darmstadt, Zürich, 1986.
[LG81] Lutz, R. and Goze, M., Nonstandard analysis. A practical guide with
applications, Lect. Notes Math., no. 881, Springer, Berlin, New York,
1981.
[Loe75] Loeb, P. A., Conversion from nonstandard to standard measure spaces
and applications in probability theory, Trans. Amer. Math. Soc. 211
(1975), 113–122.
[Lor48] Lorentz, G. G., A contribution to the theory of divergent sequences,
Acta Math. 80 (1948), 167–190.
[LR94] Landers, D. and Rogge, L., Nichtstandard Analysis, Springer, Berlin,
Heidelberg, New York, 1994.
[LRN51] Łoś, J. and Ryll-Nardzewski, C., On the applications of Tychonoff's
theorem in mathematical proofs, Fund. Math. 38 (1951), 233–237.
[LRN54] Łoś, J. and Ryll-Nardzewski, C., Effectiveness of the representation
theory for Boolean algebras, Fund. Math. 41 (1954), 49–56.
[Lux62] Luxemburg, W. A. J., Two applications of the method of construction
of ultrapowers in analysis, Bull. Amer. Math. Soc. (N.S.) 68 (1962),
416–419.
[Lux63] Luxemburg, W. A. J., Addendum to “On the measurability of a func-

tion which occurs in a paper by A. C. Zaanen”, Proc. Netherl. Acad.
Sci. (A) 66 (1963), 587–590, (Indag. Math. 25 (1963), 587–590).
[Lux69a] Luxemburg, W. A. J., A general theory of monads, In (Intern. Symp.
on the) Applications of Model Theory to Algebra, Analysis, and Prob-
ability (1967) [Lux69b], 18–86.
[Lux69b] Luxemburg, W. A. J. (ed.), (Intern. symp. on the) Applications of
model theory to algebra, analysis, and probability (1967), Toronto,
Holt, Rinehart and Winston, 1969.
[Lux69c] Luxemburg, W. A. J., Reduced powers of the real number system
and equivalents of the Hahn-Banach extension theorem, In (Intern.
Symp. on the) Applications of Model Theory to Algebra, Analysis,
and Probability (1967) [Lux69b], 123–137.
[Lux73] Luxemburg, W. A. J., What is nonstandard analysis?, Amer. Math.
Monthly 80 (1973), no. 6 part II, 38–67.
[Lux92] Luxemburg, W. A. J., Nonstandard hulls, generalized limits and al-
most convergence, In Fuchssteiner and Luxemburg [FL92], dedicated
to D. Laugwitz, 19–45.
[LZ71] Luxemburg, W. A. J. and Zaanen, A. C., Riesz spaces, vol. I, North-
Holland Publ. Company, Amsterdam, London, 1971.
[Nel77] Nelson, E., Internal set theory, a new approach to nonstandard anal-
ysis, Bull. Amer. Math. Soc. 83 (1977), 1165–1198.
[Paw91] Pawlikowski, J., The Hahn-Banach theorem implies the Banach-
Tarski paradox, Fund. Math. 138 (1991), no. 1, 20–22.
[Pin73] Pincus, D., The strength of the Hahn-Banach theorem, Victoria Sym-
posium on Nonstandard Analysis (Univ. of Victoria 1972) (Dold, A.
and Eckmann, B., eds.), Lect. Notes Math., no. 369, Springer, 1973,
203–248.
[PS77] Pincus, D. and Solovay, R. M., Definability of measures and ultrafil-
ters, J. Symbolic Logic 42 (1977), no. 2, 179–190.
[Ric82] Richter, M. M., Ideale Punkte, Monaden und Nichtstandard-
Methoden, Vieweg, Braunschweig, Wiesbaden, 1982.
[RK04] Reeken, M. and Kanovei, V., Nonstandard analysis: Axiomatically,
Springer, Berlin, Heidelberg, New York, 2004.
[Rob64] Robinson, A., On generalized limits and linear functionals, Pacific J.
Math. 14 (1964), 269–283.
[Rob70] Robinson, A., Non-standard analysis, 2nd ed., North-Holland Publ.
Company, Amsterdam, London, 1970.
[Rob88] Robert, A., Nonstandard analysis, John Wiley & Sons, New York,
Chichester, Brisbane, 1988.
[Rud87] Rudin, W., Real and complex analysis, 3rd ed., McGraw-Hill, Singa-
pore, 1987.
[Rud90] Rudin, W., Functional analysis, 14th ed., McGraw-Hill, New Delhi,
New York, 1990.
[RZ69] Robinson, A. and Zakon, E., A set-theoretical characterization of en-
largements, In Luxemburg [Lux69b], 109–122.
[SB86] Stroyan, K. D. and Bayod, J. M., Foundations of infinitesimal
stochastic analysis, North-Holland, Amsterdam, 1986.
[Sie38] Sierpiński, W., Fonctions additives non complètement additives et
fonctions non mesurables, Fund. Math. 30 (1938), 96–99.
[Sik64] Sikorski, R., Boolean algebras, Springer, Berlin, Heidelberg, New
York, 1964.
[SL76] Stroyan, K. D. and Luxemburg, W. A. J., Introduction to the theory
of infinitesimals, Academic Press, New York, San Francisco, London,
1976.
[Sol70] Solovay, R. M., A model of set-theory in which every set of reals is
Lebesgue measurable, Ann. of Math. (2) 92 (1970), 1–56.
[Tay69] Taylor, R. F., On some properties of bounded internal functions, In
Luxemburg [Lux69b], 167–170.
[vdB87] van den Bergh, I., Nonstandard asymptotic analysis, Lect. Notes
Math., no. 1249, Springer, Berlin, New York, 1987.
[vQ79] von Querenburg, B., Mengentheoretische Topologie, 2nd ed., Springer,
Berlin, Heidelberg, New York, 1979.
[Wag86] Wagon, S., The Banach-Tarski paradox, 2nd ed., Cambridge Univ.
Press, Cambridge, 1986.
[Zaa67] Zaanen, A. C., Integration, North-Holland Publ. Company, Amster-
dam, 1967.
[Zak69] Zakon, E., Remarks on the nonstandard real axis, In Luxemburg
[Lux69b], 195–227.
Index
∀, 17 µ(A), 156
A, 158 ns(X), 157
#
A, 76 pns(∗ X), 175
C0 , 206 rng(Φ), 14
C0∞ , 206 st, 67, 157
∃, 17 ⊆, 19
F → x, 163 x ≈O y, 156
I , 35 x ≈ y, 66
L1 , 205 x ≈U y, 172
Lloc xn → x, 85, 163
1 , 206
∞ , 63
abstract interpretation map, 19
P(A), 13, 19
Ê + , 63
accumulation point
∼ of a sequence, 85
U∗ , 178
180 ∼ of a set, 91
U,
additive
U, 180
∼ measure, 140
X̆, 192
X < , 83
internal ≈, 200
∼ probability measure, 140
X < , 83
∗
σ-∼, 140
180
X, adequate, λ-∼, 114
#X (x), 83 algebra, set-∼, 140
cns(L ), 17 σ-≈, 140
dom(Φ), 14 almost
Ê
fin(∗ ), 63 ∼ every j ∈ J, 45
fin(∗ X), 188 ∼ everywhere, 45
Ê
inf(∗ ), 63 ∼ nowhere, 45
inf(∗ X), 189 almost convergent, 138
lim inf xn , 88 amenable group, 147
lim sup xn , 88 Archimedean
mon(F ), 150 ∼ field, 5
mon(x), 68, 155 ∼ property, 4
atom, 13 Cartesian product, 14, 214

atomic Cauchy
∼ formula, 18 ∼ filter, 174
∼ symbol, 17 ∼ principle, 74
axiom ≈ for ∗ X, 158
comprehension ∼, 37 ∼ sequence, 89, 174
Hahn-Banach extension theo- Cesàro type Banach-Mazur limit, 137
rem, 128 chain rule, 95
maximal ideal theorem, 47 choice, axiom of ∼, 2, 3, 10, 11, 16,
∼ of choice, 2, 3, 10, 11, 16, 44, 44, 46, 47, 50, 76, 79, 85, 88,
46, 47, 50, 76, 79, 85, 88, 89, 89, 99, 101, 102, 109, 114,
99, 101, 102, 109, 114, 117, 117, 128, 130, 140, 166, 167,
128, 130, 140, 166, 167, 178, 178, 180, 235, 236
180, 235, 236 closed set, 90, 149
closure of a set, 91, 158
balanced neighborhood, 188
commutative field, 4
Banach space, 125
compact
Banach, extension theorem of Hahn-
∼ enlargement, 111
∼, 126
∼ κ-enlargement, 111
Banach-Mazur limit, 135
∼ set, 91, 161
∼ of Cesàro type, 137
complete space, 125, 175
Banach-Tarski paradox, 128
complete, Dedekind ∼, 5
base, neighborhood ∼, 155
completion, Dedekind ∼, 5
basic predicate, 17
comprehension axiom, 37
Bernstein, theorem of Schröder-∼, 81
comprehensive, 105
binary relation, 14
concurrent binary relation, 107
concurrent ∼, 107
constant, 17
satisfied ∼, 107
continuous function, 93, 167, 168
Bolzano-Weierstraß, theorem of ∼,
uniformly ∼, 94
88
convergence
Borel, theorem of Heine-∼, 91, 162
bound occurrence, 18 ∼ of a filter, 163
bounded linear function, 125 ∼ of a sequence, 85, 87, 89, 163
bounded linear functional, 126 convergence with respect to a filter,
bounded quantifier, 19 70
transitively ∼, 18 copy, standard ∼, 24
bounded set, 191 cut, Dedekind ∼, 5
bounded wff, 19
Dedekind
transitively ∼, 19
∼ complete, 5
Carathéodory, theorem of ∼, 198 ∼ completion, 5
    ∼ cut, 5
    ∼ finite set, 76
    ∼ ∗-finite set, 78
    theorem of ∼, 6
δ-function, Dirac’s ∼, 205
δ-incomplete filter, 47
density of a set, 140
derivative of a function, 94
difference, symmetric ∼, 145
differentiable function, 94
dimension of a filter, 153
Dirac’s δ-function, 205
discrete topology, 164
distributive law, 4
domain, 14
dual space, 126
element
    external ∼, 35
    internal ∼, 35
    standard ∼, 24
elementary embedding, 23
embedding
    elementary ∼, 23
    nonstandard ∼, 24
enlargement, 103
    compact ∼, 111
    κ-∼, 103
        compact ≈, 111
entity, 13
    external ∼, 35
    internal ∼, 35
    standard ∼, 24
Eudoxos property, 4
extension theorem, Hahn-Banach ∼, 126
external
    ∼ element, 35
    ∼ entity, 35
field
    Archimedean ∼, 5
    commutative ∼, 4
    totally ordered ∼, 4
filter, 45
    Cauchy ∼, 174
    convergence of a ∼, 163
    convergence with respect to ∼, 70
    δ-incomplete ∼, 47
    ∼ dimension, 153
    free ∼, 46
    generated ∼, 45
    image ∼, 70
    λ-adequate ∼, 114
    neighborhood ∼, 150
    nonprincipal ∼, 152
    principal ∼, 152
    standard ∼, 156
    ∼ subbase, 152
    ultra∼, 46
finite
    ∼ hyperreal number, 63
    ∼ intersection property, 45
    ∼ point, 188
    ∼ set, 76
        Dedekind ≈, 76
        Dedekind ∗-≈, 78
        ∗-≈, 76
Følner’s condition, 144
formula
    atomic ∼, 18
    internal ∼, 36
    well-formed ∼, 18
free filter, 46
free occurrence, 18
function, 14
    bounded linear ∼, 125
    continuous ∼, 93, 167, 168
        uniformly ≈, 94
    differentiable ∼, 94
    internal ∼, 11
    internal integrable ∼, 205
        locally ≈, 206
    linear ∼, 125
    Riemann-Stieltjes integrable ∼, 96
    sublinear ∼, 128
    support of a ∼, 205
functional
    bounded linear ∼, 126
    linear ∼, 125
functor
    Herbrand-Skolem ∼, 117
generated filter, 45
generated topology, 171
graph, 14
group, amenable ∼, 147
Hahn-Banach
    ∼ extension theorem, 126
    ∼ limit, 133
Hausdorff space, 70, 157
Heine-Borel theorem, 91, 162
Herbrand-Skolem functor, 117
hyperfinite set, 76
hyperinteger number, 62
hypernatural number, 62
hyperrational number, 62
hyperreal number, 59
    finite ∼, 63
    infinite ∼, 63
    infinitesimal ∼, 63
image filter, 70
individual, 13
induced uniform structure
    ∼ by a family of pseudometrics, 172
    ∼ by a metric, 171
infinite hyperreal number, 63
infinitely close, 66, 156
infinitely U-close, 172
infinitesimal
    ∼ hyperreal number, 63
    ∼ of a topological vector space, 189
inherited
    ∼ topology, 177
    ∼ uniform structure, 176
integrable internal function, 205
    locally ≈, 206
integral
    internal ∼, 205
    Riemann-Stieltjes ∼, 96
interior point, 158
intermediate value theorem, 93
internal, 21
    ∼ additive measure, 200
    ∼ definition principle, 37
        ≈ for relations, 39
    ∼ element, 35
    ∼ entity, 35
    ∼ formula, 36
    ∼ function, 11
    ∼ integrable function, 205
        locally ≈, 206
    ∼ integral, 205
    ∼ set, 8
internal set theory, 39
interpretation map
    abstract ∼, 19
    ∼ in set theory, 20
    nonstandard ∼, 50
interpreted sentence, 19
isolated point, 92
κ-enlargement, 103
    compact ∼, 111
κ-saturated, 103
λ-adequate filter, 114
Leibniz’s principle, 25
lemma, Robinson’s sequential ∼, 75
limit
    Banach-Mazur ∼, 135
        ≈ of Cesàro type, 137
    Hahn-Banach ∼, 133
linear function, 125
    bounded ∼, 125
linear functional, 125
    bounded ∼, 126
locally convex space, 194
locally integrable internal function, 206
Loeb measure, 201
logical connective, 17
Łoś/Luxemburg, theorem of ∼, 49
Luxemburg, 50, 152
    theorem of Łoś/∼, 49
map, nonstandard ∼, 24
maximal ideal theorem, 47
measure, 140
    additive ∼, 140
        internal ≈, 200
        σ-≈, 140
    Loeb ∼, 201
    outer ∼, 197
    probability ∼, 140
    singular ∼, 140
measure extension theorem, 144
    ∼ for ℝ, 145
metric, 149
    pseudo∼, 149
    ∼ space, 150
model theory, 3
monad
    ∼ of a filter, 150
    ∼ of a number, 68
    ∼ of a point, 155
    ∼ of a set, 156
monomorphism, superstructure ∼, 29
monotone set function, 197
Montel space, 195
n-ary relation, 14
n-tuple, 14
nearstandard point, 157
    pre-∼, 175
neighborhood
    balanced ∼, 188
    ∼ base, 155
    ∼ filter, 150
    ∼ of a point, 150
nonprincipal filter, 152
nonstandard
    ∼ embedding, 24
    ∼ hull
        ≈ of a topological vector space, 192
        ≈ of a uniform space, 182
    ∼ interpretation map, 50
    ∼ map, 24
    ∼ universe, 35
norm, 102, 125
    ∼ of a bounded linear function, 126
normal form, prenex ∼, 116
number
    hyperinteger ∼, 62
    hypernatural ∼, 62
    hyperrational ∼, 62
    hyperreal ∼, 59
occurrence
    bound ∼, 18
    free ∼, 18
open set, 91, 149
order, 4
    total ∼, 4
    well-∼, 4
outer measure, 197
pair, 14
paradox, Banach-Tarski, 128
perfect set, 91
permanence principle, 74
    ∼ for ∗ℝ, 74
    ∼ for ∗X, 158
point
    accumulation ∼
        ≈ of a sequence, 85
        ≈ of a set, 91
    finite ∼, 188
    interior ∼, 158
    isolated ∼, 92
    nearstandard ∼, 157
    pre-nearstandard ∼, 175
    standard ∼, 24
polysaturated, 103
powerset, 13, 19
pre-nearstandard point, 175
precompact set, 177
predicate, 18
    basic ∼, 17
prenex normal form, 116
principal filter, 152
principle
    Cauchy ∼, 74
        ≈ for ∗X, 158
    internal definition ∼, 37
        ≈ for relations, 39
    Leibniz’s ∼, 25
    permanence ∼, 74
        ≈ for ∗ℝ, 74
        ≈ for ∗X, 158
    standard definition ∼, 28
        ≈ for relations, 32
    transfer ∼, 11, 28
        ≈ first version, 25
probability measure, 140
    additive ∼, 140
product topology, 165
product, Cartesian ∼, 14, 214
property
    Archimedean ∼, 4
    Eudoxos ∼, 4
    finite intersection ∼, 45
pseudometric, 149
    ∼ space, 150
quantifier, 17
    bounded ∼, 19
        transitively ≈, 18
    scope of a ∼, 18
range, 14
reduced power, 49
relation
    binary ∼, 14
        concurrent ≈, 107
        satisfied ≈, 107
    n-ary ∼, 14
    standard part ∼, 157
Riemann-Stieltjes integrable function, 96
Robinson’s sequential lemma, 75
satisfied binary relation, 107
saturated
    κ-∼, 103
    poly∼, 103
Schröder-Bernstein theorem, 81
scope of a quantifier, 18
semi-ring, 200
seminorm, 194
sentence, 18
    interpreted ∼, 19
separable space, 128
separation symbol, 17
sequence
    Cauchy ∼, 89, 174
    convergence of a ∼, 85, 87, 89, 163
set
    ∼ algebra, 140
        σ-≈, 140
    bounded ∼, 191
    closed ∼, 90, 149
    closure of a ∼, 91
    compact ∼, 91, 161
    Dedekind finite ∼, 76
    Dedekind ∗-finite ∼, 78
    finite ∼, 76
        ∗-≈, 76
    hyperfinite ∼, 76
    internal ∼, 8
    open ∼, 91, 149
    perfect ∼, 91
    precompact ∼, 177
    σ-finite ∼, 200
    ∗-finite ∼, 76
    ∼ theory
        internal ≈, 39
        interpretation map in ≈, 20
    transitive ∼, 14
σ-additive measure, 140
σ-algebra, 140
σ-finite set, 200
σ-subadditive, 197
singular measure, 140
Skolem, Herbrand-∼ functor, 117
space
    Banach ∼, 125
    complete ∼, 125, 175
    dual ∼, 126
    Hausdorff ∼, 70, 157
    locally convex ∼, 194
    metric ∼, 150
    Montel ∼, 195
    pseudometric ∼, 150
    seminormed ∼, 194
    T3 ∼, 162
    topological ∼, 70, 149
        Hausdorff ≈, 70, 157
        T3 ≈, 162
        vector ≈, 185
        vector ≈, nonstandard hull, 192
    uniform ∼, 171
        nonstandard hull of a ≈, 182
standard
    ∼ copy, 24
    ∼ definition principle, 28
        ≈ for relations, 32
    ∼ element, 24
    ∼ entity, 24
    ∼ filter, 156
    ∼ part, 67
        ≈ homomorphism, 67
        ≈ relation, 157
    ∼ universe, 23
∗-transform, 28
∗-finite set, 76
    Dedekind ∼, 78
Stieltjes, Riemann-∼ integral, 96
subbase of a filter, 152
sublinear function, 128
subset, 19
superstructure, 13
    ∼ monomorphism, 29
support of a function, 205
symbol, atomic ∼, 17
symmetric difference, 145
T3 space, 162
Tarski, Banach-∼ paradox, 128
Tauberian theorem, 139
theorem
    ∼ of Bolzano-Weierstraß, 88
    ∼ of Carathéodory, 198
    ∼ of Dedekind, 6
    ∼ of Hahn-Banach, 126
    ∼ of Heine-Borel, 91, 162
    intermediate value ∼, 93
    ∼ of Łoś/Luxemburg, 49
    maximal ideal ∼, 47
    measure extension ∼, 144
        ≈ for ℝ, 145
    ∼ of Schröder-Bernstein, 81
    Tauberian ∼, 139
    ∼ of Tychonoff, 166
topological space, 70, 149
    Hausdorff ∼, 70, 157
    T3 ∼, 162
    vector ∼, 185
        nonstandard hull of a ≈, 192
topological vector space, 185
    nonstandard hull of a ∼, 192
    uniform structure of a ∼, 187
topology, 149
    discrete ∼, 164
    ∼ generated by a uniform structure, 171
    inherited ∼, 177
    product ∼, 165
    T3 ∼, 162
total order, 4
totally ordered field, 4
transfer principle, 11, 28
    ∼ first version, 25
transitive set, 14
transitively bounded
    ∼ quantifier, 18
    ∼ wff, 19
translation invariant, 140
triangle inequality, 102, 125
truth value, 19
Tychonoff’s theorem, 166
type, 14
ultrafilter, 46
    δ-incomplete ∼, 47
    free ∼, 46
ultrapower, 48
uniform space, 171
    nonstandard hull of a ∼, 182
uniform structure, 170
    ∼ induced
        ≈ by a family of pseudometrics, 172
        ≈ by a metric, 171
    inherited ∼, 176
    ∼ of a topological vector space, 187
uniformly continuous function, 94
universe
    nonstandard ∼, 35
    standard ∼, 23
variable, 17
Weierstraß, theorem of Bolzano-∼, 88
well-formed formula, 18
well-order, 4
wff, 18
    bounded ∼, 19
    transitively bounded ∼, 19
