Martin Andreas Väth - Nonstandard Analysis (2007, Birkhauser) PDF
Nonstandard
Analysis
Birkhäuser Verlag
Basel · Boston · Berlin
Contents
Preface
1 Preliminaries
§1 Introduction
1.1 General Remarks
1.2 Archimedean Fields and Infinitesimals
§2 Superstructures, Sentences, and Interpretations
2.1 Superstructures
2.2 Formal Language
2.3 Interpretations
2 Nonstandard Models
§3 The Three Fundamental Principles
3.1 Elementary Embeddings and the Transfer Principle
3.2 The Standard Definition Principle
3.3 The Internal Definition Principle
3.4 Existence of External Sets
§4 Nonstandard Ultrapower Models
4.1 Ultrafilters
4.2 Ultrapowers
4.3 Embedding in a Superstructure
7.2 Sets
7.3 Functions
7 Miscellaneous
§15 Loeb Measures
§16 Distributions
Bibliography
Index
Preface
\[ \frac{dF}{dx} = \frac{dF}{dg} \cdot \frac{dg}{dx}, \]
and for a formal proof, one may just divide numerator and denominator by the
“infinitesimal small number” dg.
Nowadays, nonstandard analysis has gone far beyond the realm of infinites-
imals. In fact, it provides a machinery which enables one to describe “explicitly”
mathematical concepts which by standard methods can only be described “implic-
itly” and in a cumbersome way. In the above example the “standard” notion of
a limit is in a certain sense replaced by the “nonstandard” notion of an infinites-
imal. If one applies a similar approach to other objects than the real numbers
(like topological spaces or Banach spaces etc.), one has a tool which provides “ex-
plicit” definitions for objects which can in principle not be described explicitly
by standard methods. Examples of such objects are sets which are not Lebesgue
measurable, or functionals with certain properties like so-called Hahn–Banach lim-
its. Since it is possible in nonstandard analysis to simply “calculate” with such
objects, one can obtain results about them which are extremely hard to obtain by
standard methods.
This book is an introduction to nonstandard analysis. In contrast to some
other textbooks on this topic, it is not meant as an introduction to basic calculus
by nonstandard analysis. Instead, the above mentioned applications in analysis
(which are not easily accessible by standard methods) are our main motivation.
The infinitesimals are only described as an elementary example for the provided
machinery.
Consequently, the reader is supposed to be already familiar with (standard)
basic calculus. For deeper understanding, also experience with (basic) topology and
Chapter 1
Preliminaries
§ 1 Introduction
1.1 General Remarks
Historically, the idea of nonstandard analysis was to find a rigorous justification
for calculations with infinitesimal numbers. However, in the author’s opinion, this
is not the most important property of nonstandard analysis. Instead, it appears to
the author that it is more essential that so-called concurrent relations are satisfied.
We will make this more precise later, but we already mention that this means,
roughly speaking, the following.
If there is a statement which holds for any finite subset of a given set, then
it holds for the whole set in nonstandard analysis.
Consider, for example, for any set M of positive real numbers the statement
“there is some c > 0 with c < ε for all ε ∈ M ”. Clearly, this statement is true
for any finite set M of positive real numbers (in our later terminology, we denote
such a fact by “concurrency”). This implies that the statement is also true for the
set of all positive numbers in nonstandard analysis and so there indeed exists an
infinitesimal c > 0 which is less than any positive real number. In other words,
nonstandard analysis allows us to conclude that “true for each finite number”
implies “true for all”.
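The finite half of this argument can be checked mechanically. The following is a minimal sketch, assuming exact rational numbers stand in for the reals; the helper name `witness_below` is hypothetical and not part of the text. It only illustrates why the statement holds for every finite set M; the passage to the set of all positive numbers is exactly what nonstandard analysis adds and cannot be computed this way.

```python
from fractions import Fraction

def witness_below(finite_set):
    """Return some c > 0 with c < eps for every eps in a finite set of positive numbers."""
    if not finite_set:
        return Fraction(1)          # vacuous case: any positive c works
    return min(finite_set) / 2      # half of the smallest element lies below all of them

# Illustrative finite set M of positive numbers (an assumption for the demo).
M = [Fraction(1, 3), Fraction(5, 7), Fraction(1, 1000)]
c = witness_below(M)
assert c > 0 and all(c < eps for eps in M)
```

For the set of all positive rationals no such `min` exists, which is why the witness for the whole set has to be a new, nonstandard element.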
The formulation of the above considerations in precise mathematical terms
is rather involved. For this reason (and to have a further motivation up to this
point), we will first concentrate on the “classical” topic of nonstandard analysis:
This is Leibniz’ idea which may be described as follows. Leibniz’ program is to join
“infinitesimals” to the system ℝ of real numbers such that the enlarged system
obeys the same “rules” as ℝ. As we shall see, this program cannot be carried out
directly, because the system of real numbers is uniquely determined by these rules
(up to an isomorphism).
The solution proposed by A. Robinson and W. A. J. Luxemburg out of this
dilemma is the following: Consider together with ℝ a nonstandard real line ∗ℝ
which contains ℝ and also infinitesimal numbers as elements and which satisfies the
following: Any so-called transitively bounded sentence about ℝ can be transferred
into an analogous sentence about ∗ℝ, and the latter sentence is true if and only
if the sentence about ℝ was true. The crucial point in this concept is that more
sentences can be formulated about ∗ℝ than those transferred from a sentence
about ℝ (and many of these additional sentences are true). Later, these additional
sentences will be called sentences about nonstandard objects. A fundamental point
is that true sentences about nonstandard objects can be combined to give a true
sentence ∗α about ∗ℝ which can be obtained by transferring some sentence α
about ℝ. This allows us to conclude that α is true.
To make this approach precise, one has of course to define what is meant by
a “sentence α about ℝ”. Then one has to define what is meant by the transferred
sentence ∗α. This is the first problem we shall attack.
After this is done, there arises the fundamental question: Does there actually
exist an object ∗ℝ with the required properties? Or does in contrast the
assumption that such an object ∗ℝ exists even lead to a contradiction?
The answer to the first question is “yes” if one assumes the axiom of choice
(which we therefore do throughout). For this reason the answer to the second
question is “no” (even if one rejects the axiom of choice). However, the axiom of
choice really is essential.
Applying the above ideas, one can “explicitly construct” objects (in the non-
standard world) which in principle cannot be constructed in the standard world.
Such objects are e.g. sets which are not Lebesgue-measurable or so-called Hahn-
Banach limits: It is possible to prove the existence of such objects in the standard
world by means of the axiom of choice, but it is not possible to give explicit for-
mulas for them without the axiom of choice (even if one allows a weaker form
of this axiom which allows countable recursive or nonrecursive choices). In fact,
assuming the consistency of a so-called inaccessible cardinal, this was first proved
in the famous paper [Sol70]. Since in the nonstandard world we can really “calcu-
late” with such objects, it is easy to obtain results which cannot be obtained by
standard methods, or only with very abstract applications of the axiom of choice.
Thus, in a sense, nonstandard analysis might just be considered as a ma-
chinery to simplify such abstract applications of the axiom of choice by providing
objects which implicitly contain this application. Of course, nonstandard analysis
means actually much more, but in the author’s opinion this is the most impor-
tant advantage of nonstandard analysis over standard analysis: To have convenient
Proof. Assume that X has the Archimedean property, and x < y. Then we find
some n ∈ ℕ_X such that n > (y − x)⁻¹ and some m ∈ ℕ_X with m > nx and
m > −nx. Let z be the smallest number of {−m, . . . , m} which satisfies z > nx,
hence z − 1_X ≤ nx. Then q := z/n ∈ ℚ_X satisfies q > x. Moreover, q < y, since
otherwise z ≥ ny = n(y − x) + nx > 1_X + nx. Hence, ℚ_X is dense in X.
If ℚ_X is dense in X, then ε > 0_X implies that we find some q ∈ ℚ_X such
that ε > q > 0_X. Then q = z/n for z, n ∈ ℕ_X, and so ε > n⁻¹ > 0_X. Hence, X
has the Eudoxos property.
Let X have the Eudoxos property and x ∈ X. If x ≤ 0_X, we have 1_X > x.
Otherwise ε := x⁻¹ > 0, and we find some n ∈ ℕ_X with n⁻¹ < ε. In both cases,
we find some n ∈ ℕ_X with n > x, and so X has the Archimedean property.
A < B ⇐⇒ B \ A ≠ ∅.
We define an addition + on X by
A + B = {a + b : a ∈ A and b ∈ B}.
q_X := {x ∈ ℚ : x < q} (q ∈ ℚ).
Moreover, we put
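\[
-A := \begin{cases}
\{b \in \mathbb{Q} : b < -a \text{ for all } a \in A\} = \{a \in \mathbb{Q} : -a \notin A\} & \text{if } A \notin \mathbb{Q}_X, \\
(-q)_X & \text{if } A = q_X.
\end{cases}
\]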
A · B := {a · b : a ∈ A, b ∈ B, a, b > 0X } ∪ {x ∈ X : x ≤ 0X }
xX = {q ∈ X : q < x} (x ∈ X) (1.1)
We thus have the problem that it actually is not possible to join “infinitesimals”
to the set ℝ such that we get a set ∗ℝ in which the same rules hold as in ℝ:
If ∗ℝ were in particular a Dedekind complete Archimedean field, we would have
∗ℝ = ℝ (up to an isomorphism). Thus, the main goal of introducing a set ∗ℝ
containing infinitesimals cannot be achieved! Then what is the rest of this book
about?
The trick is that although ∗ℝ is neither Archimedean nor Dedekind complete,
it shares all properties of ℝ! This appears to be a contradiction, but of course
depends on the definition of the term “property”: Of course, if one considers e.g.
Dedekind completeness as a property of ℝ, this is not true. However, Dedekind
completeness is actually not a property of ℝ or of the real numbers (i.e. of the
elements of ℝ) but a property of the subsets of ℝ. This may be considered as a
“higher type” property: Let us for a moment call statements concerning relations
of real numbers (such as e.g. the distributive law) properties of type 0, while we call
a statement concerning subsets of real numbers (such as Dedekind completeness)
a property of type 1; a property of type 2 could be called a statement about sets
of subsets of real numbers, and so on. It turns out that ∗ℝ completely shares the
properties of ℝ of type 0. But it does not satisfy all properties of higher type.
However, the theory would not be very useful if ∗ℝ completely violated properties
of higher type. In a weak sense ∗ℝ also satisfies the properties of higher type of
ℝ, but with some restrictions. As remarked above, ∗ℝ is not Dedekind complete,
i.e. not every subset of ∗ℝ which is bounded from above has a smallest upper
bound. However, this holds for the so-called internal subsets of ∗ℝ which, roughly
speaking, are sets which can be described by properties of type 0.
Actually, one could be more precise than to define a type as a number;
instead one could choose a certain set which “represents the type” in a more
detailed way; such a theory of types was Robinson’s original approach. But this
is a rather technical procedure. Instead, we follow the approach of Luxemburg in
which one does not have to care much about types: It turns out that any bounded
sentence which holds for ℝ also holds for ∗ℝ. Properties of type 0 in the sense
sketched above can always be formulated as bounded sentences.
as follows: We say x ≤ y if there is some t_0 such that x(t) ≤ y(t) for t > t_0, and
x < y if x ≤ y and x ≠ y.
Then X is a totally ordered field which contains a copy of ℝ as the subset
of constant functions. Moreover, X contains nonzero infinitesimals, i.e. elements x
which satisfy 0 < x < ε for any real number ε > 0. Indeed, the element x(t) = 1/t
has this property.
Exercise 1. Prove the statements of Example 1.4. Is the field X defined there
Dedekind complete? Is it Archimedean?
Example 1.5. Let
\[ X := \left\{ x_{a,b} := \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} : a, b \in \mathbb{R} \right\}. \]
Addition and multiplication are defined in the obvious way (matrix
multiplication). We define an order by saying that x_{a1,b1} < x_{a2,b2} if either
a1 < a2 or if simultaneously a1 = a2 and b1 < b2 (i.e. we order the pairs (a, b)
lexicographically). A straightforward calculation shows that X is a totally ordered
ring, i.e. it satisfies all axioms of a totally ordered field with the possible exception
of the existence of inverses with respect to multiplication. The set X contains a
copy of ℝ (namely {x_{a,0} : a ∈ ℝ}). Moreover, X contains nonzero infinitesimals:
The element x_{0,1} satisfies x_{0,0} < x_{0,1} < x_{ε,0} for each ε > 0.
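The arithmetic and the order of Example 1.5 are easy to experiment with. The following is a minimal sketch, assuming rational entries (`Fraction`) in place of real ones; the class name `DualMatrix` is hypothetical and chosen only for this illustration. A matrix with rows (a, b) and (0, a) is stored as the pair (a, b).

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class DualMatrix:
    """The matrix with rows (a, b) and (0, a), stored as the pair (a, b)."""
    a: Fraction
    b: Fraction

    def __add__(self, other):
        return DualMatrix(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # Matrix product of (a, b; 0, a) and (c, d; 0, c) is (ac, ad + bc; 0, ac).
        return DualMatrix(self.a * other.a, self.a * other.b + self.b * other.a)

    def __lt__(self, other):
        # Lexicographic order on the pairs (a, b), as in Example 1.5.
        return (self.a, self.b) < (other.a, other.b)

zero = DualMatrix(Fraction(0), Fraction(0))
x01 = DualMatrix(Fraction(0), Fraction(1))

# x_{0,0} < x_{0,1} < x_{eps,0} for a few sample values of eps > 0.
for eps in (Fraction(1), Fraction(1, 10), Fraction(1, 10**9)):
    assert zero < x01 < DualMatrix(eps, Fraction(0))
```

The loop can of course only sample finitely many ε; that the inequality holds for every ε > 0 follows directly from the lexicographic definition, since the first components already satisfy 0 < ε.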
Exercise 2. Prove the statements of Example 1.5. Is the totally ordered ring X
defined there even a field? Is it Dedekind complete? Is it Archimedean?
The following example has some relation to the way in which ∗ℝ will later
actually be defined. To motivate this example, recall that ℝ can alternatively be
defined from ℚ by Cauchy sequences: Call two rational Cauchy sequences x_n and
y_n equivalent if x_n − y_n → 0. The set of all such equivalence classes, equipped
with the natural operations, is isomorphic to ℝ. One might try to introduce an
infinitely large number into ℝ by a similar method: For example, one may define
+∞ as the equivalence class of all sequences x_n → ∞.
Example 1.6. Let X_0 be the set of all sequences with values in ℝ, i.e. of all
mappings x : ℕ → ℝ. We call two elements of X_0 equivalent, if they differ at
most on finitely many points of ℕ. Let X be the set of all equivalence classes [x]
where x ∈ X_0. Addition and multiplication in X_0 are defined in an evident way
(pointwise). The addition and multiplication in X is defined by [x] + [y] := [x + y]
and [x] · [y] := [x · y], and we write [x] ≤ [y] if and only if x does not exceed
y at all except possibly finitely many points. It is straightforward to check that
these notions are well-defined, i.e. that they actually do not depend on the choice
of the representatives x and y. Moreover, with these notions and the convention
[x] < [y] if and only if [x] ≤ [y] and [x] ≠ [y], the set X becomes an ordered ring,
i.e. it satisfies the axioms of a totally ordered ring with the exception that the
order need not be total. The equivalence classes of constant sequences constitute
a canonical copy of ℝ in X. The equivalence class of the sequence x : n ↦ n⁻¹
is a nonzero infinitesimal in X, since for any ε ∈ ℝ, ε > 0, we have n⁻¹ < ε for all
except finitely many n, and so 0 < [x] < ε.
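The relation “for all except finitely many n” cannot be decided by a program in general, but finite windows give useful evidence. The following is a rough sketch under that limitation; the helper `leq_on_window` and the sample sequences are assumptions made for the illustration, and rational values stand in for real ones.

```python
from fractions import Fraction

def leq_on_window(x, y, start, stop):
    """Check x(n) <= y(n) for start <= n < stop. This is finite evidence only."""
    return all(x(n) <= y(n) for n in range(start, stop))

inv = lambda n: Fraction(1, n)        # the sequence n -> 1/n (n >= 1)
eps = lambda n: Fraction(1, 100)      # constant sequence representing eps = 1/100

# 1/n <= 1/100 fails at small indices but holds from n = 100 on.
assert not leq_on_window(inv, eps, 1, 200)
assert leq_on_window(inv, eps, 100, 10_000)

# Two sequences that keep overtaking each other: 1,0,1,0,... and 0,1,0,1,...
even = lambda n: Fraction(1 - n % 2)
odd = lambda n: Fraction(n % 2)

# On every tail window of length 2, neither sequence stays below the other.
for start in (1, 1000, 10**6):
    assert not leq_on_window(even, odd, start, start + 2)
    assert not leq_on_window(odd, even, start, start + 2)
```

The last loop illustrates the remark that the order on X need not be total: for these two representatives neither [even] ≤ [odd] nor [odd] ≤ [even] can hold, because each inequality is violated at infinitely many indices.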
Exercise 3. Prove the statements of Example 1.6. Is the ordered ring X defined
there even totally ordered? Or a field? Is it Dedekind complete? Is it Archimedean?
The disappointing properties of Example 1.6 are due to the fact that we
chose the “wrong” definition for the equivalence of sequences. Later, we will call
two sequences equivalent, if they are equal “almost everywhere”. However, the
term “almost everywhere” will be defined in a very tricky manner by means of an
ultrafilter. The fact that (nontrivial) ultrafilters cannot be constructed is a deeper
reason why the model ∗ℝ defined later is not very explicit (but the axiom of choice
will imply its existence).
The difficulty of nonstandard analysis lies not only in the mere introduc-
tion of infinitesimals. Think, for example, of a natural infinitesimal description
of the statement “The function f : ℝ → ℝ is continuous at 0”. The “intuitive”
description will read: Whenever x is “infinitely close” to 0, then f(x) is “infinitely
close” to f(0). Even if we know some extension ∗ℝ of ℝ with infinitesimals, there
arises the problem that we have given a function f : ℝ → ℝ (and not a function
f : ∗ℝ → ∗ℝ): So how should f(x) be defined if x is an infinitesimal?
For this reason, a proper theory of infinitesimals should also associate to each
function f : ℝ → ℝ an extension ∗f : ∗ℝ → ∗ℝ. How could such an extension
be defined? If f(x) = a_0 + a_1 x + · · · + a_n xⁿ is a polynomial, it is clear how the
extension should be defined. Similarly, if f is a rational function or a power series.
The reader who has some knowledge in complex analysis will find an analogy with
analytic functions: All “basic” analytic real functions have a unique canonical
extension into (a large part of) the complex plane. But how should functions like
the Dirichlet function (f(x) = 1 if x ∈ ℚ and f(x) = 0 otherwise) be extended?
Of course, one would like that the extension ∗f of f has the same “formal”
properties as f. For example, for f(x) = sin(x), one would like to have that
|∗f(x)| ≤ 1 for all x ∈ ∗ℝ and that ∗f(x + π) = −∗f(x) for x ∈ ∗ℝ.
In this connection, the approach of Example 1.6 is very promising: Any x ∈ X
is an equivalence class of sequences. Of course, any function f : ℝ → ℝ, no matter
how complicated, maps any sequence x : ℕ → ℝ into another sequence y : ℕ → ℝ
in a canonical way by the formula y(n) = f(x(n)). If the equivalence class of y
depends only on the equivalence class of x, one may consider this as a mapping
∗f : X → X. This is the idea that we will actually use.
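The formula y(n) = f(x(n)) translates directly into code. The following is a minimal sketch, assuming sequences are modelled as Python functions on the positive integers and using floating-point numbers in place of reals; the name `star` is hypothetical and only mimics the notation ∗f on representatives, not on equivalence classes.

```python
import math

def star(f):
    """Lift f to sequences: (star(f)(x))(n) = f(x(n))."""
    return lambda x: (lambda n: f(x(n)))

x = lambda n: 1.0 / n                 # a representative of the class of n -> 1/n
y = star(math.sin)(x)

# The bound |sin| <= 1 holds at every index, hence also for the class of y.
assert all(abs(y(n)) <= 1.0 for n in range(1, 1000))

# Changing x at finitely many indices changes the image at most at those indices,
# so the class of the image depends only on the class of x.
x2 = lambda n: 42.0 if n <= 5 else 1.0 / n
differing = [n for n in range(1, 1000)
             if star(math.sin)(x)(n) != star(math.sin)(x2)(n)]
assert set(differing) <= set(range(1, 6))
```

The second check is the well-definedness condition mentioned in the text, verified here only on a finite range of indices.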
However, such a direct approach will lead to many technical difficulties: For
example what is to be done if f is only defined on a subset A ⊆ ℝ? Moreover,
how could we discuss functions which are constructed by means of infinitesimals
like f(x) = sin(x/c) where c > 0 is infinitesimally small (later, such functions
are called internal)? Of course, one could try to discuss such special cases in all
detail, but we will use another more axiomatic approach: We will define a mapping
∗ : S → ∗S on a very large set S which contains ℝ, all subsets of ℝ, all functions
on ℝ, etc., and which associates to each such element a a nonstandard element ∗a.
Moreover, this mapping will be defined in such a way that the truth of sentences
is preserved in the following sense: If α is a sentence, and if we put a ∗ at each
element of the sentence, then the corresponding sentence ∗α is true if and only if
α is a true sentence. This will be called the transfer principle. Observe that one
may conversely also conclude that α is true if ∗α is true: In this way it is possible
to prove “standard” results with “nonstandard” methods.
The plan for the beginning is as follows: In §2, we describe more precisely
how S and sentences α are defined. The mapping ∗ is axiomatically introduced
in §3, without discussing how such a mapping might be defined. One way to define
such a mapping (similarly to Example 1.6) will be discussed in §4. We already
point out that without the axiom of choice the existence of such a mapping ∗
cannot be proved, i.e. the existence proofs of ∗ must always be nonconstructive.
In particular, even the simplest results of nonstandard analysis contain implicitly
a nonconstructive element. This is one of the main arguments against the use of
nonstandard analysis. On the other hand, physicists like to think of infinitesimals
as “existing” objects, in particular in string theory. In this sense it is perhaps
not false to think of nonstandard analysis as an “extreme idealization” of reality
which, however, might be even “too idealized”; so one should treat the results with
care.
The reader is asked to be patient in these first sections: The existence proof
of ∗ and the fundamental properties are rather complicated and technical; the
calculation with infinitesimals which has been briefly discussed above appears
after this as a simple exercise (and is in fact not much more than a special case of
general properties of ∗). However, the main power of nonstandard analysis becomes
clear if one considers more complicated structures than ℝ, such as Banach spaces
etc. These applications are the topic of the second part of this book.
§ 2 Superstructures, Sentences, and Interpretations
2.1 Superstructures
As we have mentioned before, we intend to define a map ∗ which maps each object
(number, function, set, etc.) of the standard world into a corresponding object of
the nonstandard world. If one tries to map actually any set into a nonstandard
set, one is in the realm of category theory, and serious fundamental difficulties
arise (for example, ∗ cannot be a map in the sense of set theory). The easiest way
to overcome these difficulties is to work with a “restricted universe” which is still
a set. This has the disadvantage that we have to work with atoms which however
is not a big problem from the viewpoint of applications. We define such a universe
now:
Let S be some set in which we are interested. For example, S = ℝ, or S is
the point set of a topological space. If we speak of “statements about S”, we are
actually interested in statements about elements of S, subsets of S, functions and
relations on such subsets, sets of such functions, functions on such sets of functions,
etc. All these objects can be found in a set which is called the superstructure of
S. This superstructure is defined in the following way:
If A is some set, we denote by P(A) the powerset of A, i.e. the system of
all subsets of A. Let S_0 := S, and for n = 1, 2, . . . define inductively S_n :=
S_0 ∪ P(S_{n−1}). Then Ŝ := ⋃_n S_n is the superstructure of S. The elements of S are
called individuals or atoms, and the elements of Ŝ which are not atoms are called
entities.
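The first few levels of a superstructure can be computed explicitly when the set of atoms is finite. The following is a minimal sketch of the recursion “level n is the atoms together with the powerset of level n−1”, assuming a two-element set of atoms chosen purely for illustration (the interesting case for nonstandard analysis is an infinite set of atoms, which cannot be enumerated this way). The function names `powerset` and `levels` are hypothetical.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of the finite set s, each as a frozenset."""
    items = list(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))}

def levels(atoms, n):
    """Return [S_0, ..., S_n] with S_k = S_0 | P(S_{k-1})."""
    s0 = frozenset(atoms)
    out = [s0]
    for _ in range(n):
        out.append(s0 | frozenset(powerset(out[-1])))
    return out

S0, S1, S2 = levels({"a", "b"}, 2)

# Sizes: 2 atoms, then 2 + 2**2 elements, then 2 + 2**6 elements.
assert (len(S0), len(S1), len(S2)) == (2, 6, 66)
# Each level is both an element and a subset of the next one.
assert S0 in S1 and S1 in S2
assert S0 <= S1 <= S2
```

The sizes grow as iterated powers of two, so already the third level of this toy example has 2 + 2⁶⁶ elements and is out of reach for explicit enumeration.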
The notion “atom” is chosen, because the sentence a ∈ s should always be
false for an atom s ∈ S and a ∈ Ŝ (at least, we will assume this throughout: If
necessary, we have to “rename” the elements of S to achieve this). For example, if
we are interested in statements about S = ℝ, this means that e.g. the statement
a ∈ 1 is always false (for a ∈ Ŝ). Thus, the choice S = ℝ means that we do not
care how the real numbers might be constructed; we just assume them as given.
If we are also interested in the definition of the real numbers from the natural
numbers, we could start with the set S = ℕ instead. The only restriction is that
we should start with an infinite set S (because otherwise things degenerate and
no nonstandard analysis is possible, as we shall soon see).
To see that also functions and relations are contained in the superstructure,
we briefly have to recall how functions and relations are defined in set theory:
A pair (a, b) is defined by the formula (a, b) := {{a}, {a, b}}. By induction, an
n-tuple (a_1, . . . , a_n) is defined as the pair ((a_1, . . . , a_{n−1}), a_n) for n ≥ 2 where we
put (a_1) := a_1 for n = 1. The Cartesian product A_1 × · · · × A_n of finitely many
sets A_1, . . . , A_n is the set of all n-tuples (a_1, . . . , a_n) where a_i ∈ A_i (i = 1, . . . , n).
An (n-ary) relation Φ over A_1, . . . , A_n is a subset of A_1 × · · · × A_n. If Φ is a binary
relation over A, B, we write dom(Φ) for the domain of Φ (i.e. dom(Φ) is the set of
all a ∈ A such that there is some b ∈ B with (a, b) ∈ Φ); similarly, rng(Φ) denotes
the range of Φ (i.e. rng(Φ) is the set of all b ∈ B such that there is some a ∈ A
with (a, b) ∈ Φ). If for each a ∈ A we find precisely one b ∈ B with (a, b) ∈ Φ,
then Φ is called a function. In this case, the usual notation Φ : A → B is used,
and Φ(a) denotes the unique b ∈ B with (a, b) ∈ Φ. Note that by this convention,
a function f is not only determined by its graph {(x, f(x)) : x ∈ dom(f)}; it
actually is its graph!
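These set-theoretic definitions can be mirrored with frozensets. The following is a minimal sketch; the helper names `pair`, `tup`, `dom`, `rng` and `is_function` are hypothetical. As a simplifying assumption, the relation helpers take Python 2-tuples as stand-ins for pairs, so that components can be read off directly, while `pair` and `tup` show the actual encoding.

```python
def pair(a, b):
    """The pair (a, b) encoded as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def tup(*xs):
    """n-tuple built inductively: (a1, ..., an) = ((a1, ..., a_{n-1}), an), (a1) = a1."""
    t = xs[0]                      # assumes at least one component
    for x in xs[1:]:
        t = pair(t, x)
    return t

def dom(rel):
    """Domain of a binary relation given as a set of Python 2-tuples."""
    return {a for a, _ in rel}

def rng(rel):
    """Range of a binary relation given as a set of Python 2-tuples."""
    return {b for _, b in rel}

def is_function(rel, A):
    """True if each a in A has precisely one b with (a, b) in rel."""
    return all(sum(1 for x, _ in rel if x == a) == 1 for a in A)

# The encoding distinguishes the order of the components.
assert pair(1, 2) != pair(2, 1)
# For a = b the pair collapses to {{a}}.
assert pair(1, 1) == frozenset({frozenset({1})})
assert tup(1, 2, 3) == pair(pair(1, 2), 3)

graph = {(1, "x"), (2, "y")}
assert dom(graph) == {1, 2} and rng(graph) == {"x", "y"}
assert is_function(graph, {1, 2})
assert not is_function({(1, "x"), (1, "y")}, {1})
```

In this encoding a function literally is the set `graph`, matching the remark that a function is its graph.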
Now it is easy to see that all functions mentioned in the beginning are entities.
More generally, if A and B are entities, then any mapping f : A → B (and any
relation R ⊆ A × B) is also an entity. Roughly speaking, Ŝ is closed under all
natural set operations:
Theorem 2.1. The following holds in any superstructure Ŝ:
1. S_0 ∈ S_1 ∈ S_2 ∈ · · · ∈ Ŝ and S_0 ⊆ S_1 ⊆ · · · ⊆ Ŝ. In particular, S_n are entities.
Also ∅ is an entity.
2. Each S_n is transitive, i.e. each element of S_n which is not an atom is a subset
of S_n. The same holds for Ŝ. In other words: If A is an entity and x ∈ A,
then x is either an entity or an atom.
3. If A is an entity and B ⊆ A, then B is also an entity. In particular, if B ⊆ S_n
for some n, then B is an entity.
4. If A is an entity, then P(A) is an entity.
5. Let A be a set of entities. If A ≠ ∅, then ⋂A := ⋂_{x∈A} x is an entity. If
A ∈ Ŝ, then ⋃A := ⋃_{x∈A} x is an entity.
6. If x_1, . . . , x_k ∈ Ŝ, then {x_1, . . . , x_k} is an entity.
7. If A_1, . . . , A_k are entities, then ⋃_j A_j = A_1 ∪ · · · ∪ A_k is an entity.
8. If A_1, . . . , A_n are entities, then A_1 × · · · × A_n is an entity.
9. All n-ary relations on entities are entities. In particular, all functions acting
between entities are themselves entities.
Theorem 2.1 implies that the superstructure Ŝ is built of “levels” S_n: If we
can show that a set belongs to some level (either as an element or as a subset), this
set belongs to the superstructure. Conversely, each element of the superstructure
is contained in some level (both as an element and as a subset). For this reason,
the elements of S_n \ S_{n−1} are said to be of type n; the atoms are said to be of type
0. As we shall see in the following proof, the “operations” P and × may increase
More surprising than the above operations which are allowed in Ŝ is that some
important operations are not allowed: In general, subsets of Ŝ are not entities. For
example Ŝ itself is no entity; more generally, the subsets of Ŝ which are not entities
are precisely those sets which contain elements of infinitely many types (i.e. which
contain an infinite subset of elements of pairwise different type).
In other words: A set which is the “collection” of elements of Ŝ is not an
entity, in general. However, finite collections of such elements are entities, as we
have seen.
For the same reason as above, the union of a set A of entities need not be
an entity (a counterexample is A := {{x} : x ∈ Ŝ}). However, by what we have
proved, this is the case if either A itself is an entity or if A is finite.
Thus, roughly speaking, the operations “union” and “collecting to some set”
are in general not admissible. An exception can be made if we either consider only
finitely many entities or if the whole collection forms an entity.
The reader familiar with set theory might find some analogies of these
restrictions to usual set theory: Recall e.g. that there is no set containing all sets in
the universe (this is analogous to the fact that Ŝ is no entity). Moreover, it is in
general not allowed to “collect” arbitrary elements to some set. In particular, the
“union” of a class A of arbitrary sets need not be a set, in general. However, we
have the union axiom: If A is a set, then also ⋃A is a set.
In this sense, we might consider the superstructure Ŝ as a model of a set
theory (with so-called “urelements” which are the atoms, i.e. the elements of S):
The superstructure Ŝ serves as a model, if we just interpret the system of entities
as the class of all sets in the universe. Ŝ satisfies all axioms of set theory with the
exception of the infinity axiom. Note that if we eliminate the urelements, i.e. if we
consider S := ∅, the corresponding superstructure Ŝ indeed contains only finite
sets. The same is true if S is finite. Thus, the only “interesting” case is the one
when S is infinite.
There is one axiom in set theory which in a certain sense allows us to “collect”
elements to some set: This is the axiom of choice. Since we assume the axiom
of choice throughout, it turns out that the axiom of choice holds also in the
superstructure Ŝ:
Let A ∈ Ŝ be a set whose elements are entities, and ∅ ∉ A. By the axiom of
choice, there is a function f : A → ⋃A such that f(x) ∈ x for each x ∈ A. If A is
an entity, then f is an entity, because A and ⋃A are entities.
S = A ∪ {0} and can verify the statement in S. If we can do this for any given
instance of A, we may conclude that the general statement is true.
Now observe that all statements occurring in analysis, topology, . . . have such
a form that it suffices to verify them for given instances, and so superstructures
are actually large enough to represent the corresponding statements. Later, we
will tacitly make use of the above mentioned reasoning.
Let us introduce now a notational convention that we shall use throughout:
Let i : X → Y be a map. Then the value of the image of a point x ∈ X under this
map is usually denoted by i(x). However, in nonstandard analysis it is sometimes
more convenient to write ⁱx for the value. We shall use both conventions. The
value of a set A ⊆ X is defined as i(A) = {i(x) : x ∈ A}. This definition may be
ambiguous: If e.g. X = Ŝ is a superstructure and A ∈ Ŝ is an entity, it is not clear
whether i(A) means the image of the element A, or of the set A (which consists
of elements from Ŝ). By ⁱA we always mean the image of the element A.
Not all combinations of the above symbols are admissible in our language.
Only the so-called well-formed formulas (wffs) are admissible. These are defined
inductively:
The smallest wffs are the atomic formulas a = b and a ∈ b where a and b are
variables or constants. If α and β are wffs, then also (α ∨ β), (α ∧ β), (α ⟹ β),
(α ⟺ β), and (¬α) are wffs. Moreover, if α does not already contain one of ∀x
or ∃x, then (∀x : α) and (∃x : α) are wffs. In these formulas, α is called the scope
of the quantifier ∀x resp. ∃x.
For simpler notation, we sometimes add or eliminate braces in a wff, if there
is no ambiguity.
An occurrence of a variable x in a wff α is called free, if it is not the occurrence
in a quantifier ∀x or ∃x and not within the scope of such a quantifier. All other
occurrences of x are called bound . For example, in the formula (∀x : x = x) ∧
(x = y) the first three occurrences are bound while the last one is free.
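The distinction between free and bound occurrences is a simple recursion over the structure of a wff. The following is a minimal sketch, assuming formulas are encoded as nested Python tuples; the tags `"eq"`, `"in"`, `"not"`, `"and"`, `"or"`, `"imp"`, `"iff"`, `"all"`, `"ex"` and the term forms `("var", name)` and `("const", name)` are a hypothetical encoding chosen for this illustration, not notation from the text.

```python
def free_vars(phi):
    """Names of the variables that have at least one free occurrence in phi."""
    tag = phi[0]
    if tag in ("eq", "in"):
        # Atomic formula: every variable occurring here is free at this point.
        return {t[1] for t in phi[1:] if t[0] == "var"}
    if tag == "not":
        return free_vars(phi[1])
    if tag in ("and", "or", "imp", "iff"):
        return free_vars(phi[1]) | free_vars(phi[2])
    if tag in ("all", "ex"):
        # The quantifier binds its variable within its scope phi[2].
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(f"unknown node: {tag!r}")

x, y = ("var", "x"), ("var", "y")

# (forall x : x = x) and (x = y): the last occurrence of x is free, and so is y.
phi = ("and", ("all", "x", ("eq", x, x)), ("eq", x, y))
assert free_vars(phi) == {"x", "y"}

# Inside the quantified part alone, no variable is free.
assert free_vars(("all", "x", ("eq", x, x))) == set()
```

A formula with an empty set of free variables is what the text calls a sentence; otherwise it is a predicate.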
If x_1, . . . , x_n occur freely in a wff α, we sometimes point out this fact by
writing α(x_1, . . . , x_n). In this case we mean by α(a_1, . . . , a_n) the string where each
free occurrence of x_i is replaced by a_i. We use this notation also if not all of the
variables x_i actually occur (freely) in α. In any case, the above notation does not
mean that there are no other free variables in α (unless we state this explicitly).
If all occurrences of all variables in α are bound, then α is called a sentence.
Otherwise, α is called a predicate. In other words: α is a sentence if variables
occur only in quantifiers or the scope of quantifiers; all other objects of sentences
are constants (here, we do not count e.g. logical connectives as an “object” of
a sentence). We point this out, since this means that although we think of a
sentence as concerning sets, it actually concerns only constants: In a sentence
like ∀x : (x ∈ {∅} ⟹ ∀y ∈ x : (¬y = y)) the symbol {∅} is a constant
(otherwise this would not be a sentence in our formal language). To solve this
“paradox”, one has to think of {∅} as a constant which represents the set which
contains the empty set as its only element; under this interpretation, the above
sentence should be true. However, the pure “symbol” {∅} is not this set but just a
name (the “constant”) which traditionally is interpreted as this set. We will make
the term interpretation precise in Section 2.3. However, we already point out that
later we will use constants of e.g. the form {x : α(x)} where α(x) is some formula
of our language; under the traditional interpretation it is clear what we mean by
such constants.
For most applications, it suffices to consider so-called bounded quantifiers:
The quantifiers ∀x resp. ∃x are transitively bounded if they occur in the form
∀x : (x ∈ A ⟹ (α)) resp. ∃x : (x ∈ A ∧ (α)) where A is either a constant or
a variable of the language. In this connection, we use the shortcut ∀x ∈ A : α
resp. ∃x ∈ A : α for the above formulas. If A is a constant, we call the quantifier
2.3 Interpretations
So far, we have defined only the syntax of a formal language L . Now we are going
to define the semantics, i.e. we associate a truth value to each of its sentences.
The reader is advised to read this section carefully, because the interpretation of
a sentence is not what it appears to be at first glance.
The reader should note that a formal language L is defined through its
constants cns(L ). An abstract interpretation map I is a one-to-one map of a
subset dom(I) ⊆ cns(L ) into a set S which is equipped with two binary relations
∈∗ and =∗ . Each interpretation map gives rise to an interpretation of all sentences
of L whose constants all belong to dom(I):
Given such a sentence α, we define the interpreted sentence I α as follows:
We replace all occurrences of constants c ∈ dom(I) by the interpreted value I(c),
and all occurrences of ∈ and = by ∈∗ and =∗ , respectively. The truth value of the
sentence α under the interpretation map I is then defined as the truth value of the
interpreted sentence I α which in turn is defined in the obvious way by induction
on the structure of α:
The formula x ∈∗ y resp. x =∗ y is true if and only if the pair (x, y) belongs
to the relation ∈∗ resp. =∗ ; the logical connectives have their usual meaning, e.g.
α ⇐⇒ β is true if and only if α and β have the same truth value. Finally, the
quantified expressions ∀x : α resp. ∃x : α are true if α is true for every resp. at
least one value of x ∈ S (i.e. if we replace the free occurrences of x in α by the
corresponding value).
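For a finite set S, the inductive definition of the truth value can be turned directly into a small recursive evaluator. The following is only a sketch under stated assumptions: the tuple encoding of formulas, the function name holds, and the four-element example structure are all hypothetical; constants are assumed to be already replaced by their interpreted values, written as ("val", v).

```python
def holds(formula, S, elem, eq, env=None):
    """Truth value of an interpreted formula over the finite set S.

    elem(a, b) and eq(a, b) play the roles of the relations written
    with a star in the text; env maps variable names to elements of S.
    """
    env = {} if env is None else env

    def value(term):
        kind, payload = term
        return env[payload] if kind == "var" else payload

    tag = formula[0]
    if tag == "in":
        return elem(value(formula[1]), value(formula[2]))
    if tag == "eq":
        return eq(value(formula[1]), value(formula[2]))
    if tag == "not":
        return not holds(formula[1], S, elem, eq, env)
    if tag in ("and", "or", "implies", "iff"):
        a = holds(formula[1], S, elem, eq, env)
        b = holds(formula[2], S, elem, eq, env)
        if tag == "and":
            return a and b
        if tag == "or":
            return a or b
        if tag == "implies":
            return (not a) or b
        return a == b  # "iff": both sides have the same truth value
    if tag in ("forall", "exists"):
        # Quantified variables range over ALL elements of S.
        results = (holds(formula[2], S, elem, eq, {**env, formula[1]: v})
                   for v in S)
        return all(results) if tag == "forall" else any(results)
    raise ValueError(f"unknown formula tag: {tag!r}")


# Hypothetical structure: S has four elements, "m" plays the role of I(M).
S = {"p", "q", "r", "m"}
membership = {("q", "m"), ("r", "m"), ("p", "q"), ("p", "r")}

# exists x : forall y : (y in I(M) implies x in y)
sentence = ("exists", "x",
            ("forall", "y",
             ("implies",
              ("in", ("var", "y"), ("val", "m")),
              ("in", ("var", "x"), ("var", "y")))))

truth = holds(sentence, S, lambda a, b: (a, b) in membership,
              lambda a, b: a == b)
smaller = membership - {("p", "r")}
truth_smaller = holds(sentence, S, lambda a, b: (a, b) in smaller,
                      lambda a, b: a == b)
```

With the first relation the element "p" witnesses the existential quantifier; after removing the pair ("p", "r") no element of S does.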
∃x : ∀y : (y ∈ M =⇒ x ∈ y).
∃x : ∀y : (y ∈∗ I(M ) =⇒ x ∈∗ y).
By definition, this sentence is true (i.e. α is true under the interpretation map I)
if and only if there is an element x ∈ S such that for each element y ∈ S for
which the pair (y, I(M )) belongs to the relation ∈∗ also the pair (x, y) belongs to
the relation ∈∗ .
The point which causes difficulties here is that the interpretation takes place
in S and not only in rng(I), i.e. for quantified variables (x and y in the above
example) we actually take all elements of S into account and not only the elements
arising as the image of a constant. In the above example, this means in particular
that x (and similarly y) only has to exist in S and need not necessarily be of the
form x = I(c) with some c ∈ cns(L ) (if I is not onto). If I is not onto, this leads
to strange effects, as we shall see soon.
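A tiny set-theoretic example may help to see the effect. In the following sketch (all names and the concrete sets are assumptions chosen for illustration), the language has only the two constants "e" and "A", so the interpretation map is not onto, and the sentence ∀x : (x ∈ A =⇒ x = e) gets different truth values depending on whether the quantifier ranges over all of S or only over rng(I).

```python
# Hypothetical target set S: two atoms and one set containing both.
S = {1, 2, frozenset({1, 2})}

# Interpretation map on the constants "e" and "A"; the element 2 of S
# is not the image of any constant, so I is not onto.
I = {"e": 1, "A": frozenset({1, 2})}


def forall_x_in_A_equals_e(domain):
    """Evaluate: forall x : (x in I(A) implies x = I(e)), with x ranging over domain."""
    return all((x not in I["A"]) or (x == I["e"]) for x in domain)


truth_over_S = forall_x_in_A_equals_e(S)                    # the definition in the text
truth_over_range = forall_x_in_A_equals_e(set(I.values()))  # NOT the definition
```

The definition in the text uses the first variant: the unnamed element 2 belongs to I(A) and refutes the sentence, although no constant of the language denotes it.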
The case most important for us is that S is a superstructure, and that ∈∗
and =∗ are defined in the set-theoretical way, i.e. for elements x, y ∈ S the relation
x ∈∗ y resp. x =∗ y holds if and only if (in the set-theoretical meaning) x ∈ y resp.
x = y. In this special case, we say that I is an interpretation map in set theory.
The essential point is that if I is not onto, we do not get the usual interpre-
tation of L as a set theoretical formula.
Let us give an example: Assume that the constants of the language cns(L )
are the elements of a superstructure S. Let I be an interpretation map in set
theory, say I : cns(L ) → T . If α denotes the formula “A ⊆ B” (with constants
A, B ∈ S) which we use as a shortcut for
∀x : (x ∈ A =⇒ x ∈ B),
then the interpreted sentence I α reads
∀x : (x ∈∗ I(A) =⇒ x ∈∗ I(B)),
and so (since I is an interpretation map in set theory, i.e. ∈∗ has the usual meaning
of the element relation in T) α becomes defined as “true” (under the interpretation
map I) if and only if I(A) ⊆ I(B) in the usual set-theoretical sense.
This sounds natural at first glance, but if the set I(A) contains elements x
which do not belong to rng(I), this is strange because the truth of the sentence
The above observation is the heart of nonstandard analysis: For example, let
L denote the “natural language” over the superstructure Ŝ of S = ℝ (in particular,
the constants of the language are the elements of this superstructure, i.e.
cns(L ) = ℝ̂). Assume that I : cns(L ) → T is an interpretation map into another
superstructure T with the property that rng(I) does not contain all elements of
I(ℝ) (here, ℝ stands for the constant in the language L ), then the expression
∀x ∈ ℝ becomes interpreted under I in such a way that x runs actually through
more elements than the elements of I(ℝ). If the interpretation I is “canonical”
in the sense that I(ℝ) has the same “structure” as ℝ (we will make this more
precise, soon), the interpretation thus adds additional (“nonstandard”) elements
to ℝ (later we will call these elements internal; it turns out that the elements
added in this way may be considered as the “infinitesimals”).
Chapter 2
Nonstandard Models
the name “nonstandard universe” for a certain subset of ∗ S which will be defined
later. The mapping ∗ is considered as the embedding of the standard world into
the nonstandard world.
For this reason, we call every element in the range of ∗ a standard element.
Hence, an element of ∗S is standard, if it can be written in the form ∗A with A ∈ S.
This notion may be slightly confusing, since one would expect that “standard
elements” are elements of the standard world S. However, since ∗ is injective (this
follows from the definition or also from Lemma 3.5 below), the map ∗ provides a
one-to-one correspondence between the elements of S and the standard elements.
We call a standard element a standard entity, if it is not an atom in ∗S (no
confusion will be possible: It follows from Lemma 3.5 below that standard entities
are precisely those elements ∗a where a ∈ S is not an atom in S).
If A ∈ S is an entity in the standard world, we are also interested in the
standard copy of A which is denoted by
σA = {∗a : a ∈ A}.
As already pointed out at the end of Section 2.3, an interpretation may add more
elements to a set. For this reason, it is not surprising that we always have σ A ⊆ ∗ A
(Lemma 3.5), but that the inclusion may be strict. It is a good idea to think of ∗
as a “blow-up functor”.
Definition 3.2. An elementary map ∗ : S → ∗S is called a nonstandard map (or
nonstandard embedding) if σA ≠ ∗A for each infinite entity A ∈ S.
The existence of nonstandard embeddings will be the topic of §4. The requirement
that σA ≠ ∗A for each infinite entity A ∈ S appears rather restrictive.
However, we will see later (Theorem 3.22) that already the existence of some
countably infinite set A with σA ≠ ∗A implies that ∗ is a nonstandard embedding.
Sometimes in literature, it is additionally required that S ⊆ T and ∗ s = s
for each s ∈ S, see e.g. [LR94]. However, this is more or less a formal restriction,
since it follows from Lemma 3.5 below that ∗ maps S into T and is one-to-one.
Thus, the question whether ∗ s = s (s ∈ S) is just a question of naming the atoms
in T .
Mathematically, we still have to prove that the predicate “elementary” in
Definition 3.1 is actually well-defined, i.e. that it depends only on the map ∗:
Proposition 3.3. Definition 3.1 depends only on ∗ and not on the particular choice
of the language L or the interpretation maps I and I ′ .
Proof. Let L0 be another language, and I0 : dom(I0) → S and I0′ : dom(I0′) →
T be other interpretation maps with dom(I0), dom(I0′) ⊆ cns(L0) such that
I0′ ∘ I0⁻¹ = ∗ = I′ ∘ I⁻¹ (in particular, I0 is onto). Assuming that ∗ is elementary
with respect to L , I, I′ we shall prove now that ∗ is elementary with respect
Let us now discuss the requirements of Definition 3.1 step by step. One
requirement is apparently that I′ ∘ I⁻¹ is a function, i.e. that dom(I′) ⊇ dom(I).
However, this is already a consequence of requirement 1.: Indeed, K0 contains
all sentences of the form c = c where c ∈ dom(I). Since this sentence is bounded
and thus true under the interpretation map I′ by 1. (in particular, it has an
interpretation), we must have c ∈ dom(I′). Hence, dom(I′) ⊇ dom(I), as claimed.
But requirement 1. of Definition 3.1 implies much more: Actually, this is
the key property of nonstandard analysis:
Proposition 3.4 (Transfer principle, First version). Let ∗ = I ′ ◦ I −1 be an ele-
mentary map. A transitively bounded sentence whose constants are taken from
dom(I) is true under the interpretation map I if and only if it is true under the
interpretation map I ′ .
Proof. Let α be a sentence whose constants are taken from dom(I). If α is true,
then α ∈ K0 , and by property 1., α has a true interpretation under I ′ . Conversely,
if α is false, then ¬α is true, i.e. ¬α ∈ K0 , and so ¬α has a true interpretation
under I ′ which means that α has a false interpretation under I ′ .
The transfer principle is sometimes also called Leibniz’s principle. The reason
is that this principle implies, as we will discuss later, that the hyperreal numbers
(with infinitesimals) satisfy the same “formal” properties as the real numbers:
This was Leibniz’s demand which we mentioned in Section 1.1.
We note that in older references on nonstandard analysis like [SL76, Lux73],
the first property of Definition 3.1 (i.e. the transfer principle) is required only
for bounded sentences (not for transitively bounded sentences). In contrast, in
the book [LR94], the transfer principle is assumed to hold for even more general
formulas (the range of a variable in a quantifier can be a so-called term). We shall
discuss this later. However, we stress that the transfer principle does not hold for
all sentences if ∗ is a nonstandard map (if we exclude the case that S is finite
which becomes rather trivial as we shall see).
Proof. Let c and d be the constants from the language L which denote x resp. y,
i.e. c = I⁻¹(x) and d = I⁻¹(y). Then we have x = y, x ∈ y, resp. x ⊆ y if and only
if the sentence c = d, c ∈ d, resp. ∀z ∈ c : z ∈ d is true. This is the case if and only
if these sentences are true under the interpretation I′, i.e. if and only if (in the
usual set-theoretical sense) I′(c) = I′(d), I′(c) ∈ I′(d), resp. ∀z ∈ I′(c) : z ∈ I′(d).
But this means ∗x = ∗y, ∗x ∈ ∗y resp. ∗x ⊆ ∗y.
By definition, ∗x is an atom in T̂ if and only if ∗x ∈ T. Since T = ∗S, this is
the case if and only if ∗x ∈ ∗S. But by what we just proved this is equivalent to
x ∈ S, i.e. to the fact that x is an atom in Ŝ. Thus, ∗x is an atom if and only if x
is an atom. But this means also that ∗x is an entity if and only if x is an entity.
The last statement follows from the fact that b ∈ σ A means b = ∗ a for some
a ∈ A. By what we had proved before, the latter means ∗ a ∈ ∗ A, i.e. b ∈ ∗ A.
From now on, we will assume without loss of generality that I is the identity
(by just renaming the constants in cns(L ) if necessary). This means that each
element of S is simultaneously a constant in the language L , and each formula
in L whose constants are in dom(I) = S is simultaneously a formula in standard
set theory. By this convention we will henceforth not have to distinguish between
such a formula and its interpretation in set theory. Since this is confusing in some
occasions, we will sometimes use the symbol I anyway.
With the above convention, it makes now sense to consider all the other
convenient shortcuts commonly used in set theory as part of our language. Let us
give a small list:
A = {x ∈ B : α(x)}
where α is a transitively bounded predicate with x as its only free variable, and B
and all elements (=constants) occurring in α are standard elements.
∀x ∈ B0 : (x ∈ A0 ⇐⇒ α(x, B1, . . . , Bn))
∀x ∈ ∗B0 : (x ∈ ∗A0 ⇐⇒ α(x, ∗B1, . . . , ∗Bn)).
3. ∗ preserves grouping:
∗(x1, . . . , xn) = (∗x1, . . . , ∗xn).
4. ∗ preserves basic set operations: ∗∅ = ∅, ∗(A ∪ B) = ∗A ∪ ∗B, ∗(A ∩ B) =
∗A ∩ ∗B, ∗(A \ B) = ∗A \ ∗B, ∗(A × B) = ∗A × ∗B.
5. ∗ preserves domains and ranges of n-ary relations Φ in X and commutes with
permutations of the variables:
For example, if Φ is a binary relation, then ∗Φ is a relation, and
dom(∗Φ) = ∗dom(Φ), rng(∗Φ) = ∗rng(Φ). If another relation Ψ on B, A has
the property that (y, x) ∈ Φ if and only if (x, y) ∈ Ψ, then (z, w) ∈ ∗Φ if and
only if (w, z) ∈ ∗Ψ (i.e. ∗(Φ⁻¹) = (∗Φ)⁻¹). An analogous statement holds for
relations of more than two variables.
We note that many of the properties in Definition 3.9 are actually redundant.
For example, the relation ∗(A \ B) = ∗A \ ∗B implies for the choice A = B that
∗∅ = ∅.
Theorem 3.10. Each elementary map ∗ is a superstructure monomorphism.
Proof. The injectivity of ∗ has already been proved in Lemma 3.5. Concerning the
other properties:
1. This was proved in Lemma 3.5.
2. Let C denote the entity {x1 , . . . , xn }. Then C = {x1 , . . . , xn } is a true and
bounded sentence by Proposition 3.6 with C and x1 , . . . , xn as the only constants,
and so its ∗-transform ∗ C = {∗ x1 , . . . , ∗ xn } is true.
3. Analogously with C = (x1 , . . . , xn ). The only difference is that this statement
is only transitively bounded.
4. Let C denote the entity A \ B. Then C = A \ B is a true and transitively
bounded sentence by Proposition 3.6, and so its ∗-transform ∗ C = ∗ A \ ∗ B is
true. Note that ∗ C is actually an entity by Lemma 3.5. The proof of the other
statements is similar. To see that ∗ ∅ = ∅, we may argue as remarked before, or
apply the transfer principle to the bounded true sentence ∀x ∈ ∅ : x = x.
5. Let Φ be an entity in S which is a binary relation. We have Φ ∈ Sn for some
n. Since Sn is transitive (Theorem 2.1), the relation (x, y) ∈ Φ implies x, y ∈ Sn.
Consequently, Φ ⊆ Sn × Sn. By Lemma 3.5 and 4., we have ∗Φ ⊆ ∗(Sn × Sn) =
∗Sn × ∗Sn.
If C := dom(Φ), then C = {x ∈ Sn : (∃y ∈ Sn : (x, y) ∈ Φ)} where
(x, y) ∈ Φ is the shortcut of Proposition 3.6. The standard definition principle
implies ∗C = {x ∈ ∗Sn | ∃y ∈ ∗Sn : (x, y) ∈ ∗Φ}, i.e. ∗dom(Φ) = ∗C = dom(∗Φ).
The formula ∗rng(Φ) = rng(∗Φ) is proved analogously.
If Ψ is a relation which satisfies (y, x) ∈ Ψ if and only if (x, y) ∈ Φ, then
Ψ = {(y, x) ∈ Sn × Sn : (x, y) ∈ Φ}, and the standard definition principle implies
∗Ψ = {(y, x) ∈ ∗(Sn × Sn) : (x, y) ∈ ∗Φ}
which in view of ∗(Sn × Sn) = ∗Sn × ∗Sn implies that ∗Ψ = {(y, x) : (x, y) ∈ ∗Φ},
as claimed.
The case of relations which are not binary is similar and left to the reader.
6. Let C := {(x, x) : x ∈ A} = {y ∈ A × A : (∃x ∈ A : y = (x, x))} where
y = (x, x) is the transitively bounded predicate from Proposition 3.6. The standard
definition principle thus implies ∗C = {y ∈ ∗(A × A) : (∃x ∈ ∗A : y = (x, x))}.
Since ∗(A × A) = ∗A × ∗A by 4., we find ∗C = {(x, x) : x ∈ ∗A}, as claimed.
7. At first glance, one might suspect that for the proof of 7., one might argue
similarly to the proof of 6. However, from the definition of the set C := {(x, y) :
x ∈ y ∈ A} we may not apply the standard definition principle, since it is not clear
from which entity the elements (x, y) are taken: We need a “universal” entity U
with (x, y) ∈ U whenever x ∈ y ∈ A.
Such an entity indeed exists: Since A is an entity, we find some n with A ∈ Sn
(with Sn as in Section 2.1). Since Sn is transitive (Theorem 2.1, 2.), the relation
y ∈ A implies y ∈ Sn ; using the transitivity of Sn once more, we find that x ∈
y also implies x ∈ Sn . Consequently, (x, y) ∈ Sn × Sn whenever x ∈ y ∈ A.
Hence, U := Sn × Sn is the required universal entity (U actually is an entity by
Theorem 2.1).
Now the proof is straightforward: By our choice of U , we have ∀y ∈ A : ∀x ∈
y : (x, y) ∈ U , and so C = {(x, y) ∈ U : x ∈ y ∈ A} which implies by the standard
definition principle that ∗ C = {(x, y) ∈ ∗ U : x ∈ y ∈ ∗ A}. Now observe that an
application of the transfer principle to the above mentioned transitively bounded
true sentence implies that ∀y ∈ ∗ A : ∀x ∈ y : (x, y) ∈ ∗ U is true. Using this fact
in the above formula for ∗ C, we have ∗ C = {(x, y) : x ∈ y ∈ ∗ A}.
Proof. The inclusion σ A ⊆ ∗ A has been proved in Lemma 3.5, and the equality
for finite sets follows from the fact that ∗ is a superstructure monomorphism. The
fact that we have inequality for infinite sets is the definition of a nonstandard
embedding.
In particular, if S is finite and so all entities of the standard universe are finite,
we always have σ A = ∗ A. In this case, everything is trivial: After a renaming of
A = {(x1 , . . . , xn ) ∈ B1 × · · · × Bn : ∗ α(x1 , . . . , xn )}
A = ∗ {(x1 , . . . , xn ) ∈ A1 × · · · × An : α(x1 , . . . , xn )}
where α denotes the formula ∗ α where all occurrences of constants ∗ B are replaced
by B.
Proof. Put B0 = A1 × · · · × An . Then ∗ B 0 = B1 × · · · × ∗ B n , because ∗ is a
superstructure monomorphism. Putting B = ∗ B 0 , we thus have
5. ∗(f(C)) = ∗f(∗C) (C ⊆ A) and ∗(f⁻¹(D)) = (∗f)⁻¹(∗D) (D ⊆ B).
6. If g : B → C, then ∗(g ∘ f) = (∗g) ∘ (∗f).
Proof. The transfer principle applied to the formula f : A → B from Proposi-
tion 3.6 shows that ∗ f : ∗ A → ∗ B.
1. f is one-to-one if and only if the transitively bounded sentence
∀x1, x2 ∈ A : (x1 ≠ x2 =⇒ f(x1) ≠ f(x2))
is true, where f(x1) ≠ f(x2) is a shortcut for ∃y1, y2 ∈ rng(f) : (y1 = f(x1) ∧ y2 =
f(x2) ∧ y1 ≠ y2) (where we used the shortcut from Proposition 3.6). The ∗-transform
of this sentence reads
∀x1, x2 ∈ ∗A : (x1 ≠ x2 =⇒ ∗f(x1) ≠ ∗f(x2))
which is true if and only if ∗ f is one-to-one. By the transfer principle the sentences
are either both true or both false. The formulas for the range and the inverse have
been proved in Theorem 3.10.
2. f is onto if and only if rng(f) = B which by the injectivity of ∗ is the case if
and only if ∗rng(f) = ∗B. By Theorem 3.10, this means ∗B = rng(∗f), i.e. ∗f is
onto.
3. Let c denote the constant f(a). Then the sentence (a, c) ∈ f is true, and so by
the transfer principle ∗(a, c) ∈ ∗f. Since ∗ is a superstructure monomorphism, we
have (∗a, ∗c) = ∗(a, c), and so (∗a, ∗c) ∈ ∗f, i.e. ∗c = ∗f(∗a).
4. We have
f|C = {(x, y) ∈ f : x ∈ C}.
The standard definition principle implies
∗(f|C) = {(x, y) ∈ ∗f : x ∈ ∗C}
which means that ∗(f|C) = (∗f)|∗C, as claimed.
5. ∗(f(C)) = ∗rng(f|C) = rng(∗(f|C)) = rng(∗f|∗C) = ∗f(∗C). Since
6. Let h = g ∘ f. Then the sentence ∀x ∈ A : h(x) = g(f(x)) with the evident
abbreviations is transitively bounded and true. The transfer principle implies ∀x ∈
∗A : ∗h(x) = ∗g(∗f(x)) which means ∗h = (∗g) ∘ (∗f).
From the previous proofs, the reader might get the wrong impression that
all useful sentences are transitively bounded so that the whole structure of S is
preserved by an elementary map ∗. This impression is false:
Although it actually is the case that any useful sentence can be formulated as
a transitively bounded formula, their “natural” formulation often is not a transi-
tively bounded formula. But the transfer principle only applies to the transitively
bounded formulation.
Let us give a key example: Let α denote the sentence
∀A ⊆ B : β(A)
∀A ∈ ∗ P(B) : β(A)
It is important to know that the sets ∗Sn are not the level sets of the
superstructure ∗S as in Section 2.1: The superstructure ∗S is much larger than the
union of the sets ∗Sn, as we shall see. Instead, it turns out that the elements of
∗Sn form what we call the nonstandard universe:
Definition 3.15. Let ∗ : S → ∗S be elementary. The elements of standard sets are
called internal. The nonstandard universe I is the system of all internal elements,
i.e.
I := ⋃ {∗A : A is an entity in S}.
where Sn are defined as in Section 2.1. Hence, I is the smallest transitive subset
of ∗S which contains all sets ∗Sn as subsets (or, equivalently, which contains all
standard elements).
Proof. If x ∈ S, then x ∈ Sn for some n, and so ∗x ∈ ∗Sn by Lemma 3.5. Since
Sn is an entity of S (Theorem 2.1, 1.), I contains all sets ∗Sn as subsets (which
proves one inclusion of (3.1)), and hence in particular ∗x ∈ I .
If x is an internal element, then x ∈ ∗A for some entity A ∈ S. Then A ∈
Sn for some n. Recall that Sn is transitive, and so A ⊆ Sn, hence ∗A ⊆ ∗Sn
by Lemma 3.5, i.e. x ∈ ∗Sn. The formula (3.1) and I ⊆ ∗S now follow. If
additionally x is no atom, the transitivity of ∗Sn (Lemma 3.14) implies x ⊆ ∗Sn ⊆
I which shows that each internal entity is a subset of some standard entity and,
moreover, that I is transitive.
A = {x ∈ B : α(x)}
where B ∈ S, and α is a transitively bounded internal predicate with x as its only
free variable.
A × B = {z ∈ ∗ S n × ∗ S n | ∃x ∈ A, y ∈ B : z = (x, y)}
is internal.
3. If ϕ is a binary internal relation, we find some n with ϕ ∈ ∗Sn. Since ∗Sn is
transitive, it follows that (x, y) ∈ ϕ implies x, y ∈ ∗Sn. Hence, dom(ϕ) = {x ∈
∗Sn | ∃y ∈ ∗Sn : (x, y) ∈ ϕ}, which implies by the internal definition principle that
dom(ϕ) is internal. rng(ϕ) ∈ I is proved analogously. If ψ = {(y, x) : (x, y) ∈ ϕ},
then ψ = {(y, x) ∈ ∗Sn × ∗Sn : (x, y) ∈ ϕ} is internal by the internal definition
principle.
4. If ϕ : A → B is internal and M ⊆ A is internal, then the image of M is {y ∈
rng(ϕ) | ∃x ∈ M : (x, y) ∈ ϕ} which is internal by the internal definition principle.
An analogous proof shows that preimages of internal sets are internal.
We have some analogue to Theorem 2.1: A union of a system of internal sets
need not be internal; but it is internal if the system is finite or if the system is
internal.
However, in contrast to Theorem 2.1, subsets of internal sets are not internal,
in general (and so we have a similar restriction as for the union also for the
intersection of internal sets).
Corollary 3.20 (Internal Definition Principle for Relations). An n-ary relation
ϕ ∈ ∗S is internal if and only if it can be written in the form
ϕ = {(x1, . . . , xn) ∈ B1 × · · · × Bn : α(x1, . . . , xn)}
A = {z ∈ B1 × · · · × Bn | ∃x1 ∈ B1 , . . . , xn ∈ Bn :
(z = (x1 , . . . , xn ) ∧ α(x1 , . . . , xn ))}.
∀y ∈ ∗ B : ((y = ∗ b1 ∧ · · · ∧ y = ∗ bn−1 ) =⇒ (∗ bn , y) ∈ ∗ ϕ)
∀y ∈ B : (y = b1 =⇒ (y, p(y)) ∉ ϕ).
∀y ∈ B : (p(y) = bn =⇒ (y = bn+1 ∨ y = b1 ))
thus implies y = ∗bn+1 or y = ∗b1 which are both not possible, since y ∉ σB. This
contradiction shows that C = ∗B \ σB is indeed external.
Now it follows that also σ B is external, because otherwise C = ∗ B \ σ B is
the difference of two internal sets (Proposition 3.16) and thus C were internal by
Theorem 3.19.
Since A is infinite, there is a function f : A → B which is onto B. Then
∗f : ∗A → ∗B, and ∗f maps σA onto σB: Indeed, for any a ∈ A and any b ∈ B for
which the sentence f(a) = b is true, the transfer principle implies ∗f(∗a) = ∗b. In
particular ∗f(∗a) ∈ σB for all ∗a ∈ σA, and for each ∗b ∈ σB, we find indeed some
preimage ∗a ∈ σA, since f is onto. We may now conclude that σA is external,
because otherwise the image of σA under the internal map ∗f (Proposition 3.16)
would be internal by Theorem 3.19; but this image is σB and thus external. This
contradiction shows that σA is external.
2. The inclusion σP(A) ⊊ ∗P(A) follows from Corollary 3.11. Now we note that
P(∗A) consists of all subsets of ∗A, while ∗P(A) consists of all internal such
subsets. Hence, ∗P(A) ⊆ P(∗A), and the inclusion is strict since by 1. there
actually is a subset of ∗A which is not internal, namely σA.
3. From Corollary 3.11, we conclude that ∗ S \ σ S is nonempty. Let a ∈ ∗ S. Then
a is an internal element and also an atom by Lemma 3.5. Using Lemma 3.5, we
find that a is a standard element if and only if a = ∗ b for some atom b ∈ S, i.e. if
and only if a ∈ σ S.
The previous proof might appear rather artificial to the reader: What we did
in fact was to identify B = ℕ and then proved that all internal subsets of ∗ℕ have
a smallest element, but that ∗ℕ \ σℕ contains no smallest element. We will repeat
this argument in §5.
To give the reader an impression about internal sets, let us already note the
following theorem, although we have to postpone the proof:
4.1 Ultrafilters
Let J be some set. Probably, the reader has already heard the notion that a
property holds “almost everywhere” on J: By this, one means that the set of all
points j ∈ J with this property is “large” in a certain specified sense. For example,
one may mean that the complement of this set is finite (if J is infinite); if J is a
measure space, one can also mean that the complement of this set is a null set.
(Recall Exercises 3 and 4).
If we want to introduce a general definition of the term “almost everywhere”
which contains the two cases above, we should fix a family F of subsets of J
and say that a property holds almost everywhere if the set of all points j ∈ J
with this property is an element of F . It is natural to require that for any set
A ∈ F , F should also contain all sets which are larger than A. Moreover, we
should require that if a property P1 holds almost everywhere and a property P2
also holds almost everywhere, then P1 ∧ P2 also holds almost everywhere. To avoid
trivial cases, we also require that if a property holds nowhere, then it does not
hold almost everywhere. These requirements may be formulated in terms of F as
follows:
Definition 4.1. A set F of subsets of J is called a filter on J, if it has the following
properties:
1. If A ∈ F and A ⊆ B ⊆ J, then B ∈ F .
2. If A, B ∈ F , then A ∩ B ∈ F .
3. ∅ ∉ F .
We say that a property holds almost everywhere on J (with respect to F ), if the
set of all j ∈ J with this property belongs to F ; we briefly say that this property
holds for almost every j ∈ J.
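On a finite set J, the three properties of Definition 4.1 can be checked by brute force. The following sketch is only an illustration under stated assumptions: the helper names powerset and is_filter and the sample families are hypothetical, and the examples of interest in the text (infinite J) are of course out of reach of such a check.

```python
from itertools import combinations


def powerset(J):
    """All subsets of the finite set J, as frozensets."""
    items = sorted(J)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(family, J):
    """Check properties 1-3 of Definition 4.1 for a family of subsets of J."""
    J = frozenset(J)
    F = {frozenset(A) for A in family}
    if any(not A <= J for A in F):
        return False  # not a family of subsets of J
    upward_closed = all(B in F for A in F for B in powerset(J) if A <= B)
    meet_closed = all((A & B) in F for A in F for B in F)
    return upward_closed and meet_closed and frozenset() not in F


J = {0, 1, 2}
supersets_of_01 = [{0, 1}, {0, 1, 2}]   # satisfies all three properties
not_upward_closed = [{0}, {0, 1, 2}]    # {0, 1} is missing: property 1 fails
everything = powerset(J)                # contains the empty set: property 3 fails
```

Here is_filter(supersets_of_01, J) succeeds, while the other two families fail for the reasons given in the comments.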
The examples mentioned above come from the following filters:
Example 4.2. Let J be infinite, and F be the system of all sets of the form J \ J0
where J0 ⊆ J is finite. Then F is a filter.
Example 4.3. Let J be a measure space (mes J ≠ 0), and F be the system of all
sets of the form J \ N where N is a subset of some set of measure 0. Then F is a
filter.
Recall that not every subset of a set of measure 0 need be measurable; but if
we define a null set as an arbitrary subset of a set of measure 0, we get a filter in
the sense of Definition 4.1. In this sense, we can say that the complements of sets
of measure 0 generate the filter from Example 4.3:
Definition 4.4. A system ℬ of sets has the finite intersection property, if any
finitely many sets of ℬ have a nonempty intersection. If ℬ is a system of subsets
of J with the finite intersection property, then the filter generated by ℬ is the
system F of all subsets B ⊆ J for which there exist finitely many A1, . . . , An ∈ ℬ
with B ⊇ A1 ∩ · · · ∩ An.
The proof of the following observation is straightforward and left to the
reader.
Proposition 4.5. Let F be the filter generated by ℬ. Then F is a filter in the
sense of Definition 4.1. Moreover, F is the smallest filter which contains all sets
from ℬ.
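For a finite set J and a finite generating system, the filter of Definition 4.4 can be computed explicitly. The sketch below assumes a nonempty generating system; the function names and the sample system are hypothetical and serve only to illustrate the definition.

```python
from itertools import combinations


def powerset(J):
    """All subsets of the finite set J, as frozensets."""
    items = sorted(J)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def finite_intersections(system):
    """Intersections of all nonempty finite subfamilies of the system."""
    sets = [frozenset(A) for A in system]
    return {frozenset.intersection(*family)
            for r in range(1, len(sets) + 1)
            for family in combinations(sets, r)}


def has_finite_intersection_property(system):
    """True if every finite intersection of sets of the system is nonempty."""
    return all(core for core in finite_intersections(system))


def generated_filter(system, J):
    """All subsets of J containing some finite intersection of sets of the system."""
    if not system or not has_finite_intersection_property(system):
        raise ValueError("need a nonempty system with the finite intersection property")
    cores = finite_intersections(system)
    return {X for X in powerset(J) if any(core <= X for core in cores)}


J = {0, 1, 2, 3}
system = [{0, 1, 2}, {1, 2, 3}]
F = generated_filter(system, J)
```

In this example F consists of the four supersets of {1, 2}, and every set of the generating system belongs to F.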
If we can say that a property holds almost everywhere, it also makes sense to
say that a property holds almost nowhere if this property fails almost everywhere.
Since a property is either true or false, we have that a property holds almost
nowhere on J (with respect to F ), if the complement of the set of all points
j ∈ J with this property belongs to F . It is a natural question whether there
exists a filter F such that each property holds either almost everywhere or almost
nowhere. The filters with this property are called ultrafilters:
Definition 4.6. A filter U on J is an ultrafilter if for any A ⊆ J with A ∉ U , we
have J \ A ∈ U .
Note that A ∈ U implies J \ A ∉ U , since the intersection of these sets is
empty and thus cannot be contained in the filter U .
Proposition 4.7. A filter U on J is an ultrafilter, if and only if it is not contained
in a strictly larger filter on J.
Proof. Assume that U is an ultrafilter. If U is contained in a strictly larger filter
F , we find some A ∈ F which does not belong to U . By assumption, the set
B = J \ A belongs to U and thus to F . Since F is a filter, we must have
A ∩ B ∈ F . But this means ∅ ∈ F , a contradiction.
Conversely, suppose that U is not contained in a strictly larger filter. If U
is not an ultrafilter, we find some A ⊆ J such that A ∉ U and J \ A ∉ U . Then
the system ℬ = U ∪ {A} has the finite intersection property: To see this, it suffices to
prove that A ∩ B ≠ ∅ for any B ∈ U , because U is a filter. But if A ∩ B = ∅ for
some B ∈ U , we have B ⊆ J \ A, and so J \ A ∈ U , contradicting our assumption
on A. Hence, ℬ has the finite intersection property and thus generates a filter
which is strictly larger than U , a contradiction.
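On a very small set, Proposition 4.7 can be confirmed by enumerating all families of subsets. The following brute-force check on a three-element set is only a sanity check under the stated encoding (families as frozensets of frozensets; all names are hypothetical), not a substitute for the proof.

```python
from itertools import combinations

J = frozenset({0, 1, 2})
SUBSETS = [frozenset(c) for r in range(len(J) + 1)
           for c in combinations(sorted(J), r)]


def is_filter(F):
    """Properties 1-3 of Definition 4.1 for a family F of subsets of J."""
    return (frozenset() not in F
            and all(B in F for A in F for B in SUBSETS if A <= B)
            and all((A & B) in F for A in F for B in F))


def is_ultrafilter(F):
    """Definition 4.6: every subset or its complement belongs to the filter."""
    return is_filter(F) and all(A in F or (J - A) in F for A in SUBSETS)


# All 2**8 = 256 families of subsets of J.
families = [frozenset(c) for r in range(len(SUBSETS) + 1)
            for c in combinations(SUBSETS, r)]
filters = [F for F in families if is_filter(F)]

ultrafilters = {F for F in filters if is_ultrafilter(F)}
# Filters not contained in a strictly larger filter ("<" is proper inclusion).
maximal = {F for F in filters if not any(F < G for G in filters)}
# Families of all subsets containing a fixed element j.
principal = {frozenset(A for A in SUBSETS if j in A) for j in J}
```

On this finite set the ultrafilters, the maximal filters, and the families of all sets containing a fixed point coincide; free ultrafilters can only occur on infinite sets.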
A trivial example of an ultrafilter is the following: Fix some element j0 ∈ J,
and let U be the system of all subsets of J which contain the element j0 . Then U
is a filter which contains A ⊆ J if and only if j0 ∈ A; if A ∉ U , we have j0 ∈ J \ A,
and thus J \ A ∈ U . We want to exclude this example:
Definition 4.8. A filter F is called free, if ⋂F = ∅.
Exercise 10. Prove that an ultrafilter U is free if and only if it does not have the
form described above, i.e. if and only if we do not have
U = {U ⊆ J : j0 ∈ U }
for some j0 ∈ J.
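The trivial ultrafilter described before Definition 4.8 is easy to write down for a finite J. The sketch below (with hypothetical names, and a four-element J chosen only for illustration) checks that it decides every subset and that the intersection of all its members is {j0}, so it is not free.

```python
from itertools import combinations


def principal_ultrafilter(J, j0):
    """Return the system of all subsets of J containing j0, and all subsets of J."""
    items = sorted(J)
    subsets = [frozenset(c) for r in range(len(items) + 1)
               for c in combinations(items, r)]
    return {A for A in subsets if j0 in A}, subsets


J = frozenset({0, 1, 2, 3})
U, subsets = principal_ultrafilter(J, 2)

# Exactly one of A and its complement belongs to U.
decides_every_set = all((A in U) != ((J - A) in U) for A in subsets)

# The intersection of all members of U is {j0}, hence nonempty: U is not free.
common_part = frozenset.intersection(*U)
```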
It is not obvious whether there exist free ultrafilters. Consider, for example,
the filter F of all subsets of an infinite set J with finite complements. Then F is
free. Thus, any ultrafilter containing F is free. However, even for J = ℕ it is not
possible to describe an ultrafilter containing F . Nevertheless, the axiom of choice
implies that such an ultrafilter exists:
Theorem 4.9. Each filter F is contained in some ultrafilter.
The proof is a straightforward application of e.g. Zorn’s Lemma (or, alter-
natively, of Hausdorff’s maximality principle or of the well-ordering theorem) and
is left to the reader. We remark that, in contrast to Zorn’s lemma, the statement
of Theorem 4.9 is not equivalent to the axiom of choice [Pin73]. In other words, if
we take Theorem 4.9 as an axiom, this axiom is strictly less restrictive than the
axiom of choice (usually, this axiom is called the maximal ideal theorem because it
is equivalent to the fact that on any Boolean algebra there exists a maximal ideal
[Sik64, Lux69c]).
Now we can prove that over each infinite set J there exists a free ultrafilter
(using the axiom of choice):
Exercise 11. Prove that for an ultrafilter U over an infinite set J the following
statements are equivalent:
1. U is free.
2. U contains the filter of Example 4.2 (observe that by Theorem 4.9 such
ultrafilters exist for each infinite set J).
Actually, we do not really need free ultrafilters but δ-incomplete ultrafilters:
Definition 4.10. A filter F is called δ-incomplete if there is a countable subset
F0 ⊆ F with ⋂F0 ∉ F .
Note that for a filter F , the intersection of finite subsets of F belongs to
F , so that F0 must actually be infinite.
Exercise 11 (and Theorem 4.9) imply that at least over countable sets J there
exist δ-incomplete ultrafilters:
Proposition 4.11. If F is a free filter on a countable set J, then F is δ-incomplete.
Proof. Let J = {j1 , j2 , . . .}. Since ⋂F = ∅, we find for each n some set Fn ∈ F
with jn ∉ Fn . Hence, ⋂n Fn = ∅ ∉ F .
The importance of δ-incomplete filters lies in the following fact:
Proposition 4.12. If a filter F on J is δ-incomplete then there is a partition of J
into countably many pairwise disjoint sets J0 , J1 , . . . such that none of these sets
belongs to F .
Conversely, if U is an ultrafilter for which such a partition exists, then U
is δ-incomplete and free.
Proof. If F is δ-incomplete, there exist countably many A1, A2, . . . ∈ F such that
J0 := ⋂n An ∉ F. Then we define by induction Jn := J \ (J0 ∪ · · · ∪ Jn−1 ∪ An)
(n = 1, 2, . . .). By construction, we have Jn ∩ Jk = ∅ for k < n, and
J \ (J0 ∪ · · · ∪ Jn) ⊆ An. The latter implies J \ ⋃n≥0 Jn ⊆ ⋂n≥1 An = J0; since this
difference is also disjoint from J0, it is empty, and so J0, J1, . . . is indeed
a partition of J. Since the set J0 ∪ · · · ∪ Jn−1 ∪ An belongs to F (because F is a
filter which contains An), the complement Jn does not belong to F.
Conversely, if J0 , J1 , . . . is a partition of J into (at most) countably many
pairwise disjoint sets with Jn ∉ U, then J \ Jn ∈ U, because U is an ultrafilter.
48 Chapter 2. Nonstandard Models
Hence, U0 = {J \ Jn : n = 0, 1, . . .} is a countable subset of U with ⋂U0 = ∅ ∉ U.
Hence, U is δ-incomplete. Since ⋂U ⊆ ⋂U0 = ∅, U is free.
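The inductive construction in the first half of the proof can be traced on a small instance. The following sketch assumes (this choice is not from the text) that J = {1, 2, 3, . . .} and that F is a filter on J containing the sets An := J \ {n}, as any free ultrafilter on J does.

```latex
% Worked instance of the partition built in the proof of Proposition 4.12.
% Assumptions (illustrative only): J = {1,2,3,...}, A_n := J \ {n} \in F.
\begin{align*}
J_0 &:= \bigcap_{n \ge 1} A_n = \bigcap_{n \ge 1} (J \setminus \{n\}) = \emptyset \notin F, \\
J_1 &:= J \setminus (J_0 \cup A_1) = J \setminus (J \setminus \{1\}) = \{1\}, \\
J_2 &:= J \setminus (J_0 \cup J_1 \cup A_2)
      = J \setminus \bigl(\{1\} \cup (J \setminus \{2\})\bigr) = \{2\}, \\
J_n &:= J \setminus (J_0 \cup \dots \cup J_{n-1} \cup A_n)
      = J \setminus (J \setminus \{n\}) = \{n\} \qquad (n \ge 1).
\end{align*}
```

In this instance the partition consists of the empty set J0 and the singletons {n}; none of them belongs to F, since J \ {n} ∈ F and a filter never contains two disjoint sets.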
Corollary 4.13. Any δ-incomplete ultrafilter is free.
On any uncountable set J, there is a free filter which fails to be δ-incomplete,
namely the filter of all sets with countable complements. However, it is not clear
whether any free ultrafilter must be δ-incomplete (i.e. whether the converse of
Corollary 4.13 holds). This question is known as “Ulam’s measure problem”. It
is consistent with the axioms of ZF set theory to assume that any free ultrafilter is
δ-incomplete, but it is not provable that it is consistent to assume the contrary:
If there actually should exist a free ultrafilter on J which fails to be δ-incomplete,
then J must have an extremely large cardinality, namely the cardinality of at
least a so-called measurable cardinal which in turn has the cardinality of at least a
so-called inaccessible cardinal. It is not provable that inaccessible cardinals exist.
Moreover, it is not even provable that it is consistent (with the axioms of ZF set
theory) to assume that such cardinals exist (see [Jec97]).
4.2 Ultrapowers
Let S be a superstructure, and L a language with a surjective interpretation map
I onto S.
Let J be an infinite set, and U an ultrafilter on J. We define now an abstract
model S of the language L , the so-called ultrapower of S modulo U .
We start with the set SJ of all functions f : J → S. On this set, we introduce
a natural equivalence relation: We call two such functions f, g equivalent, if f(j) =
g(j) for almost all j ∈ J, i.e. if Jf,g = {j ∈ J : f(j) = g(j)} ∈ U. This is indeed an
equivalence relation: Reflexivity and symmetry are trivial. For transitivity, note
that if f is equivalent to g and g is equivalent to h, then Jf,g, Jg,h ∈ U which
implies Jf,g ∩ Jg,h ∈ U. Since Jf,h ⊇ Jf,g ∩ Jg,h, this implies Jf,h ∈ U.
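A minimal illustration of this equivalence relation, as a sketch under the assumption (not made in the text at this point) that J = {1, 2, 3, . . .} and that U is a free ultrafilter, so that U contains every cofinite subset of J: changing a function at a single index does not change its class.

```latex
% Assumptions (illustrative only): J = {1,2,3,...}, U a free ultrafilter on J,
% hence every cofinite subset of J belongs to U.
\begin{align*}
f(j) &:= 0 \quad (j \in J), \qquad
g(j) := \begin{cases} 1 & \text{if } j = 1, \\ 0 & \text{if } j \ge 2, \end{cases} \\
J_{f,g} &= \{ j \in J : f(j) = g(j) \} = J \setminus \{1\} \in U .
\end{align*}
```

Hence f and g are equivalent, although they differ as functions.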
Let S be the set of equivalence classes of SJ with respect to this equivalence
relation. The interpretation map I0 : S → S is simply the map which associates
to each constant whose interpretation under I is c ∈ S the class which contains
the constant function fc , defined by fc (j) = c (j ∈ J). To speak of an abstract
model we have to equip S with two relations ∈U and =U . These are defined as
follows:
Denote the class containing a function f : J → S by [f ]. Then we define
[f ] ∈U [g] ⇐⇒ f (j) ∈ g(j) for almost all j ∈ J
and
[f ] =U [g] ⇐⇒ f (j) = g(j) for almost all j ∈ J.
§4 Nonstandard Ultrapower Models 49
actually only interested in (4.1) for sentences (n = 0), we have to consider more
general formulas with free variables for our induction proof).
Assume that I α(f1(j), . . . , fn(j)) is true for almost all j. Then we find for
almost all j some f0(j) such that I β(f0(j), f1(j), . . . , fn(j)) is true. We may
assume that f0 is a function on J (axiom of choice!). By induction hypothesis, this
implies that I0 β([f0], [f1], . . . , [fn]) is true, and so I0 α([f1], . . . , [fn]) is true.
Conversely, if I0 α([f1], . . . , [fn]) is true, then there is some [f0] ∈ S such
that I0 β([f0], [f1], . . . , [fn]) is true. By induction assumption, this implies that
I β(f0(j), f1(j), . . . , fn(j)) is true for almost all j. For all these j, we thus have
that I α(f1(j), . . . , fn(j)) is true.
2. α has the form ¬β(x1, . . . , xn):
If I α(f1(j), . . . , fn(j)) is true for almost all j, then I β(f1(j), . . . , fn(j))
is false for almost all j. In particular, we do not have for almost all j
that I β(f1(j), . . . , fn(j)) is true. By induction assumption, this means that
I0 β([f1], . . . , [fn]) is not true, i.e. I0 α([f1], . . . , [fn]) is true.
Conversely, assume that I0 α([f1], . . . , [fn]) is true. Then I0 β([f1], . . . , [fn])
is false, and by induction assumption it is not the case that, for almost all
j, I β(f1(j), . . . , fn(j)) is true. Since U is an ultrafilter, we may conclude that
I β(f1(j), . . . , fn(j)) is true for almost no j, i.e. I β(f1(j), . . . , fn(j)) is false for
almost all j. Hence, I α(f1(j), . . . , fn(j)) is true for almost all j.
3. α has the form β1 ∧ β2:
I α(f1(j), . . . , fn(j)) is true for almost all j if and only if I βi(f1(j), . . . , fn(j))
(i = 1, 2) are both true for almost all j (because A, B ∈ U implies A ∩ B ∈ U).
This is the case if and only if I0 βi([f1], . . . , [fn]) (i = 1, 2) are both true, i.e. if and
only if I0 α([f1], . . . , [fn]) is true.
We note that we used the axiom of choice in the previous proof to find the
function f0 ; however, if e.g. J is countable, only a countable form of the axiom of
choice is needed.
We call the interpretation map I0 nonstandard , if for any constant A whose
interpretation under I is an infinite entity I A, the sets A∗ := {c ∈ S : c ∈U I0 A}
and Aσ := {I0 a : a ∈ A is true} differ.
Theorem 4.15 (Luxemburg). If the ultrafilter U is δ-incomplete, then the above
defined interpretation map I0 is nonstandard. More precisely, we have for any
constant A with infinite I A that A∗ ≠ Aσ.
Conversely, if there is a constant A with countably infinite I A such that
A∗ ≠ Aσ, then U is δ-incomplete.
Proof. Note that A∗ consists of the equivalence classes of the functions f : J → S
such that f (j) ∈ I A for almost all j, and Aσ consists of the equivalence classes
of the constant functions f : J → S with values in I A. Hence, Aσ ⊆ A∗ , and we
Conversely, if [g] ∈ In+1 , and [f ] ∈U [g], we have f (j) ∈ g(j) ∈ Sn+1 for
almost all j. Since Sn+1 = S0 ∪ P(Sn ), this implies f (j) ∈ Sn for almost all j, and
so we may assume f : J → Sn . Hence, [f ] ∈ In .
It turns out that the range of ϕ consists precisely of the internal sets. To prove that
the function we are going to construct is actually injective, we need the following
lemma:
Lemma 4.17. If elements [f ], [g] ∈ S \ I0 satisfy
then [f ] = [g].
Proof. Let k : J → S be the constant function k(j) = S. The relation [f ] ∉ I0
means that f (j) ∈ S0 = k(j) does not hold almost everywhere, i.e. [f ] ∈U [k] is
not true. Hence, we have [f ] ∉U [k]. Analogously, [g] ∉U [k].
Note now that the sentence (in the language L )
∀x, y ∉ S : ((∀z : (z ∈ x ⇐⇒ z ∈ y)) =⇒ x = y)
is true, and so its interpretation in the ultrapower is true as well:
∀x, y ∉U [k] : ((∀z : (z ∈U x ⇐⇒ z ∈U y)) =⇒ x = y).
2. ∗ S n ⊆ Tn .
3. We have
ϕn([g]) = [g]                        if [g] ∈ I0 = ∗S,
ϕn([g]) = {ϕn([f ]) : [f ] ∈U [g]}     if [g] ∈ In \ I0.        (4.4)
{x ∈ ∗Sn : ϕn⁻¹(x) ∈U [f ]} = {x ∈ ∗Sn : ϕn⁻¹(x) ∈U [g]}.
Lemma 4.16 now implies that (4.2) holds which by Lemma 4.17 implies that [f ] =
[g], as claimed.
The function ϕ can now be defined, and the range of ϕ is the set I := ⋃n ∗Sn.
Roughly speaking, it is now clear that ϕ preserves the truth of transitively
bounded sentences which deal with internal sets: It “preserves” the relations ∈
and = (for ∈ observe (4.2), and for = use Lemma 4.17). Moreover, this mapping
is onto, and thus only provides a “renaming” of the constants. A more rigorous
proof reads as follows:
Theorem 4.18. Let α be a transitively bounded formula in the language whose
constants are taken from I ′ , and let x1 , . . . , xn denote the free variables of α
(n = 0 is not excluded). For given elements xi ∈ I ′ , let α0 denote the formula where
all free occurrences of the variable xi are replaced by the element xi
(i = 1, . . . , n). Then α0 is true under the
interpretation map ϕ if and only if it is true interpreted by the inclusion i into S .
In particular, a transitively bounded sentence with constants taken from I ′
is true under the interpretation map ϕ if and only if it is true under the inclusion
i into S .
Proof. As in the proof of Theorem 4.14, we may assume that the only
logical connectives used in α are ¬ and ∧ and that the only quantifier used is ∃
(in a transitively bounded form). The proof is by induction on the structure of the
formula α. For the induction start, we have to consider the elementary formulas
x = y and x ∈ y where x and y are either free variables or constants. In general,
we have to distinguish the following cases:
1. α has the form x = y: If α0 is true under the interpretation map ϕ, then it is
also true under the interpretation map i, since ϕ is one-to-one. The converse is
trivial.
2. α has the form x ∈ y: Then the statement follows immediately from (4.2).
3. α has the form ¬β or β1 ∧ β2 : These cases are trivial, since only the constants
are exchanged.
4. α has the form ∃x ∈ y : β(x) where y is either a free variable or a constant.
Then α0 has the form ∃x ∈ y : β0(x) where β0 is derived from β by replacing the
free occurrences of the variables x1, . . . , xn by the elements x1, . . . , xn, respectively.
If ϕ α0 is true, then there is some x in the set ϕ y such that ϕ β0(x) holds.
By 2., the element c := ϕ⁻¹(x) then satisfies c ∈U y, and by induction assumption
i β0(c) holds. Hence, i α0 is true.
Conversely, if i α0 is true, we find some x ∈ I ′ such that x ∈U y and i β0(x)
holds. By 2., we then have c := ϕ x ∈ ϕ y, and by induction assumption ϕ β0(c) is
true. Hence, ϕ α0 is true.
Theorem 4.18 is the reason why we can prove the transfer principle only for
transitively bounded sentences. If one needs the transfer principle for a particular
class of sentences which are not transitively bounded, one “just” has to check
whether Theorem 4.18 can be generalized to this class.
Proposition 4.19. Let I ′ := ϕ ◦ I0. Then ∗ := I ′ ◦ I⁻¹ : S → ∗S has the following
property: We have x ∈ ∗A for some entity A ∈ S if and only if x = ϕ([f ]) for
some function f : J → A. Moreover, Sn is actually mapped into the set ∗Sn as
defined above.
Proof. Let some entity A ∈ S be given, and consider the constant function fA :
J → S, defined by fA (j) := A. Note that I0 ◦ I −1 maps A into [fA ]. If f : J → A,
then [f ]∈U [fA ], and so Theorem 4.18 (or also (4.2)) implies ϕ([f ]) ∈ ϕ([fA ]) = ∗ A.
Conversely, if x ∈ ∗ A, then x belongs to the range of ϕ, i.e. x = ϕ([f ]) for some
f : J → S. Since ϕ([f ]) ∈ ∗ A = ϕ([fA ]), Theorem 4.18 implies [f ] ∈U [fA ], i.e.
f (j) ∈ A for almost all j. By choosing a different representative if necessary, we
may assume that f (j) ∈ A even for all j, i.e. f : J → A.
For the second statement, apply what we just proved for A := Sn: We have
∗A = {ϕ([f ]) | f : J → Sn} = {ϕ(x) : x ∈ In}.
occurs in the image of ϕ. With a similar argument as above, Theorem 4.18 now
implies Aσ ≠ A∗, and Theorem 4.15 shows that the ultrafilter U is δ-incomplete.
Indeed, Theorem 4.18 implies that (x1 , . . . , xn ) ∈ Φ is true if and only if the
corresponding formalized sentence is true if interpreted in the abstract model S .
But this just means that (fx1 (j), . . . , fxn (j)) ∈ fΦ (j) for almost all j.
Example 4.23. Let us illustrate the importance of Example 4.22 for a standard
function ∗ f when f : X → Y :
If x = ϕ([fx ]) with fx : J → X, then ∗ f (x) = ϕ([fy ]) where fy : J → Y is
given by fy (j) = f (fx (j)). Thus, the extension of a function f to a function ∗ f is
indeed defined in the canonical way announced in Section 1.1.
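To make Example 4.23 concrete, here is a small worked instance. The choices X = Y = ℝ, J = {1, 2, 3, . . .}, f(t) = t² and fx(j) = j are assumptions made only for illustration.

```latex
% Assumptions (illustrative only): X = Y = \mathbb{R}, J = \{1,2,3,\dots\},
% f(t) := t^2, and the representing function f_x(j) := j.
\begin{align*}
x &= \varphi([f_x]), \qquad f_x(j) = j, \\
f_y(j) &= f(f_x(j)) = j^2, \\
{}^{*}f(x) &= \varphi([f_y]) = \varphi([\, j \mapsto j^2 \,]).
\end{align*}
```

In other words, the extended function acts on a representing function index by index, in the same way as f acts on real numbers.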
Although the model of Theorem 4.20 is rather “constructive”, the reader
should be aware that actually the ultrafilter U is a rather “unknown” element:
Except for very special cases it is impossible to decide from the representation
of two nonstandard elements x, y whether they satisfy e.g. a simple sentence like
x ∈ y or x = y:
Exercise 13. Consider the model of Theorem 4.20 with a countably infinite set J.
Let A ∈ S, and x, y ∈ ∗ A, i.e. x = ϕ([fx ]) and y = ϕ([fy ]) where fx , fy : J → A.
Give a necessary and sufficient condition on fx and fy such that the identity x = y
holds for any choice of a δ-incomplete ultrafilter U .
For our particular map ∗, we can prove a special case of Theorem 3.23 already.
(Actually, the model theoretic proof of Theorem 3.23 which can be found in [CK90]
reduces the general result to a variation of the following special case; our proof
in Section 7.1 will be rather different).
Exercise 14. Prove from the definition of the map ∗ in Theorem 4.20 that
1. ∗A = σA if A is a finite entity.
2. ∗A is uncountable if A is infinite and U is δ-incomplete (in particular,
∗A ≠ σA for countable A).
Hint: If f1, f2, . . . : J → A, construct a function f : J → A such that
f (j) ≠ f1(j) everywhere, f (j) ≠ f2(j) almost everywhere, f (j) ≠ f3(j) on a
smaller set but still almost everywhere, etc.
In particular, standard entities are either finite or uncountable if the mapping ∗
in Theorem 4.20 is a nonstandard embedding.
Chapter 3
§ 5 Hyperreal Numbers
5.1 Hyperreal and Hypernatural Numbers
Let ∗ℝ be the value of the ∗-transform of ℝ. The elements of ∗ℝ are called the
hyperreal numbers.
We first introduce the notation for the most important functions on ∗ℝ:
These are + : ℝ² → ℝ (defined in the usual way), and similarly subtraction,
multiplication, division, and exponentiation. These functions are mapped by ∗
into functions ∗(ℝ²) → ∗ℝ. Note that ∗(ℝ²) = (∗ℝ)², and so in particular, e.g.
∗+ : (∗ℝ)² → ∗ℝ. Instead of writing ∗+(a, b) for a, b ∈ ∗ℝ, we use the traditional
notation a ∗+ b. For the sake of convenience, we will later also drop the symbol ∗
in this connection and just write a + b. However, for the beginning and to avoid
confusion, we will keep this symbol.
Proposition 5.1. If a, b ∈ ℝ, then ∗(a + b) = ∗a ∗+ ∗b; similarly for multiplication,
division, and exponentiation.
However, the hyperreal numbers would not be useful if we had the field
property only for the copy σℝ of ℝ: We want to have the same rules even for
the larger set ∗ℝ. The essential point in the following result is that the statement
holds not only for elements of the form ∗a where a is from the standard universe,
but even for nonstandard elements:
Proposition 5.3. The set ∗ℝ equipped with the relations ∗+ and ∗· is a field. The
neutral element of addition and multiplication is ∗0 and ∗1, respectively. The
inverse element of a ∈ ∗ℝ for addition is ∗0 ∗− a, and the inverse element of
a ∈ ∗ℝ \ {∗0} for multiplication is ∗1 ∗/ a.
Proof. The commutative law for addition follows by applying the transfer principle
for the transitively bounded true sentence
∀x, y ∈ ℝ : x + y = y + x.
The commutative law for multiplication, and the associative and the distributive
laws are proved analogously. The transfer of the formula ∀x ∈ ℝ : x + 0 = x shows
that ∗0 is the neutral element of the addition ∗+, and the transfer of the formula
∀x ∈ ℝ : x + (0 − x) = 0
with the evident shortcuts implies
∀x ∈ ∗ℝ : x ∗+ (∗0 ∗− x) = ∗0
which means that ∗0 ∗− a is the inverse element of addition. The proof concerning
multiplication is similar, if one observes that ∗(ℝ \ {0}) = ∗ℝ \ ∗{0} = ∗ℝ \ {∗0}.
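The laws that the proof handles "analogously" follow the same two-step pattern. As a sketch, here is the associative law for addition, written once as the standard sentence and once as its ∗-transform (the layout is ours; only the pattern is taken from the proof above).

```latex
% Transfer pattern for the associative law of addition.
\begin{align*}
&\forall x, y, z \in \mathbb{R} : (x + y) + z = x + (y + z)
  &&\text{(true in the standard universe)} \\
&\forall x, y, z \in {}^{*}\mathbb{R} :
  (x \mathbin{{}^{*}{+}} y) \mathbin{{}^{*}{+}} z
  = x \mathbin{{}^{*}{+}} (y \mathbin{{}^{*}{+}} z)
  &&\text{(its ${}^{*}$-transform)}
\end{align*}
```

The first sentence is transitively bounded because every quantifier ranges over the fixed entity ℝ, which is what allows the transfer principle to be applied.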
We note that the inverse element of a with respect to addition and multiplication
is usually denoted by −a resp. a⁻¹. One might thus define functions
f1(a) = −a and f2(a) = a⁻¹ on ℝ. The ∗-transform gives us hyperreal functions
∗fi : ∗ℝ → ∗ℝ. One may ask whether ∗f1(a) is the inverse element of a with
respect to addition even for all hyperreal numbers a ∈ ∗ℝ. The transfer of the
formula ∀x ∈ ℝ : x + f1(x) = 0 shows that this is indeed the case. Similarly,
∗f2(a) is the inverse of a with respect to multiplication for any a ∈ ∗ℝ \ {∗0}.
One might also interpret f2(a) as the result of the exponential function
e(a, b) := a^b with b := −1 and may ask whether the ∗-transform of e yields
the same function, i.e. whether ∗f2(a) = ∗e(a, ∗(−1)) for any hyperreal number
a ∈ ∗ℝ. This is indeed the case, as follows by the transfer principle from the
sentence
∀x ∈ ℝ \ {0} : f2(x) = e(x, −1)
in view of the fact that ∗(ℝ \ {0}) = ∗ℝ \ {∗0} (recall the proof of Proposition 5.3).
Thus the ∗-transform of a ↦ a⁻¹ always yields the same nonstandard function,
no matter how the symbol a⁻¹ is interpreted.
We hope that the reader already has the impression that all elementary
properties of ℝ carry over to ∗ℝ in the canonical way. The limitations of this
transfer will be made clear later.
Let us now also consider the order properties of ∗ℝ: The relation ≤ on ℝ is
described by a subset of ℝ², namely ≤ := {(a, b) : a ≤ b}. The ∗-transform of this
set is a relation on ∗ℝ. We write a ∗≤ b (or later more briefly a ≤ b), if the pair
(a, b) ∈ (∗ℝ)² belongs to ∗≤. Similarly, we define the meaning of symbols like ∗<
or ∗> for hyperreal numbers.
The transfer principle of course implies that the relation a ≤ b for elements
a, b of the standard universe gives ∗a ∗≤ ∗b. In particular, ∗≤ is a total order on
the standard copy σℝ of ℝ. However, ∗≤ is even a total order in the nonstandard
universe, as we shall show now. Observe that this does not follow from the above
argument, since σℝ is a strict subset of ∗ℝ.
We note that we could alternatively have defined ∗< on ∗ℝ by
a ∗< b ⇐⇒ (a ∗≤ b ∧ a ≠ b).
The following result implies that these two possible definitions coincide. An
analogous remark holds for ∗> and ∗≥.
Proposition 5.4. ∗≤ defines a total order on ∗ℝ. Moreover, for hyperreal numbers
a, b ∈ ∗ℝ the following holds true:
1. We have a ∗≤ b if and only if a ∗< b or a = b.
2. We have a ∗≥ b if and only if a ∗> b or a = b.
3. Precisely one of the three relations a ∗< b, a ∗> b, a = b holds.
Proof. Let α be the bounded sentence ∀x ∈ ℝ : x ≤ x. This sentence is true, and
so the transfer principle implies that ∀x ∈ ∗ℝ : x ∗≤ x, i.e. a ∗≤ a for any hyperreal
number a ∈ ∗ℝ. Similarly, the transfer of the sentence ∀x, y ∈ ℝ : ((x ≤ y ∧ y ≤
x) =⇒ x = y) shows that the relations a ∗≤ b and b ∗≤ a for hyperreal numbers
a, b imply a = b. The proof of the other properties is similar.
relations between arithmetic and order operations. But these follow by the transfer
principle from the sentences
and
∀x, y, z ∈ ℝ : (x < y ∧ z > 0) =⇒ (x · z < y · z)
where we used evident abbreviations (henceforth, we will use such shortcuts with-
out further mention).
We also use notation like ∗|·|, ∗max(·, ·) for the transfer of the functions with
their evident meanings.
All elementary formulas like ∗|a| = ∗max(a, ∗−a) (for hyperreal numbers
a ∈ ∗ℝ) follow immediately from the transfer principle and will be used henceforth
without further mention. Moreover, we will henceforth drop the symbol ∗ on such
simple functions and on simple constants, if no confusion arises. Thus, 0 may either
mean the element 0 in the set ℝ, or the element ∗0 in the set ∗ℝ (or in σℝ).
Example 5.6. For the map ∗ from our ultrapower model (Theorem 4.20), one has
a simple interpretation for the elementary operations: Recall that any element
x ∈ ∗ℝ has the form x = ϕ([f ]) with a function f : J → ℝ (see Proposition 4.19).
To simplify notation, we write fx for such an f . Note that if x ∈ σℝ, then x is
standard, and so fx may be chosen constant. We claim that
To see this, recall that +, · and ≤ are just standard relations, and apply Exam-
ple 4.22.
We define the hypernatural numbers as the elements of ∗ℕ, and similarly the
hyperinteger numbers and the hyperrational numbers as the elements of ∗ℤ and ∗ℚ,
respectively. Note that the relations ℕ ⊆ ℤ ⊆ ℚ ⊆ ℝ imply ∗ℕ ⊆ ∗ℤ ⊆ ∗ℚ ⊆ ∗ℝ.
There are two natural definitions for the order on ∗ℕ: Either, we can define
the order as the restriction of the order on ∗ℝ to ∗ℕ, or we can use the ∗-transform
of the order of ℕ. As the reader might have expected, the two definitions actually
coincide: One may apply the transfer principle to see this, or just has to recall
Theorem 3.13.
We will show soon that ∗ℝ contains, besides the copy σℝ of ℝ, also infinite
and infinitesimal numbers:
fin(∗ℝ) := {x ∈ ∗ℝ : x is finite},
inf(∗ℝ) := {x ∈ ∗ℝ : x is infinitesimal}.
The notation inf(∗ℝ) is of course ambiguous, since the symbol inf is usually
reserved for the infimum; however, we hope that no confusion will arise.
It will be convenient to use the notation ℝ+ := {x ∈ ℝ : x > 0}. Then we
have σℝ+ = {x ∈ σℝ : x > 0} and ∗ℝ+ = {x ∈ ∗ℝ : x > 0}.
Any x ∈ ∗ℝ is either finite or infinite.
Proposition 5.8. A number x ∈ ∗ℝ is
1. finite, if and only if |x| < y for some y ∈ σℝ,
2. infinite, if and only if |x| > y for any y ∈ σℝ,
3. infinitesimal, if and only if |x| < ε for any ε ∈ σℝ+.
Proof. One implication follows immediately from σℕ ⊆ σℝ. For the converse
∀x ∈ ℕ : (x ≠ 1 ∧ x ≠ 2 ∧ · · · ∧ x ≠ N =⇒ x > N ).
64 Chapter 3. Nonstandard Real Analysis
The transfer principle implies that any h ∈ ∗ℕ which does not have the form
h = ∗k with k = 1, 2, . . . , N (in particular, any h ∈ ℕ∞) satisfies h > ∗N. Since
N ∈ ℕ was arbitrary, the statement follows.
Corollary 5.10. In ∗ℝ there are infinite numbers and nonzero infinitesimal numbers.
Exercise 15. Show that for each x ∈ ∗ℝ there is precisely one h ∈ ∗ℕ with
h ≤ |x| < h + 1. Moreover, prove that x is infinite if and only if h ∉ σℕ.
Example 5.11. For the map ∗ from our ultrapower model (Theorem 4.20), it is easy
to characterize the finite numbers: Recall that x ∈ ∗ℝ if and only if x = ϕ([f ])
where f : J → ℝ. By definition, x is finite if and only if |x| ≤ ∗n for some n ∈ ℕ.
Example 5.6 implies that the relation |x| ≤ ∗n for x = ϕ([f ]) is equivalent to
|f (j)| ≤ n for almost all j. Hence, x = ϕ([f ]) is finite if and only if [f ] has a
representing function which is bounded on J.
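Two representing functions show how this criterion works in practice. The sketch below assumes (for illustration only) that J = {1, 2, 3, . . .} and that U is a free ultrafilter on J, so that finite subsets of J do not belong to U.

```latex
% Assumptions (illustrative only): J = {1,2,3,...}, U a free ultrafilter on J.
\begin{align*}
f(j) := j: \quad
  & \{ j \in J : |f(j)| \le n \} = \{1, \dots, n\} \notin U
    \text{ for every } n \in \mathbb{N}, \\
  & \text{so } x = \varphi([f]) \text{ is infinite}; \\
g(j) := 1/j: \quad
  & \{ j \in J : |g(j)| \le 1 \} = J \in U, \\
  & \text{so } y = \varphi([g]) \text{ is finite, and } g \text{ itself is a bounded representative}.
\end{align*}
```

The same reasoning shows that y is even infinitesimal: for each n, the set {j : |g(j)| < 1/n} = {j : j > n} has finite complement and therefore belongs to U.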
Now we come to the limitations of the transfer principle:
Theorem 5.12. The set ∗ℕ is not well-ordered. More precisely, ℕ∞ has no smallest
element. However, any nonempty internal subset of ∗ℕ has a smallest element.
Proof. Assume by contradiction that ℕ∞ has a smallest element h. Then h > ∗n
for each n ∈ ℕ, and so h − 1 > ∗n for each n ∈ ℕ. Since no element n0 ∈ ℕ
satisfies ∗n0 > ∗n for all n ∈ ℕ, we may conclude that h − 1 ∉ σℕ. Hence, the
element h0 := h − 1 belongs to ℕ∞ and is strictly smaller than h, a contradiction.
Since ℕ is well-ordered, the sentence
∀x ∈ P(ℕ) : (x ≠ ∅ =⇒ ∃y ∈ x : ∀z ∈ x : y ≤ z)
is true. The transfer of this sentence implies that any nonempty x ∈ ∗P(ℕ) has a
smallest element. By Theorem 3.21, the set ∗P(ℕ) consists precisely of all internal
subsets of ∗ℕ.
Theorem 5.12 implies that ∗ℕ \ σℕ is external, and so σℕ is external. In the
light of this proof, the reader might want to reconsider the proof of Theorem 3.22:
This proof is actually just a repetition of the arguments that we used above.
The reason why the transfer principle does not apply for the sentence “ℕ
is well-ordered” is that the natural formalization of this sentence is unbounded
(namely it has the form ∀x ⊆ ℕ : . . .); recall in this connection the remark
following Theorem 3.13.
Proposition 5.9 shows another limitation of the transfer principle:
For the second statement, observe that ℝ is Dedekind complete which means
that
∀x ∈ P(ℝ) : (x ≠ ∅ ∧ α(x, ℝ)) =⇒ β(x, ℝ)
is true where α(x, ℝ) and β(x, ℝ) are transitively bounded formulas with the
meaning “x has an upper bound in ℝ” and “x has a smallest upper bound in
ℝ”, respectively (we leave the precise formulation of the formulas to the reader).
The transfer of the above statement means that any internal subset of ∗ℝ which
is nonempty and bounded from above in ∗ℝ possesses a smallest upper bound in
∗ℝ (recall that ∗∅ = ∅ and that ∗P(ℝ) consists by Theorem 3.21 precisely of all
internal subsets of ∗ℝ).
The fact that ∗ℝ is not Dedekind complete should not be too surprising to the
reader, since the transfer principle simply does not provide much information on
external sets: The “natural” formalization of the sentence that a field is Dedekind
complete is not transitively bounded.
However, the reader might be surprised that ∗ℝ is not Archimedean, because
the sentence that ℝ is Archimedean can in a natural way be formalized by the
transitively bounded sentence
∀x ∈ ℝ : ∃y ∈ ℕ : y > x.
Of course, the ∗-transform of this sentence must be true:
∀x ∈ ∗ℝ : ∃y ∈ ∗ℕ : y > x.
However, this sentence does not mean that ∗ℝ is Archimedean, because for X =
∗ℝ the set ℕX from Section 1.2 is σℕ and not ∗ℕ.
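The difference between the two readings becomes visible for a single infinite hypernatural number. The following sketch assumes that such a number h ∈ ℕ∞ is given.

```latex
% Assumption (illustrative only): h \in \mathbb{N}_\infty is an infinite hypernatural number.
\begin{align*}
&\text{Transferred sentence for } x := h:
  && \exists y \in {}^{*}\mathbb{N} : y > h
     \quad (\text{true, e.g. } y := h + 1), \\
&\text{Archimedean property for } x := h:
  && \exists n \in {}^{\sigma}\mathbb{N} : n > h
     \quad (\text{false, since } h > {}^{*}n \text{ for all } n \in \mathbb{N}).
\end{align*}
```

The transferred sentence is satisfied by a witness that is itself infinite, which is exactly what the Archimedean property does not allow.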
Proposition 5.15. fin(∗ℝ) is an Archimedean subring of ∗ℝ without zero-divisors
which contains σℝ and inf(∗ℝ).
fin(∗ℝ) is not a field. More precisely, we have 1/x ∈ fin(∗ℝ) for some x ∈ ∗ℝ
(x ≠ 0) if and only if x ∉ inf(∗ℝ).
Proof. σℝ ⊆ fin(∗ℝ) follows from Proposition 5.8, and the fact that fin(∗ℝ) is
Archimedean follows from the definition (note that for X := fin(∗ℝ), we have
ℕX = σℕ). If fin(∗ℝ) had zero-divisors, we would have x · y = 0 for some nonzero
x, y ∈ fin(∗ℝ) ⊆ ∗ℝ, contradicting the fact that ∗ℝ is a field. fin(∗ℝ) is a subring:
If x, y ∈ fin(∗ℝ), then x + y, x − y, and xy also belong to fin(∗ℝ): Indeed, there
are n, m ∈ σℕ with |x| ≤ n, |y| ≤ m which implies |x ± y| ≤ |x| + |y| ≤ n + m ∈ σℕ,
and similarly |xy| ≤ nm ∈ σℕ.
If x ∈ inf(∗ℝ), then |x| < n⁻¹ ≤ 1 for all n ∈ σℕ. Hence, x ∈ fin(∗ℝ) and
y := 1/x ∉ fin(∗ℝ) (if x ≠ 0) because |y| > n for all n ∈ σℕ.
Conversely, if x ∉ inf(∗ℝ), then we find some n with |x| ≥ n⁻¹. Hence,
y := 1/x satisfies |y| ≤ n and thus belongs to fin(∗ℝ).
Definition 5.16. We say that two hyperreal numbers x, y ∈ ∗ℝ are infinitely close
to each other, if x − y ∈ inf(∗ℝ). We then write x ≈ y.
Proposition 5.17. ≈ is an equivalence relation on ∗ℝ. Moreover, if x1 ≈ y1 and
x2 ≈ y2 we have:
1. x1 ± x2 ≈ y1 ± y2.
2. x1 · x2 ≈ y1 · y2 if x1 and x2 are finite.
3. x1/x2 ≈ y1/y2 if x1 is finite and x2 ≉ 0.
Proof. Since 0 ∈ inf(∗ℝ), we have x ≈ x. Since |x − y| = |y − x|, the relation x ≈ y
implies y ≈ x. Finally, if x ≈ y and y ≈ z, then |x − z| ≤ |x − y| + |y − z| < 2ε for
any ε ∈ σℝ+ which implies x ≈ z.
For each n ∈ σℕ we have |xi − yi| < n⁻¹ (i = 1, 2). Hence,
|(x1 ± x2) − (y1 ± y2)| ≤ 2n⁻¹ for each n ∈ σℕ. If x1 and x2 are finite, we find
some N ∈ σℕ with |xi| ≤ N (i = 1, 2). Since |x2 − y2| ≤ 1, we find |y2| ≤ N + 1,
and so |x1 · x2 − y1 · y2| = |x1(x2 − y2) + (x1 − y1)y2| ≤ N n⁻¹ + n⁻¹(N + 1) = (2N + 1)n⁻¹
for each n ∈ σℕ. Finally, if x2 ≉ 0, then we find some N ∈ σℕ with |x2| ≥ N⁻¹.
Since |x2 − y2| ≤ N⁻¹/2, we also have |y2| ≥ N⁻¹/2. Consequently,
|1/x2 − 1/y2| = |y2 − x2| / |x2 y2| ≤ n⁻¹ / (N⁻¹ · N⁻¹/2) = 2N² n⁻¹
for each n ∈ σℕ which proves x2⁻¹ ≈ y2⁻¹. Since x2⁻¹ is finite, it follows by what
we just proved that x1/x2 = x1 x2⁻¹ ≈ y1 y2⁻¹ = y1/y2.
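The finiteness hypothesis in the product rule cannot simply be dropped. The following counterexample is a sketch that assumes an infinite hyperreal number h is given (such numbers exist by Corollary 5.10); the particular choice of x1, x2, y1, y2 is ours.

```latex
% Assumption (illustrative only): h \in {}^{*}\mathbb{R} is infinite, so 1/h is infinitesimal.
\begin{align*}
x_1 &:= h, \qquad y_1 := h + \tfrac{1}{h}, \qquad x_2 := y_2 := h, \\
x_1 - y_1 &= -\tfrac{1}{h} \approx 0, \qquad x_2 - y_2 = 0,
  \qquad \text{so } x_1 \approx y_1 \text{ and } x_2 \approx y_2, \\
x_1 x_2 - y_1 y_2 &= h^2 - \bigl(h^2 + 1\bigr) = -1 \not\approx 0 .
\end{align*}
```

So x1 · x2 and y1 · y2 are not infinitely close, although the factors are; the estimate in the proof breaks down because no bound N for |x1| and |x2| exists.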
The reader might have observed that the above proof essentially repeats the
argument of the classical limit rules like lim(xn ± yn ) = lim xn ± lim yn , etc. In
fact, we will see later that Proposition 5.17 implies these limit rules.
Corollary 5.18. For any x ∈ inf(∗ℝ), y ∈ fin(∗ℝ), we have x · y ∈ inf(∗ℝ).
Proof. x ≈ 0 and y ≈ y implies x · y ≈ 0 · y = 0.
Theorem 5.19. For each x ∈ fin(∗ℝ) there is precisely one x̂ ∈ ℝ with x ≈ ∗x̂.
Proof. Uniqueness: If x̂, ŷ ∈ ℝ satisfy ∗x̂ ≈ x ≈ ∗ŷ, then |∗x̂ − ∗ŷ| < ∗ε for any
ε ∈ ℝ+. The inverse transfer principle implies |x̂ − ŷ| < ε for any ε ∈ ℝ+, and so
x̂ = ŷ.
Existence: Consider the set A := {y ∈ σℝ : y < x}. Since x ∈ fin(∗ℝ), the
set A is nonempty and bounded from above, and since σℝ is Dedekind complete
(because it is isomorphic to ℝ), it has a least upper bound s ∈ σℝ, i.e. s = ∗x̂
for some x̂ ∈ ℝ. Given some n ∈ σℕ, we have s ≥ x − n⁻¹ since otherwise
s + n⁻¹ ∈ A would contradict the fact that s is an upper bound for A. But we also
have s ≤ x + n⁻¹, since otherwise s − n⁻¹ would be an upper bound for A which is
strictly smaller than s. Hence |s − x| ≤ n⁻¹ for each n ∈ σℕ, i.e. x ≈ s = ∗x̂.
We emphasize that the proof of Theorem 5.19 made essential use of the
Dedekind completeness of ℝ.
Definition 5.20. Let st : fin(∗ℝ) → ℝ be the map x ↦ x̂ from Theorem 5.19.
We call st(x) = x̂ the standard part of x ∈ fin(∗ℝ), and st the standard part
homomorphism.
Theorem 5.21. st : fin(∗ℝ) → ℝ is a surjective order-preserving ring-homomorphism
with kernel inf(∗ℝ), i.e. for all x, x1, x2 ∈ fin(∗ℝ) we have
1. st(x) = 0 if and only if x ∈ inf(∗ℝ),
2. st(x1 ± x2) = st(x1) ± st(x2),
3. st(x1 · x2) = st(x1) · st(x2),
4. st(x1/x2) = st(x1)/ st(x2) if st(x2) ≠ 0, and
5. x1 ≤ x2 implies st(x1) ≤ st(x2).
Hence, st induces an order-preserving ring-isomorphism
fin(∗ℝ)/ inf(∗ℝ) ≅ ℝ.
Since ℝ is a field, also fin(∗ℝ)/ inf(∗ℝ) is a field (and inf(∗ℝ) is a maximal ideal
in fin(∗ℝ)).
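The rules of Theorem 5.21 turn the computation of standard parts into ordinary arithmetic. The following sketch assumes that ε is a nonzero infinitesimal (it exists by Corollary 5.10) and, as agreed above, drops the symbol ∗ on constants.

```latex
% Assumption (illustrative only): \varepsilon \in \inf({}^{*}\mathbb{R}), \varepsilon \neq 0.
\begin{align*}
\operatorname{st}\bigl((2 + \varepsilon)(3 - \varepsilon)\bigr)
  &= \operatorname{st}(2 + \varepsilon) \cdot \operatorname{st}(3 - \varepsilon)
   = 2 \cdot 3 = 6
  && \text{(rules 2 and 3)}, \\
\operatorname{st}\Bigl(\frac{2\varepsilon + \varepsilon^2}{\varepsilon}\Bigr)
  &= \operatorname{st}(2 + \varepsilon) = 2
  && \text{(simplify first; rule 4 is not applicable, since } \operatorname{st}(\varepsilon) = 0\text{)}.
\end{align*}
```

The second line shows why the hypothesis st(x2) ≠ 0 in rule 4 matters: numerator and denominator both have standard part 0, yet the quotient has standard part 2.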
We have
mon(x) = ∗x + inf(∗ℝ) = {y : st(y) = x}.
Indeed, y ≈ ∗x if and only if y − ∗x ∈ inf(∗ℝ). Now observe that inf(∗ℝ) is the
kernel of the standard part homomorphism st which satisfies st(∗x) = x for x ∈ ℝ.
Up to now we know that ∗ℝ contains more elements than σℝ ≅ ℝ; we also
know that among these new elements are the infinite and the nonzero infinitesimal
elements. However, ∗ℝ contains even more elements, namely all which are infinitely
close to some x ∈ σℝ, i.e. all which belong to some monad. The question arises
whether there are other “exotic” elements contained in ∗ℝ. The answer is “no”:
Proposition 5.23. The set fin(∗ℝ) is the disjoint union of all monads. The elements
of ∗ℝ \ fin(∗ℝ) are the inverses of the nonzero elements of inf(∗ℝ) = mon(0).
Proof. For any y ∈ fin(∗ℝ), we have y ∈ mon(st(y)), and so fin(∗ℝ) is contained in
the union of all monads. Conversely, each monad mon(x) is contained in fin(∗ℝ),
since ∗x ∈ fin(∗ℝ) and y ≈ ∗x implies y ∈ fin(∗ℝ) (we have |x| ≤ n for some
n ∈ ℕ, and so |y| ≤ |∗x| + 1 ≤ ∗n + 1). To see that the monads are disjoint,
assume that for x1, x2 ∈ ℝ we find some y ∈ mon(x1) ∩ mon(x2). Then st(y) = x1
and st(y) = x2, i.e. x1 = x2.
The second statement is a reformulation of the second part of Proposition 5.15.
Most of the objects we considered so far are actually external. In this connection
recall that ∗A is internal and σA is external for infinite sets A (in particular,
for A = ℕ, ℤ, ℚ, ℝ).
Theorem 5.24. The sets inf(∗ℝ), fin(∗ℝ), and mon(x) are external. Moreover,
also the mapping s : x ↦ ∗(st(x)) is external.
Proof. The set inf(∗ℝ) ⊆ ∗ℝ has no least upper bound (and so Theorem 5.14
implies that it is external): Assume that x ∈ ∗ℝ+ is such a bound. If x ∈ inf(∗ℝ),
then 2x ∈ inf(∗ℝ) (Corollary 5.18) contradicts the fact that x is an upper bound.
Hence, x ∉ inf(∗ℝ) which implies x/2 ∉ inf(∗ℝ) (since otherwise x = 2(x/2) ∈
inf(∗ℝ)). This contradicts the fact that x is the least upper bound.
If fin(∗ℝ) were internal, then the internal definition principle would imply
that
∗ℝ \ inf(∗ℝ) = {x ∈ ∗ℝ : x ≠ 0 ∧ 1/x ∈ fin(∗ℝ)}
is internal, and hence that inf(∗ℝ) is internal as well, a contradiction. Similarly,
if mon(x) were internal, the internal definition principle would imply that
inf(∗ℝ) = mon(x) − ∗x were internal.
If s were internal, then dom(s) = fin(∗ℝ) were internal by Theorem 3.19
(and even rng(s) = σℝ were internal).
limj→F f (j) = x.
The definition of convergence with respect to a filter contains the usual no-
tions of convergence as special cases:
Proposition 5.28. Let J = ℕ, X be a topological space, and F be the filter of
Example 4.2. Then, for f : J → X and x ∈ X, we have limj→∞ f (j) = x if and only
if limj→F f (j) = x.
Proof. We have limj→∞ f (j) = x if and only if for any open neighborhood U of
x we have f (j) ∈ U for all except finitely many j. The latter means f −1 (U ) =
{j : f (j) ∈ U } ∈ F which by Lemma 5.27 is equivalent to U ∈ f (F ). Hence,
limj→∞ f (j) = x if and only if any open neighborhood U of x is contained in
f (F ), i.e. if and only if limj→F f (j) = x.
Exercise 20. Let J, X be topological spaces, j0 ∈ J, and let F be the filter gener-
ated by all sets of the form J0 \ {j0 } where J0 is an open neighborhood of j0 (we
assume that none of these sets is empty which implies that the family of these sets
indeed has the finite intersection property). Prove that
where the limit on the right-hand side exists and is independent of the particular
choice of f .
Proof. First note that, since x is finite, we have x = ϕ([f]) for some bounded f :
J → ℝ (Example 5.11) so that the limit x0 = lim_{j→𝒰} f(j) exists by Theorem 5.30.
Moreover, by Lemma 5.31, the limit exists also (and has the same value) if we
choose another representative f. Given ε ∈ ℝ+, the open neighborhood U :=
(x0 − ε, x0 + ε) belongs to f(𝒰). By Lemma 5.27, this means that {j : f(j) ∈ U}
belongs to 𝒰. But this means |f(j) − x0| < ε almost everywhere, and so x ≈ ∗x0
in view of Example 5.6. We thus must have x0 = st(x).
Proposition 5.25 is a special case of Theorem 5.32: Indeed, if J = ℕ and
lim_{j→∞} f(j) = x0 exists, then lim_{j→F} f(j) = x0 for the filter of Example 4.2
by Proposition 5.28, and so lim_{j→𝒰} f(j) = x0 by Proposition 5.29 (observe that
F ⊆ 𝒰 by Exercise 11, since 𝒰 is δ-incomplete and thus free).
74 Chapter 3. Nonstandard Real Analysis
M := {n ∈ ∗ℕ | ∀y ∈ ∗ℕ : (n0 ≤ y ≤ n ⟹ α(y))}.
Then M is internal by the internal definition principle, and the assumption implies
that any n ∈ σℕ belongs to M: Indeed, any n1 ∈ ∗ℕ with n1 ≤ n is finite and thus
belongs to σℕ by Proposition 5.9. Hence, σℕ ⊆ M. Since M is internal and σℕ is
external, we have M ≠ σℕ. Hence, there is
some h ∈ M \ σℕ, which thus has the required properties.
2. Let
M := {n ∈ ∗ℕ | ∀y ∈ ∗ℕ : (y ≥ n ⟹ α(y))}.
If h ∈ ℕ∞ is infinite (Proposition 5.9), then no h1 ∈ ∗ℕ with h1 ≥ h belongs
to σℕ, and so the assumption implies h ∈ M. Hence, ℕ∞ ⊆ M. Since σℕ is
external, also ℕ∞ is external (Theorem 3.19), and so ℕ∞ ≠ M. Hence, there is
some n ∈ M \ ℕ∞ = M ∩ σℕ.
The second part of the following consequence is also called the Cauchy prin-
ciple. The name “Cauchy principle” is due to the fact that it allows us to formulate
properties which hold for infinitesimals (which have been used by Leibniz) in an
ε-δ-type manner as was first propagated by Cauchy (and which is the only rea-
sonable definition in standard analysis).
Corollary 6.2 (Permanence Principle for ∗ℝ). Let α(ε) be an internal predicate
with ε as its only free variable.
1. If α(ε) holds for all sufficiently small standard ε ∈ σℝ+, ε < ε0, then α(c)
holds also for some infinitesimal c ∈ inf(∗ℝ), c > 0.
§6 The Permanence Principle and ∗ -finite Sets 75
2. If α(c) holds for all infinitesimals c ∈ inf(∗ℝ), c > 0, then there is some
standard ε0 ∈ σℝ, ε0 > 0, such that α(c) holds for all standard or nonstandard
c ∈ ∗ℝ with 0 < c ≤ ε0.
Moreover, if α(c) holds for all infinitesimals c ∈ inf(∗ℝ), then there is
some standard ε0 ∈ σℝ, ε0 > 0, such that α(c) holds for all standard or
nonstandard c ∈ ∗ℝ with |c| ≤ ε0.
3. If α(x) holds for all sufficiently large standard x ∈ σℝ, x > x0, then α(c)
holds also for some infinite c > 0.
4. If α(c) holds for all infinite c > 0, then there is some finite standard x0 ∈ σℝ
such that α(x) holds for all standard or nonstandard x ∈ ∗ℝ, x ≥ x0.
Proof. 1. Since α(1/n) holds for all sufficiently large n, we have α(1/h) for some
h ∈ ℕ∞ by Theorem 6.1. By Proposition 5.15, we have c := 1/h ∈ inf(∗ℝ).
2. Let β(ε) denote the internal predicate
Then β(h) holds for all infinite h ∈ ℕ∞, and Theorem 6.1 thus implies that β(n)
holds for some finite n ∈ σℕ. Now the claim follows with ε0 := 1/n. The second
part follows analogously by the predicate
The remaining claims follow by applying the statements proved above to the
predicate α(1/ε) in place of α; recall that c ≠ 0 satisfies c ∈ inf(∗ℝ) if and only
if 1/c is infinite (Proposition 5.23).
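The two displayed predicates in part 2 of the proof are not reproduced above. A plausible reading, offered as an assumption rather than as the author's exact wording, is the following; the name β̃ is introduced only for this sketch.

```latex
% Hypothetical reconstruction of the predicates used in part 2 of the proof
% (requires amsmath for \implies).
\[
  \beta(n) :\equiv \forall c \in {}^{*}\mathbb{R} :
     \bigl( 0 < c \le 1/n \implies \alpha(c) \bigr),
  \qquad
  \tilde\beta(n) :\equiv \forall c \in {}^{*}\mathbb{R} :
     \bigl( |c| \le 1/n \implies \alpha(c) \bigr).
\]
% For infinite h, any c with 0 < c <= 1/h (resp. |c| <= 1/h) is infinitesimal,
% so beta(h) (resp. tilde-beta(h)) holds by hypothesis. Theorem 6.1 then
% yields a finite n for which the predicate holds, and eps_0 := 1/n works.
```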
Exercise 21. Prove the following generalizations of the permanence principle: Let
α(x) be an internal predicate with x as its only free variable.
1. If there is some h0 ∈ ∗ℕ∞ such that α(h) holds for all h ∈ ∗ℕ∞ with h < h0,
then there is some n0 ∈ σℕ such that α(n) holds for all n ∈ σℕ with n ≥ n0.
2. If there is some infinitesimal c ∈ inf(∗ℝ), c > 0, such that α(d) holds for all
infinitesimals d ∈ inf(∗ℝ) with d > c, then there is some standard ε0 ∈ σℝ+
such that α(ε) holds for all standard or nonstandard ε ∈ ∗ℝ with c < ε ≤ ε0.
Exercise 22. Prove Robinson’s sequential lemma: If x : ∗ℕ → ∗ℝ is an internal
sequence such that xn ≈ 0 for all n ∈ σℕ, then there is some h ∈ ℕ∞ such that
xn ≈ 0 for all n ∈ ∗ℕ with n ≤ h.
As a sample application of the permanence principle, let us give a simpler
proof of Theorem 5.24:
Example 6.3. inf(∗ℝ) is external. Indeed, if inf(∗ℝ) were internal, then the
predicate α(x) ≡ ∀y ∈ inf(∗ℝ) : y < x were internal. Since α(ε) holds for all standard
numbers ε > 0, the permanence principle implies that we have α(c) for some
infinitesimal c ∈ inf(∗ℝ), a contradiction.
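The contradiction at the end of Example 6.3 can be spelled out in one line; the following is only an unpacking of the predicate α, not an additional argument.

```latex
% alpha(c) quantifies over all infinitesimals y, and c itself is one of them
% (requires amsmath for \text).
\[
  \alpha(c) \;\equiv\; \forall y \in \inf({}^{*}\mathbb{R}) : y < c
  \quad\Longrightarrow\quad c < c \quad (\text{take } y := c),
\]
% which is absurd; hence inf(*R) cannot be internal.
```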
Note, however, that we used the fact that ℕ∞ is external for the proof of the
permanence principle. Thus, in a sense, the permanence principle is equivalent to
the fact that certain entities are external. This is not accidental: Many deep results
of nonstandard analysis depend on the fact that certain entities are external.
All nonstandard phenomena we observed so far are based on the fact that
∗ℕ ≠ σℕ. The transfer principle implies, roughly speaking, that ∗ℕ plays in the
nonstandard universe the same role as the set ℕ plays in the standard universe.
But then the word “finite” should be interpreted differently in the nonstandard
universe:
Definition 6.4. A set A is called finite, if it is in a one-to-one correspondence with
a set {1, . . . , n} of natural numbers.
A set A is called Dedekind finite, if it is not in a one-to-one correspondence
with a proper subset A0 ⊊ A.
We recall that a countable form of the axiom of choice implies the following
well-known result (of the standard world). For the reader unfamiliar with such
results, we provide a (standard) proof:
Proposition 6.5. A set A is finite if and only if it is Dedekind finite.
Proof. If A is finite and A0 ⊊ A, then the cardinality of A0 is strictly smaller
than that of A. Since bijections preserve the cardinality, there is no bijection
f : A → A0. Conversely, if A is infinite, we may define inductively an injection
f : ℕ → A in the following way: Choose f(1) ∈ A arbitrary, and if f(1), . . . , f(n)
are already defined, choose f(n + 1) ∈ A such that f(n + 1) ∉ {f(1), . . . , f(n)}.
Such a value exists, since otherwise we have a bijection witnessing that A is finite.
Then we may define a bijection F : A → A0 where A0 = A \ {f(1)} by
F(x) = x if x ∉ f(ℕ), and F(x) = f(n + 1) if x = f(n).
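As a concrete instance of the construction in the proof, one may take A = ℤ and the injection f(n) = n. This choice is an illustration only and is not taken from the text; ℕ is assumed to start at 1, as in the proof.

```latex
% Illustrative choice: A = Z, f : N -> Z, f(n) = n, so f(N) = {1, 2, 3, ...}
% (requires amsmath for cases and \text).
\[
  F(x) =
  \begin{cases}
    x     & \text{if } x \le 0 \quad (x \notin f(\mathbb{N})),\\
    x + 1 & \text{if } x \ge 1 \quad (x = f(x)).
  \end{cases}
\]
% F maps Z bijectively onto Z \ {1} = A \ {f(1)}, a proper subset of A,
% so Z is not Dedekind finite.
```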
For h ∈ ℕ∞ the set {1, . . . , h} is even uncountable (Theorem 3.23). Moreover,
this set is not well-ordered: The set {1, . . . , h} ∩ ℕ∞ has no smallest element
by Theorem 5.12 (but internal nonempty subsets of {1, . . . , h} have a smallest
element by Theorem 5.12).
Note that if A is hyperfinite, then A = rng f is internal by Theorem 3.19.
The transfer principle implies that a standard entity ∗ A is ∗ -finite if and
only if A is finite (see Exercise 27 below). However, the crucial point in the above
definition is that this definition applies also for nonstandard (internal) sets.
For the rest of this section, we discuss the above notion. The following results
all sound rather natural. However, the proofs are surprisingly technical. The rea-
son is that sentences about internal sets cannot easily be formulated such that the
transfer principle can be applied, i.e. such that all constants are standard objects
(and not only internal objects): To do this, we always have to formulate the sen-
tences as sentences about objects which contain the given internal sets as elements.
In other words: We must consider objects of a higher type. It may be a good idea
if the reader works through the appendix parallel to this section. Alternatively,
the reader may also want to skip the proofs of this section at the first reading and
to consider the proofs after more experience.
which is true since any injection f : {1, . . . , n} → ℕ attains at least some value
m ≥ n. Hence, the ∗-transform of this sentence is true. Since ( ) contains F ,
∗
∃x ∈ AA : “x is one-to-one” ∧ ∃y ∈ A : ∀z ∈ A : (z, y) ∈
/ x,
(¬α) ⇐⇒ β
is true by Proposition 6.5. Since ∗(A^A) and ∗(A^ℕ) consist of all internal functions
f : ∗A → ∗A resp. f : ∗ℕ → ∗A, the ∗-transform of the above sentence means that
∗A is Dedekind ∗-finite if and only if there is an internal function f : ∗ℕ → ∗A
which maps {1, . . . , h} onto ∗A for some h ∈ ∗ℕ. Since any internal function
f : {1, . . . , h} → ∗A may be extended to a function f : ∗ℕ → ∗A by Example 8,
the claim follows.
Proof. Assume that 𝒜 ∈ S is an entity whose elements are finite entities. Note
that U := ⋃𝒜 ∈ S by Theorem 2.1. Then the following sentence is true:
∀x ∈ 𝒜 : ∃y ∈ U^ℕ : ∃n ∈ ℕ : (α(y, n, x) ∧ n = f(x))
where α(y, n, x) is a shortcut for a transitively bounded sentence with the meaning
“y maps {1, . . . , n} bijectively onto x”: This sentence is a reformulation of the fact
that each B ∈ 𝒜 is finite and has f(B) elements. The transfer of this sentence
implies for any A ∈ ∗𝒜 that there is some y ∈ ∗(U^ℕ) (i.e. some internal function
y : ∗ℕ → ∗U by Theorem 3.21) and some z ∈ ∗ℕ such that y maps {1, . . . , z}
bijectively onto A and z = ∗f(A). Hence, any A ∈ ∗𝒜 is ∗-finite, and #A = z =
∗f(A).
Conversely, let A be ∗-finite, i.e. there is some h ∈ ∗ℕ and some internal
bijection f : {1, . . . , h} → A. Since A is an internal entity, we find by Corollary A.3
an entity ℬ ∈ S which consists only of entities such that A ∈ ∗ℬ (we may even
assume that all elements of ℬ have the same type as A). Put U := ⋃ℬ, and
observe that ∗A ⊆ ∗U by Theorem A.4. Let 𝒜 ⊆ ℬ be the collection of all finite
entities B ∈ ℬ. Then the sentence
is true, where α is defined as before. The transfer of this sentence means that ∗𝒜
contains all elements A0 ∈ ∗ℬ for which we find an internal function y : ∗ℕ → ∗U
(Theorem 3.21) and some h ∈ ∗ℕ such that y maps {1, . . . , h} bijectively onto
A0. But A is such an element: Indeed, since ∗A ⊆ ∗U, we may extend the given
function f to an internal function y : ∗ℕ → ∗U. Hence, we have A ∈ ∗𝒜.
If a set A is infinite, then there exists an injection f : ℕ → A (by a countable
form of the axiom of choice). We have an analogue in the nonstandard world:
Theorem 6.12. For each internal entity A precisely one of the following alternatives
holds:
1. A is ∗ -finite, or
(∃y ∈ ∗(U^ℕ), n ∈ ∗ℕ : β(y, n, x, ∗U, ∗ℕ))).
The reader should take care that the shortcuts dom(x) and rng(x) used here are
not transitively bounded, but we may take the quantifiers over the sets U and V .
Since ∗ F consists of all internal functions x with dom(x) ⊆ ∗ U and rng(x) ⊆ ∗ V
(Exercise 83) and thus in particular f ∈ ∗ F , we conclude from the ∗-transform
of the above sentence that B = rng(f) ∈ ∗ℬ (concerning ∗U and ∗V observe
Theorem A.4). Applying the converse direction of Theorem 6.11, this shows that
B is indeed ∗ -finite, as claimed.
Theorem 6.14. If A and B are ∗-finite, then A ∪ B and A × B are ∗-finite and
satisfy
#(A ∪ B) = #A + #B if A ∩ B = ∅, (6.1)
#(A × B) = #A · #B. (6.2)
A × B ∈ ∗C , A ∪ B ∈ ∗ D, PA ∈ ∗ E , (6.3)
The transfer of this sentence implies in view of (6.3) that there is some w ∈
∗(C^ℕ) (i.e. some internal function w : ∗ℕ → ∗C by Theorem 3.21) such that
w : {1, . . . , ∗f1(A) · ∗f2(B)} → z = A × B is bijective. Since ∗f1(A) = #A and
∗f2(B) = #B by Theorem 6.11, this proves (6.2).
Analogously, the transfer of the sentence
∀x ∈ A , y ∈ B, z ∈ D : z = x ∪ y =⇒ α
proves (6.1).
For the last statement, consider the sentence
∀z ∈ x : z ⊆ y
and
∃w ∈ E : “w maps {1, . . . , 2^{f1(y)}} bijectively onto x”.
Note that for x ∈ E and y ∈ A the statement α(x, y) can be interpreted as
x = P(y); now apply the ∗-transform of the above sentence with x = PA ∈ ∗ E
and y = A ∈ ∗ A .
Exercise 24. Prove that for any ∗-finite entities A, B the formula
#(A ∪ B) = #A + #B − #(A ∩ B)
holds.
Given some ∗ -finite sequence x : {1, . . . , h} → X, we denote by #X (x) the
number h.
Proposition 6.15. Given some internal entity X, the system
X^{<∗} = {x | ∃n ∈ ∗ℕ : x : {1, . . . , n} → X is internal}
X^{<∗} = {x ∈ F | ∃n ∈ ∗ℕ : x : {1, . . . , n} → X is internal}
is internal by the internal definition principle. Similarly,
Let ∑ : ℝ^< → ℝ be the mapping which associates to each finite sequence
its sum. Then ∗∑ : ∗(ℝ^<) → ∗ℝ. For any ∗-finite (internal) sequence x ∈ (∗ℝ)^{<∗},
we define
∑_{n=1}^{h} x(n) := ∗∑(x).
∑_{n=1}^{h} xn ≤ ∑_{n=1}^{h} yn.
Moreover,
∑_{n=1}^{h} 1 = h.
Proof. The first statement follows in view of the previous results by the transfer
of
∀x, y ∈ ℝ^< : (∀n ∈ ℕ : (n ≤ #ℝ(x) ⟹ x(n) ≤ y(n))) ⟹
∑(x) ≤ ∑(y).
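To see the ∗-finite sum at work, consider the internal sequence x(n) = n for n ≤ h, where h may be infinite. The sketch below assumes that the closed formula for finite sums is transferred as a statement about all finite sequences; it is an illustration and not part of the original proof.

```latex
% Standard fact, valid for every m in N:
\[
  \sum_{n=1}^{m} n = \frac{m(m+1)}{2}.
\]
% Its *-transform holds for every h in *N, in particular for infinite h:
\[
  \sum_{n=1}^{h} n = \frac{h(h+1)}{2}, \qquad
  \sum_{n=1}^{h} 1 = h .
\]
```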
§ 7 Calculus
The basic calculus is very easily described by nonstandard methods. As remarked
in Section 1.1, this was historically one of the main motivations of nonstandard
analysis.
However, the use of nonstandard analysis has the drawback that even the
simplest results make use of the axiom of choice: Recall that without the axiom
of choice (more precisely: Without the existence of δ-free ultrafilters) we were not
able to construct nonstandard embeddings. We will see later that this restriction is
essential: Indeed, with nonstandard methods one can “construct” so-called Hahn-
Banach limits and also nonmeasurable functions, as we will see; without the axiom
of choice it is for fundamental reasons not possible to prove the existence of such
objects.
The above observation is an essential disadvantage since this means in the
author’s opinion that nonstandard analysis is not a good model for “real-world”
phenomena.
On the other hand, if one is particularly interested in such objects whose
existence can only be proved by the axiom of choice, nonstandard analysis is
a much more convenient tool than classical analysis. We will see this later in
particular in our discussion of Hahn-Banach limits.
7.1 Sequences
We first discuss real sequences. Recall that a sequence is a mapping x : ℕ → ℝ;
as usual, we write xn instead of x(n). The essential point of nonstandard analysis
here is that x, as a mapping, has a ∗-transform ∗x : ∗ℕ → ∗ℝ. We will also write
∗xn in place of ∗x(n). For n ∈ ℕ, we have ∗x_{∗n} = ∗(xn) (Theorem 3.13), i.e.
the sequence ∗xn may be identified with the sequence xn for standard numbers.
However, for h ∈ ℕ∞, we get additional values of ∗xh on “infinite” places. One
will suspect that these values have something to do with the limit of the sequence
xn. This is indeed the case:
Theorem 7.1. Let xn be a real sequence. Then we have for x ∈ ℝ:
1. xn → x if and only if ∗xh ≈ ∗x for each infinite h ∈ ℕ∞.
2. xn has the accumulation point x if and only if ∗xh ≈ ∗x for some infinite
h ∈ ℕ∞.
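Two simple sequences illustrate both parts of Theorem 7.1. The examples are chosen here for illustration and are not from the text.

```latex
% (a) x_n = 1/n. For infinite h the number 1/h is infinitesimal, so
\[
  {}^{*}x_h = \frac{1}{h} \approx 0 \quad\text{for every infinite } h,
  \qquad\text{hence } x_n \to 0 .
\]
% (b) x_n = (-1)^n. By transfer every h in *N is even or odd, and
\[
  {}^{*}x_h =
  \begin{cases}
    \phantom{-}1 & h \text{ even},\\
    -1           & h \text{ odd}.
  \end{cases}
\]
% Infinite h of both parities exist (h and h+1), so 1 and -1 are both
% accumulation points. There is no limit, since *x_h is not infinitely
% close to one and the same standard number for all infinite h.
```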
∀n ∈ ∗ℕ : n ≥ ∗n0 ⟹ |∗xn − ∗x| < ∗ε.
The reverse form of the transfer principle implies that |xn − x| < ε for all n ∈ ℕ
with n ≥ n0, and so xn → x.
2. If xn has the accumulation point x, then the transfer principle immediately
shows that
∃n ∈ ∗ℕ : (n ≥ ∗n0 ∧ |∗xn − ∗x| < ∗ε).
Conversely, if ∗xh is finite for each h ∈ ∗ℕ, then the internal predicate
∀n ∈ ∗ℕ : |∗xn| ≤ m
holds for each infinite m ∈ ℕ∞. The permanence principle implies that it also
holds for some finite m = ∗m ∈ σℕ. An application of the converse direction of
the transfer principle implies that xn is bounded by m.
2. If xn → ∞, then we have for any N ∈ ℕ that there is some n0 ∈ ℕ such that
(using the transfer principle)
∀n ∈ ∗ℕ : n ≥ ∗n0 ⟹ ∗xn > ∗N.
In particular, ∗xh > ∗N for each infinite h ∈ ℕ∞. Since ∗xh > ∗N for any N ∈ ℕ,
this means that ∗xh is infinite.
Conversely, if ∗xh is infinite and positive for any h ∈ ℕ∞, then for any N ∈ ℕ
the internal formula ∗xn > ∗N holds true for each infinite n ∈ ℕ∞. The permanence
principle implies that there is some n0 ∈ σℕ such that ∗xn > ∗N holds for all n ∈ ∗ℕ
with n ≥ n0; in particular xn > N for all sufficiently large n ∈ ℕ.
It may be slightly astonishing to the reader that there is some relation between
boundedness and finiteness. However, if one thinks of ∗xh for infinite h ∈ ℕ∞
as “generalized accumulation points”, this is not surprising. This interpretation
indeed makes sense:
Corollary 7.3. Let xn be a real sequence. Then its set of accumulation points is
Proof. If the sequence xn is bounded, ∗xh is finite for each h ∈ ℕ∞. Hence, st(∗xh)
is an accumulation point of xn by Theorem 7.1. The assumption thus implies that
x = st(∗xh) ∈ ℝ is independent of h, i.e. ∗xh ≈ ∗x for all h ∈ ℕ∞ which implies
xn → x by Theorem 7.1.
Recall that a sequence xn ∈ ℝ is called a Cauchy sequence, if for each ε > 0
there is some n0 such that |xn − xm| < ε for n, m ≥ n0.
Exercise 30. Prove, without using the fact that Cauchy sequences converge, that a
real sequence xn is a Cauchy sequence if and only if ∗xh ≈ ∗xk for each h, k ∈ ℕ∞.
With Exercise 30, we find another nonstandard proof of a well-known stan-
dard fact:
Corollary 7.7. A real sequence converges if and only if it is a Cauchy sequence
(i.e., ℝ is complete).
Proof. If xn → x converges, then ∗xh ≈ ∗x ≈ ∗xk for each h, k ∈ ℕ∞ by
Theorem 7.1. Conversely, if xn is a Cauchy sequence, then xn is bounded, and so ∗xh
is finite for any h ∈ ℕ∞ by Theorem 7.2. Put x := st(∗xh) for some h ∈ ℕ∞.
Then Exercise 30 implies ∗xk ≈ ∗xh ≈ ∗x for each k ∈ ℕ∞, and it follows from
Theorem 7.1 that xn → x.
7.2 Sets
While the fact that boundedness (in the classical sense) and finiteness (in the
nonstandard sense) are related for sequences is rather intuitive, the reader may be
surprised to see the same relation for sets:
Theorem 7.8. A set A ⊆ ℝ is bounded if and only if ∗A contains only finite
elements, i.e. if and only if
∗A ⊆ fin(∗ℝ). (7.2)
More precisely, A is unbounded from above if and only if ∗A contains an infinite
positive element, and A is unbounded from below if and only if ∗A contains an
infinite negative element.
Proof. If A ⊆ ℝ is bounded from above, we find some c ∈ ℝ+ with
∀x ∈ A : x ≤ c.
The transfer reads ∀x ∈ ∗A : x ≤ ∗c which implies that all elements of ∗A are
either finite or negative. Conversely, if each x ∈ ∗A is either finite or negative, then
any sequence xn ∈ A is bounded from above. Indeed, otherwise there were some
positive infinite ∗xh by Theorem 7.2. The transfer of the sentence ∀n ∈ ℕ : xn ∈ A
implies in particular that ∗xh ∈ ∗A, i.e. ∗A contains an infinite positive element.
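The criterion of Theorem 7.8 can be checked on two familiar sets. Both examples are illustrations chosen here, not taken from the text.

```latex
% (a) A = N, viewed as a subset of R, is unbounded from above:
\[
  {}^{*}A = {}^{*}\mathbb{N} \ni h \quad\text{for any infinite } h,
\]
% so *A contains an infinite positive element.
% (b) A = (0,1) is bounded: transferring  \forall x \in A : 0 < x < 1  gives
\[
  \forall x \in {}^{*}A : 0 < x < 1,
  \qquad\text{hence } {}^{*}A \subseteq \operatorname{fin}({}^{*}\mathbb{R}).
\]
```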
Thus the map A → ∗ A reflects the boundedness of A by joining some positive
resp. negative infinite elements to ∗ A if A is unbounded. It is not very surprising
that it also reflects the local properties of A. In particular, we have:
Theorem 7.9. A set A ⊆ ℝ is closed if and only if each finite point of ∗A is
infinitely close to some (standard) point of σA, i.e. if and only if
st(∗A ∩ fin(∗ℝ)) = A. (7.3)
Proof. Let A be closed, and x ∈ ∗A be finite. We claim that st(x) ∈ A. Indeed,
put y := st(x). For each ε ∈ ℝ+, we have
∃z ∈ ∗A : |z − ∗y| < ∗ε,
because ∗A ∋ x ≈ ∗y. The converse form of the transfer principle implies that we
find some z ∈ A with |z − y| < ε. Since ε ∈ ℝ+ was arbitrary and A is closed, this
implies y ∈ A, as claimed.
Conversely, if each finite point of ∗A is infinitely close to some point from
σA, and xn ∈ A is a sequence with xn → x, we have x ∈ A: Indeed, the transfer
principle implies ∗xh ∈ ∗A for each h ∈ ∗ℕ, and by Theorem 7.1, we have ∗xh ≈ ∗x
for each h ∈ ℕ∞. Fix such an h. Then ∗xh is a finite point of ∗A and hence, by
assumption, infinitely close to some standard point ∗a of σA. Since ∗x ≈ ∗xh ≈ ∗a
and two standard points which are infinitely close coincide, we obtain
∗x = ∗a ∈ σA, and so x ∈ A. Hence, A is closed.
We call a set A ⊆ ℝ compact if A is closed and bounded. The previous results
imply the following consequence:
Corollary 7.10. A set A ⊆ ℝ is compact if and only if each point from ∗A is
infinitely close to some (standard) point of σA.
Proof. If A is compact, then a combination of (7.2) and (7.3) shows that st(∗A) =
A. Conversely, if each x ∈ ∗A is infinitely close to some standard point, then
A is closed by Theorem 7.9, and each x ∈ ∗A is finite, whence A is bounded by
Theorem 7.8.
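A half-open and a closed interval show how Corollary 7.10 separates non-compact from compact sets. The examples are illustrative and not from the text.

```latex
% (a) A = (0,1] is not compact. Transfer of  \forall n \in N : 1/n \in A  gives
\[
  \frac{1}{h} \in {}^{*}A \quad\text{for infinite } h, \qquad
  \frac{1}{h} \approx 0, \qquad 0 \notin A .
\]
% The only standard point infinitely close to 1/h is 0, so 1/h is
% infinitely close to no point of A.
% (b) A = [0,1] is compact. For y in *A transfer gives 0 <= y <= 1, so y is
% finite, and the standard part preserves the non-strict inequalities:
\[
  0 \le \operatorname{st}(y) \le 1, \qquad
  y \approx {}^{*}(\operatorname{st}(y)) \in {}^{\sigma}A .
\]
```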
We will see later another (deeper) reason why Corollary 7.10 is true. For
later applications, Corollary 7.10 is one of the most essential tools. In fact, all
results which are typically proved by the Heine-Borel compactness criterion (a set
is compact if and only if each open covering has a finite subcovering) can usually
more easily be proved by an application of Corollary 7.10.
Exercise 31. Prove that a point x ∈ A is an interior point of A if and only if the
relation y ≈ ∗x implies y ∈ ∗A, i.e. if mon(x) ⊆ ∗A. Thus, A is open if and only if
⋃_{x∈A} mon(x) ⊆ ∗A.
Exercise 32. Give a standard characterization of those sets A ⊆ ℝ with the
property that the relations x ∈ ∗A and y ≈ x imply y ∈ ∗A.
Why does the answer not contradict Exercise 31?
Exercise 33. Give a standard characterization of those sets A ⊆ ℝ satisfying
∗A = ⋃_{x∈A} mon(x).
The following exercises are easier to prove if one makes use of Theorem 7.12
below. However, the author recommends solving them now (without appealing to
Theorem 7.12).
Exercise 34. Give a standard characterization of those sets A ⊆ ℝ with the
property that each finite point x ∈ ∗A satisfies x = ∗(st(x)).
Exercise 35. Recall that a set A ⊆ ℝ is called perfect, if each point x ∈ A is an
accumulation point of A \ {x}. Give a nonstandard characterization of perfect sets.
Theorem 7.11. A point x ∈ ℝ belongs to the closure of some set A ⊆ ℝ if and
only if x is the standard part of some point from ∗A, i.e. if and only if x ∈ st(∗A).
Proof. If x belongs to the closure of A, then there is a sequence xn ∈ A with
xn → x. Then ∗xh ≈ ∗x for some h ∈ ℕ∞ by Theorem 7.1. Since the transfer
principle implies ∗xh ∈ ∗A and since x = st(∗xh), we have the required
representation of x. Conversely, if x = st(y) for some y ∈ ∗A, then y ∈ ∗Ā, since
the transfer principle implies ∗A ⊆ ∗Ā. By Theorem 7.9, y is infinitely close
to some standard point of σĀ. Since ∗x is such a standard point, we must have
∗x ∈ σĀ, i.e. x ∈ Ā.
Theorem 7.12. Let A ⊆ ℝ. A point x ∈ A is isolated if and only if ∗A contains
no point y ≠ ∗x with y ≈ ∗x.
Proof. If x is isolated, there is some ε ∈ ℝ+ such that
∀y ∈ A : (y ≠ x ⟹ |y − x| > ε).
7.3 Functions
Throughout, we consider functions f : D → ℝ where D ⊆ ℝ. Then ∗f defines
a function ∗f : ∗D → ∗ℝ which extends f. One might expect that ∗f reflects
properties like continuity of f in nonstandard terms. This is indeed true.
Theorem 7.13. Let x0 be an accumulation point of D. Then for c ∈ ℝ the following
statements are equivalent:
1. lim_{x→x0, x∈D} f(x) = c.
2. For any x ∈ ∗D with ∗x0 ≠ x ≈ ∗x0 we have ∗f(x) ≈ ∗c.
Proof. Let f(x) → c as x → x0. For any ε ∈ ℝ+, we find some δ ∈ ℝ+ with
∀x ∈ D : (0 < |x − x0| < δ ⟹ |f(x) − c| < ε).
By the permanence principle for ∗ℝ (Cauchy principle), the predicate then also
holds for some d = ∗δ where δ ∈ ℝ+. The inverse direction of the transfer principle
implies
∀x ∈ D : (0 < |x − x0| < δ ⟹ |f(x) − c| < ε).
But this means that f(x) → c as x → x0, x ∈ D.
If x0 ∈ D is isolated, any function f : D → ℝ is continuous at x0. Moreover,
by Theorem 7.12 the only point x ∈ ∗D which satisfies x ≈ ∗x0 is x = ∗x0, and so
∗f(x) ≈ ∗f(∗x0) is always satisfied.
As an application, let us give a simple proof of the following fact whose proof
is much more complicated by standard methods (recall that we defined compact
subsets of ℝ simply as the closed and bounded subsets):
Corollary 7.15. If D ⊆ ℝ is compact and f : D → ℝ is continuous, then f(D) is
compact.
Proof. Put B := f(D). By Corollary 7.10, we have to prove that each point y ∈ ∗B
is infinitely close to some standard point of σB. Thus, let y ∈ ∗B be given. Since
∗f : ∗D → ∗B is onto (Theorem 3.13), we have y = ∗f(x) for some x ∈ ∗D. Since
D is compact, Corollary 7.10 implies that x is infinitely close to some point ∗x0
with x0 ∈ D. Since f is continuous at x0 and x ≈ ∗x0, Corollary 7.14 implies
y = ∗f(x) ≈ ∗f(∗x0) = ∗(f(x0)) ∈ σB, as desired.
Proof. Without loss of generality, let f(a) < c < f(b), and we have to prove that
c ∈ B := f([a, b]). Choose h ∈ ℕ∞, and let xn := a + n(b − a)/h (n = 0, . . . , h)
be an infinite equidistant partition of ∗[a, b]. Let n0 ∈ ∗ℕ be the first index with
∗f(xn0) > ∗c, i.e. ∗f(xn0−1) ≤ ∗c. By Corollary 7.10, the point xn0 is infinitely
close to some standard point from σ[a, b], i.e. xn0 ≈ ∗x for some x ∈ [a, b]. Since
∗x ≈ xn0 ≈ xn0−1 and since f is continuous at x, we have ∗f(∗x) ≈ ∗f(xn0) > ∗c
and ∗f(∗x) ≈ ∗f(xn0−1) ≤ ∗c which implies ∗f(∗x) ≈ ∗c and so f(x) = c (since
all points are standard points).
Exercise 36. In the previous proof, we used that there is a first index n0 ∈ ∗ℕ
with ∗f(xn0) > ∗c. Why does such an index exist?
Theorem 7.17. A function f : D → ℝ is uniformly continuous if and only if the
relations x, y ∈ ∗D and x ≈ y imply ∗f(x) ≈ ∗f(y).
Proof. Let f : D → ℝ be uniformly continuous. For any ε ∈ ℝ+, we find some
δ ∈ ℝ+ such that, in view of the transfer principle,
∀x, y ∈ ∗D : (|x − y| < ∗δ ⟹ |∗f(x) − ∗f(y)| < ∗ε).
In particular, the relation x ≈ y for hyperreal numbers x, y ∈ ∗D implies
|∗f(x) − ∗f(y)| < ∗ε. Since this holds for any ε ∈ ℝ+, we even have ∗f(x) ≈ ∗f(y).
Conversely, if x ≈ y implies ∗f(x) ≈ ∗f(y), then we have for any ε ∈ ℝ+
that the internal predicate
∀x, y ∈ ∗D : (|x − y| < c ⟹ |∗f(x) − ∗f(y)| < ∗ε)
holds for any infinitesimal c ∈ inf(∗ℝ), c > 0. By the Cauchy principle, this
predicate holds also for some c = ∗δ with δ ∈ ℝ+. The converse direction of the
transfer principle now shows that the relation |x − y| < δ for x, y ∈ D implies
|f(x) − f(y)| < ε. Hence, f is uniformly continuous.
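A standard example where the criterion of Theorem 7.17 fails is f(x) = x² on D = ℝ. The sketch below uses an infinite h and two nonstandard points chosen for illustration; they are not from the text.

```latex
% f(x) = x^2 on D = R. Take an infinite h and put x := h, y := h + 1/h.
\[
  x \approx y, \qquad
  {}^{*}f(y) - {}^{*}f(x) = \Bigl(h + \frac{1}{h}\Bigr)^{2} - h^{2}
     = 2 + \frac{1}{h^{2}} \not\approx 0 .
\]
% So the criterion of Theorem 7.17 fails and f is not uniformly continuous
% on R, although f is continuous at every standard point.
```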
Theorem 7.17 might appear strange at first glance, because it is not clear how
the uniformity comes into play, compared to e.g. Corollary 7.14: The only difference
to the characterization of continuous functions by Corollary 7.14 is that we want
the relation ∗f(x) ≈ ∗f(y) for x ≈ y even if neither x nor y is a standard point.
In this sense, Theorem 7.17 is a “local” (nonstandard) characterization
of uniform continuity, which is somewhat paradoxical.
Employing the above paradox, we get a simple proof for another well-known
standard result:
Corollary 7.18. If D ⊆ ℝ is compact, then any continuous f : D → ℝ is uniformly
continuous.
Proof. Let x, y ∈ ∗D with x ≈ y. By Theorem 7.17, we have to prove that ∗f(x) ≈
∗f(y). But since D is compact, we find by Corollary 7.10 some x0 ∈ D with
x ≈ ∗x0. Since x, y ≈ ∗x0, Corollary 7.14 implies ∗f(x) ≈ ∗(f(x0)) ≈ ∗f(y), as
claimed.
We now come to the real calculus:
Theorem 7.19. Let x0 ∈ D be an accumulation point of D ⊆ ℝ. Then f : D → ℝ
is differentiable in x0 with derivative c ∈ ℝ if and only if for each x ∈ ∗D with
x ≈ ∗x0 and x ≠ ∗x0 the relation
(∗f(x) − ∗f(∗x0)) / (x − ∗x0) ≈ ∗c
holds.
Exercise 37. Prove by nonstandard methods that f (x) = |x| is not differentiable
at 0.
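For comparison, here is a case where the criterion of Theorem 7.19 succeeds: f(x) = x² at a standard point x0, evaluated at x = x0 + d for a nonzero infinitesimal d. The example is an illustration chosen here; stars on standard numbers are suppressed in the sketch.

```latex
\[
  \frac{{}^{*}f(x_0 + d) - {}^{*}f(x_0)}{d}
  = \frac{(x_0 + d)^{2} - x_0^{2}}{d}
  = 2x_0 + d \approx 2x_0 ,
\]
% hence f'(x_0) = 2 x_0.
```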
We get now a clear proof for the chain rule of the calculus. As usual, we write
f ′ (x0 ) for the derivative of f at x0 (if it exists).
Corollary 7.22 (Chain rule). We have for real functions f, g:
dg/dx ≈ ∗(g′(x0)),
df/dg ≈ ∗(f′(g(x0))),
Thus, essentially, the chain rule follows by just multiplying numerator and
denominator of df/dx by dg: The crucial point here is that we may in fact calculate
with the infinitesimals df, dx, and dg as if they were real numbers. As for real
numbers, one only has to take care of the special case dg = 0.
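Written out, the computation behind this remark looks as follows. The sketch assumes that dx, dg, df denote the increments indicated in the comments, that g is differentiable at x0 and f at g(x0), and it suppresses stars on standard numbers.

```latex
% dx : nonzero infinitesimal,
% dg := *g(x_0 + dx) - *g(x_0),   df := *f(g(x_0) + dg) - *f(g(x_0)).
% Case dg \neq 0 (dg is infinitesimal because g is continuous at x_0):
\[
  \frac{df}{dx} = \frac{df}{dg}\cdot\frac{dg}{dx}
  \approx f'(g(x_0))\, g'(x_0).
\]
% Case dg = 0: then df = 0, and dg/dx = 0 \approx g'(x_0) forces g'(x_0) = 0:
\[
  \frac{df}{dx} = 0 = f'(g(x_0))\, g'(x_0).
\]
```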
Exercise 38. Give a nonstandard proof (as intuitive as possible) of the product
formula
(f · g)′ (x0 ) = f ′ (x0 )g(x0 ) + f (x0 )g ′ (x0 ).
Exercise 39. Let f : [a, b] → ℝ be differentiable on (a, b) with derivative f′. Derive
from the mean value theorem that for each x, y ∈ ∗ℝ with ∗a ≤ x < y ≤ ∗b there
is some ξ ∈ ∗ℝ, x < ξ < y such that
(∗f(x) − ∗f(y)) / (x − y) = ∗f′(ξ).
∀x ∈ ℝ^ℕ : ∃n ∈ ℕ : α(x, n, δ) ⟹ |c − ∑_{k=1}^{n} f(x_{k−1})(ϕ(x_k) − ϕ(x_{k−1}))| < ε,
Now we apply the transfer principle, observing that ∗(ℝ^ℕ) consists of all internal
sequences, and that any ∗-finite partition may be extended to such a sequence.
We thus find that for all infinitely fine internal partitions x1, . . . , xh, the relation
|∗c − ∑_{n=1}^{h} ∗f(x_{n−1})(∗ϕ(x_n) − ∗ϕ(x_{n−1}))| < ∗ε
as claimed.
For the second statement, assume that the right-hand side of (7.5) is infinitely
close to ∗c for some c ∈ ℝ whenever xn is an infinitely fine internal partition. Then
we have for any ε ∈ ℝ+ that the internal predicate
∀x ∈ ∗(ℝ^ℕ) : ∃n ∈ ∗ℕ : ∗α(x, n, z) ⟹ |∗c − ∑_{k=1}^{n} ∗f(x_{k−1})(∗ϕ(x_k) − ∗ϕ(x_{k−1}))| < ∗ε
holds for any z ∈ inf(∗ℝ), z > 0. By the permanence principle (Cauchy principle),
the above internal predicate holds for some z = ∗δ, δ ∈ ℝ+. Then the inverse
direction of the transfer principle implies that the Riemann-Stieltjes sum for
any finite δ-fine partition differs from c by less than ε. Hence, f is Riemann-Stieltjes
integrable.
Exercise 40. If f : [a, b] → ℝ is continuous and ϕ : [a, b] → ℝ is monotone, it is
well-known that f is Riemann-Stieltjes integrable with respect to ϕ. Prove that in
this case
∗(∫_a^b f(x) dϕ(x)) ≈ ∑_{n=1}^{h} ∗f(yn)(∗ϕ(xn) − ∗ϕ(x_{n−1}))
|∑_{n=1}^{h} (∗f(yn) − ∗f(x_{n−1}))(∗ϕ(xn) − ∗ϕ(x_{n−1}))| ≤ ε ∑_{n=1}^{h} |∗ϕ(xn) − ∗ϕ(x_{n−1})|
= ε |ϕ(b) − ϕ(a)|
for any ε ∈ σℝ+. Thus, the left-hand side is infinitesimal which means that
∑_{n=1}^{h} ∗f(yn)(∗ϕ(xn) − ∗ϕ(x_{n−1})) ≈ ∑_{n=1}^{h} ∗f(x_{n−1})(∗ϕ(xn) − ∗ϕ(x_{n−1})).
where xn < ξn < xn+1. Multiplying by (xn+1 − xn) and summing up, we find
∗f(∗b) − ∗f(∗a) = ∑_{n=1}^{h} ∗f′(ξn)(xn+1 − xn).
By Exercise 40, the right-hand side is infinitely close to ∗(∫_a^b f′(x) dx), and the
left-hand side is equal to ∗(f(b) − f(a)).
Theorem 7.25. Let D = [a, b] × [c, d]. Then f : D → ℝ is continuous at (x0, y0) ∈ D
if and only if for each (x, y) ∈ ∗D with x ≈ ∗x0 and y ≈ ∗y0 we have ∗f(x, y) ≈
∗f(x, ∗y0) and ∗f(x, y) ≈ ∗f(∗x0, y).
Proof. Let f have the properties of the statement. Then we actually have
∗f(x, y) ≈ ∗f(x, ∗y0) ≈ ∗f(∗x0, ∗y0) whenever (x, y) ∈ ∗D satisfy x ≈ ∗x0 and
y ≈ ∗y0. Hence, for each ε ∈ ℝ+ the following predicate holds for each
infinitesimal z > 0:
∀x ∈ ∗[a, b], y ∈ ∗[c, d] :
(|x − ∗x0| < z ∧ |y − ∗y0| < z) ⟹ |∗f(x, y) − ∗f(∗x0, ∗y0)| < ∗ε. (7.6)
By the permanence principle (Cauchy principle), the predicate holds also for some
z = ∗δ with δ ∈ ℝ+. The inverse direction of the transfer principle shows that f
is continuous at (x0, y0).
Conversely, let f be continuous at (x0, y0). Then we find for all ε ∈ ℝ+ some
δ ∈ ℝ+ such that by the transfer principle the sentence (7.6) is true for z = ∗δ. In
particular, if x ≈ ∗x0 and y ≈ ∗y0, then |∗f(x, y) − ∗f(∗x0, ∗y0)| < ∗ε. Since this holds
for all ε ∈ ℝ+, we even have ∗f(x, y) ≈ ∗f(∗x0, ∗y0). But since then ∗f(∗x0, y) ≈
∗f(∗x0, ∗y0) ≈ ∗f(x, ∗y0), we have also ∗f(x, y) ≈ ∗f(∗x0, y) ≈ ∗f(x, ∗y0).
is differentiable at almost all points of [a, b] (in the sense of Lebesgue) and satisfies
there F ′ (x) = f (x).
Using this fact, we can prove the following standard result:
Proposition 7.27. Let f : ℝ → ℝ be measurable on some nontrivial interval and
have arbitrarily small periods, i.e. there is a sequence Tn ↓ 0 with f(x + Tn) = f(x)
(x ∈ ℝ). Then f is almost everywhere constant, i.e. there is some c ∈ ℝ with
f(x) = c for almost all x ∈ ℝ (in the sense of Lebesgue).
for all those x. But since Tn is a full period of f , we may replace x in the integral
by any other number, and thus have to prove that
$\frac{F(k_n T_n)}{k_n T_n} - \frac{F(T_1)}{T_1} \to 0.$
But the periodicity of f implies F (kn Tn ) = kn F (Tn ), and so (7.7) could be proved.
For the moment, we define [x] for x ∈ ℝ as the largest natural number which
is not larger than x. Then for each natural number n, the number
$[2^n x] - 2[2^{n-1} x]$
might be interpreted as the n-th digit (after the binary point) of the binary expansion of
x.
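The digit formula can be tried out numerically. The following is a minimal sketch, assuming x ≥ 0 is given as an exact fraction and using the expression g(n, x) = |[2^n x] − 2[2^(n−1) x]| from the proof of Theorem 7.28; the function names are illustrative and not part of the text.

```python
from fractions import Fraction
from math import floor


def binary_digit(n: int, x: Fraction) -> int:
    """n-th binary digit of x after the binary point (n >= 1, x >= 0).

    Implements g(n, x) = |[2^n x] - 2 [2^(n-1) x]|, where [.] is the
    integer part. Exact arithmetic avoids floating-point rounding.
    """
    return abs(floor(2**n * x) - 2 * floor(2 ** (n - 1) * x))


def binary_digits(x: Fraction, count: int) -> list:
    """The first `count` binary digits of x after the binary point."""
    return [binary_digit(n, x) for n in range(1, count + 1)]


# 5/8 has the binary expansion 0.101
print(binary_digits(Fraction(5, 8), 4))
```

For a standard n this is an ordinary computation; Theorem 7.28 concerns what happens when n is replaced by an infinite index h, which has no finite counterpart in code.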
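Theorem 7.28. Let $h \in \mathbb{N}_\infty$, and put
$f(x) := \operatorname{st}\bigl(\bigl|{}^*[2^h\,{}^*x] - 2\,{}^*[2^{h-1}\,{}^*x]\bigr|\bigr) \quad (x \in \mathbb{R}).$
Then $f : \mathbb{R} \to \{0, 1\}$ is nonmeasurable (in the sense of Lebesgue) on each nontrivial
interval.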
§7 Calculus 101
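Proof. Put $g(n, x) := |[2^n x] - 2[2^{n-1} x]|$, and note that $f(x) = \operatorname{st}({}^*g(h, {}^*x))$. Since
$g : \mathbb{N} \times \mathbb{R} \to \{0, 1\}$, we have ${}^*g : {}^*\mathbb{N} \times {}^*\mathbb{R} \to \{{}^*0, {}^*1\}$, and so $f : \mathbb{R} \to \{0, 1\}$. The
transfer of the statement
$\forall n, k \in \mathbb{N} : x = k2^{-n}.$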
Since g(n, x) is the n-th digit of the binary expansion of x, we have
with an appropriate $h \in \mathbb{N}_\infty$, see e.g. [SL76, Example 8.4.45] (see also [Tay69]).
The measurability of (7.8) in dependence of h is discussed in [BH02, Tay69].
Exercise 41. Let U be a free ultrafilter over ℕ. Apply Theorem 7.28 to prove that
the set
$\{x \in [0, 1] : \lim_{n \to U} ([2^n x] - 2[2^{n-1} x]) = 0\}$
axiom of choice). This fact was first observed by Sierpinski [Sie38] (Sierpinski used
standard arguments, of course).
Theorem 7.28 has another interesting consequence:
Recall that a map ‖·‖ : X → [0, ∞) on a linear space (=vector space) X over
𝕂 = ℝ or 𝕂 = ℂ is called a norm, if the following holds:
1. ‖x‖ = 0 if and only if x = 0.
2. ‖λx‖ = |λ| ‖x‖ for scalars λ ∈ 𝕂.
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖ (the triangle inequality).
Theorem 7.29. Let ℓ∞ denote the space of all bounded sequences with the natural
operations. There is a norm ‖·‖ on ℓ∞ which is additionally monotone (i.e. if
|xn| ≤ |yn|, then ‖(xn)n‖ ≤ ‖(yn)n‖) and a measurable function f : ℕ × ℝ → ℝ
such that
x ↦ ‖f(·, x)‖
is nonmeasurable on any nontrivial interval.
Proof. Fix some $h \in \mathbb{N}_\infty$, and define the norm by the formula
$\|(x_n)_n\| = \sup_{n \in \mathbb{N}} |x_n| + |\operatorname{st}({}^*x_h)|.$
It is easily checked that this indeed provides a norm (the first term is only needed
to have that ‖(xn)n‖ = 0 implies xn = 0 for all n). For $f(n, x) := [2^n x] - 2[2^{n-1} x]$,
we have
where U denotes a free ultrafilter over ℕ. (Note that in view of Theorem 5.30,
the limit in this expression always exists when xn is a bounded sequence).
Chapter 4
We shall see later that for any set S and any κ one can find κ-saturated maps
(and thus also κ-enlargements).
It looks rather asymmetric that the definition of enlargements imposes no
restriction on the cardinality of 𝒜, while the definition of polysaturated
maps does impose one. However, for enlargements this restriction is
implicit, since one may assume that 𝒜 ∈ Ŝ (see Lemma 8.7 below). Hence, each
Ŝ-enlargement is automatically an enlargement. In particular, each polysaturated
map is an enlargement.
It would not make sense to drop the assumptions on the cardinality of B in
the definition of polysaturated maps, since no such maps can exist:
Proposition 8.3. If A ∈ Ŝ is an infinite entity, then there is a nonempty system
ℬ of internal subsets of ∗A with the finite intersection property and ⋂ℬ = ∅.
Corollary 8.4. If S is infinite and ∗ is κ-saturated, then κ has at most the cardinality
of ∗S. In particular, there is no map ∗ which is κ-saturated for every κ.
This does of course not exclude that for any κ we can find a κ-saturated map
∗ (and, as remarked above, such maps indeed exist): Corollary 8.4 only implies
that ∗ then must depend on κ.
We already see that it cannot be too easy to find saturated maps: If we
use the construction of §4, then ∗S consists of the equivalence classes of maps
x : J → S. In particular, the cardinality of ∗S is at most the cardinality of S^J.
We thus need that S^J has a larger cardinality than Ŝ ⊇ S ∪ P(S) ∪ P(P(S)) ∪ · · · .
Thus, the cardinality of J must be rather large (in particular, the choice J := ℕ
is never sufficient). Hence, there exists a large class of nonstandard maps which are
not polysaturated.
Theorem 8.5. Let S be infinite. Then ∗ is an ℕ-enlargement if and only if ∗ is a
nonstandard embedding.
§8 Enlargements, Saturation, and Concurrency 105
Proof. Let ∗ be an ℕ-enlargement, and let B ⊆ S be countably infinite. To see that
∗ is a nonstandard embedding, it suffices by Theorem 3.22 to prove that σB ≠ ∗B.
Let 𝒜 be the system of all sets of the form B \ {b} (b ∈ B). Since B is
infinite, 𝒜 has the finite intersection property. Moreover, 𝒜 is countable (since
B is countable). Hence, ⋂{∗A : A ∈ 𝒜} ≠ ∅. Since ∗(B \ {b}) = ∗B \ {∗b}, this
means ∗B \ σB = ⋂_{b∈B} (∗B \ {∗b}) ≠ ∅, and so ∗ is a nonstandard embedding.
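The family used in this argument can be made concrete in the standard world. The sketch below assumes B = ℕ and represents a finite subfamily of the sets ℕ \ {b} by the finite set of removed points; both function names are hypothetical. It shows why every finite intersection is nonempty while no standard number lies in all members, so that any common element of the sets ∗(B \ {b}) has to be nonstandard.

```python
def common_element(removed):
    """Smallest natural number lying in every set N minus {b}, b in removed.

    `removed` is a finite set of naturals, so the loop terminates; this is
    the finite intersection property of the family.
    """
    n = 0
    while n in removed:
        n += 1
    return n


def set_missing(n):
    """Index b of a family member N minus {b} that does NOT contain n.

    Every standard n is excluded by the member with b = n, so the full
    intersection over all b is empty in the standard world.
    """
    return n


print(common_element({0, 1, 2, 5}))
```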
Conversely, assume that ∗ is a nonstandard embedding. We show that ∗ is
an ℕ-enlargement. Thus, let 𝒜 denote a nonempty countable system of entities
A ∈ Ŝ which has the finite intersection property. Let A1, A2, . . . be an enumeration
of all elements of 𝒜. By an obvious identification, we may assume that ℕ ⊆ S (just
rename the atoms). Define a function f : ℕ → P(A1) by f(n) := A1 ∩ · · · ∩ An.
Note that Theorem 2.1 implies P(A1) ∈ Ŝ and f ∈ Ŝ. Since 𝒜 has the finite
intersection property, we have f(n) ≠ ∅ for each n, i.e. ∀x ∈ ℕ : f(x) ≠ ∅.
The transfer principle implies ∀x ∈ ∗ℕ : ∗f(x) ≠ ∅. Since ∗ is a nonstandard
embedding, there is some h ∈ ∗ℕ \ σℕ, and we have ∗f(h) ≠ ∅. If we can prove
that ∗f(h) is contained in ∗An for each n ∈ ℕ, it follows that the intersection of
the sets ∗An (n ∈ ℕ) is nonempty, and so that ∗ is an ℕ-enlargement. To prove
∗f(h) ⊆ ∗An, let n ∈ ℕ be given. The transfer of the true sentence ∀x ∈ ℕ : (x >
n =⇒ f(x) ⊆ An) reads ∀x ∈ ∗ℕ : (x > ∗n =⇒ ∗f(x) ⊆ ∗An). Now we note
that h > ∗n by Proposition 5.9, and so ∗f(h) ⊆ ∗An, as claimed.
It is in general not true that each nonstandard embedding is ℕ-saturated.
There exist even enlargements which are not ℕ-saturated, see [CK90, Exercise 4.4.29].
However, these examples are rather “exotic”. All elementary embeddings
that we discuss are “nicer”: We will see that all embeddings arising from the
ultrapower construction of §4 are ℕ-saturated. To see this, we need some auxiliary
notions.
If f : σ A → ∗ B is a function where A ∈ S is an infinite entity, then f
cannot be an internal function, since otherwise dom(f ) = σ A were internal by
Theorem 3.19, contradicting Theorem 3.22. One may ask whether f is just the
restriction of an internal function to the external set σ A. If this is the case for any
function, we call ∗ comprehensive:
Definition 8.6. An elementary map ∗ : Ŝ → $\widehat{{}^*S}$ is called comprehensive, if for all
entities A, B ∈ Ŝ and each function f : σA → ∗B, there is an internal function
F : ∗A → ∗B such that F(a) = f(a) for each a ∈ σA.
We will see in §9 that all embeddings arising from the ultrapower construction
of §4 are comprehensive. We intend to prove now a relation of comprehensive and
106 Chapter 4. Enlargements and Saturated Models
saturated maps. In particular, we will show that all comprehensive maps in turn
are ℕ-saturated and also a partial converse. To this end, we need a result which
is of independent interest:
Lemma 8.7. To verify that a map ∗ is a κ-enlargement resp. κ-saturated, it suffices
to consider systems 𝒜 resp. ℬ in Definition 8.1 which additionally are entities
in Ŝ resp. $\widehat{{}^*S}$. Moreover, it additionally suffices to consider systems ℬ which are
subsets of standard entities.
Proof. Indeed, if 𝒜 is as in Definition 8.1, fix some A0 ∈ 𝒜 and consider the
set 𝒜0 := {A ∩ A0 : A ∈ 𝒜} in place of 𝒜: Then 𝒜0 ⊆ P(A0) is an entity by
Theorem 2.1. Moreover, ⋂σ𝒜0 ⊆ ⋂σ𝒜, since ∗(A ∩ A0) ⊆ ∗A for each A ∈ 𝒜
(Lemma 3.5). It thus suffices to verify that ⋂σ𝒜0 ≠ ∅. Now the first statement
follows if we observe that 𝒜0 has the finite intersection property and at most the
cardinality of 𝒜.
Concerning κ-saturated maps, we first argue similarly: If ℬ is given as in
Definition 8.1, fix some B0 ∈ ℬ and consider the set ℬ0 := {B ∩ B0 : B ∈ ℬ}.
Then ℬ0 consists of internal entities (Theorem 3.19) with the finite intersection
property whose cardinality is not larger than that of ℬ. Since ⋂ℬ0 ⊆ ⋂ℬ, it suffices to
verify that ⋂ℬ0 ≠ ∅. Now observe that ℬ0 consists only of internal subsets of
B0. The system P of all internal subsets of B0 is internal (Exercise 81), and so
ℬ0 ⊆ P. Proposition 3.16 implies that there is some n with $P \in {}^*S_n$ and $P \subseteq {}^*S_n$.
Consequently, $\mathcal{B}_0 \subseteq {}^*S_n$. Since ${}^*S_n$ is an entity of $\widehat{{}^*S}$, this inclusion implies also
that ℬ0 is an entity of $\widehat{{}^*S}$ (Theorem 2.1).
Theorem 8.8. Let S be infinite. If ∗ is a comprehensive nonstandard embedding,
then ∗ is ℕ-saturated. Conversely, if ∗ is polysaturated, then it is comprehensive.
More precisely, if ∗ is κ-saturated, then for any internal entities A, B and for
each function f : A0 → B with A0 ⊆ A where A0 has at most the cardinality of κ,
there is an internal function F : A → B such that F(a) = f(a) for each a ∈ A0.
Proof. We start by proving the last claim. Thus, let ∗ be κ-saturated, A, B be internal
entities, and f : A0 → B with A0 ⊆ A where A0 has at most the cardinality of κ.
Let ℱ denote the system of all internal functions g : A → B. Recall that ℱ is
internal by Exercise 82. Consider now the family ℬ of sets
B_a = {x ∈ ℱ : x(a) = f(a)} (a ∈ A0).
By the internal definition principle, each set B_a is internal. Moreover, the system
ℬ has the finite intersection property: Indeed, an element of B_{a1} ∩ · · · ∩ B_{an} can
be defined in view of Exercise 8 by redefining some function from ℱ at the finitely
many points a1, . . . , an ∈ A0. Since ℬ has at most the cardinality of A0, we find
some F ∈ ⋂ℬ. Then F ∈ ℱ satisfies F(a) = f(a) for each a ∈ A0.
G := {x ∈ ∗ℕ | ∃y ∈ U : ∀z ∈ ∗ℕ : (z ≤ x =⇒ y ∈ F(z))}
is internal by the internal definition principle. For any n ∈ ℕ, we have ∗n ∈ G,
since ℬ has the finite intersection property and thus F(1) ∩ · · · ∩ F(∗n) ≠ ∅
(recall Proposition 5.9). Hence, by the permanence principle, G also contains some
h ∈ ∗ℕ \ σℕ. This means that there is some y ∈ U such that
y ∈ ⋂{F(z) : z ∈ ∗ℕ ∧ z ≤ h} ⊆ ⋂{F(z) : z ∈ σℕ} = ⋂ℬ.
Thus, ⋂ℬ ≠ ∅, and ∗ is ℕ-saturated.
Let us now come to the point why enlargements and polysaturated maps are
of particular interest.
Definition 8.9. Let ϕ be a binary relation. We say that ϕ is satisfied by b ∈ rng(ϕ)
on A ⊆ dom(ϕ), if (a, b) ∈ ϕ for each a ∈ A.
We call ϕ concurrent on A ⊆ dom(ϕ), if for each finite subset A0 ⊆ A there
is some b ∈ rng(ϕ) which satisfies ϕ on A0 .
In other words: ϕ is concurrent on A, if for each finitely many a1 , . . . , an ∈ A
there is some b with (a1 , b), . . . , (an , b) ∈ ϕ.
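The two notions of Definition 8.9 can be compared on a toy relation. The following is a minimal sketch, assuming a finite relation stored as a set of pairs; the helper names and the size bound `max_size` are illustrative. On a finite set A, full concurrency coincides with satisfaction (take A0 = A), so the sketch only tests subsets up to a given size in order to imitate the infinite situation in which every finite subset has a witness but A itself has none.

```python
from itertools import combinations


def satisfies(phi, b, A):
    """True if b satisfies phi on A, i.e. (a, b) is in phi for each a in A."""
    return all((a, b) in phi for a in A)


def is_satisfied_on(phi, A):
    """True if some b in rng(phi) satisfies phi on A."""
    rng = {b for (_, b) in phi}
    return any(satisfies(phi, b, A) for b in rng)


def is_concurrent_up_to(phi, A, max_size):
    """True if every subset of A with at most max_size elements has a witness."""
    A = list(A)
    return all(
        is_satisfied_on(phi, subset)
        for k in range(max_size + 1)
        for subset in combinations(A, k)
    )


# Toy relation "a is different from b" on {0, 1, 2}.
phi = {(a, b) for a in range(3) for b in range(3) if a != b}
print(is_concurrent_up_to(phi, range(3), 2))  # every pair has a witness
print(is_satisfied_on(phi, range(3)))         # but no b works for all of A
```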
3. For any entity A ∈ Ŝ which has at most the cardinality of κ there is some
∗-finite entity B with σA ⊆ B ⊆ ∗A.
∗
Ad = {y ∈ ∗ rng(ϕ) | (∗ d, y) ∈ ∗ ϕ},
∗
ϕ = {(x, y) ∈ ∗ A × ∗ P(A) | x ∈ y ∧ “y is ∗ -finite”}.
Exercise 45. Let ℝ ∈ Ŝ be an entity, and ∗ : Ŝ → $\widehat{{}^*S}$ be an ℝ-enlargement. Prove
that there is a number h ∈ ∗ℕ such that
∗sin(πhx) ≈ 0 (x ∈ σℝ).
In particular, there is a number h ∈ ∗ℕ such that (7.8) is constant and thus
measurable on ℝ.
Hint: Use without proof that for each finitely many x1, . . . , xn ∈ ℝ and any
ε ∈ ℝ+ there is some h ∈ ℕ such that the distance of hxk to an integer is at most
ε (for any k).
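The approximation fact quoted in the hint can be explored by brute force. The sketch below assumes rational inputs so that the arithmetic is exact, and it simply searches for the smallest admissible multiplier below a cutoff; the names `dist_to_integer`, `find_multiplier` and the parameter `max_h` are illustrative and not taken from the text.

```python
from fractions import Fraction


def dist_to_integer(t: Fraction) -> Fraction:
    """Distance of t to the nearest integer."""
    frac = t - (t.numerator // t.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def find_multiplier(xs, eps, max_h=10**6):
    """Smallest h in {1, ..., max_h} with dist(h * x, Z) <= eps for all x in xs.

    Returns None if no such h is found below the cutoff.
    """
    for h in range(1, max_h + 1):
        if all(dist_to_integer(h * x) <= eps for x in xs):
            return h
    return None


xs = [Fraction(1, 2), Fraction(1, 4)]
print(find_multiplier(xs, Fraction(1, 100)))
```

In the exercise, the enlargement property is what turns such finite witnesses into a single h ∈ ∗ℕ that works for all standard x simultaneously.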
Actually, the statement of Exercise 45 holds if ∗ is just an arbitrary nonstan-
dard map [Tay69], but the proof is harder for this case.
The fact that ∗ ϕ is not satisfied on dom(∗ ϕ) = ∗ dom(ϕ) in Theorem 8.10 but
only on σ dom(ϕ) is rather disappointing. If we consider instead of κ-enlargements
even κ-saturated maps, we do not have this restriction. Moreover, ϕ can even be
an internal relation:
Theorem 8.12. The following statements are equivalent for an elementary map ∗:
1. ∗ is κ-saturated.
2. For any (not necessarily internal) binary relation ϕ and any (not necessarily
internal) A ⊆ dom(ϕ) which has at most the cardinality of κ and for which
each of the sets
{y ∈ rng(ϕ) : (B, y) ∈ ϕ} = B
Corollary 8.13. If ∗ is κ-saturated, then for any internal binary relation ϕ for
which dom(ϕ) has at most the cardinality of κ, we have: If ϕ is concurrent on
dom(ϕ), then it is satisfied on dom(ϕ).
Proof. The sets ϕ(b) in Theorem 8.12 are internal by the internal definition prin-
ciple.
The difference between Theorem 8.12 and Corollary 8.13 corresponds to the
different definitions of κ-saturated maps which can be found in literature (e.g. in
[SL76]):
Exercise 46. Show that for elementary maps ∗ the property of Corollary 8.13 is
equivalent to the fact that ∗ is “κ-saturated” in the sense that for any nonempty
internal system ℬ of entities which has the finite intersection property and at
most the cardinality of κ, we have ⋂ℬ ≠ ∅.
Roughly speaking, that a map ∗ is polysaturated means: Whenever it appears
possible that an internal relation can be satisfied (because there are no
finitely many elements witnessing the contrary), then it actually is satisfied (provided the
cardinality of the domain is not too large).
The following result is a generalization of the compactness property of en-
largements (Exercise 44):
Exercise 47. Let ∗ be κ-saturated. Then for any system 𝒜 of internal entities and
any internal A0 ⊆ ⋃𝒜 there is some finite 𝒜0 ⊆ 𝒜 with
A0 ⊆ ⋃𝒜0.
Proof. 1. The first statement follows from Theorem 8.12 by restricting dom(ϕ)
to the set A. For the second statement, we apply Theorem 8.10: Let ϕ ∈ Ŝ be
a binary relation, and put A := dom(ϕ), where A has at most the
cardinality of κ. Assume that ϕ is concurrent on A. Then ∗ϕ is a standard relation
which is concurrent on σA. If ∗ is a compact κ-enlargement, then ∗ϕ is satisfied
on σA (because σA contains only standard elements and has the same cardinality
as A). Hence, Theorem 8.10 implies that ∗ is a κ-enlargement.
2. Let ∗ be polysaturated, and ϕ be an internal binary relation which is concurrent
on an entity A which has at most finitely many nonstandard elements. Since we
may assume that A is infinite (otherwise ϕ is trivially satisfied on A), also S is
infinite, and so A has at most the cardinality of Ŝ. By Theorem 8.12, it follows that
ϕ is satisfied, and so ∗ is a compact enlargement. The second statement follows
immediately from 1.
There exist enlargements which are not compact enlargements (even ultrafil-
ter models with this property exist [Lux69a]). For most practical purposes, com-
pact enlargements are sufficient, and as we shall see, they are considerably easier
to construct than saturated maps.
Compact enlargements also have a “finite subcovering property” which is “in
between” the corresponding property for enlargements and for saturated maps
(Exercise 44 resp. Exercise 47): In contrast to enlargements, the covered set A0
may be internal (and need not be standard). But in contrast to saturated maps,
the covering family must consist of standard sets (and not of internal sets).
Theorem 8.16. Let ∗ be a compact κ-enlargement. Then for any system 𝒜 of
entities A ∈ Ŝ which has at most the cardinality of κ and any internal A0 ⊆ ⋃σ𝒜
there is a finite 𝒜0 ⊆ 𝒜 with
A0 ⊆ ⋃σ𝒜0.
ϕ := {(x, y) ∈ ∗𝒜 × A0 | y ∉ x}.
If there is no finite 𝒜0 ⊆ 𝒜 with A0 ⊆ ⋃σ𝒜0, then ϕ is concurrent on σ𝒜. Since
σ𝒜 has at most the cardinality of κ, it follows that ϕ is satisfied by some a ∈ A0
on σ𝒜, i.e. a ∉ ⋃σ𝒜, a contradiction to our assumption A0 ⊆ ⋃σ𝒜.
§ 9 Saturated Models
In this section, we shall “construct” for any κ and any set S a model which pro-
vides a κ-saturated map ∗. (This of course implies that there exist polysaturated
maps for any S). To this aim, we first construct enlargements and then consider a
countable chain of enlargements (a so-called direct limit) which provides a compact
enlargement. If we consider even an uncountable chain, we obtain a κ-saturated
map. The model for the enlargement is a special ultrapower model. In contrast,
the direct limit models are not ultrapowers.
The way we proceed is not the only possible construction of polysaturated
maps. In fact, it is even possible to obtain polysaturated models as ultrapowers,
see e.g. [LR94, Lux69a]. However, the construction of the latter depends on the
existence of a certain type of ultrafilters which is extremely hard to prove. For
this reason, we chose the direct limit construction. The latter is essentially due to
W. A. J. Luxemburg [Lux69a, SL76].
However, for many applications, it suffices to have ℕ-saturated maps. For
these applications, we need no additional considerations at all, since each nonstandard
map ∗ as constructed in §4 by ultrapowers has this property:
Theorem 9.1. Let ∗ be a map arising from the ultrapower construction of §4. Then
∗ is comprehensive. In particular, if S is infinite and U is δ-incomplete (i.e. ∗ is
nonstandard), then ∗ is ℕ-saturated.
Proof. The second statement follows from the first by Theorem 8.8. Let entities
A, B ∈ S and a function f : σ A → ∗ B be given. Consider first the abstract model
S from Section 4.2. Then ∗ A corresponds to the set of all equivalence classes
of maps x : J → A, and σ A corresponds to the subset of classes which have
a constant function as their representative (we write in abuse of notation that
[a] is the equivalence class of the constant function with value a). Similarly, ∗ B
corresponds to the set of all classes of functions x : J → B. Hence, for each a ∈ A,
the value f ([a]) corresponds to the equivalence class of some function fa : J → B.
Consider now the function F0 : J → B^A, defined by (F0(j))(a) := fa(j). By
Proposition 4.19, the equivalence class [F0] is mapped by ϕ into an internal element
F. By Proposition 4.19, we have F ∈ ∗(B^A), i.e. F is an internal function from ∗A
into ∗B (Theorem 3.21). To prove that F is an extension of f, we have to prove
that (∗a, f(∗a)) ∈ F is a true sentence for any given a ∈ A. By Theorem 4.18, this
sentence is true if and only if ([a], [fa]) ∈U [F0] where the shortcut for pairing is
to be understood in the interpretation of Theorem 4.18. By the Łoś/Luxemburg
Theorem 4.14, this is equivalent to (a, fa(j)) ∈ F0(j) for almost all j which in turn
is equivalent to (F0(j))(a) = fa(j) for almost all j. The latter is true by construction,
and so we have indeed F(∗a) = f(∗a) for any given a ∈ A.
FA := {j ∈ J : A ∈ j}.
The collection F0 of all sets FA (i.e. F0 := {FA : A ⊆ λ}) has the finite intersection
property. Indeed, if A1, . . . , An ⊆ λ, then j := {A1, . . . , An} belongs to J, and
the intersection F_{A1} ∩ · · · ∩ F_{An} contains j. Hence, F0 generates a filter F (recall
Proposition 4.5). By Theorem 4.9 (axiom of choice!), there exists an ultrafilter U
on J containing F and thus F0.
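The finite intersection property of the sets FA can be verified exhaustively on a toy instance. The sketch below assumes λ = {0, 1} and takes J to be the set of all finite collections of subsets of λ, as suggested by the surrounding proof; the two-element λ and all identifiers are hypothetical choices for illustration only.

```python
from itertools import combinations


def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


lam = {0, 1}             # toy stand-in for the set lambda
subsets = powerset(lam)  # the 4 subsets A of lam
J = powerset(subsets)    # the 16 finite collections j of subsets of lam


def F(A):
    """F_A := {j in J : A in j}."""
    return {j for j in J if A in j}


def check_fip():
    """For every choice A1, ..., An, the collection j = {A1, ..., An}
    lies in F_A1 ∩ ... ∩ F_An, exactly as in the argument above."""
    for r in range(1, len(subsets) + 1):
        for family in combinations(subsets, r):
            j = frozenset(family)
            common = set.intersection(*(F(A) for A in family))
            if j not in common:
                return False
    return True


print(len(J), check_fip())
```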
We claim that U is λ-adequate. Thus, let A be a nonempty family of subsets
of λ with the finite intersection property. Recall that each j ∈ J is a finite collection
of subsets of λ. Let J0 be the subset of those collections which contain only elements
of 𝒜. Since 𝒜 has the finite intersection property, each of the sets
$B_j := \bigcap_{A \in j} A \quad (j \in J_0)$
such that for any A ∈ A the relation f (j) ∈ A holds for almost all j. But since
U is S-adequate, we find a function f : J → S such that for each A ∈ A there is
some F ∈ U with f (F ) ⊆ A. This is the required function, since f (F ) ⊆ A means
f (j) ∈ A for all j ∈ F , i.e. (since F ∈ U ) for almost all j.
$\ast_n^m = \ast_k^m \ast_n^k$ and $i_n^m = i_k^m i_n^k$ $(n \le k \le m)$.    (9.1)
A sentence α in the language $L_n$ is transformed by $\ast_n^m$ into a sentence ${}^{i_n^m}\alpha$
in the language $L_m$ which arises from α by replacing any occurrence of a constant
c in α by the constant $i_n^m(c)$.
The crucial point now is that a transitively bounded sentence α in the language
$L_n$ is true (interpreted by $I_n$ in $\widehat{S}_n$) if and only if the sentence ${}^{i_n^m}\alpha$ in the
language $L_m$ is true (interpreted by $I_m$ in $\widehat{S}_m$): This is a reformulation of the
statement that each $\ast_n$ is elementary (after a trivial induction). Hence, we have:
Lemma 9.5. Each of the maps $\ast_n^m : \widehat{S}_n \to \widehat{S}_m$ is elementary.
Proof. It only remains to verify the second condition of Definition 3.1, i.e. that
${}^{\ast_n^m}S_n = S_m$. But since ${}^{\ast_n}S_n = S_{n+1}$ (because $\ast_n$ is elementary), this follows by a
trivial induction on m.
To simplify notation, we assume that all of the sets $\widehat{S}_n$ are pairwise disjoint
(which can be arranged by choosing the atom sets $S_n$ pairwise disjoint). This
means that for each $x \in \bigcup_n \widehat{S}_n$ there is a unique n with $x \in \widehat{S}_n$. We denote this n
by $n_x$.
$[x] =_\omega [y] \iff \ast_n^k(x) = \ast_m^k(y)$ for all $k \ge n, m$ where $n = n_x$, $m = n_y$.    (9.2)
$[x] \in_\omega [y] \iff \ast_n^k(x) \in \ast_m^k(y)$ for all $k \ge n, m$ where $n = n_x$, $m = n_y$.    (9.3)
$[x] =_\omega [y] \iff \ast_n^k(x) = \ast_m^k(y)$ for some $k \ge n, m$ where $n = n_x$, $m = n_y$.    (9.4)
$[x] \in_\omega [y] \iff \ast_n^k(x) \in \ast_m^k(y)$ for some $k \ge n, m$ where $n = n_x$, $m = n_y$.    (9.5)
Proof. The fact that the right-hand side of (9.2) resp. of (9.3) is equivalent to the
right-hand side of (9.4) resp. of (9.5) follows by observing that for any $K \ge k \ge n, m$,
we have $\ast_n^K = \ast_k^K \ast_n^k$ and $\ast_m^K = \ast_k^K \ast_m^k$ and that $\ast_k^K$ preserves the equality and
element relation: The latter follows from Lemma 3.5, since $\ast_k^K$ is elementary by
Lemma 9.5. Once we know this equivalence, it follows straightforwardly that $=_\omega$
and $\in_\omega$ are well-defined.
A sentence α in the language Ln will now be interpreted in the abstract
model Sω by mapping any constant c ∈ cns(Ln ) corresponding to some x ∈ Sn
to the equivalence class [x].
Theorem 9.7. A transitively bounded sentence α in the language Ln is true in the
abstract model Sω if and only if it is true under the interpretation map In .
Proof. It suffices to prove that whenever α is true under the interpretation map
In , it is also true in the abstract model Sω : Indeed, if we have proved this, and α
is false under the interpretation map In , then the transitively bounded sentence
β = ¬α is true. But then, by assumption, β is true in Sω , and so α is false in Sω .
Thus, without loss of generality, assume that α is true under the interpre-
tation map In . We may equivalently rewrite α in prenex normal form, i.e. in the
form
Q1 x1 : Q2 x2 : . . . Qk xk : β
where Qj stands for a quantifier ∀ or ∃, and where β contains no further quantifiers.
For the proof, we make use of so-called Herbrand-Skolem functors. These are
defined as follows: Let $j_1 < j_2 < \cdots < j_p$ be those indices for which $Q_{j_i}$ is the
symbol ∃ (p = 0 is not excluded). In particular, α has the form
$\forall x_1 : \ldots \forall x_{j_1-1} : \exists x_{j_1} : Q_{j_1+1} x_{j_1+1} : \ldots Q_k x_k : \beta$
($j_1 = 1$ is not excluded). Note that $I_n$ is onto, i.e. each possible value of $x_j$ is
actually represented by some constant in $\operatorname{cns}(L_n)$. Then the statement
that α is true under the interpretation $I_n$ means that for each possible value of
$x_1, \ldots, x_{j_1-1}$, we find a constant $c_1^n(x_1, \ldots, x_{j_1-1})$ such that the sentence
$Q_{j_1+1} x_{j_1+1} : \ldots Q_k x_k : \beta(c_1^n)$
is true, where $\beta(c_1^n)$ arises from β by replacing all free occurrences of $x_{j_1}$ by the
constant $c_1^n(x_1, \ldots, x_{j_1-1})$. By the axiom of choice, we may assume that $c_1^n$ is a
function. Note that conversely, the existence of such a function $c_1^n$ implies that α is
true under the interpretation map $I_n$. (However, the reader should be aware that
$c_1^n$ is not a function in the sense of the language $L_n$). The function $c_1^n$ is called
the Herbrand-Skolem functor for the existential quantifier $Q_{j_1}$. By an induction,
we thus can eliminate all quantifiers, and find that α is true if and only if
$\beta(c_1^n, \ldots, c_p^n)$
holds for any values of the free variables $x_j$ ($j \ne j_1, \ldots, j_p$) where $c_k^n$ is a Herbrand-Skolem
functor depending only on the choice of the values of $x_j$ ($j \ne j_1, \ldots, j_p$).
Note now that α is transitively bounded, and so it remains true under the
interpretation $I_m$ if $m \ge n$, provided we replace all occurrences of constants c by $i_n^m(c)$.
Hence, we find for each $m \ge n$ Herbrand-Skolem functors $c_1^m, \ldots, c_p^m$ such that
${}^{i_n^m}\beta(c_1^m, \ldots, c_p^m)$    (9.6)
holds for all values of the free variables $x_j$ ($j \ne j_1, \ldots, j_p$) in $\widehat{S}_m$, where ${}^{i_n^m}\beta$ arises
from β by replacing all occurrences of constants c by $i_n^m(c)$. We claim that there
are even Herbrand-Skolem functors $c_1^\omega, \ldots, c_p^\omega$ such that
${}^{\omega}\beta(c_1^\omega, \ldots, c_p^\omega)$    (9.7)
for all j, and put $b_j := {}^{\ast_{n_j}^m}a_j$. Then $b_j \in \widehat{S}_m$, and $[a_j] = [b_j]$. In particular,
$x_j = [b_j]$. Now let $c_k^\omega$ be the equivalence class containing the interpretation of $c_k^m$
under the values $x_j = b_j$. For the choice $x_j = b_j$ the formula (9.6) is true in $\widehat{S}_m$,
and it follows that also (9.7) is true for $x_j = [b_j]$ in $\mathcal{S}_\omega$, as claimed:
Indeed, since β contains no quantifiers, it consists only of logical connectives
and elementary formulas a = b and a ∈ b. If we can prove that each elementary
formula in (9.6) is true if and only if the corresponding formula in (9.7) is true,
also the complete formulas in (9.6) resp. (9.7) must have the same truth value.
It thus remains to prove that for any $a, b \in \widehat{S}_m$ we have a = b if and only if
$[a] =_\omega [b]$, and a ∈ b if and only if $[a] \in_\omega [b]$. But it follows from (9.2) that $[a] =_\omega [b]$
implies $\ast_m^m(a) = \ast_m^m(b)$ and so a = b (since we assume $a, b \in \widehat{S}_m$); analogously,
$[a] \in_\omega [b]$ implies by (9.3) that $\ast_m^m(a) \in \ast_m^m(b)$, and so a ∈ b. For the converse
implication, observe that a = b implies $\ast_m^m(a) = \ast_m^m(b)$, and so $[a] =_\omega [b]$ by (9.4);
analogously, a ∈ b implies $\ast_m^m(a) \in \ast_m^m(b)$, and so $[a] \in_\omega [b]$ by (9.5).
It follows from the proof, that Theorem 9.7 holds not only for transitively
bounded sentences but even for a larger class of sentences, if the corresponding
embeddings ∗n preserve the truth for the corresponding class of sentences.
Exercise 48. Prove that any sentence α can equivalently be rewritten in prenex
normal form.
As in Section 4.3, we now want to replace the abstract model Sω by some
superstructure Sω . Of course, this replacement should be done in such a way that
transitively bounded sentences α in any of the languages Ln should keep their
truth value.
If we want to proceed analogously to Section 4.3, we have to define sets
Ik ⊆ Sω which represent “internal objects of level at most k”, in particular I0
will represent the atoms.
The definition of Ik is rather straightforward: Recall that each superstruc-
ture Sn is built of level sets Sn,k (i.e. Sn,0 represents the atoms of the superstruc-
ture, and Sn,k+1 = Sn,0 ∪ P(Sn,k )). Then we just let
We emphasize that the proof of Theorem 9.10 makes use of the fact that the
sentence is transitively bounded, but this restriction could be relaxed in the same
way in which this restriction in Theorem 4.18 could have been relaxed.
Now we define a new language $L_\omega$ by taking the set of its constants, $\operatorname{cns}(L_\omega)$,
in a one-to-one correspondence with $\widehat{S}_\omega$ (the correspondence being the interpretation
map $I_\omega$). Now we can define maps $\ast_n^\omega : \widehat{S}_n \to \widehat{S}_\omega$ by
$\ast_n^\omega(x) = \varphi_\omega([x]),$
and also $i_n^\omega : \operatorname{cns}(L_n) \to \operatorname{cns}(L_\omega)$ by $i_n^\omega = I_\omega^{-1} \circ \ast_n^\omega \circ I_n$, i.e. if $c \in \operatorname{cns}(L_n)$
corresponds to the element $x \in \widehat{S}_n$, then $i_n^\omega(c)$ corresponds to the element $\ast_n^\omega(x)$.
For later reference, we observe by the way that the value $\ast_n^\omega(x)$ depends by
definition only on the equivalence class of x, and so it follows that
$\ast_n^\omega = \ast_k^\omega \ast_n^k \quad (n \le k).$
Hence, the relation (9.1) holds also for m = ω (and even for k = ω or n = ω if we
define $\ast_\omega^\omega(x) = x$).
Theorem 9.11. Each of the maps $\ast_n^\omega : \widehat{S}_n \to \widehat{S}_\omega$ is elementary.
Moreover, if an element $x \in \widehat{S}_\omega$ is internal under this map, then there is
some m < ω such that x is a standard entity under the map $\ast_m^\omega : \widehat{S}_m \to \widehat{S}_\omega$.
By Lemma 9.8, the relation $[x] \in_\omega [S_n]$ implies $[x] \in I_0$. But also the converse
holds: If $[x] \in I_0$, then $[x] \in_\omega [S_n]$.
To see the latter, let $[x] \in I_0$ and note that we may by definition of $I_0$
choose some representative x which is an atom $x \in S_m$ for some m. If $m \le n$, then
${}^{\ast_m^n}S_m = S_n$, because $\ast_m^n$ is elementary (Lemma 9.5), and so $x_0 = \ast_m^n(x) \in S_n$.
Since $x_0 \sim x$, we have $[x] = [x_0] \in_\omega [S_n]$. If $m \ge n$, then the relation ${}^{\ast_n^m}S_n = S_m$
(because $\ast_n^m$ is elementary) implies that $[x] \in_\omega [{}^{\ast_n^m}S_n] = [S_n]$. Hence, in both cases
$[x] \in_\omega [S_n]$, as claimed.
It thus follows from (9.10) that
${}^{\ast_n^\omega}S_n = \{\varphi_\omega([x]) : [x] \in I_0\} = I_0 = S_\omega,$
where we made use of (9.9). This completes the proof that $\ast_n^\omega$ is elementary.
If x is internal under the map $\ast_n^\omega$, then $x \in {}^{\ast_n^\omega}S_{n,k}$ for some k. By (9.9), we
have
${}^{\ast_n^\omega}S_{n,k} = \varphi_\omega([S_{n,k}]) = \{\varphi_\omega([y]) : [y] \in_\omega [S_{n,k}]\}.$
In particular, $x = \varphi_\omega([y])$ for some $[y] \in \mathcal{S}_\omega$. We must have $y \in \widehat{S}_m$ for some m,
and so $x = \ast_m^\omega(y)$.
We thus have constructed a true limit model Sω for the sequence Sn . Since
each of the models Sn becomes successively “more saturated”, and since any tran-
sitively bounded sentence in the model Sn is also true in the limit model Sω , one
might expect that the limit model is “enormously saturated”. This is indeed the
case:
Theorem 9.12. The map $\ast_0^\omega : \widehat{S} \to \widehat{S}_\omega$ is a compact enlargement.
Applying the inverse form of the transfer principle (for the map $\ast_m^\omega$), we find that
there is some y ∈ rng(ψ) such that $(b_{a_1}, y), \ldots, (b_{a_n}, y) \in \psi$, i.e. ψ is concurrent
on B, as claimed.
Since $\ast_m^{m+1} : \widehat{S}_m \to \widehat{S}_{m+1}$ is an enlargement, we may conclude that there
is some $y \in \widehat{S}_{m+1}$ such that $({}^{\ast_m^{m+1}}b_a, y) \in {}^{\ast_m^{m+1}}\psi$ holds for each $b_a \in B$. Hence,
$({}^{\ast_{m+1}^\omega}({}^{\ast_m^{m+1}}b_a), \ast_{m+1}^\omega y) \in {}^{\ast_{m+1}^\omega}({}^{\ast_m^{m+1}}\psi)$ for each a ∈ A which means $(a, \ast_{m+1}^\omega y) \in \varphi$
for each a ∈ A, i.e. ϕ is satisfied on A, as desired.
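$\ast_{\alpha_1}^{\alpha_3} = \ast_{\alpha_2}^{\alpha_3} \ast_{\alpha_1}^{\alpha_2},$
and $\ast_\alpha^\alpha$ is the identity.
2. For each β < α the map $\ast_\beta^\alpha : \widehat{S}_\beta \to \widehat{S}_\alpha$ is elementary.
3. If α is a successor ordinal, i.e. α = β + 1 for some ordinal β, then $\ast_\beta^\alpha : \widehat{S}_\beta \to \widehat{S}_\alpha$
is even an enlargement.
4. If α is a limit ordinal (i.e. not a successor ordinal), then each element which is
internal under some map $\ast_\beta^\alpha : \widehat{S}_\beta \to \widehat{S}_\alpha$ with β < α is standard under another
such map.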
Theorem 9.13. For each set S there exists a transfinite sequence as described above
such that $S_0 = S$.
Proof. The induction start is clear: Put $S_0 := S$, and let $\ast_0^0 : \widehat{S}_0 \to \widehat{S}_0$ be the
identity. If α = β + 1, let $\ast_\beta^\alpha : \widehat{S}_\beta \to \widehat{S}_\alpha$ be an enlargement, and for ordinals γ < β
define $\ast_\gamma^\alpha := \ast_\beta^\alpha \ast_\gamma^\beta$.
Thus, the only case which needs some care is if α is a limit ordinal. For
α = ω, we have proved the existence of a limit model. However, for the general
case, the proof is analogous: Actually, we have at no place used the fact that we
determined the limit of a countable sequence (only the relation (9.1) has been
used). For clearer representation we had used in the above proof languages $L_\beta$,
interpretation maps $I_\beta : \operatorname{cns}(L_\beta) \to \widehat{S}_\beta$, and maps $i_{\beta_1}^{\beta_2} : \operatorname{cns}(L_{\beta_1}) \to \operatorname{cns}(L_{\beta_2})$:
One may of course just put $\operatorname{cns}(L_\beta) := \widehat{S}_\beta$, and let $I_\beta$ and $i_{\beta_1}^{\beta_2}$ be the identity map.
Then, with the same proof as above, we can construct a limit model $\widehat{S}_\alpha$ such that
each of the embeddings $\ast_\beta^\alpha : \widehat{S}_\beta \to \widehat{S}_\alpha$ is elementary.
The crucial point of the above construction is that we now indeed obtain κ-
saturated models. The proof is in most parts analogous to Theorem 9.12, but one
has to take care, since there need not exist a maximum for infinite sets of ordinals.
To ensure that the supremum is actually strictly smaller than the ordinal number
α of the highest model, we implicitly make use of a property of successor cardinals
α which is called “regularity” in literature on set theory:
Theorem 9.14. Let S and κ be arbitrary sets. Then in the transfinite sequence
from Theorem 9.13 the elementary map $\ast_0^\alpha : \widehat{S} \to \widehat{S}_\alpha$ is κ-saturated, if α is the
first ordinal with a strictly larger cardinality than κ.
Proof. We may assume that κ is infinite, since otherwise each elementary embed-
ding is κ-saturated.
Let B be a nonempty system of internal entities which has the finite intersec-
tion property and at most the cardinality of κ. We have to prove that B = ∅.
Since each B ∈ B is an internal set, we find some index βB < α such that B is a
standard set under the embedding ∗α
βB : SβB → Sα , i.e. there is some CB ∈ SβB
α
such that B = ∗βB C B .
Put $\beta := \bigcup\{\beta_B : B \in \mathcal{B}\}$. Then β is an ordinal number. Since $\beta_B < \alpha$, the
set $\beta_B$ has at most the cardinality of κ. Hence, β has at most the cardinality of
the set $\kappa \times \mathcal{B}$. By assumption, $\mathcal{B}$ has at most the cardinality of κ. Hence, β has
at most the cardinality of κ × κ which is the cardinality of κ, since κ is infinite.
Summarizing, β has at most the cardinality of κ. By our choice of α, we may
conclude that β < α.
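The cardinality estimate of this paragraph can be collected into a single chain. The following display is only a restatement of the steps above, with |·| denoting cardinality and $\mathcal{B}$ the given system of internal sets:

```latex
|\beta| \;=\; \Bigl|\bigcup_{B\in\mathcal{B}} \beta_B\Bigr|
\;\le\; |\kappa\times\mathcal{B}|
\;\le\; |\kappa\times\kappa|
\;=\; |\kappa| \;<\; |\alpha| .
```

Since β is a supremum of ordinals below α, we have β ≤ α, and the case β = α is excluded by |β| < |α|.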
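Consider in the model $S_\beta$ the sets $A_B := {}^{*^{\beta}_{\beta_B}}C_B$. Then the system $\mathcal{A} :=
\{A_B : B \in \mathcal{B}\}$ has the finite intersection property: Indeed, if $B_1,\ldots,B_n \in \mathcal{B}$,
then $B_1 \cap \cdots \cap B_n \neq \emptyset$, since $\mathcal{B}$ has the finite intersection property by assumption.
Now note that ${}^{*^{\alpha}_{\beta}}A_{B_i} = {}^{*^{\alpha}_{\beta_{B_i}}}C_{B_i} = B_i$, and so
$${}^{*^{\alpha}_{\beta}}(A_{B_1} \cap \cdots \cap A_{B_n}) = {}^{*^{\alpha}_{\beta}}A_{B_1} \cap \cdots \cap {}^{*^{\alpha}_{\beta}}A_{B_n} = B_1 \cap \cdots \cap B_n \neq \emptyset,$$
because $*^{\alpha}_{\beta}$ is a superstructure monomorphism. Hence, $A_{B_1} \cap \cdots \cap A_{B_n} \neq \emptyset$, and
$\mathcal{A}$ has the finite intersection property, as claimed.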
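Since $*^{\beta+1}_{\beta} : S_\beta \to S_{\beta+1}$ is an enlargement, we may conclude that there is
some $d \in S_{\beta+1}$ which is contained in each of the sets $D_B := {}^{*^{\beta+1}_{\beta}}A_B$ ($B \in \mathcal{B}$). Since
$D_B = {}^{*^{\beta+1}_{\beta_B}}C_B$, we find by Lemma 3.5 that $b := {}^{*^{\alpha}_{\beta+1}}d \in {}^{*^{\alpha}_{\beta+1}}D_B = {}^{*^{\alpha}_{\beta_B}}C_B = B$
($B \in \mathcal{B}$). In particular, $b \in \bigcap\mathcal{B}$, and so $*^{\alpha}_{0}$ is κ-saturated.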
there exist enlargements which are not -saturated. By Theorem 9.1, these en-
largements cannot be described by ultrapower models). However, even if one uses
ultrapowers, it is not clear whether we end up with the same model as in [SL76],
because the embedding of the abstract model into the superstructure in each step
does not preserve all sentences but only the transitively bounded sentences.
Chapter 5
or, equivalently
$$\sup_{\|x\|=1} |f(x)| = \sup_{\|x\|\le 1} |f(x)| = \sup_{x\neq 0} \frac{|f(x)|}{\|x\|} < \infty.$$
126 Chapter 5. Functionals, Generalized Limits, and Additive Measures
The minimum of all such constants L (which is the above supremum) is called the
norm. In particular, a linear functional is bounded if and only if
$$F(x + \lambda x_1) := f(x) + \lambda c$$
with some $c \in \mathbb{R}$. We have to prove that we may choose c in such a way that
$\|F\| \le \|f\|$, i.e. $|F(u)| \le \|f\|\,\|u\|$ for all u ∈ U. The latter is equivalent to
$$|f(x) + c| \le \|f\|\,\|x + x_1\| \quad (x \in X_0).$$
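$$\|F\| = \sup_{\|x\|=1}\,\sup_{|\lambda|=1} \operatorname{Re}(\lambda F(x)) = \sup_{\|x\|=1}\,\sup_{|\lambda|=1} \operatorname{Re}(F(\lambda x)) \le \sup_{\|x\|=1} |F_{\mathbb{R}}(x)| = \|F_{\mathbb{R}}\|,$$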
Also $\ell_p$ is a Banach space. It is well-known that in case 1 ≤ p < ∞ the space $\ell_{p'}$
with $\frac{1}{p} + \frac{1}{p'} = 1$ is dual to $\ell_p$ in the sense that each element $y = (\eta_n)_n \in \ell_{p'}$ defines
a bounded linear functional $f_y$ on $\ell_p$ by means of the formula
$$f_y(x) = \sum_{n=1}^{\infty} \eta_n \xi_n, \tag{10.2}$$
and
$$\sum_{n=1}^{N} |\eta_n| \le (1 + \varepsilon)\,\|f\|.$$
for any choice $\lambda_1,\ldots,\lambda_K \in \mathbb{R}$. Indeed, if this is not true, we find for each N
corresponding numbers $\lambda_{k,N}$ such that
But this contradicts the definition of the norm in ℓ∞ (note that the right-hand
side does not vanish for sufficiently large N , since we assumed that the vectors
x1 , . . . , xN are linearly independent). This contradiction shows that we find indeed
some N satisfying (10.4).
Together with $x_1,\ldots,x_K$, we consider the truncated vectors $y_1,\ldots,y_K \in \mathbb{R}^N$
where $y_k := (\xi_{k,1},\ldots,\xi_{k,N})$. If we equip $\mathbb{R}^N$ with the max-norm, we may
read (10.4) as
Recalling that $x_1,\ldots,x_K$ are linearly independent, (10.6) implies that also the
vectors $y_1,\ldots,y_K$ are linearly independent. On the subspace of $\mathbb{R}^N$ spanned by
$y_1,\ldots,y_K$ we define a functional g by
g(λ1 y1 + · · · + λK yK ) := f (λ1 x1 + · · · + λK xK ).
$$g(\lambda_1 y_1 + \cdots + \lambda_K y_K) \le \|f\|\,\|\lambda_1 x_1 + \cdots + \lambda_K x_K\|_\infty \le \|f\|\,(1 + \varepsilon)\,\|\lambda_1 y_1 + \cdots + \lambda_K y_K\|.$$
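and so (10.3) holds. In order to prove the norm estimate, we consider the vector
$x := (\operatorname{sgn}(\eta_1),\ldots,\operatorname{sgn}(\eta_N))$. Then
$$\sum_{n=1}^{N} |\eta_n| = \sum_{n=1}^{N} \operatorname{sgn}(\eta_n)\,g(e_n) = g(x) \le \|g\|\,\|x\| \le (1 + \varepsilon)\,\|f\|.$$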
Using Lemma 10.4, we obtain now:
Theorem 10.5. Let $* : S \to {}^*S$ be a nonstandard embedding. If ∗ is even an enlargement,
then for any $f \in \ell_\infty^*$ there exists an internal ∗-finite sequence $\eta_1,\ldots,\eta_h \in {}^*\mathbb{R}$
with
$${}^*(f(x)) = \sum_{n=1}^{h} \eta_n\,{}^*\xi_n \quad (x = (\xi_n)_n \in \ell_\infty), \tag{10.7}$$
where
$$\|f\| = \operatorname{st}\Bigl(\sum_{n=1}^{h} |\eta_n|\Bigr). \tag{10.8}$$
Conversely, each ∗-finite internal sequence for which the right-hand side of (10.8)
is finite gives rise to a functional $f \in \ell_\infty^*$ defined by
$$f(x) := \operatorname{st}\Bigl(\sum_{n=1}^{h} \eta_n\,{}^*\xi_n\Bigr) \quad (x = (\xi_n)_n \in \ell_\infty), \tag{10.9}$$
which satisfies
$$\|f\| \le \operatorname{st}\Bigl(\sum_{n=1}^{h} |\eta_n|\Bigr).$$
Proof. Let f ∈ ℓ∗∞ be given. Let ϕ ∈ S be the following binary relation from
<
Ê+ × ℓ∞ into Ê
:
#Ê (y) #Ê (y)
ϕ = {(ε, x, y) ∈ Ê+ ×ℓ∞ × Ê< | f (x) = y(n)x(n) ∧ |y(n)| ≤ (1+ε) f }
n=1 n=1
for each $\varepsilon \in \mathbb{R}_+$. Moreover (10.9) implies that f is linear, since st is linear
(Theorem 5.21). Hence, $f \in \ell_\infty^*$ and
$${}^*(\|f\|) \le {}^*\varepsilon + \sum_{n=1}^{h} |\eta_n| \quad (\varepsilon \in \mathbb{R}_+).$$
§10 Normed Spaces 133
Exercise 52. Let X be a linear space (not necessarily normed) of real sequences.
Prove that any linear functional f on X can be written in the form
$${}^*(f(x)) = \sum_{n=1}^{h} \eta_n\,{}^*\xi_n \quad (x = (\xi_n)_n \in X),$$
$$f(x) = \lim_{n\to\infty} \xi_n$$
and
$$\|f\| = \operatorname{st}\Bigl(\sum_{n=h_0}^{h} |\eta_n|\Bigr).$$
Conversely, each internal ∗-finite sequence $\eta_1,\ldots,\eta_h \in {}^*\mathbb{R}$ which satisfies $\eta_n \approx 0$
for any $n \in {}^\sigma\mathbb{N}$, $\eta_1 + \cdots + \eta_h \approx 1$, and for which $|\eta_1| + \cdots + |\eta_h|$ is finite defines
a Hahn-Banach limit f by means of the formula
$$f(x) := \operatorname{st}\Bigl(\sum_{n=1}^{h} \eta_n\,{}^*\xi_n\Bigr) \quad (x = (\xi_n)_n \in \ell_\infty).$$
is some $h_0 \in \mathbb{N}_\infty$ with $\eta_n = 0$ for $n < h_0$, and $\eta_1 + \cdots + \eta_h = 1$. To see the latter,
consider the constant sequence x := (1) (i.e. $\xi_n := 1$). Since f is a Hahn-Banach
limit, we must have f(x) = 1. But since ${}^*\xi_n = 1$ for each $n \in {}^*\mathbb{N}$ (this follows by the
transfer principle or by Theorem 7.1), the formula (10.7) implies $\eta_1 + \cdots + \eta_h = 1$,
as claimed. To see that $\eta_{{}^*n} = 0$ for $n \in \mathbb{N}$, we consider the particular sequence x
defined by $\xi_k := 0$ for $k \neq n$ and $\xi_n := 1$. The transfer principle implies ${}^*\xi_k = 0$
for $k \neq {}^*n$ and ${}^*\xi_{{}^*n} = 1$. Since f(x) = 0, the formula (10.7) implies $\eta_{{}^*n} = 0$,
as claimed. We thus have proved that the internal formula $\eta_n = 0$ holds for each
$n \in {}^\sigma\mathbb{N}$. By the permanence principle there is some $h_1 \in \mathbb{N}_\infty$ such that $\eta_n = 0$
holds for all $n \le h_1$. Thus, the first statement follows with $h_0 := h_1 + 1$.
For the second statement, let $\eta_1,\ldots,\eta_h$ and f be given as in the formulation
of the theorem. Theorem 10.5 implies that $f \in \ell_\infty^*$. It remains to prove that if
$x = (\xi_n)_n$ converges to some $l \in \mathbb{R}$, then f(x) = l, i.e. ${}^*l \approx \sum_{n=1}^{h} \eta_n\,{}^*\xi_n$. Let $\varepsilon \in \mathbb{R}_+$
be given. Since $\eta_n\,{}^*\xi_n \approx 0$ for each $n \in {}^\sigma\mathbb{N}$, the internal predicate
$$\sum_{n=1}^{x} |\eta_n| < {}^*\varepsilon$$
is true for any $x \in {}^\sigma\mathbb{N}$ and by the permanence principle thus also for some $x = h_0 \in \mathbb{N}_\infty$.
For $n > h_0$, we have ${}^*\xi_n \approx {}^*l$ by Theorem 7.1, in particular $|{}^*\xi_n - {}^*l| < {}^*\varepsilon$.
Putting $c := \eta_1 + \cdots + \eta_h - 1$, we have by assumption $c \approx 0$ and thus also $|c| < {}^*\varepsilon$.
Now we may calculate
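$$\Bigl|\sum_{n=1}^{h} \eta_n\,{}^*\xi_n - {}^*l\Bigr| = \Bigl|\sum_{n=1}^{h} \eta_n\,{}^*\xi_n - \Bigl(\sum_{n=1}^{h} \eta_n - c\Bigr)\,{}^*l\Bigr| \le \Bigl|\sum_{n=1}^{h} \eta_n\,({}^*\xi_n - {}^*l)\Bigr| + |c|\,|{}^*l|$$
Now observe that the transfer principle implies $|{}^*\xi_n| \le {}^*(\|x\|_\infty)$ for all $n \in {}^*\mathbb{N}$
and that by assumption $|\eta_1| + \cdots + |\eta_h| \le {}^*M$ for some $M \in \mathbb{R}_+$. Hence, we have
proved
$$\Bigl|\sum_{n=1}^{h} \eta_n\,{}^*\xi_n - {}^*l\Bigr| \le {}^*\varepsilon\,\bigl({}^*(\|x\|_\infty) + |{}^*l|\bigr) + {}^*M\,{}^*\varepsilon + {}^*\varepsilon\,|{}^*l|.$$
Since this estimate holds for any $\varepsilon \in \mathbb{R}_+$, it follows that $\sum_{n=1}^{h} \eta_n\,{}^*\xi_n \approx {}^*l$, which we
had to prove.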
Theorem 10.6 slightly generalizes [Lux92, Theorem 4.4], using a refinement
of the technique from [Rob64].
Exercise 54. Does there exist a Hahn-Banach limit f such that for each x ∈ ℓ∞
the point f (x) is an accumulation point of the sequence x?
holds.
Exercise 56. Let f be a Banach-Mazur limit. Calculate f(x) for the sequence
$x = (\xi_n)_n$ which is given by $\xi_n := (-1)^n$.
Does there exist a Banach-Mazur limit which has the additional property
from Exercise 54, i.e. such that f (x) is always an accumulation point of the se-
quence x?
The standard proofs for the existence of Banach-Mazur limits are not very
constructive. We just mention one of the simplest standard approaches from
[Rud90, Chapter 3, Exercise 4]:
Exercise 57. Given a sequence x = (ξn )n ∈ ℓ∞ , put ζn := (ξ1 + · · · + ξn )/n. Apply
Exercise 49 for p(x) := lim sup ζn and f (x) := lim ζn (if ζn converges) to prove the
existence of a Banach-Mazur limit.
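The averages ζn of Exercise 57 are easy to compute for concrete sequences. The following Python sketch is only a numerical illustration; the function name `cesaro_means` and the sample sequence are chosen here for the example and do not come from the text. It computes the running means of a bounded sequence that does not converge.

```python
from itertools import accumulate

def cesaro_means(xs):
    """Return the running averages zeta_n = (xi_1 + ... + xi_n) / n."""
    return [s / n for n, s in enumerate(accumulate(xs), start=1)]

# A bounded sequence without a limit: 1, 0, 0, 1, 0, 0, ...
xs = [1.0 if n % 3 == 0 else 0.0 for n in range(3000)]
zetas = cesaro_means(xs)

# The sequence itself oscillates, but its averages settle near 1/3.
print(zetas[-1])
```

Of course, finitely many averages can only suggest the value of lim ζn; they prove nothing about the functional constructed in the exercise.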
We now present a class of Banach-Mazur limits which can easily be charac-
terized by nonstandard methods. This result is taken from [Rob64]:
Theorem 10.7. Let $* : S \to {}^*S$ be a nonstandard embedding, and $\eta_1,\ldots,\eta_h \in {}^*\mathbb{R}$
be an internal ∗-finite sequence. Assume:
1. $\eta_n \ge 0$ for n = 1, . . . , h.
2. $\eta_1 + \cdots + \eta_h \approx 1$.
3. $\sum_{n=1}^{h} |\eta_n - \eta_{n-1}| \approx 0$ (put $\eta_0 := 0$).
Then a Banach-Mazur limit f is given by the formula
$$f(x) := \operatorname{st}\Bigl(\sum_{n=1}^{h} \eta_n\,{}^*\xi_n\Bigr) \quad (x = (\xi_n)_n \in \ell_\infty).$$
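A finite analogue may help to see why the third assumption is connected with shift invariance. For finite real weights, summation by parts gives the bound |Σ ηn ξn+1 − Σ ηn ξn| ≤ sup|ξ| · (Σ |ηn − ηn−1| + |ηh|). The Python sketch below checks this bound numerically for the Cesàro weights ηn = 1/h; all names are hypothetical, and a large finite h merely stands in for the infinite h of the theorem.

```python
import random

def weighted_mean(eta, xi, shift=0):
    """sum_{n=1}^{h} eta_n * xi_{n+shift}; lists are 0-based, so eta[n-1] is eta_n."""
    return sum(e * xi[n + shift] for n, e in enumerate(eta))

def total_variation(eta):
    """sum_{n=1}^{h} |eta_n - eta_{n-1}| with the convention eta_0 := 0."""
    return sum(abs(b - a) for a, b in zip([0.0] + eta[:-1], eta))

h = 1000
eta = [1.0 / h] * h                      # Cesàro weights: nonnegative, sum 1
random.seed(0)
xi = [random.uniform(-1.0, 1.0) for _ in range(h + 1)]

# Difference between the weighted mean of the shifted and the original sequence.
gap = abs(weighted_mean(eta, xi, shift=1) - weighted_mean(eta, xi))
bound = max(abs(v) for v in xi) * (total_variation(eta) + abs(eta[-1]))
print(gap <= bound, total_variation(eta))
```

For these weights the total variation is 1/h, so the bound shrinks as h grows; this is the finite counterpart of condition 3.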
Proof. An induction implies that $\eta_n \approx 0$ for each $n \in {}^\sigma\mathbb{N}$: Indeed, if $\eta_{n-1} \approx 0$, then
$|\eta_n - \eta_{n-1}| \approx 0$ (by assumption) shows that also $\eta_n \approx 0$. Hence, by Theorem 10.6,
$\eta_n \ge 0$,
$$\sum_{n=h_0}^{h} \eta_n \approx 1,$$
and
$$\sum_{n=h_0+1}^{h} |\eta_n - \eta_{n-1}| \approx 0.$$
The proof of Theorem 10.9 needs deeper facts about the geometry of Banach
spaces which are beyond the scope of this book. The proof can be found in [Lux92].
Theorem 10.9 suggests that the following Hahn-Banach limits are particularly
“natural”:
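Example 10.10. Let $* : S \to {}^*S$ be a nonstandard embedding. For any $h_0, h \in \mathbb{N}_\infty$
with $h_0 \le h$, the formula
$$f(x) := \operatorname{st}\Bigl(\frac{1}{h - h_0 + 1} \sum_{n=h_0}^{h} {}^*\xi_n\Bigr) \quad (x = (\xi_n)_n \in \ell_\infty)$$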
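and so
$${}^*c_0 = \sum_{k=h_0}^{h} \eta_k\,\frac{1}{h - h_0 + 1} \sum_{n=h_0}^{h} {}^*\xi_{n+k} = \sum_{k=h_0}^{h} \eta_k\,({}^*c + \varepsilon_k)$$
where $\varepsilon_k \in \inf({}^*\mathbb{R})$. Since $\sum \eta_k = 1$, we find for any $\varepsilon \in \mathbb{R}_+$ that
$$|{}^*c_0 - {}^*c| = \Bigl|\sum_{k=h_0}^{h} \eta_k \varepsilon_k\Bigr| \le \sum_{k=h_0}^{h} |\eta_k|\,{}^*\varepsilon \le {}^*(\|f\| + 1)\,{}^*\varepsilon,$$
and so ${}^*c_0 \approx {}^*c$ which implies $f(x) = c_0 = c$.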
A deeper reason why Proposition 10.11 is true is revealed in [KM92] (see also
the remarks in [Lux92]). However, we will not go into further detail here.
Definition 10.12. A sequence $x = (\xi_n)_n$ is almost convergent to c if we have
$$\lim_{n\to\infty} \frac{1}{n} \sum_{m=1}^{n} \xi_{m+k} = c$$
uniformly in $k \in \mathbb{N}$.
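The uniformity in k is the essential part of this definition. As a rough numerical illustration (a sketch only: the helper `worst_window_error` is a name invented here, and a finite range of shifts can merely suggest uniformity), the following Python code measures the worst deviation of the window averages from c over a range of shifts for the alternating sequence ξn = (−1)^n and c = 0.

```python
def worst_window_error(xi, c, n, k_max):
    """Largest |(xi(1+k) + ... + xi(n+k)) / n - c| over the shifts k = 0..k_max."""
    worst = 0.0
    for k in range(k_max + 1):
        avg = sum(xi(m + k) for m in range(1, n + 1)) / n
        worst = max(worst, abs(avg - c))
    return worst

def alternating(n):
    return (-1) ** n  # bounded, but not convergent

# Even window lengths cancel exactly; odd ones leave an error of 1/n.
for n in (10, 11, 100, 101):
    print(n, worst_window_error(alternating, 0.0, n, k_max=50))
```

The error does not depend on the shift k and tends to 0 as the window length n grows, which is the behaviour the definition asks for.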
It can be proved by standard methods that any almost convergent sequence
is bounded. However, with nonstandard methods an easier proof can be given
[Lux92]:
Exercise 58. Let ∗ be a nonstandard embedding. Then $x = (\xi_n)_n$ is almost convergent
to c if and only if
$$\frac{1}{h} \sum_{n=1}^{h} {}^*\xi_{n+k} \approx {}^*c \quad (h \in \mathbb{N}_\infty,\ k \in {}^*\mathbb{N}). \tag{10.10}$$
Proof. Let $* : S \to {}^*S$ be an enlargement, and x be almost convergent to c. Then
$x \in \ell_\infty$ by Exercise 58. By Proposition 10.11, it suffices to prove that f(x) = c for
any Banach-Mazur limit f of Cesàro type, i.e. we may assume
$$f(x) = \operatorname{st}\Bigl(\frac{1}{h - h_0 + 1} \sum_{m=h_0}^{h} {}^*\xi_m\Bigr) \tag{10.11}$$
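Since ${}^*\xi_{m+h-1} - {}^*\xi_{m+h} \approx 0$ by Theorem 7.1, we have $\eta_n \approx 0$ for any finite $n \in {}^\sigma\mathbb{N}$.
Robinson’s sequential lemma (Exercise 22) implies that there is some $h_0 \in \mathbb{N}_\infty$
with $\eta_n \approx 0$ for all $n \in {}^*\mathbb{N}$ with $n \le h_0$. In particular, we have for any $\varepsilon \in \mathbb{R}_+$
that
$$\Bigl|\frac{1}{h_0} \sum_{n=1}^{h_0} \eta_n\Bigr| \le \frac{h_0\,{}^*\varepsilon}{h_0}.$$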
§ 11 Additive Measures
Let S0 be a set, and Σ be a set algebra over S0 , i.e. Σ is a system of subsets
of S0 with the property that S0 ∈ Σ and that A, B ∈ Σ implies A ∪ B ∈ Σ
and S0 \ A ∈ Σ. A function µ : Σ → [0, ∞] is called an additive measure if
µ(A ∪ B) = µ(A) + µ(B) for any A, B ∈ Σ with A ∩ B = ∅. If µ(S0 ) = 1,
then µ is called an additive probability measure. If Σ is even a σ-algebra, i.e.
additionally $\bigcup_n A_n \in \Sigma$ for countably many $A_n \in \Sigma$, and µ is even σ-additive, i.e.
$\mu(\bigcup_n A_n) = \sum_n \mu(A_n)$ whenever $A_n \in \Sigma$ are pairwise disjoint, then µ is called a
measure resp. a probability measure.
An additive measure µ is called singular, if µ(A) = 0 for any finite set A ∈ Σ.
At the moment, we are interested in additive probability measures on $\mathbb{N}$
where $\Sigma = P(\mathbb{N})$. Such a measure µ is singular if and only if µ({n}) = 0 for any
$n \in \mathbb{N}$. It cannot be proved without the axiom of choice that such measures exist
(even a rather powerful countable version of the axiom of choice is not sufficient
[PS77]). In particular, it is not possible by standard methods to construct singular
additive measures on $\mathbb{N}$. However, many Hahn-Banach limits f provide such a
measure: Given $A \subseteq \mathbb{N}$, we let $\chi_A$ denote the sequence $a_n \in \mathbb{R}$ defined by
$$a_n := \begin{cases} 1 & \text{if } n \in A, \\ 0 & \text{if } n \notin A. \end{cases}$$
Moreover, µ({n}) = 0, because χ{n} is a null sequence. The only property which
is not necessarily satisfied is that µ(A) ≥ 0. However, this holds if we choose some
f which has a representation as in Theorem 10.5 with $\eta_n \ge 0$. Hence, we found a
rather large class of singular additive measures on $\mathbb{N}$.
In particular, if f is a Banach-Mazur limit, then µ(A) = f (χA ) defines a
singular measure which additionally is translation invariant, i.e. µ({n : n + 1 ∈
A}) = µ(A).
By an appropriate choice, we can satisfy certain other additional properties.
Let us give a sample application:
A set $A \subseteq \mathbb{N}$ is said to have a density d, if
$$d := \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} a_k \quad ((a_n)_n = \chi_A)$$
exists. The density (if it exists) may be considered as a “relative frequency” of the
occurrences of 1 in the sequence $(a_n)_n$. A singular measure µ with the property
that µ(A) = d whenever A has the density d may be considered as some sort
of “Laplace measure” on $\mathbb{N}$ (i.e. each number has in a certain sense the same
“weight” for the calculation of the probability).
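The relative frequencies in the definition of the density can be computed directly for simple sets. The Python sketch below is an illustration only (the function names are chosen here, not taken from the text): it evaluates the averages for the even numbers and for the perfect squares.

```python
import math

def relative_frequency(indicator, n):
    """(a_1 + ... + a_n) / n for the characteristic sequence a_k = indicator(k)."""
    return sum(1 for k in range(1, n + 1) if indicator(k)) / n

def is_even(k):
    return k % 2 == 0

def is_square(k):
    r = math.isqrt(k)
    return r * r == k

# The even numbers have density 1/2; the squares have density 0.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(is_even, n), relative_frequency(is_square, n))
```

Both sets are infinite, so a singular measure with the Laplace property assigns them the values 1/2 and 0, respectively, although the squares are not a finite set.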
Theorem 11.1. There is a singular additive translation invariant measure µ on $\mathbb{N}$
with the additional property that µ(A) = d whenever A has the density d. Moreover,
for any $A \subseteq \mathbb{N}$ the sequence $\chi_A = (a_n)_n$ satisfies
$$\liminf_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} a_k \le \mu(A) \le \limsup_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} a_k.$$
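Proof. Let $* : S \to {}^*S$ be a nonstandard embedding. Fix $h_0 \in \mathbb{N}_\infty$ and choose
some $h \in \mathbb{N}_\infty$ such that $h/h_0$ is infinite (put e.g. $h := h_0^2$). Consider the Banach-Mazur
limit
$$f(x) := \operatorname{st}\Bigl(\frac{1}{h - h_0} \sum_{k=h_0}^{h} {}^*\xi_k\Bigr) \quad (x = (\xi_k)_k)$$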
Note now that $h/(h - h_0) = \frac{h/h_0}{h/h_0 - 1} \approx 1$, and
$$\Bigl|\frac{1}{h - h_0} \sum_{k=1}^{h_0 - 1} {}^*a_k\Bigr| \le \frac{h_0 - 1}{h - h_0} = \frac{h_0/h - 1/h}{1 - h_0/h} \approx 0,$$
and so ${}^*(f(\chi_A)) \approx {}^*b_h$, as desired.
Exercise 59. In the proof of Theorem 11.1, we have chosen a particular Banach-
Mazur limit f of Cesàro type. Does the conclusion µ(A) = d also hold for any
Banach-Mazur limit f of Cesàro type or even for any Banach-Mazur limit f ?
Hint: Apply Theorem 10.13.
It turns out that in the nonstandard world any singular measure is a Laplace
measure in another sense. This was first proved in [Hen72b]. We present a slightly
modified version:
Theorem 11.2. Let $* : S \to {}^*S$ be a nonstandard embedding, $S_0 \in S$ be an entity,
and Σ be an algebra over $S_0$. Then for any nonempty ∗-finite $B \subseteq {}^*S_0$ the function
$$\mu(A) := \operatorname{st}\Bigl(\frac{{}^\#({}^*A \cap B)}{{}^\#B}\Bigr) \quad (A \subseteq S_0) \tag{11.1}$$
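The formula (11.1) is a counting measure relative to the sample set B. A purely finite analogue can be written down directly; the Python sketch below uses an ordinary finite sample in place of a ∗-finite set and omits the standard part, so it illustrates only the additivity of such quotients. The name `counting_measure` is an assumption made for this example.

```python
from fractions import Fraction

def counting_measure(sample):
    """Return mu with mu(A) = #(A ∩ sample) / #sample for a nonempty finite sample."""
    sample = frozenset(sample)
    if not sample:
        raise ValueError("sample must be nonempty")

    def mu(a):
        return Fraction(len(sample & frozenset(a)), len(sample))

    return mu

mu = counting_measure(range(1, 13))           # sample B = {1, ..., 12}
evens = {n for n in range(1, 100) if n % 2 == 0}
odds = {n for n in range(1, 100) if n % 2 == 1}

# Additivity on disjoint sets, and total mass 1 on a set containing the sample.
print(mu(evens), mu(odds), mu(evens | odds))
```

With a finite sample such a measure is never singular, since every point of the sample carries positive weight; singularity only becomes possible for suitable ∗-finite samples in the nonstandard setting.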
where α(y) is a shortcut for “each two different elements of y are disjoint”, and
β(y) is a shortcut for
$$\exists z \in P(\Sigma) : (z \subseteq y \wedge x = \bigcup z).$$
To each $n \in \mathbb{N}$ and each infinite set z ∈ Σ, we can associate a subset x(z) ⊆ z
whose number of elements is the smallest integer which is at least nµ(z). Actually,
this holds also if z ∈ Σ is finite, because in this case µ(z) = 0. Hence,
$$\forall n \in \mathbb{N},\ y \in P(\Sigma) : \exists x \in P(S_0)^{\Sigma} : \gamma$$
where γ is a shortcut for
The transfer principle implies for the choice $y := \Sigma_0$ in view of Theorem 3.21 that
we find for any $h \in {}^*\mathbb{N}$ some internal function $f : {}^*\Sigma \to {}^*P(S_0)$ such that for any
$A_0 \in \Sigma_0$ the set $f(A_0)$ is a ∗-finite subset of $A_0$ with
$${}^\#(f(A_0)) \le h\,{}^*\mu(A_0) < {}^\#(f(A_0)) + 1. \tag{11.2}$$
Fix some $h \in \mathbb{N}_\infty$ such that $h/{}^\#\Sigma_0$ is infinite, and let f denote the corresponding
function. We claim that
$$B := \bigcup\{f(A_0) : A_0 \in \Sigma_0\}$$
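$$\Bigl|\sum_{A_0 \in \Sigma_A} \bigl({}^\#(f(A_0)) - h\,{}^*\mu(A_0)\bigr)\Bigr| \le {}^\#\Sigma_A \le {}^\#\Sigma_0,$$
we find
$$\Bigl|h\,{}^*(\mu(A)) - \sum_{A_0 \in \Sigma_A} {}^\#(f(A_0))\Bigr| \le {}^\#\Sigma_0.$$
Since $f(A_0) \subseteq A_0$ and since the sets $A_0 \in \Sigma_0$ are pairwise disjoint and ${}^*A = \bigcup \Sigma_A$,
we find in view of the definition of B that ${}^*A \cap B = \bigcup\{f(A_0) : A_0 \in \Sigma_A\}$ where
the union is pairwise disjoint. In particular,
$${}^\#({}^*A \cap B) = \sum_{A_0 \in \Sigma_A} {}^\#(f(A_0)).$$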
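Summarizing,
$$\bigl|h\,{}^*(\mu(A)) - {}^\#({}^*A \cap B)\bigr| \le {}^\#\Sigma_0 \quad (A \in \Sigma). \tag{11.3}$$
Applying (11.3) for A = S, we find in view of µ(S) = 1 that
$$|h - {}^\#B| \le {}^\#\Sigma_0.$$
$$\Bigl|\frac{{}^\#({}^*A \cap B)}{{}^\#B} - {}^*(\mu(A))\Bigr| = \frac{\bigl|\bigl({}^\#({}^*A \cap B) - h\,{}^*\mu(A)\bigr) + (h - {}^\#B)\,{}^*\mu(A)\bigr|}{{}^\#B} \le \frac{2\,{}^\#\Sigma_0}{{}^\#B} \approx 0.$$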
We thus have the required representation.
Corollary 11.3 (Measure Extension Theorem). If µ is a singular additive probabil-
ity measure defined on an algebra Σ ⊆ S0 , then µ may be extended to a singular
additive probability measure on P(S0 ).
Proof. Let ∗ be a Σ-enlargement. Then we may write µ in the form (11.1), and
the latter defines even a singular additive probability measure on P(S0 ).
In particular, the Lebesgue measure (defined on the measurable subsets of
[0, 1]) has an extension to an additive measure which is defined on all subsets of
[0, 1]. A famous theorem of Banach states that the Lebesgue measure on $\mathbb{R}$ has
an extension to a translation invariant additive measure µ which is defined on all
subsets of $\mathbb{R}$ (translation invariance means µ(A + x) = µ(A) for each $A \subseteq \mathbb{R}$).
We intend to give a nonstandard proof for this result. By calculating modulo
1, it obviously suffices to extend the Lebesgue measure on [0, 1) to a measure µ
which has the property that µ(A) = µ(A ⊕ x) for any A ⊆ [0, 1] and any $x \in \mathbb{R}$
(here, A ⊕ x := {a ⊕ x : a ∈ A} and a ⊕ x := a + x + z where $z \in \mathbb{Z}$ is chosen such
that a ⊕ x ∈ [0, 1)).
This result follows rather immediately from the following result of the stan-
dard world.
Proposition 11.4. The operation ⊕ satisfies Følner’s condition on X = [0, 1),
i.e. for each $x_1,\ldots,x_n \in [0, 1)$ and each $\varepsilon \in \mathbb{R}_+$ there is a nonempty finite set
A ⊆ [0, 1) such that
$$\frac{|A \Delta (A \oplus x_j)|}{|A|} < \varepsilon \quad (j = 1,\ldots,n) \tag{11.4}$$
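Condition (11.4) can be tested on concrete finite sets. One natural candidate, used here as an assumption for illustration and not necessarily the construction of the text, is the set of all combinations k1 x1 + · · · + kn xn (mod 1) with 0 ≤ kj < m. The Python sketch below uses exact rational arithmetic so that set membership is reliable; the names `shift`, `folner_set` and `folner_ratio` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def shift(a, x):
    """a ⊕ x: addition modulo 1 on [0, 1)."""
    return (a + x) % 1

def folner_set(xs, m):
    """All points k_1*x_1 + ... + k_n*x_n (mod 1) with 0 <= k_j < m."""
    return {sum((k * x for k, x in zip(ks, xs)), Fraction(0)) % 1
            for ks in product(range(m), repeat=len(xs))}

def folner_ratio(a_set, x):
    """|A Δ (A ⊕ x)| / |A| for a finite set A of rationals in [0, 1)."""
    shifted = {shift(a, x) for a in a_set}
    return Fraction(len(a_set ^ shifted), len(a_set))

xs = [Fraction(1, 97), Fraction(1, 89)]
a_set = folner_set(xs, m=20)
for x in xs:
    print(folner_ratio(a_set, x))
```

For this choice each shift moves only two of the m "layers" of the grid out of or into the set, so the ratio is 2/m and can be made smaller than any given ε by increasing m.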
Theorem 11.5 (Measure Extension Theorem for $\mathbb{R}$). The Lebesgue measure has
an extension to a translation invariant additive measure which is defined on all
subsets of $\mathbb{R}$.
Proof. As noted above, it suffices to extend the Lebesgue measure on X = [0, 1).
By Corollary 11.3, we find some extension µ of the Lebesgue measure to all subsets
of X. Let $* : S \to {}^*S$ be an X-enlargement. Let $c : P(X) \to \mathbb{N} \cup \{\infty\}$ be the
function which associates to each subset of X its number of elements. Exercise 27
implies ${}^*c(B) = {}^\#B$ for any internal $B \subseteq {}^*X$. Consider the binary relation
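is the desired measure. Note that $\mu_0$ is defined, because $|{}^*\mu(A_0)| \le 1$ for all
$A_0 \in {}^*P(X)$ implies
$$\Bigl|\sum_{x \in B} {}^*\mu({}^*A\;{}^*{\oplus}\;x)\Bigr| \le {}^\#B.$$
$${}^*(\mu_0(A)) \approx \frac{1}{{}^\#B} \sum_{x \in B} {}^*\mu({}^*A) = {}^*\mu({}^*A) = {}^*(\mu(A))$$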
Observe now that the sum in the above formula may be written as
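$$\sum_{x \in B} {}^*\mu({}^*A\;{}^*{\oplus}\;x) - \sum_{x \in B \oplus y} {}^*\mu({}^*A\;{}^*{\oplus}\;x) = \sum_{x \in B \setminus (B \oplus y)} {}^*\mu({}^*A\;{}^*{\oplus}\;x) - \sum_{x \in (B \oplus y) \setminus B} {}^*\mu({}^*A\;{}^*{\oplus}\;x).$$
Since $0 \le {}^*\mu({}^*A\;{}^*{\oplus}\;x) \le 1$ and the total number of summands in these two sums
is ${}^\#(B \Delta (B \oplus y))$, we obtain the estimate
$$|\mu_0(A) - \mu_0(A \oplus y)| \le \operatorname{st}\Bigl(\frac{{}^\#(B \Delta (B\;{}^*{\oplus}\;{}^*y))}{{}^\#B}\Bigr) = 0.$$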
Thus, µ0 is translation invariant.
The previous proofs are essentially taken from [Wag86, p. 161] and [Hen72b].
The reader will have realized that for Proposition 11.4 we might have replaced
X by any commutative group. Moreover, the proof of Theorem 11.5 holds even for
any group X which satisfies Følner’s condition: Any translation invariant finitely
additive probability measure on such a group X may be extended to a translation
invariant finitely additive probability measure defined on all subsets of X. Groups
possessing such a measure are called amenable. In particular, our above proofs
show that commutative groups and, more generally, groups satisfying Følner’s
condition, are amenable. Følner has proved that conversely amenable groups must
satisfy Følner’s condition. Readers more interested in amenable groups are referred
to [Wag86]. We only mention that the group of isometries in $\mathbb{R}^n$ is amenable if
and only if n ≤ 2.
Chapter 6
some $\varepsilon \in \mathbb{R}_+$ such that the open ball B(x, ε) := {y ∈ X : d(x, y) < ε} is contained
in O. If we speak of a (pseudo)metric space, we always mean that X is equipped
with this topology.
Any topology allows us a definition of neighborhoods of a point:
Definition 12.3. If X is a topological space and x ∈ X, then U ⊆ X is called a
neighborhood of x if there is an open set O ⊆ U with x ∈ O. The system U (x) of
all neighborhoods of x is called the neighborhood filter of x.
For example, in a (pseudo)metric space X, a set is a neighborhood of x if
and only if it contains some ball with positive radius and center x.
Proposition 12.4. Any neighborhood filter U (x) is a filter.
Proof. Let A denote the system of all subsets of U which are open. By definition,
A ⊆ U . If U is a neighborhood for each of its elements, then U ∈ A , and thus
U ⊆ A . Hence, U = A is open. The converse implication is trivial.
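Proof. 1. Since $\mathcal{F}$ has the finite intersection property, $\mathrm{mon}(\mathcal{F}) \neq \emptyset$ follows from
the definition of enlargements.
2. By Theorem 8.10, there is a ∗-finite set $\mathcal{F}_0$ with ${}^\sigma\mathcal{F} \subseteq \mathcal{F}_0 \subseteq {}^*\mathcal{F}$. Then
$\mathcal{F}_1 := \{F \in \mathcal{F}_0 : A \subseteq F\}$ is an internal subset of $\mathcal{F}_0$ by the internal definition
principle. Theorem 6.13 thus implies that $\mathcal{F}_1$ is ∗-finite. Since $A \subseteq \mathrm{mon}(\mathcal{F})$, the
definition of $\mathrm{mon}(\mathcal{F})$ implies ${}^\sigma\mathcal{F} \subseteq \mathcal{F}_1$ and thus also $B := \bigcap \mathcal{F}_1 \subseteq \mathrm{mon}(\mathcal{F})$.
Since $\mathcal{F}$ is a filter, the sentence
$$\forall x \in P(\mathcal{F}) : (\text{“x is finite”} \implies \textstyle\bigcap x \in \mathcal{F})$$
is true. The transfer principle implies that for any ∗-finite internal subset x of ${}^*\mathcal{F}$,
we have $\bigcap x \in {}^*\mathcal{F}$. In particular, $B = \bigcap \mathcal{F}_1$ is an element of ${}^*\mathcal{F}$.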
3. By 2., we find some B ∈ ∗ F with B ⊆ mon(F ); in particular B ⊆ A. The
transfer of the sentence
∀x ∈ P(X) : ((∃y ∈ F : y ⊆ x) =⇒ x ∈ F )
implies that ∗ F contains all internal subsets of ∗ X which contain some element
of ∗ F as a subset. In particular, A ∈ ∗ F .
4. Since ∗ A ⊆ ∗ X is internal with mon(F ) ⊆ ∗ A, we find by 3. that ∗ A ∈ ∗ F
which by the inverse form of the transfer principle implies A ∈ F .
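5. If $\mathcal{F}_1 \supseteq \mathcal{F}_2$, then trivially $\mathrm{mon}(\mathcal{F}_1) \subseteq \mathrm{mon}(\mathcal{F}_2)$. Conversely, suppose that
$\mathrm{mon}(\mathcal{F}_1) \subseteq \mathrm{mon}(\mathcal{F}_2)$. For any $F \in \mathcal{F}_2$, the definition of the monad implies
${}^*F \supseteq \mathrm{mon}(\mathcal{F}_2) \supseteq \mathrm{mon}(\mathcal{F}_1)$, and so $F \in \mathcal{F}_1$ by 4.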
6. Swapping the roles of F1 and F2 in 5., the statement follows.
Exercise 60. Prove that a filter U on X is an ultrafilter if and only if for any filter
F on X we have either mon(U ) ⊆ mon(F ) or mon(U )∩mon(F ) = ∅. Prove also
that in this case, any different ultrafilter U ′ on X satisfies mon(U ) ∩ mon(U ′ ) =
∅.
Hint: Consider the system F0 := {F ∩ U : F ∈ F , U ∈ U }.
Theorem 12.7. If ∗ is even a compact P(X)-enlargement, then the following holds:
If F is a filter on X, then any internal set B ⊇ {F ∈ ∗ F : F ⊆ mon(F )}
contains some standard element F ∈ σ F .
Proof. Assume that B contains no standard element. Consider the internal binary
relation
$$\varphi := \{(x, y) \in {}^*\mathcal{F} \times {}^*\mathcal{F} \mid y \subseteq x \wedge y \notin B\}.$$
152 Chapter 6. Nonstandard Topology and Functional Analysis
Proof of Theorem 12.10. The first statement has already been proved (the com-
pactness of ∗ was not needed in our previous proof concerning the first statement).
For the second statement, let F be nonprincipal. Then F has infinite dimension
κ by Lemma 12.12. Without loss of generality, we assume that κ is an
entity of S. Choose B and i as in Lemma 12.13, and put
P := {x ∈ ∗ P(B) | “x is ∗ -finite” ∧ x ⊆ mon(F )}.
B ⊆ mon(x).
Proof. It follows immediately from the definition that the neighborhood filter
U (x) of x is generated by any neighborhood base. Hence, the first statement
follows from Lemma 12.11. For the last statement, observe that Theorem 12.6 2.
implies that there is some F ∈ ∗ F with F ⊆ mon(x). The transfer of the
statement
∀x ∈ F : ∃y ∈ B : y ⊆ x
implies that we find some B ∈ ∗ B with B ⊆ F ⊆ mon(x).
We also want to define monads for points in the nonstandard world; we define
even monads of sets:
Definition 12.21. If A ⊆ ∗ X is nonempty, we call the filter generated by the
system of all open sets O ⊆ X with A ⊆ ∗ O the standard filter of A. Its monad is
called the monad of A and denoted by µ(A).
Note that the system B of all open sets O ⊆ X with A ⊆ ∗ O indeed generates
a filter, since it has the finite intersection property: If O1 , . . . , On ∈ B, then
$O = O_1 \cap \cdots \cap O_n$ is open, and ${}^*O = {}^*O_1 \cap \cdots \cap {}^*O_n$ contains A and thus is
nonempty. Hence, $O \neq \emptyset$. By Lemma 12.11, we have $\mu(A) = \bigcap {}^\sigma\mathcal{B}$. In other
words:
Proposition 12.22. We have y ∈ µ(A) if and only if y ∈ ∗ O for each open set
O ⊆ X with A ⊆ ∗ O.
Corollary 12.23. If x ∈ X, then µ({∗ x}) = mon(x).
Proof. By the transfer principle, {∗ x} ⊆ ∗ O if and only if x ∈ O. Now apply
Propositions 12.22 and 12.19.
Definition 12.24. We call two nonstandard points x, y ∈ ∗ X of a standard topolog-
ical space X infinitely close to each other if for each open set O ⊆ X with x ∈ ∗ O
we also have y ∈ ∗ O, i.e. if y ∈ µ({x}). In this case, we write y ≈O x.
In the case $X = \mathbb{R}$, we have $\mathrm{mon}(x) = \{y \in {}^*X : y \approx {}^*x\} = \{y \in {}^*X : y \approx_O {}^*x\}$.
Indeed, Corollary 12.23 implies for any topological space X:
Corollary 12.25. We have for any x ∈ X that
mon(x) = {y ∈ ∗ X : y ≈O ∗ x}.
Theorem 12.27. Let $X = \mathbb{N}$ and $h \in \mathbb{N}_\infty$. Then there are infinitely many $n \in {}^*\mathbb{N}$
with $n \approx_O h$. Moreover, the relation $n \approx_O h$ implies that either n = h or that |n − h|
is infinite.
Proof. Recall that µ({h}) is the filter monad of the standard filter of {h}. This
filter is nonprincipal, since for any $n \in \mathbb{N}$ the set $\mathbb{N} \setminus \{n\}$ belongs to the filter
(because $h \in {}^*\mathbb{N} \setminus \{{}^*n\} = {}^*(\mathbb{N} \setminus \{n\})$). Consequently, $\mu(\{h\}) \subseteq {}^*\mathbb{N}$ is external by
Theorem 12.10. Hence, µ({h}) is infinite by Exercise 6.
Assume that k := |n − h| > 0 is finite, i.e. $k \in {}^\sigma\mathbb{N}$. Put $F_j := \{2ki + j : i \in \mathbb{N}\}$.
Then $F_0 \cup \cdots \cup F_{2k-1} = \mathbb{N}$, and so ${}^*\mathbb{N} = {}^*(F_0) \cup \cdots \cup {}^*(F_{2k-1})$, i.e. we find
some j with $h \in {}^*F_j$. Then $F_j$ belongs to the standard filter of {h}, and so
$n \in {}^*F_j$. By the standard definition principle, we have ${}^*F_j = \{2ki + j : i \in {}^*\mathbb{N}\}$.
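The arithmetic behind this step is that two numbers at distance k > 0 never lie in the same residue class modulo 2k. The Python sketch below checks this standard fact for small values; it is only a finite sanity check of the elementary observation, with helper names invented for the example, and it says nothing about infinite h by itself.

```python
def residue_class_index(n, k):
    """Index j in {0, ..., 2k-1} of the class F_j = {2ki + j} containing n."""
    return n % (2 * k)

def separated(h, n):
    """True if h and n (with n != h) lie in different classes F_j for k = |n - h|."""
    k = abs(n - h)
    return residue_class_index(h, k) != residue_class_index(n, k)

# Numbers at distance k always differ by k modulo 2k, hence fall into different classes.
print(all(separated(h, h + k) for h in range(200) for k in range(1, 50)))
```

By transfer, the same separation holds for hyperfinite h, which is why n ≈O h together with 0 < |n − h| finite leads to a contradiction.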
∗
y ≈ ∗ x ⇐⇒ y ≈O ∗ x.
For the above reasons, ≈O may not appear a “natural” notion. For so-called
uniform spaces (like $X = \mathbb{R}$) we will later learn another relation which is more
natural and which for standard points coincides with ≈O ; for $X = \mathbb{R}$ this new
relation becomes the same as ≈.
One of the most useful concepts in real nonstandard analysis was the mapping
st. It appears natural to call x ∈ X the standard part of y ∈ ∗ X if y ≈O ∗ x, i.e.
mon(x) consists precisely of all those points y ∈ ∗ X whose standard part is x.
Recall that in case $X = \mathbb{R}$, the standard part mapping was not defined on ${}^*\mathbb{R}$ but
only on $\mathrm{fin}({}^*\mathbb{R})$. Hence, we cannot expect to define st on all of ${}^*X$. Even worse, in
general it may happen that st(y) is not uniquely determined. Nevertheless, we can
define:
Definition 12.29. The standard part relation st is a relation on ∗ X × X, defined
by
(y, x) ∈ st ⇐⇒ y ≈O ∗ x ( ⇐⇒ y ∈ mon(x)).
Points y ∈ dom(st) are called nearstandard . The set of all nearstandard points of
X is denoted by ns(X).
Of course, in case $X = \mathbb{R}$, the relation st is a function, and we end up with
the old Definition 5.20. Recall that a topological space is called a Hausdorff space
if each two points x ≠ y have disjoint neighborhoods. For example, a pseudometric
space is a Hausdorff space if and only if it is a metric space (indeed, if two points
x ≠ y satisfy d(x, y) = 0, they have the same neighborhoods).
Proposition 12.30. For a topological space X the following three statements are
equivalent:
1. X is a Hausdorff space.
2. The relation st is a function.
3. Monads to different points are disjoint, i.e. x ≠ y implies mon(x) ∩ mon(y) =
∅.
Proof. The equivalence of the last two statements follows immediately from the
definition. If x ≠ y have disjoint monads, choose U ∈ ∗ U (x), V ∈ ∗ U (y) with
U ⊆ mon(x) and V ⊆ mon(y) (Proposition 12.19). Then the sentence
∃u ∈ ∗ U (x), v ∈ ∗ U (y) : u ∩ v = ∅
is true, and the inverse form of the transfer principle implies that x and y have
disjoint neighborhoods. Conversely, if x ≠ y have disjoint neighborhoods U and
V , respectively, then U ∩ V = ∅ implies ∗ U ∩ ∗ V = ∅, and since mon(x) ⊆ ∗ U ,
mon(y) ⊆ ∗ V , it follows that mon(x) ∩ mon(y) = ∅.
For compact enlargements we have the following generalization of the Cauchy
principle:
Theorem 12.31 (Permanence principle for ∗ X (Cauchy principle)). Let ∗ be a
compact P(X)-enlargement. Let α(y) be an internal predicate with y as its only free
variable. If α(y) holds for all y ∈ mon(x), then there is some standard neighborhood
U of x such that α(y) holds for all y ∈ ∗ U.
Proof. The set
B := {u ∈ ∗ U (x) | ∀y ∈ u : α(y)}
is internal by the internal definition principle. Moreover, any F ∈ ∗ U (x) with
F ⊆ mon(x) belongs to B. Hence, Theorem 12.7 implies that B contains some
standard element V = ∗ U with U ∈ U (x).
As one might expect, monads can be used to characterize topological sets:
Definition 12.32. Let A ⊆ X. A point x ∈ A is called an interior point of A, if
A is a neighborhood of x. The closure $\overline{A}$ is the set of all points x ∈ X with the
property that any neighborhood of x intersects A.
We first recall some facts of the standard world:
Proposition 12.33. We have:
1. The set $\overline{A}$ is the smallest closed set which contains A. In particular, A is
closed if and only if $A = \overline{A}$.
§12 Topologies and Filters 159
2. The set of all interior points of A is the largest open set which is contained
in A. In particular, A is open if and only if each x ∈ A is interior, i.e. if and
only if A is a neighborhood for each of its elements.
Proof. 1. We claim that $C := X \setminus \overline{A}$ is the union of all open sets which are
contained in X \ A: Then C is open, and thus $\overline{A}$ is closed, and moreover, if B ⊇ A
is closed, then X \ B ⊆ C, i.e. $\overline{A} \subseteq B$.
If O ⊆ X \ A is open and x ∈ O, then O is a neighborhood of x which does
not intersect A, and so x ∈ C by definition of $\overline{A}$. Conversely, if x ∈ C, then x has
a neighborhood U which does not intersect A, and so there is some open O ⊆ U
with x ∈ O. We have O ⊆ C by definition of $\overline{A}$.
2. We prove that the set I of interior points of A is the union of all open sets
contained in A: If O ⊆ A is open and x ∈ O, then x ∈ I. Conversely, if x ∈ I,
then there is some open O ⊆ A with x ∈ O; hence, x is contained in an open set
O ⊆ A.
Now we turn to the nonstandard characterizations for topological properties
of sets:
Theorem 12.34. Let A ⊆ X. A point x ∈ A is an interior point of A if and only
if mon(x) ⊆ ∗ A. The set A is open if and only if
$$\bigcup_{x \in A} \mathrm{mon}(x) \subseteq {}^*A.$$
Proof. The second statement follows from the first by Proposition 12.33. Let x ∈
A be interior, i.e. A ∈ U (x). The system B := {V ∈ U (x) : V ⊆ A} is a
neighborhood base for x such that any V ∈ B satisfies V ⊆ A, i.e. ∗ V ⊆ ∗ A.
Hence, Proposition 12.19 implies
$$\mathrm{mon}(x) = \bigcap {}^\sigma\mathcal{B} = \bigcap \{{}^*V : V \in \mathcal{B}\} \subseteq {}^*A.$$
∃u ∈ ∗ U (x) : u ⊆ ∗ A.
The inverse form of the transfer principle implies that U (x) contains a subset of
A, i.e. x is an interior point of A.
For the second statement, we could also have applied the Cauchy principle,
but this would require that ∗ be a compact enlargement.
We think of st as a multivalued function, and thus use for y ∈ ∗ X the notation
Moreover, for the inverse relation st−1 := {(x, y) : (y, x) ∈ st} we write corre-
spondingly for x ∈ X,
and for A ⊆ X,
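$$\mathrm{st}^{-1}(A) := \bigcup_{x \in A} \mathrm{st}^{-1}(x) = \{y \in {}^*X : \text{There is some } x \in A \text{ with } y \approx_O {}^*x\}.$$
$\mathrm{st}({}^*A) = \overline{A}$,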
st(∗ A) ⊆ A
(and then equality holds, because the converse inclusion is always true).
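Proof. Let $x \in \overline{A}$, i.e. for any $U \in \mathcal{U}(x)$, we have $U \cap A \neq \emptyset$. Then the system
$\mathcal{A} := \{U \cap A : U \in \mathcal{U}(x)\}$ has the finite intersection property. Since ∗ is a P(X)-enlargement,
this implies $\bigcap {}^\sigma\mathcal{A} \neq \emptyset$, i.e. there is some y with $y \in {}^*(U \cap A) =
{}^*U \cap {}^*A$ for all $U \in \mathcal{U}(x)$. Hence, $y \in \mathrm{mon}(x) \cap {}^*A$, i.e. $y \in {}^*A$ satisfies $y \approx_O {}^*x$,
and so $x \in \mathrm{st}({}^*A)$.
Conversely, let $x \in \mathrm{st}({}^*A)$, i.e. there is some $y \in {}^*A$ with $y \approx_O {}^*x$, i.e.
$y \in \mathrm{mon}(x)$. For any $U \in \mathcal{U}(x)$, we thus have $y \in {}^*U$, and so ${}^*U \cap {}^*A \neq \emptyset$ which
implies $U \cap A \neq \emptyset$ by the transfer principle. Consequently, $x \in \overline{A}$.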
Proof. Let x ∈ st(A), and let B be the system of all open neighborhoods of x.
Consider the internal relation
ϕ := {(x, y) ∈ ∗ B × A | y ∈ x}.
Proof. We have U (x) ⊆ F if and only if for each U ∈ U (x) there are n1 , . . . , nk
with Fn1 ∩ · · · ∩ Fnk ⊆ U . Since Fn1 ∩ · · · ∩ Fnk = Fmax{n1 ,...,nk } , this is the case
if and only if for each U ∈ U (x) there is some n with Fn ⊆ U , i.e. if and only if
all except finitely many xn belong to U .
where Oi = Xi for all except finitely many i ∈ I, and Oi ⊆ Xi is open for all i ∈ I.
Then O is open in the product topology if and only if it is a union of sets from B.
The condition that Oi = Xi for all except finitely many i ∈ I appears rather
artificial. However, there are two main reasons why this should be included in the
above definition of the product topology (besides the fact that the above definition
has many applications):
It turns out that the above definition is the smallest topology such that
each of the projections pi : X → Xi is continuous (continuity will be defined
in Section 12.4). In nonstandard terms, this fact reads as follows:
We will assume henceforth that we are given a family $X_i$ ($i \in I$) of topological
spaces where $X_i$, $I$, and $\{X_i : i \in I\}$ are all entities of $S$. Then also $U := \bigcup_{i \in I} X_i$,
$U^I$, and thus also $X := \prod_{i \in I} X_i$ are entities of $S$. As before, we assume that ${}^*$ is a
$\mathcal{P}(X)$-enlargement.
We use the notation of Corollary A.6, i.e. for $i \in {}^*I$, we define ${}^*X_i := {}^*f(i)$
where $f$ denotes the function $i \mapsto X_i$. Recall (Exercise 80), that the elements of
${}^*X$ are precisely the internal elements of $\prod_{i \in {}^*I} {}^*X_i$. In particular (Corollary A.6),
each $x \in {}^*X$ is an internal function
\[ x : {}^*I \to \bigcup_{i \in {}^*I} {}^*X_i. \]
Theorem 12.52. In the product topology, we have $y \approx_{\mathcal O} {}^*x$ if and only if
$y({}^*i) \approx_{\mathcal O} {}^*x({}^*i) = {}^*(x(i))$ for each $i \in I$.
Proof. Suppose that $y({}^*i) \approx_{\mathcal O} {}^*x({}^*i)$ for each $i \in I$. Let $O \subseteq X$ be open with
${}^*x \in {}^*O$. We have to prove that $y \in {}^*O$. But ${}^*x \in {}^*O$ implies $x \in O$, and so
we have $x \in B \subseteq O$ for some $B \in \mathcal{B}$ where $\mathcal{B}$ is as in Definition 12.51. By
definition of $\mathcal{B}$, we find finitely many $i_1, \dots, i_n \in I$ and open sets $O_{i_k} \subseteq X_{i_k}$
such that $B = \prod_{i \in I} O_i$, where for each $i \in I$ other than $i_1, \dots, i_n$ we put $O_i := X_i$.
Exercise 80 implies that ${}^*B$ consists of all internal elements of $\prod_{i \in {}^*I} {}^*O_i$. Note
that the transfer principle implies
\[ \forall x \in {}^*I : \bigl( (x \neq {}^*i_1 \wedge \dots \wedge x \neq {}^*i_n) \implies {}^*O_x = {}^*X_x \bigr). \]
Since $y({}^*i_k) \approx_{\mathcal O} {}^*x({}^*i_k) \in {}^*(O_{i_k})$, we have $y({}^*i_k) \in {}^*(O_{i_k}) = {}^*O_{{}^*i_k}$ (recall the
remarks preceding Corollary A.6). Thus, $y$ is an internal function which satisfies
$y(i) \in {}^*O_i$ for all $i \in {}^*I$ (for $i \neq {}^*i_1, \dots, {}^*i_n$ we have ${}^*O_i = {}^*X_i$, as we have shown
above). This proves $y \in {}^*B \subseteq {}^*O$, as desired.
Conversely, let $y \approx_{\mathcal O} {}^*x$. If $i_0 \in I$ and some open $O_{i_0} \subseteq X_{i_0}$ with $x(i_0) \in O_{i_0}$
are given, put $O_i := X_i$ ($i \neq i_0$) and $O := \prod_{i \in I} O_i$. Then $O$ is open with $x \in O$.
Hence, ${}^*x \in {}^*O$, and so our assumption implies $y \in {}^*O$. Since ${}^*O$ consists of all
internal elements of $\prod_{i \in {}^*I} {}^*O_i$ and since ${}^*O_{{}^*i_0} = {}^*(O_{i_0})$, we must have
$y({}^*i_0) \in {}^*(O_{i_0})$. Hence, $y({}^*i_0) \approx_{\mathcal O} {}^*(x(i_0))$.
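As a sanity check of Theorem 12.52, consider the special case $I = \{1, 2\}$ and $X_1 = X_2 = \mathbb{R}$ with the usual topology. This case is chosen here only for illustration and is not taken from the text. Since $I$ is finite, ${}^*I$ may be identified with $I$, and the theorem reduces to the familiar coordinatewise description of infinitesimal closeness in the plane. The symbol $\varepsilon$ below is assumed to be a positive infinitesimal.

```latex
\[
  y \approx_{\mathcal O} {}^*x
  \iff
  y_1 \approx x_1 \ \text{and}\ y_2 \approx x_2
  \qquad \bigl(x = (x_1, x_2) \in \mathbb{R}^2,\ y = (y_1, y_2) \in {}^*\mathbb{R}^2\bigr).
\]
% Two instances with a positive infinitesimal \varepsilon:
\[
  (1 + \varepsilon,\ 2 - \varepsilon^2) \approx_{\mathcal O} {}^*(1, 2),
  \qquad
  (1 + \varepsilon,\ 3) \not\approx_{\mathcal O} {}^*(1, 2).
\]
```

For infinite $I$ the theorem only constrains $y$ at the standard indices ${}^*i$ ($i \in I$); the values of $y$ at nonstandard indices of ${}^*I$ play no role.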
Corollary 12.53. $X := \prod_{i \in I} X_i$ is a Hausdorff space if and only if each $X_i$ is a
Hausdorff space.
Proof. If $X_{i_0}$ is not a Hausdorff space, there are points $a, b \in X_{i_0}$, $a \neq b$, such that
$\operatorname{mon}(a) \cap \operatorname{mon}(b)$ contains some point $c$ (Proposition 12.30). Choose some $x \in X$
with $x(i_0) = a$, and put $y(i) = x(i)$ ($i \in I \setminus \{i_0\}$) and $y(i_0) = b$. Consider the
function $z(i) := {}^*y(i)$ ($i \in {}^*I \setminus \{{}^*i_0\}$) and $z({}^*i_0) := c$. Then $z$ is an internal function
(Exercise 8), and in view of Theorem 12.52, we have $z \in \operatorname{mon}(x) \cap \operatorname{mon}(y)$ although
$x \neq y$. Hence, Proposition 12.30 implies that $X$ is not a Hausdorff space.
Conversely, if $X$ is not a Hausdorff space, we find elements $x \neq y$ in $X$ such
that $\operatorname{mon}(x) \cap \operatorname{mon}(y)$ contains some element $z$. Choose some $i_0$ with $x(i_0) \neq y(i_0)$.
Then $X_{i_0}$ is not a Hausdorff space, because Theorem 12.52 implies
$z({}^*i_0) \in \operatorname{mon}(x(i_0)) \cap \operatorname{mon}(y(i_0))$.
The other reason for the definition of the product topology is that the fol-
lowing important theorem of Tychonoff holds, which has many applications. We
note that all known proofs of the Tychonoff theorem are rather technical so that
the following nonstandard proof is an essential simplification:
Corollary 12.54 (Tychonoff). $X := \prod_{i \in I} X_i$ is compact if and only if each $X_i$ is
compact.
Proof. Let X be compact. A standard argument immediately implies that Xi0 is
compact (because each projection pi : X → Xi is continuous, as mentioned above).
However, for completeness we provide a nonstandard proof: By Theorem 12.39,
we have to prove that for each b ∈ ∗ X i0 there is some a ∈ Xi0 with b ≈O ∗ a.
There is some y ∈ ∗ X with y(∗ i0 ) = b. Since X is compact, we find some x ∈ X
with y ≈O ∗ x. Then y(∗ i0 ) ≈O ∗ (x(i0 )) by Theorem 12.52, and so a = x(i0 ) ∈ Xi0
satisfies b ≈O ∗ a.
The converse direction is the one which is hard to prove by standard methods:
Suppose that all $X_i$ are compact. Let $y \in {}^*X$. For each $i \in I$, we have
$y({}^*i) \in {}^*X_{{}^*i} = {}^*(X_i)$. Since $X_i$ is compact, we find some $x(i) \in \operatorname{st}(y({}^*i))$. Then $x \in X$
(axiom of choice!), and Theorem 12.52 implies $y \approx_{\mathcal O} {}^*x$.
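The construction in the second half of the proof can be written out for a concrete case. The choice $X_i = [0, 1]$ for all $i \in I$ is an assumption made here only for illustration. Each $X_i$ is then a compact Hausdorff space, so $\operatorname{st}$ is single-valued on ${}^*[0,1]$ and the standard point $x$ is obtained coordinatewise without any choice:

```latex
\[
  x(i) := \operatorname{st}\bigl(y({}^*i)\bigr) \in [0, 1] \quad (i \in I),
  \qquad\text{so that}\qquad
  y({}^*i) \approx_{\mathcal O} {}^*(x(i)) \ \text{for each } i \in I,
\]
\[
  \text{and hence } y \approx_{\mathcal O} {}^*x \ \text{by Theorem 12.52.}
\]
```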
Some notes are in order: It lies in the nature of things that we had to use the
axiom of choice in its full generality to prove the Tychonoff theorem. In fact, the
Tychonoff theorem is actually equivalent to the axiom of choice [Kel50] (this paper
contains a minor mistake which however can be corrected, see [LRN51]). However,
to prove Tychonoff’s theorem for Hausdorff spaces, one does not need the full
power of the axiom of choice: In fact, the Tychonoff theorem for Hausdorff spaces
is actually equivalent to the so-called maximal ideal theorem [LRN54] which in
turn is equivalent to Theorem 4.9 [Sik64]. The most difficult of these implications
follows from our above proof of the Tychonoff theorem: In fact, if all Xi are
compact Hausdorff spaces, then st is a function by Proposition 12.30, and so no
axiom of choice is required to define the function x in the above proof. So in
this case, we only made use of the axiom of choice in the construction of the
ultrapower model. A careful analysis shows that a map ∗ sufficient for our proof
may be defined by only applying Theorem 4.9 and no other form of the axiom of
choice.
Proof. We first prove the equivalence of the first three statements: If f is contin-
uous at x, and F → x (i.e. U (x) ⊆ F ), then we find for each V ∈ U (f (x))
some U ∈ U (x) ⊆ F with f (U ) ⊆ V , and so V belongs to the filter generated by
{f (U ) : U ∈ F }. Thus 2. holds. If 2. holds, then we find for the particular choice
F = U (x) that f (U (x)) → f (x) which means that 3. is satisfied. Finally, assume
that 3. holds. Recall that by Lemma 5.27, f (U (x)) consists precisely of those sets
V ⊆ Y for which U := {x : f (x) ∈ V } ∈ U (x). Hence, if U (f (x)) ⊆ f (U (x)) and
V ∈ U (f (x)), we find some U ∈ U (x) with f (U ) ⊆ V , and so f is continuous at
x.
Theorem 12.6 5. implies that the inclusions 3. and 4. are equivalent.
Noting that mon(U (f (x))) = mon(f (x)), and that ∗ f (mon(x)) =
∗
f (U (f (x))) ⊆ mon(f (U (x))) in view of Theorem 12.55, we see that 4. im-
plies 5.; moreover, equivalence follows if ∗ is a compact enlargement. To see the
converse inclusion without this additional requirement, assume that 5. holds.
Choose some internal U ∈ ∗ U (x) with U ⊆ mon(x) (Proposition 12.19). Given
V ∈ U (f (x)), we have by assumption ∗ f (U ) ⊆ mon(f (x)) ⊆ ∗ V , i.e. we have
proved
∃u ∈ ∗ U (x) : ∗ f (u) ⊆ ∗ V .
The inverse form of the transfer principle implies that there is some U ∈ U (x)
with f (U ) ⊆ V . Hence, f is continuous at x. The equivalence of 6. with 5. is
trivial.
Proof. Let xn → x, and F be the filter generated by the sets {xn , xn+1 , . . .}
(n = 1, 2, . . .). Then F → x (Proposition 12.46), and so f (F ) → f (x)
by Theorem 12.57. By definition, f (F ) is the filter generated by the sets
{f (xn ), f (xn+1 ), . . .} (n = 1, 2, . . .). Thus, Proposition 12.46 implies f (xn ) →
f (x).
We point out that the converse to Corollary 12.58 does not hold in general.
For counterexamples (and assumptions which imply that the converse implication
holds), we refer the reader to books on topology.
Exercise 65. Prove by nonstandard methods that a continuous function maps
compact sets into compact sets.
Using Proposition 12.5, one can prove that a function f is continuous if and
only if preimages of open sets are open. For one direction of this statement, one
can give an easier nonstandard proof:
Exercise 66. Prove by nonstandard methods that for any continuous function
f : X → Y preimages of open sets are open.
§ 13 Uniform Structures
13.1 Uniform Spaces
There are some concepts which appear topological but which actually cannot
be described in topological spaces: Uniform convergence, uniform continuity, or
Cauchy sequences. In (pseudo)metric spaces, one can define these concepts: For
example, a sequence $f_n$ of functions with values in a pseudometric space converges
uniformly to a function $f$, if for each $\varepsilon \in \mathbb{R}_+$ one finds some index $N$ such that
$d(f_n(x), f(x)) < \varepsilon$ ($n \ge N$) simultaneously for all $x$. This definition is possible,
since ε determines not only a neighborhood, but actually a system of neighbor-
hoods for each point in the space (in particular for each of the points f (x)). Thus,
if one intends to introduce such a concept in more general topological spaces, one
should consider families of neighborhoods. This is the motivation for the definition
of a so-called uniform structure.
Recall that sets U ⊆ X × X are relations. Hence, the notation
U −1 := {(y, x) : (x, y) ∈ U }
and
U (x) := {y : (x, y) ∈ U }.
We write also $U^2$ for $U \circ U$. We already made use of these conventions for the
particular relation st (see the remarks in front of Theorem 12.35).
Similar arguments as in Theorem 3.13 show that $({}^*U)^{-1} = {}^*(U^{-1})$,
${}^*U \circ {}^*V = {}^*(U \circ V)$, and ${}^*U({}^*x) = {}^*(U(x))$.
Definition 13.1. A uniform structure over a space X is a filter U over X × X such
that each U ∈ U satisfies:
1. ∆ := {(x, x) : x ∈ X} ⊆ U .
2. U −1 ∈ U .
3. There is some V ∈ U with V 2 ⊆ U .
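To make the three conditions concrete, here is a sketch for the uniform structure induced by a single pseudometric $d$. The sets $U_\varepsilon$ are notation introduced only for this example. The filter generated by the sets $U_\varepsilon$ satisfies the conditions because each generator does:

```latex
\[
  U_\varepsilon := \{(x, y) \in X \times X : d(x, y) < \varepsilon\} \qquad (\varepsilon > 0).
\]
\begin{align*}
  &1.\quad d(x, x) = 0 < \varepsilon,
    &&\text{so } \Delta \subseteq U_\varepsilon; \\
  &2.\quad d(y, x) = d(x, y),
    &&\text{so } U_\varepsilon^{-1} = U_\varepsilon; \\
  &3.\quad d(x, z) \le d(x, y) + d(y, z) < \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2},
    &&\text{so } U_{\varepsilon/2}^{2} \subseteq U_\varepsilon .
\end{align*}
```

Condition 3 is the uniform counterpart of the triangle inequality: it plays the role of "halving $\varepsilon$" in arguments that have no metric available.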
Each uniform structure generates a topology in a canonical way: Given x ∈ X,
put U (x) := {U (x) : U ∈ U }.
Proposition 13.2. Let O be the system of all sets O with the property that O ∈
U (x) for any x ∈ O. Then O is a topology on X, and U (x) is the corresponding
neighborhood filter of x.
The topology is Hausdorff if and only if the relation (x, y) ∈ U for any U ∈ U
implies x = y.
x ≈U y ⇐⇒ (x, y) ∈ mon(U ).
∃v ∈ ∗ U : v 2 ⊆ ∗ U
is true. The inverse form of the transfer principle implies that there is some V ∈ U
with V 2 ⊆ U .
Proposition 13.6. If the uniform structure on X is induced by a family D of pseu-
dometrics, then
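\[ x \approx_{\mathcal U} y \iff {}^*d(x, y) \approx 0 \ \text{for any } d \in D. \]
Hence, $(x, y) \in \operatorname{mon}(\mathcal{U})$ if and only if ${}^*d(x, y) < \varepsilon$ for any $\varepsilon \in \mathbb{R}_+$ and any
$d \in D$.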
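For instance, take $X = \mathbb{R}$ with $D = \{d\}$ and $d(x, y) = |x - y|$ (a standard example, not taken from the text). Proposition 13.6 then says that $\approx_{\mathcal U}$ is the relation of infinitesimal difference, also between infinite points. Below, $H$ is assumed to be a positive infinite element of ${}^*\mathbb{R}$.

```latex
\[
  x \approx_{\mathcal U} y \iff |x - y| \approx 0 \qquad (x, y \in {}^*\mathbb{R}).
\]
\[
  H \approx_{\mathcal U} H + \tfrac{1}{H},
  \qquad
  H \not\approx_{\mathcal U} H + 1 .
\]
```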
One might hope that x≈U y if and only if x≈O y with respect to the topology
generated by the uniform structure. Unfortunately, for nonstandard points this
may fail even in natural situations:
Example 13.7. Consider $X = \mathbb{N}$ with the canonical (metric) uniform structure.
By Proposition 13.6, we have
\[ n \approx_{\mathcal U} m \iff n = m. \]
On the other hand, for infinite $h \in {}^*\mathbb{N}$, we find some $n \neq h$ with $n \approx_{\mathcal O} h$ by Theorem 12.27.
Nevertheless, for standard points the situation is good:
Proposition 13.8. Let X be a uniform space. Then we have
y ≈O ∗ x ⇐⇒ y ≈U ∗ x (x ∈ X, y ∈ ∗ X).
∃x ∈ ∗ F : x × x ⊆ ∗ U .
The inverse form of the transfer principle implies that there is some F ∈ F with
F × F ⊆ U , and so F is a Cauchy filter.
Definition 13.12. A uniform space is called complete, if each Cauchy filter con-
verges.
Propositions 12.46 and 13.10 together imply:
Corollary 13.13. In a complete space any Cauchy sequence converges.
Proof. If xn is a Cauchy sequence, then the filter F generated by the sets Fn :=
{xn , xn+1 , . . .} is a Cauchy filter by Proposition 13.10 and thus convergent to some
x. Proposition 12.46 implies xn → x.
We point out that the converse to Corollary 13.13 does not hold, in gen-
eral. But the converse is true in (pseudo)metric spaces; although it is rather easy
to prove this fact by standard methods, one can also give a nonstandard proof
(Exercise 68).
It turns out that completeness of a uniform space X is related to another
important notion:
Definition 13.14. A point y ∈ ∗ X is called a pre-nearstandard point if for any
U ∈ U there is some x ∈ X with (∗ x, y) ∈ ∗ U . We write pns(∗ X) for the set of
pre-nearstandard points.
Since $U \in \mathcal{U}$ if and only if $U^{-1} \in \mathcal{U}$ and since $({}^*U)^{-1} = {}^*(U^{-1})$, it is
equivalent to require that for any $y \in {}^*X$ and any $U \in \mathcal{U}$ there is some $x \in X$
with $(y, {}^*x) \in {}^*U$.
Exercise 67. Prove that in a (pseudo)metric space $X$ a point $y \in {}^*X$ is
pre-nearstandard if and only if for each $\varepsilon \in \mathbb{R}_+$ there is some $x \in X$ with
${}^*d({}^*x, y) < {}^*\varepsilon$.
Lemma 13.15. If F is a Cauchy filter, then mon(F ) ⊆ pns(∗ X).
Proof. Let y ∈ mon(F ). Given U ∈ U , choose some F ∈ F with F × F ⊆ U . Fix
some x ∈ F . Then
\[ ({}^*x, y) \in {}^*F \times {}^*F = {}^*(F \times F) \subseteq {}^*U. \]
Theorem 13.16. ns(∗ X) ⊆ pns(∗ X), i.e. each nearstandard point is a pre-
nearstandard point. Moreover, we have equality if and only if X is complete.
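A small example may help to see how the inclusion can be strict. The space $X = (0, 1)$ with the usual metric is chosen here only for illustration. It is not complete, and a positive infinitesimal is pre-nearstandard (in the sense of Exercise 67) without being nearstandard:

```latex
\[
  X = (0, 1), \qquad d(x, y) = |x - y|, \qquad
  y := \varepsilon \in {}^*X \ \text{with } \varepsilon > 0 \text{ infinitesimal}.
\]
% Pre-nearstandard: for each standard \delta \in (0,1) take the standard point x := \delta/2.
\[
  {}^*d({}^*x, y) = \tfrac{\delta}{2} - \varepsilon < \delta
  \quad\Longrightarrow\quad y \in \operatorname{pns}({}^*X).
\]
% Not nearstandard: the only candidate for a standard part lies outside X.
\[
  y \approx {}^*x \ \text{would force}\ x = \operatorname{st}(\varepsilon) = 0 \notin X
  \quad\Longrightarrow\quad y \notin \operatorname{ns}({}^*X).
\]
```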
UY := {U ∩ (Y × Y ) : U ∈ U }
Proof. Since U has the finite intersection property, also σ U has the finite inter-
section property, and so the set U∗ is a filter over ∗ (X × X) = ∗ X × ∗ X. By
definition of a generated filter, we have V ∈ U∗ if and only if there are finitely
many U1 , . . . , Un ∈ U∗ with U1 ∩ · · · ∩ Un ⊆ V ⊆ ∗ X × ∗ X. Since U is a filter, we
have U1 ∩ · · · ∩ Un ∈ U , and so U∗ can be described as in the claim.
Since ∆ := {(x, y) ∈ X × X : x = y} ∈ U , we have ∗ ∆ ∈ U∗ . The standard
definition principle for relations implies
\[ {}^*\Delta = \{(x, y) \in {}^*X \times {}^*X : x = y\}, \]
Proof. Let F be some Cauchy filter over ∗ X. For each U ∈ U , there is some
FU ∈ F with FU × FU ⊆ ∗ U and some xU ∈ FU . Consider the system A :=
{∗ U (xU ) : U ∈ U } (axiom of choice!). We claim that A has the finite intersection
property: If U1 , . . . , Un ∈ U , then FU1 ∩ · · · ∩ FUn contains some element x. For
each k = 1, . . . , n, we have (xUk , x) ∈ FUk × FUk ⊆ ∗ U k , and so x ∈ ∗ U k (xUk ).
Moreover, $\mathcal{A}$ has at most the cardinality of $\mathcal{P}(X)$ (if $X$ is infinite). Since ${}^*$ is
$\mathcal{P}(X)$-saturated, we thus find some $x \in \bigcap \mathcal{A}$, i.e. $x \in {}^*U(x_U)$ for any $U \in \mathcal{U}$.
which is a (pseudo)metric that does not attain the value ∞ and generates the
same uniform structure as d.
Theorem 13.27. Let the uniform structure on $X$ be induced by a family $D$ of
pseudometrics. Then the uniform structure $\mathcal{U}_*$ on ${}^*X$ is induced by the family of
pseudometrics
\[
  d_*(x, y) :=
  \begin{cases}
    \operatorname{st}({}^*d(x, y)) & \text{if } {}^*d(x, y) \text{ is finite,} \\
    \infty & \text{if } {}^*d(x, y) \text{ is infinite}
  \end{cases}
  \qquad (d \in D).
\]
with $\varepsilon \in \mathbb{R}_+$ and $d_1, \dots, d_n \in D$. We have $V \in \mathcal{U}_*$ if and only if there is some
U ∈ U with ∗ U ⊆ V , i.e. if and only if there is some U0 of the above form such
that (using the standard definition principle for relations)
In view of the monotonicity of st, this means that there are $d_1, \dots, d_n \in D$ and
$\varepsilon \in \mathbb{R}_+$ with
But this means that the uniform structure U∗ is induced by the family of pseudo-
metrics d∗ (d ∈ D).
Let $\mathcal{F}$ be a Cauchy filter over ${}^*X$. Then we find for each $n \in \mathbb{N}$ and each
finite $D_0 \subseteq D$ some $F_{D_0,n} \in \mathcal{F}$ such that
where [x] denotes the equivalence class of x, i.e. the set of all y ∈ ∗ X with y ≈U x.
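Let $\widehat{\mathcal{U}}$ be the system of all sets of the form
Lemma 13.28. If $U = {}^*V$ for some $V \in \mathcal{U}$, then the relation $([x], [y]) \in \widehat{U}$ implies
$(x, y) \in U^3$.
Proof. Since $([x], [y]) \in \widehat{U}$, there are elements $x_0, y_0 \in {}^*X$ with $x_0 \approx_{\mathcal U} x$, $y_0 \approx_{\mathcal U} y$
and $(x_0, y_0) \in U = {}^*V$. Since $x_0 \approx_{\mathcal U} x$ and $y_0 \approx_{\mathcal U} y$, we have $(x, x_0) \in {}^*V$ and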
$V^3 \subseteq U$. Since $([x], [y]) \in \widehat{{}^*V}$, Lemma 13.28 implies $(x, y) \in ({}^*V)^3 = {}^*(V^3) \subseteq {}^*U$.
Hence, $(x, y) \in {}^*U$ for any $U \in \mathcal{U}$, which means $x \approx_{\mathcal U} y$, and so $[x] = [y]$,
as desired.
Now, let X be Hausdorff. If x, y ∈ X satisfy [∗ x] = [∗ y], we have ∗ x ≈U ∗ y,
and so ∗ x ≈O ∗ y which implies x = st(∗ y) = y, since st is a function by Proposi-
tion 12.30. Thus, the map x → [∗ x] is one-to-one.
Now we prove (13.1): If U ∈U, there is some V ∈ U with U ⊇ ∗ V . Since
σ
U ⊇ V , we have in the sense of the embedding
∩ (X × X),
V = {([∗ x], [∗ y]) : (x, y) ∈ V } ⊆ U
x ∈ X with (y, ∗ x) ∈ ∗ U ⊆ V . Hence, ([y], [∗ x]) ∈ V , i.e. [y] ∈ V ([∗ x]) = V (x) (in
the sense of our identification). Thus, [y] belongs to the closure of X. Conversely,
if [y] belongs to the closure of X and U ∈ U is given, choose some V ∈ U with
V 3 ⊆ U . Since [y] belongs to the closure of X, we find some element of X in the
neighborhood ∗ V ([y]), i.e. there is some x ∈ X with ([∗ x], [y]) ∈ ∗V . Lemma 13.28
implies $({}^*x, y) \in ({}^*V)^3 \subseteq {}^*U$. Hence, $y \in \operatorname{pns}({}^*X)$.
The set $\widehat{X}$ with the uniform structure $\widehat{\mathcal{U}}$ is called the nonstandard hull of
the uniform space $X$.
Theorem 13.30. Let ${}^*$ be $\mathcal{P}(X)$-saturated. Then $\widehat{X}$ is complete.
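Proof. Let $\mathcal{F}$ be a Cauchy filter over $\widehat{X}$. Let $\mathcal{F}_0$ be the system of all subsets of
${}^*X$ which contain an element of the form $\{x \in {}^*X : [x] \in F\}$ where $F \in \mathcal{F}$.
Then $\mathcal{F}_0$ is a filter: By definition, $\emptyset \notin \mathcal{F}_0$, and the relations $G_0 \in \mathcal{F}_0$ and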
G0 ⊆ G ⊆ ∗ X imply G ∈ F0 . Moreover, for any G1 , G2 ∈ F0 , we find F1 , F2 ∈ F
with Gi ⊇ {x : [x] ∈ Fi }. Hence,
and so G1 ∩ G2 ∈ F0 .
Moreover, F0 is a Cauchy filter: Given U ∈ U∗ , choose some V ∈ U∗ with
V 3 ⊆ U and some W ∈ U with V ⊇ ∗ W . Since F is a Cauchy filter, we find
and so $\widehat{U}([x]) \in \mathcal{F}$, as desired.
As can be seen from the proof, the saturation property is only needed for the
completeness of ${}^*X$. In particular, the same result holds if ${}^*$ is comprehensive and
a compact $\mathcal{P}(X)$-enlargement (Exercise 72). Moreover, if the uniform structure is
induced by a family $D$ of pseudometrics, it suffices that ${}^*$ is $(\mathcal{P}(D) \times \mathbb{N})$-saturated
(Theorem 13.27).
Corollary 13.31. If $X$ is a Hausdorff space, then $X$ has a completion, i.e. there
is a complete uniform Hausdorff space $\overline{X}$ such that $X \subseteq \overline{X}$ carries the inherited
uniform structure and such that $\overline{X}$ is the closure of $X$.
Proof. By Exercise 70, the closure $\overline{X}$ of $X$ in the complete space $\widehat{X}$ is complete.
Proposition 13.32. Let the uniform structure on $X$ be induced by a family $D$ of
pseudometrics. Then the uniform structure on $\widehat{X}$ is induced by the family of
pseudometrics
\[
  \widehat{d}([x], [y]) =
  \begin{cases}
    \operatorname{st}({}^*d(x, y)) & \text{if } {}^*d(x, y) \text{ is finite,} \\
    \infty & \text{if } {}^*d(x, y) \text{ is infinite}
  \end{cases}
  \qquad (d \in D).
\]
If $X$ is a metric space, i.e. $D = \{d\}$, then $\widehat{d}$ is a metric (which might assume the
value $\infty$). Moreover, $\widehat{X}$ is complete if ${}^*$ is $(\mathcal{P}(D) \times \mathbb{N})$-saturated.
Proof. The last statement follows by the remarks following Theorem 13.30. To see
that d is well-defined, let [x] = [x0 ] and [y] = [y0 ]. Then x ≈U x0 and y ≈U y0 ,
and by Proposition 13.6, this implies ∗ d(x, x0 ) ≈ 0 ≈ ∗ d(y, y0 ) for any d ∈ D.
By the inverse triangle inequality for ${}^*d$, we thus have
$|{}^*d(x, y) - {}^*d(x_0, y_0)| \le {}^*d(x, x_0) + {}^*d(y, y_0) \approx 0$, and so ${}^*d(x, y)$ and ${}^*d(x_0, y_0)$ are
either both infinite or both finite with the same standard part.
If $X$ is a metric space, then $\widehat{X}$ is a pseudometric space. Since $\widehat{X}$ is Hausdorff
(Theorem 13.29), the pseudometric on $\widehat{X}$ must even be a metric.
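The following sketch evaluates the hull metric in the simplest case $X = \mathbb{R}$, $d(x, y) = |x - y|$. The example and the hat notation $\widehat{d}$ for the induced metric are assumptions made here for illustration; $H$ denotes a positive infinite element of ${}^*\mathbb{R}$.

```latex
\[
  \widehat{d}([x], [y]) =
  \begin{cases}
    \operatorname{st}|x - y| & \text{if } |x - y| \text{ is finite,} \\
    \infty & \text{otherwise}
  \end{cases}
  \qquad (x, y \in {}^*\mathbb{R}).
\]
\[
  \widehat{d}([H], [H + 1]) = 1,
  \qquad
  \widehat{d}\bigl([H], [H + \tfrac{1}{H}]\bigr) = 0
  \ \ \bigl(\text{so } [H] = [H + \tfrac{1}{H}]\bigr),
  \qquad
  \widehat{d}([0], [H]) = \infty .
\]
```

The last value shows why the metric on the hull has to be allowed to take the value $\infty$.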
Proof. Put A(x, y) := x + y and M (λ, x) := λx. Assume first that A and M are
continuous.
To prove 1., we show that for any x0 , x1 ∈ X and any neighborhood U of x0
the set U0 := (x1 − x0 ) + U is a neighborhood of x1 : For the choice x0 := 0 and
x1 := x, we obtain then that for any neighborhood U of 0 the set U0 = x + U is a
neighborhood of x; for the choice x0 := x and x1 := 0, we obtain conversely that
any neighborhood U of x has the form U = x + U0 for some neighborhood U0 of
0.
Thus, let $x_0, x_1 \in X$, and $U$ be a neighborhood of $x_0$. Since $A(x_1, x_0 - x_1) = x_0$,
we find a neighborhood $V$ of $(x_1, x_0 - x_1)$ with $A(V) \subseteq U$. By definition of
the product topology, there are neighborhoods V1 of x1 and V2 of x0 − x1 such
that V1 × V2 ⊆ V . Then A(V1 × V2 ) ⊆ U ; in particular, V1 + (x0 − x1 ) ⊆ U . Hence,
U0 = U − (x0 − x1 ) contains V1 and thus is a neighborhood of x1 , as claimed.
For the proof of 2. and 3., let a neighborhood $U$ of 0 and points $x \in X$ and
$\lambda \in \mathbb{K}$ be given. We find a neighborhood $V_0$ of $(\lambda, x)$ such that $M(V_0) \subseteq U$. By
definition of the product topology this means that there are neighborhoods $\Lambda$ of $\lambda$
and so U1 ∩ U2 ∈ U , as desired.
To see that U is a uniform structure, let U ∈ U be given. Then U ⊇ UO
for some O ∈ O0 . By Theorem 14.2 2., there is some O0 ∈ O0 with −O0 ⊆ O and
§14 Topological Vector Spaces 187
V 2 = {(x, y) | ∃z ∈ X : x − z, z − y ∈ O0 }.
It can be proved that the uniform structure of Theorem 14.3 is the unique
uniform structure such that the addition is uniformly continuous. However, we
will not need this fact. If we speak of the uniform structure of a topological vector
space, we always mean the uniform structure of Theorem 14.3.
Note that if $X$ is a vector space, then also ${}^*X$ becomes equipped with an
addition and a scalar multiplication by the standard definition principle for relations. The
transfer principle implies that ${}^*X$ is actually a vector space (with scalar
multiplication $\lambda x := ({}^*\lambda)\,{}^*{\cdot}\,x$). We write $+$ in place of ${}^*{+}$ and $0$ in place of ${}^*0$ (noting
that ${}^*0$ is also the neutral element of addition in ${}^*X$ by the transfer principle).
Proposition 14.4. Let X be a topological vector space. Then
U ⊇ UO := {(x, y) ∈ X × X : x − y ∈ O}
(x, y) ∈ ∗ U O = {(x, y) ∈ ∗ X × ∗ X : x − y ∈ ∗ O}
for any neighborhood O ⊆ X of 0, i.e. if and only if x−y ∈ ∗ O for any neighborhood
O of 0. But this means x − y ≈O 0. The second equivalence of (14.1) follows from
Proposition 13.8.
For the second statement, note that $V$ is a neighborhood of $x$ if and only if
$V \supseteq {}^*U^{-1}(x)$ for some $U \in \mathcal{U}$. This is the case if and only if $V \supseteq {}^*U_O^{-1}(x)$ for some
neighborhood $O \subseteq X$ of 0. But this means $V \supseteq \{y \in {}^*X : y - x \in {}^*O\} = x + {}^*O$
fin(∗ X) := {x ∈ ∗ X : x is finite}.
As we have mentioned at the end of §13, this is not the only natural definition
of the term “finite”: One could also use another definition which takes into account
only the uniform structure of the space (and not the multiplication operation).
Unfortunately, these two definitions may differ in certain cases. However, the above
definition appears to be the most natural one in the context of topological vector
spaces. For details, we refer the reader to [Hen72a, HM72].
Lemma 14.6. If X is a topological vector space, then any neighborhood U of 0
contains a balanced neighborhood O of 0, i.e. |λ| ≤ 1 implies λO ⊆ O.
inf(∗ X) := {x ∈ ∗ X : x ≈O 0} = {x ∈ ∗ X : x ≈U 0} = mon(0).
Exercise 73. Prove that $x \in \operatorname{fin}({}^*X)$ if and only if for any $c \in \operatorname{inf}({}^*\mathbb{R})$, $c > 0$, the
relation $cx \in \operatorname{inf}({}^*X)$ holds.
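To see what the criterion of Exercise 73 says, it may help to look at the one-dimensional case $X = \mathbb{R}$, where finite and infinitesimal have their usual meaning. This is only an illustration under that assumption, not a solution of the exercise; $H$ denotes a positive infinite element of ${}^*\mathbb{R}$.

```latex
% A finite point: every infinitesimal multiple is infinitesimal.
\[
  x = 5: \qquad c \approx 0 \ \Longrightarrow\ c\,x = 5c \approx 0 .
\]
% An infinite point: the infinitesimal c := 1/H is a witness against finiteness.
\[
  x = H: \qquad c := \tfrac{1}{H} \approx 0, \ c > 0, \quad\text{but}\quad c\,x = 1 \not\approx 0 .
\]
```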
Lemma 14.8. The set inf(∗ X) is a linear subspace of ∗ X.
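Proof. If $x, y \in \operatorname{inf}({}^*X)$, $\lambda \in \mathbb{K}$, and $U$ is a neighborhood of 0, we find by
Theorem 14.2 some neighborhood $O$ of 0 with $O + O \subseteq U$ and $\lambda O \subseteq U$. Since
$x, y \in {}^*O$, we have $\lambda x \in {}^*(\lambda O) \subseteq {}^*U$ and $x + y \in {}^*O + {}^*O = {}^*(O + O) \subseteq {}^*U$.
Hence, $\lambda x, x + y \in \operatorname{mon}(0)$.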
Proof. The fact that each topological vector subspace of ∗ X must be contained in
fin(∗ X) follows from our considerations preceding Definition 14.5.
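Now we prove that $\operatorname{fin}({}^*X)$ is indeed a linear subspace: If $x, y \in \operatorname{fin}({}^*X)$ and
$\lambda \in \mathbb{K}$ are given, then $cx, cy \in \operatorname{inf}({}^*X)$ for any $c \in \operatorname{inf}({}^*\mathbb{R})$, $c > 0$ by Exercise 73.
By Lemma 14.8, we have $c(x + y) = cx + cy \in \operatorname{inf}({}^*X)$ and $c\lambda x = \lambda(cx) \in \operatorname{inf}({}^*X)$
for any $c \in \operatorname{inf}({}^*\mathbb{R})$, $c > 0$, and so $x + y, \lambda x \in \operatorname{fin}({}^*X)$ by Exercise 73.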
To prove that fin(∗ X) is a topological vector space, we verify the three condi-
tions of Theorem 14.2. Condition 1. follows from Proposition 14.4: If x ∈ fin(∗ X),
and U0 ⊆ fin(∗ X) is a neighborhood of 0, i.e. U0 = U ∩ fin(∗ X) for some neigh-
borhood U ⊆ ∗ X of 0, then x + U is a neighborhood of x in ∗ X. Since fin(∗ X) is
a vector space, it follows that x + U0 = x + (U ∩ fin(∗ X)) = (x + U ) ∩ fin(∗ X) is a
neighborhood of x. Analogously, if V0 ⊆ fin(∗ X) is a neighborhood of x ∈ fin(∗ X),
then we find some neighborhood V ⊆ ∗ X of x with V0 = V ∩ fin(∗ X). Since V − x
is a neighborhood of 0 in ∗ X, it follows that U0 := V0 − x = (V − x) ∩ fin(∗ X) is
a neighborhood of 0 in fin(∗ X) with V0 = x + U0 .
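For conditions 2. and 3., let a neighborhood $U_0 \subseteq \operatorname{fin}({}^*X)$ of 0, $\lambda \in \mathbb{K}$, and
$x \in \operatorname{fin}({}^*X)$ be given. We have $U_0 = U \cap \operatorname{fin}({}^*X)$ for some neighborhood $U \subseteq {}^*X$ of
0, and $U \supseteq {}^*V$ for some balanced neighborhood $V \subseteq X$ of 0 (Proposition 14.4 and
Lemma 14.6). By Theorem 14.2, we find neighborhoods $O \subseteq X$ of 0 and $\Lambda \subseteq \mathbb{K}$ of
$\lambda$ with $O + O \subseteq V$ and $\Lambda O \subseteq V$. The transfer principle implies ${}^*O + {}^*O \subseteq {}^*V$ and
$\Lambda\,{}^*O \subseteq {}^*V$. Hence, 2. holds with the neighborhoods $\Lambda$ and $O_0 := {}^*O \cap \operatorname{fin}({}^*X)$.
Indeed, ${}^*O$ is a neighborhood of 0 in ${}^*X$ by Proposition 14.4, and so $O_0$ is a
neighborhood of 0 in $\operatorname{fin}({}^*X)$. Moreover, $O_0 + O_0 \subseteq {}^*V \cap \operatorname{fin}({}^*X) \subseteq U_0$, and
$\Lambda O_0 \subseteq {}^*V \cap \operatorname{fin}({}^*X) \subseteq U_0$, since $\operatorname{fin}({}^*X)$ is a vector space. For condition 3.,
observe that we find some $n \in \mathbb{N}$ with $x \in n\,{}^*V$. For each $\mu$ in the neighborhood
$\Lambda_0 := \{\mu \in \mathbb{K} : |\mu| < n^{-1}\}$ of 0, we have $\mu x \in {}^*V$, because ${}^*V$ is balanced, and
thus $\mu x \in {}^*V \cap \operatorname{fin}({}^*X) \subseteq U_0$. Hence, also condition 3. holds.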
Since the neighborhoods of 0 in fin(∗ X) are precisely those sets of the form
O ∩ fin(∗ X) where O ⊆ ∗ X is a neighborhood of 0, we find that U0 consists
of all sets of the form U ∩ (fin(∗ X) × fin(∗ X)) where U ⊆ ∗ X × ∗ X is such that
there is some neighborhood O ⊆ ∗ X of 0 with
U ⊇ UO := {(x, y) ∈ ∗ X × ∗ X | x − y ∈ O}.
In view of Proposition 14.4, this holds for U if and only if there is a neighborhood
$O \subseteq X$ of 0 with $U \supseteq U_{{}^*O}$. The latter means by the inverse form of the standard
definition principle for relations that
U ⊇ ∗ {(x, y) ∈ X × X : x − y ∈ O}.
and conclude that precompact subsets of topological vector spaces are bounded.
Exercise 76. Prove that the vector space inf(∗ X) is a closed subspace of ∗ X and
contained in fin(∗ X). In particular, inf(∗ X) is a closed subspace of fin(∗ X).
Recall that if U is a subspace of some vector space X, one defines the factor
space X/U as the set of all equivalence classes with respect to the equivalence
relation
x ≈ y ⇐⇒ x − y ∈ U.
The space X/U becomes a vector space with the operations [x] + [y] = [x + y] and
[λx] = λ[x] (which are well-defined). If X is a topological vector space and U is
a subspace, then one equips X/U with the following topology: The open sets are
the sets of the form {[x] : x ∈ O} where O is open.
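A minimal worked instance of the factor space construction, under the assumption $X = \mathbb{R}^2$ with the usual topology and $U = \mathbb{R} \times \{0\}$ (an example chosen here, not taken from the text):

```latex
\[
  X = \mathbb{R}^2, \quad U = \mathbb{R} \times \{0\}: \qquad
  (x_1, x_2) \approx (y_1, y_2) \iff x_2 = y_2 ,
\]
\[
  [(x_1, x_2)] \mapsto x_2 \quad \text{is a well-defined linear bijection } X/U \to \mathbb{R},
\]
\[
  \{[x] : x \in O\} \quad \text{corresponds to} \quad \{x_2 : (x_1, x_2) \in O\} .
\]
```

The last set is the image of the open set $O$ under the coordinate projection, which is open in $\mathbb{R}$, and every open subset of $\mathbb{R}$ arises in this way. So in this example the topology of $X/U$ corresponds to the usual topology of $\mathbb{R}$.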
Proposition 14.11. X/U is a topological vector space.
Proof. We prove first that the sets $\tilde{O} := \{[x] : x \in O\}$ with open sets $O$ form
a topology: Clearly, $\emptyset$ and $X/U$ are open. If $\{\tilde{O}_i : i\}$ is a family of open sets, then
$\bigcup_i \tilde{O}_i = \{[x] : x \in \bigcup_i O_i\}$ is open. Finally, if $\tilde{O}_1, \tilde{O}_2$ are open,
let $\mathcal{A}$ be the family
of all open sets which are contained in $A := \tilde{O}_1 \cap \tilde{O}_2$. Then $\bigcup \mathcal{A} \subseteq A$, and if we
can prove that even $A = \bigcup \mathcal{A}$, then $A$ is open. Thus, let $[x] \in A$ be given. By
choosing a proper representative, we may assume that $x \in O_1$, and we find some
$u \in U$ with $x + u \in O_2$. Now we apply Theorem 14.2 several times: $O_3 := O_2 - x - u$
is a neighborhood of 0, and so $x + O_3 = O_2 - u$ is a neighborhood of $x$. We thus
find an open set $O \subseteq O_1$ with $x \in O \subseteq O_2 - u$. Then $\tilde{O}$ is open with $[x] \in \tilde{O} \subseteq A$,
and so $[x] \in \bigcup \mathcal{A}$, as desired.
and so
U ⊇ ∗
U 0 ∩ (X̆ × X̆) ⊇ {([x], [y]) | x, y ∈ fin(∗ X) ∧ x − y ∈ ∗ O0 }. (14.2)
belongs to the uniform structure of X̆. If we can prove that V0 ⊆ U , then also U
must belong to this uniform structure, and we are done. Thus, let ([x], [y]) ∈ V0
be given, i.e. x, y ∈ fin(∗ X) with [x − y] = [x] − [y] ∈ V . Then x − y − u ∈ O ⊆ ∗ O 0
for some u ∈ inf(∗ X) ⊆ fin(∗ X). By (14.2), this implies ([x], [y + u]) ∈ U . Since
[y] = [y + u], we find ([x], [y]) ∈ U , as desired.
To see that X̆ is closed, let y ∈ X \ X̆ be given, i.e. y = [x] for some
x ∈ ∗ X \ fin(∗ X). Since fin(∗ X) is closed, we find some neighborhood O ⊆ ∗ X of x
which is disjoint from fin(∗ X). By definition of the topology of ∗ X, we find some
$U \in \mathcal{U}$ with $O \supseteq {}^*U(x)$. There is some $V \in \mathcal{U}$ with $V^3 \subseteq U$. Then ${}^*V([x])$ is a
neighborhood of $[x]$ which is disjoint from $\operatorname{fin}({}^*X)$. Indeed, if $[y] \in {}^*V([x])$ then
Lemma 13.28 implies $(x, y) \in ({}^*V)^3 \subseteq {}^*U$, and so $y \in {}^*U(x) \subseteq O$ which implies
$y \notin \operatorname{fin}({}^*X)$, as desired.
If $X$ is Hausdorff, Theorem 13.29 implies that the embedding $X \hookrightarrow \widehat{X}$ defined
by $x \mapsto [{}^*x]$ is one-to-one. Since ${}^*x$ is always finite, this is even an embedding into
the subspace $\breve{X}$.
and
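\[ \operatorname{inf}({}^*X) = \{x \in {}^*X : {}^*\|x\| \text{ is infinitesimal for each } \|\cdot\| \in N\}. \]
The uniform structure of $\operatorname{fin}({}^*X)$ is induced by the family of seminorms
\[ \|x\|_* := \operatorname{st}({}^*\|x\|) \qquad (\|\cdot\| \in N), \]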
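Proof. We have $x \in \operatorname{fin}({}^*X)$ if and only if for any $\varepsilon \in \mathbb{R}_+$ and each finitely many
$\|\cdot\|_1, \dots, \|\cdot\|_k \in N$, we find some $n \in \mathbb{N}$ with
\[ x \in n\,{}^*\{y \in X : \|y\|_1 < \varepsilon \wedge \dots \wedge \|y\|_k < \varepsilon\}. \]
By the standard definition principle, this means ${}^*\|x/n\|_1 < {}^*\varepsilon, \dots, {}^*\|x/n\|_k < {}^*\varepsilon$,
i.e. ${}^*\|x\|_1 < n\,{}^*\varepsilon, \dots, {}^*\|x\|_k < n\,{}^*\varepsilon$. Hence $x \in \operatorname{fin}({}^*X)$ if and only if ${}^*\|x\| \in \operatorname{fin}({}^*\mathbb{R})$
for each $\|\cdot\| \in N$.
Similarly, $x \in \operatorname{inf}({}^*X)$ if and only if for any $\varepsilon \in \mathbb{R}_+$ and each finitely many
$\|\cdot\|_1, \dots, \|\cdot\|_n \in N$, we have
\[ x \in {}^*\{y \in X : \|y\|_1 < \varepsilon \wedge \dots \wedge \|y\|_n < \varepsilon\} \]
which by the standard definition principle means ${}^*\|x\|_1, \dots, {}^*\|x\|_n < {}^*\varepsilon$. Hence,
$x \in \operatorname{inf}({}^*X)$ if and only if ${}^*\|x\| \in \operatorname{inf}({}^*\mathbb{R})$ for each $\|\cdot\| \in N$.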
and since st is additive and monotone, the triangle inequality for $\|\cdot\|_*$ follows.
The equality $\|\lambda x\|_* = |\lambda|\,\|x\|_*$ is proved analogously. Hence, $\|\cdot\|_*$ ($\|\cdot\| \in N$) is a
family of seminorms on $\operatorname{fin}({}^*X)$ which generates the same uniform structure as
the pseudometrics $d_*(x, y) := \operatorname{st}({}^*d(x, y))$ ($d \in D$) where $D$ is the family of all
pseudometrics $d(x, y) := \|x - y\|$ with $\|\cdot\| \in N$. Hence, the statement concerning
$\operatorname{fin}({}^*X)$ follows from Theorem 13.27 (recall that $\operatorname{fin}({}^*X)$ is a closed subspace of
${}^*X$, and so the completeness of ${}^*X$ implies the completeness of $\operatorname{fin}({}^*X)$ by
Exercise 70). The proof of the statements concerning $\breve{X}$ follows analogously, using
Proposition 13.32.
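This is true if $\operatorname{fin}({}^*X) = \operatorname{pns}({}^*X)$. Conversely, if $\operatorname{fin}({}^*X) \neq \operatorname{pns}({}^*X)$, then
Exercise 75 implies that we find some $x \in \operatorname{fin}({}^*X) \setminus \operatorname{pns}({}^*X)$. Then $[x] \in \breve{X}$, but
we have $[x] \notin \overline{X}$, since otherwise $[x] = [y]$ for some $y \in \operatorname{pns}({}^*X)$. But this
implies the contradiction $x \in \operatorname{pns}({}^*X)$: Indeed, given $U \in \mathcal{U}$, choose some $V \in \mathcal{U}$
with $V^2 \subseteq U$. Since $y \in \operatorname{pns}({}^*X)$, we find some $z \in X$ with $({}^*z, y) \in {}^*V$. Since
$(y, x) \in {}^*V$ (because $x \approx_{\mathcal U} y$), we have $({}^*z, x) \in {}^*V^2 \subseteq {}^*U$. Hence, $x \in \operatorname{pns}({}^*X)$,
as claimed.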
by Theorem 13.22. It follows that x ∈ pns(∗ X), and so we have proved fin(∗ X) ⊆
pns(∗ X). Exercise 75 now implies fin(∗ X) = pns(∗ X), and by Theorem 14.15, we
find that X̆ is the closure of X in X (which is X, since X is complete).
In particular, we have for $X = \mathbb{R}^n$ that $\operatorname{fin}({}^*\mathbb{R}^n)/\operatorname{inf}({}^*\mathbb{R}^n) = \breve{\mathbb{R}}^n \cong \mathbb{R}^n$. For
$n = 1$, this is a new proof of (a part of) Theorem 5.21. The previous results imply:
Corollary 14.18. If X is normed, then X̆ is the closure of X in X̆ if and only if
X has finite dimension.
The reader who is interested in deeper results on the nonstandard theory
of topological vector spaces and normed spaces is referred to the papers [HM72,
HM74, HM83] and also to [Lux69a, Lux92].
Chapter 7
Miscellaneous
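\[
  \mu^*\Bigl(\bigcup_{n=1}^{\infty} E_n\Bigr)
  \le \sum_{n,k=1}^{\infty} \mu(A_{n,k})
  \le \sum_{n=1}^{\infty} \bigl(\mu^*(E_n) + 2^{-n}\varepsilon\bigr)
  = \varepsilon + \sum_{n=1}^{\infty} \mu^*(E_n).
\]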
µ∗ (D) ≥ µ∗ (D ∩ E) + µ∗ (D \ E) (D ⊆ S0 ). (15.1)
µ∗ (D ∩ (E ∪ F )) ≥ µ∗ ((D ∩ (E ∪ F )) ∩ E) + µ∗ ((D ∩ (E ∪ F )) \ E)
= µ∗ (D ∩ E) + µ∗ (D ∩ F ).
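\[
  \mu^*(D)
  \ge \sum_{n=1}^{N} \mu^*(D \cap E_n) + \mu^*\Bigl(D \setminus \bigcup_{n=1}^{N} E_n\Bigr)
  \ge \sum_{n=1}^{N} \mu^*(D \cap E_n) + \mu^*\Bigl(D \setminus \bigcup_{n=1}^{\infty} E_n\Bigr).
\]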
§15 Loeb Measures 199
µ∗ (E \ Dn ) ≤ ε/2.
µ∗ (E∆Dn ) ≤ µ∗ (E \ Dn ) + µ∗ (Dn \ E)
≤ ε/2 + µ∗ (A \ E) = ε/2 + (µ∗ (A) − µ∗ (E)) ≤ ε.
We point out that the previous results hold true if Σ is not necessarily an
algebra: It suffices that Σ is a so-called semi-ring. For generalizations in this
direction, we refer the reader to [Zaa67].
So far, we have only recalled some results of the standard world. The inter-
esting point in connection with nonstandard analysis is that any additive measure
µ is automatically σ-additive, if it is internal:
Through the rest of this section, we assume that $S_0 \in {}^*S$ is a nonstandard
entity, and that $[0, \infty] \in S$ is an entity. Let $\Sigma$ be an algebra over $S_0$. Then
an internal function $\mu : \Sigma \to {}^*[0, \infty]$ is called an internal additive measure, if
µ(A ∪ B) = µ(A) + µ(B) for all A, B ∈ Σ with A ∩ B = ∅. Note that Σ is internal
by Theorem 3.19, and so any A ∈ Σ is internal, too (Proposition 3.16).
Example 15.4. If µ is an additive measure in the standard world, then ∗ µ is an
internal measure. This may be straightforwardly verified by the transfer principle.
Exercise 78. (Cumbersome). With Σ and µ as above, prove that for any ∗ -finite
sequence A1, . . . , Ah ∈ Σ (h ∈ ∗ℕ) the relation A1 ∪ · · · ∪ Ah ∈ Σ holds. Moreover,
if the sets Ak are pairwise disjoint, prove also
Proposition 15.5. If ∗ is ℕ-saturated then, for any sequence An ⊆ S0 of pairwise
disjoint nonempty internal sets, the union ⋃ An is not internal. In particular, with
the above notation:
1. Σ is either finite or fails to be a σ-algebra.
2. µ and µ0 are σ-additive.
Proof. If A := ⋃ An were internal, then each of the sets Bn := A \ (A1 ∪ · · · ∪ An)
would be internal by Theorem 3.19. Then B := {Bn : n ∈ ℕ} is a countable family
of nonempty internal sets with the finite intersection property, and so ⋂B ≠ ∅.
This contradicts the definition of A.
If Σ is infinite, we find a sequence Bn ∈ Σ of nonempty internal sets. Putting
An := Bn \ ⋃_{k<n} Bk, we have An ∈ Σ but not ⋃ An ∈ Σ. Hence, Σ is not a
σ-algebra.
Finally, if An ∈ Σ are pairwise disjoint with ⋃ An ∈ Σ, then all except finitely
many An must be empty. Hence, the σ-additivity of µ and µ0 follows from their
additivity.
Theorem 15.6. Let ∗ be ℕ-saturated. Let E be measurable with respect to the
Loeb measure µL and such that µL (E) < ∞. Then there is some D ∈ Σ with
µL (D∆E) = 0.
Proof. By Theorem 15.2, we find for any n some set Dn ∈ Σ with µL(Dn∆E) <
n^{−1}. We may conclude that
ϕ := {(x, n, y) ∈ (Σ × σℕ) × Σ | µ(x∆y) ≤ 3/n}
For the converse, recall that the Loeb measurable sets constitute a σ-algebra.
Hence, if D is the countable union of sets from Σ, then D is Loeb measurable.
Now, if E ⊆ S0 satisfies µ∗0 (E∆D) = 0, put F := E∆D. For any C ⊆ S0 , we have
i.e. (15.1) holds for µ0 which by definition means that F is Loeb measurable.
Hence, also E = D∆F is Loeb measurable.
Also the following property does not hold for general Carathéodory extensions.
Recall that any nonstandard ultrapower model is ℕ-saturated and comprehensive.
Theorem 15.8. Let ∗ be ℕ-saturated and comprehensive. Let E ⊆ S0 be contained
in a set of finite Loeb measure. Then E is Loeb measurable if and only if for each
ε > 0 there are sets C, D ∈ Σ such that C ⊆ E ⊆ D and µL (D \ C) < ε.
Proof. First, assume that E is Loeb measurable. We find by definition of µ∗0 a
sequence Dn ∈ Σ with E ⊆ ⋃ Dn and
µL(E) = µ∗0(E) ≥ ∑_{n=1}^∞ µ0(Dn) − ε.
It is no loss of generality to assume that the sets Dn are pairwise disjoint: Otherwise
replace Dn by Dn \ ⋃_{k<n} Dk.
Define f : σℕ → Σ by f(∗n) = Dn. Since ∗ is comprehensive, we find an
extension of f to an internal mapping F : ∗ℕ → Σ. Let
M := {n ∈ ∗ℕ | (∃k ∈ ∗ℕ : (k ≤ n ∧ F(n + 1) ∩ F(k) ≠ ∅)) ∨ ∗∑_{k=1}^{n+1} µ(F(k)) > (µL(E) + ε)}.
§ 16 Distributions
In this section, we show how nonstandard analysis can be used to describe distri-
butions. We only sketch some ideas and leave the details to the reader.
Throughout this section, we assume that ℝ ∈ S is an entity.
The idea of distributions goes back to Dirac’s δ-function. The physical idea
was that this is a function δ with the property that δ(x) = 0 for all x ≠ 0 but
satisfying
∫_ℝ δ(x) dx = 1.
Of course, no function δ with this property exists. Nevertheless, formal
calculation with the δ-function in physics was successful, in particular due to the
essential property
∫_ℝ δ(x)f(x) dx = f(0).
An appropriate mathematical framework for the treatment of the δ-function is the
following:
The closure of the set {x ∈ ℝ : f(x) ≠ 0} is called the support of f. One
identifies a locally integrable function ϕ with the linear functional
Fϕ(f) = ∫_ℝ ϕ(x)f(x) dx
defined on e.g. the system of all smooth functions f with compact support. Then
the “δ-function” can be identified with the linear functional
Fδ (f ) = f (0).
It follows from the transfer principle that the internal integral is linear in the
strong sense that
∫ (λf(x) + µg(x)) dx = λ ∫ f(x) dx + µ ∫ g(x) dx
Theorem 16.2. If ∗ is a P(ℕ)-enlargement, then any linear functional F on a
linear subspace U ⊆ C0 can be written in the form
∗(F(f)) = ∫ ϕ(x) ∗f(x) dx (f ∈ U) (16.2)
with some ϕ ∈ L ∩ ∗C0^∞. In particular, (16.1) holds.
Proof. Using standard linear algebra (and the axiom of choice) one can extend
F to a linear (not necessarily bounded) functional on C0 . Hence, without loss of
generality, we can assume U = C0 .
The essential step is to prove that the binary relation
ψ := {(x, y) ∈ C0 × C0^∞ | ∫ y(t)x(t) dt = F(x)}
Then ϕ := F (f1 )ϕ1 + · · · + F (fn )ϕn has the required properties. Assume that the
claim has already been proved for n − 1. Under this assumption, we will construct
for any given linearly independent f1 , . . . , fn ∈ C0 and any given k ∈ {1, . . . , n} a
function ϕk ∈ C0∞ which satisfies (16.4) for j = 1, . . . , k. Then the induction step
is complete. Renumbering the functions fj if necessary, it suffices to describe this
construction for the case k = n.
By induction hypothesis, we find functions ψ1, . . . , ψn−1 ∈ C0^∞ with
∫ ψk(t)fj(t) dt = 1 if k = j and j = 1, . . . , n − 1,
∫ ψk(t)fj(t) dt = 0 if k ≠ j and k, j = 1, . . . , n − 1.
as desired.
Note that to determine a function from C0, it suffices to determine it for rational
arguments. Hence, C0 has the cardinality of ℝ^ℚ which in turn has the cardinality
|ℝ^ℕ| = |(2^ℕ)^ℕ| = |2^{ℕ×ℕ}| = |2^ℕ| = |P(ℕ)|. Since ∗ is a P(ℕ)-enlargement,
we have that ∗ψ is satisfied on σC0, i.e. we find some ϕ ∈ ∗C0^∞ with (∗f, ϕ) ∈ ∗ψ
for any f ∈ C0. By the standard definition principle, this means
∫ ϕ(t) ∗f(t) dt = ∗F(∗f) = ∗(F(f)) (f ∈ C0).
Hence (16.2) holds, and since ∗(F(f)) is always finite, this implies also ϕ ∈ L.
Considering the functional F(f) := f(0), we thus find indeed a function
δ ∈ L ∩ ∗C0^∞ satisfying
∫ δ(x) ∗f(x) dx = ∗f(0) (f ∈ C0).
Theorem A.1. Let ∗ : S → ∗S be elementary. Let Sn and Tn denote the level sets
of the superstructure S and ∗S, respectively, i.e. Sn is as in Section 2.1. Then
∗Sn = {x ∈ Tn : x is internal} (n = 0, 1, 2, . . .). (A.1)
and since all elements of ∗S_{n+1} are internal, we thus have even
∗S_{n+1} ⊆ {x ∈ T_{n+1} : x is internal}.
For the converse inclusion, observe that if A ∈ Tn+1 \T0 is internal, then each x ∈ A
is internal, because I is transitive; moreover, since A ∈ Tn+1 = T0 ∪ P(Tn ), we
must have x ∈ Tn which by induction assumption implies x ∈ ∗ S n . Consequently,
However, since we work with a set theory with atoms, some care is needed:
Proposition A.2. Let Sn and Tn be as in Theorem A.1.
Given an entity A ∈ S, let An be the collection of all elements x ∈ A
which are of type n, i.e. x ∈ Sn \ Sn−1. Then
∗An = {x ∈ ∗A : x ∈ Tn \ Tn−1},
i.e. ∗ An contains all elements from ∗ A which are of type n. Here, we put S−1 =
T−1 = ∅.
Proof. We have
An = {x ∈ A : x ∈ Sn \ Sn−1 }.
The standard definition principle implies
∗An = {x ∈ ∗A : x ∈ ∗Sn \ ∗Sn−1}.
Since Theorem A.1 implies that ∗ S n \ ∗ S n−1 contains all internal elements of
Tn \ Tn−1 , the statement follows.
Corollary A.3. Let x ∈ ∗ S be internal, and of type n. Then there is some entity
A ∈ S with x ∈ ∗ A such that all elements of A are of type n.
In particular, any internal entity is contained in a set ∗ A where A is a set
consisting of entities.
Proof. Since x is internal, there is an entity B ∈ S with x ∈ ∗ B. Let A be the
collection of all elements y ∈ B of type n. By Proposition A.2, we have x ∈ ∗A.
Theorem A.4. Let A ∈ S be an entity. Let A0 be the collection of all elements of
A which are entities. Then
∗(⋃A0) = ⋃{A : A ∈ ∗A is an entity}
and
∗(⋂A0) = ⋂{A : A ∈ ∗A is an entity}.
Proof. Since entities are the elements of at least type 1, Proposition A.2 implies
that ∗ A 0 = {A ∈ ∗ A : A entity}. Hence, it is no loss of generality to assume that
A = A0 .
Put U := ⋃A. The transitively bounded sentence
∀x ∈ U : ∃y ∈ A : x ∈ y
∀x ∈ ∗U : ∃y ∈ ∗A : x ∈ y,
i.e. ∗U ⊆ ⋃∗A. Conversely, the transfer of
∀x ∈ A : ∀y ∈ x : y ∈ U
implies analogously ⋃∗A ⊆ ∗U. This proves the first equality. For the second
equality, put D := ⋂A, and observe that
D = {x ∈ U | ∀y ∈ A : x ∈ y}.
and
∗{A \ B : A ∈ A, B ∈ B} = {A \ B : A ∈ ∗A, B ∈ ∗B}.
Proof. Put C := {A ∪ B : A ∈ A, B ∈ B}, and U := ⋃(A ∪ B). Then
C = {z ∈ U | ∃x ∈ A , y ∈ B : z = x ∪ y}.
Exercise 81. Show that for each internal entity X the system of all internal subsets
of X is an internal entity.
Exercise 82. Show that for each pair of internal entities A, B the system F of all
internal functions f : A → B is an internal entity. Is also the system F consisting
of all internal functions defined on subsets of A with values in B an internal entity?
We obtain similarly also results for sets of higher type:
Theorem A.8. Let A ∈ S be a system of entities. Then
∗
{P(A) : A ∈ A } = {PA : A ∈ ∗ A },
The following solutions to the exercises are not complete, but all important ideas
are sketched.
Exercise 1: X is not Dedekind complete: Let f(t) := √t, and A ⊆ X consist of
all x ∈ X such that there is some t0 > 0 with x(t) ≤ f(t) for t > t0. Then A
is bounded (e.g. by x(t) := t), but A has no maximal element. To see this, we
will prove that any x ∈ A satisfies S := lim sup_{t→∞} x(t) < ∞. Then the constant
function y(t) := S + 1 belongs to A and is strictly larger than x.
We have for sufficiently large t that x(t²) ≤ f(t²) = t, and so the rational
function y(t) = x(t²)/t satisfies lim sup_{t→∞} y(t) ≤ 1. Since y(t) = p(t)/q(t) with
polynomials p, q this means that either y(t) → −∞ or that the degree deg p of
the polynomial p is at most as large as the degree deg q of q. In the first case,
x(t) ≤ 0 for sufficiently large t. In the second case, observe that deg p is even and
deg q is odd by the definition of y. Hence, deg p ≤ deg q − 1 which implies that
x(t²) = ty(t) converges as t → ∞. Hence, lim sup_{t→∞} x(t) < ∞ in both cases, as
claimed.
X is not Archimedean: The function x(t) := t belongs to X and satisfies
x ≥ n for any n ∈ ℕ. ⋆
Exercise 2: The set is not a field, since x0,1 has no inverse with respect to mul-
tiplication. It is also not Dedekind complete, since the set {x0,b : b > 0} has no
least upper bound. X is Archimedean, for if xa,b ∈ X, then we find some n ∈ ℕ
with n > a, and so xn,0 ∈ X satisfies xn,0 > xa,b . ⋆
Exercise 3: It is not totally ordered, since e.g. the equivalence class of the sequence
x : n ↦ (−1)^n cannot be compared with 0. It is not a field, because there is some
[x] ∈ X such that [x] ≠ 0 and x contains infinitely many 0’s (e.g. x : n ↦
218 Appendix B. Solutions to the Exercises
1 + (−1)^n); then [x] is not invertible. X is not Dedekind complete: Let A be the
set of all equivalence classes of sequences converging to 0. Then A is bounded from
above (e.g. by 1). However, A has no least upper bound: If [x] ∈ X were such a
bound, then [x] > 0 (because the equivalence class of n ↦ n^{−1} belongs to X).
Note that [y] ∈ A implies λ[y] ∈ A for any λ ∈ ℝ. Hence, in the case [x] ∈ A, we
find 2[x] ∈ A, and so [x] is no upper bound. But if [x] ∉ A, then [x]/2 provides a
strictly smaller upper bound for A. In both cases, we found a contradiction to the
fact that [x] is the smallest upper bound. X fails to be Archimedean, since for the
sequence x : n ↦ n, we even have [x] > n for any n. ⋆
Since σB ⊆ σA and c ∉ σA, this implies B = ∅, and Lemma 3.5 then gives the
contradiction {c} = ∗B = ∗∅ = ∅. ⋆
Exercise 6: By Proposition 3.16 and Lemma 3.14, we find some index k with
x1, . . . , xn ∈ ∗Sk. Then
{x1, . . . , xn} = {x ∈ ∗Sk : x = x1 ∨ · · · ∨ x = xn}
and
(x1 , . . . , xn ) = {(x1 , . . . , xn ) ∈ ∗ S k × · · · × ∗ S k : x1 = x1 ∧ · · · ∧ xn = xn }
are internal by the internal definition principle resp. by the internal definition
principle for relations. If B is an internal entity, and A ⊆ B is external, then A
must be infinite, since otherwise A = {x1 , . . . , xn } with xi ∈ B would be internal
by what we just proved (and since each xi is internal, because I is transitive). ⋆
Exercise 7: We have
f |A0 = {(x, y) ∈ f : x ∈ A0 }
Exercise 9: The answer is negative: Let A be some external set. Then the set
A = {A} cannot be a subset of some internal set B, since then we would have
A ∈ B, and so A is internal because I is transitive (Proposition 3.16). ⋆
Exercise 10: If 𝒰 has the above form, then j0 ∈ ⋂𝒰, and so 𝒰 is not free.
Conversely, if 𝒰 is not free, then there is some j0 ∈ J with j0 ∈ ⋂𝒰. For any
U ⊆ J precisely one of the sets U and J \ U belongs to 𝒰, because 𝒰 is an
ultrafilter. If j0 ∉ U, then U cannot belong to 𝒰 by our choice of j0. Conversely,
if j0 ∈ U, then j0 ∉ J \ U, and so J \ U cannot belong to 𝒰; but this implies
U ∈ 𝒰. Hence, 𝒰 contains precisely those sets U ⊆ J with j0 ∈ U. ⋆
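The characterization in Exercise 10 can be checked by brute force on a small index set. The following is only an illustrative sketch, assuming J = {0, 1, 2}; the helper names (`is_ultrafilter`, `all_ultrafilters`, `principal`) are hypothetical and not from the text. On a finite J every ultrafilter turns out to be of the form {U ⊆ J : j0 ∈ U}.

```python
from itertools import combinations

# Assumption for illustration: a three-element index set J.
J = frozenset({0, 1, 2})
# All subsets of J, as frozensets.
SUBSETS = [frozenset(c) for r in range(len(J) + 1)
           for c in combinations(sorted(J), r)]


def is_ultrafilter(family):
    """Check the ultrafilter axioms for a family of subsets of J."""
    fam = set(family)
    if not fam or frozenset() in fam:
        return False
    for a in fam:
        # closed under pairwise intersections
        if any((a & b) not in fam for b in fam):
            return False
        # closed under supersets
        if any(a <= s and s not in fam for s in SUBSETS):
            return False
    # precisely one of U and J \ U belongs to the family
    return all((s in fam) != ((J - s) in fam) for s in SUBSETS)


def all_ultrafilters():
    """Brute-force search over all 2^8 families of subsets of J."""
    found = []
    for mask in range(1 << len(SUBSETS)):
        family = frozenset(s for i, s in enumerate(SUBSETS) if (mask >> i) & 1)
        if is_ultrafilter(family):
            found.append(family)
    return found


def principal(j0):
    """The family of all subsets of J that contain j0."""
    return frozenset(s for s in SUBSETS if j0 in s)


ULTRAFILTERS = all_ultrafilters()
```

The search only covers a finite J, where no free ultrafilter exists; it says nothing about the infinite case treated in Exercise 11.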
Exercise 11: If 𝒰 contains the filter of Example 4.2, then we have for any j0 ∈ J
that J \ {j0} ∈ 𝒰; hence 𝒰 is free. Conversely, assume that 𝒰 is free but that
J \ J0 ∉ 𝒰 for some finite set J0 ⊆ J. Since 𝒰 is an ultrafilter, we must have
J0 ∈ 𝒰. For each j ∈ J0 there is some Uj ∈ 𝒰 such that j ∉ Uj (otherwise 𝒰
would not be free). The finite intersection J0 ∩ ⋂_{j∈J0} Uj belongs to 𝒰, because
𝒰 is a filter. But this intersection is empty, a contradiction. ⋆
Exercise 12: Note first that the relation [f] ∉ I0 means that f(j) ∈ S0 does
not hold almost everywhere. Since 𝒰 is an ultrafilter, this means that f(j) ∈ S0 holds
almost nowhere. By choosing an appropriate representative f, we may thus assume
that f(j) ∉ S0 for all j. Similarly, we may assume that g(j) ∉ S0 for all j.
Assume now that [f] ≠ [g], i.e. f(j) = g(j) does not hold almost everywhere.
Since 𝒰 is an ultrafilter, we have f(j) = g(j) almost nowhere, i.e. f(j) ≠ g(j) for
all j ∈ U where U ∈ 𝒰. Let U1 denote the set of all j ∈ U for which f(j) \ g(j) ≠ ∅,
and U2 := U \ U1. For all j ∈ U2, we have g(j) \ f(j) ≠ ∅. Since 𝒰 is an
ultrafilter, one of the sets U1 or U2 must belong to 𝒰. Without loss of generality,
we assume that U1 ∈ 𝒰. We find a function h : J → S such that
h(j) ∈ f(j) \ g(j) for all j ∈ U1. Then [h] ∈𝒰 [f] but not [h] ∈𝒰 [g], a contradiction to (4.3). ⋆
Exercise 13: The condition is fx(j) = fy(j) for all except at most finitely many
j, i.e. the set D = {j : fx(j) = fy(j)} has a finite complement.
Indeed, if J \ D is finite, then we have D ∈ 𝒰 for any δ-incomplete ultrafilter 𝒰
(because 𝒰 is free by Corollary 4.13, and so D ∈ 𝒰 by Exercise 11). Hence,
fx(j) = fy(j) holds for almost all j.
Conversely, if J \ D is infinite, we “construct” a δ-incomplete ultrafilter 𝒰
with D ∉ 𝒰 as follows: Let ℱ0 denote the filter of Example 4.2, i.e. F ∈ ℱ0
if and only if J \ F is finite. The system ℬ = ℱ0 ∪ {J \ D} has the finite
intersection property, since for any F ∈ ℱ0 the set F \ D is infinite (and in
particular nonempty). Hence, ℬ generates a filter ℱ. Let 𝒰 be an ultrafilter
containing ℱ (Theorem 4.9). Exercise 11 implies that 𝒰 is free. Hence, 𝒰 is
δ-incomplete by Proposition 4.11. Since J \ D ∈ ℬ ⊆ 𝒰, we have D ∉ 𝒰, as
claimed. Thus, fx(j) ≠ fy(j) for almost all j, and so x ≠ y. ⋆
Exercise 14: 1. Let A = {a1 , . . . , an }. By Proposition 4.19, we have ∗ A = ϕ([F ])
where F : J → S is the constant function F (j) := A. By Proposition 4.19 and
∀n ∈ ∗ℕ : ∃x ∈ ∗ℚ : |x² − 2| ≤ n^{−1}.
Hence, for each x ∈ ∗ℝ and some infinitesimal y > 0 (even for all), we find some
z ∈ ∗ℚ with |x − z| < y, i.e. x ≈ z. ⋆
Exercise 18: If this set (denote it by Mx) were internal, then inf(∗ℝ) = Mx − x
would be internal by the internal definition principle. ⋆
Exercise 19: The statement follows by transfer of
M = {j : f (j) is even}
or its complement belongs to U . The first case occurs if and only if f may for
almost all j be written in the form f(j) = 2g(j) where g : J → ℕ, and the
second case occurs if and only if f may for almost all j be written in the form
f(j) = 2g(j) − 1 where g : J → ℕ. Thus, the statement follows in view of
Example 5.6. ⋆
Exercise 20: We have lim_{j→j0} f(j) = x if and only if for any open neighborhood U
of x we find some open neighborhood JU of j0 with f(j) ∈ U for j ∈ JU \ {j0}. The
latter means f^{−1}(U) = {j : f(j) ∈ U} ∈ F which by Lemma 5.27 is equivalent to
U ∈ f(F). Hence, lim_{j→j0} f(j) = x if and only if any open neighborhood U of x
is contained in f(F), i.e. if and only if lim_{j→F} f(j) = x. ⋆
Exercise 21: 1. Let β(x) be the internal formula (x ≤ h0 =⇒ α(x)). Then β(h)
holds for all infinite h ∈ ∗ℕ∞. By the permanence principle, we thus find some
n0 ∈ σℕ such that β(n) holds for all n ∈ ∗ℕ with n0 ≤ n ≤ h. In particular, β(n)
and thus α(n) holds for any finite n ∈ σℕ with n ≥ n0.
2. The proof is analogously reduced to the permanence principle for ℝ: Let β(x)
denote the internal formula (x > c =⇒ α(x)). Then β(d) holds for all infinitesimals
d ∈ inf(∗ℝ), d > 0. By the permanence principle for ℝ, we thus find some
ε0 ∈ σℝ+ such that β(ε) holds for all standard or nonstandard ε ∈ ∗ℝ with
0 < ε ≤ ε0. In particular, α(ε) holds for all ε ∈ ∗ℝ with c < ε ≤ ε0. ⋆
Exercise 22: Let α(n) be the internal formula |xn| ≤ n^{−1}. Then α(n) holds for all
n ∈ σℕ. By the permanence principle, there is some h ∈ ℕ∞ such that α(n) holds
for all n ∈ ∗ℕ with n ≤ h. In particular, |xn| ≤ n^{−1} for all n ∈ ℕ∞ with n ≤ h
which implies that xn ≈ 0 for those n. ⋆
Exercise 23: The assumptions imply in both cases in view of Theorem 3.19 that
A and B are internal, i.e. there are sets C, D ∈ S with A ∈ ∗ C and B ∈ ∗ D. In
view of Corollary A.3, we may assume that all elements of C and D are entities.
Let C0 ⊆ C and D0 ⊆ D be the subsets of entities of C resp. D, and let U := ⋃C0
and V := ⋃D0. By Theorem 2.1, we have U, V ∈ S. Let F denote the system of
all functions f with dom(f) ⊆ U and rng(f) ⊆ V and recall (Exercise 83) that
∗F consists of all internal functions f with dom(f) ⊆ ∗U and rng(f) ⊆ ∗V.
1. By the hint, the sentence
∀x ∈ F : ∃y ∈ F : dom(y) = rng(x) ∧ rng(y) = dom(y) ∧ “y is one-to-one”
is true. Here, the shortcuts rng and dom use quantifiers over U and V . The ∗-trans-
form implies the statement for the choice x = f and g = y.
2. By the Schröder-Bernstein theorem, the sentence
∀x1 , x2 ∈ F : ((“x1 , x2 one-to-one” ∧ rng(x1 ) ⊆ dom(x2 ) ∧ rng(x2 ) ⊆ dom(x1 ))
=⇒ ∃y ∈ F : (“y one-to-one” ∧ dom(y) = dom(x1 ) ∧ rng(y) = dom(x2 )))
is true (as before, rng and dom use quantifiers over U and V ). The ∗-transform
implies the statement for the choice x1 = f1 , x2 = f2 , g = y. ⋆
Exercise 24: The sets I := A∩B, and A0 := A\B are internal subsets of the ∗ -finite
set A and thus ∗ -finite by Theorem 6.13. Moreover, A = A0 ∪I and A∪B = A0 ∪B
where the sets on the right-hand side are disjoint. Hence, Theorem 6.14 implies
#A = #A0 + #I and #(A ∪ B) = #A0 + #B. Combining these two equations, the
statement follows. ⋆
Exercise 25: We have
X^{<ℕ} = {x ∈ P(ℕ × X) | ∃n ∈ ℕ : (x : {1, . . . , n} → X ∧ dom(x) ⊆ {1, . . . , n})}
where k ∉ dom(x) can be formulated in a bounded form, if we use that x ⊆ ℕ × X.
Note now that ∗P(ℕ × X) consists of all internal subsets of ∗(ℕ × X) = ∗ℕ × ∗X.
Exercise 26: There is some n such that R and the order relation on R both belong
to ∗ S n . By Proposition 3.16, there is some n such that R ∈ ∗ S n . Consider the
sentence
where the order y is used to define max{u}. Note that in order to formalize α,
it suffices to quantify over Sn and ℕ. The ∗-transform of α then becomes “y is
a total order on x, and the elements of z are ∗ -finite subsets of x”. Hence the
∗-transform of the above true sentence implies the statement. ⋆
Exercise 27: Put U := ⋃A. Then ∗U = ⋃∗A by Theorem A.4. Using the notation
of Exercise 25, we have
∀x ∈ ∗A : (∗c(x) = ∞ ∨̇ ∃y ∈ ∗U^{<∗ℕ} :
(“y is a bijection onto x” ∧ ∗c(x) = #_{∗U}(y))).
This implies that for any x = B ∈ ∗A one of the following alternatives holds:
Either ∗c(B) = ∞, or ∗c(x) = h where y : {1, . . . , h} → B is an internal bijection.
But this means ∗c(B) = #B. For the second statement, note that A is finite if and
only if #∗A = ∗c(∗A) = ∗(c(A)) ≠ ∞. ⋆
Exercise 28: If the sequence xn is bounded, all ∗xh with h ∈ ℕ∞ are finite by
Theorem 7.2. Corollary 7.3 thus shows that the set of accumulation points is
given by {st(∗xh) : h ∈ ℕ∞}. The supremum and infimum of this set are lim sup xn
and lim inf xn, respectively. If the sequence is unbounded, ∗xh is infinite for some
h ∈ ℕ∞, and so st(∗xh) is not defined. However, if one chooses the natural notation
st(∗ xh ) = ±∞ if ∗ xh is infinite and ±∗ xh > 0, then the formula still holds: Indeed,
±∞ is an accumulation point of xn if and only if st(∗ xn ) = ±∞ for some n: For
positive sign, the proof is analogous to Theorem 7.2 (just drop the absolute values
in the proof). For negative sign, the proof is similar or may be reduced to the case
of positive sign by considering the sequence −xn . ⋆
holds for all y ∈ ℕ∞. By the permanence principle, it also holds for some y ∈ σℕ,
i.e. for some y = ∗n0 with n0 ∈ ℕ. The converse direction of the transfer principle
implies that xn is a Cauchy sequence. ⋆
Exercise 31: If x is an interior point of A, we find some ε ∈ ℝ+ such that
∀y ∈ ℝ : (|x − y| < ε =⇒ y ∈ A).
The transfer principle implies that ∗A contains all points y ∈ ∗ℝ which satisfy
|x − y| < ∗ε, in particular all points with y ≈ x. Conversely, if mon(x) ⊆ ∗A, then
the internal predicate
∀y ∈ ∗ℝ : (|x − y| < ε =⇒ y ∈ ∗A)
holds for any ε ∈ inf(∗ℝ), ε > 0. By the Cauchy principle (permanence principle),
the predicate holds also for some standard ε = ∗ε0 with ε0 ∈ ℝ+. The inverse form of
the transfer principle implies that A contains all y ∈ ℝ with |x − y| < ε0, i.e. x is
an interior point of A. ⋆
Exercise 32: Evidently, A = ∅ and A = ℝ have this property. One might suspect
that all open sets have this property, but actually A = ∅ and A = ℝ are the only
sets: Assume that A has this property. Then the internal formula
∀x ∈ ∗A, y ∈ ∗ℝ : (|x − y| ≤ z =⇒ y ∈ ∗A)
holds for any z ∈ inf(∗ℝ), z > 0. By the Cauchy principle (permanence principle),
this formula also holds for some z = ∗ε with ε ∈ ℝ+. The inverse direction of the
transfer principle implies
∀x ∈ A, y ∈ ℝ : (|x − y| ≤ ε =⇒ y ∈ A).
Exercise 37: Assume that c := f′(0) exists. Then we have for any 0 ≠ x ≈ 0 that
|x| /x ≈ ∗ c, i.e. ∗ c ≈ 1 (for x > 0) and simultaneously ∗ c ≈ −1 (for x < 0), a
contradiction. ⋆
Exercise 38: Put F(x) := f(x)g(x). Given h ≈ 0, put dx := h, df :=
∗f(∗x0 + dx) − ∗f(∗x0), dg := ∗g(∗x0 + dx) − ∗g(∗x0), and dF :=
∗F(∗x0 + h) − ∗F(∗x0). Then
and so
dF/dx = (df/dx) ∗g(∗x0 + dx) + ∗f(∗x0) (dg/dx) ≈ ∗(f′(x0)g(x0) + f(x0)g′(x0)),
where we used Proposition 5.17 and the fact that ∗ g(∗ x0 + dx) ≈ ∗ g(∗ x0 ) which
in turn follows from dg = g ′ (x0 )dx ≈ 0. ⋆
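The decomposition of dF used in Exercise 38 is a purely algebraic identity, so it can be checked with exact rational arithmetic for small standard increments. The sketch below is an illustration only: the polynomials f and g, the point x0 and the increments are arbitrary choices made here, and a small rational dx merely stands in for an infinitesimal.

```python
from fractions import Fraction


def f(x):
    # Sample polynomial (an arbitrary choice for illustration).
    return x**3 - 2 * x


def g(x):
    # Second sample polynomial (also an arbitrary choice).
    return 5 * x**2 + 1


def F(x):
    return f(x) * g(x)


def increments(x0, dx):
    """Return (df, dg, dF) for the increment dx at the point x0."""
    df = f(x0 + dx) - f(x0)
    dg = g(x0 + dx) - g(x0)
    dF = F(x0 + dx) - F(x0)
    return df, dg, dF


def difference_quotient_gap(x0, dx):
    """|dF/dx - (f'(x0) g(x0) + f(x0) g'(x0))| for the sample polynomials."""
    _, _, dF = increments(x0, dx)
    f_prime = 3 * x0**2 - 2   # derivative of the sample f
    g_prime = 10 * x0         # derivative of the sample g
    return abs(dF / dx - (f_prime * g(x0) + f(x0) * g_prime))


x0 = Fraction(3, 2)
df, dg, dF = increments(x0, Fraction(1, 1000))
# The identity dF = df * g(x0 + dx) + f(x0) * dg holds exactly.
identity_holds = dF == df * g(x0 + Fraction(1, 1000)) + f(x0) * dg
```

Shrinking dx makes `difference_quotient_gap` smaller, which mirrors the relation dF/dx ≈ f′(x0)g(x0) + f(x0)g′(x0) for infinitesimal dx.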
Exercise 39: The statement follows from transfer of
∀x, y ∈ [a, b] : (x < y =⇒ ∃z ∈ [a, b] : x < z < y ∧ (f(x) − f(y))/(x − y) = f′(z))
which is true by the classical mean value theorem. ⋆
Exercise 41: By Proposition 4.11, U is δ-incomplete over J := ℕ. Hence, the
corresponding ultrafilter model of Theorem 4.20 provides a nonstandard map ∗.
An element h ∈ ℕ∞ in this model is given by h := ϕ([h0]) where h0(n) := n. Note
that g(n, x) := |[2^n x] − 2[2^{n−1} x]| defines a relation on ℕ × ℝ × ℝ. By Example 4.22,
we have for x ∈ ℝ and y ∈ ∗ℝ, y = ϕ([fy]), fy : J → ℝ that
(h, ∗ x, y) ∈ ∗ g ⇐⇒ (h0 (j), x, fy (j)) ∈ g for almost all j,
i.e. ∗ g(h, x) = ϕ([F ]) where F (j) := g(h0 (j), x), i.e. F (n) = g(n, x). By Theo-
rem 5.32, we thus have
where f is the function from Theorem 7.28. Since this function attains only the
values 0 and 1 and is nonmeasurable on [0, 1], the statement follows. ⋆
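For standard arguments, the expression |[2^n x] − 2[2^{n−1} x]| from Exercise 41 picks out the n-th binary digit of x, which is why it only takes the values 0 and 1. The following sketch assumes that [·] denotes the integer part (floor) and uses exact fractions; the helper name `binary_digits` is introduced here only for illustration.

```python
from fractions import Fraction
from math import floor


def g(n, x):
    """|[2^n x] - 2 [2^(n-1) x]|, reading [.] as the floor function (assumption)."""
    return abs(floor(2**n * x) - 2 * floor(2**(n - 1) * x))


def binary_digits(x, count):
    """The values g(1, x), ..., g(count, x)."""
    return [g(n, x) for n in range(1, count + 1)]


# 5/8 = 0.101 in binary, 1/3 = 0.010101... in binary.
digits_five_eighths = binary_digits(Fraction(5, 8), 4)
digits_one_third = binary_digits(Fraction(1, 3), 4)
```

Exact fractions are used instead of floats so that the floor operations are not affected by rounding.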
Exercise 42: Let A be a nonempty system of entities A ∈ S which has the finite
intersection property and at most the cardinality of κ. By Lemma 8.7, we may
assume that A is an entity. Put U := ⋃A, and consider the relation
ϕ := {(x, y) ∈ A × U | y ∈ x},
i.e. (A, y) ∈ ϕ if and only if y ∈ A. Note that ϕ ∈ S. Since A has the finite
intersection property, ϕ is concurrent on A, and so there is some b which satisfies
∗ϕ on σA. By the standard definition principle for relations, we have
∗ϕ = {(x, y) ∈ ∗A × ∗U | y ∈ x}.
Since b satisfies ∗ϕ on σA, we have b ∈ ⋂σA. ⋆
Exercise 43: Since S is infinite, we may assume that ℕ ∈ S is an entity. By
Proposition 4.19, we have x ∈ ∗ℕ if and only if x = ϕ([f]) for some f : J → ℕ.
Since J is countable, the system of all functions f : J → ℕ has at most the
cardinality of ℕ^ℕ which is the cardinality of P(ℕ) (this can be seen e.g. by the
estimate |ℕ^ℕ| ≤ |(2^ℕ)^ℕ| = |2^{ℕ×ℕ}| = |2^ℕ| = |P(ℕ)|). Thus, ∗ℕ has at most the
cardinality of P(ℕ). Hence, Proposition 8.11 implies that ∗ is not a κ-enlargement
when κ has a strictly larger cardinality than P(ℕ). In particular, ∗ is not an
enlargement. ⋆
Exercise 44: By Theorem 8.10, there is some ∗-finite ℬ ⊆ ∗𝒜 with σ𝒜 ⊆ ℬ.
Then ∗A0 ⊆ ⋃σ𝒜 ⊆ ⋃ℬ, and so
∃x ∈ ∗P(𝒜) : (“x is ∗-finite” ∧ ∗A0 ⊆ ⋃x).
The inverse form of the transfer principle implies that there is some finite x =
𝒜0 ∈ P(𝒜) with A0 ⊆ ⋃𝒜0.
A completely different solution proceeds as follows: Let ℬ denote the system
of all sets of the form A0 \ A (A ∈ 𝒜). If one cannot find a finite 𝒜0 ⊆ 𝒜 with
A0 ⊆ ⋃𝒜0, then ℬ has the finite intersection property, and so ⋂σℬ ≠ ∅. This
means ∗A0 \ ⋃σ𝒜 ≠ ∅, a contradiction to the assumption. ⋆
Exercise 45: By Theorem 8.10, there is some ∗-finite set R0 with σℝ ⊆ R0 ⊆ ∗ℝ.
The transfer of the statement in the hint implies that
∃z ∈ ∗ : |nx − z| < ε.
Let c ∈ inf(∗ℝ), c > 0. By the transfer principle, we find some ε ∈ ∗ℝ+ such that
ϕ := {z ∈ P | ∃x ∈ B : ∃y ∈ x : z = (x, y)}
which is internal by the internal definition principle. Note that dom(ϕ) ⊆ B has
at most the cardinality of κ. We have (B, y) ∈ ϕ for some B ∈ B if and only if
y ∈ B. Since B has the finite intersection property, the relation ϕ is concurrent
on B. Hence, it is satisfied on B, i.e. ⋂B ≠ ∅. ⋆
Exercise 47: Assume contrary that A \ A0 = ∅ for each finite A0 = ∅. Let
B be the system of all sets of the form A0 \ A with A ∈ A . Then B has the
finite intersection property, and so B = ∅. But this means that A ⊆ A , a
contradiction. ⋆
Exercise 49: The proof is analogous to Theorem 10.1 with p in place of ‖·‖. The
only difference is in the proof of the analogue to Lemma 10.2: We have to prove
that there is a constant c ∈ ℝ such that the functional F(x0 + λx1) := f(x0) + λc
(x0 ∈ X0, λ ∈ ℝ) satisfies F(x0 + λx1) ≤ p(x0 + λx1), i.e. f(x0) + λc ≤ p(x0 + λx1).
In the case λ > 0, we may divide by λ and need the estimate f (x) + c ≤ p(x + x1 )
(x ∈ X0 ). In case λ < 0, we divide by −λ and need f (x) − c ≤ p(x − x1 ) (x ∈ X0 ).
The case λ = 0 is trivial. Thus, we have to find some c satisfying
holds, we find
The standard definition principle implies in view of Theorem 3.21 and Exercise 25
that ∗ F consists of all internal maps Z : ∗ X → ∗ Y for which there are ∗ -finite
sequences y1, . . . , yh ∈ ∗Y and f1, . . . , fh ∈ ∗(X∗) such that
Z(x) = ∑_{n=1}^{h} fn(x) yn (x ∈ ∗X).
Exercise 52: We first prove an analogue of Lemma 10.4: For any finite number of
elements x1, . . . , xK ∈ X, xk = (ξk,n)n and any ε > 0 there exist real numbers
η1, . . . , ηN ∈ ℝ such that
f(xk) = ∑_{n=1}^{N} ηn ξk,n (k = 1, . . . , K).
Indeed, as in the proof of Lemma 10.4, we may assume that x1 , . . . , xK are linearly
independent. Choose N such that the truncated vectors yk := (ξk,1 , . . . , ξk,N ) are
linearly independent. Defining
g(λ1 y1 + · · · + λK yK) := f(λ1 x1 + · · · + λK xK)
and extending g to ℝ^N, we find as in the proof of Lemma 10.4 the required numbers
ηn := g(en) where e1, . . . , eN are the canonical base vectors of ℝ^N.
Consider now the relation
ϕ := {(x, y) ∈ X × ℝ^{<ℕ} | f(x) = ∑_{n=1}^{#_ℝ(y)} y(n)x(n)}.
Exercise 53: Assume there is some y = (ηn )n such that (10.2) is a Hahn-Banach
limit. Given some n, consider the sequence defined by ξk := 0 (k ≠ n) and ξn := 1.
Since fy is a Hahn-Banach limit, its value for this sequence (which converges to
0) must be 0. On the other hand, by (10.2), this value must be ηn . Hence ηn = 0.
Since this argument holds for any n, we would have fy (x) = 0 for any x ∈ ℓ∞
by (10.2), contradicting the fact that fy is a Hahn-Banach limit.
Let c ⊆ ℓ∞ be the subspace of all convergent sequences. For x = (ξn )n ∈ c,
define f0(x) = lim ξn. Then f0 ∈ c∗ (with ‖f0‖ = 1), and so we may use the
Hahn-Banach extension theorem to extend f0 to an element f ∈ ℓ∗∞ which thus is
a Hahn-Banach limit. ⋆
Exercise 54: Fixing some h ∈ ℕ∞, the functional f((ξn)n) := st(∗ξh) is such
a Hahn-Banach limit. Indeed, Theorem 10.6 implies that f is in fact a Hahn-Banach
limit (put η1 = · · · = ηh−1 = 0 and ηh = 1). Moreover, st(∗ξh) is always
an accumulation point of a bounded sequence x = (ξn)n by Corollary 7.3 (note
that ∗ξh is finite, because |ξn| ≤ ‖x‖∞ implies by the transfer principle that
|∗ξh| ≤ ∗(‖x‖∞)). ⋆
Exercise 55: Given x = (ξn)n ∈ ℓ∞, put l := lim inf ξn and L := lim sup ξn. Given
ε ∈ ℝ+, there is some N ∈ ℕ such that l − ε ≤ ξn ≤ L + ε holds for each n ≥ N.
Putting ηn := ξn − (l − ε), we thus have ηn+N ≥ 0, and so 0 ≤ f ((ηn+N )n ) =
f ((ηn )n ) = f (x) − (l − ε) which implies f (x) ≥ l − ε. The estimate f (x) ≤ L + ε
follows analogously by putting ηn := (L + ε) − ξn in view of 0 ≤ f ((ηn+N )n ) =
f ((ηn )n ) = L + ε − f (x). We thus have proved l − ε ≤ f (x) ≤ L + ε. Let ε → 0
to obtain the estimate in the statement. This estimate then implies that f is a
Hahn-Banach limit: On the one hand, this estimate implies that the functional f
is bounded with ‖f‖ ≤ 1, because |f(x)| ≤ max{|l|, |L|} ≤ ‖x‖∞. On the other
hand, if ξn → c converges, then l = L = c, and so f(x) = c. To see that even
‖f‖ = 1, observe that f(x) = 1 where x is the constant sequence ξn = 1. ⋆
Exercise 56: Since (ξn+1 )n = −x, we must have f (x) = f ((ξn+1 )n ) = f (−x) =
−f (x), and so f (x) = 0. Since 0 is not an accumulation point of the sequence x,
a Banach-Mazur limit with the required properties does not exist. ⋆
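The argument of Exercise 56 can be illustrated numerically. The sketch below assumes that the sequence in question is ξn = (−1)^n (the exercise statement is not reproduced here, so this is an assumption): its shift is −x, its Cesàro means tend to 0, and 0 is not among its values.

```python
def xi(n):
    # Assumed sequence for illustration: xi_n = (-1)^n.
    return (-1) ** n


def shifted(n):
    # The shifted sequence (xi_{n+1})_n.
    return xi(n + 1)


def cesaro_mean(N):
    """Arithmetic mean of xi_1, ..., xi_N."""
    return sum(xi(n) for n in range(1, N + 1)) / N


# The shift of x coincides with -x on an initial segment.
shift_is_negation = all(shifted(n) == -xi(n) for n in range(1, 200))
# The only values of the sequence are -1 and 1.
values = {xi(n) for n in range(1, 200)}
```

Any shift-invariant linear functional f therefore satisfies f(x) = f(−x) = −f(x), which forces the value 0 even though 0 is not an accumulation point of x.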
Exercise 57: The set X0 of all x for which ζn converges is a linear subspace of
X := ℓ∞ . Since ζn depends linearly on x, it follows that f is linear and p is
sublinear. Hence, there is a linear functional F ∈ ℓ∗∞ which extends f and satisfies
F (x) ≤ p(x) for all x ∈ ℓ∞ . We claim that each such functional F is a Banach-
Mazur limit: If x is the constant sequence ξn = c, then ζn = c, and so F (x) =
f (x) = c. The estimate F (x) ≤ p(x) implies that F is positive: Indeed, −F (x) =
F (−x) ≤ p(−x) shows F (x) ≥ −p(−x). Thus, if x = (ξn )n with ξn ≥ 0, we have
F (x) ≥ −p(−x) ≥ 0, since the definition immediately implies p(−x) ≤ 0. Finally,
F is shift invariant: If x = (ξn )n and y = (ξn+1 )n , then
and analogously
∀n ∈ ∗ℕ : (n ≥ ∗n0 =⇒ ∀k ∈ ∗ℕ : |(1/n) ∑_{m=1}^{n} ∗ξ_{m+k} − ∗c| < ∗ε).
|(1/h) ∑_{n=1}^{h} ∗ξ_{n+k} − ∗c| < ∗ε.
Since this holds for any ε ∈ ℝ+, (10.10) follows.
Conversely, let (10.10) be satisfied. Given ε ∈ ℝ+, the internal formula
∀k ∈ ∗ℕ : |(1/n) ∑_{m=1}^{n} ∗ξ_{m+k} − ∗c| < ∗ε
holds for all n ∈ ℕ∞. By the permanence principle, we find some n0 ∈ ℕ such
that this formula holds for all n = ∗n where n ∈ ℕ satisfies n ≥ n0. By the inverse
form of the transfer principle, we find
∀k ∈ ℕ : |(1/n) ∑_{m=1}^{n} ξ_{m+k} − c| < ε
for any n ∈ ℕ with n ≥ n0. But this means that x is almost convergent to c.
For the second statement, fix h ∈ ℕ∞. Applying (10.10) two times, we find
that
(1/h) ∗ξk ≈ ∗c − ((h − 1)/h) · (1/(h − 1)) ∑_{n=1}^{h−1} ∗ξ_{n+k+1} ≈ ∗c − ((h − 1)/h) ∗c = ∗c/h.
Hence, |∗ξk|/h ≤ |∗c|/h + 1 which implies |∗ξk| ≤ |∗c| + h. In particular,
∃n ∈ ∗ℕ : ∀k ∈ ∗ℕ : |∗ξk| ≤ |∗c| + n.
The inverse form of the transfer principle implies that (ξk )k is bounded. ⋆
Exercise 59: The answer is negative: Choose A such that χA corresponds to the
sequence (0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, . . .). Then A has the density d = 1/2, but the
sequence χA = (an)n is not almost convergent to 1/2, because for any n we find
some k such that a1+k = · · · = an+k = 0, and so (1/n) ∑_{m=1}^{n} a_{m+k} cannot converge
to 1/2 uniformly in k. Theorem 10.13 implies that there is some Banach-Mazur
limit f with f(χA) ≠ 1/2, and by Proposition 10.11, we find even a Banach-Mazur
limit of Cesàro type with this property. ⋆
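The behaviour of this sequence can be checked on a finite prefix. The following Python sketch is only an illustration and not part of the original solution; the helper names block_sequence, prefix_density and min_window_average are hypothetical. It builds the first terms of (0, 1, 0, 0, 1, 1, . . .), computes the average of the first n terms, and computes the smallest average over any window of n consecutive terms.

```python
def block_sequence(length):
    """First `length` terms of 0,1,0,0,1,1,0,0,0,1,1,1,... (j zeros, then j ones)."""
    terms = []
    j = 1
    while len(terms) < length:
        terms.extend([0] * j)
        terms.extend([1] * j)
        j += 1
    return terms[:length]


def prefix_density(a, n):
    """Average of the first n terms, i.e. the proportion of ones among them."""
    return sum(a[:n]) / n


def min_window_average(a, n):
    """Smallest average of n consecutive terms of the finite list a."""
    window = sum(a[:n])
    best = window
    for k in range(1, len(a) - n + 1):
        # Slide the window one step to the right.
        window += a[k + n - 1] - a[k - 1]
        best = min(best, window)
    return best / n


if __name__ == "__main__":
    a = block_sequence(20000)
    print("prefix average:", prefix_density(a, 20000))
    print("smallest window average (n = 50):", min_window_average(a, 50))
```

On a long prefix the first value is close to 1/2, while the second is 0 as soon as the prefix contains a block of at least n zeros. A finite computation of course only illustrates the failure of uniformity in k; it does not prove it.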
Exercise 62: If A is compact, then Theorem 12.39 implies that st−1 (A) ∩ ∗ A = ∗ A
is standard and thus internal. Conversely, let B := ∗ A∩st−1 (A) be internal, and let
C be an open cover of A. For any y ∈ B there is some x ∈ A with y ≈O ∗ x. There is
some O ∈ C with x ∈ O. Then y ∈ ∗O, and so y ∈ ⋃ σC. This proves B ⊆ ⋃ σC.
Theorem 8.16 implies that there is some finite C0 ⊆ C with B ⊆ ⋃ σC0. In
particular, σA ⊆ B ⊆ ⋃ σC0, which implies A ⊆ ⋃ C0. Hence, A is compact. ⋆
Exercise 63: Let C be an open cover of A0 := st(A). For any a ∈ A, we find in view
of A ⊆ ns(∗ X) some a0 ∈ X with a ≈O ∗ a0 . We have a0 ∈ st(a) ⊆ st(A) = A0 .
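Hence, there is some O ∈ C with a0 ∈ O, and so a ∈ ∗O. This proves A ⊆ ⋃ σC. By
Theorem 8.16, there is a finite C0 ⊆ C with A ⊆ ⋃ σC0. Since C0 = {O1, . . . , On}
is finite, we have
\[
  \bigcup {}^\sigma\mathcal{C}_0 = {}^*O_1 \cup \dots \cup {}^*O_n = {}^*(O_1 \cup \dots \cup O_n) = {}^*\Bigl(\bigcup \mathcal{C}_0\Bigr),
\]
and so A ⊆ ∗(⋃ C0). Theorem 12.35 thus implies
\[
  A_0 = \operatorname{st}(A) \subseteq \operatorname{st}\Bigl({}^*\Bigl(\bigcup \mathcal{C}_0\Bigr)\Bigr) = \overline{\bigcup \mathcal{C}_0},
\]
and so A0 is compact by Lemma 12.43. ⋆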
Hence, if we find for any ε ∈ ℝ+ some x ∈ X with ∗d(∗x, y) < ∗ε, we find
in particular for any U ∈ U some x ∈ X with (∗x, y) ∈ ∗Bε ⊆ ∗U, and so
y ∈ pns(∗X). Conversely, if y ∈ pns(∗X) and ε ∈ ℝ+ are given, we find in view of
Bε ∈ U some x ∈ X with (∗x, y) ∈ ∗Bε, which means ∗d(∗x, y) < ∗ε. ⋆
Exercise 68: Necessity has been proved in Corollary 13.13. For sufficiency, suppose
that any Cauchy sequence converges. By Theorem 13.16, we have to prove that
pns(∗X) ⊆ ns(∗X). Thus, let y ∈ pns(∗X) be given. By Exercise 67, we find
for each n some xn ∈ X with ∗d(∗(xn), y) < 1/∗n. Using the triangle inequality
for ∗d (which holds by the transfer principle), we thus find for any ε ∈ ℝ+ that
∗d(∗(xn), ∗(xm)) < ∗ε for all n, m ∈ ℕ with n, m ≥ 2/ε. But since this estimate
means by the inverse form of the transfer principle that d(xn , xm ) < ε, we may
conclude that xn is a Cauchy sequence (xn is a sequence by a countable form of
the axiom of choice). By assumption, xn → x for some x ∈ X.
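Written out, the triangle-inequality step behind the Cauchy estimate reads as follows. This is a sketch using only the bounds stated above, with y inserted as intermediate point; the symmetry of ∗d is assumed to hold by the transfer principle as well.

```latex
\[
  {}^*d\bigl({}^*(x_n), {}^*(x_m)\bigr)
  \le {}^*d\bigl({}^*(x_n), y\bigr) + {}^*d\bigl(y, {}^*(x_m)\bigr)
  < \frac{1}{{}^*n} + \frac{1}{{}^*m}
  \le \frac{{}^*\varepsilon}{2} + \frac{{}^*\varepsilon}{2} = {}^*\varepsilon
  \qquad (n, m \ge 2/\varepsilon).
\]
```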
We claim that y ≈U ∗x. Given ε ∈ ℝ+, we have to prove that ∗d(∗x, y) < ∗ε
(Proposition 13.6). But choosing some n ∈ ℕ with n > 2/ε such that d(x, xn) <
ε/2, we find
\[
  {}^*d({}^*x, y) \le {}^*d\bigl({}^*x, {}^*(x_n)\bigr) + {}^*d\bigl({}^*(x_n), y\bigr) \le {}^*\varepsilon/2 + 1/{}^*n < {}^*\varepsilon,
\]
as desired. ⋆
Exercise 69: If U ∈ U and V ⊆ Y × Y satisfies V ⊇ U ∩ (Y × Y ), put W := U ∪ V .
Then W ∈ U (because W ⊇ U ), and V = W ∩ (Y × Y ). Hence V ∈ UY . If
U1 , U2 ∈ U , then U := U1 ∩ U2 ∈ U , and thus (U1 ∩ (Y × Y )) ∩ (U2 ∩ (Y × Y )) =
Exercise 70: Let X be complete, and Y ⊆ X be closed with the inherited uniform
structure. By Theorem 13.16, we have to prove that any y ∈ pns(∗ Y ) belongs
to ns(∗ Y ). It follows from the definition (and Proposition 13.19) that pns(∗ Y ) ⊆
pns(∗ X). Since X is complete, Theorem 13.16 implies that there is some x ∈ X
with y ≈O ∗ x. Since Y is closed, we find in view of Theorem 12.35 and y ∈ ∗ Y
that x ∈ Y . We are done if we can prove that y ∈ ns(∗ Y ), i.e. y ∈ ∗ V (∗ x) for any
V ∈ UY . By definition, V = U ∩ (Y × Y ) for some U ∈ U . Since y ≈O ∗ x, we have
(y, ∗ x) ∈ ∗ U ∩ (∗ Y × ∗ Y ) = ∗ V , and so y ∈ ∗ V (∗ x), as desired. ⋆
Exercise 72: The only place in the proof of Theorem 13.26 where the saturation
property was used is in the proof that A ≠ ∅. However, if ∗ is comprehensive,
we may extend the function f : σ U → ∗ X, defined by f (∗ U ) := xU (U ∈ U )
(axiom of choice!), to some internal function F : ∗ U → ∗ X. Consider the internal
binary relation
ϕ := {(x, y) ∈ ∗ U × ∗ X | y ∈ x(F (x))}.
This implies x ∈ y∗U for any y ∈ ∗ℝ with y ≥ ∗n. Hence, cx ∈ ∗U ⊆ ∗O for any
c ∈ ∗ℝ+ with c ≤ 1/∗n. In particular, cx ∈ ∗O for any c ∈ inf(∗ℝ) with c > 0.
Conversely, assume that cx ∈ inf(∗X) for any c ∈ inf(∗ℝ), c > 0. If O is a
neighborhood of 0, then the internal formula εx ∈ ∗O holds for any ε ∈ inf(∗ℝ),
ε > 0. By the permanence principle (Cauchy principle), we have ∗εx ∈ ∗O for
some ε ∈ ℝ+, and so x ∈ λ∗O for λ := 1/ε. ⋆
UO := {(x, y) ∈ X × X : x − y ∈ O}
Exercise 77: In view of Proposition 13.2 and Theorem 14.3, a topological vector
space X is Hausdorff if and only if the relation x−y ∈ O for any open neighborhood
O of 0 implies that x = y.
The relation [x] − [y] ∈ O for any open neighborhood O ⊆ X/U of [0] is
equivalent to the fact that for any neighborhood O0 ⊆ X of 0 there is some
u ∈ U with x − y + u ∈ O0 . Putting z := y − x, this means that we find for any
neighborhood O0 ⊆ X of 0 some u ∈ U with u ∈ z + O0 , i.e. any neighborhood of z
contains some element of U , i.e. z ∈ Ū (the closure of U ). We thus have proved that [x] − [y] ∈ O for
any open neighborhood O ⊆ X/U if and only if y − x ∈ Ū .
Together with the assumption in the beginning, we obtain: X/U is Hausdorff
if and only if for any [x], [y] ∈ X/U the relation y − x ∈ Ū implies [x] = [y]. Since
the latter means y − x ∈ U , we have that X/U is Hausdorff if and only if the
relation y − x ∈ Ū for some y, x ∈ X implies y − x ∈ U . This is the case if and
only if Ū ⊆ U , i.e. if and only if U is closed.
For the second claim, note that X/{0} can in a canonical way be identified
with X such that also the open sets are in correspondence. ⋆
Exercise 78: Let F denote the system of all functions from subsets of P(S0 ) into
[0, ∞] (recall Exercise 83). Consider the sentence
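and
\[
  \bigcup \operatorname{rng} z \in \Sigma \;\wedge\; \Bigl( \text{``Range of $z$ pairwise disjoint''} \implies y\Bigl(\bigcup \operatorname{rng} z\Bigr) = \sum_{n} y(z(n)) \Bigr)
\]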
respectively. The transfer of this sentence implies the statement for the choice
x := Σ, y := µ, z(n) := An (n = 1, . . . , h). ⋆
Exercise 79: Let c > 0 be infinitesimal with δ(x) = 0 for |x| > c. In view of the
transfer principle, we may estimate analogously as for standard integrals
\[
  \Bigl| \int \delta(x)\,{}^*f(x)\,dx - {}^*f(0) \Bigr|
  = \Bigl| \int \delta(x)\,\bigl({}^*f(x) - {}^*f(0)\bigr)\,dx \Bigr|
  \le \int \delta(x)\,M\,dx = M
\]
where M := sup{|∗f(x) − ∗f(0)| : x ∈ ∗ℝ, δ(x) ≠ 0} (note that the supremum
exists, since the considered set is internal by the internal definition principle). We
have
\[
  M \le \sup\{ |{}^*f(x) - {}^*f(0)| : x \in {}^*\mathbb{R} \wedge |x| \le c \} =: M_0
\]
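A standard (not nonstandard) analogue of this estimate can be tried numerically. In the Python sketch below, which is an illustration and not part of the original solution, δ is replaced by a nonnegative triangular bump of small real half-width c with integral 1. The names triangle_kernel, smeared_value and oscillation are hypothetical, and the integral is approximated by a midpoint rule.

```python
import math


def triangle_kernel(x, c):
    """Nonnegative bump supported in [-c, c] whose integral equals 1."""
    return max(0.0, (c - abs(x)) / (c * c))


def smeared_value(f, c, steps=20000):
    """Midpoint-rule approximation of the integral of kernel(x) * f(x) over [-c, c]."""
    h = 2.0 * c / steps
    total = 0.0
    for i in range(steps):
        x = -c + (i + 0.5) * h
        total += triangle_kernel(x, c) * f(x)
    return total * h


def oscillation(f, c, samples=20001):
    """Sampled estimate of sup |f(x) - f(0)| over |x| <= c (standard analogue of M0)."""
    f0 = f(0.0)
    return max(abs(f(-c + 2.0 * c * i / (samples - 1)) - f0) for i in range(samples))


if __name__ == "__main__":
    for c in (0.1, 0.01):
        error = abs(smeared_value(math.cos, c) - math.cos(0.0))
        print(c, error, oscillation(math.cos, c))
```

For a continuous f the printed error stays below the sampled oscillation, and both shrink as c decreases, which mirrors the role of the bound M ≤ M0 for infinitesimal c.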
X = {y ∈ U^I | ∀x ∈ I : y(x) ∈ f (x)},
F = {x ∈ P | x : A → B}
F = {x ∈ P | ∃y ∈ PA : (x : y → B)}
that x ⊆ A × B, and that x is a function (to formalize the latter quantify over A
and B). The transfer principle implies that ∗ C consists of all elements x ∈ ∗ S n
for which ∗ α(x) is true. The condition x ∈ ∗ S n may be dropped, since we already
know ∗ C ⊆ ∗ S n , and ∗ α(x) becomes: “x ⊆ ∗ (A × B) = ∗ A × ∗ B is true, and x is
a function”. ⋆
Exercise 84: Put C := {A × B : A ∈ A , B ∈ B}, U := ⋃A and V := ⋃B. Then
A × B ⊆ U × V for each A ∈ A and each B ∈ B. Putting P := P(U × V ), we
thus have
C = {z ∈ P | ∃x ∈ A , y ∈ B : z = x × y}.
The standard definition principle implies
∗C = {z ∈ ∗P | ∃x ∈ ∗A , y ∈ ∗B : z = x × y}.
Now note that ∗P = ∗P(U × V ) consists by Theorem 3.21 of all internal subsets
of ∗(U × V ) = ∗U × ∗V = (⋃ ∗A ) × (⋃ ∗B) (for the equalities we used that ∗ is
a superstructure monomorphism and Theorem A.4). Since by Theorem 3.19 each
of the sets A × B is internal for A ∈ ∗A and B ∈ ∗B, the statement follows. ⋆
Bibliography
[Rud87] Rudin, W., Real and complex analysis, 3rd ed., McGraw-Hill, Singa-
pore, 1987.
[Rud90] Rudin, W., Functional analysis, 14th ed., McGraw-Hill, New Delhi,
New York, 1990.
[RZ69] Robinson, A. and Zakon, E., A set-theoretical characterization of en-
largements, In Luxemburg [Lux69b], 109–122.
[SB86] Stroyan, K. D. and Bayod, J. M., Foundations of infinitesimal
stochastic analysis, North-Holland, Amsterdam, 1986.
[Sie38] Sierpiński, W., Fonctions additives non complètement additives et
fonctions non mesurables, Fund. Math. 30 (1938), 96–99.
[Sik64] Sikorski, R., Boolean algebras, Springer, Berlin, Heidelberg, New
York, 1964.
[SL76] Stroyan, K. D. and Luxemburg, W. A. J., Introduction to the theory
of infinitesimals, Academic Press, New York, San Francisco, London,
1976.
[Sol70] Solovay, R. M., A model of set-theory in which every set of reals is
Lebesgue measurable, Ann. of Math. (2) 92 (1970), 1–56.
[Tay69] Taylor, R. F., On some properties of bounded internal functions, In
Luxemburg [Lux69b], 167–170.
[vdB87] van den Berg, I., Nonstandard asymptotic analysis, Lect. Notes
Math., no. 1249, Springer, Berlin, New York, 1987.
[vQ79] von Querenburg, B., Mengentheoretische Topologie, 2nd ed., Springer,
Berlin, Heidelberg, New York, 1979.
[Wag86] Wagon, S., The Banach-Tarski paradox, 2nd ed., Cambridge Univ.
Press, Cambridge, 1986.
[Zaa67] Zaanen, A. C., Integration, North-Holland Publ. Company, Amster-
dam, 1967.
[Zak69] Zakon, E., Remarks on the nonstandard real axis, In Luxemburg
[Lux69b], 195–227.