Languages and Machines: An Introduction to the Theory of Computer Science, 3rd Edition, Thomas A. Sudkamp

Preface

The objective of the third edition of Languages and Machines: An Introduction to the
Theory of Computer Science remains the same as that of the first two editions, to provide
a mathematically sound presentation of the theory of computer science at a level suitable
for junior- and senior-level computer science majors. The impetus for the third edition was
threefold: to enhance the presentation by providing additional motivation and examples; to
expand the selection of topics, particularly in the area of computational complexity; and to
provide additional flexibility to the instructor in the design of an introductory course in the
theory of computer science.
While many applications-oriented students question the importance of studying theoretical
foundations, it is this subject that addresses the “big picture” issues of computer
science. When today’s programming languages and computer architectures are obsolete
and solutions have been found for problems currently of interest, the questions considered
in this book will still be relevant. What types of patterns can be algorithmically detected?
How can languages be formally defined and analyzed? What are the inherent capabilities
and limitations of algorithmic computation? What problems have solutions that require so
much time or memory that they are realistically intractable? How do we compare the relative
difficulty of two problems? Each of these questions will be addressed in this text.

Organization
Since most computer science students at the undergraduate level have little or no background
in abstract mathematics, the presentation is intended not only to introduce the foundations
of computer science but also to increase the student’s mathematical sophistication. This
is accomplished by a rigorous presentation of the concepts and theorems of the subject
accompanied by a generous supply of examples. Each chapter ends with a set of exercises
that reinforces and augments the material covered in the chapter.
To make the topics accessible, no special mathematical prerequisites are assumed.
Instead, Chapter 1 introduces the mathematical tools of the theory of computing: naive set
theory, recursive definitions, and proof by mathematical induction. With the exception of
the specialized topics in Sections 1.3 and 1.4, Chapters 1 and 2 provide background material
that will be used throughout the text. Section 1.3 introduces cardinality and diagonalization,
which are used in the counting arguments that establish the existence of undecidable
languages and uncomputable functions. Section 1.4 examines the use of self-reference in
proofs by contradiction. This technique is used in undecidability proofs, including the proof
that there is no solution to the Halting Problem. For students who have completed a course
in discrete mathematics, most of the material in Chapter 1 can be treated as review.
Recognizing that courses in the foundations of computing may emphasize different
topics, the presentation and prerequisite structure of this book have been designed to permit
a course to investigate particular topics in depth while providing the ability to augment
the primary topics with material that introduces and explores the breadth of computer
science theory. The core material for courses that focus on a classical presentation of formal
language and automata theory, on computability and undecidability, on computational
complexity, and on formal languages as the foundation for programming language definition
and compiler design is given in the following table. A star next to a section indicates that
the section may be omitted without affecting the continuity of the presentation. A starred
section usually contains the presentation of an application, the introduction of a related
topic, or a detailed proof of an advanced result in the subject.

Formal Language         Computability       Computational        Formal Languages for
and Automata Theory     Theory              Complexity           Programming Languages

Chap. 1: 1-3, 6-8       Chap. 1: all        Chap. 1: all         Chap. 1: 1-3, 6-8
Chap. 2: 1-3, 4*        Chap. 2: 1-3, 4*    Chap. 2: 1-3, 4*     Chap. 2: 1-4
Chap. 3: 1-3, 4*        Chap. 5: 1-6, 7*    Chap. 5: 1-4, 5-7*   Chap. 3: 1-4
Chap. 4: 1-5, 6*, 7     Chap. 8: 1-7, 8*    Chap. 8: 1-7, 8*     Chap. 4: 1-5, 6*, 7
Chap. 5: 1-6, 7*        Chap. 9: 1-5, 6*    Chap. 9: 1-4, 5-6*   Chap. 5: 1-6, 7*
Chap. 6: 1-5, 6*        Chap. 10: 1         Chap. 11: 1-4, 5*    Chap. 7: 1-3, 4-5*
Chap. 7: 1-5            Chap. 11: all       Chap. 14: 1-4, 5-7*  Chap. 18: all
Chap. 8: 1-7, 8*        Chap. 12: all       Chap. 15: all        Chap. 19: all
Chap. 9: 1-5, 6*        Chap. 13: all       Chap. 16: 1-6, 7*    Chap. 20: all
Chap. 10: all                               Chap. 17: all

The classical presentation of formal language and automata theory examines the relationships
between the grammars and abstract machines of the Chomsky hierarchy. The computational
properties of deterministic finite automata, pushdown automata, linear-bounded
automata, and Turing machines are examined. The analysis of the computational power of
abstract machines culminates by establishing the equivalence of language recognition by
Turing machines and language generation by unrestricted grammars.
Computability theory examines the capabilities and limitations of algorithmic problem
solving. The coverage of computability includes decidability and the Church-Turing
Thesis, which is supported by the establishment of the equivalence of Turing computability
and μ-recursive functions. A diagonalization argument is used to show that the Halting
Problem for Turing machines is unsolvable. Problem reduction is then used to establish the
undecidability of a number of questions on the capabilities of algorithmic computation.
The study of computational complexity begins by considering methods for measuring
the resources required by a computation. The Turing machine is selected as the framework
for the assessment of complexity, and time and space complexity are measured by the
number of transitions and amount of memory used in Turing machine computations. The
class P of problems that are solvable by deterministic Turing machines in polynomial time
is identified as the set of problems that have efficient algorithmic solutions. The class NP and
the theory of NP-completeness are then introduced. Approximation algorithms are used to
obtain near-optimal solutions for NP-complete optimization problems.
The most important application of formal language theory to computer science is the
use of grammars to specify the syntax of programming languages. A course with the focus
of using formal techniques to define programming languages and develop efficient parsing
strategies begins with the introduction of context-free grammars to generate languages
and finite automata to recognize patterns. After the introduction to language definition,
Chapters 18-20 examine the properties of LL and LR grammars and deterministic parsing
of languages defined by these types of grammars.

Exercises
Mastering the theoretical foundations of computer science is not a spectator sport; only by
solving problems and examining the proofs of the major results can one fully comprehend
the concepts, the algorithms, and the subtleties of the theory. That is, understanding the “big
picture” requires many small steps. To help accomplish this, each chapter ends with a set of
exercises. The exercises range from constructing simple examples of the topics introduced
in the chapter to extending the theory.
Several exercises in each set are marked with a star. A problem is starred because it
may be more challenging than the others on the same topic, more theoretical in nature, or
may be particularly unique and interesting.

Notation
The theory of computer science is a mathematical examination of the capabilities and limitations
of effective computation. As with any formal analysis, the notation must provide
precise and unambiguous definitions of the concepts, structures, and operations. The following
notational conventions will be used throughout the book:

Items                   Description                                Examples

Elements and strings    Italic lowercase letters from the          a, b, abc
                        beginning of the alphabet
Functions               Italic lowercase letters                   f, g, h
Sets and relations      Capital letters                            X, Y, Z, Σ, Γ
Grammars                Capital letters                            G, G1, G2
Variables of grammars   Italic capital letters                     A, B, C, S
Abstract machines       Capital letters                            M, M1, M2

The use of roman letters for sets and mathematical structures is somewhat nonstandard
but was chosen to make the components of a structure visually identifiable. For example, a
context-free grammar is a structure G = (Σ, V, P, S). From the fonts alone it can be seen
that G consists of three sets and a variable S.
A three-part numbering system is used throughout the book; a reference is given by
chapter, section, and item. One numbering sequence records definitions, lemmas, theorems,
corollaries, and algorithms. A second sequence is used to identify examples. Tables, figures,
and exercises are referenced simply by chapter and number.
The end of a proof is marked by ■ and the end of an example by □. An index of symbols,
including descriptions and the numbers of the pages on which they are introduced, is given
in Appendix I.

Supplements
Solutions to selected exercises are available only to qualified instructors. Please contact your
local Addison-Wesley sales representative or send email to [email protected] for information
on how to access them.

Acknowledgments
First and foremost, I would like to thank my wife Janice and daughter Elizabeth, whose
kindness, patience, and consideration made the successful completion of this book possible.
I would also like to thank my colleagues and friends at the Institut de Recherche en
Informatique de Toulouse, Université Paul Sabatier, Toulouse, France. The first draft of
this revision was completed while I was visiting IRIT during the summer of 2004. A special
thanks to Didier Dubois and Henri Prade for their generosity and hospitality.
The number of people who have made contributions to this book increases with each
edition. I extend my sincere appreciation to all the students and professors who have
used this book and have sent me critiques, criticisms, corrections, and suggestions for
improvement. Many of the suggestions have been incorporated into this edition. Thank
you for taking the time to send your comments and please continue to do so. My email
address is [email protected].
This book, in its various editions, has been reviewed by a number of distinguished computer
scientists including Professors Andrew Astromoff (San Francisco State University),
Dan Cooke (University of Texas-El Paso), Thomas Fernandez, Sandeep Gupta (Arizona
State University), Raymond Gumb (University of Massachusetts-Lowell), Thomas F. Hain
(University of South Alabama), Michael Harrison (University of California at Berkeley),
David Hemmendinger (Union College), Steve Homer (Boston University), Dan Jurca (California
State University-Hayward), Klaus Kaiser (University of Houston), C. Kim (University
of Oklahoma), D. T. Lee (Northwestern University), Karen Lemone (Worcester
Polytechnic Institute), C. L. Liu (University of Illinois at Urbana-Champaign), Richard
J. Lorentz (California State University-Northridge), Fletcher R. Norris (The University
of North Carolina at Wilmington), Jeffery Shallit (University of Waterloo), Frank Stomp
(Wayne State University), William Ward (University of South Alabama), Dan Ventura
(Brigham Young University), Charles Wallace (Michigan Technological University), Kenneth
Williams (Western Michigan University), and Hsu-Chun Yen (Iowa State University).
Thank you all.
I would also like to gratefully acknowledge the assistance received from the people at
the Computer Science Education Division of the Addison-Wesley Publishing Company and
Windfall Software who were members of the team that successfully completed this project.

Thomas A. Sudkamp
Dayton, Ohio
Contents

Preface

Introduction

PART I Foundations

Chapter 1
Mathematical Preliminaries
1.1 Set Theory
1.2 Cartesian Product, Relations, and Functions
1.3 Equivalence Relations
1.4 Countable and Uncountable Sets
1.5 Diagonalization and Self-Reference
1.6 Recursive Definitions
1.7 Mathematical Induction
1.8 Directed Graphs
Exercises
Bibliographic Notes

Chapter 2
Languages
2.1 Strings and Languages
2.2 Finite Specification of Languages
2.3 Regular Sets and Expressions
2.4 Regular Expressions and Text Searching
Exercises
Bibliographic Notes

PART II Grammars, Automata, and Languages

Chapter 3
Context-Free Grammars
3.1 Context-Free Grammars and Languages
3.2 Examples of Grammars and Languages
3.3 Regular Grammars
3.4 Verifying Grammars
3.5 Leftmost Derivations and Ambiguity
3.6 Context-Free Grammars and Programming Language Definition
Exercises
Bibliographic Notes

Chapter 4
Normal Forms for Context-Free Grammars
4.1 Grammar Transformations
4.2 Elimination of λ-Rules
4.3 Elimination of Chain Rules
4.4 Useless Symbols
4.5 Chomsky Normal Form
4.6 The CYK Algorithm
4.7 Removal of Direct Left Recursion
4.8 Greibach Normal Form
Exercises
Bibliographic Notes

Chapter 5
Finite Automata
5.1 A Finite-State Machine
5.2 Deterministic Finite Automata
5.3 State Diagrams and Examples
5.4 Nondeterministic Finite Automata
5.5 λ-Transitions
5.6 Removing Nondeterminism
5.7 DFA Minimization
Exercises
Bibliographic Notes

Chapter 6
Properties of Regular Languages
6.1 Finite-State Acceptance of Regular Languages
6.2 Expression Graphs
6.3 Regular Grammars and Finite Automata
6.4 Closure Properties of Regular Languages
6.5 A Nonregular Language
6.6 The Pumping Lemma for Regular Languages
6.7 The Myhill-Nerode Theorem
Exercises
Bibliographic Notes

Chapter 7
Pushdown Automata and Context-Free Languages
7.1 Pushdown Automata
7.2 Variations on the PDA Theme
7.3 Acceptance of Context-Free Languages
7.4 The Pumping Lemma for Context-Free Languages
7.5 Closure Properties of Context-Free Languages
Exercises
Bibliographic Notes

PART III Computability

Chapter 8
Turing Machines
8.1 The Standard Turing Machine
8.2 Turing Machines as Language Acceptors
8.3 Alternative Acceptance Criteria
8.4 Multitrack Machines
8.5 Two-Way Tape Machines
8.6 Multitape Machines
8.7 Nondeterministic Turing Machines
8.8 Turing Machines as Language Enumerators
Exercises
Bibliographic Notes

Chapter 9
Turing Computable Functions
9.1 Computation of Functions
9.2 Numeric Computation
9.3 Sequential Operation of Turing Machines
9.4 Composition of Functions
9.5 Uncomputable Functions
9.6 Toward a Programming Language
Exercises
Bibliographic Notes

Chapter 10
The Chomsky Hierarchy
10.1 Unrestricted Grammars
10.2 Context-Sensitive Grammars
10.3 Linear-Bounded Automata
10.4 The Chomsky Hierarchy
Exercises
Bibliographic Notes

Chapter 11
Decision Problems and the Church-Turing Thesis
11.1 Representation of Decision Problems
11.2 Decision Problems and Recursive Languages
11.3 Problem Reduction
11.4 The Church-Turing Thesis
11.5 A Universal Machine
Exercises
Bibliographic Notes

Chapter 12
Undecidability
12.1 The Halting Problem for Turing Machines
12.2 Problem Reduction and Undecidability
12.3 Additional Halting Problem Reductions
12.4 Rice’s Theorem
12.5 An Unsolvable Word Problem
12.6 The Post Correspondence Problem
12.7 Undecidable Problems in Context-Free Grammars
Exercises
Bibliographic Notes

Chapter 13
Mu-Recursive Functions
13.1 Primitive Recursive Functions
13.2 Some Primitive Recursive Functions
13.3 Bounded Operators
13.4 Division Functions
13.5 Gödel Numbering and Course-of-Values Recursion
13.6 Computable Partial Functions
13.7 Turing Computability and Mu-Recursive Functions
13.8 The Church-Turing Thesis Revisited
Exercises
Bibliographic Notes

PART IV Computational Complexity

Chapter 14
Time Complexity
14.1 Measurement of Complexity
14.2 Rates of Growth
14.3 Time Complexity of a Turing Machine
14.4 Complexity and Turing Machine Variations
14.5 Linear Speedup
14.6 Properties of Time Complexity of Languages
14.7 Simulation of Computer Computations
Exercises
Bibliographic Notes

Chapter 15
P, NP, and Cook’s Theorem
15.1 Time Complexity of Nondeterministic Turing Machines
15.2 The Classes P and NP
15.3 Problem Representation and Complexity
15.4 Decision Problems and Complexity Classes
15.5 The Hamiltonian Circuit Problem
15.6 Polynomial-Time Reduction
15.7
15.8 The Satisfiability Problem
15.9 Complexity Class Relations
Exercises
Bibliographic Notes

Chapter 16
NP-Complete Problems
16.1 Reduction and NP-Complete Problems
16.2 The 3-Satisfiability Problem
16.3 Reductions from 3-Satisfiability
16.4 Reduction and Subproblems
16.5 Optimization Problems
16.6 Approximation Algorithms
16.7 Approximation Schemes
Exercises
Bibliographic Notes

Chapter 17
Additional Complexity Classes
17.1 Derivative Complexity Classes
17.2 Space Complexity
17.3 Relations between Space and Time Complexity
17.4 P-Space, NP-Space, and Savitch’s Theorem
17.5 P-Space Completeness
17.6 An Intractable Problem
Exercises
Bibliographic Notes

PART V Deterministic Parsing

Chapter 18
Parsing: An Introduction
18.1 The Graph of a Grammar
18.2 A Top-Down Parser
18.3 Reductions and Bottom-Up Parsing
18.4 A Bottom-Up Parser
18.5 Parsing and Compiling
Exercises
Bibliographic Notes

Chapter 19
LL(k) Grammars
19.1 Lookahead in Context-Free Grammars
19.2 FIRST, FOLLOW, and Lookahead Sets
19.3 Strong LL(k) Grammars
19.4 Construction of FIRST_k Sets
19.5 Construction of FOLLOW_k Sets
19.6 A Strong LL(1) Grammar
19.7 A Strong LL(k) Parser
19.8 LL(k) Grammars
Exercises
Bibliographic Notes

Chapter 20
LR(k) Grammars
20.1 LR(0) Contexts
20.2 An LR(0) Parser
20.3 The LR(0) Machine
20.4 Acceptance by the LR(0) Machine
20.5 LR(1) Grammars
Exercises
Bibliographic Notes

Appendix I
Index of Notation

Appendix II
The Greek Alphabet

Appendix III
The ASCII Character Set

Appendix IV
Backus-Naur Form Definition of Java

Bibliography
Subject Index

Introduction

The theory of computer science began with the questions that spur most scientific endeavors:
how and what. After these had been answered, the question that motivates many economic
decisions, how much, came to the forefront. The objective of this book is to explain the
significance of these questions for the study of computer science and provide answers
whenever possible.
Formal language theory was initiated by the question, “How are languages defined?” In
an attempt to capture the structure and nuances of natural language, linguist Noam Chomsky
developed formal systems called grammars for defining and generating syntactically correct
sentences. At approximately the same time, computer scientists were grappling with the
problem of explicitly and unambiguously defining the syntax of programming languages.
These two studies converged when the syntax of the programming language ALGOL was
defined using a formalism equivalent to a context-free grammar.
The investigation of computability was motivated by two fundamental questions:
“What is an algorithm?” and “What are the capabilities and limitations of algorithmic
computation?” An answer to the first question requires a formal model of computation. It
may seem that the combination of a computer and high-level programming language, which
clearly constitute a computational system, would provide the ideal framework for the study
of computability. Only a little consideration is needed to see difficulties with this approach.
What computer? How much memory should it have? What programming language? Moreover,
the selection of a particular computer or language may have inadvertent and unwanted
consequences on the answer to the second question. A problem that may be solved on one
computer configuration may not be solvable on another.
The question of whether a problem is algorithmically solvable should be independent
of the model of computation used: Either there is an algorithmic solution to a problem or there
is no such solution. Consequently, a system that is capable of performing all possible algorithmic
computations is needed to appropriately address the question of computability. The
characterization of general algorithmic computation has been a major area of research for
mathematicians and logicians since the 1930s. Many different systems have been proposed
as models of computation, including recursive functions, the lambda calculus of Alonzo
Church, Markov systems, and the abstract machines developed by Alan Turing. All of these
systems, and many others designed for this purpose, have been shown to be capable of solving
the same set of problems. One interpretation of the Church-Turing Thesis, which will
be discussed in Chapter 11, is that a problem has an algorithmic solution only if it can be
solved in any (and hence all) of these computational systems.
Because of its simplicity and the similarity of its components to those of a modern-day
computer, we will use the Turing machine as our framework for the study of computation.
The Turing machine has many features in common with a computer: It processes input,
writes to memory, and produces output. Although Turing machine instructions are primitive
compared with those of a computer, it is not difficult to see that the computation of
a computer can be simulated by an appropriately defined sequence of Turing machine
instructions. The Turing machine model does, however, avoid the physical limitations of
conventional computers; there is no upper bound on the amount of memory or time that may
be used in a computation. Consequently, any problem that can be solved on a computer can
be solved with a Turing machine, but the converse of this is not guaranteed.
After accepting the Turing machine as a universal model of effective computation,
we can address the question, “What are the capabilities and limitations of algorithmic
computation?” The Church-Turing Thesis assures us that a problem is solvable only if there
is a suitably designed Turing machine that solves it. To show that a problem has no solution
reduces to demonstrating that no Turing machine can be designed to solve the problem.
Chapter 12 follows this approach to show that several important questions concerning our
ability to predict the outcome of a computation are unsolvable.
Once a problem is known to be solvable, one can begin to consider the efficiency
or optimality of a solution. The question how much initiates the study of computational
complexity. Again the Turing machine provides an unbiased platform that permits the
comparison of the resource requirements of various problems. The time complexity of
a Turing machine measures the number of instructions required by a computation. Time
complexity is used to partition the set of solvable problems into two classes: tractable and
intractable. A problem is considered tractable if it is solvable by a Turing machine in which
the number of instructions executed during a computation is bounded by a polynomial
function of the length of the input. A problem that is not solvable in polynomial time is
considered intractable because of the excessive amount of computational resources required
to solve all but the simplest cases of the problem.
The Turing machine is not the only abstract machine that we will consider; rather,
it is the culmination of a series of increasingly powerful machines whose properties will
be examined. The analysis of effective computation begins with an examination of the
properties of deterministic finite automata. A deterministic finite automaton is a read-once
machine in which the instruction to be executed is determined by the state of the machine
and the input symbol being processed. Although structurally simple, deterministic finite
automata have applications in many disciplines including pattern recognition, the design of
switching circuits, and the lexical analysis of programming languages.
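
As a minimal sketch of this description (not an example taken from the text), the following Python function simulates a deterministic finite automaton: it reads the input once, left to right, and each step depends only on the current state and the current input symbol. The function name `run_dfa` and the sample machine `EVEN_ONES`, which accepts binary strings containing an even number of 1s, are assumptions chosen for illustration.

```python
def run_dfa(transitions, start, accepting, string):
    """Read the input once, left to right; return True if the DFA accepts."""
    state = start
    for symbol in string:
        # The next state is determined solely by the current state and
        # the input symbol being processed.
        state = transitions[(state, symbol)]
    return state in accepting


# Hypothetical machine: accepts strings over {0, 1} with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

print(run_dfa(EVEN_ONES, "even", {"even"}, "1001"))  # True: two 1s
print(run_dfa(EVEN_ONES, "even", {"even"}, "1011"))  # False: three 1s
```

The set of strings for which `run_dfa` returns `True` is the language associated with the machine, in the sense discussed later in this introduction.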
A more powerful family of machines, known as pushdown automata, is created by
adding an external stack memory to finite automata. The addition of the stack extends the
computational capabilities of a finite automaton. As with the Turing machines, our study of
computability will characterize the computational capabilities of both of these families of
machines.
Language definition and computability, the dual themes of this book, are not two
unrelated topics that fall under the broad heading of computer science theory, but rather
they are inextricably intertwined. The computations of a machine can be used to recognize
a language; an input string is accepted by the machine if the computation initiated with the
string indicates its syntactic correctness. Thus each machine has an associated language,
the set of strings accepted by the machine. The computational capabilities of each family of
abstract machines are characterized by the languages accepted by the machines in the family.
With this in mind, we begin our investigations into the related topics of language definition
and effective computation.
PART I

Foundations

Theoretical computer science includes the study of language definition, pattern recognition,
the capabilities and limitations of algorithmic computation, and the analysis
of the complexity of problems and their solutions. These topics are built on the foundations
of set theory and discrete mathematics. Chapter 1 reviews the mathematical concepts,
operations, and notation required for the study of formal language theory and the theory of
computation.
Formal language theory has its roots in linguistics, mathematical logic, and computer
science. A set-theoretic definition of language is given in Chapter 2. This definition is sufficiently
broad to include both natural (spoken and written) languages and formal languages,
but the generality is gained at the expense of not providing an effective method for generating
the strings of a language. To overcome this shortcoming, recursive definitions and
set operations are used to give finite specifications of languages. This is followed by the
introduction of regular sets, a family of languages that arises in automata theory, formal
language theory, switching circuits, and neural networks. The section ends with an example
of the use of regular expressions, a shorthand notation for regular sets, in describing
patterns for searching text.
CHAPTER 1

Mathematical Preliminaries

Set theory and discrete mathematics provide the mathematical foundation for formal language
theory, computability theory, and the analysis of computational complexity. We begin
our study of these topics with a review of the notation and basic operations of set theory.
Cardinality measures the size of a set and provides a precise definition of an infinite set.
One of the interesting results of the investigations into the properties of sets by German
mathematician Georg Cantor is that there are different sizes of infinite sets. While Cantor’s
work showed that there is a complete hierarchy of sizes of infinite sets, it is sufficient for
our purposes to divide infinite sets into two classes: countable and uncountable. A set is
countably infinite if it has the same number of elements as the set of natural numbers. Sets
with more elements than the natural numbers are uncountable.
In this chapter we will use a construction known as the diagonalization argument
to show that the set of functions defined on the natural numbers is uncountably infinite.
After we have agreed upon what is meant by the terms effective procedure and computable
function (reaching this consensus is a major goal of Part III of this book), we will be
able to determine the size of the set of functions that can be algorithmically computed.
A comparison of the sizes of these two sets will establish the existence of functions whose
values cannot be computed by any algorithmic process.
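
The flavor of the diagonalization argument can be previewed with a small sketch. It is only an illustration under stated assumptions: the real argument applies to an arbitrary infinite listing of functions on the natural numbers, whereas a program can only examine a finite prefix. The helper name `diagonal` and the four sample functions are hypothetical.

```python
def diagonal(functions):
    """Return g with g(n) = functions[n](n) + 1 for each listed index n."""
    def g(n):
        return functions[n](n) + 1
    return g


# Hypothetical finite prefix of a listing f_0, f_1, f_2, ... of functions on N.
listing = [
    lambda n: 0,          # f_0: constant zero
    lambda n: n,          # f_1: identity
    lambda n: n * n,      # f_2: squaring
    lambda n: 2 * n + 1,  # f_3: odd numbers
]

g = diagonal(listing)
for i, f in enumerate(listing):
    # g differs from f_i at the argument i, so g cannot be f_i.
    print(i, f(i), g(i))
```

Because `g` disagrees with the i-th function at argument i, it cannot appear anywhere in the listing; applied to any proposed complete listing, the same construction shows that the listing must have missed a function.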
While a set may consist of an arbitrary collection of objects, we are interested in sets
whose elements can be mechanically produced. Recursive definitions are introduced to
generate the elements of a set. The relationship between recursively generated sets and
mathematical induction is developed, and induction is shown to provide a general proof
technique for establishing properties of elements in recursively generated infinite sets.

This chapter ends with a review of directed graphs and trees, structures that will be
used throughout the book to graphically illustrate the concepts of formal language theory
and the theory of computation.

1.1 Set Theory


We assume that the reader is familiar with the notions of elementary set theory. In this
section, the concepts and notation of that theory are briefly reviewed. The symbol $\in$ signifies
membership; $x \in X$ indicates that x is a member or element of the set X. A slash through a
symbol represents not, so $x \notin X$ signifies that x is not a member of X. Two sets are equal if
they contain the same members. Throughout this book, sets are denoted by capital letters.
In particular, X, Y, and Z are used to represent arbitrary sets. Italics are used to denote the
elements of a set. For example, symbols and strings of the form a, b, A, B, aaaa, and abc
represent elements of sets.
Brackets { } are used to indicate a set definition. Sets with a small number of members
can be defined explicitly; that is, their members can be listed. The sets

X = {1, 2, 3}
Y = {a, b, c, d, e}

are defined in an explicit manner. Sets having a large finite or infinite number of members
must be defined implicitly. A set is defined implicitly by specifying conditions that describe
the elements of the set. The set consisting of all perfect squares is defined by

$$\{n \mid n = m^2 \text{ for some natural number } m\}.$$

The vertical bar | in an implicit definition is read “such that.” The entire definition is read
“the set of n such that n equals m squared for some natural number m.”
The previous example mentioned the set of natural numbers. This important set,
denoted N, consists of the numbers 0, 1, 2, 3, . . . . The empty set, denoted $\emptyset$, is the set
that has no members and can be defined explicitly by $\emptyset = \{\,\}$.
A set is determined completely by its membership; the order in which the elements are
presented in the definition is immaterial. The explicit definitions

X = {1, 2, 3}, Y = {2, 1, 3}, Z = {1, 3, 2, 2, 2}

describe the same set. The definition of Z contains multiple instances of the number 2.
Repetition in the definition of a set does not affect the membership. Set equality requires
that the sets have exactly the same members, and this is the case; each of the sets X, Y, and
Z has the natural numbers 1, 2, and 3 as its members.
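
The two styles of definition can be tried out in any language with a built-in set type. The following Python sketch is an illustration only: the helper name perfect_squares and its cutoff argument bound are assumptions introduced here, since a program can list only a finite portion of an infinite set.

```python
# Explicit definitions: order and repetition do not affect membership.
X = {1, 2, 3}
Y = {2, 1, 3}
Z = {1, 3, 2, 2, 2}


def perfect_squares(bound):
    """Finite slice of {n | n = m^2 for some natural number m}.

    Only the squares smaller than `bound` are produced; the cutoff is an
    assumption of this sketch, not part of the set's definition.
    """
    return {m * m for m in range(bound) if m * m < bound}


print(X == Y == Z)                   # the three definitions describe one set
print(sorted(perfect_squares(50)))   # the perfect squares below 50
```

The implicit definition becomes a set comprehension whose condition plays the role of the text after the vertical bar.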
A set Y is a subset of X, written $Y \subseteq X$, if every member of Y is also a member of X.
The empty set is trivially a subset of every set. Every set X is a subset of itself. If Y is a
subset of X and $Y \neq X$, then Y is called a proper subset of X. The set of all subsets of X
is called the power set of X and is denoted $\mathcal{P}(X)$.

Example 1.1.1

Let X = {1, 2, 3}. The subsets of X are

$\emptyset$    {1}    {2}    {3}

{1, 2}    {2, 3}    {3, 1}    {1, 2, 3}. □
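
The subsets listed in this example can be generated mechanically. The sketch below is one possible way to do so in Python; the function name power_set is an assumption of this illustration, and frozenset is used only because Python sets cannot contain ordinary mutable sets.

```python
from itertools import combinations


def power_set(xs):
    """Return the set of all subsets of the finite set xs."""
    elems = sorted(xs)
    return {
        frozenset(combo)
        for size in range(len(elems) + 1)      # subset sizes 0, 1, ..., |xs|
        for combo in combinations(elems, size)
    }


subsets = power_set({1, 2, 3})
for s in sorted(subsets, key=lambda s: (len(s), sorted(s))):
    print(set(s) if s else "empty set")
```

A set with n members has 2^n subsets, so the example produces eight of them.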
Set operations are used to construct new sets from existing ones. The union of two sets
is defined by

$$X \cup Y = \{z \mid z \in X \text{ or } z \in Y\}.$$

The or is inclusive. This means that z is a member of $X \cup Y$ if it is a member of X or Y or
both. The intersection of two sets is the set of elements common to both. This is defined
by

$$X \cap Y = \{z \mid z \in X \text{ and } z \in Y\}.$$

Two sets whose intersection is empty are said to be disjoint. The union and intersection of
n sets, $X_1, X_2, \ldots, X_n$, are defined by

$$\bigcup_{i=1}^{n} X_i = X_1 \cup X_2 \cup \cdots \cup X_n = \{x \mid x \in X_i, \text{ for some } i = 1, 2, \ldots, n\}$$

$$\bigcap_{i=1}^{n} X_i = X_1 \cap X_2 \cap \cdots \cap X_n = \{x \mid x \in X_i, \text{ for all } i = 1, 2, \ldots, n\},$$

respectively.
Subsets $X_1, X_2, \ldots, X_n$ of a set X are said to partition X if

i) $X = \bigcup_{i=1}^{n} X_i$
ii) $X_i \cap X_j = \emptyset$, for $1 \le i, j \le n$, and $i \neq j$.
For example, the set of even natural numbers (zero is considered even) and the set of odd
natural numbers partition N.
The difference of sets X and Y, $X - Y$, consists of the elements of X that are not in Y:

$$X - Y = \{z \mid z \in X \text{ and } z \notin Y\}.$$

Let X be a subset of a universal set U. The complement of X with respect to U is the set
of elements in U but not in X. In other words, the complement of X with respect to U is
the set $U - X$. When the universe U is known, the complement of X with respect to U is
denoted $\overline{X}$. The following identities, known as DeMorgan’s Laws, exhibit the relationships
between union, intersection, and complement when X and Y are subsets of a set U and
complementation is taken with respect to U:
i) $\overline{(X \cup Y)} = \overline{X} \cap \overline{Y}$
ii) $\overline{(X \cap Y)} = \overline{X} \cup \overline{Y}$.

Example 1.1.2

Let X = {0, 1, 2, 3}, Y = {2, 3, 4, 5}, and let $\overline{X}$ and $\overline{Y}$ denote the complement of X and Y
with respect to N. Then

$X \cup Y = \{0, 1, 2, 3, 4, 5\}$        $\overline{X} = \{n \mid n > 3\}$
$X \cap Y = \{2, 3\}$                    $\overline{Y} = \{0, 1\} \cup \{n \mid n > 5\}$
$X - Y = \{0, 1\}$                       $\overline{X} \cap \overline{Y} = \{n \mid n > 5\}$
$Y - X = \{4, 5\}$                       $\overline{(X \cup Y)} = \{n \mid n > 5\}$

The final two sets in the right-hand column exhibit the equality required by DeMorgan’s
Law. □
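
The computations in this example can be checked with Python's set operators. Because the complement with respect to N is infinite, the sketch below assumes a finite universe U = {0, 1, ..., 11} as a stand-in; that cutoff and the helper name complement are assumptions of this illustration.

```python
U = set(range(12))   # assumption: a finite stand-in for the universe N
X = {0, 1, 2, 3}
Y = {2, 3, 4, 5}


def complement(s, universe=U):
    """Elements of the universe that are not in s."""
    return universe - s


union = X | Y            # X union Y
intersection = X & Y     # X intersect Y
difference_xy = X - Y    # X - Y
difference_yx = Y - X    # Y - X

# DeMorgan's Laws, checked inside the finite universe U
demorgan_1 = complement(X | Y) == complement(X) & complement(Y)
demorgan_2 = complement(X & Y) == complement(X) | complement(Y)
print(demorgan_1, demorgan_2)
```

Within the finite universe, the complement of X is the set of values from 4 through 11, which mirrors {n | n > 3} in the example.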

The definition of subset provides the method for proving that a set X is a subset of Y;
we must show that every element of X is also an element of Y. When X is finite, we can
explicitly check each element of X for membership in Y. When X contains infinitely many
elements, a different approach is needed. The strategy is to show that an arbitrary element
of X is in Y.

Example 1.1.3

We will show that $X = \{8n - 1 \mid n > 0\}$ is a subset of $Y = \{2m + 1 \mid m \text{ is odd}\}$. To gain a
better understanding of the sets X and Y, it is useful to generate some of the elements of X
and Y:
X: $8 \cdot 1 - 1 = 7$, $8 \cdot 2 - 1 = 15$, $8 \cdot 3 - 1 = 23$, $8 \cdot 4 - 1 = 31$, . . .
Y: $2 \cdot 1 + 1 = 3$, $2 \cdot 3 + 1 = 7$, $2 \cdot 5 + 1 = 11$, $2 \cdot 7 + 1 = 15$, . . .
To establish the inclusion, we must show that every element of X is also an element of Y.
An arbitrary element x of X has the form $8n - 1$, for some $n > 0$. Let $m = 4n - 1$. Then m
is an odd natural number and

$$\begin{aligned}
2m + 1 &= 2(4n - 1) + 1 \\
&= 8n - 2 + 1 \\
&= 8n - 1 \\
&= x.
\end{aligned}$$

Thus x is also in Y and $X \subseteq Y$. □
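
A finite spot check cannot replace the proof, but it can confirm that the witness m = 4n - 1 behaves as claimed. In the sketch below the names in_Y and witness, and the test range of 1000 values, are assumptions of this illustration.

```python
def in_Y(x):
    """True if x = 2m + 1 for some odd natural number m."""
    if x < 1 or (x - 1) % 2 != 0:
        return False
    m = (x - 1) // 2
    return m % 2 == 1


def witness(n):
    """The value m = 4n - 1 chosen in the proof for the element 8n - 1."""
    return 4 * n - 1


for n in range(1, 1001):
    x = 8 * n - 1
    m = witness(n)
    assert m % 2 == 1          # m is odd
    assert 2 * m + 1 == x      # 2m + 1 reproduces x
    assert in_Y(x)             # so x is a member of Y
```

The loop checks only the first thousand elements of X; the algebraic argument above is what establishes the inclusion for every n.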

Set equality can be defined using set inclusion; sets X and Y are equal if $X \subseteq Y$ and
$Y \subseteq X$. This simply states that every element of X is also an element of Y and vice versa.
When establishing the equality of two sets, the two inclusions are usually proved separately
and combined to yield the equality.

Example 1.1.4

We prove that the sets

$$X = \{n \mid n = m^2 \text{ for some natural number } m > 0\}$$

$$Y = \{n^2 + 2n + 1 \mid n \ge 0\}$$

are equal. First, we show that every element of X is also an element of Y. Let $x \in X$; then
$x = m^2$ for some natural number $m > 0$. Let $m_0$ be that number. Then x can be written

$$\begin{aligned}
x &= (m_0)^2 \\
&= (m_0 - 1 + 1)^2 \\
&= (m_0 - 1)^2 + 2(m_0 - 1) + 1.
\end{aligned}$$

Letting $n = m_0 - 1$, we see that $x = n^2 + 2n + 1$ with $n \ge 0$. Consequently, x is a member
of the set Y.
We now establish the opposite inclusion. Let $y = (n_0)^2 + 2n_0 + 1$ be an element of Y.
Factoring yields $y = (n_0 + 1)^2$. Thus y is the square of a natural number greater than zero
and therefore an element of X.
Since $X \subseteq Y$ and $Y \subseteq X$, we conclude that X = Y. □

1.2 Cartesian Product, Relations, and Functions


The Cartesian product is a set operation that builds a set consisting of ordered pairs of
elements from two existing sets. The Cartesian product of sets X and Y, denoted $X \times Y$, is
defined by

$$X \times Y = \{[x, y] \mid x \in X \text{ and } y \in Y\}.$$

A binary relation on X and Y is a subset of $X \times Y$. The ordering of the natural numbers
can be used to generate a relation LT (less than) on the set $N \times N$. This relation is the subset
of $N \times N$ defined by

$$\text{LT} = \{[i, j] \mid i < j \text{ and } i, j \in N\}.$$

The notation $[i, j] \in \text{LT}$ indicates that i is less than j; for example, $[0, 1], [0, 2] \in \text{LT}$ and
$[1, 1] \notin \text{LT}$.

The Cartesian product can be generalized to construct new sets from any finite number
of sets. If $x_1, x_2, \ldots, x_n$ are n elements, then $[x_1, x_2, \ldots, x_n]$ is called an ordered n-tuple.
An ordered pair is simply another name for an ordered 2-tuple. Ordered 3-tuples, 4-tuples,
and 5-tuples are commonly referred to as triples, quadruples, and quintuples, respectively.
The Cartesian product of n sets $X_1, X_2, \ldots, X_n$ is defined by

$$X_1 \times X_2 \times \cdots \times X_n = \{[x_1, x_2, \ldots, x_n] \mid x_i \in X_i, \text{ for } i = 1, 2, \ldots, n\}.$$

An n-ary relation on $X_1, X_2, \ldots, X_n$ is a subset of $X_1 \times X_2 \times \cdots \times X_n$. 1-ary, 2-ary, and
3-ary relations are called unary, binary, and ternary, respectively.

Example 1.2.1

Let X = {1, 2, 3} and Y = {a, b}. Then

a) $X \times Y = \{[1, a], [1, b], [2, a], [2, b], [3, a], [3, b]\}$
b) $Y \times X = \{[a, 1], [a, 2], [a, 3], [b, 1], [b, 2], [b, 3]\}$
c) $Y \times Y = \{[a, a], [a, b], [b, a], [b, b]\}$
d) $X \times Y \times Y = \{[1, a, a], [1, b, a], [2, a, a], [2, b, a], [3, a, a], [3, b, a],$
   $[1, a, b], [1, b, b], [2, a, b], [2, b, b], [3, a, b], [3, b, b]\}$ □
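
The products in this example, and a finite piece of the relation LT, can be built with the standard itertools.product function. The sketch below is an illustration; Python tuples stand in for the ordered pairs written with square brackets in the text, and restricting LT to the numbers 0 through 3 is an assumption made to keep the relation finite.

```python
from itertools import product

X = [1, 2, 3]
Y = ["a", "b"]

XY = set(product(X, Y))          # X x Y
YX = set(product(Y, X))          # Y x X
YY = set(product(Y, repeat=2))   # Y x Y
XYY = set(product(X, Y, Y))      # X x Y x Y, a set of ordered triples

# A finite piece of LT: pairs [i, j] with i < j, restricted to 0..3
LT = {(i, j) for i, j in product(range(4), repeat=2) if i < j}

print(len(XY), len(YX), len(YY), len(XYY))
print(XY == YX)   # order within a pair matters, so the products differ
```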

Informally, a function from a set X to a set Y is a mapping of elements of X to elements
of Y in which each element of X is mapped to at most one element of Y. A function f from
X to Y is denoted $f : X \to Y$. The element of Y assigned by the function f to an element
$x \in X$ is denoted $f(x)$. The set X is called the domain of the function and the elements
of X are the arguments or operands of the function f. The range of f is the subset of Y
consisting of the members of Y that are assigned to elements of X. Thus the range of a
function $f : X \to Y$ is the set $\{y \in Y \mid y = f(x) \text{ for some } x \in X\}$.
The relationship that assigns to each person his or her age is a function from the set of
people to the natural numbers. Note that an element in the range may be assigned to more
than one element of the domain— there are many people who have the same age. Moreover,
not all natural numbers are in the range of the function; it is unlikely that the number 1000
is assigned to anyone.
The domain of a function is a set, but this set is often the Cartesian product of two or
more sets. A function

$$f : X_1 \times X_2 \times \cdots \times X_n \to Y$$

is said to be an n-variable function or operation. The value of the function with variables
$x_1, x_2, \ldots, x_n$ is denoted $f(x_1, x_2, \ldots, x_n)$. Functions with one, two, or three variables
are often referred to as unary, binary, and ternary operations. The function $sq : N \to N$
that assigns $n^2$ to each natural number is a unary operation. When the domain of a function
consists of the Cartesian product of a set X with itself, the function is simply said to be a
binary operation on X. Addition and multiplication are examples of binary operations on N.

A function f relates members of the domain to members of the range of f. A natural
definition of function is in terms of this relation. A total function f from X to Y is a binary
relation on $X \times Y$ that satisfies the following two properties:

i) For each $x \in X$, there is a $y \in Y$ such that $[x, y] \in f$.
ii) If $[x, y_1] \in f$ and $[x, y_2] \in f$, then $y_1 = y_2$.

Condition (i) guarantees that each element of X is assigned a member of Y, hence the term
total. The second condition ensures that this assignment is unique. The previously defined
relation LT is not a total function since it does not satisfy the second condition. A relation
on $N \times N$ representing greater than fails to satisfy either of the conditions. Why?

Example 1.2.2

Let X = {1, 2, 3} and Y = {a, b}. The eight total functions from X to Y are listed below.

x   f(x)      x   f(x)      x   f(x)      x   f(x)
1   a         1   a         1   a         1   b
2   a         2   a         2   b         2   a
3   a         3   b         3   a         3   a

x   f(x)      x   f(x)      x   f(x)      x   f(x)
1   a         1   b         1   b         1   b
2   b         2   a         2   b         2   b
3   b         3   b         3   a         3   b
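
The eight functions can be enumerated by choosing a value in Y for each element of X, and conditions (i) and (ii) can be tested directly on a finite relation. The following Python sketch is illustrative; the names total_functions and is_total_function are assumptions introduced here, with a dictionary or a set of pairs standing in for the relation.

```python
from itertools import product


def total_functions(X, Y):
    """All total functions from finite X to finite Y, each as a dict."""
    xs, ys = sorted(X), sorted(Y)
    return [dict(zip(xs, values)) for values in product(ys, repeat=len(xs))]


def is_total_function(pairs, X, Y):
    """Check conditions (i) and (ii) for a relation given as a set of pairs."""
    if any(x not in X or y not in Y for x, y in pairs):
        return False   # not a relation on X x Y at all
    # (i) and (ii) together: every x is paired with exactly one y
    return all(sum(1 for px, _ in pairs if px == x) == 1 for x in X)


functions = total_functions({1, 2, 3}, {"a", "b"})
print(len(functions))
```

Each of the three elements of X has two possible values, which gives 2 · 2 · 2 functions.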

A partial function f from X to Y is a relation on $X \times Y$ in which $y_1 = y_2$ whenever
$[x, y_1] \in f$ and $[x, y_2] \in f$. A partial function f is defined for an argument x if there is a
$y \in Y$ such that $[x, y] \in f$. Otherwise, f is undefined for x. A total function is simply a
partial function defined for all elements of the domain.
Although functions have been formally defined in terms of relations, we will use the
standard notation $f(x) = y$ to indicate that y is the value assigned to x by the function f, that
is, that $[x, y] \in f$. The notation $f(x)\uparrow$ indicates that the partial function f is undefined for
the argument x. The notation $f(x)\downarrow$ is used to show that $f(x)$ is defined without explicitly
giving its value.
Integer division defines a binary partial function div from $N \times N$ to N. The quotient
obtained from the division of i by j, when defined, is assigned to div(i, j). For example,
div(3, 2) = 1, div(4, 2) = 2, and div(1, 2) = 0. Using the previous notation, $div(i, 0)\uparrow$ and
$div(i, j)\downarrow$ for all values of j other than zero.
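
One common way to model a partial function in a program is to return a distinguished value where the function is undefined. The sketch below takes that approach for div; using None for "undefined" and the helper name is_defined are assumptions of this illustration, not notation from the text.

```python
from typing import Optional


def div(i: int, j: int) -> Optional[int]:
    """Integer division on natural numbers; None marks 'undefined'."""
    if j == 0:
        return None        # div(i, 0) is undefined
    return i // j          # quotient of i divided by j


def is_defined(value: Optional[int]) -> bool:
    """True when the partial function produced a value."""
    return value is not None


print(div(3, 2), div(4, 2), div(1, 2))
print(is_defined(div(5, 0)))
```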
A total function $f : X \to Y$ is said to be one-to-one if each element of X maps to a
distinct element in the range. Formally, f is one-to-one if $x_1 \neq x_2$ implies $f(x_1) \neq f(x_2)$.
A function $f : X \to Y$ is said to be onto if the range of f is the entire set Y. A total function
that is both one-to-one and onto defines a correspondence between the elements of the domain
and the range.

Example 1.2.3
The functions f, g, and s are defined from N to $N - \{0\}$, the set of positive natural numbers.
i) $f(n) = 2n + 1$
ii) $g(n) = \begin{cases} 1 & \text{if } n = 0 \\ n & \text{otherwise} \end{cases}$
iii) $s(n) = n + 1$

The function f is one-to-one but not onto; the range of f consists of the odd numbers.
The mapping from N to $N - \{0\}$ defined by g is clearly onto but not one-to-one since
$g(0) = g(1) = 1$. The function s is both one-to-one and onto, defining a correspondence
that maps each natural number to its successor. □

Example 1.2.4

In the preceding example we noted that the function f(n) = 2n + 1 is one-to-one, but not
onto the set $N - \{0\}$. It is, however, a mapping from N to the set of odd natural numbers
that is both one-to-one and onto. We will use f to demonstrate how to prove that a function
has these properties.
One-to-one: To prove that a function is one-to-one, we show that n and m must be the same
whenever f(n) = f(m). The assumption f(n) = f(m) yields
2n + 1 = 2m + 1, or
2n = 2m, and finally,
n = m.
It follows that $n \neq m$ implies $f(n) \neq f(m)$, and f is one-to-one.
Onto: To establish that f maps N onto the set of odd natural numbers, we must show that
every odd natural number is in the range of f. If m is an odd natural number, it can be
written m = 2n + 1 for some $n \in N$. Then f(n) = 2n + 1 = m and m is in the range of f.
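
On a finite portion of the domain, both properties can be tested by brute force. The sketch below does this for f and for the function g of Example 1.2.3; the helper names and the choice of the first ten natural numbers are assumptions of this illustration, and a finite check of this kind supports but does not replace the proofs above.

```python
def is_one_to_one(f, domain):
    """No two distinct elements of the finite domain share an image."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)


def is_onto(f, domain, codomain):
    """Every element of the finite codomain is the image of some element."""
    return {f(x) for x in domain} == set(codomain)


def f(n):
    return 2 * n + 1


def g(n):
    return 1 if n == 0 else n


domain = range(10)             # 0, 1, ..., 9
odds = range(1, 20, 2)         # odd numbers 1, 3, ..., 19
positives = range(1, 20)       # 1, 2, ..., 19

print(is_one_to_one(f, domain), is_onto(f, domain, odds))
print(is_onto(f, domain, positives), is_one_to_one(g, domain))
```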

1.3 Equivalence Relations


A binary relation over a set X has been formally defined as a subset of the Cartesian product
$X \times X$. Informally, we use a relation to indicate whether a property holds between two
elements of a set. An ordered pair is in the relation if its elements satisfy the prescribed
condition. For example, the property is less than defines a binary relation on the set of
natural numbers. The relation defined by this property is the set $\text{LT} = \{[i, j] \mid i < j\}$.
Infix notation is often used to express membership in many common binary relations.
In this standard usage, i < j indicates that i is less than j and consequently the pair [i, j]
is in the relation LT defined above.
We now consider a type of relation, known as an equivalence relation, that can be used
to partition the underlying set. Equivalence relations are generally denoted using the infix
notation $a \equiv b$ to indicate that a is equivalent to b.

Definition 1.3.1
A binary relation $\equiv$ over a set X is an equivalence relation if it satisfies
i) Reflexivity: $a \equiv a$, for all $a \in X$
ii) Symmetry: $a \equiv b$ implies $b \equiv a$, for all $a, b \in X$
iii) Transitivity: $a \equiv b$ and $b \equiv c$ implies $a \equiv c$, for all $a, b, c \in X$.

Definition 1.3.2
Let $\equiv$ be an equivalence relation over X. The equivalence class of an element $a \in X$ defined
by the relation $\equiv$ is the set $[a]_\equiv = \{b \in X \mid a \equiv b\}$.

Example 1.3.1

Let $\equiv_P$ be the parity relation over N defined by $n \equiv_P m$ if, and only if, n and m have the
same parity (even or odd). To prove that $\equiv_P$ is an equivalence relation, we must show that
it is reflexive, symmetric, and transitive.
i) Reflexivity: For every natural number n, n has the same parity as itself and $n \equiv_P n$.
ii) Symmetry: If $n \equiv_P m$, then n and m have the same parity and $m \equiv_P n$.
iii) Transitivity: If $n \equiv_P m$ and $m \equiv_P k$, then n and m have the same parity and m and k
have the same parity. It follows that n and k have the same parity and $n \equiv_P k$.
The two equivalence classes of the parity relation $\equiv_P$ are $[0]_{\equiv_P} = \{0, 2, 4, \ldots\}$ and
$[1]_{\equiv_P} = \{1, 3, 5, \ldots\}$. □
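
For a finite set, the equivalence classes of a relation can be collected by comparing each element with one representative of every class found so far. The Python sketch below is an illustration under the assumption that the supplied predicate really is an equivalence relation; the function name equivalence_classes and the restriction to the numbers 0 through 9 are introduced here.

```python
def equivalence_classes(elements, related):
    """Group the finite collection `elements` into equivalence classes.

    `related(a, b)` is assumed to be reflexive, symmetric, and transitive,
    so comparing against a single representative of each class suffices.
    """
    classes = []
    for a in elements:
        for cls in classes:
            representative = next(iter(cls))
            if related(a, representative):
                cls.add(a)
                break
        else:
            classes.append({a})   # a starts a new class
    return classes


def same_parity(n, m):
    return n % 2 == m % 2


classes = equivalence_classes(range(10), same_parity)
print(classes)
```

The classes returned are pairwise disjoint and their union is the original set.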

An equivalence class is usually written $[a]_\equiv$, where a is an element in the class. In the
preceding example, $[0]_{\equiv_P}$ was used to represent the set of even natural numbers. Lemma
1.3.3 shows that if $a \equiv b$, then $[a]_\equiv = [b]_\equiv$. Thus the element chosen to represent the class
is irrelevant.

Lemma 1.3.3
Let $\equiv$ be an equivalence relation over X and let a and b be elements of X. Then either
$[a]_\equiv = [b]_\equiv$ or $[a]_\equiv \cap [b]_\equiv = \emptyset$.
Proof. Assume that the intersection of $[a]_\equiv$ and $[b]_\equiv$ is not empty. Then there is some
element c that is in both of the equivalence classes. Using symmetry and transitivity, we
show that $[b]_\equiv \subseteq [a]_\equiv$. Since c is in both $[a]_\equiv$ and $[b]_\equiv$, we know $a \equiv c$ and $b \equiv c$. By
symmetry, $c \equiv b$. Using transitivity, we conclude that $a \equiv b$.
Now let d be any element in $[b]_\equiv$. Then $b \equiv d$. The combination of $a \equiv b$, $b \equiv d$, and
transitivity yields $a \equiv d$. That is, $d \in [a]_\equiv$. We have shown that every element in $[b]_\equiv$ is
also in $[a]_\equiv$, so $[b]_\equiv \subseteq [a]_\equiv$. By a similar argument, we can establish that $[a]_\equiv \subseteq [b]_\equiv$. The
two inclusions combine to produce the desired set equality. ■

Theorem 1.3.4
Let $\equiv$ be an equivalence relation over X. The equivalence classes of $\equiv$ partition X.
Proof. By Lemma 1.3.3, we know that the equivalence classes form a disjoint family of
subsets of X. Let a be any element of X. By reflexivity, $a \in [a]_\equiv$. Thus each element of X
is in one of the equivalence classes. It follows that the union of the equivalence classes is
the entire set X. ■

1.4 Countable and Uncountable Sets
Cardinality is a measure that compares the size of sets. Intuitively, the cardinality of a set is
the number of elements in the set. This informal definition is sufficient when dealing with
finite sets; the cardinality can be obtained by counting the elements of the set. There are
obvious difficulties in extending this approach to infinite sets.
Two finite sets can be shown to have the same number of elements by constructing a
one-to-one correspondence between the elements of the sets. For example, the mapping

$a \to 1$
$b \to 2$
$c \to 3$

demonstrates that the sets {a, b, c} and {1, 2, 3} have the same size. This approach, comparing
the size of sets using mappings, works equally well for sets with a finite or infinite
number of members.

Definition 1.4.1

i) Two sets X and Y have the same cardinality if there is a total one-to-one function from
X onto Y.
ii) The cardinality of a set X is less than or equal to the cardinality of a set Y if there is
a total one-to-one function from X into Y.

Note that the two definitions differ only by the extent to which the mapping covers the set Y.
If the range of the one-to-one mapping is all of Y, then the two sets have the same cardinality.
The cardinality of a set X is denoted card(X). The relationships in (i) and (ii) are
denoted card(X) = card(Y) and $\text{card}(X) \le \text{card}(Y)$, respectively. The cardinality of X is
said to be strictly less than that of Y, written card(X) < card(Y), if $\text{card}(X) \le \text{card}(Y)$ and
$\text{card}(X) \neq \text{card}(Y)$. The Schröder-Bernstein Theorem establishes the familiar relationship
between $\le$ and = for cardinality. The proof of the Schröder-Bernstein Theorem is left as
an exercise.

Theorem 1.4.2 (Schröder-Bernstein)
If $\text{card}(X) \le \text{card}(Y)$ and $\text{card}(Y) \le \text{card}(X)$, then card(X) = card(Y).

The cardinality of a finite set is denoted by the number of elements in the set. Thus
card({a, b}) = 2. A set that has the same cardinality as the set of natural numbers is said
to be countably infinite or denumerable. Intuitively, a set is denumerable if its members
can be put into an order and counted. The mapping f that establishes the correspondence
with the natural numbers provides such an ordering; the first element is f(0), the second
f(1), the third f(2), and so on. The term countable refers to sets that are either finite or
denumerable. A set that is not countable is said to be uncountable.
The set $N - \{0\}$ is countably infinite; the function s(n) = n + 1 defines a one-to-one
mapping from N onto $N - \{0\}$. It may seem paradoxical that the set $N - \{0\}$, obtained
by removing an element from N, has the same number of elements as N. Clearly, there is
no one-to-one mapping of a finite set onto a proper subset of itself. It is this property that
differentiates finite and infinite sets.

Definition 1.4.3
A set is infinite if it has a proper subset of the same cardinality.

Example 1.4.1

The set of odd natural numbers is countably infinite. The function f(n) = 2n + 1 from
Example 1.2.4 establishes the one-to-one correspondence between N and the odd numbers.

A set is countably infinite if its elements can be put in a one-to-one correspondence
with the natural numbers. A diagram of a mapping from N onto a set graphically illustrates
the countability of the set. The one-to-one correspondence between the natural numbers
and the set of all integers

. . . -3 -2 -1 0 1 2 3 . . .
exhibits the countability of the set of integers. This correspondence is defined by the function

$$f(n) = \begin{cases} \text{div}(n, 2) + 1 & \text{if } n \text{ is odd} \\ -\text{div}(n, 2) & \text{if } n \text{ is even.} \end{cases}$$
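
The correspondence between the natural numbers and the integers is easy to experiment with. The sketch below assumes the function has the piecewise form div(n, 2) + 1 for odd n and −div(n, 2) for even n; the name nat_to_int is introduced here for illustration.

```python
def nat_to_int(n):
    """Map a natural number to an integer: 0, 1, -1, 2, -2, 3, -3, ..."""
    if n % 2 == 1:
        return n // 2 + 1      # odd n map to the positive integers
    return -(n // 2)           # even n map to zero and the negative integers


print([nat_to_int(n) for n in range(7)])
```

Every integer between −10 and 10 is reached exactly once by the arguments 0 through 20, which is the finite shadow of the one-to-one and onto properties.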

Example 1.4.2

The points of an infinite two-dimensional grid can be used to show that $N \times N$, the set of
ordered pairs of natural numbers, is denumerable. The grid is constructed by labeling the
axes with the natural numbers. The position defined by the ith entry on the horizontal axis
and the jth entry on the vertical axis represents the ordered pair [i, j].

The elements of the grid can be listed sequentially by following the arrows in the diagram.
This creates the correspondence

0        1        2        3        4        5        6        7        . . .
[0, 0]   [0, 1]   [1, 0]   [0, 2]   [1, 1]   [2, 0]   [0, 3]   [1, 2]   . . .

that demonstrates the countability of $N \times N$. The one-to-one correspondence outlined above
maps the ordered pair [i, j] to the natural number $((i + j)(i + j + 1)/2) + i$. □
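
The formula for the position of a pair and the diagonal-by-diagonal listing can be compared directly. In the Python sketch below the names pair_index and pairs_in_order are assumptions of this illustration; the listing walks each diagonal i + j = d in order of increasing i, matching the sequence shown above.

```python
def pair_index(i, j):
    """Natural number assigned to the ordered pair [i, j]."""
    return (i + j) * (i + j + 1) // 2 + i


def pairs_in_order(count):
    """First `count` pairs, listed diagonal by diagonal."""
    result = []
    d = 0
    while len(result) < count:
        for i in range(d + 1):
            result.append((i, d - i))   # pairs with i + j = d
        d += 1
    return result[:count]


for position, (i, j) in enumerate(pairs_in_order(8)):
    print(position, [i, j], pair_index(i, j))
```

The product (i + j)(i + j + 1) is always even, so the integer division is exact.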

The sets of interest in language theory and computability are almost exclusively finite
or denumerable. We state, without proof, several closure properties of countable sets.

Theorem 1.4.4

i) The union of two countable sets is countable.
ii) The Cartesian product of two countable sets is countable.
iii) The set of finite subsets of a countable set is countable.
iv) The set of finite-length sequences consisting of elements of a nonempty countable set
is countably infinite.

The preceding theorem indicates that the property of countability is retained under
many standard set-theoretic operations. Each of these closure results can be established by
constructing a one-to-one correspondence between the new set and a subset of the natural
numbers.
A set is uncountable if it is impossible to sequentially list its members. The following
proof technique, known as Cantor's diagonalization argument, is used to show that there
is an uncountable number of total functions from N to N. Two total functions $f : N \to N$
and $g : N \to N$ are equal if they have the same value for every element in the domain. That
is, f = g if f(n) = g(n) for all $n \in N$. To show that two functions are distinct, it suffices
to find a single input value for which the functions differ.
Assume that the set of total functions from the natural numbers to the natural numbers
is denumerable. Then there is a sequence $f_0, f_1, f_2, \ldots$ that contains all the functions. The
values of the functions are exhibited in the two-dimensional grid with the input values on
the horizontal axis and the functions on the vertical axis.

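$$\begin{array}{c|cccccc}
 & 0 & 1 & 2 & 3 & 4 & \cdots \\ \hline
f_0 & f_0(0) & f_0(1) & f_0(2) & f_0(3) & f_0(4) & \cdots \\
f_1 & f_1(0) & f_1(1) & f_1(2) & f_1(3) & f_1(4) & \cdots \\
f_2 & f_2(0) & f_2(1) & f_2(2) & f_2(3) & f_2(4) & \cdots \\
f_3 & f_3(0) & f_3(1) & f_3(2) & f_3(3) & f_3(4) & \cdots \\
f_4 & f_4(0) & f_4(1) & f_4(2) & f_4(3) & f_4(4) & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots &
\end{array}$$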

Consider the function $f : N \to N$ defined by $f(n) = f_n(n) + 1$. The values of f are
obtained by adding 1 to the values on the diagonal of the grid, hence the name diagonalization.
By the definition of f, $f(i) \neq f_i(i)$ for every i. Consequently, f is not in the sequence
$f_0, f_1, f_2, \ldots$. This is a contradiction since the sequence was assumed to contain all the
total functions. The assumption that the number of functions is countably infinite leads to
a contradiction. It follows that the set is uncountable.
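
The construction of the diagonal function can be imitated on a finite list. The sketch below is only an illustration of the idea: a real listing would be infinite, and the four sample functions and the name diagonal are assumptions introduced here.

```python
def diagonal(listing):
    """Return f with f(n) = f_n(n) + 1 for the functions in `listing`."""
    def f(n):
        return listing[n](n) + 1   # differ from the nth function at input n
    return f


# A hypothetical finite listing f_0, f_1, f_2, f_3
listing = [
    lambda n: 0,
    lambda n: n,
    lambda n: n * n,
    lambda n: 2 * n + 1,
]

f = diagonal(listing)
for i in range(len(listing)):
    print(i, listing[i](i), f(i))
```

For each index i, f disagrees with the ith function on the input i, so f cannot appear anywhere in the listing.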
Diagonalization is a general proof technique for demonstrating that a set is not countable.
As seen in the preceding example, establishing uncountability using diagonalization
is a proof by contradiction. The first step is to assume that the set is countable and therefore
its members can be exhaustively listed. The contradiction is achieved by producing
a member of the set that cannot occur anywhere in the list. No conditions are put on the
listing of the elements other than that it must contain all the elements of the set. Producing
a contradiction by diagonalization shows that there is no possible exhaustive listing of the
elements and consequently that the set is uncountable. This technique is exhibited again in
the following examples.

Example 1.4.3
A function f from N to N has a fixed point if there is some natural number i such that
f(i) = i. For example, $f(n) = n^2$ has fixed points 0 and 1, while $f(n) = n^2 + 1$ has no
fixed points. We will show that the number of functions that do not have fixed points is
uncountable. The argument is similar to the proof that the number of all functions from N
to N is uncountable, except that we now have an additional condition that must be met when
constructing an element that is not in the listing.
Assume that the number of the functions without fixed points is countable. Then these
functions can be listed $f_0, f_1, f_2, \ldots$. To obtain a contradiction to our assumption that the
set is countable, we construct a function that has no fixed points and is not in the list. Consider
the function $f(n) = f_n(n) + n + 1$. The addition of n + 1 in the definition of f ensures that
f(n) > n for all n. Thus f has no fixed points. By an argument similar to that given above,
$f(i) \neq f_i(i)$ for all i. Consequently, the listing $f_0, f_1, f_2, \ldots$ is not exhaustive, and we
conclude that the number of functions without fixed points is uncountable. □

Example 1.4.4

$\mathcal{P}(N)$, the set of subsets of N, is uncountable. Assume that the set of subsets of N is
countable. Then they can be listed $N_0, N_1, N_2, \ldots$. Define a subset D of N as follows: For
every natural number j,

$$j \in D \text{ if, and only if, } j \notin N_j.$$

By our construction, $0 \in D$ if $0 \notin N_0$, $1 \in D$ if $1 \notin N_1$, and so on. The set D is clearly a set of
natural numbers. By our assumption, $N_0, N_1, N_2, \ldots$ is an exhaustive listing of the subsets
of N. Hence, $D = N_i$ for some i. Is the number i in the set D? By definition of D,

$$i \in D \text{ if, and only if, } i \notin N_i.$$

But since $D = N_i$, this becomes

$$i \in D \text{ if, and only if, } i \notin D,$$

which is a contradiction. Thus, our assumption that $\mathcal{P}(N)$ is countable must be false and we
conclude that $\mathcal{P}(N)$ is uncountable.
To appreciate the “diagonal” technique, consider a two-dimensional grid with the
natural numbers on the horizontal axis and the vertical axis labeled by the sets $N_0$, $N_1$,
$N_2, \ldots$. The position of the grid designated by row $N_i$ and column j contains yes if $j \in N_i$.
Otherwise, the position defined by $N_i$ and column j contains no. The set D is constructed by
considering the relationship between the entries along the diagonal of the grid: the number
j and the set $N_j$. By the way that we have defined D, the number j is an element of D if,
and only if, the entry in the position labeled by $N_j$ and j is no. □
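
The diagonal set D can be computed for any finite listing of sets, which makes the "yes/no along the diagonal" reading concrete. The following Python sketch is illustrative; the four sample sets and the name diagonal_set are assumptions introduced here.

```python
def diagonal_set(listing):
    """D = {j | j is not in N_j}, for the sets N_0, N_1, ... in `listing`."""
    return {j for j, N_j in enumerate(listing) if j not in N_j}


# A hypothetical finite listing N_0, N_1, N_2, N_3
listing = [{0, 1}, {2}, {0, 2}, set()]

D = diagonal_set(listing)
print(D)
print(all(D != N_j for N_j in listing))
```

D differs from the jth set on the question of whether j is a member, so D cannot equal any set in the listing.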

1.5 Diagonalization and Self-Reference


In addition to its use in cardinality proofs, diagonalization provides a method for demonstrating
that certain properties or relations are inherently contradictory. These results are
used in nonexistence proofs since there can be no object that satisfies such a property.
Diagonalization proofs of nonexistence frequently depend upon contradictions that arise from
self-reference: an object analyzing its own actions, properties, or characteristics. Russell’s
paradox, the undecidability of the Halting Problem for Turing Machines, and Gödel’s proof
of the undecidability of number theory are all based on contradictions associated with
self-reference.
The diagonalization proofs in the preceding section used a table with operators listed
on the vertical axis and their arguments on the horizontal axis to illustrate the relationship
between the operators and arguments. In each example, the operators were of a different
type than their arguments. In self-reference, the same family of objects comprises the
operators and their arguments. We will use the barber’s paradox, an amusing simplification
of Russell’s paradox, to illustrate diagonalization and self-reference.
The barber’s paradox is concerned with who shaves whom in a mythical town. We are
told that every man who is able to shave himself does so and that the barber of the town
(a man himself) shaves all and only the people who cannot shave themselves. We wish to
consider the possible truth of such a statement and the existence of such a town. In this case,
the set of males in the town make up both the operators and the arguments; they are doing
the shaving and being shaved. Let $M = \{p_1, p_2, p_3, \ldots, p_i, \ldots\}$ be the set of all males in
the town. A tabular representation of the shaving relationship has the form

$$\begin{array}{c|cccccc}
 & p_1 & p_2 & p_3 & \cdots & p_i & \cdots \\ \hline
p_1 \\
p_2 \\
p_3 \\
\vdots \\
p_i \\
\vdots
\end{array}$$

where the i, jth position of the table has a 1 if $p_i$ shaves $p_j$ and a 0 otherwise. Every column
will have one entry with a 1 and all the other entries will be 0; each person either shaves
himself or is shaved by the barber. The barber must be one of the people in the town, so
he is $p_i$ for some value i. What is the value of the position i, i in the table? This is classic
self-reference; we are asking what occurs when a particular object is simultaneously the
operator (the person doing the shaving) and the operand (the person being shaved).
Who shaves the barber? If the barber is able to shave himself, then he cannot do so since
he shaves only people who are unable to shave themselves. If he is unable to shave himself,
then he must shave himself since he shaves everyone who cannot shave themselves. We
have shown that the properties describing the shaving habits of the town are contradictory
so such a town cannot exist.
Russell’s paradox follows the same pattern, but its consequences were much more
significant than the nonexistence of a mythical town. One of the fundamental tenets of
set theory as proposed by Cantor in the late 1800s was that any property or condition that
can be described defines a set— the set of objects that satisfy the condition. There may be
no objects, finitely many, or infinitely many that satisfy the property, but regardless of the
number or the type of elements, the objects form a set. Russell devised an argument based
on self-reference to show that this claim cannot be true.
The relationship examined by Russell’s paradox is that of the membership of one set
in another. For each set X we ask the question, “Is a set Y an element of X?” This is not
an unreasonable question, since one set can certainly be an element of another. The table
below gives both some negative and positive examples of this question.

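$$\begin{array}{lll}
X & Y & Y \in X? \\ \hline
\{a\} & \{a\} & \text{no} \\
\{\{a\}, b\} & \{a\} & \text{yes} \\
\{\{a\}, a, \emptyset\} & \emptyset & \text{yes} \\
\{\{a, b\}, \{a\}\} & \{\{a\}\} & \text{no} \\
\{\{\{a\}, b\}, b\} & \{\{a\}, b\} & \text{yes}
\end{array}$$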

It is important to note that the question is not whether Y is a subset of X, but whether it is
an element of X.
The membership relation can be depicted by the table

$$\begin{array}{c|cccccc}
 & X_1 & X_2 & X_3 & \cdots & X_i & \cdots \\ \hline
X_1 \\
X_2 \\
X_3 \\
\vdots \\
X_i \\
\vdots
\end{array}$$

where axes are labeled by the sets. A table entry [i, j] is 1 if $X_j$ is an element of $X_i$ and 0
if $X_j$ is not an element of $X_i$.
A question of self-reference can be obtained by identifying the operator and the operand
in the membership question. That is, we ask if a set X_i is an element of itself. The diagonal
entry [i, i] in the preceding table contains the answer to the question, “Is X_i an element of
X_i?” Now consider the property that a set is not an element of itself. Does this property
define a set? There are clearly examples of sets that satisfy the property; the set {a} is not
an element of itself. The satisfaction of the property is indicated by the complement of the
diagonal. A set X_i is not an element of itself if, and only if, entry [i, i] is 0.
Assume that S = {X | X ∉ X} is a set. Is S in S? If S is an element of itself, then it is
not in S by the definition of S. Moreover, if S is not in S, then it must be in S since it is not
an element of itself. This is an obvious contradiction. We were led to this contradiction by
our assumption that the collection of sets that satisfy the property X ∉ X forms a set.
We have constructed a describable property that cannot define a set. This shows that
Cantor’s assertion about the universality of sets is demonstrably false. The ramifications of
Russell’s paradox were far-reaching. The study of set theory moved from a foundation based
on naive definitions to formal systems of axioms and inference rules and helped initiate the
formalist philosophy of mathematics. In Chapter 12 we will use self-reference to establish
a fundamental result in the theory of computer science, the undecidability of the Halting
Problem.

1.6 Recursive Definitions


Many, in fact most, of the sets of interest in formal language and automata theory contain
an infinite number of elements. Thus it is necessary that we develop techniques to describe,
generate, or recognize the elements that belong to an infinite set. In the preceding section we
described the set of natural numbers utilizing ellipsis dots (. . .). This seemed reasonable
since everyone reading this text is familiar with the natural numbers and knows what comes
after 0, 1, 2, 3. However, this description would be totally inadequate for an alien unfamiliar
with our base 10 arithmetic system and numeric representations. Such a being would have
no idea that the symbol 4 is the next element in the sequence or that 1492 is a natural
number.
In the development of a mathematical theory, such as the theory of languages or
automata, the theorems and proofs may utilize only the definitions of the concepts of that
theory. This requires precise definitions of both the objects of the domain and the operations.
A method of definition must be developed that enables our friend the alien, or a computer
that has no intuition, to generate and “understand” the properties of the elements of a set.
A recursive definition of a set X specifies a method for constructing the elements
of the set. The definition utilizes two components: a basis and a set of operations. The
basis consists of a finite set of elements that are explicitly designated as members of X.
The operations are used to construct new elements of the set from the previously defined
members. The recursively defined set X consists of all elements that can be generated from
the basis elements by a finite number of applications of the operations.
The key word in the process of recursively defining a set is generate. Clearly, no
process can list the complete set of natural numbers. Any particular number, however, can be
obtained by beginning with zero and constructing an initial sequence of the natural numbers.
This intuitively describes the process of recursively defining the set of natural numbers. This
idea is formalized in the following definition.

Definition 1.6.1
A recursive definition of N, the set of natural numbers, is constructed using the successor
function s.
i) Basis: 0 ∈ N.
ii) Recursive step: If n ∈ N, then s(n) ∈ N.
iii) Closure: n ∈ N only if it can be obtained from 0 by a finite number of applications of
the operation s.
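
The generative reading of Definition 1.6.1 can be sketched in a few lines. This is an illustration only, assuming natural numbers are represented as unabbreviated terms written as strings; the names ZERO, first_naturals, and value are hypothetical and not part of the text.

```python
ZERO = "0"  # the basis element, written as a term

def s(n):
    """Successor: wrap the term n in one more application of s."""
    return f"s({n})"

def first_naturals(count):
    """Return the first `count` terms generated from the basis 0 by s."""
    terms = []
    current = ZERO                # basis: 0 is in N
    for _ in range(count):
        terms.append(current)
        current = s(current)      # recursive step: n in N implies s(n) in N
    return terms

def value(term):
    """Decode a term to a Python int by counting the applications of s."""
    return term.count("s")

print(first_naturals(4))
```

No call can list all of N, but any particular number is reached after finitely many applications of s, which is the content of the closure clause.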

The basis explicitly states that 0 is a natural number. In (ii), a new natural number
is defined in terms of a previously defined number and the successor operation. The closure
section guarantees that the set contains only those elements that can be obtained
from 0 using the successor operator. Definition 1.6.1 generates an infinite sequence 0,
s(0), s(s(0)), s(s(s(0))), . . . . This sequence is usually abbreviated 0, 1, 2, 3, . . . . However,
anything that can be done with the familiar Arabic numerals could also be done with
the more cumbersome unabbreviated representation.
The essence of a recursive procedure is to define complicated processes or structures
in terms of simpler instances of the same process or structure. In the case of the natural
numbers, “simpler” often means smaller. The recursive step of Definition 1.6.1 defines a
number in terms of its predecessor.
The natural numbers have now been defined, but what does it mean to understand their
properties? We usually associate operations of addition, multiplication, and subtraction with
the natural numbers. We may have learned these by brute force, either through memorization
or tedious repetition. For the alien or a computer to perform addition, the meaning of “add”
must be appropriately defined. One cannot memorize the sum of all possible combinations
of natural numbers, but we can use recursion to establish a method by which the sum of any
two numbers can be mechanically calculated. The successor function is the only operation
on the natural numbers that has been introduced. Thus the definition of addition may use
only 0 and s.

Definition 1.6.2
In the following recursive definition of the sum of m and n, the recursion is done on n, the
second argument of the sum.
i) Basis: If n = 0, then m + n = m.
ii) Recursive step: m + s(n) = s(m + n).
iii) Closure: m + n = k only if this equality can be obtained from m + 0 = m using finitely
many applications of the recursive step.

The closure step is often omitted from a recursive definition of an operation on a given
domain. In this case, it is assumed that the operation is defined for all the elements of the
domain. The operation of addition given above is defined for all elements of N × N.
The sum of m and the successor of n is defined in terms of the simpler case, the sum of
m and n, and the successor operation. The choice of n as the recursive operand was arbitrary;
the operation could also have been defined in terms of m, with n fixed.
Following the construction given in Definition 1.6.2, the sum of any two natural
numbers can be computed using 0 and s, the primitives used in the definition of the natural
numbers. Example 1.6.1 traces the recursive computation of 3 + 2.

Example 1.6.1

The numbers 3 and 2 abbreviate s(s(s(0))) and s(s(0)), respectively. The sum is computed
recursively by

s(s(s(0))) + s(s(0))
= s(s(s(s(0))) + s(0))
= s(s(s(s(s(0))) + 0))
= s(s(s(s(s(0))))) (basis case).

This final value is the representation of the number 5. □
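
The computation in Example 1.6.1 can be reproduced with a short program. The sketch below assumes a hypothetical encoding in which 0 is the empty tuple and s(n) is a one-element tuple containing n; only the two clauses of Definition 1.6.2 are used to add.

```python
ZERO = ()  # hypothetical encoding: 0 is the empty tuple

def s(n):
    """Successor: s(n) is a one-element tuple holding n."""
    return (n,)

def add(m, n):
    """Addition by recursion on the second argument, as in Definition 1.6.2."""
    if n == ZERO:             # basis: m + 0 = m
        return m
    (k,) = n                  # n has the form s(k)
    return s(add(m, k))       # recursive step: m + s(k) = s(m + k)

def show(n):
    """Render an encoded number in the unabbreviated s(...) notation."""
    return "0" if n == ZERO else f"s({show(n[0])})"

three = s(s(s(ZERO)))
two = s(s(ZERO))
print(show(add(three, two)))
```

The recursive calls unwind in the same order as the lines of the example: two applications of the recursive step followed by the basis case.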

Figure 1.1 illustrates the process of recursively generating a set X from basis X_0. Each of
the concentric circles represents a stage of the construction. X_1 represents the basis elements
and the elements that can be obtained from them using a single application of an operation
defined in the recursive step. X_i contains the elements that can be constructed with i or
fewer operations. The generation process in the recursive portion of the definition produces
a countably infinite sequence of nested sets. The set X can be thought of as the infinite union
of the X_i's. Let x be an element of X and let X_j be the first set in which x occurs. This
means that x can be constructed from the basis elements using exactly j applications of the
operators. Although each element of X can be generated by a finite number of applications of
the operators, there is no upper bound on the number of applications needed to generate the
entire set X. This property, generation using a finite but unbounded number of operations,
is a fundamental property of recursive definitions.
The successor operator can be used recursively to define relations on the set N × N. The
Cartesian product N × N is often portrayed by the grid of points representing the ordered
pairs. Following the standard conventions, the horizontal axis represents the first component
of the ordered pair and the vertical axis the second. The shaded area in Figure 1.2(a) contains
the ordered pairs [i, j] in which i < j. This set is the relation LT, less than, that was described
in Section 1.2.

Example 1.6.2

The relation LT is defined as follows:

i) Basis: [0, 1] ∈ LT.
ii) Recursive step: If [m, n] ∈ LT, then [m, s(n)] ∈ LT and [s(m), s(n)] ∈ LT.
iii) Closure: [m, n] ∈ LT only if it can be obtained from [0, 1] by a finite number of
applications of the operations in the recursive step.

Recursive generation of X:
X_0 = {x | x is a basis element}
X_{i+1} = X_i ∪ {x | x can be generated by i + 1 operations}
X = {x | x ∈ X_j for some j ≥ 0}

FIGURE 1.1 Nested sequence of sets in recursive definition.

Using the infinite union description of recursive generation, the definition of LT generates
the sequence LT_i of nested sets where

LT_0 = {[0, 1]}
LT_1 = LT_0 ∪ {[0, 2], [1, 2]}
LT_2 = LT_1 ∪ {[0, 3], [1, 3], [2, 3]}
LT_3 = LT_2 ∪ {[0, 4], [1, 4], [2, 4], [3, 4]}
. . .
LT_i = LT_{i-1} ∪ {[j, i + 1] | j = 0, 1, . . . , i}
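
The nested sets LT_i can be generated directly from the recursive definition. The following sketch assumes that Python integers stand in for the natural numbers, with n + 1 playing the role of s(n); the function name lt_stage is hypothetical.

```python
def lt_stage(i):
    """Return LT_i: the pairs obtainable from [0, 1] with at most i steps."""
    stage = {(0, 1)}                      # LT_0 contains only the basis element
    for _ in range(i):
        new = set()
        for (m, n) in stage:
            new.add((m, n + 1))           # [m, s(n)]
            new.add((m + 1, n + 1))       # [s(m), s(n)]
        stage = stage | new               # next stage = previous stage plus new pairs
    return stage

for i in range(4):
    print(i, sorted(lt_stage(i)))
```

Each stage adds exactly the pairs whose second component is i + 1, in agreement with the general form given for LT_i.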


The construction of LT shows that the generation of an element in a recursively defined
set may not be unique. The ordered pair [1, 3] ∈ LT_2 is generated by the two distinct
sequences of operations:

Basis    [0, 1]                   [0, 1]
1        [0, s(1)] = [0, 2]       [s(0), s(1)] = [1, 2]
2        [s(0), s(2)] = [1, 3]    [1, s(2)] = [1, 3].

FIGURE 1.2 Relations on N × N, panels (a) and (b).

Example 1.6.3

The shaded area in Figure 1.2(b) contains all the ordered pairs with second component 3,
4, 5, or 6. A recursive definition of this set, call it X, is given below.

i) Basis: [0, 3], [0, 4], [0, 5], and [0, 6] are in X.
ii) Recursive step: If [m, n] ∈ X, then [s(m), n] ∈ X.
iii) Closure: [m, n] ∈ X only if it can be obtained from the basis elements by a finite number
of applications of the operation in the recursive step.

The sequence of sets X_i generated by this recursive process is defined by

X_i = {[j, 3], [j, 4], [j, 5], [j, 6] | j = 0, 1, . . . , i}.

1.7 Mathematical Induction


Establishing relationships between the elements of sets and operations on the sets requires
the ability to construct proofs that verify the hypothesized properties. It is impossible to
prove that a property holds for every member in an infinite set by considering each element
individually. The principle of mathematical induction gives sufficient conditions for proving
that a property holds for every element in a recursively defined set. Induction uses the family
of nested sets generated by the recursive process to extend a property from the basis to the
entire set.

Principle of Mathematical Induction Let X be a set defined by recursion from the basis X_0
and let X_0, X_1, X_2, . . . , X_i, . . . be the sequence of sets generated by the recursive process.
Also let P be a property defined on the elements of X. If it can be shown that

i) P holds for each element in X_0,
ii) whenever P holds for every element in the sets X_0, X_1, . . . , X_i, P also holds for every
element in X_{i+1},

then, by the principle of mathematical induction, P holds for every element in X.


The soundness of the principle of mathematical induction can be intuitively exhibited
using the sequence of sets constructed in the recursive definition of X. Shading the circle X_i
indicates that P holds for every element of X_i. The first condition requires that the interior
set be shaded. Condition (ii) states that the shading can be extended from any circle to the
next concentric circle. Figure 1.3 illustrates how this process eventually shades the entire
set X.
The justification for the principle of mathematical induction should be clear from the
preceding argument. Another justification can be obtained by assuming that conditions (i)
and (ii) are satisfied but P is not true for every element in X. If P does not hold for all
elements of X, then there is at least one set X_i for which P does not universally hold. Let
X_j be the first such set. Since condition (i) asserts that P holds for all elements of X_0, j
cannot be zero. Now P holds for all elements of X_{j-1} by our choice of j. Condition (ii)
then requires that P hold for all elements in X_j. This implies that there is no first set in the
sequence for which the property P fails. Consequently, P must be true for all the X_i's, and
therefore for X.
An inductive proof consists of three distinct steps. The first step is proving that the
property P holds for each element of a basis set. This corresponds to establishing condition
(i) in the definition of the principle of mathematical induction. The second is the statement
of the inductive hypothesis. The inductive hypothesis is the assumption that the property P
holds for every element in the sets X_0, X_1, . . . , X_n. The inductive step then proves, using
the inductive hypothesis, that P can be extended to each element in X_{n+1}. Completing the
inductive step satisfies the requirements of the principle of mathematical induction. Thus,
it can be concluded that P is true for all elements of X.
In Example 1.6.2, a recursive definition was given to generate the relation LT, which
consists of ordered pairs [i, j] that satisfy i < j. Does every ordered pair generated by
the definition satisfy this inequality? We will use this question to illustrate the steps of an
inductive proof on a recursively defined set.
The first step is to explicitly show that the inequality is satisfied for all elements in the
basis. The basis of the recursive definition of LT is the set {[0, 1]}. The basis step of the
inductive proof is satisfied since 0 < 1.
FIGURE 1.3 Principle of mathematical induction.

The inductive hypothesis states the assumption that x < y for all ordered pairs [x, y] ∈
LT_n. In the inductive step we must prove that i < j for all ordered pairs [i, j] ∈ LT_{n+1}. The
recursive step in the definition of LT relates the sets LT_{n+1} and LT_n. Let [i, j] be an ordered
pair in LT_{n+1}. Then either [i, j] = [x, s(y)] or [i, j] = [s(x), s(y)] for some [x, y] ∈ LT_n.
By the inductive hypothesis, x < y. If [i, j] = [x, s(y)], then

i = x < y < s(y) = j.

Similarly, if [i, j] = [s(x), s(y)], then

i = s(x) < s(y) = j.

In either case, i < j and the inequality is extended to all ordered pairs in LT_{n+1}. This
completes the requirements for an inductive proof and consequently the inequality holds
for all ordered pairs in LT.
In the proof that every ordered pair [i, j] in the relation LT satisfies i < j, the inductive
step used only the assumption that the property was true for the elements generated by
the preceding application of the recursive step. This type of proof is sometimes referred
to as simple induction. When the inductive step utilizes the full strength of the inductive
hypothesis (that the property holds for all the previously generated elements), the proof
technique is called strong induction. Example 1.7.1 uses strong induction to establish a
relationship between the number of operators and the number of parentheses in an arithmetic
expression.

Example 1.7.1

A set E of arithmetic expressions is defined recursively from symbols {a, b}, operators +
and -, and parentheses as follows:
i) Basis: a and b are in E.
ii) Recursive step: If u and v are in E, then (u + v), (u - v), and (-v) are in E.
iii) Closure: An expression is in E only if it can be obtained from the basis by a finite
number of applications of the recursive step.

The recursive definition generates the expressions (a + b), (a + (b + b)), ((a + a) - (b - a))
in one, two, and three applications of the recursive step, respectively. We will use
induction to prove that the number of parentheses in an expression u is twice the number
of operators. That is, n_p(u) = 2n_o(u), where n_p(u) is the number of parentheses in u and
n_o(u) is the number of operators.

Basis: The basis for the induction consists of the expressions a and b. In this case,
n_p(a) = 0 = 2n_o(a) and n_p(b) = 0 = 2n_o(b).

Inductive Hypothesis: Assume that n_p(u) = 2n_o(u) for all expressions generated by n or
fewer iterations of the recursive step, that is, for all u in E_n.

Inductive Step: Let w be an expression generated by n + 1 applications of the recursive
step. Then w = (u + v), w = (u - v), or w = (-v) where u and v are strings in E_n. By the
inductive hypothesis,

n_p(u) = 2n_o(u)
n_p(v) = 2n_o(v).

If w = (u + v) or w = (u - v),

n_p(w) = n_p(u) + n_p(v) + 2
n_o(w) = n_o(u) + n_o(v) + 1.


Consequently,

2n_o(w) = 2n_o(u) + 2n_o(v) + 2 = n_p(u) + n_p(v) + 2 = n_p(w).

If w = (-v), then

2n_o(w) = 2(n_o(v) + 1) = 2n_o(v) + 2 = n_p(v) + 2 = n_p(w).

Thus the property n_p(w) = 2n_o(w) holds for all w ∈ E_{n+1} and we conclude, by mathematical
induction, that it holds for all expressions in E. □
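
The relationship proved in Example 1.7.1 can be spot-checked on generated expressions. This is a sketch under stated assumptions: expressions are plain strings, and the hypothetical function stage(n) applies the recursive step to everything built so far, so its stages are indexed by nesting depth rather than by the total number of operations. The union of the stages is still the set E. A finite check of this kind illustrates the claim but does not replace the inductive proof.

```python
from itertools import product

def next_stage(exprs):
    """Apply the recursive step once to every expression or pair in exprs."""
    new = set(exprs)
    for u, v in product(exprs, repeat=2):
        new.add(f"({u}+{v})")
        new.add(f"({u}-{v})")
    for v in exprs:
        new.add(f"(-{v})")
    return new

def stage(n):
    """Expressions obtainable from the basis {a, b} with nesting depth at most n."""
    exprs = {"a", "b"}
    for _ in range(n):
        exprs = next_stage(exprs)
    return exprs

def n_p(u):
    """Number of parentheses in the expression u."""
    return u.count("(") + u.count(")")

def n_o(u):
    """Number of operators in the expression u."""
    return u.count("+") + u.count("-")

print(all(n_p(u) == 2 * n_o(u) for u in stage(2)))
```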

Frequently, inductive proofs use the natural numbers as the underlying recursively
defined set. A recursive definition of this set with basis {0} is given in Definition 1.6.1. The
nth application of the recursive step produces the natural number n, and the corresponding
inductive step consists of extending the satisfaction of the property under consideration
from 0, . . . , n to n + 1.

Example 1.7.2

Induction is used to prove that 0 + 1 + · · · + n = n(n + 1)/2. Using the summation notation,
we can write the preceding expression as

$$\sum_{i=0}^{n} i = n(n + 1)/2.$$

Basis: The basis is n = 0. The relationship is explicitly established by computing the values
of each of the sides of the desired equality.

$$\sum_{i=0}^{0} i = 0 = 0(0 + 1)/2.$$

Inductive Hypothesis: Assume for all values k = 0, 1, . . . , n that

$$\sum_{i=0}^{k} i = k(k + 1)/2.$$

Inductive Step: We need to prove that

$$\sum_{i=0}^{n+1} i = (n + 1)(n + 1 + 1)/2 = (n + 1)(n + 2)/2.$$

The inductive hypothesis establishes the result for the sum of the sequence containing n
or fewer integers. Combining the inductive hypothesis with the properties of addition, we
obtain

$$\begin{aligned}
\sum_{i=0}^{n+1} i &= \sum_{i=0}^{n} i + (n + 1) && \text{(associativity of +)} \\
&= n(n + 1)/2 + (n + 1) && \text{(inductive hypothesis)} \\
&= (n + 1)(n/2 + 1) && \text{(distributive property)} \\
&= (n + 1)(n + 2)/2.
\end{aligned}$$

Since the conditions of the principle of mathematical induction have been established, we
conclude that the result holds for all natural numbers. □
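
The formula of Example 1.7.2 can be compared against a direct recursive computation of the sum. The sketch below uses hypothetical function names; it checks finitely many cases only, so it illustrates the statement rather than proving it.

```python
def sum_to(n):
    """Compute 0 + 1 + ... + n using sum(0) = 0 and sum(n) = sum(n - 1) + n."""
    if n == 0:
        return 0                  # basis
    return sum_to(n - 1) + n      # same decomposition as the inductive step

def closed_form(n):
    """The closed form n(n + 1)/2; the product n(n + 1) is always even."""
    return n * (n + 1) // 2

print(all(sum_to(n) == closed_form(n) for n in range(200)))
```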

Each step in the proof must follow from previously established properties of the
operators or the inductive hypothesis. The strategy of an inductive proof is to manipulate
the formula to contain an instance of the property applied to a simpler case. When this
is accomplished, the inductive hypothesis may be invoked. After the application of the
inductive hypothesis, the remainder of the proof often consists of algebraic manipulation
to produce the desired result.

1.8 Directed Graphs


A mathematical structure consists of a set or sets, distinguished elements from the sets,
and functions and relations on the sets. A distinguished element is an element of a set that
has special properties that differentiate it from the other elements. The natural numbers, as
defined in Definition 1.6.1, can be expressed as a structure (N, s, 0). The set N contains
the natural numbers, s is a unary function on N, and 0 is a distinguished element of N. Zero
is distinguished because of its explicit role in the definition of the natural numbers.
Graphs are frequently used to portray the essential features of a mathematical entity
in a diagram, which aids the intuitive understanding of the concept. Formally, a directed
graph is a mathematical structure consisting of a set N and a binary relation A on N. The
elements of N are called the nodes, or vertices, of the graph and the elements of A are called
arcs, or edges. The relation A is referred to as the adjacency relation. A node y is said to
be adjacent to x when [x, y] ∈ A. An arc from x to y in a directed graph is depicted by an
arrow from x to y. Using the arrow metaphor, y is called the head of the arc and x the tail.
The in-degree of a node x is the number of arcs with x as the head. The out-degree of x is
the number of arcs with x as the tail. Node a in Figure 1.4 has in-degree two and out-degree
one.
A path from a node x to a node y in a directed graph G = (N, A) is a sequence of
nodes and arcs x_0, [x_0, x_1], x_1, [x_1, x_2], x_2, . . . , x_{n-1}, [x_{n-1}, x_n], x_n of G with x = x_0 and
y = x_n. The node x is the initial node of the path and y is the terminal node. Each pair
of nodes x_i, x_{i+1} in the path is connected by the arc [x_i, x_{i+1}]. The length of a path is the
number of arcs in the path. We will frequently describe a path simply by sequentially listing
its arcs.

N = {a, b, c, d}
A = {[a, b], [b, a], [b, c], [b, d], [c, b], [c, d], [d, a], [d, d]}

Node    In-degree    Out-degree
a       2            1
b       2            3
c       1            2
d       3            2

FIGURE 1.4 Directed graph.

There is a path of length zero from any node to itself called the null path. A path
of length one or more that begins and ends with the same node is called a cycle. A cycle
is simple if it does not contain a cyclic subpath. The path [a, b], [b, c], [c, d], [d, a] in
Figure 1.4 is a simple cycle of length four. A directed graph containing at least one cycle is
said to be cyclic. A graph with no cycles is said to be acyclic.
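
The degree and cycle definitions can be checked on the graph of Figure 1.4. The sketch below assumes arcs are represented as Python pairs (tail, head). The arc from c to b is an assumption: it is illegible in this copy of the figure and is inferred from the degree table. The function names are hypothetical.

```python
N = {"a", "b", "c", "d"}
# Arcs as (tail, head) pairs; ("c", "b") is inferred from the degree table.
A = {("a", "b"), ("b", "a"), ("b", "c"), ("b", "d"),
     ("c", "b"), ("c", "d"), ("d", "a"), ("d", "d")}

def in_degree(x):
    """Number of arcs with x as the head."""
    return sum(1 for (_, head) in A if head == x)

def out_degree(x):
    """Number of arcs with x as the tail."""
    return sum(1 for (tail, _) in A if tail == x)

def is_path(arcs):
    """True if every arc is in A and each arc starts where the previous one ends."""
    return all(arc in A for arc in arcs) and all(
        arcs[k][1] == arcs[k + 1][0] for k in range(len(arcs) - 1))

def is_cycle(arcs):
    """True if arcs is a path of length one or more that returns to its initial node."""
    return len(arcs) >= 1 and is_path(arcs) and arcs[0][0] == arcs[-1][1]

for x in sorted(N):
    print(x, in_degree(x), out_degree(x))
```

Because [d, d] is an arc, the single-arc path [d, d] is itself a cycle of length one.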
The arcs of a directed graph often designate more than the adjacency of the nodes. A
labeled directed graph is a structure (N, L, A) where L is the set of labels and A is a relation
on N × N × L. An element [x, y, v] ∈ A is an arc from x to y labeled by v. The label
on an arc specifies a relationship between the adjacent nodes. The labels on the graph in
Figure 1.5 indicate the distances of the legs of a trip from Chicago to Minneapolis, Seattle,
San Francisco, Dallas, St. Louis, and back to Chicago.
An ordered tree, or simply a tree, is an acyclic directed graph in which each node is
connected by a unique path from a distinguished node called the root of the tree. The root
has in-degree zero and all other nodes have in-degree one. A tree is a structure (N, A, r)
where N is the set of nodes, A is the adjacency relation, and r ∈ N is the root of the tree.
The terminology of trees combines a mixture of references to family trees and to those of
the arboreal nature. Although a tree is a directed graph, the arrows on the arcs are usually
omitted in the illustrations of trees. Figure 1.6(a) gives a tree T with root x_1.
A node y is called a child of a node x, and x the parent of y, if y is adjacent to x.
Accompanying the adjacency relation is an order on the children of any node. When a tree
is drawn, this ordering is usually indicated by listing the children of a node in a left-to-right
manner according to the ordering. The order of the children of x_2 in T is x_4, x_5, and x_6.

FIGURE 1.5 Labeled directed graph.

A node with out-degree zero is called a leaf. All other nodes are referred to as internal
nodes. The depth of the root is zero; the depth of any other node is the depth of its parent
plus one. The height or depth of a tree is the maximum of the depths of the nodes in the
tree.
A node y is called a descendant of a node x, and x an ancestor of y, if there is a path
from x to y. With this definition, each node is an ancestor and descendant of itself. The
ancestor and descendant relations can be defined recursively using the adjacency relation
(Exercises 43 and 44). The minimal common ancestor of two nodes x and y is an ancestor
of both and a descendant of all other common ancestors. In the tree in Figure 1.6(a), the
minimal common ancestor of x_10 and x_11 is x_5, of x_10 and x_6 is x_2, and of x_10 and x_14 is x_1.
A subtree of a tree T is a subgraph of T that is a tree in its own right. The set of
descendants of a node x and the restriction of the adjacency relation to this set form a
subtree with root x. This tree is called the subtree generated by x.
The ordering of siblings in the tree can be extended to a relation LEFTOF on N × N.
LEFTOF attempts to capture the property of one node being to the left of another in the
diagram of a tree. For two nodes x and y, neither of which is an ancestor of the other,
the relation LEFTOF is defined in terms of the subtrees generated by the minimal common
ancestor of the nodes. Let z be the minimal common ancestor of x and y and let z_1, z_2, . . . ,
z_n be the children of z in their correct order. Then x is in the subtree generated by one of the
children of z, call it z_i. Similarly, y is in the subtree generated by z_j for some j. Since z is
the minimal common ancestor of x and y, i ≠ j. If i < j, then [x, y] ∈ LEFTOF; [y, x] ∈
LEFTOF otherwise. With this definition, no node is LEFTOF one of its ancestors. If x_13
were to the left of x_12, then x_10 must also be to the left of x_5, since they are both the first
child of their parent. The appearance of being to the left or right of an ancestor is a feature
of the diagram, not a property of the ordering of the nodes.

FIGURE 1.6 (a) Tree with root x_1. (b) Subtree generated by x_3.

The relation LEFTOF can be used to order the set of leaves of a tree. The frontier of
a tree is constructed from the leaves in the order generated by the relation LEFTOF. The
frontier of T is the sequence x_9, x_10, x_11, x_6, x_13, x_14, x_8.
When a family of graphs is defined recursively, the principle of mathematical induction
can be used to prove that properties hold for all graphs in the family. We will use induction to
demonstrate a relationship between the number of leaves and the number of arcs in strictly
binary trees, trees in which each node is either a leaf or has two children.

Example 1.8.1

A tree in which each node has at most two children is called a binary tree. If each node is
a leaf or has exactly two children, the tree is called strictly binary. The family of strictly
binary trees can be defined recursively as follows:

i) Basis: A directed graph T = ({r}, ∅, r) is a strictly binary tree.
ii) Recursive step: If T_1 = (N_1, A_1, r_1) and T_2 = (N_2, A_2, r_2) are strictly binary trees,
where N_1 and N_2 are disjoint and r ∉ N_1 ∪ N_2, then

T = (N_1 ∪ N_2 ∪ {r}, A_1 ∪ A_2 ∪ {[r, r_1], [r, r_2]}, r)

is a strictly binary tree.
iii) Closure: T is a strictly binary tree only if it can be obtained from the basis elements by
a finite number of applications of the construction given in the recursive step.

A strictly binary tree is either a single node or is constructed from two distinct strictly
binary trees by the addition of a root and arcs to the two subtrees. Let lv(T) and arc(T)
denote the number of leaves and arcs in a strictly binary tree T. We prove by induction that
2 lv(T) - 2 = arc(T) for all strictly binary trees.

Basis: The basis consists of strictly binary trees of the form ({r}, ∅, r). The equality clearly
holds in this case since a tree of this form has one leaf and no arcs.
Inductive Hypothesis: Assume that every strictly binary tree T generated by n or fewer
applications of the recursive step satisfies 2 lv(T) - 2 = arc(T).
Inductive Step: Let T be a strictly binary tree generated by n + 1 applications of the recursive
step in the definition of the family of strictly binary trees. T is built from a node r and two
previously constructed strictly binary trees T_1 and T_2 with roots r_1 and r_2, respectively.

The node r is not a leaf since it has arcs to the roots of T_1 and T_2. Consequently, lv(T) =
lv(T_1) + lv(T_2). The arcs of T consist of the arcs of the component trees plus the two arcs
from r.
Since T_1 and T_2 are strictly binary trees generated by n or fewer applications of the
recursive step, we may employ the inductive hypothesis to establish the desired equality.
By the inductive hypothesis,

2 lv(T_1) - 2 = arc(T_1)
2 lv(T_2) - 2 = arc(T_2).

Now,

arc(T) = arc(T_1) + arc(T_2) + 2
= 2 lv(T_1) - 2 + 2 lv(T_2) - 2 + 2
= 2(lv(T_1) + lv(T_2)) - 2
= 2 lv(T) - 2,

as desired. □
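
The recursive construction of strictly binary trees translates directly into code, and the equality proved in Example 1.8.1 can then be checked on particular trees. This is a minimal sketch: trees are triples (nodes, arcs, root), node names are drawn from a counter so that node sets stay disjoint, and the names leaf and join are hypothetical.

```python
from itertools import count

_fresh = count()  # supplies new node names so that node sets stay disjoint

def leaf():
    """Basis: the one-node tree ({r}, empty set, r)."""
    r = next(_fresh)
    return ({r}, set(), r)

def join(t1, t2):
    """Recursive step: add a new root r with arcs to the roots of t1 and t2."""
    (n1, a1, r1), (n2, a2, r2) = t1, t2
    assert n1.isdisjoint(n2), "component trees must have disjoint node sets"
    r = next(_fresh)
    return (n1 | n2 | {r}, a1 | a2 | {(r, r1), (r, r2)}, r)

def lv(t):
    """Number of leaves: nodes that are not the tail of any arc."""
    nodes, arcs, _ = t
    tails = {x for (x, _) in arcs}
    return len(nodes - tails)

def arc(t):
    """Number of arcs in the tree."""
    return len(t[1])

t = join(join(leaf(), leaf()), join(leaf(), join(leaf(), leaf())))
print(lv(t), arc(t), 2 * lv(t) - 2 == arc(t))
```

Calling join with the same tree twice trips the disjointness assertion, which mirrors the requirement in the recursive step that N_1 and N_2 be disjoint.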

Exercises
1. Let X = {1, 2, 3, 4} and Y = {0, 2, 4, 6}. Explicitly define the sets described in parts
(a) to (e).
a) X ∪ Y
b) X ∩ Y
c) X - Y
d) Y - X
e) 𝒫(X)
2. Let X = {a, b, c} and Y = {1, 2}.
a) List all the subsets of X.
b) List the members of X x Y.
c) List all total functions from Y to X.
3. Let X = {3^n | n > 0} and Y = {3n | n > 0}. Prove that X ⊆ Y.
4. Let X = {n^3 + 3n^2 + 3n | n ≥ 0} and Y = {n^3 - 1 | n > 0}. Prove that X = Y.
* 5. Prove DeMorgan’s Laws. Use the definition of set equality to establish the identities.
6. Give functions f : N → N that satisfy the following.
a) f is total and one-to-one but not onto.
b) f is total and onto but not one-to-one.
c) f is total, one-to-one, and onto but not the identity.
d) f is not total but is onto.
7. Prove that the function f : N → N defined by f(n) = n^2 + 1 is one-to-one but not onto.
8. Let f : R+ → R+ be the function defined by f(x) = 1/x, where R+ denotes the set of
positive real numbers. Prove that f is one-to-one and onto.
9. Give an example of a binary relation on N × N that is
a) reflexive and symmetric but not transitive.
b) reflexive and transitive but not symmetric.
c) symmetric and transitive but not reflexive.
10. Let ≡ be the binary relation on N defined by n ≡ m if, and only if, n = m. Prove that
≡ is an equivalence relation. Describe the equivalence classes of ≡.
11. Let ≡ be the binary relation on N defined by n ≡ m for all n, m ∈ N. Prove that ≡ is
an equivalence relation. Describe the equivalence classes of ≡.
12. Show that the binary relation LT, less than, is not an equivalence relation.
13. Let ≡_p be the binary relation on N defined by n ≡_p m if n mod p = m mod p. For
p > 2, prove that ≡_p is an equivalence relation. Describe the equivalence classes of
≡_p.
14. Let X_1, . . . , X_n be a partition of a set X. Define an equivalence relation ≡ on X whose
equivalence classes are precisely the sets X_1, . . . , X_n.
15. A binary relation ≡ is defined on ordered pairs of natural numbers as follows:
[m, n] ≡ [j, k] if, and only if, m + k = n + j. Prove that ≡ is an equivalence relation
in N × N.
16. Prove that the set of even natural numbers is denumerable.
17. Prove that the set of even integers is denumerable.
* 18. Prove that the set of nonnegative rational numbers is denumerable.


19. Prove that the union of two disjoint countable sets is countable.
20. Prove that there are an uncountable number of total functions from N to {0, 1}.
21. A total function f from N to N is said to be repeating if f(n) = f(n + 1) for some
n ∈ N. Otherwise, f is said to be nonrepeating. Prove that there are an uncountable
number of repeating functions. Also prove that there are an uncountable number of
nonrepeating functions.
22. A total function f from N to N is monotone increasing if f(n) < f(n + 1) for all n ∈
N. Prove that there are an uncountable number of monotone increasing functions.
23. Prove that there are uncountably many total functions from N to N that have a fixed
point. See Example 1.4.3 for the definition of a fixed point.
24. A total function f from N to N is nearly identity if f(n) = n − 1, n, or n + 1 for every
n. Prove that there are uncountably many nearly identity functions.
* 25. Prove that the set of real numbers in the interval [0, 1] is uncountable. Hint: Use the
diagonalization argument on the decimal expansion of real numbers. Be sure that each
number is represented by only one infinite decimal expansion.
26. Let F be the set of total functions of the form f : {0, 1} → N (functions that map from
{0, 1} to the natural numbers). Is the set of such functions countable or uncountable?
Prove your answer.
27. Prove that the binary relation on sets defined by X ≡ Y if, and only if, card(X) =
card(Y) is an equivalence relation.
* 28. Prove the Schröder-Bernstein Theorem.
29. Give a recursive definition of the relation is equal to on N × N using the operator s.
30. Give a recursive definition of the relation greater than on N × N using the successor
operator s.
31. Give a recursive definition of the set of points [m, n] that lie on the line n = 3m in
N × N. Use s as the operator in the definition.
32. Give a recursive definition of the set of points [m, n] that lie on or under the line n = 3m
in N × N. Use s as the operator in the definition.
33. Give a recursive definition of the operation of multiplication of natural numbers using
the operations s and addition.
34. Give a recursive definition of the predecessor operation

\[
\mathit{pred}(n) = \begin{cases} 0 & \text{if } n = 0 \\ n - 1 & \text{otherwise} \end{cases}
\]

using the operator s.


35. Subtraction on the set of natural numbers is defined by

\[
n \mathbin{\dot{-}} m = \begin{cases} n - m & \text{if } n > m \\ 0 & \text{otherwise.} \end{cases}
\]

This operation is often called proper subtraction. Give a recursive definition of proper
subtraction using the operations s and pred.

36. Let X be a finite set. Give a recursive definition of the set of subsets of X. Use union
as the operator in the definition.

* 37. Give a recursive definition of the set of finite subsets of N. Use union and the successor
s as the operators in the definition.

38. Prove that 2 + 5 + 8 + ⋯ + (3n − 1) = n(3n + 1)/2 for all n > 0.

39. Prove that 1 + 2 + 2² + ⋯ + 2ⁿ = 2ⁿ⁺¹ − 1 for all n > 0.

40. Prove 1 + 2ⁿ < 3ⁿ for all n > 2.

41. Prove that 3 is a factor of n³ − n + 3 for all n > 0.

42. Let P = {A, B} be a set consisting of two proposition letters (Boolean variables). The
set E of well-formed conjunctive and disjunctive Boolean expressions over P is defined
recursively as follows:
i) Basis: A, B ∈ E.
ii) Recursive step: If u, v ∈ E, then (u ∨ v) ∈ E and (u ∧ v) ∈ E.
iii) Closure: An expression is in E only if it is obtained from the basis by a finite
number of iterations of the recursive step.
a) Explicitly give the Boolean expressions in the sets E₀, E₁, and E₂.
b) Prove by mathematical induction that for every Boolean expression in E, the number
of occurrences of proposition letters is one more than the number of operators. For
an expression u, let nₚ(u) denote the number of proposition letters in u and nₒ(u)
denote the number of operators in u.
c) Prove by mathematical induction that, for every Boolean expression in E, the
number of left parentheses is equal to the number of right parentheses.
43. Give a recursive definition of all the nodes in a directed graph that can be reached by
paths from a given node x. Use the adjacency relation as the operation in the definition.
This definition also defines the set of descendants of a node in a tree.

44. Give a recursive definition of the set of ancestors of a node x in a tree.

45. List the members of the relation LEFTOF for the tree in Figure 1.6 (a).
46. Using the tree below, give the values of each of the items in parts (a) to (e).

[Figure: tree diagram for Exercise 46. The drawing is not reproduced here; legible node labels include x₅, x₆, x₈, x₁₀, x₁₁, x₁₂, x₁₄, and x₁₅.]

a) the depth of the tree


b) the ancestors of x₁₁
c) the minimal common ancestor of x₁₄ and x₁₁, of x₁₅ and x₁₁
d) the subtree generated by x₂
e) the frontier of the tree
47. Prove that a strictly binary tree with n leaves contains 2n − 1 nodes.
48. A complete binary tree of depth n is a strictly binary tree in which every node on levels
1, 2, ..., n − 1 is a parent and each node on level n is a leaf. Prove that a complete
binary tree of depth n has 2ⁿ⁺¹ − 1 nodes.
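
The recursive definitions requested in Exercises 33 to 35 and the identities in Exercises 38 to 41 can be explored numerically before a proof is attempted. The Python sketch below is one possible reading, not the textbook's solution. The names s, pred, add, mult, monus, and identities_hold are chosen here for illustration only, and a check over finitely many values of n is a sanity test, not a proof by induction.

```python
def s(n: int) -> int:
    """Successor operator s(n) = n + 1 on the natural numbers."""
    return n + 1


def pred(n: int) -> int:
    """Predecessor: pred(0) = 0 and pred(s(k)) = k."""
    if n == 0:
        return 0
    # Find the k with s(k) = n by counting up from 0 using only s.
    k = 0
    while s(k) != n:
        k = s(k)
    return k


def add(m: int, n: int) -> int:
    """Addition: m + 0 = m and m + s(n) = s(m + n)."""
    return m if n == 0 else s(add(m, pred(n)))


def mult(m: int, n: int) -> int:
    """Multiplication: m * 0 = 0 and m * s(n) = (m * n) + m."""
    return 0 if n == 0 else add(mult(m, pred(n)), m)


def monus(m: int, n: int) -> int:
    """Proper subtraction: m - 0 = m and m - s(n) = pred(m - n)."""
    return m if n == 0 else pred(monus(m, pred(n)))


def identities_hold(limit: int = 40) -> bool:
    """Check the statements of Exercises 38 to 41 for 1 <= n < limit."""
    for n in range(1, limit):
        # Exercise 38: 2 + 5 + ... + (3n - 1) = n(3n + 1)/2
        if sum(3 * k - 1 for k in range(1, n + 1)) != n * (3 * n + 1) // 2:
            return False
        # Exercise 39: 1 + 2 + ... + 2^n = 2^(n+1) - 1
        if sum(2 ** k for k in range(n + 1)) != 2 ** (n + 1) - 1:
            return False
        # Exercise 40: 1 + 2^n < 3^n, checked only for n > 2
        if n > 2 and not (1 + 2 ** n < 3 ** n):
            return False
        # Exercise 41: 3 divides n^3 - n + 3
        if (n ** 3 - n + 3) % 3 != 0:
            return False
    return True
```

Each recursive function mirrors the basis and recursive step of a definition in the style of this chapter: the basis handles a second argument of 0, and the recursive step peels one successor off the second argument. The functions are meant for small arguments only, since Python limits recursion depth.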

Bibliographic Notes
The topics presented in this chapter are normally covered in a first course in discrete
mathematics. A comprehensive presentation of the discrete mathematical structures important
to the foundations of computer science can be found in Bobrow and Arbib [1974].
There are a number of classic books that provide detailed presentations of the topics
introduced in this chapter. An introduction to set theory can be found in Halmos [1974],
Stoll [1963], and Fraenkel, Bar-Hillel, and Levy [1984]. The latter begins with an excellent
description of Russell’s paradox and other antinomies arising in set theory. The
diagonalization argument was originally presented by Cantor in 1874 and is reproduced in Cantor
[1947]. The texts by Wilson [1985], Ore [1963], Bondy and Murty [1977], and Busacker
and Saaty [1965] introduce the theory of graphs. Induction, recursion, and their relationship
to theoretical computer science are covered in Wand [1980].
GREAT SHIP OF HENRY VIII.
(From a drawing by Holbein.)
As already stated, the great majority of the ships built for mercantile
purposes were intended to be able to give a good account of
themselves if they should be assailed by a hostile vessel, a
contingency which was not at all unlikely in the days when ships
roved the seas under the protection of letters of marque and made
“mistakes” as to the nationality of the prize when the prospective
booty might be held to justify the error. Before the nations took to
building vessels especially for war every merchant was liable to have
his traders requisitioned for war purposes, and even up to the end of
the nineteenth century the inclusion of armed merchantmen in
national forces was not uncommon. Letters of marque were permits
granted to ship owners whose vessels had been despoiled by the
subjects of another nation to recoup themselves at the cost of any
vessels belonging to that nation which they could capture, and to
continue to do so until the losses were made good. Naturally they
found this profitable, much more so indeed than ordinary trading,
and did not hesitate to set a low value upon all captures when
casting about to find an excuse for another expedition. Piracy, too,
was rife, and as at sea every shipmaster was a law unto himself
unless there was someone at hand to enforce a change of views, the
shipmaster or merchant turned pirate usually flourished exceedingly
until captured red-handed, when his shrift was like to be a short
one.
As an instance of the license to which this liberty was extended, may
be mentioned the Barton family who, in the fifteenth century, had
granted to them letters of marque to prey upon the Portuguese in
retaliation for the murder of John Barton, who was captured and
beheaded by the Portuguese. His sons conducted the enterprise with
such thoroughness that they were able to pay their Scottish Royal
master so well that they were never interfered with by him, and
when he entrusted them with the task of reducing the Flemish
pirates who levied toll on Scottish commerce, they sent him a few
barrels filled with pickled human heads to show that they were not
idle. The fame of this Scottish family became world wide, for they
had now a powerful fleet and traded and fought and captured where
they would, so that the reputation of the Scottish navy was great.
One of the ships of the Barton family, the Lion, was second in size
and armament only to the Great Harry itself. The death of Sir
Andrew Barton is commemorated in a well-known ballad.
When vessels with two and more decks were constructed, the lower
ports were cut so near the water that when the vessel heeled, or
even a moderate sea was running, the guns could not be worked.
The ports of the Mary Rose, which was the next largest ship to the
Regent, at one time, and had a tonnage variously stated at 500 and
660 tons, though afterwards surpassed by the Sovereign, 800 tons,
Gabriel Royal, 650 tons, and Katherine Forteless, or Fortileza, were
but 16 inches above the water. She was lost, in 1545, through the
water entering her lower ports when going about off Spithead, and
her commander and six hundred men went down with her; the Great
Harry had a narrow escape from a similar disaster at the same time.
A report on the Royal Navy in 1552 makes interesting reading. The
fleet was overhauled, and twenty-four “ships and pinnaces are in
good case to serve, so that they may be grounded and caulked once
a year to keep them tight.” This is endorsed, “To be so ordered, By
the King’s Command.” Other seven ships were ordered to be “docked
and new dubbed, to search their treenails and iron work.” The Mrs.
Grand, a name which no longer adorns the “Navy List,” a vessel
carrying a crew of two hundred and fifty men, and having one brass
gun and twenty-two iron guns, lying at Deptford, was recommended
to be “dry-docked—not thought worthy of new making”; so she was
ordered “To lie still, or to take that which is profitable of her for
other Ships.” Six others were stated in the report—a document
seemingly the work of a naval reform party—to be “not worth
keeping,” but they were ordered “To be preserved, as they may with
little charge.”
Queen Elizabeth, whose patriotism and naval enthusiasm were about
equally in evidence, was careful of her men and ships, raised the pay
of her officers and seamen, and took steps generally to have the
navy and the naval resources strengthened and conserved. She
seems to have had twenty-nine vessels in 1565. She also
encouraged merchants to build large vessels, which could be
converted into warships as occasion required. The exigencies of
trading over sea, however, were such that many of the vessels
required little to be done to them in the way of conversion. Vessels
were also rated at from 50 to 100 tons more than they measured.
BREECH-LOADING GUN RECOVERED FROM THE
WRECK OF THE “MARY ROSE.”
In the Museum of the Royal United Service Institution. A spare chamber
is shown in the front.
“The Queen’s Highness,” a contemporary historian writes,[19] “hath
at this present already made and furnished, to the number of One
Hundred and Twenty Great Ships, which lie for the most part in
Gillingham Road. Beside these, her Grace hath other in hand also;
she hath likewise three notable Galleys, the Speedwell, the Tryeright,
and the Black Galley, with the sight whereof, and the rest of the
Navy-Royal, it is incredible to say how marvellously her Grace is
delighted. I add, to the end that all men should understand
somewhat of the great masses of treasure daily employed upon our
Navy, how there are few merchant ships of the first and second sort,
that being apparelled and made ready to sail, are not worth one
thousand pounds, or three thousand ducats at the least, if they
should presently be sold. What then shall we think of the Navy-
Royal, of which some one vessel is worth two of the other, as the
shipwright has often told me.”
Queen Elizabeth had, in 1578, twenty-four ships ranging from the
Triumph, of 1,000 tons, built in 1561, to the George, of under 60
tons.
When the Spanish Armada arrived in the Channel in 1588, the British
fleet, which numbered one hundred and ninety-seven vessels,
included thirty-four belonging to the state. The remainder were ships
of various kinds and sizes, mostly small, hired by the state or
provided by private owners, and fitted out hastily for war purposes
by their owners or the ports. The Cinque Ports, it should be
remembered, which furnished a considerable number, were obliged
by Henry VIII., in return for certain privileges, to supply him with
fifty-seven ships, each containing twenty-one men and a boy, for
fifteen days once a year at the ports’ expense, and it often
happened that the ports had to find a greater number of vessels.
After the fifteen days they received state pay. A similar arrangement
held good at the time of the Armada. The largest ships in the English
force are sometimes stated to have carried fifty-five or sixty guns,
and one may have carried sixty-eight guns. The armament of the
Triumph, which was the heaviest armed English vessel, comprised
four cannon, three demi-cannon, seventeen culverins, eight demi-
culverins, six sakers, and four small pieces. The Elizabeth Jones, of
900 tons, built in 1559, carried fifty-six guns, and the Ark Royal, Lord
Howard’s flagship, launched in 1587, had fifty-eight guns and a crew
of four hundred and thirty men, her tonnage being 800. The
principal royal ships and the number of guns they carried were, as
far as can be ascertained accurately: Ark Royal, fifty-five guns; Lion,
thirty-eight; Triumph, forty-two; Victory, forty-two; Bonaventure,
thirty-four; Dreadnought, thirty-two; Nonpareil, thirty-eight;
Rainbow, forty; Vanguard, forty; Mary Rose, thirty-six; Antelope,
thirty; and Swiftsure, forty-two. The Spanish ships were rather
floating fortresses packed with soldiers, and desiring to come to
close quarters so that the fight should be of the hand-to-hand
description to which they were accustomed. The English ships were
smaller, and though more numerous, of little more than half the total
tonnage of the Armada, and were, on the whole, more lightly armed.
Still, a large number of the English vessels carried what were long,
heavy guns for those days, and they used them at short range when
they assumed a windward position and attacked the Spanish rear,
inflicting great damage and throwing the enemy into confusion. This
defeat definitely established the cannon as the principal weapon for
warfare afloat, and inaugurated a new era in the history of the
world’s fighting navies.
THE “ARK ROYAL,” THE ENGLISH ADMIRAL’S FLAGSHIP.
From a Contemporary Print.
Of the merchant ships engaged, the largest were the Leicester,
sometimes called the Galleon Leicester, and the Merchant Royal,
each of 400 tons. The great galleys and galleasses of the Armada
were not the largest ships afloat by a great deal, for they were far
exceeded in size by many contemporary merchantmen in the
Mediterranean.
The Queen’s ships were sometimes employed upon peaceful and
ambassadorial errands. The voyage of the Ascension to
Constantinople shows a definite attempt to spread English prestige
in distant seas by means of English trade openings, instead of by the
diplomacy of the day, a prominent feature of which was the
discovery of means and opportunities of raiding a state having much
portable riches and not sufficient power to protect them.
The Ascension, in which Queen Elizabeth sent her second present to
the Sultan of Turkey, left London in March, 1593, and arrived in
August, 1594. She was “a good shippe very well appointed, of two
hundred and three score tunnes (whereof was master one William
Broadbanke, a provident and skilfull man in his faculties).” Some
days after the arrival when the wind suited, “our shippe set out in
their best manner with flagges, streamers, and pendants of divers
coloured silke, with all the mariners, together with most of the
Ambassador’s men, having the winde faire, and came within two
cables’ length of this his moskyta,[20] where (hee to his great
content beholding the shippe in such bravery) they discharged first
volies of small shot, and then all the great ordinance twise over,
there being seven and twentie or eight and twentie pieces in the
shippe.”[21]
The early part of the seventeenth century, when James I. was king,
saw a remarkable advance in shipbuilding, thanks to Phineas Pett,
who dropped the somewhat haphazard rule-of-thumb methods of
ship construction and introduced a more or less scientific system of
measurement and estimate of weights. In 1610, the Prince, or Prince
Royal, of 1,400 tons, and mounting sixty-four guns, was launched.
She is described as “Double-built,” which has been supposed to
mean that she had an outer and inner skin and an additional number
of beams, etc. This may afford a partial explanation of the fact that
though seven hundred and seventy-five loads of timber were
estimated to be necessary for her construction, one thousand six
hundred and twenty-seven loads were used. Also, as the ship only
lasted fifteen years, a possible further explanation of the discrepancy
may be found in the suggestion that much of the timber supplied
and included in the larger amount was unfit for use. The Prince
Royal was “most sumptuously adorned, within and without, with all
manner of curious carving, painting and rich gilding, being in all
respects the greatest and goodliest ship that was ever built in
England.” In 1624 this ship had two cannon-petro, six demi-cannon,
twelve culverins, eighteen demi-culverins, thirteen sakers, and four
port-pieces.
THE “SOVEREIGN OF THE SEAS.”
From the Model in the Royal Naval College Museum, Greenwich.
THE “PRINCE ROYAL.”
Designed by Phineas Pett.
By permission of the Elder Brethren of Trinity House.
Good sea fighters as the English had proved themselves to be, they
yet were behind the Dutch and French as naval architects. Sir Walter
Raleigh, an outspoken critic of the King’s ships and of English
merchant vessels, comparing the latter with those of the Dutch,
nevertheless admitted that some progress had been made in English
shipping. “In my own time,” he writes, “the shape of our English
ships hath been greatly bettered. It is not long since the striking of
the topmast hath been devised. Together with the chain pump, we
have lately added the Bonnet and Drabler.... To the courses we have
devised studding sails, top-gallant sails, spritsails and topsails. The
weighing of anchors by the capstan is also new. We have fallen into
consideration of the length of cables, and by it we resist the malice
of the greatest winds that can blow. We have also raised our second
decks.” The last improvement was one of the most important, for the
space between the decks was cramped, and the lower deck was not
much above the water level. The raising of the decks gave the ships
more freeboard and increased their seaworthiness, rendered the
lower tier of guns more effective by enabling them to be used with
less danger from water entering the ports, and gave the men
working the guns on the lower tier more head room.
A list of the ships of King Charles, dated 1633, is of more than usual
interest, says Derrick, “this being the earliest list of the Navy I have
met with, wherein any part of the ships’ principal dimensions are
inserted.... This is the first list in which any nice regard seems to
have been paid to the tonnage of the Ships. Previous to 1633, the
tonnage of almost every Ship seems to have been rather estimated
than calculated, being inserted in even numbers.”
A natural development of the Prince Royal was the Sovereign of the
Seas. These two vessels may be regarded as marking the first and
second stages in the final period of transition from the old style of
warship to the wooden walls. She was a remarkable vessel in
national as well as naval history, for she played not a small part in
the agitation over the question of ship-money, which had such a
tremendous influence on the nation’s development.
“This famous vessel,” Heywood states in his publication addressed to
the King, “was built at Woolwich in 1637. She was in length by the
keel 128 feet or thereabout, within some few inches; her main
breadth 48 feet; in length, from the fore end of the beak-head to the
after end of the stern, a prora ad puppim, 232 feet; and in height,
from the bottom of her keel to the top of her lanthorn, 76 feet; bore
five lanthorns, the biggest of which would hold ten persons upright;
had three flush decks, a forecastle, half-deck, quarter deck, and
round house. Her lower tier had thirty ports for cannon and demi-
cannon, middle tier thirty for culverines and demi-culverines, third
tier twenty-six for other ordnance, forecastle twelve, and two half-
decks have thirteen or fourteen ports more within board, for
murdering pieces, besides ten pieces of chace-ordnance forward and
ten right aft, and many loop-holes in the cabin for musquet-shot.
She had eleven anchors, one of 4,400 pounds weight. She was of
the burthen of 1,637 tons.... She hath two galleries besides, and all
of most curious carved work, and all the sides of the ship carved
with trophies of artillery and types of honour, as well belonging to
sea as land, with symbols appertaining to navigation; also their two
sacred majesties’ badges of honour; arms with several angels
holding their letters in compartments, all which works are gilded
over and no other colour but gold or black. One tree, or oak, made
four of the principal beams, which was 44 feet, of strong serviceable
timber, in length, 3 feet diameter at the top and 10 feet at the stub
or bottom.
“Upon the stem head a Cupid, or Child bridling a Lion; upon the
bulkhead, right forward, stand six statues, in sundry postures; these
figures represent Concilium, Cura, Conamen, Vis, Virtus, Victoria.
Upon the hamers of the water are four figures, Jupiter, Mars,
Neptune, Eolus; on the stern, Victory, in the midst of a frontispiece;
upon the beak-head sitteth King Edgar on horseback, trampling on
seven kings.”
The Sovereign of the Seas was the largest vessel yet built in
England, and though she was intended as much for show as use,
she became, when she was reduced a deck and a lot of this
ornamental flummery was removed, one of the best fighting ships in
the navy, and was in nearly all the chief engagements in the war
with Holland, and proved herself a very serious opponent, as the
navy records show.
It was about this time that ships were first rated or classified
according to their size and efficiency as fighting units. About this
time also, a new type of vessel, the frigate, was introduced into the
navy. The frigate is not a British invention, but, so far as this country
is concerned, was copied from the French by Peter Pett, son of
Phineas Pett, who saw one in the Thames. He built, in 1649, the
Constant Warwick to the order of the Earl of Warwick, who intended
her for a privateer, but sold her.
According to Pepys, the Dutch and French, in 1663 and 1664, built
two-decked ships with sixty to seventy guns, and lower decks four
feet above the water. The English frigates were narrower and
sharper, and their lower gun ports were little more than three feet
above the sea. It was therefore decided that the English ships
should have their gun ports about four and a half feet from the
water. The French and Dutch three-deckers were usually about 44
feet in the beam, as compared with the 41 feet of some of the
English third rates, and the Henry, built in 1656, and the Katherine,
in 1674, to mention only two of many, were useless until they were
girdled, and after 1673 the three-decked second raters were ordered
to be 45 feet in the beam.
In the seventeenth century the Royal Louis was built at Toulon,
carrying 48-pounders on its lower deck, 24-pounders on the middle
deck, and 12-pounders on the upper deck. The French, indeed, were
taking the lead in naval construction at this period, and their
superiority was recognised by the English who captured and imitated
them whenever possible. Thus the Leviathan, built at Chatham, was
a copy of the Courageux of seventy-four guns, and the Invincible,
captured by Lord Anson in 1747, during the War of the Austrian Succession, served as
model for many more.
During a French visit to Spithead in 1673, the Superbe, seventy-four
guns, attracted special attention. She was 40 feet broad and had her
lowest tier of guns higher from the water than the English frigates.
Accordingly the Harwich was built by Sir Henry Deane as a copy, and
gave such satisfaction that she was adopted as a pattern for second
and third rates. Besides the six rates of fighting ships, other classes
were included in the navy list, these being, in Charles II.’s reign,
thirteen sloops, one dogger, three fireships, one galley, two ketches,
five smacks, fourteen yachts, four hoys, and eight hulks.
The dimensions determined upon in 1677 for ships of one hundred,
ninety and seventy guns were sometimes exceeded; and in 1691
another set of dimensions, for ships of sixty and eighty guns, was
established. In the following year an appropriation for “bomb
vessels” was sanctioned; and about 1694, a revival of the fireships
was tried. These vessels were called infernals, possibly on account of
their contents, which included “loaded pistols, carcasses (filled with
grenadoes), chain shot, etc., and all manner of combustibles.” Their
revival, or invention in this form, is attributed to an engineer named
Meesters, who directed the operations against Dunkirk, without
achieving any success with them.
LINE OF BATTLESHIP, 1650.
From a Model in the Museum of the Royal United Service Institution.
Prior to the battle of La Hogue, in 1692, five advice boats appear in
the navy list for the first time; they carried from forty to fifty men
each and were deputed to acquire information of the enemy’s
movements at Brest.
Complaints were made in 1744-5 that the British vessels compared
unfavourably with those of other nations in scantlings,
seaworthiness, and armament. This induced the adoption of another
set of rules, and the ships built according to them proved to be good
sea boats, carrying their guns well, and standing up stiffly under sail,
but they had the objection of being too full in the after part of their
under body, which retarded their speed somewhat. After ten years’
trial this establishment was modified, the faults complained of were
remedied, and the ships were increased in size, and from this time
onward fifty-gun ships were seldom classed as ships of the line of
battle. There has been some misconception in regard to the frigates
of the period, as many small vessels carrying eighteen guns, or less,
were so called, but were afterwards included among the sloops.
The real frigate was a vessel constructed to cruise in all weathers,
and able to show a good turn of speed; she had an armament which
was fairly heavy for her size, and it was carried on one deck, with
the exception of a few guns which might be disposed about the
poop or forecastle. For over two hundred years vessels of this type
were held in the highest esteem, until, indeed, they were
superseded, in common with all other sailing warships, when steam
was adopted. The career of the steam frigate was brought to an
early close by the adoption of the ironclad.
The frigate itself underwent considerable development during its two
centuries’ career. The earlier frigates carried twenty-four or twenty-
eight 9-pounders, and a crew of about one hundred and sixty men;
these vessels were about 500 tons burthen, or a little more, with a
gundeck length of 113 feet and a length of 93 feet on the keel. Their
rig marked a curious transition stage from the Mediterranean
influence to that of the modern square rig, as, although they carried
square sails on the fore and main masts, lateens were still carried on
the mizen. The frigate of thirty-two 12-pounders appeared shortly
afterwards, the first of this size being the Adventure, launched in
1741; and six years later the Pallas and Brilliant, thirty-six-gun
frigates, were added to the navy; but, while admittedly excellent
fighting cruisers, they were inferior to the French thirty-six-gun
frigates built about that time.[22] The frigates played a most
important part in the world’s naval history of the latter part of the
eighteenth century and the early years of the nineteenth century.

THE “DREADNOUGHT,” 1748.


From a Model in the Museum of the Royal United Service Institution.
THE “JUNO,” 1757.
From the Model in the Victoria and Albert Museum.
Tougher antagonists than the French frigates, however, were the
seven frigates the Americans built when matters became strained
between the United States and this country; they were the United
States, Constitution, President, Constellation, Congress, Chesapeake,
and Essex. The first-named was the largest, with a tonnage of
1,576, and the smallest the Essex, 860 tons. The American navy
consisted only of about a dozen vessels altogether on which reliance
could be placed, but these were among the best of their kind afloat;
there were a few others of little or no fighting value. The frigates
carried batteries of carronades supplemented by long guns, 12-
pounders. It was the custom to give the American ships more guns
than they rated. Thus the forty-four-gun frigate had thirty long 24-
pounders on the main deck, two long bow chasers on the forecastle,
and twenty or twenty-two 32-pounder carronades, as in the
Constitution, while the carronades of the President and United States
were 42-pounders. The armament of the Constellation, Congress,
and Chesapeake was twenty-eight long 18-pounders on the main
deck, two similar guns on the forecastle, and eighteen 32-pounder
carronades. The “ship-sloops,” of which the greater part of the rest
of the American naval force consisted, carried 32-pounder
carronades, and long 12-pounders for bow chasers. The “brig-
sloops” were equipped with carronades. The Americans claim to
have been the first to employ the heavy frigate effectively,
notwithstanding that the cannon balls their guns fired were of less
weight in some instances than the projectiles discharged from the
corresponding weapons in the British or French navies, and the shot
would also appear to have been really lighter than they were
supposed to be by as much as two to ten per cent. These frigates
were remarkable for the series of duels they fought with British
warships, winning six in succession, by superior seamanship and
better sailing qualities, to some extent, but mostly by superior
gunnery, until the final duel was won by the Shannon in her
memorable encounter with the Chesapeake. The series of American
victories was inaugurated by the Constitution, otherwise “Old
Ironsides,” the British victim being the Guerrière.
In considering the development of the warships of other types, it is
necessary to go back a few years. The British dockyards were
unequal to the demands upon them for the wars of the latter part of
the eighteenth century, and a greater number of warships than ever
before was built by contract at privately owned yards.
It is interesting to note that one firm of shipbuilders which built ships
for the navy in those days and even a century earlier, on Thames
side, is still in existence, and in spite of limited liability company laws
and the introduction of new partners, is still known as Green’s yard,
at Blackwall, and is still managed by bearers of the name.
Twenty-six sail of the line and eighty-two smaller vessels were
launched from private yards during the war ending in 1762, and
twenty-four sail of the line and twelve smaller ships were launched
at the King’s yards between the declaration of war in 1756 and the
proclamation of peace seven years later. This is of importance as
showing the resources of the country even at that time in warship
building, and the assistance the government was glad to receive
from the private builders at times of emergency.
During this war it was decided that no more eighty-gun three-
deckers or seventy-gun or sixty-gun ships should be built. In place of
the first-named, ships of seventy-four and sixty-four guns were
ordained, and fifty-gun ships with a roundhouse were ordered to
replace the latter. The first seventy-fours and sixty-fours were too
small for the weight of the guns they had to carry, and their
successors of that class were larger. No eighty-gun ship with three
decks was built after 1757, and no seventy-gun ship after 1759. The
Cæsar was the first English eighty-gun ship with two decks; she was
built in 1793.
Towards the end of 1778 many of the second rates were given eight
additional guns on the quarter deck, which virtually raised them to
ninety-eight-gun ships. An important constructional improvement in
1783 was the adoption of copper fastenings in all classes of ships
below the water-line; iron bolts had been found to corrode under the
influence of the salt water.
Ships continued to increase in size and power of armament. The
Ville de Paris, of one hundred and ten guns and 2,332 tons, and her
sister ship, the Hibernia, ordered in 1790, were the first of their
class. Before the latter was finished she was lengthened and her
tonnage raised to 2,508 tons. Another new class, introduced about
that time, comprised three ships of 776 tons each, carrying thirty-
two guns, the main deck armament consisting of 18-pounders; they
did so well that several others were added.
About 1783 a greater length in proportion to beam was adopted,
which made the ships faster sailers and better sea-boats. Several
vessels of the higher classes were altered, and many others
had their bottoms specially thickened to withstand stranding. The
42-pounder guns of the largest ships were found difficult to handle
and of less rapidity of fire than the 32-pounders, and were removed
from the main deck battery of the Royal Sovereign and other ships in
favour of the 32-pounders.
The Commerce de Marseilles, of 120 guns, was one of the French
vessels which accompanied under compulsion the combined English
and Spanish squadron from Toulon in 1793. She was considered to
be the largest ship in the world. Her gun-deck was 208 feet 4 inches
in length, and her keel for tonnage 172 feet 0⅛ inch. Her depth of
hold was 25 feet 0½ inch, and her extreme breadth 54 feet 9½
inches, her tonnage being 2,747 tons. She was not a very valuable
acquisition, however, for her timbers were in such a state that she
was not worth repairing; she was accordingly taken to pieces in
1802. Probably, like many other vessels built in those strenuous
times, she was constructed of unseasoned timber, or had a quantity
of immature or soft wood put into her so that she might be got
ready for war as quickly as possible. Warships were wanted in
such a hurry that early availability for service mattered more
than long life. Both the British and French fleets had
a number of these “green” ships.
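The tonnage quoted for the Commerce de Marseilles can be checked from her dimensions. The sketch below assumes the builder's-measurement rule customary in that period (keel for tonnage, multiplied by the extreme breadth and by half the breadth, divided by 94); the text does not state which rule was used, and the function names are illustrative only.

```python
def feet(ft, inches=0.0):
    """Convert a length in feet and inches to decimal feet."""
    return ft + inches / 12.0


def builders_tonnage(keel_for_tonnage_ft, breadth_ft):
    """Assumed rule: keel x breadth x half-breadth / 94."""
    return keel_for_tonnage_ft * breadth_ft * (breadth_ft / 2.0) / 94.0


keel = feet(172, 0.125)    # keel for tonnage, 172 feet 0 1/8 inch
breadth = feet(54, 9.5)    # extreme breadth, 54 feet 9 1/2 inches

tons = builders_tonnage(keel, breadth)
print(round(tons))         # 2747
```

Under that assumption the result agrees with the 2,747 tons given above, which suggests the figure is a measurement tonnage rather than a weight.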
If the French could have a vessel of such gun power and dimensions
there was no reason why the English should not, so the Caledonia,
of 2,602 tons, was ordered in 1794, and was to be the largest and
most powerful yet built in England. Her main deck guns were to be
32-pounders, because of the greater ease with which they could be
handled. On her lower deck she had thirty-two of these guns, on the
middle deck thirty-four 24-pounders, on the main deck thirty-four
18-pounders, on the quarter deck sixteen 12-pounders, and on the
forecastle four 12-pounders. Her officers and crew numbered eight
hundred and seventy-five. Her length was 205 feet, breadth 54 feet
6 inches, and depth of hold 23 feet 1 inch. She was the favourite
ship of Lord Exmouth. At first she had a square stern, but when the
rounded sterns were shown to be better in every way she was
altered to the new mode, and her armament was revised. She
afterwards became the hospital ship at Greenwich under the name
of the Dreadnought. The model of her at South Kensington shows
that her rigging was probably unique. Her royal masts were fidded,
that is, built above the topgallant masts instead of forming one long
pole with them, as is the custom, and there were also peculiarities in
the arrangement of some of her running rigging. This ship was
launched at Devonport in 1808.
THE “CORNWALLIS,” 1812.
From a Model in the Museum of the Royal United Service Institution.
The defeat of the Danes at Copenhagen, the battle of the Nile, the
“glorious first of June,” the battle of Trafalgar, the duels of the
American War, and the battle of Navarino, united to give a splendid
termination to the career of the wooden warship as a fighting unit.
That of Trafalgar was the last in which great fleets of the best
“wooden walls” that human skill could devise opposed each other in
manœuvre and counter-manœuvre. That of Navarino, fought in a
bay, almost in a dead calm, with the ships hardly moving and some
even at anchor, was the last conflict in the world’s history in which
the wooden battleships of the East and the West lay alongside each
other and blazed away with every available weapon at a range so
close at times that they could not possibly miss.
Constructionally, wooden battleships had about attained the limit of
size. Already they revealed unmistakable signs of longitudinal
weakness, and it had been a problem, which the builders up to that
time had been unable to solve, how to stiffen the hulls so that they
would withstand the hogging and sagging strains. It was not until Sir
Robert Seppings introduced his system of ship construction that the
difficulty was overcome, but the increase in the deadweight of the
ship was great. Still, had it not been for his system it would have
been impossible to construct some of the later vessels which left the
ways before steam was introduced and iron was adopted for ship
construction. Very few vessels were built larger than those which
fought in Trafalgar Bay, though several were designed. The
improvements made were rather in the form of the underbody in
order to increase the speed and sea-going qualities of the ships. One
of the largest old-style battleships ever proposed was the Duke of
Kent, which was to have been a four-decker carrying one hundred
and seventy guns, and having a tonnage of 3,700. She was to have
been given a length of 221 feet 6 inches on the gun-deck, an
extreme breadth of 64 feet, and a depth of hold of 26 feet. On the
lower deck she was to have had thirty-six 32-pounders, and a similar
complement on the lower middle deck; thirty-six 24-pounders on the
middle deck; thirty-eight 18-pounders on the upper deck; ten 12-
pounders and six 32-pounder carronades on the quarter-deck; and
four 12-pounders and four 32-pounder carronades on the forecastle.
Though she never progressed beyond the paper stage, these
particulars are interesting as showing what the naval architects of a
hundred years ago were prepared to design.
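The deck-by-deck armament listed for the Duke of Kent can be totalled to confirm the stated rating. This is a minimal sketch; it reads "a similar complement" on the lower middle deck as thirty-six 32-pounders, which is an interpretation, and the total weight of shot is a derived figure that does not appear in the text.

```python
# (position, number of guns, weight of shot in lb)
duke_of_kent = [
    ("lower deck", 36, 32),
    ("lower middle deck", 36, 32),   # "a similar complement" (assumed reading)
    ("middle deck", 36, 24),
    ("upper deck", 38, 18),
    ("quarter-deck guns", 10, 12),
    ("quarter-deck carronades", 6, 32),
    ("forecastle guns", 4, 12),
    ("forecastle carronades", 4, 32),
]

total_guns = sum(count for _name, count, _shot in duke_of_kent)
total_shot_lb = sum(count * shot for _name, count, shot in duke_of_kent)

print(total_guns)      # 170
print(total_shot_lb)   # 4340
```

The count agrees with the one hundred and seventy guns quoted for the design.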
The Queen of one hundred and ten guns, the first three-decker
launched after Queen Victoria’s accession, the Vernon of fifty guns,
and Pique of forty guns, and others of various classes were designed
by Sir W. Symonds, who, during his fifteen years’ surveyorship to the
Admiralty, was responsible for no fewer than one hundred and eighty
vessels. The finer lines he gave them increased their speed, and
they were broader, loftier, and roomier between decks than their
predecessors, and were better ships all round. They may be
regarded as embodying the highest degree of excellence to which
the sailing wooden warship attained.
Reference has been made to the guns used on shipboard at various
times, and to the establishment of dimensions or rates to be
observed in building the ships employed in the British Navy. The
guns about to be described were used in all navies; the
establishments referred to are peculiar to the British Navy, though
the vessels themselves differed but little from those belonging to
other nations. It must also be remembered that though the names
of the guns were retained through century after century, very little is
known of the earliest weapons, and that their names came to be
applied to guns which had little in common.
The establishments, as they were called, were adopted to secure
uniformity in types, and it is well to bear these details in mind, or at
least to refer to them, in studying the history of the achievements of
the British Navy in order that an approximately correct idea may be
obtained of the ships and weapons used by and against Great Britain
which have had so great an influence on the world’s history.
The principal establishments were ordered in 1677, 1691, 1706,
1719, and 1745, and certain proposals were also made in 1733 and
1741, which were not of quite so far-reaching a character as the
others. The establishment of 1745 was not adhered to for many
years, and there has been no cut-and-dried establishment since, the
requirements of modern warfare and the inventiveness of all nations
having militated against adherence to a rigid standard. Ships of one
hundred guns measured on the gun-deck 165 feet in 1677, 174 feet
in 1719, and 178 feet in 1745; their extreme breadth was 46 feet
in 1677, and 51 feet in 1745, and the burthen increased from 1,550
tons in the first-named year, to 2,000 in the last. The ships of ninety
guns had lengths on the gun-deck of 158 feet, 164 feet, and 170
feet in the three years respectively; their extreme breadth was 44
feet, 47 feet 2 inches, and 48 feet 6 inches, and their tonnage 1,307,
1,569, and 1,730 tons. The three-deckers of eighty guns first appear
in the 1691 establishment; they were 156 feet on the gun-deck, 158
feet in 1719, and 165 feet in 1745; their extreme breadths at the
three dates were 41 feet, 44 feet 6 inches, and 47 feet, and their
burthens 1,100, 1,350, and 1,585 tons. Seventy-gun ships increased
from 150 feet in length in 1677, to 160 feet in 1745, their breadth
from 39 feet 8 inches to 45 feet, and their burthens from 1,013 tons
to 1,414 tons. Ships of sixty guns were 144 feet in length in 1691,
and 150 feet in 1745, with respective breadths of 37 feet 6 inches,
and 42 feet 8 inches, and tonnages of 900 and 1,191 tons. Fifty-gun
ships appear in the ratings of 1706 with a length of 130 feet, and in
1745 of 144 feet; their respective breadths being 38 feet and 41
feet, and tonnages 704 and 1,052 tons. In 1706 also, 40-gun
ships are recorded with a length of 118 feet, an extreme
breadth of 32 feet, and a tonnage of 531 tons; these dimensions
had risen in 1745 to 133 feet, 37 feet 6 inches, and 814 tons. Ships
of twenty guns were rated in 1719 with a length of 106 feet, breadth
28 feet 4 inches, and tonnage 374; increased by 1745 to 113 feet,
32 feet, and 508 tons.
In regard to their complements, a 100-gun ship in 1677 carried
seven hundred and eighty men; in 1733, eight hundred and fifty;
and in 1805, eight hundred and thirty-seven men. Ships of ninety
and ninety-eight guns had, in 1677, six hundred and sixty men; in
1706, six hundred and eighty men; in 1733, seven hundred and fifty
men; and in 1805, seven hundred and thirty-eight men. An 80-gun
ship carried in 1692, four hundred and ninety men; in 1706, five
hundred and twenty; in 1733, six hundred; in 1745, six hundred and
fifty; and in 1805, seven hundred and nineteen men. A 74-gun large
class ship had in 1762, six hundred and fifty men; and in 1805, ten
fewer; a 74-gun common class ship had, in 1745, six hundred men; in
1762, six hundred and fifty men; in 1783, six hundred; and in 1805,
five hundred and ninety men. A 70-gun ship had in 1677, four
hundred and sixty men; in 1706, four hundred and forty; in 1733,
four hundred and eighty; and in 1745, five hundred and twenty men.
A 64-gun ship in 1745 had four hundred and seventy men; in 1762,
five hundred; and in 1805, four hundred and ninety-one men. A 60-
gun ship had in 1692, three hundred and fifty-five men; in 1706,
three hundred and sixty-five men; in 1733, four hundred; and in
1745, four hundred and twenty. A 50-gun ship had in 1706, two
hundred and eighty men; in 1733, three hundred; in 1745, three
hundred and fifty; and in 1805, three hundred and forty-three. A 44-
gun ship carried in 1733, two hundred and fifty men; in 1745, two
hundred and eighty; in 1783, three hundred men; and in 1805, two
hundred and ninety-four men.
Very little indeed is known of the earliest types of firearms carried
afloat. The crudeness of the methods of manufacture, and the
absence of any standard for pattern or size, left the makers free to
produce whatever weapons they fancied. The Christopher of the
Tower, in June, 1338, is said to have had three iron cannon with five
iron chambers. The guns were breechloaders, and the chambers
contained the charge and perhaps the projectile. She also had a
hand-gun, which, though fired from the shoulder, had the barrel
supported by a rest standing on the deck, after the manner of the
hand-guns in use ashore. The Mary of the Tower was equipped with
an iron cannon provided with two chambers, and a brass gun with
one chamber. None of the weapons yet discovered show how the
chambers were fastened in the guns of this period. It is known that
they fitted loosely and that the chambers could be fired, if
necessary, without the guns.
The early naval guns were called “crakys of war.”[23] They included
cannon-paviors, or guns which threw round stone shot, and the
appropriately named murtherers, which were smaller weapons and
were loaded with anything that could be fired out again.
An inventory of the Great Barke as “vyeuwyd” in the twenty-third
year of King Henry VIII., is preserved in the Cotton Library at the
British Museum. The following are extracts:—
“Hereafter followeth the ordinances pertayning to the sayde shype,
item, in primis, two brazyn pecys called kannon pecys on stockyes
which wayith The one 9 c. 3 q. 11 lb., the other 10 c. 1 q. 17 lb.,
whole weight 20 c. 28 lb.: Item 2 payer of shod wheeles nyeu: item
two ladyng ladells.
“Starboard side. Item oon port pece of yeron cast with 2 chambers:
item a port pece of yeron, with one chamber. Item a spruyche slyng
with one chamber.
“Larboard side. Item oon port pece with 2 chambers: Item another
port pece, with oon chamber, whyche chamber was not made for the
sayd pece.
“In the forecastell. Item a small slyng with 2 chambers. Item
another pece of yeron with two chambers, the oon broken.”
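The weights of the two brass pieces in the first extract are given in hundredweights (c.), quarters (q.) and pounds, and the inventory's "whole weight" can be checked by converting to pounds. The sketch assumes a hundredweight of four quarters of 28 lb each (112 lb); the inventory itself does not define the units, and the helper names are illustrative.

```python
LB_PER_QUARTER = 28       # assumed: quarter of 28 lb
QUARTERS_PER_CWT = 4      # assumed: hundredweight of 4 quarters (112 lb)


def to_lb(cwt, quarters, lb):
    """Convert hundredweights, quarters and pounds to pounds."""
    return (cwt * QUARTERS_PER_CWT + quarters) * LB_PER_QUARTER + lb


def from_lb(total_lb):
    """Convert pounds back to (hundredweights, quarters, pounds)."""
    cwt, rest = divmod(total_lb, QUARTERS_PER_CWT * LB_PER_QUARTER)
    quarters, lb = divmod(rest, LB_PER_QUARTER)
    return cwt, quarters, lb


first = to_lb(9, 3, 11)      # 9 c. 3 q. 11 lb.
second = to_lb(10, 1, 17)    # 10 c. 1 q. 17 lb.
stated = to_lb(20, 0, 28)    # "whole weight 20 c. 28 lb."

print(first + second, stated)    # 2268 2268
print(from_lb(first + second))   # (20, 1, 0)
```

The two pieces sum to the stated total; on the assumed units the odd 28 lb is exactly one quarter, so the whole weight could equally be written 20 c. 1 q.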
Even in Queen Elizabeth’s day much of the artillery had to be
imported from Germany. It was not until about 1531 that iron guns
were first cast in England, and brass guns were cast three or four
years later. Guns were made of greater weight and bore when it was
discovered how to cast them instead of building them, and muzzle-
loaders gradually superseded the old breechloaders. The change,
however, was slow, and was probably retarded by the reluctance of
those ship owners who had breechloaders to discard them while
they could yet be fired, a reluctance which no doubt extended,
owing to the paucity of weapons, to the rulers of the various states.
The guns of the sixteenth century were extraordinarily varied. The
largest was the cannon-royal of rather more than 8½ inches
diameter,[24] 8 feet 6 inches in length, and weighing about 8,000 lb.;
its charge of powder was about 30 lb., and its shot weighed 74 lb.
The cannon was 8 inches diameter, weighed about 6,000 lb., and
with a charge of 27 lb. threw a shot of 60 to 63 lb. The cannon-
serpentine was of 7 inches diameter, weighed 5,500 lb., and with a
charge of 25 lb. threw a shot of 42 lb. The bastard-cannon was of
about the same length as the cannon-serpentine, but a lighter
weapon, and though the charge of powder was 5 lb. less, the weight
of the shot was the same. The demi-cannon varied from a little
under 6½ inches diameter to 6¾ inches, and was about 11 feet in
length and weighed about 4,000 lb., and with a charge of 18 lb.,
threw a projectile weighing from 31 to 33½ lb. The bore of the
cannon-pedro, or petro, was 6 inches, its weight about 3,800 lb., its
shot, usually of stone, whence its name, from 24 to 26 lb. The
diameter of the culverin was from 5¼ inches to 5½ inches, its
length was close upon 11 feet, its weight 4,840 lb., it received a 12
lb. charge, and fired an 18 lb. shot. The basilisk was slightly shorter
and lighter, and its 14 lb. shot required 9 lb. of powder. The diameter
of the demi-culverin was 4 inches, its weight 3,400 lb., its charge
was 6 lb., and its shot 8 to 9½ lb. The culverin-bastard seems to
have been of half an inch larger bore, about 8½ feet long, but to
have been 400 lb. lighter than the demi-culverin, and to have fired
an 11 lb. shot with a charge of 5¾ lb. The saker, or sacar, was a far
smaller weapon, being less than 3¾ inches diameter, under 7 feet in
length, and weighing about 1,400 lb.; its charge was 4 lb., and its
shot 4 to 6 lb. The minion, slightly smaller in all respects, threw a 3
lb. to 4 lb. shot. The falcon was of 2½ inches diameter, 6 feet long,
weighed 680 lb., and fired a 2 lb. shot with a charge of a little over 1
lb. of powder. The falconet was a smaller edition of the falcon. The
serpentine was of 1½ inches diameter, weighed 400 lb., and fired a
½-lb. shot; and the rabinet, or robinet, was an even lighter weapon.
For loading, canvas or paper cartridges were used, but an iron ladle
for the powder was preferred. The following list of commands in the
gun-drill contrasts oddly with what would pass in the turret of, say, a
modern super-Dreadnought:—
“Search your piece; sponge your piece; fill your ladle; put in your
powder; empty your ladle; put up your powder; thrust home your
wad; regard your shot; put home your shot gently; thrust home your
last wad with three strokes; gauge your piece.”
Some curious guns were invented when the ordnance industry was
in its infancy. The Scots in a southern raid in 1640 used guns of
leather at their passage of the Tyne—which says more for the
strength of the leather than of the powder. A composite affair called
the “kalter” gun, introduced in the time of Gustavus Adolphus, of
Sweden, is described:—
“A thin cylinder of beaten copper screwed into a brass breech,
whose chamber was strengthened by four bands of iron, the tube
itself being covered with layers of mastic, over which cords were laid
firmly round its whole length and equalised by a layer of plaster, a
coating of leather, boiled and varnished completing the piece.”[25]
Another peculiar weapon was a twin gun, in shape something like a
stumpy tuning-fork, with parallel barrels and one touch-hole;
another was a gun which could be fired at either end, the cavity in
which the chambers were placed being in the middle. It must have
been an awkward piece to handle. Hand grenades, used sometimes
preparatory to boarding, were introduced in 1689 during William
III.’s reorganisation of the artillery.
Even when the ships were provided with guns, opinion was by no
means unanimous as to the extent to which the weapons should be
employed, or the range at which they would be most effective. The
method in vogue on the Atlantic was to shoot as soon as it was
thought the enemy could be seriously damaged. A gentleman named
Gibson, who reported on the condition of the British Navy in 1585-
1603, is quoted by Charnock as saying:—
“Be sure it is your enemy before you shoot, and that you are in halfe
gunnshott of your ennemy before you shoot. It is direct cowardice to
shoot at greater distance, unless he is running away. British gunns
being for the most part shorter, are made to carry a bigger shot than
a French gun of like weight, therefore the French gunns reach
further, and those of Britain make a bigger hole. By this the French
have the advantage to fight at a distance, and we yard-arm to yard-
arm. The like advantage we have of them in shipping (although they
are broader and carry a better saile) our sides are thicker and the
