
001-2023-0714 DLMCSA01 Course Book

The document is a course book for the Algorithmics course (DLMCSA01) published by IU Internationale Hochschule GmbH, detailing the course structure, content, and required readings. It includes an introduction to algorithmics, algorithm design, important algorithms, correctness, efficiency, and advanced topics like parallel computing and quantum computing. The course aims to provide foundational knowledge in algorithms and their applications in computer science.


ALGORITHMICS

DLMCSA01
ALGORITHMICS
MASTHEAD

Publisher:
IU Internationale Hochschule GmbH
IU International University of Applied Sciences
Juri-Gagarin-Ring 152
D-99084 Erfurt

Mailing address:
Albert-Proeller-Straße 15-19
D-86675 Buchdorf
[email protected]
www.iu.de

DLMCSA01
Version No.: 001-2023-0714
N. N.

© 2023 IU Internationale Hochschule GmbH


This course book is protected by copyright. All rights reserved.
This course book may not be reproduced and/or electronically edited, duplicated, or dis-
tributed in any kind of form without written permission by the IU Internationale Hoch-
schule GmbH (hereinafter referred to as IU).
The authors/publishers have identified the authors and sources of all graphics to the best
of their abilities. However, if any erroneous information has been provided, please notify
us accordingly.

DR. COSMINA CROITORU

Ms. Croitoru has been an instructor in the computer science department at IU International
University of Applied Sciences since 2021. Her teaching areas include artificial intelligence,
algorithms, and data science.

After studying computer science at Alexandru Ioan Cuza University (Romania) and Saarland
University (Germany), she received her PhD from the Max Planck Institute for Computer Sci-
ence with a thesis in artificial intelligence.

Since receiving her doctorate, Ms. Croitoru has devoted more time to teaching. Among other
things, she lectures on data science and the advantages of Python, the most widely used pro-
gramming language in this field.

TABLE OF CONTENTS
ALGORITHMICS

Module Director . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

Introduction
Signposts Throughout the Course Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Basic Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Required Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Learning Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Unit 1
Introduction to Algorithmics 15

1.1 Basic Concepts and Historical Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16


1.2 Algorithms, Programming Languages, and Data Structures . . . . . . . . . . . . . . . . . . . . . . 19
1.3 Quality Algorithms: Correctness, Accuracy, Completeness, and Efficiency . . . . . . . . . 24
1.4 The Role of Algorithms in Society . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

Unit 2
Algorithm Design 33

2.1 Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34


2.2 Recursion and Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.3 Divide-and-Conquer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.4 Balancing, Greedy Algorithms, and Dynamic Programming . . . . . . . . . . . . . . . . . . . . . . 64

Unit 3
Some Important Algorithms 71

3.1 Searching and Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72


3.2 Pattern Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.3 The RSA Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.4 The K-Means Data Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

Unit 4
Correctness, Accuracy, and Completeness of Algorithms 101

4.1 Partial Correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102


4.2 Total Correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.3 Ensuring Correctness in Day-to-Day Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.4 Accuracy, Approximation, and Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Unit 5
Computability 133

5.1 Models of Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134


5.2 The Halting Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.3 Undecidable Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

Unit 6
Efficiency of Algorithms: Complexity Theory 151

6.1 Models of Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152


6.2 NP-Completeness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.3 P = NP? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170

Unit 7
Advanced Algorithmics 173

7.1 Parallel Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174


7.2 Probabilistic Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
7.3 Quantum Computing and the Shor Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

Appendix
List of References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
List of Tables and Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

INTRODUCTION
WELCOME
SIGNPOSTS THROUGHOUT THE COURSE BOOK

This course book contains the core content for this course. Additional learning materials
can be found on the learning platform, but this course book should form the basis for your
learning.

The content of this course book is divided into units, which are divided further into sec-
tions. Each section contains only one new key concept to allow you to quickly and effi-
ciently add new learning material to your existing knowledge.

At the end of each section of the digital course book, you will find self-check questions.
These questions are designed to help you check whether you have understood the con-
cepts in each section.

For all modules with a final exam, you must complete the knowledge tests on the learning
platform. You will pass the knowledge test for each unit when you answer at least 80% of
the questions correctly.

When you have passed the knowledge tests for all the units, the course is considered fin-
ished and you will be able to register for the final assessment. Please ensure that you com-
plete the evaluation prior to registering for the assessment.

Good luck!

BASIC READING
Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2013). Introduction to algorithms
(3rd ed.). MIT Press.

Dewdney, A. K. (2001). The new Turing omnibus. Macmillan Education.

Harel, D. (2014). Algorithmics: The spirit of computing (3rd ed.). Springer.

Sedgewick, R., & Wayne, K. (2011). Algorithms (4th ed.). Pearson Education.

Skiena, S. S. (2012). The algorithm design manual (2nd ed.). Springer.

REQUIRED READING
UNIT 1

Ausiello, G. (2013). Algorithms, an historical perspective. In G. Ausiello & R. Petreschi (Eds.), The power of algorithms (pp. 3–26). Springer. Database: Springer.

Ferguson, R. (2019). Beginning JavaScript. Apress. Database: Springer. Chapter 1: Introduction to JavaScript.

UNIT 2

Rani, N. S., & Suman, S. P. (2013). The role of data structures in multiple disciplines of com-
puter science—A review. International Journal of Scientific & Engineering Research,
4(7), 2286—2291. Available online.

UNIT 3

Attard Cassar, E. (2018). In search of the fastest sorting algorithm. Symposia Melitensia, 14, 63–77. Database: BASE.

Lin, A. (2019). Binary search algorithm. WikiJournal of Science, 2(1), 1—13. Database: Infor-
mit Engineering Collection.

Michael, L. G., Donohue, J., Davis, J. C., Lee, D., & Servant, F. (2019). Regexes are hard: Deci-
sion-making, difficulties, and risks in programming regular expressions. In 2019 34th
IEEE/ACM International Conference on Automated Software Engineering (ASE) (pp. 415—
426). IEEE. Database: IEEE Xplore.

UNIT 4

Turner, R. (2018). Computational artifacts: Towards a philosophy of computer science. Springer. Database: EBSCO. Chapter 25: Program Correctness.

UNIT 5

Gravell, A. M., & Gruner, S. (2018). On more or less appropriate notions of “computation”.
South African Computer Journal, 30(1), 161–181. Database: African Journals.

Robertson, B. W., Kosheleva, O., & Kreinovich, V. (2017). How to make a proof of halting
problem more convincing: A pedagogical remark. Database: BASE.

UNIT 6

Yin, Y., Wang, D., & Cheng, T. C. E. (2020). Due date-related scheduling with two agents. Springer. Database: EBSCO. Chapter 2: Computational complexity.

UNIT 7

Blelloch, G. E., & Dhulipala, L. (2018). Introduction to parallel algorithms. 15-853: Algorithms in the real world. Available online.

Nannicini, G. (2017). An introduction to quantum computing without the physics. ArXiv. Database: arXiv.

FURTHER READING
UNIT 1

Olhede, S. C., & Wolfe, P. J. (2018). The growing ubiquity of algorithms in society: implica-
tions, impacts and innovations. Philosophical Transactions of the Royal Society A: Math-
ematical, Physical and Engineering Sciences, 376(2128), 1—16. The Royal Society Pub-
lishing. Database: EBSCO

UNIT 2

Skiena, S. S. (2012). The algorithm design manual (2nd ed.). Springer. Database: Springer-
Link. Chapter 10: How to design algorithms

UNIT 3

Reddy, C. K., & Vinzamuri, B. (2018). A survey of partitional and hierarchical clustering
algorithms. In C. C. Aggarwal & C. K. Reddy (Eds.), Data clustering: Algorithms and
applications (pp. 87–110). Chapman and Hall/CRC. Database: EBSCO.

UNIT 5

Hehner, E. C. (2016). Observations on the halting problem. ArXiv. Database: arXiv.

Stoddart, B. (2019). The halting paradox. ArXiv. Database: arXiv.

UNIT 6

Neri, F. (2019). Linear algebra for computational sciences and engineering. Springer. Database: SpringerLink. Chapter 11: An introduction to computational complexity.

UNIT 7

Black, P. E. (2019). Formal methods for statistical software (NIST Interagency/Internal Report (NISTIR) 8274). U.S. Department of Commerce, National Institute of Standards and Technology. Available online.

Riggs, C. J., Lewis, C. D., Mailloux, L. O., & Grimaila, M. (2016). Understanding quantum
computing: A case study using Shor's algorithm. In H. R. Arabnia, G. Jandieri, A. M. G.
Solo, & F. G. Tinetti (Eds.), Proceedings of the International Conference on Foundations
of Computer Science (FCS). Machine Learning & Information. Available online

LEARNING OBJECTIVES
The course “Algorithmics” aims to present the theoretical and the historical foundations
of computer science and explore key concepts on the design and programming of algo-
rithms. Moreover, it looks at the societal implications of the use of the computing applica-
tions that are born from the programming of algorithms. It also presents some of the
futuristic paradigms of algorithms.

The first part of the course starts with the above-mentioned historical and societal per-
spectives. It then focuses on searching, sorting, pattern-matching, clustering, and encryp-
tion algorithms, and on the use of key data structures and algorithm design strategies. The
theoretical foundations of computer science are presented in the second part of the
course where models of computation, complexity, completeness, and correctness are dis-
cussed. In the last part of the course, we single out parallelism, probabilistic algorithms,
and quantum computing as some of the innovative paradigms for the future of algo-
rithms. Links to the IUBH GitHub repository for this course have been provided through-
out this script. You are encouraged to use this resource for practice.

UNIT 1
INTRODUCTION TO ALGORITHMICS

STUDY GOALS

On completion of this unit, you will have learned …

– basic concepts of algorithmics.


– key issues in algorithm quality and the difficulty levels of computational problems.
– important features of the interplay between algorithms, programming languages, and
data structures using the JavaScript programming language.
– the historical and the contemporary role of algorithmics.
1. INTRODUCTION TO ALGORITHMICS

Introduction
Computer programming is widely perceived as the building block of the following three main technologies of the Fourth Industrial Revolution: digital, biological, and physical. However, a computer program is nothing more than an algorithm written in a given programming language. In other words, algorithms and algorithmics (the study of algorithms) are the real building blocks of Industry 4.0 (another name for the Fourth Industrial Revolution). One of the highlights of this unit is to single out this cornerstone role of algorithmics in the Fourth Industrial Revolution.

In addition to the discovery of the history of algorithmics, we study the definitions of basic
algorithmic concepts, as well as how to code algorithms in the JavaScript programming
language.

Some of the concepts introduced in this unit are related to the design of algorithms and
their evaluation in terms of their correctness and of their complexity. Finally, some of the
theoretical components of algorithmics such as the complexity theory and the computa-
bility theory are introduced.

1.1 Basic Concepts and Historical Overview
Before looking at the history of algorithmics, it seems important to first briefly define the
concept. An algorithm is simply a sequence of steps toward the resolution of a computa-
tional problem. After a historical overview, this section discusses computational problems
and computability and explains how algorithms differ from algorithmics and programs.

History of Algorithmics

Many perceive Ancient Iraq and Ancient Egypt as the cradle of civilization. Consequently, it
is not surprising that they are also considered the cradle of algorithmics. Ancient Greek
and Italian mathematicians also played an important role. Fibonacci was key in the dis-
semination of the work of al-Khwarizmi, the Persian mathematician from Khiva (in present-day Uzbekistan) whose name gave birth to the term “algorithm.” Turing and other
contemporary Western mathematicians such as Gödel and Hilbert were also instrumental
in the development of algorithmics.

Ancient Iraq and Ancient Egypt

The Sumerian people of Ancient Iraq are known as the first people to have used algo-
rithms around 2500 BCE, for example, in the teaching of geometry and basic operations
such as division on clay tablets.

For these calculations, their number system used decimal numbers within a sexagesimal
system. Thereafter, between 2000 and 1650 BCE, the Babylonians, another Ancient Iraq
people, also wrote algorithms on their tablets to teach other mathematical concepts, such
as inversing numbers, squaring roots, or solving algebraic quadratic equations. Similarly,
Ancient Egypt used papyrus between 1900 and 1800 BCE to write algorithms for the teach-
ing of geometrical concepts, such as surface areas, geometrical series, and volumes.

Ancient Greek civilization

Around 500 BCE, Ancient Greece ruled the world militarily, politically, and academically. We know it as the “cradle of democracy.” Military commanders such as Alexander the
Great expanded the Greek empire to Egypt. Moreover, it was home to legendary philoso-
phers such as Plato, Aristotle, and Socrates, as well as brilliant mathematicians such as
Pythagoras, Thales, Euclid, and Eratosthenes. The renowned Hellenistic mathematicians Eratosthenes and Euclid, residing in Cyrene (in present-day Libya) and in Alexandria, respectively, devised the first known algorithms for the identification of prime numbers and for the calculation of the greatest common divisor of two numbers.

Ancient Islamic translations

In 820, the House of Wisdom was built by Muslim rulers, such as al-Mamun in Ancient
Baghdad, with the initial purpose of translating Hellenistic and other manuscripts into
Arabic (Ausiello, 2013). The house ultimately became a vibrant, intellectual, and scientific
hub that hosted a wide range of scholars from diverse origins, languages, and religions.

This is how al-Khwarizmi introduced the number zero from the Indian numbers system to
the Arabic numbers system. He is also credited for the first detailed specification of the
main four basic arithmetic operations: addition, subtraction, multiplication, and division.

The Byzantine Empire

The Roman Empire split into the Western Empire and the Eastern Empire in 395. The Western Empire ended around 480, but the Eastern (or Byzantine) Empire lasted until
1453. History teaches us that Pisa (present-day Tuscany) held a key maritime position in
the life of the Byzantine Empire. This is where the young Italian Fibonacci was born around
1170. He studied at Bugia (in what is now Algeria) but also traveled intensively, both as a
trader and as a scholar, in many Mediterranean countries where he discovered the work of
Hindu-Arabic scholars such as al-Khwarizmi. Fibonacci became quite skillful at merging al-
Khwarizmi’s Hindu-Arabic algorithms with Euclidian mathematics. His work contains algo-
rithms and mathematical solutions for various application domains such as mathematics,
accounting, and games, even though his name is usually associated in the history of algo-
rithmics with the “Rabbit Problem” and the “Fibonacci Sequence.” Interested readers may
find an account of the history of mathematical notations and how they traveled in Mazur
(2014).

Post-Renaissance and contemporary Western mathematicians

Historical hardware devices sustained the vibrancy of algorithmic activities, starting with
the previously mentioned clay tablets and the papyrus used in Ancient Iraq and in Ancient
Egypt, respectively. The invention of “algorithmic mechanical machines” (e.g., Joseph
Marie Jacquard, Charles Babbage) also constitutes a key moment in the history of algo-
rithmics. Jacquard was an eighteenth-century French inventor who created programmable looms for the silk industry. His contemporary, Charles Babbage, was an
English mechanical engineer and mathematician whose groundbreaking inventions inclu-
ded how to use punch cards to automate astronomical calculations.

Around 1900, the German mathematician Hilbert undertook the ambitious task of propos-
ing an algorithm or method that could decide on the veracity of any given mathematical
proposition or statement. However, the Austrian mathematician Gödel published a clear
proof in 1931 that no algorithm or method can decide the veracity of any given mathemat-
ical proposition or statement. In 1936, Turing confirmed Gödel's work and hypothesized
that if there is a proof of the veracity of a given mathematical proposition or statement
then that proof can be confirmed by a Turing machine (Ausiello, 2013).

With its rich history in mind, we move to the overview of the present state of algorithmics.

Algorithmics Basic Concepts

This subsection presents the definitions of basic algorithmics concepts such as computational problems and computability, as well as the difference between algorithms and algorithmics.

Computational problems

The purpose of algorithms is to solve computational problems that are made up of three
components: inputs with their preconditions, outputs with their post-conditions, and a set
of relationships between inputs and outputs. The problem of the identification of the
highest common divisor between two strictly positive numbers is described in the next
paragraph as an example of a computational problem.

The inputs of the aforementioned computational problem are two natural numbers that
must be strictly positive (precondition). The output will be a number with the post-condi-
tion of being the highest common divisor of the two inputs.

As for the relationship between the inputs and the output, it simply portrays the output as
the biggest natural number that is both a divisor of the first input and a divisor of the sec-
ond input. Similarly, it is not difficult to specify the primality test problem of deciding
whether or not a given strictly positive natural number is prime.
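As an illustration of inputs, preconditions, and post-conditions, the primality test problem can be sketched in JavaScript (the course's programming language); the function below is written for this explanation and its name is illustrative:

```javascript
// Sketch of a decision procedure for the primality test problem.
// Precondition: n is a strictly positive natural number.
// Post-condition: the output is true exactly when n is prime.
function isPrime(n) {
  if (n < 2) {
    return false; // 1 is not prime
  }
  // A divisor other than 1 and n, if any, is found at or below sqrt(n).
  for (let d = 2; d * d <= n; d++) {
    if (n % d === 0) {
      return false;
    }
  }
  return true;
}

console.log(isPrime(13)); // true
console.log(isPrime(12)); // false
```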

From a Turing machine’s perspective, it is hypothesized that we can always run the algo-
rithm of a computational decision problem once we are in possession of such an algo-
rithm.

Decision problems are computational problems whose output simply yields Boolean
answers such as yes/no or true/false as opposed to non-decision problems. One can easily
identify the above described primality test and highest common divisor problems as a
decision problem and as a non-decision problem, respectively.

Computability and decidability

A computational problem is said to be computable if there exists an algorithm that can solve it, i.e., if a Turing machine exists for it. For example, if the above-specified problem of
the identification of the highest common divisor between two strictly positive numbers is
computable, so is the one of the primality test of a number. However, many other compu-
tational problems, for example, the classical Halting Problem, are not computable. The
dichotomy between decision problems and non-decision problems also makes it possible
to speak of decidability as a synonym of computability for decision problems; this does
not make sense for non-decision problems for which computability is only synonymous
with solvability. Thus, we say that the primality test problem is decidable and that the highest common divisor problem is computable.

Algorithms and algorithmics

An algorithm is a finite sequence of doable (by a Turing machine) steps aimed at solving a given computational problem. Consequently, algorithmics is the study of algorithms and of computational problems, since many computational problems are known to be non-computable. While algorithmics examines the computability, solvability, and decidability of computational problems as highlighted above, it also studies the design and the analysis of algorithms as introduced below.

Algorithm: An algorithm solves a given computational problem.

1.2 Algorithms, Programming Languages, and Data Structures
It seems natural to present algorithms in the form of pseudocode so that they can be easily understood by humans in their natural languages. However, in order for an algorithm to solve its computational problem with the help of a computer, the pseudocode of that algorithm must first be translated into a program that will then be executed by a computer, together with the relevant control and data structures. In contrast to pseudocode, which is written in natural language, computer programs are written in programming languages.

Programming language: A programming language automates the processes leading to the execution of algorithms.

Programming Languages

Many programming languages are available for implementing algorithms, depending on the nature of the computational problems. This course has adopted JavaScript as its programming language, mainly because of its ease of deployment in Web browsers. The code examples were written and tested in Node.js, as in the following example.

“Hello There”

Please follow the steps below to write and execute our “Hello There” JavaScript pro-
gram. This program asks its users to enter their name and displays on the screen a “Hello” message addressed to that name, followed by “Kind Regards”, with each output on a separate line. It is assumed that readers are familiar with the basic concepts of program-
ming and efforts will be made to always write self-explanatory JavaScript examples. Non-
trivial code segments will always be explained.

1. Open a text editor (e.g., Notepad, Notepad++, Edit, GEdit, BBEdit, and Atom), type the
following code, and save your file with a name of your choice and with a js extension
(Use the option All files for the dropdown list Save as type).

Figure 1: “Hello There” Example

Source: Created on behalf of IU (2020).

We see from the figure that the characters // are used for short comments not exceeding one
line. The readline.question instruction is also very important here because of its
role in the identification of the inputs of the program, and a similar role is played by
the console.log instruction with regards to the display of the outputs of the pro-
gram.
2. Assuming that HelloThere.js is the name that you have given to your program,
type the node HelloThere.js command on the command prompt to run your pro-
gram. If that command complains that it does not recognize the readline-sync
module, then type the npm install readline-sync command on the command
prompt to install it. You are encouraged to run each program and even modify the pro-
gram with your own ideas and use a debugger (e.g., in Eclipse, Visual Studio Code, NetBeans, or WebStorm) to see it walk through its steps. These are two excellent methods for
understanding the work of an algorithm.

Greatest Common Divisor (GCD)

As previously stated, the greatest common divisor (GCD) problem consists in identifying
the highest natural positive integer that divides two given strictly positive natural num-
bers.

A simple algorithm for the GCD problem is to start by assuming that the smaller of the two given numbers is their greatest common divisor. That assumption must then be tested: if it is true, the greatest common divisor has been found. Otherwise, we reduce the value of the currently assumed GCD by one and test the new assumption. This iterative process of assumption and testing will ultimately end with a GCD value greater than or equal to one, because one divides all strictly positive natural numbers. N. B. Euclid is credited as the inventor of the first GCD algorithm. The code below can be found here.

Figure 2: GCD Naive Algorithm

Source: Created on behalf of IU (2020).

By default, it is usually assumed that users enter strings from the keyboard, so we have to
use parseInt to tell JavaScript that the value that is entered is an integer. The definition
and the use of the function gcd also deserve our attention. Similarly, from this program
we learn how to calculate the minimum between two numbers (Math.min), calculate the
remainder or modulus between two numbers (%) (e.g., 26 % 7 is equal to 5 in the sense that
when we divide 26 by 7, we get 3 but there is a remainder of 5), test the equality of two
entities (===), negate a Boolean value (!), and decrement the value of a variable (--). The syntax of while loops and of functions can also be learned from this example, especially with the use of curly brackets ({}). The use of the keyword return is also very important for functions.
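The naive algorithm described above can be sketched as follows. This is an illustrative reconstruction of the gcd function shown in Figure 2, not a verbatim copy, and it omits the input-capture code:

```javascript
// Sketch of the naive GCD algorithm from Figure 2.
// Preconditions: a and b are strictly positive natural numbers.
function gcd(a, b) {
  // Assume the smaller of the two numbers is the GCD.
  let candidate = Math.min(a, b);
  // While the assumption fails, decrement and test again.
  // The loop ends at the latest at 1, which divides every number.
  while (!(a % candidate === 0 && b % candidate === 0)) {
    candidate--;
  }
  return candidate;
}

console.log(gcd(12, 18)); // 6
console.log(gcd(26, 7));  // 1
```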

Data Structures

The HelloThere and GCD examples deal mainly with one or two integers or strings whose
values are collected from users. Those values are the data of the program. However, there
are many situations where an algorithm or a program has to collect data of a more diverse
nature either in terms of quantity or in terms of varieties. Consequently, this forces algo-
rithms and programs to organize or structure their data into relevant data structures, such
as arrays, lists, sets, trees, or graphs. For example, for the computational problem of sort-
ing a sequence of numbers in ascending order, it would be appropriate to use an array for
the storing and the processing of these numbers, as in the selection sort JavaScript exam-
ple below.

Figure 3: Selection Sort Algorithm

Source: Created on behalf of IU (2020).

The selection sort algorithm finds the smallest value of the array from the first position onward and swaps it with the value in the first position; then it finds the smallest value of the array from the second position onward and swaps it with the value in the second position, and so on, until it reaches the end of the array.
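The sorting step can be sketched as follows. This is an illustrative reconstruction of the sortNumbers function from Figure 3, without the input-capture and display functions, and details such as the bare return statements are omitted:

```javascript
// Sketch of the selection sort function from Figure 3.
// Sorts the array numbs in place, in ascending order.
function sortNumbers(numbs) {
  for (let i = 0; i < numbs.length - 1; i++) {
    // Find the position of the smallest value from position i onward.
    let minPos = i;
    for (let j = i + 1; j < numbs.length; j++) {
      if (numbs[j] < numbs[minPos]) {
        minPos = j;
      }
    }
    // Swap that value into position i.
    const tmp = numbs[i];
    numbs[i] = numbs[minPos];
    numbs[minPos] = tmp;
  }
}

const numbs = [4.5, 2, 9, 1];
sortNumbers(numbs);
console.log(numbs); // [1, 2, 4.5, 9]
```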

In the JavaScript program above, the array numbs has been initialized with the empty
value []. In order for the captureNumbers function to fill it with values, it must push
those values one by one at the end of the array. Moreover, the captureNumbers function
uses the Number function instead of the parseInt function simply because this computa-
tional problem accepts both integers and real numbers as inputs. It is also important to
note that all three functions of this JavaScript program are making use of the return key-
word without returning any value, simply because their purpose is to accomplish their
intended tasks without necessarily returning one specific value. In fact, two of these func-
tions are modifying the array that is passed to them as a parameter either by filling it with
values, or by sorting it, and the third function simply displays the content of its array on
the screen. When an array is passed as a parameter to a function, its content can be
changed by that function if it contains the necessary instructions to do so. This is possible
because, in JavaScript, most parameters are passed to functions by value, except for
object parameters, which are passed to functions by address, and JavaScript arrays are
actually objects. Readers are reminded that parameters that are passed by value to a func-
tion cannot be changed by that function, as opposed to the ones that are passed by refer-
ence or by address.

Control Structures

From an algorithm’s perspective, the two main control structures are conditions and
loops. Conditions include different forms of if statements, and loops are used for the
iteration or repetition of sequences of instructions. The purpose of these control structures is
to indicate how instructions will follow one another. For example, the aforementioned
sortNumbers function has a condition inside a double loop.

1.3 Quality Algorithms: Correctness, Accuracy, Completeness, and Efficiency
When proposing an algorithm for a given computational problem, we have to ensure that
such an algorithm is correct, accurate, complete, and efficient.

Correctness

The correctness of an algorithm can only be established through a mathematical proof of
its partial or total correctness. An algorithm is totally correct if there is a mathematical
proof that the algorithm is partially correct and that, for all inputs that fulfill its
preconditions (correct inputs), the algorithm always terminates. An algorithm is said to be
partially correct if one can prove mathematically that its inputs fulfilling the preconditions
fall into two groups: those for which the algorithm does not terminate and whose outputs
are therefore unknown, and those for which it terminates and whose outputs fulfill its
post-conditions (correct outputs).

Completeness

An algorithm is said to be complete if, for all its correct inputs (inputs that fulfill its precon-
ditions), it always renders the correct outputs (outputs that fulfill its post-conditions) for
the given inputs when there is a solution for such inputs, or it terminates with a “no solu-
tion found” message when there is no solution for such inputs. Let’s consider the compu-
tational problem of looking for the position of the first occurrence of a decimal digit (0…9)
in a given non-null string. Any algorithm that ignores the fact that certain strings do not
contain any decimal digit will not be complete.

Accuracy

When inputting a value to a computational problem or when calculating the value of one
of its outputs, it is important for such values to have the desired level of accuracy, i.e., to
be as close as possible to the reality that those values are supposed to represent. The
issue of accuracy is important for approximation algorithms whose purpose is to make an
approximation of an expected value. Such algorithms will be considered inaccurate if their
approximations are too different from the expected value.

Efficiency and Complexity

Let us recall that an algorithm is a finite sequence of doable (by a Turing machine) steps
aimed at solving a given computational problem, using appropriate data structures. This
is why the efficiency of an algorithm can be assessed either in terms of space or in terms of
time. Space efficiency has to do with the amount of memory that is required for the data
structures of an algorithm. Time efficiency, in turn, has to do with the number of steps of
an algorithm; this quality indicates the speed of a given algorithm. Complexity, by contrast,
has more to do with the level of difficulty of a given computational problem.

1.4 The Role of Algorithms in Society


It is important to note that the previous definition of algorithms is different from the one
that has widely been adopted by modern societies. Let’s first define algorithms from a
societal viewpoint before discussing the opportunities that they create and their associ-
ated challenges. Modern societies consider algorithms powerful elements of computing
devices and applications whose mission is to assist as much as possible in the management
of our lives, both in decision-making choices and in the execution of activities.

Application Domains

Algorithms have so far influenced modern life mostly through the following application
domains: search engines, public-key cryptography and digital signatures, error correction,
pattern matching, data compression, and databases (MacCormick, 2013).

Search engines

The World Wide Web is perceived by many as a gigantic collection of pages that are
deployed in a worldwide distribution and contain valuable information for its users.
Search engines allow their users to query the Web in order to select the right information
from all those pages. Currently, the most popular search engines are owned by Google
and Microsoft.

Public-key cryptography and digital signatures

Public-key cryptography and digital signatures are mainly used to secure information.
These two concepts are illustrated using the example of a box whose role is to transmit
messages between senders and receivers (Vryonis, 2013).

In private-key or symmetric cryptography, we have a single private key that locks and
unlocks the box. Any key holder can hide the content of the box by locking it (encryption)
and only the other holders of a copy of that private key can unlock it and see its content
(decryption).

Public-key or asymmetric cryptography works a bit differently. Suppose we have an
initially closed single box with a single clockwise-turning private key held by only one per-
son. There are multiple copies of an anticlockwise-turning public key with the middle ver-
tical position (↑) of the lock being the unlocked position and both the left (←) and the right
(→) horizontal positions being the locked ones. Anyone who puts something in the box or
takes something from it must close it immediately after, and two people cannot use the
box at the same time. When the lock of the box is in the right (→) horizontal position, any-
one (including the private key holder) can open the box with the public key, put things
inside, and lock it with the public key (encryption); however, only the private key holder
can unlock it (decryption). Similarly, if a public key owner opens that box and finds a
packet in it, then they are certain that the packet was sent by the private key owner. That
private key serves here as the digital signature of its owner. A representative diagram of
different cryptography systems is displayed below from Kessler (2020).

Figure 4: Cryptography Systems

Source: Kessler, 2016.

Error correction

Computing devices, their applications, and algorithms handle large quantities of data that
sometimes contain errors. The purpose of error correction algorithms is to automatically
detect and correct such data errors. Let’s illustrate this purpose by checking the validity of
a 13-digit ISBN code.

First, we have to multiply its even-position digits by three and add all these results to the
sum of all the other digits except for the thirteenth. The second step is to calculate the
remainder of the division of that total by ten. If the addition of that remainder to the
thirteenth digit gives ten, then the ISBN code is valid; otherwise, it is not.

Let’s consider the ISBN code 978-3-540-48663-3. Its even-position digits are 7, 3, 4, 4, 6,
and 3. Their multiplication by three gives 21, 9, 12, 12, 18, and 9, whose sum is 81.
When adding 81 to the other digits except for the 13th one (9, 8, 5, 0, 8, and 6), we get a
total value of 117. The natural division of 117 by 10 yields a quotient of 11 with a remainder
of 7, and the addition of 7 to the 13th digit (3) gives 10. So this ISBN code is valid.
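The check just performed can be sketched as a JavaScript function (the function name is illustrative; the final comparison is taken modulo ten so that a remainder of zero paired with a check digit of zero is also accepted):

```javascript
// ISBN-13 validity check as described above. Positions are counted
// from 1; hyphens are removed before checking.
function isValidIsbn13(isbn) {
  const digits = isbn.replace(/-/g, "").split("").map(Number);
  if (digits.length !== 13 || digits.some(isNaN)) return false;
  let sum = 0;
  for (let pos = 1; pos <= 12; pos++) {
    // even positions are weighted by three, odd positions by one
    sum += (pos % 2 === 0 ? 3 : 1) * digits[pos - 1];
  }
  const remainder = sum % 10;
  // "remainder plus the 13th digit gives ten", read modulo 10
  return (remainder + digits[12]) % 10 === 0;
}

console.log(isValidIsbn13("978-3-540-48663-3")); // true
console.log(isValidIsbn13("978-3-540-48663-4")); // false
```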

Error detection and correction algorithms are used in various application domains, such
as fixed and mobile storage devices or discs, internet packets transmissions, and cell
phone calls.

Data compression

Data compression consists of taking advantage of the existence of redundancies in data in
order to present them in a more compact form. For example, the Run Length Encoding
(RLE) compression algorithm takes a sequence of binary digits and transforms it into
another, shorter sequence in which each run of repeated bits is encoded by its length,
written on as many bits as the length of the longest run requires.

We will use the binary sequence 11111111111111111000011111111111 to illustrate the
concept of data compression. In this bit sequence, there are seventeen 1s followed by four
0s and eleven 1s. We must use the number of bits of seventeen for our encoding: 17 is
converted to 10001, which consists of five bits. So we must also convert 4 and 11 to binary
on five digits: 4 is converted to 00100 and 11 to 01011. Each run is then written as its bit
value followed by its five-bit length: 1 10001 for the seventeen 1s, 0 00100 for the four 0s,
and 1 01011 for the eleven 1s, which gives the compressed binary sequence
110001000100101011. Note that the initial sequence had 32 bits, but its compressed
version has only 18.
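This encoding — each run written as its bit value followed by its length on a fixed number of bits — can be sketched as follows (the function name is illustrative):

```javascript
// Run Length Encoding as in the example above: every run becomes its
// bit value followed by its length, written on as many bits as the
// longest run requires (here 17 -> "10001", i.e., 5 bits).
function rleCompress(bits) {
  const runs = bits.match(/0+|1+/g);        // consecutive runs of 0s or 1s
  const longest = Math.max(...runs.map(r => r.length));
  const width = longest.toString(2).length; // bits needed for the longest run
  return runs
    .map(r => r[0] + r.length.toString(2).padStart(width, "0"))
    .join("");
}

const original = "11111111111111111000011111111111";
console.log(rleCompress(original)); // "110001000100101011" (18 bits vs 32)
```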

Pattern matching

Pattern matching algorithms aim to check the occurrence(s) of a given pattern in a given
object, using, for example, the regular expressions (regExp) facilities that are available in
current programming languages. For instance, the following JavaScript code illustrates
the matching of two patterns in a short present tense sentence input by users in order to
check whether such sentences are grammatically correct. This example assumes that such
sentences are simply made up of a pronoun (I, you, he, she, it, we, and they) that is fol-
lowed by a space and a verb. Pattern matching algorithms are used in several applica-
tions, such as search engines, machine learning, information security, virtual reality, DNA
sequencing, and pattern recognition (characters and images).
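Since the program in the figure below is reproduced as an image, here is a comparable sketch; the pronoun lists and regular expressions are assumptions about what the original checks:

```javascript
// Accepts simple present-tense sentences of the form "<pronoun> <verb>".
// Third-person singular pronouns require a verb ending in "s".
const thirdPerson = /^(he|she|it) [a-z]+s$/i;
const otherPersons = /^(I|you|we|they) [a-z]+$/i;

function isGrammatical(sentence) {
  return thirdPerson.test(sentence) || otherPersons.test(sentence);
}

console.log(isGrammatical("she walks")); // true
console.log(isGrammatical("they walk")); // true
console.log(isGrammatical("she walk"));  // false
```

This is only a rough approximation of English grammar; real verb conjugation has many exceptions that a single regular expression cannot capture.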

Figure 5: Third Person Present Tense Pattern

Source: Created on behalf of IU (2020).

Databases

In a database, data is structured and kept in a repository, making it easy for users and
programmers to store data and retrieve information. Such repositories have been almost
omnipresent in the management of organizational activities since the 1960s. Popular
database management systems (DBMS) include Oracle, SQL Server, DB2, MySQL,
Access, MongoDB, Solr, and CouchDB.

Challenges

In modern societies, algorithms tend to use personal data to help people manage their
activities and decision-making choices. However, such algorithmic societies are exposed
to several challenges such as the applicability of current legal and ethical frameworks, the
promotion of unfair and opaque practices, the lack of accountability, privacy breaches,
and the disruption of the labor system.

Applicability of current legal and ethical frameworks

The British Academy, the Royal Society, the House of Lords, the French Parliament,
and the Association for Computing Machinery are all in agreement on the following
straightforward ethical principle for the use of data by algorithms: Algorithms should only
use data if it is for the general benefit of society (Olhede & Wolfe, 2018). Moreover, since
mid-2018, the European Union legal system follows a General Data Protection Regulation
(GDPR) law that requires all algorithms to give an explanation to anyone who is requesting
such an explanation when their data were used by the algorithms without their consent or

when they are affected by the processing and outcome of the algorithms. However, critics
doubt the ability of algorithms to generate meaningful explanations in human natural lan-
guages, let alone in the legal terminology.

Promotion of unfair and opaque practices

Suppose we have an algorithm on a mobile application that allows any resident to report
street light problems in their neighborhood, both for the assessment of the state of street
lighting in the country and for the consequent allocation of a budget for their maintenance (Craw-
ford, 2013). Unfortunately, this collection method of massive data for the algorithm is
biased in that it excludes the data of the non-users of the mobile application. Conse-
quently, all non-users of the mobile application will not have a budget for the mainte-
nance of their street lights even if those lights are not working, and that can be seen as
unfair. Another serious concern is that the development of algorithms is so complex that
they are a black box for most people who simply have to put their trust in the hands of the
few algorithm specialists. Unfortunately, the betrayal of this trust is possible and can have
serious negative consequences in people’s lives, as in the example of the mistaken arrest
of an African American due to a “faulty facial recognition match” (Hill, 2020). Search
engines are another example that illustrates the opacity of algorithms. It is difficult for an
end-user to understand how a search keyword can return pages that correspond to their
personal profile even when such a keyword does not contain their personal data.

Lack of accountability

Both the Association for Computing Machinery (ACM) and the European General Data Pro-
tection Regulation (GDPR) are adamant that individuals are entitled to question algorith-
mic decisions. It is the initiators of the use of automated algorithms in such decision-mak-
ing processes who are responsible and accountable for those decisions. This is, however,
difficult to enforce mainly because the typical end-user of automated algorithms is usually
a non-specialist who does not have the ability to understand and to explain in natural lan-
guages the details leading to the outcomes of algorithms.

Privacy breaches

The sharing of social media data (sometimes for financial motives) by different stakehold-
ers without the prior knowledge and prior consent of the owners and generators of such
data is a good example of a privacy breach in algorithmic societies. It is a major concern
because people usually make use of social media to share their life experiences with their
friends and families, and discuss sensitive matters on their health, love lives, or political
opinions. This, unfortunately, sometimes leads to situations where managers and govern-
ment authorities use social media applications to invade people’s privacy with negative
consequences to their work and their lives.

Disruption of the labor system

As stated above, the development of algorithms is so complex that they are a black box for
most people who simply have to put their trust in the hands of the few specialists who are
working on them. Moreover, there are perceptions that, because algorithms will ultimately

take full and exclusive control of the management of people’s lives in terms of their
decision-making choices as well as the execution of their activities, the labor force currently in
charge of these decision-making choices and activities may eventually become
redundant. The main concern is that we may end up with an algorithmic society where
algorithm specialists are the only active population and all others are unemployed.

SUMMARY
This unit identified Euclid, al-Khwarizmi, Fibonacci, Gödel, Hilbert, and
Turing as some of the key players in the history of algorithms within
their respective historical contexts. Computational problems were intro-
duced before the presentation of key algorithmic concepts, such as algo-
rithm quality and the difference between algorithms and programming.
Finally, we reviewed modern algorithmic societies with a few examples
on some of their predominant algorithms, such as search engines and
error detection, and we identified societal issues resulting from algo-
rithms. Some of those issues include, for example, the opacity of algo-
rithms and their lack of accountability despite the existence of interna-
tional laws for the promotion of a transparent and fair algorithmic
society.

UNIT 2
ALGORITHM DESIGN

STUDY GOALS

On completion of this unit, you will have learned …

– how to organize the data of your algorithms into suitable data structures.
– the use of iteration and recursion in algorithm design.
– key algorithm design strategies.
2. ALGORITHM DESIGN

Introduction
When faced with the task of creating an algorithm for a computational problem, it is useful
to consider existing algorithm design techniques as a source of inspiration. It is also
important to have first-hand experience with other computational problems whose algo-
rithms are based on design techniques in order to be aware of the similarities and differ-
ences between various computational problems. Similarities and differences between
computational problems may arise from the nature of the data. This fact calls for a clear
understanding of the different data structures that are available to algorithms, while
knowing that the use of data structures may differ from one algorithm design technique to
another.

The concepts that are covered by this unit are intended to give you a sound understanding
of the use of data structures by key design techniques, as visible in relevant algorithms.

2.1 Data Structures


Algorithms use several data structures that are presented in this section, including
arrays, lists, heaps, queues, stacks, trees, and graphs. A data structure organizes data
into a usable format to which operations can be applied.

An array can be described as a finite sequence of elements which may be empty. The
length of an array is simply equal to the length of its sequence of elements. These ele-
ments are accessible through their indices. It is possible to insert and remove elements in
or from any position in the array. Current programming languages have many more
methods and functions for arrays, as is visible in the following JavaScript examples of
insertion operations:

1. at the beginning: arrayVariable.unshift (element(s) to add)
2. somewhere in the middle: arrayVariable.splice (position where the added element(s) will stay, 0, element(s) to add)
3. at the end: arrayVariable[arrayVariable.length] = element(s) to add; or, arrayVariable.push (element(s) to add)
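These insertion operations can be tried out as follows (the values are illustrative):

```javascript
let arrayVariable = [20, 30];

arrayVariable.unshift(10);                // at the beginning -> [10, 20, 30]
arrayVariable.splice(2, 0, 25);           // 25 at index 2    -> [10, 20, 25, 30]
arrayVariable[arrayVariable.length] = 40; // append by index  -> [10, 20, 25, 30, 40]
arrayVariable.push(50);                   // append with push -> [10, 20, 25, 30, 40, 50]

console.log(arrayVariable); // [10, 20, 25, 30, 40, 50]
```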

It is also possible to initialize an array simply by stating its number of elements even
though other operations can be used later to increase or decrease that initial length. That
is what is done in the JavaScript program below.

In this program, an array is created with an initial size of n1, making it possible for the
function captureElements to fill those n1 positions of the array simply by accessing
them with their indices. However, because the other n2 elements were not planned as
being part of the initialization of the array, the use of one of the methods identified above
is necessary for their insertion.

Figure 6: Array’s Input and Output Example

Source: Created on behalf of IU (2020).

Stacks

A stack is a sequence of elements where new elements can only be inserted at the end of
the sequence, and only the element at the end of the sequence can be removed. Thus, the
stack data structure has a limited number of operations: creating an empty stack, putting
a new element on top of the stack, getting the value of the element currently on top of the
stack, or removing it from the stack. All of these operations can easily be implemented by
a stack class with an array as a private attribute. Implementing stacks with linked nodes,
on the other hand, is a little more complicated, as is visible in the following JavaScript
program. Many points deserve our attention here.
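As the full program appears in the figures below, here is a condensed sketch of the same idea — a stack as a chain of linked nodes (class and method names are simplified and differ from the original):

```javascript
// Each node holds a value and a reference to the node below it.
class StackNode {
  constructor(value, next) {
    this.value = value;
    this.next = next; // the node that was on top before this one
  }
}

// Only the top of the chain is ever accessed: last in, first out.
class Stack {
  constructor() { this.top = null; }             // empty stack
  isEmpty() { return this.top === null; }
  push(value) { this.top = new StackNode(value, this.top); }
  peek() { return this.isEmpty() ? undefined : this.top.value; }
  pop() {
    if (this.isEmpty()) return undefined;
    const value = this.top.value;
    this.top = this.top.next;                    // unlink the old top
    return value;
  }
}

const s = new Stack();
s.push(1); s.push(2); s.push(3);
console.log(s.pop(), s.pop(), s.peek()); // 3 2 1
```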

Figure 7: Stack Implementation with Linked Nodes (Start)

Source: Created on behalf of IU (2020).

Figure 8: Stack Implementation with Linked Nodes (End)

Source: Created on behalf of IU (2020).

Lists

A list is a sequence of elements, just like an array or a stack. Although all three have a
beginning (head) position and an end (tail) position, for a list, we will also always want to
know the current position of the cursor of the list.

In fact, the operations of a list are usually carried out only from this unique position of the
cursor. In the case of a stack, the position of the cursor always points to the top of the
stack such that all stacks’ operations can only apply to the top of the stack. The identified
arrays’ and stacks’ operations still apply to lists for the position of their cursor. Moreover,
lists offer the possibility to change the value of the position of their cursor.

Lists, like stacks, can easily be implemented by a class with an array as a private attribute;
here, however, we show how to implement them using linked nodes, as illustrated in the
following JavaScript program.

Always trace this program with a handful of test cases for a better understanding of the
implementation of the list data structure so that missing methods can be added (e.g.,
insertion of elements).

Figure 9: List Implementation with Linked Nodes (Start)

Source: Created on behalf of IU (2020).

Figure 10: List Implementation with Linked Nodes (Cont'd)

Source: Created on behalf of IU (2020).

Figure 11: List Implementation with Linked Nodes (End)

Source: Created on behalf of IU (2020).

Non-Priority and Priority Queues

For stacks, the top of the stack contains the element that entered last, and it is that top
element that will always be the first to leave the stack. This is why stacks are said to
follow a last-in-first-out (LIFO) policy.

Figure 12: Non-Priority Queues Implementation with Linked Nodes (Start)

Source: Created on behalf of IU (2020).

Figure 13: Non-Priority Queues Implementation with Linked Nodes (End)

Source: Created on behalf of IU (2020).

In the first-in-first-out (FIFO) policy of non-priority queues, the first element to have
entered the queue is always the first element to leave. Non-priority queues behave like
lists with cursor positions always equal to zero, making their implementation very similar
to the one of lists except for the position. A non-priority queue that is implemented as a
list with a cursor position always equal to zero is visible in the above JavaScript program.
Readers are advised that the word list has simply been replaced by the word queue. The
cursor attribute has disappeared, and it has been replaced by the zero value where neces-
sary.

As in the case of a queue where the elderly are sometimes helped first, priority queues
allow their elements to have different levels of priority in order to first serve the ones with
the highest level of priority. This requires the QueueNodeClass class to have a third
attribute on the priority level of each of its nodes so that, periodically, we can scan
through the queue, identify the position of the first node with the highest level of priority,
and remove it from the queue in order to attend to it. This is done with the
getPriorityPosition() method of the PQueueNodeClass class in the JavaScript
implementation of a priority queue, shown below.

The first difference between the PQueueNodeClass class and the QueueNodeClass class
is the inclusion of the priority attribute in the PQueueNodeClass class to indicate that
each node of a priority queue has a priority level that is represented by a priority number.
Recall that a priority number is a strictly positive natural number and that the smaller the
priority number, the higher the priority level.
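A condensed sketch of this scan-and-remove idea follows (class and method names are simplified relative to the PQueueNodeClass and PQueueClass classes of the figures):

```javascript
// Each node stores a value and a priority number; the smaller the
// number, the higher the priority level.
class PQueueNode {
  constructor(value, priority) {
    this.value = value;
    this.priority = priority;
    this.next = null;
  }
}

class PQueue {
  constructor() { this.head = null; }
  enqueue(value, priority) {          // append at the tail of the chain
    const node = new PQueueNode(value, priority);
    if (this.head === null) { this.head = node; return; }
    let cur = this.head;
    while (cur.next !== null) cur = cur.next;
    cur.next = node;
  }
  // Scan the whole chain for the first node with the smallest priority
  // number, unlink it, and return its value.
  removeHighestPriority() {
    if (this.head === null) return undefined;
    let best = this.head, beforeBest = null, cur = this.head, prev = null;
    while (cur !== null) {
      if (cur.priority < best.priority) { best = cur; beforeBest = prev; }
      prev = cur;
      cur = cur.next;
    }
    if (beforeBest === null) this.head = best.next; // best was the head
    else beforeBest.next = best.next;
    return best.value;
  }
}

const q = new PQueue();
q.enqueue("routine", 3); q.enqueue("urgent", 1); q.enqueue("normal", 2);
console.log(q.removeHighestPriority()); // "urgent"
console.log(q.removeHighestPriority()); // "normal"
```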

Figure 14: Priority Queues Implementation with Linked Nodes (Start)

Source: Created on behalf of IU (2020).

Figure 15: Priority Queues Implementation with Linked Nodes (Cont'd)

Source: Created on behalf of IU (2020).

Figure 16: Priority Queues Implementation with Linked Nodes (End)

Source: Created on behalf of IU (2020).

The second difference between these two classes is the inclusion of the
getPriorityPosition() method in the PQueueNodeClass class so that we can use
that method to find the position of the first element with the highest priority and remove
the associated element from the queue, if needed, in order to attend to it.

This is precisely what is achieved by the removeHighestPriorityElement() method of
the PQueueClass class. The use of the Math.random() method by the
captureElements() function for the generation of random priority numbers is also
worth noting.

Binary Trees and Binary Search Trees

A tree is made up of a root node that has one or many children or subtrees. In the case of
binary trees, every node has one or two direct sub-nodes, except for the leaves, which do
not have any. Let us suppose that we have a binary tree in which every node has exactly
two sub-nodes, except for the leaves, as represented in the following diagram.

Figure 17: Completely Full Binary Tree Example

Source: Created on behalf of IU (2020).

Such a completely full tree with L levels has a number of nodes equal to the sum of the
terms of the geometric sequence that has 1 as its first term and 2 as its ratio:

2^0 + 2^1 + 2^2 + 2^3 + 2^4 + … + 2^(L−1)

The application of the following formula for the sum of the first n terms of a geometric
series with an initial value a and a ratio r yields a value equal to 2^L − 1:

a·r^0 + a·r^1 + a·r^2 + a·r^3 + a·r^4 + … + a·r^(n−1) = a(1 − r^n) / (1 − r)

This means that if we want to store all nodes of a completely full binary tree of L levels, we
must keep 2^L − 1 slots for that storage. Such slots can be numbered from 1 to 2^L − 1 or
from 0 to 2^L − 2. As the figure below illustrates, the numbers represent the positions of
the tree above, and the root of the tree is in position 0.

Figure 18: Numbering of the Nodes of a Completely Full Binary Tree

Source: Created on behalf of IU (2020).

It is interesting to note in this tree of positions that all the left nodes are located in odd
positions and all the right nodes are located in even positions. Moreover, the position of a
left child node is one more than the double of the position of its father (2p + 1), and the
position of a right child is two more than the double of the position of the father (2p + 2).
Conversely, the position of a father is obtained by subtracting one from the position of its
left child and halving the result, or by halving the position of its right child and then
subtracting one. In other words, the position of the two children can easily be calculated based
on the position of the parent, and the position of the parent can also be easily calculated
once we know the position of one of the children. Thus, it is possible to assign values to
the different nodes of the binary tree based on the following rule: assigning a value to the
root of the tree that is located in position 0 is always allowed, and, for all the other nodes,
assigning a value to a node in a position where the parent does not have a value is not
allowed. This is precisely what is done by the following JavaScript program where the val-
ues of the nodes of binary tree are kept in an array.
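The position arithmetic described above can be captured in a few stand-alone helper functions (separate from the course program shown in the figures):

```javascript
// Position arithmetic for a binary tree stored in an array,
// with the root at position 0.
const leftChild  = p => 2 * p + 1;               // always odd
const rightChild = p => 2 * p + 2;               // always even
const parent     = p => (p % 2 === 1) ? (p - 1) / 2 : p / 2 - 1;

console.log(leftChild(0), rightChild(0)); // 1 2
console.log(leftChild(2), rightChild(2)); // 5 6
console.log(parent(5), parent(6));        // 2 2  (both children of node 2)
```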

The following points can be singled out from this JavaScript implementation of binary
trees:

• The BinTreeClass class that represents binary trees is quite simple, as it is only made
up of two attributes: (1) the number of levels in the tree and (2) the array that will store
all the elements of the tree. It has a getTTNbOfElements() method whose purpose is
to calculate the total number of slots required for the array to store all the elements of
the binary tree, even when such a tree is completely full. That method is called by the
initializeBinTree() method, which assigns all these array slots to undefined as a
way of stating that none of the elements of the tree has so far been given a value. The
role of the putElementInNode() method is to place an element in a given position on
the binary tree provided that its parent’s position does not contain an undefined
object. Other instructions are included in the JavaScript program, and their modus
operandi will easily become apparent by tracing the program with a few representative
test cases.
• It is worth noting that the JavaScript program presented below does not have a method
on how to remove an element from a binary tree. Suppose we want to remove the ele-
ment that is currently located in a given position on the binary tree. The first thing to do
is to set the value in that position to the undefined object. Afterwards, one will have to
scan through all the nodes that have a higher index value than the current position and
that are located below the node of that position. All those higher index values will be set
to undefined.

Just as there are many species of trees in nature, there are different sizes and shapes of
binary trees in algorithmics. This variation has led to the identification of several types of
binary trees such as full, complete, perfect, and balanced trees. Other noticeable types of
binary trees include the Huffman and binary search trees, heaps, and AVLs (Adel’son-
Vel’sky & Landis, 1962), in addition to other types of multi-node trees where a node is
allowed to have more than two children. Similarly, there are different ways to systemati-
cally scan through the different nodes of a tree: the breadth-first approach, where the
nodes that are the nearest to the root are the first ones to be scanned, and the depth-first
approach, where the nodes that are the furthest from the root are the first ones to be scan-
ned. The implementation of binary trees can be done either with the use of an array or
linked nodes, as was done for lists and queues. The following JavaScript program uses
arrays for its implementation of binary trees.

Figure 19: Binary Trees Implementation with Arrays (Start)

Source: Created on behalf of IU (2020).

Figure 20: Binary Trees Implementation with Arrays (End)

Source: Created on behalf of IU (2020).

Let us briefly return to the concepts of full trees and complete trees. In a full tree, only the
leaves are allowed not to be full. For a binary tree to be complete, it must satisfy the fol-
lowing two conditions: (a) be full, and (b) have all leaves at the same height from the root.

It is also possible for a binary tree to be considered semi-complete, i.e., it is complete up
to one level before the leaves, but few of the parents of the leaves on the right side are not
full. One of the advantages of complete and almost-complete binary trees is that, because
they do not have many internal holes, their implementation by an array effectively makes
use of almost all the spaces of that array. As for binary search trees, the value of the left
child of each node must be smaller than the one of the node itself, which in turn must be
smaller than the value of its right child. This makes it easy for these trees to be searched.
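Combining the binary search tree property with the array positions introduced earlier gives a compact search sketch (illustrative, not part of the course program):

```javascript
// Searching a binary search tree stored in an array: the root is at
// index 0, the children of position p are at 2p+1 and 2p+2, and empty
// slots hold undefined.
function bstContains(tree, value) {
  let p = 0;
  while (p < tree.length && tree[p] !== undefined) {
    if (value === tree[p]) return true;
    p = value < tree[p] ? 2 * p + 1 : 2 * p + 2; // go left or right
  }
  return false; // fell off the tree without finding the value
}

//        8
//      /   \
//     3    10
const tree = [8, 3, 10];
console.log(bstContains(tree, 10)); // true
console.log(bstContains(tree, 7));  // false
```

Each comparison discards one whole subtree, which is what makes these trees fast to search.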

A heap is an almost complete binary tree that also keeps an order between each node and
its children. There are two types of heaps: the min-heap and the max-heap. In a max-heap,
the value of each parent node is greater than or equal to the one of each of its direct chil-
dren nodes. For a min-heap, however, the value of each parent node is smaller than or
equal to the one of its direct children nodes. The root of a max-heap always holds the
maximum value of the tree, and the root of the min-heap also holds the minimum value of
the tree. This is what makes heaps suitable for the representation of priority queues.
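The max-heap property can be checked directly on the array representation (an illustrative sketch):

```javascript
// Checks the max-heap property for a heap stored in an array:
// the children of position p sit at 2p+1 and 2p+2, and neither may
// exceed its parent.
function isMaxHeap(heap) {
  for (let p = 0; p < heap.length; p++) {
    const l = 2 * p + 1, r = 2 * p + 2;
    if (l < heap.length && heap[l] > heap[p]) return false;
    if (r < heap.length && heap[r] > heap[p]) return false;
  }
  return true;
}

console.log(isMaxHeap([9, 7, 8, 1, 5])); // true — the root holds the maximum
console.log(isMaxHeap([5, 9, 3]));       // false — child 9 exceeds its parent
```

A min-heap check is the mirror image: replace both `>` comparisons with `<`.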

Graphs

Much like binary trees, graphs can either be scanned with a breadth-first approach or a
depth-first approach. Their application domains include communication systems,
hydraulic systems, integrated computer circuits, mechanical systems, and transportation
(Ahuja et al., 1993).

A graph can be seen as a set of connected nodes. In a directed graph, the direction of the
connection between two nodes is well indicated as opposed to general graphs where all
connections between nodes are bi-directional. Thus, we focus on directed graphs because
they also allow for the possibility of representing bi-directions in nodes’ connections, as
illustrated in the example below. This example is a labeled graph where weights are
assigned to the nodes’ connections. In any case, this also accommodates unlabeled
graphs where weights are Boolean values.

Figure 21: Example of a Labeled Graph

Source: Created on behalf of IU (2020).

In a graph with a given number of nodes, each node can be identified by a unique natural
positive number less than the total number of nodes. Thereafter, it will become easy to
identify each node and implement graphs either as two-dimensional arrays or as an array
of linked nodes. This is what is done below for the above graph.

Figure 22: Example of Nodes Identification in a Graph

Source: Created on behalf of IU (2020).

Figure 23: Representation of the Nodes of a Graph in an Array

Source: Created on behalf of IU (2020).

Let’s now represent the previous graph example, first as a two-dimensional array and then as an array of linked nodes.

Figure 24: Graph’s Representation as a Two-Dimensional Array and an Array of Linked Nodes

Source: Created on behalf of IU (2020).

Certain graph configurations are more suitable for representation as a two-dimensional array, while others are better represented as an array of linked nodes. The JavaScript example below is an implementation of graphs with two-dimensional arrays so that readers can also be introduced to the programming of these types of arrays in JavaScript.

In this JavaScript program, the nbOfNodes attribute of the GraphClass class represents
the total number of nodes of the graph, and the names of those nodes are stored in the
nodesNames array attribute. It is the weights array attribute that stores the values of the
weights of the links between the different nodes of the graph, as illustrated in the two-
dimensional array (above). Unfortunately, there are no two-dimensional arrays in Java-
Script. Hence, we must store arrays inside another array as a way to implement a two-
dimensional array. This is visible in the initializeGraph() method whose purpose is to
initialize each box of the nodesNames array with the undefined object. That method also
initializes each box of the weights array with a sequence of undefined objects.

Suppose we have a graph with three nodes. In this case, the nodesNames array is made up
of three boxes and each of these three boxes is initialized with the undefined object. Sim-
ilarly, the weights array is made up of three boxes. Each of these three boxes is also an
array of three boxes that are each initialized with the undefined object by the
initializeGraph() method. The rest of the program is not too difficult to understand,
and, again, it is always useful and recommended to trace your programs with a few test
cases in order to unearth all their details.
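Based on the description above, a minimal sketch of such a class might look as follows (the attribute and method names follow the text; the full program in the accompanying figures may differ in its details, and the addLink method is our own addition for illustration).

```javascript
// Sketch of a graph stored as an array of arrays, since JavaScript has no
// built-in two-dimensional arrays. Names follow the description in the text.
class GraphClass {
  constructor(nbOfNodes) {
    this.nbOfNodes = nbOfNodes; // total number of nodes of the graph
    this.nodesNames = [];       // name of each node
    this.weights = [];          // weights[i][j] = weight of the link i -> j
    this.initializeGraph();
  }

  // Fill nodesNames, and every row of weights, with the undefined object.
  initializeGraph() {
    for (let i = 0; i < this.nbOfNodes; i++) {
      this.nodesNames[i] = undefined;
      this.weights[i] = [];
      for (let j = 0; j < this.nbOfNodes; j++) {
        this.weights[i][j] = undefined;
      }
    }
  }

  // Record a weighted, directed link between two nodes (illustrative helper).
  addLink(from, to, weight) { this.weights[from][to] = weight; }
}
```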

Figure 25: Graph Implementation with Two Arrays (Start)

Source: Created on behalf of IU (2020).

Figure 26: Graph Implementation with Two Arrays (End)

Source: Created on behalf of IU (2020).

2.2 Recursion and Iteration
Let’s briefly put ourselves in the situation where we have to fulfill the important mission of
climbing a cement stairway. One way to approach that mission is to step on the first stair,
then move to the second, then the third, the fourth, the fifth, and so on, up to the top. We
would call this an iterative approach. It usually takes the form of a for, a while, or a
repeat loop when it is adopted in an algorithm. Each of its steps is identified as an itera-
tion. The other way to approach the mission is to step on the first stair and simply consider
the remaining steps as a new but shorter mission that, when fulfilled, will also successfully
end the initial mission. This new approach is recursive in the sense that the completion of the initial mission on a given object relies on the completion of the same mission but for a simpler or smaller object. Let us now illustrate these two concepts of recursion and iteration with the following two JavaScript programs on the factorial of numbers and on the enumeration of prime numbers, respectively. More precisely, the first program calculates the factorial of a given strictly positive natural number, both iteratively and recursively. The second program enumerates the set of prime numbers smaller than or equal to a given strictly positive natural number, also both iteratively and recursively.

Recursion
This approach allows a function to call itself with different arguments.

This is a gentle reminder of the formula of the factorial of a strictly positive natural num-
ber n.

n! = n · (n − 1) · (n − 2) · (n − 3) · (n − 4) · … · 5 · 4 · 3 · 2 · 1

Figure 27: Iterative and Recursive Versions of Factorial

Source: Created on behalf of IU (2020).

The iterative calculation of the factorial of n can be illustrated as follows.

Figure 28: Iterative Illustration of Factorial

Source: Created on behalf of IU (2020).

As for the recursive calculation of the factorial of n, it is clearly visible in the formula of
factorial itself, as illustrated below.

n! = n · (n − 1) · (n − 2) · (n − 3) · (n − 4) · … · 5 · 4 · 3 · 2 · 1

n! = n · (n − 1)!, i.e., Factorial(n) = n · Factorial(n − 1)

In other words, the recursive calculation of the factorial of n involves the calculation of
another factorial, the factorial of n–1.
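Both versions can be sketched as follows (an illustrative reimplementation; the function names are our own, and the program in the accompanying figure may differ in its details).

```javascript
// Iterative factorial: multiply an accumulator by 2, 3, ..., n.
function factorialIter(n) {
  let result = 1;
  for (let i = 2; i <= n; i++) {
    result *= i;
  }
  return result;
}

// Recursive factorial: n! = n * (n - 1)!, with 1! = 1 as the base case.
function factorialRec(n) {
  if (n <= 1) return 1;
  return n * factorialRec(n - 1);
}
```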

Both the iterative and recursive approaches are illustrated in the JavaScript program pre-
sented below. It enumerates the set of prime numbers less than or equal to a given strictly
positive natural number, starting with the primality testing function for which
isPrimeIter() uses the iterative approach, and isPrimeRec() uses the recursive
approach. The isPrimeIter() function loops from 2 to the value of its parameter to
check if the later one has a divisor. As for isPrimeRec(), if its first parameter is divisible
by its second, then it concludes on the non-primality of the first parameter; otherwise, it
reduces the value of its second parameter by one before calling itself back into action. The
primesIter() function uses an iterative approach by testing the primality of each num-
ber for its possible inclusion in the final array. On the other hand, the primesRec() func-
tion checks the primality of its parameter and decides its inclusion in the set of the prime
numbers that are less than that parameter.
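A sketch of these four functions, following the behavior described above (an illustrative reimplementation; the program in the accompanying figure may differ in its details):

```javascript
// Iterative primality test: look for a divisor between 2 and n - 1.
function isPrimeIter(n) {
  if (n < 2) return false;
  for (let d = 2; d < n; d++) {
    if (n % d === 0) return false;
  }
  return true;
}

// Recursive primality test: isPrimeRec(n, d) checks the divisors d, d - 1, ..., 2.
// A first call would be isPrimeRec(n, n - 1).
function isPrimeRec(n, d) {
  if (d < 2) return n >= 2;
  if (n % d === 0) return false;
  return isPrimeRec(n, d - 1);
}

// Iterative enumeration of the primes less than or equal to n.
function primesIter(n) {
  const result = [];
  for (let i = 2; i <= n; i++) {
    if (isPrimeIter(i)) result.push(i);
  }
  return result;
}

// Recursive enumeration: the primes up to n are the primes up to n - 1,
// plus n itself when n is prime.
function primesRec(n) {
  if (n < 2) return [];
  const smaller = primesRec(n - 1);
  if (isPrimeRec(n, n - 1)) smaller.push(n);
  return smaller;
}
```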

Figure 29: Naive Iterative and Recursive Primality Test Algorithms

Source: Created on behalf of IU (2020).

2.3 Divide-and-Conquer
Divide-and-Conquer
This strategy results in faster algorithms by breaking the input data into many, almost equal-sized inputs.

The divide-and-conquer algorithm design strategy consists of sub-dividing a given problem into similar sub-problems of almost equal sizes, solving those sub-problems, and aggregating the solutions of those sub-problems into an overall solution for the initial problem. According to Kao (n.d.), the divide-and-conquer strategy is linked to Herbert Simon’s mathematical concept of near decomposability. Simon (2002) defines near decomposability as the “boxes-within-boxes” hierarchical multilevel organization of a system.

This algorithm design strategy is usually recursive and time-efficient. This strategy is illus-
trated in the following algorithm that multiplies two long integers with an equal length
that is a power of two. Suppose that we have the long integer L1 = 74983152 whose
length is 8. We can divide this long integer into two parts: the upper part 7498, and the
lower part 3152. Similarly, let’s consider the example of another long integer L2 =
54926813 of length 8 whose upper half is 5492 and lower half is 6813. This is where the
trick is.

L1 = 74983152 = 7498 · 10^4 + 3152

L2 = 54926813 = 5492 · 10^4 + 6813

L1 · L2 = 6813 · 3152 + 6813 · 7498 · 10^4 + 5492 · 3152 · 10^4 + 5492 · 7498 · 10^8

This equation shows how the problem of multiplying integers of length 8 has been
reduced to the problem of multiplying integers of length 4 which itself will be reduced to
the easier problem of multiplying integers of length 2. The efficiency of this method comes
from the number of steps taken to move from 8 to 2 (three steps on the 8, 4, 2 trip) com-
pared to the 8 steps on the 8, 7, 6, 5, 4, 3, 2, 1 trip.
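The splitting step can be sketched in JavaScript as follows (an illustrative example of our own: the numbers are given as strings of equal length, and ordinary number arithmetic is used for the half-length products, so this sketch only demonstrates the divide step, not a full recursive multiplier).

```javascript
// One divide-and-conquer step: split two equal-length decimal numbers
// (given as strings whose length is a power of two) into upper and lower
// halves, and combine the four half-length products.
function multiplySplit(x, y) {
  const n = x.length; // assumed equal to y.length and a power of two
  const half = n / 2;
  const xUpper = Number(x.slice(0, half)), xLower = Number(x.slice(half));
  const yUpper = Number(y.slice(0, half)), yLower = Number(y.slice(half));
  return (
    xLower * yLower +
    (xLower * yUpper + xUpper * yLower) * 10 ** half +
    xUpper * yUpper * 10 ** (2 * half)
  );
}
```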

2.4 Balancing, Greedy Algorithms, and Dynamic Programming
The following algorithm design techniques are usually used for optimization problems
where the aim is typically to find the best option for a given purpose.

Balancing

The divide-and-conquer algorithm design technique teaches us that an algorithm can become significantly faster by subdividing its input into two smaller equal-sized sub-inputs. This is the case, for example, for an algorithm on binary trees, where it can decide to consider the left and the right children of the tree separately in its quest to solve its computational problem. Hence, it is vital to ensure that the left and the right subtrees have an almost equal size for the metric that the algorithm is interested in. It is in

that perspective that AVL trees always stay balanced (the heights of the left and right subtrees never differ by more than one). The AVL acronym comes from the names of Adel’son-Vel’sky and Landis (1962), who are credited with the discovery of the AVL data structure.

Let us now illustrate the concept of tree-balancing by successively inserting the following numbers into an initially empty AVL tree, keeping in mind the two main properties of AVL trees: (1) the value of any parent is greater than any value in its left subtree but less than any value in its right subtree, and (2) the difference in height between the left and right subtrees of any node cannot exceed one. The sequence of numbers to be inserted is 75, 29, 52, 89, 92, 90, 24, 8, 17, 27.

Figure 30: Insertion of the Sequence 75, 29, 52, 89, 92, 90, 24, 8, 17, 27 in an AVL (Start)

Source: Created on behalf of IU (2020).

Figure 31: Insertion of the Sequence 75, 29, 52, 89, 92, 90, 24, 8, 17, 27 in an AVL (Cont’d)

Source: Created on behalf of IU (2020).

Figure 32: Insertion of the Sequence 75, 29, 52, 89, 92, 90, 24, 8, 17, 27 in an AVL (End)

Source: Created on behalf of IU (2020).

Greedy Algorithms

Dating back to the 1970s, the greedy algorithm design technique consists of always mak-
ing the best possible choice in the moment (local choice), even if the final or global out-
come from these successive local choices happens not to be the best one (Ye, 2013). This is
how this technique works for the knapsack problem. Suppose that we have to store a set of valuable fruits and vegetables in a bag with a given weight capacity so that the bag is filled with the most valuable items possible. We have mangoes, avocadoes, yams, maize, cassava, potatoes, and apples. The respective weights of these items are 20, 15, 10, 30, 5, 50, and 25, and their total costs are 60, 60, 90, 60, 80, 75, and 75. To determine the value of a given item, we calculate its unit cost by dividing its total cost by its weight.

The greedy approach starts by assuming that mangoes are the most valuable and com-
pares their unit cost to the other fruits and vegetables until it finds that avocadoes are
more valuable than mangoes. Once avocadoes are assumed to be the most valuable, their
unit cost will be compared against that of yams, which will become the assumed most val-
uable item. This process continues until we learn that cassava is actually the most valua-
ble.
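The scan described above can be sketched as follows (the item names, weights, and total costs are the ones from the example; the function name mostValuable is our own).

```javascript
// Greedy scan (illustrative): keep the item with the best unit cost seen so far.
const items = [
  { name: "mangoes",   weight: 20, totalCost: 60 },
  { name: "avocadoes", weight: 15, totalCost: 60 },
  { name: "yams",      weight: 10, totalCost: 90 },
  { name: "maize",     weight: 30, totalCost: 60 },
  { name: "cassava",   weight: 5,  totalCost: 80 },
  { name: "potatoes",  weight: 50, totalCost: 75 },
  { name: "apples",    weight: 25, totalCost: 75 },
];

function mostValuable(items) {
  let best = items[0];
  for (const item of items.slice(1)) {
    // Compare unit costs (total cost divided by weight).
    if (item.totalCost / item.weight > best.totalCost / best.weight) {
      best = item;
    }
  }
  return best.name;
}
```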

Figure 33: Greedy Algorithm Example

Source: Created on behalf of IU (2020).

Dynamic Programming

This algorithm design technique looks for the best possible solution to a combinatory
problem by dividing the initial problem into relevant sub-problems whose solutions are
then stored in arrays before being aggregated toward the final solution. Currency
exchange illustrates this technique. Suppose we have coin denominations of 1, 7, and 11 for a given currency, and we want to change a cash value of 21 in that currency with the smallest possible number of coins. A greedy approach will first look at the eleven (11), then the seven (7), and finally the one (1). It will result in a single 11 coin, a single 7 coin, and three 1 coins, for a total of five coins. A quick look at this example shows that it is also possible to change the same amount of cash with only three 7 coins.

In other words, the greedy approach fails to find the best or smallest possible number of
coins for this problem. Let's look at the dynamic programming solution of this example.

Figure 34: Dynamic Programming of the Currency Exchange Problem

Source: Created on behalf of IU (2020).

The first row in the table represents the cash value from 0 to 21. The three different cur-
rency denominations are represented in the first column. The entire second column is fil-
led with zeros because a zero cash value is changed into zero coins for any denomination.
Let’s start by filling the rest of the second row, where it is assumed that we only have a denomination of value 1: here, the number of coins equals the cash value, since only coins of 1 are available. On the third row, we have coins of 1 and coins of 7. Cash values from one to six can only use coins of 1, but for a cash value of seven and above, each entry is calculated as the minimum of the number directly above that box and the successor of the value located seven positions to its left. When this process is carried out for all the other rows of the table, it yields the minimum number of coins for this problem: three.
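The row-by-row calculation can be sketched as follows (an illustrative version that keeps a single rolling row instead of the full table; the function name minCoins is our own).

```javascript
// Dynamic-programming sketch of the currency exchange example: row[v] holds
// the minimum number of coins for cash value v using the denominations
// processed so far. Infinity marks values that cannot be formed yet.
function minCoins(denominations, amount) {
  const row = new Array(amount + 1).fill(Infinity);
  row[0] = 0; // a zero cash value is changed into zero coins
  for (const coin of denominations) {
    for (let v = coin; v <= amount; v++) {
      // Either keep the previous answer or use one more coin of this denomination.
      row[v] = Math.min(row[v], row[v - coin] + 1);
    }
  }
  return row[amount];
}
```

For the example above, minCoins([1, 7, 11], 21) yields 3, matching the three 7 coins found by hand, whereas the greedy choice used five coins.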

SUMMARY
This unit explained how to organize data in relevant data structures,
such as arrays, lists, queues, graphs, and trees. We reviewed how an iter-
ative algorithm can be transformed to its recursive equivalent. Impor-
tant algorithm design techniques were presented, including the divide-
and-conquer strategy, the greedy approach, dynamic programming, and
data balancing with the example of balanced binary search trees.

UNIT 3
SOME IMPORTANT ALGORITHMS

STUDY GOALS

On completion of this unit, you will have learned …

– key details about the main searching and sorting algorithms.
– jq commands for pattern matching.
– fundamental cryptography concepts as applied to the RSA algorithm.
– essential data clustering concepts as visible in the k-means algorithm.
3. SOME IMPORTANT ALGORITHMS

Introduction
Algorithms are designed and developed for computational problems from a myriad of
application domains and knowledge areas. However, a good proportion of algorithms
relies on searching and sorting tasks in the process of crafting their own solutions. This is
why searching and sorting algorithms are important, for example, for the task of matching
patterns from a given text.

Digital security is a critical domain for the protection of the integrity of computing sys-
tems. It mainly relies on data encryption and decryption protective techniques offered by
cryptography algorithms such as the RSA (Rivest–Shamir–Adleman) algorithm.

This unit is dedicated to searching and sorting algorithms, pattern matching, and the RSA
algorithm. It also presents the k-means algorithm that plays an important role in data
clustering. This is the task of subdividing a given set of objects into groups or clusters
based on the analysis of their data.

3.1 Searching and Sorting


This section is dedicated to the presentation of searching and sorting algorithms because
of their omnipresence in algorithmic tasks. An effort will be made to always compare the
efficiency of the different algorithms.

Searching Algorithms

The following searching algorithms are presented in this sub-section: linear, binary, and
hash search. A search problem consists simply of locating the position of a given value
within a sequence of values. We will assume that the sequence of values is stored in an
array. We will also assume that the search for a value in the array will return a position
where that searched value is stored in the array, or it will return -1 when a search did not
find the expected value.

Linear search

The linear search algorithm simply goes through each element of a sequence of values in
search of the expected one and returns with its position, if found; otherwise, it concludes
that it did not find it. The number of comparisons of the linear search algorithm can be as
high as the length of the sequence in situations where the value being searched is not
found in the sequence (Subero, 2020).
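A minimal sketch of the linear search algorithm, returning -1 when the value is absent, as assumed above (the function name is our own):

```javascript
// Linear search: scan the array from left to right and return the index of
// the first occurrence of the target, or -1 when the target is absent.
function linearSearch(values, target) {
  for (let i = 0; i < values.length; i++) {
    if (values[i] === target) return i;
  }
  return -1;
}
```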

Binary search

The binary search algorithm assumes that its sequence of elements is already sorted. It consists of dividing that sequence into two halves, with only three possible alternatives: (i) the value being searched for is located at the middle of the sequence and the search is over, (ii) the value being searched for is located in the first half of the sequence and the search should continue only inside that half, or (iii) the value being searched for is located in the second half of the sequence and the search should continue only inside that second half. It is important to note that the size of the sequence being searched always shrinks by half until it eventually reaches a size of one in the worst case. At the beginning, three comparisons (equal, less than, greater than) are made on a value of the full-size sequence. Then the same comparisons are made on a value of the appropriate half-size sub-sequence, thereafter on a value of the consequent quarter-size sub-sequence, and so on.

Let’s consider the worst case scenario where the value being searched is not in the
sequence. The following series represents the sizes of the successive sub-sequences being
searched.

n, n/2, n/2^2, n/2^3, n/2^4, n/2^5, n/2^6, n/2^7, …, n/2^(s−1), n/2^s

In other words, for a sequence with a length n, the binary search algorithm will succes-
sively consider s+1 sub-sequences and make three comparisons for each of them. The
algorithm will ultimately stop when its input is reduced to a sub-sequence with a length of
one, in other words, when the following equation is fulfilled.

n/2^s = 1

or

2^s = n

or

s = log₂ n

This equation gives us an idea about the logarithmic efficiency of the binary search algo-
rithm (Subero, 2020).
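The three-way comparison and the halving of the search range can be sketched as follows (a minimal illustrative version; the function name is our own):

```javascript
// Binary search on an already sorted array: compare the target with the
// middle element and keep only the half that can still contain it.
function binarySearch(sorted, target) {
  let low = 0, high = sorted.length - 1;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (sorted[mid] === target) return mid;  // found at the middle
    if (sorted[mid] < target) low = mid + 1; // continue in the second half
    else high = mid - 1;                     // continue in the first half
  }
  return -1; // worst case: the value is not in the sequence
}
```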

Hash search

The hash search algorithm assumes that there is a sequence of elements where each ele-
ment is identified by a unique key that is either a natural number or a string. Searching
strings is possible here because it is always possible to transform any string into its integer
equivalent using ASCII codes. The hash search algorithm uses these keys to store the dif-
ferent elements in a hash table that can easily be searched later. In fact, each element of
the sequence is stored in the hash table in the position of the hash function of its key. So, if

we are looking for an element, we simply have to calculate the hash function of its key and
locate it at that position. The first problem is that it is possible for many different keys to have the same hash value. This is called a collision, and it leads to an extra search in the
position of the collided keys. The second problem is that there are no standard hash func-
tions for all types of keys. This subsection uses the Multiplication, Addition, and Division
(MAD) hash function to illustrate the concept of hash search.

Let’s start from the assumption that there are n keys to search from, and p is the smallest prime number that is greater than or equal to n. Two random numbers a and b are then chosen between 0 and p–1, with a being strictly positive. The MAD hash function h is

h(k) = ((a · k + b) mod p) mod n

Let us suppose that we have the following 24 keys to search from: 19, 20, 30, 31, 67, 125,
189, 192, 267, 357, 388, 393, 428, 435, 483, 513, 574, 592, 645, 744, 794, 916, 954, and 980.

Twenty-nine is the smallest prime number that is greater than 24; and we will use 4 and 15
as the respective values of a and b. In other words, the values of n, p, a, and b are respectively 24, 29, 4, and 15. The first column of the table below contains the values of the different keys. The fourth column contains the calculated hash values, which serve as the indices of the places where the values are stored. When searching for a key, the hash search algorithm calculates the hash value of that key and locates it at the matching index of the hash table. In case of a collision, the matching index will contain several keys that must be searched with a different search algorithm. The speed of the hash search algorithm mainly depends on the speed of the hash function’s algorithm and on the number of collisions.
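The MAD hash function with the parameters of this example can be sketched as follows (the defaults n = 24, p = 29, a = 4, and b = 15 are the ones chosen above; the function name is our own).

```javascript
// MAD (Multiplication, Addition, Division) hash function:
// h(k) = ((a * k + b) mod p) mod n, with the example's parameters as defaults.
function madHash(key, { n = 24, p = 29, a = 4, b = 15 } = {}) {
  return ((a * key + b) % p) % n;
}
```

Every key is mapped to an index between 0 and n − 1, i.e., into one of the 24 slots of the hash table.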

Sorting Algorithms

The history of sorting algorithms includes radix sort, merge sort, insertion sort, counting
sort, bubble sort, bucket sort, quicksort, introsort, timsort, library sort, and burst sort
(Attard Cassar, 2018). In this section, we will only focus on the fundamental sorting algo-
rithms: radix sort, merge sort, insertion sort, bubble sort, bucket sort, and quicksort. We
will not dwell on how these fundamental sorting algorithms were extended by other sort-
ing algorithms.

Figure 35: Sorting Algorithms

Source: Created on behalf of IU (2020).

Radix sort

Although the radix sort algorithm was invented by Hollerith in the 1880s, the first compu-
terized version was proposed by Seward (Attard Cassar, 2018). It works well with natural
numbers where it makes use of the different digits of these numbers starting from the
least significant ones to the most significant ones. This algorithm uses ten buckets num-
bered from zero to nine because it knows that each decimal digit has a value between
these two numbers.

At the beginning of the algorithm, each number is put in the bucket of its least significant
digit (digit 0, see the second column in the table below). Then each number is put in the
bucket of its second least significant digit (digit 1, see the third column in the table below).
This process ends when all the digits’ positions are covered. This is illustrated below for
the numbers 7, 500, 26, 6, 6648, 578, 45, 65947, 28, 3974, and 174.

Figure 36: Radix Sort Algorithm Example

Source: Created on behalf of IU (2020).

The total number of operations performed by the radix sort algorithm depends on the
number of elements to be sorted and on the highest number of significant digits for those
elements. In fact, each element is checked for each level of significance.
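The digit-by-digit distribution into ten buckets can be sketched as follows (a minimal illustrative version for natural numbers; the function name is our own):

```javascript
// Radix sort sketch: distribute the numbers into ten buckets by one decimal
// digit at a time, starting from the least significant digit.
function radixSort(numbers) {
  let values = numbers.slice();
  const maxDigits = Math.max(...values).toString().length;
  for (let d = 0; d < maxDigits; d++) {
    const buckets = Array.from({ length: 10 }, () => []);
    for (const v of values) {
      // Extract the d-th least significant decimal digit of v.
      const digit = Math.floor(v / 10 ** d) % 10;
      buckets[digit].push(v);
    }
    values = [].concat(...buckets); // collect buckets 0..9 in order
  }
  return values;
}
```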

Bucket sort

The bucket sort algorithm assumes that it is given different sorted ranges or buckets to
which the various values to be sorted belong. It then simply places each value inside its
range or bucket before sorting each bucket with another sorting algorithm. Thereafter, the
sorted buckets are concatenated to get the final sorted sequence of values. This is illustra-
ted below for the numbers 7, 500, 26, 6, 6648, 578, 45, 65947, 28, 3974, and 174. This exam-
ple uses the following buckets or ranges: [0…9], [10…99], [100…999], [1000…9999], and
[10000…99999].

Figure 37: Bucket Sort Algorithm

Source: Created on behalf of IU (2020).

The final sorted sequence for this example is 6, 7, 26, 28, 45, 174, 500, 578, 3974, 6648, and
65947. This algorithm relies on the uniform distribution of values in the buckets even
though such a distribution is not guaranteed, and its total number of operations depends
on the other sorting algorithm that it is using.
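A sketch of the bucket sort algorithm using the digit-count ranges of this example (the built-in array sort stands in for the "other sorting algorithm"; the function name is our own):

```javascript
// Bucket sort sketch with the ranges from the example: [0..9], [10..99],
// [100..999], [1000..9999], and [10000..99999]. Each bucket is then sorted
// with another algorithm (here, the built-in array sort).
function bucketSort(numbers) {
  const buckets = [[], [], [], [], []];
  for (const v of numbers) {
    // The bucket index is the number of decimal digits minus one.
    buckets[v.toString().length - 1].push(v);
  }
  // Sort each bucket, then concatenate buckets in order.
  return buckets.flatMap(bucket => bucket.sort((a, b) => a - b));
}
```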

Insertion sort

The insertion sort algorithm, as presented here, searches for the smallest value in a given sequence and exchanges it with the value at the beginning of that sequence. (This select-the-minimum procedure is more commonly known as selection sort.) It then ignores this newly updated first element of the sequence and restarts the above process with the rest of the sequence, up to the point where the entire sequence is sorted. The following example is an illustration of this algorithm for the numbers 7, 500, 26, 6, 6648, 578, 45, 65947, 28, 3974, and 174.

Figure 38: Insertion Sort Algorithm Example

Source: Created on behalf of IU (2020).

In the table above, ei is the abbreviation of Exchange Index, and mi is the abbreviation of Minimum Index. When the algorithm looks for its minimal value for the first time, it checks almost the entire sequence of numbers. On the second pass, its number of checks is reduced by one, then by one again on the third pass, and so on until there is nothing left to check. The total number of checks and swaps made in the worst case scenario is, thus, almost equal to the following formula, where n represents the number of elements in the sequence to be sorted.

(n − 1) + (n − 2) + (n − 3) + … + 4 + 3 + 2 + 1 = n(n − 1)/2
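The procedure described above (repeatedly selecting the minimum of the unsorted remainder and swapping it to the front) can be sketched as follows, reusing the ei and mi abbreviations from the table (the function name is our own):

```javascript
// Sketch of the sorting procedure described above: repeatedly find the
// minimum of the unsorted remainder and swap it to the front.
function selectMinSort(numbers) {
  const values = numbers.slice();
  for (let ei = 0; ei < values.length - 1; ei++) { // ei = exchange index
    let mi = ei;                                   // mi = minimum index
    for (let j = ei + 1; j < values.length; j++) {
      if (values[j] < values[mi]) mi = j;
    }
    [values[ei], values[mi]] = [values[mi], values[ei]];
  }
  return values;
}
```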

Bubble sort

The bubble sort algorithm consists of continuously swapping neighboring values in a sequence of elements so as to shift the highest value to the end of the sequence. The following example is an illustration of how the bubble sort algorithm shifts the highest value of the following numbers to the end of the sequence: 7, 500, 26, 6, 6648, 578, 45, 65947, 28, 3974, and 174.

Figure 39: Bubble Sort Algorithm Example (Start)

Source: Created on behalf of IU (2020).

The same process is carried out for the following remaining sequence of numbers as is
illustrated below: 7, 26, 6, 500, 578, 45, 6648, 28, 3974, and 174.

Figure 40: Bubble Sort Algorithm Example (End)

Source: Created on behalf of IU (2020).

Now that the highest value 6648 has been shifted to the end of its sequence, the bubble
sort algorithm can continue with the remaining sequence of numbers 7, 6, 26, 500, 45, 578,
28, 3974, 174 until the entire sequence is sorted. The number of swapping positions is
almost equal to the number of elements in the sequence, and the lengths of the sequen-
ces where swapping happens successively decrease by one starting from the length of the
full sequence. In other words, in the worst case, the number of swaps done by the bubble
sort algorithm is similar to the number of operations that are performed by the insertion
sort algorithm.
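The repeated neighbor-swapping passes can be sketched as follows (a minimal illustrative version; the function name is our own):

```javascript
// Bubble sort sketch: each outer pass swaps neighboring values so that the
// highest remaining value is shifted to the end of the unsorted part.
function bubbleSort(numbers) {
  const values = numbers.slice();
  for (let end = values.length - 1; end > 0; end--) {
    for (let i = 0; i < end; i++) {
      if (values[i] > values[i + 1]) {
        [values[i], values[i + 1]] = [values[i + 1], values[i]];
      }
    }
  }
  return values;
}
```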

Merge sort

Merge sort algorithm
The merge sort algorithm efficiently joins two already sorted arrays into a newly sorted array.

The merge sort algorithm starts with the partitioning of its sequence of values into a set of singleton arrays (single-element arrays). Each of these singleton arrays is sorted since it only contains one value. This allows the merge sort algorithm to continuously merge all neighboring pairs of arrays until the entire array is sorted. Before illustrating the merge sort algorithm itself, it is important to first understand the following example of how to merge two sorted arrays: 1, 3, 5, 6 and 2, 5, 5, 6, 9.

Figure 41: Merging Two Sorted Sequences and Merge Sort Algorithm Examples

Source: Created on behalf of IU (2020).

It is important to note that the number of comparisons made during the merging of two
already sorted sequences is almost equal to the total number of values in those two
sequences. It is also important to have an idea about the total number of mergers made
by the merge sort algorithm. This number of mergers depends on the length n of the sequence to be sorted. At the beginning, there are n/2 mergers, then there are n/4 mergers, then n/8 mergers, and so on, up to one merger. This roughly corresponds to the following series.

n/2, n/2^2, n/2^3, n/2^4, n/2^5, n/2^6, n/2^7, …, n/2^(s−1), n/2^s

In other words, if s represents the total number of times that the merge sort algorithm
changes the size of the sequences being merged, the following equation must be fulfilled
by s.

n/2^s = 1

or

2^s = n

or

s = log₂ n

This means that the merge sort algorithm roughly changes its merging size log(n) times,
and it makes almost n comparisons for each of those merging sizes for a total number of
comparisons of almost n log(n) for the entire merge sort algorithm.
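Both the merging of two sorted arrays and the full merge sort can be sketched as follows (a minimal illustrative version; the function names are our own):

```javascript
// Merge two already sorted arrays into one sorted array by repeatedly
// taking the smaller of the two front elements.
function merge(left, right) {
  const result = [];
  let i = 0, j = 0;
  while (i < left.length && j < right.length) {
    result.push(left[i] <= right[j] ? left[i++] : right[j++]);
  }
  return result.concat(left.slice(i), right.slice(j));
}

// Merge sort: split the sequence in half, sort each half recursively,
// and merge the two already sorted halves.
function mergeSort(numbers) {
  if (numbers.length <= 1) return numbers; // a singleton array is sorted
  const mid = Math.floor(numbers.length / 2);
  return merge(mergeSort(numbers.slice(0, mid)), mergeSort(numbers.slice(mid)));
}
```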

Quicksort

The quicksort algorithm (Hoare, 1961) chooses a pivot value (possibly initialized with the
first element of the sequence to sort) and places it in a suitable position so that every
value on the left of that position is less than or equal to the pivot value, and every value on
the right of that position is greater than or equal to the value of the pivot. This process is
carried out many times for the sequence on the left of the pivot and for the one on its right
until the length of the sequence to be sorted is equal to one.

The quicksort algorithm has two cursors, a left-to-right → cursor that only moves from the
left to the right, and a right-to-left ← cursor that only moves from the right to the left. At the
start of the algorithm, the → cursor points to the beginning of the sequence being sorted
while the ← cursor points to its last element. The → cursor stops moving when it finds a
value that is greater than the value of the pivot, but the ← cursor continues moving as long
as it is meeting values that are greater than the one of the pivot.

Two cases are possible when both cursors stop moving. The first is that the position of the
← cursor is higher than the one of the → cursor. In this case, the two cursors will simply
exchange the values in their positions and continue with their moves. The second case is
that the position of the ← cursor is lower than the one of the → cursor. In this second case,
the following two steps are carried out:

1. The position of the ← cursor exchanges its value with the one in the position of the
current pivot.
2. The entire quicksort algorithm starts all over again for the sub-sequence on the left
side of the ← cursor and for the one on its right side until both sub-sequences are left
with one element maximum.

There is no guarantee that the pivot’s final position will divide the sequence being sorted into two almost equal-sized left and right sub-sequences. This is why, in the worst case scenario, the total number of operations performed by the quicksort algorithm is roughly equal to the following (n is the length of the sequence being sorted).

(n − 1) + (n − 2) + (n − 3) + … + 4 + 3 + 2 + 1 = n(n − 1)/2

The following table illustrates the quicksort algorithm for the sequence of numbers 7, 500,
26, 6, 6648, 578, 45, 65947, 28, 3974, and 174.

Figure 42: Quicksort Algorithm Example

Source: Created on behalf of IU (2020).
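The two-cursor partitioning described above can be sketched as follows (a minimal illustrative version in which the pivot is initialized with the first element of each sub-sequence; the function name is our own).

```javascript
// Quicksort sketch: take the first element as the pivot, place it at its
// correct position, then sort the left and right sub-sequences recursively.
function quickSort(values, low = 0, high = values.length - 1) {
  if (low >= high) return values;
  const pivot = values[low];
  let left = low + 1, right = high;
  while (left <= right) {
    while (left <= right && values[left] <= pivot) left++;  // -> cursor
    while (left <= right && values[right] > pivot) right--; // <- cursor
    if (left < right) {
      [values[left], values[right]] = [values[right], values[left]];
    }
  }
  // right now marks the last value <= pivot: swap the pivot into place.
  [values[low], values[right]] = [values[right], values[low]];
  quickSort(values, low, right - 1);
  quickSort(values, right + 1, high);
  return values;
}
```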

3.2 Pattern Matching
This section focuses on two different applications of pattern matching. First, it looks at
how to define a pattern of characters so that we can check if and where it is matched by a
given string. Such patterns are known as regular expressions, and many modern program-
ming languages process them under the RegExps abbreviation. This section gives an over-
view of RegExps in JavaScript. It is also possible to define patterns and search them in
JavaScript objects. Such objects are described with the use of the JavaScript Object Nota-
tion (JSON). The second part of this section is dedicated to the jq utility where it is possi-
ble to define a pattern and match it against a JSON object.

Regular Expressions or RegExps

Regular expression
A regular expression describes the format of how a given text should be written.

Regular expressions (RegExps) are constructed from the following building blocks: basic operations, ranges, the escape feature, anchors, and commonly used classes. A character is the most basic unit of a RegExp, and the dot (.) stands for any character. Regular expressions are concatenated by simply placing them one next to the other, while piping (|) two expressions means that we are interested in either of them. A similar role is played by placing expressions inside square brackets, which match any one of the alternatives listed; inserting a caret (^) right after the opening bracket indicates that none of the listed alternatives should be matched. The round-bracketing of a regular expression groups it so that it is evaluated as a unit.

The * operator repeats its regular expression zero or more times, while the + operator
repeats its expression one or more times, and the ? operator simply checks whether its
regular expression appears at most once. The escape character \ must be applied to spe-
cial characters, such as those previously identified, in order to cancel their special mean-
ing. The minus sign - when placed between two characters inside square brackets [] rep-
resents any value in the range between these two characters, but it loses that special
meaning if it is located at the beginning of the left square bracket [.

The following expressions are commonly used by regular expressions. \d stands for any
numeric digit, \w stands for any alphanumeric character or for _, and \s stands for the
white space character. The capitalization of these expressions leads to their negation: \D
stands for any character that is not a numeric digit, \W stands for any character that is not
an alphanumeric character nor _, and \S stands for any character that is not a white
space.

Four anchoring tags are used to indicate the place where we want our expression to match:
The caret character (^) anchors the regular expression at the start of the line, while the
dollar sign ($) anchors it at the end. Similarly, the \< tag (\b in JavaScript) anchors its
regular expression at the beginning of a word, while \> (again \b in JavaScript) anchors it
at the end.
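In JavaScript, these classes and anchors can be combined as follows; the sample strings are again illustrative:

```javascript
// Character classes
console.log(/\d/.test("year 2020"));      // true: contains a digit
console.log(/^\w+$/.test("snake_case"));  // true: only word characters
console.log(/\s/.test("no-spaces"));      // false: no whitespace present

// Negated classes (capitalized)
console.log(/^\D+$/.test("abc"));         // true: no digits at all

// Anchors: ^ start of string, $ end of string, \b word boundary
console.log(/^cat/.test("catalog"));      // true: starts with "cat"
console.log(/cat$/.test("tomcat"));       // true: ends with "cat"
console.log(/\bcat\b/.test("a cat sat")); // true: "cat" as a whole word
console.log(/\bcat\b/.test("catalog"));   // false: not a whole word
```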

Jq and Json Objects

While regular expressions are used to match desired patterns in strings, they can also be
used to match desired patterns in objects. Here, we focus on how to use jq commands to
search specific patterns in objects.

In jq, the dot (.) symbol simply refers to the entire set of objects being searched, but plac-
ing the name of an attribute after that symbol restricts the listed values to that attribute.
The keys keyword lists all the attributes’ names but that keyword can be given an index to
specify a specific attribute. Square brackets are used for arrays. The comma sign can be
used to apply different filters to the same set of objects. The length function gives the
number of elements in an array, the number of attributes in an object, or the length of a
string, depending on the nature of its parameter. This parameter is passed to it through the
| pipe character. Some of these jq filtering patterns are presented in the table below.

Table 1: jq Filtering Patterns (Selection)

Command Line    Job Description

jq .    This command lists all the objects being searched. This is useful simply to format the JSON input.

jq .attrName    This command lists the values of all the objects for the attribute attrName.

jq .arrayAttr[p]    This command lists the values of all the objects at position p of the array attribute arrayAttr.

jq .[p]    This command gives the value at position p of the object being searched, provided that this object is an array.

jq filt | length    This command gives the number of elements or the length of the result obtained from filtering the objects being searched with the filter filt.

Source: Created on behalf of IU (2020).

We will restrict ourselves to the following jq general command to look for regular expres-
sions in the object located in the json file f.json.

Code
jq 'fltPat | fct ("regExp" ; "flg")' f.json

In this command, fltPat represents the filtering patterns that follow the jq keyword in
the first column of the above table. fct stands for any of the following functions: match,
test, or capture. The match function gives a four-field description of the occurrences of
the regExp regular expression in the filtered object(s), while the test function simply
gives a true or false answer. The capture function is used for the naming of the
matched patterns, for example, to identify the different parts of a string. Here, flg stands
for optional flags.

We now illustrate these different jq command parameters with the following JSON object.
We will not use the command-line version of jq because it handles single and double
quotes inconsistently across operating systems. Instead, we will use the online version of
jq, simply specifying the desired filters under the assumption that the content of the
following object is already present in the designated textbox of the jq online tool.

Figure 43: Json Object First Example

Source: Created on behalf of IU (2020).

The RespId field is an array where each element is assumed to be a sequence of four deci-
mal digits. The ContId and the Gender fields are also arrays. ContId represents the
name of a continent assuming that there are six continents known under the six abbrevia-
tions AF, AS, AU, EU, NA, and SA.

The following filter will check each element of the RespId array to find if it respects the
above described format.

Code
.RespId[] | test("^([0-9]{4})$")

Let's examine this jq filter from left to right, starting with the full stop sign that simply
states that RespId is an attribute of the JSON object being filtered. The square brackets []
are an indication that RespId is an array and that we are filtering each of its elements. The
first pipe | separates the input path (ending with []) from the function of the filter (test
in this case). Inside the test function is a regular expression describing a sequence of
exactly four digits. The presence of the caret (^) and the dollar sign ($) at the beginning
and at the end of the pattern, respectively, is an indication that the given pattern is
expected to match the entire word, not only its beginning or its end. The output of the jq
filter for our displayed illustrative JSON object (above) is: true, false, true. It shows
that the second element did not match the specified pattern because it has five digits
instead of four.

The following filter will check each element of the ContId array to find if it respects the
format described above. The pipe characters | inside the test function represent the OR
operator.

Code
.ContId[] | test("^(AF|AS|AU|EU|NA|SA)$")

The output of this jq filter for our displayed illustrative json object is: false, true, true.
It shows that the first element did not match the specified pattern because its value ZA is
not a recognized continent. The Gender field is also an array. It is assumed that the recog-
nized genders are M, F, U, m, f, and u, where U and u both stand for the unspecified gender.
Here, a gender is valid if and only if it starts with any of the above listed six characters.

The following command checks each element of the Gender array to find if it respects the
format described above.

Code
.Gender[] | test("^[FMUfmu]")

Here is the output of the above jq filter for our illustrative JSON object:
true, true, false. It shows that the third element did not match the specified pattern
because its value bfm does not start with any of the six recognized alphabetical charac-
ters.

The Issues field is an object with two sub-fields Voting and Internet that are both
arrays. Let’s assume that the Voting field represents the voting age as a natural number
between 0 and 999. The following command will check the conformity to that specifica-
tion.

Code
.Issues.Voting[] | test("^([0-9]{1,3})$")

This time, the output of the jq command is false, false, true, for the reasons you
have surely already identified.

We will now show how these four jq filters can be transformed from test functions into
match functions, so that the answers are shown as matched objects instead of mere
true or false outputs. For ease of reference, the above jq filters are denoted by C1, C2,
C3, and C4 as seen below.

Notation Jq Filter

C1 .RespId[] | test("^([0-9]{4})$")

C2 .ContId[] | test("^(AF|AS|AU|EU|NA|SA)$")

C3 .Gender[] | test("^[FMUfmu]")

C4 .Issues.Voting[] | test("^([0-9]{1,3})$")
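Since the original JSON object is shown only as a figure, the following JavaScript sketch applies the same four regular expressions to a hypothetical object built to be consistent with the outputs described above. The concrete values (such as "12345", "1000", and "-5") are assumptions for illustration only:

```javascript
// Hypothetical object consistent with the outputs described in the text
const obj = {
  RespId: ["0028", "12345", "0109"],       // second element has five digits
  ContId: ["ZA", "AF", "EU"],              // ZA is not a recognized continent
  Gender: ["m", "F", "bfm"],               // bfm starts with none of F, M, U, f, m, u
  Issues: { Voting: ["1000", "-5", "19"] } // only the third is a 1-3 digit number
};

// JavaScript equivalents of the jq filters C1 to C4
const c1 = obj.RespId.map(s => /^([0-9]{4})$/.test(s));
const c2 = obj.ContId.map(s => /^(AF|AS|AU|EU|NA|SA)$/.test(s));
const c3 = obj.Gender.map(s => /^[FMUfmu]/.test(s));
const c4 = obj.Issues.Voting.map(s => /^([0-9]{1,3})$/.test(s));

console.log(c1); // [ true, false, true ]
console.log(c2); // [ false, true, true ]
console.log(c3); // [ true, true, false ]
console.log(c4); // [ false, false, true ]
```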

The match equivalent of the above test table is the following.

Notation Jq Filter

C5 .RespId[] | match("^([0-9]{4})$")

C6 .ContId[] | match("^(AF|AS|AU|EU|NA|SA)$")

C7 .Gender[] | match("^[FMUfmu]")

C8 .Issues.Voting[] | match("^([0-9]{1,3})$")

The outputs for the above match commands for our above illustrative Json object exam-
ple are presented by the following table.

Figure 44: Outputs

Source: Created on behalf of IU (2020).

You may have noticed that the results of the match commands are quite verbose. We will
now use the capture functionality to extract the information that we would like to focus
on, as shown by the following new commands.

Ref. Jq Filter

C9 .RespId[] | capture("^(?<idNo>([0-9]{4}))$")

C10 .ContId[] | capture("^(?<ctn>(AF|AS|AU|EU|NA|SA))$")

C11 .Gender[] | capture("^(?<Gender>([FMUfmu]))")

C12 .Issues.Voting[] | capture("^(?<vote>([0-9]{1,3}))$")

The results of the above capture commands are presented below.

C9: { "idNo": "0028" }, { "idNo": "0109" }

C10: { "ctn": "AF" }, { "ctn": "EU" }

C11: { "Gender": "m" }, { "Gender": "F" }

C12: { "vote": "19" }

For the sake of completeness, let us try our test and capture commands on a different
Json object (displayed below), and check if we get the expected results.

Figure 45: Json Object Second Example

Source: Created on behalf of IU (2020).

The respective outputs of the eight test and capture commands above are presented
in the following table for the new JSON object.

Ref. First element Second element Third element

C1 true false false

C2 true false true

C3 false true true

C4 false true false

C9 {
"idNo": "0000"
}

C10 { {
"ctn": "AF" "ctn": "EU"
} }

C11 { {
"Gender": "u" "Gender": "M"
} }

C12 {
"vote": "180"
}

Readers are invited to check the veracity of the above table.

3.3 The RSA Algorithm


The RSA algorithm is currently used worldwide to secure the transmission of data and
information. It is a public-key asymmetric encryption algorithm that uses both private and
public keys. The RSA acronym stands for Rivest, Shamir, and Adleman, the three scholars
who published the RSA algorithm in 1978. This section presents the strengths and
weaknesses of this algorithm.

Encryption, Decryption, and Signatures

Put yourself for a moment in the situation where you have to choose two different prime
numbers p1 and p2. Let's say that you have chosen, for example, 11 and 23. Let us denote by
m1 the product of p1 and p2 (m1 = 11 · 23 = 253), and by m2 the product of p1 – 1
and p2 – 1 (m2 = 10 · 22 = 220). You are now requested to choose a strictly positive num-
ber e less than m2 and coprime with it (e and m2 should not share a common divisor except
for 1; for example, e = 9). Let us now calculate a value d such that e · d is the immediate
successor of a multiple of m2 (for example, e · d = 1 + (2 · 220) = 441, so d = 49). We
can now encrypt and decrypt messages. The encryption of a number n can be achieved
with the following formula.

Encryption
To encrypt a message is to transform it so that it cannot be understood without having been decrypted.

Encrypt(n) = n^e mod m1

The decryption of the above encrypted message is achieved with the following formula.

Decrypt(n) = n^d mod m1
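A quick way to see where the value of d comes from is to search for it directly. The following sketch uses the example values from the text (p1 = 11, p2 = 23, e = 9); trial search is fine for numbers this small, though real implementations use the extended Euclidean algorithm:

```javascript
// Key setup with the example values from the text
const p1 = 11, p2 = 23;
const m1 = p1 * p2;             // 253, part of the public key
const m2 = (p1 - 1) * (p2 - 1); // 220

// Find d such that e * d is the immediate successor of a multiple of m2,
// i.e., (e * d) % m2 === 1
const e = 9;
let d = 1;
while ((e * d) % m2 !== 1) d++;

console.log(d); // 49, as in the text
```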

Let us now take the example of non-case-sensitive English text messages made up exclu-
sively of alphabetical characters. There are 26 letters in the English alphabet, so each char-
acter can be converted into its numeric equivalent as a number between 01 and 26, repre-
senting the characters a and z, respectively.

Suppose we want to send the text Yes in its encrypted format. First, we must convert Yes
into its digits’ format 25, 05, 19. We can now use e = 9, d = 49, and m1 = 253 to encrypt
25, 05, 19 and decrypt it later as seen in the following table.

Encrypt(25) = 25^9 mod 253 = 213    Decrypt(213) = 213^49 mod 253 = 25

Encrypt(5) = 5^9 mod 253 = 218    Decrypt(218) = 218^49 mod 253 = 5

Encrypt(19) = 19^9 mod 253 = 194    Decrypt(194) = 194^49 mod 253 = 19
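The modular exponentiations in this table can be reproduced with a small square-and-multiply helper; this is an illustrative sketch, not the course's own code, and BigInt is used so the larger powers do not overflow:

```javascript
// Modular exponentiation by repeated squaring
function modPow(base, exp, mod) {
  let b = BigInt(base) % BigInt(mod);
  let e = BigInt(exp);
  const m = BigInt(mod);
  let result = 1n;
  while (e > 0n) {
    if (e % 2n === 1n) result = (result * b) % m; // multiply in odd bits
    b = (b * b) % m;                              // square the base
    e /= 2n;                                      // shift to the next bit
  }
  return Number(result);
}

// Encrypt with the public keys (e = 9, m1 = 253), decrypt with d = 49
console.log(modPow(25, 9, 253));   // 213
console.log(modPow(5, 9, 253));    // 218
console.log(modPow(19, 9, 253));   // 194
console.log(modPow(213, 49, 253)); // 25
console.log(modPow(218, 49, 253)); // 5
console.log(modPow(194, 49, 253)); // 19
```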

The encrypted message is thus equal to 213, 218, 194, and its decryption will return the
values 25, 05, 19. Even when we know the values of e and m1, we will still not be able to
decrypt messages without knowing the value of d. This explains why e and m1 are consid-
ered public keys, and d is considered the private key. Conversely, if a message can be
decrypted by the public keys e and m1, it is because it was encrypted by the private key d.
In other words, a private key can be used to encrypt messages that can be decrypted by
public key holders with the assurance that these messages were encrypted with the use of
the private key. This is how a private key is used as the digital signature of its owner.

We can reverse the above example to illustrate this concept of digital signature by allow-
ing the private key owner to use their private key d = 49 to encrypt and sign their mes-
sage 25, 05, 19 so that public key holders can decrypt it with the public keys e = 9 and m1
= 253.

Encrypt(25) = 25^49 mod 253 = 147    Decrypt(147) = 147^9 mod 253 = 25

Encrypt(5) = 5^49 mod 253 = 20    Decrypt(20) = 20^9 mod 253 = 5

Encrypt(19) = 19^49 mod 253 = 172    Decrypt(172) = 172^9 mod 253 = 19

The RSA encryption system can be summarized in the following illustrated situation
whereby secret messages are exchanged between the members of a group and their
leader. Each group member knows the values of the public keys e and m1 but only the
group leader knows the value of the private key d. When a person wants to send a mes-
sage to the group leader, the sender must encrypt their message with the public keys e
and m1 so that only the group leader can decrypt it with the private key d. On the other
hand, when the group leader wants to send a message to the group members, they must
encrypt it with the private key d so that any member can use the public keys e and m1 to
decrypt it with the assurance that the message is coming from the group leader. The RSA
encryption system is asymmetric in the sense that the encryption and the decryption keys
are different.

Strengths and Weaknesses of the RSA Algorithm

The strength of the RSA algorithm comes from the fact that m1 is the product of two prime
numbers p1 and p2, and for large values of m1 it is very difficult to identify p1 and p2 even
when we know the value of m1. That makes it equally difficult to find the value of m2 and,
consequently, that of d, even when we know the value of e.

Choosing large values for m1 lowers the speed of the RSA algorithm. However, the RSA
cryptography system is less secure with smaller values of m1 because it is easier to factor-
ize them. The choice of the value of m1 is the Achilles heel of the RSA algorithm: on one
hand, large values of m1 make the RSA algorithm more secure but reduce its speed, and
on the other hand, smaller values of m1 speed up the RSA algorithm but make it less
secure. Let us also mention that the message needs to be unpredictable; otherwise, an
attacker can encrypt candidate messages with the public keys and compare the results
with the intercepted ciphertext.

3.4 The K-Means Data Clustering Algorithm
The concept of data clustering refers to the process of dividing a dataset into clusters or
groups so that the closest elements of the dataset are assigned to the same cluster. Data
clustering is part of data mining because its clustering process only relies on the data
itself. It has several applications in image processing, market research, and geographical
information systems (GIS).

This section will restrict itself to two-dimensional coordinate datasets where it is simple to
calculate the distance between two data elements. The k-means algorithm was introduced
by Lloyd (1982); its name comes from the fact that it divides its dataset into k clusters or
groups. These clusters are built based on the distance to the mean value of each cluster.
Here is an illustration of the k-means algorithm for the following two-dimensional dataset.

Figure 46: K-Means Algorithm Example

Source: Created on behalf of IU (2020).

Figure 47: Graphic Representation of the Above Dataset

Source: Created on behalf of IU (2020).

The first step of the k-means algorithm is to choose the value of k. For our example, we
choose k = 4 to state that we want our data to be partitioned into 4 clusters. This algo-
rithm also requires the random choice of k elements from the dataset as the initial mean
values to start working. We can, for example, choose E1, E6, E7, and E11 as our respective
initial mean values M1, M2, M3, and M4. We then have to calculate the Euclidean distance
of each element to each mean value and assign each element to the cluster of its closest
mean (see the first step table below). The first step table shows that so far, the dataset has
been divided into the following clusters: {E1, E2, E9}, {E4, E6, E8, E10}, {E7}, and {E3, E5,
E11, E12}.

We will now calculate the mean values of each of these clusters both for the X and Y com-
ponents, and we will get the following four means-elements: M1(40.33; 43), M2(42; 81.75),
M3(98; 7), and M4(93.25; 23). It is now time to check each element again in order to iden-
tify which of the M1, M2, M3, or M4 mean-elements it is closest to and assign it to the cor-
responding cluster (second step table).

Figure 48: K-Means Algorithm Example First Step

Source: Created on behalf of IU (2020).

The following table shows that, so far, the dataset has been divided into the following
clusters: {E1, E2}, {E9, E4, E6, E8, E10}, {E7, E11}, and {E3, E5, E12}.

Figure 49: K-Means Algorithm Example Second Step

Source: Created on behalf of IU (2020).

Figure 50: K-Means Algorithm Example Third Step

Source: Created on behalf of IU (2020).

We will now calculate the mean values for each of these clusters both for the X compo-
nent and for the Y component, and get the following four means-elements: M1(18.5; 26.5),
M2(50.4; 80.6), M3(95.5; 9), and M4(93.333333; 27). We check each element in order to
identify which of the M1, M2, M3, or M4 mean-element it is closest to and assign it to the
corresponding cluster (third step table). The third step table shows that so far, the dataset
has been divided into the following clusters: {E1, E2}, {E9, E4, E6, E8, E10}, {E7, E11, E12},
and {E3, E5}.

We will now calculate the mean values for each of these clusters both for the X and Y
components, and get the following four means-elements: M1(18.5; 26.5), M2(50.4; 80.6),
M3(91.66; 11.33), and M4(98; 32.5). We check each element once again to identify which of
the M1, M2, M3, or M4 mean-element it is closest to and assign it to the corresponding
cluster. This is the purpose of the next table.

Figure 51: K-Means Algorithm Example Fourth Step

Source: Created on behalf of IU (2020).

This time, we have the same clusters as in the previous step. Therefore, the same mean
values will prevail, and the algorithm has to stop in acknowledgment that the current clus-
ters are the final ones. These clusters are graphically represented below: {E1, E2}, {E3, E5},
{E4, E6, E8, E9, E10}, and {E7, E11, E12}.

Figure 52: K-Means Algorithm Example Diagram

Source: Created on behalf of IU (2020).
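The procedure illustrated above (assign each point to its closest mean, recompute the means, and stop when the means no longer change) can be sketched compactly in JavaScript. The tiny dataset and the choice of k = 2 below are illustrative assumptions, not the data from the course's figures:

```javascript
// Minimal k-means sketch for two-dimensional points
function dist(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1]); // Euclidean distance
}

function kMeans(points, means) {
  while (true) {
    // Assignment step: each point joins the cluster of its closest mean
    const clusters = means.map(() => []);
    for (const p of points) {
      let best = 0;
      for (let j = 1; j < means.length; j++) {
        if (dist(p, means[j]) < dist(p, means[best])) best = j;
      }
      clusters[best].push(p);
    }
    // Update step: recompute each mean; stop when the means no longer move
    const next = clusters.map(c => [
      c.reduce((s, p) => s + p[0], 0) / c.length,
      c.reduce((s, p) => s + p[1], 0) / c.length
    ]);
    if (JSON.stringify(next) === JSON.stringify(means)) return clusters;
    means = next;
  }
}

// Two well-separated groups; the first point of each group seeds the means
const data = [[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [10, 11]];
const result = kMeans(data, [[0, 0], [10, 10]]);
console.log(result.map(c => c.length)); // [ 3, 3 ]
```

Note that this sketch assumes no cluster ever becomes empty; a production implementation would also guard against that case and cap the number of iterations.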

SUMMARY
This unit began with the presentation of classical searching and sorting
algorithms. Searching and sorting are very important because they
sometimes appear as the basic tasks of many algorithms. Moreover, an
idea about the number of steps in these searching and sorting algo-
rithms in the worst case scenario was given. This unit explained how to
write basic regular expressions for pattern matching with the use of jq
commands. The RSA algorithm and the k-means algorithm were also
presented by this unit.

UNIT 4
CORRECTNESS, ACCURACY, AND
COMPLETENESS OF ALGORITHMS

STUDY GOALS

On completion of this unit, you will have learned …

– how to write correctness proofs.


– the difference between total correctness and partial correctness.
– the practical side of a program’s correctness.
– how to analyze the accuracy of approximate algorithms.
4. CORRECTNESS, ACCURACY, AND
COMPLETENESS OF ALGORITHMS

Introduction
We have all endured, at least for a few moments in our lives, the discomfort of having to
scrutinize each and every line of one of our programs to find out why it was not working
according to plan. This is simply because an algorithm or a program is of no use if it produ-
ces the wrong output for a given legitimate input.

There are also instances where programs and algorithms fail to produce the expected out-
put simply because their execution is unable to reach an end point. This is why it is impor-
tant to verify the correctness of programs and algorithms during their development in
anticipation of their later testing.

The concepts covered in this unit are intended to give to the readers a sound understand-
ing of how to ensure that their algorithms and programs are correct. These correctness
concepts will be illustrated in this unit with suitable algorithms written in JavaScript.

4.1 Partial Correctness


The partial correctness of an algorithm is checked with the help of correctness proofs that
are themselves based on the mathematical induction proof method. We first briefly
present the mathematical induction proof method before explaining how to prove the
partial correctness of an algorithm.

Mathematical Induction

The basic principle of mathematical induction states that in order to show that a state-
ment is true for a sequence of objects, we first have to prove that it is true for the first
object. We then have to assume that the statement is true for all the objects up to a certain
one and prove that the statement is true for the next object. This can be expressed in the
mathematical language as follows. In order to prove that a statement S is true for a
sequence of objects O0, O1, O2, O3, O4, …, On–4, On–3, On–2, On–1, On, we must first prove
that S(O0) is true. We must then prove that S(Ok+1) is true when 0 <= k <= n – 1 and
S(Oi) is assumed to be true for any i <= k. For example, let us prove that the following
formula is true for any whole number n:

1 + 2 + 3 + 4 + … + (n−4) + (n−3) + (n−2) + (n−1) + n = n(n+1)/2

Let us start with the base case where n = 0. The corresponding statement is the following
and is true:

0 = 0(0+1)/2

Let us now assume that the following statement is true for any i less than or equal to k, with k
itself being between 0 and n−1:

1 + 2 + 3 + 4 + … + (i−4) + (i−3) + (i−2) + (i−1) + i = i(i+1)/2

Let us now calculate the following sum:

1 + 2 + 3 + … + (k−4) + (k−3) + (k−2) + (k−1) + k + (k+1)

The sum can also be written as follows:

[1 + 2 + 3 + … + (k−4) + (k−3) + (k−2) + (k−1) + k] + (k+1)

This is also equal to the following expression because of the above assumption:

k(k+1)/2 + (k+1)

Further calculations on the expression will lead to the following:

k(k+1)/2 + 2(k+1)/2 = (k+1)(k+2)/2

Thus, the equation confirms what we had to prove:

1 + 2 + 3 + 4 + … + (k−3) + (k−2) + (k−1) + k + (k+1) = (k+1)((k+1)+1)/2

Partial Correctness Proof of Iterative Algorithms

Correctness proofs of iterative algorithms are based on the use of the following three algo-
rithmic features: preconditions, loop invariants, and post-conditions. Preconditions are a
description of the criteria to be met by the inputs of an algorithm, while post-conditions
are a description of the criteria to be met both by its output and by some of its key internal
variables. As for loop invariants, they are a statement that must stay true for each and
every instance of a loop while ensuring that the loop is contributing to the computation of
the output.

The previous example on the sum of the first n natural numbers is implemented by the
following Node.Js JavaScript program to illustrate partial correctness concepts.

Figure 53: Example of the Sum of Numbers

Source: Created on behalf of IU (2020).

This JavaScript function is intended to calculate s as the sum of all the whole numbers
from 0 to n. Let us now give a partial correctness proof, starting with the definition of the
precondition, the invariant, and the post-condition. The precondition is simply that n
must be an integer greater than or equal to zero. The loop invariant is the following: s is
the sum of all the whole numbers between 0 and i. The post-condition is that s is the sum
of all the whole numbers between 0 and n.

The partial correctness proof itself consists in proving that

• the invariant is true at the initialization of the loop.
• if we assume that the invariant is true for all the values of the iterator i up to a given
value k, then the invariant remains true when the value of the iterator i becomes equal
to k+1.
• the algorithm ultimately yields the expected result after its final exit from the loop.

When the precondition is met, can we confirm that the loop invariant is true even before
entering the loop? Yes. Indeed, before entering the loop, we have i = 0, s = 0, and, in this
case, s (whose value is equal to 0) is as a matter of fact, the sum of all the natural positive
numbers between 0 and i (whose value is equal to 0).

Can we now prove that the loop invariant is true? Yes. We will do so with the help of the
second step of the mathematical induction technique. We assume that the loop invariant
is true for any value of i less than or equal to a given whole number k. We have to prove
that the loop invariant remains true for i = k + 1, with k being of course less than n.
When i exits the loop with the value i = k, the above assumption implies that s is the sum
of all the natural numbers between 0 and k. When i re-enters the loop this time with the

104
value i = k + 1, it will allow the assignment instruction s = s + i to replace the s on its
right side with its above indicated sum value. This replacement will update the value of s
as follows: s = (1 + 2 + 3 + …+ k) + k + 1. This clearly shows that s is the sum of all
the natural numbers from 0 to k + 1, which is what we had to prove.

The final step of the partial correctness proof is to show that after the last iteration of the
loop, s will yield the expected final result of the algorithm, which is the sum of all the natu-
ral numbers from 0 to n. This can be proven by the case of k = n – 1 in the previous step
of the proof that implies that the loop invariant is also true for k + 1 which is n. In other
words, s is the sum of all the natural numbers from 0 to n.
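The invariant can also be checked mechanically at run time. The following sketch is an assumption about what the code in the figure above looks like, based on the loop described in the text; the invariant is asserted at every iteration using the closed formula proven earlier:

```javascript
// Sum of the whole numbers from 0 to n, checking the loop invariant
function sum(n) {
  let s = 0;
  for (let i = 1; i <= n; i++) {
    s = s + i;
    // Loop invariant: s is the sum of all whole numbers between 0 and i
    if (s !== i * (i + 1) / 2) throw new Error("invariant violated at i = " + i);
  }
  return s;
}

console.log(sum(10));  // 55
console.log(sum(100)); // 5050
```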

Partial Correctness Proof of Recursive Algorithms

Recursive algorithms are not too different from iterative algorithms as far as partial cor-
rectness proofs are concerned, especially because these proofs are both based on the
mathematical induction method. In fact, recursive algorithms also have preconditions,
invariants, and post-conditions. The following main differential feature of recursive func-
tions is however worth noting for the conceptualization of their correctness proofs.

Recursive functions always recall themselves into action with different parameters, except
when they reach their base case. The partial correctness proof of a recursive algorithm
therefore consists of the following three steps:

1. Proving that the base case of the invariant is true when the precondition is met
2. Proving that, if the invariant is true for all the parameter-values i less than or equal to
a given value k, then the invariant is also true when the parameter-value i becomes
equal to k+1
3. Proving that the algorithm ultimately yields the expected result after its final recursive
call

Let us, for example, prove the partial correctness of the Node.Js JavaScript facto function
written below for the calculation of the factorial of m.

Figure 54: Example of the Recursive Version of Factorial

Source: Created on behalf of IU (2020).

The precondition for the recursive facto function is that the parameter m should be a
whole number. The invariant is that, for any whole number i between 0 and m, the
facto(i) function should yield the value i!, and the post-condition is that the facto(m)
function should yield the value m!.

Let us start with the first step of the partial correctness proof. Is the invariant true for the
base case when the precondition is met? Yes. Indeed, the base case is when m===0. In that
case, the facto(0) function yields the value 1 which indeed is the value of 0!. Assuming
that facto(i) is equal to i! for any value of i less than or equal to a given k between 0 and
m–1, let us prove that facto(k + 1) = (k + 1)!. The fact that k is between 0 and m–1,
and that m is greater than or equal to 0 implies that k+1 is at least equal to 1. In other
words, it is the else part of the above if condition that will be used for the calculation of
facto(k+1) as being equal to (k+1)·facto(k), which is ultimately equal to (k+1)·k!
because of the induction assumption on facto(k). We have now shown that
facto(k+1) is equal to (k+1)·k! which is clearly the same as (k+1)! which we had to
prove. The last step of the proof is to show that facto(m) = m! by simply referring to the
previous step with k being equal to m–1.
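The function under proof can be transcribed as follows. This is a sketch consistent with the description of the figure above; the actual figure code may differ slightly:

```javascript
// Recursive factorial as described in the text
function facto(m) {
  if (m === 0) {
    return 1;                // base case: 0! = 1
  } else {
    return m * facto(m - 1); // recursive case: m! = m * (m - 1)!
  }
}

console.log(facto(0)); // 1
console.log(facto(5)); // 120
```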

4.2 Total Correctness


Proving the total correctness of an algorithm consists of two parts, namely, its partial cor-
rectness proof, and its termination proof.

Termination Proofs

The termination of an iterative algorithm can be proven by demonstrating that its loop is
made up of a finite number of steps that are always going to come to an end. As for the
termination of a recursive algorithm, it is proven by demonstrating that the parameters of
that recursive function will be subjected to a finite number of variations that are always
going to come at the end to the base case.

For example, the above iterative algorithm on the sum of the first n natural numbers has
the following for loop: for(let i=1; i<=n; i++). Can we prove that this algorithm
will always terminate? Yes. This algorithm will always terminate because, when n is meet-
ing the precondition (n >= 0), the loop of this algorithm will always come to an end after
being executed n times (n is a finite number).

Let us likewise prove that the above facto function will also always terminate. It is obvi-
ous that the facto function will terminate when its parameter is equal to zero. Let us now
prove that facto(m) will also always terminate for all the other whole numbers m that
meet the precondition (m >= 0). For all such m values, the following successive calls are
made for the calculation of facto(m): facto(m–1), facto(m–2), facto(m–3), and so
on, down to facto(0). This clearly shows that the recursive facto function will come to an
end after recursively calling itself m times (m is a finite number).

Total Correctness Proofs

An algorithm (or a program) is correct if and only if it is totally correct. For an algorithm to
be declared totally correct, its termination (for all the inputs that are fulfilling its precondi-
tions) and its partial correctness must be proven. The following table summarizes the rela-
tionship between algorithms' partial correctness, their termination, and their total cor-
rectness.

Total correctness
This categorization entails partial correctness and termination.

Table 2: Summary

Partial Correctness Termination Total Correctness

False False or Unclear or Unproven False

False True False

True False or Unclear or Unproven False or Unclear or Unproven

True True True

Source: Created on behalf of IU (2020).

The table above seems clear, but it does not really explain why it is crucial to differentiate
partial correctness from total correctness: there are many algorithms for which the termi-
nation proof is unknown. For these algorithms, only the concept of partial correctness can
be applied. Here are two examples of such algorithms whose termination proofs are
unknown even though their partial correctness is proven.

Let’s consider the problem of determining whether the Collatz sequence of a given strictly
positive integer contains the value 1. For that sequence, the next number is equal to half
of the current number when the current number is even, else the next number is the
immediate successor of the triple of the current number. The Collatz sequence of a strictly
positive integer always starts with that number and it ends when it lands on a repetitive or
cyclic sub-sequence. Let’s assume that we are looking for the Collatz sequence of 6. That
sequence will start with 6, then it will go to 3, then 10, 5, followed by 16, 8, 4, 2, and finally
1, 4, 2, and 1. In other words, Collatz(6) is 6, 3, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1. Similarly, Col-
latz(18) is equal to the sequence 18, 9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8,
4, 2, 1, 4, 2, 1. It does not seem too difficult to write a naive JavaScript function to check
whether the Collatz sequence of a given strictly positive integer contains the value 1.

All that is needed is to loop until the last value, which is equal to one, changing the value
at each iteration either to half of its previous value or to the successor of its triple,
depending on its parity. The JavaScript program below computes Collatz sequences containing
the number 1.

Figure 55: Collatz Sequence Code

Source: Created on behalf of IU (2020).
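Since the original figure is not reproduced here, the following is a hypothetical reconstruction of the collatz function. The variable names n1, c, and m follow the correctness proof in the text; unlike the worked examples above, this sketch stops at the first occurrence of 1 instead of continuing once around the 4, 2, 1 cycle.

```javascript
// Hypothetical reconstruction of the collatz function (the original
// figure is not reproduced). Names n1, c, and m follow the proof below.
function collatz(n1) {
  let c = [n1];  // c is the Collatz sub-sequence computed so far
  let m = n1;    // m is the current value of the sequence
  while (m !== 1) {
    if (m % 2 === 0) {
      m = m / 2;      // even: next value is half of the current one
    } else {
      m = 3 * m + 1;  // odd: next value is the successor of the triple
    }
    c.push(m);
  }
  return c;
}

console.log(collatz(6)); // [6, 3, 10, 5, 16, 8, 4, 2, 1]
```

A sequence returned by this sketch always ends with 1, which is exactly the property whose guarantee is the subject of the termination discussion below.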

The partial correctness of this JavaScript collatz function can be proven based on the
following specifications. The precondition is that the input n1 of the collatz function
should be a strictly positive integer. The invariant stipulates that, with lc being
considered as the length of the array c, this array is the sub-sequence of the first lc elements of
the Collatz sequence of n1. The post-condition is that the array returned by the collatz
function is the Collatz sequence of n1.

Is the invariant true for the base case when the pre-condition is met? Yes. Indeed, in that
case, lc=1 since c is only made up of n1, and c is the sub-sequence of the first element of
the Collatz sequence of n1 (c=[n1] and lc =1).

Let us now assume that, for all the arrays c with a length lc less than or equal to a given
length klc, the array c is the sub-sequence of the first lc elements of the Collatz
sequence of n1. Can we prove that, when the length of c becomes equal to klc+1, c will
also be the sub-sequence of the first klc+1 elements of the Collatz sequence of n1? Yes.
The length of c can only become equal to klc+1 because the loop has appended a new Collatz
element to the already calculated Collatz sub-sequence of klc elements.

Does the above function return the Collatz sequence after exiting its loop? Certainly. The
loop invariant is always fulfilled, up to when m becomes equal to one.

It is now time to look at the total correctness of the collatz JavaScript function by prov-
ing its termination now that we have proven its partial correctness. Does it always termi-
nate for all the inputs that fulfill its precondition? Unfortunately, this is an open
conjecture without a unanimously accepted mathematical proof, judging by Paul Erdős’s
comment that “Mathematics is not yet ready for such problems” (Lagarias, 1985, p. 3). In
other words, presently there is no termination proof for the collatz function. No one
knows if the above collatz function is totally correct since it does not have a termination
proof. All that can be said of this algorithm is that it is partially correct as proven above.
The example of the twin primes problem can also help us to further understand why it is
sometimes necessary to restrict oneself to proving the partial correctness of an algorithm
instead of trying to prove its total correctness.

By definition, the numbers p and p+2 are twin primes if and only if p and p+2 are both
prime. An apparently simple problem is to find, for a given whole number n, the
smallest possible value p greater than or equal to n such that p and p+2 are twin primes.
For example, for n in {0, 1, 2, 3}, p and p+2 are equal to 3 and 5, respectively; but for n in
{4, 5}, p and p+2 are respectively equal to 5 and 7. The naive algorithm for this twin primes
problem simply consists of starting an upwards loop at n in search of the prime number
p for which p+2 is also prime. Such a naive algorithm is available in the JavaScript
program below.

Figure 56: Twin Primes Code

Source: Created on behalf of IU (2020).
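The original figure is not reproduced; the following is a plausible reconstruction of the twinP function, together with a naive isPrime trial-division primality test, consistent with the correctness proof that follows (the loop iterator i is initialized to m, as in the text). Note that nothing guarantees that the while loop always terminates, which is exactly the point discussed below.

```javascript
// Hypothetical reconstruction of twinP and a naive isPrime test
// (the original figures are not reproduced).
function isPrime(n) {
  if (n < 2) return false;
  for (let d = 2; d * d <= n; d++) {
    if (n % d === 0) return false; // a divisor was found
  }
  return true;
}

function twinP(m) {
  let i = m; // the loop iterator i is initialized to m
  while (!(isPrime(i) && isPrime(i + 2))) {
    i++;     // advance until both i and i+2 are prime
  }
  return i;  // smallest p >= m such that p and p+2 are twin primes
}

console.log(twinP(0)); // 3
console.log(twinP(4)); // 5
```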

The proof of the partial correctness of the above JavaScript program is twofold in the
sense that one has to prove the partial correctness of the isPrime primality test function
prior to proving the partial correctness of the twinP function itself. However, we can move
directly to the partial correctness proof of the twinP function since the partial correctness
proof of the isPrime primality test function was given as a self-check question in the pre-
vious section.

The precondition of the twinP function is that its input m should be a whole
number. The loop invariant states that, for each value of the loop iterator i, there is no
integer value p from m to i–1 such that p is prime and p+2 is also prime. As for the
post-condition, it requires that the twinP function returns the value of the smallest possible
prime number p greater than or equal to m such that p+2 is also prime.

Is the invariant true before the start of the loop? Yes. In this case, i=m, so the range
from m to i–1 (which equals m–1) is empty, and there can be no prime number in it.

Let us now suppose that for each value of the loop iterator i less than or equal to a given
value k there is no integer value p from m to i–1 such that p is prime and p+2 is prime. Is it
true that, when i=k+1, there is no integer value p from m to k such that p is prime and
p+2 is prime? Yes. Indeed, when i becomes equal to k+1, it is because k is not prime or
k+2 is not prime. The assumption of the induction implies that there is no integer value p
from m to k–1 such that p is prime and p+2 is prime. We have just seen that at least one
of k and k+2 is not prime. This proves that there is no integer value p from m
to k such that p is prime and p+2 is prime.

The last step of this partial correctness proof is to demonstrate that the algorithm yields
the expected result at the final exit of its loop. This seems obvious because the loop will
only stop when it finds the first value i such that (isPrime(i)===true) and
(isPrime(i+2)===true), and that is the expected output since i was initialized to m
and the value of i is always increased by one.

It is now time to turn our attention to the termination proof of the twinP algorithm. Can
we prove that the while loop of the twinP algorithm will always terminate? Unfortunately
not, because it has not yet been proven that there are infinitely many twin primes,
and one is not sure whether, given a random extra-large integer m, the above algorithm
will ultimately find a prime couple greater than m. This is why the JavaScript program
sometimes seems to loop forever in the face of an extra-large input value m, and ulti-
mately fails to output the expected result, even though it seems to work perfectly with
smaller input values.

In summary, this is an algorithm that has been proven as partially correct but without any
termination proof; sometimes, the algorithm does not seem to terminate. For example,
this happened to us when we ran this program with the input value 1000000000000; after
five minutes, we gave up waiting for an output that had still not appeared. This example
illustrates a different perspective on how the definitions of partial correctness and total
correctness can be tied down to the issue of termination. An algorithm is said to be totally
correct if and only if, for all the inputs that fulfill its precondition, the algorithm always
terminates and returns the correct output as defined by the post-condition. On the other
hand, an algorithm is said to be partially correct if and only if, for all the inputs that fulfill
its precondition, the algorithm returns the correct output as defined by the post-condition
whenever it terminates.

4.3 Ensuring Correctness in Day-to-Day Programming
It is generally acknowledged that the writing of correctness proofs is perceived by most
programmers as a difficult exercise that is only worthwhile for special software develop-
ment projects such as the ones on the verification of security protocols. This might explain
why manual correctness proofs are rarely done in day-to-day programming. Instead, cor-
rectness is ensured in day-to-day programming by different mechanisms both during and
after coding.

Ensuring Correctness during Coding

Programmers can ensure the correctness of their code by making use of exception handling
and assertion mechanisms, by modularizing their code, for example, through the
use of existing libraries, and by programming in teams. Code analysis tools are also very
valuable for the detection of errors.

Modules and libraries

It is easier to detect errors in a program that is divided into modules compared to a pro-
gram that presents itself in a single block, especially for modules from tried and tested
libraries. This is the case because modularization allows programmers to isolate problem-
atic modules and focus their energy on the mitigation of their errors. For example, here is
a main program that makes use of an already written isPerfect Boolean function to test
whether given numbers are perfect (equal to the sum of their divisors excluding themselves,
e.g., 6 since 6 = 3 + 2 + 1). This example assumes that isPerfect comes from the
number-isperfect library, which can be installed with the npm install number-isperfect
command.

Figure 57: Perfect Numbers Caller Example

Source: Created on behalf of IU (2020).
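The figure itself is not reproduced; the following self-contained sketch shows the structure described above. The real program imports isPerfect from the number-isperfect library and reads numbers from the console; a local stand-in function and a fixed list of inputs are used here only so that the sketch runs on its own.

```javascript
// Local stand-in for the number-isperfect library function (assumption:
// the real program would use require('number-isperfect') instead).
function isPerfect(n) {
  let sum = 0;
  for (let d = 1; d <= n / 2; d++) {
    if (n % d === 0) sum += d; // accumulate the proper divisors of n
  }
  return n > 1 && sum === n;
}

// The main loop only coordinates: collect a number, delegate the actual
// check to isPerfect, and display the result.
for (const num of [6, 10, 28]) {
  const pf = isPerfect(num);
  console.log(num + ' perfect? ' + pf);
}
```

The division of labor is the point: the main loop never inspects divisors itself, so an error in the perfection check can only live inside isPerfect.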

In this code, the loop of the main program is only made up of three instructions:

1. The first one collects the input from the user and stores it in the num variable.
2. The second one calls the isPerfect(num) function and stores its result in the pf
variable.
3. The last one displays the pf result on the screen.

The actual task of checking whether a number is perfect is not done by this main program
but by the isPerfect() function. In fact, the main program simply divides the job to be
done into relevant sub-tasks and coordinates their interactions. The isPerfect() func-
tion's sole role is to check if a given number is perfect.

Exceptions and assertions

Modern high-level programming languages allow programmers to test an expected
precondition for the purpose of handling any related exception. There are also instances
where programmers make use of assertion instructions to test an expected precondition
and halt the program whenever the test is negative. Exception handling and assertions
are available in JavaScript and Node.js. These two mechanisms are also an attempt to
counter Garbage In, Garbage Out (GIGO). This is, for instance, the case of a program
that might tell you that 11.5 is a prime number simply because it has converted its input
from the console into an integer without first rejecting all non-integer values. Exceptions
use keywords such as try and catch to prevent certain instructions from being executed with
wrong values, and assertions use the assert keyword to ensure that a given condition is
fulfilled prior to the execution of certain instructions. Let’s see how to use exceptions and
assertions to improve our primality test algorithm.

GIGO
This concept describes the challenge of programs outputting the wrong answers because
they have been fed with inappropriate inputs.

Figure 58: Primality Test Example with Exceptions (First Version)

Source: Created on behalf of IU (2020).

A second version of the program can be found below with try, throw, and catch instruc-
tions.

Figure 59: Primality Test Example with Exceptions (Second Version)

Source: Created on behalf of IU (2020).
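The original figure is not reproduced; a hypothetical sketch of such a try/throw/catch version could look as follows. The precondition is enforced with throw inside the function, and the caller guards its calls with try and catch, so that an input like 11.5 is rejected instead of being silently misreported.

```javascript
// Hypothetical sketch of a primality test guarded by try/throw/catch.
function isPrime(n) {
  if (!Number.isInteger(n) || n < 0) {
    throw new Error('input must be a whole number'); // counter GIGO inputs
  }
  if (n < 2) return false;
  let d = false;
  let i = 1;
  while ((d === false) && (i < n - 1)) {
    i++;
    d = ((n % i) === 0); // d becomes true if a divisor is found
  }
  return !d;
}

try {
  console.log(isPrime(11));   // prints: true
  console.log(isPrime(11.5)); // throws before a wrong answer is printed
} catch (e) {
  console.log('Rejected: ' + e.message);
}
```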

An equivalent program can be found below where exception handling is replaced by asser-
tion instructions.

Figure 60: Primality Test Example with Assertions

Source: Created on behalf of IU (2020).

Code analysis tools

Automated code analysis tools are used by programmers to quickly detect programming
errors instead of spending long hours manually debugging. Such error identification is
relatively easy in the first version of a program but becomes increasingly difficult with the
changes made by its newer versions. These tools are becoming essential since a growing
number of programs are being “transpiled” from one language to another in order to keep
their qualities in newer execution environments.

We will now illustrate the use of code analysis tools by executing Jshint on the following
JavaScript program (Jshint is a JavaScript code analysis tool that can be installed with the
npm install jshint command).

Figure 61: Primality Test Example for Jshint Demo

Source: Created on behalf of IU (2020).

Trying to execute the above JavaScript program with the node command (node
nameOfprogram.js) will output the following error.

Code

while((d===false)&&(i<n-1){i++; d=((n%i)===0);}
^
SyntaxError: Unexpected token '{'

However, running the same code with the Jshint code analysis tool (jshint
nameOfprogram.js) will output the following errors.

Code
line 1, col 1, 'let' is available in ES6 (use 'esversion: 6') or Mozilla
JS extensions (use moz).
line 3, col 3, 'let' is available in ES6 (use 'esversion: 6') or Mozilla
JS extensions (use moz).
line 3, col 53, 'let' is available in ES6 (use 'esversion: 6') or Mozilla
JS extensions (use moz).
line 4, col 8, Expected a conditional expression and instead saw an assignment.
line 6, col 5, 'let' is available in ES6 (use 'esversion: 6') or Mozilla
JS extensions (use moz).
line 7, col 31, Expected ')' to match '(' from line 7 and instead saw '{'.
line 9, col 37, Expected an operator and instead saw '!'.
line 9, col 47, Unrecoverable syntax error. (90% scanned).

Team programming

The traditional image of a programmer is a computing geek sitting alone in a corner,
staring at the computer or hitting the keyboard in search of a solution to a computing
problem. This is known as solo programming, as opposed to team programming, where many
programmers work together on the same program. One form of collaboration is pair programming.
According to Saltz and Shamshurin (2017), pair programming improves code quality by
approximately 15 percent, and although it significantly increases code development time,
this is usually offset by important decreases in debugging and testing times.
The same authors also report that pair programming significantly enhances the thinking
abilities, programming knowledge, and communication skills of programmers, whose job
satisfaction and team spirit are likewise meaningfully increased.

Ensuring Correctness after Coding

In day-to-day programming, it is common practice to enforce programs' correctness after
coding through code reviews and through testing.

Code reviews

Once a code has been written, it is not unusual to ask a fellow programmer to review it in
order to assess its quality, just like research articles are peer-reviewed in the publication
process.

According to Alami et al. (2019), code review is currently extensively practiced in the soft-
ware industry. The same study presents the following benefits of code review, based on
the case study of 21 code reviewers from four open-source communities (Allura, CKAN,
FOSSASIA, and Kernel):

• Negative feedback constitutes the main quality assurance mechanism of code reviews.
• Negative feedback from a code review is an opportunity to learn and become a better
coder.
• Code reviewers are passion driven and always on the quest for excellence and quality.
• The primary trading currencies in the code review world are reputation and status.

Testing

It is unthinkable that a program could be put into use without having been tested. Testing
consists of running a program with various inputs or test cases in order to assess the
behavior of the program compared to its requirements. Testing happens at different levels
such as unit testing, integration testing, and system testing. Unit testing is restricted to
individual units or modules, integration testing checks the interactivity of these units, and
system testing extends to the assessment of the behavior of the system as a whole. We can
reasonably say that testing contributes to the improvement of the quality of a program,
but what about the quality of the test cases themselves? According to Kochhar et al.
(2019), there are five main test case characteristics that software testers use as guidelines
for the design of quality test cases.

Their study confirmed these characteristics by surveying 21 respondents and interviewing
261 software practitioners both from the open source and from the proprietary software
industry. The participants of this study were distributed over 29 countries; China and the
USA had the highest number of participants. Moreover, these respondents were from rep-
utable organizations such as Google, Facebook, Apache, and Microsoft. To examine the
perceptions of the respondents on the quality characteristics of test cases, 26 Likert scale
items were created. These items were classified into the following five themes: the content
of test cases, their size and complexity, coverage, maintainability, and their bug detection
requirements. Readers are invited to refer to Kochhar et al. (2019) for a closer look at the
above listed quality requirements for test cases.

In the meantime, we present an example of how to automate test cases for a NodeJs
program using the JEST testing tool. It is assumed that you have installed JEST on your
machine, for instance, with the npm install --save-dev jest command, and that
you have added the following object to the list of objects in your package.json file inside
your home folder.

Figure 62: Jest Object for package.json File

Source: Created on behalf of IU (2020).
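The original figure is not reproduced; a typical minimal entry (an assumption about the book's setup, not a verbatim copy) simply registers Jest as the project's test runner:

```json
{
  "scripts": {
    "test": "jest"
  }
}
```

With this entry in place, npm run test hands control to Jest, which discovers and runs all *.test.js files.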

The NodeJs function to be tested is the isPrime primality test function. However, we had
to modify it slightly as follows and save it in a dedicated file named isPrime1.js for its
testing by JEST.

Figure 63: First Primality Test Example for Jest Demo

Source: Created on behalf of IU (2020).
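The figure itself is not reproduced; a plausible shape of isPrime1.js, consistent with the text, is shown below. The essential modification is the module.exports line at the end, which is what allows the separate test file to import the function.

```javascript
// isPrime1.js — hypothetical reconstruction (the original figure is not
// shown). The function is exported so that Jest can require() it.
function isPrime1(n) {
  if (!Number.isInteger(n) || n < 2) return false;
  for (let i = 2; i < n; i++) {
    if (n % i === 0) return false; // a divisor was found
  }
  return true;
}

module.exports = isPrime1;
```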

The test cases themselves can be found in the isPrime1.test.js file written below, where,
for each integer i between 0 and 16, the expected primality r of the integer i is
compared to its calculated primality t returned by the isPrime1() function. The isPrime1()
function fails the test whenever t and r are different, and this automated testing process
is enacted by the npm run test command.

Figure 64: First Set of Jest Test Cases for the Primality Test Program

Source: Created on behalf of IU (2020).
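Since the figure is not reproduced, here is a hypothetical sketch of such a test file. Under the Jest runner, test and expect are provided as globals; the fallback definitions at the top exist only so that the sketch can also be executed with plain node, and the tested function is inlined instead of being imported with require('./isPrime1'), purely to keep the sketch self-contained.

```javascript
// isPrime1.test.js — hypothetical sketch of the described Jest test cases.
if (typeof test === 'undefined') {
  // minimal fallbacks so the sketch also runs outside the Jest runner
  global.test = (name, fn) => fn();
  global.expect = (actual) => ({
    toBe(expected) {
      if (actual !== expected) {
        throw new Error('expected ' + expected + ' but got ' + actual);
      }
    }
  });
}

// In the real setup: const isPrime1 = require('./isPrime1');
function isPrime1(n) {
  if (!Number.isInteger(n) || n < 2) return false;
  for (let i = 2; i < n; i++) { if (n % i === 0) return false; }
  return true;
}

// expected primality r of each integer i from 0 to 16
const expected = [false, false, true, true, false, true, false, true, false,
                  false, false, true, false, true, false, false, false];

for (let i = 0; i <= 16; i++) {
  test('primality of ' + i, () => {
    const t = isPrime1(i);       // calculated primality
    expect(t).toBe(expected[i]); // compared to the expected primality
  });
}
```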

It is also possible to test whether a given test case can trigger the expected exceptions as
visible in the updated NodeJs program below (isPrime2.js).

Figure 65: Second Primality Test Example for Jest Demo

Source: Created on behalf of IU (2020).
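The figure is not reproduced; a hypothetical reconstruction of isPrime2.js follows: the same primality test, but with the precondition enforced by throwing an exception, so that a test file can verify the exception with Jest's toThrow matcher.

```javascript
// isPrime2.js — hypothetical reconstruction (the original figure is not
// shown). Invalid inputs now raise an exception instead of being coerced.
function isPrime2(n) {
  if (!Number.isInteger(n) || n < 0) {
    throw new Error('input must be a whole number');
  }
  if (n < 2) return false;
  for (let i = 2; i < n; i++) {
    if (n % i === 0) return false;
  }
  return true;
}

module.exports = isPrime2;
```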

The following code (isPrime2.test.js) contains three test cases (abc, 10.3, and –7) for
the testing of the handling of exceptions by the program above.

Figure 66: Second Set of Jest Test Cases for the Primality Test Program

Source: Created on behalf of IU (2020).

The automated testing process is enacted by the npm run test command.

We end this section by presenting a different side of testing, where the behavior of a
program significantly differs from what was expected because its input is code instead of
data. This is known as code injection. We show an example of SQL injection in NodeJs
code that interacts with a MySql database. MySql must be installed for the execution of
the following programs.

The following program creates a Credentials database assuming that the user name of
the administrator of the MySql server is root, and they use pswd!@#$%^ as the password
in MySQL. Successfully running this program on the command line with NodeJs leads to
the creation of the Credentials database in MySQL.

Figure 67: Code for the Creation of a MySQL Database

Source: Created on behalf of IU (2020).

The following code creates the login table in the Credentials database referred to above.

This login table is made up of two fields that are both strings, the email address field
(email), and the password field (pswd). The email address field is the primary key. This
program must be run with NodeJs on the command line after execution of the one above
in order for this login table to be created in the Credentials database with its two
email and pswd fields.

Figure 68: Code for the Creation of a MySQL table

Source: Created on behalf of IU (2020).

The coding and the execution of the following program will allow a user to input their
email address and password in order for them to be stored in the login table in the
Credentials database. Let’s suppose that one user has entered Algo2020[@]exam-
ple[.]com as a username and yu2?&!me as a password; another user has entered Cor-
rect100[@]example[.]com as their username and me241&!u as the password.

Figure 69: Code for the Storage of a MySQL Record

Source: Created on behalf of IU (2020).

This final JavaScript program will allow a user who has forgotten their password to see
that password after inputting their email address.

Figure 70: Code for the Querying of a MySQL Record

Source: Created on behalf of IU (2020).

Readers can now have a first-hand experience with SQL injection by executing the pro-
gram above using the following usernames or email addresses as inputs:

a) Email address: Algo2020[@]example[.]com. The above program correctly outputs the
password of the user as yu2?&!me. Everything seems normal since the program is
behaving as expected.
b) Email address: Correct100[@]example[.]com. The above program correctly outputs
the password of the user as me241&!u. Here, everything also seems normal since the
program is behaving as expected.
c) Email address: "" OR 1=1. The above program displays the email addresses and
passwords of each and every user in the login table. Something is terribly wrong
with the program that has now suffered an SQL injection attack where a user has
input a code instead of a bona fide email address.
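The mechanics of case c) can be illustrated without a database. If the program builds its SQL query by naive string concatenation — the query template below is an assumption about the code in the figures, which are not reproduced — then the injected input turns the WHERE clause into a condition that holds for every row.

```javascript
// Minimal illustration of the injection: the user's input is pasted
// directly into the SQL string (hypothetical template).
function buildQuery(emailInput) {
  return 'SELECT email, pswd FROM login WHERE email = ' + emailInput;
}

console.log(buildQuery('"Algo2020@example.com"'));
// → SELECT email, pswd FROM login WHERE email = "Algo2020@example.com"
console.log(buildQuery('"" OR 1=1'));
// → SELECT email, pswd FROM login WHERE email = "" OR 1=1
// 1=1 is true for every row, so every stored credential is returned.
```

The standard defense is a parameterized query, in which the driver escapes the input instead of splicing it into the SQL text.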

4.4 Accuracy, Approximation, and Error Analysis
This section is dedicated to nondeterministic algorithms as opposed to the deterministic
ones that we have been dealing with. For a given input, a deterministic algorithm will
always yield the same output no matter how many times the algorithm is executed. But
for a nondeterministic algorithm, it is possible for the outcome of the execution of the
algorithm to change from one execution to another one, still with the same input, and
sometimes with the wrong output.

Rationale and Consequences

The main idea behind the design of nondeterministic algorithms is fourfold:

1. Random choices can sometimes lead to the solution.
2. Many problems do not need to be solved with a one hundred percent level of accuracy.
3. The use of approximations simplifies solutions.
4. A fast approximate solution to a problem is sometimes preferable to a delayed exact
answer for the same problem.

Nondeterministic algorithms
These algorithms tend to yield different output values when executed many times with
the same input value.

The random and approximate nature of nondeterministic algorithms implies that those
algorithms do not always give the correct answer and are prone to errors. Nevertheless,
they are useful as long as their level of accuracy is acceptable. The next section is an
illustration of these concepts using the example of the middle rank problem. In that example,
simple probability calculations will be used for the estimation of the level of accuracy of
an approximate algorithm.

Approximate Algorithm for the Middle Rank Problem

The middle rank problem assumes that there is a sequence U of n unsorted distinct
numbers, and we simply want to identify any element S[k], with S being the sorted version of U
and with k fulfilling the following condition:

k ∈ [(1 − ε)n/2, (1 + ε)n/2], with 0 ≤ ε ≤ 1/2

Let’s suppose that the unsorted sequence is made up of the following 16 numbers.
Assuming that ε = 1/8, it is required for k to be equal to seven, eight, or nine. In other
words, we are simply looking for the seventh, eighth, or ninth element in the sorted version
of the following sequence, that is, 51, 54, or 59.

26 71 65 43 60 95 54 72 74 42 51 85 49 46 16 59

This problem can be solved by first sorting the array and then accessing the seventh,
eighth, or ninth element of the sorted array. But sorting takes time. So why not try our
luck with an element from a random position in the above array and calculate its ranking?
We might be lucky enough to land on a number whose ranking is between 7 and 9. For
example, what is the ranking of the element in position 10, whose value is 42? It can be
calculated by first assuming that 42 is the smallest element of the array, so that its rank
is equal to 1, and then comparing each element of the array with 42, incrementing the
ranking of 42 each time an element is less than 42.

This gives a ranking of 3 for the value 42, which does not belong to the required interval
between 7 and 9. Why not try our luck one last time with the 16th element of the
sequence, 59? The ranking of 59 is 9: Bingo! We found a number in the required range
after only two random choices and without sorting the array.

This algorithm is formalized in the following JavaScript program that assumes that the
numbers of the unsorted sequence are stored line by line in a text file whose name is input
by the user together with the value of ε. Before going into the details of the JavaScript
program itself, let us calculate a few mid rank intervals for the different values of epsilon
that we will use for the testing of our program, assuming that there are 100 different num-
bers in the text file (n=100).

Figure 71: Mid Rank Intervals for Epsilon

Source: Created on behalf of IU (2020).
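The intervals of the figure above can be recomputed directly from the condition on k; the following small helper (an addition, not part of the original program) does so for n = 100, as stated in the text.

```javascript
// Recompute the mid-rank interval [(1-eps)n/2, (1+eps)n/2].
function midRankInterval(n, eps) {
  return [((1 - eps) * n) / 2, ((1 + eps) * n) / 2];
}

console.log(midRankInterval(100, 0.5));   // [ 25, 75 ]
console.log(midRankInterval(100, 0.125)); // [ 43.75, 56.25 ]
```

These are exactly the two intervals used in the experiments reported below.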

The following program uses the n-readlines module that should be installed on your
machine using the npm install n-readlines command. This module contains the
necessary functions for the line-by-line reading of a text file. Once the numbers are read
from the text file, they are immediately transferred into an array for their processing by the
algorithm designed for the mid rank problem.

Figure 72: Mid Rank Problem Algorithm Example

Source: Created on behalf of IU (2020).
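The original figure, which also reads the input file with n-readlines, is not reproduced; the following is a hypothetical sketch of the midRangeV0 function alone, using the rank convention in which the smallest element has rank 1. The names A and eps, and the undefined result on failure, follow the surrounding text.

```javascript
// Hypothetical sketch of midRangeV0 (file reading omitted).
function midRangeV0(A, eps) {
  const n = A.length;
  const j = Math.floor(Math.random() * n); // pick a random position in A
  let k = 1;                               // assume A[j] is the smallest
  for (let i = 0; i < n; i++) {
    if (A[i] < A[j]) k++;                  // rank of A[j] in sorted order
  }
  const lo = ((1 - eps) * n) / 2;
  const hi = ((1 + eps) * n) / 2;
  if (k >= lo && k <= hi) {
    return { rank: k, value: A[j] };       // success: rank in mid-range
  }
  return undefined;                        // failure: rank outside range
}
```

A main program can call this function once, and then, as in the experiments below, retry a few more times whenever the first attempt returns undefined.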

In this program, the midRangeV0 function is an implementation of the algorithm that
randomly picks a number and reports it as a successful answer if its rank is in the required
mid-range, while acknowledging a failure if that is not the case. The main program above
always starts by trying its luck with a first attempt of the midRangeV0 function (let r1 =
midRangeV0(A,eps)) with the hope of it being successful. We have experimented with
this program using the following 100 values stored in a text file with one value per line.

8722, 3420, 4548, 5637, 5927, 5078, 3739, 6338, 8362, 4617, 3980, 2680, 6264, 8329, 1815,
6119, 9179, 8015, 9235, 3161, 8453, 8469, 3917, 2944, 7502, 6514, 4025, 8678, 8820, 6988,
7214, 6463, 7506, 2042, 7176, 3762, 9577, 5902, 5109, 4441, 9127, 2271, 3726, 2018, 8272,
9629, 8693, 5772, 3185, 6663, 3644, 7668, 1667, 3757, 2969, 6626, 6074, 3861, 2913, 7566,
2257, 3705, 1353, 3868, 9133, 8921, 8368, 8307, 4331, 1092, 6495, 8175, 6472, 9238, 8987,
2838, 1012, 6521, 2779, 2028, 3677, 7394, 2582, 2978, 7930, 7274, 2272, 4015, 3678, 6991,
1962, 9652, 5097, 3277, 4532, 6607, 9203, 4945, 5708, 9954

We executed this program four times with an epsilon value ε = 1/2 = 0.5 for a required
mid-range of [25, 75] and got the following different results.

Execution    Result

First        Rank and value found after first attempt: undefined, undefined
             Rank and value found after more attempts: 68, 4331
Second       Rank and value found after first attempt: 31, 6463
Third        Rank and value found after first attempt: 72, 6472
Fourth       Rank and value found after first attempt: undefined, undefined
             Rank and value found after more attempts: 60, 2257

The table above shows that, out of four executions, the first attempt of the midRangeV0
function failed to do its job twice (first and fourth execution) in trying to find a mid-range
value ranked between the 25th and the 75th rank, but it yielded different good answers
when it succeeded (second and third execution). The main reason behind this
nondeterministic behavior is the use of a random number by the midRangeV0 function. We then
executed the same program another four times, but with an epsilon value ε = 1/8 = 0.125
for a required mid-range of ranks in the [43.75, 56.25] interval. This time, we got the
following different results.

Execution    Result

First        Rank and value found after first attempt: undefined, undefined
             Rank and value found after more attempts: 51, 7668
Second       Rank and value found after first attempt: undefined, undefined
             Rank and value found after more attempts: 47, 5772
Third        Rank and value found after first attempt: 49, 6663
Fourth       Rank and value found after first attempt: undefined, undefined
             Rank and value found after more attempts: 56, 6074

The table above shows that, out of four executions, the first attempt of the midRangeV0
function failed to do its job three times (first, second, and fourth execution) in trying to
find a mid-range value ranked between the 44th and the 56th rank, but it yielded a good
answer when it succeeded (third execution). The success rate of the midRangeV0 function
is 50 percent for the first table, but it is now 25 percent for the second. It is important to
note that each successful answer of the midRangeV0 function is a correct answer, but
each failure of the same function is a wrong answer.

In this case, the terms “success rate” and “accuracy level” are equivalent, and it is only
natural to want to know the estimated accuracy level of the midRangeV0 function. Is it
50 percent, as experienced in the results of the first table? Is it 25 percent, as in the
second? Or does it depend on certain parameters? The rest of the section is dedicated
to these questions.

What are the chances of the midRangeV0 function succeeding? It succeeds when, at the
end of the iteration over i, k ends with a value that fulfills the following condition:

k ∈ [(1 − ε)n/2, (1 + ε)n/2]

How many possible values are there in this interval? There are εn + 1, as shown below:

(1 + ε)n/2 − (1 − ε)n/2 + 1 = ((1 + ε)n − (1 − ε)n)/2 + 1
                            = (n + εn − n + εn)/2 + 1
                            = 2εn/2 + 1
                            = εn + 1

How many possible values can k have in general? n values, since k can end up with any
value between 1 and n.

What are the chances of the midRangeV0 function succeeding? In accordance with basic
probability laws, the answer is (εn + 1)/n ≃ ε. This translates into the following
estimated accuracy levels of the midRangeV0 function when n is equal to 100.

Estimated Accuracy Levels of the midRangeV0 Function (n = 100)

ε                 Estimated Accuracy Level    Estimated Failure Rate
1/2  = 0.5        50%                         50%
1/4  = 0.25       25%                         75%
1/8  = 0.125      12.5%                       87.5%
1/16 = 0.0625     6.25%                       93.75%
1/32 = 0.03125    3.125%                      96.875%

The table above shows that the estimated accuracy level of the midRangeV0 function is
quite low. The best that algorithm can do is to give us a fifty-fifty chance between a correct
answer and an erroneous one when we execute it. Worse, there is almost no chance of
getting a good answer when the value of epsilon becomes smaller. But what happens when
the midRangeV0 function is executed a few times, let’s say c times? The probability of the
midRangeV0 function failing is equal to (1 − ε), since its probability of succeeding is equal
to ε. Therefore, the probability of the midRangeV0 function failing c times in a row is equal
to (1 − ε)^c, and its probability of succeeding at least once within c attempts is equal to
1 − (1 − ε)^c. This translates into the following accuracy levels of a midRangeV function
that tries the midRangeV0 function c times, with n equal to 100 and c equal to 10.

ε                 Estimated failure rate     Estimated failure rate of    Estimated success rate of
                  of midRangeV0: (1 − ε)     10 attempts: (1 − ε)^10      10 attempts: 1 − (1 − ε)^10
1/2  = 0.5        0.5                        0.00098 = 0.098%             0.99902 = 99.902%
1/4  = 0.25       0.75                       0.05631 = 5.631%             0.94369 = 94.369%
1/8  = 0.125      0.875                      0.26308 = 26.308%            0.73692 = 73.692%
1/16 = 0.0625     0.9375                     0.52446 = 52.446%            0.47554 = 47.554%
1/32 = 0.03125    0.96875                    0.72798 = 72.798%            0.27202 = 27.202%

It is important to note that, even when the midRangeV function runs the midRangeV0
function a few times, it still finishes in linear time. In fact, it is still far faster than the
time it would have taken to sort the array, which in the worst case is quadratic.
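The success probabilities tabulated above follow directly from the formula 1 − (1 − ε)^c and can be recomputed with a one-line helper (an addition for illustration, not part of the original program).

```javascript
// Probability of at least one success within c independent attempts,
// each succeeding with probability eps.
function successAfter(c, eps) {
  return 1 - Math.pow(1 - eps, c);
}

console.log(successAfter(10, 0.5).toFixed(5));   // 0.99902
console.log(successAfter(10, 0.125).toFixed(5)); // 0.73692
```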

SUMMARY
This unit presented mathematical induction as the cornerstone of par-
tial correctness proofs both for iterative algorithms and for recursive
functions where we have to show that the invariant is always true. It cov-
ered termination proofs as the additional verification step beyond the
partial correctness proof in order to determine total correctness. Day-to-day
programming correctness checking mechanisms, such as testing
and code reviews, together with other approaches such as the use of
code analysis tools, libraries, modules, assertions, and exceptions, were
reviewed. The end of the unit was dedicated to the presentation of an
example of an approximate and randomized algorithm. The presenta-
tion of this algorithm includes key details about the probabilistic calcu-
lation of its accuracy in an effort to explain the rationale behind such
nondeterministic algorithms despite the fact that they sometimes yield
erroneous results.

UNIT 5
COMPUTABILITY

STUDY GOALS

On completion of this unit, you will have learned …

– how to compute an algorithm in a given model of computation.


– the specification and characteristics of the halting problem.
– key details about some well-known undecidable problems.
5. COMPUTABILITY

Introduction
Today’s computing machinery seems remarkably effective, judging by its overwhelming
success in both traditional and novel applications. In the midst of such impressive
computing power, it is legitimate to ask whether there exists a computational problem
that cannot be solved by a computer. This is the question at the core of this unit. But that
question cannot be answered without a clear understanding of the different models of
computation.

Classical models of computation are presented first. Thereafter, the specification and the
characteristics of the halting problem are described as an introduction to the concepts of
uncomputability and undecidability. The last section of the unit is dedicated to the pre-
sentation of some well-known undecidable problems.

5.1 Models of Computation


A model of computation is a mathematical description of how a conceptual computer
processes the inputs of computational problems towards the output of their results.
These models of computation are usually benchmarked against the Turing machine,
which was introduced by Alan Turing in 1936.

Model of computation: A model of computation fully describes a conceptual computer.

Traditional Models

This subsection is dedicated to five models of computation: automata, Turing machines,
lambda calculus, recursive functions, and first order predicate calculus. An effort will be
made to present these models with the help of suitable, understandable examples.
Although this book does not include JavaScript examples of each of these computational
models, such implementations are all feasible and have, in fact, been realized by various
packages. Motivated readers who use these packages will grasp the concepts in question
in a rich and applied way. Let us recall that a model of computation is a description of how
a conceptual computer processes its instructions written in a given formal language.
As in the case for natural languages, the formal language of each conceptual computer is
made up of semantically and syntactically approved words from a given alphabet. We will
later see that each model of computation covers its own family of languages, with certain
families being more powerful than the others.

Automata

There are two types of automata: finite automata, which are the model of computation for
regular languages based conceptual computers, and pushdown automata, which are the
model of computation for context-free language-based conceptual computers. Let’s
suppose that we want to build a small conceptual computer whose task is simply to check
whether an input from the user is solely made up of the two strings YES or NO. This con-
ceptual computer is equivalent to the following finite deterministic automaton.

Figure 73: YES or NO Automaton

Source: Created on behalf of IU (2020).

Assuming that all inputs start from the initial state (s0), it is easy to observe that all
valid inputs will end up at the s4 acceptance state. It is important to note that the above
automaton implicitly assumes that any non-valid transition will land in the dustbin state.
For example, for the input string NOT, the automaton will be in s4 without having read the
character T, and reading T from s4 is considered as a non-valid transition since such a
transition is not explicitly stated. In other words, the input string NOT will end up in the
dustbin state. Let us now build a finite automaton for a conceptual computer whose sole
task is to check whether an input from the user in the decimal numbering system is a posi-
tive multiple of 10.

Figure 74: Automaton for the Positive Multiples of 10

Source: Created on behalf of IU (2020).

We can now formally define a finite automaton as a model of computation that includes
the family of conceptual computers. These are made up of the following components: a
finite number of states with one of them being the initial state (dotted line state) and one
or more of them being the final accepting (green state) state(s), an alphabet of characters,
and a transition function or table on how to move from one state to another state for each
character of the alphabet. Any input leading to a non-authorized move by the machine is
considered invalid. For instance, the above automaton is formally specified as follows:

Alphabet = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}; states = {s0, s1, s2} with s0 as the initial state and
s2 the final accepting state. The transition function is represented by the following table.

State c=0 c in {1,2,3,4,5,6,7,8,9}

s0 s2 s1

s1 s2 s1

s2 s2 s1
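The transition table above can be executed directly; the following helper (our own illustrative sketch, with rejection standing in for the implicit dustbin state) drives the multiples-of-10 automaton over an input string:

```javascript
// Simulate the multiples-of-10 automaton from the transition table above.
// From every state, reading "0" moves to s2 and reading any other digit
// moves to s1; s0 is the initial state and s2 is the accepting state.
function acceptsMultipleOfTen(input) {
  let state = "s0";
  for (const ch of input) {
    if (!"0123456789".includes(ch)) { return false; } // outside the alphabet
    state = ch === "0" ? "s2" : "s1";
  }
  return state === "s2"; // accept only when the last digit was 0
}

console.log(acceptsMultipleOfTen("120")); // true
console.log(acceptsMultipleOfTen("125")); // false
```

Note that the transition columns happen to be identical for all three states, which is why the simulation only needs the current character; a general simulator would index a table by state and character.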

Certain automata restrict their transition function to less than two moves per character
from each state. Those are deterministic automata as opposed to nondeterministic
automata where characters are allowed any number of moves from any state. It is worth
mentioning that both finite deterministic and finite nondeterministic automata are part of
the same model of computation as the one of regular languages.

One of the characteristics of regular languages is that the union of two regular languages
remains a regular language; the same applies for their intersection. Another property of
regular languages, the one proven by the pumping lemma, is that a word of a given regu-
lar language can be arbitrarily long only if it is of the form pmⁿs, where p, m, s are words
and n is a positive integer. In other words, because of the finiteness of the number of
states of automata and their lack of memory, an arbitrarily long word will have to go
through a loop on its way to a final accepting state. As a result, many languages cannot be
modeled by finite automata.

Let’s consider a small machine whose sole task is to check whether a sequence of brackets
entered by a user is well-bracketed in the sense that each open bracket is suitably closed.
Such a machine needs a memory to keep track of the number of consecutively open
brackets so that it can check if that number matches the one for the closed brackets.

Figure 75: Pushdown Automaton for Well-Bracketed Expressions

Source: Created on behalf of IU (2020).

The above pushdown automaton uses a stack whose push operation increases the count
of the number of consecutive open brackets.

The pop operation of the stack decreases the above count. All in all, a final state with an
empty stack will correspond to a well-bracketed sequence. The concepts of states, transi-
tions, and alphabets are similar to the ones of the finite automata except that, for push-
down automata, stacks may have an alphabet of their own in addition to the alphabet of
the inputs. However, these two alphabets might sometimes coincide. It is also important
to note that the transition arrow of a pushdown automaton is made up of three elements:
the current letter of the input word being checked, the letter currently on the top of the
stack, and the action to be performed on the stack (push or pop). For example, there is a
transition between s0 and s2 and that transition has the “(, Ø, push ꓕ, (” label. This means
that from s0, if the open bracket ( is the current letter of the input word, and the stack is
currently empty Ø, then we have to push “ꓕ” followed by ( on top of the stack and move to
s2.

Readers are advised that the above pushdown automaton assumes that well-formatted
input words end with the “ꓕ” character, and the stack is initially empty. Readers are also
invited to test the machine using, for example, (())()ꓕ as an input word. They are also
reminded that pushdown automata are the model of computation of context-free lan-
guage-based conceptual computers where the stack serves as the memory device.
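The behavior of this pushdown automaton can be mimicked with an explicit stack in JavaScript. The sketch below captures the idea rather than transcribing the exact transition labels of the figure; in particular, the end of the string plays the role of the ꓕ terminator:

```javascript
// Check well-bracketing with an explicit stack, mirroring the pushdown
// automaton: push on "(", pop on ")", accept when the stack ends up empty.
function isWellBracketed(input) {
  const stack = [];
  for (const ch of input) {
    if (ch === "(") {
      stack.push(ch);
    } else if (ch === ")") {
      if (stack.length === 0) { return false; } // pop on an empty stack: reject
      stack.pop();
    } else {
      return false; // only brackets belong to the input alphabet here
    }
  }
  return stack.length === 0; // accept only with an empty stack
}

console.log(isWellBracketed("(())()")); // true
console.log(isWellBracketed("(()"));    // false
```

The stack depth at any point is exactly the number of currently unmatched open brackets, which is the memory a finite automaton lacks.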

Turing machines

The use of a stack as a memory device has its own limitations. This calls for other memory
devices that make use of a less restrictive data structure as is the case of the tape of a
Turing machine. Such tapes allow their read/write head to move to the left (L), to stay
unchanged (U), or to move to the right (R).

Figure 76: Turing Machine for the Words aⁿbⁿcⁿ

Source: Created on behalf of IU (2020).

The purpose of the Turing machine above is to recognize words of the form aⁿbⁿcⁿ, where
n is a strictly positive integer. The transition from one state to another is made up of three
components. For example, the a, N, R transition between the SaZ state and the Sb state
stipulates that from state SaZ, if the head of the tape is currently pointing to a, then the
letter a has to be replaced by N, the head of the tape must be moved to the position on its
immediate right, and the Turing machine must go to state Sb. It is important to note that
the input word is always copied on the tape at the beginning of the processing with the
head of the Turing machine pointing to the first letter of that input. Here again, there is a
terminating character ꓕ as was the case for pushdown automata. The Turing machine is
tested here for the aabbcc input, but readers are invited to test it with other inputs.

Figure 77: Turing Machine (aabbcc Input)

Source: Created on behalf of IU (2020).

What is amazing about Turing machines is that they are the model of computation for all
possible language-based conceptual computers. In other words, we define an algorithm to
be computable if and only if it can be modeled by a Turing machine.
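A Turing machine can be simulated with a few lines of JavaScript. The sketch below is our own illustrative construction: a generic simulator driven by a transition table, shown on a smaller cousin of the figure's machine that recognizes the words aⁿbⁿ:

```javascript
// A minimal Turing machine simulator. The machine below recognizes a^n b^n:
// it repeatedly marks the leftmost "a" as "X" and the leftmost "b" as "Y",
// then verifies that only "Y"s remain before the blank terminator.
const BLANK = "_"; // stands in for the terminating character of the tape

// transitions[state][symbol] = [symbolToWrite, move ("L" or "R"), nextState]
const transitions = {
  s0: { a: ["X", "R", "s1"], Y: ["Y", "R", "s3"] },
  s1: { a: ["a", "R", "s1"], Y: ["Y", "R", "s1"], b: ["Y", "L", "s2"] },
  s2: { a: ["a", "L", "s2"], Y: ["Y", "L", "s2"], X: ["X", "R", "s0"] },
  s3: { Y: ["Y", "R", "s3"], [BLANK]: [BLANK, "R", "accept"] },
};

function runTM(input) {
  const tape = (input + BLANK).split("");
  let state = "s0", head = 0;
  while (state !== "accept") {
    const rule = (transitions[state] || {})[tape[head]];
    if (!rule) { return false; } // no valid transition: reject the input
    tape[head] = rule[0];
    head += rule[1] === "R" ? 1 : -1;
    state = rule[2];
  }
  return true;
}

console.log(runTM("aabb")); // true
console.log(runTM("aab"));  // false
```

Extending the table with a third marking pass yields a recognizer for aⁿbⁿcⁿ in the same style as the machine in the figure.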

Lambda calculus

Lambda calculus is a model of computation for conceptual computers whose language is
made up of the terms defined below:

• A variable such as x, y, z, a, b, c, d, and so on
• An anonymous lambda function that assigns the term t to a given variable x, denoted
by λx.t
• An application ts that computes the term t using another term s as an argument

In other words, a term is either a variable, an anonymous lambda function, or an
application. However, the use of brackets is sometimes unavoidable for the sake of clarification,
even though the left priority rule [t1t2t3 = (t1t2)t3] applies by default. Here are a few exam-
ples of simple terms. The identity function for which each element is assigned to its own
value is represented by the anonymous lambda function λx.x, which is the same as λy.y
since x is a variable whose name is changeable.

Let’s also look at the example of how functions are composed where f∘g represents the
composition of the functions f and g. What is the meaning of f∘f? For f, every variable is
assigned to the value of its f-mapping, which is the value of its mapping by the function f.
For f∘f, each variable is assigned to its f∘f-mapping which is the value of its mapping, by
the function f∘f (i.e., the f-mapping of its f-mapping). We can carry on with f∘f∘f, f∘f∘f∘f,
and so on, up to eternity, but, let us pause for a while to look at how to translate these
examples of function compositions into the lambda calculus language:

• The anonymous function that assigns any given variable f (f is a function) to its own
value f is denoted in lambda calculus by λf.(λx.fx).
• The anonymous function that assigns any given variable f (f is a function) to the value of
its own double composition f∘f is denoted in lambda calculus by λf.(λx.f(fx)).
• An anonymous function that assigns any given variable f (f is a function) to the value of
its own triple composition f∘f∘f is denoted in lambda calculus by λf.(λx.f(f(fx))).

These three bullets are the respective lambda calculus representations of the numbers 1,
2, and 3, with the convention that the lambda term λf.(λx.x) is a representation of 0.

It is possible to reduce a lambda calculus expression to a simpler form as is the case for
other mathematical expressions. Such simplifications are done with the help of the fol-
lowing three rules: alpha-conversions, beta-reductions, and eta-reductions.

1. Alpha-conversion: For application terms, argument names should be changed in
order to avoid clashes with the names of the variables of the mapping functions.
2. Beta-reduction: An application term ts can be reduced by replacing the variable of the
term t with the argument s.
3. Eta-reduction or conversion: The term λx.(tx) can be simplified, reduced, or conver-
ted to the term t when the variable x is not used by the term t.

We will now illustrate the use of these simplification rules on the following lambda calcu-
lus term: λn.(λf.(λx.f((nf)x))).

Let us try to apply this function with the variable n equal to λf.(λx.x). This will give the
application term (λn.(λf.(λx.f((nf)x))))(λf.(λx.x)).

(λn.(λf.(λx.f((nf)x)))) (λf.(λx.x))
= λf.(λx.f(((λf.(λx.x))f)x))  by beta-reduction
= λf.(λx.f((λx.x)x))          by beta-reduction
= λf.(λx.fx)                  by beta-reduction

We have just applied λf.(λx.x) to the λn.(λf.(λx.f((nf)x))) lambda calculus term and lan-
ded on the term λf.(λx.fx). Readers are also invited to apply λf.(λx.fx) to the λn.(λf.
(λx.f((nf)x))) lambda calculus term, and they will land on the term λf.(λx.f(fx)) in confir-
mation of the fact that λn.(λf.(λx.f((nf)x))) is the lambda calculus representation of the
increment function that adds 1 to its parameter n.
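The Church numerals and the increment term above can be transcribed almost literally into JavaScript arrow functions (a sketch; the helper names are ours):

```javascript
// Church numerals: the number n is the function that composes f with itself n times.
const ZERO = f => x => x;        // λf.(λx.x)
const ONE  = f => x => f(x);     // λf.(λx.fx)
const TWO  = f => x => f(f(x));  // λf.(λx.f(fx))

// The increment term λn.(λf.(λx.f((nf)x))).
const SUCC = n => f => x => f(n(f)(x));

// Decode a Church numeral back into a JavaScript number by counting
// how many times it applies its argument function.
const toNumber = c => c(k => k + 1)(0);

console.log(toNumber(SUCC(ZERO))); // 1
console.log(toNumber(SUCC(ONE)));  // 2
```

The beta-reductions carried out by hand above are performed here by the JavaScript evaluator itself when SUCC is applied to a numeral.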

Lambda calculus was discovered by Church and Kleene around 1930 (Turner, 2018). It is
both Turing-complete and Turing-equivalent (Adams, 2018; Lyman, 2016) in the sense
that any Turing machine can be modeled as a lambda calculus-based machine and vice
versa.

Turing-complete computational models: These models are able to model any Turing
machine.
Turing-equivalent computational models: Turing machines can simulate any Turing-
equivalent computational model.

Recursive functions

Discovered by Gödel and Herbrand around 1930, general recursive functions are Turing-
complete (Wang, 1990). There are two types of recursive functions for a tuple variable
X = (x1, x2, x3, …, xn): primitive recursive functions and general recursive functions.

For a function to be declared a primitive recursive function, it has to fulfill one of the fol-
lowing conditions:

• It is a Zero function that assigns the value zero to each tuple variable X.
• It is a projection function that assigns a selected tuple coordinate to each tuple variable
X.
• It is a projection successor function that assigns the successor of a selected tuple coor-
dinate to each tuple variable X (e.g., X is mapped to 1+x3).
• It is the composition of primitive recursive functions.
• It is the recursion of two primitive recursive functions g and h, as expressed below:

f(X, 0) = g(X)
f(X, S(n)) = h(X, n, f(X, n))

Let us consider a few examples of functions to check whether they are primitive recursive,
starting with the increment function that simply adds 1 to its parameter. The increment
function f(x) = x + 1 is a primitive recursive function since x+1 is the successor of the
projection on the first and unique variable x. Let's look at the function f that adds its two
parameters x and n. This function is recursively written as follows, even though it is not
yet in the formal primitive recursive language.

f(x, 0) = x
f(x, S(n)) = f(x, n) + 1

The workings of the formulation of the x+n function in the formal primitive recursive lan-
guage are as follows:

• f(x,0) = g(x) = x with g being the projection on its first and unique variable
• f(x, S(n)) = h(x, n, f(x, n)) with h being the successor of the projection on its third
attribute

This leads to the following formal formulation of the x+n function as a primitive recursive
function

f(x, 0) = g(x)
f(x, S(n)) = h(x, n, f(x, n))

with

g(t) = t
h(u, v, w) = S(w)

It is also possible to write the function f that multiplies its two parameters x and n. The
function is written recursively as follows, even though it is not yet in the formal primitive
recursive language.

f(x, 0) = 0
f(x, S(n)) = f(x, n) + x

The workings of the formulation of the x·n function in the formal primitive recursive lan-
guage are as follows:

• f(x,0) = g(x) = 0 with g being the Zero function


• f(x, S(n)) = h(x, n, f(x, n)) with h being the addition of the first parameter with the
third one.

This leads to the following formal formulation of the x·n function as a primitive recursive
function, assuming that we have already shown that the addition function is a primitive
recursive function.

f(x, 0) = g(x)
f(x, S(n)) = h(x, n, f(x, n))

with

g(t) = 0
h(u, v, w) = add(u, w)
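Both formal formulations can be executed with a small JavaScript combinator that mirrors the recursion schema (a sketch; the helper names are ours):

```javascript
// Build f from g and h following the primitive recursion schema:
//   f(x, 0)   = g(x)
//   f(x, n+1) = h(x, n, f(x, n))
function primitiveRecursion(g, h) {
  return function f(x, n) {
    if (n === 0) { return g(x); }
    return h(x, n - 1, f(x, n - 1));
  };
}

// Addition: g(t) = t and h(u, v, w) = S(w) = w + 1.
const add = primitiveRecursion(t => t, (u, v, w) => w + 1);

// Multiplication: g(t) = 0 and h(u, v, w) = add(u, w).
const mul = primitiveRecursion(t => 0, (u, v, w) => add(u, w));

console.log(add(3, 4)); // 7
console.log(mul(3, 4)); // 12
```

Note that mul reuses add exactly as the formal formulation does, illustrating closure of primitive recursive functions under composition and recursion.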

Readers are invited to convert other arithmetic operations, such as subtraction, the power
of one number to another, and even the factorial, to their formal primitive recursive
formats. For now, we are going to direct our attention to general recursive functions.

A function f(X) is said to be a general recursive function if and only if it fulfills either of the
following conditions:

• It is a primitive recursive function.


• It is the unbounded minimization of another general recursive function g and can be
expressed as

f(X) = Minimum z such that g(X, z) = 0

Here is an example of a general recursive function that is not primitive recursive: the
Ackermann function A, defined as follows:

A(0, y) = y + 1
A(x + 1, 0) = A(x, 1)
A(x + 1, y + 1) = A(x, A(x + 1, y))

Below are the values of A(x, y) when x and y belong to the {0, 1, 2, 3} set.

       y=0   y=1   y=2   y=3
x=0    1     2     3     4
x=1    2     3     4     5
x=2    3     5     7     9
x=3    5     13    29    61

We can see from the table that any pair of natural numbers has a value for the Acker-
mann function. This function is thus recognized as a total general recursive function. In
contrast, the following general recursive function only has a value when x = 0 and not for
any other x. It is recognized as a partial general recursive function: f(x) = Min {z |
add(x, z) = 0}.
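The defining equations of the Ackermann function translate directly into a recursive JavaScript function (a sketch; beware that its values grow explosively for x ≥ 4):

```javascript
// The Ackermann function, transcribed from its defining equations.
// A(4, y) already exceeds any practical range, so keep x small.
function ackermann(x, y) {
  if (x === 0) { return y + 1; }
  if (y === 0) { return ackermann(x - 1, 1); }
  return ackermann(x - 1, ackermann(x, y - 1));
}

console.log(ackermann(2, 2)); // 7
console.log(ackermann(3, 3)); // 61
```

Running it over x, y in {0, 1, 2, 3} reproduces the table above.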

First order predicate calculus

Discovered by Frege (1879/1990), first order predicate calculus is proven to be Turing-
complete. It is a model of computation that consists of representing facts as either true or
not true, with the help of quantifiers such as “for all” and “there exists.” Here is an illustration
of the use of the first order predicate calculus model for the increment operation i. This
example simply says that for any number n, it is true that the increment i of n is simply its
successor s(n): ∀n.i(n,s(n)).

Similarly, the addition operation a can be expressed by the following first order predicate
calculus expression according to which it is true that the addition of any number x with 0
is equal to x. According to the second part of that expression, it is true that if z is the result
of the addition of x and y, then it is also true that z+1 is the result of the addition of x and
y+1.

∀x. a(x, 0, x)
∀x. ∀y. ∀z. (a(x, y, z) ⇒ a(x, s(y), s(z)))

Let’s try to compute the addition of 3 and 2 with this predicate calculus model of compu-
tation: a(3,0,3) ⇒ a(3,1,4) ⇒ a(3,2,5).
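This forward derivation can be mechanized in a few lines of JavaScript (our own illustrative sketch): start from the axiom a(x, 0, x) and apply the implication rule until the desired second argument is reached.

```javascript
// Compute addition by applying the two predicate-calculus rules forward:
// start from the axiom a(x, 0, x) and repeatedly apply
// a(x, y, z) => a(x, s(y), s(z)) until the fact a(x, y, z) is derived.
function addByRules(x, y) {
  let fact = { x, y: 0, z: x }; // axiom: a(x, 0, x)
  while (fact.y < y) {
    fact = { x, y: fact.y + 1, z: fact.z + 1 }; // the implication rule
  }
  return fact.z;
}

console.log(addByRules(3, 2)); // 5
```

This is a toy forward-chaining evaluator; logic programming systems generalize the idea to arbitrary sets of first order rules.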

New Models and the von Neumann Machine

The above models have led to the creation of new models of computation such as the von
Neumann machine, object calculus, and interaction nets. Object-oriented programming is
one of the relatively new programming approaches whose model of computation is
known as imperative object calculus as proposed by Abadi and Cardelli (1996). It is based
on lambda calculus and it has been proven to be Turing-complete. Interaction nets are a
graphical model of computation that was proposed by Lafont (1989) even though they
also have a textual version. This model of computation is based on a linear logic model
proposed by Girard (1987) as a combination of classical logic and constructive logic. It is
important to note that interaction nets are Turing-complete. Computing is also currently
being explored from the perspective of many other knowledge domains such as humani-
ties, physics, chemistry, biology, biochemistry, and mathematics, with the hope of pro-
posing new models of computation.

We would like to end this section with an overview of the von Neumann computer
because of its central role towards the physical implementation of several abstract con-
cepts from many of the aforementioned models of computation.

The von Neumann computer is made up of the following main components that are con-
nected by a bus system: control unit, processing unit, memory, hard disk, and input/
output devices. The role of the control unit is to coordinate the activities of the other com-
ponents by continuously updating its instruction register and its program counter. The
processing unit is made up of an arithmetic-logic unit and of registers for the processing of
instructions. As for the memory, its role is to store instructions and data prior to their exe-
cution by the processing unit. The memory is organized as a sequence of addresses whose
role is to identify the locations of program data and instructions.

Such addresses are used by the Fetch-Decode-Execute cycle of the control unit for its con-
tinuous coordination of the activities of the components of the computer. Let’s not forget
to mention input and output devices for their central role at the interface between the
computer and its user. We must also remember that data and programs are transferred to
the memory when it is their turn to be processed; otherwise, they are stored in the hard
disk.

5.2 The Halting Problem


Having in mind that a function is supposed to be fed with an input for which it will yield an
output, let us consider the following function denoted by hp:

• The input of hp is made up of two arguments ta and ti where ta represents any algo-
rithm in its JavaScript textual form, and ti represents any text input of that algorithm
ta. Please note that JavaScript is not mandatory here and can be replaced with any
other language.
• The output of hp for the (ta, ti) input is a Boolean value that is equal to true if and
only if the algorithm ta terminates for the text input ti.

The hp function portrays what is known as the halting problem, and it is well established
that there is no possible hp function that can decide in advance whether any given algo-
rithm ta will terminate for any given input ti.

Proof: Let us suppose that there is an algorithm hpa for the halting problem whereby
hpa(ta, ti) always returns true if the algorithm ta terminates for the input text ti and
returns false otherwise. We also have another function with the name of
happyCrazyLooper that works as follows for a given input text t: It is happy to return
true when hpa(t, t) is false; otherwise, it loops forever.

Code
function happyCrazyLooper(t) {
  if (hpa(t, t) === false) { return true; }
  while (true) { } // loop forever when hpa(t, t) returns true
}

Having in mind that the happyCrazyLooper algorithm is also a text that we can denote,
for example, by hcl, an interesting question is to find out if happyCrazyLooper(hcl)
terminates. The simple example of a program that counts the number of words in a given
text can help us understand that a program can use itself as an input, i.e., a program can
count its own number of words.

If we first assume that happyCrazyLooper(hcl) terminates, then this will imply that
hpa(hcl, hcl) is false. In other words, hcl does not terminate with hcl as an input,
which also tells us that happyCrazyLooper(hcl) does not terminate. And that contra-
dicts the initial first assumption made.

If we now assume that happyCrazyLooper(hcl) does not terminate, then this will imply
that hpa(hcl, hcl) is true. In other words, hcl does terminate with hcl as an input,
which also tells us that happyCrazyLooper(hcl) does terminate. It is a contradiction of
the initial second assumption made. These contradictory conclusions point to the fact
that the existence of the hpa algorithm is not possible. In other words, there is no possible
algorithm for the halting problem.

5.3 Undecidable Problems
The previous section ended with the confirmation that it is not possible to have an
algorithm for the halting problem. In other words, there is no possible algorithm that can
decide in advance whether a given algorithm will terminate for any given input. The
halting problem is thus undecidable, and so are many other problems from diverse
mathematical sub-domains such as logic, number theory, differential equations, and
Turing machines.

Decision problems: These aim at returning a binary answer such as true or false, yes or
no, 0 or 1, etc.
This section will present two undecidable Turing machine problems, but let us first take a
few minutes to mention that the previous halting problem was initially described by
Turing himself as a Turing machine problem. That initial formulation of the halting prob-
lem states that there is no algorithm that can decide in advance whether any randomly
given Turing machine will terminate for a randomly given input.

It is now time to look at two other undecidable Turing machine problems: the null (empty)
string problem and the problem of the membership to a recursively enumerable set.

The Null (Empty) String Problem

Let us suppose that we have a randomly chosen Turing machine, TM, as well as a ran-
domly chosen input, x. We can then build the following Turing machine, TM’, whose
behavior for each of its inputs y is based on TM and x.

Figure 78: Turing Machine Concept for the Null (Empty) String Problem

Source: Created on behalf of IU (2020).

Let us suppose for a moment that we found an algorithm NSA (Null String Algorithm) that
can determine in advance whether any given Turing machine accepts the null(empty)
string. This will imply that the NSA algorithm is able to determine in advance whether the
above described Turing machine TM’ accepts the null string. In other words, having in
mind that yx = x when y is the null (empty) string, the NSA algorithm is able to determine
in advance whether or not TM halts on x. This shows that we have now found an algorithm
that, for any randomly chosen Turing machine TM and any randomly chosen input x, is
able to decide in advance whether TM halts on x.

This contradicts the undecidability of the halting problem, and we conclude that there is
no algorithm that can determine in advance whether a given Turing machine accepts the
null string.

The Problem of the Membership in a Recursively Enumerable Set

It is possible to represent each imaginable algorithm and each conceivable algorithm’s
input by a natural number. Let us assign ourselves the task of writing a JavaScript pseudo-
code listSHX to print SHX. We assume that SHX is the list of all the algorithms (represen-
ted by numbers) that are halting on a randomly chosen input X (also represented by a
number), having in mind that an algorithm is a sequence of many steps.

The listSHX JavaScript pseudocode shows that SHX is a recursively enumerable set, i.e.,
a set for which “there is a computer program that when left running forever eventually
prints out exactly” its elements (Poonen, 2014, p. 3).

Figure 79: Recursively Enumerable Set Membership JavaScript Pseudocode

Source: Created on behalf of IU (2020).

Is there a possible program that can determine whether an algorithm represented by a
number n belongs to SHX? In other words, is there a program that can determine in
advance whether any algorithm a will halt on a randomly chosen input X? The answer to
that question is negative because of the undecidability of the halting problem. In other
words, the problem of the membership of an algorithm to the recursively enumerable set
SHX is undecidable. This also shows that the problem of determining whether or not a
randomly chosen natural number belongs to a randomly chosen recursively enumerable
set is undecidable.

SUMMARY
This unit presented five traditional models of computation including
automata, Turing machines, lambda calculus, recursive functions, and
first order predicate calculus. With the exception of automata, these
classic models of computation are all Turing-complete. They are the
foundation of contemporary programming language paradigms, such as
imperative, functional, and logic. Newer models of computation, such as
the von Neumann machine, object calculus, interaction nets, and
nature-based computing models, were briefly introduced.

The concept of undecidability was presented and exemplified with the
halting problem, the null (empty) string recognition problem, and the problem
of the membership to a recursively enumerable set. Because of the
equivalence between the concepts of algorithms and Turing machines,
these examples show that, even though Turing machines are powerful
enough to compute all possible algorithms, there are still many
computational problems that cannot be solved by any algorithm.

UNIT 6
EFFICIENCY OF ALGORITHMS: COMPLEXITY
THEORY

STUDY GOALS

On completion of this unit, you will have learned …

– how to measure the efficiency of an algorithm.


– the different computational complexity classes.
– key perceptions as to whether P = NP.
6. EFFICIENCY OF ALGORITHMS:
COMPLEXITY THEORY

Introduction
Let us recall that an algorithm is nothing more than a step-by-step method proposed as a
solution to a computational problem. It is important for an algorithm to have the smallest
possible number of steps for the fastest resolution of its computational problem. Apart
from this time efficiency requirement, the quality of an algorithm also depends on the
amount of space that it uses. Both complexity dimensions are important for assessing
whether an implemented algorithm will be able to process an expected class of inputs.

The purpose of this unit is to present the main computational complexity models that are
available for the measurement of the efficiency of algorithms and for the analysis of the
complexity of computational problems. This unit will also present the different classes of
computational complexity, in anticipation of the discussion of the question as to whether
P=NP.

6.1 Models of Complexity


A model of complexity is an evaluation framework for the assessment of the complexity
of computational problems or for the analysis of the efficiency of algorithms.

Model of complexity: A model of complexity encompasses metrics used to measure the
efficiency of algorithms and the difficulty of computational problems.

Time Efficiency Analysis with the Big O Approximation Model
Let us recall that an algorithm is nothing more than a sequence of steps to solve a given
computational problem. From that perspective, it is possible to consider that the number
of steps of an algorithm is a suitable metric for the measurement of its efficiency. Let us
also remember that different inputs do not necessarily yield the same number of steps for
a given algorithm. One of the most used complexity models for the analysis of the time
(and the space) efficiency of algorithms is the big O time and space approximation model.
The big O model consists of approximating the time (or the space) efficiency of algorithms
in the form of a function of their infinitely large input size. It is a measurement of an upper
bound of the execution time (or the space requirement) of an algorithm for an infinitely
large input size.

Big O time complexity: The big O time complexity model measures the upper bound of
the execution time of an algorithm when its input size is considered to be infinitely large.
Big O space complexity: The big O space complexity model measures the upper bound of
the space used by an algorithm when its input size is considered to be infinitely large.

The formal definition of the big O notation stipulates that h(n) = O(f(n)) if and only if
two constant values b and c exist such that 0 ≤ h(n) ≤ b·f(n) for all values of n ≥ c. In
other words, the definition of the big O is that b·f(n) is an upper bound of h(n) for high
values of n. Below are a few examples of how to calculate the big O efficiency of an
algorithm.

152
The biggest subsequence problem

Let us assume that we have a sequence of numbers and are looking for an algorithm to
identify the first subsequence with the biggest sum among all the other subsequences.

Figure 80: Biggest Subsequence Problem JavaScript Code (Start)

Source: Created on behalf of IU (2020).

153
Figure 81: Biggest Subsequence JavaScript Code (End)

Source: Created on behalf of IU (2020).

The biggest subsequence problem can be illustrated with the example of the sequence 3,
−4, 8, −1, 6, −1, where it is visible that the subsequence with the biggest total (of 13) is 8,
−1, 6. The above NodeJs JavaScript code presents two different algorithms for the biggest
subsequence problem. These two algorithms are traced for the input sequence 3, −4, 8,
−1, 6, −1, starting with the first one denoted by biggestSubSeq.

Figure 82: Traced Algorithm

Source: Created on behalf of IU (2020).

154
This algorithm simply scans through each element of the sequence and calculates the
sums of all the possible subsequences that are starting from that element. This process
assigns a new value to the biggest subsequence whenever it finds a new subsequence that
is bigger than the one that was previously considered the biggest.
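Since the figure's code is not reproduced in this text version, the double-loop strategy just described can be sketched as follows. The function name biggestSubSeq matches the figure, but returning only the biggest sum, rather than the subsequence itself, is a simplifying assumption of this sketch.

```javascript
// Sketch of the double-loop approach: for every start index i,
// sum all subsequences that begin at i and keep the biggest sum.
function biggestSubSeq(a) {
  let best = a[0];
  for (let i = 0; i < a.length; i++) {
    let sum = 0;
    for (let j = i; j < a.length; j++) {
      sum += a[j]; // sum of the subsequence a[i..j]
      if (sum > best) best = sum;
    }
  }
  return best;
}

console.log(biggestSubSeq([3, -4, 8, -1, 6, -1])); // 13 (= 8 - 1 + 6)
```

The nested loops make the quadratic running time plainly visible: the outer loop runs n times and the inner loop up to n times per iteration.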

Let us estimate the efficiency of this algorithm with the help of the big O notation, assum-
ing that the input sequence has a length of n. When i=0, j loops from 0 to n–1. In other
words, j loops n times; each of these times a fixed number of basic steps are executed by
the algorithm. When i=1, j loops from 1 to n – 1, i.e., j loops n – 1 times, and each of
these times a fixed number of basic steps is executed by the algorithm. We can carry on
with that pattern until we reach the point where i = n – 1 and, in that case, the instructions
inside the inner loop are only executed once. This leads to the following formula for
the number of basic operations carried out by the above traced algorithm when the value
of n is assumed to be infinitely big.

n + (n−1) + (n−2) + … + 3 + 2 + 1 = n(n+1)/2 = O(n²)

The equation shows that this algorithm has a quadratic big O time efficiency (n² is the
square of n). Let us recall the following formal definition of the big O notation: h(n) =
O(f(n)) if and only if two constant values b and c exist such that 0 ≤ h(n) ≤ b·f(n) for all
values of n ≥ c. This definition of big O also clearly shows that b·f(n) is an upper bound of
h(n) for high values of n. For instance, one can see in the above example that (1/2)·n(n+1)
is less than or equal to n² when n ≥ 1 (for this example, b = c = 1, and n² is an upper
bound of (1/2)·n(n+1)).

The following table traces Kadane’s algorithm for the subsequence problem with the same
input sequence: 3, −4, 8, −1, 6, −1.

155
Figure 83: Tracing Kadane’s Algorithm

Source: Created on behalf of IU (2020).

The main characteristic of the traced algorithm is that, when we try to add a value a[i] to
the current local sum, it is better to start a new local sum from the element a[i] if that
element is greater than the result of the addition. For example, if the current local sum is
−3, and we are trying to add −2 to it, then the addition will give −5, which is less than −2.
Therefore, it is better to start a new sum from −2. Another key characteristic of this
algorithm is that it is
made up of a single loop instead of a double loop, as was the case for the first one
biggestSubSeq.
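The single-loop idea behind the traced algorithm can be sketched as follows. This is a hedged reimplementation of Kadane's algorithm; the function name kadane and the return value (only the biggest sum) are assumptions, not the exact code from the figures.

```javascript
// Kadane's algorithm: keep a running local sum and restart it from
// a[i] whenever a[i] alone is bigger than the extended sum.
function kadane(a) {
  let local = a[0]; // biggest sum of a subsequence ending at the current index
  let best = a[0];  // biggest sum seen so far
  for (let i = 1; i < a.length; i++) {
    local = Math.max(a[i], local + a[i]); // restart from a[i] when that is better
    best = Math.max(best, local);
  }
  return best;
}

console.log(kadane([3, -4, 8, -1, 6, -1])); // 13
```

A single pass with a fixed amount of work per element is exactly what yields the linear O(n) bound derived in the text.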

It is clear that the traced Kadane’s algorithm (above) loops n times, and for each of these
times, a fixed number of operations is executed. If we denote that fixed number of opera-
tions with c, then the total number of operations executed by that algorithm can be esti-
mated by the following formula when n is assumed to be infinitely big.

cn ≃ n = O(n)

156
Both algorithms are a typical illustration of the fact that for the same computational prob-
lem, it is possible to have different algorithms, but the most valuable ones are the ones
that are proven to be the most efficient. In this case, Kadane’s algorithm with a linear big
O time efficiency function (a linear function of n) is by far more efficient than the other
algorithm that has a quadratic big O time efficiency function. This difference is visible in
the graph below comparing the evolution of the quadratic function to a linear function.

Figure 84: Graphical Comparison of O(n2) and O(n)

Source: Created on behalf of IU (2020).

Fibonacci numbers

Fibonacci numbers are quite popular in computing and mathematics. They are defined as
follows:

F(0) = 0
F(1) = 1
F(i) = F(i−1) + F(i−2) for i ≥ 2

The following NodeJs JavaScript code contains two different algorithms for the computa-
tion of the Fibonacci number F(n) when n is considered a given positive integer.

In the following program, the usualFib function is an implementation of the usual calcu-
lation of Fibonacci numbers. It is a loop from 2 to n as an iterative computation of F(0),
F(1), F(2), F(3), F(4), …, and so on, up to F(n). That loop is executed almost n times with

157
a fixed number of operations for each instance of the loop. We can say that the big O
approximation of the usualFib algorithm is the following when the value of n is assumed
to be infinitely big.

cn ≃ n = O(n)
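As the program listing is only available as a figure, here is a hedged sketch of what such an iterative usualFib function might look like:

```javascript
// Iterative Fibonacci: compute F(2), F(3), ..., F(n) in one loop.
function usualFib(n) {
  if (n < 2) return n; // F(0) = 0, F(1) = 1
  let prev = 0; // F(i - 2)
  let cur = 1;  // F(i - 1)
  for (let i = 2; i <= n; i++) {
    [prev, cur] = [cur, prev + cur]; // cur becomes F(i)
  }
  return cur;
}

console.log(usualFib(10)); // 55
```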

Figure 85: Fibonacci Algorithms (Start)

Source: Created on behalf of IU (2020).

158
Figure 86: Fibonacci Algorithms (Cont'd)

Source: Created on behalf of IU (2020).

159
Figure 87: Fibonacci Algorithms (End)

Source: Created on behalf of IU (2020).

The rationale behind the fastFib function as another way to compute Fibonacci num-
bers is less obvious. It deserves a few words, starting with the presentation of the follow-
ing matrix M:

M = [ [1, 1], [1, 0] ]

Let us multiply the matrix M by itself a few times: M · M = M^2; M · M · M = M^3;
M · M · M · M = M^4; M · M · M · M · M = M^5, and so on.

M^1 = [ [1, 1], [1, 0] ]    M^2 = [ [2, 1], [1, 1] ]    M^3 = [ [3, 2], [2, 1] ]    M^4 = [ [5, 3], [3, 2] ]
M^5 = [ [8, 5], [5, 3] ]    M^6 = [ [13, 8], [8, 5] ]   M^7 = [ [21, 13], [13, 8] ]  M^8 = [ [34, 21], [21, 13] ]

The above table can be generalized by stating that

M^n = [ [F(n+1), F(n)], [F(n), F(n−1)] ]

This formula shows that the calculation of M^n directly leads to the value of F(n). This is
precisely what is exploited by the fastFib function.

160
Figure 88: Tracing the Fast Fibonacci Algorithms

Source: Created on behalf of IU (2020).

A closer look at the fastFib function shows how the matrix M is raised to the power n,
visible above for all the numbers n from 1 to 16. Let us start by tracing the above tree for n
= 16. The calculation of M^16 (= M^8 · M^8) calls for the calculation of M^8 (= M^4 · M^4),
which itself calls for the calculation of M^4 (= M^2 · M^2), and then M^2 (= M^1 · M^1). This
shows that the calculation of M^16 succeeds with only four (4 = log2(16)) multiplications of
two matrices. The calculation of M^32 will succeed with only five (5 = log2(32)) multiplica-
tions of two matrices. In general, when n is a power of 2, the calculation of M^n will succeed
with only log2(n) multiplications of two matrices. This process works well when the divi-
sion by two always lands on an integer from n down to 1. The situation is, however,
slightly nuanced when the number to be halved is odd.

Let us trace the tree above for n=25. The calculation of M^25 (= M^12 · M^12 · M) calls for
the calculation of M^12 (= M^6 · M^6), which itself calls for the calculation of M^6 (= M^3 · M^3)
and of M^3 (= M^1 · M^1 · M). This shows that the calculation of M^25 succeeds with
only four (4 ≤ log2(25)) matrix multiplication requests, and each of these requests con-
sists of one or two multiplications. It can be generalized that when n is not a power of 2,
the calculation of M^n succeeds with fewer than log2(n) matrix multiplication requests,
with each request being made up of one or two multiplications. Let us recall that when n is
a power of 2, M^n succeeds with exactly log2(n) matrix multiplication requests, with each
request being made up of one multiplication. Having in mind that the multiplication of
two 2 x 2 matrices involves few arithmetic operations, we can conclude that the total
number of operations executed by the fastFib algorithm can be estimated with the fol-
lowing formula when n is assumed to be infinitely big.

c·log2(n) = O(log2(n))

161
Once again, we have two different algorithms for the same computational problem where
one of them, fastFib, is more efficient than the other one, usualFib. In fact, the big O
approximation function of the time efficiency of fastFib is logarithmic (log2(n)) while the
one of usualFib is linear.
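The matrix-power idea can be sketched as follows. This is a hedged recursive reimplementation; the helper names mulMatr and powerMatr are assumptions inspired by the names mentioned in the text, not necessarily the exact code of the figures.

```javascript
// Multiply two 2 x 2 matrices (a fixed number of arithmetic operations).
function mulMatr(a, b) {
  return [
    [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
    [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
  ];
}

// Raise m to the power n by halving n, so only O(log2(n)) multiplication
// requests occur; an odd n costs one extra multiplication.
function powerMatr(m, n) {
  if (n === 1) return m;
  const half = powerMatr(m, Math.floor(n / 2));
  const sq = mulMatr(half, half);
  return n % 2 === 0 ? sq : mulMatr(sq, m);
}

// F(n) is the off-diagonal entry of M^n with M = [[1, 1], [1, 0]].
function fastFib(n) {
  if (n === 0) return 0;
  return powerMatr([[1, 1], [1, 0]], n)[0][1];
}

console.log(fastFib(10)); // 55
```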

Space Efficiency Analysis with the Big O Approximation Model

Until now, we have focused on the approximation of the time efficiency of algorithms
using the big O model. Let’s briefly turn our attention to the big O approximation of the
space efficiency of the previous two algorithms.

The biggest subsequence problem

The two algorithms that were proposed for this problem both make use of a small number
of integer variables (c). This number does not depend on the size of the input array. We are
thus able to conclude that the total amount of space used with each of these algorithms
can be estimated by the following formula when the size n of the input array is assumed to
be infinitely big.

f(n) = c = O(1)

This equation shows that the big O approximation function of the space efficiency of each
of these two algorithms is constant. It might still be necessary to consider the size n of the
input array as part of the space requirements of these algorithms, for example, when the
array is read from a network.

Fibonacci numbers

The usualFib algorithm that was proposed for the Fibonacci problem makes use of a
small number of integer variables. This number does not depend on the value of the input
n. We can conclude that the total amount of space used by that algorithm can be estima-
ted with the following formula when n is assumed to be infinitely big: f(n)=c = O(1). This
formula shows that the big O approximation function of the space efficiency of the
usualFib algorithm is constant.

The fastFib algorithm uses a recursive version of the powerMatr function; recursive
calls need to store all local variables of each of the functions that they visit until either it is
finished, or the variable is not necessary anymore. Therefore, “tail-recursive” functions
have an added advantage. Let us look at powerMatrI, which is the iterative version of the
powerMatr algorithm, in order to estimate the amount of space used by these two algo-
rithms. It is visible that the first loop of the powerMatrI algorithm builds a stack s to
record the parity of the successive numbers when they are halved because that parity
determines how the second loop multiplies its matrices (one multiplication or two multi-
plications). As was the case with the big O function of the time efficiency of the fastFib
algorithm because of the halving of numbers, we can conclude that the stack s does not

162
contain more than log2(n) integers. Moreover, the total amount of space used by the
fastFibI (also by the fastFib) algorithm can be estimated by the following formula
when n is assumed to be infinitely big.

c·log2(n) ≃ log2(n) = O(log2(n))

Worst, Average, and Best Case Efficiency Analysis

There are many situations where the efficiency of an algorithm depends on the choice of
the value of its input. This calls for the need to analyze the efficiency of algorithms for the
best possible, worst possible, and average scenarios. It also helps provide an estimate of
the waiting time for the termination of an algorithm that is run by a user. Below is an
example of a search algorithm that seeks to identify the position of the first occurrence of
a given element in a given sequence of elements.

Figure 89: Search Algorithm Example

Source: Created on behalf of IU (2020).
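The figure's code is not reproduced here; a hedged sketch of such a linear search (the function name search and the -1 not-found convention are assumptions) could look like this:

```javascript
// Linear search: return the position of the first occurrence of x
// in the sequence a, or -1 when x does not belong to the sequence.
function search(a, x) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] === x) return i; // best case: found at the first position
  }
  return -1; // worst case: every element was compared
}

console.log(search([3, -4, 8, -1, 6, -1], 8)); // 2
```

The best, worst, and average case analyses below all refer to how far this single loop runs before returning.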

Worst case scenario

The worst case scenario is the situation whereby the input of the algorithm pushes it to
perform the highest possible amount of work. Let’s consider the search algorithm that
scans through its input sequence from its first element to its last. Here, the worst case sce-
nario corresponds to the situation where the element that is being sought is either the last
element of the sequence or does not even belong to the sequence. In that case, the algo-

163
rithm would have checked each and every element of the sequence. This means that the
total number of comparisons made in the worst possible case by the search algorithm can
be estimated with the following formula when n (the number of elements in the input
sequence) is assumed to be infinitely big.

cn ≃ n = O(n)

This formula shows that the worst case big O approximation function of the time effi-
ciency of the above search algorithm is linear.

Best case scenario

The best case scenario is the situation where the input of the algorithm pushes that algo-
rithm to perform the smallest possible amount of work. Let’s consider the search algo-
rithm that scans through its input sequence from its first element to the last one. Here, the
best case scenario corresponds to the situation where the element that is being sought is
the first element of the sequence. In this case, the algorithm would have only checked one
element of the sequence. This means that the total number of comparisons made in the
best possible case by the search algorithm can be estimated with the following formula
when n (the number of elements in the input sequence) is assumed to be infinitely big.

f(n) = c = O(1)

This formula shows that the best case big O approximation function of the time efficiency
of the above search algorithm is constant.

Average case scenario

The average case scenario assumes that algorithms’ inputs have random values. We
assume a uniform random distribution of data here to avoid data repeats and biases that
can influence the evaluation of the efficiency of an algorithm. For such randomly distrib-
uted data, the big O approximation notation can still represent the calculated average effi-
ciency when the value of the input size is considered to be infinitely big. If we again con-
sider the search algorithm, the general case is the one where the element being sought
can be found in any position in the sequence.

If the element being sought is the first element of the sequence, then only one comparison
is made by the search algorithm. If the element being sought is the second element of the
sequence, then only two comparisons are made by the search algorithm. We can easily
see that if the element being sought is the pth element of the sequence, then only p
comparisons are made by the search algorithm. The probability of an element being in
position p is 1/n because of the hypothesized uniform distribution principle, and p can take any
value between 1 and n. In other words, the total number of comparisons made in the aver-
age case by the search algorithm can be estimated by the following formula when n (the
number of elements in the input sequence) is assumed to be infinitely big. The formula
shows that the average case big O approximation function of the time efficiency of the
above search algorithm is linear.

164
1/n + 2/n + 3/n + … + (n−2)/n + (n−1)/n + n/n = (1/n) · (1 + 2 + … + n) = n(n+1)/(2n) = (n+1)/2 = O(n)

Here is a summary of the names of the most common big O approximation functions.

Big O approximation function name    Big O approximation function notation

Constant                             O(1)

Logarithmic                          O(log2(n))

Linear                               O(n)

Log linear                           O(n·log2(n))

Quadratic                            O(n²)

Exponential                          O(c^n), with c constant

Complexity Classes

It is one thing to estimate the efficiency of a given algorithm, but it is entirely another
issue to estimate the level of complexity or difficulty of a given computational problem. In
fact, certain problems are intrinsically more difficult to resolve than others. This is why it is
possible to classify computational problems into distinct complexity classes according to
their different levels of difficulty or complexity. Such a classification implies that two dif-
ferent problems will belong to the same complexity class if and only if their best possible
algorithms have the same big O approximation function. A complexity class is formally
defined by the following four elements: a computational model (i.e., Turing machine), a
computational mode (deterministic or nondeterministic), a resource (time or space), and a
lowest upper bound (big O approximation of the best possible algorithms). The determin-
istic term is abbreviated to D, and the nondeterministic term to N for the naming of com-
putational classes. For example, DTime(nk) is the complexity class of all the computa-
tional problems whose best possible algorithm has a big O time efficiency approximation
function of nk when using a deterministic Turing machine as the model of computation.
Similarly, NSpace(log2(n)) is the complexity class of all the computational problems
whose best possible algorithm has a big O space efficiency approximation function of
log2(n) when using a nondeterministic Turing machine as the model of computation. We
see from these two examples that the name of a complexity class is of the form MTime(f)
or MSpace(f) where M is either D or N and f is a big O approximation function of n. The
following table gives a list of the most common complexity classes.

Table 3: Most Common Complexity Classes

Acronym      Notation                                        Name

REG          DSpace(1) = NSpace(1)                           Regular languages problems

L            DSpace(log2(n))                                 Deterministic logarithmic space problems

NL           NSpace(log2(n))                                 Nondeterministic logarithmic space problems

PSPACE       DSpace(n^O(1)) = NSpace(n^O(1))                 Polynomial space problems (n, n², n³, etc.)

EXPSPACE     DSpace(2^(n^O(1))) = NSpace(2^(n^O(1)))         Exponential space problems (2^n, 2^(n²), 2^(n³), etc.)

CONSTTIME    DTime(1) = NTime(1)                             Constant time problems

DLOGTIME     DTime(log2(n))                                  Deterministic logarithmic time problems

NLOGTIME     NTime(log2(n))                                  Nondeterministic logarithmic time problems

P            DTime(n^O(1))                                   Deterministic polynomial time problems

NP           NTime(n^O(1))                                   Nondeterministic polynomial time problems

EXPTIME      DTime(2^(n^O(1)))                               Deterministic exponential time problems

NEXPTIME     NTime(2^(n^O(1)))                               Nondeterministic exponential time problems

Source: Created on behalf of IU (2020).

It is proven that the following relationships are true for any big O approximation function
f(n).

DTIME(f(n)) ⊆ NTIME(f(n))
DSPACE(f(n)) ⊆ NSPACE(f(n))
DTIME(f(n)) ⊆ DSPACE(f(n))
NTIME(f(n)) ⊆ NSPACE(f(n))
NTIME(f(n)) ⊆ DSPACE(f(n))
NSPACE(f(n)) ⊆ DTIME(2^f(n))
NTIME(f(n)) ⊆ DTIME(2^f(n))
NSPACE(f(n)) ⊆ DSPACE(f(n)²)

These properties also lead to the following hierarchy of complexity classes.

CONSTTIME ⊆ REG ⊆ L ⊆ NL ⊆ P ⊆ NP ⊆ PSPACE ⊆ EXPTIME ⊆ NEXPTIME
⊆ EXPTIME ⊆ NEXPTIME

166
The attention of readers is drawn to the use of the ⊆ symbol above. This is because it is
still an open question whether that ⊆ symbol should be replaced by the ⊂ symbol or even
by the = symbol in certain instances. For example, it is well proven that P ⊆ NP, but it is
an open question whether P ⊂ NP or P = NP. On the other hand, Vega (2016) recently
claimed that NL = P.

Each of the identified complexity classes has numerous computational problems that
cannot all be explicitly listed. The following table describes a few examples of
computational problems for some of the above identified complexity classes.

Table 4: Computational Problems (Examples)

Class Problem

REG Deciding whether or not a given natural number (in the decimal number system) is a
multiple of ten

L Deciding whether or not all opened brackets are well closed in a string of brackets

NL Deciding whether or not there is a path between two given points of a given directed
graph

P Identifying the complement of a given graph

NP Factorizing a given natural number

Source: Created on behalf of IU (2020).

6.2 NP-Completeness
It seems opportune to introduce this section with the example of the well-known traveling
salesperson and Hamilton cycle problems because of the central role of the notion of
reducibility in the conceptualization of NP-completeness.

The Traveling Salesperson Problem (TSP)

The formulation of the traveling salesperson problem is quite simple. It seeks to deter-
mine whether a person can travel from a point of origin and back to the same point after
visiting each city of a given network exactly once, and without exceeding a given maxi-
mum total distance. The following figure is an example that represents five cities denoted
by A, B, C, D, and E and their distances.

167
Figure 90: Traveling Salesperson Problem Example Graph

Source: Created on behalf of IU (2020).

The ABCDEA route has a total distance of 10 while the ACEBDA has a total distance of 13,
but the total distance of the AECDBA route is only 8. This shows that there is a solution
here for a maximum distance of 8.

The Hamilton Cycle Problem (HCP)

The Hamilton cycle problem seeks to determine whether a given graph has a path from a
point of origin and back to the same point (cycle) after visiting each node of the graph
exactly once. For example, the first graph below does not have a Hamiltonian cycle, while
ADECBA, DECBAD, ECBADE, CBADEC, and BADECB are Hamiltonian cycles for the second
graph.

Figure 91: Decision Hamilton Cycle Problem Examples Graphs

Source: Created on behalf of IU (2020).

168
The Concept of Reducibility

Although they both deal with graphs, TSP and HCP are two different computational prob-
lems. In fact, the graph of the former is a fully connected graph but that is not necessarily
the case for the latter. Similarly, the first problem aims to optimize distances while the sec-
ond one does not involve any distance concept. Let us transform the two graphs of the
HCP by assigning a value of one to the existing links and by creating new links with a value
of two so that each of these graphs can be fully connected.

Figure 92: Reducing Hamilton Cycle Problem Examples Graphs

Source: Created on behalf of IU (2020).

Having in mind that each of the graphs has five nodes, let us determine for both whether a
person can travel from a point of origin and back to the same point after visiting each
node exactly once, without exceeding a maximum total distance of five. The answer is no
for the first graph but yes for the second.

What has just happened? An instance H of the HCP has been transformed or reduced into
an instance T of the TSP so that the answer of H can be directly derived from the answer
of T. In general terms, any instance of the HCP can be reduced into an instance of the TSP
by assigning a value of one to the existing links of the graph of the HCP and by creating
new links with a value of two in order for that graph to become fully connected. The big O
approximation of the time efficiency of the above described transformation or reduction is
equal to O(n2) (polynomial reduction) where n is the number of nodes of the graph of the
HCP. This is because such a reduction simply assigns relevant values in the n x n matrix
that represent the HCP graph. The general definition of reducibility states that a problem
PRO is reducible to another problem QUE if and only if each instance p of the problem
PRO can always be transformed into an instance q of the problem QUE such that the sol-
ution of p is directly derived from the solution of q. We have described how to reduce any
instance of the HCP into an instance of the TSP. There is also an algorithm to reduce
each instance of the TSP into an instance of the HCP.
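The described O(n²) reduction can be sketched in JavaScript. The function name and the adjacency-matrix input format are illustrative assumptions, not part of the course material.

```javascript
// Reduce an HCP instance to a TSP instance: existing links get
// distance 1, missing links get distance 2, so the graph becomes
// fully connected. A Hamiltonian cycle exists if and only if the
// resulting TSP instance has a tour of total distance n.
function reduceHcpToTsp(adj) {
  const n = adj.length;
  const dist = [];
  for (let i = 0; i < n; i++) {
    dist.push([]);
    for (let j = 0; j < n; j++) {
      dist[i].push(i === j ? 0 : adj[i][j] ? 1 : 2); // n x n assignments: O(n^2)
    }
  }
  return dist;
}
```

For a three-node graph with edges 0-1 and 1-2 but no edge 0-2, reduceHcpToTsp([[0,1,0],[1,0,1],[0,1,0]]) yields [[0,1,2],[1,0,1],[2,1,0]].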

169
NP-Complete and NP-Hard Problems

Having in mind that the NP complexity class represents the class of nondeterministic poly-
nomial time computational problems, NP-complete problems are the subclass of the most
difficult decision problems (with a yes or no answer) of that class. For a given complexity
class C, we define the C-complete complexity class as the subclass of the most difficult
problems of C.

It is well established, for example, that the TSP is an NP-complete problem in the sense that
it is one of the most difficult NP problems. An interesting question is to determine whether
a given new problem is NP-complete. This requires proving that every instance of such a
problem can be reduced in polynomial time to an instance of a well-known NP-complete
problem and vice versa. This is, for instance, the case of the HCP that was proven to be
reducible to TSP in polynomial time, while it is also established that TSP is reducible to
HCP in polynomial time. In other words, the HCP is also NP-complete. Moreover, once it is
proven that for a given problem H, an existing NP-complete problem is reducible in
polynomial time to H, that problem H is said to be NP-hard.

NP-hard problems: The polynomial time reduction of an existing NP-complete problem to
another problem makes that other problem NP-hard.

For instance, the previous version of the TSP is a decision problem in the sense that its
answer is a yes or a no. However, another version of the TSP exists. It is the optimization
version that consists of determining the shortest path from a city of origin and back to the
same point after visiting each city only once for a given network. The optimization version
of the TSP is an NP-hard problem, even though it is neither a decision problem nor an
NP-complete problem. Similarly, most optimization versions of existing NP-complete
problems are NP-hard problems. Here are a few examples of some well-known NP-complete
problems (Aho, 1977; Varadharajan, 2020; Zapata-Rivera & Aranzazu-Suescun, 2020):

• the satisfiability problem (SAT) to determine whether a given Boolean formula can be
satisfied
• the partitioning problem to determine whether a given set of positive integers can be
partitioned into two complementary subsets such that the sum of the elements of one
subset is equal to the sum of the elements of the other subset
• the graph coloring problem to determine whether it is possible to assign different colors
to each node of a graph while ensuring that two directly connected nodes do not have
the same color and the total number of colors used does not exceed a given value

6.3 P = NP?
A hierarchy of inclusion of the various complexity classes was presented in the first section
of this unit where we saw that P ⊆ NP. What remains an open question is whether P =
NP. That question is so interesting for computer scientists that there is even a one million
dollar reward for the first person to solve it (Blank, 2002).

The P versus NP question is also interesting because it is currently approached on the


basis of the “educated” perceptions and beliefs of computer theory scientists instead of
through systematic facts and proofs. In fact, surveys conducted by Gasarch (2002; 2012)

170
found that the majority of eminent computer theory scientists think that P ≠ NP. More
precisely, Gasarch (2012) found that 83 percent of the surveyed scientists thought that P
≠ NP, an increase from 61 percent in Gasarch (2002). Similar proportions are reported in
both surveys about the beliefs of the respondents that a proof will be found for this ques-
tion before the beginning of the twenty-second century. Interestingly, in 2002, 22 percent
of the respondents reported that they did not know whether or not P = NP, but that
number shrunk to a mere 0.6 percent by 2012. The 2002 survey also revealed that almost a
fifth of respondents reported that they did not know when this problem would be solved,
and almost the same proportion of respondents thought that it will be solved after 2100.
In contrast, 41 percent of the respondents in 2012 believed that a solution would be found
for this problem after 2100.

These results seem to indicate that it is currently widely believed that P ≠ NP, but one of
the follow-up questions is about the consequences of the opposite statement. In other
words, how will a P = NP proven statement challenge the status quo? If there is a proof
that P≠NP, then such a proof can be used to solve other relevant open questions. Never-
theless, a P = NP proof will re lease a gigantic algorithmic powe r in the se nse that the
e ntire e xisting class of NP proble ms that are curre ntly conside re d as intractable (not real-
istically solvable by an efficient algorithm) will become tractable (realistically solvable by
an efficient algorithm). A P=NP proof will, however, destroy existing cryptography sys-
tems that currently rely on the fact that it is hard to break the primary and private keys
because the factorization problem is an NP problem (Rodó, 2010).

SUMMARY
This unit began with the presentation of the big O time and space
approximation model as the commonly used complexity model for the
measurement of the efficiency of algorithms and for the analysis of the
complexity of computational problems. This presentation is illustrated
with relevant examples for the best, average, and worst case scenarios,
as well as for the identification of the complexity classes of computa-
tional problems. The rest of the unit is dedicated to the concept of NP-
completeness, with the presentation of the central concept of reducibil-
ity with examples of both NP-complete problems and NP-hard
problems. The unit ends with a presentation of the perceptions of emi-
nent computer theory scientists on the question of whether P = NP.

171
UNIT 7
ADVANCED ALGORITHMICS

STUDY GOALS

On completion of this unit, you will have learned …

– the fundamental constructs of parallel algorithms.


– how to program basic probabilistic algorithms.
– core quantum computing concepts.
– the five steps of Shor’s algorithm.
7. ADVANCED ALGORITHMICS

Introduction
The world of computing is an ever evolving domain where novel and diverse paradigms
are regularly proposed in the quest for new solutions to existing computational problems
and for the advancement of computing sciences and technologies. This unit is an intro-
duction to alternative advanced algorithmic approaches such as parallel algorithms, prob-
abilistic algorithms, and quantum computing. These interesting approaches are consid-
ered important alternative solutions to some of the most difficult computational
problems. The unit will present Shor’s algorithm as an important application of quantum
computing. It will also use other relevant examples to outline the fundamental concepts of
parallel algorithms and introduce probabilistic programming.

7.1 Parallel Computing


This section is an introduction to parallel algorithms with an overview of their general con-
cepts together with suitable NodeJs parallel programming examples.

Parallel Algorithms General Concepts

Let us recall that an algorithm is a set of steps to be executed as a solution to a given com-
putational problem, and let us denote such steps as s0, s1, s2, …, and sn. The most common
approach for the execution of such steps is the sequential or serial execution whereby s0 is
executed first, then s1, s2, and so on. The sequential approach works well in instances
where the execution of one step depends on the outcome of the execution of the preceding
one. There are, however, other instances where two steps can be executed at the same
time or in parallel simply because they use different data inputs. The main advantage of
this parallel approach is its time efficiency. In fact, the time taken for the parallel execution
of two steps is equal to the time taken for the execution of the longer of these two steps.
On the other hand, the time taken for the sequential execution of two steps is equal to the
sum of the execution times of the two steps. It is also important to note that sequential execution is
unavoidable in situations where the executing computer only has one processor as was
the case in the early days of computing. On the contrary, the emergence of multiprocessor
computers and of distributed computing makes it possible for the different processors of
the same computer (or from different machines) to execute different instructions in paral-
lel. Nowadays, these multiprocess infrastructures are cleverly used by the parallel features
of many modern programming languages.
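To make the timing argument concrete, here is a small JavaScript sketch (our own illustration, not one of the course book's programs) that models the execution time of independent steps, given only their durations:

```javascript
// Model: each step is described only by its duration (in ms).
// Sequential execution costs the sum of the durations, while the
// parallel execution of independent steps costs the longest one.
function sequentialTime(durations) {
  return durations.reduce((total, d) => total + d, 0);
}

function parallelTime(durations) {
  return Math.max(...durations);
}

const steps = [30, 50]; // two independent steps s0 and s1
console.log(sequentialTime(steps)); // 80
console.log(parallelTime(steps));   // 50
```

With two independent 30 ms and 50 ms steps, the sequential cost is 80 ms while the parallel cost is only 50 ms, exactly the "max versus sum" contrast described above.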

The parallel execution of instructions by the different processors of the same computer
can be referred to as multiprocessor parallel programming. This is different from
multicomputer parallel programming, where different computers execute in parallel the
different processes of the same task. In both cases, parallel programming must take into
account situations where parallel tasks are not totally independent. In fact, parallel tasks
sometimes rely on, synchronize with, and communicate with one another, either through
message-passing or through the use of a shared memory. Such synchronization often
requires the use of semaphores or locks that are acquired and released in an orderly
fashion for the management of situations such as mutual exclusion, critical sections, and
deadlocks, especially to preserve the integrity of a shared memory. Besides
synchronization, partitioning and scheduling are two other important aspects of parallel
algorithms.

Semaphore
A semaphore is a data structure that is acquired and on which one can wait for the
management of a shared resource.

Deadlocks
These occur when a process is holding a shared resource while waiting for a shared
resource held by another process that is also waiting for a shared resource.

Data partitioning versus function partitioning

Let us suppose that we have some input data d for a given algorithm f that we would like
to parallelize with a certain number of parallel tasks. A key question is to decide what will
constitute a parallel task. One method is the data partitioning approach, which consists of
dividing the original dataset d into many data partitions or parts such that a parallel task
can be created for each data partition. Here, the different parallel tasks are doing the same
job but on different data. Another method, the function partitioning approach, consists of
dividing or partitioning the original algorithm f into different sub-functions that can be
executed in parallel on the original dataset d. In both cases, the number of partitions
determines the granularity of the parallel algorithm, which can be classified either as
coarse-granularity for large parallel tasks or as fine-granularity for small parallel tasks.
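The two partitioning strategies can be sketched in plain JavaScript (the function names and the two sample sub-functions are ours, for illustration only): data partitioning splits the input d into chunks that would all run the same function, while function partitioning splits f into sub-functions that would all read the same d.

```javascript
// Data partitioning: divide d into parts; each part would become one
// parallel task running the same function.
function dataPartition(d, parts) {
  const size = Math.ceil(d.length / parts);
  const chunks = [];
  for (let i = 0; i < d.length; i += size) chunks.push(d.slice(i, i + size));
  return chunks;
}

// Function partitioning: split the algorithm into sub-functions; each
// one would become a parallel task running on the whole dataset d.
const subFunctions = [
  (d) => Math.min(...d), // hypothetical sub-function 1
  (d) => Math.max(...d), // hypothetical sub-function 2
];
function functionPartition(d) {
  return subFunctions.map((f) => f(d));
}

const d = [7, 1, 9, 4, 6, 2];
console.log(dataPartition(d, 3));  // [[7, 1], [9, 4], [6, 2]]
console.log(functionPartition(d)); // [1, 9]
```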

Scheduling

Once the different parallel tasks have been designed, they must be assigned to different
processors or computers for their parallel execution. This assignment process is also
known as scheduling, whereby each processor or computer is assigned a schedule of
parallel tasks. The assignment can either be dynamic or static. In dynamic scheduling, the
assignment of tasks to processors or to computers is decided during the execution of the
parallel tasks. In contrast, static scheduling allows program designers to specify within
their parallel programs how the parallel tasks will be assigned to the different processors
or computers.

Parallel Programming Building Blocks and Examples

There are a number of parallel building blocks or features that are available in modern
programming languages for the design of parallel programs. The parallel programming
features presented here are mostly applicable to data parallel algorithms. Some of these
features include elementwise operations, reductions, and broadcasting. These features
will be presented in the following subsections with relevant NodeJs illustrative examples
using the ParallelJs library. Other parallel JavaScript libraries include Hamsters.js,
Threads.js, and Parallel.es (Reiser, 2017). ParallelJs has a relatively small number of func-
tions but these functions are powerful enough to implement key parallelism features in
JavaScript. Readers are advised to install the ParallelJs library, for instance, with the help
of the npm install paralleljs command should they want to explore and test the
given examples in NodeJs.

Elementwise operations

Elementwise operations allow the parallel execution of the same operation on each of the
data partitions. This is visible in the following ParallelJs example where the
cubeOperation operation is carried out in parallel by each element of the lst sequence.
The final output of the program appears below it.

Figure 93: ParallelJs Map Operation First Example with Output Screenshot

Source: Created on behalf of IU (2020).

The first line of this program simply indicates that the program requires the use of the Par-
allelJs library. The second line declares an array lst with four integers. The list constitutes
the data of the program, and it is partitioned into four elements. The third line of the pro-
gram creates a ParallelJs object p for the parallel processing of the lst data. The fourth
line is a modification of the behavior of the log method so that it can simply print the
content of the arguments array that contains the current results of the parallel processing
of the partitioned data. The cubeOperation function simply returns the value of the cube
of its parameter. The last line of the program uses the ParallelJs map function for the paral-
lel execution of the cubeOperation function on each of the elements of the p ParallelJs
object such that the subsequent array of results can be printed by the log function.
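Since the program itself is only reproduced as a screenshot, the following plain JavaScript sketch mirrors its logic sequentially. The name cubeOperation comes from the text; the concrete values in lst are our assumption, and the ParallelJs version would wrap lst in new Parallel(lst) and chain p.map(cubeOperation).then(log).

```javascript
// Sequential analogue of the ParallelJs map example: apply
// cubeOperation elementwise to each data partition (here, each element).
const lst = [1, 2, 3, 4]; // four integers, i.e., four data partitions

function cubeOperation(n) {
  return n * n * n;
}

function log(results) {
  console.log(results);
}

log(lst.map(cubeOperation)); // [1, 8, 27, 64]
```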

Let us consider another, slightly more elaborate, example to illustrate the use of
elementwise operations by the ParallelJs map function: the pairwise multiplication of a
row of values by a column of values. Assuming, for example, that the elements of a row
are presented from left to right and those of the column from top to bottom, the pairwise
multiplication of the row [1,2,3,4] by the column [5,6,7,8] will yield [(1*5), (2*6), (3*7),
(4*8)], which is equal to the array [5, 12, 21, 32].

A screenshot of the output of the following program is displayed below that program for
input row [1,2,3,4] and for input column [5,6,7,8]. Readers are advised to test the program
and trace it.

Figure 94: ParallelJs Map Operation Second Example

Source: Created on behalf of IU (2020).

Figure 95: Output Screenshot of the Second ParallelJs Map Operation Example

Source: Created on behalf of IU (2020).

The first two lines of the above program simply declare the two libraries that are used by
the program. Let us recall that the ParallelJs object’s constructor uses an array of values as
its main parameter, but our data is made up of two arrays which respectively represent a
row of values and a column of values. One way to merge these two arrays into a single
array is to pair their elements such that the first element of the row is paired with the first
element of the column, the second element of the row is paired with the second element
of the column, the third element of the row is paired with the third element of the column,
and so on. Such pairs can be constructed as objects with two attributes where the first
attribute is the value from the row and the second one is the value for the column.

Returning to our example on the pairwise multiplication of the row [1,2,3,4] by the column
[5,6,7,8], the pairwise alignment of these two arrays is equal to the array [{rowV:1,
colV:5}, {rowV:2, colV:6}, {rowV:3, colV:7}, {rowV:4, colV:8}].

This array has four objects which will be processed in parallel by the ParallelJs map ele-
mentwise operation as applied to the pMultRC function. It is easy to see that the pMultRC
function takes in an object that is made up of a rowV attribute and a colV attribute, and it
returns the multiplication of the value of rowV by the value of colV. As for the
pairRoAndColValues function, its role is to perform and return the pairwise alignment
of its two input arrays. Readers are also invited to write the sequential versions of the dif-
ferent examples as a way of comparing parallel programming with sequential program-
ming.
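Since that program is also only reproduced as screenshots, here is a sequential JavaScript sketch of its two functions. The names pairRoAndColValues and pMultRC are taken from the text; the bodies are our reconstruction, and the ParallelJs version would process the array of pairs with p.map(pMultRC).

```javascript
// Pair each row value with the matching column value, producing
// objects with rowV and colV attributes, as described in the text.
function pairRoAndColValues(row, col) {
  return row.map((v, i) => ({ rowV: v, colV: col[i] }));
}

// Multiply one rowV/colV pair; ParallelJs would run this elementwise
// on every pair in parallel.
function pMultRC(pair) {
  return pair.rowV * pair.colV;
}

const pairedUp = pairRoAndColValues([1, 2, 3, 4], [5, 6, 7, 8]);
console.log(pairedUp.map(pMultRC)); // [5, 12, 21, 32]
```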

Reductions

The result of the parallel execution of partitioned data is usually stored in an array (as
seen above), and the purpose of the reduction operation is to perform an action that can
accumulate or reduce all the values of that array into a single scalar or atomic value. For
example, the result of the sum reduction of the array [10, 20, 30, 40] is 100. We use the
term sum reduction here to say that we are reducing the array [10, 20, 30, 40] to a single
scalar value by summing all its values.

The combination of the reduce operation with the map operation is highly credited for its
ability to optimize parallelism, for example, for graphics programming: “Map Reduce is an
ideal abstraction for programming general purpose computations on the graphics pro-
cessor. Structuring a computation as stages of Map Reduce operations ensures that maxi-
mal parallelism is expressed” (Catanzaro et al., 2008, Section 3).

Returning to our example of the multiplication of a row by a column, and keeping in mind
that the program has already shown us how to store the pairwise multiplication of a row
and a column into an array, the sum reduction of that array is the final result of the multi-
plication of the row by the column. This is demonstrated by making the following changes
in the above program:

• Add the following function to your program:


function addAll(d) {return d[0] + d[1];}
• Replace console.log('Pairwise multiplication of the row and the
column') with the following instruction:

console.log('Multiplying row and column: The final result is');
• Replace p.map(pMultRC).then(log) with the following instruction:
p.map(pMultRC).reduce(addAll).then(log);

Below is a screenshot of the output of the updated program using the input row [1,2,3,4]
and the input column [5,6,7,8]. This example shows that the multiplication of the row
[1,2,3,4] by the column [5,6,7,8] is the sum of all the values of the array [(1*5), (2*6), (3*7),
(4*8)], in other words 5 + 12 + 21 + 32, which is ultimately equal to 70. The sum is the
outcome of the addAll sum reduction operation as applied to the array resulting from the
pMultRC pairwise multiplication operation.

Figure 96: Output Screenshot of the ParallelJs Map Reduce Example

Source: Created on behalf of IU (2020).
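Sequentially, the map-reduce pipeline described above amounts to the following sketch. The names pMultRC and addAll come from the text; note that ParallelJs hands its reduce callback a two-element chunk (hence the text's d[0] + d[1]), whereas Array.prototype.reduce passes the two values as separate arguments, which is the adaptation made here.

```javascript
// Map step: pairwise products; Reduce step: accumulate them into one
// scalar, the sum reduction of the mapped array.
function pMultRC(pair) {
  return pair.rowV * pair.colV;
}

function addAll(a, b) {
  return a + b;
}

const pairs = [
  { rowV: 1, colV: 5 }, { rowV: 2, colV: 6 },
  { rowV: 3, colV: 7 }, { rowV: 4, colV: 8 },
];
console.log(pairs.map(pMultRC).reduce(addAll)); // 70
```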

Broadcasting

The purpose of broadcasting operations is to distribute a given value to the different
parallel processes for their internal use. We will illustrate that concept with the following
example of the multiplication of two matrices.

Figure 97: Paralleljs Broadcasting Operation Example (Start)

Source: Created on behalf of IU (2020).

Figure 98: Paralleljs Broadcasting Operation Example (End)

Source: Created on behalf of IU (2020).

In the example above, the first (left) matrix is constant, and it is broadcast to the different
processes that will perform the multiplication of that matrix by the different columns of
the second (right) matrix. This broadcasting operation is performed by the ParallelJs
require method, which distributes the outcome of the leftMatr function to all future
parallel processes for their internal use. It is important to note that the purpose of the
leftMatr function is simply to return the value of the left matrix in order for it to be
broadcast to the parallel processes that will use it in the multByLeftMatr function. The
use of the ParallelJs elementwise map operation allows the creation of a set of parallel
processes where the left matrix multiplies each column of the right matrix:

                 [2 3]
[0 1 2 3 4]      [4 5]   [ 37  54]
[5 6 7 8 9]  ×   [6 7] = [132 184]
[1 2 3 4 5]      [7 9]   [ 56  80]
                 [0 2]

Below is a screenshot of the output of the above program for the matrix multiplication
example. Readers are advised to test and trace the program with this example as well as
with the others.

Figure 99: ParallelJs Broadcasting Operation Example

Source: Created on behalf of IU (2020).
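A sequential sketch of the same computation follows (the function name multiply is ours). In the broadcasting version, left would be distributed once to every parallel task and each task would handle one column of right; here the whole product is computed in a single loop nest.

```javascript
// Multiply an m x n matrix by an n x p matrix. In the broadcast
// version, `left` is the constant value sent to all parallel tasks.
function multiply(left, right) {
  return left.map((row) =>
    right[0].map((_, j) =>
      row.reduce((sum, v, k) => sum + v * right[k][j], 0)
    )
  );
}

const left = [
  [0, 1, 2, 3, 4],
  [5, 6, 7, 8, 9],
  [1, 2, 3, 4, 5],
];
const right = [
  [2, 3],
  [4, 5],
  [6, 7],
  [7, 9],
  [0, 2],
];
console.log(multiply(left, right)); // [[37, 54], [132, 184], [56, 80]]
```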

7.2 Probabilistic Algorithms


The purpose of probabilistic algorithms is to resolve computational problems through the
creation and the processing of random data. Such data randomizations are based on the
use of common probability distributions. This section is an introduction to the
programming of probabilistic algorithms using the WebPPL JavaScript probabilistic
language. Other probabilistic programming languages include Hakaru, Augur, R2, Figaro,
IBAL, PSI, Church, Anglican, BLOG, Turing.jl, BayesDB, Venture, Probabilistic-C, CPProb,
Biips, LibBi, Birch, STAN, JAGS, and BUGS (van de Meent et al., 2018).

Probabilistic algorithms
These algorithms create and process random data for various probability distributions.

WebPPL can be installed in NodeJs with the npm install -g webppl command, and
we will use the following HelloTest.wppl to illustrate how to test it. Readers are advised
to install WebPPL, type the following few lines in a text editor, and save their file under the
name HelloTest.wppl before executing the webppl HelloTest.wppl command.

Figure 100: WebPPL Hello Mam or Dad Example

Source: Created on behalf of IU (2020).

The following three screenshots of the output of three executions of the above program
illustrate the randomness of the data generated by that program. In fact, the greeting
function randomly returns either “Hello” or “Hi” with an equal “fifty-fifty” probability or
chance. Similarly, the aParent function randomly returns either “Mam” or “Dad” with an
equal “fifty-fifty” probability or chance.

Figure 101: Output Screenshot of the WebPPL Hello Mam or Dad Example

Source: Created on behalf of IU (2020).
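The same random behavior can be sketched in plain JavaScript with Math.random. The names flip, greeting, and aParent follow the WebPPL example; the bodies are our analogue of the screenshotted program.

```javascript
// flip(p) is true with probability p; p defaults to 0.5, as in WebPPL.
function flip(p = 0.5) {
  return Math.random() < p;
}

// Each call randomly returns one of two strings with equal probability.
function greeting() {
  return flip() ? "Hello" : "Hi";
}

function aParent() {
  return flip() ? "Mam" : "Dad";
}

console.log(greeting() + " " + aParent()); // e.g., "Hello Dad"
```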

First, let us review the programming of the listed probability distributions. We apply
these distributions later to the example of visualizing the average-case efficiency of an
algorithm. Next, we introduce the programming of common descriptive statistics used by
probabilistic algorithms, as well as hypothesis testing.

Generation of Random Data for Common Probability Distributions

The following probability distributions are introduced in this subsection: Bernoulli, bino-
mial, geometric, and uniform distributions.

The Bernoulli distribution

The Bernoulli distribution is that of a single event whose outcome is randomly chosen
with a given probability between two possible values. A common example of such a ran-
dom choice is the flip of a coin that has a 50 percent chance (probability) to land on either
side. It is, of course, always possible to set the value of the probability of a given distribu-
tion to any number between zero and one. The flip function allows WebPPL program-
mers to generate data according to the Bernoulli distribution as was the case for the above
HelloTest.wppl example.

That flip function receives a probability value as a parameter but that value is assumed
to be equal to 0.5 when absent. The following example uses the flip function inside the
random0To4 function to generate a random number between 0 and 4.

Figure 102: WebPPL Bernoulli Distribution First Example

Source: Created on behalf of IU (2020).

Here, the flip function can only return 0 or 1, so adding four calls of the flip function
can yield any natural number between 0 and 4.

This program can be slightly amended to generate six random natural numbers that are
each between 0 and 4.

Figure 103: WebPPL Bernoulli Distribution Second Example

Source: Created on behalf of IU (2020).
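A plain JavaScript analogue of these two Bernoulli examples can be sketched as follows (random0To4 and the repeat helper follow the WebPPL naming; the bodies are ours).

```javascript
// flip() returns 0 or 1, each with probability 0.5.
function flip(p = 0.5) {
  return Math.random() < p ? 1 : 0;
}

// Four flips of 0 or 1 sum to a natural number between 0 and 4.
function random0To4() {
  return flip() + flip() + flip() + flip();
}

// repeat(n, f) mirrors WebPPL's repeat: n independent samples of f.
function repeat(n, f) {
  return Array.from({ length: n }, () => f());
}

console.log(repeat(6, random0To4)); // e.g., [2, 3, 1, 4, 2, 0]
```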

The binomial distribution

The binomial distribution represents the number of successes in a series of Bernoulli
events, assuming that the outcome of each Bernoulli event is either a success (with a
given probability) or a failure (with the complementary probability). In the following
program, the b0To10 variable uses the Binomial constructor to represent the number of
successes for a series of ten "fifty-fifty" chance Bernoulli events.

The purpose of the randomSampleB0To10 function is to generate a sample of the
binomial object b0To10 as a single random value between 0 and 10. Seven such random
values are generated by the randomSampleOf7ValuesB0To10 function.

Figure 104: WebPPL Binomial Distribution Example

Source: Created on behalf of IU (2020).
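The binomial sampling can be sketched in plain JavaScript by counting successes over ten Bernoulli flips (the two function names follow the WebPPL example; the bodies are our analogue of the Binomial constructor's sampling).

```javascript
function flip(p = 0.5) {
  return Math.random() < p ? 1 : 0;
}

// One sample of Binomial(n = 10, p = 0.5): the number of successes
// among ten fifty-fifty Bernoulli events.
function randomSampleB0To10() {
  let successes = 0;
  for (let i = 0; i < 10; i++) successes += flip();
  return successes;
}

function randomSampleOf7ValuesB0To10() {
  return Array.from({ length: 7 }, randomSampleB0To10);
}

console.log(randomSampleOf7ValuesB0To10()); // e.g., [5, 4, 7, 3, 6, 5, 2]
```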

The geometric distribution

The geometric distribution represents the number of attempts by a series of Bernoulli
events before its first success, assuming that the outcome of each Bernoulli event is
either a success (with a given probability) or a failure (with the complementary
probability).

WebPPL does not have a constructor for the geometric distribution. Instead, that
distribution can be translated into a recursive function that uses the flip function of the
Bernoulli distribution, as shown in the following figure.

Figure 105: WebPPL Geometric Distribution Example

Source: Created on behalf of IU (2020).

The geomF function of the program returns 0 when the Bernoulli event is successful; other-
wise, it increments the number of unsuccessful attempts by one before starting a new Ber-
noulli event.

We have biased each Bernoulli event by making the randomG function call the geomF func-
tion with an argument of 0.1 so that it can only have a 10 percent probability of success.
This was done for us to witness higher values for the geomF function. Here is a screenshot
of an example of execution of the above program as yielded by its repeat(10,
randomG) instruction.

Figure 106: Output Screenshot of WebPPL Geometric Distribution Example

Source: Created on behalf of IU (2020).
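A plain JavaScript analogue of geomF and randomG (names from the text, bodies ours) makes the recursion explicit: each failed flip adds one attempt before recursing.

```javascript
function flip(p = 0.5) {
  return Math.random() < p;
}

// geomF counts the failed Bernoulli events before the first success;
// a success immediately returns 0.
function geomF(p) {
  return flip(p) ? 0 : 1 + geomF(p);
}

// Biased to a 10 percent success probability, as in the text, so that
// higher counts are likely to be observed.
function randomG() {
  return geomF(0.1);
}

console.log(Array.from({ length: 10 }, randomG)); // e.g., [4, 12, 0, 7, ...]
```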

The uniform distribution

The uniform distribution can either be discrete or continuous. A discrete uniform distribu-
tion represents a single event whose outcome is a randomly chosen integer or value from
a given range of values. As for the continuous uniform distribution, it represents a single
event whose outcome is a randomly chosen real number from a given range of values.
WebPPL uses the RandomInteger constructor for discrete uniform distributions, while
the Uniform constructor is used for continuous uniform distributions, as visible in the
following example.

Figure 107: WebPPL Uniform Distribution Example

Source: Created on behalf of IU (2020).

This program chooses two random numbers: one random integer between 0 and 50 and
one random real value between -3 and 1.

Figure 108: Output Screenshot of WebPPL Uniform Distribution Example

Source: Created on behalf of IU (2020).
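The two kinds of uniform sampling can be sketched in plain JavaScript (the function names are ours; they play the roles of WebPPL's RandomInteger and Uniform constructors):

```javascript
// Discrete uniform: an integer chosen uniformly from {0, ..., n - 1}.
function randomInteger(n) {
  return Math.floor(Math.random() * n);
}

// Continuous uniform: a real number chosen uniformly from [a, b).
function uniform(a, b) {
  return a + Math.random() * (b - a);
}

console.log(randomInteger(51)); // an integer between 0 and 50
console.log(uniform(-3, 1));    // a real number between -3 and 1
```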

Descriptive and Inferential Statistics

This subsection is an introduction to the programming of basic descriptive statistics such
as the computation of means, variances, and standard deviations. Inferential hypothesis
testing is also introduced in this subsection with the example of the computation of the
Bayes factor.

Descriptive statistics

Means, variances, and standard deviations are common in probabilistic algorithms. The
following program is an illustration of how to compute these in WebPPL.

Figure 109: WebPPL Descriptive Statistics Example (Start)

Source: Created on behalf of IU (2020).

Figure 110: WebPPL Descriptive Statistics Example (End)

Source: Created on behalf of IU (2020).

Figure 111: Output Screenshot of WebPPL Descriptive Statistics Example

Source: Created on behalf of IU (2020).
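These three statistics can be sketched in plain JavaScript. The name listMean is used in the WebPPL programs later in this section; listVar and listStdev are our naming, and the variance here is the population variance (dividing by the list length).

```javascript
// Mean, (population) variance, and standard deviation of a number list.
function listMean(xs) {
  return xs.reduce((s, x) => s + x, 0) / xs.length;
}

function listVar(xs) {
  const m = listMean(xs);
  return xs.reduce((s, x) => s + (x - m) * (x - m), 0) / xs.length;
}

function listStdev(xs) {
  return Math.sqrt(listVar(xs));
}

const values = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(listMean(values));  // 5
console.log(listVar(values));   // 4
console.log(listStdev(values)); // 2
```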

Hypothesis testing and Bayesian inference

It is possible to customize a given probability distribution according to the context. For
example, in the following program, the nullHypM variable simply represents a binomial
distribution for 200 attempts with a probability of 0.5. As for the altHypM variable, it
represents a customized binomial distribution for 200 attempts but whose probability is a
randomly chosen real number between 0 and 1 (see p = uniform(0, 1)). The value of
the altHypM variable is inferred from 100 samples of the customized binomial
distribution.

Figure 112: WebPPL Inferential Statistics Example

Source: Created on behalf of IU (2020).

This program represents the simple scenario whereby a coin is flipped 200 times with 115
successes. Such observed data may lead to the null hypothesis nullHypM that the proba-
bility of one of the 200 flips to succeed is 0.5, as represented by the Binomial({p: 0.5,
n: attempts}) distribution where the value of the attempts variable is 200. An alterna-
tive hypothesis with the name of altHypM counters the above nullHypM hypothesis by
stating that the probability of one of the 200 flips to succeed can take any value between 0
and 1, as explained above. The Bayes factor can be used to decide which of the hypothe-
ses is supported by the observed data, i.e., the actual observed number of successes. A
Bayes factor above 1.6 indicates that the hypothesis on its numerator (null hypothesis) is
the one that should be accepted, while a Bayes factor below 1.6 indicates that the hypoth-
esis on its denominator (alternative hypothesis) is the one that should be accepted. This
program was executed a few times, and its output always yielded a value below 1.6 (see
screenshots below), indicating that the null hypothesis of p=0.5 cannot be accepted.

Figure 113: Output Screenshot of WebPPL Inferential Statistics Example

Source: Created on behalf of IU (2020).
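The order of magnitude of this Bayes factor can be cross-checked in plain JavaScript with a closed-form shortcut (this sketch is ours, not the WebPPL inference in the figure): under the null hypothesis, the marginal likelihood of 115 successes is the Binomial(200, 0.5) probability mass, while under a Uniform(0, 1) prior on p, the marginal likelihood of every success count from 0 to 200 is 1/201.

```javascript
// log(k!) summed directly; n = 200 is small enough for this to be fine.
function logFactorial(k) {
  let s = 0;
  for (let i = 2; i <= k; i++) s += Math.log(i);
  return s;
}

// Binomial(n, 0.5) log probability mass at k successes.
function logBinomialPMF(k, n) {
  return logFactorial(n) - logFactorial(k) - logFactorial(n - k)
    + n * Math.log(0.5);
}

// Bayes factor: P(data | null) / P(data | alternative); the
// alternative's marginal likelihood is 1 / (n + 1) for every count.
function bayesFactor(k, n) {
  return Math.exp(logBinomialPMF(k, n)) * (n + 1);
}

console.log(bayesFactor(115, 200)); // roughly 1.2, below 1.6 as in the text
```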

Visualizing the Effect of Uniform Distributions on Efficiency Estimates

The concept of the average efficiency of an algorithm can be suitably understood under
the assumption that its input data is uniformly randomly distributed. We will illustrate the
suitability of the uniform random distribution with the example of a naive linear algorithm
that seeks to identify the first number that is greater than or equal to 50 in an array A of n
whole numbers. It is assumed that the value of each element of A is less than 100. In the

best case, the first element of A is greater than or equal to 50, and it will only take one step
for the algorithm to find it. In the worst case, no element of A has a value greater than or
equal to 50, and it will take n steps for the algorithm to see that. In the average case, the
data of A will be uniformly randomly distributed between 0 and 99. So, how many steps
will the algorithm take to find the searched item in the average case? In other words,
what is the efficiency of the algorithm in the average-case scenario? This question is
answered by the following WebPPL programs. The first program gives a textual output,
and the second uses plotly for the visualization of its graphical output. Both of these
programs collect the value of n in the form of an argument (argv.n) on the command line
when running webppl for their execution.

Program with a textual output

This program creates an array of n uniformly distributed random whole numbers, none
of them greater than 99 (see udpns, udp1s, and udp). The
stepsToFind1stPassF(a) function returns the first position in the array a where a value
greater than or equal to 50 is found. This function returns 0 if a does not have any value
greater than or equal to 50. Apart from the n argument, this program also uses a second
argument denoted by mode that can take three possible values: ad, mn, or mr. In the ad
mode, the program outputs the details of each randomly generated array together with
the number of steps used to find its first number with a value greater than or equal to 50.
The code is shown in the following figures.

Figure 114: WebPPL Text Version Search Example with Uniformly Distributed Input
(Start)

Source: Created on behalf of IU (2020).

Figure 115: WebPPL Text Version Search Example with Uniformly Distributed Input
(End)

Source: Created on behalf of IU (2020).
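A plain JavaScript analogue of the search function described above can be sketched as follows (the 1-based position convention, with 0 meaning "not found", is our reading of the description):

```javascript
// Position (counted from 1) of the first value >= 50 in the array a,
// or 0 if no such value exists, mirroring stepsToFind1stPassF.
function stepsToFind1stPassF(a) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] >= 50) return i + 1;
  }
  return 0;
}

console.log(stepsToFind1stPassF([10, 63, 5, 80])); // 2
console.log(stepsToFind1stPassF([10, 20, 30]));    // 0
```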

Let us trace the above program when it is in the ad mode. The following functions are
called one after the other.

• randomSearchRunsOnMoreAndMore() simply calls the following function.
• cRandomSearchRuns(argv.n) simply calls the following function.
• repeat(argv.n, steps21stPass) calls steps21stPass n times.
• steps21stPass simply calls the following function.
• randomSearchRun() creates a random array a of n values by calling udpns (a =
udpns()), calls the stepsToFind1stPassF(a) function to locate the first element of
a with a value greater than or equal to 50, and returns the value of the array and the
found location.

Assuming that the program is named unin.wppl, we can call it in the ad mode with the
webppl unin.wppl -- --n 10 --mode ad command that will yield an output similar
to the following screenshot.

Figure 116: Detailed Array Output of WebPPL Uniform Input Distribution Text Version
Example

Source: Created on behalf of IU (2020).

The mr and mn modes are slightly different from the ad mode, stemming mainly from the
following instructions:

• map(cRandomSearchRuns, builtArray(10,200)): The function
cRandomSearchRuns is called for each multiple of 10 between 1 and 200 and executes
the following instruction, where c is a multiple of 10 between 1 and 200.
• return [c, listMean(repeat(c, steps21stPass))]: Here, c random arrays of n
elements are generated. Each of these will have a calculated value of the position of the
first element greater than or equal to 50. It is the average value of these positions that is
returned here, together with the value of c.

The only difference between mr and mn is that mr rounds its results to the nearest natural
value while mn does not.

Assuming that the above program is named unin.wppl, we can call it in the mr or in the
mn mode with the webppl unin.wppl -- --n 10000 --mode mr or unin.wppl --
--n 10000 --mode mn command, respectively.

Figure 117: Real Mean Output of WebPPL Uniform Input Distribution Text Version
Example

Source: Created on behalf of IU (2020).

Figure 118: Natural Mean Output of WebPPL Uniform Input Distribution Text Version
Example

Source: Created on behalf of IU (2020).

Program with a graphical output

This subsection exclusively considers the mr mode of the above program and modifies it
with the purpose of showing a visual representation of the final array generated by that
program. Each element of that array is made up of two numbers that will be considered
the x and y coordinates of a point. These points are then visualized in a graph. In the
program above, the data of the final array targets the multiples of 10 less than or equal to
200, but the program below extends its data to the sequence of natural numbers less
than or equal to 1000.

Figure 119: WebPPL Graphic Version Search Example with Uniformly Distributed Input
(Start)

Source: Created on behalf of IU (2020).

Figure 120: WebPPL Graphic Version Search Example with Uniformly Distributed Input
(End)

Source: Created on behalf of IU (2020).

The concept of mode does not exist in the above program because it exclusively focuses
on the mr mode. The webppl 12-probabilistic-efficiency-graph.wbpl -- --n
100 > pts.js command will allow the above program (u12-probabilistic-
efficiency-graph.wbpl) to save its pts generated array of points in the pts.js file
whose pts variable can be imported by another JavaScript program. It is precisely this
importation mechanism that is used by the JavaScript program within the following HTML
Web page. For the Web page to run, the plotly open source library must be loaded with
npm install plotly.js.

This importation mechanism is used by the following program for the online plotly
drawing of the content of the pts array. The online use of plotly requires signing up on
the plotly NodeJs website, where you are issued a username and an API key. Those are
the two plotly credentials that are used in the plotly require function in the
following program. The other require instruction of that program simply imports the
pts array from the pts.js file described above so that it can transfer that array to a
sequence of points (data) to be drawn by plotly. The graph generated by this program
is denoted by uniDisAvgStp, and the actual drawing is accomplished by the plotly
plot function. Our program code is visible below.

Figure 121: Plotly Uniform Input Distribution Example

Source: Created on behalf of IU (2020).

The program above is simply executed with the node plotlyDraw.js command,
assuming that plotlyDraw.js is its name.

The printed url should contain your plotly username. Copy that url and paste it into
the address bar of your Web browser, and you will see a graph that is similar to the
following.

Figure 122: Output of Plotly Uniform Input Distribution Example (Screenshot)

Source: Created on behalf of IU (2020).

The y-axis of the graph represents the average number of steps that it took to find the first
element greater than or equal to 50 in an array of 100 values uniformly distributed
between 0 and 99. The x-axis represents the number of times that an array was generated
and searched. An interesting question is to find out why the above graph tends to follow
the line with the equation y=2.

7.3 Quantum Computing and the Shor Algorithm
A quantum computer is a computing machine that uses distinctive quantum physics and
mathematics features to solve computational problems. A historical overview of quantum
computing is hereby presented, followed by an introduction to some of the key concepts
of quantum computing. The rest of the subsection is dedicated to the overview of the five
phases of Shor’s algorithm (Lomonaco, 2000).

Genesis of Quantum Computing

According to Hidary (2019), the genesis of quantum computing can be traced back to 1979
when Paul Benioff suggested the idea of a theoretical paradigm for the construction of
quantum machines. Yuri Manin was another of the quantum computing pioneers with the
publication of his book on this topic in 1980 (Hidary, 2019). In 1981, the field of quantum
computing came to the public eye when Nobel Prize winner Feynman revealed that the
usual computing paradigm was not able to model many quantum physics features and
proposed a set of desirable functionalities for quantum computers. Some quantum com-
puting pioneers include David Deutsch, Richard Jozsa, Umesh Vazirani, Ethan Bernstein,
Daniel Simon, Seth Lloyd, and Peter Shor.

Complex Numbers, Vectors, and Matrices

Let us start by recalling a few fundamentals of complex numbers and matrices because of
their importance in quantum computing. We use the example of the complex number
c = 4 + 3i, which is made up of the real part 4 and the imaginary part 3. The complex
conjugate c̄ of 4 + 3i is the complex number 4 − 3i, and its modulus |c| is the square root
of (4² + 3²), which is equal to 5. Let us not forget that i² = −1. Quantum computing uses
matrices of complex numbers to model solutions for computational problems. Here is an
example of a matrix M of complex numbers followed by its transpose Mᵀ (the first row
becomes the first column, the second row becomes the second column, etc.), its
conjugate M̄ (each element is replaced by its conjugate), and its adjoint M† (the conjugate
of its transpose, or the transpose of its conjugate).

Matrix M:
[ −5+4i   6−7i    9  ]
[  1−2i    −3   −8i  ]

Transpose Mᵀ:
[ −5+4i   1−2i ]
[  6−7i    −3  ]
[   9     −8i  ]

Conjugate M̄:
[ −5−4i   6+7i    9  ]
[  1+2i    −3    8i  ]

Adjoint M†:
[ −5−4i   1+2i ]
[  6+7i    −3  ]
[   9      8i  ]
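These operations can be checked with a small JavaScript sketch (our own illustration, representing a complex number as an object with re and im attributes):

```javascript
// Conjugate flips the sign of the imaginary part; the modulus is
// sqrt(re^2 + im^2), so |4 + 3i| = 5.
function conjugate(c) {
  return { re: c.re, im: -c.im };
}

function modulus(c) {
  return Math.sqrt(c.re * c.re + c.im * c.im);
}

// Adjoint: transpose the matrix, then conjugate every entry.
function adjoint(m) {
  return m[0].map((_, j) => m.map((row) => conjugate(row[j])));
}

console.log(conjugate({ re: 4, im: 3 })); // { re: 4, im: -3 }
console.log(modulus({ re: 4, im: 3 }));   // 5

const M = [
  [{ re: -5, im: 4 }, { re: 6, im: -7 }, { re: 9, im: 0 }],
  [{ re: 1, im: -2 }, { re: -3, im: 0 }, { re: 0, im: -8 }],
];
console.log(adjoint(M)[0][1]); // { re: 1, im: 2 }, i.e., 1 + 2i
```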

The tensor product is also intensively used in quantum computing and can be defined as
follows. Assuming that A is a matrix with m rows and n columns, we can denote by ai,j the
element on the ith row and column jth of A. The tensor product of A by another matrix B is
a matrix denoted by A ⊗ B and it is calculated as follows.

A ⊗ B =

[ a1,1·B     a1,2·B     a1,3·B     ⋯  a1,n−2·B     a1,n−1·B     a1,n·B   ]
[ a2,1·B     a2,2·B     a2,3·B     ⋯  a2,n−2·B     a2,n−1·B     a2,n·B   ]
[ a3,1·B     a3,2·B     a3,3·B     ⋯  a3,n−2·B     a3,n−1·B     a3,n·B   ]
[   ⋮          ⋮          ⋮        ⋯     ⋮            ⋮           ⋮      ]
[ am−2,1·B   am−2,2·B   am−2,3·B   ⋯  am−2,n−2·B   am−2,n−1·B   am−2,n·B ]
[ am−1,1·B   am−1,2·B   am−1,3·B   ⋯  am−1,n−2·B   am−1,n−1·B   am−1,n·B ]
[ am,1·B     am,2·B     am,3·B     ⋯  am,n−2·B     am,n−1·B     am,n·B   ]
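The block layout above translates directly into four nested loops. Here is a small JavaScript sketch (our own, restricted to real-number matrices for brevity) of the tensor (Kronecker) product:

```javascript
// Tensor (Kronecker) product of two numeric matrices, following the
// block layout A ⊗ B = [ a(i,j) · B ]. Our own illustrative sketch.
function tensorProduct(A, B) {
  const out = [];
  for (let i = 0; i < A.length; i++) {        // rows of A
    for (let k = 0; k < B.length; k++) {      // rows of B
      const row = [];
      for (let j = 0; j < A[0].length; j++) { // columns of A
        for (let l = 0; l < B[0].length; l++) { // columns of B
          row.push(A[i][j] * B[k][l]);        // a(i,j) scales the whole block B
        }
      }
      out.push(row);
    }
  }
  return out;
}

console.log(tensorProduct([[1, 2]], [[0], [1]]));
// [ [ 0, 0 ], [ 1, 2 ] ]
```

If A is m×n and B is p×q, the result is (m·p)×(n·q), which is exactly why tensoring k qubits yields a state vector of dimension 2ᵏ.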

It also seems important to recall the following formula and the Dirac ket notation for the calculation of ∥V∥, the norm of a column vector V of complex numbers.

|V⟩ = V = [c0, c1, c2, c3, …, cn−3, cn−2, cn−1]ᵀ (a column vector)

⟨V| = V̅ᵀ = (c̄0, c̄1, c̄2, c̄3, …, c̄n−3, c̄n−2, c̄n−1) (the row vector of the conjugates)

∥V∥ = ∥|V⟩∥ = √(⟨V|V⟩)
    = √( (c̄0, c̄1, c̄2, c̄3, …, c̄n−3, c̄n−2, c̄n−1) × [c0, c1, c2, c3, …, cn−3, cn−2, cn−1]ᵀ )

The formula is similar to the following from Strubell (2011, pp. 6—7):

∥V∥ = √( |c0|² + |c1|² + |c2|² + |c3|² + … + |cn−3|² + |cn−2|² + |cn−1|² ).
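Since c̄·c = |c|² for any complex number c, the norm reduces to a single pass over the vector. A minimal JavaScript sketch (our own naming, using the same `{re, im}` representation as before):

```javascript
// Norm of a complex column vector: ∥V∥ = sqrt( Σ |c_k|² ).
// Our own illustrative sketch; entries are { re, im } objects.
const norm = (v) =>
  Math.sqrt(v.reduce((sum, c) => sum + c.re * c.re + c.im * c.im, 0));

// For c = 4 + 3i we have |c| = 5, so the one-entry vector [4 + 3i] has norm 5:
console.log(norm([{ re: 4, im: 3 }])); // 5
```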

States and Dynamics

A quantum system is defined as a structure with a finite number of positions or states. At the time when that structure is observed, only one of these positions or states is occupied by the particle, and each position or state has its own probability of being occupied. This does not cancel out the main feature of quantum computing, according to which the particle simultaneously occupies all of its states when it is not being observed. If we assume that our quantum system has n positions or states, then those positions or states can be represented by the following column vector using the Dirac ket notation.

|φ⟩ = [c0, c1, c2, c3, …, cn−3, cn−2, cn−1]ᵀ

The above state can be normalized so that the squared magnitude of each entry represents a probability between 0 and 1:

|φ⟩/∥φ∥ = [ c0/∥φ∥, c1/∥φ∥, c2/∥φ∥, c3/∥φ∥, …, cn−3/∥φ∥, cn−2/∥φ∥, cn−1/∥φ∥ ]ᵀ

The normalized state values show that, at the time when the quantum system is being observed, the sum of all the probabilities is equal to 1, with the probability of the particle being in state i calculated as:

|ci|² / ∥φ∥²
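This probability rule is easy to check numerically. The following JavaScript sketch (our own helpers, not from the course code) computes |ci|²/∥φ∥² for each amplitude and confirms that the probabilities sum to 1; the amplitudes used are the unnormalized values −1+3i and −2i from the worked example in this section:

```javascript
// Probabilities of observing each basis state: p_i = |c_i|² / ∥φ∥².
// Our own illustrative sketch; amplitudes are { re, im } objects.
const sq = (c) => c.re * c.re + c.im * c.im; // |c|²

function probabilities(phi) {
  const normSq = phi.reduce((s, c) => s + sq(c), 0); // ∥φ∥²
  return phi.map((c) => sq(c) / normSq);
}

// Amplitudes −1+3i and −2i: |−1+3i|² = 10, |−2i|² = 4, ∥φ∥² = 14.
console.log(probabilities([{ re: -1, im: 3 }, { re: 0, im: -2 }]));
// [ 10/14, 4/14 ] ≈ [ 0.714…, 0.285… ]
```

Because each entry is divided by the same ∥φ∥², the outputs always sum to 1, regardless of whether the input state was normalized beforehand.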

It is important to note that the classical 0 and 1 bits are translated as follows in quantum computing, so that any qubit [c0, c1]ᵀ can be expressed in terms of the qubit of 0 and the qubit of 1.

|0⟩ = [1, 0]ᵀ (the qubit of 0)
|1⟩ = [0, 1]ᵀ (the qubit of 1)
|c⟩ = [c0, c1]ᵀ = c0·|0⟩ + c1·|1⟩ (expression of a qubit |c⟩)

The observed values of the states of a quantum system continuously change as specified
by an operator that is represented by a unitary square adjacency matrix of complex num-
bers whose number of columns (or rows) is equal to the number of states of the quantum
system. A matrix M is said to be unitary when the products M×M† and M†×M are both equal to the identity matrix. This change process and an example are presented below.

General change process:

|φ⟩ = [ c0 ]     U = [ U0,0  U0,1 ]     U|φ⟩ = [ U0,0·c0 + U0,1·c1 ]
      [ c1 ]         [ U1,0  U1,1 ]            [ U1,0·c0 + U1,1·c1 ]

Original qubit |φ⟩, adjacency matrix U, changed qubit U|φ⟩.

Example:

|φ⟩ = [ (−1+3i)/√14 ]     U = [ 1  0 ]     U|φ⟩ = [ (−1+3i)/√14 ]
      [   −2i/√14   ]         [ 0  i ]            [    2/√14    ]

Original qubit |φ⟩, adjacency matrix U, changed qubit U|φ⟩ (the second amplitude becomes i·(−2i)/√14 = 2/√14).
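The change process is just a complex matrix–vector multiplication, which the following JavaScript sketch (our own helpers) applies to the example above:

```javascript
// Applying a 2×2 matrix U of complex numbers to a qubit [c0, c1]ᵀ.
// Our own illustrative sketch; complex numbers are { re, im } objects.
const cAdd = (a, b) => ({ re: a.re + b.re, im: a.im + b.im });
const cMul = (a, b) => ({
  re: a.re * b.re - a.im * b.im,
  im: a.re * b.im + a.im * b.re,
});

const applyU = (U, phi) =>
  U.map((row) =>
    row.reduce((acc, u, j) => cAdd(acc, cMul(u, phi[j])), { re: 0, im: 0 })
  );

// The example from the text: U = [[1, 0], [0, i]] applied to
// [(−1+3i)/√14, −2i/√14]ᵀ turns the second amplitude into 2/√14.
const s = Math.sqrt(14);
const phi = [{ re: -1 / s, im: 3 / s }, { re: 0, im: -2 / s }];
const U = [
  [{ re: 1, im: 0 }, { re: 0, im: 0 }],
  [{ re: 0, im: 0 }, { re: 0, im: 1 }],
];
console.log(applyU(U, phi)); // second entry ≈ { re: 2/√14, im: 0 }
```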

An Overview of Shor’s Algorithm

The purpose of Shor’s algorithm is to solve the problem of the factorization of natural
numbers. The efficiency of that algorithm is seen as a serious threat to the survival of cur-
rent encryption systems that are based on the difficulty of computing the prime factors of
numbers. Shor’s algorithm is usually divided into five steps, but only one of these steps makes use of quantum computing paradigms. These five steps are briefly reviewed below for an input value N.

1. Randomly choose a natural number m between 2 and N−1 and calculate the gcd, or greatest common divisor, of m and N. If the value of the gcd is different from 1, then the algorithm should terminate with the answer that the gcd is a factor of N. If not, then the algorithm should move to the second step.
2. Use quantum computing to calculate the period p of the function f(n) = mⁿ mod N.
3. Determine whether p is even or odd. If p is even, then the algorithm should move to the fourth step; otherwise, it should move to the first step.
4. Determine whether m^(p/2) + 1 ≡ 0 (mod N). If so, then the algorithm should move to step one; otherwise, it should move to step five.
5. Calculate d as the greatest common divisor of m^(p/2) − 1 and N. The algorithm must then terminate by returning d as a non-trivial factor of N.
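The five steps above can be sketched classically; the following JavaScript illustration (our own) replaces the quantum subroutine of step 2 with a brute-force period search, which is exponentially slower but makes the control flow of the algorithm concrete:

```javascript
// Classical sketch of Shor's five steps. Step 2 is done by brute-force
// period search here instead of a quantum computer. Our own illustration.
const gcd = (a, b) => (b === 0 ? a : gcd(b, a % b));
const powMod = (m, e, N) => {
  let r = 1;
  for (let i = 0; i < e; i++) r = (r * m) % N;
  return r;
};

function shorFactor(N, m) {                // m: the random pick of step 1
  const g = gcd(m, N);
  if (g !== 1) return g;                   // step 1: m already shares a factor
  let p = 1;                               // step 2: period of f(n) = m^n mod N
  while (powMod(m, p, N) !== 1) p++;
  if (p % 2 !== 0) return null;            // step 3: odd period, retry with new m
  const half = powMod(m, p / 2, N);
  if ((half + 1) % N === 0) return null;   // step 4: bad case, retry with new m
  return gcd(half - 1, N);                 // step 5: a non-trivial factor of N
}

// For N = 15 and m = 7, the period of 7^n mod 15 is 4, giving the factor 3:
console.log(shorFactor(15, 7)); // 3
```

A `null` return corresponds to the "go back to step one" branches, where a real implementation would pick a fresh random m and try again.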

Use of Quantum Computing by Shor’s Algorithm

Let us end this section by briefly illustrating how Shor’s algorithm uses quantum computing to calculate the period p of the function f(n) = mⁿ mod N, with the example of the calculation of the period of the function f(x) = x mod 2. A value r is a period of f if and only if f(x) = f(x + r) for all x values. First, we need to introduce Quantum Fourier Transformation (QFT) matrices because of their use by this algorithm. We also must remember that e^(πi) = −1. Let us also denote the complex number e^(2πi/N) by the name w, where N is a natural number.

The QFT_N matrix is defined as follows (its entry in row j and column k, counting from 0, is w^(j·k)):

QFT_N = (1/√N) ×

[ 1   1        1          1          ⋯   1              1              ]
[ 1   w        w²         w³         ⋯   w^(N−2)        w^(N−1)        ]
[ 1   w²       w⁴         w⁶         ⋯   w^(2(N−2))     w^(2(N−1))     ]
[ 1   w³       w⁶         w⁹         ⋯   w^(3(N−2))     w^(3(N−1))     ]
[ ⋮   ⋮        ⋮          ⋮          ⋯   ⋮              ⋮              ]
[ 1   w^(N−3)  w^(2(N−3)) w^(3(N−3)) ⋯   w^((N−3)(N−2)) w^((N−3)(N−1)) ]
[ 1   w^(N−2)  w^(2(N−2)) w^(3(N−2)) ⋯   w^((N−2)(N−2)) w^((N−2)(N−1)) ]
[ 1   w^(N−1)  w^(2(N−1)) w^(3(N−1)) ⋯   w^((N−1)(N−2)) w^((N−1)(N−1)) ]
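Since the (j, k) entry is simply w^(j·k)/√N with w = e^(2πi/N), the whole matrix can be generated with two loops. A JavaScript sketch (our own, using Euler's formula e^(iθ) = cos θ + i·sin θ to represent each power of w as an `{re, im}` pair):

```javascript
// Building the N×N QFT matrix entry by entry: QFT_N[j][k] = w^(jk) / √N,
// with w = e^(2πi/N). Our own illustrative sketch.
function qftMatrix(N) {
  const M = [];
  for (let j = 0; j < N; j++) {
    const row = [];
    for (let k = 0; k < N; k++) {
      const angle = (2 * Math.PI * j * k) / N; // w^(jk) = e^(2πi·jk/N)
      row.push({
        re: Math.cos(angle) / Math.sqrt(N),
        im: Math.sin(angle) / Math.sqrt(N),
      });
    }
    M.push(row);
  }
  return M;
}

// For N = 2 this reproduces (1/√2) [[1, 1], [1, −1]] up to rounding:
console.log(qftMatrix(2));
```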

Let us review the examples of N=2, N=4, and N=8.

QFT₂ = (1/√2) [ 1  1 ] = (1/√2) [ 1      1     ] = (1/√2) [ 1   1 ]
              [ 1  w ]          [ 1  e^(2πi/2) ]          [ 1  −1 ]

For N=4, the value of w is equal to e^(2πi/4) = e^(πi/2). The square of w will therefore yield the value e^(πi), which is also equal to −1. This implies that w = i. The QFT₄ matrix can now easily be filled, knowing that w = i, w² = −1, w³ = −i, w⁴ = 1, w⁵ = i, w⁶ = −1, w⁷ = −i, w⁸ = 1, w⁹ = i.

QFT₄ = (1/2) [ 1  1   1   1  ] = (1/2) [ 1   1   1   1 ]
             [ 1  w   w²  w³ ]         [ 1   i  −1  −i ]
             [ 1  w²  w⁴  w⁶ ]         [ 1  −1   1  −1 ]
             [ 1  w³  w⁶  w⁹ ]         [ 1  −i  −1   i ]

For N=8, the value of w is equal to e^(2πi/8) = e^(πi/4). Raising w to the power 4 will yield the value e^(πi), which is also equal to −1. This implies that w is a square root of i, which is also equal to (1+i)/√2, and QFT₈ can be calculated as follows (using w⁸ = 1 to reduce the exponents modulo 8):

QFT₈ = (1/√8) [ 1  1   1    1    1    1    1    1   ]
              [ 1  w   w²   w³   w⁴   w⁵   w⁶   w⁷  ]
              [ 1  w²  w⁴   w⁶   w⁸   w¹⁰  w¹²  w¹⁴ ]
              [ 1  w³  w⁶   w⁹   w¹²  w¹⁵  w¹⁸  w²¹ ]
              [ 1  w⁴  w⁸   w¹²  w¹⁶  w²⁰  w²⁴  w²⁸ ]
              [ 1  w⁵  w¹⁰  w¹⁵  w²⁰  w²⁵  w³⁰  w³⁵ ]
              [ 1  w⁶  w¹²  w¹⁸  w²⁴  w³⁰  w³⁶  w⁴² ]
              [ 1  w⁷  w¹⁴  w²¹  w²⁸  w³⁵  w⁴²  w⁴⁹ ]

     = (1/√8) [ 1   1    1   1    1   1    1   1  ]
              [ 1   w    i   w³  −1   w⁵  −i   w⁷ ]
              [ 1   i   −1  −i    1   i   −1  −i  ]
              [ 1   w³  −i   w   −1   w⁷   i   w⁵ ]
              [ 1  −1    1  −1    1  −1    1  −1  ]
              [ 1   w⁵   i   w⁷  −1   w   −i   w³ ]
              [ 1  −i   −1   i    1  −i   −1   i  ]
              [ 1   w⁷  −i   w⁵  −1   w³   i   w  ].

We can now multiply QFT₈ by the first register of the qubit |0⟩|0⟩ = |0⟩ ⊗ |0⟩ as shown below. On the eight-dimensional first register, |0⟩ is the column vector [1, 0, 0, 0, 0, 0, 0, 0]ᵀ, so the multiplication selects the first column of QFT₈:

QFT₈ × [1, 0, 0, 0, 0, 0, 0, 0]ᵀ = (1/√8) [1, 1, 1, 1, 1, 1, 1, 1]ᵀ

The equation can be rewritten as

(1/√8) [1, 1, 1, 1, 1, 1, 1, 1]ᵀ
= (1/√8) ( |0⟩|0⟩ + |1⟩|0⟩ + |2⟩|0⟩ + |3⟩|0⟩ + |4⟩|0⟩ + |5⟩|0⟩ + |6⟩|0⟩ + |7⟩|0⟩ )
= (1/√8) ∑ from x=0 to 7 of |x⟩|0⟩

These calculations show that we landed on the following qubit after multiplying QFT₈ by the qubit |0⟩|0⟩:

(1/√8) ∑ from x=0 to 7 of |x⟩|0⟩

We now have to apply the following unitary transformation U_f to the above qubit:

(1/√8) ∑ from x=0 to 7 of |x⟩|0⟩   →(U_f)→   (1/√8) ∑ from x=0 to 7 of |x⟩|f(x)⟩ = (1/√8) ∑ from x=0 to 7 of |x⟩ ⊗ |f(x)⟩

The above unitarily transformed qubit can be rewritten as follows:

(1/√8) ∑ from x=0 to 7 of |x⟩|f(x)⟩
= (1/√8) ( |0⟩|f(0)⟩ + |1⟩|f(1)⟩ + |2⟩|f(2)⟩ + |3⟩|f(3)⟩ + |4⟩|f(4)⟩ + |5⟩|f(5)⟩ + |6⟩|f(6)⟩ + |7⟩|f(7)⟩ )

We now have to measure |f(x)⟩, whose value can either be equal to |0⟩ or to |1⟩ because f(x) is computed modulo 2.

Let us assume that |f(x)⟩ is equal to |0⟩. This implies that x is even, and the measurement of the above unitarily transformed qubit yields the following state, where p = 2 is the period:

(1/√8) ∑ from x=0 to 7 of |x⟩ ⊗ |f(x)⟩   →(measure)→   √(p/N) ∑ from x=0 to N/p − 1 of |p·x⟩ ⊗ |0⟩
= (1/2) ( |0⟩ + |2⟩ + |4⟩ + |6⟩ ) ⊗ |0⟩

The QFT of the above measurement yields

(1/2) ( |0⟩ + |2⟩ + |4⟩ + |6⟩ )   →(QFT₈)→   (1/√2) ( |0⟩ + |4⟩ )

This transformation tells us that we must measure |4⟩. Moreover, the division of N by 4 is equal to 2, since N=8, and the period of our function is 2.
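The period reported by the quantum subroutine can always be double-checked classically against the definition given earlier, namely that r is a period of f when f(x) = f(x + r) for all x. A brute-force JavaScript sketch (our own, testing the condition only over a finite range of x values):

```javascript
// Classical check of a period: the smallest r > 0 with f(x) = f(x + r)
// for all x, tested over a finite range. Our own illustrative sketch.
function findPeriod(f, maxX = 64) {
  for (let r = 1; r <= maxX; r++) {
    let ok = true;
    for (let x = 0; x < maxX; x++) {
      if (f(x) !== f(x + r)) { ok = false; break; }
    }
    if (ok) return r;
  }
  return null; // no period found within the tested range
}

console.log(findPeriod((x) => x % 2));          // 2, matching the QFT result
console.log(findPeriod((n) => 2 ** n % 7, 8));  // 3, the period of 2^n mod 7
```

This exhaustive search takes time proportional to the period, which is exactly the cost that the quantum period-finding step of Shor's algorithm avoids.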

SUMMARY
Three advanced algorithmic paradigms were introduced in this unit: parallel, probabilistic, and quantum computing. Parallel algorithms are classified either as data partitioning or function partitioning algorithms. They are based on fundamental mechanisms such as elementwise operations, reductions, and broadcasting. This unit contains basic practical examples of how to program fundamental probability distributions. It also showed how to compute descriptive statistics and test a hypothesis using the Bayes factor. We provided a brief historical perspective on quantum computing before recalling core mathematical concepts on complex numbers, vectors, and matrices. Finally, we reviewed key quantum computing concepts, such as states and dynamics, qubits, and Shor's algorithm.

BACKMATTER
LIST OF REFERENCES
Abadi, M., & Cardelli, L. (1996). A theory of objects. Springer.

Adams, P. (2018). Undecidability and the structure of the Turing degrees [REU Paper, Univer-
sity of Chicago].

Adel'son-Vels'ky, G. M., & Landis, E. M. (1962). An algorithm for organization of information. Doklady Akademii Nauk SSSR, 146(2), 263—266.

Aho, A. V. (1977). Algorithms and computational complexity. Acta Crystallographica Section A: Crystal Physics, Diffraction, Theoretical and General Crystallography, 33(1), 5—12.

Ahujia, R. K., Magnanti, T. L., & Orlin, J. B. (1993). Network flows: Theory, algorithms and
applications. Prentice Hall.

Alami, A., Cohn, M. L., & Wąsowski, A. (2019, May 25—31). Why does code review work for
open source software communities? Proceedings of the 41st International Conference
on Software Engineering (ICSE ’19) (pp. 1073—1083). IEEE.

Attard Cassar, E. (2018). In search of the fastest sorting algorithm. Symposia Melitensia, 14,
63—77. https://www.um.edu.mt/library/oar/handle/123456789/30001

Ausiello, G. (2013). Algorithms, an historical perspective. In G. Ausiello & R. Petreschi (Eds.), The Power of Algorithms (pp. 3—26). Springer.

Blank, B. E. (2002). The millennium problems: The seven greatest unsolved mathematical
puzzles of our time. Basic Books.

Catanzaro, B., Sundaram, N., & Keutzer, K. (2008, April). A map reduce framework for pro-
gramming graphics processors. Workshop on Software Tools for MultiCore Systems.

Crawford, K. (2013, April 1). The hidden biases in big data. Harvard Business Review. https://hbr.org/2013/04/the-hidden-biases-in-big-data

Frege, G. (1990). Begriffsschrift, a formula language, modeled upon that of arithmetic, for
pure thought. In J. van Heijenoort (Ed.), Frege to Gödel: A source book in mathematical
logic (pp. 1—82). Harvard University Press. (Original work published 1879)

Gasarch, W. I. (2002). The P=?NP poll. ACM SIGACT News, 33(2), 34—47.

Gasarch, W. I. (2012). Guest column: The second P=?NP poll. ACM SIGACT News, 43(2), 53—
77.

Girard, J. Y. (1987). Linear logic. Theoretical computer science, 50(1), 1—101.

Hidary, J. D. (2019). Quantum computing: An applied approach. Springer.

Hill, K. (2020, June 24). Wrongfully accused by an algorithm. The New York Times. https://www.nytimes.com/2020/06/24/technology/facial-recognition-arrest.html

Hoare, C. A. R. (1961). Algorithm 64: quicksort. Communications of the ACM, 4(7), 321.

Kessler, G. C. (2020, June 1). An overview of cryptography. Gary Kessler. https://www.garykessler.net/library/crypto.html

Kao, Y. F. (n. d.). Computable foundations of bounded rationality [Lecture notes]. https://pdfs.semanticscholar.org/bc65/587fe1409cc4a152bdaefedec0c6e2020100.pdf

Kochhar, P. S., Xia, X., & Lo, D. (2019, May). Practitioners' views on good software testing
practices. 2019 IEEE/ACM 41st International Conference on Software Engineering: Soft-
ware Engineering in Practice (ICSE-SEIP) (pp. 61—70). IEEE.

Lafont, Y. (1989, December). Interaction nets. Proceedings of the 17th ACM SIGPLAN-SIGACT
symposium on principles of programming languages (pp. 95—108). Association for
Computing Machinery.

Lagarias, J. C. (1985). The 3x+ 1 problem and its generalizations. The American Mathemati-
cal Monthly, 92(1), 3—23.

Lloyd, S. P. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2), 129—137.

Lomonaco, S. J. (2000). A lecture on Shor's quantum factoring algorithm [Lecture notes]. arXiv. https://arxiv.org/pdf/quant-ph/0010034.pdf

Lyman, J. (2016). Blossom: A language built to grow. Mathematics, Statistics, and Com-
puter Science Honors Projects, 45.

MacCormick, J. (2013). Nine algorithms that changed the future: The ingenious ideas that
drive today's computers. Princeton University Press.

Mazur, J. (2014). Enlightening symbols: A short history of mathematical notation and its hid-
den powers. Princeton University Press.

Olhede, S. C., & Wolfe, P. J. (2018). The growing ubiquity of algorithms in society: Implica-
tions, impacts and innovations. Philosophical Transactions of the Royal Society A: Math-
ematical, Physical and Engineering Sciences, 376(2128), 3—26.

Poonen, B. (2014). Undecidable problems: A sampler. In J. Kennedy (Ed.), Interpreting Gödel: Critical essays (pp. 211—241). Cambridge University Press.

Reiser, M. (2017). Parallelize JavaScript computations with ease [Projektarbeit, Hochschule für Technik Rapperswil]. Hochschule für Technik Rapperswil.

Richardson, K. (2020, May). Number theory meets computability theory. Kyle Richardson. https://www.nlp-kyle.com/files/h10.pdf

Rivest, R. L., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures
and public-key cryptosystems. Communications of the ACM, 21(2), 120—126.

Rodó, C. (2010). Efficiency in quantum key distribution protocols using entangled Gaussian
states [Master thesis, Universitat Autonoma de Barcelona]. arXiv:1005.2291.

Saltz, J. S., & Shamshurin, I. (2017, July). Does pair programming work in a data science
context? An initial case study. 2017 IEEE International Conference on Big Data (Big
Data) (pp. 2348—2354). IEEE.

Simon, H. A. (2002). Near decomposability and the speed of evolution. Industrial and cor-
porate change, 11(3), 587—599.

Strubell, E. (2011). CSE 301: An introduction to quantum algorithms [Lesson content].

Subero, A. (2020). Codeless data structures and algorithms: Learn DSA without writing a sin-
gle line of code. Apress.

Turner, R. (2018). Towards a philosophy of computer science. Computational Artifacts (pp. 13—19). Springer.

van de Meent, J. W., Paige, B., Yang, H., & Wood, F. (2018). An introduction to probabilistic
programming. arXiv:1809.10756.

Varadharajan, S. (2020). Hard mathematical problems in cryptography and coding theory [Doctoral dissertation, The University of Bergen]. The University of Bergen.

Vega, F. (2016). NL versus P. hal-01354989. https://hal.archives-ouvertes.fr/hal-01354989/document

Vryonis, P. (2013, August 27). Public-key cryptography for non-geeks. Vrypan. https://blog.vrypan.net/2013/08/28/public-key-cryptography-for-non-geeks/

Wang, H. (1990). The concept of computability. In Computation, Logic, Philosophy (pp. 13—
29). Springer. https://doi.org/10.1007/978-94-009-2356-0

Ye, Y. (2013). Generalizing contexts amenable to greedy and greedy-like algorithms [Doc-
toral dissertation, University of Toronto]. TSpace.

Zapata-Rivera, L. F., & Aranzazu-Suescun, C. (2020). Enhanced virtual laboratory experience for wireless networks planning learning. IEEE Revista Iberoamericana de Tecnologias del Aprendizaje, 15(2), 105—112.

LIST OF TABLES AND
FIGURES
Figure 1: “Hello There” Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

Figure 2: GCD Naive Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

Figure 3: Selection Sort Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Figure 4: Cryptography Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Figure 5: Third Person Present Tense Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

Figure 6: Array’s Input and Output Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

Figure 7: Stack Implementation with Linked Nodes (Start) . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

Figure 8: Stack Implementation with Linked Nodes (End) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

Figure 9: List Implementation with Linked Nodes (Start) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

Figure 10: List Implementation with Linked Nodes (Cont'd) . . . . . . . . . . . . . . . . . . . . . . . . . . 41

Figure 11: List Implementation with Linked Nodes (End) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

Figure 12: Non-Priority Queues Implementation with Linked Nodes (Start) . . . . . . . . . . . . 43

Figure 13: Non-Priority Queues Implementation with Linked Nodes (End) . . . . . . . . . . . . . 44

Figure 14: Priority Queues Implementation with Linked Nodes (Start) . . . . . . . . . . . . . . . . . 46

Figure 15: Priority Queues Implementation with Linked Nodes (Cont'd) . . . . . . . . . . . . . . . 47

Figure 16: Priority Queues Implementation with Linked Nodes (End) . . . . . . . . . . . . . . . . . 48

Figure 17: Completely Full Binary Tree Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Figure 18: Numbering of the Nodes of a Completely Full Binary Tree . . . . . . . . . . . . . . . . . . 50

Figure 19: Binary Trees Implementation with Arrays (Start) . . . . . . . . . . . . . . . . . . . . . . . . . . 52

Figure 20: Binary Trees Implementation with Arrays (End) . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

Figure 21: Example of a Labeled Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Figure 22: Example of Nodes Identification in a Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Figure 23: Representation of the Nodes of a Graph in an Array . . . . . . . . . . . . . . . . . . . . . . . . 56

Figure 24: Graph’s Representation as a Two-Dimensional Array and an Array of Linked
Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Figure 25: Graph Implementation with Two Arrays (Start) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

Figure 26: Graph Implementation with Two Arrays (End) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

Figure 27: Iterative and Recursive Versions of Factorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

Figure 28: Iterative Illustration of Factorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

Figure 29: Naive Iterative and Recursive Primality Test Algorithms . . . . . . . . . . . . . . . . . . . . 63

Figure 30: Insertion of the Sequence 75, 29, 52, 89, 92, 90, 24, 8, 17, 27 in an AVL (Start) . 66

Figure 31: Insertion of the Sequence 75, 29, 52, 89, 92, 90, 24, 8, 17, 27 in an AVL (Cont’d)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

Figure 32: Insertion of the Sequence 75, 29, 52, 89, 92, 90, 24, 8, 17, 27 in an AVL (End) . . 68

Figure 33: Greedy Algorithm Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

Figure 34: Dynamic Programming of the Currency Exchange Problem . . . . . . . . . . . . . . . . . 69

Figure 35: Sorting Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Figure 36: Radix Sort Algorithm Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

Figure 37: Bucket Sort Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

Figure 38: Insertion Sort Algorithm Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

Figure 39: Bubble Sort Algorithm Example (Start) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

Figure 40: Bubble Sort Algorithm Example (End) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

Figure 41: Merging Two Sorted Sequences and Merge Sort Algorithm Examples . . . . . . . . 81

Figure 42: Quicksort Algorithm Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

Table 1: jq Filtering Patterns (Selection) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

Figure 43: Json Object First Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Figure 44: Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

Figure 45: Json Object Second Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

Figure 46: K-Means Algorithm Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

Figure 47: Graphic Representation of the Above Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

Figure 48: K-Means Algorithm Example First Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

Figure 49: K-Means Algorithm Example Second Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

Figure 50: K-Means Algorithm Example Third Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

Figure 51: K-Means Algorithm Example Fourth Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

Figure 52: K-Means Algorithm Example Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

Figure 53: Example of the Sum of Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

Figure 54: Example of the Recursive Version of Factorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

Table 2: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

Figure 55: Collatz Sequence Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Figure 56: Twin Primes Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

Figure 57: Perfect Numbers Caller Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

Figure 58: Primality Test Example with Exceptions (First Version) . . . . . . . . . . . . . . . . . . . . 114

Figure 59: Primality Test Example with Exceptions (Second Version) . . . . . . . . . . . . . . . . . 115

Figure 60: Primality Test Example with Assertions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

Figure 61: Primality Test Example for Jshint Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

Figure 62: Jest Object for package.json File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Figure 63: First Primality Test Example for Jest Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Figure 64: First Set of Jest Test Cases for the Primality Test Program . . . . . . . . . . . . . . . . . 120

Figure 65: Second Primality Test Example for Jest Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

Figure 66: Second Set of Jest Test Cases for the Primality Test Program . . . . . . . . . . . . . . 121

Figure 67: Code for the Creation of a MySQL Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

Figure 68: Code for the Creation of a MySQL table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Figure 69: Code for the Storage of a MySQL Record . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Figure 70: Code for the Querying of a MySQL Record . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Figure 71: Mid Rank Intervals for Epsilon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Figure 72: Mid Rank Problem Algorithm Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

Figure 73: YES or NO Automaton . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

Figure 74: Automaton for the Positive Multiples of 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136

Figure 75: Pushdown Automaton for Well-Bracketed Expressions . . . . . . . . . . . . . . . . . . . . 137

Figure 76: Turing Machine for the Words aⁿbⁿcⁿ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

Figure 77: Turing Machine (aabbcc Input) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

Figure 78: Turing Machine Concept for the Null (Empty) String Problem . . . . . . . . . . . . . . 147

Figure 79: Recursively Enumerable Set Membership JavaScript Pseudocode . . . . . . . . . 148

Figure 80: Biggest Subsequence Problem JavaScript Code (Start) . . . . . . . . . . . . . . . . . . . 153

Figure 81: Biggest Subsequence JavaScript Code (End) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

Figure 82: Traced Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

Figure 83: Tracing Kadane’s Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

Figure 84: Graphical Comparison of O(n²) and O(n) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

Figure 85: Fibonacci Algorithms (Start) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

Figure 86: Fibonacci Algorithms (Cont'd) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

Figure 87: Fibonacci Algorithms (End) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

Figure 88: Tracing the Fast Fibonacci Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

Figure 89: Search Algorithm Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

Table 3: Most Common Complexity Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

Table 4: Computational Problems (Examples) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

Figure 90: Traveling Salesperson Problem Example Graph . . . . . . . . . . . . . . . . . . . . . . . . . . 168

Figure 91: Decision Hamilton Cycle Problem Examples Graphs . . . . . . . . . . . . . . . . . . . . . . 168

Figure 92: Reducing Hamilton Cycle Problem Examples Graphs . . . . . . . . . . . . . . . . . . . . . 169

Figure 93: ParallelJs Map Operation First Example with Output Screenshot . . . . . . . . . . . 176

Figure 94: ParallelJs Map Operation Second Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

Figure 95: Output Screenshot of the Second ParallelJs Map Operation Example . . . . . . . 177

Figure 96: Output Screenshot of the ParallelJs Map Reduce Example . . . . . . . . . . . . . . . . 179

Figure 97: Paralleljs Broadcasting Operation Example (Start) . . . . . . . . . . . . . . . . . . . . . . . . 180

Figure 98: Paralleljs Broadcasting Operation Example (End) . . . . . . . . . . . . . . . . . . . . . . . . . 181

Figure 99: ParallelJs Broadcasting Operation Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

Figure 100: WebPPL Hello Mam or Dad Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

Figure 101: Output Screenshot of the WebPPL Hello Mam or Dad Example . . . . . . . . . . . 183

Figure 102: WebPPL Bernoulli Distribution First Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

Figure 103: WebPPL Bernoulli Distribution Second Example . . . . . . . . . . . . . . . . . . . . . . . . 184

Figure 104: WebPPL Binomial Distribution Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

Figure 105: WebPPL Geometric Distribution Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

Figure 106: Output Screenshot of WebPPL Geometric Distribution Example . . . . . . . . . . 185

Figure 107: WebPPL Uniform Distribution Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

Figure 108: Output Screenshot of WebPPL Uniform Distribution Example . . . . . . . . . . . . 186

Figure 109: WebPPL Descriptive Statistics Example (Start) . . . . . . . . . . . . . . . . . . . . . . . . . . 187

Figure 110: WebPPL Descriptive Statistics Example (End) . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

Figure 111: Output Screenshot of WebPPL Descriptive Statistics Example . . . . . . . . . . . . 187

Figure 112: WebPPL Inferential Statistics Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

Figure 113: Output Screenshot of WebPPL Inferential Statistics Example . . . . . . . . . . . . . 188

Figure 114: WebPPL Text Version Search Example with Uniformly Distributed Input (Start)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

Figure 115: WebPPL Text Version Search Example with Uniformly Distributed Input (End)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

Figure 116: Detailed Array Output of WebPPL Uniform Input Distribution Text Version
Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192

Figure 117: Real Mean Output of WebPPL Uniform Input Distribution Text Version Example
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

Figure 118: Natural Mean Output of WebPPL Uniform Input Distribution Text Version Exam-
ple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

Figure 119: WebPPL Graphic Version Search Example with Uniformly Distributed Input
(Start) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

Figure 120: WebPPL Graphic Version Search Example with Uniformly Distributed Input (End)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

Figure 121: Plotly Uniform Input Distribution Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196

Figure 122: Output of Plotly Uniform Input Distribution Example (Screenshot) . . . . . . . . 196

IU Internationale Hochschule GmbH
IU International University of Applied Sciences
Juri-Gagarin-Ring 152
D-99084 Erfurt

Mailing Address
Albert-Proeller-Straße 15-19
D-86675 Buchdorf

[email protected]
www.iu.org

Help & Contacts (FAQ)


On myCampus you can always find answers
to questions concerning your studies.
