
Algorithms Simplified

A Minimalist Approach to Problem-Solving


This book is dedicated to my parents who always were, are and will
continue to be by my side.
Copyright © 2024 Rohith B V

All rights reserved.

First Edition, 2024

Cover design by Rohith B V

Book design by Rohith B V

No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information retrieval system, without written permission from the author.

Published by Facile Publishing

Although every precaution has been taken in the preparation of this book, the
publisher and author assume no responsibility for errors or omissions. Neither is
any liability assumed for damages resulting from the use of information
contained herein.

ISBN 978-1-0687284-1-9

Special thanks to my dad for his invaluable assistance with the diagrams.
Preface

Writing Algorithms Simplified has been a journey of passion and dedication. This book is the culmination of my experiences in the tech industry and my commitment to teaching.

Over the years, I've had the privilege of working at some of the most
innovative tech companies in the world. These experiences have provided
me with a deep understanding of algorithms and their practical
applications.

However, my journey has not been without challenges. I've faced many
situations where I was humbled by how much more there was to learn. But
each experience was a learning opportunity, a chance to try again, and
improve.

This book represents years of learning, practice and perseverance. It is designed to be a "shortcut" way of thinking about algorithms, distilling complex concepts into clear explanations and illustrative examples.

Algorithms Simplified is my effort to provide you with a resource that enables you to learn efficiently and effectively. I invite you to dive into this book with curiosity and an open mind. The world of algorithms is vast and fascinating, and I hope this book becomes a trusted companion in your journey to mastering it.

Thank you for joining me on this journey.

Rohith B V
Introduction

Fundamentals of Problem-Solving
1.1 What is a “problem”?
1.2 Types of real-world problems
1.3 Data Structures: the building blocks
1.4 Solving problems with the State Space formulation
1.5 Navigating Through State Space as Graph Traversal
1.6 The structure of problem decomposition
1.7 What is a solution?
1.8 Conclusion

Computation and Complexity
2.1 What is an algorithm?
2.2 What is a program? (barebones)
2.3 What is a process?
2.4 What is complexity?
2.5 What is a Turing Machine?
2.6 Conclusion

Fundamental Data Structures and their Algorithms
3.1 Introduction
3.2 Undirected Graphs
3.3 General Directed Graphs
3.4 Directed Acyclic Graphs (DAGs)
3.5 Linked Lists
3.6 Weighted Graphs
3.7 Arrays
3.8 Dictionaries
3.9 Sets
3.10 Conclusion

From Theory to Practice
4.1 Binary Search: Finding Information Quickly
4.2 Exponential Growth: Efficient Replication
4.3 Optimization Problems: Maximizing or Minimizing
4.4 Divide and Conquer: Breaking Down Complex Tasks
4.5 Greedy Algorithms: Making the Best Immediate Choice
4.6 Sorting: Organizing Information
4.7 Dynamic Programming: Minimizing Repeated Work
4.8 Topological Sorting: Handling Dependencies
4.9 Monte Carlo Method: Using Randomness to Solve Problems
4.10 Conclusion

Further Reading
Books
Online Courses
Websites and Platforms

Afterword
Glossary
Index

Introduction

Welcome! The purpose of this book is to provide an insight into powerful ideas from the fields of Mathematics and Computer Science that can fundamentally alter the way you think about solving many problems in your daily life.

The journey through these pages will arm you with not just the what but the how and why behind each concept, ensuring that you gain a deeper understanding of the material presented. The aim is not just to transform your approach to problem-solving, but to empower you with a new lens through which to view the world — one filled with patterns, efficiency and logical progression.

Care has been taken to keep the book as succinct as possible and to use
diagrams where they help explain the ideas faster or serve to supplement
the text. By stripping away the extraneous, what remains are the essentials
— clearly presented and richly illustrated. Hopefully, this should make for
an engaging book that keeps you stimulated as you read.

Moreover, this book can also serve as a pre-read for seasoned Software
Engineers who are preparing for interviews or other scenarios where
algorithmic thinking is required. Whether you're a veteran coder or a
novice, I believe that within these pages, you'll encounter new ways to look
at old concepts — refreshing your perspective and sharpening your skills.

Let the reading begin!

Chapter 1
Fundamentals of Problem-Solving

1.1 What is a “problem”?


A problem, in its essence, is a situation that requires resolution. It's a gap
between a current state and a desired state, accompanied by a set of
constraints or conditions. In the context of mathematics and computer
science, a problem is often a well-defined question or challenge that
demands a systematic approach to find a solution.

1.2 Types of real-world problems


Real-world problems come in various forms and complexities, for example:

• Optimization Problems: For instance, finding the fastest route between two cities can involve considering various factors like distance, traffic and time of day.

• Decision Problems: Determining whether a number is prime involves checking divisibility by smaller numbers.

• Search Problems: Searching for a specific book in a large library involves organizing and navigating the collection efficiently.

• Design Problems: Designing a bridge requires meeting criteria such as load capacity, material strength and environmental impact.

• Prediction Problems: Predicting the weather involves analyzing patterns in historical data to forecast future conditions.

1.3 Data Structures: the building blocks


Before we attempt to understand and solve problems, let's reflect on the
foundational elements of data representation, starting with a simple analogy
— a blank piece of paper. This space serves as our canvas for representing
"pieces" of data, where a "piece" is any discrete unit of information. We'll
represent these data pieces with closed shapes, using squares for simplicity,
as each side is parallel to a dimension on the paper.

1.3.1 Drawing Squares

Imagine drawing two identical squares on the paper. They can be:

• Partially touching

• Not touching

• Touching fully

• Overlapping

To maintain the integrity of distinct data units, we'll dismiss the fourth
scenario, as it implies a fusion of data that defies our requirement for
discrete units.

For consistency, we'll consider each square to be identical in size, signifying that the data units, say numerical values, are equal in magnitude and uniform in type.

Figure 1.1: Four types of square relationships

If we continue under the premise that our data squares must not touch, the
paper will be filled with non-intersecting squares. However, data in
isolation is often meaningless. Thus, we introduce lines that bridge these
squares, connecting one piece of data to another. This network of squares
and connecting lines forms the basic structure known as a graph. This
structure enables us to navigate from one data point to another, a principle
that underpins much of data organization and algorithm design.

Figure 1.2: Squares and connecting lines form a graph

When squares are placed adjacently, they naturally create a sequence. This
proximity can symbolize a related series of data units. We can extend this
sequence indefinitely, not just linearly, but also bi-dimensionally on our
two-dimensional paper, crafting rows and columns.

This arrangement gives rise to another fundamental data structure known as a table or, in computational terms, an array or matrix. Arrays provide the groundwork for organizing data into an accessible order or grid, allowing for efficient storage and retrieval in computer systems.

Figure 1.3: Squares form table-like structures. The 4th scenario can be seen as a
combination of the others

The idea of assigning identity to some area of the paper is twofold:

• Draw a boundary

• Name/label it

The boundary is a small piece of the paper, represented by infinitely many points, and the name is a finite string of symbols, mapped to a natural number.

Figure 1.4: Two labelled squares “L1” and “L2”

Once we introduce labelling, we can then introduce the power of random access memory. As humans, we can quickly point to the relevant area of the paper and identify a square, but a computer is "blind" and thus has to search the entire paper for the relevant item if necessary. By using Cartesian coordinates (numerical values indicating position), we can demonstrate the power of RAM: if we provide the coordinates of the relevant item, the computer can implicitly "select" separately on the x and y coordinates and end up at the right slot as if by "magic" (that is, efficiently).

Figure 1.5: Demonstrating the concept of Random Access Memory (RAM)

In the end, our understanding of data, with all its intricacies, boils down to two fundamental structures: the interconnected nodes of graphs and the orderly rows and columns of tables. This insight is as straightforward as it is deep, revealing the core frameworks that underpin all forms of data organization.

On paper or within the circuits of computers, our complex world of information is ultimately navigated through these two simple, yet powerful structures. It's a clear reflection of the binary essence of data representation — either as individual points linked by relationships or as sequences laid out in a grid — each with its unique way of organizing our digital landscape.

1.4 Solving problems with the State Space formulation

Now that we have established data representation, we can develop a
framework for solving a “general” problem. Solving problems can be
effectively visualized as navigating through a state space. In this framework,
the solution process is conceptualized as a series of transitions from one
state to another, guided by a set of choices or actions.

1.4.1 State Space Representation

In the context of state space, "space" refers to the set of all possible
configurations or conditions (states) that a system can occupy. It
encompasses every potential state the system can transition to based on its
variables and boundaries. The state space can be represented as a graph as
follows:

Figure 1.6: State and State Space

• Current State (Si): This represents the current configuration or condition of the solution at any given moment. It encapsulates all the relevant information needed to describe the solution's status.

Let a state S be defined as:

S = (v1, v2, v3, ..., vn)

where each vi is a variable.

• Choices (Ci): From each state Si, there is a set of possible actions
or choices (Ci) that can be applied to transition to other states:

Ci = {ci1, ci2, ci3, ..., cim}

• Transition Function (T): Each choice cij in the set Ci leads to a new state. Applying a choice cij to the current state Si results in a new state Sj. This can be represented by a transition function T such that:

T(Si, cij) = Sj

• Final State (Sf): There can be many types of goals when considering a solution to a problem. A common goal is to reach a final state (Sf), where the problem is considered solved. The final state satisfies the conditions or criteria defined by the problem. We go from state to state, applying choices, until the success criteria are met. Other types of goals are specified in a later section.

Reaching the Final State: To determine if a final state has been reached,
specific conditions are checked. These could include:

• Goal achievement: Verifying if the desired outcome has been attained (e.g., all puzzle pieces are in the correct positions).

• Constraint satisfaction: Ensuring all problem constraints are met (e.g., all items are packed within the weight limit).

• Optimization criteria: Checking if the solution meets or exceeds a defined threshold (e.g., the shortest path has been found).

• Termination conditions: Assessing if predefined stopping criteria have been met (e.g., a maximum number of iterations has been reached).

The specific conditions checked depend on the nature of the problem and
the desired solution characteristics.

1.4.2 Universal Problem Structure

By assuming this structure, every problem can be considered a variant of this "master" problem. This universal representation helps in applying general problem-solving techniques and algorithms across different domains. Here are a few methods that can be used within this framework:

• Greedy Algorithms: At each state, choose the action that appears to be the best immediate option, hoping it leads to the optimal final state.

• Exhaustive Search: Explore all possible states and transitions to ensure that the optimal solution is found.

• Divide and Conquer: Break the problem into smaller subproblems, solve each independently and combine the results to form the final state.

• Dynamic Programming: Use previously computed solutions of subproblems to construct the solution for the current state, optimizing the process.

Example: Pathfinding in a Maze

Let's apply this concept to a simple example: finding a path through a maze.

• Initial State (S0): The starting position in the maze.

• Choices (Ci): The possible moves (up, down, left, right) from the
current position.

• Transition Function (T): Applying a move to the current position
results in a new position.

• Final State (Sf): The goal position in the maze.

Figure 1.7: Pathfinding in a Maze

The solution involves navigating from the initial state to the final state by
making a series of choices that transition through the state space of the
maze.
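
To make the formulation concrete, here is a minimal Python sketch of this search. The grid layout, the coordinates and the function name are illustrative assumptions, and the breadth-first exploration strategy it uses is one of the traversal techniques introduced in the next section.

from collections import deque

# 0 = open cell, 1 = wall; each state is a (row, col) position.
maze = [[0, 0, 1],
        [1, 0, 0],
        [1, 1, 0]]
start, goal = (0, 0), (2, 2)  # S0 and Sf

def solve(maze, start, goal):
    rows, cols = len(maze), len(maze[0])
    frontier = deque([[start]])       # partial paths still to explore
    visited = {start}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:             # final state Sf reached
            return path
        r, c = state
        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:  # choices Ci
            nr, nc = r + dr, c + dc   # transition function T
            if 0 <= nr < rows and 0 <= nc < cols \
                    and maze[nr][nc] == 0 and (nr, nc) not in visited:
                visited.add((nr, nc))
                frontier.append(path + [(nr, nc)])
    return None  # no path exists

print(solve(maze, start, goal))
# prints [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)]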

1.4.3 Summary

By visualizing problems as navigating through a state space, we can apply a consistent and structured approach to problem-solving. Each problem is reduced to a sequence of state transitions, driven by choices, until the final state is reached. This abstraction is powerful and versatile, enabling the application of various algorithms and techniques to solve a wide array of problems.

1.5 Navigating Through State Space as Graph Traversal

The concept of navigating through a state space can indeed be effectively
considered as a traversal of a graph. This perspective allows us to leverage
the rich set of tools and algorithms developed for graph theory to solve a
wide range of problems. Let's explore this analogy.

1.5.1 Exploring the Graph

By viewing the state space as a graph, we can apply various graph traversal
techniques to explore and find solutions to the problem.

The following is a (non-exhaustive) list of graph traversal algorithms:

• Breadth-First Search (BFS)

• Depth-First Search (DFS)

• Dijkstra's Algorithm

• A* Algorithm (not covered)

The details will be discussed in a later chapter.

1.5.2 Advantages of Graph Traversal for State Space Navigation

• Clarity and Structure: Representing problems as graphs provides a clear and structured way to visualize and solve them.

• Algorithmic Tools: A wide range of well-established algorithms for graph traversal and pathfinding can be directly applied.

• Optimization: Graph-based methods allow for the optimization of solutions, such as finding the shortest or least costly path.

• Scalability: Graph algorithms are designed to handle large and complex state spaces efficiently.

1.5.3 Summary

Viewing the navigation through state space as a graph traversal provides a powerful and versatile framework for problem-solving. By representing states as nodes and transitions as edges, we can apply various graph traversal techniques to explore and solve problems effectively. This approach leverages the extensive body of knowledge and algorithms in graph theory, offering clarity, structure and efficiency in finding solutions.

1.6 The structure of problem decomposition

Complex problems can most often be broken down into smaller, more manageable subproblems. This decomposition can be effectively represented as a Directed Acyclic Graph (DAG), which helps visualize the relationships between different components of a problem.

Figure 1.8: Example of a Directed Acyclic Graph (DAG)

In the DAG above:

• Nodes represent subproblems or components of the main problem.

• Edges represent dependencies or relationships between these components.

• The acyclic nature ensures there are no circular dependencies.

• The direction of edges indicates the flow of information or sequence of solving.

This structure allows us to:

• Break down complex problems into manageable parts

• Understand the order in which these parts need to be addressed

• Identify independent subproblems that can be solved in parallel

• Recognize shared subproblems to avoid redundant work

The DAG representation is a powerful tool in problem-solving, as it provides a clear visual of the problem's structure, dependencies and potential solution paths. It's applicable across various domains, from software development to project management, and forms the basis for many other problem-solving techniques we'll explore in later chapters.

A graph representing a state space or problem can be considered to "collapse" into a Directed Acyclic Graph (DAG) under certain conditions. This transition often implies a simplification or restructuring of the problem, allowing for more efficient solutions.

1.6.1 Conditions for Collapse into a DAG

• Acyclic Nature: The graph must be acyclic, meaning there are no cycles or loops. Each state or node is visited only once, ensuring that the graph flows in one direction without revisiting any node.

• Dependency Structure: The problem can be decomposed into subproblems with dependencies that are strictly hierarchical. Each subproblem depends on the results of other subproblems in a way that avoids circular dependencies.

1.6.2 Implications of Collapsing into a DAG

When a problem's graph representation collapses into a DAG, it means that the problem can be structured into a hierarchy of subproblems. This restructuring provides several advantages:

• Subproblem Definition: The problem is now decomposed into smaller, manageable subproblems. Each node (or state) in the DAG represents a subproblem that contributes to the overall solution.

• Topological Ordering: The DAG allows for a “topological sorting” of nodes, providing an order in which subproblems should be solved. This order respects the dependencies and ensures that each subproblem is solved before it is needed by other subproblems. We will explore topological sorting in Chapter 3.

• Efficiency in Problem-Solving: With a DAG, we can apply dynamic programming and memoization techniques. Since the graph is acyclic, we can store the results of subproblems and reuse them, avoiding redundant computations.

1.6.3 Equivalence of DAG Collapse and Subproblem Decomposition

The collapse of a graph into a DAG is effectively equivalent to recognizing that the problem can be decomposed into subproblems. Here’s why:

• Hierarchical Subproblems: A DAG inherently represents a hierarchy or order of subproblems. Each node depends on the results of its predecessors, aligning with the concept of solving smaller problems to build up to the solution of the larger problem.

• No Cycles: The absence of cycles ensures that there is a definitive direction to the problem-solving process, similar to how subproblems are solved sequentially without looping back.

• Reusability of Solutions: In a DAG, once a subproblem is solved, its solution can be reused by any other subproblem that depends on it. This is a fundamental aspect of dynamic programming.

1.6.4 Example: Dynamic Programming

Consider the classic example of the Fibonacci sequence, where each number is the sum of the two preceding ones:

1, 1, 2, 3, 5, 8, 13, 21, 34…

• Graph Representation: Initially, we might represent this with a graph where each node corresponds to a Fibonacci number and edges represent the dependency (e.g., F(n) depends on F(n−1) and F(n−2)).

• DAG Structure: Though visualized as a tree, this graph is inherently a DAG because there are overlapping (repeated) subproblems and there are no cycles: once a Fibonacci number is computed, it is used by the subsequent numbers without revisiting. Try to draw the DAG yourself to see the structure.

Figure 1.9: Dependency Tree of Fibonacci Numbers. Can you see the DAG?

• Subproblem Decomposition: Each Fibonacci number is a subproblem that depends on the solutions to two smaller subproblems. Dynamic programming can be used to compute each number efficiently by storing and reusing previously computed values.
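
As a preview of the dynamic programming techniques covered later, here is a minimal Python sketch of this decomposition using memoization; the cache dictionary and function name are illustrative, not from the original text.

memo = {1: 1, 2: 1}  # solutions to the two smallest subproblems

def fib(n):
    if n not in memo:
        # Each subproblem is solved once and stored for reuse.
        memo[n] = fib(n - 1) + fib(n - 2)
    return memo[n]

print(fib(9))  # prints 34, the 9th number in the sequence above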

1.6.5 Summary

A problem's graph representation collapses into a DAG when it can be structured into a hierarchy of subproblems with no cyclic dependencies. This transformation indicates that the problem can be decomposed into manageable subproblems, facilitating the use of efficient problem-solving techniques like dynamic programming.

The DAG structure provides a clear order for solving subproblems, ensuring
that each subproblem's solution is available when needed by other
dependent subproblems. This equivalence highlights the powerful synergy
between graph theory and algorithm design in optimizing complex
problem-solving processes.
1.7 What is a solution?
A solution is a satisfactory answer or resolution to a problem. It bridges the
gap between the current state and the desired state while adhering to the
given constraints. In the context of computational problems, a solution
typically has the following characteristics:

• Correctness: A correct solution ensures that the problem is accurately addressed as per the given requirements.

• Efficiency: Efficient solutions save time and resources, making them practical for real-world applications.

• Completeness: A complete solution works for all possible valid inputs, ensuring reliability.

• Clarity: Clear solutions are easier to understand, implement and maintain.

• Robustness: Robust solutions handle unexpected inputs gracefully, preventing failures.

• Scalability: Scalable solutions perform well even as the problem size grows, ensuring long-term usability.

Often, there might be multiple valid solutions to a problem, each with its
trade-offs in terms of these characteristics. The choice of the best solution
typically depends on the specific context and requirements of the situation.

Understanding what constitutes a solution is crucial in problem-solving, as it guides the development of algorithms and helps in evaluating different approaches. As we delve deeper into various problem-solving techniques in the following chapters, we'll see how these fundamental concepts of problems and solutions form the backbone of computational thinking.

1.7.1 Types of Solutions in State Space

In the context of a graph representation of a problem, a solution can take various forms depending on the nature of the problem and the specific requirements. Here, we'll explore the different interpretations of a solution within this framework.

Single Final State (Sf)

• Definition: A single, specific state that signifies the completion or solution of the problem.

• Example: In a maze-solving problem, reaching the exit cell (Sf) represents the solution.

• Graph Representation: The solution is a path from the initial state (S0) to this final state (Sf).

Figure 1.10: “Single final state” solution

Set of Final States (F)

• Definition: A collection of acceptable final states, any of which would be considered a valid solution.

• Example: In a game, reaching any of several winning states could be considered a solution. For example, in football the states could be encoded as (number of goals by team A, number of goals by team B), so the set of possible winning states is (1, 0), (2, 0), (2, 1), (3, 2)…

• Graph Representation: The solution includes multiple paths from the initial state (S0) to any of the states in the set F.

Figure 1.11: “Set of final states” solution

Sequence of States

• Definition: The ordered sequence of states or steps taken to reach the solution.

• Example: The sequence of moves in a puzzle game that leads from the starting position to the completed puzzle.

• Graph Representation: The solution is the specific sequence of nodes (states) and edges (transitions) that leads from S0 to Sf.

Figure 1.12: “Sequence of states” solution

Optimal Path

• Definition: The path that not only reaches the final state but does
so in the most efficient way according to a given metric (e.g.,
shortest path, least cost).

• Example: Finding the shortest route in a travel itinerary.

• Graph Representation: The solution is the path with the minimal number of edges or the least cumulative weight from S0 to Sf.

Figure 1.13: “Optimal path” solution

All Possible Paths

• Definition: Every possible path from the initial state to the final
state(s), providing a complete set of solutions.

• Example: Generating all possible combinations of items that satisfy a certain condition.

• Graph Representation: The solution set includes all paths from S0 to the final states.

Figure 1.14: An example state space graph with starting node 3 and its solution
encoded as a tree of all paths to ending node 9

In the above example, the solution paths (assuming 3 is the starting state
and 9 is the final state) are:

(3,6,5,7,9), (3,6,7,9), (3,6,8,9)

State Space Coverage

• Definition: The coverage of all reachable states from the initial state, ensuring all possibilities have been considered.

• Example: In an exhaustive search, covering the entire state space to ensure no potential solution is missed.

• Graph Representation: The solution involves traversing the entire graph or a significant portion of it.

Figure 1.15: State space coverage

1.7.2 Determining the Type of Solution Needed

The type of solution required depends on the problem's context and the
specific goals:

• Optimization Problems: Often require finding an optimal path (shortest, least cost).

• Search Problems: May need a single final state or a set of final states.

• Exploratory Problems: Might require all possible paths or state space coverage.

Example: Travelling Salesperson Problem (TSP)

• Single Final State: Finding a route that visits all cities and returns to
the start, considering the final state as the completed tour.

• Optimal Path: The shortest possible route that visits all cities.

• Sequence of States: The order of cities visited in the optimal route.

Example: Sudoku Puzzle

• Single Final State: A filled and valid Sudoku grid.

• Sequence of States: The steps or moves made to fill the grid from
the initial to the final state.

• All Possible Paths: All possible sequences of moves that result in a completed grid.

Note: Even if the state space is "continuous", for the purposes of computation every delta is a discrete step, thus bringing us back to a state space graph representation.

1.8 Conclusion
In this chapter, we explored the fundamentals of problem-solving, starting
with the definition and types of problems and delving into the foundational
elements of data structures. Understanding these core concepts allows us to
approach problem-solving systematically and efficiently.

We also introduced the concept of state space and its representation as a graph, highlighting how navigating through this space can be visualized as graph traversal. This perspective enables us to apply a wide range of graph-based algorithms to find solutions. Additionally, we examined how complex problems can be decomposed into manageable subproblems using Directed Acyclic Graphs (DAGs), facilitating efficient problem-solving through dynamic programming and other techniques.

By mastering these concepts, you'll be equipped with the tools to tackle a wide array of computational problems effectively.

Chapter 2
Computation and Complexity

2.1 What is an algorithm?


An algorithm is a process or set of rules to be followed in calculations or
other problem-solving operations, especially by a computer. In the context
of the state space approach, it can be seen as a set of rules to be followed
to navigate the state space efficiently to arrive at a solution.

Before we dive into understanding programs and computation from scratch, this section is designed to give you a taste of the problem-solving process by providing some definitions.

2.1.1 Greedy Algorithms

Greedy algorithms take a local optimum choice at each step with the hope
of finding a global optimum. They traverse the problem space graph by
always choosing the most promising node that can be reached directly from
the current node.

2.1.2 Divide and Conquer

Divide and conquer algorithms break the problem down into independent
subproblems, solve each subproblem recursively and then combine the
solutions to solve the original problem. This can be visualized as splitting
the graph into disjoint subgraphs, each representing a fraction of the
original problem.

2.1.3 Memoization and Caching

Memoization is a technique used in conjunction with graph traversal, where the results of subproblems are stored (cached) when they are first computed. If the same subproblem arises again as the graph is traversed, the stored solution is used instead of recalculating it, greatly improving efficiency.

Memoization Explained: The Cinnamon Analogy

Imagine you're in the kitchen, embarking on a baking spree. Your menu for
the day includes two items: a delicious cinnamon cake and scrumptious
cinnamon buns. Both recipes require cinnamon, a key ingredient for that
warm, spicy flavour we all love.

As you begin with the cinnamon cake, you reach for the cinnamon, only to find it's missing. So, you make a trip to the store and buy a packet of cinnamon. The cake turns out wonderfully and it's now time to start on the buns. You need cinnamon again. Would you head back to the store without a second thought? Of course not! You'd remember that you already have cinnamon from your earlier trip and use it, saving time and effort.

This scenario perfectly illustrates the concept of memoization in the realm of algorithms and programming.

Memoization is a technique used to speed up computer programs by storing the results of expensive function calls and reusing those results when the same inputs occur again. Rather than computing the same information multiple times, the program "remembers" the results of past computations, much like how you remembered you had already bought cinnamon. This reduces the number of calculations needed, making the program more efficient.

• First Instance (Buying Cinnamon): Represents the initial computation of a function. Just like buying cinnamon for the cake, the program calculates the result for an input it hasn't seen before and stores it.

• Subsequent Instance (Reusing Cinnamon): Corresponds to accessing the stored result. When you go to make the cinnamon buns, instead of buying more cinnamon, you reuse what you have. Similarly, the program retrieves the stored result for a known input without recalculating it.

This approach is particularly beneficial in programming for operations with overlapping subproblems and optimal substructure, common in dynamic programming. It ensures that each unique calculation is done only once, mirroring the way you wouldn't make unnecessary trips to the store for ingredients you already have.

Memoization, thus, embodies the principle of efficiency through reuse, whether in the kitchen or in code, ensuring that efforts — be they in baking or computing — are never wasted.
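
Here is a minimal Python sketch of the analogy, assuming a plain dictionary as the cache; the pantry metaphor and names are illustrative.

pantry = {}  # our cache: ingredients we already bought

def get_ingredient(name):
    if name not in pantry:             # first instance: go to the store
        print(f"Buying {name}...")
        pantry[name] = f"a packet of {name}"
    return pantry[name]                # subsequent instances: reuse it

get_ingredient("cinnamon")  # prints "Buying cinnamon..."
get_ingredient("cinnamon")  # no trip to the store this time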

2.1.4 Dynamic Programming (DP)

DP is a method that combines this idea of graph traversal with memoization. It systematically breaks down a problem into subproblems, solves each subproblem just once, stores the solution in a table (often conceptualized as a graph) and reuses these solutions to solve larger related subproblems. The difference between DP and just memoizing is that DP also finds the optimal order of traversing the state space so as to minimize time and space usage. This will be explored in detail in the last chapter.

Figure 2.1.a: A state space graph

Figure 2.1.b: The corresponding decomposition into subproblem trees. With DP, nodes can be evaluated just once in reverse topological order. In this tree, a valid DP ordering could be 9, 7, 8, 5, 6, 4, 2, 3, 1

2.2 What is a program? (barebones)

A program is a set of instructions that tells a computer how to perform a specific task. We can use programs to systematically implement solutions to problems. In order to communicate with the computer and write programs, we need to learn and use a programming language such as Python, C++ or Java.

This book is not aimed at teaching programming, but we will cover the
basics using Python as it is relatively easy to understand. At its core, a
program consists of several fundamental elements:

2.2.1 Branches

Branches in programming allow for decision-making and control flow. They enable a program to execute different code blocks based on certain conditions.

If-else statements

These allow for binary decisions. If a condition is true, one block of code is
executed; otherwise, another block is executed.

x = 1  # example value
if x == 0:
    print("x is 0")
elif x == 1:
    print("x is 1")
else:
    print("x is some other value")

Switch-case/match-case statements

These are useful for multiple condition checking, especially when comparing a single variable against multiple values.

Match-case is available in Python from version 3.10:

x = 1  # example value
match x:
    case 0:
        print("x is 0")
    case 1:
        print("x is 1")
    case _:
        print("x is some other value")

“case 0” refers to the case where x is 0, “case 1” refers to the case where x
is 1 and so on. This can be seen as a sort of shorthand for the if-else blocks
we saw earlier.

2.2.2 Loops

Loops are fundamental for repetitive tasks in programming. They allow a set
of instructions to be executed multiple times.

For loops

Ideal when the number of iterations is known beforehand.

for x in range(0, 10):
    print(x)

The above code prints:

0
1
2
3
4
5
6
7
8
9

While loops

Used when the number of iterations is unknown and depends on a condition.

x = 0
while x < 10:
    print(x)
    x = x + 1

This code also prints:

0
1
2
3
4
5
6
7
8
9

Do-while loops

Similar to while loops, but guarantees at least one execution of the loop
body.

Python doesn’t provide the option of using a do-while loop, so this example
simulates one:

x = 0
print(x)
x = x + 1
while x < 10:
    print(x)
    x = x + 1

This code also prints:

0
1
2
3
4
5
6
7
8
9

2.2.3 Variables

Variables are containers for storing data in a program. They are fundamental
to manipulating and processing information.

In this example, both x and y are variables holding integer values.

x = 0
y = 0  # in a real program, y would be assigned elsewhere
if y == 0:
    x = 1

2.2.4 I/O (Input/Output)

I/O operations are crucial for a program to interact with the external world,
including users and other systems.

Here is a program that prints to the screen, asks the user for a number and, if the number is equal to 42, prints a statement:

print("Hello World!")
data = input("Please enter a number:")
if data == 42:
print("Congratulations! You have entered the winning
number!")

2.2.5 Operations

Operations are the basic computations performed on data.

Arithmetic operations

Basic: addition, subtraction, multiplication, division

Advanced: modulus, exponentiation

# You can add a comment in the code like this

x = 3
y = 2
print(x + y)  # prints 5
print(x - y)  # prints 1
print(x * y)  # prints 6
print(x / y)  # prints 1.5

Logical operations

AND, OR, NOT for boolean logic

x = True
y = False
print(x or y)   # prints True because either one of x or y is True
print(x and y)  # prints False because both x and y have to be True
print(not x)    # prints False because x is True

Bitwise operations

These operations are for manipulating individual bits in data. Useful for
low-level programming and optimizations. We won’t bother about these in
this book.

2.2.6 Methods (Functions)

Methods or functions are reusable blocks of code that perform specific tasks.

Defining and calling methods

Function signatures: name, parameters (also called arguments), return type (in some languages)

Function body: the actual implementation

def sum(x, y):
    return x + y
# The method definition is complete

a = 5
b = 4
print(sum(a, b))  # prints 9

In this example, “def sum(x, y):…” is a method definition.

2.2.7 Recursion

Recursion is a powerful technique where a function calls itself to solve a problem.

Base cases and recursive cases

Base case: condition to stop recursion

Recursive case: problem broken down and function called again

Consider the example of computing the factorial of a number. As a refresher, the factorial of a natural number n is given by the formula:

n! = n × (n − 1) × (n − 2) × (n − 3) × ... × 2 × 1

For example:

5! = 5 × 4 × 3 × 2 × 1 = 120

def factorial(n):
    if n <= 0:
        return 1  # base case
    return n * factorial(n - 1)  # recursive case

print(factorial(5))  # prints 120

Advantages

• Elegant solutions for problems with a recursive nature (e.g., tree traversals)

• Can lead to more readable and maintainable code for certain algorithms

Limitations

• Potential for stack overflow with deep recursion. We will discuss the call stack in the next few sections.

• May be less efficient than iterative solutions in some cases.

2.2.8 What is execution?

Execution in computing can be understood through a simple cooking scenario. Let's consider making carrot soup:

You, the cook, need to:

1. Cut the carrots

2. Fill a pot with water

Now, here's where it gets interesting:

You, as the human cook, are like one CPU or executing agent. You can
actively perform tasks, make decisions and process information. In this
case, you can cut the carrots.

But there's another "agent" at work - the universe itself, or more specifically,
gravity and the water system. This agent can also "execute tasks" for you.
When you turn on the tap, gravity and the water system work to fill the pot
without your constant attention.

Figure 2.2: Serial and parallel (simultaneous) ordering

So now we have two "executing agents":

• You (the human/CPU) - cutting carrots

• The universe (gravity/water system) - filling the pot

By leveraging both agents simultaneously, you can achieve parallel execution:

• You start filling the pot (initiating the "universe agent")

• While the pot is filling, you cut the carrots (you, the "human
agent", working)

This parallel approach is more efficient because two tasks are being
completed simultaneously by different "executing agents".

In computing terms:

• You are like the main CPU

• The universe is like a separate CPU core

Execution in a computer involves several key components that are discussed in the following subsections.

2.2.9 The Execution Stack

Call Stack for Managing Function Calls

The execution stack, commonly referred to as the call stack, is a special kind of stack data structure that stores information about the active subroutines of a computer program. It manages function calls in a last-in, first-out (LIFO) manner.

It is called a “stack” because it works just like a stack in real life. Consider a
stack of plates. Plates are both removed from and added to the top. Stacks
are usually implemented using arrays, a type of data structure discussed in
the next chapter.

When a function is called, a stack frame is created and pushed onto the
stack. This frame contains the function's local variables, arguments and
return address. Once the function execution is complete, the frame is
popped off the stack.

Stack Frames and Local Variables

A stack frame is a data structure that contains all the information needed to
manage a single execution of a function. Each frame includes the function's
parameters, local variables and the address to return to after the function
finishes. This isolation allows for recursive function calls and proper
management of nested function executions.

To clarify, a “nested function execution” refers to a function being called from inside another function as follows:

def func1(x, y):
    return x + y

def func2(a, b, c):
    return func1(a * b, c)

print(func2(2, 3, 4))  # prints 2*3+4 = 6+4 = 10

Relationship Between the Stack and Depth-First Search (DFS)

The call stack is inherently linked to the concept of Depth-First Search (DFS) in that both utilize a stack-based approach to manage their operations. In DFS, the traversal of nodes is managed using a stack to keep track of the path, mimicking the behaviour of the call stack during recursive function calls.

2.2.10 Program as a Graph

Representing a Program as a Graph of Method Calls

Figure 2.3: Method call graph

def a(x):
    return 2 * x

def b(x):
    return 3 * x

def g(x):
    return a(x) + b(x)  # equivalent to returning 5*x

def h(x):
    return b(x)  # equivalent to returning 3*x

def f(x):
    return g(x) + h(x)  # equivalent to returning 8*x

print(f(2))  # prints 16

Programs can be represented as graphs where nodes correspond to methods or functions and edges denote the calls between them. This graph representation helps in visualizing the flow of control within the program, making it easier to understand complex call hierarchies and dependencies.

2.2.11 Universal Nature of Graph Representations

Graph representations are ubiquitous in computer science because they provide a clear and structured way to model relationships and processes. Whether representing control flows in programs, network topologies, or I/O operations, graphs enable a unified approach to problem-solving.

Example: Flowcharts of Program Execution Resemble Graphs Used for Representing I/O Operations

Flowcharts used to represent program execution are akin to graphs that model I/O operations. Both utilize nodes and edges to denote states and transitions. This universal graphical representation highlights the step-by-step execution or data movement through the system, facilitating understanding and analysis.

2.3 What is a process?

2.3.1 A Process is the Smallest Unit of Execution

A process is the fundamental unit of CPU utilization, capable of being scheduled and executed independently. A process is a “program in execution”. It includes a program counter, a stack and a set of registers, along with common program data and memory blocks. In the example we saw earlier, “filling water” and “cutting carrots” are two separate processes.

2.3.2 Concurrent Execution

Figure 2.4: Types of execution. P1 and P2 are processes. Notice the similarities to
the “carrot soup” diagram

Concurrent execution refers to the ability of the system to manage multiple processes simultaneously. This can occur through parallelism (true simultaneous execution on multiple processors) or multitasking (time-slicing on a single processor).
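
As a hedged illustration, the following sketch uses Python's multiprocessing module to run two independent processes, mirroring the carrot-soup example; the function names and timing are illustrative assumptions.

from multiprocessing import Process
import time

def fill_pot():
    time.sleep(1)  # the "universe agent" takes a moment
    print("Pot is full of water")

def cut_carrots():
    print("Carrots are cut")

if __name__ == "__main__":
    # Start both processes; each is an independent unit of execution.
    p1 = Process(target=fill_pot)
    p2 = Process(target=cut_carrots)
    p1.start()
    p2.start()
    p1.join()  # wait for both to finish
    p2.join()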

2.3.3 Multiprocessing and Its Advantages

• Improved Performance: Multiprocessing can significantly enhance performance by allowing multiple operations to run concurrently.

• Responsive Applications: Multiprocessing can keep the operating system responsive, such as when a GUI (Graphical User Interface) application runs concurrently with other background processes.

2.3.4 Challenges in Multiprocess Programming (Race Conditions, Deadlocks)

• Race Conditions: Occur when multiple processes access and modify shared data concurrently, leading to unpredictable results.

• Deadlocks: Situations where two or more processes are unable to proceed because each is waiting for the other to release resources.

• Synchronization: Proper synchronization mechanisms are essential to manage access to shared resources and prevent race conditions and deadlocks.

2.4 What is complexity?

2.4.1 Dictionary definition

The Oxford Languages dictionary defines complexity as "the state or quality of being intricate or complicated." Complexity in Computer Science refers to the quantity of resources required by an algorithm to solve a problem. It measures the efficiency of an algorithm in terms of time and space, and helps in understanding the scalability and performance of algorithms under varying input sizes.

Understanding computational complexity involves breaking down an algorithm's operations and assessing their impact on performance and resource consumption. This assessment helps in designing efficient algorithms and optimizing existing ones to handle large-scale problems effectively.

2.4.2 Big O Notation

Big O notation is a fundamental concept in computer science used to describe the performance or complexity of an algorithm. It specifically characterizes the worst-case scenario of an algorithm's time complexity as a function of the input size. The "O" stands for "order of," and it provides an upper bound on the growth rate of the algorithm's running time.

For example, if we say an algorithm has O(n) time complexity, it means the
algorithm's running time grows linearly with the input size n in the worst
case. This notation allows us to compare algorithms abstractly, without
getting bogged down in implementation details or specific hardware
considerations.

2.4.3 Time Complexity

Understanding and Reasoning about Big O Notation

Before diving into specific complexity classes, it's crucial to understand how to reason about Big O notation. There are two main approaches.

Dominating Terms

When analyzing the runtime of an algorithm, we often end up with an expression that has multiple terms. In Big O notation, we're concerned with the term that grows the fastest as the input size increases. This is called the dominating term.

For example, if we have an algorithm with runtime

T(n) = 3n² + 20n + 100

where n is the length of the input, we reason as follows:

1. As n grows large, n² grows much faster than n or constants.

2. The coefficient 3 doesn't affect the rate of growth significantly.

3. Therefore, we say this algorithm is O(n²).

General rules for identifying dominating terms (from slowest to fastest growth):

Constants < Logarithms < Polynomials < Exponentials

So

O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(2ⁿ)

As a refresher, a logarithm is a way to measure how many times you need to multiply a number by itself to reach another number. For example, the logarithm base 10 of 100 is 2, because 10 multiplied by itself twice (10²) equals 100.

We write this as:

log₁₀(100) = 2

Here 10 is called the base of the logarithm. In computer science we deal with binary numbers, so the base is usually 2. For example:

log₂(8) = 3

because

2³ = 8
Using Limits

For more rigorous analysis, we can use limits to determine the Big O complexity. The idea is to compare the growth of our function to a standard complexity class.

If we have a function f(n) and we want to know if it's O(g(n)) where g(n) is some other function, we can check the limit:

lim (n→∞) f(n)/g(n)

If this limit is a finite number, then

f(n) = O(g(n))

As a refresher, in mathematics, a limit helps us understand the behaviour of a function as its input approaches a certain value.

The syntax for expressing a limit is:

lim (x→a) f(x) = L

meaning as x gets closer to a, the function f(x) approaches the value L. Limits are foundational for defining concepts like continuity, derivatives and integrals.

For example, to show that

3n² + 20n + 100 = O(n²)

we show that

lim (n→∞) (3n² + 20n + 100)/n² = lim (n→∞) (3 + 20/n + 100/n²) = 3

This is because the “n”s in the denominators tend to infinity and one over infinity is zero.

Since this limit is finite, we confirm that the function is indeed O(n²).

This method is particularly useful for more complex functions where the
dominating term isn't immediately obvious.

Common Complexity Classes

1. O(1) - Constant Time:
These algorithms perform a fixed number of operations, regardless of input size. Examples include accessing an array element by index or performing basic arithmetic operations.

2. O(log n) - Logarithmic Time:
The running time grows logarithmically with input size. These algorithms typically reduce the problem size by a constant factor in each step. Binary search is a classic example: it repeatedly halves the search space until the desired element is found (see the sketch after this list).

3. O(n) - Linear Time:

The running time grows linearly with input size. Examples include
traversing an array or performing a simple search in an unsorted
list.

4. O(n log n) - Log-linear Time:
This complexity is often seen in efficient sorting algorithms like mergesort and quicksort (average case). It's faster than quadratic time but slower than linear time for large inputs.

5. O(n²) - Quadratic Time:
Often resulting from nested loops, this complexity is seen in simpler sorting algorithms like bubble sort or insertion sort. Performance degrades quickly as input size increases.

6. O(2ⁿ) - Exponential Time:
These algorithms have rapidly growing run times and are typically only feasible for small inputs. The naive recursive solution to the Fibonacci sequence (without optimization) is an example.
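
As promised above, here is a minimal sketch of binary search, the classic O(log n) example; the function name and test values are illustrative.

def binary_search(sorted_arr, target):
    lo, hi = 0, len(sorted_arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2  # halve the search space each step
        if sorted_arr[mid] == target:
            return mid
        elif sorted_arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not found

print(binary_search([1, 3, 5, 8, 13], 8))  # prints 3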

Analyzing Algorithms for Their Time Complexity

To analyze an algorithm's time complexity:

1. Identify the basic operations: Determine which operations contribute significantly to the running time.
2. Count the operations: Express the number of operations as a function of input size.
3. Identify the dominant term: As input size grows, which term has the most significant impact?
4. Express in Big O notation: Drop constants and lower-order terms, keeping only the most significant factor.

For example, consider this simple function:

def sum_and_product(arr):
    sum = 0
    product = 1
    for num in arr:
        sum += num
        product *= num
    return sum, product

This function performs 2n + 2 operations (n additions, n multiplications and 2 initializations). As n grows large, the dominant term is n, so we express this as O(n) time complexity.

2.4.4 Space Complexity

Memory Usage of Algorithms

Space complexity measures the total amount of memory an algorithm uses relative to the input size. It includes:

• Input space: Memory needed to store the input data.

• Auxiliary space: Extra space used by the algorithm (e.g., temporary variables, recursion stack).

We can use the same Big O notation for reasoning about space complexity
as well.

For example, an in-place sorting algorithm like quicksort uses O(log n) auxiliary space for its recursion stack, while mergesort typically uses O(n) auxiliary space for merging.

Trade-offs Between Time and Space Complexity

Often, there's a trade-off between time and space complexity. Some common scenarios include:

• Memoization: Storing previously computed results can drastically reduce time complexity at the cost of increased space usage.

• Data structures: More complex data structures (e.g., hash tables) can offer faster access times but require more memory.

• Algorithm choice: Some algorithms, like counting sort, can achieve O(n) time complexity but may require O(k) additional space, where k is the range of input values (a sketch follows this list).
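
Here is that sketch: a minimal counting sort, assuming small non-negative integer inputs; the function name and test values are illustrative.

def counting_sort(values, k):
    # Uses O(k) auxiliary space, where k is one more than
    # the largest possible value in the input.
    counts = [0] * k
    for v in values:  # O(n): tally each value
        counts[v] += 1
    result = []
    for v, c in enumerate(counts):
        result.extend([v] * c)  # emit each value as often as it occurred
    return result

print(counting_sort([3, 1, 4, 1, 5], 6))  # prints [1, 1, 3, 4, 5]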

Understanding these trade-offs is crucial for choosing the right algorithm for
a given problem and hardware constraints.

In conclusion, analyzing both time and space complexity provides a comprehensive view of an algorithm's efficiency. This analysis helps developers make informed decisions about algorithm selection and optimization, balancing performance needs with available resources.

2.5 What is a Turing Machine?


A Turing Machine is a foundational theoretical concept in the theory of
computation, introduced by Alan Turing in 1936. It provides a simple yet
powerful model for understanding how computers process information and
solve problems.

Figure 2.5: Structure of a Turing Machine

2.5.1 Components of a Turing Machine

• Tape: The tape is an infinite, linear strip divided into cells, each
capable of holding a symbol from a finite alphabet. It acts as the
machine's memory.

• Head: The head is a read/write device that scans the tape. It can
read the symbol in the current cell, write a new symbol and move
the tape left or right one cell at a time.

• State Register: The state register holds the state of the Turing
Machine. At any given time, the machine is in one of a finite
number of states.

• Transition Function: The transition function defines the rules for the machine's operation. Given the current state and the symbol being read, it specifies the new state, the symbol to write and the direction to move the head (forward or backward).
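
Putting the four components together, here is a minimal, hedged sketch of a Turing Machine simulator in Python; the rule encoding and the bit-flipping example machine are illustrative assumptions, not from the original text.

def run_turing_machine(rules, tape, state="start", head=0):
    # rules maps (state, symbol) -> (new_state, new_symbol, move),
    # where move is +1 (right), -1 (left) or 0; "halt" stops the machine.
    tape = dict(enumerate(tape))  # sparse tape; blank cells read as "_"
    while state != "halt":
        symbol = tape.get(head, "_")
        state, tape[head], move = rules[(state, symbol)]
        head += move
    return "".join(tape[i] for i in sorted(tape))

# A machine that inverts a binary string, halting at the first blank.
rules = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt", "_", 0),
}
print(run_turing_machine(rules, "1011"))
# prints 0100_ (the trailing _ is the blank cell where it halted)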

2.5.2 Concept of Computability

A Turing Machine can simulate any algorithm, making it a universal model for computation. The concept of computability refers to what can be computed by a Turing Machine. If a problem can be solved by a Turing Machine, it is considered computable. This concept forms the basis for understanding the limits of what can be achieved with computation.

2.5.3 Relationship to Modern Computers and Programming Languages

Modern computers are practical implementations of the theoretical principles established by the Turing Machine. The core idea of manipulating symbols based on a set of rules is fundamental to all programming languages and computer operations. While real-world computers are finite and constrained by physical limits, the Turing Machine provides an idealized model that helps computer scientists understand the essential properties of computation.

2.6 Conclusion
In this chapter, we've explored the fundamental building blocks of
computation and complexity. We began by defining algorithms and
examining various algorithmic approaches such as greedy algorithms,
divide and conquer and dynamic programming. We then delved into the
structure of programs, discussing key elements like branches, loops,
variables and functions.

The concept of program execution was examined in detail, including the critical role of the execution stack and how programs can be represented as graphs of method calls. We also introduced the idea of processes as the smallest units of execution and touched on concurrent execution.

Remember, mastering these fundamentals is key to becoming proficient in algorithmic thinking and problem-solving.

Chapter 3
Fundamental Data Structures and their Algorithms

3.1 Introduction
In computer science, mastering the core structures of graphs and tables is
essential for efficient problem-solving.

Graphs are versatile tools for representing and analyzing relationships and
processes, from social networks to navigation systems. By leveraging graph
algorithms, we can find optimal solutions to complex problems across
various domains. Imagine a social network where people are represented
by dots (nodes) and their friendships by lines connecting these dots (edges)
or road networks where intersections are nodes and roads are edges, or
computer networks where devices are nodes and connections are edges.
These are simple examples of graphs.

Tables, including arrays and hash tables, are fundamental for fast data
access and manipulation. Arrays excel in indexed operations and are
crucial for tasks such as sorting large datasets in database management,
binary searching in search engines, and managing sequences in multimedia
applications. Hash tables, on the other hand, offer near-instantaneous
key-value management, making them ideal for use in implementing caches for
web browsers, managing dictionaries in language processing, and handling
routing tables in networking.

3.2 Undirected Graphs
In an undirected graph, relationships go both ways. Think of Facebook
friendships: if Alice is friends with Bob, Bob is also friends with Alice. This
mutual relationship is represented by a line without arrows between two
nodes.

3.2.1 Depth-First Search (DFS)

Imagine you're exploring a maze. You start down a path and keep going as
far as you can. When you hit a dead end, you backtrack to the last
intersection and try a different path. This is essentially how Depth-First
Search works on a graph. It's great for tasks like finding a path between two
points or checking if a path exists.

Figure 3.1: Depth-First Search (DFS). The arrows represent the order that the
nodes are visited in starting from node 0. Green represents a forward movement
and red represents backtracking.

Real-world application: In computer game AI, a depth-limited version of
DFS can be used to explore possible moves in games like chess, looking
several moves ahead to determine the best strategy.
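
As a rough illustration, here is a minimal Python sketch of recursive DFS on an adjacency-list graph. The small graph below is a hypothetical example, not the graph of Figure 3.1:

```python
# A minimal sketch of Depth-First Search: go as deep as possible,
# backtrack when every neighbour has been visited.
def dfs(graph, node, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    print(node)                              # "visit" the node
    for neighbour in graph[node]:
        if neighbour not in visited:
            dfs(graph, neighbour, visited)   # recursion handles backtracking
    return visited

graph = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
dfs(graph, 0)   # visits 0, 1, 3, then backtracks and visits 2
```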

3.2.2 Breadth-First Search (BFS)

Now imagine you're in a dark cave and you light a torch. The light spreads
out evenly in all directions, illuminating nearby areas before farther ones.
This is how BFS explores a graph - it checks all nearby nodes before moving
outward.

Real-world application: In social networks, BFS can be used to find the
shortest chain of connections between two people, similar to the "Six
Degrees of Separation" concept.
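
For comparison, here is a minimal Python sketch of BFS using a queue, on the same hypothetical graph as the DFS sketch:

```python
from collections import deque

# A minimal sketch of Breadth-First Search: the queue guarantees that
# all nodes in the current "layer" are visited before the next one.
def bfs(graph, start):
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()           # nearest unexplored node first
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order

graph = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(bfs(graph, 0))   # [0, 1, 2, 3] - layer by layer from node 0
```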

Figure 3.2: Breadth-First Search (BFS). Each colour represents a new “layer” of
traversal starting from node 0.

3.2.3 Trees

Trees are special types of graphs that have a hierarchical structure, like a
family tree or an organization chart. They start with a root node (like the
CEO in a company structure) and branch out, with each node having child
nodes (like department heads, then team leaders, then team members etc.).

Tree Traversals

Traversing a tree means visiting every node in a specific order. There are
several ways to do this:

• Pre-order traversal: Visit a node, then its left subtree, then its right
subtree. This is like reading a book chapter by chapter, where you
read the chapter title, then the first section, then the second section
and so on.

• In-order traversal: Visit the left subtree, then the node, then the
right subtree. In a Binary Search Tree (BST), this gives you the
elements in sorted order. A Binary Search Tree is a data structure
where each node's left child has values less than the node, and the
right child has values greater, enabling efficient searches.

• Post-order traversal: Visit the left subtree, then the right subtree,
then the node. This is useful when you need to delete a tree, as
you'd want to delete the children before the parent.

• Level-order traversal: Visit nodes level by level, from left to right.
This is like reading a book page by page, rather than chapter by
chapter. This is the same as applying BFS on the tree from the root.

Figure 3.3: Different types of tree traversals on a BST starting from root node 4.
Try to analyze them yourself!

Real-world application: In file systems, directories and files form a tree
structure. Different traversals can be used for tasks like calculating total file
size (post-order) or displaying the directory structure (pre-order).
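
Here is a minimal Python sketch of the three depth-first traversals; the Node class and the small BST built at the end are hypothetical examples mirroring Figure 3.3:

```python
# A minimal sketch of pre-order, in-order and post-order traversals.
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def pre_order(node):
    if node:
        print(node.value)           # node first...
        pre_order(node.left)        # ...then the left subtree
        pre_order(node.right)       # ...then the right subtree

def in_order(node):
    if node:
        in_order(node.left)
        print(node.value)           # node in between: sorted order in a BST
        in_order(node.right)

def post_order(node):
    if node:
        post_order(node.left)
        post_order(node.right)
        print(node.value)           # node last: children before the parent

root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
in_order(root)   # prints 1 to 7 in sorted order
```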

3.3 General Directed Graphs


In directed graphs, relationships have a direction. Think of Twitter follows: if
Alice follows Bob, Bob doesn't necessarily follow Alice. This is represented
by arrows between nodes. Directed graphs can have cycles.

Directed graphs can model many real-world scenarios:

• Web pages and links between them

• Road networks with one-way streets

• Workflow processes in a company

Figure 3.4: A general directed graph. Can you find the cycle?

All of the algorithms we discussed for undirected graphs are also applicable
for directed graphs - we just have to ensure that the algorithm respects the
directionality of the edges.

3.4 Directed Acyclic Graphs (DAGs)


A DAG is a directed graph without cycles. Imagine a project schedule: tasks
are represented by nodes and edges show which tasks need to be
completed before others can start. You can't have circular dependencies in
a project schedule, which is why it forms a DAG.

3.4.1 Topological Sorting

Topological sorting arranges the nodes of a DAG in a linear order that
respects all the directed edges. In our project schedule example, this would
give us an order to complete the tasks that ensures all prerequisites are met.

Real-world application: In software engineering, build systems use
topological sorting to determine the order in which to compile different
parts of a program, ensuring all dependencies are built before the
components that need them.
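
One common way to compute a topological order is Kahn's algorithm: repeatedly remove nodes that have no remaining prerequisites. Here is a minimal Python sketch; the task graph is a hypothetical example:

```python
from collections import deque

# A minimal sketch of topological sorting (Kahn's algorithm).
def topological_sort(graph):
    indegree = {node: 0 for node in graph}
    for node in graph:
        for successor in graph[node]:
            indegree[successor] += 1
    queue = deque(n for n in graph if indegree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for successor in graph[node]:
            indegree[successor] -= 1
            if indegree[successor] == 0:   # all prerequisites completed
                queue.append(successor)
    return order   # shorter than the graph if a cycle exists

# Edges point from a prerequisite task to the task that depends on it
tasks = {"design": ["build"], "docs": ["test"], "build": ["test"], "test": []}
print(topological_sort(tasks))   # e.g. ['design', 'docs', 'build', 'test']
```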

Figure 3.5: Topological sort

3.5 Linked Lists
Linked lists are simple structures where each element points to the next
and/or previous one, like a chain. They're useful when you need to
frequently add or remove elements from the beginning or end of a list.

Real-world applications:

• Undo functionality in text editors (each node represents a state of
the document)

• Music playlists (each node is a song that points to the next or
previous song)

Figure 3.6: A singly linked list. Each node points only to the next one and not the
previous one as in the case of a doubly linked list.

3.6 Weighted Graphs


Weighted graphs add values to the edges. These could represent distances,
costs, time, or any other measure. For example, in a road network, edges
could be weighted by distance or expected travel time.

Figure 3.7: A weighted graph

3.6.1 Dijkstra’s Algorithm

Dijkstra's shortest path algorithm finds the shortest path between two points
in a weighted graph. Imagine you're planning a road trip and want to find
the quickest route between two cities. Dijkstra's algorithm can help you
find this, taking into account the distances (or travel times) between each
city.

Figure 3.8: The result of running Dijkstra’s shortest path algorithm between two
nodes A and D. The length of the shortest path is 7. For “free”, the algorithm also
gives us the shortest path between A and every other node.

Real-world application:

Dijkstra's algorithm is widely used in modern navigation systems such as
GPS devices and mapping applications like Google Maps. These systems
help users find the shortest or fastest routes between destinations. By
representing the road network as a weighted graph, where intersections are
nodes and roads are edges with weights corresponding to distances or
travel times, it efficiently computes the optimal path.
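
Here is a minimal Python sketch of Dijkstra's algorithm using a priority queue. The weighted graph is a hypothetical example (not Figure 3.8), though its shortest A-to-D distance also happens to be 7:

```python
import heapq

# A minimal sketch of Dijkstra's algorithm: always settle the closest
# unsettled node next, using a min-heap keyed on distance.
def dijkstra(graph, source):
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue                     # stale entry, already improved
        for neighbour, weight in graph[node]:
            new_dist = d + weight
            if new_dist < dist[neighbour]:
                dist[neighbour] = new_dist
                heapq.heappush(heap, (new_dist, neighbour))
    return dist   # shortest distance from the source to every node

graph = {
    "A": [("B", 3), ("C", 1)],
    "B": [("A", 3), ("D", 4)],
    "C": [("A", 1), ("D", 6)],
    "D": [("B", 4), ("C", 6)],
}
print(dijkstra(graph, "A"))   # {'A': 0, 'B': 3, 'C': 1, 'D': 7}
```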

3.6.2 Prim’s Algorithm

Prim's algorithm finds the minimum spanning tree of a weighted graph. The
minimum spanning tree is a tree extracted from the graph that connects
every node while keeping the sum of edge weights as small as possible.
Imagine you're designing a railway network to connect several cities using
the least total track length while ensuring all cities are connected. Prim's
algorithm solves this type of problem.

Figure 3.9: The minimum spanning tree of the graph from figure 3.7. The total
weight is 9

Real-world application: In network design, Prim's algorithm can be used to
design efficient computer networks, minimizing the total length of cable
needed to connect all computers.
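
Here is a minimal Python sketch of Prim's algorithm, reusing the hypothetical weighted graph from the Dijkstra sketch above: grow the tree one cheapest crossing edge at a time.

```python
import heapq

# A minimal sketch of Prim's algorithm: keep a heap of edges leaving
# the tree built so far and always take the cheapest one.
def prim(graph, start):
    visited = {start}
    edges = [(weight, start, v) for v, weight in graph[start]]
    heapq.heapify(edges)
    tree, total = [], 0
    while edges and len(visited) < len(graph):
        weight, u, v = heapq.heappop(edges)
        if v in visited:
            continue                     # this edge would create a cycle
        visited.add(v)
        tree.append((u, v, weight))
        total += weight
        for nxt, w in graph[v]:
            if nxt not in visited:
                heapq.heappush(edges, (w, v, nxt))
    return tree, total

graph = {
    "A": [("B", 3), ("C", 1)],
    "B": [("A", 3), ("D", 4)],
    "C": [("A", 1), ("D", 6)],
    "D": [("B", 4), ("C", 6)],
}
print(prim(graph, "A"))  # ([('A','C',1), ('A','B',3), ('B','D',4)], 8)
```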

3.7 Arrays
Arrays are fundamental data structures that store elements in contiguous
memory locations. They provide fast, constant-time (O(1)) access to
elements using indices. Think of an array like a row of numbered locker
compartments: each compartment is a cell and its number is the index of
that cell.

Arrays can be used for implementing stacks (LIFO) or queues (FIFO) as well.
LIFO stands for “last in first out” like a stack of plates and FIFO stands for
“first in first out” like a queue at a concert. They are also used to implement
heaps which are relatively advanced data structures. Arrays can be
extended into any number of dimensions, including the standard two
dimensions of a table that we might encounter in a book or spreadsheet.

3.7.1 Arrays and Random Access Memory (RAM)

The power of arrays lies in their direct mapping to computer memory. The
CPU can quickly access any array element using its memory address,
making array operations extremely efficient. This is possible because:

Arrays allocate contiguous memory blocks. The memory address of any
element can be calculated using its index and the size of each element. This
calculation is a simple arithmetic operation, allowing for O(1) access time.

For example, if an integer array starts at memory address 10 and each
integer occupies 4 bytes, the address of the ith element would be:

10 + (i × 4)

As a refresher, a bit is a binary digit, either 0 or 1, and a byte is a collection
of 8 bits, so 4 bytes hold 32 bits.

Figure 3.10: An array. v1, v2, v3 are some integer values stored in the array. The rest
of the cells are empty.

3.7.2 Arrays as Complete Graphs

Conceptually, we can also think of arrays as complete graphs where every
element has a direct connection to every other element. This perspective
highlights the O(1) access time for any element in an array. In this analogy:

• Each array element is a node in the graph.

• There's an implicit edge between every pair of elements.

• The "weight" of each edge could be considered the distance between
indices.

This view helps understand why operations like swapping elements or
accessing arbitrary positions are so efficient in arrays.

3.7.3 Algorithms

Sorting

Several sorting algorithms are particularly well-suited for arrays. Two of the
more efficient comparison-based algorithms are discussed here.

Quicksort

Quicksort is a divide-and-conquer algorithm that sorts an array by selecting
a 'pivot' element and partitioning the array into two sub-arrays, one with
elements less than the pivot and one with elements greater than the pivot.
The process is recursively applied to the sub-arrays until the entire array is
sorted.

Figure 3.11: Quicksort

• Average time complexity: O(n log n)

• Space complexity: O(log n)

• In-place sorting

• Efficient for large datasets
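
Here is a minimal Python sketch of the idea. For clarity it builds new lists rather than partitioning in place, so unlike a production quicksort it is not O(log n) in space:

```python
# A minimal sketch of quicksort: pick a pivot, partition, recurse.
def quicksort(items):
    if len(items) <= 1:
        return items                     # base case: already sorted
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal   = [x for x in items if x == pivot]
    larger  = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)

print(quicksort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]
```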

Mergesort

Mergesort is a divide-and-conquer algorithm that recursively divides an
array into two halves, sorts each half and then merges the sorted halves
back together. This process continues until the array is fully sorted, ensuring
a stable O(n log n) time complexity.

Figure 3.12: Mergesort

• Time complexity: O(n log n)

• Space complexity: O(n)

• Stable sort

• Efficient for linked lists and external sorting
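
A minimal Python sketch of mergesort, for comparison with the quicksort sketch above:

```python
# A minimal sketch of mergesort: split in half, sort each half, merge.
def mergesort(items):
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left, right = mergesort(items[:mid]), mergesort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # <= keeps equal items in order (stable)
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(mergesort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]
```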

Comparison:

• Quicksort is often faster in practice but has a worst-case time
complexity of O(n²) (can you reason why?).

• Mergesort has a consistent O(n log n) performance but requires
additional space.

Binary Search

Binary search is an efficient algorithm that finds the position of a target
value within a sorted array by repeatedly dividing the search interval in half.
If the target value is less than the midpoint value, the search continues on
the lower half; otherwise, it continues on the upper half until the target is
found or the interval is empty.

Time complexity: O(log n)

Space complexity: O(1) for iterative implementation, O(log n) for recursive

Prerequisites:

• The array must be sorted.

• Random access to elements is required (arrays are ideal).

Applications

• Finding an element in a sorted dataset

• Implementing efficient lookup in databases

• Optimizing other algorithms that work on sorted data
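
Here is a minimal Python sketch of the iterative version:

```python
# A minimal sketch of binary search: halve the interval on each step.
def binary_search(items, target):
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid                   # found: return its index
        elif items[mid] < target:
            low = mid + 1                # target must be in the upper half
        else:
            high = mid - 1               # target must be in the lower half
    return -1                            # interval empty: not present

print(binary_search([1, 3, 5, 7, 9, 11], 5))   # 2
```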

Figure 3.13: Binary Search for element 5

3.8 Dictionaries
Dictionaries, also known as associative arrays, hash maps, or hash tables,
are abstract data types that store key-value pairs. They work by allowing
you to associate a unique key with a specific value, much like looking up a
word in a dictionary to find its definition. This makes them incredibly useful
for quickly finding and updating data. For example, a dictionary can store a
phone book where each person's name is the key and their phone number
is the value. Because dictionaries use a special method called hashing to
organize the data, they can retrieve information almost instantly, even if the
dictionary contains a large amount of data. Dictionaries are implemented
using arrays under the hood.

3.8.1 Hashing

Hashing is a technique that transforms input data of arbitrary size into a
fixed-size value, typically for indexing or quick data retrieval. This
fixed-size value is called a hash and the function that performs this
transformation is known as a hash function.

Hashing Analogy: Hash Browns from Potatoes

Imagine you have a pile of potatoes, each potato being unique in size,
shape and weight. You decide to make hash browns, which involves
shredding the potatoes into small, uniform pieces. After shredding, each
batch of hash browns looks similar, even though they originally came from
different potatoes. The process of shredding can be thought of as a hash
function and the hash browns represent the fixed-size hash values.

In this analogy:

• Potatoes represent the original input data of arbitrary size.

• Shredding is the (repeated) hash function that processes the
potatoes.

• Hash Browns are the output hash values (multiple) of fixed size.

The key points here are:

• Irreversibility: Just as you cannot reconstruct the original potatoes
from the hash browns, it is computationally infeasible to reverse
the hash value to obtain the original input data.

• Uniformity: Different potatoes produce similar-looking hash
browns, ensuring that different inputs produce uniformly
distributed hash values.

• Efficiency: Shredding potatoes (hashing) is a quick process, making
hashing an efficient way to handle data.

How Hashing Works

Figure 3.14: Hashing

Hash functions take input data (e.g., strings, files) and convert them into a
fixed-size string of characters, which is typically a sequence of numbers
and letters. Here's a simplified breakdown of the hashing process:

• Input Data: The data to be hashed, such as a password or a file.

• Hash Function: A function that processes the input data to produce
a hash value. Common hash functions include MD5, SHA-1 and
SHA-256.

• Output Hash: The fixed-size string that is the result of the hash
function.

Example:

• Input: "password123"

• Hash Function: SHA-256

• Output Hash:
"ef92b778bafe771e89245b89ecbc5c308c20e040ef2d59fbf0145ef4ff9f3b67"

Application: Password Storage and Matching

Hashing is commonly used in password storage and verification due to its
security properties. Here's how it works:

When a user creates a password, the system applies a hash function to the
password. The resulting hash value is stored in the database, not the plain
text password.

Example:

• Password: "mypassword"

• Hash (stored in database):


"5f4dcc3b5aa765d61d8327deb882cf99"

When the user logs in, they enter their password.

The system hashes the entered password using the same hash function. It
then compares the resulting hash with the stored hash value. If the hashes
match, the password is correct; otherwise, it is incorrect.

Example:

• Entered Password: "password"

• Hash of Entered Password: "5f4dcc3b5aa765d61d8327deb882cf99"

• Stored Hash: "5f4dcc3b5aa765d61d8327deb882cf99"

• Result: Passwords match, access granted.

In practice, there is more nuance than the above simple method due to
techniques like salting and peppering, but the general principle applies.
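
As a rough illustration, here is a minimal Python sketch of this store-then-compare flow using the standard library's hashlib. It deliberately omits salting, so treat it as a demonstration of the principle rather than production practice:

```python
import hashlib

# A minimal sketch of hash-based password checking (no salt, for brevity).
def hash_password(password):
    return hashlib.sha256(password.encode()).hexdigest()

stored_hash = hash_password("password123")    # saved at sign-up

entered = "password123"                       # typed at login
print(hash_password(entered) == stored_hash)  # True -> access granted
```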

Importance of Hashing in Security

• Irreversibility: Ensures that even if the hash values are
compromised, the original passwords cannot be easily recovered.

• Deterministic: The same input always produces the same hash,
ensuring consistency in password verification.

• Efficiency: Hashing functions are fast, making them suitable for
real-time password verification.

3.8.2 Hashing and Array Indexing

Hashing becomes particularly powerful when dealing with numbers due to
modular arithmetic*. Here's how it relates to array indexing:

• Generate a hash for an object.

• Take the remainder of that hash when divided by the array size.

• Use this remainder as the index to place the object in the array.

This method is computationally inexpensive, similar to RAM access, and the
resulting array index is as random as the hash itself, minimizing collisions.
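
A minimal Python sketch of this index calculation, using the built-in hash() as a stand-in for a real hash function (its output for strings varies between runs, which doesn't matter for the idea):

```python
# A minimal sketch of mapping a hash value to an array index.
def bucket_index(key, array_size):
    return hash(key) % array_size   # remainder lands in 0..array_size-1

ARRAY_SIZE = 8
for key in ["apple", "banana", "cherry"]:
    print(key, "->", bucket_index(key, ARRAY_SIZE))
```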

3.9 Sets
Sets are unordered collections of unique elements. They're often
implemented using hash tables for efficient operations. They can be thought
of as a special case of dictionaries where the key is the same as the value
stored.

3.9.1 Set Operations and Their Complexities

These operations are typically very efficient due to the use of hash tables,
making sets powerful for managing unique collections of data.

* Modular arithmetic is a system of arithmetic for integers, where numbers "wrap around"
upon reaching a certain value, called the modulus. For example, in modular arithmetic with
a modulus of 5, the expression 7%5 equals 2, because when 7 is divided by 5, the
remainder is 2.

Adding elements

Time complexity: O(1) average case, O(n) worst case

Process: Hash the element and add it to the appropriate bucket if not
already present

Removing elements

Time complexity: O(1) average case, O(n) worst case

Process: Hash the element, locate and remove it from the appropriate
bucket

Checking for membership

Time complexity: O(1) average case, O(n) worst case

Process: Hash the element and check if it exists in the corresponding
bucket

Set union

Figure 3.15: Set union of two sets

Time complexity: O(m + n), where m and n are the sizes of the two sets

Process: Create a new set and add all elements from both sets

Set intersection

Figure 3.16: Set intersection of two sets

Time complexity: O(min(m, n)), where m and n are the sizes of the two
sets

Process: Iterate through the smaller set and check each element's presence
in the larger set

Set difference

Figure 3.17: Set difference of the first set minus the second

Time complexity: O(m), where m is the size of the first set

Process: Create a new set, add elements from the first set that are not in the
second set
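
All of these operations are available directly on Python's built-in sets, which are implemented with hash tables. A quick sketch:

```python
# Union, intersection, difference and membership on hash-table sets.
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(3 in a)    # membership: True
print(a | b)     # union: {1, 2, 3, 4, 5}
print(a & b)     # intersection: {3, 4}
print(a - b)     # difference: {1, 2}
```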

3.10 Conclusion
Graphs and their algorithms are powerful tools for modelling and solving
real-world problems. From social networks to GPS navigation, from project
scheduling to computer networking, graph theory provides a framework for
understanding and optimizing complex relationships and processes. By
representing problems as graphs, we can apply these algorithms to find
efficient solutions in various fields.

Tables, particularly arrays and hash tables, form the backbone of many
efficient algorithms and data structures. Arrays offer unparalleled speed for
indexed access and are fundamental to many sorting and searching
algorithms. Hash tables, used in dictionaries and sets, provide near-constant
time operations for key-value pair management and set operations.

In mastering the concepts of graphs and tables, you are equipped with the
foundational tools to innovate and optimize. These structures and their
algorithms not only enhance your problem-solving capabilities but also
open up opportunities to address complex challenges across various
industries.

Chapter 4
From Theory to Practice

In our daily lives, we often use algorithmic thinking without realizing it.
This chapter explores common scenarios where computational concepts
come into play, demonstrating how the algorithms we've learned about
manifest in everyday situations.

Algorithmic thinking isn't confined to computer screens or coding
environments. It's a powerful problem-solving approach that we
instinctively use in many aspects of our lives. When you're organizing your
closet, planning a trip, or even cooking a meal, you're often employing
strategies that mirror computational algorithms.

For instance, when you're getting dressed in the morning, you might use a
process similar to a decision tree: if it's cold, you choose warm clothes; if
it's raining, you grab an umbrella or a raincoat. This is akin to the
if-then-else structures used in programming.

Even in social situations, we use graph-like thinking. When introducing two
friends from different social circles, you're essentially creating a new edge
in your social network graph.

Understanding how algorithms manifest in everyday life can help us:

• Improve efficiency: By recognizing patterns, we can streamline
our daily routines and tasks.

• Enhance decision-making: Algorithmic thinking can provide a
framework for making more logical, systematic decisions.

• Solve complex problems: Breaking down big problems into
smaller, manageable steps is a key aspect of both algorithmic
thinking and effective problem-solving.

• Communicate better: Understanding processes algorithmically can
help us explain procedures more clearly to others.

• Innovate: Recognizing algorithmic patterns can inspire new
solutions to old problems.

In the following sections, we'll explore specific examples of how different
algorithms and data structures we've learned about apply to real-world
scenarios. From sorting and searching in daily life to using graph concepts
in social and professional networks, we'll see how computational thinking
extends far beyond the realm of computers.

I encourage you to try these examples out in real life or bring out a pencil
and paper and try to draw along to the examples in order to gain a better
understanding. Refer to the earlier chapters to refresh your memory if
required.

4.1 Binary Search: Finding Information Quickly
Binary search is a powerful algorithm that allows us to efficiently find
information in ordered sets by repeatedly halving the search space. This
method is particularly useful when dealing with large amounts of sorted
data.

4.1.1 Flipping through a book to find a specific page

Imagine you have a book, you don't know beforehand how many pages it
has and you need to find page 753:

1. Open the book in the middle. It happens to show a page number
around 500.

2. 753 is greater than 500, so you know it's in the second half.

3. Open to the middle of the second half (around page 750).

4. 753 is slightly after, so you flip a few pages forward.

5. You find page 753 in just a few steps.

This intuitive process is like binary search in action. Instead of checking all
1000 pages, you've narrowed it down quickly by halving the search space
each time. We tend to do this intuitively without following the step-by-step
procedure above.

Key points:

• The book's pages must be in order (sorted).

• Each step eliminates approximately half of the remaining pages.

• You can find any page in about 10 steps or fewer (log₂(1000) ≈ 10).

4.1.2 Locating a name in a phonebook

Consider finding "Smith, John" in a phonebook with 1,000,000 entries:

1. Open the phonebook in the middle.

2. If the name you see is alphabetically before "Smith," look in the
second half; if after, look in the first half.

3. Repeat this process, halving the search area each time.

4. You'll find "Smith, John" (or determine it's not there) in about 20
steps or fewer.

Efficiency:

• Linear search (checking each name): up to 1,000,000 steps

• Binary search: about 20 steps (log₂(1,000,000) ≈ 20)

4.1.3 How you likely found this book

Modern search algorithms, including those used by online bookstores or
libraries, often incorporate principles of binary search:

1. When you search for "Algorithms Simplified," the system doesn't
check every book title sequentially.

2. It uses indexed databases and efficient search algorithms to quickly
narrow down the possibilities.

3. While not pure binary search, these systems use similar divide-and-
conquer principles to provide near-instantaneous results.

4.2 Exponential Growth: Efficient Replication
Exponential growth is a pattern in which a quantity is multiplied by a
constant factor at each step, so the increases get larger over time. In the
context of algorithms and problem-solving, understanding exponential
growth can lead to highly efficient methods for replication and scaling.

4.2.1 Copying spreadsheet rows multiple times

Imagine you need to create 1024 rows of data in a spreadsheet, starting
with one row:

1. Copy the initial row to make 2 rows (2¹)

2. Copy these 2 rows to make 4 (2²)

3. Copy these 4 rows to make 8 (2³)

4. Continue this process...

5. After just 10 iterations, you'll have 1024 (2¹⁰) rows

This method is much faster than copying the initial row 1023 times
individually.

Figure 4.1: Efficient copying in a spreadsheet

Efficiency:

• Linear copying: 1023 operations

• Exponential copying: Only 10 operations
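
A minimal Python sketch of the doubling trick, counting the "copy all" operations:

```python
# Doubling reaches 1024 rows in 10 operations instead of 1023 copies.
rows = ["row"]                   # start with a single row
operations = 0
while len(rows) < 1024:
    rows = rows + rows           # one copy-paste doubles the data
    operations += 1
print(len(rows), operations)     # 1024 10
```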

4.2.2 Creating multiple copies of a hi-hat† beat in music production software

Let's say you want to create a 16-bar hi-hat pattern:

1. Create a 1-bar hi-hat pattern

2. Copy to make a 2-bar pattern

† A hi-hat is a pair of cymbals mounted on a stand, played with a pedal and drumsticks to
create rhythmic patterns in music.

3. Copy the 2-bar pattern to make a 4-bar pattern

4. Copy the 4-bar pattern to make an 8-bar pattern

5. Copy the 8-bar pattern to make the final 16-bar pattern

You've created a 16-bar pattern in just 4 operations instead of 15.

4.2.3 Viral spread of information on social media

This concept also explains how information can spread rapidly on social
networks:

1. One person shares a post with 5 friends

2. Each of those 5 friends shares with 4 more friends (they don’t share
it back with you)

3. After just 6 iterations of growth, the post could potentially reach
5×4⁵ = 5120 people

Figure 4.2: How viral growth happens

4.2.4 Key Takeaways

• Exponential growth leads to rapid increases with relatively few
iterations

• It's highly efficient for replication tasks

• The power lies in doubling (or multiplying by a constant) at each
step

• While powerful for growth, it can also lead to quick resource
exhaustion if not managed properly

4.2.5 Real-world Implications

• Compound Interest: Financial growth through interest that builds
upon itself

• Population Growth: Under ideal conditions, populations can grow
exponentially

• Technology Advancement: Moore's Law predicts exponential
growth in computing power

4.3 Optimization Problems: Maximizing or Minimizing
Optimization problems involve finding the best solution from all feasible
solutions. They are ubiquitous in everyday life and are crucial in many
fields, including computer science, economics and engineering.

4.3.1 The Travelling Salesperson Problem: Finding the
Most Efficient Route

The Travelling Salesperson Problem (TSP) involves finding the shortest possible
route that visits each given location exactly once and returns to the starting
point.

Planning an Itinerary via Flights to Minimize Cost

Consider planning a trip where you need to visit these cities:

• London

• Paris

• Berlin

• Rome

• Madrid

• Amsterdam

The challenge is to find the most cost-efficient route through all cities,
minimizing the total travel cost.

Solution Approach:

Map Out the Flight Network:

• Identify all possible direct flights between the cities and their
respective costs.

Calculate Costs Between All City Pairs:

• Use available data to determine the flight costs between each pair
of cities.

Use Optimization Algorithms to Find the Optimal Path:

• For smaller problems, use exact algorithms like branch and bound
or dynamic programming.

• For larger problems, use approximation algorithms or heuristics,
such as genetic algorithms or simulated annealing, to find a
near-optimal solution.

Complexities:

The number of possible routes grows factorially with the number of
stops. For just 6 cities, there are 6! = 720 possible orderings.
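
A minimal Python sketch of the brute-force approach, on a hypothetical four-city cost table (the prices are made up). It works for a handful of cities but becomes hopeless as the count grows:

```python
from itertools import permutations

# Brute-force TSP: try every ordering of the intermediate cities.
costs = {
    ("London", "Paris"): 80,   ("Paris", "Berlin"): 120,
    ("London", "Berlin"): 150, ("Paris", "Rome"): 140,
    ("London", "Rome"): 200,   ("Berlin", "Rome"): 170,
}
def cost(a, b):
    return costs.get((a, b)) or costs[(b, a)]   # symmetric lookup

cities = ["Paris", "Berlin", "Rome"]
best = min(
    (sum(cost(a, b) for a, b in zip(("London",) + route, route + ("London",))),
     route)
    for route in permutations(cities)
)
print(best)   # cheapest round trip starting and ending in London
```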

Real-world applications:

• Delivery Route Optimization: Planning delivery routes for logistics
companies to minimize costs and time.

• Circuit Board Drilling in Manufacturing: Optimizing the sequence
of drilling holes in circuit boards to reduce manufacturing time
and costs.

• Data Fetch Optimization in Computer Memory: Minimizing the
cost of fetching data from different memory locations to optimize
computational efficiency.

4.3.2 The Knapsack Problem: Maximizing Value Within Constraints

The Knapsack Problem is about selecting a subset of items that maximize
total value while keeping the total weight under a specific limit.

Packing a suitcase with a weight limit

Imagine you're packing for a trip with a 50-pound luggage limit.

Items:

• Laptop (5 lbs, importance: 10/10)

• Camera (3 lbs, importance: 8/10)

• Books (10 lbs, importance: 6/10)

• Clothes (20 lbs, importance: 9/10)

• Shoes (8 lbs, importance: 7/10)

• Toiletries (4 lbs, importance: 8/10)

The challenge is to pack the most important items without exceeding 50
pounds.

Solution approach:

• Calculate value-to-weight ratios

• Prioritize items with higher ratios

• Add items until reaching the weight limit

This problem becomes complex with more items, leading to various
algorithmic solutions in computer science (dynamic programming, greedy
algorithms).
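
A minimal Python sketch of the greedy ratio approach on the items above. A tighter hypothetical limit of 40 lbs is used so a trade-off actually appears (with the 50 lb limit, everything happens to fit exactly). Greedy is fast but not guaranteed optimal for this kind of knapsack:

```python
# Greedy heuristic: take items by value-to-weight ratio while they fit.
items = [
    ("Laptop", 5, 10), ("Camera", 3, 8), ("Books", 10, 6),
    ("Clothes", 20, 9), ("Shoes", 8, 7), ("Toiletries", 4, 8),
]
limit = 40   # hypothetical tighter limit than the 50 lbs in the text

packed, weight = [], 0
for name, w, value in sorted(items, key=lambda x: x[2] / x[1], reverse=True):
    if weight + w <= limit:              # take the item if it still fits
        packed.append(name)
        weight += w
print(packed, weight)   # Clothes (importance 9) gets left out
```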

Real-world applications:

• Financial portfolio optimization

• Resource allocation in project management

• Cargo loading in logistics

4.3.3 Key Takeaways

Optimization problems often have multiple feasible solutions, but finding
the best one can be challenging.

These problems frequently involve trade-offs (e.g., value vs. weight,
distance vs. time). As the problem size increases, finding the exact optimal
solution becomes computationally intensive. In practice, we often use
approximation algorithms or heuristics to find good (but not necessarily
perfect) solutions in reasonable time.

A heuristic is a practical approach or shortcut used to make
problem-solving and decision-making more efficient when an optimal
solution is impractical to obtain.

4.3.4 Everyday Applications of Optimization Algorithms

• Meal planning to maximize nutrition within a calorie budget

• Time management to complete the most important tasks within a
workday

• Budget allocation in personal finance or business

4.4 Divide and Conquer: Breaking Down Complex Tasks
Divide and Conquer is a problem-solving approach that breaks down a
complex problem into smaller, more manageable subproblems. These
subproblems are solved independently and then combined to solve the
original problem. This strategy is widely used in computer science
algorithms and can be applied to many real-life situations. In the state
space representation, it can be thought of as repeatedly breaking the state
space graph into connected components and then combining the results of
solving smaller subproblems.

Figure 4.3: Two connected components (the subgraphs inside the dotted circles)
in a graph

4.4.1 Sorting Laundry: A Household Example

Consider sorting a large pile of mixed laundry:

• Divide: Split the large pile into smaller piles (e.g., 4-5 smaller
piles)

• Conquer: Sort each small pile by category (e.g., colours, whites,
delicates)

• Combine: Merge the sorted small piles into larger category piles

Benefits:

• Easier to manage smaller piles

• Can parallelize the work (family members can sort different piles)

• Reduces the complexity of the overall task

This process mirrors the Mergesort algorithm, which is highly efficient for
sorting large datasets.

4.4.2 Organizing a Large Project: A Professional Example

Let's consider managing a company-wide software upgrade.

Divide:

Break down the project into smaller tasks:

• Assessment of current systems

• Planning the upgrade process

• Data migration

• Software installation

• Staff training

• Testing and quality assurance

Conquer:

• Tackle each subtask independently:

• Assign specialized teams to each area

• Set deadlines for each subtask

• Allocate resources appropriately

Combine:

• Integrate the results of each subtask:

• Ensure all systems work together after the upgrade

• Conduct final overall testing

• Roll out the upgrade company-wide

Benefits:

• Allows for parallel work streams

• Easier to track progress and manage resources

• Simplifies a complex project into manageable pieces

4.4.3 Key Principles of Divide and Conquer

• Problem Decomposition: Ability to break a problem into similar
subproblems

• Base Case Recognition: Identify when a subproblem is small
enough to solve directly

• Problem-Solving: Solve the subproblems recursively or directly

• Solution Combination: Merge sub-solutions into a solution for the
original problem

4.5 Greedy Algorithms: Making the Best Immediate Choice
Greedy algorithms are problem-solving methods that make the locally
optimal choice at each step with the hope of finding a global optimum.
While they don't always lead to the best overall solution, they are often
used for their simplicity and efficiency in many real-world scenarios.

4.5.1 Finding the Best Deal When Shopping

Consider buying a new laptop with a budget of $1000:

• Research available laptops within your budget.

• Compare them based on key features (processor speed, RAM,
storage, etc.).
• At each step, choose the laptop that offers the best value for a
particular feature.

• Continue until you find a laptop that satisfies most of your criteria
within the budget.

This approach is "greedy" because at each step, you're making the best
choice available without reconsidering previous decisions.

Benefits:

• Quick decision-making

• Often leads to a satisfactory, if not optimal, solution

• Simplifies complex decisions

Limitations:

• May miss better overall deals by focusing on individual features

• Doesn't consider future implications of current choices

4.5.2 Choosing a Route While Driving

At each intersection, select the direction that seems to lead most directly to
your destination.

This may not always result in the absolute shortest route, but it's often
efficient.

4.5.3 Task Scheduling

Always choose the task with the nearest deadline or shortest completion
time.

Efficient for many scenarios but may not be optimal if tasks have different
priorities or dependencies.

4.5.4 Key Principles of Greedy Algorithms

• Greedy Choice Property: A globally optimal solution can be
reached by making locally optimal choices.

• Optimal Substructure: An optimal solution to the problem
contains optimal solutions to subproblems.

4.6 Sorting: Organizing Information

Sorting is a fundamental concept in both computer science and everyday
life. It involves arranging items in a specific order, which makes finding and
managing information much easier.

4.6.1 Household Examples of Sorting

Arranging Books on a Shelf

• By author: Alphabetical order makes finding books by a specific
author easy

• By title: Useful for quickly locating a particular book

• By genre: Groups similar books together, helpful for browsing

Organizing a Closet

• By clothing type: Grouping shirts, trousers, dresses, etc.

• By colour: Creates a visually pleasing arrangement and helps
coordinate outfits

• By season: Keeps current clothes accessible and others stored
away

Arranging Spice Rack

• Alphabetically: Easy to find specific spices quickly

• By frequency of use: Most common spices in front

• By type of cuisine: Grouping spices used in similar dishes

4.6.2 Key Concepts in Sorting

Comparison-Based Sorting

Most household sorting relies on comparing items. This is similar to
comparison sorts in computer science (e.g., Bubble Sort, Merge Sort).

Stability in Sorting

A stable sort maintains the relative order of equal items, for example
preserving the order left by a previous round of sorting.

Example: Sorting clothes first by type, then by colour within each type

Natural Order vs. Custom Order

• Some items have a natural order (alphabetical, numerical)

• Others require a custom order (seasonal clothes, meal ingredients)

4.6.3 Benefits of Sorting in Daily Life

• Saves Time: Quickly find what you need

• Reduces Stress: An organized environment is often more calming

• Improves Decision Making: Easier to see what you have and what
you need

• Enhances Productivity: Streamlines daily tasks and routines

• Facilitates Sharing: Others can easily find things in an organized
system

4.7 Dynamic Programming: Minimizing Repeated Work
Dynamic Programming (DP) is a method for solving complex problems by
breaking them down into simpler subproblems. It's particularly useful when
subproblems overlap and have optimal substructure.

4.7.1 Example: Planning Weekend Activities

Imagine you're planning activities for an entire quarter of a year. You have a
list of potential activities, each with an associated cost and an "enjoyment
score” on a scale of 1-10. Your goal is to maximize your total enjoyment
while staying within your budget. You may repeat activities as much as you
like.

Activities:

• Movie night (Cost: $20, Enjoyment: 3)

• Hiking trip (Cost: $50, Enjoyment: 8)

• Fancy dinner (Cost: $100, Enjoyment: 16)

• Museum visit (Cost: $30, Enjoyment: 4)

• Concert (Cost: $80, Enjoyment: 12)

• Home spa day (Cost: $40, Enjoyment: 6)

Budget: $160

Figure 4.4: Dynamic Programming solution. Dinner + Movie + Spa gives total
enjoyment of 25 with cost 160

I encourage you to try to reason about how to solve the problem before
diving into the next couple of sections. The hint is that the solution is
described by the recurrence relation:

max_enjoyment(budget) = max(enjoyment[i] + max_enjoyment(budget − cost[i]) : i = 1...n, cost[i] ≤ budget)

Memoization Version

In a memoization approach, we'd use a recursive function but store results
to avoid recalculating:

• Create a function max_enjoyment(budget)

• If the result for this budget is already calculated, return it
(memoization)

• If not, calculate it by considering each possible activity:

• If the activity's cost fits within the budget, recursively call
max_enjoyment(remaining_budget)

• Compare the enjoyment of choosing this activity plus the
result of the recursive call with the current best

• Store and return the best result
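
Here is a minimal Python sketch of exactly this memoized recursion, with the costs and enjoyment scores from the list above:

```python
from functools import lru_cache

# Memoized recurrence: activities can repeat, so this is an unbounded
# knapsack over the remaining budget.
activities = [(20, 3), (50, 8), (100, 16), (30, 4), (80, 12), (40, 6)]

@lru_cache(maxsize=None)          # lru_cache stores the result per budget
def max_enjoyment(budget):
    best = 0
    for cost, enjoyment in activities:
        if cost <= budget:        # pick this activity, recurse on the rest
            best = max(best, enjoyment + max_enjoyment(budget - cost))
    return best

print(max_enjoyment(160))   # 25, e.g. fancy dinner + movie night + spa day
```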

Solving with Dynamic Programming

Break down the problem into subproblems. Notice that the costs are in
multiples of 10 which is crucial for this method to work.

Subproblem: What's the maximum enjoyment for each budget amount?

Build a table:

• Columns: Available budget (0 to 160 in increments of 10 for
simplicity)

• 1 Row

• Cells: Maximum enjoyment possible so far

Fill the table:

• Start with the least expensive activity, then build up

• For each cell, consider all activities that fit the budget

• Choose the option that maximizes enjoyment while fitting the
budget

Backtrack to find the optimal activities:

• Start from the full budget

• Trace back the choices that led to the maximum enjoyment

4.7.2 Example: Coin Change Problem

Imagine you need to make a certain amount of change using the fewest
number of coins possible. You have a list of coin denominations available,
and your goal is to find the minimum number of coins needed to make the
exact change.

Figure 4.5: The decision tree of the coin change problem. A possible solution is
highlighted in green (there can be multiple).

Coin Denominations:

Coin A (2 units)

Coin B (3 units)

Coin C (5 units)
Amount: 16 units

Solving with Dynamic Programming

To solve the coin change problem using dynamic programming, we break
down the problem into subproblems and build a solution iteratively.

Break down the problem into subproblems.

Subproblem: What's the minimum number of coins needed for each
amount from 0 to 16 units?

A similar approach to the last problem can be followed to create a table
and find the solution. This is left as an exercise to the reader.

This dynamic programming approach ensures that we find the minimum
number of coins required to make the given amount by considering each
denomination and updating the table iteratively.
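
For readers who want to check their table against code, here is a minimal Python sketch of the bottom-up version:

```python
# Bottom-up coin change: table[a] = fewest coins that make amount a.
def min_coins(coins, amount):
    INF = float("inf")
    table = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for coin in coins:
            if coin <= a and table[a - coin] + 1 < table[a]:
                table[a] = table[a - coin] + 1   # improve using this coin
    return table[amount] if table[amount] != INF else None

print(min_coins([2, 3, 5], 16))   # 4, e.g. 5 + 5 + 3 + 3
```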

4.7.3 Real-Life Applications

This approach helps in various real-life scenarios:

• Meal Planning: Optimize nutrition and taste within a calorie
budget

• Vacation Planning: Maximize experiences within a travel budget

• Time Management: Allocate time to tasks to maximize productivity

• Financial Planning: Optimize investments and savings over time

4.7.4 Benefits in Daily Life

• Makes complex planning more manageable

• Helps in making optimal decisions over a series of choices

• Useful for balancing multiple goals or constraints

4.8 Topological Sorting: Handling Dependencies
As we saw in an earlier chapter, topological sorting is an algorithm used to
linearly order the vertices of a directed acyclic graph (DAG) such that for
every directed edge from vertex A to vertex B, A comes before B in the
ordering. This concept is crucial when dealing with tasks or items that have
dependencies on each other.

Topological sorting ensures that each task is completed before any tasks
dependent on it are started. Additionally, topological sorting helps in
identifying potential parallelism in processes, enabling the efficient
scheduling of tasks that do not depend on each other.

4.8.1 Getting dressed in a 3-piece suit after a shower

This everyday scenario perfectly illustrates topological sorting.

Start with a list of items to put on:

1. Shoes
2. Socks
3. Jacket
4. Dress shirt
5. Trousers
6. Inner wear
7. Vest
8. Tie

Identify the dependencies:

1. Inner wear must be put on before trousers

2. Socks must be put on before shoes

3. Inner wear goes on before the dress shirt

4. Dress shirt must be on before the tie and vest

5. Trousers must be on before the vest and jacket

6. Vest goes on before the jacket


7. Trousers go on before the shoes

A valid topologically sorted order might be:

1. Inner wear

2. Socks

3. Dress shirt

4. Trousers

5. Tie

6. Vest

7. Jacket

8. Shoes

This example demonstrates key properties of topological sorting:

• Clear dependencies between items

• No circular dependencies (it's a DAG)

• Multiple valid orderings (tie could go on before or after the vest)
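
To tie this back to the Kahn's-algorithm sketch from Chapter 3, the same hypothetical topological_sort function can order the dressing graph, where an edge from A to B means A must go on before B:

```python
# Hypothetical dressing graph for the topological_sort sketch above.
dressing = {
    "inner wear": ["trousers", "dress shirt"],
    "socks": ["shoes"],
    "dress shirt": ["tie", "vest"],
    "trousers": ["vest", "jacket", "shoes"],
    "vest": ["jacket"],
    "tie": [], "jacket": [], "shoes": [],
}
print(topological_sort(dressing))   # one valid dressing order
```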

4.8.2 Scheduling tasks for a new product marketing campaign

Identify project tasks:

• A. Conduct market research

• B. Define target audience

• C. Develop product messaging and positioning

• D. Create marketing strategy

• E. Design promotional materials (brochures, ads)

• F. Develop social media content plan

• G. Set up email marketing campaign

• H. Brief sales team on product features

• I. Launch PR campaign

• J. Activate paid advertising

• K. Host product launch event

Determine dependencies:

1. B depends on A

2. C depends on A and B

3. D depends on C

4. E depends on C and D

5. F depends on D and E

6. G depends on D and E

7. H depends on C

8. I depends on E

9. J depends on E and F

10. K depends on E, H and I

A possible topological order:

1. A [Conduct market research]

2. B [Define target audience]

3. C [Develop product messaging and positioning]

4. D [Create marketing strategy]

5. E [Design promotional materials]

6. H [Brief sales team on product features]

7. F [Develop social media content plan]

8. G [Set up email marketing campaign]

9. I [Launch PR campaign]

10. J [Activate paid advertising]

11. K [Host product launch event]

This ordering ensures that each marketing task is performed only after all
necessary prerequisite tasks are completed.

This example demonstrates how topological sorting can be applied to
complex business processes outside of technical fields. It helps marketing
teams ensure that their campaign elements are developed and deployed in
a logical sequence, with each step building upon the foundations laid by
previous tasks.

This approach can significantly improve the coherence and effectiveness of
the overall marketing campaign.

4.8.3 Real-world applications of topological sorting

1. Build systems in software development (e.g., Make, Maven)

2. Course scheduling in universities

3. Data processing pipelines

4. Dependency resolution in package managers

5. Task scheduling in parallel computing

6. Project management in construction

4.9 Monte Carlo Method: Using Randomness to Solve Problems
The Monte Carlo method is a powerful technique that uses random
sampling to solve problems that might be deterministic in principle. It's
particularly useful for complex systems with many variables or when
traditional analytical methods are impractical.

It is a tool that allows us to tackle problems that would be infeasible to
solve through direct computation, making it invaluable in fields ranging
from physics and finance to computer graphics and artificial intelligence.
Some applications of the Monte Carlo method are discussed below.

4.9.1 Estimating Pi: A Circle in a Square

Imagine a square with a side length of 1 unit and a circle inscribed within
it. The circle's diameter is equal to the square's side, so its radius is 0.5
units. Now, let's conduct a thought experiment.

Throw a large number of pebbles (all assumed to be of the same size)
randomly and uniformly at the square. Count how many pebbles land
inside the circle versus the total number thrown.

The ratio of pebbles inside the circle to the total number of pebbles will
approximate the ratio of the circle's area to the square's area. We can use
this to estimate pi:
• Area of the square: 1 × 1 = 1 square unit

• Area of the circle: πr² = π(0.5)² = π/4 square units

• The ratio of these areas is (π/4) / 1 = π/4

So, if we multiply our pebble ratio by 4, we get an estimate of π. The more
pebbles we throw, the more accurate our estimate becomes.

This method demonstrates how random sampling can be used to
approximate a value (in this case, π) without direct calculation.
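
A minimal Python sketch of the pebble experiment, with random points standing in for pebbles:

```python
import random

# Sample points in the unit square; count those inside the inscribed circle.
def estimate_pi(samples):
    inside = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        # Inside if within radius 0.5 of the centre (0.5, 0.5)
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25:
            inside += 1
    return 4 * inside / samples   # the point ratio approximates pi/4

print(estimate_pi(1_000_000))   # close to 3.14159...
```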

Figure 4.6: Circle inscribed in a square. There are 47 green dots and 13 red dots.
47/60 = 0.78333…, which is close to π/4 ≈ 0.78539…

4.9.2 Raytracing in Computer Graphics

In computer graphics, particularly in rendering realistic lighting, the Monte
Carlo method plays a crucial role.

This technique simulates the path of light rays as they interact with objects
in a scene.

Figure 4.7: A ray traced image of a 3D model that I created in Blender, rendered
using LuxRender

Instead of calculating every possible light path (which would be
computationally infeasible), a Monte Carlo approach is used:

• Multiple light rays are randomly cast from each pixel of the image.

• The paths of these rays are traced as they bounce off surfaces,
refract through materials, or get absorbed.

• The final colour of a pixel is determined by averaging the results of
these random light paths.

Recent advancements in real-time raytracing for video games heavily rely
on these Monte Carlo techniques, allowing for more realistic lighting,
reflections and shadows in interactive environments.

4.10 Conclusion
In this chapter, we explored how algorithmic thinking permeates our
everyday lives, from organizing our closets to planning our daily activities.
By recognizing and applying computational concepts, we can enhance our
problem-solving abilities, improve efficiency and make more informed
decisions.

We've seen practical applications of binary search, exponential growth,
optimization problems, divide and conquer strategies, greedy algorithms,
sorting techniques, dynamic programming, topological sorting and the
Monte Carlo method. Each of these algorithms and data structures provides
powerful tools for tackling real-world challenges.

By understanding and leveraging these concepts, we can not only solve
complex problems more effectively but also innovate and communicate
more clearly in our personal and professional lives.

As you continue to encounter and apply these principles, you will develop
a deeper appreciation for the pervasive role of algorithms in shaping our
world.

Further Reading

Books
"Introduction to Algorithms" by Thomas H. Cormen, Charles E. Leiserson,
Ronald L. Rivest, and Clifford Stein

A comprehensive textbook covering a wide range of algorithms in
depth, suitable for both beginners and advanced learners.

"The Algorithm Design Manual" by Steven S. Skiena

A practical guide to designing and analyzing algorithms, with a
focus on real-world applications and problem-solving techniques.

"Algorithms" by Robert Sedgewick and Kevin Wayne

An accessible introduction to algorithms, with a focus on
fundamental principles and practical implementations.

"Grokking Algorithms" by Aditya Bhargava

A beginner-friendly book that uses illustrations and examples to
explain key algorithms and data structures.

Online Courses
"Algorithms" by Princeton University on Coursera

A comprehensive online course covering essential algorithms and
data structures, taught by leading experts in the field.

"Introduction to Algorithms" on MIT OpenCourseWare by MIT

A course based on MIT's renowned algorithms class, offering deep
insights into algorithm design and analysis.

"Data Structures and Algorithms" by University of California, San Diego
on Coursera

A course that provides a thorough introduction to data structures
and algorithms, with practical coding exercises.

Websites and Platforms

GeeksforGeeks (www.geeksforgeeks.org)

A popular website offering tutorials, articles, and coding problems
on a wide range of topics related to algorithms and data structures.

LeetCode (www.leetcode.com)

A platform with a vast collection of coding problems and
challenges to help you practice and improve your algorithm skills.

HackerRank (www.hackerrank.com)

A site offering coding challenges and competitions across various
domains, including algorithms, data structures, and more.

Stack Overflow (www.stackoverflow.com)

A community-driven Q&A platform where you can ask questions,
share knowledge, and learn from experienced programmers.

Afterword

Throughout this book, we've explored algorithms and computational
thinking not just as abstract concepts, but as practical tools for everyday
problem-solving. From data structures to algorithmic design, these
principles extend far beyond computer science.

We've seen how common activities — like getting dressed or flipping
through a book — can illustrate complex concepts such as topological
sorting and binary search. This demonstrates that algorithmic thinking is
already part of our daily lives, often without our realization.

As you move forward, I encourage you to apply these concepts to your
daily challenges. Whether you're organizing tasks, planning a trip, or
tackling a work project, the problem-solving strategies we've discussed can
help you approach issues more efficiently.

In our increasingly digital world, the ability to think algorithmically is
becoming more valuable across all fields. It's about breaking down
complex problems, recognizing patterns and continuously improving our
approaches.

I hope this book has equipped you with useful problem-solving tools and
sparked a curiosity to see algorithms in the world around you. Thank you
for joining me on this exploration. May it serve you well in your future
endeavours!

Glossary

1. Algorithm: A process or set of rules to be followed in calculations or
problem-solving operations.

2. Array: A data structure that stores a fixed-size sequential collection
of elements of the same type, each accessible by an index.

3. Big O Notation: A mathematical notation that describes the limiting
behaviour of a function when the argument tends towards a
particular value or infinity.

4. Binary Search: An efficient algorithm for finding an item from a
sorted list of items.

5. Breadth-First Search (BFS): A graph traversal algorithm that explores
all the neighbour nodes at the present depth before moving to nodes
at the next depth level.

6. Caching: Caching is a technique of storing frequently accessed data
in a faster, more readily available storage location to reduce access
time and improve system performance.

7. Cartesian Coordinates: A system that uniquely defines the position
of a point in a plane or space using a set of numerical coordinates
that measure the point's perpendicular distances from a set of
reference lines or planes that intersect at a fixed point called the
origin.

8. Data Structure: A data structure is an organized format for storing,
managing and accessing data efficiently in a computer program.

9. Depth-First Search (DFS): A graph traversal algorithm that explores
as far as possible along each branch before backtracking.

10. Directed Acyclic Graph (DAG): A graph with directed edges and no
cycles.

11. Directed Graph: A directed graph is a structure consisting of vertices
connected by edges, where each edge has a specific direction from
one vertex to another.

12. Divide and Conquer: An algorithmic paradigm that solves a problem
by recursively breaking it down into smaller non-overlapping
subproblems, solving these subproblems independently and then
combining their solutions to solve the original problem.

13. Dynamic Programming: A method for solving complex problems by
breaking them down into simpler subproblems and storing the
results of these subproblems to avoid redundant computations,
thereby efficiently solving overlapping subproblems.

14. Edge: A connection between two vertices (nodes) in a graph,
representing a relationship or link between them.

15. Exponential Growth: A pattern of data that shows greater increases
over time.

16. Graph: A data structure consisting of a set of nodes (vertices) and a
set of edges connecting these nodes.

17. Greedy Algorithm: An algorithmic paradigm that follows the
problem-solving heuristic of making the locally optimal choice at
each stage.

18. GUI (Graphical User Interface): A visual interface that allows users
to interact with electronic devices using graphical elements like
icons, buttons and menus instead of text-based commands.

19. Hash Table: A data structure that implements an associative array
abstract data type, a structure that can map keys to values.

20. Heuristic: A heuristic is a practical approach or shortcut used to
make problem-solving and decision-making more efficient when an
optimal solution is impractical to obtain.

21. Linked List: A linear collection of data elements whose order is not
given by their physical placement in memory.

22. Memoization: An optimization technique used primarily to speed up
computer programs by storing the results of expensive function calls.

23. Monte Carlo Method: A broad class of computational algorithms
that rely on repeated random sampling to obtain numerical results.

24. Node: A fundamental unit in a graph or tree structure that represents
a discrete entity or data point, often connected to other nodes by
edges.

25. Optimization Problem: A problem of finding the best solution from
all feasible solutions.

26. Set: A data structure that stores a collection of unique elements with
no specific order.

27. Simulated Annealing: An optimization technique inspired by the
annealing process in metallurgy, used to find approximate solutions
to complex problems by exploring the solution space and gradually
reducing the likelihood of accepting worse solutions as it progresses.

28. Sorting Algorithm: An algorithm that puts elements of a list in a
certain order.

29. State Space: The set of all possible configurations of a system.

30. Time Complexity: A description of the amount of time it takes to run
an algorithm.

31. Topological Sorting: A linear ordering of vertices in a directed
acyclic graph (DAG) such that for every directed edge uv, vertex u
comes before v in the ordering.

32. Tree: A widely used data structure that simulates a hierarchical tree
structure, with a root value and subtrees of children with a parent
node.

33. Turing Machine: An abstract machine that manipulates symbols on a
strip of tape according to a table of rules.

34. Undirected Graph: An undirected graph is a structure consisting of
vertices connected by edges, where the connections between
vertices have no direction and are bidirectional.

Index

Algorithm, 6, 12, 14-15, 20-21, 27, 32, 34, 43, 52-53, 55-61, 64, 69,
72-74, 76-79, 87, 90-91, 93-94, 99, 101-102, 104, 106, 108, 116, 124

Array, 7, 14, 28, 46, 55-56, 64, 74-80, 84, 87

Big O, 52-54, 57-58

Binary Search, 56, 64, 67, 78-79, 91-94, 124

Breadth-first Search (BFS), 14, 66-67

Data Structure, 5, 7, 27, 46-47, 58, 64, 67, 74, 87, 91, 124

Depth-first Search (DFS), 14, 47, 65-66

Directed Acyclic Graph (DAG), 16-20, 28, 69-70, 116, 118

Directed Graph, 65, 68-69

Divide and Conquer, 12, 32, 61, 102, 105, 124

Dynamic Programming (DP), 12, 18-20, 28, 35-36, 61, 78, 99, 101,
111-113, 115, 124

Exponential Growth, 94, 97, 124

First In First Out (FIFO), 74

Graph, 6, 8-9, 14-20, 22-28, 32-33, 35, 48-49, 51, 61, 64-69, 71-73, 75,
87, 90-91, 103, 116, 121-122

Greedy Algorithm, 12, 32, 61, 106, 108, 124

Hash Table, 58, 64, 79, 84, 87

Hashing, 80-84

Heap, 74

Last In First Out (LIFO), 46, 74

Linked List, 71, 78

Memoization, 18, 33-35, 58, 112

Monte Carlo Method, 121, 123-124

Optimization Problem, 26, 97, 101, 124

Set, 84-86

Sorting, 18, 56, 58, 64, 70, 76-78, 87, 91, 103-104, 108-110, 116, 118,
120, 124

Space Complexity, 58-59, 77-78

State Space, 9-10, 13-15, 17, 22, 25-27, 32, 35-36

Time Complexity, 52-53, 57-58, 77-78, 85-86

Topological Sort, 18, 70, 116, 118, 120, 124

Tree, 19-20, 25, 36, 43, 67-69, 73, 90, 114

Turing Machine, 59-60

Undirected Graph, 65, 69

