
Faculty of Applied Sciences, GCUC
Data Structures and Algorithms II
Aps. Nixon Adu-Boahen 2023
Course outline
Introduction
Graph
• Definitions
• Examples of Graph
• Graph ADT
• Graph Search

Trees
• Binary Trees
• B-Tree

Searching algorithms
• Linear Search
• Binary Search

Sorting Algorithms
• Bubble Sort
• Insertion Sort
• Selection Sort
• Quick Sort



Table of Contents
1.0 INTRODUCTION
1.1 GRAPHS
1.2 Terminologies
1.3 Graph Theorems
1.4 A Graph ADT
1.5 Implementing a Graph ADT
1.5.1 Adjacency Matrix
1.5.2 Edge Lists
2.0 GRAPH SEARCH ALGORITHMS
2.1 Breadth first and depth first search
2.2 Tree search
2.3 Graph search revisited
2.4 Binary Tree traversal
2.5 Algorithms for tree traversal
2.6 Binary search tree
3.0 RECURSION
3.1 Binary Search Tree
3.2 Searching the Binary Search Tree
3.3 Insertion into a Binary Search Tree
3.4 Deletion in a Binary Search Tree
3.5 B-Trees
3.5.1 Insertion in B-Trees
4.0 SEARCHING
4.1 Simple Searching
4.2 Ordered Sequential Search
4.3 Binary Search for "105"
4.4 Recursive Binary Search
4.5 Interpolation Search
4.6 Algorithmic Complexity
4.7 Hashing
4.8 Collisions
5.0 SORTING
5.1 Bubble Sort
5.2 Selection Sort
5.3 Quicksort
5.4 Heap sort



1.0 INTRODUCTION
In software engineering and computer science, data structures and algorithms play a vital role. A good grasp of them allows the most appropriate structures and methods to be selected during program development. Studying data structures and algorithms is therefore of great interest to all computing fields (e.g. Computer Science, Information Technology, Software Engineering, etc.).
In this course we will explore the design and analysis of algorithms, and learn how to define and implement data structures using Java or C++ (high-level programming languages, like C# and VB.NET). Before starting with the analysis of algorithms we will briefly explain the terms algorithm and data structure and then proceed with the analysis.

Algorithms
Generally, an algorithm can be thought of as a sequence of steps that solves a given problem. In computation, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. It is thus a sequence of computational steps that transforms the input into the output. The steps are precise instructions that tell the person (or machine) solving the problem at hand exactly what to do.

Example:
Problem: How do I save a new document I have created in Word 2007?

Solution (an algorithm for saving a newly created Word 2007 document):
Step 1: Click on the Office button
Step 2: Click on the Save or Save As menu item
Step 3: Type the name of your file in the text box labelled Name on the Save As dialog box
Step 4: Choose where to save your document by dropping down the combo box labelled Location, or use the address bar (optional)
Step 5: Click on the Save button
Step 6: Done
Algorithms are most often used as tools for solving computational problems (problems whose statement is well defined in terms of an input/output relationship).



The following is an example of the formal definition of a computational problem (the sorting problem):

Input: A sequence of n numbers (a1, a2, …, an)

Output: A permutation (reordering) (a’1, a’2, …, a’n) of the input sequence such that a’1 ≤ a’2
≤ …≤ a’n.

For example, given the input sequence (31, 45, 68, 25, 45, 58), a sorting algorithm returns as output the sequence (25, 31, 45, 45, 58, 68). Such an input sequence is called an instance of the sorting problem. Generally, an instance of a problem consists of the input (satisfying whatever constraints are imposed in the problem statement) needed to compute a solution to the problem.

An algorithm is said to be correct if, for every input instance, it halts with the correct output. An incorrect algorithm might not halt at all on some input instances, or it might halt with an answer other than the desired one. Incorrect algorithms can sometimes still be useful, so do not be disturbed when an algorithm of yours turns out to be incorrect in this course; rather, take time to resolve the inconsistency in your solution.

An algorithm can be specified in English or in the learner's own language, as a computer program, or even as a hardware design.

Kinds of problems solved by algorithms

Practical applications of algorithms are everywhere and include the following:

❖ Medicine: Determining the sequence of the 3 billion chemical base pairs that make up human DNA, storing this information in databases, and developing tools for data analysis.
❖ Internet: Clever algorithms are employed to manage and manipulate large volumes of data. Such problems include routing – finding good routes along which the data travel – and using search engines to quickly find the pages on which particular information resides.
❖ E-commerce: Encryption of data may be required to maintain privacy and prevent fraud.
❖ Manufacturing and production settings: Algorithms are required to allocate scarce resources in the most beneficial way, for example placing orders so as to maximize expected profit.


The implementation of an algorithm becomes a program. A program may require good data structures and algorithms in order to speed up computational operations and save storage space. In the prerequisite course, we studied some algorithms (and their analysis) and some common data structures. In this course, we will handle some advanced data structures (such as graphs and trees) and their implementations, as well as some well-known algorithms (such as sorting).

The prerequisite of this course covered the introduction to algorithms and data structures. We defined algorithms for performing several operations on the data structures and analysed the algorithms for their efficiency. This part of the course will introduce us to graphs and other non-linear data structures.

1.1 GRAPHS
Graphs are useful models for reasoning about relations among objects and combinatorial
problems. Many real-life problems can be solved by converting them to graphs. Proper
application of graph theory ideas can drastically reduce the solution time for some important
problems.

1.2 Terminologies
A graph is a non-empty set of vertices V together with a set of associated edges E, where each edge is a pair (u, v) of vertices. Formally, a graph is written as

G = (V, E)

Figure 1 is an example of a simple graph. In the figure the graph has

V = {v1,v2,v3}

E = {(v1,v2),(v2,v3)}
(Fig 1. A simple graph with the vertices v1, v2, v3; the edge (v1, v2) is labelled.)
(Graph A: an undirected graph on the vertices V1, V2, V3, V4 and V5, referred to in the examples below.)
Any edge between vertices can be drawn as any kind of curve (not necessarily a straight line). The important thing is the incidence between the edges and vertices: an intersection of drawn edges is not a vertex unless it is specified as one. The same graph drawn pictorially in three different ways is illustrated below.
(Fig. 2: The same graph, on the vertices 1–5, drawn in three different shapes.)

Size and Order of Graph


The size of a graph G is the number of vertices in the graph:
size(G) = |V|
The order of a graph G is the number of edges in the graph:
order(G) = |E|

So, for Graph A above, size(G) = 5 and order(G) = 6.

NB: The minimum possible order is 0 (an empty graph); the maximum possible order is n(n-1)/2 (a complete graph), where n is the size (number of vertices) of the graph.

Density
The density of G is the ratio of the number of edges in G to the maximum possible number of edges:

Density = 2L / (n(n-1)), where L is the order (number of edges) of G and n is the size (number of vertices) of G.

For example, a graph with 4 vertices and 4 edges has Density = 2×4/(4×3) = 2/3.



Reachability
The node a is reachable from b if there is a path from b to a.

Neighbour vertices
Vertices are said to be neighbours if they share the same edge. V1 is said to be a neighbour of V2 if V2 can be reached from V1 by an edge; V1 is said to be neighbour to V2 if V1 can be reached from V2 by an edge. In other words, if there is an edge from V1 to V2, then V2 is a neighbour of V1 and V1 is neighbour to V2. This implies that neighbourhood is based on the direction of an edge.

(Figure: a directed edge from V1 to V2 – V1 is neighbour to V2, or V2 is neighbour of V1.)

Degree of Vertex
The degree of a vertex (or a node), in a graph, is the number of edges containing the
vertex. E.g. In Graph A above, we have the following degrees

degree of v1 = 3
degree of v3 = 3
degree of v2 = 2
degree of v4 = 3
degree of v5 = 1
In a graph, G = (V, E), two vertices, v1 and v2, are neighbours if (v1,v2) is an edge.

In-degree and Out-degree


For a directed graph, whose edges have an order (a starting and an ending vertex), we look at two distinct degrees: the in-degree and the out-degree.
The out-degree of a vertex (v) is equal to the number of neighbours of v. It can also be
defined as the number of edges whose initial vertex is the vertex (v).

The in-degree of a vertex (v) is equal to the number of vertices that have v as a neighbour. It
can also be defined as the number of edges whose final vertex is the vertex (v).
(Graph B: a directed graph on the vertices V1–V5, used in the examples below.)



For example, graph B above:
The out-degree of V2 = 2
The out-degree of vertex V4 = 1
The in-degree of vertex V4 = 2
The out-degree of vertex V5 = 0
Isolated
A vertex of degree zero (0) is called an isolated vertex. The vertex V in the figure below is isolated.
V

Pendant
A vertex is pendant if and only if it has a degree of 1. Consequently, a pendant vertex is
adjacent to exactly one other vertex. Vertex V5 in Graph B is a pendant.

Null Graph
A null graph is a graph consisting only of isolated vertices, an isolated vertex being a vertex with no edges incident to it.

Loop
A loop is an edge whose two endpoints are the same vertex. A loop is not a path; it is an edge, and it involves only one vertex.
V1
Fig3. A loop
Path
A path is a sequence of vertices V1, V2, ..., Vn such that each pair (Vi, Vi+1) is an edge. The definition implies that consecutive vertices are adjacent and that every vertex appears only once. For example, in Graph B above we can identify 16 paths, which are as follows:

1. V2, V1
2. V3, V4
3. V2, V3
4. V1, V4
5. V1, V3
6. V4, V5
7. V2, V1, V4
8. V2, V1, V3
9. V1, V3, V4
10. V3, V4, V5
11. V2, V3, V4
12. V1, V4, V5
13. V2, V3, V4, V5
14. V2, V1, V3, V4
15. V2, V1, V4, V5
16. V1, V3, V4, V5



Length
The length of a path is the number of edges in it (or the total weight, if the edges are weighted). In terms of vertices, the length is the number of vertices less 1; thus if a path consists of n nodes, then its length = n − 1. The special case is a loop, which is taken to have a length of zero (0).

Connected Graph
A graph is connected if for each vertex pair (vi,vj) there is a path from vi to vj. Hence Graph
B is not a connected graph because there is no path between the vertex V5 and any other
node. But Graph C below is a connected graph.

(Graph C: a connected graph on the vertices V1–V5.)

Directed and Undirected Graph

A directed graph is a graph with vertices and edges where each edge has a specific direction
relative to each of the vertices. We can convert an undirected graph to a directed one by
duplicating edges, and orienting them both ways.
For example we can convert the undirected graph A into the directed graph below.

(Graph D: a directed version of the graph, on the vertices V1–V5.)

Connected Graph
An undirected graph is connected if there is a path from every vertex to every other vertex. A
directed graph with this property is called strongly connected. If a directed graph is not
strongly connected, but the underlying graph (without direction to the arcs) is connected, then
the graph is said to be weakly connected. Fig 4 below shows examples of strongly connected graphs.



(Fig 4. Converting an undirected graph to a directed graph: the same graph on the vertices V1–V5 shown undirected and directed.)

Complete Graph
A complete graph is a graph in which there is an edge between every pair of vertices.
Examples are shown below.
(Figure: two complete graphs on the vertices V1–V5, one undirected and one directed.)

Exercise
1. Construct a complete directed graph with 6 vertices
2. List all the paths in the graph above. How many paths can you get?
3. Find the order of the graph in question 1
4. Find the size of graph in question 1
5. Find the density of a graph G with 15 nodes and 14 edges

Subgraph
A subgraph H = (V1, E1) of a graph G = (V, E) is a graph with V1 ⊆ V and E1 ⊆ E, where each edge in H has the same end vertices in H as in G.

(Figure: a graph G on the vertices a, b, c, d, e and two derived graphs H1 and H2.)



From the above, G is the main graph and H1 and H2 are derived from it. H1 is a subgraph of G, whereas H2 is not: in G there is no edge between c and e, but H2 has an edge between c and e.

Cycle and Acyclic Graph


A cycle is a path from a vertex to itself which does not repeat any vertices except the first and
the last. A graph containing no cycles is said to be acyclic. An example of cyclic and acyclic
graphs is shown in the Fig. 5 below.

(Fig 5. Cyclic and acyclic graphs: an undirected cyclic graph and an undirected acyclic graph.)

Tree
A tree is an acyclic connected graph. Trees are very vital structures in computing which can
store data in non-linear manner. They are normally used to handle hierarchical information in
memory. Fig6 below shows some trees

Fig6. Examples of Trees


Forest
A collection of disjoint trees is called a forest. Fig 7 shows a forest consisting of three disjoint trees.

Fig 7. A forest



Basic properties of trees
From the above definition, we have the following properties
i) The path between any two nodes is unique and the length of a path is also unique.
Further, if two paths have the same final node, then one is a subgraph of the other.
ii) A tree with n nodes contains n-1 lines. Conversely, any graph with n nodes and n-1
lines is a tree.

Rooted Trees
Since a tree contains no cycles, the length of a path in it is bounded. Therefore there exist maximal paths, that is, paths which are not proper subpaths of any longer path. The initial and final nodes of a maximal path are called the root and leaf (terminal node) of the tree respectively. A tree is said to be rooted if it has a particular node specially designated as the root. Because a maximal path in a tree starts from the root and ends at a leaf, it is convenient to think of a rooted tree as a directed graph.

Level
The root is said to lie on the first level of the tree. The level of any other node is the number
of nodes on the path from the root to that node. In general, a node which lies on the jth level,
is said to lie at the end of a path (from the root) with length j-1.
Branch node
It is any node which is not a terminal node.
Subtree
Any node defines a subtree of which it is the root, consisting of itself and all other nodes
reachable from it.

1.3 Graph Theorems


Theorem 1: Handshaking Theorem
The theorem states that the sum of the degrees of the vertices in an undirected graph G = (V, E) is equal to twice the number of edges in the graph:

2e = Σ_{v ∈ V} deg(v), where e is the number of edges and deg(v) is the degree of a vertex v.
Example1: A graph consists of 10 vertices and each vertex has 6 degrees. What is the
number of edges in the graph?


Solution: Summing the degrees of the 10 vertices will give 6*10 = 60. From the formula
twice the edges equals sum of degrees, hence 2e=60 and e=30. Thus the graph consists of 30
edges.

Example2: A graph G consists of 2 vertices of degree 2 each, 3 vertices of degree 3 each and
the remaining each of degree 1. If the number of edges in G is 8.
i. What is the number of vertices in G?
ii. What is the density of G
Solution: Let n be the unknown number of vertices each of degree 1.
Sum of degrees in G: 2×2 + 3×3 + n×1
Using Theorem 1: sum of degrees = 2×e (twice the number of edges)
⇒ 4 + 9 + n = 2 × 8 ⇒ 13 + n = 16 ⇒ n = 3
i. The number of vertices: 2 + 3 + 3 = 8
ii. The density: L = e = 8, D = 2L/(n(n−1)) ⇒ D = (2×8)/(8×7) = 16/56 ≈ 0.2857

Theorem 2: An undirected graph has an even number of vertices of odd degree.


Let V1 and V2 be the set of vertices of even degree and the set of vertices of odd degree respectively, in an undirected graph G = (V, E). Then

2e = Σ_{v ∈ V} deg(v) = Σ_{v ∈ V1} deg(v) + Σ_{v ∈ V2} deg(v)

Since 2e and the sum over V1 are both even, the sum over V2 must also be even; and since every term of that sum is odd, the number of vertices of odd degree must be even.


(Figure: a graph on the vertices V1–V5.)

From the graph above, deg(V1)=3, deg(V2)=2, deg(V3)=3, deg(V4)=3 and deg(V5)=1.

So the graph has 4 vertices of odd degree, and 4 is even.



Examples of Graphs

1. airport system:
nodes = airports; edges = pairs of airports with non-stop flights.
(weight/cost = airfare; distance; capacity)

2. Internet:

nodes = routers; edges = links.

3. social graphs:

nodes = people; edges = friends/acquaintance/family

4. academic graphs:

nodes = courses; edges = prereqs;

Examples of situations where graphs can be used

• Finding routes between cities: the objects could be towns, and the connections could
be road/rail links.

• Deciding what first year courses to take: the objects are courses, and the relationships
are prerequisite and co-requisite relations. Similarly, planning a course: the objects
are topics, and the relations are prerequisites between topics (you have to understand
topic X before topic Y will make sense).

• Planning a project: the objects are tasks, and relations are relationships between tasks.

• Finding out whether two points in an electrical circuit are connected: the objects are
electrical components and the connections are wires.

• Deciding on a move in a game: the objects are board states, and the connections correspond to possible moves.




Trials 1
1. A graph consists of 15 edges and vertices of equal degrees. If the first node in the graph
has a degree of 3, what is the number of vertices of the graph?
2. The sum of odd degrees in a graph Q is 14 and the sum of even degrees is 12. What is the
number of edges in the graph.
3. Distinguish between a cycle and a loop.
4. Distinguish between connected and complete graph.
5. Observe the following graph:

(Graph G: a graph on the vertices V1–V5.)

i) Which of the vertices is a pendant?


ii) List all the paths in G
iii) Is there a cycle in G? Explain your answer
iv) Is v1,v2,v3,v4,v5 a path? Explain your answer. If it is a path what is the length?
v) What is the order and size of G
vi) Create three subgraphs of G

6. What is the term given to two edges with the same initial and final nodes?
7. What is the level of a root in a tree?
8. There are six nodes between node p and q in a tree. What is the length of the path from p
to q?
9. Assignment: What are the various ways of representing a tree?
10. Assignment: Give four practical uses of trees in computation.
11. A graph H consists of 5 vertices each with degree 3, a vertex with degree 1 and the remaining vertices each with degree 2. If the number of edges is 10, find the number of vertices and the density of H.


1.4 A Graph ADT
An abstract datatype for a graph should provide operations for constructing a graph (adding
and removing edges and vertices) and for checking connections in a graph. If we allow
labelled nodes then there will be additional operations to add, access and remove labels.
Nodes can just be indicated by integers. Assuming that we have suitable type declarations for
label and graph, the following is a reasonable minimal set of operations:
Graph(int n)
// Creates and initialises a graph of the given size (no. of nodes)
void AddEdge(int n1, int n2)
// Adds an edge from n1 to n2
// Pre: n1 and n2 are nodes in the graph
void RemoveEdge(int n1, int n2)
// Removes an edge
// Pre: There is an edge in the graph from n1 to n2
int EdgeExists(int n1, int n2)
// 1 if there is an edge in the graph from n1 to n2, else 0
void SetLabel(Label l, int n)
// Adds a label l to node n
// Pre: n is a node in the graph
Label GetLabel(int n)
// Returns the label of node n
// Pre: n is a node in the graph

1.5 Implementing a Graph ADT


There are two reasonable implementations for a graph ADT
• Adjacency Matrix
• Edge Lists

1.5.1 Adjacency Matrix


For a graph of N nodes, a simple representation is just an N×N matrix of boolean values – say G[1..Max, 1..Max] of boolean. An edge between nodes n and m is indicated by a 'true' entry in the array element G[n, m], and the lack of an edge is represented by a 'false'. For a labelled graph, a further one-dimensional array can give the label of each node. When dealing with a weighted graph, the weights of the edges can be stored where an edge exists and zero where there is no edge.



An integer can be used to represent the size (number of nodes) of a specific graph.

(Fig [2.1]: a directed graph on the nodes N1–N5.)
The graph in the figure above would be represented as follows:

    1 2 3 4 5
1   F T F T F
2   F F T F F
3   F F F F F
4   F F T F T
5   F F F F F

NB: In a directed graph, there is an edge between two nodes (a, b) if there is an edge that begins at a and ends at b.

Exercise 1: Represent the graphs below in an adjacency matrix.

1. (A graph on the nodes N1–N7.)

2. (A graph on the nodes N1–N6.)
1.5.2 Edge Lists

The adjacency matrix representation is very simple, but it is inefficient (in terms of space) for sparse graphs, i.e. those without many edges compared with the number of nodes (vertices). Such a graph would still need an N×N matrix, but it would be almost full of 'False's.

So an alternative is to have associated with each node a list (or set) of all the nodes it is linked
to via an edge. So, in this representation we have a one dimensional array, where each
element in the array is a list of nodes.

(Fig [2.2]: a directed graph on the nodes 1–7.)

For the graph in fig[2.2] this would give:


1 (2 4 3)
2 (4 5)
3 (6)
4 (6 7 3)
5 (4 7)
6 ()
7 (6)

Now comes the question of how to represent the list of nodes. One way is to use linked lists. The graph would then involve an array, where each element of that array is a linked list.


1 → 2 → 4 → 3
2 → 4 → 5
3 → 6
4 → 6 → 7 → 3
5 → 4 → 7
6
7 → 6
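As a minimal sketch (not part of the original notes), the same idea can be expressed in C++ with an array whose elements are singly linked lists; the struct and function names below are illustrative assumptions.

// Hedged adjacency-list sketch for the graph of Fig [2.2] (names are assumptions)
#include <iostream>
using namespace std;

const int MAXNODES = 8;            // nodes are numbered 1..7

struct ListNode {                  // one entry in a node's neighbour list
    int node;
    ListNode *next;
};

ListNode *adj[MAXNODES];           // adj[i] points to the list of neighbours of node i

void AddEdge(int n1, int n2) {     // insert n2 at the front of n1's list
    ListNode *p = new ListNode;
    p->node = n2;
    p->next = adj[n1];
    adj[n1] = p;
}

int main() {
    for (int i = 0; i < MAXNODES; i++) adj[i] = NULL;
    AddEdge(1, 2); AddEdge(1, 4); AddEdge(1, 3);     // the edges of Fig [2.2]
    AddEdge(2, 4); AddEdge(2, 5); AddEdge(3, 6);
    AddEdge(4, 6); AddEdge(4, 7); AddEdge(4, 3);
    AddEdge(5, 4); AddEdge(5, 7); AddEdge(7, 6);
    for (int i = 1; i < MAXNODES; i++) {             // print each node's list
        cout << i << ":";
        for (ListNode *p = adj[i]; p != NULL; p = p->next)
            cout << " " << p->node;
        cout << endl;
    }
    return 0;
}

Because each new neighbour is pushed onto the front of its list, the printed order is the reverse of the insertion order; the set of neighbours per node is the same as in the listing above.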

C++ graph implementation using adjacency matrix

The following is the class definition without the method implementations:

const int MAX = 20; // maximum number of nodes (chosen here so the array has a definite size)

class Graph{
public:
    Graph(int n);
    // Creates and initialises a graph of the given size (no. of nodes)
    void AddEdge(int n1, int n2);
    // Adds an edge from n1 to n2
    // Pre: n1 and n2 are nodes in the graph
    void RemoveEdge(int n1, int n2);
    // Removes an edge
    // Pre: There is an edge in the graph from n1 to n2
    int EdgeExists(int n1, int n2);
    // 1 if there is an edge in the graph from n1 to n2, else 0
    void SetLabel(Label l, int n);
    // Adds a label l to node n
    // Pre: n is a node in the graph
    Label GetLabel(int n);
    // Returns the label of node n
    // Pre: n is a node in the graph
private:
    bool G[MAX][MAX]; // adjacency matrix
    int size;         // number of nodes
};
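A hedged sketch of how the methods might be implemented follows; it is an illustration under the assumptions above (the MAX constant and the size member), not the notes' own code. SetLabel and GetLabel would additionally need a label array member, which is omitted here.

// Illustrative method bodies for the adjacency-matrix Graph class
Graph::Graph(int n){
    size = n;
    for (int i = 0; i < MAX; i++)
        for (int j = 0; j < MAX; j++)
            G[i][j] = false;              // start with no edges
}

void Graph::AddEdge(int n1, int n2){
    G[n1][n2] = true;                     // directed edge n1 -> n2
}

void Graph::RemoveEdge(int n1, int n2){
    G[n1][n2] = false;
}

int Graph::EdgeExists(int n1, int n2){
    return G[n1][n2] ? 1 : 0;
}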

Trials 2
1. Represent the graphs drawn in Fig [2.2] as Edge Lists.
2. What is a graph?
3. Distinguish between digraph and undirected graph.
4. Graphically represent the following graph G,
G=(V,E)
V={n1, n2, n3, n4, n5}
E={(n2,n3), (n1,n5),(n2,n3),(n4,n5),(n1,n4),(n1,n3)}
5. What is a connected graph?
6. Distinguish between complete and connected graph.
7. A graph consists of 5 vertices of equal degree. If there are 15 edges in the graph, what is the degree of each vertex?
8. Use adjacency matrix to represent the graph G=(V,E).
V={v1, v2, v3, v4, v5}
E={(v1,v3),(v1,v4),(v2,v3),(v2,v5),(v4,v5),(v3,v4),(v1,v2)}
9. Represent the graph in 7 using linked list.
10. Implement the ADT Graph using C++;
11. List three properties of a tree and binary tree.
12. Use Edge List to represent the graph G=(V,E).
V={v1, v2, v3, v4, v5}
E={(v1,v3),(v1,v4),(v2,v3),(v2,v5),(v4,v5),(v3,v4),(v1,v2)}


2.0 GRAPH SEARCH ALGORITHMS
A graph search (or graph traversal) algorithm is just an algorithm to systematically go
through all the nodes in a graph, often with the goal of finding a particular node, or one with
a given property. Searching a linear structure such as a list is easy: you can just start at the
beginning, and work through to the end. Searching a graph is obviously more complex.

2.1 Breadth first and depth first search


There are two main ways to traverse a graph: depth first and breadth first. In breadth first
search, if we start at a particular node (say, n1), all nodes path length M away from n1 are
searched before all nodes path length M+1 away. In depth first search, from any particular
node, the whole of one sub-tree is investigated before the other is looked at all.

We can also describe it in terms of family tree terminology: in depth first the node's
descendants are searched before its (unvisited) siblings; in breadth first the siblings are
searched before its descendants.

(Fig [2.2]: an example of a graph on the nodes N1–N7.)




For example in the graph above, if we take N1 as our starting point, then a breadth search is
as follows:
• N1: N7, N4, N2
• N7: N4, N3
• N4:N5
• N2: N3,N4
• N3:N5
• N4: N5
• N5: N6

And a depth search will be as follows:


• N1:N7,N4 N5, N6, N2,N3
• N7:N3,N5,N6, N2
• N4:N5,N6, N2, N3
• N2:N3,N5,N6

2.2 Tree search


For trees (which are, as we said, a specialised kind of graph, where cycles aren't allowed and each node has only one 'parent'), the distinction between breadth-first and depth-first search is clear graphically. As we see in Fig [2.3], in breadth first we search across the tree before we search down. In depth first we follow a path down before we search across.

(Fig [2.3]: an example of a tree rooted at N1, with children N2, N3 and N4; N5 and N6 are children of N2, N7 and N8 are children of N3, and N9 is a child of N4.)



For trees, there is a very simple algorithm for depth first search, which uses a stack of nodes.
The following assumes we are searching for some target node, and will quit when it is found.

stack.push(startnode); // (assumes the stack is initially empty)


do
{
currentnode = stack.pop();
for (each neighbour n of current node)
stack.push(n);
}
while(! stack.empty() && currentnode != target)

Pseudocode has been used to describe putting all neighbours of the node on the stack; you'd
actually need to either use list traversal of neighbours or check through all possible nodes and
add them if they are a neighbour.

We can see how this will work for the tree in the figure above. (The trace below shows 'stack'
on entering the body of the loop each time, and 'current node' after it has been popped off
stack).

Stack        Current node   Neighbours
(1)          1              2, 3, 4
(2 3 4)      2              5, 6
(5 6 3 4)    5              none
(6 3 4)      6              none
(3 4)        3              7, 8
(7 8 4)      7              none
(8 4)        8              none
(4)          4              9
(9)          9              none
()

The algorithm for breadth first is exactly the same BUT we use a queue rather than a stack of
nodes: put the neighbours on the BACK of the queue, but remove current−node from the
front.

Queue.enqueue(startnode); //(assumes queue initially empty..)


do
{
currentnode = Queue.dequeue();
for (each neighbour n of current node)
Queue.enqueue(n);
}
while(! Queue.empty() && currentnode != target)
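
As a minimal C++ sketch (not part of the original notes), the queue-based search can be written against the adjacency-matrix Graph class of section 1.5; passing the number of nodes separately and using EdgeExists to enumerate neighbours are assumptions made here.

// Hedged breadth-first search sketch using std::queue (Graph interface assumed)
#include <queue>

int BreadthFirstSearch(Graph &g, int size, int start, int target)
{
    std::queue<int> q;
    q.push(start);                         // assumes the queue starts empty
    int current;
    do {
        current = q.front();
        q.pop();
        for (int n = 0; n < size; n++)     // put each neighbour on the BACK of the queue
            if (g.EdgeExists(current, n))
                q.push(n);
    } while (!q.empty() && current != target);
    return current == target;              // 1 if the target was reached, else 0
}

Like the pseudocode above, this version does not yet track visited nodes; section 2.3 below adds that refinement.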

Note
Advantage of Breadth-first search
• It can often avoid getting lost in fruitless scanning of deep parts of the tree,
Disadvantage of Breadth-first search
• The queue that is used in Breadth-first search often requires much more memory
than depth-first search’s stack

Advantage of Depth-first Search


• The use of stacks in depth-first search makes use of less memory compared to
queues used in breadth first search.

Disadvantage of Depth-first Search


• There is always the tendency of searching deep parts of the tree fruitlessly.


2.3 Graph search revisited
If we are searching a general graph rather than a tree it is necessary to keep track of which
nodes have already been searched, as they might be met again. If we don't do this, then if
there are cycles in the graph the loop might never terminate. Even if there are no cycles,
redundant work is done, re−visiting old nodes.

Avoiding revisiting previously visited nodes leads to the following modified algorithm,
which keeps track of nodes visited (using an array visited[], which would be initialised
appropriately).

stack.push(startnode);
do
{
currentnode = stack.pop();
if(! visited[currentnode])
{
visited[currentnode] = 1;
for (each neighbour n of currentnode)
if( !visited[n])
stack.push(n);
}
}while(! stack.empty() && currentnode != target)
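
The following is a hedged C++ sketch of this refined depth-first search (an illustration, not the notes' own code), again assuming the adjacency-matrix Graph class, its MAX constant and its EdgeExists method.

// Hedged depth-first search sketch with a visited[] array
#include <stack>

int DepthFirstSearch(Graph &g, int size, int start, int target)
{
    std::stack<int> s;
    bool visited[MAX] = { false };         // no node has been visited yet
    s.push(start);
    int current;
    do {
        current = s.top();
        s.pop();
        if (!visited[current]) {
            visited[current] = true;       // mark so the node is never expanded twice
            for (int n = 0; n < size; n++)
                if (g.EdgeExists(current, n) && !visited[n])
                    s.push(n);
        }
    } while (!s.empty() && current != target);
    return current == target;              // 1 if the target was reached, else 0
}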

Binary Trees
A tree is a widely-used data structure that emulates a hierarchical tree structure with a set of
linked nodes.
A node is a structure which may contain a value, a condition, or represent a separate data
structure (which could be a tree of its own). A node has at most one parent.
An internal node (also known as an inner node or branch node) is any node of a tree that has child nodes. Similarly, an external node (also known as an outer node, leaf node, or terminal node) is any node that does not have child nodes. The topmost node in a tree is called the root node. Being the topmost node, the root node will not have a parent.

In binary trees, each node has at most 2 children (i.e. each node has 0, 1 or 2 children).

(Figure: a small binary tree on the nodes N1–N4.)

Properties of binary tree


1. If every node of a binary tree T has zero or two children, the number of external nodes is 1 more than the number of internal nodes.
2. If h is the height of a binary tree T and n is the number of its nodes, then
i. Number of nodes in T: n ≤ 2^(h+1) − 1
ii. The height of T is at least ceil(log2(n+1)) − 1, i.e. h ≥ ceil(log2(n+1)) − 1 (a worked check follows this list)
3. depth (or level) of a node:
• root has level 1
• otherwise 1+ level of parent
4. height of a tree:
• if the tree is empty, its height is 0
• otherwise, its height is 1 + max{ height TL, height TR },
where TL and TR designate left and right subtrees
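
As a quick worked check of property 2 (an added illustration, not from the original notes): a binary tree of height h = 2 can hold at most 2^(2+1) − 1 = 7 nodes, and conversely a tree with n = 7 nodes has height at least ceil(log2(7+1)) − 1 = 3 − 1 = 2.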

Full and Complete Binary Trees


A full binary tree (sometimes proper binary tree or 2-tree) is a tree in which every node other
than the leaves has two children. A complete binary tree is a binary tree in which every
level, except possibly the last, is completely filled, and all nodes are as far left as possible.
(Figure: a complete binary tree.)


A typical class supporting a binary tree in C++ is:
class tree
{
public:
    int key;
    tree * left;
    tree * right;
};
Procedure for finding the height (Java-style):
public int height(Node p){
    int l, r, h;
    if(p == null)
        h = -1;
    else{
        l = height(p.leftChild());
        r = height(p.rightChild());
        if(l > r)
            h = l + 1;
        else
            h = r + 1;
    }
    return h;
}
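
A hedged C++ equivalent of the same computation (an illustration, not from the notes), written against the tree class defined just above:

// Height of a binary tree in C++ (empty tree has height -1, a single node has height 0)
int height(tree *p)
{
    if (p == NULL)
        return -1;
    int l = height(p->left);       // height of the left subtree
    int r = height(p->right);      // height of the right subtree
    return (l > r ? l : r) + 1;    // one more than the taller subtree
}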

Binary trees have one key and two pointers in each node. The leaves of the tree
are indicated by null pointers.

E.g. the binary tree below can be represented by the C++ code that follows.

(Figure: a binary tree with root 5, whose left child 3 has children 1 and 4, and whose right child is 8.)



#include <iostream>
using namespace std;

typedef struct tree {
    int key;
    tree * left;
    tree * right;
} Tree;

int main(){
    Tree node[5];
    node[0].key = 5;
    node[1].key = 3;
    node[2].key = 8;
    node[3].key = 1;
    node[4].key = 4;

    node[0].left  = &node[1];
    node[0].right = &node[2];
    node[1].left  = &node[3];
    node[1].right = &node[4];
    node[2].left = node[2].right = NULL;   // the leaves have no children
    node[3].left = node[3].right = NULL;
    node[4].left = node[4].right = NULL;

    cout << node[1].key << endl;
    return 0;
}

The following is an algorithm for Inserting items to binary tree.


algorithm Insert(value)
  Pre: value has passed custom type checks for type T
  Post: value has been placed in the correct location in the tree
  if root = null then
    root ← node(value)
  else
    InsertNode(root, value)
  end if
  return

algorithm InsertNode(current, value)
  Pre: current is the node to start from
  Post: value has been placed in the correct location in the tree
  if value < current.Value then
    if current.Left = null then
      current.Left ← node(value)
    else
      InsertNode(current.Left, value)
    end if
  else
    if current.Right = null then
      current.Right ← node(value)
    else
      InsertNode(current.Right, value)
    end if
  end if
  return

The insertion algorithm is split into two parts for a good reason. The first algorithm (non-recursive) checks the very core base case – whether or not the tree is empty. If the tree is empty then we simply create our root node and finish. In all other cases we invoke the recursive InsertNode algorithm, which simply guides us to the first appropriate place in the tree to put value. Note that at each stage we perform a binary chop: we either choose to recurse into the left subtree or the right by comparing the new value with that of the current node. For any totally ordered type, no value can simultaneously satisfy the conditions to place it in both subtrees.

2.4 Binary Tree traversal


Binary tree traversal means, visiting each node of a binary tree in a specified order.
There are a number of algorithms for traversing a binary tree given a pointer to the root of the
tree. The most common strategies are preorder, inorder, and postorder.
❖ The preorder strategy visits the root prior to visiting the left and right subtrees.
❖ The inorder strategy visits the left subtree, the root, and the right subtree.
❖ The postorder strategy visits the left subtree, the right subtree, followed by the root.
These strategies are recursively invoked.

Visiting a node simply means accessing it to do something with it. It could be displaying the
name of the node, changing the label etc.
To traverse, always start from the root of the tree and perform the visitation in the order of the traversal strategy.



If the following tree is to be traversed, it would be traversed as follows.

(Figure: an expression tree with root =, whose left child / has children 1 and 4, and whose right child + has children c and d.)

Logically attach an arrow to all the nodes either at the left (for pre-order), below (for in order)
or right (for post-order) and then move from the root leftward, rightward and then upward
back to the root around all the nodes. When an arrow is encountered, the node is visited (or
its name is written).

(Figure: the same tree drawn three times – Tree A: Preorder, Tree B: Inorder, Tree C: Postorder – with the visiting arrows attached.)

After traversal, the following are the visited node values:
Preorder traversal (Tree A): =/14+cd

Inorder traversal (Tree B): 1/4=c+d


Postorder traversal (Tree C): 14/cd+=

NB: To convert an expression given in preorder into infix (algebraic) form, scan from right to left until an operator is encountered. For example, the preorder string =/14+cd corresponds to the infix expression 1/4 = c+d.

2.5 Algorithms for tree traversal


Assume Node is defined to have value, left and right children as follows:


C++ definition
typedef struct node
{
    int val;
    struct node *left, *right;
} node, *tree; // tree has been typedefed as a node pointer

Recursively, a tree could be traversed as follows


1. Preorder traversal
Algorithm
preorder(node)
if node = null then return
print node.value
preorder(node.left)
preorder(node.right)

C++ implementation
void preorder(tree t)
{
    if(t == NULL)
        return;
    cout << " " << t->val; // visiting the root
    preorder(t->left);
    preorder(t->right);
}

2. Inorder Traversal
Algorithm
inorder(node)
if node = null then return
inorder(node.left)
print node.value
inorder(node.right)
C++ implementation
void inorder(tree t)
{
if(t == NULL)
return;
inorder(t->left);
cout<<" "<<t->val;//Visiting the root
inorder(t->right);
}

3. Postorder traversal
Algorithm
postorder(node)
if node = null then return
postorder(node.left)
postorder(node.right)
print node.value

C++ implementation
void postorder(tree t)
{
if(t == NULL)
return;
postorder(t->left);
postorder(t->right);
cout<<" "<<t->val;//Visiting root
}
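
As a hedged usage sketch (not in the original notes), the three traversals can be exercised on the five-node tree from section 2.4; the array-based construction below mirrors that earlier example and assumes it is appended to a file that already contains the typedef above together with the usual #include <iostream> and using namespace std.

// Illustrative driver: build the tree with keys 5, 3, 8, 1, 4 and run the traversals
int main()
{
    node n[5];
    n[0].val = 5; n[1].val = 3; n[2].val = 8; n[3].val = 1; n[4].val = 4;
    n[0].left = &n[1]; n[0].right = &n[2];     // 5 has children 3 and 8
    n[1].left = &n[3]; n[1].right = &n[4];     // 3 has children 1 and 4
    n[2].left = n[2].right = NULL;             // the leaves have null children
    n[3].left = n[3].right = NULL;
    n[4].left = n[4].right = NULL;

    preorder(&n[0]);  cout << endl;            // 5 3 1 4 8
    inorder(&n[0]);   cout << endl;            // 1 3 4 5 8
    postorder(&n[0]); cout << endl;            // 1 4 3 8 5
    return 0;
}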

2.6 Binary search tree


Time complexity in big O notation:

           Average      Worst case
Space      O(n)         O(n)
Search     O(log n)     O(n)
Insert     O(log n)     O(n)
Delete     O(log n)     O(n)



In computer science, a binary search tree (BST), which may sometimes also be called an
ordered or sorted binary tree, is a node-based binary tree data structure which has the
following properties:

• The left subtree of a node contains only nodes with keys less than the node's key.
• The right subtree of a node contains only nodes with keys greater than the node's key.
• Both the left and right subtrees must also be binary search trees.

Generally, the information represented by each node is a record rather than a single data
element. However, for sequencing purposes, nodes are compared according to their keys
rather than any part of their associated records.

The major advantage of binary search trees over other data structures is that the related
sorting algorithms and search algorithms such as in-order traversal can be very efficient.

Binary search trees are a fundamental data structure used to construct more abstract data
structures such as sets, multisets, and associative arrays.

Trials 3
1. Give two advantages and disadvantages of breadth first search and depth first search
2. Write down algorithm for Preorder, Inorder and Postorder traversal of trees
3. Define a node in C++ and write down the C++ implementation for each of the traversals in
question 2.
4. Give the breadth-first search and depth-first search traversals of the following graph:

(A graph on the vertices V1–V7.)
5. List the properties of a binary tree.

3.0 RECURSION
A procedure may contain a repetitive task, which can be achieved by iteration or by recursion. With iteration, the procedure implements a looping control structure such as while, do…while or for, but does not call itself in its definition. A procedure is recursive when it directly or indirectly calls itself. Recursion is a powerful technique for defining an algorithm: when a function calls itself it is known as recursion, and this is an important concept in Computer Science. Recursion can lead to very simple and elegant programs, but if not well defined it can create infinite loops in a program.
Generally, for a function to be recursive, the function itself is referenced on the right-hand side of its own definition.

Examples of recursive functions

1. Factorial


One of the simplest examples of recursion is the factorial function f(n) = n!. This function can be defined recursively as

(p*)   f(n) = 1 if n = 0;   f(n) = n × f(n−1) if n > 0

so that n! = n·(n−1)·(n−2)·…·2·1.
Programmatically, the factorial of a number can be defined using iteration or recursion as
follows:
algorithm fact(n) // using iteration
Input: n is a positive integer
Output: the factorial of n, fact(n) = n×(n−1)×(n−2)×…×1
  k ← 1
  for x ← n to 1 step -1
    k ← k * x
  next x
  return k

algorithm factr(n) // using recursion
Input: n is a positive integer
Output: the factorial of n, factr(n) = n×(n−1)×(n−2)×…×1
  if (n = 0) then return 1
  return n * factr(n−1)
From the above it can be observed that fact(n) implements the factorial using a loop (specifically a for loop), whereas factr(n) makes use of recursion by calling itself in its definition. The two yield the same result. The factr(n) procedure is more concise than fact(n), and it makes the definition of the factorial clearer, as expressed mathematically above in (p*).
Below is a simple C++ program that implements the recursive factorial function.
#include <iostream>
using namespace std;

long factr(int x)
{
    if(x == 0)
        return 1;
    else
        return (x * factr(x-1));
}

int main(){
    for(int i=1; i<10; i++)
        cout << factr(i) << endl; // calling the factorial function on the numbers 1 to 9
    return 0;
}

2. Fibonacci Numbers
The Fibonacci sequence is a series in which each term is the sum of the previous two terms, and the first two terms are given, e.g. 1 1 2 3 5 8 13 …. Here, 1 and 1 are the first two terms, and each term after the first two is obtained by adding the previous two: 2 = 1 + 1, 3 = 1 + 2, 5 = 2 + 3, 8 = 3 + 5 and so on.
The Fibonacci sequence F(n) is defined recursively by the recurrence relation
F(0) = 1, F(1) = 1 // the first two terms
F(n) = F(n−1) + F(n−2)
When listing the sequence one can stop at some point, say after t terms, which means there will be t minus 2 terms generated by the function calls (the first two being given). The iterative and recursive definitions are given algorithmically below:
algorithm fibo(w, u, t) // using iteration
Input: w and u are the first two terms; t is the required number of terms in the sequence
Output: a list of the Fibonacci sequence
  count ← 0
  print (w, ' ', u)
  while count < t − 2
  begin
    z ← w + u
    print (' ', z)
    w ← u
    u ← z
    count ← count + 1
  end
  return
algorithm fibor(w, u, t) // using recursion
Input: w and u are the first two terms; t is the required number of terms in the sequence
Output: a list of the Fibonacci sequence
  if (t = 0) then return
  print (w + u)
  fibor(u, w + u, t − 1)
// Assumption: the first two terms are assumed to be already printed, so only the remaining terms are printed (the initial call is therefore made with t − 2).

Here too, fibo(w,u,t) accomplishes its task by iteration whereas fibor(w,u,t) accomplishes its task using recursion. It can again be seen that fibor is more concise than its iterative counterpart fibo, but both produce the same output.



Recursion can always be implemented if the task under consideration is itself recursive. You can define any function iteratively or recursively, depending on how its repetitive task is defined.

Assignment:
1. Write a simple program (Java or C++) which implements the Fibonacci sequence recursively. Your program should print the first 200 terms of the sequence.
2. Using C++, define a factorial function recursively
3. Implement the recursive algorithm called getLCM that accepts two numbers f and s and
returns the Least Common Multiple (LCM) of the two numbers in C++. The LCM of two
numbers is a least integer which is divisible by both two numbers. e.g. LCM of 6 and 8 is
24.
algorithm getLCM( f , s, m, n)
Input: f and s are the numbers to find their lcm, m is the maximum of the two and n is natural number
Output: returns the LCM of the two numbers f and s
if m mod f=0 and m mod s=0 then return m
if f * n > s * n then
return getLCM(f,s, s * n, n+1)
else
return getLCM(f,s, f * n, n+1)
end if

3.1 Binary Search Tree


A binary search tree (BST) is a node based binary tree data structure which has the following
properties:
• The left subtree of a node contains only nodes with keys less than the node’s key
• The right subtree of a node contains only nodes with keys greater than the node’s
key
• Each node has a distinct key.
• Both the left and right subtrees must also be binary search trees.

Generally, the information represented by each node is a record rather than a single data element. However, for sequencing purposes, nodes are compared according to their keys rather than any part of their associated records.



The major advantage of binary search trees over other data structures is that the related
sorting algorithms and search algorithms such as in-order traversal can be very efficient.

Example: Represent the following sequence of numbers as a binary search tree: 9, 3, 6, 1, 10, 14, 7, 13, 5.

(Resulting BST: root 9; left subtree: 3 with left child 1 and right child 6, where 6 has children 5 and 7; right subtree: 10 with right child 14, where 14 has left child 13.)

What will be the binary Search tree for the following sequence of keys?

• 80, 100, 50, 30, 23, 200, 95, 7, 9, 8, 90, 4, 1, 87, 150, 17, 43
• 8, 5, 10, 15, 4, 6, 9, 20, 2, 17, 32, 1, 0,18

3.2 Searching the Binary Search Tree


Searching a binary tree for a specific value can be a recursive or iterative process. We will
first examine the recursive method.

We begin by examining the root node. If the tree is null, the value we are searching for does not exist in the tree. Otherwise, if the value equals the root, the search is successful. If the value is less than the root, search the left subtree. Similarly, if it is greater than the root, search the right subtree.



This process is repeated until the value is found or the indicated subtree is null. If the
searched value is not found before a null subtree is reached, then the item must not be present
in the tree.

If we represent a tree by struct as shown below,

typedef struct tree {


int key;
tree * left;
tree * right;
}Tree;

Then the recursive algorithm for searching a binary search tree as shown below:

int search_binary_tree(Tree* node, int val){
    if (node == NULL)
    {
        return -1; // key not found (-1 used as a sentinel value)
    }
    if (val < node->key)
    {
        return search_binary_tree(node->left, val);
    }
    else if (val > node->key)
    {
        return search_binary_tree(node->right, val);
    }
    else
    { // key is equal to the node's key
        return node->key; // found key
    }
}


This operation requires O(log n) time in the average case, but needs O(n) time in the worst-
case, when the unbalanced tree resembles a linked list.

The algorithm can also be easily implemented in terms of an iterative approach.

The algorithm enters a loop, and decides whether to branch left or right depending on the
value of the node at each parent node.

bool BinarySearchTree(Tree* node, int val)
{
    while (node != NULL)
    {
        if (val == node->key)
        {
            return true;
        }
        else if (val < node->key)
        {
            node = node->left;
        }
        else if (val > node->key)
        {
            node = node->right;
        }
    }

    // not found
    return false;
}

3.3 Insertion into a Binary Search Tree




Insertion begins as a search would begin; if the root is not equal to the value, we search the
left or right subtrees as before. Eventually, we will reach an external node and add the value
as its right or left child, depending on the node's value.

In other words, we examine the root and recursively insert the new node to the left subtree if
the new value is less than the root, or the right subtree if the new value is greater than or
equal to the root.

Algorithm:

/* Inserts the node pointed to by "newNode" into the subtree rooted at "treeNode" */

void InsertNode(Node*& treeNode, Node* newNode)
{
    if (treeNode == NULL)
        treeNode = newNode;               // the pointer is passed by reference so the change sticks
    else if (newNode->key < treeNode->key)
        InsertNode(treeNode->left, newNode);
    else
        InsertNode(treeNode->right, newNode);
}
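
A hedged usage sketch (not from the notes), assuming a Node type with key, left and right members and a small helper to allocate nodes:

// Illustrative driver: build a BST from the keys 9, 3, 6, 1, 10 using InsertNode
Node* makeNode(int k)
{
    Node* p = new Node;
    p->key = k;
    p->left = p->right = NULL;
    return p;
}

int main()
{
    int keys[] = {9, 3, 6, 1, 10};
    Node* root = NULL;                        // start with an empty tree
    for (int i = 0; i < 5; i++)
        InsertNode(root, makeNode(keys[i]));  // root is updated through the reference parameter
    return 0;
}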

3.4 Deletion in a Binary Search Tree


There are three possible cases to consider:

• Deleting a leaf (node with no children): Deleting a leaf is easy, as we can simply
remove it from the tree.
• Deleting a node with one child: Delete it and replace it with its child.
• Deleting a node with two children: Call the node to be deleted "N". Do not delete N.
Instead, choose either its in-order successor node or its in-order predecessor node,
"R". Replace the value of N with the value of R, then delete R. (Note: R itself has up
to one child.)


As with all binary trees, a node's in-order successor is the left-most node of its right subtree, and a node's in-order predecessor is the right-most node of its left subtree. In either case, this node will have at most one child, so it can be deleted according to the first or second rule above.
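
The following is a hedged C++ sketch of the three deletion cases (an illustration only, not the notes' own code); it assumes the Tree struct used in section 3.2 and, for a node with two children, copies the in-order successor's key into the node and then deletes the successor.

// Illustrative BST deletion sketch; returns the (possibly new) root of the subtree
Tree* deleteNode(Tree* node, int val)
{
    if (node == NULL)
        return NULL;                                   // key not present
    if (val < node->key)
        node->left = deleteNode(node->left, val);
    else if (val > node->key)
        node->right = deleteNode(node->right, val);
    else {
        if (node->left == NULL) {                      // leaf, or only a right child
            Tree* child = node->right;
            delete node;
            return child;
        }
        if (node->right == NULL) {                     // only a left child
            Tree* child = node->left;
            delete node;
            return child;
        }
        Tree* succ = node->right;                      // two children: find in-order successor R
        while (succ->left != NULL)
            succ = succ->left;
        node->key = succ->key;                         // replace N's value with R's value
        node->right = deleteNode(node->right, succ->key); // then delete R
    }
    return node;
}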
For example, if we are to delete the node with key 10 from the tree below (the middle tree), we will solve it as below.

(Figure: the original tree has root 10, with left child 3 – whose children are 1 and 6, where 6 has children 5 and 7 – and right child 12, whose children are 11 and 14. The other two trees show the result of replacing 10 with its in-order predecessor 7 and with its in-order successor 11.)

Exercises: Using the tree in the figure below as reference, what will be the new trees generated when the nodes with the following keys are deleted?

(a) 3
(b) 14
(c) 6
(d) 12

(Figure: a binary search tree with root 10, similar to the example tree above.)
3.5 B-Trees
When working with large sets of data, it is often not possible or desirable to maintain the
entire structure in primary storage (RAM). Instead, a relatively small portion of the data
structure is maintained in primary storage, and additional data is read from secondary storage
as needed. Unfortunately, a magnetic disk, the most common form of secondary storage, is significantly slower than random access memory (RAM). In fact, the system often spends more time retrieving data than actually processing it.



B-trees are optimized for situations when part or all of the tree must be maintained in
secondary storage such as a magnetic disk. Since disk accesses are expensive (time
consuming) operations, a b-tree tries to minimize the number of disk accesses.

The B-tree is a generalization of a binary search tree in that more than two paths diverge
from a single node.

Definition

A B-tree of order m (the maximum number of children for each node) is a tree which satisfies
the following properties:

1. Every node has at most m children.


2. Every node (except root and leaves) has at least m⁄2 children.
3. The root has at least two children if it is not a leaf node.
4. All leaves appear in the same level.

Unlike a binary tree, each node of a B-tree may have a variable number of keys and children. The keys are stored in non-decreasing order. Each key has an associated child that is the root of a subtree containing all nodes with keys less than or equal to that key but greater than the preceding key. A node also has an additional rightmost child that is the root of a subtree containing all keys greater than any key in the node.

The number of branches (or child nodes) from a node will be one more than the number of
keys stored in the node.

(An example of a B-tree: the root holds the keys 4, 10, 16 and its four leaf children hold the keys {1, 2, 3}, {6, 7, 9}, {11, 12, 15} and {17, 22, 30}.)

3.5.1 Insertion in B-Trees

All insertions start at a leaf node. To insert a new element

Search the tree to find the leaf node where the new element should be added. Insert the new
element into that node with the following steps:

1. If the node contains fewer than the maximum legal number of elements, then there is
room for the new element. Insert the new element in the node, keeping the node's
elements ordered.
2. Otherwise the node is full, so evenly split it into two nodes.
1. A single median is chosen from among the leaf's elements and the new
element.
2. Values less than the median are put in the new left node and values greater
than the median are put in the new right node, with the median acting as a
separation value.
3. Insert the separation value in the node's parent, which may cause it to be split,
and so on. If the node has no parent (i.e., the node was the root), create a new
root above this node (increasing the height of the tree).

E.g: The figure below displays the steps used to insert the numbers 1, 2, 3, 4, 5, 6, 7.

A B-Tree insertion example with each iteration

Assignment:

Using diagrams, show how the following numbers are inserted into a B-tree of order 4:

5, 12, 18, 7, 15, 12, 4, 3, 1, 16, 19, 20, 32, 50, 43, 28, 76, 100, 85, 96



4.0 SEARCHING
In modern computing, large volumes of data are routinely examined to locate a required item
within a data set. Data may be organised in many different ways, but searching a list of items
always rests on the same idea: a set of elements must be scanned to locate a particular element.
Many algorithms have been devised to accomplish this task, and the sections that follow describe
the main searching techniques.

4.1 Simple Searching


The need to search an array for a particular value is a common problem; for example, when
deleting a name from a mailing list or updating an employee's salary.

The simplest method for searching is called the sequential search. Simply move through the
array from beginning to end, stopping when you have found the value you require.

Below is a C++ program that calls a sequential search function SearchAges

/* A C++ program demonstrating sequential search */

#include<iostream>
using namespace std;

int SearchAges(int ages[], int age, int n);

int main(){

int ages[] = {10,15,35,40,100} ; //An array containing ages.


int result;

result = SearchAges(ages,10,5); //we are searching for 10 and the array size is 5

if(result > -1)
cout<<"The Age "<<result<<" exists."<<endl;
else
cout<<"The Age doesn't exist."<<endl;

system("PAUSE");

}
/* A function to return an age from an array if it exists
* "ages[]" is the array containing the ages, "age" is the age
* we are looking for, and n is the size of the age array.
*/
int SearchAges(int ages[], int age, int n)
{
int j;
for(j=0; j<n; j++){
if(ages[j] == age){
return ages[j];
}
}

return -1;
}

4.2 Ordered Sequential Search


If we know that the array is sorted in ascending order, then we can stop searching as soon as
the search key is less than the item at the current position in the list. In large lists this can
save an enormous amount of time.
int SearchAges(int ages[], int age, int n)
{
int j;
for(j=0; j<n; j++){
if(ages[j] == age)
return ages[j];
if(ages[j] > age)
return -1; // The age doesn’t exist
}
return -1; //The age doesn’t exist
}

4.3 Binary Search


Searching small lists doesn’t require much computation time. However, as lists get longer
(e.g. phone directories), sequential searching becomes extremely inefficient.
A binary search consists of examining the middle element of an ordered (sorted) array to see if it
has the desired value. If not, then half of the array may be discarded for the next search.

4 7 19 25 36 37 50 100 105 205 220 271 301 321



Binary Search Example

#include <iostream>
using namespace std;

int isIn(int k, int a[], int n); //The binary search function

int main(void)
{
int a[] = {4,7,19,25,36,37,50,100,105,205,220,271,301,321};

if(isIn(105, a, 14) >= 0){
cout<<"Found It!\n";
}
else
cout<<"Not in List\n";

system("PAUSE");
return 0;
}

/*The isIn function. It checks whether the int k exists in the


* array a, n is the size of the array
*/
int isIn(int k, int a[], int n)
{
int l, r, m; //to represent left, right and middle
l = 0;
r = n - 1;

while(l <= r){


m = (l+r)/2;
if(k == a[m])
return m; //found
else{
if (k > a[m])
l = m + 1;
else
r = m- 1;
}
}

return -1; //Cannot be found in the list


}

4.4 Recursive Binary Search


Much of the processing in the previous function was controlled by a while() loop.
We now know how to replace this by careful use of recursion.

int RecBin(int k, int a[], int l, int r)


{
int m;
if(l > r)
return -1;

m = (l+r)/2;
if(k == a[m])
return m;
else{
if (k > a[m])
return RecBin(k, a, m+1, r);
else
return RecBin(k, a, l, m-1);
}
}
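
As a quick illustration, the recursive version is called with the full index range of the array;
the values below are only example data.

// e.g. inside main():
int a[] = {4, 7, 19, 25, 36, 37, 50, 100, 105, 205, 220, 271, 301, 321};
int n = sizeof(a) / sizeof(a[0]);

int pos = RecBin(105, a, 0, n - 1); // index of 105 in a[], or -1 if it is not present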

4.5 Interpolation Search


When we look for a name in a phone book, we don’t start in the middle. We make an
educated guess as to where to start based on the name of the person to be searched for.
This idea has led to the interpolation search.

In binary searching, we simply used the middle of an ordered list as a best guess as to where
to begin the search. Now we use an interpolation involving the key, the start of the list and
the end.

In each search step it calculates where in the remaining search space the sought item might be
based on the key values at the bounds of the search space and the value of the sought key,
through linear interpolation. The key value actually found at this estimated position is then
compared to the key value being sought. If it is not equal, then depending on the comparison,
the remaining search space is reduced to the part before or after the estimated position.

If l = 0, r = n-1 and k is the value we are looking for, then the formula for interpolation
search is

index = l + ceil( (r - l) * (k - L[l]) / (L[r] - L[l]) )

where l is the left index, r is the right index of the section being searched and L is the list to
search.

Therefore when searching for ‘15’ from the list below:


0 4 5 9 10 12 15 20

index = 0 + ceil (7 * (15 - 0)/(20 - 0)) = 6

a[6] = 15 which is equal to the key we are looking for.




Example 2:
Search for 20 in the list below:

int a[] = {0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30}

m = l + ceil( ((r-l) * (k-a[l])) / (a[r]-a[l]) )

m = 0 + ceil((15-0)*(20-0)/(30-0) )
m = 10
a[m] = 20 which is equal to the key we are looking for.

Below is the algorithm for Implementing the Interpolation Search

// Function to implement interpolation search (uses ceil from <cmath>)

// Returns index of k in sorted array a[], or -1 if not found

int interpolationSearch(int k, int a[], int n){

int l = 0;
int r = n-1;
int ind;

while (a[l] <= k && a[r] >= k) {

if (a[r] == a[l]) //remaining keys are all equal; avoid division by zero
break;

ind = l + ceil ((double)(k - a[l]) * (r - l) / (a[r] - a[l]));//Interpolation function

if (a[ind] < k) //less than
l = ind + 1;
else if (a[ind] > k) //greater than
r = ind - 1;
else
return ind; //we found it.
}

if (a[l] == k)
return l;
else
return -1; // Not found
}

4.6 Algorithmic Complexity


The complexity of an algorithm indicates how, for an input of n items, performance deteriorates
as n grows.

Searching and sorting algorithms each have a complexity associated with them, expressed in
big-O notation.
• Sequential Search : O(n)
• Binary Search : O(log n)
• Interpolation Search : O(log log n)



5.0 Hash Tables
Hash table (hash map) in computing is a data structure that implements an associative
array abstract data type, a structure that can map keys to values. A hash table uses a hash
function to compute an index, also called a hash code, into an array of buckets or slots, from
which the desired value can be found. During lookup, the key is hashed and the resulting hash
indicates where the corresponding value is stored.

Ideally, the hash function will assign each key to a unique bucket, but most hash table designs
employ an imperfect hash function, which might cause hash collisions where the hash function
generates the same index for more than one key. Such collisions are typically accommodated
in some way.

In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is
independent of the number of elements stored in the table. Many hash table designs also allow
arbitrary insertions and deletions of key–value pairs, at (amortized) constant average cost per
operation – O(1). The following figure 5.1 shows a sample hash table

Figure 5.1 Hash table


5.1 Hash Function
A hash function is any well-defined procedure or mathematical function that converts a large,
possibly variable-sized amount of data into a small datum, usually a single integer that may
serve as an index to an array. The values returned by a hash function are called hash values,
hash codes, hash sums, or simply hashes.

Hash functions are mostly used to speed up table lookup or data comparison tasks such as
finding items in a database. The following is an example of a hash function for inserting integer
keys into a hash table: h(K) = K mod n, where n is the size of the array. Therefore, a hash
function acts as a mapping, h(K), that maps from key K, onto the index i, of an entry. It’s like
a black-box into which we insert a key (e.g. Voters ID number) and out pops an array index.
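
A minimal sketch of such a function is shown below; the function name and the use of an integer
key are illustrative assumptions.

// Hash an integer key into a table of size n using h(K) = K mod n (sketch).
int hashKey(int key, int n)
{
    return key % n; // index in the range 0 .. n-1 for non-negative keys
}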

Example 1
As an example, let us use an array of size 11 to store some airport codes such as GHA, PHL,
DCA, FRA, ORY, GCM, etc.

From above, each airport code can be seen as a three letter string X2X1X0 and we assume
the letter ’A’ has an integer value 0, ’B’ has the value 1, ‘C’ has the value 2, etc.
If our hash function is as below:
h(K) = (X2 * 26² + X1 * 26 + X0) mod 11
Where mod (modulus) is the remainder after division.
Applying this to K= “DCA” we can hash DCA as follows: X2=D=3, X1=C=2 and X0=A=0

h("DCA") = (3 * 676 + 2 * 26 + 0) mod 11
h("DCA") = 2080 mod 11
h("DCA") = 1

Applying this to K = "PHL" we can hash PHL as follows: X2=P=15, X1=H=7 and X0=L=11
h("PHL") = (15 * 676 + 7 * 26 + 11) mod 11
h("PHL") = 10333 mod 11
h("PHL") = 4

Applying this to K = "ORY" we can hash ORY as follows: X2=O=14, X1=R=17 and X0=Y=24
h("ORY") = (14 * 676 + 17 * 26 + 24) mod 11
h("ORY") = 8

Applying this to K = "HKG" we can hash HKG as follows: X2=H=7, X1=K=10 and X0=G=6
h("HKG") = (7 * 676 + 10 * 26 + 6) mod 11
h("HKG") = 4

[Figure: an 11-slot hash table (indices 0 to 10) with "DCA" stored in slot 1, "PHL" in slot 4
and "ORY" in slot 8.]

It can be seen that the resulting hash value for "HKG" is the same as that of "PHL"; the two
different keys map to the same index, so a collision occurs. Inserting "PHL", "ORY" and
"GCM" succeeds, but inserting "HKG" causes a collision.

Figure 5.2 Hashing collision
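
The three-letter hash used in this example might be written as the following sketch; the table
size of 11 and the assumption of upper-case codes are taken from the example above.

// Sketch of the airport-code hash; assumes a 3-letter upper-case code.
int hashCode(const char code[3], int tableSize /* 11 in the example */)
{
    int x2 = code[0] - 'A'; // 'A' = 0, 'B' = 1, 'C' = 2, ...
    int x1 = code[1] - 'A';
    int x0 = code[2] - 'A';
    return (x2 * 26 * 26 + x1 * 26 + x0) % tableSize;
}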

5.2 Collision Handling


An ideal hashing function maps keys into the array in a uniform and random manner. Collisions
occur when a hash function maps two different keys onto the same address. The policy of
finding another free location if a collision occurs is called open-addressing.
The simplest method of open-addressing is linear probing. Other open-addressing methods are
quadratic probing and double hashing.

5.2.1 Linear Probing


In linear probing open addressing, when a collision occurs a free slot is sought by simply
incrementing the array index and checking whether the slot at the current index is available.


The Algorithm for the linear probing is as follows
Step 1: increment the current index I (where collision occurred) by 1 that is I = I + 1
Step 2: if I is above array size (n), wrap the index by the array size (I mod n)
Step 3: If the current index is free, we place the value at Ith location and go to step 5.
Step 4: otherwise, go to step 1
Step 5: Stop
Figure 5.3 shows a diagram for checking for availability. Assume the collision occurred at I = 3;
then I + 1 = 4 is occupied, so we increment again to I + 2 = 5, which is free, and that is where
the colliding item will be kept.

[Figure: an 11-slot table (indices 0 to 10); the collision occurs at index I = 3, slot I + 1 = 4
is occupied, and slot I + 2 = 5 is free.]

Figure 5.3 Probing for space during collision
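
A minimal C++ sketch of insertion with linear probing is given below; using -1 to mark an empty
slot and hashing with key mod n are illustrative assumptions.

// Insertion with linear probing (sketch). Empty slots in table[] hold -1.
// Assumes a non-negative key. Returns the index where the key was placed,
// or -1 if the table is full.
int insertLinearProbe(int table[], int n, int key)
{
    int i = key % n;                    // initial hash index
    for (int probes = 0; probes < n; probes++) {
        if (table[i] == -1) {           // free slot found
            table[i] = key;
            return i;
        }
        i = (i + 1) % n;                // step to the next slot, wrapping around
    }
    return -1;                          // every slot was occupied
}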

5.2.2 Quadratic Probing


Quadratic probing is an open addressing scheme in computer programming for resolving hash
collisions in hash tables. Quadratic probing operates by taking the original hash index and
adding successive values of an arbitrary quadratic polynomial until an open slot is found.

An example sequence using quadratic probing is:

H+1², H+2², H+3², H+4², ..., H+k²

Quadratic probing can be a more efficient algorithm in an open addressing table, since it better
avoids the clustering problem that can occur with linear probing, although it is not immune. It
also provides good memory caching because it preserves some locality of reference; however,
linear probing has greater locality and, thus, better cache performance.

Quadratic function
Let h(k) be a hash function that maps an element k to an integer in [0, m−1], where m is the
size of the table. Let the ith probe position for a value k be given by the function

h(k, i) = (h(k) + c1·i + c2·i²) mod m


where c2 ≠ 0 (If c2 = 0, then h(k, i) degrades to a linear probe). For a given hash table, the
values of c1 and c2 remain constant.
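
A small sketch of this probe function is shown below, using c1 = 0 and c2 = 1 as illustrative
constants and -1 as an empty-slot marker.

// Insertion with quadratic probing: h(k, i) = (h(k) + c1*i + c2*i*i) mod m (sketch).
int insertQuadraticProbe(int table[], int m, int key)
{
    const int c1 = 0, c2 = 1;           // illustrative constants
    int h = key % m;                    // initial hash index
    for (int i = 0; i < m; i++) {
        int idx = (h + c1 * i + c2 * i * i) % m;
        if (table[idx] == -1) {         // free slot found
            table[idx] = key;
            return idx;
        }
    }
    // With these constants the probe sequence may not visit every slot
    // (see the limitation below), so -1 does not always mean the table is full.
    return -1;
}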
Examples
• If h(k, i) = (h(k) + i + i²) mod m, then the probe sequence will be h(k), h(k)+2, h(k)+6, ...


• For m = 2ⁿ, a good choice for the constants is c1 = c2 = 1/2, as the values of h(k, i) for i in
[0, m−1] are all distinct. This leads to a probe sequence of h(k), h(k)+1, h(k)+3,
h(k)+6, ... (the triangular numbers), where the values increase by 1, 2, 3, ...
• For prime m > 2, most choices of c1 and c2 will make h(k, i) distinct for i in [0, (m−1)/2].
Such choices include c1 = c2 = 1/2, c1 = c2 = 1, and c1 = 0, c2 = 1. However, there are
only m/2 distinct probes for a given element, requiring other techniques to guarantee that
insertions will succeed when the load factor exceeds 1/2.
• For m = nᵖ, where m, n and p are integers greater than or equal to 2 (this degrades to a
linear probe when p = 1), h(k, i) = (h(k) + i + n·i²) mod m gives a cycle of all distinct probes.
It can be computed in a loop as: h(k, 0) = h(k), and h(k, i+1) = (h(k, i) + 2·i·n + n + 1) mod m.
• For any m, a full cycle with quadratic probing can be achieved by rounding m up to the closest
power of 2, computing the probe index as h(k, i) = (h(k) + (i² + i)/2) mod roundUp2(m), and
skipping an iteration whenever h(k, i) ≥ m. There are at most roundUp2(m) − m < m/2 skipped
iterations, and these iterations do not access memory, so this is a fast operation on most
modern processors.

Limitation of quadratic probing

When using quadratic probing (with the exception of the triangular-number case for a hash
table of size 2ⁿ), there is no guarantee of finding an empty cell once the table becomes more
than half full, or even before this if the table size is composite, because collisions must be
resolved using at most half of the table.

The converse can be proven as follows. Suppose a hash table has size p (a prime greater than 3),
with an initial location h(k) and two alternative locations h(k) + x² mod p and h(k) + y² mod p
(where 0 ≤ x, y ≤ p/2). If these two locations point to the same slot but x ≠ y, then
• h(k) + x² ≡ h(k) + y² (mod p)
• x² ≡ y² (mod p)
• x² − y² ≡ 0 (mod p)
• (x − y)(x + y) ≡ 0 (mod p)
Since p is prime while 0 < |x − y| < p and 0 < x + y < p, neither factor is divisible by p, giving
a contradiction; hence the first ⌊p/2⌋ + 1 probe locations are all distinct, and an empty cell can
always be found while the table is no more than half full.

5.2.3 Double Hashing


Double hashing is a computer programming technique used in conjunction with open
addressing in hash tables to resolve hash collisions, by using a secondary hash of the key as an
offset when a collision occurs. Double hashing with open addressing is a classical scheme on a
table T.
The double hashing technique uses one hash value as an index into the table and then repeatedly
steps forward an interval until the desired value is located, an empty location is reached, or the
entire table has been searched; but this interval is set by a second, independent hash function.
Unlike the alternative collision-resolution methods of linear probing and quadratic probing, the
interval depends on the data, so that values mapping to the same location have different bucket
sequences; this minimizes repeated collisions and the effects of clustering.

Given two random, uniform, and independent hash functions h1 and h2, the ith location in the
bucket sequence for value k in a hash table of |T| buckets is:

h(i, k) = (h1(k) + i · h2(k)) mod |T|

Generally, h1 and h2 are selected from a set of universal hash functions; h1 is selected to have
a range of {0, ..., |T|−1} and h2 to have a range of {1, ..., |T|−1}. Double hashing
approximates a random distribution; more precisely, pair-wise independent hash functions yield
a probability of (n/|T|)² that any pair of keys will follow the same bucket sequence.
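
A minimal sketch of the double-hashing probe sequence follows; the two hash functions chosen
here are illustrative assumptions (any h2 that never evaluates to 0 would serve), and the table
size is assumed to be greater than 1.

// Double hashing (sketch): h(i, k) = (h1(k) + i * h2(k)) mod tableSize.
int h1(int k, int tableSize) { return k % tableSize; }
int h2(int k, int tableSize) { return 1 + (k % (tableSize - 1)); } // never 0

int insertDoubleHash(int table[], int tableSize, int key)
{
    for (int i = 0; i < tableSize; i++) {
        int idx = (h1(key, tableSize) + i * h2(key, tableSize)) % tableSize;
        if (table[idx] == -1) {          // -1 marks an empty slot in this sketch
            table[idx] = key;
            return idx;
        }
    }
    return -1;                           // probe sequence exhausted
}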

5.2.4 Separate Chaining


Open-addressing is not the only method of collision reduction. Another way of resolving
collision is by the use of separate chaining. In the method known as separate chaining, each
bucket is independent, and has some sort of list of entries with the same index. The time for
hash table operations is the time to find the bucket (which is constant) plus the time for the list
operation.
In most implementations buckets will have few entries, if the hash function is working
properly. Therefore, structures that are efficient in time and space for these cases are preferred.
Structures that are efficient for a fairly large number of entries per bucket are not needed or
desirable. If these cases happen often, the hashing function needs to be fixed.
There are some implementations which give excellent performance for both time and space,
with the average number of elements per bucket ranging between 5 and 100. Separate chaining
can be implemented with a linked list, a list head cell, or any other structure that makes
searching efficient.
For instance, the airport codes example will have the collision handled as shown in Figure 5.4
below.


Figure 5.4 Separate Chaining Collision Handling

Accessing an element of a hash table is a direct (constant-time) operation, hence the average
time complexity of hashing is O(1).
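
A minimal sketch of a separately chained table is given below; the choice of std::vector and
std::list and the use of string keys are illustrative, not prescribed by the notes above.

#include <list>
#include <string>
#include <vector>

// Separate chaining (sketch): each bucket keeps a list of the keys that hash to it.
class ChainedHashTable {
public:
    explicit ChainedHashTable(int size) : buckets(size) {}

    void insert(const std::string& key) {
        buckets[hashOf(key)].push_back(key);
    }

    bool contains(const std::string& key) const {
        for (const std::string& k : buckets[hashOf(key)])
            if (k == key) return true;
        return false;
    }

private:
    int hashOf(const std::string& key) const {
        unsigned long h = 0;
        for (char c : key) h = h * 26 + (c - 'A'); // same spirit as the airport-code hash
        return static_cast<int>(h % buckets.size());
    }

    std::vector< std::list<std::string> > buckets;
};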

6.0 SORTING
In Computer Science, a sorting algorithm is an algorithm that puts elements of a list in a
certain order. The most-used orders are numerical order and lexicographical order. Efficient
sorting is important to optimizing the use of other algorithms (such as search algorithms) that
require sorted lists to work correctly.

6.1 Bubble Sort


Bubble sort is a straightforward and simplistic method of sorting data. The algorithm starts at
the beginning of the data set. It compares the first two elements, and if the first is greater than
the second, then it swaps them. It continues doing this for each pair of adjacent elements to
the end of the data set. It then starts again with the first two elements, repeating until no
swaps have occurred on the last pass. This algorithm is highly inefficient, and is rarely used.
For example, if we have 100 elements then the number of comparisons is on the order of
100² = 10,000 in the worst case.

Algorithm
int x, tmp, swapped;
do
{
swapped = 0;
for (x = 0; x < array.size - 1; x++)
{
if (array[x] > array[x+1])
{
swapped = 1;
tmp = array[x];
array[x] = array[x + 1];
array[x + 1] = tmp;
}
}
} while (swapped);

An example:

3 1 4 1 5 9 2 6 5 4

1 3 1 4 5 2 6 5 4 9

1 1 3 4 2 5 5 4 6 9


1 1 3 2 4 5 4 5 6 9

1 1 2 3 4 4 5 5 6 9

Complexity: Bubble sort average case and worst case are both O(n²)

6.2 Selection Sort


Selection sort is a simple sorting algorithm that improves on the performance of bubble sort.
It works by first finding the smallest element using a linear search and swapping it into the
first position in the list, then finding the second smallest element by scanning the remaining
elements, and so on.

Algorithm :
int x, y, min, tmp;
for (x = 0; x < array.size-1; x++)
{
min = x;
for (y=x+1; y<array.size; y++)
{
if (array[y] < array[min])
{
min = y;
}
}
/* swap the places */
tmp = array[x];
array[x] = array[min];
array[min] = tmp;
}


Example:

3 1 4 7 5 9 10 6 8 2

1 3 4 7 5 9 10 6 8 2

1 2 4 7 5 9 10 6 8 3

1 2 3 7 5 9 10 6 8 4

1 2 3 4 5 9 10 6 8 7

1 2 3 4 5 9 10 6 8 7

1 2 3 4 5 6 10 9 8 7

1 2 3 4 5 6 7 9 8 10

1 2 3 4 5 6 7 8 9 10

1 2 3 4 5 6 7 8 9 10


Complexity: Selection Sort average case and worst case are both O(n²)

6.3 Quicksort

Quicksort is a divide and conquer algorithm which relies on a partition operation: to partition
an array, we choose an element, called a pivot, move all smaller elements before the pivot,
and move all greater elements after it. We then recursively sort the lesser and greater sublists.

Description of the algorithm

If the array contains only one element or zero elements then the array is sorted.

If the array contains more than one element then:

• Select an element from the array. This element is called the "pivot element". For
example select the element in the middle of the array.
• All elements which are smaller than the pivot element are placed in one array and all
elements which are larger are placed in another array.
• Sort both arrays by recursively applying Quicksort to them.
• Combine the arrays

Quicksort can be implemented to sort "in-place". This means that the sorting takes place in
the array and that no additional arrays need to be created.

Complexity: The average case complexity is O(n log n).




Algorithm

void quickSort(int arr[], int left, int right) {


int i = left, j = right;
int tmp;
int pivot = arr[(left + right) / 2];

/* partition */
while (i <= j) {
while (arr[i] < pivot)
i++;
while (arr[j] > pivot)
j--;
if (i <= j) {
tmp = arr[i];
arr[i] = arr[j];
arr[j] = tmp;
i++;
j--;
}
}

/* recursion */
if (left < j)
quickSort(arr, left, j);
if (i < right)
quickSort(arr, i, right);
}
Example


1 12 5 26 7 14 3 7 2      (unsorted; pivot = arr[(0 + 8) / 2] = 7)

i stops at 12 and j stops at 2 (12 >= 7 >= 2), so swap 12 and 2:

1 2 5 26 7 14 3 7 12

i stops at 26 and j stops at 7 (26 >= 7 >= 7), so swap 26 and 7:

1 2 5 7 7 14 3 26 12

i stops at 7 and j stops at 3 (7 >= 7 >= 3), so swap 7 and 3:

1 2 5 7 3 14 7 26 12

Now i > j, so the partition stops. We partition the list using j as the right index of the first
sublist and i as the left index of the second sublist, and run quicksort recursively on each
sublist:

1 2 5 7 3 14 7 26 12

..................................................................................

1 2 3 5 7 7 12 14 26

Quick Sort with Java

public class Quicksort {


private int[] numbers;
private int number;

public void sort(int[] values) {


// Check for empty or null array
if (values ==null || values.length==0){
return;
}
this.numbers = values;
number = values.length;
quicksort(0, number - 1);
}

private void quicksort(int low, int high) {


int i = low, j = high;
// Get the pivot element from the middle of the list
int pivot = numbers[low + (high-low)/2];

// Divide into two lists


while (i <= j) {
// If the current value from the left list is smaller than the pivot
// element then get the next element from the left list
while (numbers[i] < pivot) {
i++;
}
// If the current value from the right list is larger than the pivot
// element then get the next element from the right list
while (numbers[j] > pivot) {
j--;
}

// If we have found a value in the left list which is larger than
// the pivot element and we have found a value in the right list
// which is smaller than the pivot element then we exchange the
// values.
// As we are done we can increase i and j
if (i <= j) {
exchange(i, j);
i++;
j--;
}
}
// Recursion
if (low < j)
quicksort(low, j);
if (i < high)
quicksort(i, high);
}

private void exchange(int i, int j) {


int temp = numbers[i];
numbers[i] = numbers[j];
numbers[j] = temp;
}
}

6.4 Heap sort


Heapsort is a much more efficient version of selection sort. It also works by determining the
largest (or smallest) element of the list, placing that at the end (or beginning) of the list, then
continuing with the rest of the list, but accomplishes this task efficiently by using a data
structure called a heap, a special type of binary tree (if B is a child node of A, then key(A) ≥
key(B)). This implies that an element with the greatest key is always in the root node; such a
heap is also known as a max-heap.

An example of a heap data structure

Once the data list has been made into a heap, the root node is guaranteed to be the largest(or
smallest) element. When it is removed and placed at the end of the list, the heap is rearranged
so the largest element remaining moves to the root. Using the heap, finding the next largest
element takes O(log n) time, instead of O(n) for a linear scan as in simple selection sort. This
allows Heapsort to run in O(n log n) time.

Complexity: Average case complexity is O(n log n)
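
A compact C++ sketch of heapsort is given below; the sift-down helper and the in-place array
form are one possible arrangement, not the only one.

// Heapsort on an int array using a max-heap (sketch); sorts in ascending order.
void siftDown(int a[], int start, int end)
{
    int root = start;
    while (2 * root + 1 <= end) {            // while the root still has a child
        int child = 2 * root + 1;            // left child
        if (child + 1 <= end && a[child] < a[child + 1])
            child++;                          // pick the larger child
        if (a[root] >= a[child])
            return;                           // heap property holds
        int tmp = a[root]; a[root] = a[child]; a[child] = tmp;
        root = child;
    }
}

void heapSort(int a[], int n)
{
    for (int start = n / 2 - 1; start >= 0; start--) // build the max-heap
        siftDown(a, start, n - 1);

    for (int end = n - 1; end > 0; end--) {          // move the largest remaining element to the end
        int tmp = a[0]; a[0] = a[end]; a[end] = tmp;
        siftDown(a, 0, end - 1);
    }
}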

Assignment:
1. Write C++ implementation for Bubble Sort
2. Write C++ implementation for Binary Search
3. Write a C++ program that accepts a list of integer numbers from the user, sorts it and
prints out the sorted list.
4. Write a C++ program that accepts a list of real numbers and prints the list in reverse
order to the screen.
5. Write down all the steps for performing binary sort on the following set of data:
5 45 3 6 23 7 9 2 15 3 1 39 50 30


PROJECT:
Write a program in C++ that is capable of accepting a set of Staff records (with details Staffno,
Name, Gender, Department Name and Basic) into a collection (an array) and printing the
staff list in ascending order based on the Net Salary. It should also group the staff list based
on the department.
Tasks:
1. Create a class called Staff with the details above
2. Read the number of staff from the user and create an array of the Staff type whose size is the
given number
3. Read the details of each staff member into each array location


4. Sort staff records based on Net Salary using Quick sorting method
5. Derive a set of distinct department from the records
6. Display records of staff belonging to each distinct department such as follows:
SNo Staff No staff Name Gender Mark
------------------------------------------------------------------------------
* Department1
***************
1. xxxxxx xxxxx xxxx X xx.xx
2. xxxxxx xxxxx xxxx X xx.xx
...
n. xxxxxx xxxxx xxxx X xx.xx
------------------------
No. of Staff : n
------------------------

* Department 2
***************
1. xxxxxx xxxxx xxxx X xx.xx
2. xxxxxx xxxxx xxxx X xx.xx
...
n. xxxxxx xxxxx xxxx X xx.xx
------------------------
No. of Staff : n
------------------------
...
* Department n
***************
1. xxxxxx xxxxx xxxx X xx.xx
2. xxxxxx xxxxx xxxx X xx.xx
...
n. xxxxxx xxxxx xxxx X xx.xx
------------------------
No. of Staff : n
------------------------

7. Ask for staff number from the user and search and display the record of a staff to the
screen.

8. Request the staff number of a staff from the user and print to the screen only the name and
the department of the staff.

NB: Net salary = basic * 120 – Tax

tax = 10% of basic

