Data Structures Course Module PDF
Course Module
Prepared by:
Mr. Robert Simushi Junior - BSc. Computer Science, Master’s degree in Software
Engineering
Assessments
Prescribed Textbooks
Recommended Textbooks
ADTs, Data Structures, & Prob Solving w/C++ 2nd Edition by Larry
Course Overview
students with knowledge and skills required to implement a variety of advanced data
structures using hash tables, priority queues, balanced search trees, graphs and to
Learning Outcomes
Unit I: Overview of Data Structures
Programming entails writing instructions that are meant for the computer to perform
analysis, searching, sorting data. This means that a program will be made up of the
instructions plus the data that will be worked on or processed using the instructions.
Therefore, when one wishes to learn how to program, it is important that they
learn the types of instructions they can use to write a program. They are equally
expected to learn about the means and mechanisms that facilitate efficient
data can be stored and organised in program to facilitate this desired efficiency.
A data structure is an implementation of a model or blueprint called an Abstract
data type. From our introduction to programming course we know that a data type is
a kind of data item or form in which data exists. Data types are defined by the values
they can take, the programming language and the operations that can be performed
on them. An abstract data type specifies the characteristics that a data type or data
structure is expected to have as well as the operations that can be performed on it.
memory
Processing alternatives for data stored and the amount of time required to
Data structures are normally classified into two broad categories:
Primitive data structures are basic structures that generally can only hold one data
item or value at any single point during program execution. Generally, primitive data
structures or types are directly operated upon by machine instructions. Primitive data
2.5, 0.56, 12.56
These data types are available in most programming languages as built-in data
types. This means that the programmer does not have to worry about how to create
Complex programs are generally written to manipulate data which is structured and
composed of several pieces of data such as records. To store such data items,
non-primitive/composite data structures are called upon. These data structures are more
sophisticated with respect to the operations allowed on them and the fact that they
can hold several pieces of data at once. They emphasize the structuring of a group of
Non-primitive data structures are derived from primitive data structures. Examples of
Array: A fixed-size sequenced collection of data items of the same data type
Non-primitive data types are further divided into Linear and Non-Linear data
structures. This distinction is as a result of how the individual elements stored in the
Linear data structures
A data structure is said to be linear if its elements are organized in a linear manner,
either logically or in sequential memory locations. This means that the individual
elements stored in these structures are accessed in a sequential manner and are
stored in such a way that to access the second element you may have to go through
a. Linked list:
The linked list is a dynamic data structure that is used to store a list of items. Linked
lists extend the array structure by allowing storage of entities that are composed of
b. Stack:
one end only. The insertion operation is referred to as ‘PUSH’ and deletion
being a structure that is based on the concept called Last in First out (LIFO).
c. Queue:
This data structure permits the insertion of elements at one end and deletion
at the other end. The end at which deletion occurs is known as the FRONT end
and the end at which insertion occurs is known as the REAR end. The Queue
takes its name from the concept of a real-life queue, where access to a
service is granted to the person at the front of the queue, while joining the
queue is done at the back or end. The Queue is also called a First in First
The data items stored in the linear data structures are sometimes referred to as
nodes.
Nonlinear data structures
A data structure is said to be non-linear if the data items stored in the structure are
include:
a. Tree:
A tree is a finite set of data items in which data items (also called nodes/
vertices) are arranged in a hierarchical order. This means the data items are
such as the case of a binary search tree where all elements on the right side
of a node are supposed to be bigger than the node and those on the left side
are expected to be smaller than the node. Trees represent the hierarchical
the circles.
b. Graph:
The graph is a data structure which extends the concept of a tree. This means
that, just like the tree, it is a collection of nodes and connecting edges. The
nodes, just as in the case of the tree, are used to represent/store data whereas
the edges represent the logical relationships among nodes. It is for this
reason that the tree is viewed as a restricted graph. There are many types of
graphs, namely:
Un-directed Graph
Directed Graph
Mixed Graph
Multi Graph
Simple Graph
Graphs are used to solve many real-life problems. Graphs are used to
also used in social networks like LinkedIn, where each person is represented with a
vertex (or node). Each node is a structure and contains information like person
Abstract data types describe the key characteristics that a data structure is expected
to have as well as the operations that are allowed on the data structure. Hence,
these operations. There are different types of operations that are allowed on different
data structures but generally they can be categorized into the following types:
data structure.
Searching: It determines whether a desired data item is present in the list of data items;
it may also find the locations of all elements that satisfy certain conditions.
manner.
There are two ways in which the memory required to store the values of a data structure is
allocated.
Introduction
A linked list is a collection of elements, generally representing records, that are
called nodes. Every node is made up of at least one data field and a pointer field/link.
The data field is used to store the data that is intended to be stored whereas the
pointer field is used to store the address of the next element in the list (i.e. contains
If we must create a linked list that is meant to store integers, graphically such a
Where the blue compartment represents the data field and the yellow compartment
of the node represents the pointer field. Since each node of a linked list has two
components, we need to declare each node as a struct. The data type of each node
depends on the specific application—that is, what kind of data is being processed.
However, the link component of each node is a pointer. The data type of this pointer
variable must correspond to the data type of the node itself. Therefore, the previous
struct node{
    int info;      // data field
    node *link;    // pointer field: address of the next node in the list
};
To understand how to manipulate data in a linked list, one needs to be familiar with
structure manipulation. In a structure the fields are accessed using the dot notation.
But since pointers are used to manipulate elements stored in the linked list, the
Example:
Suppose that the first node is at location 2000, the second node is at location 2800,
the third node is at location 1500, and the fourth node is at location 3600. Therefore,
the value of head is 2000, the value of the component link of the first node is 2800,
the value of the component link of the second node is 1500, and so on. Also, the
value 0 in the component link of the last node means that this value is NULL. The
Traversing of a list involves accessing all the nodes of the list to conduct some form
of processing. It now follows that we must traverse the list using another pointer of
the same type, because the pointer head always points to the first node. Suppose
that current is a pointer of the same type as head. The following code traverses the
list:
current = head;
while (current != NULL) {
    // process the node that current points to
    current = current->link;
}
For example, suppose that head points to a linked list of numbers. The following
current = head;
current = current->link;
Elements can be inserted at any position of the list i.e. at the beginning, in the middle
or the end of the list. These operations must be implemented using functions.
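The insertion and traversal operations described above can be sketched as follows. This is a minimal illustration built on the node struct with the members info and link; the function names insertFront, insertEnd, sumList and destroyList are hypothetical helpers introduced here and are not part of the module.

```cpp
#include <cstddef>

// Node layout as described in the text: one data field and one link field.
struct node {
    int info;
    node *link;
};

// Hypothetical helper: insert a new node at the beginning of the list.
// Returns the new head of the list.
node *insertFront(node *head, int value) {
    node *newNode = new node;
    newNode->info = value;
    newNode->link = head;      // the old first node becomes the second node
    return newNode;
}

// Hypothetical helper: insert a new node at the end of the list.
// Returns the (possibly new) head of the list.
node *insertEnd(node *head, int value) {
    node *newNode = new node;
    newNode->info = value;
    newNode->link = NULL;      // the last node always has a NULL link
    if (head == NULL) {
        return newNode;        // empty list: the new node is the head
    }
    node *current = head;
    while (current->link != NULL) {
        current = current->link;   // walk to the last node
    }
    current->link = newNode;
    return head;
}

// Hypothetical helper: traverse the list and add up the info fields.
int sumList(node *head) {
    int total = 0;
    for (node *current = head; current != NULL; current = current->link) {
        total += current->info;
    }
    return total;
}

// Hypothetical helper: release every node of the list.
void destroyList(node *head) {
    while (head != NULL) {
        node *next = head->link;
        delete head;
        head = next;
    }
}
```

Note that both insert functions return the head pointer, so the caller writes head = insertFront(head, 10); this keeps head pointing to the first node even when the first node changes.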
Unit III: Stacks, Queues and Hashing
Stacks
A stack is a container of objects that are inserted and removed based on the last-in
first-out (LIFO) principle. Objects can be inserted into a stack at any time, but only
the most recently inserted (that is, “last”) object can be removed at any time.
The name “stack” is derived from the metaphor of a stack of plates in a
spring-loaded cafeteria plate dispenser. In this case, the fundamental operations involve
the “pushing” and “popping” of plates on the stack. When we need a new plate from
the dispenser, we “pop” the top plate off the stack, and when we add a plate, we
Stacks are a fundamental data structure. They are used in many applications,
Internet Web browsers store the addresses of recently visited sites on a stack.
Each time a user visits a new site, that site’s address is “pushed” onto the
stack of addresses. The browser then allows the user to “pop” back to
Text editors usually provide an “undo” mechanism that cancels recent editing
Stacks are the simplest of all data structures, yet they are also among the most
important, since they are used in a host of different applications that include many
more sophisticated data structures. Formally, a stack is an abstract data type (ADT)
pop(): Remove the top element from the stack; an error occurs if the stack is
empty.
top(): Return a reference to the top element on the stack, without removing it;
The example below indicates how content stored in a stack is organized based on
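The stack operations can be sketched as a small class. This is only one possible implementation, assuming a stack of integers stored in a std::vector; the class name IntStack and the member names push, size and empty are assumptions made for illustration, while pop and top follow the operations listed above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// A minimal LIFO stack of integers (illustrative sketch, not the module's code).
class IntStack {
public:
    // Insert a new element at the top of the stack.
    void push(int value) {
        items.push_back(value);
    }

    // Remove the top element; an error occurs if the stack is empty.
    void pop() {
        if (items.empty()) {
            throw std::runtime_error("pop on empty stack");
        }
        items.pop_back();
    }

    // Return a reference to the top element without removing it.
    int &top() {
        if (items.empty()) {
            throw std::runtime_error("top on empty stack");
        }
        return items.back();
    }

    bool empty() const { return items.empty(); }
    std::size_t size() const { return items.size(); }

private:
    std::vector<int> items;   // the back of the vector is the top of the stack
};
```

Pushing 1, 2 and 3 and then calling top() yields 3, the most recently inserted element, which is the LIFO behaviour described in this section.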
Queues
Another fundamental data structure is the queue, which is a close relative of the
stack. A queue is a container of elements that are inserted and removed according
to the first-in first-out (FIFO) principle. Elements can be inserted in a queue at any
time, but only the element that has been in the queue the longest can be removed at
any time. We usually say that elements enter the queue at the rear and are removed
from the front. The metaphor for this terminology is a line of people waiting to get on
an amusement park ride. People enter at the rear of the line and get on the ride from
Formally, the queue abstract data type defines a container that keeps elements in a
sequence, where element access and deletion are restricted to the first element in
the sequence, which is called the front of the queue, and element insertion is
restricted to the end of the sequence, which is called the rear of the queue. This
restriction enforces the rule that items are inserted and deleted in a queue according
The queue abstract data type (ADT) supports the following operations:
dequeue(): Remove element at the front of the queue; an error occurs if the
queue is empty.
front(): Return, but do not remove, a reference to the front element in the
empty(): Return true if the queue is empty and false otherwise.
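The queue operations can be sketched in the same way. The example assumes a queue of integers stored in a std::deque; the class name IntQueue and the member names enqueue and size are assumptions made for illustration, while dequeue, front and empty follow the operations listed above.

```cpp
#include <cstddef>
#include <deque>
#include <stdexcept>

// A minimal FIFO queue of integers (illustrative sketch, not the module's code).
class IntQueue {
public:
    // Insert a new element at the rear of the queue.
    void enqueue(int value) {
        items.push_back(value);
    }

    // Remove the element at the front; an error occurs if the queue is empty.
    void dequeue() {
        if (items.empty()) {
            throw std::runtime_error("dequeue on empty queue");
        }
        items.pop_front();
    }

    // Return, but do not remove, a reference to the front element.
    int &front() {
        if (items.empty()) {
            throw std::runtime_error("front on empty queue");
        }
        return items.front();
    }

    bool empty() const { return items.empty(); }
    std::size_t size() const { return items.size(); }

private:
    std::deque<int> items;   // front of the deque is the front of the queue
};
```

Enqueuing 1, 2 and 3 and then calling front() yields 1, the element that has been in the queue the longest.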
The example below indicates how content stored in a queue is organized based on
Hashing
A hash table is a data structure designed to use a special function, called the
hash function, which maps a given key to the location of its value for faster access to
elements. The keys associated with values in a map are typically thought of as
consist of a collection of symbolic names where each name serves as the “address”
for properties about a variable’s type and value. One of the most efficient ways to
The efficiency of mapping depends on the efficiency of the hash function used. In
general, a hash table consists of two major components, a bucket array and a hash
function.
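The two components can be sketched as follows. This is a minimal illustration under stated assumptions: keys are integers, values are strings, the hash function is the key modulo the number of buckets, and keys that land in the same bucket are kept in a list (separate chaining). The class name HashTable and the member names put and get are hypothetical.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Illustrative hash table: a bucket array plus a hash function.
class HashTable {
public:
    // bucketCount must be greater than zero.
    explicit HashTable(std::size_t bucketCount) : buckets(bucketCount) {}

    // Hash function: maps a key to an index into the bucket array.
    std::size_t hashFunction(int key) const {
        long n = static_cast<long>(buckets.size());
        long index = ((key % n) + n) % n;   // keeps the index non-negative
        return static_cast<std::size_t>(index);
    }

    // Store value under key, replacing any value already stored for that key.
    void put(int key, const std::string &value) {
        Bucket &bucket = buckets[hashFunction(key)];
        for (Entry &entry : bucket) {
            if (entry.first == key) {
                entry.second = value;
                return;
            }
        }
        bucket.push_back(std::make_pair(key, value));
    }

    // Look up key; returns true and fills valueOut if the key is present.
    bool get(int key, std::string &valueOut) const {
        const Bucket &bucket = buckets[hashFunction(key)];
        for (const Entry &entry : bucket) {
            if (entry.first == key) {
                valueOut = entry.second;
                return true;
            }
        }
        return false;
    }

private:
    typedef std::pair<int, std::string> Entry;
    typedef std::list<Entry> Bucket;
    std::vector<Bucket> buckets;   // the bucket array
};
```

With 7 buckets, the keys 3 and 10 both hash to bucket 3; the chain inside that bucket keeps both entries retrievable.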
Definition of Recursion
called recursion. Recursion is a very powerful way to solve certain problems for which
the solution would otherwise be very complicated. Let us consider a problem that is a
0! = 1                        (1)
n! = n * (n - 1)!   if n > 0  (2)
3! = 3 * 2!
2! = 2 * 1!
1! = 1 * 0!
An algorithm that finds the solution to a given problem by reducing the problem to
smaller versions of itself is called a recursive algorithm. The recursive algorithm must
have one or more base cases as well as one or more recursive cases, and the general
1. Every recursive definition must have one (or more) base cases.
themselves called recursive functions. A function is said to call itself if its
body contains a statement that causes the same function to execute again before
completing the current call. Recursive algorithms are implemented using recursive
functions.
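As a minimal sketch, equations (1) and (2) translate directly into a recursive function. The function name factorial and the choice of unsigned types are assumptions made for illustration; the result overflows for arguments larger than 20.

```cpp
// Recursive factorial following equations (1) and (2).
unsigned long long factorial(unsigned int n) {
    if (n == 0) {
        return 1;                      // base case: 0! = 1
    }
    return n * factorial(n - 1);       // general case: n! = n * (n - 1)!
}
```

A call such as factorial(3) expands to 3 * factorial(2), then 3 * 2 * factorial(1), and so on until the base case is reached, giving 6.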
Examples:
below:
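if (num == 0){
    return 1;                      // base case, equation (1)
}
else{
    return num * fact(num - 1);    // general case, equation (2); fact is an assumed name for the enclosing function
}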
The implementation of recursion
A function is called directly recursive if it calls itself. A function that calls another
function and eventually results in the original function call is said to be indirectly
recursive. For example, if a function A calls a function B and function B calls function
Indirect recursion can be several layers deep. For example, suppose that function A
calls function B, function B calls function C, function C calls function D, and function
Unit V: Trees
Introduction
The tree data structure is considered one of the many important breakthroughs in
data organization. This is attributed by many experts to the fact that this
data structure allows us to implement a whole range of algorithms much faster when
compared to linear data structures such as lists, vectors or sequences. The tree data
structure provides a natural organization for data, and as such it is a data
structure of choice for file systems, graphical user interfaces, databases, websites
and other complex computer systems. The main terminology for the tree data
structure comes from family trees as we know them, where we find terms such as
this structure, the terms have the same literal meaning as that of the family tree.
Trees
A tree is an abstract data type that stores data elements hierarchically. Every element,
excluding the element at the top of the hierarchy (called the root), has a parent and
zero or more child elements. Each of these items of the tree is connected to its
parent by a straight line (also called an arc). These data items are also referred to as
nodes. Hence a tree is made up of nodes which represent the data items and the
If P is not empty, then it has a special node called the root of P that has no
parent.
Each node r of P different from the root has a unique parent node s; where
From these outlined properties of the tree we have to take note that a tree can be
empty and also that a tree is a structure that can be defined recursively.
Binary trees
An ordered tree in which every node (excluding the leaf node) has at most two children is
Each child node is labelled as being either a left child or a right child.
The subtree rooted at a left or right child of an internal node is called the node’s left
subtree or right subtree, respectively. A binary tree is proper if each node has either
zero or two children. Some people also refer to such trees as being full binary trees.
Thus, in a proper binary tree, every internal node has exactly two children. A binary
A binary search tree can be implemented using this linked-list data structure
concept. This means that each node will be made up of three pointer fields as
indicated in the figure below and at least one data field. The three pointers point to
hence the left pointer takes us to another node which can be represented similarly to
the node, provided that it is not a leaf node. The left and right pointers of leaf
nodes do not point anywhere, hence their values will be null, while the parent pointer
of the root node points nowhere, hence null, because a root node
does not have a parent node. Below is an example of a binary tree data
structure.
To implement this data structure, we begin by defining the basic constituents that
make up each node. This is achieved by defining a structure that specifies all the
pointers in the node, namely left, right and parent, as well as the data fields that
};
The binary tree that is of major interest in this course is the binary search tree. A
binary search tree is a binary tree where all the elements stored in the structure are
ordered such that all elements located to the left side of a node are smaller
than the node while all elements stored to the right side of a node are greater
than the node. This means that whenever an element is being inserted
into the binary search tree, care must be taken to ensure that the element is
inserted in the right place. Likewise, whenever an element is being deleted from the
tree, the tree must be updated in such a way that the remaining elements are sorted
data structure entails visiting all the elements stored in a structure to perform some
process. There are three different ways in which a binary search tree can be
traversed namely;
a. In order traversal: In the in order traversal, you are required to visit all the left
children before visiting the root node. After visiting the root node you can
inorder (root) {
b. Pre order traversal: In this form of traversal you are required to visit the root
node, followed by all the left children, before concluding with all the right children.
Preorder(root) {
Preorder(rightchild) {recursively traverse right subtree}
c. Post order traversal: In the post order traversal, you are required to visit all
the left children, followed by all the right children before visiting the root node.
Postorder(root) {
The basic operations on a binary search tree include insertion, update and deletion of an
element.
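Insertion and the three traversals can be sketched as follows. This is a minimal illustration, assuming integer data and a node with the three pointers left, right and parent described earlier; the names TreeNode, insert, inorder, preorder, postorder and destroyTree are hypothetical. Placing equal values in the right subtree is also an assumption, since the text only covers strictly smaller and strictly greater elements.

```cpp
#include <cstddef>
#include <vector>

// Node with three pointer fields and one data field.
struct TreeNode {
    int data;
    TreeNode *left;
    TreeNode *right;
    TreeNode *parent;
};

// Insert value so that smaller elements stay on the left and
// larger (or equal) elements on the right. Returns the root.
TreeNode *insert(TreeNode *root, int value) {
    TreeNode *newNode = new TreeNode;
    newNode->data = value;
    newNode->left = newNode->right = newNode->parent = NULL;
    if (root == NULL) {
        return newNode;                 // the new node is the root
    }
    TreeNode *current = root;
    while (true) {
        TreeNode *&child = (value < current->data) ? current->left : current->right;
        if (child == NULL) {
            child = newNode;            // free position found
            break;
        }
        current = child;
    }
    newNode->parent = current;
    return root;
}

// In order: left subtree, node, right subtree.
void inorder(const TreeNode *root, std::vector<int> &out) {
    if (root == NULL) return;
    inorder(root->left, out);
    out.push_back(root->data);
    inorder(root->right, out);
}

// Pre order: node, left subtree, right subtree.
void preorder(const TreeNode *root, std::vector<int> &out) {
    if (root == NULL) return;
    out.push_back(root->data);
    preorder(root->left, out);
    preorder(root->right, out);
}

// Post order: left subtree, right subtree, node.
void postorder(const TreeNode *root, std::vector<int> &out) {
    if (root == NULL) return;
    postorder(root->left, out);
    postorder(root->right, out);
    out.push_back(root->data);
}

// Release every node (a post order walk).
void destroyTree(TreeNode *root) {
    if (root == NULL) return;
    destroyTree(root->left);
    destroyTree(root->right);
    delete root;
}
```

Inserting 50, 30, 70, 20 and 40 in that order and traversing in order visits 20, 30, 40, 50, 70; an in order traversal of a binary search tree always visits the elements in sorted order.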
Heaps
A heap is a binary tree T that stores a collection of elements with their associated
We assume that a total order relation on the keys is given, for example, by a
comparator. The relational property of T, defined in terms of the way keys are stored,
is the following:
Heap-Order Property: In a heap T, for every node v other than the root, the
key associated with v is greater than or equal to the key associated with v’s
parent.
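The heap-order property can be checked with a short recursive function. This is a minimal sketch, assuming integer keys and a linked node representation; the names HeapNode and satisfiesHeapOrder are hypothetical, and the function checks only the heap-order property stated above.

```cpp
#include <cstddef>

// A binary tree node holding a key (illustrative layout).
struct HeapNode {
    int key;
    HeapNode *left;
    HeapNode *right;
};

// Returns true if every node other than the root has a key that is
// greater than or equal to the key of its parent.
bool satisfiesHeapOrder(const HeapNode *v) {
    if (v == NULL) {
        return true;                    // an empty tree has nothing to violate
    }
    if (v->left != NULL && v->left->key < v->key) {
        return false;                   // left child is smaller than its parent
    }
    if (v->right != NULL && v->right->key < v->key) {
        return false;                   // right child is smaller than its parent
    }
    return satisfiesHeapOrder(v->left) && satisfiesHeapOrder(v->right);
}
```

A consequence of this property is that the smallest key in the heap is always stored at the root.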
Unit VI: Graphs
That is, a graph is a set of objects, called vertices, together with a collection of
pairwise connections between them. This notion of a “graph” should not be confused
with bar charts and function plots, as these kinds of “graphs” are unrelated to the
topic of this unit. Graphs have applications in different domains, including mapping,
between pairs of objects from some set V. Some other literature uses different
terminology for graphs and refers to what we call vertices as nodes and what we call
edges as arcs. These terms mean one and the same thing.
directed from u to v if the pair (u,v) is ordered, with u preceding v. An edge (u,v) is
said to be undirected if the pair (u,v) is not ordered. Undirected edges are sometimes
denoted with set notation, as {u,v}, but for simplicity we use the pair notation (u,v),
noting that in the undirected case (u,v) is the same as (v,u). Graphs are typically
visualized by drawing the vertices as ovals or rectangles and the edges as segments
Below are examples of how the graph can be used:
with A.
dead ends, and whose edges are stretches of streets without intersections.
This graph has both undirected edges, which correspond to stretches of two-
streets. Thus, in this way, a graph modelling a city map is a mixed graph.
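The distinction between directed, undirected and mixed graphs can be sketched with a simple edge collection. This is only an illustration: vertices are assumed to be numbered with integers, and the names Edge, GraphKind, classify and sameEdge are hypothetical. Treating a graph with no edges as undirected is also an assumption.

```cpp
#include <cstddef>
#include <vector>

// An edge (u, v); directed is true when the pair is ordered.
struct Edge {
    int u;
    int v;
    bool directed;
};

enum GraphKind { UNDIRECTED_GRAPH, DIRECTED_GRAPH, MIXED_GRAPH };

// A graph is directed if all edges are directed, undirected if none are,
// and mixed if it has both kinds (as in the city map example).
GraphKind classify(const std::vector<Edge> &edges) {
    bool hasDirected = false;
    bool hasUndirected = false;
    for (std::size_t i = 0; i < edges.size(); i++) {
        if (edges[i].directed) {
            hasDirected = true;
        } else {
            hasUndirected = true;
        }
    }
    if (hasDirected && hasUndirected) return MIXED_GRAPH;
    if (hasDirected) return DIRECTED_GRAPH;
    return UNDIRECTED_GRAPH;
}

// Two edges connect the same vertices in the same way.
// For undirected edges (u, v) is the same as (v, u).
bool sameEdge(const Edge &a, const Edge &b) {
    if (a.directed != b.directed) return false;
    if (a.u == b.u && a.v == b.v) return true;
    return !a.directed && a.u == b.v && a.v == b.u;
}
```

In the city map example, a two-way street would be stored with directed set to false and a one-way street with directed set to true, so classify reports a mixed graph.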
The definition of a graph refers to the group of edges as a collection, not a set, thus
allowing for two undirected edges to have the same end vertices, and for two
directed edges to have the same origin and the same destination. Such edges are
Unit VII: Sorting and Searching Algorithms
Searching algorithms
One key operation that many applications processing huge amounts of data use is
namely the binary search algorithm and the linear search algorithm.
A sequential search is therefore not very efficient for large lists. In fact, it can be
proved that, on average, the number of comparisons (key comparisons, not index
comparisons) made by the sequential search is equal to half the size of the list. So,
for a list size of 1000, on average, the sequential search makes about 500 key
comparisons. The sequential search algorithm does not assume that the list is
sorted.
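int seqSearch(const int list[], int length, int searchItem){   // header is assumed; it is missing in the original
    int loc;
    bool found = false;
    loc = 0;
    while (loc < length && !found){   // stop at the first match or at the end of the list
        if (list[loc] == searchItem){
            found = true;
        } else{
            loc++;
        }
    }
    if (found){
        return loc;     // index of the first match
    } else{
        return -1;      // searchItem is not in the list
    }
}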
Binary Search
A sequential or linear search is not very efficient for large lists. Assuming we have a
list of 100 elements that are not repeated or organized in any order and we wish to
search for a given element which may coincidentally be occupying the ninety-ninth position, it
would require us to test 99 elements before we confirm that the item being searched
for exists in the list. However, if the list is sorted, you can use another search
algorithm called binary search. A binary search is much faster than a sequential
search. In order to apply a binary search, the list must be sorted. A binary search
uses the ‘‘divide and conquer’’ technique to search the list by partitioning the list into
two parts. First, the search item is compared with the middle element of the list. If the
search item is less than the middle element of the list, we restrict the search to the
first (upper) half of the list; otherwise, we search the second (lower) half of the list.
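int binarySearch(const int list[], int length, int searchItem){   // header is assumed; it is missing in the original
    int first = 0, last = length - 1, mid = 0;
    bool found = false;
    while (first <= last && !found){
        mid = (first + last) / 2;          // middle of the current search range
        if (list[mid] == searchItem){
            found = true;
        } else{
            if (list[mid] > searchItem){
                last = mid - 1;            // continue in the first half
            } else{
                first = mid + 1;           // continue in the second half
            }
        }
    }
    if (found){
        return mid;
    } else{
        return -1;
    }
}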
Note that in the binary search algorithm, two key (item) comparisons are made each
time through the loop, except in the successful case—the last time through the
Sorting Algorithms
Sorting is one of the most important operations that is done on a collection of data
items. There are several algorithms that may be used to sort data items, namely
selection sort, bubble sort, insertion sort, quick sort, merge sort and heap sort. You
might be wondering why there are so many different sorting algorithms. The answer
is that the performance of each sorting algorithm is different. Some algorithms make
more comparisons, whereas others make fewer item assignments. Also, there are
understanding and analysis of the number of key comparisons and item assignments
Bubble sort
rearrange, that is, sort, the elements of list in increasing order. The bubble sort
+ 1] of list are compared. If list[index] is greater than list[index + 1], then the
follows that the smaller elements move toward the top (beginning), and the larger
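#include <iostream>
using namespace std;
void bubbleSort(int list[], int length){
    for (int iteration = 1; iteration < length; iteration++){
        for (int index = 0; index < length - iteration; index++){
            if (list[index] > list[index + 1]){   // adjacent pair out of order: swap
                int temp = list[index];
                list[index] = list[index + 1];
                list[index + 1] = temp;
            }
        }
    }
}
int main(){
    int list[] = {2, 56, 34, 25, 73, 46, 89, 10, 5, 16};
    bubbleSort(list, 10);
    cout << "After sorting, the list elements are:" << endl;
    for (int i = 0; i < 10; i++){ cout << list[i] << " "; }
    return 0;
}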
Selection Sort
In the selection sort algorithm, we rearrange the list by selecting an element in the
list and moving it to its proper position in the list of items. This algorithm finds the
location of the smallest element in the unsorted portion of the list and moves it to the
top of the unsorted portion of the list. The first time, we locate the smallest item in the
entire list. The second time, we locate the smallest item in the list starting from the
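for (index = 0; index < length - 1; index++){   // assumes int index, smallestIndex, location, temp
    smallestIndex = index;
    // find the smallest element in the unsorted portion list[index..length - 1]
    for (location = index + 1; location < length; location++)
        if (list[location] < list[smallestIndex])
            smallestIndex = location;
    temp = list[smallestIndex];                  // move it to the top of the unsorted portion
    list[smallestIndex] = list[index];
    list[index] = temp;
}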
Insertion Sort
For a list of length 1000, both the bubble sort and selection sort make approximately
time and processing power. In this part of the document, we introduce the sorting
algorithm called insertion sort, which tries to improve—that is, reduce—the number
of key comparisons.
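The difference in key comparisons can be observed by counting them. The sketch below is an illustration only: it assumes the list is held in a std::vector, and the function names selectionSortComparisons and insertionSortComparisons are hypothetical. Selection sort always makes n(n - 1)/2 key comparisons for a list of length n, which is 499,500 for n = 1000, whereas insertion sort makes between n - 1 comparisons (list already sorted) and n(n - 1)/2 comparisons (list in reverse order).

```cpp
#include <cstddef>
#include <vector>

// Selection sort; returns the number of key comparisons made.
long selectionSortComparisons(std::vector<int> &list) {
    long comparisons = 0;
    for (std::size_t index = 0; index + 1 < list.size(); index++) {
        std::size_t smallestIndex = index;
        for (std::size_t location = index + 1; location < list.size(); location++) {
            comparisons++;
            if (list[location] < list[smallestIndex]) {
                smallestIndex = location;
            }
        }
        int temp = list[smallestIndex];
        list[smallestIndex] = list[index];
        list[index] = temp;
    }
    return comparisons;
}

// Insertion sort; returns the number of key comparisons made.
long insertionSortComparisons(std::vector<int> &list) {
    long comparisons = 0;
    for (std::size_t firstOutOfOrder = 1; firstOutOfOrder < list.size(); firstOutOfOrder++) {
        int temp = list[firstOutOfOrder];
        std::size_t location = firstOutOfOrder;
        while (location > 0) {
            comparisons++;                          // compare temp with list[location - 1]
            if (list[location - 1] <= temp) {
                break;                              // proper place found
            }
            list[location] = list[location - 1];    // shift the larger element right
            location--;
        }
        list[location] = temp;
    }
    return comparisons;
}
```

On a list that is already sorted, the insertion sort stops after one comparison per element, which is where its saving over the selection sort comes from.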
The insertion sort algorithm sorts the list by moving each element to its proper place.
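int temp, location;
for (int firstOutOfOrder = 1; firstOutOfOrder < length; firstOutOfOrder++){
    if (list[firstOutOfOrder] < list[firstOutOfOrder - 1]){
        temp = list[firstOutOfOrder];
        location = firstOutOfOrder;
        do{                                     // shift larger elements one place to the right
            list[location] = list[location - 1];
            location--;
        } while (location > 0 && list[location - 1] > temp);
        list[location] = temp;                  // place the element in its proper position
    }
}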