CS312 - ATBU Lecture Note 7 3-03-2025 Data Structure

The document outlines the CS312 Data Structures and Algorithms course at Abubakar Tafawa Balewa University, focusing on the importance of data structures for efficient data management in complex applications. It covers various data structures, algorithms, and their classifications, as well as the principles of algorithm design and analysis, including efficiency and complexity. The course aims to equip students with practical skills in implementing data structures and algorithms using C++.

Department of Computer Science

Faculty of Computing
Abubakar Tafawa Balewa University, Bauchi

CS312 Data Structures and Algorithms (2 Units)


Review of elementary data structures and their applications. Graphs: subgraphs, minimum
spanning trees, Depth First Search, Breadth First Search. Trees and tree traversal algorithms.
Binary Search Tree (BST), AVL tree and tree balancing, B-tree. Searching and sorting algorithms.
Symbol tables, hashing, complexity analysis. Implementation of the algorithms should use
C++.

INTRODUCTION

OVERVIEW
Data Structures are the programmatic way of storing data so that it can be used efficiently.
Different kinds of data structures are suited to different kinds of applications, and some are
highly specialized to specific tasks. Almost every enterprise application uses various types of
data structures in one way or another. This course will give students an understanding of
fundamental data structures and develop their ability to design algorithms that tame the
complexity of enterprise-level applications. In a nutshell, the primary concern of this course is
to improve efficiency in computing.

EFFICIENCY
A solution is said to be efficient if it solves the problem within its resource constraints. Space
and time are the typical constraints of a program. To write efficient programs, one needs to
organize the data in such a way that it can be accessed and manipulated efficiently.

OBJECTIVES OF THE COURSE


1. To solve problems using data structures such as arrays, lists, stacks, queues, deques,
trees and graphs, and writing programs for these solutions.
2. To solve problems using algorithm design methods such as the greedy method, divide
and conquer, dynamic programming and backtracking, and writing programs for these
solutions.
3. To develop proficiency in the specification, representation, and implementation of Data
Types and Data Structures.

NEED FOR DATA STRUCTURES AND ALGORITHMS


As applications are getting more complex and richer in data content, the following
are common problems associated with applications.

1. Data Search: Consider an inventory of 1 million (10^6) items in a store. If the
application has to search for an item, it must look through all 1 million (10^6) items every
time, thereby slowing down the search. As data grows, search becomes slower.
2. Processor speed: Processor speed, although very high, becomes a limitation if the data
grows to billions of records.
3. Multiple requests: As thousands of users can search data simultaneously on a web
server, even a fast server can fail while searching the data.

To solve these problems, data needs to be organized in such a way that not all items must be
searched and the required data can be found almost instantly. Data Structures therefore
manage the complexity of an application by providing models that describe the data our
algorithms will manipulate in a way that is consistent with the problem.

DATA TYPES
All data items in the computer are represented as strings of binary digits (0s and 1s). To give
these strings meaning, we need data types. Data types provide an interpretation of
binary data so that it represents meaningful values with respect to the problem
being solved.

ABSTRACTION
Computer scientists use abstraction to allow them to focus on the "bigger picture" without
getting lost in the details. Abstraction allows us to view problems and solutions by separating
the so-called logical and physical perspectives.

ABSTRACT DATA TYPE


An Abstract Data Type (ADT) is a logical description of how data is viewed and of the
operations that are allowed on it, without regard to how they will be implemented. An ADT is
concerned only with what the data represents and not with how it will eventually be
constructed. The figure below shows a picture of what an abstract data type is and how it
operates.

Abstract Data Type

The user interacts with the interface, using the operations that have been specified by the
abstract data type. The abstract data type is the shell that the user interacts with; the
implementation is hidden one level deeper. The user is not concerned with the details of
the implementation.
Example: The abstract view of a television:
1. The ability to change channels or adjust the volume.
2. The TV displays the show to watch.
3. We don't care who made the TV or how the circuitry inside was constructed.

DATA STRUCTURE
The physical implementation of an Abstract Data Type (ADT) is often referred to as a data
structure. Each operation associated with the ADT is implemented by one or more subroutines in
the implementation. Since there are many different ways to implement an ADT, this
implementation independence allows the programmer to switch the details of the
implementation without changing the way the user of the data interacts with it. The user can
remain focused on the problem-solving process.

SELECTING A DATA STRUCTURE


When writing a program, computer programmers decide which data structures to use based on
the nature of the data and the processes that need to be performed on that data. Below are the
factors to consider before selecting a data structure.
1. Analyze the problem to determine the resource constraints.
2. Determine the basic operations that must be supported.
3. Determine how much space the data occupies and what the running times of the
operations in its interface are.
Select the data structure that best meets these requirements, preferring the "simplest" data
structure that will do so.

CLASSIFICATION OF DATA STRUCTURES
Data structures can be classified as either linear or non-linear, based on how the data is
conceptually organized or aggregated.

1. Linear structures. The array, list, linked list, queue, deque and stack belong to
this category. Each of them is a collection that stores its entries in a linear sequence,
and in which entries may be added or removed at will. They differ in the restrictions
they place on how these entries may be added, removed, or accessed. The common
restrictions include First-In First-Out (FIFO) and Last-In First-Out (LIFO).
2. Non-linear structures. Trees and graphs are classical non-linear structures. Data
entries are not arranged in a sequence, but with different rules.

Example: Suppose you are hired to create a database of names of all of a company's management
and employees. You can organize it in the form of a list or a tree, as illustrated below.

ALGORITHM
An algorithm is a step-by-step procedure which defines a set of instructions to be executed in a
certain order to get the desired output. Algorithms are generally created independently of the
underlying languages (i.e. an algorithm can be implemented in more than one programming
language). Below are some important categories of algorithms:
1. Search: Algorithm to search an item in a data structure.
2. Sort: Algorithm to sort items in a certain order.
3. Insert: Algorithm to insert item in a data structure.
4. Update: Algorithm to update an existing item in a data structure.
5. Delete: Algorithm to delete an existing item from a data structure.

CHARACTERISTICS OF AN ALGORITHM
An algorithm should have the following characteristics; this means that not every procedure can be
called an algorithm.
1. It must be feasible with the available resources.
2. It must be correct.
3. It must be composed of a series of concrete steps.
4. An algorithm should have step-by-step directions, no ambiguity as to which step will
be performed next.
5. It must be composed of a finite number of steps.
6. It must terminate.

RELATIONSHIP BETWEEN DATA STRUCTURES AND ALGORITHM


Below are a few of the relationships between Data Structures and Algorithms, outlined for
the purpose of this course.
1. An algorithm defines a sequence of steps and decisions which can be employed to
solve a problem, while a Data Structure describes a collection of values, often with
names and information about the hierarchical relationships of those values. A database
is a data structure; a shopping list is a data structure; and so on.
2. Algorithms are almost exclusively procedural in nature, while Data Structures have no
such procedural component; instead they have hierarchy, contents and values.
3. Both data structures and algorithms are the most central concepts employed in creating
software solutions. All procedural programming languages inherently support the
development of algorithms, while data structures of any complexity can be created in
virtually any language.
4. Another relationship is that an algorithm requires a collection of inputs, storage of
intermediate results, and delivery of output. Such inputs and results require data
structures, at a minimum as a way to identify specific atomic values.

HOW TO WRITE AN ALGORITHM


There are no well-defined standards for writing algorithms; rather, writing them is problem-
and resource-dependent. Algorithms are never written to support a particular programming
language. As all programming languages share basic code constructs such as loops (do, for,
while) and flow control (if-else), these common constructs can be used to write an algorithm.
Algorithm writing is a process that is carried out after the problem domain is well defined.
Example
Problem: Design an algorithm to add two numbers and display the result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b

Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 − STOP

Alternatively, the algorithm can be written as:


Step 1 − START ADD
Step 2 − get values of a & b
Step 3 − c ←a + b
Step 4 − display c
Step 5 − STOP
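Although the course implements algorithms in C++, the data-structure examples later in these notes use Python, so the steps above can be sketched in Python as a minimal illustration (the function name `add` follows the pseudocode's START ADD):

```python
# A direct translation of the second algorithm above: get a and b,
# compute c <- a + b, and display c.
def add(a, b):
    c = a + b      # Step 3 - c <- a + b
    return c       # Step 4 - display c (returned to the caller here)

print(add(2, 3))   # prints 5
```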

In the design and analysis of algorithms, the second method is usually used to describe an
algorithm. It makes it easier for the analyst to analyze the algorithm while ignoring all
unwanted definitions. There is more than one way to design an algorithm for a particular
problem; hence, many solution algorithms can be derived for a given problem. This is
illustrated in the figure below:

The next step is to analyze the proposed solution algorithms and implement the best suitable
one.
ALGORITHM ANALYSIS
The efficiency of an algorithm can be analyzed at two different stages: before implementation
and after implementation. These are described below.
 A Priori Analysis: This is a theoretical analysis of an algorithm before it is implemented
on a computer system. Efficiency is measured by assuming that all
other factors, such as processor speed, are constant and have no effect on the
implementation.
 A Posteriori Analysis: This is an empirical analysis of an algorithm. The selected
algorithm is implemented in a programming language and then executed on a computer. In this
analysis, actual statistics of the running time and space required are collected.
Algorithm analysis deals with the running time of various operations involved. The running
time of an operation can be defined as the number of computer instructions executed per
operation.

ALGORITHM COMPLEXITY
Suppose X is an algorithm and n is the size of the input data. The time and space used by the
algorithm X are the two main factors which decide the efficiency of X.
 Time Factor: Time is measured by counting the number of key operations such as
comparisons in the sorting algorithm.
 Space Factor: Space is measured by counting the maximum memory space
required by the algorithm.
Therefore, complexity of an algorithm f(n) gives the running time and/or the storage space
required by the algorithm in terms of n as the size of input data.
TIME COMPLEXITY
Time complexity of an algorithm represents the amount of time required by the algorithm to
run to completion. The time requirement can be defined as a numerical function T(n), where
T(n) is measured as the number of steps, provided each step consumes constant time.
SPACE COMPLEXITY
Space complexity of an algorithm represents the amount of memory space required by the
algorithm in its life cycle. The space required by an algorithm is equal to the sum of the
following two components:
 Fixed part that is a space required to store certain data and variables that are
independent of the size of the problem.
 Variable part is a space required by variables, whose size depends on the size of the
problem.
Space complexity S(P) of any algorithm P is S(P) = C + S(I), where C is the fixed part and
S(I) is the variable part of the algorithm, which depends on instance characteristics of (I).

ASYMPTOTIC ANALYSIS
Asymptotic analysis of an algorithm refers to computing running time of any operation in
mathematical units of computation. Using asymptotic analysis, one can conclude the best
case, average case, and worst case scenario of an algorithm. These are described below:
 Best Case: Minimum time required for program execution.
 Average Case: Average time required for program execution.
 Worst Case: Maximum time required for program execution.

For example, the running time of one operation may be computed as f(n) = n while that of
another is computed as g(n) = n^2. The first operation's running time will then increase
linearly with n, while the running time of the second operation will increase quadratically
as n increases. Conversely, the running times of both operations will be nearly
the same if n is sufficiently small.

ASYMPTOTIC NOTATIONS
Below are the commonly used asymptotic notations to calculate the running time
complexity of an algorithm.
 Ο Notation
 Ω Notation
 θ Notation
BIG OH NOTATION, Ο
The notation Ο(n) is the formal way to express the upper bound of an algorithm's running
time. It measures the worst case time complexity or the longest amount of time an algorithm
can possibly take to complete.
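As an illustrative sketch, linear search is a standard example of an O(n) worst-case algorithm: the target may sit at the very end of the collection, or be absent entirely, forcing one comparison per element (the function name and values here are chosen for illustration):

```python
# Linear search: O(n) in the worst case, because in the worst case every
# one of the n elements must be compared against the target.
def linear_search(items, target):
    for index, value in enumerate(items):   # up to n comparisons
        if value == target:
            return index
    return -1                               # worst case: target not present

print(linear_search([4, 8, 15, 16, 23, 42], 23))  # found at index 4
print(linear_search([4, 8, 15, 16, 23, 42], 99))  # -1 after n comparisons
```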

OMEGA NOTATION, Ω
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running
time. It measures the best case time complexity or the minimum amount of time an algorithm
can possibly take to complete.

8
THETA NOTATION, Θ
The notation θ(n) is the formal way to express both the lower bound and the upper bound of
an algorithm's running time. It can be represented as follows:
θ(f(n)) = { g(n) : there exist positive constants c1, c2 and n0 such that
0 ≤ c1·f(n) ≤ g(n) ≤ c2·f(n) for all n > n0 }

GREEDY ALGORITHM
A greedy algorithm is a problem-solving technique that makes the best local choice at each
step in the hope of finding the global optimum solution. It prioritizes immediate benefits over
long-term consequences, making decisions based on the current situation without
considering future implications. While this approach can be efficient and straightforward, it
doesn’t guarantee the best overall outcome for all problems.
However, it’s important to note that not all problems are suitable for greedy algorithms.
They work best when the problem exhibits the following properties:

1. Greedy Choice Property: The optimal solution can be constructed by
making the best local choice at each step.
2. Optimal Substructure: The optimal solution to the problem contains the optimal
solutions to its sub-problems.
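A classic illustration of the greedy method is making change with the fewest coins. The sketch below assumes a "canonical" coin system (the denominations 100, 50, 20, 10, 5, 1 are chosen for illustration), for which the greedy choice of always taking the largest coin does yield the optimum:

```python
# Greedy coin change: at each step take the largest coin that still fits.
# This is optimal only for canonical coin systems such as the one below.
def greedy_change(amount, coins=(100, 50, 20, 10, 5, 1)):
    result = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:       # greedy choice: largest coin first
            amount -= coin
            result.append(coin)
    return result

print(greedy_change(186))  # [100, 50, 20, 10, 5, 1]
```

Note that for non-canonical denominations (e.g. coins of 1, 3 and 4 with amount 6) the greedy choice gives 4+1+1 rather than the optimal 3+3, which is exactly the point made above: greedy algorithms do not guarantee the best overall outcome for all problems.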

CHARACTERISTICS OF GREEDY ALGORITHM


Below are characteristics of greedy algorithms:
 Greedy algorithms are simple and easy to implement.
 They are efficient in terms of time complexity, often providing quick solutions.
 Greedy algorithms are used for optimization problems where a locally
optimal choice leads to a globally optimal solution.
 These algorithms do not reconsider previous choices, as they make decisions based on
current information without looking ahead.
 Greedy algorithms are suitable for problems with optimal substructure.

DIVIDE AND CONQUER


In the divide and conquer approach, the problem at hand is divided into smaller sub-problems,
and then each sub-problem is solved independently. If we keep dividing the sub-problems
into even smaller sub-problems, we eventually reach a stage where no more division is
possible. These "atomic" smallest possible sub-problems are solved, and the solutions
of all the sub-problems are finally merged to obtain the solution of the original problem.

In a nutshell, divide and conquer algorithm can be summarized and illustrated below:

1. Breaking the problem into smaller sub-problems


2. Solving the sub-problems, and
3. Combining them to get the desired output.
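Merge sort is a textbook instance of these three steps; as a minimal sketch, it divides the list in half, solves each half recursively, and combines the sorted halves (the unsorted values reuse the exam-score list from the LIST section later in these notes):

```python
# Merge sort as divide and conquer:
#   1. divide the list into two halves,
#   2. sort each half recursively (atomic case: 0 or 1 elements),
#   3. combine by merging the two sorted halves.
def merge_sort(items):
    if len(items) <= 1:                 # atomic sub-problem: already sorted
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])      # steps 1 and 2
    right = merge_sort(items[mid:])
    merged = []                         # step 3: merge the sorted halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([54, 26, 93, 17, 77, 31]))  # [17, 26, 31, 54, 77, 93]
```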

LINEAR DATA STRUCTURES

This section presents the study of data structures by considering five powerful kinds of
linear data structure: stacks, queues, deques, arrays and lists. These are methods of data
collection whose items are ordered depending on how they are added or removed. Once an item
is added, it stays in that position relative to the other elements that came before and after it.
Collections of this kind are often referred to as linear data structures. Linear structures can be
thought of as having two ends, referred to as the "left" and "right", the "front" and "rear", or in
some cases the "top" and "bottom." What distinguishes one linear structure from another is the
way in which items are added and removed, in particular the location where these additions and
removals occur.

STACK
A stack is an ordered collection of items where the addition of new items and the removal of
existing items always take place at the same end. This end is commonly referred to as the "top."
The opposite end is known as the "base." The base of the stack is significant
since items stored in the stack that are closer to the base are those that have been in the
stack the longest. The most recently added item is the one that is in position to be removed first.
This ordering principle of the stack is called LIFO: last-in, first-out.

Many examples of stack occur in everyday situations. Almost any cafeteria has a stack of trays
or plates where the one at the top will be taken, uncovering a new tray or plate for the next
customer in line. Imagine a stack of books on a desk, as shown in Figure 2.1 below. The only
book whose cover is visible is the one on top. To access the others in the stack, one needs to
remove the ones that are sitting on top of them. Figure 2.2 shows another example of a stack.

Figure 2.1: A Stack of Books

Figure 2.2: A Stack of Primitive Python Objects

REVERSAL PROPERTY OF STACK


One of the most useful ideas related to stacks comes from the simple observation of items as they
are added and then removed. Assume you start out with a clean desktop. Now place books or
objects one at a time on top of each other. You are constructing a stack. Consider what happens
when you begin removing books. The order that they are removed is exactly the reverse of the
order that they were placed. Stacks are fundamentally important, as they can be used to reverse
the order of items. The order of insertion is the reverse of the order of removal. Figure 2.3
illustrates the reversal property of stacks.

Figure 2.3: Reversal Property of Stacks

Considering this reversal property, one can think of examples of stacks that occur while
operating a computer. For example, every web browser has a Back button. As you navigate from
one web page to another, those pages are placed on a stack (actually it is the URLs that are
going on the stack). The current page that you are viewing is on the top and the first page you
looked at is at the base. If you click the Back button, you move in reverse order back
through the pages.
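The reversal property can be sketched in a few lines of Python, using a plain list as the stack (the function name `reverse_string` is chosen for illustration):

```python
# Push each character onto a stack, then pop them all back off:
# the pops come out in the reverse of the insertion order.
def reverse_string(text):
    stack = []
    for ch in text:
        stack.append(ch)    # push
    out = ""
    while stack:
        out += stack.pop()  # pop returns items newest-first
    return out

print(reverse_string("stack"))  # "kcats"
```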
STACK ABSTRACT DATA TYPE
The Stack Abstract Data Type is defined as a logical description of how an ordered
collection of items is added to and removed from a stack. The stack operations
and their meanings are presented in Table 2.1 below, while Table 2.2 shows the results of a
sequence of stack operations.
sequence of stack operations.

Table 2.1: Stack Operations


Operation Meaning
Stack() Creates a new stack that is empty. It needs no
parameters and returns an empty stack.
push(item) Adds a new item to the top of the stack. It
needs the item and returns nothing.
pop() Removes the top item from the stack. It needs
no parameters and returns the item. The stack
is modified.
peek() Returns the top item from the stack but does
not remove it. It needs no parameters. The
stack is not modified.
isEmpty() Tests to see whether the stack is empty. It
needs no parameters and returns a Boolean
value.
size() Returns the number of items on the stack. It
needs no parameters and returns an integer.

Table 2.2: Stack Operation


Stack Operation Stack Contents Return Value
s.isEmpty() [] True
s.push(4) [4]
s.push('dog') [4,'dog']
s.peek() [4,'dog'] 'dog'
s.push(True) [4,'dog',True]
s.size() [4,'dog',True] 3
s.isEmpty() [4,'dog',True] False
s.push(8.4) [4,'dog',True,8.4]
s.pop() [4,'dog',True] 8.4
s.pop() [4,'dog'] True
s.size() [4,'dog'] 2

APPLICATIONS OF STACK
Stacks are used extensively at every level of a modern computer system. Below are a few of
the applications of stacks:
1. A modern PC uses stacks at the architecture level, where they form part of the basic
design of an operating system for interrupt handling and function calls.

2. Stacks are used to run a Java Virtual Machine, and the Java language itself has a class
called "Stack", which can be used by the programmer.
3. Another common use of stacks at the architecture level is as a means of allocating and
accessing memory.

IMPLEMENTATION OF STACK IN PYTHON


Now that the Stack Abstract Data Type has been clearly defined, attention can turn to using
Python to implement a stack. Recall that an Abstract Data Type refers to the description of
logical operations, while its physical implementation defines the Data Structure. Python is an
object-oriented programming language that can be used to implement Abstract Data Types such
as the stack operations. This will be fully implemented in the practical class.

QUEUE
A queue is an ordered collection of items where the addition of new items happens at one end,
called the "rear," and the removal of existing items occurs at the other end, commonly called the
"front." As an element enters the queue it starts at the rear and makes its way toward the front,
waiting until it is the next element to be removed. The most recently added item
in the queue must wait at the end of the collection. This ordering principle is called FIFO:
first-in, first-out. It is also known as "first-come, first-served." This is illustrated in Figure 2.4 below.

Figure 2.4: Queue of Python Data Objects

PRACTICAL APPLICATIONS OF QUEUES


Below are some common practical applications of queue in computing:
1. Laboratory computers networked with a single printer. When students want to
print, their print tasks "get in line" with all the other printing tasks that are waiting. The
first task in is the next to be completed.
2. Operating systems use a number of different queues to control processes within a
computer. The scheduling of what gets done next is typically based on a queuing
algorithm that tries to execute programs as quickly as possible and serve as many users as
it can.
3. Keystrokes get ahead of the characters that appear on the screen. This is due to the
computer doing other work at that moment. The keystrokes are being placed in a queue-
like buffer so that they can eventually be displayed on the screen in the proper order.
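A queue can be sketched in Python along the same lines as the stack; the operation names `enqueue` and `dequeue` are conventional choices here, since the notes do not give a queue operations table. Index 0 of the internal list plays the rear and the end of the list plays the front:

```python
# A minimal FIFO queue: the rear is index 0 and the front is the end of
# the list, so enqueue is O(n) (everything shifts) while dequeue is O(1).
class Queue:
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def enqueue(self, item):
        self.items.insert(0, item)   # add at the rear
    def dequeue(self):
        return self.items.pop()      # remove from the front
    def size(self):
        return len(self.items)

q = Queue()
q.enqueue('task1')
q.enqueue('task2')
print(q.dequeue())  # 'task1' - first in, first out
```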
DEQUE
A deque, also known as a double-ended queue, is an ordered collection of items
similar to the queue. It has two ends, a front and a rear, and the items remain positioned
in the collection. What makes a deque different is the unrestricted nature of adding
and removing items: new items can be added at either the front or the rear, and,
likewise, existing items can be removed from either end. In essence, this hybrid linear
structure provides all the capabilities of stacks and queues in a single data structure.
Figure 2.5 below shows a deque of Python data objects. It is important to note that even
though the deque can assume many of the characteristics of stacks and queues, it does not
require the LIFO and FIFO orderings that are enforced by those data structures
(for example, when inserting a song into a playlist).

Figure 2.5: A Deque of Python Data Objects

DEQUE ABSTRACT DATA TYPE


The Deque Abstract Data Type is defined as a logical description of an ordered
collection of items where items can be added to and removed from either end, front or
rear. The deque operations and their meanings are presented in Table 2.3
below, while Table 2.4 depicts the results of a sequence of deque operations.

Table 2.3: Dequeue Operations


Operation Meaning
Dequeue() Creates a new dequeue that is empty. It needs no parameters and returns an empty
dequeue.
addFront(item) Adds a new item to the front of the dequeue. It needs the item and returns nothing.
addRear(item) Adds a new item to the rear of the dequeue. It needs the item and returns nothing.
removeFront() Removes the front item from the dequeue. It needs no parameters and returns the
item. The dequeue is modified.
removeRear() Removes the rear item from the dequeue. It needs no parameters and returns the
item. The dequeue is modified.
isEmpty() Tests to see whether the dequeue is empty. It needs no parameters and returns a
Boolean value.
size() Returns the number of items in the dequeue. It needs no parameters and returns an
integer value.
Table 2.4: Dequeue Operations
Dequeue Operation Dequeue Contents Return Value
d.isEmpty() [] True
d.addRear(4) [4]

d.addRear('dog') ['dog',4]
d.addFront('cat') ['dog',4,'cat']
d.addFront(True) ['dog',4,'cat',True]
d.size() ['dog',4,'cat',True] 4
d.isEmpty() ['dog',4,'cat',True] False
d.addRear(8.4) [8.4,'dog',4,'cat',True]
d.removeRear() ['dog',4,'cat',True] 8.4
d.removeFront() ['dog',4,'cat'] True
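A minimal deque sketch matching the operations of Table 2.3 follows. To mirror the contents shown in Table 2.4, the front is the end of the internal Python list and the rear is index 0:

```python
# A minimal deque matching Table 2.3: addFront/removeFront work at the
# end of the list, addRear/removeRear at index 0, as in Table 2.4.
class Deque:
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def addFront(self, item):
        self.items.append(item)
    def addRear(self, item):
        self.items.insert(0, item)
    def removeFront(self):
        return self.items.pop()
    def removeRear(self):
        return self.items.pop(0)
    def size(self):
        return len(self.items)

d = Deque()
d.addRear(4)
d.addRear('dog')
d.addFront('cat')
print(d.items)         # ['dog', 4, 'cat'] - as in Table 2.4
print(d.removeRear())  # 'dog'
```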

ARRAY
An array is a collection of items of the same data type, stored under a common variable name.
The collection forms a data structure where objects are stored linearly, one after another in
memory. Most other data structures make use of arrays to implement their algorithms. The
following are the important terms for understanding the concept of an
array.

 Element: Each item stored in an array is called an element.


 Index: Each location of an element in an array has a numerical index, which is used to
identify the element.

In a nutshell, an array is a linear data structure consisting of a group of elements that are
accessed by indexing.
ARRAY REPRESENTATION
Let us consider an array of length 10 for illustration:

From the illustration, the following are important points to consider.
 The index starts at 0.
 The array length is 10, which means it can store 10 elements.
 Each element can be accessed via its index. For example, we can fetch the element at
index 5, which here is 19.

BASIC OPERATIONS OF ARRAY
The following are the basic operations supported by an array.
 Traverse − Print all the array elements one by one.
 Insertion − Adds an element at the given index.
 Deletion − Remove an element at the given index.
 Search − Find an element using the given index or by the value.
 Update − Updates an element at the given index.
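The five operations above can be sketched with a Python list standing in for the array (Python lists are dynamic arrays under the hood; the values are illustrative, with 19 placed at index 5 as in the example above):

```python
# An illustrative array with 10 elements; index 5 holds 19.
arr = [35, 33, 42, 10, 14, 19, 27, 44, 26, 31]

for element in arr:        # Traverse: visit every element in turn
    pass
arr.insert(3, 100)         # Insertion: add 100 at index 3
del arr[3]                 # Deletion: remove the element at index 3
found = arr.index(19)      # Search: find the index of the value 19
arr[0] = 50                # Update: replace the element at index 0

print(found)               # 5
print(arr[0])              # 50
```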

CLASSIFICATION OF ARRAYS
Arrays can be classified as static and dynamic.
1. Arrays whose size cannot change once their storage has been allocated are called static.
2. Arrays whose size can be resized even if storage has been allocated are called dynamic.

APPLICATIONS OF ARRAYS
Arrays are employed in many computer applications in which data items need to be saved in
memory for subsequent processing. Due to their performance characteristics, arrays
are used to implement other data structures, such as heaps, hash tables, deques, queues, stacks
and strings.

LIST
A list is a linear data structure which contains a sequence of elements. It has a known
length and its elements are arranged consecutively. The items in the collection are accessible
one after the other, beginning at the head and ending at the tail. It is a widely used data
structure for applications which do not need random access. Unlike an array, a stack, or a
queue, a list allows insertion and deletion of elements at any position.
For example, the collection of integers 54, 26, 93, 17, 77, and 31 might represent a simple
unordered list of exam scores.

LIST IMPLEMENTATION
Lists can be implemented in many ways, depending on how the programmer will use lists in their
program. Common implementations include:
1. Array List
2. Linked List

ARRAY LISTS
This implementation stores the list in an array. The Array List has the following properties:
1. The position of each element is given by an index from 0 to n-1, where n is the number of
elements.
2. Given any index, the element with that index can be accessed in constant time. i.e. the
time to access does not depend on the size of the list.
3. To add an element at the end of the list, the time taken does not depend on the size of the
list. However, the time taken to add an element at any other point in the list does depend
on the size of the list, as all subsequent elements must be shifted up. Additions near the
start of the list take longer than additions near the middle or end.
4. When an element is removed, subsequent elements must be shifted down, so removals
near the start of the list take longer than removals near the middle or end.

LINKED LIST
A linked list is a linear data structure where each element is a separate object, connected to
the others via links. Each element is a node consisting of two items: the data and a reference
to the next node.

NODE
The node is the basic building block of a linked list implementation; it contains the item and
a reference to the next node. Each node object must hold at least two pieces of information,
called the data field and the next field. This is illustrated below:
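A node can be sketched in a few lines of Python (the example values reuse the exam scores from the LIST section):

```python
# A minimal Node: each node holds its data and a reference to the
# next node; None marks the end of the chain.
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None    # no next node yet

head = Node(54)
head.next = Node(26)        # link the two nodes together
print(head.data)            # 54
print(head.next.data)       # 26
print(head.next.next)       # None - end of the list
```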

Each link contains a connection to another link. The linked list is the second most-used data
structure after the array. The following are important terms for understanding the concept of a
linked list.
 Link: Each link of a linked list can store a data called an element.
 Next: Each link of a linked list contains a link to the next link called Next.
 LinkedList: A Linked List contains the connection link to the first link called First.

PROPERTIES OF LINKED LIST


Linked List has the following properties:
 The list can grow and shrink as needed. The position of each element is given by an index
from 0 to n-1, where n is the number of elements.
 Given any index, the time taken to access an element with that index depends on the
index. This is because each element of the list must be traversed until the required index
is found.
 The time taken to add an element at any point in the list does not depend on the size of
the list, as no shifts are required. It does, however, depend on the index.
 Additions near the end of the list take longer than additions near the middle or start. The
same applies to the time taken to remove an element.

LINKED LIST REPRESENTATION
Linked list can be visualized as a chain of nodes, where every node points to the next node.

As per the above illustration, the following are the important points to be considered.
 Linked List contains a link element called head or first.
 Each link carries a data field(s) and a link field called next.
 Each link is linked with its next link using its next link field.
 Last link carries a link as null to mark the end of the list.

TYPES OF LINKED LIST


1. Singly Linked List: In this type of linked list, every node stores address or reference of
next node in list and the last node has next address or reference as NULL, For example 1-
>2->3->4->NULL (Item navigation is forward only)
2. Doubly Linked List: In this type of linked list, there are two references associated with
each node: one points to the next node and the other to the previous node. The advantages
of this structure are that one can traverse in both directions and that deletion does not
require explicit access to the previous node. Example: NULL<-1<->2<->3->NULL
(items can be navigated forward and backward.)
3. Circular Linked List: Circular linked list is a linked list where all nodes are connected
to form a circle. There is no NULL at the end. A circular linked list can be a singly
circular linked list or doubly circular linked list. Advantage of this data structure is that
any node can be made as starting node. This is useful in implementation of circular queue
in linked list.

BASIC OPERATIONS OF LINKED LIST


The following are the basic operations supported by a list.
1. Insertion − Adds an element at the beginning of the list.
2. Deletion − Removes an element from the beginning of the list.
3. Display − Shows the complete list.
4. Search − Finds an element using a given key.
5. Delete − Removes an element using a given key.

INSERTION OPERATION
Adding a new node to a linked list is a multi-step activity. Let's consider the diagrams
below. First, create a node using the same structure and find the location where it has to be
inserted.

Imagine that we are inserting a node B (the new node) between A (the left node) and C (the
right node). First, point B's next reference to C −
Command 1: NewNode.next −> RightNode;

Command 2: LeftNode.next −> NewNode;

This will put the new node in the middle of the two. The new list should look like this.

DELETION OPERATION
Deletion is also a multi-step process, which we shall illustrate pictorially. First, locate the
target node to be removed using a search algorithm.

The left (previous) node of the target node now should point to the next node of the target node
Command 1: LeftNode.next −> RightNode.next;

This will remove the link that was pointing to the target node. Now, using the following code, we
will remove what the target node is pointing at.
Command 2: TargetNode.next −> NULL;

If we need to use the deleted node, we can keep it in memory; otherwise, we can simply
deallocate its memory and wipe off the target node completely.

REVERSE OPERATION
This operation is more involved. We need to make the head node point to the last node
and reverse the whole linked list.

First, we traverse to the end of the linked list, the node pointing to NULL, and make it point to
its previous node −

We have to make sure that the last node is not lost. So we keep a temp reference, which acts
like a head node pointing to the last node. Now, we make all the remaining nodes point to
their previous nodes, one by one.

Except for the first node (the one pointed to by the head node), every node now points to its
predecessor instead of its successor. The first node will point to NULL.

We'll make the head node point to the new first node by using the temp node.

The linked list is now reversed.

DOUBLE LINKED LIST


A doubly linked list is a variation of the linked list in which navigation is possible in both
directions, forward and backward, unlike a singly linked list.

The following are the important terms to understand the concept of doubly linked list.
 Link: Each link of a linked list can store a data called an element.
 Next: Each link of a linked list contains a link to the next link called Next.

 Prev: Each link of a linked list contains a link to the previous link called Prev.
 LinkedList: A Linked List contains the connection link to the first link called First and
to the last link called Last.

DOUBLE LINKED LIST REPRESENTATION

As per the above illustration, the following are the important points to be considered.
 Doubly Linked List contains a link element called first and last.
 Each link carries a data field(s) and
 The two link fields called next and prev.
 Each link is linked with its next link using its next link field.
 Each link is linked with its previous link using its prev link field.
 The last link carries a link as null to mark the end of the list.

BASIC OPERATIONS
Following are the basic operations supported by a double linked list.
 Insertion − Adds an element at the beginning of the list.
 Deletion − Removes an element from the beginning of the list.
 Insert Last − Adds an element at the end of the list.
 Delete Last − Removes an element from the end of the list.
 Insert After − Adds an element after an item of the list.
 Delete − Removes an element from the list using the key.
 Display forward − Displays the complete list in a forward manner.
 Display backward − Displays the complete list in a backward manner.

CIRCULAR LINKED LIST


Circular Linked List is a variation of Linked list in which the first element points to the last
element and the last element points to the first element. Both Singly Linked List and Doubly
Linked List can be made into a circular linked list.
SINGLE LINKED LIST AS CIRCULAR
In singly linked list, the next pointer of the last node points to the first node.

DOUBLE LINKED LIST AS CIRCULAR LINKED LIST
In a doubly linked list, the next pointer of the last node points to the first node and the prev
pointer of the first node points to the last node, making it circular in both directions.

BASIC OPERATIONS
Following are the important operations supported by a circular linked list.
 Insert − Add an element at the start of the list.
 Delete − Remove an element from the start of the list.
 Display − Show the list.

UNORDERED LIST
An unordered list is a collection of related items that have no special order or sequence. It is
built from a collection of nodes, each linked to the next by explicit references. As long as the
first node is known, the others can be found by successively following the next links.

IMPLEMENTING AN UNORDERED LIST: LINKED LISTS


In order to implement an unordered list, we need to construct what is commonly known as
a linked list. Recall that the relative positioning of items has to be maintained. However, there
is no requirement that positioning in contiguous memory be maintained. For example, consider
the collection of items shown in the figure below; these values appear to have been placed
randomly. If we maintain some explicit information in each item, namely the location of the
next item, then the relative position of each item can be expressed by simply following the link
from one item to the next, as shown in the figure below:

Figure: Items Not Constrained in their Physical Placement

Figure: Relative Positions Maintained by Explicit Links.

It is important to note that the location of the first item of the list must be explicitly specified.
Once the location of the first item is known, it can tell us where the second is, and so on. This
external reference is often referred to as the head of the list. Similarly, the last item needs to
know that there is no next item.

THE UNORDERED LIST ABSTRACT DATA TYPE


The structure of an unordered list, as described above, is a collection of items where each item
holds a relative position with respect to the others. Some possible unordered list operations are
given below.

Table 2.7: List Operations


Operation Meaning

List() Creates a new list that is empty. It needs no parameters and returns an empty
list.
add(item) Inserts a new item to the list. It needs the item and returns nothing. Assume
the item is not already in the list.
remove(item) Deletes the item from the list. It needs the item and modifies the list. Assume
the item is present in the list.
search(item) Look for the item in the list. It needs the item and returns a Boolean value.
isEmpty() Tests to see whether the list is empty. It needs no parameters and returns a
Boolean value.
size() Returns the number of items in the list. It needs no parameters and returns an
integer.
append(item) Adds a new item to the end of the list making it the last item in the collection.
It needs the item and returns nothing. Assume the item is not already in the
list.
index(item) Returns the position of item in the list. It needs the item and returns the index.
Assume the item is in the list.

insert(pos,item) Adds a new item to the list at position pos. It needs the item and returns
nothing. Assume the item is not already in the list and there are enough
existing items to have position pos.
pop() Removes and returns the last item in the list. It needs nothing and returns an
item. Assume the list has at least one item.
pop(pos) Removes and returns the item at position pos. It needs the position and returns
the item. Assume the item is in the list.

NON-LINEAR DATA STRUCTURE
TREE
A tree is often used to represent a hierarchy. The relationships between the items in the hierarchy
suggest the branches of a botanical tree. It is a collection of nodes storing elements such that the
nodes have a parent-child relationship. A tree has the following properties:
1. If tree is not empty, it has a special tree node called the root that has no parent.
2. Each node of a tree that is different from the root has a unique parent node.
3. Each node has zero or more children.
4. A unique path traverses from the root to each node.
In summary:
A tree stores elements in hierarchical order.
The top element is called the root.
Except for the root, each element has a parent.

BASIC DEFINITIONS
 Node: A node is an element-storing unit containing a data field; it can be either an
internal or an external node.
 Internal nodes: Nodes that have children.
 External nodes or leaves: Nodes that don’t have children
 Edge: An edge connects two nodes together to show that there is a relationship between
them. Although edges are usually drawn as simple lines, they are really directed from
parent to child; in tree drawings, this is top-to-bottom.
 Root: The root of the tree is the only node in the tree that has no incoming edges.
 Path: A path is an ordered list of nodes that are connected by edges.
 Children: Set of nodes that have incoming edges from the same node are said to be the
children of that node.
 Parent: A node is the parent of all the nodes it connects to with outgoing edges.
 Siblings: Two nodes that have the same parent are called siblings.
 Descendants: The descendants of a node are all the nodes on the paths from that
node down to any leaf.
 Ancestors: Ancestors of a node are all the nodes that are on the path from the node to the
root.
 Level: The level of a node n is the number of edges on the path from the root node to n.
 Height: The height of a node is the length of the longest path from the node to a leaf.
Figure 3.1 below illustrates a tree definition above. The arrowheads on the edges indicate the
direction of the connection.

Figure 3.1: A Tree Consisting of a Set of Nodes and Edges

APPLICATION OF TREE
1. Class hierarchy in Java.
2. File system.
3. Storing hierarchies in organizations
PARSE TREE
Parse trees can be used to represent real-world constructions such as sentences or mathematical
expressions. Figure 3.2 below is an example of how a tree can be used to solve some real-life
problems.

Figure 3.2: A Parse Tree for a Simple Sentence


Figure above shows the hierarchical structure of a simple sentence. Representing a sentence as a
tree structure allows us to work with the individual parts of the sentence by using sub-trees.

One can also represent a mathematical expression such as ((7+3)*(5−2)) as a parse tree, as shown
in Figure 3.3 below.

Figure 3.3: Parse Tree for ((7+3)*(5−2))

The parentheses in the expression show that, here, the addition and subtraction sub-expressions
must be evaluated before the multiplication, overriding multiplication's normally higher
precedence. In the parse tree, this order is reflected by the + and − sub-trees sitting below the * node.

HOW TO BUILD A PARSE TREE FROM MATHEMATICAL EXPRESSION


The first step in building a parse tree is to break up the expression string into a list of tokens.
There are four different kinds of tokens to consider:
 Left parentheses,
 Right parentheses,
 Operators, and
 Operands.
Using the information above one can define the four rules as follows:
1. If the current token is ' ( ', add a new node as the left child of the current node, and
descend to the left child.
2. If the current token is in the list ['+','-','/','*'], set the root value of the current node to the
operator represented by the current token. Add a new node as the right child of the
current node and descend to the right child.
3. If the current token is a number, set the root value of the current node to the number and
return to the parent.
4. If the current token is a ')', go to the parent of the current node.
Let’s look at an example of the rules outlined above in action, considering the expression
(3+(4*5)). It is broken into the following list of character tokens: ['(', '3', '+', '(', '4', '*', '5', ')', ')'].
Initially, we start out with a parse tree that consists of an empty root node.

Figures below illustrate the structure and contents of the parse tree, as each new token is
processed.

Step 1: Create an empty tree.

Step 2: By rule 1, create a new node as the left child of the root. Make the current node
this new child.

Step 3: By rule 3, set the root value of the current node to 3 and go back up the tree to the
parent.

Step 4: Read + as the next token. By rule 2, set the root value of the current node to +
and add a new node as the right child. The new right child becomes the current node.

Step 5: Read a ( as the next token. By rule 1, create a new node as the left child of the current
node. The new left child becomes the current node.

Step 6: Read a 4 as the next token. By rule 3, set the value of the current node to 4. Make the
parent of 4 the current node.

Step 7: Read * as the next token. By rule 2, set the root value of the current node to * and create
a new right child. The new right child becomes the current node.

Step 8: Read 5 as the next token. By rule 3, set the root value of the current node to 5. Make the
parent of 5 the current node.

Step 9: Read ) as the next token. By rule 4, we make the parent of * the current node.
Step 10: Read ) as the next token. By rule 4, we make the parent of + the current node. At this
point there is no parent for +, so we are done.

From the example above, it is clear that one needs to keep track of the current node as well as
the parent of the current node. The tree interface provides us with a way to get the children of
a node through the getLeftChild and getRightChild methods.

Example: Hierarchically represent the mathematical expression below using tree terminology
1. (5a + 10b) / (14a^4 - 28b)

2. (6b - 20c)/(2a+3d)/(5a+10b)^7

Answer 1:

BINARY TREE
A binary tree is a finite set of data items that is either empty or partitioned into three disjoint
subsets. The first subset contains a single data item referred to as the root of the binary tree;
the other two subsets are the left and right sub-trees, themselves binary trees. These data items
are referred to as the nodes of the binary tree. This is illustrated in Figure 3.4 below.

Figure 3.4: Binary Tree

From this figure we can portray the following:


A - Root node of the binary tree
A - Parent node of the nodes B & C
B & C - Children nodes of A and these nodes are in a level one more than the root node A
C - Right child node of A
B - Left child node of A
D & E - These nodes have no children, so they are referred to as leaves or external nodes
B & C - These nodes have one child each, so they are referred to as internal nodes

Full Binary Tree (FBT): In a full binary tree all the internal nodes have equal degree (two
children), which means there is one node at the root level, two nodes at level 2, four nodes at
level 3, eight nodes at level 4, and so on, as shown in Figure 3.5 below:

Figure 3.5: Full Binary Tree

Complete Binary Tree (CBT): A complete binary tree is a FBT except, that the deepest level
may not be completely filled. If not completely filled, it is filled from left-to-right as depicted in
Figure 3.6.

Figure 3.6: Complete Binary Tree


APPLICATION OF BINARY TREE
Binary trees are applied in the following areas:
1. In compilers of high-level programming languages, for intermediate
representation.
2. As a searching technique.
3. In databases, for storing data.
4. To evaluate arithmetic expressions.

BINARY TREE TRAVERSALS


Traversal is a systematic way to visit all nodes of a tree. There are three commonly used patterns
to visit a node and the difference between them is the order in which each node is visited. These
are called pre-order, in-order, and post-order.

a. Pre-order: In a pre-order traversal, the root is visited first, followed by the left
sub-tree, then the right sub-tree.
b. Post-order: In a post-order traversal, the left sub-tree is visited first, followed by the
right sub-tree, then the root.
c. In-order: In an in-order traversal, the left sub-tree is visited first, followed by the root,
then the right sub-tree.

Let’s look at some examples that illustrate each of these three kinds of traversals. First let’s look
at the preorder traversal by considering a book as a tree. The book is the root of the tree, and
each chapter is a child of the root. Each section within a chapter is a child of the chapter, and
each subsection is a child of its section, and so on. Figure 3.7 shows a limited version of a book
with only two chapters.

Figure 3.7: Representing a Book as a Tree

SEARCHING AND SORTING TECHNIQUES

Another example: try pre-order, post-order and in-order on the tree whose root A has children
B and C, where B's children are D and E, and C's children are F and G:

        A
      /   \
     B     C
    / \   / \
   D   E F   G

Preorder: ABDECFG
Post order: DEBFGCA
In-order: DBEAFCG

Searching
The process of finding a desired piece of information from a set of items stored in computer
memory is referred to as searching in data structures. Searching algorithms are an essential
part of a programmer's toolkit: they help programmers efficiently locate specific elements
within a collection of data. These sets of items may take various forms, such as an array, tree,
graph, or linked list.

The following are the basic types of searching techniques.


1. Linear Search
2. Binary Search
3. Interpolation Search

1. Linear Search
Linear search is a very simple search algorithm. In this type of search, a sequential search is
made over all items one by one. Every item is checked and if a match is found then that
particular item is returned, otherwise the search continues till the end of the data collection. This
is illustrated in the figure below:

2. Binary Search
Binary search looks for a particular item by comparing the middle most item of the collection. If
a match occurs, then the index of item is returned. If the middle item is greater than the item
under search, then the item is searched in the sub-array to the left of the middle item. Otherwise,
the item is searched for in the sub-array to the right of the middle item. This process continues
on the sub-array as well until the size of the sub-array reduces to zero.

Principles of Binary Search


To understand the principles of binary search, let’s consider the pictorial array below.

Above is a sorted array and let’s assume that we need to search the location of a value 31 using
binary search.

First, we shall determine half of the array by using the formula:


Mid = Low + (High - Low) / 2
    = 0 + (9 - 0) / 2 = 4 (the integer part of 4.5).

Therefore, the mid of the array is 4 as illustrated below.

Now we compare the value stored at location 4 with the value being searched, i.e. 31. We find
that the value at location 4 is 27, which is not a match. As our target value 31 is greater than
27 and we have a sorted array, we know that the target value must be in the upper portion
(right-hand side) of the array.

We change our low to mid + 1 and find the new mid value again.
Low = mid + 1
Mid = low + (high - low) / 2
Mid = 5 + (9 - 5) /2
=5+2=7

New mid is now 7. We compare the value stored at location 7 with the target value 31.

The value stored at location 7 (35) is not a match; rather, it is greater than what we are looking
for. So, the target must be in the lower part (left-hand side) of this location. The sub-array is
illustrated below.

Hence, we calculate the mid again.


Mid = low + (high - low) / 2
Mid = 5 + (6 - 5) / 2
    = 5 + 0 = 5 (integer division truncates 0.5 to 0)
This time the mid is location 5.

We compare the value stored at location 5 with our target value. We find that it is a match.

We conclude that the target value 31 is stored at location 5.
Binary search halves the number of searchable items at each step, drastically reducing the
number of comparisons to be made.

3. Interpolation search
Interpolation search is a searching algorithm that differs from other traditional methods, such as
linear or binary search. While linear search works sequentially from the beginning to the end of
a data set, and binary search divides the array in half at each step, interpolation search estimates
the position of the desired element based on its value.
This algorithm assumes that the elements within the collection are uniformly distributed,
enabling a more informed guess of where the target item might exist. Instead of relying on
constant intervals like binary search, interpolation search adapts its approach based on the range
of values within the data set.

How interpolations search work


Below are the steps involved in performing an interpolation search:
1. Sort the data: Interpolation search requires the input data to be sorted in ascending or
descending order. It is crucial to organize the data before applying this algorithm.
2. Calculate the position: The algorithm uses a mathematical formula to estimate the
likely position of the target element. It takes into account the values of the first and last
elements, as well as the target value itself.
The calculation formula is as follows:
Position = low + ((target – arr[low]) * (high – low)) / (arr[high] – arr[low])
Where,
Low represents the index of the first element,
High represents the index of the last element,
Target is the value we are searching for, and
Arr is the sorted array.
3. Compare and adjust: Once we have the estimated position, we compare the target
value with the element at that position. Based on the comparison, we can make
adjustments to narrow down the search range.
 If the target value matches the element at the estimated position, we have found
the desired element and can return its index.
 If the target value is less than the element at the estimated position, the search is
confined to the lower half of the array.
 If the target value is greater than the element at the estimated position, the search is
confined to the upper half of the array.
4. Repeat or terminate: After adjusting the search range, we repeat the interpolation
process, updating the position estimate and narrowing the search range, until we locate
the target value or exhaust all possibilities.

Example:
Let's illustrate the interpolation search algorithm with a simple example. Search for the target
value of 12 in the following sorted array of integers:
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

Let’s work through the algorithm step by step;


Position = low + ((target – arr[low]) * (high – low)) / (arr[high] –arr[low])
= 0 + ((12 – 2) * (9 – 0)) / (20 – 2)
= 0 + (10 * 9) / 18
= 0 + 90/18
=0+5
=5
Comparing the target value, 12, with the element at position 5 (which is 12), we find a match.
The algorithm terminates, and we return the index 5.

When to use interpolation search


Interpolation search offers several advantages over other searching algorithms in certain
scenarios. It performs exceptionally well when the data is uniformly distributed, as it is capable
of making informed guesses.

However, there are considerations to keep in mind. Interpolation search requires the data to be
sorted beforehand, which adds an initial time cost. Additionally, if the data set is unevenly
distributed or has many repeated elements, interpolation search may not yield significant
performance benefits.

Interpolation Algorithm
Step 1 − Start searching data from the middle of the list.
Step 2 − If it is a match, return the index of the item and exit.
Step 3 − If it is not a match, compute the probe position.
Step 4 − Divide the list using the probing formula and find the new middle.
Step 5 − If the data is greater than the middle, search in the higher sub-list.
Step 6 − If the data is smaller than the middle, search in the lower sub-list.
Step 7 − Repeat until a match is found or the sub-list is empty.

SORTING TECHNIQUES
Sorting refers to arranging data in a particular format. A sorting algorithm specifies the way to
arrange data in a particular order. The most common orders are numerical and lexicographical.
The following are some examples of sorting in real-life scenarios:

 Telephone Directory: The telephone directory stores the telephone numbers of people
sorted by their names, so that the names can be searched easily.
 Dictionary: The dictionary stores words in an alphabetical order so that searching of any
word becomes easy.

In-place Sorting and Not-in-place Sorting


Sorting algorithms may require some extra space for comparisons and for temporary storage
of a few data elements.
An algorithm that requires no extra space, so that sorting happens within the original array, is
called an in-place sort; bubble sort is an example. An algorithm that requires extra space
greater than or equal to the number of elements being sorted is called a not-in-place sort;
merge sort is an example.

Important Terms
Some terms are generally coined while discussing sorting techniques, here is a brief introduction
to them:
1. Increasing Order: A sequence of values is said to be in increasing order if each
successive element is greater than the previous one. E.g. 1, 3, 4, 6, 8, 9
2. Decreasing Order: A sequence of values is said to be in decreasing order if each
successive element is less than the previous one. E.g. 9, 8, 6, 4, 3, 1
3. Non-Increasing Order: A sequence of values is said to be in non-increasing order if
each successive element is less than or equal to its previous element. This order can
occur when the sequence contains duplicate values. For example, 9, 8, 6, 3, 3, 1
4. Non-Decreasing Order: A sequence of values is said to be in non-decreasing order if
each successive element is greater than or equal to its previous element. This order can
occur when the sequence contains duplicate values. For example, 1, 3, 3, 6, 8, 9

Bubble Sort Algorithm


Bubble sort is a simple sorting algorithm. It is a comparison-based algorithm in which each
pair of adjacent elements is compared and the elements are swapped if they are not in order.
This algorithm is not suitable for large data sets, as its average and worst case complexity are
Ο(n²), where n is the number of items.

How Bubble Sort Works?


We take an unsorted array for our example. Bubble sort takes Ο(n²) time, so we keep the
example short and precise.

Bubble sort starts with the very first two elements, comparing them to check which one is greater.

In this case, value 33 is greater than 14, so the pair is already in sorted order. Next, we compare
33 with 27.

We find that 27 is smaller than 33 and these two values must be swapped.

The new array should look like this −

Next we compare 33 and 35. We find that both are in already sorted positions.

Then we move to the next two values, 35 and 10.

Since 10 is smaller than 35, they are not in sorted order, so we swap these values. We then
find that we have reached the end of the array. After one iteration, the array should look like
this −

To be precise, we now show how the array looks after each iteration. After the second
iteration, it should look like this −

Notice that after each iteration, at least one value moves to the end.

And when no swap is required, bubble sort learns that the array is completely sorted.

Now we should look into some practical aspects of bubble sort.

Insertion Sort
This is an in-place, comparison-based sorting algorithm. A sub-list is maintained which is
always sorted; for example, the lower part of the array is kept sorted. An element which is to
be 'inserted' into this sorted sub-list has to find its appropriate place, and is then inserted
there. Hence the name, insertion sort.
The array is searched sequentially and unsorted items are moved and inserted into the sorted
sub-list (in the same array). This algorithm is not suitable for large data sets, as its average and
worst case complexity are Ο(n²), where n is the number of items.

How Insertion Sort Works?


We take an unsorted array for our example.

Insertion sort compares the first two elements.

It finds that both 14 and 33 are already in ascending order. For now, 14 is in sorted sub-list.

Insertion sort moves ahead and compares 33 with 27.

And finds that 33 is not in the correct position.

It swaps 33 with 27. It also checks with all the elements of sorted sub-list. Here we see that the
sorted sub-list has only one element 14, and 27 is greater than 14. Hence, the sorted sub-list
remains sorted after swapping.

By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10.

These values are not in a sorted order.

So we swap them.

However, swapping makes 27 and 10 unsorted.

Hence, we swap them too.

Again we find 14 and 10 in an unsorted order.

We swap them again. By the end of the third iteration, we have a sorted sub-list of 4 items.

This process goes on until all the unsorted values are covered in a sorted sub-list. Now we shall
see algorithm aspects of insertion sort.

Insertion Sort Algorithm


Step 1 − If it is the first element, it is already sorted.
Step 2 − Pick next element
Step 3 − Compare with all elements in the sorted sub-list
Step 4 − Shift all the elements in the sorted sub-list that is greater than the value to be sorted
Step 5 − Insert the value
Step 6 − Repeat until list is sorted

SELECTION SORT
Selection sort is a simple sorting algorithm. This sorting algorithm is an in-place comparison-
based algorithm in which the list is divided into two parts, the sorted part at the left end and the
unsorted part at the right end. Initially, the sorted part is empty and the unsorted part is the entire
list. The smallest element is selected from the unsorted array and swapped with the leftmost
element, and that element becomes a part of the sorted array. This process continues moving
unsorted array boundary one element to the right. This algorithm is not suitable for large data
sets, as its average and worst case complexities are Ο(n²), where n is the number of items.

How Selection Sort Works?


Consider the following depicted array as an example.

For the first position in the sorted list, where 14 is presently stored, the whole list is scanned
sequentially, and we find that 10 is the lowest value.

43
So we replace 14 with 10. After one iteration 10, which happens to be the minimum value in the
list, appears in the first position of the sorted list.

For the second position, where 33 is residing, we start scanning the rest of the list in a linear
manner.

We find that 14 is the second lowest value in the list and that it should appear in the second
place. We swap these values.

After two iterations, the two least values are positioned at the beginning in a sorted manner.

The same process is applied to the rest of the items in the array.
Following is a pictorial depiction of the entire sorting process −

Now, let us see the algorithm aspects of selection sort.

Algorithm
Step 1 − Set MIN to location 0
Step 2 − Search the minimum element in the list
Step 3 − Swap with value at location MIN
Step 4 − Increment MIN to point to next element
Step 5 − Repeat until list is sorted

MEMORY ALLOCATION

Introduction
Memory allocation is the process of assigning blocks of memory on request. This chapter describes
the process of memory allocation as well as its classifications and their advantages and
disadvantages. Memory allocation is a critical task in modern operating systems, and one of
the most commonly used techniques is contiguous memory allocation. Therefore, this module
concentrates mainly on contiguous memory allocation and its techniques.

Memory Allocation
A software application or process requires memory space in order to run. As a result, there must
be a mechanism that grants a specific amount of memory space corresponding to the needs of the
software or process. In a nutshell, the procedure of assigning memory space to software
applications is referred to as memory allocation.

Typically, the allocator receives memory from the operating system in a small number of large
blocks that it must divide up to satisfy requests for smaller blocks. It must also make any
returned blocks available for reuse.

Classification of Memory Allocation


Basically, there are two categories of memory allocation. These are briefly described and
illustrated below;
1. Contiguous: Contiguous memory allocation places each process in a single contiguous
memory region.
2. Non-contiguous: Contrarily, non-contiguous memory allocation distributes a process
across many locations in various memory sections.

Figure 4.1: Classification of Memory Allocation

As illustrated in Figure 4.1 above, contiguous memory allocation is also divided into two
types:

1. Fixed (or static) Partition: In the fixed partition scheme, memory is divided into a
fixed number of partitions, with each partition accommodating only one process.
The maximum size of a process is restricted by the maximum size of the partition.
Every partition is associated with limit registers.
2. Variable (or dynamic) Partition: In the variable partition scheme, memory is initially
a single continuous free block. When a request from a process arrives, a partition
is made in the memory in accordance with the size of the process.

Contiguous
Contiguous memory allocation is one of the memory allocation strategies. As the name
implies, it is a strategy that allocates contiguous blocks of memory to each process. Therefore,
whenever a process request reaches the main memory, a continuous segment is allotted to the
process from the free area, based on its size.

There are many contiguous allocation techniques, which are often used in combination with
one another in a particular case. This note presents a few of them, as follows.
1. First fit
2. Best fit
3. Worst fit
4. Next fit
5. Buddy system

First Fit
The first-fit algorithm searches for the first free partition that is large enough to accommodate the
process. The operating system starts searching from the beginning of the memory and allocates
the first free partition that is large enough to fit the process.

For example, suppose we have the following memory partitions:


| 10 KB | 20 KB | 15 KB | 18 KB | 30 KB |
Now, a process requests 18 KB of memory. The operating system starts searching from the
beginning and finds the first free partition that is large enough, the 20 KB partition. It allocates
the process to that partition and keeps the remaining 2 KB as free memory.

Best Fit
The best-fit algorithm searches for the smallest free partition that is large enough to accommodate
the process. The operating system searches the entire memory and selects the free partition whose
size is closest to, but not less than, the size of the process.

For example, suppose we have the following memory partitions:


| 10 KB | 19 KB | 15 KB | 25 KB | 30 KB |
Now, a process requests 18 KB of memory. The operating system searches for the smallest free
partition that is larger than 18 KB, and it finds the partition of 19 KB. It allocates the process to
that partition and keeps the remaining 1 KB as free memory.

Worst Fit
The worst-fit algorithm searches for the largest free partition and allocates the process to it. This
algorithm is designed to leave the largest possible free partition for future use.

For example, suppose we have the following memory partitions:


| 10 KB | 20 KB | 15 KB | 25 KB | 30 KB |
Now, a process requests 18 KB of memory. The operating system searches for the largest free
partition, which is 30 KB. It allocates the process to that partition and keeps the remaining 12 KB
as free memory in the block.

Next Fit:
Next fit is similar to first fit, but it starts searching for the first sufficiently large partition from
the point where the last allocation was made, rather than from the beginning of memory.

Comparison between First, Best and Worst Fits


First Fit is fast and simple to implement, making it the most commonly used algorithm.
However, it can suffer from external fragmentation, where small free partitions are left between
allocated partitions. Best Fit reduces wasted space by allocating processes to the smallest suitable
free partition, but it requires more time to search for the appropriate partition and tends to leave
very small leftover holes. Worst Fit allocates from the largest free partition so that the leftover
hole remains large enough to be useful, but it can lead to inefficient use of memory.

Advantages and Disadvantages of Contiguous Memory Allocation


Contiguous memory allocation has a range of benefits and drawbacks. A few of them are listed
below:
Advantages
1. It is easy to keep track of the number of memory blocks remaining, which determines
how many further processes can be given memory space.
2. Contiguous memory allocation gives good read performance, since an entire file can be
read from the disk in a single operation.
3. Contiguous allocation works well and is easy to set up.
Disadvantages
1. External Fragmentation
Contiguous memory allocation can lead to external fragmentation, where free memory is broken
into small, non-contiguous blocks. This can make it difficult to allocate large blocks of memory.

2. Waste of Memory
If a process requests a large block of memory but only uses a portion of it, the remaining
memory is wasted. This is known as internal fragmentation.

3. Difficulty in Memory Allocation


As memory is allocated and deallocated, the free memory space can become fragmented, making
it difficult to allocate large blocks of memory.

4. Poor Memory Utilization


Contiguous memory allocation can lead to poor memory utilization, as processes may be
allocated more memory than they need.

5. Difficulty in Expanding Memory


If a process needs more memory than initially allocated, it can be difficult to expand the memory
allocation, as the adjacent memory blocks may be allocated to other processes.
Exercise: Consider the requests from processes in the given order: 300K, 25K, 125K, and 50K. Let
there be two blocks of memory available: one of size 150K followed by one of size 350K.
Which of the following partition allocation schemes can satisfy the above requests?
A) Best fit.
B) First fit.
C) Both First & Best fits.
D) Neither first nor best fits.

Solution: Let us try all options.


Best Fit:
300K is allocated from the block of size 350K; 50K is left in the block.
25K is allocated from the remaining 50K block; 25K is left in the block.
125K is allocated from the 150K block; 25K is left in this block also.
50K cannot be allocated in the first leftover 25K space, nor in the second 25K. So best fit cannot
satisfy all the requests.

First Fit:
The 300K request is allocated from the 350K block; 50K is left over.
25K is allocated from the 150K block; 125K is left over.
Then 125K and 50K are allocated to the remaining leftover partitions.
So first fit can handle all the requests.

So option B is the correct choice.

Buddy System
The buddy system is a memory allocation technique used in computer operating systems to
allocate and manage memory efficiently. It divides memory into fixed-size blocks, and when
there is a memory request, the system finds the smallest available block that can accommodate
the requested size. When a block is split, the two smaller parts are of equal size and are called
buddies.

In the buddy system, memory is split into fixed-length blocks, usually in powers of 2 (e.g.,
1KB, 2KB, 4KB, and so on). When a request for memory allocation is made, the system searches
for a correctly sized block. If an appropriate block is found, the space is allocated.
However, if no existing block precisely fits the requested size, the system takes a bigger block
and splits it into smaller blocks until a block of a suitable size is obtained.
Steps Involved in the Buddy System
Below are the steps involved in the Buddy System Memory Allocation Technique:
1. The first step includes the division of memory into fixed-sized blocks that have a power
of 2 in size (such as 2, 4, 8, 16, 32, 64, 128, etc. ).
2. Each block is labeled with its size and unique identification.
3. Initially, all the memory blocks are free and are linked together in a binary tree
structure, with each node representing a block and the tree’s leaves representing the
smallest available blocks.
4. When a process requests memory space, the system finds the smallest available
block that can accommodate the requested size. If the block is larger than the requested
size, the system splits the block into two equal-sized “buddy” blocks.
5. The system marks one of the buddy blocks as allocated and adds it to the process’s
memory allocation table, while the other buddy block is returned to the free memory
pool and linked back into the binary tree structure.
6. When a process releases memory, the system marks the corresponding block as free and
looks for its buddy block. If the buddy block is also free, the system merges the two
blocks into a larger block and links it back into the binary tree structure.

Types of Buddy System


The buddy system generally refers to a specific method used to allocate and de-allocate
memory blocks. However, within this framework, variations or adaptations may exist depending
on the requirements or optimizations needed for different systems. Here are a few types of
buddy systems:
 Fibonacci Buddy System: A Fibonacci buddy system has block sizes such as 16, 32, 48, 80,
128, and 208 bytes, with each block size equal to the sum of the two block sizes preceding it.
When a block is split off one free list, its two portions are added to the two free lists
that come before it. Internal fragmentation can be minimized by bringing the allowable block
sizes close together. Now, let us look at an example to better grasp the Fibonacci buddy system.
Assume the memory size is 377 KB. Then, 377 KB will be partitioned into 144 KB and 233
KB. Following that, 144 will be divided into (55 + 89), whereas 233 will be divided into (89
+ 144). This splitting continues based on memory requirements.

50
Formula : F(n) = F(n-1) + F(n-2)
Fibonacci Series
Here are the first 15 numbers in the Fibonacci series:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377

Step 1: Initial Partitioning


The memory size is 377 KB. To apply the Fibonacci buddy system, we need to divide the
memory into two blocks, where the sizes of the blocks are Fibonacci numbers.
The closest Fibonacci numbers that add up to 377 are 233 and 144.
Step 2: Partitioning the Blocks
Now, we have two blocks: 144 KB and 233 KB.
Step 3: Sub-Partitioning the Blocks
We'll sub-partition each block into smaller blocks, again using Fibonacci numbers.
For the 144 KB block:
- The closest Fibonacci numbers that add up to 144 are 89 and 55.
For the 233 KB block:
- The closest Fibonacci numbers that add up to 233 are 144 and 89.
Notice that the 89 KB block is reused in both sub-partitions.
The resulting partitioning is:

- 377 KB = 233 KB + 144 KB


- 144 KB = 89 KB + 55 KB
- 233 KB = 144 KB + 89 KB

This is the basic idea behind the Fibonacci buddy system. By using Fibonacci numbers to
partition the memory, we can create a hierarchical structure that allows for efficient
allocation and deallocation of memory blocks.
 Binary Buddy System: The buddy system keeps track of the free blocks of each size
(known as a free list) so that a block of the necessary size can be found easily if one is
available. If no blocks of the requested size are available, the allocator examines the first
non-empty list of blocks of at least the requested size. In both cases, a block is removed from
the free list. For example, a 512 KB memory is initially partitioned into two partitions
of 256 KB each, with further subdivisions in powers of 2 to handle memory
requests.
 Weighted Buddy System: In a weighted buddy system, each memory block is associated
with a weight, which represents its size relative to other blocks. When a memory allocation
request occurs, the system searches for an appropriate block, considering both the size of the
requested memory and the weights of the available blocks.

 Tertiary Buddy System: In a traditional buddy system, memory is divided into blocks of
fixed size, usually a power of 2, and allocated from these blocks, but the tertiary buddy system
introduces a third block size, which allows greater flexibility in memory allocation.

Summary
Contiguous memory allocation is an essential technique used in modern operating systems to
allocate memory space to processes. First Fit, Best Fit, Worst Fit and buddy system are popular
algorithms used for contiguous memory allocation. Each algorithm has its advantages and
disadvantages, but all are designed to optimize memory allocation and reduce fragmentation.
With these techniques, operating systems can efficiently manage memory, making them a critical
component of computer systems.
