Notes DSA

The document outlines the fundamentals of data structures, particularly focusing on their definitions, characteristics, and types, including linear and non-linear structures. It discusses the importance of data structures in efficiently organizing and accessing data, as well as basic operations such as insertion, deletion, and searching. Additionally, it provides specific details on arrays and stacks, including their representations and operations.


Neelam Vidya Vihar, Sijoul, Mailam, Madhubani, Bihar – 847235

Website: http://www.sandipuniversity.edu.in Email: info@sandipuniversity.edu.in


SCHOOL OF COMPUTER SCIENCE AND ENGINEERING
Subject Code & Name: Data Structure Using C (BCA202T)
Course: BCA (Bachelor of Computer Application)
Semester: 2nd

Notes:-
UNIT-1
Data Structure is a systematic way to organize data in order to use it efficiently. The following
terms are the foundation of any data structure.
• Interface − Each data structure has an interface. Interface represents the set of
operations that a data structure supports. An interface only provides the list of supported
operations, type of parameters they can accept and return type of these operations.
• Implementation − Implementation provides the internal representation of a data
structure. Implementation also provides the definition of the algorithms used in the
operations of the data structure.
Characteristics of a Data Structure
• Correctness − Data structure implementation should implement its interface correctly.
• Time Complexity − Running time or the execution time of operations of data structure
must be as small as possible.
• Space Complexity − Memory usage of a data structure operation should be as little as
possible.
Need for Data Structure
As applications are getting complex and data rich, there are three common problems that
applications face now-a-days.
• Data Search − Consider an inventory of 1 million (10^6) items in a store. If the
application has to search for an item, it must look through 1 million items every
time, slowing down the search. As data grows, search becomes slower.
• Processor Speed − Processor speed, although very high, becomes a limitation if the data
grows to billions of records.
• Multiple Requests − As thousands of users can search data simultaneously on a web
server, even a fast server can fail while searching the data.
To solve the above-mentioned problems, data structures come to rescue. Data can be organized
in a data structure in such a way that all items may not be required to be searched, and the
required data can be searched almost instantly.
Execution Time Cases
There are three cases which are usually used to compare various data structure's execution time
in a relative manner.
• Worst Case − This is the scenario where a particular data structure operation takes
maximum time it can take. If an operation's worst case time is ƒ(n) then this operation
will not take more than ƒ(n) time where ƒ(n) represents function of n.
• Average Case − This is the scenario depicting the average execution time of an
operation of a data structure. If an operation takes ƒ(n) time on average, then m
operations will take mƒ(n) time.
• Best Case − This is the scenario depicting the least possible execution time of an
operation of a data structure. If an operation's best case time is ƒ(n), then the actual
operation will never take less than ƒ(n) time.
Basic Terminology
• Data − Data are values or set of values.
• Data Item − Data item refers to single unit of values.
• Group Items − Data items that are divided into sub items are called as Group Items.
• Elementary Items − Data items that cannot be divided are called as Elementary Items.
• Attribute and Entity − An entity is that which contains certain attributes or properties,
which may be assigned values.
• Entity Set − Entities of similar attributes form an entity set.
• Field − Field is a single elementary unit of information representing an attribute of an
entity.
• Record − Record is a collection of field values of a given entity.
• File − File is a collection of records of the entities in a given entity set.
Data Structure Basics
This chapter explains the basic terms related to data structure.
Data Definition
Data Definition defines a particular data with the following characteristics.
• Atomic − Definition should define a single concept.
• Traceable − Definition should be able to be mapped to some data element.
• Accurate − Definition should be unambiguous.
• Clear and Concise − Definition should be understandable.
Data Object
Data Object represents an object having a data.
Data Type
Data type is a way to classify various types of data such as integer, string, etc. It
determines the values that can be used with the corresponding type of data and the type of
operations that can be performed on it. There are two data types

• Built-in Data Type
• Derived Data Type
Built-in Data Type
Those data types for which a language has built-in support are known as Built-in Data types.
For example, most of the languages provide the following built-in data types.
• Integers
• Boolean (true, false)
• Floating (Decimal numbers)
• Character and Strings
Derived Data Type
Those data types which are implementation-independent, as they can be implemented in one
way or another, are known as derived data types. These data types are normally built by
combining primary or built-in data types with associated operations on them.
For example −
• List
• Array
• Stack
• Queue
Basic Operations
The data in the data structures are processed by certain operations. The particular data
structure chosen largely depends on the frequency of the operation that needs to be performed
on the data structure.
• Traversing
• Searching
• Insertion
• Deletion
• Sorting
• Merging
Data Structures and Types
Data structures are introduced in order to store, organize and manipulate data in
programming languages. They are designed in a way that makes accessing and processing of
the data a little easier and simpler. These data structures are not confined to one particular
programming language; they are just pieces of code that structure data in the memory.
Data types are often confused with data structures, but this is not precisely correct even
though data structures are referred to as Abstract Data Types. Data types represent the nature of
the data, while data structures are collections of similar or different data types in one.
There are usually just two types of data structures −
• Linear
• Non-Linear
Linear Data Structures
The data is stored in linear data structures sequentially. These are rudimentary structures since
the elements are stored one after the other without applying any mathematical operations.

Linear data structures are usually easy to implement but since the memory allocation might
become complicated, time and space complexities increase. Few examples of linear data
structures include −
• Arrays
• Linked Lists
• Stacks
• Queues
Based on the data storage methods, these linear data structures are divided into two sub-types.
They are − static and dynamic data structures.
Static Linear Data Structures
In Static Linear Data Structures, the memory allocation is not scalable. Once the entire memory
is used, no more space can be retrieved to store more data. Hence, the memory is required to
be reserved based on the size of the program. This will also act as a drawback since reserving
more memory than required can cause a wastage of memory blocks.
The best example for static linear data structures is an array.
Dynamic Linear Data Structures
In Dynamic linear data structures, the memory allocation can be done dynamically when
required. These data structures are efficient considering the space complexity of the program.
Few examples of dynamic linear data structures include: linked lists, stacks and queues.
Non-Linear Data Structures
Non-Linear data structures store the data in the form of a hierarchy. Therefore, in contrast to
the linear data structures, the data can be found in multiple levels and are difficult to traverse
through.
However, they are designed to overcome the issues and limitations of linear data structures.
For instance, the main disadvantage of linear data structures is the memory allocation. Since
the data is allocated sequentially in linear data structures, each element in these data structures
uses one whole memory block. However, if the data uses less memory than the assigned block
can hold, the extra memory space in the block is wasted. Therefore, non-linear data structures
are introduced. They decrease the space complexity and use the memory optimally.
Few types of non-linear data structures are −
• Graphs
• Trees
• Tries
• Maps
Array Data Structure
Array is a type of linear data structure that is defined as a collection of elements with same or
different data types. They exist in both single dimension and multiple dimensions. These data
structures come into picture when there is a necessity to store multiple elements of similar
nature together at one place.

The difference between an array index and a memory address is that the array index acts like a
key value to label the elements in the array, whereas the memory address is the actual location
where that element is stored in memory.
Following are the important terms to understand the concept of Array.
• Element − Each item stored in an array is called an element.
• Index − Each location of an element in an array has a numerical index, which is used
to identify the element.
Syntax
Creating an array in C and C++ programming languages −
data_type array_name[array_size] = {elements separated using commas}
or,
data_type array_name[array_size];
Need for Arrays
Arrays are used as solutions to many problems from the small sorting problems to more
complex problems like travelling salesperson problem. There are many data structures other
than arrays that provide efficient time and space complexity for these problems, so what makes
using arrays better? The answer lies in the random access lookup time.
Arrays provide O(1) random access lookup time. That means, accessing the 1st index
of the array and the 1000th index of the array will both take the same time. This is due to the
fact that array comes with a pointer and an offset value. The pointer points to the right location
of the memory and the offset value shows how far to look in the said memory.
array_name[index]
| |
Pointer Offset
Therefore, in an array with 6 elements, to access the 1st element, array is pointed towards the
0th index. Similarly, to access the 6th element, array is pointed towards the 5th index.
Array Representation
Arrays are represented as a collection of buckets where each bucket stores one element.
These buckets are indexed from 0 to n-1, where n is the size of that particular array. For
example, an array with size 10 will have buckets indexed from 0 to 9.
This indexing will be similar for the multidimensional arrays as well. If it is a 2-
dimensional array, it will have sub-buckets in each bucket. Then it will be indexed as
array_name[m][n], where m and n are the sizes of each level in the array.

As per the above illustration, following are the important points to be considered.
• Index starts with 0.
• Array length is 9 which means it can store 9 elements.
• Each element can be accessed via its index. For example, we can fetch the element at
index 6, whose value here is 23.
Basic Operations in the Arrays
The basic operations in the Arrays are insertion, deletion, searching, display, traverse, and
update. These operations are usually performed to either modify the data in the array or to
report the status of the array.
Following are the basic operations supported by an array.
• Traverse − Prints all the array elements one by one.
• Insertion − Adds an element at the given index.
• Deletion − Deletes an element at the given index.
• Search − Searches an element using the given index or by the value.
• Update − Updates an element at the given index.
• Display − Displays the contents of the array.
In C, elements of an array with static storage duration (and the trailing elements of a partially
initialized array) are assigned default values as follows; a local array with no initializer is left
uninitialized.

Data Type    Default Value

bool         false

char         '\0'

int          0

float        0.0

double       0.0

wchar_t      L'\0'

Insertion Operation
In the insertion operation, we are adding one or more elements to the array. Based on the
requirement, a new element can be added at the beginning, end, or any given index of array.
This is done using input statements of the programming languages.
Algorithm
Following is an algorithm to insert elements into a Linear Array until we reach the end of the
array −
1. Start
2. Create an Array of a desired datatype and size.
3. Initialize a variable i as 0.
4. Enter the element at ith index of the array.
5. Increment i by 1.
6. Repeat Steps 4 & 5 until the end of the array.
7. Stop
Here, we see a practical implementation of insertion operation, where we add data at the end
of the array −
Deletion Operation
In this array operation, we delete an element from the particular index of an array. This deletion
operation takes place as we assign the value in the consequent index to the current index.

Algorithm
Consider LA is a linear array with N elements and K is a positive integer such that K<=N.
Following is the algorithm to delete an element available at the Kth position of LA.
1. Start
2. Set J = K
3. Repeat steps 4 and 5 while J < N
4. Set LA[J] = LA[J + 1]
5. Set J = J+1
6. Set N = N-1
7. Stop
Search Operation
To search for an element in the array, a key value is compared sequentially with every
value in the array to check whether the key is present in the array or not.
Algorithm
Consider LA is a linear array with N elements and K is a positive integer such that K<=N.
Following is the algorithm to find an element with a value of ITEM using sequential search.
1. Start
2. Set J = 0
3. Repeat steps 4 and 5 while J < N
4. IF LA[J] is equal to ITEM THEN GOTO STEP 6
5. Set J = J +1
6. PRINT J, ITEM
7. Stop
Traversal Operation
This operation traverses through all the elements of an array. We use loop statements to carry
this out.
Algorithm
Following is the algorithm to traverse through all the elements present in a Linear Array −
1. Start
2. Initialize an Array of certain size and datatype.
3. Initialize another variable i with 0.
4. Print the ith value in the array and increment i.
5. Repeat Step 4 until the end of the array is reached.
6. End
Update Operation
Update operation refers to updating an existing element from the array at a given index.
Algorithm
Consider LA is a linear array with N elements and K is a positive integer such that K<=N.
Following is the algorithm to update an element available at the Kth position of LA.
1. Start
2. Set LA[K-1] = ITEM
3. Stop
Display Operation
This operation displays all the elements in the entire array using a print statement.
Algorithm
1. Start
2. Print all the elements in the Array
3. Stop
Unit-2
Stack
A stack is an Abstract Data Type (ADT) that is popularly used in most programming languages.
It is named stack because it has operations similar to real-world stacks, for example a
pack of cards or a pile of plates.

The stack follows the LIFO (Last in - First out) structure where the last element inserted would
be the first element deleted.
Stack Representation
A Stack ADT allows all data operations at one end only. At any given time, we can only access
the top element of a stack.
The following diagram depicts a stack and its operations −

A stack can be implemented by means of Array, Structure, Pointer, and Linked List. Stack can
either be a fixed size one or it may have a sense of dynamic resizing. Here, we are going to
implement stack using arrays, which makes it a fixed size stack implementation.
Basic Operations on Stacks
Stack operations usually are performed for initialization, usage and, de-initialization of the
stack ADT.
The most fundamental operations in the stack ADT include: push(), pop(), peek(), isFull(),
isEmpty(). These are all built-in operations to carry out data manipulation and to check the
status of the stack.
Stack uses a pointer that always points to the topmost element within the stack, hence it is
called the top pointer.
Insertion: push()
push() is an operation that inserts elements into the stack. The following is an algorithm that
describes the push() operation in a simpler way.
Algorithm
1 Checks if the stack is full.
2 If the stack is full, produces an error and exit.
3 If the stack is not full, increments top to point next empty space.
4 Adds data element to the stack location, where top is pointing.
5 Returns success.
Note − In Java, the built-in method push() performs this operation.
Deletion: pop()
pop() is a data manipulation operation which removes elements from the stack. The following
pseudo code describes the pop() operation in a simpler way.
Algorithm
1 Checks if the stack is empty.
2 If the stack is empty, produces an error and exit.
3 If the stack is not empty, accesses the data element at which top is pointing.
4 Decreases the value of top by 1.
5 Returns success.
Note − In Java we are using the built-in method pop().
peek()
peek() is an operation that retrieves the topmost element within the stack, without deleting it.
This operation is used to check the status of the stack with the help of the top pointer.
Algorithm
1. START
2. return the element at the top of the stack
3. END
isFull()
isFull() operation checks whether the stack is full. This operation is used to check the status of
the stack with the help of top pointer.
Algorithm
1. START
2. If the size of the stack is equal to the top position of the stack, the stack is full. Return 1.
3. Otherwise, return 0.
4. END
Expression Parsing
The way to write arithmetic expression is known as a notation. An arithmetic expression can
be written in three different but equivalent notations, i.e., without changing the essence or
output of an expression. These notations are −
• Infix Notation
• Prefix (Polish) Notation
• Postfix (Reverse-Polish) Notation
These notations are named according to how the operator is used in the expression. We shall
learn the same here in this chapter.
Infix Notation
We write expression in infix notation, e.g. a - b + c, where operators are used in-between
operands. It is easy for us humans to read, write, and speak in infix notation but the same does
not go well with computing devices. An algorithm to process infix notation could be difficult
and costly in terms of time and space consumption.
Prefix Notation
In this notation, operator is prefixed to operands, i.e. operator is written ahead of operands. For
example, +ab. This is equivalent to its infix notation a + b. Prefix notation is also
known as Polish Notation.
Postfix Notation
This notation style is known as Reversed Polish Notation. In this notation style, the operator
is postfixed to the operands i.e., the operator is written after the operands. For
example, ab+. This is equivalent to its infix notation a + b.
The following table briefly tries to show the difference in all three notations −
Sr.No.   Infix Notation        Prefix Notation    Postfix Notation

1        a + b                 + a b              a b +

2        (a + b) ∗ c           ∗ + a b c          a b + c ∗

3        a ∗ (b + c)           ∗ a + b c          a b c + ∗

4        a / b + c / d         + / a b / c d      a b / c d / +

5        (a + b) ∗ (c + d)     ∗ + a b + c d      a b + c d + ∗

6        ((a + b) ∗ c) - d     - ∗ + a b c d      a b + c ∗ d -


Parsing Expressions
As we have discussed, it is not a very efficient way to design an algorithm or program to parse
infix notations. Instead, these infix notations are first converted into either postfix or prefix
notations and then computed.
To parse any arithmetic expression, we need to take care of operator precedence and
associativity also.

Precedence
When an operand is in between two different operators, which operator will take the operand
first, is decided by the precedence of an operator over others.
For example −
In a + b ∗ c, as the multiplication operation has precedence over addition, b ∗ c will be evaluated
first. A table of operator precedence is provided later.
Associativity
Associativity describes the rule where operators with the same precedence appear in an
expression. For example, in expression a + b − c, both + and − have the same
precedence; which part of the expression will be evaluated first is determined by the
associativity of those operators. Here, both + and − are left associative, so the expression
will be evaluated as (a + b) − c.
Precedence and associativity determines the order of evaluation of an expression. Following is
an operator precedence and associativity table (highest to lowest) −
Sr.No.   Operator                                Precedence        Associativity

1        Exponentiation ^                        Highest           Right Associative

2        Multiplication ( ∗ ) & Division ( / )   Second Highest    Left Associative

3        Addition ( + ) & Subtraction ( − )      Lowest            Left Associative


The above table shows the default behavior of operators. At any point of time in expression
evaluation, the order can be altered by using parentheses. For example −
In a + b ∗ c, the expression part b ∗ c will be evaluated first, with multiplication taking
precedence over addition. If we want a + b to be evaluated first, we use parentheses: (a +
b) ∗ c.
Postfix Evaluation Algorithm
We shall now look at the algorithm on how to evaluate postfix notation −
Step 1 − Scan the expression from left to right.
Step 2 − If it is an operand, push it onto the stack.
Step 3 − If it is an operator, pop two operands from the stack and perform the operation.
Step 4 − Push the output of Step 3 back onto the stack.
Step 5 − Repeat Steps 2-4 until the whole expression is scanned.
Step 6 − Pop the stack to obtain the final result.

Queue
Queue, like Stack, is also an abstract data structure. The thing that makes queue different from
stack is that a queue is open at both its ends. Hence, it follows FIFO (First-In-First-Out)
structure, i.e. the data item inserted first will also be accessed first. The data is inserted into the
queue through one end and deleted from it using the other end.
A real-world example of a queue is a single-lane one-way road, where the vehicle that enters
first exits first. More real-world examples can be seen as queues at ticket windows and
bus stops.
Representation of Queues
Similar to the stack ADT, a queue ADT can also be implemented using arrays, linked lists, or
pointers. As a small example in this tutorial, we implement queues using a one-dimensional
array.

Basic Operations
Queue operations also include initialization of a queue, usage and permanently deleting the
data from the memory.
The most fundamental operations in the queue ADT include: enqueue(), dequeue(), peek(),
isFull(), isEmpty(). These are all built-in operations to carry out data manipulation and to check
the status of the queue.
Queue uses two pointers − front and rear. The rear pointer is used to insert data at the rear
end (helping in enqueueing), while the front pointer is used to access and remove data from the
front end (helping in dequeuing).
Insertion operation: enqueue()
The enqueue() is a data manipulation operation that is used to insert elements into the queue.
The following algorithm describes the enqueue() operation in a simpler way.
Algorithm
1 START
2 Check if the queue is full.
3 If the queue is full, produce overflow error and exit.
4 If the queue is not full, increment rear pointer to point the next empty space.
5 Add data element to the queue location, where the rear is pointing.
6 return success.
7 END

Deletion Operation: dequeue()


The dequeue() is a data manipulation operation that is used to remove elements from the queue.
The following algorithm describes the dequeue() operation in a simpler way.
Algorithm
1 START
2 Check if the queue is empty.
3 If the queue is empty, produce underflow error and exit.
4 If the queue is not empty, access the data where front is pointing.
5 Increment front pointer to point to the next available data element.
6 Return success.
7 END
The peek() Operation
The peek() is an operation which is used to retrieve the frontmost element in the queue, without
deleting it. This operation is used to check the status of the queue with the help of the front pointer.
Algorithm
1 START
2 Return the element at the front of the queue
3 END
The isFull() Operation
The isFull() operation verifies whether the queue is full.
Algorithm
1 START
2 If the count of queue elements equals the queue size, return true
3 Otherwise, return false
4 END
The isEmpty() operation
The isEmpty() operation verifies whether the queue is empty. This operation is used to check
the status of the queue with the help of the front pointer.
Algorithm
1 START
2 If the count of queue elements equals zero, return true
3 Otherwise, return false
4 END
Implementation of Queue
In this chapter, the algorithm implementation of the Queue data structure is performed using a
one-dimensional array.
UNIT-3
Linked List
While arrays accommodate elements of the same data type in contiguous memory, linked lists
consist of nodes holding elements that are arranged sequentially through links.
But how are these linked lists created?
A linked list is a collection of nodes connected together via links. These nodes consist of the
data to be stored and a pointer to the address of the next node within the linked list. In the case
of arrays, the size is limited to the definition, but in linked lists, there is no defined size. Any
amount of data can be stored in it and can be deleted from it.
There are three types of linked lists −
• Singly Linked List − The nodes only point to the address of the next node in the list.
• Doubly Linked List − The nodes point to the addresses of both previous and next
nodes.
• Circular Linked List − The last node in the list will point to the first node in the list.
It can either be singly linked or doubly linked.
Linked List Representation
Linked list can be visualized as a chain of nodes, where every node points to the next node.

As per the above illustration, following are the important points to be considered.
• Linked List contains a link element called first (head).
• Each link carries a data field(s) and a link field called next.
• Each link is connected to its next link through the next field.
• The last link carries a null link to mark the end of the list.
Types of Linked List
Following are the various types of linked list.
Singly Linked Lists
Singly linked lists contain two buckets in one node; one bucket holds the data and the other
bucket holds the address of the next node of the list. Traversals can be done in one direction
only as there is only a single link between two nodes of the same list.

Doubly Linked Lists


Doubly Linked Lists contain three buckets in one node; one bucket holds the data and the other
buckets hold the addresses of the previous and next nodes in the list. The list can be traversed
in both directions, as the nodes in the list are connected to each other from both sides.

Circular Linked Lists


Circular linked lists can exist in both singly linked list and doubly linked list.
Since the last node and the first node of the circular linked list are connected, the traversal in
this linked list will go on forever until it is broken.

Basic Operations in the Linked Lists


The basic operations in the linked lists are insertion, deletion, searching, display, and deleting
an element at a given key. These operations are performed on Singly Linked Lists as given
below −
• Insertion − Adds an element at the beginning of the list.
• Deletion − Deletes an element at the beginning of the list.
• Display − Displays the complete list.
• Search − Searches an element using the given key.
• Delete − Deletes an element using the given key.
Insertion Operation
Adding a new node to a linked list involves more than one step. We shall learn this with
diagrams here. First, create a node using the same structure and find the location where it has
to be inserted.
Imagine that we are inserting a node B (NewNode), between A (LeftNode) and C (RightNode).
Then point B.next to C −
NewNode.next −> RightNode;
It should look like this −

Now, the next node at the left should point to the new node.
LeftNode.next −> NewNode;

This will put the new node in the middle of the two. The new list should look like this −
Insertion in linked list can be done in three different ways. They are explained as follows −
Insertion at Beginning
In this operation, we are adding an element at the beginning of the list.
Algorithm
1. START
2. Create a node to store the data
3. Check if the list is empty
4. If the list is empty, add the data to the node and assign the head pointer to it.
5. If the list is not empty, add the data to a node and link it to the current head. Assign the head to
the newly added node.
6. END

Insertion at Ending
In this operation, we are adding an element at the ending of the list.
Algorithm
1. START
2. Create a new node and assign the data
3. Find the last node
4. Point the last node to new node
5. END
Insertion at a Given Position
In this operation, we are adding an element at any position within the list.
Algorithm
1. START
2. Create a new node and assign data to it
3. Iterate until the node at position is found
4. Point first to new first node
5. END
Deletion Operation
Deletion is also a more than one step process. We shall learn with pictorial representation. First,
locate the target node to be removed, by using searching algorithms.

The left (previous) node of the target node now should point to the next node of the target node

LeftNode.next −> TargetNode.next;

This will remove the link that was pointing to the target node. Now, using the following code,
we will remove what the target node is pointing at.
TargetNode.next −> NULL;

If we need the deleted node, we can keep it in memory; otherwise, we can simply
deallocate its memory and wipe off the target node completely.
Similar steps should be taken if the node being deleted is at the beginning of the list. While
deleting the last node, the second last node of the list should point to NULL.
Deletion in linked lists is also performed in three different ways. They are as follows −
Deletion at Beginning
In this deletion operation of the linked list, we are deleting an element from the beginning of the
list. For this, we point the head to the second node.
Algorithm
1. START
2. Assign the head pointer to the next node in the list
3. END
Deletion at Ending
In this deletion operation of the linked list, we are deleting an element from the end of the list.
Algorithm
1. START
2. Iterate until you find the second last element in the list.
3. Assign NULL to the second last element in the list.
4. END
Deletion at a Given Position
In this deletion operation of the linked list, we are deleting an element at any position of the list.
Algorithm
1. START
2. Iterate until the node at the given position in the list is found
3. Assign the next node of that node to its previous node.
4. END
Reverse Operation
This operation is a thorough one. We need to make the last node to be pointed by the head node
and reverse the whole linked list.
First, we traverse to the end of the list. It should be pointing to NULL. Now, we shall make it
point to its previous node −
We have to make sure that the last node is not lost. So we'll have some temp node,
which looks like the head node, pointing to the last node. Now, we shall make all left side nodes
point to their previous nodes one by one.
Except the first node (pointed to by the head node), all nodes should point to their
predecessor, making it their new successor. The first node will point to NULL.
We'll make the head node point to the new first node by using the temp node.
Algorithm
Step by step process to reverse a linked list is as follows −
1 START
2. We use three pointers to perform the reversing: prev, next, head.
3. For the current node (starting at the head), save its next node in next, point its next to prev,
and then advance prev and the current node.
4. Repeat step 3 for all the nodes in the list.
5. Finally, assign prev to the head.
Search Operation
Searching for an element in the list using a key element. This operation is done in the same
way as array search; comparing every element in the list with the key element given.
Algorithm
1 START
2 If the list is not empty, iteratively check if the list contains the key
3 If the key element is not present in the list, the search is unsuccessful
4 END
Traversal Operation
The traversal operation walks through all the elements of the list in an order and displays the
elements in that order.
Algorithm
1. START
2. While the list is not empty and the end of the list has not been reached, print the data in each node
3. END
Doubly Linked List Data Structure
Doubly Linked List is a variation of the linked list in which navigation is possible in both
directions, forward and backward, more easily than in a singly linked list. Following are the
important terms to understand the concept of doubly linked list.
• Link − Each link of a linked list can store a data called an element.
• Next − Each link of a linked list contains a link to the next link called Next.
• Prev − Each link of a linked list contains a link to the previous link called Prev.
• Linked List − A Linked List contains the connection link to the first link called First
and to the last link called Last.
Doubly Linked List Representation
As per the above illustration, following are the important points to be considered.
• Doubly Linked List contains a link element called first and last.
• Each link carries a data field(s) and two link fields called next and prev.
• Each link is linked with its next link using its next link.
• Each link is linked with its previous link using its prev link.
• The last link carries a link as null to mark the end of the list.
Basic Operations
Following are the basic operations supported by a list.
• Insertion − Adds an element at the beginning of the list.
• Deletion − Deletes an element at the beginning of the list.
• Insert Last − Adds an element at the end of the list.
• Delete Last − Deletes an element from the end of the list.
• Insert After − Adds an element after an item of the list.
• Delete − Deletes an element from the list using the key.
• Display forward − Displays the complete list in a forward manner.
• Display backward − Displays the complete list in a backward manner.
Insertion at the Beginning
In this operation, we create a new node with three compartments: one containing the data, the
others containing the addresses of its previous and next nodes in the list. This new node is inserted
at the beginning of the list.
Algorithm
1. START
2. Create a new node with three variables: prev, data, next.
3. Store the new data in the data variable
4. If the list is empty, make the new node as head.
5. Otherwise, link the address of the existing first node to the next variable of the new node,
and assign null to the prev variable.
6. Point the head to the new node.
7. END
Deletion at the Beginning
This deletion operation deletes the existing first node in the doubly linked list. The head is
shifted to the next node and the link is removed.
Algorithm
1. START
2. Check the status of the doubly linked list
3. If the list is empty, deletion is not possible
4. If the list is not empty, the head pointer is shifted to the next node.
5. END
Insertion at the End
In this insertion operation, the new input node is added at the end of the doubly linked list, if
the list is not empty. If the list is empty, the head will be pointed to the new node.
Algorithm
1. START
2. If the list is empty, add the node to the list and point the head to it.
3. If the list is not empty, find the last node of the list.
4. Create a link between the last node in the list and the new node.
5. The new node will point to NULL as it is the new last node.
6. END
Circular Linked List Data Structure
Circular Linked List is a variation of Linked list in which the first element points to the last
element and the last element points to the first element. Both Singly Linked List and Doubly
Linked List can be made into a circular linked list.
Singly Linked List as Circular
In singly linked list, the next pointer of the last node points to the first node.
Doubly Linked List as Circular
In doubly linked list, the next pointer of the last node points to the first node and the previous
pointer of the first node points to the last node, making the list circular in both directions.
As per the above illustration, following are the important points to be considered.
• The last link's next points to the first link of the list in both cases of singly as well as
doubly linked list.
• The first link's previous points to the last of the list in case of doubly linked list.
Basic Operations
Following are the important operations supported by a circular list.
• insert − Inserts an element at the start of the list.
• delete − Deletes an element from the start of the list.
• display − Displays the list.
Insertion Operation
In this example, the insertion operation of a circular linked list inserts the element at the start
of the list. This differs from the usual singly and doubly linked lists, as there are no particular
starting and ending points in such a list. In general, the insertion is done either at the start or
after a particular node (or a given position) in the list.
Algorithm
1. START
2. Check if the list is empty
3. If the list is empty, add the node and point the head to this node
4. If the list is not empty, link the existing head as the next node to the new node.
5. Make the new node as the new head.
6. END
Deletion Operation
The Deletion operation in a Circular linked list removes a certain node from the list. The
deletion operation in this type of lists can be done at the beginning, or a given position, or at
the ending.
Algorithm
1. START
2. If the list is empty, then the program is returned.
3. If the list is not empty, we traverse the list using a current pointer that is set to the head
pointer and create another pointer previous that points to the last node.
4. Suppose the list has only one node, the node is deleted by setting the head pointer to NULL.
5. If the list has more than one node and the first node is to be deleted, the head is set to the
next node and the previous is linked to the new head.
6. If the node to be deleted is the last node, link the preceding node of the last node to head
node.
7. If the node is neither first nor last, remove the node by linking its preceding node to its
succeeding node.
8. END
Display List Operation
The Display List operation visits every node in the list and prints them all in the output.
Algorithm
1. START
2. Walk through all the nodes of the list and print them
3. END
UNIT-4
Tree
A tree is a non-linear abstract data type with a hierarchy-based structure. It consists of nodes
(where the data is stored) that are connected via links. The tree data structure stems from a
single node called a root node and has subtrees connected to the root.
Important Terms
Following are the important terms with respect to tree.
• Path − Path refers to the sequence of nodes along the edges of a tree.
• Root − The node at the top of the tree is called root. There is only one root per tree and
one path from the root node to any node.
• Parent − Any node except the root node has one edge upward to a node called parent.
• Child − The node below a given node connected by its edge downward is called its
child node.
• Leaf − The node which does not have any child node is called the leaf node.
• Subtree − Subtree represents the descendants of a node.
• Visiting − Visiting refers to checking the value of a node when control is on the node.
• Traversing − Traversing means passing through nodes in a specific order.
• Levels − Level of a node represents the generation of a node. If the root node is at level
0, then its next child node is at level 1, its grandchild is at level 2, and so on.
• Keys − Key represents a value of a node based on which a search operation is to be
carried out for a node.
Types of Trees
There are three types of trees −
• General Trees
• Binary Trees
• Binary Search Trees
General Trees
General trees are unordered tree data structures where the root node can have a minimum of 0
or a maximum of n subtrees.
The General trees have no constraint placed on their hierarchy. The root node thus acts like the
superset of all the other subtrees.
Binary Trees
Binary Trees are general trees in which the root node can only hold up to maximum 2 subtrees:
left subtree and right subtree. Based on the number of children, binary trees are divided into
three types.
Full Binary Tree
• A full binary tree is a binary tree type where every node has either 0 or 2 child nodes.
Complete Binary Tree
• A complete binary tree is a binary tree type where all the leaf nodes must be on the
same level. However, root and internal nodes in a complete binary tree can either have
0, 1 or 2 child nodes.
Perfect Binary Tree
• A perfect binary tree is a binary tree type where all the leaf nodes are on the same level
and every node except the leaf nodes has 2 children.
Binary Search Trees
Binary Search Trees possess all the properties of Binary Trees including some extra properties
of their own, based on some constraints, making them more efficient than binary trees.
The data in the Binary Search Trees (BST) is always stored in such a way that the values in the
left subtree are always less than the values in the root node and the values in the right subtree
are always greater than the values in the root node, i.e. left subtree < root node < right subtree.
Advantages of BST
• Binary Search Trees are more efficient than Binary Trees since time complexity for
performing various operations reduces.
• Since the order of keys is based on just the parent node, searching operation becomes
simpler.
• The alignment of BST also favors Range Queries, which are executed to find values
existing between two keys. This helps in the Database Management System.
Disadvantages of BST
The main disadvantage of Binary Search Trees is that if all elements in nodes are either greater
than or lesser than the root node, the tree becomes skewed. Simply put, the tree becomes slanted
to one side completely.
This skewness will make the tree a linked list rather than a BST, since the worst case time
complexity for searching operation becomes O(n).
To overcome this issue of skewness in the Binary Search Trees, the concept of Balanced Binary
Search Trees was introduced.
Balanced Binary Search Trees
Consider a Binary Search Tree with m as the height of the left subtree and n as the height of
the right subtree. If the value of (m-n) is equal to 0,1 or -1, the tree is said to be a Balanced
Binary Search Tree.
The trees are designed in a way that they self-balance once the height difference exceeds
1. Binary Search Trees use rotations as self-balancing algorithms. There are four different types
of rotations: Left Left, Right Right, Left Right, Right Left.
There are various types of self-balancing binary search trees −
• AVL Trees
• Red Black Trees
• B Trees
• B+ Trees
• Splay Trees
• Priority Search Trees
Tree Traversal
Traversal is a process to visit all the nodes of a tree and may print their values too. Because all
nodes are connected via edges (links), we always start from the root (head) node. That is, we
cannot randomly access a node in a tree. There are three ways in which we traverse a tree −
• In-order Traversal
• Pre-order Traversal
• Post-order Traversal
Generally, we traverse a tree to search or locate a given item or key in the tree or to print all
the values it contains.
In-order Traversal
In this traversal method, the left subtree is visited first, then the root and later the right sub-
tree. We should always remember that every node may represent a subtree itself.
If a binary tree is traversed in-order, the output will produce sorted key values in an ascending
order.
We start from A, and following in-order traversal, we move to its left subtree B. B is also
traversed in-order. The process goes on until all the nodes are visited. The output of in-order
traversal of this tree will be −
D → B → E → A → F → C → G
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and finally the right
subtree.
We start from A, and following pre-order traversal, we first visit A itself and then move to its
left subtree B. B is also traversed pre-order. The process goes on until all the nodes are visited.
The output of pre-order traversal of this tree will be −
A → B → D → E → C → F → G
Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.
Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we traverse the left
subtree, then the right subtree and finally the root node.
We start from A, and following post-order traversal, we first visit the left subtree B. B is also
traversed post-order. The process goes on until all the nodes are visited. The output of post-
order traversal of this tree will be
D → E → B → F → G → C → A
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.
Implementation
We shall now see the implementation of tree traversal in C programming language here using
the following binary tree −
Output
Preorder traversal: 27 14 10 19 35 31 42
Inorder traversal: 10 14 19 27 31 35 42
Post order traversal: 10 19 14 31 42 35 27
Binary Search Tree
A Binary Search Tree (BST) is a tree in which all the nodes follow the below-mentioned
properties −
• The left sub-tree of a node has a key less than or equal to its parent node's key.
• The right sub-tree of a node has a key greater than or equal to its parent node's key.
Thus, BST divides all its sub-trees into two segments; the left sub-tree and the right sub-tree
and can be defined as −
left_subtree (keys) ≤ node (key) ≤ right_subtree (keys)
Representation
BST is a collection of nodes arranged in a way where they maintain BST properties. Each node
has a key and an associated value. While searching, the desired key is compared to the keys in
BST and if found, the associated value is retrieved.
Following is a pictorial representation of BST −
We observe that the root node key (27) has all less-valued keys on the left sub-tree and the
higher valued keys on the right sub-tree.
Basic Operations
Following are the basic operations of a tree −
• Search − Searches an element in a tree.
• Insert − Inserts an element in a tree.
• Pre-order Traversal − Traverses a tree in a pre-order manner.
• In-order Traversal − Traverses a tree in an in-order manner.
• Post-order Traversal − Traverses a tree in a post-order manner.
Defining a Node
Define a node that stores some data, and references to its left and right child nodes.
struct node {
int data;
struct node *leftChild;
struct node *rightChild;
};
Search Operation
Whenever an element is to be searched, start searching from the root node. Then if the data is
less than the key value, search for the element in the left subtree. Otherwise, search for the
element in the right subtree. Follow the same algorithm for each node.
Algorithm
1. START
2. Check whether the tree is empty or not
3. If the tree is empty, search is not possible
4. Otherwise, first search the root of the tree.
5. If the key does not match with the value in the root, search its subtrees.
6. If the value of the key is less than the root value, search the left subtree
7. If the value of the key is greater than the root value, search the right subtree.
8. If the key is not found in the tree, return unsuccessful search.
9. END
Insert Operation
Whenever an element is to be inserted, first locate its proper location. Start searching from the
root node, then if the data is less than the key value, search for the empty location in the left
subtree and insert the data. Otherwise, search for the empty location in the right subtree and
insert the data.
Algorithm
1 START
2 If the tree is empty, insert the first element as the root node of the tree. The following elements
are added as the leaf nodes.
3 If an element is less than the root value, it is added into the left subtree as a leaf node.
4 If an element is greater than the root value, it is added into the right subtree as a leaf node.
5 The final leaf nodes of the tree point to NULL values as their child nodes.
6 END
Inorder Traversal
The inorder traversal operation in a Binary Search Tree visits all its nodes in the following
order −
• Firstly, we traverse the left child of the root node/current node, if any.
• Next, traverse the current node.
• Lastly, traverse the right child of the current node, if any.
Algorithm
1. START
2. Traverse the left subtree, recursively
3. Then, traverse the root node
4. Traverse the right subtree, recursively.
5. END
Preorder Traversal
The preorder traversal operation in a Binary Search Tree visits all its nodes. However, the root
node in it is first printed, followed by its left subtree and then its right subtree.
Algorithm
1. START
2. Traverse the root node first.
3. Then traverse the left subtree, recursively
4. Later, traverse the right subtree, recursively.
5. END
Postorder Traversal
Like the other traversals, postorder traversal also visits all the nodes in a Binary Search Tree
and displays them. However, the left subtree is printed first, followed by the right subtree and
lastly, the root node.
Algorithm
1. START
2. Traverse the left subtree, recursively
3. Traverse the right subtree, recursively.
4. Then, traverse the root node
5. END
AVL Trees
The first type of self-balancing binary search tree to be invented is the AVL tree. The name
AVL tree is coined after its inventors' names − Adelson-Velsky and Landis.
In AVL trees, the difference between the heights of the left and right subtrees, known as the
Balance Factor, must be at most one. Once the difference exceeds one, the tree automatically
executes the balancing algorithm until the difference is at most one again.
BALANCE FACTOR = HEIGHT(LEFT SUBTREE) - HEIGHT(RIGHT SUBTREE)
There are usually four cases of rotation in the balancing algorithm of AVL trees: LL, RR, LR,
RL.
LL Rotations
LL rotation is performed when the node is inserted into the right subtree leading to an
unbalanced tree. This is a single left rotation to make the tree balanced again −
Fig : LL Rotation
The node where the unbalance occurs becomes the left child and the newly added node
becomes the right child with the middle node as the parent node.
RR Rotations
RR rotation is performed when the node is inserted into the left subtree leading to an
unbalanced tree. This is a single right rotation to make the tree balanced again −
Fig : RR Rotation
The node where the unbalance occurs becomes the right child and the newly added node
becomes the left child with the middle node as the parent node.
LR Rotations
LR rotation is the extended version of the previous single rotations, also called a double
rotation. It is performed when a node is inserted into the right subtree of the left subtree. The
LR rotation is a combination of the left rotation followed by the right rotation. There are
multiple steps to be followed to carry this out.
• Consider an example with A as the root node, B as the left child of A and C as the right
child of B.
• Since the unbalance occurs at A, a left rotation is applied on the child nodes of A, i.e.
B and C.
• After the rotation, the C node becomes the left child of A and B becomes the left child
of C.
• The unbalance still persists, therefore a right rotation is applied at the root node A and
the left child C.
• After the final right rotation, C becomes the root node, A becomes the right child and
B is the left child.
Fig : LR Rotation
RL Rotations
RL rotation is also the extended version of the previous single rotations, hence it is called a
double rotation and it is performed if a node is inserted into the left subtree of the right subtree.
The RL rotation is a combination of the right rotation followed by the left rotation. There are
multiple steps to be followed to carry this out.
• Consider an example with A as the root node, B as the right child of A and C as the left
child of B.
• Since the unbalance occurs at A, a right rotation is applied on the child nodes of A, i.e.
B and C.
• After the rotation, the C node becomes the right child of A and B becomes the right
child of C.
• The unbalance still persists, therefore a left rotation is applied at the root node A and
the right child C.
• After the final left rotation, C becomes the root node, A becomes the left child and B is
the right child.
Fig : RL Rotation
Basic Operations of AVL Trees
The basic operations performed on the AVL Tree structures include all the operations
performed on a binary search tree, since the AVL Tree at its core is actually just a binary search
tree holding all its properties. Therefore, basic operations performed on an AVL Tree are
− Insertion and Deletion.
Insertion
The data is inserted into the AVL Tree by following the Binary Search Tree property of
insertion, i.e. the left subtree must contain elements less than the root value and right subtree
must contain all the greater elements. However, in AVL Trees, after the insertion of each
element, the balance factor of the tree is checked; if its absolute value does not exceed 1, the
tree is left as it is. But if the balance factor exceeds 1, a balancing algorithm is applied to
readjust the tree such that the balance factor becomes less than or equal to 1 again.
Algorithm
The following steps are involved in performing the insertion operation of an AVL Tree −
Step 1 − Create a node
Step 2 − Check if the tree is empty
Step 3 − If the tree is empty, the new node created will become the root node of the AVL Tree.
Step 4 − If the tree is not empty, we perform the Binary Search Tree insertion operation and
check the balancing factor of the node in the tree.
Step 5 − Suppose the balancing factor exceeds 1, we apply suitable rotations on the said node
and resume the insertion from Step 4.
Insertion Example
Let us understand the insertion operation by constructing an example AVL tree with 1 to 7
integers.
Starting with the first element 1, we create a node and measure the balance, i.e., 0.
Since both the binary search property and the balance factor are satisfied, we insert another
element into the tree.
The balance factor for the two nodes is calculated and is found to be -1 (the height of the left
subtree is 0 and the height of the right subtree is 1). Since it does not exceed 1, we add another
element to the tree.
Now, after adding the third element, the balance factor exceeds 1 and becomes 2. Therefore,
rotations are applied. In this case, the RR rotation is applied since the imbalance occurs at two
right nodes.
The tree is rearranged as −
Similarly, the next elements are inserted and rearranged using these rotations. After
rearrangement, we achieve the tree as −
Deletion
Deletion in the AVL Trees take place in three different scenarios −
• Scenario 1 (Deletion of a leaf node) − If the node to be deleted is a leaf node, then it
is deleted without any replacement as it does not disturb the binary search tree property.
However, the balance factor may get disturbed, so rotations are applied to restore it.
• Scenario 2 (Deletion of a node with one child) − If the node to be deleted has one
child, replace the value in that node with the value in its child node. Then delete the
child node. If the balance factor is disturbed, rotations are applied.
• Scenario 3 (Deletion of a node with two child nodes) − If the node to be deleted has
two child nodes, find the inorder successor of that node and replace its value with the
inorder successor value. Then try to delete the inorder successor node. If the balance
factor exceeds 1 after deletion, apply balance algorithms.
Deletion Example
Using the same tree given above, let us perform deletion in three scenarios −
• Deleting element 7 from the tree above −
Since the element 7 is a leaf, we normally remove the element without disturbing any other
node in the tree
• Deleting element 6 from the output tree achieved −
However, element 6 is not a leaf node and has one child node attached to it. In this case, we
replace node 6 with its child node: node 5.
The balance of the tree becomes 1, and since it does not exceed 1, the tree is left as it is. If we
were to delete the element 5 as well, we would have to apply a left rotation, either LL or LR,
since the imbalance occurs at both 1-2-4 and 3-2-4.
The balance factor is disturbed after deleting the element 5, therefore we apply LL rotation (we
can also apply the LR rotation here).
Once the LL rotation is applied on path 1-2-4, the node 3 remains as it was supposed to be the
right child of node 2 (which is now occupied by node 4). Hence, the node is added to the right
subtree of the node 2 and as the left child of the node 4.
• Deleting element 2 from the remaining tree −
As mentioned in scenario 3, this node has two children. Therefore, we find its inorder successor,
which is a leaf node (say, 3), and replace its value with that of the inorder successor.
The balance of the tree still remains 1, therefore we leave the tree as it is without performing
any rotations.
Heap
Heap is a special case of balanced binary tree data structure where the root-node key is
compared with its children and arranged accordingly. If α has child node β then −
key(α) ≥ key(β)
As the value of the parent is greater than that of the child, this property generates a Max Heap.
Based on this criterion, a heap can be of two types −
For Input → 35 33 42 10 14 19 27 44 26 31
Min-Heap − Where the value of the root node is less than or equal to either of its children.
Max-Heap − Where the value of the root node is greater than or equal to either of its children.
Both trees are constructed using the same input and order of arrival.
Max Heap Construction Algorithm
We shall use the same example to demonstrate how a Max Heap is created. The procedure to
create a Min Heap is similar, but we go for min values instead of max values.
We are going to derive an algorithm for max heap by inserting one element at a time. At any
point of time, the heap must maintain its property. During insertion, we also assume that we
are inserting a node into an already heapified tree.
Step 1 − Create a new node at the end of heap.
Step 2 − Assign new value to the node.
Step 3 − Compare the value of this child node with its parent.
Step 4 − If value of parent is less than child, then swap them.
Step 5 − Repeat step 3 & 4 until Heap property holds.
Note − In Min Heap construction algorithm, we expect the value of the parent node to be less
than that of the child node.
Let's understand Max Heap construction by an animated illustration. We consider the same
input sample that we used earlier.
Max Heap Deletion Algorithm
Let us derive an algorithm to delete from max heap. Deletion in Max (or Min) Heap always
happens at the root to remove the Maximum (or minimum) value.
Step 1 − Remove root node.
Step 2 − Move the last element of last level to root.
Step 3 − Compare the value of this child node with its parent.
Step 4 − If value of parent is less than child, then swap them.
Step 5 − Repeat step 3 & 4 until Heap property holds.
// Deallocate memory occupied by the heap
void destroyHeap(Heap *heap)
{
    free(heap->array);
    free(heap);
}
Recursion Algorithms
Some computer programming languages allow a module or function to call itself. This
technique is known as recursion. In recursion, a function α either calls itself directly or calls a
function β that in turn calls the original function α. The function α is called a recursive function.
Properties
A recursive function can run infinitely, like a loop. To avoid infinite running of a recursive
function, there are two properties that a recursive function must have −
• Base criteria − There must be at least one base criteria or condition, such that, when
this condition is met the function stops calling itself recursively.
• Progressive approach − The recursive calls should progress in such a way that each time
a recursive call is made it comes closer to the base criteria.
Implementation
Many programming languages implement recursion by means of stacks. Generally, whenever
a function (caller) calls another function (callee) or itself as callee, the caller function transfers
execution control to the callee. This transfer process may also involve some data to be passed
from the caller to the callee.
This implies, the caller function has to suspend its execution temporarily and resume later when
the execution control returns from the callee function. Here, the caller function needs to start
exactly from the point of execution where it puts itself on hold. It also needs the exact same
data values it was working on. For this purpose, an activation record (or stack frame) is created
for the caller function.
This activation record keeps the information about local variables, formal parameters, return
address and all information passed to the caller function.
Analysis of Recursion
One may argue why to use recursion, as the same task can be done with iteration. The first
reason is that recursion makes a program more readable. However, because of the overhead of
function calls, recursion is generally no more efficient than iteration; its main benefit is clarity.
Time Complexity
In case of iterations, we take number of iterations to count the time complexity. Likewise, in
case of recursion, assuming everything is constant, we try to figure out the number of times a
recursive call is being made. A call made to a function is Ο(1); hence, making a recursive call
n times makes the recursive function Ο(n).
Space Complexity
Space complexity is counted as what amount of extra space is required for a module to execute.
In case of iterations, the compiler hardly requires any extra space. The compiler keeps updating
the values of variables used in the iterations. But in case of recursion, the system needs to store
activation record each time a recursive call is made. Hence, it is considered that space
complexity of recursive function may go higher than that of a function with iteration.
UNIT-5
Searching is a process of finding a particular record, which can be a single element or a small
chunk, within a huge amount of data. The data can be in various forms: arrays, linked lists,
trees, heaps, and graphs etc. With the increasing amount of data nowadays, there are multiple
techniques to perform the searching operation.
Searching Algorithms in Data Structures
Various searching techniques can be applied on the data structures to retrieve certain data. A
search operation is said to be successful only if it returns the desired element or data; otherwise,
the searching method is unsuccessful.
There are two categories these searching techniques fall into. They are −
• Sequential Searching
• Interval Searching
Sequential Searching
As the name suggests, the sequential searching operation traverses through each element of the
data sequentially to look for the desired data. The data need not be in a sorted manner for this
type of search.
Example − Linear Search
Fig. 1: Linear Search Operation
Interval Searching
Unlike sequential searching, the interval searching operation requires the data to be in a sorted
manner. This method usually searches the data in intervals; it could be done by either dividing
the data into multiple sub-parts or jumping through the indices to search for an element.
Example − Binary Search, Jump Search etc.
Fig. 2: Binary Search Operation
Asymptotic Notations
Execution time of an algorithm depends on the instruction set, processor speed, disk I/O speed,
etc. Hence, we estimate the efficiency of an algorithm asymptotically.
Time function of an algorithm is represented by T(n), where n is the input size.
Different types of asymptotic notations are used to represent the complexity of an algorithm.
Following asymptotic notations are used to calculate the running time complexity of an
algorithm.
• O − Big Oh Notation
• Ω − Big omega Notation
• θ − Big theta Notation
• o − Little Oh Notation
• ω − Little omega Notation
Big Oh, O: Asymptotic Upper Bound
The notation O(n) is the formal way to express the upper bound of an algorithm's running time.
It is the most commonly used notation. It measures the worst-case time complexity, i.e. the
longest amount of time an algorithm can possibly take to complete.
A function f(n) can be represented as the order of g(n), written f(n) = O(g(n)), if there exist a
positive integer n0 and a positive constant c such that
f(n) ⩽ c·g(n) for all n > n0
Hence, g(n) is an upper bound for f(n): beyond n0, f(n) never grows faster than c·g(n).
Big Omega, Ω: Asymptotic Lower Bound
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running
time. It measures the best-case time complexity, i.e. the least amount of time an algorithm can
possibly take to complete.

We say that f(n) = Ω(g(n)) when there exists a positive constant c such that f(n) ⩾ c·g(n) for
all sufficiently large values of n (n a positive integer). It means function g is a lower bound
for function f; after a certain value of n, f will never go below c·g.
Theta, θ: Asymptotic Tight Bound
The notation θ(n) is the formal way to express both the lower bound and the upper bound of an
algorithm's running time. Some may confuse theta notation with the average-case time
complexity; while big theta can often describe the average case quite accurately, other
notations could be used as well.

We say that f(n) = θ(g(n)) when there exist positive constants c1 and c2 such that
c1·g(n) ⩽ f(n) ⩽ c2·g(n) for all sufficiently large values of n (n a positive integer).

This means function g is a tight bound for function f.
Little Oh, o
The asymptotic upper bound provided by O-notation may or may not be asymptotically
tight. The bound 2n² = O(n²) is asymptotically tight, but the bound 2n = O(n²) is not.

We use o-notation to denote an upper bound that is not asymptotically tight.

We formally define f(n) = o(g(n)) (little-oh of g of n) as: for every positive constant c > 0
there exists a value n0 > 0 such that 0 ⩽ f(n) < c·g(n) for all n > n0.

Intuitively, in the o-notation, the function f(n) becomes insignificant relative
to g(n) as n approaches infinity; that is,
lim(n→∞) f(n)/g(n) = 0
Little Omega, ω
We use ω-notation to denote a lower bound that is not asymptotically tight. Formally,
however, we define f(n) = ω(g(n)) (little-omega of g of n) as: for every positive
constant c > 0 there exists a value n0 > 0 such that 0 ⩽ c·g(n) < f(n) for all n > n0.
For example, n²/2 = ω(n), but n²/2 ≠ ω(n²). The
relation f(n) = ω(g(n)) implies that the following limit exists:
lim(n→∞) f(n)/g(n) = ∞
That is, f(n) becomes arbitrarily large relative to g(n) as n approaches infinity.
Common Asymptotic Notations
Following is a list of some common asymptotic notations −
constant − O(1)
logarithmic − O(log n)
linear − O(n)
n log n − O(n log n)
quadratic − O(n²)
cubic − O(n³)
polynomial − n^O(1)
exponential − 2^O(n)
Linear search algorithm
Linear search algorithm is a type of sequential searching algorithm. In this method, every
element within the input array is traversed and compared with the key element to be found. If
a match is found in the array, the search is said to be successful; if there is no match, the
search is unsuccessful and exhibits the worst-case time complexity.
For instance, suppose we are searching for the element 33 in the array of Fig. 1. The linear
search method looks for it sequentially from the very first element until it finds a match,
which results in a successful search.

In the same array, if we have to search for the element 46, the search is unsuccessful, since
46 is not present in the input.
Linear Search Algorithm
The algorithm for linear search is relatively simple. The procedure starts at the very first index
of the input array to be searched.
Step 1 − Start from the 0th index of the input array, compare the key value with the value
present in the 0th index.
Step 2 − If the value matches with the key, return the position at which the value was found.
Step 3 − If the value does not match with the key, compare the next element in the array.
Step 4 − Repeat Step 3 until there is a match found. Return the position at which the match
was found.
Step 5 − If it is an unsuccessful search, print that the element is not present in the array and
exit the program.
Pseudocode
procedure linear_search (list, value)
for each item in the list
if match item == value
return the item's location
end if
end for
end procedure
Time and Space Complexity of Linear Search Algorithm:
Time Complexity:
• Best Case: In the best case, the key might be present at the first index. So the best case
complexity is O(1)
• Worst Case: In the worst case, the key might be present at the last index i.e., opposite
to the end from which the search has started in the list. So the worst-case complexity is
O(N) where N is the size of the list.
• Average Case: O(N)
Auxiliary Space: O(1) as except for the variable to iterate through the list, no other variable is
used.
Applications of Linear Search Algorithm:
• Unsorted Lists: When we have an unsorted array or list, linear search is most
commonly used to find any element in the collection.
• Small Data Sets: Linear search is preferred over binary search when we have small
data sets, where its simplicity outweighs the asymptotic advantage of binary search.
• Searching Linked Lists: In linked list implementations, linear search is commonly
used to find elements within the list. Each node is checked sequentially until the desired
element is found.
• Simple Implementation: Linear Search is much easier to understand and implement
as compared to Binary Search or Ternary Search.
Advantages:
• Linear search can be used irrespective of whether the array is sorted or not. It can be
used on arrays of any data type.
• Does not require any additional memory.
• It is a well-suited algorithm for small datasets.
Disadvantages:
• Linear search has a time complexity of O(N), which in turn makes it slow for large
datasets.
• Not suitable for large arrays.
Binary search algorithm

Binary search is a fast search algorithm with run-time complexity of O(log n). This search
algorithm works on the principle of divide and conquer, since it divides the array into halves
before searching. For this algorithm to work properly, the data collection should be in sorted
form.
Binary search looks for a particular key value by comparing the middle-most item of the
collection. If a match occurs, the index of the item is returned. If the middle item has a
value greater than the key value, the left sub-array of the middle item is searched; otherwise,
the right sub-array is searched. This process continues recursively until the size of a subarray
reduces to zero.
Binary Search Algorithm
Binary Search algorithm is an interval searching method that performs the searching in
intervals only. The input taken by the binary search algorithm must always be in a sorted array
since it divides the array into subarrays based on the greater or lower values. The algorithm
follows the procedure below −
Step 1 − Select the middle item in the array and compare it with the key value to be searched.
If it matches, return the position of the middle element.
Step 2 − If it does not match the key value, check whether the key value is greater than or less
than the middle value.
Step 3 − If the key is greater, perform the search in the right sub-array; but if the key is lower
than the middle value, perform the search in the left sub-array.
Step 4 − Repeat Steps 1, 2 and 3 iteratively until the sub-array is exhausted or the key is found.
Step 5 − If the key value does not exist in the array, the algorithm reports an unsuccessful
search.
Pseudocode
The pseudocode of binary search algorithms should look like this −
Procedure binary_search
   A ← sorted array
   n ← size of array
   x ← value to be searched

   Set lowerBound = 1
   Set upperBound = n

   while x not found
      if upperBound < lowerBound
         EXIT: x does not exist

      set midPoint = lowerBound + ( upperBound - lowerBound ) / 2

      if A[midPoint] < x
         set lowerBound = midPoint + 1

      if A[midPoint] > x
         set upperBound = midPoint - 1

      if A[midPoint] = x
         EXIT: x found at location midPoint
   end while
end procedure
Time complexity of Binary Search
Time complexity of Binary Search is O(log n), where n is the number of elements in the array. It
divides the array in half at each step. Space complexity is O(1) as it uses a constant amount of extra
space.
Complexity of Binary Search Algorithm

Aspect              Complexity
Time Complexity     O(log n)
Space Complexity    O(1)

The time and space complexities of the binary search algorithm are discussed in detail below.
Time Complexity of Binary Search Algorithm:
Best Case Time Complexity of Binary Search Algorithm: O(1)
Best case is when the element is at the middle index of the array. It takes only one comparison to find
the target element. So the best case complexity is O(1).
Average Case Time Complexity of Binary Search Algorithm: O(log N)
Consider array arr[] of length N and element X to be found. There can be two cases:
• Case1: Element is present in the array
• Case2: Element is not present in the array.
There are N possibilities for Case 1 and 1 possibility for Case 2, so the total number of cases
= N+1. Now notice the following:
• The element at index N/2 can be found in 1 comparison.
• Elements at indices N/4 and 3N/4 can be found in 2 comparisons.
• Elements at indices N/8, 3N/8, 5N/8 and 7N/8 can be found in 3 comparisons, and so on.
Based on this we can conclude that the number of elements requiring:
• 1 comparison = 1
• 2 comparisons = 2
• 3 comparisons = 4
• x comparisons = 2^(x−1), where x belongs to the range [1, log N], because the maximum
number of comparisons = the maximum number of times N can be halved = log N.
So, total comparisons
= 1*(elements requiring 1 comparison) + 2*(elements requiring 2 comparisons) + . . . +
log N*(elements requiring log N comparisons)
= 1*1 + 2*2 + 3*4 + . . . + log N * 2^(log N − 1)
= 2^(log N) * (log N − 1) + 1
= N * (log N − 1) + 1
Total number of cases = N+1.
Therefore, the average complexity = (N*(log N − 1) + 1) / (N+1). Here the dominant term is
N*log N / (N+1), which is approximately log N. So the average case complexity is O(log N).
Worst Case Time Complexity of Binary Search Algorithm: O(log N)
The worst case occurs when the element is at the very first or last position, or is not present
in the array at all. As seen in the average case, the number of comparisons required to reach
such an element is log N. So the time complexity for the worst case is O(log N).
Hashing
Hashing refers to the process of generating a small sized output (that can be used as index in a table)
from an input of typically large and variable size. Hashing uses mathematical formulas known as hash
functions to do the transformation. This technique determines an index or location for the storage of
an item in a data structure called Hash Table.
Components of Hashing
There are majorly three components of hashing:
1. Key: A key can be anything, a string or an integer, that is fed as input to the hash
function, the technique that determines an index or location for storing an item in a data structure.
2. Hash Function: Receives the input key and returns the index of an element in an array called
a hash table. The index is known as the hash index.
3. Hash Table: Hash table is typically an array of lists. It stores values corresponding to the
keys. Hash stores the data in an associative manner in an array where each data value has its
own unique index.
Hash Functions and Types of Hash functions
Hash functions are a fundamental concept in computer science and play a crucial role in various
applications such as data storage, retrieval, and cryptography. A hash function creates a mapping from
an input key to an index in hash table. Below are few examples.
• Phone numbers as input keys : Consider a hash table of size 100. A simple example hash
function is to consider the last two digits of phone numbers so that we have valid hash table
indexes as output. This is mainly taking remainder when input phone number is divided by
100.
• Lowercase English Strings as Keys : Consider a hash table of size 100. A simple way to
hash the strings would be to add their letter codes (1 for a, 2 for b, ... 26 for z) and take the
remainder of the sum when divided by 100. This hash function may not be a good idea, as the
strings "ad" and "bc" would have the same hash value.
Key Properties of Hash Functions
• Deterministic: A hash function must consistently produce the same output for the same input.
• Fixed Output Size: The output of a hash function should have a fixed size, regardless of the
size of the input.
• Efficiency: The hash function should be able to process input quickly.
• Uniformity: The hash function should distribute the hash values uniformly across the output
space to avoid clustering.
• Pre-image Resistance: It should be computationally infeasible to reverse the hash function,
i.e., to find the original input given a hash value.
• Collision Resistance: It should be difficult to find two different inputs that produce the same
hash value.
• Avalanche Effect: A small change in the input should produce a significantly different hash
value.
Applications of Hash Functions
• Hash Tables: The most common use of hash functions in DSA is in hash tables, which
provide an efficient way to store and retrieve data.
• Data Integrity: Hash functions are used to ensure the integrity of data by generating
checksums.
• Cryptography: In cryptographic applications, hash functions are used to create secure hash
algorithms like SHA-256.
• Data Structures: Hash functions are utilized in various data structures such as Bloom filters
and hash sets.
Types of Hash Functions
There are many hash functions that use numeric or alphanumeric keys. This section discusses
four common ones:
1. Division Method.
2. Multiplication Method
3. Mid-Square Method
4. Folding Method
1. Division Method
The division method involves dividing the key by a prime number and using the remainder as the hash
value.
h(k)=k mod m
where k is the key and m is a prime number.
Advantages:
• Simple to implement.
• Works well when m is a prime number.
Disadvantages:
• Poor distribution if m is not chosen wisely.
2. Multiplication Method
In the multiplication method, a constant A (0 < A < 1) is used to multiply the key. The fractional part
of the product is then multiplied by m to get the hash value.
h(k) = ⌊m · (kA mod 1)⌋
where ⌊ ⌋ denotes the floor function and (kA mod 1) is the fractional part of kA.
Advantages:
• Less sensitive to the choice of m.
Disadvantages:
• More complex than the division method.
3. Mid-Square Method
In the mid-square method, the key is squared, and the middle digits of the result are taken as the hash
value.
Steps:
1. Square the key.
2. Extract the middle digits of the squared value.
Advantages:
• Produces a good distribution of hash values.
Disadvantages:
• May require more computational effort.
4. Folding Method
The folding method involves dividing the key into equal parts, summing the parts, and then taking the
result modulo m.
Steps:
1. Divide the key into parts.
2. Sum the parts.
3. Take the sum modulo m.
Advantages:
• Simple and easy to implement.
Disadvantages:
• Depends on the choice of partitioning scheme.
Collision Resolution Techniques
In Hashing, hash functions were used to generate hash values. The hash value is used to create an
index for the keys in the hash table. The hash function may return the same hash value for two or
more keys. When two or more keys have the same hash value, a collision happens. To handle this
collision, we use Collision Resolution Techniques.
There are mainly two methods to handle collision:
1. Separate Chaining
2. Open Addressing
1) Separate Chaining
The idea behind Separate Chaining is to make each cell of the hash table point to a linked list of
records that have the same hash function value. Chaining is simple but requires additional memory
outside the table.
Example: We have given a hash function and we have to insert some elements in the hash table using
a separate chaining method for collision resolution technique.
2) Open Addressing
In open addressing, all elements are stored in the hash table itself. Each table entry contains either a
record or NIL. When searching for an element, we examine the table slots one by one until the desired
element is found or it is clear that the element is not in the table.
2.a) Linear Probing
In linear probing, the hash table is searched sequentially, starting from the original hash
location. If the location we get is already occupied, we check the next location.
Algorithm:
1. Calculate the hash key. i.e. key = data % size
2. Check, if hashTable[key] is empty
• store the value directly by hashTable[key] = data
3. If the hash index already has some value then
• check for next index using key = (key+1) % size
4. If the next index hashTable[key] is available, store the value there; otherwise try the
next index.
5. Repeat the above process until a free slot is found.
Example: Let us consider a simple hash function as “key mod 5” and a sequence of keys that are to
be inserted are 50, 70, 76, 85, 93.
2.b) Quadratic Probing
Quadratic probing is an open addressing scheme in computer programming for resolving hash
collisions in hash tables. Quadratic probing operates by taking the original hash index and adding
successive values of an arbitrary quadratic polynomial until an open slot is found.
An example probe sequence using quadratic probing is:
H + 1², H + 2², H + 3², H + 4², ..., H + k²
In this method we look for the i²-th probe (slot) in the i-th iteration, where i = 0, 1, . . . n − 1.
We always start from the original hash location; if that location is occupied, we check the
subsequent probe slots.
Let hash(x) be the slot index computed using the hash function and n be the size of the hash table.
If the slot hash(x) % n is full, then we try (hash(x) + 1²) % n.
If (hash(x) + 1²) % n is also full, then we try (hash(x) + 2²) % n.
If (hash(x) + 2²) % n is also full, then we try (hash(x) + 3²) % n.
This process will be repeated for all the values of i until an empty slot is found
Example: Let us consider table size = 7, hash function Hash(x) = x % 7, and collision resolution
strategy f(i) = i². Insert 22, 30, and 50.
2.c) Double Hashing
Double hashing is a collision resolution technique used in open-addressed hash tables. Double hashing
makes use of two hash functions:
• The first hash function is h1(k), which takes the key and gives a location on the hash table.
If that location is empty, we can simply place our key there.
• But in case the location is occupied (a collision), we use the secondary hash function h2(k)
in combination with the first hash function h1(k) to find a new location on the hash table.
This combination of hash functions is of the form
h(k, i) = (h1(k) + i * h2(k)) % n
where
• i is a non-negative integer that indicates a collision number,
• k = element/key which is being hashed
• n = hash table size.
Complexity of the double hashing algorithm:
Time complexity: O(n) in the worst case, when many probes are required.