Data Structures Full Notes
UNIT I
Data structure is a particular way of organizing data in a computer so that it
can be used effectively.
Data Structures are widely used in almost every aspect of Computer Science i.e.
Operating System, Compiler Design, Artificial intelligence, Graphics and many
more.
Data Structures are the main part of many computer science algorithms as they
enable the programmers to handle the data in an efficient way. It plays a vital
role in enhancing the performance of a software or a program, as the main
function of the software is to store and retrieve the user's data as fast as possible.
Commonly used data structures include:
1. Arrays
2. Linked Lists
3. Stacks
A stack is a LIFO (Last In First Out — the element placed last can be
accessed first) structure which can be commonly found in many
programming languages. This structure is named "stack" because it
resembles a real-world stack — a stack of plates.
4. Queues
A queue is a FIFO (First In First Out — the element placed first can be
accessed first) structure which can be commonly found in many
programming languages. This structure is named "queue" because it
resembles a real-world queue — people waiting in a queue.
5. Trees
6. Graphs
Group Items: Data items which have subordinate data items are called Group
item, for example, name of a student can have first name and the last name.
Record: Record can be defined as the collection of various data items, for
example, if we talk about the student entity, then his name, address, course and
marks can be grouped together to form the record for the student.
File: A File is a collection of various records of one type of entity, for example,
if there are 60 employees in a Bank, then there will be 60 records in the related
file where each record contains the data about each employee.
Data Type:
- For example, an integer data type describes every integer that the computer can handle.
- Values can directly be assigned to the data type variables.
Data Structure:
- For example, tree-type data structures often allow for efficient searching algorithms.
- The data is assigned to the data structure object using some set of algorithms and operations like push, pop and so on.
Primitive data structures are the structures which are supported at the machine
level; they can be used to build non-primitive data structures. These are integral
and pure in form. Non-primitive data structures, although also provided by the
system itself, are derived data structures and cannot be formed without using the
primitive data structures.
The Non-primitive data structures are further divided into the following
categories:
A data structure is called linear if its elements are arranged in a linear way,
where each element is attached to its previous and next adjacent elements.
Therefore, we can traverse all the elements in a single run. Its examples are
array, stack, queue and linked list.
1. Traversing- It is used to access each data item exactly once so that it can
be processed.
2. Searching- It is used to find out the location of the data item if it exists in
the given collection of data items.
3. Inserting- It is used to add a new data item in the given collection of data
items.
4. Deleting- It is used to delete an existing data item from the given collection
of data items.
5. Sorting- It is used to arrange the data items in some order, i.e. in ascending
or descending order in case of numerical data and in dictionary order in case
of alphanumeric data.
6. Merging- It is used to combine the data items of two sorted files into a
single file of sorted data items.
Some applications of data structures:
For calculating the mean, median, average or count from a group of data objects, arrays are
very efficient.
Queues are used for CPU job scheduling and in disk scheduling.
Stacks are used for balancing parentheses, and linked lists are used in Symbol Tables.
In a file system, files and folders act as tree nodes. The tree structure is useful because it easily
reflects hierarchical relationships.
Graphs are used to find the shortest path between any two points. This is practically used in
navigation and network routing applications.
Algorithms
An algorithm is a step-by-step procedure that defines a set of instructions to be
executed in a certain order to get the desired output. Algorithms are generally
created independent of underlying languages, i.e. an algorithm can be
implemented in more than one programming language.
From the data structure point of view, following are some important categories
of algorithms −
Characteristics of an Algorithm
Not all procedures can be called an algorithm. An algorithm should have the
following characteristics −
Unambiguous − An algorithm should be clear and unambiguous. Each of
its steps (or phases), and their inputs/outputs should be clear and must
lead to only one meaning.
As we know, all programming languages share basic code constructs like
loops (do, for, while), flow-control (if-else), etc. These common constructs can
be used to write an algorithm.
Example
Problem − Design an algorithm to add two numbers and display the result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b
Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 − STOP
Algorithms tell the programmers how to code the program. Alternatively, the
algorithm can be written as −
Step 1 − START ADD
Step 2 − get values of a & b
Step 3 − c ← a + b
Step 4 − display c
Step 5 − STOP
In the design and analysis of algorithms, pseudocode is usually used to
describe an algorithm. It makes it easy for the analyst to analyze the algorithm
ignoring all unwanted definitions, and to observe what operations are being
used and how the process is flowing.
Suppose X is an algorithm and n is the size of input data, the time and space
used by the algorithm X are the two main factors, which decide the efficiency
of X.
The complexity of an algorithm f(n) gives the running time and/or the storage
space required by the algorithm in terms of n as the size of input data.
Space Complexity
Space complexity of an algorithm represents the amount of memory space
required by the algorithm in its life cycle. The space required by an algorithm
is equal to the sum of the following two components −
A fixed part that is a space required to store certain data and variables,
that are independent of the size of the problem. For example, simple
variables and constants used, program size, etc.
A variable part that is a space required by variables, whose size depends
on the size of the problem. For example, dynamic memory allocation,
recursion stack space, etc.
Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the
fixed part and S(I) is the variable part of the algorithm, which depends on
instance characteristic I. Following is a simple example that tries to explain the
concept −
Algorithm: SUM(A, B)
Step 1 - START
Step 2 - C ← A + B + 10
Step 3 - Stop
Here we have three variables A, B, and C and one constant. Hence S(P) = 1 +
3. Now, space depends on the data types of the given variables and constants,
and it will be multiplied accordingly.
Time Complexity
Time complexity of an algorithm represents the amount of time required by the
algorithm to run to completion. Time requirements can be defined as a
numerical function T(n), where T(n) can be measured as the number of steps,
provided each step consumes constant time.
Time-Memory Tradeoff
A time-memory (space-time) tradeoff is a way of solving a problem or
calculation in less time by using more storage space (or memory), or of solving
it in very little space by spending a long time. Most
people are willing to wait a little while for a big calculation, but not forever. So
if your problem is taking a long time but not much memory, a space-time
tradeoff would let you use more memory and solve the problem more quickly.
Or, if it could be solved very quickly but requires more memory than you have,
you can try to spend more time solving the problem in the limited memory.
Big O Notation
Big O notation is the language we use for talking about how long an algorithm
takes to run, and it's how we compare the efficiency of different approaches to a
problem. It's like math except it's an awesome, not-boring kind of math. With
Big O notation we express the runtime in terms of how quickly it grows relative
to the input, as the input gets arbitrarily large.
Big O notation is used to classify algorithms according to how their run time or
space requirements grow as the input size grows.
Strings
In C, a string is an array of characters terminated by the null character '\0'. The
standard header string.h declares common string-handling functions such as
strlen, strcpy, strcat and strcmp:
#include <string.h>
UNIT II
Array
Array is a container which can hold a fixed number of items and these items
should be of the same type. Most of the data structures make use of arrays to
implement their algorithms. Following are the important terms to understand
the concept of Array.
Each element can be accessed via its index. For example, we can fetch an
element at index 6 as 27.
Basic Operations
Example
Following program traverses and prints the elements of an array:
#include <stdio.h>
int main()
{
    int age[5]={20,21,19,18,25};
    int i;
    for(i=0;i<5;i++)
        printf("%d ", age[i]);
    return 0;
}
When we compile and execute the above program, it produces the following
result −
Output
20 21 19 18 25
Insertion operation
Given an array arr of size n, this article tells how to insert an element x in this
array arr at a specific position pos.
Approach:
Here’s how to do it.
1. First get the element to be inserted, say x
2. Then get the position at which this element is to be inserted, say pos
3. Then shift the array elements from this position to one position forward,
and do this for all the other elements next to pos.
4. Insert the element x now at the position pos, as this is now empty.
Deletion operation
A user will enter the position at which the array element deletion is required.
Deleting an element does not affect the size of the array. It also checks whether
deletion is possible or not, for example, if an array contains five elements and
user wants to delete the element at the sixth position, it isn't possible.
Since we need to shift the array elements which are after the element to be
deleted, deletion is very inefficient if the size of the array is large or we need
to remove elements from an array repeatedly. In a linked list data structure,
shifting isn't required; only pointers are adjusted. If frequent deletion is
required and the number of elements is large, it is recommended to use a
linked list.
Multi-Dimensional Array
When the number of dimensions specified is more than one, it is called a
multi-dimensional array. Multidimensional arrays include 2D arrays and 3D
arrays.
Declaration/Initialization of Arrays
Parallel arrays
A parallel array is a structure that contains multiple arrays. Each of these arrays
is of the same size, and the array elements are related to each other. All the
elements in a parallel array represent a common entity.
An example of parallel arrays is as follows −
Sparse Matrix
In computer programming, a matrix can be defined with a 2-dimensional array.
Any array with 'm' rows and 'n' columns represents an m X n matrix. There may
be a situation in which a matrix contains more ZERO values than
NON-ZERO values. Such a matrix is known as a sparse matrix.
In this representation, we consider only non-zero values along with their row
and column index values. The 0th row stores the total number of rows, the
total number of columns and the total number of non-zero values in the sparse
matrix.
In above example matrix, there are only 6 non-zero elements ( those are 9, 8, 4,
2, 5 & 2) and matrix size is 5 X 6. We represent this matrix as shown in the
above image. Here the first row in the right side table is filled with values 5, 6 &
6 which indicates that it is a sparse matrix with 5 rows, 6 columns & 6 non-zero
values. The second row is filled with 0, 4, & 9 which indicates the non-zero
value 9 is at the 0th-row 4th column in the Sparse matrix. In the same way, the
remaining non-zero values also follow a similar pattern.
Linked List
A linked list is a linear data structure, in which the elements are not stored at
contiguous memory locations. The elements in a linked list are linked using pointers.
In simple words, a linked list consists of nodes where each node contains a data field
and a reference (link) to the next node in the list.
Array: Size of the array must be specified at the time of array declaration/initialization.
Linked list: Size of a linked list grows/shrinks as and when new elements are inserted/deleted.
Let LIST is linear linked list. It needs two linear arrays for memory representation. Let these linear arrays
are INFO and LINK. INFO[K] contains the information part and LINK[K] contains the next pointer field of
node K. A variable START is used to store the location of the beginning of the LIST and NULL is used as
next pointer sentinel which indicates the end of LIST. It is shown below:
ptr = head;
while (ptr!=NULL)
{
ptr = ptr -> next;
}
Algorithm
o STEP 1: SET PTR = HEAD
o STEP 2: IF PTR = NULL
      WRITE "EMPTY LIST"
      GOTO STEP 6
  [END OF IF]
o STEP 3: REPEAT STEP 4 AND 5 WHILE PTR != NULL
o STEP 4: PRINT PTR → DATA
o STEP 5: SET PTR = PTR → NEXT
  [END OF LOOP]
o STEP 6: EXIT
Insertion Operation
Adding a new node in linked list is a more than one step activity. We shall learn this
with diagrams here. First, create a node using the same structure and find the
location where it has to be inserted.
First, the new node should point to the node on its right −
NewNode.next −> RightNode;
Now, the next node at the left should point to the new node −
LeftNode.next −> NewNode;
This will put the new node in the middle of the two. The new list should look like this
−
Similar steps should be taken if the node is being inserted at the beginning of the
list. While inserting it at the end, the second last node of the list should point to the
new node and the new node will point to NULL.
Deletion Operation
Deletion is also a more than one step process. We shall learn with pictorial
representation. First, locate the target node to be removed, by using searching
algorithms.
The left (previous) node of the target node now should point to the next node of the
target node −
LeftNode.next −> TargetNode.next;
This will remove the link that was pointing to the target node. Now, using the
following code, we will remove what the target node is pointing at.
TargetNode.next −> NULL;
If we need to use the deleted node, we can keep it in memory; otherwise, we can
simply deallocate the memory and wipe off the target node completely.
Algorithm
Step-1: Initialise the Current pointer with the beginning of the List.
Step-2: Compare the KEY value with the Current node value; if they match
then quit, else go to step-3.
Step-3: Move the Current pointer to point to the next node in the list and go
to step-2, till the list is not over; else quit.
As per the above illustration, following are the important points to be considered.
Doubly Linked List contains a link element called first and last.
Each link carries a data field(s) and two link fields called next and prev.
Each link is linked with its next link using its next link.
Each link is linked with its previous link using its previous link.
The last link carries a link as null to mark the end of the list.
Basic Operations
Following are the basic operations supported by a list.
Insertion − Adds an element at the beginning of the list.
Deletion − Deletes an element at the beginning of the list.
Insert Last − Adds an element at the end of the list.
Delete Last − Deletes an element from the end of the list.
Insert After − Adds an element after an item of the list.
Delete − Deletes an element from the list using the key.
Display forward − Displays the complete list in a forward manner.
Display backward − Displays the complete list in a backward manner.
As per the above illustration, following are the important points to be considered.
The last link's next points to the first link of the list in both cases of singly as
well as doubly linked list.
The first link's previous points to the last of the list in case of doubly linked
list.
Basic Operations
Following are the important operations supported by a circular list.
insert − Inserts an element at the start of the list.
delete − Deletes an element from the start of the list.
display − Displays the list.
Applications of linked list
1. Implementation of stacks and queues
2. Implementation of graphs : Adjacency list representation of graphs is the most
popular, which uses a linked list to store adjacent vertices.
3. Dynamic memory allocation : We use linked list of free blocks.
4. Maintaining directory of names
5. Performing arithmetic operations on long integers
6. Manipulation of polynomials by storing constants in the node of linked list
7. representing sparse matrices
8. Image viewer – Previous and next images are linked, hence can be
accessed by next and previous button.
9. Previous and next page in web browser – We can access the previous and next
URLs visited in a web browser by pressing the back and forward buttons, since
they are linked as a linked list.
10. Music Player – Songs in music player are linked to previous and next song.
you can play songs either from starting or ending of the list.
Step 1 - Create a newNode with given value and newNode → next as NULL.
Step 2 - Check whether list is Empty (head == NULL).
Step 3 - If it is Empty then, set head = newNode.
Step 4 - If it is Not Empty then, define a node pointer temp and initialize with head.
Step 5 - Keep moving the temp to its next node until it reaches to the last node in the list
(until temp → next is equal to NULL).
Step 6 - Set temp → next = newNode.
Deletion
In a single linked list, the deletion operation can be performed in three ways:
deleting from the beginning of the list, deleting from the end of the list, and
deleting a specific node.
UNIT III
A stack can be implemented by means of Array, Structure, Pointer, and Linked List.
Stack can either be a fixed size one or it may have a sense of dynamic resizing.
Here, we are going to implement stack using arrays, which makes it a fixed size
stack implementation.
Primitive operations on stack
Stack operations may involve initializing the stack, using it and then de-initializing it.
Apart from these basics, a stack is used for the following two primary
operations −
push() − Pushing (storing) an element on the stack.
pop() − Removing (accessing) an element from the stack.
When data is PUSHed onto the stack, it is always placed at the top.
To use a stack efficiently, we need to check the status of stack as well. For the
same purpose, the following functionality is added to stacks −
peek() − get the top data element of the stack, without removing it.
isFull() − check if stack is full.
isEmpty() − check if stack is empty.
At all times, we maintain a pointer to the last PUSHed data on the stack. As this
pointer always represents the top of the stack, hence named top. The top pointer
provides top value of the stack without actually removing it.
If the linked list is used to implement the stack, then in step 3, we need to allocate
space dynamically.
Algorithm for PUSH Operation
A simple algorithm for Push operation can be derived as follows −
begin procedure push: stack, data
if stack is full
return null
endif
top ← top + 1
stack[top] ← data
end procedure
Pop Operation
Accessing the content while removing it from the stack, is known as a Pop
Operation. In an array implementation of pop() operation, the data element is not
actually removed, instead top is decremented to a lower position in the stack to
point to the next value. But in linked-list implementation, pop() actually removes
data element and deallocates memory space.
A Pop operation may involve the following steps −
Step 1 − Checks if the stack is empty.
Step 2 − If the stack is empty, produces an error and exit.
Step 3 − If the stack is not empty, accesses the data element at which top is
pointing.
Step 4 − Decreases the value of top by 1.
Step 5 − Returns success.
begin procedure pop: stack
if stack is empty
return null
endif
data ← stack[top]
top ← top - 1
return data
end procedure
1. Increment the variable Top so that it can now refer to the next memory location.
2. Add the element at the position of the incremented top. This is referred to as adding a
new element at the top of the stack.
The stack is said to overflow when we try to insert an element into a completely filled
stack; therefore, our main function must always check for the stack overflow condition.
Algorithm:
begin
if top = n then stack full
else
top = top + 1;
stack (top) := item;
end if
end
The underflow condition occurs when we try to delete an element from an already empty
stack.
Algorithm :
begin
if top = 0 then stack empty
else
item := stack(top);
top = top - 1;
end if
end;
Stack applications
1. Stacks can be used for expression evaluation.
2. Stacks can be used to check parenthesis matching in an expression.
3. Stacks can be used for Conversion from one form of expression to
another.
4. Stacks can be used for Memory Management.
5. Stack data structures are used in backtracking problems.
Polish Notation
Polish notation is a notation form for expressing arithmetic, logic and algebraic
equations. Its most basic distinguishing feature is that operators are placed on
the left of their operands. If the operator has a defined fixed number of
operands, the syntax does not require brackets or parenthesis to lessen
ambiguity.
Polish notation is also known as prefix notation, prefix Polish notation, normal
Polish notation, Warsaw notation and Lukasiewicz notation.
Polish notation was invented in 1924 by Jan Lukasiewicz, a Polish logician and
philosopher, in order to simplify sentential logic. The idea is simply to have a
parenthesis-free notation that makes each equation shorter and easier to parse in
terms of defining the evaluation priority of the operators.
Example:
Infix notation with parenthesis: (3 + 2) * (5 – 1)
Polish notation: * + 3 2 – 5 1
Recursion
Queue
A Queue is a linear structure which follows a particular order in which the operations
are performed. The order is First In First Out (FIFO). A good example of a queue is
any queue of consumers for a resource where the consumer that came first is served
first. The difference between stacks and queues is in removing. In a stack we
remove the item the most recently added; in a queue, we remove the item the least
recently added.
Queue Representation
As we now understand, in a queue we access both ends for different reasons.
The following diagram given below tries to explain queue representation as data
structure −
Basic Operations
Queue operations may involve initializing or defining the queue, utilizing it, and then
completely erasing it from the memory. Here we shall try to understand the basic
operations associated with queues −
enqueue() − add (store) an item to the queue.
dequeue() − remove (access) an item from the queue.
Few more functions are required to make the above-mentioned queue operation
efficient. These are −
peek() − Gets the element at the front of the queue without removing it.
isfull() − Checks if the queue is full.
isempty() − Checks if the queue is empty.
In a queue, we always dequeue (or access) the data pointed to by the front pointer,
and while enqueuing (or storing) data in the queue we take the help of the rear pointer.
Circular Queue
Circular Queue is a linear data structure in which the operations are performed
based on FIFO (First In First Out) principle and the last position is connected back to
the first position to make a circle. It is also called ‘Ring Buffer’.
In a normal queue, we can insert elements until the queue becomes full. But once
the queue becomes full, we cannot insert the next element even if there is space at
the front of the queue.
Priority Queue
Priority Queue is an extension of queue with following properties.
1. Every item has a priority associated with it.
2. An element with high priority is dequeued before an element with low priority.
3. If two elements have the same priority, they are served according to their order in the
queue.
FRONT- address of the first element of the Linked list storing the Queue.
REAR- address of the last element of the Linked list storing the Queue.
Queue Applications
Queues are used in various applications in Computer Science-
Enqueue Operation
Queues maintain two data pointers, front and rear. Therefore, its operations are
comparatively more difficult to implement than those of stacks.
The following steps should be taken to enqueue (insert) data into a queue −
Step 1 − Check if the queue is full.
Step 2 − If the queue is full, produce overflow error and exit.
Step 3 − If the queue is not full, increment rear pointer to point the next
empty space.
Step 4 − Add data element to the queue location, where the rear is pointing.
Step 5 − return success.
Sometimes, we also check to see if a queue is initialized or not, to handle any
unforeseen situations.
Algorithm for enqueue operation
procedure enqueue(data)
if queue is full
return overflow
endif
rear ← rear + 1
queue[rear] ← data
return true
end procedure
The same steps in C (assuming a global array queue, a rear index and an isfull() check):
int enqueue(int data)
{
   if (isfull())
      return 0;
   rear = rear + 1;
   queue[rear] = data;
   return 1;
}
Dequeue Operation
Accessing data from the queue is a process of two tasks − access the data
where front is pointing and remove the data after access. The following steps are
taken to perform dequeue operation −
Step 1 − Check if the queue is empty.
Step 2 − If the queue is empty, produce underflow error and exit.
Step 3 − If the queue is not empty, access the data where front is pointing.
Step 4 − Increment front pointer to point to the next available data element.
Step 5 − Return success.
procedure dequeue
if queue is empty
return underflow
end if
data ← queue[front]
front ← front + 1
return true
end procedure
The same steps in C (assuming a global array queue, a front index and an isempty() check):
int dequeue()
{
   if (isempty())
      return 0;
   int data = queue[front];
   front = front + 1;
   return data;
}
1. If front = -1, then there are no elements in the queue and therefore this
will be the case of an underflow condition.
2. If there is only one element in the queue, in this case, the condition rear =
front holds and therefore, both are set to -1 and the queue is deleted
completely.
3. If front = max - 1, then the value is deleted from the front end and the value of
front is set to 0.
4. Otherwise, the value of front is incremented by 1 and then delete the
element at the front end.
Algorithm
o Step 1: IF FRONT = -1
Write " UNDERFLOW "
Goto Step 4
[END of IF]
o Step 2: SET VAL = QUEUE[FRONT]
o Step 3: IF FRONT = REAR
SET FRONT = REAR = -1
ELSE
IF FRONT = MAX -1
SET FRONT = 0
ELSE
SET FRONT = FRONT + 1
[END of IF]
[END OF IF]
o Step 4: EXIT
UNIT IV
Tree
What are trees?
The above figure represents the structure of a tree. The tree has 2 subtrees. A is the
parent of B and C.
Root: Root is a special node in a tree. The entire tree is referenced through it. It does not have a
parent.
Path: Path is a sequence of successive edges from the source node to the destination node.
Height of Node: Height of a node represents the number of edges on the longest path between
that node and a leaf.
Depth of Node: Depth of a node represents the number of edges from the tree's root node to the
node.
Edge: Edge is a connection between one node and another. It is a line between two nodes, or
between a node and a leaf.
In the above figure, D, F, H, G are leaves. B and C are siblings. Each node excluding a root is
connected by a direct edge from exactly one other node
parent → children.
Levels of a node
The level of a node represents the number of connections between the node and the root. It
represents the generation of a node. If the root node is at level 0, its child is at level 1, its
grandchild is at level 2, and so on. Levels of a node can be shown as follows:
- If a node has no children, it is called a Leaf or External Node.
- Nodes which are not leaves are called Internal Nodes. Internal nodes have at
least one child.
- A tree can be empty with no nodes, or a tree can consist of one node called the Root.
Height of a Node
As we studied, height of a node is a number of edges on the longest path between that node and a leaf. Each
node has height.
In the above figure, A, B, C and D can have a height. A leaf cannot have a height, as there is no
downward path starting from a leaf. Node A's height is the number of edges of the path to K,
not to D, and its height is 3.
Note:
- Height of a node defines the longest path from the node to a leaf.
- Path can only be downward.
Depth of a Node
While the height locates a node relative to the bottom (a leaf), the depth locates a node
relative to the top, which is the root level; hence we call it the depth of a node.
In the above figure, Node G's depth is 2. For the depth of a node, we just count how many
edges lie between the target node and the root, ignoring the directions.
Advantages of Tree
Tree Representation
A tree whose elements have at most 2 children is called a binary tree. Since each
element in a binary tree can have only 2 children, we typically name them the left
and right child.
Insertion: For inserting element as left child of 2, we have to traverse all elements. Therefore,
insertion in binary tree has worst case complexity of O(n).
Deletion: For deletion of element 2, we have to traverse all elements to find 2 (assuming we do
breadth first traversal). Therefore, deletion in binary tree has worst case complexity of O(n).
1. Preorder traversal
2. Inorder traversal
3. Postorder traversal
In-order Traversal
In this traversal method, the left subtree is visited first, then the root and later
the right sub-tree. We should always remember that every node may represent
a subtree itself.
If a binary tree is traversed in-order, the output will produce sorted key values
in an ascending order. For the tree shown in the figure, the in-order traversal
output is D → B → E → A → F → C → G.
Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and
finally the right subtree.
We start from A, and following pre-order traversal, we first visit A itself and
then move to its left subtree B. B is also traversed pre-order. The process goes
on until all the nodes are visited. The output of pre-order traversal of this tree
will be −
Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.
Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we
traverse the left subtree, then the right subtree and finally the root node.
We start from A, and following Post-order traversal, we first visit the left
subtree B. B is also traversed post-order. The process goes on until all the
nodes are visited. The output of post-order traversal of this tree will be −
D→E→B→F→G→C→A
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.
The above example of the binary tree represented using Linked list
representation is shown as follows...
Binary Search Tree (BST)
Binary Search Tree is a node-based binary tree data structure which has the
following properties:
The left subtree of a node contains only nodes with keys lesser than the
node’s key.
The right subtree of a node contains only nodes with keys greater than the
node’s key.
The left and right subtree each must also be a binary search tree.
Operations on binary search tree
Following are the basic operations −
Search − Searches an element in a tree.
Insert − Inserts an element in a tree.
Pre-order Traversal − Traverses a tree in a pre-order manner.
In-order Traversal − Traverses a tree in an in-order manner.
Post-order Traversal − Traverses a tree in a post-order manner.
Expression Tree
The expression tree is a binary tree in which each internal node corresponds to
an operator and each leaf node corresponds to an operand. For example, the
expression tree for 3 + ((5+9)*2) would be:
Now, for constructing an expression tree we use a stack. We loop through the input
expression and do the following for every character.
1. If the character is an operand, push it onto the stack.
2. If the character is an operator, pop two values from the stack, make them its
children, and push the current node again.
In the end, the only element left on the stack will be the root of the expression tree.
Forest
Applications of trees
UNIT V
Graph
A Graph is a non-linear data structure consisting of nodes and edges. The nodes
are sometimes also referred to as vertices and the edges are lines or arcs that
connect any two nodes in the graph. More formally a Graph can be defined as,
A Graph consists of a finite set of vertices(or nodes) and set of Edges which
connect a pair of nodes.
In the above Graph, the set of vertices V = {0,1,2,3,4} and the set of edges E = {01,
12, 23, 34, 04, 14, 13}.
Graphs are used to solve many real-life problems. Graphs are used to represent
networks. The networks may include paths in a city or telephone network or circuit
network. Graphs are also used in social networks like Instagram, Facebook. For
example, in Facebook, each person is represented with a vertex(or node). Each
node is a structure and contains information like person id, name, gender, locale etc.
A collection of vertices V
A collection of edges E, represented as ordered pairs of
vertices (u,v)
In the graph,
V = {0, 1, 2, 3}
G = {V, E}
Graph Terminology
Graph Representation
1. Adjacency Matrix
2. Adjacency List
The index of the array represents a vertex and each element in its linked list
represents the other vertices that form an edge with the vertex.
The adjacency list for the graph we made in the first example is as follows:
An adjacency list is efficient in terms of storage because we only need to store
the values for the edges. For a graph with millions of vertices, this can mean a
lot of saved space.
Graph Operations
Directed Graphs
Edges are usually represented by arrows pointing in the direction the graph can
be traversed.
In the example on the right, the graph can be traversed from vertex A to B, but
not from vertex B to A.
Undirected Graphs
Some more complex directed and undirected graphs might look like the
following:
Weighted graph:
Example:
The weight of an edge can represent:
Example:
Graph:
Representation:
Explanation:
And so on....
Graph Traversal
Graph traversal is a technique used for searching a vertex in a graph. The graph
traversal is also used to decide the order in which vertices are visited in the search
process. A graph traversal finds the edges to be used in the search process
without creating loops. That means using graph traversal we can visit all the
vertices of the graph without getting into a looping path.
There are two graph traversal techniques and they are as follows...
DFS (Depth First Search)
DFS traversal goes as deep as possible along each branch before backtracking.
Backtracking is coming back to the vertex from which we reached the current
vertex.
BFS (Breadth First Search)
Hashing
In both these examples the students and books were hashed to a unique number.
Assume that you have an object and you want to assign a key to it to make
searching easy. To store the key/value pair, you can use a simple array-like
data structure where keys (integers) can be used directly as indices to store
values. However, in cases where the keys are large and cannot be used directly
as an index, you should use hashing.
In hashing, large keys are converted into small keys by using hash functions.
The values are then stored in a data structure called hash table. The idea of
hashing is to distribute entries (key/value pairs) uniformly across an array. Each
element is assigned a key (converted key). By using that key you can access the
element in O(1) time. Using the key, the algorithm (hash function) computes an
index that suggests where an entry can be found or inserted.
Hash Table
Hash Table is a data structure which stores data in an associative manner. In a
hash table, data is stored in an array format, where each data value has its own
unique index value. Access of data becomes very fast if we know the index of
the desired data.
Thus, it becomes a data structure in which insertion and search operations are
very fast irrespective of the size of the data. A hash table uses an array as a
storage medium and uses a hashing technique to generate the index at which an
element is to be inserted or located.
Hashing
Hashing is a technique to convert a range of key values into a range of indexes
of an array. We are going to use the modulo operator to get a range of key values.
Consider an example of a hash table of size 20, in which the following items are
to be stored. Items are in the (key, value) format.
(1,20)
(2,70)
(42,80)
(4,25)
(12,44)
(14,32)
(17,11)
(13,78)
(37,98)
Sr.No.   Key   Hash           Array Index
1        1     1 % 20 = 1     1
2        2     2 % 20 = 2     2
3        42    42 % 20 = 2    2
4        4     4 % 20 = 4     4
5        12    12 % 20 = 12   12
6        14    14 % 20 = 14   14
7        17    17 % 20 = 17   17
8        13    13 % 20 = 13   13
9        37    37 % 20 = 17   17
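The index column above comes from the modulo hash function. As a sketch in Python:

```python
# Reproduce the index column of the table: index = key % table_size.
TABLE_SIZE = 20
items = [(1, 20), (2, 70), (42, 80), (4, 25), (12, 44),
         (14, 32), (17, 11), (13, 78), (37, 98)]

for key, value in items:
    index = key % TABLE_SIZE      # simple modulo hash function
    print(key, "->", index)

# Note: keys 2 and 42 both map to index 2, and 17 and 37 both map to 17.
# Such clashes are called collisions and must be resolved (e.g. by probing).
```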
Infix, Postfix and Prefix notations are three different but equivalent ways of
writing expressions. It is easiest to demonstrate the differences by looking at
examples of operators that take two operands.
Infix notation: X + Y
Operators are written in-between their operands. This is the usual way we
write expressions. An expression such as A * ( B + C ) / D is usually
taken to mean something like: "First add B and C together, then multiply
the result by A, then divide by D to give the final answer."
Postfix notation (also known as "Reverse Polish notation"): X Y +
Operators are written after their operands. The infix expression given
above is equivalent to A B C + * D /
The order of evaluation of operators is always left-to-right, and brackets
cannot be used to change this order. Because the "+" is to the left of the
"*" in the example above, the addition must be performed before the
multiplication.
Operators act on values immediately to the left of them. For example, the
"+" above uses the "B" and "C". We can add (totally unnecessary)
brackets to make this explicit:
( (A (B C +) *) D /)
Thus, the "*" uses the two values immediately preceding: "A", and the
result of the addition. Similarly, the "/" uses the result of the
multiplication and the "D".
Prefix notation (also known as "Polish notation"): + X Y
Operators are written before their operands. The expressions given above
are equivalent to / * A + B C D
As for postfix, operators are evaluated left-to-right and brackets are
superfluous. Operators act on the two nearest values on the right. We have
again added (totally unnecessary) brackets to make this clear:
(/ (* A (+ B C) ) D)
In all three versions, the operands occur in the same order, and just the operators
have to be moved to keep the meaning correct. (This is particularly important
for asymmetric operators like subtraction and division: A - B does not mean the
same as B - A; the former is equivalent to A B - or - A B, the latter to B A - or -
B A).
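The left-to-right postfix evaluation rule described above can be sketched with a stack; the numeric values below are chosen purely for illustration:

```python
def eval_postfix(tokens):
    """Evaluate a postfix expression given as a list of tokens."""
    stack = []
    ops = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
           '*': lambda a, b: a * b, '/': lambda a, b: a / b}
    for tok in tokens:
        if tok in ops:
            b = stack.pop()          # right operand is on top of the stack
            a = stack.pop()
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok)) # operand: push and move on
    return stack.pop()

# A B C + * D /  with A=2, B=3, C=1, D=4, i.e. 2 * (3 + 1) / 4
print(eval_postfix("2 3 1 + * 4 /".split()))  # 2.0
```

Popping `b` before `a` is what keeps asymmetric operators like `-` and `/` correct: `3 4 -` evaluates to 3 - 4, not 4 - 3.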
Examples:
Infix               Postfix          Prefix           Evaluation order
A * B + C / D       A B * C D / +    + * A B / C D    multiply A and B; divide C by D; add the results
A * (B + C) / D     A B C + * D /    / * A + B C D    add B and C; multiply by A; divide by D
A * (B + C / D)     A B C D / + *    * A + B / C D    divide C by D; add B; multiply by A

The same three expressions with full (unnecessary) bracketing:
( (A * B) + (C / D) )     ( (A B *) (C D /) +)     (+ (* A B) (/ C D) )
((A * (B + C) ) / D)      ( (A (B C +) *) D /)     (/ (* A (+ B C) ) D)
(A * (B + (C / D) ) )     (A (B (C D /) +) *)      (* A (+ B (/ C D) ) )
You can convert directly between these bracketed forms simply by moving the
operator within the brackets e.g. (X + Y) or (X Y +) or (+ X Y). Repeat this for
all the operators in an expression, and finally remove any superfluous brackets.
You can use a similar trick to convert to and from parse trees - each bracketed
triplet of an operator and its two operands (or sub-expressions) corresponds to a
node of the tree.
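The bracket-moving trick above can also be done algorithmically. A minimal sketch of the classic shunting-yard conversion from infix to postfix, assuming space-separated tokens, single-token operands, round brackets, and only the four binary operators used in the examples:

```python
def infix_to_postfix(tokens):
    """Convert an infix token list to a postfix string (shunting-yard sketch)."""
    prec = {'+': 1, '-': 1, '*': 2, '/': 2}
    out, stack = [], []
    for tok in tokens:
        if tok in prec:
            # pop operators of greater or equal precedence (left-associative)
            while stack and stack[-1] != '(' and prec[stack[-1]] >= prec[tok]:
                out.append(stack.pop())
            stack.append(tok)
        elif tok == '(':
            stack.append(tok)
        elif tok == ')':
            while stack[-1] != '(':
                out.append(stack.pop())
            stack.pop()              # discard the matching '('
        else:
            out.append(tok)          # operand goes straight to the output
    while stack:
        out.append(stack.pop())
    return ' '.join(out)

print(infix_to_postfix("A * ( B + C ) / D".split()))  # A B C + * D /
```

This reproduces the second row of the examples table; brackets disappear entirely in the postfix output, as the text notes.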
Binary Search
Binary search is a fast search algorithm with run-time complexity of O(log n).
This search algorithm works on the principle of divide and conquer. For this
algorithm to work properly, the data collection should be in the sorted form.
Binary search looks for a particular item by comparing the middle most item of
the collection. If a match occurs, then the index of item is returned. If the
middle item is greater than the item, then the item is searched in the sub-array
to the left of the middle item. Otherwise, the item is searched for in the sub-
array to the right of the middle item. This process continues on the sub-array as
well until the size of the subarray reduces to zero.
How Binary Search Works?
For a binary search to work, it is mandatory for the target array to be sorted.
We shall learn the process of binary search with a pictorial example. The
following is our sorted array, and let us assume that we need to search for the
location of value 31 using binary search.
First, we determine the middle of the array using the formula
mid = low + (high - low) / 2. With low = 0 and high = 9 this gives mid = 4.
Now we compare the value stored at location 4 with the value being searched,
i.e. 31. We find that the value at location 4 is 27, which is not a match. As 31
is greater than 27 and we have a sorted array, we know that the target value
must be in the upper portion of the array.
We change our low to mid + 1 and find the new mid value again.
low = mid + 1
mid = low + (high - low) / 2
Our new mid is 7 now. We compare the value stored at location 7 with our
target value 31.
The value stored at location 7 is not a match, rather it is more than what we are
looking for. So, the value must be in the lower part from this location.
We compare the value stored at location 5 with our target value. We find that it
is a match.
Procedure binary_search
   A ← sorted array
   n ← size of array
   x ← value to be searched
   Set lowerBound = 1
   Set upperBound = n
   while x not found
      if upperBound < lowerBound
         EXIT: x does not exist
      set midPoint = lowerBound + (upperBound - lowerBound) / 2
      if A[midPoint] < x
         set lowerBound = midPoint + 1
      if A[midPoint] > x
         set upperBound = midPoint - 1
      if A[midPoint] = x
         EXIT: x found at location midPoint
   end while
end procedure
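The procedure can be sketched in Python using 0-based indexing. The sample array below is an assumption chosen to be consistent with the walkthrough (27 at index 4, 31 at index 5):

```python
def binary_search(arr, x):
    """Return the index of x in the sorted list arr, or -1 if absent."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = low + (high - low) // 2   # safe midpoint formula
        if arr[mid] == x:
            return mid
        elif arr[mid] < x:
            low = mid + 1               # search the upper half
        else:
            high = mid - 1              # search the lower half
    return -1

data = [10, 14, 19, 26, 27, 31, 33, 35, 42, 44]
print(binary_search(data, 31))  # 5
```

Writing the midpoint as `low + (high - low) // 2` rather than `(low + high) // 2` avoids integer overflow in languages with fixed-width integers.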
Sorting
Sorting refers to arranging data in a particular format. Sorting algorithm
specifies the way to arrange data in a particular order. Most common orders are
in numerical or lexicographical order.
The importance of sorting lies in the fact that data searching can be optimized
to a very high level, if data is stored in a sorted manner. Sorting is also used to
represent data in more readable formats. Following are some of the examples
of sorting in real-life scenarios −
Telephone Directory − The telephone directory stores the telephone
numbers of people sorted by their names, so that the names can be
searched easily.
Dictionary − The dictionary stores words in an alphabetical order so that
searching of any word becomes easy.
Bubble Sort Algorithm
Bubble sort is a simple sorting algorithm. This sorting algorithm is
comparison-based algorithm in which each pair of adjacent elements is
compared and the elements are swapped if they are not in order. This algorithm
is not suitable for large data sets as its average and worst case complexity are
of O(n²), where n is the number of items.
How Bubble Sort Works?
We take an unsorted array for our example. Bubble sort takes O(n²) time so
we're keeping it short and precise.
Bubble sort starts with very first two elements, comparing them to check which
one is greater.
We find that 27 is smaller than 33 and these two values must be swapped.
Next we compare 33 and 35. We find that both are in already sorted positions.
We know then that 10 is smaller than 35. Hence they are not in sorted order.
We swap these values. We find that we have reached the end of the array. After
one iteration, the array should look like this −
To be precise, we are now showing how an array should look like after each
iteration. After the second iteration, it should look like this −
Notice that after each iteration, at least one value moves to the end.
And when no swap is required, bubble sort learns that the array is
completely sorted.
begin BubbleSort(list)
   for all elements of list
      if list[i] > list[i+1]
         swap(list[i], list[i+1])
      end if
   end for
   return list
end BubbleSort
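The pseudocode above can be sketched in Python (an illustrative translation, including the early exit for the "no swap required" case; the sample array is hypothetical):

```python
def bubble_sort(a):
    """Sort list a in place; stop early when a pass makes no swap."""
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):          # the last i items are already in place
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:                     # no swap: the list is already sorted
            break
    return a

print(bubble_sort([14, 33, 27, 35, 10]))  # [10, 14, 27, 33, 35]
```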
Insertion Sort
This is an in-place comparison-based sorting algorithm. Here, a sub-list is
maintained which is always sorted. For example, the lower part of an array is
maintained to be sorted. An element which is to be inserted into this sorted
sub-list has to find its appropriate place and then be inserted there. Hence
the name, insertion sort.
The array is searched sequentially and unsorted items are moved and inserted
into the sorted sub-list (in the same array). This algorithm is not suitable for
large data sets as its average and worst case complexity are of O(n²), where n is
the number of items.
How Insertion Sort Works?
We take an unsorted array for our example.
It finds that both 14 and 33 are already in ascending order. For now, 14 is in
the sorted sub-list.
It swaps 33 with 27. It also checks with all the elements of sorted sub-list. Here
we see that the sorted sub-list has only one element 14, and 27 is greater than
14. Hence, the sorted sub-list remains sorted after swapping.
By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10.
These values are not in a sorted order.
So we swap them.
We swap them again. By the end of the third iteration, we have a sorted sub-list
of 4 items.
This process goes on until all the unsorted values are covered in a sorted sub-
list. Now we shall see some programming aspects of insertion sort.
Algorithm
Now we have a bigger picture of how this sorting technique works, so we can
derive simple steps by which we can achieve insertion sort.
Step 1 − If it is the first element, it is already sorted. return 1;
Step 2 − Pick next element
Step 3 − Compare with all elements in the sorted sub-list
Step 4 − Shift all the elements in the sorted sub-list that is greater than the
value to be sorted
Step 5 − Insert the value
Step 6 − Repeat until list is sorted
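Steps 1-6 above can be sketched in Python. This illustrative translation shifts greater elements right rather than repeatedly swapping, which is the usual idiomatic form; the sample array is hypothetical:

```python
def insertion_sort(a):
    """Sort list a in place by growing a sorted sub-list on the left."""
    for i in range(1, len(a)):     # Step 1: a[0] alone is already sorted
        key = a[i]                 # Step 2: pick the next element
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]        # Steps 3-4: shift greater elements right
            j -= 1
        a[j + 1] = key             # Step 5: insert at its correct position
    return a                       # Step 6: repeat until the list is sorted

print(insertion_sort([14, 33, 27, 10, 35, 19, 42, 44]))
# [10, 14, 19, 27, 33, 35, 42, 44]
```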
Selection Sort
Selection sort is a simple sorting algorithm. This sorting algorithm is an in-
place comparison-based algorithm in which the list is divided into two parts,
the sorted part at the left end and the unsorted part at the right end. Initially, the
sorted part is empty and the unsorted part is the entire list.
The smallest element is selected from the unsorted array and swapped with the
leftmost element, and that element becomes a part of the sorted array. This
process continues moving unsorted array boundary by one element to the right.
This algorithm is not suitable for large data sets as its average and worst case
complexities are of O(n²), where n is the number of items.
How Selection Sort Works?
Consider the following depicted array as an example.
For the first position in the sorted list, the whole list is scanned
sequentially. While the first position currently holds 14, we search the whole
list and find that 10 is the lowest value.
So we replace 14 with 10. After one iteration, 10, which happens to be the
minimum value in the list, appears in the first position of the sorted list.
For the second position, where 33 is residing, we start scanning the rest of the
list in a linear manner.
We find that 14 is the second lowest value in the list and it should appear at the
second place. We swap these values.
After two iterations, two least values are positioned at the beginning in a sorted
manner.
The same process is applied to the rest of the items in the array.
Following is a pictorial depiction of the entire sorting process −
Now, let us learn some programming aspects of selection sort.
Algorithm
Step 1 − Set MIN to location 0
Step 2 − Search the minimum element in the list
Step 3 − Swap with value at location MIN
Step 4 − Increment MIN to point to next element
Step 5 − Repeat until list is sorted
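Steps 1-5 above can be sketched in Python (an illustrative translation in which the MIN location is the index variable `min_idx`; the sample array is hypothetical):

```python
def selection_sort(a):
    """Repeatedly select the minimum of the unsorted part and swap it forward."""
    n = len(a)
    for i in range(n - 1):
        min_idx = i                           # Step 1: set MIN to the current location
        for j in range(i + 1, n):             # Step 2: search the unsorted remainder
            if a[j] < a[min_idx]:
                min_idx = j
        a[i], a[min_idx] = a[min_idx], a[i]   # Step 3: swap the minimum into place
    return a                                  # Steps 4-5: advance and repeat

print(selection_sort([14, 33, 27, 10, 35, 19, 42, 44]))
# [10, 14, 19, 27, 33, 35, 42, 44]
```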