DSA Notes
DSA Notes
Unit I
Arrays and sequential representations – ordered lists – Stacks and Queues – Evaluation
ofExpressions – Multiple Stacks and Queues – Singly Linked List – Linked Stacks and
queues – Polynomial addition.
The primitive data structures are primitive data types. The int, char, float, double, and
pointer are the primitive data structures that can hold a single value.
When one element is connected to the 'n' number of elements known as a non-
linear data structure. The best example is trees and graphs. In this case, the elements
are arranged in a random manner.
Static data structure: It is a type of data structure where the size is allocated
at the compile time. Therefore, the maximum size is fixed.
Dynamic data structure: It is a type of data structure where the size is
allocated at the run time. Therefore, the maximum size is flexible.
Major Operations
The major or the common operations that can be performed on the data structures
are:
Definition
o Arrays are defined as the collection of similar type of data items stored at contiguous
memory locations.
o Arrays are the derived data type in C programming language which can store the
primitive type of data such as int, char, double, float, etc.
o Array is the simplest data structure where each data element can be randomly
accessed by using its index number.
o For example, if we want to store the marks of a student in 6 subjects, then we don't
need to define different variable for the marks in different subject. instead of that, we
can define an array which can store the marks in each subject at a the contiguous
memory locations.
The array marks[10] defines the marks of the student in 10 different subjects where
each subject marks are located at a particular subscript in the array
i.e. marks[0] denotes the marks in first subject, marks[1] denotes the marks in 2nd
subject and so on.
Advantages of Array
o Array provides the single name for the group of variables of the same type therefore,
it is easy to remember the name of all the elements of an array.
o Traversing an array is a very simple process, we just need to increment the base
address of the array in order to visit each element one by one.
o Any element in the array can be directly accessed by using the index.
Memory Allocation of the array
As we have mentioned, all the data elements of an array are stored at
contiguous locations in the main memory. The name of the array represents
the base address or the address of first element in the main memory. Each
element of the array is represented by a proper indexing.
1. 0 (zero - based indexing) : The first element of the array will be arr[0].
2. 1 (one - based indexing) : The first element of the array will be arr[1].
3. n (n - based indexing) : The first element of the array can reside at any random index
number.
In the following image, we have shown the memory allocation of an array arr of size
5. The array follows 0-based indexing approach. The base address of the array is
100th byte. This will be the address of arr[0]. Here, the size of int is 4 bytes therefore
each element will take 4 bytes in the memory.
In 0 based indexing, If the size of an array is n then the maximum index number, an
element can have is n-1. However, it will be n if we use 1 based indexing.
However, 2D arrays are created to implement a relational database look alike data
structure. It provides ease of holding bulk of data at once which can be passed to
any number of functions wherever required.
25M
Above image shows the two dimensional array, the elements are organized in the form of
rows and columns. First element of the first row is represented by a[0][0] where the number
shown in the first index is the number of that row while the number shown in the second
index is the number of the column.
How do we access data in a 2D array
Similar to one dimensional arrays, we can access the individual cells in a
2D array by using the indices of the cells. There are two indices attached
to a particular cell, one is its row number while the other is its column
number.
However, we can store the value stored in any particular cell of a 2D array to some
variable x by using the following syntax.
1. int x = a[i][j];
where i and j is the row and column number of the cell respectively.
Initializing 2D Arrays
We know that, when we declare and initialize one dimensional array in C
programming simultaneously, we don't need to specify the size of the
array. However this will not work with 2D arrays. We will have to define
at least the second dimension of the array.
The number of elements that can be present in a 2D array will always be equal to
(number of rows * number of columns).
A 3 X 3 two dimensional array is shown in the following image. However, this array
needs to be mapped to a one dimensional array in order to store it into the memory.
There are two main techniques of storing 2D array elements into memory
first, the 1st row of the array is stored into the memory completely, then the 2 nd row
of the array is stored into the memory completely and so on till the last row.
first, the 1st column of the array is stored into the memory completely, then the
2nd row of the array is stored into the memory completely and so on till the last
column of the array.
1. Address(a[i][j]) = B. A. + (i * n + j) * size
where, B. A. is the base address or the address of the first element of the array a[0][0] .
Example :
1. a[10...30, 55...75], base address of the array (BA) = 0, size of an element = 4 bytes .
2. Find the location of a[15][68].
3.
4. Address(a[15][68]) = 0 +
5. ((15 - 10) x (68 - 55 + 1) + (68 - 55)) x 4
6.
7. = (5 x 14 + 13) x 4
8. = 83 x 4
9. = 332 answer
If array is declared by a[m][n] where m is the number of rows while n is the number
of columns, then address of an element a[i][j] of the array stored in row major order
is calculated as,
1. Address(a[i][j]) = ((j*m)+i)*Size + BA
Example:
A [-
5 ... +20][20 ... 70], BA = 1020, Size of element = 8 bytes. Find the location of a[0][30].
Ordered list
An ordered list is a list in which the order of the items is significant.
However, the items in an ordered list are not necessarily sorted.
Consequently, it is possible to change the order o items and still have a
valid ordered list.
Consider a list of the titles of the chapters in this book. The order of the
items in the list corresponds to the order in which they appear in the book.
However, since the chapter titles are not sorted alphabetically, we cannot
consider the list to be sorted. Since it is possible to change the order of the
chapters in book, we must be able to do the same with the items of the list.
As a result, we may insert an item into an ordered list at any position.
Insert
-used to put objects into a the container;
withdraw
-used to remove objects from the container;
find
-used to locate objects in the container;
isMember
-used to test whether a given object instance is in the container.
stack
A real-world stack allows operations at one end only. For example, we can place or
remove a card or plate from the top of the stack only. At any given time, we can only
access the top element of a stack.
This feature makes it LIFO data structure. LIFO stands for Last-in-first-out. Here,
the element which is placed (inserted or added) last, is accessed first. In stack
terminology, insertion operation is called PUSH operation and removal operation is
called POP operation.
Stack Representation
The following diagram depicts a stack and its operations −
A stack can be implemented by means of Array, Structure, Pointer, and Linked List.
Stack can either be a fixed size one or it may have a sense of dynamic resizing.
Here, we are going to implement stack using arrays, which makes it a fixed size
stack implementation.
Basic Operations
Stack operations may involve initializing the stack, using it and then de-initializing it.
Apart from these basic stuffs, a stack is used for the following two primary
operations −
push() − Pushing (storing) an element on the stack.
pop() − Removing (accessing) an element from the stack.
When data is PUSHed onto stack.
To use a stack efficiently, we need to check the status of stack as well. For the
same purpose, the following functionality is added to stacks −
peek() − get the top data element of the stack, without removing it.
isFull() − check if stack is full.
isEmpty() − check if stack is empty.
At all times, we maintain a pointer to the last PUSHed data on the stack.
The top pointer provides top value of the stack without actually removing it.
First we should learn about procedures to support stack functions −
peek()
Algorithm of peek() function −
begin procedure peek
return stack[top]
end procedure
isfull()
Algorithm of isfull() function −
begin procedure isfull
end procedure
isempty()
Algorithm of isempty() function −
begin procedure isempty
end procedure
Implementation of isempty() function in C programming language is slightly
different. We initialize top at -1, as the index in array starts from 0. So we check if
the top is below zero or -1 to determine if the stack is empty. Here's the code −
Push Operation
The process of putting a new data element onto stack is known as a Push
Operation. Push operation involves a series of steps −
Step 1 − Checks if the stack is full.
Step 2 − If the stack is full, produces an error and exit.
Step 3 − If the stack is not full, increments top to point next empty space.
Step 4 − Adds data element to the stack location, where top is pointing.
Step 5 − Returns success.
If the linked list is used to implement the stack, then in step 3, we need to allocate
space dynamically.
if stack is full
return null
endif
top ← top + 1
stack[top] ← data
end procedure
Pop Operation
Accessing the content while removing it from the stack, is known as a Pop
Operation. In an array implementation of pop() operation, the data element is not
actually removed, instead top is decremented to a lower position in the stack to
point to the next value. But in linked-list implementation, pop() actually removes
data element and deallocates memory space.
A Pop operation may involve the following steps −
Step 1 − Checks if the stack is empty.
Step 2 − If the stack is empty, produces an error and exit.
Step 3 − If the stack is not empty, accesses the data element at which top is
pointing.
Step 4 − Decreases the value of top by 1.
Step 5 − Returns success.
if stack is empty
return null
endif
data ← stack[top]
top ← top - 1
return data
end procedure
Queue
Queue is an abstract data structure, somewhat similar to Stacks. Unlike stacks, a queue is
open at both its ends. One end is always used to insert data (enqueue) and the other is used
to remove data (dequeue). Queue follows First-In-First-Out methodology, i.e., the data item
stored first will be accessed first.
Queue Representation
As we now understand that in queue, we access both ends for different reasons.
The following diagram given below tries to explain queue representation as data
structure −
Basic Operations
Queue operations may involve initializing or defining the queue, utilizing it, and then
completely erasing it from the memory. Here we shall try to understand the basic
operations associated with queues −
enqueue() − add (store) an item to the queue.
dequeue() − remove (access) an item from the queue.
Few more functions are required to make the above-mentioned queue operation
efficient. These are −
peek() − Gets the element at the front of the queue without removing it.
isfull() − Checks if the queue is full.
isempty() − Checks if the queue is empty.
In queue, we always dequeue (or access) data, pointed by front pointer and while
enqueing (or storing) data in the queue we take help of rear pointer.
peek()
This function helps to see the data at the front of the queue. The algorithm of
peek() function is as follows −
Algorithm
begin procedure peek
return queue[front]
end procedure
isfull()
As we are using single dimension array to implement queue, we just check for the
rear pointer to reach at MAXSIZE to determine that the queue is full. In case we
maintain the queue in a circular linked-list, the algorithm will differ. Algorithm of
isfull() function –
Algorithm
begin procedure isfull
end procedure
isempty()
Algorithm of isempty() function −
Algorithm
begin procedure isempty
end procedure
If the value of front is less than MIN or 0, it tells that the queue is not yet initialized,
hence empty.
Enqueue Operation
Queues maintain two data pointers, front and rear. Therefore, its operations are
comparatively difficult to implement than that of stacks.
The following steps should be taken to enqueue (insert) data into a queue −
Step 1 − Check if the queue is full.
Step 2 − If the queue is full, produce overflow error and exit.
Step 3 − If the queue is not full, increment rear pointer to point the next
empty space.
Step 4 − Add data element to the queue location, where the rear is pointing.
Step 5 − return success.
Sometimes, we also check to see if a queue is initialized or not, to
handle any unforeseen situations.
Dequeue Operation
Accessing data from the queue is a process of two tasks − access the data
where front is pointing and remove the data after access. The following steps are
taken to perform dequeue operation −
Step 1 − Check if the queue is empty.
Step 2 − If the queue is empty, produce underflow error and exit.
Step 3 − If the queue is not empty, access the data where front is pointing.
Step 4 − Increment front pointer to point to the next available data element.
Step 5 − Return success.
Algorithm for dequeue operation
procedure dequeue
if queue is empty
return underflow
end if
data = queue[front]
front ← front + 1
return true
end procedure
Evaluation of Expression
The way to write arithmetic expression is known as a notation. An arithmetic
expression can be written in three different but equivalent notations, i.e., without
changing the essence or output of an expression. These notations are –
Infix Notation
Prefix (Polish) Notation
Postfix (Reverse-Polish) Notation
These notations are named as how they use operator in expression. We shall learn
the same here in this chapter.
Infix Notation
We write expression in infix notation, e.g. a - b + c, where operators are used in-
between operands. It is easy for us humans to read, write, and speak in infix
notation but the same does not go well with computing devices. An algorithm to
process infix notation could be difficult and costly in terms of time and space
consumption.
Prefix Notation
In this notation, operator is prefixed to operands, i.e. operator is written ahead of
operands. For example, +ab. This is equivalent to its infix notation a + b. Prefix
notation is also known as Polish Notation.
Postfix Notation
This notation style is known as Reversed Polish Notation. In this notation style,
the operator is postfixed to the operands i.e., the operator is written after the
operands. For example, ab+. This is equivalent to its infix notation a + b.
The following table briefly tries to show the difference in all three notations −
2 (a + b) ∗ c ∗+abc ab+c∗
3 a ∗ (b + c) ∗a+bc abc+∗
5 (a + b) ∗ (c + d) ∗+ab+cd ab+cd+∗
Parsing Expressions
As we have discussed, it is not a very efficient way to design an algorithm or
program to parse infix notations. Instead, these infix notations are first converted
into either postfix or prefix notations and then computed.
To parse any arithmetic expression, we need to take care of operator precedence
and associativity also.
Precedence
When an operand is in between two different operators, which operator will take the
operand first, is decided by the precedence of an operator over others. For example
−
a+b*c - a+(b*c)
Associativity
Associativity describes the rule where operators with the same precedence appear
in an expression. For example, in expression a + b − c, both + and – have the same
precedence, then which part of the expression will be evaluated first, is determined
by associativity of those operators. Here, both + and − are left associative, so the
expression will be evaluated as (a + b) − c.
Precedence and associativity determines the order of evaluation of an expression.
Following is an operator precedence and associativity table (highest to lowest) −
The above table shows the default behavior of operators. At any point of time in
expression evaluation, the order can be altered by using parenthesis. For example
−
In a + b*c, the expression part b*c will be evaluated first, with multiplication as
precedence over addition. We here use parenthesis for a + b to be evaluated first,
like (a + b)*c.
Example
Queue 1 expands from the 0th element to the right and circular back to the 0th element
Queue 2 expands from the 8th element to the left and circular back to the 8th element
Temporary boundary between the Queue 1 and the Queue 2; as long as there has free
elements in the array and boundary would be shift
Free elements could be anywhere in the Queue such as before the front, after the rear, and
between front and rear in the Queue
Queue 1’s and Queue 2 ‘s size could be change if it is necessary. When the Queue 1 is full
and the Queue 2 has free space; the Queue 1 can increase the size to use that free space
from the Queue 2. Same way for the Queue 2
Elements –1, –2, and –3 are using to store the size of the Queue 1, the front of the Queue 1,
and the data count for the Queue 1 needed to manipulate the Queue 1
Elements –4, –5, and –6 are using to store the size of the Queue 2, the front of the Queue 2,
and the data count for the Queue 2 needed to manipulate the Queue 2
Inserts data to the Queue 1, Q1Rear = (Q1Front + Q1count) % Q1Size
Inserts data to the Queue 2, Q2Rear = (Q2Front + Q2count) % Q2Size + Q1Size
Deletes data from the Queue 1, Q1Front = (Q1Front + 1) % Q1Size
Deletes data from the Queue 2, Q2Front = (Q2Front + 1) % Q2Size + Q1Size
A linked list is a sequence of data structures, which are connected together via links.
Linked List is a sequence of links which contains items. Each link contains a
connection to another link. Linked list is the second most-used data structure after
array. Following are the important terms to understand the concept of Linked List.
Link − Each link of a linked list can store a data called an element.
Next − Each link of a linked list contains a link to the next link called Next.
LinkedList − A Linked List contains the connection link to the first link called
First.
As per the above illustration, following are the important points to be considered.
Linked List contains a link element called first.
Each link carries a data field(s) and a link field called next.
Each link is linked with its next link using its next link.
Last link carries a link as null to mark the end of the list.
Basic Operations
Following are the basic operations supported by a list.
Insertion − Adds an element at the beginning of the list.
Deletion − Deletes an element at the beginning of the list.
Display − Displays the complete list.
Search − Searches an element using the given key.
Delete − Deletes an element using the given key.
Insertion Operation
Adding a new node in linked list is a more than one step activity. We shall learn this
with diagrams here. First, create a node using the same structure and find the
location where it has to be inserted.
Imagine that we are inserting a node B (NewNode), between A (LeftNode)
and C (RightNode). Then point B.next to C −
NewNode.next −> RightNode;
It should look like this −
Now, the next node at the left should point to the new node.
LeftNode.next −> NewNode;
Single linked list is a sequence of elements in which every element has link to its next
element in the sequence.
In any single linked list, the individual element is called as "Node". Every "Node" contains two
fields, data field, and the next field. The data field is used to store actual value of the node and
next field is used to store the address of next node in the sequence.
The graphical representation of a node in a single linked list is as follows...
Example
Insertion
Deletion
Display
Before we implement actual operations, first we need to set up an empty list. First, perform the
following steps before implementing actual operations.
Step 1 - Include all the header files which are used in the program.
Step 2 - Declare all the user defined functions.
Step 3 - Define a Node structure with two members data and next
Step 4 - Define a Node pointer 'head' and set it to NULL.
Step 5 - Implement the main method by displaying operations menu and make suitable
function calls in the main method to perform user selected operation.
Insertion
In a single linked list, the insertion operation can be performed in three ways. They are as
follows...
Step 1 - Create a newNode with given value and newNode → next as NULL.
Step 2 - Check whether list is Empty (head == NULL).
Step 3 - If it is Empty then, set head = newNode.
Step 4 - If it is Not Empty then, define a node pointer temp and initialize with head.
Step 5 - Keep moving the temp to its next node until it reaches to the last node in the list
(until temp → next is equal to NULL).
Step 6 - Set temp → next = newNode.
Deletion
In a single linked list, the deletion operation can be performed in three ways. They are as
follows...
Linked stack
Instead of using array, we can also use linked list to implement stack. Linked list
allocates the memory dynamically. However, time complexity in both the scenario is
same for all the operations i.e. push, pop and peek.
The top most node in the stack always contains null in its address field. Lets discuss
the way in which, each operation is performed in linked list implementation of stack.
1. Check for the underflow condition: The underflow condition occurs when
we try to pop from an already empty stack. The stack will be empty if the
head pointer of the list points to null.
2. Adjust the head pointer accordingly: In stack, the elements are popped
only from one end, therefore, the value stored in the head pointer must be
deleted and the node must be freed. The next node of the head node now
becomes the head node.
Display the nodes (Traversing)
Displaying all the nodes of a stack needs traversing all the nodes of the linked
list organized in the form of stack. For this purpose, we need to follow the
following steps.
Linked Queue
In a linked queue, each node of the queue consists of two parts i.e. data part and the
link part. Each element of the queue points to its immediate next element in the
memory.
In the linked queue, there are two pointers maintained in the memory i.e. front
pointer and rear pointer. The front pointer contains the address of the starting
element of the queue while the rear pointer contains the address of the last element
of the queue.46Diference, and JVM
Insertion and deletions are performed at rear and front end respectively. If front and
rear both are NULL, it indicates that the queue is empty.
The linked representation of queue is shown in the following figure.
Firstly, allocate the memory for the new node ptr by using the following statement.
There can be the two scenario of inserting this new node ptr into the linked queue.
we insert element into an empty queue. In this case, the condition front =
NULL becomes true. Now, the new element will be added as the only element of the
queue and the next pointer of front and rear pointer both, will point to NULL.
Deletion operation removes the element that is first inserted among all the queue
elements. Firstly, we need to check either the list is empty or not. The condition front
== NULL becomes true if the list is empty, in this case , we simply write underflow on
the console and make exit. Otherwise, we will delete the element that is pointed by
the pointer front. For this purpose, copy the node pointed by the front pointer into
the pointer ptr. Now, shift the front pointer, point to its next node and free the node
pointed by the node ptr.
Unit II
Trees – Binary tree representations – Tree Traversal – Threaded Binary Trees – Binary Tree
Representation of Trees – Graphs and Representations – Traversals, Connected
Components and Spanning Trees – Shortest Paths and Transitive closure – Activity
Networks – Topological Sort and Critical Paths.
Tree is a popular data structure used in wide range of applications. A tree data structure
can be defined as follows...
Tree is a non-linear data structure which organizes data in hierarchical structure and
this is a recursive definition.
A tree data structure can also be defined as follows...
A tree is a finite set of one or more nodes such that:
There is a specially designated node called the root. The remaining nodes are partitioned
into n>=0 disjoint sets T1, ..., Tn, where each of these sets is a tree. We call T1, ..., Tn are the
subtrees of the root.
A tree is hierarchical collection of nodes. One of the nodes, known as the root, is at the top of the
hierarchy. Each node can have at most one link coming into it. The node where the link originates
is called the parent node. The root node has no parent. The links leaving a node (any number of
links are allowed) point to child nodes. Trees are recursive structures. Each child node is itself the
root of a subtree. At the bottom of the tree are leaf nodes, which have no children.
Advantages of trees
Trees are so useful and frequently used, because they have some very serious advantages:
Trees reflect structural relationships in the data
Trees are used to represent hierarchies
Trees provide an efficient insertion and searching
Trees are very flexible data, allowing to move sub trees around with minimum effort
Introduction Terminology
In a Tree, Every individual element is called as Node. Node in a tree data structure, stores
the actual data of that particular element and link to next element in hierarchical structure.
Example
1. Root
In a tree data structure, the first node is called as Root Node. Every tree must have root node. We
can say that root node is the origin of tree data structure. In any tree, there must be only one root
node. We never have multiple root nodes in a tree. In above tree, A is a Root node
2. Edge
In a tree data structure, the connecting link between any two nodes is called as EDGE. In a tree
with 'N' number of nodes there will be a maximum of 'N-1' number of edges.
3. Parent
In a tree data structure, the node which is predecessor of any node is called as PARENT NODE.
In simple words, the node which has branch from it to any other node is called as parent node.
Parent node can also be defined as "The node which has child / children". e.g., Parent (A,B,C,D).
4. Child
In a tree data structure, the node which is descendant of any node is called as CHILD Node. In
simple words, the node which has a link from its parent node is called as child node. In a tree, any
parent node can have any number of child nodes. In a tree, all the nodes except root are child
nodes. e.g., Children of D are (H, I,J).
5. Siblings
In a tree data structure, nodes which belong to same Parent are called as SIBLINGS. In simple
words, the nodes with same parent are called as Sibling nodes. Ex: Siblings (B,C, D)
6. Leaf
In a tree data structure, the node which does not have a child (or) node with degree zero is called
as LEAF Node. In simple words, a leaf is a node with no child. In a tree data structure, the leaf
nodes are also called as External Nodes. External node is also a node with no child. In a tree, leaf
node is also called as 'Terminal' node. Ex: (K,L,F,G,M,I,J)
7. Internal Nodes
In a tree data structure, the node which has atleast one child is called as INTERNAL Node. In
simple words, an internal node is a node with atleast one child. In a tree data structure, nodes
other than leaf nodes are called as Internal Nodes. The root node is also said to be Internal Node
if the tree has more than one node. Internal nodes are also called as 'Non-Terminal' nodes.
Ex:B,C,D,E,H
8. Degree
In a tree data structure, the total number of children of a node (or)number of subtrees of a node is
called as DEGREE of that Node. In simple words, the Degree of a node is total number of
children it has. The highest degree of a node among all the nodes in a tree is called as 'Degree of
Tree'
9. Level
In a tree data structure, the root node is said to be at Level 0 and the children of root node are at
Level 1 and the children of the nodes which are at Level 1 will be at Level 2 and so on... In
simple words, in a tree each step from top to bottom is called as a Level and the Level count starts
with '0' and incremented by one at each level (Step). Some authors start root level with 1.
10. Height
In a tree data structure, the total number of edges from leaf node to a particular node in the
longest path is called as HEIGHT of that Node. In a tree, height of the root node is said to be
height of the tree. In a tree, height of all leaf nodes is '0'.
11. Depth
In a tree data structure, the total number of edges from root node to a particular node is called as
DEPTH of that Node. In a tree, the total number of edges from root node to a leaf node in the
longest path is said to be Depth of the tree. In simple words, the highest depth of any leaf node in
a tree is said to be depth of that tree. In a tree, depth of the root node is '0'.
12. Path
In a tree data structure, the sequence of Nodes and Edges from one node to another node is called
as PATH between that two Nodes. Length of a Path is total number of nodes in that path. In
below example the path A - B - E - J has length 4.
Tree Representations
A tree data structure can be represented in two methods. Those methods are as follows...
1.List Representation
2. Left Child - Right Sibling Representation
Consider the following tree...
1. List Representation
In this representation, we use two types of nodes one for representing the node with data and
another for representing only references. We start with a node with data from root node in the
tree. Then it is linked to an internal node through a reference node and is linked to any other node
directly. This process repeats for all the nodes in the tree.
The above tree example can be represented using List representation as follows...
In this representation, every node's data field stores the actual value of that node. If that node has
left child, then left reference field stores the address of that left child node otherwise that field
stores NULL. If that node has right sibling then right reference field stores the address of right
sibling node otherwise that field stores NULL. The above tree example can be represented using
Left Child - Right Sibling representation as follows...
Binary Trees
In a normal tree, every node can have any number of children. Binary tree is a special
type of tree data structure in which every node can have a maximum of 2 children. One is known
as left child and the other is known as right child.
~A tree in which every node can have a maximum of two children is called as Binary Tree.
~In a binary tree, every node can have either 0 children or 1 child or 2 children but not more than
2 children. Example
There are different types of binary trees and they are...
1. Strictly Binary Tree
In a binary tree, every node can have a maximum of two children. But in strictly binary tree,
every node should have exactly two children or none. That means every internal node must have
exactly two children. A strictly Binary Tree can be defined as follows...
A binary tree in which every node has either two or zero number of children is called Strictly
Binary Tree. Strictly binary tree is also called as Full Binary Tree or Proper Binary Tree or 2-
Tree
2. Complete Binary Tree
In a binary tree, every node can have a maximum of two children. But in strictly binary tree,
every node should have exactly two children or none and in complete binary tree all the nodes
must have exactly two children and at every level of complete binary tree there must be 2 level
number of nodes. For example at level 2 there must be 2^2 = 4 nodes and at level 3 there must be
2^3 = 8 nodes.
A binary tree in which every internal node has exactly two children and all leaf nodes are at same
level is called Complete Binary Tree.
Complete binary tree is also called as Perfect Binary Tree
Above two trees are different when viewed as binary trees. But same when viewed as trees.
Properties of Binary Trees
1.Maximum Number of Nodes in BT
maximum number of nodes on level i of a binary tree is 2i-1, i>=1.
k-1, k>=1.
Proof By Induction:
Induction Base: The root is the only node on level i=1.Hence ,the maximum number of nodes on
level i=1 is 2i-1=20=1.
Induction Hypothesis: Let I be an arbitrary positive integer greater than 1.Assume that maximum
number of nodes on level i-1 is 2i-2.
Induction Step: The maximum number of nodes on level i-1 is 2i-2 by the induction hypothesis.
Since each node in a binary tree has a maximum degree of 2,the maximum number of nodes on
level i is two times the maximum number of nodes on level i-1,or 2i-1.
The maximum number of nodes in a binary tree of depth k is
2.Relation between number of leaf nodes and degree-2 nodes: For any nonempty binary tree,
T, if n0 is the number of leaf nodes and n2 the number of nodes of degree 2, then n0=n2+1.
PROOF: Let n and B denote the total number of nodes and branches in T. Let n0, n1, n2 represent
the nodes with zero children, single child, and two children respectively.
B+1=n B=n1+2n2 ==> n1+2n2+1= n,
n1+2n2+1= n0+n1+n2 ==> n0=n2+1
3. A full binary tree of depth k is a binary tree of depth k having 2 -1 nodes, k>=0.
A binary tree with n nodes and depth k is complete iff its nodes correspond to the nodes numbered
from 1 to n in the full binary tree of depth k.
2. Linked Representation : We use linked list to represent a binary tree. In a linked list, every
node consists of three fields. First field, for storing left child address, second for storing actual
data and third for storing right child address. In this linked list representation, a node has the
following structure...
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
void inorder(tree_pointer ptr) /* inorder tree traversal */ Recursive
{
if (ptr) {
inorder(ptr->left_child);
printf(―%d‖, ptr->data);
indorder(ptr->right_child);
}
}
2. Pre - Order Traversal ( root - leftChild - rightChild )
In Pre-Order traversal, the root node is visited before left child and right child nodes. In this
traversal, the root node is visited first, then its left child and later its right child. This pre-order
traversal is applicable for every root node of all subtrees in the tree. In the above example of
binary tree, first we visit root node 'A' then visit its left child 'B' which is a root for D and F. So
we visit B's left child 'D' and again D is a root for I and J. So we visit D's left child'I' which is the
left most child. So next we go for visiting D's right child 'J'. With this we have completed root,
left and right parts of node D and root, left parts of node B. Next visit B's right child'F'. With this
we have completed root and left parts of node A. So we go for A's right child 'C' which is a root
node for G and H. After visiting C, we go for its left child 'G' which is a root for node K. So next
we visit left of G, but it does not have left child so we go for G's right child 'K'. With this we have
completed node C's root and left parts. Next visit C's right child 'H' which is the right most child
in the tree. So we stop the process. That means here we have visited in the order of A-B-D-I-J-F-
C-G-K-H using Pre-Order Traversal.
Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.
void preorder(tree_pointer ptr) /* preorder tree traversal */ Recursive
{
if (ptr) {
printf(―%d‖, ptr->data);
preorder(ptr->left_child);
preorder(ptr->right_child);
}
}
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.
void postorder(tree_pointer ptr) /* postorder tree traversal */ Recursive
{
if (ptr) {
postorder(ptr->left_child);
postorder(ptr->right_child);
printf(―%d‖, ptr->data);
}
}
Binary Search Trees
Binary Search Tree Representation
Binary Search tree exhibits a special behavior. A node's left child must have value less than its
parent's value and node's right child must have value greater than it's parent value.
We're going to implement tree using node object and connecting them through references.
Definition: A binary search tree (BST) is a binary tree. It may be empty. If it is not empty,then
all nodes follows the below mentioned properties −
keys in a nonempty left subtree (right subtree) are smaller (larger) than the key in the root
of subtree.
Graph
A graph G = (V,E) is composed of:
V: set of vertices
E: set of edges connecting the vertices in V
• An edge e = (u,v) is a pair of vertices
Example:
V= {a,b,c,d,e}
E={(a,b),(a,c),(a,d),(b,e),(c,d),(c,e),(d,e)}
Graph Terminology
Undirected Graph:
An undirected graph is one in which the pair of vertices in a edge is unordered,
(v0, v1) = (v1,v0)
Directed Graph:
A directed graph is one in which each edge is a directed pair of vertices, <v0,
v1> != <v1,v0>
Complete Graph:
A complete graph is a graph that has the maximum number of edges for undirected graph with
n vertices, the maximum number of edges is n(n-1)/2 for directed graph with n vertices, the
maximum number of edges is n(n-1)
Adjacent and Incident:
If (v0, v1) is an edge in an undirected graph,
– v0 and v1 are adjacent
– The edge (v0, v1) is incident on vertices v0 and v1
Multigraph:
In a multigraph, there can be more than one edge from vertex P to
vertex Q. In a simple graph there is at most one.
Subgraph:
A subgraph of G is a graph G‟ such that V(G‟) is a subset of V(G) and E(G‟) is a
subset of E(G)
Path:
A path from vertex vp to vertex vq in a graph G, is a sequence of vertices, vp,
vi1, vi2, ..., vin, vq, such that (vp, vi1), (vi1, vi2), ..., (vin, vq) are edges in an
undirected graph
The length of a path is the number of edges on it.
Simple Path and Style:
A simple path is a path in which all vertices, except possibly the first and the
last, are distinct.
A cycle is a simple path in which the first and the last vertices are the same
In an undirected graph G, two vertices, v0 and v1, are connected if there is a
path in G from v0 to v1.
An undirected graph is connected if, for every pair of distinct vertices vi, vj,
there is a path from vi to vj
Degree
The degree of a vertex is the number of edges incident to that vertex
For directed graph,
– the in-degree of a vertex v is the number of edges that have v as the head
– the out-degree of a vertex v is the number of edges that have v as the tail
– if di is the degree of a vertex i in a graph G with n vertices and e edges, the
number of edges is
Example:
ADT for Graph
Graph ADT is
Data structures: a nonempty set of vertices and a set of undirected
edges, where each edge is a pair of vertices
Functions: for all graph Graph, v, v1 and v2 Vertices
Graph Create()::=return an empty graph
Graph InsertVertex(graph, v)::= return a graph with v inserted. V
has no incident edge.
Graph InsertEdge(graph, v1,v2)::= return a graph with new edge between
v1 and v2
Graph DeleteVertex(graph, v)::= return a graph in which v and all edges
incident to it are removed
Graph DeleteEdge(graph, v1, v2)::=return a graph in which the edge
(v1, v2) is removed
Boolean IsEmpty(graph)::= if (graph==empty graph) return TRUE
else return FALSE
List Adjacent(graph,v)::= return a list of all vertices that are adjacent
to v
Graph Representations
Graph can be represented in the following ways:
a) Adjacency Matrix
b) Adjacency Lists
c) Adjacency Multilists
a) Adjacency Matrix
Let G=(V,E) be a graph with n vertices.
The adjacency matrix of G is a two-dimensional by array, say adj_mat.
If the edge (vi, vj) is in E(G), adj_mat[i][j]=1
If there is no such edge in E(G), adj_mat[i][j]=0
The adjacency matrix for an undirected graph is symmetric; the adjacency matrix for a digraph
need not be symmetric
Examples for Adjacency Matrix:
Merits of Adjacency Matrix
From the adjacency matrix, to determine the connection of vertices is easy
The degree of a vertex is
For a digraph, the row sum is the out_degree, while the column sum is the
in_degree
b) Adjacency Lists
Each row in adjacency matrix is represented as an adjacency list.
Interesting Operations
degree of a vertex in an undirected graph
# of nodes in adjacency list
# of edges in a graph
determined in O(n+e)
out-degree of a vertex in a directed graph
# of nodes in its adjacency list
in-degree of a vertex in a directed graph
traverse the whole data structure
Orthogonal representation for graph G3
c) Adjacency Multilists
An edge in an undirected graph is represented by two nodes in adjacency list
representation.
Adjacency Multilists
– lists in which nodes may be shared among several lists. (an edge is shared by
two different paths)
depth first search: v0, v1, v3, v7, v4, v5, v2, v6
breadth first search: v0, v1, v2, v3, v4, v5, v6, v7
1 Initialize the
stack.
As C does not have any unvisited adjacent node so we keep popping the stack
until we find a node that has an unvisited adjacent node. In this case, there's
none and we keep popping until the stack is empty.
Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.
7 From A we have D as
unvisited adjacent node. We
mark it as visited and
enqueue it.
At this stage, we are left with no unmarked (unvisited) nodes. But as per the
algorithm we keep on dequeuing in order to get all unvisited nodes. When the
queue gets emptied, the program is over.
Spanning Trees
When graph G is connected, a depth first or breadth first search starting at any
vertex will visit all vertices in G
A spanning tree is any tree that consists solely of edges in G and that includes
all the vertices
E(G): T (tree edges) + N (nontree edges) where T: set of edges used during search
N: set of remaining edges
Examples of Spanning Tree
While adding a nontree edge into any spanning tree, this will create a cycle
DFS VS BFS Spanning Tree
A spanning tree is a minimal subgraph, G‟, of G such that V(G‟)=V(G) and G‟ is
connected.
Any connected graph with n vertices must have at least n-1 edges.
A biconnected graph is a connected graph that hasno articulation points.
biconnected component: a maximal connected subgraph H (no subgraph that is both
biconnected and properly contains H).
– Kruskal
– Prim
– Sollin
Kruskal’s Algorithm
Build a minimum cost spanning tree T by adding edges to T one at a time
Select the edges for inclusion in T in nondecreasing order of the cost
An edge is added to T if it does not form a cycle
Since G is connected and has n > 0 vertices, exactly n-1 edges will be selected
Kruskal’s algorithm
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanning
tree formed so far. If cycle is not formed, include this edge. Else,
discard it.
3. Repeat step#2 until there are (V-1) edges in the spanning tree.
Prim’s Algorithm
Prim's algorithm to find minimum cost spanning tree (as Kruskal's algorithm)
uses the greedy approach. Prim's algorithm shares a similarity with the
shortest path first algorithms.
Prim's algorithm, in contrast with Kruskal's algorithm, treats the nodes as a
single tree and keeps on adding new nodes to the spanning tree from the given
graph.
To contrast with Kruskal's algorithm and to understand Prim's algorithm better,
we shall use the same example −
Steps of Prim's Algorithm: The following are the main 3 steps of the Prim's
Algorithm:
1. Begin with any vertex which you think would be suitable and add it to the
tree.
2. Find an edge that connects any vertex in the tree to any vertex that is not in
the tree. Note that, we don't have to form cycles.
3. Stop when n - 1 edges have been added to the tree.
Solution 1
3)
Solution 2
iently large
number
Algorithm
An algorithm is a step-by-step procedure to solve a problem in a finite
number of steps.
Branching and repetition are included in the steps of an algorithm.
This branching and repetition depend on the problem for which Algorithm is
developed.
All the steps of Algorithm during the definition should be written in a
human-understandable language which does not depend on any
programming language.
we can choose any programming language to implement the Algorithm.
Pseudocode and flow chart are popular ways to represent an algorithm.
Binary Search
Problem definition: Let ai, 1 ≤ i ≤ n be a list of elements that are sorted in non-
decreasing order. The problem is to find whether a given element x is present in the list
or not. If x is present we have to determine a value j (element‘s position) such that aj=x.
If x is not in the list, then j is set to zero.
Solution: Let P = (n, ai…al , x) denote an arbitrary instance of search problem where n
is the number of elements in the list, ai…al is the list of elements and x is the key element
to be searched for in the given list. Binary search on the list is done as follows:
Step1: Pick an index q in the middle range [i, l] i.e. q= [(n + 1)/2] and compare x with
aq. Step 2: if x = aq i.e key element is equal to mid element, the problem is immediately
solved.
Step 3: if x <aqin this case x has to be searched for only in the sub-list ai, ai+1, ……, aq-
Therefore, problem reduces to (q-i, ai…aq-1, x).
Step 4: if x >aq,x has to be searched for only in the sub-list aq+1, ...,., al . Therefore problem
reduces to (l-i, aq+1…al, x).
For the above solution procedure, the Algorithm can be implemented as recursive or non-
recursive algorithm.
Recursive binary search algorithm
Iterative binary search:
Space Complexity
Compared to the straight forward method, the MaxMin method requires extra
stack space for i, j, max, min, max1 and min1. Given n elements there will be
[log2n] + 1 levels of recursion and we need to save seven values for each
recursive call. (6 + 1 for return address).
Merge Sort
Merge sort is a perfect example of a successful application of the divide-and
conquer technique. It sorts a given array A [O ... n - 1] by dividing it into two
halves A [0 .. \n/2]-1] and A [ ⎝n/2] .. n-1], sorting each of them recursively, and
then merging the two smaller sorted arrays into a single sorted one.
Example:
The operation of the algorithm on the list 8, 3, 2, 9, 7, 1, 5, 4 is illustrated in
the figure.
Analysis
Here the basic operation is key comparison. As merge sort execution does not
depend on the order of the data, best case and average case runtime are the
same as worst case runtime.
Worst case: During key comparison, neither of the two arrays becomes empty
before the
other one contains just one element leads to the worst case of merge sort.
Assuming for
simplicity that total number of elements n is a power of 2, the recurrence
relation for the
number of key comparisons C(n) is
where, Cmerge(n) is the number of key comparisons made during the merging
stage.
Let us analyze Cmerge(n), the number of key comparisons performed during the
merging stage. At each step, exactly one comparison is made, after which the
total number of elements in the two arrays still needing to be processed is
reduced by 1. In the worst case, neither of the two arrays becomes empty before
the other one contains just one element (e.g., smaller elements may come from
the alternating arrays).Therefore, for the worst case, Cmerge(n) = n –1.
Now,
Obviously, after a partition is achieved, A[s] will be in its final position in the sorted array,
and we can continue sorting the two subarrays to the left and the right of A[s] independently (e.g.,
by the same method).
In quick sort, the entire work happens in the division stage, with no work required to combine the
solutions to the sub problems.
Partitioning
We start by selecting a pivot—an element with respect to whose value we are going to divide
the subarray. There are several different strategies for selecting a pivot. We use
the sophisticated method suggested by C.A.R. Hoare, the prominent British computer scientist
who invented quicksort.
Select the subarray‘s first element: p = A[l].Now scan the subarray from both ends, comparing
the subarray‘s elements to the pivot.
The left-to-right scan, denoted below by index pointer i, starts with the second element. Since
we want elements smaller than the pivot to be in the left part of the subarray, this scan skips over
elements that are smaller than the pivot and stops upon encountering the first element greater than
or equal to the pivot.
The right-to-left scan, denoted below by index pointer j, starts with the last element of the
subarray. Since we want elements larger than the pivot to be in the right part of the subarray, this
scan skips over elements that are larger than the pivot and stops on encountering the first
element smaller than or equal to the pivot.
After both scans stop, three situations may arise, depending on whether or not
the scanning indices have crossed.
If scanning indices i and j have not crossed, i.e., i< j, we simply exchange
A[i] and A[j ] and resume the scans by incrementing I and decrementing j,
respectively:
If the scanning indices stop while pointing to the same element, i.e., i = j, the
value they are pointing to must be equal to p. Thus, we have the subarray
partitioned, with the split position s = i = j :
Analysis
Best Case -Here the basic operation is key comparison. Number of key
comparisons made before a partition is achieved is n + 1 if the scanning indices
cross over and n if they coincide. If all the splits happen in the middle of
corresponding subarrays, we will have the best case. The number of key
comparisons in the best case satisfies the recurrence,
According to the Master Theorem, Cbest(n) ∈Θ(n log2 n); solving it exactly for n =
2k yields
Cbest(n) = n log2 n.
Worst Case – In the worst case, all the splits will be skewed to the extreme: one
of the two subarrays will be empty, and the size of the other will be just 1 less
than the size of the subarray being partitioned. This unfortunate situation will
happen, in particular, for increasing arrays. Indeed, if A[0..n − 1] is a strictly
increasing array and we use A[0] as the pivot, the left-to-right scan will stop on
A[1] while the right-to-left scan will go all the way to reach A[0], indicating the
split at position 0:So, after making n + 1 comparisons to get to this partition
and exchanging the pivot A[0] with itself, the algorithm will be left with the
strictly increasing array A[1..n − 1] to sort. This sorting of strictly increasing
arrays of diminishing sizes will continue until the last one A[n−2.. n−1] has
been processed. The total number of key comparisons made will be equal to
Average Case - Let Cavg(n) be the average number of key comparisons made by
quicksort on a randomly ordered array of size n. A partition can happen in any
position s (0 ≤ s ≤ n−1) after n+1comparisons are made to achieve the partition.
After the partition, the left and right subarrays will have s and n − 1− s
elements, respectively. Assuming that the partition split can happen in each
position s with the same probability 1/n, we get the following recurrence
relation:
Its solution, which is much trickier than the worst- and best-case analyses,
turns out to be
Thus, on the average, quicksort makes only 39% more comparisons than in the
best case. Moreover, its innermost loop is so efficient that it usually runs faster
than mergesort on randomly ordered arrays of nontrivial sizes. This certainly
justifies the name given to the algorithm by its inventor.
Unit - 4
Greedy Method : The General Method – Optimal Storage on Tapes – Knapsack Problem – Job
Sequencing with Deadlines – Optimal Merge Patterns.
GENERAL METHOD
Greedy Method
Greedy is the most straight forward design technique. Most of the problems have n
inputs and require us to obtain a subset that satisfies some constraints. Any subset that
satisfies these constraints is called a feasible solution. We need to find a feasible solution
that either maximizes or minimizes the objective function. A feasible solution that does
this is called an optimal solution.
The greedy method is a simple strategy of progressively building up a solution, one
element at a time, by choosing the best possible element at each stage. At each stage, a
decision is made regarding whether or not a particular input is in an optimal solution.
This is done by considering the inputs in an order determined by some selection
procedure. If the inclusion of the next input, into the partially constructed optimal
solution will result in an infeasible solution then this input is not added to the partial
solution. The selection procedure itself is based on some optimization measure. Several
optimization measures are plausible for a given problem. Most of them, however, will
result in algorithms that generate sub-optimal solutions. This version of greedy
technique is called subset paradigm. Some problems like Knapsack, Job sequencing with
deadlines and minimum cost spanning trees are based on subset paradigm.
For the problems that make decisions by considering the inputs in some order, each
decision is made using an optimization criterion that can be computed using decisions
already made. This version of greedy method is ordering paradigm. Some problems like
optimal storage on tapes, optimal merge patterns and single source shortest path are
based on ordering paradigm.
CONTROL ABSTRACTION
Algorithm Greedy (a, n)
// a(1 : n) contains the „n‟ inputs
{
solution := ; // initialize the solution to empty for i:=1 to n do
{
x := select (a);
if feasible (solution, x) then
solution := Union (Solution, x);
}
return solution;
}
Procedure Greedy describes the essential way that a greedy based algorithm will look,
once a particular problem is chosen and the functions select, feasible and union are
properly implemented.
The function select selects an input from „a‟, removes it and assigns its value to „x‟. Feasible is a
Boolean valued function, which determines if „x‟ can be included into the solution vector. The
function Union combines „x‟ with solution and updates the objective function.
KNAPSACK PROBLEM
Let us apply the greedy method to solve the knapsack problem. We are given „n‟ objects
and a knapsack. The object „i‟ has a weight wi and the knapsack has a capacity „m‟. If a
fraction xi, 0 < xi < 1 of object i is placed into the knapsack then a profit of pi xi is
earned. The objective is to fill the knapsack that maximizes the total profit earned.
Since the knapsack capacity is „m‟, we require the total weight of all chosen objects to be at
most „m‟. The problem is stated as:
Algorithm:
The algorithm for assigning programs to tapes is as follows:
Algorithm Store (n, m)
// n is the number of programs and m the number of tapes
{
j := 0; // next tape to store on for i :=1 to n do
{
Print („append program‟, i, „to permutation for tape‟, j); j := (j + 1) mod m;
}
}
On any given tape, the programs are stored in non-decreasing order of their lengths.
Example:
Let n = 4, (P1, P2, P3, P4,) = (100, 10, 15, 27) and (d1 d2 d3 d4) = (2, 1, 2, 1). The
feasible solutions and their values are:
OPTIMAL MERGE PATERNS
Given „n‟ sorted files, there are many ways to pair wise merge them into a single sorted
file. As, different pairings require different amounts of computing time, we want to
determine an optimal (i.e., one requiring the fewest comparisons) way to pair wise
merge „n‟ sorted files together. This type of merging is called as 2-way merge patterns.
To merge an n-record file and an m-record file requires possibly n + m record moves,
the obvious choice choice is, at each step merge the two smallest files together. The
two-way merge patterns can be represented by binary merge trees.
struct treenode
{
treenode * lchild;
treenode * rchild;
};
Example 1:
Suppose we are having three sorted files X1, X2 and X3 of length 30, 20, and 10 records each.
Merging of the files can be carried out as follows:
Example 2:
Given five files (X1, X2, X3, X4, X5) with sizes (20, 30, 10, 5, 30). Apply greedy rule to
find optimal way of pair wise merging to give an optimal solution using binary merge
tree representation.
Solution:
Backtracking
Some problems can be solved, by exhaustive search. The exhaustive-search technique
suggests generating all candidate solutions and then identifying the one (or the ones) with a
desired property.
Backtracking is a more intelligent variation of this approach. The principal idea is to
construct solutions one component at a time and evaluate such partially constructed
candidates as follows. If a partially constructed solution can be developed further without
violating the problem‘s constraints, it is done by taking the first remaining legitimate option
for the next component. If there is no legitimate option for the next component, no
alternatives for any remaining component need to be considered. In this case, the algorithm
backtracks to replace the last component of the partially constructed solution with its next
option.
It is convenient to implement this kind of processing by constructing a tree of choices being
made, called the state-space tree. Its root represents an initial state before the search for a
solution begins. The nodes of the first level in the tree represent the choices made for the first
component of a solution; the nodes of the second level represent the choices for the second
component, and soon. A node in a state-space tree is said to be promising if it corresponds to
a partially constructed solution that may still lead to a complete solution; otherwise, it is
called non-promising. Leaves represent either non-promising dead ends or complete
solutions found by the algorithm.
In the majority of cases, a state space tree for a backtracking algorithm is constructed in the
manner of depth-first search. If the current node is promising, its child is generated by adding
the first remaining legitimate option for the next component of a solution, and the processing
moves to this child. If the current node turns out to be non-promising, the algorithm
backtracks to the node‘s parent to consider the next possible option for its last component; if
there is no such option, it backtracks one more level up the tree, and so on. Finally, if the
algorithm reaches a complete solution to the problem, it either stops (if just one solution is
required) or continues searching for other possible solutions.
General method
General Algorithm (Recursive)
N-Queens problem
The problem is to place n queens on an n × n chessboard so that no two queens attack each other
by being in the same row or in the same column or on the same diagonal.
So let us consider the four-queens problem and solve it by the backtracking technique. Since each of the
four queens has to be placed in its own row, all we need to do is to assign a column for each queen on the
board presented in figure.
We start with the empty board and then place queen 1 in the first possible position of its row, which is
in column 1 of row 1. Then we place queen 2, after trying unsuccessfully columns 1 and 2, in the first
acceptable position for it, which is square (2, 3), the square in row 2 and column 3. This proves to be a
dead end because there is no acceptable position for queen 3. So, the algorithm backtracks and puts
queen 2 in the next possible position at (2, 4). Then queen 3 is placed at (3, 2), which proves to be
another dead end. The algorithm then backtracks all the way to queen 1 and moves it to (1, 2). Queen 2
then goes to (2, 4), queen 3 to(3, 1), and queen 4 to (4, 3), which is a solution to the problem. The state-
space tree of this search is shown in figure.
Figure: State-space tree of solving the four-queens problem by backtracking. × denotes an
unsuccessful attempt to place a queen in the indicated column. The numbers above the nodes indicate
the order in which the nodes are generated.
If other solutions need to be found, the algorithm can simply resume its operations at the leaf at
which it stopped. Alternatively, we can use the board‘s symmetry for this purpose.
Finally, it should be pointed out that a single solution to the n-queens problem for any n ≥ 4 can
be found in linear time.
Note: The algorithm NQueens() is not in the syllabus. It is given here for interested learners. The
algorithm is referred from textbook T2.
Sum of subsets problem
Problem definition: Find a subset of a given set A = {a1, . . . , an} of n positive integers whose
sum is equal to a given positive integer d.
For example, for A = {1, 2, 5, 6, 8} and d = 9, there are two solutions: {1, 2, 6} and {1, 8}. Of
course, some instances of this problem may have no solutions.
It is convenient to sort the set‘s elements in increasing order. So, we will assume that
a1< a2< . . . < an.
The state-space tree can be constructed as a binary tree like that in Figure shown below for the
instance A = {3, 5, 6, 7} and d = 15.
The number inside a node is the sum of the elements already included in the subsets represented by the
node. The inequality below a leaf indicates the reason for its termination.
The root of the tree represents the starting point, with no decisions about the given elements made
as yet. Its left and right children represent, respectively, inclusion and exclusion of a 1 in a set
being sought.
Similarly, going to the left from a node of the first level corresponds to inclusion of a 2 while
going to the right corresponds to its exclusion, and so on. Thus, a path from the root to a node on
the ith level of the tree indicates which of the first in numbers have been included in the subsets
represented by that node.
We record the value of s, the sum of these numbers, in the node. If s is equal to d, we have a solution to
the problem. We can either report this result and stop or, if all the solutions need to be found, continue by
backtracking to the node‘s parent. If s is not equal to d, we can terminate the node as non-promising if
either of the following two inequalities holds:
Example: Apply backtracking to solve the following instance of the subset sum problem: A
= {1, 3, 4, 5} and d = 11.
Graph coloring
Analysis