
Design and Analysis of Algorithms

Chapter – One

Introduction and Elementary Data Structures

Introduction to Algorithm analysis


Heap
✓ A heap is a complete binary tree with the property that the value at each node
is at least as large as the values at its children (if they exist).

✓ This definition implies that the largest element is at the root of the heap.
✓ A heap permits one to insert elements into a set and also to find the largest
element efficiently.

✓ If the elements are distinct, then the root contains the largest item.
✓ In a min heap, the parent node instead contains a value as small as or smaller
than its children; in that case the root contains the smallest element.
✓ Unless stated otherwise, we assume a max heap, where the larger values are
closer to the root.
✓ A data structure which provides these two operations (insert and find-largest)
is called a priority queue.

✓ A heap is a data structure that stores a collection of objects (with keys)
and has the following properties:
– Complete Binary tree
– Heap Order
✓ It is implemented as an array where each node in the tree corresponds to
an element of the array.
Heap
• Each node of the binary tree corresponds to an element of the array.
• The array is completely filled on all levels except possibly lowest.

[Figure: a heap with 19 at the root, children 12 and 16, and leaves 1, 4, 7,
stored as Array A = 19 12 16 1 4 7]
Heap
• A heap can be stored as an array A.
– Root of the tree is A[1]
– Left child of A[i] = A[2i]
– Right child of A[i] = A[2i + 1]
– Parent of A[i] = A[⌊i/2⌋]
– Heapsize[A] ≤ length[A]
• The root of the tree is A[1]; given the index i of a node, the indices of its
parent, left child and right child can be computed as follows:

PARENT (i)
return floor(i/2)
LEFT (i)
return 2i
RIGHT (i)
return 2i + 1
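• As a concrete illustration, these index formulas can be written in runnable
Python (a minimal sketch; we keep the 1-based convention by leaving index 0
of the list unused):

# 1-based heap index helpers, mirroring PARENT, LEFT and RIGHT above.
def parent(i):
    return i // 2          # floor(i/2)

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

A = [None, 19, 12, 16, 1, 4, 7]    # A[1] is the root; slot 0 unused
print(A[left(1)], A[right(1)])     # children of the root: 12 16
print(A[parent(6)])                # parent of A[6] = 7 is 16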
Heap order property
• For every node v, other than the root, the key stored in v is smaller than or
equal (greater than or equal, for a min heap) to the key stored in the parent of v.
• In this (max heap) case the maximum value is stored in the root.
Definition
• Max Heap
– Used to sort data in ascending order (via heap sort)
– Has the property
A[Parent(i)] ≥ A[i]
• Min Heap
– Used to sort data in descending order
– Has the property
A[Parent(i)] ≤ A[i]
[Figure: a max heap with 19 at the root, children 12 and 16, leaves 1, 4, 7;
Array A = 19 12 16 1 4 7]

[Figure: a min heap with 1 at the root, children 4 and 16, leaves 7, 12, 19;
Array A = 1 4 16 7 12 19]
Insertion
• Algorithm
1. Add the new element to the next available position at the lowest level
2. Restore the max-heap property if violated
• General strategy is percolate up (or bubble up): if the parent of the
element is smaller than the element, then interchange the parent and
child.

OR

Restore the min-heap property if violated


• General strategy is percolate up (or bubble up): if the parent of the
element is larger than the element, then interchange the parent and
child.
Example: Insert 17 into the heap 19 12 16 1 4 7.
[Figure: 17 is added at the next available position, as the right child of 16,
giving 19 12 16 1 4 7 17; since the parent 16 is smaller than 17, the two are
swapped, giving 19 12 17 1 4 7 16.]
Percolate up to maintain the heap property.
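• A minimal Python sketch of max-heap insertion with percolate up (1-based
list with slot 0 unused; the function name insert_max_heap is ours):

def insert_max_heap(A, key):
    # Step 1: add the new element at the next available position.
    A.append(key)
    i = len(A) - 1
    # Step 2: percolate up while the parent is smaller than the child.
    while i > 1 and A[i // 2] < A[i]:
        A[i // 2], A[i] = A[i], A[i // 2]
        i //= 2

A = [None, 19, 12, 16, 1, 4, 7]
insert_max_heap(A, 17)
print(A[1:])    # [19, 12, 17, 1, 4, 7, 16], as in the example above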


Deletion
• Delete max
– Copy the last number to the root (overwriting the maximum element stored
there).
– Restore the max heap property by percolating down.
• Delete min
– Copy the last number to the root (overwriting the minimum element stored
there).
– Restore the min heap property by percolating down.
• Heap sort: a sorting algorithm that works by first organizing the data to be
sorted into a special type of binary tree called a heap.
Procedures on Heap
• Heapify
• Build Heap
• Heap Sort
Heapify
• Heapify picks the largest child key and compares it to the parent key.
• If the parent key is larger, then Heapify quits; otherwise it swaps the parent key
with the largest child key, so that the parent now becomes larger than its children.
Heapify(A, i)
{
l ← left(i)
r ← right(i)
if l <= heapsize[A] and A[l] > A[i]
then largest ← l
else largest ← i
if r <= heapsize[A] and A[r] > A[largest]
then largest ← r
if largest != i
then swap A[i] ↔ A[largest]
Heapify(A, largest)
}
Build heap
• We can use the procedure 'Heapify' in a bottom-up fashion to convert an array
A[1 . . n] into a heap. Since the elements in the subarray A[⌊n/2⌋ + 1 . . n] are
all leaves, the procedure BUILD_HEAP goes through the remaining nodes of the
tree and runs 'Heapify' on each one. The bottom-up order of processing nodes
guarantees that the subtrees rooted at the children are heaps before 'Heapify'
is run at their parent.
Buildheap(A)
{
heapsize[A] ← length[A]
for i ← ⌊length[A]/2⌋ down to 1
do Heapify(A, i)
}
Heap Sort Algorithm
• The heap sort algorithm starts by using procedure BUILD-HEAP to build a heap
on the input array A[1 . . n]. Since the maximum element of the array is stored at
the root A[1], it can be put into its correct final position by exchanging it with
A[n] (the last element in A). If we now discard node n from the heap, the
remaining elements can be made into a heap. Note that the new element at the root
may violate the heap property; all that is needed to restore it is one call to Heapify.
Heapsort(A)
{
Buildheap(A)
for i ← length[A] down to 2
do swap A[1] ↔ A[i]
heapsize[A] ← heapsize[A] - 1
Heapify(A, 1)
}
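• The three procedures translate directly into runnable Python; the sketch below
(the names and the 1-based indexing are our choices) sorts in place in
ascending order:

def heapify(A, i, heapsize):
    # Sift A[i] down until the subtree rooted at i is a max heap.
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heapsize and A[l] > A[largest]:
        largest = l
    if r <= heapsize and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heapsize)

def buildheap(A):
    heapsize = len(A) - 1                   # slot 0 of the list is unused
    for i in range(heapsize // 2, 0, -1):   # internal nodes, bottom up
        heapify(A, i, heapsize)

def heapsort(A):
    buildheap(A)
    heapsize = len(A) - 1
    for i in range(heapsize, 1, -1):
        A[1], A[i] = A[i], A[1]        # move the maximum to its final slot
        heapsize -= 1                  # discard node i from the heap
        heapify(A, 1, heapsize)        # restore the heap property

A = [None, 16, 4, 7, 1, 12, 19]
heapsort(A)
print(A[1:])    # [1, 4, 7, 12, 16, 19]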
Example: Convert the following array to a heap

16 4 7 1 12 19

Picture the array as a complete binary tree: 16 at the root, its children 4
and 7, and leaves 1, 12, 19.
[Figure: Buildheap steps:
• Heapify the node holding 7: its largest child is 19, so swap them,
giving 16 4 19 1 12 7.
• Heapify the node holding 4: its largest child is 12, so swap them,
giving 16 12 19 1 4 7.
• Heapify the root 16: its largest child is 19, so swap them,
giving 19 12 16 1 4 7.]
Heap Sort
• The heapsort algorithm consists of two phases:
- build a heap from an arbitrary array
- use the heap to sort the data

• To sort the elements in the decreasing order, use a min heap


• To sort the elements in the increasing order, use a max heap
[Figure: Heap sort on the heap 19 12 16 1 4 7:
• Take out the biggest (19) and move the last element to the root;
Array A = 12 16 1 4 7, Sorted: 19.
• Repeating (re-heapify, take out the biggest, move the last element to the
root) removes one element at a time.
• At the end, Array A = 1 4 7 12 16 19, fully sorted.]
Hashing Algorithms.
• Hashing algorithms are a certain type of search procedure.
• We assume that we are given a set of records, where each record R is uniquely
identified by its key K.
• Besides K the record R contains some unspecified useful information in the
field INFO.
• We wish to organize our records in such a way that (1) we can quickly find the
record having a given key K (if such a record exists), and (2) we can easily add
additional records to our collection.
• A straightforward way to implement this organization is to maintain our
records in a table.
• A table entry is either empty, or it contains one of our records, in which case it
is full.
• A record with key K is found by searching the table for an entry with that key;
similarly, a new record can be inserted into the table by searching for an
empty position.
• Suppose we want to design a system for storing employee records
keyed using phone numbers. And we want following queries to be
performed efficiently:
➢Insert a phone number and corresponding information.
➢Search a phone number and fetch the information.
➢Delete a phone number and related information.
• We can think of using the following data structures to maintain
information about different phone numbers.
➢Array of phone numbers and records.
➢Linked List of phone numbers and records.
➢Balanced binary search tree with phone numbers as keys.
➢Direct Access Table.
• Direct access table use big array and use phone numbers as
index in the array.
• An entry in array is NULL if phone number is not present, else
the array entry stores pointer to records corresponding to phone
number.
• To insert a phone number, we create a record with details of
given phone number, use phone number as index and store the
pointer to the created record in table.
• This solution has many practical limitations.
➢ The first problem with this solution is that the extra space required is
huge.
➢ Another problem is that an integer in a programming language may not be
able to store an n-digit phone number.
• Due to above limitations Direct Access Table cannot always be
used.
• Hashing is the solution that can be used in almost all such situations
and performs extremely well compared to above data structures like
Array, Linked List, Balanced BST in practice.
• With hashing we get O(1) search time on average (under reasonable
assumptions) and O(n) in worst case.
• Hashing is an improvement over Direct Access Table.
• The idea is to use hash function that converts a given phone number
or any other key to a smaller number and uses the small number as
index in a table called hash table.
Hash Function
• A function that converts a given big phone number to a small practical integer value.
• The mapped integer value is used as an index in hash table.
• In simple terms, a hash function maps a big number or string to a small integer that
can be used as index in hash table.
• A good hash function should have following properties
1) Efficiently computable.
2) Should uniformly distribute the keys.
• Hash Table: An array that stores pointers to records corresponding to a given phone
number.
• An entry in hash table is NIL if no existing phone number has hash function value
equal to the index for the entry.
• Collision Handling: since a hash function maps a large set of possible keys to a
smaller range of indices, two keys may hash to the same value; this is called a
collision.
• E.g. keys with h(k2) = h(k7) = h(k5), h(k1) = h(k4) and h(k6) = h(k8)
• Following are the ways to handle collisions:
✓ Chaining: The idea is to make each cell of hash table point to a linked list of
records that have same hash function value. Chaining is simple, but requires
additional memory outside the table.
✓ Open Addressing: In open addressing, all elements are stored in the hash table
itself. Each table entry contains either a record or NIL.
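• A minimal chaining sketch in Python (the table size, the division-method hash
function and the class name are our illustrative choices, not from the slides):

class ChainedHashTable:
    def __init__(self, size=11):
        self.size = size
        self.table = [[] for _ in range(size)]   # one chain per slot

    def _hash(self, key):
        return key % self.size        # simple division-method hash

    def insert(self, key, record):
        self.table[self._hash(key)].append((key, record))

    def search(self, key):
        for k, record in self.table[self._hash(key)]:
            if k == key:
                return record
        return None                   # NIL: no record with this key

    def delete(self, key):
        chain = self.table[self._hash(key)]
        self.table[self._hash(key)] = [p for p in chain if p[0] != key]

book = ChainedHashTable()
book.insert(251911234567, "record for this phone number")
print(book.search(251911234567))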
Sets Representation
• A set is defined as a collection of distinct objects of the same type or
class of objects like numbers, alphabets, names, etc.
• Sets are represented in two forms:-
✓ a) Roster or tabular form: List all the elements of the set within braces { }
and separate them by commas.
➢Example: If A = the set of all odd numbers less than 10,
then in roster form it can be expressed as
A = {1, 3, 5, 7, 9}.
✓ b) Set Builder form: List the properties fulfilled by all the elements of the
set.
➢ We write {x : x satisfies property P}, read as 'the set of all
x such that x has property P.'
➢ Example: If B = {2, 4, 8, 16, 32}, then the set builder representation
will be: B = {x : x = 2^n, where n ∈ N and 1 ≤ n ≤ 5}
• Disjoint Sets: disjoint set union ... if Si and Sj are two disjoint
sets, then their union Si ∪ Sj = {all elements x such that x is in Si or
Sj}.
Example:
➢ If we have the sets S1 = {1, 7, 8, 9}, S2 = {2, 5, 10} and S3 = {3, 4, 6},
then S1 ∪ S2 = {1, 7, 8, 9, 2, 5, 10}.

▪ Find(i) ... find the set containing element i. Thus, 4 is in set S3 and 9
is in set S1.
▪ The sets will be represented by trees.
▪ [Figure: tree representations of the disjoint sets S1, S2 and S3.]
Union of Sets
• Union of Sets A and B is defined to be the set of all those elements which belong to
A or B or both and is denoted by A∪B.

• The nodes are linked on the parent relationship, i.e. each node
other than the root is linked to its parent.
• Example: S1 ∪ S2 can be represented by linking the root of one tree to the
root of the other in the tree representation.

• In presenting the UNION and FIND algorithms we shall identify sets by the index
of the roots of the trees.
• The operation of FIND(i) now becomes: determine the root of
the tree containing element i.
• UNION(i, j) requires two trees with roots i and j to be joined.
• Each node needs only one field, the PARENT field to link to its parent.
• Root nodes have a PARENT field of zero.
Simple union and find algorithms
procedure U(i, j)
//replace the disjoint sets with roots i and j, i ≠ j, by their union//
integer i, j
PARENT(i) ← j
end U
procedure F(i)
integer i, j //find the root of the tree containing element i//
j ← i
while PARENT(j) > 0 do //PARENT(j) = 0 if this node is a root//
j ← PARENT(j)
repeat
return(j)
end F
• For instance, if we start off with n elements each in a set of its own, i.e. Si= {i}, 1 ≤
i ≤ n, then the initial configuration consists of a forest with n nodes and PARENT(i)
= 0, 1 ≤ i ≤ n.
• Now imagine that we process the following sequences of UNION-FIND
operations:
• U(1, 2), F(1), U(2, 3), F(1), U(3, 4), F(1), U(4, 5), ... , F(1), U(n - 1, n)
Weighting Rule for UNION(i, j):
• If the number of nodes in tree i is less than the number in tree j, then make j
the parent of i; otherwise make i the parent of j.
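• A runnable Python sketch of the simple FIND and the weighted UNION (1-based
arrays; PARENT(i) = 0 marks a root, and a separate count array carries the tree
sizes needed by the weighting rule):

def find(parent, i):
    # Follow parent links up to the root (parent[root] == 0).
    while parent[i] > 0:
        i = parent[i]
    return i

def weighted_union(parent, count, i, j):
    # i and j are roots of distinct trees; hang the smaller under the larger.
    if count[i] < count[j]:
        parent[i] = j
        count[j] += count[i]
    else:
        parent[j] = i
        count[i] += count[j]

n = 5
parent = [0] * (n + 1)    # each element starts in its own set
count = [1] * (n + 1)
weighted_union(parent, count, 1, 2)
weighted_union(parent, count, find(parent, 2), 3)
print(find(parent, 3))    # 1: all three elements are now in one tree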
Chapter-2 Divide and conquer
• Outlines:
✓ The General Method
✓ Binary search
✓ Finding maximum and minimum
✓ Merge sort
✓ Quick sort
✓ Selection sort
Divide & Conquer
• Divide & Conquer is a design technique that solves a problem by breaking it
into smaller subproblems; sorting is a classic application.
• It is a design strategy well known for breaking down efficiency barriers.

• This approach has three steps at each level of recursion:

1. Divide the problem into a number of smaller units called sub-problems.
2. Conquer (solve) the sub-problems recursively.
3. Combine the solutions of all the sub-problems into a solution
for the original problem.
Maximum and Minimum:
1. Let us consider a simple problem that can be solved by the divide-
and-conquer technique.
2. The problem is to find the maximum and minimum values in a set of
'n' elements.
3. The time complexity of this algorithm can be analysed by counting the
number of element comparisons.
4. Hence, the time is determined mainly by the total cost of the
element comparisons.
Algorithm straight MaxMin(a,n,max,min)
//set max to the maximum & min to the minimum of a[1:n]
{
max = min= a[1];
for i = 2 to n do
{
if (a [i]>max) then max = a[i];
if (a [i]<min) then min = a[i];
}
}
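• The same procedure as runnable Python, together with a divide-and-conquer
version in the spirit of this chapter (the function names are ours):

def straight_max_min(a):
    # One pass; at most 2(n - 1) element comparisons.
    max_val = min_val = a[0]
    for x in a[1:]:
        if x > max_val:
            max_val = x
        if x < min_val:
            min_val = x
    return max_val, min_val

def dc_max_min(a, lo, hi):
    # Divide and conquer: solve each half, then combine.
    if lo == hi:                       # one element
        return a[lo], a[lo]
    if hi == lo + 1:                   # two elements: one comparison
        return (a[lo], a[hi]) if a[lo] > a[hi] else (a[hi], a[lo])
    mid = (lo + hi) // 2
    max1, min1 = dc_max_min(a, lo, mid)
    max2, min2 = dc_max_min(a, mid + 1, hi)
    return max(max1, max2), min(min1, min2)

data = [22, 13, -5, -8, 15, 60, 17, 31, 47]
print(straight_max_min(data))               # (60, -8)
print(dc_max_min(data, 0, len(data) - 1))   # (60, -8)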
Binary Search Algorithm
• A binary search algorithm is a search algorithm that finds the
position of a searched value within a sorted array.
• In the binary search algorithm, the element in the middle of the
array is compared each time with the searched element.
• If the middle element is not equal to the searched element, the
search is repeated in the half of the array that may contain the
searched element.
• In this way, the search space is halved at each step.
• The binary search algorithm works on sorted arrays.
• A non-sorted array must first be sorted by some sorting
algorithm before a binary search can be applied.
The steps of the binary search algorithm:

1. Select the element in the middle of the array.

2. Compare the selected element to the searched element; if they are
equal, terminate.

3. If the searched element is larger than the selected element, repeat
the search operation in the upper part of the array.

4. If the searched element is smaller than the selected element,
repeat the search in the lower part of the array.

5. Repeat these steps until the element is found or the smallest index in
the search space exceeds the largest index (the search space is empty).
• For example: 2, 3, 4, 5, 6, 7, 8, 9, 22, 33, 45.

• The following steps will be followed to find the number 4
with binary search in this sorted array.
• The middle element of the array is selected as 7 and compared with
the searched element 4.
• The searched element (4) is not equal to the middle element (7), and since
4 < 7, the search continues in the part of the array to the left of the middle.

• Our new search array: 2, 3, 4, 5, 6.

• The middle element of our new search array is 4 and the search is
completed.

• Complexity - With the binary search algorithm, it is possible to find the
searched value in at most ⌊log2(N)⌋ + 1 comparisons in an N-element array.
Algorithm
Algorithm: BINSRCH (a, n, x)
// array a(1 : n) of elements in increasing order, n ≥ 0,
// determine whether 'x' is present, and if so, set j such that x = a(j);
// else return 0
{
low := 1; high := n;
while (low ≤ high) do
{
mid := ⌊(low + high)/2⌋;
if (x < a[mid]) then high := mid – 1;
else if (x > a[mid]) then low := mid + 1;
else return mid;
}
return 0;
}
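• The same algorithm in runnable Python, using 0-based indexing and returning
-1 instead of 0 when 'x' is absent (a minimal sketch):

def binsrch(a, x):
    # a is sorted in increasing order; return the index of x, or -1.
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2
        if x < a[mid]:
            high = mid - 1
        elif x > a[mid]:
            low = mid + 1
        else:
            return mid
    return -1    # search space is empty: x is not present

a = [2, 3, 4, 5, 6, 7, 8, 9, 22, 33, 45]
print(binsrch(a, 4))     # 2  (found, as in the example above)
print(binsrch(a, 42))    # -1 (not found)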
Example for Binary Search
Let us illustrate binary search on the following 9 elements:

• The number of comparisons required for searching different elements is as follows:

1. Searching for x = 82:    low  high  mid
                             1    9     5
                             6    9     7
                             8    9     8
   Number of comparisons = 3 (Found)
2. Searching for x = 42:    low  high  mid
                             1    9     5
                             6    9     7
                             6    6     6
                             7    6     (Not Found)
   Number of comparisons = 4
3. Searching for x = -14:   low  high  mid
                             1    9     5
                             1    4     2
                             1    1     1
                             2    1     (Not Found)
   Number of comparisons = 4
▪ Continuing in this manner the number of element comparisons needed to find each of nine
elements is:
• Merge Sort is a kind of Divide and Conquer algorithm.

• To sort an array, recursively sort its left and right halves
separately and then merge them.

• The time complexity of merge sort in the best case, worst case and
average case is O(n log n) and the number of comparisons used is
nearly optimal.
Divide and Conquer Strategy
• Using the Divide and Conquer technique, we divide a problem into
subproblems.
• When the solution to each subproblem is ready, we 'combine' the results
from the subproblems to solve the main problem.
• Suppose we had to sort an array A.
• A subproblem would be to sort a sub-section of this array starting at
index p and ending at index r, denoted as A[p..r].
Divide
• If q is the half-way point between p and r, then we can split the subarray
A[p..r] into two arrays A[p..q] and A[q+1..r].
Conquer
• In the conquer step, we try to sort both the subarrays A[p..q] and A[q+1..r].
• If we haven't yet reached the base case, we again divide both these
subarrays and try to sort them.
Combine
• When the conquer step reaches the base step and we get two sorted subarrays A[p..q] and
A[q+1..r] for array A[p..r], we combine the results by creating a sorted array A[p..r] from
the two sorted subarrays A[p..q] and A[q+1..r].

The Merge Sort Algorithm


• The Merge Sort function repeatedly divides the array into two halves until we reach a
stage where we try to perform Merge Sort on a subarray of size 1, i.e. p == r.
MergeSort(A, p, r)
if p >= r
return
q = (p + r)/2
mergeSort(A, p, q)
mergeSort(A, q+1, r)
merge(A, p, q, r)
• E.g. apply the merge sort algorithm to the following 8 entries:
7, 2, 9, 4, 3, 8, 6, 1.
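• Since the merge procedure is not spelled out above, here is one runnable
Python sketch of both functions (inclusive bounds, in-place; the names are
ours), applied to the 8 entries of the example:

def merge(a, p, q, r):
    # Merge the sorted runs a[p..q] and a[q+1..r].
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1

def merge_sort(a, p, r):
    if p >= r:                  # base case: at most one element
        return
    q = (p + r) // 2
    merge_sort(a, p, q)
    merge_sort(a, q + 1, r)
    merge(a, p, q, r)

a = [7, 2, 9, 4, 3, 8, 6, 1]
merge_sort(a, 0, len(a) - 1)
print(a)    # [1, 2, 3, 4, 6, 7, 8, 9]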
Quick Sort
• Quick Sort is also based on the concept of Divide and Conquer, just like merge
sort.
• But in quick sort all the heavy lifting(major work) is done while dividing the
array into subarrays, while in case of merge sort, all the real work happens
during merging the subarrays.
• In case of quick sort, the combine step does absolutely nothing.
• It is also called partition-exchange sort.
• This algorithm divides the list into three main parts:

1. Elements less than the Pivot element


2. Pivot element(Central element)
3. Elements greater than the pivot element
• Pivot element can be any element from the array, it can be the first
element, the last element or any random element.
• Here we will take the rightmost element or the last element as pivot.
• For example: in the array {52, 37, 63, 14, 17, 8, 6, 25}, we
take 25 as pivot.
• So after the first pass, the list will be changed like this:
• {6 8 17 14 25 63 37 52}
• Hence after the first pass, the pivot is set at its position, with all the
elements smaller than it on its left and all the elements larger than it on
its right.
• Now {6 8 17 14} and {63 37 52} are sorted recursively, after which the
complete array is sorted.
Let's consider an array with values {9, 7, 5, 11, 12, 2, 14, 3, 10, 6}
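• A runnable Python sketch using the last element as pivot (a Lomuto-style
partition; the names are ours). The final sorted output matches the
discussion above, though the exact arrangement after the first pass depends
on the partition scheme used; the pivot still lands at its final position:

def partition(a, low, high):
    # Place pivot a[high] in its final position: smaller elements to its
    # left, larger elements to its right; return the pivot's index.
    pivot = a[high]
    i = low - 1
    for j in range(low, high):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[high] = a[high], a[i + 1]
    return i + 1

def quick_sort(a, low, high):
    if low < high:
        p = partition(a, low, high)   # all the heavy lifting happens here
        quick_sort(a, low, p - 1)     # elements less than the pivot
        quick_sort(a, p + 1, high)    # elements greater than the pivot

a = [52, 37, 63, 14, 17, 8, 6, 25]
quick_sort(a, 0, len(a) - 1)
print(a)    # [6, 8, 14, 17, 25, 37, 52, 63]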
Selection Sort:
• The selection sort algorithm sorts an array by repeatedly finding the
minimum element (considering ascending order) from unsorted part
and putting it at the beginning.
• The algorithm maintains two subarrays in a given array.
1. The subarray which is already sorted.
2. Remaining subarray which is unsorted.
• In every iteration of selection sort, the minimum element
(considering ascending order) from the unsorted subarray is picked
and moved to the sorted subarray.
arr[] = 64 25 12 22 11
// Find the minimum element in arr[0...4]
// and place it at beginning
11 25 12 22 64
// Find the minimum element in arr[1...4]
// and place it at beginning of arr[1...4]
11 12 25 22 64
// Find the minimum element in arr[2...4]
// and place it at beginning of arr[2...4]
11 12 22 25 64
// Find the minimum element in arr[3...4]
// and place it at beginning of arr[3...4]
11 12 22 25 64
• Time Complexity: O(n²), as there are two nested loops.
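• The trace above corresponds to this minimal Python sketch:

def selection_sort(arr):
    n = len(arr)
    for i in range(n - 1):
        # Find the minimum element in the unsorted part arr[i..n-1].
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        # Move it to the beginning of the unsorted part.
        arr[i], arr[min_idx] = arr[min_idx], arr[i]

arr = [64, 25, 12, 22, 11]
selection_sort(arr)
print(arr)    # [11, 12, 22, 25, 64]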
Chapter-3 Greedy Method
Contents
✓ Job sequencing with deadlines
✓ Optimal merge pattern
✓ Minimum spanning trees
✓ Single source shortest path
• Greedy is the most straightforward design technique.
• Most of the problems have n inputs and require us to obtain a subset that
satisfies some constraints.
• Any subset that satisfies these constraints is called a feasible solution.
• We need to find a feasible solution that either maximizes or minimizes the
objective function.
• A feasible solution that does this is called an optimal solution.
• The greedy method is a simple strategy of progressively building up a solution,
one element at a time, by choosing the best possible element at each stage.
• At each stage, a decision is made regarding whether or not a particular input is
in an optimal solution.
• This is done by considering the inputs in an order determined by some
selection procedure.

• If the inclusion of the next input, into the partially constructed optimal
solution will result in an infeasible solution then this input is not added to the
partial solution.

• Several optimization measures are possible, and some of them will result in
algorithms that generate sub-optimal solutions.

• This version of the greedy technique is called the subset paradigm.

• Some problems like Knapsack, Job sequencing with deadlines and minimum
cost spanning trees are based on subset paradigm.
Algorithm Greedy (a, n)
// a(1 : n) contains the 'n' inputs
{
solution := ∅; // initialize the solution to empty
for i := 1 to n do
{
x := select(a);
if feasible(solution, x) then
solution := Union(solution, x);
}
return solution;
}
➢ In procedure Greedy, the function select selects an input from 'a', removes it,
and assigns its value to 'x'.

➢ Feasible is a Boolean valued function, which determines if ‘x’ can be included into
the solution vector.

➢ The function Union combines ‘x’ with solution and updates the objective function.
Job Sequencing With Deadlines
• The sequencing of jobs on a single processor with deadline constraints is called as
Job Sequencing with Deadlines.
Here:
➢ You are given a set of jobs.
➢ Each job has a defined deadline and some profit associated with it.
➢ The profit of a job is given only when that job is completed within its
deadline.
➢ Only one processor is available for processing all the jobs.
➢ Processor takes one unit of time to complete a job.
The problem states-
• “How can the total profit be maximized if only one job can be completed at a
time?”
Approach to Solution-
• A feasible solution would be a subset of jobs where each job of the subset
gets completed within its deadline.

• Value of the feasible solution would be the sum of profit of all the jobs
contained in the subset.
• An optimal solution of the problem would be a feasible solution which
gives the maximum profit.

Greedy Algorithm-
• Greedy Algorithm is adopted to determine how the next job is selected for
an optimal solution.

• The greedy algorithm described below always gives an optimal solution to


the job sequencing problem.
Step-01:
✓ Sort all the given jobs in decreasing order of their profit.

Step-02:
✓ Check the value of maximum deadline.
✓ Draw a Gantt chart where maximum time on Gantt chart is the value of
maximum deadline.

Step-03:
✓ Pick up the jobs one by one.
✓ Put the job on Gantt chart as far as possible from 0 ensuring that the job
gets completed before its deadline.
Practice problem based on job sequencing with deadlines-
• Problem- Given the jobs, their deadlines and associated profits as shown-

Jobs J1 J2 J3 J4 J5 J6
Deadlines 5 3 3 2 4 2
Profits 200 180 190 300 120 100

• Answer the following questions-


1. Write the optimal schedule that gives maximum profit.
2. Are all the jobs completed in the optimal schedule?
3. What is the maximum earned profit?
Solution-
Step-01:
✓ Sort all the given jobs in decreasing order of their profit-
Jobs J4 J1 J3 J2 J5 J6
Deadlines 2 5 3 3 4 2
Profits 300 200 190 180 120 100

Step-02:
✓ Value of maximum deadline = 5. So, draw a Gantt chart with
maximum time on Gantt chart = 5 units as shown.
✓ Now, We take each job one by one in the order they appear in Step-01.
✓ We place the job on Gantt chart as far as possible from 0.
Step-03:
✓ We take job J4. Since its deadline is 2, we place it in the first empty cell
before deadline 2.

Step-04:
✓ We take job J1. Since its deadline is 5, we place it in the first empty cell
before deadline 5.
Step-05:
✓ We take job J3. Since its deadline is 3, we place it in the first empty cell
before deadline 3.

Step-06:
• We take job J2. Since its deadline is 3, we try to place it in the first empty
cell before deadline 3.
• Since the second and third cells are already filled, we place job J2 in the
first cell.
Step-07:
✓ Now, we take job J5. Since its deadline is 4, we place it in the first empty cell
before deadline 4.

• Now, the only job left is job J6, whose deadline is 2. All the slots before
deadline 2 are already occupied.
• Thus, job J6 cannot be completed. Now, the given questions may be answered as-

• Part-01: Answer for Question 1


• The optimal schedule is- J2 , J4 , J3 , J5 , J1 . This is the required order in which
the jobs must be completed in order to obtain the maximum profit.
Part-02: Answer for Question 2
✓ Not all the jobs are completed in the optimal schedule.
✓ This is because job J6 could not be completed within its deadline.
• Part-03: Answer for Question 3
✓ Maximum earned profit
= Sum of profit of all the jobs in the optimal schedule
= Profit of job J2 + Profit of job J4 + Profit of job J3 + Profit of job J5 + Profit
of job J1
= 180 + 300 + 190 + 120 + 200
= 990
• Algorithm GreedyJob (d, J, n)
// J is a set of jobs that can be completed by their deadlines.
{
J := {1};
for i := 2 to n do
{
if (all jobs in J U {i} can be completed by their deadlines)
then J := J U {i};
}
}
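• A Python sketch of the slot-filling greedy described in Steps 01-03 (placing
each job in the latest free slot at or before its deadline); it reproduces the
worked example above, and the names are ours:

def job_sequencing(jobs):
    # jobs: list of (name, deadline, profit) tuples.
    jobs = sorted(jobs, key=lambda j: j[2], reverse=True)  # profit, desc
    max_deadline = max(deadline for _, deadline, _ in jobs)
    slots = [None] * (max_deadline + 1)    # slots[1..max_deadline]
    total = 0
    for name, deadline, profit in jobs:
        # Put the job as far as possible from 0, i.e. in the latest
        # free slot at or before its deadline.
        for t in range(deadline, 0, -1):
            if slots[t] is None:
                slots[t] = name
                total += profit
                break                      # job scheduled
    return slots[1:], total

jobs = [("J1", 5, 200), ("J2", 3, 180), ("J3", 3, 190),
        ("J4", 2, 300), ("J5", 4, 120), ("J6", 2, 100)]
print(job_sequencing(jobs))
# (['J2', 'J4', 'J3', 'J5', 'J1'], 990); J6 is dropped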
Optimal merge pattern:
• Optimal merge pattern is a pattern that relates to the merging of
two or more sorted files in a single sorted file.

• This type of merging can be done by the two-way merging method.

• If we have two sorted files containing n and m records respectively


then they could be merged together, to obtain one sorted file in
time O (n+m).

• There are many ways in which pairwise merge can be done to get a
single sorted file.
Algorithm to Generate Two-way Merge Tree:
struct treenode
{
treenode * lchild;
treenode * rchild;
int weight;
};
Algorithm TREE (n)
// list is a global list of n single-node binary trees
{
for i := 1 to n – 1 do
{
pt ← new treenode;
(pt → lchild) ← least(list); // merge two trees with smallest lengths
(pt → rchild) ← least(list);
(pt → weight) ← ((pt → lchild) → weight) + ((pt → rchild) → weight);
insert(list, pt);
}
return least(list); // the tree left in list is the merge tree
}
Example:
• Given a set of unsorted files: 5, 3, 2, 7, 9, 13
• Now, arrange these elements in ascending order: 2, 3, 5, 7, 9, 13
• After this, repeatedly pick the two smallest numbers, merge them, and put the
result back, until we are left with only one number:
Step 1: merge 2 and 3 (cost 5); remaining: 5, 5, 7, 9, 13
Step 2: merge 5 and 5 (cost 10); remaining: 7, 9, 10, 13
Step 3: merge 7 and 9 (cost 16); remaining: 10, 13, 16
Step 4: merge 10 and 13 (cost 23); remaining: 16, 23
Step 5: merge 16 and 23 (cost 39); remaining: 39

So, the merging cost = 5 + 10 + 16 + 23 + 39 = 93
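• The same computation using Python's heapq module as the min-priority queue
standing in for least(list):

import heapq

def optimal_merge_cost(sizes):
    # Repeatedly merge the two smallest files; sum the merge costs.
    heapq.heapify(sizes)
    total = 0
    while len(sizes) > 1:
        a = heapq.heappop(sizes)
        b = heapq.heappop(sizes)
        total += a + b
        heapq.heappush(sizes, a + b)   # the merged file rejoins the pool
    return total

print(optimal_merge_cost([5, 3, 2, 7, 9, 13]))   # 93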


Minimum Spanning Trees
• Definition: a spanning tree is a subgraph which contains all the
vertices of a graph with the minimum number of edges.
• That is, a spanning subgraph (contains all vertices) that is a tree
(minimum number of edges). Examples.
• The number of edges is e = n - 1, where n is the number of vertices.
• So from a given set of vertices we can get many spanning trees.
Spanning Trees
• Given a (connected) graph G(V,E), a spanning tree T(V',E'):
✓ Is a subgraph of G; that is, V' ⊆ V, E' ⊆ E.
✓ Spans the graph (V' = V)
✓ Forms a tree (no cycle);
✓ So, E' has |V| - 1 edges

Examples
Minimum Spanning Trees
• Spanning Tree
– A tree (i.e., connected, acyclic graph) which contains all the
vertices of the graph
• Minimum Spanning Tree
– Spanning tree with the minimum sum of weights
[Figure: a weighted graph on vertices a-i; the highlighted edges form a
minimum spanning tree.]
• Spanning forest
– If a graph is not connected, then there is a spanning tree for each connected
component of the graph
Properties of Minimum Spanning Trees

• Minimum spanning tree is not unique

• MST has no cycles


✓ We can take out an edge of a cycle, and still have the vertices
connected while reducing the cost
• # of edges in a MST:
✓ |V| - 1
Two Algorithms
1. Prim: (build tree incrementally)
– Pick the lowest-cost edge connected to the known (incomplete) spanning tree that does
not create a cycle, and expand the tree to include it
– Steps:
1. Remove loops and parallel edges (keep the minimum weight).
2. While adding a new edge, select the edge with minimum weight out of the edges from
already visited vertices (no cycle allowed).
3. Stop at n – 1 edges.

2. Kruskal: (build a forest that will finish as a tree)

– Pick the lowest-cost edge not yet in a tree that does not create a cycle.
– Then expand the set of included edges to include it. (It will be somewhere in the forest.)
– Steps:
1. Remove loops and parallel edges (keep the minimum weight).
2. List all edges and sort them according to weight (ascending order).
3. Take n – 1 edges from the sorted list (skip cycle-making edges).
Prim's algorithm (worked example)
• Starting from empty T, choose a vertex at random and initialize:
V = {1}, E' = { }
• Choose the vertex u not in V such that the edge weight from u to a vertex
in V is minimal (greedy!):
V = {1,3}          E' = {(1,3)}
• Repeat until all vertices have been chosen:
V = {1,3,4}        E' = {(1,3),(3,4)}
V = {1,3,4,5}      E' = {(1,3),(3,4),(4,5)}
...
V = {1,3,4,5,2,6}  E' = {(1,3),(3,4),(4,5),(5,2),(2,6)}
• Final Cost: 1 + 3 + 4 + 1 + 1 = 10
[Figure: a 6-vertex weighted graph illustrating these steps.]
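• A Python sketch of Prim's algorithm with a heap-based frontier; the adjacency
list below uses edge weights consistent with the trace above (weights not
visible in the trace are our assumptions):

import heapq

def prim_mst(graph, start):
    # graph: {u: [(weight, v), ...]}, undirected (edges listed both ways).
    visited = {start}
    heap = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(heap)
    mst, cost = [], 0
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)  # cheapest edge leaving the tree
        if v in visited:
            continue                   # edge would create a cycle
        visited.add(v)
        mst.append((u, v))
        cost += w
        for w2, x in graph[v]:
            if x not in visited:
                heapq.heappush(heap, (w2, v, x))
    return mst, cost

graph = {
    1: [(1, 3), (10, 2)],
    2: [(10, 1), (1, 5), (1, 6)],
    3: [(1, 1), (3, 4)],
    4: [(3, 3), (4, 5)],
    5: [(4, 4), (1, 2), (2, 6)],
    6: [(1, 2), (2, 5)],
}
print(prim_mst(graph, 1))
# ([(1, 3), (3, 4), (4, 5), (5, 2), (2, 6)], 10), matching the trace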
Kruskal’s Algorithm
• Select edges in order of increasing cost
• Accept an edge to expand tree or forest only if it does not cause a cycle
• Implementation using adjacency list, priority queues and disjoint sets
• Its algorithm is:
Initialize a forest of trees, each tree being a single node
Build a priority queue of edges with priority being lowest cost
Repeat until |V| -1 edges have been accepted {
Deletemin edge from priority queue
If it forms a cycle then discard it
else accept the edge – It will join 2 existing trees yielding a larger tree
and reducing the forest by one tree
}
The accepted edges form the minimum spanning tree
• Vertices in different trees are disjoint
– True at initialization, and Union won't modify the fact for remaining trees

• Trees form equivalence classes under the relation "is connected to"
– u connected to u (reflexivity)
– u connected to v implies v connected to u (symmetry)
– u connected to v and v connected to w implies a path from u to w, so u
connected to w (transitivity)
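• A Python sketch of Kruskal's algorithm with a bare-bones FIND/UNION as in
Chapter 1; the edge list mirrors the 6-vertex example traced below (weights
not visible in the trace are our assumptions):

def kruskal(n, edges):
    # edges: list of (weight, u, v); vertices are numbered 1..n.
    parent = list(range(n + 1))        # each vertex is its own root

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    mst, cost = [], 0
    for w, u, v in sorted(edges):      # increasing cost
        ru, rv = find(u), find(v)
        if ru != rv:                   # accepting it creates no cycle
            parent[ru] = rv            # union the two trees
            mst.append((u, v))
            cost += w
        # otherwise the edge is discarded
    return mst, cost

edges = [(1, 2, 5), (1, 2, 6), (1, 1, 3), (2, 5, 6),
         (3, 3, 4), (4, 4, 5), (10, 1, 2)]
print(kruskal(6, edges))
# ([(1, 3), (2, 5), (2, 6), (3, 4), (4, 5)], 10)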
[Figure: the same 6-vertex weighted graph, now processed by Kruskal's
algorithm.]

Initially, a forest of 6 trees:
F = {{1},{2},{3},{4},{5},{6}}
Edges are kept in a heap (not shown).
Select edge with lowest cost (2,5):
Find(2) = 2, Find(5) = 5, Union(2,5)
F = {{1},{2,5},{3},{4},{6}} (1 edge accepted)
Select edge with lowest cost (2,6):
Find(2) = 2, Find(6) = 6, Union(2,6)
F = {{1},{2,5,6},{3},{4}} (2 edges accepted)
Select edge with lowest cost (1,3):
Find(1) = 1, Find(3) = 3, Union(1,3)
F = {{1,3},{2,5,6},{4}} (3 edges accepted)
Select edge with lowest cost (5,6):
Find(5) = 2, Find(6) = 2: same tree, do nothing
F = {{1,3},{2,5,6},{4}} (still 3 edges accepted)
Select edge with lowest cost (3,4):
Find(3) = 1, Find(4) = 4, Union(1,4)
F = {{1,3,4},{2,5,6}} (4 edges accepted)
Select edge with lowest cost (4,5):
Find(4) = 1, Find(5) = 2, Union(1,2)
F = {{1,3,4,2,5,6}} (5 edges accepted: end)
Total cost = 10
Although there is a unique minimum spanning tree in this example, this is not
generally the case.
• Recall that m = |E| = O(V²) = O(n²)
• Prim's runs in O((n + m) log n)
• Kruskal runs in O(m log m) = O(m log n)

• In practice, Kruskal has a tendency to run faster, since graphs might
not be dense and then not all edges need to be looked at in the
Deletemin operations
Single Source Shortest Path

• Given: a single source vertex in a weighted, directed graph.

• We want to compute a shortest path for each possible destination.
– Similar to BFS.
• We will assume either
– no negative-weight edges, or
– no reachable negative-weight cycles.
• Algorithm will compute a shortest-path tree.
– Similar to BFS tree.
• Works when all of the weights are positive.

• Provides the shortest paths from a source to all other


vertices in the graph.
– Can be terminated early once the shortest path to t is
found if desired.
• Consider the following graph with positive weights and
cycles.
• A first attempt at solving this problem might require an array of Boolean values,
all initially false, that indicate whether we have found a path from the source.

Vertex: 1  2  3  4  5  6  7  8  9
Found:  F  F  F  F  F  F  F  F  F
• Graphically, we will denote this with check boxes next to
each of the vertices (initially unchecked)
• We will work bottom up.
– Note that if the starting vertex has any adjacent edges, then there
will be one vertex that is the shortest distance from the starting
vertex. This is the shortest reachable vertex of the graph.

• We will then try to extend any existing paths to new vertices.


• Initially, we will start with the path of length 0
– this is the trivial path from vertex 1 to itself
• If we now extend this path, we should consider the paths
– (1, 2) length 4
– (1, 4) length 1
– (1, 5) length 8

The shortest path so far is (1, 4) which is of length 1.


• Thus, if we now examine vertex 4, we may deduce that
there exist the following paths:
– (1, 4, 5) length 12
– (1, 4, 7) length 10
– (1, 4, 8) length 9
• We need to remember that the length of that path from
node 1 to node 4 is 1
• Thus, we need to store the length of a path that goes
through node 4:
– 5 of length 12
– 7 of length 10
– 8 of length 9
• We have already discovered that there is a path of length
8 to vertex 5 with the path (1, 5).
• Thus, we can safely ignore this longer path.
• We now know that:
– There exist paths from vertex 1 to vertices {2,4,5,7,8}.
– We know that the shortest path from vertex 1 to vertex 4 is of length 1.
– We know that the shortest path to the other vertices {2,5,7,8} is at most
the length listed in the table below.

Vertex: 1  2  4  5  7   8
Length: 0  4  1  8  10  9

• There cannot exist a shorter path to either of the vertices 1 or 4, since the
distances can only increase at each iteration.
• We consider these vertices to be visited.

If you only knew this information and nothing else about the graph, what are
the possible lengths from vertex 1 to vertex 2? What about to vertex 7?
• In Dijkstra’s algorithm, we always take the next unvisited vertex
which has the current shortest path from the starting vertex in the
table.
• This is vertex 2
• We can try to update the shortest paths to vertices 3 and 6 (both of
length 5) however:
– there already exists a path of length 8 < 10 to vertex 5 (10 = 4 + 6)
– we already know the shortest path to 4 is 1
• To keep track of those vertices to which no path has
reached, we can assign those vertices an initial distance
of either
– infinity (∞ ),
– a number larger than any possible path, or
– a negative number
• For demonstration purposes, we will use ∞
• As well as finding the length of the shortest path, we’d like to
find the corresponding shortest path

• Each time we update the shortest distance to a particular vertex,


we will keep track of the predecessor used to reach this vertex on
the shortest path.
• We will store a table of predecessor pointers, each initially 0 (null):

Vertex:      1  2  3  4  5  6  7  8  9
Predecessor: 0  0  0  0  0  0  0  0  0

• This table will be updated each time a distance is updated
• Graphically, we will display the reference to the preceding vertex
by a red arrow

– if the distance to a vertex is ∞, there will be no preceding


vertex
– otherwise, there will be exactly one preceding vertex
• Thus, for our initialization:
– we set the current distance to the initial vertex as 0
– for all other vertices, we set the current distance to ∞
– all vertices are initially marked as unvisited
– set the previous pointer for all vertices to null
• Thus, we iterate:
– find an unvisited vertex which has the shortest distance to it
– mark it as visited
– for each unvisited vertex which is adjacent to the current
vertex:
• add the distance to the current vertex to the weight of the
connecting edge
• if this is less than the current distance to that vertex,
update the distance and set the parent vertex of the
adjacent vertex to be the current vertex
• Halting condition:
– we successfully halt when the vertex we are visiting is the
target vertex
– if at some point, all remaining unvisited vertices have distance
∞, then no path from the starting vertex to the end vertex exists

• Note: We do not halt just because we have updated the distance


to the end vertex, we have to visit the target vertex.
• Consider the graph:
– the distances are appropriately initialized
– all vertices are marked as being unvisited
• Visit vertex 1 and update its neighbours, marking it as visited
– the shortest paths to 2, 4, and 5 are updated
• The next vertex we visit is vertex 4
– vertex 5: 1 + 11 ≥ 8, don't update
– vertex 7: 1 + 9 < ∞, update
– vertex 8: 1 + 8 < ∞, update
• Next, visit vertex 2
– vertex 3: 4 + 1 < ∞, update
– vertex 4: already visited
– vertex 5: 4 + 6 ≥ 8, don't update
– vertex 6: 4 + 1 < ∞, update
• Next, we have a choice of either 3 or 6
• We will choose to visit 3
– vertex 5: 5 + 2 < 8, update
– vertex 6: 5 + 5 ≥ 5, don't update
• We then visit 6
– vertex 8: 5 + 7 ≥ 9, don't update
– vertex 9: 5 + 8 < ∞, update
• Next, we finally visit vertex 5:
– vertices 4 and 6 have already been visited
– vertex 7: 7 + 1 < 10, update
– vertex 8: 7 + 1 < 9, update
– vertex 9: 7 + 8 ≥ 13, don't update
• Given a choice between vertices 7 and 8, we choose vertex 7
– vertex 5 has already been visited
– vertex 8: 8 + 2 ≥ 8, don't update
• Next, we visit vertex 8:
– vertex 9: 8 + 3 < 13, update
• Finally, we visit the end vertex
• Therefore, the shortest path from 1 to 9 has length 11
• We can find the shortest path by working back from the final
vertex:
– 9, 8, 5, 3, 2, 1
• Thus, the shortest path is (1, 2, 3, 5, 8, 9)
• Find the shortest path from 1 to 4:
– the shortest path is found after only three vertices are visited
– we terminated the algorithm as soon as we reached vertex 4
– we only have useful information about 1, 3, 4
– we don’t have the shortest path to vertex 2
d[s] ← 0
for each v ∈ V – {s}
do d[v] ← ∞
S ← ∅
Q ← V ⊳ Q is a priority queue maintaining V – S
while Q ≠ ∅
do u ← EXTRACT-MIN(Q)
S ← S ∪ {u}
for each v ∈ Adj[u]
do if d[v] > d[u] + w(u, v)
then d[v] ← d[u] + w(u, v)
π[v] ← u
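• A runnable Python version using heapq as the priority queue (a lazy-deletion
variant of EXTRACT-MIN; the graph is the A-E example below, with edge weights
as read from its trace):

import heapq

def dijkstra(graph, s):
    # graph: {u: [(v, w), ...]} with nonnegative edge weights.
    d = {v: float('inf') for v in graph}
    pred = {v: None for v in graph}
    d[s] = 0
    pq = [(0, s)]
    visited = set()
    while pq:
        du, u = heapq.heappop(pq)      # EXTRACT-MIN
        if u in visited:
            continue                   # stale queue entry, skip
        visited.add(u)
        for v, w in graph[u]:          # relax all edges leaving u
            if d[v] > du + w:
                d[v] = du + w
                pred[v] = u
                heapq.heappush(pq, (d[v], v))
    return d, pred

graph = {
    'A': [('B', 10), ('C', 3)],
    'B': [('C', 1), ('D', 2)],
    'C': [('B', 4), ('D', 8), ('E', 2)],
    'D': [('E', 7)],
    'E': [('D', 9)],
}
d, pred = dijkstra(graph, 'A')
print(d)    # {'A': 0, 'B': 7, 'C': 3, 'D': 9, 'E': 5}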
Graph with nonnegative edge weights:
[Figure: directed graph on vertices A, B, C, D, E; reading the weights from
the trace below: A→B = 10, A→C = 3, B→C = 1, B→D = 2, C→B = 4, C→D = 8,
C→E = 2, D→E = 7, E→D = 9.]

Initialize:
Q: A B C D E    d: 0 ∞ ∞ ∞ ∞    S: { }

"A" ← EXTRACT-MIN(Q):                S: { A }
Relax all edges leaving A:           d: 0 10 3 ∞ ∞

"C" ← EXTRACT-MIN(Q):                S: { A, C }
Relax all edges leaving C:           d: 0 7 3 11 5

"E" ← EXTRACT-MIN(Q):                S: { A, C, E }
Relax all edges leaving E:           d: 0 7 3 11 5 (no change)

"B" ← EXTRACT-MIN(Q):                S: { A, C, E, B }
Relax all edges leaving B:           d: 0 7 3 9 5

"D" ← EXTRACT-MIN(Q):                S: { A, C, E, B, D }

All vertices have been extracted, so the final shortest-path distances from A
are d = (0, 7, 3, 9, 5).
