
04 Greedy

The document discusses greedy algorithms and provides an example of the knapsack problem. It explains that greedy algorithms make locally optimal choices at each step but may not lead to a globally optimal solution. It then describes the fractional and 0-1 knapsack problems and shows that a greedy approach finds the optimal solution for fractional knapsack by always selecting the item with the highest value per unit weight. However, this greedy approach is suboptimal for 0-1 knapsack.

Uploaded by

Sai Srinivas
Copyright
© © All Rights Reserved

Computer Science and Engineering| Indian Institute of Technology Kharagpur

cse.iitkgp.ac.in

Algorithms – I (CS21203)

Autumn 2023, IIT Kharagpur

Greedy Algorithms

Greedy Algorithms
• A greedy algorithm repeatedly makes the locally best choice, ignoring the effect that choice will have in the future
• Greedy algorithms are often intuitive, easy to understand and easy to implement
• However, in many situations we cannot solve a problem optimally using a greedy approach
• A greedy solution is not necessarily the best one
• Sometimes the greedy solution is also good enough
• Namely, when you can prove that it is optimal

Aug 25, 30, 31, 2023 CS21203 / Algorithms - I | Greedy Algorithms 2



Optimization Problems
• A class of problems in which we are asked to
• Find a set (or a sequence) of “items”
• That satisfy some constraints and simultaneously optimize (i.e., maximize
or minimize) some objective function
• Formally

• A sequence of tasks with deadlines: maximize the reward while finishing before the deadlines
• Items: tasks, constraints: finish before deadline, optimize: total reward
• A set of products with weights and values: put them into a bag with a weight limit and maximize the value
• Items: products, constraints: weight limit, optimize: total value
• A file on a computer: encode/compress it to minimize its length
• Items: codewords for each character, constraints: original file recoverable, optimize: code length

Knapsack Problem
• Given n items with known weights w[1..n] and values V[1..n] and a knapsack of capacity C, find the most valuable subset of the items that fits into the knapsack
• There are two variations of the Knapsack problem
• (Fractional Knapsack): You are allowed to take fractions of items
• (0-1 Knapsack): You have to take an item either whole or not at all
• Objective: Maximize the total value in the knapsack
• Constraints: The sum of the weights of the items in the knapsack cannot exceed the knapsack capacity. (In the 0-1 version) only a whole item or none of it
• Example (knapsack capacity 50 kg):
Item      Weight   Value
Item 1    10 kg    Rs. 60
Item 2    20 kg    Rs. 100
Item 3    30 kg    Rs. 120

Knapsack Problem
• Which variation of the Knapsack problem should be easier to solve?
• Is there a simple approach to solve the fractional Knapsack problem?
• Compute the per-unit value of each item and fill the knapsack starting with the item of highest per-unit value, continuing in descending order
• Per-unit values: Item 1 = 6 Rs/kg, Item 2 = 5 Rs/kg, Item 3 = 4 Rs/kg
• Take all 10 kg of item 1, all 20 kg of item 2, and 20 kg of the 30 kg of item 3
• Total value = 60 + 100 + 4*20 = Rs. 240
• The strategy here is literally “greedy”
• It is also optimal in this case

Knapsack Problem
• What is sometimes challenging is proving that a greedy technique indeed gives an optimal solution
• For fractional knapsack it is relatively easy
• Intuitively: For the first 10 kg of space in the knapsack we put in the best possible option (item 1, 6 Rs/kg)
• For the next 20 kg of space we go for the best possible option still available (item 2, 5 Rs/kg), and so on
• Total value = 60 + 100 + 4*20 = Rs. 240
• Formal proof will come later

Knapsack Problem
• Let's see how this strategy works for the 0-1 Knapsack problem
• Item 1 (6 Rs/kg) and item 2 (5 Rs/kg) are taken whole, using 30 kg; item 3 (30 kg) no longer fits in the remaining 20 kg, so 0 kg of it is taken
• Total value = 60 + 100 = Rs. 160
• But this is not the optimal solution. We can do better
• What is the optimal solution?

Knapsack Problem
• Let's see how this strategy works for the 0-1 Knapsack problem
• But this is not the optimal solution. We can do better
• What is the optimal solution?
• All of item 3 and all of item 2 (30 kg + 20 kg = 50 kg)
• Total value = 120 + 100 = Rs. 220
• It was easy for this example as there are only 3 items
• In general, for n items, you need to check 2^n combinations
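The comparison between the per-unit-value greedy and an exhaustive check of all subsets can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `greedy_by_density` and `best_subset_value` are assumptions made here, not part of the lecture, and the brute force is exponential in the number of items.

```python
from itertools import combinations


def best_subset_value(weights, values, capacity):
    """Brute force for 0-1 knapsack: try every subset, keep the best feasible one."""
    n = len(weights)
    best = 0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            if sum(weights[i] for i in subset) <= capacity:
                best = max(best, sum(values[i] for i in subset))
    return best


def greedy_by_density(weights, values, capacity):
    """0-1 greedy: take whole items in descending value-per-kg order, skipping items that do not fit."""
    order = sorted(range(len(weights)), key=lambda i: values[i] / weights[i], reverse=True)
    load = total = 0
    for i in order:
        if load + weights[i] <= capacity:
            load += weights[i]
            total += values[i]
    return total


# Items 1..3 from the slides: 10 kg / Rs. 60, 20 kg / Rs. 100, 30 kg / Rs. 120
weights, values = [10, 20, 30], [60, 100, 120]
print(greedy_by_density(weights, values, 50))  # 160
print(best_subset_value(weights, values, 50))  # 220
```

The gap between 160 and 220 is exactly the one shown on the slides: the density greedy fills 30 kg and then cannot fit item 3.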

Knapsack Problem
• The greedy strategy that worked for the fractional Knapsack problem did not work for the 0-1 Knapsack problem
• Let's try another greedy strategy
• Start with the item of highest total value and continue in descending order
• For this instance, it gives the optimal solution: item 3 and item 2, total value = 120 + 100 = Rs. 220
• Is there any instance/example where this strategy fails?

Knapsack Problem
• Add a 4th item with more weight but only slightly higher value
• Say the 4th item weighs 40 kg and has value Rs. 130
• The greedy strategy gives the solution as item 4 (40 kg) and item 1 (10 kg). The total value is 130 + 60 = Rs. 190
• However, the optimal solution is item 3 (30 kg) and item 2 (20 kg). The total value is Rs. 220
• Items (knapsack capacity 50 kg): Item 1 = 10 kg, Rs. 60; Item 2 = 20 kg, Rs. 100; Item 3 = 30 kg, Rs. 120; Item 4 = 40 kg, Rs. 130
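The behaviour of the highest-total-value greedy on the three-item and four-item instances can be reproduced with a small Python sketch. The function name `greedy_by_value` and the 0-based item indices it returns are assumptions made for this illustration.

```python
def greedy_by_value(weights, values, capacity):
    """0-1 greedy: take whole items in descending total-value order, skipping items that do not fit."""
    order = sorted(range(len(weights)), key=lambda i: values[i], reverse=True)
    load = total = 0
    chosen = []  # 0-based indices of the items taken, in the order they were taken
    for i in order:
        if load + weights[i] <= capacity:
            load += weights[i]
            total += values[i]
            chosen.append(i)
    return total, chosen


# Three-item instance: the strategy happens to be optimal (items 3 and 2)
print(greedy_by_value([10, 20, 30], [60, 100, 120], 50))           # (220, [2, 1])
# With the 4th item (40 kg, Rs. 130) it picks items 4 and 1 and gets only Rs. 190
print(greedy_by_value([10, 20, 30, 40], [60, 100, 120, 130], 50))  # (190, [3, 0])
```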

Knapsack Problem
• We have tried two different greedy strategies to solve the 0-1 Knapsack problem, but neither of them worked
• Is there any greedy strategy that works for the 0-1 Knapsack problem?
• To the best of our knowledge, there is no known greedy strategy that works for the 0-1 Knapsack problem
• Later we will see other approaches to solve the 0-1 Knapsack problem

Fractional Knapsack Problem - Pseudocode


FractionalKnapsack(w, V, C, n)
    Find V[i]/w[i] for every item i
    Sort the items in both V and w by V[i]/w[i] in descending order
    load = 0
    for i = 1 to n
        if w[i] <= C - load
            Take the whole of item i
            load += w[i]
        else
            Take (C - load) amount of item i
            load = C
            break
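The pseudocode above can be turned into a short runnable Python sketch. The function name `fractional_knapsack` and the returned dictionary of amounts taken (keyed by 0-based item index) are choices made for this illustration and do not come from the slides; weights are assumed to be positive.

```python
def fractional_knapsack(weights, values, capacity):
    """Greedy fractional knapsack: fill by descending value per unit weight.

    Returns (total value, {item index: amount of that item taken}).
    """
    order = sorted(range(len(weights)), key=lambda i: values[i] / weights[i], reverse=True)
    load = 0
    total = 0.0
    taken = {}
    for i in order:
        if weights[i] <= capacity - load:
            # The whole item fits
            taken[i] = weights[i]
            load += weights[i]
            total += values[i]
        else:
            # Only a fraction fits; the knapsack is now full
            amount = capacity - load
            taken[i] = amount
            total += values[i] * amount / weights[i]
            load = capacity
            break
    return total, taken


total, taken = fractional_knapsack([10, 20, 30], [60, 100, 120], 50)
print(total)  # 240.0
print(taken)  # {0: 10, 1: 20, 2: 20}
```

On the slide instance this reproduces the Rs. 240 solution: all of items 1 and 2, plus 20 kg of item 3.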


Fractional Knapsack Problem - Analysis


FractionalKnapsack(w, V, C, n)
    Find V[i]/w[i] for every item i                                   Θ(n)
    Sort the items in both V and w by V[i]/w[i] in descending order   Θ(n log n)
    load = 0                                                          Θ(1)
    for i = 1 to n                                                    Θ(n) for the whole loop
        if w[i] <= C - load
            Take the whole of item i
            load += w[i]
        else
            Take (C - load) amount of item i
            load = C
            break
Overall runtime: Θ(n log n) [sorting dominates]


Activity Selection/Interval Scheduling Problem


• Imagine that you are trying to schedule as many classes as possible without any conflicting lectures.
• Given a collection R of intervals, find a subset A of R so that
• No two intervals in A overlap
• |A| is as large as possible
Activity   1  2  3  4  5  6   7   8   9
Start      1  2  4  2  5  8   9   11  13
Finish     8  5  7  3  9  10  11  14  16


Activity Selection/Interval Scheduling Problem


• What would be a brute force solution?
• Try all combinations, i.e., enumerate all subsets of the activities and check whether the elements of each subset are compatible
• Complexity is exponential: n activities have 2^n subsets


Activity Selection/Interval Scheduling Problem


• What can be a greedy strategy?
• Remember that greedy implies choosing a criterion and, according to that criterion, taking the decision that seems best at the moment, then repeating this
• What about picking the shortest duration first, then the next shortest duration, and so on?


Shortest Duration
Activity   1  2  3  4  5  6   7   8   9
Start      1  2  4  2  5  8   9   11  13
Finish     8  5  7  3  9  10  11  14  16
Duration   7  3  3  1  4  2   2   3   3
Selected   ❌ ❌ ✅ ✅ ❌ ✅  ❌  ✅  ❌
• So, our greedy solution is the activity subset {3, 4, 6, 8}
• Is it optimal?
• In this case: yes
• We can show that for this problem, we can choose at most 4 activities
• We are deferring the formal proof for later
• However, this greedy strategy may not be optimal for all instances
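A small Python sketch can show both behaviours of the shortest-duration rule. The helper `greedy_schedule` and the three-interval counterexample are assumptions made for this illustration (the counterexample is not from the slides); two intervals are treated as compatible only if one finishes strictly before the other starts.

```python
def greedy_schedule(intervals, key):
    """Consider intervals in the order given by `key`; keep each one that overlaps nothing kept so far."""
    chosen = []
    for start, finish in sorted(intervals, key=key):
        if all(finish < s or f < start for s, f in chosen):
            chosen.append((start, finish))
    return chosen


def duration(interval):
    return interval[1] - interval[0]


def finish_time(interval):
    return interval[1]


# The nine activities from the slides as (start, finish) pairs
slide_instance = [(1, 8), (2, 5), (4, 7), (2, 3), (5, 9), (8, 10), (9, 11), (11, 14), (13, 16)]
print(len(greedy_schedule(slide_instance, duration)))  # 4

# Hypothetical counterexample: the short middle interval blocks both long ones
bad_instance = [(1, 5), (4, 7), (6, 10)]
print(greedy_schedule(bad_instance, duration))     # [(4, 7)]
print(greedy_schedule(bad_instance, finish_time))  # [(1, 5), (6, 10)]
```

On the counterexample the shortest-duration rule schedules one activity, while two compatible activities exist.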


Earliest Start
• What about choosing by earliest start?
• Is it optimal?
• No, not in general


Earliest Finish
• What about choosing by earliest finish?
• Is it optimal?
• Yes
Activity   1  2  3  4  5  6   7   8   9
Start      1  2  4  2  5  8   9   11  13
Finish     8  5  7  3  9  10  11  14  16
Selected   ❌ ❌ ✅ ✅ ❌ ✅  ❌  ✅  ❌
• So, our greedy solution is the activity subset {4, 3, 6, 8} (in order of selection)
• Note: the optimal solution is not unique. Another candidate is, for example, {4, 3, 6, 9}


Alternate Strategy
• Given that the ‘earliest finish’ strategy works, what alternate strategy will also work, by symmetry?
• Hint: Think about why the ‘earliest finish’ strategy works
• Choosing the earliest finish leaves maximum room for other activities to fill in
• So, by symmetry, ‘latest start’ [looking from the other end] will also give an optimal solution


Activity Selection Problem - Pseudocode


s: Array of start times, f: Array of finish times
R: Set of all requests, A: Set of accepted requests
n: Number of activities, k: Index of last accepted activity
ActivitySel(s, f, A, n)
    Sort the items in s, f and R by f[i] in ascending order
    Remove R[1] from R and add it to A
    k = 1; i = 2
    while R is not empty:
        if s[i] > f[k]:
            Remove R[i] from R and add it to A
            k = i
        else:
            Remove R[i] from R
        i++
• The runtime is Θ(n log n) [Sorting dominates]
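The earliest-finish strategy can be written as a short Python sketch. The function name `activity_selection` and the 1-based activity numbers it returns are choices made for this illustration; the compatibility test `s[i] > f[k]` follows the pseudocode above.

```python
def activity_selection(start, finish):
    """Earliest-finish greedy; returns 1-based activity numbers in the order they are accepted."""
    order = sorted(range(len(start)), key=lambda i: finish[i])
    accepted = []
    last = None  # index of the last accepted activity (k in the pseudocode)
    for i in order:
        # Same test as the pseudocode: accept only if i starts after the last accepted one finishes
        if last is None or start[i] > finish[last]:
            accepted.append(i + 1)
            last = i
    return accepted


start = [1, 2, 4, 2, 5, 8, 9, 11, 13]
finish = [8, 5, 7, 3, 9, 10, 11, 14, 16]
print(activity_selection(start, finish))  # [4, 3, 6, 8]
```

After the sort, the loop is a single pass, which is why sorting dominates the runtime.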

Optimality of Greedy Activity Selection


• It is not obvious whether, and why, the greedy activity selection strategy returns an optimal set of intervals, i.e., whether or not A is optimal is not clear yet
• However, we can immediately say one thing: A is a compatible set of requests, i.e., no two activities in A overlap in time
• We need to show that A is optimal, i.e., that it contains the maximum possible number of non-overlapping activities
• As we don’t know yet whether A is optimal, for the purpose of comparison, let us take O to be an optimal set of activities
• We need to show |A| = |O|
• That is, A contains the same number of intervals as O and hence is also an optimal solution
• Note: A is what we got using the greedy strategy and O is an optimal set of activities


Optimality of Greedy Activity Selection


• We are following the book by Kleinberg and Tardos for the proof
• The idea underlying the proof is to show that the greedy strategy “stays ahead” of the optimal solution
• It will be very similar to a proof by induction
• Let i_1, i_2, ..., i_k be the activities in A in the order they are added to A
• Similarly, let j_1, j_2, ..., j_m be the activities in O
• Why did we not say “in the order they are added” for the second case?
• The set O may have been obtained by following some other strategy
• However, we can always sort the activities in O by increasing finishing time. Let's assume that; this does not cause any loss of generality


Optimality of Greedy Activity Selection


• So A = {i_1, ..., i_k} and O = {j_1, ..., j_m}; both are sorted in increasing order of the finishing times of the activities
• Note that |A| = k and |O| = m, and our goal is to prove k = m
• Let's start by comparing the first activity in both A and O
• i_1 has the least finishing time among all activities. So, f(i_1) ≤ f(j_1)
• So, if we replace j_1 by i_1 in O, the resulting set still remains optimal, as it still contains the same number of activities (m), which are still compatible
• Now we will prove that for each r ≥ 1, the r-th activity selected by the greedy strategy finishes no later than the r-th activity in O
• Thus we will prove that f(i_r) ≤ f(j_r) for all r ≤ k


Optimality of Greedy Activity Selection


• We will prove that f(i_r) ≤ f(j_r) for all r ≤ k
• We have already proved it for r = 1
• For r > 1, we will assume that the statement is true for r − 1 and we will try to prove it for r
• Thus we have f(i_{r−1}) ≤ f(j_{r−1}) … (1)
• Since O consists of compatible intervals, f(j_{r−1}) ≤ s(j_r) … (2)
• Combining (1) and (2), f(i_{r−1}) ≤ s(j_r)
• So, activity j_r is one of the possible candidates to be chosen by our greedy strategy. However, the greedy strategy always chooses the candidate with the earliest finish time.
• So, among the available candidates (of which j_r is one), the activity i_r chosen by the greedy strategy has the smallest finish time
• Thus f(i_r) ≤ f(j_r). This completes the induction step

Optimality of Greedy Activity Selection


• Thus we have proven that the greedy strategy always “stays ahead” of the optimal solution
• This is in the sense that, for each r, the r-th activity the greedy algorithm selects finishes at least as soon as the r-th activity in O
• Now we will see why this implies the optimality of the greedy algorithm’s set A
• This will be done by contradiction


Optimality of Greedy Activity Selection


• Suppose A is not optimal. Now, as O is an optimal set, we must have m > k
• As f(i_r) ≤ f(j_r) for all r ≤ k, f(i_k) ≤ f(j_k) [Putting r = k] … (3)
• Now, m > k. So, there must be at least one activity j_{k+1} in O
• As it starts after j_k is complete, the start time of j_{k+1} is after the finish time of j_k
• But, by (3), i_k finishes no later than j_k. So, the start time of j_{k+1} is after the finish time of i_k
• Thus, after deleting all activities that are not compatible with i_1, ..., i_k, the set of possible activities R still contains j_{k+1}
• This is a contradiction, as we have assumed that the greedy algorithm has stopped at i_k, and thus R is empty
• This completes the proof that k = m, i.e., |A| = |O|


Merge Pebbles
• We have n piles of pebbles, for example piles of sizes 12, 7, 8, 15, 4
• We want to merge them into one pile, but
• We can only merge two of them at a time
• Merging two piles of sizes a and b costs you a + b units of energy (let’s assume we need to move both piles)
• It means merging piles of size 12 and 7 results in a new pile of size 19 and costs you 19 units of energy
• How can we merge all of them with the least energy?
Source: UC Riverside, CS141 course, Fall 2021

Merge Pebbles
• Let's try to merge the piles in the order they are given
12 7 8 15 4
• We will use a tree to represent the trace of merging
• Merge tree (each internal node is the sum of its two children):
46 = 42 + 4
42 = 27 + 15
27 = 19 + 8
19 = 12 + 7
• Energy cost: 19 + 27 + 42 + 46 = 134

Merge Pebbles – Another Solution


• Let's try to merge the original piles two at a time
12 7 8 15 4
• We will use a tree to represent the trace of merging
• Merge tree (each internal node is the sum of its two children):
46 = 42 + 4
42 = 19 + 23
19 = 12 + 7
23 = 8 + 15
• Energy cost: 19 + 23 + 42 + 46 = 130

Merge Pebbles – Any Greedy Idea?


• Always merge the two smallest piles
12 7 8 15 4
• We will use a tree to represent the trace of merging
• Merge tree (each internal node is the sum of its two children):
46 = 19 + 27
19 = 11 + 8
27 = 12 + 15
11 = 4 + 7
• Energy cost: 11 + 19 + 27 + 46 = 103
• One thing to notice: the total cost is the sum of all the internal nodes
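The "always merge the two smallest piles" rule maps naturally onto a min-heap. The following Python sketch uses the standard `heapq` module; the function names `min_merge_cost` and `in_order_merge_cost` are assumptions made for this illustration.

```python
import heapq


def min_merge_cost(piles):
    """Greedy: repeatedly merge the two smallest piles; returns the total energy spent."""
    heap = list(piles)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        cost += a + b                 # merging piles of size a and b costs a + b
        heapq.heappush(heap, a + b)   # the merged pile goes back into the pool
    return cost


def in_order_merge_cost(piles):
    """Baseline: merge the piles left to right, in the order they are given."""
    cost = 0
    running = piles[0]
    for pile in piles[1:]:
        running += pile
        cost += running
    return cost


piles = [12, 7, 8, 15, 4]
print(in_order_merge_cost(piles))  # 134
print(min_merge_cost(piles))       # 103
```

Each heap operation costs O(log n), so the greedy runs in O(n log n) for n piles.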

Merge Pebbles – Why Greedy is Good


• You may need to move a pile multiple times (the cost of moving the same pile is incurred multiple times)
• The pile will be charged at all of its ancestors!
• Let's consider moving the pile of size 8
• In the greedy tree (cost 103), the pile of size 8 is charged at the nodes 19 and 46: paying twice
• In the in-order tree (cost 134), the pile of size 8 is charged at the nodes 27, 42 and 46: paying thrice

Merge Pebbles – Why Greedy is Good


• You may need to move a pile multiple times (the cost of moving the same pile is incurred multiple times)
• The pile will be charged at all of its ancestors!
• How many times do we move the pile of size 8?
• As many times as its height in the tree, i.e., its depth (the number of its ancestors)
• Greedy tree, total cost: 4x3 + 7x3 + 8x2 + 12x2 + 15x2 = 103
• In-order tree, total cost: 12x4 + 7x4 + 8x3 + 15x2 + 4x1 = 134

Merge Pebbles – Why Greedy is Good


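• Total cost = Σ_{l ∈ L} s_l · h_l, where L is the set of leaf nodes, s_l is the size of the pile at leaf l, and h_l is the height (depth) of leaf node l
• We will try to see what happens if we do not exactly follow the greedy strategy and exchange the order of merging of two piles (here the piles of size 7 and 12)
• Greedy tree: 46 = 19 + 27, 19 = 11 + 8, 27 = 12 + 15, 11 = 4 + 7
• Total cost: 4x3 + 7x3 + 8x2 + 12x2 + 15x2 = 103
• Exchanged tree: 46 = 24 + 22, 24 = 16 + 8, 22 = 7 + 15, 16 = 4 + 12
• Total cost: 4x3 + 7x2 + 8x2 + 12x3 + 15x2 = 108
• The exchange changes the cost by +12 − 7 = +5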

Merge Pebbles – Some Observations


• Total cost = Σ_{l ∈ L} s_l · h_l, where L is the set of leaf nodes, s_l is the size of the pile at leaf l, and h_l is the height (depth) of leaf node l
• Greedy tree: 46 = 19 + 27, 19 = 11 + 8, 27 = 12 + 15, 11 = 4 + 7
• Total cost: 4x3 + 7x3 + 8x2 + 12x2 + 15x2 = 103
• It makes sense to put the smallest piles the deepest
• Since every time we merge two piles, there are always two leaves in the deepest level
• Once we merge the two smallest piles, we have the same problem decreased in size by 1 (optimal substructure)
• Why do we care about moving pebble piles?



Huffman Codes
• How is data represented in computers?
• Using binary (0’s and 1’s) codes
• Fixed-size codes, e.g., ASCII
• A: 1000001 (65)
• B: 1000010 (66)
• Fixed-size codes may not necessarily be the best way to store/communicate data
• Are there any other ways?


Huffman Codes
• Variable-size codes like Morse code
• A: .-
• B: -...
• E: .
• T: -
• Invented in the 1800s. Image source: https://tinyurl.com/4mncb26k
• Not used for storing data in computers, but rather for communication
• More information: www.youtube.com/watch?v=iy8BaMs_JuI
• Note: The codes do not all have the same length.
• Why? Any advantage?


Fixed Length vs Variable Length Code


• Suppose we have a 100,000-character data file that we wish to store compactly
• The file contains only 6 characters, appearing with the following frequencies
Character                   a     b     c     d     e     f
Frequency (in thousands)    45    13    12    16    9     5
Variable-length codeword    0     101   100   111   1101  1100
• We would like to find a binary code that encodes the file using as few bits as possible, i.e., the compression is maximum
• A fixed-length code needs 3 bits per character and requires 300,000 bits to encode the file
• The variable-length code requires (45*1 + 13*3 + 12*3 + 16*3 + 9*4 + 5*4)*1000 = 224,000 bits
• A savings of approximately 25%
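The bit counts above can be checked with a few lines of Python. The dictionary names `freq_thousands` and `variable_code` are assumptions made for this illustration; the frequencies and codewords are the ones used elsewhere in these slides.

```python
# Character frequencies in thousands of occurrences (100,000 characters in total)
freq_thousands = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
variable_code = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

# Six characters need 3 bits each in a fixed-length code
fixed_bits = 3 * sum(freq_thousands.values()) * 1000
variable_bits = sum(freq_thousands[ch] * len(variable_code[ch]) for ch in freq_thousands) * 1000
saving = 1 - variable_bits / fixed_bits

print(fixed_bits)       # 300000
print(variable_bits)    # 224000
print(f"{saving:.1%}")  # 25.3%
```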

Lets Go Back to Morse Codes

• Anything you can notice?
• Written without pauses, all the coded words are the same sequence of dots and dashes
• Actually, the ‘pause’ plays a role in Morse code
Source: UC Riverside, CS141 course, Fall 2021

Prefix Codes
• No codeword should be a prefix of (sit in front of) any other codeword
• Termed a “prefix code” [though the book rightly says “Perhaps prefix-free codes would be a better name”]
• Morse code is not a prefix code [it is a non-prefix code]
• Encoding simply means concatenating the codewords
Character   Prefix code   Non-prefix code
a           0             00
b           101           001
c           100           11
d           111           111
e           1101          01
f           1100          010
• Example: abd -> 0 101 111 = 0101111



Prefix Codes
• With the prefix code a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100, decoding is unambiguous
• Example:
• Message: ‘DABA’
• Encoded message: 111 0 101 0 = 11101010
• Decoding “11101010”: greedily decode it! Read bits until they match a codeword, output that character, and repeat
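Greedy decoding of a prefix code can be sketched in Python as follows. The names `PREFIX_CODE`, `encode` and `decode` are assumptions made for this illustration; the codewords are the prefix code from the table on these slides, with lowercase characters.

```python
PREFIX_CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def encode(message, code):
    """Concatenate the codewords of the characters in the message."""
    return "".join(code[ch] for ch in message)


def decode(bits, code):
    """Greedy decoding: emit a character as soon as the bits read so far form a codeword."""
    inverse = {word: ch for ch, word in code.items()}
    decoded = []
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:  # unambiguous because no codeword is a prefix of another
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(decoded)


print(encode("daba", PREFIX_CODE))      # 11101010
print(decode("11101010", PREFIX_CODE))  # daba
```

The greedy match is safe only because the code is prefix-free; with the non-prefix code from the table, the first match could be the wrong one.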


Optimum Prefix Codes


• Given the codewords for the different characters, we can easily compute the number of bits required to encode a file
• Given an alphabet C with frequency distribution f, where each character c ∈ C has frequency f_c and a codeword of length d(c), the number of bits required to encode the file is
B = Σ_{c ∈ C} f_c · d(c)
• An optimum prefix code is a binary prefix code for C that minimizes the total number of bits B
• Huffman developed a nice greedy algorithm for solving this problem and producing a minimum-cost prefix code.
• The code that it produces is called a Huffman code


Relation between Prefix Codes and Tree


• Prefix codes can be represented by the leaves of a binary tree
• An optimal code for a file is always represented by a full binary tree, in which every nonleaf node has two children
• A left edge means 0
• A right edge means 1
• Each leaf is a character, with its code found by traversing from the root to the leaf
• Each leaf is labeled with the frequency of occurrence of the corresponding character
• Each internal node contains the sum of the frequencies of the leaves of its subtrees
• Example tree (left child on edge 0, right child on edge 1):
100 -> a:45 (code 0), 55
55 -> 25, 30
25 -> c:12 (code 100), b:13 (code 101)
30 -> 14, d:16 (code 111)
14 -> f:5 (code 1100), e:9 (code 1101)
• Any similarity to something we have seen earlier?

Relation between Prefix Codes and Tree


• If C is the alphabet set, then the tree for an optimal prefix code has exactly |C| leaves
• And exactly |C| − 1 internal nodes
• Given a tree T, the number of bits required to encode the file is
B(T) = Σ_{c ∈ C} f_c · d_T(c)
• d_T(c) is the depth (or height) of the leaf of c, which is also equal to the length of the codeword associated with that leaf

Morse Code is Non-prefix

Taken from Wikipedia. Attribution to author: By the original uploader, Aris00 at English Wikipedia. Transferred from en.wikipedia to Commons by User:Ddxc, CC BY-SA 3.0,
https://commons.wikimedia.org/w/index.php?curid=3177632


Finding Huffman Code


• Step 1: Pick the two letters x, y from alphabet C with the smallest frequencies and create a subtree that has these two characters as leaves (greedy idea)
• Label the root of this subtree as z
• Step 2: Set frequency f_z = f_x + f_y
• Remove x, y and add z, creating the new alphabet C' = (C − {x, y}) ∪ {z}
• Note that |C'| = |C| − 1
• Repeat this procedure, called merge, with the new alphabet C' until an alphabet with only one symbol is left
• The resulting tree is the Huffman tree, giving the Huffman code


Example of Huffman Coding


• Let the alphabet along with its frequency distribution be: a = 18, b = 15, c = 5, d = 15, e = 45
• In the first step, merge c and d into a node of frequency 20 (edge 0 to c:5, edge 1 to d:15)

Example of Huffman Coding


• New alphabet is: a = 18, b = 15, (cd) = 20, e = 45
• Next, merge b and a into a node of frequency 33 (edge 0 to b:15, edge 1 to a:18)

Example of Huffman Coding


• New alphabet is: (cd) = 20, (ba) = 33, e = 45
• Next, merge the nodes with frequencies 20 and 33 into a node of frequency 53 (edge 0 to 20, edge 1 to 33)

Example of Huffman Coding


• New alphabet is: (cdba) = 53, e = 45
• Next, merge the node with frequency 53 and the only remaining character e into the root of frequency 98 (edge 0 to 53, edge 1 to e:45)
• After this merge only one symbol (the root) is left, so there is nothing more to merge

Example of Huffman Coding


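• Algorithm terminates and the Huffman tree is obtained
• Huffman tree (left child on edge 0, right child on edge 1):
98 -> 53, e:45
53 -> 20, 33
20 -> c:5, d:15
33 -> b:15, a:18
• The Huffman codes are: a = 011, b = 010, c = 000, d = 001, e = 1
• Running time is O(n log n) with a min-priority queue [Proof is deferred]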
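The merge procedure can be sketched in Python with a min-heap from the standard `heapq` module. The function name `huffman_codes` and the tie-breaking counter are assumptions made for this illustration. Because b and d have equal frequency, this sketch may pair the characters differently from the slides and so assign different codewords, but the total number of bits is the same.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Build a Huffman code by repeatedly merging the two least frequent nodes.

    Leaves are characters (strings); internal nodes are (left, right) tuples.
    """
    tiebreak = count()  # unique counter so the heap never has to compare two nodes
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")  # left edge means 0
            walk(node[1], prefix + "1")  # right edge means 1
        else:
            codes[node] = prefix or "0"  # a one-symbol alphabet still needs one bit

    walk(root, "")
    return codes


freq = {"a": 18, "b": 15, "c": 5, "d": 15, "e": 45}
codes = huffman_codes(freq)
total_bits = sum(freq[ch] * len(codes[ch]) for ch in freq)
print(total_bits)  # 204
```

The slide codes give the same total: 45*1 + (18 + 15 + 5 + 15)*3 = 204 bits.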


Optimality of Huffman Coding


• Lemma 1: Let x, y be two characters in alphabet C with the two smallest frequencies. Then there exists an optimal prefix code tree for C in which the codewords for x and y are sibling leaves at the lowest level of the tree
• Proof (the idea): Take a tree T representing an arbitrary optimal prefix code. Assume some other characters a, b sit at the bottom as sibling nodes.
We will try to modify this tree such that x, y (sitting somewhere else in the tree) are exchanged with a, b.
If the modified tree is at least as good as T, then we are done proving the lemma
• Quick quiz: Can there be two or more optimal trees?


Optimality of Huffman Coding


• Proof: Let us consider the tree T, in which a and b sit at the maximum depth as sibling nodes
• Without loss of generality, let's assume f_a ≤ f_b and f_x ≤ f_y. Since x, y have the lowest two frequencies, the above implies f_x ≤ f_a and also f_y ≤ f_b
• [Figure: the possible orderings of f_x, f_y, f_a, f_b, the last one being f_x = f_y = f_a = f_b]
• The last case makes the lemma trivially true (as exchanging x, y and a, b does not change anything in the objective function)
• So we will assume f_x ≠ f_b [otherwise that would make f_x = f_y = f_a = f_b]


Optimality of Huffman Coding


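• We exchange the positions of a and x in T to produce a tree T'
• Then we exchange the positions of b and y in T' to produce T''
• In T'', x and y are sibling leaves at maximum depth
B(T) − B(T') = Σ_{c ∈ C} f_c d_T(c) − Σ_{c ∈ C} f_c d_{T'}(c)
= f_x d_T(x) + f_a d_T(a) − f_x d_{T'}(x) − f_a d_{T'}(a)   [the rest of the nodes, being unchanged, cancel]
= f_x d_T(x) + f_a d_T(a) − f_x d_T(a) − f_a d_T(x)
= (f_a − f_x)(d_T(a) − d_T(x)) ≥ 0, since f_a − f_x ≥ 0 and d_T(a) − d_T(x) ≥ 0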


Optimality of Huffman Coding


• Similarly, B(T') − B(T'') ≥ 0
• So, B(T'') ≤ B(T') ≤ B(T)
• But T is optimal. So, B(T) ≤ B(T'')
• So, B(T) = B(T'')
• Thus, T'' is an optimal tree in which x and y appear as sibling leaves of maximum depth, from which the lemma follows


Optimality of Huffman Coding


• Lemma: Let x and y be two characters in alphabet C with minimum frequency. Let C' be the alphabet C with x and y removed and a new character z added, so that C' = (C − {x, y}) ∪ {z}. Frequencies for characters in C' are the same as for C, except that f_z = f_x + f_y. Let T' be any optimal prefix code tree for C'. Then the tree T, obtained from T' by replacing the leaf node z with an internal node having x and y as children, represents an optimal prefix code tree for C
• What do we get if the above lemma is true?
• This is exactly how the Huffman code is formed. Once you get the final z, you can build the whole tree by recursively replacing z with its two children


Optimality of Huffman Coding


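• Lemma: Let x and y be two characters in alphabet C with minimum frequency. Let C' be the alphabet C with x and y removed and a new character z added, so that C' = (C − {x, y}) ∪ {z}. Frequencies for characters in C' are the same as for C, except that f_z = f_x + f_y. Let T' be any optimal prefix code tree for C'. Then the tree T, obtained from T' by replacing the leaf node z with an internal node having x and y as children, represents an optimal prefix code tree for C.
• Let's try to express B(T) in terms of B(T')
• For each character c ∈ C − {x, y}, we have d_T(c) = d_{T'}(c). So, f_c d_T(c) = f_c d_{T'}(c) … (1)
• Since d_T(x) = d_T(y) = d_{T'}(z) + 1, we have
f_x d_T(x) + f_y d_T(y) = (f_x + f_y)(d_{T'}(z) + 1) = f_z d_{T'}(z) + (f_x + f_y) … (2)
• Adding (1) for all c ∈ C − {x, y} to both sides of (2), we get B(T) = B(T') + f_x + f_y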


Optimality of Huffman Coding


• Rearranging, B(T') = B(T) − f_x − f_y … (3)
• We will prove the lemma by contradiction. Suppose T does not represent an optimal prefix code tree for C
• Then there exists an optimal tree T'' such that B(T'') < B(T) … (4)
• Again, by the previous lemma, we may take T'' to have x and y as siblings
• Let T''' be the tree T'' with the common parent of x and y replaced by a leaf z with frequency f_z = f_x + f_y
• Then,
B(T''') = Σ_{c ∈ C − {x, y}} f_c d_{T''}(c) + f_z d_{T'''}(z)   [as d_{T'''}(c) = d_{T''}(c) for c ≠ x, y]
= B(T'') − f_x − f_y   [as f_z = f_x + f_y and d_{T''}(x) = d_{T''}(y) = d_{T'''}(z) + 1]
< B(T) − f_x − f_y   [by (4)]
= B(T')   [by (3)]
• This contradicts the assumption that T' is optimal for C'. Thus T must be optimal for C

Tutorial Problems
• Given an infinite array in which the first n elements are integers in sorted order and the rest of the cells are filled with some special symbol (say $). Assume we do not know the value of n. Give an algorithm that takes an integer x as input and finds a position in the array containing x, if the integer exists in the array, in O(log n) time


Tutorial Problems
• Given a sorted array of non-repeated integers A[1..n], check whether there is an index i for which A[i] = i. Give a divide-and-conquer algorithm that runs in O(log n) time


Tutorial Problems
• We are given two sorted arrays of size n. Give an algorithm for finding the median element in the union of the two lists so that the complexity is O(log n)


Thank You
