0% found this document useful (0 votes)
8 views81 pages

Weak 1

The document outlines a MOOC on Design and Analysis of Algorithms, covering topics such as asymptotic complexity, algorithmic techniques, and data structures. It includes a schedule for the course, evaluation methods, and examples illustrating algorithm efficiency and problem-solving strategies. The course requires a background in programming and data structures, with continuous evaluation through quizzes and assignments.

Uploaded by

sai13aadityad
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
8 views81 pages

Weak 1

The document outlines a MOOC on Design and Analysis of Algorithms, covering topics such as asymptotic complexity, algorithmic techniques, and data structures. It includes a schedule for the course, evaluation methods, and examples illustrating algorithm efficiency and problem-solving strategies. The course requires a background in programming and data structures, with continuous evaluation through quizzes and assignments.

Uploaded by

sai13aadityad
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 81

NPTEL MOOC,JAN-FEB 2015

Week 1, Module 1

DESIGN AND ANALYSIS


OF ALGORITHMS

MADHAVAN MUKUND, CHENNAI MATHEMATICAL INSTITUTE


http://www.cmi.ac.in/~madhavan
Understanding Algorithms
Correctness

Efficiency

Asymptotic complexity, O( ) notation

Modelling

Graphs, data structures, decomposing the problem

Techniques

Divide and conquer, greedy, dynamic programming


Expectations

Background in programming

Any language (C, C++, Java)

Basic data structures

Arrays, lists
Topics to be covered
Asymptotic complexity

Searching and sorting in arrays

Binary search, insertion sort, selection sort, merge


sort, quick sort

Graphs and graph algorithms

Representations, reachability, connectedness

Directed acyclic graphs

Shortest paths, Spanning trees


Topics to be covered
Algorithmic design techniques

Divide and conquer, Greedy algorithms,


Dynamic programming

Data structures

Priority queues/heaps, Search trees,


Union of disjoint sets (union-find)

Miscellaneous topics

Intractability, …
Tentative schedule January

Week 1: Motivation, asymptotic complexity 4 5 6 7 8 9 10

Week 2: Searching and sorting 11 12 13 14 15 16 17

18 19 20 21 22 23 24
Week 3: Graphs and basic graph algorithms
25 26 27 28 29 30 31
Week 4: More graph algorithms, disjoint set
February
Week 5: Divide and conquer, heaps 1 2 3 4 5 6 7

Week 6: Search trees, greedy algorithms 8 9 10 11 12 13 14

Week 7: Dynamic programming 15 16 17 18 19 20 21

Week 8: Miscellaneous topics 22 23 24 25 26 27 28


Evaluation
Continuous evaluation

8 Weekly quizzes

6 programming assignments

Certification exam

Requirement for successful course completion

60% in quizzes, certification exam

Submit at least 5 of 6 assignments

At least 4 with nonzero marks


Textbooks

Algorithm Design

Jon Kleinberg and Eva Tardos

Algorithms

Sanjoy Dasgupta, Christos Papadimitriou and


Umesh Vazirani
NPTEL MOOC,JAN-FEB 2015
Week 1, Module 2

DESIGN AND ANALYSIS


OF ALGORITHMS

MADHAVAN MUKUND, CHENNAI MATHEMATICAL INSTITUTE


http://www.cmi.ac.in/~madhavan
Example 1: Air travel

Barbet Airlines serves several cities in India

Some cities are connected by direct flights

Want to compute all pairs of cities A,B such that A


and B are connected by a sequence of flights
Delhi

Varanasi
Ahmedabad

Kolkata
Mumbai
Hyderabad
Visakhapatnam

Bangalore
Chennai

Trivandrum
Throw away the map and record the network

This is a graph—a collection of nodes and edges


Can distort the picture without changing meaning
Can distort the picture without changing meaning
Connected destinations

Compute paths in the graph

How do we represent the graph so that we can


manipulate it using a computer program?

Suitable data structure

How do we design an efficient algorithm for this


data representation?
Efficiency?

N cities, F direct flights

Computing paths depends on N and F

What is this dependency?

How large a value of N and F can we handle?

Online booking requires response in seconds


Variations

Flights have arrival and departure times

Only some connections are feasible

Should not have to wait overnight

… or more than 4 hours

How to compute feasible paths with constraints?


Other problems

Each sector has a cost

Compute cheapest route between a pair of


cities

Some aircraft grounded for maintenance

Which routes to operate to maintain


connectivity?
NPTEL MOOC,JAN-FEB 2015
Week 1, Module 3

DESIGN AND ANALYSIS


OF ALGORITHMS

MADHAVAN MUKUND, CHENNAI MATHEMATICAL INSTITUTE


http://www.cmi.ac.in/~madhavan
Example 2: Xerox Shop

Campus Xerox has several photocopiers

Tomorrow is the deadline for BTech projects and


there is a rush of reports to be printed

How to schedule the pending jobs most


effectively?
Xerox Shop …
The number of pages for each job is known

Each customer has been promised delivery by a


deadline

Campus Xerox offers discount if deadline is not


met

How to sequentially allocate the jobs to


photocopiers to maximize revenue?
Xerox Shop …
Brute force

Try all possible allocations

Choose one that is optimum

Number of possibilities is exponential!

Even with 30 jobs, it would take hours to compute


an optimal schedule
Xerox Shop …

Decompose the problem

Choose a job to schedule first, and the machine


on which it will run, according to some strategy

Now, recursively solve the problem for N-1 jobs


Xerox Shop …
Greedy approach

Fix the choice of next job once and for all

Never go back and try another sequence

How to choose the next job?

Shortest processing time?

Earliest deadline?

How to show that this strategy is optimal?


Variations

Some photocopiers are old and slow, some are


new and fast

Time for a job depends on choice of machine

Cost of ink and paper varies across machines

Net revenue for a job depends on choice of


machine
Variations

Account for set up time between jobs

Need to reserve time slots to reload paper

Is there a valid greedy strategy?


NPTEL MOOC,JAN-FEB 2015
Week 1, Module 4

DESIGN AND ANALYSIS


OF ALGORITHMS

MADHAVAN MUKUND, CHENNAI MATHEMATICAL INSTITUTE


http://www.cmi.ac.in/~madhavan
Example 3: Document
similarity

Given two documents, how similar are they?

Plagiarism detection

Checking changes between versions of code

Answering web search queries more effectively


Document similarity …
What is a good measure of similarity?

Edit distance

How many changes does one have to make to get


from one document to another?

What types of changes are allowed?

Add or remove a letter

Replace one letter by another


Document similarity …
Edit Distance

Minimum number of edit operations to transform


one document to another

How do we compute it?

Brute force: try all sequences and choose the best


one

Delete all of first document, add all of second


document

Impossibly inefficient!
Decomposing the problem
Make the first character in both documents the
same

Explore all possible edit operations that make


this possible

Recursively fix the rest of the documents

Naive recursion is inefficient

Same subproblem solved recursively many times


Naive recursion can be
inefficient
Fibonacci numbers:

F(n) = F(n-1) + F(n-2), F(1) = 1, F(2) = 1

Sequence is 1,1,2,3,5,8,13,21,….

Computing recursively

F(7) = F(6) + F(5) = (F(5) + F(4)) + (F(4)+F(3)) =


(F(4)+F(3)+F(3)+F(2)) + (F(3)+F(2)+F(2)+F(1)) = …
Dynamic Programming

Making recursive computations efficient

Ensure that subproblems are computed only once

How do we store and look up answers to already


solved subproblems?
Variations
Interested only in the meaning of the document

Focus on words

Documents are near if they overlap on many words

Order in which words occur may not matter

Useful for topic based web search

Can have dictionary of “similar” words


NPTEL MOOC,JAN-FEB 2015
Week 1, Module 5

DESIGN AND ANALYSIS


OF ALGORITHMS

MADHAVAN MUKUND, CHENNAI MATHEMATICAL INSTITUTE


http://www.cmi.ac.in/~madhavan
Analysis of algorithms

Measuring efficiency of an algorithm

Time: How long the algorithm takes (running


time)

Space: Memory requirement


Time and space

Time depends on processing speed

Impossible to change for given hardware

Space is a function of available memory

Easier to reconfigure, augment

Typically, we will focus on time, not space


Measuring running time
Analysis independent of underlying hardware
Don’t use actual time
Measure in terms of “basic operations”
Typical basic operations
Compare two values
Assign a value to a variable
Other operations may be basic, depending on context
Exchange values of a pair of variables
Input size
Running time depends on input size
Larger arrays will take longer to sort
Measure time efficiency as function of input size
Input size n
Running time t(n)
Different inputs of size n may each take a different
amount of time
Typically t(n) is worst case estimate
Example 1: Sorting
Sorting an array with n elements
Naïve algorithms : time proportional to n2
Best algorithms : time proportional to n log n
How important is this distinction?
Typical CPUs process up to 108 operations per
second
Useful for approximate calculations
Example 1: Sorting …
Telephone directory for mobile phone users in India
India has about 1 billion = 109 phones

Naïve n2 algorithm requires 1018 operations


108 operations per second ⟹ 1010 seconds
2778000 hours
115700 days
300 years!
Smart n log n algorithm takes only about 3 x 1010
operations
About 300 seconds, or 5 minutes
Example 2: Video game
Several objects on screen
Basic step: find closest pair of objects
Given n objects, naïve algorithm is again n2
For each pair of objects, compute their distance
Report minimum distance over all such pairs
There is a clever algorithm that takes time n log n
Example 2: Video game …
High resolution monitor has 2500 x 1500 pixels

3.75 million points


5
Suppose we have 500,000 = 5 x 10 objects

Naïve algorithm takes 25 x 1010 steps = 2500 seconds

2500 seconds = 42 minutes response time is


unacceptable!

Smart n log n algorithm takes a fraction of a second


Orders of magnitude
When comparing t(n) across problems, focus on
orders of magnitude

Ignore constants

f(n) = n3 eventually grows faster than g(n) = 5000 n2

For small values of n, f(n) is smaller than g(n)

At n = 5000, f(n) overtakes g(n)

What happens in the limit, as n increases :


asymptotic complexity
Typical functions

We are interested in orders of magnitude

Is t(n) proportional to log n, …, n2 , n3 , …, 2n?

Logarithmic, polynomial, exponential …


Typical functions t(n)…
Input log n n n log n n2 n3 2n n!
10 3.3 10 33 100 1000 1000 106
100 6.6 100 66 104 106 1030 10157
1000 10 1000 104 106 109
104 13 104 105 108 1012
105 17 105 106 1010
106 20 106 107
107 23 107 108
108 27 108 109
109 30 109 1010
1010 33 1010
NPTEL MOOC,JAN-FEB 2015
Week 1, Module 6

DESIGN AND ANALYSIS


OF ALGORITHMS

MADHAVAN MUKUND, CHENNAI MATHEMATICAL INSTITUTE


http://www.cmi.ac.in/~madhavan
Input size
Running time depends on input size
Larger arrays will take longer to sort
Measure time efficiency as function of input size
Input size n
Running time t(n)
Different inputs of size n may each take a different
amount of time
Typically t(n) is worst case estimate
Input size …
How do we fix input size?
Typically a natural parameter
For sorting and other problems on arrays:
array size
For combinatorial problems: number of objects
For graphs, two parameters: number of vertices
and number of edges
Input size …
Input size for numbers
Is n a prime?
What should be the input size? Magnitude of n?
Arithmetic operations are performed digit by digit
Addition with carry, subtraction with borrow,
multiplication, long division …
Number of digits is input size
Same as logb n when we write n in base b
Orders of magnitude
When comparing t(n) across problems, focus on
orders of magnitude

Ignore constants

f(n) = n3 eventually grows faster than g(n) = 5000 n2

For small values of n, f(n) is smaller than g(n)

At n = 5000, f(n) overtakes g(n)

What happens in the limit, as n increases :


asymptotic complexity
Choice of basic operations
Flexibility in identifying “basic operations”

Swapping two variables involves three assignments


tmp ← x
x ← y
y ← tmp

Number of swaps is 3 times number of assignments

If we ignore constants, t(n) is of the same order of


magnitude even if swapping values is treated as a
basic operation
Worst case complexity
Running time on input of size n varies across
inputs
Search for K in an unsorted array A
i ← 0
while i < n and A[i] != K do
i ← i+1
if i < n return i
else return -1
Worst case complexity
For each n, worst case input forces algorithm to take
the maximum amount of time

If K not in A, search scans all elements

Upper bound for the overall running time

Here worst case is proportional to n for array size n

Can construct worst case inputs by examining the


algorithm
Average case complexity
Worst case may be very rare: pessimistic
Compute average time taken over all inputs
Difficult to compute
Average over what?
Are all inputs equally likely?
Need probability distribution over inputs
Worst case vs average
case
Worst case can be unrealistic …
… but average case is hard, if not impossible, to
compute
A good worst case upper bound is useful
A bad worst case upper bound may be less
informative
Try to “classify” worst case inputs, look for
simpler subclasses
NPTEL MOOC,JAN-FEB 2015
Week 1, Module 7

DESIGN AND ANALYSIS


OF ALGORITHMS

MADHAVAN MUKUND, CHENNAI MATHEMATICAL INSTITUTE


http://www.cmi.ac.in/~madhavan
Comparing time efficiency

We measure time efficiency only upto an order of


magnitude

Ignore constants

How do we compare functions with respect to


orders of magnitude?
Upper bounds, “big O”
t(n) is said to be O(g(n)) if we can find suitable
constants c and n0 so that cg(n) is an upper bound
for t(n) for n
beyond n0

t(n) ≤ cg(n)
for every n ≥ n0
Examples: Big O
100n + 5 is O(n2)
100n + 5
≤ 100n + n, for n ≥ 5
2
= 101n ≤ 101n , so n0 = 5, c = 101

Alternatively
100n + 5
≤ 100n + 5n, for n ≥1
= 105n ≤ 105n2, so n0 = 1, c = 105

n0 and c are not unique!

Of course, by the same argument, 100n+5 is also O(n)


Examples: Big O
100n2 + 20n + 5 is O(n2)
100n2 + 20n + 5
≤ 100n2 + 20n2 + 5n2, for n ≥ 1
≤ 125n2
n0 = 1, c = 125

What matters is the highest term


20n + 5 dominated by 100n2
Examples: Big O

n3 is not O(n2)
No matter what c we choose, cn2 will be
dominated by n3 for n ≥ c
Useful properties

If

f1(n) is O(g1(n))

f2(n) is O(g2(n))

then f1(n) + f2(n) is O(max(g1(n),g2(n)))


Proof
f1(n) ≤ c1g1(n) for all n > n1
f2(n) ≤ c2g2(n) for all n > n2
Why is this important?
Algorithm has two phases
Phase A takes time O(gA(n))
Phase B takes time O(gB(n))

Algorithm as a whole takes time


max(O(gA(n)),O(gB(n)))

For an algorithm with many phases, least efficient


phase is an upper bound for the whole algorithm
Lower bounds, Ω (omega)
t(n) is said to be Ω(g(n)) if we can find suitable
constants c and n0 so that cg(n) is an lower bound
for t(n) for n
beyond n0

t(n) ≥ cg(n)
for every n ≥ n0
Lower bounds
n3 is Ω(n2)

n3 ≥ n2 for all n

n0 = 0 and c = 1

Typically we establish lower bounds for problems


as a whole, not for individual algorithms

Sorting requires Ω(n log n) comparisons, no


matter how clever the algorithm is
Tight bounds, Θ (theta)
t(n) is Θ(g(n)) if it is both O(g(n)) and Ω(g(n))

Find suitable constants


c1, c2, and n0 so that

c2g(n) ≤ t(n) ≤ c1g(n)


for every n ≥ n0
Tight bounds
n(n-1)/2 is Θ(n2)

Upper bound
2 2
n(n-1)/2 = n /2 - n/2 ≤ n /2, for n ≥ 0

Lower bound

n(n-1)/2 = n2/2 - n/2 ≥ n2/2 - (n/2 x n/2) ≥ n2/4,


for n ≥ 2

Choose n0 = max(0,2) = 2, c1 = 1/2 and c2 = 1/4


Summary
f(n) = O(g(n)) means g(n) is an upper bound for f(n)

Useful to describe limit of worst case running


time for an algorithm

f(n) = Ω(g(n)) means g(n) is a lower bound for f(n)

Typically used for classes of problems, not


individual algorithms

f(n) = Θ(g(n)): matching upper and lower bounds

Best possible algorithm has been found


NPTEL MOOC,JAN-FEB 2015
Week 1, Module 8

DESIGN AND ANALYSIS


OF ALGORITHMS

MADHAVAN MUKUND, CHENNAI MATHEMATICAL INSTITUTE


http://www.cmi.ac.in/~madhavan
Calculating complexity

Iterative programs

Recursive programs
Example 1
Maximum value in an array
function maxElement(A):
maxval = A[0]
for i = 1 to n-1:
if A[i] > maxval:
maxval = A[i]
return(maxval)
Example 2
Check if all elements in an array are distinct
function noDuplicates(A):
for i = 0 to n-1:
for j = i+1 to n-1:
if A[i] == A[j]:
return(False)
return(True)
Example 3
Matrix multiplication
function matrixMultiply(A,B):
for i = 0 to n-1:
for j = 0 to n-1:
C[i][j] = 0
for k = 0 to n-1:
C[i][j] = C[i][j] + A[i][k]*B[k][j]
return(C)
Example 4
Number of bits in binary representation of n
function numberOfBits(n):
count = 1
while n > 1:
count = count + 1
n = n div 2
return(count)
Example 5
Towers of Hanoi

Three pegs,
A, B, C

Move n disks
from A to B

Never put a larger disk above a smaller one

C is transit peg
Example 5
Recursive solution

Move n-1 disks from A to C, using B as transit peg

Move largest disk from A to B

Move n-1 disks from C to B, using A as transit peg


Example 5
Solve recurrence by repeated substitution

M(n) = number of moves to transfer n disks

M(n) = M(n-1) + 1 + M(n-1)

M(1) = 1

Recurrence

Recursive expression for M(n)


Example 5
Complexity

M(n) = 2M(n-1) + 1
= 2(2M(n-2)+1) + 1 = 22M(n-2) + (2+1)
= 22(2M(n-3)+1) + 2 + 1 = 23M(n-3) + (4+2+1)
=…
= 2kM(n-k) + (2k - 1)
=…
= 2n-1M(1) + (2n-1 - 1)
= 2n-1 + 2n-1 - 1 =
= 2n - 1
Summary
Iterative programs

Focus on loops

Recursive programs

Write and solve a recurrence

Will see more complicated examples

Need to be clear about “accounting” for basic


operations

You might also like