CSC373 Algorithm Design, Analysis & Complexity Nisarg Shah

This document provides information about the CSC373 Algorithm Design, Analysis and Complexity course taught by Nisarg Shah. It introduces the instructor, lecture and tutorial times, office hours, and course organization. Key aspects of the course include 4 assignments, 2 term tests, and a final exam. The primary textbook is Introduction to Algorithms by Cormen et al. and the course covers algorithm design paradigms like divide-and-conquer, greedy algorithms, and more.

Uploaded by

yiyang hua

CSC373

Algorithm Design,
Analysis & Complexity

Nisarg Shah

373F19 - Nisarg Shah 1


Introduction
• Instructors
➢ Karan Singh
o dgp.toronto.edu/~karan, karan@dgp, BA 5258
o SEC 5101 and 5201
➢ Nisarg Shah
o cs.toronto.edu/~nisarg, nisarg@cs, SF 2301C
o SEC 5301

• TAs: Too many to list


• From here on, we’ll only discuss information
pertaining to our section (5301).



Introduction
• Lectures
➢ Every Wed 6-9pm
➢ BA 1190

• Tutorials
➢ Every Mon 6-7pm
➢ Divided by birth month
➢ Jan-Jun: UC 244, Jul-Dec: UC 261

• Office Hours: Wed 2-3pm in SF 2301C



No tutorial on Sep 9
Check the course webpage
for further announcements



Course Information
• Course Page
www.cs.toronto.edu/~nisarg/teaching/373f19/
➢ All the information below is in the course information
sheet, available on the course page

• Discussion Board
piazza.com/utoronto.ca/fall2019/csc373

• Grading – MarkUs system


➢ Link will be distributed after about two weeks
➢ LaTeX preferred, scans are OK!

➢ An arbitrary subset of questions may be graded…



Course Organization
• Tutorials
➢ A problem sheet will be posted ahead of the tutorial
➢ Easier problems that are warm-up to assignments/exams

➢ You’re expected to try them before coming to the tutorial

➢ TAs will solve the problems on the board

➢ No written/typed solutions will be posted



Course Organization
• Assignments
➢ 4 assignments
➢ In groups of up to three students

➢ Final marks will be taken from best 3 out of 4

➢ Questions will be more difficult


o May need to mull them over for several days; do not expect to
start and finish the assignment on the same day!
o May include bonus questions
➢ Submit a single PDF on MarkUs
o May need to compress the PDF



Course Organization
• Exams
➢ Two term tests, one final exam
➢ Details will be posted on the course webpage

➢ In each exam, you’ll be allowed to bring one 8.5” x 11”
sheet of handwritten notes on one side



Grading Policy
• 3 homeworks * 10% = 30%

• 2 term tests * 20% = 40%

• Final exam * 30% = 30%

• NOTE: If you earn less than 40% on the final exam,
your final course grade will be reduced below 50



Textbook
• Primary reference: lecture slides

• Primary textbook (required)
➢ [CLRS] Cormen, Leiserson, Rivest, Stein: Introduction to
Algorithms.

• Supplementary textbooks (optional)
➢ [DPV] Dasgupta, Papadimitriou, Vazirani: Algorithms.
➢ [KT] Kleinberg, Tardos: Algorithm Design.



Other Policies
• Collaboration
➢ Free to discuss with classmates or read online material
➢ Must write solutions in your own words
o Easier if you do not take any pictures/notes from discussions

• Citation
➢ For each question, must cite the peer (write the name) or
the online sources (provide links), if you obtained a
significant insight directly pertinent to the question
➢ Failing to do this is plagiarism!



Other Policies
• “No Garbage” Policy
➢ Borrowed from: Prof. Allan Borodin (citation!)

1. Partial marks for viable approaches


2. Zero marks if the answer makes no sense
3. 20% marks if you admit to not knowing how to
approach the question (“I do not know how to
approach this question”)

• 20% > 0% !!



Other Policies
• Late Days
➢ 4 total late days across all 4 assignments
➢ Managed by MarkUs
➢ At most 2 late days can be applied to a single assignment
➢ Already covers legitimate reasons such as illness,
university activities, etc.
o Petitions will only be granted for circumstances which cannot be
covered by this



Enough with the
boring stuff.



What will we study?

Why will we study it?



Muhammad ibn Musa al-Khwarizmi
c. 780 – c. 850



What is this course about?
• Algorithms
➢ Ubiquitous in the real world
o From your smartphone to self-driving cars
o From graph problems to graphics problems
o…
➢ Important to be able to design and analyze algorithms
➢ For some problems, good algorithms are hard to find
o For some of these problems, we can formally establish complexity
results
o We’ll often find that one problem is easy, but its minor variants
are suddenly hard



What is this course about?
• Algorithms
➢ Algorithms in specialized environments or using advanced
techniques
o Distributed, parallel, streaming, sublinear time, spectral, genetic…
➢ Other concerns with algorithms
o Fairness, ethics, …

➢ …mostly beyond the scope of this course



What is this course about?
• Algorithm design paradigms in this course
➢ Divide and Conquer
➢ Greedy
➢ Dynamic programming
➢ Network flow
➢ Linear programming
➢ Approximation algorithms
➢ Randomized algorithms



What is this course about?
• How do we know which paradigm is right for a
given problem?
➢ A very interesting question!
➢ Subject of much ongoing research…
o Sometimes, you just know it when you see it…

• How do we analyze an algorithm?


➢ Proof of correctness
➢ Proof of running time
o We’ll try to prove the algorithm is efficient in the worst case
o In practice, average case matters just as much (or even more)



What is this course about?
• What does it mean for an algorithm to be efficient
in the worst case?
➢ Polynomial time
➢ It should use at most poly(n) steps on any n-bit input
o $n$, $n^2$, $n^{100}$, $100n^6 + 237n^2 + 432$, …

➢ How much is too much?


What is this course about?
• What if we can’t find an efficient algorithm for a
problem?
➢ Try to prove that the problem is hard
➢ Formally establish complexity results

➢ NP-completeness, NP-hardness, …

• We’ll often find that one problem may be easy, but
its simple variants may suddenly become hard
➢ Minimum spanning tree (MST) vs bounded degree MST
➢ 2-colorability vs 3-colorability



I’m not convinced.

Will I really ever need to
know how to design
abstract algorithms?



At the very least…

This will help you prepare for your
technical job interview!

Real Microsoft interview question:
• Given an array $a$, find indices $(i, j)$ with
the largest $j - i$ such that $a[j] > a[i]$
• Greedy? Divide & conquer?
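
One possible linear-time approach to the interview question, offered as a sketch rather than the intended answer: precompute where the prefix minima and suffix maxima occur, then sweep two pointers over them. The function name `max_index_gap` and the choice to return `None` when no valid pair exists are assumptions made for illustration.

```python
def max_index_gap(a):
    """Return (i, j) maximizing j - i subject to a[j] > a[i], or None if no such pair."""
    n = len(a)
    if n < 2:
        return None
    # prefix_min_idx[i]: index of the smallest value in a[0..i] (leftmost on ties)
    prefix_min_idx = [0] * n
    for i in range(1, n):
        prev = prefix_min_idx[i - 1]
        prefix_min_idx[i] = i if a[i] < a[prev] else prev
    # suffix_max_idx[j]: index of the largest value in a[j..n-1] (rightmost on ties)
    suffix_max_idx = [n - 1] * n
    for j in range(n - 2, -1, -1):
        nxt = suffix_max_idx[j + 1]
        suffix_max_idx[j] = j if a[j] > a[nxt] else nxt
    best = None
    i = j = 0
    while i < n and j < n:
        lo, hi = prefix_min_idx[i], suffix_max_idx[j]
        if a[hi] > a[lo]:
            # Record the pair only if it is a genuine left-to-right pair.
            if hi > lo and (best is None or hi - lo > best[1] - best[0]):
                best = (lo, hi)
            j += 1   # try to push the right end further
        else:
            i += 1   # need a smaller left value
    return best


print(max_index_gap([34, 8, 10, 3, 2, 80, 30, 33, 1]))  # (1, 7)
```

Each pointer only moves forward, so the sweep takes $O(n)$ time after the two $O(n)$ precomputation passes.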



Disclaimer
• The course is theoretical in nature
➢ You’ll be working with abstract notations, proving
correctness of algorithms, analyzing the running time of
algorithms, designing new algorithms, and proving
complexity results.
• Question
➢ How many of you are somewhat scared going into the
course?
➢ How many of you feel comfortable with proofs, and want
challenging problems to solve?
➢ How many prefer concrete examples to abstract symbols?
➢ We’ll have something for everyone to enjoy this course



Related/Follow-up Courses
• Direct follow-up
➢ CSC473: Advanced Algorithms
➢ CSC438: Computability and Logic
➢ CSC463: Computational Complexity and Computability

• Algorithms in other contexts


➢ CSC304: Algorithmic Game Theory and Mechanism
Design (self promotion!)
➢ CSC384: Introduction to Artificial Intelligence
➢ CSC436: Numerical Algorithms
➢ CSC418: Computer Graphics



Divide & Conquer



History?
• How many of you saw some divide & conquer
algorithms in, say, CSC236/CSC240 and/or
CSC263/CSC265?

• Maybe you saw a subset of these algorithms?
➢ Mergesort - $O(n \log n)$
➢ Karatsuba algorithm for fast multiplication - $O(n^{\log_2 3})$
rather than $O(n^2)$
➢ Largest subsequence sum in $O(n)$
➢ …



Divide & Conquer
• General framework
➢ Break (a large chunk of) a problem into two smaller
subproblems of the same type
➢ Solve each subproblem recursively
➢ At the end, quickly combine solutions from the two
subproblems and/or solve any remaining part of the
original problem

• Hard to formally define when a given algorithm is
divide-and-conquer…
• Let’s see some examples!



Master Theorem
• Here’s the master theorem, as it appears in CLRS
➢ Useful for analyzing divide-and-conquer running time
➢ If you haven’t already seen it, please spend some time
understanding it



Master Theorem
Intuition:
Compare the function $f(n)$ with the function $n^{\log_b a}$. The larger
of the two functions determines the recurrence solution.
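
For reference, a paraphrase of the standard three-case statement (not a verbatim quote from CLRS), where $\epsilon > 0$ and $c < 1$ are constants and the last condition needs to hold for all sufficiently large $n$:

```latex
T(n) = a\,T(n/b) + f(n), \qquad a \ge 1,\ b > 1

\begin{cases}
f(n) = O\!\left(n^{\log_b a - \epsilon}\right)
  & \Rightarrow\ T(n) = \Theta\!\left(n^{\log_b a}\right) \\
f(n) = \Theta\!\left(n^{\log_b a}\right)
  & \Rightarrow\ T(n) = \Theta\!\left(n^{\log_b a} \log n\right) \\
f(n) = \Omega\!\left(n^{\log_b a + \epsilon}\right) \text{ and } a\,f(n/b) \le c\,f(n)
  & \Rightarrow\ T(n) = \Theta\!\left(f(n)\right)
\end{cases}
```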



Counting Inversions
• Problem
➢ Given an array $a$ of length $n$, count the number of pairs
$(i, j)$ such that $i < j$ but $a[i] > a[j]$

• Applications
➢ Voting theory
➢ Collaborative filtering
➢ Measuring the “sortedness” of an array
➢ Sensitivity analysis of Google's ranking function
➢ Rank aggregation for meta-searching on the Web
➢ Nonparametric statistics (e.g., Kendall's tau distance)



Counting Inversions
• Problem
➢ Count $(i, j)$ such that $i < j$ but $a[i] > a[j]$
• Brute force
➢ Check all $\Theta(n^2)$ pairs
• Divide & conquer
➢ Divide: break array into two equal halves 𝑥 and 𝑦
➢ Conquer: count inversions in each half recursively
➢ Combine:
o Solve (remaining): count inversions with one entry in 𝑥 and one in 𝑦
o Merge: add all three counts
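
The divide, conquer, and combine steps can be sketched as below. This assumes the common "sort and count" variant, in which each recursive call also returns its half in sorted order so that cross inversions can be counted during a linear-time merge; the function name `count_inversions` is chosen here for illustration.

```python
def count_inversions(a):
    """Return (sorted copy of a, number of pairs i < j with a[i] > a[j])."""
    n = len(a)
    if n <= 1:
        return list(a), 0
    mid = n // 2
    x, inv_x = count_inversions(a[:mid])   # conquer: left half
    y, inv_y = count_inversions(a[mid:])   # conquer: right half
    merged = []
    cross = 0                              # inversions with one entry in x and one in y
    i = j = 0
    while i < len(x) and j < len(y):
        if x[i] <= y[j]:
            merged.append(x[i])
            i += 1
        else:
            # y[j] is smaller than x[i], hence smaller than everything in x[i:]
            cross += len(x) - i
            merged.append(y[j])
            j += 1
    merged.extend(x[i:])
    merged.extend(y[j:])
    return merged, inv_x + inv_y + cross


print(count_inversions([2, 4, 1, 3, 5])[1])  # 3
```

Because both halves come back sorted, the cross count costs only $O(n)$ per level instead of comparing every pair across the split.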



Counting Inversions
• From Kevin Wayne’s slides


Counting Inversions
• How do we formally prove correctness?
➢ Induction on 𝑛 is usually very helpful
➢ Allows you to assume correctness of subproblems

• Running time analysis
➢ Suppose $T(n)$ is the running time for inputs of size $n$
➢ Our algorithm satisfies $T(n) = 2T(n/2) + O(n)$
➢ Master theorem says this is $T(n) = O(n \log n)$



Without Master Theorem
Let’s say $T(n) = 2T(n/2) + 2n$
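
One way to solve this recurrence directly is to unroll it. The sketch below assumes $n$ is a power of 2 and $T(1)$ is a constant:

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + 2n \\
     &= 4\,T(n/4) + 2n + 2n \\
     &= 8\,T(n/8) + 2n + 2n + 2n \\
     &\;\;\vdots \\
     &= 2^k\,T(n/2^k) + 2kn \\
     &= n\,T(1) + 2n \log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n)
\end{aligned}
```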



Closest Pair in ℝ²

• Problem:
➢ Given $n$ points of the form $(x_i, y_i)$ in the plane, find the
closest pair of points.

• Applications:
➢ Basic primitive in graphics and computer vision
➢ Geographic information systems, molecular modeling, air
traffic control
➢ Special case of nearest neighbor

• Brute force: $\Theta(n^2)$



Intuition from 1D?
• In 1D, the problem is easily solved in $O(n \log n)$ time
➢ Sort and check!

• Sorting attempt in 2D
➢ Find closest points by x coordinate
➢ Find closest points by y coordinate

• Non-degeneracy assumption
➢ No two points have the same x or y coordinate



Intuition from 1D?
• Sorting attempt in 2D
➢ Find closest points by x or y coordinate
➢ Doesn’t work!

[Figure: counterexample point set, with labels 1, 2, and 1+𝜖]



Closest Pair in ℝ²

• Let’s try divide-and-conquer!


➢ Divide: points in equal halves by drawing a vertical line 𝐿
➢ Conquer: solve each half recursively
➢ Combine: find closest pair with one point on each side of 𝐿
➢ Return the best of 3 solutions
Seems like $\Omega(n^2)$



Closest Pair in ℝ²

• Combine
➢ We can restrict our attention to points within 𝛿 of 𝐿 on
each side, where 𝛿 = best of the solutions in two halves



Closest Pair in ℝ²

• Combine (let 𝛿 = best of solutions in two halves)


➢ Only need to look at points within 𝛿 of 𝐿 on each side
➢ Sort points on the strip by 𝑦 coordinate
➢ Only need to check each point with next 11 points in
sorted list!
Wait, what? Why 11?



Why 11?
• Claim:
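➢ If two points are at least 12 positions apart in the sorted list,
their distance is at least $\delta$
• Proof:
➢ No two points lie in the same $\delta/2 \times \delta/2$ box
➢ Each row of boxes across the strip (width $2\delta$) has 4 boxes, so
points at least 12 positions apart are more than two rows apart
➢ Two points that are more than two rows apart are at distance at least $\delta$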
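
A minimal sketch of the whole closest-pair procedure, under the non-degeneracy assumption stated earlier. For simplicity it re-sorts the strip by $y$ in every call, which gives $O(n \log^2 n)$ rather than $O(n \log n)$. The names `closest_pair_distance` and `_solve` are illustrative, and `math.dist` requires Python 3.8 or later.

```python
import math


def closest_pair_distance(points):
    """Smallest Euclidean distance between two of the given (x, y) points."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    return _solve(sorted(points))          # sort by x once


def _solve(pts):
    n = len(pts)
    if n <= 3:
        # Base case: brute force over all pairs.
        return min(math.dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:])
    mid = n // 2
    line_x = pts[mid][0]                   # vertical dividing line L
    delta = min(_solve(pts[:mid]), _solve(pts[mid:]))
    # Keep only points within delta of L, sorted by y coordinate.
    strip = sorted((p for p in pts if abs(p[0] - line_x) < delta),
                   key=lambda p: p[1])
    for i, p in enumerate(strip):
        for q in strip[i + 1:i + 12]:      # compare with the next 11 points only
            delta = min(delta, math.dist(p, q))
    return delta


print(closest_pair_distance([(0, 0), (3, 4), (1, 1), (10, 10), (1.5, 1)]))  # 0.5
```

Getting the full $O(n \log n)$ bound would require maintaining a $y$-sorted order across recursive calls instead of sorting the strip each time.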



Recap: Karatsuba’s Algorithm
• Fast way to multiply two 𝑛 digit integers 𝑥 and 𝑦
• Brute force: $O(n^2)$ operations

• Karatsuba’s observation:
➢ Divide each integer into two parts
o $x = x_1 \cdot 10^{n/2} + x_2$, $y = y_1 \cdot 10^{n/2} + y_2$
o $xy = (x_1 y_1) \cdot 10^n + (x_1 y_2 + x_2 y_1) \cdot 10^{n/2} + (x_2 y_2)$
➢ Four $n/2$-digit multiplications can be replaced by three
o $x_1 y_2 + x_2 y_1 = (x_1 + x_2)(y_1 + y_2) - x_1 y_1 - x_2 y_2$
➢ Running time
o $T(n) = 3T(n/2) + O(n) \Rightarrow T(n) = O(n^{\log_2 3})$
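
The three-multiplication trick can be sketched on Python integers as follows. This is an illustration of the recursion only: it assumes non-negative inputs, splits on decimal digits as in the slide, and uses built-in arithmetic for the single-digit base case and the power-of-10 shifts. The function name `karatsuba` is illustrative.

```python
def karatsuba(x, y):
    """Multiply non-negative integers x and y using three recursive products."""
    if x < 10 or y < 10:
        return x * y                       # base case: a single-digit factor
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x1, x2 = divmod(x, base)               # x = x1 * 10^half + x2
    y1, y2 = divmod(y, base)               # y = y1 * 10^half + y2
    a = karatsuba(x1, y1)                  # x1 * y1
    c = karatsuba(x2, y2)                  # x2 * y2
    # (x1 + x2)(y1 + y2) - x1*y1 - x2*y2 = x1*y2 + x2*y1
    b = karatsuba(x1 + x2, y1 + y2) - a - c
    return a * base * base + b * base + c


print(karatsuba(1234, 5678) == 1234 * 5678)  # True
```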



Strassen’s Algorithm
• Generalizes Karatsuba’s insight to design a fast
algorithm for multiplying two 𝑛 × 𝑛 matrices
➢ Call 𝑛 the “size” of the problem
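$$\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \cdot \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$$

➢ Naively, this requires 8 multiplications of size $n/2$
o $A_{11} \cdot B_{11}$, $A_{12} \cdot B_{21}$, $A_{11} \cdot B_{12}$, $A_{12} \cdot B_{22}$, …

➢ Strassen’s insight: replace 8 multiplications by 7
o Running time: $T(n) = 7T(n/2) + O(n^2) \Rightarrow T(n) = O(n^{\log_2 7})$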



Strassen’s Algorithm
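$$\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \cdot \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$$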
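
A sketch of the seven-product recursion on plain Python lists, assuming $n$ is a power of 2. The products `M1` to `M7` follow one commonly presented formulation, which may be labeled or arranged differently from the slide's figure; the helper names `add`, `sub`, and `split` are illustrative.

```python
def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def sub(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def split(M):
    """Return the four (n/2) x (n/2) blocks M11, M12, M21, M22."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B):
    """Multiply two n x n matrices (n a power of 2) with 7 recursive products."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)    # = A11*B11 + A12*B21
    C12 = add(M3, M5)                      # = A11*B12 + A12*B22
    C21 = add(M2, M4)                      # = A21*B11 + A22*B21
    C22 = add(add(sub(M1, M2), M3), M6)    # = A21*B12 + A22*B22
    top = [r + s for r, s in zip(C11, C12)]
    bottom = [r + s for r, s in zip(C21, C22)]
    return top + bottom


print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

The additions and subtractions cost $O(n^2)$ per call, which is the $O(n^2)$ term in the recurrence.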



Median & Selection
• Selection:
➢ Given array 𝐴 of 𝑛 comparable elements, find 𝑘th smallest
➢ $k = 1$ is min, $k = n$ is max, $k = (n+1)/2$ is median
➢ $O(n)$ is easy for min/max

• What about $k$-selection?
➢ $O(nk)$ by modifying bubble sort
➢ $O(n \log n)$ by sorting
➢ $O(n + k \log n)$ using min-heap
➢ $O(k + n \log k)$ using max-heap

• Q: What about just $O(n)$?
• A: Yes! Selection is easier than sorting.



QuickSelect
• Find a pivot 𝑝
• Divide 𝐴 into two sub-arrays
➢ $A_{less}$ = elements $\le p$, $A_{more}$ = elements $> p$
➢ If $|A_{less}| \ge k$, return $k$th smallest in $A_{less}$, otherwise return
$(k - |A_{less}|)$th smallest in $A_{more}$

• Problem?
➢ If pivot is close to the min or the max, then we basically get
$T(n) = T(n-1) + O(n)$, which is $\Theta(n^2)$
➢ Want to reduce $n - 1$ to a fraction of $n$ (like $n/2$, $5n/6$, etc.)
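
A small sketch of QuickSelect with a pluggable pivot rule. It departs from the slide in one respect, labeled here as an assumption: elements equal to the pivot go into their own group, so every recursive call is strictly smaller even with duplicates. The parameter name `choose_pivot` is illustrative, and the default of a uniformly random pivot is an illustrative choice, not something the slide prescribes.

```python
import random


def quickselect(A, k, choose_pivot=random.choice):
    """Return the k-th smallest element of A (k = 1 is the minimum)."""
    if not 1 <= k <= len(A):
        raise IndexError("k out of range")
    p = choose_pivot(A)
    less = [v for v in A if v < p]
    equal = [v for v in A if v == p]       # contains at least the pivot itself
    more = [v for v in A if v > p]
    if k <= len(less):
        return quickselect(less, k, choose_pivot)
    if k <= len(less) + len(equal):
        return p
    return quickselect(more, k - len(less) - len(equal), choose_pivot)


data = [7, 2, 9, 4, 4, 1]
print(quickselect(data, 3))                     # 4
print(quickselect(data, 3, choose_pivot=min))   # 4, but via the slow T(n-1) path
```

Passing `choose_pivot=min` reproduces the bad case described above: each call discards only the pivot's value.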




Finding a Good Pivot
• Divide $n$ elements into $n/5$ groups of 5 each
• Find the median of each group
• Find the median of 𝑛/5 medians
• Use this median of medians as the pivot in
quickselect

• Q: Why does this work?
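
The pivot rule can be sketched as a self-contained selection routine. Assumptions made for illustration: the last group may have fewer than 5 elements, lists of at most 5 elements are handled by sorting, and elements equal to the pivot are split off so that duplicates cause no trouble. The function name `select` is illustrative.

```python
def select(A, k):
    """Return the k-th smallest element of A (k = 1 is the minimum)."""
    if not 1 <= k <= len(A):
        raise IndexError("k out of range")
    if len(A) <= 5:
        return sorted(A)[k - 1]
    # Divide into groups of 5 and take the median of each group.
    groups = [sorted(A[i:i + 5]) for i in range(0, len(A), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]
    # Recursively find the median of the medians and use it as the pivot.
    p = select(medians, (len(medians) + 1) // 2)
    less = [v for v in A if v < p]
    equal = [v for v in A if v == p]
    more = [v for v in A if v > p]
    if k <= len(less):
        return select(less, k)
    if k <= len(less) + len(equal):
        return p
    return select(more, k - len(less) - len(equal))


print(select(list(range(100, 0, -1)), 50))  # 50
```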




Analysis
• How many elements can be $\le p^*$?
➢ Out of $n/5$ medians, $n/10$ are $> p^*$
➢ Each such median brings 3 elements $> p^*$



Analysis
• How large can $|A_{less}|$ or $|A_{more}|$ be?
➢ $\approx (n/10) \cdot 3 = 3n/10$ elements are guaranteed to be $> p^*$
➢ So $|A_{less}| \le 7n/10$
➢ Similarly, $|A_{more}| \le 7n/10$
➢ (These are rough calculations…)

• How does this factor into overall algorithm
analysis?



Analysis
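• Divide $n$ elements into $n/5$ groups of 5 each,
then find the median of each group: $O(n)$ in total
• Find $p^*$ = median of $n/5$ medians: $T(n/5)$
• Create $A_{less}$ and $A_{more}$ according to $p^*$: $O(n)$
• Run selection on one of $A_{less}$ or $A_{more}$: $T(7n/10)$

• $T(n) = T(n/5) + T(7n/10) + O(n)$

• Note: $n/5 + 7n/10 = 9n/10$
➢ Only a fraction of $n$, so $T(n) = O(n)$
➢ The Master theorem as stated in CLRS covers only recurrences of the
form $aT(n/b) + f(n)$, so this bound is proved by induction instead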
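
One way to verify the $O(n)$ bound is by induction (substitution). A sketch, assuming the $O(n)$ term is at most $cn$, the hypothesis $T(m) \le 10cm$ holds for all $m < n$, and floors, ceilings, and base cases are ignored:

```latex
\begin{aligned}
T(n) &\le T(n/5) + T(7n/10) + cn \\
     &\le 10c \cdot \frac{n}{5} + 10c \cdot \frac{7n}{10} + cn \\
     &= 2cn + 7cn + cn \\
     &= 10cn
\end{aligned}
```

The argument works because the subproblem sizes sum to $9n/10 < n$, which leaves room for the extra $cn$ term.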



Residual Notes
• Best algorithm for a problem?
➢ Typically hard to determine
➢ We still don’t know best algorithms for multiplying two $n$-digit
integers or two $n \times n$ matrices
o Integer multiplication
• Breakthrough in March 2019: first $O(n \log n)$ time algorithm
• It is conjectured that this is asymptotically optimal
o Matrix multiplication
• 1969 (Strassen): $O(n^{2.807})$
• 1990: $O(n^{2.376})$
• 2013: $O(n^{2.3729})$
• 2014: $O(n^{2.3728639})$



Residual Notes
• Best algorithm for a problem?
➢ Usually, we design an algorithm and then analyze its
running time

➢ Sometimes we can do the reverse:
o E.g., if you know you want an $O(n^2 \log n)$ algorithm
o Master theorem suggests that you can get it by
$T(n) = 4T(n/2) + O(n^2)$
o So maybe you want to break your problem into 4 problems of size
$n/2$ each, and then do $O(n^2)$ computation to combine
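
As a quick check of this example against the Master theorem, assuming the combine step costs $\Theta(n^2)$:

```latex
a = 4,\quad b = 2,\quad n^{\log_b a} = n^{\log_2 4} = n^2,\quad f(n) = \Theta(n^2)
\;\Rightarrow\; T(n) = \Theta(n^2 \log n)
```

Here $f(n)$ and $n^{\log_b a}$ are of the same order, which is the case that contributes the extra $\log n$ factor.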



Residual Notes
• Access to input
➢ For much of this analysis, we are assuming random access
to elements of input
➢ So we’re ignoring underlying data structures (e.g. doubly
linked list, binary tree, etc.)

• Machine operations
➢ We’re only counting comparison or arithmetic operations
➢ So we’re ignoring issues like how real numbers will be
represented in closest pair problem
➢ When we get to P vs NP, representation will matter



Residual Notes
• Size of the problem
➢ Can be any reasonable parameter of the problem

➢ E.g., for matrix multiplication, we used $n$ as the size
➢ But an input consists of two matrices with $n^2$ entries
➢ It doesn’t matter whether we call $n$ or $n^2$ the size of the
problem

➢ The actual running time of the algorithm won’t change
