CS3230 Cheatsheet
Austin Santoso

1 Week 1 - Reasoning and asymptotic analysis

1.1 Correctness of algorithms

1.1.1 Correctness of iterative algorithms

To prove the correctness of an iterative algorithm, use a loop invariant. A loop invariant is:

• true at the beginning of the first iteration (initialization)

• if true at the beginning of an iteration, it remains true at the beginning of the next iteration (maintenance)

• Termination: when the algorithm terminates, the invariant provides a useful property for showing correctness.

1.1.2 Correctness of recursive algorithms

Prove by induction on the input size / recursion depth.

1.2 Efficiency

Asymptotic analysis is a method of describing the limiting behavior of a function. Asymptotic notations:

• O-notation (big-O, upper bound)

  O(g(n)) = {f(n) : there exist constants c > 0, n0 > 0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0}

• Ω-notation (big-Omega, lower bound)

  Ω(g(n)) = {f(n) : there exist constants c > 0, n0 > 0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0}

• o-notation (small-o, strict upper bound)

  o(g(n)) = {f(n) : for any constant c > 0, there is a constant n0 > 0 such that 0 ≤ f(n) < c·g(n) for all n ≥ n0}

2 Week 2 - Recurrence and Master Theorem

2.1 Properties of functions (MATHS)

2.1.1 Exponentials

a^(-1) = 1/a        (a^m)^n = a^(mn)        a^m · a^n = a^(m+n)

2.1.2 Logarithms

log_c(ab) = log_c a + log_c b
log_b a^n = n log_b a
log_b a = log_c a / log_c b
log_b(1/a) = −log_b a
log_b a = 1 / log_a b

2.1.3 Summation

Arithmetic series:
  Σ_{k=1}^{n} k = 1 + 2 + 3 + ... + n = n(n + 1)/2 = Θ(n^2)

Geometric series (x ≠ 1):
  Σ_{k=0}^{n} x^k = 1 + x + x^2 + ... + x^n = (x^(n+1) − 1)/(x − 1)

Harmonic series:
  Σ_{k=1}^{n} 1/k = ln n + O(1)

Telescoping series: for any sequence a_0, a_1, ..., a_n,
  Σ_{k=0}^{n−1} (a_k − a_{k+1}) = (a_0 − a_1) + (a_1 − a_2) + ... + (a_{n−1} − a_n) = a_0 − a_n
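The closed forms in §2.1.3 can be sanity-checked numerically; a small Python sketch (not part of the original notes):

```python
import math

n = 100
x = 3

# Arithmetic series: 1 + 2 + ... + n = n(n + 1)/2.
assert sum(range(1, n + 1)) == n * (n + 1) // 2

# Geometric series: sum of x^k for k = 0..n = (x^(n+1) - 1)/(x - 1).
assert sum(x**k for k in range(n + 1)) == (x**(n + 1) - 1) // (x - 1)

# Harmonic series: sum of 1/k is ln n + O(1); the gap tends to the
# Euler-Mascheroni constant (about 0.5772), so it stays bounded.
harmonic = sum(1 / k for k in range(1, n + 1))
assert 0 < harmonic - math.log(n) < 1

# Telescoping: sum of (a_k - a_{k+1}) collapses to a_0 - a_n.
a = [k * k for k in range(n + 1)]  # any sequence works
assert sum(a[k] - a[k + 1] for k in range(n)) == a[0] - a[n]
```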
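The loop-invariant technique of §1.1.1 can be illustrated with insertion sort (a sketch; the invariant wording follows the standard initialization/maintenance/termination pattern, not a specific proof from the notes):

```python
def insertion_sort(a):
    """Sort list a in place and return it.

    Loop invariant: at the start of each iteration of the outer
    loop, a[0..i-1] consists of the original first i elements,
    in sorted order.
    """
    for i in range(1, len(a)):
        # Initialization: a[0..0] is a single element, trivially sorted.
        key = a[i]
        j = i - 1
        # Shift elements larger than key one slot to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
        # Maintenance: a[0..i] is now sorted, so the invariant holds
        # at the start of the next iteration.
    # Termination: the loop ends with i = len(a) - 1 processed,
    # so the invariant gives: the whole array is sorted.
    return a
```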
4.2 Classification of Sorting Algorithms

4.2.1 Running time

• O(n^2)

• O(n log n)

4.2.2 In-place

• Uses very little additional memory beyond that used by the data, usually O(1).

• Insertion Sort

• Quicksort (O(lg n) additional memory with proper implementation)

4.2.3 Stable

• The original order of equal elements is preserved after sorting.

• Insertion Sort

• Merge Sort

4.2.4 Comparison or not

• Comparison-based: compares the elements. Ω(n log n) is the lower bound for comparison-based sorting.

• Non-comparison-based: no comparisons between elements; linear time.

4.3 Comparison based sorting

• Bubble Sort

• Selection Sort

4.4 Linear time sorting

• Counting Sort

• Radix Sort

5 Week 5 - Randomized Algorithms

An algorithm is called randomized if its behavior is determined not only by its input but also by values produced by a random-number generator.

5.1 Types of Randomized Algorithms

1. Monte Carlo Algorithm: randomized algorithm that gives the correct answer with probability 1 − o(1) ("high probability"), but the runtime bound holds deterministically.

   • Example: estimate π by randomly sampling n points (x, y), counting the fraction satisfying x^2 + y^2 ≤ 1, then multiplying by 4.

   • Runtime is Θ(n), but the answer is only an approximation.

2. Las Vegas Algorithm: randomized algorithm that always gives the correct answer, but the runtime bounds depend on the random numbers.

5.2 Average vs Expected running time

• Average running time: for non-randomized algorithms whose cost depends on the input; if we know the distribution of the input, we can find the average running time.

• Expected running time: for randomized algorithms, the running time depends on the random-number generator, even for the same input. The "average" running time over all possible random numbers is the expected running time.

5.3 Probability

• If A and B are not mutually exclusive (i.e. A ∩ B ≠ ∅): Pr{A ∪ B} = Pr{A} + Pr{B} − Pr{A ∩ B}

• Two events are independent if Pr{A ∩ B} = Pr{A} · Pr{B}

• The conditional probability of an event A given event B is

  Pr{A | B} = Pr{A ∩ B} / Pr{B}    whenever Pr{B} ≠ 0

• Bayes' Theorem:

  Pr{A | B} = Pr{A} Pr{B | A} / Pr{B}
            = Pr{A} Pr{B | A} / (Pr{A} Pr{B | A} + Pr{Ā} Pr{B | Ā})

• Linearity of expectation:

  E[X + Y] = E[X] + E[Y]
  E[aX] = a E[X]

• Expectation of a product, if X and Y are independent:

  E[XY] = E[X] E[Y]

• Bernoulli trial: an instance of a Bernoulli trial has probability p of success and probability 1 − p = q of failure.

• Geometric distribution: suppose we have a sequence of independent Bernoulli trials, each with probability p of success. Let X be the number of trials needed to obtain success for the first time. Then X follows the geometric distribution:

  Pr{X = k} = q^(k−1) p,    E[X] = 1/p

• Binomial distribution: let X be the number of successes in n Bernoulli trials. Then X follows the binomial distribution:

  Pr{X = k} = C(n, k) p^k q^(n−k),    E[X] = np

5.4 Indicator Random Variable Method

Indicator random variable for an event A: X_A = 1 if A occurs and 0 otherwise, so that E[X_A] = Pr{A}.
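The Monte Carlo π example from §5.1 can be sketched as follows (an assumed implementation, not code from the notes):

```python
import random

def estimate_pi(n, seed=0):
    """Monte Carlo estimate of pi: sample n random points (x, y)
    in the unit square, count the fraction falling inside the
    quarter circle x^2 + y^2 <= 1, and multiply by 4.

    Runs in Theta(n) but only approximates pi: the estimate is
    close to pi with high probability, never exact.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1:
            hits += 1
    return 4 * hits / n
```

Increasing n tightens the estimate (the standard deviation shrinks like 1/sqrt(n)), which is the usual Monte Carlo accuracy/runtime trade-off.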
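Counting sort, listed under linear-time sorting in §4.4, can be sketched as follows (a stable implementation; the `key` parameter is an added convenience, not from the notes):

```python
def counting_sort(a, key=lambda x: x):
    """Stable linear-time sort for items with small non-negative
    integer keys. Runs in O(n + k), where k is the maximum key:
    no comparisons between elements are performed.
    """
    if not a:
        return []
    k = max(key(x) for x in a)
    count = [0] * (k + 1)
    for x in a:                      # count occurrences of each key
        count[key(x)] += 1
    for i in range(1, k + 1):        # prefix sums give final positions
        count[i] += count[i - 1]
    out = [None] * len(a)
    for x in reversed(a):            # right-to-left keeps equal keys stable
        count[key(x)] -= 1
        out[count[key(x)]] = x
    return out
```

Radix sort applies this stable sort digit by digit, which is why stability matters here.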
6 Week 6 - Order Statistics

Given an unsorted list, we want to find the element that has rank i.

• i = 1: minimum

• i = n: maximum

• i = ⌊(n + 1)/2⌋ or ⌈(n + 1)/2⌉: median

Normally achieved by first sorting the list, then reporting the element at index i: Θ(n lg n).

6.1 Finding the rank-i element

6.1.1 Randomized Divide and Conquer

The idea is to randomly select a pivot, then check the rank of the pivot. If the element we are looking for is in the left sublist, recurse on the left sublist; if it is in the right, recurse on the right.

6.1.2 Worst case linear time

The idea is that we generate a good pivot each time, instead of randomly. The median of the medians of the groups has at least 3n/10 elements greater than it and at least 3n/10 elements less than it.

7 Week 7 - Amortized Analysis

Amortized analysis is a strategy for analyzing a sequence of operations to show that the average cost per operation is small, even though a single operation within the sequence might be expensive. It does not use probability.

Consider the following case: we have a dynamic array, initially of size 1. When we insert and the array is full (overflow), we copy everything over to a new array of twice the size.

7.1 Aggregate method

The idea is to use maths directly. Based on the problem above, let t(i) = cost of the ith insertion:

  t(i) = i   if i − 1 is an exact power of 2
  t(i) = 1   otherwise

Cost of n insertions = Σ_{i=1}^{n} t(i) ≤ n + Σ_{j=0}^{⌊lg(n−1)⌋} 2^j ≤ 3n

Thus the average cost of each insertion into the dynamic array is O(n)/n = O(1).

7.2 Accounting method

The idea is to impose an extra charge on inexpensive operations and use it to pay for expensive operations later on. Excess money goes to the bank, to be used for future operations. We need to prove that the bank will never go negative.

Some observations in the problem above:

• If insertion i causes an overflow, the next overflow happens at insertion 2i.

• Between the two overflows there are i insertions, and at the next overflow we need to copy over 2i items.

Thus, to find the amortized cost:

• Each insertion is charged $3

  – $1 pays for the current insertion
  – $2 is stored to handle a future overflow

• In the case of no overflow, insert for $1 and store $2.

• In the case of overflow:

  1. Between this overflow and the previous overflow we had i insertions
  2. So we have $2i in the bank; use this to pay for copying everything over, leaving $0 in the bank
  3. The new item that caused the overflow is then inserted into the bigger array normally

7.3 Potential method

φ: potential function associated with the algorithm/data structure. φ(i): potential at the end of the ith operation.

Important conditions to be fulfilled by φ:

• φ(0) = 0

• φ(i) ≥ 0 for all i

Amortized cost of the ith operation
  = actual cost of the ith operation + Δφ(i)
  = cost of the ith operation + (φ(i) − φ(i − 1))

Amortized cost of n operations
  = Σ_i amortized cost of the ith operation
  = actual cost of n operations + φ(n)
  ≥ actual cost of n operations

We need to find a suitable potential function φ so that for the costly operation Δφ(i) is negative, such that it nullifies or reduces the effect of the actual cost. Try to find what is decreasing in the expensive operation.
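The dynamic-array example above can be checked by simulation: charge t(i) as in the aggregate method, and track the accounting method's bank balance (a sketch, not from the notes):

```python
def insertion_cost(i):
    """t(i) from the aggregate method: cost i if i - 1 is an exact
    power of 2 (the insertion overflows and copies), else 1."""
    prev = i - 1
    if prev > 0 and prev & (prev - 1) == 0:  # prev is 2^j
        return i
    return 1

def total_cost(n):
    """Aggregate method: total cost of n insertions, bounded by 3n."""
    return sum(insertion_cost(i) for i in range(1, n + 1))

def bank_after(n):
    """Accounting method: each insertion deposits $3 and pays its
    actual cost; the bank must never go negative."""
    bank = 0
    for i in range(1, n + 1):
        bank += 3 - insertion_cost(i)
        assert bank >= 0, f"bank went negative at insertion {i}"
    return bank
```

A non-negative bank for every prefix is exactly what makes the $3 amortized charge valid, which in turn implies the aggregate bound total_cost(n) ≤ 3n.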
For the example above, take φ(i) = 2i − size(T), where size(T) is the current capacity of the array. Consider each step:

1. The ith insertion causes an overflow. Actual cost = i; size = i − 1 before the new array is created.

   • φ(i − 1) = 2(i − 1) − (i − 1) = i − 1
   • φ(i) = 2i − 2(i − 1) = 2
   • Δφ(i) = 2 − (i − 1) = 3 − i
   • Amortized cost = i + (3 − i) = 3

2. The ith insertion does not cause an overflow. Actual cost = 1; size = l.

   • φ(i − 1) = 2(i − 1) − l
   • φ(i) = 2i − l
   • Δφ(i) = 2
   • Amortized cost = 1 + 2 = 3

8 Week 8 - Dynamic Programming

Suppose we have a recursive solution where:

• overall there are only a polynomial number of subproblems, and

• there is a huge overlap among the subproblems, so the recursive algorithm takes exponential time because it solves the same subproblem many times.

Then we compute the recursive solution iteratively in a bottom-up manner, and memoize / remember past solutions to avoid wasted computation.

9 Week 9 - Greedy Algorithm

At each step of solving the problem, we need only solve one (greedy) subproblem. Given a problem P and an instance A of size n of problem P, we perform a greedy step to reduce the problem to an instance A′ of size < n of problem P.

To prove that the greedy step is correct (OPT denotes an optimal solution):

1. Try to establish a relation between OPT(A) and OPT(A′)

2. Try to prove the relation formally by

   • deriving a (not necessarily optimal) solution of A from OPT(A′)
   • deriving a (not necessarily optimal) solution of A′ from OPT(A)

3. If you succeed, the algorithm is correct.

10 Week 10 - Intractability

To determine how hard a problem is, we use the idea of reduction:

  A ⇒ B

We take a general instance of problem A and transform it into a specific instance of problem B. We can then use some algorithm M to solve B, and use its solution to solve A.

If B is easy → A is easy; equivalently, if A is hard → B is hard. We focus on "if A is hard → B is hard", written

  A ≤p B

We must make sure the reduction is very fast, or else there is no point: the reduction must run in polynomial time.

10.1 Optimization vs Decision

Optimization:

• gives the min/max

• satisfies constraints

• ex: visit each city with minimum cost

Decision:

• given a problem instance, check whether it is solvable or not

• ex: can I visit every city within cost k?

10.2 Reductions between Decision Problems

Given two decision problems A and B, a polynomial-time reduction from A to B, denoted A ≤p B, is a transformation from an instance α of A to an instance β of B such that:

• The transformation must run in time polynomial in the size of α (the input encoding length).

• α is a YES-instance for A if and only if β is a YES-instance for B

  – yes for B → yes for A
  – yes for A → yes for B

11 Week 11 - NP-completeness

Complexity grouping:

• P (polynomial): the set of decision problems which have an efficient poly-time algorithm.

• NP (non-deterministic polynomial time): the set of all decision problems which have an efficient certifier.

• NP-complete: a problem X in NP is NP-complete if for every A ∈ NP, A ≤p X.

• NP-hard: X is NP-hard if A ≤p X for every A ∈ NP; membership of X in NP is not required. If X is not known to be in NP, we just say X is NP-hard.

How to show that a problem is NP-complete:

1. Let X be the problem we wish to prove is NP-complete

2. Show that X ∈ NP

3. Pick a problem A which is already known to be NP-complete

4. Show that A ≤p X

12 Week 12 - Approximation

Sometimes a problem is just too hard to solve exactly, so we approximate the solution. For an optimization problem, find a solution that is nearly optimal in cost.

12.1 Approximation Ratio

Let C* be the cost of an optimal solution and C be the cost of the solution given by an approximation algorithm. Here cost refers to the value being maximized or minimized.

An approximation algorithm has an approximation ratio ρ(n) if:

• for a minimization problem

  C/C* ≤ ρ(n),  ρ(n) ≥ 1

• for a maximization problem

  C/C* ≥ ρ(n),  ρ(n) ≤ 1
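The bottom-up memoization idea from §8 can be sketched with the Fibonacci recurrence (a standard illustration, not necessarily the module's example): the naive recursion is exponential because it recomputes the same subproblems, while filling a table bottom-up visits each of the O(n) subproblems once.

```python
def fib_bottom_up(n):
    """F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2).

    Naive recursion on this definition takes exponential time
    because the two recursive calls overlap heavily; memoizing
    the n + 1 subproblems bottom-up takes O(n) additions.
    """
    memo = [0, 1]  # memo[i] holds F(i); base cases F(0), F(1)
    for i in range(2, n + 1):
        memo.append(memo[i - 1] + memo[i - 2])
    return memo[n]
```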
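A classic heuristic achieving approximation ratio ρ(n) = 2 for minimum Vertex Cover (a standard example, not necessarily the one covered in the module): repeatedly pick an uncovered edge and take both of its endpoints. The picked edges are pairwise disjoint, and any cover, including the optimal one, must use at least one endpoint of each picked edge, so C ≤ 2C*.

```python
def vertex_cover_2approx(edges):
    """Matching-based heuristic for minimum vertex cover:
    take BOTH endpoints of each edge not yet covered.

    The chosen edges form a matching; an optimal cover C* needs
    at least one endpoint per matched edge, so the returned cover
    has size <= 2|C*|: approximation ratio 2 (minimization).
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover
```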
12.2 Analyzing Approximation Algorithms

• Analyze a heuristic: a heuristic is a procedure that does not always produce the optimal answer, but we can show that it is not too bad. Compare the heuristic with an optimal solution to find the approximation ratio.

• Solve a Linear Programming relaxation: not covered.

13 Problem Definitions and Algorithms

13.1 3-SAT

is NP-complete

Definition: SAT where each clause contains exactly 3 literals corresponding to different variables, e.g.

  (x̄1 ∨ x2 ∨ x3) ∧ (x1 ∨ x̄2 ∨ x3)

Decision Version: given a 3-SAT formula, does there exist a satisfying assignment?

13.3 Independent Set

Definition: given an undirected graph G = (V, E), a subset X ⊆ V is said to be an independent set if for each u, v ∈ X, (u, v) ∉ E, i.e. there is no edge connecting any two vertices in the independent set.

Optimization Version: compute an independent set of largest size.

Decision Version: does there exist an independent set of size > k?

13.4 Vertex Cover

is NP-complete

Definition: given a graph G = (V, E), a vertex cover V′ is a subset of V such that ∀(u, v) ∈ E : u ∈ V′ ∨ v ∈ V′, i.e. every edge has at least one endpoint in the vertex cover.

Optimization Version: what is the smallest size of a vertex cover?

Decision Version: does there exist a vertex cover of size k?

13.5 Partition

Definition: given a set of positive integers S, can the set be partitioned into two sets of equal total sum?

Clique

Definition: given a graph G = (V, E), a clique is a subset V′ of V such that all vertices in V′ are adjacent to each other.

Optimization Version: what is the largest clique?

Decision Version: does there exist a clique of size k?

13.8 Knapsack

Definition: given n items described by non-negative integer pairs (wi, vi), a capacity W and a value V, where wi denotes the weight of an item and vi denotes the value of the item.

Optimization Version: what is the maximum value of a subset of items of total weight at most W?

Decision Version: is there a subset of items of total weight at most W and total value at least V?
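The 0/1 Knapsack problem of §13.8 has a classic dynamic-programming solution in O(nW) time; this is pseudo-polynomial (polynomial in the numeric value W, not in its encoding length), so it does not contradict NP-completeness. A sketch, assuming the (wi, vi) encoding above:

```python
def knapsack_max_value(items, W):
    """Optimization version. items: list of (weight, value) pairs
    with non-negative integer weights; W: capacity.

    dp[c] = best value achievable with total weight <= c.
    O(nW) time and O(W) space.
    """
    dp = [0] * (W + 1)
    for w, v in items:
        # iterate capacities downwards so each item is used at most once
        for c in range(W, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[W]

def knapsack_decision(items, W, V):
    """Decision version: is there a subset of total weight at most W
    and total value at least V?"""
    return knapsack_max_value(items, W) >= V
```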