Final Review

The document outlines various algorithms and optimization problems, including methods for finding the minimum steps to reach 1 from an integer, calculating drone drop requirements, and maximizing sandwich production under resource constraints. It discusses dynamic programming, flow networks, and linear programming techniques to solve these problems efficiently. Additionally, it emphasizes the importance of optimality and correctness in algorithm design.

Uploaded by

salwanaveedasif

Consider the following process.

At all times you have a single positive integer x, which is initially equal to some value n. In each step, you can either decrement x by 1 or halve x (only if x is even). Your goal is to produce 1 given a starting value n. For example, you can reach 1 starting from the integer 10 in four steps as follows: 10 → 5 → 4 → 2 → 1, using a halving, a decrement, and then two more halvings. Obviously you can get to 1 from any integer n using exactly n − 1 decrements, but for almost all values of n this is horribly inefficient. Describe an algorithm to compute the minimum number of steps required to reach 1 from any given integer n.

Solution: Algorithm: We first find the binary representation of n and process its bits from least significant to most significant (right to left). At each step, if the current bit is 1, we perform a decrement (which clears that bit); otherwise we halve. Once the halving happens, we proceed to the next bit, since we have reduced the number of bits in the number by one. We stop when only the most significant bit remains, that is, when x = 1. The remaining parts were not asked for, but are presented for your benefit.

Algorithm Correctness: Our process consumes the binary representation of n starting from the least significant bit. Every decrement clears a trailing 1 and every halving removes a trailing 0, so each step is legal and the process terminates exactly when x = 1. That proves the correctness of the algorithm.

Running time: Computing the binary representation of n requires O(log n) time, and since we perform at most two operations per bit, that also requires O(log n) time, so the whole algorithm runs in O(log n) time.

Proof of Optimality: Let T(x) be the minimum number of steps needed to reach 1 from x; we show that our algorithm uses exactly T(n) steps. Suppose for contradiction that some other algorithm uses fewer steps, and let x be the smallest starting value on which it does. If x is odd, halving is illegal, so both algorithms must decrement to x − 1, where our algorithm is optimal by the choice of x. Hence x is even, say x = 2k, and the two algorithms differ in their first operation: ours halves and the other decrements. (If instead our algorithm decremented while the other halved, the other algorithm would be performing an illegal step, since our algorithm always halves when it can and decrements only when halving is illegal.) After its decrement the other algorithm holds the odd value 2k − 1 and, for k > 1, must decrement again to 2k − 2, so it uses at least 2 + T(2k − 2) = 3 + T(k − 1) steps, using the optimality of our algorithm on the smaller value 2k − 2. Our algorithm uses 1 + T(k) ≤ 2 + T(k − 1) steps, because a single decrement takes k to k − 1. So our algorithm is strictly faster, a contradiction. (For k = 1 both algorithms reach 1 in a single step.)
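
The greedy rule above (halve when even, otherwise decrement) can be sketched in Python as follows. The function names are illustrative and not part of the original solution; the second function is a plain dynamic program over all values up to n, included only as an independent check of the greedy answer.

```python
def min_steps(n: int) -> int:
    """Greedy: halve when x is even, otherwise decrement, until x == 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps, x = 0, n
    while x > 1:
        x = x // 2 if x % 2 == 0 else x - 1
        steps += 1
    return steps


def min_steps_dp(n: int) -> int:
    """Reference answer: try both legal moves for every value up to n."""
    best = [0] * (n + 1)              # best[1] = 0; best[0] is unused
    for x in range(2, n + 1):
        best[x] = 1 + best[x - 1]     # decrement
        if x % 2 == 0:
            best[x] = min(best[x], 1 + best[x // 2])  # halve
    return best[n]


print(min_steps(10))  # the example from the text: 10 -> 5 -> 4 -> 2 -> 1
```

Because every halving drops one bit and every decrement clears one 1 bit, the greedy count also equals (number of bits − 1) + (number of 1 bits − 1).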

Complete the following description of a dynamic programming algorithm which will compute the smallest number of drone drops required to guarantee finding the floor at which the drone breaks.

Solution:
1. English Description: Let D(n, i) be the minimum number of drone drops required to find the threshold floor (the lowest floor from which a drop breaks the drone) in an n-floor building given access to i drones. Suppose the first drop is made from floor x. If the drone breaks, then we know we only need to look at the floors below, that is, x − 1 floors, using the remaining i − 1 drones. On the other hand, if the drone doesn't break, then we can use drone i again but will only need to consider the top n − x floors. Since we do not know whether we'll need to look at the bottom x − 1 floors or the top n − x floors, we'll pretend that we're always forced into the worst option and consider the case that forces us to make as many moves as possible. However, we still control the value of x and can pick the best possible floor x for the first drop attempt out of all possible x ∈ {1, . . . , n}.
2. Recurrence:
D(n, i) = 1 if n = 1
D(n, i) = n if i = 1
D(n, i) = 1 + min over x ∈ {1, . . . , n} of max(D(x − 1, i − 1), D(n − x, i)) otherwise
with D(0, i) = 0, since an empty range of floors needs no drops (this value is needed when x = 1 or x = n).
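
As a minimal sketch of how this recurrence can be filled in bottom-up, the following Python function builds the table directly. The name `min_drops` and the table layout `D[i][n]` are assumptions made for illustration; the base case D(0, i) = 0 is assumed so that the terms for x = 1 and x = n are defined.

```python
def min_drops(floors: int, drones: int) -> int:
    """Worst-case number of drops needed with the given floors and drones."""
    if floors < 0 or drones < 1:
        raise ValueError("need floors >= 0 and drones >= 1")
    # D[i][n] = minimum drops for an n-floor building with i drones.
    D = [[0] * (floors + 1) for _ in range(drones + 1)]
    for n in range(floors + 1):
        D[1][n] = n                       # one drone: try floors bottom-up
    for i in range(2, drones + 1):
        for n in range(1, floors + 1):
            D[i][n] = 1 + min(
                max(D[i - 1][x - 1],      # drone breaks: x - 1 floors below
                    D[i][n - x])          # drone survives: n - x floors above
                for x in range(1, n + 1)
            )
    return D[drones][floors]


print(min_drops(100, 2))
```

This direct version takes O(i · n²) time for n floors and i drones, which is fine for small inputs.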

Flow: Complete the flow network below such that it can be used to determine if there is a way to assign doctors to night shifts. If there is a way, you should describe what the assignments are; otherwise, you should output that it is not possible.

Solution: A capacity of m on the edges from s to the doctors ensures that no doctor can work more than m night shifts. A capacity of s_k on the edge from night shift k to the sink allows us to check whether the max flow saturates these edges, which implies that the requirement of each night shift is satisfied. The edges from doctors to shifts (unit capacity) are created based on the personal constraints P_i that the doctors have. An assignment is possible if the max flow saturates all edges from shifts to the sink. Equivalently, there is an assignment if the maximum flow is equal to ∑_{k=1}^{n} s_k. To find the assignment, we look at the edges carrying non-zero flow: any doctor-to-shift edge with a flow of 1 means that that doctor has been assigned to that particular night shift.

Diagram: (figure not reproduced)
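
The network described above can be sketched in Python as follows. This is an illustration under stated assumptions: the input format (`prefs[i]` is the set of shifts doctor i is willing to work, `need[k]` is s_k), the node labels, and the function names are all hypothetical, and Edmonds-Karp is used as one possible max-flow routine since the text does not name one.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Edmonds-Karp; cap[u][v] holds residual capacities, updated in place."""
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:      # BFS for an augmenting path
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:          # push flow along the path
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        total += bottleneck


def assign_doctors(prefs, need, m):
    """Return a list of (doctor, shift) pairs, or None if no assignment exists."""
    cap = defaultdict(lambda: defaultdict(int))
    for i, shifts in enumerate(prefs):
        cap["s"][("doc", i)] = m              # at most m night shifts per doctor
        for k in shifts:
            cap[("doc", i)][("shift", k)] = 1
    for k, s_k in enumerate(need):
        cap[("shift", k)]["t"] = s_k          # shift k needs s_k doctors
    if max_flow(cap, "s", "t") != sum(need):
        return None
    # A unit edge carries flow exactly when its residual capacity is 0.
    return [(i, k) for i, shifts in enumerate(prefs) for k in shifts
            if cap[("doc", i)][("shift", k)] == 0]


print(assign_doctors([{0, 1}, {1}, {0, 2}], [1, 1, 1], m=1))
```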

DP: Describe the iterative approach (clearly, pseudocode preferred) that will be used to calculate the value of C_n for a given n ≥ 0 based on the following recurrence. Your approach should be based on a dynamic programming based memoization of the necessary subproblems required to compute the value of C_n. Your approach must only require looking at each entry in the memoized DP table once to compute C_n for any value of n.

C_0 = 1 and C_n = ∑_{i=1}^{n} C_{i−1} C_{n−i} for n > 0

Solution: We will create an array to store intermediate values of C_j and utilize this array to quickly find the next element C_n rather than recalculating all previous values of C_j. Initialize our array C with 1 since C_0 = 1. For computing any intermediate value C_j we utilize two pointers into the array C, the left one set to ℓ = 0 and the other set to r = j − 1, so that the first product is C_0 · C_{j−1}. At each step, we multiply the values C_ℓ and C_r and add the product to our running total for C_j. Then ℓ is incremented by 1 and r is decremented by 1. This continues until ℓ ≥ r, at which point we double our running total. If ℓ > r, we are done and return this new total. Otherwise, if ℓ = r, then j must have been odd, so we additionally add C_r · C_r to the new total and return this. We utilize the symmetry of the summation in the recursive formula for C_j to visit each value in the C array only once when computing C_j for a new j.
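
The two-pointer scheme can be sketched in Python as follows, assuming a 0-indexed array in which `C[0]` holds C_0; the function name is illustrative.

```python
def catalan(n: int) -> int:
    """Compute C_n bottom-up, reading each stored entry once per new value."""
    C = [1]                         # C[0] = C_0 = 1
    for j in range(1, n + 1):
        total = 0
        left, right = 0, j - 1      # first product is C_0 * C_{j-1}
        while left < right:
            total += C[left] * C[right]
            left += 1
            right -= 1
        total *= 2                  # each product appears twice by symmetry
        if left == right:           # j is odd: middle term C_r * C_r once
            total += C[right] * C[right]
        C.append(total)
    return C[n]


print([catalan(k) for k in range(8)])
```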

Given an array of integers A, the subsequence with the maximum sum can be found using one pass. During the pass we delete all numbers n satisfying the condition n < 0. This leaves only the non-negative numbers, which together form the subsequence with the maximum sum. In case all numbers satisfy the earlier condition, the (non-empty) subsequence with maximum sum consists of just the largest number in the array, which can be tracked during the same pass.

Complete the provided (incomplete) framing of this optimization problem in the language of a linear program.

Table 1: Arbitrage opportunities
Commodity            Transaction 1   Transaction 2   Transaction 3   Max allowed trade
I                    -1              3.14            4.3             100
II                   +2.5            -2.7            1.41            70
III                  +0.5            1               -1.5            20
Transaction amount   t1              t2              t3

max (−1 + 2.5 + 0.5) · t1 + (3.14 − 2.7 + 1) · t2 + (4.3 + 1.41 − 1.5) · t3   (1)
s.t. 1 · t1 ≤ 100   (2)
     2.7 · t2 ≤ 70   (3)
     1.5 · t3 ≤ 20   (4)
     t1, t2, t3 ≥ 0   (5)

Explain what each line in the LP accomplishes.
Line 1: Maximizes the profit, which is the net change in holdings due to each transaction (depending on how that transaction affects the holdings of each commodity).
Line 2: Ensures that we do not trade away more than 100 units of commodity I, which is given up only in transaction 1.
Line 3: Ensures that we do not trade away more than 70 units of commodity II, which is given up only in transaction 2.
Line 4: Ensures that we do not trade away more than 20 units of commodity III, which is given up only in transaction 3.
Line 5: We cannot perform negative amounts of transactions.

Given two sorted arrays A and B containing n integers each, find the median of all 2n integers. That is, find the median of the combined/merged array. Your algorithm should run in O(lg n) time, which means that you cannot perform the actual merge operation on the arrays to find the median. Thus your algorithm should somehow utilize the sorted nature of A and B when finding the median.

Solution: We solve a more general problem, finding the kth smallest element from two sorted arrays of potentially different lengths. First, compare the "middle" elements of the two arrays. If they are equal then that's the median, since half the elements in each array are at most this value, so in total half of all the integers are at most this value. If the middle element of A is smaller, then the median cannot lie in the left half of A or in the right half of B, so both can be discarded (the same number of elements from each, which leaves the median of what remains unchanged) and the search continues in the right half of A and the left half of B. A similar step can be done if the middle element of A is larger. To handle a general kth smallest element: if k is more than half the combined sizes of A and B, then we know that one of the left halves can be discarded (followed by an adjustment to the value of k), namely the left half of the array with the smaller middle element. If k is less than half the combined sizes of A and B, then we know that one of the right halves can be discarded, namely the right half of the array with the larger middle element. Thus, in each step we are able to discard half of one of the two arrays, which gives us a binary-search-like recurrence and a logarithmic running time matching the required O(lg n).
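
The general kth-smallest procedure can be sketched in Python with index windows instead of array copies, so that each step costs O(1). The function names, the 0-indexed k, and the choice to report the median of 2n values as the average of the two middle elements are assumptions made for illustration.

```python
def kth_smallest(a, b, k):
    """Return the k-th smallest (0-indexed) element of sorted lists a and b."""
    if not 0 <= k < len(a) + len(b):
        raise IndexError("k out of range")
    a_lo, a_hi = 0, len(a)            # active window a[a_lo:a_hi]
    b_lo, b_hi = 0, len(b)            # active window b[b_lo:b_hi]
    while True:
        if a_lo == a_hi:
            return b[b_lo + k]
        if b_lo == b_hi:
            return a[a_lo + k]
        ia = (a_hi - a_lo) // 2       # offset of the middle of each window
        ib = (b_hi - b_lo) // 2
        ma, mb = a[a_lo + ia], b[b_lo + ib]
        if ia + ib < k:
            # k is large: drop the left half holding the smaller middle.
            if ma > mb:
                b_lo += ib + 1
                k -= ib + 1
            else:
                a_lo += ia + 1
                k -= ia + 1
        else:
            # k is small: drop the right half holding the larger middle.
            if ma > mb:
                a_hi = a_lo + ia
            else:
                b_hi = b_lo + ib


def median_of_two(a, b):
    """Median of two equal-length sorted lists (average of the two middles)."""
    n = len(a)
    return (kth_smallest(a, b, n - 1) + kth_smallest(a, b, n)) / 2


print(median_of_two([1, 3, 5], [2, 4, 6]))
```

Every iteration halves one of the two windows, so the loop runs O(lg |A| + lg |B|) times.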

Describe an O(m + n) algorithm that searches for a target value t in an m × n integer matrix M which has the following properties:
• Integers in each row are sorted in ascending order from left to right.
• Integers in each column are sorted in ascending order from top to bottom.

 1  4  7 11 15
 2  5  8 12 19
 3  6  9 16 22
10 13 14 17 24
18 21 23 26 30
Table 2: Example of the sorted search grid

Solution: We perform a binary-search-like algorithm. First we compare the target with the middle element of the grid (middle row, middle column), denoted by (rm, cm). If they are equal then we have found the target and return these indices. If that element is smaller than the target t, then we recursively search in the subgrid defined by ((rm + 1, 1), (m, cm)), the subgrid defined by ((rm + 1, cm + 1), (m, n)) and the subgrid defined by ((1, cm + 1), (rm, n)). In particular we are able to discard the whole subgrid defined by ((1, 1), (rm, cm)). Alternatively, if the target t is smaller than the middle element, then we proceed like the previous case except this time we discard the subgrid defined by ((rm, cm), (m, n)) and recurse on the subgrids ((1, 1), (rm − 1, cm − 1)), ((rm, 1), (m, cm − 1)) and ((1, cm), (rm − 1, n)). At each step we discard 1/4 of the grid, giving us the recurrence T(N) = 3T(N/4) + O(1) where N = (max(m, n))^2. By the master theorem this recurrence solves to T(N) = O(N^(log_4 3)) ≈ O(max(m, n)^1.59), not O(lg N), so this recursion by itself does not meet the required O(m + n) bound. The bound is met by the standard staircase walk: start at the top-right corner, move one column left whenever the current entry is larger than t, and one row down whenever it is smaller. Each step discards a whole row or a whole column, so at most m + n steps are taken.
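
A staircase search that meets the O(m + n) bound can be sketched in Python as follows. Note that this is a different technique from the recursive quartering described in the solution; the function name and the 0-indexed (row, column) return value are assumptions made for illustration.

```python
def search_sorted_grid(M, t):
    """Return (row, col) of t in a row- and column-sorted matrix, or None."""
    if not M or not M[0]:
        return None
    r, c = 0, len(M[0]) - 1           # start at the top-right corner
    while r < len(M) and c >= 0:
        if M[r][c] == t:
            return (r, c)
        if M[r][c] > t:
            c -= 1                    # everything below in this column is larger
        else:
            r += 1                    # everything left in this row is smaller
    return None


grid = [
    [1, 4, 7, 11, 15],
    [2, 5, 8, 12, 19],
    [3, 6, 9, 16, 22],
    [10, 13, 14, 17, 24],
    [18, 21, 23, 26, 30],
]
print(search_sorted_grid(grid, 5), search_sorted_grid(grid, 20))
```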

You have been called in to help the leader of Pillowtown determine the smallest number of hallways in which personnel would need to be placed so that, whichever path the elite shock troops of Sgt. Chang might take, your doomsday device can be deployed to defend Pillowtown's base. You can assume that the blueprint of Greendale is available to you, depicting the rooms where the bases are as well as all the hallways and their intersections.

Solution: This is essentially an application of Menger's theorem. We want to place surveillance on as few hallways as possible to ensure that all paths from the Blanketsburg base to the Pillowtown base are covered. This corresponds to removing as few edges as possible while still disconnecting every path from a start vertex to an end vertex. Menger's theorem tells us that this minimum number of edges equals the maximum number of edge-disjoint paths between the two vertices, and we've already seen how to compute that using max-flow with unit capacity edges; the hallways to guard are the edges of a corresponding minimum cut.

Everything is the same as Minimum Cost Flow, except instead of minimizing cost, we want to maximize it.

Solution: We utilize the fact that in min cost flow, the price/cost of sending flow through an edge is allowed to be negative. Specifically, we use the fact that minimization of x is equivalent to maximization of −x. So, to solve a maximum cost flow problem, we reduce it to min cost flow by simply creating new prices p′(e) = −p(e).

The way AonDor's magic works is by selecting a subset of points on a grid as conduits for raw energy. However, care needs to be taken that you do not select more than the allowed number of conduits per row and column. The allowed number of selections differs by row and column for each grid. All of these control what the specific effects of the magic are, and the more points that are chosen, the more powerful the magic. As a young Elantrian learning this magic, you have been tasked to find the largest selection that can be made given a grid. That is, you are given an n × n grid with each index (i, j) filled with one of {0, 1}, where 1 indicates that the point (i, j) could be chosen and 0 indicates that (i, j) cannot be chosen. Along with the grid, you also have an upper limit r_i on how many points can be chosen for each row i, and an upper limit c_j on how many points can be chosen for each column j. Design an algorithm that takes the grid, the r_i values, and the c_j values as input and chooses the largest possible subset of points to unleash the maximum possible magical energy.

Solution: We'll solve this using max-flow. Create a source node s and a sink node t. Create a node for each (i, j) that can be selected. Create a node for each row i and each column j. Connect source s to each row node i with capacity r_i, and connect each column node j to sink t with capacity c_j. Connect each (i, j) with an incoming unit capacity directed arc from row node i and with an outgoing directed arc of unit capacity to column node j. Compute the max-flow; its value is the number of (i, j) points that can be selected, and the specific selection is all the nodes (i, j) which have an incoming flow of 1.
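
The construction above can be sketched in Python as follows. The function names and node labels are illustrative assumptions, and Edmonds-Karp is used as one possible max-flow routine since the solution does not prescribe one.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Edmonds-Karp; cap[u][v] holds residual capacities, updated in place."""
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        total += bottleneck


def select_conduits(grid, row_limit, col_limit):
    """Return (count, chosen cells) for the largest legal selection."""
    n = len(grid)
    cap = defaultdict(lambda: defaultdict(int))
    for i in range(n):
        cap["s"][("row", i)] = row_limit[i]
    for j in range(n):
        cap[("col", j)]["t"] = col_limit[j]
    for i in range(n):
        for j in range(n):
            if grid[i][j] == 1:       # only selectable points get a node
                cap[("row", i)][("cell", i, j)] = 1
                cap[("cell", i, j)][("col", j)] = 1
    count = max_flow(cap, "s", "t")
    # A cell is chosen when its unit outgoing arc is saturated.
    chosen = [(i, j) for i in range(n) for j in range(n)
              if grid[i][j] == 1 and cap[("cell", i, j)][("col", j)] == 0]
    return count, chosen


print(select_conduits([[1, 1, 0], [1, 1, 1], [0, 1, 1]], [1, 2, 1], [1, 1, 1]))
```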

LP Max: A catering company is to make lunch for a business meeting. It will serve ham sandwiches, light ham sandwiches, and vegetarian sandwiches. A ham sandwich has 1 serving of vegetables, 4 slices of ham, 1 slice of cheese, and 2 slices of bread. A light ham sandwich has 2 servings of vegetables, 2 slices of ham, 1 slice of cheese and 2 slices of bread. A vegetarian sandwich has 3 servings of vegetables, 2 slices of cheese, and 2 slices of bread. A total of 10 bags of ham are available, each of which has 40 slices; 18 loaves of bread are available, each with 14 slices; 200 servings of vegetables are available; and 15 bags of cheese, each with 60 slices, are available. Given the resources, how many of each sandwich can be produced if the goal is to maximize the number of sandwiches?

Solution: Let x, y, z be the number of ham, light ham and vegetarian sandwiches. Considering the constraints this gives us the following LP:
max x + y + z
s.t. 4x + 2y + 0z ≤ 400   (ham: 10 · 40 slices)
     2x + 2y + 2z ≤ 252   (bread: 18 · 14 slices)
     1x + 2y + 3z ≤ 200   (vegetables)
     1x + 1y + 2z ≤ 900   (cheese: 15 · 60 slices)
     x, y, z ≥ 0.

LP Min: A company is creating a meal replacement bar. They plan to incorporate peanut butter, oats, and dried cranberries as the primary ingredients. The nutritional content of 10 grams of each is listed below, along with the cost, in cents, of each ingredient. Find the amount of each ingredient they should use to minimize the cost of producing a bar containing a minimum of 15g of each ingredient, at least 10g of protein and at most 14g of fat.

                   Peanut Butter (10g)   Oats (10g)   Cranberries (10g)
Proteins (grams)   2.5                   1.7          0
Fat (grams)        5                     0.7          0.1
Cost (cents)       6                     1            2
Table 2: Ingredients, nutrients and costs

Solution: Let p, a, and c be the amounts of peanut butter, oats and cranberries that get used, measured in units of 10 grams.
min 6p + 1a + 2c
s.t. 2.5p + 1.7a + 0c ≥ 10   (at least 10g of protein)
     5p + 0.7a + 0.1c ≤ 14   (at most 14g of fat)
     p, a, c ≥ 1.5   (at least 15g of each ingredient)

DP: Use the following recurrence along with a dynamic programming implementation to efficiently compute the nth Catalan number.
C_n = 1 if n = 0
C_n = ∑_{i=1}^{n} C_{i−1} C_{n−i} otherwise

Given a set of positive coin denominations c_1, c_2, . . . , c_m and a value n as input, how many ways can you make change for n? For example, with coins being {1, 2, 3} the number of ways to get the value 4 is 4, as follows: {1, 1, 1, 1}, {1, 1, 2}, {2, 2}, {3, 1}. With coins being {2, 5, 3, 6} the number of ways to get the value 10 is 5, as follows: {2, 2, 2, 2, 2}, {2, 2, 3, 3}, {2, 2, 6}, {2, 3, 5}, {5, 5}.

Solution:
• Logic: Let us arbitrarily arrange the coins as c_1, c_2, . . . , c_m. Now for any number i, the number of ways to get to i using the coins from c_1 to c_j needs to count the ways that use no c_j coin and also the ways that use at least one c_j coin. These are disjoint events, and together they account for all possibilities. Thus, they become our subproblems. If c_j is not used, then we still need to make i but now we only use c_1 to c_{j−1}. On the other hand, if c_j is used at least once, then we only need to make i − c_j and we still have all of c_1 to c_j to use (because c_j may be used more than once).
• The Bellman Equation: Let OPT(i, j) denote the number of ways to make the value i using the coins from c_1 to c_j.
OPT(i, j) = 1 if i = 0
OPT(i, j) = 0 if i < 0, or if j = 0 and i > 0
OPT(i, j) = OPT(i, j − 1) + OPT(i − c_j, j) otherwise
• Iterative algorithm: The iterative algorithm builds up the OPT table from the bottom up. It first fills in the values for i = 0 and for j = 0. From there, it iterates over increasing values of i and, for each i, over the available coins j = 1, . . . , m, filling in OPT(i, j) from entries that have already been computed.
• Number of subproblems: The index i varies from 0 to n and the index j varies from 0 to m. So a total of O(nm) problems need to be solved.
• Time per subproblem: Each recursive call is just an addition. So O(1).
• Total time for memoization: O(nm).
• Final output: OPT(n, m).
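
The bottom-up table fill can be sketched in Python as follows; the function name and the table layout `opt[i][j]` are illustrative assumptions.

```python
def count_change(coins, n):
    """Number of multisets of the given coins that sum to n."""
    m = len(coins)
    # opt[i][j] = ways to make value i using only the first j coins.
    opt = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        opt[0][j] = 1                          # one way to make 0: use nothing
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            opt[i][j] = opt[i][j - 1]          # ways that use no coin c_j
            if i - coins[j - 1] >= 0:          # ways that use c_j at least once
                opt[i][j] += opt[i - coins[j - 1]][j]
    return opt[n][m]


print(count_change([1, 2, 3], 4), count_change([2, 5, 3, 6], 10))
```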

Given an array A[1...n] of letters, compute the length of the longest palindrome subsequence of A. A sequence P[1...m] is a palindrome if P[i] = P[m − i + 1] for every i. For example, RACECAR is a palindrome, while REAR is not a palindrome. And the longest palindrome subsequence in the string ABETEB is BETEB.

• English Description of problem: We create a new array where Pal[i, j] will contain the length of the longest palindrome subsequence of the subarray of A that starts at index i and ends at index j.
• English Description of logic: It's easy to see that if A[i] = A[j], for some i < j, we can include both letters in the palindrome subsequence and increase i by 1 and decrease j by 1. And if they're not equal, we either increase i or decrease j, but not both.
• Recurrence Relation (and Base Case):
Pal[i, j] = 0 if i > j
Pal[i, j] = 1 if i = j
Pal[i, j] = max(Pal[i + 1, j], Pal[i, j − 1]) if A[i] ≠ A[j]
Pal[i, j] = 2 + Pal[i + 1, j − 1] if A[i] = A[j]
• Iterative Algorithm: Now it's not too clear how we can memoize Pal[i, j] correctly (but it can be done carefully). The next solution is an alternative way we can look at the recursion that uses length as a parameter.
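
One careful fill order is by increasing subarray length, since Pal[i, j] only depends on strictly shorter subarrays. The following Python sketch uses 0-indexed positions; the function name is an illustrative assumption.

```python
def longest_palindrome_subsequence(a) -> int:
    """Length of the longest palindromic subsequence of the sequence a."""
    n = len(a)
    if n == 0:
        return 0
    # pal[i][j] = answer for a[i..j] inclusive; entries with i > j stay 0.
    pal = [[0] * n for _ in range(n)]
    for i in range(n):
        pal[i][i] = 1
    for length in range(2, n + 1):             # shorter subarrays first
        for i in range(0, n - length + 1):
            j = i + length - 1
            if a[i] == a[j]:
                inner = pal[i + 1][j - 1] if i + 1 <= j - 1 else 0
                pal[i][j] = 2 + inner
            else:
                pal[i][j] = max(pal[i + 1][j], pal[i][j - 1])
    return pal[0][n - 1]


print(longest_palindrome_subsequence("ABETEB"))
```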

1. Describe algorithms that will allow you to find and store the required metadata quickly given an n node BST T as input. 2. (∗) BST usage involves updating nodes; describe how each of these metadata values can be kept updated as elements are added to or removed from T.

Solution:
1. Algorithm: To find each of the required metadata values, use dynamic programming to memoize at the node itself, using the recursive definitions for the respective metadata as detailed below. Each recursion is based on either the parent node of the current node or the left and right children of the current node.
Recursion:
Depth of a Node: depth[node] = depth[parent] + 1
Height of a Node: height[node] = max(height[left], height[right]) + 1
Subtree Size of a Node: size[node] = size[left] + size[right] + 1
Rank of a Node: The rank of a node is computed through an in-order traversal. A counter keeps track of the current rank and is assigned to the current node. As the algorithm traverses the BST, the counter increments by 1 for each node.

2. Depth of a Node:
• If the node removed is above the current node, subtract 1 from the depth. If the node removed is below the current node, no change is needed.
• If the node added is above the current node, add 1 to the depth. If the node added is below the current node, no change is needed.
Height of a Node:
• If the node removed is above the current node, no change is needed. If the node removed is below the current node, recompute the height from the recurrence above; it drops by 1 only when the removed node was on every longest downward path from the current node.
• If the node added is above the current node, no change is required. If the node added is below the current node, recompute the height from the recurrence above; it grows by 1 only when the added node extends the longest downward path from the current node.
Subtree Size of a Node:
• If the node removed is above the current node, no change is needed. If the node removed is below the current node, subtract 1 from the size.
• If the node added is above the current node, no change is required. If the node added is below the current node, add 1 to the size.
Rank of a Node:
• If the node removed is to the left of the current node (has a smaller key), subtract 1 from the rank. If the node removed is to the right of the current node, no change is needed.
• If the node added is to the left of the current node, add 1 to the rank. If the node added is to the right of the current node, no change is needed.

Footnote 1 (BST): Every node's key is at least as large as the largest key in the left subtree and at most as large as the smallest key in the right subtree.
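
Part 1 can be sketched in Python with a single traversal that fills in all four values. The `Node` class, the function name, and the conventions chosen here (the root has depth 0, a leaf has height 0 so an empty subtree has height −1, and ranks start at 1) are assumptions made for illustration, since the text does not fix them.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.depth = self.height = self.size = self.rank = None


def annotate(root):
    """Store depth, height, subtree size and in-order rank on every node."""
    counter = 0

    def visit(node, depth):
        nonlocal counter
        if node is None:
            return -1, 0                      # height and size of empty subtree
        node.depth = depth                    # parent depth + 1, passed down
        left_h, left_s = visit(node.left, depth + 1)
        counter += 1                          # in-order position gives the rank
        node.rank = counter
        right_h, right_s = visit(node.right, depth + 1)
        node.height = max(left_h, right_h) + 1
        node.size = left_s + right_s + 1
        return node.height, node.size

    visit(root, 0)
    return root


root = annotate(Node(4, Node(2, Node(1), Node(3)), Node(6)))
print(root.depth, root.height, root.size, root.rank)
```

The traversal touches each node once, so all metadata is computed in O(n) time.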

You are helping the Department of Computer Science at Uskees University create a new flexible curriculum with a complex set of graduation requirements. The department offers n different courses, and there are m different requirements. Each requirement specifies a subset of the n courses and the number of courses that must be taken from that subset. The subsets for different requirements may overlap, but each course can be used to satisfy at most one requirement. For example, suppose there are n = 5 courses A, B, C, D, E and m = 2 graduation requirements:
• You must take at least 2 courses from the subset {A, B, C}.
• You must take at least 2 courses from the subset {C, D, E}.
A student who has only taken courses B, C, D cannot graduate, but a student who has taken either A, B, C, D or B, C, D, E can graduate. Given m requirements and the list of courses a student has taken, determine whether the student can graduate or not.

Solution: Create a source s and a sink t. Create a vertex for each course that the student has taken and for each of the m requirements that they need to satisfy to graduate. Add an edge from each requirement to the sink with capacity equal to the number of courses from the corresponding subset that the student must have taken for that requirement to be satisfied. Create an edge from the source to each course that has been taken, with capacity 1. For every course, create an edge with infinite capacity to each requirement which contains that course in its subset. Find the max flow in this flow network; if all edges from requirements to the sink are saturated, then the courses the student has taken are sufficient to satisfy all the requirements for graduation. Because the capacity from the source to each course is 1, each course can only count towards a single requirement.
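
The network above can be sketched in Python as follows. The input format (a list of `(subset, count)` pairs plus the set of taken courses), the function names, and the node labels are illustrative assumptions; Edmonds-Karp stands in for whichever max-flow routine is preferred.

```python
from collections import defaultdict, deque

INF = float("inf")


def max_flow(cap, s, t):
    """Edmonds-Karp; cap[u][v] holds residual capacities, updated in place."""
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        bottleneck, v = INF, t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        total += bottleneck


def can_graduate(requirements, taken):
    """requirements: list of (set of courses, number needed); taken: set."""
    cap = defaultdict(lambda: defaultdict(int))
    for course in taken:
        cap["s"][("course", course)] = 1       # a course counts at most once
    for r, (subset, needed) in enumerate(requirements):
        cap[("req", r)]["t"] = needed
        for course in subset:
            if course in taken:
                cap[("course", course)][("req", r)] = INF
    # All requirement-to-sink edges are saturated iff the flow meets the total.
    return max_flow(cap, "s", "t") == sum(k for _, k in requirements)


reqs = [({"A", "B", "C"}, 2), ({"C", "D", "E"}, 2)]
print(can_graduate(reqs, {"B", "C", "D"}), can_graduate(reqs, {"A", "B", "C", "D"}))
```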

Suppose you are given an arbitrary directed graph G = (V, E) with arbitrary edge weights w : E → R, and two special vertices s, t. Each edge in G is colored either red, black, or blue to indicate how you are permitted to modify its weight:
• You may increase, but not decrease, the length of any red edge.
• You may decrease, but not increase, the length of any blue edge.
• You may not change the length of any black edge.
Your task is to modify the edge weights, subject to the color constraints, so that every path from s to t has exactly the same length. Both the given weights and the new weights of the edges can be positive, negative, or zero. Assume every edge in G lies on at least one path from s to t, and that G has no isolated vertices (it is connected).

Solution: We will reduce this to a linear program. Let δ_e be a variable denoting the change that we will make to edge e. Further, let p_i denote the list of edges in path i from s to t. We add a constraint that the lengths of the paths (sums of weights) after making the changes δ_e must be equal for every pair of paths from s to t. The objective is a constant, since we only need a feasible solution.
max 1
s.t. ∑_{e∈p_i} (w(e) + δ_e) = ∑_{e′∈p_j} (w(e′) + δ_{e′})   ∀ paths p_i, p_j
     δ_e = 0   ∀e colored black
     δ_e ≥ 0   ∀e colored red
     δ_e ≤ 0   ∀e colored blue.

Find the maximum and minimum of an array of integers.

Solution: Use a similar construction as we did for the sum/average of an array of integers. However, the lower bounds will all be 0 for every edge except the t to s edge. That edge uses l = u = 1. This ensures that exactly one unit of flow has to go through the network, and since we are doing min cost circulation, this unit goes through the minimum price edge, whose price is the minimum integer value in the array. Using the reduction for max cost flow (using the negatives of the original prices) allows us to also find the maximum value in the array.

(∗) Given a directed graph G = (V, E), find the shortest path between every pair of vertices u and v.

Solution: We utilize the reduction for finding the shortest path from s to every other node and use it repeatedly, setting s to be each node of the graph in turn.

(∗) Given an undirected graph G = (V, E) with weights w : E → R≥0 on edges, find a minimum spanning tree T of G.

Solution: Approach 1: We will reduce this to a linear program. Let x_e be an indicator variable which is 1 whenever the edge e is a part of the MST and is 0 otherwise.
min ∑_{e∈E} w_e x_e   (Minimize weight of picked edges)
s.t. ∑_{e∈E} x_e = |V| − 1   (Fix the total number of edges picked)
     ∑_{e={u,v}∈E} x_e ≥ 1   ∀u ∈ V   (Ensure picking at least 1 edge adjacent to every vertex)
     x_e ∈ {0, 1}
One downside of the above approach is that we are using integer linear programming and not standard LP. So we also provide another reduction.

Major online portals like Google and Facebook have considerable information about individual users based on their past interactions. This allows them to post targeted advertisements to the users. Suppose a set U of n users, labeled 1 through n, visit the portal on a particular day. The portal has a set A of m ads, labeled 1 through m, to choose from. The analysis of the users has revealed k different groups (from a marketing standpoint), the ith group consisting of a subset S_i of users from U. A user may be part of several groups; i.e., a user may be an element of several different S_i's. The jth ad belongs to a subset G_j ⊆ {1, . . . , k} of the groups. Each ad j also has an integer rate criterion r_j based on what companies pay to have that ad shown. The portal needs to decide whether there exists a way of assigning advertisements to users such that the following conditions hold: (a) each user is shown exactly one ad; (b) ad j is shown to user i only if i ∈ S_k for some k ∈ G_j, that is, user i must be part of a group which contains ad j; (c) the number of times ad j is shown is exactly r_j, where r_j is a given integer. Give a polynomial-time algorithm that takes the above input (U, A, the sets S_i, the groups G_j, and the r_j's) and determines whether the portal can assign an ad to each user so that the above three conditions are satisfied, and if so, returns such an assignment. State the running time of your algorithm.

Solution: We can solve this by converting the question into a flow network and then applying the Ford-Fulkerson algorithm. First we take the source node and connect it to the m ad nodes, with capacity r_j for each respective ad j. Then connect the ads to the respective group nodes to which they can be shown, keeping the capacity of each such edge equal to the r_j value of the ad it leaves. Then connect the groups to the users contained in them with a capacity of 1 for each edge. Finally connect all the users to the sink with capacity 1. Now we just run Ford-Fulkerson on this flow network. An assignment is possible iff the max flow is ∑ r_j = n. Clearly, if the flow is < ∑ r_j then there is some ad which cannot be shown exactly r_j times. If ∑ r_j ≠ n then the constraint of each user seeing exactly 1 ad is violated. We only need to show that we can extract a valid assignment if the max flow is ∑ r_j = n. This is done by following the flow along the edges between ads and groups and between groups and users: each unit of flow travels from an ad through a group to a user, and we show that ad to that user. This assignment satisfies condition (b) by construction of the edges. Due to the max flow being ∑ r_j, this assignment also satisfies condition (c). Since ∑ r_j = n, condition (a) is also satisfied by the assignment. If we run Ford-Fulkerson on this, it will take time O(|E|f) where |E| is the number of edges and f is the max-flow. Note that in the constructed graph the number of nodes is O(n + m + k), the number of edges is O(mk + kn) (this happens when every ad can be shown to every group and every group contains every user), and the final max-flow is O(n). So the total running time will be O(n(mk + kn)).

Assigning TAs to grade questions in an exam is a perennial task for professors. It would be useful to have a general purpose method to do so given a general scenario as follows. There are n types of questions to grade, with t_i questions of type i, for 1 ≤ i ≤ n. There are m TAs, where TA j has the time to grade at most c_j questions, and is capable of only grading questions of types drawn from a subset S_j ⊆ {1, 2, . . . , n} of types. Describe an algorithm to determine if the professor can assign TAs to the questions so that all the questions are graded. If it is possible to do so, your algorithm should also return an assignment that indicates how many questions of each type are graded by each TA.

Solution: Construct the following graph, with the start vertex on the left and the end vertex on the right. The labels on the edges are the capacities. We run a max-flow algorithm and see if the flow is ∑_i t_i, which would imply that all questions are being graded. The incoming t_i capacity to each type node captures the number of questions of type i. The connections between the type nodes, subset nodes and TA nodes capture the restriction of which TA can grade which subset of questions. The c_j capacities out of the TA nodes capture the limit on the number of questions a TA can grade. Finally, the max-flow is compared to the total number of questions to ensure that all questions will get graded. Note that the flow values also give us the assignments, by interpreting the flow from type nodes to subset/TA nodes as the number of questions of that type assigned to that TA.

There are n filled water tanks in county A and 3m empty water tanks in county B. The i-th water tank in county A currently holds a_i gallons of water. The j-th empty water tank in county B has a capacity of b_j gallons of water. To fix, repair and upgrade the tanks in county A, the city has decided to remove all the water from those tanks and transport it to county B. The city has contracted out the water transportation to a private company, which has evaluated costs and determined that for each water tank i in county A it can only transport water to a certain subset of the water tanks in county B. You know which i → j transportation is possible. Now, in the coming month, the tanks in county B will be emptied to fill 3 new artificial pools that have been constructed. These three pools have total water capacities of p_1, p_2, p_3 respectively. Further,
• Pool 1 can be filled only by tanks 1, 2, . . . , m in county B
• Pool 2 can be filled only by tanks m + 1, m + 2, . . . , 2m in county B
• Pool 3 can be filled only by tanks 2m + 1, 2m + 2, . . . , 3m in county B
Any water that cannot be transported to tanks in county B and, in the next month, any water that cannot be poured into a pool must be discarded, so you have to design an algorithm to find the maximum total amount of water that can be filled into all the pools at the end.

Solution: Max-flow is a good framework for modeling transport of commodities (in this case water). We will create the following flow graph and run a max-flow algorithm to determine the maximum amount of water that can be transported given the constraints:
(a) A source node s, n nodes (one for each tank in county A), 3m nodes for county B, three nodes for the artificial pools, and one sink node t.
(b) We add an edge with capacity a_i from s to node i for county A.
(c) From every node i, we add an edge with capacity ∞ (a capacity of a_i would also work) to the node for tank j of county B if the contracted company can transport from i to j.
(d) From each county B node j ∈ {1, 2, . . . , m} we add an edge to the first pool node with capacity b_j; similarly, edges from county B nodes j ∈ {m + 1, . . . , 2m} to the second pool node with capacities b_j, and from county B nodes j ∈ {2m + 1, . . . , 3m} to the third pool node with capacities b_j.
(e) Finally, from pool node k ∈ {1, 2, 3}, we add an edge to the sink t with capacity p_k.
We then compute the maximum flow from s to t, and that will be the amount that can be moved from tanks in county A to the pools via tanks in county B without discarding. The rest of the water cannot be transported given the limitations on the transport capacities and hence must be discarded.

Flow: Having just joined the Blue Bikes Business, your manager tells you that you are responsible for ensuring that the bike supply satisfies the demand at each location. If not, you need to figure out whether the bikes can be transported around so that popular stations have bikes supplied to them from less popular ones. If the current infrastructure is insufficient, then you need to recommend buying more transport vehicles, which is costly. Thankfully, your colleagues have surveyed bike stations and can give you all the necessary data. At some given moment, let d(v) denote the number of additional bikes needed at station v. This can be less than 0, in which case that station can give up that many bikes to a different station. For some pairs of stations u, v, there is a van available to transport c(u, v) bikes from u to v. This transport procedure and the surveyed values for d(v) are valid for 2 hour periods, and you have such values for every two hour period starting at 8 in the morning until 6 in the evening. Give an algorithm to determine whether the vans can transport bikes around to satisfy the demand of every station at each 2 hour mark.

Solution: The problem for each two hour mark is the same, so we can solve the underlying abstract problem and then change the input for each 2 hour window. Since there are demands at each station, which can be positive or negative, and there are transports with an upper bound capacity, this suggests a circulation problem formulation. We will create a node for each bike station, and the d(v) for the station becomes the demand at the corresponding node. For any pair of stations u, v that have a van available, we will create the (u, v) edge and use c(u, v) to denote the capacity on that edge. This gives us our circulation graph. To reduce the circulation graph to max-flow, we will add a source (s) and a sink (t). Then, any node that has an excess of bikes (d(v) < 0) will have an edge of capacity |d(v)| from s to v. Similarly, any node that has a demand for bikes (d(v) > 0) will have an edge of capacity d(v) from v to t. This converts the circulation problem into a max-flow problem with a source and sink. To solve the circulation problem we solve max-flow and check whether the max-flow value is equal to the sum of all capacities going into the sink, i.e. the total demand (when total supply equals total demand, this is also the sum of all capacities coming out of the source).
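A minimal Python sketch of this reduction for a single 2 hour window is given below. The function names, the dictionary-based input format and the Edmonds-Karp routine are illustrative assumptions, not part of the original solution; the check compares the max-flow against the total demand.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts of residual capacities (updated in place)."""
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(cap[u][v] for u, v in path)   # bottleneck on the path
        for u, v in path:
            cap[u][v] -= b
            cap[v][u] += b
        total += b

def demands_satisfiable(d, vans):
    """d: station -> demand d(v) (negative means spare bikes);
    vans: (u, v) -> number of bikes c(u, v) a van can move from u to v."""
    SRC, SNK = ("source",), ("sink",)       # tuples avoid clashing with station names
    cap = defaultdict(lambda: defaultdict(int))
    for (u, v), c in vans.items():
        cap[u][v] += c
    total_demand = 0
    for v, dv in d.items():
        if dv < 0:
            cap[SRC][v] = -dv               # excess station: s -> v with capacity |d(v)|
        elif dv > 0:
            cap[v][SNK] = dv                # demand station: v -> t with capacity d(v)
            total_demand += dv
    return max_flow(cap, SRC, SNK) == total_demand
```

To cover the whole day, the same function would be called once per 2 hour window with that window's `d` and `vans`.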

LP: A factory manufactures three products, A, B, and C. Each product requires a certain number of hours of manufacturing, assembly and finishing time, shown below, along with the total time available and profit. Find production levels to maximize profit.

                 A      B      C      Total Hours
Manufacturing    4      5      2      300
Assembly         7      4      1      250
Polishing        4      5      1      200
Profit           175    130    40
Table 3: Manufacturing Products

Solution: Let A, B, C be the production levels for the products A, B, C.
max   175A + 130B + 40C
s.t.  4A + 5B + 2C ≤ 300
      7A + 4B + 1C ≤ 250
      4A + 5B + 1C ≤ 200
      A, B, C ≥ 0

Breadth First Search (BFS) and Dijkstra's shortest path algorithms are great and well understood. Often, a problem seems to require modifying these algorithms, but modifying the problem is often better. None of the following problems require any change to BFS or Dijkstra's algorithms.
1. You are given a directed graph G = (V, E), where each edge has a weight w(u, v), which is one of k positive integers from the set W = {w_1, w_2, . . . , w_k}. You want to find the shortest path between a given pair of vertices s and t. Can you solve this using BFS?
2. You are given a directed graph G = (V, E), where each edge has a weight w(u, v) > 1. You want to find the shortest path between a given pair of vertices s and t, but now the weight of a path is defined as the product of all the edges in the path.
3. (∗) You are given a directed graph G = (V, E), where each edge has a positive integer weight w(u, v). You want to find the shortest path between a given pair of vertices s and t, but if there are multiple shortest paths, you want to return the one with the fewest number of edges.

Solution:
1. Replace every edge (u, v) in G with a path of length w(u, v) using dummy vertices and edges. Since none of the edges have any weights, we can run BFS to find the shortest path. The running time of this will be O(|V′| + |E′|), where V′, E′ are the vertices and edges of the new graph. Since we replace every edge (u, v) with w(u, v) edges and w(u, v) − 1 vertices, we can write the runtime as:
$$O\Bigl(\sum_{(u,v)\in E} \bigl(w(u,v) + w(u,v) - 1\bigr)\Bigr) = O\Bigl(\sum_{(u,v)\in E} w(u,v)\Bigr) = O\bigl(|E| \cdot \max W\bigr)$$
We thought we could be smart and run a simpler algorithm, but if the numbers in W are large, then this algorithm may run much slower than Dijkstra's shortest path algorithm.
2. Replace all weights w(u, v) with w′(u, v) = log w(u, v) and run Dijkstra's algorithm. Since w(u, v) > 1, the new weights are positive, and since the sum of logarithms is the logarithm of the product, this translates our problem into the classical setting.
3. Replace all weights w(u, v) with |E| · w(u, v) + 1. We will show that for any two paths p and p′, if p was originally shorter, then it remains shorter in the new graph, and if they were equal, then the one which has fewer edges will be shorter. Let |p| and |p′| be the number of edges in p and p′ respectively. We can simplify the difference of the path weights in the new graph as follows:
$$\Bigl(\sum_{e'\in p'} |E|\,w_{e'} + |p'|\Bigr) - \Bigl(\sum_{e\in p} |E|\,w_e + |p|\Bigr) = |E|\Bigl(\sum_{e'\in p'} w_{e'} - \sum_{e\in p} w_e\Bigr) + |p'| - |p|$$
If in the original graph the two paths had the same weight, then the first term is 0 and in the new graph the one which has fewer edges will have a smaller weight. On the other hand, if the sum of weights of p was smaller than that of p′, then the difference in the sums is at least 1, so the first term is at least |E|, and the second term cannot be smaller than 1 − |E| (a simple path has at least 1 and at most |E| edges). Thus the difference is at least 1, and even in the new graph the path p will have a smaller weight than p′.
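The reweighting in part 3 can be illustrated with an unmodified Dijkstra run on the transformed weights. The sketch below assumes vertices are numbered 0..n−1 and edges are given as (u, v, w) triples; the function name and input format are hypothetical.

```python
import heapq

def shortest_path_fewest_edges(n, edges, s, t):
    """Return a minimum-weight s-t path that, among all such paths,
    uses the fewest edges. edges: list of (u, v, w), w a positive integer."""
    m = len(edges)
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, m * w + 1))       # transformed weight |E| * w + 1
    dist = [None] * n
    parent = [None] * n
    dist[s] = 0
    heap = [(0, s)]
    while heap:                              # standard Dijkstra, unchanged
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                         # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if dist[v] is None or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    if dist[t] is None:
        return None                          # t is unreachable from s
    path = [t]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return path[::-1]
```

In the first test graph below, both 0 → 1 → 2 → 3 and 0 → 3 have original weight 3, and the transformed weights (15 versus 13) break the tie in favour of the single edge.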

Graphs General:
1. Prove that every connected undirected graph which is not itself a tree must have at least three different spanning trees (two spanning trees are different if they use different sets of edges).
Solution: Since the graph is connected and is not a tree, there is at least one cycle. Since it is an undirected (simple) graph, every cycle has at least three edges. Take any spanning tree T and an edge e that is not in T (one exists because the graph is not a tree). Then T ∪ {e} contains exactly one cycle, and removing the edges of that cycle one by one gives a spanning tree each time. Removing different edges gives different edge sets, so there are at least three different spanning trees.
2. Consider edges that must be in every spanning tree of a graph. Must every graph have such an edge? Give an example of a graph that has exactly one such edge.
Solution: No, a graph that is just a cycle does not have any edge which will be in all spanning trees. For the example, take two graphs that have no such edge (for instance two cycles) and connect them with a single edge. Then any spanning tree must have this edge, and no other edge is forced.
3. For any cut C of the graph, if the weight of an edge e in the cut-set of C is strictly smaller than the weights of all other edges of the cut-set of C, then this edge belongs to all MSTs of the graph.
Solution: Assume that there is an MST T that does not contain e. Adding e to T will produce a cycle that crosses the cut once at e and crosses back at another edge e′. Deleting e′ we get a spanning tree (T \ {e′}) ∪ {e} of strictly smaller weight than T. This contradicts the assumption that T was an MST.
4. For any cycle C in the graph G, if the weight of an edge e of C is larger than the individual weights of all other edges of C, then the edge e cannot belong to any MST.
Solution: Assume the contrary, i.e. that e belongs to an MST T1. Then deleting e will break T1 into two subtrees with the two ends of e in different subtrees. The remainder of C reconnects the subtrees, hence there is an edge f of C with ends in different subtrees, i.e., it reconnects the subtrees into a tree T2 with weight less than that of T1, because the weight of f is less than the weight of e. This contradicts the assumption that T1 was an MST.

Given an m × n binary matrix, where each 1 represents that a knight who only speaks the truth is standing there and each 0 represents that a knave who only speaks falsehoods is standing there, find the size of the largest square sub-matrix that does not have any liars in it. In the following example, the biggest square sub-matrix without liars (without 0s) is of size 3 (using the middle rows in the rightmost columns).

0 0 1 0 1 1
0 1 1 1 0 0
0 0 1 1 1 1
1 1 0 1 1 1
1 1 1 1 1 1
1 1 0 1 1 1
1 0 1 1 1 1
1 1 1 0 1 1

Solution:
• English description of the recursion and logic: We use a dynamic programming approach where the dp(i, j) element of the DP table stores the size of the largest square sub-matrix with all 1's in the array A that has element A(i, j) as its bottom-right element. We initialize the first column and the first row of the dp table with the same values as those in the array A, since these elements do not have any top or left neighbors.

0 0 1 0 1 1
0 X X X X X
0 X X X X X
1 X X X X X
1 X X X X X
1 X X X X X
1 X X X X X
1 X X X X X

For the rest of the entries we check if the top, the left and the top-left diagonal elements can successfully combine with the element A(i, j) (if it is a 1) to form a larger square of 1's. The size of the new larger square will be the minimum of the top, left and top-left diagonal entries plus 1. The final output value can be found by finding the maximum value in the dp array.
• Recurrence relation:
$$dp(i,j) = \begin{cases} A(i,j) & \text{if } i = 0 \lor j = 0 \\ \min\bigl(dp(i-1,j),\ dp(i,j-1),\ dp(i-1,j-1)\bigr) + 1 & \text{if } A(i,j) > 0 \land i, j > 0 \\ 0 & \text{otherwise} \end{cases}$$
• Final output: max(dp(i, j))
• Iterative algorithm:

    def largest_square(A):
        rows = len(A)
        cols = len(A[0])
        dp = [[0] * cols for _ in range(rows)]
        max_side = 0
        for i in range(rows):
            for j in range(cols):
                if i == 0 or j == 0:
                    # first row or column: no top or left neighbors
                    dp[i][j] = A[i][j]
                elif A[i][j] == 1:
                    dp[i][j] = min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) + 1
                # track the best side length over every cell
                max_side = max(max_side, dp[i][j])
        return max_side

• Number of subproblems: There are m × n subproblems, where m is the number of rows in the matrix and n is the number of columns. This is because for each element in the matrix, we calculate the largest square that has that element as its bottom-right corner.
• Time per subproblem: Each subproblem takes constant time O(1) since it relies only on the values of its neighbors, which are already computed.
• Final runtime: Since there are m × n subproblems and each takes O(1) time, the final runtime is O(m × n).

Suppose Alice and Bob are given an array A[1...n] of integers. They are playing a game where they alternate turns, and at each turn a player chooses one of the two integers at the ends of the array. After they choose that number, the number gets deleted from the array and it's the next person's turn. The winner of the game is the one who has the larger sum from the numbers they have chosen. Give a dynamic programming solution for Alice to maximize the sum of the integers she chooses, assuming that Bob plays optimally. For example, suppose we have the array A = [8, 7, 8, 9]. Alice can choose 9 in the first turn. Then we have the array [8, 7, 8] and Bob can choose either 8. Then Alice has a choice between picking 7 and 8, so she chooses 8, which leaves Bob with 7. Alternatively, if Alice had initially chosen 8, then Bob would have chosen the 9, Alice would choose the remaining 8 and Bob would be left with 7. Thus, for this specific configuration, with optimal play Alice opens with 9 and finishes with 17 against Bob's 15, whereas opening with 8 would only have given a 16 to 16 tie.

Solution:
1. We will keep track of the maximum possible total Alice can get depending on whether she picks the left end or the right end of the array. To account for Bob's optimal play, we will also keep track of the minimum possible total in the subsequent turn. Let the table be T[i, j, who], which gives the best total Alice can still collect when it is who's turn and the numbers remaining are A[i . . . j], both ends inclusive. When who is Alice, we add the number she takes and pick the maximum of the two possible moves in this turn. When who is Bob, we pick the minimum, because Bob is trying to minimize Alice's total (the overall sum is fixed, so this is the same as Bob maximizing his own total). Due to the alternation of max and min, the values in the table account for the alternating of turns by Alice and Bob. Thus the values in our memoization table will always correspond to the best Alice can achieve given the state of the game after some turns have finished and the remaining numbers in the array A are between indices i and j.
2. Recurrence:
$$T[i,j,\text{who}] = \begin{cases} A[i] & \text{if } i = j,\ \text{who} = \text{Alice} \\ 0 & \text{if } i = j,\ \text{who} = \text{Bob} \\ \max\bigl(A[i] + T[i+1,j,\text{Bob}],\ A[j] + T[i,j-1,\text{Bob}]\bigr) & \text{if } i < j,\ \text{who} = \text{Alice} \\ \min\bigl(T[i+1,j,\text{Alice}],\ T[i,j-1,\text{Alice}]\bigr) & \text{if } i < j,\ \text{who} = \text{Bob} \end{cases}$$
3. We return T[1, n, Alice].
4. The iterative algorithm would first fill in the values where i = j using the base case, or alternatively, j − i = 0. In the next iteration, it would fill in all values where j − i = 1, since the table entries needed to fill those had been filled in the previous iteration. The iteration would continue with increasing values of j − i.
5. The indices can each take n values and for each pair of indices we need the solution for either Bob or Alice, so we solve O(n^2) subproblems.
6. Each subproblem requires a constant number of table lookups followed by a max or min operation. So the time per subproblem would be O(1).
7. Final running time: O(n^2).
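The table-filling order from step 4 can be sketched in Python as below. This is an illustrative version under the assumption that a table entry stores Alice's total from the remaining subarray (so her chosen number is added on her turns and nothing is added on Bob's); the function and table names are hypothetical and the code is 0-indexed.

```python
def best_total_for_alice(A):
    """Maximum sum Alice can guarantee when she moves first and Bob plays optimally."""
    n = len(A)
    if n == 0:
        return 0
    # alice[i][j]: Alice's best total from A[i..j] when it is Alice's turn.
    # bob[i][j]:   Alice's total from A[i..j] when it is Bob's turn.
    alice = [[0] * n for _ in range(n)]
    bob = [[0] * n for _ in range(n)]
    for i in range(n):
        alice[i][i] = A[i]      # Alice takes the last number
        bob[i][i] = 0           # Bob takes the last number, Alice gets nothing
    for length in range(2, n + 1):          # increasing values of j - i
        for i in range(n - length + 1):
            j = i + length - 1
            alice[i][j] = max(A[i] + bob[i + 1][j], A[j] + bob[i][j - 1])
            bob[i][j] = min(alice[i + 1][j], alice[i][j - 1])
    return alice[0][n - 1]
```

On the array [8, 7, 8, 9] from the problem statement this evaluates to 17.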

TRUE/FALSE PF:
1. n^1.05 = O(n (lg n)^3). False
2. For any two functions f(n), g(n), we always have that either f(n) = O(g(n)) or g(n) = O(f(n)). False (for example, take f(n) = n for even n and 1 for odd n, and g(n) = 1 for even n and n for odd n; neither is O of the other)
3. In a directed graph G = (V, E) with capacities on edges and a specified source s and sink t, the set of nodes in the minimum s-t cut is always unique. False
4. In a connected weighted graph with distinct positive weights, the edge with maximum weight is never in the minimum spanning tree. False
5. Let P∗ be the bounded, optimal solution value of a minimization LP, the primal, and D∗ be the bounded, optimal solution value of the corresponding dual. Then P∗ ≥ D∗. True
6. In the LP formulation of Max-Flow, the objective function ∑_{u:(s,u)∈E} f(s, u) can be replaced with ∑_{v:(v,t)∈E} f(v, t) without affecting the output of the LP. Here f(u, v) denotes the final flow from u to v. True

DP: Describe the iterative approach (clearly, pseudocode preferred) that will be used to calculate the value of x_n for a given n ≥ 0 based on the following recurrence. Your approach should be based on a dynamic programming based memoization of the necessary subproblems required to compute the value of x_n. You should only use a constant amount of extra space, and the memoization approach must only require looking at each entry in the memoized DP table once to compute x_n for any value of n.
$$x_0 = x_1 = 1; \qquad x_n = \frac{1}{n}\bigl((6n-9)\,x_{n-1} - (n-3)\,x_{n-2}\bigr)$$
Solution: We use two variables, one for the immediate predecessor and another for the predecessor of the predecessor. Let these be p1 and p2 (predecessors one and two generations ago respectively). They are initialized with 1 since x_0 = x_1 = 1. To compute any value x_i we plug p1 in place of x_{i−1} and p2 in place of x_{i−2} in the recurrence equation. Once the new x_i has been computed, we update our p1 and p2 values such that p1 becomes the new p2 and the newly computed value x_i becomes the new p1. As soon as we compute x_n we are done and we can return. By only keeping track of p1 and p2 and updating them appropriately whenever a new value is calculated, we are ensuring that we only need a constant amount of space.
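The two-variable approach can be written as a short Python sketch. The function name `x_n` is hypothetical, and exact rational arithmetic via `fractions.Fraction` is an assumption made here so that the division by i is never rounded.

```python
from fractions import Fraction

def x_n(n):
    """Compute x_n from x_0 = x_1 = 1 and
    x_i = ((6i - 9) * x_{i-1} - (i - 3) * x_{i-2}) / i, using O(1) extra space."""
    p1 = Fraction(1)    # x_{i-1}
    p2 = Fraction(1)    # x_{i-2}
    for i in range(2, n + 1):
        xi = ((6 * i - 9) * p1 - (i - 3) * p2) / i
        p2, p1 = p1, xi  # old p1 becomes p2, the new value becomes p1
    return p1
```

Each earlier value is read exactly once before being overwritten, which matches the requirement of looking at each memoized entry only once.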

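Represent the Minimum Cost Circulation as a linear program.

Solution:
$$\begin{aligned} \min\quad & \sum_{e \in E} p(e)\, f(e) \\ \text{s.t.}\quad & \ell(e) \le f(e) \le u(e) && \forall e \in E \\ & \sum_{e=(u,v)\in E} f(e) - \sum_{e'=(v,w)\in E} f(e') = d(v) && \forall v \in V \\ & f(e) \ge 0 \end{aligned}$$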

Given a directed graph G = (V, E), find the shortest path from s to every other node.

Solution: We'll reduce to minimum cost circulations. The graph stays the same, s becomes the source, and the lower bound and upper bound on each edge will be 0 and ∞ respectively. For every other node, we will in turn consider it to be the sink. The price for each edge will be 1 if the original graph was unweighted, and the price will be the weight of the edge if the original graph was weighted. Finally, we create an edge from the sink to the source s with lower and upper bound equal to 1. Using this as the input to min cost circulations would provide us with a flow through the graph with the following properties:
• Since the lower and upper bounds on the sink to source edge are 1, there must be a flow of value 1 on a path from s to the sink.
• Since we are solving min cost circulations, the total cost of the flow from s to the sink must have been minimized, that is, the path with non-zero flow must be the shortest path, irrespective of whether the original graph was weighted or not.

Find the sum and average of an array of integers.

Solution: Create a source s and a sink t, create a node for every integer in the array, and connect s to these nodes and these nodes to t. Both the lower and upper bound for each edge will be 1. The price for the edge from s to each integer node will be equal to the value of that integer, and the price for the remaining edges will be 0. Finally, we create an edge from t to s with lower and upper bound equal to n, the size of the array. Passing this to min cost circulations would result in a total flow equal to n. Moreover, the total cost of this flow will precisely equal the sum of all the integers in the array. Dividing this by the number of integers will then give us the average of the array as well.

DnC: (a) Given a set A = {a_1, a_2, . . . , a_n} of n natural numbers and a target value x ∈ N, come up with an algorithm that finds a pair of numbers a_i, a_j ∈ A such that a_i + a_j = x, or outputs that there is no such pair. Your algorithm should run in time O(n log n). Prove the correctness and running time of your algorithm. [Hint: Use some divide and conquer algorithms you already know as building blocks.]
(b) (∗) Modify your algorithm to work in the case where you are given a set of tuples B = {(a_1, b_1), (a_2, b_2), . . . , (a_n, b_n)} of cardinality n and a target tuple (x, y), and are asked to find a pair of tuples (a_i, b_i), (a_j, b_j) ∈ B such that a_i + a_j = x and |b_i − b_j| = y. Your algorithm should still run in time O(n log n). Prove correctness and running time of your modified algorithm.

Solution:
(a) The idea is to first sort the set of numbers and then, for each number a in the sorted array, perform a binary search for x − a. The initial sorting requires Θ(n log n) time and each of the n binary searches requires O(log n) time. Thus the total time is Θ(n log n) + O(n log n) = Θ(n log n).
(b) We follow a similar idea, but first note that this requires the elements in the set to be unique. First sort the tuples according to the first coordinate/number in the tuple. Perform a binary search similar to the previous subpart. However, if we find a tuple in which the condition for the first coordinate is satisfied, we additionally need to perform a check to ensure that the second condition is also satisfied. If both conditions are satisfied, we output that pair of tuples. Otherwise we reject that tuple and continue performing binary searches. If at the end we don't find a pair that satisfies the requirement, we output that there is no such pair. This gives us the same complexity as before.
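Part (a) can be sketched in Python with the standard library's `bisect` module. The function name is hypothetical, and the sketch assumes the two numbers must come from two different positions of the input, which the problem statement does not spell out.

```python
from bisect import bisect_left

def find_pair(A, x):
    """Return (a_i, a_j) with a_i + a_j == x using two distinct positions, or None."""
    S = sorted(A)                               # Theta(n log n)
    for i, a in enumerate(S):
        # Binary search for x - a strictly to the right of position i,
        # so an element is never paired with itself.
        j = bisect_left(S, x - a, i + 1)
        if j < len(S) and S[j] == x - a:
            return (a, S[j])
    return None
```

Restricting each search to the right of position i loses no pairs, because every valid pair is found when the loop reaches its smaller element.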

You are given an array A which contains n values from [a, a + d, a + 2d, . . . , a + nd], where a is the starting value of an Arithmetic Progression (AP) and d is the common difference between two consecutive terms in the AP. Some intermediate value a + id is missing for some i ∈ {1, . . . , n − 1} (only the first and last entries are guaranteed to be present). Given this array A, describe an O(log n) time algorithm to determine which value is missing.

Solution: The value of a is the first element of the array. To find d, we look at the first three elements and check their common difference. If the differences match, then that's d. If they don't, then we instead use the last three elements. One of these must give us d since at most one element is missing. Once we know a and d, we can compute what the value should be at index i, using the formula a + id (assuming 0 indexing). We perform binary search on A and check if the value at the given index is equal to the value it should be. If it is, then it means that none of the values to the left are missing and we recurse on the right half. Otherwise, some value is missing in the left and we recurse on the left half. When our algorithm ends, we would have an index i and we will be able to check the two neighbors to determine which two consecutive elements have a difference larger than d. This will tell us which element is missing. Since we perform binary search, the running time will be O(log n). Let T(n) be the time taken for this recursive algorithm on an array of size n. In this recurrence, we always eliminate half the array at each step using a constant amount of work, so T(n) = T(n/2) + O(1). Further, the base case takes O(1) since in the end we can simply look at a constant number of neighbors and get the answer. Solving this recurrence we get T(n) = O(log n).
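A minimal iterative Python sketch of this binary search follows. One deliberate substitution: instead of comparing the first and last three elements, it computes d from the two endpoints, using the fact that the n stored values span n steps of size d; this is an assumption of the sketch, not the method described above. The function name is hypothetical, and d ≠ 0 and n ≥ 2 are assumed.

```python
def find_missing_term(A):
    """A holds n of the n + 1 terms a, a + d, ..., a + n*d (first and last present).
    Return the missing term in O(log n) time."""
    n = len(A)
    a = A[0]
    d = (A[-1] - A[0]) // n      # exact, since A[-1] - A[0] == n * d
    # Invariant: A[lo] == a + lo*d (nothing missing up to lo)
    #            A[hi] == a + (hi + 1)*d (the gap lies before hi)
    lo, hi = 0, n - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if A[mid] == a + mid * d:
            lo = mid             # left part is intact, search the right
        else:
            hi = mid             # gap is at or before mid, search the left
    return A[lo] + d             # the gap sits between positions lo and hi
```

The loop halves the range hi − lo each time with constant work per step, which mirrors the recurrence T(n) = T(n/2) + O(1).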

Given a sorted array A of n integers, determine if A has a majority element. A majority element is one that occurs more than ⌊n/2⌋ times. Can you provide an algorithm that runs in O(log n) time instead of linear time?

Solution:
• We perform a single pass through the input array A and keep track of a candidate majority element and a count. These are initialized to A[0] and 1 respectively. We keep incrementing the count while the next element A[i] equals the candidate, and whenever we see a new element we make it the candidate and reset the count to 1. Since the array is sorted, if there is a majority element then we will see the count become greater than ⌊n/2⌋ and the current element can be returned. If we never see the count go beyond that, then there is no majority element. This is a linear time algorithm since we perform a single pass through the array.
• Note that any majority element, if it exists, must take up the middle index of A since the array is sorted. If not, then all copies of it must be in one half of the array. However, there aren't enough available slots in either half to become a majority element (there cannot be more than ⌊n/2⌋ copies of that element). So we look at the middle index element and call it x. Then we find the first occurrence of x in the left half of A in O(log n) time using binary search. Let the index returned be i. If index i + ⌊n/2⌋ is inside the array and A[i + ⌊n/2⌋] = x, then we know that x is a majority (it occurs enough times). If A[i + ⌊n/2⌋] ≠ x, or the index falls outside the array, then there is no majority element. This works in O(log n) since we only perform one binary search and a couple of constant time operations.
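The O(log n) check can be sketched with `bisect_left`, which returns the first position of a value in a sorted list. The function name and the choice to return `None` when no majority exists are illustrative assumptions; the code is 0-indexed.

```python
from bisect import bisect_left

def majority_element(A):
    """Return the element occurring more than floor(n/2) times in sorted A, or None."""
    n = len(A)
    if n == 0:
        return None
    x = A[n // 2]              # any majority element must occupy the middle index
    i = bisect_left(A, x)      # first occurrence of x, found by binary search
    j = i + n // 2             # x is a majority iff A[i..j] are all x
    if j < n and A[j] == x:
        return x
    return None
```

Because A is sorted, A[i] = A[j] = x forces every entry between them to be x, giving ⌊n/2⌋ + 1 copies.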

Recursion: Determine if the following algorithm can correctly sort an array of length n:
1. Recursively sort the first ⌈2n/3⌉ elements.
2. Recursively sort the last ⌈2n/3⌉ elements.
3. Recursively sort the first ⌈2n/3⌉ elements.
As a base case it considers an array of length 1 to be sorted.
• Write the recurrence for the above algorithm and solve it.
• (∗) If instead of using ⌈2n/3⌉ length subproblems the algorithm used ⌈kn/(k+1)⌉ length subproblems, discuss what changes and determine which value of k produces the asymptotically fastest algorithm. You should write the recurrence for the general k form of the algorithm, solve the recurrence and then discuss trade-offs as k changes.

Solution:
• Correctness: We'll use induction, and we already know that the base case of n = 1 works. Suppose that the algorithm works correctly for all lengths < n, so the three recursive calls each return their elements in sorted order. We only need to show that the particular subproblems chosen will result in an overall sorted array. The first call will ensure that the largest ⌈n/3⌉ elements that were in A[1 . . . ⌈2n/3⌉] end up in the second half of A[1 . . . ⌈2n/3⌉]. So after the first call, none of the elements left in the first ⌈n/3⌉ positions belong in the final ⌈n/3⌉ positions of A, and every element that does belong there lies in the latter ⌈2n/3⌉ part of A. Thus, the second recursive call works on the appropriate elements to finalize the final ⌈n/3⌉ elements of A. After the second recursive call the smallest ⌈2n/3⌉ elements are all in the first ⌈2n/3⌉ indices of A. The final recursive call will then sort these correctly to result in an overall sorted array.
• Recurrence: $T(n) = 3\,T(2n/3) + O(1)$, which solves to $T(n) = \Theta\bigl(n^{\log_{3/2} 3}\bigr) = \Theta\bigl(n^{\lg 3/(\lg 3 - 1)}\bigr)$.
• If the fraction of the array recursed on is ⌈kn/(k+1)⌉, then the recurrence becomes $T'(n) = 3\,T'\bigl(kn/(k+1)\bigr) + O(1)$, which solves to $T'(n) = \Theta\bigl(n^{\lg 3/\lg\frac{k+1}{k}}\bigr)$. It is easy to see that the larger k is, the smaller (k+1)/k will be and hence the smaller lg((k+1)/k) will be. Since we are dividing by this term, the smaller it is, the larger the overall degree of the polynomial run-time. So in terms of running time the smallest possible k is best. Setting k = 1 gives us n^{lg 3}, which is the quickest runtime we can expect given that we are recursing three times in this specific manner. Another way of looking at it is that since we are always recursing three times, if we can recurse on as small an input as possible, our runtime is going to be as fast as possible, which would again suggest setting k = 1. Note, however, that the correctness argument above needs the first and last parts to overlap in at least as many positions as the part left out, i.e. (k − 1)n/(k + 1) ≥ n/(k + 1), so k ≥ 2. With k = 1 the two halves do not overlap and the procedure no longer sorts, so the trade-off is that k = 2 is the fastest choice that still sorts correctly.
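For experimentation, the three-call procedure with ⌈2n/3⌉-length subproblems can be written as the Python sketch below. The function name is hypothetical, and the explicit length-2 base case (a single compare-and-swap) is an assumption added here, because ⌈2·2/3⌉ = 2 would otherwise make the recursion on two elements call itself forever.

```python
def three_call_sort(A, lo=0, hi=None):
    """Sort A[lo:hi] in place by sorting the first ceil(2n/3) elements,
    then the last ceil(2n/3), then the first ceil(2n/3) again."""
    if hi is None:
        hi = len(A)
    n = hi - lo
    if n <= 1:
        return                               # base case from the problem statement
    if n == 2:
        # Added base case (assumption): ceil(4/3) = 2 would not shrink the input.
        if A[lo] > A[lo + 1]:
            A[lo], A[lo + 1] = A[lo + 1], A[lo]
        return
    k = (2 * n + 2) // 3                     # ceil(2n/3), strictly less than n for n >= 3
    three_call_sort(A, lo, lo + k)           # 1. first ceil(2n/3) elements
    three_call_sort(A, hi - k, hi)           # 2. last ceil(2n/3) elements
    three_call_sort(A, lo, lo + k)           # 3. first ceil(2n/3) elements again
```

Calling `three_call_sort(data)` on a list sorts it in place; timing it on growing inputs shows the steep polynomial growth predicted by the recurrence.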

You are given k arrays, each containing n elements in sorted order. You need to merge all the n · k elements into a single sorted array. Suppose that comparison is a unit time operation.
a. (Direct k-way merge) Suppose at each step you compare the first element of each of the k arrays and insert the minimum of these into the output array. What is the time complexity of this procedure?
b. (Sorting) What would be the time complexity if you concatenated all the elements into the output array and then sorted the whole output array using an optimal algorithm like mergesort?
c. (Divide-and-Conquer) What if you merge the arrays in groups of 2, doing the standard merge operation on each pair of arrays? After each stage, you would halve the number of arrays and double the number of elements in any given array. Continue this grouping and merging until you are left with a single array of size n · k. What would be the time complexity of this procedure?
d. (Using a MinHeap) Suppose you have access to a MinHeap of size k. A MinHeap is a data structure that can be queried to return its minimum element in time O(log k), and you can add a new element to it in O(log k) time. Suppose you do the direct k-way merge but use a MinHeap to find the minimum, as described in Algorithm 1. What is the complexity of this procedure?

Algorithm 1 K-Way Merge using Min Heap
    input: A_1, A_2, . . . , A_k sorted arrays each of size n
    output: A single sorted array containing all n · k elements.
    Create an output array B of size n · k
    Create a min-heap H of size k
    for 1 ≤ i ≤ k
        insert A_i[0] into H
    for i ∈ {1, 2, . . . , n · k}
        m ← Extract-Min(H); let m be from A_j
        B[i] = m
        Insert the next element from A_j into the heap, using ∞ if A_j is empty

Solution:
a. (Direct k-way merge) Finding the minimum of k elements in each step takes k − 1 comparisons, and since we are putting in nk elements one at a time there are nk steps, hence the complexity is O(k^2 n).
b. (Sorting) Concatenating all arrays, element by element, takes nk time. Sorting it takes O(nk log(nk)) time. Hence, the total complexity is O(nk log(nk)).
c. (Divide-and-Conquer) In the first stage, merging 2 arrays takes time 2n, and after we are done there are k/2 arrays. Total time taken = 2n · k/2 = nk. In stage two, we'll spend 4n time merging each of the k/4 pairs for a total of nk time. This continues for log_2(k) stages, after which 1 array will be left. In the last stage, merging 2 arrays of size nk/2 each will take time nk. Hence, each stage takes time nk. So, the total time taken is O(nk log_2(k)).
d. (Using a MinHeap) According to the algorithm, every element in the output array was extracted from the min-heap. Each of the nk elements took log(k) time to be inserted into the min-heap. Extracting it also takes log(k) time, and putting it into the output array took constant time. So, the total time taken must be O(nk log(k)).
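Part d can be tried out with Python's `heapq` module. The sketch below follows the idea of Algorithm 1 but, as a simplification, lets the heap shrink when an array runs out instead of inserting ∞; the function name is hypothetical.

```python
import heapq

def k_way_merge(arrays):
    """Merge k sorted lists into one sorted list using a min-heap of size at most k."""
    # Each heap entry is (value, array index, position in that array).
    heap = [(arr[0], i, 0) for i, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)
    out = []
    while heap:
        value, i, pos = heapq.heappop(heap)      # O(log k) extract-min
        out.append(value)
        if pos + 1 < len(arrays[i]):
            # Insert the next element from the same array, O(log k).
            heapq.heappush(heap, (arrays[i][pos + 1], i, pos + 1))
    return out
```

Every element is pushed and popped exactly once, which gives the O(nk log k) bound derived above.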
