Cmpe 224 Exams

Uploaded by

Arda Baran

COS 226 Algorithms and Data Structures Fall 2005

Final

This test has 11 questions worth a total of 83 points. You have 150 minutes. The exam is closed
book, except that you are allowed to use a one page cheatsheet. No calculators or other electronic
devices are permitted. Give your answers and show your work in the space provided. Write out
and sign the Honor Code pledge before turning in the test.

“I pledge my honor that I have not violated the Honor Code during this examination.”

Problem Score Problem Score Name:


1 7
2 8 Login ID:
3 9
4 10
5 11
6
Sub 1 Sub 2

Total

PRINCETON UNIVERSITY

1. Analysis of algorithms. (10 points)


A Fibonacci heap is a priority queue that supports the following operations in the given
amortized running time. Here, the elements are 1 through N and the data in the keys can
only be accessed via pairwise comparisons.

Operation            Description                                            Amortized

Create(N)            create an empty heap of capacity N                     O(N)
Insert(i, k)         insert element i, and assign it key k                  O(1)
DecreaseKey(i, k)    decrease the key associated with element i to k        O(1)
DeleteMin()          delete and return the element with the smallest key    O(log N)

(a) Define what the amortized running times mean in this context.

(b) Suppose you implement Dijkstra’s algorithm using a Fibonacci heap as the underlying
priority queue. What is the resulting worst-case asymptotic running time of Dijkstra’s
algorithm as a function of the number of vertices V and the number of edges E?

(c) Explain briefly why no priority queue can implement Insert, DecreaseKey, and
DeleteMin in O(1) time each.

2. String searching. (6 points)


The following DFA purports to accept precisely those strings (over the two letter alphabet)
that contain bbabbb. Assume state 0 is the start state and state 6 is the accept
state.

(a) Are there any strings containing bbabbb that it rejects?


If so, give the shortest such string and circle it.

(b) Are there any strings not containing bbabbb that it accepts?
If so, give the shortest such string and circle it.

(c) Describe how to fix the DFA so that it works as intended.



3. Pattern matching. (6 points)


Draw an NFA that recognizes the same language that the regular expression a(bc)*d | e*
describes. Use the notation and construction given in lecture. Circle your final answer.

4. Convex hull. (6 points)


Run the Graham scan algorithm to compute the convex hull of the 9 points below, using I as
the base point, and continuing counterclockwise starting at H.

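[Figure: the 9 points plotted on a grid with x from 0 to 14 and y from 0 to 8.
By row: H at y = 8; C and G at y = 7; E at y = 6; F at y = 4; B at y = 3;
A and D at y = 2; I at y = 1. The x-coordinates are not shown.]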

(a) List the points in the order that they are considered for insertion into the convex hull.

(b) Give the points that appear on the trial hull (after each of the 8 remaining points is
considered) in the order that they appear.

1.

2.

3.

4.

5.

6.

7.

8.

5. Geometry. (12 points)


Given a set of N intervals on the x-axis of the form (a_i, b_i), design an O(N log N) sweep-line
algorithm to find a value x that is contained within the maximum number of intervals. You
may assume that no two endpoints have the same value.

(a) What are the events?

(b) How do you implement the sweep line?

(c) What data structure stores the set of intervals that intersect the sweep line?

(d) How does your sweep-line algorithm work, i.e., how do you process each event?
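
One possible approach to the problem above can be sketched as follows. This is a minimal sketch, not a reference solution: the class name MaxOverlap and the method bestPoint are hypothetical, the intervals are assumed to be open with a[i] < b[i], and all endpoints are assumed distinct as the problem allows. The events are the 2N endpoints in sorted order, and a single counter stands in for the set of intervals that currently intersect the sweep line.

```java
import java.util.Arrays;

final class MaxOverlap {
    // Returns a value x contained in the maximum number of open intervals (a[i], b[i]).
    // Assumes a[i] < b[i] and that all 2N endpoints are distinct.
    // Returns NaN when there are no intervals.
    static double bestPoint(double[] a, double[] b) {
        int n = a.length;
        // events[k] = {coordinate, +1 for a left endpoint or -1 for a right endpoint}
        double[][] events = new double[2 * n][];
        for (int i = 0; i < n; i++) {
            events[2 * i] = new double[] { a[i], +1 };
            events[2 * i + 1] = new double[] { b[i], -1 };
        }
        // Sorting dominates the running time: O(N log N).
        Arrays.sort(events, (p, q) -> Double.compare(p[0], q[0]));

        int active = 0;            // number of intervals crossing the sweep line
        int best = 0;
        double bestX = Double.NaN;
        for (int k = 0; k < events.length; k++) {
            active += (int) events[k][1];
            if (active > best) {
                best = active;
                // A new maximum can only occur at a left endpoint, so a later event
                // exists; any point strictly between the two events works.
                bestX = (events[k][0] + events[k + 1][0]) / 2;
            }
        }
        return bestX;
    }
}
```

For example, calling MaxOverlap.bestPoint on the intervals (1, 5), (2, 6), (4, 8), (7, 9) returns 4.5, a point covered by three intervals.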

6. Digraphs and DFS. (6 points)


Consider the following DAG. Run depth first search, starting at vertex A. Assume the adjacency
lists are in lexicographic order, e.g., when exploring vertex A, use A-B before A-D or
A-E.
[Figure: the DAG is not reproduced.]

(a) List the vertices in preorder.

(b) List the vertices in postorder.

(c) List the vertices in topological order.



7. Undirected graphs and BFS. (10 points)


The girth of an undirected graph is the length of the shortest cycle.
[Figure: a girth 5 graph.]

(a) Describe a polynomial time algorithm to find the girth of a graph.

(b) What is its asymptotic running time as a function of the number of vertices V and the
number of edges E?
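
One way to approach the girth problem is sketched below, under stated assumptions: the graph is simple (no self-loops or parallel edges) and is given as adjacency lists, and the names Girth and girth are hypothetical. The sketch runs breadth-first search from every vertex; whenever BFS meets an already-visited vertex that is not the parent, the two tree paths plus that edge form a closed walk of length dist[v] + dist[w] + 1, and the smallest such value over all sources is the girth. With adjacency lists this takes time proportional to V(V + E).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;

final class Girth {
    // adj.get(v) lists the neighbors of v in a simple undirected graph.
    // Returns the length of a shortest cycle, or Integer.MAX_VALUE if the graph is acyclic.
    static int girth(List<List<Integer>> adj) {
        int n = adj.size();
        int best = Integer.MAX_VALUE;
        int[] dist = new int[n];
        int[] parent = new int[n];
        for (int s = 0; s < n; s++) {
            // Fresh BFS from source s.
            Arrays.fill(dist, -1);
            dist[s] = 0;
            parent[s] = -1;
            ArrayDeque<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty()) {
                int v = queue.remove();
                for (int w : adj.get(v)) {
                    if (dist[w] == -1) {
                        // Tree edge: first time w is reached.
                        dist[w] = dist[v] + 1;
                        parent[w] = v;
                        queue.add(w);
                    } else if (w != parent[v]) {
                        // Non-tree edge: closes a walk through s that contains a cycle.
                        best = Math.min(best, dist[v] + dist[w] + 1);
                    }
                }
            }
        }
        return best;
    }
}
```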

8. Minimum spanning tree. (8 points)


[Figure: the weighted graph is not reproduced.]

(a) Consider the weighted graph above. Give the list of edges in the MST in the order that
Kruskal’s algorithm inserts them. For reference, the 18 edge weights in ascending order
are:

4 9 12 18 19 21 22 23 24 25 30 33 34 35 36 39 42 65

(b) Consider the weighted graph above. Give the list of edges in the MST in the order that
Prim’s algorithm inserts them. Start Prim’s algorithm from vertex A.

9. Data compression. (6 points)


Suppose you compress the following string using LZW compression over the two letter
alphabet.

a a b a a b a b a a a b a

(a) List the strings in the LZW dictionary in the order they are inserted. (Assume the
dictionary is initialized to begin with the two strings a and b.)

(b) Draw the binary trie representation of the resulting LZW dictionary.
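
The dictionary-building step of LZW can be sketched as below. This is an illustrative sketch only: the class LzwDictionary and the method insertedStrings are hypothetical names, the dictionary is assumed to have no size limit, and every input character is assumed to belong to the initial alphabet. At each position the code finds the longest dictionary string that is a prefix of the remaining input, then adds that string extended by the next input character.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class LzwDictionary {
    // Returns the strings added to the LZW dictionary, in insertion order, when
    // compressing input starting from the single-character strings in alphabet.
    static List<String> insertedStrings(String input, String alphabet) {
        Set<String> dict = new LinkedHashSet<>();
        for (char c : alphabet.toCharArray()) dict.add(String.valueOf(c));

        List<String> added = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int j = i + 1;   // input[i..j) is the current match; one character always matches
            // Extend the match while the longer prefix is still in the dictionary.
            while (j < input.length() && dict.contains(input.substring(i, j + 1))) j++;
            if (j < input.length()) {
                // Longest match plus the lookahead character becomes a new entry.
                String next = input.substring(i, j + 1);
                dict.add(next);
                added.add(next);
            }
            i = j;   // continue after the matched prefix
        }
        return added;
    }
}
```

A call such as LzwDictionary.insertedStrings("aabaababaaaba", "ab") returns the inserted strings for the input above, which can be used to check a hand trace.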

10. Linear programming. (5 points)


Convert the following linear program to standard form, i.e., a maximization problem with
equality constraints and nonnegative variables. (Do not solve.)

minimize 26A + 30B + 20C


subject to: A + B + C = 100
2A + 6B + 3C ≥ 145
7A + 1B + C ≥ 85
5A + 1B + 6C ≤ 95
A , B , C ≥ 0
11. Reductions. (8 points)
Consider the following two decision problems.

Ham-Path. Given a digraph, is there a path that visits each vertex exactly once?
Shortest-Simple-Path. Given a digraph with integer edge weights (positive or negative),
two distinguished vertices s and t, and an integer L, is there a simple path from s to t
of length at most L? (A simple path is a path that visits each vertex at most once.)

Show that Ham-Path polynomial reduces to Shortest-Simple-Path. To demonstrate your
reduction, draw the instance of Shortest-Simple-Path associated with the following
instance of Ham-Path. Be sure to include the edge weights and the value L. Assuming it is
correct (and obvious how it extends to arbitrary digraphs), you need not describe it further.
[Figure: the Ham-Path instance is not reproduced.]

COS 226 Algorithms and Data Structures Fall 2006

Final

This test has 12 questions worth a total of 83 points. You have 180 minutes. The exam is closed
book, except that you are allowed to use a one page cheatsheet (8.5-by-11, in your own handwriting).
No calculators or other electronic devices are permitted. Give your answers and show your work
in the space provided. Write out and sign the Honor Code pledge before turning in the
test.

“I pledge my honor that I have not violated the Honor Code during this examination.”

Problem Score Problem Score Name:


1 7
2 8 Login ID:
3 9
4 10
5 11
6 12
Sub 1 Sub 2

Total


1. Analysis of algorithms. (8 points)

(a) Describe precisely and succinctly what the following statement means.
Do not use big Oh notation in your answer.

Heapsort has a worst-case running time of O(N log N ).

(b) State precisely and succinctly the lower bound for sorting that we proved in class.

(c) Suppose you have an algorithm that takes about 1 hour to solve a problem of size N .
You buy a new computer that is twice as fast and has twice as much memory. About
how long will it take you to solve a problem of size 2N ? Assume the running time of
your program is 8N^2 + 13N^(4/3) + 12N + 3 nanoseconds. Circle the best answer.

1/2 hour 1 hour 2 hours 4 hours 16 hours

(d) Repeat the previous question, but assume the running time of your program is 2N ln N .

1/2 hour 1 hour 2 hours 4 hours 16 hours



2. Algorithm analogies. (10 points)


Complete each of the following analogies with the best answer.

(a) tractable : Euler path :: intractable :

(b) MSD radix sort : R-way trie :: 3-way radix quicksort :

(c) unweighted graph : BFS :: weighted graph :

(d) sorting : pairwise comparison :: convex hull :

(e) symbol table : BST :: priority queue :

3. String searching. (6 points)


Complete the following DFA to match precisely those strings (over the two letter alphabet)
that contain bababb as a substring. State 0 is the start state and state 6 is the accept state.

4. Convex hull. (6 points)


Run the Graham scan algorithm to compute the convex hull of the 10 points below, using J
as the base point, and continuing counterclockwise starting at G.

(a) List the points in the order that they are considered for insertion into the convex hull.

J G H

(b) Give the points that appear on the trial hull (after each of the remaining iterations).

1. J -> G -> H

2.

3.

4.

5.

6.

7.

8.

(c) Define what it means for a set of points in the plane to be convex.

5. BFS and DFS. (6 points)


Consider the following directed graph.

(a) Run depth-first search, starting at vertex A. Assume the adjacency lists are in
lexicographic order, e.g., when exploring vertex E, consider E-C before E-F, E-G or E-I.

i. List the vertices in preorder.

ii. List the vertices in postorder.

(b) Run breadth-first search, starting at vertex A. Assume the adjacency lists are in
lexicographic order. List the vertices in the order in which they are enqueued.

6. Algorithm throwdown. (10 points)


For each of the following pairs, briefly describe one reason why you’d use one instead of the
other. A familiar example is given below.

Mergesort Quicksort
stability in-place

(a)

Red-black tree Ternary search trie

(b)

Dijkstra’s algorithm Bellman-Ford-Moore

(c)

Burrows-Wheeler LZW compression

(d)

Red-black tree Hash table

(e)

Breadth-first search Depth-first search



7. Minimum spanning tree. (6 points)


Consider the following weighted graph.

(a) Give the list of edges in the MST in the order that Kruskal’s algorithm inserts them.
For reference, the edge weights in ascending order are:

12 13 14 16 18 19 21 23 24 25 30 33 34 36 37 39 42 65

(b) Give the list of edges in the MST in the order that Prim’s algorithm inserts them. Start
Prim’s algorithm from vertex A.

8. Data compression and tries. (6 points)


Suppose you apply the LZW algorithm to the following string (using the DNA alphabet).

a a c a a t a a c t

(a) List the strings in the LZW dictionary in the order they are inserted.
Assume the dictionary is initialized to begin with the four strings a, c, g and t.

(b) Complete the ternary search trie representation of the resulting LZW dictionary.

9. Linear programming. (5 points)


Convert the following linear program to standard form, i.e., a maximization problem with
equality constraints and nonnegative variables. (Do not solve.)

minimize 26A + 30B + 20C


subject to: A + B + 2C = 200
3A + 6B + 3C ≤ 45
9A + 2B + 4C ≥ 85
| 5A + 9B + 6C | ≤ 95
A , B , C ≥ 0

Here, | · | denotes the absolute value function.



10. Reductions. (6 points)


Consider the following two problems.

ElementDistinctness. Given N real numbers, are any two of them equal?

ClosestPair. Given N points in the plane, find a pair that is closest in Euclidean distance.

Show that ElementDistinctness linear reduces to ClosestPair. To demonstrate your
reduction, give the instance of ClosestPair associated with the following instance of
ElementDistinctness and describe how you could solve the element distinctness problem given
the solution to the corresponding closest pair problem.

23.0 3.14 2.72 1.41 3.14 5.32



11. Sorting and hashing. (8 points)


Your answers will be graded on correctness, clarity, and succinctness.

(a) Describe an algorithm for ElementDistinctness that runs in O(N log N ) time in the
worst-case and uses O(1) extra memory. Assume the N real numbers are stored in an
array.

(b) Describe an algorithm for ElementDistinctness that runs in O(N ) time on average.
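
Both parts can be illustrated with a short sketch. The names below (ElementDistinctness, hasDuplicateBySorting, hasDuplicateByHashing) are hypothetical, and the code is one possible approach rather than the expected answer. Part (a) is sketched with an in-place heapsort followed by a scan of adjacent entries; part (b) uses a hash set and relies on the usual uniform hashing assumption for its average-case bound. NaN and signed-zero subtleties of floating-point equality are ignored.

```java
import java.util.HashSet;
import java.util.Set;

final class ElementDistinctness {
    // (a) In-place heapsort, then compare neighbors:
    // O(N log N) time in the worst case and O(1) extra memory.
    // Note: this reorders the input array.
    static boolean hasDuplicateBySorting(double[] a) {
        int n = a.length;
        for (int k = n / 2 - 1; k >= 0; k--) sink(a, k, n);   // build a max-heap
        for (int end = n - 1; end > 0; end--) {                // move the max to the end
            swap(a, 0, end);
            sink(a, 0, end);
        }
        for (int i = 1; i < n; i++)
            if (a[i] == a[i - 1]) return true;                 // equal keys are adjacent
        return false;
    }

    // Restores heap order below index k in the heap a[0..n).
    private static void sink(double[] a, int k, int n) {
        while (2 * k + 1 < n) {
            int child = 2 * k + 1;
            if (child + 1 < n && a[child + 1] > a[child]) child++;
            if (a[k] >= a[child]) break;
            swap(a, k, child);
            k = child;
        }
    }

    private static void swap(double[] a, int i, int j) {
        double t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // (b) Hashing: expected O(N) time, O(N) extra memory.
    static boolean hasDuplicateByHashing(double[] a) {
        Set<Double> seen = new HashSet<>();
        for (double x : a)
            if (!seen.add(x)) return true;   // add returns false if x was already present
        return false;
    }
}
```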
12. Shortest path with landmark. (6 points)
Given a directed graph G with positive edge weights and a landmark vertex x, your goal is
to find the length of the shortest path from one vertex v to another vertex w that passes
through the landmark x.
(For example, Federal Express packages are routed through x = Atlanta.)

(a) Describe an O(E log V) algorithm for the problem. Justify briefly why your proposed
algorithm is correct.

(b) Now suppose that you will perform many such shortest path queries for the same
landmark x, but different values of v and w. Describe how to build a data structure in
O(E log V) time so that, given the data structure, you can process each query in
constant time.
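
A sketch of one possible data structure for the landmark problem follows. The class name LandmarkPaths, its constructor signature, and the edge-array encoding are all assumptions made for illustration. The idea is to run Dijkstra's algorithm from x twice, once on G and once on the reverse of G, and to store the two distance arrays; a query then adds two array entries. The sketch assumes that a path through x may revisit vertices, and it uses a lazy priority queue, so the preprocessing time is proportional to E log E, which is also E log V when there are no parallel edges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

final class LandmarkPaths {
    private final double[] toX;     // toX[v]   = shortest distance from v to the landmark
    private final double[] fromX;   // fromX[w] = shortest distance from the landmark to w

    // edges[i] = {from, to, weight}; all weights must be positive.
    LandmarkPaths(int V, double[][] edges, int x) {
        List<List<double[]>> adj = new ArrayList<>(), rev = new ArrayList<>();
        for (int v = 0; v < V; v++) {
            adj.add(new ArrayList<>());
            rev.add(new ArrayList<>());
        }
        for (double[] e : edges) {
            adj.get((int) e[0]).add(new double[] { e[1], e[2] });   // edge in G
            rev.get((int) e[1]).add(new double[] { e[0], e[2] });   // same edge in the reverse of G
        }
        fromX = dijkstra(adj, x);
        toX = dijkstra(rev, x);
    }

    // Length of the shortest v -> w path through x (infinity if none); constant time.
    double distance(int v, int w) {
        return toX[v] + fromX[w];
    }

    // Lazy Dijkstra: stale queue entries are skipped when removed.
    private static double[] dijkstra(List<List<double[]>> adj, int s) {
        double[] dist = new double[adj.size()];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        dist[s] = 0.0;
        PriorityQueue<double[]> pq = new PriorityQueue<>((p, q) -> Double.compare(p[1], q[1]));
        pq.add(new double[] { s, 0.0 });
        while (!pq.isEmpty()) {
            double[] top = pq.remove();
            int v = (int) top[0];
            if (top[1] > dist[v]) continue;   // a shorter path to v was already processed
            for (double[] e : adj.get(v)) {
                int w = (int) e[0];
                if (dist[v] + e[1] < dist[w]) {
                    dist[w] = dist[v] + e[1];
                    pq.add(new double[] { w, dist[w] });
                }
            }
        }
        return dist;
    }
}
```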

COS 226 Algorithms and Data Structures Fall 2012

Final Exam

This test has 16 questions worth a total of 100 points. You have 180 minutes. The exam is closed
book, except that you are allowed to use a one page cheatsheet (8.5-by-11, both sides, in your own
handwriting). No calculators or other electronic devices are permitted. Give your answers and
show your work in the space provided. Write out and sign the Honor Code pledge before
turning in the test.

“I pledge my honor that I have not violated the Honor Code during this examination.”

Problem Score Problem Score Name:


0 8
1 9 Login:
2 10
3 11 Room:
4 12
5 13 Precept: P01 F 11 Maia Ginsburg
6 14 P02 F 12:30 Diego Perez Botero
7 15 P03 F 1:30 Diego Perez Botero
Sub 1 Sub 2 P03B F 1:30 Dushyant Arora
P04 Th 2:30 Maia Ginsburg
Total P04A Th 2:30 Dan Larkin

Jan 22: 653e 10f3 8823


0. Initialization. (1 point)
Write your name and Princeton NetID in the space provided on the front of the exam; circle
your precept number; and write and sign the honor code.

1. Analysis of algorithms. (8 points)

(a) Suppose that you observe the following running times for a program with an input of
size N .

N time
5,000 0.2 seconds
10,000 1.2 seconds
20,000 3.9 seconds
40,000 16.0 seconds
80,000 63.9 seconds

Estimate the running time of the program (in seconds) on an input of size N =200,000.
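
The extrapolation can be sketched with the doubling hypothesis, assuming the running time follows a power law T(N) ~ a N^b. The class DoublingEstimate and its method estimate are hypothetical names; the sketch fits the exponent b from the last two measurements only, which is one reasonable choice among several.

```java
final class DoublingEstimate {
    // Fits b in T(N) ~ a * N^b from the last two (size, time) measurements,
    // then extrapolates the running time to targetSize.
    static double estimate(double[] sizes, double[] times, double targetSize) {
        int k = sizes.length - 1;
        // b = log(T2 / T1) / log(N2 / N1)
        double b = Math.log(times[k] / times[k - 1]) / Math.log(sizes[k] / sizes[k - 1]);
        // T(target) = T(last) * (target / last)^b
        return times[k] * Math.pow(targetSize / sizes[k], b);
    }
}
```

With the table above, the last two rows give a ratio close to 4 for a doubling of N, so the fitted exponent is close to 2.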

(b) How many bytes of memory does a KMP object consume as a function of the length of the
pattern M and the size of the alphabet R? Use tilde notation to simplify your answer.

public class KMP {
    private int[][] dfa;
    private char[] pat;

    public KMP(String pattern, int R) {
        int M = pattern.length();
        dfa = new int[R][M];
        pat = new char[M];
        ...
    }
    ...
}

2. Graphs. (5 points)
Consider the following Java class. Assume that digraph G has no parallel edges.

public class Mystery {
    private boolean[] marked;

    public Mystery(Digraph G, int s) {
        marked = new boolean[G.V()];
        mystery(G, s);
    }

    private void mystery(Digraph G, int v) {
        marked[v] = true;
        for (int w : G.adj(v))
            if (!marked[w]) mystery(G, w);
    }

    public boolean marked(int v) {
        return marked[v];
    }
}

(a) Describe in one sentence what the method marked(v) returns for vertex v after calling
the constructor with a digraph G and a vertex s.

(b) Suppose that a Digraph is represented using the adjacency-lists representation.
What is the order of growth of the running time of the constructor in the worst case?

1 V E E+V V^2 EV E^2

(c) Suppose that a Digraph is represented using the adjacency-lists representation.
What is the order of growth of the running time of the constructor in the best case?

1 V E E+V V^2 EV E^2

(d) Suppose that a Digraph is represented using the adjacency-matrix representation.
What is the order of growth of the running time of the constructor in the worst case?

1 V E E+V V^2 EV E^2

3. Graph search. (6 points)


Consider the following digraph. Assume the adjacency lists are in sorted order: for example,
when iterating through the edges pointing from 2, consider the edge 2 → 7 before 2 → 8.

0 1 2 3 4

5 6 7 8 9

Run depth-first search on the digraph, starting from vertex 0.

(a) List the vertices in reverse postorder.

0
___ ___ ___ ___ ___ ___ ___ ___ ___ ___

(b) List the vertices in preorder.

0
___ ___ ___ ___ ___ ___ ___ ___ ___ ___

4. Minimum spanning trees. (8 points)


Suppose that a MST of the following edge-weighted graph contains the edges with weights x,
y, and z.

A 130 B z C 20 D y E

10 80 70
110 0 140 120 90 60 50

F x G 40 H 30 I 10 J

(a) List the weights of the other edges in the MST in ascending order of weight.

10
____ ____ ____ ____ ____ ____

(b) Circle which one or more of the following can be the value of x?

5 15 25 35 45 55 65 75 85 95 105 115 125 135 145

(c) Circle which one or more of the following can be the value of y?

5 15 25 35 45 55 65 75 85 95 105 115 125 135 145

(d) Circle which one or more of the following can be the value of z?

5 15 25 35 45 55 65 75 85 95 105 115 125 135 145



5. Shortest paths. (8 points)


Suppose that you are running Dijkstra’s algorithm on the edge-weighted digraph below,
starting from vertex 0.

1 15 4
weight

2 5 4 x
1

0 15 2 6 5 y 7

23 22
7 29 18

3 12 6

The table below gives the edgeTo[] and distTo[] values immediately after vertex 4 has been
deleted from the priority queue and relaxed.

v distTo[] edgeTo[]

0 0.0 null

1 2.0 0→1

2 13.0 5→2

3 23.0 0→3

4 11.0 5→4

5 7.0 1→5

6 36.0 5→6

7 19.0 4→7

(a) Give the order in which the first 4 vertices were deleted from the priority queue and
relaxed.

(b) What are all possible values of the weight of the edge x?

(c) What are all possible values of the weight of the edge y?

(d) Which is the next vertex to be deleted from the priority queue and relaxed?

(e) In the table below, fill in those entries (and only those entries) in the edgeTo[] and
distTo[] arrays that change (from the corresponding entries on the facing page) when
the next vertex is deleted from the priority queue and relaxed.

v distTo[] edgeTo[]

7
6. Maximum flow. (8 points)
Consider the following flow network and feasible flow f from the source vertex A to the
sink vertex J.

source flow capacity

A 9/9 B 4 / 10 C 7/7 D 5/6 E

9 11
7/7 / 5/5 4 3/8 5 4/6 / 5/5
14 / / 11
0 5

F 7 / 10 G 21 / 21 H 13 / 20 I 9 / 15 J

sink

(a) What is the value of the flow f ?

[Annotations: augmenting path A-G-B-C-H-I-J; min cut { A, B, C, F, G }; max flow value = 28]

(b) Starting from the flow f given above, perform one iteration of the Ford-Fulkerson
algorithm. List the sequence of vertices on the augmenting path.

(c) What is the value of the maximum flow?

(d) List the vertices on the source side of the minimum cut in alphabetical order.

(e) What is the capacity of the minimum cut?



7. String sorting algorithms. (7 points)


The column on the left is the original input of strings to be sorted; the column on the right
are the strings in sorted order; the other columns are the contents at some intermediate step
during one of the algorithms listed below. Match up each algorithm by writing its number
under the corresponding column. You may use a number more than once.

KISS ABBA ENYA ABBA ENYA ACDC SOAD SADE ABBA


ENYA ACDC INXS ACDC ABBA ABBA WHAM CAKE ACDC
INXS AQUA DIDO AQUA AQUA AQUA ABBA CARS AQUA
STYX BECK CARS BECK ACDC BUSH MOBY JAYZ BECK
SOAD BLUR ACDC BLUR SOAD BLUR BECK ABBA BLUR
ACDC BUSH FUEL BUSH CAKE BECK ACDC ACDC BUSH
KORN CAKE BUSH CAKE MUSE CAKE SADE BECK CAKE
FUEL CARS ABBA CARS HOLE CARS DIDO WHAM CARS
BUSH DIDO AQUA DIDO SADE DIDO FUEL DIDO DIDO
ABBA ENYA CAKE ENYA BUSH ENYA CAKE KISS ENYA
WHAM FUEL BLUR FUEL RUSH FUEL HOLE BLUR FUEL
CAKE HOLE JAYZ HOLE BECK HOLE TSOL INXS HOLE
BLUR INXS BECK INXS FUEL INXS KORN ENYA INXS
MUSE JAYZ HOLE JAYZ TSOL JAYZ CARS SOAD JAYZ
BECK KISS KORN KISS WHAM KISS MUSE MOBY KISS
MOBY KORN KISS KORN KORN KORN BUSH HOLE KORN
HOLE MUSE TSOL TSOL DIDO MUSE RUSH KORN MOBY
TSOL MOBY MOBY MOBY BLUR MOBY KISS AQUA MUSE
JAYZ RUSH MUSE MUSE KISS RUSH AQUA TSOL RUSH
AQUA STYX SADE SADE INXS STYX BLUR STYX SADE
SADE SOAD WHAM WHAM CARS SOAD INXS FUEL SOAD
CARS SADE SOAD SOAD STYX SADE ENYA MUSE STYX
DIDO TSOL RUSH RUSH MOBY TSOL STYX BUSH TSOL
RUSH WHAM STYX STYX JAYZ WHAM JAYZ RUSH WHAM
---- ---- ---- ---- ---- ---- ---- ---- ----
0 1

(0) Original input (2) LSD radix sort

(1) Sorted (3) MSD radix sort

(4) 3-way string quicksort (no shuffle)



8. Ternary search tries. (6 points)


Consider the following ternary search trie, where the values are shown next to the nodes of
the corresponding string keys.

7 A A T 3

G G C

5 A A 13 C 8 A A T 9

4 A A 11 17 T G

T 12

(a) Circle which one or more of the following strings are keys in the TST?

A AGA CA CAA CACA CAT CGA

CGCA TA TC TCA TGT TT TTT

(b) Insert the two strings CGTT and TGA into the TST with the associated values 0 and 99,
respectively; update the figure above to reflect the changes.

9. Knuth-Morris-Pratt substring search. (5 points)


Below is a partially-completed Knuth-Morris-Pratt DFA for a string s of length 12 over the
alphabet { A, B, C }. Reconstruct the string s in the space below. (You need not fill in the
first three rows of the table, but they may be used to award partial credit.)

0 1 2 3 4 5 6 7 8 9 10 11

A 1 11 12

B 0 5

C 3 3 0 0

s A A

10. Boyer-Moore substring search. (5 points)


Suppose that you run the Boyer-Moore algorithm (the basic version considered in the textbook
and lecture) to search for the pattern

I D O F T H E

in the text
M E N D E R O F R O A D S W I T H T H E A I D O F T H E

Give the trace of the algorithm in the grid below, circling the characters in the pattern that
get compared with the text.

M E N D E R O F R O A D S W I T H T H E A I D O F T H E
I D O F T H E

11. Regular expressions. (6 points)


Suppose that we run the RE-to-NFA construction algorithm from the lecture and textbook
on the regular expression ( B | ( C D * A ) * ). The match transitions are shown below.

0 1 2 3 4 5 6 7 8 9 10 11

( B | ( C D * A ) * )

Circle which one or more of the following edges are in the ε-transition digraph.

0→2 0→3 0→4 0→8

2→8 2→9 2 → 10 2 → 11

3→4 3→6 3→8 3→9

5→6 5→7 6→5 6→7

8 → 10 9→2 9→3 9→8



12. Huffman codes. (5 points)

(a) Draw the Huffman trie corresponding to the encoding table below.

char freq encoding


B 2 01111
F 1 01110
H 3 0110
I ? 00
L 5 010
M 15 10
S 15 11

(b) Circle which one or more of the following are possible values for the frequency of
the character I.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

13. Data compression. (6 points)


What is the compression ratio achieved by the following algorithms and inputs? Write
the best-matching letter from the list of options below in the space provided. For Huffman
and LZW, assume that the input is a sequence of 8-bit characters (R = 256).

Recall, the compression ratio is the number of bits in the compressed message divided
by the number of bits in the original message.

−−−−− Run-length coding with 8-bit counts for best-case inputs of N bits.

−−−−− Run-length coding with 8-bit counts for worst-case inputs of N bits.

−−−−− Huffman coding for best-case inputs of N characters.

−−−−− Huffman coding for worst-case inputs of N characters.

−−−−− LZW coding for best-case inputs of N characters using 12-bit codewords.
Recall: no new codewords are added to the table if the table already has
2^12 = 4096 entries.

−−−−− LZW coding for worst-case inputs of N characters using 12-bit codewords.
Recall: no new codewords are added to the table if the table already has
2^12 = 4096 entries.

A. ∼ 1/4096    B. ∼ 1/3840    C. ∼ 1/2731    D. ∼ 1/2560
E. ∼ 1/320     F. ∼ 1/256     G. ∼ 1/255     H. ∼ 1/128
I. ∼ 1/127     J. ∼ 1/32      K. ∼ 8/255     L. ∼ 1/16
M. ∼ 1/8       N. ∼ 1/7       O. ∼ 1/4       P. ∼ 1/2
Q. ∼ 2/3       R. ∼ 1         S. ∼ 3/2       T. ∼ 2
U. ∼ 3         V. ∼ 4         W. ∼ 7         X. ∼ 8

14. Algorithm design. (8 points)


Two strings s and t are cyclic rotations of one another if they have the same length
and s consists of a suffix of t followed by a prefix of t. For example, "suffixsort" and
"sortsuffix" are cyclic rotations.

Given N distinct strings, each of length L, design an algorithm to determine whether
there exists a pair of distinct strings that are cyclic rotations of one another. For
example, the following list of N = 12 strings of length L = 10 contains exactly one pair
of strings ("suffixsort" and "sortsuffix") that are cyclic rotations of one another.

algorithms   polynomial   sortsuffix   boyermoore
structures   minimumcut   suffixsort   stackstack
binaryheap   digraphdfs   stringsort   digraphbfs

For full credit, the order of growth of the running time should be N L^2 (or better) in the
worst case. You may assume that the alphabet size R is a small constant. Your answer
will be graded on correctness, efficiency, clarity, and succinctness.

(a) Describe your algorithm in the space below.

(b) What is the order of growth of the running time of your algorithm (in the worst
case) as a function of both N and L?
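
One possible approach is sketched below; it is an illustration under stated assumptions, not the expected answer. Each string is mapped to a canonical form, here its lexicographically smallest rotation found by brute force in time proportional to L^2, and two strings are cyclic rotations of one another exactly when their canonical forms are equal. The names CyclicRotations, canonical, and hasRotationPair are hypothetical. The sketch detects equal canonical forms with a hash set, which gives N L^2 time under the uniform hashing assumption; a worst-case guarantee would need the canonical forms to be compared with a radix sort or a trie instead.

```java
import java.util.HashSet;
import java.util.Set;

final class CyclicRotations {
    // Lexicographically smallest rotation of s, found by brute force:
    // L candidate rotations, each compared in O(L) time.
    static String canonical(String s) {
        String doubled = s + s;   // every rotation of s is a length-L substring of s + s
        int L = s.length();
        String best = s;
        for (int i = 1; i < L; i++) {
            String rotation = doubled.substring(i, i + L);
            if (rotation.compareTo(best) < 0) best = rotation;
        }
        return best;
    }

    // True if two of the (distinct, equal-length) input strings are cyclic rotations
    // of one another, that is, if two strings share a canonical form.
    static boolean hasRotationPair(String[] strings) {
        Set<String> seen = new HashSet<>();
        for (String s : strings)
            if (!seen.add(canonical(s))) return true;   // canonical form seen before
        return false;
    }
}
```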
15. Reductions. (8 points)
Consider the following two graph problems:
• LongestPath. Given an undirected graph G and two distinct vertices s and t,
find a simple path (no repeated vertices) between s and t with the most edges.
• LongestCycle. Given an undirected graph G', find a simple cycle (no repeated
vertices or edges except the first and last vertex) with the most edges.
(a) Show that LongestPath linear-time reduces to LongestCycle. Give a brief
description of your reduction. To illustrate your reduction, superimpose the
LongestCycle instance G' that it constructs in order to solve the following LongestPath
instance G:

s 1 2 3

t 4 5 6

(b) Circle which one or more of the following you can infer from the facts that
LongestPath is NP-complete and that LongestPath linear-time reduces to
LongestCycle.
i. If there exists an N^3 algorithm for LongestCycle, then P = NP.
ii. If there does not exist an N^3 algorithm for LongestCycle, then P ≠ NP.
iii. If there exists an N^3 algorithm for LongestCycle, then there exists an N^3
algorithm for LongestPath.
iv. If there exists an N^3 algorithm for LongestPath, then there exists an N^3
algorithm for LongestCycle.

COS 226 Algorithms and Data Structures Fall 2014

Final

This test has 14 questions worth a total of 100 points. You have 180 minutes. The exam is closed book, with the
exception of a one page cheatsheet. No calculators or other electronic devices are permitted. Write out and sign
the Honor Code pledge just before turning in the test.
This exam is preprocessed by computer. Please use a pen; if you use a pencil, be sure to write
darkly. Do not write any answers outside of the designated frames. And do not write on the corners.

“I pledge my honor that I have not violated the Honor Code during this examination.”

Name:

netID:

Room:

P01 P02 P03 P03A P04 P04A


Precept:

Problem Score Problem Score P01 F 9 Andy Guna


0 7 P02 F 10 Jérémie Lumbroso
1 8 P03 F 11 Josh Wetzel
2 9 P03A F 11 Jérémie Lumbroso
3 10 P04 F 12:30 Robert MacDavid
4 11 P04A F 13:30 Shivam Agarwal
5 12
6 13
Sub 1 Sub 2

Total
0. Initialization (2 points)
In the space provided on the front of the exam, write your name and Princeton netID; fill in your precept number;
write the name of the room in which you are taking the exam; and write and sign the honor code.

1. Digraph Traversal (6 points)


Consider the following digraph. Assume the adjacency lists are in sorted order: for example, when iterating
through the edges pointing from vertex 5, consider the edge 5 → 3 before the others.

8
1 7

4 5
0
6

3
2

(a) Starting from vertex 0, run a depth-first search of the digraph, and list the vertices in reverse postorder.

(b) Starting from vertex 0, run a depth-first search of the digraph, and list the vertices in preorder.
2. Analysis of Algorithms (5 points)
For each code fragment on the left, check the best matching order of growth of the running time. You may use an
answer more than once or not at all.

N    log N    N log N    R + N    RN    N + R^2    (N + R) log N    N(N + R)


int x = 1, i;
for (i = 0; i < N; i++)
    x++;

public static int f2(int N) {
    int x = 1;
    while (x < N)
        x = x * 2;
    return x;
}

int x = 0, i;
for (i = 0; i < N; i++)
    x += f2(N);

int x = 1, i, j;
for (i = 0; i < N; i++)
    for (j = 1; j < R; j++)
        x = x * j;

int x = 0, i, j;
for (i = 1; i <= N; i++)
    for (j = 1; j <= N + R; j += i)
        x += j;
3. String Sorting Algorithms (7 points)
The column on the left is the original input of 24 strings to be sorted; the column on the right are the strings
in sorted order; the other 7 columns are the contents at some intermediate step during one of the 3 radix sorting
algorithms listed below.
Match up each column with the corresponding sorting algorithm. You may use a number more than once.
Hint: think about algorithm invariants; do not trace code.

leaf cost hash edge rank load find cost cost


size edge edge cost hash leaf load edge edge
null flow cost fifo edge heap size find fifo
type find heap flow leaf swap type fifo find
cost fifo fifo find heap node trie flow flow
sink heap flow heap less fifo node heap hash
heap hash find hash next edge edge hash heap
trie leaf leaf leaf fifo trie time leaf leaf
loop loop loop loop time swim leaf loop less
flow less load load find null push less list
less load less less sink time hash load load
node list list list list find sink list loop
find null next next size sink rank null next
next node node node flow rank null node node
fifo next push push load loop swim next null
push push rank rank node flow fifo push push
rank rank trie trie loop type heap rank rank
load size sink sink cost push loop size sink
edge sink type type trie hash swap sink size
hash swap time time null less less swap swap
time swim swap swap push cost cost swim swim
swap type null null swap list next type time
list trie swim swim swim next list trie trie
swim time size size type size flow time type

0 4

(0) Original input (2) MSD radix sort

(1) LSD radix sort (3) 3-way radix quicksort (no shuffle)

(4) Sorted
4. Substring Search (8 points)
(a) Consider the Knuth-Morris-Pratt DFA for the following string of length 8:
C A C A C B C B
Complete the first row of the table.

0 1 2 3 4 5 6 7
A
B 0 0 0 0 0 6 0 8
C 1 1 3 1 5 1 7 1
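
The table above follows the dfa[c][j] layout used by the KMP class in the textbook. The sketch below shows how such a table can be built; the class name KmpDfa, the method build, and the explicit alphabet parameter (row c corresponds to alphabet.charAt(c)) are assumptions made for illustration, and every pattern character is assumed to occur in the alphabet.

```java
final class KmpDfa {
    // Builds dfa[c][j] for pattern pat, where row c corresponds to alphabet.charAt(c)
    // and column j is the DFA state (number of pattern characters matched so far).
    static int[][] build(String pat, String alphabet) {
        int M = pat.length(), R = alphabet.length();
        int[][] dfa = new int[R][M];
        dfa[alphabet.indexOf(pat.charAt(0))][0] = 1;
        for (int X = 0, j = 1; j < M; j++) {
            for (int c = 0; c < R; c++)
                dfa[c][j] = dfa[c][X];               // mismatch: copy from restart state X
            int match = alphabet.indexOf(pat.charAt(j));
            dfa[match][j] = j + 1;                   // match: advance to the next state
            X = dfa[match][X];                       // update the restart state
        }
        return dfa;
    }
}
```

Calling KmpDfa.build("CACACBCB", "ABC") produces rows for A, B, and C in that order, so the B and C rows can be checked against the ones printed in the table.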

(b) Suppose that you run the Boyer-Moore algorithm (the basic version considered in the textbook and lecture)
to search for the pattern
M Y F A T H E
in the text
Y B R O T H E R T H A T F A T H E R W A S M Y F A T H E R T
Give the trace of the algorithm in the grid below, circling the characters in the pattern that get compared
with characters in the text.

Y B R O T H E R T H A T F A T H E R W A S M Y F A T H E R T
M Y F A T H E

Y B R O T H E R T H A T F A T H E R W A S M Y F A T H E R T
M Y F A T H E
5. Minimum Spanning Tree Algorithms (6 points)
Each of the figures below represents a partial spanning tree. Determine whether it could possibly be obtained from
(a prematurely stopped) Prim’s algorithm, (a prematurely stopped) Kruskal’s algorithm, both or neither.

Prim Kruskal Both Neither


[Six figures: the same edge-weighted graph shown six times, each with a different
partial spanning tree highlighted. The highlighted edges are not reproduced.]
Final, Fall 2014

6. Maximum Flow (7 points)


Consider the following flow network and feasible flow f from the source vertex A to the sink vertex J.

flow capacity

A 7/7 B 0/5 C ?/3 D 1 / 14 E

2 3 ?/
6/6 /
16 7/7 13 0/7 /
12 0 / 10 7 1 / 10
/
0

F 6 / 10 G 15 / 15 H 15 / 19 I 12 / 17 J

(a) Check the value of the flow on edge C → D.

0 1 2 3 4

augmenting path: A-G-B-C-I-J
bottleneck capacity: 3
min cut: { A, B, C, F, G }
max flow value = 18
C->D has flow 3
D->J has flow 1

(b) Check the value of the flow f.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

(c) Starting from the flow f , perform one iteration of the Ford-Fulkerson algorithm. List the sequence of vertices
on the augmenting path.

(d) Check the value of the maximum flow.

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

(e) Check the vertices on the source side of a minimum cut.

A B C D E F G H I J
7. Properties of Algorithms (9 points)
Check whether each of the following statements is True or False.

(a) Shortest paths. Consider an edge-weighted digraph G with distinct and positive edge weights, a source
vertex s, and a destination vertex t. Assume that G contains at least 3 vertices, has no parallel edges or self
loops, and that every vertex is reachable from s.
True False
Any shortest s → t path must include the lightest edge.
Any shortest s → t path must include the second lightest edge.
Any shortest s → t path must exclude the heaviest edge.
The shortest s → t path is unique.

(b) Minimum spanning trees. Consider an edge-weighted graph G with distinct and positive edge weights.
Assume that G contains at least 3 vertices, has no parallel edges or self loops, and is connected.
True False
Any MST must include the lightest edge.
Any MST must include the second lightest edge.
Any MST must exclude the heaviest edge.
The MST is unique.

(c) Burrows-Wheeler transform.


True False
Any input x consisting of an integer (between 0 and N − 1) followed by N characters is the
Burrows-Wheeler transform of some string s of length N .

If the Burrows-Wheeler transforms of s and t are equal, then s = t.

If the Burrows-Wheeler inverse transforms of x and y are equal, then x = y.

In practice, applying the Burrows-Wheeler transform is significantly faster than applying
the Burrows-Wheeler inverse transform.
8. Huffman Trees (4 points)
Consider the string “DATA-STRUCTURES-AND-ALGORITHMS”: which of the following trees is an optimal prefix-free
code for this input string?

Optimal Prefix-Free Code Not Optimal Prefix-Free Code

R A - T

N D U S

C E G I H M L O

T A S R -

U D

L O H M N G I

C E

T - R A

S U H M

N D C E G I L O

R A T

C E - S N D U

H M G I L O
9. LZW Compression (5 points)
What is the result of compressing the following string of length 15 using LZW compression?

B B B B B B C A B B C B B B C

Assume the original encoding table consists of all 7-bit ASCII characters and uses 8-bit codewords. Recall that
codeword 80 is reserved to signify end of file.

42 80

For reference, below is the hexadecimal-to-ASCII conversion table from the textbook:

    0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
0   NUL SOH STX ETX EOT ENQ ACK BEL BS  HT  LF  VT  FF  CR  SO  SI
1   DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US
2   SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL

Hexadecimal-to-ASCII conversion table
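
For checking an answer to the LZW question, the compression loop can be sketched as follows. This is a minimal sketch under the stated assumptions (7-bit ASCII input, 8-bit codewords, 80 reserved for end of file, new codewords assigned from 81 upward); the class name LzwSketch is hypothetical, and the longest-prefix search uses a plain HashMap rather than the trie a production implementation would use.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: LZW compression that returns the codewords as hex text.
final class LzwSketch {
    static String compress(String input) {
        Map<String, Integer> codes = new HashMap<>();
        for (int c = 0; c < 128; c++)
            codes.put(String.valueOf((char) c), c);   // single-character codewords
        int next = 0x81;                              // 0x80 is reserved for EOF
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            // Find the longest prefix of input[i..] that is already in the table.
            int len = 1;
            while (i + len < input.length()
                    && codes.containsKey(input.substring(i, i + len + 1)))
                len++;
            out.append(String.format("%02X ", codes.get(input.substring(i, i + len))));
            // Add that prefix plus the lookahead character, while codewords remain.
            if (i + len < input.length() && next < 0x100)
                codes.put(input.substring(i, i + len + 1), next++);
            i += len;
        }
        return out.append("80").toString();           // end-of-file codeword
    }

    public static void main(String[] args) {
        System.out.println(compress("BBBBBBCABBCBBBC"));
    }
}
```

The output begins with 42 and ends with 80, matching the two codewords already filled in on the answer line.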
10. Burrows-Wheeler Transform (6 points)
(a) What is the Burrows-Wheeler transform of the following?

A D D B D B C A

(b) What is the Burrows-Wheeler inverse transform of the following?

C A D D A B C C
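
Both directions of the transform can be sketched in a few lines for checking hand-worked answers. This is a brute-force sketch, not an efficient implementation: the forward transform sorts all rotations explicitly (quadratic space), and the inverse assumes extended-ASCII characters. The class and method names are illustrative.

```java
import java.util.Arrays;

// Sketch only: brute-force Burrows-Wheeler transform and its inverse.
final class BwtSketch {
    static String[] sortedRotations(String s) {
        int n = s.length();
        String[] rot = new String[n];
        for (int i = 0; i < n; i++)
            rot[i] = s.substring(i) + s.substring(0, i);
        Arrays.sort(rot);
        return rot;
    }

    // Row of the sorted-rotation matrix in which the original string appears.
    static int first(String s) {
        return Arrays.asList(sortedRotations(s)).indexOf(s);
    }

    // Last column of the sorted-rotation matrix.
    static String lastColumn(String s) {
        StringBuilder t = new StringBuilder();
        for (String r : sortedRotations(s))
            t.append(r.charAt(r.length() - 1));
        return t.toString();
    }

    // Inverse transform: key-indexed counting yields the sorted first column
    // and the next[] array, which is then followed starting from row first.
    static String inverse(int first, String t) {
        int n = t.length();
        int[] count = new int[257];
        for (int i = 0; i < n; i++) count[t.charAt(i) + 1]++;
        for (int r = 0; r < 256; r++) count[r + 1] += count[r];
        char[] sorted = new char[n];
        int[] next = new int[n];
        for (int i = 0; i < n; i++) {
            int p = count[t.charAt(i)]++;
            sorted[p] = t.charAt(i);
            next[p] = i;
        }
        StringBuilder s = new StringBuilder();
        for (int i = 0, row = first; i < n; i++, row = next[row])
            s.append(sorted[row]);
        return s.toString();
    }

    public static void main(String[] args) {
        String s = "ADDBDBCA";
        System.out.println(first(s) + " " + lastColumn(s));
    }
}
```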


11. Algorithm and Data Structure Design (13 points)


Design a data type to store a collection of gene fragments over the DNA alphabet {A, C, T, G}, according to the
following API:

public class FragmentCollection

public FragmentCollection() create an empty collection of DNA fragments

public void add(String fragment) add the DNA fragment to the collection

public int prefixCount(String p) number of DNA fragments that start with prefix p

Here is an example:

FragmentCollection fc = new FragmentCollection();

fc.add("AC");
fc.add("TACG");
fc.add("TCGAA");
fc.add("CGA");
fc.add("AGCT");
fc.add("TCGG");
fc.add("TCGG");          // added twice, will be counted twice
fc.prefixCount("");      // returns 7 (number of adds)
fc.prefixCount("T");     // returns 4 (TACG, TCGAA, TCGG, TCGG)
fc.prefixCount("TC");    // returns 3 (TCGAA, TCGG, TCGG)
fc.prefixCount("G");     // returns 0

Give a crisp and concise English description of your data structure. Your answer will be graded on correctness,
efficiency, and clarity.
(a) Declare the instance variables for your FragmentCollection data type. You may use nested data types.
public class FragmentCollection {

}
(b) Briefly describe how to implement each of the operations, using either prose or code.
• public void add(String fragment):

• public int prefixCount(String p):

(c) What is the order of growth of prefixCount(p) as a function of the number N of keys added, the length W
of the prefix p, the alphabet size R, and the number M of fragments that match the given prefix p?
NR
1 log N N W W + log N W +M N + W log R WR
WR
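
One possible design for this API, offered as a sketch rather than as the intended solution, is a 4-way trie in which every node stores how many added fragments pass through it. The class name FragmentCollectionSketch, the Node type, and the ALPHABET constant are assumptions made for illustration; the code also assumes that every input character is one of A, C, T, G.

```java
// Sketch only: 4-way trie with a pass-through count in every node.
final class FragmentCollectionSketch {
    private static final String ALPHABET = "ACTG";

    private static final class Node {
        int count;                        // fragments whose path goes through this node
        Node[] next = new Node[4];        // one link per DNA character
    }

    private final Node root = new Node();

    void add(String fragment) {
        Node x = root;
        x.count++;                        // every fragment has the empty prefix
        for (int i = 0; i < fragment.length(); i++) {
            int c = ALPHABET.indexOf(fragment.charAt(i));
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
            x.count++;
        }
    }

    int prefixCount(String p) {
        Node x = root;
        for (int i = 0; i < p.length() && x != null; i++)
            x = x.next[ALPHABET.indexOf(p.charAt(i))];
        return x == null ? 0 : x.count;   // no node means no fragment has this prefix
    }
}
```

With this layout, prefixCount follows at most one link per character of p and never enumerates the matching fragments.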
12. Reductions (13 points)
Consider the following two graph-processing problems:
• Shortest-Path. Given an edge-weighted digraph G with nonnegative edge weights, a source vertex s and
a destination vertex t, find a shortest path from s to t.

• Shortest-Teleport-Path. Given an edge-weighted digraph G with nonnegative edge weights, a source
vertex s and a destination vertex t, find a shortest path from s to t where you are permitted to teleport
across one edge for free. That is, the weight of a path is the sum of the weights of all of the edges in the
path, excluding the largest one.
For example, in the edge-weighted digraph below, the shortest path from s to t is s → w → t (with weight 11) but
the shortest teleport path is s → u → v → t (with weight 3).

u 2 v

weight
8

99
1

source s 5 w 6 t destination

(a) Design a linear-time reduction from Shortest-Path to Shortest-Teleport-Path. To demonstrate your
reduction, draw the edge-weighted digraph (and label the source and destination vertices) that you would
construct to solve the Shortest-Path problem on the digraph above. You may additionally explain your
construction with a few concise sentences.
(b) Design a linear-time reduction from Shortest-Teleport-Path to Shortest-Path. To demonstrate your
reduction, draw the edge-weighted digraph (and label the source and destination vertices) that you would
construct to solve the Shortest-Teleport-Path problem on the digraph given in the previous page. You
may additionally explain your construction with a few concise sentences.

(c) Determine whether each of the following statements can be inferred from the fact that Shortest-Path and
Shortest-Teleport-Path linear-time reduce to one another. For simplicity, assume E ≥ V.

Yes No
If there exists an E log log E algorithm for Shortest-Teleport-Path, then there exists an
E log log E algorithm for Shortest-Path.

If there exists an E log log E algorithm for Shortest-Path, then there exists an E log log E
algorithm for Shortest-Teleport-Path.

If there does not exist a linear-time algorithm for Shortest-Path, then there does not exist
a linear-time algorithm for Shortest-Teleport-Path.

If there does not exist a linear-time algorithm for Shortest-Teleport-Path, then there
does not exist a linear-time algorithm for Shortest-Path.
13. Problem Identification (9 points)
You are applying for a job at a new software technology company. Your interviewer asks you to identify the
following tasks as either possible (with algorithms and data structures introduced in this course), impossible, or an
open research problem.

Possible Impossible Open


Given a digraph, find a directed cycle that is simple (if one exists) in time
proportional to E + V . A simple cycle is a cycle that has no repeated vertices
other than the requisite repetition of the first and last vertex.

Given an edge-weighted digraph in which all edge weights are either 1 or 2 and
two vertices s and t, find a shortest path from s to t in time proportional to
E+V.

Given an edge-weighted DAG with positive edge weights and two vertices s and
t, find a path from s to t that maximizes the product of the weights of the edges
participating in the path in time proportional to E + V .

Given an edge-weighted graph with positive edge weights, find a spanning tree
that maximizes the product of the weights of the edges participating in the span-
ning tree in time proportional to E + V .

Given an edge-weighted graph with positive edge weights and two distinguished
vertices s and t, find a simple path (no repeated vertices) between s and t that
maximizes the sum of the weights of the edges participating in the path in time
proportional to E V .

Given a flow network and a mincut in that flow network, find a maxflow in time
proportional to E + V .

Given an array of N strings over the DNA alphabet {A, C, T, G}, determine
whether all N strings are distinct in time linear in the number of characters in
the input.

Given an array a of N 64-bit integers, determine whether there are two indices
i and j such that a[i] + a[j] = 0 in time proportional to N.

Given an array of N integers between 0 and R^2 − 1, stably sort them in time
proportional to N + R.
COS 226 Algorithms and Data Structures Fall 2015

Final Exam

You have 180 minutes for this exam. The exam is closed book, except that you are allowed to use one
page of notes (8.5-by-11, one side, in your own handwriting). No calculators or other electronic devices are
permitted. Give your answers and show your work in the space provided. You may use the back of each
page for scratch space, or to continue long answers.

Name:

NetID:

Precept:

P01 9:00 Andy Guna
P02 10:00 Andy Guna
P02A 10:00 Elena Sizikova
P03 11:00 Maia Ginsburg
P03A 11:00 Nora Coler
P04 12:30 Maia Ginsburg
P04A 12:30 Miles Carlsten
P05 1:30 Tom Wu

Write and sign: “I pledge my honor that I have not violated the Honor Code during this examination.”

Grading note: To ensure that guessing on true/false and multiple-choice questions does not affect your
expected score, grading on these questions will be as follows:
True / False: +1 point if correct, −1 point if incorrect, 0 points if left unanswered.
Multiple choice: +2 points if correct, −0.4 points if incorrect, 0 points if left unanswered.

Problem Score Problem Score


0 7
1 8
2 9
3 10
4 11
5 12
6 13
Sub 1 Sub 2

Total:
COS 226, Fall 2015 Page 2 of 15

0. Init. (1 point)
In the space provided on the front of the exam, write your name, Princeton netID, and precept number, and
write and sign the honor code.

1. Flow. (10 points)


Consider the following flow network and feasible flow f from the source vertex S to the sink vertex T.

(a) What is the value of the flow f ? Circle the correct answer.
3 5 7 11 13 17

(b) Starting from the flow given above, perform one iteration of the Ford-Fulkerson algorithm. List the
sequence of vertices on the augmenting path, in order from S to T.

(c) What is the value of the maximum flow? Circle the correct answer.

3 5 7 11 13 17

(d) Circle all vertices on the sink (T) side of the minimum cut.

S A B C D E F T
2. SPT. (12 points)
Simulate Dijkstra’s algorithm on the edge-weighted digraph below, starting from vertex 0.
24
0 1

10
5

1
2 3

10
30 8

20
4 5

(a) Fill in the following table:

distTo[] edgeTo[]

(b) What is the maximum number of items in the priority queue? Circle the correct answer.
1 2 3 4 5 6

(c) What is the last vertex popped from the priority queue? Circle the correct answer.
0 1 2 3 4 5

(d) What letter is spelled out by the edges of the shortest-paths tree (SPT) computed by Dijkstra’s algorithm?
3. TST. (13 points)
Consider the following Ternary Search Trie (TST), where the values are shown next to the nodes of the
corresponding string keys.

(a) We would like to construct the above TST by inserting six strings into an empty TST. Circle the sequences
below that can produce the above TST. There may be multiple correct answers.

Sequence 1: ATT ACG T CT AGC GA

Sequence 2: ATT T CT ACG GA AGC

Sequence 3: ATT T GA ACG CT AGC

Sequence 4: ATT T AGC ACG CT GA

Sequence 5: ATT ACG AGC T CT GA

Sequence 6: ATT T AGC GA ACG CT

Sequence 7: ATT ACG T CT GA AGC

(b) Insert the three strings CA, AGA, and GAC into the TST with the associated values 0, 18, and 29, respec-
tively. Update the figure above to reflect the changes.
4. KMP DFA. (13 points)
(a) Below is a partially-completed Knuth-Morris-Pratt DFA for a string s of length 6 over the alphabet
{A, B}. State 6 is the accept state. Fill in all the missing spots in the table.

j 0 1 2 3 4 5

pat.charAt(j)

A 1 1

B 3 3

(b) Given the following KMP DFA:

j 0 1 2 3 4 5 6

A 1 1 3 1 5 1 5

B 0 2 0 4 0 6 7

List the string that this DFA searches for.

(c) Given each of the following strings as input, what state would the DFA in (b) end in? Circle the correct
answer for each string.

BABBAA: 0 1 2 3 4 5 6 accept

ABABABA: 0 1 2 3 4 5 6 accept

BABABABA: 0 1 2 3 4 5 6 accept

BBAABBABAB: 0 1 2 3 4 5 6 accept
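
Part (c) amounts to running the transition table from part (b) over each input string. A minimal sketch of that simulation follows, assuming the table is indexed as row 0 = A and row 1 = B, that state 7 is the accept state, and that the simulation stops once the accept state is reached; the class name DfaSimulationSketch is hypothetical.

```java
// Sketch only: simulate the DFA from part (b) on a text over {A, B}.
final class DfaSimulationSketch {
    // DFA[c][j] = next state from state j on character c (0 = A, 1 = B).
    static final int[][] DFA = {
        {1, 1, 3, 1, 5, 1, 5},
        {0, 2, 0, 4, 0, 6, 7}
    };
    static final int ACCEPT = 7;

    static int endState(String text) {
        int state = 0;
        for (int i = 0; i < text.length() && state != ACCEPT; i++)
            state = DFA[text.charAt(i) - 'A'][state];
        return state;   // ACCEPT means the pattern occurs in the text
    }

    public static void main(String[] args) {
        for (String s : new String[] {"BABBAA", "ABABABA", "BABABABA", "BBAABBABAB"})
            System.out.println(s + ": " + endState(s));
    }
}
```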
5. DAG. (10 points)
Consider the following directed graph.

(a) You wish to find the shortest common ancestor (SCA) of the two given sets, using BFS. List the first
six vertices added to the queue by running BreadthFirstDirectedPaths.java on an iterator with the
sources from set A = {13, 23, 24}?

(b) How many vertices in all will BreadthFirstDirectedPaths.java visit when passed an iterator with
the sources from set B = {6, 16, 17}?

(c) Using BFS to find the SCA can take running time proportional to V + E. Suppose you wished to use
DFS instead. What would be the order-of-growth running time? Circle the correct answer.
constant    V + E    E log V    V log E    (V + E)^2    exponential

(d) True or false: any pair of vertices in a rooted directed acyclic graph (DAG) has at least one shortest
common ancestor.
True False

(e) True or false: any pair of vertices in any DAG for which a topological sort exists has at least one shortest
common ancestor.
True False
6. Regex. (11 points)
(a) Consider the regular expression
((A|B)DA*C)
Circle all words matched by this regular expression.

ABDAC ADAAC ABDACA BDC BDAC AACA
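
An answer to part (a) can be checked with Java's built-in regular-expression engine, which agrees with the textbook semantics for this particular expression (union, concatenation, and closure only). The sketch below assumes whole-word matching, which is what Pattern.matches performs; the class name RegexCheckSketch is illustrative.

```java
import java.util.regex.Pattern;

// Sketch only: test each candidate word against the regular expression.
final class RegexCheckSketch {
    static boolean matches(String word) {
        // Pattern.matches succeeds only if the entire word matches.
        return Pattern.matches("((A|B)DA*C)", word);
    }

    public static void main(String[] args) {
        String[] words = {"ABDAC", "ADAAC", "ABDACA", "BDC", "BDAC", "AACA"};
        for (String w : words)
            System.out.println(w + " " + matches(w));
    }
}
```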

(b) The following NFA matches the regular expression in (a):

Which of the labeled edges correspond to ε transitions (as opposed to match transitions)? Circle the num-
bers of only the ε transitions:
1 2 3 4 5 6 7 8 9 10 11 12 13 14

(c) Which of the following (if any) are true reasons why we usually prefer NFAs for matching a regular
expression (RE), as opposed to DFAs? Circle the correct answer in each case.

The size of the NFA is linear in the size of the RE, while the size of the DFA might be as bad as quadratic.
True False

The size of the NFA is linear in the size of the RE, while the size of the DFA might be as bad as exponential.
True False

The running time to simulate the NFA is linear in the size of the RE, while the running time for the DFA
might be as bad as quadratic.
True False

The running time to simulate the NFA is linear in the size of the RE, while the running time for the DFA
might be as bad as exponential.
True False

The NFA only has two kinds of transitions (match and ε), while the DFA requires determining the correct
transition for each possible input character.
True False

The DFA might require backing up in the input stream, while the NFA does not.
True False
7. Huffman. (10 points)
Consider the following Huffman tree:

(a) Decode the following 24-bit bitstring: 111000110001100001011000

(b) What is the compression ratio (compressed size / uncompressed size) for the above bitstring? Assume
that characters were represented by 8 bits before compression.

(c) What is the best compression ratio achievable on any string using this Huffman tree?

(d) Suppose you added another character, H, with a count of 1. After re-creating the new Huffman code,
circle all the letters that acquire a different codeword.
A B C D E F G

(e) Using the Huffman code from (d), what is the worst compression ratio achievable on any string?
8. Graph T/F. (10 points)
(a) The adjacency matrix representation is usually preferred over adjacency lists, especially for storing sparse
graphs compactly.
True False

(b) Given the data structures produced by depth-first search (DFS), one can check whether a given vertex is
connected to the source in constant time.
True False

(c) Breadth-first search (BFS) will visit every vertex in a directed graph, in nondecreasing order from the
source.
True False

(d) BFS and DFS are interchangeable and equally practical for all applications of graph search.
True False

(e) Kruskal’s algorithm computes the minimum spanning tree (MST) in time proportional to E log E (in the
worst case).
True False

(f) Given any directed graph, there is always a shortest-paths tree (SPT) containing every vertex reachable
from a source vertex s.
True False

(g) Dijkstra’s algorithm can find shortest paths in a directed graph with negative weights, but no negative
cycles.
True False

(h) An st-cut in a graph is any partition of vertices into two disjoint sets, such that vertices s and t wind up
in different sets.
True False

(i) A graph flow is a max flow if and only if there exists no cut with the same capacity as the flow’s value.
True False

(j) The choice of which augmenting paths to consider first in the Ford-Fulkerson algorithm doesn’t impact
the number of paths that need to be considered.
True False
9. Sort. (14 points)
The column on the left is an array of strings to be sorted. The column on the right is in sorted order. The
other columns are the contents of the array at some intermediate step during one of the algorithms below.
Write the number of each algorithm under the corresponding column. You may use each number more than
once.

mink bear bear calf crow myna crab bear bear


moth calf calf lamb lamb crab toad crow calf
crow crow crow hare deer lamb swan calf crab
myna crab crab wasp crab toad bear crab crow
swan deer hare hawk hare mule deer deer deer
wolf hare kiwi ibex bear hare ibex hare hare
mule hawk deer bear kiwi sole hoki hawk hawk
slug hoki hawk deer calf wolf mule hoki hoki
hare ibex ibex mink hawk calf sole ibex ibex
bear kiwi hoki lion ibex slug wolf kiwi kiwi
kiwi lion lion kiwi hoki moth calf lion lamb
calf lynx lynx slug lion kiwi lamb lynx lion
hawk lamb lamb toad lynx hoki myna lamb lynx
ibex mink mink hoki mink mink mink mink mink
oryx moth mule sole mule hawk lynx moth moth
lion myna myna wolf myna swan lion myna mule
sole mule moth moth moth lion crow mule myna
wasp oryx wasp crab wasp wasp hare oryx oryx
lynx swan sole crow sole bear wasp swan slug
hoki slug oryx oryx oryx deer moth slug sole
crab sole slug mule slug crow slug sole swan
deer toad wolf swan wolf ibex kiwi toad toad
lamb wolf toad myna toad oryx hawk wolf wasp
toad wasp swan lynx swan lynx oryx wasp wolf
---- ---- ---- ---- ---- ---- ---- ---- ----
0 4

(0) Original input (2) LSD radix sort (4) Sorted


(1) 3-way radix quicksort (3) MSD radix sort
(no shuffle)
10. G^2. (10 points)
The square of a digraph G consisting of vertices V and edges E is a digraph G^2 such that:
• the vertices in G^2 are the same as the vertices in G, and
• two vertices in G^2 are connected by an edge (u, v) if and only if G contains edges (u, w) and (w, v),
for some vertex w.
That is, vertices u and v are connected by an edge in G^2 whenever G contains a path with exactly two edges
from u to v.
Describe an algorithm for computing the square of a digraph (represented using adjacency lists). For full
credit, your solution should run in O(V E) time. To simplify the problem, you need not remove duplicates
from the adjacency lists in G^2.
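
The definition translates almost directly into code. The following is one possible sketch, not necessarily the intended solution. It assumes vertices are numbered 0 to V − 1 and that the digraph is given as a list of adjacency lists, where adj.get(u) holds the heads of the edges leaving u; the class name DigraphSquareSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: compute the square of a digraph, keeping duplicate edges.
final class DigraphSquareSketch {
    static List<List<Integer>> square(List<List<Integer>> adj) {
        List<List<Integer>> sq = new ArrayList<>();
        for (int u = 0; u < adj.size(); u++) {
            List<Integer> out = new ArrayList<>();
            for (int w : adj.get(u))          // edge (u, w) in G
                for (int v : adj.get(w))      // edge (w, v) in G
                    out.add(v);               // so (u, v) is an edge in G^2
            sq.add(out);
        }
        return sq;
    }
}
```

For each edge (u, w), the inner loop scans the adjacency list of w. Assuming no parallel edges, each adjacency list is scanned at most V times, so the total work is proportional to V E in the worst case.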
11. ST Analysis. (10 points)
You are deciding between symbol table implementations to store L-character strings, consisting of characters
from the extended-ASCII (R = 256) character set. Analyze the worst-case order-of-growth running time
required by the get() operation (with the key present in the symbol table — i.e., a search hit) for the
following implementations, assuming that N strings are already in the symbol table. Circle the correct
answer in each case.

(a) A Left-Leaning Red-Black BST of strings.


Worst-case number of character comparisons:

lg N    lg^2 N    L + lg^2 N    L lg N    (lg N)(log_R L)    NL

(b) An M-entry hash table with separate chaining. Assume the hash table has been resized such that the
average chain length N/M is bounded: 2 ≤ N/M ≤ 8. Do not include the time to compute the hashCode.
Worst-case number of character comparisons:
 
M + L    NL/M    ML/N    (N/M) log_R L    (log_R L)(1 + 1/(1 − N/M))    NL

(c) An M-entry hash table with linear probing. Assume the hash table has been resized such that the average
occupancy N/M is bounded: 1/8 ≤ N/M ≤ 1/2. Do not include the time to compute the hashCode.
Worst-case number of character comparisons:
 
M + L    NL/M    ML/N    (N/M) log_R L    (log_R L)(1 + 1/(1 − N/M))    NL

(d) An R-way trie.


Worst-case number of array accesses:

R    L    R + L    RL    log_R N    L + log_R N

(e) A ternary search trie (TST).


Worst-case number of character comparisons:

R    L    R + L    RL    log_R N    L + log_R N


12. Reduction. (10 points)
(a) The Find-42nd problem is to find the 42nd smallest item in an (initially unsorted) array. You can implement
this easily by sorting the array in O(N log N) time and returning the item in the 42nd position. Given
this, which of the following (if any) must be true? Circle the correct answer in each case.

Find-42nd reduces to sorting.
True False

Sorting reduces to Find-42nd.
True False

O(N log N) is a lower bound on Find-42nd.
True False

O(N log N) is an upper bound on Find-42nd.
True False

Find-42nd must be NP-complete.
True False

Find-42nd cannot be NP-complete unless P = NP.
True False

(b) Of course, it is also easy to implement Find-42nd in O(N) time, using O(42) additional space. Furthermore,
it is possible to show that linear time is the lower bound on Find-42nd, since all elements must be
examined. Given this algorithm and the reduction in (a), which of the following (if any) must be true:

O(N) is a lower bound on sorting.
True False

O(N) is an upper bound on sorting.
True False

Sorting is strictly harder than Find-42nd, so can never be accomplished in O(N) time.
True False

New developments in sorting might result in an asymptotically faster algorithm for Find-42nd.
True False
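
The linear-time, constant-space algorithm mentioned in part (b) is not spelled out in the exam. One way it could look, as a sketch under the assumption that the array holds at least k items, is a single pass that keeps the k smallest items seen so far in a max-oriented priority queue. The class name FindKthSketch and the parameter k (42 in the problem) are illustrative.

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Sketch only: k-th smallest item (1-based) using O(k) extra space.
final class FindKthSketch {
    static int kthSmallest(int[] a, int k) {
        // Max-oriented queue holding the k smallest items seen so far.
        PriorityQueue<Integer> maxPq = new PriorityQueue<>(Collections.reverseOrder());
        for (int x : a) {
            maxPq.add(x);
            if (maxPq.size() > k)
                maxPq.poll();             // discard the largest of the k + 1 items
        }
        return maxPq.peek();              // largest of the k smallest items
    }

    public static void main(String[] args) {
        int[] a = {9, 4, 7, 1, 8, 2};
        System.out.println(kthSmallest(a, 3));
    }
}
```

Each of the N insertions costs time proportional to log k, which is a constant when k is fixed at 42, so the whole pass is linear in N.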
13. MST. (16 points)
You are given an edge-weighted undirected graph, using the adjacency list representation, together with the
list of edges in its minimum spanning tree (MST). Describe an efficient algorithm for updating the MST,
when each of the following operations is performed on the graph. Assume that common graph operations
(e.g., DFS, BFS, finding a cycle, etc.) are available to you, and don’t describe how to re-implement them.

(a) Update the MST when the weight of an edge that was not part of the MST is decreased.
Give the order-of-growth running time of your algorithm as a function of V and/or E.

(b) Update the MST when the weight of an edge that was part of the MST is decreased.
Give the order-of-growth running time of your algorithm as a function of V and/or E.
(c) Update the MST when the weight of an edge that was not part of the MST is increased.
Give the order-of-growth running time of your algorithm as a function of V and/or E.

(d) Update the MST when the weight of an edge that was part of the MST is increased.
Give the order-of-growth running time of your algorithm as a function of V and/or E.
COS 226 Algorithms and Data Structures Fall 2017

Final

This exam has 16 questions worth a total of 100 points. You have 180 minutes. This exam
is preprocessed by a computer, so please write darkly and write your answers inside the
designated spaces.

Policies. The exam is closed book, except that you are allowed to use a one page cheatsheet
(8.5-by-11 paper, two sides, in your own handwriting). No electronic devices are permitted.

Discussing this exam. Discussing the contents of this exam before solutions have been posted
is a violation of the Honor Code.

This exam. Do not remove this exam from this room. In the space below, write your name and
NetID; mark your precept number; and write and sign the Honor Code pledge. You may fill in this
information now.

Name:

NetID:

Exam room: McCosh 10 Other


#
P01 P02 P03 P03A P04 P05 P06
Precept:
# # # # # # #

“I pledge my honor that I will not violate the Honor Code during this examination.”

Signature
2 PRINCETON UNIVERSITY

1. Initialization. (2 points)
In the space provided on the front of the exam, write your name and NetID; mark your
precept number; and write and sign the Honor Code pledge.

2. Memory. (5 points)
Consider the following representation for a ternary search trie for LZW compression with
string keys and integer values:

public class TernarySearchTrie {


private int n; // number of key-value pairs
private Node root; // root node

private static class Node {


private char c; // character
private int value; // value of key-value pair
private Node left; // left sub-trie
private Node mid; // middle sub-trie
private Node right; // right sub-trie
}
...
}

Using the 64-bit memory cost model from lecture and the textbook, how much memory does
a TernarySearchTrie object use as a function of the number of key–value pairs n? Use tilde
notation to simplify your answer.

∼ bytes

Hint 1: For LZW compression, the number of TST nodes equals the number of key–value
pairs (because every prefix of a key is also a key).

Hint 2: There is no 8-byte inner-class overhead for static nested classes.


COS 226 FINAL, FALL 2017 3

3. Running time. (6 points)


Let x be a StringBuilder object of length n. For each code fragment at left, write the letter
corresponding to the order of growth of the running time as a function of n.
Assume that Java’s StringBuilder data type represents a string of length n using a resizing
array of characters (with doubling and halving), with the first character in the string at index
0 and the last character in the string at index n − 1.

A. 1
B. log n
C. n log n
D. n
E. n^2

// converts x to a String
String s = "";
for (int i = 0; i < n; i++)
    s += x.charAt(i);

// creates a copy of x
StringBuilder y = new StringBuilder();
for (int i = 0; i < n; i++)
    y.append(x.charAt(i));

// reverses x
for (int i = 0; i < n/2; i++) {
    char c1 = x.charAt(i);
    char c2 = x.charAt(n - i - 1);
    x.setCharAt(i, c2);
    x.setCharAt(n - i - 1, c1);
}

// concatenates x with itself
for (int i = 0; i < n; i++)
    x.append(x.charAt(i));

// removes the last n/2 characters of x
for (int i = 0; i < n/2; i++)
    x.deleteCharAt(x.length() - 1);

// removes the first n/2 characters of x
for (int i = 0; i < n/2; i++)
    x.deleteCharAt(0);

4. String sorts. (5 points)


The column on the left contains the original input of 24 strings to be sorted; the column on
the right contains the strings in sorted order; the other 5 columns contain the contents at
some intermediate step during one of the 3 radix-sorting algorithms listed below. Match each
algorithm by writing its letter in the box under the corresponding column.

0 null byte cost byte java byte byte


1 tree cost lifo cost load cost cost
2 lifo edge list edge find miss edge
3 list find miss flip tree hash find
4 miss flip hash find byte java flip
5 hash hash java hash edge load hash
6 java java load java trie leaf java
7 next lifo leaf lifo type flip lazy
8 load list flip list leaf link leaf
9 leaf load link load hash list left
10 flip leaf byte leaf path edge lifo
11 path lazy edge lazy sink lazy link
12 byte left lazy left link left list
13 edge link left link rank find load
14 lazy miss find miss null lifo miss
15 trie null next null lifo next next
16 find next null next flip null null
17 left path type path swap type path
18 type rank sink rank miss sink rank
19 sink sink trie sink list trie sink
20 link swap swap swap next swap swap
21 swap tree path tree left path tree
22 cost trie rank trie cost rank trie
23 rank type tree type lazy tree type

A E

A. Original input

B. LSD radix sort

C. MSD radix sort

D. 3-way radix quicksort (no shuffle)

E. Sorted


5. Depth-first search. (6 points)
Run depth-first search on the following digraph, starting from vertex 0. Assume the adjacency
lists are in sorted order: for example, when iterating over the edges pointing from 7, consider
the edge 7 → 1 before either 7 → 6 or 7 → 8.

run DFS from here

1 2 0 4 5

6 7 8 9 3

(a) List the 10 vertices in preorder.

preorder: 0

(b) List the 10 vertices in postorder.

postorder:

(c) Does this digraph have a topological order? If yes, write one in the box below; if no,
succinctly explain why not.


6. Breadth-first search. (4 points)
Run breadth-first search on the following digraph, starting from vertex 0. Assume the adja-
cency lists are in sorted order: for example, when iterating over the edges pointing from 7,
consider the edge 7 → 1 before either 7 → 6 or 7 → 8.

run BFS from here

1 2 0 4 5

6 7 8 9 3

List the 10 vertices in the order in which they are added to the queue.

0

7. Maximum flow. (10 points)


Consider the following flow network and flow f from the source vertex A to sink vertex J.

source flow capacity

A 14 / 29 B 1/8 C 18 / 18 D 9/9 E

12 3
26 8 /
10 / 10 / 13 / 22 / 5 / 13 / 6/6 19 9 / 10
26 12 0

F 10 / 14 G 37 / 37 H 32 / 39 I 38 / 38 J

sink

augmenting path: A->B->C->H->I->D->J
bottleneck capacity = 5
min cut: { A, B, C, F, G }
max flow value = 55

(a) What is the value of the flow f? Mark the correct answer.

47 50 51 52 53 54 55 65 70 79
# # # # # # # # # #
(b) What is the capacity of the cut {A, F, G}? Mark the correct answer.

41 46 49 50 56 63 65 74 78 100
# # # # # # # # # #
(c) Starting from the flow f , perform one iteration of the Ford–Fulkerson algorithm. Which
vertices are on the (unique) augmenting path? Mark all that apply.

A B C D E F G H I J
◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻
(d) What is the bottleneck capacity of the augmenting path? Mark the correct answer.

0 1 2 3 4 5 6 7 8 9
# # # # # # # # # #
(e) Which vertices are on the source side of the (unique) minimum cut? Mark all that apply.

A B C D E F G H I J
◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻

8. LZW compression. (6 points)


Expand the following LZW-encoded sequence of 8 hexadecimal integers.

43 41 41 81 42 84 41 80
Assume the original encoding table consists of all 7-bit ASCII characters and uses 8-bit
codewords. Recall that codeword 80 is reserved to signify end of file.

(a) What was the encoded message?

(b) Which of the following strings are in the LZW dictionary upon termination of the algo-
rithm? Mark all that apply.

AA AB ABA AC ACA BC CA CAA CAB CABA CABC
◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻

For reference, this is the hexadecimal-to-ASCII conversion table from the textbook.

    0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
0   NUL SOH STX ETX EOT ENQ ACK BEL BS  HT  LF  VT  FF  CR  SO  SI
1   DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US
2   SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL
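As a study aid for the LZW question above, here is a minimal sketch of LZW expansion under the stated conventions (7-bit ASCII starting table, 8-bit codewords, codeword 80 for end of file). The class name `LzwExpandSketch` and the `expand` method are illustrative names, not part of the course library; the example input in the usage note is the well-known `ABRACADABRABRABRA` sequence, not the exam sequence.

```java
import java.util.ArrayList;
import java.util.List;

final class LzwExpandSketch {
    private static final int R = 0x80;    // 128 ASCII characters; 0x80 doubles as the EOF codeword
    private static final int L = 0x100;   // number of 8-bit codewords

    // Expands a sequence of LZW codewords into the original string.
    static String expand(int[] codes) {
        List<String> table = new ArrayList<>();
        for (int c = 0; c < R; c++) table.add(String.valueOf((char) c));
        table.add("");                    // placeholder entry for the EOF codeword

        if (codes.length == 0 || codes[0] == R) return "";
        StringBuilder out = new StringBuilder();
        String val = table.get(codes[0]);
        for (int k = 1; k < codes.length; k++) {
            out.append(val);
            int code = codes[k];
            if (code == R) return out.toString();
            String s;
            if (code == table.size()) s = val + val.charAt(0);  // codeword being defined right now
            else                      s = table.get(code);
            if (table.size() < L) table.add(val + s.charAt(0)); // new dictionary entry
            val = s;
        }
        out.append(val);                  // input ended without an EOF codeword
        return out.toString();
    }
}
```

For instance, `LzwExpandSketch.expand(new int[] {0x41, 0x42, 0x81, 0x83, 0x80})` exercises the special case in which a codeword is used before its dictionary entry has been completed.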

9. Ternary search tries. (6 points)


Consider the following TST, where the values are shown next to the nodes of the corresponding
string keys. Each node labeled with a ? contains some uppercase letter (possibly different
for each node).

T R Z 14

I 8 ? T

1 ? G R 5 9 O E 10 12 ?

E ? 3 I 6 11 O O 13

R 2 ? 4 ? 7

Which of the following string keys are (or could be) in the TST? Mark all that apply.

TIGER TILE TO TOO TREE TRIE TRUE TWO URGE


∎ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻

10. Knuth–Morris–Pratt substring search. (6 points)


Below is a partially-completed Knuth–Morris–Pratt DFA for the string

C C A C C A C B
over the alphabet { A, B, C }. Complete the third row of the table.

     0  1  2  3  4  5  6  7

s    C  C  A  C  C  A  C  B

A    0  0  3  0  0  6  0  0

B    0  0  0  0  0  0  0  8

C

[Figure: blank state diagram with states 0 through 8.]

Feel free to use this diagram for scratch work.
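To check a hand-built table, the standard KMP DFA construction can be sketched as follows. This is a minimal sketch, assuming the alphabet is passed as a string such as "ABC" (row i of the result corresponds to character i of that string); `KmpDfaSketch`, `buildDfa`, and `run` are illustrative names.

```java
final class KmpDfaSketch {
    // Returns dfa[c][j]: the next state when reading alphabet character c in state j.
    static int[][] buildDfa(String pat, String alphabet) {
        int r = alphabet.length();
        int m = pat.length();
        int[][] dfa = new int[r][m];
        dfa[alphabet.indexOf(pat.charAt(0))][0] = 1;
        for (int x = 0, j = 1; j < m; j++) {
            for (int c = 0; c < r; c++)
                dfa[c][j] = dfa[c][x];            // mismatch: copy from restart state x
            int pc = alphabet.indexOf(pat.charAt(j));
            dfa[pc][j] = j + 1;                   // match: advance to the next state
            x = dfa[pc][x];                       // update the restart state
        }
        return dfa;
    }

    // Feeds text through the DFA and returns the final state (stops at the accept state).
    static int run(int[][] dfa, String alphabet, String text) {
        int m = dfa[0].length;
        int state = 0;
        for (int i = 0; i < text.length() && state < m; i++)
            state = dfa[alphabet.indexOf(text.charAt(i))][state];
        return state;
    }
}
```

Calling `buildDfa("CCACCACB", "ABC")` should reproduce the A and B rows printed in the table, which gives some confidence in the remaining row.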


11. Programming assignments. (12 points)


Answer the following questions related to COS 226 programming assignments.

(a) Suppose that in the WordNet assignment, you needed to check whether a digraph G is
a rooted tree (instead of a rooted DAG). A rooted tree is a digraph that contains a root
vertex r such that there is exactly one directed path from every vertex to r.

Which of the following properties hold for all rooted trees? Mark all that apply.

◻ There is exactly one vertex of outdegree 0.

◻ There is exactly one vertex of indegree 0.

◻ There are no directed cycles.

◻ There is a directed path between every pair of vertices.

◻ There are V − 1 edges, where V is the number of vertices.

◻ There are E − 1 vertices, where E is the number of edges.

(b) In the Seam Carving assignment, what is the worst-case running time of an efficient
algorithm for finding a horizontal seam of minimum total energy in a picture of width
W and height H? Mark the best answer.

W    H    W + H    WH    WH^2    W^2H
# # # # # #

(c) Suppose that you compress the text of Algorithms, 4th edition using one of the following
sequences of transformations:
A. Huffman coding
B. Burrows–Wheeler transform
C. Burrows–Wheeler transform → Huffman coding
D. Burrows–Wheeler transform → move-to-front coding → Huffman coding
E. Huffman coding → Burrows–Wheeler transform

Which of the following can you infer? Mark all that apply.

◻ A achieves a better compression ratio than B.

◻ C achieves a better compression ratio than A.

◻ E achieves a better compression ratio than A.

◻ D achieves the best compression ratio among A–E.

(d) In which of the following programming assignments was the super-source trick (implicitly
or explicitly adding a source vertex to convert a graph or digraph with multiple sources
into one with a single source) a key component in improving the order of growth of the
running time? Mark all that apply.

◻ Assignment 1 (Percolation)

◻ Assignment 2 (Deques and Randomized Queues)

◻ Assignment 3 (Autocomplete)

◻ Assignment 4 (8-Puzzle)

◻ Assignment 5 (Kd-Trees)

◻ Assignment 6 (WordNet)

◻ Assignment 7 (Seam Carving)

◻ Assignment 8 (Burrows–Wheeler)

12. Properties of minimum spanning trees. (5 points)


Let G be a connected graph with distinct edge weights. Let S be a cut that contains exactly 4
crossing edges e1 , e2 , e3 , and e4 such that weight(e1 ) < weight(e2 ) < weight(e3 ) < weight(e4 ).
For each statement below, write the letter corresponding to the best-matching description.

A. True for every such edge-weighted graph G and every such cut S.

B. False for every such edge-weighted graph G and every such cut S.

C. Neither A nor B.

Statements:

Kruskal’s algorithm adds edge e1 to the MST.

Prim’s algorithm adds edge e4 to the MST.

If Kruskal’s algorithm adds edges e1 , e2 , and e4 to the MST, then it also adds e3 .

If edges e1 and e2 are both in the MST, then Kruskal’s algorithm adds e1 to the MST before e2 .

If edges e1 and e2 are both in the MST, then Prim’s algorithm adds e1 to the MST before e2 .

13. Properties of shortest paths. (5 points)


Let G be any DAG with positive edge weights and assume all vertices are reachable from the
source vertex s. For each statement below, identify whether it is a property of Dijkstra’s
algorithm and/or the topological sort algorithm by writing the letter corresponding to the
best-matching term.

A. Dijkstra’s algorithm.

B. Topological sort algorithm.

C. Both A and B.

D. Neither A nor B.

Statements:

If G contains the edge v → w, then vertex v is relaxed before vertex w.

Each vertex is relaxed at most once.

If the length of the shortest path from s to v is less than the length of the shortest path from s to w, then vertex v is relaxed before vertex w.

Immediately after relaxing any edge v → w, distTo[w] is the length of the shortest path from s to w.

During each edge relaxation, for each vertex v, distTo[v] either remains unchanged or decreases.

Recall that relaxing a vertex v means relaxing every edge pointing from v.

14. Regular expressions. (6 points)


Consider the following NFA, where 0 is the start state and 12 is the accept state:

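[Figure: NFA with states 0 through 12 drawn over the symbols ( A A B A ), with several symbol positions left blank; solid arrows are match transitions and dotted arrows are ε-transitions. The transitions are not recoverable from the extracted text.]

(a) Complete the regular expression below so that it matches the same set of strings as the
NFA by writing one of the following symbols in each box:

( ) * + |

( A A B A )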

(b) Suppose that you simulate the NFA with the following input:

A A A A A

In which state(s) could the NFA be after reading the entire input? Mark all that apply.

0 1 2 3 4 5 6 7 8 9 10 11 12
◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻

15. Shortest discount path. (8 points)


Consider the following variant of the shortest path problem.
Shortest-Discount-Path. Given an edge-weighted digraph G with positive edge weights,
a source vertex s, and a destination vertex t ≠ s, find the weight of the shortest discount path
from s to t, where the weight of a discount path is the sum of the weights of the edges in the
path, but with the largest weight in the path discounted by 50%.

For example, in the Shortest-Discount-Path instance below, the shortest path from A to
E is A → B → C → D → E (with weight 120 = 20 + 10 + 30 + 60) but the shortest discount
path is A → B → C → E (with weight 80 = 20 + 10 + 100/2).

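[Figure: edge-weighted digraph G on vertices A–E with source s = A and destination t = E. Edges named in the text: A→B (20), B→C (10), C→D (30), D→E (60), C→E (100). The remaining edges and their weights are not reliably recoverable from the extracted text.]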
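To make the definition of a discount path concrete, the following minimal sketch computes the discounted weight of one given path from its edge weights. It only illustrates the definition, not an algorithm for the problem; `DiscountWeightSketch` and `discountWeight` are illustrative names.

```java
final class DiscountWeightSketch {
    // Sum of the edge weights on a path, with the largest weight discounted by 50%.
    // Assumes all weights are positive, as in the problem statement.
    static double discountWeight(double[] weights) {
        double sum = 0.0;
        double max = 0.0;
        for (double w : weights) {
            sum += w;
            max = Math.max(max, w);
        }
        return sum - max / 2.0;
    }
}
```

With the weights quoted in the text, the path A → B → C → E has edge weights 20, 10, 100 and the path A → B → C → D → E has edge weights 20, 10, 30, 60.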

Design an efficient algorithm for solving the Shortest-Discount-Path problem by solving a
traditional shortest path problem on a related edge-weighted digraph G′ with positive weights.
To demonstrate your algorithm, draw G′ for this Shortest-Discount-Path instance in the
space provided on the facing page.

Draw G′ here. Be sure to specify the weight of each edge and label the source and destination.

Hint: you shouldn’t need more than 10 vertices or 21 edges.

In general, how many vertices and edges does G′ have as a function of V and E?
(where V and E denote the number of vertices and edges in G, respectively)

number of vertices in G′ number of edges in G′


16. Substring of a circular string. (8 points)


Design an algorithm to determine whether a string s is a substring of a circular string t.
Let m denote the length of s and let n denote the length of t. Assume the binary alphabet.
For reference, the following table shows a few examples:

string s circular string t substring

ABBA BB BB BB AB BA BB BB B yes

ABBA BA B BB BBA BB BB BA B yes

BBAABBAABBAABB A BB A yes

ABBA BB BB BB BA BA BB BB no

BAABAAB A BB A no

Give a crisp and concise English description of your algorithm in the space below.

Your answer will be graded for correctness, efficiency, and clarity. For full credit, the order
of growth of the worst-case running time must be m + n.
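The notion of a circular substring can be made concrete with a small reference checker. This is a hedged sketch, not a full-credit answer: it unrolls t until the copy has at least m + n characters and then calls `StringBuilder.indexOf`, which is not guaranteed to run in linear time, so a linear-time matcher such as Knuth–Morris–Pratt would have to replace that call to meet the m + n bound. It assumes, as the table's third row suggests, that s may wrap around t more than once, and it reads that row's circular string as ABBA. `CircularSubstringSketch` is an illustrative name.

```java
final class CircularSubstringSketch {
    // Returns true if s occurs in the circular string t (s may wrap around t repeatedly).
    static boolean isCircularSubstring(String s, String t) {
        if (s.isEmpty()) return true;
        if (t.isEmpty()) return false;
        StringBuilder unrolled = new StringBuilder();
        // Any occurrence starts at one of the n positions of t and spans m characters,
        // so m + n unrolled characters are enough to contain it.
        while (unrolled.length() < s.length() + t.length())
            unrolled.append(t);
        return unrolled.indexOf(s) >= 0;   // not guaranteed linear time
    }
}
```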
This page is provided as scratch paper. If you tear it out, please write your name, NetID, and
precept number in the space provided and return it inside your exam.

Name: NetID: Precept:


COS 226 Algorithms and Data Structures Fall 2018

Final

This exam has 16 questions (including question 0) worth a total of 100 points. You have 180
minutes. This exam is preprocessed by a computer when grading, so please write darkly and
write your answers inside the designated spaces.
Policies. The exam is closed book, except that you are allowed to use a one-page cheatsheet
(8.5-by-11 paper, two sides, in your own handwriting). Electronic devices are prohibited.
Discussing this exam. Discussing the contents of this exam before solutions have been posted
is a violation of the Honor Code.
This exam. Do not remove this exam from this room. In the space provided, write your name and
NetID. Also, mark your exam room and the precept in which you are officially registered. Finally,
write and sign the Honor Code pledge. You may fill in this information now.

Name:

NetID:

Course: COS 126 COS 226


#
Exam room: McCosh 10 Other
# #
P01 P01A P02 P02A P03 P03A P04 P05
Precept:
# # # # # # # #

“I pledge my honor that I will not violate the Honor Code during this examination.”

Signature

0. Initialization. (2 points)
In the space provided on the front of the exam, write your name and NetID; mark your exam
room and the precept in which you are officially registered; write and sign the Honor Code
pledge.

1. Empirical running time. (5 points)


Suppose that you observe the following running times for a program on inputs of size n for
varying values of n.

n time
10,000 1.2 seconds
30,000 2.1 seconds
90,000 3.9 seconds
270,000 7.9 seconds
810,000 16.0 seconds

(a) Estimate the running time of the program (in seconds) for an input of size n = 2,430,000.

seconds

(b) Estimate the order of growth of the running time of the program as a function of n.
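One common way to read a table like this is the power-law hypothesis T(n) ≈ a·n^b: the exponent b is estimated from the ratio of two consecutive measurements, and the same ratio is then used to extrapolate. The sketch below is a minimal illustration under that assumption; `GrowthExponentSketch`, `exponent`, and `extrapolate` are illustrative names.

```java
final class GrowthExponentSketch {
    // Estimates b in T(n) ~ a * n^b from two measurements (n1, t1) and (n2, t2).
    static double exponent(double n1, double t1, double n2, double t2) {
        return Math.log(t2 / t1) / Math.log(n2 / n1);
    }

    // Extrapolates the running time at size n3 from the two measurements.
    static double extrapolate(double n1, double t1, double n2, double t2, double n3) {
        double b = exponent(n1, t1, n2, t2);
        return t2 * Math.pow(n3 / n2, b);
    }
}
```

Using the last two rows of the table, `exponent(270000, 7.9, 810000, 16.0)` gives the estimated exponent, and passing 2,430,000 as `n3` to `extrapolate` gives the corresponding prediction.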

2. Mathematical running time. (5 points)


Let list be a LinkedList object containing a sequence of n characters. For each code
fragment at left, write the letter corresponding to the order of growth of the worst-case
running time as a function of n.
Java’s LinkedList data type represents a sequence of items using a doubly linked list, main-
taining references to the first and last nodes. All operations are implemented in an efficient
manner for the given representation.

A. 1    B. log n    C. n    D. n log n    E. n^2    F. n^3

// convert the list to a string
String s = "";
for (char c : list)
    s += c;

// Knuth shuffle
for (int i = 0; i < list.size(); i++) {
    int r = (int) (Math.random() * (i + 1));
    char c1 = list.get(r);   // get element r
    char c2 = list.get(i);   // get element i
    list.set(r, c2);         // replace element r
    list.set(i, c1);         // replace element i
}

// sort (using Timsort/mergesort)
Collections.sort(list);

// palindrome?
boolean isPalindrome = true;
while (list.size() > 1) {
    char c1 = list.removeFirst();
    char c2 = list.removeLast();
    if (c1 != c2) isPalindrome = false;
}

// create a reverse copy of the list
LinkedList<Character> copy = new LinkedList<Character>();
for (char c : list)
    copy.addFirst(c);

3. String sorts. (5 points)


The column on the left contains the original input of 24 strings to be sorted; the column on
the right contains the strings in sorted order; the other 5 columns contain the contents at
some intermediate step during one of the 3 radix-sorting algorithms listed below. Match each
algorithm by writing its letter in the box under the corresponding column.

6862 1131 5091 1131 3906 5790 1131


7924 1216 1131 1188 9608 9880 1188
1131 1188 2294 1216 8814 7270 1216
8276 2786 5790 2786 1216 1131 2294
9299 2294 1216 2294 7924 7671 2786
5790 3906 5035 3906 8424 6551 3906
1216 5790 2786 5790 1131 5091 5035
7383 5035 3906 5035 5035 6862 5091
8424 5091 1188 5091 9545 7383 5790
3906 6862 6188 6862 6551 7924 6188
9545 6551 6862 6551 9757 8424 6551
7671 6188 6551 6188 6862 8814 6862
9880 7924 9880 7924 7270 2294 7270
6551 7383 7671 7383 7671 9545 7383
1188 7671 9545 7671 8276 5035 7671
2786 7270 9608 7270 9880 8276 7924
9608 8276 8424 8276 7383 1216 8276
5035 8424 9757 8424 2786 3906 8424
9757 8814 8814 8814 1188 2786 8814
8814 9299 7383 9299 6188 9757 9299
2294 9545 9299 9545 5790 1188 9545
6188 9880 8276 9880 5091 9608 9608
5091 9608 7270 9608 2294 6188 9757
7270 9757 7924 9757 9299 9299 9880

A E

A. Original input

B. LSD radix sort

C. MSD radix sort

D. 3-way radix quicksort (no shuffle)

E. Sorted

4. Depth-first search. (6 points)


Run depth-first search on the following digraph, starting from vertex 0. Assume the adjacency
lists are in sorted order: for example, when iterating over the edges pointing from 6, consider
the edge 6→1 before either 6→5 or 6→7.

[Figure: digraph on vertices 0–9, drawn in two rows (top: 0 1 2 3 4; bottom: 5 6 7 8 9), with DFS starting from vertex 0. The edges are not recoverable from the extracted text.]

(a) List the 10 vertices in preorder.

0

(b) List the 10 vertices in postorder.

0

5. Breadth-first search. (6 points)


Run breadth-first search on the following digraph, starting from vertex 0. Assume the adja-
cency lists are in sorted order: for example, when iterating over the edges pointing from 8,
consider the edge 8→3 before either 8→4 or 8→9.

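[Figure: digraph on vertices 0–9, drawn in two rows (top: 0 1 2 3 4; bottom: 5 6 7 8 9), with BFS starting from vertex 0. The edges are not recoverable from the extracted text.]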

(a) List the 10 vertices in the order in which they are added to the queue.

(b) Give the entries in the edgeTo[] array upon termination of breadth-first search.

v 0 1 2 3 4 5 6 7 8 9
edgeTo[v] –

6. Minimum spanning tree. (6 points)
Consider the following edge-weighted graph G containing 10 vertices and 17 edges. The thick
black edges T define a spanning tree of G but not a minimum spanning tree of G.

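[Figure: edge-weighted graph G on vertices A–J, drawn in two rows (top: A B C D E; bottom: F G H I J), with the spanning tree T in thick black. Horizontal edge weights recoverable from the extracted text: A–B 140, B–C 120, C–D 20, D–E 170, F–G 70, G–H 30, H–I 110, I–J 10. The weights of the remaining edges are not reliably recoverable.]

(a) Find a cut in G whose minimum weight crossing edge is not an edge in T .
Mark the vertices on the side of the cut containing vertex A.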

A B C D E F G H I J
∎ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻

(b) Which of the following edges are in the MST of G? Mark all that apply.

A–B B–C B–G B–H C–H D–H D–I D–J H–I


◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻

7. Maximum flow. (8 points)


Consider the following flow network and maximum flow f ∗ .

[Figure: flow network on vertices A–J with source A and sink J; each edge is labeled flow / capacity. Horizontal edges recoverable from the extracted text: A–B 19 / 29, B–C 6 / 6, C–D 19 / 19, D–E 9 / 9, F–G 10 / 14, G–H 37 / 40, H–I 37 / 37, I–J 38 / 38. The labels and directions of the remaining edges are not recoverable.]

(a) What is the value of the flow f ∗ ?

(b) What is the capacity of the cut {A, B, C}?

(c) What is the net flow across the cut {A, B, C}?

(d) Which vertices are on the source side of the minimum cut? Mark all that apply.

A B C D E F G H I J
∎ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻
(e) Mark each edge below if increasing its capacity by 1 would increase the value of the
maximum flow by exactly 1.

A→F A→G B →C I →C I →J H →I

◻ ◻ ◻ ◻ ◻ ◻

8. Huffman compression. (6 points)


Consider running Huffman compression over an alphabet of 16 characters with a given frequency
distribution of characters (i.e., entry i is how many times character i appears in the
input). For each frequency distribution below, write the length of the longest codeword.

{ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }
(equal frequencies)

{ 1, 2, 4, 8, 16, 32, 64, 128, . . . , 2^15 }


(powers of 2)

{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, . . . , 16 }
(positive integers)

{ 1, 1, 2, 3, 5, 8, 13, 21, 34, . . . , 987 }


(Fibonacci numbers)
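The length of the longest Huffman codeword equals the height of the Huffman trie, which can be tracked while merging without building the trie itself. The following is a minimal sketch of that idea; `HuffmanDepthSketch` and `longestCodeword` are illustrative names, and when several subtrees have equal weight the result can depend on how ties are broken, so a particular implementation may report a different (equally optimal) shape.

```java
import java.util.PriorityQueue;

final class HuffmanDepthSketch {
    // Returns the height of a Huffman trie built from the given frequencies.
    static int longestCodeword(long[] freq) {
        if (freq.length == 0) return 0;
        // Each entry is {weight of subtree, height of subtree}.
        PriorityQueue<long[]> pq = new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        for (long f : freq) pq.add(new long[] { f, 0 });
        while (pq.size() > 1) {
            long[] x = pq.poll();   // two smallest-weight subtrees
            long[] y = pq.poll();
            pq.add(new long[] { x[0] + y[0], Math.max(x[1], y[1]) + 1 });
        }
        return (int) pq.peek()[1];
    }
}
```

Small inputs such as `{1, 1, 1, 1}` (a balanced trie) and `{1, 2, 4, 8}` (a maximally skewed trie) show the two extremes.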

9. LZW compression. (6 points)


Compress the following string of length 15 using LZW compression.
A A B C B C A B B B C B C A C
As usual, assume that the original encoding table consists of all 7-bit ASCII characters and
uses 8-bit codewords. Recall that codeword 80 is reserved to signify end of file.

(a) Give the resulting sequence of 11 two-digit hexadecimal integers in the space below.

41 80

(b) Which of the following strings are in the LZW dictionary upon termination of the algo-
rithm? Mark all that apply.

AA AB ABB ABBB ABC BB BC BCA BCAC BCB BCBC CB
◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻

For reference, this is the hexadecimal-to-ASCII conversion table from the textbook.

    0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
0   NUL SOH STX ETX EOT ENQ ACK BEL BS  HT  LF  VT  FF  CR  SO  SI
1   DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US
2   SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL
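For practice with the LZW compression question above, here is a minimal sketch of LZW compression under the same conventions (7-bit ASCII starting table, 8-bit codewords, codeword 80 for end of file). It uses a `HashMap` for longest-prefix lookup instead of the trie used in the textbook implementation, and it assumes every input character is 7-bit ASCII; `LzwCompressSketch` and `compress` are illustrative names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LzwCompressSketch {
    private static final int R = 0x80;    // 128 ASCII characters; 0x80 is the EOF codeword
    private static final int L = 0x100;   // number of 8-bit codewords

    // Returns the LZW codeword sequence for the input, ending with the EOF codeword.
    static List<Integer> compress(String input) {
        Map<String, Integer> table = new HashMap<>();
        for (int c = 0; c < R; c++) table.put(String.valueOf((char) c), c);
        int next = R + 1;                 // first free codeword after EOF

        List<Integer> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            // Extend the match while the longer prefix is still in the table.
            int j = i + 1;
            while (j < input.length() && table.containsKey(input.substring(i, j + 1))) j++;
            out.add(table.get(input.substring(i, j)));
            // Add the matched prefix plus one lookahead character as a new entry.
            if (j < input.length() && next < L) table.put(input.substring(i, j + 1), next++);
            i = j;
        }
        out.add(R);
        return out;
    }
}
```

The greedy extension works because every dictionary entry is a previous entry plus one character, so the set of keys is closed under taking prefixes.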

10. Knuth–Morris–Pratt substring search. (6 points)


Consider the Knuth–Morris–Pratt DFA for the string

C C A C C A C B

over the alphabet { A, B, C }.

(a) In which state is the DFA after consuming the following sequence of characters?

A B C

0 1 2 3 4 5 6 7 8
# # # # # # # # #

(b) In which state is the DFA after consuming the following sequence of characters?

C C B A C C A C C

0 1 2 3 4 5 6 7 8
# # # # # # # # #

(c) In which state is the DFA after consuming the following sequence of characters?

A C C C C A C C A C C C A C C C A C C C A C C A C C A C C A C

0 1 2 3 4 5 6 7 8
# # # # # # # # #

11. Properties of shortest paths. (6 points)


For each statement below, identify whether it is a property of Dijkstra’s algorithm and/or
the Bellman–Ford algorithm by writing the letter corresponding to the best-matching term.
Assume that the digraph has positive edge weights and that all vertices are reachable from the
source vertex s. Recall that relaxing a vertex v means relaxing every edge pointing from v. As
usual, E denotes the number of edges and V denotes the number of vertices.

A. Dijkstra’s algorithm (using a binary heap for PQ)

B. Bellman–Ford algorithm (queue-based implementation)

C. Both A and B.

D. Neither A nor B.

Statements:

Each vertex is relaxed at most once.

Throughout the algorithm, distTo[v] is either infinite or the length of some directed path from s to v.

When relaxing edge v → w, distTo[w] either remains unchanged or decreases.

If the length of the shortest path from s to v is less than the length of the shortest path from s to w, then vertex v is not the last vertex relaxed.

In the worst case, the order of growth of the running time is EV .

In the best case, the order of growth of the running time is E + V .

12. Why did we do that? (8 points)


For each pair of algorithms or data structures, identify a critical reason why we prefer the
first to the second. Mark the best answer.

A. Guarantees correctness.

B. Improves worst-case running time.

C. Uses less memory.

D. Simpler to code.

Pairs:

Use a queue instead of a stack to store the vertices to be processed during breadth-first search of a graph.

Use reverse postorder traversal instead of preorder traversal to compute a topological order in a DAG.

Process the edges in ascending order of weight in Kruskal’s algorithm instead of descending order.

Use Knuth–Morris–Pratt instead of brute-force for substring search.

Use a stable sorting algorithm (key-indexed counting) instead of an unstable one to rearrange the strings as a subroutine of LSD radix sort.

Form an array of Suffix objects instead of an array of String objects when suffix sorting a string.

Use a ternary-search trie instead of a 256-way trie for a string symbol table over the extended ASCII alphabet.

Initialize right[c] in Boyer–Moore to contain the index of the rightmost occurrence of character c instead of the leftmost occurrence.

13. Regular expressions. (6 points)


Consider the NFA that results from applying the RE-to-NFA construction algorithm from
lecture and the textbook to the regular expression

( A * | ( A | B C ) * )

The states and match transitions (solid lines) are shown below, but most of the ε-transitions
(dotted lines) are suppressed.

[Figure: NFA with states 0 through 12, one state per symbol of ( A * | ( A | B C ) * ) plus the accept state 12; solid arrows are match transitions and dotted arrows are ε-transitions.]

(a) Which of the following are edges in the full ε-transition digraph? Mark all that apply.
0→1 0→3 0→4 0→9 1→2 1→4 2→1 3→0 3→11 4→7
∎ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻
4→9 4→10 6→7 6→9 7→6 9→4 9→10 9→12 10→2 10→4
◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻

(b) Suppose that you simulate the NFA with the following input:

A A A A A A A A

In which states could the NFA be after consuming the entire input? Mark all that apply.

0 1 2 3 4 5 6 7 8 9 10 11 12
◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻ ◻

14. Prefix count data structure. (10 points)
Design a data structure that supports inserting strings and prefix-count queries. A prefix-count
query returns the number of strings inserted into the data structure (including duplicates)
that start with a given prefix. To do so, describe how to implement this API:

public class PrefixCount

PrefixCount() create an empty data type

void insert(String s) add the string to the data structure

int prefixCount(String prefix) number of strings that start with prefix

Here is an example:

PrefixCount pc = new PrefixCount();


pc.insert("ANNA");
pc.insert("BELLA");
pc.insert("ANNABELLA");
pc.insert("AN");
pc.prefixCount("ANNA"); // 2
pc.prefixCount("BELL"); // 1
pc.insert("ANNA"); // duplicate
pc.insert("ANNABEL");
pc.prefixCount("ANNA"); // 4
pc.prefixCount("BANANA"); // 0

Your answer will be graded for correctness, efficiency, and clarity (but not precise Java syntax).
For full credit, the PrefixCount constructor must take constant time; insert() must
take time proportional to RL (or better); and prefixCount() must take time proportional to
L (or better), where L is the length of the string argument and R is the alphabet size.
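One possible design that appears to meet these bounds is an R-way trie in which every node stores how many inserted strings pass through it. The sketch below is a study aid under stated assumptions, not an official solution: it assumes extended ASCII (R = 256, so characters above 255 are not supported), creates the root lazily so that the constructor does constant work, and uses a nested `Node` class of its own.

```java
final class PrefixCount {
    private static final int R = 256;     // assumed alphabet size (extended ASCII)

    private static final class Node {
        private int count;                // number of inserted strings passing through this node
        private final Node[] next = new Node[R];
    }

    private Node root;                    // created on first insert

    // Adds the string; creates at most L + 1 nodes of size R each, so time is proportional to RL.
    void insert(String s) {
        if (root == null) root = new Node();
        Node x = root;
        x.count++;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
            x.count++;
        }
    }

    // Follows the prefix from the root and returns the count stored at the node reached.
    int prefixCount(String prefix) {
        Node x = root;
        for (int i = 0; i < prefix.length() && x != null; i++)
            x = x.next[prefix.charAt(i)];
        return (x == null) ? 0 : x.count;
    }
}
```

Duplicates are handled automatically, because inserting the same string again increments the same counters along its path.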

(a) In the space below, declare the Java instance variables for your PrefixCount data type
using Java code. You may define nested classes and/or use any of the data types that
we have considered in this course (either algs4.jar or java.util versions).

public class PrefixCount {

}

(b) Describe how to implement insert(), using either Java code or concise prose. If it is
similar to an algorithm that we implemented in class, just say so and focus your answer
on the part that is different.

(c) Describe how to implement prefixCount(), using either Java code or concise prose. If
it is similar to an algorithm that we implemented in class, just say so and focus your
answer on the part that is different.

15. Shortest directed cycle containing a given vertex. (9 points)


Given a digraph G with positive edge weights and a distinguished vertex s, design an algorithm
to find a shortest directed cycle that contains s (or report that no such cycle exists). To do
so, solve a source–sink shortest-paths problem on a related edge-weighted digraph.

For full credit, the order of growth of running time must be E log V (or better) in the worst
case, where E is the number of edges and V is the number of vertices. For simplicity, assume
no parallel edges or self loops.

[Figure: edge-weighted digraph G on vertices A–E with distinguished vertex A. Edges named in the text: A→B (20), B→C (10), C→D (30), D→A (80). The figure also shows edge weights 40, 50, 60 (between D and E), and 70, whose endpoints are not reliably recoverable from the extracted text.]

The shortest directed cycle containing A is A–B–C–D–A
and has weight 140 (20 + 10 + 30 + 80).

(a) Draw the source–sink shortest-paths problem that you would construct in order to find
the shortest directed cycle containing A in the 5-vertex digraph shown above. Be sure
to label the source and sink vertices and include the edge weights.

(b) Give a crisp and concise English description of your algorithm in the space below.

Your answer will be graded for correctness, efficiency, and clarity.
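One possible reduction, sketched here as a study aid rather than an official solution, is to add a copy s′ of the distinguished vertex, redirect every edge entering s so that it enters s′ instead, and run Dijkstra’s algorithm from s to s′. The code below assumes vertices are numbered 0 to V − 1, edges are given as {from, to, weight} triples with positive integer weights, and −1 is returned when no cycle exists; `ShortestCycleSketch` and its method are illustrative names, and a lazy `java.util.PriorityQueue` stands in for an indexed priority queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

final class ShortestCycleSketch {
    // Returns the weight of a shortest directed cycle through s, or -1 if there is none.
    static long shortestCycleWeight(int V, int[][] edges, int s) {
        int sink = V;                                   // extra vertex s' ("back at s")
        List<List<int[]>> adj = new ArrayList<>();
        for (int v = 0; v <= V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            int to = (e[1] == s) ? sink : e[1];         // redirect edges entering s to s'
            adj.get(e[0]).add(new int[] { to, e[2] });
        }

        long[] dist = new long[V + 1];
        Arrays.fill(dist, Long.MAX_VALUE);
        dist[s] = 0;
        // Lazy Dijkstra: queue entries are {distance, vertex}; stale entries are skipped.
        PriorityQueue<long[]> pq = new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        pq.add(new long[] { 0, s });
        while (!pq.isEmpty()) {
            long[] top = pq.poll();
            int v = (int) top[1];
            if (top[0] > dist[v]) continue;
            if (v == sink) return dist[v];
            for (int[] e : adj.get(v)) {
                long d = dist[v] + e[1];
                if (d < dist[e[0]]) {
                    dist[e[0]] = d;
                    pq.add(new long[] { d, e[0] });
                }
            }
        }
        return -1;
    }
}
```

Any s-to-s′ path in the modified digraph corresponds to a directed cycle through s in G of the same weight, and vice versa, which is why a single shortest-paths computation suffices.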


COS 226 Algorithms and Data Structures Fall 2019

Final

This exam has 16 questions (including question 0) worth a total of 100 points. You have 180
minutes. This exam is preprocessed by a computer when grading, so please write darkly and
write your answers inside the designated spaces.
Policies. The exam is closed book, except that you are allowed to use a one-page cheatsheet
(8.5-by-11 paper, two sides, in your own handwriting). Electronic devices are prohibited.
Discussing this exam. Discussing the contents of this exam before solutions have been posted
is a violation of the Honor Code.
This exam. Do not remove this exam from this room. In the space provided, write your name and
NetID. Also, mark your exam room and the precept in which you are officially registered. Finally,
write and sign the Honor Code pledge. You may fill in this information now.

Name:

NetID:

Course: COS 126 COS 226


#
Exam room: McCosh 50 Other
# #
P01 P02 P04 P05 P07 P08 P09 P10
Precept:
# # # # # # # #

“I pledge my honor that I will not violate the Honor Code during this examination.”

Signature

0. Initialization. (1 point)
In the space provided on the front of the exam, write your name and NetID; mark your exam
room and the precept in which you are officially registered; write and sign the Honor Code
pledge.

1. Empirical running time. (6 points)


Suppose that you observe the following running times (in seconds) for a program on graphs
with V vertices and E edges.

E
10,000 20,000 40,000 80,000 160,000 320,000
10,000 6.25 8.84 12.50 17.68 25.00 35.36
20,000 12.50 17.68 25.00 35.36 50.00 70.71
V 40,000 25.00 35.36 50.00 70.71 100.00 141.42
80,000 50.00 70.71 100.00 141.42 200.00 282.84
160,000 100.00 141.42 200.00 282.84 400.00 565.69
320,000 200.00 282.84 400.00 565.69 800.00 1131.37

running time of the program (in seconds)

(a) Estimate the running time of the program (in seconds) for a graph with V = 640,000
vertices and E = 640,000 edges.

seconds

(b) Estimate the order of growth of the running time of the program as a function of both
V and E.

2. Memory. (4 points)
Suppose that you implement a symbol table (containing string keys and integer values) using
an r-way trie with the following data type:

public class RwayTrie {
    private final int r;
    private Node root;

    public RwayTrie(int r) {
        this.r = r;
        root = null;
    }

    private class Node {
        private final int value;
        private Node[] next;

        private Node(int value) {
            this.value = value;
            this.next = new Node[r];
        }
    }

    ...
}

Using the 64-bit memory cost model from lecture and the textbook, how much memory does
each Node object use? Count all memory allocated when a Node object is constructed. Write
your answer as a function of r.

bytes

3. String sorts. (5 points)


The column on the left contains the original input of 24 strings to be sorted; the column on
the right contains the strings in sorted order; the other 5 columns contain the contents at
some intermediate step during one of the 3 radix-sorting algorithms listed below. Match each
algorithm by writing its letter in the box under the corresponding column.

You may use each letter once, more than once, or not at all.

0 4170 1233 3963 4170 9601 1018 1018


1 9601 1866 2145 9601 5601 1233 1233
2 8287 1018 1018 5601 4514 1866 1866
3 6853 2119 2119 5052 1018 2119 2119
4 5185 2145 3923 9152 2119 2145 2145
5 5052 3923 1866 7722 7722 3923 3923
6 9152 3963 1233 6853 3923 3963 3963
7 3923 4170 4514 3923 4528 4170 4170
8 9388 4528 4170 1233 6728 4528 4435
9 1233 4514 4435 7453 1233 4514 4514
10 4528 4435 4528 3963 4435 4435 4528
11 8587 5185 7453 4514 2145 5185 5052
12 7453 5052 6728 5185 5052 5052 5185
13 6728 5601 8587 4435 9152 5601 5601
14 1866 6853 9388 2145 6853 6853 6728
15 2119 6728 9152 1866 7453 6728 6853
16 1018 7453 5052 7056 7056 7453 7056
17 4514 7056 5185 8287 3963 7056 7453
18 4435 7722 6853 8587 1866 7722 7722
19 2145 8287 8287 9388 4170 8287 8287
20 3963 8587 7056 4528 5185 8587 8587
21 7056 9601 5601 6728 8287 9601 9152
22 5601 9152 7722 1018 8587 9152 9388
23 7722 9388 9601 2119 9388 9388 9601

A E

A. Original input

B. LSD radix sort

C. MSD radix sort

D. 3-way radix quicksort (no shuffle)

E. Sorted

4. Depth-first search. (6 points)


Run depth-first search on the following digraph, starting from vertex 0. Assume the adjacency
lists are in sorted order: for example, when iterating over the edges leaving vertex 3, consider
the edge 3→2 before either 3→4 or 3→8.

[Figure: digraph on vertices 0–9, drawn in two rows (top: 0 1 2 3 4; bottom: 5 6 7 8 9), with DFS starting from vertex 0. The edges are not recoverable from the extracted text.]

(a) List the 10 vertices in preorder.

0

(b) List the 10 vertices in postorder.

(c) The above digraph does not have a topological order. If, however, you delete one edge,
it will have a topological order. Which edge?

5. Breadth-first search. (7 points)


Consider the following buggy implementation of breadth-first search in a digraph.

private void bfs(Digraph G, int s) {
    marked = new boolean[G.V()];
    distTo = new int[G.V()];
    Queue<Integer> queue = new Queue<Integer>();

    queue.enqueue(s);
    while (!queue.isEmpty()) {
        int v = queue.dequeue();
        for (int w : G.adj(v)) {
            if (!marked[w]) {
                distTo[w] = distTo[v] + 1;
                queue.enqueue(w);
            }
        }
    }
}

(a) Suppose that you run the code fragment on the following DAG, starting from s = 0.
Mark all statements below that are true.

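[Figure: DAG on vertices 0–9, drawn in two rows (top: 0 1 2 3 4; bottom: 5 6 7 8 9); the code is run from vertex 0. The edges are not recoverable from the extracted text.]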

∎ It terminates.

◻ Some vertices are added to the queue more than once.

◻ At some point, the queue contains multiple copies of the same vertex.

◻ Upon termination, distTo[1] is 1 (the length of the shortest path from 0 to 1).

◻ Upon termination, distTo[9] is 9 (the length of the longest path from 0 to 9).

(b) Annotate the code above to correct it.
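For comparison after attempting part (b), one way to write a correct breadth-first search is to mark each vertex at the moment it is enqueued, including the source. The sketch below is self-contained and therefore departs from the course API: it takes adjacency lists as an `int[][]` (an assumption made for illustration), uses `java.util.ArrayDeque` as the queue, and reports unreachable vertices with distance −1.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

final class BfsSketch {
    // adj[v] lists the vertices that v points to; returns shortest edge counts from s.
    static int[] bfs(int[][] adj, int s) {
        boolean[] marked = new boolean[adj.length];
        int[] distTo = new int[adj.length];
        Arrays.fill(distTo, -1);          // -1 means "not reachable from s"
        Queue<Integer> queue = new ArrayDeque<>();

        marked[s] = true;                 // mark the source before the loop
        distTo[s] = 0;
        queue.add(s);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : adj[v]) {
                if (!marked[w]) {
                    marked[w] = true;     // mark when enqueued, so w is enqueued only once
                    distTo[w] = distTo[v] + 1;
                    queue.add(w);
                }
            }
        }
        return distTo;
    }
}
```

Because each vertex is marked before it enters the queue, it is enqueued at most once and its distance is set exactly once, by the first (shortest) discovery.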



6. Minimum spanning tree. (6 points)
Consider the following edge-weighted graph.

run Prim from here

10 20 70 s

40
100 50 30 80 60 130

110 90 120

(a) List the weights of the MST edges in the order that Kruskal’s algorithm adds them to
the MST.

Kruskal: 10 20 30 60 70 100 120

(b) List the weights of the MST edges in the order that Prim’s algorithm adds them to the
MST. Start Prim’s algorithm from vertex s.

7. Knuth–Morris–Pratt substring search. (6 points)


Consider the Knuth–Morris–Pratt DFA for the following string of length 8 over the alphabet
{ A, B, C }:

B B C B B B C A

(a) Complete the last three columns of this partially-completed DFA table.

0 1 2 3 4 5 6 7

A 0 0 0 0 0

B 1 2 2 4 5

C 0 0 3 0 0

(b) In which state is the DFA after consuming the following sequence of characters?
Mark the correct answer.

C B B B B C B B C C B B B C B B B C B B B C B B C B B C B B C B

0 1 2 3 4 5 6 7 8
# # # # # # # # #
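
As an aid for checking a table like the one in part (a), the following is a minimal sketch of the standard KMP DFA construction (copy the mismatch transitions from the restart state, set the match transition, then advance the restart state). The class name KmpDfaSketch and the convention of passing the alphabet as a string such as "ABC" are assumptions of this sketch.

```java
class KmpDfaSketch {
    // Builds dfa[c][j]: the next state from state j on the character whose
    // index in the alphabet string is c.
    static int[][] buildDfa(String pat, String alphabet) {
        int m = pat.length();
        int r = alphabet.length();
        int[][] dfa = new int[r][m];
        dfa[alphabet.indexOf(pat.charAt(0))][0] = 1;
        for (int x = 0, j = 1; j < m; j++) {
            for (int c = 0; c < r; c++)
                dfa[c][j] = dfa[c][x];     // mismatch cases: copy from restart state
            int pc = alphabet.indexOf(pat.charAt(j));
            dfa[pc][j] = j + 1;            // match case: advance
            x = dfa[pc][x];                // update restart state
        }
        return dfa;
    }

    // Runs the DFA over txt from state 0 and returns the final state;
    // stops early if state m (pattern found) is reached.
    static int run(int[][] dfa, String alphabet, String txt) {
        int m = dfa[0].length;
        int state = 0;
        for (int i = 0; i < txt.length() && state < m; i++)
            state = dfa[alphabet.indexOf(txt.charAt(i))][state];
        return state;
    }
}
```

Calling buildDfa("BBCBBBCA", "ABC") reproduces the five columns already given in the table, and run can then be used to trace an input such as the one in part (b).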

8. Java String library performance. (8 points)


For each of the String expressions at left, write the letter of the best-matching worst-case
running time (as a function of m and n) at right, where
• s is a string of length n
• t and regexp are strings of length m
• m≤n

Assume the standard (Oracle or OpenJDK) Java 8 representation and implementation for
the String data type. You may use each letter once, more than once, or not at all.

String expressions:

    s.length() + t.length()

    s.charAt(n/2)

    s.substring(n/2, n)

    s.equals(t)

    s.indexOf(t)

    s.matches(regexp)

    s += t

    for (int i = 0; i < s.length(); i++)
        t += s.charAt(i);

Worst-case running times:

    A. 1
    B. log m
    C. log n
    D. m
    E. n
    F. m^2
    G. mn
    H. n^2
    I. 2^n

9. Burrows–Wheeler data compression. (5 points)


Consider compressing strings of length 6n that contain n copies of X X Y Y Z Z concatenated
together. For example, here is the string corresponding to n = 5.

X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z

For each transformation at left, determine the compression ratio (as a function of n) and write
the letter of the best-matching term at right. As usual, assume the alphabet size R = 256.

You may use each letter once, more than once, or not at all.

Transformations:

    Move-to-front encoding.

    Burrows–Wheeler transform.

    Huffman compression.

    Move-to-front encoding, followed by Huffman compression.

    Burrows–Wheeler transform, followed by move-to-front encoding,
    followed by Huffman compression.

Compression ratios:

    A. ∼ 1
    B. ∼ 7/8
    C. ∼ 1/2
    D. ∼ 3/8
    E. ∼ 5/24
    F. ∼ 1/4
    G. ∼ 3/16
    H. ∼ 1/8
    I. ∼ 1/16
    J. ∼ 5/768
    K. ∼ 1/256

10. Why did we do that? (8 points)


For each pair of algorithms or data structures, identify a critical reason why we prefer the
first to the second. Write the letter of the best-matching answer.

You may use each letter once, more than once, or not at all.

Pairs:

    Use adjacency lists instead of an adjacency-matrix to represent a sparse
    undirected graph.

    Use union–find instead of depth-first search for cycle detection in
    Kruskal’s algorithm.

    Relax the vertices in increasing order of distance from the source in
    Dijkstra’s algorithm instead of in reverse DFS postorder to compute
    shortest paths in digraphs with positive edge weights.

    Use 3-way radix quicksort instead of mergesort to sort an array of strings.

    Use Boyer–Moore instead of Knuth–Morris–Pratt for substring search.

    Use depth-first search instead of breadth-first search to compute a
    topological order in a directed acyclic graph.

    Use depth-first search instead of breadth-first search to determine all
    vertices reachable from a set of vertices in NFA simulation.

    Use breadth-first search instead of depth-first search to find a shortest
    ancestral path in the WordNet assignment.

Reasons:

    A. Guarantees correctness.
    B. Improves performance in practice.
    C. None of the above.

11. Shortest paths. (7 points)


Given a digraph G with positive edge weights, complete the constructor below to compute
the length of the shortest path from s to each vertex. To do so, write the letter of one of the
following code fragments in each provided space.

A. (int i = 1; i < G.V(); i++)
B. (int i = 1; i < G.E(); i++)
C. (int v = 0; v < G.V(); v++)
D. (int v = 0; v < G.E(); v++)
E. (DirectedEdge e : G.adj(v))
F. Double.NEGATIVE_INFINITY
G. Double.POSITIVE_INFINITY
H. distTo[v] + e.weight()
I. distTo[w] + e.weight()
J. 0
K. distTo[v]
L. distTo[w]

You may use each letter once, more than once, or not at all. No other code is allowed.

public BellmanFordSP(EdgeWeightedDigraph G, int s) {

    distTo = new double[G.V()];

    for (int v = 0; v < G.V(); v++)
        distTo[v] = _________ ;

    distTo[s] = _________ ;

    for _________ {
        for _________ {
            for _________ {
                int w = e.to();
                if ( _________ > _________ )
                    _________ = _________ ;
            }
        }
    }
}
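
The following is a minimal, self-contained sketch of the same Bellman–Ford structure (V − 1 passes, each relaxing every edge). It is an illustration rather than the graded answer: the class name BellmanFordSketch and the parallel arrays to[v][k] and weight[v][k], which stand in for the edges leaving vertex v, are assumptions that replace the EdgeWeightedDigraph type used in the question.

```java
import java.util.Arrays;

class BellmanFordSketch {
    // to[v][k] is the head of the k-th edge leaving v; weight[v][k] is its weight.
    // Returns distTo[], with Double.POSITIVE_INFINITY for unreachable vertices.
    static double[] shortestPaths(int[][] to, double[][] weight, int s) {
        int n = to.length;
        double[] distTo = new double[n];
        Arrays.fill(distTo, Double.POSITIVE_INFINITY);
        distTo[s] = 0.0;

        for (int i = 1; i < n; i++) {                  // V - 1 passes
            for (int v = 0; v < n; v++) {              // every vertex ...
                for (int k = 0; k < to[v].length; k++) {   // ... and every edge v -> w
                    int w = to[v][k];
                    if (distTo[w] > distTo[v] + weight[v][k])   // relax v -> w
                        distTo[w] = distTo[v] + weight[v][k];
                }
            }
        }
        return distTo;
    }
}
```

Because the edge weights are positive, there are no negative cycles, so V − 1 passes suffice for every distTo[] entry to reach its final value.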

12. Ternary search tries. (5 points)


Consider the following TST, where the integer values are shown next to the nodes of the
corresponding string keys. Each node labeled with a ? contains some uppercase letter (possibly
different for each node).

?

D R U 11

A G 5 I U

T R E 6 E 7 O 8 E

1 A ? 2 K 3 R 4 U 10 N

9 E

yes: PRO FUN QUEUE HUE DATA TRIE WRIE DO
no: BRIE TRUE DARK ERIE DAD

Which of the following string keys are (or could possibly be) in the TST? Mark all that apply.

DATA BRIE DAD DARK DO FUN


∎ ◻ ◻ ◻ ◻ ◻
HUE PRO QUEUE TRIE TRUE
◻ ◻ ◻ ◻ ◻

13. Regular expressions. (6 points)


Consider the NFA that results from applying the RE-to-NFA construction algorithm from
lecture and the textbook to the regular expression

( ( A | B C ) * B * )

The states and match transitions (solid lines) are shown below, but some of the ε-transitions
(dotted lines) are suppressed.

ε-transition

0 1 2 3 4 5 6 7 8 9 10 11

( ( A | B C ) * B * )

match transition

(a) Mark all edges in the ε-transition digraph.

0→1   1→2   1→3   1→4   1→7   3→4   3→5   3→6   6→7
 ∎     ∎     ◻     ◻     ◻     ◻     ◻     ◻     ∎

7→1   7→3   7→6   7→8   8→9   9→1   9→8   9→10  10→11
 ◻     ◻     ◻     ∎     ◻     ◻     ◻     ∎     ∎

(b) Suppose that you want to construct an NFA for the regular expression

( ( A | B C ) ? B * )
where the operator ? means zero or one copy of the expression that precedes it. What
minimal change(s) would you make (e.g., adding or removing ε-transitions) to the NFA
you defined in part (a)?

14. Prefix-free codes. (10 points)

(a) For a final exam question, an absentminded professor created a Huffman code for a set of
7 symbols. Unfortunately, she forgot to write down the codeword for one of the symbols.

symbol codeword

A 00

B 01100

E 10

P 0111

R 01101

S ?

T 11

Deduce the codeword associated with the symbol S.

(b) Given a Huffman code (or optimal prefix-free code) for a set of n ≥ 3 symbols, with one
codeword missing, design an algorithm to deduce the missing codeword. The input to
the problem is an array of the n − 1 known codewords.

Write your answer in the spaces provided on the next page.

Your answers to (b) and (c) will be graded for correctness, efficiency, and clarity. For
full credit, your algorithm must take time linear in the input size (the total number of
bits to represent the codewords) in the worst case.
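
One possible approach, sketched below under stated assumptions, relies on the fact that the trie of an optimal prefix-free code is full: every internal node has exactly two children. Inserting the n − 1 known codewords into a binary trie therefore leaves exactly one node with a single child, and the missing codeword is the path to that node followed by the absent bit. The class name MissingCodewordSketch and the representation of codewords as strings of '0' and '1' characters are assumptions of this sketch.

```java
class MissingCodewordSketch {
    private static class Node {
        Node[] next = new Node[2];   // next[0] = child on bit 0, next[1] = child on bit 1
    }

    // Returns the missing codeword, or null if no node has exactly one child.
    static String missing(String[] codewords) {
        Node root = new Node();
        for (String cw : codewords) {              // build the trie: linear in total bits
            Node x = root;
            for (int i = 0; i < cw.length(); i++) {
                int b = cw.charAt(i) - '0';
                if (x.next[b] == null) x.next[b] = new Node();
                x = x.next[b];
            }
        }
        return find(root, new StringBuilder());
    }

    // Depth-first search for the node that has exactly one child.
    private static String find(Node x, StringBuilder path) {
        boolean hasZero = x.next[0] != null;
        boolean hasOne = x.next[1] != null;
        if (hasZero != hasOne)                     // exactly one child: the gap is here
            return path.toString() + (hasZero ? '1' : '0');
        for (int b = 0; b < 2; b++) {
            if (x.next[b] == null) continue;
            path.append((char) ('0' + b));
            String result = find(x.next[b], path);
            path.deleteCharAt(path.length() - 1);
            if (result != null) return result;
        }
        return null;
    }
}
```

On the six known codewords from part (a), the only node with a single child is the one reached by 01, so the sketch returns 010. The trie has at most one node per input bit, so both the construction and the search take time linear in the input size.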

Briefly describe your algorithm in the space below.

Draw a diagram of your data structure(s) for deducing the missing codeword when the
known codewords are 00, 01100, 10, 0111, 01101, and 11, as in part (a).

(c) Now, suppose that there are two codewords missing. Design an algorithm to deduce the
two missing codewords. Do not repeat details from part (b) if they are identical.

15. Writing seminar assignment problem. (10 points)


A prominent northeastern university assigns n students to m writing seminars. Each student
ranks the writing seminars in order of preference (from favorite to least favorite). Each writing
seminar has space for as many as p students. Design an algorithm to determine whether it is
possible to assign the students to the writing seminars so that each student gets one of their
top two choices. To do so, model the problem as a maximum flow problem.

An example. Here is an example input with n = 6 students (Abigail, Bjarne, Čazir, De-
Andre, Eun-jung, and Flor) and m = 3 writing seminars (X-Ray Crystallography, Your Life
in Numbers, and Zoom!), where each seminar has space for p = 2 students.

1st 2nd 3rd


A X Y Z
B Z X Y
C X Y Z
D Y Z X
E X Y Z
F Y X Z

In this example, there is no assignment in which each student gets their first choice (because
three students rank X as their first choice). However, there is an assignment in which each
student gets one of their top two choices:

assignment
A—X
B—Z
C—X
D—Z
E—Y
F—Y

(a) Draw the flow network that you would construct in order to solve the writing seminar
assignment problem on the facing page (with 6 students and 3 writing seminars). Be
sure to label the source and destination vertices and specify the edge capacities.

(b) After solving such a maximum flow problem, how would you determine whether there
exists an assignment in which each student gets one of their top two choices?
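
The modeling idea can be sketched in code as follows. This is one plausible construction, not necessarily the intended solution: a source with a capacity-1 edge to each student, a capacity-1 edge from each student to each of their top k seminars, and a capacity-p edge from each seminar to the sink; an assignment exists exactly when the maxflow value equals n. The class name SeminarFlowSketch, the 0-based seminar indices in prefs, and the adjacency-matrix shortest-augmenting-path maxflow are all assumptions chosen to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

class SeminarFlowSketch {
    // prefs[i] lists student i's seminars (0-based) from favorite to least favorite.
    // Returns true if every student can be given one of their top k choices.
    static boolean feasible(int[][] prefs, int m, int p, int k) {
        int n = prefs.length;
        int s = 0, t = n + m + 1, size = n + m + 2;
        int[][] cap = new int[size][size];              // residual capacities
        for (int i = 0; i < n; i++) {
            cap[s][1 + i] = 1;                           // source -> student
            for (int c = 0; c < k; c++)
                cap[1 + i][1 + n + prefs[i][c]] = 1;     // student -> acceptable seminar
        }
        for (int j = 0; j < m; j++)
            cap[1 + n + j][t] = p;                       // seminar -> sink
        return maxFlow(cap, s, t) == n;
    }

    // Ford-Fulkerson with BFS (shortest augmenting paths) on a capacity matrix.
    static int maxFlow(int[][] cap, int s, int t) {
        int flow = 0;
        int[] parent = new int[cap.length];
        while (true) {
            Arrays.fill(parent, -1);
            parent[s] = s;
            ArrayDeque<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty() && parent[t] == -1) {
                int v = queue.remove();
                for (int w = 0; w < cap.length; w++) {
                    if (parent[w] == -1 && cap[v][w] > 0) {
                        parent[w] = v;
                        queue.add(w);
                    }
                }
            }
            if (parent[t] == -1) return flow;            // no augmenting path remains
            int bottleneck = Integer.MAX_VALUE;
            for (int w = t; w != s; w = parent[w])
                bottleneck = Math.min(bottleneck, cap[parent[w]][w]);
            for (int w = t; w != s; w = parent[w]) {
                cap[parent[w]][w] -= bottleneck;         // use forward capacity
                cap[w][parent[w]] += bottleneck;         // add backward capacity
            }
            flow += bottleneck;
        }
    }
}
```

With the example preferences encoded as X = 0, Y = 1, Z = 2, feasible(prefs, 3, 2, 2) is true and feasible(prefs, 3, 2, 1) is false, in agreement with the discussion of the example.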

(c) In the worst case, how many augmenting paths will the Ford–Fulkerson algorithm find
(as a function of m and n)? Assume n ≥ m. Mark the best answer.

m    n    m+n    n log n    m log m    mn    mn^2    m^2 n    m^2 n^2


# # # # # # # # #

(d) Let k be an integer between 1 and m. Suppose that you want to know whether it is
possible to assign the students to writing seminars so that each student gets one of their
top k choices (instead of top 2 choices). Briefly describe how you would modify your
solution to (a).

(e) Design an efficient algorithm to find the smallest integer k for which it is possible to
assign the students to writing seminars so that each student gets one of their top k
choices. Your algorithm should be substantially faster in the worst case than repeatedly
applying (d) to solve m maximum flow problems (one for each possible value of k).
COS 226 Algorithms and Data Structures Fall 2023

Final

This exam has 13 questions worth a total of 100 points. You have 180 minutes.

Instructions. This exam is preprocessed by computer. Write neatly, legibly, and darkly. Put all
answers (and nothing else) inside the designated spaces. Fill in bubbles and checkboxes completely.
To change an answer, erase it completely and redo.

Resources. The exam is closed book, except that you are allowed to use a one page reference
sheet (8.5-by-11 paper, both sides, in your own handwriting). No electronic devices are permitted.

Honor Code. This exam is governed by Princeton’s Honor Code. Discussing the contents of this
exam before the solutions are posted is a violation of the Honor Code.

Please complete the following information now.

Name:

NetID:

Exam room: # McCosh 46 # McCosh 50 # McCosh 60 # Other


P01 P02 P03 P04 P05 P06 P07 P08 P09
Precept:
# # # # # # # # #

“I pledge my honor that I will not violate the Honor Code during this examination.”

Signature

1. Initialization. (1 point)
In the spaces provided on the front of the exam, write your name and NetID; fill in the bubble
for your exam room and the precept in which you are officially registered; write and sign the
Honor Code pledge.

2. Empirical running time. (6 points)


Suppose that you observe the following running times (in seconds) for a program on graphs
with V vertices and E edges.

                         E
             100      400     1600     6400
     100     0.3      1.0      4.0     16.0
V    200     1.0      4.0     16.0     64.0
     400     4.0     16.0     64.0    256.0
     800    16.0     64.0    256.0   1024.0

(a) Estimate the running time of the program (in seconds) for a graph with V = 1,600
vertices and E = 25,600 edges. Fill in the best-matching bubble.

# # # # #
2,000 4,000 8,000 16,000 32,000

(b) Estimate the order of growth of the running time as a function of both V and E.
Fill in the best-matching bubble.

# # # # #
Θ(V^2 + E^2)    Θ(E + V^2)    Θ(V^2 E)    Θ(V E^2)    Θ(V^2 E^2)
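
The arithmetic behind this kind of estimate can be sketched as below, assuming a power-law model in which the time is proportional to V^b · E^c. The class and method names are hypothetical helpers introduced only for this illustration.

```java
class GrowthEstimateSketch {
    // Exponent b in time ~ a * x^b, estimated from two measurements
    // (x1, t1) and (x2, t2) taken with everything else held fixed.
    static double exponent(double x1, double t1, double x2, double t2) {
        return Math.log(t2 / t1) / Math.log(x2 / x1);
    }

    // Extrapolates a measured time t at (v, e) to (newV, newE),
    // assuming time ~ a * V^bV * E^bE.
    static double predict(double t, double v, double e, double bV, double bE,
                          double newV, double newE) {
        return t * Math.pow(newV / v, bV) * Math.pow(newE / e, bE);
    }
}
```

Using the table entries, doubling V from 400 to 800 at E = 400 multiplies the time by 4 (exponent 2 in V), and quadrupling E from 1600 to 6400 at V = 800 also multiplies it by 4 (exponent 1 in E). Extrapolating from the 1024.0-second entry to V = 1,600 and E = 25,600 then gives 1024 × 2^2 × 4 = 16,384 seconds.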
COS 226 FINAL, FALL 2023 3

3. Depth-first search. (9 points)


Run depth-first search on the following digraph, starting from vertex 0. Assume the adjacency
lists are in sorted order: for example, when iterating over the edges leaving vertex 0, consider
the edge 0→2 before either 0→4 or 0→6.

start from here

0 2 5 1 7

6 4 8 3 9

(a) List the 10 vertices in DFS preorder.

preorder: 0 2 5 8 1 3 7 9 6 4

(b) List the 10 vertices in DFS postorder.

postorder: 5 1 7 9 3 8 6 8 4 0

(c) Is the reverse of the DFS postorder in (b) a topological order for this digraph?

# #
yes no
4. Minimum spanning trees. (8 points)
Consider the following edge-weighted graph.

30 s 50 90

20
0 100 80 70 10 120

40 60 110

(a) List the weights of the MST edges in the order that Kruskal’s algorithm adds them to
the MST.

Kruskal: 0 10 20 30 50 60 110

(b) List the weights of the MST edges in the order that Prim’s algorithm adds them to the
MST. Start Prim’s algorithm from vertex s.

Prim: 30 0 20 50 60 10 110

5. Shortest paths. (8 points)


Consider running the Bellman–Ford algorithm in the following edge-weighted digraph, with
source vertex s = 0. Assume that, within a pass, the edges are relaxed in sorted order:
0→1, 0→4, 0→5, 1→2, 2→3, 3→1, 4→1, 4→3, 5→4

source vertex     edge weight

0    50    1    8    2

1    4    3    6    7

5    2    4    9    3

v    distTo[]
0    0.0
1    7.0
2    58.0
3    13.0
4    3.0
5    1.0

(a) Immediately after the first pass, what are the values of distTo[v] for each vertex v?
Write the values in the corresponding boxes.

distTo[0] distTo[1] distTo[2] distTo[3] distTo[4] distTo[5]

(b) Immediately after the first pass, for which vertices v is distTo[v] the length of the
shortest path from s to v? Mark all vertices that apply.

0 1 2 3 4 5

6. Maxflows and mincuts. (10 points)


Consider the following flow network and a flow f .

source     flow f     capacity

A 14 / 14 B
C 13 / 13 C 8 / 20 D

9 5 5 8 / 10
14 / 17 / 4/6 / 0/4 /5
3 3

E 14 / 14 F 21 / 21 G 18 / 24 H

target

(a) What is the value of the flow f ?

# # # # #
29 31 34 37 39

min cut: { A, B, E, F }
max flow value = 34

(b) What is the capacity of the cut {A, B, E, F }?

# # # # #
29 31 34 37 39

(c) What is the net flow across the cut {A, B, E, F }?

# # # # #
29 31 34 37 39

(d) Find an augmenting path with respect to f . Write the sequence of vertices in the path.

A →

(e) What is the bottleneck capacity of the augmenting path found in part (d)?

# # # # #
1 2 3 4 5

7. Data structures. (10 points)

(a) Suppose that the following keys are inserted into an initially empty linear-probing hash
table, but not necessarily in the order given:

key hash
A 1
B 1
C 4
D 3
E 2

Which of the following could be the contents of the underlying array? Assume that the
length of the array is 6 and that it neither grows nor shrinks.
Fill in all checkboxes that apply.

0 1 2 3 4 5
– A B C D E

0 1 2 3 4 5
– A B D C E

0 1 2 3 4 5
– B A E D C

(b) Consider the following 2d-tree:

(9, 5)

(5, 9) (14, 8)

(4, 2) (6, 12) (16, 1) (13, 12)

(7, 15) (20, 7) (10, 14)

Which of the following points could be in the subtree T ?
Fill in all checkboxes that apply.

9 <= x <= 13
8 <= y <= 14

(5, 10) (7, 16) (10, 10) (11, 16) (12, 9) (16, 13)

(c) Consider the following code fragment for creating a uniformly shuffled version of an
ArrayList containing n strings.

ArrayList<String> from = ...;
ArrayList<String> to = new ArrayList<String>();
while (from.size() > 0) {
    int r = StdRandom.uniformInt(from.size());
    String x = from.remove(r);   // remove and return item at index r in list,
                                 // shifting subsequent elements to the left
    to.append(x);                // appends x to the end of the list
}

Assume that the ArrayList data type is implemented using a resizing array (with dou-
bling when full and halving when one-quarter full) and that element i in the list is stored
at index i in the resizing array.

All operations perform as efficiently as could be expected for this representation.

element 0 element n-1

"F" "I" "N" "A" "L" – – –

0 1 2 3 4 5 6 7

What is the order-of-growth of the worst-case running time as a function of n?

# # # # #
Θ(1) Θ(n) Θ(n log n) Θ(n^2) Θ(n^3)

What is the order-of-growth of the best-case running time as a function of n?

# # # # #
Θ(1) Θ(n) Θ(n log n) Θ(n^2) Θ(n^3)

8. Dynamic programming. (6 points)


You are taking an idealized exam with n questions and have m minutes to complete it.
Question j is worth p_j points and takes t_j minutes to earn the points. Your goal is to
maximize the number of points earned in the allotted time. Assume that all p_j and t_j are
positive integers (and that there is no partial credit).

You will solve this problem using dynamic programming. Define the following subproblems,
one for each i and j with 0 ≤ i ≤ m and 0 ≤ j ≤ n:

OPT(i, j) = max points earned in i minutes by working only on questions 1 through j


Consider the following partial bottom-up implementation:

int[][] opt = new int[m+1][n+1];

for  1  {
    for  2  {
        if ( 3 ) {
            opt[i][j] =  4 ;
        }
        else {
            opt[i][j] = Math.max( 5 ,
                                  points[j] +  6 );
        }
    }
}

A. (int i = 1; i <= m; i++)
B. (int i = m; i >= 1; i--)
C. (int j = 1; j <= n; j++)
D. (int j = n; j >= 1; j--)
E. times[j] > i
F. points[j] > i
G. opt[i-1][j]
H. opt[i][j-1]
I. opt[i-1][j-1]
J. points[j]
K. times[j]
L. opt[i - times[j]][j-1]
M. opt[i - points[j]][j-1]

For each numbered oval above, write the letter of the corresponding code fragment on the right
in the space provided. You may use each letter once, more than once, or not at all.

1 2 3 4 5 6
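
For comparison, here is a self-contained sketch of the same recurrence as a 0/1 knapsack. It is offered as an illustration under stated assumptions, not as the official answer key: the class name ExamKnapsackSketch is hypothetical, and the arrays points[] and times[] are assumed to be 1-indexed (index 0 unused) so that they line up with the subproblem definition.

```java
class ExamKnapsackSketch {
    // points[j] and times[j] describe question j for 1 <= j <= n; index 0 is unused.
    // Returns the maximum number of points that can be earned in m minutes.
    static int maxPoints(int[] points, int[] times, int m) {
        int n = points.length - 1;
        int[][] opt = new int[m + 1][n + 1];      // row 0 and column 0 stay 0
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (times[j] > i) {
                    opt[i][j] = opt[i][j - 1];    // question j does not fit in i minutes
                } else {
                    opt[i][j] = Math.max(
                            opt[i][j - 1],                              // skip question j
                            points[j] + opt[i - times[j]][j - 1]);      // answer question j
                }
            }
        }
        return opt[m][n];
    }
}
```

Each entry depends only on entries in column j − 1 with the same or fewer minutes, so filling the table with i and j both increasing is a valid order; the table has (m + 1)(n + 1) entries and each takes constant time.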

9. Karger’s algorithm. (5 points)


Run one execution of Karger’s algorithm for finding a global mincut in the following graph.
The table at right gives the uniformly random weights that this execution of Karger’s
algorithm assigns to the edges.

A B C

D E F

edge    random weight
A–B     0.5
A–D     0.4
A–E     0.6
B–C     0.9
B–E     0.8
C–E     0.1
C–F     0.3
D–E     0.7
E–F     0.2

(a) Which cut does this execution of Karger’s algorithm find?


Mark all vertices that are on the same side of the cut as vertex A.

A B C D E F

Karger: 0.1 0.2 0.3 0.4 0.6

(b) How many edges cross the cut found by this execution of Karger’s algorithm?

# # # # #
0 1 2 3 4

10. Multiplicative weights (9 points).


Consider the experts problem with n ≥ 2 experts over a period of T days.

Identify each property as either always true or sometimes/always false.

true false

# #    Suppose that one of the n experts always predicts correctly. Then, the
       total number of mistakes made by the elimination algorithm is ≤ log2 n.

# #    Suppose that one of the n experts always predicts correctly. Then, after
       ⌈log2 n⌉ days, there will be exactly one expert remaining in the
       elimination algorithm.

# #    Suppose that exactly two of the n experts always predict correctly.
       Then, the total number of mistakes made by the elimination algorithm
       is ≤ (1/2) log2 n.

# #    Suppose that more than n/2 of the n experts predict 1 on a given day.
       Then, the multiplicative weights algorithm also predicts 1 for that day.

# #    In the multiplicative weights algorithm, an expert who has made 5
       mistakes will have exactly one-half of the weight of an expert who has
       made 10 mistakes.

# #    Suppose that the best expert makes 7 mistakes. Then, the total number
       of mistakes made by the multiplicative weights algorithm is ≥ 7.

11. Intractability (8 points).


Suppose that Problem X is NP-complete; Problem Y is in NP; and Problem X poly-time
reduces to Problem Y . Which of the following can you infer? Fill in all checkboxes that apply.

Problem X is SAT.

Problem X is in NP.

The Integer-Factorization problem poly-time reduces to Problem X.

Problem Y poly-time reduces to Problem X.

Problem Y is NP-complete.

If Problem X can be solved in poly-time, then P = NP.

If Problem Y cannot be solved in poly-time, then P ≠ NP.

P ≠ NP.

12. Princeton path game. (10 points)


Two players compete on a digraph G with two distinguished vertices, s and t.

• The orange player tries to build a directed path from vertex s to vertex t. The black
player tries to prevent this.
• The two players alternate moves. The orange player moves by coloring an uncolored
edge orange. The black player moves by coloring an uncolored edge black.
• The orange player wins if there is a directed path of orange edges from s to t. The black
player wins if every directed path from s to t contains a black edge.

To make the game interesting, assume that s ≠ t and that G contains at least one directed
path from s to t.

Goal. Your goal is to design an algorithm that, given the current state of the game (i.e., a
graph G with each edge either uncolored, orange, or black), determines whether either player
has already won and, if so, who. Note that the game may end before all of the edges are
colored.

• The orange player wins as soon as there is a directed path of orange edges from s to t.
• The black player wins as soon as every directed path from s to t contains one (or more)
black edges.

Examples. Consider two examples of the game being played on the same digraph.

• In the example at left, the orange player has won: the directed path 0 → 4 → 2 → 3 → 6
contains only orange edges.
• In the example at right, the black player has won: every directed path from s to t
contains one of the black edges 1→2, 4→2, or 4→5.

1 2 3 1 2 3
s s
0 0

4 5 6 4 5 6

t t
orange wins black wins
(s-t path of orange edges) (every s-t path contains a black edge)

Performance requirements. For full credit, your algorithm must take Θ(E + V ) time,
where V and E are the number of vertices and edges in G, respectively. Assume that, given
access to an edge, you can determine its color in Θ(1) time.

Your answer will be graded for correctness, efficiency, and clarity.



(a) Given a digraph G with each edge either uncolored, orange, or black, design an algorithm
to determine whether there is a directed path from s to t containing only orange edges.

(b) Given a digraph G with each edge either uncolored, orange, or black, design an algorithm
to determine whether every path from s to t contains one (or more) black edges.

(c) Can the game end in a tie, with all edges colored and neither player winning?

Yes # No #
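
Both checks in parts (a) and (b) can be phrased as reachability questions, which the sketch below illustrates: orange has won if t is reachable from s using only orange edges, and black has won if t is not reachable from s once the black edges are ignored. This is one possible approach rather than the official solution; the class name PathGameSketch, the integer color codes, and the edge-array representation are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

class PathGameSketch {
    static final int UNCOLORED = 0, ORANGE = 1, BLACK = 2;

    // edges[k] = {from, to} with color[k]. Returns true if t is reachable from s
    // using only orange edges (orangeOnly) or only non-black edges (otherwise).
    static boolean reachable(int n, int[][] edges, int[] color,
                             int s, int t, boolean orangeOnly) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < n; v++) adj.add(new ArrayList<>());
        for (int k = 0; k < edges.length; k++) {
            boolean usable = orangeOnly ? color[k] == ORANGE : color[k] != BLACK;
            if (usable) adj.get(edges[k][0]).add(edges[k][1]);
        }
        boolean[] marked = new boolean[n];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        marked[s] = true;
        queue.add(s);
        while (!queue.isEmpty()) {                 // breadth-first search from s
            int v = queue.remove();
            for (int w : adj.get(v)) {
                if (!marked[w]) {
                    marked[w] = true;
                    queue.add(w);
                }
            }
        }
        return marked[t];
    }

    // Returns "orange", "black", or "none" for the current state of the game.
    static String winner(int n, int[][] edges, int[] color, int s, int t) {
        if (reachable(n, edges, color, s, t, true)) return "orange";
        if (!reachable(n, edges, color, s, t, false)) return "black";
        return "none";
    }
}
```

Each call builds the filtered adjacency lists and runs one graph search, so the total time is proportional to E + V.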

13. Princeton minimum spanning trees. (10 points)


Consider the classic minimum spanning tree problem and a variant.

• Classic-MST: Given a connected, edge-weighted graph G′, find a spanning tree of G′
that has minimum total weight.

• Princeton-MST: Given a connected, edge-weighted graph G with each edge colored
orange or black, find a spanning tree of G that has minimum total weight among all
spanning trees that contain all of the orange edges (or report that no such spanning tree
exists).

Example. Consider the edge-weighted graph below.

• The Classic-MST includes the edges of weight 0, 10, 20, 50, 60.
• The Princeton-MST includes the edges of weight 0, 10, 20, 60, and 80.

30 50

20
10 0 80 70

40 60

Goal. Design an efficient algorithm to solve the Princeton-MST problem on an edge-weighted
and edge-colored graph G. To do so, model it as a Classic-MST problem on a closely related
edge-weighted graph G′.

Performance requirements. For full credit, your algorithm must run in O(E log E) time,
where V and E are the number of vertices and edges in G, respectively.

Your answer will be graded for correctness, efficiency, and clarity.

31 51

21
0 1 0 71

41 61

(a) Describe your algorithm for solving the Princeton-MST problem. Your description
should work for any instance of Princeton-MST, not just the one on the facing page.

(b) Draw the Classic-MST instance G′ that your algorithm would construct in order to
solve the Princeton-MST instance G on the facing page. Be sure to draw the vertices,
edges, and edge weights.

This page is intentionally blank. You may use this page for scratch work.
COS 226 Algorithms and Data Structures Spring 2014

Final Exam

This test has 15 questions worth a total of 100 points. You have 180 minutes. The exam is closed
book, except that you are allowed to use a one page cheatsheet (8.5-by-11, both sides, in your own
handwriting). No calculators or other electronic devices are permitted. Give your answers and
show your work in the space provided. Write out and sign the Honor Code pledge before
turning in the test.

“I pledge my honor that I have not violated the Honor Code during this examination.”

Problem  Score     Problem  Score
   0                  8
   1                  9
   2                 10
   3                 11
   4                 12
   5                 13
   6                 14
   7
 Sub 1              Sub 2

 Total

Name:

netID:

Room:

Precept:  P01    Th 11      Andy Guna
          P02    Th 12:30   Andy Guna
          P03    Th 1:30    Chris Eubank
          P04    F 10       Jenny Guo
          P05    F 11       Madhu Jayakumar
          P05A   F 11       Nevin Li
          P06    F 2:30     Josh Hug
          P06A   F 2:30     Chris Eubank
          P06B   F 2:30     Ruth Dannenfelser
          P07    F 3:30     Josh Hug


0. Initialization. (1 point)
In the space provided on the front of the exam, write your name and Princeton netID; circle
your precept number; write the name of the room in which you are taking the exam; and
write and sign the honor code.

1. Analysis of algorithms. (8 points)

(a) You observe the following running times for a program with an input of size N .

N time
1,000 0.1 seconds
2,000 0.3 seconds
4,000 2.5 seconds
8,000 19.8 seconds
16,000 160.1 seconds

Estimate the running time of the program (in seconds) on an input of size N = 80,000.

seconds

(b) Consider the following implementation of a binary trie data type:

public class BinaryTrieST<Value> {


private Node root; // root of trie
private int N; // number of nodes in the trie

private class Node {


private Value val;
private Node left;
private Node right;
}
...
}

Using the 64-bit memory cost model from lecture and the textbook, how much memory
(in bytes) does a BinaryTrieST object use to store M key-value pairs in N nodes?

Use tilde notation to simplify your answer. Do not include the memory for the values
themselves but do include all other memory (including pointers to values).

∼ bytes
COS 226 FINAL, SPRING 2014 3

(c) For each function on the left, give the best matching order of growth of the running time
on the right. You may use an answer more than once or not at all.

−− B −−   public static int f1(int N) {
              int x = 0;
              for (int i = 0; i < N; i++)
                  x++;
              return x;
          }

−−−−−     public static int f2(int N, int R) {
              int x = 0;
              for (int i = 0; i < R; i++)
                  x += f1(i);
              return x;
          }

−−−−−     public static int f3(int N, int R) {
              int x = 0;
              for (int i = 0; i < R; i++)
                  for (int j = 0; j < N; j++)
                      x += f1(j);
              return x;
          }

−−−−−     public static int f4(int N, int R) {
              int x = 0;
              for (int i = 0; i < N; i++)
                  for (int j = 1; j <= R; j += j)
                      x++;
              return x;
          }

−−−−−     public static int f5(int N, int R) {
              int x = 0;
              for (int i = 0; i < N; i++)
                  for (int j = 1; j <= R; j += j)
                      x += f1(j);
              return x;
          }

A. R
B. N
C. N + R
D. N log R
E. R log N
F. N R
G. R^2
H. N^2
I. N R log N
J. N R log R
K. N R^2
L. R N^2
M. R^3
N. N^3

2. Graph search. (6 points)


Consider the following digraph. Assume the adjacency lists are in sorted order: for example,
when iterating through the edges pointing from 2, consider the edge 2 → 3 before either
2 → 7 or 2 → 8.

0 1 2 3 4

5 6 7 8 9

Run depth-first search on the digraph, starting from vertex 0.

(a) List the vertices in reverse postorder.

0
___ ___ ___ ___ ___ ___ ___ ___ ___ ___

(b) List the vertices in preorder.

0
___ ___ ___ ___ ___ ___ ___ ___ ___ ___

3. Maximum flow. (10 points)


Consider the following flow network and feasible flow f from the source vertex A to the
sink vertex J.

flow capacity

A 20 / 20 B 8/8 C 4 / 10 D 9 / 14 E

5 4
1/6 /
12 8/8 /
11 4/9 /
6 5/5 17 9 / 15
0 /
0

F 1/1 G 14 / 14 H 22 / 24 I 17 / 17 J

(a) What is the value of the flow f ?

augmenting path: A-G-B-H-C-D-E-J
min cut: { A, B, F, G, H, I }
max flow value = 30
(b) Starting from the flow f given above, perform one iteration of the Ford-Fulkerson algo-
rithm. List the sequence of vertices on the augmenting path.

(c) What is the value of the maximum flow?

(d) Circle the vertices on the source side of a minimum cut.

A B C D E F G H I J

(e) Give one edge such that if its capacity were decreased by one, then the value of the
maxflow would decrease.

4. Shortest paths. (6 points)


Suppose that you are running Dijkstra’s algorithm on the edge-weighted digraph below, start-
ing from some vertex s (not necessarily 0).

cost

0 35 1 16 2 27 3 17 4

3 12 19 5 38 28 32 10
9

5 9 6 4 7 40 8 5 9

shortest path: 5 0 6 7 3
The table below gives the edgeTo[] and distTo[] values immediately after vertex 7 has been
deleted from the priority queue and relaxed.

v distTo[] edgeTo[]

0 3.0 5→0

1 28.0 6→1

2 51.0 7→2

3 22.0 7→3

4 ∞ null

5 0.0 null

6 9.0 5→6

7 13.0 6→7

8 53.0 7→8

9 ∞ null

(a) Give the order in which the first 4 vertices were deleted from the priority queue and
relaxed.

(b) Which is the next vertex after 7 to be deleted from the priority queue and relaxed?

0 1 2 3 4 5 6 7 8 9

(c) In the table below, fill in those entries (and only those entries) in the edgeTo[] and
distTo[] arrays that change (from the corresponding entries on the facing page) imme-
diately after the next vertex after 7 is deleted from the priority queue and relaxed.

v distTo[] edgeTo[]

9

5. String sorting algorithms. (7 points)


The column on the left is the original input of 24 strings to be sorted; the column on the right
are the strings in sorted order; the other 7 columns are the contents at some intermediate
step during one of the 3 radix sorting algorithms listed below. Match up each column with
the corresponding sorting algorithm. You may use a number more than once.

mink bear bear calf crow myna crab bear bear


moth calf calf lamb lamb crab toad crow calf
crow crow crow hare deer lamb swan calf crab
myna crab crab wasp crab toad bear crab crow
swan deer hare hawk hare mule deer deer deer
wolf hare kiwi ibex bear hare ibex hare hare
mule hawk deer bear kiwi sole hoki hawk hawk
slug hoki hawk deer calf wolf mule hoki hoki
hare ibex ibex mink hawk calf sole ibex ibex
bear kiwi hoki lion ibex slug wolf kiwi kiwi
kiwi lion lion kiwi hoki moth calf lion lamb
calf lynx lynx slug lion kiwi lamb lynx lion
hawk lamb lamb toad lynx hoki myna lamb lynx
ibex mink mink hoki mink mink mink mink mink
oryx moth mule sole mule hawk lynx moth moth
lion myna myna wolf myna swan lion myna mule
sole mule moth moth moth lion crow mule myna
wasp oryx wasp crab wasp wasp hare oryx oryx
lynx swan sole crow sole bear wasp swan slug
hoki slug oryx oryx oryx deer moth slug sole
crab sole slug mule slug crow slug sole swan
deer toad wolf swan wolf ibex kiwi toad toad
lamb wolf toad myna toad oryx hawk wolf wasp
toad wasp swan lynx swan lynx oryx wasp wolf
---- ---- ---- ---- ---- ---- ---- ---- ----
0 4

(0) Original input (2) MSD radix sort

(1) LSD radix sort (3) 3-way radix quicksort (no shuffle)

(4) Sorted

6. Ternary search tries. (5 points)


Consider the following ternary search trie over the alphabet { A, C, G, T }, where the values
are shown next to the nodes of the corresponding string keys. The node containing ? contains
one of the characters { A, C, G, T }.

0 A C T 6

T T C

1 A T 2 T 5 A T T 8

3 A T 4 7 T ? G 9

T 10

Circle which one or more of the following string keys are (or could be) in the TST above.

A CT GCA GCG GT GTT TA

TCA TAT TCT TCTT TGT TTT TTTT



7. Knuth-Morris-Pratt substring search. (6 points)


Below is a partially-completed Knuth-Morris-Pratt DFA for a string s of length 10 over the
alphabet { A, B, C }.

0 1 2 3 4 5 6 7 8 9

A 4 5 2

B 0 0 7 3

C 0 0 0 10



(a) Reconstruct the string s in the last row of the table above.

(b) Complete the first row of the table above (corresponding to the character A).

0 1 2 3 4 5 6 7 8 9 10

Feel free to use this diagram for scratch work.



8. Boyer-Moore substring search. (6 points)


Suppose that you run the Boyer-Moore algorithm (the basic version considered in the textbook
and lecture) to search for the pattern

R O W S T H E

in the text

O C I E T Y E X C E P T T H E S C A R E C R O W S T H E



(a) Give the trace of the algorithm in the grid below, circling the characters in the pattern
that get compared with characters in the text.

O C I E T Y E X C E P T T H E S C A R E C R O W S T H E
R O W S T H E

(b) Give a pattern string of length 7 that would result in the Y in the text being compared
twice when running the Boyer-Moore algorithm.

9. Regular expressions. (7 points)


The following NFA is the result of applying the NFA construction algorithm from lecture and
the textbook to some regular expression.

ε-transition

0 1 2 3 4 5 6 7 8 9 10 11 12

( A * | ( A B * A ) * )

(a) What is the regular expression?

(b) Suppose that you simulate the following sequence of characters on the NFA above:

A A A A A A A

In which one or more states could the NFA be?

0 1 2 3 4 5 6 7 8 9 10 11 12

(c) Suppose that you want to construct an NFA for the regular expression
( A * | ( A B * A ) + )
where the operator + means one or more copies. What minimal change(s) would you
make to the NFA above?

10. LZW compression. (5 points)


What is the result of expanding the following LZW-encoded sequence of 11 hexadecimal
integers?

43 41 42 42 82 43 81 41 87 82 80

Assume the original encoding table consists of all 7-bit ASCII characters and uses 8-bit
codewords. Recall that codeword 80 is reserved to signify end of file.

C A B B
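
The expansion procedure referred to above can be sketched as follows. This is a minimal illustration, assuming the conventions stated in the question (7-bit ASCII, 8-bit codewords, codeword 80 as end of file); the class name `LzwExpandSketch` and the method `expand` are hypothetical names chosen for this sketch and are not taken from the exam.

```java
import java.util.ArrayList;
import java.util.List;

class LzwExpandSketch {
    static final int R = 128;  // 7-bit ASCII alphabet (assumption from the question)
    static final int EOF = R;  // codeword 0x80 signals end of file
    static final int L = 256;  // number of 8-bit codewords

    // Expands a sequence of LZW codewords into the original string.
    static String expand(int[] codes) {
        List<String> table = new ArrayList<>();
        for (int c = 0; c < R; c++) table.add(String.valueOf((char) c));
        table.add("");  // unused slot reserved for EOF at index 0x80

        StringBuilder out = new StringBuilder();
        if (codes.length == 0 || codes[0] == EOF) return "";
        String val = table.get(codes[0]);
        for (int i = 1; ; i++) {
            out.append(val);
            if (i >= codes.length || codes[i] == EOF) break;
            int code = codes[i];
            String next;
            if (code < table.size()) {
                next = table.get(code);
            } else if (code == table.size()) {
                // Special case: the codeword is the one about to be defined.
                next = val + val.charAt(0);
            } else {
                throw new IllegalArgumentException("invalid codeword " + code);
            }
            // New entry: previous string plus first character of the current one.
            if (table.size() < L) table.add(val + next.charAt(0));
            val = next;
        }
        return out.toString();
    }
}
```

Each codeword after the first adds one dictionary entry, so tracing the table by hand follows the same loop: write the current string, then record the previous string extended by the first character of the current one.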

For reference, below is the hexadecimal-to-ASCII conversion table from the textbook:

     0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
0   NUL SOH STX ETX EOT ENQ ACK BEL BS  HT  LF  VT  FF  CR  SO  SI
1   DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US
2   SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL

Hexadecimal-to-ASCII conversion table

11. Burrows-Wheeler transform. (8 points)

(a) What is the Burrows-Wheeler transform of the following?


B D A B A C A C


Feel free to use this grid for scratch work.

(b) What is the Burrows-Wheeler inverse transform of the following?


4
D A D C C C D B

Feel free to use this grid for scratch work.
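
The transform and its inverse can be sketched in a few lines. This is a minimal illustration, assuming extended-ASCII characters and the convention used in the question, where the integer on the first line is the row of the sorted circular suffixes that holds the original string. The names `BwtSketch`, `circularSuffixArray`, `first`, `transform`, and `inverse` are hypothetical, and the quadratic-time comparator is chosen for brevity rather than efficiency.

```java
import java.util.Arrays;

class BwtSketch {
    // csa[i] = start index of the i-th smallest circular suffix of s.
    static Integer[] circularSuffixArray(String s) {
        int n = s.length();
        Integer[] csa = new Integer[n];
        for (int i = 0; i < n; i++) csa[i] = i;
        Arrays.sort(csa, (a, b) -> {
            for (int k = 0; k < n; k++) {
                char ca = s.charAt((a + k) % n), cb = s.charAt((b + k) % n);
                if (ca != cb) return Character.compare(ca, cb);
            }
            return 0;
        });
        return csa;
    }

    // Row of the sorted circular suffixes that contains the original string.
    static int first(String s) {
        Integer[] csa = circularSuffixArray(s);
        for (int i = 0; i < csa.length; i++) if (csa[i] == 0) return i;
        throw new IllegalArgumentException("empty string");
    }

    // Last column of the sorted circular suffixes.
    static String transform(String s) {
        int n = s.length();
        Integer[] csa = circularSuffixArray(s);
        StringBuilder t = new StringBuilder();
        for (int i = 0; i < n; i++) t.append(s.charAt((csa[i] + n - 1) % n));
        return t.toString();
    }

    // Rebuilds the original string from first and the last column t.
    static String inverse(int first, String t) {
        int n = t.length(), R = 256;
        int[] count = new int[R + 1];
        for (int i = 0; i < n; i++) count[t.charAt(i) + 1]++;
        for (int r = 0; r < R; r++) count[r + 1] += count[r];
        char[] sorted = new char[n];  // first column
        int[] next = new int[n];      // next[i] = row holding the following suffix
        for (int i = 0; i < n; i++) {
            int pos = count[t.charAt(i)]++;  // stable key-indexed counting
            sorted[pos] = t.charAt(i);
            next[pos] = i;
        }
        StringBuilder s = new StringBuilder();
        for (int i = 0, row = first; i < n; i++, row = next[row]) s.append(sorted[row]);
        return s.toString();
    }
}
```

The inverse relies on the stability of key-indexed counting: equal characters keep their relative order between the last column and the first column, which is what makes `next[]` well defined.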



12. Problem identification. (9 points)

You are applying for a job at a new software technology company. Your interviewer asks you
to identify which of the following tasks are possible, impossible, or unknown.

A. Possible.
B. Impossible.
C. Unknown.

−−−−− Given an undirected graph, determine if there exists a path of
      length V − 1 with no repeated vertices in time proportional to
      EV in the worst case.

−−−−− Given a digraph, determine if there exists a directed path
      between every pair of vertices in time proportional to E + V in
      the worst case.

−−−−− Given a digraph, design an algorithm to determine whether it
      is a rooted DAG (i.e., a DAG in which there is a path from
      every vertex to some root r) in time proportional to E + V in
      the worst case.

−−−−− Given a flow network (a digraph with positive edge capacities)
      and two vertices s and t, find the value of the min st-cut
      in time proportional to E + V in the worst case.

−−−−− Given a digraph where each edge is colored black or orange
      and two vertices s and t, find a path from s to t that uses the
      fewest number of black edges in time proportional to E + V in
      the worst case.

−−−−− Given an array a of N 64-bit integers, determine whether there
      are two indices i and j such that a[i] = −a[j] in time proportional
      to N in the worst case.

−−−−− Given an array of N integers between 0 and R − 1, stably sort
      them in time proportional to N + R in the worst case.

−−−−− Determine how many times a pattern string of length M appears
      as a substring in a text string of length N in time proportional
      to M + N in the worst case. For simplicity, assume
      the binary alphabet.

−−−−− Design an algorithm that compresses at least half of all
      10,000-bit messages by one (or more) bits.

13. Reductions. (8 points)


Consider the following two string-processing problems:

• Suffix-Array. Given a string s, compute its suffix array sa[].

• Circular-Suffix-Array. Given a string s, compute its circular suffix array csa[].

For example, the suffix array sa[] and circular suffix array csa[] of the string s = ABAAB are
given below, along with the corresponding suffixes and circular suffixes (in parentheses).

i s[i] sa[i] csa[i]


0 A 2 (AAB) 2 (AABAB)
1 B 3 (AB) 0 (ABAAB)
2 A 0 (ABAAB) 3 (ABABA)
3 A 4 (B) 1 (BAABA)
4 B 1 (BAAB) 4 (BABAA)

Show that Suffix-Array over the binary alphabet linear-time reduces to Circular-
Suffix-Array over the binary alphabet by completing parts (a) and (b).

(a) Show that Suffix-Array over the binary alphabet {A, B} linear-time reduces to
Circular-Suffix-Array over the base-4 alphabet {0, 1, 2, 3}.

i. Given a string input s to Suffix-Array over the alphabet {A, B}, how do
you construct the corresponding string input s' to Circular-Suffix-Array
over the alphabet {0, 1, 2, 3}?

ii. Given the string input s = ABAAB, what is the corresponding string input s'?

You need not use all of the boxes.

iii. Given the solution csa[] to s', how do you construct the solution sa[] to s?

(b) Show that Circular-Suffix-Array over the base-4 alphabet {0, 1, 2, 3} linear-
time reduces to Circular-Suffix-Array over the binary alphabet {A, B}.

i. Given a string input s to Circular-Suffix-Array over the alphabet {0, 1, 2, 3},
how do you construct the corresponding string input s' to Circular-Suffix-Array
over the alphabet {A, B}?

ii. Given the string input s = 03122, what is the corresponding string input s'?

You need not use all of the boxes.

iii. Given the solution csa'[] to s', how do you construct the solution csa[] to s?
14. Algorithm design. (8 points)
There are N dorm rooms, each of which needs a secure internet connection. It costs
w_i > 0 dollars to install a secure router in dorm room i and it costs c_ij > 0 dollars to
build a secure fiber connection between rooms i and j. A dorm room receives a secure
internet connection if either there is a router installed there or there is some path of
fiber connections between the dorm room and a dorm room with an installed router.
The goal is to determine in which dorm rooms to install the secure routers and which
pairs of dorm rooms to connect with fiber so as to minimize the total cost.

[Figure: graph of dorm rooms and possible fiber connections; each vertex is labeled with its router cost and each edge with its fiber cost.]

This instance contains 6 dorm rooms and 10 possible connections. The optimal
solution installs a router in dorm rooms 1 and 4 (for a cost of 10 + 40) and builds
the following fiber connections: 0-1, 1-6, 3-4, 4-5 (for a cost of 20 + 15 + 25 + 5).

Formulate the problem as a minimum spanning tree problem. To demonstrate your


formulation, modify the figure above to show the MST problem that you would solve
to find the minimum cost set of routers and fiber connections.

COS 226 Algorithms and Data Structures Spring 2015

Final

This exam has 14 questions worth a total of 100 points. You have 180 minutes. The exam is closed book, except
that you are allowed to use a one page cheatsheet (8.5-by-11, both sides, in your own handwriting). No calculators
or other electronic devices are permitted. Give your answers and show your work in the space provided. Write
and sign the Honor Code pledge just before turning in the exam.
This exam is preprocessed by computer: if you use pencil (and eraser), write darkly; write all
answers inside the designated rectangles; do not write on the corner marks.

“I pledge my honor that I have not violated the Honor Code during this examination.”

Name:

netID:

Room:

Precept:  P01  P01A  P02  P03  P04  P05  P05A  P06  P06A  P06B  P07

P01   Th 11     Andy Guna
P01A  Th 11     Shivam Agarwal
P02   Th 12:30  Andy Guna
P03   Th 1:30   Swati Roy
P04   F 10      Robert MacDavid
P05   F 11      Robert MacDavid
P05A  F 11      Shivam Agarwal
P06   F 2:30    Jérémie Lumbroso
P06A  F 2:30    Josh Wetzel
P06B  F 2:30    Ryan Beckett
P07   F 3:30    Jérémie Lumbroso

Problem  Score    Problem  Score
0                 7
1                 8
2                 9
3                 10
4                 11
5                 12
6                 13
Sub 1             Sub 2

Total
0. Initialization (1 point)
In the space provided on the front of the exam, write your name and Princeton netID; mark your precept number;
write the name of the room in which you are taking the exam; and write and sign the honor code.

1. Analysis of Algorithms (8 points)


(a) You observe the following memory usage for a program with an input of size N .

N memory
1,000 2.1 MB
2,000 8.2 MB
4,000 32.4 MB
8,000 128.8 MB

Estimate the memory usage of the program (in megabytes) on an input of size 24,000. Your answer should
be accurate to within 5%.

megabytes
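
A power-law fit of the kind this question calls for can be sketched as follows. This is a minimal illustration, assuming the measurements follow y = a · N^b and using only the two largest observations; the class `DoublingSketch` and its methods are hypothetical names, not part of the exam.

```java
class DoublingSketch {
    // Exponent b of the power law y = a * N^b through two observations.
    static double exponent(double n1, double y1, double n2, double y2) {
        return Math.log(y2 / y1) / Math.log(n2 / n1);
    }

    // Extrapolates the fitted power law to input size n.
    static double predict(double n1, double y1, double n2, double y2, double n) {
        double b = exponent(n1, y1, n2, y2);
        double a = y2 / Math.pow(n2, b);  // solve y2 = a * n2^b for a
        return a * Math.pow(n, b);
    }

    public static void main(String[] args) {
        // Last two rows of the table above: (4,000, 32.4 MB) and (8,000, 128.8 MB).
        System.out.println(exponent(4000, 32.4, 8000, 128.8));
        System.out.println(predict(4000, 32.4, 8000, 128.8, 24000));
    }
}
```

When N doubles between rows, the exponent is simply the base-2 logarithm of the ratio of successive measurements, which is why the table is laid out with doubling input sizes.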

(b) Consider the following implementation of a trie data type:

public class TrieST<Value> {


private static final int R = 256;
private Node root; // root of trie
private int N; // number of nodes in the trie

private static class Node {


private Object val;
private Node[] next = new Node[R];
}
// ...
}

Using the 64-bit memory cost model from lecture and the textbook, how much memory (in bytes) does a
TrieST object use to store M key-value pairs in N nodes as a function of N and M ?
Use tilde notation to simplify your answer. Do not include the memory for the values themselves but do
include all other memory (including references to values). Recall that with a static nested class, there is no
8 byte inner class overhead.

∼ bytes

2. Graph Search (6 points)


Perform a depth-first search in the digraph below, starting from vertex 0. Assume the adjacency lists are in sorted
order: for example, when iterating over the edges pointing from 3, process the edge 3 → 2 before either 3 → 7 or
3 → 8.

[Figure: digraph on vertices 0 through 9, drawn with 1 2 3 4 5 in the top row and 6 7 8 9 0 in the bottom row; run DFS from vertex 0.]
(a) List all vertices in reverse postorder.

(b) List all vertices in preorder.

0
3. Minimum Spanning Tree (8 points)
The following diagram shows the set of edges (in thick black lines) selected at some intermediate step of an MST
algorithm.

[Figure: edge-weighted graph on vertices A, B, C, D, F, G, H, and I, including three edges of unknown weight x, y, and z; the selected edges are drawn in thick black lines.]
(a) Which of the following could be the weights of edges x, y, and z, respectively, at some intermediate step of
Kruskal’s algorithm? Mark all that apply.

55 65 75 85 95 105 115 125 135 145


x
y
z

(b) Which of the following could be the weights of edges x, y, and z, respectively, at some intermediate step of
Prim’s algorithm, starting from vertex A? Mark all that apply.

55 65 75 85 95 105 115 125 135 145


x
y
z

4. Maximum Flow (10 points)


Consider the following flow network and feasible flow f from the source vertex A to the sink vertex J.
[Figure: flow network on vertices A through J with source A and target J; each edge is labeled flow / capacity.]


(a) Mark the value of the flow f .

0
10 20 22 24 26 28 30 32 34 36 38 40
(b) Starting from the flow f , perform one iteration of the Ford-Fulkerson algorithm. Mark all vertices that are
on the (unique) augmenting path.

A B C D E F G H I J

(c) Mark the bottleneck capacity of the augmenting path.

0 1 2 3 4 5 6 7 8 9

(d) Mark the vertices on the source side of a minimum cut.

A B C D E F G H I J

(e) Mark the edges below, for which doubling the capacity would increase the value of the maximum flow.

A→G B→C D→I G→H I→J H→D


5. String Sorting Algorithms (7 points)
The column on the left is the original input of 24 strings to be sorted; the column on the right are the strings
in sorted order; the other 7 columns are the contents at some intermediate step during one of the 3 radix sorting
algorithms listed below.
Match up each column with the corresponding sorting algorithm. You may use a number more than once.
Hint: think about algorithm invariants. Do not trace code.

0 lust bole bone bole leaf bone lava cafe bole


1 rust bone buff fawn teal bole herb sage bone
2 fawn buff bole flax pear buff sand palm buff
3 pine cafe cafe cafe flax cafe gold sand cafe
4 sand herb dust buff gray dust pine lava dust
5 cafe dust fawn herb puce fawn cafe fawn fawn
6 pear gray flax dust cafe flax puce leaf flax
7 puce flax gray gray buff gray sage teal gold
8 sage gold gold bone sage gold rose pear gray
9 herb fawn herb gold gold herb bone herb herb
10 dust leaf lust leaf bole lust lime lime lava
11 gray lava lava lava palm lava bole pine leaf
12 rose lime leaf lime lime leaf buff flax lime
13 gold lust lime lust sand lime leaf plum lust
14 bone rose pine rose pine pine teal gold palm
15 buff sage pear sage bone pear plum bole pear
16 lava plum puce plum herb puce palm bone pine
17 plum puce plum puce rose plum fawn rose plum
18 leaf pear palm pear lust palm pear gray puce
19 lime sand rust sand rust rust lust puce rose
20 flax pine rose pine dust rose rust buff rust
21 bole teal sand teal plum sand dust lust sage
22 teal palm sage palm lava sage flax rust sand
23 palm rust teal rust fawn teal gray dust teal
---- ---- ---- ---- ---- ---- ---- ---- ----

0 4

(0) Original input (3) 3-way radix quicksort (no shuffle)

(1) LSD radix sort (4) Sorted

(2) MSD radix sort


6. Substring Search (8 points)
(a) Consider the Knuth-Morris-Pratt DFA for the following string of length 8:
C C A C C C A B
Complete the last three columns of this partially-completed DFA table. (Feel free to use the partially-
completed DFA diagram below for scratch work.)

     0  1  2  3  4  5  6  7
A    0  0  3  0  0
B    0  0  0  0  0
C    1  2  2  4  5

[Figure: partially-completed DFA diagram with states 0 through 8.]

(b) What is the Rabin-Karp hash function of text[4..11] over the decimal alphabet with R = 10, using the
modulus Q = 157?

j 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
--------------------------------------------------------------------
text[j] 6 1 3 2 6 9 ? ? ? 7 7 8 4 4 2 9 5 1 9 6
The digits labeled with a question mark (?) are suppressed. Assume that the hash function of text[3..10]
is 115 and note that 10000000 (mod 157) = 42.
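
The rolling-hash update that makes this question answerable without the suppressed digits can be sketched as follows. This is a minimal illustration, assuming decimal digits stored in an `int[]`; the class `RollingHashSketch` and the methods `hash`, `powMod`, and `roll` are hypothetical names chosen for this sketch.

```java
class RollingHashSketch {
    // Hash of digits[lo..lo+m-1] by Horner's method, modulo Q.
    static long hash(int[] digits, int lo, int m, int R, long Q) {
        long h = 0;
        for (int j = 0; j < m; j++) h = (h * R + digits[lo + j]) % Q;
        return h;
    }

    // R^e mod Q, used as the weight of the leading digit (e = m - 1).
    static long powMod(int R, int e, long Q) {
        long p = 1;
        for (int i = 0; i < e; i++) p = (p * R) % Q;
        return p;
    }

    // Slides the window one position to the right:
    // remove the leading digit, shift, then append the trailing digit.
    static long roll(long h, int leading, int trailing, long RM, int R, long Q) {
        h = (h + Q - RM * leading % Q) % Q;
        h = (h * R + trailing) % Q;
        return h;
    }

    public static void main(String[] args) {
        int[] d = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};  // hypothetical digits
        int R = 10, m = 8;
        long Q = 157, RM = powMod(R, m - 1, Q);
        long h0 = hash(d, 0, m, R, Q);
        long h1 = roll(h0, d[0], d[m], RM, R, Q);
        System.out.println(h1 == hash(d, 1, m, R, Q));  // the two computations agree
    }
}
```

The update needs only the previous hash, the digit leaving the window, and the digit entering it, so the digits in the middle of the window never have to be read again.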
7. Regular Expressions (6 points)
Consider the NFA that results from applying the RE-to-NFA construction algorithm from lecture and the textbook
to the regular expression ( A ( B C * | D ) * ). The states and match transitions are shown below, but
the ε-transitions are suppressed.

0 1 2 3 4 5 6 7 8 9 10 11

( A ( B C * | D ) * )

(a) Which of the following are edges in the ε-transition digraph? Mark all that apply.

0→1 0→8 0→9 0 → 10 0 → 11

2→6 2→7 2→8 2→9 2 → 10


4→2 4→5 4→6 4→7 4→8

5→4 5→5 5→6 5→7 5→9

6→2 6→7 6→8 6→9 6 → 10

9→2 9→6 9→8 9 → 10 9 → 11

(b) Suppose that you simulate the NFA with the following input:
A B C C C B B C B B B
In which states could the NFA be? Mark all that apply.

0 1 2 3 4 5 6 7 8 9 10 11
8. LZW Compression (5 points)
Expand the following LZW-encoded sequence of 10 hexadecimal integers.

42 42 41 43 81 43 83 85 87 80

Assume the original encoding table consists of all 7-bit ASCII characters and uses 8-bit codewords. Recall that
codeword 80 is reserved to signify end of file.

(a) What was the encoded message?

(b) Which of the substrings below are in the LZW dictionary upon termination of the algorithm? Mark all that
apply.

AC ACB BA BB BC BBA BBC BAC BCA CA CAC CB CBB

For reference, above is the hexadecimal-to-ASCII conversion table from the textbook.
9. Burrows-Wheeler Transform (6 points)
(a) What is the Burrows-Wheeler transform of the following?

B A C B C D B A

(b) What is the Burrows-Wheeler inverse transform of the following?

B A D A B B D C


Feel free to use both of these grids for scratch work.


2 2
10. Properties of Problems (9 points)
Mark whether each of the following statements is True or False.

(a) Reductions. Suppose that Problem X poly-time reduces to Problem Y .


True False
If X can be solved in polynomial time, then so can Y .
If Y can be solved in quadratic time, then X can be solved in polynomial time.
If X cannot be solved in quadratic time, Y cannot be solved in polynomial time.
If Y cannot be solved in polynomial time, then neither can X.
If Y is NP-complete, then so is X.

(b) Minimum spanning trees. Let G be any simple graph (no self-loops or parallel edges) with positive and
distinct edge weights.
True False
Any MST of G must include the edge of minimum weight.
Any MST of G must exclude the edge of maximum weight.
The MST of G is unique.
If the weights of all edges incident to any vertex v are increased by 17, then any MST in G
is an MST in the modified edge-weighted graph.
If the weights of all edges in G are increased by 17, then any MST in G is an MST in the
modified edge-weighted graph.

(c) Shortest Paths. Let G be any simple digraph (no self-loops or parallel edges) with positive and distinct
edge weights.
True False
Any shortest path from s to t in G must include the edge of minimum weight.
Any shortest path from s to t in G must exclude the edge of maximum weight.
The shortest path from s to t in G is unique.
If the weights of all edges leaving s are increased by 17, then any shortest path from s to t
in G is a shortest path in the modified edge-weighted digraph.
If the weights of all edges in G are increased by 17, then any shortest path from s to t in G
is a shortest path in the modified edge-weighted digraph.
11. Properties of Algorithms (9 points)
(a) Consider the execution of depth-first search on a digraph G from vertex s, beginning with the function call
dfs(G, s). Suppose that dfs(G, v) is called during the depth-first search. Which of the following
statements can you infer at the moment when dfs(G, v) is called? Mark all that apply.

G contains a directed path from s to v.


The function-call stack contains a directed path from s to v.
The edgeTo[] array contains a directed path from s to v.
If G includes an edge v → w for which w has been previously marked, then G has a directed cycle
containing v.
If G includes an edge v → w for which w is currently a vertex on the function-call stack, then G has a
directed cycle containing v.

(b) Consider the execution of breadth-first search on a digraph G, starting from vertex s. Suppose that vertex v
is removed from the queue during the breadth-first search. Which of the following statements can you infer
at the moment when v is removed from the queue? Mark all that apply.

G contains a directed path from s to v.


The queue contains a directed path from s to v.
The edgeTo[] array contains a directed path from s to v.
If G includes an edge v → w for which w has been previously marked, then G has a directed cycle
containing v.
If G includes an edge v → w for which w is currently a vertex on the queue, then G has a directed cycle
containing v.

(c) Which of the following statements about string-processing algorithms are true?
Mark all that apply.

Both MSD radix sort and LSD radix sort are stable sorting algorithms.
The shape of an R-way trie depends not only on the keys that were inserted but also on the order in
which they were inserted.
The shape of a ternary search tree depends not only on the keys that were inserted but also on the
order in which they were inserted.
Searching for an M -character pattern in an N -character text takes time proportional to M in the best
case and M +N in the worst case using the Boyer-Moore algorithm (with the mismatch character heuristic
only).
Building the NFA corresponding to an M -character regular expression (using the algorithm from the
textbook and lecture) takes time proportional to M in the worst case.
12. Reductions (8 points)
Consider the following two graph-processing problems:
• Shortest-Path. Given an edge-weighted digraph G with nonnegative edge weights, a source vertex s, and
a destination vertex t, find a shortest path from s to t.

• Shortest-Princeton-Path. Given an edge-weighted digraph G with nonnegative edge weights, a source


vertex s, a destination vertex t, and with each vertex colored black or orange, find a shortest path from s to
t that uses at most one orange vertex. Assume that the source vertex is not orange.
In the edge-weighted digraph below, the shortest path from A to F is A → D → E → B → C → F (weight 15)
but the shortest Princeton path is A → B → C → F (weight 18).

source A 9 B 4 C

weight
1 7 3 10 5

D 2 E 99 F destination

(a) Give a linear-time reduction from Shortest-Path to Shortest-Princeton-Path. To demonstrate your


reduction, draw the edge-weighted digraph (labeling the source and destination vertices and coloring each
vertex black or orange) that you would construct to solve the Shortest-Path instance above.
(b) Give a linear-time reduction from Shortest-Princeton-Path to Shortest-Path. To demonstrate your
reduction, draw the edge-weighted digraph (labeling the source and destination vertices) that you would
construct to solve the Shortest-Princeton-Path instance on the facing page.

13. Algorithm Design (9 points)


(a) Design a data structure that supports the following API:

public class StreamingSum

public StreamingSum() create an empty data structure

public void add(int weight) add the weight to the data structure

public void remove() remove the least-recently added weight

public int sum() sum of weights in data structure

Here is an example,
StreamingSum ss = new StreamingSum();
ss.add(1);     // 1          ( add 1 )
ss.add(2);     // 1 2        ( add 2 )
ss.add(3);     // 1 2 3      ( add 3 )
ss.sum();      // 1 2 3      ( return 6 )
ss.add(4);     // 1 2 3 4    ( add 4 )
ss.remove();   // 2 3 4      ( remove 1 )
ss.sum();      // 2 3 4      ( return 9 )

Each operation should take constant time in the worst case.

Declare the instance variables for your StreamingSum data type. You may declare nested classes but you
may not use higher-level data types (such as those in algs4.jar or java.util).
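
One possible shape for such a data type is sketched below. It is a hedged illustration rather than an official solution: it assumes a singly linked FIFO queue of weights plus a running total, and the field names `first`, `last`, and `total` are choices made for this sketch. It uses no classes from algs4.jar or java.util, in keeping with the restriction above.

```java
class StreamingSum {
    // Nested node of a singly linked FIFO queue.
    private static class Node {
        int weight;
        Node next;
    }

    private Node first;  // least-recently added weight
    private Node last;   // most-recently added weight
    private int total;   // sum of the weights currently stored

    public StreamingSum() { }

    // Appends the weight at the tail and updates the running total.
    public void add(int weight) {
        Node node = new Node();
        node.weight = weight;
        if (last == null) first = node;
        else last.next = node;
        last = node;
        total += weight;
    }

    // Removes the least-recently added weight from the head.
    public void remove() {
        if (first == null) throw new IllegalStateException("data structure is empty");
        total -= first.weight;
        first = first.next;
        if (first == null) last = null;
    }

    // Returns the sum of the weights currently stored.
    public int sum() {
        return total;
    }
}
```

Every operation touches a constant number of references and one integer, so each runs in constant time in the worst case; a resizing array would give only amortized constant time.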
(b) Given a binary string s with integer weights associated with each character and a query string t, find a
minimum weight occurrence of t in s (or report that t does not appear as a substring in s). The weight of
an occurrence is equal to the sum of the weights of the corresponding characters in the text.

For example, if s = AAABABABBABAAABABBBABABABABABB and t = ABAB, and the weights


are given as below, then the minimum weight occurrence of t in s starts at index 21.

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27

query string t A B A B
text string s A A A B A B A B B A B A A A B A B B B A B A B A B A B B
weights 3 1 4 1 5 9 2 6 5 3 5 8 9 7 9 3 2 3 8 4 6 2 6 4 3 3 8 3

19 21 18 18
22 15

99

Your algorithm should run in time proportional to N + M , where N and M are the lengths of s and t,
respectively. Your answer will be graded on correctness, efficiency, clarity, and conciseness.
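
One way to meet the N + M bound is sketched below, as an illustration rather than an official solution. It assumes the prefix-function (failure-array) variant of Knuth-Morris-Pratt instead of the DFA version from lecture, combined with prefix sums of the weights so that each occurrence is weighed in constant time. The class `MinWeightOccurrenceSketch` and its method are hypothetical names.

```java
class MinWeightOccurrenceSketch {
    // Returns the start index of a minimum-weight occurrence of t in s, or -1 if none.
    static int minWeightOccurrence(String s, String t, int[] w) {
        int n = s.length(), m = t.length();
        if (m == 0 || m > n) return -1;

        // prefix[i] = w[0] + ... + w[i-1], so a window sum is one subtraction.
        long[] prefix = new long[n + 1];
        for (int i = 0; i < n; i++) prefix[i + 1] = prefix[i] + w[i];

        // fail[i] = length of the longest proper border of t[0..i].
        int[] fail = new int[m];
        for (int i = 1, k = 0; i < m; i++) {
            while (k > 0 && t.charAt(i) != t.charAt(k)) k = fail[k - 1];
            if (t.charAt(i) == t.charAt(k)) k++;
            fail[i] = k;
        }

        int best = -1;
        long bestWeight = Long.MAX_VALUE;
        for (int i = 0, k = 0; i < n; i++) {
            while (k > 0 && s.charAt(i) != t.charAt(k)) k = fail[k - 1];
            if (s.charAt(i) == t.charAt(k)) k++;
            if (k == m) {  // occurrence ends at index i
                int start = i - m + 1;
                long weight = prefix[i + 1] - prefix[start];
                if (weight < bestWeight) {
                    bestWeight = weight;
                    best = start;
                }
                k = fail[k - 1];  // keep going so overlapping occurrences are found
            }
        }
        return best;
    }
}
```

Applied to the text and weights in the table above with t = ABAB, the scan visits the occurrences whose weights are listed under the table and keeps the lightest one. A StreamingSum over a sliding window of M weights would serve the same purpose as the prefix-sum array.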
COS 226 Algorithms and Data Structures Spring 2023

Final

This exam has 14 questions worth a total of 100 points. You have 180 minutes.

Instructions. This exam is preprocessed by computer. Write neatly, legibly, and darkly. Put all
answers (and nothing else) inside the designated spaces. Fill in bubbles and checkboxes completely.
To change an answer, erase it completely and redo.

Resources. The exam is closed book, except that you are allowed to use a one page reference
sheet (8.5-by-11 paper, both sides, in your own handwriting). No electronic devices are permitted.

Honor Code. This exam is governed by Princeton’s Honor Code. Discussing the contents of this
exam before the solutions are posted is a violation of the Honor Code.

Please complete the following information now.

Name:

NetID:

Exam room: # McCosh 46 # McCosh 50 # Other

P01 P02 P02A P03 P03A P03B P04 P04A P05


Precept:
# # # # # # # # #

“I pledge my honor that I will not violate the Honor Code during this examination.”

Signature
2 PRINCETON UNIVERSITY

1. Initialization. (1 point)
In the spaces provided on the front of the exam, write your name and NetID; fill in the bubble
for your exam room and the precept in which you are officially registered; write and sign the
Honor Code pledge.
COS 226 FINAL, SPRING 2023 3

2. Empirical running time. (6 points)


Suppose that you observe the following running times (in seconds) for a program on graphs
with V vertices and E edges.

                       E
            100     200     400     800
     100    0.25    0.5     1.0     2.0
V    200    2.0     4.0     8.0    16.0
     400   16.0    32.0    64.0   128.0
     800  128.0   256.0   512.0  1024.0

(a) Estimate the running time of the program (in seconds) for a graph with V = 1,600
vertices and E = 1,600 edges.

# # # # #
2,000 4,000 8,000 16,000 32,000

(b) What is the order of growth of the running time as a function of both V and E?

# # # # #
Θ(V^3 + E) Θ(V + E^3) Θ(V^3 E) Θ(V E^3) Θ(V^2 E^2)

3. Analysis of algorithms. (6 points)


Determine the order of growth of the running time of each of the following code fragments
as a function of V and E, where V and E are the number of vertices and edges in graph G,
respectively. Assume the standard adjacency-lists representation.

(a) int count = 0;


int V = G.V();
for (int v = 0; v < V; v++)
for (int w = 0; w < v; w++)
count++;

# # # # #
Θ(V) Θ(E) Θ(V log V) Θ(V^2) Θ(V^2 log V)

(b) int count = 0;


int V = G.V();
for (int v = 0; v < V; v++)
for (int w : G.adj(v))
count++;

# # # # #
Θ(V) Θ(E) Θ(E + V) Θ(V^2) Θ(V E)

(c) int count = 0;


int V = G.V();
for (int v = V; v >= 1; v = v / 2)
for (int w = 1; w <= v; w++)
count++;

# # # # #
Θ(V) Θ(E) Θ(V log V) Θ(V^2) Θ(V^2 log V)

4. String sorts. (5 points)


The column on the left contains the original input of 24 strings to be sorted; the column on
the right contains the strings in sorted order; the other 5 columns contain the contents at
some intermediate step during one of the 3 radix-sorting algorithms listed below. Match each
algorithm by writing its letter in the box under the corresponding column.

You may use each letter once, more than once, or not at all.

0 3543 1100 2346 1100 1100 1100 1100


1 2346 6501 1664 1491 1864 1491 1491
2 9397 3006 1100 1532 1491 6501 1532
3 8686 5609 1563 1563 1532 1532 1563
4 1100 5316 1719 1664 1719 7092 1664
5 3239 3117 1532 1719 1563 3543 1719
6 9458 3419 1864 1864 1664 1563 1864
7 7868 1719 1491 2346 2346 1864 2346
8 5609 5629 3239 3543 3543 7584 3006
9 5316 1532 3419 3239 3239 1664 3117
10 3006 3239 3006 3006 3006 2346 3239
11 1864 3543 3117 3419 3419 8686 3419
12 1491 2346 3543 3117 3117 5316 3543
13 3419 4149 4149 4149 4149 3006 4149
14 4149 9458 7584 5609 5609 9397 5316
15 7584 1563 5316 5316 5316 3117 5609
16 1532 1864 6501 5629 5629 9458 5629
17 6501 1664 5609 6501 6501 7868 6501
18 1719 7868 7092 7868 7868 3239 7092
19 7092 7584 7868 7584 7584 5609 7584
20 1563 8686 5629 7092 7092 3419 7868
21 5629 1491 9458 8686 8686 4149 8686
22 3117 7092 8686 9397 9397 1719 9397
23 1664 9397 9397 9458 9458 5629 9458

A E

A. Original input

B. LSD radix sort

C. MSD radix sort

D. 3-way radix quicksort (no shuffle)

E. Sorted

5. Depth-first search. (8 points)


Run depth-first search on the following digraph, starting from vertex 0. Assume the adjacency
lists are in sorted order: for example, when iterating over the edges leaving vertex 0, consider
the edge 0→2 before either 0→4 or 0→6.

[Figure: digraph on vertices 0 through 9, drawn with 0 2 5 1 9 in the top row and 6 4 8 3 7 in the bottom row; DFS starts from vertex 0.]

(a) List the 10 vertices in DFS preorder.

0

(b) List the 10 vertices in DFS postorder.

(c) Is the reverse of the DFS postorder in (b) a topological order for this digraph?

# #
yes no
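
The three vertex orderings in this question can be produced by a single recursive traversal, sketched below. This is a minimal illustration on a small hypothetical digraph, not the one from the exam; the class `DfsOrderSketch` and its members are names chosen for this sketch, and only vertices reachable from the start vertex are visited.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DfsOrderSketch {
    private final List<List<Integer>> adj = new ArrayList<>();
    private final boolean[] marked;
    final List<Integer> preorder = new ArrayList<>();   // order of dfs() calls
    final List<Integer> postorder = new ArrayList<>();  // order of dfs() returns

    DfsOrderSketch(int V, int[][] edges, int s) {
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) adj.get(e[0]).add(e[1]);
        for (List<Integer> list : adj) Collections.sort(list);  // sorted adjacency lists
        marked = new boolean[V];
        dfs(s);
    }

    private void dfs(int v) {
        marked[v] = true;
        preorder.add(v);               // recorded when the call begins
        for (int w : adj.get(v))
            if (!marked[w]) dfs(w);
        postorder.add(v);              // recorded when the call finishes
    }

    // Reverse postorder; a topological order whenever the digraph is a DAG.
    List<Integer> reversePostorder() {
        List<Integer> reversed = new ArrayList<>(postorder);
        Collections.reverse(reversed);
        return reversed;
    }

    public static void main(String[] args) {
        // Hypothetical DAG: 0->1, 0->2, 1->3, 2->3, 3->4.
        int[][] edges = {{0, 1}, {0, 2}, {1, 3}, {2, 3}, {3, 4}};
        DfsOrderSketch d = new DfsOrderSketch(5, edges, 0);
        System.out.println(d.preorder + " " + d.postorder + " " + d.reversePostorder());
    }
}
```

If the digraph contains a directed cycle, the reverse postorder is still well defined but cannot be a topological order, since no topological order exists.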

6. Minimum spanning trees. (8 points)


Consider the following edge-weighted graph.

[Figure: edge-weighted graph whose 16 edges have the distinct weights 0, 10, 20, ..., 150; one vertex is labeled s.]

(a) List the weights of the MST edges in the order that Kruskal’s algorithm adds them to
the MST.

(b) List the weights of the MST edges in the order that Prim’s algorithm adds them to the
MST. Start Prim’s algorithm from vertex s.

7. Shortest paths. (8 points)


Suppose that you are running Dijkstra’s algorithm in the following edge-weighted digraph,
with source vertex s = 0. Just prior to relaxing vertex 6, the distTo[] array is as follows:

v distTo[]
0 1 2 3 0 0.0
edge 1 50.0
weight
2
6 4 3 43.0
4 41.0
5 35.0
4 5 6 2 7 6 38.0
7 36.0

(a) Which vertices (including vertex 6) are currently in the priority queue?
Mark all that apply.

0 1 2 3 4 5 6 7

(b) Which vertex will Dijkstra’s algorithm relax immediately after vertex 6?

# # # # # # # # #
cannot be
0 1 2 3 4 5 6 7
determined

(c) Which is the weight of edge 7→3 ?

# # # # # # # # #
cannot be
1 2 3 4 5 6 7 8
determined

8. Maxflows and mincuts. (8 points)


Consider the following flow network and maximum flow f .

[Figure: flow network on vertices A through J with source A and target J; each edge is labeled flow / capacity.]

(a) What is the value of the flow f ?

# # # # #
20 31 32 36 37

(b) What is the capacity of the cut {A, B, C}?

# # # # #
31 42 50 63 76

(c) Which vertices are on the source side of a minimum cut? Mark all that apply.

A B C D E F G H I J

(d) Suppose that the capacity of edge B→C is increased from 9 to 10. Which of the following
paths would become augmenting paths with respect to flow f? Mark all that apply.

A→G→B→C→I→D→E→J
A→B→C→D→E→J
A→G→B→C→I→J
A→G→H→C→I→J
none of the above

9. Data structures. (12 points)

(a) Suppose that the following keys are inserted into an initially empty linear-probing hash
table, but not necessarily in the order given,

key hash
A 3
B 4
C 4
D 0
E 1

Which of the following hash tables could arise? Assume that the initial size of the hash
table is 5 and that it neither grows nor shrinks.
Fill in all checkboxes that apply.

0 1 2 3 4
A B C D E

0 1 2 3 4
D B C E A

0 1 2 3 4
C D E A B
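
One way to check candidate tables like those above is to simulate linear probing for every insertion order and see which final tables actually occur. The following is a minimal brute-force sketch, not part of the exam; the class and method names (`LinearProbingOrders`, `reachableTables`) are hypothetical, and it assumes the standard probe sequence i, i+1, i+2, ... with wraparound.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Brute-force check: which final tables can linear probing produce
// for some insertion order of the keys A..E? (Illustrative names.)
final class LinearProbingOrders {
    static final char[] KEYS = {'A', 'B', 'C', 'D', 'E'};
    static final int[] HASH = {3, 4, 4, 0, 1};   // hash values from the question
    static final int M = 5;                      // fixed table size

    // Insert the keys in the given order and return the table as a string.
    static String build(List<Integer> order) {
        char[] table = new char[M];              // '\0' marks an empty slot
        for (int k : order) {
            int i = HASH[k];
            while (table[i] != 0) i = (i + 1) % M;   // probe the next slot, wrapping around
            table[i] = KEYS[k];
        }
        return new String(table);
    }

    // Enumerate all insertion orders and collect the tables they produce.
    static void permute(List<Integer> order, boolean[] used, Set<String> out) {
        if (order.size() == KEYS.length) { out.add(build(order)); return; }
        for (int k = 0; k < KEYS.length; k++) {
            if (used[k]) continue;
            used[k] = true;
            order.add(k);
            permute(order, used, out);
            order.remove(order.size() - 1);      // remove by index (undo the choice)
            used[k] = false;
        }
    }

    static Set<String> reachableTables() {
        Set<String> out = new HashSet<>();
        permute(new ArrayList<>(), new boolean[KEYS.length], out);
        return out;
    }

    public static void main(String[] args) {
        Set<String> reachable = reachableTables();
        for (String candidate : new String[] {"ABCDE", "DBCEA", "CDEAB"})
            System.out.println(candidate + " could arise: " + reachable.contains(candidate));
    }
}
```

With only 5! = 120 insertion orders, exhaustive simulation is cheap here; for larger tables one would instead reason about which keys must precede which.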

(b) Consider the following 2d-tree:

[Figure: 2d-tree, listed here by level]
level 0: (9, 5)
level 1: (5, 9) (14, 9)
level 2: (4, 2) (6, 12) (16, 1) (11, 12)
level 3: (7, 15) (20, 7) (10, 14)
level 4: (x, y)

Which of the following points could correspond to (x, y)?
Fill in all checkboxes that apply.

6 <= x <= 9
9 <= y <= 15

(5, 10) (7, 10) (7, 16) (8, 8) (8, 14) (10, 10)
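
The ranges annotated above can be derived by intersecting one half-plane per step on the path from the root to (x, y). The sketch below assumes that path is (9, 5), then (5, 9), then (6, 12), then (7, 15), which is read off the annotated ranges rather than from the figure itself, and it treats every bound as inclusive, as the annotation does. The class name `KdRegion` and its methods are hypothetical.

```java
// Sketch: the rectangle implied by a root-to-node path in a 2d-tree.
// Even levels split on x, odd levels split on y. (Illustrative names.)
final class KdRegion {
    double xmin = Double.NEGATIVE_INFINITY, xmax = Double.POSITIVE_INFINITY;
    double ymin = Double.NEGATIVE_INFINITY, ymax = Double.POSITIVE_INFINITY;

    // Narrow the region by descending past the node (px, py) at the given level.
    // goLeft means "left" on an x-split and "below" on a y-split.
    void descend(int level, double px, double py, boolean goLeft) {
        if (level % 2 == 0) {
            if (goLeft) xmax = Math.min(xmax, px); else xmin = Math.max(xmin, px);
        } else {
            if (goLeft) ymax = Math.min(ymax, py); else ymin = Math.max(ymin, py);
        }
    }

    // Inclusive containment test, matching the annotated ranges.
    boolean contains(double x, double y) {
        return xmin <= x && x <= xmax && ymin <= y && y <= ymax;
    }

    // Assumed path to the unknown point (x, y).
    static KdRegion regionOfUnknownPoint() {
        KdRegion r = new KdRegion();
        r.descend(0, 9, 5, true);     // left of (9, 5):   x <= 9
        r.descend(1, 5, 9, false);    // above (5, 9):     y >= 9
        r.descend(2, 6, 12, false);   // right of (6, 12): x >= 6
        r.descend(3, 7, 15, true);    // below (7, 15):    y <= 15
        return r;
    }

    public static void main(String[] args) {
        KdRegion r = regionOfUnknownPoint();
        int[][] candidates = {{5, 10}, {7, 10}, {7, 16}, {8, 8}, {8, 14}, {10, 10}};
        for (int[] p : candidates)
            System.out.println("(" + p[0] + ", " + p[1] + ") inside: " + r.contains(p[0], p[1]));
    }
}
```

How ties are broken (a point whose coordinate equals the splitting coordinate) depends on the insertion convention in use; none of the six candidates lies on a boundary, so the choice does not affect them.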

(c) Consider the following ternary search trie, where the question mark represents an
unknown digit:

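[Figure: ternary search trie whose nodes are labeled with digits; one node is labeled "?". Node labels as extracted, row by row: 3 1 2 / 6 2 2 6 / 4 0 5 6 6 1 9 / 2 / 6 4 / 2 ? / 5 6 6 7. The links between nodes are not recoverable from the extraction.]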

Which of the following string keys are (or could possibly be) in the TST?
Fill in all checkboxes that apply.

1 226 236 36 56 646 76

81 822 8225 826 8269 869 96



10. Data compression. (8 points)


For each of the following data compression algorithms, identify the worst-case compression
ratio. Recall that the compression ratio is the number of bits in the encoded message divided
by the number of bits in the original message.

For each algorithm on the left, write the letter of the best-matching term on the right. You
may use each letter once, more than once, or not at all.

Algorithms:

____ Run-length coding with 8-bit counts.

____ Huffman coding over the extended ASCII alphabet (R = 256).

____ LZW compression over the extended ASCII alphabet (R = 256), with 12-bit codewords.

____ Burrows–Wheeler compression over the extended ASCII alphabet (R = 256). This
     includes the Burrows–Wheeler transform, move-to-front encoding, and Huffman coding.

Terms:

A. ∼ 1
B. ∼ 3/2
C. ∼ 2
D. ∼ 4
E. ∼ 8
F. ∼ 12
G. ∼ 16
H. ∼ 256
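
To make the definition of compression ratio concrete, the sketch below counts the output bits of run-length coding on a bit string and divides by the input length. It assumes a common textbook convention that is not spelled out in the question: runs alternate starting with 0s, each run is written as one 8-bit count, and a run longer than 255 is split by emitting a zero-length run of the other bit. The class name `RunLengthRatio` is hypothetical.

```java
// Sketch: size of a run-length encoding with 8-bit counts (assumed convention).
final class RunLengthRatio {
    // Number of bits written when encoding the given string of '0' and '1' characters.
    static long encodedBits(String bits) {
        long counts = 0;          // number of 8-bit counts written so far
        char expected = '0';      // bit value of the current run
        int run = 0;              // length of the current run
        for (int i = 0; i < bits.length(); i++) {
            char c = bits.charAt(i);
            if (c != expected) {          // run ended: write its count
                counts++;
                run = 0;
                expected = c;
            } else if (run == 255) {      // overflow: write 255, then a zero-length run
                counts += 2;
                run = 0;
            }
            run++;
        }
        counts++;                         // write the final run
        return 8 * counts;
    }

    static double ratio(String bits) {
        return (double) encodedBits(bits) / bits.length();
    }

    public static void main(String[] args) {
        StringBuilder alternating = new StringBuilder();
        for (int i = 0; i < 8; i++) alternating.append("01");
        System.out.println("alternating bits: ratio = " + ratio(alternating.toString()));
        System.out.println("eight zeros:      ratio = " + ratio("00000000"));
    }
}
```

Under this convention, an input of alternating bits makes every run have length 1, so each input bit costs one 8-bit count, while long runs are where the method saves space.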

11. Burrows–Wheeler transform. (5 points)

(a) What is the Burrows–Wheeler transform of the following string?

A N A B E L L A

integer index

core

Feel free to use this grid for scratch work.

(b) Consider all strings whose core Burrows–Wheeler transform (i.e., the Burrows–Wheeler
transform excluding the integer index) is the same as the core Burrows–Wheeler
transform of

A N A B E L L A

In the space below, write the lexicographically smallest such string (i.e., the first one
that would appear alphabetically).
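
The transform asked about in this question can be sketched directly from its definition: sort the circular rotations of the string, read off the last column as the core, and record the row in which the original string appears as the integer index. The code below is a naive quadratic-space illustration with hypothetical names (`NaiveBwt`, `core`, `index`, `smallestRotation`), not the efficient suffix-sorting approach. Every rotation of a string yields the same sorted list of rotations and therefore the same core, which is why `smallestRotation` is relevant to part (b).

```java
import java.util.Arrays;

// Naive Burrows-Wheeler transform by explicitly sorting rotations. (Illustrative names.)
final class NaiveBwt {
    // Starting offsets of the circular rotations of s, in sorted order of the rotations.
    static Integer[] sortedRotations(String s) {
        int n = s.length();
        String doubled = s + s;                  // rotation i is doubled[i, i + n)
        Integer[] starts = new Integer[n];
        for (int i = 0; i < n; i++) starts[i] = i;
        Arrays.sort(starts, (a, b) ->
            doubled.substring(a, a + n).compareTo(doubled.substring(b, b + n)));
        return starts;
    }

    // Last column of the sorted rotations.
    static String core(String s) {
        int n = s.length();
        StringBuilder sb = new StringBuilder();
        for (int start : sortedRotations(s))
            sb.append(s.charAt((start + n - 1) % n));
        return sb.toString();
    }

    // Row of the sorted rotations in which the original string appears.
    static int index(String s) {
        Integer[] starts = sortedRotations(s);
        for (int row = 0; row < starts.length; row++)
            if (starts[row] == 0) return row;
        throw new IllegalArgumentException("empty string");
    }

    // Lexicographically smallest rotation of s.
    static String smallestRotation(String s) {
        int first = sortedRotations(s)[0];
        return (s + s).substring(first, first + s.length());
    }

    public static void main(String[] args) {
        String s = "ANABELLA";
        System.out.println(index(s));             // prints 2
        System.out.println(core(s));              // prints LNAABLEA
        System.out.println(smallestRotation(s));  // prints AANABELL
    }
}
```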

12. DFS postorder. (5 points)


Consider the following partial implementation for computing the DFS postorder in a digraph:

    public PostorderDFS(Digraph G) {
        marked = new boolean[G.V()];
        postorder = new Queue<Integer>();
        for (int v = 0; v < G.V(); v++)
            if (!marked[v])
                dfs(G, v);
    }

    private void dfs(Digraph G, int v) {
        (1)  marked[v] = true;
        (2)  for (int w : G.adj(v))
        (3)      if (!marked[w])
        (4)          dfs(G, w);
        (5)  postorder.enqueue(v);
    }

Code fragments:

    A. dfs(G, v);
    B. dfs(G, w);
    C. marked[v] = true;
    D. marked[w] = true;
    E. postorder.enqueue(v);
    F. postorder.enqueue(w);
    G. if (!marked[v])
    H. if (!marked[w])
    I. for (int w : G.adj(v))
    J. for (int v = 0; v < G.V(); v++)

For each numbered oval above, write the letter of the corresponding code fragment on the right
in the space provided. Use each letter at most once.

1 2 3 4 5
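
For readers who want to run the postorder computation without the course library, here is a self-contained sketch that replaces `Digraph` and `Queue` with `java.util` lists. The class name `PostorderSketch`, the helper `digraph`, and the four-vertex example graph are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained DFS postorder over adjacency lists. (Illustrative names.)
final class PostorderSketch {
    private final boolean[] marked;
    private final List<Integer> postorder = new ArrayList<>();

    PostorderSketch(List<List<Integer>> adj) {
        marked = new boolean[adj.size()];
        for (int v = 0; v < adj.size(); v++)
            if (!marked[v]) dfs(adj, v);
    }

    private void dfs(List<List<Integer>> adj, int v) {
        marked[v] = true;                    // mark on entry
        for (int w : adj.get(v))
            if (!marked[w]) dfs(adj, w);     // explore unmarked neighbors
        postorder.add(v);                    // record v once all its descendants are done
    }

    List<Integer> postorder() { return postorder; }

    // Build adjacency lists for a digraph with n vertices from directed edges {v, w}.
    static List<List<Integer>> digraph(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) adj.get(e[0]).add(e[1]);
        return adj;
    }

    public static void main(String[] args) {
        // Hypothetical DAG: 0->1, 0->2, 1->3, 2->3
        int[][] edges = {{0, 1}, {0, 2}, {1, 3}, {2, 3}};
        System.out.println(new PostorderSketch(digraph(4, edges)).postorder());  // prints [3, 1, 2, 0]
    }
}
```

Reversing the printed list gives 0, 2, 1, 3, which is a topological order of this example because the example graph has no directed cycle.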

13. Shortest tiger path. (10 points)


Consider a graph in which each vertex is colored black or orange. A tiger path is a path that
contains exactly one edge whose endpoints have opposite colors.

Shortest tiger path problem. Given an undirected graph G and two vertices s and t, find
a tiger path between s and t that uses the fewest edges (or report that no such path exists).



An example. Consider the graph G below with s = 0 and t = 6.

• The shortest path between s and t is 0–4–5–6, but it is not a tiger path.
• The shortest tiger path between s and t is 0–1–2–3–6.

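[Figure: graph G on vertices 0 through 6 (top row: 1 2 3; bottom row: 4 5 6), with s = 0 and t = 6. It contains the paths 0–4–5–6 and 0–1–2–3–6. The vertex colors are not recoverable from the extraction.]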

Goal. Formulate the shortest tiger path problem as a traditional (unweighted) shortest path
problem in a directed graph. Specifically, define a digraph G′ , source s′ , and destination t′
such that the length of the shortest path from s′ to t′ in G′ is always equal to the length of
the shortest tiger path between s and t in G. For simplicity, you may assume that s is black
and t is orange.

Performance requirements. For full credit, the number of vertices in G′ must be Θ(V )
and the number of edges must be Θ(E), where V and E are the number of vertices and edges
in G, respectively.

Your answer will be graded for correctness, efficiency, and clarity.



Briefly describe how to construct the digraph G′ , s′ , and t′ from G, s, and t.


Your description should work for any graph G, not just the one on the facing page.

Draw the digraph G′ corresponding to graph G on the facing page. Label s′ and t′ .
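
One possible construction, offered as a sketch rather than as the official solution, uses two copies of the vertex set: copy 0 means "no color-changing edge used yet" and copy 1 means "exactly one used". Same-color edges stay within a copy, opposite-color edges lead only from copy 0 to copy 1, and breadth-first search runs from s in copy 0 to t in copy 1. This gives 2V vertices and at most 4E directed edges. The class name `TigerPath` and the vertex colors in the example are assumptions; the exam figure's colors are not available here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of a two-layer reduction for the shortest tiger path problem. (Illustrative names.)
final class TigerPath {
    // Returns the number of edges on a shortest tiger path from s to t, or -1 if none exists.
    // orange[v] is true if vertex v is orange; edges are undirected pairs {v, w}.
    static int shortest(int n, int[][] edges, boolean[] orange, int s, int t) {
        // Vertex v of G becomes v (layer 0) and v + n (layer 1) in G'.
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < 2 * n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            int v = e[0], w = e[1];
            if (orange[v] == orange[w]) {        // same color: stay in the current layer
                adj.get(v).add(w);
                adj.get(w).add(v);
                adj.get(v + n).add(w + n);
                adj.get(w + n).add(v + n);
            } else {                             // opposite colors: layer 0 to layer 1 only
                adj.get(v).add(w + n);
                adj.get(w).add(v + n);
            }
        }
        // BFS from s' = s (layer 0) toward t' = t + n (layer 1).
        int[] dist = new int[2 * n];
        Arrays.fill(dist, -1);
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        dist[s] = 0;
        queue.add(s);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : adj.get(v))
                if (dist[w] == -1) { dist[w] = dist[v] + 1; queue.add(w); }
        }
        return dist[t + n];
    }

    public static void main(String[] args) {
        // Edges taken from the two paths named in the example; colors are assumed:
        // vertices 4 and 6 orange, all others black.
        int[][] edges = {{0, 1}, {1, 2}, {2, 3}, {3, 6}, {0, 4}, {4, 5}, {5, 6}};
        boolean[] orange = {false, false, false, false, true, false, true};
        System.out.println(shortest(7, edges, orange, 0, 6));  // prints 4
    }
}
```

Because layer 0 is reachable only through vertices of the same color as s and layer 1 only through vertices of the other color, no vertex of G appears in both layers along a path, so a shortest path in G′ corresponds to a simple path in G.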

14. Necklaces. (10 points)


A necklace consists of a sequence of n beads, each of which is either orange (0) or black (1).
In this question, we have a set of m necklaces and are interested in identifying a common
sequence of beads that appears at the end of many necklaces.

The problem. Given a set of m necklaces, each containing n beads, and a positive integer
k ≤ n, design an algorithm to find the most popular ending sequence of length k.

An example. Consider the following m = 4 necklaces, each containing n = 5 beads.

[Figure: a grid with rows necklace 1 through necklace 4 and columns bead 1 through bead 5, showing each bead's color. The colors are not recoverable from the extraction.]

The most popular ending sequences for various values of k are as follows:

• k = 1: black (3 necklaces end with a black bead).
• k = 2: black–black (3 necklaces end with two black beads).
• k = 3: orange–black–black (2 necklaces end with this sequence).
• k = 4: black–orange–black–black (2 necklaces end with this sequence).
• k = 5: orange–orange–orange–black–orange (1 necklace ends with this sequence).
  There are three alternative answers.

Performance requirements. For full credit, the running time of your algorithm must be
Θ(mn) in the worst case.

Your answer will be graded for correctness, efficiency, and clarity (but not Java syntax). If
your solution relies upon an algorithm or data structure from the course, do not reinvent it;
simply describe how you are applying it.

(a) Describe your algorithm for identifying a most popular ending sequence of length k.

(b) Draw a diagram of the underlying data structures (such as arrays, linked lists, or binary
trees) that your algorithm uses for the example input on the facing page. Show all
relevant information, including any links and auxiliary data.
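
One possible approach, sketched here under stated assumptions rather than given as the official solution, inserts the last k beads of each necklace into a binary trie (R = 2) and keeps a counter at the node where each insertion ends. Each insertion touches k nodes, so the trie work is proportional to mk in the worst case, which is within the mn bound. The class name `NecklaceEndings` is hypothetical, necklaces are represented as strings of '0' (orange) and '1' (black), and the four example necklaces are invented to be consistent with the counts listed in the example, since the original bead colors are not available here.

```java
// Sketch: most popular length-k ending via a binary trie with counts. (Illustrative names.)
final class NecklaceEndings {
    private static final class Node {
        final Node[] next = new Node[2];   // child for bead 0 (orange) and bead 1 (black)
        int count;                         // necklaces whose last k beads end at this node
    }

    // Each necklace is a string of '0' and '1' with length at least k.
    // Returns a most popular ending of length k (the first to reach the top count).
    static String mostPopularEnding(String[] necklaces, int k) {
        Node root = new Node();
        String best = null;
        int bestCount = 0;
        for (String s : necklaces) {
            Node x = root;
            for (int i = s.length() - k; i < s.length(); i++) {   // walk the last k beads
                int bead = s.charAt(i) - '0';
                if (x.next[bead] == null) x.next[bead] = new Node();
                x = x.next[bead];
            }
            x.count++;
            if (x.count > bestCount) {
                bestCount = x.count;
                best = s.substring(s.length() - k);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical necklaces consistent with the counts in the example.
        String[] necklaces = {"00010", "01011", "11011", "00111"};
        for (int k = 1; k <= 4; k++)
            System.out.println("k = " + k + ": " + mostPopularEnding(necklaces, k));
    }
}
```

A hash table keyed by the length-k suffix would also work in practice, but its guarantee is on expected rather than worst-case time, which is why the sketch uses a trie.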