CSC263 Cheat Sheet
Q: What is an ADT?
A set of objects together with a set of operations that can be performed on these objects.
1. Objects: integers. Operations: ADD(x, y), SUBTRACT(x, y), MULTIPLY(x, y), QUOTIENT(x, y), REMAINDER(x, y)
2. Objects: stacks. Operations:
- PUSH(S, x): add the element x to the top of the stack S
- POP(S): delete the top element from the nonempty stack S and return it
- EMPTY(S): return true if S is empty, false otherwise
Q: What is a data structure?
A data structure is an implementation of an ADT. This includes a way to represent objects and algorithms for the operations. Example: a stack could be implemented by either a singly-linked list or an array with a counter to keep track of the "top."
Q: Why are ADTs important?
- Important for specification.
- Modularity: usage depends only on the definition, not on the implementation. The implementation of the ADT can be changed (corrected or improved) without changing the rest of the program.
- Reusability: an abstract data type can be implemented once, and used in lots of different programs.
Recap: ADT is a way to describe what the data is and what

1, 2, or 3. Since a node has 1 more child than internal values, the number of children is between 2 and 4.
Q: What range of values would be allowed in any subtree rooted at node (33)?
Between 31 and 40 inclusive.
Q: If we were to insert the value 15 into the tree above, where would it need to go to preserve the order property?
In the same node as (12), making (12,15).
NOTE: This is formally defined in section 3.3.1,

2-3-4 Trees
Let's formalize our intuition by introducing some notation:
A node with d children is called a d-node. The values stored at the node are labelled $k_1, k_2, \ldots, k_{d-1}$. The children are labelled $v_1, v_2, \ldots, v_d$.
Q: Given the two properties, size and depth, what can we say about the height h of a 2-3-4 tree which stores n items?
$2^h \le n + 1 \le 4^h$, that is, $\log_4(n+1) \le h \le \log_2(n+1)$.

Relating 2-3-4 Trees to Red-Black Trees
The following properties must hold:
1. The root of the tree is black.
2. Every external node is black.

and we can perform our queries faster?
Q: What would help us with questions about rank?
Augment each node x such that it has an additional field 'size[x]' that stores the number of keys in the subtree rooted at x (including x itself).
Q: How is this related to 'rank'? Suppose we have a 2-node with a single element x. What is rank(x) in terms of the keys that come before x in the tree?
RANK(x) = 1 + # keys that come before x in the tree.
Q: Now, with respect to the subtree rooted at x, what is the relative RANK(x)?
RANK(x) = SIZE(v1) + 1, where v1 is the left child.
Q: Now suppose we have a 4-node X with elements x1, x2, x3. What is the relative rank of xi with respect to the subtree rooted at X?
RELATIVE RANK(xi) = 1 + # elements that precede xi in the subtree rooted at X
$= (1 + \mathrm{SIZE}(v_1)) + (1 + \mathrm{SIZE}(v_2)) + \cdots + (1 + \mathrm{SIZE}(v_{i-1})) + (1 + \mathrm{SIZE}(v_i)) = \sum_{j=1}^{i} \left(\mathrm{SIZE}(v_j) + 1\right)$

In-neighborhood, out-neighborhood: given a vertex u in a directed graph, get the set of vertices v such that (v, u) (or (u, v), respectively) is an edge.
Degree, in-degree, out-degree: compute the size of the neighborhood, in-neighborhood, or out-neighborhood, respectively.
Traversal: visit each vertex of a graph to perform some task.

Applications of Graphs
- WWW (Google)
- Scheduling
- Chip design
- Network analysis, such as transportation flow, cellular coverage, electrical current, etc.
- Flow charts
- Explanatory schematics

Data structures for graphs
There are two reasonable data structures to store graphs:
a) Adjacency matrix
b) Adjacency list
An adjacency matrix: let V = {v1, v2, ..., vn}. We store information about the edges of the graph in an n x n array A where:
$A[i, j] = \begin{cases} 1 & \text{if } (v_i, v_j) \in E \\ 0 & \text{otherwise} \end{cases}$
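The adjacency-matrix definition and the neighborhood and degree operations can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, and vertices are assumed to be numbered 0 to n-1 instead of v1 to vn.

```python
from typing import Iterable


def build_adjacency_matrix(n: int, edges: Iterable[tuple[int, int]]) -> list[list[int]]:
    """Return the n x n array A with A[i][j] = 1 iff (v_i, v_j) is an edge."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = 1  # directed edge from vertex i to vertex j
    return matrix


def out_neighborhood(matrix: list[list[int]], u: int) -> set[int]:
    """Vertices v such that (u, v) is an edge: scan row u."""
    return {v for v, bit in enumerate(matrix[u]) if bit}


def in_neighborhood(matrix: list[list[int]], u: int) -> set[int]:
    """Vertices v such that (v, u) is an edge: scan column u."""
    return {v for v in range(len(matrix)) if matrix[v][u]}


def out_degree(matrix: list[list[int]], u: int) -> int:
    return len(out_neighborhood(matrix, u))


def in_degree(matrix: list[list[int]], u: int) -> int:
    return len(in_neighborhood(matrix, u))


# Small directed graph with edges 0->1, 0->2, 2->1.
example = build_adjacency_matrix(3, [(0, 1), (0, 2), (2, 1)])
```

Each neighborhood query scans a full row or column, so it takes time proportional to n regardless of how many edges the vertex actually has. That cost is one reason an adjacency list can be preferable for sparse graphs.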
$t_n(A) = \begin{cases} 2i & \text{if } i \ne 0 \\ 2n + 1 & \text{if } i = 0 \end{cases}$
The average-case value is somewhat smaller (as expected) than T_wc(n).
Q: Asymptotically, are T_avg(n) and T_wc(n) the same?
Yes, they are both Θ(n). For some algorithms, T_wc and T_avg are different even when written asymptotically.

The idea is that when we want to access key k, instead of looking up T[k] in a table T, we look in T[h(k)], where h is a hash function mapping the universe U of keys to the m table locations.
Q: If m < |U|, then what must be true about some k1 and k2 in U?
There must be k1, k2 ∈ U such that k1 ≠ k2 and yet h(k1) = h(k2). This is called a collision.
Q: How can we resolve a collision?
Example. You have a small address book and one of the letters fills up, for example, "N"s. Where do you add the next "N" entry?
Closed Addressing: write a little note explaining where to find the rest of the N names (explicit directions).
Open Addressing: flip to the next page, i.e. overflow (general rule).

Closed Addressing
Q: How can we resolve a collision using Closed Addressing?
Chaining. Idea: store a linked list at each entry in the hash table.
Q: Does this make intuitive sense?
Yes! If we are searching for a key that is not in the hash table, we need to traverse one complete linked list whose average size is α = n/m, where n is the number of stored keys.
Q: What if k ∈ L_j for some j?
Then we want to consider k chosen uniformly at random from the elements in hash table T. The probability that k is the jth element in bucket i, conditional on the fact that h(k) = i, is uniform, i.e., equals 1/L_i, where L_i here denotes the length of the list in bucket i.

construct a symbol table to keep track of the identifiers used in the input program.
Q: We would like to use the division method for hashing. How can we do this?
Turn the identifiers (strings of text) into positive integers. We can do this by:
• considering each string to be a number in base 128 (if there are 128 text characters).
• each character x can be represented by a number from 1 through 128 denoted num(x).
• a string of characters $x_n x_{n-1} \ldots x_1$ can be represented uniquely by the number $\sum_{i=1}^{n} \mathrm{num}(x_i) \cdot 128^{i-1}$
The division method then hashes the resulting integer k as h(k) = k mod m.

The Multiplication Method
$h(k) = \lfloor m \cdot \mathrm{fract}(kA) \rfloor$
A: fixed real number
fract(x) is the fractional part of a real number x.
m: power of 2

This study source was downloaded by 100000883358294 from CourseHero.com on 01-05-2025 12:42:31 GMT -06:00
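The two hashing methods can be sketched in Python as below. This is an illustration under stated assumptions, not the course's implementation: `num(x)` is taken to be the ASCII code plus 1 so that it falls in 1 through 128, the constant A = (√5 - 1)/2 is one commonly suggested choice, and all function names are hypothetical.

```python
import math

# Assumed choice of the fixed real number A (a commonly suggested value).
GOLDEN_A = (math.sqrt(5) - 1) / 2


def num(ch: str) -> int:
    """Assumed encoding: ASCII code shifted into the range 1..128."""
    return ord(ch) + 1


def string_to_int(s: str) -> int:
    """Interpret s = x_n ... x_1 as a base-128 number with digits num(x_i)."""
    value = 0
    for ch in s:  # the leftmost character is the most significant digit
        value = value * 128 + num(ch)
    return value


def division_hash(k: int, m: int) -> int:
    """Division method: h(k) = k mod m."""
    return k % m


def multiplication_hash(k: int, m: int, a: float = GOLDEN_A) -> int:
    """Multiplication method: h(k) = floor(m * fract(k * a))."""
    fractional_part = (k * a) % 1.0
    return math.floor(m * fractional_part)


key = string_to_int("ab")      # 98 * 128 + 99
slot = division_hash(key, 13)  # table size 13 is an arbitrary example
```

With floating-point arithmetic the fractional part loses precision for very large keys. A fixed-width integer formulation avoids that, which is where requiring m to be a power of 2 becomes convenient.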
Closed Addressing
We handle collisions by enlarging the storage capacity at the relevant entry in the hash table, e.g., using a linked list.

Open Addressing
Each entry in the hash table stores only one element; in particular, we only use it when n < m.
Q: How can we insert a new element if we get a collision?
Answer:
- Find a new location to store the new element.
- We need to know where we put it as well.
- Search a well-defined sequence of other locations in the hash table until we find one that's not full.
This sequence is called a probe sequence.

Probe Sequences
Linear probing: s_i = (h(k) + i) mod m for i = 0, 1, 2, ....
Q: What is the problem with linear probing?
Clustering.
Q: What happens when we hash to something within a group of filled locations?
- We have to probe the whole group until we reach an empty slot.
- We increase the size of the cluster, resulting in two keys that didn't necessarily share the same "home" location ending up with almost identical probe sequences.

Non-Linear Probing:
Non-linear probing includes schemes where the probe sequence does not involve steps of fixed size.
Example: Quadratic probing, where the probe sequence is calculated as:
s_i = (h(k) + i^2) mod m for i = 0, 1, 2, ....
Q: Now what problem may occur?
Probe sequences will still be identical for elements that hash to the same home location.

Double Hashing:
In double hashing we use a different hash function h2(k) to calculate the step size.
The probe sequence is:
s_j = (h(k) + j * h2(k)) mod m for j = 0, 1, 2, ....
Also, we want to choose h2 so that, if h(k1) = h(k2) for two keys k1, k2, it won't be the case that h2(k1) = h2(k2).

Analysis of Open Addressing:
Notice that in open addressing, INSERT and SEARCH take the same amount of work.
Let's consider the complexity of INSERT for a key k:
It's not hard to come up with worst-case situations where the above types of open addressing require Θ(n) time for INSERT.
To simplify the analysis of the average case, we make some assumptions:
- there is a hash table with m locations
- the hash table contains n elements and we want to insert a new key k.
- consider a random probe sequence for k, that is, its probe sequence is equally likely to be any permutation of (0, 1, ..., m-1).

Computing the Expected Average Complexity
Let T denote the number of probes performed in the INSERT. Let A_i denote the event that every location up to the i-th probe is occupied.
Intuition: given that the first j - 1 probes all found occupied slots, the j-th probe finds an occupied slot with probability (n - j + 1)/(m - j + 1), because there are n - j + 1 elements that we haven't seen among the remaining m - j + 1 slots that we haven't seen.
Now we can calculate the expected value of T, or the average-case complexity of insert.

Asymptotic Notation
Let d(n), e(n), f(n) and g(n) be functions mapping nonnegative integers to nonnegative reals. Then
1. If d(n) is O(f(n)), then a·d(n) is O(f(n)), for any constant a > 0
2. If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n) + e(n) is O(f(n) + g(n))
3. If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n)e(n) is O(f(n)g(n))
4. If d(n) is O(f(n)) and f(n) is O(g(n)), then d(n) is O(g(n))
5. If f(n) is a polynomial of degree d, then f(n) is O(n^d)
6. n^x is O(a^n) for any fixed x > 0 and a > 1
7. log n^x is O(log n) for any fixed x > 0
8. log^x n is O(n^y) for any fixed constants x > 0 and y > 0

Pre-Order Traversal: node, left, right
Yes, it is possible that a pre-order traversal of a heap returns the keys in decreasing order. This occurs when at every node every key in the left subtree is greater than every key in the right subtree (if it exists). For any set of distinct keys, there is exactly one heap that obeys this property.
In-Order Traversal: left, node, right
No, it is not possible that an in-order traversal of a heap returns the keys in sorted order. This is because any heap containing at least 3 keys will contain a parent x with children y and z such that x > y and x > z, but an in-order traversal of this subtree will necessarily visit x between y and z, which is not in sorted order.
Post-Order Traversal: left, right, node

Proof of Kruskal's Algorithm
KRUSKAL-MST(G=(V,E), w:E->Z)
    A := {}
    insert the edges into a priority queue Q (keyed by weight w)
    for each vertex v in V
        MAKE-SET(v)
    end (for)
    while (Q not empty)
        e = (u,v) := EXTRACT-MIN(Q)
        if FIND-SET(u) =/= FIND-SET(v)
            UNION(u,v)
            A := A U {e}
        end (if)
    end (while)
END KRUSKAL-MST

Theorem. If G is connected, Kruskal's algorithm produces an MST (given by the edges in A).
Proof. We'll prove the following three claims:
(1) "The graph F = (V, A) is a forest" is a loop invariant,
(2) when the while loop terminates F = (V, A) is connected,
(3) the sum of weights of edges in A is the same as the sum of weights in an MST.
Taken all together these imply that A is an MST because (1) and (2) show A is a spanning tree and (3) shows its weight is minimal.
1. The statement "F = (V, A) is a forest" is true the first time we enter the loop body since no edges have been added to A (so there are no cycles in A). If the statement is true upon entering the loop body then it will be true at the end of the body because an edge e is added to A only if its endpoints are in different components of F, in which case adding e cannot produce a new cycle.
2. Suppose F is not connected and consider its components, say they have vertex sets V1, V2, ..., Vk. Since G is connected there must be at least one edge of E with an endpoint in V1 and the other endpoint in one of the Vi, i = 2, ..., k. If there are several such edges let e be the one of least weight. By our assumption, it is an element of E but not of A. However, since the algorithm considers edges in order of weight, when it considers e there would be no edges in A from V1 to the "outside world" (i.e. vertices of the other Vi), so the two endpoints of e would at that moment be in different components and e would have been added to A, a contradiction.
3. Let T be an MST (i.e. the edge set of an MST). If A = T we are done. If A ≠ T, let e ∈ T be the smallest weight edge in T \ A. Note A ∪ {e} contains a single cycle C and all its other edges have weight no greater than that of e (because the algorithm adds edges in the order of weight and would otherwise have added e to A before one of the other edges of C). Also some edge f of C is in A but not T (because T is a tree). Let A2 := (A \ {f}) ∪ {e}. This has one more edge in common with T than A did, but weight(A) ≤ weight(A2). Continuing in this way we can keep producing spanning trees which have more and more edges in common with T. Eventually we have a tree Ar with all edges in common with T (i.e. T = Ar). On the other hand, the total weights were non-decreasing too:
weight(A) ≤ weight(A2) ≤ weight(A3) ≤ ... ≤ weight(Ar) = weight(T).
Since T is an MST, weight(T) ≤ weight(A) as well. Thus A must have the same weight as T.
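The KRUSKAL-MST pseudocode can be sketched as runnable Python along these lines. This is one possible rendering under assumptions: the priority queue is a binary heap from the standard `heapq` module, the disjoint-set operations are implemented with a parent dictionary and path halving (the pseudocode does not specify how), and edges are given as hypothetical `(u, v, weight)` triples with mutually comparable vertex labels.

```python
import heapq


def kruskal_mst(vertices, weighted_edges):
    """Return the list of MST edges (u, v, w) of a connected graph."""
    parent = {v: v for v in vertices}  # MAKE-SET(v) for each vertex

    def find_set(v):
        # Walk up to the representative, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Priority queue keyed by weight; ties fall back to comparing vertex labels.
    queue = [(w, u, v) for u, v, w in weighted_edges]
    heapq.heapify(queue)

    mst = []  # the edge set A of the pseudocode
    while queue:
        w, u, v = heapq.heappop(queue)  # EXTRACT-MIN(Q)
        root_u, root_v = find_set(u), find_set(v)
        if root_u != root_v:            # endpoints lie in different components
            parent[root_u] = root_v     # UNION(u, v)
            mst.append((u, v, w))
    return mst


example_edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3),
                 ("c", "d", 4), ("b", "d", 5)]
example_mst = kruskal_mst("abcd", example_edges)
```

In this example the edges of weight 3 and 5 are rejected because their endpoints are already in the same component, which is exactly the situation that claim (1) of the proof relies on to keep F a forest.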