DATA STRUCTURE NOTES
WEEK 09-14
How do we solve a problem?
Two main approaches
• Think about the solution yourself
• Learn from others
How do we use the solution for another problem to solve our problem?
Modeling
• We model our problem as a known problem
• We then adapt the solution of the known problem to our model and apply it to solve our
problem
How do we model a problem?
• Using data structures
How do we apply the solutions of known problems?
• Using algorithms
Graphs
Graphs are an important and powerful modeling technique, used to model ‘connectivity’ among entities.
o Graph terminology
The entities are called vertices
The links between vertices are called edges
Edges can be
un-directed (bidirectional)
directed (one-directional)
Vertices are usually labeled
Edges can be weighted
A graph that has directed edges is called a directed graph
A graph with undirected edges is called an undirected graph
Graphs may have cycles
A cycle in a graph is a path that starts and ends at the same vertex, with no
repeated edges or vertices (except for the start/end vertex).
The path BED is a cycle, so is EDB, and also DBE…
A graph with a cycle is called a cyclic graph
A graph without any cycle is called an acyclic graph
A directed graph without any cycle is called a directed acyclic graph (DAG)
A graph with few edges relative to the maximum possible number (on the order of |V|²) is called a
sparse graph
A graph whose number of edges is close to that maximum is a dense graph
Two vertices in a graph are adjacent if they are joined by an edge
If every vertex can be reached from every other vertex, the graph is called a connected graph
The degree of a vertex refers to how many edges are connected to that vertex.
Can a graph have both directed and undirected edges? Yes
o Graph Representation
A graph is denoted by G = (V, E) where
V is the set of vertices
E is the set of edges, such that each edge in E connects two vertices in V
An edge in E is represented as (u, v) where u and v are vertices in V
A vertex v is reachable from another vertex u if there is a path from u to v
A sequence of vertices v1, v2, …, vn forms a path if there exists an edge (vi, vi+1) for every i
with 1 ≤ i < n
Length of a path between two vertices is the number of edges on the path
A graph is connected if there is a path from each vertex to every other vertex
o Two common methods for representing graphs:
Adjacency matrix
A |V| x |V| array, where |V| denotes the number of vertices
In each vertex's row, put a 1 in the column of every adjacent vertex (and 0 elsewhere)
For weighted graphs, put the weight of edge instead of 1
Space requirement: Θ(|V|²)
For undirected graphs, each edge is represented twice
Adjacency list
An array containing |V| linked lists
For each vertex, store each adjacent vertex in its list
Space requirement: Θ(|V| + |E|)
For undirected graphs, each edge has to be added to the linked lists of both of its vertices
So which is better, an adjacency matrix or an adjacency list? For sparse graphs, an adjacency matrix
wastes space, because most of its entries are empty. For dense graphs, maintaining and traversing linked
lists can be time-consuming. The choice therefore depends on the number of edges in the graph.
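The two representations can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes vertices are numbered 0 to |V|-1, uses Python lists in place of linked lists, and the function names and the sample edge list are made up for the example.

```python
def build_adjacency_matrix(num_vertices, edges, directed=False):
    """Return a |V| x |V| matrix with 1 where an edge exists and 0 elsewhere."""
    matrix = [[0] * num_vertices for _ in range(num_vertices)]
    for u, v in edges:
        matrix[u][v] = 1
        if not directed:
            matrix[v][u] = 1  # an undirected edge is stored twice
    return matrix


def build_adjacency_list(num_vertices, edges, directed=False):
    """Return one neighbor list per vertex."""
    adjacency = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].append(v)
        if not directed:
            adjacency[v].append(u)  # add the edge to both vertices' lists
    return adjacency


# Hypothetical undirected graph with 4 vertices and 4 edges
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
print(build_adjacency_matrix(4, edges))
print(build_adjacency_list(4, edges))
```

The matrix always holds |V|² entries, while the lists hold one entry per edge endpoint, which is where the two space bounds come from.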
Topological sort
o An ordering of vertices in a DAG, such that if there is a path from a vertex vi to vj, then vi appears
before vj in the ordering
o Topological sort is not possible in a graph with cycles. Why? Every vertex on a cycle would have to appear before itself in the ordering
The ordering may not be unique: there may be more than one topological order for a graph
o Applications:
Job scheduling
Checking if a sequence of tasks follows certain rules (as in the case of course-
prerequisites)
o In-degree:
o The number of edges coming into a vertex
o Out-degree:
o The number of edges going out of a vertex
Topological sort is a way to arrange tasks or steps in an order that respects their dependencies:
Some tasks depend on others being done first.
It applies to directed acyclic graphs (DAGs): graphs with arrows (directions) and no cycles (you
can’t loop back to the same point).
The algorithm:
o Precompute the number of incoming edges indeg(v) for each vertex v
o Enqueue all vertices v with indeg(v) = 0 into a queue Q
o Repeat until Q becomes empty:
    o Dequeue v from Q (and print it!)
    o For each edge v → u:
        o Decrement indeg(u)
        o If indeg(u) = 0, enqueue u into Q
Time complexity: Θ(|V|+|E|)
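The queue-based steps above can be sketched in Python as below. This is a sketch under some assumptions: the graph is a dict that maps every vertex to the list of its successors (every vertex must appear as a key), and the cycle check at the end is an addition that the steps above do not include. The course names are made up.

```python
from collections import deque


def topological_sort(graph):
    """Return one topological order of a DAG given as {vertex: [successors]}."""
    # Precompute the number of incoming edges for each vertex
    indegree = {v: 0 for v in graph}
    for v in graph:
        for u in graph[v]:
            indegree[u] += 1

    # Start with every vertex that has no incoming edges
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for u in graph[v]:
            indegree[u] -= 1
            if indegree[u] == 0:
                queue.append(u)

    # Added check: vertices on a cycle never reach in-degree 0
    if len(order) != len(graph):
        raise ValueError("graph has a cycle; no topological order exists")
    return order


# Hypothetical prerequisites: A before B and C, both before D
courses = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(topological_sort(courses))
```

Each vertex is enqueued once and each edge is examined once, which matches the Θ(|V|+|E|) bound.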
Graph Traversal
To visit all the vertices of a graph, two different methods exist:
o Depth-first search (DFS)
Pick an adjacent vertex, explore one of its adjacent vertices and so on
Depth-First Search (DFS) is a graph traversal algorithm. It explores as far as possible
along a path before backtracking. Think of it like walking through a maze:
You pick a path and keep going deep until you hit a wall.
Then you go back and try a different path.
▪ Algorithm FindPath (vs, vd)
• Mark vs as visited and add it to the path
• For each unvisited vertex u adjacent to vs
    ▪ If u == vd
        o Path found! Add u to the path and stop
    ▪ Else
        o Call FindPath (u, vd); if it finds a path, stop
• If FindPath fails for every such u
    ▪ No path exists from vs to vd
    ▪ Remove vs from the path!
Same in simple terms
Finding a Path (Recursive) – Plain English Steps
1. At your current spot (vs):
o Mark it as “seen” so you don’t revisit it.
o Add it to your current path list.
2. Check each neighbor (u) you haven’t seen yet:
o If u is the destination (vd):
Add u to the path.
You’re done—path found!
o Otherwise:
“Dive in” and try to find a path from u to vd by calling this same procedure on u.
3. If none of those neighbors leads to vd:
o Remove vs from your path (you’re backtracking).
o Report that no path was found from vs to vd.
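A minimal Python sketch of the recursive path search follows. It assumes the graph is a dict of neighbor lists, and it differs slightly from the steps above: the destination test is done on the current vertex rather than on each neighbor, so a search whose start equals the destination also works. The function name and sample graph are hypothetical.

```python
def find_path(graph, start, goal, visited=None, path=None):
    """Return a list of vertices from start to goal, or None if none exists."""
    if visited is None:
        visited, path = set(), []
    visited.add(start)      # mark as seen so it is never revisited
    path.append(start)      # tentatively add to the current path
    if start == goal:
        return path
    for u in graph[start]:
        if u not in visited:
            if find_path(graph, u, goal, visited, path) is not None:
                return path
    path.pop()              # dead end: backtrack
    return None


graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": ["F"], "F": []}
print(find_path(graph, "A", "F"))
print(find_path(graph, "D", "A"))
```

The first call explores the dead end A, B, D, backtracks, and then succeeds through C and E. The path found this way is not necessarily the shortest one.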
Algorithm FindCycle (vertex v)
Mark v as visited
For each edge v → u:
    if u is unvisited
        Mark v as parent of u
        Call FindCycle (u)
    else if u is not parent of v
        cycle found!
Same in simple terms
Detecting a Cycle in a Graph – Simple Steps
1. Visit a node (v):
o Mark it as “seen.”
2. Look at each neighbor (u) of v:
o If u hasn’t been seen yet:
1. Remember that you came to u from v (so v is u’s “parent”).
2. Go explore from u (repeat these same steps on u).
o Otherwise (u has been seen):
If u isn’t the one you came from, you’ve just found a loop—a cycle.
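The parent-tracking idea can be sketched as below. This assumes a simple undirected graph (no parallel edges) stored as a dict of neighbor lists in which each edge appears in both endpoints' lists. The outer loop over all vertices is an addition so that disconnected graphs are also covered; the names are hypothetical.

```python
def has_cycle(graph):
    """Return True if the undirected graph {vertex: [neighbors]} has a cycle."""
    visited = set()

    def visit(v, parent):
        visited.add(v)
        for u in graph[v]:
            if u not in visited:
                if visit(u, v):      # v becomes the parent of u
                    return True
            elif u != parent:        # seen before, and not where we came from
                return True
        return False

    # Start a new search from every vertex that is still unvisited
    return any(v not in visited and visit(v, None) for v in graph)


triangle = {"B": ["E", "D"], "E": ["B", "D"], "D": ["E", "B"]}
tree = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
print(has_cycle(triangle))
print(has_cycle(tree))
```

For directed graphs the parent test is not enough; that case needs a different check and is not covered by this sketch.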
Algorithm FindMST (Vertex v)
Add v to Tree T
For each vertex u adjacent to v and not yet in T:
    Add u and edge (v, u) to T
    Call FindMST(u)
Same in simple terms
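Building a Spanning Tree (MST) – Plain English Steps
Note: this procedure ignores edge weights, so the tree it builds is a minimum spanning tree only when all edges have the same weight (as in an unweighted graph).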
1. Start with your root node (v):
o Put v into your tree T.
2. Look at each neighbor (u) of v that isn’t already in T:
o Add u to T and include the edge (v–u) that connects them.
o Then, from u, repeat the same process: treat u as your new “current node” and explore its
neighbors.
3. Keep going until every vertex you can reach from v is in T.
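The depth-first tree construction can be sketched as below, assuming an undirected graph stored as a dict of neighbor lists. The function name and the sample graph are made up, and edge weights are not considered at all.

```python
def dfs_spanning_tree(graph, root):
    """Return the edges of a depth-first spanning tree rooted at root."""
    in_tree = {root}
    tree_edges = []

    def visit(v):
        for u in graph[v]:
            if u not in in_tree:
                in_tree.add(u)              # add u to the tree
                tree_edges.append((v, u))   # together with the edge (v, u)
                visit(u)                    # continue from u

    visit(root)
    return tree_edges


graph = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B"], "D": ["B"]}
print(dfs_spanning_tree(graph, "A"))
```

For a connected graph the result has |V| - 1 edges; here the edge between A and C is left out because C is already in the tree when it is examined.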
o Breadth-first search (BFS)
Explore all adjacent vertices, then their adjacent vertices, and so on
▪ Algorithm BFS (vertex v)
1. Initialize a queue Q
2. Mark v as visited and enqueue it into Q
3. While Q is not empty:
    a. Dequeue the front element of Q and call it w
    b. For each edge w → u:
        If u is not visited, mark it as visited and enqueue it into Q
Same algorithm in simple terms
Breadth-First Search (BFS) – Simple Steps
1. Make a line (a queue) and stand your starting point (vertex v) in that line.
2. Mark v as “seen” so you don’t visit it again.
3. While there’s someone in the line:
o Step forward to the person at the front (call them w) and remove them from the line.
o Look at each neighbor (each vertex u connected to w):
If you haven’t “seen” u yet,
1. Mark u as seen,
2. Join u to the back of the line.
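A minimal Python sketch of these BFS steps follows, assuming the graph is a dict of neighbor lists. It returns the order in which vertices are dequeued; the names and the sample graph are hypothetical.

```python
from collections import deque


def bfs(graph, start):
    """Return the vertices reachable from start in breadth-first order."""
    visited = {start}          # mark the start as seen
    queue = deque([start])     # and put it in the line
    order = []
    while queue:
        w = queue.popleft()    # take the vertex at the front
        order.append(w)
        for u in graph[w]:
            if u not in visited:
                visited.add(u)     # mark when enqueuing, not when dequeuing
                queue.append(u)
    return order


graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C", "E"], "E": ["D"]}
print(bfs(graph, "A"))
```

Marking a vertex as visited at the moment it is enqueued keeps it from entering the queue twice, as happens here with D, which is a neighbor of both B and C.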
Each method has its own application areas
• During a traversal, the edges of a graph can be classified into types
o Tree edge: An edge between the current vertex and an adjacent vertex that has not yet
been visited
o Back edge: An edge between the current vertex and an adjacent vertex that has
already been visited
• The exception is the edge to the vertex from which the current vertex was reached (its parent
vertex); that edge is not a back edge
Sorting
To put things in some order
• Performance criteria
o Efficiency
• We may need to sort a very large number of entries…
o Memory requirements
• It’ll be bad if we need too much extra memory
o Stability
• The ability of a sorting algorithm to keep the original order of equal keys
Selection Sort
o Assume we have a list of elements and we want to sort them in ascending order
o Starting from the 1st element, find the minimum element in the list and swap it with the 1st
element
o Starting from the 2nd element, find the minimum element in the rest of the list and swap it with the 2nd
element
o Do this for every position, and the list will be sorted!
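// a[1..N] uses 1-based indexing
for (i = 1; i < N; i++)
    minIndex = i                      // assume position i holds the minimum
    for (j = i+1; j <= N; j++)        // scan the unsorted remainder
        if (a[j] < a[minIndex])
            minIndex = j
    swap (a[i], a[minIndex])          // move the minimum into position i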
▪ Time complexity?
• O(n²)
▪ Space requirement?
• Constant, In-Place sorting – no additional space required
▪ Is this algorithm stable and efficient?
• No! The long-distance swaps can reorder equal keys, and O(n²) time is slow for large lists
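The selection sort steps can be sketched in Python as below, using 0-based indexing. The optional key parameter is an addition for the example, so that the second call can show equal keys changing their relative order; the sample data is made up.

```python
def selection_sort(a, key=lambda x: x):
    """Sort the list a in place in ascending order and return it."""
    n = len(a)
    for i in range(n - 1):
        min_index = i
        # Find the smallest element in the unsorted part a[i:]
        for j in range(i + 1, n):
            if key(a[j]) < key(a[min_index]):
                min_index = j
        # Move it to position i; this swap is what can break stability
        a[i], a[min_index] = a[min_index], a[i]
    return a


print(selection_sort([29, 10, 14, 37, 13]))

# Records compared by their number only: (2, "a") starts before (2, "b")
records = [(2, "a"), (2, "b"), (1, "c")]
print(selection_sort(records, key=lambda r: r[0]))
```

In the second call the first swap moves (2, "a") behind (2, "b"), so the two records with key 2 come out in reversed order.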
Merge Sort
o An example of divide-and-conquer algorithms
o Background: Two sorted lists can be merged in linear time
o In the first step, merge sort divides the given list into two smaller sub-lists of (nearly) equal size
o In the next step, it recursively calls merge sort on both sub-lists
o In the last step, it merges the sorted sub-lists together
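Merge (A, B, C)
    i = 1; j = 1; k = 1; m = |A|; n = |B|
    while (i <= m && j <= n)
        if (A[i] <= B[j])          // on ties take from A first; this keeps the sort stable
            C[k++] = A[i++]
        else
            C[k++] = B[j++]
    while (i <= m)                 // copy what is left of A
        C[k++] = A[i++]
    while (j <= n)                 // copy what is left of B
        C[k++] = B[j++]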
▪ Time complexity?
• O(n log n)
▪ Space requirements?
• Needs Θ(n) extra space for the temporary arrays (A & B)
▪ Is this algorithm stable?
• Yes!
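A minimal Python sketch of merge sort follows. It returns a new sorted list rather than filling a third array C, and the optional key parameter is an addition used to show stability; both choices, and the sample data, are assumptions made for the example.

```python
def merge(left, right, key=lambda x: x):
    """Merge two sorted lists into one sorted list in linear time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # "<=" takes from the left list on ties, which keeps equal keys in order
        if key(left[i]) <= key(right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(items, key=lambda x: x):
    """Return a new list with the elements of items in ascending order."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid], key), merge_sort(items[mid:], key), key)


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))
print(merge_sort([(2, "a"), (1, "c"), (2, "b")], key=lambda r: r[0]))
```

In the second call (2, "a") stays ahead of (2, "b"), as a stable sort requires. The slices and the merged list are the extra space noted above.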