ETCPC 2024 Data Structures - 2

Uploaded by

firahagos7
2024 | ETCPC

ETCPC 2024 Tutorial Session

Data Structures
Ethiopian collegiate programming contest
Topics to be discussed

Introduction

Basic data structures

Trees

Graph

Algorithm analysis

Application
Presented by

Mebatsion Sahle
Competitive programmer &
ETCPC manager
Introduction

Data structures are specialized formats for organizing,
processing, retrieving, and storing data. They are crucial
for managing large amounts of data efficiently, enabling
optimal data retrieval and modification.
Efficient data structures are fundamental to designing
efficient algorithms and ensuring high performance in
software applications.
Note that
Each data structure is designed to solve specific types
of problems and comes with its own set of advantages
and disadvantages.
Classification

Linear vs. Non-linear


Linear Data Structures: Data elements are arranged in a
sequential manner (e.g., arrays, linked lists).
Non-linear Data Structures: Data elements are arranged
hierarchically or graphically (e.g., trees, graphs).
Classification

Static vs. Dynamic


Static Data Structures: Fixed size and structure at compile-
time (e.g., arrays).
Dynamic Data Structures: Flexible size and structure,
allocated and deallocated during runtime (e.g., linked lists).
Basic Data Structures
Arrays
An array is a linear data structure that stores a
collection of elements, all of the same data type, in
contiguous memory locations.
Think of it like a row of lockers, where each locker holds
a specific item and you can access any item by knowing
its position (or index) in the row.
Basic Data Structures
Arrays... (In python)

# initialize with...
myArray = []
# or
myArray = list()

# to add an element
myArray.append(2)
# or
myArray += [2]

print(myArray[0]) # 2

# to remove the first element equal to 2
myArray.remove(2)
Basic Data Structures
Arrays... (pros & cons)
Pros: Easy to access and manipulate elements using
indices.
Cons: Fixed size in languages with static arrays, and inefficient
insertions and deletions at the front and middle, because every
later element has to shift.
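To make the cost of a front insertion concrete, here is a minimal sketch that performs the insertion by hand and counts how many elements have to move. The helper name `insert_front` is hypothetical and exists only for this illustration; in practice `list.insert(0, value)` does the same shifting internally.

```python
def insert_front(items, value):
    """Insert value at index 0 by hand and return how many elements were shifted."""
    items.append(None)                      # grow the list by one slot at the end
    shifts = 0
    for i in range(len(items) - 1, 0, -1):
        items[i] = items[i - 1]             # move each element one slot to the right
        shifts += 1
    items[0] = value
    return shifts


data = [10, 20, 30, 40]
print(data[2])                # index access is direct: 30
print(insert_front(data, 5))  # 4, every existing element had to move
print(data)                   # [5, 10, 20, 30, 40]
```

Reading by index takes one step, while the front insertion touches every element, which is why the cost grows with the length of the array.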
Basic Data Structures
Linked List
A linked list consists of a series of nodes, where each node
contains data and a reference or link to the next node in
the sequence. This structure allows for efficient insertion
and deletion of elements, as the nodes are not stored
contiguously in memory.
Basic Data Structures
Linked List... (Implementation)

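class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

    def __repr__(self):
        # render the chain from this node onward, e.g. "3 -> 2 -> None"
        return f"{self.val} -> {self.next}"

# create a list with a single node
My_linked = ListNode(2, None)

# to add another node at the head
My_linked = ListNode(3, My_linked)

print(My_linked) # 3 -> 2 -> None
print(My_linked.next) # 2 -> None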
Basic Data Structures
Linked List... (pros and cons)
Pros: Dynamic size, efficient insertions and deletions at
the head.
Cons: Slower access time, more memory usage due to
pointers.
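The trade-off above can be sketched in a few lines: adding at the head only creates one node, while reading the element at a given position requires walking the chain. The helper names `push_front` and `get_at` are assumptions made for this example, not part of any library.

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next


def push_front(head, val):
    """Insert at the head: one new node, no shifting, constant time."""
    return ListNode(val, head)


def get_at(head, index):
    """Return the value at a position by walking the links one by one."""
    node = head
    steps = index
    while node is not None and steps > 0:
        node = node.next
        steps -= 1
    if node is None:
        raise IndexError("index out of range")
    return node.val


head = None
for value in [3, 2, 1]:
    head = push_front(head, value)   # the list becomes 1 -> 2 -> 3

print(get_at(head, 0))  # 1
print(get_at(head, 2))  # 3, reached only after following two links
```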
Basic Data Structures
Stack
A pile of items that follows the Last-In-First-Out (LIFO)
principle.
Operations:
Push (insert)
Pop (remove)
Peek (retrieve the top element)
All of them work only at the top of the stack.
Used in expression evaluation and
backtracking algorithms.
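As one small illustration of a stack in expression handling, the sketch below checks whether the brackets in an expression are properly matched. The function name `brackets_match` is hypothetical; the approach assumes only the three bracket kinds listed in the code.

```python
def brackets_match(text):
    """Return True if every closing bracket matches the most recent open one."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in text:
        if ch in pairs.values():
            stack.append(ch)                 # push every opening bracket
        elif ch in pairs:
            # a closing bracket must match the bracket on top of the stack
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack                         # leftovers mean unclosed brackets


print(brackets_match("(a + [b * c])"))  # True
print(brackets_match("(a + b]"))        # False
```

The LIFO order is what makes this work: the most recently opened bracket is always the first one that has to be closed.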
Basic Data Structures
Stack... (implementation)

stack = []
stack.append(2)
stack.append(3)
stack.append(4)
print(stack[-1]) # 4 <- top of the stack
stack.pop()
print(stack[-1]) # 3 <- top of the stack
Basic Data Structures
Stack... (pros & cons)
Pros:
Simple implementation.
Useful for LIFO (Last In, First Out)
operations such as undo/redo and
current-directory tracking in file
systems.
Cons:
Limited access to elements.
Not suitable for random access.
Basic Data Structures
Queues
A queue follows the First-In-First-Out (FIFO) principle, like a waiting line
in everyday life: elements are enqueued at the back and dequeued from
the front.
Types:
Simple Queues: Basic queue operations.
Circular Queues: End of the queue wraps around to the front.
Priority Queues: Elements processed based on priority.
Queues are valuable in task scheduling and in breadth-first search on
graphs.
Basic Data Structures
Queues... (implementation)

from collections import deque

my_queue = deque()

# enqueue from the back


my_queue.append(2)
my_queue.append(3)

# dequeue from the front


my_queue.popleft()
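For the priority queue variant, one common option in Python is the standard `heapq` module, which keeps the smallest item at the front of a plain list. The task names and priority numbers below are made up for illustration, and the sketch assumes that a lower number means a higher priority.

```python
import heapq

tasks = []  # the heap is stored in an ordinary list

# each entry is (priority, name); tuples compare by priority first
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix bug"))
heapq.heappush(tasks, (3, "refactor"))

order = []
while tasks:
    priority, name = heapq.heappop(tasks)  # always removes the smallest priority
    order.append(name)

print(order)  # ['fix bug', 'write report', 'refactor']
```

Unlike the FIFO queue above, the order of removal depends on the priority values, not on the order of insertion.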
Basic Data Structures
Queues... (pros & cons)
Pros:
Useful for FIFO (First In, First Out) operations.
Simple implementation.
Cons:
Limited access to elements (front and back).
Not suitable for random access.

Trees
A tree is a hierarchical, non-linear data structure
composed of one or more nodes connected
by edges.
A node is a point in the tree from which we can move
to the nodes it points to (its children or descendants)
and sometimes back to its parent.
A tree can also be defined as a special type of
graph: a connected graph without a cycle.
Trees... (implementation)

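class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left    # left child (or None)
        self.right = right  # right child (or None)

tree1 = TreeNode(1)
tree2 = TreeNode(2)
tree3 = TreeNode(3)

# attach tree2 and tree3 as the children of tree1
tree1.left = tree2
tree1.right = tree3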
Trees
Terminologies
node: a data structure that contains a value and links to other nodes.
parent: a node is the parent of the nodes its pointers point to.
child: the nodes a node's pointers point to are the child nodes of that
node.
root node: the node without a parent, found at the top of the tree.
edge: the connection between a parent node and a child node.
leaf nodes: nodes without children, found at the bottom of the
tree.
height (depth): the distance between the root node and the leaf
node furthest from it.
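The terms above can be checked on a small tree. This is a minimal sketch that assumes height is counted in edges (so a single node has height 0); some texts count nodes instead. The helper names `height` and `count_leaves` are hypothetical.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def height(node):
    """Number of edges on the longest path from node down to a leaf."""
    if node is None:
        return -1                      # empty subtree, so a leaf ends up at 0
    return 1 + max(height(node.left), height(node.right))


def count_leaves(node):
    """Count the nodes that have no children."""
    if node is None:
        return 0
    if node.left is None and node.right is None:
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


# root 1 has children 2 and 3; node 2 has one child, 4
root = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3))

print(height(root))        # 2 (the path 1 -> 2 -> 4 has two edges)
print(count_leaves(root))  # 2 (nodes 4 and 3)
```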
Trees
Terminologies

[Figure: a tree diagram labeling the root, an edge, the leaf nodes, and the height (depth)]
Trees... (types)
There are some types of trees which differ by each of
their unique characteristics.
They are:
Binary tree
N-ary tree
Binary search tree
Balanced tree
Trees... (types)
Binary Tree
Each node in the tree can have at most 2 children
referred to as the left child and the right child.

Trees... (types)
N-ary Tree
Each node in the tree can have at most N children.

eg: N = 3
Trees... (types)
Binary Search Tree
A special type of binary tree used to organize and
store data in a sorted manner.
For every node, the left subtree contains only values less
than the node's value, while the right subtree contains
only values greater than it.
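The ordering rule is what makes insertion and search simple: at each node, one comparison decides whether to go left or right. Below is a minimal sketch; the names `BSTNode`, `bst_insert`, and `bst_contains` are assumptions for this example, and duplicates are simply ignored.

```python
class BSTNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None


def bst_insert(root, val):
    """Insert val and return the (possibly new) root of the subtree."""
    if root is None:
        return BSTNode(val)
    if val < root.val:
        root.left = bst_insert(root.left, val)     # smaller values go left
    elif val > root.val:
        root.right = bst_insert(root.right, val)   # larger values go right
    return root                                    # equal values are ignored


def bst_contains(root, val):
    """Follow one branch per comparison until val is found or a branch ends."""
    while root is not None:
        if val == root.val:
            return True
        root = root.left if val < root.val else root.right
    return False


root = None
for v in [8, 3, 10, 1, 6]:
    root = bst_insert(root, v)

print(bst_contains(root, 6))  # True
print(bst_contains(root, 7))  # False
```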
Trees... (types)
Balanced Trees
A balanced tree is a binary search tree that keeps its height low
by re-balancing itself after insertions and deletions.
This ensures that the tree remains efficient for operations like
search, insertion, and deletion.
In a height-balanced tree such as an AVL tree, the heights of the
left and right subtrees of every node differ by at most 1.
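A balance check can be sketched as follows, using the common criterion that the two subtrees of every node differ in height by at most 1. This only tests the property; it does not re-balance anything, and the names `is_height_balanced` and `_check` are hypothetical.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def _check(node):
    """Return (is_balanced, height) for the subtree rooted at node."""
    if node is None:
        return True, -1
    left_ok, left_h = _check(node.left)
    right_ok, right_h = _check(node.right)
    ok = left_ok and right_ok and abs(left_h - right_h) <= 1
    return ok, 1 + max(left_h, right_h)


def is_height_balanced(root):
    return _check(root)[0]


balanced = TreeNode(2, TreeNode(1), TreeNode(3))
chain = TreeNode(1, None, TreeNode(2, None, TreeNode(3)))  # degenerates into a list

print(is_height_balanced(balanced))  # True
print(is_height_balanced(chain))     # False
```

The `chain` tree is what a binary search tree turns into when sorted values are inserted without re-balancing, which is the case balanced trees are designed to avoid.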
Trees Traversals

Tree traversal refers to the process of visiting each
node in a binary tree exactly once.
There are several ways to traverse a tree, each with its
own use cases and characteristics.
Those are:
Pre - order
In - order
Post - order
Trees Traversals...
Pre - order
Visit with the order root node -> left subtree -> right subtree.
Use case: Used to create a copy of the tree.

def preOrder(root: TreeNode):
    if not root:
        return
    print(root.val)        # visit the root first
    preOrder(root.left)    # then the left subtree
    preOrder(root.right)   # then the right subtree
Trees Traversals...
In - order
Visit with the order left subtree -> root node -> right subtree.
Use case: In binary search trees, this traversal gives nodes in non-
decreasing order.

def inOrder(root: TreeNode):
    if not root:
        return
    inOrder(root.left)     # left subtree first
    print(root.val)        # then the root
    inOrder(root.right)    # then the right subtree
Trees Traversals...
Post - order
Visit with the order left subtree -> right subtree -> root node.
Use case: Used to delete the tree, garbage collection.

def postOrder(root: TreeNode):
    if not root:
        return
    postOrder(root.left)   # left subtree first
    postOrder(root.right)  # then the right subtree
    print(root.val)        # visit the root last
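To compare the three orders side by side, the sketch below runs them on one small binary search tree and collects the values into lists instead of printing them. The snake_case function names are assumptions made for this example.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def pre_order(node):
    if node is None:
        return []
    return [node.val] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.val] + in_order(node.right)


def post_order(node):
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.val]


# a binary search tree: 4 at the root, 2 and 6 below it, then 1, 3, 5, 7
root = TreeNode(4,
                TreeNode(2, TreeNode(1), TreeNode(3)),
                TreeNode(6, TreeNode(5), TreeNode(7)))

print(pre_order(root))   # [4, 2, 1, 3, 6, 5, 7]
print(in_order(root))    # [1, 2, 3, 4, 5, 6, 7]
print(post_order(root))  # [1, 3, 2, 5, 7, 6, 4]
```

The in-order result comes out sorted because the tree satisfies the binary search tree ordering.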
Trees... (pros and cons)
In general
Pros:
Hierarchical structure.
Efficient search, insert, and delete operations.
Cons:
Complex implementation.
Requires balancing.
Graph
A way to represent relationships between
objects.
A collection of nodes that have data and
are connected to other nodes with edges.
Some graph terminologies:
Nodes or vertices: The objects in
graph.
Edges: the relation between nodes.
Neighbors: Two nodes are neighbors or
adjacent if there is an edge between
them.
Graph... (Types)
Directed vs Undirected
In an undirected graph, edges have no specific direction. They are
bidirectional, meaning you can traverse them in both directions. In
contrast, edges in a directed graph have a specific direction,
indicated by an arrow.
Graph... (Types)
Weighted vs Unweighted
In a weighted graph, each edge has an associated numerical value
called a weight. This weight can represent various metrics like
distance, cost, or time. In an unweighted graph, an edge only
records that a connection between two nodes exists.
Graph... (Types)
Connected vs Disconnected
A graph is said to be connected if there is a path between every pair
of vertices. This means that you can travel from any vertex to any
other vertex by following edges. A graph is disconnected if at least
one pair of vertices is not connected by a path.
Graph Representation

Since graphs do not fit directly into a single generic data
structure, there are several representation techniques that
we use to process them.
Those are:
Adjacency matrix
Adjacency list
Edge list
Graph Representation
Adjacency Matrix
A square matrix used to represent a finite graph: the entry at row i,
column j records whether there is an edge from vertex i to vertex j.
It provides a straightforward way to describe the connections
between vertices in a graph.
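A minimal sketch of building an adjacency matrix for an undirected, unweighted graph follows. It assumes the vertices are numbered from 0 to n - 1; the function name `build_adjacency_matrix` and the sample edges are made up for illustration.

```python
def build_adjacency_matrix(n, edges):
    """Return an n x n matrix with 1 where an edge exists and 0 elsewhere."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1      # undirected: mirror the entry
    return matrix


edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
matrix = build_adjacency_matrix(4, edges)

# edge lookup is a single index operation
print(matrix[0][1] == 1)  # True
print(matrix[0][3] == 1)  # False

# finding neighbors means scanning a whole row
neighbors_of_2 = [v for v in range(4) if matrix[2][v]]
print(neighbors_of_2)     # [0, 1, 3]
```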
Graph Representation
Adjacency Matrix... (pros & cons)

Advantages:
Well suited to representing dense graphs.
Edge lookup is fast.
Disadvantages:
It takes more time to build and consumes more memory: O(n^2).
Finding neighbors of a node is costly.


Graph Representation
Adjacency List
Another way to represent a graph: each vertex stores a list of its
neighbors. It is particularly efficient for sparse graphs (graphs
with relatively few edges).
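An adjacency list can be sketched with a dictionary that maps each vertex to the list of its neighbors. The example assumes an undirected graph; the function name `build_adjacency_list` and the sample edges are hypothetical.

```python
from collections import defaultdict


def build_adjacency_list(edges):
    """Map every vertex to the list of vertices it shares an edge with."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        graph[v].append(u)    # undirected: store the edge in both lists
    return graph


edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
graph = build_adjacency_list(edges)

# neighbors are available directly
print(graph[2])       # [0, 1, 3]

# edge lookup has to scan a neighbor list
print(3 in graph[0])  # False
```

Memory use grows with the number of vertices plus the number of edges, rather than with the square of the number of vertices.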
Graph Representation
Adjacency List... (pros & cons)

Advantages:
It uses less memory.
Neighbours of a node can be easily found.
Best for sparse graphs.
Disadvantages:
Edge lookup is slow.


Graph Representation
Edge List
Represents a graph by listing all its edges. Each edge is represented
as a pair of nodes (or vertices) that it connects.
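A short sketch of an edge list follows. As an assumption beyond the text, each edge also carries a weight as a third field, which is a common extension for weighted graphs; the helper name `has_edge` and the sample data are made up.

```python
# each edge is (u, v, weight); the graph is treated as undirected
edges = [(0, 1, 4), (0, 2, 1), (1, 2, 2), (2, 3, 5)]


def has_edge(edges, u, v):
    """Edge lookup must scan the whole list in the worst case."""
    return any({a, b} == {u, v} for a, b, _ in edges)


print(has_edge(edges, 2, 0))  # True
print(has_edge(edges, 0, 3))  # False

# an edge list is convenient when edges are processed as a whole,
# for example sorted by weight
cheapest_first = sorted(edges, key=lambda edge: edge[2])
print(cheapest_first)  # [(0, 2, 1), (1, 2, 2), (0, 1, 4), (2, 3, 5)]
```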
Graph Representation
Edge List... (pros & cons)

Advantages:
It uses less memory.
Easy to represent.
Disadvantages:
Edge lookup is slow.
Hard to traverse.


Graph Traversal Algorithms

There are two main ones:
Breadth-First Search (BFS): Explores all neighbors at the
present depth prior to moving on to nodes at the next depth
level.
Depth-First Search (DFS): Explores as far as possible along each
branch before backtracking.
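Both traversals can be sketched on a graph stored as a dictionary of neighbor lists. This is a minimal version, assuming every vertex appears as a key; BFS uses a queue and this DFS uses recursion. The sample graph is made up for illustration.

```python
from collections import deque


def bfs(graph, start):
    """Visit vertices level by level, using a FIFO queue."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order


def dfs(graph, start, visited=None):
    """Follow one branch as deep as possible before backtracking."""
    if visited is None:
        visited = set()
    visited.add(start)
    order = [start]
    for neighbor in graph[start]:
        if neighbor not in visited:
            order.extend(dfs(graph, neighbor, visited))
    return order


graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}

print(bfs(graph, 0))  # [0, 1, 2, 3, 4]
print(dfs(graph, 0))  # [0, 1, 3, 2, 4]
```

BFS reaches vertex 2 before vertex 3 because both neighbors of 0 are handled first, while DFS goes 0, 1, 3 and only then backtracks into 2.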
Graph... (pros and cons)
In general
Pros:
Excellent for representing networks.
flexible.
Cons:
Complex implementation.
Can be memory intensive.
Algorithm analysis

Algorithm analysis involves assessing how
algorithms interact with and
manipulate different data
structures.
It helps us design efficient
solutions and optimize
performance.
Algorithm analysis...

Time and Space Complexity


Measure of the amount of time and memory an algorithm takes
relative to the input size.
Big O Notation
Mathematical notation to describe the upper bound of an
algorithm's complexity.
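One way to get a feel for Big O is to count steps instead of measuring seconds. The sketch below compares a linear scan, O(n), with binary search on sorted data, O(log n). The step-counting function names are hypothetical and exist only for this comparison.

```python
def linear_search_steps(items, target):
    """Count how many elements a left-to-right scan inspects."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(items, target):
    """Count how many midpoints binary search inspects in sorted items."""
    lo, hi, steps = 0, len(items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1       # discard the lower half
        else:
            hi = mid - 1       # discard the upper half
    return steps


data = list(range(1024))       # sorted input of size n = 1024

print(linear_search_steps(data, 1023))  # 1024
print(binary_search_steps(data, 1023))  # 11
```

Doubling the input adds about 1024 more steps to the linear scan but only one more step to the binary search.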
Algorithm analysis...

Best, Average, and Worst-case Scenarios (e.g., in sorting)
Best-case: Minimum time taken.
Average-case: Expected time taken.
Worst-case: Maximum time taken.
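Insertion sort is a convenient example of how the same algorithm behaves differently depending on the input. The sketch below counts comparisons; the function name `insertion_sort_comparisons` is an assumption for this illustration.

```python
def insertion_sort_comparisons(items):
    """Sort a copy of items and return (sorted_list, number_of_comparisons)."""
    data = list(items)
    comparisons = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if data[j] <= key:
                break                  # key is already in the right place
            data[j + 1] = data[j]      # shift the larger element to the right
            j -= 1
        data[j + 1] = key
    return data, comparisons


best = insertion_sort_comparisons([1, 2, 3, 4, 5])    # already sorted
worst = insertion_sort_comparisons([5, 4, 3, 2, 1])   # reverse sorted

print(best)   # ([1, 2, 3, 4, 5], 4)
print(worst)  # ([1, 2, 3, 4, 5], 10)
```

For n elements the best case needs n - 1 comparisons, while the worst case needs n(n - 1)/2, which is why insertion sort is described as O(n) in the best case and O(n^2) in the worst case.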
Applications of Data Structures

Real-World Examples and Case Studies


Databases.
File systems.
Operating systems.
Networking.
Thank You
Any Questions?
