Topic 4 Trees

Uploaded by

Dominic Chuchu

TOPIC 4: TREES ADT

A tree data structure is a hierarchical structure that is used to represent and organize data in a
way that is easy to navigate and search. It is a collection of nodes that are connected by edges
and has a hierarchical relationship between the nodes.

Basic Terminologies:

• Parent Node: The node which is the immediate predecessor of a node is called the parent node
of that node. {B} is the parent node of {D, E}.

• Child Node: The node which is the immediate successor of a node is called the child
node of that node. Example: {D, E} are the child nodes of {B}.

• Root Node: The topmost node of a tree, i.e. the node which does not have any parent node,
is called the root node. {A} is the root node of the tree. A non-empty tree must contain
exactly one root node and exactly one path from the root to each of the other nodes of the tree.

• Edge: An edge is the link that connects two nodes, a parent and one of its children.

• Leaf Node or External Node: The nodes which do not have any child nodes are called
leaf nodes. {K, L, M, N, O, P} are the leaf nodes of the tree.

• Ancestor of a Node: Any predecessor node on the path from the root to that node is
called an ancestor of that node. {A, B} are the ancestor nodes of the node {E}.

• Descendant: Any successor node on a path from that node down to a leaf node. {E, I} are
descendants of the node {B}.

• Sibling: Children of the same parent node are called siblings. {D, E} are called siblings.

• Level of a node: The count of edges on the path from the root node to that node. The root
node has level 0.

• Internal node: A node with at least one child is called an internal node.

• Neighbor of a Node: Parent or child nodes of that node are called neighbors of that node.

• Subtree: Any node of the tree along with all of its descendants.

• Height of a Node: The height of a node is the number of edges from the node to the
deepest leaf (i.e. the longest path from the node to a leaf node).

• Depth of a Node: The depth of a node is the number of edges from the root to the node.

• Height of a Tree: The height of a tree is the height of the root node or, equivalently, the
depth of the deepest node.

• Forest: A collection of disjoint trees is called a forest. You can create a forest by removing
the root of a tree.

• Path: Path refers to the sequence of nodes along the edges of a tree.
• Visiting: Visiting refers to checking the value of a node when control is on the node.
• Traversing: Traversing means passing through nodes in a specific order.
• Keys: Key represents a value of a node based on which a search operation is to be carried
out for a node.

Representation of a Node in Tree Data Structure:


struct Node
{
    int data;
    struct Node *first_child;
    struct Node *second_child;
    struct Node *third_child;
    /* ... one pointer for each further child ... */
    struct Node *nth_child;
};
Types of trees
• Binary tree
• BST
• AVL
• B-
1. Binary Tree
A binary tree is a tree data structure in which each parent node can have at most two children.
Each node of a binary tree consists of three items:
• data item
• address of left child
• address of right child
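As a rough illustration of the three items above, a binary tree node could be declared in C as
shown below. The names node, left, right and the helper new_node are assumptions made for this
sketch and do not come from the notes.

```c
#include <assert.h>
#include <stdlib.h>

/* One node of a binary tree: a data item plus the addresses of its children. */
struct node {
    int data;            /* data item */
    struct node *left;   /* address of left child, NULL if there is none */
    struct node *right;  /* address of right child, NULL if there is none */
};

/* Hypothetical helper: allocate a node with no children holding `value`. */
struct node *new_node(int value)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);   /* sketch only: abort if allocation fails */
    n->data = value;
    n->left = NULL;
    n->right = NULL;
    return n;
}
```

A tree is then built by linking nodes, for example root->left = new_node(2); a child pointer
that stays NULL marks a missing child.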
Types of binary trees:
• Full binary tree: A full binary tree is a special type of binary tree in which every node
has either two children or no children.
• Perfect binary tree: A perfect binary tree is a type of binary tree in which every internal
node has exactly two child nodes and all the leaf nodes are at the same level.
• Complete binary tree:
   ◦ Every level, except possibly the last, must be completely filled.
   ◦ All the leaf elements on the last level must lean towards the left.
   ◦ The last leaf element might not have a right sibling, i.e. a complete binary tree
     doesn't have to be a full binary tree.
• Degenerate or pathological tree: a tree in which every internal node has a single child,
either left or right.
• Skewed binary tree: a tree that is dominated either by the left nodes or by the right nodes.
Thus, there are two types of skewed binary tree: left-skewed binary tree and right-skewed
binary tree.
• Balanced binary tree: a type of binary tree in which the difference between the heights of
the left and the right subtree of each node is either 0 or 1.

2. Binary search tree


A binary search tree (BST) is a data structure that lets us maintain a sorted collection of
numbers and search it quickly.
• It is called a binary tree because each tree node has a maximum of two children.
• It is called a search tree because it can be used to search for the presence of a number
in O(log(n)) time, provided the tree is balanced.
The properties that separate a binary search tree from a regular binary tree are:
1. All nodes of the left subtree are less than the root node
2. All nodes of the right subtree are greater than the root node
3. Both subtrees of each node are also BSTs, i.e. they have the above two properties
a) Search Operation

The algorithm relies on the BST property that every value in the left subtree is below the root
and every value in the right subtree is above the root.

If the value is below the root, it cannot be in the right subtree, so we only need to search
the left subtree. If the value is above the root, it cannot be in the left subtree, so we only
need to search the right subtree.

Algorithm:
search(root, number)
    If root == NULL
        return NULL;
    If number == root->data
        return root;
    If number < root->data
        return search(root->left, number);
    If number > root->data
        return search(root->right, number);
b) Insert Operation
Inserting a value in the correct position is similar to searching because we try to maintain the
rule that the left subtree is smaller than the root and the right subtree is larger than the root.
We keep going to either the right subtree or the left subtree depending on the value, and when we
reach a point where the left or right subtree is null, we put the new node there.
Algorithm:
insert(node, data)
    If node == NULL
        return createNode(data);
    If data < node->data
        node->left = insert(node->left, data);
    Else if data > node->data
        node->right = insert(node->right, data);
    return node;
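The search and insert steps above can be sketched in C roughly as follows. This is a minimal
sketch, assuming integer keys and a node layout with data, left and right fields; create_node
is a hypothetical helper, and a key that is already present is simply ignored.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Hypothetical helper: allocate a node with no children holding `data`. */
struct node *create_node(int data)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);              /* sketch only: abort if allocation fails */
    n->data = data;
    n->left = n->right = NULL;
    return n;
}

/* Insert `data` and return the (possibly new) root of this subtree. */
struct node *insert(struct node *node, int data)
{
    if (node == NULL)
        return create_node(data);   /* empty spot found: place the new node here */
    if (data < node->data)
        node->left = insert(node->left, data);
    else if (data > node->data)
        node->right = insert(node->right, data);
    /* equal keys fall through, so the tree holds no duplicates */
    return node;
}

/* Return the node holding `number`, or NULL if it is not in the tree. */
struct node *search(struct node *root, int number)
{
    if (root == NULL)
        return NULL;
    if (number == root->data)
        return root;
    if (number < root->data)
        return search(root->left, number);
    return search(root->right, number);
}
```

Starting from an empty tree (root = NULL), each call root = insert(root, key) keeps the BST
property, so search only ever follows one branch per level.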
c) Deleting a Node in BST
Algorithm
1. Perform search for value X
2. If X is a leaf, delete X
3. Else (X is an internal node)
   • Replace X with the largest value Y in its left subtree OR the smallest value Z in its
     right subtree.
   • Delete the replacement value (Y or Z) from that subtree
Observation
• O(log(n)) operation for a balanced tree
• Deletions may unbalance the tree
Example:
Given a root node reference of a BST and a key, delete the node with the given key in the BST.
Return the root node reference (possibly updated) of the BST.
Basically, the deletion can be divided into two stages:
• Search for a node to remove.
• If the node is found, delete the node.

Input: root = [5,3,6,2,4,null,7], key = 3


Output: [5,4,6,2,null,null,7]
Explanation: The given key to delete is 3, so we find the node with value 3 and delete it.
One valid answer is [5,4,6,2,null,null,7], in which 3 is replaced by the smallest value of its
right subtree (4).
Note that another valid answer is [5,2,6,null,4,null,7], in which 3 is replaced by the largest
value of its left subtree (2); it is also accepted.
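The deletion steps can be sketched in C as follows. This is one possible version under stated
assumptions, not the only correct one: it uses the "smallest value Z on the right subtree"
option, and a node with only one child is replaced directly by that child, a common
simplification that the steps above do not spell out. The names delete_node and new_node are
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Hypothetical helper: allocate a node with no children holding `data`. */
struct node *new_node(int data)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);              /* sketch only: abort if allocation fails */
    n->data = data;
    n->left = n->right = NULL;
    return n;
}

/* Delete `key` from the BST rooted at `root`; return the updated root. */
struct node *delete_node(struct node *root, int key)
{
    if (root == NULL)
        return NULL;                            /* key not found */
    if (key < root->data) {
        root->left = delete_node(root->left, key);
    } else if (key > root->data) {
        root->right = delete_node(root->right, key);
    } else if (root->left == NULL || root->right == NULL) {
        /* leaf or single child: splice the node out */
        struct node *child = root->left ? root->left : root->right;
        free(root);
        return child;
    } else {
        /* two children: copy the smallest value Z of the right subtree */
        struct node *z = root->right;
        while (z->left != NULL)
            z = z->left;
        root->data = z->data;
        root->right = delete_node(root->right, z->data);
    }
    return root;
}
```

On the example tree [5,3,6,2,4,null,7], calling delete_node(root, 3) with this sketch gives
[5,4,6,2,null,null,7], the first of the two accepted answers.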

3. AVL Tree

• The AVL tree is named after its inventors, Georgy Adelson-Velsky and Evgenii Landis.
• An AVL tree is a self-balancing binary search tree in which each node maintains extra
information called a balance factor whose value is either -1, 0 or +1.

Balance Factor

Balance factor of a node in an AVL tree is the difference between the height of the left subtree
and that of the right subtree of that node.

Balance Factor = (Height of Left Subtree - Height of Right Subtree) or (Height of Right Subtree
- Height of Left Subtree), depending on the convention chosen; whichever is used must be applied
consistently to every node.

The self-balancing property of an AVL tree is maintained by the balance factor. The value of
balance factor should always be -1, 0 or +1.
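As a small illustration, the balance factor can be computed in C as sketched below. The sketch
assumes the "left minus right" convention, measures height in edges (an empty subtree counts as
-1), and recomputes heights on every call for simplicity; a practical AVL implementation would
normally store the height or balance factor in each node. All function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Height in edges: an empty subtree is -1, a leaf is 0. */
int height(const struct node *n)
{
    if (n == NULL)
        return -1;
    int hl = height(n->left);
    int hr = height(n->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balance factor = height of left subtree - height of right subtree. */
int balance_factor(const struct node *n)
{
    assert(n != NULL);
    return height(n->left) - height(n->right);
}

/* Returns 1 if every node has a balance factor of -1, 0 or +1, else 0. */
int is_avl_balanced(const struct node *n)
{
    if (n == NULL)
        return 1;
    int bf = balance_factor(n);
    return bf >= -1 && bf <= 1
        && is_avl_balanced(n->left)
        && is_avl_balanced(n->right);
}
```

For a chain of three nodes that only uses right children, the root has balance factor -2, so
the check fails; this is the situation an AVL tree repairs by rebalancing.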

Properties of Tree Data Structure:

• Number of edges: An edge can be defined as the connection between two nodes. If a tree
has N nodes then it will have (N-1) edges. There is only one path from each node to any
other node of the tree.

• Depth of a node: The depth of a node is defined as the length of the path from the root to
that node. Each edge adds 1 unit of length to the path. So, it can also be defined as the
number of edges in the path from the root of the tree to the node.

• Height of a node: The height of a node can be defined as the length of the longest path
from the node to a leaf node of the tree.

• Height of the Tree: The height of a tree is the length of the longest path from the root of
the tree to a leaf node of the tree.

• Degree of a Node: The total count of subtrees attached to that node is called the degree
of the node. The degree of a leaf node must be 0. The degree of a tree is the maximum
degree of a node among all the nodes in the tree.
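The edge-count and degree properties can be checked with a short C sketch. It assumes a binary
tree with the node layout used below; count_nodes, degree and count_edges are hypothetical
names. Edges are counted directly as the sum of all node degrees, so the result can be compared
against N-1.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* N: total number of nodes in the tree. */
int count_nodes(const struct node *n)
{
    if (n == NULL)
        return 0;
    return 1 + count_nodes(n->left) + count_nodes(n->right);
}

/* Degree of a node: the number of non-empty subtrees attached to it. */
int degree(const struct node *n)
{
    assert(n != NULL);
    return (n->left != NULL) + (n->right != NULL);
}

/* Number of edges, counted as the sum of the degrees of all nodes. */
int count_edges(const struct node *n)
{
    if (n == NULL)
        return 0;
    return degree(n) + count_edges(n->left) + count_edges(n->right);
}
```

For any non-empty tree, count_edges(root) should equal count_nodes(root) - 1, and degree
returns 0 for every leaf.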

Application of Tree Data Structure:


• File System: Directories and files form a tree, which allows for efficient navigation and
organization of files.
• Data Compression: Huffman coding is a popular technique for data compression that
involves constructing a binary tree whose leaves represent characters, weighted by their
frequency of occurrence. The resulting tree is used to encode the data in a way that
minimizes the amount of storage required.
• Compiler Design: In compiler design, a syntax tree is used to represent the structure of a
program.
• Database Indexing: B-trees and other tree structures are used in database indexing to
efficiently search for and retrieve data.

Advantages of Tree Data Structure:

• Trees offer efficient searching, depending on the type of tree, with average search times
of O(log n) for balanced trees like AVL.

• Trees provide a hierarchical representation of data, making it easy to organize and
navigate large amounts of information.

• The recursive nature of trees makes them easy to traverse and manipulate using
recursive algorithms.

Disadvantages of Tree Data Structure:

• Trees can become unbalanced, meaning that the height of the tree is skewed towards one side,
which can lead to inefficient search times.

• Trees require more memory than some other data structures such as arrays and linked lists,
especially if the tree is very large.

• The implementation and manipulation of trees can be complex and require a good
understanding of the algorithms.

Tree Traversal

In order to perform any operation on a tree, you need to reach the specific node. A tree
traversal algorithm visits the nodes of the tree in a defined order, which is how the required
node is reached.

There are three ways which we use to traverse a tree −

• In-order Traversal

• Pre-order Traversal

• Post-order Traversal

In-order Traversal

In this traversal method, the left subtree is visited first, then the root and later the right sub-tree.
We should always remember that every node may represent a subtree itself.

If a binary search tree is traversed in-order, the output will produce sorted key values in
ascending order.

(Infix: Left child, then root node, then right child)

We start from A, and following in-order traversal, we move to its left subtree B. B is also
traversed in-order. The process goes on until all the nodes are visited. The output of in-order
traversal of this tree will be −

D→B→E→A→F→C→G

Algorithm

Until all nodes are traversed −

Step 1 − Recursively traverse left subtree.

Step 2 − Visit root node.

Step 3 − Recursively traverse right subtree.

Pre-order Traversal

In this traversal method, the root node is visited first, then the left subtree and finally the right
subtree.

(Prefix: Root node, then left child, then right child)

We start from A, and following pre-order traversal, we first visit A itself and then move to its left
subtree B. B is also traversed pre-order. The process goes on until all the nodes are visited. The
output of pre-order traversal of this tree will be −

A→B→D→E→C→F→G

Algorithm

Until all nodes are traversed −

Step 1 − Visit root node.

Step 2 − Recursively traverse left subtree.

Step 3 − Recursively traverse right subtree.

Post-order Traversal

In this traversal method, the root node is visited last, hence the name. First we traverse the left
subtree, then the right subtree and finally the root node.

(Postfix: Left child, then right child, then root node)

We start from A, and following post-order traversal, we first visit the left subtree B. B is also
traversed post-order. The process goes on until all the nodes are visited. The output of post-order
traversal of this tree will be −

D→E→B→F→G→C→A

Algorithm

Until all nodes are traversed −

Step 1 − Recursively traverse left subtree.

Step 2 − Recursively traverse right subtree.

Step 3 − Visit root node.

Activity:

Example using C to implement tree traversal.
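One possible C sketch of the three traversals is given below. It assumes the example tree used
above (A at the root, B and C as its children, D and E under B, F and G under C) and a node
that stores a single character. Here "visiting" a node means appending its label to an output
buffer; visit and the buffer parameters are assumptions of this sketch, and the caller must
supply a buffer with room for one character per node plus a terminator.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    char data;
    struct node *left;
    struct node *right;
};

/* Visit: append the node's label to `out` and keep it NUL-terminated. */
void visit(const struct node *n, char *out, size_t *len)
{
    assert(n != NULL);
    out[(*len)++] = n->data;
    out[*len] = '\0';
}

/* Left subtree, then root, then right subtree. */
void in_order(const struct node *n, char *out, size_t *len)
{
    if (n == NULL)
        return;
    in_order(n->left, out, len);
    visit(n, out, len);
    in_order(n->right, out, len);
}

/* Root, then left subtree, then right subtree. */
void pre_order(const struct node *n, char *out, size_t *len)
{
    if (n == NULL)
        return;
    visit(n, out, len);
    pre_order(n->left, out, len);
    pre_order(n->right, out, len);
}

/* Left subtree, then right subtree, then root. */
void post_order(const struct node *n, char *out, size_t *len)
{
    if (n == NULL)
        return;
    post_order(n->left, out, len);
    post_order(n->right, out, len);
    visit(n, out, len);
}
```

With len reset to 0 before each call, the buffer holds DBEAFCG after in_order, ABDECFG after
pre_order and DEBFGCA after post_order, matching the outputs listed above.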

Reverse Polish Notation Explained

Reverse Polish notation (RPN) is a mathematical notation for arithmetic expressions in which
operators follow their operands. Operators are functions such as addition, subtraction,
multiplication, division, exponentiation, etc. The operands are the numerical values or
variables on which the operations are performed.

For example, a normal mathematical expression looks like this (Infix notation):

(2 + 1) x 8

Conventionally, we evaluate what is inside the brackets first. As a result, we obtain the sum
of 2 and 1, which equals 3. Subsequently, we multiply 3 by 8, resulting in 24.

We can write the same expression in postfix notation as follows:

2 1 + 8 x

A stack allows for quick evaluation of an expression written in reverse Polish notation. A
stack holds data and supports push and pop operations.

To evaluate the RPN expression, we consider the following steps:

First, push the number "2" onto the stack. Now, push "1". So far there are only two numbers and
nothing to do with them. Next, we read the operator "+". Since we now have an operator and two
operands, we pop the two operands from the stack and perform the operation: we add 1 to 2,
resulting in the sum of 3, which is pushed back onto the stack.

Now, only 3 is on the stack. Next, push 8. Again, there are only two numbers. When the operator
"x" is read, we pop both of them and multiply 3 by 8, resulting in the product of 24, which is
pushed back. Therefore, the only value on the stack is 24. This shows that brackets are
unnecessary in RPN: each operator is applied as soon as it is read, so the order of evaluation
is already fixed.

Furthermore, some calculators and computers evaluate expressions using reverse Polish notation.
Hewlett-Packard was among the first companies to use this system in its desktop calculators, in
the 1970s and 80s.

Example 1:

Let's assume the RPN expression: 3 5 - 1 +

To evaluate this expression, we follow these steps:

1. Push 3 to the stack.
2. Push 5 to the stack.
3. Read the operator -.
4. Pop 5 and 3. 5 is subtracted from 3 to get -2, which is pushed back. Now, -2 is the only
term in the stack.
5. Push 1 to the stack.
6. Read the operator +.
7. Pop 1 and -2. Adding 1 to -2 gives -1, which is pushed back. Therefore, the stack consists
of the result, -1.

Example 2:

So in a computer using RPN, the evaluation of the expression 5 1 - 3 * is as follows:

1. Push 5 onto the stack. This is the first value.

2. Push 1 onto the stack. This is the second value and sits in the position above the 5.

3. Apply the subtraction operation by taking two operands from the stack (1 and 5). The top
value (1) is subtracted from the value below it (5), and the result (4) is stored back on the
stack. 4 is now the only value on the stack and is at the bottom.

4. Push 3 onto the stack. This value is in the position above 4 on the stack.

5. Apply the multiplication operation by taking the last two numbers off the stack (3 and 4)
and multiplying them. The result is then placed back onto the stack. After this operation, the
stack contains only the number 12.
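The stack procedure from the examples can be sketched in C as shown below. This is a minimal
evaluator under stated assumptions, not a full parser: tokens are separated by spaces,
operands are non-negative integers, the operators are written with the ASCII characters
+ - * /, and an operator is applied as soon as it is read instead of being pushed. The names
eval_rpn and STACK_MAX are hypothetical, and malformed input simply trips an assertion.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define STACK_MAX 64

/* Evaluate a space-separated RPN expression and return its value. */
long eval_rpn(const char *expr)
{
    long stack[STACK_MAX];
    int top = 0;                          /* number of values on the stack */

    for (const char *p = expr; *p != '\0'; ) {
        if (isspace((unsigned char)*p)) {
            p++;                          /* skip separators */
        } else if (isdigit((unsigned char)*p)) {
            char *end;
            long value = strtol(p, &end, 10);
            assert(top < STACK_MAX);
            stack[top++] = value;         /* operand: push it */
            p = end;
        } else {
            assert(top >= 2);             /* an operator needs two operands */
            long right = stack[--top];    /* top of stack is the right operand */
            long left = stack[--top];
            switch (*p) {
            case '+': stack[top++] = left + right; break;
            case '-': stack[top++] = left - right; break;
            case '*': stack[top++] = left * right; break;
            case '/': assert(right != 0); stack[top++] = left / right; break;
            default:  assert(!"unknown operator");
            }
            p++;
        }
    }
    assert(top == 1);                     /* exactly one value must remain */
    return stack[0];
}
```

For instance, eval_rpn("2 1 + 8 *") returns 24, eval_rpn("3 5 - 1 +") returns -1 and
eval_rpn("5 1 - 3 *") returns 12, the same results as the worked examples.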
