
DATA STRUCTURES

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

COURSE FILE

Course Name: Data Structures Code:

Academic Year: 2023-24

Year/Semester: II/I

Name of the Faculty: A. KALYANI Branch: CSM,CSD

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

INDEX

S. No.   CONTENT                                                          Page No. (From-To)
1.       Vision and Mission of the Institute                              4
2.       Department Vision and Mission                                    5
3.       PEOs, POs and PSOs                                               6-7
4.       Syllabus                                                         8
5.       List of COs                                                      9
6.       CO PO Mapping                                                    10
7.       Lesson Plan                                                      11-13
8.       Unit wise important Questions/ Assignment Questions
         (Descriptive/Objective Questions)                                14
9.       Unit wise Notes
10.      Previous Question Papers

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Vision of the Institute:

Empowering women as innovators, entrepreneurs and thought leaders in the area of Science &
Technology

Mission of the Institute:

M1: Create an engaging inter-disciplinary experiential teaching and learning ecosystem.

M2: Foster a culture of innovation and entrepreneurial spirit among students and faculty.

M3: Structure collaborative partnerships with industry and academia to address societal
needs and contemporary technological challenges.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Vision of the Department:

To impart quality education in Computer Science and Engineering for women empowerment.

Mission of the Department:

M1: To make the students strong in fundamental concepts and in problem solving skills.
M2: Imparting value based education for women empowerment.
M3: To bring out creativity in students that would promote innovation, research and entrepreneurship.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

PROGRAM EDUCATIONAL OBJECTIVES (PEOs):

PEO-1: Graduates shall have the ability to apply knowledge across the disciplines and in emerging areas
of Computer Science and Engineering for higher studies, research, employability and handle the realistic
problems.
PEO-2: Graduates shall have good communication skills, possess ethical conduct, sense of responsibility
to serve the society, and protect the environment.

PEO-3: Graduates shall have strong foundation in academic excellence, soft skills, managerial skills,
leadership qualities and understand the need for lifelong learning for a successful professional career.

PROGRAM OUTCOMES (PO’S):

PO1: Engineering knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization to the solution of complex engineering problems.

PO2: Problem analysis: Identify, formulate, review research literature, and analyze
complex engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.

PO3: Design/development of solutions: Design solutions for complex engineering problems
and design system components or processes that meet the specified needs with appropriate
consideration for the public health and safety, and the cultural, societal, and environmental
considerations.

PO4: Conduct investigations of complex problems: Use research-based knowledge and research
methods including design of experiments, analysis and interpretation of data, and synthesis of
the information to provide valid conclusions.

PO5: Modern tool usage: Create, select, and apply appropriate techniques, resources and modern
engineering and IT tools including prediction and modeling to complex engineering activities
with an understanding of the limitations.

PO6: The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.

PO7: Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and
need for sustainable development.

PO8: Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.

PO9: Individual and team work: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.

PO10: Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and write
effective reports and design documentation, make effective presentations, and give and
receive clear instructions.

PO11: Project management and finance: An ability to use the techniques, skills and modern
engineering tools necessary for engineering practice.

PO12: Independent and Life Long Learning: Demonstrate a knowledge and understanding of
contemporary technologies, their applications and limitations, contemporary research in the
broader context of relevant fields.

PROGRAM SPECIFIC OUTCOMES (PSO’s):

PSO1: Foundation of Mathematical Concepts: Mathematical methodologies and data structure
algorithms are used in problem solving.

PSO2: Foundation of Computer System: The ability to interpret the fundamental concepts,
methodologies and the functionality of hardware and software aspects of computer systems.

PSO3: Foundations of Software development: The ability to grasp the software development lifecycle
and methodologies. Acquire competent skills and knowledge of software design process.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SYLLABUS

DATA STRUCTURES
II Year B.Tech. CSM I-Sem

Course Objectives:
1. Exploring basic data structures such as stacks and queues.
2. Introduces a variety of data structures such as hash tables, search trees, tries, heaps, graphs.
3. Introduces sorting and pattern matching algorithms.
Course Outcomes:
1. Ability to select the data structures that efficiently model the information in a problem.
2. Ability to assess efficiency trade-offs among different data structure implementations or combinations.
3. Implement and know the application of algorithms for sorting and pattern matching.
4. Design programs using a variety of data structures, including hash tables, binary and general tree
structures, search trees, tries, heaps, graphs, and AVL trees.
UNIT - I
Introduction to Data Structures: abstract data types, Linear list - singly linked list implementation,
insertion, deletion and searching operations on linear list, Stacks - operations, array and linked
representations of stacks, stack applications, Queues - operations, array and linked representations.
UNIT - II
Dictionaries: linear list representation, skip list representation, operations - insertion, deletion and
searching. Hash Table Representation: hash functions, collision resolution - separate chaining, open
addressing - linear probing, quadratic probing, double hashing, rehashing, extendible hashing.
UNIT - III
Search Trees: Binary Search Trees - Definition, Implementation, Operations - Searching, Insertion and
Deletion; B-Trees, B+ Trees; AVL Trees - Definition, Height of an AVL Tree, Operations - Insertion,
Deletion and Searching; Red-Black Trees, Splay Trees.
UNIT - IV
Graphs: Graph Implementation Methods. Graph Traversal Methods.
Sorting: Quick Sort, Heap Sort, External Sorting - model for external sorting, Merge Sort.
UNIT - V
Pattern Matching and Tries: Pattern matching algorithms - Brute force, the Boyer-Moore algorithm, the
Knuth-Morris-Pratt algorithm, Standard Tries, Compressed Tries, Suffix Tries.
TEXT BOOKS: 1. Fundamentals of Data Structures in C, 2nd Edition, E. Horowitz, S. Sahni and Susan
Anderson Freed, Universities Press.
2. Data Structures using C – A. S. Tanenbaum, Y. Langsam, and M.J. Augenstein, PHI/Pearson
Education.
REFERENCE BOOKS: 1. Data Structures: A Pseudocode Approach with C, 2nd Edition, R. F. Gilberg
and B.A. Forouzan, Cengage Learning.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Course Outcomes (COs)

Course Outcomes (COs): R-22 Regulation

1. Ability to select the data structures that efficiently model the information in a problem.

2. Ability to assess efficiency trade-offs among different data structure implementations or
combinations.

3. Implement and know the application of algorithms for sorting and pattern matching.

4. Design programs using a variety of data structures, including hash tables, binary and general tree
structures, search trees, tries, heaps, graphs, and AVL trees.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

COs-POs-PSOs MAPPING

CO-PO MAPPING

PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
C302.1 2 3 2 2 3
C302.2 3 3 3 2 2
C302.3 2 2 3 2 1
C302.4 3 2 3 2 1
Avg. 2.50 2.50 2.75 2.00 1.75

CO-PSO MAPPING

PSO1 PSO2 PSO3


C302.1 2 1 3
C302.2 1 3 2
C302.3 2 3 2
C302.4 1 3 3
Avg. 1.50 2.50 2.50

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

LESSON PLAN

Course Name : Data Structures Code:

Academic Year: 2023-24

Year/Semester: II/I

Name of the Faculty:A. Kalyani Branch: CSM,CSD

Course objectives:

1. Exploring basic data structures such as stacks and queues.

2. Introduces a variety of data structures such as hash tables, search trees, tries, heaps, graphs.

3. Introduces sorting and pattern matching algorithms.

S. No.   Topic   Unit No./ No. of periods per Unit   Scheduled date   Mode of teaching
1. Introduction to data structures and its 19/9/23 PowerPoint
types Presentation
2. Types of data structures with examples 20/9/23 Demonstration
3. Linear data structures types linked Unit-I / 21/9/23 Chalk and Board
list ,stacks, queues introduction 20
4. Single linked list operations 22/9/23 Chalk and Board
5. Algorithm for create, insertion, deletion 23/9/23 Chalk and Board
operations
6. Example for single linked list 25/9/23 Demonstration
7. Double linked list, circular linked list 26/9/23 to 29/9/23 PowerPoint
and its differences Presentation
8. Stacks by using arrays 3/10/23 Chalk and Board
9. Stacks using linked list 4/10/23 Chalk and Board
10. Applications of stack 5/10/23 Chalk and Board
11. Types of notations 6/10/23 Demonstration
12. Conversion of expressions from postfix 7/10/23 Demonstration
to prefix and postfix to infix
13. Conversion of expressions from prefix 9/10/23 Chalk and Board
to postfix and prefix to infix
14. Conversion of expressions from infix to 11/10/23 Chalk and Board
prefix and postfix
15. Evaluation of expressions 12/10/23 Demonstration
16. Queue implementation using arrays and 13/10/23 Chalk and Board
LinkedList with algorithm
17. Circular queue and its operations, 17/10/23 In-Class Activity
Types of queues
18. Dictionary ADT 18/10/23 PowerPoint
Unit-II/ Presentation
10 19/10/23

19. Skip list with its operations 20/10/23 Chalk and Board
20. Example with operations of skip list 21/10/23 Chalk and Board
21. Hashing, hash table, hash key, Types of 30/10/23 Chalk and Board
hash function
22. Collision resolution techniques with 1/11/23 PowerPoint
example Presentation
23. Rehashing, extendible hashing with 2/11/23 Chalk and Board
example
24. Trees introduction with basic 3/11/23 Chalk and Board
terminology Unit-III/
25. Binary tree and its types, tree traversals 13 4/11/23 Chalk and Board
26. BST and its operations 6/11/23 Chalk and Board
27. B-Trees Explanation 7/11/23 to 8/11/23 Chalk and Board
28. Examples of B-Trees,adv&disadv 9/11/23 PowerPoint
Presentation
29 B+ Trees Explanation 10/11/23 to Chalk and Board
15/11/23
30. Examples of B+-Trees,adv&disadv 16/11/23to17/11/23 PowerPoint
Presentation
31. AVL tree and its operations, example 18/11/23 to Chalk and Board
21/11/23
32. Red Black tree its operations example 22/11/23 to PowerPoint
for insertion 25/11/23 Presentation
33. Deletion operation with example 26/11/23 Chalk and Board
to27/11/23
34. Splay trees introduction ,rotations and 2/12/23 to4/12/23 Chalk and Board
operations

35. Graphs Introduction,Types of graphs 5/12/23 to 7/12/23 Chalk and Board


Unit-IV/
13 Chalk and Board

36. Graph implementation methods 8/12/23 PowerPoint


Presentation
37. Graph implementation using linkedlist 11/12/23 to PowerPoint
13/12/23 Presentation
38. Graph traversal methods,BFS 14/12/23 to Chalk and Board
15/12/23
39. Algorithm and example for BFS 18/12/23 PowerPoint
Presentation
40. DFS,Algorithm explanation 20/12/23 Chalk and Board
41.. Quick sort&example 21/12/23 to PowerPoint
22/12/23 Presentation
42. Heap sort with example 23/12/23 Chalk and Board
43. External sorting 27/12/23 PowerPoint
Presentation
44. Mergesort 29/12/23 PowerPoint
Presentation
45. Pattern matching-Brute force Unit-V/7 2/1/24 to5/1/24 PowerPoint
Presentation
46. Boyer-Moore Algorithm 6/1/24 to 8/1/24 In-Class Activity
47. Knuth-Morris-Prat algorithm 9/1/24 Chalk and Board
48. Standard Tries 17/1/24to 19/1/24 In-Class Activity
49. Compressed Tries 22/1/24to 24/1/24 Chalk and Board
50. Suffix Tries 2 Chalk and Board

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Unit wise Important Questions (Descriptive)

Unit-I

1. Discuss about linear and nonlinear data structures?


2. Define data structures? and list various types of data structures.
3. What is stack? Explain basic operations of stack data structures.
4. What is queue? Explain basic operations of queue data structures.
5. Explain about the applications of stack?
6. Write an algorithm to convert an expression from infix to prefix?
7. Write an algorithm to convert an expression from infix to postfix?
8. What is circular queue? explain its operations
9. Define single linked list? Explain how to insert and delete an element at the end of
the list.
10. Define linked list, double linked list and circular linked list with their advantages?

Unit-II

1. Explain in detail about the skip list operations?


2. Write an algorithm for all operations of Dictionary ADT?
3. Discuss about dictionary ADT?
4. Define hash function, hash table, hashing in detail?
5. What are the properties of hash function?
6. What are the different methods of collision resolution techniques? Explain briefly.
7. Explain about a) Rehashing b) Extendible hashing?
8. Explain about various hash function techniques with an example.
9. What are the disadvantages with linear probing and quadratic probing?
10. Explain in detail about the advantages and disadvantages of each collision resolution techniques?

Unit-III

1. What is a Binary tree? Explain about the types of binary trees with an example?
2. Discuss in detail about the types of Tree traversal methods?
3. In How many ways we represent a binary tree? Discuss in detail with an example?
4. What is a binary search tree? Explain properties of binary search tree with an example?
5. What are the operations performed on BST?
6. Explain about the operations of an AVL tree with an example?
7. Insert the following list of elements into an AVL tree: 12, 30, 36, 18, 25, 9, 4, 2, 17, 14, 20, 47.
Then delete the elements 18, 2 and 30 from the AVL tree.
8. Explain about RED-BLACK trees with an example?
9. Construct a Red-Black tree with the following elements 40, 16, 36, 54, 18, 7, 48, 5. Delete
element 18 and add element 66.
10. What is splay tree? Explain about its rotations with an example?
11. Explain about B-Trees with example?
12. Explain about B+-Trees with example?

Unit-IV

1. What is a graph? Discuss various types of graphs and briefly explain a few applications of graphs.
2. Write a BFS algorithm with an example?
3. Write a DFS algorithm with an example?
4. Difference between BFS and DFS algorithm with examples
5. Explain about external sorting with an example.
6. Write an algorithm to implement a depth-first search with an example
7. Write an algorithm to Quicksort concept
8. Explain about Quicksort algorithm with example?
9. Perform heap sort algorithm for (10 15 6 2 25 18 16 2 20 4).
10. Explain the process of heap sort with example
11. Compare and contrast different sorting methods?
12. Sort the following list of elements by using Merge sort 30, 56, 78, 99, 12, 43, 10, 24, 85

Unit-V

1. Difference between tree and tries.


2. Explain about standard tries with an example
3. Explain about the suffix tries with an example?
4. Illustrate the Brute force algorithm.
5. Write an algorithm of compressed Tries
6. Explain in brief about tries with example?
7. What is the use of Knuth–Morris–Pratt string-searching algorithm
8. Explain about compressed and standard tries with an example
9. Explain about the comparison between tries and hash table?
10. Write about the advantages and disadvantages of tries

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Objective Questions:

1. Stack is also called as

A) Last in first out

B) First in last out

C) Last in last out

D) First in first out

2. Inserting an item into the stack when the stack is not full is called …………. operation, and deletion of
an item from the stack, when the stack is not empty, is called ……….. operation.

A) push, pop

B) pop, push

C) insert, delete

D) delete, insert

3. ……………. is a pile in which items are added at one end and removed from the other.

A) Stack

B) Queue

C) List

D) None of the above

4. ………… is very useful in situations when data have to be stored and then retrieved in reverse order.

A) Stack

B) Queue

C) List

D) Link list

5. Which data structure allows deleting data elements from and inserting at rear?

A) Stacks

B) Queues

C) Dequeues

D) Binary search tree

6. Which of the following data structures can't store non-homogeneous data elements?

A) Arrays

B) Records

C) Pointers

D) Stacks

7. A ……. is a data structure that organizes data similar to a line in the supermarket, where the first one
in line is the first one out.

A) Queue linked list

B) Stacks linked list

C) Both of them

D) Neither of them

8. Which of the following is a non-linear data structure?

A) Stacks

B) List

C) Strings

D) Trees

9. Which of the following data structure is linear type?

A) Graph

B) Trees

C) Binary tree

D) Stack

10. To represent hierarchical relationship between elements, Which data structure is suitable?

A) Dequeue

B) Priority

C) Tree

D) Graph

11. What will be the value of top, if the size of the stack STACK_SIZE is 5?

A) 5

B) 6

C) 4

D) None

12. ………… is not the operation that can be performed on queue.

A) Insertion

B) Deletion

C) Retrieval

D) Traversal

13. There is an extra element at the head of the list called a ……….

A) Antinel

B) Sentinel

C) List header

D) List head

14. ………………. is not an operation performed on linear list

a) Insertion b) Deletion c) Retrieval d) Traversal

A) only a,b and c

B) only a and b

C) All of the above

D) None of the above

15. Which is/are the application(s) of stack


A) Function calls

B) Large number Arithmetic

C) Evaluation of arithmetic expressions

D) All of the above

16. …………………. Is a directed tree in which outdegree of each node is less than or equal to two.

A) Unary tree

B) Binary tree

C) Trinary tree

D) Both B and C

17. Which of the following data structures are indexed structures?

A. Linear arrays

B. Linked lists

C. Queue

D. Stack

18. Which of the following data structure store the homogeneous data elements?

A. Arrays

B. Records

C. Pointers

D. Lists

19. When new data are to be inserted into a data structure, but there is no available space, this situation
is usually called ….

A. Underflow

B. overflow

C. houseful

D. saturated

20. A data structure where elements can be added or removed at either end but not in the middle is called

A. linked lists

B. stacks

C. queues

D. dequeue

21. Operations on a data structure may be …..

A. creation

B. destruction

C. selection

D. all of the above

22. The way in which the data item or items are logically related defines …..

A. storage structure

B. data structure

C. data relationship

D. data operation

23. Which of the following are the operations applicable on primitive data structures?

A. create

B. destroy

C. update

D. all of the above

24. The use of pointers to refer elements of a data structure in which elements are logically adjacent is ….

A. pointers

B. linked allocation

C. stack

D. queue

25. Which of the following data structure is non-linear type?

A) Strings
B) Lists

C) Stacks

D) Tree

26. Which of the following data structure is linear type?

A) Array

B) Tree

C) Graphs

D) Hierarchy

27. The logical or mathematical model of a particular organization of data is called a ………

A) Data structure

B) Data arrangement

C) Data configuration

D) Data formation

28. The simplest type of data structure is ………………

A) Multidimensional array

B) Linear array

C) Two dimensional array

D) Three dimensional array

29. Linear arrays are also called ……………….

A) Straight line array

B) One-dimensional array

C) Vertical array

D) Horizontal array

30. Arrays are best data structures …………

A) For relatively permanent collections of data.

B) For the size of the structure and the data in the structure are constantly changing
C) For both of above situation

D) For none of the above

31. Which of the following data structures are indexed structures?

A) Linear arrays

B) Linked lists

C) Graphs

D) Trees

32. Each node in a linked list has two pairs of ………….. and ……………….

A) Link field and information field

B) Link field and avail field

C) Avail field and information field

D) Address field and link field

33. When does top value of the stack changes?

A) Before deletion

B) While checking underflow

C) At the time of deletion

D) After deletion

34. The disadvantage in using a circular linked list is …………………….

A) It is possible to get into infinite loop.

B) Last node points to first node.

C) Time consuming

D) Requires more memory space

35. A linear list in which each node has pointers to point to the predecessor and successor nodes is
called as..

A) Singly Linked List

B) Circular Linked List

C) Doubly Linked List

D) Linear Linked List

36. A ……………….. is a linear list in which insertions and deletions are made at either end of the
structure.

A) circular queue

B) random of queue

C) priority

D) dequeue

37. Which of the following is an application of stack?

A) finding factorial

B) tower of Hanoi

C) infix to postfix conversion

D) all of the above

38. The operation of processing each element in the list is known as

a. Sorting

b. Merging

c. Inserting

d. Traversal

39. Finding the location of the element with a given value is:

a. Traversal

b. Search

c. Sort

d. None of above

40. What is a hash table?

a) A structure that maps values to keys

b) A structure that maps keys to values

c) A structure used for storage

d) A structure used to implement stack and queue

41. If several elements are competing for the same bucket in the hash table, what is it called?

a) Diffusion

b) Replication

c) Collision

d) None of the mentioned

42. What is direct addressing?

a) Distinct array position for every possible key

b) Fewer array positions than keys

c) Fewer keys than array positions

d) None of the mentioned

43. What is the search complexity in direct addressing?

a) O(n)

b) O(logn)

c) O(nlogn)

d) O(1)

44. What is a hash function?

a) A function has allocated memory to keys

b) A function that computes the location of the key in the array

c) A function that creates an array

d) None of the mentioned

45. What data structure is used when converting an infix notation to prefix notation?

a) Stack

b) Queue

c) B-Trees
d) Linked-list

46. What is the postfix expression for the corresponding infix expression a+b*c+(d*e)

a) abc*+de*

b) abc+*de*

c) a+bc*de+*

d) abc*+(de)*

47. In infix to postfix conversion algorithm, the operators are associated from?

a) right to left

b) left to right

c) centre to left

d) centre to right

48. Binary trees can have how many children?

a) 2

b) any number of children

c) 0 or 1 or 2

d) 0 or 1

49. Disadvantage of using array representation for binary trees is?

a) difficulty in knowing children nodes of a node

b) difficult in finding the parent of a node

c) have to know the maximum number of nodes possible before creation of trees

d) difficult to implement

50. What must be the ideal size of the array if the height of the tree is 'l'?

a) 2^l - 1

b) l - 1

c) l

d) 2^l
51. Which of the following is an external sorting technique? ( )

A. Quick Sort B. Merge Sort C. Selection Sort D. Insertion Sort

52. The average depth of a binary search tree is ( )

A. O(n/2) B. O(n) C. O(log n) D. O(n^2)

53. Comparisons in Brute-Force is ( )

A. from middle B. start to middle C. left to right D. right to left

54. In a graph set of nodes are called as ( )

A. notes B. arrows C. vertices D. edges

55.What is the special property of red-black trees and what root should always be? ( )
a) a color which is either red or black and root should always be black color only
b) height of the tree

c) pointer to next node


d) a color which is either green or black

56. What are splay trees? ( )


a) self adjusting binary search trees b) self adjusting binary trees
c) a tree with strings d) a tree with probability distributions

57. What is a splay operation? ( )


a) moving parent node to down of child b) moving a node to root
c) moving root to leaf d) removing leaf node

58. What is an AVL tree? ( )


a) a tree which is balanced and is a height balanced tree
b) a tree which is unbalanced and is a height balanced tree
c) a tree with three children
d) a tree with atmost 3 children

59. Why we need to a binary tree which is height balanced? ( )


a) to avoid formation of skew trees b) to save memory
c) to attain faster memory access d) to simplify storing

60. A BST is traversed in the following order recursively: Right, root, left
The output sequence will be in ( )

a. ascending order b. descending order c. Sequence d. No specific order

61. In a binary min heap containing n elements, the largest element can be found in ___ time. ( )
a) O(n) b) O(nlogn) c) O(logn) d) O(1)

62. Min heap is a complete ( )

a. binary tree  b. binary search tree  c. tree  d. B+ tree


63.What will be the position of 5, when a max heap is constructed on the input elements 5, 70, 45, 7, 12,
15, 13, 65, 30, 25? ( )
a) 5 will be at root b) 5 will be at last level
c) 5 will be at second level d) 5 can be anywhere in heap

64. Which one of the following array elements represents a binary min heap? ( )
a) 12 10 8 25 14 17 b) 8 10 12 25 14 17
c) 25 17 14 12 10 8 d) 14 17 25 10 12 8

65. Space complexity for an adjacency list of an undirected graph having large values of V (vertices) and
E (edges) is ( )
a) O(V) b) O(E*E) c) O(E) d) O(E+V)

66. Time complexity to find if there is an edge between 2 particular vertices is ( )


a) O(V) b) O(E) c) O(1) d) O(V+E)

67. In which case adjacency list is preferred in front of an adjacency matrix? ( )


a) Dense graph b) Sparse graph

c) Adjacency list is always preferred d) Complete graph

68. The number of elements in the adjacency matrix of a graph having 7 vertices is ( )
a) 7 b) 14 c) 36 d) 49

69. What is the number of edges present in a complete graph having n vertices? ( )
a) (n*(n+1))/2 b) (n*(n-1))/2
c) n d) Information given is insufficient

70.Which of the following properties does a simple graph not hold? ( )


a) Must be connected b) Must be unweighted

71. What is the maximum number of edges in a bipartite graph having 10 vertices? ( )
a) 24 b) 21 c) 25 d) 16

72. For a given graph G having v vertices and e edges which is connected and has no cycles, which of the
following statements is true? ( )
a) v=e b) v = e+1 c) v + 1 = e d) v = e-1

73.A graph with all vertices having equal degree is known as a ( )


a) Multi Graph b) Regular Graph
c) Simple Graph d) Complete Graph

74. Which of the following ways can be used to represent a graph? ( )
a) Adjacency List and Adjacency Matrix b) Incidence Matrix
c) Sparse Matrix d) No way to represent

75. The Data structure used in standard implementation of Breadth First Search is? ( )
a) Stack b) Queue
c) Linked List d) Tree

76. The Breadth First Search traversal of a graph will result into? ( )
a) Linked List b) Tree c) Graph with back edges d) Arrays

77. A person wants to visit some places. He starts from a vertex and then wants to visit every place
connected to this vertex and so on. What algorithm he should use? ( )
a) Depth First Search b) Breadth First Search
c) Trim’s algorithm d) Kruskal’s algorithm

78. Depth First Search is equivalent to which of the traversal in the Binary Trees? ( )
a) Pre-order Traversal b) Post-order Traversal
c) Level-order Traversal d) In-order Traversal

79. Time Complexity of DFS is? (V – number of vertices, E – number of edges) ( )


a) O(V + E) b) O(V) c) O(E) d) O(V*E)

80. The Data structure used in standard implementation of Breadth First Search is? ( )
a) Stack b) Queue c) Linked List d) Tree

81. The Depth First Search traversal of a graph will result into? ( )

a) Linked List b) Tree c) Graph with back edges d) Array

82. A person wants to visit some places. He starts from a vertex and then wants to visit every vertex till it
finishes from one vertex, backtracks and then explore other vertex from same vertex. What algorithm he
should use? ( )

a) Depth First Search b) Breadth First Search

c) Trim’s algorithm d) Kruskal’s Algorithm

83 What is the worst case time complexity of KMP algorithm for pattern searching (m = length of text, n
= length of pattern)? ( )

a) O(n) b) O(n*m) c) O(m) d) O(log n)

84. Trie is also known as ( )

a) Digital Tree b) Treap c) Binomial Tree d) 2-3 Tree

85. Which of the following is the efficient data structure for searching words in dictionaries? ( )
a) BST b) Linked List c) Balanced BST d) Trie

86. Which of the following special type of trie is used for fast searching of the full texts? ( )

a) Ctrie b) Hash tree c) Suffix tree d) T tree

87. What can be the maximum depth of the trie with n strings and m as the maximum string length? ( )

a) log2n b) log2m c) n d) m

88. Which of the following is true about the trie? ( )

a) root is letter a b) path from root to the leaf yields the string

c) children of nodes are randomly ordered d) each node stores the associated keys

89. What is a time complexity for finding the longest prefix that is common between suffix in a string?(
)

a) Ɵ (n) b) Ɵ (n!) c) Ɵ (1) d) O (log n!)

90.What is a time complexity for finding the longest palindromic substring in a string by using the
generalized suffix tree? ( )

a) Linear Time b) Exponential Time

c) Logarithmic Time d) Cubic Time

91.What is a time complexity for finding the total length of all string on all edges of a tree? ( )
a) Ɵ (n) b) Ɵ (n!) c) Ɵ (1) d) O (n2)

92. For what number of nodes is the worst case of space usage in a suffix tree seen? ( )
a) n Nodes b) 2n Nodes c) 2n nodes d) n! Nodes

93. An algorithm which tries all the possibilities until the results are satisfactory, and is generally time
consuming, is ( )
a) Brute Force b) Divide and Conquer
c) Dynamic programming algorithms d) None of the mentioned

94. Which is the simplest sorting technique to sort a list of elements? ( )

a) insertion sort b) selection sort c) bubble sort d) none of the above

95.To perform the heap sort, you need to create a tree with all nodes greater than their ( )

a)sibling b)children c) parent d)root

96 Which of the following algorithms formed the basis for the Quick search algorithm? ( )
a) Boyer-Moore’s algorithm b) Parallel string matching algorithm
c) Binary Search algorithm d) Linear Search algorithm

97.What character shift tables does Boyer-Moore’s search algorithm use? ( )


a) good-character shift tables b) bad-character shift tables
c) next-character shift tables d) both good and bad character shift tables

98. What is the worst case running time in searching phase of Boyer-Moore’s algorithm? ( )
a) O(n) b) O(log n) c) O(m+n) d) O(mn)

99. Procedure of sorting algorithms for larger records that does not fit in main memory and are stored on
disk is classified as ( )

a. parser sorting b.external sorting c.internal sorting d.secondary sorting

100. In external sorting, number of runs that can be merged in every pass are called ( )

a.degree of merging b.degree of passing c.degree of sorting d.degree of runs

Fill up the Blanks.

1.Process of inserting an element in stack is called ----------------

2. Process of removing an element from stack is called __________

3. In a stack, if a user tries to remove an element from empty stack it is called _________

4. Pushing an element into stack already having five elements and stack size of 5, then stack becomes

5. What is the value of the postfix expression 6 3 2 4 + – *:

6. Which data structure is needed to convert infix notation to postfix notation?

7. Consider the following operation performed on a stack of size 5.


Push(1);Pop();Push(2);Push(3);Pop();Push(4);Pop();Pop();Push(5);
After the completion of all operation, the number of elements present in stack are

8. A linear list of elements in which deletion can be done from one end (front) and insertion can take
place only at the other end (rear) is known as a ?

9. A queue follows __________

10. Circular Queue is also known as ________

11 If the elements “A”, “B”, “C” and “D” are placed in a queue and are deleted one at a time, in what
order will they be removed?-----------------------

12. A data structure in which elements can be inserted or deleted at/from both the ends but not in the
middle is? ------------------------------

13. A normal queue, if implemented using an array of size MAX_SIZE, gets full when the condition is
-------------------------

14. Queues serve major role in ______________

15. A linear collection of data elements where the linear node is given by means of pointer is called?

16. In linked list each node contain minimum of two fields. One field is data field to store the data
second field is?

17. what c code is used to create new node?

18. Linked list is considered as an example of ___________ type of memory allocation.

19.Name one major disadvantage for the applications of linked list?

20. ‘stack underflow’ mean

21. ‘stack overflow’ mean

22. Minimum number of queues to implement stack is ___________


23. Which of the following properties is associated with a queue?

24. In a circular queue, how do you increment the rear end of the queue?

25. What is the need for a circular queue?

26. In the linked list implementation of a queue, from where is the item deleted?

27. In the linked list implementation of a queue, the important condition for a queue to be empty is?

28. The essential condition which is checked before insertion in a linked queue is?

29. Name one real world scenarios that you would associate with a stack data structure?

30. The result of evaluating the postfix expression 5, 4, 6, +, *, 4, 9, 3, /, +, * is?

31.Convert the following infix expressions into its equivalent postfix expressions
(A+B⋀D)/(E–F)+G

32. In linked list each node contain minimum of two fields. One field is data field to store the data second
field is?

33.. Which c code is used to create new node?

34.Each node in a linked list must contain at least …..

35. A hash table is a __________________

36. What is the hash function used in the division method?

37 How many steps are involved in creating a hash function using a multiplication method?

38. If several elements are competing for the same bucket in the hash table, what is it called?

39. In a hash table of size 10, where is element 7 placed?

40. A linear list in which each node has pointers to point to the predecessor and successors nodes is
called as..

41. A ……………….. is a linear list in which insertions and deletions are made to from either end of the
structure.

42. A data structure where elements can be added or removed at either end but not in the middle is called

43. The way in which the data item or items are logically related defines …..
44. A ……. is a data structure that organizes data similar to a line in the supermarket, where the first one
in line is the first one out.

45. Which of the following is non-liner data structure?

46. Which of the following data structure is linear type?

47. To represent hierarchical relationship between elements, Which data structure is suitable?

48. What will be the value of top, if there is a size of stack STACK_SIZE is 5

49. ………… is not the operation that can be performed on queue.

50. There is an extra element at the head of the list called a ……….

51.____________ is the process of visiting every node in a tree atleast once.

52. In post order traversal, the root node is visited________________.

53. Children of the same parent are called ____________________.

54. Nodes which are subtrees of another node are called _______________.

55. In Binary Search Trees, the keys of all elements are __________________.

56. If a node is a leaf node, then its left child and right child field are filled with _____________.

57. The process where two rotations are required to balance a tree is called _________________.

58. When the height of the left subtree and right subtree of a node in an AVL Tree are equal the
balancing factor is _____________________.

59. The __________________ of a vertex is the number of edges this vertex has that are connected to
other vertices.

60. The weight or value of an edge is also called ___________________.

61. In _______________ the shortest path can be found.

62. The vertex is _________________ from the queue when it is visited.

63. The process of sorting a list stored in a file in secondary memory is known as _________________.

64. In a maxheap data structure, the_______________ element is placed in the root of the heap.

65. In a minheap data structure, the _______________ element is placed in the root of the heap.

66. ______________is known as compressed trie.

67. __________________provides a linear time solution for substring operation?

68. The other name for trie is __________________.

69._____________ is the efficient data structure for searching words in dictionaries.

70._____________is special type of trie is used for fast searching of the full texts.

71. In Breadth First Search of Graph, _______________data structure is used.

72. In depth first search graph, _____________ data structure is used.

73. In depth first search graph, _______________technique is used to remove a vertex from stack.

74. In red black tree, every path from a given node to any of its descendant leaves must contain the
___________________ number of black nodes.

75. The root of a red black tree is always ____________________.

76. A red-black tree is a self-balancing tree structure that applies a ______________ to each of its nodes.

77. In an AVL tree, the heights of the two child subtrees of any node differ by at most_______________

78. The operation of processing each element in the list is known as ____________________

79. A graph is said to be __________________ if every vertex' u ' in graph ' G ' is adjacent to every
other vertex ' v ' in graph ' G ',

80. ___________________ is a self - adjusted Binary Search Tree.

81. ________________was the first linear time complexity algorithm for string matching

82. The term trie came from the word ____________________

83. Prefix tree is known as _____________________

84. In a trie, the time complexity of searching a string depends upon the _________________ of the string

85. In trie, every node except the root stores a ___________________ value

86. In tries, if any two strings have a common prefix then they will have the ____________ ancestors

87. In ____________________representation , every vertex of a graph contains list of its adjacent


vertices

88. Graph traversal is a technique used for a searching ________________in a graph.

89. A __________________ is also known as generate and test. Sequential search is also considered as
and referred to as __________________.

90. In a __________________ every new node operation is performed such that it moves to the root of
the tree.
91.If preorder of the binary tree is A B D H E C F I G J K, then the postorder of the binary tree will be
____________________

92. what is the value which cannot be a balance factor of any node of an AVL tree ____________

93.In zig rotation, every node moves one position to the _______________ from its current position.

94.In zag rotation, every node moves one position to the ______________ from its current position.

95.The ___________________in splay tree ,every node moves two positions to the right from its current
position

96.The Zag-Zag Rotation in splay tree,every node moves two positions to the ___________from its
current position.

97.In zig-zag rotation, every node moves one position to the ________________followed by one position
to the ___________________from its current position

98.In_____________, every node moves one position to the left followed by one position to the right
from its current position

99.In splay tree ,before deleting the element, we first need to _______________that element and then
delete it from the root position

100. In an AVL tree, the heights of the two child subtrees of any node differ by at
most_______________

UNIT-I
Introduction to Data Structures

Data Structure
Data Structure can be defined as the group of data elements which provides
an efficient way of storing and organising data in the computer so that it can
be used efficiently. Some examples of Data Structures are arrays, Linked
List, Stack, Queue, etc.
Need of Data Structures

Processor speed: To handle very large amounts of data, high-speed
processing is required, but as the data grows day by day to billions of
files per entity, the processor may fail to deal with that much data.

Data Search: Consider an inventory of 10^6 items in a store. If our
application needs to search for a particular item, it has to traverse
10^6 items every time, which slows down the search process.

Multiple requests: If thousands of users are searching the data
simultaneously on a web server, there is a chance that even a very
large server can fail during that process.

In order to solve the above problems, data structures are used.

Advantages of Data Structures


Efficiency: The efficiency of a program depends upon the choice of
data structures. For example, suppose we have some data and we need
to search for a particular record. If we organize our data in an
array, we will have to search sequentially, element by element; hence,
using an array may not be very efficient here. There are better data
structures which can make the search process efficient, like an
ordered array, a binary search tree or a hash table.

Reusability: Data structures are reusable, i.e. once we have
implemented a particular data structure, we can use it at any other
place. Implementations of data structures can be compiled into
libraries which can be used by different clients.

Abstraction: A data structure is specified by an ADT, which provides
a level of abstraction. The client program uses the data structure
through the interface only, without getting into the implementation
details.

Linear Data Structures: A data structure is called linear if all of
its elements are arranged in a linear order. In linear data
structures, the elements are stored in a non-hierarchical way where
each element has a successor and a predecessor, except the first and
last elements.

Types of Linear Data Structures are given below:

Arrays: An array is a collection of similar types of data items, and
each data item is called an element of the array. The data type of
the element may be any valid data type like char, int, float or
double.

The elements of an array share the same variable name but each one
carries a different index number known as a subscript. The array can
be one dimensional, two dimensional or multidimensional.

The individual elements of the array age are:

age[0], age[1], age[2], age[3], ......... age[98], age[99].

Linked List: A linked list is a linear data structure which is used to
maintain a list in memory. It can be seen as a collection of nodes
stored at non-contiguous memory locations. Each node of the list
contains a pointer to its adjacent node.
Stack: A stack is a linear list in which insertions and deletions are
allowed only at one end, called the top.

A stack is an abstract data type (ADT) and can be implemented in most
programming languages. It is named a stack because it behaves like a
real-world stack, for example, a pile of plates or a deck of cards.

Queue: A queue is a linear list in which elements can be inserted
only at one end, called the rear, and deleted only at the other end,
called the front.

It is an abstract data structure, similar to a stack. A queue is open
at both ends; therefore it follows the First-In-First-Out (FIFO)
methodology for storing data items.

Non Linear Data Structures: This data structure does not form a
sequence, i.e. each item or element is connected with two or more
other items in a non-linear arrangement. The data elements are not
arranged in a sequential structure.

Types of Non Linear Data Structures are given below:

Trees: Trees are multilevel data structures with a hierarchical
relationship among their elements, known as nodes. The bottommost
nodes in the hierarchy are called leaf nodes, while the topmost node
is called the root node. Each node contains pointers to point to
adjacent nodes.

The tree data structure is based on the parent-child relationship
among the nodes. Each node in the tree can have more than one child
except the leaf nodes, whereas each node can have at most one parent
except the root node. Trees can be classified into many categories
which will be discussed later in this tutorial.

Graphs: Graphs can be defined as the pictorial representation of a
set of elements (represented by vertices) connected by links known as
edges. A graph is different from a tree in the sense that a graph can
have a cycle while a tree cannot.

Operations on data structures

1) Traversing: Every data structure contains a set of data elements.
Traversing the data structure means visiting each element of the data
structure in order to perform some specific operation like searching
or sorting.

Example: If we need to calculate the average of the marks obtained by
a student in 6 different subjects, we need to traverse the complete
array of marks and calculate the total sum; then we divide that sum
by the number of subjects, i.e. 6, in order to find the average.

2) Insertion: Insertion can be defined as the process of adding
elements to the data structure at any location.

If the size of the data structure is n, then it can hold at most n
data elements; inserting into a full structure results in overflow.

3) Deletion: The process of removing an element from the data
structure is called deletion. We can delete an element from the data
structure at any random location. If we try to delete an element from
an empty data structure then underflow occurs.

4) Searching: The process of finding the location of an element
within the data structure is called searching. There are two
algorithms to perform searching, Linear Search and Binary Search.
We will discuss each one of them later in this tutorial.

5) Sorting: The process of arranging the data structure in a specific
order is known as sorting. There are many algorithms that can be used
to perform sorting, for example, insertion sort, selection sort,
bubble sort, etc.

6) Merging: When two lists, List A and List B, of sizes M and N
respectively and of similar types of elements, are clubbed or joined
to produce a third list, List C of size (M+N), this process is called
merging.
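
As a small illustration of the merging operation just described, here is a
sketch in C (the function name merge and the array names a, b and c are
hypothetical, not part of the syllabus) that joins two sorted integer lists
of sizes M and N into a third list of size M+N:

#include <stdio.h>

/* Merge two sorted arrays a[0..m-1] and b[0..n-1] into c[0..m+n-1]. */
void merge(const int a[], int m, const int b[], int n, int c[])
{
    int i = 0, j = 0, k = 0;

    while (i < m && j < n)            /* pick the smaller front element */
        c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < m) c[k++] = a[i++];    /* copy any leftovers from a */
    while (j < n) c[k++] = b[j++];    /* copy any leftovers from b */
}

int main(void)
{
    int a[] = {2, 5, 9}, b[] = {1, 4, 7, 8}, c[7];
    int k;

    merge(a, 3, b, 4, c);
    for (k = 0; k < 7; k++)
        printf("%d ", c[k]);          /* prints 1 2 4 5 7 8 9 */
    printf("\n");
    return 0;
}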

What is an Algorithm?
An algorithm is a process or a set of rules required to perform
calculations or some other problem-solving operations especially
by a computer. The formal definition of an algorithm is that it
contains the finite set of instructions which are being carried in a
specific order to perform the specific task.

Characteristics of an Algorithm
The following are the characteristics of an algorithm:

o Input: An algorithm has some input values. We can pass 0 or some


input value to an algorithm.
o Output: We will get 1 or more output at the end of an algorithm.
o Unambiguity: An algorithm should be unambiguous which means
that the instructions in an algorithm should be clear and simple.
o Finiteness: An algorithm should have finiteness. Here, finiteness
means that the algorithm should contain a limited number of
instructions, i.e., the instructions should be countable.
o Effectiveness: An algorithm should be effective as each instruction
in an algorithm affects the overall process.
o Language independent: An algorithm must be language-
independent so that the instructions in an algorithm can be
implemented in any of the languages with the same output.

The performance of the algorithm can be measured in two factors:

o Time complexity: The time complexity of an algorithm is the


amount of time required to complete the execution. The time
complexity of an algorithm is denoted by the big O notation.
Here, big O notation is the asymptotic notation to represent
the time complexity. The time complexity is mainly calculated
by counting the number of steps to finish the execution. Let's
understand the time complexity through an example.

sum = 0;
// Suppose we have to calculate the sum of n numbers.
for i = 1 to n
    sum = sum + i;
// when the loop ends, sum holds the sum of the n numbers
return sum;
In the above code, the time complexity of the loop statement will be
at least n, and if the value of n increases, then the time complexity
also increases. The complexity of the statement return sum will be
constant, as its value does not depend on the value of n and it
provides the result in one step only. We generally consider the
worst-case time complexity, as it is the maximum time taken for any
given input size.
o Space complexity: An algorithm's space complexity is the
amount of space required to solve a problem and produce an
output. Similar to the time complexity, space complexity is
also expressed in big O notation.
For an algorithm, the space is required for the following purposes:

1. To store program instructions


2. To store constant values
3. To store variable values
4. To track the function calls, jumping statements, etc.

Auxiliary space: The extra space required by the algorithm,


excluding the input size, is known as an auxiliary space. The space
complexity considers both the spaces, i.e., auxiliary space, and
space used by the input.

Array
o Arrays are defined as the collection of similar type of data items
stored at contiguous memory locations.
o Arrays are the derived data type in C programming language
which can store the primitive type of data such as int, char,
double, float, etc.
o Array is the simplest data structure where each data element can
be randomly accessed by using its index number.
o For example, if we want to store the marks of a student in 6
subjects, then we don't need to define a different variable for the
marks in each subject. Instead, we can define an array which can
store the marks of each subject at contiguous memory locations.

Properties of the Array
1. Each element is of same data type and carries a same size i.e. int =
4 bytes.
2. Elements of the array are stored at contiguous memory locations
where the first element is stored at the smallest memory location.
3. Elements of the array can be randomly accessed since we can
calculate the address of each element of the array with the given
base address and the size of data element.

for example, in C language, the syntax of declaring an array is like


following:

int arr[10]; char arr[10]; float arr[5]

#include <stdio.h>

int main(void)
{
    int marks[6] = {56, 78, 88, 76, 56, 89};
    int i;
    float avg = 0.0f;

    for (i = 0; i < 6; i++)
    {
        avg = avg + marks[i];   /* accumulate the total marks */
    }
    avg = avg / 6;              /* average of the 6 subjects */
    printf("%f\n", avg);
    return 0;
}

Complexity of Array operations

Time and space complexity of various array operations are described
in the following table.

Time Complexity

Algorithm    Average Case    Worst Case
Access       O(1)            O(1)
Search       O(n)            O(n)
Insertion    O(n)            O(n)
Deletion     O(n)            O(n)

Space Complexity

In an array, the space complexity for the worst case is O(n).

Advantages of Array
o Array provides a single name for a group of variables of the same
type; therefore, it is easy to remember the names of all the elements
of an array.
o Traversing an array is a very simple process; we just need to
increment the base address of the array in order to visit each
element one by one.
o Any element in the array can be directly accessed by using the
index.

Accessing Elements of an array


To access any random element of an array we need the following
information:

1. Base Address of the array.


2. Size of an element in bytes.
3. The type of indexing that the array follows.

Address of any element of a 1D array can be calculated by using


the following formula:

Byte address of element A[i] = base address + size * ( i - first index)
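
For instance, assuming hypothetical values of a base address of 1000, an
element size of 4 bytes (a typical int) and a first index of 0, the formula
gives the byte address of A[3] as 1000 + 4 * (3 - 0) = 1012.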

2D Array
2D array can be defined as an array of arrays. The 2D array is
organized as matrices which can be represented as the collection
of rows and columns.

However, 2D arrays are created to implement a relational database


look alike data structure. It provides ease of holding bulk of data at
once which can be passed to any number of functions wherever
required.

How to declare 2D Array


The syntax of declaring two dimensional array is very much similar
to that of a one dimensional array, given as follows.

int arr[max_rows][max_columns];

Initializing 2D Arrays
We know that, when we declare and initialize one dimensional
array in C programming simultaneously, we don't need to specify
the size of the array. However this will not work with 2D arrays. We
will have to define at least the second dimension of the array.

The syntax to declare and initialize the 2D array is given as follows.

int arr[2][2] = {0,1,2,3};

Example:

#include <stdio.h>

int main(void)
{
    int arr[3][3], i, j;

    /* read a 3 x 3 matrix from the user */
    for (i = 0; i < 3; i++)
    {
        for (j = 0; j < 3; j++)
        {
            printf("Enter a[%d][%d]: ", i, j);
            scanf("%d", &arr[i][j]);
        }
    }

    printf("\nprinting the elements ....\n");
    for (i = 0; i < 3; i++)
    {
        printf("\n");
        for (j = 0; j < 3; j++)
        {
            printf("%d\t", arr[i][j]);
        }
    }
    return 0;
}

Abstract Data Types


An Abstract Data Type (ADT) is a type (or class) for objects whose
behaviour is defined by a set of values and a set of operations.
The definition of an ADT only mentions what operations are to be
performed but not how these operations will be implemented. It does
not specify how data will be organized in memory and what algorithms
will be used for implementing the operations. It is called "abstract"
because it gives an implementation-independent view. The process of
providing only the essentials and hiding the details is known as
abstraction.
An ADT can be viewed as a black box which hides the inner structure
and design of the data type.
Now we'll define three ADTs, namely the List ADT, Stack ADT and Queue ADT.

List ADT
• The data is generally stored in key sequence in a list which has
a head structure consisting of count, pointers and address of
compare function needed to compare the data in the list.

Stack ADT
• In Stack ADT Implementation instead of data being stored in each node, the
pointer to data is stored.
The program allocates memory for the data and address is passed to
the stack ADT.
A Stack contains elements of the same type arranged in sequential
order. All operations take place at a single end that is top of the stack
and following operations can be performed:
• push() – Insert an element at one end of the stack called top.

• pop() – Remove and return the element at the top of the stack, if it is
not empty.
• peek() – Return the element at the top of the stack without removing
it, if the stack is not empty.
• size() – Return the number of elements in the stack.
• isEmpty() – Return true if the stack is empty, otherwise return false.
• isFull() – Return true if the stack is full, otherwise return false.
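The operation names listed above can be collected into a small C interface. The sketch below is only illustrative: the type name StackADT and the function prototypes are assumptions made for this example, not part of any standard library. The point is that the ADT exposes behaviour while hiding the representation behind an incomplete struct type.

/* stack_adt.h - a minimal sketch of a Stack ADT interface (illustrative only) */
typedef struct stack *StackADT;           /* the representation is hidden from the user */

StackADT stack_create(int capacity);      /* create an empty stack                      */
void     stack_push(StackADT s, int x);   /* insert x at the top                        */
int      stack_pop(StackADT s);           /* remove and return the top element          */
int      stack_peek(StackADT s);          /* return the top element without removing it */
int      stack_size(StackADT s);          /* number of elements stored                  */
int      stack_isEmpty(StackADT s);       /* 1 if empty, 0 otherwise                    */
int      stack_isFull(StackADT s);        /* 1 if full, 0 otherwise                     */
void     stack_destroy(StackADT s);       /* release the stack                          */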

 Queue ADT
• The queue abstract data type (ADT) follows the basic design of the
stack abstract data type.

• Each node contains a void pointer to the data and the link pointer to
the next element in the queue. The program’s responsibility is to
allocate memory for storing the data.

A Queue contains elements of the same type arranged in sequential


order. Operations take place at both ends, insertion is done at the end
and deletion is done at the front. Following operations can be
performed:
• enqueue() – Insert an element at the end of the queue.
• dequeue() – Remove and return the first element of the queue, if the
queue is not empty.
• peek() – Return the element of the queue without removing it, if the
queue is not empty.
• size() – Return the number of elements in the queue.
• isEmpty() – Return true if the queue is empty, otherwise return
false.
• isFull() – Return true if the queue is full, otherwise return false.

Linked List
There are two major drawbacks of using array:
o We cannot insert more elements than the declared size; for example, if an array
is declared with size 3, only 3 elements can be stored in it.
o In the case of an array, lots of wastage of memory can occur. For
example, if we declare an array of 50 size but we insert only 10
elements in an array. So, in this case, the memory space for
other 40 elements will get wasted and cannot be used by
another variable as this whole space is occupied by an array.
o In an array, we provide a fixed size at compile time, due to which wastage of
memory occurs. The solution to this problem is to use a linked list.

What is Linked List?


A linked list is also a collection of elements, but the elements are
not stored in a consecutive location.

Suppose a programmer made a request for storing the integer


value then size of 4-byte memory block is assigned to the integer
value. The programmer made another request for storing 3 more
integer elements; then, three different memory blocks are assigned
to these three elements but the memory blocks are available in a
random location.

These elements are linked to each other by providing one additional


information along with an element, i.e., the address of the next
element. The variable that stores the address of the next element is
known as a pointer. Therefore, we conclude that the linked list
contains two parts, i.e., the first one is the data element, and the
other is the pointer. The pointer variable will occupy 4 bytes which is
pointing to the next element.

A linked list can also be defined as the collection of the


nodes in which one node is connected to another node, and
node consists of two parts, i.e., one is the data part and the
second one is the address part, as shown in the below
figure:

In the above figure, we can observe that each node contains the
data and the address of the next node. The last node of the linked
list contains the NULL value in the address part.
How can we declare the Linked list?
The declaration of an array is very simple as it is of single type. But
the linked list contains two parts, which are of two different types,
i.e., one is a simple variable, and the second one is a pointer
variable. We can declare the linked list by using the user-defined
data type known as structure. The structure of a linked list can be
defined as:

struct node
{
    int data;
    struct node *next;
};

 Advantages of using a Linked list over Array


The following are the advantages of using a linked list over an
array:

o Dynamic data structure:
The size of the linked list is not fixed as it can vary according to our
requirements.
o Insertion and Deletion:
Insertion and deletion in a linked list are easier than in an array, because the
elements of an array are stored in consecutive locations whereas the elements of
a linked list are stored at arbitrary locations. The complexity for insertion and
deletion of elements at the beginning is O(1) in a linked list, while in the case
of an array it would be O(n).
o Memory efficient:
Its memory consumption is efficient as the size of the linked list
can grow or shrink according to our requirements.
o Implementation:
Both stacks and queues can be implemented using a linked list.


 Types of Linked list


The following are the types of linked list:

o Singly Linked list
o Doubly Linked list
o Circular Linked list

Singly Linked list
It is the commonly used linked list in programs. If we are talking
about the linked list, it means it is a singly linked list. The singly
linked list is a data structure that contains two parts, i.e., one is the
data part, and the other one is the address part, which contains the
address of the next or the successor node. The address part in a
node is also known as a pointer.

The linked list, which is shown in the above diagram, is known as a


singly linked list as it contains only a single link. In this list, only
forward traversal is possible; we cannot traverse in the backward
direction as it has only one link in the list.

Representation of the node in a singly linked list

struct node
{
    int data;
    struct node *next;
};

 Complexity

 Operations on Singly Linked List


There are various operations which can be performed on singly
linked list. A list of all such operations is given below.

Node Creation
struct node
{
    int data;
    struct node *next;
};
struct node *head, *ptr;
ptr = (struct node *)malloc(sizeof(struct node));

Linked List Operations: Traverse, Insert and Delete

There are various linked list operations that allow us to perform different actions on linked
lists.

For example, the insertion operation adds a new element to the linked list.
Here's a list of basic linked list operations that we will cover in this article.
• Traversal - access each element of the linked list
• Insertion - adds a new element to the linked list

• Deletion - removes the existing elements

• Search - find a node in the linked list


• Sort - sort the nodes of the linked list
Things to Remember about Linked List
• head points to the first node of the linked list

• The next pointer of the last node is NULL, so if the current node is NULL, we have reached
the end of the linked list.
In all of the examples, we will assume that the linked list has three nodes 1 --->2 --->3
with

node structure as below:

struct node {
int data;
struct node *next;
};

Traverse a Linked List

Displaying the contents of a linked list is very simple. We keep moving the temp

node to the next one and display its contents.

When temp is NULL , we know that we have reached the end of the linked list so

we get out of the while loop.

struct node *temp = head;

printf("\n\nList elements are - \n");
while(temp != NULL) {
    printf("%d --->", temp->data);
    temp = temp->next;
}
You can add elements to either the beginning, middle or end of the linked list.

1. Insert at the beginning


• Allocate memory for new node

• Store data

• Change next of new node to point to head

• Change head to point to recently created node

struct node *newNode;
newNode = malloc(sizeof(struct node));
newNode->data = 4;
newNode->next = head;
head = newNode;

2. Insert at the Middle


• Allocate memory and store data for new node

• Traverse to node just before the required position of new node

• Change next pointers to include new node in between


struct node *newNode;
newNode = malloc(sizeof(struct node));
newNode->data = 4;

struct node *temp = head;

for(int i = 2; i < position; i++) {
    if(temp->next != NULL) {
        temp = temp->next;
    }
}
newNode->next = temp->next;
temp->next = newNode;

3. Insert at the End


• Allocate memory for new node

• Store data

• Traverse to last node

• Change next of last node to recently created node

struct node *newNode;
newNode = malloc(sizeof(struct node));
newNode->data = 4;
newNode->next = NULL;

struct node *temp = head;
while(temp->next != NULL) {
    temp = temp->next;
}
temp->next = newNode;

Delete from a Linked List

You can delete either from the beginning, end or from a particular position.

1. Delete from beginning


• Point head to the second node

head = head->next;

2. Delete from end


• Traverse to second last element

• Change its next pointer to null

struct node* temp = head;

while(temp->next->next != NULL) {
    temp = temp->next;
}
temp->next = NULL;

3. Delete from middle


• Traverse to element before the element to be deleted

• Change next pointers to exclude the node from the chain

struct node* temp = head;   /* start from the head, as in the previous examples */
for(int i = 2; i < position; i++) {
    if(temp->next != NULL) {
        temp = temp->next;
    }
}
temp->next = temp->next->next;

Search an Element on a Linked List

You can search for an element in a linked list using a loop, following the steps below. Here
we are looking for item in the linked list.
• Make head as the current node.

• Run a loop until the current node is NULL because the last element points to NULL .
• In each iteration, check if the key of the node is equal to item. If the key matches the
item, return true; otherwise, return false.

// Search a node
bool searchNode(struct Node** head_ref, int key) {
    struct Node* current = *head_ref;
    while (current != NULL) {
        if (current->data == key)
            return true;
        current = current->next;
    }
    return false;
}

o Sort Elements of a Linked List

We will use a simple sorting algorithm, Bubble Sort, to sort the elements of a linked list

in ascending order below.

1. Make the head as the current node and create another node index for
later use.

2. If head is null, return.


3. Else, run a loop till the last node (i.e. NULL ).
4. In each iteration, follow the following step 5-6.

5. Store the next node of current in index .

6. Check if the data of the current node is greater than the next node. If it is
greater, swap current and index .

// Sort the linked list using bubble sort
void sortLinkedList(struct Node** head_ref) {
    struct Node *current = *head_ref, *index = NULL;
    int temp;

    if (*head_ref == NULL) {
        return;
    } else {
        while (current != NULL) {
            // index points to the node next to current
            index = current->next;
            while (index != NULL) {
                if (current->data > index->data) {
                    temp = current->data;
                    current->data = index->data;
                    index->data = temp;
                }
                index = index->next;
            }
            current = current->next;
        }
    }
}

o Doubly Linked List

We add a pointer to the previous node in a doubly-linked list. Thus, we


can go in either direction: forward or backward.

A node is represented as

struct node {
    int data;
    struct node *next;
    struct node *prev;
};

Each struct node has a data item, a pointer to the previous struct node, and a
pointer to the next struct node.
/* Initialize nodes */
struct node *head;
struct node *one = NULL;
struct node *two = NULL;
struct node *three = NULL;

/* Allocate memory */
one = malloc(sizeof(struct node));
two = malloc(sizeof(struct node));
three = malloc(sizeof(struct node));

/* Assign data values */
one->data = 1;
two->data = 2;
three->data = 3;

/* Connect nodes */
one->next = two;
one->prev = NULL;
two->next = three;
two->prev = one;
three->next = NULL;
three->prev = two;

/* Save address of first node in head */
head = one;

In the above code, one, two, and three are the nodes with data items 1, 2, and 3 respectively.
• For node one: next stores the address of two and prev stores null (there
is no
node before it)
• For node two: next stores the address of three and prev stores the address of
one

• For node three: next stores null (there is no node after it) and prev stores the
address of two.

o Insertion on a Doubly Linked List

Pushing a node to a doubly-linked list is similar to pushing a node to


a linked list, but extra work is required to handle the pointer to the
previous node.

We can insert elements at 3 different positions of a doubly linked list:

1. Insertion at the beginning


2. Insertion in-between nodes
3. Insertion at the End

o 1. Insertion at the Beginning

Let's add a node with value 6 at the beginning of the doubly linked
list we made above. 1. Create a new node
• allocate memory for newNode
• assign the data to newNode .

2. Set prev and next pointers of new node
• point next of newNode to the first node of the doubly linked list
• point prev to null

3. Make new node as head node


• Point prev of the first node to newNode (now the previous head is the second
node)

• Point head to newNode

o Code for Insertion at the Beginning

// insert node at the front
void insertFront(struct Node** head, int data) {

    // allocate memory for newNode
    struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));

    // assign data to newNode
    newNode->data = data;

    // point next of newNode to the first node of the doubly linked list
    newNode->next = (*head);

    // point prev to NULL
    newNode->prev = NULL;

    // point previous of the first node (now the first node is the second node) to newNode
    if ((*head) != NULL)
        (*head)->prev = newNode;

    // head points to newNode
    (*head) = newNode;
}

o 2. Insertion in between two nodes

Let's add a node with value 6 after node with value 1 in the doubly linked
list.

1. Create a new node


• allocate memory for newNode
• assign the data to newNode .
2. Set the next pointer of new node and previous node
• assign the value of next from previous node to the next of newNode
• assign the address of newNode to the next of previous node

3. Set the prev pointer of new node and the next node
• assign the value of prev of next node to the prev of newNode
• assign the address of newNode to the prev of next node

o Code for Insertion in between two Nodes

// insert a node after a specific node
void insertAfter(struct Node* prev_node, int data) {

    // check if previous node is NULL
    if (prev_node == NULL) {
        printf("previous node cannot be NULL");
        return;
    }

    // allocate memory for newNode
    struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));

    // assign data to newNode
    newNode->data = data;

    // set next of newNode to next of prev node
    newNode->next = prev_node->next;

    // set next of prev node to newNode
    prev_node->next = newNode;

    // set prev of newNode to the previous node
    newNode->prev = prev_node;

    // set prev of newNode's next to newNode
    if (newNode->next != NULL)
        newNode->next->prev = newNode;
}

o 3. Insertion at the End

Let's add a node with value 6 at the end of the doubly linked list.

1. Create a new node


2. Set prev and next pointers of the new node and the previous node. If the
linked list is empty, make the newNode the head node. Otherwise, traverse to
the end of the doubly linked list and link the new node after the last node,
as in the code below.

// insert a newNode at the end of the list
void insertEnd(struct Node** head, int data) {

    // allocate memory for node
    struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));

    // assign data to newNode
    newNode->data = data;

    // assign NULL to next of newNode
    newNode->next = NULL;

    // store the head node temporarily (for later use)
    struct Node* temp = *head;

    // if the linked list is empty, make the newNode the head node
    if (*head == NULL) {
        newNode->prev = NULL;
        *head = newNode;
        return;
    }

    // if the linked list is not empty, traverse to the end of the linked list
    while (temp->next != NULL)
        temp = temp->next;

    // now, the last node of the linked list is temp

    // point the next of the last node (temp) to newNode
    temp->next = newNode;

    // assign prev of newNode to temp
    newNode->prev = temp;
}

Deletion from a Doubly Linked List

Similar to insertion, we can also delete a node from 3 different


positions of a doubly linked list.
Suppose we have a double-linked list with elements 1, 2, and 3.

1. Delete the First Node of Doubly Linked List

If the node to be deleted (i.e. del_node) is at the beginning, reset head to the next
node and free del_node.

Code for Deletion of the First Node

if (*head == del_node)
    *head = del_node->next;
if (del_node->next != NULL)
    del_node->next->prev = del_node->prev;   /* new first node no longer points back to del_node */
if (del_node->prev != NULL)
    del_node->prev->next = del_node->next;

free(del_node);

2. Deletion of the Inner Node

If del_node is an inner node (second node), we have to reset the value

of next and prev of the nodes before and after the del_node .
For the node before the del_node (i.e. first node)
Assign the value of next of del_node to the next of the first node. For
the node after the del_node (i.e. third node)
Assign the value of prev of del_node to the prev of the third node.

Code for Deletion of the Inner Node

if (del_node->next != NULL)
    del_node->next->prev = del_node->prev;

if (del_node->prev != NULL)
    del_node->prev->next = del_node->next;

3. Delete the Last Node of Doubly Linked List

In this case, we are deleting the last node with value 3 of the doubly linked list.
Here, we can simply delete the del_nodeand make the next of node before del_nodepoint

to NULL.

Code for Deletion of the Last Node

if (del_node->prev != NULL)
    del_node->prev->next = del_node->next;

Circular Linked List

A circular linked list is a variation of a linked list in which the last element is linked to the first

element. This forms a circular loop.

A circular linked list can be either singly linked or doubly linked.

• for singly linked list, next pointer of last item points to the first item

• In the doubly linked list, prev pointer of the first item points to the last item as well.
A three-member circular singly linked list can be created as:
/* Initialize nodes */
struct node *head;
struct node *one = NULL;
struct node *two = NULL;
struct node *three = NULL;

/* Allocate memory */
one = malloc(sizeof(struct node));
two = malloc(sizeof(struct node));
three = malloc(sizeof(struct node));

/* Assign data values */
one->data = 1;
two->data = 2;
three->data = 3;

/* Connect nodes */
one->next = two;
two->next = three;
three->next = one;

/* Save address of first node in head */
head = one;

2. Circular Doubly Linked List

Here, in addition to the last node storing the address of the first node, the first node will

also store the address of the last node.
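Assuming the same three-node struct node with next and prev pointers used for the doubly linked list above, the extra links of a circular doubly linked list can be set up as in this short sketch (a fragment, not a complete program):

/* assuming one, two, three are already allocated and carry data 1, 2, 3 */
one->next   = two;    one->prev   = three;   /* first node also points back to the last node */
two->next   = three;  two->prev   = one;
three->next = one;    three->prev = two;     /* last node points forward to the first node  */
head = one;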

Stack Data Structure


A stack is a linear data structure that follows the principle of Last In
First Out
(LIFO). This means the last element inserted inside the stack is
removed first. You can think of the stack data structure as the pile of
plates on top of another.

Here, you can:


• Put a new plate on top
• Remove the top plate
And, if you want the plate at the bottom, you must first remove all the
plates
on top. This is exactly how the stack data structure works.

LIFO Principle of Stack

In programming terms, putting an item on top of the stack is called


push and removing an item is called pop.

Stack Push and Pop Operations

In the above image, although item 2 was kept last, it was removed
first. This is exactly how the LIFO (Last In First Out) Principle
works.

There are some basic operations that allow us to perform different


actions on
a stack.
• Push: Add an element to the top of a stack
• Pop: Remove an element from the top of a stack
• IsEmpty: Check if the stack is empty
• IsFull: Check if the stack is full
• Peek: Get the value of the top element without removing it
Working of Stack Data Structure

The operations work as follows:


1. A pointer called TOP is used to keep track of the top element in the
stack.
2. When initializing the stack, we set its value to -1 so that we can
check if the stack is empty by comparing TOP == -1 .
3. On pushing an element, we increase the value of TOP and place
the new element in the position pointed to by TOP .
4. On popping an element, we return the element pointed to by TOP

and reduce its value.


5. Before pushing, we check if the stack is already full

6. Before popping, we check if the stack is already empty

Working of Stack Data Structure

Stack Implementation with Arrays


The most common stack implementation is using arrays, but it can
also be implemented using lists.
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>

#define SIZE 10

void push(int);
void pop();
void display();

int stack[SIZE], top = -1;

void main() {
    int value, choice;
    clrscr();
    while(1) {
        printf("\n\n***** MENU *****\n");
        printf("1. Push\n2. Pop\n3. Display\n4. Exit");
        printf("\nEnter your choice: ");
        scanf("%d", &choice);
        switch(choice) {
            case 1: printf("Enter the value to be inserted: ");
                    scanf("%d", &value);
                    push(value);
                    break;
            case 2: pop();
                    break;
            case 3: display();
                    break;
            case 4: exit(0);
            default: printf("\nWrong selection!!! Try again!!!");
        }
    }
}

void push(int value) {
    if(top == SIZE-1)
        printf("\nStack is Full!!! Insertion is not possible!!!");
    else {
        top++;
        stack[top] = value;
        printf("\nInsertion success!!!");
    }
}

void pop() {
    if(top == -1)
        printf("\nStack is Empty!!! Deletion is not possible!!!");
    else {
        printf("\nDeleted %d", stack[top]);
        top--;
    }
}

void display() {
    if(top == -1)
        printf("\nStack is Empty!!!");
    else {
        int i;
        printf("\nStack elements are:\n");
        for(i = top; i >= 0; i--)
            printf("%d\n", stack[i]);
    }
}

Stack Time Complexity

For the array-based implementation of a stack, the push and pop


operations take constant time, i.e. O(1) .
Following are the applications of stack:
1. Delimiter Checking
2. Reverse a Data
3. Processing Function Calls

4. Evaluation of Arithmetic Expressions


5. Backtracking

Evaluation of Arithmetic Expression requires two steps:


o First, convert the given expression into a special notation.
o Evaluate the expression in this new notation.

Notations for Arithmetic Expression


There are three notations to represent an arithmetic expression:

o Infix Notation
o Prefix Notation
o Postfix Notation

Infix Notation

The infix notation is a convenient way of writing an expression in which each


operator is placed between the operands. Infix expressions can be parenthesized or
unparenthesized depending upon the problem requirement.

Example: A + B, (C - D) etc.
• All these expressions are in infix notation because the operator comes between
the operands.
• Prefix Notation

• The prefix notation places the operator before the operands.


This notation was introduced by the Polish mathematician Jan Łukasiewicz
and hence is often referred to as Polish notation.
• Example: + A B, -CD etc.
• All these expressions are in prefix notation because the
operator comes before the operands.
• Postfix Notation
• The postfix notation places the operator after the operands.
This notation is just the reverse of Polish notation and also
known as Reverse Polish notation.
• Example: AB+, CD-, etc.
• All these expressions are in postfix notation because the
operator comes after the operands.
• Conversion of Arithmetic Expression into various
Notations:
Infix Notation       Prefix Notation     Postfix Notation

A*B                  *AB                 AB*

(A+B)/C              /+ABC               AB+C/

(A*B) + (D-C)        +*AB-DC             AB*DC-+


Conversion of Infix to Postfix
Algorithm for Infix to Postfix

Step 1: Consider the next element in the input.

Step 2: If it is operand, display it.

Step 3: If it is opening parenthesis, insert it on stack.

Step 4: If it is an operator, then

• If stack is empty, insert operator on stack.


• If the top of stack is opening parenthesis, insert the operator on
stack
• If it has higher priority than the top of stack, insert the operator
on stack.
• Else, delete the operator from the stack and display it, repeat
Step 4.
Step 5: If it is a closing parenthesis, delete the operator from stack and
display them until an opening parenthesis is encountered. Delete and
discard the opening parenthesis.

Step 6: If there is more input, go to Step 1.

Step 7: If there is no more input, delete the remaining operators to


output.

Example: Suppose we are converting 3*3/(4-1)+6*2 expression into


postfix form. Following table shows the evaluation of Infix to
Postfix:

Expression    Stack    Output

3             Empty    3

*             *        3

3             *        33

/             /        33*

(             /(       33*

4             /(       33*4

-             /(-      33*4

1             /(-      33*41

)             /        33*41-

+             +        33*41-/

6             +        33*41-/6

*             +*       33*41-/6

2             +*       33*41-/62

              Empty    33*41-/62*+

So, the Postfix Expression is 33*41-/62*+.
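A compact C sketch of this algorithm for single-character operands and the + - * / operators is given below; the helper name prec and the fixed-size character stack are assumptions made for this illustration, not part of the syllabus code. Running it on the expression above reproduces the result in the table.

#include <stdio.h>
#include <ctype.h>

static int prec(char op) {                  /* precedence of an operator */
    return (op == '*' || op == '/') ? 2 : (op == '+' || op == '-') ? 1 : 0;
}

void infixToPostfix(const char *in, char *out) {
    char stack[100]; int top = -1, k = 0;
    for (int i = 0; in[i] != '\0'; i++) {
        char c = in[i];
        if (isalnum((unsigned char)c))      /* operand: display (append) it directly */
            out[k++] = c;
        else if (c == '(')                  /* opening parenthesis: push it */
            stack[++top] = c;
        else if (c == ')') {                /* closing parenthesis: pop until '(' */
            while (top >= 0 && stack[top] != '(')
                out[k++] = stack[top--];
            top--;                          /* discard the '(' */
        } else {                            /* operator */
            while (top >= 0 && stack[top] != '(' && prec(stack[top]) >= prec(c))
                out[k++] = stack[top--];
            stack[++top] = c;
        }
    }
    while (top >= 0)                        /* flush the remaining operators */
        out[k++] = stack[top--];
    out[k] = '\0';
}

int main(void) {
    char out[100];
    infixToPostfix("3*3/(4-1)+6*2", out);
    printf("%s\n", out);                    /* prints 33*41-/62*+ */
    return 0;
}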

Reverse a Data:
To reverse a given set of data, we need to reorder the data so that
the first and last elements are exchanged, the second and second
last element are exchanged, and so on for all other elements.

Example: Suppose we have the string Welcome; on reversing it, we get
emocleW.

There are different reversing applications:

o Reversing a string
o Converting Decimal to Binary

Reverse a String
A Stack can be used to reverse the characters of a string. This can be achieved
by pushing the characters of the string onto the Stack one by one, and then
popping them off one by one. Because of the last-in-first-out property of the
Stack, the first character of the string ends up at the bottom of the Stack and
the last character at the top, so popping all the characters returns the string
in reverse order.
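A minimal C sketch of this idea follows; the buffer size and variable names are chosen only for this illustration.

#include <stdio.h>

int main(void) {
    char str[] = "Welcome";
    char stack[100];
    int top = -1, i;

    for (i = 0; str[i] != '\0'; i++)     /* push every character onto the stack */
        stack[++top] = str[i];

    for (i = 0; top >= 0; i++)           /* pop them back; LIFO order reverses the string */
        str[i] = stack[top--];

    printf("%s\n", str);                 /* prints emocleW */
    return 0;
}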

Backtracking
Backtracking is another application of Stack. It is a recursive
algorithm that is used for solving the optimization problem.

 Delimiter Checking
A common application of Stack is delimiter checking, i.e., parsing that involves
analyzing a source program syntactically. It is also called parenthesis checking.
When the compiler translates a source program written in some programming
language such as C or C++ to machine language, it parses the program into
multiple individual parts such as variable names, keywords, etc., by scanning
from left to right. The main problem encountered while translating is unmatched
delimiters. We make use of different types of delimiters, including parentheses
( ), curly braces { }, square brackets [ ], and the comment delimiters /* and */.
Every opening delimiter must match a closing delimiter, i.e., every opening
parenthesis should be followed by a matching closing parenthesis.
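A simple C sketch of parenthesis checking with a stack is shown below; it only handles the three bracket pairs and is meant as an illustration of the idea, not as a full parser.

#include <stdio.h>

/* returns 1 if every (, {, [ has a matching closing delimiter, 0 otherwise */
int isBalanced(const char *s) {
    char stack[100];
    int top = -1, i;
    for (i = 0; s[i] != '\0'; i++) {
        char c = s[i];
        if (c == '(' || c == '{' || c == '[')
            stack[++top] = c;                        /* opening delimiter: push it */
        else if (c == ')' || c == '}' || c == ']') {
            if (top == -1) return 0;                 /* nothing left to match */
            char open = stack[top--];
            if ((c == ')' && open != '(') ||
                (c == '}' && open != '{') ||
                (c == ']' && open != '['))
                return 0;                            /* mismatched pair */
        }
    }
    return top == -1;                                /* stack must be empty at the end */
}

int main(void) {
    printf("%d\n", isBalanced("a[i] = (b + c) * {d};"));  /* 1 */
    printf("%d\n", isBalanced("(a + b]"));                 /* 0 */
    return 0;
}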

Processing Function Calls:


Stack plays an important role in programs that call several
functions in succession. Suppose we have a program containing
three functions: A, B, and C. function A invokes function B, which
invokes the function C.

When we invoke function A, which contains a call to function B,


then its processing will not be completed until function B has
completed its execution and returned. Similarly for function B and
C. So we observe that function A will only be completed after
function B is completed and function B will only be completed after
function C is completed. Therefore, function A is first to be started
and last to be completed. To conclude, the above function activity
matches the last in first out behavior and can easily be handled
using Stack.

Stack using linked list
Stack using an array - drawback
If we implement the stack using an array, we need to specify the array size at the
beginning (at compile time). We can't change the size of an array at runtime, so it
will only work for a fixed number of elements.
Solution
We can implement the stack using the linked list.
In the linked list, we can change its size at runtime.

#include <stdio.h>
#include <stdlib.h>

void push();
void pop();
void display();

struct node
{
    int val;
    struct node *next;
};
struct node *head;

void main ()
{
    int choice = 0;
    printf("\n*********Stack operations using linked list*********\n");
    printf("\n----------------------------------------------\n");
    while(choice != 4)
    {
        printf("\n\nChose one from the below options...\n");
        printf("\n1.Push\n2.Pop\n3.Show\n4.Exit");
        printf("\n Enter your choice \n");
        scanf("%d", &choice);
        switch(choice)
        {
            case 1:
            {
                push();
                break;
            }
            case 2:
            {
                pop();
                break;
            }
            case 3:
            {
                display();
                break;
            }
            case 4:
            {
                printf("Exiting....");
                break;
            }
            default:
            {
                printf("Please Enter valid choice ");
            }
        };
    }
}

void push ()
{
    int val;
    struct node *ptr = (struct node*)malloc(sizeof(struct node));
    if(ptr == NULL)
    {
        printf("not able to push the element");
    }
    else
    {
        printf("Enter the value");
        scanf("%d", &val);
        if(head == NULL)
        {
            ptr->val = val;
            ptr->next = NULL;
            head = ptr;
        }
        else
        {
            ptr->val = val;
            ptr->next = head;
            head = ptr;
        }
        printf("Item pushed");
    }
}

void pop()
{
    int item;
    struct node *ptr;
    if (head == NULL)
    {
        printf("Underflow");
    }
    else
    {
        item = head->val;
        ptr = head;
        head = head->next;
        free(ptr);
        printf("Item popped");
    }
}

void display()
{
    struct node *ptr;
    ptr = head;
    if(ptr == NULL)
    {
        printf("Stack is empty\n");
    }
    else
    {
        printf("Printing Stack elements \n");
        while(ptr != NULL)
        {
            printf("%d\n", ptr->val);
            ptr = ptr->next;
        }
    }
}

Queue
1.A queue can be defined as an ordered list which enables insert
operations to be performed at one end called REAR and delete
operations to be performed at another end called FRONT.

2.Queue is referred to be as First In First Out list.

3.For example, people waiting in line for a rail ticket form a queue.

 Applications of Queue
Since a queue performs actions on a first in first out basis, which is quite fair
for the ordering of actions, it has many applications, discussed below.

1. Queues are widely used as waiting lists for a single shared resource
like printer, disk, CPU.
2. Queues are used in asynchronous transfer of data (where data is not
being transferred at the same rate between two processes) for eg.
pipes, file IO, sockets.
3. Queues are used as buffers in most of the applications like MP3
media player, CD player, etc.
4. Queue are used to maintain the play list in media players in order to
add and remove the songs from the play-list.
5. Queues are used in operating systems for handling interrupts.

 Complexity

Time Complexity (Queue)

             Average Case    Worst Case
Access       θ(n)            O(n)
Search       θ(n)            O(n)
Insertion    θ(1)            O(1)
Deletion     θ(1)            O(1)

Space Complexity (worst case): O(n)
Operations on Queue

There are two fundamental operations performed on a Queue:

o Enqueue: The enqueue operation is used to insert the element at


the rear end of the queue. It returns void.
o Dequeue: The dequeue operation performs the deletion from the
front-end of the queue. It also returns the element which has been
removed from the front-end. It returns an integer value. The
dequeue operation can also be designed to void.
o Peek: This is the third operation that returns the element, which is
pointed by the front pointer in the queue but does not delete it.
o Queue overflow (isfull): When the Queue is completely full, then it
shows the overflow condition.
o Queue underflow (isempty): When the Queue is empty, i.e., no
elements are in the Queue then it throws the underflow condition.

Enqueue Operation
• check if the queue is full
• for the first element, set the value of FRONT to 0
• increase the REAR index by 1
• add the new element in the position pointed to by REAR

Dequeue Operation
• check if the queue is empty
• return the value pointed by FRONT
• increase the FRONT index by 1
• for the last element, reset the values of FRONT and REAR to -1

 Algorithm to insert any element in a queue


Check if the queue is already full by comparing rear to max - 1. if
so, then return an overflow error.

If the item is to be inserted as the first element in the list, in that


case set the value of front and rear to 0 and insert the element at
the rear end.

Otherwise keep increasing the value of rear and insert each


element one by one having rear as the index.

Algorithm
o Step 1: IF REAR = MAX-1
      Write OVERFLOW
      Go to Step 4
   [END OF IF]
o Step 2: IF FRONT = -1 and REAR = -1
      SET FRONT = REAR = 0
   ELSE
      SET REAR = REAR + 1
   [END OF IF]
o Step 3: Set QUEUE[REAR] = NUM
o Step 4: EXIT

 Algorithm to delete an element from the queue


If, the value of front is -1 or value of front is greater than rear ,
write an underflow message and exit.

Otherwise, keep increasing the value of front and return the item
stored at the front end of the queue at each time.

Algorithm
o Step 1: IF FRONT = -1 or FRONT > REAR
      Write UNDERFLOW
   ELSE
      SET VAL = QUEUE[FRONT]
      SET FRONT = FRONT + 1
   [END OF IF]
o Step 2: EXIT

Implementation of Queue
There are two ways of implementing the Queue:

o Sequential allocation: The sequential allocation in a Queue can be


implemented using an array.
o Linked list allocation: The linked list allocation in a Queue can be
implemented using a
linked list.

Array representation of Queue


• We can easily represent queue by using linear arrays. There
are two variables i.e. front and rear, that are implemented in
the case of every queue. Front and rear variables point to the
position from where insertions and deletions are performed
in a queue. Initially, the value of front and queue is -1 which
represents an empty queue. Array representation of a queue
containing 5 elements along with the respective values of
front and rear, is shown in the following figure.

 Menu driven program to implement queue using array
#include <stdio.h>
#include <stdlib.h>
#define maxsize 5

void insert();
void delete();
void display();

int front = -1, rear = -1;
int queue[maxsize];

void main ()
{
    int choice;
    while(choice != 4)
    {
        printf("\n*************************Main Menu*****************************\n");
        printf("\n=================================================================\n");
        printf("\n1.insert an element\n2.Delete an element\n3.Display the queue\n4.Exit\n");
        printf("\nEnter your choice ?");
        scanf("%d", &choice);
        switch(choice)
        {
            case 1:
                insert();
                break;
            case 2:
                delete();
                break;
            case 3:
                display();
                break;
            case 4:
                exit(0);
                break;
            default:
                printf("\nEnter valid choice??\n");
        }
    }
}

void insert()
{
    int item;
    printf("\nEnter the element\n");
    scanf("\n%d", &item);
    if(rear == maxsize-1)
    {
        printf("\nOVERFLOW\n");
        return;
    }
    if(front == -1 && rear == -1)
    {
        front = 0;
        rear = 0;
    }
    else
    {
        rear = rear + 1;
    }
    queue[rear] = item;
    printf("\nValue inserted ");
}

void delete()
{
    int item;
    if (front == -1 || front > rear)
    {
        printf("\nUNDERFLOW\n");
        return;
    }
    else
    {
        item = queue[front];
        if(front == rear)
        {
            front = -1;
            rear = -1;
        }
        else
        {
            front = front + 1;
        }
        printf("\nvalue deleted ");
    }
}

void display()
{
    int i;
    if(rear == -1)
    {
        printf("\nEmpty queue\n");
    }
    else
    {
        printf("\nprinting values .....\n");
        for(i = front; i <= rear; i++)
        {
            printf("\n%d\n", queue[i]);
        }
    }
}

 Drawback of array implementation
Although the technique of creating a queue with an array is easy, there are
some drawbacks of using this technique to implement a queue.

o Memory wastage: The space of the array, which is used to store queue
elements, can never be reused to store later elements of that queue, because
elements are inserted only at the rear end and the value of front may become
so high that all the space before it can never be filled again.

Linked List implementation of Queue


o Due to the drawbacks discussed in the previous section of
this tutorial, the array implementation can not be used for
the large scale applications where the queues are
implemented. One of the alternative of array implementation
is linked list implementation of queue.
o The storage requirement of linked representation of a queue
with n elements is o(n) while the time requirement for
operations is o(1).
o In a linked queue, each node of the queue consists of two
parts i.e. data part and the link part. Each element of the
queue points to its immediate next element in the memory.
o In the linked queue, there are two pointers maintained in the
memory i.e. front pointer and rear pointer. The front pointer
contains the address of the starting element of the queue
while the rear pointer contains the address of the last
element of the queue.
o Insertion and deletions are performed at rear and front end
respectively. If front and rear both are NULL, it indicates that
the queue is empty.
o The linked representation of queue is shown in the following
figure.

 Operation on Linked Queue
There are two basic operations which can be implemented on the

linked queues. The operations are Insertion and Deletion. Insert


operation
The insert operation append the queue by adding an element to
the end of the queue. The new element will be the last element of
the queue.

 Algorithm
o Step 1: Allocate the space for the new node PTR
o Step 2: SET PTR -> DATA = VAL
o Step 3: IF FRONT = NULL
      SET FRONT = REAR = PTR
      SET FRONT -> NEXT = REAR -> NEXT = NULL
   ELSE
      SET REAR -> NEXT = PTR
      SET REAR = PTR
      SET REAR -> NEXT = NULL
   [END OF IF]
o Step 4: END

Deletion
Deletion operation removes the element that is first inserted
among all the queue elements. Firstly, we need to check either the
list is empty or not. The condition front == NULL becomes true if
the list is empty, in this case , we simply write underflow on the
console and make exit.
Otherwise, we will delete the element that is pointed by the pointer
front. For this purpose, copy the node pointed by the front pointer
into the pointer ptr. Now, shift the front pointer, point to its next
node and free the node pointed by the node ptr.

 Algorithm
o Step 1: IF FRONT = NULL
      Write "Underflow"
      Go to Step 5
   [END OF IF]
o Step 2: SET PTR = FRONT
o Step 3: SET FRONT = FRONT -> NEXT
o Step 4: FREE PTR
o Step 5: END

 Menu-Driven Program implementing all the operations on Linked Queue


#include <stdio.h>
#include <stdlib.h>

struct node
{
    int data;
    struct node *next;
};
struct node *front;
struct node *rear;

void insert();
void delete();
void display();

void main ()
{
    int choice;
    while(choice != 4)
    {
        printf("\n*************************Main Menu*****************************\n");
        printf("\n=================================================================\n");
        printf("\n1.insert an element\n2.Delete an element\n3.Display the queue\n4.Exit\n");
        printf("\nEnter your choice ?");
        scanf("%d", &choice);
        switch(choice)
        {
            case 1:
                insert();
                break;
            case 2:
                delete();
                break;
            case 3:
                display();
                break;
            case 4:
                exit(0);
                break;
            default:
                printf("\nEnter valid choice??\n");
        }
    }
}

void insert()
{
    struct node *ptr;
    int item;

    ptr = (struct node *) malloc (sizeof(struct node));
    if(ptr == NULL)
    {
        printf("\nOVERFLOW\n");
        return;
    }
    else
    {
        printf("\nEnter value?\n");
        scanf("%d", &item);
        ptr->data = item;
        if(front == NULL)
        {
            front = ptr;
            rear = ptr;
            front->next = NULL;
            rear->next = NULL;
        }
        else
        {
            rear->next = ptr;
            rear = ptr;
            rear->next = NULL;
        }
    }
}

void delete ()
{
    struct node *ptr;
    if(front == NULL)
    {
        printf("\nUNDERFLOW\n");
        return;
    }
    else
    {
        ptr = front;
        front = front->next;
        free(ptr);
    }
}

void display()
{
    struct node *ptr;
    ptr = front;
    if(front == NULL)
    {
        printf("\nEmpty queue\n");
    }
    else
    {
        printf("\nprinting values .....\n");
        while(ptr != NULL)
        {
            printf("\n%d\n", ptr->data);
            ptr = ptr->next;
        }
    }
}

Circular Queue
There was one limitation in the array implementation of Queue: if the rear reaches
the end position of the Queue, there is a possibility that some vacant spaces are
left at the beginning which cannot be utilized. To overcome this limitation, the
concept of the circular queue was introduced.

What is a Circular Queue?
A circular queue is similar to a linear queue as it is also based on
the FIFO (First In First Out) principle except that the last position is
connected to the first position in a circular queue that forms a
circle. It is also known as a Ring Buffer.

 Operations on Circular Queue


The following are the operations that can be performed on a
circular queue:

o Front: It is used to get the front element from the Queue.
o Rear: It is used to get the rear element from the Queue.
o enQueue(value): This function is used to insert the new value in the Queue.
  The new element is always inserted from the rear end.
o deQueue(): This function deletes an element from the Queue. The deletion in
  a Queue always takes place from the front end.

 Applications of Circular Queue


The circular Queue can be used in the following scenarios:

o Memory management: The circular queue provides memory


management. As we have already seen that in linear queue, the
memory is not managed very efficiently. But in case of a circular

queue, the memory is managed efficiently by placing the elements
in a location which is unused.
o CPU Scheduling: The operating system also uses the circular
queue to insert the processes and then execute them.
o Traffic system: In a computer-control traffic system, traffic light is
one of the best examples of the circular queue. Each light of traffic
light gets ON one by one after every interval of time. Like red light
gets ON for one minute then yellow light for one minute and then
green light. After green light, the red light gets ON.
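The following C sketch shows one common way to code enQueue and deQueue for an array-based circular queue; the size of 5 and the modulo arithmetic used for wrapping around are illustrative assumptions, not the only possible choice.

#include <stdio.h>
#define SIZE 5

int cq[SIZE], front = -1, rear = -1;

void enQueue(int value) {
    if ((rear + 1) % SIZE == front) {          /* next rear position meets front: queue is full */
        printf("Queue is full\n");
        return;
    }
    if (front == -1) front = 0;                /* first element being inserted */
    rear = (rear + 1) % SIZE;                  /* wrap around to the beginning if needed */
    cq[rear] = value;
}

int deQueue(void) {
    if (front == -1) {                         /* empty queue */
        printf("Queue is empty\n");
        return -1;
    }
    int value = cq[front];
    if (front == rear)                         /* last element removed: reset the queue */
        front = rear = -1;
    else
        front = (front + 1) % SIZE;
    return value;
}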

Deque
o The deque stands for Double Ended Queue. In the queue, the insertion takes
place from one end while the deletion takes place from the other end. The end
at which the insertion occurs is known as the rear end, whereas the end at
which the deletion occurs is known as the front end.

o Deque is a linear data structure in which the insertion and


deletion operations are performed from both ends. We can say
that deque is a generalized version of the queue.

 Priority Queue
A priority queue is a special type of queue in which each element is
associated with a priority and is served according to its priority. If
elements with the same priority occur, they are served according to
their order in the queue.

Priority Queue Representation

Insertion occurs based on the arrival of the values and removal


occurs based on priority.
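One simple (not the only) way to realise this behaviour is an unsorted array in which insertion appends in arrival order and deletion scans for the highest priority; the sketch below assumes that a smaller number means higher priority, which is an assumption made for this illustration.

#include <stdio.h>
#define MAX 20

int pq[MAX], count = 0;

void pq_insert(int value) {            /* insertion in arrival order */
    if (count < MAX)
        pq[count++] = value;
}

int pq_remove(void) {                  /* removal by priority: smallest value first      */
    int best = 0, i, value;            /* caller must ensure the queue is not empty      */
    for (i = 1; i < count; i++)
        if (pq[i] < pq[best])
            best = i;
    value = pq[best];
    pq[best] = pq[--count];            /* move the last element into the emptied slot    */
    return value;
}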

UNIT-II

5.Rehashing

Skip list:

A skip list is a probabilistic data structure. The skip list is used to store a sorted list
of elements or data with a linked list. It allows the process of the elements or data
to view efficiently. In one single step, it skips several elements of the entire list,
which is why it is known as a skip list.

The skip list is an extended version of the linked list. It allows the user to search,
remove, and insert the element very quickly.

Skip list structure:

It is built in two layers:

The lowest layer and Top layer.

The lowest layer of the skip list is a common sorted linked list, and the top layers
of the skip list are like an "express line" where the elements are skipped.

Working of the Skip list:

Let's take an example to understand the working of the skip list. In this example,
we have 14 nodes, such that these nodes are divided into two layers, as shown in
the diagram.

The lower layer is a common line that links all nodes, and the top layer is an
express line that links only the main nodes, as you can see in the diagram.

Suppose you want to find 47 in this example. You will start the search from the
first node of the express line and continue running on the express line until you
find a node that is equal to 47 or greater than 47.

You can see in the example that 47 does not exist in the express line, so you search
for a node of less than 47, which is 40. Now, you go to the normal line with the
help of 40, and search the 47, as shown in the diagram.

Skip List Basic Operations

There are the following types of operations in the skip list.

Insertion operation: It is used to add a new node to a particular location in a
specific situation.

Deletion operation: It is used to delete a node in a specific situation.

Search Operation: The search operation is used to search a particular node in a


skip list.
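Before the pseudocode below, it may help to see how a skip-list node is usually declared in C; the field names, the MAX_LEVEL value and the coin-flip level generator in this sketch are assumptions made for illustration only.

#include <stdlib.h>

#define MAX_LEVEL 6   /* highest allowed level for this sketch */

struct skipnode {
    int key;
    int value;
    struct skipnode *forward[MAX_LEVEL + 1];   /* forward[i] = next node on level i */
};

struct skiplist {
    int level;                  /* current highest level in use                 */
    struct skipnode *header;    /* header node carrying MAX_LEVEL forward links */
};

/* random level: keep "flipping a coin" until it comes up tails or MAX_LEVEL is reached */
static int random_level(void) {
    int lvl = 0;
    while (rand() % 2 && lvl < MAX_LEVEL)
        lvl++;
    return lvl;
}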

Algorithm of the insertion operation

Insertion (L, Key, value)

local update[0 ... Max_Level + 1]

a = L → header

for i = L → level down to 0 do
    while a → forward[i] → key < Key do
        a = a → forward[i]
    update[i] = a

a = a → forward[0]

lvl = random_Level()

if lvl > L → level then
    for i = L → level + 1 to lvl do
        update[i] = L → header
    L → level = lvl

a = makeNode(lvl, Key, value)

for i = 0 to lvl do
    a → forward[i] = update[i] → forward[i]
    update[i] → forward[i] = a

Algorithm of deletion operation:

Deletion (L, Key)

local update[0 ... Max_Level + 1]

a = L → header

for i = L → level down to 0 do
    while a → forward[i] → key < Key do
        a = a → forward[i]
    update[i] = a

a = a → forward[0]

if a → key = Key then
    for i = 0 to L → level do
        if update[i] → forward[i] ≠ a then break
        update[i] → forward[i] = a → forward[i]
    free(a)
    while L → level > 0 and L → header → forward[L → level] = NIL do
        L → level = L → level - 1

Algorithm of searching operation

Searching (L, SKey)

a = L → header

for i = L → level down to 0 do
    while a → forward[i] → key < SKey do
        a = a → forward[i]

a = a → forward[0]

if a → key = SKey then return a → value
else return failure

Example 1: Create a skip list, we want to insert these following keys in the empty
skip list.

6 with level 1.

29 with level 1.

22 with level 4.

9 with level 3.

17 with level 1.

4 with level 2.

Ans:

Step 1: Insert 6 with level 1

Step 2: Insert 29 with level 1

Step 3: Insert 22 with level 4

Step 4: Insert 9 with level 3

Step 5: Insert 17 with level 1

Step 6: Insert 4 with level 2

Example 2: Consider this example where we want to search for key 17.

Ans:

Advantages of the Skip list

If you want to insert a new node in the skip list, then it will insert the node very
fast because there are no rotations in the skip list.

The skip list is simple to implement as compared to the hash table and the binary
search tree.

It is very simple to find a node in the list because it stores the nodes in sorted form.

The skip list algorithm can be modified very easily in a more specific structure,
such as indexable skip lists, trees, or priority queues.

The skip list is a robust and reliable list.

Disadvantages of the Skip list:

It requires more memory than the balanced tree.

Reverse searching is not allowed.

The skip list searches the node much slower than the linked list.

Dictionary:
Dictionary is one of the important Data Structures that is usually used to store data
in the key-value format. Each element presents in a dictionary data structure
compulsorily have a key and some value is associated with that particular key.

In other words, we can also say that Dictionary data structure is used to store the
data in key-value pairs.

In Dictionary or associative array, the relation or association between the key and
the value is known as the mapping.

A dictionary is also called a hash, a map, or a hashmap in different
programming languages.

example

the results of a classroom test could be represented as a dictionary with pupil's


names as keys and their scores as the values

results = { 'Detra' : 17,

'Nova' : 84,

'Charlie' : 22,

'Henry' : 75,

'Roxanne' : 92,

'Elsa' : 29 }

Instead of using the numerical index of the data we can use the dictionary names
to return values

Dictionary is an abstract data structure that supports the following operations: –

search(K key)
(returns the value associated with the given key) –

insert(K key, V value)

delete(K key)

• Each element stored in a dictionary is identified by a key of type K.

Dictionary represents a mapping from keys to values

Operations on dictionaries

Dictionaries typically support several operations:

retrieve a value (depending on language, attempting to retrieve a missing key may


give a default value or throw an exception)

insert or update a value (typically, if the key does not exist in the dictionary, the
key-value pair is inserted; if the key already exists, its corresponding value is
overwritten with the new one)

remove a key-value pair – test for existence of a key

Note that items in a dictionary are unordered, so loops over dictionaries will
return items in an arbitrary order

Implementations on dictionaries

simple implementations: sorted or unsorted sequences, direct addressing

• hash tables

• binary search trees (BST)

• AVL trees

• red-black trees

Dictionary Implementation withHash-Table

Following are basic primary operations of a hashtable which are following.


Search − search an element in a hashtable.

Insert − insert an element in a hashtable.

delete − delete an element from a hashtable

• DataItem Define a data item having some data, and key based on which search is
to be conducted in hashtable.

struct DataItem
{
    int data;
    int key;
};

Hash Method Define a hashing method to compute the hash code of the key of the
data item.

int hashCode(int key)
{
    return key % SIZE;
}

struct nlist {             /* table entry: */
    struct nlist *next;    /* next entry in chain */
    char *name;            /* defined name */
    char *defn;            /* replacement text */
};

#define HASHSIZE 101
static struct nlist *hashtab[HASHSIZE];   /* pointer table */

/* hash: form hash value for string s */
unsigned hash(char *s)
{
    unsigned hashval;
    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % HASHSIZE;
}

/* lookup: look for s in hashtab */
struct nlist *lookup(char *s)
{
    struct nlist *np;
    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return np;    /* found */
    return NULL;          /* not found */
}

char *strdup(char *);

/* install: put (name, defn) in hashtab */
struct nlist *install(char *name, char *defn)
{
    struct nlist *np;
    unsigned hashval;
    if ((np = lookup(name)) == NULL) {   /* not found */
        np = (struct nlist *) malloc(sizeof(*np));
        if (np == NULL || (np->name = strdup(name)) == NULL)
            return NULL;
        hashval = hash(name);
        np->next = hashtab[hashval];
        hashtab[hashval] = np;
    } else                               /* already there */
        free((void *) np->defn);         /* free previous defn */
    if ((np->defn = strdup(defn)) == NULL)
        return NULL;
    return np;
}

char *strdup(char *s)   /* make a duplicate of s */
{
    char *p;
    p = (char *) malloc(strlen(s) + 1);  /* +1 for '\0' */
    if (p != NULL)
        strcpy(p, s);
    return p;
}

Unit 3

Non Linear Data Structures - Trees

PRELIMINARIES :
TREE : A tree is a finite set of one or more nodes such that there is a specially
designated node called the Root, and zero or more non-empty subtrees T1, T2, ..., Tk,
each of whose roots are connected by a directed edge from Root R.


A

B C D E

F G H I J

K L M

Fig. 3.1.1 Tree


ROOT : A node which doesn‘t have a parent. In the above tree. The Root is A.

NODE : Item of Information.

LEAF : A node which doesn‘t have children is called leaf or Terminal node. Here B, K,
L, G, H, M, J are leafs.

SIBLINGS : Children of the same parents are said to be siblings, Here B, C, D, E are
siblings, F, G are siblings. Similarly I, J & K, L are siblings.

PATH : A path from node n1 to nk is defined as a sequence of nodes n1, n2, n3, ..., nk such that
ni is the parent of ni+1 for 1 <= i < k. There is exactly one path from each
node to the root.
In fig 3.1.1 path from A to L is A, C, F, L. where A is the parent for C, C is the
parent of F and F is the parent of L.
LENGTH : The length is defined as the number of edges on the path.
In fig 3. 1.1 the length for the path A to L is 3.
DEGREE : The number of subtrees of a node is called its degree.

In fig 3.1.1
Degree of A is 4, Degree of C is 2,
Degree of D is 1, Degree of H is 0.

* The degree of the tree is the maximum degree of any node in the tree.
In fig 3.1.1 the degree of the tree is 4.
LEVEL : The level of a node is defined by initially letting the root be at level one, if a node is at level L
then its children are at level L + 1.
Level of A is 1.
Level of B, C, D, is 2. Level of F, G, H, I, J
is 3 Level of K, L, M is 4.

DEPTH : For any node n, the depth of n is the length of the unique path from root to n.

The depth of the root is zero. In fig 3.1.1


Depth of node F is 2. Depth of node L is 3.

HEIGHT : For any node n, the height of the node n is the length of the longest path from n to the leaf.

The height of the leaf is zero


In fig 3.1.1 Height of node F is 1. Height of L is 0.

Note : The height of the tree is equal to the height of


the root

Depth of the tree is equal to the height of the


tree.

Degree of a Node

The degree of a node is the total number of branches of that node.

Binary Tree

Binary Tree Representation

A node of a binary tree is represented by a structure containing a data part and two
pointers to other structures of the same type.

struct node
{
    int data;
    struct node *left;
    struct node *right;
};

Binary Search Tree(BST)

Binary search tree is a data structure that quickly allows us to maintain a sorted list of
numbers.

 It is called a binary tree because each tree node has a maximum of two children.

 It is called a search tree because it can be used to search for the presence of a
number in O(log(n)) time.
The properties that separate a binary search tree from a regular binary tree is

1. All nodes of left subtree are less than the root node

2. All nodes of right subtree are more than the root node

3. Both subtrees of each node are also BSTs, i.e., they have the above two properties.
BST Basic Operations
The basic operations that can be performed on a binary search tree data
structure, are the following −
 Insert − Inserts an element in a tree/create a tree.
 Search − Searches an element in a tree.
 Preorder Traversal − Traverses a tree in a pre-order manner.
 Inorder Traversal − Traverses a tree in an in-order manner.
 Postorder Traversal − Traverses a tree in a post-order manner.

Insert Operation
The very first insertion creates the tree. Afterwards, whenever an element is to
be inserted, first locate its proper location. Start searching from the root node,
then if the data is less than the key value, search for the empty location in the
left subtree and insert the data. Otherwise, search for the empty location in the
right subtree and insert the data.

Algorithm

If root is NULL
    create root node
    return
If root exists then
    compare the data with node.data
    while the insertion position is not located
        If data is greater than node.data
            go to the right subtree
        else
            go to the left subtree
    insert the new node at the located position
Implementation

The implementation of insert function should look like this −

void insert(int data) {
   struct node *tempNode = (struct node*) malloc(sizeof(struct node));
   struct node *current;
   struct node *parent;

   tempNode->data = data;
   tempNode->leftChild = NULL;
   tempNode->rightChild = NULL;

   //if tree is empty, create root node
   if(root == NULL) {
      root = tempNode;
   } else {
      current = root;
      parent = NULL;

      while(1) {
         parent = current;

         //go to left of the tree
         if(data < parent->data) {
            current = current->leftChild;

            //insert to the left
            if(current == NULL) {
               parent->leftChild = tempNode;
               return;
            }
         } else {
            //go to right of the tree
            current = current->rightChild;

            //insert to the right
            if(current == NULL) {
               parent->rightChild = tempNode;
               return;
            }
         }
      }
   }
}

Search Operation
Whenever an element is to be searched, start searching from the root node,
then if the data is less than the key value, search for the element in the left
subtree. Otherwise, search for the element in the right subtree. Follow the same
algorithm for each node.

Algorithm

If root.data is equal to search.data
    return root
else
    while data not found
        If data is greater than node.data
            go to the right subtree
        else
            go to the left subtree
        If data found
            return node
    endwhile
    return not found
The implementation of this algorithm should look like this.

struct node* search(int data) {
   struct node *current = root;
   printf("Visiting elements: ");

   while(current->data != data) {
      if(current != NULL)
         printf("%d ",current->data);

      //go to left tree
      if(current->data > data) {
         current = current->leftChild;
      }
      //else go to right tree
      else {
         current = current->rightChild;
      }

      //not found
      if(current == NULL) {
         return NULL;
      }
   }
   return current;
}


Deletion

There are 3 cases that can happen when you are trying to delete a node.
If it has,

1. No subtree (no children): This is the easiest case. You can simply delete the node, without any additional actions required.

2. One subtree (one child): You have to make sure that after the node is deleted, its child is connected to the deleted node's parent.

3. Two subtrees (two children): You have to find and replace the node you want to delete with its inorder successor (the leftmost node in the right subtree).
o Delete Operation

void deleteNode(struct node* root, int data){

if (root == NULL) root=tempnode;

if (data < root->key)

root->left = deleteNode(root->left, key);

else if (key > root->key)

root->right = deleteNode(root->right, key);

else

if (root->left == NULL)

struct node *temp = root->right;


free(root);

return temp;

else if (root->right == NULL)

struct node *temp = root->left;


free(root);

return temp;

struct node* temp = minValueNode(root->right);

root->key = temp->key;

root->right = deleteNode(root->right, temp->key);

148
}

return root;
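The deleteNode routine above calls a helper named minValueNode to locate the inorder successor; the notes do not list it, so here is a minimal sketch under the assumption that it simply walks left from the given subtree root, using the same node fields as the listing above.

/* returns the node with the minimum key in the given subtree:
   keep following left pointers until there is no left child */
struct node* minValueNode(struct node* node) {
    struct node* current = node;
    while (current != NULL && current->left != NULL)
        current = current->left;
    return current;
}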

TREE TRAVERSALS:

Traversal is a process to visit all the nodes of a tree and may print their values
too. Because all nodes are connected via edges (links), we always start from the
root (head) node; that is, we cannot randomly access a node in a tree. There
are three ways which we use to traverse a tree −

 In-order Traversal
 Pre-order Traversal
 Post-order Traversal
Generally, we traverse a tree to search or locate a given item or key in the tree
or to print all the values it contains.

In-order Traversal
In this traversal method, the left subtree is visited first, then the root and later
the right sub- tree. We should always remember that every node may represent
a subtree itself.
If a binary tree is traversed in-order, the output will produce sorted key values in
an ascending order.

We start from A, and following in-order traversal, we move to its left subtree B.
B is also traversed in-order. The process goes on until all the nodes are visited.
The output of inorder traversal of this tree will be −
o D→B→E→A→F→C→G

Algorithm
Until all nodes are traversed −

Step 1 − Recursively traverse left subtree.

Step 2 − Visit root node.

Step 3 − Recursively traverse right subtree.

void inorder_traversal(struct node* root) {
   if(root != NULL) {
      inorder_traversal(root->leftChild);
      printf("%d ",root->data);
      inorder_traversal(root->rightChild);
   }
}

Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree
and finally the right subtree.

We start from A, and following pre-order traversal, we first visit A itself and then
move to its left subtree B. B is also traversed pre-order. The process goes on
until all the nodes are visited. The output of pre-order traversal of this tree will
be −
o A→B→D→E→C→F→G

Algorithm
Until all nodes are traversed −

Step 1 − Visit root node.

Step 2 − Recursively traverse left subtree.

Step 3 − Recursively traverse right subtree.

void pre_order_traversal(struct node* root) {
   if(root != NULL) {
      printf("%d ",root->data);
      pre_order_traversal(root->leftChild);
      pre_order_traversal(root->rightChild);
   }
}

Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we
traverse the left subtree, then the right subtree and finally the root node.

We start from A, and following Post-order traversal, we first visit the left subtree
B. B is also traversed post-order. The process goes on until all the nodes are
visited. The output of post-order traversal of this tree will be −
o D→E→B→F→G→C→A

Algorithm
Until all nodes are traversed −

Step 1 − Recursively traverse left subtree.


Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.

void post_order_traversal(struct node* root) {
   if(root != NULL) {
      post_order_traversal(root->leftChild);
      post_order_traversal(root->rightChild);
      printf("%d ", root->data);
   }
}
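As a quick illustration (not part of the original notes), the three traversal routines above can be exercised on a small hand-built tree. The helper newNode below is a hypothetical allocator for the same struct node with data, leftChild and rightChild fields assumed by those routines; nodes 1..7 play the roles of A..G in the figures.

#include <stdio.h>
#include <stdlib.h>

/* hypothetical helper: allocate a node with the given data and no children */
struct node* newNode(int data) {
    struct node* n = (struct node*) malloc(sizeof(struct node));
    n->data = data;
    n->leftChild = NULL;
    n->rightChild = NULL;
    return n;
}

int main(void) {
    /* build the example tree: 1 (A) at the root, 2 (B) and 3 (C) as children,
       4 (D), 5 (E) under B and 6 (F), 7 (G) under C */
    struct node* root = newNode(1);
    root->leftChild = newNode(2);
    root->rightChild = newNode(3);
    root->leftChild->leftChild = newNode(4);
    root->leftChild->rightChild = newNode(5);
    root->rightChild->leftChild = newNode(6);
    root->rightChild->rightChild = newNode(7);

    printf("Inorder   : "); inorder_traversal(root);    printf("\n");  /* 4 2 5 1 6 3 7 */
    printf("Preorder  : "); pre_order_traversal(root);  printf("\n");  /* 1 2 4 5 3 6 7 */
    printf("Postorder : "); post_order_traversal(root); printf("\n");  /* 4 5 2 6 7 3 1 */
    return 0;
}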

Example : -
Traverse the given tree using inorder, preorder and postorder traversals.

[Fig. 3.2.7: expression tree for (A + B) * (C - D / E)]

Inorder   : A + B * C - D / E
Preorder  : * + A B - C / D E
Postorder : A B + C D E / - *

[Fig. 3.2.8: binary search tree with root 20, children 10 and 30, and leaves 5, 15, 25, 40]
Inorder :5 10 15 20 25 30 40
Preorder : 20 10 5 15 30 25 40
Postorder :5 15 10 25 40 30 20

3.1 BINARY TREE

Definition :-
A binary tree is a tree in which no node can have more than two children.
The maximum number of nodes at level i of a binary tree is 2^i (taking the level of the root as 0).

[Fig. 3.3.1 Binary Tree]


Binary Tree Node Declarations
struct TreeNode
{
    int Element;
    struct TreeNode *Left;
    struct TreeNode *Right;
};
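As a small illustrative sketch (not from the notes), here are two recursive routines over the TreeNode declaration above: one counts the nodes of a binary tree and one computes its height, taking the height of a single-node tree as 0, as in the terminology section earlier.

/* total number of nodes in the tree rooted at T */
int CountNodes(struct TreeNode *T) {
    if (T == NULL)
        return 0;
    return 1 + CountNodes(T->Left) + CountNodes(T->Right);
}

/* height of the tree: length of the longest path down to a leaf;
   a single node has height 0, and an empty tree is taken as -1 */
int Height(struct TreeNode *T) {
    if (T == NULL)
        return -1;
    int hl = Height(T->Left);
    int hr = Height(T->Right);
    return 1 + (hl > hr ? hl : hr);
}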
COMPARISON BETWEEN GENERAL TREE & BINARY TREE

General Tree
* A general tree node can have any number of children.

Binary Tree
* A binary tree node has no more than two children.

[Example trees omitted]
Full Binary Tree :-

A full binary tree of height h has 2^(h+1) - 1 nodes.

Here the height is 3, so the number of nodes in the full binary tree is 2^(3+1) - 1 = 15 nodes.

[Fig. 3.3.2 A Full Binary Tree: level 1 holds node 1; level 2 holds 2, 3; level 3 holds 4-7; level 4 holds 8-15]

Complete Binary Tree :
A complete binary tree of height h has between 2^h and 2^(h+1) - 1 nodes. In the bottom level the elements should be filled from left to right.
[Fig. 3.3.3 A Complete Binary Tree: root 1; next level 2, 3; then 4-7; bottom level 8, 9 filled from the left]


Note : A full binary tree is always a complete binary tree, but not every complete binary tree is a full binary tree.


3.3.1 Representation of a Binary Tree


There are two ways for representing binary tree, they are
* Linear Representation
* Linked Representation
Linear Representation
The elements are represented using arrays. For any element in position i, the left child is in
position 2i, the right child is in position (2i + 1), and the parent is in position (i/2).

Index:   1   2   3   4   5   6   7
Element: A   B   C   D   E   F   G

Fig. 3.3.4 Linear Representation
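A minimal sketch of this index arithmetic in C (illustrative only; it assumes 1-based indexing in a plain int array, with index 0 left unused, and stores A..G of Fig. 3.3.4 as the values 1..7):

#include <stdio.h>

#define MAX 16

int tree[MAX];                                /* tree[1] is the root */

int leftChild(int i)  { return 2 * i; }       /* left child of position i  */
int rightChild(int i) { return 2 * i + 1; }   /* right child of position i */
int parent(int i)     { return i / 2; }       /* parent of position i      */

int main(void) {
    for (int i = 1; i <= 7; i++)              /* fill positions 1..7 */
        tree[i] = i;

    printf("Root: %d\n", tree[1]);
    printf("Children of position 2: %d %d\n",
           tree[leftChild(2)], tree[rightChild(2)]);
    printf("Parent of position 7: %d\n", tree[parent(7)]);
    return 0;
}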


Linked Representation

The elements are represented using pointers. Each node in linked representation has
three fields, namely,
* Pointer to the left subtree
* Data field
* Pointer to the right subtree
In leaf nodes, both the pointer fields are assigned as NULL.

[Fig. 3.3.5 Linked Representation of the tree with root A, children B and C, and leaves D, E, F, G]


3.3.2 The Leftmost-child, Right-sibling Data Structures
In this representation, cellspace contains three fields namely, leftmost child, label and
right sibling. A node is identified with the index of the cell in cellspace that represents it as a
child. Then, next pointers of cellspace point to right siblings, and the information contained in
the nodespace array can be held by introducing a field leftmost-child in cellspace.
Declaration of cellspace in the leftmost-child, right-sibling data structure:

typedef struct cellspace *ptrtonode;

struct cellspace
{
    ElementType label;
    ptrtonode leftmostchild;
    ptrtonode rightsibling;
};

Example 1:

[Figure 3.3.6 : Leftmost-child, right-sibling representation of a small tree with nodes A, B, C, D; each cell of cellspace holds a leftmost-child pointer, a label and a right-sibling pointer]


Example 2:

[Figure 3.3.7 : Leftmost-child, right-sibling representation of a larger tree with nodes A, B, C, D, E, F, G, H, I, J, K stored in the cellspace array]


3.2 EXPRESSION TREE
Expression Tree is a binary tree in which the leaf nodes are operands and the interior
nodes are operators. Like a binary tree, an expression tree can also be traversed by inorder,
preorder and postorder traversal.

Constructing an Expression Tree


Let us consider postfix expression given as an input for constructing an expression tree
by performing the following steps :
1. Read one symbol at a time from the postfix expression.
2. Check whether the symbol is an operand or operator.
(a) If the symbol is an operand, create a one - node tree and push a
pointer on to the stack.
(b) If the symbol is an operator pop two pointers from the stack namely
T1 and T2 and form a new tree with root as the operator and T2 as a left
child and T1 as a right child. A pointer to this new tree is then pushed
onto the stack.

Example : -
Postfix expression: a b + c *

The first two symbols are operands, so create a one-node tree for each and push a pointer to each onto the stack.

[Fig. 3.4.1 (a): one-node trees a and b on the stack]

Next the '+' symbol is read, so two pointers are popped, a new tree is formed with '+' as the root and a, b as its children, and a pointer to it is pushed onto the stack.

[Fig. 3.4.2 (b): tree (a + b) on the stack]

Next the operand c is read, so a one-node tree is created and the pointer to it is pushed onto the stack.

[Fig. 3.4.2 (c): trees (a + b) and c on the stack]

Now '*' is read, so the two trees are popped and merged with '*' as the root, and the pointer to the final tree is pushed onto the stack.

[Fig. 3.4.3 (d): final expression tree (a + b) * c]
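The construction steps above can be sketched in C. This is an illustrative version only (not from the notes), assuming single-character operands, a small fixed-size pointer stack, and a two-pointer node layout like the one used earlier:

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

struct enode {
    char symbol;
    struct enode *left, *right;
};

static struct enode *stack[64];   /* stack of subtree pointers */
static int top = -1;

static struct enode *makeNode(char s) {
    struct enode *n = malloc(sizeof *n);
    n->symbol = s;
    n->left = n->right = NULL;
    return n;
}

/* build an expression tree from a postfix string such as "ab+c*" */
struct enode *buildExpressionTree(const char *postfix) {
    for (const char *p = postfix; *p; p++) {
        if (isalnum((unsigned char)*p)) {
            stack[++top] = makeNode(*p);      /* operand: push a one-node tree */
        } else {
            struct enode *t1 = stack[top--];  /* first pop becomes the right child */
            struct enode *t2 = stack[top--];  /* second pop becomes the left child */
            struct enode *op = makeNode(*p);
            op->left = t2;
            op->right = t1;
            stack[++top] = op;                /* push the merged tree */
        }
    }
    return stack[top--];                      /* the final tree is left on the stack */
}

/* inorder print gives the infix form (without parentheses) */
void printInfix(struct enode *t) {
    if (t == NULL) return;
    printInfix(t->left);
    printf("%c", t->symbol);
    printInfix(t->right);
}

int main(void) {
    struct enode *t = buildExpressionTree("ab+c*");
    printInfix(t);        /* prints a+b*c */
    printf("\n");
    return 0;
}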


3.5. APPLICATIONS OF TREE
 Binary Search Tree - Used in many search applications where data is constantly
entering/ leaving, such as the map and set objects in many languages‘ libraries.
 Binary Space Partition - Used in almost every 3D video game to determine what objects
need to be rendered.
 Binary Tries - Used in almost every high-bandwidth router for storing router-tables.
 Hash Trees - used in p2p programs and specialized image-signatures in which a hash
needs to be verified, but the whole file is not available.
 Heaps - Used in implementing efficient priority-queues, which in turn are used for
scheduling processes in many operating systems, Quality-of-Service in routers, and A*
(path-finding algorithm used in AI applications, including robotics and video games). Also
used in heap- sort.
 Huffman Coding Tree (Chip Uni) - used in compression algorithms, such as those used by
the .jpeg and .mp3 file-formats.
 GGM Trees - Used in cryptographic applications to generate a tree of pseudo-
random numbers.

 Syntax Tree - Constructed by compilers and (implicitly) calculators to parse expressions.

 Treap - Randomized data structure used in wireless networking and memory allocation.

 T-tree - Though most databases use some form of B-tree to store data on the drive,
databases which keep all (most) their data in memory often use T-trees to do so.

BTree :

We use B-Trees for indexing large records in databases to improve search.

3.6 THE SEARCH TREE ADT :- BINARY SEARCH TREE

Definition :-
Binary search tree is a binary tree in which for every node X in the tree, the values of all
the keys in its left subtree are smaller than the key value in X, and the values of all the keys in
its right subtree are larger than the key value in X.

[Fig. 3.6.1 Binary Search Tree]

Comparison Between Binary Tree & Binary Search Tree
Binary Tree
* A tree is said to be a binary tree if each node has at most two children.
* It doesn't have any ordering of key values.

Binary Search Tree
* A binary search tree is a binary tree in which the key values in the left subtree are less than the root and the key values in the right subtree are greater than the root.

[Example trees omitted: the same keys arranged as an arbitrary binary tree and as an ordered binary search tree]
Note : * Every binary search tree is a binary tree.


* All binary trees need not be a binary search tree.
Example :- Construction of a BST (Binary Search Tree)

To insert 8, 5, 10, 15, 20, 18, 3:
* The first element 8 is taken as the root.
* As 5 < 8, 5 is placed as the left child of 8.
* As 10 > 8, 10 is placed as the right child of 8.
* Similarly the rest of the elements (15, 20, 18, 3) are inserted by comparing with each node starting from the root.

[Figures omitted: the tree after each insertion; the final tree has root 8, left subtree {3, 5} and right subtree {10, 15, 18, 20}]
Find : -

* Check whether the root is NULL; if so, return NULL.
* Otherwise, compare the value X with the root node value (i.e. T->data):
  (1) If X is equal to T->data, return T.
  (2) If X is less than T->data, traverse the left subtree of T recursively.
  (3) If X is greater than T->data, traverse the right subtree of T recursively.

Example : - To find the element 10 (X = 10) in the BST with root 8, left subtree {3, 5} and right subtree {10, 15}:

* 10 is compared with the root 8; since 10 > 8, go to the right child of 8.
* 10 is compared with 15; since 10 < 15, go to the left child of 15.
* 10 is compared with 10: found, so the node is returned.
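The Find steps above can be written as a short recursive routine. This is a minimal sketch (not from the notes), assuming each node has data, left and right fields; the listings in these notes use slightly different field names in different places.

/* Recursive Find: returns the node containing X, or NULL if X is not present */
struct node* Find(int X, struct node *T) {
    if (T == NULL)
        return NULL;               /* empty tree: not found */
    if (X < T->data)
        return Find(X, T->left);   /* search the left subtree */
    else if (X > T->data)
        return Find(X, T->right);  /* search the right subtree */
    else
        return T;                  /* X == T->data: found */
}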

Find Min :
This operation returns the position of the smallest element in the tree.

To perform FindMin, start at the root and go left as long as there is a left child. The
stopping point is the smallest element.
[Figure omitted: FindMin walks left from the root 10 to 5 and then to 3]

(a) T != NULL and T->Left != NULL, so traverse left.
(b) T != NULL and T->Left != NULL, so traverse left again.
(c) Since T->Left is NULL, return T as the minimum element.

Non-Recursive Routine for FindMin

SearchTree FindMin(SearchTree T)
{
    if (T != NULL)
        while (T->Left != NULL)
            T = T->Left;
    return T;
}

FindMax
FindMax routine return the position of largest elements in the tree. To perform a
FindMax, start at the root and go right as long as there is a right child. The stopping point is the
largest element.
Recursive Routine for FindMax

SearchTree FindMax(SearchTree T)
{
    if (T == NULL)
        return NULL;
    else if (T->Right == NULL)
        return T;
    else
        return FindMax(T->Right);
}
Example :-
[Figure omitted: FindMax walks right from the root 10 to 15 and then to 20]

(a) T != NULL and T->Right != NULL, so traverse right.
(b) T != NULL and T->Right != NULL, so traverse right again.
(c) Since T->Right is NULL, return T as the maximum element.

BST DELETION:
Deletion is the most complex operation on a binary search tree. To delete an element, consider the following three possibilities.

CASE 1 : Node to be deleted is a leaf node (i.e. no children).
CASE 2 : Node with one child.
CASE 3 : Node with two children.

CASE 1 : Node with no children (leaf node)

If the node is a leaf node, it can be deleted immediately.

Delete 8:

[Figure omitted: the leaf node 8 is removed; the remaining tree keeps nodes 4 and 10]

CASE 2 :- Node with one child

If the node has one child, it can be deleted by adjusting its parent's pointer so that it points to the child node.

[Figure omitted: the tree rooted at 7 before and after deleting 5]

To delete 5, the pointer currently pointing to node 5 is made to point to its child node 6.
Case 3 : Node with two children

It is difficult to delete a node which has two children. The general strategy is to replace the data of the node to be deleted with the smallest value in its right subtree and then recursively delete that node.

Example 1 : To delete 5 from the tree rooted at 10.

[Figure (a) omitted]

* The minimum element in the right subtree of 5 is 7.

* Now the value 7 is copied into the position of 5.
* Since the node that originally held 7 is a leaf node, it is deleted immediately.

[Figures (b)-(c) omitted: after deleting node 5, the tree is rooted at 10, with left child 7 (children 3 and 8) and right child 15]

Example 2 : - To delete 25 from the tree rooted at 15, where 25 has children 20 and 35, and 35 has a left child 30 whose right child is 32.

[Figure (a) omitted]

* The minimum element in the right subtree of 25 is 30.

* The minimum value 30 is copied into the position of 25.
* Since the node that originally held 30 has one child (32), the pointer currently pointing to that node is made to point to its child node 32.

[Figures (b)-(d) omitted: the binary search tree after deleting 25 is rooted at 15, with left subtree {5, 10} and right subtree {20, 30, 32, 35}]
AVL TREE (ADELSON-VELSKY AND LANDIS)

AVL Tree
The AVL tree was invented by G. M. Adelson-Velsky and E. M. Landis in 1962. The tree is named AVL in
honour of its inventors.

An AVL tree can be defined as a height-balanced binary search tree in which each node is associated with
a balance factor, calculated by subtracting the height of its right subtree from that of its left
subtree.

Tree is said to be balanced if balance factor of each node is in between -1 to 1, otherwise, the
tree will be unbalanced and need to be balanced.

Balance Factor (k) = height (left(k)) - height (right(k))


If balance factor of any node is 1, it means that the left sub-tree is one level higher than the right
sub-tree.

If balance factor of any node is 0, it means that the left sub-tree and right sub-tree contain equal
height.

If balance factor of any node is -1, it means that the left sub-tree is one level lower than the right
sub-tree.

An AVL tree is shown in the following figure (omitted here). We can see that the balance factor associated with each
node is between -1 and +1; therefore, it is an example of an AVL tree.
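A small sketch (not from the notes) showing how the balance factor of a node can be computed from subtree heights, assuming an AVL node with left and right pointers. Production AVL code usually caches the height inside each node rather than recomputing it on every call.

struct avlnode {
    int key;
    struct avlnode *left, *right;
};

/* height of a subtree: -1 for an empty tree, 0 for a single node */
int avlHeight(struct avlnode *n) {
    if (n == NULL)
        return -1;
    int hl = avlHeight(n->left);
    int hr = avlHeight(n->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balance Factor (k) = height(left(k)) - height(right(k)) */
int balanceFactor(struct avlnode *k) {
    return avlHeight(k->left) - avlHeight(k->right);
}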
Operations on AVL tree

SN  Operation  Description

1   Insertion  Insertion in an AVL tree is performed in the same way as in a binary search tree. However, it may violate the AVL tree property, and therefore the tree may need balancing. The tree can be balanced by applying rotations.

2   Deletion   Deletion can also be performed in the same way as in a binary search tree. Deletion may also disturb the balance of the tree; therefore, various types of rotations are used to rebalance the tree.

 Why AVL Tree?

An AVL tree controls the height of the binary search tree by not letting it become skewed. The time taken
for all operations in a binary search tree of height h is O(h). However, this can degrade to O(n) if
the BST becomes skewed (the worst case). By limiting the height to log n, the AVL tree imposes an upper
bound of O(log n) on each operation, where n is the number of nodes.

AVL Rotations
We perform rotation in an AVL tree only when the balance factor of a node is other than -1, 0, or 1. There are
basically four types of rotations, which are as follows:

1. L L rotation: Inserted node is in the left subtree of left subtree of A


2. R R rotation : Inserted node is in the right subtree of right subtree of A
3. L R rotation : Inserted node is in the right subtree of left subtree of A
4. R L rotation : Inserted node is in the left subtree of right

subtree of A Where node A is the node whose balance Factor is

other than -1, 0, 1.

The first two rotations LL and RR are single rotations and the next two rotations LR and RL are
double rotations
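The single rotations can be sketched in C as follows (illustrative only, using the avlnode structure assumed above). The LL case is fixed by a right rotation, the RR case by a left rotation, and the double rotations LR and RL are simply one of each applied in sequence. Height/balance-factor updates are omitted for brevity.

/* right rotation: used for the LL case (and as the second half of RL) */
struct avlnode* rotateRight(struct avlnode *y) {
    struct avlnode *x = y->left;
    y->left = x->right;      /* x's right subtree becomes y's left subtree */
    x->right = y;            /* y moves below x */
    return x;                /* x is the new root of this subtree */
}

/* left rotation: used for the RR case (and as the second half of LR) */
struct avlnode* rotateLeft(struct avlnode *x) {
    struct avlnode *y = x->right;
    x->right = y->left;      /* y's left subtree becomes x's right subtree */
    y->left = x;             /* x moves below y */
    return y;                /* y is the new root of this subtree */
}

/* LR rotation on the unbalanced node A */
struct avlnode* rotateLeftRight(struct avlnode *A) {
    A->left = rotateLeft(A->left);
    return rotateRight(A);
}

/* RL rotation on the unbalanced node A */
struct avlnode* rotateRightLeft(struct avlnode *A) {
    A->right = rotateRight(A->right);
    return rotateLeft(A);
}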

1. RR Rotation

When the BST becomes unbalanced because a node is inserted into the right subtree of the right subtree of A, we perform an RR rotation. RR rotation is an anticlockwise rotation, applied on the edge below a node having balance factor -2.

In the example (figure omitted), node A has balance factor -2 because a node C is inserted in the right subtree of A's right subtree. We perform the RR rotation on the edge below A.

2. LL Rotation

When the BST becomes unbalanced because a node is inserted into the left subtree of the left subtree of C, we perform an LL rotation. LL rotation is a clockwise rotation, applied on the edge below a node having balance factor 2.

In the example (figure omitted), node C has balance factor 2 because a node A is inserted in the left subtree of C's left subtree. We perform the LL rotation on the edge below A.

3. LR Rotation

Double rotations are a bit tougher than the single rotations explained above. LR rotation = RR rotation + LL rotation, i.e., first an RR rotation is performed on the subtree and then an LL rotation is performed on the full tree; by full tree we mean the first node on the path from the inserted node whose balance factor is other than -1, 0, or 1.

Let us understand each and every step very clearly:

State / Action

* A node B has been inserted into the right subtree of A, the left subtree of C, because of which C has become an unbalanced node having balance factor 2. This is the LR rotation case: the inserted node is in the right subtree of the left subtree of C.

* As LR rotation = RR + LL rotation, an RR (anticlockwise) rotation on the subtree rooted at A is performed first. By doing the RR rotation, node A becomes the left subtree of B.

* After performing the RR rotation, node C is still unbalanced, i.e., it has balance factor 2, because the inserted subtree is in the left of the left of C.

* Now we perform the LL (clockwise) rotation on the full tree, i.e. on node C. Node C now becomes the right subtree of node B, and A is the left subtree of B.

* The balance factor of each node is now either -1, 0, or 1, i.e. the BST is balanced now.

4. RL Rotation

As already discussed, double rotations are a bit tougher than single rotations. RL rotation = LL rotation + RR rotation, i.e., first an LL rotation is performed on the subtree and then an RR rotation is performed on the full tree; by full tree we mean the first node on the path from the inserted node whose balance factor is other than -1, 0, or 1.

State / Action

* A node B has been inserted into the left subtree of C, the right subtree of A, because of which A has become an unbalanced node having balance factor -2. This is the RL rotation case: the inserted node is in the left subtree of the right subtree of A.

* As RL rotation = LL rotation + RR rotation, an LL (clockwise) rotation on the subtree rooted at C is performed first. By doing the LL rotation, node C becomes the right subtree of B.

* After performing the LL rotation, node A is still unbalanced, i.e. it has balance factor -2, because of the right subtree of the right subtree of node A.

* Now we perform the RR rotation (anticlockwise rotation) on the full tree, i.e. on node A. Node C now becomes the right subtree of node B, and node A becomes the left subtree of B.

* The balance factor of each node is now either -1, 0, or 1, i.e., the BST is balanced now.

Algorithm to insert a newNode

A newNode is always inserted as a leaf node with balance factor equal to 0.

1. Let the initial tree be the existing AVL tree, and let newNode be the node to be inserted.

2. Go to the appropriate leaf node to insert newNode using the following recursive steps. Compare newKey with the rootKey of the current tree.
   a. If newKey < rootKey, call the insertion algorithm on the left subtree of the current node until a leaf node is reached.
   b. Else if newKey > rootKey, call the insertion algorithm on the right subtree of the current node until a leaf node is reached.
   c. Else, return leafNode.

3. Compare the leafKey obtained from the above steps with newKey:
   a. If newKey < leafKey, make newNode the leftChild of leafNode.
   b. Else, make newNode the rightChild of leafNode.

4. Update the balanceFactor of the nodes.

5. If any node is unbalanced, rebalance it.
   a. If balanceFactor > 1, the height of the left subtree is greater than that of the right subtree, so do a right rotation or a left-right rotation:
      - If newNodeKey < leftChildKey, do a right rotation.
      - Else, do a left-right rotation.
   b. If balanceFactor < -1, the height of the right subtree is greater than that of the left subtree, so do a left rotation or a right-left rotation:
      - If newNodeKey > rightChildKey, do a left rotation.
      - Else, do a right-left rotation.

   [Figures omitted: balancing the tree with rotation]

6. The final tree is the rebalanced AVL tree.
Algorithm to delete a node

A node is always deleted as a leaf node. After deleting a node, the balance factors of the nodes get changed. In order to restore the balance, suitable rotations are performed.

1. Locate nodeToBeDeleted (recursion is used to find nodeToBeDeleted in the code).
   a. If nodeToBeDeleted is a leaf node (i.e. it does not have any child), then remove nodeToBeDeleted.
   b. If nodeToBeDeleted has one child, then substitute the contents of nodeToBeDeleted with those of the child and remove the child.
   c. If nodeToBeDeleted has two children, find the inorder successor w of nodeToBeDeleted (i.e. the node with the minimum key value in its right subtree), substitute the contents of nodeToBeDeleted with those of w, and remove the leaf node w.

2. Update the balanceFactor of the nodes.

3. Rebalance the tree if the balance factor of any of the nodes is not equal to -1, 0 or 1.
   a. If the balanceFactor of currentNode > 1:
      - If the balanceFactor of leftChild >= 0, do a right rotation.
      - Else, do a left-right rotation.
   b. If the balanceFactor of currentNode < -1:
      - If the balanceFactor of rightChild <= 0, do a left rotation.
      - Else, do a right-left rotation.

4. The final tree is the rebalanced AVL tree.

Types of Binary Tree

1. Full Binary Tree

A full Binary tree is a special type of binary tree in which every parent
node/internal node has either two or no children.

Full Binary Tree


2. Perfect Binary Tree

A perfect binary tree is a type of binary tree in which every internal node has exactly two
child nodes and all the leaf nodes are at the same level.

Perfect Binary Tree


3. Complete Binary Tree

A complete binary tree is just like a full binary tree, but with a few major differences:

1. Every level must be completely filled

2. All the leaf elements must lean towards the left.

3. The last leaf element might not have a right sibling i.e. a complete binary tree
doesn't have to be a full binary tree.
Complete Binary Tree


4. Degenerate or Pathological Tree

A degenerate or pathological tree is the tree having a single child either left or right.

Degenerate Binary Tree

5. Skewed Binary Tree

A skewed binary tree is a pathological/degenerate tree which is dominated entirely by either
left children or right children. Thus, there are two types of skewed binary trees: left-skewed
and right-skewed.

Skewed Binary Tree

6. Balanced Binary Tree

It is a type of binary tree in which the difference between the height of the left and the
right subtree for each node is either 0 or 1.

Balanced Binary Tree



Basis for comparison: Binary tree vs Binary search tree

Definition:
- Binary tree: a non-linear data structure in which a node can have at most two children, i.e., a node can have 0, 1 or a maximum of two children.
- Binary search tree: an ordered binary tree in which some order is followed to organize the nodes in the tree.

Structure:
- Binary tree: the first or topmost node is known as the root node. Each node contains a left pointer and a right pointer; the left pointer holds the address of the left subtree, whereas the right pointer holds the address of the right subtree.
- Binary search tree: a type of binary tree in which the value of all the nodes in the left subtree is less than or equal to the root node, and the value of all the nodes in the right subtree is greater than or equal to the value of the root node.

Operations:
- Binary tree: the operations that can be implemented on a binary tree are insertion, deletion and traversal.
- Binary search tree: binary search trees are sorted binary trees that provide fast insertion, deletion and search. Lookups mainly implement binary search, as all the keys are arranged in sorted order.

Types:
- Binary tree: Full Binary Tree, Complete Binary Tree, Perfect Binary Tree, and Extended Binary Tree.
- Binary search tree: AVL trees, Splay trees, Tango trees, etc.

1) The maximum number of nodes at level 'l' of a binary tree is 2^l. Here the level of the root is 0, and the level of any other node is one more than the level of its parent.

This can be proved by induction:
For the root, l = 0, and the number of nodes is 2^0 = 1.
Assume that the maximum number of nodes on level 'l' is 2^l.
Since in a binary tree every node has at most 2 children, the next level can have at most twice as many nodes, i.e. 2 * 2^l = 2^(l+1).

2) The maximum number of nodes in a binary tree of height 'h' is 2^h - 1 (here height is counted as the number of levels). For example, a binary tree of height 3 has at most 2^3 - 1 = 7 nodes.

Red-Black tree
The Red-Black tree is a binary search tree. Each node in the Red-Black tree
contains an extra bit that represents a color, used to ensure that the tree remains
balanced during operations performed on the tree, such as insertion and
deletion.

Why do we require a Red-Black tree

 The Red-Black tree is used because the AVL tree requires many
rotations when the tree is large, whereas the Red-Black tree requires a
maximum of two rotations to balance the tree.
 The main difference between the AVL tree and the Red-Black tree is
that the AVL tree is strictly balanced, while the Red-Black tree is not
completely height-balanced.
 So, the AVL tree is more balanced than the Red-Black tree, but the
Red-Black tree guarantees O(log2n) time for all operations like
insertion, deletion, and searching.
As the name suggests, each node is colored either Red or Black.
Sometimes no rotation is required, and only recoloring is needed to balance the
tree.

 Properties of Red-Black tree


o It is a self-balancing Binary Search tree. Here, self-balancing means that it
balances the tree itself by either doing the rotations or recoloring the nodes.
o This tree data structure is named as a Red-Black tree as each node is either
Red or Black in color. Every node stores one extra information known as a bit
that represents the color of the node. For example, 0 bit denotes the black
color while 1 bit denotes the red color of the node. Other information stored
by the node is similar to the binary tree, i.e., data part, left pointer and right
pointer.
o In the Red-Black tree, the root node is always black in color.
o In a binary tree, we consider those nodes as leaves which have no child. In
contrast, in the Red-Black tree, the nodes that have no child are considered
internal nodes, and these nodes are connected to NIL nodes that are
always black in color. The NIL nodes are the leaf nodes in the Red-Black tree.
o If the node is Red, then its children should be in Black color. In other words,
we can say that there should be no red-red parent-child relationship.
o Every path from a node to any of its descendant's NIL node should have
same number of black nodes.
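To make the extra color bit concrete, here is a minimal sketch (not from the notes) of a Red-Black node declaration in C. The parent pointer is included because the insertion and deletion fix-up steps described below need to inspect a node's parent and the parent's sibling.

enum rbcolor { RED, BLACK };

struct rbnode {
    int data;
    enum rbcolor color;        /* the extra bit of information: RED or BLACK */
    struct rbnode *left;
    struct rbnode *right;
    struct rbnode *parent;     /* needed to walk upwards during fix-up */
};

/* NIL leaves are commonly represented by a single shared sentinel node
   whose color is always BLACK */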

 Is every AVL tree can be a Red-Black tree?


Yes, every AVL tree can be a Red-Black tree if we color each node either
Red or Black. But not every Red-Black tree is an AVL tree, because the AVL
tree is strictly height-balanced while the Red-Black tree is not completely
height-balanced.

 Insertion in Red Black tree


The following are some rules used to create the Red-Black tree:

1. If the tree is empty, then we create a new node as a root node with the color
black.
2. If the tree is not empty, then we create a new node as a leaf node with a
color red.
3. If the parent of a new node is black, then exit.
4. If the parent of a new node is Red, then we have to check the color of the
parent's sibling of a new node.

4a) If the color is Black, then we perform rotations and recoloring.

4b) If the color is Red then we recolor the node. We will also check whether
the parents' parent of a new node is the root node or not; if it is not a root
node, we will recolor and recheck the node.

Let's understand the insertion in the Red-Black tree.

10, 18, 7, 15, 16, 30, 25, 40, 60

Step 1: Initially, the tree is empty, so we create a new node having value
10. This is the first node of the tree, so it would be the root node of the tree.
As we already discussed, that root node must be black in color, which is
shown below:

Step 2: The next node is 18. As 18 is greater than 10 so it will come at the
right of 10 as shown below.

We know the second rule of the Red Black tree that if the tree is not empty
then the newly created node will have the Red color. Therefore, node 18 has
a Red color, as shown in the below figure:

Now we verify the third rule of the Red-Black tree, i.e., the parent of the new
node is black or not. In the above figure, the parent of the node is black in
color; therefore, it is a Red-Black tree.

Step 3: Now, we create the new node having value 7 with Red color. As 7 is
less than 10, so it will come at the left of 10 as shown below.

Now we verify the third rule of the Red-Black tree, i.e., the parent of the new
node is black or not. As we can observe, the parent of the node 7 is black in
color, and it obeys the Red-Black tree's properties.

Step 4: The next element is 15, and 15 is greater than 10, but less than 18,
so the new node will be created at the left of node 18. The node 15 would be
Red in color as the tree is not empty.

The above tree violates the property of the Red-Black tree as it has Red-red
parent-child relationship. Now we have to apply some rule to make a Red-
Black tree. The rule 4 says that if the new node's parent is Red, then
we have to check the color of the parent's sibling of a new node. The
new node is node 15; the parent of the new node is node 18 and the sibling
of the parent node is node 7. As the color of the parent's sibling is Red in
color, so we apply the rule 4a. The rule 4a says that we have to recolor both
the parent and parent's sibling node. So, both the nodes, i.e., 7 and 18,
would be recolored as shown in the below figure.

We also have to check whether the parent's parent of the new node is the
root node or not. As we can observe in the above figure, the parent's parent
of a new node is the root node, so we do not need to recolor it.

Step 5: The next element is 16. As 16 is greater than 10 but less than 18
and greater than 15, so node 16 will come at the right of node 15. The tree
is not empty; node 16 would be Red in color, as shown in the below figure:

In the above figure, we can observe that it violates the property of the
parent-child relationship as it has a red-red parent-child relationship. We
have to apply some rules to make a Red-Black tree. Since the new node's
parent is Red color, and the parent of the new node has no sibling, so
rule 4a will be applied. The rule 4a says that some rotations and recoloring
would be performed on the tree.

Since node 16 is right of node 15 and the parent of node 15 is node 18. Node
15 is the left of node 18. Here we have an LR relationship, so we require to
perform two rotations. First, we will perform left, and then we will perform
the right rotation. The left rotation would be performed on nodes 15 and 16,
where node 16 will move upward, and node 15 will move downward. Once
the left rotation is performed, the tree looks like as shown in the below
figure:

In the above figure, we can observe that there is an LL relationship. The


above tree has a Red-red conflict, so we perform the right rotation. When we
perform the right rotation, the median element would be the root node.
Once the right rotation is performed, node 16 would become the root node,
and nodes 15 and 18 would be the left child and right child, respectively, as
shown in the below figure.

After rotation, node 16 and node 18 would be recolored; the color of node 16
is red, so it will change to black, and the color of node 18 is black, so it will
change to a red color as shown in the below figure:

Step 6: The next element is 30. Node 30 is inserted at the right of node 18.
As the tree is not empty, so the color of node 30 would be red.
The color of the parent and parent's sibling of a new node is Red, so rule 4b
is applied. In rule 4b, we have to do only recoloring, i.e., no rotations are
required. The color of both the parent (node 18) and parent's sibling (node
15) would become black, as shown in the below image.

We also have to check the parent's parent of the new node, whether it is a
root node or not. The parent's parent of the new node, i.e., node 30 is node
16 and node 16 is not a root node, so we will recolor the node 16 and
changes to the Red color. The parent of node 16 is node 10, and it is not in
Red color, so there is no Red-red conflict.

Step 7: The next element is 25, which we have to insert in a tree. Since 25
is greater than 10, 16, 18 but less than 30; so, it will come at the left of node
30. As the tree is not empty, node 25 would be in Red color. Here Red-red
conflict occurs as the parent of the newly created is Red color.

Since there is no parent's sibling, so rule 4a is applied in which rotation, as


well as recoloring, are performed. First, we will perform rotations. As the
newly created node is at the left of its parent and the parent node is at the
right of its parent, so the RL relationship is formed. Firstly, the right rotation
is performed in which node 25 goes upwards, whereas node 30 goes
downwards, as shown in the below figure.
After the first rotation, there is an RR relationship, so a left rotation is
performed. After the left rotation, the median element, i.e., 25, would be the
root node; node 30 would be at the right of 25 and node 18 would be at the
left of node 25.

Now recoloring would be performed on nodes 25 and 18; node 25 becomes


black in color, and node 18 becomes red in color.

Step 8: The next element is 40. Since 40 is greater than 10, 16, 18, 25, and
30, so node 40 will come at the right of node 30. As the tree is not empty,
node 40 would be Red in color. There is a Red-red conflict between nodes 40
and 30, so rule 4b will be applied.

As the color of parent and parent's sibling node of a new node is Red so
recoloring would be performed. The color of both the nodes would become
black, as shown in the below image.

After recoloring, we also have to check the parent's parent of a new node,
i.e., 25, which is not a root node, so recoloring would be performed, and the
color of node 25 changes to Red.
After recoloring, red-red conflict occurs between nodes 25 and 16. Now node
25 would be considered as the new node. Since the parent of node 25 is red
in color, and the parent's sibling is black in color, rule 4a would be applied.
Since 25 is at the right of the node 16 and 16 is at the right of its parent, so
there is an RR relationship. In the RR relationship, left rotation is performed.
After left rotation, the median element 16 would be the root node, as shown
in the below figure.

After rotation, recoloring is performed on nodes 16 and 10. The color of node
10 and node 16 changes to Red and Black, respectively as shown in the
below figure.

Step 9: The next element is 60. Since 60 is greater than 16, 25, 30, and 40,
so node 60 will come at the right of node 40. As the tree is not empty, the
color of node 60 would be Red.

As we can observe in the above tree that there is a Red-red conflict occurs.
The parent node is Red in color, and there is no parent's sibling exists in the
tree, so rule 4a would be applied. The first rotation would be performed. The
RR relationship exists between the nodes, so left rotation would be
performed.

When left rotation is performed, node 40 will come upwards, and node 30
will come downwards, as shown in the below figure:

After rotation, the recoloring is performed on nodes 30 and 40. The color of
node 30 would become Red, while the color of node 40 would become black.

The above tree is a Red-Black tree as it follows all the Red-Black tree
properties.

 Deletion in Red Back tree


Let's understand how we can delete the particular node from the Red-Black
tree. The following are the rules used to delete the particular node from the
tree:

Step 1: First, we perform BST rules for the deletion.

Step 2:

Case 1: If the node to be deleted is Red, we simply delete it.

Let's understand case 1 through an example.

Suppose we want to delete node 30 from the tree, which is given below.

Initially, we are having the address of the root node. First, we will apply BST
to search the node. Since 30 is greater than 10 and 20, which means that 30
is the right child of node 20. Node 30 is a leaf node and Red in color, so it is
simply deleted from the tree.

If we want to delete the internal node that has one child. First, replace the
value of the internal node with the value of the child node and then simply
delete the child node.

Let's take another example in which we want to delete the internal


node, i.e., node 20.

We cannot delete the internal node; we can only replace the value of that
node with another value. Node 20 is at the right of the root node, and it is
having only one child, node 30. So, node 20 is replaced with a value 30, but
the color of the node would remain the same, i.e., Black. In the end, node 20
(leaf node) is deleted from the tree.

If we want to delete the internal node that has two child nodes. In this case,
we have to decide from which we have to replace the value of the internal
node (either left subtree or right subtree). We have two ways:
o Inorder predecessor: We will replace with the largest value that
exists in the left subtree.
o Inorder successor: We will replace with the smallest value that
exists in the right subtree.

Suppose we want to delete node 30 from the tree, which is shown below:

Node 30 is at the right of the root node. In this case, we will use the inorder
successor. The value 38 is the smallest value in the right subtree, so we will
replace the value 30 with 38, but the node would remain the same, i.e., Red.
After replacement, the leaf node, i.e., 30, would be deleted from the tree.
Since node 30 is a leaf node and Red in color, we need to delete it (we do
not have to perform any rotations or any recoloring).

Case 2: If the root node is also double black, then simply remove the double
black and make it a single black.

Case 3: If the double black's sibling is black and both its children are black.

o Remove the double black node.


o Add the color of the node to the parent (P) node.

1. If the color of P is red then it becomes black.


2. If the color of P is black, then it becomes double black.

o The color of double black's sibling changes to red.


o If still double black situation arises, then we will apply other cases.
Let's understand this case through an example.

Suppose we want to delete node 15 in the below tree.

We cannot simply delete node 15 from the tree as node 15 is Black in color.
Node 15 has two children, which are nil. So, we replace the 15 value with a
nil value. As node 15 and nil node are black in color, the node becomes
double black after replacement, as shown in the below figure.

In the above tree, we can observe that the double black's sibling is
black in color and its children are nil, which are also black. As the
double black's sibling and its children have black so it cannot give its black
color to neither of these. Now, the double black's parent node is Red so
double black's node add its black color to its parent node. The color of the
node 20 changes to black while the color of the nil node changes to a single
black as shown in the below figure.

After adding the color to its parent node, the color of the double black's
sibling, i.e., node 30 changes to red as shown in the below figure.

In the above tree, we can observe that there is no longer double black's
problem exists, and it is also a Red-Black tree.

Case 4: If double black's sibling is Red.

o Swap the color of its parent and its sibling.


o Rotate the parent node in the double black's direction.
o Reapply cases.

Let's understand this case through an example.

Suppose we want to delete node 15.

Initially, the 15 is replaced with a nil value. After replacement, the node
becomes double black. Since double black's sibling is Red so color of the
node 20 changes to Red and the color of the node 30 changes to Black.

Once the swapping of the color is completed, the rotation towards the
double black would be performed. The node 30 will move upwards and the
node 20 will move downwards as shown in the below figure.

In the above tree, we can observe that double black situation still exists in
the tree. It satisfies the case 3 in which double black's sibling is black as well
as both its children are black. First, we remove the double black from the
node and add the black color to its parent node. At the end, the color of the
double black's sibling, i.e., node 25 changes to Red as shown in the below
figure.
In the above tree, we can observe that the double black situation has been
resolved. It also satisfies the properties of the Red Black tree.

Case 5: If double black's sibling is black, sibling's child who is far from the
double black is black, but near child to double black is red.

o Swap the color of double black's sibling and the sibling child which is
nearer to the double black node.
o Rotate the sibling in the opposite direction of the double black.
o Apply case 6

Suppose we want to delete the node 1 in the below tree.

First, we replace the value 1 with the nil value. The node becomes double
black as both the nodes, i.e., 1 and nil are black. It satisfies the case 3 that
implies if DB's sibling is black and both its children are black. First, we
remove the double black of the nil node. Since the parent of DB is Black, so
when the black color is added to the parent node then it becomes double
black. After adding the color, the double black's sibling color changes to Red
as shown below.

We can observe in the above screenshot that the double black problem still
exists in the tree. So, we will reapply the cases. We will apply case 5
because the sibling of node 5 is node 30, which is black in color, the child of
node 30, which is far from node 5 is black, and the child of the node 30
which is near to node 5 is Red. In this case, first we will swap the color of
node 30 and node 25 so the color of node 30 changes to Red and the color
of node 25 changes to Black as shown below.

Once the swapping of the color between the nodes is completed, we need to
rotate the sibling in the opposite direction of the double black node. In this
rotation, the node 30 moves downwards while the node 25 moves upwards
as shown below.

As we can observe in the above tree that double black situation still exists.
So, we need to case 6. Let's first see what is case 6.

Case 6: If double black's sibling is black, far child is Red

o Swap the color of Parent and its sibling node.


o Rotate the parent towards the Double black's direction
o Remove Double black
o Change the Red color to black.

Now we will apply case 6 in the above example to solve the double black's
situation.

In the above example, the double black is node 5, and the sibling of node 5
is node 25, which is black in color. The far child of the double black node is
node 30, which is Red in color as shown in the below figure:

First, we will swap the colors of the parent and its sibling. The parent of node 5 is
node 10, and the sibling node is node 25. The colors of both the nodes are
black, so no swapping occurs.
In the second step, we need to rotate the parent in the double black's
direction. After rotation, node 25 will move upwards, whereas node 10 will
move downwards. Once the rotation is performed, the tree would like, as
shown in the below figure:

In the next step, we will remove double black from node 5 and node 5 will
give its black color to the far child, i.e., node 30. Therefore, the color of node
30 changes to black as shown in the below figure.

Splay Tree
Splay trees are the self-balancing or self-adjusted binary search trees.

 What is a Splay Tree?


A splay tree is a self-balancing tree, but AVL and Red-Black trees are also
self-balancing trees. What makes the splay tree different from these two trees is
one extra operation that makes it unique: splaying.

A splay tree contains the same operations as a Binary search tree, i.e.,
Insertion, deletion and searching, but it also contains one more operation,
i.e., splaying. So. all the operations in the splay tree are followed by
splaying.

Splay trees are not strictly balanced trees, but they are roughly balanced
trees. Let's understand the search operation in the splay-tree.

Suppose we want to search 7 element in the tree, which is shown below:



To search any element in the splay tree, first, we will perform the standard
binary search tree operation. As 7 is less than 10 so we will come to the left
of the root node. After performing the search operation, we need to perform
splaying. Here splaying means that the operation that we are performing on
any element should become the root node after performing some
rearrangements. The rearrangement of the tree will be done through the
rotations.

Note: The splay tree can be defined as the self-adjusted tree in which any operation
performed on the element would rearrange the tree so that the element on which operation
has been performed becomes the root node of the tree.

 Rotations
There are six types of rotations used for splaying:

1. Zig rotation (Right rotation)


2. Zag rotation (Left rotation)
3. Zig zag (Zig followed by zag)
4. Zag zig (Zag followed by zig)
5. Zig zig (two right rotations)
6. Zag zag (two left rotations)

Factors required for selecting a type of rotation

The following are the factors used for selecting a type of rotation:

o Does the node which we are trying to rotate have a grandparent?


o Is the node left or right child of the parent?
o Is the node left or right child of the grandparent?
 Cases for the Rotations
Case 1: If the node does not have a grand-parent, and if it is the right child
of the parent, then we carry out the left rotation; otherwise, the right
rotation is performed.

Case 2: If the node has a grandparent, then based on the following


scenarios; the rotation would be performed:

Scenario 1: If the node is the left child of its parent and the parent is also the left
child of its parent, then a zig zig (two right rotations) is performed.

Scenario 2: If the node is the right child of its parent, but the parent is the left
child of its parent, then a zag zig (left rotation followed by right rotation) is performed.

Scenario 3: If the node is the right child of its parent and the parent is also the right
child of its parent, then a zig zig (two left rotations) is performed.

Scenario 4: If the node is the left child of its parent, but the parent is the right
child of its parent, then a zig zag (right rotation followed by left rotation) is performed.
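As a hedged sketch (not from the notes) of the two single-rotation steps that splaying composes, here are a zig (right rotation) and a zag (left rotation) in C, assuming a plain BST node with left and right pointers. The zig-zig and zig-zag cases described above are obtained by applying these two steps in the stated order.

struct snode {
    int key;
    struct snode *left, *right;
};

/* zig: right rotation, moves the left child x up over its parent p */
struct snode* zig(struct snode *p) {
    struct snode *x = p->left;
    p->left = x->right;       /* x's right subtree becomes p's left subtree */
    x->right = p;             /* p moves below x */
    return x;                 /* x is now the root of this subtree */
}

/* zag: left rotation, moves the right child x up over its parent p */
struct snode* zag(struct snode *p) {
    struct snode *x = p->right;
    p->right = x->left;       /* x's left subtree becomes p's right subtree */
    x->left = p;              /* p moves below x */
    return x;                 /* x is now the root of this subtree */
}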

Now, let's understand the above rotations with examples.

To rearrange the tree, we need to perform some rotations. The following are
the types of rotations in the splay tree:

o Zig rotations

The zig rotations are used when the item to be searched is either a root
node or the child of a root node (i.e., left or the right child).

The following are the cases that can exist in the splay tree while
searching:

Case 1: If the search item is a root node of the tree.

Case 2: If the search item is a child of the root node, then the two scenarios
will be there:

1. If the child is a left child, the right rotation would be performed, known
as a zig right rotation.
2. If the child is a right child, the left rotation would be performed, known
as a zig left rotation.
Let's look at the above two scenarios through an example.

Consider the below example:

In the above example, we have to search 7 element in the tree. We will


follow the below steps:

Step 1: First, we compare 7 with a root node. As 7 is less than 10, so it is a


left child of the root node.

Step 2: Once the element is found, we will perform splaying. The right
rotation is performed so that 7 becomes the root node of the tree, as shown
below:

Let's consider another example.

In the above example, we have to search 20 element in the tree. We will


follow the below steps:

Step 1: First, we compare 20 with a root node. As 20 is greater than the


root node, so it is a right child of the root node.

Step 2: Once the element is found, we will perform splaying. The left
rotation is performed so that 20 element becomes the root node of the tree.

o Zig zig rotations

Sometimes the situation arises when the item to be searched has a
parent as well as a grandparent. In this case, we have to perform two
rotations of the same kind for splaying.

Let's understand this case through an example.

Suppose we have to search 1 element in the tree, which is shown below:

Step 1: First, we have to perform a standard BST searching operation in


order to search the 1 element. As 1 is less than 10 and 7, so it will be at the
left of the node 7. Therefore, element 1 is having a parent, i.e., 7 as well as a
grandparent, i.e., 10.

Step 2: In this step, we have to perform splaying. We need to make node 1


as a root node with the help of some rotations. In this case, we cannot
simply perform a zig or zag rotation; we have to implement zig zig rotation.

In order to make node 1 as a root node, we need to perform two right


rotations known as zig zig rotations. When we perform the right rotation
then 10 will move downwards, and node 7 will come upwards as shown in
the below figure:

Again, we will perform zig right rotation, node 7 will move downwards, and
node 1 will come upwards as shown below:

As we observe in the above figure that node 1 has become the root node of
the tree; therefore, the searching is completed.

Suppose we want to search 20 in the below tree.

In order to search 20, we need to perform two left rotations. Following are
the steps required to search 20 node:

Step 1: First, we perform the standard BST searching operation. As 20 is


greater than 10 and 15, so it will be at the right of node 15.

Step 2: The second step is to perform splaying. In this case, two left
rotations would be performed. In the first rotation, node 10 will move
downwards, and node 15 would move upwards as shown below:

In the second left rotation, node 15 will move downwards, and node 20
becomes the root node of the tree, as shown below:

As we have observed that two left rotations are performed; so it is known as


a zig zig left rotation.

o Zig zag rotations

Till now, we have read that both parent and grandparent are either in RR or
LL relationship. Now, we will see the RL or LR relationship between the
parent and the grandparent.

Let's understand this case through an example.

Suppose we want to search 13 element in the tree which is shown below:

Step 1: First, we perform standard BST searching operation. As 13 is


greater than 10 but less than 15, so node 13 will be the left child of node 15.

Step 2: Since node 13 is at the left of 15 and node 15 is at the right of node
10, so RL relationship exists. First, we perform the right rotation on node 15,
and 15 will move downwards, and node 13 will come upwards, as shown
below:
Still, node 13 is not the root node, and 13 is at the right of the root node, so
we will perform left rotation known as a zag rotation. The node 10 will move
downwards, and 13 becomes the root node as shown below:

As we can observe in the above tree that node 13 has become the root
node; therefore, the searching is completed. In this case, we have first
performed the zig rotation and then zag rotation; so, it is known as a zig zag
rotation.

o Zag zig rotation

Let's understand this case through an example.

Suppose we want to search 9 element in the tree, which is shown below:

Step 1: First, we perform the standard BST searching operation. As 9 is less


than 10 but greater than 7, so it will be the right child of node 7.

Step 2: Since node 9 is at the right of node 7, and node 7 is at the left of
node 10, so LR relationship exists. First, we perform the left rotation on node
7. The node 7 will move downwards, and node 9 moves upwards as shown
below:
Still the node 9 is not a root node, and 9 is at the left of the root node, so we
will perform the right rotation known as zig rotation. After performing the
right rotation, node 9 becomes the root node, as shown below:

As we can observe in the above tree, node 9 has become the root node; therefore, the
searching is completed. In this case, we have first performed the zag
rotation (left rotation) and then the zig rotation (right rotation), so
it is known as a zag zig rotation.

 Advantages of Splay tree


o In the splay tree, we do not need to store the extra information. In contrast,
in AVL trees, we need to store the balance factor of each node that requires
extra space, and Red-Black trees also require to store one extra bit of
information that denotes the color of the node, either Red or Black.
o It is the fastest type of Binary Search tree for various practical applications. It
is used in Windows NT and GCC compilers.
o It provides better performance as the frequently accessed nodes will move
nearer to the root node, due to which the elements can be accessed quickly
in splay trees. It is used in the cache implementation as the recently
accessed data is stored in the cache so that we do not need to go to the
memory for accessing the data, and it takes less time.

 Drawback of Splay tree


The major drawback of the splay tree would be that trees are not strictly
balanced, i.e., they are roughly balanced. Sometimes the splay trees are
linear, so it will take O(n) time complexity.

Insertion operation in Splay tree

In the insertion operation, we first insert the element in the tree and then
perform the splaying operation on the inserted element.
Suppose we insert the following elements into an empty splay tree: 15, 10, 17, 7

Step 1: First, we insert node 15 in the tree. After insertion, we need to


perform splaying. As 15 is a root node, so we do not need to perform
splaying.

Step 2: The next element is 10. As 10 is less than 15, so node 10 will be the
left child of node 15, as shown below:

Now, we perform splaying. To make 10 as a root node, we will perform the


right rotation, as shown below:

Step 3: The next element is 17. As 17 is greater than 10 and 15 so it will


become the right child of node 15.

Now, we will perform splaying. As 17 is having a parent as well as a


grandparent so we will perform zig zig rotations.
In the above figure, we can observe that 17 becomes the root node of the
tree; therefore, the insertion is completed.

Step 4: The next element is 7. As 7 is less than 17, 15, and 10, so node 7
will be left child of 10.

Now, we have to splay the tree. As 7 is having a parent as well as a


grandparent so we will perform two right rotations as shown below:

Still the node 7 is not a root node, it is a left child of the root node, i.e., 17.
So, we need to perform one more right rotation to make node 7 as a root
node as shown below:
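
The insertion sequence above can be reproduced with a short C sketch. This is not taken from the notes; it is a minimal, illustrative splay-tree implementation (the node layout, rotation helpers and the splay-then-split insertion variant are assumptions), but inserting 15, 10, 17, 7 with it ends with node 7 at the root, matching the worked example.

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int key;
    struct Node *left, *right;
};

static struct Node *newNode(int key) {
    struct Node *n = malloc(sizeof *n);
    n->key = key;
    n->left = n->right = NULL;
    return n;
}

/* Zig: right rotation around x. */
static struct Node *rightRotate(struct Node *x) {
    struct Node *y = x->left;
    x->left = y->right;
    y->right = x;
    return y;
}

/* Zag: left rotation around x. */
static struct Node *leftRotate(struct Node *x) {
    struct Node *y = x->right;
    x->right = y->left;
    y->left = x;
    return y;
}

/* Bring the node holding key (or the last node touched) to the root. */
static struct Node *splay(struct Node *root, int key) {
    if (root == NULL || root->key == key)
        return root;

    if (key < root->key) {                      /* key lies in the left subtree */
        if (root->left == NULL) return root;
        if (key < root->left->key) {            /* zig-zig: two right rotations */
            root->left->left = splay(root->left->left, key);
            root = rightRotate(root);
        } else if (key > root->left->key) {     /* zig-zag: left then right */
            root->left->right = splay(root->left->right, key);
            if (root->left->right)
                root->left = leftRotate(root->left);
        }
        return (root->left == NULL) ? root : rightRotate(root);
    } else {                                    /* mirror cases on the right side */
        if (root->right == NULL) return root;
        if (key > root->right->key) {           /* zag-zag: two left rotations */
            root->right->right = splay(root->right->right, key);
            root = leftRotate(root);
        } else if (key < root->right->key) {    /* zag-zig: right then left */
            root->right->left = splay(root->right->left, key);
            if (root->right->left)
                root->right = rightRotate(root->right);
        }
        return (root->right == NULL) ? root : leftRotate(root);
    }
}

/* One common insertion variant: splay on the key, then split around the new node. */
static struct Node *insert(struct Node *root, int key) {
    if (root == NULL) return newNode(key);
    root = splay(root, key);
    if (root->key == key) return root;          /* key already present */
    struct Node *n = newNode(key);
    if (key < root->key) { n->right = root; n->left = root->left; root->left = NULL; }
    else                 { n->left = root; n->right = root->right; root->right = NULL; }
    return n;
}

static void preorder(struct Node *r) {
    if (r) { printf("%d ", r->key); preorder(r->left); preorder(r->right); }
}

int main(void) {
    struct Node *root = NULL;
    int keys[] = { 15, 10, 17, 7 };             /* same sequence as the example above */
    for (int i = 0; i < 4; i++) root = insert(root, keys[i]);
    preorder(root);                             /* prints 7 10 15 17: 7 is the root */
    printf("\n");
    return 0;
}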

 Deletion in Splay tree


As we know that splay trees are the variants of the Binary search tree, so
deletion operation in the splay tree would be similar to the BST, but the only
difference is that the delete operation is followed in splay trees by the
splaying operation.

Types of Deletions:

There are two types of deletions in the splay trees:

1. Bottom-up splaying
2. Top-down splaying

Bottom-up splaying

In bottom-up splaying, first we delete the element from the tree and then we
perform the splaying on the deleted node.

Let's understand the deletion in the Splay tree.

Suppose we want to delete 12, 14 from the tree shown below:

o First, we simply perform the standard BST deletion operation to delete 12


element. As 12 is a leaf node, so we simply delete the node from the tree.

The deletion is still not completed. We need to splay the parent of the
deleted node, i.e., 10. We have to perform Splay(10) on the tree. As we can
observe in the above tree that 10 is at the right of node 7, and node 7 is at
the left of node 13. So, first, we perform the left rotation on node 7 and then
we perform the right rotation on node 13, as shown below:

Still, node 10 is not a root node; node 10 is the left child of the root node. So,
we need to perform the right rotation on the root node, i.e., 14 to make node
10 a root node as shown below:

o Now, we have to delete the 14 element from the tree, which is shown below:

As we know that we cannot simply delete the internal node. We will replace
the value of the node either using inorder predecessor or inorder
successor. Suppose we use inorder successor in which we replace the value
with the lowest value that exist in the right subtree. The lowest value in the
right subtree of node 14 is 15, so we replace the value 14 with 15. Since
node 14 becomes the leaf node, so we can simply delete it as shown below:

Still, the deletion is not completed. We need to perform one more operation,
i.e., splaying in which we need to make the parent of the deleted node as
the root node. Before deletion, the parent of node 14 was the root node, i.e.,
10, so we do not need to perform any splaying in this case.

Top-down splaying
In top-down splaying, we first perform the splaying on which the deletion is
to be performed and then delete the node from the tree. Once the element
is deleted, we will perform the join operation.

Let's understand the top-down splaying through an example.

Suppose we want to delete 16 from the tree which is shown below:

Step 1: In top-down splaying, first we perform splaying on the node 16. The
node 16 has both parent as well as grandparent. The node 16 is at the right
of its parent and the parent node is also at the right of its parent, so this is a
zag zag situation. In this case, first, we will perform the left rotation on node
13 and then 14 as shown below:

The node 16 is still not a root node, and it is a right child of the root node, so
we need to perform left rotation on the node 12 to make node 16 as a root
node.

Once the node 16 becomes a root node, we will delete the node 16 and we
will get two different trees, i.e., left subtree and right subtree as shown
below:

As we know that the values of the left subtree are always lesser than the
values of the right subtree. The root of the left subtree is 12 and the root of
the right subtree is 17. The first step is to find the maximum element in the
left subtree. In the left subtree, the maximum element is 15, and then we
need to perform splaying operation on 15.

As we can observe in the above tree that the element 15 is having a parent
as well as a grandparent. A node is right of its parent, and the parent node is
also right of its parent, so we need to perform two left rotations to make
node 15 a root node as shown below:

After performing two rotations on the tree, node 15 becomes the root node.
As we can see, the right child of the 15 is NULL, so we attach node 17 at the
right part of the 15 as shown below, and this operation is known as
a join operation.
PART - A

Compare General Tree and binary tree?

1. Define the following terminologies in a tree


(1) Siblings, parent
(2) Depth, Path
(3) Height, Degree
2. What is complete binary tree?
3. Define Binary Search Tree.
4. Give the array and linked list representation of tree with an example.
5. Show that the maximum number of nodes in a binary tree of height H is 2^(H+1) - 1.
6. Define Tree Traversal.
7. Give the preorder form for the following Tree.

(tree given level by level: 10 | 5, 15 | 2, 12 | 14)

8. Write a routine to find the minimum element in a given tree.


9. Write the recursive procedure for inorder traversals.
10. Draw a binary search tree for the following input lists. 60, 25, 75, 15, 33, 44
11. How is a binary tree represented using an array?
12. Define AVL tree.
13. What are the two properties of a binary heap.
14. What do you mean by self adjusting tree?
15. List the operations performed on priority queue.
16. Differentiate between binary tree and binary search tree.
17. Differentiate between general tree and binary tree.
18. What is threaded binary tree?
19. Show that the maximum number of nodes in a binary tree of height H is 2^(H+1) - 1.

PART - B

1. (a) Write an algorithm to find an element from binary searchtree.


(b) Write a program to insert and delete an element from binary search tree.

2. Write about the AVL trees with an example?.


3. What are the different tree traversal techniques? Explain with examples.
4. Write a function to perform insertion and deletemin in a binary heap.
5. Show the result of accessing the keys 3,9,1,5 in order in the splay tree in the following figure.

(splay tree figure)

6. Write the function to perform AVL single rotation and double rotation.
7. Explain about the RED-BLACK trees with its properties?
8. Construct a RED-BLACK tree by inserting 10,5,2,6,3,4,17,16 delete5,2.
9. Construct splay tree for the
following values: 1, 2, 3, 4, 5, 6, 7, 8

UNIT-IV GRAPHS
Graphs Terminology

A graph consists of:

 A set, V, of vertices (nodes)


 A collection, E, of pairs of vertices from V called edges (arcs). Edges, also called arcs, are represented by (u, v) and are either:

Directed, if the pairs are ordered (u, v), with u the origin and v the destination

Undirected, if the pairs are unordered

A graph is a pictorial representation of a set of objects where some pairs of objects are connected by links. The
interconnected objects are represented by points termed as vertices, and the links that connect the vertices are
called edges.
Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges, connecting the pairs
of vertices. Take a look at the following graph −

In the above graph, V = {a, b, c, d, e}


E = {ab, ac, bd, cd, de}

Then a graph can be:


Directed graph (di-graph) if all the edges are directed
Undirected graph (graph) if all the edges are undirected

Mixed graph if some edges are directed and others are undirected

Weighted: In a weighted graph, each edge is assigned a weight or cost. Consider a graph of 4
nodes as in the diagram below. As you can see each edge has a weight/cost assigned to it. If you
want to go from vertex 1 to vertex 3, you can take one of the following 3 paths:

o 1 -> 2 -> 3
o 1 -> 3
o 1 -> 4 -> 3

Therefore the total cost of each path will be as follows:

o The total cost of 1 -> 2 -> 3 will be (1 + 2), i.e. 3 units
o The total cost of 1 -> 3 will be 1 unit
o The total cost of 1 -> 4 -> 3 will be (3 + 2), i.e. 5 units

Cyclic: A graph is cyclic if the graph comprises a path that starts from a vertex and ends at
the same vertex. That path is called a cycle. An acyclic graph is a graph that has no cycle.
A tree is an undirected graph in which any two vertices are connected by only one path. A
tree is an acyclic graph and has N - 1 edges, where N is the number of vertices. Each node in
a graph may have one or multiple parent nodes. However, in a tree, each node (except the
root node) has exactly one parent node.

Note: A root node has no parent.

A tree cannot contain any cycles or self loops, however, the same does not apply to graphs.

Illustrate terms on graphs

End-vertices of an edge are the endpoints of the edge.

Two vertices are adjacent if they are endpoints of the same edge.

An edge is incident on a vertex if the vertex is an endpoint of the edge.

Outgoing edges of a vertex are directed edges that the vertex is the origin.

Incoming edges of a vertex are directed edges that the vertex is the destination.

Degree of a vertex, v, denoted deg(v) is the number of incident edges.

Out-degree, outdeg(v), is the number of outgoing edges.

In-degree, indeg(v), is the number of incoming edges.

Parallel edges or multiple edges are edges of the same type that have the same end-vertices.
A self-loop is an edge whose end vertices are the same vertex.

Simple graphs have no parallel edges or self-loops.
 Properties
If graph, G, has m edges then Σv∈G deg(v) = 2m

If a di-graph, G, has m edges then

Σv∈G indeg(v) = m = Σv∈G outdeg(v)

If a simple graph, G, has m edges and n vertices:

If G is also directed then m ≤ n(n-1)

If G is also undirected then m ≤ n(n-1)/2

So a simple graph with n vertices has O(n²) edges at most


 More Terminology
Path is a sequence of alternating vertices and edges such that each successive vertex is connected by the edge. Frequently only the vertices are listed, especially if there are no parallel edges.
Cycle is a path that starts and ends at the same vertex.
Simple path is a path with distinct vertices.
Directed path is a path of only directed edges.
Directed cycle is a cycle of only directed edges.
Sub-graph is a subset of vertices and edges.

Spanning sub-graph contains all the vertices.

Connected graph has all pairs of vertices connected by at least one path.

Connected component is the maximal connected sub-graph of a unconnected graph. Forest is a graph
without cycles.

Tree is a connected forest (previous type of trees are called rooted trees, these are free trees)
Spanning tree is a spanning subgraph that is also a tree.

 More Properties
If G is an undirected graph with n vertices and m edges:

 If G is connected then m ≥ n - 1
 If G is a tree then m = n - 1
 If G is a forest then m ≤ n – 1

 Graph operations:

Suppose we want the following operations:


• AddVertex:

Adds a new vertex to the graph.

For example, suppose there is a new city we want to add to our map of train
routes. AddVertex(graph, G) would give:

• AddEdge:

Adds a new directed edge to the graph.

For example, adding the city was not enough, we also need to say how the rail lines
connect it to other cities. Thus, we might do an AddEdge(graph, C, G), giving:

(an edge from C to G).

• IsReachable:
Reports whether we can get there from here.
For example, we might want to know whether we can get to city E from city A:
IsReachable(graph, E, A) would report a true value.

Again, we might want to know whether we can get to city D from city E:
IsReachable(graph, D, E) would report a false value.
The basic operations provided by a graph data structure G usually include:

 adjacent(G,x, y): tests whether there is an edge from the vertex x to the vertex y;
 neighbors(G, x): lists all vertices y such that there is an edge from the vertex x to the vertex y;

 add_vertex(G, x): adds the vertex x, if it is not there;


 remove_vertex(G, x): removes the vertex x, if it is there;
 add_edge(G, x, y): adds the edge from the vertex x to the vertex y, if it is not there;
 remove_edge(G, x, y): removes the edge from the vertex x to the vertex y, if it is there;
 get_vertex_value(G, x): returns the value associated with the vertex x;
 set_vertex_value(G, x, v): sets the value associated with the vertex x to v.

Graph representation

 You can represent a graph in many ways. The two most common ways of representing a graph is as follows:

Adjacency matrix
 An adjacency matrix is a V x V binary matrix A. Element Ai,j is 1 if there is an edge from vertex i to vertex j, else Ai,j is 0.
 Note: A binary matrix is a matrix in which the cells can have only one of two possible values - either a 0 or 1.
 The adjacency matrix can also be modified for the weighted graph in which instead of storing 0 or 1 in Ai,j, the
weight or cost of the edge will be stored.
 In an undirected graph, if Ai,j = 1, then Aj,i = 1. In a directed graph, if Ai,j = 1, then Aj,i may or may not be 1.
 Adjacency matrix provides constant time access (O(1)) to determine if there is an edge between two
nodes. Space complexity of the adjacency matrix is O(V²).
 The adjacency matrix of the following graph is:

 i/j : 1 2 3 4
 1:0101
 2:1010
 3:0101
 4:1010


 The adjacency matrix of the following graph is:
 i/j: 1 2 3 4
 1:0100
 2:0001
 3:1001
 4:0100

 Adjacency list
 The other way to represent a graph is by using an adjacency list. An adjacency list is an array A of separate lists.
Each element of the array Ai is a list, which contains all the vertices that are adjacent to vertex i.
 For a weighted graph, the weight or cost of the edge is stored along with the vertex in the list using pairs. In an
undirected graph, if vertex j is in list Ai then vertex i will be in list Aj.
 The space complexity of adjacency list is O(V + E) because in an adjacency list information is stored only for those
edges that actually exist in the graph. In a lot of cases, where a matrix is sparse using an adjacency matrix may not
be very useful. This is because using an adjacency matrix will take up a lot of space where most of the elements
will be 0, anyway. In such cases, using an adjacency list is better.
 Note: A sparse matrix is a matrix in which most of the elements are zero, whereas a dense matrix is a matrix in
which most of the elements are non-zero.

Consider the same undirected graph from an adjacency matrix. The adjacency list of the graph is as follows:
 A1 → 2 → 4
 A2 → 1 → 3
 A3 → 2 → 4
 A4 → 1 → 3

Consider the same directed graph from an adjacency matrix. The adjacency list of the graph is as follows:
 A1 → 2
 A2 → 4
 A3 → 1 → 4
 A4 → 2
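
A corresponding adjacency-list sketch in C is shown below for the same undirected graph; the AdjNode structure and helper names are illustrative, not part of the original notes.

#include <stdio.h>
#include <stdlib.h>

#define V 4

struct AdjNode {
    int vertex;
    struct AdjNode *next;
};

/* Prepend vertex v to the list of vertex u. */
static void addEdgeDirected(struct AdjNode *adj[], int u, int v) {
    struct AdjNode *n = malloc(sizeof *n);
    n->vertex = v;
    n->next = adj[u];
    adj[u] = n;
}

static void addEdgeUndirected(struct AdjNode *adj[], int u, int v) {
    addEdgeDirected(adj, u, v);
    addEdgeDirected(adj, v, u);    /* undirected: store the edge in both lists */
}

int main(void) {
    struct AdjNode *adj[V] = { NULL };
    /* Same edges as the adjacency-matrix example: 1-2, 1-4, 2-3, 3-4 (stored 0-based). */
    addEdgeUndirected(adj, 0, 1);
    addEdgeUndirected(adj, 0, 3);
    addEdgeUndirected(adj, 1, 2);
    addEdgeUndirected(adj, 2, 3);

    /* Note: prepending reverses each list's order relative to the listing above. */
    for (int i = 0; i < V; i++) {
        printf("A%d", i + 1);
        for (struct AdjNode *p = adj[i]; p; p = p->next)
            printf(" -> %d", p->vertex + 1);
        printf("\n");
    }
    return 0;
}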

Graph Traversal:
Graph traversal is a technique used for searching a vertex in a graph. Graph traversal is also
used to decide the order in which the vertices are visited in the search process. A graph traversal finds the
edges to be used in the search process without creating loops. That means using graph traversal
we visit all the vertices of the graph without getting into a looping path.

There are two graph traversal techniques and they are as follows...

1. DFS (Depth First Search)

2. BFS (Breadth First Search)



Depth First Search:

Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack to remember where to get the next vertex to start a search when a dead end occurs in any iteration.

As in the example given above, DFS algorithm traverses from S to A to D to G to E to B first, then to F and lastly to C.
It employs the following rules.
 Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it in a stack.
 Rule 2 − If no adjacent vertex is found, pop up a vertex from the stack. (It will pop up all the vertices
from the stack, which do not have adjacent vertices.)
 Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.

Step 1: Initialize the stack.

Step 2: Mark S as visited and put it onto the stack. Explore any unvisited adjacent node from S. We have three nodes and we can pick any of them. For this example, we shall take the nodes in alphabetical order.

Step 3: Mark A as visited and put it onto the stack. Explore any unvisited adjacent node from A. Both S and D are adjacent to A but we are concerned with unvisited nodes only.

Step 4: Visit D, mark it as visited and put it onto the stack. Here, we have B and C nodes, which are adjacent to D and both are unvisited. However, we shall again choose in alphabetical order.

Step 5: We choose B, mark it as visited and put it onto the stack. Here B does not have any unvisited adjacent node. So, we pop B from the stack.

Step 6: We check the stack top to return to the previous node and check if it has any unvisited nodes. Here, we find D to be on the top of the stack.

Step 7: The only unvisited adjacent node from D is C now. So we visit C, mark it as visited and put it onto the stack.

As C does not have any unvisited adjacent node, we keep popping the stack until we find a node that has an unvisited adjacent node. In this case, there is none and we keep popping until the stack is empty.
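
A minimal recursive DFS sketch in C is given below. The 5-vertex adjacency matrix is an assumed sample graph, not the S/A/B/C/D figure of the walkthrough; visiting neighbours in increasing index order plays the role of the alphabetical-order rule.

#include <stdio.h>

#define V 5

static int adj[V][V] = {
    /* 0  1  2  3  4 */
    {  0, 1, 1, 0, 0 },   /* 0 */
    {  1, 0, 0, 1, 0 },   /* 1 */
    {  1, 0, 0, 1, 0 },   /* 2 */
    {  0, 1, 1, 0, 1 },   /* 3 */
    {  0, 0, 0, 1, 0 }    /* 4 */
};
static int visited[V];

static void dfs(int u) {
    visited[u] = 1;               /* mark and display the vertex */
    printf("%d ", u);
    /* Visiting neighbours in increasing index order mirrors the
       alphabetical-order rule used in the walkthrough above. */
    for (int v = 0; v < V; v++)
        if (adj[u][v] && !visited[v])
            dfs(v);
}

int main(void) {
    dfs(0);                       /* start from vertex 0 */
    printf("\n");
    return 0;
}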
Breadth First Search
Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a queue to remember where to get the next vertex to start a search when a dead end occurs in any iteration.

As in the example given above, BFS algorithm traverses from A to B to E to F first then to C and G lastly to D. It
employs the following rules.
 Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a queue.
 Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.
 Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.

Step 1: Initialize the queue.

Step 2: We start from visiting S (the starting node), and mark it as visited.

Step 3: We then see an unvisited adjacent node from S. In this example, we have three nodes but alphabetically we choose A, mark it as visited and enqueue it.

Step 4: Next, the unvisited adjacent node from S is B. We mark it as visited and enqueue it.

Step 5: Next, the unvisited adjacent node from S is C. We mark it as visited and enqueue it.

Step 6: Now, S is left with no unvisited adjacent nodes. So, we dequeue and find A.

Step 7: From A we have D as an unvisited adjacent node. We mark it as visited and enqueue it.

At this stage, we are left with no unmarked (unvisited) nodes. But as per the algorithm we keep on dequeuing in order to get all unvisited nodes. When the queue gets emptied, the program is over.
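
The following C sketch shows the same idea with an array-based queue. The sample adjacency matrix is assumed for illustration, and each vertex is printed when it is marked and enqueued, as in the rules above.

#include <stdio.h>

#define V 5

static int adj[V][V] = {
    { 0, 1, 1, 1, 0 },
    { 1, 0, 0, 0, 1 },
    { 1, 0, 0, 0, 0 },
    { 1, 0, 0, 0, 0 },
    { 0, 1, 0, 0, 0 }
};

int main(void) {
    int visited[V] = {0};
    int queue[V], front = 0, rear = 0;

    visited[0] = 1;                          /* start vertex: mark, display, enqueue */
    printf("%d ", 0);
    queue[rear++] = 0;

    while (front < rear) {
        int u = queue[front++];              /* dequeue the next vertex */
        for (int v = 0; v < V; v++)
            if (adj[u][v] && !visited[v]) {  /* unvisited neighbour: mark, display, enqueue */
                visited[v] = 1;
                printf("%d ", v);
                queue[rear++] = v;
            }
    }
    printf("\n");
    return 0;
}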
SORTING

Sorting refers to the operation or technique of arranging and rearranging sets of data in some
specific order. A collection of records called a list where every record has one or more fields. The
fields which contain a unique value for each record is termed as the key field. For example, a
phone number directory can be thought of as a list where each record has three fields - 'name' of
the person, 'address' of that person, and their 'phone numbers'. Being unique, the phone number can
work as a key to locate any record in the list.

The techniques of sorting can be divided into two categories. These are:

 Internal Sorting
 External Sorting

Internal Sorting: If all the data that is to be sorted can be accommodated at one time in the main memory, the internal
sorting method is being performed.

External Sorting: When the data that is to be sorted cannot be accommodated in the memory at the same time
and some has to be kept in auxiliary memory such as hard disk, floppy disk, magnetic tapes etc, then external
sorting methods are performed.

Bubble Sort

We take an unsorted array for our example. Bubble sort takes O(n²) time so we're keeping it short and precise.

Bubble sort starts with very first two elements, comparing them to check which one is greater.
In this case, value 33 is greater than 14, so it is already in sorted locations. Next, we compare 33 with 27.

We find that 27 is smaller than 33 and these two values must be swapped.

The new array should look like this −

Next we compare 33 and 35. We find that both are in already sorted positions.

Then we move to the next two values, 35 and 10.

We know then that 10 is smaller than 35. Hence they are not sorted.

We swap these values. We find that we have reached the end of the array. After one iteration, the array
should look like this −

To be precise, we are now showing how an array should look like after each iteration. After the second
iteration, it should look like this −

Notice that after each iteration, at least one value moves to the end.

And when there's no swap required, bubble sort learns that the array is completely sorted.

Now we should look into some practical aspects of bubble sort.

Algorithm
We assume list is an array of n elements. We further assume that the swap function swaps the values of the given array elements.

begin BubbleSort(list)

   for all elements of list
      if list[i] > list[i+1]
         swap(list[i], list[i+1])
      end if
   end for

   return list

end BubbleSort

Pseudocode
We observe in the algorithm that Bubble Sort compares each pair of array elements until the whole array
is completely sorted in ascending order. This may cause a few complexity issues, for example when the
array needs no more swapping because all the elements are already ascending.

To ease out the issue, we use one flag variable swapped which will help us see if any swap has
happened or not. If no swap has occurred, i.e. the array requires no more processing to be sorted, it
will come out of the loop.

procedure bubbleSort( list : array of items )

   loop = list.count

   for i = 0 to loop-1 do:
      swapped = false

      for j = 0 to loop-1 do:

         /* compare the adjacent elements */
         if list[j] > list[j+1] then
            /* swap them */
            swap( list[j], list[j+1] )
            swapped = true
         end if

      end for

      /* if no number was swapped that means the array is
         sorted now, break the loop */
      if (not swapped) then
         break
      end if

   end for

end procedure
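
A straightforward C version of the above pseudocode, using the walkthrough's sample values, might look like this:

#include <stdio.h>

void bubbleSort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int swapped = 0;                       /* flag: did this pass swap anything? */
        for (int j = 0; j < n - 1 - i; j++) {
            if (a[j] > a[j + 1]) {             /* adjacent pair out of order: swap */
                int t = a[j];
                a[j] = a[j + 1];
                a[j + 1] = t;
                swapped = 1;
            }
        }
        if (!swapped)                          /* no swap in a full pass: already sorted */
            break;
    }
}

int main(void) {
    int a[] = { 14, 33, 27, 35, 10 };          /* the walkthrough's sample values */
    int n = sizeof a / sizeof a[0];
    bubbleSort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}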
Insertion Sort
We take an unsorted array for our example.

Insertion sort compares the first two elements.

It finds that both 14 and 33 are already in ascending order. For now, 14 is in sorted sub-list.

Insertion sort moves ahead and compares 33 with 27.

And finds that 33 is not in the correct position.

It swaps 33 with 27. It also checks with all the elements of sorted sub-list. Here we see that the sorted
sub-list has only one element 14, and 27 is greater than 14. Hence, the sorted sub-list remains sorted
after swapping.

By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10.

These values are not in a sorted order.


So we swap them.

However, swapping makes 27 and 10 unsorted.


Hence, we swap them too.

Again we find 14 and 10 in an unsorted order.

We swap them again. By the end of third iteration, we have a sorted sub-list of 4 items.

This process goes on until all the unsorted values are covered in a sorted sub-list. Now we shall see
some programming aspects of insertion sort.

Algorithm

Now we have a bigger picture of how this sorting technique works, so we can derive simple steps by
which we can achieve insertion sort.
Step 1 − If it is the first element, it is already sorted. return 1;

Step 2 − Pick next element

Step 3 − Compare with all elements in the sorted sub-list


Step 4 − Shift all the elements in the sorted sub-list that is greater than the value to be
sorted
Step 5 − Insert the value

Pseudocode
procedure insertionSort( A : array of items )
   int holePosition
   int valueToInsert

   for i = 1 to length(A) inclusive do:

      valueToInsert = A[i]
      holePosition = i

      while holePosition > 0 and A[holePosition-1] > valueToInsert do:
         A[holePosition] = A[holePosition-1]
         holePosition = holePosition - 1
      end while

      A[holePosition] = valueToInsert

   end for
end procedure
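
A C version of this pseudocode, with sample data assumed for illustration, could be written as follows:

#include <stdio.h>

void insertionSort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int value = a[i];                      /* element to place into the sorted prefix */
        int hole = i;
        while (hole > 0 && a[hole - 1] > value) {
            a[hole] = a[hole - 1];             /* shift larger elements one step right */
            hole--;
        }
        a[hole] = value;                       /* drop the element into its hole */
    }
}

int main(void) {
    int a[] = { 14, 33, 27, 10, 35, 19, 42, 44 };
    int n = sizeof a / sizeof a[0];
    insertionSort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}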
Selection Sort
Consider the following depicted array as an example.

For the first position in the sorted list, the whole list is scanned sequentially. While 14 is currently stored
in the first position, we search the whole list and find that 10 is the lowest value.

So we replace 14 with 10. After one iteration 10, which happens to be the minimum value in the list,
appears in the first position of the sorted list.

For the second position, where 33 is residing, we start scanning the rest of the list in a linear manner.

We find that 14 is the second lowest value in the list and it should appear at the second place. We
swap these values.

After two iterations, two least values are positioned at the beginning in a sorted manner.

The same process is applied to the rest of the items in the array. Following is a pictorial
depiction of the entire sorting process −

Now, let us learn some programming aspects of selection sort.

Algorithm

Step 1 − Set MIN to location 0

Step 2 − Search the minimum element in the list


Step 3 − Swap with value at location MIN
Step 4 − Increment MIN to point to next element
Pseudocode

procedure selectionSort( list : array of items, n : size of list )

   for i = 1 to n - 1

      /* set current element as minimum */
      min = i

      /* check the element to be minimum */
      for j = i+1 to n
         if list[j] < list[min] then
            min = j
         end if
      end for

      /* swap the minimum element with the current element */
      if min != i then
         swap list[min] and list[i]
      end if

   end for

end procedure
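
An equivalent C implementation of the pseudocode is sketched below (the sample data is assumed):

#include <stdio.h>

void selectionSort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;                           /* assume current element is the minimum */
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;                       /* remember the smallest remaining element */
        if (min != i) {                        /* swap it into position i */
            int t = a[i];
            a[i] = a[min];
            a[min] = t;
        }
    }
}

int main(void) {
    int a[] = { 14, 33, 27, 10, 35, 19, 42, 44 };
    int n = sizeof a / sizeof a[0];
    selectionSort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}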
External Sorting-Model for external sorting

External sorting is a term for a class of sorting algorithms that can handle massive amounts of
data. External sorting is required when the data being sorted do not fit into the main memory of a
computing device (usually RAM) and instead they must reside in the slower external memory
(usually a hard drive). External sorting typically uses a hybrid sort-merge strategy. In the sorting
phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a
temporary file. In the merge phase, the sorted sub-files are combined into a single larger file.
One example of external sorting is the external merge sort algorithm, which sorts chunks that each
fit in RAM, then merges the sorted chunks together. We first divide the file into runs such that
the size of a run is small enough to fit into main memory. Then sort each run in main memory
using merge sort sorting algorithm. Finally merge the resulting runs together into successively
bigger runs, until the file is sorted.

The external merge sort is a k-way merge algorithm: it first sorts M items at a time and puts the
sorted lists back into external memory. For example, for sorting 900 megabytes of data using only
100 megabytes of RAM:

1. Read 100 MB of the data in main memory and sort by some conventional method,
like quicksort.
2. Write the sorted data to disk.
3. Repeat steps 1 and 2 until all of the data is in sorted 100 MB chunks (there are 900MB /
100MB = 9 chunks), which now need to be merged into one single output file.
4. Read the first 10 MB (= 100MB / (9 chunks + 1)) of each sorted chunk into input buffers in
main memory and allocate the remaining 10 MB for an output buffer. (In practice, it might
provide better performance to make the output buffer larger and the input buffers slightly
smaller.)
5. Perform a 9-way merge and store the result in the output buffer. Whenever the output
buffer fills, write it to the final sorted file and empty it. Whenever any of the 9 input
buffers empties, fill it with the next 10 MB of its associated 100 MB sorted chunk until no
more data from the chunk is available. This is the key step that makes external merge sort
work externally -- because the merge algorithm only makes one pass sequentially through
each of the chunks, each chunk does not have to be loaded completely; rather, sequential
parts of the chunk can be loaded as needed.
Historically, instead of a sort, sometimes a replacement-selection algorithm was used to perform
the initial distribution, to produce on average half as many output chunks of double the length.

Merge Sort

To understand merge sort, we take an unsorted array as the following −

We know that merge sort first divides the whole array iteratively into equal halves until the atomic
values are achieved. We see here that an array of 8 items is divided into two arrays of size 4.

This does not change the sequence of appearance of items in the original. Now we divide these two
arrays into halves.

We further divide these arrays and we achieve atomic value which can no more be divided.

Now, we combine them in exactly the same manner as they were broken down. Please note the color
codes given to these lists.
We first compare the element for each list and then combine them into another list in a sorted
manner. We see that 14 and 33 are in sorted positions. We compare 27 and 10 and in the target list of
2 values we put 10 first, followed by 27. We change the order of 19 and 35 whereas 42 and 44 are
placed sequentially.

In the next iteration of the combining phase, we compare lists of two data values, and merge them into
a list of four data values, placing all in a sorted order.

After the final merging, the list should look like this −

Now we should learn some programming aspects of merge sorting.


Algorithm
Merge sort keeps on dividing the list into equal halves until it can no more be divided. By definition, if it
is only one element in the list, it is sorted. Then, merge sort combines the smaller sorted lists keeping
the new list sorted too.

Step 1 − if it is only one element in the list it is already sorted, return.


Step 2 − divide the list recursively into two halves until it can no more be divided.
Step 3 − merge the smaller lists into new list in sorted order.

Merge sort works with recursion and we shall see our implementation in the same way.

procedure mergesort( var a as array )
   if ( n == 1 ) return a

   var l1 as array = a[0] ... a[n/2]
   var l2 as array = a[n/2+1] ... a[n]

   l1 = mergesort( l1 )
   l2 = mergesort( l2 )

   return merge( l1, l2 )
end procedure

procedure merge( var a as array, var b as array )
   var c as array

   while ( a and b have elements )
      if ( a[0] > b[0] )
         add b[0] to the end of c
         remove b[0] from b
      else
         add a[0] to the end of c
         remove a[0] from a
      end if
   end while

   while ( a has elements )
      add a[0] to the end of c
      remove a[0] from a
   end while

   while ( b has elements )
      add b[0] to the end of c
      remove b[0] from b
   end while

   return c
end procedure
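
A C implementation following the same divide-and-merge structure is sketched below; the fixed-size temporary buffer is an assumption that is adequate only for small examples like this one.

#include <stdio.h>

/* Merge the two sorted halves a[lo..mid] and a[mid+1..hi]. */
static void merge(int a[], int lo, int mid, int hi) {
    int tmp[64];                               /* big enough for this small example */
    int i = lo, j = mid + 1, k = 0;
    while (i <= mid && j <= hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid) tmp[k++] = a[i++];        /* copy whatever is left over */
    while (j <= hi)  tmp[k++] = a[j++];
    for (k = 0, i = lo; i <= hi; i++, k++)
        a[i] = tmp[k];
}

static void mergeSort(int a[], int lo, int hi) {
    if (lo >= hi) return;                      /* one element: already sorted */
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);                     /* sort left half  */
    mergeSort(a, mid + 1, hi);                 /* sort right half */
    merge(a, lo, mid, hi);                     /* combine them    */
}

int main(void) {
    int a[] = { 14, 33, 27, 10, 35, 19, 42, 44 };
    int n = sizeof a / sizeof a[0];
    mergeSort(a, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}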

Heap Sort

Heap sort is a comparison based sorting technique based on Binary Heap data structure. It is similar to
selection sort where we first find the maximum element and place the maximum element at the end.
We repeat the same process for the remaining elements.

What is Binary Heap?

Let us first define a Complete Binary Tree. A complete binary tree is a binary tree in which every level,
except possibly the last, is completely filled, and all nodes are as far left as possible
A Binary Heap is a Complete Binary Tree where items are stored in a special order such that value in a
parent node is greater(or smaller) than the values in its two children nodes. The former is called as max
heap and the latter is called min heap. The heap can be represented by binary tree or array.

Why array based representation for Binary Heap?

Since a Binary Heap is a Complete Binary Tree, it can be easily represented as array and array based
representation is space efficient. If the parent node is stored at index I, the left child can be calculated
by 2 * I + 1 and right child by 2 * I + 2 (assuming the indexing starts at 0).

Heap Sort Algorithm for sorting in increasing order:

1. Build a max heap from the input data.


2. At this point, the largest item is stored at the root of the heap. Replace it with the last item
of the heap followed by reducing the size of heap by 1. Finally, heapify the root of tree.
3. Repeat above steps while size of heap is greater than 1.
How to build the heap?

Heapify procedure can be applied to a node only if its children nodes are heapified. So the heapification
must be performed in the bottom up order.
Lets understand with the help of an example:



Input data: 4, 10, 3, 5, 1
        4(0)
       /    \
   10(1)     3(2)
   /    \
 5(3)    1(4)

Applying heapify procedure to index 1 (its children 5 and 1 are already smaller, so the subtree is unchanged):

        4(0)
       /    \
   10(1)     3(2)
   /    \
 5(3)    1(4)

Applying heapify procedure to index 0:

       10(0)
       /    \
    5(1)     3(2)
    /   \
 4(3)    1(4)
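
The heapify and heap-sort steps can be put together in C as follows; array indices are 0-based, matching the figures above.

#include <stdio.h>

/* Sift the element at index i down so the subtree rooted at i is a max heap. */
static void heapify(int a[], int n, int i) {
    int largest = i;
    int l = 2 * i + 1, r = 2 * i + 2;          /* children of i (0-based indexing) */
    if (l < n && a[l] > a[largest]) largest = l;
    if (r < n && a[r] > a[largest]) largest = r;
    if (largest != i) {
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        heapify(a, n, largest);                /* continue sifting down */
    }
}

static void heapSort(int a[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)       /* build the max heap bottom-up */
        heapify(a, n, i);
    for (int i = n - 1; i > 0; i--) {          /* move current maximum to the end */
        int t = a[0]; a[0] = a[i]; a[i] = t;
        heapify(a, i, 0);                      /* restore the heap on the prefix */
    }
}

int main(void) {
    int a[] = { 4, 10, 3, 5, 1 };              /* the example data used above */
    int n = sizeof a / sizeof a[0];
    heapSort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}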

Radix Sort
The lower bound for comparison based sorting algorithms (Merge Sort, Heap Sort, Quick Sort, etc.) is
Ω(n log n), i.e., they cannot do better than n log n.
Counting sort is a linear time sorting algorithm that sorts in O(n+k) time when elements are in the range
from 1 to k.

 What if the elements are in the range from 1 to n²?

We can't use counting sort because counting sort will take O(n²), which is worse than comparison based
sorting algorithms. Can we sort such an array in linear time? Radix Sort is the answer. The idea of
Radix Sort is to do a digit by digit sort starting from the least significant digit to the most significant digit. Radix
sort uses counting sort as a subroutine to sort.
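
A C sketch of radix sort that uses a stable counting sort on each digit is given below; the sample input values are illustrative.

#include <stdio.h>
#include <string.h>

static int getMax(const int a[], int n) {
    int m = a[0];
    for (int i = 1; i < n; i++) if (a[i] > m) m = a[i];
    return m;
}

/* Stable counting sort on the digit selected by exp (1, 10, 100, ...). */
static void countingSortByDigit(int a[], int n, int exp) {
    int output[64];                            /* large enough for this small example */
    int count[10];
    memset(count, 0, sizeof count);

    for (int i = 0; i < n; i++) count[(a[i] / exp) % 10]++;
    for (int d = 1; d < 10; d++) count[d] += count[d - 1];     /* prefix sums give positions */
    for (int i = n - 1; i >= 0; i--)                           /* backwards pass keeps it stable */
        output[--count[(a[i] / exp) % 10]] = a[i];
    for (int i = 0; i < n; i++) a[i] = output[i];
}

static void radixSort(int a[], int n) {
    int max = getMax(a, n);
    for (int exp = 1; max / exp > 0; exp *= 10)                /* one pass per digit */
        countingSortByDigit(a, n, exp);
}

int main(void) {
    int a[] = { 170, 45, 75, 90, 802, 24, 2, 66 };
    int n = sizeof a / sizeof a[0];
    radixSort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}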

UNIT-V PATTERN MATCHING AND TRIES


Pattern matching in computer science is the checking and locating of specific
sequences of data of some pattern among raw data or a sequence of tokens.
Unlike pattern recognition, the match has to be exact in the case of pattern
matching. Pattern matching is one of the most fundamental and important
paradigms in several programming languages. Many applications make use of
pattern matching as a major part of their tasks.

Brute force is a straightforward approach to solving a problem, usually


directly based on the problem statement and definition
 Brute Force (Naive) String Matching Algorithm

When we talk about a string matching algorithm, the simple technique everyone thinks of is this: start from the first letter of the text and the first letter of the pattern and check whether these two letters are equal. If they are, check the second letters of the text and the pattern. If they are not equal, move the first letter of the pattern to the second letter of the text and check these two letters, and so on.

The Brute Force string matching algorithm works in exactly this way, which is why it is also called the Naive string matching algorithm (naive means basic). Let us learn this method using an example.

Red Boxes - Mismatch; Green Boxes - Match

In the figure above, red boxes mark mismatched letters and green boxes mark matched letters against the letters of the text. According to the above:

In the first row we check whether the first letter of the pattern matches the first letter of the text. It is a mismatch, because "S" is the first letter of the pattern and "T" is the first letter of the text. Then we move the pattern by one position, shown in the second row.

Then we check the first letter of the pattern against the second letter of the text. It is also a mismatch. Likewise we continue the checking and moving process. In the fourth row we can see the first letter of the pattern matched with the text. Then we do not do any moving, but we advance the testing letter of the pattern. We only move the position of the pattern by one when we find mismatches. Also in the last row, we can see all the letters of the pattern matched with some letters of the text continuously.

Running Time Analysis of the Brute Force String Matching Algorithm

Let the length of the text be n (|text| = n) and the length of the pattern be m (|pattern| = m).

In the worst case, at each position we have to make m comparisons, and we have to try (n-m+1) positions. Therefore the total number of comparisons in the string matching process is m*(n-m+1), and the running time of the Brute Force string matching algorithm is O(m(n-m+1)). We simply write this as O(nm) when m is much smaller than n.

From the above example, |text| = 24, but we do comparisons only for the first 16 letters, so take |text| = 16 and |pattern| = 6. In the worst case we have to compare the 6 letters of the pattern with 6 letters of the text at each position, and there are (|text| - |pattern| + 1) = 11 positions for the first 16 letters of the text (the positions correspond to the rows in the example above). Therefore the total number of comparisons for the example is |pattern| * (|text| - |pattern| + 1) = 6 * 11 = 66. When the pattern is much shorter than the text, we approximate this by |text| * |pattern|, so the running time is O(|pattern| * |text|).
Advantages
1. It is a very simple technique that does not require any preprocessing. Therefore the total running time is the same as its matching time.

Disadvantages

1. It is a very inefficient method, because it moves the pattern by only one position at a time.

Brute-Force String Matching


Searching for a pattern, P[0...m-1], in text, T[0...n-1]

Algorithm BruteForceStringMatch(T[0...n-1], P[0...m-1])

for i ← 0 to n-m do

j←0

while j < m and P[j] = T[i+j] do

j++

if j = m then return i

return -1

#include <stdio.h>
#include <string.h>

char t[100], p[50];

/* Returns the 0-based index of the first occurrence of p in t, or -1 if not found. */
int brute_force(void)
{
    int n = strlen(t);
    int m = strlen(p);
    int i, j;
    for (i = 0; i <= n - m; i++)
    {
        j = 0;
        while (j < m && t[i + j] == p[j])
        {
            j++;
            if (j == m)
                return i;      /* pattern found */
        }
    }
    return -1;                 /* pattern not found */
}

int main(void)
{
    int pos;
    printf("Enter the Source String ");
    scanf("%s", t);
    printf("Enter the pattern ");
    scanf("%s", p);
    pos = brute_force();
    if (pos == -1)
        printf("%s pattern not found in text", p);
    else
        printf("%s pattern found at index %d", p, pos);
    return 0;
}

Boyer Moore Algorithm


Pattern searching is an important problem in computer science. When we
do search for a string in notepad/word file or browser or database, pattern
searching algorithms are used to show the search results. A typical problem
statement would be- Given a text txt[0..n-1] and a pattern pat[0..m-1], write
a function search(char pat[], char txt[]) that prints all occurrences of pat[] in
txt[]. You may assume that n > m.
Examples:
Input: txt[] = "THIS IS A TEST TEXT"
pat[] = "TEST"
Output: Pattern found at index 10

Input: txt[] = "AABAACAADAABAABA"


pat[] = "AABA"
Output: Pattern found at index 0
Pattern found at index 9
Pattern found at index 12

Boyer Moore is a combination of following two approaches.


1) Bad Character Heuristic
2) Good Suffix Heuristic
Both of the above heuristics can also be used independently to search a pattern
in a text. Let us first understand how two independent approaches work together
in the Boyer Moore algorithm. If we take a look at the Naive algorithm, it slides
the pattern over the text one by one. KMP algorithm does preprocessing over
the pattern so that the pattern can be shifted by more than one. The Boyer
Moore algorithm does preprocessing for the same reason. It processes the
pattern and creates different arrays for both heuristics. At every step, it slides
the pattern by the max of the slides suggested by the two heuristics. So it uses
best of the two heuristics at every step.
Unlike the previous pattern searching algorithms, Boyer Moore algorithm starts matching
from the last character of the pattern.

Bad Character Heuristic


The idea of bad character heuristic is simple. The character of the text which
doesn’t match with the current character of the pattern is called the Bad
Character. Upon mismatch, we shift the pattern until –
1) The mismatch becomes a match
2) Pattern P move past the mismatched character.
Case 1 – Mismatch become match
We will lookup the position of last occurrence of mismatching character in
pattern and if mismatching character exist in pattern then we’ll shift the pattern
such that it get aligned to the mismatching character in text T.

case 1

Explanation: In the above example, we got a mismatch at position 3. Here our


mismatching character is “A”. Now we will search for last occurrence of “A” in
pattern. We got “A” at position 1 in pattern (displayed in Blue) and this is the last
occurrence of it. Now we will shift pattern 2 times so that “A” in pattern get
aligned with “A” in text.
Case 2 – Pattern move past the mismatch character
We’ll lookup the position of last occurrence of mismatching character in pattern
and if character does not exist we will shift pattern past the mismatching
character.

Explanation: Here we have a mismatch at position 7. The mismatching


character “C” does not exist in pattern before position 7 so we’ll shift pattern
past to the position 7 and eventually in above example we have got a perfect
match of pattern (displayed in Green). We are doing this because, “C” do not
exist in pattern so at every shift before position 7 we will get mismatch and our
search will be fruitless.
In the following implementation, we preprocess the pattern and store the last
occurrence of every possible character in an array of size equal to alphabet
size. If the character is not present at all, then it may result in a shift by m
(length of pattern). Therefore, the bad character heuristic takes O(n/m) time in the best case.
/* C Program for Bad Character Heuristic of Boyer
Moore String Matching Algorithm */
# include <limits.h>
# include <string.h>
# include <stdio.h>

# define NO_OF_CHARS 256

// A utility function to get maximum of two integers


int max (int a, int b) { return (a > b)? a: b; }

// The preprocessing function for Boyer Moore's


// bad character heuristic
void badCharHeuristic( char *str, int size,
int badchar[NO_OF_CHARS])
{
int i;

// Initialize all occurrences as -1


for (i = 0; i < NO_OF_CHARS; i++)
badchar[i] = -1;

// Fill the actual value of last occurrence


// of a character
for (i = 0; i < size; i++)
badchar[(int) str[i]] = i;
}

/* A pattern searching function that uses Bad


Character Heuristic of Boyer Moore Algorithm */
void search( char *txt, char *pat)
{
int m = strlen(pat);
int n = strlen(txt);

int badchar[NO_OF_CHARS];

/* Fill the bad character array by calling


the preprocessing function badCharHeuristic()
for given pattern */
badCharHeuristic(pat, m, badchar);

int s = 0; // s is shift of the pattern with


// respect to text
while(s <= (n - m))
{
int j = m-1;

/* Keep reducing index j of pattern while


characters of pattern and text are
matching at this shift s */
while(j >= 0 && pat[j] == txt[s+j])
j--;

/* If the pattern is present at current


shift, then index j will become -1 after
the above loop */
if (j < 0)
{
printf("\n pattern occurs at shift = %d", s);

/* Shift the pattern so that the next

character in text aligns with the last
occurrence of it in pattern.
The condition s+m < n is necessary for
the case when pattern occurs at the end
of text */
s += (s+m < n)? m-badchar[txt[s+m]] : 1;
}

else
/* Shift the pattern so that the bad character
in text aligns with the last occurrence of
it in pattern. The max function is used to
make sure that we get a positive shift.
We may get a negative shift if the last
occurrence of bad character in pattern
is on the right side of the current
character. */
s += max(1, j - badchar[txt[s+j]]);
}
}

/* Driver program to test above function */


int main()
{
char txt[] = "ABAAABCD";
char pat[] = "ABC";
search(txt, pat);
return 0;
}

Knuth-Morris-Pratt algorithm
KMP Algorithm is one of the most popular pattern matching algorithms. KMP stands
for Knuth Morris Pratt. The KMP algorithm was invented by Donald Knuth and Vaughan
Pratt together, and independently by James H Morris, in the year 1970. In the year
1977, all three jointly published the KMP Algorithm.
KMP algorithm was the first linear time complexity algorithm for string matching.
KMP algorithm is one of the string matching algorithms used to find a Pattern in a Text.

KMP algorithm is used to find a "Pattern" in a "Text". This algorithm compares character
by character from left to right. But whenever a mismatch occurs, it uses a preprocessed
table called "Prefix Table" to skip characters comparison while matching. Some times
prefix table is also known as LPS Table. Here LPS stands for "Longest proper Prefix
which is also Suffix".

Steps for Creating LPS Table (Prefix Table)


 Step 1 - Define a one dimensional array with the size equal to the length of the
Pattern. (LPS[size])
 Step 2 - Define variables i & j. Set i = 0, j = 1 and LPS[0] = 0.
 Step 3 - Compare the characters at Pattern[i] and Pattern[j].

 Step 4 - If both are matched then set LPS[j] = i+1 and increment both i & j values by one. Goto Step 3.

 Step 5 - If both are not matched then check the value of variable 'i'. If it is '0' then
set LPS[j] = 0 and increment 'j' value by one, if it is not '0' then set i = LPS[i-1].
Goto Step 3.

 Step 6- Repeat above steps until all the values of LPS[] are filled.

Let us use above steps to create prefix table for a pattern...


How to use LPS Table
We use the LPS table to decide how many characters are to be skipped for comparison
when a mismatch has occurred.
When a mismatch occurs, check the LPS value of the previous character of the mismatched
character in the pattern. If it is '0' then start comparing the first character of the pattern with
the next character to the mismatched character in the text. If it is not '0' then start comparing
the character which is at an index value equal to the LPS value of the previous character to
the mismatched character in pattern with the mismatched character in the Text.

How the KMP Algorithm Works


Let us see a working example of KMP Algorithm to find a Pattern in a Text...
code in C:
#include<stdio.h>
#include<string.h>
char txt[100],pat[100];
int M ,N ,lps[100],j=0,i=0;
void computeLPSArray()
{
int len = 0, i;
lps[0] = 0;
i = 1;
while(i < M)
{
if(pat[i] == pat[len])
{
len++;
lps[i] = len;
i++;
}
else
{
if( len != 0 )
len = lps[len-1];
else
{
lps[i] = 0;
i++;
}
}
}
}
void KMPSearch()
{
int j=0,i=0;
M = strlen(pat);
N = strlen(txt);
computeLPSArray();
while(i < N)
{
if(pat[j] == txt[i])
{
j++;
i++;
}

if (j == M)
{
printf("Found pattern at index %d \n", i-j);
j = lps[j-1];
}
else if(pat[j] != txt[i])
{
if(j != 0)
j = lps[j-1];
else
i = i+1;
}
}
}
int main()
{
printf("\n ENTER THE TEXT : "); Non Linear Data Structures - Trees 3.28
gets(txt);
printf("\n ENTER THE PATTERN : ");
gets(pat);
KMPSearch();
return 0;
}
output:-

ENTER THE TEXT : Welcome To CampusCoke


ENTER THE PATTERN : C
Found pattern at index 11
Found pattern at index 17
--------------------------------

Tries
All the search trees are used to store the collection of numerical values but
they are not suitable for storing the collection of words or strings. Trie is a
data structure which is used to store the collection of strings and makes
searching for a pattern in words easier. The term trie came from the
word retrieval. The trie data structure makes retrieval of a string from the
collection of strings easier. A trie is also called a Prefix Tree and
sometimes a Digital Tree. A trie is defined as follows...
Trie is a tree like data structure used to store collection of strings.
A trie can also be defined as follows...
Trie is an efficient information storage and retrieval data structure.
The trie data structure provides fast pattern matching for string data values.
Using trie, we bring the search complexity of a string to the optimal limit. A
trie searches a string in O(m) time complexity, where m is the length of the
string.
In trie, every node except the root stores a character value. Every node in
trie can have one or a number of children. All the children of a node are
alphabetically ordered. If any two strings have a common prefix then they
will have the same ancestors.
Example

 The standard Trie data structure

o Definitions:

 Alphabet Σ = a set of characters

 Let S = a set of s strings (= keys) over an alphabet Σ

 A trie T that stores the keys in S is a structure where:

 Each node of T (except the root node) is labeled with a character c ∈ Σ

 The root node has no label

 Each internal node of T can have ≤ |Σ| children

 The keys are stored in alphabetical order inside an internal node

 The trie T has s external nodes, one for each string in S (and index (= location) information is stored for these strings)

 The path from the root node to an external node yields exactly one string in S

o Example:

 S = { bear, bell, bid, bull, buy, sell, stock, stop } (s = 8)

Trie:

 Note: there are 8 leaf nodes !!

 How to implement a trie

o There are many ways to implement a trie

 Use an array of references

 Each array element represents one letter of the alphabet

 Each array reference will point to a sub-trie that corresponds to strings that start with the corresponding letter

 Use a binary tree

 This implementation is called a bitwise trie

 Implementing a trie using an array of references

o Array implementation:

 If the trie is dense and/or the alphabet Σ is small (e.g., Σ = {a, b, c, ..., z}), you can use an array to store the labels

 Node structure:

o Example:

Implementing a trie using a binary tree

o The binary tree implementation:

 The binary tree implementation of a trie is known as a bitwise trie

 The implementation uses the alphabet Σ = {0, 1}

 In other words, the implementation stores a sequence of bits

 The keys are read as a sequence of bits

Example:

bear = 'b' 'e' 'a' 'r'
     = 01100010 01100101 01100001 01110010

o A bitwise trie is a binary tree:

Structural properties of the standard trie

o Properties of the standard trie:

 Every internal node has ≤ |Σ| children

 This follows from the way that the trie is constructed

 The trie T on the set S with s strings (keys) has exactly s external nodes

 This is because a path from the root node to one external node corresponds to 1 key

 The height of the trie T on the set S = the length of the longest string ∈ S

 This is because a path from the root node to one external node corresponds to 1 key, so the longest path = the longest key ∈ S

 The number of nodes in the trie T on the set S = O(n), where n = # characters in the strings ∈ S

 In the worst case, every character in the keys is different

E.g.:

 Inserting into a standard trie


o High level description:

p = root; // Start at the root

for ( each character c in the keyword ) do
{
    if ( c is already stored in the sub-trie )
        traverse to that sub-trie
    else
        create a new node for c
}

insert value into leaf node


Example:
 Insert stock in this trie:

 First we traverse the prefix that is already stored in the trie: "sto":

 Then we create nodes for the letters that are not yet in the trie: "ck":
o Pseudo code:

Value put( Key k, Value v )


{
int i;
TrieNode p;

p = root;

for ( i = 0; i < k.length(); i++ )


{
nextChar = k[i];

if ( nextChar ∈ p.char[] )
{
nextLink = link associated with the char
entry;
p = nextLink; // Traverse
}
else
{
p = new TrieNode(); // Create new
node

insert "nextChar" into p;


}
}

/*
------------------------------------------------------
When the while loop ends, we have found or
created
the external node for the key "k"

------------------------------------------------------
*/
insert v into node p;
}
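
A compact C sketch of a standard trie restricted to lower-case letters is shown below; the TrieNode layout and function names are assumptions, and the keys are the example set S used earlier.

#include <stdio.h>
#include <stdlib.h>

#define ALPHABET 26

struct TrieNode {
    struct TrieNode *child[ALPHABET];
    int isEnd;                                 /* 1 if a key terminates here */
};

static struct TrieNode *newTrieNode(void) {
    return calloc(1, sizeof(struct TrieNode)); /* calloc zeroes children and isEnd */
}

static void trieInsert(struct TrieNode *root, const char *key) {
    struct TrieNode *p = root;
    for (int i = 0; key[i]; i++) {
        int c = key[i] - 'a';
        if (p->child[c] == NULL)
            p->child[c] = newTrieNode();       /* create a node for an unseen character */
        p = p->child[c];
    }
    p->isEnd = 1;                              /* mark the end of the key */
}

static int trieSearch(const struct TrieNode *root, const char *key) {
    const struct TrieNode *p = root;
    for (int i = 0; key[i]; i++) {
        int c = key[i] - 'a';
        if (p->child[c] == NULL) return 0;     /* path breaks: key not present */
        p = p->child[c];
    }
    return p->isEnd;
}

int main(void) {
    const char *keys[] = { "bear", "bell", "bid", "bull", "buy", "sell", "stock", "stop" };
    struct TrieNode *root = newTrieNode();
    for (int i = 0; i < 8; i++) trieInsert(root, keys[i]);
    printf("stock : %s\n", trieSearch(root, "stock") ? "found" : "not found");
    printf("stole : %s\n", trieSearch(root, "stole") ? "found" : "not found");
    return 0;
}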

 Advantages of tries over an ordinary map

 Looking up data in a trie is in the worst case O(m), where m = length of the key

 Look up in a (tree-based) map is O(lg(n)), where n = # entries !!!
So a trie has performance levels similar to a hash table !!!

 Unlike a hash table, a trie can provide an alphabetical ordering of the entries by key

 (I.e., a trie implements an ordered map while a hash table cannot !)
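A matching lookup sketch in C, using the same assumed TrieNode layout; the loop visits at most m nodes for a key of length m, which is the O(m) bound quoted above.

/* Return the value stored for 'key', or NULL if the key is absent. */
void *trieGet(struct TrieNode *root, const char *key)
{
    struct TrieNode *p = root;

    for (int i = 0; key[i] != '\0'; i++) {      /* at most m iterations        */
        int c = key[i] - 'a';
        if (p->children[c] == NULL)
            return NULL;                        /* key is not in the trie      */
        p = p->children[c];
    }
    return p->isEnd ? p->value : NULL;          /* a bare prefix is not a key  */
}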

 Handling keys that are prefixes of another key

o The standard trie has the property that:

 Only the external nodes can store information
(The path formed by the internal nodes represents the key)

o When a key (string) is a prefix of another key, the path of the first key would end in an internal node

 Example: at and ate

o Solution:

 Add a special termination symbol ◊ to the alphabet Σ

 The termination symbol ◊ has the lowest value in the alphabet
I.e., the termination symbol ◊ precedes every character in the alphabet Σ !!

 We append the termination symbol ◊ to each keyword stored in the trie

 Example:

o Note:

 We typically use the NUL character '\0' as the termination symbol
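One possible way to reserve the lowest slot for the termination symbol in the array implementation is sketched below; the 27-slot layout and the name slotOf are assumptions for illustration only.

#define TRIE_SLOTS 27      /* slot 0 = terminator '\0', slots 1..26 = 'a'..'z' */

/* Map a character (including the '\0' terminator) to its array slot,
   so that the terminator orders before every letter of Σ. */
static int slotOf(char c)
{
    return (c == '\0') ? 0 : (c - 'a' + 1);
}

With this mapping, "at" and "ate" end at two different external nodes: "at" follows a, t, then the terminator slot, while "ate" follows a, t, e, then the terminator slot.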

 Compressed tries

o A compressed trie is a trie with one additional rule:

 Each internal node has ≥ 2 children

 Such a compacted trie is also known as:

 a Patricia trie or radix trie

o In order to enforce the above rule, the labels are generalized:

 Each node is labeled with a string (multiple characters)
(The label used to be a single character)

 Converting a standard trie to a compressed trie

o Redundant node:

 An internal node v is redundant if

 Node v is not the root node, and

 Node v has exactly 1 child node

 Example:

o Redundant chain of edges:

 A chain of edges (v0,v1), (v1,v2), ..., (vk−1,vk) is redundant if:

 Nodes v1, v2, ..., vk−1 are redundant

 Nodes v0 and vk are not redundant

 Example:

o Compression algorithm:

 Replace:

 a redundant chain of edges (v0,v1), (v1,v2), ..., (vk−1,vk)

 by one edge: (v0,vk)

 Replace:

 the label of vk

 by the label: v1 v2 ... vk

(A C sketch of this compression pass is given after the example.)

 Example:

 Before compression:

 After compression:
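A hedged sketch of the compression pass in C, assuming each node carries a heap-allocated string label, a child count and an end-of-key flag (the CNode type and its fields are illustrative, not from the notes); redundant single-child chains are spliced out and their labels concatenated.

#include <stdlib.h>
#include <string.h>

struct CNode {
    char *label;                      /* string label on this node            */
    struct CNode *children[26];
    int childCount;                   /* number of non-NULL children          */
    int isEnd;                        /* 1 if a key ends at this node         */
};

static void compress(struct CNode *v)
{
    for (int i = 0; i < 26; i++) {
        struct CNode *c = v->children[i];
        if (c == NULL) continue;

        /* Splice out c while it is redundant: exactly one child, no key end. */
        while (c->childCount == 1 && !c->isEnd) {
            struct CNode *only = NULL;
            for (int j = 0; j < 26; j++)
                if (c->children[j] != NULL) { only = c->children[j]; break; }

            /* New label of the child = label(c) concatenated with label(only). */
            char *merged = malloc(strlen(c->label) + strlen(only->label) + 1);
            strcpy(merged, c->label);
            strcat(merged, only->label);
            free(only->label);
            only->label = merged;

            v->children[i] = only;    /* edges (v,c), (c,only) become (v,only) */
            free(c->label);
            free(c);
            c = only;
        }
        compress(c);                  /* continue compressing the subtree     */
    }
}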

 What is different with the implementation of a compressed trie

o Recall how the standard trie is implemented:

o The compressed trie uses strings as labels:

 Find a node that has only one child node:

 Result:

 Another example:

 Find a node that has only one child node:

 Result:
 Properties of the compressed trie

 Each internal node has ≥ 2 children and ≤ |Σ| children

 A compressed trie T storing s strings (keys) has: s external nodes

 A compressed trie T storing s strings (keys) has: O(s) total number of nodes

 Because a compressed trie is in the worst case comparable to a binary tree

 In a binary tree (where every internal node has 2 children): # external nodes = # internal nodes + 1

 Hence: # internal nodes in a compressed trie < s
 Suffix tries

A suffix tree is a compressed trie of all the suffixes of a given string. Suffix
trees help in solving a lot of string-related problems like pattern matching,
finding distinct substrings in a given string, finding the longest palindrome, etc.

Before going to the suffix tree, let's first try to understand what a compressed
trie is. Consider the following set of strings:
{ "banana", "nabd", "bcdef", "bcfeg", "aaaaaa", "aaabaa" }
A standard trie for the above set of strings will look like:

And a compressed trie for the given set of strings will look like:

As might be clear from the images shown above, in a compressed trie, edges
that lead to a node having a single child are combined together to form a
single edge, and their edge labels are concatenated. So this means that each
internal node in a compressed trie has at least two children. Also, it has at
most N leaves, where N is the number of strings inserted in the compressed
trie. These two facts (each internal node has at least two children, and there
are N leaves) imply that there are at most 2N−1 nodes in the trie. So the space
complexity of a compressed trie is O(N), as compared to the O(N^2) of a
normal trie. That is one reason to use compressed tries instead of normal tries.

Before going to the construction of suffix trees, there is one more thing that
should be understood: the implicit suffix tree. In implicit suffix trees there are
at most N leaves, while in the normal one there should be exactly N leaves. The
reason for at most N leaves is one suffix being a prefix of another suffix. The
following example will make it clear. Consider the string "banana".
The implicit suffix tree for the above string is shown in the image below:

To avoid getting an implicit suffix tree we append a special character that is
not equal to any other character of the string. Suppose we append $ to the
given string; the new string is "banana$". Now its suffix tree will be:

Now let's go to the construction of suffix trees.

A suffix tree, as mentioned previously, is a compressed trie of all the suffixes
of a given string, so the brute-force approach is to consider all the suffixes of
the given string as separate strings and insert them into the trie one by one.
But the time complexity of the brute-force approach is O(N^2), which is of no
use for large values of N.

A suffix tree for a given text is a compressed trie for all suffixes of the given
text. We have discussed the standard trie. Let us understand the compressed
trie with the following array of words:
{ bear, bell, bid, bull, buy, sell, stock, stop }
Following is the standard trie for the above input set of words.
Following is the compressed trie. The compressed trie is obtained from the
standard trie by joining chains of single nodes. The nodes of a compressed
trie can be stored by storing index ranges at the nodes.

 How to build a suffix tree for a given text?

As discussed above, a suffix tree is a compressed trie of all suffixes, so the
following are very abstract steps to build a suffix tree from the given text:
1) Generate all suffixes of the given text.
2) Consider all suffixes as individual words and build a compressed trie.
Let us consider an example text "banana\0", where '\0' is the string
termination character. Following are all suffixes of "banana\0":
banana\0
anana\0
nana\0
ana\0
na\0
a\0
\0
If we consider all of the above suffixes as individual words and build a trie,
we get the following (a brute-force construction sketch in C is given below).
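A sketch of this brute-force construction in C, reusing the triePut() and newTrieNode() sketches from earlier (so it builds the uncompressed suffix trie; compressing it afterwards, e.g. with the compression pass sketched above, yields the suffix tree). This is the O(N^2) approach described in the text, not a linear-time algorithm such as Ukkonen's.

#include <string.h>

struct TrieNode *buildSuffixTrie(const char *text)
{
    struct TrieNode *root = newTrieNode();
    size_t n = strlen(text);

    /* Insert every suffix text[i..], including the empty suffix, one by one.
       The isEnd flag plays the role of the '\0' terminator in the example;
       the stored value is the starting position of the suffix. */
    for (size_t i = 0; i <= n; i++)
        triePut(root, text + i, (void *)(text + i));

    return root;
}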

If we join chains of single nodes, we get the following compressed trie,
which is the suffix tree for the given text "banana\0".

Please note that the above steps are just a way to create a suffix tree manually.
The actual construction algorithm and implementation are discussed separately.

 How to search for a pattern in the built suffix tree?

We have discussed above how to build a suffix tree, which is needed as a
preprocessing step in pattern searching. Following are abstract steps to search
for a pattern in the built suffix tree:
1) Starting from the first character of the pattern and the root of the suffix
tree, do the following for every character:
   a) For the current character of the pattern, if there is an edge from the
      current node of the suffix tree, follow the edge.
   b) If there is no edge, print "pattern doesn't exist in text" and return.
2) If all characters of the pattern have been processed, i.e., there is a path
from the root for the characters of the given pattern, then print "Pattern found".
Let us consider the example pattern "nan" to see the searching process. The
following diagram shows the path followed for searching "nan" or "nana"
(a simplified search sketch in C follows).
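A simplified search sketch in C over the uncompressed suffix trie built above; a real suffix tree walks string-labelled edges, so it additionally has to compare the pattern against each edge label, which is omitted here.

#include <stdio.h>

/* Report whether 'pattern' occurs in the indexed text by walking the
   suffix trie character by character, as in steps 1a/1b above. */
int searchPattern(struct TrieNode *root, const char *pattern)
{
    struct TrieNode *p = root;

    for (int i = 0; pattern[i] != '\0'; i++) {
        int c = pattern[i] - 'a';
        if (p->children[c] == NULL) {
            printf("pattern doesn't exist in text\n");
            return 0;
        }
        p = p->children[c];                     /* follow the edge */
    }
    printf("Pattern found\n");
    return 1;
}

For the text "banana", searchPattern(root, "nan") follows n, a, n and prints "Pattern found", while searchPattern(root, "nab") fails at 'b'.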

 How does this work?

Every pattern that is present in the text (in other words, every substring of the
text) must be a prefix of one of the suffixes. The statement may seem
complicated, but it is simple; for example, the substring "nan" of "banana" is a
prefix of the suffix "nana".
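A tiny brute-force illustration of the claim in C, using the buildSuffixTrie() and searchPattern() sketches above: every substring of the text is located by the search, because it is a prefix of the suffix starting at the same position.

#include <string.h>

void demoEverySubstringIsFound(const char *text)
{
    struct TrieNode *root = buildSuffixTrie(text);
    size_t n = strlen(text);
    char buf[64];                                /* assumes short demo texts   */

    for (size_t i = 0; i < n; i++)
        for (size_t len = 1; len <= n - i && len < sizeof buf; len++) {
            memcpy(buf, text + i, len);          /* substring text[i..i+len-1] */
            buf[len] = '\0';
            searchPattern(root, buf);            /* always prints "Pattern found" */
        }
}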

 Applications of suffix trees

Suffix trees can be used for a wide range of problems. Following are some
well-known problems where suffix trees provide optimal time complexity
solutions:
1) Pattern searching
2) Finding the longest repeated substring
3) Finding the longest common substring
4) Finding the longest palindrome in a string

PREVIOUS YEAR QUESTION PAPERS


b) What is a graph? Explain various representations of graphs. [5+5]

8.a) Write an algorithm for Heap sort.

b) Apply selection sort on the following elements:

{21, 11, 5, 78, 49, 54, 72, 88} [5+5]

OR
9. What is collision? Explain different collision resolution techniques with examples. [10]

10.a) Build an AVL tree with the following values:

{15, 20, 24, 10, 13, 7, 30, 36, 25, 42, 29}

b) Write Knuth-Morris-Pratt pattern matching algorithm. [5+5]

OR
11. Write short notes on:

a) Red-Black trees b) Splay trees c) B-trees. [3+3+4]


8.a) Write a C Program for Linear Search.
b) Explain Collision Resolution Methods. [5+5]

OR

9.a) Differentiate between Linear and Binary Search Methods.
b) Write a program for Quick Sort. [5+5]

10.a) Explain about AVL Tree with an example.
b) Give the types of Tries with examples. [5+5]

OR

11.a) Write about Splay Tree with an example.
b) Discuss about Binary Search Tree operations. [5+5]

--ooOoo--
