Data Structures Notes
Mathematical Background, Model, Analysis and Run Time Calculations, Lists: Abstract
Data Types, List using arrays and pointers, Singly Linked, Doubly Linked, Circular Linked
Properties of an algorithm:-
1. Input
2. Output
3. Finiteness
4. Definiteness
5. Effectiveness
5. Effectiveness: - An algorithm should be effective i.e. operations can be performed with the
given inputs in a finite period of time by a person using paper and pencil.
Analysis of an algorithm is the task of determining how much computing time (time complexity) and storage (space complexity) are required by the algorithm.
We can analyze performance of an algorithm using Time complexity and Space complexity.
Time Complexity: - The amount of time that an algorithm requires for its execution is
known as time complexity.
Time complexity is mainly classified into 3 types.
1. Best case time complexity
2. Worst case time complexity
3. Average case time complexity
1. Best case time complexity: - If an algorithm takes minimum amount of time for its
execution then it is called as Best case time complexity.
Ex: - While searching an element using linear search, if key element found at first position
then it is best case.
2. Worst case time complexity: - If an algorithm takes maximum amount of time for its
execution then it is called as Worst case time complexity.
Ex: - While searching an element using linear search, if key element found at last position
then it is worst case.
3. Average case time complexity: - If an algorithm takes average amount of time for its
execution then it is called as Average case time complexity.
Ex: - While searching an element using linear search, if key element found at middle position
then it is average case.
Ex: - A sample program to calculate time complexity of sum of cubes of ‘n’ natural numbers.
Rule 1:-The running time of for loop is the running time of statements in for loop.
Ex:- for(i=0;i<n;i++) -----------(n+1)
s=s+i; ------------ n
__________
2n+1
__________
So, time complexity=O(n)
Rule 2:-The total running time of statements inside a group of nested loops is the product of
the sizes of all loops.
Ex:- for(i=0;i<n;i++) ---------------- (n+1)
for(j=0;j<n;j++) ---------------- n(n+1)
c[i][j]=a[i][j]+b[i][j]; ------------- (n*n)
_____________
2n²+2n+1
_____________
So, time complexity=O(n²)
s=s+i; ---------- n
return s; ----------- 1
}
}
____________
2n+5
__________________
So, time complexity=O(n)
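The body of the sample program itself is truncated in the notes (only its last statements and the 2n+5 total survive above). One plausible reconstruction, assuming the program reads n and prints the sum, with the number of times each statement executes written alongside:
#include<stdio.h>
int main()
{
int i,n,s=0; /* 1 */
scanf("%d",&n); /* 1 */
for(i=1;i<=n;i++) /* n+1 */
s=s+i*i*i; /* n (sum of cubes) */
printf("sum=%d",s); /* 1 */
return 0; /* 1 */
}
The counts add up to 2n+5, so the time complexity is O(n).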
Asymptotic notations:-
Asymptotic notations are used to calculate time complexity of algorithm.
Using asymptotic notations we can calculate best case, worst case and average case time
complexity.
There are 5 types of asymptotic notations.
1. Big Oh notation (O)
2. Big Omega notation (Ω)
3. Theta notation (θ)
4. Little Oh notation (o)
5. Little omega notation (ω)
1. Big Oh notation (O):-
Definition: - Let f(n), g(n) be 2 non negative functions. Then f(n)=O(g(n)), if there are 2
positive constants c, n0 such that f(n)<=c*g(n) ∀ n>=n0.
Ex:- f(n)=3n+2
g(n)=n;
To show f(n)=O(g(n))
f(n)<=c*g(n) , c>0, n0>=1
3n+2<=c*n
Let C=4, n0=1 then 5<=4 wrong
n0=2 then 8<=8 correct
n0=3 then 11<=12 correct
n0=4 then 14<=16 correct
So, n0>=2
Hence, 3n+2<=4*n ∀ n>=2
2. Big Omega notation (Ω):-
Definition: - Let f(n), g(n) be 2 non negative functions. Then f(n)= Ω (g(n)), if there are 2
positive constants c, n0 such that f(n)>=c*g(n) ∀ n>=n0.
f(n)= Ω (g(n))
Ex:- f(n)=3n+2
g(n)=n;
To show f(n)= Ω(g(n))
f(n)>=c*g(n) , c>0, n0>=1
3n+2>=c*n
Let c=1, n0=1 then 5>=1 correct
So, n0>=1
Hence, 3n+2>=1*n ∀ n>=1
3. Theta notation (θ):-
Definition: - Let f(n), g(n) be 2 non negative functions. Then f(n)= θ (g(n)), if there are 3
positive constants c1, c2,n0 such that c1*g(n)<=f(n)<=c2*g(n) ∀ n>=n0.
Ex:- f(n)=3n+2
g(n)=n;
To show f(n)= θ(g(n))
c1*g(n)<=f(n)<=c2*g(n), c1>0,
c2>0
n0>=1
c1*n<=3n+2<= c2*n
c2=4 , c1=1,
Let n0=1 then 1<=5<=4 wrong
Let n0=2 then 2<=8<=8 correct
Let n0=3 then 3<=11<=12 correct
Let n0=4 then 4<=14<=16 correct
So, n0>=2
Hence, 1*n<=3n+2<=4*n ∀ n>=2
4. Little oh notation(o):-
Let f(n), g(n) be 2 non negative functions. Then f(n)=o(g(n)) if, for every positive constant c, there exists a constant n0 such that f(n)<c*g(n) ∀ n>=n0 (i.e. f(n) grows strictly slower than g(n)).
Growth rates of some common functions (logarithms to base 2):
n     n²     n³     2ⁿ     log n   n log n
1     1      1      2      0       0
2     4      8      4      1       2
4     16     64     16     2       8
8     64     512    256    3       24
Space complexity:- The amount of space required by an algorithm for its execution.
Space complexity S(P) can be calculated as S(P)=C+Sp
Where, S(P) is the space complexity of a program or problem,
C is the constant (fixed) part, i.e. the space required by the code, constants and simple variables, which is independent of the input, and
Sp is the variable part, i.e. the space whose size depends on the particular problem instance (for example, on the input size n).
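As a small illustration of the two parts (a hypothetical example, not taken from the notes), consider a function that sums the elements of an array:
int sum(int a[],int n)
{
int i,s=0; /* fixed part C: a few integers, independent of n */
for(i=0;i<n;i++)
s=s+a[i]; /* variable part Sp: the n array elements being processed */
return s;
}
So S(P)=C+n here, i.e. the space grows linearly with the input size.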
Data structure is a data organization, management & storage format that enables
efficient access & modification.
Types of Data Structures:- Data Structures are mainly classified into 2 types.
1.Linear Data Structures:- In linear data structures all the elements are organized in linear
(sequential) fashion.
2.Non Linear Data Structures:- In non linear data structures the elements are not organized in a sequential fashion; an element may be connected to more than one element (for example trees and graphs).
The basic idea is that the implementation of these operations is written once in the
program and any other part of the program that needs to perform an operation on the ADT
can do so by calling the appropriate function.
For Example, stack ADT have operations like push, pop but stack ADT doesn’t
provide any implementation details regarding operations.
List ADT:- A list is basically a collection of elements. For example, a list of n elements has the form
A1, A2, A3, . . . , An
The size of the list is 'n'. If the list size is zero (0) then the corresponding list is called an empty list.
A list can be implemented in 2 ways:
1. Implementation of list using arrays
2. Implementation of list using Linked list (or) Implementation of list using pointers
1. Implementation of list using arrays:- The following operations can be performed on a list.
1. Create
2. Display
3. Insertion
4. Deletion
5. Updation
6. Searching
7. Sorting
8. Merging
#include<stdio.h>
#include<conio.h>
int n, a[20];
void create();
int insert(int,int);
void display();
void find(int);
int del(int);
void update(int,int);
void count();
int main()
{
int choi,ele,pos,val;
clrscr();
printf("enter no ele in list \n");
scanf("%d",&n);
while(1)
{
clrscr();
printf("\t\t List Adt using arrays \n");
printf("1.create \n");
printf("2.insert \n");
printf("3.display \n");
printf("4.find \n");
printf("5.delete \n");
printf("6.update\n");
printf("7.count\n");
printf("8.exit \n");
printf("enter choice\n");
scanf("%d",&choi);
switch(choi)
{
case 1:create();
break;
case 2:printf("enter position and value\n");
scanf("%d%d",&pos,&val);
n=insert(pos,val);
break;
case 3:display();
break;
case 4:printf("enter value to find\n");
scanf("%d",&val);
find(val);
break;
case 5:printf("enter position to delete\n");
scanf("%d",&pos);
n=del(pos);
break;
case 6:printf("enter position and new value\n");
scanf("%d%d",&pos,&val);
update(pos,val);
break;
case 7:count();
break;
case 8:exit(0);
break;
}
}
}
void create()
{
int i;
for(i=0;i<n;i++)
{
printf("enter ele of list\n");
scanf("%d",&a[i]);
}
}
int insert(int pos,int val)
{
int i;
for(i=n-1;i>=pos;i--)
a[i+1]=a[i];
a[pos]=val;
return n+1;
}
void display()
{
int i;
for(i=0;i<n;i++)
printf("\nele of list are =%d",a[i]);
getch();
}
int del(int pos)
{
int i;
for(i=pos;i<n-1;i++)
{
a[i]=a[i+1];
}
return n-1;
}
void find(int val)
{
int i,flag=0;
for(i=0;i<n;i++)
{
if(val==a[i])
{
flag=1;
break;
}
}
if(flag==1)
printf("\nelement %d is found at position %d",val,i);
else
printf("\nelement %d is not found",val);
getch();
}
void update(int pos,int val)
{
a[pos]=val;
}
void count()
{
printf("no of ele in the list =%d",n);
getch();
}
Advantages of Arrays:-
1. Arrays are simple and easy to implement.
2. Arrays allow random access of elements i.e. an array element can be accessed directly using its index value.
3. Arrays are suitable when the number of elements is already known.
4. Arrays can be used to implement data structures like stacks, queues and so on.
Drawbacks of Arrays:-
1. We must know in advance how many elements are to be stored in the array.
2. Arrays use static memory allocation i.e. memory will be allocated at compilation time. So it is not possible to change the size of the array at run time.
3. Since the array size is fixed, if we allocate more memory than required then memory space will be wasted.
4. The main drawback of arrays is that insertion and deletion operations are expensive.
For example, to insert an element into the array at the beginning position, we need to shift all array elements one position to the right in order to store the new element at the beginning position.
11 22 33 44 55 Let value=100
100 11 22 33 44 55
Similarly, to delete an element from the array at the beginning position, we need to shift all array elements one position to the left.
11 22 33 44 55
22 33 44 55
Because of the above limitations simple arrays are not used to implement lists.
2. Implementation of list using Linked list (or) Implementation of list using pointers:-
Linked list is a collection of nodes which are not necessary to be in adjacent memory
locations.
Each node contains 2 fields:
1. Data field
2. Next field (link field)
Data field:- Data field contains values like 9, 6.8, ‘a’ , “ramu” , 9849984900
Next field:- It contains address of its next node. The last node next field contains NULL
which indicates end of the linked list.
[Diagram: a linked list of three nodes holding 10, 20 and 30; the node holding 10 stores the address 2000 of the node holding 20, which stores the address 3000 of the node holding 30, and the last node's next field is NULL.]
A single linked list is a collection of nodes which are not necessary to be in adjacent memory
locations.
It is called as single linked list because each node contains a single link which points to its
next node.
Operations on a single linked list:
1. Insertion (at the beginning, at the end, at a given position)
2. Deletion (at the beginning, at the end, at a given position)
3. Traversal (display)
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
struct node
{
int info;
struct node *next;
};
struct node *start;
void insend(int ele);
void insbeg(int ele);
void insmiddle(int ele);
void delend();
void delbeg();
void delmiddle();
void traverse();
void main()
{
int ele,choice;
start=NULL;
while(1)
{
clrscr();
printf("\t\t Single List Operations \n");
printf("1.insbeg\n");
printf("2.insend\n");
printf("3.insmiddle\n");
printf("4.delbeg\n");
printf("5.delend\n");
printf("6.traverse\n");
printf("7.delmiddle\n");
printf("8.exit\n");
printf("enter choice\n");
scanf("%d",&choice);
switch(choice)
{
case 1: printf("\n enter ele ");
scanf("%d",&ele);
insbeg(ele);
break;
case 2:printf("\n enter ele ");
scanf("%d",&ele);
insend(ele);
break;
case 3:printf("\n enter ele ");
scanf("%d",&ele);
insmiddle(ele);
break;
case 4:delbeg();
break;
case 5:delend();
break;
case 6:traverse();
break;
case 7:delmiddle();
break;
case 8:exit(0);
}
}
}
void insbeg(int ele)
{
struct node *temp;
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
if(start==NULL)
{
temp->next=NULL;
start=temp;
}
else if(start->next==NULL)
{
temp->next=start;
start=temp;
}
else
{
temp->next=start;
start=temp;
}
}
void insmiddle(int ele)
{
struct node *temp,*p;int pos,count=0;
if(start==NULL)
{
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
temp->next=NULL;
start=temp;
}
else
{
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
printf("\n enter position");
scanf("%d",&pos);
p=start;
while(p!=NULL)
{
if(pos==count)
break;
count=count+1;
p=p->next;
}
temp->next=p->next;
p->next=temp;
}
}
void insend(int ele)
{
struct node *temp,*p;
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
temp->next=NULL;
if(start==NULL)
{
start=temp;
}
else if(start->next==NULL)
{
start->next=temp;
}
else
{
p=start;
while(p->next!=NULL)
{
p=p->next;
}
p->next=temp;
}
}
void delbeg()
{
struct node *temp;
if(start==NULL)
printf("\n list is empty");
else if(start->next==NULL)
{
temp=start;
start=NULL;
free(temp);
}
else
{
temp=start;
start=start->next;
free(temp);
}
}
void delend()
{
struct node *temp,*p;
if(start==NULL)
printf("\n list is empty");
else if(start->next==NULL)
{
temp=start;
start=NULL;
free(temp);
}
else
{
p=start;
while(p->next->next!=NULL)
{
p=p->next;
}
temp=p->next;
p->next=NULL;
free(temp);
}
}
void traverse()
{
struct node *p;
if(start==NULL)
printf("\n list is empty");
else
{
p=start;
printf("Elements of Single Linked List are\n\n");
while(p!=NULL)
{
printf("%d->",p->info);
p=p->next;
}
}
getch();
}
void delmiddle()
{
}
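The listing above leaves delmiddle() empty. A minimal sketch, assuming the position is read inside the function and counted from 0 (position 0 being the first node), could be:
void delmiddle()
{
struct node *temp,*p;
int pos,count=0;
if(start==NULL)
{
printf("\n list is empty");
return;
}
printf("\n enter position");
scanf("%d",&pos);
if(pos==0) /* deleting the first node */
{
temp=start;
start=start->next;
free(temp);
return;
}
p=start;
while(p->next!=NULL && count<pos-1) /* walk to the node just before position pos */
{
p=p->next;
count=count+1;
}
if(p->next==NULL)
printf("\n position not found");
else
{
temp=p->next; /* node at position pos */
p->next=temp->next; /* unlink it from the list */
free(temp);
}
}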
1. Linked list is a dynamic data structure i.e. memory will be allocated at run time. So,
no wastage of memory.
2. It is not necessary to know in advance regarding the number of elements to be stored
in the list.
3. Insertion and deletion operations are easier, as shifting of values is not necessary.
4. Different types of data can be stored in data field of a node.
5. The memory allocation need not be in contiguous locations.
6. Linear data structures such as stacks, queues are easily implemented using linked list.
1. Random access of elements in not possible in single linked list i.e. we can’t access a
particular node directly.
2. Binary search algorithm can’t be implemented on a single linked list.
3. There is no way to go back from one node to its previous node i.e. only forward
traversal is possible.
4. Extra storage space for pointer is required.
5. Reversing single linked list is difficult.
Circular Linked List:- A circular linked list is a single linked list in which the next field of the last node points back to the first node (start) instead of NULL, so the nodes form a circle.
Operations on a circular linked list:
1. Insertion (at the beginning, at the end, at a given position)
2. Deletion (at the beginning, at the end, at a given position)
3. Traversal (display)
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
struct node
{
int info;
struct node *next;
};
struct node *start;
void insend(int ele);
void insbeg(int ele);
void insmiddle(int ele);
void delend();
void delbeg();
void delmiddle();
void traverse();
void main()
{
int ele,choice;
start=NULL;
while(1)
{
clrscr();
printf("\t\t Circular Linked List operations\n");
printf("1.insbeg\n");
printf("2.insend\n");
printf("3.insmiddle\n");
printf("4.delbeg\n");
printf("5.delend\n");
printf("6.traverse\n");
printf("7.delmiddle\n");
printf("8.exit\n");
printf("enter choice\n");
scanf("%d",&choice);
switch(choice)
{
case 1:printf("\n enter ele ");
scanf("%d",&ele);
insbeg(ele);
break;
case 2:printf("\n enter ele ");
scanf("%d",&ele);
insend(ele);
break;
case 3:printf("\n enter ele ");
scanf("%d",&ele);
insmiddle(ele);
break;
case 4:delbeg();
break;
case 5:delend();
break;
case 6:traverse();
break;
case 7:delmiddle();
break;
case 8:exit(0);
}
}
}
void insbeg(int ele)
{
struct node *temp,*p;
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
if(start==NULL)
{
start=temp;
start->next=start;
}
else if(start->next==start)
{
temp->next=start;
start->next=temp;
start=temp;
}
else
{
p=start;
while(p->next!=start)
{
p=p->next;
}
p->next=temp;
temp->next=start;
start=temp;
}
}
void insmiddle(int ele)
{
struct node *temp,*p;int pos,count=0;
if(start==NULL)
{
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
start=temp;
temp->next=start;
}
else
{
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
printf("\n enter position");
scanf("%d",&pos);
p=start;
do
{
if(pos==count)
break;
count=count+1;
p=p->next;
}while(p!=start);
temp->next=p->next;
p->next=temp;
}
}
void insend(int ele)
{
struct node *temp,*p;
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
if(start==NULL)
{
start=temp;
temp->next=start;
}
else if(start->next==start)
{
start->next=temp;
temp->next=start;
}
else
{
p=start;
while(p->next!=start)
{
p=p->next;
}
p->next=temp;
temp->next=start;
}
}
void delbeg()
{
struct node *temp,*p;
if(start==NULL)
printf("\n list is empty");
else if(start->next==start)
{
temp=start;
start=NULL;
free(temp);
}
else
{
p=start;
temp=start;
while(p->next!=start)
{
p=p->next;
}
start=start->next;
p->next=start;
free(temp);
}
}
void delend()
{
struct node *temp,*p;
if(start==NULL)
printf("\n list is empty");
else if(start->next==start)
{
temp=start;
start=NULL;
free(temp);
}
else
{
p=start;
while(p->next->next!=start)
{
p=p->next;
}
temp=p->next;
p->next=start;
free(temp);
}
}
void traverse()
{
struct node *p;
if(start==NULL)
printf("\n list is empty");
else
{
p=start;
do
{
printf("\n elements are %d",p->info);
p=p->next;
}while(p!=start);
}
getch();
}
void delmiddle()
{
struct node *temp,*p;
int pos,count=0;
if(start==NULL)
{
printf("\n list is empty");
}
else if(start->next==start)
{
temp=start;
start=NULL;
free(temp);
}
else if(start->next->next==start)
{
temp=start->next;
free(temp);
start->next=start;
}
else
{
printf("\n enter pos");
scanf("%d",&pos);
p=start;
while(p->next!=start)
{
count=count+1;
if(count==pos)
break;
temp=p;
p=p->next;
}
temp->next=p->next;
free(p);
}
}
Advantages of circular linked list:- It saves time when we have to go from the last node back to the first node; in a single linked list we would have to start again from the start pointer and pass through all the in-between nodes.
Limitations of circular linked list:- It is not easy to reverse circular linked list.
Double Linked List:- In a single linked list only forward traversal is possible i.e. we can traverse from left to right only, whereas in a double linked list both forward traversal (left to right) and backward traversal (right to left) are possible.
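The doubly linked list program below starts inside main(); its preamble is missing from the notes. A sketch of the declarations it appears to rely on (a node with prev, info and next fields, plus prototypes for the menu functions) is:
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
struct node
{
struct node *prev; /* address of the previous node */
int info; /* data field */
struct node *next; /* address of the next node */
};
struct node *start;
void insbeg(int ele);
void insend(int ele);
void insmiddle(int ele);
void delbeg();
void delend();
void delmiddle();
void traverse();
void disbackwards();
void main()
{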
int ele,choice;
start=NULL;
while(1)
{
clrscr();
printf("\t\t Double List operations\n");
printf("1.insbeg\n");
printf("2.insend\n");
printf("3.insmiddle\n");
printf("4.delbeg\n");
printf("5.delend\n");
printf("6.traverse\n");
printf("7.delmiddle\n");
printf("8.displaybackwards\n");
printf("9.exit\n");
printf("enter choice\n");
scanf("%d",&choice);
switch(choice)
{
case 1:printf("\n enter ele ");
scanf("%d",&ele);
insbeg(ele);
break;
case 2:printf("\n enter ele ");
scanf("%d",&ele);
insend(ele);
break;
case 3:printf("\n enter ele ");
scanf("%d",&ele);
insmiddle(ele);
break;
case 4:delbeg();
break;
case 5:delend();
break;
case 6:traverse();
break;
case 7:delmiddle();
break;
case 8:disbackwards();
break;
case 9:exit(0);
}
}
}
void insbeg(int ele)
{
struct node *temp;
}
temp->next=p->next;
temp->prev=p;
temp->next->prev=temp;
p->next=temp;
}
}
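The bodies of insbeg() and insmiddle() are largely lost above (the stray assignments look like the tail of insmiddle()). A minimal sketch of insbeg() for the doubly linked list, consistent with the other functions in this listing, is:
void insbeg(int ele)
{
struct node *temp;
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
temp->prev=NULL; /* the new first node has no predecessor */
temp->next=start;
if(start!=NULL)
start->prev=temp; /* the old first node now points back to the new one */
start=temp;
}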
void insend(int ele)
{
struct node *temp,*p;
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
temp->next=NULL;
if(start==NULL)
{
start=temp;
temp->prev=NULL;
}
else if(start->next==NULL)
{
temp->prev=start;
start->next=temp;
}
else
{
p=start;
while(p->next!=NULL)
{
p=p->next;
}
temp->prev=p;
p->next=temp;
}
}
void delbeg()
{
struct node *temp;
if(start==NULL)
printf("\n list is empty");
else if(start->next==NULL)
{
temp=start;
start=NULL;
free(temp);
}
else
{
temp=start;
start=start->next;
start->prev=NULL;;
free(temp);
}
}
void delend()
{
struct node *temp,*p;
if(start==NULL)
printf("\n list is empty");
else if(start->next==NULL)
{
temp=start;
start=NULL;
free(temp);
}
else
{
p=start;
while(p->next->next!=NULL)
{
p=p->next;
}
temp=p->next;
p->next=NULL;
free(temp);
}
}
void traverse()
{
struct node *p;
if(start==NULL)
printf("\n list is empty");
else
{
p=start;
while(p!=NULL)
{
printf(" %d->",p->info);
p=p->next;
}
}
getch();
}
void disbackwards()
{
struct node *p;
if(start==NULL)
printf("\n list is empty");
else
{
p=start;
while(p->next!=NULL)
{
p=p->next;
}
while(p !=NULL)
{
printf(" %d-->",p->info);
p=p->prev;
}
}
getch();
}
void delmiddle()
{
struct node *temp,*p;
int pos,count=0;
if(start==NULL)
{
printf("\n list is empty");
}
else if(start->next==NULL)
{
temp=start;
start=NULL;
free(temp);
}
else if(start->next->next==NULL)
{
temp=start->next;
free(temp);
start->next=NULL;
}
else
{
printf("\n enter pos");
scanf("%d",&pos);
p=start;
while(p!=NULL)
{
count=count+1;
if(count==pos)
break;
p=p->next;
}
temp=p->prev;
temp->next=p->next;
free(p);
}
}
Advantages of double linked list:-
1. We can traverse in both directions i.e. from left to right as well as from right to left.
2. It is easy to reverse a double linked list.
Polynomial ADT:- Polynomial is a collection of terms where each term contains coefficient
and exponent.
Ex:- 8x^6 + 5x^4 + 2x + 9
Where 8,5,2 and 9 are coefficients while 6,4,1 and 0 are exponents.
We can perform various operations on polynomial such as addition, subtraction,
multiplication and division.
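The polynomial program below also starts in the middle of main(); a sketch of the preamble it appears to assume (a term node with coeff and expo fields, three list heads s1, s2 and s3, and prototypes for the menu functions) is:
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
struct node
{
int coeff; /* coefficient of the term */
int expo; /* exponent of the term */
struct node *next;
};
struct node *s1,*s2,*s3; /* first polynomial, second polynomial, their sum */
void createp1();
void createp2();
void createp3(int c,int e);
void traversep1();
void traversep2();
void traversep3();
void polyadd();
void main()
{
int choice;
s1=s2=s3=NULL;
while(1)
{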
clrscr();
printf("\t\t Polynomial Adt Operations \n");
printf("1.create poly1 \n");
printf("2.create poly2 \n");
printf("3.traverselist1\n");
printf("4.traverselist2\n");
printf("5.polynomial add\n");
printf("6.traverse result list \n");
printf("7.exit\n");
printf("enter choice\n");
scanf("%d",&choice);
switch(choice)
{
case 1:createp1();
break;
case 2:createp2();
break;
case 3:traversep1();
getch();
break;
case 4:traversep2();
getch();
break;
case 5:polyadd();
break;
case 6:traversep3();
getch();
break;
case 7:exit(0);
}
}
}
void createp1()
{
struct node *temp,*p;
char ch;
do
{
temp=(struct node*)malloc(sizeof(struct node));
printf("enter coeff & expo values of a first polynomial term");
scanf("%d%d",&temp->coeff,&temp->expo);
temp->next=NULL;
if(s1==NULL)
s1=temp;
else if(s1->next==NULL)
s1->next=temp;
else
{
p=s1;
while(p->next!=NULL)
p=p->next;
p->next=temp;
}
printf("do you want another term(y/n)\n");
fflush(stdin);
scanf("%c",&ch);
}while(ch=='y');
}
void createp2()
{
struct node *temp,*p;
char ch;
do
{
temp=(struct node*)malloc(sizeof(struct node));
printf("enter coeff & expo values of a second polynomial term\n");
scanf("%d%d",&temp->coeff,&temp->expo);
temp->next=NULL;
if(s2==NULL)
s2=temp;
else if(s2->next==NULL)
s2->next=temp;
else
{
p=s2;
while(p->next!=NULL)
p=p->next;
p->next=temp;
}
printf("do you want another term(y/n)\n");
fflush(stdin);
scanf("%c",&ch);
}while(ch=='y');
}
void traversep1()
{
struct node *p;
if(s1==NULL)
printf("s1 list is empty\n");
else
{
p=s1;
while(p!=NULL)
{
if(p->coeff>0)
printf("+%dx^%d",p->coeff,p->expo);
else
printf("%dx^%d",p->coeff,p->expo);
p=p->next;
}
}
}
void traversep2()
{
struct node *p;
if(s2==NULL)
printf("s2 list is empty\n");
else
{
p=s2;
while(p!=NULL)
{
if(p->coeff>0)
printf("+%dx^%d",p->coeff,p->expo);
else
printf("%dx^%d",p->coeff,p->expo);
p=p->next;
}
}
}
void polyadd()
{
struct node *p1,*p2;
int coef_sum;
p1=s1;
p2=s2;
while(p1!=NULL && p2!=NULL)
{
if(p1->expo==p2->expo)
{
coef_sum=p1->coeff+p2->coeff;
createp3(coef_sum,p1->expo);
p1=p1->next;
p2=p2->next;
}
else if(p1->expo>p2->expo)
{
createp3(p1->coeff,p1->expo);
p1=p1->next;
}
else
{
createp3(p2->coeff,p2->expo);
p2=p2->next;
}
}
if(p1==NULL)
{
while(p2!=NULL)
{
createp3(p2->coeff,p2->expo);
p2=p2->next;
}
}
else if(p2==NULL)
{
while(p1!=NULL)
{
createp3(p1->coeff,p1->expo);
p1=p1->next;
}
}
}
void traversep3()
{
struct node *p;
if(s3==NULL)
printf("s3 list is empty\n");
else
{
p=s3;
while(p!=NULL)
{
if(p->coeff>0)
printf("+%dx^%d",p->coeff,p->expo);
else
printf("%dx^%d",p->coeff,p->expo);
p=p->next;
}
}
}
void createp3(int c,int e)
{
struct node *temp,*p;
temp=(struct node*)malloc(sizeof(struct node));
temp->coeff=c;
temp->expo=e;
temp->next=NULL;
if(s3==NULL)
s3=temp;
else if(s3->next==NULL)
s3->next=temp;
else
{
p=s3;
while(p->next!=NULL)
p=p->next;
p->next=temp;
}
}
STACKS and Queues
Learning Material
Stack is a linear data structure.
Stack can be defined as a collection of homogeneous elements, where insertion and
deletion operations takes place at only one end called TOP.
The insertion operation is termed as PUSH and deletion operation is termed as POP
operation.
The PUSH and POP operations are performed at TOP of the stack.
An element in a stack is termed as ITEM.
The maximum number of elements that stack can accommodate is termed as SIZE of
the stack.
Stack Pointer ( SP ) always points to the top element of the stack.
Stack follows LIFO principle. i.e. Last In First Out i.e. the element which is inserted
last into the stack will be deleted first from the stack.
Diagram of a stack
Representation of stack
There are two ways of representation of a stack.
1. Array representation of a stack.
2. Linked List representation of a stack.
Algorithm Stack_PUSH(item)
Input: item is new item to push into stack
Output: pushing new item into stack at top whenever stack is not full.
1. if(top == ARRAYSIZE-1)
a) print(stack is full, not possible to perform push operation)
2. else
a) top=top+1
b) read item
c) s[top]=item
End
Stack_PUSH
Algorithm Stack_POP( )
Input: Stack with some elements.
Output: item deleted at top most end.
1. if(top ==-1)
a) print(stack is empty not possible to pop)
2. else
a) item=s[top]
b) top=top-1
c) print(deleted item)
End Stack_POP
C Program to implement stack using arrays
#include<stdio.h>
#include<conio.h>
void push(int ele );
void pop( );
void peep( );
void display( );
int stack[10], top= -1, ele;
void main( )
{
int choi, ele;
clrscr( );
while(1)
{
printf(" \t\t Stack ADT Using Arrays\n");
printf("\n1.Push \n2.Pop \n3.Peep \n4.Display \n5.Exit");
printf("\nEnter your choice");
scanf("%d",&choi);
switch(choi)
{
case 1: printf("\nEnter the element to be inserted into the stack");
scanf("%d",&ele);
push(ele );
break;
case 2: pop( );
break;
case 3: peep( );
break;
case 4: display( );
break;
case 5: exit(0);
}
}
getch( );
}
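The push() function is missing from this listing; a minimal version consistent with the globals declared above (the array stack[10] and the index top) would be:
void push(int ele)
{
if(top==9) /* array of 10 elements: valid indices are 0..9 */
printf("\nStack is full");
else
{
top++;
stack[top]=ele; /* place the new element at the new top */
}
}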
void pop( )
{
if( top==-1)
printf("\nStack is empty");
else
{
ele=stack[top];
printf("\nThe deleted element from the stack is %d",ele);
top--;
}
}
void peep( )
{
if( top == -1)
printf("\nStack is empty");
else
{
ele=stack[top];
printf("\nThe top most element of the stack is %d",ele);
}
}
void display( )
{
int i;
if( top==-1)
printf("\nStack is empty");
else
{
printf("\nThe elements of the stack are:\n");
for(i=top;i>=0;i--)
printf("%d\n",stack[i]);
}
}
The array representation of stack allows only fixed size of stack. i.e. static memory
allocation only.
To overcome the static memory allocation problem, linked list representation of stack
is preferred.
In linked list representation of stack, each node has two parts. One is data field is for
the item and link field points to next node.
Full condition is not applicable for Linked List representation of stack. Because
here memory is dynamically allocated.
In linked List representation of stack, top pointer always points to top most node only.
i.e. first node in the list.
}
else
{
temp->next=top;
top=temp;
}
}
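Only the tail of push() appears above. A full sketch, assuming the usual node declaration (struct node with info and next fields) and a global top pointer initialised to NULL, is:
void push(int ele)
{
struct node *temp;
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
if(top==NULL) /* stack is empty: the new node becomes the only node */
{
temp->next=NULL;
top=temp;
}
else /* link the new node in front of the current top */
{
temp->next=top;
top=temp;
}
}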
void pop()
{
struct node *temp;
if(top==NULL)
printf("\n stack is empty");
else if(top->next==NULL)
{
temp=top;
top=NULL;
free(temp);
}
else
{
temp=top;
top=top->next;
free(temp);
}
getch();
}
void display()
{
struct node *p;
if(top==NULL)
printf("\n stack is empty");
else
{
p=top;
while(p!=NULL)
{
printf("\n elements are %d",p->info);
p=p->next;
}
}
getch();
}
An expression can be represented in 3 ways:
i. Infix notation (the operator is written between the operands, e.g. a + b)
ii. Prefix notation (the operator is written before the operands, e.g. + a b)
iii. Postfix notation (the operator is written after the operands, e.g. a b +)
Eg. c = a + b
Applications of stack:-
1. Conversion of infix expression to postfix expression:-
While converting an infix expression to a postfix expression, we must follow the precedence (priority) of the operators.
Operator        Priority
(               0
+  -            1
*  /  %         2
^ or $          3
To convert an infix expression to postfix expression, we can use one stack.
Within the stack, we place only operators and left parenthesis only. So stack used in
conversion of infix expression to postfix expression is called as operator stack.
1. Perform the following steps while reading of infix expression is not over
a) if symbol is left parenthesis then push symbol into stack.
b) if symbol is operand then add symbol to post fix expression.
c) if symbol is operator then check stack is empty or not.
i) if stack is empty then push the operator into stack.
ii) if stack is not empty then check priority of the operators.
(I) if priority of current operator > priority of operator present at top of
stack then push operator into stack.
(II) else if priority of operator present at top of stack >= priority of
current operator then pop the operator present at top of stack and add
popped operator to postfix expression (go to step I)
d) if symbol is right parenthesis then pop every element from the stack up to the corresponding
left parenthesis and add the popped elements to the postfix expression.
2. After completion of reading infix expression, if stack not empty then pop all the items
from stack and then add to post fix expression.
End conversion of infix to postfix
Infix Expression: A+ (B*C-(D/E^F)*G)*H, where ^ is an exponential operator
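Applying the algorithm above, the postfix form of this expression works out to A B C * D E F ^ / G * - H * +.
The C listing below begins with pop(); the top of the program (includes, the operator stack and the priority function pr() used in main()) is missing from the notes. A sketch of what it presumably contained, with pr() matching the priority table above:
#include<stdio.h>
#include<ctype.h>
char s[50]; /* operator stack */
int top=-1;
void push(char ch)
{
s[++top]=ch;
}
int pr(char ch) /* priority of an operator */
{
switch(ch)
{
case '#': return -1; /* bottom-of-stack marker pushed in main() */
case '(': return 0;
case '+':
case '-': return 1;
case '*':
case '/':
case '%': return 2;
case '^':
case '$': return 3;
}
return -1;
}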
char pop()
{ /* Function for POP operation */
return(s[top--]);
}
void main()
{ /* Main Program */
char infx[50],pofx[50],ch,elem;
int i=0,k=0;
printf("\n\nRead the Infix Expression ? ");
scanf("%s",infx);
push('#');
while( (ch=infx[i++]) != '\0')
{
if( ch == '(')
push(ch);
else if(isalnum(ch))
pofx[k++]=ch;
else if( ch == ')')
{
while( s[top] != '(')
pofx[k++]=pop();
elem=pop(); /* Remove ( */
}
else
{ /* Operator */
while( pr(s[top]) >= pr(ch) )
pofx[k++]=pop();
push(ch);
}
} //end of while
while( s[top] != '#') /* Pop from stack till empty */
pofx[k++]=pop();
pofx[k]='\0'; /* Make pofx as valid string */
printf("\n\nGiven Infix Expn: %s Postfix Expn:
%s\n",infx,pofx);
}
Algorithm PostfixExpressionEvaluation:-
1. Read the postfix expression from left to right. If the symbol read is an operand, push it onto the stack. If the symbol read is an operator, pop the top two operands from the stack, apply the operator to them and push the result back onto the stack.
2. Finally, after the whole postfix expression has been read, the stack has only one item. That item is the result of the expression.
End PostfixExpressionEvaluation
Evaluate below postfix expression 234+*6-
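A quick trace with the algorithm above (remembering that the value popped second is the left operand):
push 2, push 3, push 4 -> stack: 2 3 4
'+' : pop 4 and 3, push 3+4=7 -> stack: 2 7
'*' : pop 7 and 2, push 2*7=14 -> stack: 14
push 6 -> stack: 14 6
'-' : pop 6 and 14, push 14-6=8 -> stack: 8
So the result of 234+*6- is 8.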
C Program to evaluate postfix expression:-
#include<stdio.h>
#include<conio.h>
#include<ctype.h>
#include<string.h>
#include<stdlib.h>
int st[100],top=-1;
int cal(char post[]);
void push_item(int);
int pop_item();
void main()
{
char in[50];
int result;
clrscr();
printf("\n \t enter the postfix expression");
gets(in);
result=cal(in);
printf("\n result=%d",result);
getch();
}
void push_item(int it)
{
if(top==99)
printf("stack is overflow\n");
else
{
top++;
st[top]=it;
}
}
int pop_item()
{
int it;
if(top==-1)
printf("stack underflow\n");
else
{
return (st[top--]);
}
}
int cal(char post[])
{
int m,n,x,y,j=0,len;
len=strlen(post);
while(j<len)
{
if(isdigit(post[j]))
{
x=post[j]-'0';
push_item(x);
}
else
{
m=pop_item();
n=pop_item();
switch(post[j])
{
case '+':x=n+m;
break;
case '-':x=n-m; /* n was pushed first, so it is the left operand */
break;
case '*':x=n*m;
break;
case '/':x=n/m;
break;
}
push_item(x);
}
j++;
}
if(top>0)
{
printf("no of operands are more than operators");
exit(0);
}
else
{
y=pop_item();
return(y);
}
}
3.Balancing Symbols or Delimiter matching :-
The objective of this application is to check the symbols such as parenthesis ( ) , braces { }
, brackets [ ] are matched or not.
Thus every left parenthesis, brace and bracket must have its right counterpart.
4. If the reading character is closing symbol and if the stack is empty, then report as
unbalanced expression.
5. If the reading character is closing symbol and if the stack is not empty, then pop the
stack.
6. If the symbol popped is not the corresponding opening symbol, then report as
unbalanced expression.
7. After processing the entire expression and if the stack is not empty then report as
unbalanced expression.
8. After processing the entire expression and if the stack is empty then report as
balanced expression.
#include<stdio.h>
#include<conio.h>
#include<ctype.h>
#include<stdlib.h>
void push(char);
char pop();
char stack[20];
int top=-1;
main()
{
char expr[20],ch;
int i;
clrscr();
printf("\nEnter an expression\n");
gets(expr);
for(i=0;expr[i]!='\0';i++)
{
ch=expr[i];
if(ch=='(' || ch=='{' || ch=='[')
push(ch);
else if(ch==')')
{
if(top==-1)
{
printf("\nUnbalanced expression");
exit(0);
}
else if( (ch=pop())!='(')
{
printf("\nUnbalanced expression");
exit(0);
}
}
else if(ch=='}')
{
if(top==-1)
{
printf("\nUnbalanced expression");
exit(0);
}
else if( (ch=pop())!='{')
{
printf("\nUnbalanced expression");
exit(0);
}
}
else if(ch==']')
{
if(top==-1)
{
printf("\nUnbalanced expression");
exit(0);
}
else if( (ch=pop())!='[')
{
printf("\nUnbalanced expression");
exit(0);
}
}
}
if(top==-1)
printf("\nBalanced expression");
else
printf("\nUnbalanced expression");
getch();
}
void push(char x)
{
top++;
stack[top]=x;
}
char pop()
{
return stack[top--];
}
QUEUES
In the Queue the ENQUEUE (insertion) operation is performed at REAR end and
DEQUEUE (deletion) operation is performed at FRONT end.
Queue follows FIFO principle i.e. First In First Out principle i.e. an element First
inserted into Queue, that element only First deleted from Queue.
Representation of Queue
A Queue can be represented in two ways
1. Using arrays
2. Using Linked List
Queue overflow: Trying to perform ENQUEUE (insertion) operation in full Queue is known
as Queue overflow.
Queue overflow condition is rear = = ARRAYSIZE-1
Queue Underflow: Trying to perform DEQUEUE (deletion) operation on empty Queue is
known as Queue Underflow.
Queue Underflow condition is rear<front
Operation on Queue
1. ENQUEUE : To insert element in to Queue
2. DEQUEUE : To delete element from Queue
Algorithm Enqueue(item)
Input:item is new item insert in to queue at rear end.
Output:Insertion of new item at rear end if queue is not full.
1.if(rear= = ARRAYSIZE-1)
a) print(queue is full, not possible for enqueue operation)
2.else
i) read element
ii)rear++
iii)queue[rear]=ele
End Enqueue
While performing the ENQUEUE operation, two situations can occur.
1. if queue is empty, then newly inserting element becomes first element and last
element in the queue. So Front and Rear points to first element in the list.
2. If Queue is not empty, then newly inserting element is inserted at Rear end.
Algorithm Dequeue( )
Input: Queue with some elements.
Output: Element is deleted from queue at front end if queue is not empty.
1.if( rear<front)
a) print(Queue is empty, not possible for dequeue operation)
2.else
i)ele=queue[front]
ii)print(Deleted element from queue is ele)
iii)front++;
End Dequeue
}
}
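Only the closing braces of the enqueue part of the array-queue program survive above. A sketch of the missing part (the globals and enqueue(); main() would be a menu loop like the one in the stack program) is:
#include<stdio.h>
#include<conio.h>
int queue[10],front=0,rear=-1,ele;
void enqueue(int ele)
{
if(rear==9) /* array of 10 elements: valid indices are 0..9 */
printf("\nQueue is full");
else
{
rear++;
queue[rear]=ele; /* insert the new element at the rear end */
}
}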
void dequeue( )
{
if(rear<front)
printf("\nQueue is empty");
else
{
ele=queue[front];
printf("\nDeleted element from the queue is %d",ele);
front++;
}
}
void display( )
{
int i;
if(rear<front)
printf("\nQueue is underflow");
else
{
printf("\nThe elements of queue are\n");
for(i=front;i<=rear;i++)
printf("%d\t",queue[i]);
}
}
To overcome the static memory allocation problem, Queue can be represented using
Linked List.
In Linked List Representation of Queue, Front always points to First node in the
Linked List and Rear always points to Last node in the Linked List.
#include<stdio.h>
#include<stdlib.h>
struct node
{
int info;
struct node *next;
};
struct node *rear,*front;
void insend(int ele); /* void enqueue(int ele)*/
void delbeg(); /* void dequeue() */
void display();
void main()
{
int ele,choice;
rear=front=NULL;
while(1)
{
clrscr();
printf("\t\t queue list operations\n");
printf("1.enqueue\n");
printf("2.dequeue\n");
printf("3.display\n");
printf("4.exit\n");
printf("enter choice\n");
scanf("%d",&choice);
switch(choice)
{
case 1:printf("\n enter ele ");
scanf("%d",&ele);
insend(ele);
break;
case 2:delbeg();
break;
case 3:display();
break;
case 4:exit(0);
}
}
}
void insend(int ele)
{
struct node *temp,*p;
temp=(struct node*)malloc(sizeof(struct node));
temp->info=ele;
temp->next=NULL;
if(rear==NULL)
{
rear=temp;
rear=front=temp;
}
else if(rear->next==NULL) /* or else */
{
rear->next=temp;
rear=temp;
}
}
void delbeg()
{
struct node *temp;
if(front==NULL)
printf("\n que is empty");
else if(front->next==NULL)
{
temp=front;
front=NULL;
rear=NULL;
free(temp);
}
else
{
temp=front;
front=front->next;
free(temp);
}
}
void display()
{
struct node *p;
if(front==NULL)
printf("\n list is empty");
else
{
p=front;
while(p!=NULL)
{
printf("%d->",p->info);
p=p->next;
}
}
getch();
}
Types of Queues Or Various Queue Structures
1. Linear Queue
2. Circular Queues
3. DEQue
4. Priority Queue
1. Circular Queues:-
A circular Queue is a linear data structure in which all the locations are treated as circular
such that the first location cqueue[0] follows the last location cqueue[SIZE-1].
Circular queues are mainly useful to overcome the drawbacks of linear queue.
10 20 30 40 50
0 1 2 3 4
front rear
After performing 2 Dequeue operations, the queue look likes the below
30 40 50
0 1 2 3 4
front rear
If rear reaches to the end of queue, then it’s not possible to insert an element into the
queue even though there is a space at the beginning of the queue.
Algorithm dequeue()
Step 1. if(rear==-1 && front==-1)
print Circular Queue is underflow
Step 2. else if(front==rear) // only one element is left in the queue
print cq[front]
rear=front=-1
Step 3. else
print cq[front]
front=(front+1)%size;
Circular Queue ADT using Arrays
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#define arraysize 5
int a[arraysize],rear=-1,front=-1;
void enque(int ele);
void deque();
void display();
void main()
{
int ele,choice;
clrscr();
while(1)
{
clrscr();
printf("\t\t circular que operations \n");
printf("1.enque \n2.deque \n3.display \n4.exit\n");
printf("enter choice\n");
scanf("%d",&choice);
switch(choice)
{
case 1:printf("enter ele \n");
scanf("%d",&ele);
enque(ele);
break;
case 2:deque();
break;
case 3:display();
break;
case 4:exit(0);
break;
}
}
}
void enque(int ele)
{
if(front==(rear+1)%arraysize)
{
printf("\n Circular que is full");
getch();
}
else if(front==-1 && rear==-1)
{
rear++;
front++;
a[rear]=ele;
}
else
{
rear=(rear+1)%arraysize;
a[rear]=ele;
}
}
void deque()
{
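/* The rest of this listing is missing from the notes. A sketch of deque()
and display(), consistent with enque() above (front and rear start at -1
and wrap around using % arraysize), is: */
if(front==-1 && rear==-1)
printf("\n Circular que is empty");
else if(front==rear) /* only one element is left */
{
printf("\n deleted ele is %d",a[front]);
front=rear=-1;
}
else
{
printf("\n deleted ele is %d",a[front]);
front=(front+1)%arraysize;
}
getch();
}
void display()
{
int i;
if(front==-1 && rear==-1)
printf("\n Circular que is empty");
else
{
i=front;
while(i!=rear) /* walk from front towards rear, wrapping around */
{
printf("%d\t",a[i]);
i=(i+1)%arraysize;
}
printf("%d\t",a[rear]); /* finally print the element at rear */
}
getch();
}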
DEQue (Double Ended Queue):- DEQue is short for Double Ended Queue. In a DEQue, insertions and deletions can be performed at both ends (Front and Rear).
[Diagram: a DEQue structure, with insertion and deletion possible at both the Front and the Rear end.]
Here the DEQue structure is a general representation of stack and Queue. In other words, a DEQue can be used as a stack and as a Queue.
There are 2 types of DEQue:
1. Input restricted DEQue:- Here DEQue allows insertion at one end (say REAR end) only, but allows deletion at both ends.
[Diagram: Input restricted DEQue.]
2. Output restricted DEQue:- Here DEQue allows deletion at one end (say FRONT end) only, but allows insertion at both ends.
[Diagram: Output restricted DEQue.]
UNIT –III
Syllabus:
Searching: Linear and Binary Search.
Sorting: Bubble sort, Insertion Sort, Heap Sort, Merge Sort & Quick Sort.
Learning Material
Searching:
It is a process of verifying whether the searching element is available in the given set
of elements or not.
Types of Searching techniques are:
1. Linear Search
2. Binary Search
1. Linear Search:-
In linear search, search process starts from starting index of array i.e. 0th index
of array and end’s with ending index of array i.e. (n-1)th index. Here searching is
done in Linear fashion (Sequential fashion).
}
if(flag==1)
printf(“\nKey element is found”);
else
printf(“\nKey element is not found”);
getch();
}
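Only the tail of the linear search program survives above; a complete sketch consistent with it (and with the binary search program that follows) is:
#include<stdio.h>
#include<conio.h>
main()
{
int n,a[10],i,key,flag=0;
clrscr();
printf("\nEnter size of the array");
scanf("%d",&n);
printf("\nEnter elements of the array");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
printf("\nEnter key element");
scanf("%d",&key);
for(i=0;i<n;i++) /* compare the key with every element in turn */
{
if(key==a[i])
{
flag=1;
break;
}
}
if(flag==1)
printf("\nKey element is found");
else
printf("\nKey element is not found");
getch();
}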
Analysis of linear search:-
1. Best case time complexity:- In linear search, the best case will occur if the key element is the first element of the array.
Best case time complexity T(n)=O(1)
2. Worst case time complexity:- In linear search, the worst case will occur if the key element is the last element of the array.
Worst case time complexity T(n)=O(n)
3. Average case time complexity:- In linear search, the average case will occur if the key element is somewhere in the middle of the array.
Average case time complexity T(n)=O(n)
2. Binary Search:-
The input to binary search must be in ascending order i.e. set of elements be in
ascending order.
Searching process in Binary search as follows:
First, key element is compared with middle element of array.
If key element is less than the middle element of array, then search in LEFT
part. So update high value therefore high=mid-1.
If key element is greater than middle element of array, then search in RIGHT
part. So update low value therefore low=mid+1.
Algorithm binarysearch:-
1. low=0, high=n-1, flag=0
2. while(low<=high) repeat steps 3 and 4
3. mid=(low+high)/2
4. a) if(key==a[mid])
i) flag=1
ii) break
b) else if(key<a[mid])
i) high = mid - 1
c) else if(key>a[mid])
i) low= mid + 1
5.end loop
6.if(flag==1)
a) print(key element is found)
else
a)print(key element is not found)
End binarysearch
#include<stdio.h>
#include<conio.h>
main( )
{
int n,a[10],i,key,low,mid,high,flag;
clrscr();
printf(“\nEnter size of the array”);
scanf(“%d”,&n);
printf(“\nEnter elements of the array”);
for(i=0;i<n;i++)
scanf(“%d”,&a[i]);
printf(“\nEnter key element”);
scanf(“%d”,&key);
flag=0;
low=0;
high=n-1;
while(low<=high)
{
mid=(low+high)/2;
if(key==a[mid])
{
flag=1;
break;
}
else if(key<a[mid])
high=mid-1;
else
low=mid+1;
}
if(flag==1)
printf(“key element is found”);
else
printf(“key element is not found”);
getch();
}
1. Best Case time complexity:- In binary search, the best case will occur if the key element is the array's middle element.
So, Best case time complexity T(n)=O(1)
2. Worst case time complexity:- In binary search, the worst case will occur if the key element is either the first or the last element.
T(n)=T(n/2) + 1
(the given array is divided into 2 halves, and 1 comparison is needed to compare the key element with the array's middle element)
Expanding the recurrence k times: T(n)=T(n/2^k)+k
Let 2^k=n, so k=log n. Then
T(n)=T(n/n)+log n
=T(1)+log n
=1+log n
So, worst case time complexity=O(log n)
12 15 23 34 56 56 74 78 89 92
0 1 2 3 4 5 6 7 8 9
low high
low<=high
0 <= 9 condition is true
mid=(low+high)/2 =>(0+9)/2 =>9/2 =>4 (int/int=int)
so, mid=4
Compare key element (89) with array’s middle element (a[4] i.e. 56)
89>56 , so low=mid+1
low=4+1
=5
12 15 23 34 56 56 74 78 89 92
0 1 2 3 4 5 6 7 8 9
low high
low<=high
5 <= 9 condition is true
mid=(low+high)/2 =>(5+9)/2 =>14/2 =>7
so, mid=7
compare key element(89) with array’s middle element(a[7] i.e. 78)
89>78 , so low=mid+1
low=7+1
=8
12 15 23 34 56 56 74 78 89 92
0 1 2 3 4 5 6 7 8 9
low
high
low<=high
8 <= 9 condition is true
mid=(low+high)/2 =>(8+9)/2 =>17/2 =>8
so, mid=8
compare key element(89) with array’s middle element(a[8] i.e. 89)
89= =89, so key element is found.
Sorting:- Sorting is the process of arranging the elements in either ascending or descending order. The following sorting techniques are discussed:
1. Bubble sort
2. Insertion sort
3. Merge sort
4. Quick sort
5. Heap sort
1. Bubble sort:
In bubble sort, in each iteration we compare adjacent elements i.e. ith index
element will be compared with (i+1)th index element, if they are not in
ascending order, then swap them.
After first iteration the biggest element is moved to the last position.
After second iteration the next biggest element is moved to next last but one
position.
In bubble sort for sorting n elements, we require (n-1) passes (or) iterations
Process:
1. In pass1, a[0] and a[1] are compared, then a[1] is compared with a[2], then
a[2] is compared with a[3] and so on. Finally a[n-2] is compared with a[n-1].
Pass1 involves (n-1) comparisons and places the biggest element at the
highest index of the array.
2. In pass2, a[0] and a[1] are compared, then a[1] is compared with a[2], then
a[2] is compared with a[3] and so on. Finally a[n-3] is compared with a[n-2].
Pass2 involves (n-2) comparisons and places the next biggest element at the
next highest index of the array.
3. In pass (n-1), a[0] and a[1] are compared. After this step all the elements of
the array are arranged in ascending order.
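The notes give no listing for bubble sort; a minimal sketch in the same style as the other sorting programs in this unit would be:
#include<stdio.h>
int a[10],n;
void bubblesort();
void main()
{
int i;
printf("enter size\n");
scanf("%d",&n);
printf("enter elements to sort\n");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
bubblesort();
for(i=0;i<n;i++)
printf("%d ",a[i]);
}
void bubblesort()
{
int i,j,temp;
for(i=0;i<n-1;i++) /* (n-1) passes */
{
for(j=0;j<n-1-i;j++) /* compare adjacent elements */
{
if(a[j]>a[j+1]) /* not in ascending order: swap them */
{
temp=a[j];
a[j]=a[j+1];
a[j+1]=temp;
}
}
}
}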
2. Insertion sort
1. Step 1: The second element of an array is compared with the elements that
appears before it (only first element in this case). If the second element is
smaller than first element, second element is inserted in the position of first
element. After first step, first two elements of an array will be sorted.
2. Step 2: The third element of an array is compared with the elements that
appears before it (first and second element). If third element is smaller than
first element, it is inserted in the position of first element. If third element is
larger than first element but, smaller than second element, it is inserted in the
position of second element. If third element is larger than both the elements, it
is kept in the position as it is. After second step, first three elements of an
array will be sorted.
#include<stdio.h>
#include<conio.h>
int a[10],n;
void insertionsort();
void main()
{
int i;
printf("enter size\n");
scanf("%d",&n);
printf("enter elements to sort\n");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
insertionsort();
}
void insertionsort()
{
int i,j,temp;
for(i=1;i<n;i++)
{
temp=a[i];
for(j=i-1;a[j]>temp && j>=0;j--)
{
a[j+1]=a[j];
}
a[j+1]=temp;
}
for(i=0;i<n;i++)
printf("%d ",a[i]);
}
1.Best case time complexity:- The best case will occurs when the elements are in
ascending order.
Ex:- 10 20 30 40 50
Iteration          No. of comparisons
1st 1
2nd 1
3rd 1
4th 1
.
.
.
.
Last iteration 1
T(n)=1+1+1+1+------+1
=n-1
T(n)=O(n)
2. Worst case time complexity:-The worst case will occur if the elements are in
descending order.
Iteration          No. of comparisons
1st                n-1
2nd                n-2
3rd                n-3
4th                n-4
.
.
Last iteration     1
Worst case time complexity T(n)=(n-1)+(n-2)+(n-3)+-----------+1
T(n)=n(n-1)/2
T(n)=O(n²)
3. Merge sort:- Merge sort works on the principle of Divide and Conquer
technique.
Any Divide and Conquer algorithm is implemented using 3 steps.
1.Divide
2.Conquer
3.Combine or merge
Merging algorithm:-
The basic merging algorithm takes 2 input arrays A and B and an
output array C, 3 variables i,j,k which are initially set to the beginning of their
respective arrays.
The smallest of A and B is copied to the array C and the appropriate variables are
incremented.
When either input list is completed, then the remaining elements of other list are
copied to the array C.
1.Divide:- The given array is divided into 2 sub arrays.
List 1:- 24 13 26 1
List 2:- 2 27 28 15
2.Conquer:- Each sub array is sorted recursively, so that the first sub array and
second sub array are in sorted order.
List 1:- 1 13 24 26
List 2:- 2 15 27 28
3.Combine or merge:- Merge the solutions of 2 sub arrays. After merging elements
are
1 2 13 15 24 26 27 28
C Program to implement Merge Sort:-
#include<stdio.h>
#include<conio.h>
int a[20],n;
void merge(int i,int j,int p,int q);
void mergesort(int l,int h);
void main()
{
int i;
clrscr();
printf("enter no array of elements to sort\n");
scanf("%d",&n);
printf("enter array of elements to sort\n");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
mergesort(0,n-1);
printf("elements after mergesort \n");
for(i=0;i<n;i++)
printf("%d ",a[i]);
getch();
}
void mergesort(int low,int high)
{
int mid;
if(low<high)
{
mid=(low+high)/2;
mergesort(low,mid);
mergesort(mid+1,high);
merge(low,mid,mid+1,high);
}
}
void merge(int i,int j,int p,int q)
{
int c[100],k=i,r=i;
while((i<=j) && (p<=q))
{
if(a[i]<a[p])
c[k++]=a[i++];
else
c[k++]=a[p++];
}
while(i<=j)
{
c[k++]=a[i++];
}
while(p<=q)
{
c[k++]=a[p++];
}
for(i=r;i<=q;i++)
a[i]=c[i];
}
Analysis of Merge Sort:- If n=1, then we can get the solution directly.
So, T(n)=1 if n=1
If n>1, then the given array is divided into 2 parts where each array contains n/2
elements and we need minimum ‘n’ comparisons for merging.
T(n)=2T(n/2) + n if n>1
Expanding the recurrence k times gives T(n)=2^k T(n/2^k)+k*n
Let 2^k=n
Apply log on both sides, then log 2^k=log n
k log 2=log n
k*1=log n
k=log n
So T(n)=2^k T(n/2^k)+k*n
=n*T(n/n)+n log n
=n*T(1)+n log n
=n+n log n
So, the time complexity of Merge sort in all cases is O(n log n)
4. Quick Sort :- Quick sort is implemented based on the principle of Divide and
Conquer algorithm.
Divide and conquer algorithm is implemented using 3 steps.
1.Divide 2.Conquer 3.Combine or merge
1.Divide:- Select the first element of an array as pivot element. Divide the given
array 2 parts such that the left part contains the elements which are less than the
pivot element and right part contains the elements which are greater than the pivot
element. This arrangement process is called as Partitioning.
2.Conquer:- The left part and right part will be sorted recursively, so that the
entire array will be in sorted order.
Process:
1. Initialize pivot element as first element in the array to be sort.
2. Initialize i as starting index of the array to be sort, j as ending index of the
array to be sort.
3. Do the following steps while i < j
1. Repeatedly move the i to right (i.e. increase the i value) while ( a[i] <=
pivot) i.e. all the elements to Left of pivot element are Less than pivot
element.
2. Repeatedly move the j to left (i.e. decrease the j value) while (a[j] > pivot)
i.e. all the elements to Right of pivot element are Greater than pivot
element.
3. If i < j, then swap a[i] and a[j]
4. If i<j is false, then j becomes pivot element position so swap a[j] and a[low].
How to select Pivot element in Quick Sort:-
Quick sort is implemented by selecting an element of an array as pivot element.
There are 3 approaches to select the pivot element.
1. Selecting 1st element as pivot element
2. Selecting a random element as pivot element
3. Selecting Median of 3 elements as pivot element
1. Selecting 1st element as pivot element:- The popular approach is to select first
element as pivot element. This is acceptable if the input is random but if the input is
either in ascending or descending order then it provides a poor partition.
#include<stdio.h>
#include<conio.h>
int a[20],n;
int partion(int l,int h);
void quicksort(int low,int high);
void main()
{
int i;
clrscr();
printf("enter no array of elements to sort\n");
scanf("%d",&n);
printf("enter array of elements to sort\n");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
quicksort(0,n-1);
printf("elements after quicksort \n");
for(i=0;i<n;i++)
printf("%d ",a[i]);
getch();
}
void quicksort(int low,int high)
{
int j;
if(low<high)
{
j=partion(low,high);
quicksort(low,j-1);
quicksort(j+1,high);
}
}
int partion(int l,int h)
{
int i=l,j=h;
int pivot,temp;
pivot=a[i];
while(i<j)
{
while(a[i]<=pivot)
i++;
while(a[j]>pivot)
j--;
if(i<j)
{
temp=a[i];
a[i]=a[j];
a[j]=temp;
}
}
temp=a[j];
a[j]=pivot;
a[l]=temp;
return j;
}
Analysis of Quick Sort:-
1.Best Case Time Complexity:- In Quick sort, the best case will occur when the pivot divides the given array into exactly 2 halves at every step.
In this case the recurrence is the same as for merge sort:
T(n)=2T(n/2)+n
Expanding it k times gives T(n)=2^k T(n/2^k)+k*n
Let 2^k=n
Apply log on both sides, then log 2^k=log n
k log 2=log n
k*1=log n
k=log n
So T(n)=n*T(1)+n log n
=n+n log n
So, the best case time complexity of Quick sort =O(n log n)
3.Worst Case time complexity:- The worst case will occurs when pivot element is
either minimum element or maximum element.
Let ‘i’ be the pivot element (smallest element of the array) and assumes that i index
location is 1.
T(n)=T(i-1)+T(n-i)+n
=T(1-1)+T(n-1)+n
=T(0)+T(n-1)+n
T(n)=T(n-1)+n
T(n-1)=T(n-2)+n-1
T(n-2)=T(n-3)+n-2
.
.
.
.
Adding these equations, T(n)=1+2+3+4+ . . . . +n = n(n+1)/2, so the worst case time complexity of Quick sort = O(n²)
[Table: comparison of the time complexities of the above sorting algorithms.]
Syllabus:
Learning Material
Tree is mainly used to represent information level by level.
[Diagram: a sample tree with root A and subtrees T1, T2 and T3; B, C and D are internal nodes and E, F, G, H, I are nodes at the lower levels.]
A sample Tree
Basic Terminology:
1. Node: Every element of tree is called as a node. It stores the actual data and links to other
nodes.
[Diagram: a general tree node with a Data field and Link fields to its children; a binary tree node has a Data field with two links, LC (left child) and RC (right child).]
5. Root Node: Which is a specially designated node, and does not have any parent node.
[Diagram: a tree with root A at level 0; B and C at level 1; D, E, F, G at level 2; and H, I, J, K, L at the lower levels. This tree is used in the definitions below.]
6. Leaf node or terminal node: The node which does not have any child nodes
is called leaf node. In the above diagram node H, I, E, J, K, L and G are Leaf
nodes.
7. Level: It is the rank of the hierarchy and the Root node is present at level 0. If a node is
present at level l then its parent node will be at the level l-1and child nodes will present at
level l+1.
8. Siblings: The nodes which have same parent node are called as siblings.
In the above example nodes B and C are siblings, nodes D and E are siblings, nodes F and G
are siblings, nodes H and I are siblings.
10. Degree of a tree: The maximum degree of a node is called as degree of tree.
15. Height of a tree:- It is the length of the longest path from the root node to a leaf.
16. Depth of a node:- It is the length of the path from the root to that node. Depth of the root node is 0.
17. Depth of a tree:- It is the length of the longest path from the root node to a leaf (the same as the height of the tree).
18. Ancestor and Descendent node:- If there is a path from node A to node B then A is called
as Ancestor of B and B is called as descendent of A.
[Diagrams: example binary trees; in the second one every level is completely filled, with B, C at level 1, D, E, F, G at level 2 and H to O at level 3.]
2. Complete Binary Tree: A complete binary tree is a binary tree in which every level
except possibly the last level is completely filled and all nodes are from left to right.
Eg: [Diagram: a complete binary tree with B, C at level 1; D, E, F, G at level 2; and the last level (H, I, J, K, L) filled from left to right.]
3. Left Skewed binary tree: A binary tree in which each node is having only left sub trees is called as left skewed binary tree.
Eg: [Diagram: 10 with left child 20, which in turn has left child 30.]
4.Right Skewed binary tree: A binary tree in which each node is having only right sub trees is
called as right skewed binary tree.
Eg: [Diagram: 10 with right child 20, which in turn has right child 30.]
Representation of a Binary Tree:- A binary tree can be represented using arrays or using linked lists.
1. Array representation:-
[Diagram: a binary tree with root 10; children 20 and 30; 40, 50, 60, 70 at the next level; and 80, 90 as the children of 40.]
Index:  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14
Value:  10  20  30  40  50  60  70  80  90
Array representation of above binary tree. The root is stored at index 0 and, for the node at index i, its left child is stored at index 2i+1 and its right child at index 2i+2.
2. Linked list representation:- Each node contains three fields: a link to the left child, the Data, and a link to the right child.
With linked list representation, if one knows the address of ROOT node, then any
other node can be accessed.
Tree Traversals
It is the process of visiting the nodes of a tree exactly once.
A Binary tree can be traversed in 3 ways.
1. Preorder traversal
2. Inorder traversal
3. Postorder traversal
Ex:- A
B C
D E F G
1. Preorder traversal :-
Here first ROOT node is visited, then LEFT sub tree is visited in Preorder fashion and then
RIGHT sub tree is visited in Preorder fashion.
i.e. ROOT, LEFT, RIGHT
Function:-
2. Inorder traversal:-
Here first LEFT sub tree is visited in Inorder fashion, then ROOT node is visited and then
RIGHT sub tree is visited in Inorder fashion
i.e. LEFT, ROOT, RIGHT
Function:-
3. Postorder traversal:-
Here first LEFT sub tree is visited in postorder fashion, then RIGHT sub tree is visited in
postorder fashion and then ROOT node is visited.
i.e. LEFT, RIGHT, ROOT
Eg:
Function:-
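The three Function:- listings above are left blank in the notes. Recursive sketches, written against the struct bt (with key, left and right fields) used in the binary tree program at the end of these notes, are:
void preorder(struct bt *temp)
{
if(temp!=NULL)
{
printf(" %d",temp->key); /* ROOT */
preorder(temp->left); /* LEFT */
preorder(temp->right); /* RIGHT */
}
}
void inorder(struct bt *temp)
{
if(temp!=NULL)
{
inorder(temp->left); /* LEFT */
printf(" %d",temp->key); /* ROOT */
inorder(temp->right); /* RIGHT */
}
}
void postorder(struct bt *temp)
{
if(temp!=NULL)
{
postorder(temp->left); /* LEFT */
postorder(temp->right); /* RIGHT */
printf(" %d",temp->key); /* ROOT */
}
}
For the example tree above (root A with children B, C and grandchildren D, E, F, G) the traversal orders are: preorder A B D E C F G, inorder D B E A F C G, postorder D E B F G C A.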
Expression trees:- Expression tree is a binary tree in which each internal node represents an
operator and each leaf node represents operand.
Expression trees are constructed based on a postfix expression. Ex:- consider the postfix expression a b + c d e + * * (the steps below build its expression tree).
Since the first two symbols are operands, one-node trees are created and pointers are pushed to
them onto a stack.
The next symbol is a '+'. It pops the top two pointers to the trees, a new tree is formed, and a
pointer to it is pushed onto to the stack.
Next, c, d, and e are read. A one-node tree is created for each and a pointer to the corresponding
tree is pushed onto the stack.
Next '+' is read, pops the top 2 elements and a new tree is formed with + as parent node.
Now, a '*' is read. The last two tree pointers are popped and a new tree is formed with a '*' as the
root.
Finally, the last symbol is read. The two trees are merged and a pointer to the final tree remains
on the stack.
Binary Search Trees (BST):-
A Binary Search Tree is a binary tree in which the left sub tree contains the values which are less
than the parent node and the right sub tree contains the values which are greater than the parent
node.
To make the searching process faster, we uses binary search tree.
Binary Search Tree is a combination of binary tree and binary search.
1. Insertion:- This operation is used to insert an element into a Binary Search Tree.
(i) Create a new node and copy the given element into its data field.
(ii) Make the new node's left and right fields NULL i.e. new->left=new->right=NULL.
(iii) If the tree is empty, the new node becomes the root node; otherwise compare the element with the data in the current node and move to the left sub tree (if it is smaller) or to the right sub tree (if it is greater) until an empty position is found, and link the new node there.
Ex:-Construct binary search tree for the elements 20, 23, 13, 9, 14, 19, 21, 27 and 24.
20
Insert 23 into the given Binary Search Tree. 23 > 20 (data in root). So, 23 needs to be inserted in
the right sub-tree of 20.
20
\
23
Insert 13 into the given Binary Search Tree. 13 < 20(data in root). So, 13 needs to be inserted in
left sub-tree of 20.
20
/ \
13 23
Insert 9 into the given Binary Search Tree.
20
/ \
13 23
/
9
20
/ \
13 23
/ \
9 14
20
/ \
13 23
/ \
9 14
\
19
Inserting 21 into the given Binary Search Tree.
20
/ \
13 23
/ \ /
9 14 21
\
19
Inserting 27 into the given Binary Search Tree.
20
/ \
13 23
/ \ / \
9 14 21 27
\
19
Inserting 24 into the given Binary Search Tree.
20
/ \
13 23
/ \ / \
9 14 21 27
\ /
19 24
2.Find min:-
This operation returns the address of the smallest element of the tree when the tree is not empty.
To find the minimum element, we start at root node and we go left as long as there is a left child.
The stopping element is the smallest element of the tree.
Function:-
3.Find max:-
This operation returns the address of the biggest element of the tree when the tree is not empty.
To find the maximum element, we start at root node and we go right as long as there is a right
child.
The stopping element is the biggest element of the tree.
Function:-
}
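The Function:- listings for find min and find max are truncated above (only a stray closing brace survives). Sketches using the same struct node (with data, left and right fields) as the search and delete functions below:
struct node * find_min(struct node *root)
{
if(root==NULL)
return NULL;
while(root->left!=NULL) /* keep going left as long as there is a left child */
root=root->left;
return root; /* the stopping node holds the smallest value */
}
struct node * find_max(struct node *root)
{
if(root==NULL)
return NULL;
while(root->right!=NULL) /* keep going right as long as there is a right child */
root=root->right;
return root; /* the stopping node holds the biggest value */
}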
4.Search:-
Searching for a node is similar to inserting a node. We start from root, and then go left or
right until we find (or not find the node). A recursive definition of search is as follows. If the root
is equal to NULL, then we return root. If the root is equal to key, then we return root.
Otherwise we recursively solve the problem for left sub tree or right, depending on key.
Function:-
struct node * search(struct node *root,int key)
{
if(root==NULL)
return root;
else if(key==root->data)
return root;
else if(key<root->data)
return search(root->left,key);
else
return search(root->right,key);
}
5.Delete:-
There are three different cases that needs to be considered for deleting a node from binary search
tree.
Case 1: Delete a leaf node:- This case is quite simple. Set the corresponding link of the parent node to NULL and free the deleted node.
[Diagram: a BST with root 65; children 19 and 81; 15, 28, 72, 94 at the next level; 25, 29, 96 as leaf nodes.]
Deletion of node 29
[Diagram: the same BST with the leaf node 29 removed and its parent's link set to NULL.]
Case 2: Delete a node having one child:- If the node to be deleted is a left child of its parent, then we connect the left pointer of the parent (of the deleted node) to the single child. Otherwise, if the node to be deleted is a right child of its parent, then we connect the right pointer of the parent (of the deleted node) to the single child.
[Diagram: the same BST; node 94 has the single child 96.]
Deletion of node 94
[Diagram: after deletion, 81's right pointer is connected directly to 96.]
Case 3: Delete a node having two children:- Replace the data of the node to be deleted with the minimum value of its right sub tree (its in-order successor), and then delete that minimum node from the right sub tree.
[Diagram: the same BST; node 19 has two children, 15 and 28 (and 28 has children 25 and 29).]
Deletion of node 19
[Diagram: after deletion, 25 (the minimum of 19's right sub tree) takes 19's place, and 28 is left with only 29 as its child.]
Function:-
struct node *delete(struct node *root,int ele)
{
struct node *temp;
if(root==NULL)
return root;
else if(ele<root->data) /*if the node to be deleted <root key then it lies in left sub tree*/
root->left=delete(root->left,ele);
else if(ele>root->data)/*If the node to be deleted is > root then it lies in right sub tree*/
root->right=delete(root->right,ele);
else /*If key is same as root key then this is the node to be deleted*/
{
if(root->left==NULL) /* node with only one child or no child */
{
temp=root;
root=root->right;
free(temp);
}
else if(root->right==NULL)
{
temp=root;
root=root->left;
free(temp);
}
else /*node with 2 Childs*/
{
temp=find_min(root->right); /* find min value of right sub tree */
root->data=temp->data; /*replace root node with min value */
root->right=delete(root->right,temp->data); /*remove the node */
}
}
return root;
}
Ex:- Construct BST by inserting 25,10,12,15,39,64,53, and then delete 15, 10, and
insert 45, 23, and 30.
UNIT – V
Graphs
Graph:- A graph G=(V, E) consists of a set of vertices V and a set of edges E. Vertices are nothing but the nodes of the graph, whereas edges are nothing but the connections between the nodes.
Eg:-
E={ (V1, V2), (V1, V3), (V1, V4), (V2, V3), (V3, V4)}
Applications of Graph:-
1.Digraph (or) Directed graph:- It is a graph such that G=(V, E) where V is a set of vertices and E is a
set of edges with direction.
Eg:-
E={ (V1, V2), (V1, V3), (V2, V3), (V3, V4), (V4, V1)}
2.Undirected graph:- It is a graph such that G=(V, E) where V is a set of vertices and E is set of edges
with no direction.
Eg:-
E={ (V1, V2), (V1, V3), (V1, V4), (V2, V3), (V3, V4)}
3.Weighted graph:- It is a graph in which all the edges are labelled with some weights.
Eg:-
4. Loop (or) self loop:- If there is an edge whose starting and ending vertices are the same i.e. (Vi, Vi), then that edge is called a loop (or self loop).
Eg:-
Eg:-
6.Acyclic Graph:-A graph that does not have cycle is called Acyclic graph.
7.Parallel edges:- If there is more than one edge between the same pair of vertices, then they are
known as parallel edges.
Eg:-
8. Connected Graph:- If there is a path between every pair of distinct vertices then the graph is
called a Connected Graph.
Eg:-
(Figure: a connected graph on the four vertices A, B, C and D.)
Representation of a Graph:- A graph can be represented in 3 ways.
1. Set representation
2. Adjacency matrix representation
3. Adjacency list representation (or) Linked list representation
1. Set representation:- This is the straight forward method used for representing a graph.
In this method 2 sets are maintained V (Set of Vertices) and E (Set of Edges).
Eg1:-
E={ (V1, V2), (V1, V4), (V2, V3), (V2, V4), (V3, V4)}
Eg2:-
E={ (V1, V2), (V1, V4), (V2, V3), (V3, V1), (V4, V3)}
2. Adjacency matrix representation:- This representation uses a square matrix of order n*n, where n is the number of vertices in the graph.
Consider a graph ‘G’ with ‘n’ number of vertices and adjacency matrix “adj”.
If there is an edge present between vertices Vi and Vj then adj[i][j]=1 otherwise adj[i][j]=0.
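For illustration (this matrix is mine, not from the original notes), the undirected graph of Eg1 above, with vertices V1..V4 and edges (V1,V2), (V1,V4), (V2,V3), (V2,V4), (V3,V4), would be stored as:
/* adj[i][j] = 1 if there is an edge between vertex Vi+1 and vertex Vj+1 */
int adj[4][4] = {
/*        V1 V2 V3 V4 */
/* V1 */ { 0, 1, 0, 1 },
/* V2 */ { 1, 0, 1, 1 },
/* V3 */ { 0, 1, 0, 1 },
/* V4 */ { 1, 1, 1, 0 }
};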
3. Adjacency list representation (or) Linked list representation:- We use the linked list representation
of a graph when the number of vertices of the graph is not known in advance.
Eg:- (The adjacency-list diagrams for the two example graphs are not reproduced here.)
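As an illustrative sketch (the type and variable names are mine), the linked representation can be declared in C as:
struct adjnode                /* one node in a vertex's adjacency list */
{
int vertex;                   /* index of the adjacent vertex */
struct adjnode *next;         /* next adjacent vertex, or NULL */
};
struct adjnode *adjlist[4];   /* adjlist[i] is the head of the list of vertices adjacent to vertex i */
Because each list grows node by node, new edges (and, with a dynamically grown array, new vertices) can be added without knowing the final size of the graph in advance.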
Graph searching techniques:- There are 2 searching techniques: Breadth First Search (BFS) and Depth First Search (DFS).
1. Breadth First Search (BFS):- BFS visits the vertices of the graph level by level; any vertex at level 'i' will be
visited only after visiting all the vertices present at the preceding level i-1.
Algorithm:-
1. Take any vertex as starting vertex, ENQUEUE that vertex and set its status as visited.
2. While the queue is not empty
a) DEQUEUE a vertex V from the queue and print it.
b) ENQUEUE all the unvisited adjacent vertices of V into the queue and set their status as visited.
#include<stdio.h>
#include<conio.h>
#define MAX 6
void bfs(int adj[MAX][MAX],int visited[],int start);
main()
{
int visited[MAX]={0};
int adj[MAX][MAX],i,j;
for(i=0;i<MAX;i++)
for(j=0;j<MAX;j++)
scanf("%d",&adj[i][j]);
bfs(adj,visited,0);
getch();
}
void bfs(int adj[MAX][MAX],int visited[],int start)
{
int queue[MAX],rear=-1,front=-1,i;
queue[++rear]=start;          /* enqueue the starting vertex */
visited[start]=1;
while(rear!=front)
{
start=queue[++front];         /* dequeue a vertex */
printf("%c\t",start+65);      /* print it as A, B, C, ... */
for(i=0;i<MAX;i++)
if(adj[start][i]==1 && visited[i]==0)   /* unvisited adjacent vertex */
{
queue[++rear]=i;              /* enqueue it */
visited[i]=1;
}
}
}
The basic principle of DFS is quite simple: starting from a vertex, go as deep as possible along each branch, visiting vertices, and backtrack when no unvisited neighbour remains.
Algorithm:-
1. Take any vertex as starting vertex, push that vertex onto a stack and set its status as visited.
2. While stack is not empty
a) pop a vertex V from stack and print it.
b) push all the unvisited neighbours of vertex onto a stack and set their status as visited.
#include<stdio.h>
#include<conio.h>
#define MAX 4
void dfs(int adj[MAX][MAX],int visited[],int start);
main()
{
int visited[MAX]={0};
int adj[MAX][MAX],i,j;
for(i=0;i<MAX;i++)
for(j=0;j<MAX;j++)
scanf("%d",&adj[i][j]);
dfs(adj,visited,0);
getch();
}
void dfs(int adj[MAX][MAX],int visited[], int start)
{
int stack[MAX],top=-1,i;
stack[++top]=start;           /* push the starting vertex */
visited[start]=1;
while(top!=-1)
{
start=stack[top--];           /* pop a vertex and print it */
printf("%c\t",start+65);
for(i=0;i<MAX;i++)
if(adj[start][i]==1 && visited[i]==0)   /* push unvisited neighbours */
{
stack[++top]=i;
visited[i]=1;
}
}
}
/* Binary Tree & its traversals */
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
struct bt
{
struct bt *left;
int key;
struct bt *right;
}*root=NULL,*new1,*temp;
void create();
void insert(struct bt *,struct bt *);
void inorder(struct bt*);
void preorder(struct bt*);
void postorder(struct bt*);
void create()
{
int ele;
new1=(struct bt*)malloc(sizeof(struct bt));
new1->left=NULL;
new1->right=NULL;
printf("enter data ");
scanf("%d",&ele);
new1->key=ele;
if(root==NULL)
root=new1;
else
insert(root,new1);
}
void insert(struct bt *root,struct bt *new1)
{
char ch;
printf("enter l toinsert left child or r for right child\n");
fflush(stdin);
scanf("%c",&ch);
switch(ch)
{
case 'l':
if(root->left==NULL)
root->left=new1;
else
insert(root->left,new1);
break;
case 'r':if(root->right==NULL)
root->right=new1;
else
insert(root->right,new1);
break;
}
}
void inorder(struct bt *temp)
{
if(temp!=NULL)
{
inorder(temp->left);
printf(" %d",temp->key);
inorder(temp->right);
}
}
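The preorder routine is declared and called from main below, but its listing is missing at this point in the notes; a minimal version mirroring the inorder routine above would be:
void preorder(struct bt *temp)
{
if(temp!=NULL)
{
printf(" %d",temp->key);   /* visit the root first */
preorder(temp->left);      /* then the left subtree */
preorder(temp->right);     /* then the right subtree */
}
}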
void postorder(struct bt *temp)
{
if(temp!=NULL)
{
postorder(temp->left);
postorder(temp->right);
printf(" %d",temp->key);
}
}
void main()
{
int choice,ele;
while(1)
{
clrscr();
printf("\t \t binary Tree");
printf("\n 1.create & insert");
printf("\n 2.ioorder traversal");
printf("\n 3.preorder traversal");
printf("\n 4.post order traversal");
printf("\n 5 exit");
printf("\nenter choic");
scanf("%d",&choice);
switch(choice)
{
case 1:create();
break;
case 2:inorder(root);
getch();
break;
case 3:preorder(root);
getch();
break;
case 4:postorder(root);
getch();
break;
case 5:exit(0);
}
}
}
/*Binary Search Tree Program */
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
struct bst
{
struct bst *left;
int key;
struct bst *right;
}*root=NULL,*new1,*temp,*parent;
void create();
void display();
struct bst* find(int );
void findmax();
struct bst* findmin(struct bst *q);
void insert(struct bst *,struct bst *);
void inorder(struct bst*);
void delete1(struct bst *,int);
void create()
{
int ele;
new1=(struct bst*)malloc(sizeof(struct bst));
new1->left=NULL;
new1->right=NULL;
printf("enter data ");
scanf("%d",&ele);
new1->key=ele;
if(root==NULL)
root=new1;
else
insert(root,new1);
}
void insert(struct bst *root,struct bst *new1)
{
if(new1->key<root->key)
{
if(root->left==NULL)
root->left=new1;
else
insert(root->left,new1);
}
else if(new1->key>root->key)
{
if(root->right==NULL)
root->right=new1;
else
insert(root->right,new1);
}
}
void display()
{
if(root==NULL)
printf("\n trre is not creayed");
else
inorder(root);
}
void inorder(struct bst *temp)
{
if(temp!=NULL)
{
inorder(temp->left);
printf(" %d",temp->key);
inorder(temp->right);
}
}
struct bst* find(int x)
{
int found=0;
temp=root;
if(temp==NULL)
{
printf("\n element not found");
return NULL;
}
else
{
while(temp!=NULL)
{
if(temp->key==x)
{
found=1;
break;
}
else if(x<temp->key)
{
parent=temp;
temp=temp->left;
}
else
{
parent=temp;
temp=temp->right;
}
}
if(found==1)
{
printf("node present in bst with key=%d",temp->key);
}
else
printf("node is not present with key=%d",x);
return temp;
}
}
void findmax()
{
temp=root;
while(temp->right!=NULL)
{
temp=temp->right;
}
printf("\n maxium element of bst is%d ",temp->key);
getch();
}
struct bst * findmin(struct bst *p)
{
struct bst *par=p;   /* parent of the minimum node (p itself if p has no left child) */
temp=p;
while(temp->left!=NULL)
{
par=temp;
temp=temp->left;
}
printf("\n minium element of bst is%d ",temp->key);
getch();
return par;
}
void delete1(struct bst *temp,int x)
{
struct bst *p,*q;
if(temp==NULL)
{
printf("bst is empty \n");
return;
}
/* search for the node to be deleted, remembering its parent */
parent=NULL;
while(temp!=NULL && temp->key!=x)
{
parent=temp;
if(x<temp->key)
temp=temp->left;
else
temp=temp->right;
}
if(temp==NULL)
{
printf("node is not present with key=%d",x);
return;
}
if(temp->left!=NULL && temp->right!=NULL)   /* node with two children */
{
/* copy the minimum key of the right sub tree into this node,
   then delete that minimum node instead */
parent=temp;
q=temp->right;
while(q->left!=NULL)
{
parent=q;
q=q->left;
}
temp->key=q->key;
temp=q;              /* q has no left child, so the cases below handle it */
}
/* node with one child or no child */
p=(temp->left!=NULL)?temp->left:temp->right;
if(parent==NULL)     /* the root itself is being deleted */
root=p;
else if(parent->left==temp)
parent->left=p;
else
parent->right=p;
free(temp);
}
void main()
{
int choice,ele;
while(1)
{
clrscr();
printf("\t \t binary search tree");
printf("\n 1.create & insert");
printf("\n 2. display tree");
printf("\n 3.find");
printf("\n 4.find max");
printf("\n 5 find min");
printf("\n 6.delete ");
printf("\n 7.exit");
printf("\nenter choic");
scanf("%d",&choice);
switch(choice)
{
case 1:create();
break;
case 2:display();getch();
break;
case 3:printf("\n enter ele to search");
scanf("%d",&ele);
temp=find(ele);
getch();
break;
case 4:findmax();
break;
case 5:findmin(root);
break;
case 6: printf("\n enter ele to delete");
scanf("%d",&ele);
delete1(root,ele);
break;
case 7:exit(0);
}
}
}
AVL Tree Example:
• Insert 14, 17, 11, 7, 53, 4, 13 into an empty AVL tree
• Now insert 12
• Now the AVL tree is balanced.
• Now insert 8
• Now the AVL tree is balanced.
• Now remove 53 (the tree becomes unbalanced, then is rebalanced)
• Balanced! Remove 11
• Remove 11, replace it with the largest in its left branch
• Remove 8, unbalanced
• Balanced!!
(The tree diagram after each step is not reproduced here.)
In Class Exercises
• Build an AVL tree with the following values:
15, 20, 24, 10, 13, 7, 30, 36, 25
• Remove 24 and 20 from the AVL tree.
(The worked tree diagrams are not reproduced here.)
Trees 4: AVL Trees
• Section 4.4
Motivation
• When building a binary search tree, what type of trees would we like?
Example: 3, 5, 8, 20, 18, 13, 22
(Figure: the skewed tree obtained by inserting the keys in that order versus a balanced tree on the same keys.)
Motivation
• A complete binary tree is hard to maintain when we allow dynamic insert and remove.
– We want a tree that has the following properties
• Tree height = O(log(N))
• allows dynamic insert and remove with O(log(N)) time complexity.
– The AVL tree is one such tree.
AVL (Adelson-Velskii and Landis) Trees
• An AVL Tree is a binary search tree such that for every internal node v of T, the heights of the children of v can differ by at most 1.
(Figure: an example AVL tree with the height of each subtree written next to each node.)
Which is an AVL Tree?
Height of an AVL tree
• Theorem: The height of an AVL tree storing n keys is O(log
n).
• Proof:
– Let us bound n(h), the minimum number of internal nodes of an
AVL tree of height h.
– We easily see that n(0) = 1 and n(1) = 2
– For h >= 2, an AVL tree of height h contains the root node, one AVL
subtree of height h-1 and another of height h-2 (at worst).
– That is, n(h) >= 1 + n(h-1) + n(h-2)
– Knowing n(h-1) > n(h-2), we get n(h) > 2n(h-2). So
n(h) > 2n(h-2), n(h) > 4n(h-4), n(h) > 8n(h-6), … (by induction),
n(h) > 2^i * n(h-2i)
– Solving the base case we get: n(h) > 2^(h/2 - 1)
– Taking logarithms: h < 2log n(h) + 2
– Since n >= n(h), h < 2log(n) + 2 and the height of an AVL tree is
O(log n)
AVL Tree Insert and Remove
• Do binary search tree insert and remove
• The balance condition can be violated
sometimes
– Do something to fix it : rotations
– After rotations, the balance of the whole tree is
maintained
8
Balance Condition Violation
• If condition violated after a node insertion
– Which nodes do we need to rotate?
– Only nodes on path from insertion point to root may have their
balance altered
• Rebalance the tree through rotation at the deepest node with
balance violated
– The entire tree will be rebalanced
• Violation cases at node k (deepest node)
1. An insertion into left subtree of left child of k
2. An insertion into right subtree of left child of k
3. An insertion into left subtree of right child of k
4. An insertion into right subtree of right child of k
– Cases 1 and 4 equivalent
• Single rotation to rebalance
– Cases 2 and 3 equivalent
• Double rotation to rebalance
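The notes list the four violation cases above but do not include C routines for the rotations; the sketch below is illustrative only (the avlnode structure and the height bookkeeping are my assumptions, not taken from the notes):
struct avlnode { int key; int height; struct avlnode *left, *right; };

int height(struct avlnode *t) { return (t == NULL) ? -1 : t->height; }
int max2(int a, int b)        { return (a > b) ? a : b; }

/* Case 1: single rotation with the left child (k2 is the deepest unbalanced node). */
struct avlnode *rotate_with_left(struct avlnode *k2)
{
    struct avlnode *k1 = k2->left;
    k2->left  = k1->right;
    k1->right = k2;
    k2->height = max2(height(k2->left), height(k2->right)) + 1;
    k1->height = max2(height(k1->left), k2->height) + 1;
    return k1;                       /* k1 becomes the new root of this subtree */
}

/* Case 4 is the mirror image: single rotation with the right child. */
struct avlnode *rotate_with_right(struct avlnode *k1)
{
    struct avlnode *k2 = k1->right;
    k1->right = k2->left;
    k2->left  = k1;
    k1->height = max2(height(k1->left), height(k1->right)) + 1;
    k2->height = max2(k1->height, height(k2->right)) + 1;
    return k2;
}

/* Case 2: double rotation = first rotate the left child with its right child,
   then rotate the unbalanced node with its new left child. Case 3 is the mirror image. */
struct avlnode *double_with_left(struct avlnode *k3)
{
    k3->left = rotate_with_right(k3->left);
    return rotate_with_left(k3);
}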
AVL Trees Complexity
• Overhead
– Extra space for maintaining height information at each node
– Insertion and deletion become more complicated, but still O(log N)
• Advantage
– Worst case O(log(N)) for insert, delete, and search
Single Rotation (Case 1)
(The generic single-rotation figure is not reproduced here.)
Example
• After inserting 6
– Balance condition at node 8 is violated
(The before/after rotation figures are not reproduced here.)
Example
• Inserting 3, 2, 1, and then 4 to 7 sequentially into an empty AVL tree
• Inserting 1 triggers a single rotation: 2 becomes the root, with children 1 and 3.
• Inserting 4 keeps the tree balanced; inserting 5 triggers a rotation at node 3, making 4 the parent of 3 and 5.
• Inserting 6 triggers a rotation at the root: 4 becomes the new root, with subtrees 2 (children 1, 3) and 5 (right child 6).
• Inserting 7 triggers a rotation at node 5: 6 becomes the parent of 5 and 7.
(The step-by-step tree diagrams are not reproduced here.)
Single Rotation Will Not Work for the Other Case
• For case 2
• After single rotation, k1 still not balanced
• Double rotations needed for case 2 and case 3
Double Rotation (Case 2)
(The generic double-rotation figure is not reproduced here.)
Example
• Continuing the previous example by inserting 16 down to 10, and then 8 and 9
• Inserting 16 and 15: a double rotation at node 7 makes 15 the parent of 7 and 16.
• Inserting 14: a double rotation at node 6 makes 7 the right child of the root 4, with children 6 and 15.
(The step-by-step tree diagrams are not reproduced here.)
Double Rotation (Case 2)
Summary
Violation cases at node k (deepest node)
1. An insertion into left subtree of left child of k
2. An insertion into right subtree of left child of k
3. An insertion into left subtree of right child of k
4. An insertion into right subtree of right child of k
Cases 1 and 4 are handled by a single rotation; cases 2 and 3 are handled by a double rotation.
Implementation of AVL Tree
(The slides illustrating the four cases and the two rotations in code are not reproduced here.)
Review Insertion -- Case 1
(Figure: the subtree heights h, h+1, h+2 before the insertion, after the insertion, and after the single rotation.)
Review Insertion -- Case 2
(Figure: the subtree heights before and after the insertion.)
Binary Trees,
Binary Search Trees
Binary Search Trees / Slide 2
Trees
Linear access time of linked lists is prohibitive
Does there exist any simple data structure for
which the running time of most operations (search,
insert, delete) is O(log N)?
Binary Search Trees / Slide 3
Trees
A tree is a collection of nodes
The collection can be empty
(recursive definition) If not empty, a tree consists of
a distinguished node r (the root), and zero or more
nonempty subtrees T1, T2, ...., Tk, each of whose
roots are connected by a directed edge from r
Binary Search Trees / Slide 4
Some Terminologies
Path
Length
number of edges on the path
Depth of a node
length of the unique path from the root to that node
The depth of a tree is equal to the depth of the deepest leaf
Height of a node
length of the longest path from that node to a leaf
all leaves are at height 0
The height of a tree is equal to the height of the root
Ancestor and descendant
Proper ancestor and proper descendant
Binary Search Trees / Slide 6
Binary Trees
A tree in which no node can have more than two children
Tree traversal
Used to print out the data in a tree in a certain
order
Pre-order traversal
Print the data at the root
Recursively print out all data in the left subtree
Recursively print out all data in the right subtree
Binary Search Trees / Slide 10
Inorder traversal
left, node, right.
infix expression
a+b*c+d*e+f*g
Binary Search Trees / Slide 12
Preorder
Binary Search Trees / Slide 13
Postorder
Binary Search Trees / Slide 14
Binary Trees
Possible operations on the Binary Tree ADT
parent
left_child, right_child
sibling
root, etc
Implementation
Because a binary tree node has at most two children, we can keep
direct pointers to them
Binary Search Trees / Slide 16
Implementation
Binary Search Trees / Slide 21
Searching BST
If we are searching for 15, then we are done.
If we are searching for a key < 15, then we
should search in the left subtree.
If we are searching for a key > 15, then we
should search in the right subtree.
Binary Search Trees / Slide 22
Binary Search Trees / Slide 23
Searching (Find)
Find X: return a pointer to the node that has key X, or
NULL if there is no such node
Time complexity
O(height of the tree)
Binary Search Trees / Slide 24
findMin/ findMax
Return the node containing the smallest element in
the tree
Start at the root and go left as long as there is a left
child. The stopping point is the smallest element
insert
Proceed down the tree as you would with a find
If X is found, do nothing (or update something)
Otherwise, insert X at the last spot on the path traversed
delete
When we delete a node, we need to consider
how we take care of the children of the
deleted node.
This has to be done such that the property of the
search tree is maintained.
Binary Search Trees / Slide 28
delete
Three cases:
(1) the node is a leaf
Delete it immediately
(2) the node has one child
Adjust a pointer from the parent to bypass that node
Binary Search Trees / Slide 29
delete
(3) the node has 2 children
replace the key of that node with the minimum element at the
right subtree
delete the minimum element
Has either no child or only right child because if it has a left
child, that left child would be smaller and would have been
chosen. So invoke case 1 or 2.
Bubble Sort
• Example: the array 77 42 35 12 101 5 (positions 1 to 6) is to be sorted into 5 12 35 42 77 101.
"Bubbling Up" the Largest Element
• Traverse the array from position 1 to 6, comparing each pair of neighbouring elements and swapping them when they are out of order:
77 42 35 12 101 5
42 77 35 12 101 5 (swap 77 and 42)
42 35 77 12 101 5 (swap 77 and 35)
42 35 12 77 101 5 (swap 77 and 12)
42 35 12 77 101 5 (no need to swap 77 and 101)
42 35 12 77 5 101 (swap 101 and 5)
• After one pass the largest element, 101, has "bubbled up" to the last position.
index <- 1
last_compare_at <- n – 1
loop
exitif(index > last_compare_at)
if(A[index] > A[index + 1]) then
Swap(A[index], A[index + 1])
endif
index <- index + 1
endloop
Repeating the "bubble up" pass on the remaining elements:
after pass 1: 42 35 12 77 5 101
after pass 2: 35 12 42 5 77 101
after pass 3: 12 35 5 42 77 101
after pass 4: 12 5 35 42 77 101
after pass 5: 5 12 35 42 77 101
Reducing the Number of Comparisons
• On the Nth “bubble up”, we only need to
do MAX-N comparisons.
• For example:
– This is the 4th “bubble up”
– MAX is 6
– Thus we have 2 comparisons to do
1 2 3 4 5 6
12 35 5 42 77 101
Putting It All Together
procedure Bubblesort(A, N)      // N is the size of the array
to_do <- N - 1
loop                            // Outer loop: one "bubble up" per iteration
exitif(to_do = 0)
index <- 1
loop                            // Inner loop: one pass of comparisons
exitif(index > to_do)
if(A[index] > A[index + 1]) then
Swap(A[index], A[index + 1])
endif
index <- index + 1
endloop
to_do <- to_do - 1
endloop
endprocedure // Bubblesort
Already Sorted Collections?
• What if the collection was already sorted?
• What if only a few elements were out of place and
after a couple of “bubble ups,” the collection was
sorted?
to_do <- N - 1
did_swap <- true
loop
exitif ((to_do = 0) OR NOT(did_swap))
index <- 1
did_swap <- false
loop
exitif(index > to_do)
if(A[index] > A[index + 1]) then
Swap(A[index], A[index + 1])
did_swap <- true
endif
index <- index + 1
endloop
to_do <- to_do - 1
endloop
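The pseudocode above translates directly into C; the version below is an illustrative sketch (function and variable names are mine, not from the notes):
#include <stdio.h>

/* Bubble sort with the early-exit check described above:
   stop as soon as a complete pass makes no swap. */
void bubble_sort(int a[], int n)
{
    int to_do, index, temp, did_swap;
    for (to_do = n - 1; to_do > 0; to_do--)
    {
        did_swap = 0;
        for (index = 0; index < to_do; index++)
            if (a[index] > a[index + 1])
            {
                temp = a[index];
                a[index] = a[index + 1];
                a[index + 1] = temp;
                did_swap = 1;
            }
        if (!did_swap)            /* already sorted: finish early */
            break;
    }
}

int main(void)
{
    int a[] = {98, 23, 45, 14, 6, 67, 33, 42}, i;
    bubble_sort(a, 8);
    for (i = 0; i < 8; i++)
        printf("%d ", a[i]);      /* prints 6 14 23 33 42 45 67 98 */
    return 0;
}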
An Animated Example
• Sorting the array 98 23 45 14 6 67 33 42 (N = 8) with the improved algorithm:
First "bubble up" (to_do = 7): 98 is swapped past every later element.
23 45 14 6 67 33 42 98 (did_swap = true)
Second "bubble up" (to_do = 6):
23 14 6 45 33 42 67 98 (did_swap = true)
Third "bubble up" (to_do = 5):
14 6 23 33 42 45 67 98 (did_swap = true)
Fourth "bubble up" (to_do = 4):
6 14 23 33 42 45 67 98 (did_swap = true)
Fifth "bubble up" (to_do = 3): no swaps are made (did_swap = false).
Finished "Early"
• We didn't do any swapping, so all of the other elements must be correctly placed.
6 14 23 33 42 45 67 98
Summary
• “Bubble Up” algorithm will move largest
value to its correct location (to the right)
• Repeat “Bubble Up” until all elements are
correctly placed:
– Maximum of N-1 times
– Can finish early if no swapping occurs
• We reduce the number of elements we
compare each time one is correctly placed
Divide and Conquer
• Merge sort is a divide-and-conquer algorithm: divide the array into two halves, sort each half recursively, and then merge the two sorted halves.
Algorithm
Mergesort(Passed an array)
if array size > 1
Divide array in half
Call Mergesort on first half.
Call Mergesort on second half.
Merge two halves.
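A compact C version of this algorithm, given as an illustrative sketch (it uses an auxiliary array tmp; the names are mine, not from the notes):
#include <stdio.h>

/* Merge the two sorted halves a[lo..mid] and a[mid+1..hi] using tmp[]. */
void merge(int a[], int tmp[], int lo, int mid, int hi)
{
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid) tmp[k++] = a[i++];   /* copy what is left of the first half  */
    while (j <= hi)  tmp[k++] = a[j++];   /* copy what is left of the second half */
    for (k = lo; k <= hi; k++)
        a[k] = tmp[k];
}

void merge_sort(int a[], int tmp[], int lo, int hi)
{
    int mid;
    if (lo >= hi) return;                 /* 0 or 1 element: already sorted */
    mid = (lo + hi) / 2;
    merge_sort(a, tmp, lo, mid);          /* sort the first half  */
    merge_sort(a, tmp, mid + 1, hi);      /* sort the second half */
    merge(a, tmp, lo, mid, hi);           /* merge the two halves */
}

int main(void)
{
    int a[] = {98, 23, 45, 14, 6, 67, 33, 42}, tmp[8], i;
    merge_sort(a, tmp, 0, 7);
    for (i = 0; i < 8; i++)
        printf("%d ", a[i]);              /* prints 6 14 23 33 42 45 67 98 */
    return 0;
}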
Merge Sort Example
• Sorting the array 98 23 45 14 6 67 33 42:
Divide: 98 23 45 14 6 67 33 42
-> 98 23 45 14 and 6 67 33 42
-> 98 23 | 45 14 | 6 67 | 33 42
-> single elements 98 | 23 | 45 | 14 | 6 | 67 | 33 | 42
Merge pairs: 23 98 | 14 45 | 6 67 | 33 42
Merge halves: 14 23 45 98 and 6 33 42 67
Final merge: 6 14 23 33 42 45 67 98
Summary
In the best case, the target value is in the first element of the
array.
So the search takes some tiny, and constant, amount of time.
In the worst case, the target value is in the last element of the
array.
So the search takes an amount of time proportional to the
length of the array.
Analysis of Linear Search
Binary Search Example
• Searching for 18: the low, middle and high indices repeatedly narrow the search region until 18 is found.
(The low/middle/high diagrams for each step are not reproduced here.)
Time Complexity of Binary Search
How fast is binary search?
Think about how it operates: after you examine a value, you cut the
search region in half.
So, the first iteration of the loop, your search region is the whole
array.
The second iteration, it’s half the array.
The third iteration, it’s a quarter of the array.
...
The kth iteration, it's (1/2^(k-1)) of the array.
So after about log2(N) iterations the search region shrinks to a single element, and binary search runs in O(log N) time.
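For illustration (not part of the original notes), an iterative C version of binary search over a sorted array:
#include <stdio.h>

/* Return the index of key in the sorted array a[0..n-1], or -1 if it is absent. */
int binary_search(int a[], int n, int key)
{
    int low = 0, high = n - 1, middle;
    while (low <= high)
    {
        middle = low + (high - low) / 2;   /* midpoint of the current search region */
        if (a[middle] == key)
            return middle;
        else if (a[middle] < key)
            low = middle + 1;              /* keep only the right half */
        else
            high = middle - 1;             /* keep only the left half  */
    }
    return -1;
}

int main(void)
{
    int a[] = {5, 12, 18, 35, 42, 77, 101};
    printf("%d\n", binary_search(a, 7, 18));   /* prints 2 */
    return 0;
}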
Hashing
Hash Tables
• We’ll discuss the hash table ADT which supports only a subset of the
operations allowed by binary search trees.
• The implementation of hash tables is called hashing.
• Hashing is a technique used for performing insertions, deletions and finds
in constant average time (i.e. O(1))
• This data structure, however, is not efficient in operations that require any
ordering information among the elements, such as findMin, findMax and
printing the entire table in sorted order.
General Idea
• The ideal hash table structure is merely an array of some fixed size, containing the items.
• A stored item needs to have a data member, called key, that will be used in computing the index value for the item.
– Key could be an integer, a string, etc
– e.g. a name or Id that is a part of a large employee structure
• A hash function maps each key into an index in the range 0 to TableSize-1.
hashVal %= tableSize;
if (hashVal < 0)          /* in case overflow occurs */
hashVal += tableSize;
return hashVal;
}
Hash function for strings:
(Figure: a three-character key with character codes key[0], key[1], key[2], KeySize = 3, hashed into a table of size 10,006.)
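The code fragment above is the tail of a typical string hash function; a complete C sketch of such a function might look as follows (the multiplier 37 is a common choice and is my assumption, not something fixed by these notes):
/* Hash a character string into an index in the range 0 .. tableSize-1. */
int hash(const char *key, int tableSize)
{
    long hashVal = 0;
    int i;
    for (i = 0; key[i] != '\0'; i++)
        hashVal = 37 * hashVal + key[i];   /* mix in each character of the key */
    hashVal %= tableSize;
    if (hashVal < 0)                       /* in case the running sum overflowed */
        hashVal += tableSize;
    return (int) hashVal;
}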
Collision Resolution
• If, when an element is inserted, it hashes to the same value as an
already inserted element, then we have a collision and need to
resolve it.
• There are several methods for dealing with this:
– Separate chaining
– Open addressing
• Linear Probing
• Quadratic Probing
• Double Hashing
Separate Chaining
• The idea is to keep a list of all elements that hash to the same value.
– The array elements are pointers to the first nodes of the lists.
– A new item is inserted to the front of the list.
• Advantages:
– Better space utilization for large items.
– Simple collision handling: searching linked list.
– Overflow: we can store more items than the hash table size.
– Deletion is quick and easy: deletion from the linked list.
Example
Keys: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 (TableSize = 10, hash(x) = x mod 10)
0: 0
1: 81 -> 1
2: (empty)
3: (empty)
4: 64 -> 4
5: 25
6: 36 -> 16
7: (empty)
8: (empty)
9: 49 -> 9
Operations
• Initialization: all entries are set to NULL
• Find:
– locate the cell using hash function.
– sequential search on the linked list in that cell.
• Insertion:
– Locate the cell using hash function.
– (If the item does not exist) insert it as the first item in the list.
• Deletion:
– Locate the cell using hash function.
– Delete the item from the linked list.
Hash Table Class for separate chaining
template <class HashedObj>
class HashTable
{
public:
HashTable( const HashedObj & notFound, int size = 101 );
HashTable( const HashTable & rhs )
: ITEM_NOT_FOUND( rhs.ITEM_NOT_FOUND ), theLists( rhs.theLists ) { }
const HashedObj & find( const HashedObj & x ) const;
void makeEmpty( );
void insert( const HashedObj & x );
void remove( const HashedObj & x );
/* ... private members (the vector of lists and ITEM_NOT_FOUND) omitted ... */
};
Insert routine
/**
* Insert item x into the hash table (do nothing if x is already present).
*/
template <class HashedObj>
void HashTable<HashedObj>::insert( const HashedObj & x )
{
List<HashedObj> & whichList = theLists[ hash( x, theLists.size( ) ) ];
ListItr<HashedObj> itr = whichList.find( x );
if( !itr.isValid() )
whichList.insert( x, whichList.zeroth( ) );
}
Remove routine
/**
* Remove item x from the hash table.
*/
template <class HashedObj>
void HashTable<HashedObj>::remove( const HashedObj & x )
{
theLists[hash(x, theLists.size())].remove( x );
}
Find routine
/**
*Find item x in the hash table.
*Return the matching item or ITEM_NOT_FOUND if not found
*/
template <class HashedObj>
const HashedObj & HashTable<HashedObj>::find( const HashedObj & x ) const
{
ListItr<HashedObj> itr;
itr = theLists[ hash( x, theLists.size( ) ) ].find( x );
if( !itr.isValid( ) )
return ITEM_NOT_FOUND;
else
return itr.retrieve( );
}
Analysis of Separate Chaining
• Collisions are very likely.
– How likely and what is the average length of lists?
• Load factor (λ) definition:
– Ratio of the number of elements (N) in the hash table to the TableSize.
• i.e. λ = N/TableSize
– The average length of a list is also λ.
– For chaining, λ is not bound by 1; it can be > 1.
Cost of searching
• Cost = constant time to evaluate the hash function
+ time to traverse the list.
• Unsuccessful search:
– We have to traverse the entire list, so we need to compare λ nodes on the average.
• Successful search:
– The list contains the one node that stores the searched item + 0 or more other nodes.
– Expected # of other nodes = (N-1)/M, which is essentially λ since M is presumed large.
– On the average, we need to check half of the other nodes while searching for a certain element.
– Thus average search cost = 1 + λ/2.
Summary
• The analysis shows us that the table size is not really
important, but the load factor is.
• TableSize should be as large as the number of expected
elements in the hash table.
– To keep load factor around 1.
• TableSize should be prime for even distribution of keys to
hash table cells.
Hashing: Open Addressing
Collision Resolution with Open
Addressing
• Separate chaining has the disadvantage of using linked
lists.
– Requires the implementation of a second data structure.
• In an open addressing hashing system, all the data go
inside the table.
– Thus, a bigger table is needed.
• Generally the load factor should be below 0.5.
– If a collision occurs, alternative cells are tried until an empty
cell is found.
Open Addressing
• More formally:
– Cells h0(x), h1(x), h2(x), …are tried in succession where
hi(x) = (hash(x) + f(i)) mod TableSize, with f(0) = 0.
– The function f is the collision resolution strategy.
• There are three common collision resolution strategies:
– Linear Probing
– Quadratic probing
– Double hashing
Linear Probing
• In linear probing, collisions are resolved by sequentially
scanning an array (with wraparound) until an empty cell
is found.
– i.e. f is a linear function of i, typically f(i)= i.
• Example:
– Insert items with keys: 89, 18, 49, 58, 9 into an empty hash
table.
– Table size is 10.
– Hash function is hash(x) = x mod 10.
• f(i) = i;
Figure 20.4
Linear probing hash
table after each
insertion
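As a sketch of f(i) = i in C (illustrative only, not code from the notes; EMPTY marks a free cell and deletion is not handled):
#define TABLE_SIZE 10
#define EMPTY (-1)

/* Insert x by probing hash(x), hash(x)+1, hash(x)+2, ... with wraparound.
   Returns the slot used, or -1 if the table is full. */
int insert_linear(int table[], int x)
{
    int i, pos;
    for (i = 0; i < TABLE_SIZE; i++)
    {
        pos = (x % TABLE_SIZE + i) % TABLE_SIZE;
        if (table[pos] == EMPTY)
        {
            table[pos] = x;
            return pos;
        }
    }
    return -1;
}
Inserting 89, 18, 49, 58 and 9 in that order into an initially EMPTY table reproduces the behaviour described above: 49, 58 and 9 collide and are pushed forward past the occupied cells.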
Find and Delete
• The find algorithm follows the same probe sequence as
the insert algorithm.
– A find for 58 would involve 4 probes.
– A find for 19 would involve 5 probes.
• We must use lazy deletion (i.e. marking items as deleted)
– Standard deletion (i.e. physically removing the item) cannot be
performed.
– e.g. remove 89 from hash table.
Clustering Problem
• As long as table is big enough, a free cell can always be
found, but the time to do so can get quite large.
• Worse, even if the table is relatively empty, blocks of
occupied cells start forming.
• This effect is known as primary clustering.
• Any key that hashes into the cluster will require several
attempts to resolve the collision, and then it will add to
the cluster.
Analysis of insertion
• The average number of cells that are examined in an insertion
using linear probing is roughly
(1 + 1/(1 – λ)^2) / 2
• Proof is beyond the scope of text book.
• For a half full table we obtain 2.5 as the average number of cells
examined during an insertion.
• Primary clustering is a problem at high load factors. For half
empty tables the effect is not disastrous.
Analysis of Find
• An unsuccessful search costs the same as insertion.
• The cost of a successful search of X is equal to the cost of inserting
X at the time X was inserted.
• For λ = 0.5 the average cost of insertion is 2.5. The average cost of
finding the newly inserted item will be 2.5 no matter how many
insertions follow.
• Thus the average cost of a successful search is an average of the
insertion costs over all smaller load factors.
Average cost of find
• The average number of cells that are examined in an unsuccessful
search using linear probing is roughly (1 + 1/(1 – λ)^2) / 2.
• The average number of cells that are examined in a successful
search is approximately (1 + 1/(1 – λ)) / 2.
– Derived by averaging the insertion cost over all load factors from 0 up to λ:
(1/λ) ∫ from x = 0 to λ of (1/2)(1 + 1/(1 – x)^2) dx = (1/2)(1 + 1/(1 – λ))
Linear Probing – Analysis -- Example
• What is the average number of probes for a successful search and an
unsuccessful search for this hash table?
– Hash Function: h(x) = x mod 11, TableSize = 11
– Table contents: 0: 9, 1: empty, 2: 2, 3: 13, 4: 25, 5: 24, 6: empty, 7: empty, 8: 30, 9: 20, 10: 10
• Successful Search (cells probed for each key):
– 20: 9 -- 30: 8 -- 2: 2 -- 13: 2,3 -- 25: 3,4 -- 24: 2,3,4,5 -- 10: 10 -- 9: 9,10,0
Avg. Probe for SS = (1+1+1+2+2+4+1+3)/8 = 15/8
• Unsuccessful Search (cells probed starting from each slot):
– We assume that the hash function uniformly distributes the keys.
– 0: 0,1 -- 1: 1 -- 2: 2,3,4,5,6 -- 3: 3,4,5,6 -- 4: 4,5,6 -- 5: 5,6 -- 6: 6 -- 7: 7 -- 8: 8,9,10,0,1 -- 9: 9,10,0,1 -- 10: 10,0,1
Avg. Probe for US = (2+1+5+4+3+2+1+1+5+4+3)/11 = 31/11
Quadratic Probing
• Quadratic Probing eliminates primary clustering problem of linear
probing.
• Collision function is quadratic.
– The popular choice is f(i) = i2.
• If the hash function evaluates to h and a search in cell h is
inconclusive, we try cells h + 1^2, h + 2^2, …, h + i^2.
– i.e. It examines cells 1, 4, 9, and so on away from the original probe.
• Remember that subsequent probe points are a quadratic
number of positions from the original probe point.
Figure 20.6
A quadratic probing hash
table after each insertion
(note that the table size
was poorly chosen
because it is not a prime
number).
Quadratic Probing
• Problem:
– We may not be sure that we will probe all locations in the table (i.e. there is
no guarantee to find an empty cell if table is more than half full.)
– If the hash table size is not prime, this problem will be much more severe.
• However, there is a theorem stating that:
– If the table size is prime and load factor is not larger than 0.5, all probes
will be to different locations and an item can always be inserted.
Theorem
• If quadratic probing is used, and the table size is prime,
then a new element can always be inserted if the table is at
least half empty.
Proof
• Let M be the size of the table and let M be prime. We show that the first M/2 alternative locations are distinct.
• Let two of these locations be h + i^2 and h + j^2 (mod M), where i and j are two probes
s.t. 0 <= i, j <= M/2. Suppose, for the sake of contradiction, that these two locations are the same but i ≠ j. Then
h + i^2 = h + j^2 (mod M), so i^2 = j^2 (mod M)
i^2 - j^2 = 0 (mod M)
(i - j)(i + j) = 0 (mod M)
Because M is prime, either (i - j) or (i + j) must be divisible by M. Neither of these possibilities can occur,
since 0 < |i - j| < M and 0 < i + j < M. Thus we obtain a contradiction.
• It follows that the first M/2 alternative locations are all distinct, and since there are at most M/2 items in
the hash table, it is guaranteed that an insertion must succeed if the table is at least half empty.
Some considerations
• How efficient is calculating the quadratic probes?
– Linear probing is easily implemented.
Quadratic probing appears to require * and % operations.
– However by the use of the following trick, this is overcome:
• H(i) = H(i-1) + 2i – 1 (mod M)
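In code the trick might look like this (an illustrative sketch; currentPos plays the role of H(i-1)):
/* Advance from probe position H(i-1) to H(i) = H(i-1) + 2i - 1 (mod M)
   without using multiplication or the % operator. */
int next_probe(int currentPos, int i, int tableSize)
{
    currentPos += 2 * i - 1;         /* i*i - (i-1)*(i-1) = 2i - 1 */
    if (currentPos >= tableSize)     /* subtraction suffices while 2i - 1 < tableSize */
        currentPos -= tableSize;
    return currentPos;
}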
Some Considerations
• What happens if load factor gets too high?
– Dynamically expand the table as soon as the load factor
reaches 0.5, which is called rehashing.
– Always double to a prime number.
– When expanding the hash table, reinsert all of the existing elements into the new table using the new hash function.
Analysis of Quadratic Probing
• Quadratic probing has not yet been mathematically analyzed.
• Although quadratic probing eliminates primary clustering,
elements that hash to the same location will probe the same
alternative cells. This is known as secondary clustering.
• Techniques that eliminate secondary clustering are available.
– the most popular is double hashing.
Double Hashing
• A second hash function is used to drive the collision resolution.
– f(i) = i * hash2(x)
• We apply a second hash function to x and probe at a distance
hash2(x), 2*hash2(x), … and so on.
• The function hash2(x) must never evaluate to zero.
– e.g. Let hash2(x) = x mod 9 and try to insert 99 in the previous example: hash2(99) = 0, so the probe sequence would never move.
• A function such as hash2(x) = R – ( x mod R) with R a prime
smaller than TableSize will work well.
– e.g. try R = 7 for the previous example: hash2(x) = 7 - (x mod 7).
The relative efficiency of four collision-
resolution methods
Hashing Applications
• Compilers use hash tables to implement the symbol table
(a data structure to keep track of declared variables).
• Game programs use hash tables to keep track of positions
it has encountered (transposition table)
• Online spelling checkers.
Summary
• Hash tables can be used to implement the insert and find operations
in constant average time.
– it depends on the load factor not on the number of items in the table.
• It is important to have a prime TableSize and a correct choice of
load factor and hash function.
• For separate chaining the load factor should be close to 1.
• For open addressing load factor should not exceed
0.5 unless this is completely unavoidable.
– Rehashing can be implemented to grow (or shrink) the table.