Data Structures
Table of Contents
Introduction
Introduction to Data Structures
COMMON OPERATIONS IN A DATA STRUCTURE
CLASSIFICATION OF DATA STRUCTURE
Linear Data Structures
Non-Linear Data Structures
Arrays
What is an Array?
WHY USE ARRAYS OVER A BUNCH OF VARIABLES?
DECLARING A ONE DIMENSIONAL ARRAY
ASSIGNING VALUES WHILE INITIALIZATION
ASSIGNING VALUES AFTER INITIALIZATION
TRAVERSING THE ARRAY
INSERTING AN ELEMENT IN THE ARRAY
DELETING AN ELEMENT FROM THE ARRAY
MULTI-DIMENSIONAL ARRAYS
Two Dimensional Array
Three Dimensional Array
MEMORY ALLOCATION IN ARRAYS
For One Dimensional Array
For Two Dimensional Array
TIME COMPLEXITY OF OPERATIONS
Access
Search
Insertion
Deletion
Space Required
ADVANTAGES OF ARRAYS
DISADVANTAGES OF ARRAYS
Implementation
OPERATIONS IN A LINKED LIST
Creating An Empty List
Adding To The End Of The List
Adding To The Beginning Of The List
Adding To A Specific Position Of The List
Deletion From The End Of The List
Deletion From The Beginning Of The List
Deletion From A Specific Position Of The List
Searching In A Linked List
DOUBLY LINKED LIST
CIRCULAR LINKED LIST
Complexity of operations
Applications of a Linked List
Stacks
OPERATIONS IN A STACK
Push Operation
Pop Operation
Peek Operation
Overflow and Underflow Conditions
About The Top Pointer
CREATING A STACK
Pushing To The Stack
Popping From The Stack
Accessing The Top Element (Peeking)
STACK COMPLEXITY
Access
Search
Insertion
Deletion
Space Required
APPLICATIONS OF STACKS IN PROGRAMMING
Queues
OPERATIONS IN A QUEUE
Enqueue Operation
Dequeue Operation
The Front And Rear Pointer
Overflow and Underflow Conditions
CREATING A QUEUE
Enqueue Operation
Dequeue Operation
VARIATIONS OF A QUEUE
Double-Ended queue (Deque)
Circular Queue (Circular Buffer)
Priority Queue
QUEUE COMPLEXITY
Access
Search
Insertion
Deletion
Space Required
APPLICATIONS OF QUEUES IN PROGRAMMING
Trees
BASIC TERMINOLOGY
TYPES OF A TREE
General Tree
Binary Tree
Binary Search Trees
Multiway Trees
AVL Trees
APPLICATIONS OF TREES IN PROGRAMMING
THE HEAP PROPERTY
MAX HEAP
MIN HEAP
HEAP OPERATIONS
Insertion
Deletion
Finding Maximum/Minimum
APPLICATION IN PROGRAMMING
Graphs
1. UNDIRECTED GRAPHS
2. DIRECTED GRAPHS
BASIC TERMINOLOGY IN A GRAPH
REPRESENTATION OF A GRAPH
Adjacency List Representation
Adjacency Matrix Representation
Graph Traversal Algorithms
APPLICATIONS OF GRAPHS IN PROGRAMMING
APPLICATIONS IN PROGRAMMING
DIFFERENT HASH FUNCTIONS
Division Method
Multiplication Method
Mid-Square Method
COLLISIONS
Open Addressing
Chaining
Introduction
~ Linus Torvalds
Time and energy are both required to process any instruction. Every
CPU cycle that is saved will have an effect on both the time and
energy consumed and can be put to better use in processing other
instructions.
one. The study includes the description, implementation and
quantitative performance analysis of the data structure.
For example, the user of the stack data structure only knows about the
push and pop operations in a stack. They do not care how the push
operation interacts with the memory to store the data. They only
expect it to store the data in the way specified.
Access
Search
Insertion
Deletion
Linear Data Structures
Arrays
Linked Lists
Stacks
Queues
Non-Linear Data Structures
Trees
Heaps
Graphs
Hash Tables
Arrays
What is an Array?
The reason why we use arrays is that every element can be accessed
by its index value. This has several advantages over storing a bunch
of variables.
One can create a variable for each employee in the office. Let’s say
the office has only 3 employees. Fairly easy, right? Just declare 3
variables: emp1_age, emp2_age and emp3_age.
When new recruits come in, we have to sit down and create more
variables. Maintaining a system like this gets tedious: with every
new employee, the code of the whole system has to be modified.
For this example, the array can hold all the ages of the employees
under one name, like employees_age. These are all of
the integer type.
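As a minimal sketch of this idea, the ages can be kept in one array and processed in a loop, so adding an employee changes the data and not the code. The helper name average_age and the sample values in the usage note are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical helper: returns the average of the first `count` ages.
   Returns 0.0 for an empty array to avoid dividing by zero. */
double average_age(const int employees_age[], size_t count)
{
    if (count == 0)
        return 0.0;

    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += employees_age[i];   /* every employee is reached by index */

    return (double)total / (double)count;
}
```

With an array such as int employees_age[] = {25, 30, 35}; the call average_age(employees_age, 3) works unchanged when more ages are appended and the count is updated.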
Data Type: This is the kind of values that the array will store.
This can be characters, integers, floating points or any legal
data type.
Name: The variable name used to identify the array and
interact with it.
Size: The size of the array, which specifies the maximum
number of values that the array will store.
Syntax Used
type name[size];
int marks[100];
There are 2 ways to assign elements to an array:
int ages[10];

// accessing the array without assigning elements first
// (the values printed are indeterminate)
for (int i = 0; i < 10; i++)
    printf("\n ages[%d] = %d", i, ages[i]);
Each element of the array can be accessed using its index. The
indexing in an array generally starts with 0, which means that the
first element is at the 0th index. Subsequently, the last element of
the array would be at the (n-1)th index. This is known as 0-based
indexing.
Other bases can also be used for indexing. This is known as n-based
indexing, where the first element is at index n.
Example: Values are first being assigned and then displayed from
the array.
int id[10];

// assigning values using a loop
for (int i = 0; i < 10; i++) {
    printf("\nEnter an id: ");
    scanf("%d", &id[i]);
}

// displaying the entered ids
for (int i = 0; i < 10; i++) {
    printf("\n id[%d] = %d", i, id[i]);
}
Maintaining the order of an array while inserting or deleting requires
shifting the other elements already present in the array. This is one of
the disadvantages, as such operations can be costly on larger arrays.
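The shifting cost can be seen in a small sketch of insertion at a given position. The function name insert_at, its parameters and its return convention are assumptions for illustration; they are not taken from the listings in this text.

```c
#include <stddef.h>

/* Inserts `num` at index `pos` in an array holding `*n` used elements
   with room for `capacity`. Returns 1 on success, 0 if the array is
   full or `pos` is past the end of the used part. */
int insert_at(int arr[], size_t *n, size_t capacity, size_t pos, int num)
{
    if (*n >= capacity || pos > *n)
        return 0;

    /* Shift elements one step right, starting from the end,
       to open a gap at pos. This loop is the costly part. */
    for (size_t i = *n; i > pos; i--)
        arr[i] = arr[i - 1];

    arr[pos] = num;
    *n = *n + 1;   /* one more position is now in use */
    return 1;
}
```

Inserting at the end (pos equal to *n) shifts nothing, while inserting at index 0 moves every element, which is why the cost grows with the size of the array.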
At the end
    arr[pos] = num;
    n = n + 1; // increase total number of used positions
    display_array(arr);
}
At the end
Multi-Dimensional Arrays
type name[max_size_x][max_size_y];
The max_size_x and max_size_y are the maximum number of elements that
each dimension can hold.
Three Dimensional Array
type name[max_size_x][max_size_y][max_size_z];
A simple formula consisting of the size of the element and the lower
bound is used.
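The formula itself is not reproduced in this text. The usual textbook form, assumed here, is: address of A[k] = base address + w * (k - lower bound), where w is the size of one element in bytes. A small sketch of the offset part, with the hypothetical name element_offset:

```c
#include <stddef.h>

/* Byte offset of element k from the start of the array:
   offset = element_size * (k - lower_bound).
   Assumes k >= lower_bound. */
size_t element_offset(size_t element_size, long k, long lower_bound)
{
    return element_size * (size_t)(k - lower_bound);
}
```

For C arrays the lower bound is 0, so the offset of a[3] in an int array is 3 * sizeof(int) bytes.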
For Two Dimensional Array
In this form, the elements are stored row by row. The n elements of the
first row are stored in the first n locations, the elements of the second
row are stored in the next n locations, and so on.
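The row-by-row layout can be expressed as a position calculation. This is a sketch only; the name row_major_index is hypothetical, and n_cols plays the role of n in the paragraph above.

```c
#include <stddef.h>

/* Position of element (row, col), counted in elements from the start,
   for a matrix stored row by row with n_cols elements per row. */
size_t row_major_index(size_t row, size_t col, size_t n_cols)
{
    /* Skip `row` complete rows, then move `col` places into the row. */
    return row * n_cols + col;
}
```

C stores its own two dimensional arrays this way, so for int m[3][4] the element m[2][1] lies row_major_index(2, 1, 4) * sizeof(int) bytes after the start of m.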
Access
Search
Insertion
Deletion
Space Required
An array only takes the space used to store the elements of the
data type specified. This means that for storing n elements the
space required is O(n).
Advantages of Arrays
Arrays allow for random access of elements. Each element in
the array can be reached directly through its index.
Arrays have good cache locality, which means the speed of
execution of code may be significantly faster in some cases
due to the contiguous way arrays are stored.
Disadvantages of Arrays
Linked List
A linked list is a linear data structure where each element is a
separate object, known as a node. Each node contains some data and
points to the next node in the structure, forming a sequence. The
nodes may be at different memory locations, unlike arrays where all
the elements are stored contiguously.
The linked list can be used to store data similar to arrays but with
several more advantages.
Not fixed in size: A linked list is not fixed in size. The memory
locations to store the nodes are allocated dynamically when
each node is created. There is no wastage of memory for
unused locations. In comparison, an array is defined once with
a specific size, and cannot be extended or shrunk afterwards.
Efficient Insertion and Deletion: Once the position is known, a quick
manipulation of the links between the nodes allows insertion and
deletion in constant time. In contrast, inserting into or deleting
from an array requires shifting the elements that follow so that
they stay in order.
Singly Linked List
This is the most common type of linked list, where each node has one
pointer to the next node in the sequence. This means that the list
can only be traversed from the beginning to the end in one direction.
To access the last element, it is always required to traverse the
whole list to the end.
Implementation
The Node contains 2 parts, one that is the data itself and the other
which references the next node in the sequence. For simplicity, we
will consider a Node where the data is a single integer. The data is
not limited to one value; one can define any number of pieces of
information to be stored in each node.
struct Node
{
    int data;
    struct Node *next;
} *head = NULL;
A new Node is created first with the desired variable name. We will
call this newNode for now.
newNode->data
Similarly, the link to the next Node in the list can be accessed by
using the arrow operator on the next member of the structure.
newNode->next
The head node is used to point to the first node in a linked list. This
is used to keep track of the list beginning and helps during the
traversing operations.
Creating An Empty List
New data can be added to the end of the linked list by creating a
new Node with the data to be used, traversing to the end of the list
and then appending this data to the end.
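The steps described above can be sketched as follows. The block repeats the Node structure so that it is self-contained; the function name addEnd and its return value are assumptions for illustration and are not taken from the original listing.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct Node *head = NULL;   /* an empty list */

/* Hypothetical name: appends value to the end of the list.
   Returns 1 on success, 0 if memory could not be allocated. */
int addEnd(int value)
{
    struct Node *newNode = malloc(sizeof *newNode);
    if (newNode == NULL)
        return 0;

    newNode->data = value;
    newNode->next = NULL;            /* the new node becomes the last node */

    if (head == NULL) {
        head = newNode;              /* first node of an empty list */
    } else {
        struct Node *temp = head;
        while (temp->next != NULL)   /* walk to the current last node */
            temp = temp->next;
        temp->next = newNode;
    }
    return 1;
}
```

Because the list is walked from head every time, appending takes time proportional to the length of the list unless a pointer to the last node is also kept.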
Adding To The Beginning Of The List
    int i = 0;
    struct Node *newNode;
    newNode = (struct Node*)malloc(sizeof(struct Node));
    newNode->data = value;
    if(head == NULL)
    {
        newNode->next = NULL;
        head = newNode;
    }
    else {
        struct Node *temp = head;
        // stop early at the last node so temp is never NULL
        for (i = 0; i < pos - 1 && temp->next != NULL; i++) {
            temp = temp->next;
        }
        newNode->next = temp->next;
        temp->next = newNode;
    }

    printf("\nNode inserted successfully\n");
}
The node at the end of the linked list can be removed by traversing to
the end of the list, unlinking the last node from the node before it
and then freeing it.
void removeEnd()
{
    if(head == NULL)
    {
        printf("\nList is Empty\n");
    }
    else
    {
        struct Node *temp1 = head, *temp2;
        if(head->next == NULL)
            head = NULL;
        else
        {
            while(temp1->next != NULL)
            {
                temp2 = temp1;
                temp1 = temp1->next;
            }
            temp2->next = NULL;
        }
        free(temp1);
        printf("\nNode deleted at the end\n\n");
    }
}
void removeBeginning()
{
    if(head == NULL)
        printf("\n\nList is Empty");
    else
    {
        struct Node *temp = head;
        // also covers a single-node list: head simply becomes NULL
        head = temp->next;
        free(temp);
        printf("\nNode deleted at the beginning\n\n");
    }
}
                    temp2 = temp1;
                    temp1 = temp1->next;
                }
                else {
                    flag = 0;
                    break;
                }
            }
            if (flag) {
                temp2->next = temp1->next;
                free(temp1);
                printf("\nNode deleted\n\n");
            }
            else {
                printf("Position exceeds number of elements in linked list. Please try again");
            }
        }
    }
}
    {
        if (head->data == key)
        {
            printf("The key is found in the list\n");
            return;
        }
        head = head->next;
    }
    printf("The Key is not found in the list\n");
}
Each node in a doubly linked list has 2 pointers, one pointing to the
next node and one to the previous node. This allows for moving in either
direction while traversing the list, which may be useful in certain situations.
A circular linked list is like a regular one except that the last element
of the list points to the first. This has the advantage of allowing
a traversal to return to the first element without starting over.
Complexity of operations:
Access
Insertion
Deletion
3. Polynomials can be represented and manipulated by using linked
lists.
4. It can be used to perform operations on long integers.
Stacks
A stack is a linear data structure that stores data in an order known
as the Last In First Out (LIFO) order. This property is helpful in
certain programming cases where the data needs to be ordered.
Operations in a Stack
Push Operation
This is used to add (or push) an element to the stack. The element
always gets added to the top of the current stack items.
Pop Operation
This is used to remove (or pop) an element from the stack. The
element always gets popped off from the top of the stack.
Peek Operation
The peek operation is used to return the top element of the stack
without removing the element. It is a variation of the pop operation.
if (top == -1) {
    // underflow condition
}
The overflow condition checks if the stack is full (or no more memory is
available) before pushing any element. This prevents any error if
more space cannot be allocated for the next item.
if (top == sizeOfStack - 1) {
    // overflow condition: top already indexes the last slot
}
Creating A Stack
#define SIZE 10

int stack[SIZE];
int top = -1;
Steps
3. If it is NOT FULL, then increment top value by one (top++) and
set stack[top] to value (stack[top] = value).
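The push steps can be sketched as follows, reusing the SIZE, stack and top names from the stack definition. The original push listing is not reproduced in this text, so the function below, including its integer return value, is an assumption for illustration.

```c
#define SIZE 10

int stack[SIZE];
int top = -1;               /* -1 means the stack is empty */

/* Returns 1 if value was pushed, 0 on overflow. */
int push(int value)
{
    if (top == SIZE - 1)    /* the stack is FULL */
        return 0;

    top++;                  /* move to the next free slot */
    stack[top] = value;
    return 1;
}
```

Returning a status instead of printing a message lets the caller decide how to report an overflow.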
Steps
void pop() {
    if(top == -1)
        printf("\nUnderflow. Stack is empty");
    else{
        printf("\nDeleted : %d", stack[top]);
        top--;
    }
}
Steps
1. Check whether stack is EMPTY (top == -1).
2. If it is EMPTY, then terminate the function and throw an
error.
3. If it is NOT EMPTY, then return stack[top].
void peek() {
    if(top == -1)
        printf("\n The stack is empty");
    else
        printf("%d", stack[top]);
}
Stack Complexity
Access
Search
Insertion
Deletion
Space Required
A stack only takes the space used to store the elements of the data
type specified. This means that for storing n elements the space
required is O(n).
Queues
A queue is a linear data structure that stores data in an order known
as the First In First Out (FIFO) order. This property is helpful in
certain programming cases where the data needs to be ordered.
Operations in a Queue
Enqueue Operation
Dequeue Operation
if(front == rear)
    // underflow condition
The overflow condition checks if the queue is full (or no more memory
is available) before enqueueing any element. This prevents any error
if more space cannot be allocated for the next item.
if(rear == SIZE-1)
    // overflow condition
Creating A Queue
#define SIZE 10

int queue[SIZE];
int front = -1, rear = -1;
Enqueue Operation
}
Dequeue Operation
void deQueue() {
    if(front == rear)
        printf("\nUnderflow. Queue is Empty.");
    else{
        printf("\nDeleted item is: %d", queue[front]);
        front++;
        if(front == rear)
            front = rear = -1;
    }
}
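The enqueue listing is not reproduced in full in this text, so the following is a self-contained sketch of an array queue under one consistent convention, stated here as an assumption: front is the index just before the oldest item, rear is the index of the newest item, and front == rear means the queue is empty. The names enqueue and dequeue and their return values are hypothetical.

```c
#define SIZE 10

int queue[SIZE];
int front = -1, rear = -1;   /* front == rear means the queue is empty */

/* Returns 1 on success, 0 on overflow. */
int enqueue(int value)
{
    if (rear == SIZE - 1)    /* no slot left at the back */
        return 0;
    rear++;
    queue[rear] = value;
    return 1;
}

/* Stores the removed item in *out. Returns 1 on success, 0 on underflow. */
int dequeue(int *out)
{
    if (front == rear)       /* nothing to remove */
        return 0;
    front++;                 /* front now indexes the oldest item */
    *out = queue[front];
    if (front == rear)       /* queue became empty: start over at the beginning */
        front = rear = -1;
    return 1;
}
```

Under this convention a queue holding a single item has front one less than rear, so it is never mistaken for an empty queue.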
Variations of a Queue
Double-Ended queue (Deque)
In a standard queue, insertion can only be done from the back and
deletion only from the front. A double-ended queue allows for
insertion and deletion from both ends.
Priority Queue
An element with the highest priority gets processed first. If there
exist two elements with the same priority, then the order in which
the elements were inserted is considered.
Queue Complexity
Access
Search
Insertion
Deletion
Space Required
A queue only takes the space used to store the elements of the data
type specified. This means that for storing n elements, the space
required is O(n).
Applications of Queues in Programming
Trees
A tree is a data structure that simulates a hierarchical tree, with
a root value and the children as the subtrees, represented by a set
of linked nodes. The children of each node could be accessed by
traversing the tree until the specified value is reached.
Basic Terminology
Root: The first node in a tree is called the Root Node. Every tree
must have one Root Node.
Siblings: Nodes which belong to the same Parent are called
Siblings.
Leaf Node: In a tree data structure, the node which does not have a
child is called a Leaf Node. They are also known as External Nodes
or Terminal Nodes.
Internal Nodes: The node which has at least one child is called an
Internal Node.
Depth: The total number of edges from the root node to a particular
node is called the Depth of that Node.
Path: The sequence of Nodes and Edges from one node to another
node is called a Path.
Types of A Tree
1. General trees
2. Binary trees
3. Binary Search trees
4. M-way trees
5. AVL trees
General Tree
Binary Tree
The binary search property states that the key in each node must
be greater than or equal to any key stored in the left sub-tree, and
less than or equal to any key stored in the right sub-tree.
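A minimal sketch of insertion and search that maintain this property is shown below. The structure and function names (TreeNode, bst_insert, bst_contains) are assumptions for illustration. Duplicate keys are sent to the right sub-tree, which is one choice the property allows.

```c
#include <stdlib.h>

struct TreeNode {
    int key;
    struct TreeNode *left, *right;
};

/* Inserts key and returns the root of the resulting sub-tree.
   On allocation failure the tree is returned unchanged. */
struct TreeNode *bst_insert(struct TreeNode *root, int key)
{
    if (root == NULL) {
        struct TreeNode *node = malloc(sizeof *node);
        if (node == NULL)
            return NULL;
        node->key = key;
        node->left = node->right = NULL;
        return node;
    }
    if (key < root->key)
        root->left = bst_insert(root->left, key);    /* smaller keys go left */
    else
        root->right = bst_insert(root->right, key);  /* equal or larger go right */
    return root;
}

/* Returns 1 if key is present, 0 otherwise. */
int bst_contains(const struct TreeNode *root, int key)
{
    while (root != NULL) {
        if (key == root->key)
            return 1;
        /* the property tells us which sub-tree can hold the key */
        root = (key < root->key) ? root->left : root->right;
    }
    return 0;
}
```

Each comparison discards one sub-tree, so a search follows a single path from the root instead of visiting every node.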
Multiway Trees
A multiway tree can have more than one value per node. They are
written as m-way trees where m means the order of the tree. A
multiway tree can have up to m-1 values per node and up to m children.
It is not necessary that every node has m-1 values or m children.
1. B-Trees
It was developed in the year 1972 by Bayer and McCreight. A B-tree
is designed to store sorted data and allows search, insertion, and
deletion operations to be performed in logarithmic running time.
2. B+ Trees
AVL Trees
Heaps
A heap is a complete binary tree that satisfies the heap property.
There are two types of heaps, the max heap and the min heap.
The heap property says that the value of the Parent is either greater
than or equal to (in a max heap) or less than or equal to (in a min
heap) the value of the Child.
Max Heap
In a max heap, the key present at the root is the largest in the heap
and all the values below this are less than or equal to this value.
Min Heap
In a min heap, the key present at the root is the smallest in the
heap and all the values below this are greater than or equal to this value.
Heap Operations
Insertion
If the value is greater (in a max heap) or smaller (in a min heap) than
its parent, it is swapped with its parent. The process is then continued
from the parent node recursively until the heap property is satisfied or
the root node is hit.
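A sketch of insertion into a max heap follows. It assumes the common array layout where the parent of index i is at (i - 1) / 2, and it places the new value at the next free position before moving it up. The name heap_insert and its parameters are hypothetical, and a loop is used in place of recursion.

```c
#include <stddef.h>

/* Inserts value into a max heap stored in heap[0..*n-1].
   Returns 1 on success, 0 if the array is full. */
int heap_insert(int heap[], size_t *n, size_t capacity, int value)
{
    if (*n >= capacity)
        return 0;

    size_t i = *n;
    heap[i] = value;             /* place at the next free leaf */
    *n = *n + 1;

    /* Swap upward while the new value is greater than its parent. */
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (heap[i] <= heap[parent])
            break;               /* heap property is satisfied */
        int tmp = heap[i];
        heap[i] = heap[parent];
        heap[parent] = tmp;
        i = parent;              /* continue from the parent */
    }
    return 1;
}
```

For a min heap the comparison is reversed, so the value moves up while it is smaller than its parent.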
Deletion
An element is always deleted from the root of the heap. But deleting
an element will leave a hole in the heap, which disturbs the
requirement that the heap must be a complete binary tree. To fill
this hole, the last node in the heap is swapped to this place. This
causes the heap to violate the heap property.
1. Replace the root node’s value with the last node’s value.
2. Delete the last node.
3. Sink down the new root node’s value so that the heap again
satisfies the heap property.
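The three steps above can be sketched for a max heap stored in an array, assuming the children of index i are at 2i + 1 and 2i + 2. The name heap_delete_root and its parameters are hypothetical.

```c
#include <stddef.h>

/* Removes the root of a max heap stored in heap[0..*n-1] and writes it
   to *out. Returns 1 on success, 0 if the heap is empty. */
int heap_delete_root(int heap[], size_t *n, int *out)
{
    if (*n == 0)
        return 0;

    *out = heap[0];
    *n = *n - 1;
    heap[0] = heap[*n];          /* steps 1 and 2: last value replaces the root */

    /* Step 3: sink the new root until neither child is larger. */
    size_t i = 0;
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, largest = i;
        if (left < *n && heap[left] > heap[largest])
            largest = left;
        if (right < *n && heap[right] > heap[largest])
            largest = right;
        if (largest == i)
            break;               /* heap property holds again */
        int tmp = heap[i];
        heap[i] = heap[largest];
        heap[largest] = tmp;
        i = largest;
    }
    return 1;
}
```

Calling the function repeatedly returns the stored values from largest to smallest, which is the idea behind heap sort.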
Finding Maximum/Minimum
Finding the node which has maximum or minimum value is easy due to
the heap property and is one of the advantages of using a heap.
Since all the elements below it are smaller (or larger in a min-heap),
it will always be the root node. This can be accessed in constant
time.
Application in Programming
2. Implementing priority queues: As the highest (or lowest)
priority element is always stored at the root of the heap, they
could be accessed quickly.
3. Selection algorithms: A heap allows access to the min or max
element in constant time, and other selections (such as median
or kth-element) can be done in sub-linear time on data that is
in a heap.
4. Graph algorithms: By using heaps as internal traversal data
structures, run times can be reduced by polynomial order.
Graphs
A graph data structure is used to represent relations between pairs
of objects.
1. Undirected Graphs
An undirected graph does not have any direction associated with its
edges. This means that any edge can be traversed in both directions.
2. Directed Graphs
Representation Of A Graph
Adjacency List Representation
An array of lists is used where the size of the array is equal to the
number of vertices. Each element of the array contains a
linked list of all the vertices adjacent to that vertex.
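A minimal sketch of this representation is shown below. The fixed bound MAX_VERTICES and the names AdjNode, adj, add_edge and add_undirected_edge are assumptions for illustration.

```c
#include <stdlib.h>

#define MAX_VERTICES 8          /* assumed upper bound for this sketch */

struct AdjNode {
    int vertex;
    struct AdjNode *next;
};

/* adj[u] is the head of the linked list of vertices adjacent to u. */
struct AdjNode *adj[MAX_VERTICES];

/* Adds a directed edge u -> v at the front of u's list.
   Returns 1 on success, 0 on a bad vertex or allocation failure. */
int add_edge(int u, int v)
{
    if (u < 0 || u >= MAX_VERTICES || v < 0 || v >= MAX_VERTICES)
        return 0;
    struct AdjNode *node = malloc(sizeof *node);
    if (node == NULL)
        return 0;
    node->vertex = v;
    node->next = adj[u];
    adj[u] = node;
    return 1;
}

/* For an undirected graph the edge is stored in both lists. */
int add_undirected_edge(int u, int v)
{
    return add_edge(u, v) && add_edge(v, u);
}
```

The memory used grows with the number of vertices plus the number of edges, which suits graphs with few edges.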
To represent the weights for weighted graphs, the weight of edge
(u, v) is simply stored as the entry in row u and column v of the
adjacency matrix.
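As a small sketch, a weighted directed graph can be held in a two dimensional array. The size N_VERTICES, the function names and the use of 0 to mean "no edge" are assumptions for illustration; the last one only works when real edges never have weight 0.

```c
#define N_VERTICES 4            /* assumed graph size for this sketch */

/* weight[u][v] holds the weight of edge (u, v); 0 means no edge. */
int weight[N_VERTICES][N_VERTICES];

/* Stores the weight w for the edge from u to v. */
void set_edge(int u, int v, int w)
{
    weight[u][v] = w;
}

/* Returns 1 if there is an edge from u to v, 0 otherwise. */
int has_edge(int u, int v)
{
    return weight[u][v] != 0;
}
```

Checking for an edge is a single array lookup, at the cost of storing an entry for every pair of vertices.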
Breadth First Search (BFS)
Applications of Graphs in Programming
Hash Tables
A hash table is a data structure where data is stored in
an associative manner. The data is mapped to array positions by
a hash function that computes an index from each key.
Advantages of Hashing
Hash Functions
The main aim of a hash function is that elements should be uniformly
distributed. It should produce integers spread evenly within some
suitable range in order to reduce the number of collisions.
Uniformity
A good hash function must map the keys as evenly as possible. This
means that the probability of generating every hash value in the
output range should roughly be the same. This also helps in reducing
collisions.
Deterministic
A hash function must always generate the same hash value for a
given input value.
Low Cost
Applications in Programming
Different Hash Functions
Division Method
h(k) = k mod M
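In C the division method is a single remainder operation. The function name hash_division is hypothetical, and the key is assumed to be an unsigned integer.

```c
/* Division method: h(k) = k mod M.
   M is the table size and must not be zero. */
unsigned int hash_division(unsigned int k, unsigned int M)
{
    return k % M;
}
```

For example, with M = 97 the key 1234 maps to position 70. A prime table size is a common recommendation, since it tends to spread keys that share a common factor.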
Multiplication Method
Mid-Square Method
2. The middle r digits of the result are extracted.
3. The extracted digits form the hash value.
The algorithm works well because most or all digits of the key-value
contribute to the resulting hash.
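A sketch of the method using decimal digits is given below. It assumes the usual first step, squaring the key, which is not reproduced in the list above. The name hash_mid_square is hypothetical, and when the middle cannot be centred exactly the extra digit is left on the high-order side.

```c
/* Mid-square method: square the key and keep the middle r decimal digits.
   Assumes k * k fits in unsigned long long and r is less than 20. */
unsigned long long hash_mid_square(unsigned long long k, unsigned int r)
{
    unsigned long long sq = k * k;

    /* Count the decimal digits of the square. */
    unsigned int digits = 1;
    for (unsigned long long t = sq; t >= 10; t /= 10)
        digits++;
    if (r > digits)
        r = digits;

    /* pow_r = 10^r, used to keep exactly r digits. */
    unsigned long long pow_r = 1;
    for (unsigned int i = 0; i < r; i++)
        pow_r *= 10;

    /* Drop the low-order digits that lie to the right of the middle. */
    for (unsigned int i = 0; i < (digits - r) / 2; i++)
        sq /= 10;

    return sq % pow_r;           /* the remaining low r digits are the middle */
}
```

For example, 1234 squared is 1522756, and its middle three digits give the hash 227.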
Collisions
Collisions occur when the hash function maps two different keys to
the same location. Two records cannot be stored in the same
location of a hash table normally.
Open Addressing
2. Quadratic Probing: The interval between the probes increases
quadratically. This means that the next available position that
would be tried would increase quadratically.
3. Double Hashing: The interval between probes is fixed for each
record but is computed by a second hash function.
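The probe positions for the last two schemes can be sketched as small functions. The names and the exact forms used, (h + i*i) mod M and (h + i*step) mod M, are common textbook choices assumed here for illustration, where h is the home position of the key and i counts the attempts starting from 0.

```c
/* Quadratic probing: the distance from the home slot h grows as i*i. */
unsigned int quadratic_probe(unsigned int h, unsigned int i, unsigned int M)
{
    return (h + i * i) % M;
}

/* Double hashing: `step` is the value of a second hash function for the
   key. It must be non-zero so that successive probes actually move. */
unsigned int double_hash_probe(unsigned int h, unsigned int step,
                               unsigned int i, unsigned int M)
{
    return (h + i * step) % M;
}
```

With h = 3 and M = 11, quadratic probing tries positions 3, 4, 7 and 1 in turn, while double hashing with step 5 tries 3, 8 and 2.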
Chaining
This effectively means that each location in the hash table is not
limited to storing one value. Searching for a value in a chained hash
table is as simple as scanning a linked list for an entry with the given
key.
Insertion operation appends the key to the end of the linked list
pointed to by the hashed location.
Deleting a key requires searching the list and removing the element.
This solution, however, presents a problem if the linked list becomes
large enough that it takes O(n) time to search one position. This
occurs if the hash table is too small and has to accommodate many
values.
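The chaining operations described in this section can be sketched as follows. The table size, the division-method slot function and all names (Entry, table, chain_insert, chain_search, chain_delete) are assumptions for illustration.

```c
#include <stdlib.h>

#define TABLE_SIZE 7            /* assumed small table for this sketch */

struct Entry {
    int key;
    struct Entry *next;
};

struct Entry *table[TABLE_SIZE];   /* each location heads a linked list */

static unsigned int slot_of(int key)
{
    return (unsigned int)key % TABLE_SIZE;
}

/* Appends key to the end of the list at its hashed location.
   Returns 1 on success, 0 on allocation failure. */
int chain_insert(int key)
{
    struct Entry *e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    e->key = key;
    e->next = NULL;

    struct Entry **link = &table[slot_of(key)];
    while (*link != NULL)           /* walk to the end of the chain */
        link = &(*link)->next;
    *link = e;
    return 1;
}

/* Scans the list at the hashed location. Returns 1 if key is found. */
int chain_search(int key)
{
    for (struct Entry *e = table[slot_of(key)]; e != NULL; e = e->next)
        if (e->key == key)
            return 1;
    return 0;
}

/* Unlinks and frees the first entry holding key. Returns 1 if removed. */
int chain_delete(int key)
{
    struct Entry **link = &table[slot_of(key)];
    while (*link != NULL) {
        if ((*link)->key == key) {
            struct Entry *dead = *link;
            *link = dead->next;
            free(dead);
            return 1;
        }
        link = &(*link)->next;
    }
    return 0;
}
```

Keys 3 and 10 both hash to location 3 in this table and simply share one list, which also shows how a chain grows when the table is too small for the data.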