Data Structure Notes Update-1

Uploaded by

eschosysbifmet

What is a Data Structure?

As the name suggests, a data structure is a way of organizing data in memory. There are many
ways to do this; one that we have already seen is the array in the C language. An array is a
collection of memory elements in which data is stored sequentially, i.e., one after another.
In other words, an array stores its elements in contiguous memory. There are also other ways
to organize data in memory. Let's look at the different types of data structures.

A data structure is not a programming language like C, C++, or Java. It is a way of organizing
data in memory, together with the operations on that data, that can be implemented in any
programming language.

Many ways of structuring data in memory have been proposed. The rules that specify what
operations such a structure supports, independent of how it is implemented, are known as
abstract data types (ADTs).

Types of Data Structures


There are two types of data structures:

o Primitive data structure

o Non-primitive data structure

Primitive Data structure

The primitive data structures are the primitive data types. int, char, float, double, and pointer
are primitive data structures; each can hold a single value.

Non-Primitive Data structure

The non-primitive data structure is divided into two types:


o Linear data structure

o Non-linear data structure

Linear Data Structure

The arrangement of data in a sequential manner is known as a linear data structure. The data
structures used for this purpose are arrays, linked lists, stacks, and queues. In these data
structures, each element is connected to only one other element in a linear form.

Non-Linear Data Structure

When one element can be connected to 'n' other elements, the structure is known as a
non-linear data structure. The best-known examples are trees and graphs. In this case, the
elements are not arranged sequentially.

We will discuss the above data structures in brief in the coming topics. Now, we will see the
common operations that we can perform on these data structures.

Data structures can also be classified as:

o Static data structure: It is a type of data structure where the size is allocated at the compile
time. Therefore, the maximum size is fixed.

o Dynamic data structure: It is a type of data structure where the size is allocated at the run
time. Therefore, the maximum size is flexible.
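The difference can be sketched in C as follows. This is only an illustration: the names STATIC_CAP, static_ids, make_dynamic, and resize_dynamic are hypothetical and do not come from these notes.

```c
#include <stdlib.h>

#define STATIC_CAP 5   /* hypothetical capacity, fixed at compile time */

/* Static: the size is part of the declaration and cannot change at run time. */
static int static_ids[STATIC_CAP];

/* Dynamic: the size is chosen at run time. Returns NULL if allocation fails. */
int *make_dynamic(size_t n)
{
    return malloc(n * sizeof(int));
}

/* Grow or shrink a dynamic array. Existing elements are preserved up to the
   smaller of the old and new sizes. Returns NULL (and leaves the old block
   valid) if the request cannot be satisfied. */
int *resize_dynamic(int *arr, size_t new_n)
{
    return realloc(arr, new_n * sizeof(int));
}
```

The caller owns the memory returned by make_dynamic() and resize_dynamic() and must eventually release it with free().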

Major Operations
The major or the common operations that can be performed on the data structures are:

o Searching: We can search for any element in a data structure.

o Sorting: We can sort the elements of a data structure either in an ascending or descending
order.

o Insertion: We can also insert a new element into a data structure.

o Updation: We can also update an element, i.e., we can replace the element with another
element.

o Deletion: We can also perform the delete operation to remove the element from the data
structure.

Which Data Structure?


A data structure is a way of organizing data so that it can be used efficiently. Here,
efficiently means efficient in terms of both space and time. For example, a stack is an ADT
(abstract data type) that can be implemented with either an array or a linked list. Therefore,
we require some data structure to implement a particular ADT.

An ADT tells what is to be done, and a data structure tells how it is to be done. In other words,
an ADT gives us the blueprint while the data structure provides the implementation. Now the
question arises: how can one know which data structure to use for a particular ADT?

Different data structures can implement the same ADT, so the candidate implementations are
compared for time and space. For example, the Stack ADT can be implemented by both arrays
and linked lists. Suppose the array provides better time efficiency while the linked list
provides better space efficiency; the one best suited to the current user's requirements
will be selected.

Advantages of Data structures


The following are the advantages of a data structure:

o Efficiency: If the choice of a data structure for implementing a particular ADT is proper, it
makes the program very efficient in terms of time and space.

o Reusability: Data structures provide reusability, meaning that multiple client programs can
use the same data structure.

o Abstraction: The data structure specified by an ADT also provides the level of abstraction.
The client cannot see the internal working of the data structure, so it does not have to worry
about the implementation part. The client can only see the interface.

LINEAR DATA STRUCTURES

Stack Data Structure

Stack is a linear data structure which follows a particular order in which the operations are performed.
The order may be LIFO(Last In First Out) or FILO(First In Last Out).
There are many real-life examples of a stack. Consider an example of plates stacked over one another
in the canteen. The plate which is at the top is the first one to be removed, i.e. the plate which has been
placed at the bottommost position remains in the stack for the longest period of time. So, it can be
simply seen to follow LIFO(Last In First Out)/FILO(First In Last Out) order.
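A minimal array-based sketch of the LIFO behaviour described above is given here. The capacity STACK_CAP and the function names are assumptions made for this illustration; a linked-list implementation would be equally valid.

```c
#include <stdbool.h>

#define STACK_CAP 100   /* hypothetical fixed capacity */

struct Stack {
    int items[STACK_CAP];
    int top;            /* index of the top element, -1 when empty */
};

void stack_init(struct Stack *s)
{
    s->top = -1;
}

bool stack_is_empty(const struct Stack *s)
{
    return s->top == -1;
}

/* Place value on top of the stack; returns false if the stack is full. */
bool stack_push(struct Stack *s, int value)
{
    if (s->top == STACK_CAP - 1)
        return false;
    s->items[++s->top] = value;
    return true;
}

/* Remove the most recently pushed value into *out;
   returns false if the stack is empty. */
bool stack_pop(struct Stack *s, int *out)
{
    if (stack_is_empty(s))
        return false;
    *out = s->items[s->top--];
    return true;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: the last plate placed is the first one removed.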

Linked List Data Structure


Unlike an array, a linked list is a linear data structure in which the elements are not stored at
contiguous memory locations. The elements in a linked list are linked using pointers.

In simple words, a linked list consists of nodes where each node contains a data field and a
reference(link) to the next node in the list.

Why Linked List?


Arrays can be used to store linear data of similar types, but arrays have following limitations.
1) The size of the arrays is fixed: So we must know the upper limit on the number of elements
in advance. Also, generally, the allocated memory is equal to the upper limit irrespective of
the usage.
2) Inserting a new element in an array of elements is expensive, because room has to be
created for the new element, and to create room existing elements have to be shifted.

For example, suppose a system maintains a sorted list of IDs in an array id[].

id[] = [1000, 1010, 1050, 2000, 2040].

If we want to insert a new ID 1005, then to maintain the sorted order we have to move
all the elements after 1000 (excluding 1000).
Deletion is also expensive with arrays unless some special techniques are used. For
example, to delete 1010 in id[], everything after 1010 has to be moved.
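The shifting cost can be made concrete with a small sketch. The function name sorted_insert and its return convention (the number of elements moved) are assumptions chosen for this illustration.

```c
#include <stddef.h>

/* Insert value into the sorted array arr[0 .. *n-1], keeping it sorted.
   cap is the total capacity of arr. Returns the number of elements that
   had to be shifted one slot to the right, or -1 if the array is full. */
int sorted_insert(int arr[], size_t *n, size_t cap, int value)
{
    if (*n >= cap)
        return -1;

    size_t i = *n;
    int shifted = 0;
    /* Walk from the end, moving every larger element one slot right. */
    while (i > 0 && arr[i - 1] > value) {
        arr[i] = arr[i - 1];
        --i;
        ++shifted;
    }
    arr[i] = value;
    ++*n;
    return shifted;
}
```

With id[] holding 1000, 1010, 1050, 2000, 2040 in an array of capacity 6, inserting 1005 shifts four elements (1010, 1050, 2000, and 2040), which is every element after 1000.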

Advantages over arrays


1) Dynamic size
2) Ease of insertion/deletion

Drawbacks:
1) Random access is not allowed. We have to access elements sequentially starting from the
first node. So we cannot do binary search efficiently on a linked list with its default
implementation.
2) Extra memory space for a pointer is required with each element of the list.
3) Not cache friendly. Since array elements are contiguous locations, there is locality of
reference which is not there in case of linked lists.

Representation:
A linked list is represented by a pointer to the first node of the linked list. The first node is
called head. If the linked list is empty, then value of head is NULL.
Each node in a list consists of at least two parts:
1) data
2) Pointer (Or Reference) to the next node
In C, we can represent a node using a structure that holds an integer data field and a pointer
to the next node.
In Java, LinkedList can be represented as a class and a Node as a separate class. The
LinkedList class contains a reference of Node class type.
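A C node with integer data, as described above, might look like the following sketch. The helper names new_node and list_length are hypothetical and are included only to show how the head pointer and the next links are used.

```c
#include <stdlib.h>

/* One node of a singly linked list. */
struct Node {
    int data;            /* the stored value */
    struct Node *next;   /* link to the next node, NULL at the end */
};

/* Allocate a node holding data, with no successor.
   Returns NULL if memory cannot be allocated. */
struct Node *new_node(int data)
{
    struct Node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->next = NULL;
    }
    return n;
}

/* Count the nodes by walking from head until the NULL link. */
int list_length(const struct Node *head)
{
    int count = 0;
    for (; head != NULL; head = head->next)
        ++count;
    return count;
}
```

An empty list is simply a head pointer whose value is NULL, for which list_length() returns 0.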

Linked List vs Array

Both Arrays and Linked List can be used to store linear data of similar types, but they both
have some advantages and disadvantages over each other.

Key Differences Between Array and Linked List


1. An array is a data structure that contains a collection of elements of the same type,
whereas a linked list is a non-primitive data structure that contains a collection of
linked elements known as nodes, which need not be adjacent in memory.
2. In an array, elements are reached by index, i.e., to get the fourth element you write the
array name with the element's index within square brackets.
3. In a linked list, though, you have to start from the head and work your way through until
you get to the fourth element.
4. Accessing an element by index in an array takes constant time, while in a linked list it
takes linear time, so it is considerably slower.
5. Insertion and deletion in the middle of an array consume a lot of time because elements
have to be shifted. On the other hand, these operations are fast in a linked list once the
position is known.
6. Arrays are of fixed size. In contrast, linked lists are dynamic and flexible and can expand
and contract in size.
7. For a statically declared array, memory is assigned at compile time, while for a linked list
it is allocated during execution, i.e., at run time.
8. Elements are stored consecutively in arrays, whereas in linked lists they may be stored
anywhere in memory.
9. An array requires less memory per element because only the data is stored. A linked list
needs more memory because each node also stores a reference to the next node (and, in a
doubly linked list, to the previous node).
10. On the other hand, an array can waste memory when its allocated capacity is not fully
used, whereas a linked list allocates only the nodes it needs.

Following are the points in favor of Linked Lists.

(1) The size of the arrays is fixed: So we must know the upper limit on the number of
elements in advance. Also, generally, the allocated memory is equal to the upper limit
irrespective of the usage, and in practical uses, the upper limit is rarely reached.

(2) Inserting a new element in an array of elements is expensive because room has to be
created for the new element, and to create room existing elements have to be shifted.

For example, suppose we maintain a sorted list of IDs in an array id[].

id[] = [1000, 1010, 1050, 2000, 2040, …..].

And if we want to insert a new ID 1005, then to maintain the sorted order, we have to move
all the elements after 1000 (excluding 1000).

Deletion is also expensive with arrays unless some special techniques are used. For
example, to delete 1010 in id[], everything after 1010 has to be moved.

So Linked list provides the following two advantages over arrays


1) Dynamic size
2) Ease of insertion/deletion

Linked lists have following drawbacks:


1) Random access is not allowed. We have to access elements sequentially starting from the
first node. So we cannot do a binary search efficiently with linked lists.
2) Extra memory space for a pointer is required with each element of the list.
3) Arrays have better cache locality, which can make a significant difference in performance.

The methods to insert a new node in a linked list are discussed below. A node can be added in
three ways:
1) At the front of the linked list
2) After a given node.
3) At the end of the linked list

Add a node at the front: (A 4 steps process)


The new node is always added before the head of the given Linked List, and the newly added
node becomes the new head of the Linked List. For example, if the given Linked List is
10->15->20->25 and we add an item 5 at the front, then the Linked List becomes
5->10->15->20->25. Let us call the function that adds at the front of the list push(). push()
must receive a pointer to the head pointer, because push() must change the head pointer to
point to the new node.

Time complexity of push() is O(1) as it does a constant amount of work.
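One way to write push() in four steps is sketched below. The struct layout and the parameter names head_ref and new_data are assumptions; only the name push() and the pointer-to-head-pointer requirement come from the text.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Insert new_data at the front of the list whose head pointer is *head_ref.
   If memory cannot be allocated, the list is left unchanged. */
void push(struct Node **head_ref, int new_data)
{
    /* 1. Allocate the new node. */
    struct Node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return;

    /* 2. Store the data. */
    new_node->data = new_data;

    /* 3. Make the new node point to the old head. */
    new_node->next = *head_ref;

    /* 4. Move the head to the new node. */
    *head_ref = new_node;
}
```

No step depends on the length of the list, which is why the work is constant.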

Add a node after a given node: (5 steps process)


We are given a pointer to a node, and the new node is inserted after the given node.

Time complexity of insertAfter() is O(1) as it does a constant amount of work.
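A five-step sketch of insertAfter() follows. The struct layout and the parameter names prev_node and new_data are assumptions for this illustration.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Insert new_data immediately after prev_node.
   Does nothing if prev_node is NULL or memory cannot be allocated. */
void insertAfter(struct Node *prev_node, int new_data)
{
    /* 1. Check that the given node exists. */
    if (prev_node == NULL)
        return;

    /* 2. Allocate the new node. */
    struct Node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return;

    /* 3. Store the data. */
    new_node->data = new_data;

    /* 4. The new node takes over prev_node's successor. */
    new_node->next = prev_node->next;

    /* 5. prev_node now points to the new node. */
    prev_node->next = new_node;
}
```

Steps 4 and 5 must happen in this order; swapping them would lose the link to the rest of the list.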

Add a node at the end: (6 steps process)


The new node is always added after the last node of the given Linked List. For example if the
given Linked List is 5->10->15->20->25 and we add an item 30 at the end, then the Linked
List becomes 5->10->15->20->25->30.
Since a Linked List is typically represented by its head, we have to traverse the list to the
end and then change the next of the last node to the new node.
Time complexity of append() is O(n), where n is the number of nodes in the linked list. Since
there is a loop from head to end, the function does O(n) work.
This method can also be optimized to work in O(1) by keeping an extra pointer to the tail of the
linked list.
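The six steps of appending can be sketched as below. The function name append() is taken from the text; the struct layout and parameter names are assumptions. This is the O(n) version without a tail pointer.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Add new_data after the last node of the list whose head pointer is
   *head_ref. If memory cannot be allocated, the list is left unchanged. */
void append(struct Node **head_ref, int new_data)
{
    /* 1. Allocate the new node. */
    struct Node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return;

    /* 2. Store the data. */
    new_node->data = new_data;

    /* 3. The new node will be the last one, so its next is NULL. */
    new_node->next = NULL;

    /* 4. If the list is empty, the new node becomes the head. */
    if (*head_ref == NULL) {
        *head_ref = new_node;
        return;
    }

    /* 5. Otherwise walk to the last node. */
    struct Node *last = *head_ref;
    while (last->next != NULL)
        last = last->next;

    /* 6. Link the last node to the new node. */
    last->next = new_node;
}
```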

To delete a node from a linked list, we need to do the following steps.


1) Find previous node of the node to be deleted.
2) Change the next of previous node.
3) Free memory for the node to be deleted.
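The three steps above can be sketched as a function that deletes the first node holding a given value. The name deleteNode and the choice to identify the node by its data (key) are assumptions; the notes do not say how the node to delete is specified. The nodes are assumed to have been allocated with malloc().

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Delete the first node whose data equals key from the list whose head
   pointer is *head_ref. Does nothing if no such node exists. */
void deleteNode(struct Node **head_ref, int key)
{
    struct Node *cur = *head_ref;
    struct Node *prev = NULL;

    /* 1. Find the node to delete, remembering the node before it. */
    while (cur != NULL && cur->data != key) {
        prev = cur;
        cur = cur->next;
    }
    if (cur == NULL)
        return;                     /* key not present */

    /* 2. Change the next of the previous node (or the head itself
          when the first node is the one being deleted). */
    if (prev == NULL)
        *head_ref = cur->next;
    else
        prev->next = cur->next;

    /* 3. Free the memory of the deleted node. */
    free(cur);
}
```

The search in step 1 makes this O(n) in the worst case, even though the unlinking itself is constant work.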

Queue Data Structure


A Queue is a linear structure which follows a particular order in which the operations are performed.
The order is First In First Out (FIFO). A good example of a queue is any queue of consumers for a
resource where the consumer that came first is served first. The difference between stacks and queues
is in removing. In a stack we remove the item the most recently added; in a queue, we remove the item
the least recently added.
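A FIFO queue can be sketched with a circular array, as below. The capacity QUEUE_CAP and the names enqueue and dequeue are assumptions for this illustration; a linked list with head and tail pointers is another common choice.

```c
#include <stdbool.h>

#define QUEUE_CAP 8   /* hypothetical fixed capacity */

struct Queue {
    int items[QUEUE_CAP];
    int front;   /* index of the least recently added element */
    int count;   /* number of elements currently stored */
};

void queue_init(struct Queue *q)
{
    q->front = 0;
    q->count = 0;
}

/* Add value at the rear; returns false if the queue is full. */
bool enqueue(struct Queue *q, int value)
{
    if (q->count == QUEUE_CAP)
        return false;
    q->items[(q->front + q->count) % QUEUE_CAP] = value;
    q->count++;
    return true;
}

/* Remove the least recently added value into *out;
   returns false if the queue is empty. */
bool dequeue(struct Queue *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_CAP;
    q->count--;
    return true;
}
```

Enqueuing 1, 2, 3 and then dequeuing three times yields 1, 2, 3, the opposite of the order a stack would give.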
NON LINEAR DATA STRUCTURES

Graph data structures and their applications


A graph is a pictorial representation of a set of objects where some pairs of objects are
connected by links. The interconnected objects are represented by points called vertices
(or nodes), and the links that connect the vertices are called edges.

Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of
edges connecting pairs of vertices. For example, a graph may have:

V = {a, b, c, d, e}

E = {ab, ac, bd, cd, de}

Basic terms
Mathematical graphs can be represented as data structures. We can represent a graph using an
array of vertices and a two-dimensional array of edges. Before we proceed further, let's
familiarize ourselves with some important terms.

• Vertex − Each node of the graph is represented as a vertex. In the example, the labeled
circles represent vertices, so A to G are vertices. We can represent them using an array
where A is identified by index 0, B by index 1, and so on.
• Edge − An edge represents a path between two vertices or a line between two vertices. In
the example, the lines from A to B, B to C, and so on represent edges. We can use a
two-dimensional array to represent edges, where AB is represented as 1 at row 0, column 1,
BC as 1 at row 1, column 2, and so on, keeping other combinations as 0.

• Adjacency − Two nodes or vertices are adjacent if they are connected to each other
through an edge. In the example, B is adjacent to A, C is adjacent to B, and so on.
• Path − A path represents a sequence of edges between two vertices. In the example,
ABCD represents a path from A to D.

Kinds of Graphs

• Undirected Graphs.

In an undirected graph, the order of the vertices in the pairs in the Edge set doesn't matter.
Thus, for the sample graph, we could have written the Edge set as
{(4,6),(4,5),(3,4),(3,2),(2,5),(1,2),(1,5)}. Undirected graphs are usually drawn with straight
lines between the vertices.

The adjacency relation is symmetric in an undirected graph, so if u ~ v then it is also the
case that v ~ u.

• Directed Graphs.

In a directed graph the order of the vertices in the pairs in the Edge set matters. Thus u is
adjacent to v only if the pair (u,v) is in the Edge set. For directed graphs we usually use arrows
for the arcs between vertices. An arrow from u to v is drawn only if (u,v) is in the Edge set.
The directed graph in this example has the following parts.

o The underlying set for the Vertices set is capital letters.
o The Vertices set = {A,B,C,D,E}
o The Edge set = {(A,B),(B,C),(D,C),(B,D),(D,B),(E,D),(B,E)}

Note that both (B,D) and (D,B) are in the Edge set, so the arc between B and D is an
arrow in both directions.

• Vertex labeled Graphs.

In a labeled graph, each vertex is labeled with some data in addition to the data that identifies
the vertex. Only the identifying data is present in the pairs in the Edge set. This is similar to
the (key,satellite) data distinction for sorting.
Here we have the following parts.

o The underlying set for the keys of the Vertices set is the integers.
o The underlying set for the satellite data is Color.
o The Vertices set = {(2,Blue),(4,Blue),(5,Red),(7,Green),(6,Red),(3,Yellow)}
o The Edge set = {(2,4),(4,5),(5,7),(7,6),(6,2),(4,3),(3,7)}
• Cyclic Graphs.

A cyclic graph is a directed graph with at least one cycle. A cycle is a path along the directed
edges from a vertex to itself. The vertex labeled graph above has several cycles. One of them is
2»4»5»7»6»2

• Edge labeled Graphs.

An edge labeled graph is a graph where the edges are associated with labels. One can indicate
this by making the Edge set a set of triples. Thus if (u,v,X) is in the Edge set, then there is
an edge from u to v with label X.

Edge labeled graphs are usually drawn with the labels drawn adjacent to the arcs
specifying the edges.
Here we have the following parts.

o The underlying set for the Vertices set is Color.
o The underlying set for the edge labels is sets of Color.
o The Vertices set = {Red,Green,Blue,White}
o The Edge set = {(red,white,{white,green}) ,(white,red,{blue}) ,(white,blue,{green,red})
,(red,blue,{blue}) ,(green,red,{red,blue,white}) ,(blue,green,{white,green,red})}
• Weighted Graphs.

A weighted graph is an edge labeled graph where the labels can be operated on by the usual
arithmetic operators, including comparisons such as less than and greater than. In Haskell
we'd say the edge labels are in the Num class. Usually they are integers or floats. The idea is
that some edges may be more (or less) expensive, and this cost is represented by the edge
labels or weights. In the graph below, which is an undirected graph, the weights are drawn
adjacent to the edges and appear in dark purple.

Here we have the following parts.

o The underlying set for the Vertices set is Integer.
o The underlying set for the weights is Integer.
o The Vertices set = {1,2,3,4,5}
o The Edge set = {(1,4,5) ,(4,5,58) ,(3,5,34) ,(2,4,5) ,(2,5,4) ,(3,2,14) ,(1,2,2)}
• Directed Acyclic Graphs.

A DAG is a directed graph without cycles. DAGs appear as special cases in CS applications all
the time.

Acyclic, undirected, labelled graph

Here we have the following parts.

o The underlying set for the Vertices set is Integer.
o The Vertices set = {1,2,3,4,5,6,7,8}
o The Edge set = {(1,7) ,(2,6) ,(3,1),(3,5) ,(4,6) ,(5,4),(5,2) ,(6,8) ,(7,2),(7,8)}
• Disconnected Graphs

Vertices in a graph do not need to be connected to other vertices. It is legal for a graph to have
disconnected components, and even lone vertices without a single connection.

Vertices (like 5, 7, and 8) with only in-arrows are called sinks. Vertices with only
out-arrows (like 3 and 4) are called sources.

Here we have the following parts.

o The underlying set for the Vertices set is Integer.
o The Vertices set = {1,2,3,4,5,6,7,8}
o The Edge set = {(1,7) ,(3,1),(3,8) ,(4,6) ,(6,5)}
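Sources and sinks can be found by counting in-arrows and out-arrows for each vertex. The sketch below assumes the edge list is given as pairs (from, to) with vertices numbered 1 to NUM_VERTICES; the names NUM_VERTICES and classify_vertices are hypothetical.

```c
#define NUM_VERTICES 8   /* vertices are numbered 1..NUM_VERTICES */

/* For a directed edge list of m pairs (from, to), set is_source[v] to 1
   when v has only out-arrows and is_sink[v] to 1 when v has only
   in-arrows. Index 0 of both arrays is unused. */
void classify_vertices(int edges[][2], int m,
                       int is_source[NUM_VERTICES + 1],
                       int is_sink[NUM_VERTICES + 1])
{
    int in_deg[NUM_VERTICES + 1] = {0};
    int out_deg[NUM_VERTICES + 1] = {0};

    for (int i = 0; i < m; ++i) {
        out_deg[edges[i][0]]++;   /* arrow leaves the first vertex */
        in_deg[edges[i][1]]++;    /* arrow enters the second vertex */
    }
    for (int v = 1; v <= NUM_VERTICES; ++v) {
        is_source[v] = (out_deg[v] > 0 && in_deg[v] == 0);
        is_sink[v]   = (in_deg[v] > 0 && out_deg[v] == 0);
    }
}
```

For the Edge set listed above, this marks 3 and 4 as sources and 5, 7, and 8 as sinks. Vertex 2 has no arrows at all, so it is neither.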

• Connecting with friends on social media, where each user is a vertex, and when users
connect they create an edge.
• Using GPS/Google Maps/Yahoo Maps, to find a route based on shortest route.
• Google, to search for webpages, where pages on the internet are linked to each other
by hyperlinks; each page is a vertex and the link between two pages is an edge.
• On eCommerce websites relationship graphs are used to show recommendations.

Practical applications of Graph Algorithms


Graphs are widely applied today. They are used in economics, aeronautics, physics,
biology (for analyzing DNA), mathematics and other areas.
Below are some applications of a collection of graph algorithms (all of them are available in
Graph Magics).

Graphs are used to solve many real-life problems. Graphs are used to represent networks. The
networks may include paths in a city or telephone network or circuit network. Graphs are also
used in social networks like linkedIn, Facebook. For example, in Facebook, each person is
represented with a vertex(or node). Each node is a structure and contains information like
person id, name, gender, locale etc.

Shortest Path (this can be applicable in routing protocols):

This is probably the most often used algorithm. It may be applied in situations where the
shortest path between 2 points is needed.
Examples of such applications would be:

• Computer games - finding the best/shortest route from one point to another.
• Maps - finding the shortest/cheapest path for a car from one city to another, by using
given roads.
• May be used to find the fastest way for a car to get from one point to another inside a
certain city, e.g., a satellite navigation system that shows drivers which way they
should go.
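The notes do not name a specific shortest-path algorithm. One common choice for graphs with non-negative weights is Dijkstra's algorithm, sketched below on an adjacency matrix. The names NV, NO_EDGE, and dijkstra are hypothetical, and the sketch assumes that a weight of 0 means "no edge" and that path lengths fit in an int.

```c
#include <limits.h>

#define NV 5        /* number of vertices, indexed 0..NV-1 */
#define NO_EDGE 0   /* matrix value meaning "no edge" (an assumption) */

/* Dijkstra's algorithm, O(NV^2) version. w[u][v] is the non-negative
   weight of the edge u-v, or NO_EDGE. On return, dist[v] holds the
   length of the shortest path from src to v, or INT_MAX if unreachable. */
void dijkstra(int w[NV][NV], int src, int dist[NV])
{
    int done[NV] = {0};

    for (int i = 0; i < NV; ++i)
        dist[i] = INT_MAX;
    dist[src] = 0;

    for (int iter = 0; iter < NV; ++iter) {
        /* Pick the closest reachable vertex not yet finalized. */
        int u = -1;
        for (int i = 0; i < NV; ++i)
            if (!done[i] && dist[i] != INT_MAX &&
                (u == -1 || dist[i] < dist[u]))
                u = i;
        if (u == -1)
            break;                  /* the rest is unreachable */
        done[u] = 1;

        /* Relax every edge leaving u. */
        for (int v = 0; v < NV; ++v)
            if (w[u][v] != NO_EDGE && !done[v] &&
                dist[u] + w[u][v] < dist[v])
                dist[v] = dist[u] + w[u][v];
    }
}
```

Applied to the weighted graph listed earlier in these notes (vertices 1 to 5 mapped to indices 0 to 4), the shortest distances from vertex 1 to vertices 1, 2, 3, 4, 5 are 0, 2, 16, 5, and 6.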

Minimal Spanning Tree:


Consider some communications stations (for telephony, cable television, Internet etc.) and a
list of possible connections between them, having different costs (weights). Find the cheapest
way to connect these stations in a network, so that a station is connected to any other (directly,
or through intermediate stations). This may be used for example to connect villages to cable
television, or to Internet.

• The same problem, but instead of connecting communications stations, villages are to
be connected with roads.

Eulerian Path/Circuit:
A postman has to visit a set of streets in order to deliver mail and packages. We need to
find a path that starts and ends at the post office and that passes through each street (edge)
exactly once. This way the postman will deliver mail and packages to all the streets he has to,
and at the same time will spend the minimum effort/time on the road.
Note that not all graphs have an Eulerian circuit. If needed, the algorithm for the Chinese
Postman Problem can be used.
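Whether an undirected graph has an Eulerian circuit can be checked without constructing one: every vertex must have even degree, and all vertices that have edges must lie in one connected component. The sketch below assumes a simple graph (no self-loops or parallel edges) stored as a 0/1 adjacency matrix; MAX_V and the function names are hypothetical.

```c
#define MAX_V 16   /* hypothetical upper bound on the number of vertices */

/* Depth-first search marking every vertex reachable from u. */
static void visit(int n, int adj[MAX_V][MAX_V], int u, int seen[MAX_V])
{
    seen[u] = 1;
    for (int v = 0; v < n; ++v)
        if (adj[u][v] && !seen[v])
            visit(n, adj, v, seen);
}

/* adj[u][v] is 1 if an undirected edge joins u and v, else 0.
   Returns 1 if the graph on vertices 0..n-1 has an Eulerian circuit. */
int has_eulerian_circuit(int n, int adj[MAX_V][MAX_V])
{
    int degree[MAX_V] = {0};
    int seen[MAX_V] = {0};
    int start = -1;

    for (int u = 0; u < n; ++u) {
        for (int v = 0; v < n; ++v)
            degree[u] += adj[u][v];
        if (degree[u] % 2 != 0)
            return 0;               /* an odd-degree vertex rules it out */
        if (degree[u] > 0 && start == -1)
            start = u;
    }
    if (start == -1)
        return 1;                   /* no edges at all: trivially true */

    visit(n, adj, start, seen);
    for (int u = 0; u < n; ++u)
        if (degree[u] > 0 && !seen[u])
            return 0;               /* edges split across components */
    return 1;
}
```

A triangle passes the check, while a simple path of three vertices fails it because its two end vertices have odd degree.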

Chinese Postman Problem:

• The same problem with the postman as above, but instead of visiting each street
(edge) exactly once, the postman can visit them more than once if needed. Thus the
path should pass through each street at least one time and should have the minimum
cost.
• Drawing a circuit with a plotter in the fastest possible way, or with a minimum cost.
• It may be used to determine the cheapest path for garbage collection, street cleaning,
or snow removal.
• Also applied in routing robots, analysing DNA, and others.

Hamiltonian Path/Circuit:

• The same problem with the postman as above, but instead of visiting a set of streets
(edges), he has to visit each point (house) exactly once.

Network Flows:

• With the Maximum Flow algorithm it is possible to find the most heavily loaded roads or
rails in a certain transportation network, and also to determine its maximum throughput.
This information may then be used to improve the traffic situation in those places.

Optimal Graph Coloring:

• This algorithm may be used to color a map with a minimum number of colors.

Graph Median:
A warehouse should be placed in a city (a region) so that the sum of shortest distances to all
other points (regions) is minimal. This is useful for lowering the cost of transporting goods
from a warehouse to clients.
The same thing can be considered for selecting the place of a shop, market, office and other
buildings.

Graph Center:
Suppose that a hospital, a fire department, or a police department should be placed in a city
so that the farthest point is as close as possible. For example, a hospital should be placed in
such a way that an ambulance can get as fast as possible to the farthest situated house
(point).

Graph representation
The following two are the most commonly used representations of a graph:

1. Adjacency Matrix
2. Adjacency List

There are other representations also, like Incidence Matrix and Incidence List. The choice of
the graph representation is situation specific. It depends on the type of operations to be
performed and ease of use.

The example graph used to illustrate both representations has five vertices, 0 to 4, and the
edges 0-1, 0-4, 1-2, 1-3, 1-4, 2-3, and 3-4.

Adjacency Matrix:
Adjacency Matrix is a 2D array of size V x V where V is the number of vertices in a graph.
Let the 2D array be adj[][], a slot adj[i][j] = 1 indicates that there is an edge from vertex i to
vertex j. Adjacency matrix for undirected graph is always symmetric. Adjacency Matrix is
also used to represent weighted graphs. If adj[i][j] = w, then there is an edge from vertex i to
vertex j with weight w.

The adjacency matrix for the example graph (vertices 0 to 4, edges 0-1, 0-4, 1-2, 1-3, 1-4,
2-3, and 3-4) is:

     0  1  2  3  4
0    0  1  0  0  1
1    1  0  1  1  1
2    0  1  0  1  0
3    0  1  1  0  1
4    1  1  0  1  0

Pros: The representation is easy to implement and follow. Removing an edge takes O(1) time.
Queries like whether there is an edge from vertex 'u' to vertex 'v' are efficient and can be
done in O(1).

Cons: Consumes more space, O(V^2). Even if the graph is sparse (contains few
edges), it consumes the same space. Adding a vertex takes O(V^2) time.
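The O(1) operations mentioned above can be sketched for an undirected, unweighted graph as follows. NUM_V and the function names are assumptions made for this illustration.

```c
#include <string.h>

#define NUM_V 5   /* number of vertices, indexed 0..NUM_V-1 */

/* Remove every edge from the matrix. */
void matrix_clear(int adj[NUM_V][NUM_V])
{
    memset(adj, 0, sizeof(int) * NUM_V * NUM_V);
}

/* Add the undirected edge u-v: both symmetric slots are set, O(1). */
void matrix_add_edge(int adj[NUM_V][NUM_V], int u, int v)
{
    adj[u][v] = 1;
    adj[v][u] = 1;
}

/* Remove the undirected edge u-v, O(1). */
void matrix_remove_edge(int adj[NUM_V][NUM_V], int u, int v)
{
    adj[u][v] = 0;
    adj[v][u] = 0;
}

/* Is there an edge from u to v? A single array lookup, O(1). */
int matrix_has_edge(int adj[NUM_V][NUM_V], int u, int v)
{
    return adj[u][v];
}
```

The matrix always occupies NUM_V * NUM_V integers regardless of how many edges exist, which is the space cost noted in the cons.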

Adjacency List:
An array of linked lists is used. Size of the array is equal to number of vertices. Let the array
be array[]. An entry array[i] represents the linked list of vertices adjacent to the ith vertex.
This representation can also be used to represent a weighted graph. The weights of edges can
be stored in nodes of linked lists. For the example graph, the adjacency lists hold the
following neighbours:

0: 1, 4
1: 0, 2, 3, 4
2: 1, 3
3: 1, 2, 4
4: 0, 1, 3

Below is C code for adjacency list representation of an undirected graph:

// A C program to demonstrate adjacency list representation of graphs

#include <stdio.h>
#include <stdlib.h>

// A structure to represent an adjacency list node
struct AdjListNode
{
    int dest;
    struct AdjListNode* next;
};

// A structure to represent an adjacency list
struct AdjList
{
    struct AdjListNode* head; // pointer to head node of list
};

// A structure to represent a graph. A graph is an array of adjacency lists.
// Size of array will be V (number of vertices in graph)
struct Graph
{
    int V;
    struct AdjList* array;
};

// A utility function to create a new adjacency list node
struct AdjListNode* newAdjListNode(int dest)
{
    struct AdjListNode* newNode =
        (struct AdjListNode*) malloc(sizeof(struct AdjListNode));
    if (newNode == NULL)
    {
        // Out of memory: report and stop rather than dereference NULL
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    newNode->dest = dest;
    newNode->next = NULL;
    return newNode;
}

// A utility function that creates a graph of V vertices
struct Graph* createGraph(int V)
{
    struct Graph* graph = (struct Graph*) malloc(sizeof(struct Graph));
    if (graph == NULL)
    {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    graph->V = V;

    // Create an array of adjacency lists. Size of array will be V
    graph->array = (struct AdjList*) malloc(V * sizeof(struct AdjList));
    if (graph->array == NULL)
    {
        perror("malloc");
        exit(EXIT_FAILURE);
    }

    // Initialize each adjacency list as empty by making head as NULL
    int i;
    for (i = 0; i < V; ++i)
        graph->array[i].head = NULL;

    return graph;
}

// Adds an edge to an undirected graph
void addEdge(struct Graph* graph, int src, int dest)
{
    // Add an edge from src to dest. A new node is added to the adjacency
    // list of src. The node is added at the beginning
    struct AdjListNode* newNode = newAdjListNode(dest);
    newNode->next = graph->array[src].head;
    graph->array[src].head = newNode;

    // Since graph is undirected, add an edge from dest to src also
    newNode = newAdjListNode(src);
    newNode->next = graph->array[dest].head;
    graph->array[dest].head = newNode;
}

// A utility function to print the adjacency list representation of graph
void printGraph(struct Graph* graph)
{
    int v;
    for (v = 0; v < graph->V; ++v)
    {
        struct AdjListNode* pCrawl = graph->array[v].head;
        printf("\n Adjacency list of vertex %d\n head ", v);
        while (pCrawl)
        {
            printf("-> %d", pCrawl->dest);
            pCrawl = pCrawl->next;
        }
        printf("\n");
    }
}

// Driver program to test above functions
int main(void)
{
    // create the example graph (vertices 0 to 4)
    int V = 5;
    struct Graph* graph = createGraph(V);
    addEdge(graph, 0, 1);
    addEdge(graph, 0, 4);
    addEdge(graph, 1, 2);
    addEdge(graph, 1, 3);
    addEdge(graph, 1, 4);
    addEdge(graph, 2, 3);
    addEdge(graph, 3, 4);

    // print the adjacency list representation of the above graph
    printGraph(graph);
    return 0;
}

Output:

 Adjacency list of vertex 0
 head -> 4-> 1

 Adjacency list of vertex 1
 head -> 4-> 3-> 2-> 0

 Adjacency list of vertex 2
 head -> 3-> 1

 Adjacency list of vertex 3
 head -> 4-> 2-> 1

 Adjacency list of vertex 4
 head -> 3-> 1-> 0
