DS E-Content

School of Computer Science and Engineering
Data Structures using JAVA [R1UC303B]
Dr. A K Yadav

Plot No 2, Sector 17A, Yamuna Expressway
Greater Noida, Uttar Pradesh - 203201
September 24, 2024
Dr. A K Yadav Data Structures using JAVA 1/102

Contents
Introduction
Classification/Types of Data Structures
Applications of Data Structures
Algorithm
Efficiency of an algorithm
Time-space trade-off and complexity
Asymptotic notations
Complexity Analysis
Arrays
Representation of Arrays
Derivation of Index Formula
Application of arrays
Sparse Matrices
Arithmetic operations on matrices
Recursion
Direct Recursion
Indirect Recursion
Removal of recursion
Iteration and recursion with examples
Trade-off between iteration and recursion
Searching
Linear Search
Binary Search
Indexed Sequential Search
Hashing
Sorting
Insertion Sort
Bubble Sort
Selection Sort
Quick Sort
Merge Sort

Basic Terminology
What is Data Structure?
- A data structure is a particular way of organising data in a
computer so that it can be used effectively. The idea is to reduce
the space and time complexities of different tasks.
- The choice of a good data structure makes it possible to perform
a variety of critical operations effectively.
- An efficient data structure also uses minimum memory space and
execution time to process the structure.
- A data structure is not only used for organising the data. It is
also used for processing, retrieving, and storing data.
I Data: Data are simply values or sets of values.
I Data items: A data item refers to a single unit of values.
I Group items: Data items that are divided into sub-items are
called group items. Ex: An employee name may be divided
into three sub-items: first name, middle name, and last name.
I Elementary items: Data items that cannot be divided
into sub-items are called elementary items. Ex: EnRollNo.
I Entity: An entity is something that has certain attributes or
properties which may be assigned values. The values may be
either numeric or non-numeric.
I Entities with similar attributes form an entity set.
I Each attribute of an entity set has a range of values, the set
of all possible values that could be assigned to the particular
attribute.
I The term information is sometimes used for data with given
attributes, in other words meaningful or processed data.

I Field is a single elementary unit of information representing
an attribute of an entity.
I Record is the collection of field values of a given entity.
I File is the collection of records of the entities in a given entity
set.
Need of Data Structure:
The structure of the data and the design of the algorithm depend on
each other. Data representation must be easy to understand so that
both the developer and the user can implement operations
efficiently.
Data structures provide an easy way of organising, retrieving,
managing, and storing data. A well-chosen data structure offers the
following benefits:
I The data structure is easy to modify.
I Operations take less time.
I Storage space is saved.
I Data representation is easy.
I Large databases are easy to access.

Classification/Types of Data Structures

1. Linear Data Structure
2. Non-Linear Data Structure
Linear Data Structure:
Elements are accessed in a sequential order, but they need not be
stored sequentially in memory (as in linked lists). Examples: Arrays,
Linked Lists, Stacks and Queues.
Non-Linear Data Structure:
Elements of this data structure are stored/accessed in a non-linear
order. Examples: Trees and Graphs.


Applications of Data Structures

Data structures are used in various fields such as:
I Operating system
I Graphics
I Computer Design
I Blockchain
I Genetics
I Image Processing
I Simulation
I ...

Algorithm
I What is an algorithm?
- An algorithm is a set of rules for carrying out calculation
either by hand or on a machine.
- An algorithm is a sequence of computational steps that
transform the input into output.
- An algorithm is a sequence of operations performed on data
that have to be organized in data structures.
- A finite set of instructions that specify a sequence of
operations to be carried out in order to solve a specific
problem or class of problems.
- An algorithm is an abstraction of a program to be executed
on a physical machine.

- An algorithm is any well-defined computational procedure
that takes some value, or set of values, as input and produces
some value, or set of values, as output.
I Why do we study algorithms?
- To make solutions faster.
- To compare performance as a function of input size.
The two main properties of an algorithm are:
1. Correctness: Does the algorithm give a solution to the problem
in a finite number of steps?
2. Efficiency: How much memory and time does the algorithm
take to execute?

Efficiency of an algorithm
I To go from city “A” to city “B”, there can be many ways of
accomplishing this: by flight, by bus, by train and also by
bicycle.
I Depending on the availability, convenience, and affordability
etc., we choose the one that suits us.
I Similarly, in computer science, multiple algorithms are
available for solving the same problem (for example, a sorting
problem has many algorithms, like insertion sort, selection
sort, quick sort and many more).
I Algorithm analysis helps us to determine which algorithm is
most efficient in terms of time and space consumed.
I The goal of the analysis of algorithms is to compare algorithms
(or solutions) mainly in terms of running time but also in
terms of other factors e.g., memory, developer effort, etc.
Time-space trade-off and complexity

A trade-off is a situation where one quantity improves only at the
expense of another. A problem can be solved:
I either in less time by using more space, or
I in very little space by spending more time.
The best algorithm is one that requires less memory and also takes
less time to generate the output.
- In general, it is not always possible to achieve both of these
conditions at the same time.
- The most common example is an algorithm that uses a lookup table,
in which the answers to some question are written down for every
possible input value.
- One way of solving the problem is to store the entire lookup table,
which lets you find answers very quickly but uses a lot of space.
- Another way is to calculate each answer on demand without storing
anything, which uses very little space but might take a long time.
- Therefore, the more time-efficient an algorithm is, the less
space-efficient it tends to be.
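The lookup-table trade-off described above can be sketched in Java as follows. This is a minimal illustration only: the task (counting the 1-bits of a byte) and the names BitCountTradeOff, countWithTable and countByRecalculation are assumptions made for this example and are not part of the original slides.

```java
class BitCountTradeOff {
    // Space-heavy option: 256 precomputed answers, one per possible byte value.
    private static final int[] TABLE = new int[256];

    static {
        for (int v = 1; v < 256; v++) {
            // Bits of v = bits of v without its lowest bit + the lowest bit.
            TABLE[v] = TABLE[v >> 1] + (v & 1);
        }
    }

    /** Time-efficient: a single array read, at the cost of 256 stored ints. */
    static int countWithTable(int byteValue) {
        return TABLE[byteValue & 0xFF];
    }

    /** Space-efficient: no table, but up to 8 loop iterations per call. */
    static int countByRecalculation(int byteValue) {
        int count = 0;
        for (int v = byteValue & 0xFF; v != 0; v >>= 1) {
            count += v & 1;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWithTable(0b1011_0110));
        System.out.println(countByRecalculation(0b1011_0110));
    }
}
```

Both methods return the same answers; they differ only in how much memory they hold and how much work they do per call.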
Types of Space-Time Trade-off:

I Compressed or uncompressed data: A space-time trade-off
can be applied to the problem of data storage. If data is stored
uncompressed, it takes more space but less time to access. If
the data is stored compressed, it takes less space but more time,
because the decompression algorithm has to run. In some cases it
is possible to work directly with the compressed data, for example
with compressed bitmap indices, where working on the compressed
form is faster than working without compression.
I Re-rendering or stored images: Storing only the source and
re-rendering it as an image each time takes less space but more
time; storing the rendered image in a cache is faster than
re-rendering but requires more space in memory.
I Smaller code or loop unrolling: Smaller code occupies less
space in memory but needs extra computation time for jumping
back to the beginning of the loop at the end of each iteration.
Loop unrolling optimizes execution speed at the cost of
increased binary size: it occupies more space in memory but
requires less computation time.
I Lookup tables or recalculation: An implementation can include
the entire lookup table, which reduces computing time but
increases the amount of memory needed, or it can recalculate,
i.e., compute table entries as needed, which increases computing
time but reduces memory requirements.

Asymptotic notations
1. O - notation "Big O": Asymptotic upper bound,
$O(g(n)) = \{ f(n) : \exists\, c, n_0 > 0 \text{ such that } 0 \le f(n) \le c\,g(n) \text{ for all } n \ge n_0 \}$
$f(n) \in O(g(n))$, conventionally written $f(n) = O(g(n))$

2. Ω - notation "Big omega": Asymptotic lower bound,
$\Omega(g(n)) = \{ f(n) : \exists\, c, n_0 > 0 \text{ such that } 0 \le c\,g(n) \le f(n) \text{ for all } n \ge n_0 \}$
$f(n) \in \Omega(g(n))$, conventionally written $f(n) = \Omega(g(n))$

3. Θ - notation: Asymptotic tight bound,
$\Theta(g(n)) = \{ f(n) : \exists\, c_1, c_2, n_0 > 0 \text{ such that } 0 \le c_1 g(n) \le f(n) \le c_2 g(n) \text{ for all } n \ge n_0 \}$
$f(n) \in \Theta(g(n))$, conventionally written $f(n) = \Theta(g(n))$

4. o - notation "small o": Asymptotic loose upper bound,
$o(g(n)) = \{ f(n) : \forall\, c > 0,\ \exists\, n_0 > 0 \text{ such that } 0 \le f(n) < c\,g(n) \text{ for all } n \ge n_0 \}$

5. ω - notation "small omega": Asymptotic loose lower bound,
$\omega(g(n)) = \{ f(n) : \forall\, c > 0,\ \exists\, n_0 > 0 \text{ such that } 0 \le c\,g(n) < f(n) \text{ for all } n \ge n_0 \}$
Benefits of Asymptotic Notations:
- Simple representation of algorithm efficiency.
- Easy comparison of performance of algorithms.
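As a worked illustration of these definitions, the constants required by the O, Ω and Θ definitions can be exhibited directly for one function. The function f(n) = 3n² + 2n is chosen here purely as an example and does not come from the slides.

```latex
\begin{align*}
f(n) &= 3n^2 + 2n \\
f(n) &\le 3n^2 + 2n^2 = 5n^2 \quad \text{for all } n \ge 1
      && \Rightarrow\ f(n) = O(n^2) \text{ with } c = 5,\ n_0 = 1 \\
f(n) &\ge 3n^2 \quad \text{for all } n \ge 1
      && \Rightarrow\ f(n) = \Omega(n^2) \text{ with } c = 3,\ n_0 = 1 \\
3n^2 &\le f(n) \le 5n^2 \quad \text{for all } n \ge 1
      && \Rightarrow\ f(n) = \Theta(n^2) \text{ with } c_1 = 3,\ c_2 = 5,\ n_0 = 1
\end{align*}
```

The same function is o(n³), because f(n)/n³ tends to 0, but it is not o(n²), because f(n) < c·n² fails for every c ≤ 3.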

Complexity analysis
Analyzing an algorithm means predicting the resources that the
algorithm requires. Resources may be memory, communication
bandwidth, computer hardware or CPU time. Our primary concern
is to measure the computational time required by the algorithm.
Running time: The running time of an algorithm is the number of
primitive operations or steps executed on a particular input.
Why do we normally concentrate on finding only the worst-case
running time?
1. The worst-case running time of an algorithm gives us an
upper bound on the running time for any input, so it
guarantees that the algorithm will never be slower than this.
In real applications the worst case occurs fairly often, for
example when searching for data that is not present.
2. The best case is an ideal case: it only guarantees that the
algorithm will never be faster than stated, so resources cannot
be allocated on that basis.
3. The average case is often roughly as bad as the worst case,
because the average is normally taken as the mean of the best and
worst cases, or as the best case on half of the input and the
worst case on the other half.

Complexity analysis: Insertion sort

Insertion-Sort(A, N)                         Cost   Times
1. for j = 2 to N                            c1     n
2.   key = A[j]                              c2     n − 1
     Insert A[j] into the sorted A[1] to A[j − 1]
3.   i = j − 1                               c3     n − 1
4.   while i > 0 and A[i] > key              c4     $\sum_{j=2}^{n} t_j$
5.     A[i + 1] = A[i]                       c5     $\sum_{j=2}^{n} (t_j - 1)$
6.     i = i − 1                             c6     $\sum_{j=2}^{n} (t_j - 1)$
     while-end
7.   A[i + 1] = key                          c7     n − 1
   for-end

$T(n) = c_1 n + c_2 (n-1) + c_3 (n-1) + c_4 \sum_{j=2}^{n} t_j + c_5 \sum_{j=2}^{n} (t_j - 1) + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 (n-1)$

$\Rightarrow T(n) = an + b + c_4 \sum_{j=2}^{n} t_j + c_5 \sum_{j=2}^{n} (t_j - 1) + c_6 \sum_{j=2}^{n} (t_j - 1)$
Now consider different cases:

1. Best Case: The algorithm performs best if A[i] ≤ key for
every value of j in step 4.
Then the test in step 4 executes only once for each value of j
($t_j = 1$), a total of n − 1 times.
Steps 5 and 6 are not executed at all.
This is the case when the array is already sorted.
$T(n) = an + b = O(n)$

2. Worst Case: The algorithm performs worst if A[i] > key for
each value of j, so that the loop stops only when i < 1 in step 4.
Then the test in step 4 always executes j times ($t_j = j$) for each
value of j = 2, 3, . . . , n, so
$\sum_{j=2}^{n} j = \frac{(n-1)(n+2)}{2}$
and steps 5 and 6 execute
$\sum_{j=2}^{n} (j-1) = \frac{(n-1)n}{2}$
times.
This is the case when the array is sorted in reverse order.
$T(n) = an^2 + bn + c = O(n^2)$
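The best-case and worst-case counts above can be checked with a small Java sketch. Here t_j is taken to be the number of times the while-loop test of step 4 is evaluated for a given j, and the method returns the sum of all t_j. The class and method names are assumptions for illustration, and the array is 0-indexed as usual in Java, so the outer loop starts at 1 rather than 2.

```java
class InsertionSortCost {
    /**
     * Sorts a in place with insertion sort and returns how many times the
     * while-loop test (step 4 of the pseudocode) was evaluated in total.
     */
    static long sortAndCountTests(int[] a) {
        long tests = 0;
        for (int j = 1; j < a.length; j++) {
            int key = a[j];
            int i = j - 1;
            while (true) {
                tests++;                          // one evaluation of the step-4 test
                if (i < 0 || a[i] <= key) {
                    break;                        // test failed: stop shifting
                }
                a[i + 1] = a[i];                  // step 5
                i--;                              // step 6
            }
            a[i + 1] = key;                       // step 7
        }
        return tests;
    }

    public static void main(String[] args) {
        System.out.println(sortAndCountTests(new int[] {1, 2, 3, 4, 5})); // already sorted
        System.out.println(sortAndCountTests(new int[] {5, 4, 3, 2, 1})); // reverse sorted
    }
}
```

For n = 5 this gives 4 tests for the sorted input and 14 for the reversed input, matching n − 1 and (n − 1)(n + 2)/2 respectively.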

Arrays
Here are the main properties of arrays in Java:
I Arrays are objects.
I Arrays are created dynamically (at run time).
I Any method of the Object class may be invoked on an array.
I The variables contained in an array are called its components or
elements.
I If the component type is T, then the array itself has type T[].
I An element’s type may be either primitive or reference.
I The length of an array is its number of components.
I An array’s length is set when the array is created, and it
cannot be changed.
I Array index values must be integers in the range 0 ... length − 1.
I Variables of type short, byte, or char can be used as indexes
because they are promoted to int; a long index is a compile-time
error.

Here are some valid array definitions:
I float x[ ] = new float[100];
I String[ ] args; args = new String[10];
I boolean[ ] isPrime = new boolean[1000];
I int fib[ ] = {0, 1, 1, 2, 3, 5, 8, 13};
I short[ ][ ][ ] b = new short[4][10][5];
I double a[ ][ ] = {{1.1, 2.2}, {3.3, 4.4}, null, {5.5, 6.6}, null};
a[4] = new double[66];
a[4][65] = 3.14;
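A few of the properties and definitions above can be tried out with the following sketch. The class name ArrayBasics and the helper buildJagged are assumptions introduced for this illustration; the array contents are taken from the definitions above.

```java
class ArrayBasics {
    /** Builds the two-dimensional array from the definitions above and fills in row 4. */
    static double[][] buildJagged() {
        double[][] a = {{1.1, 2.2}, {3.3, 4.4}, null, {5.5, 6.6}, null};
        a[4] = new double[66];       // rows are separate arrays and may differ in length
        a[4][65] = 3.14;
        return a;
    }

    public static void main(String[] args) {
        int[] fib = {0, 1, 1, 2, 3, 5, 8, 13};
        System.out.println(fib.length);             // length is fixed when the array is created
        System.out.println(fib instanceof Object);  // arrays are objects
        byte i = 3;
        System.out.println(fib[i]);                 // a byte index is promoted to int

        double[][] a = buildJagged();
        System.out.println(a.length + " rows, row 4 has " + a[4].length + " elements");
    }
}
```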

Single and Multidimensional Arrays

I int a[ ]; (one-dimensional)
I int a[ ][ ]; (two-dimensional)
I int a[ ][ ][ ]; (three-dimensional)

Representation of Arrays
I Row Major Order: Row-major ordering assigns successive
elements to successive memory locations by moving across a row
and then down to the next row. In simple language, the
elements of an array are stored row by row.
I Column Major Order: Column-major ordering assigns successive
elements to successive memory locations by moving down a column
and then across to the next column, so the elements are stored
column by column.

Derivation of Index Formula

I 1-D Array: Address of A[Index] = B + W ∗ (Index − LB)
where:
Index = The index of the element whose address is to be
found (not the value of the element).
B = Base address of the array.
W = Storage size of one element in bytes.
LB = Lower bound of the index (if not specified, assume
zero).

I 2-D Array:
Row Major Order:
Address of A[I][J] = B + W ∗ (M ∗ (I − LR) + (J − LC)) where:
I = Row subscript of the element whose address is to be found,
J = Column subscript of the element whose address is to be found,
B = Base address,
W = Storage size of one element of the array (in bytes),
LR = Lower limit of the row index, i.e. the first row index of the
matrix (if not given, assume zero),
LC = Lower limit of the column index, i.e. the first column index of
the matrix (if not given, assume zero),
M = Number of columns in the matrix.

Column Major Order:
Address of A[I][J] = B + W ∗ ((I − LR) + N ∗ (J − LC)) where:
I, J, B, W, LR and LC are as defined for row-major order,
N = Number of rows in the matrix.
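The 1-D and 2-D formulas can be evaluated with a short Java sketch. Java itself does not expose memory addresses, so this is only the arithmetic of the formulas; the class ArrayAddress, its method names and the sample values in main are assumptions made for illustration.

```java
class ArrayAddress {
    /** 1-D: B + W * (Index - LB). */
    static long address1D(long base, int width, int index, int lb) {
        return base + (long) width * (index - lb);
    }

    /** 2-D row-major: B + W * (M * (I - LR) + (J - LC)), where M is the number of columns. */
    static long rowMajor2D(long base, int width, int cols, int i, int j, int lr, int lc) {
        return base + (long) width * ((long) cols * (i - lr) + (j - lc));
    }

    /** 2-D column-major: B + W * ((I - LR) + N * (J - LC)), where N is the number of rows. */
    static long colMajor2D(long base, int width, int rows, int i, int j, int lr, int lc) {
        return base + (long) width * ((i - lr) + (long) rows * (j - lc));
    }

    public static void main(String[] args) {
        // Hypothetical 3 x 4 matrix, base address 100, 2-byte elements, lower bounds 0.
        System.out.println(rowMajor2D(100, 2, 4, 1, 2, 0, 0));
        System.out.println(colMajor2D(100, 2, 3, 1, 2, 0, 0));
    }
}
```

For the sample values, element A[1][2] is the 7th element in row-major order (offset 6) and the 8th in column-major order (offset 7), which gives addresses 112 and 114.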

I multi-D Array:
Row Major Order:
Address of
A[I][J][K] = B + W ∗ (N ∗ L ∗ (I − x) + L ∗ (J − y) + (K − z))
where:
B = Base address (start address)
W = Weight (storage size of one element stored in the array)
I, J, K = First (layer), second (row) and third (column) subscripts
of the element
N = Number of rows in each layer (number of values J can take)
L = Number of columns in each row (number of values K can take)
x, y, z = Lower bounds of I, J and K respectively

Column Major Order (each layer stored column by column):
Address of
A[I][J][K] = B + W ∗ (N ∗ L ∗ (I − x) + (J − y) + (K − z) ∗ N)
where B, W, N, L, x, y and z are as defined for row-major order.
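The 3-D formulas can be sketched in the same way. This example assumes that N is the number of values the second subscript J can take and L is the number of values the third subscript K can take, which is the reading under which both formulas are consistent. The class ArrayAddress3D and the sample values are hypothetical.

```java
class ArrayAddress3D {
    /** Row-major: B + W * (N*L*(I - x) + L*(J - y) + (K - z)). */
    static long rowMajor(long b, int w, int n, int l,
                         int i, int j, int k, int x, int y, int z) {
        return b + (long) w * ((long) n * l * (i - x) + (long) l * (j - y) + (k - z));
    }

    /** Column-major within each layer: B + W * (N*L*(I - x) + (J - y) + (K - z)*N). */
    static long colMajor(long b, int w, int n, int l,
                         int i, int j, int k, int x, int y, int z) {
        return b + (long) w * ((long) n * l * (i - x) + (j - y) + (long) (k - z) * n);
    }

    public static void main(String[] args) {
        // Hypothetical array with N = 3, L = 4, base 1000, 4-byte elements, lower bounds 0.
        System.out.println(rowMajor(1000, 4, 3, 4, 1, 1, 2, 0, 0, 0));
        System.out.println(colMajor(1000, 4, 3, 4, 1, 1, 2, 0, 0, 0));
    }
}
```

With these values A[1][1][2] lies at offset 12 + 4 + 2 = 18 elements in row-major order and 12 + 1 + 6 = 19 elements in column-major order.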

Application of arrays
Below are some applications of arrays.
I Storing and accessing data: Arrays are used to store and
retrieve data in a specific order. For example, an array can be
used to store the scores of a group of students, or the
temperatures recorded by a weather station.
I Sorting: Arrays can be used to sort data in ascending or
descending order. Sorting algorithms such as bubble sort,
merge sort, and quicksort rely heavily on arrays.
I Searching: Arrays can be searched for specific elements using
algorithms such as linear search and binary search.
I Matrices: Arrays are used to represent matrices in
mathematical computations such as matrix multiplication,
linear algebra, and image processing.

I Stacks and queues: Arrays are used as the underlying data
structure for implementing stacks and queues, which are
commonly used in algorithms and data structures.
I Graphs: Arrays can be used to represent graphs in computer
science. Each element in the array represents a node in the
graph, and the relationships between the nodes are
represented by the values stored in the array.
I Dynamic programming: Dynamic programming algorithms
often use arrays to store intermediate results of subproblems
in order to solve a larger problem.
Below are some real-time applications of arrays:
I Signal Processing: Arrays are used in signal processing to
represent a set of samples that are collected over time. This
can be used in applications such as speech recognition, image
processing, and radar systems.

I Multimedia Applications: Arrays are used in multimedia
applications such as video and audio processing, where they
are used to store the pixel or audio samples. For example, an
array can be used to store the RGB values of an image.
I Data Mining: Arrays are used in data mining applications to
represent large datasets. This allows for efficient data access
and processing, which is important in real-time applications.
I Robotics: Arrays are used in robotics to represent the
position and orientation of objects in 3D space. This can be
used in applications such as motion planning and object
recognition.

I Real-time Monitoring and Control Systems: Arrays are
used in real-time monitoring and control systems to store
sensor data and control signals. This allows for real-time
processing and decision-making, which is important in
applications such as industrial automation and aerospace
systems.
I Financial Analysis: Arrays are used in financial analysis to
store historical stock prices and other financial data. This
allows for efficient data access and analysis, which is
important in real-time trading systems.
I Scientific Computing: Arrays are used in scientific
computing to represent numerical data, such as measurements
from experiments and simulations. This allows for efficient
data processing and visualization, which is important in
real-time scientific analysis and experimentation.

Applications of Array in Java:

I Storing collections of data: Arrays are often used to store
collections of data of the same type. For example, an array of
integers can be used to store a set of numerical values.
I Implementing matrices and tables: Arrays can be used to
implement matrices and tables. For example, a
two-dimensional array can be used to store a matrix of
numerical values.
I Sorting and searching: Arrays are often used for sorting and
searching data. For example, the Arrays class in Java provides
methods like sort() and binarySearch() to sort and search
elements in an array.

I Implementing data structures: Arrays are used as the
underlying data structure for several other data structures like
stacks, queues, and heaps. For example, an array-based
implementation of a stack can be used to store elements in
the stack.
I Image processing: Arrays are commonly used to store the
pixel values of an image. For example, a two-dimensional
array can be used to store the RGB values of an image.
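The sort() and binarySearch() methods of java.util.Arrays mentioned in the list above can be used as in the following sketch. The class ArraysClassDemo, the helper sortedIndexOf and the sample scores are assumptions for illustration.

```java
import java.util.Arrays;

class ArraysClassDemo {
    /**
     * Sorts a copy of scores and returns the index of target in the sorted
     * copy, or a negative value if target is absent.
     */
    static int sortedIndexOf(int[] scores, int target) {
        int[] copy = Arrays.copyOf(scores, scores.length); // leave the caller's array untouched
        Arrays.sort(copy);                                 // binarySearch requires sorted input
        return Arrays.binarySearch(copy, target);
    }

    public static void main(String[] args) {
        int[] scores = {42, 7, 19, 88, 3};
        System.out.println(sortedIndexOf(scores, 19));
        System.out.println(sortedIndexOf(scores, 50) < 0);
    }
}
```

Arrays.binarySearch gives a meaningful result only on a sorted array, which is why the sketch sorts first.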

Sparse Matrices and their representations

A matrix is a two-dimensional data object made of m rows and n
columns, therefore having m × n values in total. If most of the
elements of the matrix are 0, it is called a sparse matrix.
Why use a sparse matrix representation instead of a simple matrix?
I Storage: There are fewer non-zero elements than zeros, so
less memory is needed if only those elements are stored.
I Computing time: Computing time can be saved by designing a
data structure that traverses only the non-zero elements.

- Representing a sparse matrix by a 2D array wastes a lot of
memory, as the zeroes in the matrix are of no use in most
cases.
- So, instead of storing zeroes along with non-zero elements, we
store only the non-zero elements.
- This means storing each non-zero element as a triple
(Row, Column, Value).
A sparse matrix can be represented in many ways. The following are
two common representations:
1. Array representation
2. Linked list representation

Method 1: Using Arrays

A 2D array with three rows is used to represent the sparse matrix,
with one column per non-zero element. The rows are named:
I Row: Index of the row where the non-zero element is located
I Column: Index of the column where the non-zero element is located
I Value: Value of the non-zero element located at index
(row, column)
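The array representation can be sketched as follows. The class SparseTriplets, the method toTriplets and the sample matrix are assumptions for illustration; the result has three rows (row indices, column indices, values) and one column per non-zero element.

```java
class SparseTriplets {
    /** Converts a dense matrix into a 3 x k array of (row, column, value) triples. */
    static int[][] toTriplets(int[][] dense) {
        int count = 0;
        for (int[] row : dense) {
            for (int v : row) {
                if (v != 0) count++;              // first pass: count non-zero elements
            }
        }
        int[][] triplets = new int[3][count];
        int k = 0;
        for (int r = 0; r < dense.length; r++) {
            for (int c = 0; c < dense[r].length; c++) {
                if (dense[r][c] != 0) {
                    triplets[0][k] = r;           // Row
                    triplets[1][k] = c;           // Column
                    triplets[2][k] = dense[r][c]; // Value
                    k++;
                }
            }
        }
        return triplets;
    }

    public static void main(String[] args) {
        int[][] dense = {
            {0, 0, 3, 0, 4},
            {0, 0, 5, 7, 0},
            {0, 0, 0, 0, 0},
            {0, 2, 6, 0, 0}
        };
        for (int[] row : toTriplets(dense)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

The 4 × 5 sample matrix has 20 cells but only 6 non-zero values, so the triple form stores 18 integers instead of 20; the saving grows as the matrix becomes larger and sparser.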


Method 2: Using Linked Lists

In the linked list representation, each node has four fields:
I Row: Index of the row where the non-zero element is located
I Column: Index of the column where the non-zero element is located
I Value: Value of the non-zero element located at index
(row, column)
I Next node: Address of the next node
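A minimal sketch of the linked list representation is given below. The names SparseList, Node, fromDense and size are assumptions for illustration; each node carries the four fields listed above, with the Java reference next playing the role of the address of the next node.

```java
class SparseList {
    /** One non-zero element of the matrix. */
    static final class Node {
        final int row;
        final int col;
        final int value;
        Node next;

        Node(int row, int col, int value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    /** Builds a singly linked list of the non-zero elements in row-by-row order. */
    static Node fromDense(int[][] dense) {
        Node head = null;
        Node tail = null;
        for (int r = 0; r < dense.length; r++) {
            for (int c = 0; c < dense[r].length; c++) {
                if (dense[r][c] == 0) continue;
                Node node = new Node(r, c, dense[r][c]);
                if (head == null) {
                    head = node;                  // first non-zero element
                } else {
                    tail.next = node;             // append at the end
                }
                tail = node;
            }
        }
        return head;
    }

    /** Counts the nodes, i.e. the non-zero elements. */
    static int size(Node head) {
        int n = 0;
        for (Node p = head; p != null; p = p.next) n++;
        return n;
    }

    public static void main(String[] args) {
        int[][] dense = {{0, 0, 3}, {0, 0, 0}, {5, 0, 0}};
        for (Node p = fromDense(dense); p != null; p = p.next) {
            System.out.println("(" + p.row + ", " + p.col + ") = " + p.value);
        }
    }
}
```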


Arithmetic operations on matrices

I Addition of Matrix
I Subtraction of Matrix
I Scalar Multiplication of Matrix
I Multiplication of Matrix
I Transpose
I Inversion
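Some of the operations listed above can be sketched on plain two-dimensional int arrays as follows. The class MatrixOps and its method names are assumptions for illustration, and the code assumes non-empty rectangular matrices with compatible dimensions. Subtraction is analogous to addition, and inversion is left out of this sketch.

```java
class MatrixOps {
    /** Element-wise sum of two matrices of the same size. */
    static int[][] add(int[][] a, int[][] b) {
        int[][] c = new int[a.length][a[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                c[i][j] = a[i][j] + b[i][j];
        return c;
    }

    /** Multiplies every element by the scalar k. */
    static int[][] scale(int k, int[][] a) {
        int[][] c = new int[a.length][a[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                c[i][j] = k * a[i][j];
        return c;
    }

    /** Product of an (m x p) and a (p x n) matrix. */
    static int[][] multiply(int[][] a, int[][] b) {
        int[][] c = new int[a.length][b[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < b[0].length; j++)
                for (int k = 0; k < b.length; k++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    /** Swaps rows and columns. */
    static int[][] transpose(int[][] a) {
        int[][] t = new int[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                t[j][i] = a[i][j];
        return t;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        System.out.println(java.util.Arrays.deepToString(multiply(a, b)));
    }
}
```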

Recursion
The process in which a function calls itself directly or indirectly is
called recursion and the corresponding function is called a
recursive function. Using recursive algorithm, certain problems
can be solved quite easily.
Need of Recursion:
I Recursion is a technique that can reduce the length of our
code and make it easier to read and write.
I It has certain advantages over the iteration technique, which
will be discussed later.
I When a task can be defined in terms of similar, smaller
subtasks, recursion is one of the best solutions for it, for
example computing the factorial of a number.

Properties of Recursion:
I Performing the same operations multiple times with different
inputs.
I In every step, we try smaller inputs to make the problem
smaller.
I A base condition is needed to stop the recursion; otherwise the
function keeps calling itself until the call stack overflows.
Algorithmic Steps:
The algorithmic steps for implementing recursion in a function are
as follows:
1 - Define a base case: Identify the simplest case for which the
solution is known or trivial. This is the stopping condition for
the recursion, as it prevents the function from infinitely calling
itself.

2 - Define a recursive case: Define the problem in terms of
smaller subproblems. Break the problem down into smaller
versions of itself, and call the function recursively to solve
each subproblem.
3 - Ensure the recursion terminates: Make sure that the
recursive function eventually reaches the base case, and does
not enter an infinite loop.
4 - Combine the solutions: Combine the solutions of the
subproblems to solve the original problem.
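The four steps above can be seen in a recursive factorial, sketched here under the assumption that the result fits in a long (n up to 20). The class and method names are chosen for this illustration.

```java
class Factorial {
    static long factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n <= 1) {
            return 1;                      // step 1: base case, solution is known
        }
        // step 2: recursive case on the smaller input n - 1
        // step 3: n - 1 moves towards the base case, so the recursion terminates
        // step 4: combine by multiplying the sub-result by n
        return n * factorial(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(factorial(5));
    }
}
```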
Types of Recursions:
- Recursion is mainly of two types, depending on whether a
function calls itself directly or two or more functions
call one another mutually.
- The first is called direct recursion and the second is called
indirect recursion.

Direct Recursion
When a function calls itself from within its own body, it is called
direct recursion. It can be further categorized into four types:
I Tail Recursion: If a recursive function calls itself and that
recursive call is the last statement in the function, it is
known as tail recursion.
- After that call the function performs nothing. All processing
is done at the time of calling and nothing is done at returning
time.

I Head Recursion: If a recursive function calls itself and that
recursive call is the first statement in the function, it is
known as head recursion.
- There is no statement or operation before the call.
- The function does no processing at the time of calling; all
operations are done at returning time.

I Tree Recursion: If a recursive function calls itself only
once, it is known as linear recursion. If it calls itself more
than once, it is known as tree recursion.

I Nested Recursion: In this recursion, a recursive function
passes a recursive call as its parameter. That means recursion
inside recursion.
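The four kinds of direct recursion can be contrasted with the small sketch below. The class DirectRecursionKinds and its methods are assumptions for illustration; output is collected in a StringBuilder so that the order of operations is easy to inspect. In head, the recursive call is the first statement inside the guarding if.

```java
class DirectRecursionKinds {
    /** Tail recursion: work first, recursive call last. */
    static void tail(int n, StringBuilder out) {
        if (n > 0) {
            out.append(n).append(' ');
            tail(n - 1, out);
        }
    }

    /** Head recursion: recursive call first, work while returning. */
    static void head(int n, StringBuilder out) {
        if (n > 0) {
            head(n - 1, out);
            out.append(n).append(' ');
        }
    }

    /** Tree recursion: the function calls itself more than once. */
    static void tree(int n, StringBuilder out) {
        if (n > 0) {
            out.append(n).append(' ');
            tree(n - 1, out);
            tree(n - 1, out);
        }
    }

    /** Nested recursion: the argument of the recursive call is itself a recursive call. */
    static int nested(int n) {
        if (n > 100) {
            return n - 10;
        }
        return nested(nested(n + 11));
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        tail(3, out);
        out.append("| ");
        head(3, out);
        out.append("| ");
        tree(3, out);
        System.out.println(out);
        System.out.println(nested(95));
    }
}
```

Here tail(3) records 3 2 1, head(3) records 1 2 3, and tree(3) records 3 2 1 1 2 1 1. The nested function is the well-known McCarthy 91 function, which returns 91 for every argument up to 100.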


Indirect Recursion
In this recursion, there is more than one function, and the
functions call one another in a circular manner.
For example, fun(A) calls fun(B), fun(B) calls fun(C) and fun(C)
calls fun(A), and thus it makes a cycle.
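A minimal sketch of indirect recursion is given below, using a cycle of two functions rather than three for brevity. The names IndirectRecursion, isEven and isOdd are assumptions for illustration, and the argument is assumed to be non-negative.

```java
class IndirectRecursion {
    /** isEven calls isOdd, which calls isEven again: a cycle of two functions. */
    static boolean isEven(int n) {
        if (n == 0) {
            return true;               // base case
        }
        return isOdd(n - 1);
    }

    static boolean isOdd(int n) {
        if (n == 0) {
            return false;              // base case
        }
        return isEven(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(isEven(10));
        System.out.println(isOdd(7));
    }
}
```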

Removal of recursion
By replacing the selection structure with a loop, recursion can be
eliminated. A data structure is required in addition to the loop if
some data needs to be kept for processing beyond the end of the
recursive step. A simple string, an array, or a stack are examples of
data structures.
There are a few ways to remove recursion from code, including:
I Iteration: Wrap your algorithm in a loop, pushing and popping
a custom call stack at the start and end of each iteration.
I Macro expansion: This technique can eliminate recursion, but
the depth of recursion is limited by the number of macro
invocations.

I Refactoring: In Python, you can refactor the code using a
series of small, careful refactorings to remove a single
recursion.
I Stack: You can use a stack to store a representation of the
operations that need to be performed.
I Generalization: Generalize the function definition.
I Computation traces: Study the computation traces of the
function.
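The stack-based technique can be sketched as follows: a recursive method that lists the decimal digits of a number, most significant first, and an equivalent version that uses a loop and an explicit stack. The class RecursionRemoval and both method names are assumptions for illustration, and n is assumed to be non-negative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class RecursionRemoval {
    /** Recursive version: the pending calls remember the lower digits. */
    static void digitsRecursive(int n, StringBuilder out) {
        if (n >= 10) {
            digitsRecursive(n / 10, out);
        }
        out.append(n % 10).append(' ');
    }

    /** Iterative version: an explicit stack replaces the call stack. */
    static String digitsIterative(int n) {
        Deque<Integer> stack = new ArrayDeque<>();
        do {
            stack.push(n % 10);        // least significant digit is pushed first
            n /= 10;
        } while (n > 0);
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop()).append(' ');   // popped in reverse order
        }
        return out.toString();
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        digitsRecursive(4071, out);
        System.out.println(out);
        System.out.println(digitsIterative(4071));
    }
}
```

The stack is needed because the digits are produced in the opposite order to the one in which they are printed; a tail-recursive function would need only the loop.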

Iteration and recursion with examples

I Linear Search
Figure: Linear Search Iterative

Figure: Linear Search Recursive

I Fibonacci Numbers



I Tower of Hanoi
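The Fibonacci and Tower of Hanoi examples named above can be sketched in Java as follows. This is one possible formulation, not necessarily the one shown in the original figures; the class IterationVsRecursion and its method names are assumptions, with the Fibonacci numbers indexed so that F(0) = 0 and F(1) = 1.

```java
class IterationVsRecursion {
    /** Recursive Fibonacci: short, but recomputes the same subproblems many times. */
    static long fibRecursive(int n) {
        return n < 2 ? n : fibRecursive(n - 1) + fibRecursive(n - 2);
    }

    /** Iterative Fibonacci: one loop, two variables. */
    static long fibIterative(int n) {
        if (n == 0) {
            return 0;
        }
        long prev = 0;
        long curr = 1;
        for (int i = 2; i <= n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return curr;
    }

    /**
     * Tower of Hanoi: moves n disks from peg 'from' to peg 'to' using peg 'via',
     * appends each move to moves and returns the number of moves made.
     */
    static int hanoi(int n, char from, char to, char via, StringBuilder moves) {
        if (n == 0) {
            return 0;
        }
        int count = hanoi(n - 1, from, via, to, moves);     // clear the way
        moves.append(from).append("->").append(to).append(' ');
        count++;                                            // move the largest disk
        count += hanoi(n - 1, via, to, from, moves);        // put the rest back on top
        return count;
    }

    public static void main(String[] args) {
        System.out.println(fibRecursive(10) + " " + fibIterative(10));
        StringBuilder moves = new StringBuilder();
        System.out.println(hanoi(3, 'A', 'C', 'B', moves) + " moves: " + moves);
    }
}
```

Tower of Hanoi is naturally recursive and needs 2^n − 1 moves for n disks, whereas Fibonacci is a case where the iterative version is clearly cheaper than the naive recursive one.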

Trade-off between iteration and recursion

The trade-offs between iteration and recursion in programming
include:
I Speed: Iteration is generally faster than recursion, because it
avoids the overhead of repeated function calls.
I Memory: Recursion requires more memory than iteration, because
every pending call occupies a stack frame.
I Code complexity: Recursion can lead to simpler, more
readable code, while iteration can result in more complex
code.
I Time complexity: A recursive solution can have a higher time
complexity than an iterative one when it recomputes the same
subproblems (for example, naive recursive Fibonacci); otherwise
the two usually differ only by a constant factor.
I Approach: Recursion naturally expresses a divide and conquer
approach, while iteration follows a sequential execution approach.
I Suitability: Recursion is better for tasks that can be described
naturally in a recursive way, while iteration is better for simple
repetition.
I Optimization: It can be difficult to optimize recursive code.
Searching
Searching algorithms are essential tools in computer science used to
locate specific items within a collection of data. These algorithms
are designed to efficiently navigate through data structures to find
the desired information, making them fundamental in various
applications such as databases, web search engines, and more.
Different searching algorithms are:
I Linear Search
I Binary Search
I Indexed Sequential Search
I Hashing

Linear Search
Linear search is a method for searching for an element in a
collection of elements. Each element of the collection is visited one
by one in a sequential fashion to find the desired element. Linear
search is also known as sequential search.
Linear Search Algorithm:
I Every element is considered as a potential match for the key
and checked for the same.
I If any element is equal to the key, the search is successful and
the index of that element is returned.
I If no element is found equal to the key, the search yields “No
match found”.

Figure: Linear Search using Iteration

Figure: Linear Search using Recursion
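The two variants of linear search can be sketched in Java as follows. This is an illustrative version and may differ from the code in the original figures; the class and method names are assumptions, and -1 is returned for "No match found".

```java
class LinearSearch {
    /** Iterative version: checks every element in turn. */
    static int searchIterative(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) {
                return i;              // successful search: index of the match
            }
        }
        return -1;                     // no match found
    }

    /** Recursive version: checks a[from], then searches the rest of the array. */
    static int searchRecursive(int[] a, int key, int from) {
        if (from >= a.length) {
            return -1;                 // base case: nothing left to check
        }
        if (a[from] == key) {
            return from;
        }
        return searchRecursive(a, key, from + 1);
    }

    public static void main(String[] args) {
        int[] data = {10, 50, 30, 70, 80, 20};
        System.out.println(searchIterative(data, 30));
        System.out.println(searchRecursive(data, 30, 0));
    }
}
```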

Binary Search
Binary search is a search algorithm used to find the position of a
target value within a sorted array. It works by repeatedly dividing
the search interval in half until the target value is found or the
interval is empty. The search interval is halved by comparing the
target element with the middle value of the search space.
Conditions to apply Binary Search
I The data structure must be sorted.
I Access to any element of the data structure should take
constant time.
Binary Search Algorithm:
I Divide the search space into two halves by finding the middle
index “mid”.
I Compare the middle element of the search space with the key.

I If the key is found at the middle element, the process is
terminated.
I If the key is not found at the middle element, choose which half
will be used as the next search space.
I If the key is smaller than the middle element, the left half
is used for the next search.
I If the key is larger than the middle element, the right half
is used for the next search.
I This process continues until the key is found or the
search space is exhausted.

Figure: Binary Search using Iteration

Figure: Binary Search using Recursion
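The steps of binary search can be sketched in Java as follows. This is an illustrative version and may differ from the code in the original figures; the class and method names are assumptions, the array is assumed to be sorted in ascending order, and -1 is returned when the key is absent.

```java
class BinarySearch {
    /** Iterative version: shrinks the interval [low, high] until it is empty. */
    static int searchIterative(int[] sorted, int key) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;   // written this way to avoid int overflow
            if (sorted[mid] == key) {
                return mid;
            } else if (key < sorted[mid]) {
                high = mid - 1;                 // continue in the left half
            } else {
                low = mid + 1;                  // continue in the right half
            }
        }
        return -1;                              // search space exhausted
    }

    /** Recursive version of the same idea. */
    static int searchRecursive(int[] sorted, int key, int low, int high) {
        if (low > high) {
            return -1;
        }
        int mid = low + (high - low) / 2;
        if (sorted[mid] == key) {
            return mid;
        }
        return key < sorted[mid]
                ? searchRecursive(sorted, key, low, mid - 1)
                : searchRecursive(sorted, key, mid + 1, high);
    }

    public static void main(String[] args) {
        int[] data = {2, 5, 8, 12, 16, 23, 38, 56};
        System.out.println(searchIterative(data, 23));
        System.out.println(searchRecursive(data, 23, 0, data.length - 1));
    }
}
```

Because the interval is halved at every step, at most about log₂ n + 1 comparisons with the middle element are needed for an array of n elements.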

Indexed Sequential Search

Indexed Sequential Search: In this searching method, an index file
is created first. Each index entry refers to a specific group
(block) of records. Searching with the index takes less time,
because only the group that can contain the record has to be
examined. When the user requests a specific record, the search
first finds the index group in which that record is stored and then
looks for the record within that group.
Characteristics:
I In Indexed Sequential Search a sorted index is set aside in
addition to the array.
I Each element in the index points to a block of elements in the
array or to another, expanded index.
I The index is searched first; it then guides the search in the
array.
Indexed Sequential Search can apply the indexing more than once,
for example by creating an index of an index.
Figure: Indexed Sequential Search
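A single-level indexed sequential search can be sketched as follows. The design here is an assumption made for illustration: the index holds the first key of every block of blockSize elements, blockSize is at least 1, and the index is rebuilt on every call only to keep the example short (in practice it would be built once and reused). The class and method names are hypothetical.

```java
class IndexedSequentialSearch {
    /** Returns the position of key in the sorted array, or -1 if it is absent. */
    static int search(int[] sorted, int key, int blockSize) {
        if (sorted.length == 0) {
            return -1;
        }
        // Build the index: the first key of every block.
        int blocks = (sorted.length + blockSize - 1) / blockSize;
        int[] indexKeys = new int[blocks];
        for (int b = 0; b < blocks; b++) {
            indexKeys[b] = sorted[b * blockSize];
        }
        // Search the index first: find the last block whose first key is <= key.
        int block = -1;
        for (int b = 0; b < blocks && indexKeys[b] <= key; b++) {
            block = b;
        }
        if (block < 0) {
            return -1;                 // key is smaller than every element
        }
        // Then search sequentially inside that one block.
        int start = block * blockSize;
        int end = Math.min(start + blockSize, sorted.length);
        for (int i = start; i < end; i++) {
            if (sorted[i] == key) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {3, 7, 11, 15, 19, 23, 27, 31, 35, 39};
        System.out.println(search(data, 23, 3));
    }
}
```

With a block size of 3 the index for the sample data holds the keys 3, 15, 27 and 39, so a search for 23 scans the index and then only the block 15, 19, 23.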

Hashing
Hashing is a technique used in data structures to store and
retrieve data in a way that allows quick access.
- Hashing refers to the process of generating a fixed-size output
from an input of variable size using mathematical formulas
known as hash functions.
- This technique determines an index or location for the storage of
an item in a data structure.
- It involves mapping data to a specific index in a hash table using
a hash function, which enables fast retrieval of information based
on its key.
- This method is commonly used in databases, caching systems,
and various programming applications to optimize search and
retrieval operations.
- The great thing about hashing is that we can achieve all three
operations (search, insert and delete) in O(1) time on average.

Components of Hashing: There are three major components of
hashing:
1. Key: A key can be anything, such as a string or an integer, that
is fed as input to the hash function, which determines an
index or location for storage of an item in a data structure.
2. Hash Function: The hash function receives the input key and
returns the index of an element in an array called a hash
table. The index is known as the hash index.
3. Hash Table: A hash table is a data structure that maps keys to
values using a hash function. It stores the data in an
associative manner in an array, where each data value is stored at
the index computed from its key.

Collision: A collision in hashing occurs when two different keys
map to the same hash value.
- The hashing process generates a small number for a big key, so
there is a possibility that two keys could produce the same value.
- When a newly inserted key maps to an already occupied slot, the
situation must be handled using some collision handling technique.
Causes of Hash Collisions:
I Poor Hash Function: A hash function that does not distribute
keys evenly across the hash table can lead to more collisions.
I High Load Factor: A high load factor (ratio of keys to hash
table size) increases the probability of collisions.
I Similar Keys: Keys that are similar in value or structure are
more likely to collide.
- There are mainly two methods to handle collisions:
I Open Addressing:
Linear Probing: Search for an empty slot sequentially
Quadratic Probing: Search for an empty slot using a quadratic
function
Cuckoo Hashing: Use multiple hash functions to choose among
alternative slots for each key
I Closed Addressing (Separate Chaining):
Chaining: Store colliding keys in a linked list or binary search
tree at each index
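Linear probing can be sketched with a small table of integer keys. The class LinearProbingTable, the hash function key mod capacity and the sample keys are assumptions for illustration. Deletion is deliberately left out, because with open addressing it needs extra care (for example, marking removed slots) that goes beyond this sketch.

```java
class LinearProbingTable {
    private final Integer[] slots;     // null marks an empty slot

    LinearProbingTable(int capacity) {
        slots = new Integer[capacity];
    }

    /** Hash function: maps a key to a hash index in the range 0 .. capacity - 1. */
    private int hash(int key) {
        return Math.floorMod(key, slots.length);
    }

    /** Inserts key; returns false only if the table is full. */
    boolean insert(int key) {
        int i = hash(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[i] == null) {
                slots[i] = key;        // empty slot found
                return true;
            }
            if (slots[i] == key) {
                return true;           // key is already present
            }
            i = (i + 1) % slots.length;   // collision: try the next slot
        }
        return false;
    }

    /** Returns the slot that holds key, or -1 if key is absent. */
    int indexOf(int key) {
        int i = hash(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[i] == null) {
                return -1;             // an empty slot ends the probe sequence
            }
            if (slots[i] == key) {
                return i;
            }
            i = (i + 1) % slots.length;
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingTable table = new LinearProbingTable(7);
        table.insert(10);
        table.insert(17);
        table.insert(24);
        System.out.println(table.indexOf(10) + " " + table.indexOf(17) + " " + table.indexOf(24));
    }
}
```

With capacity 7 the keys 10, 17 and 24 all hash to index 3, so they end up in slots 3, 4 and 5.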
Applications of Hashing: Hash tables are used wherever we have a
combination of search, insert and/or delete operations.
I Dictionaries: To implement a dictionary so that we can
quickly search a word.

I Databases: Hashing is used in database indexing. There are
two popular ways to implement indexing: search trees (B or
B+ Tree) and hashing.
I Cryptography: When we create a password on a website, the site
typically stores it after applying a hash function rather than
as plain text.
I Caching: Storing frequently accessed data for faster retrieval.
For example, a browser cache can use URLs as keys to find
the locally stored copy of each page.
I Symbol Tables: Mapping identifiers to their values in
programming languages
I Network Routing: Determining the best path for data packets.
I Associative Arrays: Associative arrays are commonly implemented
as hash tables. SQL library functions commonly allow you to
retrieve data as associative arrays so that the retrieved data in
RAM can be quickly searched for a key.
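In Java, the standard java.util.HashMap class is a ready-made hash table, so the dictionary and associative-array uses above need no custom implementation. The sketch below counts word frequencies; WordCount and countWords are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: using HashMap as a dictionary / associative array.
class WordCount {
    // Returns a map from each lower-cased word to the number of times it occurs.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // skip the empty token produced by leading whitespace
            }
            counts.merge(word, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("to be or not to be");
        System.out.println(counts.get("to"));          // prints 2
        System.out.println(counts.containsKey("is"));  // prints false
    }
}
```

Each lookup, insertion and update here is a hash table operation, which is why such counting stays fast even for large inputs.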

Sorting
A Sorting Algorithm is used to rearrange a given array or list of
elements according to a comparison operator on the elements. The
comparison operator is used to decide the new order of elements in
the respective data structure. For example: arranging students
according to height in the morning assembly, seating them
roll-number-wise in exams, or arranging names marks-wise in a
merit list. There are different algorithms for sorting:
I Insertion Sort
I Bubble Sort
I Selection Sort
I Quick Sort
I Merge Sort
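The role of the comparison operator can be sketched with Java's Comparator interface. The example below sorts the same students by height and by roll number using the library method Arrays.sort; the Student class, its fields and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical record of a student, used only for this illustration.
class Student {
    final String name;
    final int rollNo;
    final int heightCm;

    Student(String name, int rollNo, int heightCm) {
        this.name = name;
        this.rollNo = rollNo;
        this.heightCm = heightCm;
    }
}

class SortingIntro {
    // Returns a copy ordered by height (shortest first); the input is not modified.
    static Student[] sortedByHeight(Student[] students) {
        Student[] copy = students.clone();
        Arrays.sort(copy, Comparator.comparingInt((Student s) -> s.heightCm));
        return copy;
    }

    // Returns a copy ordered by roll number.
    static Student[] sortedByRollNo(Student[] students) {
        Student[] copy = students.clone();
        Arrays.sort(copy, Comparator.comparingInt((Student s) -> s.rollNo));
        return copy;
    }
}
```

The data is the same in both calls; only the comparison operator changes, and with it the resulting order.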

Insertion Sort
I Insertion sort is a simple sorting algorithm that works by
iteratively inserting each element of an unsorted list into its
correct position in a sorted portion of the list.
I It is a stable sorting algorithm, meaning that elements with
equal values maintain their relative order in the sorted output.
I Insertion sort is like sorting playing cards in your hands.
I You split the cards into two groups: the sorted cards and the
unsorted cards.
I Then, you pick a card from the unsorted group and put it in
the right place in the sorted group.

Figure: Insertion Sort
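The card-sorting idea above can be sketched in Java as follows. This is one common way to write it, assuming an int array sorted in ascending order; the class name InsertionSort is illustrative.

```java
// Minimal sketch of insertion sort on an int array (ascending order).
class InsertionSort {
    static void sort(int[] a) {
        // a[0..i-1] is the sorted portion; a[i..] is the unsorted portion.
        for (int i = 1; i < a.length; i++) {
            int key = a[i]; // the "card" picked from the unsorted group
            int j = i - 1;
            // Shift larger elements one place to the right to make room for key.
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key; // insert key at its correct position
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 5, 6};
        sort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [1, 2, 5, 5, 6, 9]
    }
}
```

Because the inner loop uses a strict comparison (a[j] > key), equal elements are never moved past each other, which is what makes the sort stable.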

Bubble Sort
Bubble Sort is the simplest sorting algorithm that works by
repeatedly swapping the adjacent elements if they are in the wrong
order. This algorithm is not suitable for large data sets as its
average and worst-case time complexity is O(n^2).
Algorithm:
I Traverse from the left, comparing adjacent elements and
swapping them so that the larger one is placed on the right side.
I In this way, the largest element is moved to the rightmost end
first.
I This process is then repeated to place the second largest
element, and so on, until the data is sorted.

Figure: Bubble Sort
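The passes described above can be sketched in Java as follows, assuming an int array sorted in ascending order. The early exit when a pass makes no swap is an optional refinement added here, not something stated in the slides; the class name BubbleSort is illustrative.

```java
// Minimal sketch of bubble sort on an int array (ascending order).
class BubbleSort {
    static void sort(int[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            boolean swapped = false;
            // After each pass the largest remaining element settles at the right end,
            // so the last 'pass' positions are already in place.
            for (int j = 0; j < a.length - 1 - pass; j++) {
                if (a[j] > a[j + 1]) {
                    int tmp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) {
                break; // no swaps in this pass: the array is already sorted
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 4, 2, 8};
        sort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [1, 2, 4, 5, 8]
    }
}
```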

Selection Sort
I Selection sort is a simple, in-place sorting algorithm that
works by repeatedly selecting the smallest (or largest) element
from the unsorted portion of the list and moving it to the
sorted portion of the list. Its time complexity is O(n^2), so it
is practical only for small lists.
I In each pass, the selected element is swapped with the first
element of the unsorted part, which grows the sorted portion
by one element.
I This process is repeated for the remaining unsorted portion
until the entire list is sorted.

Figure: Selection Sort
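A minimal Java sketch of these steps, assuming an int array sorted in ascending order by selecting the smallest element in each pass; the class name SelectionSort is illustrative.

```java
// Minimal sketch of selection sort on an int array (ascending order).
class SelectionSort {
    static void sort(int[] a) {
        // a[0..i-1] is the sorted portion; a[i..] is the unsorted portion.
        for (int i = 0; i < a.length - 1; i++) {
            int minIndex = i;
            // Find the smallest element in the unsorted portion.
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[minIndex]) {
                    minIndex = j;
                }
            }
            // Swap it with the first element of the unsorted portion.
            int tmp = a[i];
            a[i] = a[minIndex];
            a[minIndex] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {64, 25, 12, 22, 11};
        sort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [11, 12, 22, 25, 64]
    }
}
```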

Quick Sort
I QuickSort is a sorting algorithm based on the Divide and
Conquer approach that picks an element as a pivot and partitions
the given array around the picked pivot by placing the pivot in
its correct position in the sorted array.
I There are mainly three steps in the algorithm:
1. Choose a pivot.
2. Partition the array around the pivot. After partition, it is
ensured that all elements to the left of the pivot are smaller
than or equal to it and all elements to the right are greater,
and we get the index of the end point of the smaller elements.
The left and right parts may not be sorted individually.
3. Recursively call for the two partitioned left and right
subarrays. We stop the recursion when a subarray has at most
one element.


Figure: Quick Sort
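The three steps can be sketched in Java as below. The slides do not fix a pivot rule, so this sketch assumes the last element of each subarray is the pivot and uses the Lomuto partition scheme; the class name QuickSort is illustrative.

```java
// Minimal sketch of quick sort, assuming last-element pivot (Lomuto partition).
class QuickSort {
    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int low, int high) {
        if (low >= high) {
            return; // zero or one element: nothing to sort
        }
        int p = partition(a, low, high); // pivot is now at its final index p
        sort(a, low, p - 1);             // left part: elements <= pivot
        sort(a, p + 1, high);            // right part: elements > pivot
    }

    // Places the pivot a[high] at its correct position and returns that index.
    private static int partition(int[] a, int low, int high) {
        int pivot = a[high];
        int i = low - 1; // end point of the "smaller or equal" region
        for (int j = low; j < high; j++) {
            if (a[j] <= pivot) {
                i++;
                swap(a, i, j);
            }
        }
        swap(a, i + 1, high);
        return i + 1;
    }

    private static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {10, 7, 8, 9, 1, 5};
        sort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [1, 5, 7, 8, 9, 10]
    }
}
```

With this pivot rule an already sorted input gives the most unbalanced partitions, which is why other pivot choices (for example a random element) are often preferred in practice.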

Merge Sort
I Merge sort is a sorting algorithm that follows the
divide-and-conquer approach.
I It works by recursively dividing the input array into smaller
subarrays, sorting those subarrays, and then merging them back
together to obtain the sorted array.
I In simple terms, the process of merge sort is to divide the
array into two halves, sort each half, and then merge the
sorted halves back together.
I This process is repeated until the entire array is sorted.


Figure: Merge Sort
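The divide, sort and merge steps can be sketched in Java as follows. This version assumes an int array and copies each half into a temporary array, which keeps the code short at the cost of extra memory; the class name MergeSort is illustrative.

```java
import java.util.Arrays;

// Minimal sketch of top-down merge sort on an int array (ascending order).
class MergeSort {
    static void sort(int[] a) {
        if (a.length < 2) {
            return; // zero or one element is already sorted
        }
        int mid = a.length / 2;
        int[] left = Arrays.copyOfRange(a, 0, mid);         // first half
        int[] right = Arrays.copyOfRange(a, mid, a.length); // second half
        sort(left);
        sort(right);
        merge(a, left, right);
    }

    // Merges two sorted arrays into out, which must have length left.length + right.length.
    private static void merge(int[] out, int[] left, int[] right) {
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            // Taking from the left on ties keeps equal elements in their original order.
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            out[k++] = left[i++];
        }
        while (j < right.length) {
            out[k++] = right[j++];
        }
    }

    public static void main(String[] args) {
        int[] data = {38, 27, 43, 3, 9, 82, 10};
        sort(data);
        System.out.println(Arrays.toString(data)); // prints [3, 9, 10, 27, 38, 43, 82]
    }
}
```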


Thank you
Please send your feedback or any queries to
[email protected]

