DS E-Content
Dr. A K Yadav
Contents
Introduction
Classification/Types of Data Structures
Applications of Data Structures
Algorithm
Efficiency of an algorithm
Time-space trade-off and complexity
Asymptotic notations
Complexity Analysis
Arrays
Representation of Arrays
Derivation of Index Formula
Application of arrays
Sparse Matrices
Arithmetic operations on matrices
Dr. A K Yadav Data Structures using JAVA 2/102
School of Computer Science and Engineering
Recursion
Direct Recursion
Indirect Recursion
Removal of recursion
Iteration and recursion with examples
Trade-off between iteration and recursion
Searching
Linear Search
Binary Search
Indexed Sequential Search
Hashing
Sorting
Insertion Sort
Bubble Sort
Selection Sort
Quick Sort
Merge Sort
Basic Terminology
What is a Data Structure?
- A data structure is a particular way of organising data in a
computer so that it can be used effectively. The idea is to reduce
the space and time complexities of different tasks.
- The choice of a good data structure makes it possible to perform
a variety of critical operations effectively.
- An efficient data structure also uses minimum memory space and
execution time to process the structure.
- A data structure is not only used for organising the data. It is
also used for processing, retrieving, and storing data.
I Data: Data are simply values or sets of values.
I Data items: A data item refers to a single unit of values.
I Group items: Data items that are divided into sub-items are
called group items. Ex: An employee name may be divided
into three sub-items: first name, middle name, and last name.
I Elementary items: Data items that cannot be divided
into sub-items are called elementary items. Ex: EnRollNo.
I Entity: An entity is something that has certain attributes or
properties which may be assigned values. The values may be
either numeric or non-numeric.
I Entities with similar attributes form an entity set.
I Each attribute of an entity set has a range of values, the set
of all possible values that could be assigned to the particular
attribute.
I The term information is sometimes used for data with given
attributes, in other words meaningful or processed data.
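The terminology above can be pictured with a small Java sketch. The class names, field names and sample values are hypothetical and chosen only for illustration: a name is modelled as a group item with three sub-items, and a student as an entity whose attributes are assigned values.

```java
// Hypothetical names throughout; only the terminology comes from the text above.
final class Name {                 // group item: divisible into sub-items
    final String first;            // each sub-item is an elementary item
    final String middle;
    final String last;

    Name(String first, String middle, String last) {
        this.first = first;
        this.middle = middle;
        this.last = last;
    }

    String full() {
        return first + " " + middle + " " + last;
    }
}

final class Student {              // entity: has attributes that are assigned values
    final Name name;               // non-numeric attribute holding a group item
    final int enRollNo;            // numeric attribute holding an elementary item

    Student(Name name, int enRollNo) {
        this.name = name;
        this.enRollNo = enRollNo;
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        Student s = new Student(new Name("First", "Middle", "Last"), 1042);
        System.out.println(s.name.full() + " " + s.enRollNo);
    }
}
```

A collection of such Student objects would correspond to an entity set, and the type of each field (for example int for enRollNo) bounds the range of values of that attribute.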
Algorithm
I What is an algorithm?
- An algorithm is a set of rules for carrying out a calculation
either by hand or on a machine.
- An algorithm is a sequence of computational steps that
transform the input into output.
- An algorithm is a sequence of operations performed on data
that have to be organized in data structures.
- A finite set of instructions that specify a sequence of
operations to be carried out in order to solve a specific
problem or class of problems.
- An algorithm is an abstraction of a program to be executed
on a physical machine.
Efficiency of an algorithm
I To go from city “A” to city “B”, there can be many ways of
accomplishing this: by flight, by bus, by train and also by
bicycle.
I Depending on availability, convenience, affordability and so on,
we choose the one that suits us.
I Similarly, in computer science, multiple algorithms are
available for solving the same problem (for example, a sorting
problem has many algorithms, like insertion sort, selection
sort, quick sort and many more).
I Algorithm analysis helps us to determine which algorithm is
most efficient in terms of time and space consumed.
I The goal of the analysis of algorithms is to compare algorithms
(or solutions) mainly in terms of running time, but also in
terms of other factors such as memory and developer effort.
Asymptotic notations
1. O-notation ("Big O"): asymptotic upper bound.
$O(g(n)) = \{ f(n) : \exists\, c > 0,\ n_0 > 0 \text{ such that } 0 \le f(n) \le c\,g(n) \text{ for all } n \ge n_0 \}$
Membership is written $f(n) \in O(g(n))$ or, by convention, $f(n) = O(g(n))$.
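As a worked instance of the definition, consider the function 3n + 5, which is chosen here only for illustration. The constants c = 4 and n0 = 5 are one valid choice among many.

```latex
f(n) = 3n + 5, \qquad g(n) = n
0 \le 3n + 5 \le 3n + n = 4n \quad \text{for all } n \ge 5
\Rightarrow\ c = 4,\ n_0 = 5, \quad \text{so } 3n + 5 = O(n)
```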
Complexity analysis
Analyzing an algorithm means predicting the resources that the
algorithm requires. Resources may be memory, communication
bandwidth, computer hardware or CPU time. Our primary concern
is to measure the computational time required for the algorithm.
Running time: The running time of an algorithm is the number of
primitive operations or steps executed on a particular input.
Why do we normally concentrate on finding only the worst-case
running time?
1. The worst-case running time of an algorithm gives us an
upper bound on the running time for any input. So it
guarantees that the algorithm will never be slower than this. In
real applications the worst case occurs fairly often, for example
when searching for data that does not exist.
$$\Rightarrow T(n) = an + b + c_4 \sum_{j=2}^{n} t_j + c_5 \sum_{j=2}^{n} (t_j - 1) + c_6 \sum_{j=2}^{n} (t_j - 1)$$
$$T(n) = an + b = O(n)$$
2. Worst Case: The algorithm performs worst if key > A[i] for
each value of j and stops only when i < 1 in step 4.
Then the test in step 4 always executes j times for each value of
j = 2, 3, . . . , n
so
$$\sum_{j=2}^{n} j = \frac{(n-1)(n+2)}{2}$$
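To see where the worst case leads, the sum can be substituted into the expression for T(n). This sketch assumes t_j = j for every j and that a, b, c_4, c_5, c_6 are the same constants as in the expression for T(n) above.

```latex
\sum_{j=2}^{n} (j - 1) = \frac{n(n-1)}{2}
T(n) = an + b + c_4 \frac{(n-1)(n+2)}{2} + (c_5 + c_6) \frac{n(n-1)}{2} = O(n^2)
```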
Arrays
Here are the main properties of arrays in Java:
I Arrays are objects.
I Arrays are created dynamically (at run time).
I Any method of the Object class may be invoked on an array.
I The variables are called the components or elements of the
array.
I If the component type is T, then the array itself has type T[].
I An element’s type may be either primitive or reference.
I The length of an array is its number of components.
I An array’s length is set when the array is created, and it
cannot be changed.
I Array index values must be integers in the range 0 to length - 1.
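Several of these properties can be observed directly. The following is a minimal sketch; the class name ArrayPropertiesDemo and the helper isOutOfRange are hypothetical names introduced for the demonstration.

```java
final class ArrayPropertiesDemo {
    // Returns true when index i lies outside 0 .. a.length - 1.
    static boolean isOutOfRange(int[] a, int i) {
        try {
            int ignored = a[i];            // the JVM checks every index at run time
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int n = args.length + 4;           // size decided at run time
        int[] scores = new int[n];         // component type int, array type int[]
        Object asObject = scores;          // legal because arrays are objects
        System.out.println(asObject.getClass().getSimpleName()); // prints "int[]"
        System.out.println(asObject.hashCode() == scores.hashCode()); // Object method on an array
        System.out.println(scores.length); // fixed once the array has been created
        System.out.println(isOutOfRange(scores, scores.length)); // prints "true"
    }
}
```

Because the length cannot change, "growing" an array in Java means allocating a new array and copying the elements across.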
Representation of Arrays
I Row Major Order: Row-major ordering assigns successive
elements to successive memory locations, moving across a row
and then down to the next row. In simple language, the
elements of an array are stored row by row.
I Column Major Order: Column-major ordering assigns successive
elements to successive memory locations, moving down a column
and then across to the next column, so the elements are
stored column by column.
I 2-D Array:
Row Major Order:
Address of A[I][J] = B + W ∗ (M ∗ (I − LR) + (J − LC)) where:
I = Row subscript of the element whose address is to be found,
J = Column subscript of the element whose address is to be found,
B = Base address,
W = Storage size of one element stored in the array (in bytes),
LR = Lower limit of row/start row index of the matrix (if not
given, assume zero),
LC = Lower limit of column/start column index of the
matrix (if not given, assume zero),
M = Number of columns in the matrix.
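The row-major formula for a 2-D array can be checked with a short sketch. The class and method names are hypothetical, and the numbers in main (base address 100, 4-byte elements, 5 columns) are assumed values picked for the example.

```java
final class RowMajor2D {
    // Address of A[i][j] in row-major order:
    // b = base address, w = element size in bytes, m = number of columns,
    // lr and lc = lower bounds of the row and column subscripts.
    static long address(long b, int w, int m, int i, int j, int lr, int lc) {
        return b + (long) w * ((long) m * (i - lr) + (j - lc));
    }

    public static void main(String[] args) {
        // 100 + 4 * (5 * 2 + 3) = 152
        System.out.println(address(100, 4, 5, 2, 3, 0, 0)); // prints 152
    }
}
```

Shifting both lower bounds to 1 and asking for A[3][4] gives the same address, since only the offsets from the lower bounds enter the formula.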
I Multi-D Array (3-D case):
Row Major Order:
Address of
A[I][J][K] = B + W ∗ (N ∗ L ∗ (I − x) + L ∗ (J − y) + (K − z))
where:
B = Base address (start address)
W = Weight (storage size of one element stored in the array)
M = Row (total number of rows, the range of I)
N = Column (total number of columns, the range of J)
L = Height/Layer (total number of cells depth-wise, the range of K)
x = Lower bound of row
y = Lower bound of column
z = Lower bound of height
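A sketch of the 3-D row-major computation follows. It assumes that n is the number of values the second subscript J can take and l the number of values the third subscript K can take, which is what the formula needs in order to be dimensionally consistent. The names and the sample numbers are hypothetical.

```java
final class RowMajor3D {
    // Address of A[i][j][k] in row-major order.
    // b = base address, w = element size, n = extent of j, l = extent of k,
    // x, y, z = lower bounds of i, j, k respectively.
    static long address(long b, int w, int n, int l,
                        int i, int j, int k, int x, int y, int z) {
        long offset = (long) n * l * (i - x) + (long) l * (j - y) + (k - z);
        return b + w * offset;
    }

    public static void main(String[] args) {
        // 1000 + 2 * (4 * 5 * 1 + 5 * 2 + 3) = 1066
        System.out.println(address(1000, 2, 4, 5, 1, 2, 3, 0, 0, 0)); // prints 1066
    }
}
```

The extent of the first subscript does not appear in the computation, because moving to the next value of I always skips exactly n * l elements.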
Application of arrays
Below are some applications of arrays.
I Storing and accessing data: Arrays are used to store and
retrieve data in a specific order. For example, an array can be
used to store the scores of a group of students, or the
temperatures recorded by a weather station.
I Sorting: Arrays can be used to sort data in ascending or
descending order. Sorting algorithms such as bubble sort,
merge sort, and quicksort rely heavily on arrays.
I Searching: Arrays can be searched for specific elements using
algorithms such as linear search and binary search.
I Matrices: Arrays are used to represent matrices in
mathematical computations such as matrix multiplication,
linear algebra, and image processing.
Recursion
The process in which a function calls itself directly or indirectly is
called recursion, and the corresponding function is called a
recursive function. Using a recursive algorithm, certain problems
can be solved quite easily.
Need of Recursion:
I Recursion is an amazing technique with the help of which we
can reduce the length of our code and make it easier to read
and write.
I It has certain advantages over the iteration technique which
will be discussed later.
I When a task can be defined in terms of similar, smaller subtasks,
recursion is one of the best solutions for it. For example: the
factorial of a number.
Properties of Recursion:
I Performing the same operations multiple times with different
inputs.
I In every step, we try smaller inputs to make the problem
smaller.
I A base condition is needed to stop the recursion; otherwise
infinite recursion will occur.
Algorithmic Steps:
The algorithmic steps for implementing recursion in a function are
as follows:
1 - Define a base case: Identify the simplest case for which the
solution is known or trivial. This is the stopping condition for
the recursion, as it prevents the function from infinitely calling
itself.
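The factorial mentioned earlier makes a convenient illustration of a base case and a recursive case. This is a minimal sketch; the class name and the choice of long as the result type are assumptions.

```java
final class FactorialDemo {
    // Computes n! recursively for n >= 0.
    static long factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n <= 1) {
            return 1;                      // base case: stops the recursion
        }
        return n * factorial(n - 1);       // recursive case: smaller input
    }

    public static void main(String[] args) {
        System.out.println(factorial(5)); // prints 120
    }
}
```

Each call works on a smaller input than its caller, so the base case is reached after n - 1 recursive calls. A long overflows for n greater than 20, so larger inputs would need java.math.BigInteger.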
Direct Recursion
When a function calls itself from within its own body, it is called
direct recursion. This can be further categorized into four types:
I Tail Recursion: If a recursive function calls itself and that
recursive call is the last statement in the function, then it is
known as tail recursion.
- After that call the recursive function performs nothing. The
function does all of its processing at the time of calling and
does nothing at returning time.
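One familiar function with this shape is Euclid's algorithm for the greatest common divisor, used here as an illustrative example that does not come from the slides. The sketch assumes non-negative arguments.

```java
final class TailRecursionDemo {
    // Greatest common divisor of two non-negative integers.
    // The recursive call is the last action, so nothing remains
    // to be done when it returns.
    static int gcd(int a, int b) {
        if (b == 0) {
            return a;                      // base case
        }
        return gcd(b, a % b);              // tail call
    }

    public static void main(String[] args) {
        System.out.println(gcd(48, 18));   // prints 6
    }
}
```

By contrast, a call of the form n * f(n - 1) is not a tail call, because the multiplication is still pending at returning time.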
Indirect Recursion
In this recursion, there may be more than one function, and they
call one another in a circular manner.
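A small sketch of two functions calling each other in a circle is given here. The isEven/isOdd pair is a textbook illustration rather than a practical parity test, and it assumes n is non-negative.

```java
final class IndirectRecursionDemo {
    // isEven calls isOdd, and isOdd calls isEven, until n reaches 0.
    static boolean isEven(int n) {
        if (n == 0) {
            return true;                   // base case
        }
        return isOdd(n - 1);
    }

    static boolean isOdd(int n) {
        if (n == 0) {
            return false;                  // base case
        }
        return isEven(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(isEven(10));    // prints true
        System.out.println(isOdd(10));     // prints false
    }
}
```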
Removal of recursion
By replacing the selection structure with a loop, recursion can be
eliminated. A data structure is required in addition to the loop if
some data needs to be kept for processing beyond the end of the
recursive step. A simple string, an array, or a stack are examples of
data structures.
There are a few ways to remove recursion from code, including:
I Iteration: Wrap your algorithm in a loop, pushing and popping
a custom call stack at the start and end of each iteration.
I Macro expansion: This technique can eliminate recursion, but
the depth of recursion is limited by the number of macro
invocations.
I Fibonacci Numbers
I Tower of Hanoi
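Taking Fibonacci numbers as the example, the sketch below shows one recursive definition and two ways of removing the recursion: a plain loop that keeps only the state needed between steps, and a loop driven by an explicit stack of pending subproblems. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class RecursionRemovalDemo {
    // Recursive definition, kept for comparison.
    static long fibRecursive(int n) {
        return n < 2 ? n : fibRecursive(n - 1) + fibRecursive(n - 2);
    }

    // Loop version: the two previous values are the only data
    // that must survive from one step to the next.
    static long fibIterative(int n) {
        long prev = 0;
        long curr = 1;
        for (int i = 0; i < n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return prev;
    }

    // Explicit-stack version: subproblems are pushed instead of called.
    static long fibWithStack(int n) {
        Deque<Integer> pending = new ArrayDeque<>();
        pending.push(n);
        long total = 0;
        while (!pending.isEmpty()) {
            int k = pending.pop();
            if (k < 2) {
                total += k;                // base cases contribute 0 or 1
            } else {
                pending.push(k - 1);       // replaces the call fib(k - 1)
                pending.push(k - 2);       // replaces the call fib(k - 2)
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(fibIterative(10)); // prints 55
    }
}
```

The explicit-stack version mirrors the recursive one step for step and so does the same exponential amount of work; the plain loop is the version to prefer when, as here, the pending state fits in a couple of variables.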
Searching
Searching algorithms are essential tools in computer science used to
locate specific items within a collection of data. These algorithms
are designed to efficiently navigate through data structures to find
the desired information, making them fundamental in various
applications such as databases, web search engines, and more.
Different searching algorithms are:
I Linear Search
I Binary Search
I Indexed Sequential Search
I Hashing
Linear Search
Linear search is a method for searching for an element in a
collection of elements. Each element of the collection is visited one
by one in a sequential fashion to find the desired element. Linear
search is also known as sequential search.
Linear Search Algorithm:
I Every element is considered as a potential match for the key
and checked for the same.
I If any element is equal to the key, the search is successful and
the index of that element is returned.
I If no element is found equal to the key, the search yields “No
match found”.
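The steps above can be sketched in Java as follows. Returning -1 to signal "No match found" is an assumed convention, and the names are hypothetical.

```java
final class LinearSearchDemo {
    // Returns the index of the first element equal to key, or -1 if none is.
    static int linearSearch(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) {
                return i;                  // successful search
            }
        }
        return -1;                         // no match found
    }

    public static void main(String[] args) {
        int[] data = {4, 2, 7, 1, 9};
        System.out.println(linearSearch(data, 7)); // prints 2
        System.out.println(linearSearch(data, 5)); // prints -1
    }
}
```

The array does not need to be sorted, and in the worst case every one of the n elements is examined.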
Binary Search
Binary search is a search algorithm used to find the position of a
target value within a sorted array. It works by repeatedly dividing
the search interval in half until the target value is found or the
interval is empty. The search interval is halved by comparing the
target element with the middle value of the search space.
Conditions to apply Binary Search
I The data structure must be sorted.
I Access to any element of the data structure should take
constant time.
Binary Search Algorithm:
I Divide the search space into two halves by finding the middle
index “mid”.
I Compare the middle element of the search space with the key.
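A minimal iterative sketch of binary search on a sorted int array follows. The names lo, hi and mid are the usual ones, -1 as the "not found" result is an assumed convention, and the continuation after the comparison (discarding the half that cannot contain the key) is the standard one.

```java
final class BinarySearchDemo {
    // Returns an index of key in the ascending-sorted array a, or -1 if absent.
    static int binarySearch(int[] a, int key) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo <= hi) {                     // interval is not empty
            int mid = lo + (hi - lo) / 2;      // avoids overflow of lo + hi
            if (a[mid] == key) {
                return mid;
            } else if (a[mid] < key) {
                lo = mid + 1;                  // key can only be in the right half
            } else {
                hi = mid - 1;                  // key can only be in the left half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sorted = {2, 5, 8, 12, 16, 23, 38};
        System.out.println(binarySearch(sorted, 23)); // prints 5
    }
}
```

Each iteration halves the interval, so at most about log2(n) + 1 comparisons with the middle element are made.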
Hashing
Hashing is a technique used in data structures that efficiently
stores and retrieves data in a way that allows for quick access.
- Hashing refers to the process of generating a fixed-size output
from an input of variable size using the mathematical formulas
known as hash functions.
- This technique determines an index or location for the storage of
an item in a data structure.
- It involves mapping data to a specific index in a hash table
using a hash function that enables fast retrieval of information
based on its key.
- This method is commonly used in databases, caching systems,
and various programming applications to optimize search and
retrieval operations.
- The great advantage of hashing is that all three operations
(search, insert and delete) take O(1) time on average.
Collision resolution techniques:
I Open Addressing:
Linear Probing: Search for an empty slot sequentially
Quadratic Probing: Search for an empty slot using a quadratic
function
Cuckoo Hashing: Use multiple hash functions to distribute
keys
I Closed Addressing:
Separate Chaining: Store colliding keys in a linked list or binary
search tree at each index
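Linear probing can be sketched as follows. The class LinearProbingTable, its method names, and the hash function (key mod table size) are assumptions made for the illustration; deletion is left out because it needs extra bookkeeping such as tombstone markers.

```java
final class LinearProbingTable {
    private final Integer[] slots;             // null marks an empty slot

    LinearProbingTable(int capacity) {
        slots = new Integer[capacity];
    }

    // Assumed hash function: key mod table size.
    private int home(int key) {
        return Math.floorMod(key, slots.length);
    }

    // Inserts key and returns the slot used, or -1 if the table is full.
    int insert(int key) {
        int i = home(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[i] == null || slots[i] == key) {
                slots[i] = key;
                return i;
            }
            i = (i + 1) % slots.length;        // collision: try the next slot
        }
        return -1;
    }

    // Returns the slot holding key, or -1 if key is absent.
    int find(int key) {
        int i = home(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[i] == null) {
                return -1;                     // an empty slot ends the probe sequence
            }
            if (slots[i] == key) {
                return i;
            }
            i = (i + 1) % slots.length;
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingTable table = new LinearProbingTable(10);
        System.out.println(table.insert(12));  // prints 2
        System.out.println(table.insert(22));  // prints 3 (slot 2 is taken)
        System.out.println(table.find(22));    // prints 3
    }
}
```

With a table of size 10, the keys 12 and 22 both hash to slot 2, so the second one moves on to slot 3. In everyday Java code java.util.HashMap and java.util.HashSet already provide hashing with collision handling built in.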
Applications of Hashing: Hash tables are used wherever we have a
combination of search, insert and/or delete operations.
I Dictionaries: To implement a dictionary so that we can
quickly look up a word.
Sorting
A Sorting Algorithm is used to rearrange a given array or list of
elements according to a comparison operator on the elements. The
comparison operator is used to decide the new order of elements in
the respective data structure. For example: arranging students
according to height in morning assembly, seating them by roll
number in exams, or ordering names by marks in a merit list. There
are different algorithms for sorting:
I Insertion Sort
I Bubble Sort
I Selection Sort
I Quick Sort
I Merge Sort
Insertion Sort
I Insertion sort is a simple sorting algorithm that works by
iteratively inserting each element of an unsorted list into its
correct position in a sorted portion of the list.
I It is a stable sorting algorithm, meaning that elements with
equal values maintain their relative order in the sorted output.
I Insertion sort is like sorting playing cards in your hands.
I You split the cards into two groups: the sorted cards and the
unsorted cards.
I Then, you pick a card from the unsorted group and put it in
the right place in the sorted group.
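The card-sorting idea can be written in Java roughly as follows. This sketch assumes ascending order and an int array; the class name is hypothetical.

```java
import java.util.Arrays;

final class InsertionSortDemo {
    // Sorts a in ascending order, in place.
    static void insertionSort(int[] a) {
        for (int j = 1; j < a.length; j++) {
            int key = a[j];                    // next "card" from the unsorted part
            int i = j - 1;
            while (i >= 0 && a[i] > key) {     // shift larger sorted elements right
                a[i + 1] = a[i];
                i--;
            }
            a[i + 1] = key;                    // drop the card into its place
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 4, 6, 1, 3};
        insertionSort(a);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 3, 4, 5, 6]
    }
}
```

The strict comparison a[i] > key never moves an element past an equal one, which is what keeps the sort stable.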
Bubble Sort
Bubble Sort is the simplest sorting algorithm that works by
repeatedly swapping adjacent elements if they are in the wrong
order. This algorithm is not suitable for large data sets, as its
average and worst-case time complexity is O(n^2).
Algorithm:
I Traverse from the left and compare adjacent elements; the
larger one is placed on the right side.
I In this way, the largest element is moved to the rightmost end
first.
I This process is then continued to find the second largest and
place it, and so on until the data is sorted.
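A minimal Java sketch of these passes is given below. The early exit when a pass makes no swap is a common optimisation that is not part of the description above, and the names are hypothetical.

```java
import java.util.Arrays;

final class BubbleSortDemo {
    // Sorts a in ascending order, in place.
    static void bubbleSort(int[] a) {
        for (int end = a.length - 1; end > 0; end--) {
            boolean swapped = false;
            for (int i = 0; i < end; i++) {
                if (a[i] > a[i + 1]) {         // adjacent pair in the wrong order
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) {
                return;                        // no swap in this pass: already sorted
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 4, 2, 8};
        bubbleSort(a);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 4, 5, 8]
    }
}
```

After the first pass the largest element sits at index a.length - 1, so each later pass can stop one position earlier, which is what the shrinking bound end expresses.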
Selection Sort
I Selection sort is a simple sorting algorithm that
works by repeatedly selecting the smallest (or largest) element
from the unsorted portion of the list and moving it to the
sorted portion of the list.
I In each pass, the selected element is swapped
with the first element of the unsorted part.
I This process is repeated for the remaining unsorted portion
until the entire list is sorted.
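In Java the procedure might look like this. The sketch selects the smallest element in each pass and sorts in ascending order; the names are hypothetical.

```java
import java.util.Arrays;

final class SelectionSortDemo {
    // Sorts a in ascending order, in place.
    static void selectionSort(int[] a) {
        for (int start = 0; start < a.length - 1; start++) {
            int min = start;                   // index of the smallest element so far
            for (int i = start + 1; i < a.length; i++) {
                if (a[i] < a[min]) {
                    min = i;
                }
            }
            int tmp = a[start];                // swap it to the front of the unsorted part
            a[start] = a[min];
            a[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = {64, 25, 12, 22, 11};
        selectionSort(a);
        System.out.println(Arrays.toString(a)); // prints [11, 12, 22, 25, 64]
    }
}
```

The algorithm performs at most n - 1 swaps but always makes on the order of n^2 comparisons, whatever the initial order of the data.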
Quick Sort
I QuickSort is a sorting algorithm based on divide and
conquer: it picks an element as a pivot and partitions the
given array around the picked pivot by placing the pivot in its
correct position in the sorted array.
I There are mainly three steps in the algorithm.
I 1. Choose a pivot.
I 2. Partition the array around the pivot. After partitioning, it is
ensured that all elements on the left are smaller than all elements
on the right, and we get the index of the end point of the smaller
elements. The left and right parts may not be sorted individually.
I 3. Recursively call for the two partitioned left and right
subarrays. We stop the recursion when only one element
is left.
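The three steps can be sketched as follows. Choosing the last element as the pivot and partitioning with the Lomuto scheme are assumptions made for this example; other pivot choices and partition schemes are equally valid.

```java
import java.util.Arrays;

final class QuickSortDemo {
    // Sorts a[lo..hi] (inclusive bounds) in ascending order, in place.
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) {
            return;                            // zero or one element: stop
        }
        int p = partition(a, lo, hi);          // pivot is now at its final index p
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }

    // Lomuto partition with the last element as pivot; returns the pivot index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;                            // next slot for an element < pivot
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                swap(a, i, j);
                i++;
            }
        }
        swap(a, i, hi);                        // place the pivot between the parts
        return i;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {10, 7, 8, 9, 1, 5};
        quickSort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a)); // prints [1, 5, 7, 8, 9, 10]
    }
}
```

With this pivot choice an already sorted array is the worst case, because every partition then leaves one side empty.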
Merge Sort
I Merge sort is a sorting algorithm that follows the
divide-and-conquer approach.
I It works by recursively dividing the input array into smaller
subarrays and sorting those subarrays then merging them back
together to obtain the sorted array.
I In simple terms, the process of merge sort is to divide the
array into two halves, sort each half, and then merge the
sorted halves back together.
I This process is repeated until the entire array is sorted.
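A compact top-down sketch of merge sort in Java is shown here. Copying each half into a temporary array is one simple way to implement the split, chosen for readability rather than memory efficiency; the names are hypothetical.

```java
import java.util.Arrays;

final class MergeSortDemo {
    // Sorts a in ascending order.
    static void mergeSort(int[] a) {
        if (a.length < 2) {
            return;                            // zero or one element is already sorted
        }
        int mid = a.length / 2;
        int[] left = Arrays.copyOfRange(a, 0, mid);
        int[] right = Arrays.copyOfRange(a, mid, a.length);
        mergeSort(left);                       // sort each half
        mergeSort(right);
        merge(a, left, right);                 // merge the sorted halves back into a
    }

    // Merges two sorted arrays into out, whose length is left.length + right.length.
    static void merge(int[] out, int[] left, int[] right) {
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            // "<=" takes from the left half on ties, which keeps the sort stable
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            out[k++] = left[i++];
        }
        while (j < right.length) {
            out[k++] = right[j++];
        }
    }

    public static void main(String[] args) {
        int[] a = {38, 27, 43, 3, 9, 82, 10};
        mergeSort(a);
        System.out.println(Arrays.toString(a)); // prints [3, 9, 10, 27, 38, 43, 82]
    }
}
```

The array is halved about log2(n) times and each level of merging touches all n elements, which gives the O(n log n) running time in every case.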
Thank you
Please send your feedback or any queries to
[email protected]