
Introduction to Data Structures

Ms. Benita Jaison, Dept. of CSA, St. Francis College


Data Storage
• Database systems deal with data stored in the secondary memory of the computer (disk).
• Data structures are concerned with organizing data in main memory (RAM).
What is data?
• Data is a representation of facts, figures, statistics, etc., having no particular meaning on its own.
• Data can be facts related to any object in consideration.
• Data can be in the form of numbers, characters, symbols, or even pictures.
• Eg: NEETHA, CHENNAI, 19

INFORMATION:
It is processed data (a collection of data) which has a definite meaning.

NAME | PLACE | AGE | GENDER
NEETHA | CHENNAI | 19 | FEMALE

A field is a single piece of information.
Data
• Data
• a collection of facts (numbers, words, measurements, observations, descriptions of things)
• Data can be a value or a set of values.
• Eg: Employee details (emp_name, emp_id, emp_contactno)
• Datum: a single unit of value; the singular of data.
• Eg: emp_gender, emp_name.
• Group item: a data item that can be subdivided into sub-items.
• Eg: Employee Name: First Name, Middle Initial and Last Name
• Elementary item: a data item that cannot be subdivided.
• Eg: PAN card number / Bank passbook number
Data

Data is a collection of facts or values, classified into group items and elementary items.
Entity

• Entity: Any object that has certain attributes or properties which may
be assigned values.
• Eg: The employee of an organization.
Attributes Values

Name : John
Age : 33
Gender : Male
Emp.ID : 13472
Entity Set

• Entities with similar attributes form an entity set.
• Domain refers to the range of values applicable for an attribute.
  Eg: the domain of age is integer values from 0 to 150.
• Information: processed or meaningful data.

Employee Details (an entity set):
Emp ID | Name  | Age | Gender
13472  | John  | 33  | Male
13583  | Mary  | 32  | Female
13467  | Rahul | 31  | Male
Data type
• A data type specifies what type of value a variable can be assigned in computation or processing.

Eg: In C language int, float, char, double, long double, etc


Data Structure
• A data structure is a specialized format for organizing, processing, retrieving and storing data.

• Data structures are classified into:

• Primitive data structures: consist of basic (simple) data types which cannot be divided (eg: integer, float, character and Boolean).
• Non-primitive data structures: data types that are derived from the primitive data types, used to store groups of values (eg: linked lists, stacks, queues, trees and graphs).
Non-primitive data structures: 2 types
• Linear, if the data is arranged in a sequence or a linear list.
  eg: Arrays, linked lists, stacks and queues

• Non-linear, if the data is not arranged in sequence.
  eg: Trees, graphs.
Linear Data Structures
•Arrays
•Linked List
•Stack
•Queue
Linear Data Structures
• Arrays: a list of a finite number of elements of the same datatype, referenced by index numbers or subscripts.
Linear Data Structures
• Linked List: a list of nodes where each node contains a data field
and a link to the next node in the list. Elements are not stored at
contiguous memory locations.
Linear Data Structures
• Stack: a Last-In-First-Out (LIFO) system, in which insertion and deletion take place only at the “top” end.
Linear Data Structures
• Queue: a First-In-First-Out (FIFO) system, in which deletion takes place at the “front” of the list and insertion at the “rear” end of the list.
Non-Linear Data structures
• Graph: consists of a finite set of vertices and a set of edges which connect pairs of nodes. Eg: undirected graph, directed graph.

• Trees: store information in a hierarchical style (data items are arranged in branches and sub-branches), represented by vertices connected by edges. Eg: binary tree, B-tree.
Difference - Linear and Non-Linear Data Structure

Linear Data Structure | Non-Linear Data Structure
Every item is related to its previous and next item. | Every item is attached to many other items.
Data is arranged in linear sequence. | Data is not arranged in sequence.
Data items can be traversed in a single run. | Data cannot be traversed in a single run.
Eg: array, stack, linked list, queue. | Eg: tree, graph.
Implementation is easy. | Implementation is difficult.
Data Structures Operations
The data appearing in data structures are processed by means of operations:

• Traversing: Accessing and processing each data or record exactly once.

• Searching: Finding the location of the data/record with a given key value.

• Inserting: Adding a new data or record to the data structure.

• Deleting: Removing a data or record from the data structure.

• Sorting: Arranging the data or records in some logical order.

• Merging: Combining the data or records in two different sorted files into a single sorted
file.
Data Structures Operations on Arrays
• Traversing: Accessing and processing each data or record exactly once.
• Eg: Traverse the array [23, 25, 27, 29, 30, 32, 35, 37, 39, 40] (index 0 to 9), processing each element by adding 2, giving [25, 27, 29, 31, 32, 34, 37, 39, 41, 42].
Data Structures Operations on Arrays
• Searching: Finding the position of the number with a given search value.
• Eg: Find the location of 67 in [22, 32, 44, 52, 67, 15, 90, 87, 37, 40] (index 0 to 9). Comparing 67 with each element in turn finds it at index 4 (position 5).
Data Structures Operations on Arrays
• Inserting: Adding a new record to the data structure.
• Eg: Insert 12 at index 3 of [22, 32, 44, 52, 67]. The elements from index 3 onward shift one place right, giving [22, 32, 44, 12, 52, 67].
Data Structures Operations on Arrays
• Deleting: Removing a data item (number) from the data structure.
• Eg: Delete 12 at index 3 of [22, 32, 44, 12, 52, 67]. The elements after index 3 shift one place left, giving [22, 32, 44, 52, 67].
Data Structures Operations on Arrays
• Sorting: Arranging the data (elements) in some logical order (ascending or descending).
• Eg: [33, 25, 27, 29, 30, 32, 35, 37, 39, 40] sorted in ascending order becomes [25, 27, 29, 30, 32, 33, 35, 37, 39, 40].
Data Structures Operations on Arrays
• Merging: Combining the elements in two different sorted files into a single sorted file.
• Eg: merging [1, 3, 5, 7, 9] and [2, 4, 6, 8, 10] gives [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].
Abstract Data Types
• An abstract data type is a special kind of data type, whose behavior is defined by a set of values and a set of operations.

• An abstract data type defines

• a data representation for objects of the type (to store data).
• the set of operations that can be performed on the data.
Abstract Data Types
•An abstract data type (ADT) refers to a set of data values and the operations
that are applicable for the datatype, hiding the implementation details.
• Two different parts of ADT model :
• Data structures and Functions (Public and Private).
Both are within the scope of each other.
•Application program interface can access public functions for data
manipulation.
Important Questions
1. Define data structure.
2. Explain classifications of data structures in detail.
3. What are linear data structures? Give an example.
4. Explain various operations performed on non-primitive data
structures.
5. What is Abstract Data Type ?
Chapter 2
Algorithm Complexity
Why do we need an algorithm?
Algorithms are required to solve problems.
What are the Important Problem Types ?
• Sorting
• Searching
• String processing
• Graph problems
• Combinatorial problems
• Geometric problems
• Numerical problems
Algorithms
• An algorithm is a well-defined, step-by-step set of instructions for solving a particular problem.
• Algorithms are used to find the best possible way of solving a problem.
CHARACTERISTICS OF AN ALGORITHM

1. Input and output should be defined precisely.
2. Each step in the algorithm should be clear (unambiguous).
3. The algorithm should be the most effective method to solve the problem.
4. An algorithm should be such that it can be implemented in different programming languages.
5. An algorithm should work for all instances of the same problem type.


Fundamental Steps of Algorithmic Problem Solving
• Understanding the problem
• Decide on:
• the capabilities of the computational device.
• Exact /approximate solution.
• The appropriate data structure.
• Algorithm design techniques
• Designing an algorithm
• Proving an algorithm's correctness
• Analysing an algorithm
• Coding the algorithm
Algorithm Design Techniques
Algorithm design techniques are general approaches to solving problems algorithmically, applicable across different areas of computing.

1. Brute-force or exhaustive search (eg: linear search)
2. Divide and conquer (eg: merge sort)
3. Greedy algorithms (eg: Kruskal's and Prim's)
4. Dynamic programming (eg: Floyd-Warshall and Bellman-Ford)
5. Branch and bound (eg: knapsack problem)
6. Randomized algorithms (eg: quick sort)
7. Backtracking (eg: N-queens problem, graph coloring)
8. Approximation algorithms (eg: vertex cover, travelling salesman)
Important Problem Types
• Sorting algorithm: an algorithm that rearranges the elements of a list in ascending or descending order. A sorting algorithm is stable if it preserves the relative order of equal elements, and in-place if it uses little extra memory for the rearrangement.
• Search algorithm: an algorithm for finding an item among a collection of items.
• String processing algorithms: used to search through a very large document for the occurrence of a given string, replace it, or find its length.
Important Problem Types
Graph problems can be
• Reachability: To find whether node B is reachable from A?
• Shortest path: (min-cost path). Find the path from B to A with the minimum
cost (Dijkstra's and Floyd's algorithms)
• Visit all nodes: Traversal.(Depth- and breadth-first traversals)
• Transitive closure: Determine all pairs of nodes that can reach each other (Warshall's algorithm)
• Minimum spanning tree: A minimum spanning tree is a set of edges such that
every node is reachable from every other node and the removal of any edge
from the tree eliminates the reachability property. (Prim's and Kruskal's
algorithms)
Important Problem Types
• Combinatorial problems seek a combinatorial object that satisfies certain constraints and has some desired property. These problems are difficult because the number of combinatorial objects grows extremely fast with problem size.
• Eg: the travelling salesman problem, graph coloring, and generating permutations, combinations or subsets.
• Geometric algorithms deal with geometric objects such as points , lines,
and polygons.
• Eg: The closest-pair problem ,The convex hull problem
• Numerical problems involve mathematical objects of a continuous nature.
• Eg: solving equations, evaluating functions and integrals.
Algorithm Analysis
•Algorithm analysis is a technique used to
measure the effectiveness and performance of
the algorithms.

•Analysis of algorithm is the process of analyzing


the problem-solving capability of the algorithm
in terms of the time and space required by an
algorithm.
Complexity of Algorithms
• The complexity of an algorithm gives the running time
and/or the storage space requirement of the algorithm in
terms of the size n of the input data.
• The complexity of an algorithm is denoted by the function
f(n).
• Time and space are the two main measures for efficiency of
an algorithm.
• Time is measured by counting the number of key operations in the algorithm.
• Space is measured by counting the maximum memory needed by the algorithm.
Time Complexity (Time Efficiency)
• Time complexity is the amount of time taken by an algorithm to run.

• Time complexity measures the time taken to execute each statement of code in an
algorithm. If a statement is set to execute N times then,
• Time complexity = N x (the time required to run that line or function one time).

• The relation between the input data size (n) and number of operations
performed (N) with respect to time is denoted as Order of growth in Time
complexity.
Space Complexity (Space Efficiency)
• The space complexity can be defined as amount of memory required by an algorithm
to run.
• Space efficiency depends on the following factors:

• Program Space: to store machine code generated by the compiler or


assembler
• Data Space: to store the constants, global variables etc
• Stack Space: to store the return address,parameters passed to the function,
local variables in called function etc.
Time – Space Tradeoff
• Time - space tradeoff refers to the choice between algorithmic
solutions of a data processing problem that allows one to

• Decrease running time of an algorithmic solution, by increasing the space to


store the data.

Or

• Decrease the space required to store the data, by increasing the running
time of an algorithm.
Sequential or Linear Search
Linsearch(n, a, S)

Aim: To search for and return the position of the search value S.

Input: Array a[0, 1, …, n-1] of size n and search key S
Output: Index of the first element in a[] that matches S, or -1 if there is no matching element

Step 1: Repeat Step 2 for I = 0 to n-1:
Step 2:   If (S == a[I])
              RETURN I as the index number
              EXIT
Step 3: RETURN -1 as there is no matching value in the array
Array: [22, 32, 44, 52, 67, 15, 90, 87, 37, 40] (index 0 to 9)

Case 1: Search element is 22
22 matches a[0] immediately: 1 comparison (best case).

Case 2: Search element is 40
40 is compared with every element before being found at index 9: 10 comparisons (worst case).

Case 3: Average case
If S is 22, 32, 44, 52, 67, 15, 90, 87, 37 or 40, the number of comparisons is 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 respectively.
Average = (1+2+3+4+5+6+7+8+9+10)/10 = 5.5 comparisons.
Complexity of Linear Search
• The time required to execute the algorithm is given by the
number of comparisons C, between S and a[i].

• If the element S is the first element in the array a[],then C=1


• If the element S is the last element or not present in the array a[], then C = n, where n is the number of elements in the array.
• If the search element S is in any other position (the typical real-life input), C can vary between 1 and n.
Worst-case, Best-case, Average case efficiencies
• Efficiency is considered in terms of the number of times the basic
operation will be executed.
• For most algorithms, efficiency depends on the input size n.
• But for some algorithms, efficiency also depends on the type of input.

• The efficiency of the algorithm are of 3 different types:

• Best case efficiency


• Worst case efficiency
• Average case efficiency
Best-case efficiency
• Best case is the case of minimum value of f(n), (time/space complexity of
an algorithm) , among all possible types of inputs of size n.
• If an algorithm takes minimum amount of time to run to completion for a
specific set of input then it is called best case time complexity
• Provides an absolute guarantee to the lower bound on running time.

• Eg: In sequential search the best case is if the search element is first element in the
array.
Worst-case efficiency
• Worst case is the case of maximum value of f(n), time/space complexity of
an algorithm , among all possible inputs of size n.

• If an algorithm takes maximum amount of time to run to completion for a


specific set of input then it is called worst case time complexity.

• Provides an absolute guarantee to the upper bound on running time.

• Eg: In sequential search the worst case is , if the search element not found or if it is
the last element in the array.
Average case efficiency
• Average of the time taken to solve all the possible (random) instances of
the input of size n.

• Provides the expected running time, which usually reflects real-life inputs.
• Average case efficiency is NOT the average of worst and best case.
• Eg: In sequential search the average case is the average of the time taken if the
search element is 1st,2nd,3rd,…..or last element in the array.
Performance measurement and Performance Evaluation

If only time is considered


• f(n) is a function, that shows the running time .(time
complexity)
• f(n)→how long it takes if ‘n’ is the size of input.
If only space is considered
• f(n) is a function, that shows the memory required .(space
complexity)
• f(n)→how much memory it takes if ‘n’ is the size of input.
Order of Growth

• Order of growth shows how fast the function, f(n) grows


with the input size.
• Order of growth →“Rate of growth of running time”.

• The algorithm with less rate of growth of running time is


considered better.
Order of Growth of n
• Order of growth is the increase in execution time with respect to input size n.
• 1 < log n < n < n log n < n² < n³ < 2ⁿ < n!

• The function growing the slowest among these is the logarithmic function.
• The exponential function 2ⁿ and the factorial function n! grow fast; their values become astronomically large even for small values of n.
Order of Growth
• The efficiencies of a large number of algorithms fall into the following few classes:
• Constant (1 or any constant): an algorithm runs in constant time if it requires the same amount of time regardless of the input size. (eg: accessing one array element)
• Logarithmic (log n): execution time is proportional to the logarithm of the input size. (eg: binary search)
• Linear (n): execution time is directly proportional to the input size, i.e. time grows linearly as the input size increases. (eg: linear search)
• Linearithmic (n log n) (eg: quick sort, merge sort)
• Quadratic (n²): execution time is proportional to the square of the input size. (eg: addition or subtraction of matrices)
• Cubic (n³) (eg: multiplication of matrices)
• Exponential (2ⁿ) (eg: tower of Hanoi)
• Factorial (n!) (eg: permutations of a set)
Asymptotic Notations
• Asymptotic notations are used to identify the behavior of an algorithm as the input size increases.

• Asymptotic notation is a way of comparing algorithm functions that ignores constant factors and small input sizes (n).

• Asymptotic notations are used to compare the growth of an algorithm's function f(n) in relation to another, simpler function g(n).

• The O (big oh), Ω (big omega), Θ (big theta), o (little oh) and ω (little omega) notations are used to indicate, compare and rank the order of growth of an algorithm.
Big Oh - O notation
• The “big O” notation defines an upper bound for the function f(n).
  f(n) ∈ O(g(n)), read as “f(n) is an element of O of g(n)”.
• The function f(n) is bounded above by C multiples of g(n), where C is a positive number, for every positive integer n > n0:
  f(n) <= C * g(n)
• f(n) is a function whose growth rate is asymptotically less than or equal to that of C * g(n).

The worst-case time complexity of linear search is O(n), and that of binary search is O(log n).
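As a worked example of the definition (the constants chosen are illustrative):

```latex
f(n) = 3n + 2 \in O(n), \quad \text{since } 3n + 2 \le 4n \text{ for all } n \ge 2
\quad (\text{take } C = 4,\ n_0 = 2).
```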
Big Oh - O notation

O-notation gives an upper bound for the function f(n).

• For all values of n to the right of n0, the value of the function f(n) is on or below C*g(n).
Little oh - o notation
• o-notation is used to denote an upper bound that is strictly less than the C * g(n) function:
  f(n) < C * g(n) for all n > n0.
• f(n) is a function whose growth rate is strictly less than that of C * g(n).
  f(n) ∈ o(g(n))
• This implies that for any positive constant C there is a positive integer n0 such that f(n) < C * g(n) for all n > n0.
Big Omega - Ω notation
• The “big Ω” notation defines a lower bound for the function f(n).
• f(n) ∈ Ω(g(n)), read as “f(n) is omega of g(n)”.
• The function f(n) is bounded below by C multiples of g(n) for all values of n > n0:
  f(n) >= C * g(n) for all n > n0
• f(n) is a function whose growth rate is asymptotically greater than or equal to that of C * g(n).
Big Ω notation

Ω-notation provides an asymptotic lower bound.

For all values of n to the right of n0, the value of f(n) is on or above C*g(n).
Little omega - ω notation
• ω-notation is used to denote a lower bound whose growth rate is strictly greater than that of C * g(n).
  f(n) ∈ ω(g(n))
• This implies that for any positive constant C there is a positive integer n0 such that
  f(n) > C * g(n) for all n > n0.
• f(n) is a function whose growth rate is strictly greater than that of C * g(n).
Big Theta - Θ notation
• The theta notation is used to give a tight bound for f(n).

• f(n) ∈ Θ(g(n)) implies that f(n) is bounded both above and below by constant multiples of g(n) for all n > n0.
• “f(n) is theta of g(n)” if there exist two positive constants C1 and C2 and a positive integer n0 such that
  C1 * g(n) <= f(n) <= C2 * g(n) for all n > n0

For all values of n to the right of n0, the value of f(n) lies at or above C1*g(n) and at or below C2*g(n).
g(n) is a tight (upper and lower) bound for f(n).
Important Questions
• Define algorithm. Explain the characteristics of an algorithm.
• Define the terms Space complexity, time complexity.
• Illustrate asymptotic notations with examples.
Chapter 3
Preliminaries
Mathematical Notations and Functions
• Remainder Function : Modular Arithmetic
• Floor and Ceiling Function
• Integer and Absolute Value function
• Summation
• Permutations
• Exponents and Logarithms



Eg: 27 mod 7 = 6
M = 7, Q = 3 (quotient), R = 6 (remainder), K = 27:
7 × 3 + 6 = 27, i.e. M × Q + R = K

Remainder Function : Modular Arithmetic
• Let K and M be positive integers; K mod M denotes the remainder when K is divided by M.
• K = MQ + R, where Q is the quotient and R is the remainder when dividing K by M.

Eg: 35 mod 6 = 5, since 6 × 5 + 5 = 35
25 mod 5 = 0 (remainder 0), since 5 × 5 + 0 = 25
3 mod 5 = 3, since 5 × 0 + 3 = 3
Arithmetic modulo
• Arithmetic modulo M refers to the arithmetic operations of addition, multiplication and subtraction where the result is replaced by its equivalent value in the set {0, 1, …, M-1}.

Eg: For arithmetic modulo 5, the set of values is {0, 1, 2, 3, 4}
3 + 5 = 8, and 8 mod 5 = 3, so 3 + 5 ≡ 3 (mod 5)


Arithmetic Modulo
• Arithmetic modulo 12 is called clock arithmetic.
• The set of values is {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
6 + 9 = 15, and 15 mod 12 = 3
7 × 5 = 35, and 35 mod 12 = 11
1 − 5 = −4, and −4 mod 12 = 8


Arithmetic modulo

1 − 5 = ?
1 − 5 = −4, and −4 mod 12 = 8


Mathematical Congruence

• a ≡ b (mod M), read as “a is congruent to b modulo M”, implies that
  a mod M = b mod M.
• a ≡ b (mod M) if and only if a − b is exactly divisible by M.

Eg: 25 ≡ 7 (mod 3)
25 mod 3 = 7 mod 3 = 1
25 ≡ 7 (mod 3), since 25 − 7 = 18 is exactly divisible by 3.
Consider a real number 3.6.
Integers greater than 3.6 are 4, 5, 6, etc. The least integer greater than 3.6 is 4, which is the ceiling(3.6).
Integers less than 3.6 are 3, 2, 1. The greatest integer less than 3.6 is 3, which is the floor(3.6).


Consider a negative real number -3.6.
Integers less than -3.6 are -4, -5, -6, etc. The greatest integer less than -3.6 is -4, which is the floor(-3.6).
Integers greater than -3.6 are -3, -2, -1. The least integer greater than -3.6 is -3, which is the ceiling(-3.6).

Floor Function
If x is a real number, then
• floor(x) denotes the greatest integer that is less than or equal to x.

e.g. floor(2.6) = 2
floor(–2.1) = –3

• Notation: ⌊x⌋ = floor(x)


Ceiling Function
If x is a real number, then
• ceiling(x) denotes the least integer that is greater than or equal to x.

Eg: ceiling(2.6) = 3
ceiling(–2.1) = –2
• Notation: ⌈x⌉ = ceiling(x)
• If x is itself an integer, then ⌊x⌋ = ⌈x⌉
• If x is not an integer, then ⌊x⌋ + 1 = ⌈x⌉


Integer and Absolute Value Functions
• Let x be any real number. The integer value of x, written INT(x), converts x into an integer by deleting the fractional part of the number.

Eg: INT(3.14) = 3, INT(-8.5) = -8

• The absolute value describes the distance of a number on the number line from 0, without considering which direction from zero the number lies.
  The absolute value of a number is never negative.
• ABS(x) or |x|
Eg: |-6| = 6
|-3.33| = 3.33
Summation Symbol : Σ
• Consider a sequence a1, a2, a3, …, an. Their sum a1 + a2 + … + an is denoted by
  ∑ aj, where j = 1, 2, 3, …, n
• The letter j in the above expression is called a dummy index or dummy variable.
• The sum of squares 2² + 3² + 4² + 5² can be represented using ∑ and the dummy variable j, with j ranging from 2 to 5, as ∑ j² = 4 + 9 + 16 + 25 = 54.


Factorial Function, Permutations

• The factorial is the product of the positive integers from 1 to n, inclusive, and is denoted by n!.
• n! = 1 × 2 × 3 × … × (n-2) × (n-1) × n

• Eg: 2! = 1 × 2 = 2; 3! = 1 × 2 × 3 = 6; 4! = 1 × 2 × 3 × 4 = 24

• A permutation of a set of n elements is an arrangement of the members of the set into a sequence or order.

• Eg: The permutations of the set consisting of the elements a, b, c are:
  abc, acb, bac, bca, cab, cba

• There are n! permutations of a set of n elements.
Algorithmic Notation
• Algorithmic notation includes the basic format conventions used in the formulation of algorithms.
• Name of algorithm: the algorithm is given an identifying name.
• Comment: a brief description of the tasks the algorithm performs and any assumptions that have been made.
• The execution of an algorithm begins at Step 1 and continues from there in sequential order, unless the result of a tested condition or an unconditional transfer (‘Go to’, ‘Exitloop’, or ‘Exit’) specifies otherwise.
• Steps: an algorithm is made up of a sequence of numbered steps, an ordered sequence of statements which describe actions to be executed or tasks to be performed. The statements in each step are executed in left-to-right order.
STATEMENTS AND CONTROL STRUCTURES
• Assignment statement: the assignment is indicated by placing an arrow (←) between the variable receiving the value and the right-hand side of the statement: A ← 12.
• An exchange of the values of two variables is written A ↔ B.
• Many variables can be set to the same value by using a multiple assignment: i ← j ← 0.
• Exit statement: the exit statement is used to terminate an algorithm. It is usually the last step.


Control Structures
• Algorithms use modules and logic for the flow of control. It can be sequence, selection or iteration.
1. Sequence logic: instructions are executed in sequential order.
2. Selection logic: depending on a condition, one module is selected. (if statements)

• Single alternative: if the condition holds, then statement A is executed; otherwise control transfers to the next step of the algorithm.
• Double alternative: if the condition is satisfied, then statement A is executed; otherwise statement B is executed.
• Multiple alternative: the statement which follows the condition that is satisfied is executed, or the statement which follows the final Else is executed.


Control Structures
3. Iteration logic, or repetitive flow: repeating the body of the loop a controlled number of times. (for loop, while loop)
• The repeat-for loop uses a control variable I to control the loop.
  Repeat for I = S to E by T:
      [Statements]
  [End of loop]
  S is the start value, E is the end value, and T is the increment.
  Eg: for(i=0;i<10;i++)
• The repeat-while loop uses a condition to control the loop.
  Repeat while condition:
      [Statements]
  [End of loop]
  The looping continues until the condition is false.
  Eg: i=0
      while(i<10)
          i++;
Chapter 4
String Processing

String
• A finite sequence of zero or more characters is called a string.

• A string of zero characters is called the empty string or null string.

• Eg: A string in the C programming language is a one-dimensional array of characters terminated by the null character '\0'.
Storing Strings
• Strings can be stored in three types of storage
• Fixed length Storage
• Variable length storage with fixed maximum
• Linked storage
Fixed length storage
• In this representation, successive characters of a string are placed in consecutive character positions.
• In fixed-length storage all records have the same length; each record can accommodate the same number of characters.
• Eg: records of 80 columns
Storing Strings : Fixed length storage

Disadvantages:
➢ Time is wasted reading blank spaces.
➢ Records may require more space than the length fixed for each record.
➢ Correcting a few characters may require the entire record to be changed.
➢ Inserting a new record may require records to be moved to new memory locations.
Fixed length storage using a linear array of pointers
• Records are not stored in consecutive memory locations.
• A linear array of pointers stores the address of each successive record.
• Inserting a new record only requires updating the pointer array.
Variable length storage
Strings of variable length can be stored in fixed-length memory locations if the actual length of each string is known. The storage of variable-length strings can be done in two general ways:
• Use a marker, such as two dollar signs ($$), to indicate the end of the string.
• Enter the length of the string as an additional item in the pointer array.
String Storage : Linked storage
• For most extensive word processing applications, strings are stored by means of linked lists.
• Each memory cell is assigned one character (or a group of characters) of the string.
• The link contained in the cell gives the address of the cell holding the next character or group of characters in the string.
• Eg: str = ‘TO BE OR NOT TO BE’.

Linked list: a series of connected nodes
A linked list is a linearly ordered sequence of memory cells called nodes.
Each node contains data and a link, which points to the next node in the list (the address of the next node).
The head is a pointer to the first node; the last node points to NULL.
Advantages and Disadvantages of Linked Storage
Adv:
• Can easily insert, delete, concatenate and rearrange
substrings when using a linked storage.
Disadv:
• Additional space is used for storing the links.
• One cannot directly access a character in the middle of the
list.
String as ADT
• An abstract data type (ADT) is a special kind of datatype which consists of a set of data values, a defined set of properties of those values, and the operations applicable to the datatype, hiding the implementation details.

• String as an ADT has its own predefined values, properties and operations.

• The string ADT values are sequences of characters up to a specified length.
• String properties: a string is made up of characters from the ASCII character set, with length varying from 0 to the specified maximum.
• String operations are used for processing the values.
String as ADT
Operations allowed on a string: a c a b a b c
0 1 2 3 4 5 6
• Return the nth character in a string - GETCHAR (str, n)
• Set the nth character in a string to c – PUTCHAR (str , n , c)
• Concatenate two strings – CONCAT(str1, str2)
• Compare string – COMPARE(str1, str2)
• Find the length of a string - LENGTH(str)
• Return the index of the 1st occurrence of a pattern in the text – INDEX (text, pattern)
• Return the position of the 1st occurrence of a pattern in the text – POS (text, pattern)
• Return a substring of str of length m, starting at the position i SUBSTRING (str, i , m)
• Delete a substring from str of length m, starting at the position i– DELETE (str, i, m)
• str1 inserted to str2 at position i - INSERT(str1, str2, i)
• Replace a character1 in text with character2 -REPLACE (text, char1 , char2)
String Operations
• Concatenation - joining two strings end to end.
• CONCAT(S1, S2) Requires the name of 2 strings.
• The string consisting of the characters of S1 followed by the characters of S2 is
called the concatenation of S1 and S2.
eg:.S1=“MCA”,S2=“DEPT”
CONCAT(S1,S2) =“MCADEPT".

• Length -returns the length of the string.


• LENGTH(string) Require the name of a string.
eg : .str=“computer”
LENGTH(str)=8
String operations
• Indexing – pattern matching
• Return the index of the first appearance of the character in the given text T
• INDEX (text, character)
• eg: T = “HIS FATHER IS PROFESSOR”
• INDEX (T, ‘S’) returns 2 (indexing from 0)

• Substring – accessing a substring from a given string


• Requires string name , starting position of the substring and length of the substring
• SUBSTRING (string , pos , length)
eg : SUBSTRING (‘THE END’,5,3) = ‘END’
Word/text Processing
• A word processor is a computer program or device that provides input,
editing, formatting and output of text and other features.
• The operations that can be performed with word processing are:
• Replace
• Insertion
• Deletion
• Replace operation involves replacing one string in the text by another
• Insertion involves inserting a string in the middle of the text
• Deletion operation involves deleting a substring from the text
Word/text Processing
• The operation associated with word processing are :
• Replace – Replacing one string in the text with another
• REPLACE (text, pattern1, pattern2)
• Eg: REPLACE(‘XYZXYZ’, ‘YZ’, ‘AB’) returns “XABXYZ” (the first occurrence of ‘YZ’ is replaced)

• Insertion – inserting a string in the middle of the text


• INSERT (text, position , string)
• Eg: INSERT (‘ABCDEFG’, 3 , ‘XYZ’) returns “ABXYZCDEFG”

• Deletion – deleting a string from the text


• DELETE (text , position , length)
• Eg: DELETE (‘ABCDEFG’, 4, 2) returns “ABCFG”
Pattern Matching Algorithms
• Given an input text string T of length N and a pattern string P of length M, pattern matching algorithms find the first (or all) occurrences of the pattern P in the text T.
• The complexity or efficiency of a pattern matching algorithm is measured by the number of comparisons C between characters of the pattern P and characters of the text T.
Pattern Matching Algorithms

• The Naive String Matching Algorithm or Brute Force Method.


• The Rabin-Karp-Algorithm.
• Finite Automata.
• The Knuth-Morris-Pratt Algorithm.
• The Boyer-Moore Algorithm.
Brute-Force Algorithm
• The brute-force algorithm simply matches corresponding
pairs of characters in the pattern and the text from left to
right
• There are at most (n − m + 1) candidate substrings and, in the worst case, m
comparisons are made on each of them, so the worst-case efficiency of the
brute-force algorithm is O(nm).
The First Pattern Matching Algorithm (Naïve or Brute
Force String Matching)

• T (text) and P(pattern) are stored as arrays of characters. Algorithm finds the
INDEX of P in T.

• Compare a given pattern P with each of the substring of T, moving from left to
right, until we get a match.

• If N is the length of the text and M is the length of the pattern, then the number of
possible substrings is S = N-M+1.
The Brute Force Method of Pattern Matching Algorithm
T[ ] = a c a b a b c (indices 0 to 6), text of length N = 7
P[ ] = a b c (indices 0 to 2), pattern of length M = 3

• Let the length of the text be N = 7 and the length of the pattern be M = 3.
• The maximum no: of shifts is S = 7-3+1 = 5.
• There are 5 possible substrings having the same length as the pattern: aca, cab, aba, bab, abc.
• Substrings are referred to by the variable i (i = 0 to S-1).
• Corresponding characters in each substring and the pattern are referred to by the variable j (j = 0 to M-1).
The Brute Force Method of Pattern Matching Algorithm
ALGORITHM: BruteForce (t[0…n-1], p[0…m-1])
Input: Text array t[] of length N and pattern p[] of length M.
Output: Index of the first character of the pattern match in the text string, or -1 if the
pattern is not found.
1. Initialize S = N-M+1
2. Repeat steps 3 to 6 for i = 0 to S-1
3.   j = 0
4.   Repeat step 5 while (j < M and T[i+j] == P[j])
5.     j = j+1
6.   If j == M, return i
7. Return -1 (pattern not found)
The Brute Force Method of Pattern Matching Algorithm (trace)
T[ ] = a c a b a b c (N = 7), P[ ] = a b c (M = 3), S = 5
• i determines the shift (the substring number), i = 0 to 4.
• j determines the character within each substring, j = 0 to 2.
• Characters are compared as T[i+j] == P[j].

Shift i = 0: T[0]=a == P[0]=a; T[1]=c != P[1]=b → 2 comparisons
Shift i = 1: T[1]=c != P[0]=a → 1 comparison
Shift i = 2: T[2]=a == a; T[3]=b == b; T[4]=a != c → 3 comparisons
Shift i = 3: T[3]=b != a → 1 comparison
Shift i = 4: T[4]=a == a; T[5]=b == b; T[6]=c == c → j reaches M, return i = 4 (3 comparisons)

Total no: of comparisons = 2 + 1 + 3 + 1 + 3 = 10
Find the no: of comparisons required for returning the index of the pattern in the text.
Text T[ ] = a c a c a c a b c a b c d; length of text, N = 13
Pattern P[ ] = a b c d; length of pattern, M = 4
No: of substrings, S = N-M+1 = 13-4+1 = 10
Brute-Force Algorithm
• The brute-force algorithm simply matches corresponding pairs of
characters in the pattern and the text from left to right
• If a mismatch occurs, it shifts the pattern one position to the right
for the next trial.
• There are at most (n − m + 1) substrings and, in the worst case, m
comparisons need to be made on each of them.
• The worst-case efficiency of the brute-force algorithm is O(nm).
Faster Pattern Matching Algorithm
• To get a faster algorithm, enhance the input by preprocessing
the pattern to get some information about it, and store this
information in a table.
• Use this information during an actual search for the pattern
in a given text.
• The Boyer-Moore algorithm is one such algorithm: it uses the
information stored in tables to reduce the no: of comparisons
and hence makes the search faster.
Boyer-Moore algorithm
• The Boyer-Moore algorithm starts by aligning the pattern
against the beginning characters of the text;
• Compare characters of a pattern with their counterparts in a
text, starting with the last character in the pattern.

• If the first trial fails, it shifts the pattern to the right. The
comparisons within a shift are done from right to left.
Boyer-Moore algorithm
• The Boyer-Moore algorithm precomputes the shift size.
• The shift size is determined by considering two quantities from the
preprocessed information.
• Bad‐symbol table indicates how much to shift based on the
text’s character that causes a mismatch
• Good‐suffix table indicates how much to shift based on
matched part (suffix) of the pattern

Ms.Benita Jaison,MCA Dept., KDC


Boyer-Moore algorithm
• The first one is guided by the character in text that caused a
mismatch with the character in pattern. It is called the bad
symbol shift.(Bad symbol Heuristic table)
• The second type of shift is guided by a successful match of
the last k characters of the pattern(k>0). It is called the
good-suffix shift. (Good suffix Heuristic table)



Bad-symbol shift table
• The bad-symbol table entry for each character is the number of shifts required for
that character, counted from the last character of the pattern; the pattern is scanned
from left to right, and if a character is repeated, the earlier entry is overwritten by
the later one.
• The last overwrite therefore happens for the character’s rightmost occurrence.
• For all other characters and special symbols which are not present in the
pattern, the shift entry equals the length of the pattern.



Bad-symbol shift table
• If the unmatched text character is not in the pattern, shift the
pattern to pass that character in the text.
• The size of the shift can be computed by the formula (t1(c) − k),
where t1(c) is the entry in the precomputed table and k is the
number of matched characters.



Bad-symbol shift table
• If the mismatch occurs at the first compared character itself, k is 0 (k = 0). Retrieve
the entry t1(c) from the bad-symbol table for the mismatched text character c and
calculate t1(c) − k. Since this may be negative or zero, d1 = max{t1(c) − k, 1}.



Good-suffix shift table
• If k > 0, also calculate the good-suffix shift.
• Compute the good-suffix shift table as follows:
• Good-suffix table indices are k = 1...m−1, where k indicates the no: of successfully
matched characters and m is the length of the pattern.
• The d2 values give how far we can shift after matching a k-character suffix (from
the right).
• The length of the shift required for the pattern, d2, is calculated as d2 = suff(k).



Good-suffix shift table
• A shift recorded in this table is valid only when the other occurrence of the matched
suffix in the pattern is not preceded by the same character as its rightmost occurrence,
because such an alignment would repeat the same failure.
• If no such occurrence exists, the pattern is shifted by its entire length.



Boyer-Moore algorithm
1. For a given pattern, construct the bad-symbol shift table.
2. Using the pattern, construct the good-suffix shift table.
3. Align the pattern against the beginning of the text.
4. Repeat the following step until either a matching substring is
found or the pattern reaches beyond the last character of the text.
5. Compare characters of the pattern with the corresponding characters of the text,
starting with the last character in the pattern.
• If a mismatch occurs with k = 0 matched characters, let c be the mismatched text
character; retrieve the entry t1(c) from the bad-symbol table and calculate t1(c) − k.
If it is negative or zero, d1 = max{t1(c) − k, 1}.
• If the no: of matching characters k is greater than 0 (k > 0), also retrieve the
corresponding d2 = suff(k) entry from the good-suffix table.
6. Shift the pattern to the right by d = max{d1, d2} positions.
Boyer-Moore algorithm (example)
[The slides at this point illustrate the Boyer-Moore algorithm with a worked example in figures.]
Important questions
• What is a string?
• Which is the string concatenation operator?
• Write a note on storing strings.
• Write a note on character data type.
• Write a note on strings as ADT
• Explain the Brute force pattern matching algorithm.
• How the efficiency of the algorithm can be increased by the
precomputed tables used in Boyer Moore algorithm.
• Explain Boyer Moore algorithm using an example.
STRING FUNCTIONS in C
• The C library <string.h> supports a large number of string-handling functions that
can be used to carry out string manipulations.
• Commonly used string-handling functions are:

Function   Action
strcat()   Concatenates 2 strings
strcmp()   Compares 2 strings
strcpy()   Copies one string over another
strlen()   Finds the length of a string
STRING HANDLING FUNCTIONS
strlen()
• strlen() function takes string name as parameter and returns the length of
string.
strcpy()
• strcpy() takes 2 strings and copies the content of one string to the content of
another string.
strcat() (concatenation, often denoted ‘||’)
• strcat() takes 2 strings and concatenate (joins) two strings.
• Resultant string is stored in the first string specified in the argument
strcmp()
• strcmp() takes two string and returns value 0, if the two strings are equal.
String Copy without using string function strcpy()

#include <stdio.h>

int main(void)
{
    char s1[10], s2[10];
    int i;
    printf("\nEnter the string :");
    scanf("%9s", s1);            /* reads the input string (gets() is unsafe) */
    i = 0;
    while (s1[i] != '\0') {      /* copy character by character */
        s2[i] = s1[i];
        i++;
    }
    s2[i] = '\0';                /* terminate the copied string */
    printf("Original string in s1 is %s", s1);
    printf("\nCopied String in s2 is %s ", s2);
    return 0;
}
Deleting every occurrence of P in T

Algorithm
Input: Text T and pattern P
Output: Returns text T, after deleting every occurrence of P in T
1. Set pos = INDEX(T, P)
2. Repeat steps 3 to 4 while (pos != -1)
3.   T = DELETE(T, pos, LENGTH(P))
4.   pos = INDEX(T, P)
5. Write T
Replacing every occurrence of P in T
Algorithm
Input: Text T, pattern P and replacement string Q (assuming Q does not contain P)
Output: Returns text T, after replacing every occurrence of P by Q in T
1. Set pos = INDEX(T, P)
2. Repeat steps 3 to 5 while (pos != -1)
3.   T = DELETE(T, pos, LENGTH(P))
4.   T = INSERT(T, pos, Q)
5.   pos = INDEX(T, P)
6. Write T
ARRAYS
Chapter 5

Ms. Benita Jaison, Dept. of CSA, St. Francis College.


Arrays
• An array is a list of elements of the same data type
• An array is a list of finite number of homogeneous data elements.

Syntax of array declaration:


• datatype variablename[size];

• Eg:
• int a[10]; //integer array
• char a[10]; //character array
Arrays
• Each Programming language has its own rules for declaring arrays.
• Declaration must implicitly or explicitly provide 3 types of information:
• The name of the array
• The data type of the array
• The size (index set) of the array
• Some programming languages (eg:FORTRAN) allocate memory space for arrays
statically during the compilation of the program.

• Some languages (eg: C) can also allocate memory space dynamically, by
accepting the size n at run time and then allocating an array of n elements.
• int a[10]; //Static allocation
• int *a; // Dynamic allocation
• a=(int *)malloc( n* sizeof(int))
Arrays

• Elements of an array are referenced by subscript or index.


• Index consist of n consecutive numbers starting from 0 to n-1 ,
where n is the size of the array.
• Elements are stored in successive memory locations.

• a[0], a[1], …a[n-1], where n is the size of the array.


Eg: int a[5] ; a[0],a[1],a[2],a[3],a[4]
• Length of the array = Upper Bound - Lower Bound +1 = 4-0+1=5
Arrays as ADT
• The array is an abstract data type (ADT) that holds a collection
of elements (data)accessible by an index.
• The elements stored in an array can be either primitives types
or more complex types.
Property:
• Arrays store and retrieve items using an integer index.
• An item is stored in a given index and can be retrieved at a later time
by specifying the same index.
Operations on Arrays
• Traversal: processing each element in the list
• Search: finding the location of the element with a given value
or the record with a given key.
• Insertion: adding a new element to the list
• Deletion: removing an element from the list.
• Sorting: arranging the elements in some order.
• Merging: combining two lists into a single list.
Representation of linear arrays in memory
• Whenever an array is declared in the program, contiguous memory is
allocated to its elements.
• Initial address of the array – address of the first element of the array is
called base address of the array.
• Each element will occupy the memory space required to accommodate
the values for its datatype,
• Eg: 1 byte for char, 2 bytes for int (on a 16-bit system) is allocated for each
element in the array.

Eg: int a[6] with base address 1002 (2 bytes per int):
a[0] → 1002, a[1] → 1004, a[2] → 1006, a[3] → 1008, a[4] → 1010, a[5] → 1012
Representation of linear arrays in memory
• The address location of any element Ai is given by the following equation
loc(Ai)=L0+ (i) * c
• Address(Ai)=Base Address + i * sizeof(datatype)
• L0 is the base address of the array A and c represents the size of the data type.
• Eg: int A[6] = {34, 23, 12, 56, 45, 89}, indices 0 to 5, base address 2078:
  Loc(A2) = 2078 + 2 * 2 = 2082
Traversing linear arrays
Accessing and processing each elements of a[n] , where n is the
size of the array.
Processing can be:
• to print the content of each element of a[]
• to count the number of elements of a[].
• to increment or decrement each element of a[] by certain value.
Traversing linear arrays
Algorithm to traverse a linear array using Repeat for loop:
1. Repeat step 2 for i=0 to n-1:
2. Print a[i]
3. Exit.

Algorithm to traverse a linear array using Repeat while loop:


1. Initialize i=0.
2. Repeat 3 to 4 while (i<= n-1)
3. Print a[i]
4. i=i+1.
5. Exit.
Inserting
• Inserting refers to the operation of adding a new element to an
array A.
• Inserting an element at the end of a linear array can be done directly,
provided enough memory space has been allocated for the array.
• Inserting in the middle of an array requires that the elements after the
insertion position (on average, half the elements) be moved downward to new
locations, to accommodate the new element while keeping the order of elements.
Steps:
• Create space in the array by moving each element, from the insertion
position onward, downward by one location.
• Insert item into the array in the space created.
• Increase the size or the count of the no:of elements of the array by 1 to
account for the new element.
Inserting
Input : An array a[] with n no:of elements ,Val to be inserted at the index
position IndPos.
Output :Inserts an element Val at the index position IndPos in an array.
INSERT (a, n, IndPos, val)
1. Set i=n-1
2. Repeat steps 3 and 4 while (i>=IndPos)
3. Set a[i+1]:=a[i]
4. i--
5. Set a[IndPos]:=Val
6. Set n=n+1
7. exit
Insertion (worked example)
int a[10] = {2, 8, 5, 12, 9, 44}, n = 6, Val = 77, IndPos = 3
i starts at n-1 = 5; while (i >= IndPos): a[i+1] = a[i], i--
  a[6] = a[5] = 44; a[5] = a[4] = 9; a[4] = a[3] = 12
a[IndPos] = 77; n = n + 1 = 7
Result: {2, 8, 5, 77, 12, 9, 44}
Deleting
• Deleting refers to the operation of removing one element from the
array A[].
• Deleting an element at the end of an array does not require much
change.
• But deleting an element in the middle requires moving each subsequent
element one location towards the left in order to fill up the gap in the array.
Deleting
Input : An array a[] of with n no:of elements , index position IndPos of the number
to be deleted.
Output :Value Val at index position IndPos deleted.
Delete (a, n, IndPos)
1. Set Val=a[IndPos]
2. i=IndPos
3. Repeat step 4 and 5 while (i <n-1)
4. Set a[i] =a[i+1]
5. i=i+1
6. Set n=n-1
7. Print “Val Deleted”
8. Exit
Deletion (worked example)
int a[10] = {2, 8, 5, 12, 9, 4}, n = 6, IndPos = 3
Val = a[IndPos] = 12
i starts at IndPos = 3; while (i < n-1): a[i] = a[i+1], i++
  a[3] = a[4] = 9; a[4] = a[5] = 4
n = n - 1 = 5
Result: {2, 8, 5, 9, 4}; Val = 12 deleted
Sorting
• Sorting refers to the operation of rearranging the elements of
an array a[] so that they are in increasing or decreasing order.
• Suppose a[] has the following elements 8,4,19,2,7,13,5,16.
• After sorting the array in ascending a[] becomes
2,4,5,7,8,13,16,19.
• After sorting the array in descending a[] becomes
19,16,13,8,7,5,4,2.
Bubble Sort
• Bubble sort is a simple comparison-based algorithm in
which a pair of adjacent elements is compared at a time
and the elements are swapped if they are not in
order.
• Each iteration through the array places the next largest
value in its proper place.
• Largest number “bubbles” up to the location where it
belongs.
• Items already bubbled up are excluded from the next iteration.
Bubble sort
• Suppose the list of numbers a[0], a[1], …, a[n-1] is in memory.
• Bubble Sort Algorithm
• Step 1: compare adjacent elements and arrange them so that the largest element
bubbles up to position a[n-1]; this involves n-1 comparisons.
• Step 2: repeat step 1 with one less comparison; the next largest element bubbles
up to position a[n-2]; this involves n-2 comparisons.
• …
• Step n-1: compare a[0] with a[1] and arrange them so that a[0] < a[1].
After these steps, the list will be sorted in increasing order.
First iteration (eg: a = {12, 88, 5, 3, 9, 4, 1})
Starting at j = 0, each adjacent pair a[j], a[j+1] is compared and swapped (via temp)
if out of order; by the end of the first pass the largest value, 88, has bubbled up to
the last position.
Algorithm
Input : Array a[] of size n
Output :Sorted elements in the array
BUBBLESORT (a[],n)
1. Repeat step 2 for i=0 to i< n-1
2. Repeat step 3 and 4 for j=0 to j< n-1 -i
3. If a[j]>a[j+1]
temp=a[j]
a[j]=a[j+1]
a[j+1]=temp
4. j=j+1
5. Exit
Complexity of Bubble Sort

void bubsort(int a[], int n)
{
    int i, j, temp;
    for (i = 0; i < n-1; i++)             /* iterations */
        for (j = 0; j < n-1-i; j++)       /* comparisons */
            if (a[j] > a[j+1]) {          /* swapping */
                temp = a[j];
                a[j] = a[j+1];
                a[j+1] = temp;
            }
}

For n = 8: the no: of iterations is 7 (i = 0 to n-2), and the no: of comparisons in each
iteration is n-1-i:
1st iteration 8-1-0 = 7, 2nd 8-1-1 = 6, 3rd 5, 4th 4, 5th 3, 6th 2, 7th 1.
Total comparisons = 7+6+5+4+3+2+1 = (n-1)+(n-2)+…+2+1 = n(n-1)/2 ≈ n²/2 = O(n²)
In each comparison, if a[j] > a[j+1] a swap operation is performed;
the basic operations are comparison and swapping.
Best Case Time Complexity
• If the numbers are already sorted in ascending order, a version of the
algorithm that tracks whether any swap occurred will determine in the first
iteration that no number pairs need to be swapped and can terminate immediately.
• The algorithm must then perform only n-1 comparisons.
• The best-case time complexity of Bubble Sort (with this optimization) is O(n).
Worst and Average Case Time Complexity

• If the elements in the array are sorted in descending order, every
comparison across all iterations leads to a swap.
• The worst-case time complexity of Bubble Sort is O(n²).
• In the average case, there may be about half as many
exchange operations as in the worst case, since about half of
the elements are in the correct position compared to the
neighboring element.
• The average-case time complexity of Bubble Sort is O(n²).
Selection sort
• Selection sort is a sorting algorithm that selects the
smallest element from an unsorted list in each
iteration and places that element at the beginning of
the unsorted list.
Worked example: a = {22, 12, 34, 14, 10}
First iteration (Min starts at 0): the smallest element 10 is found at index 4;
swap with a[0] → {10, 12, 34, 14, 22}
Second iteration (Min starts at 1): the smallest of the rest is 12, already at a[1]
→ {10, 12, 34, 14, 22}
Third iteration (Min starts at 2): the smallest of the rest is 14 at index 3;
swap with a[2] → {10, 12, 14, 34, 22}
Fourth iteration (Min starts at 3): the smallest of the rest is 22 at index 4;
swap with a[3] → {10, 12, 14, 22, 34}
Algorithm
Input: Array a[] of size n
Output: Sorted elements in the array
SELECTIONSORT (a[], n)
1. Repeat steps 2 to 6 for i = 0 to n-2
2.   Min = i
3.   Repeat steps 4 to 5 for j = i+1 to n-1
4.     If a[j] < a[Min]
5.       Min = j
6.   Temp = a[i]
     a[i] = a[Min]
     a[Min] = Temp
7. Exit
Best case of Selection sort
• Selection Sort is easy to implement.
• Selection sort has an average, best-case, and worst-case time
complexity of O(n²).
• Selection Sort is usually slower than Insertion Sort.
• The best case is the case when the array is already sorted.
• Swapping at each step can be avoided, but the time spent finding
the smallest element is still O(N). Hence, the best case has:
• N * (N-1) / 2 comparisons
• 0 swaps
Worst case of Selection sort
• The worst case is the case when the array is already sorted in
descending order.
• The cost in this case is that a swap is done at each step. This
is because the smallest element of the remaining unsorted sub-array
keeps appearing at its far end. Hence, the worst case has:
• N * (N-1) / 2 comparisons
• N-1 swaps
• Hence, the time complexity is O(N²).
Average case of Selection sort
• Based on the worst case and best case, we know that the
number of comparisons is the same for the average case as
well; the number of comparisons is constant regardless of input order.
• Number of comparisons = N * (N-1) / 2
• Therefore, the time complexity is O(N²).
• The number of swaps is at most N-1.
Insertion Sort
• Insertion sort is the sorting mechanism where the sorted array is
built one item at a time.
• For the 1st iteration, the element at index 1 is set as the key
element.
• The key element is compared with the sorted elements on its left
sequentially and then inserted at the correct position in the list.
• The analogy is the same as arranging a deck of cards.
• This sort works on the principle of inserting an element at a
particular position, hence the name Insertion Sort.
Algorithm
Input : Array a[] of size n
Output :Sorted elements in the array
INSERTIONSORT (a[],n)
1. Repeat step 2 to step 7 for i=1 to i<n
2. key=a[i];
3. j=i-1;
4. Repeat step 5 to step 6 until j>=0 and a[j]>key
5. a[j+1]=a[j];
6. j=j-1;
7. a[j+1]=key;
Worked example: a = {9, 5, 1, 4, 10}
First iteration (i = 1, key = 5, j = 0): 9 > 5, so a[1] = 9 and j becomes -1;
a[j+1] = key → {5, 9, 1, 4, 10}
Second iteration (i = 2, key = 1): 9 > 1 and 5 > 1, both shift right;
a[0] = key → {1, 5, 9, 4, 10}
Third iteration (i = 3, key = 4): 9 > 4 and 5 > 4 shift right; 1 < 4 stops the loop;
a[1] = key → {1, 4, 5, 9, 10}
Fourth iteration (i = 4, key = 10): 9 < 10, so the while loop exits immediately;
key stays in place → {1, 4, 5, 9, 10}
Difference between Bubble, Selection and Insertion Sort
• Bubble sort finds the largest number and “bubbles” it up to the last
location, then finds the 2nd largest element and moves it to the last-but-one
position, and continues till the array is sorted.
• Selection sort is the sorting algorithm that finds the smallest
element in the array and exchanges it with the element in the first
position, then finds the second smallest element and exchanges it
with the element in the 2nd position, and continues till the array is sorted.
• Insertion sort is the sorting algorithm that sorts by inserting each
element into its place in an already-sorted part of the list.
Complexity of Insertion Sort
• The worst-case time complexity of Insertion Sort is when the
array elements are sorted in descending order and it is given
as : O(n²)
• Best-case time complexity: if the elements already appear in
sorted order, there is precisely one comparison in the inner
loop per element and no shift operation at all. The best-case time
complexity of Insertion Sort is: O(n)
• The average time complexity of Insertion Sort is: O(n²)
Divide and Conquer Approach

• In divide and conquer approach, a problem is divided into


smaller problems, then the smaller problems are solved
independently, and finally the solutions of smaller problems are
combined into a solution for the large problem.
General plan for Divide-and-conquer
algorithms has three parts
• Divide the problem into a number of sub-problems of the
same type, that are smaller instances of the same problem.
• Conquer the sub-problems by solving them recursively. If
they are small enough, solve the sub-problems as base
cases.
• Combine the solutions to the sub-problems into the
solution for the original problem.
Quicksort
• Quicksort is a sorting algorithm that is based on the divide-and
conquer approach.
• Quick sort involves the partition of array elements based on a pivot
element.
• A pivot element is an element with respect to whose value the
subarray will be divided.

• A partition is an arrangement of the array’s elements so that all the


elements to the left of pivot element are less than or equal to pivot,
and all the elements to the right are greater than or equal to the pivot
element.
Quicksort
• After a partition, the pivot element will be in its final
position in the sorted array,
• Continue sorting of the two subarrays to the left and to
the right of pivot independently.
• The entire sorting happens in the division stage, with
no operation required to combine the solutions to the
subproblems.
ALGORITHM Quicksort(a, low, high)
Input: Subarray of array a[0..n−1], defined by its left and right indices low
and high
Output: Subarray a[low..high] sorted in ascending order

1. If low >= high then return
2. Set pivot ← low; i ← low+1; j ← high
3. while (i <= j)
     while (i <= high and a[i] <= a[pivot]) increment i. End while.
     while (a[j] > a[pivot]) decrement j. End while.
     if (i < j) swap the values of a[i] and a[j]
   End while
4. Swap the values of a[pivot] and a[j]
5. Quicksort(a, low, j-1)
6. Quicksort(a, j+1, high)
Quicksort’s efficiency
• The number of key comparisons made before a partition is achieved is
n + 1 if the scanning indices cross over, and n if they coincide.
• Best case : If all the splits happen in the middle of
corresponding subarrays.
• Worst case : If all the splits happen only at one end, it
results in one subarray.
Best Case
Best case : If all the splits happen in the middle of corresponding subarrays.
Worst Case
In the worst case, one of the two subarrays will be empty,
and the size of the other will be just 1 less than the size of
the subarray being partitioned.
The splits continue until the last one-element subarray has been
processed, giving a total number of comparisons of order O(n²).
Mergesort
• Mergesort is an application of the divide-and conquer
technique.
• Mergesort sorts a given array a[0..n − 1] by
1. Dividing array a[] into two halves a[0..mid] and a[mid+1..n −
1], using Partition function.
2. Sorting each of them recursively, then merging the two smaller
sorted arrays into a single sorted one, using Merge function
ALGORITHM Partition(int a[], int low, int high)
Input: An array a[0..n − 1] and indices low, high
Output: Array a[low..high] sorted in ascending order
{
    if (low >= high) then return;
    mid = (low + high) / 2;
    partition(a, low, mid);
    partition(a, mid+1, high);
    merge(a, low, mid, high);
}
Mergesort
• The merging of two sorted arrays can be done as follows.
• Two control variables are initialized to point to the first elements
of the arrays.
• The elements pointed to are compared,
• The smaller of them is added to a new array being constructed;
• The index of the smaller element array and the new array is
incremented.
• The comparison is repeated until one of the two given arrays is
exhausted, and then the remaining elements of the other array
are copied to the end of the new array.
ALGORITHM Merge(a[low..mid], a[mid+1..high], t[low..high])
Input: Arrays a[low..mid] and a[mid+1..high], both sorted
Output: Sorted array a[low..high] of elements
{
    i ← low; j ← mid+1; k ← low;
    while (i <= mid && j <= high)
    {
        if (a[i] < a[j]) then copy a[i++] to t[k++]
        else copy a[j++] to t[k++]
    }
    while (i <= mid) copy a[i++] to t[k++]
    while (j <= high) copy a[j++] to t[k++]
    for (i = low; i <= high; i++)
        copy t[i] to a[i]
}
Best case, Worst case and Average case

• Best Case Time Complexity (input sorted in ascending order): O(N logN);
no: of comparisons ≈ 0.5 N logN, with no element moves needed.
• Average Case Time Complexity: O(N logN);
no: of comparisons ≈ 0.74 N logN, with about half as many element moves as the worst case.
• Worst Case Time Complexity (input sorted in descending order): O(N logN);
no: of comparisons ≈ N logN, with about double the element moves of the average case.
Advantages and disadvantages of Merge sort
Advantage:
• Merge sort is a stable algorithm.
• It can be applied to files of any size.
• Time complexity (nlogn) is very efficient when compared to other algorithms
Disadvantage:
• Algorithm requires linear amount of extra storage space.
• Soln: Divide the list to be sorted in more than two parts, sort each recursively, and
then merge them together . This is called multiway mergesort and is particularly
useful for sorting files residing on secondary memory devices.
Merge sort Vs Quicksort
• Mergesort, divides its input elements according to their position
in the array.
• Quicksort divides its input elements according to their value.
• In mergesort the division of the problem into two subproblems is
immediate and the entire sorting happens during merging.
• In quicksort the entire work happens in the division stage, with no
work required to combine the solutions of the subproblems.
• Quick sort is an in-place algorithm.
• Merge sort is not a in-place algorithm.
Shell sort
• Shell sort is an extended version of insertion sort, in which
elements are compared by elements separated by a gap of several
positions.
• It is a comparison-based and in-place sorting algorithm. Shell sort
is efficient for medium-sized data sets.
• The algorithm first sorts the elements that are far away from each
other, then it subsequently reduces the gap between them. This
gap is called as interval. This interval can be calculated by using
the Knuth's formula given
h = h * 3 + 1
where 'h' is the interval having initial value 1.
First loop
• n is equal to 8 (size of the array), so the elements are lying at the
interval of 4 (n/2 = 4). Elements will be compared and swapped if
they are not in order.
Second loop
• Elements are lying at the interval of 2 (n/4 = 2), where n = 8.
Third loop
• Elements lying at the interval of 1 (n/8 = 1), where n = 8, is compared
to sort the rest of the array elements.
• In this step, shell sort uses insertion sort to sort the array elements.
Algorithm for Shell Sort
Step 1: Initialize the gap size, i.e. h
Step 2: Divide the array into sub-arrays, each having an interval of h
Step 3: Sort the sub-arrays with insertion sort
Step 4: Reduce the value of h
Step 5: Repeat the above steps until the array is sorted
ALGORITHM ShellSort(a[], n)
Input: Array a[ ] of size n
Output: Sorted array a[ ] of elements
1. Repeat steps 2 to 8 for k=n/2 to k>0, decrement k=k/2
2. Repeat steps 3 to 8 for i=k to i<n, increment i=i+1
3.     key=a[i]
4.     j=i
5.     Repeat steps 6 to 7 while j>=k && a[j-k]>key
6.         a[j]=a[j-k]
7.         j=j-k
8.     a[j]=key
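A minimal Python sketch of the algorithm above, using the n/2, n/4, …, 1 gap sequence from the worked example:

```python
def shell_sort(a):
    """In-place shell sort using the gap sequence n/2, n/4, ..., 1."""
    n = len(a)
    k = n // 2
    while k > 0:                        # shrink the gap after each pass
        for i in range(k, n):           # gapped insertion sort for this interval
            key = a[i]
            j = i
            while j >= k and a[j - k] > key:
                a[j] = a[j - k]         # shift the larger element k positions right
                j -= k
            a[j] = key
        k //= 2
    return a
```

When k reaches 1 the final pass is an ordinary insertion sort, but on data that is already nearly sorted.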
Shell Sort Complexity
• Best Case Complexity - It occurs when there is no sorting required, i.e.,
the array is already sorted. The best-case time complexity of Shell sort
is O(n*logn).
• Average Case Complexity - It occurs when the array elements are in
jumbled order that is not properly ascending and not properly
descending. The average case time complexity of Shell sort
is O(n*logn).
• Worst Case Complexity - It occurs when the array elements are
required to be sorted in reverse order. That means suppose you have to
sort the array elements in ascending order, but its elements are in
descending order. The worst-case time complexity of Shell sort is O(n²).
Heap Sort
• Heap sort is a comparison-based sorting technique based on Binary Heap
data structure.
• The concept of heap sort is to eliminate the elements one by one from
the heap part of the list, and then insert them into the sorted part of the
list.
• Heapsort is an in-place sorting algorithm.
Heap Sort
• A binary tree is a tree in which each node can have a maximum of two children.
• A complete binary tree is a binary tree in which all the levels are completely filled
except possibly the last, which is filled from the left (left-justified).
• A Binary Heap is a Complete Binary Tree where items are stored in a special order
such that the value in a parent node is greater(or smaller) than the values in its two
children nodes.
• Max heap if the value in parent node is greater than the values in its two children
nodes and Min heap if parent node is smaller.
• The process of reshaping a binary tree into a Heap data structure is known as
‘heapify’.
• Heapify procedure can be applied to a node only if its children nodes are heapified.
Heapification must be performed in the bottom-up order.
Steps for Heap Sort using Max Heap
1. Build a max heap from the input data.
2. The largest item is stored at the root of the heap.
3. Replace it with the last item of the heap followed by reducing the
size of heap by 1.
4. Heapify the root of tree.
5. Repeat above steps while size of heap is greater than 1.
Note :Heap Sort using max heap sorts in ascending order.
Steps for Heap Sort Using Min Heap
1. Build a min heap from the input data.
2. At this point, the smallest item is stored at the root of the heap.
3. Replace it with the last item of the heap followed by reducing the
size of heap by 1.
4. Heapify the root of tree.
5. Repeat above steps while size of heap is greater than 1.
Note :Heap Sort using min heap sorts in descending order.
Binary Tree: Array representation
A complete binary tree with n nodes can be stored level by level in an array at
indices 0 to n-1.
If n is the number of nodes, then:
• Internal nodes occupy indices 0 to n/2 - 1.
• Leaf nodes occupy indices n/2 to n-1.
If i is the index of a node, then:
• the index of its left child is 2 * i + 1
• the index of its right child is 2 * i + 2
Heap Sort – Max Heap
Heap Sort
Delete the root element (89) from the max heap:
To delete this node, swap (89) with the last node, (11).
The largest number which is swapped to last position is shown as deleted node by
ignoring from the next step of Heapify, by reducing the array size by one.
Heap Sort
After deleting the root element, we again have to heapify it to convert it into max
heap.
Heap Sort
In the next step, to delete the root element (81) from the max heap,swap it with the last node, i.e. (54).
After deleting the root element, heapify it to convert it into max heap.
ALGORITHM heapSort(int a[], int n)
1. Repeat step 2 for i = n/2 - 1 to i >= 0, decrement i
2.     heapify(a, n, i)
3. Repeat steps 4 to 5 for i = n - 1 to i >= 0, decrement i
4.     swap(&a[0], &a[i])
5.     heapify(a, i, 0)
ALGORITHM heapify(int a[], int n, int i)
1. large = i
2. lt = 2 * i + 1
3. rt = 2 * i + 2
4. if (lt < n && a[lt] > a[large])
       large = lt
5. if (rt < n && a[rt] > a[large])
       large = rt
6. if (large != i)
       swap(&a[i], &a[large])
       heapify(a, n, large)
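A minimal Python sketch of the two procedures above (max heap, so the result is in ascending order):

```python
def heapify(a, n, i):
    """Sift the element at index i down so the subtree rooted at i is a max heap."""
    large = i
    lt, rt = 2 * i + 1, 2 * i + 2       # array indices of the two children
    if lt < n and a[lt] > a[large]:
        large = lt
    if rt < n and a[rt] > a[large]:
        large = rt
    if large != i:
        a[i], a[large] = a[large], a[i]
        heapify(a, n, large)

def heap_sort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # build the max heap bottom-up
        heapify(a, n, i)
    for i in range(n - 1, 0, -1):        # move the root (maximum) to the sorted tail
        a[0], a[i] = a[i], a[0]
        heapify(a, i, 0)                 # re-heapify the reduced heap
    return a
```

Heapify is called bottom-up (from n/2 - 1 down to 0) because a node can only be heapified once its children already are.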
Heap Sort Complexity and Efficiency
• Time Complexity: The time complexity of heapify is O(log n). Building the
initial heap takes O(n), and the overall time complexity of heap sort
is O(n log n).
• Advantages of heapsort –
• Efficiency – The time required to perform heap sort grows as O(n log n),
while simpler algorithms such as the quadratic sorts slow down far more
rapidly as the number of items to sort increases. This sorting algorithm is
very efficient.
• Memory Usage – Memory usage is minimal because, apart from what
is necessary to hold the initial list of items to be sorted, it needs no
additional memory space to work.
Searching
• Searching refers to the operation of finding the location of an ITEM in an array of
elements, or reporting "not found" if the item is not present.
• The search is successful if the ITEM is found, and unsuccessful otherwise.
• There are two common searching algorithms: linear search and binary search.
• The complexity of a searching algorithm is measured in terms of the
number f(n) of comparisons required to find the item in an array of n elements.
• Linear search is proportional to O(n); binary search is proportional to
O(log2 n).
Linear Search
• Linear search or sequential search is the method which
traverses the array A[] sequentially to locate an ITEM.
• Suppose a[] is a linear array with n elements. To search for a
KEY number in a[], involves comparing KEY number with
each element of a[] one by one.
Sequential or Linear Search
Algorithm Sequential Search
Input: Array A[0,1,…,n-1] of size n and search key S
Output: Index of the first element in A that matches S, or -1 if there is no matching element
Step 1: Initialise i=0.
Step 2: Repeat step 3 while (i<n and A[i] != S)
Step 3: i=i+1
Step 4: if i<n,
            return i as the index
        else
            return -1
Example: searching for S = 40 in int A[10] = {22, 32, 44, 52, 67, 15, 90, 87, 37, 40}.
The key 40 is compared with A[0], A[1], …, in turn; A[0] through A[8] do not match,
and A[9] == 40, so the search stops and returns index 9.
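The algorithm and trace above, as a minimal Python sketch:

```python
def sequential_search(A, S):
    """Return the index of the first element of A equal to S, or -1 if absent."""
    i = 0
    n = len(A)
    while i < n and A[i] != S:   # scan until a match or the end of the array
        i += 1
    return i if i < n else -1
```

On the example array the loop runs its full length, which is the worst case for linear search.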
Binary Search
• Binary search is the most popular Search algorithm.
Binary search works only on a sorted set of
elements.
• Binary search begins by comparing the element in
the middle of the array with the search value.
• The index of the middle element is calculated as mid = (first + last)/2.
• The algorithm eliminates the half in which the
search value cannot be a part.
Binary Search
• First sort the elements in the array.
• The index of the middle element is calculated as mid = (first + last)/2.
• Compare the element in the middle of the array with the search value.
• If the search value is equal to the mid element, its position in the array is returned.
• If the search value is greater than the mid element, then make first = mid + 1.
• If the search value is less than the mid element, then make last = mid - 1.
Binary Search
• Binary Search is an efficient technique for searching an ordered
sequential list of elements.
Procedure :
• Accept the sorted array elements a[0,1,2,…n-1]
• Accept the search element x
• Compare x with the middle element of the sorted array
• If x matches with middle element, return the mid index.
• Else If x is greater than the mid element, then search x in right half subarray after
the mid element.
• Else (x is smaller) search in left half.
Binary Search Algorithm
BinarySearch(A[], ITEM, N)
Input: A sorted array A[] of N elements and the ITEM being searched.
Output: Position of the ITEM if found, or "not found".
1. Set first=0, last=N-1, ind=-1.
2. Repeat steps 3 to 4 while (first<=last)
3.     Set mid=(first+last)/2
4.     if (ITEM==A[mid]) then
           Set ind=mid and go to step 5
       else if (ITEM>A[mid]) then
           Set first=mid+1
       else
           Set last=mid-1
5. If (ind>=0) then
       Write('element found in location', ind+1)
   Else Write('element not found')
6. Exit
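A minimal Python sketch of the algorithm above, returning the index instead of printing:

```python
def binary_search(a, item):
    """Return the index of item in the sorted list a, or -1 if not found."""
    first, last = 0, len(a) - 1
    while first <= last:
        mid = (first + last) // 2   # recompute the middle on every iteration
        if item == a[mid]:
            return mid
        elif item > a[mid]:
            first = mid + 1          # discard the left half
        else:
            last = mid - 1           # discard the right half
    return -1
```

Each iteration halves the remaining range, which is where the O(log2 n) bound comes from.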
Conditions for Binary Search
The binary search algorithm requires the following conditions :
1. The list must be sorted.
2. There must be elements present in each subarray such that each
step yields FIRST<=LAST , eventually when item is not present in the
array, FIRST=LAST=MID.
3. There should be direct access to the middle element.
Two-dimensional arrays
• A two-dimensional array is a table of items, all of the same type and same
name organised in two dimensions.
• It has finite number of rows and finite number of columns.
• 2D arrays are generally known as matrix or table.
• The declaration of 2-dimensional array is
data_type array_name [row size][column size];
Eg: int a[5][5];
datatype specifies the data type of array elements. Type can be int, float, or
char.
• First index selects the row and the second index selects the column within
the row.
Two-dimensional arrays
• An element in 2-dimensional array is accessed by using the subscripts, i.e., row index and
column index of the array.
Initialization of two-dimensional arrays
Initializing arrays in the declaration
• An array is initialized by following its declaration with a list of values
enclosed in braces.
int matrix[2][3] = {{1,2,3},{4,5,6}};
Declaration followed by initialisation
• Array initialisation can also be done row by row using assignment statements:
int matrix[2][3];
matrix[0][0]=1; matrix[0][1]=2; matrix[0][2]=3;
matrix[1][0]=4; matrix[1][1]=5; matrix[1][2]=6;
Two dimensional Array
Memory representation of two-dimensional arrays
• A two-dimensional array is mapped to a one-dimensional array in order
to be stored in memory.
• Elements of the array are stored in contiguously increasing memory
locations, as a single list (e.g. at addresses 82206, 82208, 82210, …).
Memory Representation of Two-dimensional Arrays
Two-dimensional arrays are stored in memory in the following two ways:
• Row-major representation
• Column-major representation
Row Major ordering
• In row-major ordering, all the rows of the 2D array are stored in memory
contiguously: row 0 first, then row 1, and so on.
If the array is declared as a[m][n], where m is the number of rows and n is the
number of columns, then the address of an element a[i][j] of an array stored in
row-major order is calculated as:
Address(a[i][j]) = Base Address + (i * n + j) * sizeof(datatype)
Column Major ordering
• In column-major ordering, all the columns of the 2D array are stored in
memory contiguously: column 0 first, then column 1, and so on.
If the array is declared as a[m][n], where m is the number of rows and n is the
number of columns, then the address of an element a[i][j] of an array stored in
column-major order is calculated as:
Address(a[i][j]) = Base Address + ((j * m) + i) * sizeof(datatype)
Example (m = 3 rows, 2-byte elements, base address 8206):
Address of a[2][1] = 8206 + (1 * 3 + 2) * 2 = 8206 + 10 = 8216
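The two address formulas can be checked with a small Python sketch; the base address and element size below are the illustrative values used in the example, not real machine addresses:

```python
def row_major_address(base, i, j, n_cols, elem_size):
    """Address of a[i][j] when rows are stored contiguously."""
    return base + (i * n_cols + j) * elem_size

def col_major_address(base, i, j, n_rows, elem_size):
    """Address of a[i][j] when columns are stored contiguously."""
    return base + (j * n_rows + i) * elem_size

# Worked example from the notes: 3x3 array of 2-byte ints at base address 8206
print(col_major_address(8206, 2, 1, 3, 2))  # a[2][1] -> 8216
```

The same element generally lands at different addresses under the two orderings, which is why the layout convention matters when passing arrays between languages.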
Multi-dimensional arrays
• Multidimensional arrays are derived from the basic or built-in data types.
• Arrays where the elements are referenced by more than one subscripts are
known as multidimensional arrays.
• The simplest form of the multidimensional array is the two-dimensional
array.
• 2D-array has two subscripts, three-dimensional array has three subscripts
and so on.
Multi-dimensional Arrays Declaration and Initialisation
The general form of a multidimensional array declaration:
datatype name[size1][size2]...[sizeN];
The following declaration creates a three-dimensional 3 × 3 × 3 integer
array:
int arr[3][3][3];
Memory Representation of Multidimensional Arrays
The first dimension represents the row, the second represents the column, and the third represents the depth.
Memory Representation of Multidimensional Arrays
• Row-major representation: the array is stored in memory row by row.
The first row is followed by the second row in memory, and so on.
  Location[i,j] = Base address(A) + ((i * colSize) + j) * sizeof(datatype)
• Column-major representation: the array is stored in memory column by
column. The first column of the array is followed by the second column, and so on.
  Location[i,j] = Base address(A) + (i + (j * rowSize)) * sizeof(datatype)
In row-major order the elements are laid out contiguously (e.g. at addresses
82206, 82208, 82210, …) as:
A[0][0] A[0][1] A[0][2] A[1][0] A[1][1] … A[3][2] A[3][3]
Disadvantages of Arrays
1. The number of elements to be stored in the array must be known in advance.
2. An array is a static structure: it is of fixed size. The memory which is
allocated to the array cannot be increased or reduced.
3. Since an array is of fixed size, if we allocate more memory than required,
the memory space is wasted; if we allocate less memory than required,
an error occurs.
4. The elements of an array are stored in consecutive memory locations, so
insertions and deletions are difficult and time-consuming.
Sparse Matrix
• Matrices with a relatively high proportion of zero entries are called sparse
matrices.
• A matrix is sparse when its number of zero elements is greater than (m*n)/2,
where m × n is the order of the matrix.
Sparse Matrix
• A sparse matrix is a matrix having a relatively small number of nonzero
elements.
• 2 types of n-square sparse matrices
• First matrix, where all the entries above the main diagonal are zero and all
non zero entries can occur only on or below the main diagonal is called a
lower triangular matrix.
• The second matrix ,where non zero entries can occur on the diagonal or
on the elements immediately above or below the diagonal is called a
tridiagonal matrix.
Sparse matrix storage
• Representation of sparse matrices should store only
nonzero elements to save space.
• The non zero elements are represented by their row and
column number.
• Each element is characterized by <row, col, value>.
The sparse matrix
• The sparse matrix is represented by a two-dimensional array.
• Each element is characterized by <row, col, value>.
• The first row stores the number of rows, the number of columns, and the
total number of non-zero values.
• The remaining triples are listed in ascending row order and, within a row,
ascending column order.
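The triple representation can be built from an ordinary dense matrix with a short Python sketch (the header-triple convention described above is assumed):

```python
def to_sparse(dense):
    """Convert a dense matrix (list of lists) to the triple representation.

    The first triple is the header <rows, cols, nonzero count>; the rest
    are <row, col, value> entries in ascending (row, col) order.
    """
    rows, cols = len(dense), len(dense[0])
    body = [(i, j, dense[i][j])
            for i in range(rows) for j in range(cols)
            if dense[i][j] != 0]        # keep only the non-zero entries
    return [(rows, cols, len(body))] + body
```

Scanning row by row, column by column, produces the triples in the required ascending order with no extra sorting.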
SPARSE MATRIX -ABSTRACT DATA TYPE
Data :
A set of triples, <row, column, value>, where row and column are integers
and value comes from the set item.
Operations :
• Creation of sparse matrix
• Transpose of sparse matrix
• Addition of sparse matrix
• Multiplication of sparse matrix
Transposing a sparse matrix
To transpose a sparse matrix,
• For each row i, take element (i,j) value and store it in element (j,i) value of the
transpose
• Or
• For all elements in column j, place element (i,j) value in element (j,i) value.
In the triple representation, the header row <rows, columns, nonzero terms>
swaps its row and column counts, and the transposed triples are re-ordered
into ascending row, column order.
Transpose a sparse matrix
• For all elements in each row i
• Element <i, j, value> is stored in element <j, i, value> of the transpose matrix.
• (0, 0, 15) ====> (0, 0, 15)
(0, 3, 22) ====> (3, 0, 22)
(0, 5, -15) ====> (5, 0, -15)
(1, 1, 11) ====> (1, 1, 11)
OR
• For all elements in column j,
• place element <i, j, value> in element <j, i, value> of the transpose.
(1, 2, 3) ====> (2 ,1,3)
(2, 3,-6) ====> (3, 2,-6)
(4,0,91) ====> (0,4,91)
(5,2,28) ====> (2,5,28)
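A minimal Python sketch of the triple-list transpose described above, assuming the header-triple convention <rows, cols, nonzero count>:

```python
def sparse_transpose(triples):
    """Transpose a sparse matrix stored as a list of triples.

    triples[0] is the header <rows, cols, nonzero count>; the rest are
    <row, col, value> entries in ascending (row, col) order.
    """
    rows, cols, count = triples[0]
    # swap row/col in every entry, then restore ascending (row, col) order
    body = sorted((j, i, v) for (i, j, v) in triples[1:])
    return [(cols, rows, count)] + body

m = [(6, 6, 4), (0, 0, 15), (0, 3, 22), (1, 1, 11), (4, 0, 91)]
print(sparse_transpose(m))  # (0,3,22) -> (3,0,22), (4,0,91) -> (0,4,91), etc.
```

The sort is what re-establishes the ascending row, column ordering after the swap; a faster linear-time version would instead walk the columns of the original in order.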