1 Intro
CO 1 3 2 2 2 3 - - - 2 2 2 3
CO 2 3 2 2 2 3 - - - 2 2 2 3
CO 3 3 2 2 2 3 - - - 2 2 2 3
CO 4 3 2 2 2 3 - - - 2 2 2 3
UNIT – I
Overview of data structure, Basics of Algorithm Analysis including Running Time Calculations, Abstract Data Types, Arrays, Arrays and Pointers,
Multidimensional Array, String processing, General Lists and List ADT, List manipulations, Single, double and circular lists. Stacks and Stack ADT,
Stack Manipulation, Prefix, infix and postfix expressions, recursion. Queues and Queue ADT, Queue manipulation.
UNIT – II
Sparse Matrix Representation (Array and Linked List representation) and arithmetic (addition, subtraction and multiplication), polynomials and
polynomial arithmetic.
Trees, Properties of Trees, Binary trees, Binary Tree traversal, Tree manipulation algorithms, Expression trees and their usage, binary search trees,
AVL Trees, Heaps and their implementation, Priority Queues, B-Trees, B* Tree, B+ Tree
UNIT – III
Sorting concept, order, stability, Selection sorts (straight, heap), insertion sort (Straight Insertion, Shell sort), Exchange Sort (Bubble, quicksort),
Merge sort (External Sorting) (Natural merge, balanced merge and polyphase merge). Searching – List search, sequential search, binary search,
hashing methods, collision resolution in hashing.
UNIT – IV
Disjoint sets representation, union find algorithm, Graphs, Graph representation, Graph Traversals and their implementations (BFS and DFS).
Minimum Spanning Tree algorithms, Shortest Path Algorithms
Textbook(s):
1. Richard Gilberg, Behrouz A. Forouzan, “Data Structures: A Pseudocode Approach with C”, 2nd Edition, Cengage Learning, Oct 2004
2. E. Horowitz, S. Sahni, S. Anderson-Freed, "Fundamentals of Data Structures in C", 2nd Edition, Silicon Press (US), 2007.
References:
1. Mark Allen Weiss, “Data Structures and Algorithm Analysis in C”, 2nd Edition, Pearson, September, 1996
2. Robert Kruse, “Data Structures and Program Design in C”, 2nd Edition, Pearson, November, 1990
3. Seymour Lipschutz, “Data Structures with C (Schaum's Outline Series)”, McGrawhill, 2017
4. A. M. Tenenbaum, “Data structures using C”. Pearson Education, India, 1st Edition 2003.
5. Weiss M.A., “Data structures and algorithm analysis in C++”, Pearson Education, 2014.
Data Structure
• DATA
• Dictionary – sorted list of words
• Map – 2D plane, position, direction
• Cash book – Tabular, cash-in cash-out
DATA STRUCTURING
• Organizing DATA to solve the problem
• Establish some level of order
• Arrange things to get them quickly
• The complexity of an algorithm f(n) gives the running time and / or storage space
required by the algorithm in terms of n as the size of input data.
Time Complexity
• Time Complexity of an algorithm represents the amount of time
required by the algorithm to run to completion.
• Time requirements can be defined as a numerical function T(n),
where T(n) can be measured as the number of steps, provided each
step consumes constant time.
Space Complexity
• Space complexity represents the amount of memory space required
by the algorithm in its life cycle
• Instruction Space : space required to store the executable version of
the program (number of lines of code)
• Data Space : space required to store constants and variables
• Environment Space : space required to store the environment
information needed to resume the suspended function
Algorithm Types
• Backtracking algorithms
• Branch and bound algorithms
• Brute force algorithms
• Divide and conquer algorithms
• Dynamic programming algorithms
• Greedy algorithms
• Randomized algorithms
• Simple recursive algorithms
Greedy algorithms
• In the greedy approach, decisions are made from the
given solution domain.
• At each step, the choice that seems closest to the optimum
solution is chosen.
• Greedy algorithms try to find a locally optimal solution,
which may eventually lead to a globally optimal solution.
• But in general, greedy algorithms do not guarantee globally
optimal solutions.
• Counting Coins
Greedy algorithms
• For a currency system
• where we have coins of value 1, 7 and 10,
• counting coins for value 18 is optimal (10, 7, 1)
• For a value like 15, the greedy approach may use more coins than necessary.
• For example, the greedy approach will use
• 10 + 1 + 1 + 1 + 1 + 1, a total of 6 coins,
• whereas the same problem could be solved using only 3 coins (7 +
7 + 1)
• Hence, we may conclude that the greedy approach picks the immediately
best option and may fail where global optimization is the major
concern.
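The coin example can be sketched as follows. This is a minimal illustration, not part of the original slides: the function names `greedy_coins` and `optimal_coin_count` are hypothetical, and the dynamic-programming routine is included only to compare the greedy answer with the true minimum.

```python
def greedy_coins(coins, amount):
    """Greedy choice: repeatedly take the largest coin that still fits."""
    used = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            used.append(coin)
    return used


def optimal_coin_count(coins, amount):
    """Minimum number of coins, computed by dynamic programming for comparison."""
    inf = float("inf")
    best = [0] + [inf] * amount  # best[v] = fewest coins that sum to v
    for value in range(1, amount + 1):
        for coin in coins:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return best[amount]


print(greedy_coins([1, 7, 10], 18))        # [10, 7, 1]
print(greedy_coins([1, 7, 10], 15))        # [10, 1, 1, 1, 1, 1]
print(optimal_coin_count([1, 7, 10], 15))  # 3
```

For 18 the greedy result is already optimal; for 15 it uses 6 coins where 3 suffice.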
Examples
• Many networking algorithms use the greedy approach. Here is a list
of a few problems and algorithms where it is applied:
• Travelling Salesman Problem
• Prim's Minimal Spanning Tree Algorithm
• Kruskal's Minimal Spanning Tree Algorithm
• Dijkstra's Shortest Path Algorithm
• Graph - Map Coloring
• Graph - Vertex Cover
• Knapsack Problem
• Job Scheduling Problem
Divide-and-Conquer approach
• In the divide-and-conquer approach, the problem at hand is divided
into smaller sub-problems, and then each sub-problem is solved
independently.
• Keep dividing the sub-problems into even smaller sub-problems until
no further division is possible (the "atomic" level)
• The solutions of all sub-problems are finally merged in order to obtain
the solution of the original problem.
Divide-and-Conquer approach
• Broadly, the divide-and-conquer approach is a three-step process:
• Divide/Break
• Conquer/Solve
• Merge/Combine
• This algorithmic approach works recursively, and the conquer and merge
steps work so closely together that they appear as one.
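Merge sort is one common instance of the three steps. The sketch below is an illustration chosen here, not taken from the slides, and `merge_sort` is a hypothetical name.

```python
def merge_sort(items):
    """Return a new sorted list using divide, conquer and merge."""
    if len(items) <= 1:              # atomic sub-problem: nothing left to divide
        return list(items)
    mid = len(items) // 2            # divide
    left = merge_sort(items[:mid])   # conquer the left half
    right = merge_sort(items[mid:])  # conquer the right half
    merged = []                      # merge the two sorted halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([3, 4, 6, 8, 9, 7, 2, 5, 1]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```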
Examples
f(n) = log n
Nested Loops
• When loops contain loops, determine how many iterations
each loop completes
• The total number of iterations is the product of the number of iterations of
the inner loop and the number of iterations of the outer loop
f(n) = ???
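The loop that this question refers to is not preserved in the text, so the sketch below assumes the simplest case: an outer and an inner loop that each run n times. Under that assumption the body executes n × n times, so f(n) = n². The function name is hypothetical.

```python
def count_nested_iterations(n):
    """Count how often the innermost statement runs for two n-step loops."""
    count = 0
    for i in range(n):        # outer loop: n iterations
        for j in range(n):    # inner loop: n iterations per outer iteration
            count += 1
    return count


print(count_nested_iterations(10))  # 100
```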
SORTING
• Input: 3, 4, 6, 8, 9, 7, 2, 5, 1
• Output: 1, 2, 3, 4, 5, 6, 7, 8, 9
• How to sort them? ALGORITHM
• First sorting technique – INSERTION SORT
• Playing cards
Insertion Sorting Example
• 3, 4, 6, 8, 9, 7, 2, 5, 1
• 3, 4, 6, 8, 9, 7, 2, 5, 1; Key=4
• 3, 4, 6, 8, 9, 7, 2, 5, 1; Key=6
• 3, 4, 6, 8, 9, 7, 2, 5, 1; Key=8
• 3, 4, 6, 8, 9, 7, 2, 5, 1; Key=9
• 3, 4, 6, 8, 9, 7, 2, 5, 1; Key=7
• 3, 4, 6, 8, 7, 9, 2, 5, 1; Key=7
• 3, 4, 6, 7, 8, 9, 2, 5, 1; Key=7 … so on
• 1, 2, 3, 4, 5, 6, 7, 8, 9
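A minimal sketch of insertion sort that reproduces the trace above. It uses 0-based indices (the slides count from 1), and `insertion_sort` is a hypothetical name.

```python
def insertion_sort(a):
    """Return a sorted copy of a, inserting one key at a time."""
    a = list(a)
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift elements larger than key one place to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a


print(insertion_sort([3, 4, 6, 8, 9, 7, 2, 5, 1]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```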
Analysis of Insertion Sort
Steps                                                      Cost   Times
for j ← 2 to n                                             C1     n (roughly)
  Key ← A[j]                                               C2     n – 1
  Insert A[j] into the sorted sequence A[1 .. j–1]
  i ← j – 1                                                C3     n – 1
  while i > 0 and A[i] > Key                               C4     $\sum_{j=2}^{n} t_j$
    do A[i + 1] ← A[i]                                     C5     $\sum_{j=2}^{n} (t_j - 1)$
       i ← i – 1                                           C6     $\sum_{j=2}^{n} (t_j - 1)$
  A[i + 1] ← Key                                           C7     n – 1
Example array: 3, 4, 6, 8, 9, 7, 2, 5, 1; for Key = 7 the while test first compares 9 with 7.
Insertion Sorting
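• tj is the number of times the while-loop test is executed for the jth element
• it accounts for shifting elements to the right while inserting the jth card
• Inserting 7 shifts TWO cards
• Total time:
• $T(n) = n(C_1 + C_2 + C_3 + C_7) + (C_4 + C_5 + C_6)\sum_{j=2}^{n} t_j - (n - 1)(C_5 + C_6) - (C_2 + C_3 + C_7)$
• How does tj affect the running time?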
Insertion Sorting
• Best case: elements already sorted
• tj = 1; each key is compared only with the last element of the sorted part
• Running time is linear in n (low)
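To see how tj drives the running time, the sketch below counts every evaluation of the while-loop test, which is the sum of tj over all keys. The counter and the name `count_while_tests` are additions for illustration only.

```python
def count_while_tests(a):
    """Run insertion sort on a copy of a and count while-test evaluations."""
    a = list(a)
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1  # one evaluation of the loop test
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return tests


print(count_while_tests([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # 8  (already sorted: n - 1)
print(count_while_tests([9, 8, 7, 6, 5, 4, 3, 2, 1]))  # 44 (reversed: n(n + 1)/2 - 1)
```

With sorted input every tj is 1, so the count grows linearly; with reversed input tj = j and the count grows quadratically.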
• Asymptotic analysis refers to
• defining mathematical bounds on an algorithm's
run-time performance
• The analysis is input bound, i.e., if there is no input to the algorithm, it is
concluded to work in constant time
• Other than the "input", all other factors are
considered constant
Asymptotic analysis
• GOAL: to simplify the analysis of running time by getting rid
of “details” which may be affected by the specific
implementation and hardware
• For instance, 10000001 ≈ 10000000
• And $f(n) = 3n^2 \approx n^2$
• the constant 3 usually depends on the computer system (hardware etc.)
• $f(n) = \Theta(g(n))$
• if there exist constants $c_1, c_2 > 0$ and $n_0$
• such that
• $c_1\, g(n) \le f(n) \le c_2\, g(n)$ for all $n > n_0$
constant − $O(1)$
logarithmic − $O(\log n)$
linear − $O(n)$
n log n − $O(n \log n)$
quadratic − $O(n^2)$
cubic − $O(n^3)$
polynomial − $n^{O(1)}$
exponential − $2^{O(n)}$
Examples
• $f(n) = 50\, n \log n$ is $O(n \log n)$
• $f(n) = 7n - 3$ is $O(n)$
• $f(n) = 8n^2 \log n + 5n^2 + n$ is $O(n^2 \log n)$
• $f(n) = n(n+1)/2$ is O( ? )
• $f(n) = a_k n^k + a_{k-1} n^{k-1} + a_{k-2} n^{k-2} + \dots + a_2 n^2 + a_1 n + a_0$ is O( ? )
Example
• Show that $f(x) = 4x^2 - 5x + 3$ is $O(x^2)$
$|f(x)| = |4x^2 - 5x + 3| \le |4x^2| + |-5x| + |3|$
$\le 4x^2 + 5x + 3$, for all $x > 0$
$\le 4x^2 + 5x^2 + 3x^2$, for all $x > 1$
$\le 12x^2$, for all $x > 1$
Hence we conclude that $f(x)$ is $O(x^2)$
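The bound can be spot-checked numerically. This is only a sanity check over a finite range, not a proof; the constant 12 and the threshold x > 1 come from the derivation above, while the helper name `f` and the tested range are arbitrary choices.

```python
def f(x):
    """The function from the example: 4x^2 - 5x + 3."""
    return 4 * x * x - 5 * x + 3


# Check |f(x)| <= 12 * x^2 for a sample of integers x > 1.
holds = all(abs(f(x)) <= 12 * x * x for x in range(2, 10001))
print(holds)  # True
```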
Example
• By definition, f(n) is O( g(n) ) if:
there exist constants c > 0 and n0 such that for all n > n0:
f(n) <= c * g(n)
faster than $n^b$
• Rule 3: Any polynomial grows slower than any
exponential, e.g. $n^6$ and $2^n$
• Rule 4: Any polylogarithm grows slower than
any polynomial; $\log n$ indeed grows slower than $n$
• Rule 5: Smaller terms can be omitted; in $8n^2 + 6n + 4$, both $6n$ and $4$
grow slower than $8n^2$
Data Type
• DATA TYPE
• set of values
• a set of operations on values
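• Company XYZ
• Linear : LIST → table of names and positions
• Non-Linear : TREE → hierarchy of the same positions
NAME   POSITION
A      Manager
C      VP
G      Employee
H      Employee
J      VP
[Figure: tree diagram of the company; only the node labels P, C, J and A survive]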
Data Representation
• Physical representation
• How data is actually organized in the memory cells
• Logical representation
• How we think of the data as being stored in the computer
Abstract view
• Storing data using LIST or POINTERS
• But provide same functionality
• Although they are implemented differently
• The ABSTRACT VIEW of the list is distinct from any
particular computer implementation
• Basic things should be available
• Functionality must be there
• Don’t worry about the implementation
ABSTRACT DATA TYPE
• DATA STRUCTURE is a programming construct used to
implement abstract data types i.e.
• Two disadvantages:
❑ The computer requires a sequential list, so we can’t leave
blanks in between
❑ An array has a fixed size; it can be increased only if adjacent
free memory cells are available
Ordered List: Pointers
Implementation
• Link groups of memory cells together using pointers
• Each memory cell is called a NODE
• Node → [ DATA | Pointer to next item ]
• A linked list is more flexible, but with higher complexity due to indirection
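A minimal sketch of a node with a DATA field and a pointer to the next item. Python object references stand in for memory addresses here, and the names `Node`, `build_list` and `traverse` are hypothetical.

```python
class Node:
    """One cell of the list: a data field plus a reference to the next node."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node  # None plays the role of the NULL pointer


def build_list(values):
    """Link the values together and return the head node."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def traverse(head):
    """Follow the next pointers and collect the data fields in order."""
    out = []
    node = head
    while node is not None:
        out.append(node.data)
        node = node.next
    return out


print(traverse(build_list([5, 7, 8, 9])))  # [5, 7, 8, 9]
```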
Pointers Implementation
Address (data cell, pointer cell)   DATA   Pointer to next item
20001, 20002                        5      20010
20010, 20011                        7      30076
30076, 30077                        8      30100
30100                               9      NULL
Traversing Linear Arrays
1. Repeat For i = LB to UB
2. Apply PROCESS to A[i]
[End of For Loop]
3. Exit
Inserting
1. Set i = N [Initialize counter]
2. Repeat While (i >= LOC)
3. Set A[i+1] = A[i] [Move elements downward]
4. Set i = i – 1 [Decrease counter by 1]
[End of While Loop]
5. Set A[LOC] = ITEM [Insert element]
6. Set N = N + 1 [Reset N]
7. Exit
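The insertion steps can be sketched as follows. The sketch assumes 0-based indices (the pseudocode counts from 1) and a list with at least one spare slot at the end; `insert_at` is a hypothetical name.

```python
def insert_at(a, n, loc, item):
    """Insert item at index loc of a, which holds n used slots; return the new n."""
    i = n - 1
    while i >= loc:        # move elements one position toward the end
        a[i + 1] = a[i]
        i -= 1
    a[loc] = item
    return n + 1


cells = [3, 4, 6, 8, None]   # four used slots, one spare
n = insert_at(cells, 4, 2, 5)
print(cells, n)  # [3, 4, 5, 6, 8] 5
```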
Deleting
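1. Set ITEM = A[LOC] [Assign the element to be deleted to ITEM]
2. Repeat For i = LOC to N – 1
3. Set A[i] = A[i+1] [Move elements upward]
[End of For Loop]
4. Set N = N – 1 [Reset N]
5. Exit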
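A sketch of deletion, assuming the usual approach of shifting every later element one position toward the gap. Indices are 0-based, the freed slot is set to None for clarity, and `delete_at` is a hypothetical name.

```python
def delete_at(a, n, loc):
    """Remove the element at index loc from a with n used slots; return (item, new n)."""
    item = a[loc]
    for i in range(loc, n - 1):   # close the gap left by the deleted element
        a[i] = a[i + 1]
    a[n - 1] = None               # the last used slot is now free
    return item, n - 1


cells = [3, 4, 5, 6, 8]
item, n = delete_at(cells, 5, 2)
print(item, cells, n)  # 5 [3, 4, 6, 8, None] 4
```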
Searching: Linear Search
1. Repeat For j = 1 to N
2. If (ITEM == A[j]) Then
3. Print: ITEM found at location j
4. Return
[End of If]
[End of For Loop]
5. If (j > N) Then
6. Print: ITEM doesn’t exist
[End of If]
7. Exit
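The same search as a short sketch. It returns the 0-based position instead of printing, and -1 when the item does not exist; both conventions are choices made here, and `linear_search` is a hypothetical name.

```python
def linear_search(a, item):
    """Return the index of the first occurrence of item in a, or -1 if absent."""
    for j, value in enumerate(a):
        if value == item:
            return j
    return -1


data = [3, 4, 6, 8, 9, 7, 2, 5, 1]
print(linear_search(data, 7))   # 5
print(linear_search(data, 10))  # -1
```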
Searching: Binary Search
1. Set BEG = 1 and END = N
2. Set MID = (BEG + END) / 2
13. Exit
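Only the first two steps and the final step of the pseudocode survive in the text, so the sketch below is a standard iterative binary search rather than a reconstruction of the missing steps. It assumes a sorted list and 0-based indices; the names BEG, END and MID follow the pseudocode.

```python
def binary_search(a, item):
    """Return the index of item in the sorted list a, or -1 if absent."""
    beg, end = 0, len(a) - 1
    while beg <= end:
        mid = (beg + end) // 2
        if a[mid] == item:
            return mid
        if a[mid] < item:
            beg = mid + 1     # item can only be in the right half
        else:
            end = mid - 1     # item can only be in the left half
    return -1


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(binary_search(data, 7))   # 6
print(binary_search(data, 10))  # -1
```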
Two-Dimensional Arrays
            Columns
            1         2         3         4
Rows   1    A[1, 1]   A[1, 2]   A[1, 3]   A[1, 4]
       2    A[2, 1]   A[2, 2]   A[2, 3]   A[2, 4]
       3    A[3, 1]   A[3, 2]   A[3, 3]   A[3, 4]
• Length
o first dimension: 5 – 2 + 1 = 4
o second dimension: 5 – (-1) + 1 = 7
Memory representation
• In computing
• row-major order and column-major order are methods for storing
multidimensional arrays in linear storage such as random-access
memory
• Array Can be stored as
• Column – major order: column by column
• Row – major order: row by row
Row major (3 × 4 array, row by row)
Row 1: (1, 1) (1, 2) (1, 3) (1, 4)
Row 2: (2, 1) (2, 2) (2, 3) (2, 4)
Row 3: (3, 1) (3, 2) (3, 3) (3, 4)
Column major (3 × 4 array, column by column)
Column 1: (1, 1) (2, 1) (3, 1)
Column 2: (1, 2) (2, 2) (3, 2)
Column 3: (1, 3) (2, 3) (3, 3)
Column 4: (1, 4) (2, 4) (3, 4)
Row-Major
[Figure: two-dimensional array a[i, j] with row bounds LB1 to UB1 and column bounds LB2 to UB2]
• Ei = Ki – Lower bound
Two-Dimensional Array
• Array size m × n
[Figure: m × n array with element a[j, k]]
3-Dimensional Array
• B(2, 4, 3) contains
2 · 4 · 3 = 24 elements
• Three layers,
where each layer
has 2 × 4 elements
[Figure: m × n array from (1, 1) or (LB1, LB2) up to (UB1, UB2), with element a[j, k]]
• Li = UBi – LBi + 1
• Ei = Ki – Lower bound
Three-Dimensional Array
• Length Li and effective index Ei of the ith dimension
• Li = UBi – LBi + 1
• Ei = Ki – Lower bound
• Column major address
• 2D: $\mathrm{LOC}(A[j, k]) = \mathrm{Base}(A) + w\,[L_1 E_2 + E_1]$
• 3D: $\mathrm{LOC}(B[K_1, K_2, K_3]) = \mathrm{Base}(B) + w\,[(E_3 L_2 + E_2) L_1 + E_1]$
• Row major address
• 2D: $\mathrm{LOC}(A[j, k]) = \mathrm{Base}(A) + w\,[L_2 E_1 + E_2]$
• 3D: $\mathrm{LOC}(B[K_1, K_2, K_3]) = \mathrm{Base}(B) + w\,[(E_1 L_2 + E_2) L_3 + E_3]$
General multidimensional Arrays
• Base(C) the address of the first element
• Column major address
$\mathrm{LOC}(C[K_1, K_2, \dots, K_N]) = \mathrm{Base}(C) + w\,\bigl[\bigl(\cdots\bigl((E_N L_{N-1} + E_{N-1}) L_{N-2} + E_{N-2}\bigr)\cdots + E_2\bigr) L_1 + E_1\bigr]$
• B stored in row major: $\mathrm{LOC}(B[K_1, K_2, K_3]) = \mathrm{Base}(B) + w\,[(E_1 L_2 + E_2) L_3 + E_3]$
$= 300 + 4 \times [(2 \times 11 + 8) \times 16 + 13] = 300 + 4 \times 493 = 2272$
• B stored in column major: $\mathrm{LOC}(B[K_1, K_2, K_3]) = \mathrm{Base}(B) + w\,[(E_3 L_2 + E_2) L_1 + E_1]$
$= 300 + 4 \times [(13 \times 11 + 8) \times 8 + 2] = 300 + 4 \times 1210 = 5140$
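The two address formulas can be sketched as below. The values Base = 300, w = 4, effective indices E = (2, 8, 13) and lengths L = (8, 11, 16) are read off the worked numbers above; the function names are hypothetical.

```python
def row_major_offset(E, L):
    """Offset ((E1*L2 + E2)*L3 + E3) ..., with the last index varying fastest."""
    offset = 0
    for e, length in zip(E, L):
        offset = offset * length + e
    return offset


def column_major_offset(E, L):
    """Offset ((E3*L2 + E2)*L1 + E1) ..., with the first index varying fastest."""
    offset = 0
    for e, length in zip(reversed(E), reversed(L)):
        offset = offset * length + e
    return offset


def address(base, w, offset):
    """Base address plus w memory cells per element."""
    return base + w * offset


E = (2, 8, 13)    # effective indices E1, E2, E3
L = (8, 11, 16)   # lengths L1, L2, L3
print(address(300, 4, row_major_offset(E, L)))     # 2272
print(address(300, 4, column_major_offset(E, L)))  # 5140
```

Both functions use the same nested (Horner-style) evaluation; only the order in which the dimensions are visited differs.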
Sparse Matrix
• Matrix that has many elements with a value zero
0 2 0 0 3 0
0 0 6 0 0 0
0 9 0 2 0 0
0 4 0 0 0 4
Sparse Matrix: Arrays
• Matrix that has many elements with a value zero
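Matrix (4 × 6):
0 2 0 0 3 0
0 0 6 0 0 0
0 9 0 2 0 0
0 4 0 0 0 4
Array representation (first row: number of rows, number of columns, number of non-zero values):
Row   Column   Value
4     6        7
0     1        2
0     4        3
1     2        6
2     1        9
2     3        2
3     1        4
3     5        4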
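A minimal sketch that builds the array (triplet) representation from the dense matrix. It assumes 0-based row and column indices, as in the example, and a first triplet holding the dimensions and the non-zero count; `to_triplets` is a hypothetical name.

```python
def to_triplets(matrix):
    """Return [(rows, cols, count), (row, col, value), ...] for the non-zero entries."""
    rows, cols = len(matrix), len(matrix[0])
    entries = [
        (r, c, value)
        for r, row in enumerate(matrix)
        for c, value in enumerate(row)
        if value != 0
    ]
    return [(rows, cols, len(entries))] + entries


sparse = [
    [0, 2, 0, 0, 3, 0],
    [0, 0, 6, 0, 0, 0],
    [0, 9, 0, 2, 0, 0],
    [0, 4, 0, 0, 0, 4],
]
for triplet in to_triplets(sparse):
    print(triplet)
```

Storing 8 triplets (24 numbers) happens to match the 24 cells of this small matrix; the saving grows as the matrix gets larger and sparser.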