DS Unit 1
Syllabus Unit 1
Chapter 1: Introduction and Overview: Definition, elementary data organization,
data structures, data structure operations, abstract data types, algorithm
complexity, time-space trade-off.
1.1 Definition
1.2 Elementary data organization
1.3 Data Structures
1.4 Data Structures operations
1.5 Abstract data types
1.6 Algorithm complexity
1.7 Time-space trade-off
Data structures refer to the systematic way of organizing, managing, and storing data
in a computer so that it can be used efficiently. A data structure defines the
relationship among data elements and the operations that can be performed on them.
In computing, data structures are fundamental as they enable the efficient handling
of data, ensuring that algorithms work optimally in terms of time and space. They are
used in various applications such as databases, operating systems, artificial intelligence,
and computer networks.
Data structures can be broadly classified into primitive and non-primitive types:
1. Primitive Data Structures: These are basic data types provided by the
programming language, such as:
○ Integer (int)
○ Character (char)
○ Floating-point numbers (float, double)
○ Boolean (true/false)
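2. Non-Primitive Data Structures: These are more complex structures derived from
primitive types and can be further categorized into linear structures (e.g., arrays,
linked lists, stacks, queues) and non-linear structures (e.g., trees, graphs).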
Thus, data structures form the foundation of computer science and software
development, providing a structured way to handle and manipulate data efficiently.
○ Data: Raw facts or values that do not have any meaningful interpretation.
Example: "John", "25", "New York".
○ Information: Processed and meaningful data. Example: "John is 25 years old
and lives in New York."
2. Data Elements and Fields:
○ Data Structure: Defines how data is arranged in memory (e.g., arrays, linked
lists).
○ Data Organization: Deals with the storage, retrieval, and management of data.
1. Linear Organization: Data is stored sequentially (e.g., arrays, linked lists).
2. Hierarchical Organization: Data is arranged in a tree-like structure (e.g., file
systems).
3. Network Organization: Data follows a complex structure with multiple links (e.g.,
graphs, databases).
4. Relational Organization: Data is stored in tabular format with relationships (e.g.,
relational databases).
Thus, elementary data organization is essential for managing data efficiently and plays
a significant role in computer science and programming.
A data structure is a way of organizing and storing data in a computer so that it can be
used efficiently. It defines the relationship between data elements and the operations
that can be performed on them. Different data structures are used to solve different
kinds of problems in computer science.
Thus, data structures form the backbone of efficient programming, enabling optimal
memory usage and fast algorithm execution.
Data structures support various operations that allow efficient manipulation and
retrieval of data. These operations are fundamental to implementing algorithms and
solving computational problems. The choice of a data structure significantly impacts
the efficiency of these operations.
Efficiency of Operations
The efficiency of these operations varies depending on the data structure used. For
example:
● Searching in an unsorted array takes O(n) time, while in a balanced Binary Search
Tree (BST) it takes O(log n) time; an unbalanced BST can degrade to O(n).
● Inserting an element at a known position in a linked list takes O(1) time, whereas
inserting into an array may require shifting up to n elements.
An abstract data type is a way of organising and using data without worrying about
how it is implemented internally. It defines what operations can be performed on the
data but hides the details of how those operations work.
Characteristics of ADT
A stack follows the LIFO (Last In, First Out) principle and supports operations such
as:
A stack can be implemented using arrays or linked lists, but the ADT only defines the
operations without specifying the implementation.
Importance of ADTs
● Separation of Concerns: Users focus on what the ADT does rather than how it
works.
● Reusability: ADTs allow different implementations while maintaining the same
interface.
● Modularity: ADTs help in designing modular and maintainable code.
Thus, Abstract Data Types provide a high-level way to define and work with data
structures without worrying about their internal details.
Abstract Data Type Model
1. Components:
○ Functions: Divided into public and private functions.
○ Data Structures: Includes arrays, linked lists, and records.
2. Encapsulation:
○ The ADT encapsulates data structures and operations within itself.
○ The application program can only interact with public functions.
3. Interface:
○ The application programming interface (API) provides controlled access
to ADT functions.
○ Only the operation names and parameters are exposed to the application.
4. Implementation Hiding:
○ The internal implementation of data structures is hidden from the user.
○ Different versions of a structure can coexist without affecting the user.
5. Usage of Data Structures:
○ Arrays and linked lists are commonly used to implement ADTs.
○ ADTs enable efficient data management and reusability in software
development.
1. Time Complexity – Measures the amount of time an algorithm takes to execute as a
function of input size (n).
2. Space Complexity – Measures the amount of memory required by an algorithm to
execute.
Time Complexity
Time complexity is expressed using Big-O Notation (O), which describes the upper
bound of an algorithm’s running time.
● O(n log n) – Log-linear time (e.g., Merge Sort; Quick Sort in the best and average
cases).
Space Complexity
1. Fixed Part – Independent of input size (e.g., program code, constants).
2. Variable Part – Depends on input size (e.g., dynamic memory allocations, function
call stack).
Example Space Complexities
● O(1) – Algorithms with a constant amount of extra space (e.g., swapping two
variables).
● O(n) – Algorithms that require additional space proportional to input size (e.g.,
storing an array).
● O(n²) – Storing an n × n 2D matrix.
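● Best Case: The minimum time required for execution (e.g., searching for an
element that appears at the first position). It is often stated as a lower bound using Ω.
● Worst Case: The maximum time required (e.g., searching for an element that
appears at the last position or is absent). It is usually stated as an upper bound using O.
● Average Case: The expected time over a random input distribution. It is often
stated as a tight bound using Θ.
Strictly speaking, O, Ω, and Θ are bounds on how a function grows, and any of them can
be applied to any of the three cases.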
1. Performance Analysis – Helps compare different algorithms for the same problem.
2. Resource Optimization – Ensures efficient use of CPU and memory.
3. Scalability – Helps determine how an algorithm behaves as the input size grows.
Thus, analyzing the complexity of algorithms is crucial for writing efficient programs
and selecting the best algorithm for a given problem.
For example:
● If memory is limited, use an algorithm that requires less space, even if it takes
more time.
● If speed is crucial, use an algorithm that runs faster, even if it consumes more
memory.
Definition
For any real number x, the floor and ceiling functions are defined as:
● Floor function ⌊x⌋: The largest integer that does not exceed x.
● Ceiling function ⌈x⌉: The smallest integer that is not less than x.
Properties
Examples
Definition
For an integer k and a positive integer M, the remainder function is written as:
k mod M
This gives the remainder when k is divided by M. It satisfies the equation:
k = Mq + r, where 0 ≤ r < M
Examples
Congruence Relation
Denoted as INT(x), it converts a real number into an integer by truncating the decimal
part.
Examples: INT(3.14) = 3, INT(√5) = 2, INT(-8.5) = -8, INT(7) = 7
Examples:
|-15| = 15, |7| = 7, |-3.33| = 3.33, |4.44| = 4.44, |-0.075| = 0.075
Summation Symbol
The sum of a sequence is written using the Greek letter sigma (Σ).
Examples:
Σ(i=1 to n) a_i = a1 + a2 + ... + an
Σ(j=2 to 5) j² = 2² + 3² + 4² + 5² = 4 + 9 + 16 + 25 = 54
Σ(j=1 to n) j = 1 + 2 + 3 + ... + n = (n(n+1))/2
Example:
1 + 2 + 3 + ... + 50 = (50(51))/2 = 1275
5. Factorial Function
Definition
n! = 1 × 2 × 3 × ... × (n-1) × n
It is defined that 0! = 1.
Examples
2! = 1 × 2 = 2, 3! = 1 × 2 × 3 = 6, 4! = 1 × 2 × 3 × 4 = 24
5! = 5 × 4! = 5 × 24 = 120, 6! = 6 × 5! = 6 × 120 = 720
6. Exponents and Logarithms
Exponent Rules
Examples:
2⁴ = 16, 2⁻⁴ = 1 / 2⁴ = 1 / 16, 125^(2/3) = 5² = 25
Logarithm Rules
Examples:
log₂ 8 = 3 (since 2³ = 8)
log₁₀ 100 = 2 (since 10² = 100)
Conclusion
These mathematical notations are essential in data structures and algorithms, helping
to analyze efficiency, memory usage, and computational complexity.
Definition of an Algorithm
Problem Statement:
Given an array DATA containing numerical values, find the largest element and its
position (LOC) in the array.
The flowchart visually represents the steps in Algorithm 2.1. It consists of:
C Program Implementation
#include <stdio.h>

int main(void) {
    int DATA[10] = {22, 65, 1, 99, 32, 17, 74, 49, 33, 2};
    int N = 10;
    int LOC = 0;        /* index of the largest element found so far */
    int MAX = DATA[0];  /* largest value found so far */
    int K;

    /* Compare each remaining element with the current maximum. */
    for (K = 1; K < N; K++) {
        if (MAX < DATA[K]) {
            LOC = K;
            MAX = DATA[K];
        }
    }
    printf("LOC = %d, MAX = %d\n", LOC, MAX);
    return 0;
}
Output:
LOC = 3, MAX = 99
Key Observations
● Array Indexing in C:
Example:
If multiple operations are written in the same step, they are executed from left to
right.
Control structures define the flow of execution in an algorithm or a program. There are
three primary types of flow control structures:
Example:
Step 1: Start
Step 2: Read A, B
Step 3: Compute C = A + B
Step 4: Print C
Step 5: Stop
This structure is commonly used in most basic programs where each instruction is
performed one after another.
Syntax:
If condition then:
Execute Module A
[End of If structure]
Syntax:
If condition then:
Execute Module A
Else:
Execute Module B
[End of If structure]
● Definition: When there are multiple conditions, different blocks of code execute
based on which condition is true.
Syntax:
If condition 1 then:
Execute Module A1
Else if condition 2 then:
Execute Module A2
Else if condition M then:
Execute Module AM
Else:
Execute Module B
[End of If structure]
Syntax:
● Explanation:
○ R: Initial value
○ S: End value
○ T: Increment step
○ Loop continues until K > S.
Syntax:
● Explanation:
○ The loop runs only when the condition is true.
○ If the condition is false initially, the loop does not execute.
○ The loop must have an update statement inside to change the condition over
time.
Problem Statement: Given an array DATA with N numerical values, find the location
LOC and the value MAX of the largest element.
Steps:
1. Initialize K = 1, LOC = 1, MAX = DATA[1].
2. Loop: Repeat Steps 3 and 4 while K ≤ N.
3. Condition: If MAX < DATA[K], update LOC = K and MAX = DATA[K].
4. Increment: Set K = K + 1.
5. Output: Write LOC, MAX.
6. Exit.
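Output (for the array DATA in the C program above, which uses 0-based indexing):
LOC = 3, MAX = 99
With the 1-based indexing used in these steps, the same element is at LOC = 4.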
1. Read: A, B, C.
2. Compute D = B² - 4AC.
3. Check D:
○ If D > 0, compute two real roots.
○ If D = 0, compute one unique root.
○ If D < 0, print "No real solutions".
4. Exit.
Output Example:
Input: 3 3 1
Output: No real solutions (D = 3² - 4(3)(1) = -3 < 0)
Conclusion
Introduction
Consider an English short story TEXT, where we need to find the first occurrence of a
3-letter word W:
● If W = "the", it likely appears early in the text, leading to a small value of f(n).
● If W = "zoo", it might not appear at all, leading to a large f(n).
This example illustrates that the running time of an algorithm depends not only on the
input size n but also on the specific data.
1. Worst Case Complexity: Maximum value of f(n) for any possible input.
2. Average Case Complexity: Expected value of f(n) over all possible inputs.
3. Best Case Complexity: Minimum possible value of f(n).
For average case complexity, we assume a probabilistic distribution where each input is
equally likely. The expectation E of the running time is calculated as:
E = n₁p₁ + n₂p₂ + ... + nₖpₖ
where n₁, n₂, ..., nₖ are the possible numbers of operations, and p₁, p₂, ..., pₖ are
their respective probabilities.
Problem Statement
Given a linear array DATA of size n, we need to find the position LOC of a given ITEM
in the array. If ITEM is not found, LOC = 0.
Algorithm
1. Initialize K = 1 and LOC = 0.
2. Repeat while K ≤ n and LOC = 0:
○ If ITEM = DATA[K], set LOC = K.
○ Otherwise, increment K.
3. If LOC = 0, print "ITEM not in array".
4. Otherwise, print "LOC is the location of ITEM".
C Implementation
#include <stdio.h>

int main(void) {
    int DATA[10] = {22, 65, 1, 99, 32, 17, 74, 49, 33, 2};
    int ITEM = 17, N = 10;
    int LOC = -1;   /* -1 means "not found"; C array indices start at 0 */
    int K = 0;

    /* Stop as soon as ITEM is found or the array is exhausted. */
    while (LOC == -1 && K < N) {
        if (ITEM == DATA[K])
            LOC = K;
        K++;
    }
    if (LOC == -1)
        printf("ITEM is not in the array DATA\n");
    else
        printf("%d is the location of ITEM\n", LOC);
    return 0;
}
Complexity Analysis
● Worst Case: If ITEM is the last element or not in the array, C(n) = n.
● Average Case: If ITEM appears in the array and is equally likely to be at any of the
n positions, the expected number of comparisons is:
C(n) = 1·(1/n) + 2·(1/n) + ... + n·(1/n) = (n + 1)/2
Definition
For an algorithm M, the function f(n) increases with input size n. To analyze how f(n)
grows, we compare it with standard functions:
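Big O Notation
If there exist a positive integer n₀ and a positive constant M such that
|f(n)| ≤ M · |g(n)| for all n > n₀,
then we write:
f(n) = O(g(n))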
Definition
The Omega notation (Ω) defines a lower bound for a function f(n). It is used to
describe the best-case complexity or the minimum time required by an algorithm.
Mathematically, we say f(n) = Ω(g(n)) if there exist positive constants c and n₀ such
that f(n) ≥ c · g(n) for all n ≥ n₀.
Similarly, consider:
This means that the function g(n) = n² is a lower bound for f(n).
Choosing the Correct Bound
However, we always select the largest possible function g(n) that satisfies the
condition. Thus, in this case, Ω(n) is the correct choice.
Definition
The Theta notation (Θ) is used when f(n) is bounded from both above and below by the
same function g(n). It gives an exact asymptotic behavior of f(n).
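Mathematically, f(n) = Θ(g(n)) if there exist positive constants c₁, c₂, and n₀ such that:
c₁ · g(n) ≤ f(n) ≤ c₂ · g(n) for all n ≥ n₀
Given:
f(n) = 18n + 9
Thus, it satisfies:
18n ≤ 18n + 9 ≤ 27n for all n ≥ 1
so f(n) = Θ(n).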
Definition
The Little o notation (o) defines a strict upper bound for f(n), meaning f(n) grows
slower than g(n).
Mathematically:
f(n) = o(g(n))
if:
lim (n → ∞) f(n) / g(n) = 0
Example
For:
f(n) = 18n + 9
We can say:
lim (n → ∞) (18n + 9) / n² = 0
we conclude:
f(n) = o(n²)
These notations are essential in algorithm analysis to classify how functions grow with
input size n, helping in choosing the most efficient algorithm.
Chapter 3
3.1 Introduction to Strings
Historically, computers were used primarily for numerical data processing. However,
with advancements, the need to process text-based data emerged, leading to the
development of string processing. A string is a sequence of characters stored in
memory. It can include alphabets, digits, spaces, punctuation marks, and special
symbols. Strings differ from numerical data as they carry meaning in sequences, unlike
independent numerical values.
H E L L O \0
This null character helps distinguish meaningful characters from unused memory
spaces.
String Operations
Applications of Strings
Each programming language has a character set, which consists of all valid symbols
used in that language. These symbols include alphabets (A-Z, a-z), digits (0-9), and
special characters (+, -, *, /, =, $, etc.). Characters are stored in memory using
encoding schemes such as ASCII and Unicode.
Definition of a String
String Representation in C
H E L L O \0
The null character (\0) ensures that the system knows where the string ends.
String Operations
Some common operations performed on strings include:
#include <stdio.h>
#include <string.h>
int main() {
char str[] = "HELLO WORLD";
printf("String length: %zu\n", strlen(str)); // strlen returns size_t
return 0;
}
Output:
String length: 11
Concatenation of Strings
Concatenation means joining two strings together. It is done using the strcat()
function in C.
#include <stdio.h>
#include <string.h>
int main() {
char str1[20] = "Hello";
char str2[] = " World";
strcat(str1, str2); // Concatenates str2 to str1
printf("Concatenated String: %s\n", str1);
return 0;
}
Output:
Concatenated String: Hello World
Substrings in Strings
#include <stdio.h>
#include <string.h>
void substring(char str[], int start, int length) {
char sub[20];
int i;
for (i = 0; i < length; i++)
sub[i] = str[start + i];
sub[i] = '\0'; // Null terminate the substring
printf("Substring: %s\n", sub);
}
int main() {
char str[] = "HELLO WORLD";
substring(str, 6, 5); // Extract "WORLD"
return 0;
}
Output:
Substring: WORLD
A null string ("") contains no characters and has a length of 0. It is different from a
string containing a single space (" "), which has a length of 1.
Example:
Applications of Strings
Strings are stored as character arrays in memory, with each character occupying a
fixed storage space. In C, a string is terminated by a null character (\0), which marks
the end of the string.
Example:
Memory representation:
H E L L O \0
The null character ensures that functions like strlen() and printf() can determine
the end of the string.
Fixed-Length Storage
Each string is stored in a fixed-size array, meaning every record has the same length.
If a string is shorter than the allocated space, unused memory is wasted.
Example:
Here, even though "DATA" has 4 characters, the full 10 bytes are allocated, wasting 6
bytes.
Advantages:
Disadvantages:
#include <stdio.h>
int main() {
char str[10] = "HELLO";
printf("Stored String: %s\n", str);
return 0;
}
Output:
Stored String: HELLO
Strings are stored in memory, but their actual length can vary. However, a maximum
limit is set for the storage. Two methods are used:
Advantages:
Disadvantages:
#include <stdio.h>
#include <string.h>
int main() {
char str1[20] = "HELLO$$"; // End marker method
char str2[] = "WORLD"; // Length-based method
printf("String 1: %s\n", str1);
printf("Length of String 2: %zu\n", strlen(str2)); // strlen returns size_t
return 0;
}
Output:
String 1: HELLO$$
Length of String 2: 5
Linked Storage
Instead of storing the entire string in a continuous block of memory, each character
(or group of characters) is stored in a linked list node. Each node contains:
Example:
Advantages:
Disadvantages:
#include <stdio.h>
#include <stdlib.h>
struct Node {
char data;
struct Node* next;
};
// Function to print the linked list
void printList(struct Node* head) {
while (head != NULL) {
printf("%c", head->data);
head = head->next;
}
printf("\n");
}
int main() {
struct Node* head = malloc(sizeof(struct Node));
struct Node* second = malloc(sizeof(struct Node));
struct Node* third = malloc(sizeof(struct Node));
head->data = 'H'; head->next = second;
second->data = 'I'; second->next = third;
third->data = '!'; third->next = NULL;
printf("Stored String: ");
printList(head);
// Release the nodes before exiting
free(third);
free(second);
free(head);
return 0;
}
Output:
Stored String: HI!
Example:
char ch = 'A';
Character Constants in C
char ch = 'B';
Example:
#include <stdio.h>
int main() {
char newline = '\n';
printf("Hello%cWorld", newline);
return 0;
}
Output:
Hello
World
Example:
Example:
Memory representation:
H E L L O \0
The null character (\0) is essential for marking the end of the string.
#include <stdio.h>
int main() {
char ch = 'G';
printf("Character: %c\n", ch);
printf("ASCII Value: %d\n", ch);
return 0;
}
Output:
Character: G
ASCII Value: 71
The basic unit of a string is a character, but in text processing, the primary focus is on
substrings rather than individual characters.
The length of a string is the number of characters it contains (excluding the null
character \0). In C, the strlen() function is used.
#include <stdio.h>
#include <string.h>
int main() {
char str[] = "HELLO WORLD";
printf("Length of the string: %zu\n", strlen(str)); // strlen returns size_t
return 0;
}
Output:
Length of the string: 11
2. Concatenating Two Strings
Concatenation means joining two strings together. The strcat() function in C appends
one string to another.
#include <stdio.h>
#include <string.h>
int main() {
char str1[20] = "Hello";
char str2[] = " World";
strcat(str1, str2); // Appends str2 to str1
printf("Concatenated String: %s\n", str1);
return 0;
}
Output:
Concatenated String: Hello World
3. Copying a String
#include <stdio.h>
#include <string.h>
int main() {
char source[] = "C Programming";
char destination[20];
strcpy(destination, source); // Copy source to destination
printf("Copied String: %s\n", destination);
return 0;
}
Output:
Copied String: C Programming
4. Extracting a Substring
#include <stdio.h>
#include <string.h>
void substring(char str[], int start, int length) {
char sub[20];
int i;
for (i = 0; i < length; i++)
sub[i] = str[start + i];
sub[i] = '\0'; // Null terminate the substring
printf("Substring: %s\n", sub);
}
int main() {
char str[] = "HELLO WORLD";
substring(str, 6, 5); // Extract "WORLD"
return 0;
}
Output:
Substring: WORLD
#include <stdio.h>
#include <string.h>
int main() {
char text[] = "HELLO WORLD";
char *found = strstr(text, "WORLD");
if (found)
printf("Substring found at position: %td\n", found - text); // %td prints a pointer difference
else
printf("Substring not found.\n");
return 0;
}
Output:
Substring found at position: 6
#include <stdio.h>
#include <string.h>
int main() {
char str1[] = "Hello";
char str2[] = "World";
if (strcmp(str1, str2) == 0)
printf("Strings are equal\n");
else
printf("Strings are not equal\n");
return 0;
}
Output:
Strings are not equal
7. Inserting a Substring into a String
#include <stdio.h>
#include <string.h>
void insertSubstring(char str[], char sub[], int pos) {
char temp[100];
strncpy(temp, str, pos);
temp[pos] = '\0';
strcat(temp, sub);
strcat(temp, str + pos);
strcpy(str, temp);
}
int main() {
char str[100] = "Hello World";
insertSubstring(str, " Beautiful", 5);
printf("Modified String: %s\n", str);
return 0;
}
Output:
Modified String: Hello Beautiful World
#include <stdio.h>
#include <string.h>
void deleteSubstring(char str[], int pos, int length) {
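// memmove is used because source and destination overlap (strcpy is undefined here)
memmove(str + pos, str + pos + length, strlen(str + pos + length) + 1);
}
int main() {
char str[100] = "Hello Beautiful World";
deleteSubstring(str, 6, 10); // Remove "Beautiful " (including the trailing space)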
printf("Modified String: %s\n", str);
return 0;
}
Output:
Modified String: Hello World
#include <stdio.h>
#include <string.h>
void replaceWord(char str[], char oldWord[], char newWord[]) {
char temp[200];
char *pos = strstr(str, oldWord);
if (pos) {
int index = pos - str;
strncpy(temp, str, index);
temp[index] = '\0';
strcat(temp, newWord);
strcat(temp, pos + strlen(oldWord));
strcpy(str, temp);
}
}
int main() {
char str[100] = "Hello Beautiful World";
replaceWord(str, "Beautiful", "Wonderful");
printf("Modified String: %s\n", str);
return 0;
}
Output:
Modified String: Hello Wonderful World
Word processing involves creating, editing, formatting, and managing textual data.
It is widely used in applications like Microsoft Word, Google Docs, and text editors.
Computers process text in lines, paragraphs, and pages using string operations such
as insertion, deletion, search, and replacement.
1. Insertion in a String
#include <stdio.h>
#include <string.h>
void insertSubstring(char str[], char sub[], int pos) {
char temp[100];
strncpy(temp, str, pos);
temp[pos] = '\0';
strcat(temp, sub);
strcat(temp, str + pos);
strcpy(str, temp);
}
int main() {
char str[100] = "Hello World";
insertSubstring(str, " Beautiful", 5);
printf("Modified String: %s\n", str);
return 0;
}
Output:
Modified String: Hello Beautiful World
2. Deletion in a String
#include <stdio.h>
#include <string.h>
void deleteSubstring(char str[], int pos, int length) {
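// memmove is used because source and destination overlap (strcpy is undefined here)
memmove(str + pos, str + pos + length, strlen(str + pos + length) + 1);
}
int main() {
char str[100] = "Hello Beautiful World";
deleteSubstring(str, 6, 10); // Remove "Beautiful " (including the trailing space)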
printf("Modified String: %s\n", str);
return 0;
}
Output:
Modified String: Hello World
#include <stdio.h>
#include <string.h>
int main() {
char text[] = "Welcome to Programming World";
char *found = strstr(text, "Programming");
if (found)
printf("Word found at position: %td\n", found - text); // %td prints a pointer difference
else
printf("Word not found.\n");
return 0;
}
Output:
Word found at position: 11
Replacing words helps in modifying text without manually editing every instance.
#include <stdio.h>
#include <string.h>
void replaceWord(char str[], char oldWord[], char newWord[]) {
char temp[200];
char *pos = strstr(str, oldWord);
if (pos) {
int index = pos - str;
strncpy(temp, str, index);
temp[index] = '\0';
strcat(temp, newWord);
strcat(temp, pos + strlen(oldWord));
strcpy(str, temp);
}
}
int main() {
char str[100] = "Hello Beautiful World";
replaceWord(str, "Beautiful", "Wonderful");
printf("Modified String: %s\n", str);
return 0;
}
Output:
Modified String: Hello Wonderful World
#include <stdio.h>
#include <ctype.h>
int main() {
char str[] = "hello world";
int i;
for (i = 0; str[i] != '\0'; i++)
str[i] = (char)toupper((unsigned char)str[i]); // cast avoids undefined behavior for negative char values
printf("Uppercase String: %s\n", str);
return 0;
}
Output:
Uppercase String: HELLO WORLD
● Microsoft Word: Offers document editing, spell check, and text formatting.
● Notepad/Text Editors: Basic text editing without formatting.
● Google Docs: Online document processing with cloud storage.
● LaTeX: Used for scientific documents with complex formatting.
Simple C program to copy one string into another without using built-in functions like
strcpy():
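#include <stdio.h>

/* Copy src into dest one character at a time, including the terminator. */
void copyString(char dest[], const char src[]) {
    int i = 0;
    while (src[i] != '\0') {  // Copy characters until null terminator
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';           // Append null character at the end
}

int main(void) {
    char source[100], destination[100];
    int i;
    // Input the source string; fgets is bounded, unlike gets()
    printf("Enter a string: ");
    if (fgets(source, sizeof source, stdin) == NULL)
        return 1;
    // Drop the trailing newline that fgets keeps
    for (i = 0; source[i] != '\0'; i++) {
        if (source[i] == '\n') {
            source[i] = '\0';
            break;
        }
    }
    // Copy the string manually
    copyString(destination, source);
    // Print the copied string
    printf("Copied String: %s\n", destination);
    return 0;
}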
Explanation
Sample Output
Enter a string: Hello World
Copied String: Hello World
Steps:
Example:
Let P = "abcd" and T = "abcdefabcd".
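W1 = "abcd" (match found at index 1)
W2 = "bcde"
W3 = "cdef"
W4 = "defa"
W5 = "efab"
W6 = "fabc"
W7 = "abcd" (second match, at index 7)
Since the algorithm stops at the first match, it reports index 1; W2 to W7 are listed
only to show every window position.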
The maximum number of window positions to check is calculated as:
MAX = LENGTH(T) - LENGTH(P) + 1
Substituting values:
MAX = 10 - 4 + 1 = 7
Complexity Analysis
○ C = N1 + N2 + ... + NL
The worst case happens when all characters match except the last one at each shift,
making the algorithm check almost every position in the text.
○ C(n) = r * (s - r + 1)
where r is the length of P, s is the length of T, and n = r + s is the input size.
○ O(n²)
This means that as the text gets longer, the number of comparisons increases very
fast, making this method slow for large texts.
Key Idea:
For P = "aaba":
Q0 → λ (empty string)
Q1 → "a"
Q2 → "aa"
Q3 → "aab"
Q4 → "aaba"
● Rows: Substrings of P.
● Columns: Possible characters.
● Entries: The longest matching prefix.
Example 3.12
T = "abcababa", P = "aaba"
States: Q0 → Q1 → Q0 → Q0 → Q1 → Q0 → Q1 → Q0 → Q1
The final state Q4 (= P) is never reached, so P is NOT found.
T = "abcaabaca", P = "aaba"
States: Q0 → Q1 → Q0 → Q0 → Q1 → Q2 → Q3 → Q4 (= P)
P is found at index 3 (0-based).
Output:
P = aaba
T = abcaabaca
Index of P in T is 3
Complexity Analysis
● Time Complexity:
○ Brute-force approach: O(n²)
○ Optimized approach: O(n) (linear time)
● Conclusion: The second algorithm is more efficient.