DS Unit 1

The document outlines the syllabus for Unit 1, covering fundamental concepts of data structures, including their definitions, types, operations, and algorithm complexity. It emphasizes the importance of efficient data organization and manipulation in computer science, detailing both primitive and non-primitive data structures. Additionally, it discusses abstract data types, algorithm complexity, and the time-space trade-off, highlighting their significance in optimizing performance and resource utilization.

Uploaded by

jesijesintha34
Copyright
© © All Rights Reserved

Unit 1

Syllabus Unit 1
Chapter 1: Introduction and Overview: Definition, Elementary data organization,
Data Structures, data Structures operations, Abstract data types, algorithms
complexity, time-space trade-off.

Chapter 2: Preliminaries: Mathematical notations and functions, Algorithmic notations,
control structures, Complexity of algorithms, asymptotic notations for complexity of
algorithms.

Chapter 3: Introduction to Strings: Storing String, Character Data Types, String
Operations, word processing, Introduction to pattern matching algorithms.

Chapter 1
Chapter 1: Introduction and Overview

1.1 Definition​
1.2 Elementary data organization​
1.3 Data Structures​
1.4 Data Structures operations​
1.5 Abstract data types​
1.6 Algorithm complexity​
1.7 Time-space trade-off

1.1 Definition of Data Structures

Data structures refer to the systematic way of organizing, managing, and storing data
in a computer so that it can be used efficiently. A data structure defines the
relationship among data elements and the operations that can be performed on them.

In computing, data structures are fundamental as they enable the efficient handling
of data, ensuring that algorithms work optimally in terms of time and space. They are
used in various applications such as databases, operating systems, artificial intelligence,
and computer networks.

A data structure provides:

1.​ Storage Mechanism: It helps store data in a structured manner.


2.​ Efficient Access: Data can be retrieved, searched, and modified quickly.
3.​ Manipulation Operations: Includes inserting, deleting, sorting, and searching.
4.​ Optimization of Resources: Ensures optimal use of memory and processing power.

Types of Data Structures

Data structures can be broadly classified into primitive and non-primitive types:

1.​ Primitive Data Structures: These are basic data types provided by the
programming language, such as:​

○​ Integer (int)
○​ Character (char)
○​ Floating-point numbers (float, double)
○​ Boolean (true/false)
2.​ Non-Primitive Data Structures: These are more complex structures derived from
primitive types and can be further categorized into:​

○​ Linear Data Structures: The elements are arranged in a sequential manner.


Examples:
■​ Arrays
■​ Linked Lists
■​ Stacks
■​ Queues
○​ Non-Linear Data Structures: The elements are not arranged sequentially, but
follow hierarchical or network-based structures. Examples:
■​ Trees
■​ Graphs

Need for Data Structures

Efficient data structures are essential for:

●​ Managing large datasets effectively.


●​ Reducing computational complexity.
●​ Enhancing data retrieval and modification operations.
●​ Supporting the implementation of advanced algorithms in real-world applications.

Thus, data structures form the foundation of computer science and software
development, providing a structured way to handle and manipulate data efficiently.

1.2 Elementary Data Organization

Data organization refers to the systematic arrangement of data to facilitate easy
access, modification, and management. It plays a crucial role in improving the efficiency
of computing operations. The way data is structured and represented affects the
performance of algorithms and storage optimization.

Basic Concepts in Data Organization

1.​ Data and Information:​

○​ Data: Raw facts or values that do not have any meaningful interpretation.
Example: "John", "25", "New York".
○​ Information: Processed and meaningful data. Example: "John is 25 years old
and lives in New York."
2.​ Data Elements and Fields:​

○​ Data Element: A single unit of data (e.g., an integer or character).


○​ Field: A group of characters that represent an attribute of an entity. Example:
"Name", "Age".
3.​ Records and Files:​

○​ Record: A collection of related fields representing a single entity. Example: A


student record may contain Name, Roll Number, and Marks.
○​ File: A collection of records stored together. Example: A student database file
containing multiple student records.
○​ Entity: An entity is an object or concept that has a distinct existence and can
be uniquely identified. In databases, it represents a real-world object with
attributes, such as a student, employee, or product.
4.​ Data Items and Data Types:​

○​ Data Item: The smallest unit of named data.


○​ Data Type: Defines the kind of data stored (e.g., integer, float, character,
string).
5.​ Data Structure vs. Data Organization:​

○​ Data Structure: Defines how data is arranged in memory (e.g., arrays, linked
lists).
○​ Data Organization: Deals with the storage, retrieval, and management of data.

Forms of Data Organization

1.​ Linear Organization: Data is stored sequentially (e.g., arrays, linked lists).
2.​ Hierarchical Organization: Data is arranged in a tree-like structure (e.g., file
systems).
3.​ Network Organization: Data follows a complex structure with multiple links (e.g.,
graphs, databases).
4.​ Relational Organization: Data is stored in tabular format with relationships (e.g.,
relational databases).

Importance of Elementary Data Organization

●​ Helps in efficient data retrieval and storage.


●​ Optimizes memory usage.
●​ Enhances data processing speed.
●​ Forms the foundation for advanced data structures and algorithms.

Thus, elementary data organization is essential for managing data efficiently and plays
a significant role in computer science and programming.

1.3 Data Structures

A data structure is a way of organizing and storing data in a computer so that it can be
used efficiently. It defines the relationship between data elements and the operations
that can be performed on them. Different data structures are used to solve different
kinds of problems in computer science.

Classification of Data Structures

Data structures can be broadly classified into two categories:

1.​ Linear Data Structures​

○​ Data elements are arranged in a sequential manner.


○​ Each element has a unique predecessor and successor, except the first and last
elements.
○​ Examples:
■​ Arrays: A collection of elements stored in contiguous memory locations.
■​ Linked Lists: A collection of nodes where each node points to the next.
■​ Stacks: A collection following the Last In, First Out (LIFO) principle.
■​ Queues: A collection following the First In, First Out (FIFO) principle.
2.​ Non-Linear Data Structures​

○​ Data elements are not arranged sequentially.


○​ Relationships between elements can be hierarchical or complex.
○​ Examples:
■​ Trees: A hierarchical structure with a root node and child nodes.
■​ Graphs: A set of nodes connected by edges, representing relationships.

Types of Data Structures

Data structures can also be classified as:


1.​ Primitive Data Structures: Basic built-in types such as integers, characters, and
floating-point numbers.
2.​ Abstract Data Types (ADT): High-level representations that define the
operations without specifying implementation details. Examples: Stack ADT, Queue
ADT.
3.​ Static Data Structures: The size of the structure is fixed at compile time (e.g.,
Arrays).
4.​ Dynamic Data Structures: The size can change at runtime (e.g., Linked Lists).

Applications of Data Structures

●​ Efficient data management (Databases, File Systems).


●​ Optimized searching and sorting (Binary Search Trees, Heaps).
●​ Graph-based problems (Social networks, Routing algorithms).
●​ Compiler design and Memory management (Symbol tables, Garbage collection).

Thus, data structures form the backbone of efficient programming, enabling optimal
memory usage and fast algorithm execution.

1.4 Data Structure Operations

Data structures support various operations that allow efficient manipulation and
retrieval of data. These operations are fundamental to implementing algorithms and
solving computational problems. The choice of a data structure significantly impacts
the efficiency of these operations.

Basic Operations on Data Structures


1.​ Traversing
○​ Accessing each element of the data structure to process it.
○​ Example: Scanning an array to print all its elements.
2.​ Searching
○​ Finding a particular element in the data structure.
○​ Example: Searching for a number in an array using linear or binary search.
3.​ Insertion
○​ Adding a new element to the data structure.
○​ Example: Adding a node to a linked list.
4.​ Deletion
○​ Removing an element from the data structure.
○​ Example: Deleting an element from a queue.
5.​ Sorting
○​ Arranging elements in ascending or descending order.
○​ Example: Sorting an array using Quick Sort or Merge Sort.
6.​ Merging
○​ Combining two data structures into one.
○​ Example: Merging two sorted lists.
7.​ Updation (Modification)
○​ Changing the value of an element in the data structure.
○​ Example: Updating an element in an array.

Examples of Operations in Different Data Structures

Data Structure | Common Operations
Arrays | Traversing, Searching, Insertion, Deletion, Sorting
Linked Lists | Traversing, Insertion, Deletion, Searching
Stacks | Push (Insertion), Pop (Deletion), Peek (Top Element)
Queues | Enqueue (Insertion), Dequeue (Deletion)
Trees | Insertion, Deletion, Traversal (Preorder, Inorder, Postorder)
Graphs | Traversal (BFS, DFS), Searching, Shortest Path (Dijkstra’s Algorithm)

Efficiency of Operations
The efficiency of these operations varies depending on the data structure used. For
example:

● Searching in an unsorted array takes O(n) time, while searching in a balanced Binary
Search Tree (BST) takes O(log n) time; an unbalanced BST can degrade to O(n).
● Inserting an element into a linked list at a known position takes O(1) time, whereas
inserting into the middle of an array takes O(n) time because later elements must be shifted.

Importance of Data Structure Operations

● Optimized Performance: Helps in selecting the right structure for different applications.
● Memory Utilization: Ensures efficient use of memory resources.
● Real-world Applications: Used in databases, networking, artificial intelligence, etc.

Thus, understanding data structure operations is crucial for designing efficient
algorithms and improving computational performance.

1.5 Abstract Data Types (ADT)

An abstract data type is a way of organising and using data without worrying about
how it is implemented internally. It defines what operations can be performed on the
data but hides the details of how those operations work.

Characteristics of ADT

1.​ Encapsulation – ADTs hide implementation details from the user.


2.​ Independence – The implementation can be changed without affecting the external
usage.
3.​ Operations Defined – ADTs specify what operations can be performed but not how
they are implemented.

Common Abstract Data Types

ADT | Operations | Example Implementations
List ADT | Insert, Delete, Search, Traverse | Array, Linked List
Stack ADT | Push, Pop, Peek, isEmpty | Array-based Stack, Linked Stack
Queue ADT | Enqueue, Dequeue, Peek, isEmpty | Circular Queue, Linked Queue
Deque ADT | Insert Front, Insert Rear, Delete Front, Delete Rear | Double-ended Queue
Priority Queue ADT | Insert, Remove Highest Priority | Heap
Set ADT | Union, Intersection, Difference | Hash Table, Binary Search Tree
Map (Dictionary) ADT | Insert(Key, Value), Delete(Key), Search(Key) | Hash Map, Binary Search Tree

Example: Stack as an ADT

A stack follows the LIFO (Last In, First Out) principle and supports operations such
as:

●​ push(x): Insert element x on top.


●​ pop(): Remove and return the top element.
●​ peek(): Return the top element without removing it.
●​ isEmpty(): Check if the stack is empty.

A stack can be implemented using arrays or linked lists, but the ADT only defines the
operations without specifying the implementation.

Importance of ADTs

●​ Separation of Concerns: Users focus on what the ADT does rather than how it
works.
●​ Reusability: ADTs allow different implementations while maintaining the same
interface.
●​ Modularity: ADTs help in designing modular and maintainable code.

Thus, Abstract Data Types provide a high-level way to define and work with data
structures without worrying about their internal details.
Abstract Data Type Model

1.​ Components:
○​ Functions: Divided into public and private functions.
○​ Data Structures: Includes arrays, linked lists, and records.
2.​ Encapsulation:
○​ The ADT encapsulates data structures and operations within itself.
○​ The application program can only interact with public functions.
3.​ Interface:
○​ The application programming interface (API) provides controlled access
to ADT functions.
○​ Only the operation names and parameters are exposed to the application.
4.​ Implementation Hiding:
○​ The internal implementation of data structures is hidden from the user.
○​ Different versions of a structure can coexist without affecting the user.
5.​ Usage of Data Structures:
○​ Arrays and linked lists are commonly used to implement ADTs.
○​ ADTs enable efficient data management and reusability in software
development.

This model ensures data abstraction, modularity, and ease of maintenance in programming.

1.6 Algorithm Complexity


Algorithm complexity refers to the analysis of how the runtime or space requirements
of an algorithm grow with the input size. It helps in evaluating the efficiency of
an algorithm in terms of time complexity and space complexity.

Types of Algorithm Complexity

1.​ Time Complexity – Measures the amount of time an algorithm takes to execute as a
function of input size (n).
2.​ Space Complexity – Measures the amount of memory required by an algorithm to
execute.

Time Complexity

Time complexity is expressed using Big-O Notation (O), which describes the upper
bound of an algorithm’s running time.

Common Time Complexities

Notation | Type | Example Algorithms
O(1) | Constant Time | Accessing an array element by index
O(log n) | Logarithmic Time | Binary Search
O(n) | Linear Time | Linear Search, Traversing an Array
O(n log n) | Log-Linear Time | Merge Sort, Quick Sort (Best/Average Case)
O(n²) | Quadratic Time | Bubble Sort, Selection Sort
O(2ⁿ) | Exponential Time | Naive Recursive Fibonacci
O(n!) | Factorial Time | Brute-force solution of the Traveling Salesman Problem

Space Complexity

Space complexity refers to the total memory required by an algorithm, including:

1.​ Fixed Part – Independent of input size (e.g., program code, constants).
2.​ Variable Part – Depends on input size (e.g., dynamic memory allocations, function
call stack).
Example Space Complexities

● O(1) – Algorithms with a constant amount of extra space (e.g., swapping two
variables).
● O(n) – Algorithms that require additional space proportional to input size (e.g.,
copying an array of n elements).
● O(n²) – Storing an n × n matrix.

Best Case, Worst Case, and Average Case

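● Best Case: The minimum time required over all inputs of size n (e.g., a linear
search that finds the element at the first position).
● Worst Case: The maximum time required (e.g., a linear search that finds the
element at the last position, or not at all).
● Average Case: The expected time over a given distribution of inputs, often
assumed to be uniform.

The notations Ω, O, and Θ denote lower, upper, and tight asymptotic bounds,
respectively. They are often loosely paired with the best, worst, and average cases,
but any of the three bounds can be stated for any of the three cases.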

Why is Algorithm Complexity Important?

1.​ Performance Analysis – Helps compare different algorithms for the same problem.
2.​ Resource Optimization – Ensures efficient use of CPU and memory.
3.​ Scalability – Helps determine how an algorithm behaves as the input size grows.

Thus, analyzing the complexity of algorithms is crucial for writing efficient programs
and selecting the best algorithm for a given problem.

1.7 Time-Space Trade-off

The Time-Space Trade-off is a fundamental concept in computer science that involves
balancing the use of time (execution speed) and space (memory usage) in an algorithm.
It refers to the situation where optimizing for one resource (time or space) leads to
increased usage of the other.

Understanding Time-Space Trade-off


●​ Some algorithms use more memory to execute faster (reducing time complexity).
●​ Others use less memory but take longer to execute (reducing space complexity).

For example:

● Storing precomputed values in a table can speed up execution but increases
memory usage.
● Reducing the memory footprint by computing values on demand saves space but
increases execution time.

Types of Time-Space Trade-offs

1.​ Using More Space to Reduce Time​

○​ Example: Look-up Tables


■​ Storing results of previous computations (e.g., Memoization in Dynamic
Programming) speeds up execution but requires additional memory.
○​ Example: Hash Tables
■ Searching in a hash table takes O(1) time on average, but the table consumes
extra space compared to a simple array.
2.​ Using More Time to Reduce Space​

○​ Example: Recursion vs. Iteration


■​ Recursive solutions use extra stack space (O(n) space complexity), while
iterative solutions can run with O(1) space.
○​ Example: Recomputing Values
■​ Instead of storing computed values, recomputing them each time saves
space but increases execution time.

Examples of Time-Space Trade-offs

Scenario | More Space, Less Time | Less Space, More Time
Searching | Hash Table (O(1) average lookup, extra table space) | Linear Search (O(n) time, no extra space)
Sorting | Merge Sort (O(n log n) time, O(n) space) | Quick Sort (O(n log n) average time, O(log n) space)
Fibonacci | Memoization (O(n) time, O(n) table space) | Naive Recursive Fibonacci (O(2ⁿ) time, no table, O(n) call-stack depth)
Graph Algorithms | Adjacency Matrix (O(V²) space, O(1) edge lookup) | Adjacency List (O(V+E) space, edge lookup proportional to vertex degree)

Practical Applications of Time-Space Trade-offs

● Database Indexing: Uses extra storage to speed up search queries.
● Compression Algorithms: Reduce storage size at the cost of extra time to compress
and decompress the data.
● Cache Memory: Stores frequently accessed data to improve speed at the cost of
additional memory.

Choosing Between Time and Space

●​ If memory is limited, use an algorithm that requires less space, even if it takes
more time.
●​ If speed is crucial, use an algorithm that runs faster, even if it consumes more
memory.

Thus, the time-space trade-off is a critical factor in designing efficient algorithms,
requiring a balance based on the problem's constraints.
Chapter 2
Chapter 2: Preliminaries

2.1 Mathematical Notations and Functions​


2.2 Algorithmic Notations​
2.3 Control Structures​
2.4 Complexity of Algorithms​
2.5 Asymptotic Notations for Complexity of Algorithms

2.1: Mathematical Notations and Functions

Mathematical notations and functions are important in analyzing algorithms and
measuring their efficiency. This section covers floor and ceiling functions, modular
arithmetic, integer functions, summation notation, factorials, and exponents &
logarithms.

1. Floor and Ceiling Functions

Definition

For any real number x, the floor and ceiling functions are defined as:

●​ Floor function ⌊x⌋: The largest integer that does not exceed x.
●​ Ceiling function ⌈x⌉: The smallest integer that is not less than x.

Properties

●​ If x is an integer, then ⌊x⌋ = ⌈x⌉ = x.


●​ If x is not an integer, then ⌊x⌋ + 1 = ⌈x⌉.

Examples

⌊3.14⌋ = 3, ⌊√5⌋ = 2, ⌊-8.5⌋ = -9, ⌊7⌋ = 7
⌈3.14⌉ = 4, ⌈√5⌉ = 3, ⌈-8.5⌉ = -8, ⌈7⌉ = 7
2. Remainder Function and Modular Arithmetic

Definition

For an integer k and a positive integer M, the remainder function is written as:​
k mod M​
This gives the remainder when k is divided by M. It satisfies the equation:​
k = Mq + r, where 0 ≤ r < M

Examples

25 mod 7 = 4, 25 mod 5 = 0, 35 mod 11 = 2, 3 mod 8 = 3

Congruence Relation

The congruence relation of modular arithmetic is written:
a ≡ b mod M, which holds if and only if M divides b - a.
Here, M is the modulus, and a is said to be congruent to b modulo M.

Modular Arithmetic Operations

●​ Addition: (a + b) mod M ≡ (a mod M + b mod M) mod M


●​ Multiplication: (a × b) mod M ≡ (a mod M × b mod M) mod M
●​ Subtraction: (a - b) mod M ≡ (a mod M - b mod M) mod M

Example: "Clock Arithmetic" with Modulo 12
6 + 9 ≡ 3 mod 12, 7 × 5 ≡ 11 mod 12, 1 - 5 ≡ 8 mod 12

3. Integer and Absolute Value Functions

Integer Value Function

Denoted as INT(x), it converts a real number into an integer by truncating the decimal
part.​
Examples: INT(3.14) = 3, INT(√5) = 2, INT(-8.5) = -8, INT(7) = 7

Absolute Value Function


Defined as:​
ABS(x) = x, if x ≥ 0​
ABS(x) = -x, if x < 0

Examples (|x| is the usual notation for ABS(x)):
|-15| = 15, |7| = 7, |-3.33| = 3.33, |4.441| = 4.441, |-0.0751| = 0.0751

4. Summation Notation and Dummy Index

Summation Symbol

The sum of a sequence is written using the Greek letter sigma (Σ).

Examples:​
Σ(i=1 to n) a_i = a1 + a2 + ... + an ​
Σ(j=2 to 5) j² = 2² + 3² + 4² + 5² = 4 + 9 + 16 + 25 = 54​
Σ(j=1 to n) j = 1 + 2 + 3 + ... + n = (n(n+1))/2

Example:​
1 + 2 + 3 + ... + 50 = (50(51))/2 = 1275

5. Factorial Function

Definition

The factorial of a positive integer n, denoted as n!, is the product of all positive
integers from 1 to n.

n! = 1 × 2 × 3 × ... × (n-1) × n
By convention, 0! = 1.

Examples

2! = 1 × 2 = 2, 3! = 1 × 2 × 3 = 6, 4! = 1 × 2 × 3 × 4 = 24​
5! = 5 × 4! = 5 × 24 = 120, 6! = 6 × 5! = 6 × 120 = 720
6. Exponents and Logarithms

Exponent Rules

a^m = a × a × a ... a (m times), a⁰ = 1, a^(-m) = 1 / a^m​


For rational numbers:​
a^(m/n) = ⁿ√(a^m) = (ⁿ√a)^m

Examples:​
2⁴ = 16, 2⁻⁴ = 1 / 2⁴ = 1 / 16, 125^(2/3) = 5² = 25

Logarithm Rules

A logarithm is the inverse of exponentiation.​


y = log_b(x) ⇔ b^y = x

Examples:​
log₂ 8 = 3 (since 2³ = 8)​
log₁₀ 100 = 2 (since 10² = 100)

Binary Logarithms (Log Base 2)

⌊log₂ 100⌋ = 6 (since 2⁶ = 64, 2⁷ = 128)​


⌈log₂ 1000⌉ = 10 (since 2⁹ = 512, 2¹⁰ = 1024)

Conclusion

These mathematical notations are essential in data structures and algorithms, helping
to analyze efficiency, memory usage, and computational complexity.

2.2: Algorithmic Notations

Definition of an Algorithm

An algorithm is a finite step-by-step set of well-defined instructions used to solve a
specific problem. Algorithms are structured to be executed on a computing machine
like a Turing Machine or its equivalent. This section focuses on how algorithms are
presented in a standard format.

Example of Algorithmic Notation (Finding the Largest Element in an Array)

Problem Statement:

Given an array DATA containing numerical values, find the largest element and its
position (LOC) in the array.

Steps to Solve the Problem:

1.​ Initialize LOC = 1 and MAX = DATA[1].


2.​ Compare MAX with each successive element DATA[K] in the array.
3.​ If DATA[K] is greater than MAX, update LOC = K and MAX = DATA[K].
4.​ The final values of LOC and MAX provide the position and value of the largest
element.

Formal Algorithm Representation

Algorithm 2.1: Finding the Largest Element in an Array

Given: A non-empty array DATA of size N.​


Output: The location LOC and value MAX of the largest element.​
Variable K is used as a counter.

1.​ Initialize: Set K = 1, LOC = 1, MAX = DATA[1].


2.​ Increment Counter: Set K = K + 1.
3.​ Test Condition: If K > N, output LOC, MAX and exit.
4.​ Compare and Update: If MAX < DATA[K], update LOC = K and MAX =
DATA[K].
5.​ Repeat Loop: Go to Step 2.

Flowchart Representation of Algorithm

The flowchart visually represents the steps in Algorithm 2.1. It consists of:

●​ Start and Stop symbols


●​ Decision blocks for checking conditions like K > N and MAX < DATA[K]
●​ Process blocks for initialization and updating values
The flowchart helps in understanding how the algorithm executes step by step.

C Program Implementation

Program 2.1: C Implementation of Algorithm 2.1

#include <stdio.h>

int main(void) {
    int DATA[10] = {22, 65, 1, 99, 32, 17, 74, 49, 33, 2};
    int N, LOC, MAX, K;
    N = 10;
    K = 0;                  /* C arrays are 0-indexed, so start at index 0 */
    LOC = 0;
    MAX = DATA[0];

loop:
    K = K + 1;              /* Step 2: increment counter */
    if (K == N) {           /* Step 3: all elements examined */
        printf("LOC = %d, MAX = %d\n", LOC, MAX);
        return 0;
    }
    if (MAX < DATA[K]) {    /* Step 4: compare and update */
        LOC = K;
        MAX = DATA[K];
    }
    goto loop;              /* Step 5: repeat */
}

Output:

LOC = 3, MAX = 99

Key Observations
●​ Array Indexing in C:​

○ In C, arrays start with index 0, not 1.
○ This means an array with 10 elements will have indices 0 to 9, so the output
LOC = 3 refers to the fourth element of DATA.
●​ Algorithm Format:​

○​ Every algorithm has two parts:


1.​ A paragraph describing the goal of the algorithm and defining the
variables.
2.​ A step-by-step procedure detailing the execution.
●​ Algorithm Identifiers:​

○​ Algorithms are labeled based on their chapter and sequence number.


○​ Example: Algorithm 4.3 means the third algorithm in Chapter 4.
○​ P5.3 refers to Solved Problem 5.3 in Chapter 5.

Control Structures in Algorithms

Algorithms use control flow mechanisms such as:

●​ Steps: Executed sequentially from Step 1 onwards unless redirected.


●​ Conditionals: Decision-making using if-else or comparison operations.
●​ Loops (Repetition): Steps can be repeated using a goto statement or loops.
●​ Exit Statement: The algorithm ends when the Exit statement is reached.

Example:​
If multiple operations are written in the same step, they are executed from left to
right.

Set K = 1, LOC = 1, MAX = DATA[1]

This means K is set first, followed by LOC, then MAX.

Algorithm Notation Rules


1.​ Comments:​

○ Comments are enclosed in square brackets (e.g., [End of loop]); the C
equivalent is /* Comment */.
○ Typically placed at the beginning or end of a step.
2.​ Variable Naming:​

○​ Variable names should be capitalized (e.g., MAX, DATA).


○​ Single-letter variables (e.g., K, N) are used for counters or loop
variables.
3.​ Assignment Statements:​

○​ In algorithms, assignments use := (e.g., MAX := DATA[1]).


○​ In C programming, the equal sign = is used instead.
4.​ Input and Output:​

○​ Input is assigned using: Read: Variable name


○​ Output is written as: Write: Messages and/or variable names
5.​ Procedures:​

○​ A procedure is an independent module solving a subproblem.


○​ The terms procedure and module are often interchangeable with
algorithm.

2.3: Control Structures

Control structures define the flow of execution in an algorithm or a program. There are
three primary types of flow control structures:

1.​ Sequence Logic (Sequential Flow)


2.​ Selection Logic (Conditional Flow)
3.​ Iteration Logic (Repetitive Flow)

These structures help in organizing the logic of an algorithm in a structured and
understandable way.

1. Sequence Logic (Sequential Flow)


●​ Definition: In this structure, the instructions are executed in the same order in
which they appear.
●​ Flow: One instruction follows another without branching or repetition.
●​ Representation: It can be written as a numbered list of steps or implemented in a
sequential manner in a program.

Example:
Step 1: Start​
Step 2: Read A, B​
Step 3: Compute C = A + B​
Step 4: Print C​
Step 5: Stop

This structure is commonly used in most basic programs where each instruction is
performed one after another.

2. Selection Logic (Conditional Flow)


Selection logic introduces decision-making using if-else conditions. It allows the
algorithm to execute different sets of instructions based on a given condition.

There are three types of Selection Structures:

2.1 Single Alternative (If Condition)

●​ Definition: If the condition is true, execute a statement; otherwise, skip it.

Syntax:

If condition then:​
Execute Module A​
[End of If structure]

● Flowchart Representation Description: A condition is checked; if true, Module A
executes, otherwise, the flow continues to the next step.

2.2 Double Alternative (If-Else Condition)


●​ Definition: If the condition is true, execute Module A; otherwise, execute Module
B.

Syntax:

If condition then:​
Execute Module A​
Else:​
Execute Module B​
[End of If structure]

● Flowchart Representation: If the condition holds, Module A is executed;
otherwise, Module B is executed.

2.3 Multiple Alternatives (Else-If Ladder)

●​ Definition: When there are multiple conditions, different blocks of code execute
based on which condition is true.

Syntax:

If condition 1 then:​
Execute Module A1​
Else if condition 2 then:​
Execute Module A2​
Else if condition M then:​
Execute Module AM​
Else:​
Execute Module B​
[End of If structure]

● Flowchart Representation Description: Only one module executes: the one
belonging to the first condition that is true, or Module B if none is true. This is
useful when handling multiple cases in an algorithm.

3. Iteration Logic (Repetitive Flow - Loops)


Iteration logic allows certain steps to be repeated multiple times based on a condition.
There are two main types of loops:

3.1 Repeat-For Loop


●​ Definition: A loop that repeats a fixed number of times.

Syntax:

Repeat for K = R to S by T:​


Execute Module​
[End of loop]

●​ Explanation:
○​ R: Initial value
○​ S: End value
○​ T: Increment step
○ Loop continues until K > S (for a positive increment T); with a negative T, it
runs until K < S.

3.2 Repeat-While Loop

●​ Definition: A loop that executes while a condition remains true.

Syntax:

Repeat while condition:​


Execute Module​
[End of loop]

●​ Explanation:
○​ The loop runs only when the condition is true.
○​ If the condition is false initially, the loop does not execute.
○​ The loop must have an update statement inside to change the condition over
time.

Example 1: Finding the Largest Element in an Array (Using While Loop)

Algorithm 2.3: Finding Largest Element

Problem Statement: Given an array DATA with N numerical values, find the location
LOC and the value MAX of the largest element.

Steps:
1.​ Initialize K = 1, LOC = 1, MAX = DATA[1].
2.​ Loop: Repeat Steps 3 and 4 while K ≤ N.
3.​ Condition: If MAX < DATA[K], update LOC = K and MAX = DATA[K].
4.​ Increment: Set K = K + 1.
5.​ Output: Write LOC, MAX.
6.​ Exit.

C Program Implementation (Program 2.3)


#include <stdio.h>

int main(void) {
    int DATA[10] = {22, 65, 1, 99, 32, 17, 74, 49, 33, 2};
    int N, LOC, MAX, K;
    N = 10;
    K = 0;                      /* C arrays are 0-indexed */
    LOC = 0;
    MAX = DATA[0];

    while (K < N) {             /* Step 2: repeat while elements remain */
        if (MAX < DATA[K]) {    /* Step 3: compare and update */
            LOC = K;
            MAX = DATA[K];
        }
        K = K + 1;              /* Step 4: increment counter */
    }

    printf("LOC = %d, MAX = %d\n", LOC, MAX);   /* Step 5: output */
    return 0;
}

Output:​
LOC = 3, MAX = 99

Example 2: Solving Quadratic Equations

Quadratic Equation Formula:
For ax² + bx + c = 0 with a ≠ 0, the roots are calculated as:

x = (-b ± √(b² - 4ac)) / (2a)

The value D = b² - 4ac is called the discriminant.

● If D > 0, the equation has two distinct real roots.
● If D = 0, the equation has one repeated real root.
● If D < 0, there are no real solutions.

Algorithm 2.2: Quadratic Equation Solver

1.​ Read: A, B, C.
2.​ Compute D = B² - 4AC.
3.​ Check D:
○​ If D > 0, compute two real roots.
○​ If D = 0, compute one unique root.
○​ If D < 0, print "No real solutions".
4.​ Exit.

C Program Implementation (Program 2.2)


#include <stdio.h>
#include <math.h>

int main(void) {
    int A, B, C;
    float X1, X2, D;

    printf("Enter the values of A, B, and C: ");
    if (scanf("%d %d %d", &A, &B, &C) != 3 || A == 0) {
        printf("Invalid input: need three integers with A != 0");
        return 1;
    }

    D = B * B - 4 * A * C;      /* discriminant, computed in integer arithmetic */

    if (D > 0) {
        X1 = (-B + sqrt(D)) / (2 * A);
        X2 = (-B - sqrt(D)) / (2 * A);
        printf("X1 = %.2f, X2 = %.2f", X1, X2);
    }
    else if (D == 0) {
        X1 = -B / (2.0f * A);   /* 2.0f forces floating-point division */
        printf("Unique Solution: X = %.2f", X1);
    }
    else {
        printf("No Real Solutions");
    }
    return 0;
}

Output Example:
Input: 2 3 1
Output: X1 = -0.50, X2 = -1.00

Conclusion

●​ Sequence logic executes steps one by one.


●​ Selection logic allows branching with if-else conditions.
●​ Iteration logic allows looping with for and while structures.
●​ Understanding these control structures is fundamental to programming and
algorithm design.

2.4: Complexity of Algorithms

Introduction

The analysis of algorithms is a fundamental aspect of computer science. To compare


different algorithms, we need criteria to measure their efficiency. Two primary
measures used for this are time complexity and space complexity.

Time and Space Complexity

●​ Time Complexity: Measures the efficiency of an algorithm in terms of the


number of key operations it performs. In sorting and searching algorithms,
these operations typically involve comparisons.
●​ Space Complexity: Measures the amount of memory required by an algorithm in
terms of the input size.
The complexity function of an algorithm, denoted as f(n), expresses its running time
and/or space requirement in relation to the input size n.

Example: Searching in a Text

Consider an English short story TEXT, where we need to find the first occurrence of a
3-letter word W:

●​ If W = "the", it likely appears early in the text, leading to a small value of f(n).
●​ If W = "zoo", it might not appear at all, leading to a large f(n).

This example illustrates that the running time of an algorithm depends not only on the
input size n but also on the specific data.

Worst Case and Average Case Analysis

When analyzing an algorithm’s performance, we usually consider:

1.​ Worst Case Complexity: Maximum value of f(n) for any possible input.
2.​ Average Case Complexity: Expected value of f(n) over all possible inputs.
3.​ Best Case Complexity: Minimum possible value of f(n).

For average case complexity, we assume a probability distribution over the inputs,
often that each input is equally likely. The expectation E of the running time is
calculated as:

E = n₁p₁ + n₂p₂ + ... + nₖpₖ

where n₁, n₂, ..., nₖ are the possible numbers of operations, and p₁, p₂, ..., pₖ are
their respective probabilities.

Example: Linear Search Algorithm

Problem Statement

Given a linear array DATA of size n, we need to find the position LOC of a given ITEM
in the array. If ITEM is not found, LOC = 0.

Algorithm
1. Initialize K = 1 and LOC = 0.
2. Repeat while LOC = 0 and K ≤ n:
○ If ITEM = DATA[K], set LOC = K.
○ Set K = K + 1.
3. If LOC = 0, print "ITEM not in array".
4. Otherwise, print "LOC is the location of ITEM".

C Implementation

#include <stdio.h>

int main(void) {
    int DATA[10] = {22, 65, 1, 99, 32, 17, 74, 49, 33, 2};
    int ITEM = 17, N = 10;
    int LOC = -1;    /* -1 means "not found", because 0 is a valid C index */
    int K = 0;

    while (LOC == -1 && K < N) {
        if (ITEM == DATA[K])
            LOC = K;
        K++;
    }

    if (LOC == -1)
        printf("ITEM is not in the array DATA\n");
    else
        printf("%d is the location of ITEM\n", LOC);

    return 0;
}

Complexity Analysis

● Worst Case: If ITEM is the last element or not in the array, C(n) = n.
● Average Case: If ITEM is present and equally likely to be at any of the n
positions, the expected number of comparisons is:

C(n) = 1·(1/n) + 2·(1/n) + ... + n·(1/n) = (n + 1)/2

Thus, the average number of comparisons is approximately n/2.


Rate of Growth & Big O Notation

Definition

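For an algorithm M, the complexity function f(n) increases with input size n. To
analyze how f(n) grows, we compare it with standard functions:

log₂ n, n, n log₂ n, n², n³, 2ⁿ

The growth order is:

log₂ n < n < n log₂ n < n² < n³ < 2ⁿ (for sufficiently large n)

Big O Notation

If there exist a positive integer n₀ and a positive constant M such that:

|f(n)| ≤ M·|g(n)| for all n > n₀

then we write:

f(n) = O(g(n))

which means f(n) grows at most as fast as g(n).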

Common Algorithm Complexities

●​ Linear Search: O(n)


●​ Binary Search: O(log n)
●​ Bubble Sort: O(n²)
●​ Merge Sort: O(n log n)

These complexities are discussed in detail in sorting and searching topics.

2.5: Other Asymptotic Notations for Complexity of Algorithms

In algorithm analysis, asymptotic notations provide a way to classify algorithms based


on their growth rates. These notations define upper and lower bounds for an
algorithm’s complexity function f(n). The most common asymptotic notations are Big O
(O), Omega (Ω), Theta (Θ), and Little O (o).

1. Omega (Ω) Notation - Lower Bound

Definition

The Omega notation (Ω) defines a lower bound for a function f(n). It is often used to
describe the minimum time required by an algorithm, such as its best-case running time.

Mathematically, we say:

f(n) = Ω(g(n))

if there exist positive constants c and n₀ such that:

f(n) ≥ c·g(n) for all n ≥ n₀

Example

Let’s analyze the function:

f(n) = 18n + 9

Since 18n + 9 > n for all n ≥ 1 (take c = 1 and n₀ = 1), we can conclude:

f(n) = Ω(n)

Similarly, consider a function whose leading term is 90n², for example:

f(n) = 90n² + 18n + 6

Since 90n² + 18n + 6 > n² for all n ≥ 1, we conclude:

f(n) = Ω(n²)

This means that the function g(n) = n² is a lower bound for f(n).

Choosing the Correct Bound

If f(n) = 5n + 1, it satisfies both:

f(n) = Ω(n) and f(n) = Ω(1)

However, we always select the largest possible function g(n) that satisfies the
condition. Thus, in this case, Ω(n) is the correct choice.

2. Theta (Θ) Notation - Tight Bound

Definition

The Theta notation (Θ) is used when f(n) is bounded from both above and below by the
same function g(n). It gives the exact asymptotic behavior of f(n).

Mathematically:

f(n) = Θ(g(n))

if there exist positive constants c₁, c₂, and n₀ such that:

c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀

Example

Given:

f(n) = 18n + 9

We already established that f(n) = Ω(n).

Now, for an upper bound, note that 9 ≤ 9n for all n ≥ 1, so:

18n + 9 ≤ 27n for all n ≥ 1

Thus, it satisfies:

f(n) = O(n)

Since f(n) is both O(n) and Ω(n), we conclude:

f(n) = Θ(n)

3. Little o (o) Notation - Strict Upper Bound

Definition

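The Little o notation (o) defines a strict upper bound for f(n), meaning f(n) grows
slower than g(n).

Mathematically:

f(n) = o(g(n))

if:

f(n) = O(g(n)) and f(n) ≠ Ω(g(n))

The standard formal definition requires that f(n)/g(n) tends to 0 as n grows.

Example

For:

f(n) = 18n + 9

We can say:

f(n) = O(n²)

However, since it does not satisfy:

f(n) = Ω(n²)

we conclude:

f(n) = o(n²)

Thus, f(n) grows strictly slower than n².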

Summary of Asymptotic Notations

Notation        Meaning              Mathematical Condition
Big O (O)       Upper Bound          f(n) ≤ c·g(n)
Omega (Ω)       Lower Bound          f(n) ≥ c·g(n)
Theta (Θ)       Tight Bound          c₁·g(n) ≤ f(n) ≤ c₂·g(n)
Little o (o)    Strict Upper Bound   f(n) = O(g(n)) but f(n) ≠ Ω(g(n))

These notations are essential in algorithm analysis to classify how functions grow with
input size n, helping in choosing the most efficient algorithm.
Chapter 3
3.1 Introduction to Strings

Historically, computers were used primarily for numerical data processing. However,
with advancements, the need to process text-based data emerged, leading to the
development of string processing. A string is a sequence of characters stored in
memory. It can include alphabets, digits, spaces, punctuation marks, and special
symbols. Strings differ from numerical data as they carry meaning in sequences, unlike
independent numerical values.

Storage of Strings in Memory

Strings are stored as character arrays in contiguous memory locations. In C, strings


are terminated with a null character (\0) to indicate their end. For example, the
string "HELLO" is stored as:

H E L L O \0

This null character helps distinguish meaningful characters from unused memory
spaces.

Character Encoding in Strings

To store and process strings, character encoding systems are used:

●​ ASCII (7 or 8-bit representation for English characters)


○​ Example: 'A' → 65, 'B' → 66
●​ Unicode (UTF-8, UTF-16) (supports multiple languages)
○​ Example: 'अ' → 2309, '你' → 20320

String Operations

A string supports various operations:

1.​ Accessing a character at a specific position.


2.​ Modifying a character within the string.
3.​ Finding the length of a string.
4. Concatenating two strings (joining "HELLO " and "WORLD" gives "HELLO WORLD").
5.​ Copying a string to another variable.
6.​ Searching for a substring (finding "BE" in "TO BE OR NOT TO BE").

Example Program (C) - Finding Length of a String


#include <stdio.h>
#include <string.h>
int main() {
    char str[] = "COMPUTER";
    /* strlen() returns size_t, which is printed with %zu */
    printf("Length of the string: %zu\n", strlen(str));
    return 0;
}

Output:​
Length of the string: 8

Here, strlen(str) counts the number of characters excluding \0.

Applications of Strings

●​ Word Processing: Editing and formatting text in documents.


●​ Search Operations: Finding a word in a text file.
●​ Data Storage: Storing names, addresses, and email IDs.
●​ Pattern Matching: Identifying specific sequences in a text.

3.2 Basic Terminology

Each programming language has a character set, which consists of all valid symbols
used in that language. These symbols include alphabets (A-Z, a-z), digits (0-9), and
special characters (+, -, *, /, =, $, etc.). Characters are stored in memory using
encoding schemes such as ASCII and Unicode.

Definition of a String

A string is a sequence of characters stored in a continuous memory block. The number


of characters in a string is known as its length. In C, strings are stored as character
arrays and are terminated by a null character (\0) to mark the end.

String Representation in C

char str[] = "HELLO";

Stored in memory as:

H E L L O \0

The null character (\0) ensures that the system knows where the string ends.

String Operations
Some common operations performed on strings include:

●​ Concatenation: Joining two strings together.


●​ Substring Extraction: Extracting a portion of a string.
●​ Finding Length: Counting the number of characters in a string.
●​ Copying a String: Copying one string to another.
●​ Comparison: Checking if two strings are equal or which one is greater.

Example Program (C) - Finding the Length of a String

#include <stdio.h>
#include <string.h>
int main() {
    char str[] = "HELLO WORLD";
    /* strlen() returns size_t, which is printed with %zu */
    printf("String length: %zu\n", strlen(str));
    return 0;
}

Output:​
String length: 11

The strlen() function calculates the number of characters excluding \0.

Concatenation of Strings

Concatenation means joining two strings together. It is done using the strcat()
function in C.

#include <stdio.h>​
#include <string.h>​
int main() {​
char str1[20] = "Hello";​
char str2[] = " World";​
strcat(str1, str2); // Concatenates str2 to str1​
printf("Concatenated String: %s\n", str1);​
return 0;​
}​

Output:​
Concatenated String: Hello World
Substrings in Strings

A substring is a smaller portion of a string extracted from a larger string. It is defined


as:

SUBSTRING(string, start_position, length)

For example, in "HELLO WORLD", a substring starting at position 6 (counting from 0)
of length 5 gives "WORLD".

Example Program (C) - Extracting a Substring

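#include <stdio.h>
#include <string.h>

/* Prints the substring of str that begins at 0-based index start. */
void substring(char str[], int start, int length) {
    char sub[20];
    int i;

    if (start < 0 || start > (int)strlen(str))
        return;    // Start position lies outside the string

    // Stop at the end of str or when sub is full
    for (i = 0; i < length && i < 19 && str[start + i] != '\0'; i++)
        sub[i] = str[start + i];

    sub[i] = '\0'; // Null terminate the substring
    printf("Substring: %s\n", sub);
}

int main() {
    char str[] = "HELLO WORLD";
    substring(str, 6, 5); // Extract "WORLD"
    return 0;
}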

Output:​
Substring: WORLD

Null String and Empty String

A null string ("") contains no characters and has a length of 0. It is different from a
string containing a single space (" "), which has a length of 1.

Example:

char str1[] = ""; // Null string​


char str2[] = " "; // String with a space
Here, strlen(str1) gives 0, but strlen(str2) gives 1.

Applications of Strings

●​ Data Storage: Names, passwords, addresses.


●​ Word Processing: Editing text in software like MS Word.
●​ Search Operations: Finding text in a document.
●​ Pattern Matching: Used in text analysis and data retrieval.

3.3 Storing Strings

How Strings are Stored in Memory

Strings are stored as character arrays in memory, with each character occupying a
fixed storage space. In C, a string is terminated by a null character (\0), which marks
the end of the string.

Example:

char str[] = "HELLO";

Memory representation:

H E L L O \0

The null character ensures that functions like strlen() and printf() can determine
the end of the string.

Types of String Storage

There are three common ways to store strings in memory:

1.​ Fixed-Length Storage


2.​ Variable-Length Storage with Fixed Maximum
3.​ Linked Storage

Fixed-Length Storage

Each string is stored in a fixed-size array, meaning every record has the same length.
If a string is shorter than the allocated space, unused memory is wasted.
Example:

char str[10] = "DATA"; // Fixed-length array with extra space

Here, "DATA" needs only 5 bytes (4 characters plus the \0 terminator), but the full
10 bytes are allocated, leaving 5 bytes unused.

Advantages:

●​ Easy to access any record directly.


●​ Useful for structured data like database records.

Disadvantages:

●​ Wastage of space if the string is smaller than the allocated memory.


●​ Inserting a new string may require shifting other strings.

Example Program (C) - Fixed-Length Storage

#include <stdio.h>​
int main() {​
char str[10] = "HELLO";​
printf("Stored String: %s\n", str);​
return 0;​
}​

Output:​
Stored String: HELLO

Variable-Length Storage with Fixed Maximum

Strings are stored in memory, but their actual length can vary. However, a maximum
limit is set for the storage. Two methods are used:

1.​ Using a marker ($$) to indicate the end.


2.​ Storing the length separately.
Example:

"PROGRAM PRINTING ORDER$$" (Using $$ as end marker)
or
[Length: 22] "PROGRAM PRINTING ORDER"

Advantages:

●​ No unnecessary space is wasted.


●​ Searching and modifications are easier.

Disadvantages:

●​ Extra processing is needed to check the end of a string.

Example Program (C) - Variable-Length Storage

#include <stdio.h>
#include <string.h>
int main() {
    char str1[20] = "HELLO$$";  // End marker method
    char str2[] = "WORLD";      // Length-based method

    printf("String 1: %s\n", str1);
    // strlen() returns size_t, which is printed with %zu
    printf("Length of String 2: %zu\n", strlen(str2));
    return 0;
}

Output:​
String 1: HELLO$$​
Length of String 2: 5

Linked Storage

Instead of storing the entire string in a continuous block of memory, each character
(or group of characters) is stored in a linked list node. Each node contains:

1.​ Character data


2.​ Pointer to the next node

Example (the string "HI!" stored one character per node):

[H | next] → [I | next] → [! | NULL]

Advantages:

●​ Efficient for modifying strings (insertion, deletion).


●​ No fixed size required.

Disadvantages:

●​ More memory is needed due to pointers.


●​ Slower access compared to arrays.

Example Program (C) - Linked List for String Storage

#include <stdio.h>​
#include <stdlib.h>​

struct Node {​
char data;​
struct Node* next;​
};​

// Function to print the linked list​
void printList(struct Node* head) {​
while (head != NULL) {​
printf("%c", head->data);​
head = head->next;​
}​
printf("\n");​
}​

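int main() {
    struct Node* head = malloc(sizeof(struct Node));
    struct Node* second = malloc(sizeof(struct Node));
    struct Node* third = malloc(sizeof(struct Node));

    if (head == NULL || second == NULL || third == NULL) {
        printf("Memory allocation failed\n");
        free(head); free(second); free(third);  // free(NULL) does nothing
        return 1;
    }

    head->data = 'H'; head->next = second;
    second->data = 'I'; second->next = third;
    third->data = '!'; third->next = NULL;

    printf("Stored String: ");
    printList(head);

    // Release the nodes once the list is no longer needed
    free(third);
    free(second);
    free(head);

    return 0;
}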

Output:

Stored String: HI!

This program stores the string "HI!" using a linked list.

Comparison of Storage Methods

Storage Type                   Advantages                Disadvantages
Fixed-Length                   Easy access               Wastes memory
Variable-Length (Fixed Max)    Saves space               Processing overhead
Linked Storage                 Efficient modifications   Uses extra memory

3.4 Character Data Type

Introduction to Character Data Type


A character data type is used to store individual characters in memory. Each
character is represented using a unique numerical code defined by encoding schemes
such as ASCII or Unicode. In C, characters are stored using the char data type, which
occupies 1 byte (8 bits) of memory.

Example:

char ch = 'A';

Here, 'A' is stored as ASCII 65 in memory.

Character Constants in C

A character constant is a single character enclosed in single quotes (' ').​


Example:

char ch = 'B';

Internally, 'B' is stored as ASCII 66.

There are also escape sequences for special characters:

●​ '\n' → New line


●​ '\t' → Tab space
●​ '\0' → Null character

Example:

#include <stdio.h>​
int main() {​
char newline = '\n';​
printf("Hello%cWorld", newline);​
return 0;​
}​

Output:​
Hello​
World

The '\n' creates a new line.


ASCII Table

Variables and Storage of Characters

A variable is a named memory location that stores a character. Variables can be of


three types:

1.​ Static: Size is fixed at compile-time.


2.​ Semistatic: Size may vary within a given limit.
3.​ Dynamic: Size can change at runtime.

Example:

char letter = 'C'; // Static character variable​


char str[20]; // Semistatic character array​

For dynamic allocation, we use pointers:​
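#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
    char *ptr;
    ptr = (char*) malloc(10 * sizeof(char)); // Dynamic allocation
    if (ptr == NULL)
        return 1;                            // Allocation failed
    // Copy into the allocated block; writing ptr = "HELLO" would instead
    // lose the block and make free(ptr) invalid
    strcpy(ptr, "HELLO");
    printf("%s\n", ptr);
    free(ptr); // Free allocated memory
    return 0;
}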

Output:​
HELLO

Here, memory is allocated dynamically for a string.

String Representation in Memory

Characters are stored sequentially in memory. Strings in C are stored as arrays of


characters terminated by a null character (\0).

Example:

char str[] = "HELLO";

Memory representation:

H E L L O \0

The null character (\0) is essential for marking the end of the string.

Example Program (C) - Storing a Character and Its ASCII Value

#include <stdio.h>​
int main() {​
char ch = 'G';​
printf("Character: %c\n", ch);​
printf("ASCII Value: %d\n", ch);​
return 0;​
}​

Output:​
Character: G​
ASCII Value: 71

The ASCII value of 'G' is 71.

Character Arrays vs. Strings

Feature          Character Array                         String
Storage Method   Stores individual characters            Stores characters with \0 at the end
Usage            Can store letters, digits               Used for words, sentences
Example          char arr[5] = {'H','E','L','L','O'};    char str[] = "HELLO";
Access Method    arr[0], arr[1]                          str[0], str[1]

3.5 String Operations

Introduction to String Operations


A string is a sequence of characters stored in memory. Unlike numeric arrays, which
store independent values, strings contain words and phrases that hold meaning. To
manipulate strings effectively, several string operations are used in programming.

The basic unit of a string is a character, but in text processing, the primary focus is on
substrings rather than individual characters.

Common String Operations

1.​ Finding the length of a string


2.​ Concatenating (joining) two strings
3.​ Copying a string
4.​ Extracting a substring
5.​ Searching for a pattern in a string
6.​ Comparing two strings
7.​ Inserting and deleting substrings
8.​ Replacing characters or words in a string

1. Finding the Length of a String

The length of a string is the number of characters it contains (excluding the null
character \0). In C, the strlen() function is used.

Example (C Program) - Finding Length

#include <stdio.h>
#include <string.h>

int main() {
    char str[] = "HELLO WORLD";
    /* strlen() returns size_t, which is printed with %zu */
    printf("Length of the string: %zu\n", strlen(str));
    return 0;
}

Output:​
Length of the string: 11
2. Concatenating Two Strings

Concatenation means joining two strings together. The strcat() function in C appends
one string to another.

Example (C Program) - Concatenation

#include <stdio.h>​
#include <string.h>​

int main() {​
char str1[20] = "Hello";​
char str2[] = " World";​

strcat(str1, str2); // Appends str2 to str1​

printf("Concatenated String: %s\n", str1);​
return 0;​
}​

Output:​
Concatenated String: Hello World

3. Copying a String

To copy one string into another, we use the strcpy() function.

Example (C Program) - Copying a String

#include <stdio.h>​
#include <string.h>​

int main() {​
char source[] = "C Programming";​
char destination[20];​

strcpy(destination, source); // Copy source to destination​

printf("Copied String: %s\n", destination);​
return 0;​
}​

Output:​
Copied String: C Programming

4. Extracting a Substring

A substring is a smaller part of a string. We can extract a substring using an index


position and length.

Example (C Program) - Extracting a Substring

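#include <stdio.h>
#include <string.h>

/* Prints the substring of str that begins at 0-based index start. */
void substring(char str[], int start, int length) {
    char sub[20];
    int i;

    if (start < 0 || start > (int)strlen(str))
        return;    // Start position lies outside the string

    // Stop at the end of str or when sub is full
    for (i = 0; i < length && i < 19 && str[start + i] != '\0'; i++)
        sub[i] = str[start + i];

    sub[i] = '\0'; // Null terminate the substring
    printf("Substring: %s\n", sub);
}

int main() {
    char str[] = "HELLO WORLD";
    substring(str, 6, 5); // Extract "WORLD"
    return 0;
}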

Output:​
Substring: WORLD

5. Searching for a Pattern in a String

Finding whether a word or pattern appears in a string is known as pattern matching.


The strstr() function in C helps search for substrings.

Example (C Program) - Searching for a Word

#include <stdio.h>​
#include <string.h>​

int main() {​
char text[] = "HELLO WORLD";​
char *found = strstr(text, "WORLD");​

if (found)​
printf("Substring found at position: %td\n", found - text); // %td prints a pointer difference
else​
printf("Substring not found.\n");​

return 0;​
}​

Output:​
Substring found at position: 6

6. Comparing Two Strings

To compare two strings, we use the strcmp() function. It checks:

●​ If strings are equal → returns 0


●​ If the first string is greater → returns positive
●​ If the second string is greater → returns negative

Example (C Program) - Comparing Strings

#include <stdio.h>​
#include <string.h>​

int main() {​
char str1[] = "Hello";​
char str2[] = "World";​

if (strcmp(str1, str2) == 0)​
printf("Strings are equal\n");​
else​
printf("Strings are not equal\n");​

return 0;​
}​

Output:​
Strings are not equal
7. Inserting a Substring into a String

Inserting a substring at a particular position requires shifting characters.

Example (C Program) - Inserting a Substring

#include <stdio.h>​
#include <string.h>​

void insertSubstring(char str[], char sub[], int pos) {​
char temp[100];​
strncpy(temp, str, pos);​
temp[pos] = '\0';​
strcat(temp, sub);​
strcat(temp, str + pos);​
strcpy(str, temp);​
}​

int main() {​
char str[100] = "Hello World";​
insertSubstring(str, " Beautiful", 5);​
printf("Modified String: %s\n", str);​
return 0;​
}​

Output:​
Modified String: Hello Beautiful World

8. Deleting a Substring from a String

To delete a portion of a string, characters must be shifted left.

Example (C Program) - Deleting a Substring

#include <stdio.h>​
#include <string.h>​

void deleteSubstring(char str[], int pos, int length) {​
memmove(str + pos, str + pos + length, strlen(str + pos + length) + 1); // memmove is safe when source and destination overlap
}​

int main() {​
char str[100] = "Hello Beautiful World";​
deleteSubstring(str, 6, 10); // Remove "Beautiful"​
printf("Modified String: %s\n", str);​
return 0;​
}​

Output:​
Modified String: Hello World

9. Replacing a Word in a String

Replacing words or characters is a common operation.

Example (C Program) - Replacing a Word

#include <stdio.h>​
#include <string.h>​

void replaceWord(char str[], char oldWord[], char newWord[]) {​
char temp[200];​
char *pos = strstr(str, oldWord);​

if (pos) {​
int index = pos - str;​
strncpy(temp, str, index);​
temp[index] = '\0';​
strcat(temp, newWord);​
strcat(temp, pos + strlen(oldWord));​
strcpy(str, temp);​
}​
}​

int main() {​
char str[100] = "Hello Beautiful World";​
replaceWord(str, "Beautiful", "Wonderful");​
printf("Modified String: %s\n", str);​
return 0;​
}​

Output:​
Modified String: Hello Wonderful World

3.6 Word Processing

Introduction to Word Processing

Word processing involves creating, editing, formatting, and managing textual data.
It is widely used in applications like Microsoft Word, Google Docs, and text editors.
Computers process text in lines, paragraphs, and pages using string operations such
as insertion, deletion, search, and replacement.

Modern word processing software includes features like:

●​ Text Formatting (bold, italic, underline)


●​ Search and Replace (finding words and replacing them)
●​ Spell Checking (detecting errors in words)
●​ Cut, Copy, and Paste
●​ Alignment and Indentation

Basic Word Processing Operations

1.​ Insertion – Adding text to a document.


2.​ Deletion – Removing characters, words, or entire sentences.
3.​ Replacement – Substituting one word or phrase with another.
4.​ Search and Find – Locating a specific word in a text.
5.​ Formatting – Modifying the appearance of text (font size, style, alignment).

1. Insertion in a String

To insert a word or character in a string, existing characters must be shifted.

Example (C Program) - Inserting a Word into a String

#include <stdio.h>​
#include <string.h>​

void insertSubstring(char str[], char sub[], int pos) {​
char temp[100];​
strncpy(temp, str, pos);​
temp[pos] = '\0';​
strcat(temp, sub);​
strcat(temp, str + pos);​
strcpy(str, temp);​
}​

int main() {​
char str[100] = "Hello World";​
insertSubstring(str, " Beautiful", 5);​
printf("Modified String: %s\n", str);​
return 0;​
}​

Output:​
Modified String: Hello Beautiful World

This program inserts "Beautiful" at position 5.

2. Deletion in a String

Deleting a word from a sentence requires shifting characters left.

Example (C Program) - Deleting a Word

#include <stdio.h>​
#include <string.h>​

void deleteSubstring(char str[], int pos, int length) {​
memmove(str + pos, str + pos + length, strlen(str + pos + length) + 1); // memmove is safe when source and destination overlap
}​

int main() {​
char str[100] = "Hello Beautiful World";​
deleteSubstring(str, 6, 10); // Remove "Beautiful"​
printf("Modified String: %s\n", str);​
return 0;​
}​

Output:​
Modified String: Hello World

3. Search and Find a Word in a String

Searching helps in locating specific words in a text document.

Example (C Program) - Searching a Word

#include <stdio.h>​
#include <string.h>​

int main() {​
char text[] = "Welcome to Programming World";​
char *found = strstr(text, "Programming");​

if (found)​
printf("Word found at position: %td\n", found - text); // %td prints a pointer difference
else​
printf("Word not found.\n");​

return 0;​
}​

Output:​
Word found at position: 11

The function strstr() searches for "Programming" inside the text.

4. Replacing a Word in a String

Replacing words helps in modifying text without manually editing every instance.

Example (C Program) - Replacing a Word

#include <stdio.h>​
#include <string.h>​

void replaceWord(char str[], char oldWord[], char newWord[]) {​
char temp[200];​
char *pos = strstr(str, oldWord);​

if (pos) {​
int index = pos - str;​
strncpy(temp, str, index);​
temp[index] = '\0';​
strcat(temp, newWord);​
strcat(temp, pos + strlen(oldWord));​
strcpy(str, temp);​
}​
}​

int main() {​
char str[100] = "Hello Beautiful World";​
replaceWord(str, "Beautiful", "Wonderful");​
printf("Modified String: %s\n", str);​
return 0;​
}​

Output:​
Modified String: Hello Wonderful World

5. Formatting Text (Capitalization, Lowercase, Uppercase)

Text formatting operations include converting text to uppercase, lowercase, or


capitalizing words.

Example (C Program) - Convert to Uppercase

#include <stdio.h>​
#include <ctype.h>​

int main() {​
char str[] = "hello world";​
int i;​

for (i = 0; str[i] != '\0'; i++)​
str[i] = toupper(str[i]);​

printf("Uppercase String: %s\n", str);​
return 0;​
}​

Output:​
Uppercase String: HELLO WORLD

The toupper() function converts lowercase letters to uppercase.

Word Processing in Real Applications

Modern word processing software performs various functions:

●​ Microsoft Word: Offers document editing, spell check, and text formatting.
●​ Notepad/Text Editors: Basic text editing without formatting.
●​ Google Docs: Online document processing with cloud storage.
●​ LaTeX: Used for scientific documents with complex formatting.

Comparison of String Operations in Word Processing

Operation       Function Used in C             Example Usage
Insert Text     strcat(), manual shifting      Add "Good" before "Morning"
Delete Text     memmove(), manual shifting     Remove "is" from "This is text"
Search Text     strstr()                       Find "C" in "C Programming"
Replace Text    strstr() + strcpy()            Change "bad" to "good"
Convert Case    toupper(), tolower()           Convert "hello" to "HELLO"

Applications of Word Processing

1.​ Document Editing - Creating reports, books, and articles.


2.​ Data Entry - Filling forms and records efficiently.
3.​ Text Analysis - Searching for patterns in text.
4.​ Printing Documents - Formatting text for proper printing.
5.​ Spell Checking & Grammar Correction - Used in applications like MS Word.

Simple C program to copy one string into another without using built-in functions like
strcpy():

#include <stdio.h>

void copyString(char dest[], char src[]) {
    int i = 0;
    while (src[i] != '\0') { // Copy characters until null terminator
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0'; // Append null character at the end
}

int main() {
    char source[100], destination[100];
    int i;

    // Input the source string (fgets, unlike gets, cannot overflow the buffer)
    printf("Enter a string: ");
    if (fgets(source, sizeof(source), stdin) == NULL)
        return 1;

    // Remove the newline that fgets keeps at the end of the input
    for (i = 0; source[i] != '\0'; i++) {
        if (source[i] == '\n') {
            source[i] = '\0';
            break;
        }
    }

    // Copy the string manually
    copyString(destination, source);

    // Print the copied string
    printf("Copied String: %s\n", destination);

    return 0;
}

Explanation

1.​ The function copyString(char dest[], char src[]):


○​ Loops through each character in src.
○​ Copies each character into dest.
○​ Ends the copied string with \0 (null terminator).
2.​ Inside main():
○​ The user inputs a string (source).
○​ The function copyString(destination, source); is called to copy the
string.
○​ The copied string is displayed.

Sample Output
Enter a string: Hello World​
Copied String: Hello World

3.7 Pattern Matching Algorithms


Pattern matching is the process of determining whether a string P (pattern) appears in
a string T (text). The length of P should not exceed the length of T.

During pattern matching:

● Characters may be represented as lowercase letters.
● Exponents may denote repetition, e.g.:
○ a²b³a²b → "aabbbaab"
○ (cd)³ → "cdcdcd"
● Concatenation of strings is denoted as XY or X-Y.
● Empty strings are denoted by λ (lambda).

First Pattern Matching Algorithm

●​ The brute-force approach.


●​ Compares P with each substring of T from left to right.
●​ Stops when a match is found or all substrings are checked.

Steps:

1.​ Extract each substring of T with length equal to P.


2.​ Compare each substring with P.
3.​ If a match is found, return the index.
4.​ If no match, return 0.

Example:​
Let P = "abcd" and T = "abcdefabcd".

We extract substrings of T, each of length equal to P:

W1 = "abcd"​
W2 = "bcde"​
W3 = "cdef"​
W4 = "defa"​
W5 = "efab"​
W6 = "fabc"​
W7 = "abcd" (match found at index 7)

The maximum shifts required to check all possible positions is calculated as:

MAX = length(T) - length(P) + 1

Substituting values:

MAX = 10 - 4 + 1 = 7

So, a maximum of 7 shifts is needed to check all possible occurrences of P in T.

Algorithm : Brute-Force Pattern Matching


1. Initialize: Set K = 1 and MAX = S - R + 1, where R = length(P) and S = length(T).
2. Repeat while K ≤ MAX:
○ Compare P with W_K, the substring of T of length R that starts at position K.
○ If they match, return INDEX = K.
○ Otherwise, set K = K + 1 and move to the next substring.
3. If no match is found: Return INDEX = 0.

C Implementation (Program 3.6)


#include <stdio.h>
#include <string.h>

int main(void)
{
    char P[80] = "bab";
    char T[80] = "aabbbabb";
    int R, S, K, L, MAX, INDEX;

    R = strlen(P);
    S = strlen(T);
    K = 0;
    MAX = S - R;          /* last 0-based position where P can start */
    INDEX = -1;           /* -1 means "not found" */

    while (K <= MAX)
    {
        for (L = 0; L < R; L++)
            if (P[L] != T[K + L])
                break;

        if (L == R)       /* all R characters matched */
        {
            INDEX = K;
            break;
        }
        K = K + 1;
    }

    printf("P = %s\n", P);
    printf("T = %s\n", T);

    if (INDEX != -1)
        printf("Index of P in T is %d\n", INDEX);
    else
        printf("P does not exist in T\n");

    return 0;
}

Output:​
P = bab​
T = aabbbabb​
Index of P in T is 4

Complexity Analysis

Number of Comparisons (C)

The total number of comparisons needed before finding a match is:

○ C = N₁ + N₂ + ... + N_L

where N_K is the number of comparisons made against the substring W_K, and L is the
position where the first match occurs.

Worst Case Scenario

The worst case happens when all characters match except the last one at each shift,
making the algorithm check almost every position in the text.

The number of comparisons in the worst case is:

○​ C(n) = r * (s - r + 1)

where:

●​ r is the length of the pattern


●​ s is the length of the text

The time complexity is:

○ O(n²)

where n = r + s is the combined length of the pattern and the text. This means that
as the inputs get longer, the number of comparisons can grow quadratically, making
this method slow for large texts.

Second Pattern Matching Algorithm

●​ Uses a precomputed table to avoid unnecessary comparisons.


●​ Faster than the brute-force approach.
●​ Works by shifting based on previously matched characters.

Key Idea:

●​ If a mismatch occurs, the algorithm does not start over.


●​ Instead, it shifts the pattern based on known character repetitions.

Table Construction (Example)

For P = "aaba":​
Q0 → λ (empty string)​
Q1 → "a"​
Q2 → "aa"​
Q3 → "aab"​
Q4 → "aaba"

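● Rows: The states Q0 to Q3, which are the proper prefixes of P.
● Columns: Possible characters of the text.
● Entries: The state reached after reading that character, namely the longest prefix
of P that is a suffix of the characters read so far.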

Pattern Matching Graph

●​ A directed graph that represents transitions.


●​ Each arrow indicates a state transition.

Optimized Pattern Matching

1.​ Initialize states: Q0 → Qn


2.​ Start at Q0 and read characters
3.​ Follow the state transitions
4.​ If final state is reached, return index

Example 3.12

T = "abcababa", P = "aaba"
States: Q0 → Q1 → Q0 → Q0 → Q1 → Q0 → Q1 → Q0 → Q1
P is NOT found.
T = "abcaabaca", P = "aaba"
States: Q0 → Q1 → Q0 → Q0 → Q1 → Q2 → Q3 → P
P is found at index 3.

C Implementation (Program 3.7)


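#include <stdio.h>
#include <string.h>

int main(void)
{
    char P[] = "aaba";
    char T[] = "abcaabaca";
    /* Transition table for P = "aaba": rows are the states Q0..Q3,
       columns are 'a', 'b', and any other character.
       The entry -1 marks the final state (all of P has been read). */
    int F[4][3] = {
        { 1, 0, 0},    /* Q0 = empty string */
        { 2, 0, 0},    /* Q1 = "a"   */
        { 2, 3, 0},    /* Q2 = "aa"  */
        {-1, 0, 0}     /* Q3 = "aab" */
    };
    int N, K = 0, S = 0, I, INDEX;

    N = strlen(T);

    while (K < N && S != -1)
    {
        if (T[K] == 'a') I = 0;
        else if (T[K] == 'b') I = 1;
        else I = 2;            /* any other character */

        S = F[S][I];
        K = K + 1;
    }

    if (S == -1)
        INDEX = K - (int)strlen(P);
    else
        INDEX = -1;

    printf("P = %s", P);
    printf("\n\nT = %s", T);

    if (INDEX != -1)
        printf("\n\nIndex of P in T is %d", INDEX);
    else
        printf("\n\nP does not exist in T");

    printf("\n");
    return 0;
}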

Output:

P = aaba

T = abcaabaca

Index of P in T is 3

Complexity Analysis

●​ Time Complexity:
○​ Brute-force approach: O(n²)
○​ Optimized approach: O(n) (linear time)
●​ Conclusion: The second algorithm is more efficient.
