3. String Processing

STRING PROCESSING

GORA (CSC 202)


S.P CONT...

Strings are sequences of characters and are often used to store human-readable text, such as words or sentences. For example, "Hello World" is a string of characters.


The string data structure is a backbone of programming languages and a building block of communication. Strings are among the most fundamental and widely used tools in computer science and programming. They allow text and character sequences to be represented and manipulated in a variety of ways.


The string data structure is a powerful tool that can be used to store and process large amounts of text data, from single words to sentences, paragraphs, and even entire books.


A string is a sequence of characters that represents text or other forms of data. It is a fundamental data structure that many programming languages use to store and manipulate text-based data.

In many programming languages, strings are implemented as arrays of characters, with each character occupying a unique index position within the array.
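
As a small illustration of index-based access, the sketch below reads one character from a C++ std::string by its zero-based position. The helper name char_at and the choice to return '\0' for an out-of-range index are assumptions made for this example, not part of the slides.

```cpp
#include <cstddef>
#include <string>

// char_at is a hypothetical helper used only for illustration.
// Returns the character at a zero-based index, or '\0' if the index
// lies outside the string.
char char_at(const std::string& s, std::size_t index) {
    return index < s.size() ? s[index] : '\0';
}
```

For "Hello World", index 0 holds 'H', index 4 holds 'o', and index 6 holds 'W' (the space sits at index 5).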



STRING REPRESENTATION



APPLICATIONS OF STRING



1. PLAGIARISM CHECKER

String matching algorithms can be used to detect plagiarism in code and written content quickly. With them, a computer can report the percentage by which the code or text written by any two users matches.
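
One very simple way to put a number on "how much two texts match" is to compare their sets of words. The sketch below is an assumption for illustration only: it uses the Jaccard ratio (shared words divided by all distinct words), and the names word_set and similarity_percent are hypothetical. Real plagiarism checkers use considerably more robust matching.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Splits text on whitespace and collects the distinct words.
std::set<std::string> word_set(const std::string& text) {
    std::set<std::string> words;
    std::istringstream in(text);
    std::string word;
    while (in >> word) words.insert(word);
    return words;
}

// Jaccard similarity of the two word sets, expressed as a percentage.
// Two empty texts are treated as identical.
double similarity_percent(const std::string& a, const std::string& b) {
    const std::set<std::string> wa = word_set(a);
    const std::set<std::string> wb = word_set(b);
    if (wa.empty() && wb.empty()) return 100.0;
    std::size_t shared = 0;
    for (const std::string& w : wa) {
        if (wb.count(w) > 0) ++shared;
    }
    const std::size_t total = wa.size() + wb.size() - shared;
    return 100.0 * static_cast<double>(shared) / static_cast<double>(total);
}
```

This measure ignores word order and punctuation, so it only hints at overlap; it is not evidence of copying on its own.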



2. ENCODING/DECODING (CIPHER TEXT GENERATION)
Strings can be encoded and decoded for the safe transfer of data from sender to receiver, so that no one along the transmission path can read the data; an eavesdropper could otherwise mount passive attacks (reading the data) or active attacks (altering it). The message text is enciphered at the sender's end and deciphered at the receiver's end.
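
To make the encipher/decipher round trip concrete, here is a minimal sketch using a Caesar shift. The cipher is chosen here purely for illustration (the slides do not name one) and it offers no real security; caesar_shift is a hypothetical function name.

```cpp
#include <string>

// Shifts each ASCII letter by `shift` positions, wrapping around the
// alphabet. Non-letters are left unchanged. Illustration only: a Caesar
// shift is trivially breakable and must not be used to protect data.
std::string caesar_shift(const std::string& text, int shift) {
    std::string out = text;
    const int k = ((shift % 26) + 26) % 26;  // normalise to 0..25
    for (char& c : out) {
        if (c >= 'a' && c <= 'z') {
            c = static_cast<char>('a' + (c - 'a' + k) % 26);
        } else if (c >= 'A' && c <= 'Z') {
            c = static_cast<char>('A' + (c - 'A' + k) % 26);
        }
    }
    return out;
}
```

The sender would call caesar_shift(message, 3) and the receiver caesar_shift(cipher_text, -3) to recover the original text.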


3. INFORMATION RETRIEVAL

String matching and retrieval modules help us retrieve important information from unfamiliar data sources, such as large datasets supplied as input.



4. IMPROVED FILTERS FOR THE APPROXIMATE SUFFIX-PREFIX OVERLAP PROBLEM

Strings and their algorithms help provide improved filters for the approximate suffix-prefix overlap problem. The approximate suffix-prefix overlap problem is to find all pairs of strings from a given set such that a prefix of one string is similar to a suffix of the other.
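
The overlap test for a single pair of strings can be sketched with a brute-force check. This example assumes "similar" means "differs in at most a given number of positions" (Hamming distance); the slides do not fix a similarity measure, and the function name approx_overlap and its parameters are hypothetical. It is not one of the improved filters, only the basic check such filters are meant to speed up.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Returns the length of the longest suffix of `a` that matches a prefix
// of `b` with at most `max_mismatches` differing positions, considering
// only overlaps of at least `min_len` characters. Returns 0 if none.
std::size_t approx_overlap(const std::string& a, const std::string& b,
                           std::size_t max_mismatches,
                           std::size_t min_len = 1) {
    const std::size_t longest = std::min(a.size(), b.size());
    for (std::size_t len = longest; len >= min_len && len > 0; --len) {
        std::size_t mismatches = 0;
        for (std::size_t i = 0; i < len && mismatches <= max_mismatches; ++i) {
            if (a[a.size() - len + i] != b[i]) ++mismatches;
        }
        if (mismatches <= max_mismatches) return len;
    }
    return 0;
}
```

A sensible min_len matters in practice: with mismatches allowed, very short overlaps match almost anything.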
5. NETWORK COMMUNICATION
Strings are used to encode and decode data sent over networks, such as HTTP requests and responses.
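
As an example of decoding network text, the first line of an HTTP request consists of a method, a target, and a version separated by spaces. The sketch below splits such a line; the RequestLine struct and parse_request_line function are hypothetical names for this illustration, and the code does no validation beyond counting fields.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the three fields of an HTTP request line.
struct RequestLine {
    std::string method;
    std::string target;
    std::string version;
};

// Splits a line such as "GET /index.html HTTP/1.1" on whitespace.
// Returns false if fewer than three fields are present.
bool parse_request_line(const std::string& line, RequestLine& out) {
    std::istringstream in(line);
    return static_cast<bool>(in >> out.method >> out.target >> out.version);
}
```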



6. FILE HANDLING

Strings are used to manipulate file paths and names, and to read and write files.
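
Path manipulation is mostly substring work. The sketch below pulls the file name and extension out of a path, assuming '/' as the separator; file_name and extension are hypothetical helpers written for this example. In C++17 and later, std::filesystem::path provides equivalent functionality portably.

```cpp
#include <cstddef>
#include <string>

// Returns the part of the path after the last '/', or the whole path
// if it contains no '/'.
std::string file_name(const std::string& path) {
    const std::size_t slash = path.find_last_of('/');
    return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Returns the extension including the leading dot, or an empty string.
// A name that only starts with a dot (such as ".bashrc") is treated as
// having no extension.
std::string extension(const std::string& path) {
    const std::string name = file_name(path);
    const std::size_t dot = name.find_last_of('.');
    if (dot == std::string::npos || dot == 0) return "";
    return name.substr(dot);
}
```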



7. DATA ANALYSIS
Strings can be used to extract meaningful insights from large amounts of text data, for example through natural language processing and sentiment analysis.
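
A common first step in text analysis is counting how often each word occurs. The following is a minimal sketch under simple assumptions: words are runs of ASCII letters, and case is ignored. The function name word_counts is hypothetical.

```cpp
#include <cctype>
#include <map>
#include <string>

// Counts occurrences of each word, where a word is a maximal run of
// ASCII letters. Letters are lower-cased so "Good" and "good" match.
std::map<std::string, int> word_counts(const std::string& text) {
    std::map<std::string, int> counts;
    std::string word;
    for (unsigned char c : text) {
        if (std::isalpha(c)) {
            word += static_cast<char>(std::tolower(c));
        } else if (!word.empty()) {
            ++counts[word];
            word.clear();
        }
    }
    if (!word.empty()) ++counts[word];  // flush the final word
    return counts;
}
```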



REAL-TIME APPLICATION OF STRING



1. SPAM DETECTION
String matching algorithms can serve as the basis of a spam detection system. Spam (unwanted email) can cause great financial loss. Many spam filters use string matching to identify and discard spam.
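
The keyword-matching idea can be sketched as follows. The keyword list, the scoring rule (one point per keyword found), and the names to_lower and spam_score are all assumptions for this example; production spam filters combine many more signals.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Returns a lower-cased copy of s (ASCII only).
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Counts how many of the given keywords occur in the message,
// ignoring case. A higher score suggests the message is more spam-like.
int spam_score(const std::string& message,
               const std::vector<std::string>& keywords) {
    const std::string lowered = to_lower(message);
    int hits = 0;
    for (const std::string& keyword : keywords) {
        if (lowered.find(to_lower(keyword)) != std::string::npos) ++hits;
    }
    return hits;
}
```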



2. BIOINFORMATICS
Strings can be used in the field of bioinformatics (DNA sequencing). String matching can be used to solve problems involving genetic sequences and to find patterns in DNA.

Intrusion detection systems: Strings can also be used in intrusion detection systems. Packets that contain intrusion-related keywords are found by applying string matching algorithms.
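
Both the DNA and the intrusion-detection cases come down to locating every occurrence of a pattern in a longer text. A minimal sketch using std::string::find is shown below; find_all is a hypothetical helper name, and specialised algorithms are normally preferred for very large inputs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the starting index of every occurrence of `pattern` in `text`,
// including overlapping ones. An empty pattern yields no matches.
std::vector<std::size_t> find_all(const std::string& text,
                                  const std::string& pattern) {
    std::vector<std::size_t> positions;
    if (pattern.empty()) return positions;
    for (std::size_t pos = text.find(pattern); pos != std::string::npos;
         pos = text.find(pattern, pos + 1)) {
        positions.push_back(pos);
    }
    return positions;
}
```

For the sequence "GATTACAGATTACA" and the pattern "ATTA", the matches start at indices 1 and 8.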
3. SEARCH ENGINES
Strings are used in many search engine techniques. Most of the data available on the internet is textual. Because of the huge amount of uncategorized text, it is difficult to search for particular content. Web search engines organize and categorize this data using string matching algorithms.


4. OPERATIONS ON STRING
A string type provides users with various operations. The names below match C++'s std::string and the <algorithm> header; other languages offer equivalents under different names. Some of the important ones are:

size(): Returns the length of the string.

substr(): Returns a substring of a given length starting from a given index.

+: This operator concatenates two strings.

s1.compare(s2): Compares two strings s1 and s2 lexicographically. The result is negative if s1 comes first, zero if they are equal, and positive if s1 comes after s2.

reverse(): Reverses a given string. In C++ this is the std::reverse algorithm applied to the string's iterators rather than a member function of the string.

sort(): Sorts the characters of the string in ascending order. In C++ this is the std::sort algorithm, again applied to the string's iterators.
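
The operations listed above can be exercised together in a few lines. The sketch below assumes C++ and wraps the calls in a hypothetical function, operations_demo, that returns true when every operation behaves as described; the sample strings are arbitrary.

```cpp
#include <algorithm>
#include <string>

// Exercises each listed operation on two sample strings and reports
// whether all of them produced the expected result.
bool operations_demo() {
    std::string s1 = "apple";
    std::string s2 = "banana";
    bool ok = true;

    ok = ok && s1.size() == 5;             // size(): length of the string
    ok = ok && s2.substr(2, 3) == "nan";   // substr(index, length)
    ok = ok && s1 + s2 == "applebanana";   // +: concatenation
    ok = ok && s1.compare(s2) < 0;         // negative: "apple" sorts first

    std::reverse(s1.begin(), s1.end());    // reverses s1 in place
    ok = ok && s1 == "elppa";

    std::sort(s2.begin(), s2.end());       // sorts the characters of s2
    ok = ok && s2 == "aaabnn";

    return ok;
}
```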
ADVANTAGES OF STRING



1. TEXT PROCESSING

Strings are used to represent text in programming languages. They can be used to manipulate and process text in various ways, such as searching, replacing, parsing, and formatting.



2. DATA REPRESENTATION
Strings can be used to represent other data types, such as numbers, dates, and times. For example, you can use a string to represent a date in the format “YYYY-MM-DD”, or a time in the format “HH:MM:SS”.
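
Turning such a string back into numbers is a matter of checking its shape and converting the pieces. The sketch below parses the "YYYY-MM-DD" form; the Date struct and parse_iso_date function are hypothetical names, and the range check is deliberately loose (it does not know how many days each month has).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical holder for the parsed fields.
struct Date {
    int year;
    int month;
    int day;
};

// Parses a date written as "YYYY-MM-DD". Returns false if the layout is
// wrong or the month/day fall outside 1..12 and 1..31 respectively.
bool parse_iso_date(const std::string& s, Date& out) {
    if (s.size() != 10 || s[4] != '-' || s[7] != '-') return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (i == 4 || i == 7) continue;  // separator positions
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
    }
    out.year = std::stoi(s.substr(0, 4));
    out.month = std::stoi(s.substr(5, 2));
    out.day = std::stoi(s.substr(8, 2));
    return out.month >= 1 && out.month <= 12 && out.day >= 1 && out.day <= 31;
}
```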



3. EASE OF USE
Strings are easy to use and manipulate. They can be concatenated, sliced, and reversed, among other things. They also have a simple and intuitive syntax, making them accessible to programmers of all skill levels.



4. COMPATIBILITY
Strings are widely used across programming languages, making them a near-universal data type. This means that strings can be transferred between different systems and platforms with little effort, provided both sides agree on the character encoding, which makes them a convenient way to communicate and share data.



5. MEMORY EFFICIENCY
Strings are usually stored in a contiguous block of memory, which makes them efficient to allocate, de-allocate, and scan. The storage overhead per character is small, so long texts can be represented compactly.



DISADVANTAGES OF STRING



1. MEMORY CONSUMPTION

Strings can consume a lot of memory, especially when working with large strings or many strings. This can be a problem in memory-constrained environments, such as embedded systems or mobile devices.
2. IMMUTABILITY
In many programming languages, strings are
immutable, meaning that they cannot be changed
once they are created. This can be a disadvantage
when working with large or complex strings that
require frequent modifications, as it can lead to
inefficiencies and memory overhead.
3. PERFORMANCE OVERHEAD
String operations can be slower than operations on other data types, especially when working with large or complex strings. This is because string operations often involve copying and reallocating memory, which can be time-consuming.
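
One common way to limit the copying and reallocation cost in C++ is to reserve the final capacity once and then append in place. The sketch below builds a repeated string this way; repeat is a hypothetical helper written for this example, and the actual saving depends on the standard library implementation.

```cpp
#include <cstddef>
#include <string>

// Builds `piece` repeated `count` times. Reserving the full capacity
// up front lets the appends proceed without the buffer being regrown
// and copied along the way.
std::string repeat(const std::string& piece, std::size_t count) {
    std::string result;
    result.reserve(piece.size() * count);
    for (std::size_t i = 0; i < count; ++i) {
        result += piece;  // appends in place; no new string per step
    }
    return result;
}
```

In languages with immutable strings, the analogous approach is a dedicated builder or buffer type rather than repeated concatenation.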



4. ENCODING AND DECODING OVERHEAD
Strings can have different character encodings, which can lead to overhead when converting between them. This can be a problem when working with data from different sources or when communicating with systems that use different encodings.
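
A concrete consequence of encodings is that the number of bytes in a string need not equal the number of characters. The sketch below counts code points in UTF-8 text by skipping continuation bytes; it assumes the input is valid UTF-8, and utf8_code_points is a hypothetical function name.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 encoded string by counting every
// byte that is not a continuation byte (bit pattern 10xxxxxx).
// Assumes the input is well-formed UTF-8; no validation is performed.
std::size_t utf8_code_points(const std::string& bytes) {
    std::size_t count = 0;
    for (unsigned char c : bytes) {
        if ((c & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

For example, "café" encoded in UTF-8 occupies 5 bytes (size() returns 5) but contains 4 code points, because "é" takes two bytes.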



5. SECURITY VULNERABILITIES

Strings can introduce security vulnerabilities, such as buffer overflows or injection attacks, if they are not handled properly. This is because attackers can craft string input to execute arbitrary code or access sensitive data.
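
One standard defence against injection is to escape special characters before untrusted text is placed into another language's syntax. The sketch below does this for HTML; escape_html is a hypothetical helper, the set of escaped characters is a common minimal choice, and real applications should prefer a well-tested library for the target context.

```cpp
#include <string>

// Replaces the characters that have special meaning in HTML with their
// entity equivalents, so untrusted text is displayed as text rather than
// interpreted as markup.
std::string escape_html(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    for (char c : input) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&#39;";  break;
            default:   out += c;        break;
        }
    }
    return out;
}
```

Buffer overflows are a separate concern; in C++ they are largely avoided by using std::string, which manages its own storage, instead of fixed-size character arrays.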

