
A

SEMINAR ON

NATURAL LANGUAGE PROCESSING: THEORY AND APPLICATION

PRESENTED

BY

OLALOTI MOSES OLAJIDE

2203046

SUBMITTED TO:

DEPARTMENT OF COMPUTER SCIENCE, EKITI STATE POLYTECHNIC, ISAN-EKITI

SUPERVISED BY:

MR. OBASADE
IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE AWARD OF
NATIONAL DIPLOMA (ND) IN COMPUTER SCIENCE

SEPTEMBER, 2024

TABLE OF CONTENTS

CHAPTER ONE
INTRODUCTION
1.1 Overview of Natural Language Processing
1.2 Core Concepts of Natural Language Processing

CHAPTER TWO
THEORIES AND APPLICATION IN NATURAL LANGUAGE PROCESSING
2.1 How Natural Language Processing (NLP) Works
2.2 Techniques of Natural Language Processing (NLP)
2.2.1 Traditional Machine Learning NLP Techniques
2.2.2 Deep Learning NLP Techniques
2.3 Important Natural Language Processing (NLP) Models
2.4 Applications of Natural Language Processing (NLP)
Sentiment Analysis
Toxicity Classification
Named Entity Recognition
Text Generation

CHAPTER THREE
CONCLUSION
REFERENCES
CHAPTER ONE

INTRODUCTION
1.1 Overview of Natural Language Processing
Whether it's Alexa, Siri, Google Assistant, Bixby, or Cortana, everyone with a smartphone or
smart speaker has a voice-activated assistant nowadays. Every year, these voice assistants
seem to get better at recognizing and executing the things we tell them to do. But have you
ever wondered how these assistants process the things we're saying? They manage to do this
thanks to Natural Language Processing, or NLP.

Natural Language Processing (NLP) is one of the hottest areas of artificial intelligence (AI)
thanks to applications like text generators that compose coherent essays, chatbots that fool
people into thinking they’re sentient, and text-to-image programs that produce photorealistic
images of anything you can describe. Recent years have brought a revolution in the ability of
computers to understand human languages, programming languages, and even biological and
chemical sequences, such as DNA and protein structures, that resemble language. The latest
AI models are unlocking these areas to analyze the meanings of input text and generate
meaningful, expressive output. (deeplearning, 2023).

Natural Language Processing (NLP) is a subfield of artificial intelligence (AI) that focuses on
the interaction between computers and human languages. The goal of NLP is to enable
machines to understand, interpret, and respond to human languages in a way that is both
meaningful and useful. As humans increasingly rely on digital technologies, the demand for
machines to effectively process and interpret large volumes of textual and spoken language
data has grown. NLP bridges this gap by applying various computational techniques to
analyze and understand human language (Rohit Kumar Yadav et al., 2024).

NLP encompasses various tasks, including but not limited to text classification, machine
translation, question-answering systems, speech recognition, and sentiment analysis. The
field covers a broad spectrum, from simple keyword-matching systems to advanced deep-
learning algorithms capable of understanding the nuances and complexities of human
language.

The journey of NLP can be traced back to the 1950s when researchers began exploring how
computers could understand language. Early systems were largely rule-based and involved
manually encoding grammatical rules and vocabulary. For instance, one of the earliest efforts
was the development of the Georgetown-IBM experiment in 1954, which involved the
automatic translation of Russian sentences into English. Over the next few decades,
advancements were made with the introduction of statistical models in the 1980s and
machine learning algorithms in the 1990s. However, the major leap in NLP came with the
advent of deep learning techniques and large datasets, which allowed for more accurate and
scalable language models. The introduction of neural networks, particularly the Transformer
architecture, revolutionized the field, enabling state-of-the-art performance in various NLP
tasks (Vaswani et al., 2017).

In essence, NLP seeks to address one of the most profound challenges in computer science:
enabling machines to process and generate human language in a way that mimics human
understanding. This challenge stems from the intricacies of human language, such as syntax
(structure), semantics (meaning), and pragmatics (context). Each of these linguistic
components plays a role in how humans communicate, and NLP aims to model these
complexities computationally.

As technology continues to evolve, the importance of NLP in modern-day applications cannot be overstated. Today, NLP powers a vast array of applications, including voice assistants like Siri and Alexa, chatbots for customer service, automated translation tools such as Google Translate, and even systems for detecting fraudulent activities by analyzing textual patterns in transaction logs. The rise of Big Data has further increased the need for NLP to process and analyze unstructured text data at scale, making it a critical component in areas such as healthcare, finance, and education.
Given the rapid advancement in both NLP theory and application, the field has the potential
to reshape industries by improving machine-human interaction, enhancing information
retrieval systems, and making language-based data more accessible. Through continuous
research and development, NLP is helping to create smarter, more intuitive technologies that
can understand and respond to the complexity of human communication.

1.2 Core Concepts of Natural Language Processing


Natural Language Processing (NLP) is built upon several fundamental linguistic concepts
that govern the way human language is structured and understood. To develop computational
models that can process and interpret language, it is essential to grasp these core concepts:
syntax, semantics, and pragmatics. Each of these plays a critical role in shaping how
machines analyze and generate human language (deeplearning, 2023).

1.2.1 Syntax
Syntax refers to the structure of language, which dictates how words are arranged to form
grammatically correct sentences. In human languages, each word has a specific part of speech
—nouns, verbs, adjectives, etc.—and the rules of syntax govern the permissible combinations
of these parts. For instance, in English, a basic sentence follows a Subject-Verb-Object (SVO)
structure, as in “The cat (subject) eats (verb) fish (object).”

In NLP, syntactic analysis is a crucial step in understanding the structure of a sentence. Techniques like Part-of-Speech (POS) tagging and parsing are used to break down sentences into their syntactic components. POS tagging involves identifying the grammatical category of each word in a sentence, while parsing maps out the grammatical relationships between words, often represented through constituency or dependency parsing. These methods help machines understand how words are related to each other in a sentence, an essential task for applications like machine translation and text generation.
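To make this concrete, here is a minimal sketch of POS tagging and dependency parsing using the spaCy library; the model name and example sentence are illustrative assumptions, not part of the original text.

```python
# A minimal sketch of POS tagging and dependency parsing with spaCy.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The cat eats fish")

for token in doc:
    # token.pos_ is the part-of-speech tag; token.dep_ is the dependency
    # relation; token.head is the word this token attaches to.
    print(f"{token.text:<6} POS={token.pos_:<6} dep={token.dep_:<6} head={token.head.text}")
```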

1.2.2 Semantics
While syntax focuses on structure, semantics is concerned with meaning. It seeks to capture
the meaning of words, phrases, and sentences, allowing a machine to understand the context
in which language is used. One major challenge in NLP is that words often have multiple
meanings, known as polysemy. For example, the word “bank” can refer to the edge of a river
or a financial institution, depending on the context.
In traditional NLP systems, meaning is often captured through techniques like Word Sense
Disambiguation (WSD), which involves identifying the correct meaning of a word based on
its context. However, modern NLP systems use deep learning models to represent word
meanings more effectively. These models generate word embeddings, such as Word2Vec or
GloVe, which represent words as continuous vectors in a high-dimensional space. This allows
words with similar meanings to be placed closer together in this vector space, thereby
capturing semantic similarity. More advanced models, like BERT (Bidirectional Encoder
Representations from Transformers), capture contextualized word meanings by analyzing
entire sentences, rather than words in isolation (Devlin et al., 2019).
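As a rough illustration of how embeddings capture semantic similarity, here is a sketch that trains a tiny Word2Vec model with the gensim library; the toy corpus and hyperparameters are assumptions chosen only for demonstration.

```python
# Sketch: training a tiny Word2Vec model with gensim to show how embeddings
# place related words near each other. The toy corpus is illustrative only;
# real embeddings are trained on millions of sentences.
from gensim.models import Word2Vec

sentences = [
    ["the", "bank", "approved", "the", "loan"],
    ["she", "deposited", "cash", "at", "the", "bank"],
    ["they", "walked", "along", "the", "river", "bank"],
]
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, seed=1)

# Cosine similarity between two word vectors: higher values mean the words
# appeared in more similar contexts during training.
print(model.wv.similarity("bank", "loan"))
```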

1.2.3 Pragmatics
Pragmatics goes beyond syntax and semantics by focusing on the use of language in different
contexts. Human language is highly dependent on context, which includes the speaker's
intentions, the surrounding conversation, and even the social and cultural background.
Understanding pragmatics is critical for machines to engage in more natural conversations.

For example, the sentence “Can you pass the salt?” is a request for action in a dining setting,
despite being framed as a yes/no question. Pragmatic understanding would allow an NLP
system, such as a conversational agent, to respond appropriately to such requests. Tasks like
coreference resolution, which determines when different words refer to the same entity (e.g.,
“John” and “he”), are essential for pragmatics. Additionally, tasks such as sentiment
analysis, which gauges the emotional tone of a text, and speech act recognition, which
identifies the speaker’s intent (e.g., question, command, statement), fall under the realm of
pragmatics.

1.2.4 Linguistic Challenges in NLP


One of the primary challenges in NLP is dealing with linguistic phenomena like ambiguity,
polysemy, and context dependency. Human languages are inherently ambiguous. A single
sentence can have multiple interpretations based on its syntactic structure or word meanings.
For instance, the sentence “Visiting relatives can be exhausting” could mean that either
visiting one's relatives is tiring, or that relatives who are visiting are exhausting. Resolving
such ambiguities is a significant challenge in NLP and often requires advanced algorithms
and a deeper understanding of both syntax and semantics.

Another issue is polysemy, where words carry multiple meanings depending on the context.
Additionally, context dependency poses a problem since understanding a sentence often
requires knowledge of the broader discourse. For instance, in a conversation about weather,
the pronoun "it" in "It is raining" refers to the weather, but in a different context, "it" could
mean something entirely different. NLP systems need to be designed to capture such nuances
to avoid misinterpretation (Daniel Jurafsky & James H. Martin, 2018).

1.2.5 NLP vs. NLU and NLG


While NLP is a broad field encompassing all aspects of language processing, two closely
related subfields are Natural Language Understanding (NLU) and Natural Language
Generation (NLG). NLU focuses on interpreting and making sense of human language
input. It involves the ability to parse, understand, and infer meaning from language. On the
other hand, NLG involves producing human-like language from a machine. This could be as
simple as generating an automated response to a query or as complex as creating a long-form
essay or conversation based on input data.

Together, NLU and NLG form the foundation of many modern applications such as chatbots,
automated translation systems, and virtual assistants. They are fundamental in making
machines not just process language but also interact with humans in a meaningful and
coherent way (Devlin et al., 2019).
CHAPTER TWO

THEORIES AND APPLICATION IN NATURAL LANGUAGE PROCESSING


2.1 How Natural Language Processing (NLP) Works
NLP models work by finding relationships between the constituent pieces of language, for instance, the letters, words, and sentences found in a text dataset. NLP systems use different methods for data preprocessing, feature extraction, and modeling (Vicente, 2020). Some of these processes are:

Data preprocessing: Before a model processes text for a specific task, the text often needs to
be pre-processed to improve model performance or to turn words and characters into a format
the model can understand. Data-centric AI is a growing movement that prioritizes data
preprocessing. Various techniques may be used in this data preprocessing.
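As an illustration, here is a minimal preprocessing sketch using NLTK, covering lowercasing, tokenization, stopword removal, and stemming; the library choice and example sentence are assumptions, since the text does not prescribe specific tools.

```python
# Sketch of common preprocessing steps: lowercasing, tokenization, stopword
# removal, and stemming, using NLTK (assumed installed; the punkt and
# stopwords resources are downloaded on first run).
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)

text = "The cats are eating the fish near the river bank."
tokens = word_tokenize(text.lower())                 # lowercase, then tokenize
stop_words = set(stopwords.words("english"))
content = [t for t in tokens if t.isalpha() and t not in stop_words]
stemmer = PorterStemmer()
print([stemmer.stem(t) for t in content])            # e.g. ['cat', 'eat', 'fish', ...]
```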

Feature extraction: Most conventional machine-learning techniques work on features, generally numbers that describe a document relative to the corpus that contains it, created by either Bag-of-Words, TF-IDF, or generic feature engineering such as document length, word polarity, and metadata (for instance, if the text has associated tags or scores). More recent techniques include Word2Vec, GloVe, and learning the features during the training process of a neural network.
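A short sketch of Bag-of-Words and TF-IDF feature extraction with scikit-learn follows; the three toy documents are illustrative assumptions.

```python
# Sketch: Bag-of-Words and TF-IDF feature extraction with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "the cat eats fish",
    "the dog chases the cat",
    "fish swim in the river",
]
bow = CountVectorizer().fit_transform(docs)    # raw term counts per document
tfidf = TfidfVectorizer().fit_transform(docs)  # counts reweighted by term rarity
print(bow.shape, tfidf.shape)                  # both (3, vocabulary_size)
```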
2.2 Techniques of Natural Language Processing (NLP)
A dozen general techniques can model most common NLP tasks, such as those discussed later in this chapter. It's helpful to think of these techniques in two categories: traditional machine learning methods and deep learning methods (Devlin et al., 2019).

2.2.1 Traditional Machine Learning NLP techniques:


Logistic regression is a supervised classification algorithm that aims to predict the
probability that an event will occur based on some input. In NLP, logistic regression models
can be applied to solve problems such as sentiment analysis, spam detection, and toxicity
classification.
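Here is a minimal sketch of sentiment analysis as logistic regression over TF-IDF features using scikit-learn; the tiny labeled dataset is an illustrative assumption.

```python
# Sketch: sentiment classification with TF-IDF features + logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great movie", "terrible plot", "loved every minute", "waste of time"]
labels = [1, 0, 1, 0]  # 1 = positive sentiment, 0 = negative

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
# predict_proba returns [P(negative), P(positive)] for the new review.
print(clf.predict_proba(["a great plot, loved it"])[0])
```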

Naive Bayes is a supervised classification algorithm that finds the conditional probability distribution P(label | text) using the Bayes formula:

P(label | text) = P(label) × P(text | label) / P(text)

The model predicts the label whose posterior probability is highest. The naive assumption in the Naive Bayes model is that the individual words are conditionally independent given the label.
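The following sketch applies Multinomial Naive Bayes over word counts to a spam-detection toy problem with scikit-learn; the example emails are illustrative assumptions.

```python
# Sketch: spam detection with Multinomial Naive Bayes over word counts.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["win money now", "meeting at noon", "free prize claim", "lunch tomorrow?"]
labels = ["spam", "ham", "spam", "ham"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["claim your free money now"]))  # likely ['spam']
```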

Decision trees are supervised classification models that split the dataset based on different
features to maximize information gain in those splits.
2.2.2 Deep Learning NLP Techniques:
Convolutional Neural Network (CNN): The idea of using a CNN to classify text was first
presented in the paper “Convolutional Neural Networks for Sentence Classification” by Yoon
Kim. The central intuition is to see a document as an image. However, instead of pixels, the
input is sentences or documents represented as a matrix of words.

Recurrent Neural Network (RNN): Many deep learning techniques for text classification process nearby words using n-grams or a fixed window (as CNNs do). Such models can treat "New York" as a single instance, but they can't capture the context provided by a longer text sequence: they don't learn the sequential structure of the data, where every word depends on the previous word or on a word in a previous sentence. RNNs remember previous information using hidden states and connect it to the current task. The architectures known as Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) are types of RNNs designed to remember information for an extended period. Moreover, the bidirectional LSTM/GRU keeps contextual information in both directions, which is helpful for text classification. RNNs have also been used to generate mathematical proofs and translate human thoughts into words.
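As a sketch of the bidirectional idea, here is a minimal BiLSTM text classifier in PyTorch; the vocabulary size, layer dimensions, and random inputs are illustrative assumptions, not a prescribed architecture.

```python
# Sketch of a bidirectional LSTM text classifier in PyTorch.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=128, hidden=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)  # 2x: forward + backward states

    def forward(self, token_ids):
        x = self.embed(token_ids)                # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(x)               # h_n: (2, batch, hidden)
        h = torch.cat([h_n[0], h_n[1]], dim=1)   # concatenate both directions
        return self.fc(h)                        # class logits

logits = BiLSTMClassifier()(torch.randint(0, 10_000, (4, 20)))
print(logits.shape)  # torch.Size([4, 2])
```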
Autoencoders are deep learning encoder-decoders that approximate a mapping from X to X, i.e., input = output. They first compress the input features into a lower-dimensional representation (sometimes called a latent code, latent vector, or latent representation) and then learn to reconstruct the input. Because the representation vector can be used as input to a separate model, this technique can be used for dimensionality reduction. Specialists in many fields have adopted autoencoders; geneticists, for example, have applied them to spot mutations associated with diseases in amino acid sequences.
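A minimal autoencoder sketch in PyTorch follows; the input and latent dimensions are illustrative assumptions.

```python
# Sketch of a simple autoencoder: compress input features to a latent
# vector, then reconstruct the input; train by minimizing reconstruction error.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=100, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU())
        self.decoder = nn.Linear(latent_dim, in_dim)

    def forward(self, x):
        z = self.encoder(x)        # latent representation (usable by other models)
        return self.decoder(z), z  # reconstruction approximates the input

x = torch.randn(8, 100)
recon, latent = AutoEncoder()(x)
loss = nn.functional.mse_loss(recon, x)  # reconstruction loss to minimize
print(latent.shape, loss.item())
```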
2.3 Important Natural Language Processing (NLP) Models
Over the years, many NLP models have made waves within the AI community, and some have even made headlines in the mainstream news. The most famous of these have been chatbots and language models. Organizations that adopt such models gain several benefits, including:

 The ability to analyze both structured and unstructured data, such as speech, text messages, and social media posts.

 Improved customer satisfaction and experience, by identifying insights using sentiment analysis.

 The ability to perform large-scale analysis.

 More objective and accurate analysis.

 Streamlined processes and reduced costs.

 A better understanding of your market.

 Empowered employees.

 Real, actionable insights.

Here are some of the most notable models:

 Eliza was developed in the mid-1960s to try to solve the Turing Test; that is, to fool
people into thinking they were conversing with another human being rather than a
machine. Eliza used pattern matching and a series of rules without encoding the
context of the language.

 Tay was a chatbot that Microsoft launched in 2016. It was supposed to tweet like
a teen and learn from conversations with real users on Twitter. The bot adopted
phrases from users who tweeted sexist and racist comments, and Microsoft
deactivated it not long afterward. Tay illustrates some points made by the “Stochastic
Parrots” paper, particularly the danger of not debiasing data.

 BERT and his Muppet friends: Many deep learning models for NLP are named after Muppet characters, including ELMo, BERT, BigBIRD, ERNIE, Kermit, Grover, RoBERTa, and Rosita. Most of these models are good at providing contextual embeddings and enhanced knowledge representation.
 Generative Pre-Trained Transformer 3 (GPT-3) is a 175-billion-parameter model that can write original prose with human-equivalent fluency in response to an input prompt. The model is based on the transformer architecture. The previous version, GPT-2, is open source (see the generation sketch after this list). Microsoft acquired an exclusive license to access GPT-3's underlying model from its developer OpenAI, but other users can interact with it via an application programming interface (API). Several groups, including EleutherAI and Meta, have released open-source alternatives to GPT-3.

 Language Model for Dialogue Applications (LaMDA) is a conversational chatbot developed by Google. LaMDA is a transformer-based model trained on dialogue rather than the usual web text. The system aims to provide sensible and specific responses to conversations. Google developer Blake Lemoine came to believe that LaMDA is sentient. Lemoine had detailed conversations with the AI about its rights and personhood. During one of these conversations, the AI changed Lemoine's mind about Isaac Asimov's third law of robotics. Lemoine claimed that LaMDA was sentient, but the idea was disputed by many observers and commentators. Subsequently, Google placed Lemoine on administrative leave for distributing proprietary information and ultimately fired him.

 Mixture of Experts (MoE): While most deep learning models use the same set of
parameters to process every input, MoE models aim to provide different parameters
for different inputs based on efficient routing algorithms to achieve higher
performance. Switch Transformer is an example of the MoE approach that aims to
reduce communication and computational costs.
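Since GPT-2 is openly available, here is a minimal sketch of prompting it through the Hugging Face transformers library; the prompt and generation settings are illustrative assumptions, and the sampled continuation will vary between runs.

```python
# Sketch: generating text with the open-source GPT-2 via the Hugging Face
# transformers library (assumed installed; weights download on first run).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Natural language processing is", max_new_tokens=30)
print(result[0]["generated_text"])
```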

2.4 Applications of Natural Language Processing (NLP)


Sentiment Analysis
Sentiment analysis classifies the emotional intent of a text, determining whether it expresses positive, negative, or neutral sentiment. Generally, the input to a sentiment classification model is a piece of text, and the output is the probability that the sentiment expressed is positive, negative, or neutral. Typically, this probability is based on hand-generated features, word n-grams, TF-IDF features, or deep learning models that capture sequential long- and short-term dependencies. Sentiment analysis is used to classify customer reviews on various online platforms as well as for niche applications like identifying signs of mental illness in online comments (Pang & Lee, 2008).
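As an illustration, here is a sketch of sentiment classification with a pretrained transformer via the Hugging Face pipeline API; the default model is an assumption and is downloaded on first use.

```python
# Sketch: sentiment classification with a pretrained model (Hugging Face).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("I absolutely loved this movie!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.9998...}]
```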

Toxicity classification
Toxicity classification is a branch of sentiment analysis whose aim is not just to detect hostile intent but to classify particular categories such as threats, insults, obscenities, and hatred toward certain identities. The input to such a model is text, and the output is generally the probability of each class of toxicity. Toxicity classification models can be used to moderate and improve online conversations by silencing offensive comments, detecting hate speech, or scanning documents for defamation.

Named entity recognition
Named entity recognition (NER) aims to extract entities in a piece of text into predefined categories such as personal names, organizations, locations, and quantities. The input to such a model is generally text, and the output is the various named entities along with their start and end positions. Named entity recognition is useful in applications such as summarizing news articles and combating disinformation. For example, given the sentence "Apple opened a new office in Lagos in September 2024," a model would tag "Apple" as an organization, "Lagos" as a location, and "September 2024" as a date, as the sketch below shows.
Spam detection is a prevalent binary classification problem in NLP, where the purpose is to classify emails as either spam or not. Spam detectors take as input an email's text along with other fields such as the subject line and sender's name, and they aim to output the probability that the mail is spam. Email providers like Gmail use such models to provide a better user experience by detecting unsolicited and unwanted emails and moving them to a designated spam folder.

Grammatical error correction models encode grammatical rules to correct the grammar
within text. This is viewed mainly as a sequence-to-sequence task, where a model is trained
on an ungrammatical sentence as input and a correct sentence as output. Online grammar
checkers like Grammarly and word-processing systems like Microsoft Word use such systems
to provide a better writing experience to their customers. Schools also use them to grade
student essays.

Topic modeling is an unsupervised text-mining task that takes a corpus of documents and
discovers abstract topics within that corpus. The input to a topic model is a collection of
documents, and the output is a list of topics that defines words for each topic as well as
assignment proportions of each topic in a document. Latent Dirichlet Allocation (LDA), one
of the most popular topic modeling techniques, tries to view a document as a collection of
topics and a topic as a collection of words. Topic modeling is being used commercially to
help lawyers find evidence in legal documents.
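The following sketch runs LDA with scikit-learn; the four-document corpus and the choice of two topics are illustrative assumptions.

```python
# Sketch: topic modeling with Latent Dirichlet Allocation (scikit-learn).
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "the court ruled on the contract dispute",
    "the judge reviewed the legal evidence",
    "the team won the football match",
    "fans cheered as the striker scored",
]
vec = CountVectorizer(stop_words="english")
counts = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

terms = vec.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_words = [terms[j] for j in topic.argsort()[-4:]]  # 4 highest-weight words
    print(f"Topic {i}: {top_words}")
```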

Text generation
Text generation, more formally known as natural language generation (NLG), produces text that's similar to human-written text. Such models can be fine-tuned to produce text in different genres and formats, including tweets, blogs, and even computer code. Text generation has been performed using Markov processes, LSTMs, BERT, GPT-2, LaMDA, and other approaches. It's particularly useful for autocomplete and chatbots.

 Autocomplete predicts what word comes next, and autocomplete systems of varying
complexity are used in chat applications like WhatsApp. Google uses autocomplete to
predict search queries. One of the most famous models for autocomplete is GPT-2,
which has been used to write articles, song lyrics, and much more.
 Chatbots automate one side of a conversation while a human conversant generally
supplies the other side. They can be divided into the following two categories:
 Database query: We have a database of questions and answers, and we would
like a user to query it using natural language.
 Conversation generation: These chatbots can simulate dialogue with a human partner. Some are capable of engaging in wide-ranging conversations. A high-profile example is Google's LaMDA, which provided such human-like answers to questions that one of its developers was convinced that it had feelings.

Information retrieval finds the documents that are most relevant to a query. This is a problem every search and recommendation system faces. The goal is not to answer a particular query but to retrieve, from a collection of documents that may number in the millions, the set that is most relevant to the query. Document retrieval systems mainly execute two processes: indexing and matching. In most modern systems, indexing is done by a vector space model such as a Two-Tower Network, while matching is done using similarity or distance scores. Google recently integrated its search function with a multimodal information retrieval model that works with text, image, and video data.
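To illustrate the indexing and matching steps, here is a minimal retrieval sketch using TF-IDF vectors and cosine similarity with scikit-learn; the toy corpus is an assumption, and production systems use learned embeddings over far larger collections.

```python
# Sketch: document retrieval via TF-IDF indexing + cosine-similarity matching.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "how to train a neural network",
    "best recipes for fish stew",
    "an introduction to natural language processing",
]
vec = TfidfVectorizer()
doc_vecs = vec.fit_transform(docs)                      # indexing: embed the corpus
query_vec = vec.transform(["neural network training"])  # embed the query
scores = cosine_similarity(query_vec, doc_vecs)[0]      # matching: score each doc
print(docs[scores.argmax()])                            # most relevant document
```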
CHAPTER THREE

CONCLUSION
Natural Language Processing (NLP) makes many jobs easier, but it still demands human intervention. People and industry fear that NLP will start a trend of job displacement, which is true to a certain extent, but NLP certainly cannot function the way it does without human input. Identifying and fixing the loopholes or bugs in a machine remains the task of the human handling it. The advantages of NLP may cause unease in the job market, but right now it is the knight in shining armor of the industry.

After exploring the foundational aspects of Natural Language Processing (NLP), it is clear
that NLP is a critical component in the development of intelligent systems capable of
understanding and generating human language. The first chapter provided an introduction to
NLP, outlining its core concepts such as syntax, semantics, and pragmatics, and highlighting
the field's historical evolution from rule-based systems to modern machine learning
techniques.

Chapter Two delved deeper into the theories and techniques that drive NLP, focusing on the algorithms, models, and methodologies that allow machines to interpret language meaningfully. It examined key approaches like statistical models, deep learning, and neural network architectures, including state-of-the-art models such as BERT and GPT, and explored their applications in tasks like machine translation, text classification, and sentiment analysis, showing how these models have revolutionized the way machines process language.

Summary of Key Insights:

NLP as a Bridge Between Humans and Machines: NLP enables more intuitive and efficient
human-computer interaction, making technology more accessible by allowing users to
communicate in natural language.

Rapid Evolution and Advancements: NLP has evolved from simple rule-based systems to
advanced deep learning models that understand the complexities of language. With
developments like the Transformer architecture, the field continues to push the boundaries of
language understanding.

Broad Applications Across Industries: NLP has transformed industries such as healthcare, finance, customer service, and education. Its ability to process and analyze large volumes of unstructured text has made it indispensable in automating tasks and generating insights.

Challenges and Opportunities: Despite the advancements, challenges such as ambiguity, context-dependency, and polysemy still pose difficulties for NLP models. Addressing these challenges through continued research will enable NLP to become even more sophisticated in handling human language.
REFERENCES
Daniel Jurafsky & James H. Martin. (2018, February 2). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. https://www.researchgate.net/publication/200111340_Speech_and_Language_Processing_An_Introduction_to_Natural_Language_Processing_Computational_Linguistics_and_Speech_Recognition
deeplearning. (2023, January 11). Natural Language Processing (NLP)—A Complete Guide. https://www.deeplearning.ai/resources/natural-language-processing/
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In J. Burstein, C. Doran, & T. Solorio (Eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4171–4186). Association for Computational Linguistics. https://doi.org/10.18653/v1/N19-1423
Pang, B., & Lee, L. (2008). Opinion Mining and Sentiment Analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1–135. https://doi.org/10.1561/1500000011
Rohit Kumar Yadav, Aanchal Madaan, & Janu. (2024). Comprehensive analysis of natural language processing. Global Journal of Engineering and Technology Advances, 19(1), 083–090. https://doi.org/10.30574/gjeta.2024.19.1.0058
Vicente, V. (2020, April 21). What Is Natural Language Processing, and How Does It Work? How-To Geek. https://www.howtogeek.com/665702/what-is-natural-language-processing-and-how-does-it-work/
