Introduction to Natural Language Processing
By
Dr. Om Prakash Sharma
Mr. Siyang P. Kamble
Prof. Shubhada Labde
Dr. Rushikesh Prasad Kulkarni
2025
i
ISBN: 978-93-6422-893-0
ii
About the Book
iii
Processing will provide the tools and knowledge necessary
for success in this exciting field.
iv
Preface
v
Table of Contents
Processing .................................................................................. 1
Applications............................................................... 1
FastText .................................................................... 69
vi
2.5. Sentiment Analysis and Text Classification ........ 75
(LSTM) ...................................................................... 98
vii
CHAPTER-5: NLP Tools, Frameworks, and Future
Trends..................................................................................... 175
viii
CHAPTER-1: Fundamentals of Natural Language Processing
1.1. Introduction to NLP: Definition, Scope, and
Applications
1. Definition
1
may be fully understood by computers thanks to these
technologies.
2. Scope
2
speech (POS) tagging aid in the analysis of word
connections and sentence structure. Lastly, semantic
analysis promotes deeper knowledge by interpreting
meanings and clearing out ambiguities.
3
models improve performance in real-world translation
tools like Google Translate by learning intricate
correlations across languages.
4
h. Text Mining and Knowledge Extraction
5
k. Emerging Trends in NLP
3. Applications
6
b. Machine Translation
c. Text Summarization
d. Sentiment Analysis
7
e. Chatbots and Virtual Assistants
8
h. Content Recommendation
9
k. Financial Services
l. Education
10
n. Customer Service Automation
1.2.1. History
11
development have been indicative of advances in AI,
machine learning, and computational linguistics.
12
algorithms, which were modelled after the human
brain.
• Advanced Architectures: NLP capabilities were further
improved by deep learning architectures such as
recurrent neural networks and transformers. These
designs are described briefly here without delving into
technical specifics.
13
processing is multimodal natural language processing
(NLP). NLP has historically prioritised analysing and
comprehending textual material.
14
c. Video Understanding: As the quantity of online
video footage continues to increase, there could be
an increasing need for NLP frameworks that can
identify and condense video data. This involves not
only recognising objects and actions within videos,
but also understanding their narrative structure and
context. Video data makes possible applications in
content recommendation, video summarisation,
and even sentiment analysis based solely on visual
and aural signals.
d. Social Media Analysis: In the setting of social
media, where people exchange a wide variety of
information, including text, images, and videos,
multimodal natural language processing (NLP)
becomes particularly pertinent. NLP frameworks
must be adept at processing multimodal information
in order to analyse and comprehend the sentiment,
context, and potential implications of social media
posts. This affects trend analysis on social media
platforms, brand monitoring, and content moderation.
15
popularity with the goal of illuminating the inner
workings of complex models and improving the
customer's comprehension of their results.
16
into NLP. This hybrid technique aims to strike a
balance between the clarity of rule-based structures
and the expressive power of deep learning. Users may
learn why a certain prediction or decision was made
by the model by receiving rule-based explanations.
17
7. The Evolution of Language Models
18
Distributed representations increased the overall
performance of downstream NLP tasks and enabled a more
sophisticated, nuanced understanding of language.
19
confirmed the robustness of large-scale language models
for pre-training on large datasets. By examining
contextualised representations of words and concepts,
BERT and later models such as GPT (Generative Pre-
trained Transformer) achieved exceptional results. These
pre-trained models, which are then fine-tuned for
specific tasks, have proven to be the driving force behind
advances in natural language comprehension.
1.2.2. Evolution
20
comprehend and analyse text, these systems focused on
grammar, parsing, and sentence structure. But they were
inflexible and had trouble comprehending complicated,
nuanced language, particularly when it came to ambiguity
and context.
21
Penn Treebank corpus, which offered a standardised
dataset for NLP system training and evaluation.
22
(such as pictures and videos). With applications ranging from
chatbots to AI-powered writing aids and real-time
translation services, natural language processing (NLP) is
becoming more accessible and personalised. More
advancements in ethical AI, linguistic inclusiveness, and
cross-lingual models that can comprehend and produce
text in a greater range of languages are probably in store
for NLP in the future.
23
• Syntax would dissect the grammatical framework
into its constituent parts.
• Semantics would inform the system that
"not bad" probably implies "good," albeit this
would depend on the situation.
• Pragmatics would make sure the algorithm
recognises that, depending on the circumstance,
voice tone, or prior exchange, this might be a casual
praise or a courteous understatement.
1.3.1. Syntax
24
edges connecting nodes. For tasks like part-of-speech
tagging, sentence parsing, and question answering, this
aids NLP systems in recognising important sentence
components and their connections.
1.3.2. Semantics
25
Take, for instance, the phrase "He went to the bank to fish."
The term "bank" may refer to a variety of things, including
a financial institution or a riverbank. By disambiguating the
meaning of "bank" according to context, a semantic
analysis would assist the NLP system in determining
whether the user is at a bank or on the side of a river.
1.3.3. Pragmatics
26
capacity to pass the salt. Practically speaking, however, it
is usually used in a social setting as a courteous request. In
order to comprehend this change in meaning, a system
must take into account both the social rules that regulate
interactions and the context in which the utterance is
delivered.
27
Raw text data is cleaned and prepared for further analysis
and modelling through text preprocessing, a critical phase
in Natural Language Processing (NLP). In this process,
unstructured text is converted into a structured format that
can be efficiently analysed by machine learning
algorithms. The primary stages in text preprocessing are as
follows:
1. Lowercasing
2. Removing Punctuation
28
or root form (e.g., "running" to "run"). In order to obtain a
comparable outcome, stemming eliminates word endings
(e.g., "runners" to "runner"). Consolidating word variations
is facilitated by both methodologies.
5. Removing Numbers
7. Tokenization
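As a rough illustration of these steps, the following minimal Python sketch (standard library only; the sample sentence is invented) applies lowercasing, punctuation removal, number removal, and simple whitespace tokenization; stop-word removal, stemming, and lemmatization would normally be added with a library such as NLTK.

```python
import re
import string

def preprocess(text):
    """Minimal preprocessing sketch: lowercase, strip punctuation/digits, tokenize."""
    text = text.lower()                                                 # 1. lowercasing
    text = text.translate(str.maketrans("", "", string.punctuation))   # 2. removing punctuation
    text = re.sub(r"\d+", "", text)                                     # 5. removing numbers
    return text.split()                                                 # 7. simple whitespace tokenization

print(preprocess("Chatbots are helpful, 24/7!"))   # ['chatbots', 'are', 'helpful']
```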
29
1.4.1. Tokenization
30
["C", "h", "a", "t", "b", "o", "t", "s", " ", "a", "r", "e", " ", "h", "e",
"l", "p", "f", "u", "l"].
1. Types of tokenization
a. Word tokenization
31
b. Character tokenization
c. Subword tokenization
32
Machine translation tools, including Google Translate,
employ tokenisation to segment sentences in the source
language. Once tokenised, these segments can be
translated and subsequently reconstructed in the target
language, thereby guaranteeing that the translation
preserves the original context.
3. Tokenization challenges
a. Ambiguity
33
b. Languages without clear boundaries
4. Implementing Tokenization
34
a. NLTK (Natural Language Toolkit). NLTK is a
comprehensive Python library that is a stalwart in
the NLP community, providing support for a
diverse array of linguistic requirements. It is a
versatile option for both novices and seasoned
practitioners, as it provides both word and
sentence tokenisation functionalities.
b. Spacy. Spacy is an additional Python-based NLP
library that serves as a contemporary and effective
substitute for NLTK. It is a preferred choice for
large-scale applications due to its support for
multiple languages and its impressive
performance.
c. BERT tokenizer. This tokeniser is exceptional at
context-aware tokenisation, as it is derived from
the BERT pre-trained model. It is a top choice for
sophisticated NLP projects due to its ability to
handle the nuances and ambiguities of language
(see this tutorial on NLP with BERT).
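The following sketch contrasts the three tokenisers on one invented sentence; it assumes NLTK (with its tokenizer data), spaCy's en_core_web_sm model, and the transformers package are installed, and the outputs shown are indicative rather than guaranteed.

```python
from nltk.tokenize import word_tokenize
import spacy
from transformers import AutoTokenizer

text = "Chatbots aren't always helpful."

# NLTK: rule-based word tokenisation (splits contractions)
print(word_tokenize(text))                  # ['Chatbots', 'are', "n't", 'always', 'helpful', '.']

# spaCy: tokenisation as the first stage of its processing pipeline
nlp = spacy.load("en_core_web_sm")
print([token.text for token in nlp(text)])

# BERT tokenizer: subword (WordPiece) tokenisation from a pre-trained vocabulary
bert_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
print(bert_tokenizer.tokenize(text))        # e.g. ['chat', '##bots', 'aren', "'", 't', 'always', 'helpful', '.']
```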
1.4.2. Lemmatization
35
SEOs. Lemmatisation often entails a morphological and
vocabulary-based analysis of words, removing inflectional
endings and returning the lemma, or dictionary form, of a
word.
*https://cdn.prod.website-files.com/5ef788f07804fb7d78a4127a/65f985e9c789b549c4774842__GCX2S0080ZfWRKJ585W-KKYuBobBS3a8Mxg_9Zr-XHCsHph_A7V1_J-AF3c2ZVvUnEXZQIPHEfWYdvbnOgNkCbAOWlAQGQdNCN6kIUETBlmu3DUncVZz5HMJPX_nlomcMVl-zkqdtAb1m1i8wCnBtM.png
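A small sketch with NLTK's WordNet lemmatiser illustrates the idea (the WordNet corpus must be downloaded separately, and the part-of-speech hint strongly affects the result):

```python
from nltk.stem import WordNetLemmatizer   # requires nltk.download("wordnet")

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("running", pos="v"))   # 'run'   (verb)
print(lemmatizer.lemmatize("studies", pos="n"))   # 'study' (noun)
print(lemmatizer.lemmatize("better", pos="a"))    # 'good'  (adjective)
```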
36
1. Uses
2. Importance
37
3. Importance
a. Sentiment analysis
c. Biomedicine
38
created from the CRAFT corpus, this tool has achieved
97.5% accuracy.
d. Document clustering
e. Search engines
1.4.3. Stemming
39
A stemming algorithm, on the other hand, is a linguistic
normalisation procedure that reduces a word's different
forms to a standard form. This method involves
eliminating affixes from words in order to retrieve their
fundamental form. It is analogous to chopping off a tree's
branches to reveal the stems. For instance, "eat" is the stem
of the words "eating," "eats," and "eaten."
*https://cdn.prod.website-
files.com/5ef788f07804fb7d78a4127a/61d44079aad03bd419c4ba90
_stemming.jpeg
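The sketch below compares two common NLTK stemmers on a few word forms; note that a real stemmer does not always match the idealised example above (the Porter stemmer, for instance, leaves "eaten" unchanged):

```python
from nltk.stem import PorterStemmer, SnowballStemmer

porter = PorterStemmer()
snowball = SnowballStemmer("english")

for word in ["eating", "eats", "eaten", "runners"]:
    print(f"{word:8s} -> Porter: {porter.stem(word):8s} Snowball: {snowball.stem(word)}")
# eating -> eat, eats -> eat, eaten -> eaten, runners -> runner (for both stemmers)
```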
40
1. Popular stemming algorithms
b. Lovins Stemmer
c. Dawson Stemmer
41
d. Krovetz Stemmer
e. N-Gram Stemmer
f. Snowball Stemmer
42
handles short strings. The Snowball stemmer, also known
as the Porter2 Stemmer, is much more aggressive than the
Porter Stemmer. The Snowball stemmer has a faster
computing speed than the Porter stemmer due to the
enhancements made.
g. Lancaster Stemmer
4. Applications of Stemming
43
• Document clustering, also referred to as text
clustering, is a cluster-analysis technique applied to
textual information. Subject extraction, automated
document organisation, and rapid information
retrieval are some of its key applications.
5. Disadvantages in Stemming
44
stemming. In stemming, methods such as sentiment
analysis and semantic role labelling improve
context awareness.
6. Advantages of Stemming
45
1. Common Uses of Regular Expressions in NLP
46
or certain phrases (e.g., locating dates in a news
story or legal document).
Example:
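For instance, a simple (purely illustrative) pattern for dd/mm/yyyy-style dates could be written as follows:

```python
import re

text = "The hearing was moved from 12/05/2023 to 03/07/2024."
dates = re.findall(r"\b\d{2}/\d{2}/\d{4}\b", text)   # match dd/mm/yyyy-style dates
print(dates)                                          # ['12/05/2023', '03/07/2024']
```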
47
text more streamlined and concentrates on the main
linguistic ideas.
Example:
3. Tokenization
4. Stemming
5. Lemmatization
48
Lemmatisation maps "running" to "run," for example, and would
lemmatise "better" to "good."
7. Handling Numbers:
49
CHAPTER-2: NLP Techniques and Methods
2.1. Part-of-Speech (POS) Tagging
1. Defining
50
phrase by adding a layer of syntactic and semantic
information to the words.
51
2. Techniques for POS tagging
52
then assign the POS tags from the sequence that
has the greatest probability. A POS Tag may be
assigned using probabilistic methods called
Hidden Markov Models (HMMs).
53
d. To support linguistics research: POS tagging may
also be used to investigate language use trends and
traits as well as to learn more about the
composition and purpose of various speech
components.
54
e. Test the POS tagger: Make predictions about the
POS tags of the words in the testing set using the
trained model or rules. To assess the tagger's
performance, compare the anticipated and real tags
and compute measures like accuracy and recall.
f. Make the POS tagger better: If the tagger's
performance isn't up to par, modify the model or
rules and carry out the training and testing
procedure again until the required accuracy is
attained.
g. Employ the POS tagger: New, unseen text may be
tagged using the POS tagger after it has been
trained and tested. This might include applying the
rules to the text or preparing the text before feeding
it into the trained model. The anticipated POS tags
for every word in the text will be the output.
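Rather than training a tagger from scratch as outlined above, an off-the-shelf tagger can also be applied directly; the minimal NLTK sketch below assumes the 'punkt' and 'averaged_perceptron_tagger' data have been downloaded.

```python
import nltk

tokens = nltk.word_tokenize("The quick brown fox jumps over the lazy dog")
print(nltk.pos_tag(tokens))
# [('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'), ('fox', 'NN'), ('jumps', 'VBZ'),
#  ('over', 'IN'), ('the', 'DT'), ('lazy', 'JJ'), ('dog', 'NN')]
```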
55
POS tagging. For jobs like creating client profiles or
locating important characters in a news article, this
is helpful.
56
1. Defining
2. Working
57
determine the borders of sentences. When a word
begins with a capital letter, it assumes it may be the
start of a new sentence and recognises the end of
the sentence. Understanding sentence boundaries
helps the model comprehend connections and
meanings by contextualising textual items.
• NER may be taught to categorise whole documents
into distinct groups, such as passports, invoices, and
receipts. By enabling it to modify its entity
recognition according to the unique properties and
context of various document kinds, document
categorisation increases NER's adaptability.
• NER analyses labelled datasets using machine
learning methods, such as supervised learning. The
model is guided in identifying comparable things
in fresh, unseen data by the instances of annotated
entities found in these datasets.
• The model constantly improves its accuracy over
time by honing its comprehension of entity
patterns, grammatical structures, and contextual
characteristics over several training rounds.
• The model is more resilient and efficient because it
can withstand changes in language, context, and
entity types thanks to its capacity to adjust to new
data.
58
based, and deep learning techniques are the four types of
NER systems. Let's examine each of them separately.
a. Dictionary-based Systems
b. Rule-based Systems
59
When we use an ML-based solution for NER, there are
primarily two stages. Training the ML model on the
annotated texts is the initial step. The complexity of the
model we are creating will affect how long it takes the
model to train. The trained model may be used to annotate
the unprocessed documents in the next stage.
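For illustration, an off-the-shelf ML-based NER model can be applied in a few lines; the sketch below assumes spaCy and its small English model (en_core_web_sm) are installed, and the sample sentence is invented.

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple opened a new office in Mumbai in March 2024.")
for ent in doc.ents:
    print(ent.text, ent.label_)
# Typical output: Apple ORG, Mumbai GPE, March 2024 DATE (exact labels depend on the model version)
```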
a. Customer support
b. Resume Filtering
60
most important skills in a distinct section of the resume if
you had previously participated in a resume-building
class. Additionally, they may have suggested that you
include just the essential skills associated with the
job role. This is because the automated system's
named entity recognition (NER) model may have been
specially trained to recognise certain skill sets as
entities. A résumé is eligible for the next step if it
contains the necessary number of entities.
61
• Dependency on context. Words often get their
meaning from the text around them. In a tech
article, the term "Apple" presumably refers to the
company, yet in a recipe, it most likely refers to the
fruit. Accurate entity identification requires an
understanding of these subtleties.
• Language differences. With its slang, dialects, and
regional variations, the diverse fabric of human
language may provide difficulties. The NER
process may become more difficult if something
that is ubiquitous in one area is unfamiliar in
another.
• Sparsity of data. The availability of extensive
labelled data is essential for NER techniques based
on machine learning. It may be difficult to get such
information, however, particularly for specialised
sectors or less widely used languages.
• Generalisation of the model. A model may perform
well in one domain but poorly in another when it
comes to identifying things. A recurring problem is
making sure NER models generalise successfully
across different domains.
62
sentence's syntactical structure in accordance with formal
grammar. The properties of the final tree will vary
depending on the grammatical type we choose.
63
Dependency parsing, which seeks to determine the
syntactic relationships between words in a phrase, is
distinct from constituency parsing. Dependency parsing
focusses on the sentence's linear structure, while
constituency parsing concentrates on the sentence's
hierarchical structure. Both strategies may be used to
improve sentence comprehension, and each has benefits of
its own.
64
c. Text-to-Speech: This technology uses the text's
syntax and structure to produce speech that sounds
human.
d. Sentiment Analysis: This method shows if the
elements of a text have neutral, negative, or
positive attitudes.
e. Text-based Games and Chatbots: It makes text-
based games and chatbots respond more like
humans.
f. Text summarisation: This method breaks down
lengthy texts into their most essential components
and presents them in a condensed format.
g. Text Classification: This method analyses the
connections and component structure of text to
group it into predetermined groups.
65
recognising them. The parser analyses the text and creates
a dependency tree or graph using a grammar model and a
set of grammatical rules.
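A minimal dependency-parsing sketch with spaCy (en_core_web_sm assumed installed) prints each word's grammatical relation and its head:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("She quickly finished the report.")
for token in doc:
    print(f"{token.text:10s} {token.dep_:10s} head: {token.head.text}")
# e.g. 'She' -> nsubj of 'finished', 'quickly' -> advmod of 'finished', 'report' -> dobj of 'finished'
```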
66
entities, including individuals, locations, and
organisations, inside a text.
b. Part-of-Speech (POS) Tagging: It helps in
identifying the parts of speech of each word in a
sentence and classifying them as nouns, verbs,
adjectives, etc.
c. Sentiment analysis: By examining the relationships
between words and the feeling attached to each
one, it helps ascertain the sentiment of a phrase.
d. Machine Translation: By examining the
relationships between words and producing the
equivalent dependencies in the target language,
this tool assists in translating phrases across
languages.
e. Text Generation: It supports the creation of text by
examining the relationships between words and
producing new terms that complement the
preexisting structure.
f. Question Answering: It assists in answering
questions by examining the relationships between
words and locating pertinent data in a corpus.
67
The main differences between the two approaches can be summarised as follows:
• Grammar: Constituency parsing uses phrase structure grammar, such as context-free grammar. Dependency parsing uses dependency grammar, which represents the relationships between words as labelled directed arcs.
• Approach: Constituency parsing is based on a top-down approach, where the parse tree is built from the root node down to the leaves. Dependency parsing is based on a bottom-up approach, where the parse tree is built from the leaves up to the root.
• Representation: Constituency parsing represents a sentence as a tree structure with non-overlapping constituents. Dependency parsing represents a sentence as a directed graph, where words are represented as nodes and grammatical relationships are represented as edges.
• Typical use: Constituency parsing is more suitable for natural language understanding tasks. Dependency parsing is more suitable for natural language generation tasks and dependency-based machine learning models.
• Expressiveness: Constituency parsing is more expressive and captures more syntactic information, but can be more complex to compute and interpret. Dependency parsing is simpler and more efficient, but may not capture as much syntactic information as constituency parsing.
• Morphology: Constituency parsing is more appropriate for languages with rich morphology, such as agglutinative languages. Dependency parsing is more appropriate for languages with less morphological inflection, like English and Chinese.
• Tasks: Constituency parsing is used for more traditional NLP tasks like named entity recognition, text classification, and sentiment analysis. Dependency parsing is used for more advanced NLP tasks like machine translation, language modelling, and text summarisation.
• Syntactic structure: Constituency parsing is more suitable for languages with rich syntactic structures. Dependency parsing is more suitable for languages with less complex syntactic structures.
69
Three prominent word embedding techniques are as
follows: Word2Vec, GloVe, and BERT.
2.4.1. Word2Vec
70
rather than utilising local context to learn word
representations. It generates a co-occurrence matrix that
quantifies the frequency with which words are
encountered in a specific text window.
2.4.3. FastText
71
represent words. As an example, the character n-
grams "app", "ppl", and "ple" might be used to
represent the word "apple". This method aids
FastText in capturing word structure, which makes
it especially useful for languages with rich
morphology (tense, gender, etc.) where word forms
vary greatly.
b. FastText can also handle uncommon words,
misspellings, and morphological changes thanks to
its character-level modeling. Even if FastText has
never encountered the word before during training,
it can nevertheless produce a meaningful
embedding by taking into account the n-grams
included in the term.
c. Subword Information: FastText handles out-of-
vocabulary (OOV) words more effectively by using
subword information, or character n-grams. Rare
words or misspellings are often difficult for
traditional models like Word2Vec to handle, but
FastText can deconstruct these words into
recognized subword components and provide
useful embeddings for them. This is particularly
crucial for applications using OOV terms, such as
named entity identification or machine translation.
d. Effective and Scalable: FastText is designed to use
memory and time as efficiently as possible. It can
generate embeddings rapidly and train on big
corpora. FastText's capacity to efficiently build
embeddings and train on unsupervised data makes
72
it a popular option for real-world NLP
applications.
e. Text Classification: FastText facilitates supervised
learning for text classification problems in addition
to producing word embeddings. Classifiers for
applications like sentiment analysis, language
recognition, and document classification may be
effectively trained using FastText. In order to train
a classifier and achieve high accuracy with very
few computer resources, it treats each document or
phrase as a bag of word n-grams.
f. Pre-trained Models: FastText makes it simple to
begin using word embeddings for tasks like
semantic similarity, information retrieval, or
recommendation systems by offering a range of
pre-trained word vectors for various languages.
These pre-trained models provide excellent
embeddings for a large number of frequently used
words and phrases and are often trained on
enormous datasets like Wikipedia.
73
embeddings are used by FastText during training
to create a more accurate and richer word
representation.
b. Skip-gram Model: Similar to Word2Vec, FastText
trains a Skip-gram model with the aim of
predicting a word's context. FastText, in contrast to
Word2Vec, generates embeddings even for words
that are not in the dictionary since it takes into
account the word's character n-grams in addition to
the word itself.
c. Text Classification: FastText represents each
document in text classification tasks using word
and subword embeddings. In order to link these
representations to the appropriate labels—such as
positive or negative for sentiment analysis or
particular categories for topic classification—the
model trains a classifier.
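A minimal Gensim-based sketch illustrates these ideas; the toy corpus and hyperparameters are purely illustrative, and real models require far larger corpora.

```python
from gensim.models import FastText

sentences = [["the", "apple", "fell"], ["apples", "are", "tasty"], ["he", "ate", "an", "apple"]]
model = FastText(sentences, vector_size=50, window=3, min_count=1, min_n=3, max_n=5, epochs=20)

# Thanks to character n-grams, even an unseen or misspelled word gets an embedding.
print(model.wv["applle"][:5])                    # vector built from the word's subword n-grams
print(model.wv.similarity("apple", "apples"))    # expected to be relatively high
```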
3. Applications of FastText
74
c. Text Classification: FastText excels in document
classification tasks like product classification, spam
detection, sentiment analysis, and news article
classification. It is very helpful when working with
big datasets.
2.5.1. Sentiment
75
to categorize the text according to the attitude or mindset it
conveys, which might be neutral, positive, or negative.
1. Definition
76
Figure 2.1 Sentiment analysis*
2. Importance
*https://media.geeksforgeeks.org/wp-
content/uploads/20200717010244/gfgsentiment-300x206.png
77
Here are some key reasons why sentiment analysis is
important for business:
d. Competitor Analysis
78
In order to make strategic decisions, businesses evaluate
their strengths and weaknesses in comparison to their
rivals.
1. Defining
79
more. Machines can now organize, filter, and comprehend
vast amounts of textual data thanks to text classification
algorithms, which examine the text's properties and
patterns to provide precise predictions about its category.
*https://cdn.analyticsvidhya.com/wp-
content/uploads/2023/08/What-is-Text-Classification-.png
80
classification, while having a higher accuracy rate.
Although it is less accurate, unsupervised text
categorization may be used in situations when labels are
not provided.
3. Working
Finding the texts that belong in each class is the next step
after selecting the classes. Although there are several
81
methods to achieve this, the most popular method is to
search for certain key words or phrases that are
representative of the subject. If you were searching for
books on animals, for instance, you may search for terms
like "fur," "milk," or "birth."
82
machine learning models. Another difficulty for machine
learning models is that textual input might be quite high-
dimensional.
a. Sentiment Analysis
b. Language detection
83
consuming process. However, machine learning models
may also be useful in this situation.
84
Algorithms that use natural language processing can
decipher emotions and understand the written text's
intended meaning. Before classifying a message as positive,
neutral, or negative, sentiment analysis, for instance, might
identify its tone and classify it as bullying, anger, abuse,
irony, and so on.
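One lightweight way to do this in code is NLTK's VADER analyser, sketched below; the VADER lexicon must be downloaded first, the sample messages are invented, and the thresholds follow the usual convention.

```python
from nltk.sentiment import SentimentIntensityAnalyzer   # requires nltk.download("vader_lexicon")

sia = SentimentIntensityAnalyzer()
for message in ["I love this product!", "This is the worst service ever.", "The parcel arrived."]:
    compound = sia.polarity_scores(message)["compound"]
    label = "positive" if compound > 0.05 else "negative" if compound < -0.05 else "neutral"
    print(f"{message!r} -> {label} ({compound:+.2f})")
```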
85
CHAPTER-3: Advanced NLP Models and Architectures
3.1. Introduction to Machine Learning in NLP
1. Definition
86
machine learning implementations into four main
categories, which are as follows:
a. Supervised learning
b. Unsupervised learning
c. Reinforcement learning
87
d. Semi-supervised learning
88
overlook. This enables more informed decision-
making that is informed by real-world data.
c. Improved Personalization: ML customises user
experiences on a variety of platforms. ML
customises content and services to meet the
preferences of individual users, from
recommendation systems to targeted advertising.
d. Advanced Automation and Robotics: Machine
learning (ML) enables robots and machines to
execute intricate tasks with increased precision and
ability to adapt. This is transforming industries
such as logistics and manufacturing.
89
difficult to comprehend, which can make it difficult
to elucidate. This absence of transparency may
prompt enquiries regarding trust and
accountability.
d. Job Displacement and Automation: Certain
sectors may experience employment displacement
as a result of automation through machine
learning. It is imperative to address the necessity of
retraining and reskilling the workforce.
3.2.1. Supervised
90
supervised learning approach, and the machines
subsequently predict the output based on that training.
Here, labelled data means that some of the inputs have
already been mapped to their outputs. In other words,
after the system has been trained with the input and
corresponding output, it is asked to predict the output
for the test dataset.
91
a. Advantages and Disadvantages of Supervised
Learning
Advantages:
Disadvantages:
a. Image Segmentation
b. Medical Diagnosis
92
photographs and data that have already been tagged with
identifiers for ailment conditions are employed in the
process. This procedure may be employed by the machine
to diagnose a disease in new patients.
c. Fraud Detection
d. Spam detection
e. Speech Recognition
3.2.2. Unsupervised
93
unlabelled dataset to train the system, which then makes
output predictions on its own without human oversight.
a. Clustering
94
aggregation of items in a manner that ensures those that
are most similar to one another remain in that group and
are less similar to or not at all similar to those in other
groups. Grouping clients according to their purchasing
habits is an illustration of the clustering algorithm in
action.
b. Association
95
2. “Advantages and Disadvantages of Unsupervised
Learning Algorithm”
Advantages:
Disadvantages:
96
3. Applications of Unsupervised Learning
97
• Goal: Supervised learning predicts outcomes or classifies data based on known labels; unsupervised learning discovers hidden patterns, structures, or groupings in data.
• Computational complexity: Supervised learning is less complex, as the model learns from labeled data with clear guidance; unsupervised learning is more complex, as the model must find patterns without any guidance.
• Types: Supervised learning has two types, classification (for discrete outputs) and regression (for continuous outputs); unsupervised learning covers clustering and association.
• Testing the model: A supervised model can be tested and evaluated using labeled test data; an unsupervised model cannot be tested in the traditional sense, as there are no labels.
98
Technically speaking, "neural networks," which draw
inspiration from the human brain, are used in deep
learning. These networks are made up of information-
processing layers of linked nodes. The network becomes
"deeper" as it gains additional layers, which enables it to
learn more intricate characteristics and carry out more
difficult tasks.
1. Definition
*https://media.geeksforgeeks.org/wp-
content/uploads/20230413105611/Maachine-Learning.webp
99
Because of its success in a range of applications, including
computer vision, natural language processing, and
reinforcement learning, deep learning artificial intelligence
(AI) has grown to become one of the most well-known and
visible subfields in machine learning today.
2. Concepts
100
technique known as backpropagation, the weights are
modified during training to reduce the discrepancy
between the expected and actual outputs.
101
accelerating convergence and enhancing efficiency. In
order to avoid overfitting, regularisation strategies like
Dropout or L2 Regularisation penalise too complicated
models or randomly deactivate neurones during training.
102
functions. Methods like data augmentation and batch
normalisation are used to improve and stabilise the
training process, which in turn improves the performance
of the model.
103
handle long-term dependencies efficiently and train more
quickly.
e. Autoencoders
104
in a competitive environment, in which the discriminator
aims to accurately discriminate between genuine and false
data, while the generator wants to generate data that is
indistinguishable from actual data.
105
Memory State, as it retains the network's most recent
input. It employs identical parameters for each input and
performs the same task on all inputs and hidden layers to
generate the output. In contrast to other neural networks,
this reduces the number of parameters.
1. Types Of RNN
a. One to One
*https://media.geeksforgeeks.org/wp-
content/uploads/20231204125839/What-is-Recurrent-Neural-
Network-660.webp
106
Figure 3.3 One to One RNN*
b. One to Many
*https://media.geeksforgeeks.org/wp-
content/uploads/20231204131135/One-to-One-300.webp
107
Figure 3.4 One to Many RNN*
c. Many to One
*https://media.geeksforgeeks.org/wp-
content/uploads/20231204131304/One-to-Many-300.webp
108
Figure 3.5 Many to One RNN*
d. Many to Many
*https://media.geeksforgeeks.org/wp-
content/uploads/20231204131355/Many-to-One-300.webp
109
Figure 3.6 Many to Many RNN*
2. Advantages of RNN
*https://media.geeksforgeeks.org/wp-
content/uploads/20231204131436/Many-to-Many-300.webp
110
3. Disadvantages of RNN
1. Introduction
111
reference to specific data that was saved a long time ago is
necessary. However, standard RNNs struggle to manage
these "long-term dependencies."
2. Structure of LSTM
112
layer and three logistic sigmoid gates. The purpose of
gates is to restrict the amount of information that may
enter through the cell. They decide which information
should be deleted and which will be required by the next
cell. The output is usually in the range of 0-1 where ‘0’
means ‘reject all’ and ‘1’ means ‘include all’.
a. Forget Gate
*https://media.geeksforgeeks.org/wp-
content/uploads/newContent1.png
113
preceding cell), which are multiplied by weight matrices
before bias is added. An activation function is applied to
the outcome, producing a binary output. If the output for a
certain cell state is 0, the information is lost, and if the
output is 1, the information is saved for later use.
b. Input gate
*https://media.geeksforgeeks.org/wp-
content/uploads/newContent2.png
114
from h_t-1 and x_t. Finally, to get relevant information, the
vector values and the controlled values are multiplied.
c. Output gate
*https://media.geeksforgeeks.org/wp-
content/uploads/newContent4.png
115
Figure 3.10 Output gate in the LSTM cell*
*https://media.geeksforgeeks.org/wp-
content/uploads/newContent3.png
116
number of images and their accompanying informative
captions is necessary for this. A previously trained model
predicts features for the photos in the dataset; this is the
photo data. The dataset is then processed so that it contains
only the most relevant words; this is the textual data. We
attempt to fit the model using these two kinds of data. The
model's task is to use the image together with the words it
has already predicted to generate a descriptive caption for
the image, one word at a time.
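Independently of the captioning example above, the bare-bones PyTorch sketch below shows how an LSTM layer is typically wired into an NLP model; the vocabulary size, dimensions, and random batch are hypothetical and the model is untrained.

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=128, hidden_dim=256, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)       # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)       # final hidden state: (1, batch, hidden_dim)
        return self.fc(hidden[-1])                 # class logits

model = LSTMClassifier()
dummy_batch = torch.randint(0, 10_000, (4, 20))    # 4 sequences of 20 token ids
print(model(dummy_batch).shape)                    # torch.Size([4, 2])
```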
117
a. Self-Attention Mechanism
b. Multi-Head Attention
c. Positional Encoding
118
d. Encoder and Decoder
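A compact sketch of the scaled dot-product self-attention computation that underlies these components (the tensor shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5   # similarity between queries and keys
    weights = F.softmax(scores, dim=-1)             # attention distribution over positions
    return weights @ V                              # weighted sum of the value vectors

x = torch.randn(1, 5, 64)                           # one sentence of 5 tokens, 64-dim each
out = scaled_dot_product_attention(x, x, x)         # self-attention: Q = K = V = x
print(out.shape)                                    # torch.Size([1, 5, 64])
```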
2. Advantages of Transformers
119
b. Long-Range Dependencies: The self-attention
mechanism of transformers effectively captures
long-range dependencies in sequences.
c. Scalability: Transformers are well-suited for large-
scale NLP tasks due to their ability to scale with the
growth of data and models.
3. Applications
3.4.1. BERT
120
to conventional embeddings such as GloVe and
Word2Vec, which generate static vectors for words.
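The short Hugging Face Transformers sketch below (model weights are downloaded on first use) shows how the same word receives different contextual vectors in different sentences, echoing the earlier "bank" example:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["He sat by the river bank.", "She deposited cash at the bank."]
bank_id = tokenizer.convert_tokens_to_ids("bank")
with torch.no_grad():
    for sentence in sentences:
        inputs = tokenizer(sentence, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state           # (1, seq_len, 768)
        position = inputs.input_ids[0].tolist().index(bank_id)
        print(sentence, hidden[0, position, :3])              # differs between the two sentences
```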
3.4.2. GPT
121
1. Introduction
122
attention heads. Unsupervised learning was used to pre-
train this model on a variety of datasets, and it was then
adjusted for certain tasks.
123
boost contextual awareness and reasoning. Its performance
demonstrated how models can display comprehension
and reasoning-like behaviour, sparking a broad conversation
about the implications of powerful AI models.
124
Figure 3.11 GPT architecture*
*https://media.geeksforgeeks.org/wp-
content/uploads/20240712150234/GPT-Arcihtecture.webp
125
next word in a sentence. This phase uses a wide
range of online content to ensure that the model
can generate writing that is human-like in a variety
of contexts and domains.
b. Fine-tuning: Although GPT models do well in
zero-shot and few-shot learning, there are times
when specific applications call for fine-tuning,
which involves training the model on data unique
to a given task or domain.
126
offering conversational agents to support patients,
and supporting research by summarising scientific
material.
6. Advantages of GPT
3.4.3. T5
1. Introduction
127
particular tasks. This approach makes the model more
flexible and versatile by treating the input and output for
different NLP tasks as textual sequences.
2. Working
128
3. Key Features of the T5 Model
*https://miro.medium.com/v2/resize:fit:1100/format:webp/1*kD5H8pRe-9kJZLL_L29kLA.png
129
enables it to be tailored to various job needs and
computing resources.
130
cookies," "use of cookies," or "use cookies" since
many sites had boilerplate policy statements.
i. To deduplicate the data set, we kept only one
occurrence of any three-sentence span that
appeared more than once.
*https://miro.medium.com/v2/resize:fit:1100/format:webp/1*vK9Fa0mcfz_Ed_XHHHMcIg.png
131
3.5. Transfer Learning and Pre-trained
Language Models
132
corpus of text to a new, more specialized task, such as
named entity recognition, text categorization, or sentiment
analysis.
a. Pre-training
b. Fine-tuning
c. Feature Extraction
133
embeddings are delivered to a different classifier for the
job, such as text categorization or sentiment analysis.
134
a. Transformer Architecture: The Transformer
architecture, upon which the majority of
contemporary pre-trained language models are
based, makes use of self-attention mechanisms to
comprehend the connections among words in a
sentence. These models can effectively handle
massive volumes of data and capture long-range
relationships in text thanks to the Transformer
design.
b. Bidirectionality (for models like BERT):
Bidirectional training is one of the most significant
developments in pre-trained models. In contrast to
unidirectional models, models such as BERT are
trained to anticipate missing words by taking into
account both the words that precede and follow the
missing word. This allows them to capture richer
contextual information.
c. Generative Capabilities (for models such as GPT):
Autoregressive pre-trained models such as GPT are
taught to predict the subsequent word in a series.
These models are ideal for text creation jobs like
writing articles, emails, or creative material since
they are excellent at producing language that is
both cohesive and contextually relevant.
d. Unified Framework (for models like T5):
Activities like summarization, translation, and
question answering are transformed into a single
text generation task by T5 (Text-to-Text Transfer
Transformer), which considers all NLP activities as
135
text-to-text issues. This method makes the model's
construction simpler and enables it to be used for a
variety of NLP applications.
136
CHAPTER-4: Applications of NLP in the Real World
4.1. Machine Translation: Statistical vs. Neural
Machine Translation
137
NLP models, such as Seq2Seq and Transformers, are
employed by advanced machine translation systems, such
as DeepL and Google Translate. These models are
intended to capture the subtleties and variations in
language, resulting in translations that are both
contextually pertinent and accurate.
138
equivalents in the target language. Large datasets
of parallel materials, such as translated novels or
papers, are used to learn this alignment. Based on
context and frequency, the algorithm determines
which word correspondences are most probable.
2. Strengths of SMT
139
defined translation rules and a lot of parallel data
accessible.
b. Interpretability: It is simpler to examine how the
model produces translation choices since SMT's
phrase tables and word alignments are
comprehensible and evaluated.
3. Weaknesses of SMT
140
makes use of deep learning models, more especially neural
networks. NMT does not depend on direct statistical
mappings or phrase tables as SMT does. Rather, it models
the translation process directly using an end-to-end neural
network.
141
linguistic patterns like word order, grammar, and
even colloquial idioms.
c. Context Awareness: When it comes to context,
NMT is more adept than SMT. NMT can provide
more natural-sounding translations that preserve
the original text's content and structure by using
deep neural networks, which allow it to take the
full phrase into account while translating.
2. Strengths of NMT
3. Weaknesses of NMT
142
comprehend how they arrive at a certain
translation. Error analysis and model improvement
become more difficult as a result.
b. High Computational Cost: When working with huge
datasets and intricate architectures, training NMT
models requires a significant investment of time
and computational resources. Additionally, inference
may be slower than with SMT, which
can be problematic for real-time applications.
c. Needs Big Datasets: NMT works best when
trained on big data sets, even though it often needs
less parallel data than SMT. NMT may have trouble
with languages with little data.
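In practice, a pre-trained NMT model can be applied in a few lines; the sketch below uses the Hugging Face Transformers pipeline with one publicly available English-to-German model (chosen here only for illustration, and downloaded on first use).

```python
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
result = translator("The weather is nice today.")
print(result[0]["translation_text"])
```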
143
• Translation Quality: SMT often produces stiff, word-for-word translations that may be unnatural; NMT produces more fluent and natural-sounding translations.
• Data Requirements: SMT requires large parallel corpora but can work with smaller datasets; NMT requires large parallel corpora for optimal performance but is more data-efficient than SMT.
• Computational Cost: SMT is computationally less intensive during training and inference; NMT requires significantly more computational resources during both training and inference.
• Flexibility: SMT is limited in handling unseen words or rare language pairs; NMT can handle unseen words more effectively by leveraging embeddings and contextual information.
• Interpretability: SMT is more interpretable, as phrase tables and word alignments can be analyzed; NMT is often considered a "black box," making it harder to understand or debug errors.
144
4.2. Speech Recognition and Text-to-Speech
(TTS)
1. Working
145
voice samples. Next, the speech-to-text technology decodes
the audio, eliminates any unwanted noise, and modifies
the speech's pitch, loudness, and cadence. After that, it
breaks down the digital data into frequencies and
examines individual content segments.
146
Figure 4.1 Working of speech recognition*
*https://nordvpn.com/wp-content/uploads/blog-asset-speech-
recognition.svg
147
b. Dynamic time warping (DTW).
a. Navigation systems
148
b. Virtual assistants
c. Healthcare
d. Call centers
e. Accessibility
149
processing. Voice search may be used by people with
restricted movement to use their gadgets, such as taking
phone calls or accessing the internet.
f. Language translation
g. Voice search
150
a. Text Analysis: Examining the input text is the first
stage of TTS. Understanding the text's linguistic
structure entails dissecting it into its constituent
words, phrases, and paragraphs. This aids the
system in determining how to produce suitable
speech.
151
2. Types of Text-to-Speech Systems
152
3. Applications of Text-to-Speech
4. Advantages of Text-to-Speech
153
like dyslexia or visual impairments, TTS is an
essential assistive tool. By reading them aloud, it
facilitates these people's access to written materials
like books, articles, and papers.
b. Multitasking: TTS enables users to listen to
information while engaging in other tasks like
cooking, driving, or working out. When time is
limited, this enables increased convenience and
efficiency.
c. Global Communication: People from a variety of
backgrounds may engage with technology in a
manner that is suitable for their culture and
language because to TTS's ability to be modified for
many languages and accents.
154
may be difficult for TTS systems to understand,
which might result in incorrect pronunciations or
strange speech patterns.
c. Real-time Processing: Real-time processing may be
difficult, especially in situations with limited
resources, since high-quality TTS, especially those
based on deep learning models, can demand
substantial computing resources.
4.3.1. Chatbots
155
1. Types of Chatbots
4.3.2. Conversational AI
156
1. Key Components of Conversational AI
157
which are useful in fields including education, healthcare,
e-commerce, and customer support. These are a few
examples of typical uses:
158
improve user experience, and provide scalability to
manage high contact volumes. Among the main
advantages are:
159
a. Limited Understanding: A lot of chatbots still have
trouble comprehending slang, context, and unclear
language, which may result in misunderstandings
or unsatisfactory user experiences.
b. Emotional Intelligence Deficit: Although
conversational AI is capable of processing
language, it is not emotionally intelligent. It often
lacks the human ability to comprehend or express
emotions, which may lead to less sympathetic
relationships.
c. Complexity of Multi-turn Conversations: Because
it may be challenging to preserve context over
extended exchanges, conversational AI is still
having trouble handling complicated, multi-turn
conversations.
d. Integration Problems: It may be difficult to
incorporate conversational AI into databases and
systems that already exist, particularly when
working with legacy software and many lines of
communication.
e. Issues with Data Privacy: Data privacy and
security are issues when chatbots and
conversational AI gather and analyze user data,
especially when handling sensitive data like
financial or medical information.
160
conversational AI have a bright future. These systems will
be able to manage more difficult jobs, have more in-depth
discussions, and provide more individualized experiences
as they advance in sophistication. Multimodal
conversational AI, which enables computers to
comprehend and react to inputs from several
communication channels including text, audio, and video,
is also anticipated to become increasingly prevalent in the
future.
161
4.4.1. Information Retrieval
162
words—common phrases that are often useless in
searches.
c. Ranking: Following query processing, the IR
system assigns a ranking to the pertinent
documents according to how pertinent they are to
the user's question. In order to guarantee that the
most relevant pages show up at the top of search
results, ranking algorithms are essential. When
ordering the results, factors including user
preferences, document structure, and keyword
frequency are taken into account.
d. Retrieval Models: A number of retrieval models
influence the ranking and retrieval of texts. The
Boolean Model, Vector Space Model, and Probabil-
istic Model are a few of the most often used
models. Different methods are used by each model
to gauge how relevant documents are to a query.
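As a toy illustration of the Vector Space Model, the sketch below ranks a few invented documents against a query using TF-IDF vectors and cosine similarity (scikit-learn assumed available):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "How to train a neural network",
    "Recipes for baking bread at home",
    "Neural machine translation with transformers",
]
query = ["neural network training"]

vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(docs)          # index the collection
query_vector = vectorizer.transform(query)            # map the query into the same space

scores = cosine_similarity(query_vector, doc_vectors)[0]
for i in scores.argsort()[::-1]:                       # rank documents by relevance
    print(f"{scores[i]:.3f}  {docs[i]}")
```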
163
are automated programs that search engines
employ to browse the internet and gather data
from web sites. These crawlers examine websites in
a methodical manner, collecting information, text,
and pictures while following connections to other
pages.
b. Indexing: Data is indexed after web crawlers have
collected it from web sites. Information is arranged
and stored throughout the indexing process so that
it may be quickly retrieved when a user does a
search. Creating an inverted index, which
associates terms (such as keywords) with the
documents or web pages that contain them, is the
standard method of indexing.
c. Algorithms and Ranking: Following indexing,
search engines provide a ranking to the sites
according to how relevant they are to the user's
query. Complex algorithms, like Google's
PageRank, are used in the ranking process to assess
user experience signals (e.g., mobile friendliness,
website loading speed), backlinks, keyword use,
and page quality.
d. Query Matching and Display of Results: The
search engine finds relevant results by comparing a
user's query to its index. The results are then
shown in ranked order, usually with the most
relevant results at the top. Additionally, modern
search engines tailor results according to user
preferences, geography, and past search history.
164
2. Applications of Information Retrieval and Search
Engines
165
166
4. Challenges and Limitations
167
sophisticated search engines, particularly in
specialized or niche fields.
168
thorough analysis of the development, application, and
usage of NLP models is necessary for the social integration
of AI-driven technologies to guarantee that they advance
equity, openness, and inclusion. This section will examine
important NLP ethical issues and the difficulties relating to
bias.
Making sure that the data used to train these systems does
not violate privacy rights is one of the most important
ethical concerns in natural language processing. Large
volumes of data, including sensitive personal data, are
often needed for NLP models. If this information is
managed improperly or made public, it may result in
privacy breaches and endanger people.
169
2. Accountability and Transparency
170
Furthermore, rather than fully replacing people, NLP
systems need to be created to support them. Instead of
undercutting human agency and creativity, developers
must make sure that these systems enhance human roles
and decision-making processes.
1. Sources of Bias
171
b. Representation Bias: NLP models may have
trouble correctly processing the language or
preferences of certain groups if they are
underrepresented in the training data. When
applied to other languages or dialects, for example,
models that were primarily trained on English-
language data may not perform as well, producing
results that are of lesser quality for users who do
not understand English.
c. Labeling Bias: In supervised learning, people often
construct the labels that are used to train models,
which may unintentionally add biases of their own.
For instance, an annotator may assign certain
emotional tones to particular races or genders
when classifying text data or photos, which might
result in biased predictions when the model is
used.
172
b. Misinformation: The dissemination of false
information may also result from bias in NLP
systems. An NLP model used to summarise news
stories, for example, may provide summaries that
reinforce biassed opinions or distort important
facts if it is trained on biassed sources.
c. Exclusion: NLP systems that fail to take diversity
and inclusion into consideration may inadvertently
leave out some groups, especially those who speak
languages or dialects that are under-represented in
training data. Unfair access to services like
customer service or medical information may result
from this.
173
evaluate how effectively their models handle
various demographic groups by using methods like
bias detection and fairness rating measures.
c. Human-in-the-loop Systems: Including human
supervision in NLP systems' decision-making
process may assist identify biases and guarantee
that the system is functioning equitably. When the
machine generates biassed results or is unclear,
humans may step in.
d. Explainability and Transparency: By improving
the explainability of NLP systems, developers may
guarantee that stakeholders and users comprehend
the decision-making process. Because of this
openness, it may be simpler to spot prejudiced
trends and address them before they have a
negative impact.
e. Algorithms for Debiasing: Scholars are also
attempting to create algorithms that actively reduce
bias in NLP models. These methods modify the
training procedure to offset biassed data,
guaranteeing that models provide more equitable
and well-rounded outcomes.
174
CHAPTER-5: NLP Tools, Frameworks, and Future Trends
5.1. Popular NLP Libraries: NLTK, SpaCy, and
Hugging Face Transformers
175
resources, corpora, and pre-defined functions for typical
natural language processing applications.
176
2. Use Cases
5.1.2. SpaCy
177
workloads since it is meant to be among the fastest
NLP libraries available.
b. Pre-trained Models: English, French, German, and
Spanish are among the languages for which SpaCy
has pre-trained models. These models have been
optimized for applications including text
categorization, named entity recognition (NER),
dependency parsing, and part-of-speech tagging.
c. Pipeline-based Architecture: SpaCy employs a
pipeline-based methodology in which text is
processed via a number of steps (including parsing,
tagging, and tokenization) in order to provide
valuable linguistic characteristics. The library may
be easily extended and customized for certain
needs thanks to its modular design.
d. Integration with Deep Learning: Training and
deploying bespoke models is made simple by
SpaCy's good integration with deep learning
frameworks like TensorFlow and PyTorch.
3. Use Cases
178
to extract valuable information from massive text
datasets.
179
summarization after being trained on large
corpora.
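A couple of these pre-trained pipelines can be tried in a few lines; the sketch below is illustrative only, and the default models are downloaded from the Hugging Face Hub on first use.

```python
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")
print(sentiment("The library is easy to use and very fast."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]

summarizer = pipeline("summarization")
long_text = ("Natural language processing enables computers to analyse, "
             "understand, and generate human language. ") * 10
print(summarizer(long_text, max_length=30, min_length=10)[0]["summary_text"])
```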
180
2. Use Cases
181
Step 1: Define the Problem and Collect Data
182
Step 2: Preprocess the Data
183
Step 3: Choose an NLP Model
184
function successfully even when given little task-
specific info.
185
Step 5: Fine-tune and Optimize
It's time to deploy the model so that end users may use it if
you're happy with its performance. When deploying NLP
models, there are a few important factors to take into
account:
186
implementing NLP models, particularly deep
learning models. Cloud systems that provide
services to scale model deployment over numerous
servers, such as AWS, Google Cloud, or Azure, are
good places to deploy models.
c. Batch vs. Real-Time Processing: The choice
between batch and real-time processing will
depend on your application. For instance, text
summarizing may be done in batch mode, but a
chatbot would need real-time processing.
d. Maintenance and Monitoring: Following
deployment, it's critical to keep an eye on the
model's functionality in actual environments and
gather input for ongoing enhancement. As new
information becomes available or language use
changes over time, models may need to be
retrained.
e. Containerization: You may use technologies like
Docker to containerize the model in order to make
deployment more portable. This makes it possible
to bundle the model with its dependencies,
guaranteeing that it functions uniformly in various
settings.
187
that despite significant advancements over the years, NLP
still faces a number of obstacles that limit its performance
in real-world applications. Resolving these obstacles is
essential to enhancing NLP systems and making them
more accurate and useful across a variety of domains.
188
3. Semantic Ambiguity: This occurs when
ambiguous word usage or phrasing leaves the
meaning of a sentence unclear. For example, "He
didn’t believe in the theory of relativity" could
indicate that the speaker doesn’t accept the
scientific theory or that they don’t believe in a
particular theory known as "the theory of
relativity." This ambiguity can confuse NLP models
and make it more difficult for them to provide
accurate interpretations.
189
(such as "he," "she," "it," or "they") are often used in
texts. Coreference resolution is the process by
which NLP systems determine which entities these
pronouns refer to. A system may misidentify
references if context is not understood, which
might cause misunderstandings or wrong
interpretations.
3. Managing Idiomatic Expressions: Figurative
language and idioms are very heavily influenced
by context. For instance, the phrase "kick the
bucket" refers to "to die" in a metaphorical sense;
nonetheless, taking it literally would be deceptive.
NLP models must comprehend the larger context
of the text or conversation in order to comprehend
such statements.
4. Contextualized Word Embeddings: Newly
developed NLP models that can identify contextual
associations between words in a phrase include
BERT (Bidirectional Encoder Representations from
Transformers) and GPT (Generative Pretrained
Transformer). These models dynamically modify
word meanings based on context, which aids with
language comprehension and disambiguation.
190
interpretation. Among the main challenges associated with
multilingual processing are:
191
words in many languages is one of the main issues
in multilingual natural language processing.
Because they are often trained on a single language,
word embeddings such as Word2Vec and GloVe
may not function effectively when used on
different languages. Although there are still issues
with accuracy and efficiency, more recent methods
such as Multilingual BERT and XLM-R try to
address this by offering cross-lingual word
embeddings that can handle text in various
languages.
5. Machine Translation: One of the main
responsibilities of multilingual natural language
processing is the proper translation of text across
languages. Even while neural machine translation
(NMT) models, such as Google Translate, have
advanced significantly, problems with sentence
structure, colloquial idioms, and cultural
differences still exist. Machine translation models
often perform worse in low-resource languages
because they have fewer parallel corpora available
for training.
192
developments have transformed how robots comprehend
and produce human language, opening up a plethora of
possibilities. This section examines some of the most
significant recent developments in NLP research that are
influencing the discipline's direction.
193
Transformer), GPT (Generative Pretrained Transformer),
and BERT (Bidirectional Encoder Representations from
Transformers) have raised the bar. These models are
refined on task-specific datasets after being pre-trained on
vast volumes of text data.
3. Multimodal NLP
194
NLP, in which models are taught to process and
comprehend several types of input, including text,
pictures, and audio, while conventional NLP has mostly
concentrated on text. Applications like voice recognition,
video analysis, and picture captioning benefit greatly from
this.
195
any prior samples of the work. For instance, new
categories might be classified using a zero-shot text
classification model without the requirement for
annotated samples. The strength of pre-trained
models, which have acquired a wide variety of
general knowledge during their pre-training phase,
has enabled zero-shot learning to be successful.
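A hedged sketch of zero-shot classification with the Transformers pipeline (the default NLI-based model is downloaded on first use; the example sentence and labels are invented):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
result = classifier(
    "The central bank raised interest rates by 50 basis points.",
    candidate_labels=["finance", "sports", "cooking"],
)
print(result["labels"][0], round(result["scores"][0], 3))   # expected top label: 'finance'
```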
196
advancements in NLP. Although models such as GPT-3,
T5, and BERT have already shown impressive success,
much bigger and more potent models are anticipated in
the future. We anticipate seeing models with billions of
parameters that can generate more coherent text,
comprehend more subtle information, and perform better
on a greater variety of tasks as computing power and data
availability increase.
197
image recognition or video analysis for tasks like
automatic video captioning or sign language translation,
might be revolutionized by multimodal AI, which
combines different sources of data. Text and other sensory
modalities will continue to be integrated, making NLP
systems more adaptable and more in line with how people
naturally perceive information.
198
4. Ethics, Fairness, and Explainability
199
This change will be significantly influenced by
personalised NLP systems. Individual requirements,
communication preferences, and styles will all be
accommodated by these platforms. For example,
personalised virtual assistants will be able to learn from
past interactions and improve their ability to predict user
requirements, whether those needs are related to task
scheduling, question answering, or content creation.
Human-machine interactions will become more intuitive
and user-friendly as a result of this personalisation.
200