Class 1 - NLP
What is NLP?
NLP stands for Natural Language Processing, a field at the
intersection of Computer Science, Linguistics, and Artificial
Intelligence. It is the technology that machines use to
understand, analyse, manipulate, and interpret human languages.
It helps developers organize knowledge for performing tasks
such as translation, automatic summarization, Named Entity
Recognition (NER), speech recognition, relationship
extraction, and topic segmentation.
History of NLP
(1940-1960) - Focused on Machine Translation (MT)
Natural Language Processing started in the 1940s.
In 1948, the first recognisable NLP application was introduced at Birkbeck College, London.
In the 1950s, there were conflicting views between linguistics and computer science.
Chomsky published his first book, Syntactic Structures, and claimed that language
is generative in nature.
In 1957, Chomsky also introduced the idea of Generative Grammar, a rule-based
description of syntactic structures.
(1960-1980) - Flavored with Artificial Intelligence (AI)
From 1960 to 1980, the key developments were:
Augmented Transition Networks (ATN)
An Augmented Transition Network extends a finite state machine with recursion and registers,
which makes it capable of recognizing more than just regular languages.
Case Grammar
Case Grammar was developed by the linguist Charles J. Fillmore in 1968. It describes the
relationship between nouns and verbs, which languages such as English often express through
prepositions.
In Case Grammar, case roles can be defined to link certain kinds of verbs and objects.
For example: "Neha broke the mirror with the hammer". In this example, Case Grammar identifies
Neha as the agent, the mirror as the theme, and the hammer as the instrument.
From 1960 to 1980, the key systems were:
SHRDLU
SHRDLU is a program written by Terry Winograd in 1968-70. It lets users
communicate with the computer in natural language to move objects. It can handle instructions
such as "pick up the green ball" and answer questions such as "What is inside the black box?"
The main importance of SHRDLU is that it shows that syntax, semantics, and reasoning
about the world can be combined to produce a system that understands natural
language.
LUNAR
LUNAR is the classic example of a natural language database interface system. It used
ATNs and Woods' Procedural Semantics, and it was capable of translating elaborate natural
language expressions into database queries, handling 78% of requests without errors.
1980 - Current
Until 1980, natural language processing systems were based on complex sets of hand-written
rules. After 1980, NLP introduced machine learning algorithms for language processing.
At the beginning of the 1990s, NLP started growing faster and achieved good processing accuracy,
especially for English grammar. Around 1990, electronic text also became available, which provided
a good resource for training and evaluating natural language programs. Other factors include the
availability of computers with fast CPUs and more memory. The major factor behind the
advancement of natural language processing was the Internet.
Modern NLP consists of various applications, such as speech recognition, machine
translation, and machine text reading. Combining these applications allows
artificial intelligence to gain knowledge of the world. Consider Amazon Alexa:
you can ask this voice assistant a question, and it will reply to you.
Advantages of NLP
NLP helps users ask questions about any subject and get a direct response within seconds.
NLP offers exact answers to the question, meaning it does not return unnecessary or unwanted
information.
Most companies use NLP to improve the efficiency and accuracy of documentation processes
and to identify information in large databases.
Disadvantages of NLP
NLU and NLG
NLU (Natural Language Understanding) is the process of reading and interpreting language. It produces non-linguistic outputs from natural language inputs.
NLG (Natural Language Generation) is the process of writing or generating language. It produces natural language outputs from non-linguistic inputs.
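The two directions can be illustrated with a minimal Python sketch. The command pattern, the function names `understand` and `generate`, and the dictionary layout are hypothetical choices made for this example only; real NLU and NLG systems are far more general.

```python
import re

def understand(utterance):
    """NLU: natural language in, structured (non-linguistic) data out."""
    cleaned = utterance.lower().strip(" .!")
    match = re.fullmatch(r"turn (on|off) the (\w+) light", cleaned)
    if match is None:
        return None  # the toy pattern does not cover this utterance
    return {"intent": "set_light", "state": match.group(1), "room": match.group(2)}

def generate(frame):
    """NLG: structured data in, natural language out."""
    return f"The {frame['room']} light is now {frame['state']}."

frame = understand("Turn on the kitchen light.")
print(frame)
print(generate(frame))
```

Here `understand` maps a sentence to a small dictionary, and `generate` maps that dictionary back to a sentence.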
Applications of NLP
6. Speech Recognition
Speech recognition converts spoken words into text. It is used in
applications such as mobile devices, home automation, video recovery, dictating to
Microsoft Word, voice biometrics, voice user interfaces, and so on.
7. Chatbot
Implementing chatbots is one of the important applications of NLP.
Many companies use them to provide customer chat services.
8. Information extraction
It converts large amounts of text into more formal representations, such as
first-order logic structures, that are easier for computer programs to manipulate
than natural language.
Phases of NLP
1. Lexical and Morphological Analysis
The first phase of NLP is Lexical Analysis. This phase scans the input text as a stream of characters and
converts it into meaningful lexemes. It divides the whole text into paragraphs, sentences, and words.
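As a rough illustration of this phase, the sketch below splits text into paragraphs, sentences, and word tokens using only Python's standard `re` module. The function names and splitting rules are assumptions made for this example; they are deliberately naive (abbreviations such as "Dr." would be split incorrectly), and production tokenizers handle many more cases.

```python
import re

def split_sentences(text):
    """Split at '.', '!' or '?' followed by whitespace (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def split_words(sentence):
    """Return word tokens and punctuation tokens from one sentence."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def lexical_analysis(text):
    """Paragraphs -> sentences -> word tokens, as nested lists."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [[split_words(s) for s in split_sentences(p)] for p in paragraphs]

sample = "NLP is fun. It has many phases!\n\nLexical analysis comes first."
for paragraph in lexical_analysis(sample):
    print(paragraph)
```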
2. Syntactic Analysis
Syntactic Analysis checks grammar and word arrangement, and shows the relationships among the words.
For example, the sentence "Agra goes to the Poonam" does not make any sense, so it is rejected by the syntactic
analyzer.
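A toy syntactic check might look like the following sketch. The lexicon, the part-of-speech tags, and the single sentence pattern are all hypothetical and chosen only so that the example sentence can be tested. In this toy grammar the sentence is rejected because a determiner ("the") appears before a proper noun; a real parser uses a much richer grammar.

```python
# Hypothetical toy lexicon: word -> part-of-speech tag.
LEXICON = {
    "poonam": "PROPN", "agra": "PROPN",
    "the": "DET", "girl": "NOUN", "city": "NOUN",
    "goes": "VERB", "to": "PREP",
}

def parse_np(tags, i):
    """NP -> PROPN | DET NOUN. Return the index after the NP, or None."""
    if i < len(tags) and tags[i] == "PROPN":
        return i + 1
    if i + 1 < len(tags) and tags[i] == "DET" and tags[i + 1] == "NOUN":
        return i + 2
    return None

def is_grammatical(sentence):
    """Accept only S -> NP VERB PREP NP, the one pattern in this toy grammar."""
    words = sentence.lower().rstrip(".").split()
    if any(w not in LEXICON for w in words):
        return False  # unknown word: the toy lexicon cannot tag it
    tags = [LEXICON[w] for w in words]
    i = parse_np(tags, 0)
    if i is None or tags[i:i + 2] != ["VERB", "PREP"]:
        return False
    return parse_np(tags, i + 2) == len(tags)

print(is_grammatical("Agra goes to the Poonam"))
print(is_grammatical("Poonam goes to Agra"))
```

Note that this check is purely structural: it would also accept "Agra goes to Poonam", since deciding whether a city can "go" somewhere is a question of meaning, not syntax.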
3. Semantic Analysis
Semantic analysis is concerned with the meaning representation. It
mainly focuses on the literal meaning of words, phrases, and sentences.
4. Discourse Integration
In Discourse Integration, the meaning of a sentence depends upon the sentences
that precede it and also influences the meaning of the sentences that follow it.
5. Pragmatic Analysis
Pragmatic Analysis is the fifth and last phase of NLP. It helps you discover the
intended effect by applying a set of rules that characterize cooperative
dialogues.
For Example: "Open the door" is interpreted as a request instead of an
order.
NLP Libraries
• Scikit-learn: It provides a wide range of algorithms for building machine learning models in Python.
• Natural Language Toolkit (NLTK): NLTK is a broad toolkit covering many NLP techniques.
• Pattern: It is a web mining module for NLP and machine learning.
• TextBlob: It provides an easy interface for basic NLP tasks like sentiment analysis, noun phrase extraction,
or POS tagging.
• Quepy: Quepy is used to transform natural language questions into queries in a database query language.
• spaCy: spaCy is an open-source NLP library used for data extraction, data analysis, sentiment
analysis, and text summarization.
• Gensim: Gensim works with large datasets and processes data streams.
Difference between Natural Language and Computer Language
Natural language has a very large vocabulary. Computer language has a very limited vocabulary.
Natural language is easily understood by humans. Computer language is easily understood by machines.
Natural language is ambiguous in nature. Computer language is unambiguous.
NLP approaches
Heuristics-Based NLP: This is the initial approach to NLP. It is based on defined
rules that come from domain knowledge and expertise. Example: regular expressions (regex).
Statistical Machine Learning-based NLP: It is based on statistical rules and
machine learning algorithms. In this approach, algorithms learn from the data
and are then applied to various tasks. Examples: Naive Bayes, support
vector machine (SVM), hidden Markov model (HMM), etc.
Neural Network-based NLP: This is the latest approach, which came with the
evolution of neural network-based learning, known as deep learning. It provides
good accuracy, but it is very data-hungry and time-consuming, and it requires
high computational power to train the model. Examples: recurrent neural networks (RNNs),
long short-term memory networks (LSTMs), convolutional neural networks (CNNs), Transformers,
etc.
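To make the contrast between the first two approaches concrete, here is a minimal sketch in plain Python. The regex rule, the four training sentences, and all function names are made up for illustration. The statistical part is a small multinomial Naive Bayes classifier with add-one smoothing, written from scratch rather than taken from any library. Neural approaches need a deep learning framework and are not shown.

```python
import math
import re
from collections import Counter

# Heuristics-based: a hand-written rule (this regex is an illustrative assumption).
NEGATIVE_RULE = re.compile(r"\b(bad|terrible|awful)\b", re.IGNORECASE)

def rule_based_sentiment(text):
    return "neg" if NEGATIVE_RULE.search(text) else "pos"

# Statistical: multinomial Naive Bayes with add-one smoothing, learned from data.
def train_naive_bayes(examples):
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training documents
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def predict(model, text):
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up toy training data, for illustration only.
training = [
    ("great movie loved it", "pos"),
    ("wonderful acting great plot", "pos"),
    ("boring plot hated it", "neg"),
    ("dull acting boring movie", "neg"),
]
model = train_naive_bayes(training)
print(rule_based_sentiment("The food was terrible"))
print(predict(model, "great plot"))
print(predict(model, "boring movie"))
```

The rule-based function only knows the words its author listed, while the Naive Bayes model picks up whichever words the training data associates with each label.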