
Natural Language Processing (NLP) in AI

Natural Language Processing (NLP) is a fascinating and rapidly evolving field at the intersection of computer science, artificial intelligence, and linguistics. NLP focuses on the interaction between computers and human language, enabling machines to understand, interpret, and generate human language in a way that is both meaningful and useful. With the increasing volume of text data generated every day, from social media posts to research articles, NLP has become an essential tool for extracting valuable insights and automating various tasks.

Natural language processing (NLP) is a field of computer science and a subfield of artificial intelligence that aims to make computers understand human language. NLP uses computational linguistics, which is the study of how language works, and various models based on statistics, machine learning, and deep learning. These technologies allow computers to analyze and process text or voice data and to grasp its full meaning, including the speaker's or writer's intentions and emotions.

From tokenization and parsing to sentiment analysis and machine translation, NLP encompasses a wide range of applications that are reshaping industries and enhancing human-computer interactions. Whether you are a seasoned professional or new to the field, this overview will provide you with a comprehensive understanding of NLP and its significance in today's digital age.

Components of NLP
NLP has the following two components:

Natural language generation (NLG)

NLG is a method of creating meaningful phrases and sentences (natural language) from data. It comprises three stages: text planning, sentence planning, and text realization.

• Text planning: Retrieving applicable content.
• Sentence planning: Forming meaningful phrases and setting the sentence tone.
• Text realization: Mapping sentence plans to sentence structures.

Chatbots, machine translation tools, analytics platforms, voice assistants, sentiment analysis platforms, and AI-powered transcription tools are some applications of NLG.

Natural language understanding (NLU)

NLU enables machines to understand and interpret human language by extracting metadata from content. It performs the following tasks:

• Helps analyze different aspects of language.
• Helps map natural language input into valid internal representations.

NLU is more difficult than NLG owing to referential, lexical, and syntactic ambiguity.

• Lexical ambiguity: One word holds several meanings. For example, "The man is looking for the match." The sentence is ambiguous as 'match' could mean different things, such as a partner or a competition.
• Syntactic ambiguity: A sequence of words has more than one possible meaning. For example, "The fish is ready to eat." The ambiguity here is whether the fish is ready to eat its food or whether the fish is ready for someone else to eat. This ambiguity can be resolved with the help of part-of-speech tagging.
• Referential ambiguity: A word or phrase could refer to two or more things. For example, "Tom met Jerry and John. They went to the movies." Here, the pronoun 'they' causes ambiguity as it isn't clear who it refers to.

Difference between NLU and NLG

• NLU is the process of reading and interpreting language; NLG is the process of writing or generating language.
• NLU produces non-linguistic outputs from natural language inputs; NLG produces natural language outputs from non-linguistic inputs.

Pipeline of natural language processing in artificial intelligence


The NLP pipeline comprises a set of steps to read and understand human language.

Step 1: Sentence segmentation
Sentence segmentation is the first step in the NLP pipeline. It divides the entire paragraph into individual sentences for better understanding. For example: "London is the capital and most populous city of England and the United Kingdom. Standing on the River Thames in the southeast of the island of Great Britain, London has been a major settlement for two millennia. It was founded by the Romans, who named it Londinium."

After sentence segmentation, we get the following result:

1. "London is the capital and most populous city of England and the United Kingdom."
2. "Standing on the River Thames in the southeast of the island of Great Britain, London has been a major settlement for two millennia."
3. "It was founded by the Romans, who named it Londinium."
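As a rough illustration, the segmentation above can be sketched with a single rule: split after sentence-ending punctuation. This is a toy heuristic, not a production segmenter; real tools also handle abbreviations, quotations, and other edge cases.

```python
import re

def segment_sentences(paragraph):
    # Naive heuristic: a sentence ends at '.', '!' or '?' followed by whitespace.
    return re.split(r'(?<=[.!?])\s+', paragraph.strip())

text = ("London is the capital and most populous city of England and the "
        "United Kingdom. Standing on the River Thames in the southeast of the "
        "island of Great Britain, London has been a major settlement for two "
        "millennia. It was founded by the Romans, who named it Londinium.")

sentences = segment_sentences(text)
# sentences now holds the three sentences listed above
```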


Step 2: Word tokenization
Word tokenization breaks the sentence into separate words or tokens. This helps understand the context of the text. When tokenizing the sentence "London is the capital and most populous city of England and the United Kingdom", it is broken into separate words, i.e., "London", "is", "the", "capital", "and", "most", "populous", "city", "of", "England", "and", "the", "United", "Kingdom", ".".
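A minimal tokenizer in this spirit can be sketched with a regular expression that keeps words and punctuation as separate tokens. Real tokenizers are more careful about contractions, hyphens, and numbers.

```python
import re

def tokenize(sentence):
    # Each run of word characters becomes one token; each punctuation
    # mark becomes its own token.
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize("London is the capital and most populous city of England "
                  "and the United Kingdom.")
```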


Step 3: Stemming
Stemming helps in pre-processing text by normalizing words into their base or root form. For example, 'intelligently', 'intelligence', and 'intelligent' all reduce to the single root 'intelligen'. Note that the stem need not be a valid word: in English there's no such word as 'intelligen'.
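The 'intelligen' example can be reproduced with a toy suffix-stripping stemmer. This is a deliberately crude sketch; real stemmers such as the Porter algorithm use a much larger, ordered rule set.

```python
# Tiny illustrative suffix list; checked in order, longest variants first.
SUFFIXES = ("tly", "ce", "ing", "ly", "s", "t")

def crude_stem(word):
    # Strip the first matching suffix, keeping at least 3 characters of stem.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

stems = [crude_stem(w) for w in ("intelligently", "intelligence", "intelligent")]
# all three reduce to the (non-word) stem 'intelligen'
```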
Step 4: Lemmatization
Lemmatization removes inflectional endings and returns the canonical form of a word, or lemma. It is similar to stemming except that the lemma is an actual word. For example, 'playing' and 'plays' are forms of the word 'play'; hence, 'play' is the lemma of these words. Unlike a stem (recall 'intelligen'), 'play' is a proper word.
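Because lemmas must be real words, lemmatization relies on a dictionary rather than suffix rules alone. A minimal sketch with a hand-built lookup table (real lemmatizers consult a full lexicon such as WordNet and use the part of speech to pick the right lemma):

```python
# Hand-built lemma dictionary for illustration only.
LEMMAS = {
    "playing": "play",
    "plays": "play",
    "played": "play",
    "better": "good",  # irregular forms need explicit entries
}

def lemmatize(word):
    # Fall back to the word itself when it is not in the dictionary.
    return LEMMAS.get(word.lower(), word)
```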
Step 5: Stop word analysis
The next step is to consider the importance of each word in a given sentence. In English, some words appear far more frequently than others, such as "is", "a", "the", and "and". Because they appear so often, the NLP pipeline flags them as stop words. They are filtered out so the analysis can focus on more important words.
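Stop-word filtering is a simple set-membership test. The list below is a small illustrative one; NLP libraries ship much longer stop-word lists.

```python
# Small illustrative stop-word set.
STOP_WORDS = {"is", "a", "an", "the", "and", "of", "to", "in"}

def remove_stop_words(tokens):
    # Compare case-insensitively so sentence-initial words are also caught.
    return [t for t in tokens if t.lower() not in STOP_WORDS]

tokens = ["London", "is", "the", "capital", "of", "England"]
content_words = remove_stop_words(tokens)
# only the content-bearing words remain
```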


Step 6: Dependency parsing
Next comes dependency parsing, which is mainly used to find out how all the words in a sentence are related to each other. To find the dependencies, we can build a tree in which each word is assigned a single parent word. The main verb in the sentence acts as the root node.
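The tree structure itself can be sketched with a small node class. The parse below is hand-annotated for illustration; producing such trees automatically is what a dependency parser (e.g. in spaCy) does.

```python
from dataclasses import dataclass, field

@dataclass
class DepNode:
    word: str
    relation: str = "root"
    children: list = field(default_factory=list)

    def attach(self, word, relation):
        # Create a child node and record its grammatical relation to us.
        child = DepNode(word, relation)
        self.children.append(child)
        return child

# Hand-annotated parse of "London is the capital": the main verb is the root.
root = DepNode("is")
root.attach("London", "nsubj")            # nominal subject
capital = root.attach("capital", "attr")  # attribute
capital.attach("the", "det")              # determiner hangs off 'capital'
```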


Step 7: Part-of-speech (POS) tagging
POS tagging labels each token with its grammatical category, such as noun, verb, adjective, or adverb. These tags help indicate the role and meaning of each word within a grammatically correct sentence.
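A toy lookup tagger illustrates the idea. Real taggers are statistical or neural and use the surrounding context, since many words are ambiguous (recall 'match' from the lexical-ambiguity example).

```python
# Toy lexicon using Penn Treebank-style tags, for illustration only.
TAG_LEXICON = {
    "London": "NNP",   # proper noun
    "is": "VBZ",       # verb, 3rd person singular present
    "the": "DT",       # determiner
    "capital": "NN",   # common noun
}

def pos_tag(tokens):
    # Default unknown words to common noun, a typical fallback choice.
    return [(token, TAG_LEXICON.get(token, "NN")) for token in tokens]

tagged = pos_tag(["London", "is", "the", "capital"])
```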

Natural language processing is used when we want machines to interpret human language. The main goal is to extract meaning from text in order to perform certain tasks automatically, such as spell checking, translation, and social media monitoring.


Phases of NLP
There are the following five phases of NLP:
1. Lexical and Morphological Analysis
The first phase of NLP is lexical analysis. This phase scans the source text as a stream of characters and converts it into meaningful lexemes. It divides the whole text into paragraphs, sentences, and words.
2. Syntactic Analysis (Parsing)
Syntactic analysis checks grammar and word arrangement, and shows the relationships among the words.
Example: "Agra goes to the Poonam."
In the real world, "Agra goes to the Poonam" does not make any sense, so this sentence is rejected by the syntactic analyzer.
3. Semantic Analysis
Semantic analysis is concerned with meaning representation. It mainly focuses on the literal meaning of words, phrases, and sentences.
4. Discourse Integration
Discourse integration depends upon the sentences that precede a given sentence and also invokes the meaning of the sentences that follow it.
5. Pragmatic Analysis
Pragmatic analysis is the fifth and last phase of NLP. It helps discover the intended effect of an utterance by applying a set of rules that characterize cooperative dialogues.
