NLP U5

Natural language processing (NLP) uses computer science and artificial intelligence to understand, analyze, and generate human language. NLP has important applications for businesses to analyze large amounts of text data and online reviews. Key components of NLP include natural language understanding to extract meaning from text and natural language generation to convert data into readable text. Building an NLP pipeline involves processes like tokenization, stemming, lemmatization, part-of-speech tagging, named entity recognition, and chunking.

Uploaded by

Sana Mateen
Copyright
© All Rights Reserved

Unit-V

Natural Language Processing


Natural Language Processing
• NLP stands for Natural Language Processing, a field at the
intersection of Computer Science, Linguistics (human language),
and Artificial Intelligence.
• It is the technology that machines use to understand,
analyse, manipulate, and interpret human languages.
• It helps developers to organize knowledge for performing
tasks such as translation, automatic summarization, Named
Entity Recognition (NER), speech recognition, relationship
extraction, and topic segmentation.
Why NLP is important
• The biggest benefit of NLP for businesses is the ability to detect and
process massive volumes of text data across the digital world, including
social media platforms, online reviews, news reports, and other sources.
• Also, by collecting and analyzing business data, NLP is able to offer
businesses valuable insights into brand performance.
• In addition, NLP models can help detect persistent issues so that the
necessary mitigation measures can be taken to improve performance.
• Tools such as Google Speech-to-Text support this by training machines to
process human language faster and more consistently than human agents
can.
• The technology is able to consistently monitor and process data. This helps
brands remain updated with their online presence, and not get riddled with
inconsistencies.
Applications of NLP
• Chatbots
• Speech recognition
• Machine Translation
• Spell Checking
• Keyword Searching
• Information extraction
• Advertisement Matching
Components of NLP
• NLP has two components:
• 1. Natural Language Understanding (NLU)
• Natural Language Understanding (NLU) helps the machine to
understand and analyse human language by extracting the
metadata from content such as concepts, entities, keywords,
emotion, relations, and semantic roles.
• NLU is mainly used in business applications to understand the
customer's problem in both spoken and written language.
• NLU involves the following tasks:
• Mapping the given input into a useful representation.
• Analyzing different aspects of the language.
• 2. Natural Language Generation (NLG)
• Natural Language Generation (NLG) acts as a
translator that converts the computerized
data into natural language representation.
• It mainly involves Text planning, Sentence
planning, and Text Realization.
• Note: NLU is more difficult than NLG.
Building an NLP pipeline
• Building an NLP pipeline involves the following
steps:
• Tokenization
• Stemming
• Lemmatization
• POS Tags
• Named Entity Recognition
• Chunking
Tokenization
• It breaks the sentence into separate words or
tokens.
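A minimal sketch of word-level tokenization, assuming a simple regular-expression rule (words are runs of letters or digits, and each punctuation mark is its own token). The function name tokenize is illustrative and does not come from any particular library:

```python
import re

def tokenize(sentence):
    """Split a sentence into word tokens and punctuation tokens."""
    # \w+ matches a run of letters/digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)

print(tokenize("NLP helps machines understand human language."))
# ['NLP', 'helps', 'machines', 'understand', 'human', 'language', '.']
```

Real tokenizers also handle contractions, URLs, and languages without spaces, which this rule ignores.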
Stemming
• Refers to the process of slicing off the end or the
beginning of a word with the intention of removing
affixes (lexical additions to the root of the word).
• It normalizes words into their base or root form.
• For example, celebrates, celebrated and celebrating
all originate from the single root word
"celebrate."
Problem in Stemming
• The big problem with stemming is that it
sometimes produces a root that is not a
meaningful word.
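The behaviour described above can be sketched with a toy suffix-stripping stemmer. The suffix list and the minimum stem length of 3 are assumptions made for illustration; this is not the Porter algorithm or any library's stemmer:

```python
SUFFIXES = ("ing", "ed", "es", "s")  # assumed suffix list, tried in this order

def stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for w in ["celebrates", "celebrated", "celebrating"]:
    print(w, "->", stem(w))
# celebrates -> celebrat
# celebrated -> celebrat
# celebrating -> celebrat
```

All three forms collapse to the same stem, but "celebrat" is not a real word, which is exactly the limitation noted above.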
Lemmatization
• Lemmatization is quite similar to stemming, but it
overcomes the limitation of stemming.
• It groups the different inflected forms of a word
under a single base form, called the lemma.
• The main difference between stemming and
lemmatization is that lemmatization produces a root
word that has a meaning.
• The output of lemmatization is a proper word.
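A minimal sketch of dictionary-based lemmatization, assuming a small hand-written lookup table. The LEMMAS table and the lemmatize function are hypothetical; real lemmatizers rely on a full lexicon and usually on the part of speech as well:

```python
# Hypothetical lookup table mapping inflected forms to their lemma.
LEMMAS = {
    "celebrates": "celebrate",
    "celebrated": "celebrate",
    "celebrating": "celebrate",
    "went": "go",
    "mice": "mouse",
}

def lemmatize(word):
    """Return the dictionary lemma, or the lowercased word if it is unknown."""
    word = word.lower()
    return LEMMAS.get(word, word)

for w in ["celebrates", "celebrated", "celebrating", "went"]:
    print(w, "->", lemmatize(w))
# celebrates -> celebrate
# celebrated -> celebrate
# celebrating -> celebrate
# went -> go
```

Unlike a suffix-stripping stemmer, the lookup returns a proper word and can handle irregular forms such as "went".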
POS Tags
• POS stands for Part of Speech.
• The tags include noun, verb, adverb, and adjective.
• A tag indicates how a word functions, both in
meaning and grammatically, within the
sentence.
• A word can have one or more parts of speech,
depending on the context in which it is used.
• Example: "Google" something on the Internet.
• In the above example, Google is used as a verb,
although it is a proper noun.
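The idea that the tag depends on context can be sketched with a toy tagger. The lexicon, the tag names (which loosely follow the Universal POS tag set), and the single context rule for "Google" are all assumptions made for this illustration, not a real tagging model:

```python
# Hypothetical mini-lexicon: each word's default tag.
LEXICON = {
    "google": "PROPN", "something": "PRON", "on": "ADP", "the": "DET",
    "internet": "NOUN", "is": "AUX", "a": "DET", "company": "NOUN",
}

def pos_tag(tokens):
    """Tag tokens from the lexicon, with one context rule for 'Google'."""
    tags = []
    for i, token in enumerate(tokens):
        tag = LEXICON.get(token.lower(), "NOUN")  # unknown words default to NOUN
        next_tag = LEXICON.get(tokens[i + 1].lower()) if i + 1 < len(tokens) else None
        # Assumed rule: "Google" directly followed by a pronoun or determiner
        # is being used as a verb ("Google something", "Google the answer").
        if token.lower() == "google" and next_tag in ("PRON", "DET"):
            tag = "VERB"
        tags.append((token, tag))
    return tags

print(pos_tag(["Google", "something", "on", "the", "Internet"]))
print(pos_tag(["Google", "is", "a", "company"]))
```

In the first sentence "Google" is tagged VERB, and in the second it keeps its default PROPN tag. Practical taggers learn such context effects statistically rather than from hand-written rules.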
Named Entity Recognition
• Named Entity Recognition (NER) is the
process of detecting named entities such as
person names, organization names, locations, or
monetary values.
• Example: HelpingHands founder Arhaan lists
his Bangalore penthouse for 20 Million rupees.
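A minimal sketch of rule-based NER on the example sentence, assuming a hand-written gazetteer (name list) for the three names and a regular expression for rupee amounts. The labels and helper names are illustrative; production NER systems are usually trained statistical or neural models:

```python
import re

# Hypothetical gazetteer built by hand for this one example.
GAZETTEER = {
    "HelpingHands": "ORGANIZATION",
    "Arhaan": "PERSON",
    "Bangalore": "LOCATION",
}
# Assumed pattern: a number, an optional magnitude word, then "rupees".
MONEY = re.compile(
    r"\d+(?:\.\d+)?(?:\s+(?:thousand|million|billion))?\s+rupees", re.IGNORECASE
)

def find_entities(text):
    """Return (entity text, label) pairs in order of appearance."""
    found = []
    for name, label in GAZETTEER.items():
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((m.start(), m.group(), label))
    for m in MONEY.finditer(text):
        found.append((m.start(), m.group(), "MONEY"))
    return [(span, label) for _, span, label in sorted(found)]

sentence = ("HelpingHands founder Arhaan lists his Bangalore penthouse "
            "for 20 Million rupees.")
for entity in find_entities(sentence):
    print(entity)
# ('HelpingHands', 'ORGANIZATION')
# ('Arhaan', 'PERSON')
# ('Bangalore', 'LOCATION')
# ('20 Million rupees', 'MONEY')
```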
Chunking
• Chunking collects individual pieces of
information and groups them into bigger
units, such as phrases.
• This helps in extracting insight and meaningful
information from the text.
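Chunking can be sketched as grouping POS-tagged tokens into phrases. The example below assumes the input is already tagged and simply groups consecutive determiner, adjective, and noun tokens into one noun-phrase chunk. The tag names and the function are illustrative, and real chunkers use richer grammar patterns:

```python
NP_TAGS = ("DET", "ADJ", "NOUN")  # assumed tags that may appear in a noun phrase

def chunk_noun_phrases(tagged):
    """Group runs of DET/ADJ/NOUN tokens into noun-phrase chunks."""
    chunks, current = [], []
    for word, tag in tagged:
        if tag in NP_TAGS:
            current.append(word)
        elif current:
            # Any other tag closes the chunk that is being built.
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

tagged = [("the", "DET"), ("little", "ADJ"), ("dog", "NOUN"),
          ("chased", "VERB"), ("a", "DET"), ("cat", "NOUN")]
print(chunk_noun_phrases(tagged))
# ['the little dog', 'a cat']
```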
Phases of NLP
• Morphological and Lexical Analysis
• Syntactic Analysis
• Semantic Analysis
• Discourse Integration
• Pragmatic Analysis
Morphological and Lexical Analysis
• The lexicon of a language is its vocabulary
that includes its words and expressions
• Morphology is the analysis, identification, and
description of the structure of words.
• Lexical analysis involves dividing a text into
paragraphs, sentences, and words.
Syntactic Analysis
• Syntax concerns the proper ordering of words and its
effect on meaning.
• This involves analyzing the words in a sentence to
determine the grammatical structure of the sentence.
• The words are transformed into a structure that
shows how they are related to each other.
• Eg: "School went to Raju" would be rejected by an
English syntactic analyzer.
Semantic Analysis
• Semantics concerns the (literal) meaning of
words, phrases, and sentences.
• This step extracts the dictionary meaning, or the
exact meaning, from the context.
• The structures created by the syntactic
analyzer are assigned meaning.
• Eg: "Hot ice cream" is rejected because it does
not make sense.
Discourse Integration
• Discourse integration deals with the sense of the context.
• The meaning of any single sentence depends
upon the sentences that precede it, and it also
influences the meaning of the sentences that
follow it.
• Eg: The word "there" in the sentence "He
wants to go there" depends upon the prior
discourse context.
Pragmatic Analysis
• Pragmatics concerns the overall communicative and
social context and its effect on interpretation.
• It means abstracting or deriving the purposeful use
of language in situations.
• It covers, importantly, those aspects of language that
require world knowledge.
• The main focus is on reinterpreting what was said
in terms of what was actually meant.
• Eg: "Close the window" should be interpreted
as a request rather than an order.
