Natural language processing (NLP) uses computer science and artificial intelligence to understand, analyze, and generate human language. NLP has important applications for businesses to analyze large amounts of text data and online reviews. Key components of NLP include natural language understanding to extract meaning from text and natural language generation to convert data into readable text. Building an NLP pipeline involves processes like tokenization, stemming, lemmatization, part-of-speech tagging, named entity recognition, and chunking.
Unit-V
Natural Language Processing
• NLP stands for Natural Language Processing, a field at the intersection of computer science, human language (linguistics), and artificial intelligence.
• It is the technology machines use to understand, analyse, manipulate, and interpret human languages.
• It helps developers organize knowledge for tasks such as translation, automatic summarization, Named Entity Recognition (NER), speech recognition, relationship extraction, and topic segmentation.

Why NLP is important
• The biggest benefit of NLP for businesses is the ability to detect and process massive volumes of text data across the digital world, including social media platforms, online reviews, news reports, and others.
• By collecting and analyzing business data, NLP can also offer businesses valuable insights into brand performance.
• In addition, NLP models can detect persisting issues so that mitigation measures can be taken to improve performance.
• Tools such as Google Speech-to-Text achieve this by training machines to understand human language faster, more accurately, and more consistently than human agents.
• The technology can monitor and process data continuously. This helps brands stay up to date with their online presence and avoid inconsistencies.

Applications of NLP
• Chatbots
• Speech recognition
• Machine translation
• Spell checking
• Keyword searching
• Information extraction
• Advertisement matching

Components of NLP
• There are two components of NLP:
• 1. Natural Language Understanding (NLU)
• NLU helps the machine understand and analyse human language by extracting metadata from content, such as concepts, entities, keywords, emotion, relations, and semantic roles.
• NLU is mainly used in business applications to understand the customer's problem in both spoken and written language.
• NLU involves the following tasks:
• It maps the given input into a useful representation.
• It analyzes different aspects of the language.
• 2. Natural Language Generation (NLG)
• NLG acts as a translator that converts computerized data into a natural language representation.
• It mainly involves text planning, sentence planning, and text realization.
• Note: NLU is more difficult than NLG.

Building an NLP pipeline
• Building an NLP pipeline involves the following steps:
• Tokenization
• Stemming
• Lemmatization
• POS Tags
• Named Entity Recognition
• Chunking

Tokenization
• It breaks the sentence into separate words or tokens.

Stemming
• Stemming slices off the end or the beginning of words with the intention of removing affixes (lexical additions to the root of the word).
• It normalizes words into their base or root forms.
• For example, celebrates, celebrated, and celebrating all originate from the single root word "celebrate."

Problem in Stemming
• The big problem with stemming is that it sometimes produces a root word that has no meaning.

Lemmatization
• Lemmatization is quite similar to stemming, but it overcomes the limitation of stemming.
• It groups the different inflected forms of a word under a single base form, called the lemma.
• The main difference between stemming and lemmatization is that lemmatization produces a root word that has a meaning.
• The output of lemmatization is a proper word.

POS Tags
• POS stands for Parts of Speech.
• The tags include noun, verb, adverb, and adjective.
• A tag indicates how a word functions, both in meaning and grammatically, within a sentence.
• A word can have one or more parts of speech, depending on the context in which it is used.
• Example: "Google" something on the Internet.
• In the above example, "Google" is used as a verb, although it is a proper noun.
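The tokenization, stemming, and lemmatization steps above can be sketched in plain Python. This is a minimal illustration, not a production pipeline: the suffix list in `SUFFIXES` and the lookup table in `LEMMAS` are hypothetical stand-ins for a real stemming algorithm and a real dictionary, and the function names are chosen only for this example.

```python
import re

def tokenize(sentence):
    """Toy tokenizer: lowercase the text and keep runs of letters or digits."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

# Hypothetical suffix list for illustration, tried in this order.
SUFFIXES = ("ing", "ed", "es", "s")

def stem(word):
    """Naive stemming: strip the first matching suffix, keeping at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Hypothetical lookup table standing in for a real dictionary of lemmas.
LEMMAS = {
    "celebrates": "celebrate",
    "celebrated": "celebrate",
    "celebrating": "celebrate",
}

def lemmatize(word):
    """Dictionary lookup: return the lemma if known, else the word unchanged."""
    return LEMMAS.get(word, word)

tokens = tokenize("She celebrates, celebrated and is celebrating.")
stems = [stem(t) for t in tokens]
lemmas = [lemmatize(t) for t in tokens]

print(tokens)
print(stems)
print(lemmas)
```

With this naive rule set, the three inflected forms all reduce to the stem "celebrat", which is not a real word, whereas the dictionary lookup returns the proper word "celebrate". That contrast is the stemming problem the slides describe.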
Named Entity Recognition
• Named Entity Recognition (NER) is the process of detecting named entities such as a person name, organization name, location, or monetary value.
• Example: HelpingHands founder Arhaan lists his Bangalore penthouse for 20 Million rupees.

Chunking
• Chunking collects individual pieces of information and groups them into larger units, such as phrases.
• This helps in extracting insight and meaningful information from the text.

Phases of NLP
• Morphological and Lexical Analysis
• Syntactic Analysis
• Semantic Analysis
• Discourse Integration
• Pragmatic Analysis

Morphological and Lexical Analysis
• The lexicon of a language is its vocabulary, which includes its words and expressions.
• Morphology is the analysis, identification, and description of the structure of words.
• Lexical analysis involves dividing a text into paragraphs, sentences, and words.

Syntactic Analysis
• Syntax concerns the proper ordering of words and its effect on meaning.
• This phase analyzes the words in a sentence to reveal the grammatical structure of the sentence.
• The words are transformed into a structure that shows how they relate to each other.
• E.g., "School went to Raju" would be rejected by an English syntactic analyzer.

Semantic Analysis
• Semantics concerns the (literal) meaning of words, phrases, and sentences.
• This phase abstracts the dictionary meaning, or the exact meaning, from context.
• The structures created by the syntactic analyzer are assigned meaning.
• E.g., “Hot ice cream” is rejected because it does not make sense.

Discourse Integration
• Discourse integration deals with the sense of the context.
• The meaning of any single sentence depends on the sentences that precede it and also influences the meaning of the sentences that follow it.
• E.g., the word “there” in the sentence “He wants to go there” depends on the prior discourse context.

Pragmatic Analysis
• Pragmatics concerns the overall communicative and social context and its effects on interpretation.
• It means abstracting or deriving the purposeful use of language in situations.
• It covers, importantly, those aspects of language that require world knowledge.
• The main focus is on reinterpreting what was said in terms of what was actually meant.
• E.g., “Close the window” should be interpreted as a request rather than an order.
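Returning to the Named Entity Recognition step of the pipeline, the HelpingHands example sentence can be tagged with a very small rule-based sketch. Everything here is an assumption made for illustration: the `GAZETTEER` lookup table, the `MONEY` pattern, and the label names are hypothetical, and real NER systems typically use trained statistical models rather than fixed lists.

```python
import re

# Hypothetical gazetteer: known names mapped to entity labels.
GAZETTEER = {
    "HelpingHands": "ORGANIZATION",
    "Arhaan": "PERSON",
    "Bangalore": "LOCATION",
}

# Hypothetical pattern for monetary values such as "20 Million rupees".
MONEY = re.compile(
    r"\d+(?:\.\d+)?\s+(?:Thousand|Million|Billion)?\s*rupees",
    re.IGNORECASE,
)

def find_entities(text):
    """Return (entity text, label) pairs in order of appearance."""
    found = []
    # Monetary values are matched as whole spans by the regex.
    for match in MONEY.finditer(text):
        found.append((match.start(), match.group(), "MONEY"))
    # Single-word names are looked up in the gazetteer.
    for match in re.finditer(r"[A-Za-z]+", text):
        label = GAZETTEER.get(match.group())
        if label:
            found.append((match.start(), match.group(), label))
    # Sort by character offset so entities come out in reading order.
    return [(entity, label) for _, entity, label in sorted(found)]

sentence = (
    "HelpingHands founder Arhaan lists his Bangalore penthouse "
    "for 20 Million rupees."
)
for entity, label in find_entities(sentence):
    print(f"{entity}: {label}")
```

A fixed list like this only recognizes names it already contains, which is why the lookup is presented as a sketch of the idea rather than a usable recognizer.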