
UNIT 6 Applications of NLP

Machine translation (MT) is a subfield of computational linguistics and artificial intelligence (AI) that focuses on the automatic translation of text or speech from one language to another. The goal of machine translation is to enable communication between people who speak different languages without the need for human translators.

Machine Translation Approaches –


Rule-Based Machine Translation –
Rule-based machine translation is also known as the knowledge-driven approach. It was the first approach developed in the field of machine translation and is based on linguistic information. The translation system consists of a collection of grammar rules, a lexicon, and software programs to process the rules. It produces more predictable, grammatically consistent output because it performs syntactic, semantic, and morphological analysis of both the source language and the target language. Building a rule-based machine translation system is expensive, since all the rules of the language must be encoded and a large amount of linguistic knowledge is required. Once built, however, it can be analysed deeply at the syntactic and semantic level. There are three types of rule-based machine translation: direct translation, transfer-based translation, and interlingua translation.
1. Direct translation –
Direct translation is a word-by-word translation approach: it translates the source language directly into the target language. It is unidirectional, bilingual machine translation. It requires a large amount of morphological analysis but only a little syntactic and semantic analysis.
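As a toy illustration of the direct (dictionary-lookup) approach, the sketch below translates word by word using a small hand-written English–Spanish lexicon. The lexicon entries are invented for the example; a real rule-based system would also apply morphological and reordering rules.

# Minimal sketch of direct (word-by-word) translation with a toy lexicon.
# The dictionary entries are illustrative only; a real system would add
# morphological analysis and word-reordering rules.
lexicon = {
    "the": "el",
    "cat": "gato",
    "is": "está",
    "sleeping": "durmiendo",
}

def direct_translate(sentence: str) -> str:
    words = sentence.lower().split()
    # Look each word up in the bilingual lexicon; keep unknown words as-is.
    return " ".join(lexicon.get(w, w) for w in words)

print(direct_translate("The cat is sleeping"))  # -> "el gato está durmiendo"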
2. Transfer-based translation –
Transfer-based translation involves three stages: analysis, transfer, and generation (Sindhu, 2014). In the analysis stage, the source sentence is analysed and converted into a syntactic representation of the source language. In the transfer stage, that syntactic representation of the source language is transferred to a syntactic representation of the target language. In the generation stage, the target-language text is generated with the help of a morphological generator.
3. Interlingua translation –
Interlingua is a combination of two Latin words Inter and Lingua which means intermediary and
Language respectively The source language is transformed into intermediate language then the
intermediate language is transformed into target language. There is no language pair involves;
therefore, it can be used in multilingual machine translation.

Statistical machine translation (SMT) -


Statistical machine translation (SMT) is a type of machine translation (MT) that uses statistical
models to translate text from one language to another. It is a subfield of natural language processing
(NLP) that involves analyzing large amounts of bilingual text to build models that can accurately
translate between languages.
Statistical machine translation consists of three components: a language model, a translation model, and a decoder.
The language model assigns a probability P(T) to a candidate target sentence T, measuring how fluent the sentence is in the target language.
The translation model computes the conditional probability P(S|T) of the source sentence S given a candidate target sentence T, capturing how well the words and phrases correspond.
The decoder searches for the best translation by maximizing the product of the two probabilities, which by Bayes' rule is equivalent to maximizing P(T|S):
T* = argmax_T P(T) * P(S|T)
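The sketch below illustrates the noisy-channel scoring on a toy scale. The phrase table and language-model probabilities are invented for the example; a real decoder would search over phrase segmentations and reorderings rather than enumerating a handful of candidates.

import math

# Toy translation model P(S|T): probability of the English source sentence
# given a Spanish candidate. All numbers are invented for illustration.
translation_model = {
    ("the cat sleeps", "el gato duerme"): 0.7,
    ("the cat sleeps", "el gato dormir"): 0.2,
}

# Toy language model P(T): fluency of each Spanish candidate.
language_model = {
    "el gato duerme": 0.05,
    "el gato dormir": 0.001,
}

def score(source: str, target: str) -> float:
    # log P(T) + log P(S|T): the quantity the decoder maximizes.
    return math.log(language_model[target]) + \
           math.log(translation_model[(source, target)])

source = "the cat sleeps"
candidates = [t for (s, t) in translation_model if s == source]
best = max(candidates, key=lambda t: score(source, t))
print(best)  # -> "el gato duerme"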

Neural machine translation (NMT) –
Neural machine translation utilizes deep learning
models, particularly sequence-to-sequence models or transformer models, to learn translation
patterns from training data. NMT learns to generate translations by processing the entire
sentence, considering the context and dependencies between words. It has demonstrated
significant improvements in translation quality and fluency. NMT can handle long-range
dependencies and produce more natural-sounding translations.
Example: NMT takes an input sentence like "The cat is sleeping" and generates a translation
like "El gato está durmiendo" in Spanish, capturing the context and idiomatic expression
accurately.
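For a concrete sketch, the snippet below runs a pretrained NMT model through the Hugging Face transformers pipeline. The checkpoint name Helsinki-NLP/opus-mt-en-es is an assumption used for illustration; any English-to-Spanish sequence-to-sequence model would serve the same purpose.

# Minimal sketch of using a pretrained NMT model (assumes the `transformers`
# package and the Helsinki-NLP/opus-mt-en-es checkpoint are available;
# swap in any English-to-Spanish seq2seq model).
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")
result = translator("The cat is sleeping")
print(result[0]["translation_text"])  # e.g. "El gato está durmiendo"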
Hybrid machine translation (HMT) –
Hybrid machine translation may incorporate rule-
based, statistical and neural components to enhance translation quality. For example, a hybrid
system might use rule-based methods for handling specific linguistic phenomena, statistical
models for general translation patterns, and neural models for generating fluent and
contextually aware translations.
Example: A hybrid system could use a rule-based approach for handling grammatical rules,
statistical models for common phrases, and a neural model to generate fluent translations with
improved context understanding.
Example-based machine translation (EBMT) –
Example-based machine translation relies
on a database of previously translated sentences or phrases to generate translations. It
searches for similar examples in the database and retrieves the most relevant translations.
EBMT is useful when dealing with specific domains or highly repetitive texts but may
struggle with unseen or creative language usage.
Example: If the sentence, "The cat is playing," has been previously translated as "El gato
está jugando," EBMT can retrieve that translation as a reference to translate a new sentence,
"The cat is eating."

What is Sentiment Analysis?


Sentiment analysis is the process of classifying whether a block of text is positive, negative, or neutral. The goal of sentiment mining is to analyse people's opinions in a way that can help businesses grow. It focuses not only on polarity (positive, negative and neutral) but also on emotions (happy, sad, angry, etc.). It can be implemented with rule-based, automatic (machine-learning based), or hybrid approaches.
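To make the rule-based variant concrete, the sketch below scores a sentence against a tiny hand-written sentiment lexicon. The word weights are invented for illustration; practical rule-based systems use much larger lexicons plus negation and intensifier rules.

# Minimal rule-based (lexicon) sentiment scorer; word weights are toy values.
lexicon = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}

def classify(text: str) -> str:
    # Sum the scores of known words; strip trailing punctuation before lookup.
    score = sum(lexicon.get(w.strip(".,!?"), 0) for w in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify("I love this phone, the camera is great"))  # -> "positive"
print(classify("The battery life is terrible"))            # -> "negative"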
Types of sentiment analysis
Sentiment analysis systems fall into several different categories:
Fine-grained sentiment analysis breaks down sentiment indicators into more precise categories, such as very positive and very negative. This is similar to opinion ratings on a one-to-five-star scale, which makes the approach effective for grading customer satisfaction surveys.
Emotion detection analysis identifies emotions rather than positivity and negativity.
Examples include happiness, frustration, shock, anger and sadness.
Intent-based analysis recognizes motivations behind a text in addition to opinion. For
example, an online comment expressing frustration about changing a battery may carry the
intent of getting customer service to reach out to resolve the issue.
Aspect-based analysis examines the specific component being positively or negatively
mentioned. For example, a customer might review a product saying the battery life was too
short. The sentiment analysis system will note that the negative sentiment isn't about the
product as a whole but about the battery life.
How does Sentiment Analysis work?
Sentiment analysis in NLP is used to determine the sentiment expressed in a piece of text, such as a review, comment, or social media post. The goal is to identify whether the expressed sentiment is positive, negative, or neutral. At a high level the process has two steps: preprocessing and analysis.
Preprocessing
The first step is collecting the text data that needs to be analysed for sentiment, such as customer reviews, social media posts, news articles, or any other form of textual content. The collected text is then pre-processed to clean and standardize the data through several tasks (a minimal sketch follows the list):
• Removing irrelevant information (e.g., HTML tags, special characters).
• Tokenization: breaking the text into individual words or tokens.
• Removing stop words (common words like "and", "the", etc. that don't contribute much to sentiment).
• Stemming or lemmatization: reducing words to their root form.
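A minimal preprocessing sketch using NLTK is shown below. It assumes the punkt, stopwords, and wordnet resources have been downloaded; a real pipeline would add more cleaning (emoji handling, spelling normalization, etc.).

# Minimal preprocessing sketch with NLTK (assumes nltk is installed and that
# nltk.download("punkt"), nltk.download("stopwords") and nltk.download("wordnet")
# have already been run).
import re
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

def preprocess(text: str) -> list[str]:
    text = re.sub(r"<[^>]+>", " ", text)                  # remove HTML tags
    text = re.sub(r"[^a-zA-Z\s]", " ", text)              # remove special characters
    tokens = word_tokenize(text.lower())                  # tokenization
    tokens = [t for t in tokens if t not in stop_words]   # stop-word removal
    return [lemmatizer.lemmatize(t) for t in tokens]      # lemmatization

print(preprocess("The battery <b>died</b> after two days!!!"))
# e.g. ['battery', 'died', 'two', 'day']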
Analysis
The cleaned text is converted into features using techniques like bag-of-words or word embeddings (e.g., Word2Vec, GloVe). Models are then trained on labeled datasets that associate text with sentiments (positive, negative, or neutral). After training and validation, the model predicts sentiment on new data, assigning labels based on the patterns it has learned; a small end-to-end sketch follows.
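The sketch below trains a bag-of-words sentiment classifier with scikit-learn. The six training sentences are invented for illustration; a real system would train on thousands of labeled examples.

# Minimal bag-of-words sentiment classifier with scikit-learn.
# The tiny training set is illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "I love this product, it works great",
    "Absolutely fantastic experience",
    "Best purchase I have made",
    "Terrible quality, it broke in a week",
    "I hate the interface, very confusing",
    "Worst customer service ever",
]
train_labels = ["positive", "positive", "positive",
                "negative", "negative", "negative"]

# Vectorize the text (bag-of-words) and fit a simple linear classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

print(model.predict(["The camera is great but the battery is terrible"]))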
Applications –
• Customer Feedback Analysis
• Brand Monitoring and Reputation Management
• Market Research
• Political Analysis
• Social Media Content
What are the challenges in Sentiment Analysis?
There are several major challenges in the sentiment analysis approach:
• If the sentiment is carried mainly by tone, it becomes really difficult to detect whether the comment is pessimistic or optimistic.
• If the data contains emojis, the system needs to detect whether they convey something good or bad.
• Detecting ironic, sarcastic, and comparative comments is really hard.
• Classifying neutral statements correctly is a big task.

What is a question-answering System?


Question answering (QA) is a field of natural language processing (NLP) and artificial
intelligence (AI) that aims to develop systems that can understand and answer questions
posed in natural language.
A natural language question-answering (QA) system is a computer program that automatically answers questions using NLP. The basic process of a natural language QA system includes the following steps (a minimal end-to-end sketch follows the list):
1. Text pre-processing: The question is pre-processed to remove irrelevant information
and standardise the text’s format. This step includes tokenisation, lemmatisation,
and stop-word removal, among others.
2. Question understanding: The pre-processed question is analysed to extract the
relevant entities and concepts and to identify the type of question being asked. This
step can be done using natural language processing (NLP) techniques such as named
entity recognition, dependency parsing, and part-of-speech tagging.
3. Information retrieval: The question is used to search a database or corpus of text to
retrieve the most relevant information. This can be done using information retrieval
techniques such as keyword search or semantic search.
4. Answer generation: The retrieved information is analysed to extract the specific
answer to the question. This can be done using various techniques, such as machine
learning algorithms, rule-based systems, or a combination.
5. Ranking: The extracted answers are ranked based on relevance and confidence score.
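The sketch below implements a bare-bones IR-based QA pipeline with scikit-learn: TF-IDF retrieval over a tiny hand-written corpus, followed by returning the best-matching sentence as the answer with its similarity score as a crude ranking. The corpus and question are invented for the example; real systems use far more sophisticated answer extraction and ranking.

# Bare-bones IR-based QA: TF-IDF retrieval over a toy corpus, returning the
# highest-scoring sentence as the "answer". Corpus sentences are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Machine translation automatically translates text between languages.",
    "Sentiment analysis classifies text as positive, negative, or neutral.",
    "Question answering systems answer questions posed in natural language.",
]

question = "What does sentiment analysis do?"

vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(corpus)     # index the corpus
query_vector = vectorizer.transform([question])    # vectorize the question

scores = cosine_similarity(query_vector, doc_vectors)[0]  # relevance scores
ranked = scores.argsort()[::-1]                           # ranking step

print("Answer:", corpus[ranked[0]])
print("Confidence:", round(scores[ranked[0]], 3))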
Approaches –
• IR-based
• Knowledge-based
• Generative
• Hybrid
• Rule-based

Applications of Question Answering System –


1. Customer service: QA systems can be used to answer customers’ questions quickly
and correctly, reducing the need for human customer service reps.
2. Search engines: QA systems can make search results more accurate and valuable by
answering specific questions instead of just giving a list of relevant documents.
3. Finance: QA systems can tell financial advisors about the latest market trends and
investment strategies.
4. E-commerce: QA systems can be used to recommend products to customers and answer their questions about the features and availability of those products.
5. Voice assistants: QA systems can be connected to voice assistants so that users can
conversationally get answers to their questions.
6. Chatbots: QA systems can be linked to chatbots so that users can naturally get
answers to their questions.
https://spotintelligence.com/2023/01/20/question-answering-qa-system-nlp/
Challenges in Question Answering -
• Lexical Gap
• Ambiguity
• Multilingualism

Text entailment -
Text entailment, also known as Recognizing Textual Entailment (RTE), is a natural language
processing (NLP) task that focuses on determining whether one text snippet logically entails
another text snippet. In simpler terms, it assesses whether a given statement (the hypothesis)
can be inferred or logically deduced from another statement (the premise).
Here's a more detailed explanation of text entailment:
Task Definition:
Given a premise sentence (P) and a hypothesis sentence (H), the task of text entailment is to
determine if the meaning of H is entailed by the meaning of P.
The task is typically framed as a binary classification problem, where the model predicts
whether H is entailed (True) or not entailed (False) by P.
Examples:
Premise (P): "The cat is sleeping on the mat."
Hypothesis (H): "The cat is resting."
In this example, H can be logically inferred from P, so the entailment label is True.
Premise (P): "The sky is blue."
Hypothesis (H): "The grass is green."
Here, there is no logical connection between P and H, so the entailment label is False.
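For a concrete sketch, the snippet below scores a premise–hypothesis pair with a pretrained natural language inference model via Hugging Face transformers. The checkpoint name roberta-large-mnli is an assumption for illustration; any NLI-finetuned model with entailment/neutral/contradiction labels would work.

# Minimal textual-entailment sketch with a pretrained NLI model.
# Assumes the `transformers` package and the roberta-large-mnli checkpoint;
# any NLI model with entailment/neutral/contradiction labels would do.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "The cat is sleeping on the mat."
hypothesis = "The cat is resting."

# Encode the (premise, hypothesis) pair and score the three NLI labels.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]

for i, p in enumerate(probs):
    print(model.config.id2label[i], round(p.item(), 3))
# The entailment label should receive the highest probability for this pair.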

Dialog and Conversational Agents -


A conversational agent is any dialogue system that conducts natural language processing
(NLP) and responds automatically using human language. Conversational agents represent
the practical implementation of computational linguistics, and are usually deployed as
chatbots and virtual or AI assistants.
A conversational agent is a virtual agent you can use to communicate with a human in natural
language. It simulates human-to-human interaction and understands context and meaning just
as humans do.
Difference between chatbots and conversational agents
A chatbot is a software program designed to simulate human conversation. You can use it to provide information, answer questions, perform tasks, and make purchases. Chatbots can be rule-based, which means they respond only to specific text or button inputs. In addition, these bots are more narrowly focused on their objectives: for example, they are employed either to resolve particular customer queries, such as looking up an insurance policy, or to help with the e-commerce checkout process. A toy rule-based chatbot is sketched after this comparison.
On the other hand, conversational agents are programs that use NLP and natural language
understanding (NLU) technology to converse with humans. The program can understand
human emotions, answer basic questions, respond to commands, and interact through natural
language conversations. These agents are often used to automate customer support and
marketing campaigns.
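To illustrate the rule-based end of the spectrum, here is a toy pattern-matching chatbot. The intents and replies are invented; a conversational agent would replace the naive keyword rules with NLP/NLU models for intent detection and entity extraction.

# Toy rule-based chatbot: keyword rules map user input to canned replies.
# Intents and responses are invented for illustration; matching is naive
# substring matching, which a real system would replace with NLU.
RULES = {
    ("hello", "hi", "hey"): "Hello! How can I help you today?",
    ("policy", "insurance"): "You can look up your insurance policy in the customer portal.",
    ("checkout", "payment", "order"): "I can help you complete your checkout. What is your order number?",
}

def respond(message: str) -> str:
    text = message.lower()
    for keywords, reply in RULES.items():
        if any(k in text for k in keywords):
            return reply
    return "Sorry, I didn't understand that. Could you rephrase?"

print(respond("Hi there"))
print(respond("I need to check my insurance policy"))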
Applications –
Iris: Conversational agent for data science tasks:
Description: Iris is a conversational agent designed to assist users with data science tasks,
such as data analysis, visualization, and model building.
Woebot: Mental Health App:
Description: Woebot is a mental health app that provides conversational therapy and support
to users dealing with stress, anxiety, depression, and other mental health issues.
Roof.ai: Real estate conversational AI chatbot:
Description: Roof.ai is a conversational AI chatbot designed for the real estate industry to
assist users with property search, inquiries, and transactions.

Natural language generation (NLG)


https://www.techtarget.com/searchenterpriseai/definition/natural-language-generation-NLG
