Unit 1 and 2

Natural Language Processing (NLP) is a subfield of AI focused on enabling machines to understand and generate human language. Key applications include machine translation, sentiment analysis, speech recognition, and chatbots, while challenges involve ambiguities in language and issues like vanishing and exploding gradients in neural networks. Techniques such as word embeddings and Hidden Markov Models (HMMs) are utilized to improve the understanding and processing of natural language.


Introduction to NLP:

NLP stands for Natural Language Processing, the domain of artificial intelligence that focuses on analysing human languages and making them understandable to machines.

Natural Language Processing (NLP) is a subfield of Artificial Intelligence (AI) and Computational Linguistics that focuses on enabling computers to understand, interpret, and generate human language in a way that is both meaningful and useful. NLP bridges the gap between human communication and machine understanding.

Key Goals of NLP:

Language Understanding: Teaching machines to comprehend the meaning of text or speech.

Language Generation: Enabling machines to generate human-like text or speech.

Translation and Transformation: Facilitating conversion between languages or transforming text for specific applications.

Advantages of NLP:

NLP lets users ask questions about any subject and get a direct response within seconds.

NLP gives precise answers to a question; it does not return unnecessary or unwanted information.

NLP helps computers communicate with humans in their own languages.

It is very time efficient.

Many companies use NLP to improve the efficiency and accuracy of documentation processes and to extract information from large databases.

Components of NLP:

There are two components of NLP:

Natural Language Understanding (NLU)
Natural Language Generation (NLG)

Natural Language Understanding (NLU): NLU enables machines to understand and analyze human language, supporting business applications by mapping and analysing spoken and written inputs.

Natural Language Generation (NLG): NLG acts as a translator that converts computerized data into a natural language representation. It mainly involves text planning, sentence planning, and text realization.

Applications of NLP:

1. Machine Translation: Translating text from one language to another (e.g., Google Translate).

2. Sentiment Analysis: Understanding emotions and opinions in text (e.g., product reviews).

3. Speech Recognition: Converting spoken language into text (e.g., Siri, Alexa).

4. Chatbots and Virtual Assistants: Providing automated conversational interfaces.

5. Text Summarization: Creating concise summaries of lengthy texts.

6. Information Retrieval: Searching and extracting relevant data (e.g., search engines).

7. Spam Filtering: Identifying and filtering out unwanted emails.

Ambiguity in language

There are three types of ambiguity in language:

Lexical Ambiguity

Lexical ambiguity exists when a single word in a sentence has two or more possible meanings.

Example:

Manya is looking for a match.

In this example, the word "match" could mean that Manya is looking for a partner, or that Manya is looking for a match (a cricket match or some other game).

Syntactic Ambiguity

Syntactic ambiguity exists when a sentence as a whole has two or more possible meanings.

Example:

I saw the girl with the binoculars.

In this example, did I have the binoculars, or did the girl have the binoculars?

Referential Ambiguity

Referential ambiguity exists when something is referred to using a pronoun.

Example:

Harsha went to Varsha. He said, "I am hungry."
In this sentence, you do not know who is hungry: Harsha or Varsha.

Sentence Segmentation

Definition:

Sentence segmentation is the first stage in building an NLP pipeline. It divides a paragraph into distinct sentences.

Example:

One of the most significant holidays for every Indian is Independence Day. Since India gained independence from British rule, August 15th has been the annual celebration of this event. On this day, true independence is celebrated.

Sentence segmentation produces the following output:

"Independence Day is one of the important festivals for every Indian citizen."

"It is celebrated on the 15th of August each year ever since India got independence from the British rule."

"This day celebrates independence in the true sense."
Stemming

Stemming is used to normalize words into their base or root form. For example, "celebrates", "celebrated", and "celebrating" all originate from the single root word "celebrate". The big problem with stemming is that it sometimes produces a root word that has no meaning.

For example, "intelligence", "intelligent", and "intelligently" are all reduced to the single root "intelligen", and in English the word "intelligen" has no meaning.
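A minimal stemming sketch with NLTK's PorterStemmer (one common stemmer; others such as Snowball behave slightly differently). Porter typically produces truncated stems such as "celebr" and "intellig", which illustrates the point above that a stem need not be a dictionary word:

# Stemming sketch with NLTK's PorterStemmer.
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
words = ["celebrates", "celebrated", "celebrating",
         "intelligence", "intelligent", "intelligently"]
for word in words:
    print(word, "->", stemmer.stem(word))   # stems may be truncated non-words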


Tokenization

A word tokenizer is used to break a sentence into separate words, or tokens.

Example:

SR University offers Corporate Training, Summer Training, Online Training, and Winter Training.

The word tokenizer generates the following result:

"SR University", "offers", "Corporate", "Training", "Summer", "Training", "Online", "Training", "and", "Winter", "Training", "."
Word Senses and Word Embeddings:

Word Senses

Word senses refer to the different meanings a word can have depending on the context, i.e. the distinct meanings or interpretations of a word. In NLP, resolving word senses is a crucial step in understanding text accurately, because many words are polysemous (they have multiple related meanings) or homonymous (they have multiple unrelated meanings). For example, the word "bank" can mean:

- A financial institution.
- The side of a river.

Applications of Word Senses in NLP:

Machine Translation:
Choosing the correct word sense ensures an accurate translation.
Example:
English: bank (financial institution).
French: banque (not rive, which means river bank).

Information Retrieval:
Disambiguating user queries improves search accuracy.
Example:
Query: Apple.
Sense: a fruit vs. a tech company.

Chatbots and Virtual Assistants:
Resolving word senses enhances user interaction.
Example:
User: What is the capital of the bank?
Correct sense: bank as a financial institution.

Sentiment Analysis:
Accurate sense identification refines sentiment interpretation.
Example:
That movie was sick!
Sense: sick as cool (positive) vs. unwell (negative).

Question Answering Systems:
Understanding the intended word sense improves response accuracy.
Example:
Question: What does a bat eat?
Correct sense: bat as an animal.
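Word sense disambiguation can be prototyped with the classic Lesk algorithm, which picks the WordNet sense whose dictionary definition overlaps most with the surrounding context. A rough sketch using NLTK (assumes the "wordnet" corpus is downloaded; Lesk is only a simple baseline and often picks the wrong sense):

# Lesk word sense disambiguation sketch (requires NLTK's wordnet corpus).
import nltk
nltk.download("wordnet", quiet=True)
from nltk.wsd import lesk

context = "I deposited the cheque at the bank near my office".split()
sense = lesk(context, "bank")                  # returns a WordNet Synset (or None)
if sense is not None:
    print(sense.name(), "-", sense.definition())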
Word Embeddings

Word embeddings are numerical vector representations of words that capture their semantic meaning. Words with similar meanings have embeddings that are closer together in the vector space.

Word embeddings are a fundamental technique in Natural Language Processing (NLP) that represents words as vectors in a continuous vector space. This allows algorithms to capture semantic relationships between words. They also help in reducing the dimensionality of text data while preserving the context and meaning of words.

Popular Word Embedding Techniques:

- Word2Vec
- GloVe (Global Vectors for Word Representation)
- FastText

Applications of word embeddings:

Text Classification: Detect spam emails or categorize news articles.

Machine Translation: Map words between languages (e.g., English to French).

Sentiment Analysis: Analyze the sentiment of movie or product reviews.

Question Answering: Help AI understand and answer questions based on context.

Named Entity Recognition (NER): Identify entities like names, dates, and places in text.
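As a toy illustration, Word2Vec embeddings can be trained with the gensim library. This is a sketch only: the corpus below is invented and far too small for meaningful vectors, and GloVe or FastText vectors are usually loaded from pre-trained files instead:

# Toy Word2Vec training sketch with gensim.
from gensim.models import Word2Vec

corpus = [
    ["nlp", "helps", "machines", "understand", "human", "language"],
    ["word", "embeddings", "capture", "semantic", "meaning"],
    ["machines", "learn", "language", "from", "large", "text", "corpora"],
]
model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, epochs=100)

print(model.wv["language"][:5])                   # first few dimensions of one word vector
print(model.wv.most_similar("language", topn=3))  # nearest neighbours in this toy space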
RNN and HMM:

RNN (Recurrent Neural Network):

Recurrent Neural Networks (RNNs) are a class of neural networks specifically designed to handle sequential data, making them particularly well-suited for tasks in Natural Language Processing (NLP). Unlike traditional feedforward neural networks, RNNs have loops that allow information to be passed from one step of the network to the next, enabling the network to maintain a memory of previous inputs.

Key Features of RNNs:

Sequential Data Handling: RNNs are designed to process sequences of data, such as sentences or time series, where the order of the elements is important.

Hidden State: RNNs maintain a hidden state that is updated at each time step, capturing information about previous elements in the sequence. This hidden state helps the network retain context and understand dependencies between words in a sentence.

Shared Weights: The same set of weights is applied at each time step, allowing RNNs to generalize across different positions in the input sequence.

RNNs are widely used in various NLP tasks, including:

Language Modeling: Predicting the next word in a sequence based on the previous words.

Text Generation: Generating coherent text by predicting sequences of words, character by character or word by word.

Machine Translation: Translating text from one language to another by processing the input sequence and generating the corresponding sequence in the target language.

Speech Recognition: Converting spoken language into text by processing the sequence of audio features.

Sentiment Analysis: Analyzing sequences of text to determine the sentiment (positive, negative, neutral) expressed in the text.

Architecture of an RNN:

Input Layer: Accepts sequential input data (e.g., words in a sentence, stock prices).

Hidden Layer: Contains a recurrent connection that allows the network to remember past states.

Output Layer: Generates the output at each time step or after processing the entire sequence.
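The input/hidden/output structure above can be sketched with PyTorch's built-in nn.RNN module. This is a generic illustration; the vocabulary size, dimensions, and the classification head are arbitrary choices for the sketch, not part of any specific task:

# Minimal RNN sketch in PyTorch: embeddings -> recurrent layer -> per-sequence output.
import torch
import torch.nn as nn

class TinyRNNClassifier(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=32, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)             # input layer (token ids -> vectors)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)   # hidden layer with recurrence
        self.out = nn.Linear(hidden_dim, num_classes)                # output layer

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        embedded = self.embed(token_ids)           # (batch, seq_len, embed_dim)
        outputs, last_hidden = self.rnn(embedded)  # last_hidden: (1, batch, hidden_dim)
        return self.out(last_hidden.squeeze(0))    # use the final hidden state for the whole sequence

model = TinyRNNClassifier()
dummy_batch = torch.randint(0, 1000, (4, 12))     # 4 sequences of 12 token ids
print(model(dummy_batch).shape)                   # torch.Size([4, 2])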
Challenges with RNNs:

Vanishing Gradient Problem: Gradients diminish over long sequences, making it difficult for the network to learn long-term dependencies.

Exploding Gradient Problem: Gradients can grow excessively large during backpropagation, causing instability.

Limited Memory: Difficulty in handling very long sequences due to reliance on hidden states.
depends only on the one right before it.
Vanishing Gradients and Exploding Gradients are two common problems encountered during the training of deep neural networks, particularly Recurrent Neural Networks (RNNs) and other deep architectures. These issues arise during the backpropagation process, which is used to update the model's weights by calculating gradients.

Vanishing Gradients:

The vanishing gradient problem occurs when the gradients of the loss function with respect to the model's parameters become very small as they are propagated backward through the network. This leads to very small updates to the model's weights, effectively stalling learning, particularly in the early layers of the network. The problem is especially prevalent in deep networks and in RNNs that try to capture long-term dependencies.

In RNNs: When processing long sequences, the contributions from earlier inputs diminish exponentially, making it difficult for the network to learn relationships between distant inputs in the sequence.

Impact: The model struggles to learn and represent long-range dependencies in the data, leading to poor performance on tasks that require understanding context over long sequences (e.g., NLP tasks like language modeling or translation).

Exploding Gradients:

The exploding gradient problem occurs when the gradients become excessively large during backpropagation. This can cause the model's weights to grow exponentially, leading to numerical instability. The model may diverge during training, making it impossible to learn anything meaningful.

In RNNs: This typically happens when there are large weight values or when trying to model very complex sequences, where the error gradients multiply and grow rapidly as they propagate backward through time.

Impact: The training process becomes unstable, the model's loss function often results in "NaN" (Not a Number) values, and the model's performance deteriorates.
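A common mitigation for exploding gradients is gradient clipping, which caps the gradient norm before the weight update (vanishing gradients are usually tackled with gated architectures such as LSTMs or GRUs). A self-contained toy sketch of clipping in a PyTorch training step, with made-up data and dimensions:

# Gradient clipping sketch in PyTorch (toy data; illustrative only).
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
head = nn.Linear(16, 2)
params = list(rnn.parameters()) + list(head.parameters())
optimizer = torch.optim.SGD(params, lr=0.1)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(4, 20, 8)           # batch of 4 sequences, 20 steps, 8 features
targets = torch.randint(0, 2, (4,))      # one label per sequence

optimizer.zero_grad()
_, last_hidden = rnn(inputs)                               # last_hidden: (1, batch, hidden)
loss = loss_fn(head(last_hidden.squeeze(0)), targets)
loss.backward()                                            # backpropagation through time
torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)       # rescale overly large gradients
optimizer.step()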
HIDDEN MARKOV MODELS:

Markov chain: A Markov chain is a way to model a system where what happens next depends only on what is happening right now, not on what happened before.

In simple terms, it is a chain of events where each event depends only on the one right before it.
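A Markov chain can be written down directly as a table of transition probabilities and then sampled step by step. A toy weather sketch (the states and the numbers are invented for illustration):

# Toy Markov chain: tomorrow's weather depends only on today's weather.
import random

transition = {                       # P(next state | current state)
    "Sunny": {"Sunny": 0.8, "Rainy": 0.2},
    "Rainy": {"Sunny": 0.4, "Rainy": 0.6},
}

current = "Sunny"
chain = [current]
for _ in range(7):
    probs = transition[current]
    current = random.choices(list(probs.keys()), weights=list(probs.values()))[0]
    chain.append(current)
print(" -> ".join(chain))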

A Hidden Markov Model (HMM) in NLP is a tool used to predict things that we cannot directly see (hidden states) based on things we can observe.

Example: Imagine you are trying to guess the weather based on how people are dressed. You cannot see the weather directly (that is the hidden part), but you can see whether people are wearing coats, hats, or sunglasses (these are the observable parts). The model helps you predict the weather (hidden state) from the outfits (observations) and the probabilities of transitioning between different weather types (sunny, rainy, etc.).

The model defines probabilities for transitioning between hidden states and for generating observable symbols, allowing dynamic systems to be modelled under uncertainty.

Because of their flexibility, HMMs are useful for modelling dynamic systems and forecasting future states from previously observed sequences.

In NLP, HMMs are used for tasks like part-of-speech tagging, where the hidden states are the grammatical categories (noun, verb, etc.) and the observations are the words of the sentence. The model infers the hidden tags from the words and patterns in the text.

States:

Hidden States: The actual states of the system, which are not directly observable. Instead, they are inferred from the observable outputs.

Observable States: The outputs or observations that can be directly seen or measured.

Markov Property:

The Markov property assumes that the probability of transitioning to the next state depends only on the current state, not on the sequence of previous states. This is known as the first-order Markov property.
An HMM consists of two types of variables: hidden states and observations.

The hidden states are the underlying variables that generate the observed data, but they are not directly observable.

The observations are the variables that are actually measured and observed.

The Hidden Markov Model (HMM) describes the relationship between the hidden states and the observations using two sets of probabilities: the transition probabilities and the emission probabilities.

The transition probabilities describe the probability of transitioning from one hidden state to another.

The emission probabilities describe the probability of observing a given output from a given hidden state.
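Continuing the weather/clothing illustration, the two sets of probabilities can be written as small matrices. All numbers below are invented for the sketch:

# Hidden states, observations, and the two HMM probability tables (toy numbers).
import numpy as np

hidden_states = ["Sunny", "Rainy"]          # what we cannot observe directly
observations = ["sunglasses", "coat"]       # what we can observe

# Transition probabilities: row = current hidden state, column = next hidden state.
A = np.array([[0.8, 0.2],
              [0.4, 0.6]])

# Emission probabilities: row = hidden state, column = observed symbol.
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Initial state distribution.
pi = np.array([0.6, 0.4])

print("P(Rainy -> Sunny) =", A[1, 0])
print("P(coat | Rainy)   =", B[1, 1])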

Hidden Markov Model Algorithm:

The Hidden Markov Model (HMM) algorithm can be implemented using the following steps:

Step 1: Define the state space and observation space

The state space is the set of all possible hidden states, and the observation space is the set of all possible observations.

Step 2: Define the initial state distribution

This is the probability distribution over the initial hidden state.

Step 3: Define the state transition probabilities

These are the probabilities of transitioning from one hidden state to another. Together they form the transition matrix, which describes the probability of moving from one state to another.

Step 4: Define the observation likelihoods

These are the probabilities of generating each observation from each state. Together they form the emission matrix.

Step 5: Train the model

The state transition probabilities and the observation likelihoods are estimated using the Baum-Welch algorithm (also known as the forward-backward algorithm). This is done by iteratively updating the parameters until convergence.

Step 6: Decode the most likely sequence of hidden states

Given the observed data, the Viterbi algorithm is used to compute the most likely sequence of hidden states. This can be used to predict future observations, classify sequences, or detect patterns in sequential data.

Step 7: Evaluate the model

The performance of the HMM can be evaluated using metrics such as accuracy, precision, and recall.
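Putting the steps together, the decoding step (Step 6) can be sketched as a small Viterbi implementation over the toy weather model from the earlier sketch (the matrices are the same invented values; training with Baum-Welch is omitted):

# Viterbi decoding sketch: most likely hidden-state sequence for an observation sequence.
import numpy as np

hidden_states = ["Sunny", "Rainy"]                     # toy model from the earlier sketch
A  = np.array([[0.8, 0.2], [0.4, 0.6]])                # transition probabilities
B  = np.array([[0.9, 0.1], [0.2, 0.8]])                # emission probabilities
pi = np.array([0.6, 0.4])                              # initial state distribution

def viterbi(obs_indices, A, B, pi):
    n_states, T = A.shape[0], len(obs_indices)
    delta = np.zeros((T, n_states))            # best path probability ending in each state
    psi = np.zeros((T, n_states), dtype=int)   # back-pointers

    delta[0] = pi * B[:, obs_indices[0]]
    for t in range(1, T):
        for j in range(n_states):
            scores = delta[t - 1] * A[:, j]
            psi[t, j] = np.argmax(scores)
            delta[t, j] = scores.max() * B[j, obs_indices[t]]

    # Backtrack from the best final state.
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# Observed sequence: sunglasses, coat, coat (indices into the observation list).
best = viterbi([0, 1, 1], A, B, pi)
print([hidden_states[i] for i in best])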
