Unit 1 and 2
NLP helps users ask questions about any subject and get a direct response within seconds. NLP offers exact answers to a question, meaning it does not return unnecessary or unwanted information.

"It is celebrated on the 15th of August each year ever since India got independence from British rule."
"This day celebrates independence in the true sense."

Stemming is used to normalize words into their base or root form. For example, celebrates, celebrated, and celebrating all originate from the single root word "celebrate." The big problem with stemming is that it sometimes produces a root word that has no meaning.

For example, intelligence, intelligent, and intelligently all reduce to the single root "intelligen." In English, the word "intelligen" has no meaning.

Lexical Ambiguity:
Lexical ambiguity exists in the presence of two or more possible meanings within a single word.

Example (translation): the English word "bank" becomes French Banque (not rive, which means river bank).

Information Retrieval:
Disambiguating user queries improves search accuracy.
Example:
Query: Apple.
Sense: A fruit vs. a tech company.

Chatbots and Virtual Assistants:
Resolving word senses enhances user interaction.
Example:
User: What is the capital of the bank?
Word Senses and Word Embeddings:
Example:
Question: What does a bat eat?
Correct sense: Bat as an animal.
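The sense-resolution examples above can be sketched with a simplified Lesk-style disambiguator: each candidate sense carries a short gloss, and the sense whose gloss overlaps most with the context words wins. The glosses, sense names, and stopword list below are invented for illustration:

```python
# Words too common to be informative for gloss overlap (illustrative list).
STOPWORDS = {"a", "an", "the", "what", "does", "is", "to", "in", "or", "and", "at"}

# Hypothetical sense inventory with hand-written glosses.
SENSES = {
    "bat": {
        "animal": "flying nocturnal mammal that likes to eat insects and fruit",
        "sports": "wooden club used to hit a ball in cricket or baseball",
    },
}

def disambiguate(word, context):
    """Return the sense whose gloss shares the most content words with the context."""
    context_words = set(context.lower().split()) - STOPWORDS

    def overlap(sense):
        gloss_words = set(SENSES[word][sense].split()) - STOPWORDS
        return len(context_words & gloss_words)

    return max(SENSES[word], key=overlap)

print(disambiguate("bat", "What does a bat eat"))  # -> animal
```

The same call with a sports context (for example "he swung the bat at the ball in cricket") picks the other sense, because the overlap is computed against each gloss separately.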
RNN (Recurrent Neural Network):
Recurrent Neural Networks (RNNs) are a class of neural networks specifically designed to handle sequential data, making them particularly well suited for tasks in Natural Language Processing (NLP). Unlike traditional feedforward neural networks, RNNs have loops that allow information to be passed from one step of the network to the next, enabling the network to maintain a memory of previous inputs.

Key Features of RNNs:
Hidden Layer:
Contains a recurrent connection that allows the network to remember past states.
Output Layer:
Generates the output at each time step or after processing the entire sequence.

Challenges with RNNs:
Vanishing Gradient Problem:
Gradients diminish over long sequences, making it difficult for the network to learn long-term dependencies.
Exploding Gradient Problem:
Gradients can grow excessively large during backpropagation, causing instability.
Limited Memory:
Difficulty in handling very long sequences due to reliance on hidden states.
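Both gradient problems can be illustrated with a scalar RNN, h_t = tanh(w_h * h_(t-1) + w_x * x_t): backpropagating through T steps multiplies T factors of w_h * (1 - h_t^2), so the product shrinks toward zero when |w_h| < 1 and can blow up when |w_h| > 1. The weight values below are made up for illustration, not trained parameters:

```python
import math

def run(w_h, w_x, xs):
    """Forward pass of a scalar RNN; also returns d h_T / d h_0."""
    h, grad = 0.0, 1.0
    for x in xs:
        h = math.tanh(w_h * h + w_x * x)
        # Each step contributes one factor w_h * tanh'(pre-activation)
        # = w_h * (1 - h^2) to the gradient product.
        grad *= w_h * (1.0 - h * h)
    return h, grad

# The hidden state carries a (decaying) trace of the first input.
h, _ = run(w_h=0.5, w_x=1.0, xs=[1.0, 0.0, 0.0])

# With zero input, h stays at 0 and each factor is exactly w_h,
# isolating the effect of the recurrent weight over 50 steps.
_, g_small = run(w_h=0.5, w_x=1.0, xs=[0.0] * 50)  # ~8.9e-16: vanishing
_, g_large = run(w_h=3.0, w_x=1.0, xs=[0.0] * 50)  # ~7.2e23: exploding
print(g_small, g_large)
```

In a full vector RNN the factors are Jacobian matrices rather than scalars, but the same repeated multiplication drives the gradient toward zero or toward overflow, which is where the "NaN" losses come from.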
Vanishing Gradients and Exploding Gradients are two common problems encountered during the training of deep neural networks, particularly in Recurrent Neural Networks (RNNs) and other deep architectures. These issues arise during the backpropagation process, which is used to update the model's weights by calculating gradients.

Vanishing Gradients:
The Vanishing Gradient problem occurs when the gradients of the loss function with respect to the model's parameters become very small as they are propagated backward through the network. This leads to very small updates to the model's weights, effectively stalling learning, particularly in the early layers of the network. This problem is especially prevalent in deep networks or in RNNs when trying to capture long-term dependencies.

In RNNs: When processing long sequences, the contributions from earlier inputs diminish exponentially, making it difficult for the network to learn relationships between distant inputs in the sequence.

Impact: The model struggles to learn and represent long-range dependencies in the data, leading to poor performance on tasks that require understanding of context over long sequences (e.g., in NLP tasks like language modeling or translation).

Exploding Gradients:
The Exploding Gradient problem occurs when the gradients become excessively large during backpropagation. This can cause the model's weights to grow exponentially, leading to numerical instability. The model may diverge during training, making it impossible to learn anything meaningful.

In RNNs: This typically happens when there are large weight values or when trying to model very complex sequences, where the error gradients multiply and grow rapidly as they propagate backward through time.

Impact: The training process becomes unstable, with the model's loss function often resulting in "NaN" (Not a Number) values, and the model's performance deteriorates.

HIDDEN MARKOV MODELS:
Markov chain: A Markov chain is a way to model a system where what happens next depends only on what is happening right now, not on what happened before. In simple terms, it is a chain of events where each event depends only on the one right before it.

A Hidden Markov Model (HMM) in NLP is a tool used to predict things that we cannot directly see (hidden states) based on things we can observe.

Example: Imagine you are trying to guess the weather based on how people are dressed. You cannot see the weather directly (that is the hidden part), but you can see whether people are wearing coats, hats, or sunglasses (these are the observable parts). The model helps you predict the weather (hidden state) from the outfits (observations) and the probabilities of transitioning between different weather types (such as sunny, rainy, etc.).

The model defines probabilities for transitioning between hidden states and for generating observable symbols, allowing for the modeling of dynamic systems with uncertainty.

Because of this flexibility, HMMs are useful for modelling dynamic systems and forecasting future states based on previously observed sequences.

In NLP, HMMs are used for tasks like part-of-speech tagging, where the hidden states are the grammatical categories (noun, verb, etc.) and the observations are the words of the sentence. The model guesses the hidden tags based on the words and the patterns in the text.

States:
Hidden States: The actual states of the system, which are not directly observable. Instead, they are inferred from the observable outputs.
Observable States: The outputs or observations that can be directly seen or measured.

Markov Property:
The Markov property assumes that the probability of transitioning to the next state depends only on the current state, not on the sequence of previous states. This is known as the first-order Markov property.
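The weather-and-clothing story above can be written down directly as the probability tables an HMM needs. All of the probability values here are invented for illustration; a real model would estimate them from data:

```python
# Hidden states: the weather. Observations: what people wear.
states = ["Sunny", "Rainy"]

# P(next weather | current weather): the Markov chain over hidden states.
transition = {
    "Sunny": {"Sunny": 0.8, "Rainy": 0.2},
    "Rainy": {"Sunny": 0.4, "Rainy": 0.6},
}

# P(outfit | weather): how each hidden state generates observable symbols.
emission = {
    "Sunny": {"sunglasses": 0.9, "coat": 0.1},
    "Rainy": {"sunglasses": 0.2, "coat": 0.8},
}

# P(weather on day 1).
initial = {"Sunny": 0.5, "Rainy": 0.5}

# Probability of one fully specified story: it is Sunny then Rainy,
# and we observe sunglasses then a coat.
p = (initial["Sunny"] * emission["Sunny"]["sunglasses"]
     * transition["Sunny"]["Rainy"] * emission["Rainy"]["coat"])
print(p)  # 0.5 * 0.9 * 0.2 * 0.8 = 0.072
```

Note the first-order Markov property in the `transition` table: the next state's distribution is conditioned only on the current state, never on earlier ones.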
An HMM consists of two types of variables: hidden states and observations.
The hidden states are the underlying variables that generate the observed data, but they are not directly observable.
The observations are the variables that are measured and observed.

Given the observed data, the Viterbi algorithm is used to compute the most likely sequence of hidden states. This can be used to predict future observations, classify sequences, or detect patterns in sequential data.

Step 7: Evaluate the model
The performance of the HMM can be evaluated using various metrics, such as accuracy, precision, and recall.
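The Viterbi step can be sketched as follows, reusing a small invented weather HMM (the probabilities are made up for illustration; a real model would estimate them from tagged data):

```python
def viterbi(obs, states, initial, transition, emission):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               backpointer to the previous state on that path)
    best = [{s: (initial[s] * emission[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        layer = {}
        for s in states:
            # Choose the predecessor state that maximizes the path probability.
            prev, p = max(
                ((r, best[t - 1][r][0] * transition[r][s]) for r in states),
                key=lambda rp: rp[1],
            )
            layer[s] = (p * emission[s][obs[t]], prev)
        best.append(layer)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    return list(reversed(path))

# Invented weather model (same shape as the tables described above).
states = ["Sunny", "Rainy"]
initial = {"Sunny": 0.5, "Rainy": 0.5}
transition = {"Sunny": {"Sunny": 0.8, "Rainy": 0.2},
              "Rainy": {"Sunny": 0.4, "Rainy": 0.6}}
emission = {"Sunny": {"sunglasses": 0.9, "coat": 0.1},
            "Rainy": {"sunglasses": 0.2, "coat": 0.8}}

print(viterbi(["sunglasses", "coat", "coat"], states, initial, transition, emission))
# -> ['Sunny', 'Rainy', 'Rainy']
```

For Step 7, the accuracy of such a decoder could then be measured by comparing the predicted state sequence against a gold-standard sequence, position by position.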