Natural Language Processing Lec 1
Instructor: Touseef Sultan
Core Concept
• Natural language processing is a subfield of linguistics, computer
science, and artificial intelligence concerned with the interactions
between computers and human language, in particular how to
program computers to process and analyze large amounts of natural
language data.
• Natural language processing (NLP) refers to the branch of computer
science—and more specifically, the branch of artificial intelligence or
AI—concerned with giving computers the ability to understand text
and spoken words in much the same way human beings can.
Continue…
• NLP combines computational linguistics—rule-based modeling of
human language—with statistical, machine learning, and deep
learning models. Together, these technologies enable computers to
process human language in the form of text or voice data and to
‘understand’ its full meaning, complete with the speaker or writer’s
intent and sentiment.
• NLP drives computer programs that translate text from one language
to another, respond to spoken commands, and summarize large
volumes of text rapidly—even in real time.
NLP tasks
• Human language is filled with ambiguities that make it incredibly
difficult to write software that accurately determines the intended
meaning of text or voice data.
• Several NLP tasks break down human text and voice data in ways that
help the computer make sense of what it's ingesting. Some of these
tasks include the following:
Tasks
• Speech recognition, also called speech-to-text, is the task of reliably converting voice data
into text data. Speech recognition is required for any application that follows voice
commands or answers spoken questions. What makes speech recognition especially
challenging is the way people talk—quickly, slurring words together, with varying emphasis
and intonation, in different accents, and often using incorrect grammar.
• Part of speech tagging, also called grammatical tagging, is the process of determining the
part of speech of a particular word or piece of text based on its use and context. Part-of-speech
tagging identifies ‘make’ as a verb in ‘I can make a paper plane,’ and as a noun in ‘What
make of car do you own?’ (see the NLTK sketch after this list).
• Word sense disambiguation is the selection of the meaning of a word with multiple
meanings through a process of semantic analysis that determines which meaning makes the
most sense in the given context. For example, word sense disambiguation helps distinguish
the meaning of the verb 'make' in ‘make the grade’ (achieve) vs. ‘make a bet’ (place).
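The last two bullets can be tried out with a minimal NLTK sketch. This is only illustrative: it assumes NLTK is installed and that its tokenizer, tagger, and WordNet resources have already been downloaded, the example sentences are the ones quoted above, and the classical Lesk algorithm is used here as just one simple way to do word sense disambiguation.

```python
# Minimal sketch: part-of-speech tagging and word sense disambiguation with NLTK.
# Assumes the tokenizer, tagger, and WordNet data are available via nltk.download().
import nltk
from nltk.wsd import lesk

# Part-of-speech tagging: 'make' is tagged as a verb in the first sentence
# and as a noun in the second.
print(nltk.pos_tag(nltk.word_tokenize("I can make a paper plane")))
print(nltk.pos_tag(nltk.word_tokenize("What make of car do you own?")))

# Word sense disambiguation with the Lesk algorithm: the chosen WordNet sense
# of 'bank' depends on the surrounding context words.
context = nltk.word_tokenize("I deposited money at the bank")
sense = lesk(context, "bank")
print(sense, sense.definition() if sense else None)
```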
Tasks
• Named entity recognition, or NER, identifies words or phrases as useful entities.
NER identifies ‘Kentucky’ as a location or ‘Fred’ as a man's name (see the spaCy sketch after this list).
• Co-reference resolution is the task of identifying if and when two words refer to
the same entity. The most common example is determining the person or object to
which a certain pronoun refers (e.g., ‘she’ = ‘Mary’), but it can also involve
identifying a metaphor or an idiom in the text (e.g., an instance in which 'bear' isn't
an animal but a large hairy person).
• Sentiment analysis attempts to extract subjective qualities—attitudes, emotions,
sarcasm, confusion, suspicion—from text.
• Natural language generation is sometimes described as the opposite of speech
recognition or speech-to-text; it's the task of putting structured information into
human language.
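As a concrete illustration of the named entity recognition bullet above, here is a minimal sketch with spaCy. It assumes spaCy is installed and the small English model en_core_web_sm has been downloaded; the exact entity labels produced depend on that model.

```python
# Minimal named entity recognition sketch with spaCy.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Fred drove from Kentucky to Nashville on Friday.")

# Each recognized span carries an entity label such as PERSON, GPE (place), or DATE.
for ent in doc.ents:
    print(ent.text, ent.label_)
```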
How does NLP work?
• Natural language processing includes many different techniques for
interpreting human language, ranging from statistical and machine
learning methods to rules-based and algorithmic approaches. We need a
broad array of approaches because the text- and voice-based data varies
widely, as do the practical applications.
• Basic NLP tasks include tokenization and parsing,
lemmatization/stemming, part-of-speech tagging, language detection and
identification of semantic relationships. If you ever diagramed sentences
in grade school, you’ve done these tasks manually before.
• In general terms, NLP tasks break down language into shorter, elemental
pieces, try to understand relationships between the pieces and explore
how the pieces work together to create meaning.
lemmatization vs stemming
• Stemming is a process that removes the last few characters from a word, often
leading to incorrect meanings and spellings. Lemmatization considers the context
and converts the word to its meaningful base form, which is called the lemma.
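A minimal sketch of the difference, using NLTK's PorterStemmer and WordNetLemmatizer; it assumes the WordNet data has been downloaded, and the example words are chosen purely for illustration.

```python
# Minimal sketch contrasting stemming and lemmatization in NLTK.
# Assumes the WordNet corpus is available (nltk.download('wordnet')).
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["studies", "studying"]:
    print(word,
          "stem:", stemmer.stem(word),                   # crude suffix stripping
          "lemma:", lemmatizer.lemmatize(word, pos="v"))  # dictionary base form

# Typical output: both words stem to the non-word 'studi', while the
# lemmatizer maps both to the valid base form 'study'.
```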
Natural Language Understanding (NLU)
• NLU is a branch of natural language processing (NLP), which
helps computers understand and interpret human language by
breaking down the elemental pieces of speech. While speech
recognition captures spoken language in real-time, transcribes
it, and returns text, NLU goes beyond recognition to determine
a user’s intent. Speech recognition is powered by statistical
machine learning methods which add numeric structure to
large datasets. In NLU, machine learning models improve over
time as they learn to recognize syntax, context, language
patterns, unique definitions, sentiment, and intent.
Natural language understanding (NLU)
• NLU enables machines to understand and interpret human language by extracting metadata from content. It
performs the following tasks:
• Helps analyze different aspects of language.
• Helps map the input in natural language into valid representations.
• NLU is more difficult than NLG tasks owing to referential, lexical, and syntactic ambiguity.
• Lexical ambiguity: This means that one word holds several meanings. For example, "The man is looking for
the match." The sentence is ambiguous as ‘match’ could mean different things such as a partner or a
competition.
• Syntactic ambiguity: This refers to a sequence of words with more than one meaning. For example, "The
fish is ready to eat.” The ambiguity here is whether the fish is ready to eat its food or whether the fish is ready
for someone else to eat. This ambiguity can be resolved with the help of the part-of-speech tagging technique.
• Referential ambiguity: This involves a word or a phrase that could refer to two or more properties. For
example, Tom met Jerry and John. They went to the movies. Here, the pronoun ‘they’ causes ambiguity as it
isn’t clear who it refers to.
Business applications often rely on NLU to understand what people are saying
in both spoken and written language. This data helps virtual assistants and
other applications determine a user’s intent and route them to the right task.
Natural language generation (NLG)
• NLG is a method of creating meaningful phrases and sentences (natural
language) from data. It comprises three stages: text planning, sentence
planning, and text realization.
• Text planning: Retrieving the relevant content from the underlying data or knowledge base (see the sketch after this list).
• Sentence planning: Forming meaningful phrases and setting the
sentence tone.
• Text realization: Mapping sentence plans to sentence structures.
• Chatbots, machine translation tools, analytics platforms, voice
assistants, sentiment analysis platforms, and AI-powered transcription
tools are some applications of NLG.
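The three stages can be mimicked with a deliberately tiny, template-based sketch in plain Python. The weather record and wording below are invented purely for illustration; real NLG systems are far more sophisticated.

```python
# Toy, template-based illustration of the three NLG stages.
record = {"city": "Lahore", "temp_c": 31, "condition": "sunny"}

# 1. Text planning: decide which facts in the data are worth reporting.
facts = [("condition", record["condition"]), ("temperature", record["temp_c"])]

# 2. Sentence planning: choose phrases and an overall sentence tone.
phrases = [f"the weather in {record['city']} is {facts[0][1]}",
           f"the temperature is {facts[1][1]} degrees Celsius"]

# 3. Text realization: assemble the phrases into a grammatical sentence.
sentence = "Today, " + " and ".join(phrases) + "."
print(sentence)
# -> Today, the weather in Lahore is sunny and the temperature is 31 degrees Celsius.
```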
Techniques and methods of natural
language processing
• Syntax and semantic analysis are two main techniques used with natural language processing.
• Syntax is the arrangement of words in a sentence to make grammatical sense. NLP uses syntax to assess the meaning of a
sentence based on grammatical rules. Syntax techniques include:
• Parsing. This is the grammatical analysis of a sentence. Example: A natural language processing algorithm is fed the
sentence, "The dog barked." Parsing involves breaking this sentence into parts of speech -- i.e., dog = noun, barked = verb.
This is useful for more complex downstream processing tasks.
• Word segmentation. This is the act of taking a string of text and deriving word forms from it. Example: A person scans a
handwritten document into a computer. The algorithm would be able to analyze the page and recognize that the words are
divided by white spaces.
• Sentence breaking. This places sentence boundaries in large texts. Example: A natural language processing algorithm is
fed the text, "The dog barked. I woke up." The algorithm can recognize the period that splits up the sentences using
sentence breaking.
• Morphological segmentation. This divides words into smaller parts called morphemes. Example: The word untestably
would be broken into [[un[[test]able]]ly], where the algorithm recognizes "un," "test," "able" and "ly" as morphemes. This is
especially useful in machine translation and speech recognition.
• Stemming. This divides words with inflection in them to root forms. Example: In the sentence, "The dog barked," the
algorithm would be able to recognize the root of the word "barked" is "bark." This would be useful if a user was analyzing a
text for all instances of the word bark, as well as all of its conjugations. The algorithm can see that they are essentially the
same word even though the letters are different.
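Several of these techniques (sentence breaking, word segmentation, and stemming) can be tried directly with NLTK. The sketch below assumes the library and its 'punkt' tokenizer data are installed, and it reuses the example text from the bullets above.

```python
# Minimal sketch of sentence breaking, word segmentation, and stemming with NLTK.
# Assumes the 'punkt' tokenizer resources have been downloaded.
import nltk
from nltk.stem import PorterStemmer

text = "The dog barked. I woke up."

# Sentence breaking: place sentence boundaries in the text.
sentences = nltk.sent_tokenize(text)          # ['The dog barked.', 'I woke up.']

# Word segmentation: split each sentence into word tokens.
tokens = [nltk.word_tokenize(s) for s in sentences]

# Stemming: reduce inflected forms to a root, e.g. 'barked' -> 'bark'.
stemmer = PorterStemmer()
stems = [[stemmer.stem(t) for t in sent] for sent in tokens]
print(sentences, tokens, stems, sep="\n")
```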
Ambiguities Problem in NLU
• Word sense disambiguation. This derives the meaning of a word
based on context. Example: Consider the sentence, "The pig is in
the pen." The word pen has different meanings. An algorithm using
this method can understand that the use of the word pen here refers
to a fenced-in area, not a writing implement.
• Types of ambiguities, with examples:
• Lexical ambiguity (The tank was full of water)
• Syntactical ambiguity (Old men and women were taken to safe place)
• Semantic ambiguity (The car hit the pole while it was moving)
• Pragmatic ambiguity (The police are coming)
Why is NLP important?
1. Large volumes of textual data
Natural language processing helps computers communicate with humans in their own language and scales
other language-related tasks. For example, NLP makes it possible for computers to read text, hear speech,
interpret it, measure sentiment and determine which parts are important. Today’s machines can analyze
more language-based data than humans, without fatigue and in a consistent, unbiased way. Considering
the staggering amount of unstructured data that’s generated every day, from medical records to social
media, automation will be critical to fully analyze text and speech data efficiently.
2. Structuring a highly unstructured data source
Human language is astoundingly complex and diverse. We express ourselves in infinite ways, both verbally
and in writing. Not only are there hundreds of languages and dialects, but within each language is a unique
set of grammar and syntax rules, terms and slang. When we write, we often misspell or abbreviate words,
or omit punctuation. When we speak, we have regional accents, and we mumble, stutter and borrow terms
from other languages. While supervised and unsupervised learning, and specifically deep learning, are
now widely used for modeling human language, there’s also a need for syntactic and semantic
understanding and domain expertise that are not necessarily present in these machine learning
approaches. NLP is important because it helps resolve ambiguity in language and adds useful numeric
structure to the data for many downstream applications, such as speech recognition or text analytics.
Syntactic & Semantic Analysis
• Syntactic analysis (syntax) and semantic analysis (semantic) are the
two primary techniques that lead to the understanding of natural
language. Language is a set of valid sentences, but what makes a
sentence valid? Syntax and semantics.
• Syntax is the grammatical structure of the text, whereas semantics is
the meaning being conveyed. A sentence that is syntactically correct,
however, is not always semantically correct. For example, “cows flow
supremely” is grammatically valid (subject — verb — adverb) but it
doesn't make any sense.
SYNTACTIC ANALYSIS