Lecture 16-17-18-19
Contents to be Covered
• Part-of-speech tagging
• Rule-based part-of-speech tagging
• Transformation-based tagging
PART OF SPEECH TAGGING
• Introduction
• Part-of-speech (POS) tagging
• Rule-based taggers
• Statistical taggers
• Hybrid approaches
INTRODUCTION 1
Content
1. Introduction to Human Language Technology
2. Applications
3. Resources
4. Language models
5. Morphology and lexicons
6. Syntactic processing
7. Semantic processing
8. Generation
INTRODUCTION 2
- Parts of speech (POS), also called word classes, morphological classes, or lexical tags, give information about a word and its neighbors.

Open class:
- Nouns: people, places, and things; proper nouns, common nouns, count nouns, and mass nouns
- Verbs: actions and processes; main verbs, not auxiliaries
- Adjectives: properties
- Adverbs
PART OF SPEECH CATEGORIES
[Slides 2-5 presented the part-of-speech categories as figures in the original deck; not recoverable as text.]
PART OF SPEECH TAGGING 2
Example: ENGTWOL-style analysis of "Pavlov had shown that salivation ...", listing every reading of each word before disambiguation:

Pavlov      PAVLOV N SG PROPER
had         HAVE V PAST VFIN SVO   (verb with subject and object)
            HAVE PCP2 SVO          (PCP2 = past participle)
shown       SHOW PCP2 SVO SV SVOO  (verb with subject and two complements)
that        ADV
            PRON DEM SG
            DET CENTRAL DEM SG
            CS                     (subordinating conjunction)
salivation  SALIVATION N SG
PART OF SPEECH TAGGING 3
POS taggers:
• Rule-based
• Statistical
• Hybrid
PART OF SPEECH TAGGING 6
W = w1 w2 … wn : a sequence of words
T = t1 t2 … tn : a sequence of POS tags
Tagging is a function f : W → T, that is, T = f(W)
RULE-BASED TAGGERS 1
Brill’s set of templates: “Change tag a to tag b when:”
• The preceding (following) word is tagged z.
• The word two before (after) is tagged z.
• One of the two preceding (following) words is tagged z.
• One of the three preceding (following) words is tagged z.
• The preceding word is tagged z and the following word is tagged w.
• The preceding (following) word is tagged z and the word two before (after) is tagged w.
Here a, b, z and w are part-of-speech tags.
Rules are automatically induced from a tagged corpus (a minimal sketch of rule application follows this list).
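To make the template mechanics concrete, here is a minimal sketch of applying one induced transformation; the function name and toy data are illustrative assumptions, not Brill's original implementation:

```python
# Toy illustration of one Brill-style transformation:
# "Change tag a to tag b when the preceding word is tagged z."

def apply_transformation(tags, a, b, z):
    """Rewrite tag a as tag b wherever the previous tag is z."""
    new_tags = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == a and tags[i - 1] == z:
            new_tags[i] = b
    return new_tags

# Baseline tagging from a most-frequent-tag annotator:
# "race" is mistagged NN after TO.
words = ["to", "race", "tomorrow"]
tags = ["TO", "NN", "NR"]

# Induced rule: change NN to VB when the preceding tag is TO.
print(apply_transformation(tags, a="NN", b="VB", z="TO"))
# -> ['TO', 'VB', 'NR']
```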
RULE-BASED TAGGERS 2
ADVERBIAL-THAT RULE
Given input: “that”
if
  (+1 A/ADV/QUANT)   /* the next word is an adjective, adverb, or quantifier */
  (+2 SENT-LIM)      /* and the one after that is a sentence boundary */
  (NOT -1 SVOC/A)    /* and the previous word is not a verb like ‘consider’,
                        which allows adjectives as object complements */
then eliminate non-ADV tags
else eliminate ADV tag

Example: in the sentence “I consider that odd”, that will not be tagged as an adverb (ADV); a rough Python rendering follows.
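A rough rendering of this constraint, assuming each token carries a set of candidate tags; the representation and names here are illustrative, not the actual CG engine:

```python
# Reductionist constraint: the rule removes readings, never adds them.

ADJ_ADV_QUANT = {"A", "ADV", "QUANT"}

def adverbial_that_rule(tokens, i, candidates):
    """Disambiguate 'that' at position i; candidates[i] is its tag set."""
    next_is_mod = i + 1 < len(tokens) and candidates[i + 1] & ADJ_ADV_QUANT
    then_boundary = i + 2 >= len(tokens)        # (+2 SENT-LIM)
    prev_is_svoc = i > 0 and "SVOC/A" in candidates[i - 1]
    if next_is_mod and then_boundary and not prev_is_svoc:
        return candidates[i] & {"ADV"}          # eliminate non-ADV tags
    return candidates[i] - {"ADV"}              # eliminate the ADV tag

tokens = ["I", "consider", "that", "odd"]
candidates = [{"PRON"}, {"V", "SVOC/A"}, {"ADV", "DET", "CS"}, {"A"}]
print(adverbial_that_rule(tokens, 2, candidates))   # {'DET', 'CS'}: no ADV
```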
RULE-BASED TAGGERS 3
+ Linguistically motivated rules
+ High precision (e.g. EngCG reaches 99.5%)
– High development cost
– Not portable to other languages or tagsets
– High time cost of tagging

Systems:
• TAGGIT (Greene & Rubin, 1971)
• TOSCA (Oostdijk, 1991)
• Constraint Grammars, EngCG (Voutilainen, 1994; Karlsson et al., 1995)
• AMBILIC (de Yzaguirre et al., 2000)
RULE-BASED TAGGERS 4
Constraint Grammars (CG)
RULE-BASED TAGGERS 5
Constraint Grammars (CG)
• ENGCG / ENGTWOL: reductionist POS tagging with some 1,100 constraints
• 93-97% of the words are fully disambiguated, with 99.7% accuracy
• Heuristic rules can then be applied to the remaining 2-3% residual ambiguity, with 99.6% precision
• CG also includes a syntactic component
STATISTICAL POS TAGGING 1
The goal is to find the most probable tag sequence t1..tn given the observed sequence of n words w1..wn, that is, the sequence for which P(t1..tn | w1..wn) is highest.

But P(t1..tn | w1..wn) is difficult to compute directly, so the Bayesian classification rule is used:

P(x | y) = P(x) P(y | x) / P(y)

Applied to the sequence of words, the most probable tag sequence becomes

argmax over t1..tn of P(t1..tn) P(w1..wn | t1..tn) / P(w1..wn)

where P(w1..wn) is the same for every candidate tag sequence and thus does not need to be calculated.

Thus, the most probable tag sequence maximizes the product of two probabilities for each possible sequence:
- the prior probability of the tag sequence (context): P(t1..tn)
- the likelihood of the sequence of words given the sequence of (hidden) tags: P(w1..wn | t1..tn)
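A brute-force sketch makes this argmax concrete: enumerate every candidate tag sequence and score it as P(tags) × P(words | tags). All probability values below are made-up toy numbers for illustration only; the exponential cost of this enumeration is what motivates the Viterbi algorithm introduced later:

```python
from itertools import product

# Toy tag set with made-up probabilities (illustrative only).
TAGS = ["NN", "VB"]
prior = {("NN", "NN"): 0.4, ("NN", "VB"): 0.2,   # P(t1..tn) per sequence
         ("VB", "NN"): 0.3, ("VB", "VB"): 0.1}
likelihood = {("race", "NN"): 0.6, ("race", "VB"): 0.4,   # P(w | t)
              ("runs", "NN"): 0.1, ("runs", "VB"): 0.9}

def best_tag_sequence(words):
    """argmax over all tag sequences of P(tags) * P(words | tags)."""
    best, best_p = None, 0.0
    for tags in product(TAGS, repeat=len(words)):
        p = prior.get(tags, 0.0)
        for w, t in zip(words, tags):
            p *= likelihood.get((w, t), 0.0)
        if p > best_p:
            best, best_p = tags, p
    return best, best_p

print(best_tag_sequence(["race", "runs"]))   # (('NN', 'VB'), 0.108)
```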
STATISTICAL POS TAGGING 2
Two simplifications make computing the most probable sequence of tags tractable:
- The prior probability of a word's tag depends only on the tag of the previous word (bigrams: the context is reduced to the previous tag). This facilitates the computation of P(t1..tn).
- The probability of a word depends only on its own tag, not on surrounding words or tags. This facilitates the computation of P(w1..wn | t1..tn).
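In standard notation (as in the Jurafsky & Martin textbook cited below), the two assumptions and the resulting objective are:

$$P(t_1^n) \approx \prod_{i=1}^{n} P(t_i \mid t_{i-1}), \qquad P(w_1^n \mid t_1^n) \approx \prod_{i=1}^{n} P(w_i \mid t_i)$$

$$\hat{t}_1^n = \operatorname*{argmax}_{t_1^n} \prod_{i=1}^{n} P(w_i \mid t_i)\, P(t_i \mid t_{i-1})$$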
STATISTICAL POS TAGGING 3
• Secretariat/NNP is/BEZ expected/VBN to/TO race/VB tomorrow/NR
• People/NNS continue/VB to/TO inquire/VB the/AT reason/NN for/IN the/AT race/NN for/IN outer/JJ space/NN
STATISTICAL POS TAGGING 5
Formalization of a Hidden Markov Model:
Q = q1 q2 … qN : a set of N states
A = a11 a12 … an1 … ann : a transition probability matrix, each aij representing the probability of moving from state i to state j, with Σj aij = 1 for all i
O = o1 o2 … oT : a sequence of T observations, each drawn from a vocabulary V = v1, v2, …, vV
B = bi(ot) : a sequence of observation likelihoods, also called emission probabilities, each expressing the probability of an observation ot being generated from a state i
q0, qF : a special start state and final state, not associated with observations
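A compact Viterbi decoder over this (Q, A, B) formalization; a minimal sketch assuming dictionary-based parameters (the function and variable names are illustrative):

```python
def viterbi(words, tags, A, B, pi):
    """Most probable tag sequence under a bigram HMM.

    A[s][t] : transition probability of tag t after tag s
    B[t][w] : emission probability of word w from tag t
    pi[t]   : probability of tag t at the start of the sentence
    """
    # Initialization: start distribution times first emission.
    V = [{t: pi.get(t, 0.0) * B.get(t, {}).get(words[0], 0.0) for t in tags}]
    back = []
    # Recursion: extend the best path into each state.
    for w in words[1:]:
        col, ptr = {}, {}
        for t in tags:
            prev = max(tags, key=lambda s: V[-1][s] * A.get(s, {}).get(t, 0.0))
            col[t] = (V[-1][prev] * A.get(prev, {}).get(t, 0.0)
                      * B.get(t, {}).get(w, 0.0))
            ptr[t] = prev
        V.append(col)
        back.append(ptr)
    # Termination: pick the best final state and trace pointers backwards.
    path = [max(tags, key=lambda t: V[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```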
STATISTICAL POS TAGGING 8
[Table omitted: tag transition probabilities P(ti | ti-1) for VB, TO, NN, PPSS, computed from the 87-tag Brown corpus without smoothing. The rows are labeled with the conditioning event; thus P(PPSS|VB) is .0070. The symbol <s> is the start-of-sentence symbol.]
STATISTICAL POS TAGGING 9
Observation likelihoods P(wi | ti), from the same Brown corpus counts:

        I      want      to     race
VB      0      .0093     0      .00012
TO      0      0         .99    0
NN      0      .000054   0      .00057
PPSS    .37    0         0      0
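Combining these emissions with the transition estimates quoted for this example in Jurafsky & Martin (P(VB|TO) = .83, P(NN|TO) = .00047), the verb reading of race wins:

P(VB|TO) × P(race|VB) = .83 × .00012 = .00010
P(NN|TO) × P(race|NN) = .00047 × .00057 = .00000027

so the HMM correctly prefers to/TO race/VB over to/TO race/NN.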
STATISTICAL POS TAGGING 12
Data-driven approaches:
• The language model and smoothing are learned automatically from tagged corpora (supervised learning)
• N-grams and Hidden Markov Models (Charniak, 1993; Jelinek, 1998)
• Supervised machine learning (Manning & Schütze, 1999)
– Semi-supervised learning: the Forward-Backward (Baum-Welch) algorithm
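Since the course's Python textbook (T2 below) is NLTK-based, a brief supervised-training example with NLTK's HMM tagger may help; this assumes NLTK 3.x and the Penn Treebank sample corpus are available:

```python
import nltk
from nltk.corpus import treebank
from nltk.tag.hmm import HiddenMarkovModelTrainer

nltk.download("treebank")  # one-time download of the corpus sample

# Supervised learning: estimate transition and emission probabilities
# directly from tagged sentences.
train_sents = treebank.tagged_sents()[:3000]
tagger = HiddenMarkovModelTrainer().train_supervised(train_sents)

print(tagger.tag("the race is on".split()))
```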
HYBRID SYSTEMS 1
• Maximum Entropy (Ratnaparkhi, 1998; Rosenfeld, 1994; Ristad, 1997)
• Combines several knowledge sources
• No independence between features is assumed
• A high number of parameters is allowed (e.g. lexical features)
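A minimal sketch of a maximum-entropy-style tagger using scikit-learn's LogisticRegression (multinomial logistic regression is the same model family as MaxEnt); the feature set is a small illustrative subset of the overlapping features such models permit, and the toy training data is assumed:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def features(sent, i):
    """Overlapping, non-independent features: exactly what MaxEnt allows."""
    w = sent[i]
    return {
        "word": w.lower(),
        "suffix3": w[-3:].lower(),
        "is_capitalized": w[0].isupper(),
        "prev_word": sent[i - 1].lower() if i > 0 else "<s>",
        "next_word": sent[i + 1].lower() if i < len(sent) - 1 else "</s>",
    }

# Tiny toy training set (gold tags per word).
train = [(["I", "want", "to", "race"], ["PPSS", "VB", "TO", "VB"]),
         (["the", "race", "is", "on"], ["AT", "NN", "BEZ", "IN"])]
X = [features(s, i) for s, _ in train for i in range(len(s))]
y = [t for _, ts in train for t in ts]

model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X, y)
print(model.predict([features(["to", "race"], 1)]))   # expected: ['VB']
```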
HYBRID SYSTEMS 2
Brill’s system
OTHER COMPLEX SYSTEMS 1
• Decision trees (Black & Magerman, 1992; Magerman, 1996; Màrquez, 1999; Màrquez & Rodríguez, 1997)
  – supervised learning, e.g. TreeTagger
• Case-based / memory-based learning: TiMBL (Daelemans et al., 1996)
• Relaxation labelling: statistical and linguistic constraints, e.g. RELAX (Padró, 1997)
OTHER COMPLEX SYSTEMS 2
Combining taggers:
• Combination of language models in a single tagger (Màrquez & Rodríguez, 1998; Màrquez, 1999; Padró, 1997)
  – STT+
  – RELAX
• Combination of taggers through voting (Màrquez et al., 1998)
• Combination of classifiers (Brill & Wu, 1998)
  – bootstrapping
  – bagging (Breiman, 1996; Màrquez et al., 1999)
  – boosting (Freund & Schapire, 1996; Abney et al., 1999)
Reference:
Books:
TEXTBOOKS:
T1: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition by Daniel Jurafsky and James H. Martin
T2: Natural Language Processing with Python by Steven Bird, Ewan Klein, and Edward Loper
REFERENCE BOOKS:
R1: Handbook of Natural Language Processing, Second Edition, edited by Nitin Indurkhya and Fred J. Damerau
Course Link:
https://in.coursera.org/specializations/natural-language-processing
Video Link:
https://youtu.be/YVQcE5tV26s
Web Link:
https://www.tutorialspoint.com/natural_language_processing/natural_language_processing_tutorial.pdf
Thank you