
NLP Unit-4

The document discusses probabilistic parsing and language processing, focusing on Probabilistic Context Free Grammar (PCFG) which incorporates probabilities into production rules to handle ambiguity in sentence parsing. It also covers semantic parsing, which translates natural language into machine-understandable representations, and introduces text classification using machine learning techniques like Naive Bayes and Logistic Regression. Additionally, the document outlines the use of a dataset for training and testing text classifiers in the context of economic news articles.


UNIT-4

Probabilistic parsing
Probabilistic parsing uses dynamic programming algorithms to compute the most likely
parse(s) of a given sentence, given a statistical model of the syntactic structure of a language.
Research (for example at Stanford) has focused on improving both the statistical models and
the parsing algorithms.
A probabilistic context-free grammar consists of terminal and non-terminal symbols. Each
phenomenon to be modeled has a production rule that is assigned a probability, estimated from a
training corpus of parsed sentences (a treebank). Production rules are applied recursively until
only terminal symbols (words) are left.
Probabilistic Language Processing
Probabilistic language processing presupposes a probabilistic model of the language and uses
that model to infer, for example, how sentences should be parsed or how ambiguous words
should be interpreted.

Example:
Consider domains such as weather forecasting or mail delivery times. Unlike a deterministic
model, which predicts a single outcome, a probabilistic model gives a distribution over possible
outcomes (i.e., it describes all outcomes and gives some measure of how likely each one is to
occur).

PCFG
A Probabilistic Context Free Grammar (PCFG) is a probabilistic version of a CFG in which
each production has a probability. The probabilities of all productions rewriting a given non-
terminal must sum to 1, defining a probability distribution for each non-terminal.
The PCFG defines the prior probability distribution over structures; posterior probabilities are
estimated by the inside-outside algorithm, and the most likely structure for a sentence is found
by the (probabilistic) CYK algorithm.
Advantages:
• PCFGs are well suited to grammar induction (learning a grammar from text), whereas plain
CFGs require negative data in order to be learned.
• PCFGs tend to be robust to disfluencies and grammatical mistakes, because such inputs
simply receive low probabilities.
• They provide a precise mathematical definition that clearly rules out certain types of
language.
• The formal definition means that context-free grammars are computationally
TRACTABLE: it is possible to write a computer program that determines whether
sentences are grammatical or not.
Limitations of Probabilistic Context Free Grammar:
• Lexical rules are difficult to express in a context-free grammar.
• The notation of context-free grammars can become quite complex.
• It is difficult to construct a recognizer directly from a context-free grammar.

Why PCFGs Are Used:

Probabilistic Context Free Grammar (PCFG) is an extension of Context Free Grammar (CFG)
in which each production rule carries a probability. Ambiguity is the reason we use the
probabilistic version of a CFG: some sentences have more than one underlying derivation, i.e.,
the sentence can be parsed in more than one way, and the parse of the sentence becomes
ambiguous. To resolve this ambiguity, we can use a PCFG to compute the probability of each
parse of the given sentence and prefer the most probable one.
A PCFG is made up of a CFG together with a probability for each production rule of the CFG.
A PCFG can be formally defined as follows:
A probabilistic context free grammar G is a quintuple G = (N, T, S, R, P) where

• (N, T, S, R) is a context free grammar, where N is the set of non-terminal (variable)
symbols, T is the set of terminal symbols, S is the start symbol, and R is the set of
production rules, each of the form A → s with A ∈ N and s a string of terminals and
non-terminals.
• P assigns a probability P(A → s) to each rule in R. The properties governing these
probabilities are as follows:
o P(A → s) is the conditional probability of choosing the rule A → s in a left-
most derivation, given that A is the non-terminal being expanded.
o The value of each probability lies between 0 and 1.
o The sum of the probabilities of all rules with A as the left-hand-side non-
terminal must equal 1.

Example PCFG:
Probabilistic Context Free Grammar G = (N, T, S, R, P)
• N = {S, NP, VP, PP, Det, Noun, Verb, Pre}
• T = {‘a’, ‘ate’, ‘cake’, ‘child’, ‘fork’, ‘the’, ‘with’}
• S=S
• R = { S → NP VP
NP → Det Noun | NP PP
PP → Pre NP
VP → Verb NP
Det → ‘a’ | ‘the’
Noun → ‘cake’ | ‘child’ | ‘fork’
Pre → ‘with’
Verb → ‘ate’ }
• P assigns a probability to each rule in R, as given in the table below:

Rule             Probability
S → NP VP        1.0
NP → NP PP       0.6
NP → Det Noun    0.4
PP → Pre NP      1.0
VP → Verb NP     1.0
Det → ‘a’        0.5
Det → ‘the’      0.5
Noun → ‘cake’    0.4
Noun → ‘child’   0.3
Noun → ‘fork’    0.3
Pre → ‘with’     1.0
Verb → ‘ate’     1.0
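
As a concrete illustration, the grammar above can be written down and used with a
probabilistic parser. The following is a minimal sketch using NLTK's PCFG and ViterbiParser
(it assumes the nltk package is installed; the sentence is just one string the grammar can
generate):

from nltk import PCFG
from nltk.parse import ViterbiParser

# The example grammar G = (N, T, S, R, P) from the table above
grammar = PCFG.fromstring("""
S -> NP VP [1.0]
NP -> NP PP [0.6] | Det Noun [0.4]
PP -> Pre NP [1.0]
VP -> Verb NP [1.0]
Det -> 'a' [0.5] | 'the' [0.5]
Noun -> 'cake' [0.4] | 'child' [0.3] | 'fork' [0.3]
Pre -> 'with' [1.0]
Verb -> 'ate' [1.0]
""")

parser = ViterbiParser(grammar)
sentence = "the child ate a cake with a fork".split()
for tree in parser.parse(sentence):
    tree.pretty_print()    # the most probable parse tree
    print(tree.prob())     # its probability under the PCFG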

We have formally defined PCFG. Now the next question is how to use PCFG to derive the
probability of a parse tree (derivation tree). As discussed, a sentence can be parsed into more
than one way. That means, we can have more than one parse trees for the sentence as per the
CFG due to ambiguity.
Given a parse tree t that uses the production rules α1 → β1, α2 → β2, …, αn → βn from R
(i.e., αi → βi ∈ R), we can find the probability of the tree t under the PCFG as follows:

P(t) = P(α1 → β1) × P(α2 → β2) × … × P(αn → βn) = ∏ i=1..n P(αi → βi)

As per the equation, the probability P(t) of a parse tree is the product of the probabilities of the
production rules used in the tree t.

Example:
Find the probability of the parse tree t given below (the parse tree figure with its rule
probabilities is not reproduced here):

P(t) = 1.0 * 0.1 * 0.7 * 1.0 * 0.4 * 0.18 * 1.0 * 1.0 * 0.18
     = 0.0009072

The probability of the parse tree t is therefore 0.0009072.
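
For comparison, here is a minimal sketch that applies the same product rule to the example
grammar defined earlier. The parse assumed here (with the PP "with a fork" attached to the
noun phrase "a cake") is one possible derivation of "the child ate a cake with a fork", not the
tree from the figure above:

# Rule probabilities taken from the example grammar's table
rules_used = [
    ("S -> NP VP",      1.0),
    ("NP -> Det Noun",  0.4), ("Det -> 'the'",  0.5), ("Noun -> 'child'", 0.3),
    ("VP -> Verb NP",   1.0), ("Verb -> 'ate'", 1.0),
    ("NP -> NP PP",     0.6),
    ("NP -> Det Noun",  0.4), ("Det -> 'a'",    0.5), ("Noun -> 'cake'",  0.4),
    ("PP -> Pre NP",    1.0), ("Pre -> 'with'", 1.0),
    ("NP -> Det Noun",  0.4), ("Det -> 'a'",    0.5), ("Noun -> 'fork'",  0.3),
]

p = 1.0
for rule, prob in rules_used:
    p *= prob            # P(t) is the product of all rule probabilities
print(p)                 # ≈ 0.0001728 for this particular tree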
Semantic Parsing
Semantic parsing is the task of converting a natural language utterance to a logical form: a
machine-understandable representation of its meaning. Semantic parsing can thus be
understood as extracting the precise meaning of an utterance.
Example
A semantic parser is a program that automatically translates natural-language utterances
(NLUs) to formal meaning representations (MRs) that a computer can execute. For example,
an NLU might be a question posed to a geographical information system, and the
corresponding MR a formal query that answers it (see the sketch below).
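
As an illustration, the utterances and GeoQuery-style logical forms below are hypothetical
examples; the exact logical-form syntax differs between semantic-parsing systems:

# Hypothetical NLU -> meaning-representation pairs in a GeoQuery-like style
examples = {
    "what is the capital of texas":
        "answer(capital(stateid('texas')))",
    "how many rivers are in colorado":
        "answer(count(river(loc_2(stateid('colorado')))))",
}

for utterance, logical_form in examples.items():
    print(f"{utterance!r}  ->  {logical_form}")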
What is the difference between semantic parsing and syntax parsing?
Semantic parsing refers to the task of mapping natural language text to formal representations
or abstractions of its meaning. A syntactic parser generates constituency or dependency trees
for a sentence, whereas a semantic parser is built according to the task for which inference is
required.

The Semantics of Programming Languages. Semantics, roughly, are the meanings given to
groups of symbols: ab+c, "ab"+"c", mult(5,4). For example, to express the syntax of adding 5
and 4, we can say: put a "+" sign between the 5 and the 4, yielding "5 + 4". However, we must
also define the semantics of 5+4, namely that the expression evaluates to 9.
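
A minimal Python illustration of this syntax/semantics distinction (purely illustrative):

expr = "5 + 4"       # syntax: a string of symbols arranged according to the grammar
value = eval(expr)   # semantics: the meaning assigned to that string, here the number 9
print(value)         # prints 9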

Kaggle-based Text Classification


Text classification, also known as text tagging or text categorization, is the process of
categorizing text into organized groups. By using Natural Language Processing (NLP), text
classifiers can automatically analyze text and then assign a set of pre-defined tags or categories
based on its content.

Text classification is a machine learning technique that assigns a set of predefined categories
to open-ended text. Text classifiers can be used to organize, structure, and categorize pretty
much any kind of text, from documents and medical studies to files and content from all over
the web.

With text classification, there are two main deep learning models that are widely
used: Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). A CNN
is a type of neural network that consists of an input layer, an output layer, and multiple hidden
layers that are made up of convolutional layers.
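
A minimal, illustrative text-CNN sketch in Keras is shown below (it assumes TensorFlow is
installed; the vocabulary size, sequence length, and layer sizes are arbitrary placeholder
choices, not values taken from this kernel):

from tensorflow.keras import Input, layers, models

vocab_size, seq_len = 20000, 200      # hypothetical preprocessing choices

model = models.Sequential([
    Input(shape=(seq_len,)),                      # input layer: token ids
    layers.Embedding(vocab_size, 128),            # word embeddings
    layers.Conv1D(64, 5, activation="relu"),      # convolutional hidden layer
    layers.GlobalMaxPooling1D(),
    layers.Dense(1, activation="sigmoid"),        # output layer: binary label
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()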

This kernel aims to give a brief overview of performing text classification using Naive Bayes,
Logistic Regression, Support Vector Machines and a Decision Tree Classifier. We will be using
a dataset called "Economic news article tone and relevance", which consists of approximately
8,000 news articles tagged as relevant or not relevant to the US economy. Our goal in this
kernel is to explore the process of training and testing text classifiers for this dataset.
Import Required Libraries
In [1]:
# Numerical and data-handling libraries
import numpy as np
import pandas as pd

# Plotting libraries
import matplotlib as mpl
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import seaborn as sns

# Text preprocessing and feature extraction
import string
import re
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
# Note: the sklearn.feature_extraction.stop_words module was removed in newer
# scikit-learn versions; the built-in English stop-word list is imported like this instead.
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS
from nltk.stem import WordNetLemmatizer

# Classifiers
from sklearn.naive_bayes import MultinomialNB, GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Evaluation and model selection
from sklearn import metrics
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.model_selection import train_test_split

from time import time

import warnings
warnings.filterwarnings("ignore")
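
Continuing from the imports above, a hedged end-to-end sketch of the training and testing
process is given below. The file name, column names, label values, and encoding are
assumptions about how the Kaggle dataset is laid out and may need to be adjusted:

# Load the "Economic news article tone and relevance" dataset.
# File name, column names, and label values below are assumptions; adjust them
# to match the actual Kaggle files.
df = pd.read_csv("Full-Economic-News-DFE-839861.csv", encoding="ISO-8859-1")
df = df[df["relevance"] != "not sure"]             # drop ambiguous labels, if present
X_text = df["text"]
y = (df["relevance"] == "yes").astype(int)          # 1 = relevant to the US economy

# Split into training and test sets
X_train_text, X_test_text, y_train, y_test = train_test_split(
    X_text, y, test_size=0.25, random_state=42)

# Bag-of-words features, ignoring English stop words
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)

# Train and evaluate two of the classifiers mentioned above
for name, clf in [("Multinomial Naive Bayes", MultinomialNB()),
                  ("Logistic Regression", LogisticRegression(max_iter=1000))]:
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(name, "accuracy:", accuracy_score(y_test, y_pred))
    print(classification_report(y_test, y_pred))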
