Syntactic Pattern Recognition: by Nicolette Nicolosi Ishwarryah S Ramanathan
Syntactic Pattern Recognition
□ Statistical pattern recognition is
straightforward, but may not be ideal for
many realistic problems.
■ Patterns that include structural or relational
information are difficult to quantify as feature
vectors.
□ Syntactic pattern recognition uses this
structural information for classification and
description.
□ Grammars can be used to create a definition
of the structure of each pattern class.
Classification
□Producing a classification can be done
based on a measure of structural
similarity in patterns.
□Each pattern class can be represented
by a structural representation or
description.
□It is often difficult to classify patterns
that contain a large number of
features.
Description
□ A description of the pattern structure is
useful for recognizing entities when a simple
classification isn’t possible.
□ Can also describe aspects that cause a
pattern to not be assigned to a particular
class.
□ In complex cases, recognition can only be
achieved through a description for each
pattern rather than through classification.
When to Use It
□ Picture recognition and scene analysis are
problems in which there are a large number
of features and the patterns are complex.
■ For example, recognizing areas such as highways,
rivers, and bridges in satellite pictures.
□ In this case, a complex pattern can be
described in terms of a hierarchical
composition of simpler subpatterns.
Hierarchical Approach
□ The hierarchical approach comes from the
similarity that can be seen between the
structure of patterns and the syntax or
grammar of languages.
□ Following this analogy, patterns can be built
up from sub-patterns in a number of ways,
similarly to how one builds words by
concatenating characters, and builds a
phrase or sentence by concatenating words.
Definitions
□ The simplest sub-patterns are called pattern
primitives, and should be much easier to
recognize than the overall patterns.
□ The language used to describe the structure
of the patterns in terms of sets of pattern
primitives is called the pattern description
language.
□ The pattern description language will have a
grammar that specifies how primitives can
be composed into patterns.
Syntax Analysis
□Once the primitives within the pattern
are identified, syntax analysis (parsing) is
performed on the sentence describing
the pattern to determine if it is
syntactically correct with respect to the grammar.
□Syntax analysis also gives a structural
description of the sentence
associated with the pattern.
Syntax Analysis
□One advantage of this approach is
that a grammar (rewriting) rule can
be applied many times.
□This allows the basic structural
characteristics of an infinite number of
sentences to be expressed in a very
compact way.
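As a minimal sketch of this point, assume a hypothetical two-rule grammar S -> aSb | ab over the primitives a and b (the grammar and the function name `generate` are illustrative and not from the slides). Applying the recursive rule any number of times yields an unbounded family of sentences from just two rules:

```python
def generate(depth):
    """Derive one sentence: apply S -> aSb `depth` times, then S -> ab."""
    sentence = "S"
    for _ in range(depth):
        sentence = sentence.replace("S", "aSb")  # recursive rewriting rule
    return sentence.replace("S", "ab")           # terminating rule


# Each extra application of the recursive rule gives a new, longer sentence.
print([generate(d) for d in range(4)])
# ['ab', 'aabb', 'aaabbb', 'aaaabbbb']
```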
Other Representations
□Relational graph - describes a pattern
using the relations between
sub-patterns and primitives.
□Relational matrix - any relational
graph can also be expressed as
a matrix.
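A small sketch of the graph-to-matrix correspondence, assuming a made-up scene with three primitives and two relations (the names `roof`, `wall`, `door`, `above`, and `inside` are hypothetical): each labeled edge of the relational graph becomes one entry of the matrix.

```python
def relational_matrix(nodes, edges):
    """Return a matrix whose entry [i][j] is the relation from node i to node j."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[None] * len(nodes) for _ in nodes]
    for source, relation, target in edges:
        matrix[index[source]][index[target]] = relation
    return matrix


# Hypothetical relational graph: (sub-pattern, relation, sub-pattern)
nodes = ["roof", "wall", "door"]
edges = [("roof", "above", "wall"), ("door", "inside", "wall")]

for name, row in zip(nodes, relational_matrix(nodes, edges)):
    print(name, row)
```

Entries left as `None` mean that no relation was recorded between that ordered pair.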
Other Representations
□Generalizing to allow for any relation
that can be determined from the
pattern, we can express richer
descriptions than through tree-based
structures.
□Hierarchical (tree-based) approaches
are convenient because it is easy to
apply formal language theory.
Syntactic System
□ Consists of two main parts:
■ Analysis - primitive selection and grammatical or
structural inference
■ Recognition - preprocessing, segmentation or
decomposition, primitive and relation recognition,
and syntax analysis
□ Preprocessing includes the tasks of pattern
encoding and approximation, filtering,
restoration, and enhancement.
Syntactic System
□After preprocessing, the pattern is
segmented into sub-patterns and
primitives using predefined
operations.
□Sub-patterns are identified with a
given set of primitives, so each
pattern is represented by a set of
primitives with the specified syntactic
operations.
Syntax Parsing
□For example, using the concatenation
operation, each pattern is represented
by a string of concatenated primitives.
□ At this point, the parser will
determine if the pattern is
syntactically correct.
■ It belongs to the class of patterns
described by the grammar if it is correct.
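The two steps above can be sketched as follows, assuming four hypothetical primitives a, b, c, d that stand for unit moves right, up, left, and down along a closed boundary, and a hypothetical "rectangle" class defined as the strings a^m b^n c^m d^n. Neither the primitives nor the class definition come from the slides.

```python
import re

# Hypothetical primitives: one letter per unit step direction.
PRIMITIVES = {(1, 0): "a", (0, 1): "b", (-1, 0): "c", (0, -1): "d"}


def encode(points):
    """Concatenate one primitive per unit step along a closed boundary."""
    closed = points + points[:1]  # return to the starting point
    return "".join(
        PRIMITIVES[(x1 - x0, y1 - y0)]
        for (x0, y0), (x1, y1) in zip(closed, closed[1:])
    )


def is_rectangle(sentence):
    """Check membership in the assumed class a^m b^n c^m d^n (m, n >= 1)."""
    match = re.fullmatch(r"(a+)(b+)(c+)(d+)", sentence)
    if match is None:
        return False
    a, b, c, d = (len(group) for group in match.groups())
    return a == c and b == d


boundary = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]
sentence = encode(boundary)
print(sentence, is_rectangle(sentence))  # aabccd True
```

The membership check is hand-written here for brevity; a general parser would take the grammar as data instead.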
Syntax Parsing
□During parsing/syntax analysis, a
description is produced in terms of a
parse tree, assuming the pattern is
syntactically correct.
□If it isn’t correct, it will either be
rejected or analyzed based on a
different grammar, which could
represent other possible pattern
classes.
Matching
□ The simplest form of recognition is template
matching, in which a string of primitives
representing an input pattern is compared
to strings of primitives representing
reference patterns.
□ The input pattern is classified in the same
class as the prototype that is the best
match, which is determined by a similarity
criterion.
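A minimal sketch of template matching on primitive strings, assuming edit (Levenshtein) distance as the similarity criterion. The slides do not name a specific criterion, and the prototype strings below are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings of primitives."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if ca == cb else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def classify(pattern, prototypes):
    """Return the label of the prototype closest to the input pattern."""
    return min(prototypes, key=lambda label: edit_distance(pattern, prototypes[label]))


# Hypothetical reference patterns, one string of primitives per class.
prototypes = {"rectangle": "aabbccdd", "triangle": "abc"}
print(classify("aabbcd", prototypes))  # rectangle
```

Only the strings are compared here; no structural description of the pattern is built.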
Matching vs. Complete Parsing
□ In this case, the structural description is
ignored.
□ The opposite approach is a complete parsing
that uses the entire structural description.
□ There are many intermediate approaches;
for example, a series of tests designed to
test the occurrence of certain primitives,
sub-patterns, or combinations of these. The
result of these tests will determine a
classification.
Parsing
□Parsing is required if the problem
necessitates using a complete pattern
description for recognition.
□Efficiency of the recognition process is
improved by simpler approaches that
do not require a complete parsing.
□Basically, parsing can be expensive,
so don’t use it unnecessarily.
Inferring Grammars
□Grammatical inference machine -
similar to “learning” in the
discriminant approach; it infers a
grammar from a set of training
patterns.
□The inferred grammar can then
be used for pattern description
and syntax analysis.
Parsing - Fundamentals
□ Parser Hierarchical Structure
■ Smaller decompositions
■ Graphically shown by derivation trees
Parsing Problems
□ Approaches of Parsing
□ Parsing/Generation Similarities
■ Application of grammar is easier in
generative mode than analytic mode.
■ Concerns
□ The parser must determine the
extent of the elements that make up
each non-terminal.
□ The parser must find a use for all of x.
Parsing Approaches
□ Top-Down Parsing
■ Proceeds from S to the terminals, producing a
derivation for x, where x is a sentence.
■ Method 1: Depth-first expansion of
non-terminals, starting with the leftmost
non-terminal. Allows back-up.
■ Method 2: Recursive descent uses recursive
functions to recognize the sub-strings
corresponding to the expansion of each
non-terminal. It does not back up, so it may
not work on all grammars.
□ Bottom-Up Parsing
■ Knowing x, we proceed to S by reversing the
productions defined.
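As an illustration of the recursive descent method, here is a minimal sketch for a hypothetical grammar S -> aSb | c (not from the slides). One function recognizes the sub-string produced by S, the next input symbol alone decides which production to expand, and no back-up is needed.

```python
def parse_S(s, pos=0):
    """Recognize one S starting at s[pos]; return the position just after it."""
    if pos < len(s) and s[pos] == "a":       # production S -> a S b
        pos = parse_S(s, pos + 1)
        if pos < len(s) and s[pos] == "b":
            return pos + 1
        raise ValueError(f"expected 'b' at position {pos}")
    if pos < len(s) and s[pos] == "c":       # production S -> c
        return pos + 1
    raise ValueError(f"expected 'a' or 'c' at position {pos}")


def accepts(s):
    """Accept only if S derives the whole input, with nothing left over."""
    try:
        return parse_S(s) == len(s)
    except ValueError:
        return False


print(accepts("aacbb"), accepts("acbb"))  # True False
```

The check `parse_S(s) == len(s)` enforces the requirement that the parser use all of x.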
Comparing Top-down and Bottom-up
□ Difficult to compare because the
efficiency factor lies with the
grammar.
□ Normalization or Transformation of a
grammar will affect parsing efficiency.
□ Brute-force versions of the top-down
and bottom-up approaches have
computational complexity that grows
exponentially with |x|.
Alternative Approaches – CYK
Parsing
□ Cocke-Younger-Kasami Algorithm
■ Parses a string x in a number of steps proportional to
$|x|^3$.
■ The CFG should be in Chomsky Normal Form
■ Building CYK table
CYK Parsing contd.
□ Parsing succeeds, and x belongs to the language, if cell (1,n),
which covers the whole string, contains S.
□ Example
■ Productions
■ CYK table
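A minimal sketch of the table construction, assuming a hypothetical grammar in Chomsky Normal Form (the productions below are a common textbook example, not the ones from the slide). Cell (i, l) is assumed to hold the non-terminals that derive the sub-string of length l starting at position i, so cell (1, n) covers the whole string.

```python
# Hypothetical CNF grammar: S -> AB | BC, A -> BA | a, B -> CC | b, C -> AB | a
TERMINAL_RULES = [("A", "a"), ("B", "b"), ("C", "a")]
BINARY_RULES = [("S", ("A", "B")), ("S", ("B", "C")), ("A", ("B", "A")),
                ("B", ("C", "C")), ("C", ("A", "B"))]


def cyk_table(x):
    """Build the CYK table; keys are (start, length) with 1-based start."""
    n = len(x)
    table = {}
    for i in range(1, n + 1):  # bottom row: single symbols
        table[(i, 1)] = {lhs for lhs, symbol in TERMINAL_RULES if symbol == x[i - 1]}
    for length in range(2, n + 1):          # sub-string length
        for i in range(1, n - length + 2):  # sub-string start
            cell = set()
            for split in range(1, length):  # where to cut the sub-string
                left = table[(i, split)]
                right = table[(i + split, length - split)]
                for lhs, (b, c) in BINARY_RULES:
                    if b in left and c in right:
                        cell.add(lhs)
            table[(i, length)] = cell
    return table


def cyk_accepts(x, start="S"):
    """x is in the language if the start symbol appears in cell (1, n)."""
    return bool(x) and start in cyk_table(x)[(1, len(x))]


print(cyk_accepts("baaba"), cyk_accepts("bb"))  # True False
```

The three nested loops over length, start, and split point are what give the cubic growth in |x|.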
Stochastic Grammars
□ Assumptions of the formal grammar
used in SyntPR
■ Languages are disjoint
■ No errors in the sentences produced by
the grammar
□ In practice the assumptions are faulty
■ Errors in the primitive extraction process
■ Noise or pattern deformation in
descriptions
Stochastic Grammars contd.
□ Definition
■ $G_s = \{V_N, V_T, P_s, S_s\}$
□ $P_s$ is a set of stochastic productions
■ Each production is of the form
□ $\alpha_i \to \beta_j$ with probability $p_{ij}$
□ Derivations in a stochastic language
■ A derivation of a sentence x from $S_s$
■ The label $t_{k-1,k}$, for $k = 1, \dots, n$, is given to the
production that rewrites $\beta_{k-1}$ as $\beta_k$
■ Every production has a probability $p_i$
■ The unconditional probability is given by
□ $P(t_{0,1} \cap t_{1,2} \cap \dots \cap t_{n-1,n}) = P(t_{0,1}) \, P(t_{1,2}) \cdots P(t_{n-1,n})$
Stochastic Grammars contd.
□ $P(t_{0,1}, t_{1,2}, \dots, t_{n-1,n}) = \prod_{q=1}^{n} P(t_{q-1,q})$
□ This uses the assumption that every
production is independent of the previous one
applied.
□ Proper Stochastic Grammar
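■ Each element of $P_s$ is of the form
□ $A_i \to \beta_{ij}$ with probability $p_{ij}$, for $j = 1, \dots, n_i$
■ where $A_i \in V_N$ and $\beta_{ij} \in (V_N \cup V_T)^+$
■ $\sum_{j=1}^{n_i} p_{ij} = 1$ for each $A_i$ (the probabilities of all
productions that share the same left-hand side sum to 1)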
Stochastic Grammars contd.
□ Characteristic Grammar
■ Remove the probability measure from the
Stochastic grammar
□ Stochastic Languages
■ $L(G_s) = \{(x, p(x)) \mid x \in V_T^+,\ S_s \Rightarrow x \text{ with probability } p_j,\ j = 1, \dots, k,\ p(x) = \sum_{j=1}^{k} p_j\}$
■ where $p_j$ is the probability of the j-th distinct derivation
of x from $S_s$, k is the number of distinct derivations of x,
and $p(x)$ is the total probability of generating x with
the grammar.
Stochastic Grammars contd.
□ For example, x is ‘abc’ and productions of a grammar
are
■ S->aA with p1; A->bC with p2
■ B->dC with p3; C->eD with p4
■ B->c with p5; B->f with p6
■ B->g with p7; C->c with p8
■ C->f with p9; C->g with p10
■ D->c with p11; D->f with p12
■ D->g with p13
□ Then to get x we have S->aA->abC->abc.
□ Here the probability to get abc is p(abc)=p1.p2.p8
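□ If the grammar is a proper stochastic grammar, the probabilities
for each left-hand side sum to 1: p1 = 1, p2 = 1, p3+p5+p6+p7 = 1,
p4+p8+p9+p10 = 1, and p11+p12+p13 = 1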
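A small numeric sketch of this example, assuming made-up values for p1 to p13 chosen so that the probabilities for each left-hand side sum to 1 (the slides give no numbers). It multiplies the probabilities of the productions used in S -> aA -> abC -> abc, under the independence assumption stated earlier.

```python
import math
from collections import defaultdict

# (left-hand side, right-hand side, probability); all values are hypothetical.
PRODUCTIONS = [
    ("S", "aA", 1.0), ("A", "bC", 1.0),                                  # p1, p2
    ("B", "dC", 0.4), ("C", "eD", 0.4),                                  # p3, p4
    ("B", "c", 0.3), ("B", "f", 0.2), ("B", "g", 0.1),                   # p5-p7
    ("C", "c", 0.3), ("C", "f", 0.2), ("C", "g", 0.1),                   # p8-p10
    ("D", "c", 0.5), ("D", "f", 0.3), ("D", "g", 0.2),                   # p11-p13
]


def is_proper(rules):
    """Check that the probabilities for each left-hand side sum to 1."""
    totals = defaultdict(float)
    for lhs, _, p in rules:
        totals[lhs] += p
    return all(math.isclose(total, 1.0) for total in totals.values())


def derivation_probability(steps, rules):
    """Multiply the probabilities of the productions applied in a derivation."""
    prob = {(lhs, rhs): p for lhs, rhs, p in rules}
    return math.prod(prob[step] for step in steps)


steps = [("S", "aA"), ("A", "bC"), ("C", "c")]  # S -> aA -> abC -> abc
p_abc = derivation_probability(steps, PRODUCTIONS)
print(is_proper(PRODUCTIONS), p_abc)  # True 0.3
```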