
Final Part Load HAPS2

The document provides examples of using different features and capabilities in JAPE grammars for named entity recognition. It discusses using macros, negation operators, and Java code in the right-hand side of rules to manipulate annotations. It also describes using a main file to run multiple grammar phases sequentially and exploit temporary annotations to help resolve ambiguities. Finally, a complex example is provided that uses Java code to combine first name and lookup annotations into full names while adding gender properties.

Uploaded by

Marcos Gonzalez

Exercise 3:

1. From the data store, load the exercise5.txt file in the GUI. Also load the
playercontext.jape file from the Example 5 folder. Before you run this JAPE grammar, look at
the text in exercise5.txt and write down what you would expect the result to be under each of
the control styles “brill”, “appelt” and “all”. Then run the grammar, compare the results
against your expectations, and try to understand any differences.

2. The “appelt” control style is the most appropriate for named entity recognition because,
under “appelt”, only one rule can fire for the same pattern. Do you agree?
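
To help with the second question, the way “appelt” chooses between competing rules can be sketched in plain Java. This is only an illustration and not GATE code: the Match class, its fields and the rule names are hypothetical. It assumes the ordering described in the GATE documentation, where among rules matching from the same start point the longest match wins, then the highest priority, then the rule defined earliest in the grammar.

```java
import java.util.Arrays;
import java.util.List;

class AppeltSelection {
    // Hypothetical record of one rule that matched at a fixed start offset.
    static final class Match {
        final String rule;
        final int endOffset;        // where the match ends
        final int priority;         // the rule's declared priority
        final int declarationIndex; // position of the rule in the grammar file

        Match(String rule, int endOffset, int priority, int declarationIndex) {
            this.rule = rule;
            this.endOffset = endOffset;
            this.priority = priority;
            this.declarationIndex = declarationIndex;
        }
    }

    // True if a should be preferred over b: longer match first,
    // then higher priority, then earlier declaration.
    static boolean beats(Match a, Match b) {
        if (a.endOffset != b.endOffset) return a.endOffset > b.endOffset;
        if (a.priority != b.priority) return a.priority > b.priority;
        return a.declarationIndex < b.declarationIndex;
    }

    // Returns the single match that would fire, or null if there are none.
    static Match pick(List<Match> candidates) {
        Match best = null;
        for (Match m : candidates) {
            if (best == null || beats(m, best)) best = m;
        }
        return best;
    }

    public static void main(String[] args) {
        List<Match> candidates = Arrays.asList(
            new Match("shortHighPriority", 10, 100, 0),
            new Match("longLowPriority", 16, 0, 1));
        // The longer match wins even though its priority is lower.
        System.out.println(pick(candidates).rule);
    }
}
```

Note that priority only comes into play when two rules cover exactly the same stretch of text.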

Example 6. Handling repetitiveness in patterns using Macro

We can use macros in JAPE grammars to the same effect as in programming languages. In
Example 2 of this tutorial, the reusable pattern Lookup.majorType == Person can be
converted into a macro. Find the PersonMacro.jape file in the folder Example 5 and load it
into the GUI. Run the JAPE transducer on the Example 6.txt file and inspect the results. You
will get the same results as with nestedpattern.jape.

Phase: PersonMacro
Input: Lookup Token
// Note that we are using both Lookup and Token inside our rules.
Options: control = appelt

Macro: PERSON
(
  {Lookup.majorType == Person}
)

// Try to detect entities with the word "player" mentioned before a person's name
Rule: playerid
(
  {Token.string == "player"}
):temp
(
  PERSON
  |
  (
    {Token.kind == word, Token.category == NNP, Token.orth == upperInitial}
    {Token.kind == word, Token.category == NNP, Token.orth == upperInitial}
  )
):player
-->
:player.Player = {rule = "playerid"}

JAPE Grammar 10 PersonMacro.jape

Example 7. Using negation operator in JAPE

The following shows how to use the negation operator in a JAPE grammar. Let’s take an
example that demonstrates why a negation operator is needed in entity extraction. Suppose we
are looking for titles in the text but are not interested in the title “Sir”. The correct rule
should not detect the title “Sir” in the following story text (Example 7.txt), but should
detect “Mr”:

Park Ji Sung and Jonny Evans are expected to commit their long-term
futures to Manchester United in the coming weeks as the English, European
and world champions continue to plan for life after Sir Alex Ferguson.

However Mr Alex Ferguson was unavailable to comment.

The rule for doing so is NegationOperator.jape, in the folder Example 7.

Rule: negationop
(
  {Lookup.majorType == "Title", !Token.string =~ "[Ss]"}
):TitleNotStartingWithS
(
  {Lookup.majorType == "Person"}
):person
-->
:TitleNotStartingWithS.Title = {rule = "negationop"},
:person.Person = {rule = "negationop"}

JAPE Grammar 11 NegationOperator.jape

The line

{Lookup.majorType == "Title", !Token.string =~ "[Ss]"}

takes care of ignoring the title “Sir”. Because the rule is applied as a whole, the person is
annotated only once, and only when preceded by an accepted title.
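
It is worth seeing exactly which titles the negated constraint lets through. The sketch below is plain Java, not GATE code, and rests on one assumption taken from the GATE documentation: that JAPE’s =~ succeeds when the regular expression matches anywhere in the value (like Matcher.find()), while ==~ requires the whole value to match (like Matcher.matches()). The class and method names are our own.

```java
import java.util.regex.Pattern;

class JapeRegexOperators {
    // Assumed behaviour of JAPE's =~ : the regex may match anywhere in the value.
    static boolean containsMatch(String value, String regex) {
        return Pattern.compile(regex).matcher(value).find();
    }

    // Assumed behaviour of JAPE's ==~ : the regex must match the whole value.
    static boolean wholeMatch(String value, String regex) {
        return Pattern.compile(regex).matcher(value).matches();
    }

    public static void main(String[] args) {
        for (String title : new String[] {"Sir", "Mr", "Mrs", "Ms"}) {
            // The constraint in the rule: !Token.string =~ "[Ss]"
            boolean keptByRule = !containsMatch(title, "[Ss]");
            // An anchored variant that only rejects titles starting with S or s.
            boolean keptByAnchored = !containsMatch(title, "^[Ss]");
            System.out.println(title + ": kept by [Ss] = " + keptByRule
                + ", kept by ^[Ss] = " + keptByAnchored);
        }
    }
}
```

Under that assumption the rule as written rejects “Sir” and keeps “Mr”, which is what the example needs, but it also rejects titles such as “Mrs” and “Ms” because they contain an “s”. If the intention is only to exclude titles that start with “S”, as the label TitleNotStartingWithS suggests, an anchored pattern such as "^[Ss]" may be closer to it.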

Example 8. Using JAVA in RHS of JAPE Grammar

The RHS of a JAPE rule can consist of any Java code. This is useful for removing
temporary annotations and for percolating and manipulating features from previous
annotations identified by the LHS. The example text (Example 8.txt) used here is:

Soccer - Rooney Gerrard - File.


Composite file picture of Liverpool 's Steven Gerrard (left , dated 27
September 2006 ) and Manchester United 's Wayne Rooney (dated 20 August
2006 ) . On the occasion of his 21st Birthday , Tuesday 24 October 2006 ,
Wayne Rooney has hailed England team -mate Steven Gerrard as one of the
world 's best midfielders and wishes the Liverpool star could play at
Manchester United .

Ideally, we would like to annotate each team name with the label “Team” and also give that
annotation a “teamOfSport” property, whose value is already available through the Lookup
annotation. The JAPE grammar that achieves this is usingJAVAinRHS.jape.

Phase: usingJAVAinRHS
Input: Lookup
Options: control = all

Rule: javainRHS1
(
  {Lookup.majorType == Team}
):team
-->
{
  // Get the annotations bound to the "team" label (here, the matched Lookup).
  gate.AnnotationSet team = (gate.AnnotationSet)bindings.get("team");
  gate.Annotation teamAnn = (gate.Annotation)team.iterator().next();
  // Copy the Lookup's minorType into a new teamOfSport feature.
  gate.FeatureMap features = Factory.newFeatureMap();
  features.put("teamOfSport", teamAnn.getFeatures().get("minorType"));
  features.put("rule", "javainRHS1");
  // Create the Team annotation over the same span as the matched Lookup.
  outputAS.add(team.firstNode(), team.lastNode(), "Team", features);
}

JAPE Grammar 12 usingJAVAinRHS.jape

The rule matches a team’s name, e.g. “Manchester United”, and adds a teamOfSport
feature whose value is the minorType from the gazetteer list in which the name was found.
We first get the bindings associated with the team label (i.e. the Lookup annotation). We
then take the first annotation in that set as “teamAnn”, and create a new FeatureMap to hold
the features we want to add. Next we read the value of the minorType feature from teamAnn
(in this case “Football_Club”) and store it under a new feature called “teamOfSport”. We
create another feature, “rule”, with the value “javainRHS1”. Finally, we add all the features
to a new annotation “Team”, which attaches to the same nodes as the original “team” binding.

Note that inputAS and outputAS represent the input and output annotation sets. Normally
these are the same (by default, when using ANNIE, both are the “Default” annotation set).
However, because the user is at liberty to change the input and output annotation sets in
the parameters of the JAPE transducer at runtime, it cannot be guaranteed that they will be
the same, and therefore we must specify which annotation set we are referring to.

Example 9. Using a common file as a holder of application specific JAPE grammar files

So far, each of our JAPE grammars has done its job in isolation. However, it is easy to
imagine real-world scenarios in which you want several grammars to work together on a
complex task. To achieve this, the list of phases can be specified (in the order in which
they are to be run) in a file, conventionally named main.jape. When loading the grammar into
GATE, it is only necessary to load this main file; the phases will then be loaded
automatically. It is possible to omit the main file and load the phases individually, but
this is much more time-consuming. The grammar phases do not need to be located in the same
directory as the main file, but if they are not, the relative path should be specified for
each phase.

One of the main reasons for using a sequence of phases is that a pattern can only be used
once in each phase, but it can be reused in a later phase. Combined with the fact that
priority can only operate within a single grammar, this can be exploited to help deal with
ambiguity. The solution currently adopted is to write a grammar phase for each annotation
type, or for each combination of similar annotation types, and to create temporary
annotations. These temporary annotations are accessed by later grammar phases, and can be
manipulated as necessary to resolve ambiguity or to merge consecutive annotations. The
temporary annotations can either be removed later, or left in place and simply ignored.
Generally, annotations about which we are more certain are created earlier on. Annotations
that are more dubious may be created temporarily, and then manipulated by later phases as
more information becomes available.

Note how the syntax of main.jape differs from that of the other JAPE files, each of which
contains a single phase:

MultiPhase: TestTheGrammars
Phases:
firstname
Fullname

JAPE Grammar 13 main.jape

Example 10. Using JAVA in RHS of JAPE: A complex example

The following is a more complex example of using Java in the RHS. To explain what we are
after, we will use the following text (Example 10.txt):

“Jane Rooney and Wayne Rooney and Jan Rooney”.

The gazetteer lookup annotates the text as follows:

Jane is annotated as majorType = person_first, minorType = female
Wayne is annotated as majorType = person_first, minorType = male
Jan is annotated as majorType = person_first, minorType = ambig

The aim here is to generate a full name from this information (the Lookup annotations) and,
at the same time, to record the gender of each such person as a property. We are after
something like:

Fullname = Jane Rooney, gender = female

The rule will be divided into two phases:

The first phase (FirstName) creates an annotation of type FirstPerson. It looks for
“Lookup.majorType == person_first” and creates a gender property by copying the minorType
value, but only if the value is not ambiguous.

The next phase (FullName) takes the FirstPerson annotation and checks whether it is
immediately followed by a token whose orthography is upperInitial. If it is, the two are
combined into a Fullname annotation, and a gender property is copied from the FirstPerson
annotation where one is present.
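
Before looking at the JAPE rule itself, the two-phase idea can be sketched in plain Java without GATE. Everything here is a simplification for illustration: the GAZETTEER map stands in for the Lookup annotations, tokens are split on whitespace, a minorType of “ambig” is simply treated as ambiguous, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TwoPhaseNames {
    // Hypothetical stand-in for the gazetteer: first name -> minorType.
    static final Map<String, String> GAZETTEER =
        Map.of("Jane", "female", "Wayne", "male", "Jan", "ambig");

    // Phase 1: mark first names; the gender is null when it is ambiguous.
    static Map<Integer, String> firstNamePhase(String[] tokens) {
        Map<Integer, String> firstPersons = new LinkedHashMap<>();
        for (int i = 0; i < tokens.length; i++) {
            String minorType = GAZETTEER.get(tokens[i]);
            if (minorType != null) {
                firstPersons.put(i, "ambig".equals(minorType) ? null : minorType);
            }
        }
        return firstPersons;
    }

    // Rough equivalent of Token.orth == upperInitial.
    static boolean isUpperInitial(String token) {
        return !token.isEmpty()
            && Character.isUpperCase(token.charAt(0))
            && token.substring(1).equals(token.substring(1).toLowerCase());
    }

    // Phase 2: join each first name with a following upperInitial token.
    static List<Map<String, String>> fullNamePhase(String[] tokens,
                                                   Map<Integer, String> firstPersons) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : firstPersons.entrySet()) {
            int next = entry.getKey() + 1;
            if (next < tokens.length && isUpperInitial(tokens[next])) {
                Map<String, String> features = new LinkedHashMap<>();
                features.put("Fullname", tokens[entry.getKey()] + " " + tokens[next]);
                // Copy the gender only where phase 1 recorded one.
                if (entry.getValue() != null) features.put("gender", entry.getValue());
                result.add(features);
            }
        }
        return result;
    }

    static List<Map<String, String>> annotate(String text) {
        String[] tokens = text.trim().split("\\s+");
        return fullNamePhase(tokens, firstNamePhase(tokens));
    }

    public static void main(String[] args) {
        System.out.println(annotate("Jane Rooney and Wayne Rooney and Jan Rooney"));
    }
}
```

For the example text this yields three full names, with a gender for “Jane Rooney” and “Wayne Rooney” and none for “Jan Rooney”. The second phase never consults the gazetteer; it relies only on what the first phase left behind, which is the point of using temporary annotations.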

Rule: FirstName
// Fred
(
  {Lookup.majorType == person_first}
):person
-->
{
  gate.AnnotationSet person = (gate.AnnotationSet)bindings.get("person");
  gate.Annotation personAnn = (gate.Annotation)person.iterator().next();
  gate.FeatureMap features = Factory.newFeatureMap();

  // find out if the gender is unambiguous
  String gender = (String)personAnn.getFeatures().get("minorType");
  boolean ambig = false;
  gate.FeatureMap constraints = Factory.newFeatureMap();
  constraints.put("majorType", "person_first");
  Iterator lookupsIter =
    inputAS.get(personAnn.getStartNode().getOffset()).get("Lookup", constraints).iterator();
  while(!ambig && lookupsIter.hasNext()) {
    gate.Annotation anAnnot = (gate.Annotation)lookupsIter.next();
    // we're only interested in annots of the same length
    if(anAnnot.getEndNode().getOffset().equals(personAnn.getEndNode().getOffset())) {
      ambig = !gender.equals(anAnnot.getFeatures().get("minorType"));
    }
  }
  if(!ambig) features.put("gender", gender);
  features.put("rule", "FirstName");
  // write to the output annotation set, as in Example 8
  outputAS.add(person.firstNode(), person.lastNode(), "FirstPerson", features);
}
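
The ambiguity test inside this rule can be isolated as a small plain-Java sketch. It is not GATE code: the Lookup class below is a hypothetical stand-in for a Lookup annotation, and the list passed in is assumed to be already restricted to person_first Lookups starting at the same offset as the matched one, which is what the inputAS.get(...).get("Lookup", constraints) call is used for above.

```java
import java.util.Arrays;
import java.util.List;

class GenderAmbiguity {
    // Hypothetical stand-in for a Lookup annotation with majorType person_first.
    static final class Lookup {
        final long start;
        final long end;
        final String minorType;

        Lookup(long start, long end, String minorType) {
            this.start = start;
            this.end = end;
            this.minorType = minorType;
        }
    }

    // The gender is ambiguous if another Lookup over exactly the same span
    // carries a different minorType. Lookups of a different length are ignored.
    static boolean isAmbiguous(Lookup matched, List<Lookup> sameStart) {
        for (Lookup other : sameStart) {
            if (other.end == matched.end
                    && !matched.minorType.equals(other.minorType)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Lookup asFemale = new Lookup(0, 3, "female");
        Lookup asMale = new Lookup(0, 3, "male");
        Lookup longer = new Lookup(0, 9, "male");
        // Two readings of the same span disagree, so the gender is ambiguous.
        System.out.println(isAmbiguous(asFemale, Arrays.asList(asFemale, asMale)));
        // A longer Lookup at the same start does not count.
        System.out.println(isAmbiguous(asFemale, Arrays.asList(asFemale, longer)));
    }
}
```

Under this reading, the rule detects ambiguity from conflicting Lookup annotations over the same span, rather than from a particular minorType value.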
