Compiler Lab Antlr

Uploaded by

hariomkantsharma

Introduction to ANTLR

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
It was developed by Terence Parr and is widely used to build languages, tools, and frameworks.
From a grammar, ANTLR generates a parser that can build and walk parse trees.
ANTLR is a tool (program) that converts a grammar into a parser:
ANTLR Supports Multiple Target Languages

ANTLR can also generate C#, JavaScript, and C++ parsers.
What kinds of inputs need parsing?
Data formats: there are thousands of different data formats. There is a good chance that you will need to write an application to process data that is in a data format. You need ANTLR!
Examples of popular data formats: XML, JSON, Comma-Separated Values (CSV), key-value pairs

Programming languages: there are hundreds of different programming languages. It is possible that you will need to write an application to process a program written in a programming language. You need ANTLR!
Examples of popular programming languages: Java, Python, C++, C#, C
ANTLR Installation
1. Install Java (version 11 or higher)
2. Download

$ cd /usr/local/lib
$ sudo curl -O https://www.antlr.org/download/antlr-4.13.1-complete.jar

Add antlr-4.13.1-complete.jar to your CLASSPATH:


$ export CLASSPATH=".:/usr/local/lib/antlr-4.13.1-complete.jar:$CLASSPATH"
It's also a good idea to put this in your .bash_profile or whatever your startup script is.

Create aliases for the ANTLR Tool and the TestRig:

$ alias antlr4='java -Xmx500M -cp "/usr/local/lib/antlr-4.13.1-complete.jar:$CLASSPATH" org.antlr.v4.Tool'
$ alias grun='java -Xmx500M -cp "/usr/local/lib/antlr-4.13.1-complete.jar:$CLASSPATH" org.antlr.v4.gui.TestRig'

Testing the installation

$ java org.antlr.v4.Tool
ANTLR Parser Generator Version 4.13.1
-o ___ specify output directory where all output is generated
-lib ___ specify location of .tokens files
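
If the test command above fails to find org.antlr.v4.Tool, a common cause is that the jar is missing from CLASSPATH. The following is a minimal sketch, not part of ANTLR, for checking a CLASSPATH string. The function name find_antlr_jars and the ":" separator (Unix-style) are assumptions made for illustration.

```python
import re

def find_antlr_jars(classpath, sep=":"):
    """Return the CLASSPATH entries that look like an ANTLR 'complete' jar.

    Hypothetical helper for illustration; sep=":" assumes a Unix-style CLASSPATH.
    """
    pattern = re.compile(r"antlr-[0-9.]+-complete\.jar$")
    return [entry for entry in classpath.split(sep) if pattern.search(entry)]

# The CLASSPATH value used in the installation step (with $CLASSPATH empty)
cp = ".:/usr/local/lib/antlr-4.13.1-complete.jar:"
print(find_antlr_jars(cp))  # ['/usr/local/lib/antlr-4.13.1-complete.jar']
```

In practice the string would come from the environment, for example os.environ.get("CLASSPATH", ""). An empty result suggests the export line has not been run in the current shell.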
Two grammars
1. Lexer Grammar
2. Parser Grammar
Let's create our first parser
Let's create a lexer and parser grammar for input that consists of a greeting ('Hello' or 'Greetings')
followed by the name of a person. Here are two valid input strings:

Hello Maitri
Greetings Maitri

The lexer grammar describes how to break the input into two tokens:
(1) The greeting (Hello or Greetings), and
(2) The name of the person.

The parser grammar describes how to structure the tokens into a parse tree. For the input Hello Maitri, the root is a message node with two children: the GREETING token (Hello) and the ID token (Maitri).
The lexer grammar

lexer grammar MyLexer ;

GREETING : ('Hello' | 'Greetings') ;
ID : [a-zA-Z]+ ;
WS : [ \t\r\n]+ -> skip ;

• Token names must begin with a capital letter.
• The filename must have this suffix: g4
• The filename and the lexer name must be the same (here, MyLexer.g4).
• Any string in the input that matches 'Hello' or 'Greetings' is to be treated as a GREETING.
• 'Hello' and 'Greetings' are token values (string literals).
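
To make the lexer's job concrete, here is a minimal Python sketch that imitates what the three lexer rules do. It is not ANTLR-generated code; the names TOKEN_RULES and tokenize are hypothetical. It assumes the usual lexer behaviour: take the longest match, and when two rules match the same text, the rule listed first wins.

```python
import re

# Rules in the same order as MyLexer. GREETING comes before ID, so 'Hello'
# becomes a GREETING even though it also matches the ID pattern.
TOKEN_RULES = [
    ("GREETING", re.compile(r"Hello|Greetings")),
    ("ID", re.compile(r"[a-zA-Z]+")),
    ("WS", re.compile(r"[ \t\r\n]+")),
]

def tokenize(text):
    """Return (type, text) pairs: longest match wins, ties go to the earlier rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None
        for name, pattern in TOKEN_RULES:
            m = pattern.match(text, pos)
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        name, m = best
        if name != "WS":  # '-> skip' means whitespace produces no token
            tokens.append((name, m.group()))
        pos = m.end()
    return tokens

print(tokenize("Hello Maitri"))      # [('GREETING', 'Hello'), ('ID', 'Maitri')]
print(tokenize("Greetings Maitri"))  # [('GREETING', 'Greetings'), ('ID', 'Maitri')]
```

Note that an input such as Helloworld comes out as a single ID in this sketch, because the longer match takes priority over rule order.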
The Parser grammar

MyParser.g4

parser grammar MyParser;

options { tokenVocab=MyLexer; }

message : GREETING ID;

• options { tokenVocab=MyLexer; } means this parser will use the tokens generated by MyLexer.
• message : GREETING ID; means that if the input is valid, it must contain a token of type GREETING followed by a token of type ID.
• The filename must have this suffix: g4
• The filename prefixes for the lexer and parser must be the same (here, My).
• Parser rule names must begin with a lower-case letter.
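
The single parser rule can be imitated in a few lines of Python. This is a hedged sketch rather than what ANTLR generates: parse_message is a hypothetical name, and the input is assumed to be a list of (type, text) pairs such as a lexer would produce.

```python
def parse_message(tokens):
    """Check the rule  message : GREETING ID ;  against (type, text) tokens."""
    expected = ["GREETING", "ID"]
    actual = [tok_type for tok_type, _ in tokens]
    if actual != expected:
        raise SyntaxError(f"expected {expected}, got {actual}")
    # Render the parse tree in LISP-like text form: (rule child child ...)
    return "(message " + " ".join(text for _, text in tokens) + ")"

print(parse_message([("GREETING", "Hello"), ("ID", "Maitri")]))  # (message Hello Maitri)
```

A token sequence in any other order, for example ID followed by GREETING, raises SyntaxError here. This corresponds to the parser reporting invalid input.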
Let's play with a simple grammar:
grammar Expr;
prog: expr EOF ;
expr: expr ('*'|'/') expr
| expr ('+'|'-') expr
| INT
| '(' expr ')'
;
NEWLINE : [\r\n]+ -> skip;
INT : [0-9]+ ;

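To parse and get the parse tree in text form, use the antlr4-parse command (it comes with the separate antlr4-tools package, installed with pip install antlr4-tools):
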
$ antlr4-parse Expr.g4 prog -tree
10+20*30
^D
(prog:1 (expr:2 (expr:3 10) + (expr:1 (expr:3 20) * (expr:3 30))) <EOF>)

(Note: ^D means control-D and indicates "end of input" on Unix; use ^Z on Windows.)
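
The tree above shows that * binds tighter than +, because the '*'|'/' alternative is listed before the '+'|'-' alternative in the grammar. The following Python sketch reproduces the same tree shape with a hand-written precedence-climbing parser. This is a stand-in technique, not the algorithm ANTLR itself uses. The names PRECEDENCE, tokenize and parse_prog are hypothetical, and the output omits the :N alternative numbers that antlr4-parse prints.

```python
import re

# Higher number = binds tighter, mirroring the order of alternatives in Expr.g4
PRECEDENCE = {"*": 2, "/": 2, "+": 1, "-": 1}

def tokenize(text):
    # INT tokens plus single-character operators and parentheses.
    # Simplification: any other character is silently ignored in this sketch.
    return re.findall(r"[0-9]+|[-+*/()]", text)

def parse_prog(text):
    """Parse  prog : expr EOF ;  and return the tree in LISP-like text form."""
    tokens = tokenize(text)
    pos = 0

    def parse_primary():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok.isdigit():
            return f"(expr {tok})"
        if tok == "(":
            inner = parse_expr(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return f"(expr ( {inner} ))"
        raise SyntaxError(f"unexpected token {tok!r}")

    def parse_expr(min_prec):
        nonlocal pos
        left = parse_primary()
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            # Requiring a strictly higher precedence on the right makes operators left-associative
            right = parse_expr(PRECEDENCE[op] + 1)
            left = f"(expr {left} {op} {right})"
        return left

    tree = parse_expr(1)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return f"(prog {tree} <EOF>)"

print(parse_prog("10+20*30"))
# (prog (expr (expr 10) + (expr (expr 20) * (expr 30))) <EOF>)
```

With the input 10-20-30, the same sketch groups to the left, giving (expr (expr (expr 10) - (expr 20)) - (expr 30)) inside prog.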
Here's how to get the tokens and trace through the parse:
$ antlr4-parse Expr.g4 prog -tokens -trace
10+20*30
^D

[@0,0:1='10',<INT>,1:0]
[@1,2:2='+',<'+'>,1:2]
…..
enter prog, LT(1)=10
enter expr, LT(1)=10
consume [@0,0:1='10',<8>,1:0] rule expr
enter expr, LT(1)=+
…..
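
Each token line in the listing has the form [@index,start:stop='text',<type>,line:column], where start and stop are inclusive character offsets into the input. As a rough illustration, this Python sketch prints the tokens of a single-line input in that format. The function name token_lines is hypothetical, the token patterns are a simplification of the Expr grammar, and the final EOF token is left out.

```python
import re

def token_lines(text):
    """Format the tokens of a one-line input as [@index,start:stop='text',<type>,line:column]."""
    lines = []
    for index, m in enumerate(re.finditer(r"[0-9]+|[-+*/()]", text)):
        tok = m.group()
        # Literal tokens such as '+' are shown by their quoted text, INT by its rule name
        tok_type = "INT" if tok.isdigit() else f"'{tok}'"
        start, stop = m.start(), m.end() - 1  # the stop offset is inclusive
        # Single-line input: the line is always 1 and the column equals the start offset
        lines.append(f"[@{index},{start}:{stop}='{tok}',<{tok_type}>,1:{start}]")
    return lines

for line in token_lines("10+20*30"):
    print(line)
# [@0,0:1='10',<INT>,1:0]
# [@1,2:2='+',<'+'>,1:2]
# [@2,3:4='20',<INT>,1:3]
# [@3,5:5='*',<'*'>,1:5]
# [@4,6:7='30',<INT>,1:6]
```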

Here's how to get a visual tree view:


$ antlr4-parse Expr.g4 prog -gui
10+20*30
^D

A Java-based GUI window will pop up showing the parse tree.


References
• https://www.xfront.com/ANTLR/
• https://www.antlr.org/
