
Compiler Project Abstract

Uploaded by

RITHIK JOSHUA


PROJECT ABSTRACT

LEXICAL ANALYZER FOR C LANGUAGE

INTRODUCTION
The goal of this project is to implement a Lexical Analyzer for the C programming language.
Lexical analysis is the first phase of the compilation process, where the input source code is
converted into a sequence of tokens. Each token represents a basic unit of meaning such as
keywords, identifiers, literals, operators, and punctuation. The tool is built using Flex,
a widely used lexical analyzer generator, and prints each token it finds while processing a
C program file.

OBJECTIVES

The primary objectives of this project are:

1. To read C source code, break it into individual components (tokens), and categorize them
into predefined token types (keywords, operators, numbers, etc.).
2. To ignore unnecessary elements like comments and whitespace.

3. To detect and report characters that do not fit into any valid token category.

4. To support the user by reporting basic token information in a human-readable format, such
as token type and value.
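
As a rough illustration of objective 1, the sketch below categorizes a single word as a keyword or an identifier in plain C. It is not part of the project's Flex source; the names TokenType, KEYWORDS, and classify_word are hypothetical, and the keyword list is limited to the words the project's rules handle.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token categories; the names are illustrative only. */
typedef enum { TOK_KEYWORD, TOK_IDENTIFIER, TOK_UNKNOWN } TokenType;

/* The keyword subset covered by the project's Flex rules. */
static const char *const KEYWORDS[] = {
    "if", "else", "while", "for", "return", "break",
    "continue", "int", "float", "char", "void"
};

/* Classify a whole word as a keyword, an identifier, or neither. */
TokenType classify_word(const char *word)
{
    size_t i;
    size_t count = sizeof KEYWORDS / sizeof KEYWORDS[0];

    /* Keywords are checked first, as in the Flex rule order. */
    for (i = 0; i < count; i++) {
        if (strcmp(word, KEYWORDS[i]) == 0)
            return TOK_KEYWORD;
    }

    /* An identifier starts with a letter or an underscore... */
    if (!(isalpha((unsigned char)word[0]) || word[0] == '_'))
        return TOK_UNKNOWN;

    /* ...and continues with letters, digits, or underscores. */
    for (i = 1; word[i] != '\0'; i++) {
        if (!(isalnum((unsigned char)word[i]) || word[i] == '_'))
            return TOK_UNKNOWN;
    }
    return TOK_IDENTIFIER;
}
```

Because C keywords are case-sensitive, a word such as While falls through the keyword check and is classified as an identifier.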

FEATURES

- Keyword Recognition: The analyzer identifies common C keywords such as `int`, `if`, `else`,
`while`, `for`, etc.

- Identifier Matching: It detects valid identifiers based on C language rules, including variable
names and function names.
- Number Detection: It processes integer, floating-point, octal, and hexadecimal literals in
C programs.
- String and Character Literals: The tool correctly identifies and classifies string and
character literals.
- Operators and Punctuation: Recognizes a variety of operators like `+`, `-`, `*`, `/`, relational
operators (`==`, `!=`, `<`, `>`), and punctuation marks like `{`, `}`, `;`, etc.
- Comment Ignoring: Single-line (`//`) and multi-line comments (`/* */`) are ignored during
tokenization.
- Error Handling: The analyzer prints a message when encountering any unrecognized or
invalid characters.
- File Handling: Users can provide a source file as input, allowing the analyzer to process
entire C programs.
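
The number-detection feature can be sketched by hand in C to show how the four literal forms differ. This is only an approximation of what the regular expressions in the source listing accept (no suffixes, exponents, or signs); NumberKind, skip_digits, and classify_number are hypothetical names, not part of the project.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical literal categories; the names are illustrative only. */
typedef enum {
    NUM_DECIMAL, NUM_OCTAL, NUM_HEX, NUM_FLOAT, NUM_INVALID
} NumberKind;

/* Advance past a run of decimal digits and return the new index. */
static size_t skip_digits(const char *s, size_t i)
{
    while (isdigit((unsigned char)s[i]))
        i++;
    return i;
}

/* Classify a complete string as one of the numeric literal forms. */
NumberKind classify_number(const char *s)
{
    size_t i;

    /* Hexadecimal: 0x or 0X followed by at least one hex digit. */
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        i = 2;
        if (!isxdigit((unsigned char)s[i]))
            return NUM_INVALID;
        while (isxdigit((unsigned char)s[i]))
            i++;
        return s[i] == '\0' ? NUM_HEX : NUM_INVALID;
    }

    if (!isdigit((unsigned char)s[0]))
        return NUM_INVALID;
    i = skip_digits(s, 0);

    /* Floating point: digits, a dot, then at least one more digit. */
    if (s[i] == '.') {
        size_t start = i + 1;
        i = skip_digits(s, start);
        return (i > start && s[i] == '\0') ? NUM_FLOAT : NUM_INVALID;
    }
    if (s[i] != '\0')
        return NUM_INVALID;

    /* Octal: a leading 0 followed only by digits 0 to 7. */
    if (s[0] == '0') {
        for (i = 1; s[i] != '\0'; i++) {
            if (s[i] > '7')
                return NUM_INVALID;
        }
        return NUM_OCTAL;
    }
    return NUM_DECIMAL;
}
```

Under these rules a lone 0 counts as octal, which matches how the C language itself defines it, and 08 is rejected because 8 is not an octal digit.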

METHODOLOGY

The project uses Flex, a powerful tool for lexical analysis, to define rules for recognizing
different types of tokens. The Flex rules for token categories (keywords, operators, identifiers,
etc.) are implemented using regular expressions. The lexical analyzer is built to read a file
containing C code, break it down into tokens, and print the token type and value to the console.
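
When several Flex rules match at the same position, Flex chooses the one that consumes the most input, and breaks ties in favor of the rule listed first. That is why `==` is reported as one operator rather than two `=` tokens. The sketch below imitates the longest-match part of that behavior for the project's operator set; OPERATORS and longest_operator are hypothetical names, and this is not how Flex is implemented internally.

```c
#include <stddef.h>
#include <string.h>

/* The operators recognized by the project's rules. */
static const char *const OPERATORS[] = {
    "+", "-", "*", "/", "=", "==", "<", ">", "<=", ">=", "!=", "++", "--"
};

/*
 * Return the length of the longest operator that starts at `input`,
 * or 0 if no operator matches there.
 */
size_t longest_operator(const char *input)
{
    size_t best = 0;
    size_t count = sizeof OPERATORS / sizeof OPERATORS[0];

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(OPERATORS[i]);
        /* Keep a candidate only if it is longer than the best so far. */
        if (len > best && strncmp(input, OPERATORS[i], len) == 0)
            best = len;
    }
    return best;
}
```

A caller would consume the returned number of characters as one token and continue scanning from there; a return value of 0 means the input at that position is not an operator.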

CONCLUSION

This project successfully implements a basic Lexical Analyzer for C programs, allowing users to
visualize the structure of a C source file by identifying key tokens. By recognizing keywords,
identifiers, operators, literals, and more, the analyzer provides a foundational step toward further
compiler development. It offers essential features such as reporting unrecognized characters and
printing each token's type and value for basic C programs.

In future work, the lexical analyzer could be extended to handle more complex C language
features, such as preprocessor directives and macros. Additionally, it could be integrated with a
parser to continue the syntactic and semantic analysis phases, bringing it closer to a full-fledged
compiler.

FUTURE ENHANCEMENTS

- Extend the analyzer to handle preprocessor directives and macros.

- Implement scope handling for variables and function declarations.

- Integrate the lexical analyzer with a parser for syntactic analysis.

This project demonstrates the importance of lexical analysis in compiler construction and
provides a hands-on understanding of how a compiler breaks down source code into
meaningful units.
LEXICAL ANALYZER FOR C LANGUAGE
SOURCE CODE
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print one token as a "type, value" pair. */
void print_token(const char *token_type, const char *token_value)
{
    printf("Token Type: %s, Token Value: %s\n", token_type, token_value);
}
%}

%option noyywrap

%%

if          { print_token("KEYWORD", "if"); }
else        { print_token("KEYWORD", "else"); }
while       { print_token("KEYWORD", "while"); }
for         { print_token("KEYWORD", "for"); }
return      { print_token("KEYWORD", "return"); }
break       { print_token("KEYWORD", "break"); }
continue    { print_token("KEYWORD", "continue"); }
int         { print_token("KEYWORD", "int"); }
float       { print_token("KEYWORD", "float"); }
char        { print_token("KEYWORD", "char"); }
void        { print_token("KEYWORD", "void"); }

[a-zA-Z_][a-zA-Z0-9_]*  { print_token("IDENTIFIER", yytext); }

0[xX][0-9a-fA-F]+       { print_token("NUMBER", yytext); }
0[0-7]*                 { print_token("NUMBER", yytext); }
[1-9][0-9]*             { print_token("NUMBER", yytext); }
[0-9]+\.[0-9]+          { print_token("NUMBER", yytext); }

\"([^\\\"]|\\.)*\"      { /* String literals */ print_token("STRING", yytext); }
\'([^\\\']|\\.)\'       { /* Character literals */ print_token("CHAR_LITERAL", yytext); }

"+"         { print_token("OPERATOR", "+"); }
"-"         { print_token("OPERATOR", "-"); }
"*"         { print_token("OPERATOR", "*"); }
"/"         { print_token("OPERATOR", "/"); }
"="         { print_token("OPERATOR", "="); }
"=="        { print_token("OPERATOR", "=="); }
"<"         { print_token("OPERATOR", "<"); }
">"         { print_token("OPERATOR", ">"); }
"<="        { print_token("OPERATOR", "<="); }
">="        { print_token("OPERATOR", ">="); }
"!="        { print_token("OPERATOR", "!="); }
"++"        { print_token("OPERATOR", "++"); }
"--"        { print_token("OPERATOR", "--"); }

"{"         { print_token("PUNCTUATION", "{"); }
"}"         { print_token("PUNCTUATION", "}"); }
"("         { print_token("PUNCTUATION", "("); }
")"         { print_token("PUNCTUATION", ")"); }
";"         { print_token("PUNCTUATION", ";"); }
","         { print_token("PUNCTUATION", ","); }

"//".*                          { /* Ignore single-line comments */ }
"/*"([^*]|\*+[^*/])*\*+"/"      { /* Ignore multi-line comments */ }

[ \t\n]+    { /* Ignore whitespace */ }

.           { printf("Unrecognized character: '%s'\n", yytext); }

%%

int main(int argc, char **argv)
{
    /* Read from the file named on the command line, if any;
       otherwise yylex() reads from standard input. */
    if (argc > 1) {
        FILE *file = fopen(argv[1], "r");
        if (!file) {
            perror("Error opening file");
            return EXIT_FAILURE;
        }
        yyin = file;
    }

    yylex();

    return EXIT_SUCCESS;
}


