Logbook

EXPERIMENT NO.

AIM: To implement a Lexical Analyzer in Python.

PROGRAM :
import re

# Define sets of keywords, operators, and punctuation
keywords = {"int", "float", "if", "else", "while", "return", "for", "char", "double", "include"}
operators = {'+', '-', '*', '/', '=', '==', '!=', '>', '<', '>=', '<=', '&&', '||', '++', '--'}
punctuations = {',', ';', '(', ')', '{', '}', '[', ']'}

def lexical_analyzer(code):
    # Split the code into tokens. The alternation is ordered so that
    # header directives, floating-point constants, words, and
    # multi-character operators are matched whole before the final
    # catch-all for any single non-space character.
    token_pattern = r'#\s*include\s*<[a-zA-Z.]+>|\d+\.\d+|\w+|==|!=|>=|<=|&&|\|\||\+\+|--|\S'
    tokens = re.findall(token_pattern, code)

    # Categorizing tokens
    headers = []
    found_keywords = []
    found_operators = []
    found_punctuations = []
    found_constants = []
    found_identifiers = []

    for token in tokens:
        if token in keywords:
            found_keywords.append(token)
        elif re.match(r'\d+', token):  # Constant detection (integers and floats)
            found_constants.append(token)
        elif token in operators:
            found_operators.append(token)
        elif token in punctuations:
            found_punctuations.append(token)
        elif re.match(r'#[ ]*include[ ]*<[a-zA-Z.]+>', token):  # Header files
            headers.append(token)
        elif re.match(r'[a-zA-Z_][a-zA-Z_0-9]*', token):  # Identifiers
            found_identifiers.append(token)

    # Printing categorized tokens
    print("Header Files:", headers)
    print("Keywords:", found_keywords)
    print("Operators:", found_operators)
    print("Punctuation Marks:", found_punctuations)
    print("Constants:", found_constants)
    print("Identifiers:", found_identifiers)

# Example input C-like code
code = """
#include<math.h>
double power(double base, int exp) {
    double result = 1.0;
    while (exp > 0) {
        result = result * base;
        exp--;
    }
    return result;
}
"""
print("Input Program:")
print(code)
print("\nLexical Analysis:\n")
lexical_analyzer(code)

OUTPUT :
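
Running the program above prints the input followed by the categorized token lists; the expected output is sketched below (each list follows token order in the input):

Input Program:

#include<math.h>
double power(double base, int exp) {
    double result = 1.0;
    while (exp > 0) {
        result = result * base;
        exp--;
    }
    return result;
}

Lexical Analysis:

Header Files: ['#include<math.h>']
Keywords: ['double', 'double', 'int', 'double', 'while', 'return']
Operators: ['=', '>', '=', '*', '--']
Punctuation Marks: ['(', ',', ')', '{', ';', '(', ')', '{', ';', ';', '}', ';', '}']
Constants: ['1.0', '0']
Identifiers: ['power', 'base', 'exp', 'result', 'exp', 'result', 'result', 'base', 'exp', 'result']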
