NLP 04

Uploaded by

TAHA MURADE [UCOE-3968]

Experiment 4

Name of the Student: - Taha Murade

Roll No. 74
Date of Practical Performed: - 01/08/2024 Staff Signature with Date & Marks

Aim: Write a program to perform regular expression operations, morphological analysis, and word generation for any given text.
Theory: A regular expression (RE) is a language for specifying text search strings. An RE lets us match or find strings, or sets of strings, using a specialized syntax held in a pattern. Regular expressions are used to search text in UNIX tools as well as in MS Word in much the same way, although the exact syntax differs between tools.
Commonly used functions of Python's re module:

Function    Description
findall     Returns a list containing all matches
search      Returns a Match object if there is a match anywhere in the string
split       Returns a list where the string has been split at each match
sub         Replaces one or many matches with a string
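
The four functions listed above can be tried out with a minimal sketch like the following. It uses only Python's standard re module; the sample sentence and the patterns are illustrative assumptions, not part of the experiment itself.

```python
import re

# Illustrative sample text and pattern (assumptions for this sketch).
text = "The rain in Spain stays mainly in the plain"
pattern = r"ai"

# findall: list of every non-overlapping match of the pattern.
all_matches = re.findall(pattern, text)

# search: Match object for the first match anywhere in the string, or None.
first = re.search(pattern, text)

# split: pieces of the string between runs of whitespace.
words = re.split(r"\s+", text)

# sub: replace every run of whitespace with an underscore.
replaced = re.sub(r"\s+", "_", text)

print(all_matches)    # ['ai', 'ai', 'ai', 'ai']
print(first.start())  # 5
print(words)
print(replaced)       # The_rain_in_Spain_stays_mainly_in_the_plain
```

Note that search returns None when nothing matches, so real code should check the result before calling methods such as start() on it.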

Regular Expression:
An RE is a string that defines a text-matching pattern. In NLP, REs are used to find strings that follow certain patterns in a given text. A regular expression is built up from smaller expressions using defining rules. Simple operations in regular expressions:
Kleene Closure: If E is a regular expression, then E* is a regular expression.
Positive Closure: If E is a regular expression, then E+ is a regular expression.
Or: If E1 and E2 are regular expressions, then E1 | E2 is a regular expression.
Concatenation: If E1 and E2 are regular expressions, then E1E2 is a regular expression.
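
The four building rules can be checked with a short sketch in Python's re module. The helper name accepts and the example patterns are hypothetical, chosen only for illustration; re.fullmatch is used so that a string is accepted only if the whole string matches the expression.

```python
import re


def accepts(pattern, s):
    """Return True if the entire string s matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, s) is not None


# Kleene closure: a* matches zero or more repetitions of "a".
kleene = (accepts(r"a*", ""), accepts(r"a*", "aaa"))

# Positive closure: a+ requires at least one "a".
positive = (accepts(r"a+", ""), accepts(r"a+", "aaa"))

# Or: cat|dog matches either alternative, and nothing else.
union = (accepts(r"cat|dog", "dog"), accepts(r"cat|dog", "cow"))

# Concatenation: (ab)(c) matches "ab" followed immediately by "c".
concat = (accepts(r"(ab)(c)", "abc"), accepts(r"(ab)(c)", "ab"))

print(kleene)    # (True, True)
print(positive)  # (False, True)
print(union)     # (True, False)
print(concat)    # (True, False)
```

The only difference between the two closures shows up on the empty string: E* accepts it, E+ does not.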

Some ways to represent REs are as follows:

Name                                  Regular Expression   Matched Strings
Disjunction of characters             [wW]oodchuck         woodchuck, Woodchuck
Range                                 [A-Z]                Any single uppercase letter
Disjunction negation                  [^sS]                Any single character except “s” and “S”
Disjunction operator (pipe symbol)    fl(y|ies)            “fly” or “flies”
Kleene Closure                        ba*                  b, ba, baa, …
Positive Closure                      ba+                  ba, baa, baaa, …
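
The patterns in the table can be exercised with a small sketch such as the one below. The sample string is an assumption made up for this illustration. Two details differ slightly from the table: the group in fl(y|ies) is written as a non-capturing group, fl(?:y|ies), because re.findall would otherwise return only the captured part, and the closure patterns are wrapped in \b so that they match whole words only.

```python
import re

# Illustrative sample text (an assumption for this sketch).
sample = "A Woodchuck and a woodchuck saw flies fly past: b ba baa"

# Disjunction of characters: either "w" or "W" as the first letter.
chucks = re.findall(r"[wW]oodchuck", sample)

# Range: any single uppercase letter.
uppercase = re.findall(r"[A-Z]", sample)

# Disjunction negation: any single character except "s" and "S".
not_s = re.findall(r"[^sS]", "Sass")

# Disjunction operator: "fly" or "flies" (non-capturing group).
fl_words = re.findall(r"fl(?:y|ies)", sample)

# Kleene closure: "b" followed by zero or more "a".
kleene = re.findall(r"\bba*\b", sample)

# Positive closure: "b" followed by one or more "a".
positive = re.findall(r"\bba+\b", sample)

print(chucks)     # ['Woodchuck', 'woodchuck']
print(uppercase)  # ['A', 'W']
print(not_s)      # ['a']
print(fl_words)   # ['flies', 'fly']
print(kleene)     # ['b', 'ba', 'baa']
print(positive)   # ['ba', 'baa']
```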

Code:
import re

import nltk
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer

# Download the necessary NLTK data files.
nltk.download('wordnet')
# OMW (Open Multilingual Wordnet) is a multilingual lexical database that
# extends the WordNet database to multiple languages. It provides
# translations and mappings between different languages' synsets (sets of
# synonyms) and the English WordNet synsets.
nltk.download('omw-1.4')
# Tokenizer models; only needed if nltk.word_tokenize is used.
nltk.download('punkt')


def perform_regex_operations(text, pattern):
    """
    Perform regular expression operations on the given text.
    """
    matches = re.findall(pattern, text)
    return matches


def perform_morphological_analysis(word):
    """
    Perform simple morphological analysis (lemmatization) on the given word.
    """
    lemmatizer = WordNetLemmatizer()
    # lemmatize() treats the word as a noun unless a pos argument is given.
    lemma = lemmatizer.lemmatize(word)
    return lemma


def generate_synonyms(word):
    """
    Generate synonyms for the given word using WordNet.
    """
    # Collect the lemma names of every synset the word belongs to.
    synonyms = []
    for syn in wordnet.synsets(word):
        for lemma in syn.lemmas():
            synonyms.append(lemma.name())
    # Return a set so that duplicate names are dropped.
    return set(synonyms)


# Example usage
if __name__ == "__main__":
    # Sample text
    text = ("The quick brown fox jumps over the lazy dog and it starts from "
            "1 to end of 10_")

    # Regular expression pattern (find all words).
    # \w matches any alphanumeric character (letters and digits) and the
    # underscore (_).
    # \b is the word boundary anchor: it asserts a position where a word
    # starts or ends, so the match occurs at the boundary of a word.
    pattern = r'\b\w+\b'

    # Perform regex operations
    matches = perform_regex_operations(text, pattern)
    print("Matches:", matches)

    # Perform morphological analysis
    # Other words to try: "scouts", 'Jumps', 'boys,', 'girls', 'runs',
    # 'meaningfull'
    word = 'jumps'
    lemma = perform_morphological_analysis(word)
    print(f"Lemmatized form of '{word}': {lemma}")

    # Generate synonyms
    word = "quick"
    synonyms = generate_synonyms(word)
    print(f"Synonyms for '{word}': {synonyms}")
Output

Conclusion:
Thus, we have successfully studied and performed the concept of lemmatization, stemming
