NLP 04
Regular Expression:
A regular expression (RE) is a string that defines a text-matching pattern. In NLP, REs are
used to find strings that follow a certain pattern in a given text. A regular expression is
built up from simpler expressions using a set of defining rules. Simple operations in
regular expressions:
Kleene Closure: If E is a regular expression, then E* is a regular expression.
Positive Closure: If E is a regular expression, then E+ is a regular expression.
Or: If E1 and E2 are regular expressions, then E1 | E2 is a regular expression.
Concatenation: If E1 and E2 are regular expressions, then E1E2 is a regular expression.
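The four rules above can be tried out with Python's built-in re module. This is a minimal sketch; the sample patterns and the helper name accepts are illustrative choices, not part of the original assignment.

```python
import re

# Each pattern mirrors one construction rule from the list above.
kleene = re.compile(r"(ab)*")        # E*      : zero or more repetitions of "ab"
positive = re.compile(r"(ab)+")      # E+      : one or more repetitions of "ab"
union = re.compile(r"cat|dog")       # E1 | E2 : either "cat" or "dog"
concat = re.compile(r"(cat|dog)s")   # E1E2    : "cat" or "dog" followed by "s"


def accepts(pattern, s):
    """Return True if the whole string s matches the pattern (hypothetical helper)."""
    return pattern.fullmatch(s) is not None


print(accepts(kleene, ""))        # True  (zero repetitions are allowed)
print(accepts(positive, ""))      # False (at least one repetition is required)
print(accepts(positive, "abab"))  # True
print(accepts(union, "dog"))      # True
print(accepts(union, "cow"))      # False
print(accepts(concat, "dogs"))    # True
print(accepts(concat, "dog"))     # False
```

The only difference between E* and E+ is whether the empty string is accepted, which the first two calls make visible.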
import re

from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer

# Requires the WordNet data: run nltk.download('wordnet') once beforehand.


def perform_morphological_analysis(word):
    """
    Perform simple morphological analysis (lemmatization) on the given
    word.
    """
    lemmatizer = WordNetLemmatizer()
    lemma = lemmatizer.lemmatize(word)
    return lemma


def generate_synonyms(word):
    """
    Generate synonyms for the given word using WordNet.
    """
    synonyms = []  # Collects the synonyms found for the given word.
    for syn in wordnet.synsets(word):
        for lemma in syn.lemmas():
            synonyms.append(lemma.name())
    return set(synonyms)


# Example usage
if __name__ == "__main__":
    # Sample text
    text = ("The quick brown fox jumps over the lazy dog and it starts from "
            "1 to end of 10_")

    # Regular Expression Pattern (find all words)
    # \w matches any alphanumeric character (letters and digits) and the
    # underscore (_). \b is the word boundary anchor: it asserts a position
    # where a word starts or ends, so each match spans a whole word.
    pattern = r'\b\w+\b'
    words = re.findall(pattern, text)
    print(f"Words: {words}")

    # Morphological analysis (lemmatization) of each matched word
    lemmas = [perform_morphological_analysis(w) for w in words]
    print(f"Lemmas: {lemmas}")

    # Generate synonyms
    word = "quick"
    synonyms = generate_synonyms(word)
    print(f"Synonyms for '{word}': {synonyms}")
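The word pattern r'\b\w+\b' from the program above can also be checked on its own, without NLTK. The sketch below uses a made-up sample string and a hypothetical helper find_words to show two consequences of the definition of \w: a trailing underscore stays part of the word, and an apostrophe splits a word in two.

```python
import re

WORD_PATTERN = r"\b\w+\b"  # same pattern as in the program above


def find_words(text):
    """Return every maximal run of word characters in text (hypothetical helper)."""
    return re.findall(WORD_PATTERN, text)


sample = "from 1 to end of 10_, don't stop"
print(find_words(sample))
# ['from', '1', 'to', 'end', 'of', '10_', 'don', 't', 'stop']
```

Because the apostrophe is not a word character, "don't" is returned as two tokens. A pattern meant to keep contractions together would need to allow the apostrophe explicitly.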
Output
Conclusion:
Thus, we have successfully studied and performed the concept of lemmatization, stemming