NLP Exp 4

Uploaded by sonkambleabhi54

8/27/24, 3:04 PM Untitled5.ipynb - Colab

import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.corpus import wordnet

# Ensure the necessary NLTK resources are available.
# Note: NLTK 3.9 and later may ask for 'punkt_tab' and
# 'averaged_perceptron_tagger_eng' instead of 'punkt' and
# 'averaged_perceptron_tagger'.
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('averaged_perceptron_tagger')

def get_wordnet_pos(treebank_tag):
    """Convert Penn Treebank tags to WordNet POS constants."""
    if treebank_tag.startswith('J'):
        return wordnet.ADJ
    elif treebank_tag.startswith('V'):
        return wordnet.VERB
    elif treebank_tag.startswith('N'):
        return wordnet.NOUN
    elif treebank_tag.startswith('R'):
        return wordnet.ADV
    else:
        return None

# Initialize stemmer and lemmatizer
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

# Example text
text = "The dogs are running faster than the better-trained dog."

# Tokenize the text and tag each token with its part of speech
tokens = word_tokenize(text)
tagged_tokens = nltk.pos_tag(tokens)

# Process each token
for token, tag in tagged_tokens:
    # Determine the WordNet POS tag
    wn_tag = get_wordnet_pos(tag)

    # Lemmatize the token; without a POS, lemmatize() treats it as a noun
    if wn_tag:
        lemmatized = lemmatizer.lemmatize(token, wn_tag)
    else:
        lemmatized = lemmatizer.lemmatize(token)

    # Stem the token
    stemmed = stemmer.stem(token)

    # Print results
    print(f"Token: {token}")
    print(f"POS Tag: {tag}")
    print(f"Lemmatized: {lemmatized}")
    print(f"Stemmed: {stemmed}")
    print()

Token: The
POS Tag: DT
Lemmatized: The
Stemmed: the

Token: dogs
POS Tag: NNS
Lemmatized: dog
Stemmed: dog

Token: are
POS Tag: VBP
Lemmatized: be
Stemmed: are

Token: running
POS Tag: VBG
Lemmatized: run
Stemmed: run

Token: faster
POS Tag: RBR
Lemmatized: faster
Stemmed: faster

Token: than
POS Tag: IN
Lemmatized: than
Stemmed: than

Token: the
POS Tag: DT
Lemmatized: the
Stemmed: the

Token: better-trained
POS Tag: JJ
Lemmatized: better-trained
Stemmed: better-train

Token: dog
POS Tag: NN
Lemmatized: dog
Stemmed: dog

Token: .
POS Tag: .
Lemmatized: .
Stemmed: .
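
The per-token output above is easier to compare when only the rows where the lemma and the stem differ are kept. The following is a minimal sketch of that comparison, not part of the original notebook: the rows are copied by hand from the output above, and `disagreements` and `format_table` are hypothetical helper names introduced here for illustration.

```python
# Rows copied from the notebook output above: (token, POS tag, lemma, stem).
rows = [
    ("The", "DT", "The", "the"),
    ("dogs", "NNS", "dog", "dog"),
    ("are", "VBP", "be", "are"),
    ("running", "VBG", "run", "run"),
    ("faster", "RBR", "faster", "faster"),
    ("than", "IN", "than", "than"),
    ("the", "DT", "the", "the"),
    ("better-trained", "JJ", "better-trained", "better-train"),
    ("dog", "NN", "dog", "dog"),
    (".", ".", ".", "."),
]


def disagreements(rows):
    """Return the rows whose lemma and stem differ (hypothetical helper)."""
    return [row for row in rows if row[2] != row[3]]


def format_table(rows):
    """Render rows as a fixed-width text table (hypothetical helper)."""
    header = ("Token", "POS", "Lemma", "Stem")
    table = [header, *rows]
    # Column width is the longest cell in that column, header included.
    widths = [max(len(r[i]) for r in table) for i in range(4)]
    lines = [
        "  ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip()
        for r in table
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table(disagreements(rows)))
```

For this sentence three tokens differ: the stemmer lower-cases "The", the lemmatizer maps "are" to "be" using its verb tag, and the stemmer strips the "-ed" suffix from "better-trained" while the lemmatizer leaves the hyphenated word unchanged.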

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Package wordnet is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] /root/nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
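
The tag conversion done by `get_wordnet_pos` in the code above depends only on the first letter of the Penn Treebank tag, so it can also be written as a lookup table. The sketch below is one possible alternative, not the notebook's own code. It assumes the WordNet POS constants are the single-character strings 'a', 'v', 'n' and 'r', written out literally so the example runs without NLTK installed; `PREFIX_TO_WORDNET` and `wordnet_pos_or_none` are hypothetical names.

```python
# WordNet POS codes as plain strings so the sketch runs without NLTK.
# Assumption: these equal wordnet.ADJ, wordnet.VERB, wordnet.NOUN, wordnet.ADV.
PREFIX_TO_WORDNET = {"J": "a", "V": "v", "N": "n", "R": "r"}


def wordnet_pos_or_none(treebank_tag):
    """Map a Penn Treebank tag to a WordNet POS code by its first letter.

    Returns None for tags with no WordNet counterpart (hypothetical helper).
    """
    # Slicing with [:1] is safe for an empty string, unlike indexing with [0].
    return PREFIX_TO_WORDNET.get(treebank_tag[:1])


if __name__ == "__main__":
    # Tags taken from the notebook output above.
    for tag in ["DT", "NNS", "VBP", "VBG", "RBR", "IN", "JJ", "NN", "."]:
        print(f"{tag:>3} -> {wordnet_pos_or_none(tag)}")
```

Like the `startswith` chain, this maps every tag beginning with "R" to the adverb code, including the particle tag RP, so callers that care about that distinction would need an extra check.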

https://colab.research.google.com/drive/13Y1y7SMqzJNqb3cI_X29q0jPLciyaBBz#scrollTo=Wm2zkrGxLiuO&printMode=true
