
NATURAL LANGUAGE PROCESSING LAB

WEEK-4
NAME: Varshini          ROLL NO: 21R21A7324
BRANCH: AIML            DATE: 21-03-2024
PROBLEM STATEMENT:
Perform tokenization, stemming, and lemmatization to carry out analysis of a text corpus.
CODE:
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.corpus import stopwords
import string
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
# Get input text from the user
text_input = input("Enter text for tokenization, stemming, and lemmatization: ")
# Tokenization
tokens = word_tokenize(text_input.lower())
print("Tokens:")
print(tokens[:20])
# Stemming
stemmer = PorterStemmer()
stemmed_tokens = [stemmer.stem(token) for token in tokens]
print("\nStemmed Tokens:")
print(stemmed_tokens[:20]) # Print the first 20 stemmed tokens as an example
# Lemmatization
lemmatizer = WordNetLemmatizer()
lemmatized_tokens = [lemmatizer.lemmatize(token) for token in tokens]
print("\nLemmatized Tokens:")
print(lemmatized_tokens[:20])
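NOTE:
By default, WordNetLemmatizer.lemmatize() treats every token as a noun, so verb forms such as "running" or "were" are returned unchanged. The sketch below (not part of the lab code above) shows one common way to pass a WordNet part-of-speech tag derived from nltk.pos_tag so that verbs, adjectives, and adverbs are also reduced; the mapping function and the sample sentence are illustrative assumptions.

import nltk
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

nltk.download('averaged_perceptron_tagger')

def to_wordnet_pos(treebank_tag):
    # Map Penn Treebank tags (as returned by nltk.pos_tag) to WordNet POS constants.
    if treebank_tag.startswith('J'):
        return wordnet.ADJ
    if treebank_tag.startswith('V'):
        return wordnet.VERB
    if treebank_tag.startswith('R'):
        return wordnet.ADV
    return wordnet.NOUN  # default, same as lemmatize() without a pos argument

lemmatizer = WordNetLemmatizer()
tokens = word_tokenize("the striped bats were hanging on their feet")
tagged = nltk.pos_tag(tokens)
lemmas = [lemmatizer.lemmatize(tok, to_wordnet_pos(tag)) for tok, tag in tagged]
print(lemmas)  # verbs and plural nouns are now reduced, e.g. 'be', 'hang', 'foot'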
OUTPUT: