NLTK - Stem

This document compares NLTK's Porter Stemmer and WordNet Lemmatizer for natural language processing tasks. The Porter Stemmer is simpler and faster but less accurate, while the WordNet Lemmatizer is more complex and slower but more accurate, especially when it is given a word's part of speech.

Uploaded by

pranavi


In [1]:

from nltk.stem import PorterStemmer
from nltk.stem import WordNetLemmatizer

# Note: WordNetLemmatizer needs the WordNet corpus; run nltk.download("wordnet") once if it is missing.

# Porter Stemmer
stemmer = PorterStemmer()
print("Stemmer:")
print("running ->", stemmer.stem("running"))  # Output: run (suffix stripped)
print("better ->", stemmer.stem("better"))  # Output: better (unchanged; no link to "good")
print("corpora ->", stemmer.stem("corpora"))  # Output: corpora (unchanged; the irregular plural of "corpus" is missed)

# WordNet Lemmatizer (treats words as nouns by default)
lemmatizer = WordNetLemmatizer()
print("\nLemmatizer:")
print("running ->", lemmatizer.lemmatize("running"))  # Output: running (treated as a noun)
print("better ->", lemmatizer.lemmatize("better"))  # Output: better (treated as a noun)
print("better (as adjective) ->", lemmatizer.lemmatize("better", pos="a"))  # Output: good
print("corpora ->", lemmatizer.lemmatize("corpora"))  # Output: corpus

Stemmer:
running -> run
better -> better
corpora -> corpora

Lemmatizer:
running -> running
better -> better
better (as adjective) -> good
corpora -> corpus

In [ ]:
#Porter Stemmer

Simpler and faster: It uses a rule-based approach to chop suffixes off words.
Less accurate: It may not always produce actual words and can lead to stemming errors. For instance, stemming "running" gives "run", which is a valid word, but stemming "studies" gives "studi", which is not. A cruder suffix stripper could even reduce "caring" to "car", an unrelated word.
Doesn't consider context: It focuses solely on the word itself, ignoring its part of speech (POS) and the surrounding words.
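The suffix-chopping idea can be sketched with a toy stemmer. This is an illustration only, not the Porter algorithm: the rule list, the minimum stem length, and the name `toy_stem` are assumptions made up for this example.

```python
# Toy suffix stripper: an illustration of rule-based stemming only.
# This is NOT the Porter algorithm; the rule list below is hypothetical.
SUFFIX_RULES = [
    ("ies", "i"),  # plural ending replaced by "i"
    ("ing", ""),   # gerund / participle ending removed
    ("ed", ""),    # past-tense ending removed
    ("s", ""),     # plain plural ending removed
]


def toy_stem(word: str) -> str:
    """Apply the first matching suffix rule, keeping at least three stem characters."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        stem_length = len(word) - len(suffix)
        if word.endswith(suffix) and stem_length >= 3:
            return word[:stem_length] + replacement
    return word


for w in ["walking", "studies", "caring", "corpora"]:
    print(w, "->", toy_stem(w))
```

Because the rules look only at the spelling, "caring" collapses to "car" and the irregular plural "corpora" passes through unchanged, which is the kind of error described above.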
#WordNet Lemmatizer

More complex and slower: Relies on a lexical database (WordNet) to map words to their dictionary base forms (lemmas).
More accurate: Aims to produce actual words that exist in the language.
Considers context (ideally): Accepts a part-of-speech (POS) tag to choose the most appropriate lemma (e.g., "running" as a verb maps to "run", while "running" as a noun stays "running"). Without a tag, it treats every word as a noun.
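One way to supply the POS information is to convert tagger output into the codes the lemmatizer expects. The sketch below assumes Penn Treebank style tags (such as those produced by `nltk.pos_tag`); the helper names `penn_to_wordnet` and `lemmatize_tagged` are hypothetical, and the lemmatizer is passed in as a parameter so the helpers themselves need no corpus data.

```python
# Hypothetical helpers: map Penn Treebank tag prefixes to WordNet POS codes.
# "n", "v", "a", "r" are the values of wordnet.NOUN, VERB, ADJ and ADV in NLTK.
PENN_PREFIX_TO_WORDNET = {"J": "a", "V": "v", "N": "n", "R": "r"}


def penn_to_wordnet(tag: str) -> str:
    """Return a WordNet POS code, falling back to noun (the lemmatizer's default)."""
    return PENN_PREFIX_TO_WORDNET.get(tag[:1].upper(), "n")


def lemmatize_tagged(tagged_words, lemmatizer):
    """Lemmatize (word, Penn tag) pairs with any object exposing lemmatize(word, pos=...)."""
    return [
        lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag))
        for word, tag in tagged_words
    ]
```

With NLTK installed and the WordNet corpus downloaded, the helpers could be called as `lemmatize_tagged([("running", "VBG"), ("better", "JJR")], WordNetLemmatizer())`, so that "running" is looked up as a verb and "better" as an adjective.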
