NLP Lab Programs

The document outlines several NLP lab programs using the NLTK library, including text tokenization, sentence extraction from documents, and removing stop words and punctuation. It also covers tokenization with stop words as delimiters and demonstrates stemming of words. Each program includes example code snippets and instructions for downloading necessary data.

Uploaded by

Boomika G

1. Tokenize a text
from nltk.tokenize import word_tokenize, sent_tokenize
import nltk

nltk.download('punkt')      # Download tokenizer data
nltk.download('punkt_tab')  # Newer NLTK releases also need this resource

# Example text
text = "NLP makes machines understand language. Tokenization is the first step."

# Sentence Tokenization
print("Sentences:", sent_tokenize(text))

# Word Tokenization
print("Words:", word_tokenize(text))

Output:
Sentences: ['NLP makes machines understand language.', 'Tokenization is the first step.']
Words: ['NLP', 'makes', 'machines', 'understand', 'language', '.', 'Tokenization', 'is', 'the', 'first', 'step', '.']

2. Extract the sentences of a text document


from nltk.tokenize import sent_tokenize
import nltk

nltk.download('punkt') # Download tokenizer data

# Read the text from a file
file_path = "example.txt"  # Replace with your file path
with open(file_path, 'r') as file:
    text = file.read()

# Sentence Tokenization
sentences = sent_tokenize(text)

# Display the sentences
print("Sentences in the document:")
for i, sentence in enumerate(sentences, 1):
    print(f"{i}: {sentence}")

Before running this program, save a text file named example.txt in the same directory as your Jupyter notebook.
Output:

3. Tokenize text with stop words as delimiters

from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
import nltk

# Download necessary data
nltk.download('punkt')
nltk.download('stopwords')

# Example text
text = "I enjoy learning Python and coding."

# Define stop words
stop_words = set(stopwords.words('english'))

# Tokenize the text
words = word_tokenize(text)

# Drop the stop words, keeping the tokens they delimited
tokens_without_stopwords = [word for word in words if word.lower() not in stop_words]

# Output the result
print("Original Tokens:", words)
print("Tokens without Stop Words:", tokens_without_stopwords)

Output:
Original Tokens: ['I', 'enjoy', 'learning', 'Python', 'and', 'coding', '.']
Tokens without Stop Words: ['enjoy', 'learning', 'Python', 'coding', '.']

4. Remove stop words and punctuation in a text

from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
import string
import nltk

# Download necessary data
nltk.download('punkt')
nltk.download('stopwords')

# Example text
text = "Python is great! It's simple and powerful."

# Define stop words
stop_words = set(stopwords.words('english'))

# Tokenize the text
words = word_tokenize(text)

# Remove stop words and punctuation
tokens_cleaned = [word for word in words
                  if word.lower() not in stop_words and word not in string.punctuation]

# Output the result
print("Tokens without Stop Words and Punctuation:", tokens_cleaned)

Output:

5. Perform stemming
# import these modules
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

ps = PorterStemmer()

# choose some words to be stemmed
words = ["pythonprogramming", "programs", "programmer", "event", "thankyou"]

for w in words:
    print(w, " : ", ps.stem(w))

Output:
