NLP Lab File

Uploaded by Bharat Mishra

DELHI TECHNOLOGICAL UNIVERSITY
SE-316
NATURAL LANGUAGE PROCESSING

Department of Software Engineering


Delhi Technological University
Bawana Road, Delhi-110042

Submitted by
Prashant Tiwari
Roll Number: 2K20/IT/103
Batch: IT-B

Submitted to: Dr. Divyashikha Sethia


Department of Software Engineering
Delhi Technological University
INDEX

S. No.  Experiment                                                      Date
1.      Import nltk and download the 'stopwords' and 'punkt'            13-01-2023
        packages
2.      Import spacy and load the language model.                       13-01-2023
3.      WAP in python to tokenize a given text.                         20-01-2023
4.      WAP in python to get the sentences of a text document.          03-03-2023
5.      WAP in python to tokenize text with stopwords as delimiters.    03-02-2023
6.      WAP in python to add custom stop words in spaCy.                03-02-2023
7.      WAP to remove punctuations, perform stemming, lemmatize         24-02-2023
        given text and extract usernames from emails
8.      WAP to do spell correction, extract all nouns, pronouns         07-03-2023
        and verbs in a given text
9.      WAP to find similarity between two words and classify a         31-03-2023
        text as positive/negative sentiment
EXPERIMENT - 1
AIM : Import nltk and download the ‘stopwords’ and
‘punkt’ packages

CODE :
import nltk

nltk.download('stopwords')
nltk.download('punkt')

OUTPUT :
EXPERIMENT - 2
AIM : Import spacy and load the language model

CODE :
import spacy
nlp_eng = spacy.load('en_core_web_sm')
nlp_multi = spacy.load('xx_ent_wiki_sm')

OUTPUT :
EXPERIMENT - 3
AIM : WAP in python to tokenize a given text

CODE :
from nltk import word_tokenize

text = ("Last week, the University of Cambridge shared its own research "
        "that shows if everyone wears a mask outside home, the dreaded "
        "'second wave' of the pandemic can be avoided.")
tokens = word_tokenize(text)
for t in tokens:
    print(t)
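For comparison, word_tokenize relies on NLTK's pre-trained Punkt models. A dependency-free sketch (an illustration, not the lab's required solution) can approximate tokenization with a regular expression, though without NLTK's special handling of contractions and abbreviations:

```python
import re

def simple_tokenize(text):
    # Runs of word characters, or single punctuation marks.
    # Unlike NLTK's word_tokenize, no special cases for
    # contractions ("don't") or abbreviations ("U.K.").
    return re.findall(r"\w+|[^\w\s]", text)

print(simple_tokenize("Hello, world!"))
```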

OUTPUT :
EXPERIMENT - 4
AIM : WAP in python to get the sentences of a text document.

CODE :
file = open('04.txt')
input_text = file.read()
sentences = input_text.split('.')

for sentence in sentences:
    print(sentence, '\n')
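Splitting on '.' alone drops the punctuation and breaks on every period, including abbreviations. A slightly more careful stdlib sketch (an alternative for illustration; nltk.sent_tokenize is more robust still) splits only on sentence-ending punctuation followed by whitespace:

```python
import re

def split_sentences(text):
    # Split after '.', '!' or '?' when followed by whitespace,
    # keeping the punctuation with its sentence. Still fooled by
    # abbreviations such as "Dr." -- nltk.sent_tokenize handles those.
    return [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]

print(split_sentences("It rained today. We stayed in! Was it cold?"))
```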

OUTPUT :
EXPERIMENT - 5
AIM : WAP in python to tokenize text with stopwords
as delimiters.

CODE :
text = ("Walter was feeling anxious. He was diagnosed today. He probably "
        "is the best person I know.")

stop_words_and_delims = ['was', 'is', 'the', '.', ',', '-', '!', '?']

for r in stop_words_and_delims:
    text = text.replace(r, 'DELIM')

words = [t.strip() for t in text.split('DELIM')]
words_filtered = list(filter(lambda a: a != '', words))
for word in words_filtered:
    print(word)
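One caveat with str.replace above: it also rewrites matches inside words (replacing 'is' turns 'this' into 'thDELIM', for example). A regex with word boundaries avoids that; this is an alternative sketch, not the lab's prescribed method:

```python
import re

text = ("Walter was feeling anxious. He was diagnosed today. He probably "
        "is the best person I know.")

stop_words = ['was', 'is', 'the']
# \b ensures 'is' matches only as a whole word, never inside 'this';
# the character class handles the punctuation delimiters.
pattern = r'\b(?:' + '|'.join(stop_words) + r')\b|[.,\-!?]'

chunks = [c.strip() for c in re.split(pattern, text)]
tokens = [c for c in chunks if c]
print(tokens)
```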

OUTPUT :
EXPERIMENT - 6
AIM : WAP in python to add custom stop words in spaCy.

CODE :
import spacy

nlp = spacy.load('en_core_web_sm')

custom_stop_words = ['was', 'is', 'the', 'JUNK', 'NIL', 'of', 'more',
                     '.', ',', '-', '!', '?', 'a']
for word in custom_stop_words:
    nlp.vocab[word].is_stop = True

doc = nlp("Jonas was a JUNK great guy NIL Adam was evil NIL Martha JUNK "
          "was more of a fool")
for token in doc:
    if not token.is_stop:
        print(token.text, end=" ")

OUTPUT :
EXPERIMENT - 7
AIM : WAP to remove punctuations, perform
stemming, lemmatize given text and extract
usernames from emails

CODE :
punctuations = '''!()-[]{};:'"\\,<>./?@#$%^&*_~'''

string = "Jonas!!! great \\guy <> Adam --evil [Martha] ;;fool() ."

ans = ""
for char in string:
    if char not in punctuations:
        ans += char

print(ans)

from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

text = ("Dancing is an art. Students should be taught dance as a subject "
        "in schools. I danced in many of my school functions. Some people "
        "are always hesitant to dance.")
stemmer = PorterStemmer()
tokens = word_tokenize(text)
ans = ""
for token in tokens:
    ans += stemmer.stem(token) + " "
print(ans)

from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize
from nltk.stem.wordnet import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()
text = ("Dancing is an art. Students should be taught dance as a subject "
        "in schools. I danced in many of my school functions. Some people "
        "are always hesitant to dance.")
tokens = word_tokenize(text)
ans = ""
for token in tokens:
    ans += lemmatizer.lemmatize(token, wordnet.VERB) + " "
print(ans)

from nltk.tokenize import word_tokenize

text = ("The new registrations are [email protected] , "
        "[email protected]. If you find any disruptions, kindly contact "
        "[email protected] or [email protected] ")

text_list = word_tokenize(text)
usernames = []
for i in range(len(text_list)):
    if text_list[i] == "@":
        usernames.append(text_list[i - 1])
print(usernames)
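The username extraction above depends on word_tokenize splitting '@' into its own token. A regular expression makes the intent explicit; the sample addresses below are hypothetical placeholders, since the lab's own addresses are redacted in this copy:

```python
import re

# Hypothetical sample addresses, for illustration only.
text = "Contact alice@example.com or bob.smith@example.org for help."

# Capture the local part (username) immediately before each '@'.
usernames = re.findall(r'([A-Za-z0-9._%+-]+)@', text)
print(usernames)
```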

OUTPUT :
EXPERIMENT - 8
AIM : WAP to do spell correction, extract all nouns,
pronouns and verbs in a given text

CODE :
from textblob import TextBlob

text = "He is a gret person. He beleives in bod"
textb = TextBlob(text)
correct_text = textb.correct()
print(correct_text)

import nltk
from nltk import word_tokenize, pos_tag

text = ("James works at Microsoft. She lives in manchester and likes to "
        "play the flute")
tokens = word_tokenize(text)
parts_of_speech = nltk.pos_tag(tokens)
nouns = list(filter(lambda x: x[1] in ("NN", "NNP"), parts_of_speech))
for noun in nouns:
    print(noun[0])

from nltk import pos_tag, word_tokenize

text = ("I may bake a cake for my birthday. The talk will introduce "
        "reader about Use of baking")
words = word_tokenize(text)
tagged = pos_tag(words)  # tag once, instead of re-tagging on every loop iteration

verb_phrases = []
for i in range(len(words)):
    if i > 0 and tagged[i][1] == 'VB':
        verb_phrases.append(words[i - 1] + ' ' + words[i])

for phrase in verb_phrases:
    print(phrase)
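The aim also asks for pronouns, which the code above does not extract. With pos_tag one would keep tokens tagged PRP or PRP$; as a dependency-free sketch, pronouns form a small closed class, so a fixed word list also works for illustration (the list below is an illustrative subset, not exhaustive):

```python
import re

# Closed-class pronoun list (illustrative subset).
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "me", "him", "her", "us", "them",
            "my", "your", "his", "its", "our", "their"}

text = "James works at Microsoft. She lives in Manchester and likes to play the flute"
words = re.findall(r"[A-Za-z]+", text)
pronouns = [w for w in words if w.lower() in PRONOUNS]
print(pronouns)
```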

OUTPUT :
EXPERIMENT - 9
AIM : WAP to find similarity between two words and
classify a text as positive/negative sentiment

CODE :
import spacy

nlp = spacy.load('en_core_web_md')
words = "amazing terrible excellent"
tokens = nlp(words)

token1, token2, token3 = tokens[0], tokens[1], tokens[2]

print(f"Similarity between {token1} and {token2} : ",
      token1.similarity(token2))
print(f"Similarity between {token1} and {token3} : ",
      token1.similarity(token3))

from textblob import TextBlob

text = "It was a very pleasant day"
print(TextBlob(text).sentiment)
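TextBlob's .sentiment returns a (polarity, subjectivity) pair; a positive/negative label comes from thresholding the polarity at zero, e.g. "positive" if blob.sentiment.polarity > 0 else "negative". A dependency-free sketch of the same idea, using tiny hand-made word lists (these lists are illustrative assumptions, not TextBlob's lexicon):

```python
POSITIVE = {"pleasant", "great", "amazing", "excellent", "good"}
NEGATIVE = {"terrible", "awful", "bad", "horrible", "evil"}

def classify(text):
    # Count positive vs. negative lexicon hits and threshold at zero,
    # mirroring how one would threshold TextBlob's polarity score.
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative"

print(classify("It was a very pleasant day"))
```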

OUTPUT :
