Machine Learning NLP Lab — Sayak Mallick
1. Sentiment Analysis
import re
from matplotlib import rcParams
from nltk.stem import WordNetLemmatizer
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from nltk.corpus import stopwords
from wordcloud import WordCloud
# Load the labelled train/validation splits. Each line of the files is
# "text;label", so a semicolon delimiter with explicit column names parses
# them into two-column frames.
df_train = pd.read_csv("train.txt", delimiter=';', names=['text', 'label'])
df_val = pd.read_csv("val.txt", delimiter=';', names=['text', 'label'])
print(df_val)
# Merge both splits into a single frame and renumber the rows so the
# concatenated index is contiguous.
df = pd.concat([df_train, df_val])
df.reset_index(inplace=True, drop=True)
print(df)
print("Shape of the Data frame: ", df.shape)
print(df.sample(5))
# Plot the class distribution. Pass the column by name: positional Series
# arguments to countplot were deprecated and then removed in seaborn >= 0.12,
# so `sns.countplot(df.label, data=df)` breaks on current versions.
sns.countplot(x='label', data=df)
plt.show()
def custom_encoder(df):
    """Binary-encode the emotion labels in *df* in place.

    Positive emotions (surprise, love, joy) become 1; negative emotions
    (fear, anger, sadness) become 0. The frame is mutated in place and
    nothing is returned, matching the original call sites.
    """
    # One dict-based replace instead of six separate passes over the frame.
    mapping = {
        "surprise": 1,
        "love": 1,
        "joy": 1,
        "fear": 0,
        "anger": 0,
        "sadness": 0,
    }
    df.replace(to_replace=mapping, inplace=True)
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from pylab import rcParams
lm = WordNetLemmatizer()

def text_transformation(df_col):
    """Clean and lemmatize every document in the iterable *df_col*.

    Each item is reduced to alphabetic characters, lowercased, split into
    tokens, filtered against the English stopword list, and lemmatized.

    Returns a list of cleaned, space-joined document strings.
    """
    # Build the stopword set ONCE. The original re-evaluated
    # set(stopwords.words('english')) for every single token of every
    # document, which dominated the runtime for no benefit.
    stop_words = set(stopwords.words('english'))
    corpus = []
    for item in df_col:
        # Keep letters only, then normalise case and tokenize on whitespace.
        cleaned = re.sub('[^a-zA-Z]', ' ', str(item))
        tokens = cleaned.lower().split()
        kept = [lm.lemmatize(word) for word in tokens if word not in stop_words]
        # lemmatize() already returns str, so no extra str() wrapping needed.
        corpus.append(' '.join(kept))
    return corpus
corpus = text_transformation(df['text'])
rcParams['figure.figsize'] = 20, 8
# Join the cleaned documents into one space-separated string for the word
# cloud. (The original looped `for word in row`, which iterates a string
# character by character and glued all letters together with no separators,
# destroying the word boundaries the cloud needs for its frequency counts.)
word_cloud = " ".join(corpus)
wordcloud = WordCloud(width=1000, height=500,
                      background_color='white',
                      min_font_size=10).generate(word_cloud)
plt.imshow(wordcloud)
2. ngram program
from nltk import ngrams
import numpy
def remove(string):
    """Return *string* with every space character removed."""
    return "".join(string.split(" "))
vocab = "Today is a good day to learn natural language proccesing"
print("Sample Document - ", vocab)
# constructing the lexicon: the individual words of the sample document
lex = vocab.split(" ")
lex
# Pad the first word with '$' boundary markers and space out its characters
# so that ngrams() sees one token per character plus the two sentinels.
spaced = "$  " + " ".join(lex[0]) + "  $"
n = 3
ngrams_ = ngrams(spaced.split(), n)
# Collapse each character trigram tuple back into a single string.
ngram_list = [''.join(gram) for gram in ngrams_]
ngram_list
# Repeat the character-trigram extraction for EVERY word in the lexicon,
# collecting one list of trigrams per word.
ngram_list = []
n = 3
for word in lex:
    # '$' sentinels mark the word boundaries; spacing the characters out
    # lets ngrams() treat each character as a token.
    spaced = "$  " + " ".join(word) + "  $"
    ngrams_ = ngrams(spaced.split(), n)
    ngram_list.append([''.join(gram) for gram in ngrams_])
ngram_list