CS663-2024-Executive NLP - Assignment Sentiment Analysis
Problem Statement:
● The goal of this assignment is to implement a Feed-Forward NN and an RNN for
binary and multi-class sentiment analysis
Implementation:
Input features:
● Vocabulary size: 8 (I, love, watching, anime, and, reading, manga, . (including
the full stop at the end))
● The one-hot encoding for the tokens is as follows:
I: [1, 0, 0, 0, 0, 0, 0, 0]
love: [0, 1, 0, 0, 0, 0, 0, 0]
…
manga: [0, 0, 0, 0, 0, 0, 1, 0]
.: [0, 0, 0, 0, 0, 0, 0, 1]
● The size of the input tensor is [1, 8, 8]: 1 is the batch size (the current batch
contains only one sentence), 8 is the sentence length, and 8 is the one-hot
vector length (equal to the vocabulary size)
● Since the network takes a fixed-length input, longer sentences should be
truncated at a maximum length and shorter sentences should be padded
○ For maximum length: use the average length of the corpus (after
tokenization) as the maximum length. For example, if the average length
of the corpus is 60 tokens, the maximum length should be set to 60.
Average length of corpus = (total no. of tokens in corpus / total no. of
sentences in corpus)
○ For minimum length: a special token “PAD” should be added to the
vocabulary, and the remaining positions should be filled with this PAD
token. For example, if the maximum length is 10, the following sentence is padded as:
■ I love watching anime and reading manga . PAD PAD
○ Previously the input had a length of 8; after padding it reaches the maximum
length of 10
● In general, the inputs for both the Feed-Forward NN and the LSTM should be
prepared in this way (see the sketch after this list)
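A minimal sketch of this input preparation, assuming whitespace tokenization and PyTorch (the assignment does not mandate a framework); the vocabulary, PAD token, and example sentence come from the description above, and max_len is hard-coded to 10 purely for illustration. Note that adding PAD grows the vocabulary to 9, so the padded example tensor is [1, 10, 9] rather than [1, 8, 8].

import torch

# Vocabulary from the example above, plus the special PAD token.
vocab = ["I", "love", "watching", "anime", "and", "reading", "manga", ".", "PAD"]
token_to_id = {tok: i for i, tok in enumerate(vocab)}

def average_length(tokenized_corpus):
    """Maximum length rule from above: total tokens / total sentences."""
    return round(sum(len(s) for s in tokenized_corpus) / len(tokenized_corpus))

def pad_or_truncate(tokens, max_len, pad_token="PAD"):
    """Truncate long sentences, pad short ones to exactly max_len tokens."""
    if len(tokens) >= max_len:
        return tokens[:max_len]
    return tokens + [pad_token] * (max_len - len(tokens))

def one_hot_encode(tokens):
    """Return a [1, len(tokens), len(vocab)] one-hot tensor (batch size 1)."""
    out = torch.zeros(1, len(tokens), len(vocab))
    for pos, tok in enumerate(tokens):
        out[0, pos, token_to_id[tok]] = 1.0
    return out

tokens = "I love watching anime and reading manga .".split()
padded = pad_or_truncate(tokens, max_len=10)  # appends two PAD tokens
x = one_hot_encode(padded)
print(padded)   # ['I', ..., '.', 'PAD', 'PAD']
print(x.shape)  # torch.Size([1, 10, 9]) with PAD in the vocabulary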
Feed-Forward NN:
● Explain and draw the architecture of the Feed-Forward NN that you are
proposing, with justification. Describe the features of the Feed-Forward NN.
● Network should contain TWO hidden layers
○ input → hidden_layer_1 (hidden_layer_1 size is 256)
○ hidden_layer_1 → hidden_layer_2 (hidden_layer_2 size is 128)
● Finally, hidden_layer_2 → output (output size depends on the no. of classes)
● Use a non-linearity of your choice (tanh, ReLU, GELU, etc.) between the hidden layers
● Clearly distinguish between the binary and multi-class loss functions (see the sketch after this list)
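A minimal PyTorch sketch of the required feed-forward classifier, assuming the padded one-hot input [batch, max_len, vocab_size] is flattened into a single vector; the class name and the choice of ReLU are illustrative, while the 256/128 hidden sizes follow the specification above.

import torch.nn as nn

class FeedForwardClassifier(nn.Module):
    def __init__(self, max_len, vocab_size, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                          # [B, L, V] -> [B, L*V]
            nn.Linear(max_len * vocab_size, 256),  # hidden_layer_1
            nn.ReLU(),
            nn.Linear(256, 128),                   # hidden_layer_2
            nn.ReLU(),
            nn.Linear(128, num_classes),           # output layer
        )

    def forward(self, x):
        return self.net(x)  # raw logits; pair with the matching loss

# Loss choice: binary -> nn.BCEWithLogitsLoss() with a single output unit;
# multi-class -> nn.CrossEntropyLoss() with one output unit per class.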
RNN:
● After reaching the end of the sentence, the last hidden state is used to classify
the input (hence the prediction is made at the end)
● Conduct experiments with an LSTM (not with the base RNN model)
● Hidden layer size: 256
● Output size depends upon no. of classes
● Clearly distinguish between the binary and multi-class loss functions (see the sketch after this list)
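A matching LSTM sketch, assuming batch-first one-hot inputs of shape [batch, max_len, vocab_size]; the final hidden state drives the classification, per the requirement above, and the class name is illustrative.

import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, hidden_size=256):
        super().__init__()
        self.lstm = nn.LSTM(vocab_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        _, (h_n, _) = self.lstm(x)  # h_n: [num_layers, B, hidden_size]
        return self.out(h_n[-1])    # classify from the last hidden state

# Same loss split as the feed-forward model: BCEWithLogitsLoss for binary,
# CrossEntropyLoss for multi-class.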
Dataset:
● IMDB (binary class) dataset:
○ The dataset consists of movie reviews, each tagged with its
corresponding sentiment label (positive or negative); a loading sketch
follows the example below
○ Link:
■ https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
○ Example:
■ Input: If you like adult comedy cartoons, like South Park, then this
is nearly a similar format about the small adventures of three
teenage girls at Bromwell High. Keisha, Natella and Latrina have
given exploding sweets and behaved like bitches, I think Keisha is
a good leader. There are also small stories going on with the
teachers of the school. There's the idiotic principal, Mr. Bip, the
nervous Maths teacher and many others. The cast is also fantastic,
Lenny Henry's Gina Yashere, EastEnders Chrissie Watts,
Tracy-Ann Oberman, Smack The Pony's Doon Mackichan, Dead
Ringers' Mark Perry and Blunder's Nina Conti. I didn't know this
came from Canada, but it is very good. Very good!
■ Label: Positive
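A hedged sketch of loading the extracted archive, assuming the tarball's usual aclImdb/{train,test}/{pos,neg}/*.txt layout; adjust the root path to wherever you extract it.

from pathlib import Path

def load_imdb_split(root, split):
    """Read all reviews of one split; pos -> label 1, neg -> label 0."""
    texts, labels = [], []
    for label_name, label in (("pos", 1), ("neg", 0)):
        for path in sorted(Path(root, split, label_name).glob("*.txt")):
            texts.append(path.read_text(encoding="utf-8"))
            labels.append(label)
    return texts, labels

train_texts, train_labels = load_imdb_split("aclImdb", "train")
test_texts, test_labels = load_imdb_split("aclImdb", "test")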
Evaluation:
● Run each model for 10 epochs
● For the SemEval dataset, use the given dev set as the validation set; for the
IMDB dataset, use the last 10% of the train-set samples as the dev set
● Save the best model checkpoint based on dev-set accuracy
● Report Accuracy, Precision, Recall, and F-score, both overall and for each label,
on the test set using the best model checkpoint (see the sketch after this list)
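A sketch of the evaluation loop, assuming hypothetical train_one_epoch and predict helpers that return the epoch's loss and (predictions, gold labels) respectively; sklearn's classification_report covers both the overall and per-label precision/recall/F-score.

import torch
from sklearn.metrics import accuracy_score, classification_report

best_dev_acc = 0.0
for epoch in range(10):                                # 10 epochs as required
    train_loss = train_one_epoch(model, train_loader)  # hypothetical helper
    dev_preds, dev_gold = predict(model, dev_loader)   # hypothetical helper
    dev_acc = accuracy_score(dev_gold, dev_preds)
    if dev_acc > best_dev_acc:                         # checkpoint on dev accuracy
        best_dev_acc = dev_acc
        torch.save(model.state_dict(), "best_model.pt")

# Evaluate the best checkpoint on the test set.
model.load_state_dict(torch.load("best_model.pt"))
test_preds, test_gold = predict(model, test_loader)
print(classification_report(test_gold, test_preds, digits=4))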
Documents to submit:
● Model code
● Model logs (in the form of graphs; a plotting sketch follows this list):
○ Accuracy of dev set for each epoch
○ Train loss for each epoch
● Write a report (doc or pdf format) describing how you solved the problems,
along with all results and the model architecture (if any).
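A short matplotlib sketch for the two required graphs, assuming the per-epoch train losses and dev accuracies were collected in lists during training; the function name and output file are illustrative.

import matplotlib.pyplot as plt

def plot_logs(train_losses, dev_accs, out_path="model_logs.png"):
    """Plot per-epoch train loss and dev accuracy side by side."""
    epochs = range(1, len(train_losses) + 1)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(epochs, dev_accs, marker="o")
    ax1.set(title="Dev accuracy per epoch", xlabel="Epoch", ylabel="Accuracy")
    ax2.plot(epochs, train_losses, marker="o")
    ax2.set(title="Train loss per epoch", xlabel="Epoch", ylabel="Loss")
    fig.tight_layout()
    fig.savefig(out_path)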