NLP Assignment 2
Due Date: 20th Nov 2020
1. Introduction
This assignment will make use of the Natural Language Toolkit (NLTK) for Python. NLTK is a
platform for building programs that process human language data, and it provides both corpora and
processing modules. For more information on NLTK, please visit: http://www.nltk.org/.
This is an individual assignment, and submission is only through the LMS. Late assignments will not
be accepted without a valid reason.
When ready to submit, create a directory called nlp-assign2-<CMS>, where <CMS> is your CMS ID
number, e.g. 1234567. In this directory, put your template.py file renamed with your CMS ID number
(e.g. 1234567.py) together with your test results as a Word or PDF file.
Submit your assignment by creating a gzipped tar file from your nlp-assign2-<CMS> directory. You can
do this using the following command on a DICE machine:
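For example (the archive name here is an illustrative choice that matches the directory name):

    tar czf nlp-assign2-<CMS>.tar.gz nlp-assign2-<CMS>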
You can check that the archive stores the intended data with the following command, which lists all the
files one would get when extracting the original directory (and its files) from it:
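For example, for the archive created above:

    tar tzf nlp-assign2-<CMS>.tar.gz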
Ensure that your code works on DICE. Your modified template.py should fully execute using
python3.
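For example, from inside your submission directory:

    python3 template.py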
Ensure that you include comments in your code where appropriate. This makes it easier for the
markers to understand what you have done and makes it more likely that partial marks can be
awarded.
Any character limits to open questions will be strictly enforced. Answers will be passed through
an automatic filter that only keeps the first N characters, where N is the character limit given in a
question.
Important: Whenever you use corpus data in this assignment, you must convert the data to
lowercase, so that e.g. the original tokens “Freedom” and “freedom” are made equal. Do this
throughout the assignment, whether it’s explicitly stated or not.
Section A: Training a Hidden Markov Model
In this part of the assignment you will train a Hidden Markov Model (HMM) for part-of-speech
(POS) tagging. You will need to create and train two models: an Emission Model and a Transition
Model, as described in lectures.
Use the labelled sentences from the 'news' part of the Brown corpus. You can download the dataset
using the instructions given on the NLTK website [1]. These sentences are annotated with parts of
speech, which you will convert into the Universal POS tagset (NLTK uses the smaller version of this
tagset defined by Petrov et al. [2]). Having a smaller number of labels (states) will make Viterbi
decoding faster.
Use the last 500 sentences from the corpus as the test set and the rest for training. This split corresponds
roughly to a 90/10% division. Do not shuffle the data before splitting.
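To make the data handling concrete, here is a minimal sketch of one possible way to prepare the data
and build the two models with NLTK's probability classes. The Lidstone smoothing value and the
'<s>'/'</s>' sentence-boundary tags are illustrative assumptions, not requirements of the assignment:

    # Sketch only: load the Brown 'news' sentences with the Universal tagset,
    # lowercase every token, split off the last 500 sentences as the test set,
    # and build the Emission and Transition models from the training portion.
    import nltk
    from nltk.corpus import brown
    from nltk.probability import (ConditionalFreqDist, ConditionalProbDist,
                                  LidstoneProbDist)

    nltk.download('brown')             # corpus data, if not already installed
    nltk.download('universal_tagset')  # mapping used by tagset='universal'

    tagged = [
        [(word.lower(), tag) for word, tag in sent]
        for sent in brown.tagged_sents(categories='news', tagset='universal')
    ]

    # The last 500 sentences form the test set; the data is not shuffled.
    train_sents, test_sents = tagged[:-500], tagged[-500:]

    def lidstone(fd):
        # Add-0.01 smoothing with one extra bin for unseen events; both the
        # gamma and the bin count are illustrative choices.
        return LidstoneProbDist(fd, 0.01, fd.B() + 1)

    # Emission model: P(word | tag).
    emission_model = ConditionalProbDist(
        ConditionalFreqDist((tag, word)
                            for sent in train_sents
                            for word, tag in sent),
        lidstone)

    # Transition model: P(tag_i | tag_{i-1}), with assumed sentence-boundary
    # markers '<s>' and '</s>' padding each tag sequence.
    transition_model = ConditionalProbDist(
        ConditionalFreqDist(bigram
                            for sent in train_sents
                            for bigram in nltk.bigrams(
                                ['<s>'] + [tag for _, tag in sent] + ['</s>'])),
        lidstone)

    print(emission_model['NOUN'].prob('freedom'))  # e.g. P(freedom | NOUN)
    print(transition_model['<s>'].prob('DET'))     # e.g. P(DET | <s>)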
Give the results for the last 500 sentences in the form of a table, as shown below:
Failure to follow these instructions exactly will render most of your answers incorrect.
[1] https://www.nltk.org/book/ch02.html
[2] https://github.com/slavpetrov/universal-pos-tags