CW Sequence Analysis

This document provides information about a coursework assignment on deep learning for sequence analysis. The goal is for students to design, train, and evaluate deep learning models for tasks involving sequential data, such as text, videos, or time series. Students will complete a project applying concepts covered in the course, including recurrent neural networks, transformers, and language models. The project involves selecting a sequence-based problem and developing a solution using PyTorch. Students must submit a written report and present their work orally. The report should describe the implemented model, training process, and evaluation results. Code must be well-structured, documented, and version controlled on GitHub with experiments logged on Weights and Biases.

Uploaded by

shk93359


Coursework assignment

Deep Learning for Sequence Analysis

INM706

COURSE 2023-2024
GENERAL INFORMATION
This coursework extends the knowledge you gained from tutorials and lectures, focusing
on the design, training, and evaluation of Deep Learning (DL) based models for
sequence-based tasks. By the end of this coursework, you should demonstrate
proficiency in designing, training, and evaluating different models for various tasks
where sequences are used. It leverages various concepts covered in the module,
including:

● Basic n-gram language models: bigrams, trigrams, etc., with smoothing methods
like Kneser-Ney smoothing.
● Neural Network Language Models (e.g., skip-grams).
● Recurrent Neural Networks (RNNs) encompassing Language Models and
Classifiers, such as LSTM and GRU.
● Sequence analysis using ConvNets and RNNs for tasks like video classification,
image tagging, and action classification.
● Implementation of dialogue systems (chatbots).
● Exploration of Transformers for applications like text generation and chatbots.
● Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG).
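As a small illustration of the first item in the list above, here is a minimal sketch of a bigram language model. It assumes add-k (Laplace) smoothing rather than Kneser-Ney, because add-k fits in a few lines; the function names and the toy sentence are hypothetical and not part of the module material.

```python
from collections import Counter

def train_bigram(tokens):
    """Count bigrams and the contexts (first words) they start from."""
    contexts = Counter(tokens[:-1])             # how often each word is followed by something
    bigrams = Counter(zip(tokens, tokens[1:]))  # how often each adjacent pair occurs
    vocab = set(tokens)
    return contexts, bigrams, vocab

def bigram_prob(prev, word, contexts, bigrams, vocab, k=1.0):
    """P(word | prev) with add-k smoothing (a simpler stand-in for Kneser-Ney)."""
    return (bigrams[(prev, word)] + k) / (contexts[prev] + k * len(vocab))

# Toy corpus, made up for illustration only.
tokens = "the cat sat on the mat".split()
contexts, bigrams, vocab = train_bigram(tokens)

p_seen = bigram_prob("the", "cat", contexts, bigrams, vocab)    # pair observed once
p_unseen = bigram_prob("the", "sat", contexts, bigrams, vocab)  # pair never observed
total = sum(bigram_prob("the", w, contexts, bigrams, vocab) for w in vocab)
```

Smoothing gives the unseen pair a small non-zero probability while the distribution over the vocabulary still sums to one.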

By the end of this coursework, you will have honed your skills in applying DL concepts
to real-world problems. Your chosen problem will serve as a practical case study to
showcase your proficiency in utilizing PyTorch for model development and evaluation.
The ultimate goal is to present a well-crafted solution that reflects your understanding
and application of the learned concepts.

Read the whole document and ask as soon as possible about anything that is unclear. The
sections are as follows:

● Marks and Deadline

● About the coursework - general information on the coursework's purpose
● Implementation details and deliverables - contains important information about
the expected code structure
● Written report - expected structure and content of your report
● Oral presentation
● Final submission - a summary of everything that needs to be on Moodle
● The final sections refer to referencing, plagiarism and extenuating circumstances;
please read them carefully
1. Marks and Deadlines:
The coursework consists of two components:

● written report (WR) weighing 70% of the total mark

● oral presentation (OP) or Viva weighing 30% of the total mark

Each component is evaluated up to 100%, and you need to get 50% in each component to pass
the module. Otherwise, you will need to resit the component you have failed.
You cannot do your oral presentation without having submitted (or passed, if resitting the OP
only) your written report.
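The weighting and pass rule above can be checked with a short calculation. This is only a sketch of the arithmetic as stated in this section; the function name and the example marks are hypothetical, and the official calculation is whatever the module regulations specify.

```python
def overall_result(wr_mark, op_mark):
    """Return (weighted total, passed) for two marks on a 0-100 scale.

    Weights follow the brief: written report 70%, oral presentation 30%.
    Passing requires at least 50 in each component, not just in the total.
    """
    total = (70 * wr_mark + 30 * op_mark) / 100
    passed = wr_mark >= 50 and op_mark >= 50
    return total, passed

# Made-up marks: a strong report does not compensate for a failed presentation.
total, passed = overall_result(80, 45)
```

With these example marks the weighted total is 69.5, yet the result is not a pass because the oral presentation is below 50.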

Report submission deadline (WR) Sunday, 5th May 2024 17:00 GMT

a. You must submit your WR in PDF format online using the Submission Area:
Written Report in Moodle.
b. Caution: The system will not admit any submissions after the deadline. That is, if
you press the submit button at 17:00:01, it won’t be accepted.

Oral Presentation Date (OP) Thursday, 16th May 2024 - the time will be scheduled

c. Please submit your slides to the Submission Area Oral Presentation (Viva) in
Moodle.
d. Your specific slot will be announced later and the timetable will be uploaded to
Moodle.
IMPORTANT:
In this coursework, it is imperative that the submitted code and the accompanying report go
hand in hand, forming a cohesive and comprehensive submission. The code should be
structured and documented in a manner that aligns seamlessly with the report's content.
Each section of the report should reference and explain the relevant portions of the code,
providing clarity on the implementation details, methodologies, and key findings. It is essential
that both components, the code, and the report, complement each other to convey a complete
and coherent narrative of the undertaken project.
2. About the coursework
In your project, you will leverage the knowledge acquired from the module to tackle a diverse
range of challenges based on sequential data. This involves:

● Architecting Language Models:

○ Design and refine models tailored to language-based tasks.
○ Integrate specialized layers for handling sequential data effectively.
○ Implement discriminators, generators, and classifiers for language-related
applications.

● Architecting Sequence-to-Sequence Models:

○ Explore cutting-edge techniques within sequence-to-sequence architectures.
○ Incorporate attention mechanisms for improved sequence modeling.

● Training and Evaluating DL Models for Language Tasks:

○ Utilize various optimizers, including nuanced applications of Adam, for training
language models.
○ Methodically document experiments, leveraging tools like the Wandb library.

● Potential Problems to Address in Language and Sequence Tasks:

○ Text Generation:
■ Develop models capable of generating coherent and contextually relevant
text.
○ Named Entity Recognition:
■ Implement models for accurately identifying and classifying named
entities in text.
○ Sentiment Analysis:
■ Utilize DL models to analyze and classify sentiment in textual data.
○ Machine Translation:
■ Architect sequence-to-sequence models for translating text between
languages.
○ Question Answering:
■ Develop models capable of understanding and answering questions
based on textual input.
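To make the "attention mechanisms" item in the list above more concrete, here is a minimal sketch of scaled dot-product attention for a single query, written in plain Python so that each step is visible. It assumes the Transformer-style variant of attention; in the coursework itself the same computation would be done with PyTorch tensors. The function names and numbers are illustrative only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the context vector (weighted sum of values) and the weights.
    """
    d = len(query)
    # Similarity of the query to every key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return context, weights

# Two identical keys: the query attends to both equally,
# so the context is the plain average of the two value vectors.
context, weights = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
```

The weights always sum to one, so the context vector is a convex combination of the value vectors.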

In addition to language and sequence-based tasks, consider expanding the scope of your
project to encompass domains such as trading data and weather, where sequential data plays a
crucial role. Here are examples that illustrate the application of deep learning in these domains:
● Time Series Forecasting for Trading Data:
○ Problem: Predicting stock prices or market trends based on historical trading
data.
○ Potential Solution: Employ recurrent neural networks (RNNs) or long short-term
memory networks (LSTMs) to capture temporal dependencies and patterns in
financial time series data.

● Weather Prediction using Sequential Data:

○ Problem: Forecasting future weather conditions based on historical
meteorological data.
○ Potential Solution: Utilize recurrent neural networks (RNNs) or convolutional
neural networks (CNNs) to model temporal and spatial dependencies in
sequential weather data.
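For either forecasting example above, the raw series usually has to be cut into fixed-length input windows with a target value before it can be fed to an RNN or LSTM. The following is a minimal sketch of that preprocessing step; the helper names, window length, and the toy series are assumptions made for illustration.

```python
def make_windows(series, window, horizon=1):
    """Split a series into (input window, target) pairs for forecasting.

    Each input is `window` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs

def chronological_split(pairs, train_fraction=0.8):
    """Split without shuffling, so the model is never trained on the future."""
    cut = int(len(pairs) * train_fraction)
    return pairs[:cut], pairs[cut:]

# Toy series, made up for illustration only.
series = [1, 2, 3, 4, 5, 6]
pairs = make_windows(series, window=3)
train_pairs, test_pairs = chronological_split(pairs)
```

Splitting chronologically rather than randomly is a common choice for time series, because a random split can leak future values into the training set.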

You will work on exactly one problem, selected from one of the topics on this list. If
you have a different idea in mind we can discuss it, but always be aware of the time you have to
implement it.
Before starting to work on your project, make sure you have a quick chat with the module
leader about your idea and your implementation plan (i.e. design, train and evaluate).
You do not need to add any state-of-the-art functionality to your project; this is left for the
individual project. However, make sure you make your project your own. Use your own ideas.

3. Implementation and deliverables:

During the course you will be shown best practices for structuring your code,
using GitHub, and using logging libraries.

Must:
● Only the PyTorch framework may be used for all implementations.
● Only Wandb should be used for logging.
● Code should be committed periodically to GitHub.

Code Deliverables are:

● Zipped folder containing the project
● GitHub URL of the project (make sure it is accessible)
● Wandb logs - you can add them to a Wandb report and share that report; a how-to file
will be provided
● Extras:
○ Database URL, whether from a website or from wherever it has been uploaded
The project folder should have the following structure:
● Python files with models (name them appropriately and split based on functionality)
● Important files:
○ requirements.txt - should list all the requirements needed for your project
○ setup.sh - running this bash file should create a virtual environment and install all
requirements
○ train.py - running python train.py should start a training loop, and should also log
results. I will run this to ensure everything was set up correctly. Parameters can
be passed to this file to select nr_epochs and other relevant parameters.
○ Inference files:
■ At inference time your model should be run with a selected checkpoint to
showcase the final results. Example: If it’s a chatbot then I should be able
to chat with it. If it’s sentiment analysis I should be able to give it some
input and see the results. If you choose something more complicated in
terms of input (e.g. trading) make sure you write down how to test it.
During the labs several methods of showing your results will be presented
and you can choose which one you like best.
● Jupyter notebook
● Dash app
● Streamlit app
● Colab
● Files needed for inference: the notebook or folder needed to run your
Streamlit/Dash app, and the checkpoint file
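The requirement that train.py accepts parameters such as nr_epochs could be met along the following lines. This is only a sketch under stated assumptions: the `--nr_epochs` name comes from the brief, while the default values, the `--lr` and `--checkpoint` flags, and the `run` and `train_one_epoch` helpers are hypothetical. In a real project `train_one_epoch` would hold the PyTorch training code and `log` could be `wandb.log`, which accepts a dictionary of metrics.

```python
import argparse

def parse_args(argv=None):
    """Parse command-line options for a train.py-style script."""
    parser = argparse.ArgumentParser(description="Train a sequence model (sketch).")
    parser.add_argument("--nr_epochs", type=int, default=10,
                        help="number of training epochs (default is an assumption)")
    parser.add_argument("--lr", type=float, default=1e-3,
                        help="learning rate (hypothetical flag)")
    parser.add_argument("--checkpoint", type=str, default="checkpoint.pt",
                        help="where to save the model (hypothetical flag)")
    return parser.parse_args(argv)

def run(args, train_one_epoch, log=print):
    """Call train_one_epoch once per epoch and log the metrics it returns."""
    history = []
    for epoch in range(1, args.nr_epochs + 1):
        metrics = train_one_epoch(epoch)  # expected to return a dict of metrics
        log({"epoch": epoch, **metrics})
        history.append(metrics)
    return history

# Example: equivalent to running `python train.py --nr_epochs 3`,
# with a stand-in training function that does no real training.
args = parse_args(["--nr_epochs", "3"])
logged = []
history = run(args, train_one_epoch=lambda epoch: {"loss": 1.0 / epoch}, log=logged.append)
```

Passing the logging function in as an argument keeps the loop easy to test locally and lets the same code log to Wandb during real runs.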

4. Written Report (70%):

The report must be at most 3000 words and 15 pages long. Format for reports: PDF format,
single column, standard A4 margins, standard default line spacing of 1.15, Arial 11, including all
figures. It is preferable to use LaTeX, but Word/LibreOffice are fine too. Late submissions will
score 0. You can upload work on Moodle more than once, so upload early and do not leave
your final submission until the last minute.

CW reports should encompass the following sections:

1. Introduction (15 marks, Individual Component):

Provide an overview of the problem you are addressing, specify the dataset employed, mention
any borrowed source code, and include a brief literature review (which can be integrated into
the Introduction text or given a separate section).

2. Methodology (30 marks):

Explain the models used, along with architecture figures for clarification. If you are using parts of
an existing architecture, don’t forget to reference them and make it clear what your
contribution is. Include all relevant equations, such as loss functions, ensuring they are
presented, explained, correctly labeled/numbered, and appropriately referenced.

3. Results (25 marks):

Presenting the results should involve meticulous attention to detail, employing tables, such as
the confusion matrix for classification problems, and plots. The submitted code for generating
these figures is imperative, accompanied by the inclusion of Wandb logs for comprehensive
transparency.
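For a classification problem, the confusion matrix mentioned above can be built directly from the true and predicted labels. The following is a minimal sketch in plain Python; the function name and the example labels are made up for illustration and are not results from any experiment.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix

# Illustrative labels only.
labels = ["neg", "pos"]
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]

matrix = confusion_matrix(y_true, y_pred, labels)
# Accuracy is the sum of the diagonal divided by the number of examples.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)
```

Stating which axis holds the true labels, as the docstring does here, is the kind of labeling detail the report should also make explicit.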

It is essential that all visual elements, tables, and accuracy metrics adhere to a precise labeling
and numbering scheme. Each component should be thoroughly referenced, and the rationale
behind their inclusion must be clearly explained to ensure the integrity and interpretability of the
results.

In the case of multiple experiments showcased in the results, it is crucial that accompanying
comments provide a comparative analysis, highlighting both commonalities and distinctions
among the various trials. This approach contributes to a nuanced understanding of the
outcomes and facilitates a comprehensive evaluation of the experiments conducted.

4. Conclusions (15 marks, Individual Component):

Sum up the framework you've worked on in the earlier sections, tying it back to the Introduction.
Capture the main takeaways concisely and clearly. This part should showcase your
understanding of the topic.

5. Reflections (15 marks):

Reflect on the learning outcomes of the coursework, detailing
encountered challenges, deviations from the initial plan, and insights gained. Discuss what you
would have done differently.

For pairs working on the coursework, each student should choose one of the individual
components (Introduction or Conclusions) to work on independently. These components will be
assessed separately, resulting in different final marks. Equal contribution is expected for all
other sections.

5. Oral Presentation (30%):

● The Oral Presentation will last 15 minutes.

● You are required to give a presentation of your work in which you go over the
methodology and the results. The presentation should be 12 minutes long; the remaining
3 minutes will be used for additional questions.
● Slides must be submitted by the deadline above in Moodle’s submission area.
You cannot do your oral presentation without having submitted your written report.

When choosing a problem, consider these constraints:

Time:
● Plan effectively for timely submission.
● Avoid overly ambitious or overly simplistic projects.
● Be prepared for the possibility of unforeseen challenges during implementation or
evaluation.
Results:
● Aim for meaningful results; surpassing benchmarks is not mandatory.
● Avoid outcomes like a 1% overall accuracy or an F1 score of 125.
Dataset:
● Optimize time by using complete, pre-labeled datasets.
● Consider benchmark datasets or open-source alternatives discussed in the
course.
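The remark about an F1 score of 125 in the Results constraint above refers to the fact that F1 is the harmonic mean of precision and recall, so it always lies between 0 and 1 (or 0 and 100 when reported as a percentage). A minimal sketch of the calculation follows; the function name and the counts are illustrative only.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns 0.0 for a metric whose denominator would be zero.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Made-up counts for illustration.
precision, recall, f1 = precision_recall_f1(tp=2, fp=1, fn=1)
```

A value outside this range in a results table is a sign of a bug in the evaluation code rather than a modeling outcome.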

6. Final submission and deliverables:

Submission is through Moodle, and no other method of submission will be accepted.

You should submit the following files:

● Report (pdf) - check the Written Report section above for details on the content
● Code deliverables - check the Implementation and deliverables section above for a
detailed outline of the needed files.
● Presentation - will be online; you must be present during your assigned slot on
the given conference link.

Coding & Referencing:

This assignment primarily involves coding. If you use code or any other materials authored by
someone else, it is essential to provide proper citation. Failure to appropriately cite work
constitutes Academic Misconduct and will result in appropriate consequences. Superficial
modifications to the code do not establish it as your own. Refer to the addendum to the
guidelines or consult the module leader for further clarification.

Extenuating Circumstances:
If unforeseen medical or personal circumstances prevent you from submitting your coursework
on time, promptly contact the Programmes Office. Fill out an Extenuating Circumstances form
and provide strong, genuine evidence, such as medical certificates or legal statements, to
support your case.

Plagiarism:
Copying the work of others, whether from another team or a third party, with or without
permission, will result in zero marks, and disciplinary action will be taken. The same
consequences apply if you allow others to copy your work. Refer to the addendum to the
guidelines or consult the module leader for additional clarification.

Feedback
In the labs we can check your progress and give formative feedback.
Evaluative feedback and marks on your coursework will be released after the presentations.
