
Introduction to Large Language Models

Assignment 5

Number of questions: 8 Total marks: 6 × 1 + 2 × 2 = 10


_________________________________________________________________________

QUESTION 1: [1 mark]
Which of the following is a disadvantage of Recurrent Neural Networks (RNNs)?

a. Can only process fixed-length inputs.
b. Symmetry in how inputs are processed.
c. Difficulty accessing information from many steps back.
d. Weights are not reused across timesteps.

Correct Answer: c

Solution: Gradients vanish as they are backpropagated through many timesteps, so RNNs have difficulty accessing information from many steps back (see the lecture slides).


_______________________________________________________________________

QUESTION 2: [1 mark]

Why are RNNs preferred over fixed-window neural models?

a. They have a smaller parameter size.
b. They can process sequences of arbitrary length.
c. They eliminate the need for embedding layers.
d. None of the above.

Correct Answer: b
Solution: An RNN applies the same weights at every timestep, so it can process sequences of arbitrary length, unlike fixed-window models (see the lecture slides).
_________________________________________________________________________

QUESTION 3: [1 mark]

What is the primary purpose of the cell state in an LSTM?

a. Store short-term information.
b. Control the gradient flow across timesteps.
c. Store long-term information.
d. Perform the activation function.

Correct Answer: c
Solution: The cell state serves as the LSTM's long-term memory; the gates control what is written to, erased from, and read out of it.
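For reference, a standard formulation of the LSTM cell-state update (f_t: forget gate, i_t: input gate, c̃_t: candidate cell values) is:

```latex
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
```

Because this update is additive rather than a repeated matrix multiplication, information written to the cell state can persist across many timesteps, which is why it serves as long-term memory.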
_________________________________________________________________________

QUESTION 4: [1 mark]

In training an RNN, what technique is used to calculate gradients over multiple timesteps?
a. Backpropagation through Time (BPTT)
b. Stochastic Gradient Descent (SGD)
c. Dropout Regularization
d. Layer Normalization

Correct Answer: a
Solution: Backpropagation Through Time (BPTT) unrolls the RNN across timesteps and backpropagates gradients through the unrolled computation graph (see the lecture slides).
_________________________________________________________________________

QUESTION 5: [2 marks]

Consider a simple RNN:

● Input vector size: 3
● Hidden state size: 4
● Output vector size: 2
● Number of timesteps: 5

How many parameters are there in total?

a. 210
b. 190
c. 90
d. 42

Correct Answer: d
Solution: Weights are shared across timesteps, so the number of timesteps (5) does not affect the parameter count.
Input-to-hidden weights: 3 × 4 = 12
Hidden-to-hidden weights: 4 × 4 = 16
Hidden-to-output weights: 4 × 2 = 8
Bias terms: 4 (hidden) + 2 (output) = 6
Total: 12 + 16 + 8 + 6 = 42
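As a sanity check, here is a minimal Python sketch that reproduces the count above. It assumes one bias vector for the hidden update and one for the output, matching the solution's convention (note that some libraries, e.g. PyTorch's nn.RNN, use two hidden bias vectors and would report a different total):

```python
# Parameter count for the simple RNN in Question 5.
# Assumes one hidden bias and one output bias, as in the solution above.
input_size, hidden_size, output_size = 3, 4, 2

w_ih = input_size * hidden_size     # input-to-hidden weights:  3 * 4 = 12
w_hh = hidden_size * hidden_size    # hidden-to-hidden weights: 4 * 4 = 16
w_ho = hidden_size * output_size    # hidden-to-output weights: 4 * 2 = 8
biases = hidden_size + output_size  # bias terms: 4 + 2 = 6

print(w_ih + w_hh + w_ho + biases)  # 42 -- independent of the 5 timesteps
```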
_________________________________________________________________________

QUESTION 6: [1 mark]

What is the time complexity for processing a sequence of length 'N' by an RNN, if the input
embedding dimension, hidden state dimension, and output vector dimension are all 'd'?

a. O(N)
b. O(N²d)
c. O(Nd)
d. O(Nd²)

Correct Answer: d
Solution: The time complexity of processing a sequence of length N with an RNN depends on the computational cost of updating the hidden state at each timestep.
At each timestep, the RNN updates its hidden state h_t using the previous hidden state h_{t-1} and the current input x_t. This update involves matrix-vector multiplications:

I. Input-to-hidden transformation: W_x · x_t, where W_x is a d × d matrix, costing O(d²).
II. Hidden-to-hidden transformation: W_h · h_{t-1}, where W_h is also a d × d matrix, costing O(d²).
III. Activation function application: this is O(d) and negligible compared to the matrix multiplications.

Since these computations occur at every timestep, the total complexity for a sequence of length N is O(N · d²).
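The per-timestep cost is easy to see in a direct implementation. The sketch below is illustrative (the dimensions and random weights are made up for the example); each loop iteration performs two d × d matrix-vector products, so the full pass over N steps costs O(N · d²):

```python
import numpy as np

N, d = 5, 4                     # sequence length and dimension (example values)
rng = np.random.default_rng(0)
W_x = rng.normal(size=(d, d))   # input-to-hidden weights
W_h = rng.normal(size=(d, d))   # hidden-to-hidden weights
b = np.zeros(d)                 # hidden bias

h = np.zeros(d)                 # initial hidden state
xs = rng.normal(size=(N, d))    # input sequence: one d-dim vector per timestep

for x_t in xs:
    # Two O(d^2) matrix-vector products per step, plus an O(d) activation.
    h = np.tanh(W_x @ x_t + W_h @ h + b)
```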
_________________________________________________________________________

QUESTION 7: [1 mark]

Which of the following is true about Seq2Seq models?

(i) Seq2Seq models are always conditioned on the source sentence.
(ii) The encoder compresses the input sequence into a fixed-size vector representation.
(iii) Seq2Seq models cannot handle variable-length sequences.

a. (i) and (ii)
b. (ii) only
c. (iii) only
d. (i), (ii), and (iii)

Correct Answer: a
Solution: The decoder is conditioned on the source sentence through the encoder, and the encoder compresses the variable-length input sequence into a fixed-size vector representation, so (i) and (ii) are true. Statement (iii) is false: handling variable-length sequences is precisely what Seq2Seq models are designed for.

_________________________________________________________________________

QUESTION 8: [2 marks]

Given the following encoder and decoder hidden states, compute the attention scores. (Use
dot product as the scoring function)

Encoder hidden states: h1=[1,2], h2=[3,4], h3=[5,6]

Decoder hidden state: s=[0.5,1]

a. 0.00235, 0.04731, 0.9503
b. 0.0737, 0.287, 0.6393
c. 0.9503, 0.0137, 0.036
d. 0.6393, 0.0737, 0.287

Correct Answer: a
Solution:
Dot-product scores:
e1 = 1 × 0.5 + 2 × 1 = 0.5 + 2 = 2.5
e2 = 3 × 0.5 + 4 × 1 = 1.5 + 4 = 5.5
e3 = 5 × 0.5 + 6 × 1 = 2.5 + 6 = 8.5

Applying softmax to the scores:
α1 = e^2.5 / (e^2.5 + e^5.5 + e^8.5) ≈ 0.00235
α2 = e^5.5 / (e^2.5 + e^5.5 + e^8.5) ≈ 0.04731
α3 = e^8.5 / (e^2.5 + e^5.5 + e^8.5) ≈ 0.9503
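The arithmetic can be verified with a few lines of NumPy; this is just a check of the numbers above, not a full attention implementation:

```python
import numpy as np

encoder_states = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # h1, h2, h3
decoder_state = np.array([0.5, 1.0])                             # s

scores = encoder_states @ decoder_state          # dot products: [2.5, 5.5, 8.5]
weights = np.exp(scores) / np.exp(scores).sum()  # softmax
print(weights)  # ≈ [0.00235, 0.04731, 0.95033]
```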

_________________________________________________________________________
