ML Endsem 2022
END-Sem
Instructions –
• Attempt all questions. MCQs have a single correct option.
• State any assumptions you have made clearly.
• Standard institute plagiarism policy holds.
• No evaluation without suitable justification.
Question 1: [1 Mark] In batch learning, does the order in which the input data is presented influence the learning outcome?
B. No, the order of input data does not influence the learning outcome in batch learning.
D. The order only matters in real-time data streaming, not in batch learning.
Answer: (B) No, the order of input data does not influence the learning outcome in batch learning. In batch learning, the entire dataset (or a large batch of it) is processed collectively during training, so the order in which the data points are presented does not affect the learning process. Training is driven by the aggregate error over the whole dataset or batch rather than by individual data points, making the order of presentation irrelevant in this context.
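A minimal NumPy sketch (illustrative only, with toy data; not part of the original answer) showing that the full-batch gradient is a sum over all examples and is therefore unchanged by shuffling:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))           # toy features
y = rng.normal(size=100)                # toy targets
w = rng.normal(size=3)                  # current weights

def batch_gradient(X, y, w):
    # Gradient of the mean squared error over the WHOLE batch.
    residual = X @ w - y
    return 2 * X.T @ residual / len(y)

perm = rng.permutation(len(y))          # shuffle the dataset
g_original = batch_gradient(X, y, w)
g_shuffled = batch_gradient(X[perm], y[perm], w)

print(np.allclose(g_original, g_shuffled))   # True: presentation order is irrelevant
```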
Question 2: [1 Mark] What impact will increasing the number of nodes in a neural network’s hidden layer have?
A. It will always improve the network’s accuracy on both training and test sets.
B. It may lead to underfitting, resulting in poor accuracy on both training and test sets.
C. It may result in overfitting, leading to higher accuracy on the training set but poor
accuracy on the test set, and increased training time.
D. The number of nodes in the hidden layer has no impact on the network’s performance.
Answer: (C) It may result in overfitting, leading to higher accuracy on the training set but poor accuracy on the test set, and increased training time.
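A small scikit-learn sketch (purely illustrative, with an assumed toy dataset; not part of the original solution) comparing a narrow and a very wide hidden layer: the wide model has far more parameters and can memorize the noisy training set, which typically shows up as a higher train score and a lower test score:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Small, noisy dataset so that a very wide hidden layer can memorize it.
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for width in (4, 512):
    clf = MLPClassifier(hidden_layer_sizes=(width,), max_iter=2000,
                        random_state=0).fit(X_tr, y_tr)
    print(width, clf.score(X_tr, y_tr), clf.score(X_te, y_te))
```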
Question 3: [1 Mark] You are designing a deep learning system to detect driver fatigue in cars,
where it is crucial to detect fatigue accurately to prevent accidents. Which of the following
evaluation metrics would be the most appropriate for this system?
A. Precision
B. Recall
C. F1 score
D. Loss value
Answer: (B) Recall. Explanation: Recall is the most appropriate metric in this context because it measures the proportion of actual positive cases (fatigue present) that were correctly identified. In safety-critical applications like driver fatigue detection, it is crucial to minimize false negatives (i.e., failing to detect fatigue when it is present), which is exactly what recall focuses on.
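A quick illustration (hypothetical labels, not from the exam) of how the metrics behave when fatigue cases are missed: false negatives lower recall the most:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Hypothetical labels: 1 = driver fatigued, 0 = alert.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]   # two missed fatigue cases (false negatives)

print("precision", precision_score(y_true, y_pred))   # 2 / 3
print("recall   ", recall_score(y_true, y_pred))      # 2 / 4 -- hurt most by the misses
print("f1       ", f1_score(y_true, y_pred))          # harmonic mean of the two
```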
Question 4: [2 Marks] Assuming ∂L/∂x3 is known, write the weight update for w1 (∂L/∂w1 should be in the expanded form). Input x1 and all weights are positive. All neurons have ReLU activation. Figure attached below.
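The figure is not reproduced in this text, so the following is only a sketch of the expected form, assuming a simple chain x1 → x2 → x3 with x2 = ReLU(w1 x1) and x3 = ReLU(w2 x2); since all inputs and weights are positive, every ReLU derivative equals 1:

$$\frac{\partial L}{\partial w_1}
  = \frac{\partial L}{\partial x_3}\,\frac{\partial x_3}{\partial x_2}\,\frac{\partial x_2}{\partial w_1}
  = \frac{\partial L}{\partial x_3}\cdot w_2 \cdot x_1,
\qquad
 w_1 \leftarrow w_1 - \eta\,\frac{\partial L}{\partial w_1}.$$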
Question 5: [2+2 Marks] Consider an activation function ρ(x) = x · σ(x), where σ(x) is the
sigmoid function.
(a) Compute ρ′ (x) in terms of σ(x).
(b) For large x, compare ρ(x) and ρ′ (x) with standard activation functions. (No derivation
required).
Part (a) Given that σ′(x) = σ(x)(1 − σ(x)), by the product rule:
ρ′(x) = σ(x) + x · σ′(x) = σ(x) + x · σ(x)(1 − σ(x)) = σ(x)[1 + x(1 − σ(x))]
Part (b)
1. For large values of x, σ(x) ≈ 1, thus ρ(x) ≈ x and the function mimics the ReLU activation.
2. For large values of x, σ(x) ≈ 1 and (1 − σ(x)) ≈ 0, so [1 + x(1 − σ(x))] ≈ 1 and ρ′(x) ≈ σ(x); the derivative mimics the sigmoid activation.
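A short NumPy check (illustrative, not part of the original solution) that the closed form ρ′(x) = σ(x)[1 + x(1 − σ(x))] matches a finite-difference estimate, and of the large-x behaviour:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rho(x):                      # rho(x) = x * sigma(x), i.e. SiLU/Swish
    return x * sigmoid(x)

def rho_prime(x):                # closed form: sigma(x) * (1 + x * (1 - sigma(x)))
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))

x = np.linspace(-5, 5, 11)
eps = 1e-6
numeric = (rho(x + eps) - rho(x - eps)) / (2 * eps)   # central difference
print(np.allclose(numeric, rho_prime(x), atol=1e-5))  # True

# Large-x behaviour: rho(x) ~ x (ReLU-like), rho'(x) ~ sigma(x) ~ 1.
big = 20.0
print(rho(big), rho_prime(big))
```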
Question 6: [2 Marks] Consider an ensemble of 5 models which are trained on the same ar-
chitecture but have different initializations for a handwritten digit classification task. Does the
guarantee of better performance in expectation (in terms of cross-entropy loss) by averaging
the predictions of all five networks hold if you instead average the weights and biases of the
networks? Why or why not?
Solution: No, the guarantee does not hold, because the loss is not convex with respect to the weights and biases. The guarantee for averaging predictions follows from the convexity of the cross-entropy loss in the predicted probabilities (Jensen’s inequality); no such convexity holds in weight space. Networks starting from different initializations can learn different hidden representations (for example, the same function with hidden units permuted), so averaging their weights and biases generally does not produce a sensible network.
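A small NumPy illustration (not part of the original solution; the layer sizes are arbitrary) of why weight averaging fails: two networks that compute exactly the same function, one being a hidden-unit permutation of the other, give identical averaged predictions, yet averaging their weights mixes mismatched hidden units and changes the function:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    h = np.maximum(0.0, x @ W1 + b1)            # ReLU hidden layer
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)     # softmax probabilities

# Network A: random weights; Network B: the same network with its
# hidden units permuted -- functionally identical to A.
W1 = rng.normal(size=(4, 8)); b1 = rng.normal(size=8)
W2 = rng.normal(size=(8, 3)); b2 = rng.normal(size=3)
perm = rng.permutation(8)
W1p, b1p, W2p = W1[:, perm], b1[perm], W2[perm, :]

x = rng.normal(size=(5, 4))
pA = mlp(x, W1, b1, W2, b2)
pB = mlp(x, W1p, b1p, W2p, b2)
print(np.allclose(pA, pB))                 # True: same function
print(np.allclose((pA + pB) / 2, pA))      # True: prediction averaging is harmless here

# Averaging weights mixes mismatched hidden units and changes the function.
pAvgW = mlp(x, (W1 + W1p) / 2, (b1 + b1p) / 2, (W2 + W2p) / 2, b2)
print(np.allclose(pAvgW, pA))              # False (in general)
```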
Question 7: [2 Marks] Prove that approximately 63% of the entire original dataset (total
training set) is present in any of the sampled bootstrap datasets using the Bagging method.
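Expected derivation (a standard argument): each bootstrap dataset is formed by n draws with replacement from the n training points, so

$$P(\text{a given point is never drawn}) = \left(1 - \frac{1}{n}\right)^{n} \;\xrightarrow{\;n \to \infty\;}\; e^{-1} \approx 0.368,$$

$$\Rightarrow\quad P(\text{a given point appears in the bootstrap dataset}) \approx 1 - e^{-1} \approx 0.632 \approx 63\%.$$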
Figure 2: Solution to Q8
Alternate solution for proving non-validity: when xi = −yi, the distance is infinite.
Question 9: [3 Marks] For a CNN-based classifier, calculate the number of weights, number
of biases, and the size of the associated feature maps for each layer, following the notation:
• CONV-K-N denotes a convolutional layer with N filters, each of size K × K. Padding and
stride parameters are always 0 and 1, respectively.
• POOL-K indicates a K × K pooling layer with stride K and padding 0.
• FC-N stands for a fully-connected layer with N neurons.
Successively:
Figure 3: Question 4
Figure 4: Question 9
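Since the actual layer list and the solution table are in the figure, here is a small helper sketch (with a hypothetical layer list) that applies the counting rules implied by the notation: a CONV-K-N layer has K·K·C_in·N weights and N biases and, with padding 0 and stride 1, shrinks each spatial side by K−1; POOL-K has no parameters and divides each side by K; FC-N has (inputs · N) weights and N biases.

```python
def cnn_shapes(input_hwc, layers):
    """Track (H, W, C), weights and biases through CONV-K-N / POOL-K / FC-N layers."""
    h, w, c = input_hwc
    flat = None                                   # set once we hit the first FC layer
    for layer in layers:
        kind, *args = layer.split("-")
        if kind == "CONV":                        # padding 0, stride 1
            k, n = int(args[0]), int(args[1])
            weights, biases = k * k * c * n, n
            h, w, c = h - k + 1, w - k + 1, n
            size = (h, w, c)
        elif kind == "POOL":                      # stride K, padding 0, no parameters
            k = int(args[0])
            weights, biases = 0, 0
            h, w = h // k, w // k
            size = (h, w, c)
        else:                                     # FC-N
            n = int(args[0])
            in_units = flat if flat is not None else h * w * c
            weights, biases = in_units * n, n
            flat, size = n, (n,)
        print(f"{layer:12s} weights={weights:8d} biases={biases:5d} feature map={size}")

# Hypothetical example (the actual layer list is given in the exam figure):
cnn_shapes((32, 32, 3), ["CONV-5-8", "POOL-2", "CONV-3-16", "POOL-2", "FC-10"])
```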
Question 10: [4 Marks] Discuss the role of activation functions in mitigating exploding and
vanishing gradient problems. Provide examples of activation functions that are more or less
prone to these issues and explain why.
Exploding and Vanishing Gradient Issues:
Exploding Gradient: Occurs when gradients become extremely large during backpropagation,
leading to unstable learning. This occurs when the weights become very large.
Vanishing Gradient: Happens when gradients become too small, causing slow or halted
learning for deep networks.
Activation Functions:
Sigmoid and tanh: Prone to vanishing gradients, particularly in deep networks, because they saturate for large |x| and their derivatives are bounded (at most 0.25 for the sigmoid and 1 for tanh); multiplying many such small factors during backpropagation shrinks the gradient, limiting their use in deep networks.
ReLU, Leaky ReLU, PReLU, ELU: Designed to alleviate vanishing gradients, since their derivative is 1 (or close to 1) over the active region, making them suitable for deep architectures.
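A short NumPy illustration (an assumed toy chain, not part of the original solution) of why saturating activations vanish: the backpropagated gradient is a product of per-layer derivative factors, bounded by 0.25 for the sigmoid but equal to 1 on the active side of ReLU, so deep sigmoid chains drive the product toward zero:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

depth = 30
x = 0.5                              # input fed through a chain of unit-weight layers
grad_sigmoid, grad_relu = 1.0, 1.0
a_sig, a_relu = x, x
for _ in range(depth):
    s = sigmoid(a_sig)
    grad_sigmoid *= s * (1 - s)               # sigmoid'(z) <= 0.25, product shrinks fast
    a_sig = s
    grad_relu *= 1.0 if a_relu > 0 else 0.0   # ReLU'(z) = 1 for positive inputs
    a_relu = max(a_relu, 0.0)

print(f"after {depth} layers: sigmoid-chain gradient {grad_sigmoid:.3e}, "
      f"ReLU-chain gradient {grad_relu:.1f}")
```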
Question 11: [2 Marks] Express the derivative of a sigmoid in terms of the sigmoid itself for
positive constants a and b:
(a) A purely positive sigmoid: φj (v) = 1/(1 + exp(−av))
(b) An antisymmetric sigmoid: φj (v) = a tanh(bv)
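No worked solution appears here; the standard expressions, which the derivations should arrive at, are:

(a) With φj(v) = 1/(1 + e^{−av}):
$$\varphi_j'(v) = \frac{a\,e^{-av}}{\left(1+e^{-av}\right)^2} = a\,\varphi_j(v)\bigl(1-\varphi_j(v)\bigr).$$

(b) With φj(v) = a tanh(bv):
$$\varphi_j'(v) = ab\,\operatorname{sech}^2(bv) = ab\left(1-\tanh^2(bv)\right) = \frac{b}{a}\bigl(a-\varphi_j(v)\bigr)\bigl(a+\varphi_j(v)\bigr).$$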
Question 12: [2 Marks] Consider the following patterns, each having four binary-valued attributes:
ω1 1100 0000 1010 0011
ω2 1100 1111 1110 0111
Note especially that the first patterns in the two categories are the same. Identify the root node feature for a binary classification tree for this data so that the leaf nodes have the lowest impurity possible.
Figure 5: Solution to Q12 (all the calculations for entropy/impurity should be shown).
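A small Python sketch (illustrative; the graded calculations are in Figure 5) that computes the information gain of splitting on each of the four binary attributes:

```python
import math
from collections import Counter

patterns = {"w1": ["1100", "0000", "1010", "0011"],
            "w2": ["1100", "1111", "1110", "0111"]}

data = [(p, label) for label, ps in patterns.items() for p in ps]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values()) if n else 0.0

parent = entropy([label for _, label in data])
for attr in range(4):                        # candidate root features (bit positions)
    left = [label for p, label in data if p[attr] == "0"]
    right = [label for p, label in data if p[attr] == "1"]
    remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(data)
    print(f"attribute {attr + 1}: information gain = {parent - remainder:.3f}")
```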
Question 13: [2 Marks] In the context of behavior simulation and content simulation tasks,
discuss the implications of the observed performance gap between LCBM and large content-only
models like GPT-3.5 and GPT-4. What insights can be drawn from this performance difference,
and how does it contribute to our understanding of the effectiveness of including behavior tokens
in language model training? Provide a brief analysis of the observed trends as presented during the lecture.
Q13: Refer to https://arxiv.org/pdf/2309.00359.pdf for more details.
• LCBM, while being 10x smaller than GPT-3.5 and 4, performs better than them on all
behavior-related tasks.
• Further, we see that there is no significant difference between 10-shot and 2-shot GPT-4
or between GPT-3.5 and GPT-4, indicating that unlike other tasks, it is harder to achieve
good performance through in-context learning on the behavior modality.
• It can be observed that often GPT-3.5 and 4 achieve performance comparable to (or worse
than) random baselines. Interestingly, the performance of GPTs on the content simulation
task is also substantially behind LCBM.
• Given the way we formulate the content simulation task (Listing 5), substantial performance could be achieved through strong content knowledge alone, with behavior contributing little variance.
• We still see a substantial performance gap between the two models. All of this indicates that large models like GPT-3.5 and 4 are not trained on behavior tokens.
Disadvantages:
• Limited flexibility in responding to diverse user inputs; potential to miss nuanced expressions and unique user needs; may feel less conversational and empathetic.
Advantages:
• Enhanced flexibility in addressing various user inputs; can simulate more natural and empathetic conversations; better adaptation to users’ emotional states.
0.25 marks for any one advantage and one disadvantage of each type (0.25 × 4 = 1 mark).
• Implications: Ensures the protection of sensitive health information, guarantees user privacy and control over personal data, and sets standards for cybersecurity and data protection. Give 0.5 marks for any one mentioned.
• To ensure compliance: Implement robust encryption for data transmission, ensure secure storage and access controls for user data, and obtain clear consent for data collection and usage, or something along these lines. 0.5 marks.