
Introduction to Machine Learning - Unit 8 - Week 5

Week 5 : Assignment 5
The due date for submitting this assignment has passed.

Due on 2025-02-26, 23:59 IST.
Assignment submitted on 2025-02-24, 19:38 IST.
1) Consider a feedforward neural network that performs regression on a p-dimensional input to produce a scalar output. It has m hidden layers and each of these layers has k hidden units. What is the total number of trainable parameters in the network? Ignore the bias terms. (1 point)

pk + mk^2 + k

pk + (m - 1)k^2 + k

p + (m - 1)pk + k^2

p^2 + (m - 1)pk + k^2

Yes, the answer is correct.
Score: 1
Accepted Answers:
pk + (m - 1)k^2 + k
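
A quick way to sanity-check the accepted count is to list the bias-free weight matrices the question describes and sum their sizes. This is a minimal Python sketch; the concrete values of p, m and k are arbitrary choices, not part of the question.

# shapes of the weight matrices, ignoring biases:
# input (p) -> first hidden layer (k), then (m - 1) hidden-to-hidden layers, then k -> scalar output
p, m, k = 7, 4, 5                                   # arbitrary example sizes
shapes = [(k, p)] + [(k, k)] * (m - 1) + [(1, k)]
total = sum(rows * cols for rows, cols in shapes)
assert total == p * k + (m - 1) * k ** 2 + k        # matches the accepted option
print(total)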

2) Consider a neural network layer defined as y = ReLU(Wx). Here x ∈ R^p is the input, y ∈ R^d is the output and W ∈ R^(d×p) is the parameter matrix. The ReLU activation (defined as ReLU(z) := max(0, z) for a scalar z) is applied element-wise to Wx. Find ∂y_i/∂W_ij, where i = 1, ..., d and j = 1, ..., p. In the following options, I(condition) is an indicator function that returns 1 if the condition is true and 0 if it is false. (1 point)

I(∑_{k=1}^p W_ik x_k ≤ 0) x_i

I(∑_{k=1}^p W_ik x_k > 0) x_j

I(∑_{k=1}^p W_ik x_k > 0) W_ij x_j

I(∑_{k=1}^p W_ik x_k ≤ 0) W_ij x_j

Yes, the answer is correct.
Score: 1
Accepted Answers:
I(∑_{k=1}^p W_ik x_k > 0) x_j
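
The accepted derivative can be verified numerically: perturb a single entry W_ij, recompute y_i, and compare the finite-difference slope against I(∑_k W_ik x_k > 0) x_j. A minimal sketch; the sizes, seed and the chosen (i, j) are arbitrary assumptions.

import numpy as np

rng = np.random.default_rng(0)
d, p = 3, 4                                        # arbitrary layer sizes
W = rng.normal(size=(d, p))
x = rng.normal(size=p)

relu = lambda z: np.maximum(0.0, z)
y = relu(W @ x)

i, j, eps = 1, 2, 1e-6
W_pert = W.copy()
W_pert[i, j] += eps
numeric = (relu(W_pert @ x)[i] - y[i]) / eps       # finite-difference slope of y_i

analytic = float(W[i] @ x > 0) * x[j]              # I(sum_k W_ik x_k > 0) * x_j
print(numeric, analytic)                           # the two values agree up to ~1e-6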

3) Consider a two-layered neural network y = σ(W^(B) σ(W^(A) x)). Let h = σ(W^(A) x) denote the hidden layer representation. W^(A) and W^(B) are arbitrary weights. Which of the following statement(s) is/are true? Note: ∇_g(f) denotes the gradient of f w.r.t. g. (1 point)

∇_h(y) depends on W^(A)

∇_{W^(A)}(y) depends on W^(B)

∇_{W^(A)}(h) depends on W^(B)

∇_{W^(B)}(y) depends on W^(A)

Yes, the answer is correct.
Score: 1
Accepted Answers:
∇_{W^(A)}(y) depends on W^(B)
∇_{W^(B)}(y) depends on W^(A)
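
A small chain-rule check of the accepted statements, written as a sketch that assumes a scalar output (so W^(B) reduces to a single row w_B); the sizes and seed are arbitrary. The explicit w_B factor in the gradient w.r.t. W^(A) is the dependence the answer refers to, and symmetrically the gradient w.r.t. W^(B) involves h = σ(W^(A) x).

import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
p, k = 4, 3                                        # arbitrary sizes
x = rng.normal(size=p)
W_A = rng.normal(size=(k, p))
w_B = rng.normal(size=k)                           # scalar-output second layer (assumption)

def forward(W_A_mat):
    h = sigmoid(W_A_mat @ x)
    return sigmoid(w_B @ h), h

y, h = forward(W_A)

# chain rule: dy/dW_A[i, j] = sigma'(w_B @ h) * w_B[i] * sigma'((W_A x)[i]) * x[j];
# the w_B factor shows that the gradient w.r.t. W^(A) depends on W^(B)
grad = (y * (1 - y)) * np.outer(w_B * h * (1 - h), x)

# finite-difference spot check of one entry
i, j, eps = 2, 1, 1e-6
W_pert = W_A.copy()
W_pert[i, j] += eps
print(grad[i, j], (forward(W_pert)[0] - y) / eps)  # the two values match closely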
4) Which of the following statement(s) about the initialization of neural network weights is/are true for a network that uses the sigmoid activation function? (1 point)

Two different initializations of the same network could converge to different minima

For a given initialization, gradient descent will converge to the same minima irrespective of the learning rate.

Initializing all weights to the same constant value leads to undesirable results

Initializing all weights to very large values leads to undesirable results

Yes, the answer is correct.
Score: 1
Accepted Answers:
Two different initializations of the same network could converge to different minima
Initializing all weights to the same constant value leads to undesirable results
Initializing all weights to very large values leads to undesirable results
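
On why constant initialization is undesirable: with identical incoming weights, every hidden unit computes the same activation and receives the same gradient, so the units remain identical after any update. A minimal sketch of one gradient step on a squared-error loss; the network shape, data and learning rate are arbitrary assumptions.

import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

p, k = 4, 3                                        # arbitrary sizes
x = np.ones(p)
t = 1.0                                            # arbitrary regression target
W1 = np.full((k, p), 0.5)                          # all weights set to the same constant
w2 = np.full(k, 0.5)

h = sigmoid(W1 @ x)                                # every hidden unit computes the same value
y = w2 @ h
err = y - t                                        # derivative of 0.5 * (y - t)^2 w.r.t. y

grad_W1 = np.outer(err * w2 * h * (1 - h), x)      # identical rows: identical per-unit gradients
W1 -= 0.1 * grad_W1                                # one gradient-descent step

print(np.allclose(W1, W1[0]))                      # True: the hidden units never differentiate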
5) Consider the following statements about the derivatives of the sigmoid (σ(x) = 1/(1 + exp(-x))) and tanh (tanh(x) = (exp(x) - exp(-x))/(exp(x) + exp(-x))) activation functions. Which of these statement(s) is/are correct? (1 point)

σ'(x) = σ(x)(1 - σ(x))

0 < σ'(x) ≤ 1/4

tanh'(x) = (1/2)(1 - (tanh(x))^2)

0 < tanh'(x) ≤ 1

Yes, the answer is correct.
Score: 1
Accepted Answers:
σ'(x) = σ(x)(1 - σ(x))
0 < σ'(x) ≤ 1/4
0 < tanh'(x) ≤ 1
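
A quick numerical check of the derivative identities and bounds (a sketch; the grid of x values is arbitrary). Note that the correct tanh derivative is 1 - tanh(x)^2 without the 1/2 factor, which is why the third statement is not accepted.

import numpy as np

x = np.linspace(-10, 10, 10001)
sig = 1.0 / (1.0 + np.exp(-x))

d_sig = sig * (1 - sig)                            # sigma'(x) = sigma(x)(1 - sigma(x))
d_tanh = 1 - np.tanh(x) ** 2                       # tanh'(x) = 1 - tanh(x)^2

print(d_sig.max())                                 # ~0.25, attained at x = 0
print(d_tanh.max())                                # ~1.0, attained at x = 0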
6) A geometric distribution is defined by the p.m.f. f(x; p) = (1 - p)^(x-1) p for x = 1, 2, .... Given the samples [4, 5, 6, 5, 4, 3] drawn from this distribution, find the MLE of p. (1 point)

0.111

0.222

0.333

0.444

Yes, the answer is correct.
Score: 1
Accepted Answers:
0.222
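
For this parameterization of the geometric distribution the MLE is the reciprocal of the sample mean, p̂ = n / Σ x_i = 6 / 27 ≈ 0.222. A short sketch that also confirms this by brute-force maximization of the log-likelihood:

import numpy as np

samples = [4, 5, 6, 5, 4, 3]
p_hat = len(samples) / sum(samples)                # closed-form MLE: 1 / sample mean
print(round(p_hat, 3))                             # 0.222

# brute-force cross-check: maximize the log-likelihood over a grid of p values
ps = np.linspace(0.001, 0.999, 999)
loglik = [sum((x - 1) * np.log(1 - p) + np.log(p) for x in samples) for p in ps]
print(ps[int(np.argmax(loglik))])                  # ~0.222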

7) Consider a Bernoulli distribution with p = 0.7 (true value of the parameter). We draw samples from this distribution and compute an MAP estimate of p by assuming a prior distribution over p. Let N(μ, σ^2) denote a gaussian distribution with mean μ and variance σ^2. Distributions are normalized as needed. Which of the following statement(s) is/are true? (1 point)

If the prior is N(0.6, 0.1), we will likely require fewer samples for converging to the true value than if the prior is N(0.4, 0.1)

If the prior is N(0.4, 0.1), we will likely require fewer samples for converging to the true value than if the prior is N(0.6, 0.1)

With a prior of N(0.1, 0.001), the estimate will never converge to the true value, regardless of the number of samples used.

With a prior of U(0, 0.5) (i.e. uniform distribution between 0 and 0.5), the estimate will never converge to the true value, regardless of the number of samples used.

No, the answer is incorrect.
Score: 0
Accepted Answers:
If the prior is N(0.6, 0.1), we will likely require fewer samples for converging to the true value than if the prior is N(0.4, 0.1)
With a prior of U(0, 0.5) (i.e. uniform distribution between 0 and 0.5), the estimate will never converge to the true value, regardless of the number of samples used.
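
On the accepted U(0, 0.5) statement: with a prior that is zero outside [0, 0.5], the posterior is the likelihood restricted to that interval, so the MAP estimate is the MLE clipped into [0, 0.5] and can never reach the true value 0.7, however many samples are drawn. A minimal simulation sketch; the sample sizes and seed are arbitrary assumptions.

import numpy as np

rng = np.random.default_rng(0)
true_p = 0.7

for n in (100, 10_000, 1_000_000):
    draws = rng.random(n) < true_p                 # Bernoulli(0.7) samples
    mle = draws.mean()
    # uniform prior on [0, 0.5]: posterior = likelihood restricted to [0, 0.5],
    # so the MAP is the MLE clipped into the prior's support
    map_uniform = min(max(mle, 0.0), 0.5)
    print(n, round(float(mle), 3), map_uniform)    # the MAP is stuck at 0.5 < 0.7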

8) Which of the following statement(s) about parameter estimation techniques is/are true? (1 point)

To obtain a distribution over the predicted values for a new data point, we need to compute an integral over the parameter space.

The MAP estimate of the parameter gives a point prediction for a new data point.

The MLE of a parameter gives a distribution of predicted values for a new data point.

We need a point estimate of the parameter to compute a distribution of the predicted values for a new data point.

Yes, the answer is correct.
Score: 1
Accepted Answers:
To obtain a distribution over the predicted values for a new data point, we need to compute an integral over the parameter space.
The MAP estimate of the parameter gives a point prediction for a new data point.
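
To make the accepted statements concrete, consider a Bernoulli likelihood with a Beta prior (an illustrative assumption, not part of the question): the predictive distribution ∫ P(x_new | p) P(p | data) dp has the closed form (a + heads)/(a + b + n), while the MAP estimate yields a single plug-in number. A sketch under those assumptions:

heads, n = 7, 10                                   # illustrative observed data
a, b = 2.0, 2.0                                    # Beta(2, 2) prior (an assumption)

# posterior over p is Beta(a + heads, b + n - heads);
# predictive P(x_new = 1 | data) = integral of p * posterior(p) dp = posterior mean
predictive = (a + heads) / (a + b + n)

# MAP point estimate (mode of the Beta posterior), used as a plug-in point prediction
p_map = (a + heads - 1) / (a + b + n - 2)

print(predictive, p_map)                           # 0.642..., 0.666...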

9) In a classification setting, it is common in machine learning applications to minimize the discrete cross entropy loss given by H_CE(p, q) = -Σ_i p_i log q_i, where p_i and q_i are the true and predicted distributions (p_i ∈ {0, 1} depending on the label of the corresponding entry). Which of the following statement(s) about minimizing the cross entropy loss is/are true? (1 point)

Minimizing H_CE(p, q) is equivalent to minimizing the (self) entropy H(q)

Minimizing H_CE(p, q) is equivalent to minimizing H_CE(q, p).

Minimizing H_CE(p, q) is equivalent to minimizing the KL divergence D_KL(p||q)

Minimizing H_CE(p, q) is equivalent to minimizing the KL divergence D_KL(q||p)

Yes, the answer is correct.
Score: 1
Accepted Answers:
Minimizing H_CE(p, q) is equivalent to minimizing the KL divergence D_KL(p||q)
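
The accepted option follows from the identity H_CE(p, q) = H(p) + D_KL(p || q): since H(p) does not depend on q, the two objectives differ only by a constant. A quick numerical check (the distributions are arbitrary; terms with p_i = 0 are dropped using the 0 log 0 = 0 convention):

import numpy as np

p = np.array([0.0, 1.0, 0.0])                      # one-hot true distribution
q = np.array([0.2, 0.5, 0.3])                      # arbitrary predicted distribution

nz = p > 0                                         # skip 0 * log 0 terms
h_ce = -np.sum(p[nz] * np.log(q[nz]))              # cross entropy H_CE(p, q)
h_p = -np.sum(p[nz] * np.log(p[nz]))               # entropy H(p) (0 for one-hot p)
d_kl = np.sum(p[nz] * np.log(p[nz] / q[nz]))       # KL divergence D_KL(p || q)

print(np.isclose(h_ce, h_p + d_kl))                # True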

10) Which of the following statement(s) about activation functions is/are NOT true? (1 point)

Non-linearity of activation functions is not a necessary criterion when designing very deep neural networks

Saturating non-linear activation functions (derivative → 0 as x → ±∞) avoid the vanishing gradients problem

Using the ReLU activation function avoids all problems arising due to gradients being too small.

The dead neurons problem in ReLU networks can be fixed using a leaky ReLU activation function

Yes, the answer is correct.
Score: 1
Accepted Answers:
Non-linearity of activation functions is not a necessary criterion when designing very deep neural networks
Saturating non-linear activation functions (derivative → 0 as x → ±∞) avoid the vanishing gradients problem
Using the ReLU activation function avoids all problems arising due to gradients being too small.
