Machine Learning Questions Final - Solutions
Type: Closed Book Time: 180 minutes Max Marks: 120 Date: 09/05/2025
------------------------------------------------------------------------------------------------------
(Write in Answer Sheet) Self Declaration: I declare that I am not carrying with me:
1. Any written material on paper, clothes, or body parts
2. Any mobile phone, communication, or data storage devices.
Name: and Signature with date:
Note: This is only the solution key. You are expected to derive/solve to get the
solutions.
Q.1) [Total Marks: 16] You are using AdaBoost and obtain the final ensemble of weak classifiers as shown below.
There are 9 regions, and each region will predict either a +ve (+1) or a -ve (-1) outcome. Given
that the final prediction from region (5) is -ve, find the relationships between the weighting coefficients.
Solution:
Since we do not know the prediction of each classifier, we need to consider different cases. There are
16 different cases in total, and each case gives a different relationship between the weighting coefficients.
Case 1:
• Classifier 1 (with weighting coefficient 𝛼1 ): Left is +ve and Right is -ve
• Classifier 2 (with weighting coefficient 𝛼2 ): Left is +ve and Right is -ve
• Classifier 3 (with weighting coefficient α3): Top is +ve and bottom is -ve
• Classifier 4 (with weighting coefficient α4): Top is +ve and bottom is -ve
Relationship: −α1 + α2 − α3 + α4 < 0 and hence α2 + α4 < α1 + α3
Case 2:
• Classifier 1 (with weighting coefficient 𝛼1 ): Left is -ve and Right is +ve
• Classifier 2 (with weighting coefficient 𝛼2 ): Left is +ve and Right is -ve
• Classifier 3 (with weighting coefficient α3): Top is +ve and bottom is -ve
• Classifier 4 (with weighting coefficient α4): Top is +ve and bottom is -ve
Relationship: α1 + α2 − α3 + α4 < 0 and hence α1 + α2 + α4 < α3
So for each case you can write: ±α1 ± α2 ± α3 ± α4 < 0, where the sign in front of αm is the prediction of
classifier m in region (5). Hence, you will get all 16 possibilities (see the enumeration sketch below).
1 mark will be given for each case and the corresponding relationship.
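For completeness, a small Python sketch (not required in the exam) that enumerates the 16 sign patterns; the a1..a4 here stand for the weighting coefficients α1..α4, and each printed line is the inequality obtained in one case:

# Enumerate all 16 sign patterns for Q.1. The sign in front of each a_m is the
# prediction (+1 or -1) of classifier m in region (5); since the ensemble prediction
# in region (5) is -ve, each pattern gives one inequality on a1..a4.
from itertools import product

for signs in product([+1, -1], repeat=4):
    lhs = " ".join(f"{'+' if s > 0 else '-'} a{i+1}" for i, s in enumerate(signs))
    print(f"{lhs} < 0")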
Q.2) [Total Marks: 12] There are 2000 labelled samples and you decide to use bagging. You use an ensemble of
decision trees by generating 50 bootstrap samples, each used to train a separate tree. Each bootstrap sample is
created by sampling with replacement from the original dataset and contains 2000 instances (the same size as the
original dataset). Each tree has an accuracy of 60% and their errors are uncorrelated. We model X, the number
of trees that classify correctly, with a Binomial distribution. Note that a Chernoff bound is used to calculate a
conservative lower bound for the ensemble accuracy. For a Binomial X ~ Binomial(n, p), the Chernoff bound on the
probability that X deviates below its mean by a factor of δ is P(X < (1 − δ)μ) ≤ exp(−μδ²/2), where μ is the mean.
(a) [Marks: 8] What is the expected accuracy of the majority-vote ensemble using the Chernoff bound?
(b) [Marks: 4] Is there anything wrong with the result, or is it correct? Please justify your answer with a reason.
Solution:
The number of trees that correctly classify is modelled as Binomial. Each tree has an accuracy of 60%
(error rate of 40%). The mean of X is 𝜇 = 𝑛𝑝 = 50 × 0.6 = 30.
The majority vote is incorrect if fewer than 26 trees classify correctly. So for P(X < 26), δ = (30 − 26)/30 ≈ 0.133.
P(ensemble incorrect) = P(X < 26) ≤ exp(−μδ²/2) = exp(−(30 × 0.133²)/2) ≈ exp(−0.267) ≈ 0.766
The probability that the ensemble is incorrect is bounded above by 76.6%, so the bound only guarantees an
ensemble accuracy of at least 23.4%, which is much lower than the individual accuracy of 60%. Therefore the
result appears wrong: the Chernoff bound does not produce a useful result here.
Note that the Chernoff bound is useful when the sample size is large. In this case, the sample size (50 trees) is
small and hence the bound is not tight. (I do not expect you to know this, and hence no marks are
deducted if you do not mention this reason; it is only for your information.)
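A small Python sketch (for verification only, not expected in the exam) that reproduces the Chernoff-bound number above and compares it with the exact Binomial tail, which turns out to be roughly 0.1, confirming how loose the bound is:

# Verify the Chernoff bound for Q.2 and compare with the exact Binomial tail.
import math

n, p = 50, 0.6            # 50 trees, each 60% accurate, errors assumed independent
mu = n * p                # mean number of correct trees = 30
delta = (mu - 26) / mu    # deviation factor for P(X < 26), about 0.133

chernoff = math.exp(-mu * delta**2 / 2)                                    # ~0.766
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(26))  # ~0.1

print(f"Chernoff bound on P(X < 26): {chernoff:.3f}")
print(f"Exact P(X < 26):             {exact:.3f}")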
Q.3) [Total Marks: 22] 4 clusters and 5 data points are given to you. The data points are X1: [1,2]; X2: [2,1]; X3:
[8,8]; X4: [9,9]; X5: [4,5]. The mixing coefficients of all the clusters are the same and equal to 1/4. The means of the
four clusters are: C1: [1,1]; C2: [2,2]; C3: [8,8]; C4: [9,9]. We also assume that the covariance matrices are identity
matrices. Complete one step of E-M and answer the following:
(a) [Marks: 10] Draw a table with the values of the responsibilities (γ) after the E-step. The table should contain
rows representing data points and columns representing clusters.
(b) [Marks: 4] Data point [4,5] belongs to which cluster, and why?
(c) [Marks: 8] After the M-step, which cluster has the highest mixing coefficient? Why, and what is its value?
Solution:
(a)
        C1          C2          C3          C4
X1      0.5         0.5         2.87e-19    2.39e-25
X2      0.5         0.5         2.87e-19    2.39e-25
X3      3.83e-22    1.70e-16    0.73        0.27
X4      1.17e-28    3.83e-22    0.27        0.73
X5      0.002       0.995       0.002       8.27e-07
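A small Python sketch (for verification, assuming the setup above: equal mixing coefficients of 1/4 and identity covariances) that reproduces the responsibility table and also prints the M-step update of the mixing coefficients, which is what parts (b) and (c) rely on:

# One E-step of the 4-component Gaussian mixture in Q.3, followed by the M-step
# update of the mixing coefficients.
import numpy as np

X  = np.array([[1, 2], [2, 1], [8, 8], [9, 9], [4, 5]], dtype=float)  # data points X1..X5
mu = np.array([[1, 1], [2, 2], [8, 8], [9, 9]], dtype=float)          # cluster means C1..C4
pi = np.full(4, 0.25)                                                 # mixing coefficients

# E-step: gamma(n, k) is proportional to pi_k * N(x_n | mu_k, I);
# the Gaussian normalising constant 1/(2π) cancels in the normalisation.
d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)   # squared distances, shape (5, 4)
unnorm = pi * np.exp(-0.5 * d2)
gamma = unnorm / unnorm.sum(axis=1, keepdims=True)
print(gamma)                        # reproduces the table above

# M-step: pi_k = N_k / N with N_k = sum_n gamma(n, k); C2 has the largest value (~0.4).
print(gamma.sum(axis=0) / len(X))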
Q.4) [Total Marks: 24] Answer the following questions:
(a) [Marks: 10] For a linear SVM with decision boundary wᵀφ(x) + b = 0, derive the width of the margin.
(b) [Marks: 4] Can XOR data be classified correctly by a linear SVM? Why or why not?
(c) [Marks: 4] Let us assume that the basis function φ(x1, x2) = (x1, x2, x1·x2) is used on the XOR data. Will it now
be correctly classified by a linear SVM?
(d) [Marks: 4] Assuming that we are using a hard-margin linear SVM on the transformed data generated in part
(c), can you write the constraints for the optimization problem where you maximize the margin? You can
assume w1, w2 and w3 as the components of the weight vector.
(e) [Marks: 2] Instead of the manual transformation used in part (c), can you specify a polynomial kernel which can
be used?
Solution:
(a) Done in class. (Sketch: the closest points on either side of the boundary satisfy wᵀφ(x) + b = ±1, each at
distance 1/‖w‖ from it, so the margin width is 2/‖w‖.)
(b) No, since the XOR data are not linearly separable.
(c) Yes. Please provide the complete table of the transformed points and their labels.
(d) Four constraints for the four transformed points (see the feasibility sketch below):
i. For (1,0,0): w1 + b ≥ 1
ii. For (0,1,0): w2 + b ≥ 1
iii. For (0,0,0): b ≤ −1
iv. For (1,1,1): w1 + w2 + w3 + b ≤ −1
(e) The polynomial kernel k(x, x′) = (xᵀx′)² can be used. [Marks will be given for any other correct kernel as well.]
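A small feasibility sketch for parts (c)/(d): the particular weights w = (2, 2, −4) and b = −1 are an illustrative assumption (not necessarily the max-margin solution), chosen only to show that all four hard-margin constraints can be satisfied, i.e. that the transformed XOR data is linearly separable:

# Check the hard-margin constraints y_n * (w^T phi(x_n) + b) >= 1 on the transformed
# XOR data from Q.4(c), for one feasible (not necessarily optimal) weight vector.
import numpy as np

phi = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 1]], dtype=float)  # phi(x1, x2)
y   = np.array([-1, +1, +1, -1])                                           # XOR labels

w = np.array([2.0, 2.0, -4.0])   # hypothetical feasible weights (w1, w2, w3)
b = -1.0

print(y * (phi @ w + b))         # [1. 1. 1. 1.] -> all four constraints hold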
Q.5) [Total Marks: 26] Let us consider a 2-layer neural network with 1 hidden and 1 output layer. The input layer
contains 2 neurons, the hidden layer contains 2 neurons, and the output layer contains 1 neuron. The hidden layer
uses a Swish activation function, where Swish(z) = z·σ(z) with σ(z) the sigmoid function. The input is [1, 0.5] and
the true output is 1. The hidden layer weights are [w11 w12; w21 w22] = [0.1 0.2; 0.3 0.4] with both biases
b1 = b2 = 0.1, and the output layer weights are [w@11; w@12] = [0.7; 0.8] with bias b@1 = 0.2. Consider the loss
function to be MSE. Please solve and answer the following after one step of feedforward and backward propagation.
(a) [Marks: 2] What is the derivative of the activation function at the hidden layer?
(b) [Marks: 8] Find the loss.
(c) [Marks: 16] Find the gradients with respect to all the weights and biases.
Solution:
(a) The derivative of the Swish function is Swish′(z) = σ(z) + z·σ(z)(1 − σ(z)), which can equivalently be written as Swish(z) + σ(z)(1 − Swish(z)).
(b) You need to show the complete feedforward calculations.
Hidden layer
• Pre-activation input to the two units: [0.3, 0.6]
• After activation (Swish), output from the two units: [0.172, 0.387]
• Output layer pre-activation: [0.631]
• Output (sigmoid): 0.6526
• Loss = 0.0603
(c) Perform the complete backpropagation steps
• At the output unit: ∂L/∂w@11 = −0.0136, ∂L/∂w@12 = −0.0305, ∂L/∂b@1 = −0.0788
• At the hidden units:
Gradient with respect to weights: [∂L/∂w11 ∂L/∂w12; ∂L/∂w21 ∂L/∂w22] = [−0.0357 −0.0179; −0.0493 −0.0247]
Gradient with respect to biases: ∂L/∂b1 = −0.0357 and ∂L/∂b2 = −0.0493
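A short Python sketch (for verification only) of the full feedforward and backpropagation step, reproducing the loss and all the gradients listed above under the stated setup (Swish hidden activation, sigmoid output, loss taken as 0.5·(t − y)²):

# One feedforward and backpropagation step for the 2-2-1 network of Q.5.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x  = np.array([1.0, 0.5])                  # input
t  = 1.0                                   # target output
W1 = np.array([[0.1, 0.2], [0.3, 0.4]])    # hidden weights [w11 w12; w21 w22]
b1 = np.array([0.1, 0.1])
W2 = np.array([0.7, 0.8])                  # output weights [w@11, w@12]
b2 = 0.2

# Forward pass
z1 = W1 @ x + b1                  # [0.3, 0.6]
h  = z1 * sigmoid(z1)             # Swish: [0.172, 0.387]
z2 = W2 @ h + b2                  # ~0.631
y  = sigmoid(z2)                  # ~0.6526
loss = 0.5 * (t - y) ** 2         # ~0.0603

# Backward pass
delta2  = (y - t) * y * (1 - y)                                  # dL/dz2 ~ -0.0788
grad_W2 = delta2 * h                                             # ~[-0.0136, -0.0305]
grad_b2 = delta2                                                 # ~-0.0788
swish_d = sigmoid(z1) + z1 * sigmoid(z1) * (1 - sigmoid(z1))     # Swish'(z1)
delta1  = delta2 * W2 * swish_d                                  # ~[-0.0357, -0.0493]
grad_W1 = np.outer(delta1, x)                                    # rows ~[-0.0357, -0.0179], [-0.0493, -0.0247]
grad_b1 = delta1

print(loss, grad_W2, grad_b2, grad_W1, grad_b1, sep="\n")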
Q.6) [Total Marks: 20] Let us assume that we have IID observations {x1, x2, …, xn} which are drawn from the
following PDF: fw(x) = C exp(−(x − w)⁶/6), where C is the constant. Answer the following:
Solution:
(a) Derive the loss function. Since the observations are IID, L(w) = ∏ᵢ fw(xᵢ), and the final expression is:
log L(w) = n log(C) − (1/6) Σᵢ₌₁ⁿ (xᵢ − w)⁶
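A small numerical sketch showing that, once the log-likelihood above is available, the MLE of w can be found by minimising the loss (1/6) Σᵢ (xᵢ − w)⁶; the data values below are made up purely for illustration and are not from the question:

# Maximise log L(w) = n*log(C) - (1/6) * sum_i (x_i - w)^6 numerically; the constant
# term n*log(C) does not depend on w and is dropped.
import numpy as np

x = np.array([0.2, 1.1, 0.7, 1.5, 0.9])   # hypothetical IID observations

def loss(w):
    return np.sum((x - w) ** 6) / 6.0     # negative log-likelihood up to a constant

grid = np.linspace(x.min(), x.max(), 10001)
w_hat = grid[np.argmin([loss(w) for w in grid])]
print(w_hat)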