

Machine Learning 1 — WS2014/2015 — Module IN2064 — Final Exam

1 Preliminaries

• Please write your immatriculation number but not your name on every page you hand in.

• The exam is closed book. You may, however, take one A4 sheet of handwritten notes.

• The exam is limited to 2 × 60 minutes.

• If a question says “Describe in 2–3 sentences” or “Show your work” or something similar, these
mean the same: give a succinct description or explanation.

• This exam consists of 8 pages, 15 problems. You can earn up to 44 points.

Problem 1 [3 points] Fill in your immatriculation number on every sheet you hand in. Make sure
it is easily readable. Make sure you do not write your name on any sheet you hand in.

2 Linear Algebra and Probability Theory

Problem 2 [2 points] Let X and Y be two random variables. Show that

\[
\operatorname{var}[X + Y] = \operatorname{var}[X] + \operatorname{var}[Y] + 2\operatorname{cov}[X, Y]
\]

where cov[X,Y ] is the covariance between X and Y . You can use that

\[
\operatorname{var}[X] = \mathbb{E}[X^2] - \mathbb{E}^2[X]
\]


\[
\operatorname{cov}[X, Y] = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]
\]

We know [1 point]:

\[
\operatorname{var}[X] = \mathbb{E}[X^2] - \mathbb{E}^2[X]
\]


\[
\operatorname{cov}[X, Y] = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]
\]

Hence [1 point]:

\begin{align*}
\operatorname{var}[X + Y] &= \mathbb{E}[(X + Y)^2] - \mathbb{E}^2[X + Y] \\
&= \mathbb{E}[X^2 + Y^2 + 2XY] - (\mathbb{E}[X] + \mathbb{E}[Y])^2 \\
&= \mathbb{E}[X^2] + \mathbb{E}[Y^2] + 2\,\mathbb{E}[XY] - \mathbb{E}^2[X] - \mathbb{E}^2[Y] - 2\,\mathbb{E}[X]\,\mathbb{E}[Y] \\
&= \underbrace{\mathbb{E}[X^2] - \mathbb{E}^2[X]}_{=\operatorname{var}[X]} + \underbrace{\mathbb{E}[Y^2] - \mathbb{E}^2[Y]}_{=\operatorname{var}[Y]} + 2\underbrace{\left(\mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]\right)}_{=\operatorname{cov}[X,Y]}
\end{align*}
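As a quick numerical sanity check (not part of the exam), the identity can be verified on random samples with NumPy, using population (ddof=0) estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)        # correlated with x

lhs = np.var(x + y)                            # var[X + Y], population estimate
cov_xy = np.mean(x * y) - x.mean() * y.mean()  # cov[X, Y] = E[XY] - E[X]E[Y]
rhs = np.var(x) + np.var(y) + 2 * cov_xy

print(lhs, rhs)   # the two values agree up to sampling noise
```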


3 Decision Trees & kNN

You are given two-dimensional input data with corresponding targets in the plot below.
[Plot: two-dimensional data points with class targets; axes x1 and x2.]

Problem 3 [2 points] Sketch the decision boundaries of a maximally-trained decision tree classifier
using misclassification rate.

[Answer sketch: decision boundaries drawn over the data plot, axes x1 and x2.]

Problem 4 [2 points] Describe how this model can overfit the data. Describe how that problem can
be solved or prevented.

A fully grown tree keeps splitting until every training point is classified correctly, so it memorises noise and outliers in the training data instead of the underlying pattern (overfitting). Possible remedies (a short scikit-learn sketch follows below):
• prune the tree
• restrict the depth of the tree
• train an ensemble of slightly different trees -> random forests
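As an illustration (not part of the original answer key), these remedies map directly onto scikit-learn hyperparameters; the data and values below are made up:

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# X, t: the plotted 2-D inputs and class targets (hypothetical stand-ins here)
X = [[0, 1], [1, 0], [3, 4], [4, 3], [5, 5], [6, 4]]
t = [0, 0, 1, 1, 0, 1]

# prune the tree (cost-complexity pruning)
pruned_tree = DecisionTreeClassifier(ccp_alpha=0.01).fit(X, t)

# restrict the depth of the tree
shallow_tree = DecisionTreeClassifier(max_depth=2).fit(X, t)

# ensemble of slightly different trees -> random forest
forest = RandomForestClassifier(n_estimators=100, max_depth=3).fit(X, t)
```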

Problem 5 [2 points] Perform 1-NN with leave-one-out cross validation on the data in the plot.
Circle all points that are misclassified and write down the accuracy.


[Answer plot: the misclassified points are circled, axes x1 and x2.]

accuracy = 6/10
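For reference, the leave-one-out 1-NN procedure can be carried out mechanically; the sketch below assumes the plotted points are available as arrays X and t, with purely hypothetical coordinates and labels:

```python
import numpy as np

# hypothetical stand-ins for the plotted data
X = np.array([[0.0, 1.0], [1.0, 0.5], [2.0, 3.0], [3.0, 2.5], [5.0, 5.0],
              [5.5, 4.5], [6.0, 1.0], [7.0, 0.5], [1.5, 4.0], [6.5, 3.0]])
t = np.array([0, 0, 0, 1, 1, 1, 0, 0, 1, 1])

correct = 0
for i in range(len(X)):
    d = np.linalg.norm(X - X[i], axis=1)   # distances to all other points
    d[i] = np.inf                          # leave the point itself out
    nearest = np.argmin(d)
    correct += int(t[nearest] == t[i])     # vote of the single nearest neighbour

print(f"LOOCV accuracy: {correct}/{len(X)}")
```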

4 Linear Classification

Problem 6 [4 points] The decision boundary for some linear classifier on two-dimensional data
crosses axis x1 at 2 and x2 at 5. First, write down the general form of a linear classifier model (how
many parameters do you need, given the dimensions?). Calculate the coefficients (parameters).

A linear classifier on two-dimensional data needs three parameters, $w_0, w_1, w_2$ (one weight per dimension plus a bias):
\begin{align*}
w_0 + w_1 x_1 + w_2 x_2 &= 0 \\
w_0 + 2 w_1 = 0 &\;\Rightarrow\; w_1 = -\frac{w_0}{2} \\
w_0 + 5 w_2 = 0 &\;\Rightarrow\; w_2 = -\frac{w_0}{5}
\end{align*}
Set, e.g., $w_0 = -2$, which gives $w_1 = 1$ and $w_2 = \frac{2}{5}$, or anything proportional.
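A quick check (not required for the exam) that these coefficients put both axis intercepts on the decision boundary:

```python
w0, w1, w2 = -2.0, 1.0, 2.0 / 5.0

def g(x1, x2):
    """Signed value of the linear classifier; zero on the decision boundary."""
    return w0 + w1 * x1 + w2 * x2

print(g(2, 0))   # 0.0 -> the boundary crosses the x1 axis at 2
print(g(0, 5))   # 0.0 -> the boundary crosses the x2 axis at 5
```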

Problem 7 [2 points] Which basis function φ(x1 , x2 ) makes the data in the example below linearly
separable (crosses in one class, circles in the other)?


φ(x1 , x2 ) = x1 x2
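The original figure is not reproduced here, but the intended answer suggests an XOR-like arrangement: crosses where x1 and x2 share a sign, circles where they differ. Under that assumption, the single feature x1·x2 is positive for one class and negative for the other, so a threshold at zero separates them:

```python
import numpy as np

# hypothetical XOR-like points: crosses where x1*x2 > 0, circles where x1*x2 < 0
crosses = np.array([[1, 1], [2, 3], [-1, -2], [-3, -1]])
circles = np.array([[-1, 2], [-2, 1], [1, -3], [3, -1]])

phi = lambda p: p[:, 0] * p[:, 1]   # basis function phi(x1, x2) = x1 * x2

print(phi(crosses))   # all positive
print(phi(circles))   # all negative -> a threshold at 0 separates the classes
```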

5 Gaussian Processes

The posterior conditional for an MVN with distribution


      
\[
\mathbf{y} = \begin{pmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{pmatrix}
\sim \mathcal{N}\!\left( \boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix},\;
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right)
\]

is given by

\begin{align*}
p(\mathbf{y}_1 \mid \mathbf{y}_2) &= \mathcal{N}\!\left(\mathbf{y}_1 \mid \boldsymbol{\mu}_{1|2}, \Sigma_{1|2}\right) \\
\boldsymbol{\mu}_{1|2} &= \boldsymbol{\mu}_1 + \Sigma_{12} \Sigma_{22}^{-1} (\mathbf{y}_2 - \boldsymbol{\mu}_2) \\
\Sigma_{1|2} &= \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}
\end{align*}

Assume a noise-free GP with mean function

m(x) = 0

and covariance function


\[
K(x, x') = 1 + (x - 2)(x' - 2).
\]

You are given two data points (0, 4) =: x.

Problem 8 [2 points] Compute the kernel matrix (aka covariance matrix) for x.

 
\[
K = \begin{pmatrix} 5 & -3 \\ -3 & 5 \end{pmatrix}
\]

Problem 9 [6 points] Given corresponding outputs y = (2, 6), compute the posterior function values
for data points x∗ = (0, 2, 4).


\begin{align*}
K^{-1} &= \begin{pmatrix} \frac{5}{16} & \frac{3}{16} \\ \frac{3}{16} & \frac{5}{16} \end{pmatrix} \\
K_*^T &= \begin{pmatrix} 5 & -3 \\ 1 & 1 \\ -3 & 5 \end{pmatrix} \\
\boldsymbol{\mu}_{f_* \mid y} &= m(x_*) + K_*^T K^{-1} \bigl( y - m(x) \bigr) \\
&= 0 + \begin{pmatrix} 1 & 0 \\ 1/2 & 1/2 \\ 0 & 1 \end{pmatrix} \left( \begin{pmatrix} 2 \\ 6 \end{pmatrix} - 0 \right) \\
&= \begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix}
\end{align*}
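The computations of Problems 8 and 9 can be reproduced numerically; a short NumPy sketch:

```python
import numpy as np

k = lambda a, b: 1 + (a - 2) * (b - 2)      # covariance function K(x, x')

x = np.array([0.0, 4.0])                     # training inputs
y = np.array([2.0, 6.0])                     # training outputs
x_star = np.array([0.0, 2.0, 4.0])           # test inputs

K = k(x[:, None], x[None, :])                # kernel matrix, [[5, -3], [-3, 5]]
K_star = k(x_star[:, None], x[None, :])      # cross-covariances K_*^T, shape (3, 2)

mu_star = K_star @ np.linalg.solve(K, y)     # posterior mean (zero prior mean)
print(K)
print(mu_star)                               # [2. 4. 6.]
```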

Problem 10 [4 points] Which other algorithm does this resemble? Please describe the corresponding
feature space.

$K$ is a linear kernel: $K(x, x') = 1 + (x-2)(x'-2) = \varphi(x)^T \varphi(x')$ with feature map $\varphi(x) = (1,\; x - 2)$, so we are basically solving linear regression in this two-dimensional feature space.

6 Neural networks

Problem 11 [2 points] Geoffrey has a data set with input X ∈ R2 and output Y ∈ R1 . He has
neural network A with one hidden layer and 9 neurons in that layer, which can fit the data. However,
he does not know how good the model is, so he also tests neural network B with two hidden layers
and three neurons for each of these layers. Both models have biases for the hidden units only.
• How many free parameters do the two models have? Show your calculation, not just the result.
• What are the pros and cons of model A compared to model B? Mention at least one pro and
one con.

Model A and model B have 36 and 24 free parameters, respectively: A has 2·9 + 9 + 9·1 = 36, B has 2·3 + 3 + 3·3 + 3 + 3·1 = 24. Compared to B, a pro of model A is its simpler, shallower structure, which is easier to optimise; a con is that it needs more parameters, whereas B gets by with fewer parameters but its more complicated (deeper) structure is more prone to poor local minima.
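A small helper (not part of the exam) that counts parameters under the stated convention, biases for hidden units only, confirms both numbers:

```python
def count_params(layer_sizes):
    """layer_sizes = [inputs, hidden..., outputs]; biases only on hidden layers."""
    n_weights = sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
    n_biases = sum(layer_sizes[1:-1])        # hidden units only
    return n_weights + n_biases

print(count_params([2, 9, 1]))       # model A: 2*9 + 9 + 9*1 = 36
print(count_params([2, 3, 3, 1]))    # model B: 2*3 + 3 + 3*3 + 3 + 3*1 = 24
```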

Problem 12 [4 points] You know that the sum of squared errors is related to the Gaussian distribution: if you assume a normal distribution of the data around their expectation, the maximum likelihood estimate (MLE) is reached when the sum of squared errors is minimised.
The same is true for the Laplace distribution and the sum of absolute errors. In particular, if the data follows a Laplace distribution

\[
p(z \mid x, w, \beta) = \prod_{n=1}^{N} p(z_n \mid x_n, w, \beta) = \prod_{n=1}^{N} \frac{1}{2\beta} \exp\!\left( -\frac{|z_n - y(x_n, w)|}{\beta} \right)
\]


then minimising the summed absolute errors

\[
\sum_{n=1}^{N} |z_n - y(x_n, w)|
\]

leads to MLE. In these equations, x is the vector of all inputs, xn is the input of sample n, while
y(xn , w) is the neural network prediction on xn . Then, zn is the desired output for xn .

Show that the MLE of the Laplace distribution minimises the sum of absolute errors.

Taking the negative logarithm, we obtain the error function


\[
-\ln p(z \mid x, w, \beta) = \frac{1}{\beta} \sum_{n=1}^{N} |z_n - y(x_n, w)| + N \ln(2\beta).
\]

Since $\beta$ is a fixed, positive constant, maximising the likelihood function with respect to $w$ is equivalent to minimising the error given by the sum of absolute errors
\[
\sum_{n=1}^{N} |z_n - y(x_n, w)|.
\]
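A numerical illustration (not part of the exam): for a constant prediction y(x_n, w) = w on made-up targets, the Laplace MLE coincides with the minimiser of the sum of absolute errors, which is the sample median:

```python
import numpy as np

z = np.array([1.0, 2.0, 2.5, 3.0, 10.0])        # hypothetical targets
beta = 1.0

ws = np.linspace(0, 12, 2401)                    # candidate constant predictions
sae = np.abs(z[:, None] - ws[None, :]).sum(axis=0)      # sum of absolute errors
nll = sae / beta + len(z) * np.log(2 * beta)             # Laplace negative log likelihood

# all three agree: the NLL minimiser, the SAE minimiser, and the median (2.5)
print(ws[np.argmin(nll)], ws[np.argmin(sae)], np.median(z))
```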

7 Unsupervised learning

Problem 13 [2 points] Consider the plot below. The two classes are circles and triangles. What
significance do class labels have for k-means? Draw the resulting decision boundary in the plot for
k-means with two centroids.


Class labels have no significance for k-means: it is an unsupervised method and ignores the targets entirely. The exact position of the sketched boundary is not important, but it should not separate the two classes perfectly.

[Answer plot: sketched k-means decision boundary between the two clusters.]

Problem 14 [3 points] Would the separation be different using the EM algorithm with a Gaussian
mixture model using two components and individual full covariance matrices? What would it look
like? The left cluster has 200 points and the right cluster has 40 points. Draw qualitatively in the
above figure.

Yes, EM for a GMM would distinguish better between the two classes. k-means implicitly assumes clusters of the same size and spread, since it only assigns points to the nearest centroid, whereas EM for a GMM with individual full covariance matrices can model a different size and shape for each cluster. The new boundary is now a closed curve (roughly a circle) around the right cluster:


[Answer plot: the GMM decision boundary forms a closed curve around the right cluster.]
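The qualitative difference can be reproduced with scikit-learn on synthetic clusters of 200 and 40 points; this is a sketch with made-up cluster positions and spreads:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
left = rng.normal(loc=[-5.0, 0.0], scale=4.0, size=(200, 2))   # large, spread-out cluster
right = rng.normal(loc=[8.0, 0.0], scale=1.0, size=(40, 2))    # small, tight cluster
X = np.vstack([left, right])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)

# compare how many points each method assigns to its two components:
# with these made-up clusters, k-means typically cuts into the large cluster,
# while the GMM recovers the 200/40 split more closely
print(np.bincount(km.labels_))
print(np.bincount(gmm.predict(X)))
```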

Problem 15 [4 points] The likelihood for ICA is


\[
\ell(W) = \sum_{i=1}^{m} \sum_{j=1}^{n} \log p_{S_i}\!\left(w_i^T x\right) + \log |W|
\]

When calculating the gradient for this likelihood we need to compute the derivative of $\log p_{S_i}(w_i^T x)$. Here we are using $p_{S_i}(s) \approx \sigma'(s)$, where the sigmoid function is given as

\[
\sigma(a) = \frac{1}{1 + e^{-a}}
\]

with the derivative

\[
\sigma'(a) \equiv \frac{d\sigma(a)}{da} = \sigma(a)\,\bigl(1 - \sigma(a)\bigr).
\]

Show that $\displaystyle \frac{d}{ds} \log \sigma'(s) = 1 - 2\sigma(s)$.

\[
\frac{d}{ds} \log \sigma'(s)
= \frac{\sigma''(s)}{\sigma'(s)}
= \frac{\dfrac{d}{ds} \dfrac{e^{-s}}{(1+e^{-s})^2}}{\dfrac{e^{-s}}{(1+e^{-s})^2}}
= \frac{\dfrac{e^{-s}(e^{-s}-1)}{(1+e^{-s})^3}}{\dfrac{e^{-s}}{(1+e^{-s})^2}}
= \frac{e^{-s}-1}{1+e^{-s}}
= \frac{e^{-s}+1-2}{1+e^{-s}}
= 1 - \frac{2}{1+e^{-s}}
= 1 - 2\sigma(s)
\]
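A quick finite-difference check of this identity (not part of the exam):

```python
import numpy as np

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
log_sigma_prime = lambda s: np.log(sigmoid(s) * (1.0 - sigmoid(s)))  # log sigma'(s)

s = np.linspace(-4, 4, 9)
h = 1e-6
numeric = (log_sigma_prime(s + h) - log_sigma_prime(s - h)) / (2 * h)  # central difference
analytic = 1.0 - 2.0 * sigmoid(s)

print(np.max(np.abs(numeric - analytic)))   # very small: the identity holds numerically
```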

