IML19 Term1

The document is an examination paper for an Introduction to Machine Learning course, consisting of three main questions with various subparts. It covers topics such as classification types, decision tree algorithms, neural networks, overfitting, K-means clustering, Gaussian Mixture Models, and hyperparameters. Students are required to answer two questions within a specified time frame and demonstrate their understanding of machine learning concepts through calculations and explanations.

Uploaded by

Alexander Arzt

PAPER C395

INTRODUCTION TO MACHINE LEARNING

Tuesday 17 March 2020, 10:00

Duration: 90 minutes
Post-processing time: 30 minutes
Answer TWO questions

Paper contains 3 questions

1a You are given three different problems below. For each problem, classify it as
either supervised learning, unsupervised learning, or reinforcement learning.
Also justify (in one sentence) why you classified each problem as such.

i) A book distributor has a collection of books, which it has classified into
different categories, e.g. “Young adults”, “Science fiction”, “Biography”,
and “Horror”. It wants to use this information to build a system to classify
its new products automatically.

ii) A supermarket has a database of its customers, and wants to automatically
discover and group its customers into different market segments to target
them separately.

iii) A group of aviation companies wants to develop a machine learning
algorithm which can predict the $(x, y)$ coordinates pinpointing the location
of a plane that has crashed. To develop this, the companies have collected
historical plane crash data, which include the coordinates of the planes’
crash sites.

b You are given the dataset below. Each row is a sample email annotated as spam
(or not), given whether or not a word appears in the email (0 indicates that the
word does not appear in the email, 1 indicates that it does).

#    cash  win  debt  home  Spam?
1    0     1    0     0     no
2    1     1    1     0     yes
3    1     0    0     0     no
4    0     0    1     0     no
5    1     1    0     1     yes
6    1     1    1     1     yes
7    0     1    1     0     yes
8    1     0    1     1     no
9    0     0    0     0     no
10   1     1    0     0     yes

i) Using the Information Gain metric, which attribute will be selected as the
root node of a decision tree classifier? Please show all calculations
(including the Information Gain of all candidate nodes) to justify your
answer.

ii) Give one reason why one would need to prune a decision tree. Also
describe (in one or two sentences) how a validation set can be useful in
performing pruning.
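
For part b(i), the calculation can be checked with a short script. This is a minimal sketch, not a model answer: it assumes binary entropy in bits (log base 2) and that the gain of an attribute is the parent entropy minus the size-weighted entropy of its two subsets. The names `rows`, `entropy`, `information_gain`, `gains` and `best` are illustrative and do not come from the paper.

```python
from math import log2

# Rows copied from the table in part b: (cash, win, debt, home, spam label).
rows = [
    (0, 1, 0, 0, "no"),
    (1, 1, 1, 0, "yes"),
    (1, 0, 0, 0, "no"),
    (0, 0, 1, 0, "no"),
    (1, 1, 0, 1, "yes"),
    (1, 1, 1, 1, "yes"),
    (0, 1, 1, 0, "yes"),
    (1, 0, 1, 1, "no"),
    (0, 0, 0, 0, "no"),
    (1, 1, 0, 0, "yes"),
]
attributes = ["cash", "win", "debt", "home"]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        result -= p * log2(p)
    return result


def information_gain(rows, index):
    """Parent entropy minus the weighted entropy after splitting on one attribute."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in (0, 1):
        subset = [row[-1] for row in rows if row[index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(rows, i) for i, name in enumerate(attributes)}
best = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"IG({name}) = {gain:.4f}")
print("Root attribute:", best)
```

Running the sketch prints one gain per candidate attribute, which is the set of values the question asks to be shown by hand.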

© Imperial College London 2019 - 2020    Paper C395

c Consider the training dataset below for a single-variable regression problem.

i   $x^{(i)}$   $y^{(i)}$   $d(x^{(q)}, x^{(i)})$   $w_q^{(i)}$
1   1.5         3.16        ???                     ???
2   2.3         1.45        ???                     ???
3   3.0         1.07        ???                     ???
4   3.8         2.01        ???                     ???
5   4.9         4.51        ???                     ???

i) At test time, you are given a query $x^{(q)} = 4.2$.
Given $d(x^{(q)}, x^{(i)}) = |x^{(q)} - x^{(i)}|$ and $w_q^{(i)} = \frac{1}{d(x^{(q)}, x^{(i)})}$, where $|x|$
indicates the absolute value of $x$, please complete the table above.

ii) Predict the output $y^{(q)}$ for $x^{(q)} = 4.2$ using the k-nearest neighbours
regression algorithm with $d(x^{(q)}, x^{(i)})$ as its distance measure, and
assuming $k = 3$. Show your calculation.

iii) Now predict the output $y^{(q)}$ for $x^{(q)} = 4.2$ using the locally weighted
k-nearest neighbours regression algorithm. Use $k = 3$, the distance
measure $d(x^{(q)}, x^{(i)})$, and the weights $w_q^{(i)}$. Show your calculation.
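
The three parts of question c can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the marking scheme: plain k-NN regression is taken to be the unweighted mean of the k nearest targets, and the locally weighted variant is taken to be the weight-normalised average of those same targets. The variable names are illustrative only.

```python
# Training data copied from the table in part c.
xs = [1.5, 2.3, 3.0, 3.8, 4.9]
ys = [3.16, 1.45, 1.07, 2.01, 4.51]
query = 4.2
k = 3

# Part i: absolute distances and inverse-distance weights.
# Assumes no training point coincides with the query (distance 0).
distances = [abs(query - x) for x in xs]
weights = [1 / d for d in distances]

# Indices of the k training points closest to the query.
nearest = sorted(range(len(xs)), key=lambda i: distances[i])[:k]

# Part ii: plain k-NN regression, assumed to be the mean of the neighbours' targets.
knn_prediction = sum(ys[i] for i in nearest) / k

# Part iii: locally weighted k-NN, assumed to be the weight-normalised average.
weighted_prediction = (
    sum(weights[i] * ys[i] for i in nearest) / sum(weights[i] for i in nearest)
)

for i in range(len(xs)):
    print(f"i={i + 1}  d={distances[i]:.1f}  w={weights[i]:.4f}")
print(f"k-NN prediction:     {knn_prediction:.4f}")
print(f"weighted prediction: {weighted_prediction:.4f}")
```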

The three parts carry, respectively, 30%, 40%, and 30% of the marks.


2a Consider a fully-connected feedforward neural network for regression with 1
hidden layer and a single output neuron. Both the hidden and the output layers
use sigmoid activation. Mean squared error is used as the loss function for
optimisation.
Write out the necessary calculations for updating both the connection weights
and bias weights in the last layer using gradient descent. You are given the
matrix of input features X with N datapoints, output values Ŷ from the network,
the desired targets (labels) Y, and the current network weights.
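
One possible layout of the last-layer calculation is sketched below. The notation is an assumption introduced for illustration and is not given in the paper: $H$ denotes the hidden-layer activations ($N \times M$), $W$ and $b$ the last-layer weights and bias, $Z$ the pre-activation of the output neuron, and $\alpha$ the learning rate. The loss is taken to be the mean over the $N$ datapoints; a different scaling convention only changes the constant factor.

```latex
% Assumed notation (not in the paper): H = hidden activations, W, b = last-layer
% weights and bias, Z = output pre-activation, \alpha = learning rate.
\begin{align*}
Z &= H W + b, \qquad \hat{Y} = \sigma(Z) = \frac{1}{1 + e^{-Z}} \\
L &= \frac{1}{N} \sum_{n=1}^{N} \left(\hat{y}_n - y_n\right)^2 \\
\frac{\partial L}{\partial \hat{y}_n} &= \frac{2}{N}\left(\hat{y}_n - y_n\right), \qquad
\frac{\partial \hat{y}_n}{\partial z_n} = \hat{y}_n\left(1 - \hat{y}_n\right) \\
\delta_n &= \frac{\partial L}{\partial z_n}
  = \frac{2}{N}\left(\hat{y}_n - y_n\right)\hat{y}_n\left(1 - \hat{y}_n\right) \\
\frac{\partial L}{\partial W} &= H^{\top} \delta, \qquad
\frac{\partial L}{\partial b} = \sum_{n=1}^{N} \delta_n \\
W &\leftarrow W - \alpha \frac{\partial L}{\partial W}, \qquad
b \leftarrow b - \alpha \frac{\partial L}{\partial b}
\end{align*}
```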

b Explain the concept of overfitting. Name 3 methods you can use to deal with
overfitting and explain how each of them helps.

c A bank has developed a machine learning model for automatically identifying
fraudulent card transactions, which will then be manually reviewed. You run
some examples through the model and get the following results:

Example nr   True label   Predicted label
1            fraud        real
2            real         real
3            real         fraud
4            real         real
5            fraud        real
6            real         real
7            real         real
8            fraud        fraud
9            real         real
10           real         real
11           real         real
12           fraud        real
13           fraud        fraud
14           real         real

i) Construct the confusion matrix for this output.

ii) Calculate the classification accuracy, along with precision, recall and F1
for both classes.

iii) Analyse the results. Are there any issues? If so, which metrics identify
them?
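
The counts and metrics for parts i) and ii) can be cross-checked with the sketch below. It assumes the usual one-vs-rest definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 as their harmonic mean), computed once with "fraud" and once with "real" as the positive class. The helper `class_metrics` is a hypothetical name, not something defined in the paper.

```python
# Labels copied row by row from the table in part c.
true_labels = [
    "fraud", "real", "real", "real", "fraud", "real", "real",
    "fraud", "real", "real", "real", "fraud", "fraud", "real",
]
predicted_labels = [
    "real", "real", "fraud", "real", "real", "real", "real",
    "fraud", "real", "real", "real", "real", "fraud", "real",
]


def class_metrics(true_labels, predicted_labels, positive):
    """Confusion counts, precision, recall and F1 with `positive` as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}


accuracy = sum(t == p for t, p in zip(true_labels, predicted_labels)) / len(true_labels)
metrics = {
    label: class_metrics(true_labels, predicted_labels, label)
    for label in ("fraud", "real")
}

print(f"accuracy = {accuracy:.3f}")
for label, m in metrics.items():
    print(label, {key: round(value, 3) for key, value in m.items()})
```

Comparing the per-class recall values with the overall accuracy is one way to approach part iii).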

The three parts carry, respectively, 40%, 30%, and 30% of the marks.


3a Suppose at an update step, the K-means algorithm computes 3 cluster centroids:
$\mu_1 = \langle -3, -1 \rangle$, $\mu_2 = \langle 1, 2 \rangle$, and $\mu_3 = \langle -4, 1 \rangle$. It then executes a cluster
assignment step. Assume that the algorithm uses a Euclidean distance measure.
To which cluster will the training example $x^{(i)} = \langle -2, 0 \rangle$ be assigned after the
cluster assignment step? Justify your answer by showing your calculations.
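
The assignment step in part a amounts to computing three Euclidean distances and taking the smallest. A minimal sketch, assuming Python 3.8 or later for `math.dist`; the dictionary keys 1 to 3 stand for the cluster indices and the variable names are illustrative:

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

# Centroids and training example copied from part a.
centroids = {1: (-3, -1), 2: (1, 2), 3: (-4, 1)}
x = (-2, 0)

# Distance from the example to every centroid.
distances = {k: dist(x, mu) for k, mu in centroids.items()}

# The example is assigned to the cluster with the closest centroid.
assigned = min(distances, key=distances.get)

for k, d in distances.items():
    print(f"distance to centroid {k}: {d:.4f}")
print("assigned cluster:", assigned)
```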

b Consider below the parameters $\theta = \{\pi_k, \mu_k, \sigma_k^2 : k = 1, 2, 3\}$ of a univariate
Gaussian Mixture Model with 3 components that has been fitted to a set of
training examples, where $\pi_k$ is the mixing proportion of component $k$, $\mu_k$ the
mean of component $k$, and $\sigma_k^2$ the variance for component $k$:

k   $\pi_k$   $\mu_k$   $\sigma_k^2$
1   0.5       -2        1
2   0.3       1         4
3   0.2       4         0.25

Suppose you are given an example $x^{(i)} = 0$ at test time. Compute the probability
density for $p(x^{(i)} \mid \theta)$ given the parameters of the Gaussian Mixture Model above.
Show your calculations.

Hint: the Gaussian distribution is defined as: $\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
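
The mixture density in part b is the sum of the three component densities, each scaled by its mixing proportion. The sketch below applies the Gaussian formula from the hint directly; the function names `gaussian_pdf` and `mixture_density` are illustrative and not part of the paper.

```python
from math import exp, pi, sqrt

# (mixing proportion, mean, variance) for each component, copied from the table.
components = [
    (0.5, -2.0, 1.0),
    (0.3, 1.0, 4.0),
    (0.2, 4.0, 0.25),
]


def gaussian_pdf(x, mean, variance):
    """Univariate Gaussian density, as given in the hint."""
    return exp(-((x - mean) ** 2) / (2 * variance)) / sqrt(2 * pi * variance)


def mixture_density(x, components):
    """Sum of the component densities weighted by their mixing proportions."""
    return sum(weight * gaussian_pdf(x, mean, variance)
               for weight, mean, variance in components)


density = mixture_density(0.0, components)

for k, (weight, mean, variance) in enumerate(components, start=1):
    print(f"component {k}: {weight * gaussian_pdf(0.0, mean, variance):.6g}")
print(f"p(x = 0 | theta) = {density:.4f}")
```

The per-component terms show that the third component contributes almost nothing at $x = 0$, since its mean lies eight standard deviations away.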

c When developing a neural network, which activation function and loss function
would you use in the output layer for the following applications? Justify your
decisions.

i) Predicting the temperature for tomorrow, based on the weather today.

ii) Generating text by predicting the next word in the sequence based on the
previous words.

iii) Detecting whether the camera image from a self-driving car contains a stop
sign.

d Hyperparameters are something that we need to deal with when designing
machine learning models.

i) Explain what hyperparameters are and give 2 examples.

ii) Given a dataset of 10,000 datapoints, how would you use it to find good
hyperparameters?
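
One common way to approach part d(ii) is a held-out validation set. The sketch below only shows the data split; the 80/10/10 proportions and the fixed seed are assumptions chosen for illustration, and cross-validation would be an equally reasonable alternative.

```python
import random

N = 10_000  # dataset size given in the question

# Shuffle the datapoint indices once with a fixed seed (assumed, for reproducibility).
indices = list(range(N))
random.Random(0).shuffle(indices)

# Assumed 80/10/10 split into training, validation and test sets.
n_train = N * 8 // 10
n_val = N // 10
train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]

# Intended use:
#   1. For each candidate hyperparameter setting, fit the model on train_idx.
#   2. Score each fitted model on val_idx and keep the best setting.
#   3. Evaluate the chosen model once on test_idx to estimate generalisation.
print(len(train_idx), len(val_idx), len(test_idx))
```

Keeping the test indices out of the selection loop is what stops the final score from being biased by the hyperparameter search.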

The four parts carry, respectively, 20%, 30%, 30%, and 20% of the marks.
