
IMPERIAL COLLEGE LONDON

TIMED REMOTE ASSESSMENTS 2021-2022

BEng Honours Degree in Computing Part III


BEng Honours Degree in Electronic and Information Engineering Part III
MEng Honours Degree in Electronic and Information Engineering Part III
MEng Honours Degree in Electronic and Information Engineering Part IV
BEng Honours Degree in Mathematics and Computer Science Part III
MEng Honours Degree in Mathematics and Computer Science Part III
MEng Honours Degrees in Computing Part III
MSc Advanced Computing
MSc Artificial Intelligence
MSc in Computing (Specialism)
for Internal Students of the Imperial College of Science, Technology and Medicine
This paper is also taken for the relevant assessments for the
Associateship of the City and Guilds of London Institute

PAPER COMP70050=COMP97101=COMP97151

INTRODUCTION TO MACHINE LEARNING (TERM1)

Tuesday 14 December 2021, 10:00


Writing time: 90 minutes
Upload time: 25 minutes

Answer ALL THREE questions


Open book assessment

This time-limited remote assessment has been designed to be open book. You may use resources which have been identified by the examiner to
complete the assessment and are included in the instructions for the examination. You must not use any additional resources when completing
this assessment.
The use of the work of another student, past or present, constitutes plagiarism. Giving your work to another student to use constitutes an
offence. Collusion is a form of plagiarism and will be treated in a similar manner. This is an individual assessment and thus should be completed
solely by you. The College will investigate all instances where an examination or assessment offence is reported or suspected, using plagiarism
software, vivas and other tools, and apply appropriate penalties to students. In all examinations we will analyse exam performance against
previous performance and against data from previous years and use an evidence-based approach to maintain a fair and robust examination. As
with all exams, the best strategy is to read the question carefully and answer as fully as possible, taking account of the time and number of marks
available.

Paper contains 3 questions


1a You are building a binary K-nearest neighbours classifier.
You are given the following table for 6 test instances.
Label represents the correct ground truth label for the test instance.
NN1 represents the label of the training instance closest to the test instance.
NN2 represents the label of the second closest training instance. NN3 the third
closest, NN4 the fourth closest, NN5 the fifth closest.

                                       Predictions
#   NN1  NN2  NN3  NN4  NN5  Label   (1-NN)  (3-NN)  (5-NN)
1    +    +    +    ×    +     +       +
2    +    ×    +    +    ×     +       +
3    ×    +    ×    +    +     +       ×
4    ×    +    ×    ×    ×     ×       ×
5    +    ×    ×    ×    +     ×       +
6    ×    +    +    ×    +     ×       ×
i) Compute the predicted labels (+ or ×) for a 3-nearest neighbours and
5-nearest neighbours classifier for each test instance in the table above.
The predictions for a 1-nearest neighbour are provided as an example.
There is no need to provide any calculations for this question. Just provide
the predicted labels (either by filling in the table above, or by writing them
down on a blank sheet).

ii) In one sentence, explain why increasing the K in a K-nearest neighbours
classifier generally results in better accuracy.

iii) Assume the weights for the five neighbours of test instance #6 are 0.6,
0.15, 0.1, 0.1, and 0.05 respectively. What is the prediction for a
distance-weighted nearest neighbour classifier for this test instance (+ or
×)? Justify your answer in one sentence (there is no need to show your
calculations).
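
The voting rules behind parts i and iii can be sketched in Python as follows. This is only an illustration under stated assumptions, not a model answer: the function names `knn_vote` and `weighted_vote` are hypothetical, the ASCII character `x` stands in for the × label, and the neighbour labels are assumed to be listed closest first.

```python
from collections import Counter, defaultdict


def knn_vote(neighbour_labels, k):
    """Return the majority label among the k closest neighbours (closest first)."""
    return Counter(neighbour_labels[:k]).most_common(1)[0][0]


def weighted_vote(neighbour_labels, weights):
    """Return the label whose neighbours carry the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(neighbour_labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Neighbour labels NN1..NN5 of test instance #6 from the table ("x" stands for ×).
instance_6 = ["x", "+", "+", "x", "+"]
for k in (1, 3, 5):
    print(f"{k}-NN:", knn_vote(instance_6, k))
print("weighted:", weighted_vote(instance_6, [0.6, 0.15, 0.1, 0.1, 0.05]))
```

With two classes and an odd K the unweighted vote cannot tie, so no tie-breaking rule is needed in this sketch.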

© Imperial College London 2021 - 2022 Paper COMP70050=COMP97101=COMP97151 Page 1 of 6


b You would like to train a decision tree that classifies a book into
one of three genres: fantasy, romance, or horror.
You want to find out whether the frequency of the word love in the book is a
good attribute for your decision tree. You have discretised the frequency of the
word into two groups: low and high.
Out of the 100 books you have, 30 are fantasy, 40 are romance, and 30 are
horror.
Out of the 30 fantasy books, 15 mention the word love with low frequency, and
15 mention the word with high frequency.
Out of the 40 romance books, 10 mention the word love with low frequency, and
30 mention the word with high frequency.
Out of the 30 horror books, 25 mention the word love with low frequency, and 5
mention the word with high frequency.
Compute the information gain for selecting the frequency of the word love as
an attribute for your decision tree, with respect to the initial entropy of the whole
dataset of 100 books.
Please use log2 for all calculations, and show all intermediate calculations.
Hint: Entropies can be greater than 1 when there are more than two categories.
You may find the following calculations useful:
0.1 × log2 (0.1) = −0.3322
0.2 × log2 (0.2) = −0.4644
0.3 × log2 (0.3) = −0.5211
0.4 × log2 (0.4) = −0.5288
0.5 × log2 (0.5) = −0.5000
0.6 × log2 (0.6) = −0.4422
0.7 × log2 (0.7) = −0.3602
0.8 × log2 (0.8) = −0.2575
0.9 × log2 (0.9) = −0.1368
1.0 × log2 (1.0) = 0
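
The information gain computation described above can be sketched in Python as follows. This is a minimal sketch: the function names `entropy` and `information_gain` are hypothetical, class counts are ordered as (fantasy, romance, horror), and the counts are the ones stated in the question.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, subsets):
    """Parent entropy minus the size-weighted average entropy of the subsets."""
    total = sum(parent_counts)
    remainder = sum(sum(s) / total * entropy(s) for s in subsets)
    return entropy(parent_counts) - remainder


all_books = [30, 40, 30]   # fantasy, romance, horror
low_freq = [15, 10, 25]    # books mentioning "love" with low frequency
high_freq = [15, 30, 5]    # books mentioning "love" with high frequency

print(f"H(all)  = {entropy(all_books):.4f}")
print(f"H(low)  = {entropy(low_freq):.4f}")
print(f"H(high) = {entropy(high_freq):.4f}")
print(f"IG      = {information_gain(all_books, [low_freq, high_freq]):.4f}")
```

Each probability that appears (0.1, 0.2, 0.3, 0.4, 0.5, 0.6) matches one of the precomputed products listed above, so the same result can be reached by hand.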

The two parts carry equal marks.



2 You are building a machine learning model to predict the result of taking a
French language test, based on three features: the number of years the person has
studied French (x1 ), the number of years they have studied any foreign language
(x2 ), and the number of hours they spent preparing for this test (x3 ). The feature
values are normalised to be in a range between 0 and 1.
You have the following neural network architecture:

The network takes the three features as input. There is one hidden neuron (h)
with tanh activation. The output (ŷ) is a single neuron with linear activation,
predicting the resulting mark in the range 0 to 10. The weights of the
network have been randomly initialised and can be seen on the diagram. The
network does not have any bias parameters.
If you need to make any assumptions, state them clearly in your answer.

a You are given normalised feature values for one test taker: 0.5 for the number of
years the person has studied French, 0.6 for the number of years they have
studied any foreign language, and 0.7 for the number of hours they spent
preparing for this test.
Find the result that the model predicts for this person. Show your
calculations and intermediate steps.
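
The forward pass for this architecture can be sketched as below. The weights shown on the diagram are not available in this text, so the values of `w` and `v` are placeholders chosen purely for illustration; substitute the diagram's values to obtain the actual prediction. The function name `forward` is also hypothetical.

```python
import math


def forward(x, w, v):
    """Inputs -> one tanh hidden neuron -> one linear output neuron, no biases."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x))  # hidden pre-activation
    h = math.tanh(z)                              # hidden activation
    y_hat = v * h                                 # linear output
    return h, y_hat


x = [0.5, 0.6, 0.7]  # feature values given in part a
w = [0.1, 0.2, 0.3]  # PLACEHOLDER input-to-hidden weights (not from the diagram)
v = 2.0              # PLACEHOLDER hidden-to-output weight (not from the diagram)

h, y_hat = forward(x, w, v)
print(f"h = {h:.4f}, y_hat = {y_hat:.4f}")
```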

b After the person takes the test, they receive a mark of 5.0 as their official result. You
now want to use this knowledge to improve your model.
Calculate the updated values for all the trainable parameters in this network after
one step of stochastic gradient descent using this datapoint. Use mean squared
error as the loss function and 0.5 as the learning rate.
Show the path of your calculations.
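
One stochastic gradient descent step for this network can be sketched as follows. Two assumptions are made and should be checked against the course conventions: the loss for the single datapoint is taken as L = (ŷ − y)² without a factor of 1/2, and the initial weights `w` and `v` are placeholders because the diagram's values are not available in this text. The function name `sgd_step` is hypothetical.

```python
import math


def sgd_step(x, y_true, w, v, lr):
    """One SGD update for inputs -> tanh hidden neuron -> linear output (no biases)."""
    # Forward pass
    z = sum(w_i * x_i for w_i, x_i in zip(w, x))
    h = math.tanh(z)
    y_hat = v * h

    # Backward pass for L = (y_hat - y_true)^2  (ASSUMED form of the loss)
    d_y_hat = 2.0 * (y_hat - y_true)     # dL/dy_hat
    d_v = d_y_hat * h                    # dL/dv
    d_z = d_y_hat * v * (1.0 - h * h)    # dL/dz, using tanh'(z) = 1 - tanh(z)^2
    d_w = [d_z * x_i for x_i in x]       # dL/dw_i

    # Gradient descent update
    new_w = [w_i - lr * g for w_i, g in zip(w, d_w)]
    new_v = v - lr * d_v
    return new_w, new_v


x = [0.5, 0.6, 0.7]  # feature values from part a
w = [0.1, 0.2, 0.3]  # PLACEHOLDER weights (not from the diagram)
v = 2.0              # PLACEHOLDER weight (not from the diagram)

new_w, new_v = sgd_step(x, y_true=5.0, w=w, v=v, lr=0.5)
print("updated w:", [round(value, 4) for value in new_w])
print("updated v:", round(new_v, 4))
```

Note that the gradient for `w` uses the value of `v` from before the update, since all gradients are computed from the same forward pass.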

c Below is a table with predicted scores and true results for 5 test takers. You want
to evaluate how well the model is able to predict who passes the test (receives a
score ≥ 6.0). Calculate precision, recall and F-score for the system.



Person ID   Predicted result   True result
1           6.44               7.5
2           6.82               5
3           5.37               6.5
4           8.59               9
5           4.21               8
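
The evaluation in part c can be sketched as follows, treating "pass" (score ≥ 6.0) as the positive class. The function name `precision_recall_f1` is hypothetical, and the F-score is assumed to be the F1 score (the harmonic mean of precision and recall), since the question does not specify a β.

```python
def precision_recall_f1(predicted, actual, threshold):
    """Precision, recall and F1 for the positive class 'score >= threshold'."""
    pred_pos = [p >= threshold for p in predicted]
    true_pos = [a >= threshold for a in actual]

    tp = sum(p and t for p, t in zip(pred_pos, true_pos))        # predicted pass, did pass
    fp = sum(p and not t for p, t in zip(pred_pos, true_pos))    # predicted pass, did fail
    fn = sum((not p) and t for p, t in zip(pred_pos, true_pos))  # predicted fail, did pass

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


predicted = [6.44, 6.82, 5.37, 8.59, 4.21]  # "Predicted result" column
actual = [7.5, 5, 6.5, 9, 8]                # "True result" column

precision, recall, f1 = precision_recall_f1(predicted, actual, threshold=6.0)
print(f"precision = {precision:.3f}, recall = {recall:.3f}, F1 = {f1:.3f}")
```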

The three parts carry, respectively, 30%, 50%, and 20% of the marks.



3 After running the MAP-Elites algorithm several times independently, the
content of all the grids is reported in the table below:
index   Gen/Phen          BD     Fitness
1       0.9  0.8  0.4     4  7   0.4
2       0.5  0.2  0.5     7  1   1.1
3       0.2  0.2  0.9     9  2   1.2
4       0.2  0.9  0.5     9  3   1.0
5       0.1  0.2  0.9     2  1   1.2
6       0.2  0.7  0.1     8  9   0.9
7       0.2  0.3  0.3     4  8   0.6
8       0.9  1.0  0.3     4  8   0.5
9       0.2  0.4  0.4     3  4   1.1
10      0.3  0.4  0.5     6  6   0.5
11      1.0  0.2  0.5     7  2   1.1
12      0.2  0.7  0.1     8  9   0.9
13      0.4  0.0  0.6     1  1   1.5
14      0.2  0.7  0.1     8  9   0.9
In this experiment, there is no difference between the genotype and the
phenotype (i.e., the function to develop the genotype into the phenotype is the
identity function). The behavioural descriptor (BD) is a 2D vector. The index
column corresponds to the row number of the table, which is provided to more
easily refer to elements in the table, if necessary.

a Initially, the grid was composed of 100 cells (i.e., 10x10 resolution). However,
we can observe that a significantly lower number of solutions has been reported.
To avoid this, we want to group all the found solutions into three clusters.
Apply one iteration of the k-Means algorithm with k = 3 and the following
data-points as initial values for the centroids. Show the details of your
calculations.
Please use the Manhattan distance (the L1-norm) as the distance metric for
this question.
Initial Centroids
c1 1 1
c2 9 1
c3 8 9
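
One iteration of k-Means on the BD column, using the Manhattan distance for the assignment step, can be sketched as follows. The function names are hypothetical, and two assumptions are made: centroids are updated as the arithmetic mean of their members (the usual k-Means update), and a distance tie would go to the centroid with the lowest index.

```python
def manhattan(a, b):
    """L1 distance between two points."""
    return sum(abs(a_i - b_i) for a_i, b_i in zip(a, b))


def kmeans_iteration(points, centroids):
    """Assign each point to its nearest centroid, then recompute each centroid as a mean."""
    assignments = [
        min(range(len(centroids)), key=lambda j: manhattan(p, centroids[j]))
        for p in points
    ]
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, a in zip(points, assignments) if a == j]
        if not members:
            new_centroids.append(tuple(old))  # leave an empty cluster's centroid unchanged
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return assignments, new_centroids


# BD column of the table, in index order 1..14.
bds = [(4, 7), (7, 1), (9, 2), (9, 3), (2, 1), (8, 9), (4, 8),
       (4, 8), (3, 4), (6, 6), (7, 2), (8, 9), (1, 1), (8, 9)]
initial = [(1, 1), (9, 1), (8, 9)]  # c1, c2, c3

assignments, updated = kmeans_iteration(bds, initial)
for index, cluster in enumerate(assignments, start=1):
    print(f"solution {index} -> c{cluster + 1}")
print("updated centroids:", updated)
```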

b To make this question independent from the previous one, let’s assume that we
ran k-Means with k = 4 and obtained the following centroids:
c1 1 3
c2 7 3
c3 5 7
c4 8 9



i) We want to replace the original grid of MAP-Elites with the clusters and
centroids obtained from k-Means. For this, we need to update the function
of MAP-Elites “add_to_collection(solution)” that takes as input a solution
and adds it to the collection of solutions if the solution is either novel, or
better than a previously encountered one. This function returns nothing.
Write the pseudo code of the new version of this function so that it uses
the clusters and centroids defined above instead of a grid.
You can assume that a solution has these three attributes: Genotype, BD,
Fitness.
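
One possible shape for such a function is sketched below in Python rather than pseudo code. It is an illustration under assumptions, not the expected answer: the collection is kept as a module-level dictionary keyed by cluster index, the nearest centroid is found with the Manhattan distance (the question does not fix a metric for this part), "better" means strictly higher fitness, and the `Solution` class and `nearest_centroid` helper are hypothetical names.

```python
from dataclasses import dataclass

# Centroids c1..c4 from part b; the cluster index (0..3) plays the role of a grid cell.
CENTROIDS = [(1, 3), (7, 3), (5, 7), (8, 9)]

# Maps a cluster index to the best solution found so far in that cluster.
collection = {}


@dataclass
class Solution:
    genotype: tuple
    bd: tuple
    fitness: float


def nearest_centroid(bd):
    """Index of the centroid closest to a behavioural descriptor (Manhattan distance ASSUMED)."""
    return min(
        range(len(CENTROIDS)),
        key=lambda j: sum(abs(b - c) for b, c in zip(bd, CENTROIDS[j])),
    )


def add_to_collection(solution):
    """Store the solution if its cluster is empty (novel) or it beats the current occupant."""
    niche = nearest_centroid(solution.bd)
    occupant = collection.get(niche)
    if occupant is None or solution.fitness > occupant.fitness:
        collection[niche] = solution


# Rows 5 and 13 of the table fall in the same cluster, so only the fitter one is kept.
add_to_collection(Solution((0.1, 0.2, 0.9), (2, 1), 1.2))
add_to_collection(Solution((0.4, 0.0, 0.6), (1, 1), 1.5))
print(collection)
```

Because the comparison is strict, a solution that only equals the occupant's fitness does not replace it; using `>=` instead would be an equally defensible choice.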

ii) Assume that we start with an empty collection. Use your
“add_to_collection(solution)” function to potentially add in the collection
each sample of the dataset provided at the beginning of this question. Give
the content of the collection at the end of this process.

iii) Quantify the diversity, the performance and the QD score of the
collection created above.
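
The three quantities can be computed from the fitness values stored in a collection, as sketched below. The definitions used here are common conventions and should be treated as assumptions to check against the course material: diversity as the number of occupied niches, performance as the maximum (or mean) fitness, and QD score as the sum of all fitness values. The function name `qd_metrics` and the example values are hypothetical.

```python
def qd_metrics(fitness_by_niche):
    """Summary metrics for a collection given as {niche index: fitness}."""
    fitnesses = list(fitness_by_niche.values())
    if not fitnesses:
        return {"diversity": 0, "max_fitness": None, "mean_fitness": None, "qd_score": 0.0}
    return {
        "diversity": len(fitnesses),                      # number of occupied niches
        "max_fitness": max(fitnesses),                    # best solution in the collection
        "mean_fitness": sum(fitnesses) / len(fitnesses),  # average quality
        "qd_score": sum(fitnesses),                       # total quality over all niches
    }


# HYPOTHETICAL collection with three occupied niches, for illustration only.
example = {0: 0.7, 1: 1.3, 2: 0.5}
print(qd_metrics(example))
```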

c We use a (µ + λ)-ES to improve the performance of the solution with index 14
(also reported in the table below). The parameters are set as follows: µ = 1 and
λ = 5, and the results of the evaluation of the offspring (O_k) from the first
generation are listed below:
index   Gen/Phen          Fitness
14      0.2  0.7  0.1     0.9
O1      0.3  0.7  0.1     0.7
O2      0.2  0.8  0.1     0.6
O3      0.2  0.7  0.2     0.7
O4      0.1  0.6  0.1     0.5
O5      0.3  0.8  0.1     0.8
i) Give the fitness and genotype of the parent(s) selected for the second
generation.
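
The selection step of a (µ + λ)-ES can be sketched as follows: the next parents are the µ fittest individuals from the combined pool of current parents and offspring. The function name `plus_selection` is hypothetical, individuals are represented as (genotype, fitness) pairs, and a fitness tie is assumed to favour the individual listed first (parents before offspring).

```python
def plus_selection(parents, offspring, mu):
    """(mu + lambda) selection: keep the mu fittest among parents and offspring."""
    pool = parents + offspring
    # sorted() is stable, so on equal fitness the earlier entry (a parent) is kept.
    return sorted(pool, key=lambda individual: individual[1], reverse=True)[:mu]


parents = [((0.2, 0.7, 0.1), 0.9)]  # solution with index 14
offspring = [
    ((0.3, 0.7, 0.1), 0.7),  # O1
    ((0.2, 0.8, 0.1), 0.6),  # O2
    ((0.2, 0.7, 0.2), 0.7),  # O3
    ((0.1, 0.6, 0.1), 0.5),  # O4
    ((0.3, 0.8, 0.1), 0.8),  # O5
]

print(plus_selection(parents, offspring, mu=1))
```

In a (µ, λ)-ES, by contrast, the pool would contain only the offspring, so the current parent could not survive into the next generation.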

ii) Instead of using an ES algorithm, we want to use the gradient ascent
algorithm. Explain in two or three sentences how this can work in the
context of a black box problem.
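
One way gradient ascent can be applied when only fitness evaluations are available is to estimate the gradient numerically, for example with finite differences, as sketched below. This is one possible approach among several (sampling-based estimates are another), and the function names and the `toy_fitness` function are hypothetical stand-ins for the real black-box evaluation.

```python
def estimate_gradient(f, x, eps=1e-4):
    """Central finite-difference estimate of the gradient of a black-box function f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def gradient_ascent(f, x0, lr=0.1, steps=50):
    """Repeatedly move x a small step along the estimated gradient of f."""
    x = list(x0)
    for _ in range(steps):
        grad = estimate_gradient(f, x)
        x = [x_i + lr * g_i for x_i, g_i in zip(x, grad)]
    return x


def toy_fitness(x):
    """HYPOTHETICAL black-box fitness with its maximum at (1, 1, 1)."""
    return -sum((x_i - 1.0) ** 2 for x_i in x)


# Start from the genotype of solution 14 and climb the toy fitness landscape.
print(gradient_ascent(toy_fitness, [0.2, 0.7, 0.1]))
```

Each gradient estimate costs two fitness evaluations per genotype dimension, which is the main price of not having an analytical gradient.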

The three parts carry, respectively, 40%, 40%, and 20% of the marks.
