Answers 2024
a) Calculate the covariance and correlation coefficient by completing the table below. Show all
working.
b) What is the difference between covariance and correlation? Support your answer with graphs.
X    Y    (X − X̄)    (Y − Ȳ)    (X − X̄)(Y − Ȳ)
3    0
1    3
5    6
6    9
5    7
Solution
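X    Y    (X − X̄)       (Y − Ȳ)       (X − X̄)(Y − Ȳ)
3    0    3 − 4 = −1    0 − 5 = −5    (−1)(−5) = 5
1    3    1 − 4 = −3    3 − 5 = −2    6
5    6    5 − 4 = 1     6 − 5 = 1     1
6    9    6 − 4 = 2     9 − 5 = 4     8
5    7    5 − 4 = 1     7 − 5 = 2     2
X̄ = 4    Ȳ = 5                        Sum = 22
SD for X = √(Σ(X − X̄)² / (n − 1)) = √(16 / 4) = 2
SD for Y = √(Σ(Y − Ȳ)² / (n − 1)) = √(50 / 4) ≈ 3.54
Covariance = Σ(X − X̄)(Y − Ȳ) / (n − 1) = 22 / 4 = 5.5
Correlation = Covariance / (SD for X × SD for Y) = 5.5 / (2 × 3.54) ≈ 0.78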
• Covariance: shows the direction of the relationship between two variables but doesn't
indicate its strength, because its size depends on the units of the variables. Its value can
range from -∞ to +∞.
• Correlation: measures both the strength and direction of the linear relationship between two
variables. It ranges from -1 to 1, where +1 indicates a perfect positive correlation, -1
indicates a perfect negative correlation, and 0 means no linear relationship.
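The contrast between the two measures can be checked numerically on the data from part (a). The sketch below assumes the sample (n − 1) formulas, which is what the value 5.5 implies; the helper names `sample_cov` and `correlation` are illustrative only.

```python
from math import sqrt

def sample_cov(xs, ys):
    """Sample covariance with the (n - 1) denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by the product of the SDs."""
    return sample_cov(xs, ys) / sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

x = [3, 1, 5, 6, 5]
y = [0, 3, 6, 9, 7]

cov_xy = sample_cov(x, y)     # 22 / 4 = 5.5
corr_xy = correlation(x, y)   # about 0.78

# Rescaling X (for example, changing its units) changes the covariance
# but leaves the correlation unchanged.
x_scaled = [10 * v for v in x]
cov_scaled = sample_cov(x_scaled, y)
corr_scaled = correlation(x_scaled, y)

print(cov_xy, round(corr_xy, 3), cov_scaled, round(corr_scaled, 3))
```

Multiplying X by 10 multiplies the covariance by 10 while the correlation stays the same, which is why only the correlation says anything about strength.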
Question 2: [5 marks]
Consider the following items bought in a supermarket and some of their characteristics:
Item no.   Cost ($)   Volume (cm³)   Color   Class label
1          20         6              Blue    Inexpensive
2          50         8              Blue    Inexpensive
3          90         10             Blue    Inexpensive
a) Which of the three features (cost, volume, and color) is the best classifier and why?
Solution
To find the best classifier, calculate the Fisher score for each feature. The feature with the
highest Fisher score is the best classifier.
• For cost
• For volume
• For color
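The per-feature calculation can be sketched as follows. This is a minimal sketch that assumes the common two-class form of the Fisher score, (μ1 − μ2)² / (σ1² + σ2²), with population variances; the answer sheet does not show which variant was intended. The "expensive" values are hypothetical placeholders, because only the inexpensive rows are available here.

```python
from statistics import mean, pvariance

def fisher_score(class_a, class_b):
    """Two-class Fisher score: squared gap between the class means
    divided by the sum of the class variances (population variance assumed)."""
    gap = mean(class_a) - mean(class_b)
    return gap ** 2 / (pvariance(class_a) + pvariance(class_b))

cost_inexpensive = [20, 50, 90]    # from the table above
cost_expensive = [250, 390, 480]   # HYPOTHETICAL values, not from the source

volume_inexpensive = [6, 8, 10]    # from the table above
volume_expensive = [7, 9, 12]      # HYPOTHETICAL values, not from the source

scores = {
    "cost": fisher_score(cost_inexpensive, cost_expensive),
    "volume": fisher_score(volume_inexpensive, volume_expensive),
}
best_feature = max(scores, key=scores.get)
print(scores, best_feature)
```

A feature whose class means are far apart relative to the spread within each class gets a high score, so it separates the classes well.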
a) Show how a perceptron can be used to implement the logical OR function. Which activation
function is used? Draw the discriminant function.
Solution
x1   x2   y (x1 OR x2)
0    0    0
0    1    1
1    0    1
1    1    1
𝑦 = 𝐴𝑓 (𝑥1𝑤1 + 𝑥2𝑤2 + 𝑏)
• 𝑤1, 𝑤2: Weights for inputs 𝑥1 and 𝑥2
• b: Bias term.
• 𝐴𝑓: Activation function.
The activation function is the step function.
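As a quick check, the perceptron equation above can be evaluated on all four inputs. The weights w1 = w2 = 1 and bias b = −0.5 are an assumed choice (the original figure is not available); any values that put the line x1 + x2 = 0.5 between (0,0) and the other three points would work.

```python
def step(z):
    """Step activation: 1 when the net input is non-negative, else 0."""
    return 1 if z >= 0 else 0

def perceptron_or(x1, x2, w1=1.0, w2=1.0, b=-0.5):
    """y = Af(x1*w1 + x2*w2 + b) with assumed weights that realise OR."""
    return step(x1 * w1 + x2 * w2 + b)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, perceptron_or(x1, x2))
```

With these assumed values the discriminant function is the line x1 + x2 − 0.5 = 0.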
b) Consider a neuron with two inputs, one output, and a threshold activation function. If the two
weights are w1 = w2 = 1 and the bias is −1.5, what is the output for an input of ([0,0])? What
about inputs ([0,1]), ([1,0]), and ([1,1])? Draw the discriminant function and write down its
equation. What logic function does it represent?
Solution
𝑦 = 𝐴𝑓 (𝑥1𝑤1 + 𝑥2𝑤2 + 𝑏)
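x1   x2   net = x1 + x2 − 1.5   y
0    0    −1.5                  0
0    1    −0.5                  0
1    0    −0.5                  0
1    1    0.5                   1
The discriminant function is the line x1 + x2 − 1.5 = 0, and the neuron implements the logical
AND function.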
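The four outputs can be reproduced directly from the weights and bias given in the question. The only assumption in this sketch is that the threshold unit fires when the net input is at least 0.

```python
def threshold_neuron(inputs, weights=(1.0, 1.0), bias=-1.5):
    """Weighted sum plus bias, passed through a threshold at 0 (assumed)."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if net >= 0 else 0

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = {p: threshold_neuron(p) for p in patterns}
print(outputs)

# The neuron matches logical AND on every binary input pair.
is_and = all(outputs[(a, b)] == (a and b) for a, b in patterns)
```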
c) What if the neuron had four or five inputs? Can you give a formula that computes the number of
binary input patterns for a given number of inputs?
Solution
2ⁿ, where 2 is the number of values each binary input can take (0 or 1) and n is the number of inputs.
For:
• n = 4: 2⁴ = 16
• n = 5: 2⁵ = 32
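The count can be confirmed by enumerating the patterns. This short sketch uses `itertools.product`; the function name `binary_patterns` is illustrative.

```python
from itertools import product

def binary_patterns(n):
    """All binary input patterns for n inputs, as tuples of 0s and 1s."""
    return list(product([0, 1], repeat=n))

for n in (2, 4, 5):
    print(n, len(binary_patterns(n)))
```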
Construct a Hopfield Network to train with the vectors P1 = [−1 −1 1 1], P2 = [1 1 1 1],
P3 = [0 0 0 0], P4 = [1 1 −1 −1]. Apply −1 for neuron outputs <= −1, 0 for outputs strictly
between −1 and 1, and +1 for outputs >= 1.
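No worked solution survives for this question. One common construction, assumed here, is the Hebbian outer-product rule W = Σ Pk Pkᵀ with the diagonal set to zero, followed by a single recall step that uses the three-level activation from the question. The names `W`, `activate`, and `recall` are illustrative.

```python
import numpy as np

patterns = np.array([
    [-1, -1,  1,  1],   # P1
    [ 1,  1,  1,  1],   # P2
    [ 0,  0,  0,  0],   # P3
    [ 1,  1, -1, -1],   # P4
])

# Hebbian rule (assumed): sum of outer products, no self-connections.
W = patterns.T @ patterns
np.fill_diagonal(W, 0)

def activate(net):
    """-1 for net <= -1, +1 for net >= 1, and 0 in between."""
    return np.where(net >= 1, 1, np.where(net <= -1, -1, 0))

def recall(p):
    """One synchronous update of all four neurons."""
    return activate(W @ p)

print(W)
for p in patterns:
    print(p, "->", recall(p))
```

Under these assumptions each of the four training vectors maps back to itself after one update, so all of them are stable states of the network.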
We want to perform 𝐾-means clustering manually with 𝐾 = 2 on a small example of six observations
with two features.
Obs. 1 2 3 4 5 6
𝑋1 1 1 6 5 0 4
𝑋2 4 3 2 1 4 0
We use the Euclidean distance. Suppose that we initially assign the observations #1, #2, #3 as cluster 1
and the observations #4, #5, #6 as cluster 2.
(a) What are cluster centroids and cluster assignments after the first iteration of 𝐾-means clustering?
Solution:
With observations #1, #2, #3 as cluster 1 and observations #4, #5, #6 as cluster 2, the initial
centroids are C1 = (2.67, 3) and C2 = (3, 1.67).
Obs.   Distance to C1                         Distance to C2                         Assigned cluster
1      √((2.67 − 1)² + (3 − 4)²) = 1.95       √((3 − 1)² + (1.67 − 4)²) = 3.07       1
2      √((2.67 − 1)² + (3 − 3)²) = 1.67       √((3 − 1)² + (1.67 − 3)²) = 2.40       1
3      √((2.67 − 6)² + (3 − 2)²) = 3.48       √((3 − 6)² + (1.67 − 2)²) = 3.02       2
4      √((2.67 − 5)² + (3 − 1)²) = 3.07       √((3 − 5)² + (1.67 − 1)²) = 2.11       2
5      √((2.67 − 0)² + (3 − 4)²) = 2.85       √((3 − 0)² + (1.67 − 4)²) = 3.80       1
6      √((2.67 − 4)² + (3 − 0)²) = 3.28       √((3 − 4)² + (1.67 − 0)²) = 1.95       2
After the first iteration:
• Cluster 1: Observations #1, #2, #5
• Cluster 2: Observations #3, #4, #6
• New centroids: C1 = (0.67, 3.67) and C2 = (5, 1)
(b) Continue the algorithm of 𝐾-means clustering until it converges. Report the cluster centroids and
cluster assignments after each iteration.
Solution
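With centroids C1 = (0.67, 3.67) and C2 = (5, 1):
Obs.   Distance to C1                            Distance to C2                   Assigned cluster
3      √((0.67 − 6)² + (3.67 − 2)²) = 5.59       √((5 − 6)² + (1 − 2)²) = 1.41    2
5      √((0.67 − 0)² + (3.67 − 4)²) = 0.75       √((5 − 0)² + (1 − 4)²) = 5.83    1
6      √((0.67 − 4)² + (3.67 − 0)²) = 4.96       √((5 − 4)² + (1 − 0)²) = 1.41    2
Observations #1, #2, and #4 also remain closest to their current centroids, so no assignment
changes and the algorithm has converged:
• Cluster 1: Observations #1, #2, #5, centroid (0.67, 3.67)
• Cluster 2: Observations #3, #4, #6, centroid (5, 1)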
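The hand calculation can be cross-checked with a few lines of NumPy. This is a minimal sketch of the usual K-means loop (recompute centroids, reassign by Euclidean distance, stop when nothing changes) started from the initial assignment given in the question; the function name `kmeans_from_labels` is illustrative, and cluster labels 0 and 1 stand for clusters 1 and 2.

```python
import numpy as np

X = np.array([[1, 4], [1, 3], [6, 2], [5, 1], [0, 4], [4, 0]], dtype=float)
initial_labels = np.array([0, 0, 0, 1, 1, 1])  # obs #1-#3 vs. obs #4-#6

def kmeans_from_labels(X, labels, k=2, max_iter=100):
    """Run K-means from a given assignment; return (centroids, labels) per iteration."""
    history = []
    for _ in range(max_iter):
        # Centroid of each cluster under the current assignment.
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Euclidean distance from every observation to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        history.append((centroids, new_labels))
        if np.array_equal(new_labels, labels):
            break  # assignments unchanged: converged
        labels = new_labels
    return history

history = kmeans_from_labels(X, initial_labels)
for i, (centroids, labels) in enumerate(history, start=1):
    print(f"iteration {i}: centroids={centroids.round(2).tolist()} labels={(labels + 1).tolist()}")
```

The second pass reproduces the first pass's assignments, which is the convergence condition used above.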
a) Consider the following multilayer feedforward neural network with two inputs [X1 = 0.4 and X2
= −0.7] and one output [Y = 0.1]. The initial weight values are shown on the arrows, the
learning rate is 0.6, and the neuron activation function is the sigmoid function.
Apply the backpropagation algorithm (show each step) to update all the weight values (one
epoch only).
Solution
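net = X1·w1 + X2·w2
O = 1 / (1 + e^(−net))
O1 = 0.544
net2 = 0.4 × 0.4 + (−0.7) × 0.2 = 0.02
O2 = 0.5049
net_y = 0.544 × 0.2 − 0.5049 × 0.5 = −0.14365
y = 1 / (1 + e^(0.14365)) ≈ 0.464
E = 0.5 × (t − y)²
E = 0.5 × (0.1 − 0.464)² ≈ 0.066
∂E/∂net_y = (∂E/∂y) × (∂y/∂net_y) = (y − t) × y × (1 − y) = 0.0905
w_0.2(new) = w_0.2(old) − η × (∂E/∂net_y) × O1
Here w_0.2 is the weight with initial value 0.2 on the connection from hidden neuron 1 to the
output neuron.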
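The arithmetic of the forward pass and the output-layer update can be checked with the sketch below. The network figure is not available, so the code only uses numbers that appear in the working: O1 = 0.544 is taken as given (the weights into hidden neuron 1 cannot be recovered), and the hidden-to-output weights 0.2 and −0.5 are read off the net_y line. Variable names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x1, x2 = 0.4, -0.7
target = 0.1
eta = 0.6

o1 = 0.544                     # taken from the working; its input weights are unknown
net2 = x1 * 0.4 + x2 * 0.2     # 0.02
o2 = sigmoid(net2)             # about 0.505

w_h1_y, w_h2_y = 0.2, -0.5     # hidden-to-output weights inferred from the net_y line
net_y = o1 * w_h1_y + o2 * w_h2_y
y = sigmoid(net_y)

error = 0.5 * (target - y) ** 2          # about 0.066
delta_y = (y - target) * y * (1 - y)     # dE/dnet_y, about 0.0905

# Gradient-descent update of the weight from hidden neuron 1 to the output.
w_h1_y_new = w_h1_y - eta * delta_y * o1

print(round(net_y, 5), round(y, 4), round(error, 4), round(delta_y, 4), round(w_h1_y_new, 4))
```

The remaining weights would be updated the same way, with the hidden-layer deltas obtained by propagating `delta_y` back through the hidden-to-output weights.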
Solution
Gradient descent is an optimization algorithm used to minimize a cost (or error) function E(w) by
iteratively adjusting the weights in the direction of the negative gradient.
It lets the network learn by using the output error to adjust the weights so that the error decreases.
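The idea can be illustrated on a one-dimensional error function. The sketch below assumes the toy function E(w) = (w − 3)², chosen only for illustration, and repeatedly steps against its gradient 2(w − 3).

```python
def gradient_descent(grad, w0, eta=0.1, steps=100):
    """Repeatedly move w a small step against the gradient of E."""
    w = w0
    for _ in range(steps):
        w = w - eta * grad(w)
    return w

# Toy error function (an assumption for this example): E(w) = (w - 3)^2
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_min)
```

Each step shrinks the distance to the minimum at w = 3 by a constant factor, so the iterate settles there; a learning rate that is too large would overshoot instead.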
Question 8: [5 marks]
Consider a single-layer single-neuron (output neuron) feedforward network with linear activation functions.
Use the method of gradient descent to derive the update rule for the network that minimizes the squared error
at the output neuron. It is known that the desired output is defined by the equation
w0 + w1x1 + w1x1² + ⋯ + wnxn + wnxn², where the w's are the parameters of the network and n is the number
of inputs. Is the single-layer network appropriate for this problem? Briefly explain your answer.
Solution
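The worked solution for this question is not available. As a hedged sketch of the standard derivation: with a linear neuron y = w0 + Σ wi·xi and error E = ½(t − y)², the gradient is ∂E/∂wi = −(t − y)·xi, so gradient descent gives the delta rule wi ← wi + η(t − y)·xi (and w0 ← w0 + η(t − y)). The code below implements one such step; the function name and the numbers in the example call are illustrative.

```python
def delta_rule_step(weights, bias, x, target, eta):
    """One gradient-descent step on E = 0.5 * (target - y)^2 for a linear neuron."""
    y = bias + sum(w * xi for w, xi in zip(weights, x))
    err = target - y
    new_weights = [w + eta * err * xi for w, xi in zip(weights, x)]
    new_bias = bias + eta * err
    return new_weights, new_bias

# Illustrative call with made-up numbers.
weights, bias = delta_rule_step([0.0, 0.0], 0.0, x=[1.0, 2.0], target=1.0, eta=0.1)
print(weights, bias)
```

On the appropriateness question, one reading is that a linear neuron fed only the raw inputs cannot reproduce the squared terms in the desired output, so the single-layer network is not suitable unless the inputs are augmented (for example with xi + xi² as the input for weight wi).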
Question 9: [10 marks]
Question 10: [10 marks]
a) Describe the architecture of the Self Organising Map (SOM) known as a Kohonen network.
b) The self-organizing process can be said to have four major components: Initialization,
Competition, Cooperation, and Adaptation. Briefly describe each of these components.
c) Consider the following SOM:
The output layer of this map consists of six nodes, A, B, C, D, E and F, which are organized into a two–
dimensional lattice with neighbors connected by lines. Each of the output nodes has two inputs x1 and
x2 (not shown on the diagram). Thus, each node has two weights corresponding to these inputs: w1
and w2. The values of the weights for all output nodes in the SOM are given in the table below:
a) Calculate which of the six output nodes is the winner if the input pattern is 𝒙 = (𝟐, −𝟒)
Solution
b) After the winner for a given input x has been identified, adjust the weights of the nodes in SOM.
Let 𝜂 = 0.5
Solution
𝑤𝑖(𝑡 + 1) = 𝑤𝑖(𝑡) + η(𝑥𝑖 − 𝑤𝑖(𝑡))
η = 0.5
Update for C :
𝑤1 = 3 + 0.5 ∗ (2 − 3) = 2.5
𝑤2 = −2 + 0.5 ∗ (−4 − (−2)) = −3
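The update for node C can be reproduced with a short helper. The weight table is not available, so this sketch only uses node C's weights (3, −2) as they appear in the working; the function name `som_update` is illustrative, and in a full SOM the neighbours of the winner would be updated with the same rule, possibly scaled by a neighbourhood function.

```python
def som_update(weights, x, eta):
    """Move each weight a fraction eta of the way towards the input."""
    return [w + eta * (xi - w) for w, xi in zip(weights, x)]

x = (2, -4)
w_c = (3, -2)          # weights of node C, taken from the working above
w_c_new = som_update(w_c, x, eta=0.5)
print(w_c_new)
```

With η = 0.5 the winner moves exactly halfway towards the input pattern.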
Good Luck