
Trial Exam IN3050/4050 Spring 2021 –

With Solutions
Hi IN3050/4050-students! Below, we have made a trial exam consisting of questions representative
for what you will see in this year’s exam. Many of them have been used before (e.g. in last year’s
exam/trial exam), but these are all questions that we believe would also be fitting for this year’s
exam, and you can expect to see a similar style of problems on this year’s final exam.

A difference between this and the final exam, however, is that the final exam will be given and
delivered in Inspera. So, please take some time to familiarize yourself with how to do exams in
Inspera if you have not done this already. Some links to useful information:

• Examination guidelines for Spring 2021 at the Faculty of Mathematics and Natural Sciences
• UiO’s page for preparing for exams in Inspera
• Each question in our exam will be answered in the “long form assignment” style
demonstrated here.
• We encourage you to log in to Inspera and test their demo exams, especially the “long form
assignment” demo. Accessible here after logging in. In particular, have a look at the equation
editor, with which it may be useful to have a bit of practice well before the exam.

Simulated Annealing (6p)


In simulated annealing,

a) What would happen if we start with a very low temperature (and keep it low throughout the
search)? Which search algorithm would this resemble? (3p)

b) What would happen if we start with a very high temperature and never decrease it? (3p)

Suggested solutions
a) We would have a method doing almost only exploitation. This would be similar to hill climbing /
local search, but with a bit more randomness, depending on how low we set the temperature.

b) We would have a lot of randomness and never settle on the areas in the fitness landscape with
the best solution. This would be similar to random search/exhaustive search. We would essentially
move through random neighbors in the search landscape.
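The temperature's effect can be illustrated with the Metropolis acceptance rule; this is a generic sketch for a minimization problem, not code from the course.

```python
import math
import random

def accept(delta, temperature):
    """Metropolis criterion for minimization: always accept an improving
    move; accept a worsening move (delta > 0) with probability
    exp(-delta / temperature)."""
    if delta <= 0:
        return True
    return random.random() < math.exp(-delta / temperature)

# Near-zero temperature: worsening moves are essentially never accepted,
# so the search behaves like hill climbing / local search.
print(math.exp(-1.0 / 1e-6))  # ~0
# Very high temperature: almost every move is accepted, so the search
# behaves like a random walk through neighboring solutions.
print(math.exp(-1.0 / 1e6))   # ~1
```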

Master Students Only: Search (5p)


In a few sentences, sketch how you could modify a hill climbing algorithm in order to improve
chances of finding the global optimum.

Suggested solution
You could for example:

-Run it multiple times with random starting positions


-Sometimes randomly choose worse solutions to not get stuck in local optima (like Simulated
Annealing does)
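The first idea, random restarts, can be sketched as a wrapper around plain hill climbing. The function names and the toy objective below are illustrative, not from the course code.

```python
import random

def hill_climb(f, start, neighbors, steps=100):
    """Plain hill climbing (maximization): move to the best neighbor
    until no neighbor improves on the current solution."""
    x = start
    for _ in range(steps):
        best = max(neighbors(x), key=f)
        if f(best) <= f(x):
            break  # local optimum reached
        x = best
    return x

def restart_hill_climb(f, random_start, neighbors, restarts=10):
    """Improvement: rerun hill climbing from several random starting
    positions and keep the overall best result."""
    candidates = [hill_climb(f, random_start(), neighbors) for _ in range(restarts)]
    return max(candidates, key=f)
```

For example, with the unimodal objective `f(x) = -(x - 3)**2` and integer neighbors `x - 1, x + 1`, every restart climbs to the global optimum at x = 3.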

Bachelor Students only: EA Selection (5p)
Five strings have the following fitness values: 3, 6, 9, 12, 15. Under Fitness Proportionate Selection,
compute the expected number of copies of each string in the mating pool if a constant population
size, n = 5, is maintained.

Suggested solution
With a constant population size, we need to select 5 individuals for the next generation. Fitness
Proportionate Selection implies that each of these 5 solutions should have a chance of being picked
equal to its proportion of the total fitness in the population.

Total population fitness: 3+6+9+12+15 = 45

Expected number of copies:

• Individual 1 (fitness = 3): Probability of getting chosen once: 3/45. Multiplying probability
with 5 draws: (3/45)*5 = 0.33 copies are expected (0 if rounding off)
• Individual 2 (fitness = 6): Probability of getting chosen once: 6/45. Multiplying probability
with 5 draws: (6/45)*5 = 0.66 copies are expected (1 if rounding off)
• Individual 3 (fitness = 9): Probability of getting chosen once: 9/45. Multiplying probability
with 5 draws: (9/45)*5 = 1 copy is expected.
• Individual 4 (fitness = 12): Probability of getting chosen once: 12/45 Multiplying probability
with 5 draws: (12/45)*5 = 1.33 copies are expected (1 if rounding off)
• Individual 5 (fitness = 15): Probability of getting chosen once: 15/45 Multiplying probability
with 5 draws: (15/45)*5 = 1.66 copies are expected (2 if rounding off)

We expect the final mating pool to contain 2 copies of individual 5, and 1 of each of individuals 2,3
and 4.
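The calculation above can be checked with a few lines of Python (rounding to two decimals gives 0.67 and 1.67 where the text truncates to 0.66 and 1.66):

```python
def expected_copies(fitnesses, pool_size):
    """Expected number of copies of each individual under Fitness
    Proportionate Selection: pool_size * f_i / total_fitness."""
    total = sum(fitnesses)
    return [pool_size * f / total for f in fitnesses]

copies = expected_copies([3, 6, 9, 12, 15], pool_size=5)
print([round(c, 2) for c in copies])  # [0.33, 0.67, 1.0, 1.33, 1.67]
```

Note that the expected counts always sum to the pool size, here 5.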

Pareto Optimality (9p)


For an optimization problem we wish to optimize solutions according to two different objectives, f1
and f2. The fitness values according to the two objectives for 7 different solutions are plotted in the
figure below.

a) What requirements do the solutions in a Pareto optimal set need to fulfill? (3p)

Find the Pareto optimal set of solutions when

b) Maximizing f1 and f2 (3p)


c) Maximizing f1 but minimizing f2 (3p)

Suggested solutions
a) Solutions in the Pareto optimal set have to be non-dominated. That is, no other solution may be
at least as good on every objective and strictly better on at least one of them.
b) 4, 6 and 7
c) 1, 7
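The dominance test generalizes to any objective directions. The coordinates below are hypothetical (the exam's actual values come from the figure); the functions themselves are a generic sketch.

```python
def dominates(a, b, maximize=(True, True)):
    """True if a dominates b: at least as good in every objective
    (respecting the max/min direction) and strictly better in one."""
    better_or_equal = all((x >= y) if mx else (x <= y)
                          for x, y, mx in zip(a, b, maximize))
    strictly_better = any((x > y) if mx else (x < y)
                          for x, y, mx in zip(a, b, maximize))
    return better_or_equal and strictly_better

def pareto_front(points, maximize=(True, True)):
    """Keep only the non-dominated points."""
    return [p for p in points
            if not any(dominates(q, p, maximize) for q in points)]

# Hypothetical (f1, f2) values, not the figure's:
pts = [(1, 1), (2, 3), (3, 2), (4, 1)]
print(pareto_front(pts))                          # maximize both
print(pareto_front(pts, maximize=(True, False)))  # max f1, min f2
```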

Perceptron and linear regression classifier (12p)


Given the following data

Item x1 x2 Class
A 1 2 yes =1
B 2 1 yes =1
C 1 1 no =0
D 1 0 no =0

a) Are the data linearly separable? State reasons for your answer. (4p)

b) We will train a perceptron on the data. We add a bias 𝑥0 = −1 to each of the data points.
Suppose the current weights are 𝒘 = (0, −1, 1). Assume a learning rate of 0.1. How should the
weights be updated if point A is considered? How would the weights have been updated if the
algorithm instead had considered point B? (4p)

c) Say, we instead had applied a linear regression classifier. How should the weights have been
updated when considering datapoint A, again assuming a learning rate of 0.1. And how would they
have been updated if we instead considered point B? (4p)

Suggested solutions
a) The datapoints can be separated e.g. by the line 𝑥1 + 𝑥2 = 2.5

b) For A, we get 𝑧 = 𝑤0 𝑥0 + 𝑤1 𝑥1 + 𝑤2 𝑥2 = 0(−1) − 1(1) + 1(2) = 1.


Since 𝑧 > 0, we get the prediction 𝑦 = 1.
Since 𝑦 = 𝑡, there is no change to 𝒘.

For B, we get 𝑧 = 𝑤0 𝑥0 + 𝑤1 𝑥1 + 𝑤2 𝑥2 = 0(−1) − 1(2) + 1(1) = −1.


Since 𝑧 < 0, we get the prediction 𝑦 = 0. The update
𝒘 = 𝒘 − 𝜂(𝑦 − 𝑡)𝒙 = (0, −1,1) − 0.1(0 − 1)(−1,2,1) = (−0.1, −0.8,1.1)
c) Point A. Since 𝑦 = 𝑧 = 1 = 𝑡, there is no update.

Datapoint B. Since 𝑦 = 𝑧 = −1, the update is


𝒘 = 𝒘 − 𝜂(𝑦 − 𝑡)𝒙 = (0, −1,1) − 0.1(−1 − 1)(−1,2,1) = (−0.2, −0.6,1.2)
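Both updates can be verified numerically. This sketch follows the conventions of the solution above (bias x0 = −1 prepended to each datapoint, perceptron threshold at z > 0); the function name is my own.

```python
def update(w, x, t, eta, model="perceptron"):
    """One weight update for a single sample x (bias included as x[0]).
    The perceptron thresholds z first; the linear regression classifier
    uses the raw output y = z."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    y = (1 if z > 0 else 0) if model == "perceptron" else z
    return [wi - eta * (y - t) * xi for wi, xi in zip(w, x)]

w = [0.0, -1.0, 1.0]
print(update(w, [-1, 1, 2], 1, 0.1))                  # A: unchanged
print(update(w, [-1, 2, 1], 1, 0.1))                  # B: ≈ [-0.1, -0.8, 1.1]
print(update(w, [-1, 2, 1], 1, 0.1, model="linear"))  # B: ≈ [-0.2, -0.6, 1.2]
```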

Logistic Regression (12 p)
Kim is building a spam filter. She has the hypothesis that counting the occurrences of the letter ‘x’ in
the e-mails will be a good indicator of spam or no-spam. She collects 7 spam messages and 7 no-
spam messages and counts the number of x-s in each. Here is what she finds.

• Number of ‘x’-s in each spam: [0, 3, 4, 8, 9, 13, 21]


• Number of ‘x’-s in each no-spam: [0, 0, 1, 2, 2, 5, 6]
She trains a logistic regression classifier on the data and plots the classifier against the data.

Assume the logistic regression model and answer the following questions:

a) Consider an e-mail with no ‘x’-s. According to the model, what is roughly the probability of this
message being a spam message and what is the probability of it not being a spam. (3p)

b) How many x-s must an e-mail contain to guarantee it is a spam mail? (3p)

c) How is a logistic regression model normally turned into a binary classifier? If you turn the model
into a classifier in this way, what is the accuracy of the classifier on the training data? (3p)

d) It is most important that no no-spams are classified as spams. How can this goal be described in
terms of precision and recall? How can the logistic regression classifier be modified to try to achieve
this goal? (3p)

Suggested Solution
a) We see from the graph that P(1 | x=0) ≈ 0.2 and P(0 | x=0) ≈ 0.8.

b) No number of x-s suffices: the logistic function approaches 1 but never reaches it, so you can
never be 100% sure with a logistic regression model.

c) This is normally done by choosing the class 1 if P(1 | x) > 0.5. We see from the graph that this
classifies 4 spams correctly and 3 incorrectly, and 5 no-spams correctly and 2 incorrectly.
Altogether 9 out of 14 are classified correctly, yielding an accuracy of 9/14.

d) We could either say that the goal is to get a good precision for spam, or a good recall for no-spam.
We achieve this by raising the threshold from 0.5. For the training data, a threshold of 0.7 would
suffice. If we want to be prepared for more variation in the test data, we could set the threshold
even higher.
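The thresholding in c) and d) can be sketched as follows. The parameters `w0, w1` below are hypothetical, chosen only so that P(1 | x=0) = 0.2 and the decision boundary falls where the graph suggests; the real values come from the trained model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classify(x, w0, w1, threshold=0.5):
    """Binary classifier from logistic regression: predict spam (1)
    when P(1 | x) exceeds the threshold."""
    return 1 if sigmoid(w0 + w1 * x) > threshold else 0

# Hypothetical parameters: sigmoid(w0) = 0.2, matching the graph at x=0.
w0, w1 = math.log(0.2 / 0.8), 0.3
spam = [0, 3, 4, 8, 9, 13, 21]
no_spam = [0, 0, 1, 2, 2, 5, 6]
correct = sum(classify(x, w0, w1) == 1 for x in spam) \
        + sum(classify(x, w0, w1) == 0 for x in no_spam)
print(correct, "of 14 correct")  # 9 of 14 with these parameters

# Raising the threshold (here to 0.7) keeps every no-spam out of the
# spam class, trading spam recall for spam precision.
false_alarms = sum(classify(x, w0, w1, threshold=0.7) == 1 for x in no_spam)
print(false_alarms)  # 0
```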

Figure 1

MLP and Back-propagation (10p)
a) Figure 1 shows the Multi-layer Perceptron Algorithm as presented by Marsland in the course
book. As presented, this is an algorithm for classification. Suppose you instead will use an MLP for
regression. Which lines in the algorithm do you have to change? What are their new forms? (5p)

b) Which activation function does Marsland’s algorithm apply in the hidden layer? Suppose you
instead will use the RELU activation function in the hidden layer. Which lines do you have to change
and what will they look like with RELU? (5p)

Suggested solutions
a) Equations (4.7) and (4.8)

New forms
4.7) 𝑦𝑘 = 𝑔(ℎ𝑘 ) = ℎ𝑘

4.8) 𝛿𝑜 (𝑘) = (𝑦𝑘 − 𝑡𝑘 )

b) Marsland uses the logistic (sigmoid) activation function.

Equations (4.5) and (4.9)

New forms
4.5) 𝑎𝜁 = 𝑔(ℎ𝜁 ) = ReLU(ℎ𝜁 ) = max(ℎ𝜁 , 0)

4.9) 𝛿ℎ (𝜁) = indicator(𝑎𝜁 ≥ 0) ∑ₖ₌₁ᴺ 𝑤𝜁𝑘 𝛿𝑜 (𝑘)
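The changed equations can be sketched in numpy. The shapes and the strict `a > 0` convention for the ReLU derivative (its value at exactly 0 is a matter of convention) are my choices, not Marsland's:

```python
import numpy as np

def relu(h):
    """Eq. 4.5 with ReLU in the hidden layer."""
    return np.maximum(h, 0)

def output_regression(h_out):
    """Eq. 4.7 for regression: linear (identity) output activation."""
    return h_out

def delta_output_regression(y, t):
    """Eq. 4.8 for regression: error of the linear output units."""
    return y - t

def delta_hidden_relu(a, w_out, delta_o):
    """Eq. 4.9 with ReLU: the derivative is 1 where the unit was
    active (a > 0) and 0 elsewhere."""
    return (a > 0).astype(float) * (w_out @ delta_o)
```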

Majority Voting Classifier (8 p)


We have trained three different classifiers on the same training data (from mandatory assignment
2); a linear regression classifier, a logistic regression classifier, and a perceptron classifier. We have
plotted the decision boundaries for all three classifiers on the training data in the figure. They all
classify the points above their boundaries as class 1 (purple) and the points below the boundary as
class 0 (red). By referring to the figure and the circled point, explain how a majority voting classifier
works.

Suggested Solution
A voting classifier consists of several different classifiers trained on the same problem and (possibly
subsets of) the same data. To make a prediction, the voting classifier applies all the classifiers to the
datapoint, lets each of them predict a class, collects the predictions, and chooses the majority class.
In the figure, the circled red point will be classified as red, since it is classified as red by two of the
three classifiers (the perceptron and the linear regression) even though it is classified as purple by
the logistic regression classifier. In this case, the decision boundary will look like the thick black lines.
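The voting step itself is a few lines. The three lambdas below are toy stand-ins for the trained classifiers, hard-coded to reproduce the votes on the circled point; the real predictions come from the fitted models.

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Let each trained classifier predict a class for x and return
    the most common prediction."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Stand-ins reproducing the votes on the circled point (0 = red, 1 = purple):
perceptron = lambda x: 0
linear_reg = lambda x: 0
logistic_reg = lambda x: 1
print(majority_vote([perceptron, linear_reg, logistic_reg], None))  # 0
```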

Master’s students only: Regularization (8p)


In several forms of supervised machine learning we find a model with a set of weights. The goal of
training is to find the weights which best fit the training data (𝑿, 𝒕). To determine what we mean by
best fit, we introduce a loss function, 𝐿, and the goal is to determine the weights which minimize the
loss, in symbols 𝒘̂ = argmax𝒘 (−𝐿(𝑿, 𝒕, 𝒘)).

In regularization, one replaces the objective −𝐿(𝑿, 𝒕, 𝒘) with another objective, e.g., for L2-
regularization one replaces it with −𝐿(𝑿, 𝒕, 𝒘) − 𝛼‖𝒘‖².

a) Why does one apply regularization, and what is achieved by regularization?

b) How is 𝛼 determined?

Suggested Solutions
a) When training towards the original learning objective, there is a danger of overfitting, which
means that the trained model fits the training set very well but does not generalize equally well to
other data. In particular, some features might appear particularly important, and through repeated
applications of gradient descent the model ascribes more and more weight to these features. The
goal of regularization is to avoid putting too much weight on a few features and instead obtain more
even weights across the features.

b) 𝛼 determines how strongly large weights are penalized. There is no fixed 𝛼 which fits all cases.
The optimal 𝛼 depends on the task in question. 𝛼 is tuned experimentally. One separates the
development data into a training set and a development test set. For various values of 𝛼, one trains
models on the training set and evaluates on the development test set. The value of 𝛼 which gives
the best score on the development test set is chosen for the final model.
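The procedure in b) can be sketched with ridge regression (linear regression with the L2 penalty), which has the closed form w = (XᵀX + αI)⁻¹Xᵀt; the function names and the candidate α values are illustrative.

```python
import numpy as np

def fit_ridge(X, t, alpha):
    """Closed-form L2-regularized linear regression (ridge):
    w = (X^T X + alpha I)^-1 X^T t."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ t)

def tune_alpha(X_train, t_train, X_dev, t_dev, alphas):
    """Pick the alpha whose model scores best (lowest squared error)
    on the held-out development test set."""
    def dev_loss(alpha):
        w = fit_ridge(X_train, t_train, alpha)
        return float(np.mean((X_dev @ w - t_dev) ** 2))
    return min(alphas, key=dev_loss)
```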

Bachelor students only: Training and test sets (8p)
Describe what is meant by a training set, test set and development test set in supervised machine
learning, and how they are used.

Suggested Solution
In machine learning, one applies algorithms that improve automatically through experience. The
algorithms are trained by the help of data, a training set. To see that the algorithms have learned,
one needs a precise description of what the learning objective is, and ways to measure improvement
with respect to this objective. To measure the improvement and how well the system performs, one
uses a test set of relevant data. The test set must be disjoint from the training set.

Often, one would compare different ML systems trained on the same data. For this, one needs a
development test set, different from both the training set and the test set. One may repeatedly train
different systems on the same training set and test them on the development test set. One then
decides on the system which performs best on the development test set. After deciding on the best
system, this can be tested on the (final) test set, which hasn’t been considered so far.
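A common way to obtain the three disjoint sets is a single shuffle followed by cuts; the fractions below are a typical but arbitrary choice.

```python
import random

def three_way_split(data, train_frac=0.8, dev_frac=0.1, seed=0):
    """Shuffle once, then cut into disjoint training, development
    test, and final test sets."""
    items = list(data)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_dev = int(len(items) * dev_frac)
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]
    return train, dev, test

train, dev, test = three_way_split(range(100))
print(len(train), len(dev), len(test))  # 80 10 10
```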

Unsupervised Learning (8p bachelor / 8p master)


a) (Master Students Only) An application of unsupervised learning discussed in class and in the
exercises is dimensionality reduction. Sometimes, however, it may be necessary to project data from
a lower-dimensional space to a higher-dimensional space. Consider the two algorithms that you
have used for dimensionality reduction: PCA and autoencoders; can they be used to generate higher
dimensional representations? Briefly explain why/why not. (4p)

b) (Bachelor Students Only) Overfitting is a central problem in learning. Suppose you have been
given an unlabeled data set containing 1000 samples in 20 dimensions. You ran the K-means
algorithm on it using 900 centroids, and you achieved a satisfying match. Now, someone
examines your work, and states that your algorithm is overfitting the data. What does she mean?
How is this related to overfitting? (4p)

c) (All students) Big data is undoubtedly an important driver for machine learning. However, in
certain situations there may be limits on the memory space or the computational power available
(think of embedded systems, for instance). In these cases, we may prefer running our algorithms on
few selected datapoints. Suppose you have been given an unlabeled data set containing 1000
samples in 20 dimensions, and someone has told you that they want only 10 representative samples
(prototypes) to run their analysis. Out of the unsupervised algorithms you have studied (PCA, K-
means, autoencoders), which one would you use and how? (4p)

Suggested solutions
a) PCA cannot learn higher-dimensional representations: intuitively, we cannot find more
orthogonal dimensions than the ones provided; formally, the number of eigenvectors is limited by
the dimensionality. Autoencoders can learn higher-dimensional representations, although the loss
function may have to change.

b) The large number of centroids means that most centroids are likely identifying individual
datapoints; there is no real learning, as the whole data is memorized with no generalization;
processing of new data will likely be unreliable.

c) The most natural solution would be to use k-means to find 10 centroids and use those as
prototypes.
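A minimal version of this idea: run k-means, then return, for each cluster, the real datapoint closest to its centroid (the centroid itself is an average, not a sample). This is a from-scratch sketch, not the course implementation.

```python
import numpy as np

def kmeans_prototypes(X, k, iters=50, seed=0):
    """Minimal k-means; returns for each cluster the actual datapoint
    closest to its centroid, to serve as a prototype."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    # prototypes: the nearest real sample to each final centroid
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return X[dists.argmin(axis=0)]

# e.g. 10 prototypes from 1000 samples in 20 dimensions:
# prototypes = kmeans_prototypes(data, k=10)
```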

Evolutionary Algorithms and Reinforcement Learning (13p)
We would like to set up a neural network (multilayer perceptron) for robot
control. The inputs to the neural network are measurements from range
sensors, and the output is a direction of movement. The robot is inserted into
the circular maze shown to the right, and the goal is to enable it to drive in the
direction of the arrow, getting as far as possible within a given time limit, while
colliding with the walls as few times as possible.

a) One way to design this neural network is by use of an evolutionary


algorithm (EA). The individuals in the population will be possible robot
controlling networks that get their fitness computed in simulation. The “lifetime” of one EA
individual (a neural network) thus consists of many timesteps, where in each timestep the individual
receives inputs from sensors, and calculates outputs (by feeding inputs forward through its
connections) that are given as speed settings to the robot’s wheels. The wheels then follow this
setting until a new output is available. Each individual gets N timesteps to try to drive as far as
possible. Assuming that the structure of the network is already specified, briefly describe how you
could allow an EA to find the proper weights for this neural network. Include in your description a
possible choice for:

a1) the genetic representation (genotype)

a2) variation operators. Include both their names and a brief description of how they work

a3) which measurements to include in the fitness function. You can assume the robot, or the
simulator, can gather any physical measurements of relevance to fitness calculation

(6 points, 2 per sub-task)

b) A different way to solve this problem is to apply reinforcement learning (RL). Describe how you
would model this problem as a reinforcement learning problem, including how you would define
rewards, states, and actions. The RL algorithm is not to be described. Note that you should not
describe this as a deep reinforcement learning problem, or with other function approximation
techniques – rather, formulate it as a discrete RL problem like the examples from the textbook and
lectures. (7 points)

Suggested solutions
a1) Genetic representation (genotype): Since we are representing the weights of a neural network,
the genotype needs to encode several numbers that can be mapped to the neural network
connections. The most straightforward way is to define each genotype as a list of floating-point
values, where each value represents the weight of a single specific network connection.

a2) Variation operators: Here, one should choose variation operators suitable for the representation
defined in a1. Since we defined the genome as a list of floating-point values, we could for instance
select uniform mutation and simple arithmetic crossover here. Other operators applicable to the
representation are also accepted.

a3) fitness function: Since the goal of evolved controllers is to drive as far as possible within a time
limit without crashing into walls, we should include measurements of the distance travelled and the
total number of wall collisions in the fitness function.
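The representation and operators from a1 and a2 can be sketched as follows; the mutation rate and weight range are illustrative choices.

```python
import random

def uniform_mutation(genome, rate=0.1, low=-1.0, high=1.0):
    """Uniform mutation: each weight is, with a small probability,
    replaced by a fresh random value from the allowed range."""
    return [random.uniform(low, high) if random.random() < rate else g
            for g in genome]

def simple_arithmetic_crossover(p1, p2, k=None):
    """Simple arithmetic crossover: copy parent 1 up to position k,
    then average the two parents over the remaining positions."""
    if k is None:
        k = random.randrange(len(p1))
    return p1[:k] + [(a + b) / 2 for a, b in zip(p1[k:], p2[k:])]

# Genotype (a1): one floating-point weight per network connection.
parent = [random.uniform(-1, 1) for _ in range(6)]
child = simple_arithmetic_crossover(parent, [0.0] * 6, k=3)
```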

b) Since this robot control problem is continuous, rather than discrete, there is a potentially infinite
number of different states and actions. We therefore need to discretize states and actions before
modelling this problem in the traditional RL way. For instance, we could model the problem this way:

States: States need to include information about distance to walls. To guide the movements of the
robot, we should also know on which side of the robot the wall is. There are many ways to represent
this information. One example is to represent each state as two variables, one of which represents
the direction towards the wall (dir), and the other the distance to it (dist). To guide actions, we need
to discretize these states, for instance into the sets dir (left, front, behind, right) and dist (close,
medium, far).

Actions: These need to be the operations the robot can carry out in order to complete its task.
Again, we could discretize the robot’s (continuous) control into a few different actions such as (go
forward, go backward, turn left, turn right).

Rewards: These need to be adapted to the robot’s goal, which is to drive far without collisions. For
instance, one could give a positive reward for every N cm driven, and a negative reward for every
collision.
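The discretization and the tabular setup can be sketched directly. The 90° sectors and the 20 cm / 100 cm distance cut-offs are illustrative choices, not values from the exam.

```python
def discretize_state(direction_deg, distance_cm):
    """Map continuous sensor readings onto the discrete state sets
    dir = {left, front, behind, right}, dist = {close, medium, far}.
    Sector and distance thresholds are hypothetical."""
    dirs = ["front", "left", "behind", "right"]
    direction = dirs[int(((direction_deg % 360) + 45) // 90) % 4]
    if distance_cm < 20:
        distance = "close"
    elif distance_cm < 100:
        distance = "medium"
    else:
        distance = "far"
    return direction, distance

ACTIONS = ["go forward", "go backward", "turn left", "turn right"]
# Tabular (discrete) RL: one Q-value per (state, action) pair.
Q = {(s_dir, s_dist, a): 0.0
     for s_dir in ["left", "front", "behind", "right"]
     for s_dist in ["close", "medium", "far"]
     for a in ACTIONS}
```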

Particle Swarm Optimization (PSO) and Developmental Systems (9p)


a) Describe what happens when the position of all particles in PSO are set to the same value of a
local optimum (Think about fitness landscapes). How could you adjust PSO to get out of the local
optimum to potentially find the global optimum without resetting the particles to random positions
initially? Describe your approach in up to 100 words. (3p)

b) An L-System is a parallel rewrite method. Describe how from an alphabet {h,j}, the axiom h will be
rewritten when using the rewrite rules: h->jjh and j->h. Write down three iterations/recursions. (3p)

c) When visualizing a string of an L-System, it is useful to implement a bracketed L-System. Describe


what ‘+’, ‘-’, ‘[‘ and ‘]’ in such L-Systems are used for. (3p)

Suggested solutions
a) This question is very subjective and there are many possible valid solutions. You can for example
set the initial velocity of every particle to a very high number. You can also add a repulsion function
that makes particles repel one another. In that way, particles still gather around a ‘center of mass’
but cannot squeeze onto the same position. This is related to how ‘boids’ flock, which was
mentioned in the lecture.

b) h → jjh → hhjjh → jjhjjhhhjjh

c) ‘+’ and ‘-’ refer to turning left and right by a predefined angle. The brackets ‘[’ and ‘]’ push and
pop the state when drawing an L-System. Pushing means that when the bracket is read, the current
position and angle are stored; popping returns to the previously stored position and angle.
Bracketed L-Systems therefore enable branching. The string of an expanded L-System can be read as
a set of instructions. A classical example is having F draw a straight line. F+F[FF]+F would read as:
(1) draw (2) turn (3) draw (4) push (5) draw (6) draw (7) pop (8) turn (9) draw.
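The parallel rewriting in b) is a one-liner per iteration and confirms the expansion above:

```python
def expand(axiom, rules, iterations):
    """Parallel rewriting: in each iteration, every symbol is replaced
    simultaneously according to the rules (symbols without a rule are
    kept unchanged)."""
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(c, c) for c in s)
    return s

rules = {"h": "jjh", "j": "h"}
for i in range(1, 4):
    print(expand("h", rules, i))  # jjh, hhjjh, jjhjjhhhjjh
```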
