Assignment 4

This document contains instructions for an assignment in an introductory artificial intelligence course. It includes four questions. The first asks students to modify neural network code to write error values to a file, try different network architectures, and add noise to the data. The second question describes building a neural network to predict soccer match outcomes. The third question involves calculations with a network of linear neurons. The fourth question asks students to explain the relationship between gradient descent and backpropagation.

Uploaded by

Selmanurto

CS 470/670 – Intro to Artificial Intelligence – Fall 2018

Instructor: Marc Pomplun

Assignment #4
Posted on November 21 - Due by November 29, 2:00pm

Question 1: Modifying the Neural Network Code

On the course homepage, you will find the neural network code that I showed in class in a
file named “MNIST_demo.c”. In order to make it work, you will also need to download
the following files from the website at:

http://yann.lecun.com/exdb/mnist/

train-images-idx3-ubyte.gz: training set images (9912422 bytes)
train-labels-idx1-ubyte.gz: training set labels (28881 bytes)
t10k-images-idx3-ubyte.gz: test set images (1648877 bytes)
t10k-labels-idx1-ubyte.gz: test set labels (4542 bytes)

Unpack all files into the same directory as the network code, and then the program should
be able to learn the MNIST data and test its accuracy for 150 epochs. It will take quite a
while, though. So make sure you have other things to do while the network does its job.

(a) Modify the code so that it writes the error for the training set and the error for the test
set after each epoch into a file. Let the network run for at least 150 epochs (if you let it
run overnight, you may as well run more epochs) and plot the training and test errors as
functions of epoch number, as in Figure 9.9 (right panel) of Chapter 9 that I sent you.
If your plot looks completely different from Figure 9.9, please let me know and we can
check what went wrong.
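
One way to approach the logging in (a) is sketched below. This is only an illustration and not part of MNIST_demo.c: the function name log_epoch_errors, the file path, and the CSV layout are assumptions made for this example. It appends one line per epoch so that the file can be plotted afterwards with any spreadsheet or plotting tool.

```c
#include <stdio.h>

/* Hypothetical helper (not in MNIST_demo.c): appends one line of the form
   "epoch,train_error,test_error" to a CSV file.
   Returns 0 on success, -1 if the file cannot be opened. */
int log_epoch_errors(const char *path, int epoch,
                     double train_error, double test_error)
{
    FILE *fp = fopen(path, "a");   /* append mode keeps earlier epochs */
    if (fp == NULL)
        return -1;
    fprintf(fp, "%d,%.6f,%.6f\n", epoch, train_error, test_error);
    fclose(fp);                    /* closing flushes the line to disk */
    return 0;
}
```

Calling such a function once at the end of every epoch, rather than keeping the file open for the whole run, means the results written so far survive if a long run is interrupted.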

(b) See what happens if you use fewer neurons in the two hidden layers. You can choose
any numbers, but make at least one layer significantly smaller than in the original
network. Again, plot the training and test errors across epochs. How do the results differ
from the initial ones?

(c) Restore the original number of neurons, and now change the dropout rate. Originally,
25% of input units (see line 164 in the code) and 50% of hidden-layer neurons (see line
172) are randomly chosen to drop out, i.e., give no output. Note that when we run the
network in production (non-training) mode without dropout, we need to increase the
output of neurons accordingly to keep the overall activation at the same level (lines 180
and 182). Change the dropout rate for the hidden-layer and/or input-layer units to
whichever value you like, and run the network again for at least 150 epochs. Plot the
results and compare them to the ones you got in (a).
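
To make the idea of "dropping out" concrete, here is a minimal sketch of how a dropout mask could be applied to an array of activations during training. It is not the code from MNIST_demo.c; the function name apply_dropout and its parameters are assumptions for this illustration, and the production-mode scaling mentioned above is deliberately left out.

```c
#include <stdlib.h>

/* Hypothetical helper: sets each of the n activations to zero
   independently with probability `rate` (0.0 to 1.0).
   Returns the number of units that were dropped. */
int apply_dropout(float *activation, int n, double rate)
{
    int dropped = 0;
    for (int i = 0; i < n; i++) {
        /* r is uniformly distributed in [0, 1) */
        double r = rand() / ((double)RAND_MAX + 1.0);
        if (r < rate) {
            activation[i] = 0.0f;   /* this unit gives no output */
            dropped++;
        }
    }
    return dropped;
}
```

With a rate of 0.5, roughly half of the units are silenced in each call, and a different random subset is chosen every time.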

(d) Add noise to the data by randomly choosing a certain percentage of pixels in both the
training and test images and flipping their intensity so that intensity i will become (255
– i). For example, if a pixel has the original value 10, then after this transformation it
has the value 245. The easiest way to do this is to modify the readData() function. You
can choose any percentage you like, and any of the three networks that you created
above. Again, run it for at least 150 epochs and compare the results to the original
network (whichever one you chose).
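
The flipping rule in (d) can be sketched as follows. The function name add_flip_noise is hypothetical, and the layout of the pixel buffer inside readData() is not shown here, so the sketch simply operates on an array of 8-bit intensities. Note that it flips each pixel independently with the given probability, so the actual share of flipped pixels is only approximately equal to the chosen percentage.

```c
#include <stdlib.h>

/* Hypothetical helper: inverts the intensity of each pixel independently
   with probability `fraction` (0.0 to 1.0), so that i becomes 255 - i.
   Returns the number of pixels that were flipped. */
int add_flip_noise(unsigned char *pixels, int n, double fraction)
{
    int flipped = 0;
    for (int k = 0; k < n; k++) {
        /* r is uniformly distributed in [0, 1) */
        double r = rand() / ((double)RAND_MAX + 1.0);
        if (r < fraction) {
            pixels[k] = (unsigned char)(255 - pixels[k]);
            flipped++;
        }
    }
    return flipped;
}
```

The same function would have to be applied to both the training and the test images so that both sets contain the same kind of noise.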

Question 2: The Soccer Network

Here is an idea for an ANN that would make you rich if it performed well. This ANN
predicts the results of soccer matches. The network receives information about the two
competing teams and the conditions of the match and is supposed to predict how many
goals each team will score. With this knowledge, you could bet on the projected winning
team and win a lot of money.

Let us say that every team consists of 20 players. You are providing the following input
data to the network:

• The skill level of every player on each of the two teams. Skill is rated by a group
of soccer reporters on a scale from 0 (“is unable to kick the ball”) to 10 (“world
class player”).

• The number of matches that each team has played during the last two weeks. There
are never more than seven matches in that period of time.

• The statistics of former matches between the same two teams within the past 10
years (e.g., Team A won 30% of the matches, Team B 45%, and 25% of the matches
were tied).

• The continent that each team comes from (North America, South America, Europe,
Africa, Asia, or Australia).

• Where the match takes place (Team A’s stadium, Team B’s stadium, or neutral
place).

• The phase of the soccer season (early season vs. late season).

You want to build and train a backpropagation network that, based on this information, is
able to predict the number of goals each team will score. Describe an appropriate way of
formatting the input, interpreting the output, collecting exemplars, constructing the
network, training the network, and testing the network. Give reasons for the decisions that
you make. Describe everything in great detail so that a computer programmer who does
not know anything about ANNs would be able to successfully build this network
application, predict results, and become rich. The programmer can look up the BPN
equations for training and operation in a book, but needs precise explanations for
everything else. Please help him/her out!
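
As an illustration of what "formatting the input" can mean, the sketch below shows one possible encoding for three of the inputs listed above. It is not the expected answer, and all names (scale_skill, scale_matches, encode_continent, NUM_CONTINENTS) are made up for this example. Numeric ratings are scaled to the range 0 to 1, and the continent is encoded with one unit per category, because continents have no natural order that a single number could represent.

```c
#define NUM_CONTINENTS 6   /* six continents are listed in the question */

/* Scales a skill rating from the range 0..10 to the range 0..1. */
double scale_skill(int skill)
{
    return skill / 10.0;
}

/* Scales a match count from the range 0..7 to the range 0..1. */
double scale_matches(int matches)
{
    return matches / 7.0;
}

/* Writes a one-of-N code for a continent index (0..5) into out[0..5]:
   the unit for the given continent is 1.0 and all others are 0.0. */
void encode_continent(int continent, double out[NUM_CONTINENTS])
{
    for (int i = 0; i < NUM_CONTINENTS; i++)
        out[i] = (i == continent) ? 1.0 : 0.0;
}
```

Other encodings are possible, and your answer should explain why you chose the one you did.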

Question 3: Linear Neurons

The following is a network of linear neurons - that is, neurons whose output is identical to
their net input, x⋅w (in other words, their output function that translates net input into output
is simply the identity function). These neurons do not receive any “dummy” inputs (biases
or offsets). The numbers in the circles indicate the output of a neuron, and the labels of
connections indicate the value of the corresponding weight.

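[Figure: three-layer network of linear neurons with the input values 2 and 1. The connection weights did not survive text extraction in a usable layout; the labels appear in the order 1, -2, -1, -4, 3, 4, 2, 3, 2, 0, -3, 1.]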

(a) Just as a warm-up exercise, compute the output of the hidden-layer and the output-layer
neurons for the given input (2, 1).

(b) Only mandatory for CS670: Show that a network of linear neurons, such as this one,
always computes a linear function, regardless of its number of layers and neurons.
Hint: A function y = f(x) is linear if and only if it can be expressed as y = Ax for some
matrix A.

(c) Only mandatory for CS670: Given that our three-layer network computes a linear
function, we suddenly notice that our network is wastefully large. It must be possible
to compute exactly the same function with a two-layer network. Draw such a network,
including all of its weights, that only consists of an input layer and an output layer and
computes the same function as the network shown above. Hint: In the network above,
determine how the output of each output-layer neuron depends on the two network
inputs, and then you should be able to find the correct weights for the two-layer
network. There is a more elegant way of deriving the solution that is related to (b), but
any correct solution gets full points, regardless of your approach.
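
The idea behind (b) and (c) can be checked numerically with a small sketch. The weights used with it below are made up for illustration and are NOT the weights from the figure; the function names layer and matmul and the fixed size N are likewise assumptions. The sketch computes the output of two stacked linear layers and the output of a single layer whose weight matrix is the product of the two.

```c
#define N 2   /* hypothetical network: 2 inputs, 2 hidden, 2 output neurons */

/* Linear layer without bias: y = W x */
void layer(double W[N][N], const double x[N], double y[N])
{
    for (int i = 0; i < N; i++) {
        y[i] = 0.0;
        for (int j = 0; j < N; j++)
            y[i] += W[i][j] * x[j];
    }
}

/* Matrix product C = B A, i.e. the single layer that replaces
   "first apply A, then apply B" */
void matmul(double B[N][N], double A[N][N], double C[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            C[i][j] = 0.0;
            for (int k = 0; k < N; k++)
                C[i][j] += B[i][k] * A[k][j];
        }
}
```

For example, with the made-up matrices W1 = {{1, 2}, {3, 4}} and W2 = {{0, 1}, {-1, 2}} and the input (2, 1), applying layer() twice gives (10, 16), and applying layer() once with the product W2 W1 = {{3, 4}, {5, 6}} gives the same result.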

Question 4 (Bonus): From Gradient Descent to Backpropagation

Explain in your own words how the concepts of gradient descent and backpropagation
are related to each other.

Please put your answers to all questions in a single text file and upload it to your course
directory. Alternatively, you can submit some or all answers as a hardcopy at the start of
the class.
