7 Neural Networks - Lecture Slides

The document provides an overview of neural networks, explaining their biological inspiration and how artificial neural networks (ANNs) function. It covers topics such as the structure of neurons, activation functions, different types of ANNs, and their applications in classification, regression, and compression. Additionally, it discusses the learning process of ANNs through supervised learning and the importance of training datasets.

Data & A.I.

Neural networks
Wim De Keyser
Rony Baekeland
Tom Magerman
Quote

“Your brain does not manufacture thoughts. Your thoughts shape neural networks.”

Deepak Chopra (1947- )


Agenda

1. What is a neural network?
2. What to use an ANN for?
3. How does an ANN work?
4. ANN in Python
5. In the media
6. Questionnaire
What is a neural network?
A biological neural network

The brain
A human brain consists of about 86,100,000,000 nerve cells.
A nerve cell or neuron consists of three parts:
• a cell body (aka soma aka cyton), to which are attached
  o dendrites: inputs that receive signals
  o a nucleus
• an axon, or output, with branches
• synapses: connections between an axon and a dendrite

Reference points for the size of that number:
1,000,000 seconds = 11.57 days
1,000,000,000 seconds = 31.71 years
86,100,000,000 seconds ≈ 2.73 millennia
Neuron
The neuron explained

Watch up to 10:40 to get just the basics:
Introduction to neurons (video)
A biological neural network

A neuron combines incoming signals from other neurons and in turn transmits this signal to other neurons. Amplification or attenuation of the signal occurs in the dendrites and axons.

→ it creates a network with 100,000,000,000,000 connections
→ non-linear system → can learn anything!
→ universal function approximator
Artificial Neuron

Mimicking a biological neuron.

[Figure: neuron j, with an integration function that combines the weighted inputs and an activation function that produces the output]
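As an illustration (not code from the slides), a single artificial neuron can be sketched in a few lines of Python; the names sigmoid, weights and bias are our own:

import numpy as np

def sigmoid(z):
    # activation function: squashes z into (0, 1)
    return 1 / (1 + np.exp(-z))

def neuron(inputs, weights, bias):
    # integration function: weighted sum of the inputs plus the bias
    z = np.dot(weights, inputs) + bias
    # activation function: map z to the neuron's output a
    return sigmoid(z)

# example: a neuron with two inputs
print(neuron(np.array([1.0, 0.5]), np.array([0.4, -0.2]), 0.1))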
Artificial Neuron – Activation function

Each neuron j uses a function g() to map its weighted input zj to a new value aj = g(zj).

Common activation functions are:

 Linear function
 Sigmoid function
 ReLU function
 Leaky ReLU function
 Tanh function
 ELU function

 Another special case is softmax, not shown here on the graph.

Note: when the neural network needs to yield binary values, the logistic function is widely used (it returns a value between 0 and 1 and is a special case of the sigmoid function).
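A minimal numpy sketch of the listed activation functions (the alpha parameter values are common illustrative defaults, not prescribed by the slides):

import numpy as np

def linear(z, c=1.0):
    return c * z                              # pass-through

def sigmoid(z):
    return 1 / (1 + np.exp(-z))               # logistic: output in (0, 1)

def relu(z):
    return np.maximum(0, z)                   # max(0, x)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)      # small slope for negative z

def tanh(z):
    return np.tanh(z)                         # output in (-1, 1)

def elu(z, alpha=1.0):
    return np.where(z > 0, z, alpha * (np.exp(z) - 1))

def softmax(z):
    e = np.exp(z - np.max(z))                 # shift for numerical stability
    return e / e.sum()                        # output vector sums to 1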
Artificial Neural Network (ANN or NN)

Mimicking a biological neural network by connecting artificial neurons in a network.

Note: within most NNs, the same integration and activation function is used for all neurons, except for the output layer, where sometimes softmax is used.
ANN
network layers

[Figure: input layer, hidden layers and output layer; each connection (synapse) contains a weight, and weights can be positive or negative]

ANN

[Figure: input neurons, hidden neurons, output neurons]
ANN

topologies
Lots of topologies are possible:

• Feed Forward NN
• Recurrent NN
• LSTM
• Autoencoders
• Convolutional NN
• Kohonen NN
• …
→ each has a specific goal

We limit ourselves to the Feed Forward NN.

https://towardsdatascience.com/the-mostly-complete-chart-of-neural-networks-explained-3fb6f2367464
Perceptrons

In a neural network, a single neuron is called a perceptron.

What can we do with perceptrons?

E.g. the AND function:

x1  x2  Y
0   0   0
1   0   0
0   1   0
1   1   1

[Figure: perceptron with inputs x1 and x2, weights w1 and w2, and bias b]

Y = 0 if w1·x1 + w2·x2 + b ≤ 0 and Y = 1 if w1·x1 + w2·x2 + b > 0

Every line that separates (0,0), (1,0) and (0,1) from (1,1) will do fine.
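As a quick check of the decision rule above, a perceptron with illustrative weights w1 = w2 = 1 and bias b = -1.5 (our own choice; any separating line works) reproduces the AND truth table:

def perceptron(x1, x2, w1, w2, b):
    # the decision rule from the slide
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

for x1, x2 in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(x1, x2, perceptron(x1, x2, w1=1, w2=1, b=-1.5))  # prints the AND truth table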
Perceptrons

In a neural network, a single neuron is called a perceptron.

What can we do with perceptrons?

E.g. the XOR function:

x1  x2  Y
0   0   0
1   0   1
0   1   1
1   1   0

[Figure: perceptron with inputs x1 and x2, weights w1 and w2, and bias b]

Y = 0 if w1·x1 + w2·x2 + b ≤ 0 and Y = 1 if w1·x1 + w2·x2 + b > 0

No line can separate (0,0) and (1,1) from (1,0) and (0,1), so a single perceptron cannot do it.
ReLU function

The ReLU function is defined by:

relu(x) = max(0, x)
XOR with ReLU function
A simple example - XOR

XOR -exclusive OR- is a boolean operator with the truth table shown above.

We try to recreate this on the basis of a neural network:

 There are two input variables (also called covariates) and 1 output variable (also called response variable), so we provide 2 input neurons and 1 output neuron.
 The number of hidden neurons in a Feed Forward NN is usually somewhere between the number of input neurons and the number of output neurons; in our case, 2 or 1. One intermediate neuron has little added value when there is 1 output neuron, so we choose 2 intermediate neurons here.
 Every layer -except the output layer- has an additional input neuron (aka constant neuron) whose value is always one. There are no connections coming into the constant neuron, but the constant neuron is connected to each non-constant neuron of the next layer.
 Weights are associated with the connections between the neurons in the network. The weight of a connection from a constant neuron to a neuron is called a bias.
A simple example - XOR

The ANN looks as follows:

[Figure: input layer (covariates or input variables P1, P2 plus a constant neuron), hidden layer (2 neurons plus a constant neuron), output layer (response variable or output variable); 4 weights and 2 biases between input and hidden layer, 2 weights and 1 bias between hidden and output layer]
A simple example - XOR

With Python -see 4.- we obtain:

[Figure: the trained network with inputs P1 (variable 1) and P2 (variable 2) and output Q (output XOR); the fitted weights are approximately 0.15405 and -0.09914 from P1, -0.27288 and 1.09025 from P2, 0.44075 and 0.79255 into the output neuron, with biases of about -0.00200]
A simple example - XOR

With Python -see below- we obtain (for input P1 = 1, P2 = 1):

[Figure: network with constant neurons feeding n1 and n2 (hidden) and n3 (output)]

n1:
Integration function: z1 = 0.15405 × 1 - 0.27288 × 1 - 0.00200 = -0.12082
Activation function: a1 = sigmoid(z1) = 1 / (1 + exp(0.12082)) = 0.46983

n2:
Integration function: z2 = -0.09914 × 1 + 1.09025 × 1 - 0.00200 = 0.98912
Activation function: a2 = sigmoid(z2) = 1 / (1 + exp(-0.98912)) = 0.72891

n3:
Integration function: z3 = 0.44075 × 0.46983 + 0.79255 × 0.72891 - 0.00200 = 0.78278
Activation function: a3 = sigmoid(z3) = 1 / (1 + exp(-0.78278)) = 0.68628

Note: here it was chosen -by means of a parameter- not to apply the activation function to the output neuron -see below-
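The worked example can be verified with a short forward-pass sketch (the weight values are taken from the slide above; the helper function names are our own):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([1.0, 1.0])                                     # input P1 = 1, P2 = 1
a1 = sigmoid(0.15405 * x[0] - 0.27288 * x[1] - 0.00200)      # n1 -> 0.46983
a2 = sigmoid(-0.09914 * x[0] + 1.09025 * x[1] - 0.00200)     # n2 -> 0.72891
a3 = sigmoid(0.44075 * a1 + 0.79255 * a2 - 0.00200)          # n3 -> 0.68628
print(a1, a2, a3)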
What to use an ANN for?
What to use an ANN for?

Classification
Image recognition, OCR, fraud detection, identification, logistic regression, binary classification, multiclass classification, …
See also cluster analysis (Data & A.I. 2).

Regression
See also linear regression and forecasting (Data & A.I. 2).

Compression
See also PCA (Data & A.I. 2), by means of an autoencoder.
Data Science Process

All topics from Data & A.I. 2 and 3 apply to the Data Science Process.
• Understanding the Business: research phases, …
• Understanding the Data: frequency distributions, histogram, center measures, distributions, …
• Data management: transformation and manipulation of data
• Modeling: linear regression, forecasting, decision trees, clustering, association rules, naive bayes, metaheuristics, ANN
• Evaluation: evaluation metrics
• Application: writing smart applications, research phases

[Figure: process chain Understanding the Business → Understanding the Data → Data management → Modeling → Evaluation → Application]
What to use an ANN for?

Linear regression with g(z) = c × z  (regression)
You already know this from Data & A.I. 2: there this was called 'Coherence' and the Linear Model (lm).
With only two inputs (1 and x1) and c = 1, the output of the neuron is b + w1·x1, i.e. exactly a linear regression.

Logistic regression with g(z) = 1 / (1 + exp(-z))  (classification)
The only difference with linear regression is that here you get an output between 0 and 1.
You can look at that output as a probability.
How does an ANN work?
How does an ANN learn?

We will use some notation to indicate elements of an ANN. Look carefully at the examples.

xi = input vector from the training dataset
yi = target vector from the training dataset
W^4 = weight matrix in layer 4
z^3 = weighted input vector of the 3rd layer
a^4 = activation vector of the 4th layer
δ^3 = error vector for the 4 neurons of the 3rd layer

Superscript = number of the layer
Subscript = element of a vector/matrix
How does an ANN learn?

An ANN must be trained with examples before predictions can be made.
 supervised learning (cfr. Data & AI 2)
 fine-tuning the weights

A training dataset is a collection of N inputs and outputs {(x1, y1), …, (xN, yN)}, where xi and yi are input and target vectors.

Example 1
Training dataset requiring 7 input neurons and 3 output neurons:

inputs xi                            targets yi
0.3  0.4  0.8  0.9  0.2  0.2  0.5    0  1  0
0.5  0.1  0.5  0.8  0.3  0.7  0.3    1  0  0
0.9  0.2  0.1  0.1  0.3  0.3  0.8    1  1  1
0.6  0.3  0.5  0.9  1.0  0.4  0.5    0  0  1
0.5  0.7  0.8  0.8  0.3  0.9  0.1    1  1  0
0.7  0.2  0.4  0.5  0.1  0.3  0.1    0  0  0
How does an ANN learn?
Example 1

[Figure: the first training example, input vector (0.3, 0.4, 0.8, 0.9, 0.2, 0.2, 0.5), is placed on the 7 input neurons; its target vector (0, 1, 0) on the 3 output neurons]

How does an ANN learn?
Example 1

[Figure: the second training example, input vector (0.5, 0.1, 0.5, 0.8, 0.3, 0.7, 0.3) with target vector (1, 0, 0)]

How does an ANN learn?
Example 1

[Figure: the third training example, input vector (0.9, 0.2, 0.1, 0.1, 0.3, 0.3, 0.8) with target vector (1, 1, 1); etc. for the remaining examples]
How does an ANN learn?

Initialize all network weights randomly.

For all training examples, perform two steps:

A. Feedforward pass
1. Put the input vector xi into the input neurons
2. Have the number values propagated by the ANN
3. Output neurons take a value

B. Backpropagation pass
4. Calculate the cost by comparing the output with the target vector yi
5. Use the cost to calculate the errors δ for all layers
6. Adjust the weights in all layers based on the errors

The backpropagation algorithm looks for the weights in the neural network that provide a (local) minimum for the error function. Is this in effect an algorithm or is it a heuristic (see later)?
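A minimal sketch of steps A and B for a tiny network with one hidden layer, using sigmoid activations and a squared-error cost (the layer sizes, learning rate and example values are illustrative choices of ours; biases are omitted for brevity):

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))              # weights input -> hidden, randomly initialized
W2 = rng.normal(size=(1, 3))              # weights hidden -> output
lr = 0.1                                  # learning rate

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([0.3, 0.9])                  # input vector from the training dataset
y = np.array([1.0])                       # target vector

# A. Feedforward pass
z1 = W1 @ x;  a1 = sigmoid(z1)            # hidden layer
z2 = W2 @ a1; a2 = sigmoid(z2)            # output layer

# B. Backpropagation pass
delta2 = (a2 - y) * a2 * (1 - a2)         # error of the output layer
delta1 = (W2.T @ delta2) * a1 * (1 - a1)  # error propagated back to the hidden layer
W2 -= lr * np.outer(delta2, a1)           # adjust the weights based on the errors
W1 -= lr * np.outer(delta1, x)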
How does an ANN learn? - Initialize network weights

[Figure: the weight matrices W^1 (8×6), W^2 (6×4), W^3 (4×4) and W^4 (4×3) between the layers]

8×6 + 6×4 + 4×4 + 4×3 = 100 weights are generated at random.

How does an ANN learn? - Feedforward pass

[Figure: the feedforward pass through the weight matrices W^1 … W^4; shown are the activation of the 4th neuron in the 2nd layer and the activation of the 1st neuron in the 2nd layer]
Backpropagation explained (intuitive)
Backpropagation: the calculus behind it

How does an ANN learn? - Backpropagation pass (1/2)

[Figure: the error of the 4th neuron in the 2nd layer; δ^2 = errors in layer 2]

How does an ANN learn? - Backpropagation pass (2/2)

[Figure: the weight matrices W^1 (8×6), W^2 (6×4), W^3 (4×4) and W^4 (4×3)]

Adjust the weights in all layers based on the errors.
How does an ANN learn?
Example: MNIST dataset

Dataset of 70,000 handwritten digits; each digit is a 28 × 28 bitmap image.

Each pixel can serve as an input neuron.


How does an ANN learn?

[Figure: a 28 × 28 bitmap of the digit 5 is transformed into a vector of 28 × 28 = 784 input values; the 10 output neurons yield one value per digit class, and the target vector is the one-hot encoding of the digit, here (0, 0, 0, 0, 0, 1, 0, 0, 0, 0) for the digit 5]

1 training example = input vector + target vector
How does an ANN learn?

One-Hot Encoding
One-hot encoding is a process by which categorical variables are converted into a form that can be provided to ML algorithms to do a better job in prediction.

One-hot encoding performs a “binarization” of the category.
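For example, with Keras's to_categorical (used later in the MNIST example); the label values here are illustrative:

from tensorflow.keras.utils import to_categorical

labels = [0, 2, 1, 2]            # categorical labels
print(to_categorical(labels))    # [[1,0,0], [0,0,1], [0,1,0], [0,0,1]]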


How does an ANN learn? - Scaling data

Scaling data is essential because otherwise one independent variable can have a large impact on the dependent variable simply because of its scale. Using unscaled data can lead to meaningless results.

Commonly used techniques to scale data are:

 Z-score normalisation -see Data & A.I. 2-
 Min-max normalisation: zi = (xi - min)/(max - min), a linear transformation into values between 0 and 1
 Decimal scaling: divide all numbers by 10^j, where 10^(j-1) < max < 10^j
 Tanh estimators
 Median and MAD

Do we break the data down into a training and a test dataset before or after the normalization?
How does an ANN learn? - Scaling data

Make functions for the different normalization techniques!

E.g. min-max normalization:

import pandas as pd

def min_max_norm(col):
    minimum = col.min()
    col_range = col.max() - minimum
    return (col - minimum) / col_range

def normalized_values(df, norm_funct):
    df_norm = pd.DataFrame()
    for column in df:
        df_norm[column] = norm_funct(df[column])
    return df_norm

df_normalized = normalized_values(df, min_max_norm)

E.g. decimal scaling:

def decimal_scaling_norm(col):
    maximum = col.max()
    tenfold = 1
    while maximum > tenfold:
        tenfold = tenfold * 10
    return col / tenfold

df_normalized = normalized_values(df, decimal_scaling_norm)
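Z-score normalisation, listed on the previous slide but not implemented there, fits the same pattern; this sketch is our own:

def z_score_norm(col):
    # subtract the mean and divide by the standard deviation
    return (col - col.mean()) / col.std()

df_normalized = normalized_values(df, z_score_norm)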
ANN in Python
ANN in Python - Keras

Keras is a popular library for building and training neural networks.

Keras is a Python wrapper around TensorFlow.

 Install -if not yet done in the past- the packages tensorflow, keras, livelossplot, pydot and graphviz:
>>> pip install tensorflow
>>> pip install keras
>>> pip install livelossplot
>>> pip install pydot
>>> pip install graphviz
(most of them are in the requirements.txt file)
ANN in Python - Keras – Steps to take

Step 0: Install the packages & import the required libraries, functions, …
Step 1: Upload the dataset and inspect the data
Step 2: Perform the needed data management manipulations in order to prepare the data for processing
Step 3: Normalise the data (only if needed and normalisation is not a part of the chosen ANN-model -see example MNIST-)
Step 4: If required, split the dataset into a training dataset and a test dataset
Step 5: Build the ANN-model
Step 6: Train the ANN-model
Step 7: Evaluate the quality of the ANN-model
Step 8: Apply the ANN-model to a new dataset
Note: depending on the project at hand, some steps can be skipped.

[Figure: mapping of steps 1-8 onto the Data Science Process: Understanding the Business, Understanding the Data, Data management, Modeling, Evaluation, Application]
ANN in Python - Keras – Example XOR (1/4)

# Step 0: Install the packages & import the required libraries, functions, …
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, Dense, BatchNormalization
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import Adam
from livelossplot import PlotLossesKeras
from keras.utils.vis_utils import plot_model

# Step 1: Upload the dataset and inspect the data
xor_data = pd.DataFrame({'P1': [0,1,0,1], 'P2': [0,0,1,1], 'Q': [0,1,1,0]})
x_xor_data = xor_data[['P1', 'P2']]
y_xor_data = xor_data[['Q']]

# Step 5: Build the ANN-model
inputs_xor = Input(shape=(2,))
x_xor = Dense(2, activation='sigmoid')(inputs_xor)  # sigmoid(x) = 1 / (1 + exp(-x))
outputs_xor = Dense(1, activation='sigmoid')(x_xor)
model_xor = Model(inputs_xor, outputs_xor, name='XOR_NN')
# you can visualise the model in different ways – see next slide
model_xor.compile(optimizer=Adam(learning_rate=0.00001),
                  loss=keras.losses.binary_crossentropy,
                  metrics=['accuracy'])

# Step 6: Train the ANN-model
history_xor = model_xor.fit(x_xor_data, y_xor_data, epochs=200,
                            callbacks=[PlotLossesKeras()],
                            verbose=False)
# you can visualise the values of the trained ANN – see one of the next slides
ANN in Python - Keras – Example XOR (2/4)

# Visualisation of the ANN-model
model_xor.summary()

[Output: the summary shows the hidden Dense layer with 4 weights and 2 biases, and the output layer]

plot_model(model_xor, to_file='model_xor_plot.png', show_shapes=True,
           show_layer_names=True)
ANN in Python - Keras – Example XOR (3/4)

# Visualisation of the values of the trained ANN-model
for lay in model_xor.layers:
    print(lay.name)
    print(lay.get_weights())

[Output: per layer, the weight matrix (4 weights) and the bias vector (2 biases)]

Do you get the same weights?
ANN in Python - Keras – Example XOR (4/4)

# Step 7: Evaluate the quality of the ANN-model
model_xor.predict(x_xor_data)

Do you get the same predictions?
Are these predictions useful?
P1=1, P2=1 → …

model_xor.evaluate(x_xor_data, y_xor_data)

Change the model and add several hidden layers. Check if this would improve the predictions (a starting sketch follows below).

Quiz question: Why can't a neural network with only linear activation functions predict the XOR function?
Because the XOR function is not linearly separable, i.e. you can never perfectly separate the two classes with one line.
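One possible deeper variant for that exercise (the layer sizes, the higher learning rate and the epoch count are our own illustrative choices, reusing the imports from slide (1/4)):

inputs_xor2 = Input(shape=(2,))
x_xor2 = Dense(4, activation='sigmoid')(inputs_xor2)   # first hidden layer
x_xor2 = Dense(4, activation='sigmoid')(x_xor2)        # extra hidden layer
outputs_xor2 = Dense(1, activation='sigmoid')(x_xor2)
model_xor2 = Model(inputs_xor2, outputs_xor2, name='XOR_NN_deep')
model_xor2.compile(optimizer=Adam(learning_rate=0.01),
                   loss=keras.losses.binary_crossentropy,
                   metrics=['accuracy'])
model_xor2.fit(x_xor_data, y_xor_data, epochs=500, verbose=False)
print(model_xor2.predict(x_xor_data))   # closer to (0, 1, 1, 0)?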
ANN in Python - Keras – ANN parameters

Dense parameters:
 activation:
   sigmoid: applies the sigmoid activation function sigmoid(x) = 1 / (1 + exp(-x)). Returns a value between 0 and 1; not useful in a regression ANN
   relu: applies the rectified linear unit activation function max(x, 0)
   linear: linear activation function (pass-through); useful in a regression ANN
   softmax: converts a vector of values to a probability distribution; useful when the output layer consists of nodes for different outcome categories
   Other alternatives: https://keras.io/api/layers/activations/

model.compile parameters:
 optimizer:
   Adam(learning_rate=lr), with lr in {0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001}: useful in a classification ANN
   RMSprop(learning_rate=lr): useful in a regression ANN
   Other alternatives: https://keras.io/api/optimizers/
 loss:
   keras.losses.binary_crossentropy when the ANN is aimed at binary classification
   keras.losses.categorical_crossentropy when the ANN is aimed at multiclass classification
   keras.losses.MeanAbsoluteError() when the expected outcome is a numerical value (regression)
   Other alternatives: https://keras.io/api/losses/
 metrics:
   ['accuracy']: useful in a classification ANN
   keras.metrics.MeanAbsolutePercentageError(): useful in a regression ANN
ANN in Python - Keras – ANN parameters

model.fit parameters:
 epochs: number of times the training examples are offered to the ANN (= number of iterations)
 batch_size: accumulate the errors of a number of examples before updating the weights → faster training
 validation_split: percentage of the training set used as a validation set ≠ test set

Quiz question: What is the use of a test set?

The test set is used as an effective litmus test. Only on this data can the real performance of the NN be determined.

Quiz question: What is the use of a validation set?

A validation set prevents you from overtraining the NN. During training, the validation set (a separate part of the training set) is used to measure how the neural network performs. If the accuracy on the validation set becomes too poor but remains good on the training set (or the loss starts to increase for the validation set while the loss for the training set still decreases), then the NN is overfitting on the training set.
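In Keras, monitoring the validation set to stop before overfitting can be done with the EarlyStopping callback; the parameter values and the model/data names here are illustrative placeholders:

from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss',      # watch the loss on the validation set
                           patience=5,              # stop after 5 epochs without improvement
                           restore_best_weights=True)
model.fit(x_train, y_train, epochs=100, validation_split=0.2,
          callbacks=[early_stop])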
ANN in Python - Keras – ANN parameters

Quiz question: What is the role of the learning rate (model.compile parameter) and how can it influence the results of the neural network?

The learning rate determines how much the weights are adjusted per update. A large learning rate will adjust the weights very quickly in the right direction, but if the learning rate is too large, the NN will keep "jumping" over the optimal point. That is why the learning rate is often reduced during training.

Quiz question: When is it better to use a larger batch size (model.fit parameter)? When a smaller one?

A smaller batch size has the advantage that you update the weights very finely, but the learning process will take many times longer, also because there may be updates that do not send the weights in the direction in which a larger group of examples would send them (law of large numbers). A larger batch size has the advantage that you can train faster and that the updates are averaged out over all instances in the batch. So the updates will be coarser grained, but the step will be bigger. In practice, it remains a bit of trial-and-error and experimentation.
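Reducing the learning rate during training, as mentioned in the first answer above, can for instance be done with Keras's ReduceLROnPlateau callback (parameter values and model/data names are illustrative):

from tensorflow.keras.callbacks import ReduceLROnPlateau

reduce_lr = ReduceLROnPlateau(monitor='val_loss',  # watch the validation loss
                              factor=0.1,          # multiply the learning rate by 0.1
                              patience=3)          # after 3 epochs without improvement
model.fit(x_train, y_train, epochs=50, validation_split=0.2,
          callbacks=[reduce_lr])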
ANN tuning – the binary cross-entropy loss function

Cost function:

The binary cross-entropy (binary log loss) cost function measures the dissimilarity or error between the predicted probability distribution and the true binary labels (0 or 1) for each data point.

It is defined as follows:

L = -(y · log(p) + (1 - y) · log(1 - p))

Where:
• y is the true binary label (0 or 1).
• p is the predicted probability that the data point belongs to class 1.
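A minimal numpy sketch of this cost, averaged over a batch (our own helper, not Keras's implementation; the example values are illustrative):

import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)   # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0.1, 0.9, 0.7, 0.3])
print(binary_cross_entropy(y_true, y_pred))   # small value = good predictions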
ANN tuning - Metrics

Cost and loss functions are used to fit the model, but with metrics we evaluate the model. Which metric to use depends on what we want to achieve with our model!

For categorical data we can use:

Accuracy
Recall
Precision
F1
…
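These four metrics can be computed from the confusion-matrix counts (true/false positives and negatives); a minimal sketch with our own naming and illustrative counts:

def classification_metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)          # fraction of correct predictions
    precision = tp / (tp + fp)                          # correctness of positive predictions
    recall = tp / (tp + fn)                             # coverage of actual positives
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    return accuracy, precision, recall, f1

print(classification_metrics(tp=40, fp=10, fn=5, tn=45))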
ANN tuning – Popular optimizers based on SGD

Stochastic Gradient Descent (SGD):
– SGD is the basic optimization algorithm used in neural networks.
– Variants of SGD include Mini-Batch SGD (used most often), Batch SGD (uses the entire training dataset for each update), and online SGD (updates parameters after each training example).
Momentum: an enhancement to SGD using a moving average of past gradients.
Adagrad (Adaptive Gradient Algorithm): adapts the learning rate for each parameter based on the historical gradient information.
RMSprop (Root Mean Square Propagation): another optimizer that adapts the learning rate.
Adadelta: an extension of RMSprop (adapts the learning rate in another way).
Adam (Adaptive Moment Estimation): combines elements of both momentum and RMSprop. It maintains separate adaptive learning rates for each parameter and includes a momentum term. Adam is known for its robustness and is widely used in practice.
Nadam: an extension of Adam with Nesterov momentum.

In short, all these optimizers try to improve on the basic SGD algorithm.
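To make the baseline concrete, a sketch of the plain SGD update and the momentum variant described above (grad stands for the gradient of the loss with respect to the weights; these are the textbook update rules, not Keras internals):

import numpy as np

def sgd_step(w, grad, lr):
    # basic SGD: move the weights against the gradient
    return w - lr * grad

def momentum_step(w, grad, velocity, lr, beta=0.9):
    # momentum: a moving average of past gradients smooths the updates
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity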


ANN in Python - Keras – Example MNIST (1/2)

# Step 1 & 4: Upload the dataset and split the dataset
import keras.datasets
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Step 2: Data management manipulations
### 60000 training images with 28x28 pixels and their associated number
print(x_train.shape) # (60000, 28, 28)
print(y_train.shape) # (60000,)
### Transform the 2D-images into 1D-vectors (input transformation)
x_train = x_train.reshape((-1, 784)) # 28 x 28 = 784
x_test = x_test.reshape((-1, 784))
print(x_train.shape) # (60000, 784)
### Target values transformed into one-hot encoding
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Step 5: Build the ANN-model
### Preparing the layers of the neural network
### NN with 5 layers:
### • Layer 1: 784 input neurons
### • Layer 2: 784 normalisation neurons
### • Layer 3: 64 neurons
### • Layer 4: 64 neurons
### • Layer 5: 10 output neurons
inputs = Input(shape=(784,))
x = BatchNormalization()(inputs)              # add normalisation layer
x = Dense(64, activation='relu')(x)           # hidden neurons have the relu activation function
x = Dense(64, activation='relu')(x)
outputs = Dense(10, activation='softmax')(x)  # output neurons have softmax activation (= probabilities)
### Build the neural network model
model = Model(inputs, outputs, name='MNIST_Crusher')
model.summary()
### Compile the model: choose the optimization algorithm and set up the cost function
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss=keras.losses.categorical_crossentropy,
              metrics=['accuracy'])
ANN in Python - Keras – Example MNIST (2/2)

# Step 6: Train the ANN-model
### Train the network 5 times on the training set, update the weights every 32 examples
history = model.fit(
    x_train,       # training data
    y_train,       # training targets
    epochs=5,
    batch_size=32,
    validation_split=0.2,
)

# Step 7: Evaluate the quality of the ANN-model
### Evaluate the real quality on unseen data and show the predicted values
model.evaluate(x_test, y_test)
### Predict the first 5 images of the test dataset
print(np.argmax(model.predict(x_test[:5]), axis=1)) # [7 2 1 0 4]
### Compare with the actual numbers
print(np.argmax(y_test[:5], axis=1)) # [7 2 1 0 4]

Note: np.argmax returns the indices of the maximum values along an axis.

[Figure: plot of the ANN]


ANN in Python – Keras – Loss and accuracy

# Summarize history for loss
### cost reduction over the epochs; an increasing cost on the validation set = sign of overtraining
import matplotlib.pyplot as plt
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

# Summarize history for accuracy
### accuracy increases over the epochs
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
ANN in Python - Keras – Example Cereals US (1/4)

We have data on American cereals (see ‘cereals US.csv’ on Canvas). There is also a rating that gives an indication of how healthy these cereals are.
We want to construct, train and check a neural network and use it to predict the rating of a new cereal.
Amongst other things, we have to split the data into a training and a test dataset: place -at random- 80% of the data in the training dataset and the remaining 20% in the test dataset.
We want to be able to predict the ‘Rating’ based on the data in the columns ‘Calories’, ‘Protein (g)’, ‘Fat’, ‘Sodium’ and ‘Dietary Fiber’.
So we will have 5 input neurons and 1 output neuron. We will add only one hidden layer with 3 neurons.
We will test the quality of the neural network by means of the test dataset.
The rating should be predicted for the following two cereals available on the Belgian market:

                       Calories  Protein (g)  Fat  Sodium  Dietary Fiber
Kellogg's Coco pops    116       1.7          0.8  230     0.9
Boni Cereal flakes     124       2.6          2.1  260     1.2
ANN in Python - Keras – Example Cereals US (2/4)

# Step 1: Upload the dataset and inspect the data
cereals = pd.read_csv('cereals US.csv', delimiter=';')
cereals.info()
cereals.describe() # note: visualisation of values only for quantitative variables
cereals.isna().sum().sum()

# Step 2: Perform the needed data management manipulations
x_cereals = cereals[['Calories', 'Protein (g)', 'Fat', 'Sodium', 'Dietary Fiber']].copy()
y_cereals = cereals[['Rating']].copy()

# Step 3: Normalise the data
### min-max normalisation
def minmax_norm(col):
    minimum = col.min()
    col_range = col.max() - minimum
    return (col - minimum) / col_range

x_cereals_norm = pd.DataFrame()
for column in x_cereals:
    x_cereals_norm[column] = minmax_norm(x_cereals[column])

# Step 4: Split the dataset into a training dataset and a test dataset
from sklearn.model_selection import train_test_split
x_train_cer, x_test_cer, y_train_cer, y_test_cer = train_test_split(x_cereals_norm, y_cereals,
                                                                    test_size=0.2) # 0.2 = 20%

# Step 5: Build the ANN-model
### Preparing the layers of the neural network
inputs_cer = Input(shape=(5,))
x_cer = Dense(32, activation='relu')(inputs_cer)
x_cer = Dense(16, activation='relu')(x_cer)
x_cer = Dense(8, activation='relu')(x_cer)
x_cer = Dense(4, activation='relu')(x_cer)
outputs_cer = Dense(1, activation='linear')(x_cer)
ANN in Python - Keras – Example Cereals US (3/4)

### Build the neural network model
from tensorflow.keras.optimizers import RMSprop
model_cer = Model(inputs_cer, outputs_cer, name='Cereals')
model_cer.summary()
model_cer.compile(optimizer=RMSprop(learning_rate=0.01),
                  loss=keras.losses.MeanAbsoluteError(),
                  metrics=keras.metrics.MeanAbsolutePercentageError())

# Step 6: Train the ANN-model
history_cer = model_cer.fit(
    x_train_cer,   # training data
    y_train_cer,   # training targets
    epochs=200)

# Step 7: Evaluate the quality of the ANN-model
import math
model_cer.evaluate(x_test_cer, y_test_cer)
predicted_values = model_cer.predict(x_test_cer)
pred = []
for i in range(predicted_values.size):
    pred = pred + [predicted_values[i][0]]
predicted = pd.Series(pred, name='predicted')
actual = y_test_cer['Rating'].copy()
actual = actual.reset_index()
actual = actual['Rating']
mape = ((predicted - actual).abs() / actual).mean()
rmse = math.sqrt(((predicted - actual)**2).mean())
ANN in Python - Keras – Example Cereals US (4/4)

# Step 8: Apply the ANN-model to a new dataset
cerealsBE = pd.DataFrame({'Calories': [116, 124], 'Protein (g)': [1.7, 2.6], 'Fat': [0.8, 2.1],
                          'Sodium': [230, 260], 'Dietary Fiber': [0.9, 1.2]})

### normalise the new data with the min and max of the original input columns
def minmax_norm_2(col1, col2):
    minimum = col2.min()
    col_range = col2.max() - minimum
    return (col1 - minimum) / col_range

cerealsBE_norm = pd.DataFrame()
for column in cerealsBE:
    cerealsBE_norm[column] = minmax_norm_2(cerealsBE[column], x_cereals[column])

### this regression model has a single output neuron, so we take the predicted value itself
### (np.argmax would only return the index of the maximum output, which is always 0 here)
predicted_BE = pd.Series(model_cer.predict(cerealsBE_norm)[:, 0], name='predicted')
Neural networks in the media
In the media
https://www.bbc.com/news/technology-50720823

It takes minutes for most new Minecraft players to work out how to dig up the
diamonds that are key to the game, but training artificial intelligence to do it has
proved harder than expected.
Over the summer, Minecraft publisher Microsoft and other organisations challenged coders
to create AI agents that could find the coveted gems. Most can crack it in their first session.
But out of more than 660 entries submitted, not one was up to the task. The results of the
MineRL - which is pronounced mineral - competition are due to be announced formally on
Saturday at the NeurIPS AI conference in Vancouver, Canada. The aim had been to see
whether the problem could be solved without requiring a huge amount of computing power.
Despite the lack of a winner, one of the organisers said she was still "hugely impressed" by
some of the participants.
[…]
The organisers wanted the coders to create programs that learned by example, through a
technique known as "imitation learning". This involves trying to get AI agents to adopt the
best approach by getting them to mimic what humans or other software do to solve a task. It
contrasts with relying solely on "reinforcement learning", in which an agent is effectively
trained to find the best solution via a process of trial and error, without drawing on past
knowledge. Researchers have found that using reinforcement learning alone can sometimes
deliver superior results. For instance, DeepMind's AlphaGo Zero program trumped one of the
research hub's earlier efforts, which used both reinforcement learning and the study of
labelled data from human play to learn the board game Go. But this "pure" approach
typically requires much more computing power, making it too expensive for researchers
AlphaGo Zero (AGZ)

AlphaGo Zero is the algorithm that learned the game of Go. It is an improved version of its predecessor AlphaGo. It consists of two large neural networks (Heads 1 & 2).
Head 1 calculates the probabilities of all subsequent moves that AGZ could make.

Source: https://hackernoon.com/the-3-tricks-that-made-alphago-zero-work-f3d47b6686ef
AlphaGo Zero (AGZ)

Head 2 calculates the probability of winning based on the moves that Head 1 proposes.
In this way, AGZ was able to put together its own training set from which it could learn.

Source: https://hackernoon.com/the-3-tricks-that-made-alphago-zero-work-f3d47b6686ef
In the media
https://www.bbc.com/news/technology-48799045

An app that claimed to be able to digitally remove the clothes from pictures of
women to create fake nudes has been taken offline by its creators.
The $50 (£40) Deepnude app won attention and criticism because of an article by tech
news site Motherboard.
One campaigner against so-called revenge porn called the app "terrifying".
The developers have now removed the software from the web saying the world was not
ready for it.
"The probability that people will misuse it is too high," wrote the programmers in a
message on their Twitter feed. "We don't want to make money this way."
Anyone who bought the app would get a refund, they said, adding that there would be no
other versions of it available and withdrawing the right of anyone else to use it.
The developers also urged people who had a copy not to share it, although the app will
still work for anyone who owns it.
[…]
The program reportedly uses AI-based neural networks to remove clothing from images of
women to produce realistic naked shots.
The networks have been trained to work out where clothes are in an image, mask them
by matching skin tone, lighting and shadows and then fill in estimated physical features.
The technology is similar to that used to create so-called deepfakes, which manipulate
video to produce convincingly realistic clips. Early deepfake software was used to create
pornographic clips of celebrities.
QUESTIONNAIRE
Questionnaire
Questionnaire

• Download the file 'Questionnaire 21-22.csv' (see Canvas)
• Put the file in your Python workspace
• Load the data into the data frame studenq
>>> import pandas as pd
>>> studenq = pd.read_csv('Questionnaire 21-22.csv',
                          delimiter=';', decimal='.')
Questionnaire

1.A Build a neural network -with one hidden layer containing one neuron less than the number of input neurons- to determine a student's opinion about the number of years Biden will stay in the White House, based on the number of hours of math in the final year of high school, the number of mobile devices he/she uses, and the number of siblings. Split the data into a training dataset and a test dataset (90-10 ratio). Also normalize the data using decimal scaling.
Questionnaire

1.B Retake the neural network that determines a student's opinion of the number of years Biden will stay in the White House (see 1.A).
Draw the ROC curve, create the confusion matrix and calculate the most commonly used evaluation metrics.
