PythonDL Part 2

This document presents a course on deep learning with Python. Part 2 covers the fundamentals of deep learning, including densely connected neural networks, neural networks in Keras, how neural networks are trained, parameters and hyperparameters, and convolutional neural networks. It also includes a case study on handwritten digit classification using the MNIST database.

Python Deep Learning
A practical introduction with Keras and TensorFlow 2
PATC Courses | Barcelona - February 2020

Jordi TORRES.AI
1
Slides for teaching with
the book #PythonDL

https://torres.ai/python-deep-learning/

2
About these slides:

● Version: 0.8 (Barcelona, 31/01/2020)

○ Current draft of the slides for the book
«Python Deep Learning».
○ Some slides contain text in English. Over time we will keep
"polishing" the slides (but we believe they can be used even
as they are, which is why we are already sharing them)

3
Course content

PART 1: INTRODUCTION
1. What is Deep Learning?
2. Work environment
3. Python and its libraries

PART 2: FUNDAMENTALS OF DEEP LEARNING
4. Densely connected neural networks
5. Neural networks in Keras
6. How a neural network is trained
7. Parameters and hyperparameters in neural networks
8. Convolutional neural networks

PART 3: DEEP LEARNING TECHNIQUES
9. Stages of a Deep Learning project
10. Data to train neural networks
11. Data Augmentation and Transfer Learning
12. Advanced neural network architectures

PART 4: GENERATIVE DEEP LEARNING
13. Recurrent neural networks
14. Generative Adversarial Networks

4
Book resources
● Book website:
https://JordiTorres.ai/python-deep-learning

● Book GitHub repository:
https://github.com/JordiTorresBCN/python-deep-learning

● Additional book material for download:
https://marketing.marcombo.com + the book's promotional code

5
PART 2: FUNDAMENTALS OF DEEP LEARNING
4. Densely connected neural networks
5. Neural networks in Keras
6. How a neural network is trained
7. Parameters and hyperparameters in neural networks
8. Convolutional neural networks
6
Case study
● The MNIST database
○ A dataset for handwritten digit classification
○ 60,000 28×28 grayscale images of the 10 digits, along with a test
set of 10,000 images.

8
Case study
● The MNIST database
○ Features: a 28×28 matrix of pixels with values in [0, 255]
○ Labels: values in [0, 9]

9
Case study

10
Basic machine learning terminology
● Model: defines the relation between features and labels

y = wx + b

○ y: labels
○ x: features
○ w: weights
○ b: bias

11
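A minimal sketch of this terminology in NumPy (the data and parameter values here are made up for illustration):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # features
y = np.array([3.0, 5.0, 7.0, 9.0])   # labels, generated by y = 2x + 1
w, b = 2.0, 1.0                      # weights and bias
y_pred = w * x + b                   # the model y = wx + b
print(y_pred)                        # [3. 5. 7. 9.]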
Basic machine learning terminology

12
A simple artificial neuron

13
A simple artificial neuron

14
A simple artificial neuron

15
A simple artificial neuron

16
A simple artificial neuron
● the sigmoid function

17
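A minimal NumPy sketch of the simple artificial neuron described above, using the sigmoid as activation function (input, weight, and bias values are illustrative):

import numpy as np

def sigmoid(z):
    # squashes any real value into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # weighted sum of the inputs plus the bias, passed through the sigmoid
    z = np.dot(w, x) + b
    return sigmoid(z)

x = np.array([0.5, -1.0, 2.0])   # inputs
w = np.array([0.8, 0.2, -0.5])   # weights
b = 0.1                          # bias
print(neuron(x, w, b))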
18
Perceptron (schematically)

19
Neural network

20
Multilayer perceptron

21
Multilayer perceptron

22
A multilayer perceptron for classification

Neural networks are often used for classification, specifically when classes are exclusive. In this case the output layer is a softmax function, in which the output of each neuron corresponds to the estimated probability of the corresponding class.

23
Softmax activation function
● The softmax function has two main steps:
○ first, the "evidences" that an image belongs to each possible label are
computed,
○ and later the evidences are converted into probabilities for each
possible label.

24
Evidence of belonging

● An approach to measuring the evidence that a certain image
belongs to a particular class is to compute a weighted sum of the
evidence that each of its pixels belongs to that class.

● To explain the idea I will use a visual example →

25
Evidence of belonging
Let's suppose that we already have the
model learned for the number zero
(28×28):
● Pixels in red represent negative
weights (i.e., they reduce the evidence
of belonging).
● Pixels in blue represent positive
weights (they increase the evidence of
belonging).
● The black color represents the neutral
value.

26
Evidence of belonging
• Let's imagine that we trace a zero
over it.
• In general, the trace of our zero
would fall on the blue zone.
• It is quite evident that if our stroke
goes over the red zone, it is most likely
that we are not writing a zero;
• therefore, a metric that adds when
we pass through the blue zone and
subtracts when we pass through the
red zone seems reasonable.

27
28
Evidence of belonging
● To confirm that it is a good
metric, let's imagine now
that we draw a three.
● It is clear that the red zone at
the center of the previous
model, the one we used for the
zero, will penalize the
aforementioned metric since,
● as we can see in the figure,
when writing a three we pass
over it.
29
Evidence of belonging
But, on the other hand, if the reference model is the one corresponding to the number 3, we can see that, in general, the different possible traces that represent a three mostly stay within the blue zone.
30
Probability of belonging
● The second step involves computing probabilities.

● Specifically, we turn the sum of evidences into predicted
probabilities using this function:

● softmax takes the exponential of each computed evidence
and then normalizes the values so that they sum to one,
forming a probability distribution.

31
Probability of belonging
● Intuitively, the effect obtained with the use of
exponentials is that one more unit of
evidence has a multiplier effect and one unit
less has the inverse effect.
● The interesting thing about this function is that:
○ a good prediction will have a single entry in the
vector with a value close to 1, while the remaining
entries will be close to 0;
○ in a weak prediction, there will be several
possible labels, which will have more or less the
same probability.

32
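A minimal NumPy sketch of the two behaviors just described (the evidence vectors are made up):

import numpy as np

def softmax(evidence):
    # exponentiate, then normalize so the values sum to one
    exps = np.exp(evidence - np.max(evidence))  # subtract max for numerical stability
    return exps / np.sum(exps)

print(softmax(np.array([8.0, 1.0, 0.5])))   # good prediction: one entry close to 1
print(softmax(np.array([2.0, 1.9, 2.1])))   # weak prediction: similar probabilities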
PART 2: FUNDAMENTALS OF DEEP LEARNING
4. Densely connected neural networks
5. Neural networks in Keras
6. How a neural network is trained
7. Parameters and hyperparameters in neural networks
8. Convolutional neural networks
33
Preparing the execution environment

34
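A minimal sketch of the kind of check this slide refers to, assuming a Python environment (e.g., Colab) with TensorFlow 2 installed:

import tensorflow as tf

print(tf.__version__)   # should print a 2.x version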
Preloading the data in Keras

35
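The preloading this slide refers to can be sketched with the Keras datasets module (the dataset is downloaded and cached on first use):

from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape)   # (60000, 28, 28)
print(x_test.shape)    # (10000, 28, 28)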
36
37
Preprocessing the input data of a neural network

38
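A sketch of the usual preprocessing for a densely connected network: flatten each 28×28 image into a vector of 784 values and scale the pixels from [0, 255] to [0, 1] (this continues the arrays loaded above):

x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255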
39
one-hot encoding

40
one-hot encoding

41
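A sketch of one-hot encoding the labels with the Keras utility for it:

from tensorflow.keras.utils import to_categorical

# each label in [0, 9] becomes a vector of 10 entries with a single 1
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
print(y_train[0])   # e.g. [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.] for the label 5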
Model definition

[Figure callouts: number of neurons per layer; type of activation function]

- The keras.models.Sequential class is a wrapper for the neural network model
- Keras will automatically infer the shape of all layers after the first layer

42
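A sketch of a model definition consistent with these slides; the layer sizes (10 neurons per layer over 784 inputs) follow the book's basic MNIST example, but treat them as one reasonable choice:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
# only the first layer declares the input shape; Keras infers the rest
model.add(Dense(10, activation='sigmoid', input_shape=(784,)))
# output layer: one neuron per digit class, softmax to get probabilities
model.add(Dense(10, activation='softmax'))
model.summary()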
Model definition

43
Model definition

44
Configuring the learning process

45
Configuring the learning process

46
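A sketch of configuring the learning process with compile; this loss/optimizer/metric combination is one common choice for this setup:

model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])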
Training the model

47
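A sketch of training the model; epochs and batch_size are illustrative values:

model.fit(x_train, y_train, batch_size=100, epochs=5)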
Evaluating the model

48
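A sketch of evaluating the trained model on the test set:

test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)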
Confusion matrix

49
Confusion matrix

50
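A sketch of building the confusion matrix with scikit-learn (an assumption; the slides may compute it differently):

import numpy as np
from sklearn.metrics import confusion_matrix

predicted = np.argmax(model.predict(x_test), axis=1)   # most probable class per image
true = np.argmax(y_test, axis=1)                       # undo the one-hot encoding
print(confusion_matrix(true, predicted))               # 10×10 matrix of counts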
Generating predictions

51
Generating predictions

52
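A sketch of generating predictions for new images (the index 11 is arbitrary):

import numpy as np

predictions = model.predict(x_test)
print(predictions[11])              # 10 probabilities, one per digit
print(np.argmax(predictions[11]))   # the most probable digit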
53
It is time to get
your hands dirty!

54
Homework: Fashion-MNIST dataset

55
Using the same model

Equivalent to numpy.reshape(…, 784),
which gives a new shape to an array
without changing its data.

56
Motivation for next chapter

57
PART 2: FUNDAMENTALS OF DEEP LEARNING
4. Densely connected neural networks
5. Neural networks in Keras
6. How a neural network is trained
7. Parameters and hyperparameters in neural networks
8. Convolutional neural networks
58
A neural network is parameterized

59
Loss function

60
Optimizer

61
Learning process

62
How does Deep Learning
work?

63
TRAINING stage

[Figure, shown on slides 64-66: a single artificial neuron trained on many example (X, Y) pairs. Each input x1 … xn enters with weight w1j … wnj; with bias bj the neuron computes zj = Σi wij xi + bj and yj = σ(zj), where σ(z) = 1 / (1 + e^(−z)). Training tunes W and b.]

66
INFERENCE stage

[Figure: new data X is fed forward through the trained neuron, computing zj = Σi wij xi + bj and yj = σ(zj), to produce the prediction Y.]
67
Learning process

68
Key pieces of the backpropagation process

69
Parameter adjustment: Gradient Descent

70
Parameter adjustment: Gradient Descent

71
Parameter adjustment: Gradient Descent

72
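A minimal NumPy sketch of gradient descent for the linear model y = wx + b with a mean-squared-error loss (data, learning rate, and step count are illustrative):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])   # generated by y = 2x + 1
w, b, lr = 0.0, 0.0, 0.05            # initial parameters and learning rate

for step in range(200):
    error = (w * x + b) - y
    grad_w = 2 * np.mean(error * x)  # dLoss/dw for the MSE loss
    grad_b = 2 * np.mean(error)      # dLoss/db
    w -= lr * grad_w                 # move against the gradient
    b -= lr * grad_b

print(w, b)   # should approach 2 and 1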
Types of Gradient Descent
● How often are the parameter values adjusted?
○ Stochastic Gradient Descent
○ Batch Gradient Descent
○ Mini-Batch Gradient Descent

● SGD (with batches)

73
Loss function

74
Optimizers
● SGD, RMSprop, AdaGrad, Adadelta, Adam, Adamax, Nadam …

75
PART 2: FUNDAMENTALS OF DEEP LEARNING
4. Densely connected neural networks
5. Neural networks in Keras
6. How a neural network is trained
7. Parameters and hyperparameters in neural networks
8. Convolutional neural networks
76
Parameters and hyperparameters
● Parameter: a variable of a model that the DL system
trains on its own. For example, weights are parameters
whose values the DL system gradually learns through
successive training iterations.

● Hyperparameter: the "knobs" that you tweak during
successive runs of training a model.

77
Parameters and hyperparameters
● Hyperparameters at the level of the structure and topology of the
neural network:
○ number of layers,
○ number of neurons per layer,
○ their activation functions,
○ weight initialization,
○ etc.

78
Parameters and hyperparameters
● Hyperparameters at the level of the learning algorithm:
○ epochs,
○ batch size,
○ learning rate,
○ momentum,
○ etc.

79
Epochs and batch size
● Epoch: a single training iteration over all batches, in both
forward and backward propagation. This means 1 epoch is a
single forward and backward pass over the entire input data.

● Batch size: the number of examples in a batch; the set of
examples used in one single update of a model's weights
during training.

80
Learning rate and learning rate decay

81
Momentum

82
Activation functions
● Sigmoid

83
Activation functions
● Tanh

84
Activation functions
● ReLU

85
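A NumPy sketch of the three activation functions just listed:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))   # output in (0, 1)

def tanh(z):
    return np.tanh(z)             # output in (-1, 1)

def relu(z):
    return np.maximum(0, z)       # zero for negative inputs, identity otherwise

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(z))
print(tanh(z))
print(relu(z))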
TensorFlow Playground

86
TensorFlow Playground

87
TensorFlow Playground

88
TensorFlow Playground

89
Classification with a single neuron

90
91
Classification with more than one neuron

92
93
94
95
Classification with several layers

96
97
98
PART 2: FUNDAMENTALS OF DEEP LEARNING
4. Densely connected neural networks
5. Neural networks in Keras
6. How a neural network is trained
7. Parameters and hyperparameters in neural networks
8. Convolutional neural networks
99
Review:

Deep Learning
mathematical models

100
101
[Figure: forward pass of a face image through a network, producing a class probability]

Deep Learning - Supervised Learning
102
"TRAINING" phase

forward
F(x) = 70%
103
"TRAINING" phase

forward
F(x) = 70%

BACKPROPAGATION
(update of model weights)
error = 30%
104
"TRAINING" phase

forward
F(x) = 80%

BACKPROPAGATION
(update of model weights)
error = 20%

105
"TRAINING" phase

FORWARD PROPAGATION

LOSS estimation

BACKPROPAGATION (update of model weights)
106
"INFERENCE" phase
107
"INFERENCE" phase
108
Convolutional Neural Networks
● A CNN makes an explicit assumption: the inputs are images.

● Channel
○ a conventional term used to refer to a certain component of an
image.
○ An RGB color image → 3 channels

109
Convolutional Neural Networks
• Intuitive explanation of a CNN: "TRAINING" phase

[Figure: learned features progress from edges, to edge combinations, to object models.]
110
Convolutional Neural Networks
• Intuitive explanation of a CNN: "INFERENCE" phase

[Figure: learned features progress from edges, to edge combinations, to object models.]
111
Basic components of a CNN
● The convolution operation

● The pooling operation

● Classification (Fully Connected Layer)

113
Basic components of a CNN
● The convolution operation

● The pooling operation

● Classification (Fully Connected Layer)

114
The convolution operation

In a CNN, not all the neurons of one layer are connected
to all the neurons of the next layer, as is the case in fully
connected neural networks; instead, each neuron is
connected only to a small localized area of the space of
input neurons.
115
The convolution operation

Sliding window

Use the same filter (the same matrix W of
weights and the same bias b) for all the
neurons in the next layer.
116
The convolution operation: visual example

Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution

117
The convolution operation
● Many filters (one for each feature that we want to detect)

118
Basic components of a CNN
● The convolution operation

● The pooling operation

● Classification (Fully Connected Layer)

119
The pooling operation
○ max-pooling
○ average-pooling

120
The pooling operation

We slide our 2×2 window by 2 cells (a step also called the 'stride') and take the maximum value in each region.

Source: http://cs231n.github.io/convolutional-networks/
121
The pooling operation
● Pooling maintains the spatial relationships present in the input

122
Convolutional+Pooling layers: Summary

123
Implementation with the Keras API

124
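A minimal sketch of the convolution + pooling pair in the Keras API (filter count and window sizes are illustrative):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D

model = Sequential()
# 32 filters of size 5×5 over 28×28 single-channel images
model.add(Conv2D(32, (5, 5), activation='relu', input_shape=(28, 28, 1)))
# 2×2 max pooling halves each spatial dimension
model.add(MaxPooling2D((2, 2)))
model.summary()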
125
Basic components of a CNN
● The convolution operation

● The pooling operation

● Classification (Fully Connected Layer)

126
A simple model

127
128
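A sketch of a simple model consistent with the parameter counts on the next slide: two 5×5 convolutions with 32 and 64 filters, each followed by 2×2 max pooling, then a softmax classifier (the ReLU activations are an assumption):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (5, 5), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))    # 24×24×32 -> 12×12×32
model.add(Conv2D(64, (5, 5), activation='relu'))
model.add(MaxPooling2D((2, 2)))    # 8×8×64 -> 4×4×64
model.add(Flatten())               # 4×4×64 = 1024 values
model.add(Dense(10, activation='softmax'))
model.summary()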
Number of parameters
● conv2D: 32 × (25 + 1) = 832
● conv2D: ((5 × 5 × 32) + 1) × 64 = 51,264
● Dense: 10 × 1024 + 10 = 10,250

129
130
131
Hyperparameters of the convolutional layer
● Size and number of filters
○ The size of the window (window_height × window_width), which
preserves information about the spatial relationship between pixels,
is usually 3×3 or 5×5.
○ The number of filters (output_depth) indicates the number of
features to detect and is usually 32 or 64.

Conv2D(output_depth, (window_height, window_width))

132
Hyperparameters of the convolutional layer

[Figure: convolving a 5×5 input with a 3×3 window produces a 3×3 output image]

133
Hyperparameters of the convolutional layer
● Padding
○ Sometimes we want to get an output image with the same dimensions as the input.

○ We can add zeros around the input image before sliding the window over it.

134
Hyperparameters of the convolutional layer
In TensorFlow, the padding of the
Conv2D layer is configured with
the padding argument, which can
take two values:

"same"
indicates that as many rows and
columns of zeros as necessary are
added, so that the output has the
same dimensions as the input.

"valid"
indicates no padding (this is the
default value of this argument in
Keras/TensorFlow).

135
Hyperparameters of the convolutional layer
● Stride: the number of cells the sliding window jumps at each step

○ E.g., stride 2

136
It is time to get
your hands dirty!

137
Homework: Fashion-MNIST dataset

138
139
140
Layers and optimizers

141
142
Dropout and BatchNormalization layers

143
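A sketch of where these layers go, assuming a Sequential model named model as in the earlier sketches:

from tensorflow.keras.layers import BatchNormalization, Dropout

model.add(BatchNormalization())   # normalizes the activations of the previous layer
model.add(Dropout(0.5))           # randomly drops 50% of activations during training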
Learning rate decay
● the LearningRateScheduler callback

144
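A sketch of learning rate decay with the LearningRateScheduler callback (the decay schedule itself is illustrative):

import math
from tensorflow.keras.callbacks import LearningRateScheduler

def schedule(epoch, lr):
    # keep the initial rate for 5 epochs, then decay it exponentially
    return lr if epoch < 5 else lr * math.exp(-0.1)

lr_callback = LearningRateScheduler(schedule)
# then: model.fit(x_train, y_train, epochs=20, callbacks=[lr_callback])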
