SIC - AI - Chapter 8. Neural Network and Deep Learning - v1.2
Samsung Innovation Campus
Artificial Intelligence Course
Chapter 8. Neural Network and Deep Learning
Chapter objectives
Chapter contents
Overview
Biological origin
‣ The neuron is the smallest information-processing unit of the nervous system. It consists of a cell body, dendrites, and an axon.
‣ The cell body performs simple operations, the dendrites receive incoming signals, and the axon transmits the results.
‣ Neurons receive and store information by exchanging chemical signals with adjacent neurons through structures called synapses.
‣ The human brain contains about 10¹¹ neurons, and the average neuron has around 1,000 synapses. Thus, roughly 100 trillion (10¹⁴) synapses are interconnected in the human brain.
‣ In the 1940s, researchers studying biological neural networks began to investigate how to emulate their mechanisms. The perceptron is one of the earliest artificial neural network models.
‣ From here on, the term ‘neural network’ refers to an ‘artificial’ neural network.
‣ In 1943, McCulloch and Pitts first delineated the working mechanism of the neuron.
‣ In 1958, Rosenblatt proposed the perceptron, which is based on the McCulloch-Pitts neuron model.
‣ In 1969, Minsky and Papert mathematically proved the limitations of the perceptron in their book "Perceptrons": being only a linear classifier, a perceptron cannot solve the XOR problem.
Frank Rosenblatt
(1928~1971)
Perceptron
The Structure of a Perceptron
[Figure: structure of a perceptron: inputs x0 = 1, x1, x2, …, xd with weights w0, w1, w2, …, wd are summed to produce the output y]
‣ Let's understand the structure of a perceptron by using it to compute the OR function from its truth table.
‣ First, you need to understand the OR operation. Its truth table is as follows.

x1     x2     x1 OR x2
False  False  False
True   False  True
False  True   True
True   True   True
‣ Suppose that we already know the weight vector w that solves the problem: w0 = -0.5, w1 = 1, w2 = 1.
‣ Evaluate the perceptron in Excel after substituting x1 = 0 and x2 = 0.
x1 = 0, x2 = 0, w0 = -0.5, w1 = 1, w2 = 1

x0 · w0 = 1 × (-0.5) = -0.5
x1 · w1 = 0 × 1 = 0
x2 · w2 = 0 × 1 = 0
Sum s = -0.5
‣ Put the value of the sum into a function. Here, that function is the step (threshold) function.
‣ The threshold (step) function can be defined as follows:

τ(s) = +1 if s ≥ 0, -1 if s < 0

‣ Tabulating τ(s) in Excel for s from -100 to 100 and plotting the values gives a step-shaped graph: -1 for every negative s and +1 for s ≥ 0.
‣ Take the sum from the previous step, s = -0.5, and run it through the threshold function. The output is -1, matching 0 OR 0 = False.
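The same computation can be written in a few lines of Python. Below is a minimal sketch (NumPy only, all names illustrative) that reproduces the OR perceptron with the weights w0 = -0.5, w1 = w2 = 1 used above:

import numpy as np

def step(s):
    # Threshold function: +1 if s >= 0, else -1
    return 1 if s >= 0 else -1

def perceptron(x, w):
    # Weighted sum with bias input x0 = 1, passed through the step function
    s = w[0] * 1 + np.dot(w[1:], x)
    return step(s)

w = np.array([-0.5, 1.0, 1.0])              # w0, w1, w2 for OR
for x in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(x, '->', perceptron(np.array(x), w))
# prints -1 for (0, 0) and +1 for the other three inputs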
‣ Let's review the truth table.
A truth table lists true or false for every combination of the propositions and their Boolean functions. For example, for the conjunction (AND) of statements P and Q, the truth table is as below. True and false are also notated as T and F, or 1 and 0.

Proposition P   Proposition Q   P AND Q
True            True            True
True            False           False
False           True            False
False           False           False
‣ A truth table can be expressed with 1 and 0 as follows.

x1   x2
0    0
1    0
0    1
1    1

‣ The earlier computation substituted 0 for the first variable x1 and 0 for x2 as well.
‣ x0 is called the bias input, and it is fixed at 1.
‣ Now substitute x1 = 1 and x2 = 0.

x0 = 1, x1 = 1, x2 = 0
x0 · w0 = 1 × (-0.5) = -0.5
x1 · w1 = 1 × 1 = 1
x2 · w2 = 0 × 1 = 0
Sum s = 0.5
‣ The sum is 0.5. Run it through the threshold function, and the result is 1.
‣ Likewise, processing all rows of the truth table through the perceptron gives:

x1   x2   Sum s   τ(s)
0    0    -0.5    -1
1    0    0.5     1
0    1    0.5     1
1    1    1.5     1
‣ These results can be shown in the coordinate plane: the outputs separate the point (0,0) from the other three points (1,0), (0,1), and (1,1).
‣ This classification can be presented geometrically. Recall from linear regression that machine learning estimates the slope and intercept of a line; here the perceptron's weights play the same role. The line -0.5 + x1 + x2 = 0 separates the two groups: (0,0) on one side and (1,0), (0,1), (1,1) on the other.
OR operation
‣ The training set consists of four samples: x₁ = (0,0) with y₁ = -1, x₂ = (1,0) with y₂ = 1, x₃ = (0,1) with y₃ = 1, and x₄ = (1,1) with y₄ = 1.
‣ In Excel, the weighted sum is computed with a formula such as =$A$2*$C$7+B8*$C$8+…, and the threshold with =IF(D5>=0,1,-1).
[Figure: (a) the training set in the x1-x2 plane; (b) the perceptron with weights w0 = -0.5, w1 = 1.0, w2 = 1.0]
OR operation
‣ Let's feed the four samples to the perceptron and check the results.
‣ As shown on the previous slide, the perceptron produced the correct output for all four samples.
‣ In other words, this perceptron classifies the training set with 100% accuracy.
AND operation
‣ To recap, a perceptron with appropriate w values solves the OR problem.
‣ Now consider the AND operation. The truth table is below, and the geometric solution is a line that separates the points into two groups: (1,1) on one side and (0,0), (1,0), (0,1) on the other.

x1   x2   x1 AND x2
0    0    F
1    0    F
0    1    F
1    1    T
AND operation
‣ Find the values of w0, w1, and w2 that produce the results above.
‣ First apply random values, then gradually adjust w0, w1, and w2 until they solve the problem (the sketch below automates this trial-and-error).
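Instead of guessing by hand, the classical perceptron learning rule automates this search: present each sample, and whenever the prediction is wrong, nudge the weights toward the correct answer. This sketch is one standard formulation applied to the AND data; it is not necessarily the exact procedure used on these slides.

import numpy as np

def step(s):
    return 1 if s >= 0 else -1

# AND training set; column 0 is the bias input x0 = 1, labels are in {-1, +1}
X = np.array([[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]])
y = np.array([-1, -1, -1, 1])

w = np.zeros(3)              # arbitrary starting guess
eta = 0.5                    # learning rate
for epoch in range(20):
    for xi, yi in zip(X, y):
        if step(np.dot(w, xi)) != yi:   # update only on misclassified samples
            w = w + eta * yi * xi
print(w)   # converges to a valid solution, e.g. [-1.5, 0.5, 1.0]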
AND operation
‣ Substituting w0 = -1.5, w1 = 1, and w2 = 1 solves the AND operation.
‣ You have now seen that a perceptron with appropriate values for the weight vector w solves a given problem.
‣ Although the perceptron solves the OR and AND problems, it cannot solve XOR. Let's find out why.
XOR operation
‣ The XOR operation in a truth table: the result is true when the two propositions are opposites.

x1   x2   x1 XOR x2
0    0    F
1    0    T
0    1    T
1    1    F

‣ In the x1-x2 plane, (1,0) and (0,1) belong to one class, while (0,0) and (1,1) belong to the other.
‣ Since the perceptron is a linear classifier, you cannot find a line separating the blue and red dots.
‣ A multilayer perceptron solved the problem.
[Figure: (a) structure of a perceptron: inputs x0 = 1, x1, …, xd with weights w0, …, wd produce the sum s and the output y = τ(s); (b) the threshold function τ(s), jumping from -1 to +1 at s = 0, used as the activation function]
Mechanism of a perceptron
[Figure: the OR points are linearly separable, but the XOR points are not]
Multilayer Perceptron
‣ In XOR, the inputs are not linearly separable.
‣ The perceptron could not solve even a simple problem like XOR.
• This was shown by Marvin Minsky, an MIT professor and a pioneer of the AI field, in his book "Perceptrons," published in 1969.
People realized that emulating the path 'neuron → neural network → intelligence' with 'perceptron → artificial neural network → artificial intelligence' was not an easy task.
Multilayer Perceptron
‣ How can you separate inputs that are not linearly separable?
‣ First, the inputs can be separated with a non-linear decision boundary.
‣ Second, the two-dimensional points on the plane can be mapped into a new space in which the classes become separable.
[Figure: two perceptrons ① and ② applied in parallel: each draws one line in the x1-x2 plane, mapping the inputs x1, x2 to the new features z1 and z2]
Solving XOR
[Figure: a two-layer network solving XOR: the input layer x (two nodes) connects to a hidden layer (weights W1, biases b1), which connects to the output layer yout (weights W2, bias b2)]
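To make the two-stage idea concrete, here is a minimal NumPy sketch of XOR computed with hand-set weights: one hidden perceptron draws an OR-like line, the other a NAND-like line, and the output perceptron combines them. These particular weights are one classical choice, not necessarily the exact values shown on the slides.

import numpy as np

def step(s):
    return np.where(s >= 0, 1, -1)

def layer(x, W, b):
    # One layer of perceptrons: step(W @ x + b)
    return step(W @ x + b)

W1 = np.array([[ 1.0,  1.0],     # z1 fires unless both inputs are 0 (OR-like)
               [-1.0, -1.0]])    # z2 fires unless both inputs are 1 (NAND-like)
b1 = np.array([-0.5, 1.5])
W2 = np.array([[1.0, 1.0]])      # output fires only when z1 and z2 both fire
b2 = np.array([-1.0])

for x in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    z = layer(np.array(x, dtype=float), W1, b1)   # hidden features z1, z2
    print(x, '->', int(layer(z, W2, b2)[0]))
# prints -1 for (0, 0) and (1, 1), and +1 for (1, 0) and (0, 1)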
"Perceptrons" by Minsky
‣ Minsky pointed out the limitations of the perceptron and suggested a multilayer structure as a solution, but the technology of the time was not advanced enough to realize his idea.
‣ In 1974, Paul Werbos proposed a backpropagation algorithm in his doctoral thesis.
‣ In 1986, Rumelhart established the multilayer theory in his book "Parallel Distributed Processing" and revived interest in neural networks.
(a) Parallel combination of two perceptrons (b) Transformation of the feature space x into a new feature space z
Transformation of a feature space
[Figure: perceptron ③ takes the new features z1 and z2 (with bias z0 = 1) and combines them with a third line, completing the XOR classification]
‣ What about cases that require non-linear decision boundaries? Use an ANN with hidden layers!
Example
‣ The first layer maps the four inputs $x_1, \dots, x_4$ to three hidden units $x^1_1, x^1_2, x^1_3$:

$$\begin{bmatrix} x^1_1 \\ x^1_2 \\ x^1_3 \end{bmatrix} = \mathrm{Activation}\left( \begin{bmatrix} w_{11} & w_{12} & w_{13} & w_{14} \\ w_{21} & w_{22} & w_{23} & w_{24} \\ w_{31} & w_{32} & w_{33} & w_{34} \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \right)$$
Example
‣ The second layer maps the three first-layer activations $x^1_1, x^1_2, x^1_3$ to four units $x^2_1, \dots, x^2_4$ with biases $b_1, b_2, b_3, b_4$:

$$\begin{bmatrix} x^2_1 \\ x^2_2 \\ x^2_3 \\ x^2_4 \end{bmatrix} = \mathrm{Activation}\left( \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \\ w_{41} & w_{42} & w_{43} \end{bmatrix} \cdot \begin{bmatrix} x^1_1 \\ x^1_2 \\ x^1_3 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} \right)$$
Example
‣ The output layer combines the four second-layer activations $x^2_1, \dots, x^2_4$ into a single prediction $\hat{y}$, which is compared with the true value to measure the error:

$$\hat{y} = \mathrm{Activation}\left( \begin{bmatrix} w_1 & w_2 & w_3 & w_4 \end{bmatrix} \cdot \begin{bmatrix} x^2_1 \\ x^2_2 \\ x^2_3 \\ x^2_4 \end{bmatrix} + b \right)$$
Example
‣ The discrepancy between the prediction $\hat{y}$ and the true value is called the error, or loss.
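Chaining the three equations gives a complete forward pass. A minimal NumPy sketch with random weights; the shapes follow the example (4 inputs → 3 hidden → 4 hidden → 1 output), and the sigmoid stands in for the unspecified activation:

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Layer shapes follow the example: 4 inputs -> 3 hidden -> 4 hidden -> 1 output
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=3)
W2, b2 = rng.normal(size=(4, 3)), rng.normal(size=4)
W3, b3 = rng.normal(size=(1, 4)), rng.normal(size=1)

x = np.array([0.5, -1.0, 2.0, 0.1])   # one input sample
x1 = sigmoid(W1 @ x + b1)             # first layer activations
x2 = sigmoid(W2 @ x1 + b2)            # second layer activations
y_hat = sigmoid(W3 @ x2 + b3)         # prediction

y_true = 1.0
loss = (y_hat - y_true) ** 2          # squared-error loss for this sample
print(y_hat, loss)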
Activation Function
Activation function: linear threshold or step
‣ An activation function frequently used in the early days of neural networks is the step function, which jumps from 0 (or -1) to 1 at x = 0.
‣ The sigmoid is another widely used activation function: σ(x) = 1 / (1 + e^(-x)). Note that the output value ranges between 0 and 1.
‣ When there are more than two classes, the sigmoid generalizes to the "Softmax" function.
‣ The tanh activation is similar to the sigmoid, but its output ranges between -1 and +1.
‣ The linear activation's output equals its input: Linear(x) = x. The linear activation function is used in regression.
‣ ReLU is another commonly used activation function with a simple functional form: ReLU(x) = max(0, x).
‣ Leaky ReLU replaces the zero slope on the negative side with a small positive slope.
Activation   TensorFlow function
Step         (none)
Sigmoid      tf.math.sigmoid(x)
Tanh         tf.math.tanh(x)
ReLU         tf.nn.relu(x)
Softmax      tf.nn.softmax(x)
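The TensorFlow calls in the table can be tried directly; a quick sketch evaluating each activation on the same inputs:

import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tf.math.sigmoid(x).numpy())   # outputs in (0, 1)
print(tf.math.tanh(x).numpy())      # outputs in (-1, 1)
print(tf.nn.relu(x).numpy())        # negative inputs clipped to 0
print(tf.nn.softmax(x).numpy())     # outputs sum to 1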
Website: https://playground.tensorflow.org
Dropout
‣ Randomly exclude certain nodes during the training step to avoid over-dependence on particular nodes.
‣ When predicting, use all the nodes and edges without excluding any.
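In Keras this behavior comes built in: a Dropout layer zeroes random units only when called with training=True and uses every node at prediction time. A minimal sketch (the rate value is illustrative):

import tensorflow as tf

dropout = tf.keras.layers.Dropout(rate=0.5)    # drop half the units on average
x = tf.ones((1, 10))

print(dropout(x, training=True).numpy())   # random entries zeroed; the rest scaled by 1/(1 - rate)
print(dropout(x, training=False).numpy())  # prediction mode: all entries kept unchanged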
Unit 2. TensorFlow Basics
2.1. Introduction to TensorFlow
2.2. TensorFlow Dataset
2.3. Machine Learning with TensorFlow
Introduction to TensorFlow
About TensorFlow
‣ TensorFlow can utilize both CPU and GPU to improve the training performance of a machine learning
model.
About TensorFlow
‣ Since TensorFlow's Python API is fully mature, it is popular among deep learning and machine learning practitioners.
‣ In addition, two new TensorFlow-based libraries, TensorFlow.js and TensorFlow Lite, have been introduced.
‣ TensorFlow developers and the open-source community are constantly working to resolve remaining issues.
What is TensorFlow?
‣ TensorFlow generates a computational graph composed of a series of nodes.
‣ Each node represents an operation that can have zero or more inputs and outputs.
‣ A tensor is a symbolic handle referring to the inputs and outputs of these operations.
Why TensorFlow?
‣ TensorFlow immensely expedites the machine learning process.
• Performance
→ Since the performance of computer processors has consistently improved over the years, we can train ever more complex and powerful machine learning systems.
• Even the cheapest computer hardware now carries processors with multiple cores.
• Many functions in scikit-learn can distribute computations across multiple processes.
• By default, Python runs on a single core because of the GIL (Global Interpreter Lock).
• Multiprocessing libraries can distribute work across multiple cores, but even the most advanced desktop PCs rarely carry more than 8 or 16 cores.
Why TensorFlow?
‣ Even a simple multilayer perceptron with a single hidden layer of 100 units needs about 80,000 weight parameters ([784×100 + 100] + [100×10 + 10] = 79,510) to classify 28×28-pixel images.
Why TensorFlow?
‣ You can buy a GPU with 290 times more cores that executes 10 times more floating-point operations per second, at 70% of the price of the latest CPU.
‣ However, writing code that targets a GPU is not as easy as running code in a Python interpreter.
‣ Packages such as CUDA and OpenCL make it possible to program GPUs.
‣ However, CUDA and OpenCL do not offer the most convenient environment for writing machine learning algorithms.
What is TensorFlow?
‣ Refer to the figure below to precisely understand the notion of a tensor.
‣ The first row shows rank 0 and 1 tensors. The second row shows rank 2 and 3 tensors.
[Figure: example tensors: rank 0 (scalar), rank 1 (vector), rank 2 (matrix), rank 3 (3-D array)]
What is TensorFlow?
‣ To be more precise, a scalar is defined as a rank 0 tensor, a vector as a rank 1 tensor, a matrix as a rank 2 tensor, and a three-dimensional array as a rank 3 tensor.
‣ The actual values are stored as NumPy arrays, and the tensor provides a reference to this array.
What is TensorFlow?
‣ Learn how to load data and use TensorFlow Dataset objects, which iterate over datasets efficiently.
‣ Get acquainted with the tf.keras API and build a machine learning model.
‣ Learn how to compile and train this model and save the trained model to disk.
What is TensorFlow?
‣ TensorFlow 1.x supported a static computational graph that expressed the flow of data.
‣ Many users found the static computational graph difficult to work with. This has been addressed in version 2.0, where users can create and train neural network models much more easily.
‣ TensorFlow 2.0 still supports static computational graphs, but dynamic computational graphs are the
default.
Installing TensorFlow
‣ Use Python's pip installer to install TensorFlow from PyPI by running the following command in the terminal.
> pip install tensorflow
Installing TensorFlow
‣ For GPU support, you need an NVIDIA-compatible graphics card, plus installations of the CUDA Toolkit and the cuDNN library (a GPU is recommended for neural network training).
‣ If the above conditions are fulfilled, you can install the TensorFlow GPU version.
> pip install tensorflow-gpu
‣ Refer to the official website for more information on installation (https://www.tensorflow.org/install/gpu).
‣ The tf.convert_to_tensor function supports tf.Variable objects, unlike the tf.constant function (which will be discussed soon).
‣ The tf.fill function and the tf.one_hot function also create tensors.
‣ The tf.fill function is more efficient than tf.ones for generating large tensors.
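A short sketch of these creation functions (values illustrative):

import numpy as np
import tensorflow as tf

v = tf.Variable([1.0, 2.0, 3.0])
t1 = tf.convert_to_tensor(v)             # also accepts tf.Variable objects
t2 = tf.constant(np.array([4, 5, 6]))    # from a NumPy array

t3 = tf.fill((2, 3), 7.0)                # 2x3 tensor filled with 7.0
t4 = tf.one_hot([0, 2, 1], depth=3)      # one-hot rows for the labels 0, 2, 1
print(t1.numpy(), t2.numpy(), t3.numpy(), t4.numpy(), sep='\n')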
2.2. TensorFlow dataset
‣ A TensorFlow Dataset splits a large dataset into batches, which are fed to the model one batch at a time during training.
[The step-by-step code for this unit (Steps 1-3: creating a dataset, batching it, and loading and inspecting images) was shown as screenshots and is not reproduced here.]
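As a stand-in for those screenshots, here is a minimal sketch of the pattern this unit walks through (all data illustrative): create a Dataset from arrays, shuffle and batch it, then iterate over the batches.

import numpy as np
import tensorflow as tf

# Step 1: create a Dataset from in-memory arrays
X = np.arange(12, dtype=np.float32).reshape(6, 2)
y = np.array([0, 1, 0, 1, 0, 1])
ds = tf.data.Dataset.from_tensor_slices((X, y))

# Step 2: shuffle and group the samples into batches
ds = ds.shuffle(buffer_size=6).batch(3)

# Step 3: iterate over the batches (one epoch)
for batch_x, batch_y in ds:
    print(batch_x.shape, batch_y.numpy())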
2.3. Machine Learning with TensorFlow
Keras
A high-level API that allows users to build, train, evaluate, and execute all kinds of neural networks.
$Y = b + w_1 X_1 + w_2 X_2 + \cdots + w_k X_k + \epsilon$

‣ The coefficients previously denoted as $\beta_i$ will be called "weights" $w_i$ from now on.
‣ The intercept previously denoted as $\beta_0$ will be called the "bias" $b$ from now on.
‣ We will use the “linear” activation function.
[Figure: linear regression as a single neuron: inputs X1, …, Xk with weights w1, …, wk and bias b are summed and passed through a linear activation to produce the output Y, with noise ε]
‣ The disagreement between the predicted ŷ and the true y has to be minimized by training.
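This single-neuron diagram translates directly into Keras. A minimal sketch on synthetic data (layer sizes and hyperparameters are illustrative, not the exact settings used later in this unit):

import numpy as np
import tensorflow as tf

# Synthetic data: y = 2*x1 - 3*x2 + 1 + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)).astype(np.float32)
y = 2 * X[:, 0] - 3 * X[:, 1] + 1 + 0.1 * rng.normal(size=200).astype(np.float32)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, activation='linear', input_shape=(2,))
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1), loss='mse')
model.fit(X, y, epochs=100, verbose=0)

print(model.get_weights())   # weights approach [2, -3] and the bias approaches 1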
$S = b + w_1 X_1 + w_2 X_2 + \cdots + w_k X_k$

‣ The conditional probability of $Y$ being equal to 1 is denoted as $P(Y=1|\{x_i\})$.
‣ A "Sigmoid" function connects the probability with the logit $S$, crossing 0.5 at $S = 0$:

$\sigma(S) = \dfrac{e^S}{1 + e^S}$

‣ This Sigmoid is the "activation function," which is the biggest difference from linear regression.
[Figure: logistic regression as a single neuron: inputs X1, …, Xk with weights w1, …, wk and bias b produce the sum S, which the sigmoid activation converts to P(Y=1|data) and then to the predicted binary label ŷ (0, 0, 1, 1, 0, …)]
‣ The mismatch between the predicted ŷ and the true y has to be minimized by training.
$L = -\dfrac{1}{n}\sum_{i=1}^{n}\left[\, y_i \log(\hat{p}_i) + (1 - y_i)\log(1 - \hat{p}_i) \,\right], \qquad y_i \in \{0, 1\}$

‣ This follows from the definition of the likelihood of the observed 0-or-1 labels.
‣ This is the loss function we'd like to minimize with the gradient descent algorithm.
‣ Binary logistic regression can be generalized to the multi-class version using the Softmax activation
and multi-class cross entropy as the loss function.
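The binary loss itself can be checked by hand; a sketch computing the binary cross-entropy of a few predictions in NumPy (values illustrative):

import numpy as np

def binary_cross_entropy(y, p_hat, eps=1e-12):
    # L = -(1/n) * sum( y*log(p) + (1 - y)*log(1 - p) ); eps guards against log(0)
    p_hat = np.clip(p_hat, eps, 1 - eps)
    return -np.mean(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))

y = np.array([1, 0, 1, 1])
p_hat = np.array([0.9, 0.2, 0.8, 0.6])
print(binary_cross_entropy(y, p_hat))   # small when the predictions match the labels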
1) The parameters (b, W) are randomly initialized.
2) Calculate the gradient ∇L.
3) Update by one step: (b, W) ← (b, W) − η ∇L.
Convergence speed is controlled by the "learning rate" η.
4) Repeat from step 2) a fixed number of times (epochs).
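Written out for a single parameter, the loop looks like this; a sketch minimizing a simple quadratic loss (the learning rate and epoch count are illustrative):

import numpy as np

def loss(w):
    return (w - 3.0) ** 2          # minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)         # dL/dw

w = np.random.default_rng(0).normal()   # 1) random initialization
eta = 0.1                               # learning rate
for epoch in range(50):                 # 4) repeat a fixed number of times
    g = grad(w)                         # 2) calculate the gradient
    w = w - eta * g                     # 3) update by one step
print(w)   # close to 3.0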
[Figure: the loss surface L(b, W), with gradient descent moving toward the optimized minimum]
‣ In TensorFlow 1.x, training proceeded by initializing variables with global_variables_initializer(), creating a Session instance, and sending data to the placeholders to train the model.
‣ One-hot encoding: the label 5 becomes a vector with a single 1 at position X5.

X0  X1  X2  X3  X4  X5  X6  X7  X8  X9
0   0   0   0   0   1   0   0   0   0
Unit 3.
3.1. Main Features of TensorFlow
[The code examples for this unit were shown as screenshots and are not reproduced here.]
3.2. Keras Basics
The most popular Keras backend has been TensorFlow, which was also its default backend. As more and more TensorFlow users worked with Keras for its high-level API, the TensorFlow developers decided to include the Keras project as a separate module (tf.keras). In TensorFlow 2.0, Keras and tf.keras are synchronized.
Specify the batch learning method and start the training (gradient descent).
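In Keras this is the compile/fit pair: compile fixes the loss, the optimizer, and the metrics (see the tables that follow), and fit specifies the batch learning method. A minimal sketch with illustrative data and settings:

import numpy as np
import tensorflow as tf

# Illustrative binary-classification data
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 4)).astype(np.float32)
y_train = (X_train.sum(axis=1) > 0).astype(np.float32)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
# compile: choose the loss, the optimizer, and the metrics to monitor
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
              loss='binary_crossentropy', metrics=['accuracy'])
# fit: specify the batch learning method and start the gradient descent
history = model.fit(X_train, y_train, batch_size=16, epochs=30,
                    validation_split=0.2, verbose=0)
print(history.history['accuracy'][-1])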
Loss                        Keras keyword
Squared Error (L2 Error)    loss = "mse"
Absolute Error (L1 Error)   loss = "mae"
Binary Cross Entropy        loss = "binary_crossentropy"
Multi-Class Cross Entropy   loss = "categorical_crossentropy"
Optimizer                     Description                                      Keras example
Stochastic Gradient Descent   Gradient descent algorithm with batch learning   SGD(lr=0.1)
Activation functions

Activation   Keras keyword
Linear       activation="linear"
Sigmoid      activation="sigmoid"
Tanh         activation="tanh"
ReLU         activation="relu"
Softmax      activation="softmax"
Layers

Keras layer   Explanation
Dense() Dense layer
Conv1D() 1D convolution layer
Conv2D() 2D convolution layer
MaxPooling1D() 1D max pooling layer
MaxPooling2D() 2D max pooling layer
Dropout() Applies dropout to the input
Flatten() Flattens the input
Embedding() Turns positive integers (label encoding) into dense vectors
SimpleRNN() RNN where the output is fed back to input
LSTM() LSTM layer
TimeDistributed() A wrapper that applies a layer to every time slice
Input() The first layer for a functional API model
Name             Description
cifar10          Color images labeled over 10 categories; 50,000 training and 10,000 testing images.
cifar100         Color images labeled over 100 categories; 50,000 training and 10,000 testing images.
imdb             25,000 movie reviews from IMDB with binary labels (positive or negative).
reuters          11,228 newswires from Reuters, labeled over 46 topics.
mnist            Grayscale images of handwritten digits; 60,000 training and 10,000 testing images.
fashion_mnist    Grayscale images of 10 fashion categories; 60,000 training and 10,000 testing images.
boston_housing   13 attributes of houses at different locations around the Boston suburbs in the late 1970s.
3.3. AI with Keras
Linear/Logistic regression with Keras
[Figure: a single neuron: inputs X1, …, Xk with weights w1, …, wk and bias b are summed and passed through a linear or sigmoid activation to produce the output]
‣ The major difference between linear and logistic regression lies in the choice of activation function.
Line 91
• Add an output layer for linear regression.
Line 92
• Summary of the model
Line 93
• Hyperparameters
Line 94
• Define the optimizer and then compile.
Line 95
• verbose = 0 means no output. verbose = 1 to view the epochs.
Line 96
• View the keys.
Line 97-1
• Skip the first few steps.
Training History
[Figure: training history: MSE vs. epoch (0 to 1750) for the train and validation sets]
Testing
‣ Predict and test using a formula.
Line 100
• Returns the loss value (index 0) and the metric value (index 1).
Line 102-1
• Input layer
Line 102-2
• Output layer
Line 105
• Loss = MAE (L1) and Metrics = MSE (L2)
Training History
[Figure: training history: MSE vs. epoch (0 to 1750) for the train and validation sets]
Line 109
• Returns the loss value (index 0) and the metric value (index 1).
‣ View as DataFrame
Line 123
• Add layers to a Sequential object.
• units = number of output variables
Line 124
• Summary of the model
Line 128
• verbose = 0 means no output; verbose = 1 to view the epochs.
Line 129
• View the keys.
[Figure: CNN architecture: convolution and pooling layers repeated several times, followed by a dropout layer and fully connected layers; the softmax output gives class probabilities such as 0.02, 0.95, 0.03, 0, …]
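A minimal Keras sketch of this architecture for 28×28 grayscale inputs; the filter counts and sizes are illustrative, since the exact values appear only in the code screenshots:

import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),                   # pooling layer
    layers.Conv2D(64, (3, 3), activation='relu'),  # conv + pooling repeated
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dropout(0.5),                           # dropout layer
    layers.Dense(128, activation='relu'),          # fully connected layer
    layers.Dense(10, activation='softmax'),        # class probabilities (0.02, 0.95, ...)
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()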
Line 135-1
• You may change this at will.
‣ Reshaping
Line 140-1
• 1 channel of grayscale
Line 144
• verbose = 0 means no output; verbose = 1 to view the epochs.
Training History
[Figure: training history: accuracy vs. epoch (rising from about 0.94 to 0.99) for the train and validation sets]
Testing
‣ Reshaping
‣ One-hot-encoding
Line 155
• 3 channels of color
Line 159
• verbose = 0 means no output; verbose = 1 to view the epochs.
Training History
[Figure: training history: accuracy vs. epoch (rising from about 0.4 to 0.8) for the train and validation sets]
Testing
[Figure: an RNN unrolled over time: each input X_t updates the hidden state h_t, which produces the output Y_t and is passed forward to the next time step]
[Figure: sequence-to-sequence mapping: each input X_t produces an output Y_t]
[Figure: the raw time series]
[Figure: the time series (Data) with the fitted curve (Fit)]
[Figure: the time series rescaled to the range 0 to 1]
Necessary definitions
‣ Hyperparameters
Line 170-2
• There is only one time series, so there is no choice but 1.
Necessary definitions
‣ Hyperparameters
Line 171-1
• Scalar input
Line 171-2
• Number of neurons per layer
Line 171-3
• Scalar output
Line 172
• return_sequences=True means "sequence to sequence": an output is produced at every time step.
• input_shape=(None, n_inputs): the time series can have variable length.
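A sketch of a sequence-to-sequence SimpleRNN model matching these settings; n_inputs = n_outputs = 1 for a scalar series, while the neuron count is illustrative:

import tensorflow as tf
from tensorflow.keras import layers

n_inputs, n_neurons, n_outputs = 1, 32, 1

model = tf.keras.Sequential([
    # return_sequences=True: emit an output at every time step (sequence to sequence)
    # input_shape=(None, n_inputs): the time dimension is left variable
    layers.SimpleRNN(n_neurons, return_sequences=True,
                     input_shape=(None, n_inputs)),
    # apply the same Dense output layer to every time slice
    layers.TimeDistributed(layers.Dense(n_outputs)),
])
model.compile(optimizer='adam', loss='mse')
model.summary()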
Line 176
• No validation.
• CAUTION: y is X shifted by +1.
Training History
[Figure: training history: MSE vs. epoch (falling from about 1.4 toward 0) for the train set]
Line 180-1
• Seed length
Line 180-2
• Prediction length
Line 181-3
• Reshape.
Line 181-5
• The last output is the predicted y.
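The prediction step can be sketched as a loop: start from a seed window, predict the next value, append it, and repeat. This assumes the trained sequence-to-sequence model from the sketch above; the seed series and the lengths are illustrative.

import numpy as np

seed_len, pred_len = 20, 30
window = list(np.sin(0.3 * np.arange(seed_len)))   # illustrative seed sequence

for _ in range(pred_len):
    x = np.array(window).reshape(1, -1, 1)     # reshape to (batch, time, features)
    y_seq = model.predict(x, verbose=0)        # sequence-to-sequence output
    window.append(float(y_seq[0, -1, 0]))      # the last output is the predicted y

predicted = window[seed_len:]                  # the newly generated values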
[Figure: the time series (Data) with the model's iterative predictions (Fit) extended beyond the training range]
End of Document
ⓒ2022 SAMSUNG. All rights reserved.
Samsung Electronics Corporate Citizenship Office holds the copyright of this book.
This book is a literary property protected by copyright law, so reprint and reproduction without permission are prohibited.
To use this book other than in the curriculum of Samsung Innovation Campus, or to use all or part of this book, you must receive written consent from the copyright holder.