NN 01
Agenda
Course Outlines
Overview of Neural Networks
Biological and Artificial Neuron Model
Definition of Neural Networks
Applications of Neural Networks
Artificial Neuron Structures
Course Outlines
1. Introduction to NNs
2. Main characteristics of Neural Networks
3. Rosenblatt’s Perceptron: Single-Layer Network
4. Least Mean Square algorithm for Single Layer Network
5. Multilayer Perceptron (MLP) Network
6. Optimization of Back-Propagation Algorithm
7. Deep Learning
8. Convolutional Neural Networks (CNNs)
9. Regularization and CNNs
10. YOLO for Object Detection
11. Fully CNNs and U-Net for Image Segmentation, Generative Models
Textbooks
Haykin, S. Neural Networks and Learning Machines, 3rd ed., Prentice Hall (Pearson), 2009.
Goodfellow, I., Bengio, Y., and Courville, A. Deep Learning, MIT Press, Cambridge, 2016.
Aggarwal, C. C. Neural Networks and Deep Learning, Springer, 2018.
Course Assessment
Assessment
◦ Homework, Quizzes, Computer Assignments, and Project (35 Points)
◦ Midterm Exam (15 Points)
◦ Final Exam (50 Points)
Course Materials
All lectures and labs will be uploaded to:
Neural Network & Deep Learning
Lecture 1:
Introduction to ANNs
Lecture Objectives
After studying this lecture, the student will be able to:
– Summarize the difference between the Biological and the Artificial Neuron.
Agenda
Course Outlines
Where is NN?
What is a Neural Network?
Neural networks replicate the way humans learn, inspired by how the neurons in our brains
fire, only much simpler.
The human brain can give the correct response (output) for each input from its environment.
Researchers therefore tried to ‘train’ a neural black box to ‘learn’ the correct output
for each of the training samples.
The Human Nervous System
The human nervous system may be viewed as a three-stage system:
◦ The brain, represented by the neural (nerve) net, is central to the system. It continually
receives information, perceives it, and makes appropriate decisions.
◦ The receptors convert stimuli from the human body or the external environment into
electrical impulses that convey information to the brain.
◦ The effectors convert electrical impulses generated by the brain into discernible responses
as system outputs.
Human brain
The human brain computes in a different way from a digital computer:
The brain is a highly complex, nonlinear, and parallel computing system.
It is characterized by:
Robustness and fault tolerance - it is always able to respond,
and small changes in the input do not normally cause a change in
the output.
Flexibility - it can adjust to a new environment by learning.
It can deal with probabilistic, noisy, or inconsistent information.
It is small and compact and requires less power than a digital
computer.
Human brain, cont.
The brain is slower than a digital computer at mathematical computation;
however, the brain is many times faster than a digital computer at:
vision, pattern recognition, perception, and motor control.
Humans use roughly 1% calculation and 99% understanding,
based on patterns, drawing information from experience.
Machines are the opposite: 99% calculation and 1% understanding,
though this understanding is growing.
Basic element in a biological brain
A neuron is the basic element in a biological brain
There are approximately 100,000,000,000 neurons in a
human brain.
Each neuron is connected to approximately 10,000
other neurons.
Each of these neurons is relatively simple in design.
Agenda
Course Outlines
Overview of Neural Networks
How does a bio-neuron work?
Synaptic activity
Electrical signals (impulses) come into the dendrites
through the synapses.
An electrical signal causes a change in the synaptic potential
and the release of transmitter chemicals.
These chemicals can have an excitatory effect on the
receiving neuron (making it more likely to fire) or an
inhibitory effect (making it less likely to fire).
[Figure: a physical (biological) neuron alongside an artificial neuron]
Translate from Biological Neuron to Artificial Neuron
Network of Neurons
The human brain is composed of many “neurons” that cooperate to achieve the
desired objective, that is, to give the desired output for a specific input.
Agenda
Course Outlines
Overview of Neural Networks
Biological and Artificial Neuron Model
What is a Neural Network?, cont.
It is a massively parallel distributed processor (formal definition in the textbook):
◦ Made up of simple processing units (neurons).
◦ These simple processing units have a natural propensity for storing experiential
knowledge and making it available for use.
It resembles the brain in two respects:
◦ Knowledge is acquired from the environment through a learning process.
◦ Interneuron connection strengths, known as synaptic weights, are used to store
the acquired knowledge.
Properties and Capabilities of Neural Networks
1. Nonlinearity. An artificial neuron can be linear or nonlinear. A neural network, made up of an
interconnection of nonlinear neurons, is itself nonlinear.
2. Input-Output Mapping
The network is built by learning from examples in order to minimize the difference between
the desired response and the actual response.
3. Adaptivity
NNs have a built-in capability to adapt their synaptic weights to changes in the
surrounding environment, i.e., the network can easily be retrained to deal with minor changes
in the operating environment.
In a non-stationary environment, a NN can be designed to change its synaptic weights in
real time.
Agenda
Course Outlines
Overview of Neural Networks
Biological and Artificial Neuron Model
Definition of Neural Networks
Problems Commonly Solved With Neural Networks
There are many different problems that can be solved with a neural network.
However, neural networks are commonly used to address particular types of problems.
The following types of problem are frequently solved with neural networks:
◦ Regression - Approximate an unknown function
◦ Classification
◦ Pattern recognition
◦ Prediction
◦ Optimization
◦ Clustering
Regression (1/3)
A computer program is required to predict a numerical value given some input.
Example: for every house, I know its price and its area:
Area (m2):     80        90        100       150       200
House price:   100,000   130,000   200,000   500,000   1,000,000
Then someone asks: there is a house with a specific area, what is its price?
If we follow rule-based or search methods, the algorithm will fail because the new
area (and hence its price) is not stored.
As a result, we need curve fitting or interpolation to get the value of the house price.
The output of this problem is a continuous variable.
Question: can the face detection problem be categorized under regression?
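A minimal sketch of this curve-fitting idea in Python (illustrative only; the polynomial degree and NumPy routines are my choice, not part of the lecture):

```python
import numpy as np

# Training samples taken from the table above: area (m^2) -> house price.
areas = np.array([80, 90, 100, 150, 200], dtype=float)
prices = np.array([100_000, 130_000, 200_000, 500_000, 1_000_000], dtype=float)

# Fit a simple curve (a 2nd-degree polynomial here) to approximate the unknown
# function price = f(area), then query it for an area not present in the table.
predict_price = np.poly1d(np.polyfit(areas, prices, deg=2))

new_area = 120.0  # a hypothetical area that never appeared in the data
print(f"Estimated price for {new_area:.0f} m^2: {predict_price(new_area):,.0f}")
```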
Regression (2/3)
Another example: job salaries. We have a very large database that tells us the
domain, location, and grade of every job:
Job salary:   10X   20X   30X   35X   50X
Domain:       X     Y     X     Z     Z
Location:     CAI   NY    PAR   NY    CAI
Grade:        B     A     A     B     A
If anybody searches for a new job, even if this job does not exist in our DB, the
system can predict the output by using regression.
In all cases, the output is a continuous variable.
If we look at the problem mathematically, we will find that it goes by many
names: curve fitting, regression; in general, it is function approximation.
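A rough sketch of how such categorical job attributes could feed a regression (my own illustration, not the lecture's method: one-hot encoding plus a least-squares linear model; salaries are kept in units of X):

```python
import numpy as np

# Jobs from the table: (domain, location, grade) -> salary, in units of X (10X -> 10).
samples = [
    (("X", "CAI", "B"), 10),
    (("Y", "NY",  "A"), 20),
    (("X", "PAR", "A"), 30),
    (("Z", "NY",  "B"), 35),
    (("Z", "CAI", "A"), 50),
]
domains, locations, grades = ["X", "Y", "Z"], ["CAI", "NY", "PAR"], ["A", "B"]

def encode(domain, location, grade):
    """One-hot encode the categorical attributes so the regressor sees numbers."""
    vec = [1.0]  # constant/bias term
    vec += [1.0 if domain == d else 0.0 for d in domains]
    vec += [1.0 if location == loc else 0.0 for loc in locations]
    vec += [1.0 if grade == g else 0.0 for g in grades]
    return vec

A = np.array([encode(*features) for features, _ in samples])
y = np.array([salary for _, salary in samples], dtype=float)

# Least-squares linear regression: weights w minimizing ||A w - y||^2.
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict the salary of a job description that does not exist in the database.
print("Predicted salary (in X units):",
      round(float(np.dot(encode("Y", "CAI", "B"), w)), 1))
```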
Classification or Recognition (1/2)
Here the output variable turns from a continuous variable into a discrete
variable.
Instead of predicting the house price, we need to predict the house category:
instead of returning numbers for prices, we will use a low category, an economical
category, a high category, or a premium/high-end category.
The difference is that the problem is converted from function approximation into a
classification problem.
We are still trying to estimate a function, but in this case we are trying to get
something called a decision boundary instead of a curve.
In the case of binary classification, the boundary will separate the two classes so
that no example of class 1 lies in the class 2 region.
One of the names of classification is discrimination, or learning a discriminative function.
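A tiny sketch of the idea that classification replaces the fitted curve with decision boundaries (the thresholds and category names below are made up for illustration):

```python
import numpy as np

# Take a continuous price estimate (e.g., from the regression sketch) and map it
# to a discrete category using decision boundaries (thresholds on the price axis).
boundaries = [150_000, 400_000, 800_000]           # illustrative thresholds only
labels = ["low", "economical", "high", "premium"]

def categorize(price: float) -> str:
    """Return the category whose region the price falls into."""
    return labels[int(np.searchsorted(boundaries, price))]

for price in (120_000, 350_000, 950_000):
    print(price, "->", categorize(price))
```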
Classification
Classification is the process of classifying input into groups.
For example, an insurance company may
want to classify insurance applications into different risk categories,
or an online organization may want its email system to classify incoming mail into
groups.
Often, the neural network is trained by presenting it with a sample group of
data and instructions as to which group each data element belongs.
This allows the neural network to learn the characteristics that may indicate
group membership.
Pattern Recognition
Pattern recognition is one of the most common uses for neural networks.
Pattern recognition is a form of classification.
Pattern recognition is simply the ability to recognize a pattern. The pattern must be
recognized even when it is distorted.
In general, pattern recognition is the problem of classifying given patterns into several
classes:
Character recognition
Speech recognition
Face detection/recognition
Pattern recognition is the basis for creating machines that can learn and think.
Application: Face detection
Face detection is an example of pattern classification or pattern
recognition
Face detection (“face” or “non-face”)
Face detection searches for a face in a given image.
The image can be a still picture taken by a digital camera, or moving
pictures captured by a video camera.
This problem is important for many security-related systems,
internet-based media search, etc.
The problem is highly non-linear.
Prediction
Prediction = estimating the future value(s) in a time series
Given a time-based series of input data, a neural network will predict future values.
The accuracy of the prediction will depend upon many factors, such as the quantity
and relevancy of the input data.
For example, neural networks are commonly applied to problems involving
predicting movements in financial markets.
Example: Given stock values observed in the past few days, guess if we should buy or
sell today
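A minimal sketch of the sliding-window formulation of prediction (the series is synthetic, and a least-squares linear predictor stands in for the neural network here):

```python
import numpy as np

# Toy time series (e.g., daily stock values); in practice this would be real data.
series = np.array([101, 103, 102, 105, 108, 107, 110, 113, 112, 115], dtype=float)

# Sliding-window formulation: use the last `window` values to predict the next one.
window = 3
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

# Fit a linear predictor by least squares (a stand-in for training a network).
X_b = np.hstack([X, np.ones((len(X), 1))])           # append a bias column
w, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# Predict the value that follows the most recent window.
next_value = np.append(series[-window:], 1.0) @ w
print("Predicted next value:", round(float(next_value), 2))
```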
Optimization
Optimization can be applied to many different problems for which an optimal solution is
sought.
The neural network may not always find the optimal solution; rather, it seeks to find an
acceptable solution.
Perhaps one of the most well-known optimization problems is the traveling salesman problem
(TSP).
Clustering
It is the process of grouping the data into classes (clusters) so that the data objects
(examples) are:
similar to one another within the same cluster
dissimilar to the objects in other clusters
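A small sketch of this grouping idea using plain k-means on synthetic 2-D points (k-means is one common clustering algorithm, used here only to illustrate "similar within, dissimilar between"; the data and k are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D points forming three loose groups (e.g., users mapped to a 2-D space).
points = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(30, 2)),
    rng.normal(loc=(5, 0), scale=0.5, size=(30, 2)),
    rng.normal(loc=(2, 4), scale=0.5, size=(30, 2)),
])

# Plain k-means: alternate assigning points to the nearest centroid
# and moving each centroid to the mean of its assigned points.
k = 3
centroids = points[[0, 30, 60]].copy()   # simple initialization for this sketch
for _ in range(20):
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    assignment = dists.argmin(axis=1)
    centroids = np.array([points[assignment == j].mean(axis=0) for j in range(k)])

print("Cluster centers:\n", centroids.round(2))
```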
Clustering, Application
An example:
All users can be categorized into several groups
according to their occupations (professors,
engineers, housewives, students, salesmen,
managers, lawyers, doctors, unemployed).
The data can be visualized by mapping it to
a 2-D space.
From this map, we can see whether a person is a
paying user or not.
Clustering Applications
Agenda
Course Outlines
Overview of Neural Networks
Biological and Artificial Neuron Model
Definition of Neural Networks
Applications of Neural Networks
Artificial Neuron
The neuron is the basic information-processing unit.
Three basic elements of the
neuron model:
1. A set of synapses, or connecting links
Each characterized by a weight or strength.
2. An adder
Sums the input signals, weighted by the respective synapses;
it is called a linear combiner.
3. An activation function
Also called a squashing function.
It squashes (limits) the output to some finite value.
Mathematical terms of Nonlinear Neuron Model
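In standard notation (a minimal summary following the Haykin textbook formulation; the symbols match those used on the next slides):

u_k = Σ_{j=1}^{m} w_kj x_j      (output of the linear combiner of neuron k)
v_k = u_k + b_k                 (activation potential after adding the bias b_k)
y_k = φ(v_k)                    (neuron output through the activation function φ)

where x_1, …, x_m are the input signals, w_k1, …, w_km are the synaptic weights of neuron k, b_k is the bias, and φ(·) is the activation function.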
Effect of adding a Bias
The use of the bias b_k has the effect of applying an
affine transformation to the linear combiner output u_k:
v_k = u_k + b_k
Depending on whether the bias b_k is positive or
negative, the relationship between
◦ the activation potential v_k of neuron k, and
◦ the linear combiner output u_k
is shifted up or down, so the graph of v_k versus u_k no longer passes through the origin.
Bias as extra input
In this figure, the effect of the bias is accounted for
by doing two things:
1. A Bias unit can be thought of
as a unit which always has an
input value of 1, x0=+1.
2. A bias value is exactly
equivalent to a weight wk0 on
an extra input line x0.
• So we may formulate v_k as follows:
v_k = Σ_{j=0}^{n} w_kj x_j      (with x_0 = +1 and w_k0 = b_k)
Example
The inputs (3, 1, 0, -2) are presented to a single neuron whose
weights are (0.3, -0.1, 2.1, -1.1).
i. What is the net input to the transfer function?
ii. What is the neuron output?
[Diagram: inputs x1 = 3, x2 = 1, x3 = 0, x4 = -2 with weights 0.3, -0.1, 2.1, -1.1 feeding the summing junction Σ, then the activation function f, giving output y]
Example
Solution:
i. The net input is the sum of the weighted inputs:
u_k = net_k = W^T X
            = [0.3  -0.1  2.1  -1.1] [3  1  0  -2]^T
            = 3(0.3) + 1(-0.1) + 0(2.1) + (-2)(-1.1)
            = 0.9 - 0.1 + 0 + 2.2
            = 3
ii. The output cannot be determined because the transfer function is not
specified. Which function should be used as an activation function?
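A quick numerical check of part (i) with NumPy (part (ii) is left open, as in the slide):

```python
import numpy as np

w = np.array([0.3, -0.1, 2.1, -1.1])   # synaptic weights
x = np.array([3.0, 1.0, 0.0, -2.0])    # inputs

net = w @ x                            # net input u_k = W^T X
print(net)                             # 0.9 - 0.1 + 0.0 + 2.2 = 3.0
```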
Thank you!
Next Lecture:
◦ Types of activation functions