
EC 9170
Deep Learning for Electrical & Computer Engineers

Lecture 01: Deep feedforward networks
26th February 2024

Faculty of Engineering, University of Jaffna
• Are Artificial Intelligence, Machine Learning, and Deep Learning the same thing?
• Artificial Intelligence: the ability of a machine to imitate human intelligence
• Machine Learning: algorithms that incorporate intelligence into a machine by automatically learning from data
• Deep Learning: algorithms that mimic the human brain to incorporate intelligence into a machine
Artificial Intelligence
• AI is any technique, code or algorithm that enables machines to develop, demonstrate and mimic human cognitive behaviour or intelligence, hence the name "Artificial Intelligence."
• AI doesn't mean that machines will do everything. Rather, AI is better represented as "Augmented Intelligence", i.e. Man + Machine, to solve business problems better and faster.
• AI won't replace managers, but managers who use AI will replace those who don't.
• Some of the most successful applications of AI around us can be seen in Robotics, Computer Vision, Virtual Reality, Speech Recognition, Automation, Gaming and so on.
Machine learning
• Machine learning is the sub-field of AI which gives machines the ability to improve their performance over time without explicit intervention or help from a human being.
• In this approach machines are shown thousands or millions of examples and trained how to correctly solve a problem.
• Most of the current applications of machine learning leverage supervised learning.
• Other uses of ML can be broadly classified between unsupervised learning and reinforcement learning.
Deep learning
• Deep learning is a subfield of Machine Learning that very closely tries to mimic the working of the human brain using neurons.
Deep learning
• These techniques focus on building Artificial Neural Networks (ANN)
using several hidden layers.
• There are a variety of deep learning networks such as the Multilayer Perceptron (MLP), Autoencoders (AE), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Deep Feedforward Networks, etc.
Why Deep learning?
Why Deep learning is growing?
• Processing power needed for deep learning is readily becoming available through GPUs, distributed computing and powerful CPUs.
• Moreover, deep learning models seem to outperform machine
learning models as the data grows.
• Explosion of features and datasets
• Focus on customisation and real-time decisions
• Uncover hard to detect patterns (using traditional techniques)
when the incidence rate is low
Why Deep learning is growing?
• Find latent features (super variables) without significant manual
feature engineering
• Real-time fraud detection and self-learning models using streaming
data (KAFKA, MapR)
• Ensure consistent customer experience and regulatory compliance
• Higher operational efficiency
Challenges with Deep learning
• Data Quality and Quantity: obtaining high-quality labeled data can be expensive and time-consuming. Additionally, the quality of the data can significantly impact the performance and robustness of the models.
• Computational Resources: training deep models needs substantial computational resources, including powerful GPUs or even specialized hardware like TPUs (Tensor Processing Units). This can be a barrier for smaller organizations or researchers with limited access to such resources.
• Overfitting: models can overfit, especially when trained on limited data or when the model capacity is too high relative to the complexity of the problem. Techniques like dropout, regularization, and data augmentation are commonly employed to mitigate this issue.
• Interpretability: Deep learning models are often considered "black boxes" due to their
complexity, making it challenging to understand how they arrive at a particular prediction. This
lack of interpretability can be problematic, especially in critical applications like healthcare or
finance, where understanding the reasoning behind a decision is crucial.
Build a Neural Network
• Neural Network: A computational model that works in a similar
way to the neurons in the human brain.
• Biological neurons are organized in a vast network of billions of neurons.
• Each neuron typically is connected to thousands of other neurons.
Build a Neural Network
• A biological neuron is composed of a Cell body, many dendrites (branching
extensions), one axon (long extension), synapses
• Biological neurons receive signals from other neurons via these synapses.

• When a neuron receives a sufficient number of signals within a few milliseconds, it fires its own signals.
• Comparison between biological neuron and artificial neuron
Neural Network
• A neural network consists of a large number of highly interconnected neurons.
• Each neuron takes an input, performs some operations, then passes the output to the following neuron.

Two-Layer Neural Network


Key Components
1. Layers
• Input layer: contains artificial neurons which receive the input data, which could be raw data (e.g., pixel values of an image). The number of input-layer neurons depends on the number of features.
• Output layer: the final layer in the neural network; contains artificial neurons that are responsible for producing the model's predictions or outputs. The number of output-layer neurons depends on the number of outputs.
• Hidden layers: layers of neurons that perform computations and transformations on the input data. They are called "hidden" because they are not directly observable as inputs or outputs of the system. Instead, they serve as intermediate layers between the input and output layers, capturing complex patterns and features in the data.
More neurons = More calculation = More time
Key Components Cont…
2. Neurons - the basic unit of a neural network. A neuron takes inputs from other neurons and gives the corresponding output. In the simplest model, the inputs and output can only be a binary number, i.e. 0 or 1.
3. Weights - the connection between every pair of neurons, representing the importance given to each input in computing the output. Typically chosen randomly in the first run and optimized using backward propagation.
Key Components Cont…
4. Activation Function - a function used to generate a neuron's output from the matrix multiplication of inputs and weights, along with a bias.
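As an illustration (not from the slides), a single neuron's activation can be sketched with NumPy; the sigmoid is one common choice of activation, and the inputs, weights, and bias below are made-up values:

```python
import numpy as np

def sigmoid(z):
    # Squash any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical inputs, weights, and bias for a single neuron
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, 0.1])
b = 0.1

z = np.dot(w, x) + b    # weighted sum of inputs plus bias
output = sigmoid(z)     # activation function generates the output
print(round(output, 2))
```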
Key Components Cont…
Ø Neural Network Notation
Key Components Cont…
5. Forward Propagation - weights for each input are initialized to make predictions and compute the error. The output from each layer is fed forward to the next layer.
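A minimal sketch of feeding an input forward through two layers (the layer sizes, random initialization scale, and sigmoid activation are illustrative assumptions, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

x = np.array([1.0, 0.5])              # input vector with 2 features
W1 = rng.normal(0, 0.1, size=(3, 2))  # hidden layer: 3 neurons, random weights
b1 = np.zeros(3)
W2 = rng.normal(0, 0.1, size=(1, 3))  # output layer: 1 neuron
b2 = np.zeros(1)

h = sigmoid(W1 @ x + b1)       # hidden-layer output, fed forward...
y_hat = sigmoid(W2 @ h + b2)   # ...into the output layer's prediction
print(y_hat.shape)
```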
Key Components Cont…
6. Loss Function - computes the error between actual and predicted values and measures the model's performance. Hyperparameters are fine-tuned to minimize the loss function. Some common loss functions are Mean Square Error, Log Loss, and Cross Entropy.
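Two of the loss functions named above can be sketched directly (the labels and predictions here are made-up example values):

```python
import numpy as np

def mean_square_error(y, y_hat):
    # Average squared difference between actual and predicted values
    return np.mean((y - y_hat) ** 2)

def cross_entropy(y, y_hat):
    # Binary cross-entropy (log loss); clip predictions to avoid log(0)
    y_hat = np.clip(y_hat, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = np.array([1.0, 0.0, 1.0])      # actual labels
y_hat = np.array([0.9, 0.2, 0.8])  # model predictions
print(mean_square_error(y, y_hat))
print(cross_entropy(y, y_hat))
```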
A Simple Artificial Neural Network
• One or more binary inputs and one binary output
• Activates its output when more than a certain number of its inputs are
active.
Ø Linear Threshold Unit (LTU)
• Inputs of an LTU are numbers (not binary).
• Each input connection is associated with a weight.
• Computes a weighted sum of its inputs and applies a step function to that sum.
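The LTU computation above can be sketched as a weighted sum followed by a step function (the weights and threshold are made-up values for illustration):

```python
import numpy as np

def ltu(x, w, threshold=0.0):
    # Weighted sum of numeric inputs, then a step function on the result
    z = np.dot(w, x)
    return 1 if z >= threshold else 0

w = np.array([0.5, 0.5])
print(ltu(np.array([1.0, 1.0]), w))   # weighted sum above threshold: fires
print(ltu(np.array([-1.0, 0.2]), w))  # weighted sum below threshold: silent
```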
Ø Perceptron
• The perceptron is a single layer of LTUs.
• The input neurons output whatever input they are fed.
• A bias neuron, which just outputs 1 all the time.
• If we use logistic function (sigmoid) instead of a step function, it
computes a continuous output.
Ø How is a Perceptron Trained?
• For an LTU to give an output it needs to know the values of the
weights w1, w2… wn.
• The Perceptron training algorithm is inspired by Hebb's rule.
• When a biological neuron frequently triggers another neuron, the connection between these two neurons grows stronger.
• Feed one training instance x at a time to each neuron j and make its prediction ŷ.
• Update the connection weights.
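The weight update in the classic Perceptron rule is w ← w + η(y − ŷ)x; a sketch on a tiny made-up dataset (the OR function, with an illustrative learning rate and epoch count):

```python
import numpy as np

def step(z):
    return np.where(z >= 0, 1, 0)

# Tiny linearly separable dataset: the OR function (made-up choice)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1])

w = np.zeros(2)
b = 0.0
eta = 0.1  # learning rate

for epoch in range(20):
    for xi, yi in zip(X, y):
        y_hat = step(np.dot(w, xi) + b)
        # Hebb-inspired rule: strengthen weights in proportion to the error
        w += eta * (yi - y_hat) * xi
        b += eta * (yi - y_hat)

predictions = step(X @ w + b)
print(predictions)
```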
Ø Perceptron in Keras
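The slide's Keras code did not survive extraction; a hedged reconstruction of a perceptron-style model in Keras might look like the following (one Dense unit, with a sigmoid standing in for the step function since Keras has no step activation; layer sizes are assumptions):

```python
import numpy as np
from tensorflow import keras

# A single layer with a single unit: a perceptron-style model
model = keras.Sequential([
    keras.layers.Input(shape=(2,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="sgd", loss="binary_crossentropy")

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_hat = model.predict(X, verbose=0)
print(y_hat.shape)  # one probability per input row
```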
Multi-Layer Perceptron (MLP)
Perceptron Weakness
Incapable of solving some trivial problems, e.g., XOR classification problem. Why?
Multi-Layer Perceptron (MLP)
• The limitations of Perceptrons can be eliminated by stacking multiple
Perceptrons.
• The resulting network is called a Multi-Layer Perceptron (MLP) or deep
feedforward neural network.
• A feedforward neural network is composed of:
• One input layer
• One or more hidden layers
• One final output layer
Every layer except the output layer includes
a bias neuron and is fully connected to the
next layer
Ø How Does it Work?
• The model is associated with a directed acyclic graph describing how the functions are composed together.
• E.g., assume a network with just a single neuron in each layer.
Ø XOR with Feedforward Neural Network
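The XOR construction can be checked numerically with hand-picked weights (one of many possible choices, not the slide's specific values): one hidden unit computes OR, another computes AND, and the output unit fires when OR is active but AND is not.

```python
import numpy as np

def step(z):
    return (z >= 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden layer: first unit behaves like OR, second like AND
W1 = np.array([[1, 1],
               [1, 1]])
b1 = np.array([-0.5, -1.5])
# Output unit: fires when OR is on and AND is off
W2 = np.array([1, -2])
b2 = -0.5

H = step(X @ W1.T + b1)   # hidden-layer activations
y = step(H @ W2 + b2)     # XOR outputs for each input row
print(y)
```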
Ø How to Learn Model Parameters W?
Feedforward Neural Network - Cost Function

• We use the cross-entropy (minimizing the negative log-likelihood) between the training data y and the model's predictions ŷ as the cost function.

Ø Gradient-Based Learning
• The most significant difference between the linear models we have seen so
far and feedforward neural network?
• The non-linearity of a neural network causes its cost functions to become non-
convex
Ø Gradient-Based Learning Cont…
• Linear models, with convex cost function, guarantee to find global
minimum.
• Convex optimization converges starting from any initial parameters.

• Stochastic gradient descent applied to non-convex cost functions has no such convergence guarantee.
• It is sensitive to the values of the initial parameters.
• For feedforward neural networks, it is important to initialize all weights to
small random values.
• The biases may be initialized to zero or to small positive values.
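The initialization advice above can be sketched as follows (the layer sizes and the 0.01 scale are illustrative choices, not prescribed by the slides):

```python
import numpy as np

rng = np.random.default_rng(42)
layer_sizes = [2, 4, 1]  # input, hidden, output (made-up architecture)

weights, biases = [], []
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    # Small random weights break the symmetry between neurons in a layer
    weights.append(rng.normal(0.0, 0.01, size=(n_out, n_in)))
    # Biases may start at zero (or at small positive values)
    biases.append(np.zeros(n_out))

print([W.shape for W in weights])
```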
Training Feedforward Neural Networks
Training Feedforward Neural Networks Cont…
Hidden Units
Feedforward Network in Keras
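The slide's code did not survive extraction; a hedged sketch of a small feedforward network in Keras, trained on the XOR problem discussed earlier (the hidden-layer size, activation, optimizer, and epoch count are all illustrative assumptions):

```python
import numpy as np
from tensorflow import keras

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)  # XOR labels

model = keras.Sequential([
    keras.layers.Input(shape=(2,)),
    keras.layers.Dense(8, activation="tanh"),    # hidden layer
    keras.layers.Dense(1, activation="sigmoid"), # output layer
])
model.compile(optimizer=keras.optimizers.Adam(0.1),
              loss="binary_crossentropy")
model.fit(X, y, epochs=200, verbose=0)

# Threshold the sigmoid outputs to get class predictions
preds = (model.predict(X, verbose=0) > 0.5).astype(int).ravel()
print(preds)
```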
