
Omar Arif

[email protected]
National University of Sciences and Technology
 The Perceptron – Basic building block of neural networks
 Neural Networks – Stacking perceptrons to build neural networks
 Loss Minimization – Gradient descent
 Implementing ANNs – How to use libraries to implement neural networks
 Training ANNs
Building Block of Neural Network
 𝒙 = [𝑥0, 𝑥1, 𝑥2]ᵀ, 𝐰 = [𝑤0, 𝑤1, 𝑤2]ᵀ, where 𝑥0 = 1 so that 𝑤0 acts as the bias term

 ℎ𝒘(𝒙) = g(𝐰ᵀ𝒙), where g is a non-linearity (the activation function)


 The activation function introduces non-linearity into the network

 Sigmoid: 𝜎(𝑧) = 1 / (1 + 𝑒⁻ᶻ)

 Rectified Linear Unit: relu(𝑧) = max(𝑧, 0)

 Softplus: softplus(𝑧) = log(1 + 𝑒ᶻ)
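A minimal sketch of these pieces in Python (NumPy assumed; the example input and weights below are made up for illustration):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0)

def softplus(z):
    return np.log(1.0 + np.exp(z))

# Perceptron: h_w(x) = g(w^T x), with x[0] = 1 acting as the bias input
def perceptron(x, w, g=sigmoid):
    return g(np.dot(w, x))

x = np.array([1.0, 0.5, -0.5])   # x0 = 1 (bias), x1, x2
w = np.array([0.1, 0.8, -0.3])   # w0 (bias weight), w1, w2
print(perceptron(x, w))          # output of a single unit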


𝒙𝟏  𝒙𝟐  𝒉 (OR)
0   0   0
0   1   1
1   0   1
1   1   1

𝒙𝟏  𝒙𝟐  𝒉 (AND)
0   0   0
0   1   0
1   0   0
1   1   1
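These two tables are exactly what a single perceptron with a step activation can compute; the weights below are one possible choice (assumed for illustration, not taken from the slides):

import numpy as np

def step(z):
    return (z > 0).astype(float)

def perceptron(x1, x2, w):
    # input vector with x0 = 1 for the bias weight w0
    x = np.array([1.0, x1, x2])
    return step(np.dot(w, x))

w_or  = np.array([-0.5, 1.0, 1.0])   # fires when x1 + x2 > 0.5
w_and = np.array([-1.5, 1.0, 1.0])   # fires when x1 + x2 > 1.5

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, perceptron(x1, x2, w_or), perceptron(x1, x2, w_and))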
 Non-Linear Decision Boundary

𝒙𝟏  𝒙𝟐  𝒉 (XOR)
0   0   0
0   1   1
1   0   1
1   1   0

 Using the basic perceptron, we cannot approximate a non-linear function such as XOR
 Feature Engineering: use higher-order features such as 𝑥² or 𝑥³ to obtain a non-linear function
 Problem: we don't know which features to choose

 We would like to automate things and let the algorithm choose the features

 Neural networks allow us to automatically learn the representation/features of a linear classifier, geared towards the desired task, rather than specifying them all by hand
Building neural networks by stacking perceptrons

x1  x2  h₁ = AND(x1, x2)  h₂ = NOR(x1, x2)  h = OR(h₁, h₂)
0   0   0                 1                 1
0   1   0                 0                 0
1   0   0                 0                 0
1   1   1                 0                 1
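A minimal sketch of this stacking in Python (weights hand-picked for illustration): the hidden layer computes AND and NOR of the inputs and the output unit ORs them, reproducing the table above (the XNOR function, which no single perceptron can represent):

import numpy as np

def step(z):
    return (z > 0).astype(float)

def layer(x, W, b):
    # one layer of perceptrons: step(W x + b)
    return step(W @ x + b)

# Hidden layer: unit 1 = AND(x1, x2), unit 2 = NOR(x1, x2)
W1 = np.array([[ 1.0,  1.0],
               [-1.0, -1.0]])
b1 = np.array([-1.5, 0.5])

# Output layer: OR of the two hidden units
W2 = np.array([[1.0, 1.0]])
b2 = np.array([-0.5])

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = layer(np.array([x1, x2], dtype=float), W1, b1)
    y = layer(h, W2, b2)
    print(x1, x2, int(y[0]))   # prints the XNOR truth table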
The loss function tells us how good our neural network is
 Optimization problem

Training data: 𝐷train = {(𝑥⁽ⁱ⁾, 𝑦⁽ⁱ⁾) : 𝑖 = 1, …, 𝑚}

min_𝒘 𝐽(𝒘, 𝐷train), where 𝐽(𝒘, 𝐷train) = (1/𝑚) Σ_{(𝑥,𝑦) ∈ 𝐷train} 𝐿𝑜𝑠𝑠(𝑥, 𝑦, 𝒘)

 Goal: compute the gradient 𝛻𝒘 𝐽(𝒘, 𝐷train)
 Mean squared error loss

𝐿𝑜𝑠𝑠(𝑥, 𝑦, 𝒘) = (ℎ𝒘(𝑥) − 𝑦)²

 Binary Cross Entropy Loss (Logistic Loss)

𝐿𝑜𝑠𝑠(𝑥, 𝑦, 𝒘) = −𝑦 log(ℎ𝒘(𝑥)) − (1 − 𝑦) log(1 − ℎ𝒘(𝑥))
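A minimal sketch of the two losses in PyTorch (the tensors below are made-up examples; torch.nn.MSELoss and torch.nn.BCELoss implement the same formulas):

import torch

y_hat = torch.tensor([0.9, 0.2, 0.7])   # network outputs h_w(x), assumed in (0, 1)
y     = torch.tensor([1.0, 0.0, 1.0])   # targets

# Mean squared error: (h_w(x) - y)^2, averaged over the batch
mse = ((y_hat - y) ** 2).mean()

# Binary cross entropy: -y log(h) - (1 - y) log(1 - h), averaged over the batch
bce = (-(y * torch.log(y_hat) + (1 - y) * torch.log(1 - y_hat))).mean()

print(mse.item(), bce.item())
print(torch.nn.MSELoss()(y_hat, y).item(), torch.nn.BCELoss()(y_hat, y).item())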


 Forward Pass: compute the output of the network

 Backward Pass: compute gradients

See Backpropagation_examples.pdf
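A small sketch of the two passes with PyTorch autograd (the single-example data and weights are made up; loss.backward() computes the gradients via backpropagation):

import torch

w = torch.randn(3, requires_grad=True)   # weights, including the bias weight w0
x = torch.tensor([1.0, 0.5, -0.5])       # input with x0 = 1
y = torch.tensor(1.0)                    # target

# Forward pass: compute the output of the network and the loss
h = torch.sigmoid(torch.dot(w, x))
loss = -(y * torch.log(h) + (1 - y) * torch.log(1 - h))

# Backward pass: compute gradients of the loss w.r.t. the weights
loss.backward()
print(w.grad)   # dLoss/dw, used in the gradient descent update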
Batch Gradient Descent
 Initialize weights randomly
 Loop
 Compute gradient 𝜕𝐽(𝒘)/𝜕𝒘
 Update 𝒘: 𝒘 ≔ 𝒘 − 𝜶 𝜕𝐽(𝒘)/𝜕𝒘

Stochastic Gradient Descent
 Initialize weights randomly
 Loop
 For each data point (𝑥, 𝑦) in 𝐷train
 Compute gradient 𝜕𝐿𝑜𝑠𝑠(𝑥, 𝑦, 𝒘)/𝜕𝒘
 Update 𝒘: 𝒘 ≔ 𝒘 − 𝜶 𝜕𝐿𝑜𝑠𝑠(𝑥, 𝑦, 𝒘)/𝜕𝒘
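A sketch of both loops in PyTorch for a simple logistic-regression model (the data, learning rate, and epoch counts are assumed placeholders):

import torch
import torch.nn.functional as F

X = torch.randn(100, 3)                   # 100 training examples, 3 features
y = torch.randint(0, 2, (100,)).float()   # binary labels
w = torch.zeros(3, requires_grad=True)
alpha = 0.1

def loss_fn(xb, yb):
    h = torch.sigmoid(xb @ w)
    return F.binary_cross_entropy(h, yb)

# Batch gradient descent: one update per pass over the whole training set
for epoch in range(10):
    loss = loss_fn(X, y)
    loss.backward()
    with torch.no_grad():
        w -= alpha * w.grad
    w.grad.zero_()

# Stochastic gradient descent: one update per training example
for epoch in range(10):
    for i in range(len(X)):
        loss = loss_fn(X[i:i+1], y[i:i+1])
        loss.backward()
        with torch.no_grad():
            w -= alpha * w.grad
        w.grad.zero_()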
CIFAR10
MNIST
Fashion-MNIST
 The softmax function takes as input a vector of k real numbers and normalizes it into a probability distribution

 softmax(𝑦ᵢ) = 𝑒^(𝑦ᵢ) / Σⱼ₌₁ᵏ 𝑒^(𝑦ⱼ)

 softmax([𝑦1, …, 𝑦k]) = (1 / Σⱼ₌₁ᵏ 𝑒^(𝑦ⱼ)) [𝑒^(𝑦1), …, 𝑒^(𝑦k)] = [𝑝(𝑦 = 1|𝑥, 𝑤), …, 𝑝(𝑦 = 𝑘|𝑥, 𝑤)]

 Negative Log-Likelihood Loss = −log(softmax(𝑦)ᶜ), the negative log of the probability assigned to the true class c
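A short PyTorch sketch of softmax and the negative log-likelihood loss (the scores and label are made up; torch's nll_loss expects log-probabilities):

import torch
import torch.nn.functional as F

scores = torch.tensor([[2.0, 0.5, -1.0]])   # raw outputs y_1..y_k for one example
target = torch.tensor([0])                  # index of the true class

probs = F.softmax(scores, dim=1)            # e^{y_i} / sum_j e^{y_j}
print(probs, probs.sum())                   # normalized, sums to 1

# NLL loss = -log(probability assigned to the true class)
log_probs = F.log_softmax(scores, dim=1)
print(F.nll_loss(log_probs, target))
print(F.cross_entropy(scores, target))      # same value: cross_entropy = log_softmax + nll_loss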
The activation function of all neurons in the hidden layer is ReLU
The output neurons implement LogSoftmax
For complete code see cifar10linear.ipynb
See mnist_classification.ipynb
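The referenced notebooks are not reproduced here; a minimal sketch of such a model in PyTorch (the layer sizes are assumed, e.g. flattened 28×28 images and 10 classes) might look like:

import torch.nn as nn

model = nn.Sequential(
    nn.Flatten(),                  # flatten 1x28x28 images to vectors of 784 values
    nn.Linear(784, 128),           # hidden layer
    nn.ReLU(),                     # ReLU activation on all hidden neurons
    nn.Linear(128, 10),            # one output per class
    nn.LogSoftmax(dim=1),          # log-probabilities, paired with NLLLoss
)
criterion = nn.NLLLoss()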
Labels
● 0 T-shirt/top
● 1 Trouser
● 2 Pullover
● 3 Dress
● 4 Coat
● 5 Sandal
● 6 Shirt
● 7 Sneaker
● 8 Bag
● 9 Ankle boot

 Deadline: Submit .ipynb file by 16th Feb., midnight


Mini-batch gradient descent
Learning Rate
Avoiding overfitting
 Batch gradient descent
Batch gradient descent computes the gradient of the cost function w.r.t. the parameters 𝑤 on the entire training dataset
 Stochastic gradient descent
Stochastic gradient descent (SGD), in contrast, performs a parameter update for each training example (𝑥⁽ⁱ⁾, 𝑦⁽ⁱ⁾)
 Mini-batch gradient descent
Mini-batch gradient descent performs an update for every mini-batch of 𝑛_batchsize training examples (see the sketch below)
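A minimal mini-batch training loop sketch in PyTorch (the dataset, model, and batch size are assumed placeholders; torch.utils.data.DataLoader handles the batching):

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset: 1000 examples with 20 features, 10 classes
X = torch.randn(1000, 20)
y = torch.randint(0, 10, (1000,))
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

for epoch in range(5):
    for xb, yb in loader:              # one parameter update per mini-batch
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()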
 How to choose the learning rate 𝜶?

𝒘 ≔ 𝒘 − 𝜶 𝜕𝐽/𝜕𝒘

 A small learning rate converges slowly, while a large learning rate overshoots

 Loss Landscape of Neural Nets is non-convex
 Momentum is a method that helps accelerate SGD in the relevant
direction and dampens oscillations

𝑣𝑡 = 𝛾𝑣𝑡−1 + 𝛼𝛻𝑤 𝐽
𝑤 ≔ 𝑤 − 𝑣𝑡
from torch import optim
optimizer = optim.SGD(h.parameters(), lr=0.001, momentum=0.9)   # h is the network being trained

http://ruder.io/optimizing-gradient-descent/
Learning rate is not fixed
 Adam (Adaptive Moment Estimation)
 torch.optim.Adam(params, lr=0.001, betas=(0.9, 0.999))

 Adagrad – adapts the learning rate for each weight
torch.optim.Adagrad(params, lr=0.01)

https://pytorch.org/docs/stable/optim.html
1. L2 Weight Regularization

𝐽(𝑤) = 𝐿𝑜𝑠𝑠(𝑥, 𝑦, 𝑤) + 𝜆 Σ 𝑤²

torch.optim.SGD(params, lr=<>, momentum=0, weight_decay=0)

Set weight_decay to 𝜆
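Equivalently, the penalty can be added to the loss by hand, as a sketch (the model and λ below are assumed placeholders; weight_decay in the optimizer has the same effect up to a constant factor):

import torch
from torch import nn

model = nn.Linear(20, 10)        # placeholder model
lam = 1e-4                       # regularization strength (lambda)

def l2_penalty(model):
    # lambda * sum of squared weights over all parameters
    return lam * sum((p ** 2).sum() for p in model.parameters())

x = torch.randn(8, 20)
y = torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y) + l2_penalty(model)
loss.backward()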
2. Dropout:
 During training, randomly select neurons and remove them, along with their incoming and outgoing connections
 Forces the network to spread what it learns across all neurons rather than relying on a few

torch.nn.functional.dropout(input, p=0.5)
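In model code, the module form is a common equivalent sketch (layer sizes and dropout rate are assumed; dropout is only active in training mode):

from torch import nn

model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),       # randomly zeroes 50% of the activations during training
    nn.Linear(256, 10),
)

model.train()   # dropout active
model.eval()    # dropout disabled for evaluation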
3. Early Stopping:
 Stop training before the network starts to overfit, e.g. when the validation loss stops improving (see the sketch below)
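A minimal early-stopping sketch (train_one_epoch and validation_loss are assumed placeholder functions; the patience value is illustrative):

best_val = float("inf")
patience, bad_epochs = 5, 0

for epoch in range(100):
    train_one_epoch(model, optimizer)        # placeholder: one pass over the training set
    val_loss = validation_loss(model)        # placeholder: loss on held-out validation data

    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0   # validation still improving, keep going
    else:
        bad_epochs += 1
        if bad_epochs >= patience:           # no improvement for `patience` epochs
            print("Early stopping at epoch", epoch)
            break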
 The Perceptron – Basic building block of neural networks
 Neural Networks – Stacking perceptrons to build neural networks
 Loss Minimization – Gradient descent
 Implementing ANNs – How to use libraries to implement neural networks
 Training ANNs
