
Deep Learning

Neural Networks Report


Based on:
Hands-On Machine Learning book & StatQuest
Task 3

Mhd Anas Al-Sheikh Bakri


1. Introduction

In today's era of artificial intelligence (AI), two major approaches stand out: machine
learning and deep learning. While machine learning has been around for some time,
deep learning has revolutionized how we tackle complex problems. This introduction
aims to highlight the main differences between the two and explain why deep
learning has become so important, especially with the explosion of data from the
internet and social media.

1.1 The Main Difference between Machine Learning & Deep Learning

Machine learning and deep learning both teach computers to learn from data.
However, they differ in how they handle the data. Traditional machine learning
often requires humans to manually pick out important features from the data.
In contrast, deep learning does this automatically. Deep neural networks, the
backbone of deep learning, can figure out the relevant features on their own.
This ability makes deep learning particularly powerful for tasks like recognizing
objects in images or understanding speech.

1.2 The Rise of Deep Learning with Internet and Social Media

One big reason why deep learning has taken off is the vast amount of data
available, thanks to the internet and social media. These platforms generate
massive amounts of information every second, ranging from text to images to
videos. Deep learning models thrive on data, and the internet provides an
endless supply. This abundance of data has fueled breakthroughs in areas like
computer vision and natural language processing. Essentially, the internet and
social media have turbocharged the development of deep learning, allowing
researchers and companies to push the boundaries of what's possible.
2. Neural Networks

Neural networks, a cornerstone of modern machine learning, draw inspiration from
the intricate workings of the human brain. Scientists were fascinated by the brain's
ability to process information and learn from experience, leading them to develop
artificial neurons that mimic the behavior of biological neurons.

At its core, a neural network comprises interconnected nodes, or neurons, arranged
in layers. These artificial neurons simulate the functionality of biological neurons by
receiving input signals, processing them through an activation function, and
transmitting output signals to other neurons. The connections between neurons,
often referred to as weights, determine the strength of the signal transmission.

Neural networks typically consist of three types of layers:

Input Layer: This layer receives the initial input data and passes it on to the
subsequent layers for processing. Each neuron in the input layer represents a feature
or attribute of the input data.

Hidden Layer(s): Intermediate layers between the input and output layers where the
majority of computation occurs. Each neuron in a hidden layer receives input from
the previous layer, applies a transformation using activation functions, and passes
the result to the next layer.

Output Layer: The final layer of the neural network responsible for producing the
desired output based on the processed information from the hidden layers. The
number of neurons in the output layer depends on the nature of the task, such as
classification or regression.
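
To make this layer structure concrete, here is a minimal NumPy sketch of one forward
pass through an input layer, a hidden layer, and an output layer (the layer sizes and
random data are illustrative assumptions, not values from this report):

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative sizes: 4 input features, 8 hidden neurons, 3 output neurons
    x = rng.normal(size=(1, 4))            # one example entering the input layer
    W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
    W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

    hidden = np.maximum(0, x @ W1 + b1)    # hidden layer: weighted sum + ReLU activation
    output = hidden @ W2 + b2              # output layer: raw scores for 3 classes/values
    print(output.shape)                    # (1, 3)
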
3. Neural Network Training

Training a neural network involves iteratively adjusting its parameters (weights and
biases) to minimize the difference between predicted and actual outputs, as measured
by a specific loss function.
The weights and biases must be initialized before training begins; the biases are
usually initialized to zeros, while the weights must be initialized to random values.
This optimization process relies on fundamental concepts such as the chain rule,
gradient descent, and the forward and backward propagation algorithms. We discuss
each of these below, along with activation functions.
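
A minimal sketch of this initialization convention (the layer sizes are illustrative
assumptions):

    import numpy as np

    rng = np.random.default_rng(42)

    n_in, n_out = 784, 128                         # illustrative layer sizes
    b = np.zeros(n_out)                            # biases: usually start at zero
    W = rng.normal(0.0, 0.01, size=(n_in, n_out))  # weights: small random values
    # If W started as all zeros, every neuron in the layer would compute the same
    # output and receive the same gradient, so the layer could never break symmetry.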

3.1 The Chain Rule

The chain rule is a fundamental concept in calculus that helps us understand how
changes in multiple variables affect each other. In the context of neural networks, it
allows us to compute the derivatives of complex functions composed of several
nested functions. By applying the chain rule, we can efficiently calculate the
gradients, which indicate the direction and magnitude of changes needed to minimize
the network's error.
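
As a toy illustration (the numbers are made up), the gradient of a nested function
such as a squared error applied to a sigmoid of a weighted input is obtained by
multiplying the local derivatives:

    import numpy as np

    # Toy example: prediction p = sigmoid(w * x), loss L = (p - y)^2.
    # Chain rule: dL/dw = dL/dp * dp/dz * dz/dw, where z = w * x.
    x, y, w = 2.0, 1.0, 0.5

    z = w * x
    p = 1 / (1 + np.exp(-z))         # sigmoid
    dL_dp = 2 * (p - y)              # derivative of the squared error
    dp_dz = p * (1 - p)              # derivative of the sigmoid
    dz_dw = x                        # derivative of the weighted input
    grad = dL_dp * dp_dz * dz_dw     # multiply the pieces together
    print(grad)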

3.2 Gradient Descent

Gradient descent is an optimization algorithm used to minimize a function by
iteratively moving in the direction of the steepest decrease in the function's value. In
the context of neural networks, the function we aim to minimize is the loss function,
which quantifies the disparity between predicted and actual outputs. There are
several variants of gradient descent:

Batch (Full-Batch) Gradient Descent: Updates the parameters using the gradients
computed from the entire training dataset in each step.
Mini-Batch Gradient Descent: Divides the training dataset into small batches
and updates the parameters based on the gradients computed from each
batch.
Stochastic Gradient Descent (SGD): Updates the parameters after processing
each individual training example, making it computationally efficient but more
prone to noise.
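
A minimal sketch of mini-batch gradient descent on a toy one-parameter regression
problem (the data and hyperparameters are illustrative assumptions; setting
batch_size to 1 gives SGD, and setting it to len(X) gives full-batch gradient descent):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 1))                  # toy data: y = 3x + noise
    y = 3 * X[:, 0] + rng.normal(0, 0.1, size=200)

    w, lr, batch_size = 0.0, 0.1, 32
    for epoch in range(20):
        idx = rng.permutation(len(X))              # shuffle before each epoch
        for start in range(0, len(X), batch_size):
            batch = idx[start:start + batch_size]
            pred = w * X[batch, 0]
            grad = 2 * np.mean((pred - y[batch]) * X[batch, 0])  # d(MSE)/dw
            w -= lr * grad                         # step along the steepest decrease
    print(w)                                       # converges towards 3
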
3.3 Forward & Back Propagation

Forward propagation involves passing the input data through the neural
network to generate predictions. Each neuron in the network receives input
signals, applies a transformation using weights and biases, and passes the
result to the next layer until the output is obtained.

Backward propagation, also known as backpropagation, is the process of computing
gradients of the loss function with respect to the parameters of the network. It
involves propagating the error backwards from the output layer to the input layer,
using the chain rule to efficiently compute the gradients layer by layer. These
gradients are then used in conjunction with gradient descent to update the
parameters, thereby optimizing the network's performance.
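
A minimal sketch of one forward and one backward pass through a one-hidden-layer
network with a sigmoid output and squared-error loss (all sizes and data are
illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(1, 3))                    # one training example (3 features)
    y = np.array([[1.0]])                          # its target
    W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
    W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

    # Forward propagation: input -> hidden -> output
    h = np.maximum(0, x @ W1 + b1)                 # hidden layer (ReLU)
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))           # output layer (sigmoid)
    loss = (p - y) ** 2

    # Backward propagation: chain rule applied layer by layer, output to input
    d_out = 2 * (p - y) * p * (1 - p)              # dLoss / d(output pre-activation)
    dW2 = h.T @ d_out
    d_h = (d_out @ W2.T) * (h > 0)                 # push the error back through ReLU
    dW1 = x.T @ d_h

    # Gradient descent update of the parameters
    lr = 0.1
    W2 -= lr * dW2; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * d_h.sum(axis=0)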

3.4 Activation Function

Activation functions play a crucial role in neural networks by introducing nonlinearity
into the model. Without nonlinearity between layers, even a deep stack of layers
would be equivalent to a single layer. This limitation prevents the network from
effectively capturing complex patterns and solving intricate problems.

Activation functions enable neural networks to learn complex mappings between
inputs and outputs by introducing nonlinear transformations. They determine the
output of individual neurons based on their input signals and help the network model
intricate relationships in the data. Common activation functions include sigmoid,
tanh, ReLU (Rectified Linear Unit), and softmax, each with its own characteristics and
suitability for different types of tasks.
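
The following sketch shows how these common activation functions behave on a small
vector of made-up values:

    import numpy as np

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))

    def relu(z):
        return np.maximum(0, z)

    def softmax(z):
        e = np.exp(z - z.max())          # subtract the max for numerical stability
        return e / e.sum()

    z = np.array([-2.0, 0.0, 3.0])
    print(sigmoid(z))                    # each value squashed into (0, 1)
    print(np.tanh(z))                    # each value squashed into (-1, 1)
    print(relu(z))                       # negative values clipped to 0
    print(softmax(z))                    # non-negative values that sum to 1
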
4. Classification & Regression in Neural Networks

Neural networks are versatile models capable of performing both classification and
regression tasks. These tasks differ in their objectives and the nature of the output
they produce.

4.1 Classification

Classification tasks involve categorizing input data into discrete classes or categories.
The goal is to assign a label or class to each input based on its features. Neural
networks used for classification typically have an output layer with multiple neurons,
each corresponding to a different class. During training, the network learns to predict
the probability distribution over these classes for a given input.

Common examples of classification tasks include image classification (identifying
objects in images), sentiment analysis (determining the sentiment of text), and spam
detection (classifying emails as spam or non-spam).

4.2 Regression

Regression tasks, on the other hand, involve predicting continuous numerical values
based on input features. The objective is to estimate a real-valued output that best
fits the underlying relationship between the input variables. Neural networks used
for regression typically have a single output neuron that directly predicts the
continuous target variable.

Examples of regression tasks include predicting house prices based on features such
as size, location, and number of bedrooms, forecasting stock prices based on
historical data, and estimating the age of a person based on demographic
information.
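
In practice this difference shows up mainly in the network's output layer and loss
function. A hedged Keras-style sketch (the architectures and sizes are illustrative
assumptions, not models from this report):

    import tensorflow as tf

    # Classification head: one output neuron per class, softmax -> class probabilities
    classifier = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(3, activation="softmax"),   # e.g. 3 classes
    ])
    classifier.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Regression head: a single output neuron with no activation -> a continuous value
    regressor = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),                         # e.g. a predicted house price
    ])
    regressor.compile(optimizer="adam", loss="mse")
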
4.3 Differences between Classification & Regression

While both classification and regression tasks involve making predictions based on
input data, they differ in several key aspects:

Output Type: In classification, the output is categorical, representing class labels or
probabilities. In regression, the output is continuous, representing numerical values.

Loss Function: Classification tasks often use categorical cross-entropy or binary cross-
entropy loss functions, which measure the difference between predicted class
probabilities and true labels. Regression tasks typically use mean squared error (MSE)
or mean absolute error (MAE) loss functions, which quantify the difference between
predicted and actual numerical values.

Evaluation Metrics: Classification models are evaluated using metrics such as
accuracy, precision, recall, and F1-score, which assess the model's performance in
correctly classifying instances. Regression models are evaluated using metrics such as
mean squared error, mean absolute error, and R-squared, which measure the
accuracy of the predicted numerical values relative to the ground truth.
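
A small numerical illustration of the loss functions mentioned above (the label and
prediction values are made up):

    import numpy as np

    # Classification: categorical cross-entropy between a one-hot label
    # and the predicted class probabilities
    y_true = np.array([0, 1, 0])
    y_prob = np.array([0.2, 0.7, 0.1])
    cross_entropy = -np.sum(y_true * np.log(y_prob))   # = -ln(0.7) ≈ 0.357

    # Regression: mean squared error between predicted and actual values
    pred = np.array([210.0, 305.0])
    actual = np.array([200.0, 300.0])
    mse = np.mean((pred - actual) ** 2)                 # 62.5

    print(cross_entropy, mse)
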
5. Hyperparameters for Neural Networks

Hyperparameters are configuration settings that are external to the model and
cannot be learned from the training data. They define the structure and behavior of
the neural network during training and influence its performance.

5.1 Difference between Hyperparameters & Parameters

Parameters are the internal variables of the model that are learned during
training, such as weights and biases. They directly impact the model's
predictions. In contrast, hyperparameters are settings that govern the training
process itself, such as the learning rate, batch size, and choice of optimizer.
Hyperparameters must be specified by the user before training begins and can
significantly affect the model's performance.

5.2 Learning Rate

The learning rate is a hyperparameter that controls the size of the step taken
during gradient descent optimization. It determines how much the model's
parameters are adjusted in each iteration to minimize the loss function. A high
learning rate may cause the model to overshoot the optimal solution, while a
low learning rate may result in slow convergence. Tuning the learning rate
involves experimenting with different values to find the optimal balance
between convergence speed and stability.

Learning rate schedules, such as exponential decay or step decay, adjust the
learning rate over time to improve convergence. Learning rate warm-up
techniques gradually increase the learning rate at the beginning of training to
accelerate convergence while mitigating the risk of instability.
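
A minimal sketch of the two schedule shapes mentioned above (the starting rate and
decay constants are illustrative assumptions):

    initial_lr = 0.1                                  # illustrative starting learning rate

    def exponential_decay(epoch, rate=0.9):
        return initial_lr * rate ** epoch             # smooth multiplicative decay

    def step_decay(epoch, drop=0.5, every=10):
        return initial_lr * drop ** (epoch // every)  # drop the rate every 10 epochs

    for epoch in (0, 5, 10, 20):
        print(epoch, round(exponential_decay(epoch), 4), step_decay(epoch))
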
5.3 Batch Size

The batch size determines the number of training examples used in each
iteration of gradient descent. Choosing an appropriate batch size is crucial for
efficient training. A smaller batch size may result in noisy gradients but faster
convergence, while a larger batch size may provide more stable gradients but
slower convergence.

Common approaches for selecting the batch size include using a default value
like 32, which is commonly used in practice, or adjusting it based on GPU
memory constraints. Learning rate warm-up techniques can help mitigate the
potential negative effects of larger batch sizes by gradually increasing the
learning rate during the initial training epochs.
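
As a quick back-of-the-envelope check, the batch size directly sets how many
parameter updates happen per epoch (the dataset size here is an illustrative
assumption):

    n_examples, batch_size = 50_000, 32                # 32 is the common default
    updates_per_epoch = -(-n_examples // batch_size)   # ceiling division
    print(updates_per_epoch)                           # 1563 updates per epoch
    # Smaller batches -> more, noisier updates per epoch and less memory per step;
    # larger batches -> fewer, smoother updates but more GPU memory per step.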

5.4 Optimizer Function

The optimizer function is responsible for updating the model's parameters during
training based on the computed gradients. There are various optimizer algorithms
available, each with its own strengths and weaknesses. Popular optimizers include
stochastic gradient descent (SGD) and Adam.

Choosing the best optimizer for a specific problem involves experimentation and
depends on factors such as the dataset size, model architecture, and convergence
speed requirements. It's often advisable to start with a well-established optimizer
like Adam and fine-tune its hyperparameters if necessary.
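
A hedged Keras-style sketch of swapping optimizers (the model architecture and
learning rates are illustrative assumptions):

    import tensorflow as tf

    # Placeholder model; the architecture is only an assumption for the example
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])

    # Start with a well-established default such as Adam...
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # ...and switching to another optimizer is a one-line change, e.g.:
    # model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9), ...)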

5.5 Activation Function

Activation functions introduce nonlinearity into the neural network, enabling it to
learn complex relationships in the data. Choosing the appropriate activation function
depends on the task and the properties of the data. Rectified Linear Unit (ReLU) is
commonly used in hidden layers due to its simplicity and effectiveness in mitigating
the vanishing gradient problem. Sigmoid and softmax activation functions are suitable
for output layers in binary and multiclass classification tasks, respectively. However,
using sigmoid activation functions in hidden layers may lead to vanishing or exploding
gradients, hindering training stability.
5.6 Number of iterations (Epochs)

The number of epochs specifies the total number of iterations over the entire
training dataset during training. While the number of epochs is not typically
fine-tuned, it's important to set it to a sufficiently high value to allow the
model to converge to an optimal solution. Early stopping techniques can be
employed to monitor the model's performance on a validation set and stop
training when performance begins to deteriorate, preventing overfitting.
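
A hedged Keras-style sketch of early stopping (model, X_train and y_train are assumed
to exist; the patience value is an illustrative assumption):

    import tensorflow as tf

    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss",              # watch performance on the validation data
        patience=5,                      # stop after 5 epochs with no improvement
        restore_best_weights=True,       # roll the weights back to the best epoch
    )

    # The epoch count is set generously high; early stopping decides when to end.
    history = model.fit(X_train, y_train,
                        epochs=200,
                        batch_size=32,
                        validation_split=0.2,
                        callbacks=[early_stop])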
