Chapter 3

Chapter 3 provides an overview of deep learning, emphasizing its distinction from traditional machine learning and highlighting key characteristics such as end-to-end learning and representation learning. It focuses on Convolutional Neural Networks (CNNs), detailing their architecture, layers, and functionality, including convolution, activation, and pooling layers. The chapter also discusses the training process of CNNs, including forward propagation, loss functions, and backpropagation.

Uploaded by

youssef1hisham1

Key Points

(1) Introduction to Deep Learning
(2) Machine Learning vs Deep Learning
(3) Key Characteristics of Deep Learning
(4) Convolutional Neural Networks
(5) How CNNs work
(6) ConvNet Layers
(7) CONV layer
(8) ReLU Layer
(9) Pooling Layer
(10) Flatten - Fully Connected Layer - Softmax
(11) Training Convolutional Neural Networks
(12) Applications of CNNs
(13) Challenges and Limitations of CNNs

Computational Intelligence
Chapter 3
Introduction to Deep Learning
Overview:
- Deep learning, a subfield of machine learning, is designed to mimic the human brain's neural networks in learning complex patterns from large datasets.
- Convolutional Neural Networks (CNNs) are one of the most prominent deep learning architectures, particularly for tasks involving image and spatial data.

Deep Learning:
Refers to neural networks with multiple layers, also known as deep neural networks.

The "depth" of a network is its number of layers.

Increased depth allows the network to model increasingly complex patterns in data, enabling advanced applications such as:
- image recognition
- language translation
- autonomous driving

Machine Learning vs Deep Learning

In machine learning:
- You feed in your data.
- Feature extraction is performed, followed by classification as a separate step.
- The output is produced.

In deep learning:
- You feed in your data.
- Feature extraction and classification both take place inside the CNN.
- The output is produced.

Key Characteristics of Deep Learning
End-to-End Learning:
Deep learning models can learn directly from raw data without requiring manual feature extraction.
Representation Learning:
Deep networks automatically learn hierarchical representations of data.
Big Data and Computing Power:
Deep learning thrives on large datasets and requires significant computational power (e.g., GPUs).

Convolutional Neural Networks
- A convolutional neural network (CNN or ConvNet) is a type of feed-forward artificial neural network.
- The architecture of a ConvNet is designed to take advantage of the 2D structure of an input image.
- CNNs are a type of deep learning model designed primarily for image-related tasks.

CNNs are highly effective at recognizing spatial hierarchies in images, such as:
- edges
- textures
- more complex features as the network depth increases.

(Basic Structure of CNNs)
- Convolutional Layers:
Perform convolution operations on input images to extract features.
- Pooling Layers:
Down-sample the feature maps to reduce spatial dimensions while preserving important information.
- Fully Connected Layers:
Final layers where the high-level reasoning is done, outputting predictions or classifications.

How CNNs work

A ConvNet takes an image as input and classifies it, in this example as either 'X' or 'O'.

ConvNet Layers
CONV layer
It computes the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume.

- The convolutional layer is the core of CNNs.
- It applies filters (also called kernels) to the input image to extract features such as edges, corners, and textures.

ReLU layer
It applies an elementwise activation function, such as max(0, x), which thresholds at zero.
This leaves the size of the volume unchanged.

POOL layer
It performs a down-sampling operation along the spatial dimensions (width, height).

FC (i.e. fully-connected) layer
It computes the class scores, resulting in a volume of size [1x1xN], where each of the N numbers corresponds to the score of one of the N categories.

CONV layer
A set of filters is applied to your image, and each filter extracts a feature from it.

- Here the image is 9×9.
- The filter is 3×3.
- Multiply the elements of the filter by a patch of the image of the same size as the filter, sum the products, and divide by the number of elements.
- The result is written at the origin, i.e. the position that corresponds to the center of the patch.

Input Size (W): 9
The input size is the size of the image, denoted W.
Filter Size (F): 3 X 3
The filter size is the size of the filter, denoted F.

Filter/Kernel:
A small matrix that slides over the input image, multiplying and summing values to produce a feature map.

Stride (S): 1
The stride is the step size at which the filter moves across the image, i.e. how many pixels it advances at a time. With a stride of 1, the filter moves 1 pixel at a time. It is denoted S.

Filters: 1
This is the number of filters.

Padding: 0
Normally the origin falls in the middle of the patch covered by the filter. To make the very first pixel of the image an origin, the image has to be enlarged by adding a border around it. With padding = 0 this is not done, i.e. no border is added around the image.

Padding:
Adding extra pixels (usually zeros) around the input image to control the spatial dimensions of the output.
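
The filter, stride, and padding settings above can be tried out with a minimal sketch in plain Python (no libraries, so every multiplication is visible). The function names `pad_image` and `convolve2d`, the diagonal test image, and the filter values are all made up for illustration; the division by the number of filter elements follows the worked example in these notes and is not something every CNN library does. As in most CNN libraries, the filter is not flipped before it is applied.

```python
def pad_image(image, padding):
    """Surround a 2D list with `padding` rings of zeros."""
    if padding == 0:
        return [list(row) for row in image]
    width = len(image[0]) + 2 * padding
    padded = [[0.0] * width for _ in range(padding)]
    for row in image:
        padded.append([0.0] * padding + list(row) + [0.0] * padding)
    padded.extend([0.0] * width for _ in range(padding))
    return padded


def convolve2d(image, kernel, stride=1, padding=0):
    """Slide a square `kernel` over `image`; multiply, sum, and average."""
    image = pad_image(image, padding)
    f = len(kernel)          # filter size F (square filter assumed)
    n = f * f                # number of filter elements
    out = []
    for top in range(0, len(image) - f + 1, stride):
        out_row = []
        for left in range(0, len(image[0]) - f + 1, stride):
            total = 0.0
            for i in range(f):
                for j in range(f):
                    total += image[top + i][left + j] * kernel[i][j]
            out_row.append(total / n)   # divide by the element count, as in the notes
        out.append(out_row)
    return out


# Made-up 9x9 image of +1/-1 pixels containing one diagonal stroke.
image = [[1.0 if r == c else -1.0 for c in range(9)] for r in range(9)]
# Made-up 3x3 filter that responds to that diagonal.
kernel = [[1.0 if r == c else -1.0 for c in range(3)] for r in range(3)]

feature_map = convolve2d(image, kernel, stride=1, padding=0)
print(len(feature_map), len(feature_map[0]))   # 7 7
print(feature_map[0][0])                       # 1.0 (the patch matches the filter exactly)
```

Calling `convolve2d(image, kernel, padding=1)` keeps the output at 9×9, which is the usual reason for adding a border of zeros.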

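The resulting image is called the feature map.
To find its size, apply the formula:

Output size = (W - F + 2P) / S + 1 = (9 - 3 + 2×0) / 1 + 1 = 7

So the feature map is 7×7×3 (here the 3 refers to the 3 RGB channels, which assumes the filter is applied to each channel separately; a standard convolutional layer sums over the input channels, so a single filter would give 7×7×1).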
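
As a quick check of the size calculation, the sketch below evaluates the standard output-size formula (W - F + 2P) / S + 1 for the values used in these notes and for two other settings. The function name `conv_output_size` is hypothetical, and raising an error when the filter does not fit evenly is a choice made here; some libraries round down instead.

```python
def conv_output_size(w, f, s=1, p=0):
    """Spatial output size for input size w, filter size f, stride s, padding p."""
    span = w - f + 2 * p
    if span < 0 or span % s != 0:
        raise ValueError("filter does not tile the padded input evenly")
    return span // s + 1


print(conv_output_size(9, 3, s=1, p=0))   # 7  (the example in the notes)
print(conv_output_size(9, 3, s=1, p=1))   # 9  (one ring of padding keeps the size)
print(conv_output_size(9, 3, s=2, p=0))   # 4  (a larger stride shrinks the output)
```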

Types of filters:
• Edge Detectors: Identify edges and boundaries.
• Texture Detectors: Capture surface patterns.
• Color Detectors: Detect specific colors.
• Shape Detectors: Recognize shapes and geometric forms.
• Gradient Detectors: Capture transitions in color or intensity.
• Blob Detectors: Identify homogeneous regions in the image.

ReLU Layer
After convolution, the output is passed through an activation function.
The most commonly used activation function in CNNs is ReLU.

ReLU computes the maximum of 0 and x:
- Any value below 0 becomes 0.
- Any value above 0 is passed through unchanged.

ReLU introduces non-linearity into the network, which helps the CNN learn complex patterns.

Any negative values from the convolution are set to zero, allowing the network to focus only on positive signals.
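
A minimal sketch of ReLU applied elementwise to a small feature map, assuming the feature map is stored as a list of lists; the values are made up for illustration.

```python
def relu(feature_map):
    """Apply max(0, x) to every element of a 2D list."""
    return [[max(0.0, x) for x in row] for row in feature_map]


feature_map = [[0.77, -0.11, 0.33],
               [-0.11, 1.00, -0.55],
               [0.33, -0.55, 0.00]]

activated = relu(feature_map)
print(activated)
# [[0.77, 0.0, 0.33], [0.0, 1.0, 0.0], [0.33, 0.0, 0.0]]
```

The output has the same 3×3 shape as the input: only the negative entries change.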

Pooling Layer
- The pooling layer is used to reduce the spatial dimensions of feature maps, which lowers computational complexity and helps prevent overfitting.
- Max Pooling:
Selects the maximum value from a patch of the feature map.
- Average Pooling:
Averages the values within the patch.

Key Points:
- Pooling reduces the size of the feature map, allowing for faster computation in deeper layers.
- It also introduces some form of translation invariance, meaning the network is less sensitive to small movements or distortions in the image.

Pooling Filter example
Size = 2 X 2, Stride = 2

With max pooling:
Take each 2×2 patch of the image, moving 2 pixels at a time, and keep the largest value.

With average pooling:
Take each 2×2 patch of the image, moving 2 pixels at a time, and keep the mean of the values.

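
The 2×2, stride-2 pooling example can be sketched as follows. The helper name `pool2d` and the 4×4 feature map are made up for illustration.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Down-sample a 2D list by taking the max or the mean of each patch."""
    if mode == "max":
        reduce_patch = max
    elif mode == "average":
        reduce_patch = lambda values: sum(values) / len(values)
    else:
        raise ValueError("mode must be 'max' or 'average'")
    out = []
    for top in range(0, len(feature_map) - size + 1, stride):
        out_row = []
        for left in range(0, len(feature_map[0]) - size + 1, stride):
            patch = [feature_map[top + i][left + j]
                     for i in range(size) for j in range(size)]
            out_row.append(reduce_patch(patch))
        out.append(out_row)
    return out


feature_map = [[1, 3, 2, 4],
               [5, 7, 6, 8],
               [9, 2, 1, 0],
               [3, 4, 5, 6]]

print(pool2d(feature_map, mode="max"))       # [[7, 8], [9, 6]]
print(pool2d(feature_map, mode="average"))   # [[4.0, 5.0], [4.5, 3.0]]
```

Either way, the 4×4 input shrinks to 2×2, a quarter of the original number of values.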
Flatten - Fully Connected Layer - Softmax
The image that comes out of the pooling layer is placed into a single vector, which is fed to the fully connected layer and then to the softmax.

After several convolutional and pooling layers, the network flattens the feature maps and passes them through one or more fully connected layers.
These layers are responsible for making predictions or classifications.
- Flattening:
Converts the 2D feature maps into a 1D vector.
- Fully Connected Layer:
Every neuron in the layer is connected to every neuron in the previous layer.

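
The flatten, fully connected, and softmax steps can be sketched in plain Python as below. Softmax turns the class scores into probabilities that sum to 1. The pooled feature maps, the weights, and the two class names are hypothetical values chosen only to make the arithmetic easy to follow; in a real network the weights are learned during training.

```python
import math


def flatten(feature_maps):
    """Turn a list of 2D feature maps into one 1D vector."""
    return [x for fmap in feature_maps for row in fmap for x in row]


def fully_connected(vector, weights, biases):
    """Each output neuron is a dot product with the whole input, plus a bias."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Convert class scores into probabilities that sum to 1."""
    peak = max(scores)                        # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two made-up 2x2 pooled feature maps.
pooled = [[[1.0, 0.0], [0.0, 1.0]],
          [[0.0, 1.0], [1.0, 0.0]]]
vector = flatten(pooled)                      # 8 values

# Hypothetical weights: one row per class ('X' first, then 'O').
weights = [[1, 0, 0, 1, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 1, 0, 0]]
biases = [0.0, 0.0]

scores = fully_connected(vector, weights, biases)
probs = softmax(scores)
print(scores)   # [2.0, 1.0]
print(probs)    # the first class gets the larger probability
```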
Training Convolutional Neural Networks
Training CNNs involves the same general process as other neural networks: forward propagation, then backpropagation, then gradient descent.

Forward Propagation
During the forward pass:
- The input image is convolved with filters.
- Activation functions (e.g., ReLU) are applied.
- Pooling is performed to down-sample the data.
- The output is passed through fully connected layers, and the final layer provides a prediction.

Loss Function
A loss function measures how far the network's predictions are from the actual values.

Common loss functions for CNNs include:
➜ Cross-Entropy Loss:
Used for classification problems.
➜ Mean Squared Error (MSE):
Used for regression tasks.
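
Both loss functions can be written in a few lines. This is a minimal sketch for a single example, assuming the classifier outputs a list of class probabilities (for instance from a softmax); the numbers are made up.

```python
import math


def cross_entropy(probs, true_class):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(probs[true_class])


def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Classification: the true class is index 0.
print(cross_entropy([0.9, 0.1], true_class=0))   # about 0.105 (confident and right)
print(cross_entropy([0.1, 0.9], true_class=0))   # about 2.303 (confident and wrong)

# Regression: two predictions against two targets.
print(mean_squared_error([2.5, 0.0], [3.0, -1.0]))   # 0.625
```

The further the prediction is from the actual value, the larger the loss, which is what training tries to push down.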

Backpropagation and Gradient Descent
➜ Backpropagation:
The error is propagated back through the layers of the network, and the gradients of the loss with respect to the weights are calculated.
➜ Gradient Descent:
Updates the network's weights, stepping against the gradient, to minimize the loss function.
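
The training loop can be illustrated on the smallest possible "network": a single weight w with prediction w * x and a mean squared error loss. This is a sketch under those simplifying assumptions, not a CNN; the function name `train`, the learning rate, and the data are made up. A real network repeats the same three steps for every weight in every layer.

```python
def train(xs, ys, learning_rate=0.1, steps=50):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    history = []
    for _ in range(steps):
        # Forward pass: predictions and loss.
        preds = [w * x for x in xs]
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
        history.append(loss)
        # Backward pass: gradient of the loss with respect to w (chain rule).
        grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        # Gradient descent step: move against the gradient.
        w -= learning_rate * grad
    return w, history


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]          # generated with w = 2
w, history = train(xs, ys)
print(round(w, 4))            # 2.0
print(history[0] > history[-1])   # True: the loss went down
```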

Data Augmentation
When you do not have enough data, you can create several new images from each existing image so that the model trains well on them (and overfitting is avoided).

One common technique to improve the performance of CNNs is data augmentation.
It artificially increases the size of the training dataset by applying random transformations to the input images, such as rotation, flipping, zooming, and cropping.
- This reduces overfitting, which makes data augmentation a regularization technique.
- The trick is to generate realistic training instances.
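
A minimal sketch of three of these transformations on a tiny image stored as a list of lists. The helper names and the 3×3 "image" are made up for illustration, zooming is left out to keep the example short, and real pipelines normally rely on an image library rather than hand-written loops.

```python
import random


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]


def random_crop(image, size, rng):
    """Cut a random size x size window out of the image."""
    top = rng.randint(0, len(image) - size)
    left = rng.randint(0, len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]]


def augment(image, rng):
    """Apply a random flip and a random number of quarter turns."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    for _ in range(rng.randint(0, 3)):
        image = rotate_90(image)
    return image


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

rng = random.Random(0)                 # fixed seed so the run is repeatable
augmented = [augment(image, rng) for _ in range(4)]
print(flip_horizontal(image))          # [[3, 2, 1], [6, 5, 4], [9, 8, 7]]
print(rotate_90(image))                # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
print(len(augmented))                  # 4 new training images from 1 original
```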

Dropout
Some neurons can be switched off. So that the model does not simply learn everything (i.e. memorize the training data), some of its neurons are dropped, which helps avoid overfitting.

- To prevent overfitting, CNNs often use dropout, where some neurons are randomly ignored during the training process.
This forces the network to learn robust features and not rely too heavily on specific neurons.
- It is a fairly simple algorithm: at every training step, every neuron (including the input neurons but excluding the output neurons) has a probability p of being temporarily "dropped out," meaning it will be entirely ignored during this training step, but it may be active during the next step.
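
A minimal sketch of dropout applied to one layer's activations. The function name `dropout` and the example values are made up. Scaling the surviving activations by 1 / (1 - p), often called inverted dropout, is an assumption added here and is not described in the notes; it is a common way to keep the expected activation the same during training and at test time.

```python
import random


def dropout(activations, p, rng, training=True):
    """Randomly zero each activation with probability p (training only)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    # Survivors are scaled by 1/keep ("inverted dropout", an added assumption).
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)                  # fixed seed so the run is repeatable
hidden = [1.0] * 10

print(dropout(hidden, p=0.5, rng=rng))                   # a new random mask on every call
print(dropout(hidden, p=0.5, rng=rng, training=False))   # unchanged outside training
```

Because the mask is redrawn at every training step, a neuron that is dropped now may be active in the next step.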

Transfer Learning
You start from a pre-trained model, i.e. a model that has already learned from other data. It was trained on one dataset, and you apply it to a completely different one.

- It is generally not a good idea to train a very large DNN from scratch.
- Try to find an existing neural network that accomplishes a similar task.
- Reuse the lower layers of this network.
- This is called transfer learning.

- In transfer learning, a pre-trained model (e.g., trained on ImageNet) is used as a starting point for a new task.

The idea is that the model has already learned useful features (edges, shapes, textures) from a large dataset and can be fine-tuned for a new task with limited data.
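
The idea of reusing lower layers can be sketched without any deep learning library. In the toy example below, `PRETRAINED_WEIGHTS` stands in for a lower layer learned on some earlier task; its values, the four training points, and all function names are hypothetical. Only a new output layer (the "head") is trained on the small new dataset, while the lower layer stays frozen. Fine-tuning in practice may also update some of the reused layers, which this sketch does not do.

```python
import math

# Hypothetical "pre-trained" lower layer: two fixed feature detectors.
PRETRAINED_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x, weights):
    """Frozen lower layer: a linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, head, bias, lower_weights):
    """Probability of class 1 from the frozen features and the new head."""
    features = extract_features(x, lower_weights)
    return sigmoid(sum(h * f for h, f in zip(head, features)) + bias)


def fine_tune_head(data, labels, lower_weights, learning_rate=0.5, steps=200):
    """Train only the new output layer; the lower layer is never updated."""
    head = [0.0] * len(lower_weights)
    bias = 0.0
    features = [extract_features(x, lower_weights) for x in data]
    for _ in range(steps):
        for f, y in zip(features, labels):
            pred = sigmoid(sum(h * fi for h, fi in zip(head, f)) + bias)
            error = pred - y    # cross-entropy gradient w.r.t. the pre-activation
            head = [h - learning_rate * error * fi for h, fi in zip(head, f)]
            bias -= learning_rate * error
    return head, bias


# A small new task with only four labeled examples (made-up data).
data = [[2.0, 0.0], [0.0, 2.0], [3.0, 1.0], [1.0, 3.0]]
labels = [1, 0, 1, 0]

head, bias = fine_tune_head(data, labels, PRETRAINED_WEIGHTS)
for x, y in zip(data, labels):
    print(y, round(predict(x, head, bias, PRETRAINED_WEIGHTS), 3))
```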

Applications of CNNs
➜ Image Classification
CNNs are widely used for image classification tasks, where the model assigns a label to an image. Common datasets for image classification include MNIST (handwritten digits), CIFAR-10 (small object images), and ImageNet (large-scale object recognition).
➜ Object Detection
In object detection, the task is to identify and localize multiple objects within an image. Techniques like Region-based CNNs (R-CNNs) are used to propose regions of interest and classify objects within those regions.
➜ Image Segmentation
Image segmentation divides an image into multiple segments or objects. CNN-based models like Fully Convolutional Networks (FCNs) are used to assign a label to every pixel in an image.
➜ Medical Imaging
CNNs are revolutionizing medical imaging, where they help in detecting diseases from images such as X-rays, MRIs, and CT scans. For example, CNNs are used to detect tumors, fractures, and retinal diseases.

Challenges and Limitations of CNNs
➜ Data-Hungry:
CNNs require large amounts of labeled data to perform effectively.
➜ Computational Cost:
Training deep CNNs can be very resource-intensive, requiring high-end GPUs.
➜ Interpretability:
CNNs are often seen as "black boxes," making it difficult to interpret their decision-making process.
➜ Overfitting:
CNNs can easily overfit on small datasets if proper regularization techniques (like dropout) are not used.
