Deep Learning Module-04 Search Creators

Uploaded by

patrick Park

21CS743 | DEEP LEARNING | SEARCH CREATORS.

Module-04

Convolutional Networks

1. Definition of Convolution

• Convolution: A mathematical operation that combines two functions (input signal/image

and filter/kernel) to produce a third function.

• Purpose: Captures important patterns and structures in the input data, crucial for tasks like
image recognition.

2. Mathematical Formulation
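
The formula for this subsection is not reproduced in these notes. As a reference sketch, the standard textbook definitions are given below. The symbols x, w, s (1-D signal, kernel, output) and I, K, S (image, kernel, feature map) are chosen here for illustration and are not taken from the original.

```latex
% Continuous 1-D convolution of a signal x with a kernel w
s(t) = (x * w)(t) = \int_{-\infty}^{\infty} x(a)\, w(t - a)\, da

% Discrete 2-D convolution of an image I with a kernel K
S(i, j) = (I * K)(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m,\; j - n)

% Cross-correlation (no kernel flip), the form most deep learning libraries compute
S(i, j) = \sum_{m} \sum_{n} I(i + m,\; j + n)\, K(m, n)
```

Because the kernel is learned, flipping it or not makes no practical difference to what a network can represent, which is why the cross-correlation form is usually still called "convolution".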


3. Parameters of Convolution

a. Stride

• Definition: The number of pixels the filter moves over the input.

• Types:

o Stride of 1: Filter moves one pixel at a time, so the output keeps almost the full spatial resolution of the input.

o Stride of 2: Filter moves two pixels at a time, roughly halving each output dimension (downsampling).

b. Padding

• Definition: Adding extra pixels around the input image.

• Types:

o Valid Padding: No padding applied; results in a smaller output feature map.

o Same Padding: Enough zero padding is applied that, at a stride of 1, the output has the same spatial dimensions as the input.
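
The effect of stride and padding on output size can be sketched with a naive single-channel implementation. This is a minimal illustration in plain Python, not an efficient or library implementation; the function names `conv_output_size` and `conv2d` and the example filter are assumptions made for this sketch. It uses the usual relation output = floor((n + 2p - k) / s) + 1.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Spatial output size for input size n and kernel size k."""
    return (n + 2 * padding - k) // stride + 1


def conv2d(image, kernel, stride=1, padding=0):
    """Naive 2-D cross-correlation on nested lists (single channel)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Zero-pad the image on all four sides.
    padded = [[0.0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for r in range(h):
        for c in range(w):
            padded[r + padding][c + padding] = image[r][c]
    out_h = conv_output_size(h, kh, stride, padding)
    out_w = conv_output_size(w, kw, stride, padding)
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Multiply the kernel with the patch under it and sum.
            for m in range(kh):
                for n in range(kw):
                    acc += padded[i * stride + m][j * stride + n] * kernel[m][n]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 5x5 input and a 3x3 vertical-edge filter.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0, 0.0, -1.0] for _ in range(3)]

valid = conv2d(image, kernel, stride=1, padding=0)    # "valid" padding: 3x3 output
same = conv2d(image, kernel, stride=1, padding=1)     # "same" padding: 5x5 output
strided = conv2d(image, kernel, stride=2, padding=1)  # stride 2: 3x3 output
print(len(valid), len(same), len(strided))            # 3 5 3
```

With a 3x3 kernel, a padding of 1 gives "same" output at stride 1; in general an odd kernel of size k needs a padding of (k - 1) / 2.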

4. Significance in Neural Networks

• Application: Used in convolutional layers of CNNs to extract features from images.

• Learning Hierarchical Representations: Stacked convolutional layers enable learning of

complex patterns, essential for image classification and other tasks.


1. Purpose of Pooling

• Spatial Size Reduction: Decreases the dimensions of the feature maps.

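• Parameter and Computation Reduction: Pooling has no learnable parameters of its own, but the smaller feature maps reduce the computation in later layers and the number of parameters in any fully connected layers that follow.

• Overfitting Control: Helps to control overfitting by shrinking the representation and making it approximately invariant to small translations.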

2. Types of Pooling

a. Max Pooling

• Definition: Selects the maximum value from each patch (sub-region) of the feature map.

• Purpose: Captures the most prominent features while reducing spatial dimensions.

b. Average Pooling

• Definition: Takes the average value from each patch of the feature map.

• Purpose: Provides a smooth representation of features, reducing sensitivity to noise.


3. Operation of Pooling
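
The worked illustration for this step is not reproduced in these notes. As a hedged sketch of how max and average pooling operate, the plain-Python function below slides a window over a feature map and reduces each patch to one value. The name `pool2d`, its parameters, and the 4x4 example values are assumptions made for illustration.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Pool a 2-D feature map (nested lists) with a size x size window."""
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for i in range(0, h - size + 1, stride):
        row = []
        for j in range(0, w - size + 1, stride):
            # Collect the values in the current patch.
            patch = [feature_map[i + m][j + n]
                     for m in range(size) for n in range(size)]
            if mode == "max":
                row.append(max(patch))
            else:
                row.append(sum(patch) / len(patch))
        out.append(row)
    return out


fmap = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
max_pooled = pool2d(fmap, mode="max")  # [[6, 4], [8, 9]]
avg_pooled = pool2d(fmap, mode="avg")  # [[3.75, 2.25], [4.5, 4.0]]
print(max_pooled, avg_pooled)
```

A 2x2 window with a stride of 2 halves each spatial dimension, so the 4x4 map becomes 2x2.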

4. Significance in Neural Networks

• Feature Extraction: Reduces the size of the feature maps while retaining the most relevant
features.

• Efficiency: Decreases computational load, allowing deeper networks to train faster.

• Robustness: Provides a degree of invariance to small translations in the input, making the
model more robust.


1. Convolution as an Infinitely Strong Prior

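• Focus on Local Patterns: A convolutional layer behaves like a fully connected layer under an infinitely strong prior on its weights: each unit's weights must be zero outside its small receptive field and identical to its neighbours' weights, only shifted in space. This emphasizes local patterns (e.g., edges and textures) over global patterns.

• Effectiveness in CNNs: This assumption of locality and translation equivariance makes Convolutional Neural Networks (CNNs) effective for image and video analysis; when the assumption does not fit the data, it can cause underfitting.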

2. Pooling as an Infinitely Strong Prior

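• Enhances Translational Invariance: Encodes the assumption that each unit should be invariant to small translations, so the network can recognize a feature even when it shifts by a few pixels.

• Reduces Sensitivity to Position: By summarizing each neighbourhood, pooling reduces sensitivity to the exact location of features. This improves generalization when exact position is unimportant and can hurt when precise spatial information matters.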

3. Significance in Neural Networks

• Feature Learning: Both operations prioritize local features, enabling efficient learning of
essential characteristics from input data.

• Improved Generalization: The combination of convolution and pooling enhances the

model's ability to generalize across various input variations.


Variants of the Basic Convolution Function

1. Dilated Convolutions

• Definition: Introduces spacing (dilation) between kernel elements.

• Wider Context: Allows the model to incorporate a wider context of the input data without
significantly increasing the number of parameters.

• Applications: Useful in tasks where understanding broader spatial relationships is

important, such as in semantic segmentation.
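
To make the "wider context" point concrete, here is a minimal 1-D sketch in plain Python. A kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions while still having only k weights. The function names and the example signal are assumptions made for illustration.

```python
def effective_kernel_size(k, dilation):
    """Number of input positions spanned by a dilated kernel."""
    return k + (k - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1-D cross-correlation with gaps of `dilation` between taps."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for i in range(len(signal) - span + 1):
        out.append(sum(signal[i + m * dilation] * kernel[m]
                       for m in range(len(kernel))))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # three weights in both cases

dense = dilated_conv1d(signal, kernel, dilation=1)    # spans 3 inputs: [-2, -2, -2, -2, -2]
dilated = dilated_conv1d(signal, kernel, dilation=2)  # spans 5 inputs: [-4, -4, -4]
print(effective_kernel_size(3, 1), effective_kernel_size(3, 2), dense, dilated)
```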

2. Depthwise Separable Convolutions

• Two-Stage Process:

o Depthwise Convolution: Applies a separate convolution for each input channel,

reducing computational complexity.

o Pointwise Convolution: Uses 1x1 convolutions to combine the outputs from the
depthwise convolution.

• Parameter Efficiency: Reduces the number of parameters and computations compared to standard convolutions, usually at only a small cost in accuracy.

• Applications: Commonly used in lightweight models, such as MobileNets, for mobile and
edge devices.
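
The parameter saving can be checked with simple arithmetic. The sketch below counts weights (biases ignored) for a standard convolution and for the depthwise plus pointwise pair; the layer sizes are hypothetical values picked for illustration.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k stage plus a 1x1 pointwise stage."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution mixing the channels
    return depthwise + pointwise


k, c_in, c_out = 3, 64, 128   # hypothetical layer sizes
standard = standard_conv_params(k, c_in, c_out)         # 73728
separable = depthwise_separable_params(k, c_in, c_out)  # 8768
print(standard, separable, round(separable / standard, 3))
```

The ratio works out to 1/c_out + 1/k^2, so for 3x3 kernels the separable form needs roughly one eighth to one ninth of the weights.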


1. Definition of Structured Outputs

• Structured Outputs: Refers to tasks where the output has a specific structure or spatial
arrangement, such as pixel-wise predictions in image segmentation or keypoint localization
in object detection.

2. Importance in Semantic Segmentation

• Maintaining Spatial Structure: For tasks like semantic segmentation, it’s crucial to
maintain the spatial relationships between pixels in predictions to ensure that the output
accurately represents the original input image.

3. Specialized Networks

• Network Design: Specialized neural network architectures, such as Fully Convolutional

Networks (FCNs), are designed to handle structured outputs by replacing fully connected
layers with convolutional layers, allowing for spatially consistent predictions.

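• Skip Connections: Skip connections between encoder and decoder layers (as in U-Net) carry high-resolution features from earlier layers to later ones, improving the accuracy of the output. ResNet uses a related idea, residual connections, mainly to make very deep networks easier to train.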

4. Adjusted Loss Functions

• Loss Function Modification: Loss functions may be adjusted to enforce structural

consistency in the predictions. Common approaches include:

o Pixel-wise Loss: Evaluating the loss on a per-pixel basis (e.g., Cross-Entropy Loss
for segmentation).

o Structural Loss: Incorporating penalties for structural deviations, such as Dice loss or losses based on Intersection over Union (IoU), which measure the overlap between predicted and true regions.
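
The difference between a per-pixel loss and an overlap-based loss can be sketched on a flattened binary mask. This is a minimal plain-Python illustration; the function names, the epsilon smoothing terms, and the example masks are assumptions, and real frameworks offer their own implementations.

```python
import math


def pixelwise_bce(probs, target, eps=1e-12):
    """Mean binary cross-entropy over pixels."""
    total = 0.0
    for p, t in zip(probs, target):
        total += t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return -total / len(probs)


def soft_dice_loss(probs, target, eps=1e-6):
    """1 - Dice coefficient, computed on probabilities."""
    intersection = sum(p * t for p, t in zip(probs, target))
    denominator = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def iou(pred, target):
    """Intersection over Union for binary (0/1) masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return intersection / union if union else 1.0


target = [1, 1, 0, 0, 1, 0]               # flattened ground-truth mask
pred = [1, 0, 0, 1, 1, 0]                 # flattened hard prediction
probs = [0.9, 0.4, 0.1, 0.6, 0.8, 0.2]    # hypothetical predicted probabilities

print(iou(pred, target))                  # 0.5
print(round(soft_dice_loss(pred, target), 3))
print(round(pixelwise_bce(probs, target), 3))
```

Cross-entropy scores each pixel independently, while Dice and IoU score the predicted region as a whole, which makes them less sensitive to class imbalance between foreground and background.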


5. Applications

• Use Cases: Structured output networks are widely used in various applications, including:

o Semantic Segmentation: Assigning class labels to each pixel in an image.

o Instance Segmentation: Identifying and segmenting individual object instances

within an image.

o Object Detection: Predicting bounding boxes and class labels for objects in an
image while maintaining spatial relations.

Data Types

1. 2D Images

• Standard Input: The most common input type for CNNs, typically used in image
classification, object detection, and segmentation tasks.

• Format: Represented as height × width × channels (e.g., RGB images have three channels).


2. 3D Data

• Definition: Includes video (a sequence of frames) and volumetric data, such as those found in medical imaging (e.g., MRI or CT scans).

• Format: Represented as depth × height × width × channels, where the depth axis is time for video and the third spatial axis for volumes, allowing the network to capture spatio-temporal or volumetric information.

• Applications: Useful in tasks like action recognition in videos or analyzing 3D medical

images for diagnosis.

3. 1D Data

• Definition: Consists of sequential data, such as time-series data or audio signals.

• Format: Represented as sequences of data points, often one-dimensional.

• Applications: Used in tasks like speech recognition, audio classification, and analyzing
sensor data from IoT devices.

Efficient Convolution Algorithms

1. Fast Fourier Transform (FFT)

• Definition: A mathematical algorithm that computes the discrete Fourier transform (DFT)
and its inverse, converting signals between time (or spatial) domain and frequency domain.

• Convolution in Frequency Domain:

o By the convolution theorem, convolution in the time or spatial domain becomes element-wise multiplication in the frequency domain, which is often more computationally efficient for large kernels.


• Applications: Commonly used in applications requiring large kernel convolutions, such as

in image processing and signal analysis.
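
The frequency-domain route can be sketched in plain Python with a small recursive radix-2 FFT. This is an illustrative sketch, not an optimized routine; the function names are assumptions, and production code would normally call an FFT library instead. The steps are: zero-pad both sequences to a common power-of-two length, transform, multiply element-wise, and transform back.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def direct_convolve(a, b):
    """Full linear convolution computed directly, O(len(a) * len(b))."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def fft_convolve(a, b):
    """Full linear convolution via multiplication in the frequency domain."""
    out_len = len(a) + len(b) - 1
    n = 1
    while n < out_len:          # pad to a power of two to avoid wrap-around
        n *= 2
    fa = fft([complex(v) for v in a] + [0j] * (n - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (n - len(b)))
    product = [u * v for u, v in zip(fa, fb)]
    result = fft(product, inverse=True)
    return [(v / n).real for v in result[:out_len]]  # inverse needs the 1/n scale


signal = [1, 2, 3, 4]
kernel = [1, 0, -1]
print(direct_convolve(signal, kernel))
print([round(v, 6) for v in fft_convolve(signal, kernel)])
```

Both calls give the same result up to floating-point error. The FFT route costs O(n log n) regardless of kernel length, which is why it pays off mainly for large kernels.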

2. Winograd's Algorithms

• Definition: A set of algorithms designed to optimize convolution operations by reducing

the number of multiplications needed.

• Efficiency Improvement:

o Winograd's algorithms work by rearranging the computation of convolution to

minimize redundant calculations.

o They can reduce the cost of convolution operations, particularly for small kernels: for example, the F(2,3) variant computes two outputs of a 3-tap filter with 4 multiplications instead of 6.

• Key Concepts:

o The algorithms break down the convolution operation into smaller components,
allowing for fewer multiplicative operations and leveraging addition and
subtraction instead.

o They are particularly effective in scenarios where computational efficiency is

critical, such as mobile devices or real-time applications.

• Applications: Frequently used in lightweight models and resource-constrained

environments where computational power and memory usage are limited.
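
As a concrete instance of trading multiplications for additions, the sketch below implements the 1-D Winograd minimal filtering case usually written F(2,3): two outputs of a 3-tap filter from four inputs. The function names are assumptions made for illustration; 2-D variants used in practice nest the same idea.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter computed directly: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs using 4 multiplications between data and filter terms."""
    # Filter transform: depends only on g, so it can be precomputed once.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Four element-wise products of transformed data and transformed filter.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


d = [2.0, -1.0, 0.5, 3.0]   # four consecutive input values
g = [0.5, 2.0, -1.0]        # 3-tap filter
print(direct_f23(d, g))     # [-1.5, -2.5]
print(winograd_f23(d, g))   # [-1.5, -2.5]
```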


1. Random Feature Maps

• Definition: A technique that uses random projections to map input data into a higher-
dimensional space, facilitating the extraction of features without the need for labels.

• Purpose: Helps to approximate kernel methods, enabling linear models to learn complex
functions.

• Advantages:

o Efficiency: Reduces the computational burden of traditional kernel methods while

retaining useful information.

o Scalability: Suitable for large datasets as it allows for faster training times.

• Applications: Commonly used in tasks where labeled data is scarce, such as clustering and
anomaly detection.
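
One well-known instance of this idea is random Fourier features, which approximate an RBF (Gaussian) kernel with an explicit random map. The notes do not name a specific method, so treating random Fourier features as the example is an assumption of this sketch, as are the function names, the feature count, and the fixed seed.

```python
import math
import random


def make_random_features(dim_in, dim_out, sigma=1.0, seed=0):
    """Build a random map z(x) with z(x) . z(y) approximating an RBF kernel."""
    rng = random.Random(seed)
    # Random projection directions drawn from N(0, 1/sigma^2).
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim_in)]
               for _ in range(dim_out)]
    # Random phase offsets drawn uniformly from [0, 2*pi).
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(dim_out)]
    scale = math.sqrt(2.0 / dim_out)

    def transform(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return transform


def rbf_kernel(x, y, sigma=1.0):
    """Exact RBF kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


phi = make_random_features(dim_in=2, dim_out=2000)
x, y = [0.5, -0.2], [0.1, 0.4]
approx = sum(a * b for a, b in zip(phi(x), phi(y)))
exact = rbf_kernel(x, y)
print(round(exact, 3), round(approx, 3))
```

No labels are involved in building the map; a linear model trained on `phi(x)` then behaves approximately like a kernel method, with accuracy improving as the number of random features grows.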

2. Autoencoders

• Definition: A type of neural network designed to learn efficient representations of data

through unsupervised learning by encoding the input into a lower-dimensional space and
then reconstructing it back.

• Structure:

o Encoder: Compresses the input data into a latent representation.

o Decoder: Reconstructs the original input from the latent representation.

• Purpose: Learns to capture important features and structures in the data without
supervision, making it effective for dimensionality reduction and feature extraction.

• Advantages:

o Robustness: Can learn from noisy data and still produce meaningful
representations.


o Flexibility: Can be adapted for various tasks, including denoising, anomaly

detection, and generative modeling.

• Applications: Used in scenarios such as image compression, data denoising, and

generating new data samples.
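
The encoder and decoder structure can be sketched with the smallest possible case: a linear autoencoder that compresses 2-D points to one latent value and reconstructs them, trained by gradient descent on the reconstruction error. Everything here (the toy data, the initial weights, the learning rate, the function names) is a hypothetical choice for illustration; practical autoencoders use nonlinear layers and a deep learning framework.

```python
def encode(enc, x):
    """Encoder: project a 2-D point to one latent value."""
    return enc[0] * x[0] + enc[1] * x[1]


def decode(dec, z):
    """Decoder: map the latent value back to 2-D."""
    return [dec[0] * z, dec[1] * z]


def train_autoencoder(data, lr=0.05, epochs=500):
    """Train encoder/decoder weights by batch gradient descent on squared error."""
    enc = [0.1, 0.1]
    dec = [0.1, 0.1]
    history = []
    n = len(data)
    for _ in range(epochs):
        g_enc = [0.0, 0.0]
        g_dec = [0.0, 0.0]
        loss = 0.0
        for x in data:
            z = encode(enc, x)
            recon = decode(dec, z)
            err = [recon[0] - x[0], recon[1] - x[1]]
            loss += err[0] ** 2 + err[1] ** 2
            # Error signal flowing back through the decoder to the latent value.
            back = err[0] * dec[0] + err[1] * dec[1]
            for j in range(2):
                g_dec[j] += 2 * err[j] * z
                g_enc[j] += 2 * back * x[j]
        for j in range(2):
            enc[j] -= lr * g_enc[j] / n
            dec[j] -= lr * g_dec[j] / n
        history.append(loss / n)
    return enc, dec, history


# Toy unlabeled data lying on the line x2 = 2 * x1, so one latent value suffices.
data = [[t, 2 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
enc, dec, history = train_autoencoder(data)
print(history[0], history[-1])
print(decode(dec, encode(enc, [0.5, 1.0])))
```

The reconstruction error falls as training proceeds because the single latent value learns to capture the one direction along which the data actually varies.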

3. Facilitation of Unsupervised Learning

• Role in Unsupervised Learning: Both methods enable the extraction of meaningful

features from unlabelled data, facilitating learning in scenarios where obtaining labeled
data is challenging or expensive.

• Enhancing Model Performance: By leveraging these techniques, models can improve

their performance on downstream tasks, such as clustering, classification, or regression,
even in the absence of labels.


Notable Architectures

1. LeNet-5

• Introduction:

o Developed by Yann LeCun and colleagues in 1998.

o One of the first convolutional networks designed specifically for image recognition
tasks.

• Architecture Details:

o Input Layer: Takes in grayscale images of size 32x32 pixels.

o Convolutional Layer 1:

▪ 6 filters (5x5) with a stride of 1.

▪ Output size: 28x28x6.

o Activation Function: Sigmoid or hyperbolic tangent (tanh).


o Pooling Layer 1:

▪ Average pooling (subsampling) with a 2x2 filter and a stride of 2.

▪ Output size: 14x14x6.

o Convolutional Layer 2:

▪ 16 filters (5x5).

▪ Output size: 10x10x16.

o Pooling Layer 2:

▪ Average pooling (2x2).

▪ Output size: 5x5x16.

o Fully Connected Layers:

▪ 120 neurons in the first layer.

▪ 84 neurons in the second layer.

▪ Output layer with 10 neurons (for digit classes 0-9).

• Significance:

o Introduced the concept of using convolutional layers for feature extraction followed
by pooling layers for dimensionality reduction.

o Paved the way for modern CNNs, influencing later architectures.
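
The layer sizes listed for LeNet-5 can be re-derived with the output-size relation floor((n + 2p - k) / s) + 1. The sketch below also counts parameters under two simplifying assumptions that are not in the notes: the second convolutional layer is fully connected to all 6 input maps, and the pooling layers have no trainable parameters. The original 1998 network differs on both points, so the total is for this simplified variant only.

```python
def conv_out(n, k, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (n + 2 * padding - k) // stride + 1


def conv_params(k, c_in, c_out):
    """Weights plus one bias per filter."""
    return (k * k * c_in + 1) * c_out


def dense_params(n_in, n_out):
    """Weights plus one bias per output neuron."""
    return (n_in + 1) * n_out


size = 32                            # 32x32 grayscale input
c1 = conv_out(size, 5)               # 28
s2 = conv_out(c1, 2, stride=2)       # 14
c3 = conv_out(s2, 5)                 # 10
s4 = conv_out(c3, 2, stride=2)       # 5
flat = s4 * s4 * 16                  # 400 values fed to the dense layers

params = (
    conv_params(5, 1, 6)             # conv layer 1: 156
    + conv_params(5, 6, 16)          # conv layer 2: 2416 (full connectivity assumed)
    + dense_params(flat, 120)        # 48120
    + dense_params(120, 84)          # 10164
    + dense_params(84, 10)           # 850
)
print([c1, s2, c3, s4], flat, params)  # [28, 14, 10, 5] 400 61706
```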

2. AlexNet

• Introduction:

o Developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton in 2012.

o Marked a breakthrough in deep learning by winning the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012 by a wide margin.


• Architecture Details:

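o Input Layer: Accepts RGB images, stated as 224x224 pixels in the original paper; the 55x55 output of the first convolutional layer below corresponds to an effective input of 227x227 pixels (or equivalent padding).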

o Convolutional Layer 1:

▪ 96 filters (11x11) with a stride of 4.

▪ Output size: 55x55x96.

o Activation Function: ReLU, used instead of sigmoid or tanh to improve training speed.

o Pooling Layer 1:

▪ Max pooling (3x3) with a stride of 2.

▪ Output size: 27x27x96.

o Convolutional Layer 2:

▪ 256 filters (5x5).

▪ Output size: 27x27x256.

o Pooling Layer 2:

▪ Max pooling (3x3).

▪ Output size: 13x13x256.

o Convolutional Layer 3:

▪ 384 filters (3x3).

▪ Output size: 13x13x384.

o Convolutional Layer 4:

▪ 384 filters (3x3).

▪ Output size: 13x13x384.


o Convolutional Layer 5:

▪ 256 filters (3x3).

▪ Output size: 13x13x256.

o Pooling Layer 3:

▪ Max pooling (3x3).

▪ Output size: 6x6x256.

o Fully Connected Layers:

▪ First layer with 4096 neurons.

▪ Second layer with 4096 neurons.

▪ Output layer with 1000 neurons (for 1000 classes).

• Key Techniques Introduced or Popularized:

o ReLU Activation:

▪ Enabled faster convergence during training compared to traditional

activation functions like sigmoid or tanh.

o Dropout:

▪ Regularization method that randomly drops neurons during training to

prevent overfitting, significantly improving generalization.

o Data Augmentation:

▪ Used image translations (random crops), horizontal flipping, and changes to RGB channel intensities to artificially expand the training dataset and improve robustness.


o GPU Utilization:

▪ Leveraged parallel processing power of GPUs, enabling training on large

datasets in a reasonable timeframe.

• Significance:

o Established deep learning as a powerful approach for image classification and

sparked widespread research and development in CNN architectures.

o Highlighted the importance of large labeled datasets and robust training techniques
in achieving state-of-the-art performance.
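
The AlexNet feature-map sizes listed above can be traced with the same output-size relation. The notes give kernel sizes and some strides but no padding values; the strides and paddings marked below are assumptions chosen so that the trace reproduces the listed sizes. The sketch also shows why an input of 227 rather than 224 is needed to obtain 55x55 after the first layer when no padding is used.

```python
def conv_out(n, k, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (n + 2 * padding - k) // stride + 1


# (name, kernel, stride, padding, channels)
# Padding values, and strides not stated in the notes, are assumptions.
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, 96),
    ("conv2", 5, 1, 2, 256),   # padding 2 assumed, keeps 27x27
    ("pool2", 3, 2, 0, 256),   # stride 2 assumed
    ("conv3", 3, 1, 1, 384),   # padding 1 assumed, keeps 13x13
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 2, 0, 256),   # stride 2 assumed
]


def trace(input_size):
    """Return (name, height, width, channels) after each layer."""
    size = input_size
    shapes = []
    for name, k, stride, padding, channels in LAYERS:
        size = conv_out(size, k, stride, padding)
        shapes.append((name, size, size, channels))
    return shapes


for shape in trace(227):
    print(shape)               # ends with ('pool3', 6, 6, 256)

print(trace(224)[0])           # ('conv1', 54, 54, 96): 224 does not give 55
```

The final 6x6x256 volume flattens to 9216 values, which feed the first 4096-neuron fully connected layer.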
