
UNIT IV

Convolutional Neural Network (CNN)

4.1 What is a CNN?

A Convolutional Neural Network (CNN) is a type of artificial neural network designed for processing structured grid data, such as images. CNNs are particularly effective for tasks like image recognition and classification.

Here's a brief overview:

1. Convolutional Layers: These layers apply convolutional filters (or kernels) to the input data. These filters detect local features like edges, textures, or patterns in the data.
2. Activation Function: After convolution, the output is passed through an
activation function (commonly ReLU), introducing non-linearity to help
the network learn complex patterns.
3. Pooling Layers: These layers reduce the dimensionality of the data,
helping to simplify the model and make it more computationally efficient.
Pooling operations, like max pooling, retain important features while
reducing the data size.
4. Fully Connected Layers: After several convolutional and pooling layers,
the high-level features are fed into fully connected layers, which help in
making the final classification or prediction.
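A minimal sketch of this layer stack in PyTorch (the layer sizes, a 3-channel 32x32 input, and 10 output classes are illustrative assumptions, not prescriptions):

import torch
import torch.nn as nn

# Illustrative stack: convolution -> activation -> pooling -> fully connected
model = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1),  # convolutional layer
    nn.ReLU(),                      # activation function
    nn.MaxPool2d(kernel_size=2),    # pooling layer: 32x32 -> 16x16
    nn.Flatten(),                   # flatten features for the fully connected layer
    nn.Linear(16 * 16 * 16, 10),    # fully connected layer producing 10 class scores
)

x = torch.randn(1, 3, 32, 32)       # one dummy RGB image
print(model(x).shape)               # torch.Size([1, 10])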

4.2 Representational Learning in Convolutional Layers:

1. Feature Extraction:

Convolutional layers apply a series of filters or kernels to the input data. Each filter is designed to detect specific patterns or features, such as edges, textures, or more complex shapes as you go deeper into the network. These features are learned automatically during training.

2. Hierarchical Learning:

CNNs learn features at multiple levels of abstraction. Early convolutional layers might detect simple features like edges or corners. As you go deeper, the network combines these simple features to recognize more complex patterns, such as parts of objects or even entire objects.

3. Spatial Hierarchy:

The convolutional layers maintain spatial relationships within the data. This means that the network preserves the spatial structure of the input, which is crucial for tasks like image recognition where the spatial arrangement of pixels is important.

4. Filter Learning:

During training, the filters in the convolutional layers are optimized to minimize the loss function. This means the network learns the most relevant filters for extracting useful features from the data.
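A small sketch showing that convolutional filters are ordinary learnable parameters whose gradients drive the update (the layer sizes and the loss used here are illustrative assumptions):

import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
print(conv.weight.shape)          # torch.Size([8, 3, 3, 3]) -- 8 learnable 3x3x3 filters

x = torch.randn(4, 3, 32, 32)     # a dummy batch of RGB images
loss = conv(x).pow(2).mean()      # an arbitrary loss, just to produce gradients
loss.backward()
print(conv.weight.grad.shape)     # gradients w.r.t. the filter weights, used by the optimizer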

4.3 Multichannel Convolution Operation

1. Multichannel Input

• Input Channels:
Each channel represents a different feature map or dimension of the input. For instance, a color image typically has three channels (RGB), where each channel contains different pixel intensity values for that color.
• Depth:
The depth of the input volume is determined by the number of channels. For an RGB image, the depth is 3.

2. Convolutional Filters

• Filter Dimensions:

Convolutional filters also have multiple channels. For instance, a 3x3 filter
applied to an RGB image will have a depth of 3, matching the number of
input channels.

• Filter Operation:

The filter is applied across all input channels. For each spatial position,
the filter computes a weighted sum of all channels, producing a single value
for that position in the output feature map.

3. Convolution Process

• Channel-wise Computation: For each spatial location, the filter performs a dot product between the filter weights and the corresponding pixels across all channels, summing the results to produce a single output value.
• Resulting Feature Map: This process generates a 2D feature map (or 2D
spatial output) for each filter. If multiple filters are used, each produces its
own feature map, resulting in a 3D volume of feature maps.

4. Example

Consider an RGB image with dimensions 32x32x3 (32x32 spatial resolution and
3 color channels) and a convolutional filter with dimensions 3x3x3:
• Filter Operation: The 3x3x3 filter slides over the 32x32 spatial
dimensions of the image. For each position, it multiplies its 3x3x3 weights
with the corresponding pixels in the input image’s 3 channels, summing
the results.
• Output: The result is a single value for each spatial location, producing a
2D feature map. If multiple such filters are used, the output consists of
multiple 2D feature maps, each representing different learned features.
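The same example expressed in PyTorch (a sketch; nn.Conv2d holds one 3x3x3 filter per output channel, so out_channels=1 reproduces the single 2D feature map described above):

import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)                 # one 32x32 RGB image (channels first)

one_filter = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3)
print(one_filter(x).shape)                    # torch.Size([1, 1, 30, 30]) -- one 2D feature map

many_filters = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
print(many_filters(x).shape)                  # torch.Size([1, 16, 30, 30]) -- a 3D volume of feature maps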

4.4 Recurrent Neural Network (RNN)

A Recurrent Neural Network (RNN) is a type of artificial neural network designed to handle sequential data or time-series data by maintaining a form of memory about previous inputs in the sequence. Unlike traditional feedforward neural networks, RNNs have connections that form directed cycles, allowing them to process sequences of inputs and use information from previous steps to influence current predictions. Here's a brief overview of how RNNs work:

Key Concepts of RNNs

1. Sequential Data Processing:
o Memory of Previous Inputs: RNNs are designed to process
sequences by maintaining hidden states that capture information
from previous time steps. This allows the network to consider past
inputs when making predictions about the current step.
o Time Steps: Each input in the sequence is processed in a step-by-
step manner. The output at each time step depends not only on the
current input but also on the hidden state from the previous time step.
2. Hidden State:
o Internal Memory: The hidden state of an RNN acts as its memory,
storing information from previous time steps. This hidden state is
updated as each new input is processed.
o State Transition: The hidden state transitions are governed by
learned weights and a non-linear activation function (such as tanh or
ReLU).
3. Vanishing and Exploding Gradients:
o Training Challenges: RNNs can struggle with long-term
dependencies due to issues like vanishing or exploding gradients,
where gradients become too small or too large during
backpropagation, making learning difficult.
4. Variants of RNNs:
o Long Short-Term Memory (LSTM): LSTMs are a type of RNN
designed to address the vanishing gradient problem by incorporating
memory cells and gating mechanisms. This allows them to maintain
long-term dependencies more effectively.
o Gated Recurrent Unit (GRU): GRUs are similar to LSTMs but
with a simplified architecture that uses fewer gates, making them
computationally more efficient.

How RNNs Work

1. Forward Pass:
o At each time step, the RNN takes an input and the previous hidden
state to produce a new hidden state.
o The hidden state is updated based on the current input and the
previous hidden state.
o The new hidden state is used to produce an output for the current
time step.
2. Backpropagation Through Time (BPTT):
o During training, RNNs use a variant of backpropagation called
BPTT to adjust weights based on errors. BPTT involves unfolding
the RNN through time and applying backpropagation to each step.
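A short sketch of this forward pass using torch.nn.RNN (the sequence length, batch size, and feature sizes are illustrative assumptions):

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, batch_first=True)

x = torch.randn(1, 5, 10)            # batch of 1, sequence of 5 time steps, 10 features each
h0 = torch.zeros(1, 1, 20)           # initial hidden state

outputs, hn = rnn(x, h0)             # outputs: hidden state at every time step; hn: final hidden state
print(outputs.shape)                 # torch.Size([1, 5, 20])
print(hn.shape)                      # torch.Size([1, 1, 20])

# Training would compute a loss on the outputs and call loss.backward(),
# which applies backpropagation through time (BPTT) across the unrolled steps.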

Applications of RNNs

• Natural Language Processing (NLP): RNNs are used in tasks like language modeling, text generation, and machine translation.
• Time-Series Prediction: RNNs are applied to predict future values in
financial markets, weather forecasting, and other sequential data problems.
• Speech Recognition: RNNs help in recognizing and processing spoken
language.

4.6 PyTorch

PyTorch is an open-source machine learning library developed by Facebook's AI Research lab (FAIR). It provides tools for building and training deep learning models and is known for its flexibility, ease of use, and dynamic computation graph. Here’s a concise overview of PyTorch:

Key Features of PyTorch

1. Dynamic Computation Graph (Define-by-Run)
o Flexibility: Unlike static computation graphs (like those in
TensorFlow 1.x), PyTorch uses a dynamic computation graph. This
means the graph is constructed on-the-fly as operations are
performed, making it easier to debug and modify models.
2. Tensors
o Core Data Structure: Tensors in PyTorch are multi-dimensional
arrays similar to NumPy arrays, but with additional features for GPU
acceleration.
o Operations: Tensors support a wide range of mathematical
operations, and they can be moved seamlessly between CPUs and
GPUs.
3. Autograd
o Automatic Differentiation: PyTorch’s autograd system
automatically computes gradients for tensor operations, which is
crucial for training neural networks through backpropagation.
4. Neural Network Module
o torch.nn: This module provides a high-level interface for building
neural networks. It includes pre-defined layers, loss functions, and
optimizers, making it easier to construct and train models.
5. Optimizers
o torch.optim: Contains a variety of optimization algorithms, such as
SGD, Adam, and RMSprop, to update model parameters during
training.
6. GPU Acceleration
o CUDA Support: PyTorch offers native support for CUDA, enabling
efficient computation on NVIDIA GPUs. This allows for faster
training and inference of models.
7. Interoperability
o Integration with NumPy: PyTorch tensors can be easily converted
to and from NumPy arrays, facilitating data manipulation and
integration with other libraries.
8. Community and Ecosystem
o Growing Ecosystem: PyTorch has a large and active community,
with many extensions and libraries built on top of it, including
torchvision for computer vision, torchaudio for audio processing,
and torchtext for natural language processing.
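A brief sketch touching several of the features above: tensors, autograd, optional GPU placement, and NumPy interoperability (a minimal illustration, not a complete workflow):

import torch
import numpy as np

# Tensors: multi-dimensional arrays, optionally placed on the GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(3, 3, device=device)

# Autograd: gradients are tracked automatically for tensors that require them
w = torch.randn(3, 3, device=device, requires_grad=True)
loss = (x @ w).sum()
loss.backward()
print(w.grad.shape)                  # torch.Size([3, 3])

# NumPy interoperability (CPU tensors share memory with the NumPy array)
a = np.ones(4)
t = torch.from_numpy(a)
back = t.numpy()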

4.7 CNN in PyTorch:

Steps :

➢ Install PyTorch:
Ensure you have PyTorch and necessary libraries installed.
➢ Import Libraries:
Load PyTorch and torchvision libraries.
➢ Prepare Data:
Load and preprocess the dataset using torchvision.
➢ Define the CNN Model:
Create a CNN by defining convolutional and fully connected layers.
➢ Define Loss Function and Optimizer:
Choose a loss function and optimizer for training.
➢ Train the Model:
Implement the training loop to optimize model parameters.
➢ Evaluate the Model:
Test the model on a separate test dataset to measure performance.
➢ Save and Load the Model:
Save the trained model and reload it as needed.
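A condensed sketch that follows these steps using MNIST from torchvision (the dataset choice, layer sizes, and hyperparameters are illustrative assumptions; training is shortened to one epoch):

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Prepare data: download MNIST and wrap it in a DataLoader
transform = transforms.ToTensor()
train_set = torchvision.datasets.MNIST(root="./data", train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

# Define the CNN model: convolutional layers followed by a fully connected layer
class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SimpleCNN()

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# Train the model (one epoch shown)
for images, labels in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()

# Evaluate the model on the test set
test_set = torchvision.datasets.MNIST(root="./data", train=False, download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=256)
correct = 0
with torch.no_grad():
    for images, labels in test_loader:
        correct += (model(images).argmax(dim=1) == labels).sum().item()
print("Test accuracy:", correct / len(test_set))

# Save and load the model
torch.save(model.state_dict(), "cnn_mnist.pt")
model.load_state_dict(torch.load("cnn_mnist.pt"))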
