Deep Learning Module 3
1. What is the motivation behind convolutional neural networks?
Motivation for Convolutional Neural Networks (CNNs)
Convolutional Neural Networks (CNNs) are a specialized kind of neural network
for processing data that has a known grid-like topology, such as images.
1. Sparse Interactions
Sparse Connectivity:
Explanation: Instead of connecting every input neuron to every output
neuron, CNNs connect each neuron to only a small region of the input. This
local connectivity is achieved by using kernels (or filters) that are smaller
than the input image.
Benefit: Reduces the number of parameters, leading to lower computational
cost and less risk of overfitting.
Example:
In an image with millions of pixels, detecting edges can be done with small
kernels of just hundreds of pixels. This significantly reduces the number of
connections and computations needed.
2. Parameter Sharing
Shared Parameters:
Explanation: The same parameters (weights) are used for multiple positions
of the input. This is akin to using the same stencil to draw patterns across
different parts of an image.
Benefit: This dramatically reduces the number of parameters that need to
be learned, which in turn reduces the memory requirements and improves
computational efficiency.
Outcome: The network becomes efficient in detecting the same feature at
different locations within an image.
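As a rough illustration of the savings from sparse interactions and parameter sharing, here is a back-of-the-envelope comparison in Python (the sizes are hypothetical, chosen only for the arithmetic):

```python
# Fully connected: every input pixel connects to every output unit.
input_pixels = 256 * 256 * 3      # a 256x256 RGB image
output_units = 256 * 256          # one output per spatial position
fc_params = input_pixels * output_units
print(f"Fully connected: {fc_params:,} weights")   # ~12.9 billion

# Convolutional: one shared 3x3 kernel applied at every position.
conv_params = 3 * 3 * 3           # 3x3 kernel spanning 3 input channels
print(f"Convolutional:   {conv_params:,} weights")  # 27
```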
3. Equivariant Representations
Equivariance to Translation:
Explanation: A function is equivariant to a transformation if applying the
transformation to the input and then applying the function yields the same
result as applying the function and then the transformation. Convolution is
equivariant to translation, meaning if an object in the image shifts, the
feature map shifts in the same way.
Benefit: This property makes CNNs robust to changes in the position of the
features in the input image.
Example:
In an image processing task, if an object is moved within the image, the
feature map produced by the convolutional layer will move correspondingly.
This is useful for tasks like object detection where the exact position of the
object might vary.
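A small sketch of this property, assuming NumPy and SciPy are available; a circular boundary is used so the equality also holds exactly at the image edges:

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernel = rng.random((3, 3))

def conv(x):
    # Circular boundary so the demo holds exactly at the borders.
    return correlate2d(x, kernel, mode='same', boundary='wrap')

# Shifting the input then convolving...
shifted_then_conv = conv(np.roll(image, shift=2, axis=1))
# ...equals convolving then shifting the feature map.
conv_then_shifted = np.roll(conv(image), shift=2, axis=1)
print(np.allclose(shifted_then_conv, conv_then_shifted))  # True
```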
4. Handling Variable Input Sizes
Adaptability:
Explanation: CNNs can handle input images of variable sizes due to the
nature of convolution operations. This makes them versatile for different
types of input data without requiring a fixed-size input.
Benefit: Flexibility in dealing with different sizes of input images, which is
particularly useful in real-world applications where input dimensions can
vary.
Structured Outputs in Convolutional Neural Networks (CNNs)
Structured outputs in the context of Convolutional Neural Networks (CNNs)
refer to predictions that have a specific, often complex, structure, such as
sequences, trees, or graphs, rather than simple, unstructured outputs like
single-label classifications. Here’s an explanation and some examples of how
structured outputs are handled in deep learning, particularly within the realm of
CNNs and related architectures.
Sequence-to-Sequence Models
Description: Sequence-to-sequence (seq2seq) models are designed to
convert an input sequence into an output sequence. They are commonly used
for tasks where both the input and the output are sequences of varying lengths.
Components:
Encoder: Processes the input sequence and transforms it into a fixed-
length vector.
Decoder: Uses this vector to generate the output sequence.
Types:
Autoregressive Decoders: Generate one token at a time, with each token
being dependent on the previously generated tokens.
Non-Autoregressive Decoders: Generate all tokens simultaneously,
independent of each other.
Examples:
Machine Translation (e.g., translating a sentence from English to French).
Text Summarization.
Speech Recognition.
Graph Neural Networks (GNNs)
Description: Graph neural networks operate directly on graph structures,
making them suitable for tasks that involve relationships between entities.
Mechanism: GNNs typically use message passing algorithms where nodes
communicate with their neighbors to update their representations.
Types:
Graph-Level Outputs: Generate a single output for the entire graph (e.g.,
predicting the property of a molecule).
Node-Level Outputs: Generate outputs for each node in the graph (e.g.,
classifying nodes in a social network).
Examples:
Molecule Property Prediction.
Social Network Analysis.
Recommendation Systems.
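A minimal sketch of one message-passing round, assuming NumPy; a real GNN would add learned weight matrices and non-linearities on top of this aggregation step:

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)   # adjacency: node 0 linked to 1 and 2
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])               # current node representations

deg = A.sum(axis=1, keepdims=True)
H_next = (A @ H) / deg                   # each node averages its neighbours
print(H_next)                            # node 0 -> [0.25, 0.75], etc.
```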
Tree Recursive Neural Networks (TreeRNNs)
Description: TreeRNNs are used for tasks where the input or output has a tree
structure. They recursively process subtrees to generate structured outputs.
Strategies:
Bottom-Up: Start from the leaves of the tree and work up to the root.
Top-Down: Start from the root and work down to the leaves.
Examples:
Natural Language Parsing.
Image Captioning (where the structure of the description can be tree-like).
Conditional Random Fields (CRFs)
Description: CRFs are used for sequence labeling tasks where the output is a
sequence with dependencies between the labels.
Mechanism: They model the conditional probability of the output sequence
given the input sequence and use transition probabilities to account for the
dependencies.
Training: Can be done using maximum likelihood estimation or gradient-based
methods.
Examples:
Named Entity Recognition.
Part-of-Speech Tagging.
Efficient Convolution Algorithms in Deep Learning
Convolution operations are central to many neural network architectures,
especially Convolutional Neural Networks (CNNs). To enhance computational
efficiency, various algorithms can be utilized to perform these convolutions.
Below are some of the most commonly used algorithms along with their
characteristics and trade-offs:
1. Direct Convolution
Description:
The most straightforward approach.
Involves iterating over all possible positions of the filter over the input
feature map.
For each position, the filter values are multiplied by the corresponding input
values and summed up to produce the output.
Pros:
Simple to understand and implement.
Cons:
Computationally expensive, especially for large inputs and filters.
High time complexity due to the nested loops.
Use Case:
Small-scale applications where simplicity is preferred over performance.
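A minimal NumPy sketch of the nested-loop approach; like most deep learning libraries, it actually computes cross-correlation, which is what CNNs conventionally call convolution:

```python
import numpy as np

def direct_conv2d(image, kernel):
    """Naive 'valid' cross-correlation with explicit nested loops."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Multiply the filter with the current input patch and sum.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out
```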
2. Fast Fourier Transform (FFT)-Based Convolution
Description:
Converts both the input and the filter to the frequency domain using the
Fourier transform.
Multiplies them in the frequency domain.
Converts the result back to the spatial domain using the inverse Fourier
transform.
Pros:
Can be faster than direct convolution for large inputs and filters.
Reduces the convolution operation to element-wise multiplications in the
frequency domain.
Cons:
Requires additional memory for storing the transformed data.
FFT introduces overhead for the transforms, which might not be beneficial
for smaller filters.
Use Case:
Scenarios involving very large inputs or filters where FFT can significantly
reduce computation time.
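A sketch of the idea using NumPy's FFT routines; it computes a "full" convolution via the convolution theorem:

```python
import numpy as np

def fft_conv2d(image, kernel):
    """'Full' 2-D convolution via the FFT convolution theorem."""
    out_shape = (image.shape[0] + kernel.shape[0] - 1,
                 image.shape[1] + kernel.shape[1] - 1)
    # Zero-pad both operands to the full output size, multiply
    # element-wise in the frequency domain, then transform back.
    F_img = np.fft.rfft2(image, out_shape)
    F_ker = np.fft.rfft2(kernel, out_shape)
    return np.fft.irfft2(F_img * F_ker, out_shape)
```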
3. Winograd's Minimal Filtering Algorithm
Description:
Reduces the number of multiplications needed to compute a convolution.
Transforms the filter into a smaller matrix and uses smaller matrix
multiplications to achieve the convolution.
Pros:
More efficient than direct convolution for small filters.
Reduces the arithmetic complexity of the convolution operation.
Cons:
Requires additional memory for the transformed filter.
Optimization is beneficial mainly for small filter sizes (e.g., 3x3).
Use Case:
Applications where small filters are predominant, such as certain image
processing tasks.
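For a concrete sense of the saving, here is the classic F(2,3) case sketched in Python: two outputs of a 3-tap filter computed with 4 multiplications instead of the direct method's 6 (the constants follow the standard Winograd derivation):

```python
import numpy as np

def winograd_f23(d, g):
    """Winograd F(2,3): two outputs from four inputs d and a 3-tap filter g,
    using 4 multiplications instead of the direct method's 6."""
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return np.array([m1 + m2 + m3, m2 - m3 - m4])

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, -1.0, 2.0])
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
print(np.allclose(winograd_f23(d, g), direct))  # True
```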
4. Separable Convolution
Description:
Decomposes a 2D filter into two 1D filters.
First applies the filter along rows, then along columns (or vice versa).
Pros:
Reduces the number of computations compared to regular 2D convolution.
Simplifies the convolution process.
Cons:
Can result in reduced accuracy compared to non-separable convolutions
because the decomposition might not perfectly capture the desired filtering
effect.
Use Case:
Applications where reducing computational complexity is critical, and slight
accuracy loss is acceptable.
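A small demonstration, assuming SciPy: a rank-1 kernel (the outer product of two 1-D filters) gives identical results whether applied as one 2-D convolution or as two 1-D passes:

```python
import numpy as np
from scipy.signal import convolve2d

col = np.array([[1.0], [2.0], [1.0]])   # vertical 1-D filter
row = np.array([[1.0, 2.0, 1.0]])       # horizontal 1-D filter
kernel_2d = col @ row                    # equivalent 3x3 kernel

image = np.random.default_rng(0).random((32, 32))

full_2d = convolve2d(image, kernel_2d, mode='valid')
separable = convolve2d(convolve2d(image, col, mode='valid'),
                       row, mode='valid')
print(np.allclose(full_2d, separable))  # True: 6 mults/pixel instead of 9
```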
5. Depthwise Separable Convolution
Description:
Splits the input feature map into separate channels.
Each channel is convolved with its own filter (depthwise convolution).
Follows with a pointwise convolution (1x1 convolution) to combine the
channels.
Pros:
Significantly reduces the number of computations.
More efficient for mobile and embedded applications.
Cons:
May require careful tuning to maintain accuracy.
More complex implementation compared to standard convolution.
Use Case:
Mobile and embedded applications where computational resources are
limited and efficiency is crucial.
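The cost saving can be seen with simple arithmetic (the channel and kernel sizes below are hypothetical):

```python
# Multiplication counts per output position.
C_in, C_out, K = 64, 128, 3

standard  = K * K * C_in * C_out   # one KxKxC_in filter per output channel
depthwise = K * K * C_in           # one KxK filter per input channel
pointwise = C_in * C_out           # 1x1 convolution mixing the channels
separable = depthwise + pointwise

print(standard, separable, separable / standard)  # 73728 8768 ~0.12
```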
Convolution and Pooling as an Infinitely Strong Prior
Convolution as an Infinitely Strong Prior
In a CNN, convolutions can be thought of as imposing an infinitely strong prior
over the network's weights. Specifically, this prior enforces that:
1. Weight Sharing: The weights for one hidden unit must be identical to the
weights of its neighboring units, but shifted in space.
2. Local Receptive Fields: The weights must be zero outside the small,
spatially contiguous receptive field assigned to each hidden unit.
This results in a prior that insists on the learned function being based only on
local interactions, effectively simplifying the model by reducing the number of
parameters that need to be learned.
Pooling as an Infinitely Strong Prior
Similarly, pooling operations in CNNs act as an infinitely strong prior that
enforces invariance to small translations. This means that the function learned
by the layer must produce similar outputs for slightly shifted versions of the
input. This is useful for tasks where the precise location of features is less
important than their presence, such as in image recognition tasks where the
exact location of an object within the image is not critical.
Pooling achieves this by summarizing responses over a neighborhood, allowing
for fewer pooling units than detector units and thereby reducing computational
load and improving statistical efficiency.
Trade-offs and Implications
While these priors introduce significant efficiencies, they come with trade-offs:
Underfitting: If the assumptions of local interactions and translation
invariance do not hold, the model may underfit, failing to capture important
aspects of the data. For example, tasks requiring precise spatial information
might suffer if pooling is applied indiscriminately.
Comparison of Models: Convolutional models should ideally be compared
against other convolutional models, as their built-in assumptions about
spatial relationships make them fundamentally different from non-
convolutional models. Benchmarks often separate models based on
whether they assume spatial relationships or not.
Why is Non-Linearity Essential?
The layers in a deep neural network architecture need to be non-linear to allow
the network to model complex and intricate patterns in the data. Let's break
down why this non-linearity is essential:
Linear vs. Non-linear Transformations
1. Linear Transformations:
A linear transformation involves operations like scaling, rotating, or
translating data in a linear manner. Mathematically, if you apply multiple
linear transformations (like matrix multiplications) in sequence, the
result is still a linear transformation. In other words, stacking linear
layers without any non-linear activation functions can be reduced to a
single linear transformation.
This means that a network with only linear layers, no matter how many
layers it has, can only represent linear relationships between the input
and the output. It cannot capture the complexity needed for tasks like
image recognition, natural language processing, or any problem where
the data relationships are non-linear.
2. Non-linear Transformations:
Non-linear transformations introduce elements like squaring, cubing, or
applying non-linear functions (e.g., ReLU, sigmoid, tanh) to the data.
These operations cannot be reduced to a single linear transformation
when stacked.
Non-linear activation functions, such as ReLU (Rectified Linear Unit),
sigmoid, or tanh, enable the network to learn and represent complex
patterns. By inserting these non-linearities between layers, the network
can approximate any continuous function, making it a universal function
approximator.
Mathematical Perspective
From a mathematical standpoint, if f(x) and g(x) are both linear functions, then their composition f(g(x)) is also a linear function. Thus, a neural network with only linear transformations can be collapsed into a single layer, losing the benefits of depth.
However, if f(x) or g(x) is non-linear, their composition f(g(x)) can represent a much broader class of functions. This is crucial for:
Learning Hierarchical Features: Non-linear layers allow the network to
build hierarchical features where each layer captures more abstract
representations of the data. For example, in image recognition, early layers
might detect edges, while deeper layers recognize complex structures like
faces.
Universal Approximation: The Universal Approximation Theorem states
that a neural network with at least one hidden layer and non-linear
activation functions can approximate any continuous function to any
desired precision, given enough neurons.
Common Non-linear Activation Functions
1. ReLU (Rectified Linear Unit):
ReLU(x) = max(0, x)
Introduces non-linearity by setting negative values to zero while leaving
positive values unchanged.
2. Sigmoid:
σ(x) = 1 / (1 + e^(−x))
Maps input values to the range (0, 1), useful for binary classification.
3. Tanh (Hyperbolic Tangent):
tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
Maps input values to the range (-1, 1), often used in hidden layers.
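The three functions are straightforward to write down, e.g. in NumPy:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)        # zeroes out negative values

def sigmoid(x):
    return 1 / (1 + np.exp(-x))    # squashes into (0, 1)

def tanh(x):
    return np.tanh(x)              # squashes into (-1, 1)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(x))  # values in (0, 1)
print(tanh(x))     # values in (-1, 1)
```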
Increasing the stride of a convolutional layer in a neural
network affects the output in several ways:
1. Reduced Output Size: A larger stride reduces the size of the output feature
map. For instance, if the stride is increased from 1 to 2, the output size is
roughly halved in each dimension because the convolutional filter moves
two steps at a time instead of one.
2. Downsampling: The convolution operation with a larger stride can be seen
as downsampling the input feature map. This is because the convolution
effectively skips over some input values, producing fewer output values
and thus a smaller feature map.
3. Less Computational Cost: With fewer output values to compute, the
computational cost decreases. This is beneficial for reducing the time
complexity and computational load, especially for large inputs.
4. Loss of Detail: A higher stride means that some of the fine-grained details
in the input are ignored, which might result in a loss of spatial resolution
and potentially useful information. This trade-off between computational
efficiency and the preservation of detail needs to be carefully managed
depending on the application.
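Points 1 and 2 can be seen directly: a strided convolution produces the same values as a dense (stride-1) convolution followed by subsampling. A sketch assuming SciPy:

```python
import numpy as np
from scipy.signal import correlate2d

image = np.random.default_rng(0).random((8, 8))
kernel = np.random.default_rng(1).random((3, 3))

dense = correlate2d(image, kernel, mode='valid')  # stride 1: 6x6 output
strided = dense[::2, ::2]                          # stride 2: keep every other row/column
print(dense.shape, strided.shape)                  # (6, 6) (3, 3)
```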
Maximum Stride
The maximum stride is typically constrained by the size of the input feature
map and the size of the convolutional filter. Specifically, the stride should not
exceed the dimensions of the input feature map because this would lead to
skipping all input data, resulting in an invalid or empty output.
For practical purposes, the stride is usually kept small (1 or 2) to balance the
trade-offs between output size, computational efficiency, and the amount of
detail preserved in the feature maps.
Benefits of Using Convolutional Layers Over Fully Connected
Layers for Visual Tasks
1. Sparse Interactions: Convolutional layers leverage the concept of sparse
interactions, meaning that each output unit is connected to a small subset
of input units. This is achieved by making the convolutional kernel smaller
than the input, allowing the network to detect small, meaningful features
such as edges or textures. Sparse interactions significantly reduce the
number of parameters and computations required compared to fully
connected layers, which connect every input unit to every output unit. This
reduction in parameters leads to more efficient training and less risk of
overfitting.
2. Parameter Sharing: In convolutional layers, the same set of parameters (the
convolutional kernel) is used across different spatial locations of the input.
This parameter sharing allows the network to learn features that are
invariant to location, such as recognizing an object regardless of where it
appears in the image. This is particularly useful in visual tasks where
patterns and features can appear at different locations within an image.
Fully connected layers do not have this property, as each weight is used
only once, leading to a much higher number of parameters and increased
computational cost.
VARIANTS OF CONVOLUTION
Variants of the convolution function in neural networks adapt the basic
convolution operation to optimize performance, computational efficiency, and
feature extraction capabilities for different tasks. Here are detailed explanations
of some of the key variants:
1. Multi-Channel Convolution:
In traditional convolution, a single kernel extracts one feature type at
multiple spatial locations. Multi-channel convolution extends this by
having multiple kernels, each extracting a different feature type from
the input, which is often multi-dimensional (e.g., an RGB image has
three channels for red, green, and blue).
The operation involves a 4-D kernel tensor where each output channel
is connected to all input channels with a unique filter.
2. Stride:
Convolution operations can be modified by skipping positions to reduce
computational cost, a technique known as striding. The stride
determines the step size of the filter as it moves over the input.
Stride of 1 processes every position, while a stride of 2 processes every
other position, effectively downsampling the output by a factor of 2.
This is useful for reducing the dimensionality of the feature maps while
preserving essential spatial hierarchies.
3. Padding:
To control the spatial dimensions of the output, padding is used, which
involves adding extra rows and columns to the input matrix.
"Valid" padding means no padding is added, resulting in an output
smaller than the input. "Same" padding involves adding zeros around
the border to ensure the output has the same dimensions as the input.
This is crucial for maintaining the spatial resolution of feature maps
through successive layers.
4. Dilated Convolution:
Also known as atrous convolution, this variant introduces gaps (dilations) between kernel elements, allowing the network to have a larger receptive field without increasing the number of parameters (see the sketch after this list).
Dilated convolution is particularly useful in tasks requiring a broader
contextual understanding, like semantic segmentation, where the
relationships between distant pixels are significant.
5. Transposed Convolution:
Often used in generating higher resolution outputs from lower resolution
inputs, such as in image generation tasks.
Also called deconvolution or upsampling, it works by inserting zeros between the pixels of the input and then performing a standard convolution. This effectively increases the spatial dimensions of the feature map (see the sketch after this list).
6. Separable Convolution:
Separable convolutions decompose a standard 2D convolution into two
simpler operations: a depthwise convolution (applying a single filter per
input channel) followed by a pointwise convolution (a 1x1 convolution
combining the output of the depthwise convolution).
This significantly reduces the computational complexity while still
capturing essential spatial features, making it highly efficient for mobile
and embedded applications.
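As a small illustration of variants 4 and 5 above, here is a sketch in NumPy; the effective-window formula k_eff = k + (k − 1)(d − 1) is the standard one for dilated kernels:

```python
import numpy as np

# 4. Dilated convolution: effective window of a 3x3 kernel.
k = 3
for d in (1, 2, 4):
    k_eff = k + (k - 1) * (d - 1)
    print(f"dilation {d}: 3x3 kernel covers a {k_eff}x{k_eff} window")
# dilation 1 -> 3x3, dilation 2 -> 5x5, dilation 4 -> 9x9

# 5. Transposed convolution: the zero-insertion upsampling step.
def zero_insert(x, stride=2):
    """Insert (stride - 1) zeros between neighbouring pixels."""
    h, w = x.shape
    up = np.zeros((h * stride - (stride - 1), w * stride - (stride - 1)))
    up[::stride, ::stride] = x
    return up

x = np.arange(9.0).reshape(3, 3)
print(zero_insert(x).shape)  # (5, 5): a standard convolution applied to
                             # this grid completes the transposed convolution
```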
Convolutional Neural Network (CNN) Architecture
A Convolutional Neural Network (CNN) is composed of several layers that
process and transform input data through a series of stages. Below is a
diagram and explanation of the different stages in a typical CNN
architecture:
Diagram of CNN Architecture (image omitted): Input → Convolution → ReLU → Pooling → Flatten → Fully Connected → Softmax
1. Input Layer:
The input to a CNN is typically an image represented as a 3D matrix
of pixel values. For example, a color image of size 256x256 pixels
with three color channels (RGB) would have dimensions
256x256x3.
2. Convolution Layer:
The convolution layer applies a set of filters (also called kernels) to
the input image. Each filter slides over the input image, performing a
dot product between the filter and a region of the input image to
produce a feature map.
The purpose of convolution is to extract features such as edges,
textures, and patterns from the input image.
3. ReLU Activation Layer:
After each convolution operation, an activation function is applied to
introduce non-linearity into the model. The most commonly used
activation function is the Rectified Linear Unit (ReLU), which
replaces all negative values in the feature map with zero.
This helps the network learn complex patterns and relationships in
the data.
4. Pooling Layer:
Pooling layers reduce the spatial dimensions (width and height) of
the feature maps while retaining the most important information.
This is typically done using operations like max pooling, which
selects the maximum value in each region of the feature map.
Pooling helps to make the representation smaller and more
manageable, and it also provides some level of translation
invariance.
5. Flattening:
Before passing the feature maps to the fully connected layer, a
flattening step is performed. This step involves reshaping the 3D
feature maps into a 1D vector, effectively "flattening" them.
Flattening allows the subsequent fully connected layer to treat the
entire feature map as a single input, simplifying the connectivity
pattern between the convolutional and fully connected layers.
6. Fully Connected Layer:
After several convolution and pooling layers, the high-level
reasoning in the neural network is done via fully connected layers.
These layers are similar to those found in traditional neural
networks.
Each neuron in a fully connected layer is connected to every neuron
in the previous layer. These layers integrate the features detected
by the convolutional layers and output the final classification results.
7. Softmax Layer:
The final layer of the CNN is typically a softmax layer, which
converts the raw scores from the fully connected layer into
probabilities. Each output node represents a different class, and the
softmax function ensures that the sum of the probabilities across all
classes is 1.
This probability distribution is then used to make the final
prediction.
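A minimal sketch of this pipeline, assuming PyTorch, 3x256x256 inputs, and a hypothetical 10-class task (in practice the softmax is usually folded into the loss function rather than kept as a layer):

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # 2. convolution layer
    nn.ReLU(),                                   # 3. non-linearity
    nn.MaxPool2d(2),                             # 4. pooling: 256 -> 128
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             #    128 -> 64
    nn.Flatten(),                                # 5. 3-D maps -> 1-D vector
    nn.Linear(32 * 64 * 64, 10),                 # 6. fully connected scores
    nn.Softmax(dim=1),                           # 7. class probabilities
)
```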
Suppose that a CNN was trained to classify images into different categories. It performed well on a validation set that was taken from the same source as the training set, but not on a testing set, which comes from another distribution. What could be the problem with the training of such a CNN? How will you ascertain the problem? How can those problems be solved?
When a CNN performs well on a validation set sourced from the same distribution as the training set but fails to generalize to a testing set from a different distribution, it indicates a problem with the model's ability to generalize beyond the training data. This is commonly a combination of overfitting, where the model memorizes patterns specific to the training data rather than capturing underlying structure, and distribution shift, i.e., a domain gap between the data the model was trained on and the data it is tested on.
Identifying the Problem:
1. Performance Discrepancy: Observing a significant drop in
performance on the testing set compared to the validation set is a
clear indicator of overfitting.
2. Validation-Testing Set Discrepancy: Comparing the characteristics of the validation and testing sets shows how large the gap between the two distributions is. If the model performs well on the former but poorly on the latter, that gap, combined with overfitting to the training distribution, is the likely cause.
Ascertaining the Problem:
1. Cross-Validation: Conducting cross-validation on the training set
can help validate the model's performance across different subsets
of the data. If the performance varies widely across folds, it
indicates overfitting.
2. Validation Curves: Plotting validation performance against model
complexity (e.g., number of layers, neurons) can reveal whether the
model's performance plateaus or decreases on the validation set
while improving on the training set, indicating overfitting.
Solutions to Overfitting:
1. Regularization Techniques (see the sketch after this list):
L2 Regularization: Penalize large weights in the model's
parameters to prevent over-reliance on specific features.
Dropout: Randomly deactivate neurons during training to
prevent co-adaptation and encourage robust feature learning.
2. Data Augmentation:
Introduce variations to the training data (e.g., rotations,
translations, flips) to expose the model to diverse instances of
each class, enhancing generalization.
3. Transfer Learning:
Utilize pre-trained CNN models trained on large datasets to
leverage knowledge learned from similar tasks and fine-tune
them on the specific task with limited data.
4. Ensemble Methods:
Combine predictions from multiple CNNs trained with different
initializations or architectures to reduce variance and improve
generalization.
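A sketch of the regularization techniques from point 1 above, assuming PyTorch (the layer sizes are hypothetical):

```python
import torch.nn as nn
import torch.optim as optim

# Dropout inside the classifier head...
classifier = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # randomly zero half the activations during training
    nn.Linear(256, 10),
)
# ...and L2 regularization via the optimizer's weight_decay term.
optimizer = optim.SGD(classifier.parameters(), lr=0.01, weight_decay=1e-4)
```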
Validation Strategy:
1. Holdout Validation: Split the data into three sets - training,
validation, and testing - ensuring that the validation and testing sets
come from similar distributions. Monitor the model's performance
on the validation set and use the testing set for final evaluation.
2. Cross-Validation: Perform k-fold cross-validation on the training set
to validate the model's performance across different subsets of the
data, ensuring robustness to variability.
RECURSIVE FILTERING
Recursive filtering in CNNs integrates recurrent neural network (RNN)
structures, like LSTM or GRU layers, into convolutional architectures.
This combination allows CNNs to process sequential data while
capturing temporal dependencies. Techniques include hybrid
architectures, temporal convolution, and attention mechanisms.
Recursive filtering enhances CNNs' ability to handle tasks like time
series analysis, natural language processing, and video processing.
POOLING
Pooling layers in convolutional neural networks (CNNs) offer
advantages and disadvantages, with various types catering to different
needs.
Advantages of Pooling:
1. Dimension Reduction: Pooling reduces the spatial dimensions of
feature maps, making subsequent layers computationally more
efficient.
2. Translation Invariance: Pooling captures the most important
features while reducing sensitivity to spatial translations, enhancing
the model's robustness.
3. Feature Generalization: Pooling helps in generalizing learned
features by retaining only the most prominent information from each
region.
4. Noise Robustness: Pooling can mitigate the effects of noise in the
data by emphasizing the most significant activations.
Disadvantages of Pooling:
1. Loss of Information: Pooling discards detailed spatial information,
potentially leading to loss of fine-grained features.
2. Over-Aggregation: Aggressive pooling can oversimplify
representations, leading to loss of discriminative power.
3. Pooling Bias: Certain pooling methods may introduce biases
towards specific features or regions.
4. Gradient Dilution: Pooling layers do not have learnable parameters,
so gradients may be diluted during backpropagation, potentially
hindering learning.
Types of Pooling:
1. Max Pooling: Selects the maximum value from each region of the
feature map, emphasizing the most active feature in each region.
2. Average Pooling: Computes the average value from each region, providing a smoother down-sampling mechanism compared to max pooling.
3. Global Average Pooling: Computes the average of each entire feature map, reducing the spatial dimensions to 1x1, often used as an alternative to fully connected layers for classification tasks.
4. Min Pooling: Selects the minimum value from each region, focusing on the least active features.
5. Sum Pooling: Computes the sum of values from each region, which can be useful in certain scenarios, but is less commonly used than max or average pooling.
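A small NumPy sketch of 2x2 pooling with stride 2; swapping the reduction operation gives the other variants listed above:

```python
import numpy as np

def pool2x2(x, op=np.max):
    """2x2 pooling, stride 2; pass np.mean, np.min, or np.sum for variants."""
    h, w = x.shape
    blocks = x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return op(blocks, axis=(1, 3))

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 8, 3, 2],
              [7, 6, 1, 0]], dtype=float)
print(pool2x2(x))           # max:     [[4. 8.] [9. 3.]]
print(pool2x2(x, np.mean))  # average: [[2.5 6.5] [7.5 1.5]]
```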
To calculate the size of the feature map after
convolution in a CNN, you can use the provided
equation:
output_size = floor((input_size − kernel_size + 2 × padding) / stride) + 1
Here's a step-by-step breakdown of how to use this equation:
1. Identify Variables:
input_size: Size of the input volume (width or height, assuming square input).
kernel_size: Size of the kernel/filter (width or height, assuming square kernel).
padding: Amount of padding applied to the input volume.
stride: Stride used in the convolution operation.
2. Substitute Values:
Replace these variables with their respective values in the
equation.
3. Calculate:
Plug the values into the equation and perform the arithmetic
operations to find the output size.
4. Round Down:
Since feature map dimensions are typically integer values,
round down the result to the nearest integer.
5. Repeat:
If the kernel is rectangular, apply the equation separately to the
width and height dimensions.
Example Calculation:
Input size (width): 28
Kernel size (width): 3
Padding: 1
Stride: 1
output_size = floor((28 − 3 + 2 × 1) / 1) + 1
output_size = floor(27 / 1) + 1
output_size = 27 + 1
output_size = 28
So, the output size (width) of the feature map after convolution is 28.
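The whole procedure fits in a small helper (a sketch in Python, using integer floor division for the round-down step):

```python
def conv_output_size(input_size, kernel_size, padding, stride):
    """Feature-map size after convolution, rounded down (step 4 above)."""
    return (input_size - kernel_size + 2 * padding) // stride + 1

print(conv_output_size(28, 3, 1, 1))  # 28, matching the worked example
print(conv_output_size(28, 3, 0, 2))  # 13, a strided 'valid' convolution
```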