Unit - IV

The document provides an overview of deep learning methods, focusing on feedforward networks, gradient descent, backpropagation, regularization techniques, and optimization strategies for training deep models. It discusses the challenges faced in training neural networks, such as overfitting and vanishing gradients, and introduces sequence modeling with recurrent neural networks. Key techniques for enhancing model performance, including dropout and data augmentation, are also highlighted.


Unit – IV

Historical trends in Deep Learning; Deep Learning: Overview of Methods.
Deep Feedforward Networks
Overview of Feedforward Networks

Definition:
• A feedforward network is a type of artificial neural network where connections between nodes do not form a cycle.
• Information flows in one direction – from input to output.

Key Features:
• Input Layer, Hidden Layers, Output Layer.
• Activation functions such as ReLU, Sigmoid, Tanh.

Applications:
• Image recognition, Natural Language Processing (NLP), Regression.
Feed Forward Neural Network
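
To make the structure concrete, here is a minimal sketch (not from the original slides) of a two-layer feedforward pass in NumPy; the layer sizes and the ReLU/sigmoid choices are illustrative assumptions.

import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Assumed toy sizes: 4 inputs -> 8 hidden units -> 1 output.
rng = np.random.default_rng(0)
W1, b1 = 0.1 * rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = 0.1 * rng.normal(size=(8, 1)), np.zeros(1)

def forward(x):
    h = relu(x @ W1 + b1)        # hidden layer
    return sigmoid(h @ W2 + b2)  # output layer: information flows one way

print(forward(rng.normal(size=(3, 4))))  # three example inputs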
Gradient-Based Learning

Gradient Descent:
• A method to minimize the loss function by updating weights.
• Types: Batch Gradient Descent, Stochastic Gradient Descent (SGD), Mini-batch Gradient Descent.

Challenges:
• Local minima, vanishing gradients, exploding gradients.
• Batch Gradient Descent: Uses the entire dataset in each step (slow for large data).
• Stochastic Gradient Descent (SGD): Updates weights for each data point (fast but noisy).
• Mini-batch Gradient Descent: A compromise using small batches of data.
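
A minimal sketch of the mini-batch variant, assuming a generic loss_grad(W, X_batch, y_batch) helper (a hypothetical name) that returns the gradient of the loss for that batch:

import numpy as np

def minibatch_gd(W, X, y, loss_grad, lr=0.01, batch_size=32, epochs=10):
    # Plain mini-batch gradient descent; loss_grad is assumed to
    # return dLoss/dW averaged over the given batch.
    n = X.shape[0]
    for _ in range(epochs):
        idx = np.random.permutation(n)              # reshuffle each epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            W = W - lr * loss_grad(W, X[batch], y[batch])
    return W

Setting batch_size equal to n gives batch gradient descent, and a batch_size of 1 gives SGD.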
Backpropagation

Definition:
• A supervised learning algorithm for training neural networks.
• Computes gradients of the loss with respect to the weights.

Steps:
1. Forward pass to compute the output.
2. Compute the loss (e.g., Mean Squared Error, Cross-Entropy).
3. Backward pass to propagate the error and calculate gradients.
4. Update weights using gradient descent.

Advantages: Efficient computation, scalability.
Disadvantages: Susceptible to vanishing and exploding gradients.
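
To make the four steps concrete, here is a hedged NumPy sketch of one training iteration for a single-hidden-layer network with a mean-squared-error loss; the shapes and the sigmoid activation are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, target, W1, b1, W2, b2, lr=0.1):
    # 1. Forward pass to compute the output.
    h = sigmoid(x @ W1 + b1)
    y = h @ W2 + b2
    # 2. Compute the loss (Mean Squared Error).
    loss = 0.5 * np.sum((y - target) ** 2) / x.shape[0]
    # 3. Backward pass: propagate the error and compute gradients.
    dy = (y - target) / x.shape[0]
    dW2, db2 = h.T @ dy, dy.sum(axis=0)
    dh = (dy @ W2.T) * h * (1 - h)       # derivative of sigmoid
    dW1, db1 = x.T @ dh, dh.sum(axis=0)
    # 4. Update weights using gradient descent.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    return loss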
Regularization
Overview of Regularization

• Purpose:
• Prevents overfitting by adding
constraints to the model.
• Key Techniques:
• Parameter penalties (e.g., L1, L2
regularization).
• Data augmentation.
• Dropout.
• Impact:
• Reduces model complexity.
• Improves generalization.
Parameter Penalties

L1 Regularization (Lasso):
• Adds a penalty proportional to the absolute value of the weights.
• Encourages sparsity in the model.

L2 Regularization (Ridge):
• Adds a penalty proportional to the square of the weights.
• Reduces large weights, prevents overfitting.
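
In code, both penalties are simply extra terms added to the training loss; a minimal sketch, with the penalty coefficients lam1 and lam2 as assumed names:

import numpy as np

def regularized_loss(data_loss, W, lam1=0.0, lam2=0.0):
    # data_loss: unregularized loss value; W: weight array.
    l1_penalty = lam1 * np.sum(np.abs(W))   # encourages sparsity
    l2_penalty = lam2 * np.sum(W ** 2)      # shrinks large weights
    return data_loss + l1_penalty + l2_penalty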
Data Augmentation

Definition:
• Increases the diversity of training data by applying transformations (e.g., rotation, scaling, flipping).

Examples:
• Image data: flipping, cropping, brightness adjustment.
• Text data: synonym replacement, back translation.

Benefits:
• Enhances model robustness.
• Reduces overfitting.
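
For image data, augmentation is often expressed as a transform pipeline; a sketch using torchvision (assuming it is installed; the particular transforms and sizes are illustrative):

from torchvision import transforms

# Random flip, crop, and brightness adjustment applied to each training image.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(size=224),
    transforms.ColorJitter(brightness=0.2),
    transforms.ToTensor(),
])
# augmented = augment(pil_image)   # pil_image is a hypothetical PIL input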
Dropout and Bagging

Dropout:
• Randomly drops neurons during training.
• Reduces dependency on specific neurons.

Bagging:
• Combines predictions of multiple models trained on different data subsets.
• Example: Random Forest.

Impact:
• Improves model performance and stability.
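
A minimal sketch of (inverted) dropout at training time, assuming the layer activations are held in a NumPy array:

import numpy as np

def dropout(activations, p_drop=0.5, training=True):
    # Randomly zero units with probability p_drop during training;
    # scaling by 1/(1 - p_drop) keeps the expected activation unchanged.
    if not training or p_drop == 0.0:
        return activations
    mask = np.random.rand(*activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)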
Optimization for Training Deep Models

Optimization vs Training:
• Optimization: The process of finding the best parameters (weights) to minimize the loss.
• Training: Involves both optimization and generalization to unseen data.

Challenges:
• Computational cost, saddle points, local minima.
Basic Optimization Algorithms

Gradient Descent Variants:
• SGD: Faster updates but noisy convergence.
• Mini-batch Gradient Descent: Balances speed and accuracy.

Momentum-Based Methods:
• Adds a momentum term to the gradients for faster convergence.
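
A sketch of the classical momentum update (variable names are assumptions):

def momentum_step(W, grad, velocity, lr=0.01, beta=0.9):
    # velocity accumulates an exponentially decaying sum of past gradients,
    # smoothing the descent direction and speeding up convergence.
    velocity = beta * velocity - lr * grad
    W = W + velocity
    return W, velocity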
Adaptive Learning Rates

• Adagrad: Adapts the learning rate of each parameter based on its accumulated past squared gradients.
• RMSProp: Replaces Adagrad's accumulated sum with a decaying average of squared gradients, so the learning rate does not shrink toward zero.
• Adam: Combines RMSProp-style adaptive learning rates with momentum.

Benefits: Faster convergence, better performance.
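
For illustration, a compact sketch of one Adam update with bias-corrected moment estimates; the hyperparameter values shown are the commonly used defaults, not taken from the slides:

import numpy as np

def adam_step(W, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad           # first moment (momentum)
    v = beta2 * v + (1 - beta2) * grad ** 2      # second moment (RMSProp-like)
    m_hat = m / (1 - beta1 ** t)                 # bias correction, t >= 1
    v_hat = v / (1 - beta2 ** t)
    W = W - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step size
    return W, m, v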


Sequence Modeling: Recurrent and Recursive Nets
Sequence Modeling and Unfolding Graphs

Sequence Modeling:
• Deals with sequential data (e.g., time series, speech).
• Examples: Machine Translation, Text Summarization.

Unfolding Graphs:
• Expands the recurrent structure over time steps.
• Allows computation of gradients for training.
Recurrent Neural Networks (RNNs)

Definition:
• A type of neural network for sequential data.
• Maintains a hidden state representing past inputs.

Challenges:
• Vanishing gradients, limited memory.

Solutions:
• Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU).
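
A minimal NumPy sketch of a vanilla (tanh) RNN unfolded over a sequence; all dimensions and names are illustrative assumptions:

import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    # inputs: array of shape (seq_len, input_dim).
    # The hidden state h summarizes all inputs seen so far.
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in inputs:                        # unfold over time steps
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)
        states.append(h)
    return np.stack(states)                   # hidden state at every step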
Bidirectional RNNs

Definition:
• Processes input sequences in both forward and backward directions.

Advantages:
• Captures past and future context.
• Improves performance on tasks like speech recognition.

Architecture:
• Combines outputs from forward and backward RNNs.
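
In practice a bidirectional recurrent layer is usually obtained via a library flag; a PyTorch sketch (assuming torch is available, with illustrative sizes):

import torch
import torch.nn as nn

# Forward and backward LSTMs run over the same sequence; their outputs
# are concatenated, so the output feature size doubles to 2 * hidden_size.
birnn = nn.LSTM(input_size=16, hidden_size=32,
                batch_first=True, bidirectional=True)
x = torch.randn(8, 20, 16)       # (batch, seq_len, input_size)
out, _ = birnn(x)                # out has shape (8, 20, 64)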
Deep Recurrent Networks

Definition:
• Stacks multiple RNN layers for hierarchical feature learning.

Benefits:
• Captures more complex patterns.

Challenges:
• Increased computational cost, difficulty in training.
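
Stacking recurrent layers is likewise a single argument in most libraries; a PyTorch sketch with assumed sizes:

import torch
import torch.nn as nn

# Three stacked LSTM layers: each layer consumes the hidden-state
# sequence produced by the layer below it.
deep_rnn = nn.LSTM(input_size=16, hidden_size=32,
                   num_layers=3, batch_first=True)
x = torch.randn(8, 20, 16)
out, (h_n, c_n) = deep_rnn(x)    # h_n has shape (3, 8, 32), one per layer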
Conclusion

• Summary:
• Key concepts in feedforward networks, regularization, optimization, and
sequence modeling.
• Importance of choosing the right techniques for specific tasks.
Thank you
