Unit - IV

The document provides an overview of deep learning methods, focusing on feedforward networks, gradient descent, backpropagation, regularization techniques, and optimization strategies for training deep models. It discusses the challenges faced in training neural networks, such as overfitting and vanishing gradients, and introduces sequence modeling with recurrent neural networks. Key techniques for enhancing model performance, including dropout and data augmentation, are also highlighted.


Unit – IV

Historical trends in Deep Learning; Deep Learning: Overview of Methods.
Deep Feedforward Networks
Overview of Feedforward Networks

Definition:
• A feedforward network is a type of artificial neural network where connections between nodes do not form a cycle.
• Information flows in one direction – from input to output.

Key Features:
• Input Layer, Hidden Layers, Output Layer.
• Activation functions such as ReLU, Sigmoid, Tanh.

Applications:
• Image recognition, Natural Language Processing (NLP), Regression.
Feed Forward Neural Network
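
To make the structure concrete, here is a minimal sketch (not from the original slides) of a two-layer feedforward pass in NumPy; the layer sizes and the ReLU/sigmoid choices are illustrative assumptions.

import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Assumed toy sizes: 4 inputs -> 8 hidden units -> 1 output.
rng = np.random.default_rng(0)
W1, b1 = 0.1 * rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = 0.1 * rng.normal(size=(8, 1)), np.zeros(1)

def forward(x):
    h = relu(x @ W1 + b1)        # hidden layer
    return sigmoid(h @ W2 + b2)  # output layer: information flows one way

print(forward(rng.normal(size=(3, 4))))  # three example inputs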
Gradient-Based Learning

Gradient Descent:
• A method to minimize the loss function by updating weights.
• Types: Batch Gradient Descent, Stochastic Gradient Descent (SGD), Mini-batch Gradient Descent.

Challenges:
• Local minima, vanishing gradients, exploding gradients.
• Batch Gradient Descent: Uses the entire dataset in each step (slow for large data).
• Stochastic Gradient Descent (SGD): Updates weights for each data point (fast but noisy).
• Mini-batch Gradient Descent: A compromise using small batches of data.
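
A minimal sketch of the mini-batch variant, assuming a generic loss_grad(W, X_batch, y_batch) helper (a hypothetical name) that returns the gradient of the loss for that batch:

import numpy as np

def minibatch_gd(W, X, y, loss_grad, lr=0.01, batch_size=32, epochs=10):
    # Plain mini-batch gradient descent; loss_grad is assumed to
    # return dLoss/dW averaged over the given batch.
    n = X.shape[0]
    for _ in range(epochs):
        idx = np.random.permutation(n)              # reshuffle each epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            W = W - lr * loss_grad(W, X[batch], y[batch])
    return W

Setting batch_size equal to n gives batch gradient descent, and a batch_size of 1 gives SGD.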
Backpropagation

Definition:
• A supervised learning algorithm for training neural networks.
• Computes gradients of the loss with respect to the weights.

Steps:
1. Forward pass to compute the output.
2. Compute the loss (e.g., Mean Squared Error, Cross-Entropy).
3. Backward pass to propagate the error and calculate gradients.
4. Update weights using gradient descent.

Advantages: Efficient computation, scalability.
Disadvantages: Susceptible to vanishing and exploding gradients.
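
To make the four steps concrete, here is a hedged NumPy sketch of one training iteration for a single-hidden-layer network with a mean-squared-error loss; the shapes and the sigmoid activation are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, target, W1, b1, W2, b2, lr=0.1):
    # 1. Forward pass to compute the output.
    h = sigmoid(x @ W1 + b1)
    y = h @ W2 + b2
    # 2. Compute the loss (Mean Squared Error).
    loss = 0.5 * np.sum((y - target) ** 2) / x.shape[0]
    # 3. Backward pass: propagate the error and compute gradients.
    dy = (y - target) / x.shape[0]
    dW2, db2 = h.T @ dy, dy.sum(axis=0)
    dh = (dy @ W2.T) * h * (1 - h)       # derivative of sigmoid
    dW1, db1 = x.T @ dh, dh.sum(axis=0)
    # 4. Update weights using gradient descent.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    return loss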
Regularization
Overview of Regularization

• Purpose:
• Prevents overfitting by adding
constraints to the model.
• Key Techniques:
• Parameter penalties (e.g., L1, L2
regularization).
• Data augmentation.
• Dropout.
• Impact:
• Reduces model complexity.
• Improves generalization.
Parameter Penalties

L1 Regularization (Lasso):
• Adds a penalty proportional to the absolute value of the weights.
• Encourages sparsity in the model.

L2 Regularization (Ridge):
• Adds a penalty proportional to the square of the weights.
• Reduces large weights, prevents overfitting.
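
In code, both penalties are simply extra terms added to the training loss; a minimal sketch, with the penalty coefficients lam1 and lam2 as assumed names:

import numpy as np

def regularized_loss(data_loss, W, lam1=0.0, lam2=0.0):
    # data_loss: unregularized loss value; W: weight array.
    l1_penalty = lam1 * np.sum(np.abs(W))   # encourages sparsity
    l2_penalty = lam2 * np.sum(W ** 2)      # shrinks large weights
    return data_loss + l1_penalty + l2_penalty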
Data Augmentation

Definition:
• Increases the diversity of training data by applying transformations (e.g., rotation, scaling, flipping).

Examples:
• Image data: flipping, cropping, brightness adjustment.
• Text data: synonym replacement, back translation.

Benefits:
• Enhances model robustness.
• Reduces overfitting.
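
For image data, augmentation is often expressed as a transform pipeline; a sketch using torchvision (assuming it is installed; the particular transforms and sizes are illustrative):

from torchvision import transforms

# Random flip, crop, and brightness adjustment applied to each training image.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(size=224),
    transforms.ColorJitter(brightness=0.2),
    transforms.ToTensor(),
])
# augmented = augment(pil_image)   # pil_image is a hypothetical PIL input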
Dropout and Bagging

Dropout:
• Randomly drops neurons during training.
• Reduces dependency on specific neurons.

Bagging:
• Combines predictions of multiple models trained on different data subsets.
• Example: Random Forest.

Impact:
• Improves model performance and stability.
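
A minimal sketch of (inverted) dropout at training time, assuming the layer activations are held in a NumPy array:

import numpy as np

def dropout(activations, p_drop=0.5, training=True):
    # Randomly zero units with probability p_drop during training;
    # scaling by 1/(1 - p_drop) keeps the expected activation unchanged.
    if not training or p_drop == 0.0:
        return activations
    mask = np.random.rand(*activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)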
Optimization for Training Deep Models

Optimization vs Training:
• Optimization: The process of finding the best parameters (weights) to minimize the loss.
• Training: Involves both optimization and generalization to unseen data.

Challenges:
• Computational cost, saddle points, local minima.
Basic Optimization Algorithms

Gradient Descent Variants:
• SGD: Faster updates but noisy convergence.
• Mini-batch Gradient Descent: Balances speed and accuracy.

Momentum-Based Methods:
• Adds a momentum term to the gradients for faster convergence.
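
A sketch of the classical momentum update (variable names are assumptions):

def momentum_step(W, grad, velocity, lr=0.01, beta=0.9):
    # velocity accumulates an exponentially decaying sum of past gradients,
    # smoothing the descent direction and speeding up convergence.
    velocity = beta * velocity - lr * grad
    W = W + velocity
    return W, velocity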
Adaptive Learning Rates

• Adagrad: Adapts the learning rate of each parameter based on its accumulated past squared gradients.
• RMSProp: Replaces Adagrad's accumulated sum with a decaying average of squared gradients, so the learning rate does not shrink toward zero.
• Adam: Combines RMSProp-style adaptive learning rates with momentum.

Benefits: Faster convergence, better performance.
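
For illustration, a compact sketch of one Adam update with bias-corrected moment estimates; the hyperparameter values shown are the commonly used defaults, not taken from the slides:

import numpy as np

def adam_step(W, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad           # first moment (momentum)
    v = beta2 * v + (1 - beta2) * grad ** 2      # second moment (RMSProp-like)
    m_hat = m / (1 - beta1 ** t)                 # bias correction, t >= 1
    v_hat = v / (1 - beta2 ** t)
    W = W - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step size
    return W, m, v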


Sequence Modeling: Recurrent and Recursive Nets
Sequence Modeling and Unfolding Graphs

Sequence Modeling:
• Deals with sequential data (e.g., time series, speech).
• Examples: Machine Translation, Text Summarization.

Unfolding Graphs:
• Expands the recurrent structure over time steps.
• Allows computation of gradients for training.
Recurrent Neural Networks (RNNs)

Definition:
• A type of neural network for sequential data.
• Maintains a hidden state representing past inputs.

Challenges:
• Vanishing gradients, limited memory.

Solutions:
• Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU).
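
A minimal NumPy sketch of a vanilla (tanh) RNN unfolded over a sequence; all dimensions and names are illustrative assumptions:

import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    # inputs: array of shape (seq_len, input_dim).
    # The hidden state h summarizes all inputs seen so far.
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in inputs:                        # unfold over time steps
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)
        states.append(h)
    return np.stack(states)                   # hidden state at every step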
Bidirectional RNNs

Definition:
• Processes input sequences in both forward and backward directions.

Advantages:
• Captures past and future context.
• Improves performance on tasks like speech recognition.

Architecture:
• Combines outputs from forward and backward RNNs.
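
In practice a bidirectional recurrent layer is usually obtained via a library flag; a PyTorch sketch (assuming torch is available, with illustrative sizes):

import torch
import torch.nn as nn

# Forward and backward LSTMs run over the same sequence; their outputs
# are concatenated, so the output feature size doubles to 2 * hidden_size.
birnn = nn.LSTM(input_size=16, hidden_size=32,
                batch_first=True, bidirectional=True)
x = torch.randn(8, 20, 16)       # (batch, seq_len, input_size)
out, _ = birnn(x)                # out has shape (8, 20, 64)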
Deep Recurrent Networks

Definition:
• Stacks multiple RNN layers for hierarchical feature learning.

Benefits:
• Captures more complex patterns.

Challenges:
• Increased computational cost, difficulty in training.
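
Stacking recurrent layers is likewise a single argument in most libraries; a PyTorch sketch with assumed sizes:

import torch
import torch.nn as nn

# Three stacked LSTM layers: each layer consumes the hidden-state
# sequence produced by the layer below it.
deep_rnn = nn.LSTM(input_size=16, hidden_size=32,
                   num_layers=3, batch_first=True)
x = torch.randn(8, 20, 16)
out, (h_n, c_n) = deep_rnn(x)    # h_n has shape (3, 8, 32), one per layer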
Conclusion

• Summary:
• Key concepts in feedforward networks, regularization, optimization, and
sequence modeling.
• Importance of choosing the right techniques for specific tasks.
Thank you
