
Deep Learning

What is Multilayer Perceptron in neural network?


A multilayer perceptron (MLP) is a fully connected feedforward artificial neural network (ANN). The term MLP is used ambiguously: sometimes loosely to mean any feedforward ANN, and sometimes strictly to refer to networks composed of multiple layers of perceptrons (with threshold activation).

What is a Feed Forward Neural Network?

A feed forward neural network is an artificial neural network in which the connections between nodes do not form a cycle. The opposite of a feed forward neural network is a recurrent neural network, in which certain pathways are cycled. The feed forward model is the simplest form of neural network, as information is only processed in one direction. While the data may pass through multiple hidden nodes, it always moves forward and never backwards.
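As a minimal sketch of this one-directional flow, the NumPy snippet below (with made-up layer sizes and random weights) pushes an input through one hidden layer and an output layer, never feeding anything backwards.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# Hypothetical layer sizes: 4 inputs -> 8 hidden units -> 3 outputs
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(x):
    # Information flows strictly forward: input -> hidden -> output
    h = relu(x @ W1 + b1)
    return h @ W2 + b2

print(forward(rng.normal(size=(1, 4))).shape)  # (1, 3)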

What is Backpropagation?

Backpropagation is the essence of neural network training. It is the method of fine-tuning the weights of a neural network based on the error rate obtained in the previous epoch (i.e., iteration). Proper tuning of the weights reduces error rates and makes the model reliable by improving its generalization.

Backpropagation is short for "backward propagation of errors." It is a standard method of training artificial neural networks. This method computes the gradient of a loss function with respect to all the weights in the network.
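As a minimal illustration, assume a tiny one-hidden-layer network with a mean squared error loss; the gradients below are obtained by the chain rule, propagating the error from the output back to each weight matrix.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 4))               # mini-batch of inputs
y = rng.normal(size=(16, 1))               # targets
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 1))

# Forward pass
h = np.tanh(x @ W1)
y_hat = h @ W2
loss = np.mean((y_hat - y) ** 2)

# Backward pass: propagate the error from the output back to each weight
d_yhat = 2 * (y_hat - y) / y.shape[0]      # dL/dy_hat
dW2 = h.T @ d_yhat                         # dL/dW2
d_h = d_yhat @ W2.T                        # dL/dh
dW1 = x.T @ (d_h * (1 - h ** 2))           # dL/dW1, using tanh' = 1 - tanh^2

# One gradient descent step on each weight matrix
lr = 0.1
W1 -= lr * dW1
W2 -= lr * dW2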

Gradient descent variants: Gradient Descent (GD), Momentum-based GD, Nesterov Accelerated GD, Stochastic GD

What is AdaGrad?
Adaptive Gradient Algorithm (Adagrad) is an algorithm for gradient-based optimization. The learning
rate is adapted component-wise to the parameters by incorporating knowledge of past observations.
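A sketch of the AdaGrad update rule (the learning rate and epsilon are common but assumed defaults): the running cache of squared gradients is what makes the effective step size per-parameter.

import numpy as np

def adagrad_step(w, grad, cache, lr=0.01, eps=1e-8):
    # Accumulate squared gradients observed so far for each parameter
    cache += grad ** 2
    # Frequently updated components get a smaller effective learning rate
    w -= lr * grad / (np.sqrt(cache) + eps)
    return w, cache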

RMSProp

RMSProp is an effective extension of gradient descent and one of the preferred approaches for fitting deep learning neural networks. Empirically, it has proven to be a practical optimization algorithm for deep neural networks.
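For comparison, a sketch of the RMSProp update: instead of summing all past squared gradients as AdaGrad does, it keeps an exponentially decaying average, so the effective learning rate does not keep shrinking toward zero. The decay value of 0.9 is a common but assumed choice.

import numpy as np

def rmsprop_step(w, grad, cache, lr=0.001, decay=0.9, eps=1e-8):
    # Exponentially decaying average of squared gradients (not a full sum)
    cache = decay * cache + (1 - decay) * grad ** 2
    w -= lr * grad / (np.sqrt(cache) + eps)
    return w, cache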
Eigenvalues and Eigenvectors

Eigenvectors are often referred to as right vectors, which simply means a column vector (as opposed to a row vector or a left vector). A right vector is a vector as we usually understand it. Eigenvalues are the coefficients applied to eigenvectors that give the vectors their length or magnitude. Picking the features which represent the data and eliminating less useful features is an example of dimensionality reduction. We can use eigenvalues and eigenvectors to identify the dimensions which are most useful and prioritize our computational resources toward them.
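A small NumPy sketch of that idea on made-up data: decompose the covariance matrix, sort the directions by eigenvalue, and keep only the directions that carry the most variance.

import numpy as np

X = np.random.default_rng(0).normal(size=(200, 5))   # hypothetical data
Xc = X - X.mean(axis=0)                              # centre the features
cov = np.cov(Xc, rowvar=False)                       # 5x5 covariance matrix

eigvals, eigvecs = np.linalg.eigh(cov)               # symmetric matrix -> eigh
order = np.argsort(eigvals)[::-1]                    # largest eigenvalues first

# Keep the 2 directions with the largest eigenvalues (most variance)
top2 = eigvecs[:, order[:2]]
X_reduced = Xc @ top2                                # shape (200, 2)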

Principal Component Analysis

Principal Component Analysis is an unsupervised learning algorithm that is used for dimensionality reduction in machine learning. It is a statistical process that converts the observations of correlated features into a set of linearly uncorrelated features with the help of an orthogonal transformation. These new transformed features are called the principal components. It is one of the popular tools used for exploratory data analysis and predictive modeling. It is a technique for drawing strong patterns from the given dataset by reducing its dimensionality while retaining as much variance as possible.

PCA generally tries to find the lower-dimensional surface to project the high-dimensional data.

PCA works by considering the variance of each attribute, because an attribute with high variance shows a good split between the classes, and hence it reduces the dimensionality. Some real-world applications of PCA are image processing, movie recommendation systems, and optimizing the power allocation in various communication channels. It is a feature extraction technique, so it keeps the important variables and drops the least important ones.
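A minimal example using scikit-learn's PCA on hypothetical data; keeping 3 components is an arbitrary choice for illustration.

import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(1).normal(size=(200, 10))  # hypothetical feature matrix

pca = PCA(n_components=3)              # keep the 3 principal components
X_pca = pca.fit_transform(X)           # project onto the new orthogonal axes

print(pca.explained_variance_ratio_)   # fraction of variance captured by each component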

The Singular Value Decomposition

The Singular Value Decomposition (SVD) of a matrix is a factorization of that matrix into three matrices. It
has some interesting algebraic properties and conveys important geometrical and theoretical insights about
linear transformations. It also has some important applications in data science. In this article, I will try to
explain the mathematical intuition behind SVD and its geometrical meaning.

Mathematics behind SVD

The SVD of an m x n matrix A is given by the formula:

A = U W V^{T}

where:

 U: m x n matrix of the orthonormal eigenvectors of AA^{T}.
 V^{T}: transpose of an n x n matrix containing the orthonormal eigenvectors of A^{T}A.
 W: an n x n diagonal matrix of the singular values, which are the square roots of the eigenvalues of A^{T}A.
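A quick NumPy check of the factorization on a made-up matrix: np.linalg.svd returns the singular values as a vector, which can be placed on a diagonal to rebuild A.

import numpy as np

A = np.random.default_rng(2).normal(size=(6, 4))   # hypothetical m x n matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U @ diag(s) @ Vt
A_rebuilt = U @ np.diag(s) @ Vt

print(np.allclose(A, A_rebuilt))                   # True: the factorization reproduces A
print(s)                                           # singular values (square roots of eigenvalues of A^T A)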
PCA is essentially a linear transformation, but autoencoders are capable of modelling complex non-linear functions. PCA features are totally linearly uncorrelated with each other, since they are projections onto an orthogonal basis.

Regularization

Regularization helps with the effects of out-of-control parameters by using different methods to minimize
parameter size over time.

In mathematical notation, we see regularization represented by the coefficient lambda, controlling the trade-
off between finding a good fit and keeping the value of certain feature weights low as the exponents on
features increase.

L1 and L2 regularization penalties help fight overfitting by making certain weights smaller. Smaller-
valued weights lead to simpler hypotheses, which are the most generalizable. Unregularized weights with
several higher-order polynomials in the feature set tend to overfit the training set.

As the input training set size grows, the effect of regularization decreases, and the parameters tend to
increase in magnitude. This is appropriate because an excess of features relative to training set examples
leads to overfitting in the first place. Bigger data is the ultimate regularizer.
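A sketch of how an L2 penalty can be folded into a loss function; the lambda value here is an arbitrary placeholder.

import numpy as np

def l2_regularized_loss(y_hat, y, weights, lam=0.01):
    # Data-fit term plus a penalty that grows with the squared weight magnitudes
    mse = np.mean((y_hat - y) ** 2)
    penalty = lam * np.sum(weights ** 2)
    return mse + penalty

# The gradient of the penalty adds 2 * lam * w to each weight's gradient,
# which nudges every weight toward smaller values on each update (weight decay).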

Regularized autoencoders

There are other ways to constrain the reconstruction of an autoencoder than imposing a hidden layer of smaller dimension than the input. Regularized autoencoders use a loss function that encourages the model to have properties other than simply copying the input to the output. We generally find two types of regularized autoencoder: the denoising autoencoder and the sparse autoencoder.

Denoising autoencoder

One way we can modify the autoencoder to learn useful features is by changing the inputs: we add random noise to the input and ask the network to recover the original form by removing that noise. This prevents the autoencoder from simply copying the data from input to output, because the input contains random noise. We ask it to subtract the noise and produce the meaningful underlying data. This is called a denoising autoencoder.
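A minimal Keras-style sketch of a denoising autoencoder, assuming flattened 784-dimensional inputs, an arbitrary 64-unit bottleneck, and synthetic data: noisy inputs go in, and the loss compares the reconstruction against the clean data.

import numpy as np
from tensorflow.keras import layers, Model

# Hypothetical clean data in [0, 1], corrupted with Gaussian noise
x_clean = np.random.rand(1000, 784).astype("float32")
x_noisy = np.clip(x_clean + 0.3 * np.random.normal(size=x_clean.shape), 0.0, 1.0).astype("float32")

inputs = layers.Input(shape=(784,))
encoded = layers.Dense(64, activation="relu")(inputs)       # bottleneck
decoded = layers.Dense(784, activation="sigmoid")(encoded)  # reconstruction
autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")

# Train to recover the clean signal from the noisy input
autoencoder.fit(x_noisy, x_clean, epochs=5, batch_size=64)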

Bias and Variance

Introduction

You must have heard of the terms bias and variance even if you're new to the domain, but it's common for budding data scientists to confuse the two. It's essential to understand that no machine learning model can be 100% accurate. As a matter of fact, it's not even supposed to be. There are always going to be some prediction errors - bias and variance. Understanding the bias-variance tradeoff is an integral part of a data scientist's learning path.
Bias

Bias is the skewness in a machine learning model occurring due to incorrect assumptions in the machine learning process. Bias can be defined as the error between model predictions and the actual results. Essentially, it describes how well the model captures the trends in the training data set.

 A model which doesn’t capture the trends in training data set well is said to show high bias.
 A model with low bias resembles the trends in the data set.

Characteristics of a high bias model include:

 Failure to capture proper data trends


 Likely to underfit
 Gives an overly simplified view of the data

Variance

Practically, variance can be defined as the model's sensitivity to changes in the data set, or how robust the model is.

It is the variability in the model prediction - how much the learned function adjusts to changes in the data set. More complex models lead to high variance. Models having high bias have low variance, and vice versa.

Characteristics of a high variance model include:

 Noisy dataset
 Likely to overfit
 Non-generalised/ complex model
 Accounting for outliers
What is supervised greedy layer-wise pre-training?

Greedy layer-wise pretraining provides a way to develop deep multi-layered neural networks whilst
only ever training shallow networks. Pretraining can be used to iteratively deepen a supervised model or
an unsupervised model that can be repurposed as a supervised model.

What is Normalization vs Batch Normalization?

Normalization is a procedure to change the values of numeric variables in the dataset to a common scale, without distorting differences in the ranges of values.

Batch normalization is a technique for training very deep neural networks that normalizes the inputs to a layer for every mini-batch. This has the effect of stabilizing the learning process and drastically reducing the number of training epochs required to train deep neural networks.
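A sketch of what batch normalization computes for one mini-batch, using NumPy; gamma and beta are the learnable scale and shift parameters, here simply set to ones and zeros.

import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the mini-batch, then rescale and shift
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

batch = np.random.default_rng(3).normal(loc=5.0, scale=2.0, size=(32, 10))
out = batch_norm(batch, gamma=np.ones(10), beta=np.zeros(10))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # roughly 0 and 1 per feature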
What is a vector representation in deep learning?
A vector is often represented as a 1-dimensional array of numbers, referred to as components, and is displayed either in column form or row form. Represented geometrically, vectors typically represent coordinates within an n-dimensional space, where n is the number of dimensions.

CNN

In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network
(ANN), most commonly applied to analyze visual imagery.[1] CNNs are also known as Shift Invariant or
Space Invariant Artificial Neural Networks (SIANN), based on the shared-weight architecture of the
convolution kernels or filters that slide along input features and provide translation-equivariant responses
known as feature maps.[2][3] Counter-intuitively, most convolutional neural networks are not invariant to
translation, due to the downsampling operation they apply to the input.[4] They have applications in image
and video recognition, recommender systems,[5] image classification, image segmentation, medical image
analysis, natural language processing,[6] brain–computer interfaces,[7] and financial time series.[8]
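A naive NumPy sketch of the sliding-kernel operation that produces a feature map; the image size and the hand-written edge kernel are made up for illustration, and in a real CNN the kernel values are learned.

import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image; each position yields one feature-map value
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.default_rng(4).normal(size=(8, 8))
edge_kernel = np.array([[1.0, 0.0, -1.0]] * 3)    # simple vertical-edge filter
print(conv2d(image, edge_kernel).shape)           # (6, 6) feature map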

CNN types

LeNet, AlexNet, ZF-Net, VGGNet, GoogLeNet, ResNet

What is a recurrent neural network in deep learning?


A recurrent neural network (RNN) is a type of artificial neural network which uses sequential data or
time series data.
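A minimal NumPy sketch of a recurrent forward pass over a hypothetical sequence: the same weight matrices are reused at every time step, and the hidden state carries information forward through the sequence.

import numpy as np

rng = np.random.default_rng(5)
Wx, Wh, b = rng.normal(size=(4, 8)), rng.normal(size=(8, 8)), np.zeros(8)

def rnn_forward(inputs):
    # inputs: sequence of 4-dimensional vectors, processed one step at a time
    h = np.zeros(8)
    for x_t in inputs:
        h = np.tanh(x_t @ Wx + h @ Wh + b)   # new state depends on input and previous state
    return h

sequence = rng.normal(size=(10, 4))          # hypothetical sequence of length 10
print(rnn_forward(sequence).shape)           # (8,) final hidden state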

What is BPTT in deep learning?

Backpropagation through time (BPTT) is a gradient-based technique for training certain types of
recurrent neural networks. It can be used to train Elman networks. The algorithm was independently
derived by numerous researchers.

What is vanishing and exploding gradients?

Vanishing gradients occur when the derivatives become smaller and smaller as we go backward through the layers during backpropagation, so the earliest layers receive almost no learning signal. Exploding gradients are the exact opposite: the derivatives or slopes grow larger and larger as we go backward with every layer. The exploding-gradient problem happens because of the weights, not because of the activation function.
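A tiny numeric illustration of why this happens: backpropagation multiplies many local derivatives together, so a factor slightly below or above 1 shrinks or blows up exponentially with depth.

# 100 layers (or time steps), each contributing one multiplicative factor
for factor in (0.9, 1.1):
    grad = 1.0
    for _ in range(100):
        grad *= factor
    print(factor, grad)   # 0.9 -> ~2.7e-05 (vanishing), 1.1 -> ~1.4e+04 (exploding)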
What is an encoder decoder model?

The best way to understand the concept of an encoder-decoder model is by playing Pictionary. The rules of the game are very simple: player 1 randomly picks a word from a list and needs to sketch its meaning in a drawing. The role of the second player in the team is to analyse the drawing and identify the word it describes. In this example we have three important elements: player 1 (the person that converts the word into a drawing), the drawing (a rabbit), and player 2 (the person that guesses the word the drawing represents). This is all we need to understand an encoder-decoder model; below we will build a comparison between the Pictionary game and an encoder-decoder model for translating Spanish to English.

[Figure: Pictionary game analogy, image by the author]

If we translate the above diagram into machine learning concepts, we get the corresponding encoder-decoder diagram. In the following sections we will go through each component.
