
UNIT – II AUTOENCODERS AND AUTOREGRESSIVE MODELS

Autoencoders – Regularised autoencoders – Stochastic Encoders and Decoders – Autoregressive Models – Fully Visible Sigmoid Belief Network (FVSBN) – Neural Autoregressive Density Estimation (NADE) – Masked Autoencoder for Distribution Estimation (MADE)
Autoencoders
Regularized Autoencoder
• An undercomplete autoencoder, with code dimension less than the input dimension, can learn the most salient features of the data distribution.
• Undercomplete autoencoders fail to learn anything useful if the encoder and decoder are given too much capacity. A similar problem occurs if the hidden code is allowed to have dimension equal to the input, and in the overcomplete case, in which the hidden code has dimension greater than the input.
• In these cases, even a linear encoder and linear decoder can learn to
copy the input to the output without learning anything useful about
the data distribution.
• Rather than limiting the model capacity by keeping the encoder
and decoder shallow and the code size small, regularized
autoencoders use a loss function that encourages the model to have
other properties besides the ability to copy its input to its output.
• These other properties include sparsity of the representation,
smallness of the derivative of the representation, and robustness to
noise or to missing inputs.
• A regularized autoencoder can be nonlinear and overcomplete but
still learn something useful about the data distribution even if the
model capacity is great enough to learn a trivial identity function.
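A minimal sketch of such a regularized objective, assuming PyTorch; the layer sizes, the penalty weight lam, and the choice of an L1 penalty on the code (one way to encourage sparse representations) are illustrative assumptions, not part of the original notes:

```python
import torch
import torch.nn as nn

# Sketch: an overcomplete autoencoder whose training loss adds a penalty on
# the code h, so copying the input is no longer the best solution.
# Layer sizes and the penalty weight lam are illustrative choices.
encoder = nn.Sequential(nn.Linear(784, 1024), nn.ReLU())
decoder = nn.Sequential(nn.Linear(1024, 784), nn.Sigmoid())

def regularized_loss(x, lam=1e-3):
    h = encoder(x)                       # code (larger than the input here)
    x_hat = decoder(h)                   # reconstruction
    recon = ((x_hat - x) ** 2).mean()    # reconstruction term
    penalty = h.abs().mean()             # L1 penalty encouraging sparse codes
    return recon + lam * penalty
```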
Types of Autoencoders
• Denoising Autoencoder
• A denoising autoencoder works on a partially corrupted input and is trained to recover the original, undistorted input. This is an effective way to prevent the network from simply copying the input, forcing it to learn the underlying structure and important features of the data (a training-step sketch follows this subsection).
• Advantages
• This type of autoencoder can extract important features and reduce the
noise or the useless features.
• Denoising autoencoders can be used as a form of data augmentation: the restored images can be used as additional training samples.
• Disadvantages
• Selecting the right type and level of noise to introduce can be
challenging and may require domain knowledge.
• The denoising process can discard some information that is needed from the original input; this loss can impact the accuracy of the output.
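A minimal training-step sketch for a denoising autoencoder, assuming PyTorch; the architecture, the Gaussian corruption with noise_std, and the optimizer settings are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Sketch of a denoising-autoencoder step: corrupt the input with Gaussian
# noise, reconstruct, but compare against the *clean* input.
autoencoder = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),    # encoder
    nn.Linear(128, 784), nn.Sigmoid()  # decoder
)
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

def denoising_step(x_clean, noise_std=0.3):
    x_noisy = x_clean + noise_std * torch.randn_like(x_clean)  # corrupt input
    x_hat = autoencoder(x_noisy)
    loss = nn.functional.mse_loss(x_hat, x_clean)  # target is the clean input
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```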
• Sparse Autoencoder
• This type of autoencoder typically contains more hidden units than input units, but only a few are allowed to be active at once. This property is called the sparsity of the network.
• The sparsity of the network can be controlled by manually zeroing the required hidden units, tuning the activation functions, or adding a sparsity penalty to the cost function (a sketch of such a penalty follows this subsection). This avoids relying on a narrow bottleneck layer.
• Advantages
• The sparsity constraint in sparse autoencoders helps in filtering out
noise and irrelevant features during the encoding process.
• These autoencoders often learn important and meaningful features due
to their emphasis on sparse activations.
• Disadvantages
• The choice of hyperparameters plays a significant role in the performance of this autoencoder; different inputs should result in the activation of different nodes of the network.
• The application of sparsity constraint increases computational
complexity.
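A minimal sketch of one common sparsity penalty added to the cost function, assuming PyTorch; the KL-based penalty with target activation rho and weight beta, and the layer sizes, are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Sketch: push the average activation of each hidden unit towards a small
# target rho via a KL term, on top of the reconstruction loss.
encoder = nn.Sequential(nn.Linear(784, 1024), nn.Sigmoid())  # overcomplete code
decoder = nn.Linear(1024, 784)

def sparse_loss(x, rho=0.05, beta=3.0):
    h = encoder(x)
    x_hat = decoder(h)
    recon = ((x_hat - x) ** 2).mean()
    rho_hat = h.mean(dim=0).clamp(1e-6, 1 - 1e-6)  # mean activation per hidden unit
    kl = (rho * torch.log(rho / rho_hat)
          + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat))).sum()
    return recon + beta * kl
```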
• Variational Autoencoder
• Variational autoencoder makes strong assumptions about the
distribution of latent variables and uses the Stochastic Gradient
Variational Bayes (SGVB) estimator in the training process.
• It assumes that the data is generated by a directed graphical model p_θ(x | z) and tries to learn an approximation q_φ(z | x) to the posterior p_θ(z | x), where φ and θ are the parameters of the encoder and the decoder respectively (a minimal sketch follows this subsection).
• Advantages
• Variational Autoencoders are used to generate new data points that
resemble the original training data. These samples are learned from the
latent space.
• A variational autoencoder is a probabilistic framework for learning a compressed representation of the data that captures its underlying structure and variations, which makes it useful for anomaly detection and data exploration.
• Disadvantages
• Variational autoencoders use approximations to estimate the true distribution of the latent variables. This approximation introduces some error, which can affect the quality of generated samples.
• The generated samples may only cover a limited subset of the true data
distribution. This can result in a lack of diversity in generated samples.
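A minimal sketch of the VAE loss with the reparameterisation trick, assuming PyTorch; the 32-dimensional latent space, the single-layer encoder and decoder, and the assumption that inputs lie in [0, 1] are illustrative choices:

```python
import torch
import torch.nn as nn

# Sketch: encoder outputs mean and log-variance, the reparameterisation trick
# draws z, and the loss is the negative ELBO (reconstruction + KL to N(0, I)).
enc = nn.Linear(784, 2 * 32)                    # assumed 32-dimensional latent space
dec = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())

def vae_loss(x):                                # x assumed to lie in [0, 1]
    mu, logvar = enc(x).chunk(2, dim=-1)
    eps = torch.randn_like(mu)
    z = mu + torch.exp(0.5 * logvar) * eps      # reparameterisation trick
    x_hat = dec(z)
    recon = nn.functional.binary_cross_entropy(x_hat, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl                           # negative ELBO
```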
• Convolutional Autoencoder
• Convolutional autoencoders are a type of autoencoder that use
convolutional neural networks (CNNs) as their building blocks.
• The encoder consists of multiple layers that take an image (or grid) as input and pass it through convolution layers, forming a compressed representation of the input.
• The decoder is the mirror image of the encoder: it deconvolves the compressed representation and tries to reconstruct the original image (a sketch of such a pair follows this subsection).
• Advantages
• Convolutional autoencoders can compress high-dimensional image data into a lower-dimensional representation, improving the storage and transmission efficiency of image data.
• Convolutional autoencoder can reconstruct missing parts of an
image. It can also handle images with slight variations in object
position or orientation.
• Disadvantages
• These autoencoders are prone to overfitting; proper regularization techniques should be used to tackle this issue.
• Compression can cause information loss, which can result in the reconstruction of a lower-quality image.
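A minimal sketch of a convolutional encoder/decoder pair for 1×28×28 images, assuming PyTorch; the channel counts and layer depths are illustrative assumptions:

```python
import torch.nn as nn

# Sketch: the encoder downsamples with strided convolutions, the decoder
# mirrors it with transposed convolutions back to the input resolution.
conv_encoder = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),   # 28x28 -> 14x14
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # 14x14 -> 7x7
    nn.ReLU(),
)
conv_decoder = nn.Sequential(
    nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2,
                       padding=1, output_padding=1),         # 7x7 -> 14x14
    nn.ReLU(),
    nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2,
                       padding=1, output_padding=1),         # 14x14 -> 28x28
    nn.Sigmoid(),
)
```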
Stochastic Encoders and Decoders
• Autoencoders are just feedforward networks. The same loss
functions and output unit types that can be used for traditional
feedforward networks are also used for autoencoders.
• The general strategy for designing the output units and the loss function of a feedforward network is to define an output distribution p(y | x) and minimize the negative log-likelihood −log p(y | x).
• In that setting, y was a vector of targets, such as class labels.
• In the case of an autoencoder, x is now the target as well as the input.
• Given a hidden code h, we may think of the decoder as providing a conditional distribution p_decoder(x | h). We may then train the autoencoder by minimizing −log p_decoder(x | h).
• The exact form of this loss function will change depending on the form of p_decoder.
• As with traditional feedforward networks, we usually use linear
output units to parametrize the mean of a Gaussian distribution if x is
real-valued. In that case, the negative log-likelihood yields a mean
squared error criterion.
• Similarly, binary x values correspond to a Bernoulli distribution whose
parameters are given by a sigmoid output unit, discrete x values
correspond to a softmax distribution, and so on.
• Stochastic encoder: p_encoder(h | x) = p_model(h | x)
• Stochastic decoder: p_decoder(x | h) = p_model(x | h)
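A short sketch of the correspondence between the decoder's output distribution and the training loss, assuming PyTorch; the tensors and shapes are illustrative placeholders:

```python
import torch
import torch.nn as nn

# A Bernoulli p_decoder(x | h) with sigmoid outputs gives a binary
# cross-entropy criterion; a unit-variance Gaussian with linear (mean)
# outputs gives a mean squared error criterion.
x_binary = torch.randint(0, 2, (8, 784)).float()  # binary targets
x_real = torch.randn(8, 784)                      # real-valued targets
logits = torch.randn(8, 784)   # decoder output before the sigmoid
mean = torch.randn(8, 784)     # decoder output parametrising a Gaussian mean

# -log p_decoder(x | h) for a Bernoulli decoder:
nll_bernoulli = nn.functional.binary_cross_entropy_with_logits(
    logits, x_binary, reduction='sum')

# -log p_decoder(x | h) for a unit-variance Gaussian decoder (up to a constant):
nll_gaussian = 0.5 * ((x_real - mean) ** 2).sum()
```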
Autoregressive models
• An autoregressive model is a statistical model that describes a sequence of
observations where each observation depends on its preceding values. In
other words, the model predicts the next value in a sequence based on the
previous values. Autoregressive models are commonly used in time series
analysis, signal processing, and various other fields.
• The term "autoregressive" is derived from the idea that the model
regresses a variable onto itself. The mathematical representation of an
autoregressive model of order p, often denoted as AR(p), is given by the
following equation:
• X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + … + φ_p X_{t−p} + ε_t
• X_t is the value at time t; φ_1, φ_2, …, φ_p are the parameters of the model, representing the weights;
• X_{t−1}, X_{t−2}, …, X_{t−p} are the lagged values of the series (past observations); ε_t is a white-noise term representing the error or randomness at time t.
• The order p indicates how many past observations are considered in
predicting the current value. If p=1, it's an AR(1) model, and if p=2, it's
an AR(2) model, and so on.
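A short NumPy sketch of this regression-on-past-values view: simulate an AR(2) series and recover the coefficients by least squares on the lagged values; the true coefficients and the series length are illustrative assumptions:

```python
import numpy as np

# Simulate X_t = 0.6 X_{t-1} + 0.3 X_{t-2} + eps_t, then regress X_t on its lags.
rng = np.random.default_rng(0)
phi_true = np.array([0.6, 0.3])
x = np.zeros(500)
for t in range(2, 500):
    x[t] = phi_true[0] * x[t - 1] + phi_true[1] * x[t - 2] + rng.normal()

# Lagged design matrix [X_{t-1}, X_{t-2}] and target X_t.
X_lagged = np.column_stack([x[1:-1], x[:-2]])
y = x[2:]
phi_hat, *_ = np.linalg.lstsq(X_lagged, y, rcond=None)
print(phi_hat)   # should be close to [0.6, 0.3]
```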
Types of Autoregressive Models
• Autoregressive models are used in various domains and have
different formulations:
• 2.1 Traditional Autoregressive Models
• These models are used in time series forecasting. The two most
common ones are:
• AR (Autoregressive) Model
• The AR model predicts the future value of a variable based on a weighted
sum of its past values.
• Example: AR(1) (first-order AR model)
• ARMA (Autoregressive Moving Average) Model
• Combines AR and MA (Moving Average) models.
• ARMA(p, q) includes both autoregressive terms and past error terms:

X_t = c + φ_1 X_{t−1} + … + φ_p X_{t−p} + ε_t + θ_1 ε_{t−1} + … + θ_q ε_{t−q}
• ARIMA (Autoregressive Integrated Moving Average) Model
• Extends ARMA by incorporating differencing to handle non-stationary time series: ARIMA(p, d, q) fits an ARMA(p, q) model to the series after differencing it d times.
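A hedged usage sketch with the statsmodels library (assuming it is installed); the toy random-walk-with-drift series and the ARIMA(1, 1, 1) order are illustrative assumptions:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA  # assumes statsmodels is installed

# Fit an ARIMA(1, 1, 1) to a toy non-stationary series and forecast ahead.
rng = np.random.default_rng(1)
y = np.cumsum(0.5 + rng.normal(size=200))   # random walk with drift

model = ARIMA(y, order=(1, 1, 1))           # (p, d, q): AR order, differencing, MA order
result = model.fit()
print(result.params)
print(result.forecast(steps=5))             # forecasts on the original scale
```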
• Two examples of data from autoregressive models with different parameters. Left: AR(1) with y_t = 18 − 0.8 y_{t−1} + ε_t. Right: AR(2) with y_t = 8 + 1.3 y_{t−1} − 0.7 y_{t−2} + ε_t. In both cases, ε_t is normally distributed white noise with mean zero and variance one.
Sigmoid Belief Network
Learning Rule for Sigmoid Belief Network
Fully Visible Sigmoid Belief Network
(FVSBN)
• A sigmoid belief network without any hidden units is denoted a fully visible sigmoid belief network (FVSBN).
• The conditional variables x_i | x_1, …, x_{i−1} in an FVSBN are Bernoulli with parameters given by the model.
• Some conditionals are too complex to represent exactly, so FVSBN assumes a logistic-regression form:

p(x_i = 1 | x_1, x_2, …, x_{i−1}) = f_i(x_1, x_2, …, x_{i−1}; α^{(i)}) = σ(α_0^{(i)} + α_1^{(i)} x_1 + ⋯ + α_{i−1}^{(i)} x_{i−1})
• σ denotes the sigmoid function.
• The conditional for variable x_i requires i parameters, and hence the total number of parameters in the model is given by

Σ_{i=1}^{n} i = O(n²) ≪ O(2^n)
Gan Z., Henao R., Carlson D., et al. Learning Deep Sigmoid Belief Networks with Data Augmentation. Artificial Intelligence and Statistics (AISTATS), 2015.
FVSBN Example
• Suppose we have a dataset D of handwritten digits (binarised
MNIST)

• Each image has n = 28×28×1 = 784 pixels. Each pixel can either be black (0) or white (1).
• We want to learn a probability distribution p(x) = p(x_1, …, x_784) over x ∈ {0,1}^784 such that when x ~ p(x), x looks like a digit.
• Idea: define an FVSBN model, then pick a good one based on the training data D.
FVSBN Example
• We can pick an ordering, i.e., order the variables (pixels) from top-left (x_1) to bottom-right (x_784).
• Use the product-rule factorisation, which holds without loss of generality:

p(x_1, …, x_784) = p(x_1) p(x_2 | x_1) p(x_3 | x_1, x_2) ⋯ p(x_784 | x_1, …, x_783)
• FVSBN modelling assumption (fewer parameters):

x̂_i = p(x_i = 1 | x_1, x_2, …, x_{i−1}) = f_i(x_1, x_2, …, x_{i−1}; α^{(i)}) = σ(α_0^{(i)} + α_1^{(i)} x_1 + ⋯ + α_{i−1}^{(i)} x_{i−1})
• Note: this is a modelling assumption. We are using logistic regression to predict the next pixel's distribution based on the previous ones; this is why the model is called autoregressive.
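A NumPy sketch of an FVSBN over n = 784 binary pixels: each conditional is a logistic regression over the previous pixels. The parameters below are random placeholders for illustration, not values learned from MNIST:

```python
import numpy as np

n = 784
rng = np.random.default_rng(0)
alpha0 = rng.normal(size=n) * 0.01                       # biases alpha_0^{(i)}
alpha = [rng.normal(size=i) * 0.01 for i in range(n)]    # weights alpha^{(i)} over x_1..x_{i-1}

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample():
    """Sample an image pixel by pixel in the chosen ordering."""
    x = np.zeros(n)
    for i in range(n):
        p_i = sigmoid(alpha0[i] + alpha[i] @ x[:i])  # p(x_i = 1 | x_1..x_{i-1})
        x[i] = rng.random() < p_i
    return x

def log_prob(x):
    """Log-likelihood of a binary image under the product of conditionals."""
    lp = 0.0
    for i in range(n):
        p_i = sigmoid(alpha0[i] + alpha[i] @ x[:i])
        lp += x[i] * np.log(p_i) + (1 - x[i]) * np.log(1 - p_i)
    return lp
```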
Neural Autoregressive Density Estimation (NADE)

https://www.youtube.com/watch?v=uLVo6KtWk2
Masked Autoencoder for Distribution Estimation (MADE)
Thank you
