DNN Lab Manual
(SCSEA)
Lab Manual
Course Objectives:
Course Outcomes:
Program Outcomes:
PO1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.
PO2. Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems, reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.
PO3. Design/development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for public health and safety, and the cultural, societal, and environmental considerations.
PO4. Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.
PO5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.
PO6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
● Please switch off the fans and lights and keep the chair in proper position before leaving the Lab
● Maintain proper discipline in Lab
Index
*Absent/Attended/Late/Partially Completed/Completed
CERTIFICATE
Practical 01
Objectives:
1. Generate a synthetic dataset with a polynomial relationship.
2. Add noise to the dataset.
3. Use the polyfit function to fit a polynomial curve to the data.
4. Visualize the original data and the fitted curve.
Software/Tool: Python/Jupyter/Anaconda/Colab
Theory:
Polynomial curve fitting:
Polynomial curve fitting is a mathematical technique used to approximate a relationship between two variables using a
polynomial function. It has various applications in fields such as statistics, engineering, physics, economics, and more.
It is a very valuable tool in data analysis.
Some common applications of polynomial curve fitting include data modeling, interpolation and extrapolation, function approximation, signal processing, regression analysis, and scientific research.
A Simple Regression Problem
• We observe a real-valued input variable x and wish to use this observation to predict the value of a real-valued target variable t.
• We use synthetically generated data from the function sin(2πx) with random noise included in the target values: a small level of random noise having a Gaussian distribution.
• We have a training set comprising N observations of x, written x ≡ (x1, ..., xN)^T, together with corresponding observations of the values of t, denoted t ≡ (t1, ..., tN)^T.
• Our goal is to predict the value of t for some new value of x. The training data set consists of N = 10 points (shown as blue circles in the plot), and the green curve shows the actual function sin(2πx) used to generate the data.
• We aim to predict the value of t for a new value of x without knowledge of the green curve.
Algorithm :
1. Generate a synthetic dataset of input points x and target values t from the function sin(2πx).
2. Add a small amount of Gaussian noise to the target values.
3. Fit a polynomial of a chosen degree to the noisy data using the polyfit function.
4. Evaluate the fitted polynomial on a dense grid of x values using polyval.
5. Plot the noisy data, the true curve, and the fitted polynomial curve.
Program code :
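The original program listing appears as an image in the printed manual. The sketch below is a minimal reconstruction of the steps in the objectives, assuming NumPy and Matplotlib; the polynomial degree (3) and the noise level (0.1) are illustrative choices, not the exact values used in the lab.

import numpy as np
import matplotlib.pyplot as plt

# 1. Generate a synthetic dataset from sin(2*pi*x) (N = 10 points).
rng = np.random.default_rng(0)
N = 10
x = np.linspace(0, 1, N)
t_true = np.sin(2 * np.pi * x)

# 2. Add a small amount of Gaussian noise to the target values.
t = t_true + rng.normal(scale=0.1, size=N)

# 3. Fit a polynomial curve to the noisy data with polyfit.
degree = 3                              # illustrative choice of polynomial order
coeffs = np.polyfit(x, t, degree)

# 4. Visualize the original data and the fitted curve.
x_plot = np.linspace(0, 1, 200)
plt.scatter(x, t, color="blue", label="noisy observations")
plt.plot(x_plot, np.sin(2 * np.pi * x_plot), "g-", label="sin(2*pi*x)")
plt.plot(x_plot, np.polyval(coeffs, x_plot), "r--", label=f"degree-{degree} fit")
plt.xlabel("x")
plt.ylabel("t")
plt.legend()
plt.show()

Increasing the degree makes the curve pass closer to the noisy points but eventually overfits; comparing a few degrees visually is a useful exercise.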
Output :
Conclusion:
In this practical, we used a method called polynomial fitting to find a curve that closely matches our data
points. This curve helps us better understand how the variables in our dataset are related. By doing this, we
get a clearer picture of how things are connected, as shown in the graph we generated.
Practical 02
Student Name: Rugved Agasti
Date of Experiment: 31-01-2024
Date of Submission:
PRN No: 20210802075
Aim: Implement a basic single perceptron for regression and classification on XOR and XNOR truth table
Objectives:
Single Layer Perceptron
1. Define a simple dataset with linearly separable classes.
2. Implement a perceptron with a step activation function.
3. Train the perceptron using the perceptron learning algorithm.
4. Evaluate the model on the dataset.
Multi-Layer Perceptron
1. Define a simple dataset (e.g., XOR problem).
2. Implement a multilayer perceptron architecture.
3. Train the network using backpropagation.
4. Evaluate the model on the dataset.
Software/Tool: Python/Jupyter/Anaconda/Colab
Theory:
A single layer perceptron (SLP) is a feed-forward network based on a threshold transfer function. The SLP is the simplest type of artificial neural network and can only classify linearly separable cases with a binary target (1, 0). The single layer perceptron has no a priori knowledge, so the initial weights are assigned randomly. The SLP sums all the weighted inputs and, if the sum is above a predetermined threshold, the SLP is said to be activated (output = 1).
The input values are presented to the perceptron, and if the predicted output is the same as the desired output, then the
performance is considered satisfactory and no changes to the weights are made. However, if the output does not match
the desired output, then the weights need to be changed to reduce the error.
The most famous example of the inability of the perceptron to solve problems with linearly non-separable cases is the XOR problem. However, a multi-layer perceptron using the backpropagation algorithm can successfully classify the XOR data. A multi-layer perceptron (MLP) has the same structure as a single layer perceptron, with one or more hidden layers.
Program Code :
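The printed listing is reproduced as an image. Below is a minimal sketch, assuming NumPy; it is not the class-based implementation (with predict_regression and predict_classification methods) referred to in the conclusion, but it shows the same idea: a single-layer perceptron with a step activation trained by the perceptron learning rule on the XOR truth table. Because XOR is not linearly separable, the training error never reaches zero.

import numpy as np

# XOR truth table: inputs and binary targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 1, 1, 0])            # use [1, 0, 0, 1] for the XNOR gate

def step(z):
    """Threshold (step) activation for binary classification."""
    return np.where(z >= 0, 1, 0)

# Initialize weights and bias randomly; set learning rate and epochs.
rng = np.random.default_rng(1)
w = rng.normal(size=2)
b = 0.0
lr, epochs = 0.1, 50

for epoch in range(epochs):
    errors = 0
    for xi, ti in zip(X, t):
        y = step(xi @ w + b)
        # Perceptron learning rule: adjust weights by (target - prediction).
        w += lr * (ti - y) * xi
        b += lr * (ti - y)
        errors += int(ti != y)
    if errors == 0:
        break

print("final predictions:", step(X @ w + b))   # never matches XOR exactly

Replacing the single perceptron with a small multi-layer network trained by backpropagation (for example sklearn.neural_network.MLPClassifier with one hidden layer) classifies XOR correctly, which is the point made in the theory above.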
Output :
Conclusion:
The provided code implements a single-layer perceptron for XOR and XNOR gate classification and regression tasks. It initializes the perceptron with specified input and output sizes, learning rate, and epochs. Through training, it adjusts weights based on the input data to minimize prediction errors. The predict_regression and predict_classification methods compute outputs for regression and classification tasks, respectively. The code demonstrates the inability of a single-layer perceptron to learn and classify non-linearly separable data such as the XOR and XNOR gates, showcasing its limitations when dealing with more complex problems that require multi-layered architectures for better performance.
Practical 03
Aim: Build a basic neural network from scratch using a high-level library like TensorFlow or PyTorch. Use an appropriate dataset.
Objectives:
1. Select two datasets, one for a prediction model and one for a classification model.
2. Implement the prediction model using ReLU and linear activation functions.
3. Implement the classification model using ReLU and sigmoid activation functions.
Software/Tool: Python/Jupyter/Anaconda/Colab
Theory:
Neural networks are computational models that mimic the complex functions of the human brain. They consist of interconnected nodes, or neurons, that process and learn from data, enabling tasks such as pattern recognition and decision making in machine learning.
Working of a Neural Network
A neural network is composed of an input layer, one or more hidden layers, and an output layer, each made up of coupled artificial neurons. The two stages of the basic process are called forward propagation and backpropagation.
Forward Propagation
● Input Layer: Each feature in the input layer is represented by a node on the network, which receives input data.
● Weights and Connections: The weight of each neuronal connection indicates how strong the connection is.
● Hidden Layers: Each hidden layer neuron processes inputs by multiplying them by weights, adding them up, and then passing them through an activation function. This introduces non-linearity, enabling the network to recognize intricate patterns.
● Output: The final result is produced by repeating the process until the output layer is reached.
Backpropagation
● Loss Calculation: The network's output is evaluated against the true target values, and a loss function is used to compute the difference. For a regression problem, the Mean Squared Error (MSE) is commonly used as the cost function.
Loss Function: MSE = (1/n) Σ (y_i − ŷ_i)², the average squared difference between the predicted and true values.
● Gradient Descent: Gradient descent is then used by the network to reduce the loss. To lower the error, the weights are adjusted based on the derivative of the loss with respect to each weight.
● Adjusting Weights: The weights at each connection are adjusted by applying this iterative process, called backpropagation.
● Training: During training with different data samples, the entire process of forward propagation, loss calculation, and backpropagation is repeated iteratively, enabling the network to adapt and learn patterns from the data.
● Activation Functions: Non-linearity is introduced into the model by activation functions such as the rectified linear unit (ReLU) or sigmoid. They decide whether a neuron should "fire" based on its total weighted input.
The final layer of the neural network will have one neuron and the value it returns is a continuous numerical value. To
understand the accuracy of the prediction, it is compared with the true value which is also a continuous number.
Loss Function
Mean squared error (MSE) — This finds the average squared difference between the predicted value and the true
value
For a classification model, the final layer of the neural network will have one neuron and will return a value between 0 and 1, which can be interpreted as a probability.
To understand the accuracy of the prediction, it is compared with the true value: if the data point belongs to that class, the true value is 1, otherwise it is 0.
Loss Function
Binary Cross Entropy — Cross entropy quantifies the difference between two probability distributions. Our model predicts a distribution {p, 1−p}, since we have a binary outcome, and we use binary cross-entropy to compare this with the true distribution {y, 1−y}:
BCE = −[y log(p) + (1 − y) log(1 − p)]
Program code
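The code pages are reproduced as images in the printed manual. The sketch below is a compact TensorFlow/Keras reconstruction of the two models described in the objectives; the datasets (California housing for regression, breast cancer for classification), layer sizes, and epoch counts are illustrative assumptions, not the exact configuration used in the lab.

import tensorflow as tf
from sklearn.datasets import fetch_california_housing, load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# ---- Prediction (regression) model: ReLU hidden layer, linear output ----
X, y = fetch_california_housing(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

reg = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(X.shape[1],)),
    tf.keras.layers.Dense(1, activation="linear"),   # continuous output
])
reg.compile(optimizer="adam", loss="mse")
reg.fit(X_tr, y_tr, epochs=10, verbose=0)
print("regression test MSE:", reg.evaluate(X_te, y_te, verbose=0))

# ---- Classification model: ReLU hidden layer, sigmoid output ----
Xc, yc = load_breast_cancer(return_X_y=True)
Xc_tr, Xc_te, yc_tr, yc_te = train_test_split(Xc, yc, random_state=0)
sc = StandardScaler().fit(Xc_tr)
Xc_tr, Xc_te = sc.transform(Xc_tr), sc.transform(Xc_te)

clf = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(Xc.shape[1],)),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability output
])
clf.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
clf.fit(Xc_tr, yc_tr, epochs=10, verbose=0)
print("classification test accuracy:", clf.evaluate(Xc_te, yc_te, verbose=0)[1])

The regression model uses the MSE loss and the classification model uses binary cross-entropy, matching the two loss functions discussed in the theory.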
Output:
Output
Conclusion:
In summary, our experiment with a basic prediction model using ReLU and linear activation
functions, run for 10 iterations, provided valuable insights into the learning process of neural
networks. We observed weight updates at each epoch, enhancing our understanding of how neural
networks function.
Practical 04
Student Name: Rugved Agasti
Date of Experiment: 21-02-2024
Date of Submission:
PRN No: 20210802075
Aim: Optimize hyperparameters for a neural network model. Implement a hyperparameter optimization strategy and compare the performance of different hyperparameter configurations for both classification and regression tasks.
Objectives:
1. Choose a dataset and neural network architecture.
2. Define a range of hyperparameters to tune.
3. Implement a hyperparameter optimization strategy (e.g. grid search, random search).
4. Compare the performance with different hyperparameter configurations for both classification and
regression tasks.
Software/Tool: Python/Jupyter/Anaconda/Colab
Theory:
Hyper parameters are the variables which determines the network structure (Eg: Number of Hidden Units) and the
variables which determine how the network is trained (Eg: Learning Rate). Hyper parameters are set before training
(before optimizing the weights and bias).
Hyperparameters related to neural networks:
1) Number of hidden layers and units
● Many hidden units within a layer, combined with regularization techniques, can increase accuracy; too few units may cause underfitting.
2) Dropout
● Generally, use a small dropout value of 20%-50% of neurons, with 20% providing a good starting point.
● A probability that is too low has minimal effect, and a value that is too high results in under-learning by the network.
● You are likely to get better performance when dropout is used on a larger network, giving the model more of an opportunity to learn independent representations.
3) Network weight initialization
● Ideally, it may be better to use different weight initialization schemes according to the activation function used on each layer.
4) Activation function
• Activation functions are used to introduce non-linearity into models, which allows deep learning models to learn non-linear prediction boundaries.
• Softmax is used in the output layer when making multi-class predictions.
Methods used to find good hyperparameter values include grid search and random search; in this practical a random search is used.
Algorithm :
1. Importing necessary libraries
2. Loading the Fashion MNIST dataset
3. Visualizing a sample of the dataset
4. Normalizing the images
5. Building a sequential model
6. Compiling the model
7. Fitting the model on training data
8. Evaluating the model on test data
9. Defining a function to build the model with hyperparameters
10. Instantiating the RandomSearch tuner.
11. Summarizing the search space for hyperparameters
12. Fitting the tuner on the training dataset
13. Summarizing the results of the hyperparameter search
Program Code :
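The listing is an image in the printed manual; below is a condensed sketch of the algorithm above. It assumes the keras-tuner package is installed; the search ranges, trial count, and epoch count are illustrative.

import tensorflow as tf
import keras_tuner as kt

# Load and normalize the Fashion MNIST dataset.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def build_model(hp):
    """Build a model whose layer width, dropout, and learning rate are hyperparameters."""
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(hp.Int("units", 32, 512, step=32), activation="relu"),
        tf.keras.layers.Dropout(hp.Float("dropout", 0.2, 0.5, step=0.1)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    lr = hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

# Random search over the defined hyperparameter space.
tuner = kt.RandomSearch(build_model, objective="val_accuracy",
                        max_trials=5, directory="tuning", project_name="fmnist")
tuner.search_space_summary()
tuner.search(x_train, y_train, epochs=3, validation_split=0.2)
tuner.results_summary()

A baseline model trained with fixed hyperparameters can be evaluated on the test set first, so its accuracy can be compared against the best configuration found by the tuner.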
Output :
Conclusion:
In this lab we optimized the hyperparameters with the help of the Keras Tuner, which searched over the hyperparameter space to find the optimal model configuration, helping to improve generalization, efficiency, and performance.
Practical 05
Aim: Develop a Python function (Multivariate Optimization) to compute the Hessian matrix for a given scalar-valued function of
multiple variables.
Objectives:
1. Define objective function
2. Define the gradient of the objective function
3. Implement Hessian matrix function using objective and gradient
4. Perform optimization using Newton’s method
Software/Tool: Python/Jupyter/Anaconda/Colab
Theory:
In mathematics, the Hessian matrix or Hessian is a square matrix of second-order partial derivatives of a scalar-valued
function. It describes the local curvature of a function of many variables. Hessian matrices belong to a class of
mathematical structures that involve second order derivatives. They are often used in machine learning and data science
algorithms for optimizing a function of interest. The Hessian matrix is a mathematical tool used to calculate the curvature
of a function at a certain point in space.
The formula for the Hessian matrix of a scalar-valued function f(x1, ..., xn) is the n×n matrix of second-order partial derivatives, with entries H_ij = ∂²f / (∂x_i ∂x_j).
Algorithm :
Program Code :
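The listing is an image in the printed manual. The sketch below is a minimal reconstruction under stated assumptions: the objective f(x, y) = x² + 3y² + 2xy is an illustrative example, the Hessian is built by finite differences of the analytic gradient, and both plain gradient descent and Newton's method (x ← x − H⁻¹∇f) are run for comparison.

import numpy as np

def f(x):
    """Example scalar-valued objective of two variables (illustrative)."""
    return x[0] ** 2 + 3 * x[1] ** 2 + 2 * x[0] * x[1]

def grad_f(x):
    """Analytic gradient of the objective."""
    return np.array([2 * x[0] + 2 * x[1], 6 * x[1] + 2 * x[0]])

def hessian(grad, x, eps=1e-5):
    """Approximate the Hessian by central finite differences of the gradient."""
    n = len(x)
    H = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        H[:, j] = (grad(x + step) - grad(x - step)) / (2 * eps)
    return H

def gradient_descent(x0, lr=0.1, max_iter=500, tol=1e-8):
    """Plain gradient descent for comparison: x <- x - lr * grad f."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - lr * g
    return x

def newton_method(x0, tol=1e-8, max_iter=50):
    """Newton's method: x <- x - H^{-1} grad f, using the Hessian above."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - np.linalg.solve(hessian(grad_f, x), g)
    return x

print("Hessian at [1, 1]:\n", hessian(grad_f, np.array([1.0, 1.0])))
print("gradient descent minimizer:", gradient_descent([5.0, -3.0]))
print("Newton's method minimizer:", newton_method([5.0, -3.0]))

For this quadratic objective, Newton's method converges in a single step, while gradient descent needs many small updates; that contrast is the point of the comparison in the conclusion.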
Output :
Conclusion:
In this lab, we have implemented a program to optimize a function using both the gradient descent method
and Newton's method with the assistance of the Hessian matrix. This enabled us to find the optimal solution
by iteratively updating parameters.
Practical 06
Objectives:
1. Choose a dataset suitable for Bayesian classification.
2. Implement a Bayesian classifier using probabilistic models.
3. Train the classifier and evaluate its performance.
Bayes' theorem states that
P(w|X) = P(X|w) P(w) / P(X)
where P(X|w) is the likelihood, the probability of observing X given that event w has occurred; P(w) is the prior probability of event w; and P(X) is the marginal probability of event X occurring (the marginal probability is the probability of a particular event happening without considering any other events, i.e., the individual probability of a single event).
P(w|X) is the posterior probability, which represents the updated probability of event w happening given that event X has occurred.
To understand how Bayes formula is derived, see conditional probability
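As a quick illustration of the formula, the short computation below uses assumed numbers for the prior and likelihoods (they are not taken from the lab data) to obtain the posterior P(w|X):

# Worked example of Bayes' rule with illustrative (assumed) numbers:
# prior P(w) = 0.3, likelihood P(X|w) = 0.8, and P(X|not w) = 0.2.
p_w = 0.3
p_x_given_w = 0.8
p_x_given_not_w = 0.2

# Marginal probability of X via the law of total probability.
p_x = p_x_given_w * p_w + p_x_given_not_w * (1 - p_w)

# Posterior P(w|X) = P(X|w) P(w) / P(X)
p_w_given_x = p_x_given_w * p_w / p_x
print(f"P(w|X) = {p_w_given_x:.3f}")   # approximately 0.632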
Algorithm:
Program Code
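The code is reproduced as an image in the printed manual. The sketch below is a minimal version consistent with the conclusion (a Gaussian Bayesian classifier on the Iris dataset), assuming scikit-learn; the split ratio and random seed are illustrative.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report

# Load the Iris dataset and split into train and test sets.
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

# Train a Gaussian naive Bayes classifier (probabilistic Bayesian model).
model = GaussianNB()
model.fit(X_tr, y_tr)

# Evaluate its performance on the held-out test data.
y_pred = model.predict(X_te)
print("accuracy:", accuracy_score(y_te, y_pred))
print(classification_report(y_te, y_pred))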
Output :
Conclusion:
In this practical we implemented a Bayesian (Gaussian) classifier using the Iris dataset, performed classification with it, and evaluated the accuracy of the model, which came out as 1.0, illustrating why Bayesian models are regarded as trusted models.
Practical 07
Aim: Implement Principal component analysis for dimensionality reduction of data points.
Objectives:
1. Load iris dataset as an example
2. Standardize the data
3. Implement PCA Algorithm
4. Calculate the cumulative explained variance
5. Determine the number of components to keep for given variance
6. Apply PCA with the selected number of components.
Software/Tool: Python/Jupyter/Anaconda/Colab
Theory :
Principal Component Analysis (PCA) is a dimensionality reduction technique widely used in data analysis and machine learning. Its primary goal is to transform high-dimensional data into a lower-dimensional representation, capturing the most important information.
How PCA works:
1. Standardization
Standardize the data when features are measured in diverse units. This entails subtracting the mean and dividing by the
standard deviation for each feature. Failure to standardize data with features of varying scales can result in misleading
components.
2. Compute the Covariance Matrix
Calculate the covariance matrix of the standardized data.
3. Calculate Eigenvectors and Eigenvalues
Determine the eigenvectors and eigenvalues of the covariance matrix.
Eigenvectors represent the directions (principal components), and eigenvalues represent the magnitude of variance in those directions.
4. Sort Eigenvalues
Sort the eigenvalues in descending order. The eigenvectors corresponding to the highest eigenvalues are the principal
components that capture the most variance in the data.
5. Select Principal Components
Choose the top k eigenvectors (principal components) based on the explained variance needed. Typically, you aim to retain
a significant portion of the total variance, like 85%.
6. Transform the Data
Now we can transform the original data using the eigenvectors. If we have n data points, each with m dimensions, the data matrix is X (m × n) and the matrix of the top k eigenvectors is P (k × m). The transformed data is
Y = PX : (k × m)(m × n) = (k × n)
Hence, our new transformed matrix has n data points with k dimensions.
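The transformation above can be reproduced directly with NumPy. The short sketch below, on random data purely for illustration, follows steps 1-6 (the 85% variance threshold is the one mentioned in the steps):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))            # n = 100 samples, m = 4 features

# 1. Standardize each feature.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix (features x features).
cov = np.cov(X_std, rowvar=False)

# 3. Eigenvectors and eigenvalues of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(cov)

# 4. Sort eigenvalues (and their eigenvectors) in descending order.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 5. Keep the top k components explaining at least 85% of the variance.
explained = np.cumsum(eigvals) / eigvals.sum()
k = np.searchsorted(explained, 0.85) + 1

# 6. Project the data (rows are samples here, so Y = X_std P^T).
Y = X_std @ eigvecs[:, :k]
print("components kept:", k, "transformed shape:", Y.shape)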
Pros:
1. Dimensionality Reduction:
PCA effectively reduces the number of features, which is beneficial for models that suffer from the curse of dimensionality.
2. Feature Independence:
Principal components are orthogonal (uncorrelated), meaning they capture independent information, simplifying the
interpretation of the reduced features.
3. Noise Reduction:
PCA can help reduce noise by focusing on the components that explain the most significant variance in the data.
4. Visualization:
The reduced-dimensional data can be visualized, aiding in understanding the underlying structure and patterns.
Cons:
1. Loss of Interpretability:
Interpretability of the original features may be lost in the transformed space, as principal components are linear
combinations of the original features.
2. Assumption of Linearity:
PCA assumes that the relationships between variables are linear, which may not be true in all cases.
3. Sensitive to Scaling:
PCA is sensitive to the scale of the features, so standardization is often required.
4. Outliers Impact Results:
Outliers can significantly impact the results of PCA, as it focuses on capturing the maximum variance, which may be
influenced by extreme values.
Attachment: Algorithm, Program code, Results and output
Algorithm :
1. Import necessary libraries
2. Load Iris dataset
3. Split the dataset into training and testing sets
4. Standardize the data
5. Apply PCA
6. Calculate the cumulative explained variance
7. Determine the number of components to keep for 85% variance explained
8. Apply PCA with the selected number of components
9. Display the results
Program Code:
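The printed listing is an image; a minimal sketch of the algorithm above using scikit-learn is shown below. The split ratio and random seed are illustrative.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# 1-3. Load the Iris data and split it into training and test sets.
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# 4. Standardize the data (fit the scaler on the training set only).
scaler = StandardScaler().fit(X_tr)
X_tr_std, X_te_std = scaler.transform(X_tr), scaler.transform(X_te)

# 5-6. Fit PCA on all components and compute the cumulative explained variance.
pca_full = PCA().fit(X_tr_std)
cum_var = np.cumsum(pca_full.explained_variance_ratio_)

# 7. Number of components needed to explain at least 85% of the variance.
n_components = int(np.searchsorted(cum_var, 0.85) + 1)

# 8-9. Apply PCA with the selected number of components and show the results.
pca = PCA(n_components=n_components).fit(X_tr_std)
X_tr_pca = pca.transform(X_tr_std)
print("cumulative explained variance:", cum_var)
print("components kept:", n_components, "reduced shape:", X_tr_pca.shape)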
Conclusion:
In this practical we standardized the Iris data, applied PCA, computed the cumulative explained variance, and selected the number of principal components needed to explain 85% of the variance, thereby reducing the dimensionality of the dataset.
Practical 08
Aim: Implement hidden Markov Model for sequence prediction and evaluate model performance
Objectives:
1. Choose a dataset suitable for sequence prediction.
2. Implement a Hidden Markov Model.
3. Train the model and predict sequences.
4. Evaluate the model's performance.
Software/Tool: Python/Jupyter/Anaconda/Colab
Theory:
The hidden Markov Model (HMM) is a statistical model that is used to describe the probabilistic relationship between a
sequence of observations and a sequence of hidden states. It is often used in situations where the underlying system or
process that generates the observations is unknown or hidden, hence it has the name “Hidden Markov Model.”
It is used to predict future observations or classify sequences, based on the underlying hidden process that generates the
data.
An HMM consists of two types of variables: hidden states and observations.
● The hidden states are the underlying variables that generate the observed data, but they are not directly observable.
● The observations are the variables that are measured and observed.
The relationship between the hidden states and the observations is modeled using a probability distribution. The Hidden Markov Model (HMM) describes this relationship between the hidden states and the observations using two sets of probabilities: the transition probabilities and the emission probabilities.
● The transition probabilities describe the probability of transitioning from one hidden state to another.
● The emission probabilities describe the probability of observing an output given a hidden state.
● The state space is the set of all possible hidden states, and the observation space is the set of all possible observations.
Step 2: Define the initial state distribution
● These are the probabilities of starting in each hidden state.
Step 3: Define the state transition probabilities
● These are the probabilities of transitioning from one state to another. They form the transition matrix, which describes the probability of moving from one state to another.
Step 4: Define the observation likelihoods:
● These are the probabilities of generating each observation from each state. This forms the emission matrix, which
describes the probability of generating each observation from each state.
Step 5: Train the model
● The parameters of the state transition probabilities and the observation likelihoods are estimated using the Baum-Welch algorithm, or the forward-backward algorithm. This is done by iteratively updating the parameters until convergence.
Step 6: Decode the most likely sequence of hidden states
● Given the observed data, the Viterbi algorithm is used to compute the most likely sequence of hidden states. This can
be used to predict future observations, classify sequences, or detect patterns in sequential data.
Step 7: Evaluate the model
● The performance of the HMM can be evaluated using various metrics, such as accuracy, precision, recall, or F1 score.
To summarize, the HMM algorithm involves defining the state space, observation space, and the parameters of the state transition probabilities and observation likelihoods, training the model using the Baum-Welch algorithm or the forward-backward algorithm, decoding the most likely sequence of hidden states using the Viterbi algorithm, and evaluating the performance of the model.
Algorithm :
Program Code :
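The listing is reproduced as an image in the printed manual. The sketch below is a minimal reconstruction consistent with the conclusion (a Gaussian HMM fitted to Apple stock data from Yahoo Finance). It assumes the yfinance and hmmlearn packages; the ticker, date range, and number of hidden states are illustrative, and the "next-day movement" heuristic is a simplification, not the manual's exact method.

import numpy as np
import yfinance as yf
from hmmlearn.hmm import GaussianHMM

# Download historical prices for Apple (assumes the yfinance package).
data = yf.download("AAPL", start="2022-01-01", end="2024-01-01")
returns = data["Close"].pct_change().dropna().to_numpy().reshape(-1, 1)

# Fit a Gaussian HMM with a small number of hidden market "regimes"
# (parameters are estimated with the Baum-Welch / EM algorithm).
model = GaussianHMM(n_components=3, covariance_type="diag",
                    n_iter=100, random_state=0)
model.fit(returns)

# Decode the most likely sequence of hidden states (Viterbi algorithm).
states = model.predict(returns)

# Evaluate the model with the log likelihood of the observed sequence.
print("log likelihood:", model.score(returns))

# Rough next-day movement guess: mean return of the last decoded state.
last_state = states[-1]
expected_return = model.means_[last_state][0]
print("expected next-day return:", expected_return,
      "->", "up" if expected_return > 0 else "down")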
Output:
Conclusion:
In this project, we utilized a Hidden Markov Model (HMM) to construct a stock movement
prediction model. Leveraging historical data sourced from Yahoo Finance for Apple, we aimed to
forecast stock price movements. Our model generated a prediction suggesting a decrease in stock
price for the next day. Additionally, we evaluated the model's performance by calculating the log
likelihood of the observed sequence of data. This metric serves as a measure of how well the model
aligns with the observed data, providing insights into its performance.