
Assignment-5 CSET 335 Deep Learning


School of Computer Science Engineering and Technology

Course- B. Tech                     Type- Elective V

Course Code- CSET-335               Course Name- Deep Learning

Year- 2025                          Semester- Even

Date- 07/04/2025                    Batch- 2024-2025

CO-Mapping

Exp. No.   Name                                   CO1   CO2   CO3
05         Deep RL with Q-learning algorithm

Objectives
CO1: To explain the fundamentals of deep learning and convolutional neural networks.
CO2: To articulate different problems of classification, detection, segmentation, and generation, and to understand existing solutions/deep learning architectures.
CO3: To implement a solution for the given problem and improve it using various methods such as transfer learning and hyperparameter optimization.

Assignment-5 (Wk9 and Wk10)


Goal: Implement deep RL with the Q-learning algorithm for the 'CartPole' game
environment.

To Do: You can use the gym library to simulate the environment. You can start from the
sample code provided; an illustrative configuration sketch is also shown after the
hyperparameter list below.
Hyperparameters:
• Number of episodes = 100
• Discount factor γ = 0.95
• Replay memory size = 2000
• Mini-batch size = 32
• Initial ε = 1 (ε-greedy policy); minimum ε = 0.1
• ε decay = 0.995
• Target network update rate = 10
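
The following is a minimal setup sketch, assuming the classic gym CartPole-v1 API and a deque-based replay buffer; the variable names and the act() helper are illustrative and not part of the provided sample code.

```python
# Environment and hyperparameter setup (a sketch, assuming gym's CartPole-v1 API).
import random
from collections import deque

import gym
import numpy as np

env = gym.make('CartPole-v1')
state_size = env.observation_space.shape[0]   # 4 state dimensions for CartPole
action_size = env.action_space.n              # 2 discrete actions (push left / right)

EPISODES      = 100      # number of training episodes
GAMMA         = 0.95     # discount factor
MEMORY_SIZE   = 2000     # replay memory capacity
BATCH_SIZE    = 32       # mini-batch size sampled from memory
EPSILON       = 1.0      # initial exploration rate (epsilon-greedy policy)
EPSILON_MIN   = 0.1      # lower bound on exploration
EPSILON_DECAY = 0.995    # multiplicative decay applied after each episode
TARGET_UPDATE = 10       # target-network update rate (assumed: sync every 10 episodes)

memory = deque(maxlen=MEMORY_SIZE)            # experience replay buffer

def act(q_network, state, epsilon):
    """Epsilon-greedy action selection over the Q-network's predictions."""
    if np.random.rand() <= epsilon:
        return random.randrange(action_size)
    q_values = q_network.predict(state[np.newaxis, :], verbose=0)
    return int(np.argmax(q_values[0]))
```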

Neural Network Architecture:


Fully connected dense NN (for both the Q-network and the target network); a Keras sketch follows this list.
• Input layer : State dimensions
• Hidden Layer 1 : 24 nodes, Activation: ReLU
• Hidden Layer 2 : 24 nodes, Activation: ReLU

• Output Layer : action size, Activation : Linear


• Loss: MSE
• Optimizer : Adam, Learning rate : 0.001
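
The architecture above maps directly onto a small Keras model. The sketch below assumes tf.keras; the function name build_q_network is illustrative, and the same builder is reused for both the Q-network and the target network.

```python
# Q-network / target network builder (a sketch, assuming tf.keras).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

def build_q_network(state_size, action_size):
    model = Sequential([
        Dense(24, activation='relu', input_shape=(state_size,)),  # hidden layer 1
        Dense(24, activation='relu'),                             # hidden layer 2
        Dense(action_size, activation='linear'),                  # one Q-value per action
    ])
    model.compile(loss='mse', optimizer=Adam(learning_rate=0.001))
    return model

q_network = build_q_network(4, 2)       # CartPole: 4 state dimensions, 2 actions
target_network = build_q_network(4, 2)
target_network.set_weights(q_network.get_weights())  # start from identical weights
```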

Complete the sections of the code wherever indicated. Observe the score as training
progresses. Submit the following explanation in a text file along with your code.

1. What do you observe as epsilon decreases? Why?
2. How many episodes does it take for ε to reach its minimum value? Repeat
this for minimum ε ∈ {0.1, 0.01}.
3. Report any other observations you make by varying the parameters.

Submission: You need to submit a zip file “PA 5 yourfullname.zip” containing


• Your code renamed as “PA 5 name.ipynb”.
• A text file “PA 5 name.txt” reporting all the results.

softmax in the output layer.

• Compile the model using Adam optimizer and categorical_crossentropy loss function.
• Fit the model for 200 epochs and batch size 512.
• Evaluate the model and print achieved loss and accuracy.
• Visualize the plot between loss and epoch for training and test data.
• Implement the above code with no regularization, then with L1 regularization, L2 regularization,
and Dropout; a sketch follows the note below. To implement L1 and L2 regularization, change the above
point and use the kernel_regularizer argument on all input and hidden layers. To implement Dropout,
change the above point and use the Dropout() layer (in Keras) to drop 20% of nodes on all hidden layers.

Note: For all regularization techniques, visualize the plot between loss and epoch for training
and test data.
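
The sketch below illustrates the four variants (no regularization, L1, L2, Dropout), assuming tf.keras. Since the earlier bullets specifying the dataset are not part of this excerpt, the input dimension, hidden-layer width, number of classes, and the 0.001 penalty strength are placeholder assumptions.

```python
# Regularization variants (a sketch; input_dim, layer width, num_classes and
# the 0.001 penalty strength are assumptions, not values from the assignment).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import regularizers

def build_model(variant='none', input_dim=784, num_classes=10):
    reg = None
    if variant == 'l1':
        reg = regularizers.l1(0.001)      # L1 penalty on layer weights
    elif variant == 'l2':
        reg = regularizers.l2(0.001)      # L2 penalty on layer weights

    model = Sequential()
    model.add(Dense(128, activation='relu', input_shape=(input_dim,),
                    kernel_regularizer=reg))
    if variant == 'dropout':
        model.add(Dropout(0.2))           # drop 20% of nodes after hidden layer 1
    model.add(Dense(128, activation='relu', kernel_regularizer=reg))
    if variant == 'dropout':
        model.add(Dropout(0.2))           # drop 20% of nodes after hidden layer 2
    model.add(Dense(num_classes, activation='softmax'))   # softmax output layer

    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Example usage (x_train, y_train, x_test, y_test come from the chosen dataset):
# history = build_model('l2').fit(x_train, y_train, epochs=200, batch_size=512,
#                                 validation_data=(x_test, y_test))
```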

Scenario: Consider any neural network containing a single weight w_0 and bias b_0. Initialize them to
some random values using a function.
To Do:

1. Initialize the iteration number t to 1.


2. For the given quadratic loss function f(m) = m^2 − 2m + 1, write the code to implement the Momentum,
RMSProp and Adam optimizers from scratch to update the weight until convergence; a from-scratch sketch
follows this list.
Note: In each optimizer, also apply bias correction after computing the accumulated gradient terms.

3. The criterion for convergence should be w_(t−1) = w_t (that is, the weight at iteration t−1 equals
the weight at iteration t).

4. At each iteration print the updated weight value and loss value.
5. Plot the visualization showing epoch vs loss for each optimizer: Momentum, RMSProp, and Adam
using Matplotlib or Seaborn library function.

6. Finally, compare how many iterations it takes to converge in each optimizer and conclude with the
best optimizer (based on loss value) and the number of iterations required by that optimizer.
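
The sketch below shows one way to implement the three update rules from scratch for f(m) = m^2 − 2m + 1 (gradient 2m − 2). The learning rate, decay terms, starting point, and iteration cap are illustrative assumptions, and bias correction is applied in each optimizer as the note in step 2 requires.

```python
# Momentum, RMSProp and Adam from scratch for f(m) = m**2 - 2*m + 1
# (a sketch; lr, beta1, beta2, the start value and max_iter are assumptions).
import numpy as np

def grad(m):
    return 2 * m - 2                       # derivative of f(m) = m^2 - 2m + 1

def loss(m):
    return m ** 2 - 2 * m + 1

def optimize(rule, w=5.0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, max_iter=1000):
    v = s = 0.0
    losses = []
    for t in range(1, max_iter + 1):
        g = grad(w)
        if rule == 'momentum':
            v = beta1 * v + (1 - beta1) * g
            step = lr * v / (1 - beta1 ** t)                       # bias-corrected momentum
        elif rule == 'rmsprop':
            s = beta2 * s + (1 - beta2) * g ** 2
            step = lr * g / (np.sqrt(s / (1 - beta2 ** t)) + eps)  # bias-corrected RMSProp
        else:  # 'adam'
            v = beta1 * v + (1 - beta1) * g
            s = beta2 * s + (1 - beta2) * g ** 2
            step = lr * (v / (1 - beta1 ** t)) / (np.sqrt(s / (1 - beta2 ** t)) + eps)
        w_new = w - step
        losses.append(loss(w_new))
        print(f"{rule} iteration {t}: w = {w_new:.6f}, loss = {losses[-1]:.8f}")
        if w_new == w:                     # convergence criterion: w_(t-1) == w_t
            return t, losses
        w = w_new
    return max_iter, losses                # safeguard if exact equality is never reached

# Example usage: compare iterations to convergence for each optimizer.
# results = {rule: optimize(rule) for rule in ('momentum', 'rmsprop', 'adam')}
```

The returned losses list can be plotted against the iteration index with Matplotlib to produce the loss-vs-iteration comparison asked for in step 5. Note that with the adaptive optimizers the exact-equality criterion may never be met at a fixed learning rate, which is why the sketch includes an iteration cap as a safeguard.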
