ML Assignment
1. Bellman equation
2. Linear quadratic regulation
3. Q Learning
4. DNN
5. CNN
The Bellman equation, named after Richard E. Bellman, is a fundamental concept in dynamic
programming. It's a recursive equation that helps us make optimal decisions in situations
where we need to consider both immediate rewards and future consequences.
Imagine an agent navigating an environment, such as a maze. The Bellman equation tells the
agent that the value of being in a current state (s) equals the immediate reward (R) received
for taking a specific action (a) in that state, plus the discounted value (γ · V(s')) of the next
state (s') that results from taking that action. Taking the best available action gives the
optimal value: V(s) = max over actions a of [ R(s, a) + γ · V(s') ].
The key idea is that the optimal decision considers both the immediate reward of an action
and the long-term value of the resulting state. The Bellman equation helps us iteratively
evaluate the value of each state, allowing the agent to find the sequence of actions that leads
to the maximum long-term reward.
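To make the recursion concrete, here is a minimal value-iteration sketch in Python. The three-state chain, its rewards, and the discount factor are made-up assumptions purely for illustration; the point is the repeated application of the Bellman update V(s) = max_a [ R(s, a) + γ · V(s') ].

```python
# Minimal value-iteration sketch for the Bellman equation (illustrative only).
# The states, actions, rewards and transitions below are made up for this example.

GAMMA = 0.9  # discount factor

# next_state[s][a] -> s', reward[s][a] -> R for a tiny 3-state chain
next_state = {0: {"stay": 0, "go": 1},
              1: {"stay": 1, "go": 2},
              2: {"stay": 2, "go": 2}}
reward = {0: {"stay": 0.0, "go": 1.0},
          1: {"stay": 0.0, "go": 5.0},
          2: {"stay": 0.0, "go": 0.0}}

V = {s: 0.0 for s in next_state}  # initial value estimates

for _ in range(100):  # repeatedly apply the Bellman update until the values settle
    for s in next_state:
        # V(s) = max_a [ R(s, a) + gamma * V(s') ]
        V[s] = max(reward[s][a] + GAMMA * V[next_state[s][a]]
                   for a in next_state[s])

print(V)
```

Running the loop, the value of the earlier states rises because they inherit, discounted by γ, the reward reachable from later states.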
Linear Quadratic Regulation (LQR) is a powerful technique in control theory for finding optimal
control strategies for linear systems. It achieves this by minimizing a quadratic cost function that
penalizes both deviations of the system's state from a desired equilibrium and the effort required to
control the system.
The System: The system is described by linear differential equations in state-space form,
representing the relationship between the system's state, control inputs, and their evolution
over time.
The Cost Function: The quadratic cost penalizes two things:
o Deviations from the desired state: a positive semi-definite matrix (Q) weights the
importance of keeping each state variable close to its desired value.
o Control effort: a positive definite matrix (R) weights the importance of minimizing
control inputs (e.g., minimizing energy consumption or actuator wear).
Finding the Optimal Control: LQR solves an optimization problem to find a state-feedback
controller. This controller uses all the system's state variables (full state feedback) to
compute the control input that minimizes the cost function over time.
Guaranteed Stability: If the system is controllable and observable, the LQR controller
guarantees closed-loop stability.
Tuning Flexibility: The weighting matrices (Q and R) allow you to tailor the controller's
behaviour by prioritizing specific state variables or control efforts.
LQR does have limitations:
Linearity Assumption: It only applies to linear systems, which may not always be realistic.
Full State Feedback: It requires access to all state variables, which may not be feasible in
practice.
Despite these limitations, LQR remains a valuable tool for control engineers due to its effectiveness
and ease of implementation for linear systems.
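As a sketch of how this looks in practice (assuming NumPy and SciPy are available), the snippet below solves the continuous-time algebraic Riccati equation for a toy double-integrator system and forms the state-feedback gain K. The model and the Q, R weights are arbitrary illustrative choices, not part of the text above.

```python
# Continuous-time LQR sketch: u = -K x minimizes the quadratic cost.
# The double-integrator model and the Q, R weights are illustrative assumptions.
import numpy as np
from scipy.linalg import solve_continuous_are

# State-space model x_dot = A x + B u (double integrator: position, velocity)
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])

# Weighting matrices: Q penalizes state deviation, R penalizes control effort
Q = np.diag([1.0, 0.1])   # positive semi-definite
R = np.array([[0.5]])     # positive definite

# Solve the algebraic Riccati equation for P, then form the optimal
# state-feedback gain K = R^-1 B^T P
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

print("LQR gain K:", K)
# The closed-loop matrix A - B K should have eigenvalues in the left half-plane
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```

Increasing the entries of Q makes the controller fight state deviations harder, while increasing R makes it more reluctant to spend control effort.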
Q-Values: At the core of Q-learning is the concept of Q-values. A Q-value represents the
expected future reward an agent can get by taking a specific action (a) in a particular state
(s). The agent maintains a Q-table (or Q-function) that stores these Q-values for all possible
state-action pairs.
Exploration vs. Exploitation: The agent balances exploration (trying new actions) and
exploitation (taking the currently believed best action). This is often achieved through an
epsilon-greedy policy. With a certain probability (epsilon), the agent explores by trying a
random action, and with probability (1-epsilon), it exploits by taking the action with the
highest Q-value in the current state.
Bellman Equation: Q-learning updates the Q-values based on the Bellman equation. This
equation considers the immediate reward received after taking an action, along with the
discounted future reward expected from the resulting state.
Through this iterative process of exploration, reward collection, and Q-value updates, the agent
gradually learns which actions to take in different states to achieve the maximum cumulative reward
over time.
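The following tabular sketch in Python ties these pieces together: an epsilon-greedy action choice, an interaction with the environment, and the Bellman-based Q-value update. The toy chain environment, learning rate, and episode count are assumptions chosen only for illustration.

```python
# Tabular Q-learning sketch with an epsilon-greedy policy (illustrative toy chain).
import random

N_STATES, ACTIONS = 5, [0, 1]        # actions: 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def step(s, a):
    """Toy environment: reaching the right end (state N_STATES - 1) gives reward 1."""
    s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    done = s_next == N_STATES - 1
    return s_next, reward, done

Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]  # the Q-table

for _ in range(500):                 # episodes
    s, done = 0, False
    while not done:
        # Epsilon-greedy: explore with probability EPSILON, otherwise exploit
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[s][act])
        s_next, r, done = step(s, a)
        # Q-learning (off-policy) update derived from the Bellman equation:
        # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s_next]) - Q[s][a])
        s = s_next

print(Q)  # Q-values should come to favour action 1 (move right) in every state
```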
Q-learning has several strengths:
Model-Free: It doesn't require a detailed model of the environment, only the ability to
interact with it and receive rewards.
Off-Policy Learning: It can learn from experience even if the data comes from a different
policy than the one it's currently following.
Versatility: Q-learning can be applied to various scenarios where an agent interacts with an
environment to learn optimal behaviour.
However, there are also challenges to consider:
Convergence: Learning can be slow, and convergence to the optimal policy is only guaranteed
under conditions (sufficient exploration of all state-action pairs and a suitable learning rate)
that can be hard to meet in practice.
Despite these challenges, Q-learning remains a powerful tool for training agents in reinforcement
learning problems.
Deep Neural Networks (DNNs) are a type of artificial neural network inspired by the
structure and function of the human brain. Unlike simpler neural networks, DNNs have
multiple hidden layers between the input and output layers. These hidden layers allow DNNs
to learn complex patterns and relationships in data, making them highly effective for a variety
of tasks.
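As a minimal illustration (assuming PyTorch is installed), the sketch below stacks two hidden layers between an input layer and an output layer; the layer sizes are arbitrary choices.

```python
# Minimal multi-layer perceptron sketch (assumes PyTorch is installed).
# The layer sizes and the choice of two hidden layers are illustrative assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 128),   # input layer -> first hidden layer
    nn.ReLU(),
    nn.Linear(128, 64),    # second hidden layer
    nn.ReLU(),
    nn.Linear(64, 10),     # output layer (e.g., 10 classes)
)

x = torch.randn(32, 784)   # a batch of 32 random "inputs"
logits = model(x)          # forward pass through all layers
print(logits.shape)        # torch.Size([32, 10])
```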
Despite the heavy data and computational demands of training them, DNNs are a powerful tool
at the forefront of artificial intelligence, with ongoing research pushing the boundaries of
their capabilities.
Convolutional Neural Networks (CNNs) are a powerful type of deep learning architecture particularly
adept at image recognition and processing tasks. Their structure, inspired by the human visual cortex,
allows them to excel at finding patterns and relationships within grid-like data like images.
Convolutional Layers: These layers apply filters to extract features from the input image. By
moving these filters across the image, the network can identify edges, shapes, and other visual
elements at various scales.
Pooling Layers: These layers downsample the data, reducing its dimensionality and
computational cost while preserving important features.
Fully Connected Layers: In the final stages, these layers take the extracted features and
classify the image or make predictions based on the learned patterns.
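A minimal sketch of this convolution, pooling, and fully connected pipeline is shown below, assuming PyTorch is installed; the 28x28 single-channel input and the layer sizes are illustrative assumptions.

```python
# Minimal CNN sketch (assumes PyTorch is installed):
# convolution -> pooling -> fully connected classification.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # convolutional layer: extracts local features
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling layer: downsamples 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # fully connected layer: classifies the features
)

x = torch.randn(8, 1, 28, 28)   # a batch of 8 fake grayscale images
print(model(x).shape)           # torch.Size([8, 10])
```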
Advantages of CNNs:
Highly effective for visual tasks: Their architecture is specifically designed to exploit the
spatial relationships within images.
Automatic feature extraction: CNNs can learn features directly from data, eliminating the
need for manual feature engineering.
Transfer learning: Pre-trained CNN models can be fine-tuned for new tasks, leveraging their
learned knowledge as a starting point.
Limitations of CNNs:
Computational Cost: Training large CNNs can be computationally expensive and require
significant data.
Interpretability: Understanding how CNNs arrive at their decisions can be challenging,
limiting their use in some applications.
Overall, CNNs are a cornerstone of deep learning for visual tasks, with ongoing research expanding
their capabilities and applications.