Reinforcement Learning Is An Autonomous

Reinforcement learning is a self-teaching system that learns through trial and error to maximize rewards by performing actions in an environment. It involves an agent that interacts with the environment, receives feedback in the form of rewards or penalties, and updates its policy to improve future actions. Key concepts include positive and negative reinforcement, various algorithms, and practical applications in fields such as robotics, autonomous vehicles, and AI development.

Uploaded by

surya.s2710153
Copyright
© All Rights Reserved

Reinforcement learning

Reinforcement learning is an autonomous, self-teaching approach in which a system learns by trial and error.
The system performs actions with the aim of maximizing rewards; in other words, it learns by doing in order
to achieve the best outcomes.

How Does Reinforcement Learning Work?

1. Start in a state.
2. Take an action.
3. Receive a reward or penalty from the environment.
4. Observe the new state of the environment.
5. Update your policy to maximize future rewards.
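
The five steps above can be sketched as a small interaction loop. This is only an illustrative sketch: the `CorridorEnv` class, its reward values, and the `run_episode` helper are hypothetical names invented for this example, not part of any library.

```python
class CorridorEnv:
    """Hypothetical 1-D corridor: states 0..length-1, goal at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 1 moves right, any other action moves left; walls clamp the move
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else -0.1   # assumed reward scheme: small step penalty
        return self.state, reward, done


def run_episode(env, policy, update, max_steps=100):
    state = env.reset()                               # 1. start in a state
    total = 0.0
    for _ in range(max_steps):
        action = policy(state)                        # 2. take an action
        next_state, reward, done = env.step(action)   # 3. reward, 4. new state
        update(state, action, reward, next_state)     # 5. update the policy
        total += reward
        state = next_state
        if done:
            break
    return total


# Usage: a fixed "always move right" policy and an update that only records experience.
transitions = []
total_reward = run_episode(
    CorridorEnv(),
    policy=lambda state: 1,
    update=lambda s, a, r, s2: transitions.append((s, a, r, s2)),
)
print(len(transitions), round(total_reward, 2))
```

Here the `update` callback only stores transitions; a learning agent would use the same hook to adjust its policy after every step.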

Picture a dog and its master.

Imagine you are training your dog to fetch a stick. Each time the dog fetches the stick successfully, you
offer it a treat (a bone, say). Eventually, the dog learns the pattern: whenever the master throws a stick,
it should fetch the stick as quickly as it can, because doing so earns the reward (a bone) from the master
sooner.

Terminology used in Reinforcement Learning

Agent – the sole decision-maker and learner
Environment – the world, physical or simulated, in which the agent learns and decides which actions to perform
Action – one of the set of moves the agent can perform
State – the current situation of the agent in the environment
Reward – for each action the agent selects, the environment returns a reward. It is usually a scalar value
and is simply feedback from the environment
Policy – the strategy (decision-making rule) the agent uses to map situations to actions
Value Function – the value of a state is the cumulative reward the agent expects to collect starting from
that state and following its policy thereafter
Model – not every RL agent uses a model of its environment. When one is used, the model maps state-action
pairs to probability distributions over the next states

Reinforcement Learning Workflow


– Create the environment
– Define the reward
– Create the agent
– Train and validate the agent
– Deploy the policy

Characteristics of Reinforcement Learning


– There is no supervisor; the only feedback is a real-valued reward signal
– Decision making is sequential
– Time plays a major role in reinforcement problems
– Feedback is not immediate but delayed
– The agent’s actions determine the data it receives next
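
One common way to handle delayed feedback is to credit each step with the discounted sum of the rewards that follow it. The sketch below assumes a discount factor `gamma`, which the notes above do not mention; the function name `discounted_returns` is made up for illustration.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return G_t = r_t + gamma * G_{t+1} for every time step t of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# A reward that arrives only at the last step still gives credit to earlier steps.
episode_rewards = [0.0, 0.0, 0.0, 1.0]
print(discounted_returns(episode_rewards, gamma=0.5))  # [0.125, 0.25, 0.5, 1.0]
```

The earlier a step is, the less of the final reward it receives, which is how sequential decisions far from the reward still get a learning signal.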

Reinforcement Learning Algorithms

There are three approaches to implementing reinforcement learning algorithms:

Fig: Reinforcement Learning Algorithms


Value-Based – The main goal of this method is to maximize a value function. Here, the agent estimates the
long-term return it can expect from the current state under its policy. E.g., a robot learning to navigate a maze.

Policy-Based – In policy-based methods, the agent learns the strategy directly: for each state, which of the
possible actions leads to the maximum reward in the future. The two types of policy-based methods are
deterministic and stochastic. E.g., training a self-driving car to navigate traffic.
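
The difference between a deterministic and a stochastic policy can be sketched as follows. This assumes the agent holds one numeric preference score per action in the current state; the scores, the softmax rule, and the function names are illustrative choices, not something the notes prescribe.

```python
import math
import random


def deterministic_policy(preferences):
    """Always pick the action with the highest preference score."""
    return max(range(len(preferences)), key=lambda a: preferences[a])


def softmax(preferences):
    """Turn preference scores into action probabilities that sum to 1."""
    top = max(preferences)                       # subtract the max for numerical stability
    exps = [math.exp(p - top) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_policy(preferences, rng=random):
    """Sample an action in proportion to its softmax probability."""
    probs = softmax(preferences)
    return rng.choices(range(len(preferences)), weights=probs, k=1)[0]


# Hypothetical scores for three driving actions: 0 = brake, 1 = keep lane, 2 = change lane.
scores = [1.0, 2.0, 0.5]
print(deterministic_policy(scores))              # 1
print(stochastic_policy(scores, random.Random(0)))
```

The deterministic policy returns the same action for the same state every time, while the stochastic one usually picks the best-scored action but occasionally tries the others.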

Model-Based – In this method, a virtual model of each specific environment is created, and the agent uses
that model to learn how to perform in the environment. E.g., teaching a robot to manipulate objects in the real world.
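
As a rough sketch of how an agent can use a model, the code below plans with value iteration, one standard planning technique that the notes do not name. It assumes the model is given as a dictionary mapping each (state, action) pair to a list of (probability, next state, reward) outcomes; the toy states, probabilities, and `gamma` are invented for illustration.

```python
def value_iteration(states, actions, model, gamma=0.9, tol=1e-6):
    """Plan with a model: model[(s, a)] is a list of (prob, next_state, reward)."""

    def action_value(s, a, values):
        # Expected reward plus discounted value of wherever the model says we land.
        return sum(p * (r + gamma * values[s2]) for p, s2, r in model[(s, a)])

    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            available = [a for a in actions if (s, a) in model]
            if not available:                    # terminal state: value stays 0
                continue
            best = max(action_value(s, a, values) for a in available)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    # Read off the greedy policy from the converged values.
    policy = {}
    for s in states:
        available = [a for a in actions if (s, a) in model]
        if available:
            policy[s] = max(available, key=lambda a: action_value(s, a, values))
    return values, policy


# Toy model: moving forward succeeds 80% of the time; reaching "goal" pays 1.
toy_model = {
    ("start", "forward"): [(0.8, "middle", 0.0), (0.2, "start", 0.0)],
    ("start", "stay"): [(1.0, "start", 0.0)],
    ("middle", "forward"): [(0.8, "goal", 1.0), (0.2, "middle", 0.0)],
    ("middle", "stay"): [(1.0, "middle", 0.0)],
}
values, policy = value_iteration(["start", "middle", "goal"], ["forward", "stay"], toy_model)
print(policy)
```

No interaction with a real environment happens here: the agent works out a policy purely from the model's predicted transitions and rewards.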

Types of Reinforcement Learning

– Positive reinforcement: adding something pleasant to increase the likelihood of a behaviour (e.g., training
a dog to sit on command).
– Negative reinforcement: removing something unpleasant to increase the likelihood of a behaviour (e.g., you
have a headache, and you take pain medication).

1. Positive Reinforcement

Positive reinforcement occurs when an event that follows a specific behaviour increases the strength and
frequency of that behaviour. It has a positive impact on behaviour.
Advantages
– Maximizes the performance of an action
– Sustains change for a longer period

Disadvantage
– Excess reinforcement can lead to an overload of states, which diminishes the results.

2. Negative Reinforcement

Negative reinforcement is the strengthening of a behaviour because a negative condition is stopped or
avoided as a result of it. In other words, the action that removed the unpleasant condition becomes more
likely to be repeated in the future.
Advantages
– Maximizes the behaviour
– Ensures at least a minimum standard of performance
Disadvantage
– Does only enough to meet the minimum required behaviour

Widely used models for reinforcement learning


1. Markov Decision Processes (MDPs)
2. Q-learning
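
A minimal sketch of tabular Q-learning on a tiny MDP is given below. The corridor layout, the reward of 1 at the goal, and the hyperparameters `alpha`, `gamma`, and `epsilon` are assumptions chosen for illustration; the function name `q_learning` is hypothetical as well.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor MDP with the goal at the right end."""
    rng = random.Random(seed)
    actions = (0, 1)                                   # 0 = left, 1 = right
    q = [[0.0, 0.0] for _ in range(n_states)]          # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                best = max(q[state])
                action = rng.choice([a for a in actions if q[state][a] == best])
            # Environment dynamics: deterministic move, clamped at the walls.
            next_state = min(max(state + (1 if action == 1 else -1), 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning update: move Q(s, a) toward r + gamma * max over a' of Q(s', a').
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
greedy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(4)]
print(greedy)  # [1, 1, 1, 1]: move right in every non-goal state
```

The states, actions, transitions, and rewards inside the loop together form the MDP; the table `q` is what Q-learning learns from interacting with it, without ever being given the transition rules explicitly.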

Practical Applications of reinforcement learning


– Robotics for Industrial Automation
– Text summarization engines, dialogue agents (text, speech), and game playing
– Autonomous self-driving cars
– Machine learning and data processing
– Training systems that issue custom instruction and materials according to the requirements
of students
– AI Toolkits, Manufacturing, Automotive, Healthcare, and Bots
– Aircraft Control and Robot Motion Control
– Building artificial intelligence for computer games
