What is Reinforcement Learning?
• Reinforcement Learning (RL) is a feedback-based machine learning technique in which an agent learns to behave in an environment by performing actions and observing their results.
• For each good action, the agent gets positive feedback, and for each bad action, it gets negative feedback or a penalty.
• In Reinforcement Learning, the agent learns automatically from feedback without any labeled data, unlike supervised learning.
• Since there is no labeled data, the agent must learn from its own experience.
• RL addresses a specific type of problem in which decision making is sequential and the goal is long-term, such as game playing and robotics.
• The agent learns by trial and error, and based on that experience, it learns to perform the task better. Hence, we can say that "Reinforcement learning is a type of machine learning method where an intelligent agent (computer program) interacts with the environment and learns to act within that." A robotic dog learning how to move its limbs is an example of Reinforcement Learning.
• It is a core part of artificial intelligence, and many AI agents are built on the concept of reinforcement learning. Here we do not need to pre-program the agent's behavior, as it learns from its own experience rather than from explicit instructions.

Example
• Suppose there is an AI agent in a maze environment, and its goal is to find the diamond. The agent interacts with the environment by performing actions; based on those actions, the agent's state changes, and it receives a reward or penalty as feedback.
• The agent keeps doing these three things (take an action, change state or remain in the same state, and get feedback), and in doing so it learns and explores the environment.

Terms used in Reinforcement Learning
• Agent: An entity that can perceive/explore the environment and act upon it.
• Environment: The situation in which the agent is present or by which it is surrounded.
• In RL, we generally assume a stochastic environment, which means it is random in nature.
• Action: Actions are the moves taken by an agent within the environment.
• State: The situation returned by the environment after each action taken by the agent.
• Reward: Feedback returned to the agent from the environment to evaluate the agent's action.
• Policy: The strategy the agent applies to choose the next action based on the current state.
• Value: The expected long-term return, discounted by the discount factor, as opposed to the short-term reward.
• Q-value: Mostly similar to the value, but it takes one additional parameter, the current action (a).

Key Features of Reinforcement Learning
• In RL, the agent is not told about the environment or which actions to take.
• It is based on trial and error.
• The agent takes the next action and changes state according to the feedback from the previous action.
• The agent may get a delayed reward.
• The environment is stochastic, and the agent needs to explore it to obtain the maximum positive reward.

Approaches to implement Reinforcement Learning
• There are mainly three ways to implement reinforcement learning in ML:
Value-based:
• The value-based approach aims to find the optimal value function, which is the maximum value at a state under any policy. The agent therefore expects the long-term return at any state (s) under policy π.
Policy-based:
• The policy-based approach aims to find the optimal policy for the maximum future rewards without using the value function. In this approach, the agent tries to apply a policy such that the action performed at each step helps to maximize the future reward.
• The policy-based approach has mainly two types of policy:
• Deterministic: For a given state, the policy (π) always produces the same action.
• Stochastic: The produced action is determined by a probability distribution over actions.
Model-based:
• In the model-based approach, a virtual model of the environment is created, and the agent explores that model to learn the environment. There is no single solution or algorithm for this approach because the model representation differs for each environment.

Elements of Reinforcement Learning
• There are four main elements of Reinforcement Learning:
1. Policy
2. Reward Signal
3. Value Function
4. Model of the environment

How does Reinforcement Learning Work?
• To understand how RL works, we need to consider two main things:
• Environment: It can be anything, such as a room, a maze, or a football ground.
• Agent: An intelligent agent, such as an AI robot.
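
The interplay between these two pieces can be sketched in a few lines of Python. This is only an illustrative sketch, not a method from the slides: the class names (`CorridorEnvironment`, `RandomAgent`), the one-dimensional corridor standing in for the maze, the reward values (+10 for the diamond, -1 per step), and the discount factor of 0.9 are all assumptions chosen for the example. The agent follows a purely random (stochastic) policy and does not learn yet; the point is only to show the repeated cycle of taking an action, receiving a new state, and getting feedback, plus how a discounted long-term return could be computed from the rewards.

```python
import random


class CorridorEnvironment:
    """Hypothetical 1-D 'maze': cells 0..length-1, diamond in the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        """Put the agent back at the start and return the initial state."""
        self.state = 0
        return self.state

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (state, reward, done)."""
        # Walls: moving off either end leaves the agent in the same state.
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        # Assumed feedback: +10 for reaching the diamond, -1 penalty per other step.
        reward = 10.0 if done else -1.0
        return self.state, reward, done


class RandomAgent:
    """A stochastic policy that ignores the state and picks a move uniformly."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1, 1])


def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**t * r_t over one episode (gamma is the discount factor)."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def run_episode(env, agent, max_steps=200):
    """Run the action -> state -> feedback loop until the diamond or the step cap."""
    state = env.reset()
    rewards = []
    for _ in range(max_steps):
        action = agent.act(state)               # agent takes an action
        state, reward, done = env.step(action)  # environment returns state + feedback
        rewards.append(reward)
        if done:
            break
    return state, rewards


if __name__ == "__main__":
    final_state, rewards = run_episode(CorridorEnvironment(), RandomAgent(seed=0))
    print(final_state, len(rewards), discounted_return(rewards))
```

A learning agent would replace `RandomAgent.act` with a policy that is updated from the collected rewards; that step is deliberately left out of this sketch.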