Reinforcement Learning Is An Autonomous
Reinforcement learning is an autonomous, self-teaching approach in which an agent learns by trial and error. The
agent performs actions with the aim of maximizing rewards; in other words, it learns by doing in order to
achieve the best outcomes. A single learning cycle proceeds as follows:
1. Start in a state.
2. Take an action.
3. Receive a reward or penalty from the environment.
4. Observe the new state of the environment.
5. Update your policy to maximize future rewards.
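The five steps above can be sketched in code. The example below is a minimal illustration, not a definitive implementation: it assumes a made-up five-cell corridor environment and uses tabular Q-learning as one possible update rule. The names N_STATES, step, and train, and the values chosen for ALPHA, GAMMA, and EPSILON, are assumptions made for this sketch.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = [-1, +1]    # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # assumed learning settings

def step(state, action):
    """Toy environment: returns (new_state, reward, done)."""
    new_state = min(max(state + action, 0), N_STATES - 1)
    done = new_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # reward at the goal, small penalty otherwise
    return new_state, reward, done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # one value per (state, action)
    for _ in range(episodes):
        state, done = 0, False                  # 1. start in a state
        while not done:
            if rng.random() < EPSILON:          # 2. take an action (explore...)
                a = rng.randrange(len(ACTIONS))
            else:                               #    ...or exploit current estimates
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            # 3. receive a reward or penalty, 4. observe the new state
            new_state, reward, done = step(state, ACTIONS[a])
            # 5. update the estimates that define the policy
            target = reward + (0.0 if done else GAMMA * max(q[new_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = new_state
    return q

q_table = train()
# Greedy action for each state; the entry for the terminal state is never used.
greedy_policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])]
                 for row in q_table]
```

After training, the greedy action in every non-goal state should be to move right, toward the reward.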
Picture a dog and its master. Let’s imagine you are training your dog to fetch a stick. Each time the
dog fetches the stick successfully, you offer it a treat (a bone, let’s say). Eventually, the dog learns
the pattern: whenever the master throws a stick, it should fetch it as quickly as it can to earn the reward (a
bone) sooner.
Policy-Based – In policy-based methods, you learn a strategy (a policy) that maximizes future
rewards through the actions chosen in each state. The two types of policy are deterministic and
stochastic. E.g., training a self-driving car to navigate traffic.
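The difference between a deterministic and a stochastic policy can be shown with a small sketch. The action names and preference scores below are invented for illustration, and softmax is just one common way to turn scores into probabilities; none of this comes from a specific self-driving system.

```python
import math
import random

# Hypothetical preference scores for the actions available in one driving state.
preferences = {"brake": 2.0, "keep_lane": 1.0, "change_lane": 0.5}

def deterministic_policy(prefs):
    """Always return the single highest-scoring action."""
    return max(prefs, key=prefs.get)

def softmax(prefs):
    """Convert scores into a probability for each action."""
    exps = {a: math.exp(v) for a, v in prefs.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}

def stochastic_policy(prefs, rng):
    """Sample an action; higher-scoring actions are chosen more often."""
    probs = softmax(prefs)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

rng = random.Random(0)
fixed_choice = deterministic_policy(preferences)
sampled_choices = [stochastic_policy(preferences, rng) for _ in range(5)]
```

The deterministic policy returns the same action every time for a given state, while the stochastic one can return different actions on repeated calls.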
Model-Based – In this method, we build a virtual model of the environment that the agent uses to learn how to
perform in that specific environment. E.g., teaching a robot to manipulate objects in the real world.
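As a rough sketch of the model-based idea, the example below first builds a model from logged experience and then plans on that model instead of acting in the real world. The states, actions, and rewards are hypothetical. The model is assumed to be deterministic, and value iteration is used as one possible planning method.

```python
from collections import defaultdict

# Hypothetical logged experience: (state, action, reward, next_state).
experience = [
    ("far", "reach", 0.0, "near"),
    ("near", "grasp", 1.0, "holding"),
    ("near", "reach", 0.0, "near"),
    ("far", "grasp", -1.0, "far"),
]

def learn_model(transitions):
    """Build a deterministic model: (state, action) -> (reward, next_state)."""
    return {(s, a): (r, s2) for s, a, r, s2 in transitions}

def plan(model, gamma=0.9, sweeps=50):
    """Run value iteration on the learned model and return (values, policy)."""
    values = defaultdict(float)      # states never acted from keep value 0
    actions = defaultdict(list)
    for s, a in model:
        actions[s].append(a)

    def q(s, a):
        r, s2 = model[(s, a)]
        return r + gamma * values[s2]

    for _ in range(sweeps):
        for s in actions:
            values[s] = max(q(s, a) for a in actions[s])
    policy = {s: max(actions[s], key=lambda a: q(s, a)) for s in actions}
    return dict(values), policy

model = learn_model(experience)
values, policy = plan(model)
```

With this toy data, planning on the learned model selects "reach" when the object is far and "grasp" when it is near.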
1. Positive Reinforcement
Positive reinforcement occurs when an event that follows a specific behaviour increases the
strength and frequency of that behaviour. It has a positive impact on behaviour.
Advantages
– Maximizes the performance of an action
– Sustains change for a longer period
Disadvantage
– Too much reinforcement can lead to an overload of states, which can diminish the results.
2. Negative Reinforcement