Lecture 1: Introduction and Course Overview

This document provides an overview of deep reinforcement learning and why it is an important area of machine learning. It discusses:
• How reinforcement learning differs from supervised learning by learning from experience rather than labeled examples. The goal is to maximize rewards rather than predict labels.
• Examples of deep reinforcement learning being applied to complex tasks like playing Atari games and controlling robots. End-to-end learning allows jointly training perception and control.
• Why deep reinforcement learning is an important approach for building intelligent machines that can learn decision making and control through experience rather than being explicitly programmed. It provides a formalism for intelligent behavior.



Deep Reinforcement Learning, Decision Making, and Control
CS 285
Instructor: Sergey Levine
UC Berkeley

Option 1: understand the problem, design a solution
Option 2: set it up as a machine learning problem

(diagram: data → supervised learning; data → reinforcement learning)
What is reinforcement learning?
• Mathematical formalism for learning-based decision making
• Approach for learning decision making and control from experience
How is this different from other machine learning topics?

Standard (supervised) machine learning usually assumes:
• i.i.d. data
• known ground truth outputs in training

Reinforcement learning:
• Data is not i.i.d.: previous outputs influence future inputs!
• Ground truth answer is not known; we only know if we succeeded or failed
  • more generally, we know the reward
(diagram: agent sends decisions (actions) to the environment; consequences come back as observations (states) and rewards)

• Animal: Actions: muscle contractions; Observations: sight, smell; Rewards: food
• Robot: Actions: motor current or torque; Observations: camera images; Rewards: task success measure (e.g., running speed)
• Inventory management: Actions: what to purchase; Observations: inventory levels; Rewards: profit
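The loop above (actions go out, observations and rewards come back, and each action changes what the agent sees next) can be sketched as a generic interaction loop. The `GridEnv` environment and the trivial always-go-right policy below are hypothetical stand-ins for illustration, not anything from the lecture:

```python
class GridEnv:
    """Toy 1-D corridor: agent starts at position 0, gets reward 1 at position 4.
    A hypothetical stand-in for a real environment."""
    def __init__(self):
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos  # initial observation

    def step(self, action):
        # action: -1 (left) or +1 (right); note that the agent's own
        # previous outputs determine its future inputs (non-i.i.d. data!)
        self.pos = max(0, min(4, self.pos + action))
        reward = 1.0 if self.pos == 4 else 0.0
        done = self.pos == 4
        return self.pos, reward, done

def policy(obs):
    # trivial illustrative policy: always move right
    return +1

env = GridEnv()
obs = env.reset()
total_reward = 0.0
for t in range(10):
    action = policy(obs)                   # decide (action)
    obs, reward, done = env.step(action)   # consequence: observation + reward
    total_reward += reward
    if done:
        break
```

There is no "ground truth action" anywhere in this loop; the only training signal is the scalar reward.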
Complex physical tasks… (Rajeswaran et al. 2018)

Unexpected solutions… (Mnih et al. 2015)

In the real world… (Kalashnikov et al. '18)

Not just games and robots! (Cathy Wu)
Why should we care about deep reinforcement learning?

How do we build intelligent machines?
• Intelligent machines must be able to adapt
• Deep learning helps us handle unstructured environments
• Reinforcement learning provides a formalism for behavior

(diagram: decisions (actions) → consequences → observations, rewards; Schulman et al. '14 & '15; Mnih et al. '13; Levine*, Finn*, et al. '16)
What is deep RL, and why should we care?

standard computer vision: features (e.g. HOG) → mid-level features (e.g. DPM) → classifier (e.g. SVM)  [Felzenszwalb '08]
deep learning: end-to-end training

standard reinforcement learning: features → more features → linear policy or value func. → action
deep reinforcement learning: end-to-end training → action
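The contrast above can be made concrete with a tiny end-to-end policy network: raw observation in, action out, with every intermediate stage learned rather than hand-designed. The layer sizes, weights, and observation here are illustrative assumptions, not from the course:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer policy: flattened image -> learned features -> action scores.
# In the "standard" pipeline these stages would be hand-designed (HOG, DPM, SVM);
# end-to-end training learns every stage jointly from data.
W1 = rng.normal(scale=0.1, size=(16, 64))   # observation -> learned "perception"
W2 = rng.normal(scale=0.1, size=(64, 3))    # features -> 3 discrete actions

def policy(obs):
    h = np.maximum(0.0, obs @ W1)   # learned feature layer (ReLU)
    logits = h @ W2                 # learned "control" layer
    return int(np.argmax(logits))   # greedy action choice

obs = rng.normal(size=16)           # stand-in for a flattened camera image
action = policy(obs)
```

In real deep RL both weight matrices would be trained from reward, so the features that emerge are exactly the ones the control problem needs.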
What does end-to-end learning mean for sequential decision making?

(diagram: sensorimotor loop — perception → action, e.g. Action: run away)

Example: robotics

standard robotic control pipeline: observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls
(tiny, highly specialized "visual cortex" and "motor cortex" at either end)
Deep models are what allow reinforcement learning algorithms to solve complex problems end to end!

The reinforcement learning problem is the AI problem!

(diagram: decisions (actions) → consequences → observations, rewards, with the earlier animal/robot/inventory examples)
Why should we study this now?
1. Advances in deep learning
2. Advances in reinforcement learning
3. Advances in computational capability

Why should we study this now?
Tesauro, 1995
L.-J. Lin, "Reinforcement learning for robots using neural networks." 1993
Why should we study this now?

Atari games (Q-learning):
V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, et al. "Playing Atari with Deep Reinforcement Learning". (2013).

Atari games (policy gradients):
J. Schulman, S. Levine, P. Moritz, M. I. Jordan, and P. Abbeel. "Trust Region Policy Optimization". (2015).
V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. P. Lillicrap, et al. "Asynchronous methods for deep reinforcement learning". (2016).

Real-world robots (guided policy search):
S. Levine*, C. Finn*, T. Darrell, P. Abbeel. "End-to-end training of deep visuomotor policies". (2015).

Real-world robots (Q-learning):
D. Kalashnikov et al. "QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation". (2018).

Beating Go champions (supervised learning + policy gradients + value functions + Monte Carlo tree search):
D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, et al. "Mastering the game of Go with deep neural networks and tree search". Nature (2016).
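Of the algorithm families cited above, Q-learning has the simplest core: a one-line temporal-difference update toward a bootstrapped target. A tabular sketch follows; the deep variants (DQN, QT-Opt) replace the table with a neural network, and the state/action counts and hyperparameters here are made up for illustration:

```python
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))   # tabular action-value estimates
alpha, gamma = 0.5, 0.9               # learning rate and discount (illustrative)

def q_update(s, a, r, s_next):
    """Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

# one illustrative transition: from state 0, action 1 earned reward 1.0,
# landing in state 1
q_update(0, 1, 1.0, 1)
```

After this single update, Q[0, 1] has moved halfway (alpha = 0.5) from 0 toward the target 1.0.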
What other problems do we need to solve to enable real-world sequential decision making?

Beyond learning from reward
• Basic reinforcement learning deals with maximizing rewards
• This is not the only problem that matters for sequential decision making!
• We will cover more advanced topics:
  • Learning reward functions from example (inverse reinforcement learning)
  • Transferring knowledge between domains (transfer learning, meta-learning)
  • Learning to predict and using prediction to act
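"Maximizing rewards" here usually means maximizing the discounted return G_t = r_t + γ r_{t+1} + γ² r_{t+2} + …. A minimal computation, where the reward sequence and discount factor are made up for illustration:

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of gamma^k * r_{t+k} over a trajectory, accumulated back-to-front."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g   # G_t = r_t + gamma * G_{t+1}
    return g

# a reward that arrives two steps in the future is worth gamma^2 today
G = discounted_return([0.0, 0.0, 1.0], gamma=0.5)
```

The backward recursion G_t = r_t + γ G_{t+1} computes all the per-step returns in one pass, which is why it shows up throughout value-based RL.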
Where do rewards come from?

Are there other forms of supervision?
• Learning from demonstrations
  • Directly copying observed behavior
  • Inferring rewards from observed behavior (inverse reinforcement learning)
• Learning from observing the world
  • Learning to predict
  • Unsupervised learning
• Learning from other tasks
  • Transfer learning
  • Meta-learning: learning to learn
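"Directly copying observed behavior" (behavior cloning) reduces imitation to ordinary supervised learning: fit a policy to demonstrated (observation, action) pairs. A least-squares sketch on synthetic demonstrations; the linear "expert" is an assumption made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic expert demonstrations: actions are a fixed linear
# function of observations (a stand-in for a human demonstrator).
true_W = np.array([[2.0], [-1.0]])
obs = rng.normal(size=(100, 2))   # demonstrated observations
actions = obs @ true_W            # demonstrated expert actions

# Behavior cloning = supervised regression from observations to actions.
W_hat, *_ = np.linalg.lstsq(obs, actions, rcond=None)

def cloned_policy(o):
    """Imitation policy: predict the action the expert would have taken."""
    return o @ W_hat
```

With a deep network in place of the linear map, this is essentially the setup used for end-to-end driving from demonstrations (Bojarski et al. 2016).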
Imitation learning (Bojarski et al. 2016)

More than imitation: inferring intentions (Warneken & Tomasello)

Inverse RL examples (Finn et al. 2016)

Prediction

Prediction for real-world control (Ebert et al. 2017)

Using tools with predictive models (Xie et al. 2019)

Playing games with predictive models (Kaiser et al. 2019)
But sometimes there are issues… (video: predicted vs. real)

How do we build intelligent machines?
• Imagine you have to build an intelligent machine. Where do you start?

Learning as the basis of intelligence
• Some things we can all do (e.g. walking)
• Some things we can only learn (e.g. driving a car)
• We can learn a huge variety of things, including very difficult things
• Therefore our learning mechanism(s) are likely powerful enough to do everything we associate with intelligence
• But it may still be very convenient to "hard-code" a few really important bits

A single algorithm?
• An algorithm for each "module"?
• Or a single flexible algorithm?
Seeing with your tongue
(figure: auditory cortex; BrainPort; Martinez et al.; Roe et al.; adapted from A. Ng)
What must that single algorithm do?
• Interpret rich sensory inputs
• Choose complex actions

Why deep reinforcement learning?
• Deep = can process complex sensory input
  • …and also compute really complex functions
• Reinforcement learning = can choose complex actions
Some evidence in favor of deep learning
Some evidence for reinforcement learning
• Percepts that anticipate reward
become associated with similar
firing patterns as the reward
itself
• Basal ganglia appears to be
related to reward system
• Model-free RL-like adaptation is
often a good fit for experimental
data of animal adaptation
• But not always…
What can deep learning & RL do well now?
• Acquire high degree of proficiency in domains governed by simple, known rules
• Learn simple skills with raw sensory inputs, given enough experience
• Learn from imitating enough human-provided expert behavior
What has proven challenging so far?
• Humans can learn incredibly quickly
  • Deep RL methods are usually slow
• Humans can reuse past knowledge
  • Transfer learning in deep RL is an open problem
• Not clear what the reward function should be
• Not clear what the role of prediction should be
"Instead of trying to produce a program to simulate the adult mind, why not rather try to produce one which simulates the child's? If this were then subjected to an appropriate course of education one would obtain the adult brain."
- Alan Turing

(diagram: general learning algorithm ↔ environment, via observations and actions)