Deep Reinforcement Learning Nanodegree Program Syllabus

This document provides an overview and syllabus for a Deep Reinforcement Learning Nanodegree program. The program is estimated to take 4 months at 10-15 hours per week and includes 4 courses that cover foundations of reinforcement learning, value-based methods using deep Q-learning, policy-based methods, and multi-agent reinforcement learning. The courses include lessons and projects and are taught by a team of instructors with expertise in deep learning, computer science, and engineering. Prerequisites include intermediate Python and machine learning skills.


INDIVIDUAL LEARNERS

SCHOOL OF ARTIFICIAL INTELLIGENCE

Deep Reinforcement Learning Nanodegree Program Syllabus
Overview
The Deep Reinforcement Learning Nanodegree program is designed to enhance students’ existing machine learning and
deep learning skills with the addition of reinforcement learning theory and programming techniques. This program will grow
students’ deep learning and reinforcement learning expertise, give them the skills they need to understand the most recent
advancements in deep reinforcement learning, and build and implement their own algorithms.

Built in collaboration with:

Deep Reinforcement Learning 2


Program information

Estimated Time: 4 months at 10-15 hrs/week*

Skill Level: Advanced

Prerequisites

A well-prepared learner should have:

• Intermediate to advanced Python experience.

• Familiarity with object-oriented programming.

• Ability to read and understand code.

• Understanding of probability and statistics.

• Intermediate knowledge of machine learning techniques.

• Ability to describe backpropagation and knowledge of neural network architectures (like a CNN for image classification).

• Experience with a deep learning framework like TensorFlow, Keras, or PyTorch.

Required Hardware/Software

Learners need access to a computer running a 64-bit operating system with at least 8GB of RAM, along with administrator
account permissions sufficient to install programs including Anaconda with Python 3.6 and supporting packages.

*The length of this program is an estimate of the total hours the average student may take to complete all required
coursework, including lecture and project time. If you spend about 10-15 hours per week working through the program, you
should finish within the time provided. Actual hours may vary.

Course 1

Foundations of Reinforcement Learning


Master the fundamentals of reinforcement learning by writing your own implementations of many classical solution methods.

Lesson 1
Introduction to RL
• A friendly introduction to reinforcement learning.

Lesson 2
The RL Framework: The Problem
• Learn how to define Markov decision processes to solve real-world problems.

Lesson 3
The RL Framework: The Solution
• Learn about policies and value functions.
• Derive the Bellman equations.

Lesson 4
Dynamic Programming
• Write your own implementations of iterative policy evaluation, policy improvement, policy iteration, and value iteration.

Lesson 5
Monte Carlo Methods
• Implement classic Monte Carlo prediction and control methods.
• Learn about greedy and epsilon-greedy policies.
• Explore solutions to the exploration-exploitation dilemma.

Lesson 6
Temporal-Difference Methods
• Learn the difference between the Sarsa, Q-Learning, and Expected Sarsa algorithms.
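
To illustrate how the three temporal-difference algorithms named in Lesson 6 differ, the sketch below computes the one-step target each would use for the same transition. It is a minimal sketch and not course material: the function names, the tabular NumPy representation, and the example numbers are assumptions made for illustration.

```python
import numpy as np


def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy for one state."""
    n_actions = len(q_row)
    probs = np.full(n_actions, epsilon / n_actions)
    probs[np.argmax(q_row)] += 1.0 - epsilon  # ties go to the first best action
    return probs


def td_target(method, reward, q_next, next_action, gamma, epsilon):
    """One-step TD target for a non-terminal transition.

    q_next holds the current action-value estimates for the next state.
    """
    if method == "sarsa":
        # On-policy: bootstrap from the action actually taken next.
        bootstrap = q_next[next_action]
    elif method == "q_learning":
        # Off-policy: bootstrap from the greedy action.
        bootstrap = np.max(q_next)
    elif method == "expected_sarsa":
        # Bootstrap from the expectation under the epsilon-greedy policy.
        bootstrap = np.dot(epsilon_greedy_probs(q_next, epsilon), q_next)
    else:
        raise ValueError(f"unknown method: {method}")
    return reward + gamma * bootstrap


# Illustrative numbers only (hypothetical transition).
q_next = np.array([1.0, 3.0])
targets = {
    name: td_target(name, reward=0.0, q_next=q_next, next_action=0,
                    gamma=1.0, epsilon=0.5)
    for name in ("sarsa", "q_learning", "expected_sarsa")
}
print(targets)  # sarsa: 1.0, q_learning: 3.0, expected_sarsa: 2.5
```

In each case the estimate for the current state-action pair is then moved toward the target by a step size; only the bootstrap term differs between the three algorithms.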

Lesson 7
Solve OpenAI Gym’s Taxi-v2 Task
• Design your own algorithm to solve a classical problem from the research community.

Lesson 8
RL in Continuous Spaces
• Learn how to adapt traditional algorithms to work with continuous spaces.
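
One common way to adapt tabular algorithms to a continuous state space, in the spirit of Lesson 8, is to discretize each dimension into a uniform grid. The following is a minimal sketch, assuming NumPy and hypothetical helper names (`make_grid`, `discretize`). The bounds and bin counts are illustrative only, and the course may cover other techniques.

```python
import numpy as np


def make_grid(low, high, bins):
    """Interior split points for each dimension of a continuous space.

    With b bins in a dimension there are b - 1 interior split points.
    """
    return [np.linspace(lo, hi, b + 1)[1:-1] for lo, hi, b in zip(low, high, bins)]


def discretize(sample, grid):
    """Map a continuous sample to a tuple of integer bin indices.

    Values outside [low, high] fall into the first or last bin.
    """
    return tuple(int(np.digitize(x, splits)) for x, splits in zip(sample, grid))


# Hypothetical 2-D state space with 10 bins per dimension.
grid = make_grid(low=[-1.0, -5.0], high=[1.0, 5.0], bins=[10, 10])
print(discretize([-0.95, 4.9], grid))  # (0, 9)
print(discretize([0.05, -4.5], grid))  # (5, 0)
```

The resulting tuple can index a Q-table of shape `bins + (n_actions,)`, so a tabular method can run unchanged on the discretized states.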

Course 2

Value-Based Methods
Leverage neural networks to train an agent that learns intelligent behaviors from sensory data.

Course Project

Navigation
Leverage neural networks to train an agent to navigate a virtual world and collect as many yellow bananas
as possible while avoiding blue bananas.

Lesson 1
Deep Learning in PyTorch
• Learn how to build and train neural networks and convolutional neural networks in PyTorch.

Lesson 2
Deep Q-Learning
• Extend value-based reinforcement learning methods to complex problems using deep neural networks.
• Learn how to implement a Deep Q-Network (DQN), along with Double-DQN, Dueling-DQN, and Prioritized Replay.

Lesson 3
Deep RL for Robotics
• Learn from experts at NVIDIA how to use value-based methods in real-world robotics.
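
As a rough illustration of how DQN and Double-DQN from Lesson 2 differ, the sketch below computes the learning targets for a small batch of transitions. It uses plain NumPy arrays in place of network outputs. The function names and the numbers are assumptions for illustration, not the course's implementation.

```python
import numpy as np


def dqn_targets(rewards, dones, q_next_target, gamma):
    """DQN target: the target network both selects and evaluates the next action."""
    max_next = q_next_target.max(axis=1)
    return rewards + gamma * (1.0 - dones) * max_next


def double_dqn_targets(rewards, dones, q_next_online, q_next_target, gamma):
    """Double-DQN target: the online network selects, the target network evaluates."""
    best_actions = q_next_online.argmax(axis=1)
    chosen = q_next_target[np.arange(len(best_actions)), best_actions]
    return rewards + gamma * (1.0 - dones) * chosen


# Hypothetical batch of two transitions; the second one ends its episode.
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])
q_next_online = np.array([[2.0, 1.0], [0.5, 0.7]])  # stand-in for online network output
q_next_target = np.array([[1.0, 4.0], [3.0, 2.0]])  # stand-in for target network output
gamma = 0.5

print(dqn_targets(rewards, dones, q_next_target, gamma))  # [3. 0.]
print(double_dqn_targets(rewards, dones, q_next_online, q_next_target, gamma))  # [1.5 0.]
```

Separating action selection from action evaluation is what lets Double-DQN reduce the overestimation that comes from taking a maximum over noisy value estimates.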

Course 3

Policy-Based Methods
Learn the theory behind evolutionary algorithms and policy-gradient methods. Design your own algorithm to train a simulated
robotic arm to reach target locations.

Course Project

Continuous Control
Train a robotic arm to reach target locations. For an extra challenge, train a four-legged virtual creature
to walk.

Lesson 1
Introduction to Policy-Based Methods
• Learn the theory behind evolutionary algorithms, stochastic policy search, and the REINFORCE algorithm.
• Learn how to apply the algorithms to solve a classical control problem.

Lesson 2
Improving Policy Gradient Methods
• Learn about techniques such as Generalized Advantage Estimation (GAE) for lowering the variance of policy gradient methods.
• Explore policy optimization methods such as Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO).

Lesson 3
Actor-Critic Methods
• Study cutting-edge algorithms such as Deep Deterministic Policy Gradients (DDPG).

Lesson 4
Deep RL for Financial Trading
• Learn from experts at NVIDIA how to use actor-critic methods to generate optimal financial trading strategies.
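
To give a feel for Generalized Advantage Estimation (GAE), mentioned in Lesson 2, the sketch below computes advantages for one trajectory segment from rewards and value estimates. It is a minimal NumPy sketch under stated assumptions: the function name and inputs are hypothetical, and the segment is assumed to contain no episode boundary.

```python
import numpy as np


def gae_advantages(rewards, values, last_value, gamma, lam):
    """Generalized Advantage Estimation for a single trajectory segment.

    rewards[t] and values[t] belong to step t; last_value is the value
    estimate of the state reached after the final step (0.0 if terminal).
    """
    values_ext = np.append(values, last_value)
    advantages = np.zeros(len(rewards))
    running = 0.0
    # Work backwards so each step reuses the discounted sum of later TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values_ext[t + 1] - values_ext[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# Illustrative inputs only.
rewards = np.array([1.0, 1.0, 1.0])
values = np.zeros(3)
print(gae_advantages(rewards, values, last_value=0.0, gamma=0.5, lam=0.5))
# [1.3125 1.25   1.    ]
```

Setting `lam` to 0 reduces this to the one-step TD error (lower variance, more bias), while `lam` equal to 1 recovers the discounted return minus the value baseline (higher variance, less bias).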

Course 4

Multi-Agent Reinforcement Learning


Learn how to apply reinforcement learning methods to applications that involve multiple interacting agents. These techniques
are used in a variety of applications such as the coordination of autonomous vehicles.

Course Project

Collaboration & Competition


Train a system of agents to demonstrate collaboration or competition on a complex task.

Lesson 1
Introduction to Multi-Agent RL
• Learn how to define Markov games to specify a reinforcement learning task with multiple agents.
• Explore how to train agents in collaborative and competitive settings.

Lesson 2
Case Study: AlphaZero
• Master the skills behind DeepMind’s AlphaZero.

Meet your instructors.

Alexis Cook
Curriculum Lead

Alexis is an applied mathematician with a master’s in computer science from Brown University and
a master’s in applied mathematics from the University of Michigan. She was formerly a National
Science Foundation Graduate Research Fellow.

Arpan Chakraborty
Computer Scientist

Arpan is a computer scientist with a PhD from North Carolina State University. He teaches at
Georgia Tech (within the Master of Computer Science program), and is a coauthor of the book
Practical Graph Mining with R.

Mat Leonard
Instructor

Mat is a former physicist, research neuroscientist, and data scientist. He completed his PhD
and postdoctoral fellowship at the University of California, Berkeley.

Luis Serrano
Instructor

Luis was formerly a machine learning engineer at Google. He holds a PhD in mathematics
from the University of Michigan and completed a postdoctoral fellowship at the University of Quebec at
Montreal.

Cezanne Camacho
Curriculum Lead

Cezanne is an expert in computer vision with a master’s in electrical engineering from Stanford
University. As a former researcher in genomics and biomedical imaging, she’s applied computer
vision and deep learning to medical diagnostic applications.

Dana Sheahen
Electrical Engineer

Dana is an electrical engineer with a master’s in computer science from Georgia Tech. Her work
experience includes software development for embedded systems in the automotive group at
Motorola, where she was awarded a patent for an onboard operating system.

Chhavi Yadav
Content Developer

Chhavi is a computer science graduate student at New York University where she researches
machine learning algorithms. She is also an electronics engineer and has worked on wireless
systems.

Juan Delgado
Computational Physicist

Juan is a computational physicist with a master’s in astronomy. He is finishing his PhD in biophysics.
He previously worked at NASA developing space instruments and writing software to analyze large
amounts of scientific data using machine learning techniques.

Miguel Morales
Content Developer

Miguel is a software engineer at Lockheed Martin. He earned a master’s in computer science at
Georgia Tech and is an instructional associate for the Reinforcement Learning and Decision Making
course. He’s the author of Grokking Deep Reinforcement Learning.

Udacity’s learning experience

Hands-on Projects

Open-ended, experiential projects are designed to reflect actual workplace challenges. They aren’t
just multiple choice questions or step-by-step guides, but instead require critical thinking.

Quizzes

Auto-graded quizzes strengthen comprehension. Learners can return to lessons at any time during
the course to refresh concepts.

Knowledge

Find answers to your questions with Knowledge, our proprietary wiki. Search questions asked by
other students, connect with technical mentors, and discover how to solve the challenges that
you encounter.

Custom Study Plans

Create a personalized study plan that fits your individual needs. Utilize this plan to keep track of
movement toward your overall goal.

Workspaces

See your code in action. Check the output and quality of your code by running it on interactive
workspaces that are integrated into the platform.

Progress Tracker

Take advantage of milestone reminders to stay on schedule and complete your program.

Our proven approach for building job-ready digital skills.
Experienced Project Reviewers

Verify skills mastery.


• Personalized project feedback and critique includes line-by-line code review from
skilled practitioners with an average turnaround time of 1.1 hours.

• Project review cycle creates a feedback loop with multiple opportunities for
improvement—until the concept is mastered.

• Project reviewers leverage industry best practices and provide pro tips.

Technical Mentor Support

24/7 support unblocks learning.


• Learning accelerates as skilled mentors identify areas of achievement and potential
for growth.

• Unlimited access to mentors means help arrives when it’s needed most.

• An average question response time of 2 hours or less ensures that skills development stays on track.

Personal Career Services

Empower job-readiness.
• Access to a GitHub portfolio review that can give you an edge by highlighting your
strengths and demonstrating your value to employers.*

• Get help optimizing your LinkedIn and establishing your personal brand so your profile
ranks higher in searches by recruiters and hiring managers.

Mentor Network

Highly vetted for effectiveness.


• Mentors must complete a 5-step hiring process to join Udacity’s selective network.

• After passing an objective and situational assessment, mentors must demonstrate
communication and behavioral fit for a mentorship role.

• Mentors work across more than 30 different industries and often complete a Nanodegree
program themselves.

*Applies to select Nanodegree programs only.

Learn more at
www.udacity.com/online-learning-for-individuals →

12.22.22 | V1.0
