dl_01_introduction
Teaching Assistants:
● Zach Eaton-Rosen
● Lewis Moffat
● Michael Jones
● Raza Habib
● Marie Mulville
● Alex Davies (PgM)
● Thomas Gaudelet
Format and Coursework
● Format: Two streams, both streams mandatory
○ Tuesdays: Deep Learning taught by a selection of fantastic guest lecturers from DeepMind
○ Thursdays: Reinforcement Learning taught by Hado Van Hasselt (also DeepMind)
○ Some exceptions, check the timetable at https://timetable.ucl.ac.uk/ and on Moodle (for topics)
● Assessment: 100% through Coursework
○ There are four deep learning and four reinforcement learning assignments
○ Each of the eight assignments will be weighted equally, i.e., each counts for 12.5%
○ Coursework is a mixture of programming assignments and questions
○ Framework for coursework will be Colab, a Jupyter notebook environment that requires no
setup to use and runs entirely in the cloud.
○ Machine Learning algorithms will be implemented in TensorFlow through Colab.
○ You can find more information about the assessment on Moodle.
○ Todo: Set up Google account with address: "[email protected]",
where XXXXXXXX is your (numerical) student number
● Support: Use Moodle forum and Moodle direct messages
TensorFlow - What is it?
Warning: Lots of work and prior knowledge required!
● Last year, many people complained that it was too much work!
● If you do not know how to code in Python, this course may not be right for you!
● A lot of preliminary knowledge required - see quiz!
● Deep Learning lectures are delivered by top researchers in the field and will
stretch towards the current research frontier → brace yourselves!
● Check out the Self-Assessment Quiz on Moodle
DeepMind
Guest Lecturers
Introduction to TensorFlow
● Lecture topics:
○ Introduction to TensorFlow principles
○ Practical walk-through examples in Colab
● Guest Lecturer: Matteo Hessel
○ Joined DeepMind in 2015.
○ Masters in Machine Learning from UCL
○ Master of Engineering from Politecnico di Milano
● Guest Lecturer: Alex Davies
○ Joined DeepMind in 2017.
○ PhD in Machine Learning at Cambridge
○ Worked with a team of international scientists to build the world's first
machine-learned musical.
Neural Nets, Backprop, Automatic Differentiation
● Lecture topics:
○ Neural nets
○ Multi-class classification and softmax loss
○ Modular backprop
○ Automatic differentiation
● Guest Lecturer: Simon Osindero
○ Joined DeepMind in 2016.
○ Undergrad/Masters in Natural Sciences/Physics at University of Cambridge.
○ PhD in Computational Neuroscience from UCL (2004). Supervisor: Peter Dayan.
○ Postdoc at University of Toronto with Geoff Hinton. (Deep belief nets, 2006).
○ Started an A.I. company, LookFlow, in 2009. Sold to Yahoo in 2013.
○ Current research topics: deep learning, RL agent architectures and algorithms,
memory, continual learning.
Convolutional Neural Networks
● Lecture topics:
○ Convolutional networks
○ Large-scale image recognition
○ ImageNet models
● Guest Lecturer: Karen Simonyan
○ Joined DeepMind in 2014
○ DPhil (2013) and Postdoc (2014) at the University of Oxford
with Andrew Zisserman
○ Research topics: deep learning, computer vision
■ VGGNets, two-stream ConvNets, ConvNet visualisation, etc.
■ https://scholar.google.co.uk/citations?user=L7lMQkQAAAAJ
Temporal Hierarchies
[Diagram: an agent interacting with an environment through actions]
Deep Learning
What is intelligence?
Intelligence measures an agent’s ability to achieve
goals in a wide range of environments
Measure of intelligence (Legg & Hutter):

\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi

where \Upsilon(\pi) is the measure of intelligence, 2^{-K(\mu)} is the complexity penalty on environment \mu, and V_\mu^\pi is the value achieved by agent \pi in \mu.
Deep Learning
Supervised Learning
○ Convolutional Networks on MNIST
[LeCun et al.]
[Krizhevsky et al.]
Deep Learning
Supervised Learning
○ Convolutional Networks on Text
[Zhang et al.]
[Collobert et al.]
[Simonyan et al.]
Deep Learning
Supervised Learning
○ End-to-End Training
○ Optimize for the end loss
○ No engineered inputs
○ With enough data, learn a big non-linear function (see the sketch below)
○ Learn good representations of data
■ Sufficiently rich supervised labels are enough to train transferable representations
■ Best feature extractor
■ Karpathy; Razavian et al.; Yosinski et al.; Donahue et al.
○ Large labeled dataset + big/deep neural network + GPUs
○ Ever more sophisticated modules → Differentiable Programming
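Not from the slides: a minimal end-to-end sketch in TensorFlow/Keras (the coursework framework), training a small convnet directly on raw MNIST pixels with no engineered features. The architecture and hyperparameters are illustrative assumptions.

```python
import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0
x_test = x_test[..., None] / 255.0

# A small convolutional network trained end-to-end: the only training
# signal is the final classification loss.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),  # logits for the 10 digit classes
])

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)
model.fit(x_train, y_train, epochs=1, validation_data=(x_test, y_test))
```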
Deep Learning
Supervised Learning
○ Innovation continues
■ Inception
■ Ladder Nets [Rasmus et al.]
■ Residual Connections
■ …
○ Performance is continuously improving
○ Architectures for easier optimization
■ Batchnorm (see the sketch below)
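As a minimal illustration (not from the slides) of how such modules drop in, batch normalisation in TensorFlow/Keras:

```python
import tensorflow as tf

# Batchnorm normalises each layer's activations over the batch, which
# stabilises training and makes deeper architectures easier to optimise.
block = tf.keras.Sequential([
    tf.keras.layers.Dense(128, use_bias=False),  # bias is redundant before batchnorm
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
])
```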
Deep Learning
Unsupervised Learning
○ Unsupervised Learning/Generative Models
■ RBM
■ Auto-encoders (see the sketch below)
■ PCA, ICA, Sparse Coding
[Hinton et al.]
■ VAE
■ NADE - and all variants
■ GANs
○ How to evaluate/rank different algorithms?
○ Quantitative approach or visual quality?
■ How can we trust visual quality if the input domain itself is not interpretable?
○ How can unsupervised learning help a task?
[Larochelle & Murray]
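To make one of the listed models concrete, here is a minimal autoencoder sketch in TensorFlow/Keras (not from the slides; the layer widths and 32-dimensional bottleneck are arbitrary assumptions):

```python
import tensorflow as tf

# The encoder compresses 784-dim inputs to a 32-dim code; the decoder
# reconstructs the input. The training signal is reconstruction error,
# so no labels are needed (unsupervised).
encoder = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
])
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(784, activation='sigmoid'),
])

inputs = tf.keras.Input(shape=(784,))
autoencoder = tf.keras.Model(inputs, decoder(encoder(inputs)))
autoencoder.compile(optimizer='adam', loss='mse')
# autoencoder.fit(x_train, x_train, epochs=5)  # inputs are their own targets
```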
Deep Learning
Sequence Modeling
○ Almost all data are sequences
■ Text (see the sketch below)
■ Video [Hochreiter and Schmidhuber]
■ Audio
■ Image [NADE, PixelRNN]
■ Multi-modal (caption → image, image → caption)
[Vinyals et al.]
[Sutskever et al.]
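A minimal next-token sequence model sketch in TensorFlow/Keras (not from the slides; the vocabulary size and layer widths are assumptions):

```python
import tensorflow as tf

# Embed discrete tokens, run an LSTM over the sequence, and predict the
# next token at every position.
vocab_size = 1000  # assumed vocabulary size (illustrative)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.Dense(vocab_size),  # next-token logits per position
])
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```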
Deep Learning
Human-level control
through deep
reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G.
Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig
Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran,
Daan Wierstra, Shane Legg, Demis Hassabis
Google DeepMind
(Mnih et al. Nature 2015)
ATARI Games
● Designed to be challenging and
interesting for humans
● Provides a good platform for sequential
decision making
● Widely adopted RL benchmark for
evaluating agents (Bellemare’13)
● Many different games emphasize
control, strategy, …
● Provides a rich visual domain
Deep Learning
End-to-End Reinforcement Learning
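For reference, the core objective from the Mnih et al. paper above, in its standard form: regress Q towards a bootstrapped target computed with a periodically frozen target network \theta^-.

```latex
L(\theta) = \mathbb{E}_{(s,a,r,s')}\!\left[ \left( r + \gamma \max_{a'} Q(s', a'; \theta^-) - Q(s, a; \theta) \right)^{2} \right]
```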
DeepMind Lab - Challenging RL Problems in 3D
Google DeepMind
(Silver, Huang, et al. 2016)
#3 most downloaded
academic paper this month
Why is Go hard for computers to play?
Enormous search space: roughly 250 legal moves per position (breadth) over roughly 150 moves per game (depth)
[Diagrams: a value network maps a board position s to a scalar evaluation v(s); a policy network maps a position s to move probabilities p(a|s)]
Reducing depth with value network
Reducing breadth with policy network
Evaluating current AlphaGo against computers
[Chart: Elo ratings of computer Go programs against calibrated human ranks (beginner kyu, amateur dan, professional dan). AlphaGo (v18) rates far above prior programs such as Crazy Stone, Zen, Pachi, Fuego, and GNU Go, an annotated gap of roughly 3 to 4 handicap stones; v13 scored 494/495 against computer opponents. CAUTION: ratings based on self-play results.]
AlphaGo beats Crazy Stone and Zen, and beats KGS amateur humans.
Extra revision material (Supervised Learning)
• Review of concepts from supervised learning
• Generalisation, overfitting, underfitting
• Learning curves
• Stochastic gradient descent
• Linear regression
• Cost function
• Gradients
• Logistic regression
• Cost function
• Gradients
Supervised Learning Problem
Given a set of input/output pairs (the training set), we wish to compute the
functional relationship between the input and the output
• Example Algorithms:
• Linear Regression
• Logistic Regression
• Neural Networks
• Decision Trees
• In this lecture, we will revise linear and logistic regression
Key Questions for the ML Practitioner
[Plot: training error and validation error curves]
Real-World Learning Curves: Overfitting
[Plot: training error keeps falling while validation error turns upward; early stopping halts training where validation error is lowest (see the sketch below)]
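A minimal early-stopping loop (illustrative, not from the slides; `train_one_epoch` and `validation_error` are assumed user-supplied functions):

```python
def fit_with_early_stopping(train_one_epoch, validation_error,
                            max_epochs=100, patience=5):
    best_val, best_epoch = float('inf'), 0
    for epoch in range(max_epochs):
        train_one_epoch()          # one pass over the training set
        val = validation_error()   # error on held-out validation data
        if val < best_val:
            best_val, best_epoch = val, epoch
            # in practice: checkpoint the model parameters here
        elif epoch - best_epoch >= patience:
            break                  # validation error stopped improving
    return best_epoch, best_val
```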
Real-World Learning Curves: Just Right
Training
Error
Validation
Error
Generalisation in Deep Learning
• “Understanding Deep Learning requires rethinking generalization”, Zhang, S. Bengio, Hardt,
Recht, Vinyals
• Deep Neural Networks easily fit random labels
• Generalization error varies from 0 to 90% without changes to the model
• Deep NNs can even (rote) learn to classify random images
(Stochastic) Gradient Descent
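The update-rule equations on this slide did not survive the export; a standard reconstruction (notation assumed) is:

```latex
% Batch gradient descent: step against the gradient of the cost J
w_{t+1} = w_t - \alpha \, \nabla_w J(w_t)
% Stochastic gradient descent: replace the full gradient with a noisy
% estimate from a single example (or minibatch) (x_i, y_i)
w_{t+1} = w_t - \alpha \, \nabla_w J(w_t; x_i, y_i)
```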
Generalisation from Stochastic Gradient Descent
Linear Regression
Linear Regression Cost Function
• Model:
• Loss gradient:
• Model gradient:
• Put together:
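The slide's equations were lost in the export; the standard least-squares pieces (notation assumed) are:

```latex
% Model: linear prediction
\hat{y}_i = w^\top x_i
% Cost: mean squared error over N training pairs
J(w) = \frac{1}{2N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2
% Loss gradient with respect to the prediction:
\frac{\partial J}{\partial \hat{y}_i} = \frac{1}{N} \left( \hat{y}_i - y_i \right)
% Model gradient with respect to the weights:
\frac{\partial \hat{y}_i}{\partial w} = x_i
% Put together via the chain rule:
\nabla_w J = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right) x_i
```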
Batch and stochastic gradient descent
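An illustrative comparison (not from the slides) on least-squares linear regression; the data, learning rates, and step counts are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=256)

def batch_gd(X, y, lr=0.1, steps=100):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # gradient over the full dataset
        w -= lr * grad
    return w

def sgd(X, y, lr=0.01, steps=2000):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(y))           # one random example per step
        grad = (X[i] @ w - y[i]) * X[i]    # noisy single-example gradient
        w -= lr * grad
    return w

print(batch_gd(X, y))  # both should approach true_w
print(sgd(X, y))
```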
Regularisation
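The formula here was also lost; the usual L2 (ridge) version, assuming that is what the slide showed:

```latex
% L2-regularised cost: penalise large weights with strength \lambda
J_{\mathrm{reg}}(w) = J(w) + \frac{\lambda}{2} \lVert w \rVert^2
% Gradient: the penalty adds a weight-decay term
\nabla_w J_{\mathrm{reg}} = \nabla_w J + \lambda w
```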
Non-linear Basis Functions
Regression with polynomial basis functions
[Figure: polynomial fits of increasing degree. Image: Michaelg2015 (own work), CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0), via Wikimedia Commons]
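An illustrative sketch (not from the slides): linear regression becomes a non-linear function of x by expanding x into polynomial basis features. The degree and toy data are assumptions.

```python
import numpy as np

def polynomial_features(x, degree):
    # Map scalar inputs x to [1, x, x^2, ..., x^degree].
    return np.stack([x**d for d in range(degree + 1)], axis=1)

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = np.sin(3 * x) + 0.1 * rng.normal(size=50)

Phi = polynomial_features(x, degree=5)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # least-squares fit in feature space
y_hat = Phi @ w                              # linear in w, non-linear in x
```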
Cross Entropy
Logistic Regression Cost Function
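Reconstructed standard forms for the two missing formulas (notation assumed):

```latex
% Cross entropy between a target distribution p and model distribution q:
H(p, q) = -\sum_{k} p_k \log q_k
% Logistic regression: a sigmoid link on a linear model
\hat{y} = \sigma(w^\top x), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}
% Cost: binary cross entropy averaged over N training pairs
J(w) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right]
```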
Modular Gradients for Logistic Regression
• Total Gradient:
• Loss gradient:
• Link gradient:
• Model gradient:
Putting the gradient back together
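The modular pieces listed above, reconstructed in standard form (the slide's own equations did not survive the export), with z = w^T x and \hat{y} = \sigma(z):

```latex
% Loss gradient (cross entropy w.r.t. the prediction):
\frac{\partial L}{\partial \hat{y}} = -\frac{y}{\hat{y}} + \frac{1 - y}{1 - \hat{y}}
% Link gradient (sigmoid):
\frac{\partial \hat{y}}{\partial z} = \hat{y} (1 - \hat{y})
% Model gradient (linear model):
\frac{\partial z}{\partial w} = x
% Total gradient: chain the modules together; the terms simplify neatly
\nabla_w L = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z} \cdot \frac{\partial z}{\partial w} = (\hat{y} - y) \, x
```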