
MONTRÉAL.AI ACADEMY: ARTIFICIAL INTELLIGENCE 101

FIRST WORLD-CLASS OVERVIEW OF AI FOR ALL
VIP AI 101 CHEATSHEET

A PREPRINT

Vincent Boucher∗
MONTRÉAL.AI
Montreal, Quebec, Canada
[email protected]

July 10, 2019

ABSTRACT
For the purpose of entrusting all sentient beings with powerful AI tools to learn, deploy and scale AI
in order to enhance their prosperity, to settle planetary-scale problems and to inspire those who, with
AI, will shape the 21st Century, MONTRÉAL.AI introduces this VIP AI 101 CheatSheet for All.

Curated Open-Source Codes and Science: http://www.academy.montreal.ai/.

Keywords AI-First · Artificial Intelligence · Deep Learning · GANs · Intelligent Agent

1 AI-First
TODAY’S ARTIFICIAL INTELLIGENCE IS POWERFUL AND ACCESSIBLE TO ALL. AI opens up a world of new
possibilities. To gain the advantages of pioneering AI-First innovations, start by exploring how to apply AI in ways never thought of.

"Breakthrough in machine learning would be worth 10 Microsofts." — Bill Gates

2 Getting Started
Tinker with neural networks in the browser with TensorFlow Playground http://playground.tensorflow.org/.

Papers With Code (Learn Python 3 in Y minutes²) https://paperswithcode.com/state-of-the-art.

2.1 In the Cloud

Colab³. Practice Immediately⁴. Labs⁵: Introduction to Deep Learning (MIT 6.S191)

• Free GPU compute via Colab https://colab.research.google.com/notebooks/welcome.ipynb.
• Six easy ways to run your Jupyter Notebook in the cloud⁶.

∗ Founding Chairman at MONTRÉAL.AI http://www.montreal.ai.
² https://learnxinyminutes.com/docs/python3/
³ https://medium.com/tensorflow/colab-an-easy-way-to-learn-and-use-tensorflow-d74d1686e309
⁴ https://colab.research.google.com/github/GokuMohandas/practicalAI/
⁵ https://colab.research.google.com/github/aamini/introtodeeplearning_labs
⁶ https://www.dataschool.io/cloud-services-for-jupyter-notebook/
A PREPRINT - JULY 10, 2019

2.2 On a Local Machine

JupyterLab is an interactive development environment for working with notebooks, code and data⁷.

• Install Anaconda https://www.anaconda.com/download/ and launch ‘Anaconda Navigator’.
• Update JupyterLab and launch the application. Under Notebook, click on ‘Python 3’.

3 Deep Learning

Deep learning allows computational models that are composed of multiple processing layers to learn REPRESENTATIONS
of (raw) data with multiple levels of abstraction[2]. At a high level, neural networks are either encoders,
decoders, or a combination of both⁸. Introductory course http://introtodeeplearning.com. See also Table 1.

"DL is essentially a new style of programming – "differentiable programming" – and the field is trying to work out the
reusable constructs in this style. We have some: convolution, pooling, LSTM, GAN, VAE, memory units, routing units,
etc." — Thomas G. Dietterich

Table 1: Types of Learning, by Alex Graves at NeurIPS 2018


Name      With Teacher                               Without Teacher
Active    Reinforcement Learning / Active Learning   Intrinsic Motivation / Exploration
Passive   Supervised Learning                        Unsupervised Learning

"If you have a large big dataset and you train a very big neural network, then success is guaranteed!" — Ilya Sutskever

Figure 1: Multilayer perceptron (MLP).

"When you first study a field, it seems like you have to memorize a zillion things. You don’t. What you need is to identify
the 3-5 core principles that govern the field. The million things you thought you had to memorize are various
combinations of the core principles." — J. Reed

"1. Multiply things together


2. Add them up
3. Replaces negatives with zeros
4. Return to step 1, a hundred times."
— Jeremy Howard
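
The four steps in the quote above map onto a forward pass through a stack of dense layers with ReLU activations. The following is a minimal NumPy sketch under stated assumptions: the function name `forward`, the layer widths and the random weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

def forward(x, weights, biases):
    """Forward pass of a small MLP: multiply, add, zero out negatives, repeat."""
    h = x
    for W, b in zip(weights, biases):
        h = h @ W + b            # steps 1-2: multiply things together, add them up
        h = np.maximum(h, 0.0)   # step 3: replace negatives with zeros (ReLU)
    return h                     # step 4 is the loop over layers

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]             # hypothetical layer widths
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
out = forward(rng.normal(size=(2, 4)), weights, biases)  # batch of 2 inputs
```

In practice the last layer usually skips the ReLU so that outputs can be negative; the sketch keeps it to mirror the quote literally.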
⁷ https://blog.jupyter.org/jupyterlab-is-ready-for-users-5a6f039b8906
⁸ https://github.com/lexfridman/mit-deep-learning


Deep learning (distributed representations + composition) is a general-purpose learning procedure.


✓ Dive into Deep Learning http://d2l.ai.
✓ Minicourse in Deep Learning with PyTorch⁹.
✓ Deep Learning. The full deck of (600+) slides, by Gilles Louppe¹⁰.
✓ A Selective Overview of Deep Learning https://arxiv.org/abs/1904.05526.
✓ A Recipe for Training Neural Networks https://karpathy.github.io/2019/04/25/recipe/.
✓ How to Choose Your First AI Project https://hbr.org/2019/02/how-to-choose-your-first-ai-project.
✓ Blog | MIT 6.S191 https://medium.com/tensorflow/mit-introduction-to-deep-learning-4a6f8dde1f0c.

3.1 Universal Approximation Theorem

Neural Networks + Gradient Descent + GPU¹¹:

• Infinitely flexible function: Neural Network (multiple hidden layers: Deep Learning)¹².
• All-purpose parameter fitting: Backpropagation¹³ ¹⁴.

Figure 2: All-purpose parameter fitting: Backpropagation.

• Fast and scalable: GPU.


When a choice must be made, just feed the (raw) data to a deep neural network (Universal function approximators).
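
To make "flexible function + all-purpose parameter fitting" concrete, here is a small sketch of backpropagation and gradient descent written by hand in NumPy. It is an illustration under assumptions, not a reference implementation: the one-hidden-layer tanh network, the target function y = x², the learning rate of 0.1 and the names `init_params` and `loss_and_grads` are all hypothetical choices.

```python
import numpy as np

def init_params(rng, n_in=1, n_hidden=8, n_out=1):
    return {
        "W1": rng.normal(scale=0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def loss_and_grads(params, x, y):
    # Forward pass: one tanh hidden layer, linear output, mean squared error.
    h = np.tanh(x @ params["W1"] + params["b1"])
    pred = h @ params["W2"] + params["b2"]
    err = pred - y
    loss = np.mean(err ** 2)
    # Backward pass: apply the chain rule layer by layer (backpropagation).
    d_pred = 2.0 * err / err.size
    grads = {"W2": h.T @ d_pred, "b2": d_pred.sum(axis=0)}
    d_pre = (d_pred @ params["W2"].T) * (1.0 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
    grads["W1"] = x.T @ d_pre
    grads["b1"] = d_pre.sum(axis=0)
    return loss, grads

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 32).reshape(-1, 1)
y = x ** 2                                   # hypothetical function to approximate
params = init_params(rng)
initial_loss, _ = loss_and_grads(params, x, y)
for _ in range(500):                         # plain full-batch gradient descent
    _, grads = loss_and_grads(params, x, y)
    for name in params:
        params[name] -= 0.1 * grads[name]
final_loss, _ = loss_and_grads(params, x, y)
```

Frameworks such as TensorFlow and PyTorch compute the backward pass automatically; writing it out once shows that it is only the chain rule applied in reverse order.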

3.2 Convolutional Neural Networks (Useful for Images | Space)

The deep convolutional network, inspired by Hubel and Wiesel’s seminal work on early visual cortex, uses hierarchical
layers of tiled convolutional filters to mimic the effects of receptive fields, thereby exploiting the local spatial correlations
present in images[1]. See Figure 4. Demo https://ml4a.github.io/demos/convolution/.
A ConvNet is made up of Layers. Every Layer has a simple API: It transforms an input 3D volume to an output 3D
volume with some differentiable function that may or may not have parameters¹⁵. Reading¹⁶.
In images, local combinations of edges form motifs, motifs assemble into parts, and parts form objects¹⁷ ¹⁸.
✓ TensorSpace (https://tensorspace.org) offers interactive 3D visualizations of LeNet, AlexNet and Inceptionv3.
⁹ https://github.com/Atcold/pytorch-Deep-Learning-Minicourse
¹⁰ https://glouppe.github.io/info8010-deep-learning/pdf/lec-all.pdf
¹¹ http://wiki.fast.ai/index.php/Lesson_1_Notes
¹² http://neuralnetworksanddeeplearning.com/chap4.html
¹³ https://github.com/DebPanigrahi/Machine-Learning/blob/master/back_prop.ipynb
¹⁴ https://www.jeremyjordan.me/neural-networks-training/
¹⁵ http://cs231n.github.io/convolutional-networks/
¹⁶ https://ml4a.github.io/ml4a/convnets/
¹⁷ http://yosinski.com/deepvis
¹⁸ https://distill.pub/2017/feature-visualization/


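I (7×7 input)        K (3×3 kernel)       I ∗ K (5×5 output)

0 1 1 1 0 0 0
0 0 1 1 1 0 0                             1 4 3 4 1
0 0 0 1 1 1 0          1 0 1              1 2 4 3 3
0 0 0 1 1 0 0    ∗     0 1 0       =      1 2 3 4 1
0 0 1 1 0 0 0          1 0 1              1 3 3 1 1
0 1 1 0 0 0 0                             3 3 1 1 0
1 1 0 0 0 0 0

Highlighted window (rows 1-3, columns 4-6 of I), multiplied elementwise by K:
1×1 0×0 0×1
1×0 1×1 0×0
1×1 1×0 1×1
Its sum, 4, is the entry in row 1, column 4 of I ∗ K.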

Figure 3: 2D Convolution. Source: Cambridge Coding Academy
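
The operation in Figure 3 can be reproduced in a few lines. The sketch below uses the input and kernel values readable from the figure; the function name `conv2d_valid` is hypothetical. As in most deep learning libraries, the kernel is slid without flipping (cross-correlation), which coincides with true convolution here because K is symmetric.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1), summing elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

I = [
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0],
]
K = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
result = conv2d_valid(I, K)  # 5x5 output, first row [1, 4, 3, 4, 1]
```

A 7×7 input and a 3×3 kernel give a (7 − 3 + 1) × (7 − 3 + 1) = 5×5 output, matching the figure.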

Figure 4: Architecture of LeNet-5, a Convolutional Neural Network. LeCun et al., 1998

3.3 Recurrent Neural Networks (Useful for Sequences | Time)

Recurrent neural networks are networks with loops in them, allowing information to persist¹⁹. RNNs process an input
sequence one element at a time, maintaining in their hidden units a ‘state vector’ that implicitly contains information
about the history of all the past elements of the sequence[2]. They are suited to sequential inputs. See Figure 6.

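[Diagram: an RNN cell A with input x_t and output h_t, drawn as equal to its unrolled form: copies of A applied in
sequence to x_0, x_1, x_2, x_3, ..., x_t, producing h_0, h_1, h_2, h_3, ..., h_t.]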

Figure 5: RNN Layers Reuse Weights for Multiple Timesteps.
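
A rough sketch of the weight reuse in Figure 5: a vanilla RNN applies the same weight matrices at every timestep and carries a hidden state forward. The tanh nonlinearity, the sizes and the names `rnn_forward`, `W_xh`, `W_hh` and `b_h` are assumptions made for this illustration.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence, returning the hidden state at each step."""
    h = np.zeros(W_hh.shape[0])        # initial state vector
    states = []
    for x in xs:                       # one sequence element at a time
        # The same W_xh, W_hh and b_h are reused at every timestep.
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
input_size, hidden_size, T = 3, 5, 4   # hypothetical sizes
W_xh = rng.normal(scale=0.1, size=(input_size, hidden_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)
xs = rng.normal(size=(T, input_size))
hs = rnn_forward(xs, W_xh, W_hh, b_h)  # shape (T, hidden_size)
```

Because the parameters do not depend on the timestep, the same function handles sequences of any length.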

"I feel like a significant percentage of Deep Learning breakthroughs ask the question “how can I reuse weights in
multiple places?” – Recurrent (LSTM) layers reuse for multiple timesteps – Convolutional layers reuse in multiple
locations. – Capsules reuse across orientation." — Andrew Trask

✓ Long Short-Term-Memory (LSTM), Sepp Hochreiter and Jürgen Schmidhuber²⁰.
✓ Can Neural Networks Remember? Slides by Vishal Gupta: http://vishalgupta.me/deck/char_lstms/.
✓ Understanding LSTM Networks http://colah.github.io/posts/2015-08-Understanding-LSTMs/.
✓ The Unreasonable Effectiveness of Recurrent Neural Networks, blog (2015) by Andrej Karpathy²¹.
¹⁹ http://colah.github.io/posts/2015-08-Understanding-LSTMs/
²⁰ https://www.bioinf.jku.at/publications/older/2604.pdf
²¹ http://karpathy.github.io/2015/05/21/rnn-effectiveness/


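[Diagram: LSTM cell. Inputs x_t, h_{t−1} and c_{t−1}; sigmoid gates f_t, i_t and o_t; tanh candidate c̃_t;
the cell state is updated by × and + operations to give c_t, and the output h_t is o_t × tanh(c_t).]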

Figure 6: "Long Short-Term-Memory" (LSTM) Cell.
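
The cell in Figure 6 can be written as a single function. This is a sketch of the standard LSTM step (forget gate f_t, input gate i_t, candidate c̃_t, output gate o_t); packing all four gates into one weight matrix `W` and the name `lstm_step` are implementation assumptions, not something the original specifies.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps the concatenation [h_prev, x_t] to four gate pre-activations."""
    z = np.concatenate([h_prev, x_t]) @ W + b
    f, i, g, o = np.split(z, 4)
    f_t = sigmoid(f)                    # forget gate: how much of c_prev to keep
    i_t = sigmoid(i)                    # input gate: how much of the candidate to write
    c_tilde = np.tanh(g)                # candidate cell state
    o_t = sigmoid(o)                    # output gate
    c_t = f_t * c_prev + i_t * c_tilde  # the "×" and "+" on the cell-state line
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4          # hypothetical sizes
W = rng.normal(scale=0.1, size=(hidden_size + input_size, 4 * hidden_size))
b = np.zeros(4 * hidden_size)
h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):
    h, c = lstm_step(x_t, h, c, W, b)
```

The additive update of c_t is what lets information (and gradients) persist over many timesteps.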

Figure 7: Google Smart Reply System is built on a pair of recurrent neural networks. Diagram by Chris Olah

✓ Attention and Augmented Recurrent Neural Networks https://distill.pub/2016/augmented-rnns/.
✓ Attention Is All You Need, Vaswani et al. https://arxiv.org/abs/1706.03762.
✓ Transformer model for language understanding. Tutorial showing how to write Transformer in TensorFlow 2.0²².

3.4 Unsupervised Learning

True intelligence will require independent learning strategies.


Unsupervised learning is a paradigm for creating AI that learns without a particular task in mind: learning for the
sake of learning²³. It captures some characteristics of the joint distribution of the observed random variables (learn the
underlying structure). The variety of tasks includes density estimation, dimensionality reduction, and clustering[4]²⁴.
Self-supervised learning is derived from unsupervised learning where the data provides the supervision. E.g.
Word2vec²⁵, a technique for learning vector representations of words, or word embeddings. An embedding is a
mapping from discrete objects, such as words, to vectors of real numbers²⁶.
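
As a toy illustration of what "a mapping from discrete objects to vectors" means in code, the sketch below stores one vector per word and compares words by cosine similarity. The vocabulary and the vector values are made up for the example; they were not learned by Word2vec or any other method.

```python
import math

# Hypothetical 3-dimensional embeddings, one row per word in the vocabulary.
vocab = {"king": 0, "queen": 1, "apple": 2}
embeddings = [
    [0.9, 0.8, 0.1],
    [0.8, 0.9, 0.1],
    [0.1, 0.0, 0.9],
]

def embed(word):
    """Look up the vector for a discrete token."""
    return embeddings[vocab[word]]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_related = cosine(embed("king"), embed("queen"))
sim_unrelated = cosine(embed("king"), embed("apple"))
```

In a trained model the rows of the table are parameters fitted so that words appearing in similar contexts end up with similar vectors.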

²² https://www.tensorflow.org/alpha/tutorials/sequences/transformer
²³ https://deepmind.com/blog/unsupervised-learning/
²⁴ https://media.neurips.cc/Conferences/NIPS2018/Slides/Deep_Unsupervised_Learning.pdf
²⁵ https://jalammar.github.io/illustrated-word2vec/
²⁶ http://projector.tensorflow.org


3.4.1 Generative Adversarial Networks


Simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model
D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is
to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game[3].

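$$\min_{\theta_g} \max_{\theta_d} \left[ \mathbb{E}_{x \sim p_{data}(x)} [\log D_{\theta_d}(x)] + \mathbb{E}_{z \sim p_z(z)} [\log(1 - D_{\theta_d}(G_{\theta_g}(z)))] \right] \tag{1}$$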
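
The objective in Equation (1) can be estimated from samples once the discriminator's outputs are available. The sketch below is a hypothetical helper (`gan_value` is not from the original) that takes D's probabilities on real samples and on generated samples and returns the Monte Carlo estimate of the bracketed value.

```python
import math

def gan_value(d_real, d_fake):
    """Estimate E[log D(x)] + E[log(1 - D(G(z)))] from discriminator outputs in (0, 1)."""
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
confused = gan_value([0.5] * 4, [0.5] * 4)    # equals -log 4
# A discriminator that separates them well scores higher.
confident = gan_value([0.9] * 4, [0.1] * 4)
```

D is trained to push this value up, G to push it down; when G matches the data distribution the best D can do is output 0.5, giving −log 4.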

"What I cannot create, I do not understand." — Richard Feynman

Goodfellow et al. used an interesting analogy where the generative model can be thought of as analogous to a team of
counterfeiters, trying to produce fake currency and use it without detection, while the discriminative model is analogous
to the police, trying to detect the counterfeit currency. Competition in this game drives both teams to improve their
methods until the counterfeits are indistinguishable from the genuine articles. See Figure 8.

Figure 8: GAN: Neural Networks Architecture Pioneered by Ian Goodfellow at University of Montreal (2014).

StyleGAN: A Style-Based Generator Architecture for Generative Adversarial Networks

• Paper http://stylegan.xyz/paper | Code https://github.com/NVlabs/stylegan.
• StyleGAN for art. Colab https://colab.research.google.com/github/ak9250/stylegan-art.
• This Person Does Not Exist https://thispersondoesnotexist.com.
• Which Person Is Real? http://www.whichfaceisreal.com.
• This Resume Does Not Exist https://thisresumedoesnotexist.com.
• This Waifu Does Not Exist https://www.thiswaifudoesnotexist.net.
• Encoder for Official TensorFlow Implementation https://github.com/Puzer/stylegan-encoder.
• How to recognize fake AI-generated images. By Kyle McDonald²⁷.

✓ Few-Shot Adversarial Learning of Realistic Neural Talking Head Models²⁸.
✓ Wasserstein GAN http://www.depthfirstlearning.com/2019/WassersteinGAN.
✓ GANSynth: Generate high-fidelity audio with GANs! Colab http://goo.gl/magenta/gansynth-demo.
✓ SC-FEGAN: Face Editing Generative Adversarial Network https://github.com/JoYoungjoo/SC-FEGAN.
✓ CariGANs: Unpaired Photo-to-Caricature Translation. Cao et al.: https://cari-gan.github.io.
✓ GANpaint Paint with GAN units http://gandissect.res.ibm.com/ganpaint.html.
✓ PyTorch pretrained BigGAN https://github.com/huggingface/pytorch-pretrained-BigGAN.
²⁷ https://medium.com/@kcimc/how-to-recognize-fake-ai-generated-images-4d1f6f9a2842
²⁸ https://arxiv.org/abs/1905.08233


✓ Demo of BigGAN in an official Colaboratory notebook (backed by a GPU) https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/biggan_generation_with_tf_hub.ipynb

3.4.2 Variational AutoEncoder

Variational Auto-Encoders (VAEs) are powerful models for learning low-dimensional representations. See Figure 9.
Disentangled representations are defined as ones where a change in a single unit of the representation corresponds to a
change in a single factor of variation of the data while being invariant to others (Bengio et al., 2013).

Figure 9: Variational Autoencoders (VAEs): Powerful Generative Models.

✓ Colab²⁹: "Debiasing Facial Detection Systems." AIEthics
✓ SpaceSheet: Interactive Latent Space Exploration with a Spreadsheet https://vusd.github.io/spacesheet/.
✓ MusicVAE: Learning latent spaces for musical scores https://magenta.tensorflow.org/music-vae.
✓ Slides: A Few Unusual Autoencoders https://colinraffel.com/talks/vector2018few.pdf.
✓ Generative models in Tensorflow 2 https://github.com/timsainb/tensorflow2-generative-models/.
✓ Reading: Disentangled VAE’s (DeepMind 2016) https://arxiv.org/abs/1606.05579.

3.4.3 Natural Language Processing (NLP) | BERT: A New Era in NLP

BERT (Bidirectional Encoder Representations from Transformers)[6] is a deeply bidirectional, unsupervised language
representation, pre-trained using only a plain text corpus (in this case, Wikipedia)³⁰.

• Reading: Unsupervised pre-training of an LSTM followed by supervised fine-tuning[7].
• TensorFlow code and pre-trained models for BERT https://github.com/google-research/bert.
• Better Language Models and Their Implications³¹.

"I think transfer learning is the key to general intelligence. And I think the key to doing transfer learning will be the
acquisition of conceptual knowledge that is abstracted away from perceptual details of where you learned it from." —
Demis Hassabis

✓ How to Build OpenAI’s GPT-2: "The AI That’s Too Dangerous to Release"³².
✓ Play with BERT with your own data using TensorFlow Hub https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb.

²⁹ https://colab.research.google.com/github/aamini/introtodeeplearning_labs/blob/master/lab2/Part2_debiasing_solution.ipynb
³⁰ https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
³¹ https://blog.openai.com/better-language-models/
³² https://blog.floydhub.com/gpt2/


Figure 10: The two steps of how BERT is developed. Source https://jalammar.github.io/illustrated-bert/.

4 Autonomous Agents
An autonomous agent is any device that perceives its environment and takes actions that maximize its chance of
success at some goal. At the bleeding edge of AI, autonomous agents can learn from experience, simulate worlds and
orchestrate meta-solutions. Here’s an informal definition³³ of the universal intelligence of agent π³⁴:

$$\Upsilon(\pi) := \sum_{\mu \in E} 2^{-K(\mu)} V_\mu^\pi \tag{2}$$

"Intelligence measures an agent’s ability to achieve goals in a wide range of environments." — Shane Legg

4.1 Deep Reinforcement Learning

Figure 11: An Agent Interacts with an Environment.

Reinforcement learning (RL) studies how an agent can learn how to achieve goals in a complex, uncertain environment
(Figure 11)[5]. Recent superhuman results in many difficult environments combine deep learning with RL (Deep
Reinforcement Learning). See Figure 12 for a taxonomy of RL algorithms.
³³ https://arxiv.org/abs/0712.3329
³⁴ Where µ is an environment, K is the Kolmogorov complexity function, E is the space of all computable reward summable
environmental measures with respect to the reference machine U and the value function $V_\mu^\pi$ is the agent’s “ability to achieve”.


Figure 12: A Taxonomy of RL Algorithms. Source: Spinning Up in Deep RL by Achiam et al. | OpenAI

4.1.1 Model-Free RL | Value-Based


The goal in RL is to train the agent to maximize the discounted sum of all future rewards $R_t$, called the return:

$$R_t = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots \tag{3}$$

The Q-function captures the expected total future reward an agent in state s can receive by executing a certain action a:

$$Q(s, a) = \mathbb{E}[R_t] \tag{4}$$

The optimal policy should choose the action a that maximizes Q(s, a):

$$\pi^*(s) = \arg\max_a Q(s, a) \tag{5}$$
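
Equations (3) and (5) translate almost directly into code. The sketch below computes a discounted return for a finite reward sequence and picks the greedy action from a tabular Q-function; the Q-table values, state names and function names are hypothetical.

```python
def discounted_return(rewards, gamma):
    """R_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for a finite episode."""
    G = 0.0
    for r in reversed(rewards):   # accumulate from the last reward backwards
        G = r + gamma * G
    return G

def greedy_action(Q, state):
    """pi*(s) = argmax_a Q(s, a) for a Q-function stored as nested dicts."""
    actions = Q[state]
    return max(actions, key=actions.get)

# Hypothetical Q-table with two states and two actions.
Q = {
    "s0": {"left": 0.2, "right": 1.5},
    "s1": {"left": 0.7, "right": 0.1},
}
G = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25 = 1.75
best = greedy_action(Q, "s0")                      # "right"
```

DQN replaces the table with a neural network that maps a state to one Q-value per action, but the greedy policy is still an argmax over those outputs.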

• Q-Learning: Playing Atari with Deep Reinforcement Learning (DQN). Mnih et al, 2013[10].

TF-Agents (DQN Tutorial) | Colab https://colab.research.google.com/github/tensorflow/agents.

4.1.2 Model-Free RL | Policy-Based

Figure 13: Policy Gradient Directly Optimizes the Policy.

Run a policy for a while (code: https://gist.github.com/karpathy/a4166c7fe253700972fcbc77e4ea32c5):

$$\tau = (s_0, a_0, r_0, s_1, a_1, r_1, \dots, s_{T-1}, a_{T-1}, r_{T-1}, s_T) \tag{6}$$

Increase probability of actions that lead to high rewards and decrease probability of actions that lead to low rewards:

$$\nabla_\theta \mathbb{E}_\tau [R(\tau)] = \mathbb{E}_\tau \left[ \sum_{t=0}^{T-1} \nabla_\theta \log \pi(a_t | s_t, \theta) R(\tau) \right] \tag{7}$$
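
A minimal sketch of the estimator in Equation (7), reduced to the simplest possible case: a stateless, one-step "bandit" problem with a softmax policy, so each trajectory is a single action and R(τ) is its reward. The reward values, learning rate and function names are assumptions chosen for illustration.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_bandit(rewards, steps=2000, lr=0.1, seed=0):
    """Gradient ascent on expected reward using sampled grad log pi(a) * R."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)              # one logit per action
    for _ in range(steps):
        probs = softmax(theta)
        a = rng.choices(range(len(rewards)), weights=probs)[0]  # run the policy
        R = rewards[a]
        for k in range(len(theta)):
            # For a softmax policy, d log pi(a) / d theta_k = 1[k == a] - probs[k].
            grad_log_pi = (1.0 if k == a else 0.0) - probs[k]
            theta[k] += lr * grad_log_pi * R
    return softmax(theta)

# Hypothetical rewards: action 0 is bad, action 1 neutral, action 2 good.
probs = reinforce_bandit([-1.0, 0.0, 1.0])
```

Sampling a negative reward lowers the probability of the action that produced it and sampling a positive reward raises it, which is the verbal rule stated above Equation (7).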


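[Diagram: a shared network takes the state s as input and has two heads: the policy outputs π_θ(s, α_1), ..., π_θ(s, α_5),
one per action, and the value estimate V_ψ(s).]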

Figure 14: Asynchronous Advantage Actor-Critic (A3C). Source: Petar Velickovic

• Policy Optimization: Asynchronous Methods for Deep Reinforcement Learning (A3C). Mnih et al, 2016[8].

• Policy Optimization: Proximal Policy Optimization Algorithms (PPO). Schulman et al, 2017[9].

4.1.3 Model-Based RL

In Model-Based RL, the agent generates predictions about the next state and reward before choosing each action.

Figure 15: World Model’s Agent consists of: Vision (V), Memory (M), and Controller (C). | Ha et al, 2018[11]

• Learn the Model: Recurrent World Models Facilitate Policy Evolution (World Models³⁵). The world model
agent can be trained in an unsupervised manner to learn a compressed spatial and temporal representation of
the environment. Then, a compact policy can be trained. See Figure 15. Ha et al, 2018[11].

• Learn the Model: Learning Latent Dynamics for Planning from Pixels https://planetrl.github.io/.

• Given the Model: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
(AlphaZero). Silver et al, 2017[14]. AlphaGo Zero Explained In One Diagram³⁶.

³⁵ https://worldmodels.github.io
³⁶ https://applied-data.science/static/main/res/alpha_go_zero_cheat_sheet.png


4.1.4 Improving Agent Design


Via Reinforcement Learning: Blog³⁷. arXiv³⁸. ASTool https://github.com/hardmaru/astool/.
Via Evolution: Video³⁹. Evolved Creatures http://www.karlsims.com/evolved-virtual-creatures.html.

Figure 16: A comparison of the original LSTM cell vs. two good new generated cells. Top left: LSTM cell.[19]

"The future of high-level APIs for AI is... a problem-specification API. Currently we only search over network weights,
thus "problem specification" involves specifying a model architecture. In the future, it will just be: "tell me what data
you have and what you are optimizing"." — François Chollet

4.1.5 OpenAI Baselines


High-quality implementations of reinforcement learning algorithms https://github.com/openai/baselines.
Colab https://colab.research.google.com/drive/1KKq9A3dRTq1q6bJmPyFOgg917gQyTjJI.

4.1.6 Google Dopamine and A Zoo of Agents


Dopamine is a research framework for fast prototyping of reinforcement learning algorithms⁴⁰.
A Zoo of Atari-Playing Agents: Code⁴¹, Blog⁴² and Colaboratory notebook https://colab.research.google.com/github/uber-research/atari-model-zoo/blob/master/colab/AtariZooColabDemo.ipynb.

4.1.7 TRFL | By the Research Engineering team at DeepMind


TRFL ("truffle"): a library of reinforcement learning building blocks https://github.com/deepmind/trfl.

4.2 Evolution Strategies (ES)

Evolution and neural networks proved a potent combination in nature. Neuroevolution, which harnesses evolutionary
algorithms to optimize neural networks, enables capabilities that are typically unavailable to gradient-based approaches,
including learning neural network building blocks, architectures and even the algorithms for learning[12].
³⁷ https://designrl.github.io
³⁸ https://arxiv.org/abs/1810.03779
³⁹ https://youtu.be/JBgG_VSP7f8
⁴⁰ https://github.com/google/dopamine
⁴¹ https://github.com/uber-research/atari-model-zoo
⁴² https://eng.uber.com/atari-zoo-deep-reinforcement-learning/


". . . evolution — whether biological or computational — is inherently creative, and should routinely be expected to
surprise, delight, and even outwit us." — The Surprising Creativity of Digital Evolution, Lehman et al.[22]

Neural architecture search has advanced to the point where it can outperform human-designed models[13].
Natural evolutionary strategy directly evolves the weights of a DNN and performs competitively with the best deep
reinforcement learning algorithms, including deep Q-networks (DQN) and policy gradient methods (A3C)[21].

Figure 17: https://colab.research.google.com/github/karpathy/randomfun/blob/master/es.ipynb.

The ES algorithm is a “guess and check” process, where we start with some random parameters and then repeatedly:

1. Tweak the guess a bit randomly, and

2. Move our guess slightly towards whatever tweaks worked better.
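
The two steps above can be sketched in NumPy as follows. This is an illustrative version of a simple evolution strategy on a toy objective, not a reproduction of any particular notebook: the population size, noise scale `sigma`, step size `alpha`, target vector and function names are all assumptions.

```python
import numpy as np

def evolution_strategy(f, w0, npop=50, sigma=0.1, alpha=0.001, iterations=300, seed=0):
    """Maximize f by perturbing the parameters and moving towards better perturbations."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(iterations):
        noise = rng.normal(size=(npop, w.size))              # 1. tweak the guess randomly
        rewards = np.array([f(w + sigma * n) for n in noise])
        advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        # 2. move the guess slightly towards the tweaks that scored above average
        w = w + alpha / (npop * sigma) * noise.T @ advantages
    return w

target = np.array([0.5, 0.1, -0.3])   # hypothetical optimum

def reward(w):
    return -np.sum((w - target) ** 2)  # higher is better, maximal at the target

w_start = np.array([1.0, -1.0, 1.0])
w_final = evolution_strategy(reward, w_start)
```

No gradients of `reward` are computed anywhere; the update only needs reward values, which is why the same loop can be applied to non-differentiable objectives such as episode returns.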

"Evolution is a slow learning algorithm that with the sufficient amount of compute produces a human brain." —
Wojciech Zaremba

Demos: ES on CartPole-v1⁴³ and ES on LunarLanderContinuous-v2⁴⁴.

VAE+CPPN+GAN https://colab.research.google.com/drive/1_OoZ3z_C5Jl5gnxDOE9VEMCTs-Fl8pvM.
A Visual Guide to ES http://blog.otoro.net/2017/10/29/visual-evolution-strategies/.

4.3 Self Play

Silver et al.[15] introduced an algorithm based solely on reinforcement learning, without human data, guidance or
domain knowledge. Starting tabula rasa (and being its own teacher!), AlphaGo Zero achieved superhuman performance.
AlphaGo Zero showed that algorithms matter much more than big data and massive amounts of computation.

"Self-Play is Automated Knowledge Creation." — Carlos E. Perez

Self-play mirrors similar insights from coevolution. Transfer learning is the key to go from self-play to the real world⁴⁵.

"Open-ended self play produces: Theory of mind, negotiation, social skills, empathy, real language understanding." —
Ilya Sutskever, Meta Learning and Self Play

TensorFlow.js Implementation of DeepMind’s AlphaZero Algorithm for Chess. Live Demo⁴⁶ | Code⁴⁷
An open-source implementation of the AlphaGoZero algorithm https://github.com/tensorflow/minigo
ELF OpenGo: An Open Reimplementation of AlphaZero, Tian et al.: https://arxiv.org/abs/1902.04522.

⁴³ https://colab.research.google.com/drive/1bMZWHdhm-mT9NJENWoVewUks7cGV10go
⁴⁴ https://colab.research.google.com/drive/1lvyKjFtc_C_8njCKD-MnXEW8LPS2RPr6
⁴⁵ http://metalearning-symposium.ml
⁴⁶ https://frpays.github.io/lc0-js/engine.html
⁴⁷ https://github.com/frpays/lc0-js/


4.4 Deep Meta-Learning

Learning to Learn[16]. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve
new learning tasks using only a small number of training samples[20].

$$\theta \leftarrow \theta - \beta \nabla_\theta \sum_{T_i \sim p(T)} L_{T_i}(f_{\theta'_i}) \tag{8}$$
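
In Equation (8), θ'_i denotes the parameters after adapting to task T_i. In MAML[20] this adaptation is one (or a few) gradient steps, θ'_i = θ − α∇_θ L_{T_i}(f_θ), where the inner step size α is a symbol not shown in the text above. The sketch below works this through on a deliberately tiny problem: a scalar parameter and task losses L_i(θ) = (θ − c_i)², so every gradient can be written by hand. The targets c_i, the step sizes and the name `maml_step` are hypothetical.

```python
def maml_step(theta, task_targets, alpha=0.1, beta=0.05):
    """One meta-update of Equation (8) for scalar tasks with loss (theta - c)^2."""
    meta_grad = 0.0
    for c in task_targets:
        # Inner loop: adapt to the task with one gradient step.
        theta_i = theta - alpha * 2.0 * (theta - c)
        # Outer gradient: differentiate the post-adaptation loss w.r.t. the
        # original theta; the chain rule contributes d theta_i / d theta = 1 - 2 * alpha.
        meta_grad += 2.0 * (theta_i - c) * (1.0 - 2.0 * alpha)
    return theta - beta * meta_grad

theta = 5.0
targets = [-1.0, 0.0, 2.0, 3.0]   # hypothetical task optima c_i
for _ in range(200):
    theta = maml_step(theta, targets)
# theta ends near 1.0, the point from which one inner step helps all tasks equally
```

In this symmetric toy case the meta-learned initialization is simply the mean of the task optima; with neural networks the same update is computed by automatic differentiation through the inner step.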

A meta-learning algorithm takes in a distribution of tasks, where each task is a learning problem, and it produces a
quick learner — a learner that can generalize from a small number of examples[17].

Figure 18: Diagram of Model-Agnostic Meta-Learning (MAML)

"The notion of a neural "architecture" is going to disappear thanks to meta learning." — Andrew Trask

✓ Meta Learning Shared Hierarchies[18] (The Lead Author is in High School!)
✓ Colaboratory reimplementation of MAML (Model-Agnostic Meta-Learning) in TF 2.0⁴⁸
✓ Causal Reasoning from Meta-reinforcement Learning https://arxiv.org/abs/1901.08162

4.5 Multi-Agent Populations

"We design a Theory of Mind neural network – a ToMnet – which uses meta-learning to build models of the agents it
encounters, from observations of their behaviour alone." — Machine Theory of Mind, Rabinowitz et al.[25]

Cooperative Agents. Learning to Model Other Minds, by OpenAI[24], is an algorithm which accounts for the fact that
other agents are learning too, and discovers self-interested yet collaborative strategies. Also: OpenAI Five⁴⁹.

"Artificial Intelligence is about recognising patterns, Artificial Life is about creating patterns." — Mizuki Oka et al.

Active Learning Without Teacher. In Intrinsic Social Motivation via Causal Influence in Multi-Agent RL
(https://arxiv.org/abs/1810.08647), Jaques et al. (2018) propose an intrinsic reward function designed for multi-agent RL
(MARL), which rewards agents for having a causal influence on other agents’ actions. Open-source implementation⁵⁰.
"Open-ended Learning in Symmetric Zero-sum Games," Balduzzi et al.: https://arxiv.org/abs/1901.08106
Neural MMO: a massively multiagent env. for simulations with many long-lived agents. Code⁵¹ and 3D Client⁵².

⁴⁸ https://colab.research.google.com/github/mari-linhares/tensorflow-maml/blob/master/maml.ipynb
⁴⁹ https://blog.openai.com/openai-five/
⁵⁰ https://github.com/eugenevinitsky/sequential_social_dilemma_games
⁵¹ https://github.com/openai/neural-mmo
⁵² https://github.com/jsuarez5341/neural-mmo-client


5 Environments
Platforms for training autonomous agents.

"Situation awareness is the perception of the elements in the environment within a volume of time and space, and the
comprehension of their meaning, and the projection of their status in the near future." — Endsley (1987)

5.1 OpenAI Gym

The OpenAI Gym https://gym.openai.com/ (Blog⁵³ | GitHub⁵⁴) is a toolkit for developing and comparing reinforcement
learning algorithms. What makes the gym so great is a common API around environments.
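
To show what a "common API around environments" buys, here is a self-contained stand-in that mimics the classic Gym interface of the time (`reset()` returning an observation, `step(action)` returning observation, reward, done, info) without importing Gym itself. `ParityEnv` and `run_episode` are hypothetical names invented for this sketch; later Gym and Gymnasium releases changed the return signature, so treat the 4-tuple as an assumption tied to the 2019-era API.

```python
class ParityEnv:
    """Toy environment: the observation alternates 0, 1, 0, ...; echoing it earns reward 1."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0
        self.obs = 0

    def reset(self):
        self.t = 0
        self.obs = 0
        return self.obs

    def step(self, action):
        reward = 1.0 if action == self.obs else 0.0
        self.t += 1
        self.obs = self.t % 2
        done = self.t >= self.episode_length
        return self.obs, reward, done, {}

def run_episode(env, policy):
    """Generic agent-environment loop; works with any object exposing reset() and step()."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total += reward
    return total

echo_score = run_episode(ParityEnv(), lambda obs: obs)   # 5.0: always matches
zero_score = run_episode(ParityEnv(), lambda obs: 0)     # 3.0: matches on even steps
```

Because `run_episode` only relies on `reset` and `step`, the same loop can drive any environment that follows the interface, which is the point of the common API.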

Figure 19: Robotics Environments https://blog.openai.com/ingredients-for-robotics-research/

How to create new environments for Gym⁵⁵. Examples: OpenAI Gym Environment for Trading⁵⁶.

5.2 DeepMind Lab

DeepMind Lab: A customisable 3D platform for agent-based AI research https://github.com/deepmind/lab.

• DeepMind Control Suite https://github.com/deepmind/dm_control.
• Convert DeepMind Control Suite to OpenAI Gym Envs https://github.com/zuoxingdong/dm2gym.

5.3 Unity ML-Agents

Unity ML-Agents makes it possible to create environments where intelligent agents (Single Agent, Cooperative and Competitive
Multi-Agent and Ecosystem) can be trained using RL, neuroevolution, or other ML methods https://unity3d.ai.

• Getting Started with Marathon Environments for Unity ML-Agents⁵⁷.
• Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence⁵⁸.

5.4 POET: Paired Open-Ended Trailblazer

Diversity is the premier product of evolution. Endlessly generate increasingly complex and diverse learning
environments⁵⁹. Open-endedness could generate learning algorithms reaching human-level intelligence[23].

• Implementation of the POET algorithm https://github.com/uber-research/poet.

✓ AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence⁶⁰.
⁵³ https://blog.openai.com/openai-gym-beta/
⁵⁴ https://github.com/openai/gym
⁵⁵ https://github.com/openai/gym/blob/master/docs/creating-environments.md
⁵⁶ https://github.com/hackthemarket/gym-trading
⁵⁷ https://towardsdatascience.com/gettingstartedwithmarathonenvs-v0-5-0a-c1054a0b540c
⁵⁸ https://arxiv.org/abs/1905.08085
⁵⁹ https://eng.uber.com/poet-open-ended-deep-learning/
⁶⁰ https://arxiv.org/abs/1905.10985


6 Datasets
Google Dataset Search Beta (Blog⁶¹) https://toolbox.google.com/datasetsearch.
TensorFlow Datasets: load a variety of public datasets into TensorFlow programs (Blog⁶² | Colab⁶³).

7 Deep-Learning Hardware
A Full Hardware Guide to Deep Learning, by Tim Dettmers64 .
Which GPU(s) to Get for Deep Learning, by Tim Dettmers65 .

Figure 20: Edge TPU - Dev Board https://coral.withgoogle.com/products/dev-board/

Build AI that works offline with Coral Dev Board, Edge TPU, and TensorFlow Lite, by Daniel Situnayake66 .
Jetson Nano. A small but mighty AI computer to create intelligent systems67 .

8 Deep-Learning Software
TensorFlow
• tf.keras (TensorFlow 2.0) for Researchers: Crash Course. Colab68 .
• TensorFlow 2.0: basic ops, gradients, data preprocessing and augmentation, training and saving. Colab69 .
• TensorBoard in Jupyter Notebooks. Colab70 .
• TensorFlow Lite for Microcontrollers71 .
PyTorch
• PyTorch primer. Colab72 .
• PyTorch internals http://blog.ezyang.com/2019/05/pytorch-internals/
61 https://www.blog.google/products/search/making-it-easier-discover-datasets/
62 https://medium.com/tensorflow/introducing-tensorflow-datasets-c7f01f7e19f3
63 https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/overview.ipynb
64 http://timdettmers.com/2018/12/16/deep-learning-hardware-guide/
65 http://timdettmers.com/2019/04/03/which-gpu-for-deep-learning/
66 https://medium.com/tensorflow/build-ai-that-works-offline-with-coral-dev-board-edge-tpu-and-tensorflow-lite-70
67 https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-nano/
68 https://colab.research.google.com/drive/14CvUNTaX1OFHDfaKaaZzrBsvMfhCOHIR
69 https://colab.research.google.com/github/zaidalyafeai/Notebooks/blob/master/TF_2_0.ipynb
70 https://colab.research.google.com/github/tensorflow/tensorboard/blob/master/docs/r2/get_started.ipynb
71 https://petewarden.com/2019/03/07/launching-tensorflow-lite-for-microcontrollers/
72 https://colab.research.google.com/drive/1DgkVmi6GksWOByhYVQpyUB4Rk3PUq0Cp


9 AI Art | A New Day Has Come in the Art Industry

Figure 21: On October 25, 2018, the first AI artwork ever sold at Christie’s auction house fetched USD 432,500.

The code (art-DCGAN) for the first artificial intelligence artwork ever sold at Christie’s auction house (Figure 21) is a
modified implementation of DCGAN focused on generative art: https://github.com/robbiebarrat/art-dcgan.

• TensorFlow Magenta. An open source research project exploring the role of ML in the creative process73.
• Magenta Studio. A suite of free music-making tools using machine learning models!74.
• Style Transfer Tutorial https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/r2/tutorials/generative/style_transfer.ipynb
• AI x AR Paper Cubes https://experiments.withgoogle.com/paper-cubes.
• Photo Wake-Up https://grail.cs.washington.edu/projects/wakeup/.
• COLLECTION. AI Experiments https://experiments.withgoogle.com/ai.

"The Artists Creating with AI Won’t Follow Trends; THEY WILL SET THEM." — The House of Montréal.AI Fine Arts

MuseNet. Generate Music Using Many Different Instruments and Styles!75 .


Tuning Recurrent Neural Networks with Reinforcement Learning76 .
Discovering Visual Patterns in Art Collections with Spatially-consistent Feature Learning. Shen et al.77 .
Deep Multispectral Painting Reproduction via Multi-Layer, Custom-Ink Printing. Shi et al.78 .

10 AI Macrostrategy: Aligning AGI with Human Interests


Montréal.AI Governance: Policies at the intersection of AI, Ethics and Governance.
• AI Index. http://aiindex.org.
• Malicious AI Report. https://arxiv.org/pdf/1802.07228.pdf.
• Artificial Intelligence and Human Rights. https://ai-hr.cyber.harvard.edu.

"(AI) will rank among our greatest technological achievements, and everyone deserves to play a role in shaping it." —
Fei-Fei Li

References
[1] Mnih et al. Human-Level Control Through Deep Reinforcement Learning. In Nature 518, pages 529–533. 26 February 2015. https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf
73 https://magenta.tensorflow.org
74 https://magenta.tensorflow.org/studio
75 https://openai.com/blog/musenet/
76 https://magenta.tensorflow.org/2016/11/09/tuning-recurrent-networks-with-reinforcement-learning
77 https://arxiv.org/pdf/1903.02678.pdf
78 http://people.csail.mit.edu/liangs/papers/ToG18.pdf


Figure 22: A Map of Ethical and Right-Based Approaches https://ai-hr.cyber.harvard.edu/primp-viz.html

[2] Yann LeCun, Yoshua Bengio and Geoffrey Hinton. Deep Learning. In Nature 521, pages 436–444. 28 May 2015. https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf
[3] Goodfellow et al. Generative Adversarial Networks. arXiv preprint arXiv:1406.2661, 2014. https://arxiv.org/abs/1406.2661
[4] Yoshua Bengio, Andrea Lodi, Antoine Prouvost. Machine Learning for Combinatorial Optimization: a Methodological Tour d’Horizon. arXiv preprint arXiv:1811.06128, 2018. https://arxiv.org/abs/1811.06128
[5] Brockman et al. OpenAI Gym. 2016. https://gym.openai.com
[6] Devlin et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805, 2018. https://arxiv.org/abs/1810.04805
[7] Dai et al. Semi-supervised Sequence Learning. arXiv preprint arXiv:1511.01432, 2015. https://arxiv.org/abs/1511.01432
[8] Mnih et al. Asynchronous Methods for Deep Reinforcement Learning. arXiv preprint arXiv:1602.01783, 2016. https://arxiv.org/abs/1602.01783
[9] Schulman et al. Proximal Policy Optimization Algorithms. arXiv preprint arXiv:1707.06347, 2017. https://arxiv.org/abs/1707.06347
[10] Mnih et al. Playing Atari with Deep Reinforcement Learning. DeepMind Technologies, 2013. https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf
[11] Ha et al. Recurrent World Models Facilitate Policy Evolution. arXiv preprint arXiv:1809.01999, 2018. https://arxiv.org/abs/1809.01999
[12] Stanley et al. Designing neural networks through neuroevolution. In Nature Machine Intelligence 1, pages 24–35. January 2019. https://www.nature.com/articles/s42256-018-0006-z.pdf
[13] So et al. The Evolved Transformer. arXiv preprint arXiv:1901.11117, 2019. https://arxiv.org/abs/1901.11117
[14] Silver et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv preprint arXiv:1712.01815, 2017. https://arxiv.org/abs/1712.01815
[15] Silver et al. AlphaGo Zero: Learning from scratch. In DeepMind’s Blog, 2017. https://deepmind.com/blog/alphago-zero-learning-scratch/
[16] Andrychowicz et al. Learning to learn by gradient descent by gradient descent. arXiv preprint arXiv:1606.04474, 2016. https://arxiv.org/abs/1606.04474
[17] Nichol et al. Reptile: A Scalable Meta-Learning Algorithm. 2018. https://blog.openai.com/reptile/
[18] Frans et al. Meta Learning Shared Hierarchies. arXiv preprint arXiv:1710.09767, 2017. https://arxiv.org/abs/1710.09767


[19] Zoph and Le. Neural Architecture Search with Reinforcement Learning. arXiv preprint arXiv:1611.01578, 2017. https://arxiv.org/abs/1611.01578
[20] Finn et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. arXiv preprint arXiv:1703.03400, 2017. https://arxiv.org/abs/1703.03400
[21] Salimans et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning. 2017. https://blog.openai.com/evolution-strategies/
[22] Lehman et al. The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities. arXiv preprint arXiv:1803.03453, 2018. https://arxiv.org/abs/1803.03453
[23] Wang et al. Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions. arXiv preprint arXiv:1901.01753, 2019. https://arxiv.org/abs/1901.01753
[24] Foerster et al. Learning to Model Other Minds. 2018. https://blog.openai.com/learning-to-model-other-minds/
[25] Rabinowitz et al. Machine Theory of Mind. arXiv preprint arXiv:1802.07740, 2018. https://arxiv.org/abs/1802.07740
