A White Paper On The Future of Artificial Intelligence

Abstract
In the present white paper we discuss the current state of Artificial Intelligence (AI)
research and its future opportunities. We argue that solving the problem of invariant
representations is the key to overcoming the limitations inherent in today's neural
networks and to making progress towards Strong AI. Based on this premise, we
describe a research strategy towards the next generation of machine learning
algorithms beyond the currently dominant deep learning paradigm. Following the
example of biological brains, we propose an unsupervised learning approach to
solve the problem of invariant representations. A focused interdisciplinary research
effort is required to establish an abstract mathematical theory of invariant
representations and to apply it in the development of functional software algorithms,
while both applying and enhancing our conceptual understanding of the (human)
brain.
In 2012, Krizhevsky, Sutskever, and Hinton published their results on the ImageNet LSVRC-2010 contest, a computer vision challenge to automatically classify 1.2 million high-resolution images into 1,000 different classes. Their use of deep neural networks yielded a substantial improvement in error rate and marked the beginning of the recent wave of interest in machine learning and artificial intelligence. In the subsequent years, deep learning has been applied to a considerable number of other problems and used productively in applications such as voice recognition for digital assistants, translation software, and self-driving vehicles.
But despite all these impressive success stories, deep learning still suffers from severe limitations. For one thing, enormous amounts of labeled data are required to train the networks. Where a human child might learn to recognize an animal species or a class of objects from seeing only a few examples, a deep neural network typically needs tens of thousands of images to achieve similar accuracy. For another, today's algorithms are clearly far from grasping the essence of an entity or a class in the way humans do. Many examples show how even the most modern neural networks fail spectacularly in cases that seem trivial to humans [22].
While neural networks are quite fashionable nowadays, their conceptual foundations
are actually rather old; they were
already being intensely studied in the 1950s and 1960s, inspired by the brain’s
anatomy – according to the understanding
at that time. Today’s deep neural networks are essentially the same as those
classical networks except for their higher
number of layers. They owe their success in recent years largely to an increase in
computing power and the availability
of huge amounts of training data.
Our central hypothesis, which drives our research strategy, is that the current limitations in AI can only be overcome by a new generation of algorithms. These algorithms will be inspired by today's neurosciences and – to some extent – by advances in our understanding of the brain which are yet to come. Our envisioned path forward is an interdisciplinary research effort.
∗Learn more at https://www.merckgroup.com/en/research/ai-research.html

One view is that conceptual knowledge is organized using the circuitry in the
medial temporal lobe (MTL) that supports spatial processing and navigation. In
contrast, we find that a domain-general learning algorithm explains key findings in
both spatial and conceptual domains. When the clustering model is applied to
spatial navigation tasks, so-called place and grid cell-like representations emerge
because of the relatively uniform distribution of possible inputs in these tasks. The
same mechanism, applied to conceptual tasks where the overall space can be
higher-dimensional and sampling sparser, leads to representations more
aligned with human conceptual knowledge. Although the types of memory
supported by the MTL are superficially dissimilar, the information processing steps
appear shared. Our account suggests that the MTL uses a general-purpose
algorithm to learn and organize context-relevant information in a useful format,
rather than relying on navigation-specific neural circuitry. Spatial maps in the
medial temporal lobe (MTL) have been proposed to map abstract conceptual
knowledge. Rather than grounding abstract knowledge in a spatial map, the
authors propose a general-purpose clustering algorithm that explains how both
spatial (including place and grid cells) and higher-dimensional conceptual
representations arise during learning.
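The clustering account above can be made concrete with a toy sketch. The following Python snippet is an illustrative stand-in, not the authors' model, and its parameters (cluster count, learning rate, arena size) are hypothetical: uniformly sampled 2-D "positions" are streamed through a simple competitive-learning rule, so the cluster centres spread out to tile the space much like localized place-cell tuning regions.

```python
import numpy as np

def online_cluster(points, n_clusters=9, lr=0.1, seed=0):
    """Competitive online clustering: each input nudges its nearest
    cluster centre toward it. A toy stand-in for a domain-general
    clustering learner; n_clusters and lr are arbitrary choices."""
    rng = np.random.default_rng(seed)
    centres = rng.uniform(0.0, 1.0, size=(n_clusters, 2))  # random init in the unit square
    for x in points:
        winner = np.argmin(np.linalg.norm(centres - x, axis=1))
        centres[winner] += lr * (x - centres[winner])       # move the winner toward the input
    return centres

# Uniformly sampled "positions" in a square arena: the centres spread out
# to tile the space, giving localized, place-cell-like tuning regions.
rng = np.random.default_rng(1)
walk = rng.uniform(0.0, 1.0, size=(5000, 2))
centres = online_cluster(walk)
```

With a uniform input distribution the centres tile the arena; with sparser, higher-dimensional sampling the same rule would instead track the sampled regions, which is the contrast the abstract draws between spatial and conceptual domains.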

How the neocortex works is a mystery. In this paper we propose a novel
framework for understanding its function. Grid cells are neurons in the entorhinal
cortex that represent the location of an animal in its environment. Recent
evidence suggests that grid cell-like neurons may also be present in the
neocortex. We propose that grid cells exist throughout the neocortex, in every
region and in every cortical column. They define a location-based framework for
how the neocortex functions. Whereas grid cells in the entorhinal cortex represent
the location of one thing, the body relative to its environment, we propose that
cortical grid cells simultaneously represent the location of many things. Cortical
columns in somatosensory cortex track the location of tactile features relative to
the object being touched and cortical columns in visual cortex track the location of
visual features relative to the object being viewed. We propose that mechanisms
in the entorhinal cortex and hippocampus that evolved for learning the structure of
environments are now used by the neocortex to learn the structure of objects.
Having a representation of location in each cortical column suggests mechanisms
for how the neocortex represents object compositionality and object behaviors. It
leads to the hypothesis that every part of the neocortex learns complete models of
objects and that there are many models of each object distributed throughout the
neocortex. The similarity of circuitry observed in all cortical regions is strong
evidence that even high-level cognitive tasks are learned and represented in a
location-based framework.

A long-standing goal of artificial intelligence is an algorithm that learns, tabula
rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became
the first program to defeat a world champion in the game of Go. The tree search
in AlphaGo evaluated positions and selected moves using deep neural networks.
These neural networks were trained by supervised learning from human expert
moves, and by reinforcement learning from self-play. Here we introduce an
algorithm based solely on reinforcement learning, without human data, guidance
or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a
neural network is trained to predict AlphaGo's own move selections and also the
winner of AlphaGo's games. This neural network improves the strength of the tree
search, resulting in higher quality move selection and stronger self-play in the next
iteration. Starting tabula rasa, our new program AlphaGo Zero achieved
superhuman performance, winning 100-0 against the previously published,
champion-defeating AlphaGo.
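The self-play training signal described above can be illustrated with a deliberately tiny stand-in: random self-play in tic-tac-toe, with an averaging table in place of the neural network and tree search. Everything here is a hypothetical toy, not one of AlphaGo Zero's actual components; it only shows how each visited position gets labeled with the eventual winner of its own game, which is the kind of target the network learns to predict.

```python
import random
from collections import defaultdict

WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
        (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that mark has three in a row, else None."""
    for a, b, c in WINS:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def self_play_game(rng):
    """Play one game of random self-play, recording (position, player to move)."""
    board, player, history = ["."] * 9, "X", []
    w = None
    while w is None:
        history.append(("".join(board), player))
        moves = [i for i, c in enumerate(board) if c == "."]
        if not moves:
            break                       # draw: board full, no winner
        board[rng.choice(moves)] = player
        player = "O" if player == "X" else "X"
        w = winner(board)
    return history, w

def value_table(n_games=3000, seed=0):
    """Average, for every visited position, the eventual outcome from the
    perspective of the player to move: the self-play training target."""
    rng = random.Random(seed)
    totals, counts = defaultdict(float), defaultdict(int)
    for _ in range(n_games):
        history, w = self_play_game(rng)
        for state, player in history:
            z = 0.0 if w is None else (1.0 if w == player else -1.0)
            totals[(state, player)] += z
            counts[(state, player)] += 1
    return {k: totals[k] / counts[k] for k in totals}

V = value_table()
empty = ("." * 9, "X")   # the first player enjoys an advantage under random play
```

In AlphaGo Zero the averaging table is replaced by a deep network that generalizes across positions, and random move selection is replaced by tree search guided by that same network, which is what closes the self-improvement loop.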

The human brain has some capabilities that the brains of other animals lack. It is
to these distinctive capabilities that our species owes its dominant position. Other
animals have stronger muscles or sharper claws, but we have cleverer brains. If
machine brains one day come to surpass human brains in general intelligence,
then this new superintelligence could become very powerful. As the fate of the
gorillas now depends more on us humans than on the gorillas themselves, so the
fate of our species then would come to depend on the actions of the machine
superintelligence. But we have one advantage: we get to make the first move. Will
it be possible to construct a seed AI or otherwise to engineer initial conditions so
as to make an intelligence explosion survivable? How could one achieve a
controlled detonation? To get closer to an answer to this question, we must make
our way through a fascinating landscape of topics and considerations. Read the
book and learn about oracles, genies, singletons; about boxing methods,
tripwires, and mind crime; about humanity's cosmic endowment and differential
technological development; indirect normativity, instrumental convergence, whole
brain emulation and technology couplings; Malthusian economics and dystopian
evolution; artificial intelligence, and biological cognitive enhancement, and
collective intelligence.

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state of the art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
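Of the techniques named in this abstract, dropout is easy to sketch in isolation. The snippet below shows the now-standard "inverted" formulation (the original paper instead rescaled activations at test time); the batch shape and keep probability are illustrative.

```python
import numpy as np

def dropout(x, p=0.5, train=True, rng=None):
    """Inverted dropout: during training, zero each activation with
    probability p and rescale survivors by 1/(1-p), so the expected
    activation matches test time, when the layer is the identity."""
    if not train or p == 0.0:
        return x
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= p       # keep each unit with probability 1-p
    return x * mask / (1.0 - p)

acts = np.ones((4, 8))                    # a stand-in batch of activations
dropped = dropout(acts, p=0.5)            # entries are now either 0.0 or 2.0
```

Because each unit is randomly silenced, no unit can rely on the presence of any other, which is why dropout reduces the co-adaptation that drives overfitting in large fully-connected layers.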

There is a popular belief in neuroscience that we are primarily data limited, and
that producing large, multimodal, and complex datasets will, with the help of
advanced data analysis algorithms, lead to fundamental insights into the way the
brain processes information. These datasets do not yet exist, and if they did we
would have no way of evaluating whether or not the algorithmically-generated
insights were sufficient or even correct. To address this, here we take a classical
microprocessor as a model organism, and use our ability to perform arbitrary
experiments on it to see if popular data analysis methods from neuroscience can
elucidate the way it processes information. Microprocessors are among those
artificial information processing systems that are both complex and that we
understand at all levels, from the overall logical flow, via logical gates, to the
dynamics of transistors. We show that the approaches reveal interesting structure
in the data but do not meaningfully describe the hierarchy of information
processing in the microprocessor. This suggests current analytic approaches in
neuroscience may fall short of producing meaningful understanding of neural
systems, regardless of the amount of data. Additionally, we argue for scientists
using complex non-linear dynamical systems with known ground truth, such as
the microprocessor, as a validation platform for time-series and structure
discovery methods.
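The validation idea can be miniaturized even further than a microprocessor. Below, a binary counter serves as a fully known ground-truth system, and a common population-analysis step (pairwise correlation of "units") is applied to it. The system and the choice of analysis are illustrative, not taken from the paper; the point they echo is that the correlations come out near zero between bits even though bit 0 causally drives the carries.

```python
import numpy as np

def counter_trace(n_steps=256, n_bits=3):
    """Ground-truth 'organism': a binary counter. Every mechanism is known,
    so we can judge exactly what an analysis method does and does not recover."""
    return np.array([[(t >> b) & 1 for b in range(n_bits)]
                     for t in range(n_steps)])

trace = counter_trace()              # shape (time, units)

# A common population-analysis step: pairwise correlation between "units".
corr = np.corrcoef(trace.T)

# The off-diagonal correlations are essentially zero, although bit 0
# causally drives bit 1 (and so on) through the carry logic: correlational
# structure alone misses the information processing we know is there.
```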

The game of Go has long been viewed as the most challenging of classic games
for artificial intelligence owing to its enormous search space and the difficulty of
evaluating board positions and moves. Here we introduce a new approach to
computer Go that uses ‘value networks’ to evaluate board positions and ‘policy
networks’ to select moves. These deep neural networks are trained by a novel
combination of supervised learning from human expert games, and reinforcement
learning from games of self-play. Without any lookahead search, the neural
networks play Go at the level of state-of-the-art Monte Carlo tree search programs
that simulate thousands of random games of self-play. We also introduce a new
search algorithm that combines Monte Carlo simulation with value and policy
networks. Using this search algorithm, our program AlphaGo achieved a 99.8%
winning rate against other Go programs, and defeated the human European Go
champion by 5 games to 0. This is the first time that a computer program has
defeated a human professional player in the full-sized game of Go, a feat
previously thought to be at least a decade away.

Neocortical neurons have thousands of excitatory synapses. It is a mystery how
neurons integrate the input from so many synapses and what kind of large-scale
network behavior this enables. It has been previously proposed that non-linear
properties of dendrites enable neurons to recognize multiple patterns. In this
paper we extend this idea by showing that a neuron with several thousand
synapses arranged along active dendrites can learn to accurately and robustly
recognize hundreds of unique patterns of cellular activity, even in the presence of
large amounts of noise and pattern variation. We then propose a neuron model
where some of the patterns recognized by a neuron lead to action potentials and
define the classic receptive field of the neuron, whereas the majority of the
patterns recognized by a neuron act as predictions by slightly depolarizing the
neuron without immediately generating an action potential. We then present a
network model based on neurons with these properties and show that the network
learns a robust model of time-based sequences. Given the similarity of excitatory
neurons throughout the neocortex and the importance of sequence memory in
inference and behavior, we propose that this form of sequence memory is a
universal property of neocortical tissue. We further propose that cellular layers in
the neocortex implement variations of the same sequence memory algorithm to
achieve different aspects of inference and behavior. The neuron and network
models we introduce are robust over a wide range of parameters as long as the
network uses a sparse distributed code of cellular activations. The sequence
capacity of the network scales linearly with the number of synapses on each
neuron. Thus neurons need thousands of synapses to learn the many temporal
patterns in sensory stimuli and motor sequences.
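The pattern-recognition property of such a neuron can be sketched in a few lines of Python: a "segment" subsamples the cells active in one sparse pattern and matches whenever enough of its synapses see activity, so recognition survives noise and dropped cells. The cell counts and threshold below are hypothetical, chosen only to sit in the same sparse regime the paper describes.

```python
import numpy as np

def make_segment(pattern, n_synapses=20, rng=None):
    """A dendritic segment connects to a random subsample of the cells
    active in one sparse pattern (hypothetical sizes, same sparse regime)."""
    rng = rng or np.random.default_rng(0)
    return set(rng.choice(pattern, size=n_synapses, replace=False).tolist())

def segment_matches(segment, active_cells, threshold=10):
    """An NMDA-spike-like event: the segment matches when enough of its
    synapses see active cells, so recognition tolerates noise and loss."""
    return len(segment & set(active_cells)) >= threshold

rng = np.random.default_rng(42)
n_cells = 2048
pattern = rng.choice(n_cells, size=40, replace=False)    # 40 of 2048 cells active
segment = make_segment(pattern, rng=rng)

degraded = pattern[:30]                                  # a quarter of the active cells lost
unrelated = rng.choice(n_cells, size=40, replace=False)  # a different sparse pattern
```

Because patterns are sparse, an unrelated pattern almost never overlaps the segment's synapses by chance, while a noisy version of the stored pattern still clears the threshold; a neuron with many such segments can therefore recognize many patterns robustly.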

Adaptation is the ability of an intelligent machine to update its knowledge
according to the actual situation. Self-learning machines (SLMs), as defined in
this paper, are those
learning by observation under limited supervision that continuously adapt by
observing the surrounding environment. The aim is to mimic the behavior of the
human brain, which learns from its surroundings with limited supervision and
adapts its learning according to input sensory observations. Recently, Deep Belief
Networks have made good use of unsupervised learning as a pre-training stage,
which is equivalent to the observation stage in humans. However, they still need a
supervised training set to adjust the network parameters, and they remain
non-adaptive to real-world examples. In this paper, an SLM is proposed based on
deep belief networks and deep auto-encoders that adapts to real-world
unsupervised data flowing into the learning machine during operation. As a proof
of concept, the new system is tested on two AI tasks: digit recognition on the
MNIST dataset and e-mail classification on the Enron dataset.

Generative Adversarial Networks (GANs) are one of the most important research
avenues in the field of artificial intelligence, and its outstanding data generation
capacity has received wide attention. In this paper, we present the recent progress
on GANs. First, the basic theory of GANs and the differences among the
generative models of recent years are analyzed and summarized. Second, the
derived models of GANs are classified and introduced one by one. Third, the
training tricks and evaluation metrics are given. Fourth, the applications of GANs
are introduced. Finally, the problems we need to address and future directions are
discussed.

In this study we investigate information processing in deep neural network models.
We demonstrate that unsupervised training of autoencoder models of a certain
class can result in the emergence of a compact and structured internal
representation of the input data space that can be correlated with higher-level
categories. We propose and demonstrate the practical possibility to detect and
measure this emergent information structure by applying unsupervised clustering
in the activation space of the focal hidden layer of the model. Based on our
findings we propose a new approach to training neural network models, built on
the information landscape that emerges during unsupervised training, that is
iterative, driven by the environment, requires minimal supervision, and shows
intriguing similarities to learning in biological systems. We demonstrate its viability
with an originally developed method of spontaneous concept learning that yields
good classification results while learning new higher-level concepts from very
small amounts of supervised training data.
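The pipeline described above, unsupervised training followed by unsupervised clustering in the activation space of a hidden layer, can be sketched end-to-end on synthetic data. The snippet uses a tiny linear autoencoder and 2-means clustering as stand-ins (the study's models and data are different; all sizes and rates here are hypothetical), and checks that the clusters emerging in the hidden layer line up with the underlying categories.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic input space: two well-separated 10-D categories, 100 points each
# (purely illustrative data; the study's datasets are different).
a = rng.normal(0.0, 0.5, size=(100, 10)); a[:, 0] += 5.0
b = rng.normal(0.0, 0.5, size=(100, 10)); b[:, 0] -= 5.0
x = np.vstack([a, b])

# Unsupervised stage: a tiny linear autoencoder (10 -> 2 -> 10) trained by
# plain gradient descent on the reconstruction error.
w_enc = rng.normal(0.0, 0.1, size=(10, 2))
w_dec = rng.normal(0.0, 0.1, size=(2, 10))
lr = 0.005
for _ in range(400):
    h = x @ w_enc                          # hidden-layer activations
    err = h @ w_dec - x                    # reconstruction error
    w_dec -= lr * 2.0 * h.T @ err / len(x)
    w_enc -= lr * 2.0 * x.T @ (err @ w_dec.T) / len(x)

# Unsupervised clustering (2-means) in the hidden activation space.
codes = x @ w_enc
centres = np.stack([codes[0], codes[-1]])  # init with one code from each end
for _ in range(20):
    d = ((codes[:, None, :] - centres[None]) ** 2).sum(-1)
    labels = d.argmin(axis=1)
    centres = np.stack([codes[labels == k].mean(axis=0) for k in range(2)])

# The emergent clusters line up with the hidden categories.
purity = max((labels[:100] == 0).mean() + (labels[100:] == 1).mean(),
             (labels[:100] == 1).mean() + (labels[100:] == 0).mean()) / 2
```

A high purity means the hidden layer organized the inputs by category without ever seeing a label, which is the precondition for attaching concepts with only a handful of supervised examples.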

Deep learning is currently playing a crucial role toward higher levels of artificial
intelligence. This paradigm allows neural networks to learn complex and abstract
representations that are progressively obtained by combining simpler ones.
Nevertheless, the internal "black-box" representations automatically discovered by
current neural architectures often suffer from a lack of interpretability, making the
study of explainable machine learning techniques of primary interest. This paper
summarizes our recent efforts to develop a more interpretable neural model for
directly processing speech from the raw waveform. In particular, we propose
SincNet, a novel Convolutional Neural Network (CNN) that encourages the first
layer to discover more meaningful filters by exploiting parametrized sinc functions.
In contrast to standard CNNs, which learn all the elements of each filter, only the
low and high cutoff frequencies of band-pass filters are directly learned from data.
This inductive bias offers a very compact way to derive a customized filter-bank
front-end that depends only on a few parameters with a clear physical meaning.
Our experiments, conducted on both speaker and speech recognition, show that
the proposed architecture converges faster, performs better, and is more
interpretable than standard CNNs.
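SincNet's key constraint, a first layer parametrized only by per-filter cutoff frequencies, can be illustrated outside any network. The function below builds a band-pass FIR filter as the difference of two sinc low-pass filters from just two numbers; it is a sketch of the idea, since the actual SincNet layer also applies a window and learns the cutoffs by backpropagation. The cutoffs used here are illustrative.

```python
import numpy as np

def sinc_bandpass(f_low, f_high, width=101):
    """Band-pass FIR filter defined by only two numbers, the cutoff
    frequencies (as fractions of the sampling rate), built as the
    difference of two sinc low-pass filters."""
    n = np.arange(width) - (width - 1) / 2
    low = 2.0 * f_low * np.sinc(2.0 * f_low * n)     # low-pass at f_low
    high = 2.0 * f_high * np.sinc(2.0 * f_high * n)  # low-pass at f_high
    return high - low                                 # passes f_low..f_high

h = sinc_bandpass(0.1, 0.3)                 # illustrative cutoffs
H = np.abs(np.fft.rfft(h, 1024))            # magnitude response
freqs = np.fft.rfftfreq(1024)
pass_gain = H[np.argmin(np.abs(freqs - 0.2))]   # inside the band
stop_gain = H[np.argmin(np.abs(freqs - 0.45))]  # well outside it
```

Two parameters per filter, instead of one weight per tap, is what makes the learned front-end compact and directly readable as a bank of band-pass filters.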
