Introducing deep learning and the PyTorch Library: This chapter covers
The poorly defined term artificial intelligence covers a set of disciplines that have
been subjected to a tremendous amount of research, scrutiny, confusion, fantastical hype, and sci-fi fearmongering. Reality is, of course, far more sanguine. It would
be disingenuous to assert that today’s machines are learning to “think” in any
human sense of the word. Rather, we’ve discovered a general class of algorithms
that are able to approximate complicated, nonlinear processes very, very effectively,
which we can use to automate tasks that were previously limited to humans.
For example, at https://talktotransformer.com, a language model called GPT-2
can generate coherent paragraphs of text one word at a time. When we fed it this very
paragraph, it produced the following:
Next we’re going to feed in a list of phrases from a corpus of email addresses, and see if the
program can parse the lists as sentences. Again, this is much more complicated and far more
complex than the search at the beginning of this post, but hopefully helps you understand the
basics of constructing sentence structures in various programming languages.
That’s remarkably coherent for a machine, even if there isn’t a well-defined thesis
behind the rambling.
Even more impressively, the ability to perform these formerly human-only tasks is
acquired through examples, rather than encoded by a human as a set of handcrafted
rules. In a way, we’re learning that intelligence is a notion we often conflate with self-
awareness, and self-awareness is definitely not required to successfully carry out these
kinds of tasks. In the end, the question of computer intelligence might not even be
important. Edsger W. Dijkstra found that the question of whether machines could
think was “about as relevant as the question of whether submarines can swim.”1
That general class of algorithms we’re talking about falls under the AI subcategory
of deep learning, which deals with training mathematical entities named deep neural networks by presenting instructive examples. Deep learning uses large amounts of data to
approximate complex functions whose inputs and outputs are far apart, like an input
image and, as output, a line of text describing the input; or a written script as input
and a natural-sounding voice reciting the script as output; or, even more simply, associating an image of a golden retriever with a flag that tells us “Yes, a golden retriever is
present.” This kind of capability allows us to create programs with functionality that
was, until very recently, exclusively the domain of human beings.
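To make the function-approximation framing concrete, here is a minimal sketch, in plain Python, of a tiny one-hidden-layer network mapping a flattened 2x2 “image” to a single yes/no score. Everything here is hypothetical: in deep learning the weights would be learned from labeled examples, whereas these are written by hand just to show the shape of the input-to-flag mapping:

```python
import math

def sigmoid(x):
    # Squashes any real number into (0, 1), so we can read it as a score.
    return 1.0 / (1.0 + math.exp(-x))

# Hand-picked weights (illustrative only, not trained).
W1 = [[0.5, -0.2, 0.1, 0.7],
      [-0.3, 0.8, 0.4, -0.1]]   # 2 hidden units, each reading 4 pixels
b1 = [0.0, 0.1]
W2 = [1.2, -0.7]                 # 1 output unit reading the 2 hidden units
b2 = -0.1

def predict(pixels):
    # Hidden layer: weighted sums passed through a nonlinearity (tanh).
    hidden = [math.tanh(sum(w * p for w, p in zip(row, pixels)) + b)
              for row, b in zip(W1, b1)]
    # Output layer: one score; read > 0.5 as "yes, present".
    return sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)

score = predict([0.9, 0.1, 0.8, 0.2])   # a flattened 2x2 "image"
print(round(score, 3))
```

A real image classifier differs from this sketch mainly in scale: many more layers, millions of weights, and weights found by training rather than invented. The composition of simple weighted sums and nonlinearities is the same.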
1 Edsger W. Dijkstra, “The Threats to Computing Science,” http://mng.bz/nPJ5.
The deep learning revolution
Deep learning, on the other hand, deals with finding such representations auto-
matically, from raw data, in order to successfully perform a task. In the ones versus
zeros example, filters would be refined during training by iteratively looking at pairs
of examples and target labels. This is not to say that feature engineering has no place
with deep learning; we often need to inject some form of prior knowledge in a learning system. However, the ability of a neural network to ingest data and extract useful
representations on the basis of examples is what makes deep learning so powerful.
The focus of deep learning practitioners is not so much on handcrafting those representations, but on operating on a mathematical entity so that it discovers representations from the training data autonomously. Often, these automatically created
features are better than those that are handcrafted! As with many disruptive technolo-
gies, this fact has led to a change in perspective.
On the left side of figure 1.1, we see a practitioner busy defining engineered features and feeding them to a learning algorithm; the results on the task will be as good
as the features the practitioner engineers. On the right, with deep learning, the raw
data is fed to an algorithm that extracts hierarchical features automatically, guided by
the optimization of its own performance on the task; the results will be as good as the
ability of the practitioner to drive the algorithm toward its goal.
[Figure 1.1 depicts two pipelines. Left: data is turned into handcrafted features, which feed a learning machine that produces an outcome. Right (labeled “the paradigm shift”): data feeds a deep learning machine directly, which learns its own representations and produces an outcome.]
Figure 1.1 Deep learning exchanges the need to handcraft features for an increase in data and
computational requirements.
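The left-hand pipeline of figure 1.1 can be sketched in miniature. Below, a handcrafted feature (mean brightness of a tiny image, our own illustrative choice, as are all names and data values) feeds an extremely simple learner that only tunes a threshold. The quality of the result is capped by the quality of the feature, exactly as the figure suggests:

```python
def mean_brightness(pixels):
    # The handcrafted feature: fixed by the practitioner, not learned.
    return sum(pixels) / len(pixels)

# Labeled examples: (flattened image, is_bright).
examples = [([0.9, 0.8, 0.7, 0.9], True),
            ([0.1, 0.2, 0.1, 0.3], False),
            ([0.8, 0.9, 0.9, 0.8], True),
            ([0.2, 0.1, 0.2, 0.2], False)]

# "Training" here just places a threshold between the two classes.
bright = [mean_brightness(img) for img, label in examples if label]
dark = [mean_brightness(img) for img, label in examples if not label]
threshold = (min(bright) + max(dark)) / 2

def classify(pixels):
    return mean_brightness(pixels) > threshold

print(classify([0.7, 0.8, 0.9, 0.8]))   # a new bright image
```

This works only because brightness happens to separate the classes; for a task like recognizing golden retrievers, no such single handcrafted number exists, which is where the right-hand pipeline earns its keep.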
Starting from the right side in figure 1.1, we already get a glimpse of what we need to
execute successful deep learning:
- We need a way to ingest whatever data we have at hand.
- We somehow need to define the deep learning machine.
- We must have an automated way, training, to obtain useful representations and make the machine produce desired outputs.
This leaves us with taking a closer look at this training thing we keep talking about.
During training, we use a criterion, a real-valued function of model outputs and reference data, to provide a numerical score for the discrepancy between the desired and actual output of our model (by convention, a lower score is typically better). Training consists of driving the criterion toward lower and lower scores by incrementally modifying our deep learning machine until it achieves low scores, even on data not seen during training.
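The training loop just described can be sketched in a few lines. The model below is deliberately trivial, a single weight w with output w * x, and the criterion is mean squared error; the gradient is worked out by hand here, whereas PyTorch computes such gradients automatically. The data values are made up for illustration:

```python
# (input, desired output) pairs; the underlying slope is roughly 2.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

def criterion(w):
    # Mean squared error: scores the discrepancy between the model's
    # output w * x and the reference y. Lower is better, per convention.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

w = 0.0      # initial, untrained parameter
lr = 0.02    # learning rate: size of each incremental modification
for step in range(200):
    # Derivative of the criterion with respect to w, derived by hand.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad   # nudge w in the direction that lowers the score

print(round(w, 2))   # converges near 2.04, the best-fit slope
```

Deep learning machines have millions of parameters rather than one, but the loop has exactly this shape: score the output, compute how each parameter should change to lower the score, nudge, repeat.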
take a data source and build out a deep learning project with it, supported by the
excellent official documentation.
Although we stress the practical aspects of building deep learning systems with
PyTorch, we believe that providing an accessible introduction to a foundational deep
learning tool is more than just a way to facilitate the acquisition of new technical skills.
It is a step toward equipping a new generation of scientists, engineers, and practitioners from a wide range of disciplines with working knowledge that will be the backbone of many software projects during the decades to come.
In order to get the most out of this book, you will need two things:
- Some experience programming in Python. We’re not going to pull any punches on that one; you’ll need to be up on Python data types, classes, floating-point numbers, and the like.
- A willingness to dive in and get your hands dirty. We’ll be starting from the basics and building up our working knowledge, and it will be much easier for you to learn if you follow along with us.
Deep Learning with PyTorch is organized in three distinct parts. Part 1 covers the foundations, examining in detail the facilities PyTorch offers to put the sketch of deep learning in figure 1.1 into action with code. Part 2 walks you through an end-to-end project
involving medical imaging: finding and classifying tumors in CT scans, building on
the basic concepts introduced in part 1, and adding more advanced topics. The short
part 3 rounds off the book with a tour of what PyTorch offers for deploying deep
learning models to production.
Deep learning is a huge space. In this book, we will be covering a tiny part of that
space: specifically, using PyTorch for smaller-scope classification and segmentation
projects, with image processing of 2D and 3D datasets used for most of the motivating
examples. This book focuses on practical PyTorch, with the aim of covering enough
ground to allow you to solve real-world machine learning problems, such as in vision,
with deep learning or explore new models as they pop up in research literature. Most,
if not all, of the latest publications related to deep learning research can be found in
the arXiv public preprint repository, hosted at https://arxiv.org.2
2 We also recommend www.arxiv-sanity.com to help organize research papers of interest.