Deep Learning: A Visual Approach (Preview)
1
MACHINE LEARNING
Expert Systems
Before deep learning became practical on a widespread basis, a popular approach to learning from data involved creating expert systems. Still used today, these are computer programs intended to encapsulate the thought processes of human experts such as doctors, engineers, and even musicians. The idea is to study a human expert at work, watch what they do and how they do it, and perhaps ask them to describe their process out loud. We capture that thinking and behavior with a set of rules. The hope is that a computer could then do the expert’s job just by following those rules.
These kinds of systems can work well once they’re built, but they’re difficult to create and maintain. It’s worth taking a moment to see why. The problem is that the key step of producing the rules, called feature engineering, can require impractical amounts of human intervention and ingenuity. Part of deep learning’s success is that it addresses exactly this problem by creating the rules algorithmically.
Let’s illustrate the problem faced by expert systems with a practical
example: recognizing digits. Let’s say that we want to teach a computer
to recognize the number 7. By talking to people and asking questions, we
might come up with a set of three small rules that let us distinguish a 7
from all other digits: first, 7s have a mostly horizontal line near the top of
the figure; second, they have a mostly northeast-southwest diagonal line;
and third, those two lines meet in the upper right. The rules are illustrated
in Figure 1-1.
This might work well enough until we get a 7 like Figure 1-2.
Figure 1-1: Top: A handwritten 7. Bottom: A set of rules for distinguishing a handwritten 7 from other digits (Digit 7 = horizontal line + NE-SW diagonal + lines meet at upper right).
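To make the fragility of such rules concrete, here is a tiny rule-based sketch of the three rules above. Everything in it (the stroke representation, the angle and position thresholds, the coordinates) is an invented illustration, not code from this book:

```python
import math

# Hypothetical stroke representation: each stroke has an angle in degrees
# (0 = horizontal), plus "start", "mid", and "end" points in (x, y) form,
# where x runs 0..1 left to right and y runs 0..1 top to bottom.

def looks_like_seven(strokes):
    """Apply the three hand-written rules for recognizing a 7."""
    # Rule 1: a mostly horizontal line near the top of the figure.
    horizontals = [s for s in strokes
                   if abs(s["angle"]) < 20 and s["mid"][1] < 0.25]
    # Rule 2: a mostly northeast-southwest diagonal line.
    diagonals = [s for s in strokes if 30 < s["angle"] < 80]
    if not horizontals or not diagonals:
        return False
    # Rule 3: the two lines meet in the upper right.
    hx, hy = horizontals[0]["end"]    # right end of the horizontal stroke
    dx, dy = diagonals[0]["start"]    # top end of the diagonal stroke
    return math.hypot(hx - dx, hy - dy) < 0.1 and hx > 0.5 and hy < 0.3

# A clean, textbook-style 7: a top bar and a diagonal meeting at upper right.
seven = [
    {"angle": 0,  "start": (0.1, 0.1), "mid": (0.5, 0.1), "end": (0.9, 0.1)},
    {"angle": 60, "start": (0.9, 0.1), "mid": (0.6, 0.5), "end": (0.3, 0.9)},
]
print(looks_like_seven(seven))  # True: all three rules are satisfied
```

A 7 written with a crossbar, or with its top line sloping downward, would fail these checks, which is exactly the maintenance problem described above: every stylistic variation demands another hand-tuned rule.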
Supervised Learning
We’ll first consider supervised learning. Here, the word supervised is a synonym for “labeled.” In supervised learning, we typically give the computer pairs of values: an item drawn from a dataset, and a label that we’ve assigned to that item.
For example, we might be training a system called an image classifier, with the goal of having it tell us what object is most prominent in a photograph. To train this system, we’d give it a collection of images, and accompany each image with a label describing the most prominent object. So, for example, we might give the computer a picture of a tiger and a label consisting of the word tiger.
This idea can be extended to any kind of input. Suppose that we have a
few cookbooks full of recipes that we’ve tried out, and we’ve kept records on
how much we liked each dish. In this case, the recipe would be the input,
and our rating of it would be that recipe’s label. After training a program
on all of our cookbooks, we could give our trained system a new recipe, and
it could predict how much we’d enjoy eating the result. Generally speaking,
the better we’re able to train the system (usually by providing more pieces
of training data), the better its prediction will be.
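As a rough sketch of what learning from labeled pairs can look like, here is a toy one-nearest-neighbor predictor. The recipe "features" (sweetness, spiciness, richness on a 0-to-1 scale) and the ratings are invented for illustration; this is one simple supervised method among many, not the book's own:

```python
# Supervised learning from (input, label) pairs, sketched with a
# one-nearest-neighbor rule: predict the label of the most similar
# training input.

def predict(training_pairs, query):
    """Return the label of the training input closest to the query."""
    def dist(a, b):
        # Squared Euclidean distance between two feature tuples.
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best_input, best_label = min(training_pairs,
                                 key=lambda pair: dist(pair[0], query))
    return best_label

# Each pair is (input, label): here, (recipe profile, our rating of it).
cookbook = [
    ((0.9, 0.1, 0.8), "loved it"),
    ((0.2, 0.9, 0.3), "too spicy"),
    ((0.5, 0.2, 0.2), "just okay"),
]
new_recipe = (0.8, 0.2, 0.7)  # sweet and rich, barely spicy
print(predict(cookbook, new_recipe))  # → loved it
```

With only three training pairs the predictions are crude; adding more labeled recipes gives the system more neighbors to compare against, which is the sense in which more training data usually improves the prediction.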
Regardless of the type of data, by giving the computer an enormous
number of pairs of inputs and labels, a successful system designed for the
task will gradually discover enough rules or patterns from the inputs that
it will be able to correctly predict each provided label. That is, as a result of
this training, the system has learned what to measure in each input so that
it can identify which of its learned labels it should return. When it gets the
right answer frequently enough for our needs, we say that the system has
been trained.
Keep in mind that the computer has no sense of what a recipe actually
is, or how things taste. It’s just using the data in the input to find the closest
matching label, using the rules it learned during training.
Figure 1-3 shows the results of giving four photographs to a trained
image classifier.
These photos were found on the web, and the system had never seen
them before. In response to each image, the classifier tells us the likelihood
for each of the 1,000 labels it was trained to recognize. Here we show the
top five predictions for each photo, with their associated probabilities.
The picture in the upper left of Figure 1-3 is a bunch of bananas, so
ideally we’d like to get back a label like bunch of bananas. But this particular
classifier wasn’t trained on any images labeled bunch of bananas. The algorithm can only return one of the labels it was trained on, in the same way
that we can only identify objects by the words we know. The closest match
it could find from the labels it was trained on was just banana, so that’s the
label it returned to us.
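The "top five predictions" in Figure 1-3 come from a simple final step: the classifier produces one probability per label it knows, and we report the few largest. Here is a minimal sketch of that step; the labels and probabilities are made up for illustration:

```python
# Turn a classifier's per-label probabilities into a top-k report.

def top_k(probs, labels, k=5):
    """Return the k (probability, label) pairs with the highest scores."""
    ranked = sorted(zip(probs, labels), reverse=True)
    return ranked[:k]

# A real classifier of this kind would have 1,000 labels; six suffice here.
labels = ["banana", "zucchini", "slug", "coral fungus", "pineapple", "tabby"]
probs  = [0.83, 0.06, 0.04, 0.03, 0.02, 0.01]

for p, name in top_k(probs, labels):
    print(f"{name}: {p:.0%}")   # banana comes first, at 83%
```

Note that the function can only ever hand back names from `labels`: a photo of a bunch of bananas must come out as one of these six words, just as the 1,000-label classifier had to settle for banana.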
Figure 1-3: Four images and their predicted labels, with probabilities, from a deep learning classifier. Top five labels per image: (1) banana, zucchini, slug, coral fungus, pineapple; (2) tabby, tiger cat, Egyptian cat, remote control, mouse; (3) ear, corn, banana, American lobster, knot; (4) reflex camera, lens cap, binoculars, Polaroid camera, tripod.