Lecture 19
'Typographic attack': pen and paper fool AI into thinking apple is an iPod
• https://www.youtube.com/watch?v=WJRyu1JUtVw
Things that make it hard
• We don't see these difficulties for many types of structured data (medical records, for example)
Viewpoint invariance
• Each time we look at an object, we have a different viewpoint, unlike most other machine
learning tasks
• Humans are so good at handling viewpoint variation that it's hard to appreciate how difficult it is
• One of the main difficulties in computer vision
• Typical approaches:
• Use redundant invariant features
• Bounding boxes
• Replicated features with pooling "convolutional neurons"
Invariant feature approach
• At test time, try all possible boxes over a range of positions and scales
• This was used often in computer vision ~2015
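The brute-force search over positions and scales can be sketched in plain Python; `score_fn` here is a hypothetical stand-in for a trained classifier (names are mine, not from the lecture):

```python
def sliding_windows(img_w, img_h, box_sizes, stride):
    """Enumerate candidate boxes (x, y, size) at every position and scale."""
    boxes = []
    for s in box_sizes:
        for y in range(0, img_h - s + 1, stride):
            for x in range(0, img_w - s + 1, stride):
                boxes.append((x, y, s))
    return boxes

def best_box(score_fn, img_w, img_h, box_sizes, stride):
    """Return the candidate box that the classifier scores highest."""
    return max(sliding_windows(img_w, img_h, box_sizes, stride), key=score_fn)
```

The cost grows with the number of positions times the number of scales, which is why this approach was eventually displaced by learned detectors.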
Convolutional neural nets
• LeNet 1990s
• Use many different copies of the same feature detector at different
positions
• To constrain w1 = w2, start with equal weights and update both with the
combined gradient ∂E/∂w1 + ∂E/∂w2
• We can thus force backpropagation to use replicated features
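A minimal sketch of the tied-weight update in plain Python (illustrative names, not from the lecture):

```python
def tied_update(w, g1, g2, lr=0.1):
    """One gradient step for two weights constrained to stay equal.
    g1 and g2 are dE/dw1 and dE/dw2; both copies receive the
    combined gradient g1 + g2, so the constraint w1 == w2 is preserved."""
    g = g1 + g2
    return w - lr * g, w - lr * g

w1, w2 = tied_update(1.0, 0.3, 0.5)
```

Because both weights start equal and always receive the same update, backpropagation never breaks the sharing.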
What does replicating the features achieve?
• Equivariant activities: the neural activities in the next layer are not invariant to
translation, but they are equivariant
• Reduces the number of inputs to the next layer, which means we can learn more
features
• Problem: after several levels of this pooling, we lose information about the
precise location of the object (that's fine for kilns, for example)
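A toy 1-D illustration of both points, as a sketch in plain Python (names are mine): convolution output shifts along with the input (equivariance), while max pooling keeps the response but discards its position.

```python
def conv1d(x, k):
    """'Valid' 1-D cross-correlation of signal x with kernel k."""
    n = len(k)
    return [sum(x[i + j] * k[j] for j in range(n)) for i in range(len(x) - n + 1)]

edge = [1, -1]                      # a simple edge-detecting kernel
a = [0, 0, 1, 0, 0, 0]              # "object" at position 2
b = [0, 0, 0, 0, 1, 0]              # same object shifted right by 2

fa, fb = conv1d(a, edge), conv1d(b, edge)
# fa's response pattern reappears in fb, shifted right by 2 (equivariance);
# max(fa) == max(fb), so after global max pooling the location is gone
```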
LeNet5
• Yann LeCun and collaborators developed the first good recognizer for
handwritten digits, using a feedforward network trained with backpropagation
• Many hidden layers, many maps of replicated units, pooling between layers.
Did not require segmentation
• Was deployed by the USPS, handling ~10% of zip code reading in the USA in the early 2000s
LeNet5 in TensorFlow
• medium.com/@mgazar
# Requires TensorFlow 2.x
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
# Two convolution + average-pooling stages extract replicated features
model.add(layers.Conv2D(filters=6, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 1)))
model.add(layers.AveragePooling2D())
model.add(layers.Conv2D(filters=16, kernel_size=(3, 3), activation='relu'))
model.add(layers.AveragePooling2D())
model.add(layers.Flatten())
# Fully connected classifier head ending in a 10-way softmax (digits 0-9)
model.add(layers.Dense(units=120, activation='relu'))
model.add(layers.Dense(units=84, activation='relu'))
model.add(layers.Dense(units=10, activation='softmax'))
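As a sanity check on the snippet above, the layer output shapes can be traced by hand, assuming the Keras defaults ('valid' convolutions, 2×2 stride-2 pooling):

```python
def conv_out(n, k):
    """'Valid' convolution output size (Keras default padding)."""
    return n - k + 1

def pool_out(n, p=2):
    """Pooling output size for p x p windows with stride p (Keras default)."""
    return n // p

n = 32                   # input is 32x32x1
n = conv_out(n, 3)       # Conv2D(6, 3x3)    -> 30x30x6
n = pool_out(n)          # AveragePooling2D  -> 15x15x6
n = conv_out(n, 3)       # Conv2D(16, 3x3)   -> 13x13x16
n = pool_out(n)          # AveragePooling2D  -> 6x6x16
flat = n * n * 16        # Flatten           -> 576 inputs to Dense(120)
```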
Prior knowledge in machine learning
• Data augmentation
• Subsample & transform training images (AugMix, Hendrycks et al. 2019)
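A toy version of the idea, restricted to horizontal translation (AugMix itself composes and mixes several richer operations; this sketch is not its implementation, and the names are mine):

```python
import random

def shift(img, dx):
    """Translate a 2-D image (list of rows) horizontally by dx, zero-padding."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if 0 <= x - dx < w:
                out[y][x] = img[y][x - dx]
    return out

def augment(img, max_shift=2):
    """Produce a randomly translated training copy of img."""
    return shift(img, random.randint(-max_shift, max_shift))
```

Each epoch then sees slightly different copies of the same images, building translation tolerance into the training data rather than the architecture.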
Thank you