Week 10
Angela Lu
Information Systems, CB
City University of Hong Kong
Recap
• Network visualization
• Today: Computer vision
What is computer vision?
• A field of artificial intelligence (AI) that enables computers
and systems to derive meaningful information from digital
images, videos and other visual inputs — and take actions or
make recommendations based on that information
• The search for fundamental visual features: mapping a 2D image
to a vector or a point
• Applications of reconstruction and recognition
An Interdisciplinary Area
CV Applications
Facial recognition
Augmented reality
Autonomous driving
Deep fake
Artwork with GANs …
Visual Recognition Problems
• Image classification
• Object/action detection
• Image captioning
• Semantic segmentation
• Visual question answering
• Visual instruction navigation
• Scene graph generation
•…
Object Detection
Action Detection
Image Captioning
Semantic Segmentation
Visual Question Answering
Scene Graph
GANs Artwork
Image Classification: A Core Task
cat
How Does the Computer See It?
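To a computer, an image is only a grid of numbers. The following is a minimal sketch, assuming NumPy and a made-up 2x2 RGB image; the pixel values are illustrative and do not come from the lecture.

```python
import numpy as np

# Hypothetical 2x2 RGB image: height x width x 3 channels, values in 0..255.
image = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)

print(image.shape)         # (height, width, channels)
flat = image.reshape(-1)   # the same numbers laid out as one long vector
print(flat.size)           # total count of numbers describing the image
```

A classifier never sees "a cat"; it only sees these numbers, which is why changes in viewpoint or lighting can alter almost every value.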
Other Challenges
• Viewpoint change
• Background
• Illumination
• Occlusion
• Deformation etc…
So how to classify?
K-nearest Neighbor Classifier
• K-Nearest Neighbors
• Like nearest neighbor, but instead of using the single nearest
neighbor, it takes a majority vote among multiple (K) nearest neighbors
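The majority vote can be sketched as follows. This is a minimal illustration, assuming NumPy, a sum-of-absolute-differences distance, and a made-up toy dataset; the function name `knn_predict` is hypothetical.

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    # Distance from the query to every training example (sum of absolute differences).
    distances = np.abs(train_x - query).sum(axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data: each row stands in for a flattened image with two pixel values.
train_x = np.array([[0, 0], [1, 1], [0, 1], [9, 9], [10, 10]])
train_y = ["cat", "cat", "cat", "fish", "fish"]

print(knn_predict(train_x, train_y, np.array([1, 0]), k=3))
print(knn_predict(train_x, train_y, np.array([9, 10]), k=3))
```

With K = 1 this reduces to the plain nearest-neighbor rule; larger K makes the prediction less sensitive to a single mislabeled or unusual training image.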
K-nearest Neighbor Classifier
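Start with a training set of images and labels, and predict
labels on the test set
[Figure: a test image I1 with unknown label (???) is compared against training images labeled cat, chair, and fish using the distances d(I1, I2), d(I1, I3), and d(I1, I4); which one gives the minimum distance?]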
Distance Metric
• L1 (Manhattan) distance: d(I1, I2) = Σ_p |I1(p) - I2(p)|, the sum of absolute differences over all pixels p
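A small worked example of the L1 distance, assuming NumPy and two made-up 2x2 grayscale images (the helper name `l1_distance` is hypothetical):

```python
import numpy as np

def l1_distance(img_a, img_b):
    """Sum of absolute pixel-wise differences between two same-shaped images."""
    # Cast to a signed type first so uint8 subtraction cannot wrap around.
    a = img_a.astype(np.int64)
    b = img_b.astype(np.int64)
    return int(np.abs(a - b).sum())

i1 = np.array([[56, 32], [90, 23]], dtype=np.uint8)
i2 = np.array([[10, 20], [8, 128]], dtype=np.uint8)

# |56-10| + |32-20| + |90-8| + |23-128| = 46 + 12 + 82 + 105
print(l1_distance(i1, i2))
```

The cast matters in practice: image arrays are often stored as unsigned 8-bit integers, and subtracting them directly would wrap around instead of going negative.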
K-nearest Neighbor Classifier
• Predicts labels from the K nearest training examples,
given a distance metric and a value of K
• But how to choose K?
• E.g. cross-validation over the value of K
• (It seems that K ≈ 12 works best for this data)
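Choosing K by cross-validation might look like the sketch below. It assumes NumPy, 5 folds, an L1 distance, and synthetic two-class data generated only for illustration; it is not the dataset behind the K ≈ 12 observation, and the function names are hypothetical.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k):
    distances = np.abs(train_x - query).sum(axis=1)   # L1 distance to each training point
    nearest = np.argsort(distances)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]                   # majority vote

def cross_validate(x, y, k, num_folds=5):
    """Mean validation accuracy of k-NN over `num_folds` folds."""
    x_folds = np.array_split(x, num_folds)
    y_folds = np.array_split(y, num_folds)
    accuracies = []
    for i in range(num_folds):
        # Fold i is held out for validation; the remaining folds are the training set.
        train_x = np.concatenate(x_folds[:i] + x_folds[i + 1:])
        train_y = np.concatenate(y_folds[:i] + y_folds[i + 1:])
        preds = np.array([knn_predict(train_x, train_y, q, k) for q in x_folds[i]])
        accuracies.append(np.mean(preds == y_folds[i]))
    return float(np.mean(accuracies))

# Synthetic two-class data (an assumption for illustration only).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
order = rng.permutation(len(x))      # shuffle so every fold contains both classes
x, y = x[order], y[order]

scores = {k: cross_validate(x, y, k) for k in (1, 3, 5, 7, 9)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The best K depends on the data, so the value found here says nothing about other datasets; the point is that K is picked on held-out folds rather than on the test set.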
Linear Classifier
f(x, W) = Wx + b
W: weights
x: image pixels
b: bias vector
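The score function f(x, W) = Wx + b is a single matrix-vector product plus a bias. A minimal sketch, assuming NumPy and made-up numbers for a 4-pixel image and 3 classes:

```python
import numpy as np

def linear_scores(x, W, b):
    """Class scores f(x, W) = Wx + b for one flattened image x."""
    return W @ x + b

# Made-up values: 4 pixels, 3 classes (one row of W per class).
x = np.array([1.0, 2.0, 0.0, 1.0])
W = np.array([[0.5, -1.0, 0.0, 2.0],
              [1.0, 0.0, 3.0, -0.5],
              [0.0, 0.5, 1.0, 1.0]])
b = np.array([1.0, 0.0, -1.0])

scores = linear_scores(x, W, b)
print(scores)   # one score per class; the highest score is the predicted class
```

Each row of W acts as a template for one class: the score is large when the image's pixels line up with that row's weights.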
Linear Classifier: Parameters
f(x, W) = Wx + b
x: array of 32x32x3 numbers (3072 numbers), flattened to 3072x1
W: weights, 3x3072
b: bias, 3x1
f(x, W): 3 numbers giving class scores
Classify into 3 classes: cat, dog, panda
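The shapes can be checked with a short sketch. It assumes NumPy and uses a random array as a stand-in for a real 32x32x3 image and random, untrained weights, so the predicted class is meaningless; only the dimensions matter here.

```python
import numpy as np

classes = ["cat", "dog", "panda"]
rng = np.random.default_rng(0)

image = rng.integers(0, 256, size=(32, 32, 3))   # random stand-in for a real image
x = image.reshape(3072, 1).astype(np.float64)    # 3072x1 column of pixel values
W = rng.normal(0, 0.01, size=(3, 3072))          # 3x3072 weights (random, untrained)
b = np.zeros((3, 1))                             # 3x1 bias

scores = W @ x + b                               # (3x3072)(3072x1) + (3x1) = 3x1
predicted = classes[int(np.argmax(scores))]
print(scores.shape, predicted)
```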
Linear Classifier: Example
[Figure: worked example of computing Wx + b from W, x, and b]
Classification Results
Linear Classifier
• f(x, W) = Wx + b
• How to come up with a good W?
• Start with a random W
• Define a loss function that quantifies how badly W fits the training data
• Find the W that minimizes the loss (optimization)
Loss Function
• SVM loss
• Softmax
• Regularization loss
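The slide names the losses without giving formulas, so the sketch below assumes their common forms: a multiclass hinge (SVM) loss with margin 1, softmax cross-entropy, and an L2 penalty on the weights. It uses NumPy and made-up class scores; all function names are hypothetical.

```python
import numpy as np

def svm_loss(scores, correct_class, margin=1.0):
    """Multiclass hinge loss for one example."""
    margins = np.maximum(0.0, scores - scores[correct_class] + margin)
    margins[correct_class] = 0.0   # the correct class is not compared with itself
    return float(margins.sum())

def softmax_loss(scores, correct_class):
    """Cross-entropy of the softmax probabilities for one example."""
    shifted = scores - scores.max()                     # shift for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-log_probs[correct_class])

def l2_regularization(W, strength):
    """Penalty on large weights: strength * sum of squared weights."""
    return float(strength * np.sum(W * W))

scores = np.array([3.2, 5.1, -1.7])   # made-up scores for classes 0, 1, 2
print(svm_loss(scores, correct_class=0))
print(softmax_loss(scores, correct_class=0))
print(l2_regularization(np.array([[1.0, 2.0], [3.0, 4.0]]), strength=0.5))
```

Both data losses are computed on the same scores: the SVM loss becomes zero once the correct class wins by the margin, while the softmax loss keeps decreasing as the correct class's probability approaches 1.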
Loss Function: SVM vs. Softmax
Optimization: Gradient Descent
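Gradient descent repeatedly steps in the direction opposite to the gradient of the loss. A minimal sketch, assuming NumPy, a fixed learning rate, and a toy quadratic loss chosen only because its minimum is known; the names `gradient_descent`, `target`, `loss`, and `grad` are hypothetical.

```python
import numpy as np

def gradient_descent(grad_fn, w0, learning_rate=0.1, num_steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = np.array(w0, dtype=np.float64)
    for _ in range(num_steps):
        w -= learning_rate * grad_fn(w)
    return w

# Toy loss L(w) = ||w - target||^2, whose gradient is 2 (w - target).
target = np.array([3.0, -1.0])

def loss(w):
    return float(np.sum((w - target) ** 2))

def grad(w):
    return 2.0 * (w - target)

w = gradient_descent(grad, w0=[0.0, 0.0])
print(w, loss(w))   # w ends up close to target and the loss close to zero
```

For a classifier, `w` would be the weights W and `grad_fn` the gradient of the training loss; the update rule itself stays the same.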
Stochastic Gradient Descent
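Stochastic gradient descent estimates the gradient from a small random minibatch instead of the full training set. The sketch below assumes NumPy, a softmax loss on a linear classifier, a batch size of 16, a learning rate of 0.1, and synthetic two-class data; all of these are illustrative choices, not values from the lecture.

```python
import numpy as np

def softmax_loss_and_grad(W, b, x, y):
    """Mean softmax loss over a batch, with gradients for W and b.

    x: (N, D) inputs, y: (N,) integer labels, W: (C, D), b: (C,).
    """
    scores = x @ W.T + b
    scores -= scores.max(axis=1, keepdims=True)        # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    n = x.shape[0]
    loss = -np.log(probs[np.arange(n), y]).mean()
    dscores = probs.copy()
    dscores[np.arange(n), y] -= 1.0                     # gradient of loss w.r.t. scores
    dscores /= n
    return loss, dscores.T @ x, dscores.sum(axis=0)

rng = np.random.default_rng(0)
# Synthetic two-class data (illustrative only).
x = np.concatenate([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

W = rng.normal(0, 0.01, (2, 2))   # start with small random weights
b = np.zeros(2)
initial_loss, _, _ = softmax_loss_and_grad(W, b, x, y)

for step in range(200):
    batch = rng.choice(len(x), size=16, replace=False)  # random minibatch
    _, dW, db = softmax_loss_and_grad(W, b, x[batch], y[batch])
    W -= 0.1 * dW
    b -= 0.1 * db

final_loss, _, _ = softmax_loss_and_grad(W, b, x, y)
accuracy = np.mean(np.argmax(x @ W.T + b, axis=1) == y)
print(initial_loss, final_loss, accuracy)
```

Each step touches only 16 examples, so it is much cheaper than a full pass over the data; the price is a noisier gradient estimate.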
Next time…
• Deep learning & neural networks