ConvNet1

The document outlines the course content for COMP 3340, focusing on Convolutional Neural Networks (CNN) for image classification. It includes key notations, background information on image representation, and the significance of CNNs in various computer vision applications. Additionally, it discusses benchmarks like ImageNet and foundational concepts such as regularization and neural network operations.

Uploaded by

Yat Kiu Wong
Copyright
© All Rights Reserved
COMP 3340
Convolutional Neural Network
for Image Classification

COMP3340 Applied Deep Learning


2.1 Notations and Outline

• CV: Computer Vision


• NN: Neural Networks
• ReLU: Rectified Linear Unit
• SGD: Stochastic Gradient Descent
• FP: Forward Propagation
• BP: Backward Propagation
• CNN: Convolutional Neural Networks
• BN: Batch Normalization

2.1 Notations and Outline
2.1 Notations and Outline
2.2 Introduction
2.3 Broader Impact
2.4 Benchmark
2.5 Basic Knowledge for Deep Learning
2.6 CNN Operators
2.7 CNN Architectures
2.8 Reference

Three questions for quiz and final.

1. L1 and L2 Regularization analysis


2. Shuffle conv implementation
3. FLOPs/Parameters computation
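
For the third topic, a minimal sketch of how parameters and multiply-accumulate operations (MACs) are usually counted for a single convolution layer. The function names and the 3-to-64-channel, 3x3, 32x32-output layer are assumptions chosen for illustration, not taken from the slides. Conventions differ on whether one MAC counts as one or two FLOPs, so the sketch reports MACs.

```python
def conv2d_params(c_in, c_out, k, bias=True, groups=1):
    """Learnable parameters of a k x k convolution layer."""
    weights = (c_in // groups) * k * k * c_out
    return weights + (c_out if bias else 0)


def conv2d_macs(c_in, c_out, k, h_out, w_out, groups=1):
    """Multiply-accumulates for one forward pass (bias additions ignored)."""
    return (c_in // groups) * k * k * c_out * h_out * w_out


# Hypothetical layer: 3 input channels, 64 output channels, 3x3 kernel,
# producing a 32x32 output feature map.
params = conv2d_params(3, 64, 3)       # 3*3*3*64 weights + 64 biases
macs = conv2d_macs(3, 64, 3, 32, 32)   # weights per output position * positions
print(params, macs)
```

Setting `groups` greater than 1 divides the per-filter input channels, which is why grouped convolutions reduce both counts.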

Where is CNN?

60% Of Human Brain Is “Visual”

The visual data domain is the most important data domain.


2.2.2 Background

• A computer stores an image as a grid of pixels, where each pixel is the smallest unit of the image.
• In the RGB (Red, Green, Blue) color model, each pixel's color is a combination of these three primary colors.
• Channel: each channel value ranges from 0 to 255. (255, 0, 0) is red, (0, 255, 0) is green, and (0, 0, 255) is blue. A combination of these, like (255, 255, 0), gives yellow.
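
As a small illustration of the pixel grid and channel values described above, the sketch below stores a 2x2 RGB image as nested Python lists. The variable names and the 2x2 size are assumptions made for this example; real code would typically use an array library.

```python
# Each pixel is an (R, G, B) tuple with values in 0..255.
RED, GREEN, BLUE, YELLOW = (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)

# The image is a grid: a list of rows, each row a list of pixels.
image = [
    [RED, GREEN],
    [BLUE, YELLOW],
]

height = len(image)
width = len(image[0])
channels = len(image[0][0])


def channel(img, index):
    """Extract one color channel as a 2-D grid of intensities."""
    return [[pixel[index] for pixel in row] for row in img]


red_channel = channel(image, 0)
print(height, width, channels)  # 2 2 3
print(red_channel)              # [[255, 0], [0, 255]]
```

Yellow contributes to the red channel because it is the combination (255, 255, 0).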

[1] https://www.pyimagesearch.com/2021/04/17/image-classification-basics

A picture is worth a thousand words.

2.2.2 Problem definition: ConvNet for Image Classification

• In 2012, Google Brain took an artificial neural network, spread the computation across 16,000 CPU cores, and trained models with more than 1 billion connections.
• While there's no accepted way to compare artificial neural networks to biological brains, as a very rough comparison an adult human brain has around 100 trillion connections, so there is still a lot of room to grow.

2.2.3 ImageNet benchmark

[1] http://cs231n.stanford.edu/slides/2021/lecture_1_feifei.pdf

Teach Computer to See

https://www.youtube.com/watch?v=40riCqvRoMs

2.3 Broader Impact

• Basis of various modern CV applications
  • Image classification
  • Semantic segmentation
  • Object detection
  • Video understanding
  • Captioning
  • Relation prediction

• Variants:
  • Low-shot learning
  • Continual learning

• Applications:
  • Face recognition
  • Robotics

Figure: relation prediction and captioning examples.

[1] http://cs231n.stanford.edu/slides/2021/lecture_1_feifei.pdf

2.4 Benchmark

• ImageNet (mentioned before)
• CIFAR-10, CIFAR-100

Process and visualize the CIFAR-10 and CIFAR-100 datasets:

[1] https://notebook.community/corochann/chainer-hands-on-tutorial/src/04_cifar_cnn/cifar10_cifar100_dataset_introduction

PIL stands for Python Imaging Library, and it's the original library that
enabled Python to deal with images.

[-1,+1]
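
The `[-1, +1]` annotation most likely refers to rescaling pixel intensities before training; the code it was attached to did not survive extraction. Assuming the common convention of first mapping 0..255 to [0, 1] and then normalizing with mean 0.5 and standard deviation 0.5 (both values are assumptions, not taken from the slide), a minimal sketch is:

```python
def to_unit_range(value):
    """Map an 8-bit intensity in [0, 255] to [0, 1]."""
    return value / 255.0


def normalize(value, mean=0.5, std=0.5):
    """Map [0, 255] to [-1, +1] via (value / 255 - mean) / std."""
    return (to_unit_range(value) - mean) / std


# Assumed example pixel: pure yellow.
pixel = (255, 255, 0)
normalized = tuple(normalize(v) for v in pixel)
print(normalized)  # (1.0, 1.0, -1.0)
```

With these assumed constants, intensity 0 maps to -1 and intensity 255 maps to +1.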

2.5.1 Preliminaries for Image Classification

[1] http://cs231n.stanford.edu/slides/2021/lecture_2.pdf

2.5.1.1 K-Nearest Neighbor
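
The slide code for this section was lost in extraction. As a stand-in, here is a minimal k-nearest-neighbor sketch on toy 2-D points: it assumes L1 distance and majority voting, and the function names and data are hypothetical. It is not the course's implementation and is unrelated to any accuracy figure reported on the slides.

```python
from collections import Counter


def l1_distance(a, b):
    """Sum of absolute coordinate differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=1):
    """Label the query by majority vote among its k nearest training points."""
    order = sorted(range(len(train_x)),
                   key=lambda i: l1_distance(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy training set: two well-separated clusters.
train_x = [[0, 0], [1, 0], [0, 1], [9, 9], [10, 9], [9, 10]]
train_y = ["cat", "cat", "cat", "dog", "dog", "dog"]

print(knn_predict(train_x, train_y, [1, 1], k=3))  # cat
print(knn_predict(train_x, train_y, [8, 8], k=3))  # dog
```

"Training" only stores the data; all the work happens at prediction time, when the query is compared against every stored example.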

Accuracy: 35.39%
Try it yourself [demo]

2.5.1.2 Cross-validation
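
The figures for this section did not survive extraction. A minimal sketch of k-fold cross-validation, assuming the usual scheme in which the data is split into folds and each fold serves once as the validation set (for example, to choose a hyperparameter such as k in k-nearest neighbor). The helper name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs; each fold is the validation set once."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


folds = list(k_fold_indices(10, 5))
for train_idx, val_idx in folds:
    print("validate on", val_idx, "train on", train_idx)
```

In practice one would average the validation score over the folds for each hyperparameter setting and keep the best setting, touching the test set only once at the end.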

2.5.1.3 Disadvantages of k-Nearest Neighbor

2.5.2.2 Linear Classifier

Weight matrix W (10x3072), bias vector b

2.5.2.2 Linear Classifier and Softmax

Wx+b
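
A minimal sketch of the score function Wx + b followed by softmax. The 3-class, 4-input sizes and all numeric values are toy assumptions chosen to keep the example readable; for CIFAR-10 the weight matrix would be 10x3072 with a 10-element bias, as noted in the last lines.

```python
import math


def linear_scores(W, x, b):
    """Compute s = Wx + b, where W has one row of weights per class."""
    return [sum(w * v for w, v in zip(row, x)) + bias
            for row, bias in zip(W, b)]


def softmax(scores):
    """Convert scores to probabilities; subtracting the max avoids overflow."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: 3 classes, 4 input values (all numbers are illustrative).
W = [[0.2, -0.5, 0.1, 2.0],
     [1.5, 1.3, 2.1, 0.0],
     [0.0, 0.25, 0.2, -0.3]]
b = [1.1, 3.2, -1.2]
x = [56, 231, 24, 2]

scores = linear_scores(W, x, b)
probs = softmax(scores)
predicted_class = probs.index(max(probs))

# Parameter count for a 10-class classifier on 32x32x3 = 3072 inputs.
n_params = 10 * 3072 + 10
print(predicted_class, n_params)
```

Softmax does not change which class has the highest score; it only rescales the scores so they are positive and sum to 1.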
2.5.2.4 Neural Network

Example (weighted sum plus bias b1): 1*0.55 + 2*0.54 + 0.45 = 2.08

[Figure: network diagram with weights W11, W12, W21, W22 and biases b1 = 0.45, b2 = 0.89, b3 = 0.96]

[1] http://cs231n.stanford.edu/slides/2021/lecture_4.pdf

More Activation functions
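
The activation-function figure was lost in extraction. The sketch below implements four commonly used activations (sigmoid, tanh, ReLU, and leaky ReLU); which functions the slide actually showed is an assumption, as is the 0.01 slope for leaky ReLU.

```python
import math


def sigmoid(z):
    """Squash z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squash z into (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Rectified Linear Unit: pass positives, zero out negatives."""
    return max(0.0, z)


def leaky_relu(z, slope=0.01):
    """Like ReLU, but negatives keep a small slope instead of becoming 0."""
    return z if z > 0 else slope * z


for z in (-2.0, 0.0, 2.08):
    print(z, sigmoid(z), tanh(z), relu(z), leaky_relu(z))
```

Each function is applied element-wise to a layer's pre-activation values, so a neuron with pre-activation 2.08 would output 2.08 under ReLU.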

Forward Propagation

write the equations above more compactly as:
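
The equations on this slide did not survive extraction. Assuming they follow the cited UFLDL tutorial for a network with one hidden layer, where $W^{(l)}$ and $b^{(l)}$ are the weights and biases of layer $l$, $a^{(l)}$ its activations (with $a^{(1)} = x$), and $f$ the activation function applied element-wise, the compact form is:

```latex
\begin{aligned}
z^{(2)} &= W^{(1)} x + b^{(1)} \\
a^{(2)} &= f\left(z^{(2)}\right) \\
z^{(3)} &= W^{(2)} a^{(2)} + b^{(2)} \\
h_{W,b}(x) &= a^{(3)} = f\left(z^{(3)}\right)
\end{aligned}
\qquad\text{and in general}\qquad
\begin{aligned}
z^{(l+1)} &= W^{(l)} a^{(l)} + b^{(l)} \\
a^{(l+1)} &= f\left(z^{(l+1)}\right)
\end{aligned}
```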

[1] http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks

2.5.2.2 Linear Classifier and Softmax

Visualization

2.5.2.3 Regularization
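
The slide content for this section was lost in extraction. As a hedged illustration of how L1 and L2 penalties behave, the sketch below compares two weight vectors that give the same score on the same input. The vectors and function names are assumptions chosen for the example.

```python
def l1_penalty(weights):
    """L1 regularization term: sum of absolute values."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """L2 regularization term: sum of squares."""
    return sum(w * w for w in weights)


def score(weights, x):
    """Dot product of weights and input."""
    return sum(w * v for w, v in zip(weights, x))


x = [1.0, 1.0, 1.0, 1.0]
w_sparse = [1.0, 0.0, 0.0, 0.0]      # all weight on one input
w_spread = [0.25, 0.25, 0.25, 0.25]  # weight spread evenly

print(score(w_sparse, x), score(w_spread, x))      # 1.0 1.0
print(l1_penalty(w_sparse), l1_penalty(w_spread))  # 1.0 1.0
print(l2_penalty(w_sparse), l2_penalty(w_spread))  # 1.0 0.25
```

Both vectors fit this input equally well and have the same L1 penalty, but L2 is lower for the spread-out weights, which is why L2 regularization tends to favor many small weights over a few large ones.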

[1] http://cs231n.stanford.edu/slides/2021/lecture_3.pdf

2.5.2.4 Neural Network
Backward Propagation

cost function with respect to that single example

cost function with respect to m examples
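
The two cost functions named above were lost in extraction. Assuming the squared-error formulation of the cited UFLDL tutorial, with weight decay parameter $\lambda$, $n_l$ layers, and $s_l$ units in layer $l$ (notation from that tutorial, not from the slide), they read:

```latex
% Cost for a single training example (x, y)
J(W,b;x,y) = \frac{1}{2} \left\| h_{W,b}(x) - y \right\|^2

% Cost over m training examples, with a weight decay (L2) term
J(W,b) = \left[ \frac{1}{m} \sum_{i=1}^{m} J\left(W,b;x^{(i)},y^{(i)}\right) \right]
       + \frac{\lambda}{2} \sum_{l=1}^{n_l-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}}
         \left( W_{ji}^{(l)} \right)^2
```

The first term is the average error over the examples; the second penalizes large weights and is not applied to the bias terms.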

gradient descent

where
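
The update rule and the definitions that accompanied it did not survive extraction. Assuming the slide follows the cited UFLDL tutorial, with learning rate $\alpha$ and weight decay parameter $\lambda$ (notation from that tutorial, not from the slide), one iteration of gradient descent and the required derivatives are:

```latex
% One gradient descent step
W_{ij}^{(l)} := W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W,b)
\qquad
b_{i}^{(l)} := b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W,b)

% Derivatives of the overall cost in terms of per-example derivatives
\frac{\partial}{\partial W_{ij}^{(l)}} J(W,b) =
  \left[ \frac{1}{m} \sum_{k=1}^{m}
  \frac{\partial}{\partial W_{ij}^{(l)}} J\left(W,b;x^{(k)},y^{(k)}\right) \right]
  + \lambda W_{ij}^{(l)}
\qquad
\frac{\partial}{\partial b_{i}^{(l)}} J(W,b) =
  \frac{1}{m} \sum_{k=1}^{m}
  \frac{\partial}{\partial b_{i}^{(l)}} J\left(W,b;x^{(k)},y^{(k)}\right)
```

Backward propagation is the procedure that computes the per-example derivatives on the right-hand side efficiently.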

2.5.2.4 Neural Network (example)
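
The worked example on these slides was lost in extraction. As a stand-in, here is a minimal sketch of forward propagation, backward propagation, and gradient descent for a single sigmoid neuron with squared-error cost and weight decay. The data, learning rate, and weight decay value are all assumptions made for illustration; this is not the course's example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost_and_grads(w, b, data, lam):
    """Mean of 0.5*(h - y)^2 plus (lam/2)*w^2, with analytic gradients."""
    m = len(data)
    cost = grad_w = grad_b = 0.0
    for x, y in data:
        h = sigmoid(w * x + b)           # forward propagation
        delta = (h - y) * h * (1.0 - h)  # backward: d(cost)/dz for this example
        cost += 0.5 * (h - y) ** 2
        grad_w += delta * x              # chain rule: dz/dw = x
        grad_b += delta                  # chain rule: dz/db = 1
    cost = cost / m + 0.5 * lam * w * w
    return cost, grad_w / m + lam * w, grad_b / m


# Toy 1-D data: negative inputs labeled 0, positive inputs labeled 1.
data = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
w, b = 0.0, 0.0
alpha, lam = 0.5, 1e-3  # assumed learning rate and weight decay

history = []
for _ in range(200):
    cost, grad_w, grad_b = cost_and_grads(w, b, data, lam)
    history.append(cost)
    w -= alpha * grad_w  # gradient descent step
    b -= alpha * grad_b

print(history[0], history[-1], w, b)
```

A quick way to check a backpropagation implementation is to compare the analytic gradient with a finite-difference estimate of the cost, which should agree to several decimal places.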

https://www.youtube.com/watch?v=qg4PchTECck