Traffic Sign Classification Slides

The document discusses image classifiers and convolutional neural networks. Image classifiers can predict the class of items in images after being trained on labeled image data. Convolutional neural networks use kernels to extract features from images and learn representations through multiple convolutional and pooling layers before classification. The document provides examples of using these techniques for classifying fashion items and traffic signs.

Uploaded by pedromaia

PROJECT OVERVIEW

PROJECT OVERVIEW: INTRO TO IMAGE CLASSIFIERS


• Image classifiers work by predicting the class of the items present in a given image.
• For example, you can train a classifier to classify images of cats and dogs.
• When you feed the trained classifier an image of a dog, it can predict the associated label: “label = dog”.
• Let’s take a look at the fashion class dataset.
TARGET CLASSES: 10

[Figure: input images are fed to the classifier, which outputs one of ten classes: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, Ankle boot.]

The Fashion dataset consists of 70,000 images:
• 60,000 training
• 10,000 testing
Images are 28x28 grayscale.
PROJECT OVERVIEW: CLASSIFY TRAFFIC SIGNS
• Traffic sign classification is an important task for self-driving cars.
• In this project, a deep network known as LeNet will be used for traffic sign image classification.
• The dataset contains 43 different classes of images.
• Classes are as listed below:
• (0, 'Speed limit (20km/h)')  (1, 'Speed limit (30km/h)')  (2, 'Speed limit (50km/h)')  (3, 'Speed limit (60km/h)')  (4, 'Speed limit (70km/h)')
• (5, 'Speed limit (80km/h)')  (6, 'End of speed limit (80km/h)')  (7, 'Speed limit (100km/h)')  (8, 'Speed limit (120km/h)')  (9, 'No passing')
• (10, 'No passing for vehicles over 3.5 metric tons')  (11, 'Right-of-way at the next intersection')  (12, 'Priority road')  (13, 'Yield')  (14, 'Stop')
• (15, 'No vehicles')  (16, 'Vehicles over 3.5 metric tons prohibited')  (17, 'No entry')  (18, 'General caution')  (19, 'Dangerous curve to the left')
• (20, 'Dangerous curve to the right')  (21, 'Double curve')  (22, 'Bumpy road')  (23, 'Slippery road')  (24, 'Road narrows on the right')
• (25, 'Road work')  (26, 'Traffic signals')  (27, 'Pedestrians')  (28, 'Children crossing')  (29, 'Bicycles crossing')
• (30, 'Beware of ice/snow')  (31, 'Wild animals crossing')  (32, 'End of all speed and passing limits')  (33, 'Turn right ahead')
• (34, 'Turn left ahead')  (35, 'Ahead only')  (36, 'Go straight or right')  (37, 'Go straight or left')  (38, 'Keep right')
• (39, 'Keep left')  (40, 'Roundabout mandatory')  (41, 'End of no passing')  (42, 'End of no passing by vehicles over 3.5 metric tons')

Data Source: https://www.kaggle.com/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign


CLASSIFY TRAFFIC SIGNS

• The dataset consists of 43 different classes.


• Images are 32 x 32 pixels

[Figure: a 32x32 input image is fed to the classifier, which outputs one of the target classes, e.g. 20 km/h, 50 km/h, 100 km/h, Stop, Yield.]
WHAT ARE CONVOLUTIONAL NEURAL NETWORKS (CNNS) AND HOW DO THEY LEARN?

CONVOLUTIONAL NEURAL NETWORKS BASICS

• A biological neuron collects signals from input channels called dendrites, processes the information in its nucleus, and then generates an output along a long thin branch called the axon.
• Human learning occurs adaptively by varying the bond strength between these neurons.

[Figure: an artificial neuron with inputs P1, P2, P3, weights W1, W2, W3, bias b, and activation function f.]

n = P1·W1 + P2·W2 + P3·W3 + b

a = f(n)

Photo Credit: https://commons.wikimedia.org/wiki/File:Artificial_neural_network.svg
Photo Credit: https://commons.wikimedia.org/wiki/File:Neuron_Hand-tuned.svg
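The weighted-sum-and-activation step above can be sketched in a few lines of Python (numpy is an assumption, the deck shows only the math, and the input values below are made up for illustration):

```python
import numpy as np

def neuron(p, w, b, f):
    """Artificial neuron: weighted sum of inputs plus bias, passed through
    an activation function f, i.e. n = P1*W1 + P2*W2 + P3*W3 + b, a = f(n)."""
    n = np.dot(p, w) + b
    return f(n)

# Hypothetical inputs, weights, and bias, with a ReLU activation
relu = lambda n: max(0.0, n)
a = neuron([1.0, 0.5, -1.0], [0.2, 0.4, 0.1], b=0.3, f=relu)  # n = 0.6, a = 0.6
```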
CONVOLUTIONAL NEURAL NETWORKS: ENTIRE NETWORK OVERVIEW

[Figure: the full network pipeline: input image, then a convolutional layer (kernels/feature detectors), then a pooling layer (downsampling), then flattening, then a fully connected network that outputs one of the classes: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, Ankle boot.]

Photo Credit: https://commons.wikimedia.org/wiki/File:Artificial_neural_network.svg
FEATURE DETECTORS

• Convolution uses a kernel matrix to scan a given image and apply a filter to obtain a certain effect.
• An image kernel is a matrix used to apply effects such as blurring and sharpening.
• In machine learning, kernels are used for feature extraction, i.e. to select the most important pixels of an image.
• Convolution preserves the spatial relationship between pixels.

[Figure: kernels/feature detectors applied to the input image produce feature maps.]
FEATURE DETECTORS

• Live Convolution: http://setosa.io/ev/image-kernels/


[Figure: a 3x3 feature detector slides over a 5x5 binary image (stride 1); at each position the overlapping entries are multiplied elementwise and summed, producing a 3x3 feature map.]
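A minimal sketch of this sliding-window operation (assuming numpy; the image and the diagonal kernel below are hypothetical, since the slide's exact grids are not recoverable):

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding); at each
    position, multiply overlapping entries elementwise and sum them."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.array([[0, 1, 1, 0, 1],
                  [1, 0, 0, 1, 0],
                  [0, 1, 0, 1, 1],
                  [0, 1, 0, 0, 1],
                  [0, 0, 1, 0, 1]])
kernel = np.eye(3)  # hypothetical 3x3 feature detector (diagonal-line detector)
feature_map = convolve2d_valid(image, kernel)  # 5x5 image, 3x3 kernel -> 3x3 map
```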
WHAT ARE CONVOLUTIONAL NEURAL NETWORKS (CNNS) AND HOW DO THEY LEARN? – PART 2
RELU (RECTIFIED LINEAR UNITS)

• RELU layers are used to add non-linearity to the feature maps.
• They also enhance sparsity, i.e. how scattered the feature maps are.

• The gradient of the RELU does not vanish as we increase x, unlike the sigmoid function.

Feature map (before RELU):     After RELU (negatives set to 0):
 7 10 -5  2  1                  7 10  0  2  1
 1  0  2  3 -6                  1  0  2  3  0
 1 17 -5  0  0                  1 17  0  0  0
 0  1  1  1  0                  0  1  1  1  0
 0  0 -8 12  1                  0  0  0 12  1

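The before/after grids above can be reproduced with numpy's elementwise maximum (numpy is an assumption; the deck itself only shows the grids):

```python
import numpy as np

feature_map = np.array([[ 7, 10, -5,  2,  1],
                        [ 1,  0,  2,  3, -6],
                        [ 1, 17, -5,  0,  0],
                        [ 0,  1,  1,  1,  0],
                        [ 0,  0, -8, 12,  1]])

# RELU: f(x) = max(0, x), applied elementwise, zeroes every negative response
rectified = np.maximum(0, feature_map)
```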


POOLING (DOWNSAMPLING)

• Pooling or downsampling layers are placed after convolutional layers to reduce feature map dimensionality.
• This improves computational efficiency while preserving the features.
• Pooling helps the model generalize by avoiding overfitting: if one of the pixels is shifted, the pooled feature map will still be the same.
• Max pooling works by retaining the maximum feature response within a given window in a feature map.
• Live illustration: http://scs.ryerson.ca/~aharley/vis/conv/flat.html

Feature map (4x4):
1 1 3 4
3 6 2 8
3 9 1 0
1 3 3 4

MAX POOLING (2x2, STRIDE = 2):
6 8
9 4

FLATTENING: 6, 8, 9, 4
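The max pooling and flattening steps above can be sketched as follows (numpy assumed; the 4x4 feature map is the one from the slide):

```python
import numpy as np

def max_pool(fm, size=2, stride=2):
    """Keep the maximum feature response within each size x size window."""
    h = (fm.shape[0] - size) // stride + 1
    w = (fm.shape[1] - size) // stride + 1
    out = np.zeros((h, w), dtype=fm.dtype)
    for i in range(h):
        for j in range(w):
            out[i, j] = fm[i * stride:i * stride + size,
                           j * stride:j * stride + size].max()
    return out

fm = np.array([[1, 1, 3, 4],
               [3, 6, 2, 8],
               [3, 9, 1, 0],
               [1, 3, 3, 4]])
pooled = max_pool(fm)         # [[6, 8], [9, 4]]
flattened = pooled.flatten()  # [6, 8, 9, 4]
```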


HOW TO IMPROVE NETWORK PERFORMANCE?

INCREASE FILTERS/DROPOUT

• Accuracy can be improved by adding more feature detectors/filters or by adding dropout.
• Dropout refers to dropping out units in a neural network.
• Neurons develop co-dependency amongst each other during training.
• Dropout is a regularization technique for reducing overfitting in neural networks.
• It enables training to occur on several architectures of the neural network.

[Figure: using 64 kernels/feature detectors instead of 32.]

Photo Credit: https://fr.m.wikipedia.org/wiki/Fichier:MultiLayerNeuralNetworkBigger_english.png
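A minimal numpy sketch of (inverted) dropout, an assumption on my part since the deck describes dropout only conceptually: during training each unit is zeroed with probability `rate` and survivors are rescaled so the expected activation is unchanged; at inference nothing is dropped.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during
    training and scale survivors by 1/(1-rate); identity at inference."""
    if not training:
        return activations
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

rng = np.random.default_rng(0)
a = np.ones(10)
dropped = dropout(a, rate=0.5, rng=rng)  # each entry is either 0.0 or 2.0
```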


CONFUSION MATRIX

                            TRUE CLASS
                        +                          -
PREDICTIONS   +    TRUE +                     FALSE + (Type I error)
              -    FALSE - (Type II error)    TRUE -
CONFUSION MATRIX

• A confusion matrix is used to describe the performance of a classification model:

o True positives (TP): cases where the classifier predicted TRUE (they have the disease) and the correct class was TRUE (the patient has the disease).

o True negatives (TN): cases where the model predicted FALSE (no disease) and the correct class was FALSE (the patient does not have the disease).

o False positives (FP) (Type I error): the classifier predicted TRUE, but the correct class was FALSE (the patient did not have the disease).

o False negatives (FN) (Type II error): the classifier predicted FALSE (the patient does not have the disease), but they actually do have the disease.
KEY PERFORMANCE INDICATORS (KPI)

o Classification Accuracy = (TP+TN) / (TP + TN + FP + FN)

o Misclassification rate (Error Rate) = (FP + FN) / (TP + TN + FP + FN)

o Precision = TP/Total TRUE Predictions = TP/ (TP+FP) (When model predicted TRUE class, how often
was it right?)

o Recall = TP/ Actual TRUE = TP/ (TP+FN) (when the class was actually TRUE, how often did the
classifier get it right?)
PRECISION Vs. RECALL EXAMPLE

                       TRUE CLASS
                      +          -
PREDICTIONS   +    TP = 1     FP = 1
              -    FN = 8     TN = 90

o Classification Accuracy = (TP+TN) / (TP + TN + FP + FN) = 91%


o Precision = TP/Total TRUE Predictions = TP/ (TP+FP) = 1/2 = 50%
o Recall = TP/ Actual TRUE = TP/ (TP+FN) = 1/9 = 11%
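The KPI formulas and the worked example above can be checked in a few lines of Python (the function name is mine):

```python
def classification_kpis(tp, tn, fp, fn):
    """Standard confusion-matrix KPIs."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total    # (TP+TN) / (TP+TN+FP+FN)
    error_rate = (fp + fn) / total  # (FP+FN) / (TP+TN+FP+FN)
    precision = tp / (tp + fp)      # TP / total TRUE predictions
    recall = tp / (tp + fn)         # TP / actual TRUE
    return accuracy, error_rate, precision, recall

# Numbers from the precision-vs-recall example: TP=1, TN=90, FP=1, FN=8
accuracy, error_rate, precision, recall = classification_kpis(1, 90, 1, 8)
# accuracy = 0.91, precision = 0.5, recall = 1/9 (about 0.11)
```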
LENET NETWORK
LENET ARCHITECTURE

• The network used is LeNet, which was presented by Yann LeCun.
• Reference and photo credit: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
• C: Convolution layer, S: subsampling layer, F: Fully Connected layer
LENET ARCHITECTURE

•STEP 1: THE FIRST CONVOLUTIONAL LAYER #1


• Input = 32x32x1
• Output = 28x28x6
• Output = (Input - Filter + 1)/Stride* => (32-5+1)/1 = 28
• Uses a 5x5 filter with input depth of 1 and output depth of 6
• Apply a RELU Activation function to the output
• Apply pooling with Input = 28x28x6 and Output = 14x14x6

•STEP 2: THE SECOND CONVOLUTIONAL LAYER #2


• Input = 14x14x6
• Output = 10x10x16
• Layer 2: Convolutional layer with Output = 10x10x16
• Output = (Input - Filter + 1)/Stride => (14-5+1)/1 = 10
• Apply a RELU Activation function to the output
• Pooling with Input = 10x10x16 and Output = 5x5x16

•STEP 3: FLATTENING THE NETWORK


• Flatten the network with Input = 5x5x16 and Output = 400

•STEP 4: FULLY CONNECTED LAYER


• Layer 3: Fully Connected layer with Input = 400 and Output = 120
• Apply a RELU Activation function to the output

•STEP 5: ANOTHER FULLY CONNECTED LAYER

• Layer 4: Fully Connected Layer with Input = 120 and Output = 84
• Apply a RELU Activation function to the output

* Stride is the amount by which the kernel is shifted when the kernel is passed over the image.

•STEP 6: FULLY CONNECTED LAYER


• Layer 5: Fully Connected layer with Input = 84 and Output = 43
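The layer sizes in the steps above follow from Output = (Input - Filter + 1)/Stride for the convolutions and 2x2 pooling with stride 2; a small sketch verifying them (helper names are mine):

```python
def conv_out(n, filt, stride=1):
    """Valid convolution: output size = (input - filter + 1) / stride."""
    return (n - filt + 1) // stride

def pool_out(n, size=2, stride=2):
    """Pooling output size for a size x size window."""
    return (n - size) // stride + 1

c1 = conv_out(32, 5)  # Step 1: 32x32x1 -> 28x28x6
p1 = pool_out(c1)     #         28x28x6 -> 14x14x6
c2 = conv_out(p1, 5)  # Step 2: 14x14x6 -> 10x10x16
p2 = pool_out(c2)     #         10x10x16 -> 5x5x16
flat = p2 * p2 * 16   # Step 3: flatten 5x5x16 -> 400
```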
NOTES ON SERVICE LIMIT INCREASE
INSTANCE REQUEST

1. Click on Services

2. Click on Support
INSTANCE REQUEST

1. Click on Create Case

2. Select Service Limit Increase

3. Select SageMaker from the drop-down box


INSTANCE REQUEST

1. Select the region (the region closest to you)

2. Select SageMaker Notebooks

3. Select ml.p2.16xlarge instances

4. Request 2 instances (we need 2: one for training and one for deployment)

5. Give the same reason (needed for training and deployment)


INSTANCE REQUEST

Click the Submit button.

NOTE: It takes around 24 to 48 hours for the instances to get approved the first time. After that, instance requests are approved within 1 to 2 hours.
