Classify Webcam Images Using Deep Learning

The document describes using a convolutional neural network called AlexNet to classify images from a webcam in real time. AlexNet is a pretrained deep CNN that has been trained on over 1 million images and can classify images into 1000 categories. The CNN will take images from a webcam and identify objects in the surroundings. It discusses problems with image classification like lighting and occlusion. It also provides details on the architecture of CNNs, including convolutional and pooling layers, as well as the specific architecture of AlexNet, which has convolutional, max pooling, normalization and fully connected layers.

Uploaded by

gaurav

CLASSIFY WEBCAM IMAGES USING DEEP LEARNING
ABSTRACT

• Deep learning has opened a new era in machine learning and is being applied to a
growing number of signal and image applications. The main purpose of the work
presented here is to apply a deep learning algorithm, namely the convolutional
neural network (CNN), to classifying webcam images in real time. The pretrained
deep convolutional neural network used here is AlexNet, which has been trained on
over a million images and can classify images into 1000 object categories (such as
keyboard, coffee mug, pencil, and many animals). AlexNet has learned rich feature
representations for a wide range of images. Images are captured from the system
webcam, and the pretrained AlexNet network identifies objects in the surroundings.
PROBLEMS IN CLASSIFYING IMAGES

• Large amount of intra-class variability
• Different lighting conditions
• Misalignment
• Non-rigid deformation
• Occlusion
• Corruption
WHAT IS DEEP LEARNING?

• Deep learning (also known as deep structured learning or hierarchical
learning) is part of a broader family of machine learning methods based on
learning data representations, as opposed to task-specific algorithms. Learning
can be supervised, semi-supervised or unsupervised.
• Deep learning architectures such as deep neural networks, deep belief networks
and recurrent neural networks have been applied to fields including computer
vision, speech recognition, natural language processing, audio recognition, social
network filtering, machine translation, bioinformatics, drug design, medical
image analysis, material inspection and board game programs, where they have
produced results comparable to and in some cases superior to human experts.
WHY DEEP LEARNING?

• Learning features from the data of interest is considered a possible way of
remedying the limitations of hand-crafted features.
• Deep networks discover multiple levels of representation, with the hope that
higher-level features can represent more abstract semantics of the data. Such
abstract representations learned from a deep network are expected to provide
greater robustness to intra-class variability.
• One key ingredient in the success of deep learning in image classification is
the use of convolutional architectures. A convolutional deep neural network
(ConvNet) architecture consists of multiple trainable stages stacked on top of
each other, followed by a supervised classifier.
CNN

• A CNN is a class of feed-forward artificial neural network, most commonly
applied to analyzing visual imagery. Convolutional neural networks are inspired by
biological processes: in a CNN, the connectivity pattern between neurons resembles
the organization of the animal visual cortex. Individual cortical neurons respond to
stimuli only in a restricted region of the visual field known as the receptive field.
The receptive fields of different neurons partially overlap so that together they
cover the entire visual field.
• CNNs use relatively little pre-processing compared to other image classification
algorithms, which means the network learns the filters that in traditional algorithms
were hand-engineered. This independence from prior knowledge and human effort
in feature design is a major advantage.
CNN-ALEXNET

• AlexNet was designed by Alex Krizhevsky and published with Ilya Sutskever and
Geoffrey Hinton. It competed in the ImageNet Large Scale Visual
Recognition Challenge in 2012.
• The network achieved a top-5 error of 15.3%, more than 10.8 percentage points
lower than that of the runner-up.
• For each image captured from the camera, AlexNet outputs a probability for every
category. The five categories with the highest probabilities are shown, and a
chart is prepared from them.
• AlexNet was reportedly trained for more than 50,000 iterations and gives more
accurate results than previously trained models.
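The top-five display described above can be sketched in a few lines of Python. This is only an illustration: the labels and probabilities below are made-up values, not real AlexNet output, and the helper name top_k is hypothetical.

```python
def top_k(probabilities, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical scores for one webcam frame (illustrative values only).
scores = {
    "coffee mug": 0.62, "keyboard": 0.15, "pencil": 0.08,
    "mouse": 0.06, "notebook": 0.04, "tabby cat": 0.03, "banana": 0.02,
}

for label, p in top_k(scores):
    # A crude text bar chart: roughly one '#' per 2 percentage points.
    print(f"{label:<12} {p:5.2f} {'#' * int(p * 50)}")
```

In a real-time setup the same ranking step would presumably run once per captured frame, with the chart redrawn each time.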
ARCHITECTURE OF CNN

• A CNN consists of a number of convolutional and subsampling layers, optionally followed
by fully connected layers. The input to a convolutional layer is an m x m x r image,
where m is the height and width of the image and r is the number of channels; e.g.
an RGB image has r = 3.
• The convolutional layer has k filters (or kernels) of size n x n x q, where n is
smaller than the dimension of the image and q can either be the same as the
number of channels r or smaller, and may vary for each kernel. The size of the filters
gives rise to the locally connected structure: each filter is convolved with the
image to produce k feature maps of size (m−n+1) x (m−n+1). Each map is then subsampled,
typically with mean or max pooling over p x p contiguous regions, where p ranges
from 2 for small images to usually not more than 5 for larger inputs.
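As a quick check of the size arithmetic above, the following Python sketch computes the feature-map side length m−n+1 and the size after pooling. It assumes stride 1, no padding, and non-overlapping p x p pooling regions; the values of m, n, k and p are illustrative and do not come from the slides.

```python
def conv_output_size(m, n):
    """Side length of a feature map for an m x m input and n x n filter
    (stride 1, no padding)."""
    if n > m:
        raise ValueError("filter must not be larger than the image")
    return m - n + 1


def pooled_size(size, p):
    """Side length after pooling over non-overlapping p x p regions."""
    return size // p


m, n, k, p = 28, 5, 6, 2  # illustrative values only
side = conv_output_size(m, n)
pooled = pooled_size(side, p)
print(f"{k} feature maps of {side} x {side}")         # 6 feature maps of 24 x 24
print(f"after {p} x {p} pooling: {pooled} x {pooled}")  # after 2 x 2 pooling: 12 x 12
```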
A SIMPLE CONV-NET
OPERATIONS IN CONV-NET

• Convolution
• Non-Linearity (ReLU)
• Pooling or Sub Sampling
• Classification (Fully Connected Layer)
CONVOLUTION

• ConvNets derive their name from the “convolution” operator. The primary purpose of
convolution in a ConvNet is to extract features from the input image. Convolution
preserves the spatial relationship between pixels by learning image features using small
squares of input data. In CNN terminology, the small matrix that slides over the image
(for example, 3×3) is called a ‘filter’, ‘kernel’ or ‘feature detector’, and the matrix
formed by sliding the filter over the image and computing the dot product at each
position is called the ‘Convolved Feature’, ‘Activation Map’ or ‘Feature Map’. It is
important to note that filters act as feature detectors on the original input image.
In practice, a CNN learns the values of these filters on its own during the training
process (although we still need to specify parameters such as the number of filters,
filter size and architecture of the network before training). The more filters we
have, the more image features get extracted and, in general, the better our network
becomes at recognizing patterns in unseen images.
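The sliding dot product described above can be written out directly. The sketch below is a minimal pure-Python version for a single-channel image with stride 1 and no padding; the 5×5 image and 3×3 filter are illustrative values, and the function name convolve2d is hypothetical. Like most CNN implementations, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).
    Each output value is the dot product of the kernel with one image patch."""
    m, n = len(image), len(kernel)
    out = m - n + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(n) for b in range(n))
            for j in range(out)
        ]
        for i in range(out)
    ]


image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
# [4, 3, 4]
# [2, 4, 3]
# [2, 3, 4]
```

In a trained network the kernel values would be learned rather than fixed by hand as they are here.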
NON-LINEARITY (RELU)

• An additional operation called ReLU is applied after every convolution
operation. ReLU stands for Rectified Linear Unit and is a non-linear
operation.
• ReLU is an element-wise operation (applied per pixel) that replaces all
negative pixel values in the feature map with zero. The purpose of ReLU is to
introduce non-linearity into our ConvNet, since most of the real-world data
we would want our ConvNet to learn is non-linear (convolution is a
linear operation, consisting of element-wise multiplication and addition, so we
account for non-linearity by introducing a non-linear function like ReLU).
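A minimal Python sketch of the element-wise operation described above, applied to a small made-up feature map (the values are illustrative only):

```python
def relu(feature_map):
    """Replace every negative value in the feature map with zero."""
    return [[max(0, value) for value in row] for row in feature_map]


feature_map = [
    [3, -1, 2],
    [-4, 0, 5],
]
print(relu(feature_map))  # [[3, 0, 2], [0, 0, 5]]
```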
POOLING STEP

• Spatial pooling (also called subsampling or down-sampling) reduces the
dimensionality of each feature map but retains the most
important information. Spatial pooling can be of different types: max,
average, sum, etc. In the case of max pooling, we define a spatial
neighborhood (for example, a 2×2 window) and take the largest
element from the rectified feature map within that window. Instead of
taking the largest element we could also take the average (average
pooling) or the sum of all elements in that window.
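The pooling step can be sketched as follows. This assumes non-overlapping windows (the window moves by its own size), which is a common choice but not stated in the slides; the function name pool2d and the input values are illustrative.

```python
def pool2d(feature_map, size=2, reduce=max):
    """Pool over non-overlapping size x size windows.
    `reduce` picks the pooling type: max for max pooling, sum for sum pooling."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, rows - size + 1, size):
        pooled_row = []
        for j in range(0, cols - size + 1, size):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            pooled_row.append(reduce(window))
        pooled.append(pooled_row)
    return pooled


rectified = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
print(pool2d(rectified))              # [[6, 8], [3, 4]]
print(pool2d(rectified, reduce=sum))  # [[13, 21], [8, 8]]
```

Passing a different reducer, such as a function that returns the mean of the window, would give average pooling with the same loop.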
FULLY CONNECTED LAYER

• The fully connected layer is a traditional multi-layer perceptron that
uses a softmax activation function in the output layer (other classifiers
such as SVM can also be used, but we stick to softmax here). The
term “fully connected” implies that every neuron in the previous layer
is connected to every neuron in the next layer. The output from the
convolutional and pooling layers represents high-level features of the
input image. The purpose of the fully connected layer is to use these
features to classify the input image into various classes based on
the training dataset.
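To make the classification step concrete, here is a small Python sketch of one fully connected layer followed by softmax. The features, weights and biases are hypothetical hand-picked numbers (a trained network would learn them), and the three outputs stand in for three classes.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: every input feeds every output neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


features = [0.5, 1.0, -0.5]  # hypothetical flattened high-level features
weights = [                   # hypothetical weights: one row per output class
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
biases = [0.0, 0.0, 0.0]

probs = softmax(dense(features, weights, biases))
print([round(p, 3) for p in probs])
print("predicted class index:", probs.index(max(probs)))
```

The class with the largest raw score always receives the largest probability, so softmax changes the scale of the scores but not their ranking.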
ALEXNET ARCHITECTURE
DESCRIBING NETWORK

• The net contains eight layers with weights; the first five are convolutional and the remaining three are fully connected.
The output of the last fully connected layer is fed to a 1000-way softmax, which produces a distribution over the 1000
class labels. Response-normalization layers follow the first and second convolutional layers. Max-pooling layers
follow both response-normalization layers as well as the last (fifth) convolutional layer. The ReLU non-linearity is
applied to the output of every convolutional and fully connected layer.
• The input to the net is a 227 × 227 × 3 image. The filters for each convolutional layer are:
• 96 kernels of size 11 × 11 × 3 with step size 4
• 256 kernels of size 5 × 5 × 48* with step size 1
• 384 kernels of size 3 × 3 × 256 with step size 1
• 384 kernels of size 3 × 3 × 192* with step size 1
• 256 kernels of size 3 × 3 × 192* with step size 1
* The kernel depth is half the number of incoming feature maps because the original network was split across two GPUs, and these layers see only the feature maps held on the same GPU.
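The kernel list above is enough to work out how many weights each convolutional layer holds. The Python sketch below does that arithmetic (biases are not counted) and also computes the side length of the first layer's feature maps, assuming no padding on the 227 × 227 input. The helper name kernel_weights is hypothetical.

```python
# (number of kernels, kernel height/width, kernel depth, step size), as listed above
conv_layers = [
    (96, 11, 3, 4),
    (256, 5, 48, 1),
    (384, 3, 256, 1),
    (384, 3, 192, 1),
    (256, 3, 192, 1),
]


def kernel_weights(count, size, depth):
    """Number of weights in `count` kernels of shape size x size x depth."""
    return count * size * size * depth


for index, (count, size, depth, _step) in enumerate(conv_layers, start=1):
    print(f"conv{index}: {kernel_weights(count, size, depth):>7} weights")

total = sum(kernel_weights(c, s, d) for c, s, d, _ in conv_layers)
print("total convolutional weights:", total)  # 2332704

# First layer only: 227 x 227 input, 11 x 11 kernel, step size 4, assuming no padding.
first_output = (227 - 11) // 4 + 1
print("conv1 feature map side:", first_output)  # 55
```

Sizes for the later layers depend on padding and pooling settings that the slides do not give, so they are left out here.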
THANK YOU
