OpenCV Python Course_Updated
o Pedestrian
Finding Waldo Count Yawns Detection
tion Handwritten Digit Reco
Requirements?
Basic programming is useful but not required; I’ll walk you through most of the
code. Exposure to NumPy would be helpful.
A Webcam
Software developers & engineers looking to strengthen their skills for job promo
Hobbyists who want to build a fun computer vision project, e.g. Raspberry Pi
projects.
Let’s begin our exciting journey into
The world of Computer Vision using
OpenCV in Python!
Python & OpenCV Installation
1 – Download & Install Anaconda Python Package
Go to: https://www.anaconda.com/download
Select the appropriate version - either Python 2.7 or 3.7
Test – Go to windows command prompt and type:
jupyter notebook
This should launch the Jupyter Notebook server
2 – Open Command Prompt or Terminal (Mac & Linux)
OpenCV & Dlib Installation
STEP 3
Use pip or pip3 to install the following packages:
• pip install opencv-contrib-python
• pip install dlib
NumPy and Matplotlib should come with Anaconda by default. If you are not using Anaconda, you will
need to install the following before the packages above:
• pip install numpy
• pip install matplotlib
STEP 4
Test by opening a new notebook and running the following lines (bottom right)
Section 2
• Using a small opening in the barrier (called aperture), we block off most of the rays of light
reducing blurring on the film or sensor
• This is the pinhole camera model
Controlling Image Formation with a Lens
Both our eyes and cameras use an adaptive lens to control many aspects of
image formation such as:
• Aperture Size
• Controls the amount of light allowed through (f-stops in cameras)
• Depth of Field (Bokeh)
• Lens width - Adjusts focus distance (near or far)
How Humans See
The human visual system (eye & visual cortex) is
incredibly good at image processing
3-dimensional array
Black and White or Greyscale
Black and White images are stored in
2-Dimensional arrays
First major release 1.0 was in 2006, second in 2009 and third in 2015.
OpenCV 3.X is very similar and has some benefits and new functions; however, it also
removed some important algorithms (due to patents) such as SIFT & SURF.
Image Manipulations
1. Transformations, affine and non affine
2. Translations
3. Rotations
4. Scaling, re-sizing and interpolations
5. Image Pyramids
6. Cropping
7. Arithmetic Operations
8. Bitwise Operations and Masking
9. Convolutions & Blurring
10. Sharpening
11. Thresholding and Binarization
12. Dilation, erosion, opening and closing
13. Edge Detection & Image Gradients
14. Perspective & Affine Transforms
15. Mini Project # 1 – Make a Live Sketch of Yourself!
Transformations
Transformations are geometric distortions enacted upon an image.
Types:
Affine
Non-Affine
Affine vs Non-Affine Theory
Affine: Scaling, Rotation, Translation
Non-Affine or Projective Transform (also called Homography)
The non-affine or projective transformation does
not preserve parallelism, length, and angle. It does
however preserve collinearity and incidence.
Translations
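Translation Matrix
$$T = \begin{bmatrix} 1 & 0 & T_x \\ 0 & 1 & T_y \end{bmatrix}$$
Rotation Matrix
$$M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$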
OpenCV allows you to scale and rotate at the same time using the function:
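To make the translation matrix and the combined rotation and scale concrete, here is a minimal NumPy sketch that builds both as 2x3 affine matrices and applies them to points. The helper names (`translation_matrix`, `rotation_scale_matrix`, `apply_affine`) are hypothetical and not part of OpenCV. The rotation uses the usual mathematical convention; OpenCV's `cv2.getRotationMatrix2D(center, angle, scale)` works in image coordinates, so its sign convention for the angle may differ.

```python
import numpy as np

def translation_matrix(tx, ty):
    """2x3 affine matrix that shifts points by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty]])

def rotation_scale_matrix(center, angle_deg, scale=1.0):
    """2x3 affine matrix: rotate by angle_deg about center and scale about it."""
    cx, cy = center
    a = np.deg2rad(angle_deg)
    c, s = scale * np.cos(a), scale * np.sin(a)
    # p' = scale * R * (p - center) + center, folded into one 2x3 matrix
    return np.array([[c, -s, cx - c * cx + s * cy],
                     [s,  c, cy - s * cx - c * cy]])

def apply_affine(matrix, points):
    """Apply a 2x3 affine matrix to an (N, 2) list of (x, y) points."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])  # append a 1 to each point
    return homogeneous @ matrix.T

shifted = apply_affine(translation_matrix(3, -2), [[1, 1]])
rotated = apply_affine(rotation_scale_matrix((0, 0), 90), [[1, 0]])
doubled = apply_affine(rotation_scale_matrix((5, 5), 0, scale=2.0), [[6, 5]])
```

Keeping the matrices 2x3 matches the shape OpenCV expects for `cv2.warpAffine`, so the same numbers could be passed to it when warping a real image.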
It’s simply a different way of re-sizing that allows us to easily and quickly scale images. Scaling down
reduces the height and width of the new image by half.
This comes in useful when making object detectors that scale images each time they look for an object.
Cropping Images
Cropping images refers to extracting a segment of that image.
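Because images are stored as NumPy arrays, cropping is just array slicing. The sketch below uses a small synthetic array instead of a real photo, and the variable names are only illustrative. Note that rows (y) come first and columns (x) second.

```python
import numpy as np

# A fake 6 x 8 greyscale "image" whose pixel values encode their position
image = np.arange(6 * 8, dtype=np.uint8).reshape(6, 8)

# Crop rows 1..3 and columns 2..5 (the end index is exclusive)
top, bottom, left, right = 1, 4, 2, 6
cropped = image[top:bottom, left:right]

# Slicing returns a view on the original data; copy it if you plan to modify it
independent_crop = cropped.copy()
independent_crop[:] = 0
```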
Convolutions & Blurring
A Convolution is a mathematical operation performed on two
functions producing a third function which is typically a modified
version of one of the original functions.
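For images, the two "functions" are the image and a small kernel. The following is a slow but readable NumPy sketch of a 2D convolution, used here with a 3 x 3 box kernel to blur a single bright pixel. The function name `convolve2d` is hypothetical, odd kernel sizes are assumed, and the border is handled by repeating edge pixels (one choice among several).

```python
import numpy as np

def convolve2d(image, kernel):
    """Naive 2D convolution; output has the same size as the input."""
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(image.astype(float),
                    ((pad_h, pad_h), (pad_w, pad_w)), mode="edge")
    flipped = kernel[::-1, ::-1]  # true convolution flips the kernel
    out = np.zeros(image.shape, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            window = padded[y:y + kh, x:x + kw]
            out[y, x] = np.sum(window * flipped)
    return out

# 3 x 3 box (averaging) kernel: every neighbour gets equal weight
box_kernel = np.ones((3, 3)) / 9.0

image = np.zeros((5, 5))
image[2, 2] = 9.0                      # one bright pixel in the centre
blurred = convolve2d(image, box_kernel)
```

The bright pixel is spread evenly over its 3 x 3 neighbourhood, which is exactly what a box blur does. Real code would use an optimized routine rather than Python loops.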
Threshold Types:
cv2.THRESH_BINARY – Most common
cv2.THRESH_BINARY_INV – Most common
cv2.THRESH_TRUNC
cv2.THRESH_TOZERO
cv2.THRESH_TOZERO_INV
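The five threshold types are easier to compare when written out explicitly. This NumPy sketch follows my understanding of the OpenCV documentation, where a pixel counts as "above" when it is strictly greater than the threshold. The function `threshold` and its string `kind` argument are hypothetical stand-ins for `cv2.threshold` and its flags.

```python
import numpy as np

def threshold(image, thresh, max_value, kind):
    """Apply one of the five basic threshold types to an array."""
    img = np.asarray(image)
    above = img > thresh
    if kind == "BINARY":          # above -> max_value, else 0
        out = np.where(above, max_value, 0)
    elif kind == "BINARY_INV":    # above -> 0, else max_value
        out = np.where(above, 0, max_value)
    elif kind == "TRUNC":         # above -> thresh, else unchanged
        out = np.where(above, thresh, img)
    elif kind == "TOZERO":        # above -> unchanged, else 0
        out = np.where(above, img, 0)
    elif kind == "TOZERO_INV":    # above -> 0, else unchanged
        out = np.where(above, 0, img)
    else:
        raise ValueError("unknown threshold type: " + kind)
    return out.astype(img.dtype)

pixels = np.array([10, 127, 128, 200], dtype=np.uint8)
results = {kind: threshold(pixels, 127, 255, kind)
           for kind in ("BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV")}
```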
cv2.adaptiveThreshold(image, Max Value, Adaptive type, Threshold Type, Block size, Constant that is
subtracted from mean)
Remember:
Dilation – Adds pixels to the boundaries of objects in an image
Erosion – Removes pixels at the boundaries of objects in an image
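A small NumPy sketch of both operations on a binary image, using a square structuring element. The function names are hypothetical, and pixels outside the image are assumed to be background, which may differ from the border handling OpenCV uses by default.

```python
import numpy as np

def _neighborhood_stack(binary, size):
    """Stack every shifted copy of the image covered by a size x size kernel."""
    pad = size // 2
    padded = np.pad(binary, pad, mode="constant", constant_values=0)
    h, w = binary.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(size) for dx in range(size)])

def dilate(binary, size=3):
    """A pixel becomes 1 if ANY pixel under the kernel is 1 (objects grow)."""
    return _neighborhood_stack(binary, size).max(axis=0)

def erode(binary, size=3):
    """A pixel stays 1 only if ALL pixels under the kernel are 1 (objects shrink)."""
    return _neighborhood_stack(binary, size).min(axis=0)

square = np.zeros((7, 7), dtype=np.uint8)
square[2:5, 2:5] = 1        # a 3 x 3 white object

grown = dilate(square)      # one pixel added on every side
shrunk = erode(square)      # one pixel removed on every side
```

Opening (erosion followed by dilation) and closing (dilation followed by erosion) can be built by chaining these two functions.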
Edge Detection & Image Gradients
Edge Detection is a very important area in Computer Vision, especially when dealing with contours
(you’ll learn this soon).
Edges can be defined as sudden changes (discontinuities) in an image and they can encode just as
much information as pixels.
Edge Detection Algorithms
There are three main types of Edge Detection:
Sobel – to emphasize vertical or horizontal edges
Laplacian – Gets all orientations
Canny – Optimal due to low error rate, well defined edges and accurate
detection.
Gaussian Blur -> Canny Edge Extraction -> Threshold (Inverse)
Section 4
Image Segmentation
Segmentation - Partitioning images into different regions
Image Segmentation
1. Understanding contours
2. Sorting contours by size or left to right
3. Approximating contours & finding their convex hull
4. Matching Contour Shapes
5. Mini Project # 2 – Identifying Shapes
6. Line Detection
7. Circle Detection
8. Blob Detection
9. Mini Project # 3 – Counting Circles and Ellipses
Contours
Contours are continuous lines or curves that bound or cover the full
boundary of an object in an image.
Approximation Methods:
• cv2.CHAIN_APPROX_NONE – Stores all the points along the line (inefficient!)
• cv2.CHAIN_APPROX_SIMPLE – Stores the end points of each line
Hierarchy is stored in the following format: [Next, Previous, First Child, Parent]
NOTE - Contour Hierarchy is quite lengthy to explain; if you’re interested, read here:
http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_contours/py_contours_hierarchy/py_contours_hierarchy.html
Sorting Contours
Sorting contours is quite useful when doing image processing.
Sorting by Area can assist in Object Recognition (using contour area)
• Eliminate small contours that may be noise
• Extract the largest contour
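The idea can be sketched without OpenCV by treating each contour as a list of (x, y) points and computing its area with the shoelace formula. In a real pipeline the contours would come from `cv2.findContours` and the area from `cv2.contourArea`. The contour names, the `polygon_area` helper and the `min_area` cut-off below are made up for illustration.

```python
def polygon_area(points):
    """Area of a closed polygon given as a list of (x, y) vertices (shoelace formula)."""
    area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]   # wrap around to close the polygon
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

contours = {
    "speck": [(0, 0), (1, 0), (1, 1), (0, 1)],       # tiny square, likely noise
    "box": [(0, 0), (10, 0), (10, 5), (0, 5)],       # 10 x 5 rectangle
    "triangle": [(0, 0), (6, 0), (0, 4)],            # right triangle
}

# Eliminate small contours that may be noise
min_area = 5.0
kept = {name: pts for name, pts in contours.items()
        if polygon_area(pts) >= min_area}

# Sort the remaining contours from largest to smallest and take the largest
ordered = sorted(kept, key=lambda name: polygon_area(kept[name]), reverse=True)
largest = ordered[0]
```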
Used in Mechanics and Statistics quite frequently and has now been
adopted in Computer Vision
Contour Template – This is our reference contour that we’re trying to find in the new image
Contour – The individual contour we are checking against
Method – Type of contour matching (1, 2, 3)
Method Parameter – leave alone as 0.0 (not fully utilized in python OpenCV)
Mini Project # 2 – Shape Matching
Line Detection – Hough Lines & Probabilistic Hough Lines
cv2.HoughLinesP(binarized image, rho accuracy, theta accuracy, threshold, minimum line length, max line gap)
Circle Detection
cv2.HoughCircles(image, method, dp, MinDist, param1, param2, minRadius, MaxRadius)
Create Detector -> Input image into Detector -> Obtain Key points -> Draw Key points
Mini Project # 3 – Counting Circles and Ellipses
Blob Filtering – Shape & Size
cv2.SimpleBlobDetector_Params()
Area
params.filterByArea = True/False
params.minArea = pixels
params.maxArea = pixels
Circularity
params.filterByCircularity = True/False
params.minCircularity = 1 being perfect circle, 0 the opposite
Inertia – Measure of ellipticalness (low being more elliptical, high being more circular)
params.filterByInertia = True/False
params.minInertiaRatio = 0.01
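To get a feel for the circularity filter, it helps to compute the measure by hand. As far as I know, OpenCV defines circularity as 4π × area / perimeter², so the sketch below assumes that formula. The `circularity` helper is hypothetical, and the shapes are ideal ones rather than detected blobs.

```python
import math

def circularity(area, perimeter):
    """4 * pi * area / perimeter^2: 1.0 for a perfect circle, lower otherwise."""
    return 4.0 * math.pi * area / (perimeter ** 2)

# Ideal circle of radius 10
radius = 10.0
circle_score = circularity(math.pi * radius ** 2, 2 * math.pi * radius)

# Square with side 10
side = 10.0
square_score = circularity(side ** 2, 4 * side)

# Long thin 100 x 1 rectangle
strip_score = circularity(100.0 * 1.0, 2 * (100.0 + 1.0))
```

With these numbers, a `minCircularity` somewhere between the square's score and the circle's score would keep the circle and reject the square and the strip.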
Section 5
Object Detection
1. Object Detection using Template Matching
2. Mini Project – Finding Waldo
3. Feature Description Theory
4. Finding Corners
5. SIFT, SURF, FAST, BRIEF & ORB
6. Mini Project – Object Detection using Features
7. Histogram of Gradients (HoG) as a Descriptor
Object Detection
Why do we need to detect objects in images?
Labeling Scenes
Robot Navigation
Self Driving Cars
Body Recognition (Microsoft Kinect)
Disease & Cancer Detection
Facial Recognition
Handwriting Recognition
Identifying objects in satellite images
Etc.
Object Detection vs Recognition
Uninteresting
Interesting
Why is this important?
Features are important as they can be used to analyze, describe and match
images.
It was improved in 1994 by refining the scoring function used to determine corner locations
• cv2.goodFeaturesToTrack (input image, maxCorners, qualityLevel, minDistance)
SURF was developed to improve the speed of a scale invariant feature detector
Instead of using the Difference of Gaussian approach, SURF uses Hessian matrix
approximation to detect interesting points and uses the sum of Haar wavelet
responses for orientation assignment.
Alternatives to SIFT and SURF
Features from Accelerated Segment Test (FAST)
• Key point detection only (no descriptor, we can use SIFT or SURF to compute that)
• Used in real time applications
Oriented FAST and Rotated BRIEF (ORB) – Developed out of OpenCV Labs (not
patented so free to use!)
• Combines both FAST and BRIEF
• http://www.willowgarage.com/sites/default/files/orb_final.pdf
Using SIFT, SURF, FAST, BRIEF & ORB in OpenCV
Create Detector -> Input image into Detector -> Obtain Key points -> Draw Key points
Mini Project # 5 – Object Detection
Histogram of Oriented Gradients (HOGs)
HOGs are a feature descriptor that has been widely and successfully used for object detection.
It represents objects as a single feature vector as opposed to a set of feature vectors where
each represents a segment of the image.
It’s computed by sliding a window detector over an image, where a HOG descriptor is
computed for each position. Like SIFT the scale of the image is adjusted (pyramiding).
HOGs are often used with SVM (support vector machine) classifiers. Each HOG descriptor that
is computed is fed to an SVM classifier to determine if the object was found or not.
Great Paper by Dalal & Triggs on using HOGs for Human Detection:
• https://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf
Histogram of Gradients (HOGs) Step by Step
1. Using an 8 x 8 pixel detection window or cell (in green), we compute the
gradient vector or edge orientations at each pixel.
2. Each cell is then split into angular bins, where each bin corresponds to a
gradient direction (e.g. x, y). In the Dalal and Triggs paper, they used 9 bins
0-180° (20° each bin).
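The two steps above can be sketched for a single 8 x 8 cell in NumPy. This is a simplified illustration, not the full Dalal and Triggs pipeline: each pixel votes for exactly one bin (no interpolation between neighbouring bins) and no block normalization is applied. The function name `cell_histogram` is hypothetical.

```python
import numpy as np

def cell_histogram(cell, n_bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) for one cell."""
    cell = cell.astype(float)
    gy, gx = np.gradient(cell)                   # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # fold direction into 0-180
    bin_width = 180.0 / n_bins                   # 20 degrees for 9 bins
    bin_index = np.minimum((angle // bin_width).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    # Each pixel adds its gradient magnitude to the bin of its orientation
    np.add.at(hist, bin_index.ravel(), magnitude.ravel())
    return hist

# Intensity increases left to right: every gradient points along x (0 degrees)
horizontal_ramp = np.tile(np.arange(8), (8, 1))
hist_horizontal = cell_histogram(horizontal_ramp)

# Intensity increases top to bottom: every gradient points along y (90 degrees)
vertical_ramp = horizontal_ramp.T
hist_vertical = cell_histogram(vertical_ramp)
```

The horizontal ramp puts all of its weight in the first bin (0-20°) and the vertical ramp in the fifth bin (80-100°), which is how the histogram encodes edge direction.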
Block 1 Block 2
Section 6
Negative Positive
We then extract features using sliding windows of rectangular blocks. These features are single valued
and are calculated by subtracting the sum of pixel intensities under the white rectangles from the black
rectangles. However, this is a ridiculous number of calculations, even for a base window of 24 x 24
pixels (180,000 features generated). So the researchers devised a method called Integral Images that
computed this with four array references.
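Here is a minimal NumPy sketch of the integral image trick. Once the table is built, the sum of any rectangle costs four lookups no matter how large the rectangle is. The function names and the position of the two rectangles are made up for illustration, and the test image is synthetic.

```python
import numpy as np

def integral_image(image):
    """ii[y, x] holds the sum of all pixels above and to the left of (y, x)."""
    ii = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = image.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of a rectangle using exactly four array references."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

# Synthetic 24 x 24 base window
image = np.arange(24 * 24).reshape(24, 24) % 256
ii = integral_image(image)

# A two-rectangle Haar-like feature: white block on the left, black on the right
white = rect_sum(ii, top=4, left=4, height=8, width=6)
black = rect_sum(ii, top=4, left=10, height=8, width=6)
feature = black - white   # white sum subtracted from black sum, as described above
```

The extra row and column of zeros at the top and left of the table avoid special cases for rectangles that touch the image border.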
HAAR Classifiers Explained
However, they still had 180,000 features and the majority of them added no real value.
Relevant Irrelevant
Boosting was then used to determine the most informative features, with Freund &
Schapire’s AdaBoost the algorithm of choice due to its ease of implementation. Boosting is
the process by which we use weak classifiers to build strong classifiers, simply by assigning
heavier weighted penalties on incorrect classifications. This reduced the 180,000 features to
6,000, which is still quite a lot of features.
Think about this intuitively: of those 6,000 features, some will be more informative than
others. What if we used the most informative features to first check whether the region can
potentially have a face (false positives will be no big deal)? Doing so eliminates the need for
calculating all 6,000 features at once.
This concept is called the Cascade of Classifiers - for face detection, the Viola Jones method
used 38 stages.
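The early-rejection idea can be sketched in a few lines of plain Python. Each stage is a cheap test, and a window is discarded as soon as one stage fails, so most background windows never reach the expensive later stages. The stage tests and the `mean`, `contrast` and `symmetry` measurements are invented for illustration; real stages are boosted sets of Haar features.

```python
def run_cascade(stages, window):
    """Return (accepted, stages_evaluated) for one candidate window."""
    for index, stage in enumerate(stages):
        if not stage(window):
            return False, index + 1   # rejected early, later stages are skipped
    return True, len(stages)

# Three toy stages, ordered from cheapest and most permissive to strictest
stages = [
    lambda w: w["mean"] > 50,
    lambda w: w["contrast"] > 10,
    lambda w: w["symmetry"] > 0.8,
]

background = {"mean": 20, "contrast": 40, "symmetry": 0.9}
face_like = {"mean": 120, "contrast": 35, "symmetry": 0.95}

background_result = run_cascade(stages, background)
face_result = run_cascade(stages, face_like)
```

The background window is thrown away after a single test, while the face-like window has to pass every stage before it is accepted.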
Load Classifier -> Pass Image to Classifier/Detector
Scale Factor
Specifies how much we reduce the image size each time we scale. E.g. in face detection we typically use 1.3. This
means each scaled image is 1.3 times smaller than the previous one (a reduction of about 23%). Smaller values, like 1.05, will take longer to compute, but
will increase the rate of detection.
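To see why a smaller scale factor is slower, we can count how many times the image gets rescaled before it becomes smaller than the detection window. This is only a rough model of the pyramid, not how OpenCV implements it internally. The `pyramid_sizes` helper, the 480 pixel image and the 24 pixel window are assumptions chosen for the example.

```python
def pyramid_sizes(image_size, window_size, scale_factor):
    """Image sizes visited when shrinking by scale_factor until the window no longer fits."""
    sizes = []
    scale = 1.0
    while image_size / scale >= window_size:
        sizes.append(int(image_size / scale))
        scale *= scale_factor
    return sizes

coarse = pyramid_sizes(480, 24, 1.3)    # large steps, few levels
fine = pyramid_sizes(480, 24, 1.05)     # small steps, many levels

print(len(coarse), "levels with scale factor 1.3")
print(len(fine), "levels with scale factor 1.05")
```

Every level means another full pass of the sliding window, so the finer pyramid costs several times more work in exchange for a better chance of matching faces at in-between sizes.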
Min Neighbors
Specifies the number of neighbors each potential window should have in order to consider it a positive detection.
Typically set between 3-6.
• It acts as a sensitivity setting: low values will sometimes detect multiple faces over a single face. High values
will ensure fewer false positives, but you may miss some faces.
List of OpenCV Pre-Trained Cascade Classifiers
Found here - https://github.com/opencv/opencv/tree/master/data/haarcascades
Mini Project # 6- Car and Pedestrian Detection
Section 7
YES!
Let’s now edit the code to turn this into our own
version of the popular $35M app MSQRD!
Mini Project # 7 – Live Face Swapping
Mini Project # 8 – Yawn Detector and Counter
Section 8
Examples:
1. Using today’s news articles to predict tomorrow’s stock price
2. Using words in a Tweet to predict the sentiment expressed (e.g. anger, sadness,
joy, political views)
3. Predicting illnesses based on medical data
4. Using your past shopping data to predict future purchases
5. And the most popular example, filtering email spam.
Spam?
No Spam?
Two main types of Machine Learning
Supervised & Unsupervised Learning
Supervised Learning – We feed an algorithm some ground truth examples. It then
formulates a model mapping inputs to outputs (Training Process). We then use this model
to predict the outputs of new inputs.
Input Data (x1, x2, ..., xN) -> Supervised Machine Learning -> Output Labels
Unsupervised Learning – We similarly try to predict the output from input data, however no
ground truth examples are given. Think clustering!
Examples - Supervised & Unsupervised Learning
Neural Networks
• Convolutional Neural Networks (CNN) – are able to beat humans in object recognition tasks
• Caffe
• TensorFlow
• Theano & Keras
• Deep Belief Networks
K-Nearest Neighbors Algorithm
KNN is a simple machine learning classifier that classifies a new
input based on the closest examples in the feature space.
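A minimal pure-Python sketch of the idea: measure the distance from the new input to every training example, keep the k closest, and let them vote. The function name `knn_classify`, the toy 2D points and the labels are made up for illustration, and Python 3.8+ is assumed for `math.dist`.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Two small clusters in a 2D feature space
train_points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_labels = ["a", "a", "a", "b", "b", "b"]

near_first_cluster = knn_classify(train_points, train_labels, (2, 2))
near_second_cluster = knn_classify(train_points, train_labels, (7, 7))
```

There is no training step at all: the "model" is simply the stored examples, which is why KNN is easy to understand but slow on large datasets.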
1. Eigenfaces - createEigenFaceRecognizer()
2. Fisherfaces - createFisherFaceRecognizer()
3. Local Binary Patterns Histograms - createLBPHFaceRecognizer()
Normalize by gray scaling and re-sizing to 200 x 200 pixels
Create a binary (thresholded) mask showing only the desired colors in white
• mask = cv2.inRange(hsv_img, lower_color_range, upper_color_range)
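What `cv2.inRange` computes can be reproduced in NumPy: a pixel turns white (255) only if every channel lies between the lower and upper bound, inclusive. The sketch below uses a tiny hand-made HSV array, and the blue range is an illustrative guess rather than a tuned value. The `in_range` function is a hypothetical stand-in.

```python
import numpy as np

def in_range(image, lower, upper):
    """White (255) where all channels are within [lower, upper], black (0) elsewhere."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    inside = np.all((image >= lower) & (image <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# 2 x 2 image with (H, S, V) values per pixel
hsv_img = np.array([[[5, 200, 200], [60, 200, 200]],
                    [[115, 180, 90], [120, 20, 250]]], dtype=np.uint8)

lower_color_range = (110, 50, 50)     # assumed lower bound for blue
upper_color_range = (130, 255, 255)   # assumed upper bound for blue
mask = in_range(hsv_img, lower_color_range, upper_color_range)
```

Only the bottom-left pixel passes: the top row has the wrong hue, and the bottom-right pixel has the right hue but too little saturation.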
These algorithms essentially learn about the frame in view (video stream) and
are able to accurately “learn” and identify the foreground mask. What results is a
binary segmentation of the image which highlights regions of non-stationary
objects.
In OpenCV we typically use the histogram back projected image and initial target location.
Camshift – An Object Tracking Algorithm
Camshift is very similar to Meanshift, however you may have noted the window in Meanshift is of a
fixed size. That is problematic since movement in images can be small or large. If the window is
too large, you can miss the object when tracking.
Camshift (Continuously Adaptive Meanshift) uses an adaptive window size that changes both size
and orientation (i.e. rotates). We’ll simplify its steps here:
1. Applies Meanshift till it converges
2. Calculates the size of the window
3. Calculates the orientation by using the best fitting ellipse
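The first step listed above, running Meanshift until it converges, can be sketched in NumPy on a probability map such as a histogram back projection. This is a simplified illustration with a fixed-size window; it does not include the window resizing or ellipse fitting that Camshift adds. The `mean_shift` function and the synthetic blob are assumptions for the example.

```python
import numpy as np

def mean_shift(prob, window, max_iter=20):
    """Move a fixed-size (x, y, w, h) window towards the local centre of mass of prob."""
    x, y, w, h = window
    for _ in range(max_iter):
        roi = prob[y:y + h, x:x + w]
        total = roi.sum()
        if total == 0:
            break                                  # nothing to track inside the window
        ys, xs = np.mgrid[0:roi.shape[0], 0:roi.shape[1]]
        cx = (xs * roi).sum() / total              # centroid inside the window
        cy = (ys * roi).sum() / total
        # Shift the window so that its centre sits on the centroid
        new_x = int(np.floor(x + cx - (w - 1) / 2.0 + 0.5))
        new_y = int(np.floor(y + cy - (h - 1) / 2.0 + 0.5))
        new_x = min(max(new_x, 0), prob.shape[1] - w)   # stay inside the image
        new_y = min(max(new_y, 0), prob.shape[0] - h)
        if (new_x, new_y) == (x, y):
            break                                  # converged
        x, y = new_x, new_y
    return x, y, w, h

# Synthetic back projection: a 10 x 10 blob of high probability
prob = np.zeros((40, 40))
prob[20:30, 25:35] = 1.0

# Start with a window that only partly overlaps the blob
tracked = mean_shift(prob, (18, 14, 10, 10))
```

If the starting window did not overlap the blob at all, the loop would stop immediately, which is the "beware of the starting location" problem in practice.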
When and how to use Meanshift or Camshift?
If you have some prior knowledge of the object being tracked (e.g. size with respect
to the camera point of view) then Meanshift would work well.
Employ Camshift when the object being tracked is changing shape with respect to the
camera perspective. Generally more versatile, but also more sensitive.
Tip: Beware of the starting location of the window, you can get stuck in a local
minimum!
Optical Flow
Seeks to get the pattern of apparent motion of objects in an image between two
consecutive frames.
Lucas-Kanade Method
Dense Optical Flow
Mini Project # 11 – Object Tracking
Computational Photography
1. What is Computational Photography
2. Noise Reduction
3. Mini Project # 12 – Photo Restoration (remove strokes, scratches, bends in old photos)
Mini Project # 12 – Photo-restoration
What is Computational Photography?
These are digital image processing techniques used on images produced by cameras.
They seek to enhance images via computational processing rather than use expensive
optical processes (cost more and are bulky).
Course Wrap Up
1. Where do you go from here?
2. Computer Vision Research Areas & Startup Ideas and Mobile Computer Vision
Consolidating What You’ve Learnt
Familiarity with Python and Numpy
Computer Vision:
• Image Manipulation and Segmentation
• Object Recognition Methods
• Machine Learning in Computer Vision
• Facial Feature Analysis and Recognition
• Motion Analysis and Object Tracking
Becoming an Expert
. Create your own project
Native Mobile – Use OpenCV natively (using wrappers for iOS and Android)
• PROS
• Can do real time video based apps
• No need to pay for cloud based storage
• CONS
• Can’t do heavy processing on mobile device
• More difficult than using OpenCV in python
iOS - http://docs.opencv.org/2.4/doc/tutorials/introduction/ios_install/ios_install.html
Android - http://blog.codeonion.com/2015/11/17/learning-the-packages-of-opencv-sdk-for-android/