01-02 Introduction To CV and Segmentation

Introduction

Segmentation
Computer Vision

Slide 1 of 85
Outline
• Introduction
• Course structure
• Computer vision applications
• Stereovision
• Object tracking
• Segmentation
• Connected regions
• Merging and dividing regions
• Segmentation methods
Slide 2 of 85
Introduction

Slide 3 of 85
Syllabus
• Teachers:
• Ph.D., Associate Professor Sergei Shavetov, [email protected]
• Ph.D., Associate Professor Andrei Zhdanov, [email protected]
• (CS) Assistant Professor Aleksandr Belykh, [email protected]
• (AT) Ph.D., Assistant Professor Oleg Evstafev, [email protected]
• (AT) Ph.D., Assistant Professor Vladimir Bespalov, [email protected]
• Course structure: 16 lectures * 45 minutes each
• Practical assignments: 4
• Intermediate tests: 2
• Final test
Slide 4 of 85
Detailed Course Structure
• Introduction
• Image Segmentation
• Hough Transformation
• Image Features
• Object Description
• Feature Descriptors
• Introduction to Machine Learning
• Classification
• Images Categorization
• Object Detection
• Face Detection
• Image Search
• Neural Networks
• Deep Learning
• Convolutional Neural Networks
Slide 5 of 85
Practical Assignments Topics
• Image Segmentation
• Hough Transformation
• Feature Detectors
• Face Detection. Viola-Jones Approach

Slide 6 of 85
Files Exchange
• DingTalk Group / Files:
• Lecture presentations
• Practical assignment guidelines
• Practical assignment templates

Slide 7 of 85
Course Assessments
• 4 practical assignments: 11 points each (total 44 points max)
• practical task: 5 points
• test: 6 points
• After-lecture tests: 2 points each (total 16 points max)
• Intermediate tests: 10 points each (total 20 points max)
• Final test: 20 points

• Course assessment:
• 60 points and more – pass,
• 59 points and less – fail.

Slide 8 of 85
Course Deadlines
• Practical assignments are performed in groups of 1-3 students
• Each practical assignment consists of two parts:
• Practical task (5 points)
• Test on the learned material (6 points)
• To get the maximum score, the group should finish all parts of the
practical assignment, submit test answers, and show the practical
task results to the teacher during class time
• Some tasks may have optional parts which will give the group 1
extra point

Slide 9 of 85
Reasons for Losing Points
• Not attending the class (all tasks must be finished in class; not attending the
class results in 0 points for the in-class activity)
• Not finishing a task (the score is decreased based on the complexity of the
unfinished task)
• Not answering questions about the finished task (the score is decreased based
on the quality of the answers; in the worst case you may be suspected of
copying your classmates' work)
• Copying your classmates' work (in this case you cannot get more than 3
points for the practical assignment)

Slide 10 of 85
Practical Assignments Frameworks
• MATLAB including:
• Digital Image Processing Toolbox,
• Computer Vision Toolbox.
• Python developer package with:
• Jupyter,
• NumPy,
• OpenCV.
• Microsoft Visual Studio including:
• Desktop Development with C++,
• OpenCV.

Slide 11 of 85
Literature
• David A. Forsyth, Jean Ponce. Computer Vision: A Modern Approach
(2nd Edition) // Pearson, 2011.
• Linda G. Shapiro, George C. Stockman. Computer Vision // Pearson,
2001.
• https://www.mathworks.com/help/vision/index.html
• https://docs.opencv.org/4.x/d1/dfb/intro.html

Slide 12 of 85
Computer Vision Applications

Slide 13 of 85
Manipulator on a Conveyor
(ABB FlexPicker)

Slide 14 of 85
Obstacle Detection

Slide 15 of 85
Normal Error vs Abnormal Error

Slide 16 of 85
Pedestrian Detection

Slide 17 of 85
Pedestrian Detection

Slide 18 of 85
Object Detection

Slide 19 of 85
Parking System

Slide 20 of 85
CV for Mobile Robots – vSLAM
• Visual Simultaneous Localization and Mapping (vSLAM)
• It is used:
• to build a map of an unknown environment, or
• to update a map of a known environment,
• while simultaneously keeping track of the robot's location.

Location coordinates: «Where am I?»

Map building: «What does the world around me look like?»
Slide 21 of 85
vSLAM
[Diagram: data from the robot sensors (odometers, lidars, etc.) + data about the environment → map data]
Slide 22 of 85
vSLAM in ROS (Robot Operating System)

Examples: rgbdslam, gmapping
Slide 23 of 85
Stereovision
(distance measurement)
• Binocular disparity is the difference in the apparent position of an object
as seen by each eye.
• Stereoscopy is the sense of depth derived from binocular disparity.

Slide 24 of 85
Distance Measurement
• Disparity map:
• for each pixel in the left picture with coordinates (x₀, y₀),
• a matching pixel is searched for in the right picture at coordinates (x₀ − d, y₀).

[Figure: left camera and right camera views]
Slide 25 of 85
Distance Measurement
• Z – distance,
• f – focal length,
• L – baseline length,
• d – disparity,
• x′ and x′′ – coordinates of the object in the image plane in the
right and left pictures of the stereo pair, respectively.

Z = f·L / (x′ − x′′) = f·L / d
Slide 26 of 85
Distance Measurement

Left camera Disparity map Right camera

Slide 27 of 85
Distance Measurement
• StereoSGBM – OpenCV class for computing stereo correspondence
(semi-global block matching); a minimal sketch follows below.
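
A minimal sketch of disparity computation with StereoSGBM; the file names and parameter values are assumptions, and a rectified stereo pair is required:

```python
import cv2

# Hypothetical rectified stereo pair; file names are placeholders.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# numDisparities must be divisible by 16; blockSize is the matching window size.
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)

# StereoSGBM returns fixed-point disparities scaled by 16.
disparity = stereo.compute(left, right).astype("float32") / 16.0

# Given focal length f (in pixels) and baseline L, depth follows Z = f * L / d.
```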

Slide 28 of 85
Skeleton Tracking
• ROS openni_tracker

[Pipeline: RGB image + depth map → segmentation into body parts → torso and joint positions]

Slide 29 of 85
Skeleton Tracking
• ROS openni_tracker

Slide 30 of 85
Fingers Counting

Pipeline: RGB to grayscale → Gaussian blurring → binarization

Slide 31 of 85
Fingers Counting
• Computation of convex hull defects (a sketch follows below).
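
A hedged sketch of counting convexity defects with OpenCV, assuming a binarized hand mask is already available; the file name and depth threshold are assumptions:

```python
import cv2

# Hypothetical binarized hand mask (hand pixels = 255).
binary = cv2.imread("hand_mask.png", cv2.IMREAD_GRAYSCALE)

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
hand = max(contours, key=cv2.contourArea)  # largest contour = the hand

hull = cv2.convexHull(hand, returnPoints=False)   # hull as contour indices
defects = cv2.convexityDefects(hand, hull)        # Nx1x4: start, end, far, depth

deep = 0
if defects is not None:
    for start, end, far, depth in defects[:, 0]:
        # Deep defects correspond to valleys between extended fingers; depth is
        # fixed-point scaled by 256, and the 20-pixel threshold is a guess to tune.
        if depth / 256.0 > 20:
            deep += 1

# Heuristically, the number of extended fingers is about deep + 1.
print("deep defects:", deep)
```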

Slide 32 of 85
Fingers Counting

Slide 33 of 85
Fingers Counting

Slide 34 of 85
Object Detection
(using neural networks)

Slide 35 of 85
Object Detection
(using neural networks)

• Caffe – C++ framework for implementing deep learning algorithms,
http://caffe.berkeleyvision.org/
• ImageNet – labeled image dataset,
http://www.image-net.org/

[Figure: similar images]

Slide 36 of 85
Segmentation

Semantic segmentation

Object segmentation
Slide 37 of 85
Regions with Convolutional Neural Networks
• The R-CNN architecture identifies about 2,000 candidate regions (region
proposals) in the image and processes each of them:

Slide 38 of 85
Convolution

Slide 39 of 85
Mask R-CNN

Slide 40 of 85
R-CNN Performance
• Performance with the same number of classes and equal compute
resources:
• R-CNN: 40-50 s;
• Fast R-CNN: 2 s;
• Faster R-CNN: 0.2 s;
• Mask R-CNN: 0.2 s.
• R-CNN, Fast R-CNN, and Faster R-CNN are suited for object detection and
semantic segmentation;
• Mask R-CNN is suited for object detection and object segmentation.

Slide 41 of 85
YOLO – You Only Look Once
• The image is divided by a grid into cells.
• Classification and localization algorithms are applied to each cell.
• In each cell, the locations of the bounding rectangles and the corresponding
probabilities are estimated.

Slide 42 of 85
YOLO – You Only Look Once
• Modifications:
• TinyYOLO;
• YOLOv2;
• YOLOv3 and later versions (YOLOv4, v5, v6, v7, v8, v9, …).
• Performance: ~45 FPS

YOLOv3 with the COCO dataset

Slide 43 of 85
Segmentation

Slide 44 of 85
Connected Regions in Binary Images
• Neighborhood:
• "cross" neighborhood and 4-connectivity;
• "square" neighborhood and 8-connectivity.
• Horizontal and vertical neighbors are at a distance of 1 pixel from the central pixel of
the neighborhood.
• Diagonal neighbors are at a Euclidean distance of √2 pixels (2 pixels in the city-block
metric) from the central pixel of the neighborhood.

Slide 45 of 85
Connected Regions in Binary Images
• A connected region of an image is a set of points such that:
• all points have the same value, and
• there is a continuous path between any two points of the region consisting of
points that also belong to the region and are pairwise neighbors.

• Algorithms for extracting connected regions (a flood-fill sketch follows below):


1. "Forest fire" (flood-fill) method (Y- and U-shaped collisions are possible).
2. Two-pass algorithm (avoids collisions).
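
As an illustration of the "forest fire" idea, a minimal flood-fill labeling sketch with 4-connectivity in pure NumPy; an iterative stack replaces recursion to avoid stack overflow on large regions:

```python
import numpy as np

def label_components(binary: np.ndarray) -> np.ndarray:
    """Label 4-connected foreground regions of a 0/1 image."""
    labels = np.zeros(binary.shape, dtype=np.int32)
    current = 0
    h, w = binary.shape
    for i in range(h):
        for j in range(w):
            if binary[i, j] and labels[i, j] == 0:
                current += 1                      # start a new region
                stack = [(i, j)]
                while stack:                      # spread the "fire"
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and binary[y, x] and labels[y, x] == 0:
                        labels[y, x] = current
                        stack += [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
    return labels

img = np.array([[1, 1, 0, 0],
                [0, 1, 0, 1],
                [0, 0, 0, 1]], dtype=np.uint8)
print(label_components(img))  # two regions: labels 1 and 2
```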

Slide 46 of 85
Segmentation
• Splitting an image into non-overlapping regions, each of which is characterized by a
uniform color or texture.
• The purpose of segmentation in the "broad sense": to split an image into semantic
regions that correlate strongly with objects or regions of the observed three-
dimensional scene.

Slide 47 of 85
Segmentation
• R – the entire region of the image.
• Segmentation is the process of splitting R into such a set of connected
regions R_i, i = 1, …, n, that the following basic conditions are met for
them:
• R = ⋃_{i=1}^{n} R_i – the regions completely cover the image;
• R_i ∩ R_j = ∅, ∀ i ≠ j – the regions do not intersect with each other;
• Pred(R_i) = TRUE, i = 1, …, n, where Pred(R) is the Boolean homogeneity
predicate of the region;
• Pred(R_i ∪ R_j) = FALSE, ∀ i ≠ j – the pairwise union of any two regions does
not satisfy the same homogeneity condition.

Slide 48 of 85
Merging Regions
• Presegment the image into "starting" regions using a non-iterative
(single-pass) method.
• Define a criterion for merging two neighboring regions.
• Iteratively find and merge all pairs of neighboring regions that satisfy the
merge criterion.
• If no pair of merge candidates is found, stop and exit the algorithm.

Slide 49 of 85
Splitting Regions
• The partitioning begins by representing the entire image as a single
region, which does not always meet the uniformity condition.
• During the segmentation process, the current regions of the image are
sequentially split in accordance with the specified uniformity conditions.
• The methods of merging and splitting regions do not always lead to the
same segmentation results, even if they use the same homogeneity
criterion.

Slide 50 of 85
Splitting and Merging Regions

[Figure: pyramid data structure – a region at level N is split into four sub-regions at level N−1]
Slide 51 of 85
Splitting and Merging Regions
• The processes of splitting and merging regions are carried out alternately at each
iteration.
• If any region at any pyramid level is heterogeneous, it is split into four sub-regions.
• Conversely, if at any level of the pyramid there are four adjacent regions with
approximately the same degree of homogeneity, they are merged into a single region
at a higher level of the pyramid.

[Figure: splitting (top-down) and merging (bottom-up) in the pyramid]

Slide 52 of 85
Splitting and Merging Regions
1. Carry out the initial segmentation into regions; define the homogeneity
criterion and the pyramid data structure.
2. While there are regions to split or merge:
• if any region R in the pyramid data structure is not homogeneous (Pred(R) =
FALSE), split it into four child regions;
• if any four regions having the same parent can be merged into a single homogeneous
region, merge them;
• when no more regions can be split or merged at this step, go to step 3.
3. If there are any two adjacent regions R_i, R_j that can be merged into a
homogeneous region, merge them.
4. Merge small regions with the largest similar neighboring region (a simplified
splitting sketch follows below).
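
A simplified sketch of the splitting half of this scheme: recursive quadtree splitting driven by a variance-based homogeneity predicate. The predicate, tolerance, and toy image are assumptions, and the merging steps 2-4 are omitted for brevity:

```python
import numpy as np

def pred(region: np.ndarray, tol: float = 100.0) -> bool:
    """Homogeneity predicate: TRUE if the brightness variance is small."""
    return region.var() <= tol

def split(img: np.ndarray, y: int, x: int, h: int, w: int, out: list):
    region = img[y:y + h, x:x + w]
    if pred(region) or min(h, w) <= 2:        # homogeneous or too small to split
        out.append((y, x, h, w))
    else:                                     # split into four child regions
        h2, w2 = h // 2, w // 2
        split(img, y, x, h2, w2, out)
        split(img, y, x + w2, h2, w - w2, out)
        split(img, y + h2, x, h - h2, w2, out)
        split(img, y + h2, x + w2, h - h2, w - w2, out)

img = (np.arange(64).reshape(8, 8) % 2) * 255.0   # toy high-contrast image
regions = []
split(img, 0, 0, 8, 8, regions)
print(len(regions), "leaf regions")
```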

Slide 53 of 85
Splitting and Merging Regions

Slide 54 of 85
Basic Image Segmentation Methods
• Threshold image segmentation by brightness levels.
• Formula:
g(i, j) = 1, for f(i, j) ≥ T,
g(i, j) = 0, for f(i, j) < T,
where
g(i, j) – the element of the resulting binary image,
f(i, j) – the element of the original image,
T – the brightness threshold value.
• The main issue is the definition of the segmentation threshold.
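
The formula above maps directly to array operations; a minimal sketch, where the file name and the threshold T = 128 are arbitrary assumptions:

```python
import cv2
import numpy as np

f = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
T = 128
g = (f >= T).astype(np.uint8)       # g(i, j) = 1 where f(i, j) >= T, else 0
cv2.imwrite("binary.png", g * 255)  # scale {0, 1} to {0, 255} for viewing
```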

Slide 55 of 85
Segmentation Threshold Calculation
• w(x) – the image histogram, where 0 ≤ x ≤ 255.

Threshold by the histogram

Slide 56 of 85
Basic Image Segmentation Methods
• Range threshold segmentation:
g(i, j) = 1, for f(i, j) ∈ D,
g(i, j) = 0, otherwise,
where D is the range of brightness values.

• Multithreshold segmentation:
g(i, j) = 1, for f(i, j) ∈ D₁,
g(i, j) = 2, for f(i, j) ∈ D₂,
…,
g(i, j) = 0, otherwise.
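
A minimal multithreshold sketch; the ranges D₁ and D₂ below are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.integers(0, 256, (4, 4))      # stand-in for a grayscale image

g = np.zeros_like(f)
g[(f >= 50) & (f < 120)] = 1          # D1 = [50, 120)
g[(f >= 120) & (f < 200)] = 2         # D2 = [120, 200)
print(g)                              # 0 = "otherwise"
```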

Slide 57 of 85
k-means Segmentation Algorithm
1. Specify the number of classes k into which the image should be divided. All pixels
are considered as a set of vectors {x_i | i = 1, …, p}.
2. Determine k vectors {m_j | j = 1, …, k}, which are declared the initial cluster
centers; choose their values (for example, randomly).
3. Update the mean vectors {m_j | j = 1, …, k} (the cluster centers). To do this:
• calculate the distance from each x_i to each m_j;
• assign each x_i to the cluster j* whose center m_{j*} is closest;
• recalculate the mean values m_j for all clusters.
4. Repeat step 3 until the cluster centers stop changing (a minimal sketch follows below).
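
A minimal sketch of these steps using OpenCV's built-in cv2.kmeans on pixel color vectors; the file name and k are assumptions:

```python
import cv2
import numpy as np

img = cv2.imread("image.png")                     # hypothetical input
pixels = img.reshape(-1, 3).astype(np.float32)    # each pixel is a vector x_i

k = 4                                             # number of classes (step 1)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, labels, centers = cv2.kmeans(pixels, k, None, criteria,
                                10, cv2.KMEANS_RANDOM_CENTERS)

# Steps 2-4 run inside cv2.kmeans; visualize by replacing each pixel
# with its cluster center.
segmented = centers[labels.flatten()].astype(np.uint8).reshape(img.shape)
cv2.imwrite("segmented.png", segmented)
```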

Slide 58 of 85
Weber Segmentation Algorithm
• Formula:

W(I) = 20 − 12·I/88,                  if 0 ≤ I ≤ 88,
W(I) = 0.002·(I − 88)²,               if 88 < I ≤ 138,
W(I) = 7·(I − 138)/(255 − 138) + 13,  if 138 < I ≤ 255,

where W(I) is the Weber function and I is the brightness value.
• Weber principle: a person does not distinguish between gray levels
within [I(n), I(n) + W(I(n))].

Slide 59 of 85
Weber Segmentation Algorithm
1. Set the first class number n = 1 and the initial gray level I(n) = 0.
2. Calculate the value W(I(n)) corresponding to the brightness I(n) using the Weber
formula.
3. In the original image I, set the brightness value I(n) for all pixels whose brightness
lies in the range [I(n), I(n) + W(I(n))].
4. Find pixels whose brightness value is higher than G = I(n) + W(I(n)) + 1. If
there are such pixels, increase the class number (n = n + 1), set I(n) = G, and go to
step 2. If there are none, the algorithm is finished.

• The image is segmented into n classes, each displayed with its base brightness
I(n). It is convenient to implement this segmentation method by building a lookup
table (LUT), as sketched below.
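
A hedged sketch of the LUT construction, following the piecewise W(I) and steps 1-4 above:

```python
import numpy as np

def weber(I: float) -> float:
    """Piecewise Weber function W(I) from the formula above."""
    if I <= 88:
        return 20 - 12 * I / 88
    if I <= 138:
        return 0.002 * (I - 88) ** 2
    return 7 * (I - 138) / (255 - 138) + 13

# Build a lookup table mapping every gray level to its class's base level.
lut = np.zeros(256, dtype=np.uint8)
level = 0
while level <= 255:
    width = int(weber(level))
    lut[level:min(level + width + 1, 256)] = level  # one class = [I(n), I(n) + W(I(n))]
    level = level + width + 1                        # next class starts at G

# Applying it: segmented = lut[gray_image] for a uint8 grayscale image.
```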
Slide 60 of 85
Iterative algorithm of Vezhnevets
1. Traverse the image from the top-left pixel, which forms the class C₁.
• For the pixels of the first row, calculate the deviation from the class of the left pixel and compare it
with a specified threshold. If it is less than the threshold, add the pixel to the neighbor's class;
otherwise, create a new class C_{1+i}.
2. For each following row, compare each pixel with the classes of its two neighbors: left and top.
• If the deviation from both compared classes is greater than the threshold, start a new class.
• If the deviation is below the threshold for only one class, add the pixel to that class.
• If the deviation is acceptable for both classes, two options are possible:
1. L(g(C_i) − g(C_j)) < δ – merge these two classes (if they are not the same class) and add
the current pixel to the merged class;
2. L(g(C_i) − g(C_j)) > δ – add the pixel to the one of the two classes from which the deviation is
minimal.
• As the measure L, any distance function can be used, for example, the difference in RGB space.
Slide 61 of 85
Segmentation by Skin Color
• Advantages: skin color is independent of face orientation; pixel color analysis is
computationally efficient.

• Task: Choose a criterion for evaluating the proximity of the color of each pixel to the
skin tone.

• Designing a skin color model:


1. Accumulate training data using images with annotated "skin" and "non-skin" areas;
accumulate skin-tone statistics from the training data.
2. Process the obtained statistics and select the skin color model parameters for
subsequent use; select the criteria for deciding whether a pixel belongs to a "skin"
area.
3. Process images with the obtained criteria.

Slide 62 of 85
Segmentation by Skin Color
• Threshold criteria, i.e. context-independent segmentation: the color
of a pixel (R, G, B) is assigned to the "skin" area if the following
conditions are met:

R > 95, G > 40, B > 20,
max(R, G, B) − min(R, G, B) > 15,
|R − G| > 15,
R > G, R > B.
Slide 63 of 85
Segmentation by Skin Color
• Threshold criteria, i.e. context-independent segmentation: the color
of a pixel (R, G, B) is assigned to the "skin" area if the following
conditions are met in flashlight conditions:

R > 220, G > 210, B > 170,
|R − G| ≤ 15,
G > B, R > B.
Slide 64 of 85
Segmentation by Skin Color
• Threshold criteria, i.e. context-independent segmentation, with
normalized color:

r = R/(R + G + B), g = G/(R + G + B), b = B/(R + G + B);

the color of a pixel is assigned to the "skin" area if:

r/g > 1.185,
r·b/(r + g + b)² > 0.107,
r·g/(r + g + b)² > 0.112.
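
As an illustration, a minimal sketch applying the first (RGB) threshold rule above; the input file name and the final median-filter kernel size are assumptions:

```python
import cv2
import numpy as np

img = cv2.imread("face.png")                 # hypothetical input, BGR order
arr = img.astype(np.int32)                   # avoid uint8 overflow in arithmetic
B, G, R = arr[..., 0], arr[..., 1], arr[..., 2]

skin = ((R > 95) & (G > 40) & (B > 20)
        & (arr.max(axis=2) - arr.min(axis=2) > 15)
        & (np.abs(R - G) > 15) & (R > G) & (R > B))

mask = skin.astype(np.uint8) * 255
mask = cv2.medianBlur(mask, 5)               # median filtering, as advised below
cv2.imwrite("skin_mask.png", mask)
```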

Slide 65 of 85
Segmentation by Skin Color
• It is advised to apply the median filter after segmenting by skin color.

Slide 66 of 85
Texture Segmentation
Approaches to texture segmentation:
1. Statistical – characterizes the texture of an area as smooth, rough, or
grainy.
2. Structural – defines and describes the relative position of the simplest
repeating image elements, for example, segments of parallel lines at a
constant step, or the cells of a chessboard.
3. Spectral.

Slide 67 of 85
Texture Segmentation
Example

• Split an image containing two types of regions represented by different
textures.
• This can be carried out with a statistical texture segmentation approach.
• As a result, the image will be split into water surface and land.
• This cannot be done by binarization methods, only by analyzing the texture
parameters in the area around each pixel.

[Figure: an image with different types of texture areas corresponding to land and water]
Slide 68 of 85
Texture Segmentation
• Texture Analysis Segmentation Algorithm:
1. Read the image.
2. Define texture parameters. Assuming that the brightness of an image pixel is a
random variable z, it follows the probability distribution p(z_i), taken from the
histogram (L is the number of brightness levels).
• The central moment of order n of the random variable z equals

μ_n(z) = Σ_{i=0}^{L−1} (z_i − m)ⁿ · p(z_i),

where m is the mean value of z (the average brightness of the image),

m = Σ_{i=0}^{L−1} z_i · p(z_i),

and μ₀ = 1, μ₁ = 0.

Slide 69 of 85
Texture Segmentation
• To describe the texture, the second moment, i.e. the variance σ²(z) = μ₂(z),
is important. It is a measure of brightness contrast, which can be used to compute
a feature of relative smoothness:

R = 1 − 1/(1 + σ²(z)),

• R is zero for areas of constant brightness (where the variance is zero),
• R approaches unity for large values of σ²(z).

• In grayscale images it is advised to normalize the variance to the interval [0, 1]; to
do this, divide σ²(z) by (L − 1)².

• The standard deviation is used as a characteristic of the texture: s = σ(z).

Slide 70 of 85
Texture Segmentation
• The third moment is a characteristic of the symmetry (skewness) of the histogram:

μ₃(z) = Σ_{i=0}^{L−1} (z_i − m)³ · p(z_i).

• To estimate the spread in brightness of neighboring pixels, the entropy
function is used:

e = − Σ_{i=0}^{L−1} p(z_i) · log₂ p(z_i),

where p(z_i) is the probability of the current brightness in the vicinity of the point,
L is the number of brightness levels, and e is the entropy value at the current point.

Slide 71 of 85
Texture Segmentation
• To describe the texture, a uniformity measure is also used, which evaluates the
uniformity of the histogram:

U = Σ_{i=0}^{L−1} p²(z_i).
• The table shows the values of the described characteristics for smooth,
rough, and periodic textures.

Texture  | Average | Std. deviation | R (normalized) | Third moment | Uniformity | Entropy
Smooth   | 82.64   | 11.79          | 0.002          | -0.105       | 0.026      | 5.434
Rough    | 143.56  | 74.63          | 0.0079         | -0.151       | 0.005      | 7.783
Periodic | 99.72   | 33.73          | 0.017          | 0.750        | 0.013      | 6.674

3. Create a mask to highlight the larger texture.


Slide 72 of 85
Texture Segmentation
• Let the image have textures of two types: large and small (grainy); the grainy
texture corresponds to the water zone.
• To separate one area from the other, create a mask that removes small objects
(see the sketch below):
• use the function that finds connected sets of pixels in a binary image and
calculate the areas of the resulting objects; use 8-connectivity;
• if neighboring pixels have the same color, they belong to the same object,
otherwise they belong to different ones;
• all objects with an area less than a given value S are deleted.
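
A hedged sketch of this pipeline using scikit-image; the input file, the 9×9 window (from the next slide), the Otsu threshold choice, and the area S are all assumptions:

```python
import numpy as np
from skimage import io, filters, morphology
from skimage.filters.rank import entropy

gray = io.imread("coast.png", as_gray=True)          # hypothetical input
gray8 = (gray * 255).astype(np.uint8)

e = entropy(gray8, morphology.square(9))             # local entropy, 9x9 window
binary = e > filters.threshold_otsu(e)               # grainy (water) = high entropy

# Remove connected objects smaller than S pixels; connectivity=2 means
# 8-connectivity in 2D.
S = 500
mask = morphology.remove_small_objects(binary, min_size=S, connectivity=2)
io.imsave("water_mask.png", (mask * 255).astype(np.uint8))
```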

Slide 73 of 85
Texture Segmentation

On the left: the result of texture filtering based on entropy calculation in a 9×9 window; regions of the
two textures are shown in dark (water) and light (land) hues. In the center: the water-surface mask after
removal of small region objects. On the right: the result of land segmentation.

Slide 74 of 85
Morphological Watershed Method
• A grayscale image is treated as a digital terrain model, where the brightness values
are heights relative to a certain level, i.e. the image is a matrix of heights.

Slide 75 of 85
Morphological Watershed Method
• If it rains on such terrain, many pools form. Water fills small pools, then overflows, and the
pools combine into larger pools according to the water-level heights.
• The places where pools merge are marked as watershed lines. Eventually the entire area may be flooded.
• The segmentation result depends on the moment when the water supply stops: if the process is stopped
early, the image is segmented into small areas; if it is stopped late, into very large ones.

[Figure: local minima of f(x, y)]

Slide 76 of 85
Morphological Watershed Method
• All pixels are divided into three types:
1. local minima;
2. located on a slope, i.e. those from which water rolls into the same local minimum;
3. local maxima, i.e. those from which water rolls into more than one minimum.
• When segmenting using this method, it is necessary to determine the watersheds
and watershed lines in the image by processing local areas depending on their
brightness characteristics.

Slide 77 of 85
Morphological Watershed Method
• Algorithm for implementing the watershed method:
1. compute the segmentation function (this applies to images where objects are placed in dark
regions and are difficult to distinguish);
2. compute the image foreground markers based on the analysis of the pixel connectivity of each
object;
3. compute the background markers – the pixels that are not part of any object;
4. modify the segmentation function based on the locations of the background markers and
foreground markers;
5. select objects of uniform brightness (in the form of spots) against the image background.
• Regions characterized by small brightness variations have small gradient values. Therefore, in
practice, the watershed segmentation method is usually applied not to the image itself, but to
its gradient representation.
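
A hedged sketch of marker-based watershed with OpenCV along the lines of steps 1-4; the input image and the distance-transform threshold are assumptions:

```python
import cv2
import numpy as np

img = cv2.imread("coins.png")  # hypothetical input
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Background markers: dilation; foreground markers: distance-transform peaks.
sure_bg = cv2.dilate(binary, np.ones((3, 3), np.uint8), iterations=3)
dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.6 * dist.max(), 255, 0)
sure_fg = sure_fg.astype(np.uint8)
unknown = cv2.subtract(sure_bg, sure_fg)

# Label foreground markers; reserve 0 for the unknown region.
_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[unknown == 255] = 0

markers = cv2.watershed(img, markers)  # watershed lines get label -1
img[markers == -1] = (0, 0, 255)       # draw watershed lines in red
cv2.imwrite("watershed.png", img)
```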

Slide 78 of 85
Region Detectors
• IBR detector (Intensity extrema-based regions)
• From a point of local brightness extremum I₀, move outward along rays,
evaluating the function f.
• As soon as a peak of the value f is found, stop: that point lies on the
boundary of the region.

f(t) = |I(t) − I₀| / ( (1/t) · ∫₀ᵗ |I(t) − I₀| dt )

[Figure: points along the ray]
An example of the IBR detector
Slide 79 of 85
IBR Detector
• The regions found in a pair of similar images may differ, so ellipses are
circumscribed around them.
• If the ellipses are transformed into circles, we obtain complete similarity up to
rotation.

Circumscribed ellipses around objects


An example of the IBR detector
Slide 80 of 85
MSER Detector
• Maximally Stable Extremal Regions
• Solves the problem of keypoint invariance under scaling.
• MSER detector algorithm:
1. Sort the set of all image pixels in ascending/descending order of intensity.
2. Build a pyramid of connected components. For each pixel of the sorted set,
perform the following sequence of actions:
• update the list of points included in the component;
• update the areas of the affected components, so that the pixels of the previous level
are a subset of the pixels of the next level.
3. For all components, search for local minima (pixels that are present in the given
component but not in the previous ones). The set of local level minima
corresponds to the extremal regions of the image.
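
A minimal sketch of running OpenCV's MSER detector and outlining the detected regions; default parameters are used and the input file name is an assumption:

```python
import cv2

gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input
mser = cv2.MSER_create()                               # default parameters
regions, bboxes = mser.detectRegions(gray)

vis = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)
for pts in regions:
    hull = cv2.convexHull(pts.reshape(-1, 1, 2))       # outline each region
    cv2.polylines(vis, [hull], True, (0, 255, 0), 1)
cv2.imwrite("mser_regions.png", vis)
```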

Slide 81 of 85
MSER Detector

An example of the MSER detector

Slide 82 of 85
Test

Slide 83 of 85
Lecture 1-2 Test

Please scan the code to start the test


Slide 84 of 85
THANK YOU
FOR YOUR TIME!

Andrei Zhdanov
[email protected]
