
GNR602
Advanced Methods in Satellite Image Processing
Instructor: Prof. B. Krishna Mohan
CSRE, IIT Bombay
[email protected]

Slot 13
Lecture 15-21 Image Segmentation Techniques

IIT Bombay Slide 1
Lecture 15-21 Image Segmentation

Contents of the Lecture


• Definition of Image Segmentation
• Image Segmentation by Thresholding
• Optimal Thresholding
• Edge Detection
• Region Growing



IIT Bombay Slide 2

Definition

• Given by Theodosios Pavlidis

• An image S segmented into R regions ri should satisfy

  ∪i=1..R ri = S

Given P, a homogeneity predicate:
  P(ri) = True
  P(ri ∪ rj) = False for i ≠ j, with regions ri and rj adjacent
  ri ∩ rj = ∅ for i ≠ j
  Each ri is 4-connected or 8-connected

[Figure: an image S partitioned into regions r1 … r5]
IIT Bombay Slide 3

Definition
• Given by Theodosios Pavlidis
• Widely accepted, since it covers the logical
process of image segmentation
• If the image is decomposed into its constituent
parts, each part should correspond to some
object in real world, and one or more parts build
up the entire object



IIT Bombay Slide 4

Interpretation of Pavlidis’ Definition


• P(ri) = True
• The image is made up of regions where each
region is homogeneous according to some
predicate or condition
• The region (or segment) should have low internal
variance so that it does not appear to be made up
of smaller homogeneous segments



IIT Bombay Slide 5

Interpretation of Pavlidis’ Definition


The second condition requires that the
segments cover the entire image:

  ∪i=1..R ri = S

The entire image is segmented into homogeneous regions.


IIT Bombay Slide 6
Interpretation of Pavlidis’ Definition
• The 3rd condition suggests that adjacent regions cannot
be merged without violating the homogeneity criterion:

  P(ri ∪ rj) = False, i ≠ j

Obviously this condition does not make sense if the regions are not
spatially adjacent, since regions representing the same type of
object can be physically separated.


IIT Bombay Slide 7

Interpretation of Pavlidis’ Definition


• 4th condition
– Each ri is 4-connected or 8-connected

This condition implies that each region is made
up of a spatially contiguous set of pixels. We do not
call something a single region or segment if there is no
connected path within it joining any point to any
other point.
IIT Bombay Slide 8
Role Of Image Segmentation In
Image Understanding

[Flow diagram: INPUT IMAGE → PRE-PROCESSING → SEGMENTATION → OBJECT
FORMATION → SPATIAL RELATIONS → SCENE DESCRIPTION]


IIT Bombay Slide 9

Cont…
• The goal of segmentation is to simplify and/or
change the representation of an image into
something that is more meaningful and easier to
analyze.

• Segmentation methods differ greatly in the types of
images for which they are likely to be successful and in
computational cost.

• Segmentation is among the most explored areas in image processing.
IIT Bombay Slide 10

Image Segmentation Approaches


• Image thresholding
• Texture analysis
• Edge/line detection
• Region growing
• Clustering
• Pattern recognition



IIT Bombay Slide 11

Image segmentation by Thresholding


• Thresholding is a popular way to segment an image of a
binary scene.

• Thresholding is based on information contained in the gray
level (GL) histogram of a given image.

• The objective of thresholding approach is to determine


the distributions representing the desired object and the
background, and the threshold point in such a case will
be the valley (lowest frequency point) of the histogram
that, hopefully, best separates the two groups.
IIT Bombay Slide 12

Image thresholding
• The conventional thresholding schemes :
– Global Thresholding
– Local Thresholding
– Dynamic Thresholding



IIT Bombay Slide 13
Thresholding Method

[Figure: gray level histograms illustrating segmentation with a single
threshold vs. multiple thresholds; from Gonzalez & Woods]


IIT Bombay Slide 14

Cont…
g(i,j) = T(f(i,j)) = B if f(i,j) < P
= W otherwise

If several thresholds T1, T2 , ... ,Tk are chosen, then


the output image contains k+1 levels.
g(i,j) = T(f(i,j)) = g1 if 0 ≤ f(.,.) < T1
= g2 if T1 ≤ f(.,.) < T2
= .....
= gk+1 if Tk ≤ f(.,.)



IIT Bombay Slide 15

Global Thresholding
• Identification of a level P such that all pixels in the image
with gray level below P to be represented by a value B
and all pixels with gray level equal to or above P to be
represented by a value W.
• g(i,j) = T(f(i,j)) = B if f(i,j) < P
= W otherwise
• T(.) is the thresholding function.
• This operation applies globally to all pixels in the image
irrespective of their position or the gray levels of their
neighbors. This is performed by a simple table-lookup
operation in real time

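The table-lookup implementation can be sketched in a few lines of
NumPy. This is an illustrative sketch, not code from the lecture; the
names (global_threshold, f, P, B, W) are placeholders:

    import numpy as np

    def global_threshold(f, P, B=0, W=255):
        """Global thresholding as a table lookup: one entry per gray level."""
        lut = np.where(np.arange(256) < P, B, W).astype(np.uint8)  # T(.)
        return lut[f]  # apply the lookup to every pixel at once

    # Example: threshold a random 8-bit image at P = 128
    img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
    print(global_threshold(img, 128))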


IIT Bombay Slide 16
Effect of Threshold Value

[Figure: a 10x10 gray level image thresholded at T = 4.5 and at T = 5.5;
the two binary outputs trace the true object boundary differently]


IIT Bombay Slide 17

Local Thresholding
• Local thresholding varies the threshold based on
the region in which the pixel is located
  B11  B12  B13
  B21  B22  B23
  B31  B32  B33

Each block B(i,j) has a separate threshold P(i,j).
This approach can better adapt to local intensity
variations, but creates discontinuities at the boundaries
of adjacent blocks.
IIT Bombay Slide 18

Local Thresholding Method


– Spatially adaptive thresholding
– Localized processing
[Figure: a 10x10 gray level image split into two halves so that each
half can be thresholded separately]


IIT Bombay Slide 19

Local Thresholding Method


[Figure: spatially adaptive threshold selection; the quadrants of the
image are thresholded separately, with T = 4 or T = 7 chosen per
quadrant]


IIT Bombay Slide 20

Local Thresholding Method


[Figure: the local segmentation results of the quadrants are merged
into a single binary output]


IIT Bombay Slide 21

Dynamic Thresholding
• The threshold is allowed to vary from pixel to
pixel in this case.
• The procedure is similar to local thresholding.
The additional step here is the generation of a
threshold surface by interpolating the threshold
values computed for different blocks.
• The threshold surface provides a threshold for
each pixel of the image.
• This overcomes the limitation of discontinuities
at the block boundaries



IIT Bombay Slide 22
Classical Automatic Thresholding
Algorithm
• Select an initial estimate for T
• Segment the image using T, producing 2 groups: G1, pixels with
value > T, and G2, pixels with value ≤ T
• Compute µ1 and µ2, the average pixel values of G1 and G2
• New threshold: T = ½(µ1 + µ2)
• Repeat steps 2 to 4 until T stabilizes.

• Very easy + very fast
• Assumptions: normal dist. + low noise
• This is the well known ------- algorithm!
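A minimal NumPy sketch of this iterative scheme (illustrative only; it
assumes both groups stay non-empty, and the stopping tolerance tol is
an arbitrary choice):

    import numpy as np

    def iterative_threshold(img, tol=0.5):
        """Classical automatic threshold: iterate T = 1/2 (mu1 + mu2)."""
        T = img.mean()                      # initial estimate
        while True:
            g1 = img[img > T]               # group above the threshold
            g2 = img[img <= T]              # group below (or at) the threshold
            T_new = 0.5 * (g1.mean() + g2.mean())
            if abs(T_new - T) < tol:        # stop when T stabilizes
                return T_new
            T = T_new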


IIT Bombay Slide 23
• Optimal Thresholding is based on the shape of the
current image histogram. Search for valleys, Gaussian
distributions etc.

[Figure: overlapping foreground and background gray level
distributions; the real histogram is their sum, and the optimal
threshold lies in the valley between the two modes]
IIT Bombay Slide 24

Histograms

[Figure: a clearly bimodal histogram that is good for thresholding vs.
a histogram with poorly separated modes that is troublesome]


IIT Bombay Slide 25

Effect of Smoothing

[Figure: a noisy image thresholded directly vs. after median filtering;
smoothing restores a clean valley prior to thresholding]

Original artwork from the book Digital Image Processing by R.C. Gonzalez and R.E. Woods ©
R.C. Gonzalez and R.E. Woods, reproduced with permission granted to instructors by authors on
the website www.imageprocessingplace.com



IIT Bombay Slide 26

Optimal Thresholding
• Histogram shape can be useful in locating the
threshold. However it is not reliable for
threshold selection when peaks are not clearly
resolved
• Optimal thresholding: a criterion function is
devised that yields some measure of
separation between regions
• A criterion function is calculated for each
intensity and that which maximizes/minimizes
this function is chosen as the threshold



IIT Bombay Slide 27

Otsu's Method
• Otsu's thresholding method is based on selecting the
lowest point between two classes (histogram peaks).
• Based on the threshold, the two classes have respective
means and standard deviations:
– m1, m2, s1, s2 respectively; L = total number of levels
– ni: number of pixels in level i; N = total number of pixels
– w0, w1: fractions of pixels below and above the threshold T

  w0 = (1/N) Σi=0..T ni        w1 = (1/N) Σi=T+1..L-1 ni


IIT Bombay Slide 27a

Otsu's Method
• Analysis of variance (variance = standard deviation²)

• Mean of pixels up to threshold T, above threshold T, and overall mean:

  µ0 = (1/N0) Σi=0..T ni·i     µ1 = (1/N1) Σi=T+1..L-1 ni·i     µ = (1/N) Σi=0..L-1 ni·i

• Total variance:

  σ² = (1/N) Σi=0..L-1 (i - µ)²·ni


IIT Bombay Slide 28
Between Class and Within Class
Variance
• The total variance of the image is the sum of the
between class variance and within class
variance
• Both within class variance w2 and between
class variance b2 are dependent on the
threshold selected
• We should minimize w2 or maximize b2 since
their sum is a constant, independent of threshold

GNR602 Lecture 15-21 B. Krishna Mohan


IIT Bombay Slide 29

Otsu's Method
Between-class variance (σb²): the variation of the
mean values for each class from the overall intensity
mean of all pixels:

  σb² = w0·(µ0 - µ)² + w1·(µ1 - µ)²

Substituting µ = w0·µ0 + w1·µ1, we get:

  σb² = w0·w1·(µ1 - µ0)²

w0, w1, µ0, µ1 stand for the relative frequencies
and mean values of the two classes, respectively.


IIT Bombay Slide 30

Otsu’s Method
• The criterion function relates the between-class
variance to the total variance and is defined as:

  η(T) = σb² / σ²

• Since σb² is a function of the threshold T, η(T) is
evaluated for all possible thresholds, and the one
that maximizes η is chosen as the optimal
threshold


IIT Bombay Slide 31

Otsu’s Method
• The within cluster variance need not be
separately minimized, since maximizing
between cluster variance automatically
minimizes within cluster variance. This is
because the sum of between cluster variance
and within cluster variance equals the total
variance of the image, which is independent of
the threshold chosen



IIT Bombay Slide 32

Finding the Threshold


• A sequential search can be carried out over
thresholds T = 1 to 254; the value at which η is
maximum is the optimal threshold.
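A sketch of this sequential search for Otsu's threshold, assuming an
8-bit image (function and variable names are illustrative):

    import numpy as np

    def otsu_threshold(img, L=256):
        """Return the T that maximizes the between-class variance."""
        hist = np.bincount(img.ravel(), minlength=L).astype(float)
        p = hist / hist.sum()                      # normalized histogram
        levels = np.arange(L)
        best_T, best_sigma_b = 0, -1.0
        for T in range(1, L - 1):                  # T = 1 .. 254
            w0, w1 = p[:T + 1].sum(), p[T + 1:].sum()
            if w0 == 0 or w1 == 0:
                continue
            mu0 = (levels[:T + 1] * p[:T + 1]).sum() / w0
            mu1 = (levels[T + 1:] * p[T + 1:]).sum() / w1
            sigma_b = w0 * w1 * (mu1 - mu0) ** 2   # between-class variance
            if sigma_b > best_sigma_b:
                best_T, best_sigma_b = T, sigma_b
        return best_T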


IIT Bombay Slide 33

Input Image



IIT Bombay Slide 34

2 thresholds
3 display
levels
0, 127, 254



IIT Bombay Slide 35

6 thresholds
7 display
levels
0, 42, 84, 126,
168, 210, 252



IIT Bombay Slide 36

Entropy Method
• Entropy serves as a measure of information
content
• A threshold level t separates the whole
information into two classes, and the entropies
associated with them are:

  Hb = -Σi=0..t pi·log(pi)        Hw = -Σi=t+1..255 pi·log(pi)

• The optimal threshold is the one that maximizes:

  H = Hb + Hw

• Motivation: equi-probable gray levels in the below-
threshold and above-threshold parts
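A sketch of the entropy criterion. Note one detail not fixed by the
slide: the common Kapur-style variant implemented here normalizes each
class distribution before computing its entropy, which is one way to
realize the equi-probability motivation above:

    import numpy as np

    def entropy_threshold(img, L=256):
        """Pick the t that maximizes H = Hb + Hw."""
        hist = np.bincount(img.ravel(), minlength=L).astype(float)
        p = hist / hist.sum()
        best_t, best_H = 0, -np.inf
        for t in range(1, L - 1):
            pb, pw = p[:t + 1], p[t + 1:]
            Pb, Pw = pb.sum(), pw.sum()
            if Pb == 0 or Pw == 0:
                continue
            qb = pb[pb > 0] / Pb            # class-normalized, zeros skipped
            qw = pw[pw > 0] / Pw
            H = -(qb * np.log(qb)).sum() - (qw * np.log(qw)).sum()
            if H > best_H:
                best_t, best_H = t, H
        return best_t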
IIT Bombay Slide 37
Minimum Error Thresholding
• Proposed by Josef Kittler and John
Illingworth, University of Surrey, UK
– Let us consider an image whose pixels
assume grey level values g in the interval
[0,n]
– Histogram given by h(g), normalized
histogram by p(g) = h(g)/n
– The histogram is an estimate of p(g), the PDF
of a mixture population
IIT Bombay Slide 38
Minimum Error Thresholding
• Let us consider that the two populations can be modeled
as Gaussian distributions:

  p(g) = Σi Pi·p(g|i), where

  p(g|i) = (1/√(2πσi²)) · exp( -(g - µi)² / (2σi²) )

To avoid confusion, we write p(g|i) as p(g|i,T) since classes 1
and 2 are formed according to the threshold T chosen in our
context.
IIT Bombay Slide 39

Minimum Error Thresholding


• Suppose the grey level data is thresholded at
some level T and each of the two resulting pixel
populations is approximated by a normal
density h(g|i,T) with parameters µi(T), σi(T), and
a priori probability Pi(T). For a single threshold,
we can compute h(g|1,T) and h(g|2,T) with
parameters µ1(T), µ2(T), σ1(T), σ2(T)
respectively.
• J. Kittler, J. Illingworth, "Minimum Error
Thresholding", Pattern Recognition, vol. 19, no.
1, pp. 41-47, 1986
IIT Bombay Slide 40
Minimum Error Thresholding

  P1(T) = Σi=0..T h(i)
  P2(T) = Σi=T+1..n h(i)

  µ1(T) = [ Σi=0..T i·h(i) ] / P1(T)
  µ2(T) = [ Σi=T+1..n i·h(i) ] / P2(T)
IIT Bombay Slide 41
Minimum Error Thresholding
• The probability of a pixel being mapped correctly
(below threshold or above threshold) is denoted
by P(g,T) and given by

  P(g,T) = p(g|i,T)·Pi(T) / p(g)

where i = 1 if g ≤ T, and i = 2 if g > T.
Substituting the expressions for the individual class-conditional
distributions, taking logarithms, and multiplying by -2, we can
rewrite P(g,T).
IIT Bombay Slide 42
Minimum Error Thresholding

  P(g,T) = ( (g - µi(T)) / σi(T) )² + 2·log(σi(T)) - 2·log(Pi(T))

where i = 1 if g ≤ T, and i = 2 if g > T (verify!).
The average performance for the whole image is given by

  J(T) = Σg p(g)·P(g,T)

For a given threshold T, this function indicates the extent of overlap
between the two classes according to the threshold selected.
IIT Bombay Slide 43
Minimum Error Thresholding
• Expanding the previous equation,

  J(T) = Σg≤T p(g)·[ ((g - µ1(T))/σ1(T))² + 2·log(σ1(T)) - 2·log(P1(T)) ]
       + Σg>T p(g)·[ ((g - µ2(T))/σ2(T))² + 2·log(σ2(T)) - 2·log(P2(T)) ]

Substituting the previous equations in the above, we can write

  J(T) = 1 + 2·[ P1(T)·log(σ1(T)) + P2(T)·log(σ2(T)) ]
           - 2·[ P1(T)·log(P1(T)) + P2(T)·log(P2(T)) ]
IIT Bombay Slide 44

Minimum Error Thresholding


• J(T) is evaluated for each possible threshold and the
threshold that minimizes J(T) is the optimum choice

• The difference between Otsu's method and this method is
that here the histogram is assumed to be a mixture of
two Gaussian distributions, one for the object and the other
for the background. Threshold selection is based on MLE

• In Otsu's method this assumption is not made; the inter-
class separation is expressed in terms of the difference of
the means of object and background
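A sketch of minimizing the Kittler-Illingworth criterion J(T) over all
candidate thresholds, assuming an 8-bit image (names are illustrative):

    import numpy as np

    def min_error_threshold(img, L=256):
        """Return the T minimizing J(T) for a two-Gaussian mixture model."""
        h = np.bincount(img.ravel(), minlength=L).astype(float)
        h /= h.sum()                            # normalized histogram p(g)
        g = np.arange(L)
        best_T, best_J = 0, np.inf
        for T in range(1, L - 1):
            P1, P2 = h[:T + 1].sum(), h[T + 1:].sum()
            if P1 <= 0 or P2 <= 0:
                continue
            mu1 = (g[:T + 1] * h[:T + 1]).sum() / P1
            mu2 = (g[T + 1:] * h[T + 1:]).sum() / P2
            var1 = ((g[:T + 1] - mu1) ** 2 * h[:T + 1]).sum() / P1
            var2 = ((g[T + 1:] - mu2) ** 2 * h[T + 1:]).sum() / P2
            if var1 <= 0 or var2 <= 0:
                continue                        # degenerate class, skip
            J = 1 + 2 * (P1 * np.log(np.sqrt(var1)) + P2 * np.log(np.sqrt(var2))) \
                  - 2 * (P1 * np.log(P1) + P2 * np.log(P2))
            if J < best_J:
                best_T, best_J = T, J
        return best_T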
Segmentation by Edge
Detection
IIT Bombay Slide 45

What is an Edge ?
• An edge is a discontinuity in the perceptual property –
brightness / color / texture / surface orientation
• An edge is a set of connected pixels that lie on the
boundary between two regions
• The pixels on an edge are called edge pixels or edgels
• Gray level / color / texture discontinuity across an edge
causes edge perception
• Position & orientation of edge are key properties



IIT Bombay Slide 46

Different Edges

Different colors Different brightness Different textures Different surfaces



IIT Bombay Slide 47

Cont…
• The boundaries of an object are often considered to be
analogous to its edges.
• These boundaries are discovered by following a path of
rapid change in image intensity.
• Most edge-detection functions look for places in the
image where the intensity changes rapidly by locating
places where the first derivative of the intensity is larger
in magnitude than some threshold, or finding places
where the second derivative of the intensity has a zero
crossing



IIT Bombay Slide 48

Types of Edge

Step Edges

Roof Edge Line Edges



IIT Bombay Slide 49

Different
kinds of
edges



IIT Bombay Slide 50

Real Edges

Noisy and Discrete!

We want an Edge Operator that produces:


– Edge Magnitude
– Edge Orientation
– High Detection Rate and Good Localization



IIT Bombay Slide 51

• The gradient points in the direction of most rapid change
in intensity:

  ∇I = ( ∂I/∂x , ∂I/∂y )

• The gradient direction is given by:

  θ = tan⁻¹( (∂I/∂y) / (∂I/∂x) )

• The edge strength is given by the gradient magnitude:

  ||∇I|| = √( (∂I/∂x)² + (∂I/∂y)² )


IIT Bombay Slide 52
Discrete Edge Operators
How can we differentiate a discrete image?
Finite difference approximations (over a 2x2 neighborhood):

  ∂I/∂x ≈ (1/2) [ (I(i+1,j+1) - I(i,j+1)) + (I(i+1,j) - I(i,j)) ]
  ∂I/∂y ≈ (1/2) [ (I(i+1,j+1) - I(i+1,j)) + (I(i,j+1) - I(i,j)) ]

Convolution masks:

  ∂I/∂x:  (1/2) | -1  1 |        ∂I/∂y:  (1/2) |  1  1 |
                | -1  1 |                      | -1 -1 |


IIT Bombay Slide 53
• Second order partial derivatives:

  ∂²I/∂x² ≈ I(i-1,j) - 2·I(i,j) + I(i+1,j)
  ∂²I/∂y² ≈ I(i,j-1) - 2·I(i,j) + I(i,j+1)

• Laplacian:

  ∇²I = ∂²I/∂x² + ∂²I/∂y²

Convolution masks:

          | 0  1  0 |                    | 1   4   1 |
  ∇²I ≈   | 1 -4  1 |    or    (1/20) ·  | 4 -20   4 |   (more accurate)
          | 0  1  0 |                    | 1   4   1 |


IIT Bombay Slide 54

Sobel operators
Better approximations of the gradients exist

The Sobel operators below are commonly used


-1 0 1 1 2 1
-2 0 2 0 0 0
-1 0 1 -1 -2 -1

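A sketch of applying the Sobel masks with a small hand-rolled "valid"
correlation (illustrative; production code would use a library
convolution routine):

    import numpy as np

    def correlate2d(img, k):
        """Tiny 'valid' 2-D correlation, enough for a 3x3 edge mask."""
        H, W = img.shape
        kh, kw = k.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
        return out

    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    sobel_y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], float)

    def sobel_gradient(img):
        gx = correlate2d(img.astype(float), sobel_x)
        gy = correlate2d(img.astype(float), sobel_y)
        return np.hypot(gx, gy), np.arctan2(gy, gx)   # magnitude, orientation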


IIT Bombay Slide 55
Comparing Edge Operators

Roberts (2 x 2):                       Good localization
   0  1      1  0                      Noise sensitive
  -1  0      0 -1                      Poor detection

Prewitt (3 x 3):
  -1  0  1      1  1  1
  -1  0  1      0  0  0
  -1  0  1     -1 -1 -1

Sobel (5 x 5):                         Poor localization
  -1 -2  0  2  1      1  2  3  2  1    Less noise sensitive
  -2 -3  0  3  2      2  3  5  3  2    Good detection
  -3 -5  0  5  3      0  0  0  0  0
  -2 -3  0  3  2     -2 -3 -5 -3 -2
  -1 -2  0  2  1     -1 -2 -3 -2 -1


IIT Bombay Slide 56
Effects of noise
• Consider a single row or column of the image
– Plotting intensity as a function of position gives a
signal

Where is the edge?


IIT Bombay Slide 57
Solution: Smooth first
Where is the edge?

Look for peaks in the derivative d/dx (f * g) of the smoothed signal
IIT Bombay Slide 58
Derivative theorem of convolution

  d/dx (f * g) = f * (d/dx g)

This saves us one operation: convolve f directly with the derivative
of the Gaussian.


IIT Bombay Slide 59

The Laplacian Operator

The Laplacian is a second derivative operator; its discrete
form approximates the operator appearing in the Laplace/heat
equation.
The Laplacian operator combines the second
order derivatives as follows:

  ∇²f(x,y) = ∂²f(x,y)/∂x² + ∂²f(x,y)/∂y²


IIT Bombay Slide 60

Cont…
Common Laplacian kernels are the 3x3 masks shown earlier.

Higher derivative operators amplify noise. Thus, the Laplace
operator output is much noisier than that of first derivative
operators such as Sobel, Prewitt etc.


IIT Bombay Slide 61

Laplacian of Gaussian
• The noise in the input image is reduced by smoothing.
• Among the various smoothing operators, Gaussian filter
has desirable properties in terms of space-frequency
localization.
• The input image is therefore smoothed using the Gaussian-
shaped smoothing operator, whose width σ is user
controllable.
• In this approach, the image is first convolved with the
Gaussian filter, and the Laplacian of the result is taken:

  g(x,y) = ∇²[ G(x,y,σ) * f(x,y) ]
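A sketch of sampling the LoG kernel directly, so a single convolution
performs both the Gaussian smoothing and the Laplacian (the kernel
radius of 3σ is a common rule of thumb, not something fixed by the
lecture):

    import numpy as np

    def log_kernel(sigma):
        """Sampled Laplacian-of-Gaussian kernel (unnormalized)."""
        r = int(np.ceil(3 * sigma))                  # cover +/- 3 sigma
        x, y = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1))
        s2 = sigma ** 2
        k = ((x**2 + y**2 - 2 * s2) / s2**2) * np.exp(-(x**2 + y**2) / (2 * s2))
        return k - k.mean()          # force zero response on constant regions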


IIT Bombay Slide 62

Laplacian of Gaussian
[Figure: a noisy 1-D signal; after filtering with the LoG operator the
edge shows up as a zero-crossing of the response]
IIT Bombay Slide 63
2D edge detection filters

[Figure: surface plots of the Gaussian G, the derivative of Gaussian,
and the Laplacian of Gaussian ∇²G, where ∇² is the Laplacian operator]


IIT Bombay Slide 64

Cont…
• The order of performing differentiation and convolution
can be interchanged because of the linearity of the
operators involved:

  g(x,y) = [ ∇²G(x,y) ] * f(x,y)


IIT Bombay Slide 65

LoG - Guidelines for Use


• The LoG operator calculates the second spatial
derivative of an image. This means that in areas where
the image has a constant intensity (i.e. where the
intensity gradient is zero), the LoG response will be zero.
• In the vicinity of a change in intensity, however, the LoG
response will be positive on the darker side, and
negative on the lighter side. This means that at a
reasonably sharp edge between two regions of uniform
but different intensities, the LoG response will be:



IIT Bombay Slide 66

Cont…
• zero at a long distance from the edge,
• positive just to one side of the edge,
• negative just to the other side of the edge,
• zero at some point in between, on the edge itself.



IIT Bombay Slide 67

Cont…

• Response of a 1-D LoG filter to a step edge. The left hand
graph shows a 1-D image, 200 pixels long, containing a
step edge. The right hand graph shows the response of
a 1-D LoG filter with Gaussian σ = 3 pixels.


IIT Bombay Slide 68

Applications of LoG
• Spurious edges may be detected away from any obvious edge;
hence, increase the smoothing of the Gaussian to
preserve only strong edges.
• Compute the gradient of the LoG at the zero crossing (i.e. the
third derivative of the original image) and only keep zero
crossings where this is above a certain threshold. This
will tend to retain only the stronger edges, but it is
sensitive to noise, since the third derivative greatly
amplifies any high frequency noise in the image.


IIT Bombay Slide 69

Coarse to Fine Tracking


• Useful to track edges at a variety of scales
• Edge detection at fine scale picks up a lot of noisy /
minor texture edges
• Edge detection at coarse scale only picks up broad
edges, missing out fine detail
• Option to get edges of varying detail, without noise is to
track edges from coarse to fine scale
• This method is also referred to as edge focusing



IIT Bombay Slide 70
Edge Focusing
• Start with zero crossings at a coarse scale, say
at sigma = 5
• Record the positions of the edge pixels at the
coarse scale
• Reduce sigma by a step size s (e.g. 0.5)
• Repeat the zero crossing detection
• Retain only those zero crossings at the smaller
sigma level that are in the neighborhood of the
zero crossings at the coarse scale
IIT Bombay Slide 71

Cont…
• Reduce current sigma by s again
• Repeat the process till the desired scale of sigma is
reached
Advantage :
• The degree of detail can be controlled by the choice of
starting sigma and step size.
• Computational load can be reduced by limiting the zero
crossing detection at smaller sigma values to those
locations that are in the neighborhood of those detected
at the previous level



IIT Bombay Slide 72

Refer
• F. Bergholm’s paper “Edge Focusing” in IEEE
Trans. Pattern Analysis and Machine
Intelligence, November 1987.



IIT Bombay Slide 73

The Canny Edge Detector


The mask we want to use for edge detection should have
certain desirable characteristics, called Canny’s criteria:
1. Low probability of missing a genuine edge
2. Good locality, i.e. the edge should be detected where it
actually is.
3. Single response for an edge
4. Low probability of false alarm



IIT Bombay Slide 74

Cont…
• There are four steps:

1. Smoothing: using a Gaussian smoothing operator
2. Gradient
3. Non-Maximum Suppression
4. Hysteresis Threshold
IIT Bombay Slide 75

Smoothing
• For the smoothing step, a Gaussian LPF is used.
• The standard deviation σ determines the width of the filter
and hence the amount of smoothing.

  S(x,y) = G(x,y,σ) * f(x,y)

• Let f(x,y) denote the input image. The result of convolving
the image with the Gaussian smoothing filter using separable
filtering is an array of smoothed data S(x,y).
• σ sets the spread of the Gaussian and controls the
degree of smoothing.


IIT Bombay Slide 76

Cont…
The edge enhancement step simply involves
calculation of the gradient vector at each pixel of
the smoothed image:

  gx(x,y) = S(x+1,y) - S(x-1,y)        gy(x,y) = S(x,y+1) - S(x,y-1)

  Magnitude(x,y) = g = √(gx² + gy²)    θ = tan⁻¹( gy / gx )
IIT Bombay Slide 77

Gradient
• At each point convolve with

  Gx = | -1  1 |        Gy = |  1  1 |
       | -1  1 |             | -1 -1 |

• Magnitude and orientation of the gradient are computed as

  M[i,j] = √( P[i,j]² + Q[i,j]² )

  θ[i,j] = arctan( Q[i,j], P[i,j] )

• Avoid floating point arithmetic for fast computation


IIT Bombay Slide 78

NMS
• The localization step has two stages: non-maximal
suppression and hysteresis thresholding

Non-Maximal Suppression (NMS) thins the ridges of gradient
magnitude in the magnitude image by suppressing all values
along the line of the gradient that are not peak values of
a ridge.

[Figure: gradient orientations (0°-360°) quantized into sectors]

  ξ = Sector[ θ(x,y) ]
IIT Bombay Slide 79

NMS
• Search in a 3x3 neighborhood at every point in magnitude
image
• Consider the gradient direction at the centre pixel
• Compare the edge magnitude at centre pixel with its two
neighbors along the gradient direction.
• If the magnitude value at the center pixel is not greater than
both of the neighbor magnitudes along the gradient direction,
then g(x,y) is set to zero. The values for the height of the
ridge are retained in the NMS magnitude image:

  suppressed image = nms[ g(x,y), ξ ]


IIT Bombay Slide 80

Non-Maxima Suppression
• After nonmaxima suppression one ends up with
an image which is zero everywhere except the
local maxima points.
• Thin edges by keeping large values of Gradient



IIT Bombay Slide 81

Principle of NMS
• Thin the broad ridges in M[i,j] into ridges that are only
one pixel wide
• Find local maxima in M[i,j] by suppressing all values
along the line of the Gradient that are not peak values of
the ridge
[Figure: a gradient magnitude array; suppression along the gradient
direction leaves one-pixel-wide ridges, but the result still contains
false edges and gaps]
IIT Bombay Slide 82

Gradient Orientation
• Reduce angle of Gradient
θ[i,j] to one of the 4 sectors
• Check the 3x3 region of each
M[i,j]
• If the value at the center is
not greater than the 2 values
along the gradient, then M[i,j]
is set to 0



IIT Bombay Slide 83
The suppressed magnitude image will contain many false
edges caused by noise or fine texture.

[Figure: suppressed magnitude array with spurious nonzero
responses marked as false edges]
IIT Bombay Slide 84

[Figure: NMS output partitioned into definite edges (strong responses),
definite non-edges (zeros), and intermediate values still to be
considered]


IIT Bombay Slide 85

Thresholding
• Reduce number of false edges by applying a
threshold T
– all values below T are changed to 0
– selecting a good value for T is difficult
– some false edges will remain if T is too low
– some edges will disappear if T is too high
– some edges will disappear due to softening of
the edge contrast by shadows



IIT Bombay Slide 86

Double Thresholding
• Apply two thresholds in the suppressed image
– T2 > T1
– two images in the output
– the image from T2 contains fewer edges but has gaps
in the contours
– the image from T1 has many false edges
– combine the results from T1 and T2
– link the edges of T2 into contours until we reach a gap
– link the edge from T2 with edge pixels from a T1
contour until a T2 edge is found again
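A sketch of the double-thresholding (hysteresis) step: weak responses
above T1 survive only if they connect, through an 8-connected path, to
a strong response above T2 (names are illustrative):

    import numpy as np
    from collections import deque

    def hysteresis(mag, T1, T2):
        """Link weak edges (> T1) to strong edges (> T2) by region growing."""
        strong, weak = mag > T2, mag > T1
        out = strong.copy()
        q = deque(zip(*np.nonzero(strong)))      # start from strong pixels
        H, W = mag.shape
        while q:
            i, j = q.popleft()
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W and weak[ni, nj] and not out[ni, nj]:
                        out[ni, nj] = True       # weak pixel joins the contour
                        q.append((ni, nj))
        return out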
Input Image
Canny Output s=1, T1=0.2, T2=0.5
Canny Output s=1, T1=0.4, T2=0.7
Input Image
Canny Output s=2, T1=0.2, T2=0.5
Canny Output s=2, T1=0.4, T2=0.7
Input Image
Canny Output s=5, T1=0.2, T2=0.5
Canny Output s=5, T1=0.4, T2=0.7
IIT Bombay Slide 87

Color Edge Detection


• Color images present a challenge in detecting
edges due to higher dimensionality
• A simple approach to detecting edges based on
color differences:
– Apply RGB-HSI transform
– Apply conventional edge detectors on Hue
component



IIT Bombay Slide 88

Color Edge Detection


• Approach 2:
• Detect edges independently in Hue, Saturation
and Intensity components
• Compute union of all the edges detected
• This will ensure that whether the edge is caused
due to color, intensity or saturation it will still be
detected



IIT Bombay Slide 89

Color Edge Detection


• Approach 3:
• Compute the gradient in hue, saturation and
intensity components
• Compute the edge gradient magnitude as the
maximum of the three components



Input Image
PC1 of Multiband Image
PC1 Canny Output s=1, T1=0.2, T2=0.5
Hue Image
Canny Output s=1, T1=0.2, T2=0.5
IIT Bombay Slide 90
Mathematical Morphology Approach
• Color gradient
• This can be computed in terms of
morphological dilation and erosion on
vectors.
• e(X) = dil(X,S) – ero(X,S)
• dil(X,S) is dilation of X by a structuring
element S, usually a 3x3 box
• ero(X,S) is erosion of X by a structuring
element S, usually a 3x3 box
IIT Bombay Slide 91
Mathematical Morphology Approach
• Gray scale dilation depends on local
maximum and erosion depends on local
minimum. In case of color (i.e., 3-element
vectors), what is maximum and what is
minimum? In other words, how are the
vectors ordered in increasing or
decreasing order?



IIT Bombay Slide 93
Computing Vector Dilation
• Consider n vectors in a neighborhood,
written as X = [X1, X2, …, Xn]T, that is,
• X is an n x 3 matrix, each row a color vector
of size 3, having elements (ri, gi, bi).
• Then Max(X) ≅ [maxi(ri), maxi(gi), maxi(bi)]T
– Similarly Min(X) is also defined
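A sketch of the color morphological gradient e(X) = dil(X,S) - ero(X,S)
using the component-wise (marginal) Max/Min ordering defined above and
a 3x3 box as structuring element:

    import numpy as np

    def color_morph_gradient(img):
        """Per-channel 3x3 dilation minus erosion, reduced to a scalar."""
        H, W, C = img.shape
        pad = np.pad(img.astype(float), ((1, 1), (1, 1), (0, 0)), mode='edge')
        dil = np.full((H, W, C), -np.inf)
        ero = np.full((H, W, C), np.inf)
        for di in range(3):                      # scan the 3x3 neighborhood
            for dj in range(3):
                win = pad[di:di + H, dj:dj + W, :]
                dil = np.maximum(dil, win)       # component-wise local max
                ero = np.minimum(ero, win)       # component-wise local min
        return np.linalg.norm(dil - ero, axis=2) # vector difference -> magnitude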


IIT Bombay Slide 94



Region Segmentation
IIT Bombay Slide 105
[Figure: comparison of high spatial resolution and high spectral
resolution imagery]


IIT Bombay Slide 106

IRS-1C LISS3
23.5m



IIT Bombay Slide 107

IRSP6 LISS4 Image


IIT Bombay Slide 108



IIT Bombay Slide 109

AIRBORNE
NATURAL
COLOR
COMPOSITE



IIT Bombay Slide 110
[Figure: comparison of high spatial resolution and high spectral
resolution imagery]


IIT Bombay Slide 111

Hyperspectral Image Window and Spectrum of Vegetation near Powai Lake


IIT Bombay Slide 112

Evolution of Segmentation
Techniques
• Pixel based classification using spectral features
(Landsat, IRS 1C/1D, SPOT 1,2,3)
• Pixel based classification using spectral and
textural features (IRS P6, SPOT 5, …)
• Object based classification using spatial
features, spectral and textural features



IIT Bombay Slide 113
High Resolution Images
• High resolution images are information
rich
– Spatial information
– Multispectral information
– Textural information
• Image can be viewed as a collection of
objects with spatial relationships



IIT Bombay Slide 114
General Methodology
• Smooth image (if needed)
• Segment the image into regions
– Find seed points to start region growing
• Find gradient
• Find gradient minima
– Grow regions from seed points
• Use suitable algorithm to start looking for similar neighbors such
that we can grow from seed points to regions
• Compute features for each region
– Shape
– Textural
– Contextual
– Spectral
• Classify regions
IIT Bombay Slide 115

Generic High Spatial Resolution Image Analysis Framework

[Flow diagram: Pre-process → Decompose image → Segment image at
different resolutions (or single stage segmentation) → Link segments →
Connected component labeling → Spatial / Spectral / Texture / Context
features → Object specific classification or general purpose LU/LC
classification]


IIT Bombay Slide 116

Step 1

Preprocessor
– Image smoothing
– Suppress noise / eliminate minute detail that
is not of interest
– Adaptive Gaussian / Median
– Very useful to produce proper segmentation
– Optional step
IIT Bombay Slide 117

Preprocessing
• Mean Filter

• Gaussian Filter:

  G(x,y) = (1/(2πσ²)) · exp( -(x² + y²) / (2σ²) )


IIT Bombay Slide 118
Preprocessing
• Median Filter

• (close-open) alternating sequential morphological filter:

  Cp Op Cp-1 Op-1 … C1 O1 (I)
IIT Bombay Slide 119

Adaptive Smoothing Algorithm

• Compute the image gradient at a pixel (x,y) as

  ∇I^t(x,y) = ( ∂I^t(x,y)/∂x , ∂I^t(x,y)/∂y )^T = (Gx, Gy)^T

• where the superscript t in I^t stands for the iteration index.
• I^0(x,y) is the original image
IIT Bombay Slide 120
Algorithm
• Since the weights assigned to
neighboring pixels decrease with the
magnitude of the gradient, we consider

  w(x,y) = exp( -|d^t(x,y)|² / (2k²) )

where d^t(x,y) is the gradient, with magnitude

  |d^t(x,y)| = √( Gx² + Gy² )
IIT Bombay Slide 121
Algorithm
• The parameter k is user-specified, and the
rate of decrease in neighbor weights with
distance from the centre pixel (x,y)
depends on the value of k. If k is small,
then the centre pixel is given large
weightage, and hence little smoothing. If k
is large, then neighbors get more
weightage, and hence more smoothing.



IIT Bombay Slide 122
Algorithm
• The smoothed signal is given by

  I^(t+1)(x,y) = (1/N^t) Σi=-1..1 Σj=-1..1 I^t(x+i, y+j) · w^t(x+i, y+j)

  N^t = Σi=-1..1 Σj=-1..1 w^t(x+i, y+j)

• This is obviously space-variant filtering
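A sketch of this adaptive smoothing iteration (the value of k, the
iteration count, and the use of np.gradient for Gx, Gy are illustrative
choices):

    import numpy as np

    def adaptive_smooth(img, k=10.0, iters=5):
        """Edge-preserving smoothing with gradient-dependent weights."""
        I = img.astype(float)
        for _ in range(iters):
            gy, gx = np.gradient(I)                    # central differences
            w = np.exp(-(gx**2 + gy**2) / (2 * k**2))  # w(x,y)
            Pw = np.pad(w, 1, mode='edge')
            PI = np.pad(I * w, 1, mode='edge')
            num = np.zeros_like(I)
            den = np.zeros_like(I)
            for di in range(3):                        # 3x3 weighted average
                for dj in range(3):
                    num += PI[di:di + I.shape[0], dj:dj + I.shape[1]]
                    den += Pw[di:di + I.shape[0], dj:dj + I.shape[1]]
            I = num / den                              # division by N^t
        return I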


IIT Bombay Slide 123
Performance of the algorithm
• In practice, within a few iterations, the
edges become sharp, with interior
details of regions smoothed
• It+1(x,y) = It(x,y), happens after many
iterations
• The process is terminated after a few
iterations when the smoothing is
perceived to be adequate
IIT Bombay Slide 124
Performance of the Algorithm
• The weights assigned to the neighbors
w decrease as the gradient magnitude d
increases.
• When there are discontinuities, the
neighboring pixels are assigned small
weights, resulting in edge preserving
smoothing
• Micro-textures are filtered out
IIT Bombay Slide 125

Illustration of the working of edge-preserving smoothing


IIT Bombay Slide 126
Region Based Image Analysis
• Images contain spatially contiguous blocks
of similar pixels or regions
• A region corresponds to an object or part
of a real world object
• Region has spatial, spectral and shape
properties
• Regions have connectivity to adjacent
regions



IIT Bombay Slide 127

How is region based image


analysis performed?
• We assume that the images we consider
are of high spatial resolution to employ
region based methods
• e.g.
– Quickbird, Ikonos, GeoEye, Cartosat2
– Aerial imagery
IIT Bombay Slide 128
Issues with high resolution
imagery
• Advantages
– High spatial resolution
– Individual objects can be extracted and
counted
• Limitations
– Large data volume
– Some of the objects may be irrelevant! e.g.,
small repair patches on roads, potholes
IIT Bombay Slide 129
Region Based Approaches
• Region Growing
– Start with individual pixels
– Add neighbors with similar attributes to form
regions
– If neighbor is very different, do not include
This is a bottom-up scheme, starting with
pixels, going up to form regions



IIT Bombay Slide 130
Region Based Approaches
• Region Splitting Approaches
– Consider entire image as a single region
– Split region if homogeneity criterion fails
– Now consider each region separately and
repeat the process recursively
This is a top-down process, starting from full
image, towards small regions



IIT Bombay Slide 131
Region Based Approaches
• Hybrid Approaches
– Initially split using top-down methods
– Small regions that are spatially adjacent and
having similar properties are merged to form
larger regions
This is the well known strategy of split-and-
merge



IIT Bombay Slide 132
Example
• Split image using a quadtree hierarchy
• Continue till the leaf nodes are small
• Consider the leaf nodes that correspond to
spatially adjacent parts of the image
• Compare means and variances
• If similar, merge
Final result is no longer a QuadTree!



IIT Bombay Slide 133

Quadtree Based Split and Merge


IIT Bombay Slide 134

Tree structure

[Figure: a quadtree node with four children (NW, NE, SW, SE) covering
the corresponding quadrants of the image]

Each node stores: node type (root / leaf / intermediate),
size, mean, texture.


IIT Bombay Slide 135

Splitting of an image

4 7 8 8 20 22 22 22

10 8 8 8 21 23 22 22

12 12 10 10 26 25 40 50

12 12 14 16 27 28 45 48

60 65 105 110 90 90 200 205

70 72 116 120 90 90 210 215

100 120 180 190 100 100 70 75

122 127 29 30 100 100 80 85


IIT Bombay Slide 136
Merging process

[Figure: the same 8x8 image after merging; spatially adjacent quadrants
with similar statistics are combined and represented by their region
means]
IIT Bombay Slide 137
Quadtree Segmentation with Merging



IIT Bombay Slide 138
Comments
• Region shapes can be jagged, due to the
shape of the sub-images spanned by each
of the quadrants and sub-quadrants
• Splitting can be fast, but merging will be
slow



IIT Bombay Slide 139
Region Growing Methods
• Issues
– Identification of seed pixels from where to
start region growing
– Different seed pixels can result in different
final results – sequential procedure!
– Alternatively, generate multiple seed pixels or
micro-regions and grow regions in parallel



IIT Bombay Slide 140
Segmentation Algorithms
• Region Based
– Seeded Region Growing (SRG)
– Quadtree based split and merge
• Edge Based
– Sobel
– Laplacian
– Canny’s
• Hybrid
– Morphological Watershed transform



IIT Bombay Slide 141
Calculation of Seed Pixels
• For simple intensity images, one can
choose the very bright and very dark
pixels to start region growing
• In other words, one can compute the
white top-hat and black top-hat
transformations and use the resulting
pixels as seeds
• One can optionally apply a threshold to
limit the number of seed pixels
IIT Bombay Slide 142

Calculation of Seed Pixels


• Optionally, one can compute gradient
image, and pick pixels that are gradient
minima.
• This is applicable since region interiors are
where the intensity gradient is minimum.
Gradient maxima occur near region
boundaries.



IIT Bombay Slide 143
Calculation of Seed Pixels
• In case of color images or
multidimensional data?
• Option 1: Compute PC1 of the higher
dimensional data, and use gradient
minima or locally bright and dark pixels
as seeds.



IIT Bombay Slide 144
Calculation of Seed Pixels
• Option 2: Color gradient
• Compute color gradient and locate local
gradient minima pixels which will be
used as seed pixels.



IIT Bombay Slide 145
Calculation of Seed Pixels
• Apply clustering and pick pixels whose
feature vectors are very close to the
cluster mean vectors. Usually these will be
well inside the regions and can be used as
seed pixels



IIT Bombay Slide 146
Seed Pixel Selection
• Given the cluster centres Ci, the seed pixels
are those whose feature vectors are within a
user specified distance (in the feature space)
from the cluster centres.
• X is a seed pixel if
• ||X – Ci||p <= ε
• The cluster centres normally coincide with
region interiors.



IIT Bombay Slide 147
Seeded Region Growing
• Consider the image being decomposed into a
collection of regions Ai.
• Initially the regions Ai would start with one or a
small number of pixels based on the initialization
strategy
• Starting with the seed pixels above, the
remaining pixels x are assigned to different
regions based on the criteria chosen
Adams and Bischof, IEEE T-PAMI, vol. 16, no.6,
pp.641-647
IIT Bombay Slide 148
Seeded Region Growing
• Label seed points according to their initial grouping.
• The set of unassigned pixels bordering at least one region is

  T = { x ∉ ∪i=1..n Ai | N(x) ∩ ∪i=1..n Ai ≠ ∅ }

where N(x) is the set of immediate neighbors of x. Compute

  δ(x) = | g(x) - mean over y in Ai of g(y) |

to determine whether x can be added to region Ai.
A small value of δ(x) means high affinity of x to region Ai.


IIT Bombay Slide 149



IIT Bombay Slide 150
Algorithm
• If the neighbors of an unassigned pixel meet
more than one region, then we consider

  δ(x) = mini | g(x) - mean[ g(yi) ] |

• where yi is a pixel belonging to region i.


IIT Bombay Slide 151

Algorithm
• Alternatively, x can be labeled as a
boundary pixel, and we append a pixel z to
region Ai such that

  δ(z) = minx δ(x)

• This constitutes one iteration, which is
repeated till all pixels in the image are
allocated to one region or the other.
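A sketch of seeded region growing with a priority queue, so that the
unassigned boundary pixel with the smallest δ is always appended first.
The seeds argument, 4-connectivity, and the fact that δ is frozen at
queueing time (rather than re-evaluated) are simplifying, illustrative
choices:

    import heapq
    import numpy as np

    def seeded_region_growing(img, seeds):
        """seeds maps (row, col) -> region label (labels start at 1)."""
        img = img.astype(float)
        labels = np.zeros(img.shape, dtype=int)          # 0 = unassigned
        sums, counts, heap = {}, {}, []
        def push_neighbors(i, j, lab):
            mean = sums[lab] / counts[lab]               # current region mean
            for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
                if 0 <= ni < img.shape[0] and 0 <= nj < img.shape[1] \
                        and labels[ni, nj] == 0:
                    heapq.heappush(heap, (abs(img[ni, nj] - mean), ni, nj, lab))
        for (i, j), lab in seeds.items():
            labels[i, j] = lab
            sums[lab] = sums.get(lab, 0.0) + img[i, j]
            counts[lab] = counts.get(lab, 0) + 1
        for (i, j), lab in seeds.items():
            push_neighbors(i, j, lab)
        while heap:
            delta, i, j, lab = heapq.heappop(heap)       # smallest delta first
            if labels[i, j]:
                continue                                 # already claimed
            labels[i, j] = lab
            sums[lab] += img[i, j]
            counts[lab] += 1
            push_neighbors(i, j, lab)
        return labels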


IIT Bombay Slide 152

Algorithm for Seeded Region Growing

5 6 7 22 25
7 8 9 26 27

11 10 12 29 28

20 104 20 26 27

100 102 103 63 62

100 101 103 63 64

102 105 101 62 63


IIT Bombay Slide 153
Output of Region Growing

From Randomly selected seeds From Manually selected seeds


IIT Bombay Slide 154
Comments
• If seed points are too many, region
fragmentation may occur
• If seed points are too few, different
objects can be merged within a single
region
• The second case is easier to handle.
One may consider each large region
and grow sub-regions within it by
repeating the same process performed
on the full image.
Feature Descriptors
1. Harris Corner Detector
2. Histogram of Oriented Gradients (HoG)
3. Scale Invariant Feature Transform (SIFT)
from Rick Szeliski’s lecture notes,
and other sources…
Feature Descriptors
• Images are recognized, matched using some
key features that are perceived by humans
invariant to rotation, translation, scale,
illumination conditions, …
• Some of these abilities are captured in feature
descriptors with varying capabilities
• Prominent among them are the Harris Corner
Detector (Harris-Stephens algorithm), the
Histogram of Oriented Gradients (Navneet Dalal
and Bill Triggs) and the Scale Invariant Feature
Transform (David Lowe)
Harris Corner Detector
Harris corner detector
• C.Harris, M.Stephens. “A Combined
Corner and Edge Detector”. 1988
The Basic Idea
• We should easily recognize the point by
looking through a small window
• Shifting a window in any direction should
give a large change in intensity
Harris Detector: Basic Idea

"flat" region: no change in all directions
"edge": no change along the edge direction
"corner": significant change in all directions
Harris Detector: Mathematics
Change of intensity for the shift [u,v]:

  E(u,v) = Σx,y w(x,y) · [ I(x+u, y+v) - I(x,y) ]²

where w(x,y) is the window function (1 inside the window and 0 outside,
or a Gaussian), I(x+u, y+v) is the shifted intensity and I(x,y) the
original intensity.

Taylor series approximation to the shifted image:

  E(u,v) ≈ Σx,y w(x,y) · [ I(x,y) + u·Ix + v·Iy - I(x,y) ]²
         = Σx,y w(x,y) · [ u·Ix + v·Iy ]²
         = Σx,y w(x,y) · (u v) | Ix·Ix  Ix·Iy | (u)
                               | Ix·Iy  Iy·Iy | (v)

This can be verified easily by multiplying the 1x2 vector with the 2x2
matrix and the 2x1 vector, resulting in a scalar.
Harris Detector: Mathematics
For small shifts [u,v] we have a bilinear approximation:

  E(u,v) ≈ [u, v] · M · [u, v]^T

where M is a 2x2 matrix computed from image derivatives:

  M = Σx,y w(x,y) · | Ix²   IxIy |
                    | IxIy  Iy²  |
Harris Detector: Mathematics
Intensity change in the shifting window: eigenvalue analysis

  E(u,v) ≈ [u, v] · M · [u, v]^T,    λ1, λ2 = eigenvalues of M

[Figure: the ellipse E(u,v) = const; its axes point along the directions
of fastest and slowest change, with semi-axis lengths (λmax)^(-1/2) and
(λmin)^(-1/2)]
Example to show a polynomial as an ellipse
Let the polynomial be 4x² - 8x + y² + 4y = 8
The standard form of an ellipse with centre
at (p,q) and semi-axes given by a and b is:

  (x-p)²/a² + (y-q)²/b² = 1

The above polynomial can be rewritten as:

  4x² - 8x + 4 + y² + 4y + 4 = 8 + 4 + 4
  4(x - 1)² + (y + 2)² = 16, or
  (x - 1)²/4 + (y + 2)²/16 = 1
Harris Detector: Threshold

Classification of image points using the eigenvalues of M:
• "Corner": λ1 and λ2 are large, λ1 ~ λ2;
  E increases in all directions
• "Edge": λ1 >> λ2 (or λ2 >> λ1)
• "Flat" region: λ1 and λ2 are small;
  E is almost constant in all directions
Harris Detector: Threshold
Measure of corner response:

  R = det M - k·(trace M)²

  det M = λ1·λ2
  trace M = λ1 + λ2

(k is an empirical constant, k = 0.04-0.06)
Harris Detector: Mathematics

• R depends only on the eigenvalues of M
• R is large for a corner (R > 0)
• R is negative with large magnitude for an edge (R < 0)
• |R| is small for a flat region
Harris Detector
• The Algorithm:
– Find points with large corner response
function R (R > threshold)
– Take the points of local maxima of R
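A sketch of computing the corner response R over a whole image, with a
simple box filter standing in for the window function w(x,y) (k = 0.04
and the 5x5 window are illustrative settings):

    import numpy as np

    def harris_response(img, k=0.04, r=2):
        """R = det(M) - k * trace(M)^2 from smoothed derivative products."""
        I = img.astype(float)
        Iy, Ix = np.gradient(I)                    # image derivatives
        def box(a):                                # (2r+1)x(2r+1) box average
            p = np.pad(a, r, mode='edge')
            out = np.zeros_like(a)
            for di in range(2 * r + 1):
                for dj in range(2 * r + 1):
                    out += p[di:di + a.shape[0], dj:dj + a.shape[1]]
            return out / (2 * r + 1) ** 2
        Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
        det = Sxx * Syy - Sxy ** 2                 # det(M) = lambda1 * lambda2
        trace = Sxx + Syy                          # trace(M) = lambda1 + lambda2
        return det - k * trace ** 2                # large positive -> corner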
Harris Detector: Workflow
Harris Detector: Workflow
Compute corner response R
Harris Detector: Workflow
Find points with large corner response: R>threshold
Harris Detector: Workflow
Take only the points of local maxima of R
Harris Detector: Workflow
Harris Detector: Summary

• Average intensity change in direction [u,v] can be
expressed as a bilinear form:

  E(u,v) ≈ [u, v] · M · [u, v]^T

• Describe a point in terms of the eigenvalues of M:
measure of corner response

  R = λ1·λ2 - k·(λ1 + λ2)²

• A good (corner) point should have a large intensity
change in all directions, i.e. R should be a large
positive value
Harris Detector: Some
Properties
• Rotation invariance

Ellipse rotates but its shape (i.e. eigenvalues)


remains the same

Corner response R is invariant to image rotation


Harris Detector: Some
Properties
• Partial invariance to affine intensity
change
 Only derivatives are used => invariance
to intensity shift I  I + b
 Intensity scale: I  a I

[Figure: corner response R vs. image coordinate before and after the
intensity scaling I → a·I; a fixed threshold can give different
detections]


Harris Detector: Some
Properties
• But: non-invariant to image scale!

[Figure: a curve detected as a corner at one scale is classified as a
series of edges when magnified]
Harris Detector: Some
Properties
• Quality of Harris detector for different
scale changes
Repeatability rate:
# correspondences
# possible correspondences

C. Schmid et al. “Evaluation of Interest Point Detectors”. IJCV 2000


Models of Image Change

• Geometry
– Rotation
– Similarity (rotation + uniform scale)

– Affine (scale dependent on direction)


valid for: orthographic camera, locally
planar object
• Photometry
– Affine intensity change (I  a I + b)
Rotation Invariant Detection
• Harris Corner Detector

C.Schmid et.al. “Evaluation of Interest Point Detectors”. IJCV 2000


Histogram of Oriented
Gradients (HoG)
What is HOG?
(Histograms of Oriented Gradients)
HOG is an edge orientation histogram
based descriptor, based on the orientation
of the gradient in localized regions called cells.
It is therefore easy to express the rough
shape of the object, and the descriptor is robust
to variations in geometry and illumination.
On the other hand, rotation and scale
changes are not supported.

Overall Scheme for HoG based


Person-Non-Person
Classification
HOG image
HOG feature extraction algorithm
1. The color image is converted to grayscale
2. The luminance gradient is calculated at
each pixel
3. To create a histogram of gradient
orientations for each cell.
– Feature quantity becomes robust to changes of
form
4. Normalization and Descriptor Blocks
– Feature quantity becomes robust to changes in
illumination
HOG feature extraction algorithm (1)
2. The luminance gradient is calculated at each pixel
– The luminance gradient is a vector with magnitude m and
orientation θ, computed from the change in luminance:

  m(x,y) = √( (L(x+1,y) - L(x-1,y))² + (L(x,y+1) - L(x,y-1))² )

  θ(x,y) = tan⁻¹( (L(x,y+1) - L(x,y-1)) / (L(x+1,y) - L(x-1,y)) ),
           -π/2 ≤ θ ≤ π/2

※ L is the luminance value of the pixel
HOG feature extraction algorithm (2)
3. Create a histogram of gradient orientations for each
cell (5x5 pixels) using the computed gradient magnitudes
and orientations.
– The orientation bins are evenly spaced over 0°-180°, with
nine bins of 20° each. Adding the magnitude of the
luminance gradient for each orientation generates the
histogram. (Example image size: 60x30.)
HOG feature extraction algorithm (3)
4. Normalization and Descriptor Blocks
– Normalization is performed using the following equation:

  v(n) ← v(n) / √( Σk=1..81 v(k)² + 1 )

v(n) is the magnitude of each direction
– The normalized vector is computed over a block of
3x3 cells and 9 orientations.
– For a block, the vector size = 9x9 = 81.
Feature descriptor size for the HOG image:
• 12x6 cells
• Number of orientations = 9
• Block size = 3x3 = 9 cells
• The block moves 4 steps to the right and 10 steps down

Descriptor size for the total image = 10 x 4 x 9 x 9 = 3240
Example of using HOG
• HOG can represent a rough shape of the
object, so that it has been used for general
object recognition, such as people or cars.
• In order to achieve general object
recognition, a classifier (e.g. SVM) is used:
1. Train the classifier with correct (positive) and
incorrect (negative) images.
2. Scan the classifier over the image to determine
whether there are people in the detection window.
SVM based Classification
SVM divides the space into two domains
according to a teacher signal (training labels).
New examples are predicted to belong to
a category based on which side of the gap
(margin) they fall.
Comparison with Different Feature Descriptors
Summary
Steps and parameters:
• The window moves 7 steps to the right, 15 down
• Total window positions = 105
• Each histogram is normalized
• Over a block, number of histograms = 4; number of
bins = 4x9 = 36
• Feature descriptor size = 15x7x9x4 = 3780 in Dalal's original paper
Discussion
• Navneet Dalal and Bill Triggs initially
developed this for person detection, but
it was shown that it could be used for
other applications as well.
Scale Invariant Feature
Transform (SIFT)
The SIFT (Scale Invariant
Feature Transform) Detector and
Descriptor
developed by David Lowe
University of British Columbia
Initial paper ICCV 1999
Newer journal paper IJCV 2004
Review: Matt Brown’s Canonical
Frames

Multi-Scale Oriented Patches

 Extract oriented patches at multiple scales


[ Brown, Szeliski, Winder CVPR 2005 ]
Application: Image Stitching

[ Microsoft Digital Image Pro version 10 ]
Ideas from Matt’s Multi-Scale
Oriented Patches
1. Detect an interesting patch with an interest
operator. Patches are translation invariant.
2. Determine its dominant orientation.
3. Rotate the patch so that the dominant
orientation points upward. This makes the
patches rotation invariant.
4. Do this at multiple scales, converting them all
to one scale through sampling.
5. Convert to illumination “invariant” form

Implementation Concern:
How do you rotate a patch?
• Start with an “empty” patch whose
dominant direction is “up”.
• For each pixel in your patch, compute the
position in the detected image patch. It will
be in floating point and will fall between
the image pixels.
• Interpolate the values of the 4 closest
pixels in the image, to get a value for the
pixel in your patch.
Rotating a Patch

[Figure: transform T maps a pixel (x,y) of the empty canonical patch
to (x',y') in the patch detected in the image]

  x' = x·cosθ - y·sinθ
  y' = x·sinθ + y·cosθ

(counterclockwise rotation)
Using Bilinear Interpolation
• Use all 4 adjacent samples

[Figure: the four neighboring samples I00, I10, I01, I11 around the
fractional position (x,y)]
SIFT: Motivation
• The Harris operator is not invariant to scale, and correlation is not
invariant to rotation¹.
• For better image matching, Lowe's goal was to develop an interest
operator that is invariant to scale and rotation.
• Also, Lowe aimed to create a descriptor that was robust to the
variations corresponding to typical viewing conditions. The descriptor
is the most-used part of SIFT.
¹But Schmid and Mohr developed a rotation invariant descriptor for it in 1997.
Idea of SIFT
• Image content is transformed into local feature
coordinates that are invariant to translation, rotation,
scale, and other imaging parameters
(Figure: SIFT features)
Claimed Advantages of SIFT
• Locality: features are local, so robust to occlusion
and clutter (no prior segmentation)
• Distinctiveness: individual features can be
matched to a large database of objects
• Quantity: many features can be generated for even
small objects
• Efficiency: close to real-time performance
• Extensibility: can easily be extended to wide range
of differing feature types, with each adding
robustness
Overall Procedure at a High Level
1. Scale-space extrema detection
   Search over multiple scales and image locations.
2. Keypoint localization
   Fit a model to determine location and scale. Select keypoints based
   on a measure of stability.
3. Orientation assignment
   Compute the best orientation(s) for each keypoint region.
4. Keypoint description
   Use local image gradients at the selected scale and rotation to
   describe each keypoint region.
1. Scale-space extrema detection
• Goal: Identify locations and scales that can be repeatably assigned
under different views of the same scene or object.
• Method: search for stable features across multiple scales using a
continuous function of scale.
• Prior work has shown that, under a variety of assumptions, the best
such function is a Gaussian.
• The scale space of an image is a function L(x, y, σ) produced by
convolving a Gaussian kernel (at different scales σ) with the input image.
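In symbols (as given in Lowe's IJCV 2004 paper), the scale space and the Gaussian kernel are:

L(x, y, \sigma) = G(x, y, \sigma) * I(x, y),
\qquad
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \, e^{-(x^{2}+y^{2})/2\sigma^{2}}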
Aside: Image Pyramids
• Bottom level is the original image.
• The 2nd level is derived from the original image according to some function.
• The 3rd level is derived from the 2nd level according to the same function.
• And so on.
Aside: Mean Pyramid
• Bottom level is the original image.
• At the 2nd level, each pixel is the mean of 4 pixels in the original image.
• At the 3rd level, each pixel is the mean of 4 pixels in the 2nd level.
• And so on.
Aside: Gaussian Pyramid
• At each level, the image is smoothed and reduced in size.
• Bottom level is the original image.
• At the 2nd level, each pixel is the result of applying a Gaussian
filter to the first level and then subsampling to reduce the size.
• And so on.
Example: Subsampling with Gaussian pre-filtering
(Figure: Gaussian-filtered and subsampled images at 1/2, 1/4, and 1/8 resolution)
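A minimal sketch of this pre-filter-then-subsample pyramid is given below, assuming SciPy is available; the sigma = 1.0 used here is an illustrative choice, not a value prescribed in these slides.

# Sketch of a Gaussian pyramid: smooth with a Gaussian pre-filter, then
# keep every second pixel, repeating per level.
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels, sigma=1.0):
    pyramid = [image]                                  # bottom: original image
    for _ in range(levels - 1):
        smoothed = gaussian_filter(pyramid[-1], sigma) # Gaussian pre-filter
        pyramid.append(smoothed[::2, ::2])             # 1/2, 1/4, 1/8, ...
    return pyramid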
Lowe's Scale-space Interest Points
• Laplacian of Gaussian kernel
  – Scale normalised (multiplied by σ²)
  – Proposed by Lindeberg
• Scale-space detection
  – Find local maxima across scale/space
  – A good "blob" detector
[ T. Lindeberg IJCV 1998 ]
Lowe's Scale-space Interest Points: Difference of Gaussians
• The Gaussian is a solution of the heat diffusion equation:
  ∂G/∂σ = σ ∇²G
• Hence the difference of Gaussians approximates the scale-normalized
Laplacian:
  G(x, y, kσ) − G(x, y, σ) ≈ (k − 1) σ² ∇²G
• k is not necessarily very small in practice.
Lowe's Pyramid Scheme
• Scale space is separated into octaves:
  – Octave 1 uses scale σ
  – Octave 2 uses scale 2σ
  – etc.
• In each octave, the initial image is repeatedly convolved with
Gaussians to produce a set of scale-space images.
• Adjacent Gaussian images are subtracted to produce the DoG
(difference-of-Gaussian) images.
• After each octave, the Gaussian image is down-sampled by a factor of 2
to produce an image 1/4 the size, to start the next octave.
Lowe's Pyramid Scheme (continued)
• An octave uses s+2 DoG filters, obtained from s+3 Gaussian images
(including the original) by differencing adjacent pairs.
• The scales within an octave are σ_i = 2^(i/s) σ_0 for i = 0, 1, ..., s+1.
• The parameter s determines the number of images per octave.
Keypoint Localization
• With s+2 difference images per octave, the top and bottom are ignored,
so s planes are searched.
• Detect maxima and minima of the difference-of-Gaussian images in scale
space (the pyramid is built by repeated blur, subtract, and resample steps).
• Each point is compared to its 8 neighbors in the current image and 9
neighbors each in the scales above and below.
• For each maximum or minimum found, the output is the location and the scale.
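A brute-force sketch of this 26-neighbour test follows, assuming dog is a list of same-size 2D difference-of-Gaussian arrays for one octave; the function name is illustrative.

# Sketch of extrema detection: keep a pixel if it beats all 26 neighbours
# (8 in its own DoG image, 9 each in the scales above and below). The top
# and bottom planes serve only as neighbours, matching the slide.
import numpy as np

def scale_space_extrema(dog):
    found = []
    for i in range(1, len(dog) - 1):           # top and bottom planes ignored
        below, cur, above = dog[i - 1], dog[i], dog[i + 1]
        for y in range(1, cur.shape[0] - 1):
            for x in range(1, cur.shape[1] - 1):
                cube = np.stack([below[y-1:y+2, x-1:x+2],
                                 cur[y-1:y+2, x-1:x+2],
                                 above[y-1:y+2, x-1:x+2]])
                v = cur[y, x]
                if v >= cube.max() or v <= cube.min():
                    found.append((x, y, i))    # output: location and scale
    return found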
Scale-space extrema detection: experimental results over 32 images that
were synthetically transformed and had noise added.
(Figure: plots of % detected and % correctly matched, and of the average
number detected and matched, versus the number of scales per octave)
Stability vs. Expense
• Sampling in scale for efficiency: how many scales should be used per
octave (S = ?)
• The more scales evaluated, the more keypoints are found.
• For S < 3, the number of stable keypoints increases with S.
• For S > 3, the number of stable keypoints decreases.
• S = 3 gives the maximum number of stable keypoints.
Keypoint localization
• Once a keypoint candidate is found, perform a detailed fit to nearby
data to determine location, scale, and the ratio of principal curvatures.
• In the initial work, keypoints were placed at the location and scale
of the central sample point.
• In newer work, a 3D quadratic function is fitted to improve
interpolation accuracy.
• The Hessian matrix is used to eliminate edge responses.
Keypoint localization (refinement)
• There are still a lot of candidate points, and some of them are not
good enough; the locations of keypoints may not be accurate.
• Fit a 3D quadratic to the DoG function D around each candidate and
eliminate edge points:
(1) D(x) = D + (∂D/∂x)ᵀ x + ½ xᵀ (∂²D/∂x²) x
(2) x̂ = −(∂²D/∂x²)⁻¹ (∂D/∂x)   (offset of the interpolated extremum)
(3) D(x̂) = D + ½ (∂D/∂x)ᵀ x̂    (value used to reject low-contrast points)
Eliminating the Edge Response
• Reject flat areas (in terms of intensity): |D(x̂)| < 0.03.
• Reject edges: let α be the eigenvalue of the Hessian with larger
magnitude and β the smaller, and let r = α/β (so α = rβ).
• Tr(H)²/Det(H) = (r + 1)²/r is at a minimum when the two eigenvalues
are equal, so accept a point only if Tr(H)²/Det(H) < (r + 1)²/r with r = 10.
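A sketch of both tests on a single DoG image D follows, with the 2x2 Hessian formed from central finite differences; it assumes the pixel is not on the border, and for simplicity uses the raw value D[y, x] where the slides use the interpolated D(x̂).

# Sketch of the contrast and edge-response tests at pixel (x, y) of a
# DoG image D (values scaled to [0, 1]).
import numpy as np

def keep_keypoint(D, x, y, contrast_thresh=0.03, r=10.0):
    if abs(D[y, x]) < contrast_thresh:            # reject low contrast
        return False
    dxx = D[y, x + 1] - 2 * D[y, x] + D[y, x - 1]
    dyy = D[y + 1, x] - 2 * D[y, x] + D[y - 1, x]
    dxy = (D[y + 1, x + 1] - D[y + 1, x - 1]
           - D[y - 1, x + 1] + D[y - 1, x - 1]) / 4.0
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:                                  # opposite curvatures: reject
        return False
    return tr * tr / det < (r + 1.0) ** 2 / r     # reject strong edges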
3. Orientation assignment
• Create a histogram of local gradient directions at the selected scale.
• Assign the canonical orientation at the peak of the smoothed histogram.
• Each key thus specifies stable 2D coordinates (x, y, scale, orientation).
• If there are 2 major orientations, use both.
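A sketch of the histogram step is given below. It assumes L is the Gaussian-smoothed image at the keypoint's scale and the keypoint is away from the border; the radius and sigma values are illustrative choices, not values fixed by the slides.

# Sketch of orientation assignment: 36-bin (10-degree) histogram of
# gradient directions around (x, y), each sample weighted by gradient
# magnitude and a Gaussian window; the peak bin gives the orientation.
import numpy as np

def dominant_orientation(L, x, y, radius=8, sigma=4.0):
    hist = np.zeros(36)
    for j in range(-radius, radius + 1):
        for i in range(-radius, radius + 1):
            dx = L[y + j, x + i + 1] - L[y + j, x + i - 1]
            dy = L[y + j + 1, x + i] - L[y + j - 1, x + i]
            mag = np.hypot(dx, dy)
            ang = np.degrees(np.arctan2(dy, dx)) % 360.0
            w = np.exp(-(i * i + j * j) / (2.0 * sigma * sigma))
            hist[int(ang // 10.0) % 36] += w * mag
    return 10.0 * np.argmax(hist) + 5.0   # centre of the peak bin, degrees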
Keypoint localization with orientation
(Example on a 233x189 image: 832 initial keypoints; 729 keypoints remain
after the gradient/contrast threshold; 536 remain after the ratio threshold.)
4. Keypoint Descriptors
• At this point, each keypoint has
– location
– scale
– orientation
• Next is to compute a descriptor for the local image region about each
keypoint that is
  – highly distinctive
  – as invariant as possible to variations such as changes in viewpoint
    and illumination
Normalization
• Rotate the window to the standard orientation.
• Scale the window size based on the scale at which the point was found.
Lowe's Keypoint Descriptor
(shown with 2x2 descriptors over 8x8 samples)
In experiments, a 4x4 array of 8-bin histograms is used, for a total of
128 features per keypoint.
Lowe's Keypoint Descriptor
• Use the normalized region about the keypoint.
• Compute gradient magnitude and orientation at each point in the region.
• Weight them by a Gaussian window overlaid on the circle.
• Create an orientation histogram over each of the 4x4 subregions of the window.
• In practice, 4x4 descriptors over a 16x16 sample array were used;
4x4 histograms times 8 directions gives a vector of 128 values.
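A simplified sketch of the pooling step follows, assuming mag and ang are 16x16 arrays of gradient magnitude and orientation (in degrees, already measured relative to the keypoint's canonical orientation); the trilinear interpolation between bins used by Lowe is omitted for brevity.

# Sketch of descriptor formation: Gaussian-weighted gradients over a
# 16x16 window pooled into a 4x4 grid of 8-bin orientation histograms,
# giving 4*4*8 = 128 values.
import numpy as np

def sift_descriptor(mag, ang):
    ys, xs = np.mgrid[0:16, 0:16] - 7.5
    weight = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * 8.0 ** 2))  # Gaussian window
    hist = np.zeros((4, 4, 8))
    for y in range(16):
        for x in range(16):
            b = int((ang[y, x] % 360.0) // 45.0)              # 8 direction bins
            hist[y // 4, x // 4, b] += weight[y, x] * mag[y, x]
    vec = hist.ravel()                                         # 128 features
    return vec / (np.linalg.norm(vec) + 1e-7)                  # normalize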
Using SIFT for Matching “Objects”
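In practice, such matching can be done with off-the-shelf SIFT plus Lowe's ratio test. A sketch assuming OpenCV >= 4.4 (where SIFT is in the main package); 'object.png' and 'scene.png' are hypothetical file names:

# Sketch of SIFT-based object matching with OpenCV.
import cv2

img1 = cv2.imread('object.png', cv2.IMREAD_GRAYSCALE)   # hypothetical files
img2 = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)   # keypoints + 128-D descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

bf = cv2.BFMatcher(cv2.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)          # two nearest neighbours each
# Lowe's ratio test: keep a match only if clearly better than the 2nd best.
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(len(good), 'putative matches')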
Uses for SIFT
• Feature points are used also for:
– Image alignment (homography, fundamental
matrix)
– 3D reconstruction (e.g. Photo Tourism)
– Motion tracking
– Object recognition
– Indexing and database retrieval
– Robot navigation
– … many others
[ Photo Tourism: Snavely et al. SIGGRAPH 2006 ]
GNR602 Lecture 15-21 B. Krishna Mohan