Dip Unit 4


18ECE011T-DIGITAL IMAGE PROCESSING

UNIT-4
IMAGE SEGMENTATION AND REPRESENTATION
Introduction
• What is image segmentation?
• Technically speaking, image segmentation refers to the decomposition of a scene into different components (thus facilitating higher-level tasks such as object detection and recognition)
• Scientifically speaking, segmentation is a hypothetical middle-level vision task performed by neurons between low-level and high-level cortical areas
• There is no ground truth for a segmentation task (an example is given in the next slide)

EE465: Introduction to Digital Image


Processing Copyright Xin Li 2
Dilemma

[Figure panels: input, result 1, result 2]

What do we mean by “DIFFERENT” objects?

Another example: when we look at trees at a close distance, we consider each of them as a different object; but as we look at trees far away, they merge into one coherent object (woods).

Overview of Segmentation Techniques

Techniques:
• Edge-based
• Color-based
• Texture-based
• Disparity-based
• Motion-based

Applications:
• Document images
• Medical images
• Range images
• Biometric images
• Texture images

Edge-based Techniques
Edge detection → Classification and analysis → Segmentation by boundary detection

Region-Filling

Edge Detection

Basic idea: look for a neighborhood with strong signs of change.

Example neighborhood:
81 82 26 24
82 33 25 25
81 82 26 24

Problems:
• neighborhood size
• how to detect change
Differential Operators

Differential operators

• attempt to approximate the gradient at a pixel via masks

• threshold the gradient to select the edge pixels

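Example: Sobel Operator

$$S_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad S_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}$$

On a pixel of the image
• let g_x be the response to S_x
• let g_y be the response to S_y

Then $g = (g_x^2 + g_y^2)^{1/2}$ is the gradient magnitude and $\theta = \operatorname{atan2}(g_y, g_x)$ is the gradient direction.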
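The Sobel response at a single pixel can be sketched in a few lines of plain Python. This is only an illustration: the function name `sobel_at` is hypothetical, the masks are applied by direct element-wise multiplication (correlation, no mask flipping), and the test patch is the example neighborhood from the Edge Detection slide.

```python
import math

# Sobel masks Sx and Sy.
SX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SY = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

def sobel_at(image, r, c):
    """Return (magnitude, direction) of the Sobel gradient at interior pixel (r, c)."""
    gx = gy = 0
    for i in range(3):
        for j in range(3):
            p = image[r - 1 + i][c - 1 + j]
            gx += SX[i][j] * p  # response to Sx
            gy += SY[i][j] * p  # response to Sy
    return math.hypot(gx, gy), math.atan2(gy, gx)

# Example neighborhood: a sharp vertical edge between columns 2 and 3.
patch = [
    [81, 82, 26, 24],
    [82, 33, 25, 25],
    [81, 82, 26, 24],
]
mag, theta = sobel_at(patch, 1, 1)
```

Here gx is -224 and gy is 0, so the magnitude is 224 and the direction is π: the gradient is horizontal, which matches a vertical edge.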
Java Toolkit’s Sobel Operator

[Figure panels: original image, gradient magnitude, thresholded gradient magnitude]
Zero Crossing Operators

Motivation: The zero crossings of the second derivative of the image function are more precise than the peaks of the first derivative.

[Figure: a step edge, its smoothed version, the 1st derivative, and the 2nd derivative with its zero crossing]
Canny Edge Detector

• Smooth the image with a Gaussian filter.
• Compute gradient magnitude and direction at each pixel of the smoothed image.
• Zero out any pixel response ≤ the two neighboring pixels on either side of it, along the direction of the gradient.
• Track high-magnitude contours.
• Keep only pixels along these contours, so weak little segments go away.
Canny Examples

Best Canny on Kidney from Hw1

Best Canny on Blocks from Hw1

Hough Transform

• The Hough transform is a method for detecting lines or curves specified by a parametric function.
• If the parameters are p1, p2, … pn, then the Hough procedure uses an n-dimensional accumulator array in which it accumulates votes for the correct parameters of the lines or curves found on the image.

[Figure: a line y = mx + b in the image and the corresponding accumulator with axes m and b]
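A minimal sketch of the voting idea for the line y = mx + b, assuming a small hand-picked set of candidate slopes and integer edge coordinates (both are illustrative choices, as are the names `hough_lines`, `points` and `slopes`). Practical implementations usually switch to a (ρ, θ) parameterization because the slope m is unbounded for near-vertical lines.

```python
from collections import Counter

def hough_lines(points, slopes):
    """Vote in (m, b) space: each point (x, y) votes for every line b = y - m*x."""
    acc = Counter()  # sparse two-dimensional accumulator indexed by (m, b)
    for x, y in points:
        for m in slopes:
            b = y - m * x
            acc[(m, b)] += 1
    return acc

# Hypothetical edge points: four lie on y = 2x + 1, one is an outlier.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (2, 0)]
slopes = [-2, -1, 0, 1, 2, 3]

acc = hough_lines(points, slopes)
(best_m, best_b), votes = acc.most_common(1)[0]
```

The accumulator cell with the most votes, here (m, b) = (2, 1) with four votes, identifies the line supported by the largest number of edge points.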
Thresholding
non-contextual approach

The input to a thresholding operation is typically a grayscale or color image. In the simplest implementation, the output is a binary image representing the segmentation. Black pixels correspond to background and white pixels correspond to foreground (or vice versa).
Thresholding of pixel grey level (Basic Global Thresholding)

Segmentation is accomplished by scanning the image pixel by pixel and labeling each pixel as object or background, depending on whether the grey level is greater or less than the value of T.

$$g(x,y) = \begin{cases} 0 & f(x,y) < T \\ 1 & f(x,y) \ge T \end{cases}$$

Thresholding works well when the grey level histogram of the image separates the pixels of the object and the background into two dominant modes. Then a threshold T can be easily chosen between the modes.
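As a small illustration, the pixel-by-pixel labeling can be written as a nested list comprehension. The sketch assumes that gray levels at or above T are labeled 1 (object) and the rest 0 (background); the function name `threshold`, the sample image and the value T = 60 are chosen only for the example.

```python
def threshold(image, T):
    """Label each pixel 1 if its gray level is >= T, else 0."""
    return [[1 if p >= T else 0 for p in row] for row in image]

# Sample image with two well-separated gray-level groups (around 25 and around 82).
image = [
    [81, 82, 26, 24],
    [82, 33, 25, 25],
    [81, 82, 26, 24],
]
binary = threshold(image, 60)
```

Any T between the two groups of gray levels gives the same binary image here, which is the situation the bi-modal histogram describes.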
Picking the threshold is the hard part

• A human operator decides the threshold
• Use the mean gray level of the image
• A fixed proportion of pixels is detected (set to 1) by the thresholding operation
• Analyze the histogram of the image
Basic Global Thresholding

Figure 1 A) shows a classic bi-modal intensity distribution. This image can be successfully segmented using a single threshold T1. B) is slightly more complicated. Here we suppose the central peak represents the objects we are interested in, and so threshold segmentation requires two thresholds: T1 and T2. In C), the two peaks of a bi-modal distribution have run together, and so it is almost certainly not possible to successfully segment this image using a single global threshold.
Basic Global Thresholding

The same approach can be used with more than one threshold value. For example, for thresholds T1 and T2, any point which satisfies the relation T1 < f(x,y) < T2 would be labeled as an object point and all others would be labeled background points.
In general this technique is less reliable than using a single threshold. This is because it is often difficult to establish multiple thresholds that effectively isolate the region of interest, especially when the number of modes in the corresponding histogram is high.
Basic Global and Local Thresholding

Thresholding may be viewed as an operation that involves tests against a function T of the form:
T = T[x, y, p(x,y), f(x,y)]
where f(x,y) is the gray level and p(x,y) is some local property.
Simple thresholding schemes compare each pixel's gray level with a single global threshold. This is referred to as global thresholding.
If T depends on both f(x,y) and p(x,y), then this is referred to as local thresholding.
An algorithm used to obtain T automatically for global thresholding

1. Select an initial estimate for T.
2. Segment the image using T. This will produce two groups of pixels: G1 consisting of all pixels with gray level values > T and G2 consisting of pixels with values ≤ T.
3. Compute the average gray level values μ1 and μ2 for the pixels in regions G1 and G2.
4. Compute a new threshold value: T = ½(μ1 + μ2)
5. Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a predefined parameter T0.
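The five steps can be sketched as follows. This is one possible reading, not a reference implementation: the initial estimate is taken to be the mean gray level (the slide does not fix it), the stopping parameter is called `t0`, and the `max_iter` guard and the empty-group check are additions for safety.

```python
def iterative_threshold(pixels, t0=0.5, max_iter=100):
    """Estimate a global threshold by alternating segmentation and mean updates."""
    T = sum(pixels) / len(pixels)            # step 1: initial estimate (assumed: mean)
    for _ in range(max_iter):
        g1 = [p for p in pixels if p > T]    # step 2: split into two groups
        g2 = [p for p in pixels if p <= T]
        if not g1 or not g2:                 # one group is empty: cannot update further
            break
        mu1 = sum(g1) / len(g1)              # step 3: group means
        mu2 = sum(g2) / len(g2)
        new_T = 0.5 * (mu1 + mu2)            # step 4: new threshold
        if abs(new_T - T) < t0:              # step 5: stop when T has settled
            return new_T
        T = new_T
    return T

# Flattened sample image with gray levels clustered around 25 and around 82.
pixels = [81, 82, 26, 24, 82, 33, 25, 25, 81, 82, 26, 24]
T = iterative_threshold(pixels)
```

For this sample the threshold settles between the two gray-level groups after two iterations.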
Global Thresholding - Guidelines for Use
The histogram for the image is [figure]

This shows a nice bi-modal distribution: the lower peak represents the object and the higher one represents the background. The picture can be segmented using a single threshold at a pixel intensity value of 120. The result is shown in [figure]
Global Thresholding - Guidelines for Use
The histogram for the image is [figure]

Due to the severe illumination gradient across the scene, the peaks corresponding to foreground and background have run together, and so simple thresholding does not give good results. The following images show the resulting bad segmentations for single threshold values of 80 and 120 respectively (reasonable results can be achieved by using adaptive thresholding on this image).
Global Thresholding - Guidelines for Use

Thresholding is also used to filter the output of or input to other operators. For instance, in the former case, an edge detector like Sobel will highlight regions of the image that have high spatial gradients. If we are only interested in gradients above a certain value (i.e. sharp edges), then thresholding can be used to select just the strongest edges and set everything else to black. As an example, [figure] was obtained by first applying the Sobel operator to [figure] and then thresholding this using a threshold value of 60.
Use of Boundary Characteristics for Histogram Improvement and Local Thresholding
From the previous discussion, an indication of whether a pixel is on an edge may be obtained by computing its gradient. In addition, use of the Laplacian can yield information regarding whether a given pixel lies on the dark or light side of an edge. The average value of the Laplacian is 0 at the transition of an edge, so in practice the valleys of histograms formed from the pixels selected by a gradient/Laplacian criterion can be expected to be sparsely populated.
We can calculate the gradient ∇f and the Laplacian ∇²f at any point (x,y) in an image. These two quantities may be used to form a three-level image, as follows:

$$g(x,y) = \begin{cases} 0 & \text{if } \nabla f < T \\ + & \text{if } \nabla f \ge T \text{ and } \nabla^2 f \ge 0 \\ - & \text{if } \nabla f \ge T \text{ and } \nabla^2 f < 0 \end{cases}$$

(1) All pixels that are not on an edge are labeled 0
(2) All pixels on the dark side of an edge are labeled +
(3) All pixels on the light side of an edge are labeled -
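The three-level labeling can be illustrated on a one-dimensional gray-level profile across an edge. The sketch below makes several assumptions that are not in the slides: the gradient is approximated by a central difference, the Laplacian by a second difference, and the function name `three_level` and the sample profile are hypothetical.

```python
def three_level(profile, T):
    """Label interior samples of a 1-D gray-level profile as '0', '+', or '-'."""
    labels = []
    for i in range(1, len(profile) - 1):
        grad = abs(profile[i + 1] - profile[i - 1]) / 2          # central difference
        lap = profile[i + 1] - 2 * profile[i] + profile[i - 1]   # second difference
        if grad < T:
            labels.append("0")   # not on an edge
        elif lap >= 0:
            labels.append("+")   # dark side of the edge
        else:
            labels.append("-")   # light side of the edge
    return labels

# A dark-to-light ramp edge.
profile = [10, 10, 10, 50, 90, 90, 90]
labels = three_level(profile, T=10)
```

The samples on the dark side of the ramp receive '+', the sample on the light side receives '-', and the flat regions receive '0'.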
Global Thresholding - Guidelines for Use

Thresholding can be used as preprocessing to extract an interesting subset of image structures which will then be passed along to another operator in an image processing chain. For example, image [figure] shows a slice of brain tissue containing nerve cells (i.e. the large gray blobs, with darker circular nuclei in the middle) and glia cells (i.e. the isolated, small, black circles).

We can threshold this image so as to map all pixel values between 0 and 150 in the original image to foreground (i.e. 255) values in the binary image, and leave the rest to go to background, as in [figure]
Global Thresholding - Guidelines for Use

The resultant image can then be connected-components-labeled in order to count the total number of cells in the original image, as in [figure]

If we wanted to know how many nerve cells there are in the original image, we might try applying a double threshold in order to select out just the pixels which correspond to nerve cells (and therefore have middle level grayscale intensities) in the original image. (In remote sensing and medical terminology, such thresholding is usually called density slicing.) Applying a threshold band of 130 - 150 yields [figure]
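The two operations on this slide, band thresholding and counting connected components, can be sketched together. Everything specific here is an assumption for illustration: the tiny sample image, the inclusive band limits, the choice of 4-connectivity, and the function names `band_threshold` and `count_components`.

```python
from collections import deque

def band_threshold(image, low, high):
    """Set pixels with low <= value <= high to 1 (foreground), others to 0."""
    return [[1 if low <= p <= high else 0 for p in row] for row in image]

def count_components(binary):
    """Count 4-connected foreground regions with a breadth-first flood fill."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                count += 1                      # found a new, unvisited region
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

# Hypothetical image: bright background (200), mid-gray blobs, isolated dark pixels (20).
image = [
    [200, 140, 140, 200, 200],
    [200, 140, 200, 200,  20],
    [200, 200, 200, 135, 200],
    [ 20, 200, 135, 135, 200],
]
all_cells = count_components(band_threshold(image, 0, 150))
mid_gray_cells = count_components(band_threshold(image, 130, 150))
```

With the 0 to 150 range every dark or mid-gray structure is counted, while the 130 to 150 band keeps only the mid-gray blobs.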
Thresholding in RGB space

For color or multi-spectral images, it may be possible to set different thresholds for each color channel, and so select just those pixels within a specified cuboid in RGB space. Another common variant is to set to black all those pixels corresponding to background, but leave foreground pixels at their original color/intensity (as opposed to forcing them to white), so that that information is not lost.

$$g(x,y) = \begin{cases} 1 & d(x,y) \le d_{\max} \\ 0 & d(x,y) > d_{\max} \end{cases}$$

where

$$d(x,y) = [f_R(x,y) - R_0]^2 + [f_G(x,y) - G_0]^2 + [f_B(x,y) - B_0]^2$$
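A short sketch of thresholding by distance from a reference color (R0, G0, B0). It assumes d is the sum of squared channel differences, so `d_max` is compared against a squared distance, and that pixels at exactly `d_max` count as foreground; the function names and sample colors are illustrative.

```python
def rgb_distance_sq(pixel, ref):
    """Sum of squared differences between an (R, G, B) pixel and a reference color."""
    return sum((p - r) ** 2 for p, r in zip(pixel, ref))

def color_mask(image, ref, d_max):
    """Return 1 where the pixel lies within d_max of the reference color, else 0."""
    return [[1 if rgb_distance_sq(px, ref) <= d_max else 0 for px in row]
            for row in image]

# Hypothetical 2x2 color image; the reference color is pure red.
image = [
    [(250, 10, 10), (10, 250, 10)],
    [(240, 30, 20), (0, 0, 0)],
]
mask = color_mask(image, ref=(255, 0, 0), d_max=50 ** 2)
```

Only the two reddish pixels fall inside the distance limit, so they are the ones marked as foreground.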
Adaptive Thresholding

A more complex thresholding algorithm would be to use a spatially varying threshold. This approach is very useful to compensate for the effects of non-uniform illumination. If T depends on the coordinates x and y, this is referred to as dynamic thresholding or adaptive thresholding.
Another approach is to perform a preprocessing step before thresholding.
Preprocessing the image to remove noise or other non-uniformities can improve the performance of the thresholding.
A technique which often provides better results is to use only edge points when creating the grey level histogram.
Adaptive thresholding - how it works?

There are two main approaches to finding the threshold:
(i) the Chow and Kaneko approach and
(ii) local thresholding.
The assumption behind both methods is that smaller image regions are more likely to have approximately uniform illumination, thus being more suitable for thresholding.
Chow and Kaneko divide an image into an array of overlapping subimages and then find the optimum threshold for each subimage by investigating its histogram. The threshold for each single pixel is found by interpolating the results of the subimages. The drawback of this method is that it is computationally expensive and, therefore, is not appropriate for real-time applications.
Adaptive thresholding - Local thresholding
An alternative approach to finding the local threshold is to statistically examine the intensity values of the local neighborhood of each pixel. The statistic which is most appropriate depends largely on the input image. Simple and fast functions include:
• the mean of the local intensity distribution, T = mean
• the median value, T = median
• the mean of the minimum and maximum values, T = (max + min) / 2

The size of the neighborhood has to be large enough to cover sufficient foreground and background pixels, otherwise a poor threshold is chosen. On the other hand, choosing regions which are too large can violate the assumption of approximately uniform illumination. This method is less computationally intensive than the Chow and Kaneko approach and produces good results for some applications.
Adaptive thresholding - Guidelines for Use

Local adaptive thresholding, on the other hand, selects an individual threshold for each pixel based on the range of intensity values in its local neighborhood. This allows for thresholding of an image whose global intensity histogram doesn't contain distinctive peaks.
A task well suited to local adaptive thresholding is segmenting text from the image [figure]
Because this image contains a strong illumination gradient, global thresholding produces a very poor result, as can be seen in [figure]
Adaptive thresholding - Guidelines for Use

Using the mean of a 7×7 neighborhood, adaptive thresholding yields [figure]

The method succeeds in the area surrounding the text because there are enough foreground and background pixels in the local neighborhood of each pixel; i.e. the mean value lies between the intensity values of foreground and background and, therefore, separates them easily. On the margin, however, the mean of the local area is not suitable as a threshold, because the range of intensity values within a local neighborhood is very small and their mean is close to the value of the center pixel.
Adaptive thresholding - Guidelines for Use

The situation can be improved if the threshold employed is not the mean, but (mean - C), where C is a constant. Using this statistic, all pixels which exist in a uniform neighborhood (e.g. along the margins) are set to background. The result for a 7×7 neighborhood and C = 7 is shown in [figure] and for a 75×75 neighborhood and C = 10 in [figure]

The larger window yields the poorer result, because it is more adversely affected by the illumination gradient. Also note that the latter is more computationally intensive than thresholding using the smaller window.
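A compact sketch of local thresholding with the (mean - C) statistic. Several choices here are assumptions made for the example: text is taken to be darker than the page, a pixel is marked 1 when it falls below its local threshold (so uniform neighborhoods come out as 0, i.e. background), windows are clipped at the image border, and the synthetic image and the name `adaptive_threshold` are hypothetical.

```python
def adaptive_threshold(image, size, C):
    """Mark pixels darker than (local mean - C) over a size x size window."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighborhood, clipped at the image border.
            window = [image[y][x]
                      for y in range(max(0, r - half), min(rows, r + half + 1))
                      for x in range(max(0, c - half), min(cols, c + half + 1))]
            T = sum(window) / len(window) - C   # local threshold
            out[r][c] = 1 if image[r][c] < T else 0
    return out

# Synthetic page: brightness rises from left to right (illumination gradient),
# with two darker "ink" pixels at (1, 1) and (1, 4).
image = [
    [100, 120, 140, 160, 180, 200],
    [100,  80, 140, 160, 140, 200],
    [100, 120, 140, 160, 180, 200],
]
result = adaptive_threshold(image, size=3, C=5)
```

The ink pixel at (1, 4) has the same gray level (140) as the page in column 2, so no single global threshold separates both ink pixels from the page, while the local threshold picks out exactly the two ink pixels.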
Adaptive thresholding - Guidelines for Use

The result of using the median instead of the mean can be seen in [figure]

(The neighborhood size for this example is 7×7 and C = 4.) The result shows that, in this application, the median is a less suitable statistic than the mean.
Adaptive thresholding - Guidelines for Use

Consider another example image containing a strong illumination gradient: [figure]
This image cannot be segmented with a global threshold, as shown in [figure], where a threshold of 80 was used.

However, since the image contains a large object, it is hard to apply adaptive thresholding as well. Using (mean - C) as a local threshold, we obtain [figure] with a 7×7 window and C = 4
Adaptive thresholding - Guidelines for Use

Using (mean - C) as a local threshold, we obtain [figure] with a 140×140 window and C = 8. All pixels which belong to the object but do not have any background pixels in their neighborhood are set to background. The latter image shows a much better result than that achieved with a global threshold, but it is still missing some pixels in the center of the object. In many applications, computing the mean of a neighborhood (for each pixel!) whose size is of the order 140×140 may take too much time. In this case, the more complex Chow and Kaneko approach to adaptive thresholding would be more successful.
Color-Based Techniques
• Color representations
  • Device dependent: RGB (displaying) or CMYK (printing)
  • Device independent: CIE XYZ or CIELAB (L*a*b*)
• There are different specifications of RGB color spaces (e.g., HP/Microsoft vs. Adobe)

Color Space Conversion

[Figure panels: Analog TV, Digital TV (MPEG)]

Clustering via K-Means Algorithm
An algorithm for partitioning (or clustering) N data points into K disjoint subsets Sj containing Nj data points so as to minimize the sum-of-squares criterion

$$J = \sum_{j=1}^{K} \sum_{n \in S_j} \lVert x_n - \mu_j \rVert^2$$

where x_n are the data points and μ_j is the centroid of the data points in S_j.

1. Initialization: randomly choose K centroids
2. The data points are assigned to the K sets
3. The centroid is updated for each set

EE465: Introduction to Digital Image


Processing Copyright Xin Li'2004 42
Subproblem I: Clustering by distance to known centers
Subproblem II: Finding the centers from known clustering
Toy Example of Kmeans Clustering

1. Initialization
2. NN-Clustering
3. Codeword-update
4. Alternate 2 and 3 until convergence

http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
K-means Clustering: Steps 1 to 5
Algorithm: k-means, Distance Metric: Euclidean Distance

[Figures: five scatter plots, one per step, showing the positions of centroids k1, k2 and k3 on axes running from 0 to 5; axis labels: "expression in condition 1" and "expression in condition 2"]
Data Clustering via Kmeans

Instead of 2D, kmeans can be applied to 3D color space RGB or L*a*b*
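A minimal k-means sketch for RGB triples, alternating nearest-centroid assignment and centroid update. The fixed iteration cap, the seeded random initialization, the handling of empty clusters and the sample colors are all assumptions made for the example, and the function name `kmeans` is illustrative.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Cluster tuples of numbers (e.g. RGB colors) into k groups."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)              # initialization: pick k data points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                           # assign each point to nearest centroid
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        # Update each centroid to the mean of its cluster (keep it if the cluster is empty).
        new = [tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:                       # converged: centroids stopped moving
            break
        centroids = new
    return centroids, clusters

# Hypothetical pixel colors: three reddish and three bluish.
colors = [
    (250, 10, 10), (240, 20, 15), (245, 5, 20),
    (10, 10, 250), (20, 15, 240), (5, 20, 245),
]
centroids, clusters = kmeans(colors, k=2)
```

For these well-separated colors the two centroids end up at the mean red and the mean blue, and replacing each pixel by the centroid of its cluster would give a two-color segmentation.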

Texture-based Techniques

What is Texture?

No one exactly knows.

In the visual arts, texture is the perceived surface quality of an artwork.

Disparity-based Techniques

Motion Segmentation

Document Segmentation
• Document images consist of text, graphics, photos and so on
• Document segmentation is useful for compression and text recognition
• Adobe and Xerox are the major players

Medical Image Segmentation
• Medical image analysis can be used as a preliminary screening technique to help doctors
• Partial differential equations (PDEs) have been used for segmenting medical images

[Figure: active contour model (snake)]
Range Image Segmentation

[Figure panels: range, intensity, ground truth]

Biometric Image Segmentation
• For fingerprint, face and iris images, we also need to segment out the region of interest
• Various cues can be used, such as ridge pattern, skin color and pupil shape
• Robust segmentation could be difficult for poor-quality images
