21ai601 CV LM9 2

JASH

Uploaded by

tn20jashyt

21AI601 – COMPUTER VISION

UNIT II & LP 9 – IMAGE PYRAMIDS AND GAUSSIAN DERIVATIVE FILTERS, GABOR FILTERS AND DWT

1. IMAGE PYRAMIDS
Image information occurs over many different spatial scales. Image pyramids (multi-resolution
representations of images) are a useful data structure for analyzing and manipulating images
over a range of spatial scales. Here we'll discuss three different ones, in a progression of
complexity. The first is the Gaussian pyramid, which creates versions of the input image at
multiple resolutions. This is useful for analysis across different spatial scales, but doesn't
separate the image into different frequency bands. The Laplacian pyramid provides that extra
level of analysis, breaking the image into different isotropic spatial frequency bands. The
steerable pyramid provides a clean separation of the image into different scales and
orientations. There are various other differences between these pyramids, which we'll describe
below.

As a motivating example, let's assume we want to detect the birds in figure ?? using the
normalized correlation approach. If we have a template of a bird, normalized correlation will
only detect the birds whose image size is similar to that of the template. To introduce scale
invariance, one possible solution is to resize the template to cover a wide range of possible
sizes and apply each resized template to the image. The ensemble of templates will then be able
to detect birds of different sizes. The disadvantage of this approach is its computational cost:
detecting large birds requires convolutions with big kernels, which is very slow.
An alternative is to change the image size instead, which results in a multiscale image pyramid.
In this example, the original image has a resolution of 848 × 643 pixels. Each image in the
pyramid is obtained by scaling down the image from the previous level so that it is 25% smaller.
This operation is called downsampling, and we will study it in detail in this chapter. Now we
can use the pyramid to detect birds of different sizes using a single template. The red box in
the figure denotes the size of the template used. The figure shows how birds of different sizes
become detectable in at least one of the levels of the pyramid. This method is more efficient
because the template can be kept small, so the convolutions remain computationally cheap.
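
The idea of running one fixed template over every level of a pyramid can be sketched as
follows. This is a minimal illustration under stated assumptions, not the method used to
produce the figure: the helper names (halve, ncc, best_match, detect_over_pyramid) are
hypothetical, the image is a synthetic bright square rather than a bird, and each level is
obtained by 2 × 2 block averaging (a factor of 2) instead of the 25% reduction described above.

```python
import numpy as np

def halve(img):
    """Downsample by 2 with 2x2 block averaging (assumed stand-in for blur + subsample)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def ncc(patch, template):
    """Normalized correlation between a patch and a template of the same shape."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
    return 0.0 if denom == 0 else float((p * t).sum() / denom)

def best_match(img, template):
    """Slide the template over the image; return (best score, top-left position)."""
    th, tw = template.shape
    best = (-1.0, (0, 0))
    for i in range(img.shape[0] - th + 1):
        for j in range(img.shape[1] - tw + 1):
            score = ncc(img[i:i + th, j:j + tw], template)
            if score > best[0]:
                best = (score, (i, j))
    return best

def detect_over_pyramid(img, template, levels):
    """Apply the same small template at every pyramid level; keep the best response."""
    results = []
    level_img = img
    for level in range(levels):
        score, pos = best_match(level_img, template)
        results.append((score, level, pos))
        level_img = halve(level_img)
    return max(results)

# Synthetic scene: an 8x8 bright square, too large for the 6x6 template at full resolution.
image = np.zeros((32, 32))
image[8:16, 12:20] = 1.0
template = np.zeros((6, 6))
template[1:5, 1:5] = 1.0          # a 4x4 square with a one-pixel dark border

score, level, position = detect_over_pyramid(image, template, levels=3)
```

In this toy scene the object is twice the size the template expects, so the best response
comes from the level where the image has been halved once; the template itself never grows.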

Multiscale Image Pyramid

Each image is 25% smaller than the previous one. The red box indicates the size of a template used for
detecting flying birds. As the size of the template is fixed, it will only be able to detect the birds that
tightly fit inside the box. Birds that are smaller or larger will not be detected within a single scale. By
running the same template across many levels in this pyramid, different bird instances are detected at
different scales.
Multiscale image processing and image pyramids have many applications beyond scale-invariant object
detection.
1.1 Linear image transforms
Let's first look at some general properties of linear image transforms. For an input image x of N
pixels, a linear transform is:

$$ r = P^T x $$

where r is a vector of dimensionality M, and P is a matrix of size N × M. The columns of
P = [P_0, P_1, …, P_{M-1}] are the projection vectors. The vector r contains the transform
coefficients: r_i = P_i^T x. The vector r corresponds to a different representation of the image x
than the original pixel space. The transform P is said to be critically sampled when M = N. The
transform is over-sampled when M > N, and under-sampled when M < N. We are interested in
transforms that are invertible, so that we can recover the input x from the projection
coefficients r:

$$ x = B r $$

The columns of B = [B_0, B_1, …, B_{M-1}] are the basis vectors. The input signal x can be
reconstructed as a linear combination of the basis vectors B_i weighted by the representation
coefficients r_i. The transform P is complete, encoding all image structure, if it is invertible. If
the transform is critically sampled (i.e., M = N) and complete, then B = (P^T)^{-1}. If it is
overcomplete (over-sampled and complete), then the inverse can be obtained using the
pseudo-inverse B = (P P^T)^{-1} P.
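
The two inversion cases can be checked numerically. The sketch below is only an illustration:
the sizes N = 4 and M = 6 and the random projection matrices are assumptions chosen for the
example, not values from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 4, 6                        # N pixels, M coefficients (M > N: over-sampled)

P = rng.standard_normal((N, M))    # columns are the projection vectors P_i (random, assumed)
x = rng.standard_normal(N)         # a toy "image" with N pixels

r = P.T @ x                        # analysis: r_i = P_i^T x
B = np.linalg.inv(P @ P.T) @ P     # pseudo-inverse basis, B = (P P^T)^{-1} P
x_rec = B @ r                      # synthesis: x = B r

# Critically sampled case (M = N): the basis is B = (P^T)^{-1}.
Pc = rng.standard_normal((N, N))
Bc = np.linalg.inv(Pc.T)
xc_rec = Bc @ (Pc.T @ x)
```

Both x_rec and xc_rec should match x up to floating-point error, provided the random
matrices are full rank (which holds almost surely for Gaussian entries).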

1.2 Gaussian Pyramid

A Gaussian filter is a natural choice for blurring an image, since multiple applications of a Gaussian
filter are equivalent to a single application of a wider Gaussian filter. Here is an elegant, efficient
algorithm for making a reduced-resolution version of an input image. It involves two steps: convolving
the image with a low-pass filter (for example, the 4-th binomial filter b4 = [1, 4, 6, 4, 1]/16,
normalized to sum to 1 and applied separably in each dimension), and then subsampling the result by a
factor of 2. Equivalently, each level is obtained by filtering the previous level with the 4-th binomial
filter using a stride of 2 in each dimension. Applied recursively, this algorithm generates a sequence
of images, each one a smaller, lower-resolution version of the one before it.
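
The claim that repeated blurring equals one wider blur can be verified for binomial filters,
which are discrete approximations of Gaussians. This is a small sketch; the names b2 and
variance are introduced here for illustration.

```python
import numpy as np

b2 = np.array([1.0, 2.0, 1.0]) / 4.0       # 2nd binomial filter
b4 = np.convolve(b2, b2)                   # applying b2 twice gives one wider filter

def variance(kernel):
    """Spatial variance of a normalized, symmetric kernel centered at zero."""
    positions = np.arange(len(kernel)) - (len(kernel) - 1) / 2
    return float(np.sum(kernel * positions ** 2))
```

Here b4 comes out as [1, 4, 6, 4, 1]/16, and its variance is the sum of the variances of
the two b2 filters, which is the same rule that governs cascaded Gaussian filters.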
To make the filters more intuitive, it is useful to write the two steps in matrix form. The following
equation shows the recursive construction of level k + 1 of the Gaussian pyramid for a 1D image:

$$ g_{k+1} = D_k B_k g_k = G_k g_k $$

where D_k is the downsampling operator, B_k is the convolution with the 4-th binomial filter, and
G_k = D_k B_k is the blur-and-downsample operator for level k. We call the sequence of images
g_0, g_1, …, g_N the Gaussian pyramid. The first level of the Gaussian pyramid is the input image:
g_0 = x.

It is useful to check a concrete example. If x is a 1D signal of length 8 and we assume zero boundary
conditions, B_0 is an 8 × 8 matrix whose rows hold shifted copies of the binomial filter (truncated at
the borders), and D_0 is a 4 × 8 matrix that keeps every other sample. The next level of the Gaussian
pyramid, g_1 = D_0 B_0 x, is therefore a signal of length 4. Applying the recursion, we can write the
output of each level as a function of the input x: g_2 = G_1 G_0 x, g_3 = G_2 G_1 G_0 x, and so on.
For 2D images the operations are analogous. Figure 3.3 shows the Gaussian pyramid of an image.
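
The length-8 example can be built explicitly. The sketch below constructs the blur and
downsampling matrices under two assumptions that these notes do not pin down: the downsampler
keeps samples 0, 2, 4, 6 (rather than 1, 3, 5, 7), and the helper names are hypothetical.

```python
import numpy as np

def blur_matrix(n):
    """n x n matrix that convolves with [1, 4, 6, 4, 1]/16 under zero boundary conditions."""
    b4 = np.array([1, 4, 6, 4, 1]) / 16.0
    B = np.zeros((n, n))
    for i in range(n):
        for offset, weight in zip(range(-2, 3), b4):
            j = i + offset
            if 0 <= j < n:               # taps falling outside the signal are dropped
                B[i, j] = weight
    return B

def downsample_matrix(n):
    """(n // 2) x n matrix keeping samples 0, 2, 4, ... (assumed sampling phase)."""
    D = np.zeros((n // 2, n))
    D[np.arange(n // 2), 2 * np.arange(n // 2)] = 1.0
    return D

def gaussian_pyramid_1d(x, levels):
    """Return [g_0, g_1, ..., g_levels] with g_{k+1} = D_k B_k g_k."""
    pyramid = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        g = pyramid[-1]
        G = downsample_matrix(len(g)) @ blur_matrix(len(g))
        pyramid.append(G @ g)
    return pyramid

G0 = downsample_matrix(8) @ blur_matrix(8)      # blur-and-downsample operator for level 0
pyramid = gaussian_pyramid_1d(np.arange(8.0), levels=3)
```

G0 is a 4 × 8 matrix, so g_1 has length 4; each further level halves the length again.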

1.3 Laplacian pyramid

In the Gaussian pyramid, each level loses some of the fine image detail available in the previous
level. The idea of the Laplacian pyramid is simple: it represents, at each level, what is present in the
Gaussian pyramid image at that level but not in the next, lower-resolution level. We calculate this by
expanding the lower-resolution Gaussian pyramid image to the same pixel resolution as the
neighboring higher-resolution Gaussian pyramid image, then subtracting the two. This calculation is
made in a recursive, telescoping fashion.
Let's look at the steps for calculating a Laplacian pyramid. What we want is to compute the
difference between g_k and g_{k+1}. To do this, we first need to upsample the image g_{k+1} so that
it has the same size as g_k. Let F_k = B_k U_k be the upsample-and-blur operator for pyramid
level k. The operator F_k first applies the upsampling operator U_k, which inserts zeros between
samples, and then blurs with the same filter B_k that we used for the Gaussian pyramid. The
Laplacian pyramid coefficients, l_k, at pyramid level k, are:

$$ l_k = g_k - F_k g_{k+1} $$

Laplacian Pyramid

For instance, for a 1D input x of length 8, and assuming zero boundary conditions, U_0 is an 8 × 4
matrix that inserts zeros between samples and B_0 is the 8 × 8 blur matrix. In practice the
upsample-and-blur operator also carries a gain of 2, F_0 = 2 B_0 U_0. The factor 2 is necessary
because inserting zeros decreases the average value of the signal g_{k+1} by a factor of 2.
The Laplacian pyramid is an overcomplete representation (more coefficients than
pixels): the dimensionality of the representation is higher than the dimensionality of the
input.
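Note that the reconstruction property of the Laplacian pyramid does not depend on the filters
used for subsampling and upsampling. Because l_k = g_k − F_k g_{k+1} by construction, adding
F_k g_{k+1} back to l_k recovers g_k exactly, whatever the operators are. Even if we used random
filters, the reconstruction property would still hold.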
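
A 1D sketch makes both points concrete: the pyramid stores more coefficients than the input
has samples, and reconstruction is exact even with a random filter. The helper names, the
sampling phase (samples 0, 2, 4, ... are kept) and the gain of 2 in the upsampling operator
are assumptions of this illustration.

```python
import numpy as np

def blur_matrix(n, kernel):
    """n x n convolution matrix for an odd-length kernel, zero boundary conditions."""
    half = len(kernel) // 2
    B = np.zeros((n, n))
    for i in range(n):
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:
                B[i, j] = weight
    return B

def down_matrix(n):
    """(n // 2) x n matrix keeping every other sample; its transpose inserts zeros."""
    D = np.zeros((n // 2, n))
    D[np.arange(n // 2), 2 * np.arange(n // 2)] = 1.0
    return D

def build_laplacian(x, levels, kernel):
    """Return ([l_0, ..., l_{levels-1}], low-pass residual)."""
    g = np.asarray(x, dtype=float)
    bands = []
    for _ in range(levels):
        n = len(g)
        B, D = blur_matrix(n, kernel), down_matrix(n)
        g_next = D @ B @ g                 # g_{k+1} = D_k B_k g_k
        F = 2.0 * B @ D.T                  # upsample (zero insertion), blur, gain of 2
        bands.append(g - F @ g_next)       # l_k = g_k - F_k g_{k+1}
        g = g_next
    return bands, g

def reconstruct(bands, residual, kernel):
    """Invert the pyramid: g_k = l_k + F_k g_{k+1}, from coarse to fine."""
    g = residual
    for band in reversed(bands):
        n = len(band)
        F = 2.0 * blur_matrix(n, kernel) @ down_matrix(n).T
        g = band + F @ g
    return g

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
binomial = np.array([1, 4, 6, 4, 1]) / 16.0
random_kernel = rng.standard_normal(5)     # deliberately not a low-pass filter

bands, residual = build_laplacian(x, 3, binomial)
x_binomial = reconstruct(bands, residual, binomial)
x_random = reconstruct(*build_laplacian(x, 3, random_kernel), random_kernel)
n_coefficients = sum(len(b) for b in bands) + len(residual)
```

For the length-8 input the pyramid holds 8 + 4 + 2 band coefficients plus a residual of
length 1, which is more than the 8 input samples. A random filter still reconstructs x, but
the resulting bands no longer correspond to meaningful frequency bands.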

1.3.1 Image blending

The Laplacian pyramid is used in many image processing and analysis applications. Here we
show one fun application: image blending. The goal is to combine two images into one, with a
mask defining how the images are combined. Suppose we want to blend the following two images
using the mask shown on the right:

Making a sharp transition from one image to the other gives an artificially sharp image
boundary (see the straight edge of the apple/orange). Using the Laplacian pyramid, we can
transition from one image to the other over many different spatial scales, which makes the
transition between the two images gradual. First, we build the Laplacian pyramid of each of
the two input images. In this example we use 7 levels and also keep the last low-pass residual:

Then we build the Gaussian pyramid of the mask, shown below. Note that it has 8 levels, one more
than the Laplacian pyramids, so that the low-pass residual also gets a mask:

m0 m1 m2 m3 m4 m5 m6 m7

Now we combine the three pyramids to compute the Laplacian pyramid of the blended image.
Writing l^A_k and l^B_k for level k of the Laplacian pyramids of the two images (labels
introduced here for convenience) and m_k for level k of the mask pyramid, the standard
per-level combination is:

$$ l_k = l^A_k \odot m_k + l^B_k \odot (1 - m_k) $$

where ⊙ denotes the pixel-wise product. Collapsing this pyramid gives the blended image.
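
The blending procedure can be sketched in a few lines of NumPy. This is an illustration under
assumptions, not the code behind the apple/orange example: the function names are
hypothetical, the per-level rule is the standard mask-weighted sum of the two Laplacian
pyramids, borders are zero-padded, and 3 levels are used on random 64 × 64 arrays instead of
7 levels on real photographs.

```python
import numpy as np

B4 = np.array([1, 4, 6, 4, 1]) / 16.0

def blur(img):
    """Separable binomial blur with zero boundaries (each side must be at least 5 pixels)."""
    rows = np.apply_along_axis(np.convolve, 1, img, B4, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, B4, mode="same")

def down(img):
    """Blur, then keep every other pixel in each dimension."""
    return blur(img)[::2, ::2]

def up(img, shape):
    """Insert zeros, blur, and apply a gain of 4 (2 per dimension)."""
    out = np.zeros(shape)
    out[::2, ::2] = img
    return 4.0 * blur(out)

def gaussian_pyramid(img, levels):
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(down(pyramid[-1]))
    return pyramid                               # levels + 1 images

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    bands = [g[k] - up(g[k + 1], g[k].shape) for k in range(levels)]
    return bands + [g[-1]]                       # band-pass levels plus low-pass residual

def collapse(pyramid):
    img = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        img = band + up(img, band.shape)
    return img

def blend(a, b, mask, levels):
    """Blend a (where mask is 1) with b (where mask is 0) across all scales."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    m = gaussian_pyramid(mask, levels)           # one mask per band, plus one for the residual
    return collapse([x * w + y * (1.0 - w) for x, y, w in zip(la, lb, m)])

rng = np.random.default_rng(0)
a, b = rng.random((64, 64)), rng.random((64, 64))
mask = np.zeros((64, 64))
mask[:, :32] = 1.0                               # left half from a, right half from b
blended = blend(a, b, mask, levels=3)
```

Because the mask is blurred more at coarser levels, low frequencies are mixed over a wide
region and fine detail over a narrow one, which is what hides the seam.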

1.4 Steerable pyramid

The Laplacian pyramid provides a richer representation than the Gaussian pyramid, but we
would like an even more expressive image representation. The steerable pyramid adds
information about image orientation: it is a multiscale, oriented representation that is
translation-invariant, non-aliased and self-invertible. Ideally, we'd like to have an image
transformation that is shiftable, where we could perform interpolations in position, scale,
and orientation using linear combinations of a set of basis coefficients. The steerable
pyramid goes part of the way there.

We analyze in orientation using a steerable filter bank. We form a decomposition in scale by
introducing a low-pass filter (designed to work with the selected bandpass filters), and
recursively breaking the low-pass filtered component into angular and low-pass frequency
components. Pyramid subsampling steps are preceded by sufficient low-pass filtering to remove
aliasing.

To ensure that the image can be reconstructed from the steerable filter transform
coefficients, the filters must be designed so that their sums of squared magnitudes “tile”
in the frequency domain. We reconstruct by applying each filter a second time to the
steerable filter representation, and we want the final system frequency response to be flat,
for perfect reconstruction.
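
The tiling condition can be illustrated with a toy single-scale filter bank defined directly
in the frequency domain. This is not a steerable pyramid: there is no subsampling, only two
orientation bands, and the angular profiles |cos θ| and |sin θ| are simplified stand-ins chosen
so that the squared magnitudes sum to 1. All names below are assumptions of the sketch.

```python
import numpy as np

def tiling_filters(shape):
    """One low-pass and two oriented high-pass frequency responses whose squares sum to 1."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    t = np.clip(radius / 0.5, 0.0, 1.0)          # 0 at DC, 1 at and beyond the Nyquist radius
    low = np.cos(0.5 * np.pi * t)
    high = np.sin(0.5 * np.pi * t)               # low^2 + high^2 = 1 at every frequency
    theta = np.arctan2(fy, fx)
    return [low, high * np.abs(np.cos(theta)), high * np.abs(np.sin(theta))]

def analyze(img, filters):
    """Apply each filter once: one coefficient image per filter."""
    spectrum = np.fft.fft2(img)
    return [np.fft.ifft2(spectrum * f).real for f in filters]

def synthesize(coeffs, filters):
    """Apply each filter a second time and sum the results."""
    total = sum(np.fft.fft2(c) * f for c, f in zip(coeffs, filters))
    return np.fft.ifft2(total).real

rng = np.random.default_rng(0)
image = rng.random((16, 16))
filters = tiling_filters(image.shape)
coeffs = analyze(image, filters)
reconstruction = synthesize(coeffs, filters)
power = sum(f ** 2 for f in filters)             # overall system frequency response
```

Since every filter is applied twice (once in analysis, once in synthesis), the overall
response is the sum of squared magnitudes; when that sum is flat, the reconstruction matches
the input. The three coefficient images together hold three times as many values as the
input, so this toy representation is also overcomplete.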
One block of the Steerable pyramid computation

The steerable pyramid is a self-inverting, overcomplete representation (more coefficients than
pixels).

The following block diagram shows the steps to build a two-level steerable pyramid and to
reconstruct the input. The architecture has two parts: 1) the analysis network (or encoder),
which transforms the input image x into a representation composed of the oriented bands
r = [b_{0,0}, …, b_{0,n}, b_{1,0}, …, b_{1,n}, …, b_{k−1,0}, …, b_{k−1,n}] and the low-pass
residual g_{k−1}; and 2) the synthesis network (or decoder), which reconstructs the input from
the representation r.
Steps to Build Steerable Pyramid
