
UNIT 1

FUNDAMENTALS OF DIGITAL
IMAGE PROCESSING
21CSE251T-DIP

II B. TECH AIML
CONTENTS
• Steps in Digital Image Processing
• Components
• Elements of Visual Perception
• Image Sensing and Acquisition
• Image Sampling and Quantization
• Relationships between pixels
• Color image fundamentals
• RGB
• HSI models
• Two-dimensional mathematical preliminaries
• 2D transforms
• DFT
• DCT
Elements of Visual Perception
• Image Sensing and Acquisition
• Structure and Image Formation in Eye
• Image Sampling and Quantization
Structure and Image Formation in Eye
• The eye is nearly a sphere (with a diameter of about 20
mm) enclosed by three membranes: the cornea and
sclera outer cover; the choroid; and the retina.
• The cornea is a tough, transparent tissue that covers the
anterior surface of the eye. Continuous with the cornea,
the sclera is an opaque membrane that encloses the
remainder of the optic globe.
• The choroid lies directly below the sclera. This membrane
contains a network of blood vessels that serve as the
major source of nutrition to the eye.
• At its anterior extreme, the choroid is divided into the
ciliary body and the iris. The latter contracts or expands
to control the amount of light that enters the eye.
• The central opening of the iris (the pupil) varies in
diameter from approximately 2 to 8 mm.
Structure and Image Formation in Eye
• The innermost membrane of the eye is the retina, which lines the inside of
the wall’s entire posterior portion.
• When the eye is focused, light from an object is imaged on the retina. Pattern
vision is afforded by discrete light receptors distributed over the surface of
the retina.
• There are two types of receptors: cones and rods.
• Cones:
• There are between 6 and 7 million cones in each eye.
• They are located primarily in the central portion of the retina, called the fovea, and are highly
sensitive to color.
• Humans can resolve fine details because each cone is connected to its own nerve end.
• Muscles rotate the eye until the image of a region of interest falls on the fovea.
• Cone vision is called photopic or bright-light vision.
• Rods:
• The number of rods is much larger: some 75 to 150 million are distributed over the retina.
• The larger area of distribution, and the fact that several rods are connected to a single nerve
ending, reduces the amount of detail discernible by these receptors.
• Rods capture an overall image of the field of view.
• They are not involved in color vision, and are sensitive to low levels of illumination.
• For example, objects that appear brightly colored in daylight appear as colorless forms in moonlight, because only the rods are stimulated.
• This phenomenon is known as scotopic or dim-light vision.
Structure and Image Formation in Eye
In a Camera:
1. The lens cannot change its shape.
2. To focus on objects at different distances, the lens is moved closer to or farther from the imaging surface (film or a digital sensor). This adjusts the focus.
In the Human Eye:
1. The lens is flexible and can change its shape.
2. The distance between the lens and the retina (the "imaging surface") is fixed; it cannot change.
3. To focus on objects at different distances, the shape of the lens is adjusted by muscles in the eye, which changes its focal length.
So, in a camera, focus is adjusted by moving the lens, while in the eye, focus is adjusted by changing the shape of the lens.
Image Sampling and Quantization
• The output of most sensors is a continuous voltage waveform whose
amplitude and spatial behavior are related to the physical phenomenon
being sensed.
• To create a digital image, we need to convert the continuous sensed data
into a digital format. This requires two processes: sampling and
quantization.
• Sampling: Sampling refers to selecting specific points (or intervals) from
the continuous analog signal to represent the signal in a discrete form.
• Quantization: Quantization refers to mapping the amplitude (or intensity)
of the sampled points to a finite set of discrete levels.
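As a minimal sketch (assuming NumPy; the test signal and the 8-level scale are illustrative choices, not from the slides), the two steps look like this on a 1-D signal:

```python
import numpy as np

# A "continuous" signal, stood in for by a very finely spaced array.
t = np.linspace(0.0, 1.0, 10_000)
f = 0.5 + 0.5 * np.sin(2 * np.pi * 3 * t)    # amplitude in [0, 1]

# Sampling: keep the function only at discrete coordinate values.
idx = np.linspace(0, len(t) - 1, 32).astype(int)
samples = f[idx]                              # amplitudes still continuous

# Quantization: snap each sampled amplitude to one of 8 discrete levels.
levels = 8
quantized = np.round(samples * (levels - 1)) / (levels - 1)
```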
BASIC CONCEPTS IN SAMPLING AND QUANTIZATION
• Digitizing the coordinate values is called sampling.
• Digitizing the amplitude values is called quantization.
• The samples are shown as small dark squares superimposed on the function, and their (discrete) spatial locations are indicated by corresponding tick marks at the bottom of the figure.
• The set of dark squares constitutes the sampled function. However, the values of the samples still span (vertically) a continuous range of intensity values.
• In order to form a digital function, the intensity values also must be converted (quantized) into discrete quantities.
• The vertical gray bar in Fig. (c) depicts the intensity scale divided into eight discrete intervals, ranging from black to white. The vertical tick marks indicate the specific value assigned to each of the eight intensity intervals.
Figure: (a) Continuous image projected onto a sensor array. (b) Result of image sampling and quantization.
Image Sampling and Quantization
• When a sensing strip is used for image acquisition,
the number of sensors in the strip establishes the
samples in the resulting image in one direction, and
mechanical motion establishes the number of
samples in the other.
• Quantization of the sensor outputs completes the
process of generating a digital image. When a sensing
array is used for image acquisition, no motion is
required. The number of sensors in the array
establishes the limits of sampling in both directions.
What are a Sensing Strip and a Sensing Array?
• Sensing Strip:
• A sensing strip is a linear array of sensors arranged in a single row.
• It captures one line of the image at a time as the sensor and the object move relative to each other.
• Commonly used in scanners, where the strip moves across the
document to capture the entire image.
• Sensing Array:
• A sensing array consists of a 2D grid of sensors, capturing an entire
image in one exposure.
• Each sensor in the array corresponds to a pixel in the image, allowing
for the capture of the full spatial information in one go.
• Commonly used in digital cameras, webcams, and smartphones.
SOME BASIC RELATIONSHIPS BETWEEN
PIXELS
• A pixel p at coordinates (x, y) has two horizontal and two vertical
neighbors with coordinates
(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)
• This set of pixels, called the 4-neighbors of p, is denoted N4(p).
SOME BASIC RELATIONSHIPS BETWEEN
PIXELS
• The four diagonal neighbors of p have coordinates
(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)
and are denoted ND(p). These neighbors, together with the 4-neighbors, are called the 8-neighbors of p, denoted N8(p).
• The set of image locations of the neighbors of a point p is called the
neighborhood of p.

• The neighborhood is said to be closed if it contains p. Otherwise; the


neighborhood is said to be open.
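These neighborhoods are easy to make concrete in code (a sketch; the helper names are illustrative). Note that, in practice, neighbor coordinates falling outside the image must be discarded or handled by padding:

```python
def n4(x, y):
    """4-neighbors of p at (x, y): two horizontal, two vertical."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    """Diagonal neighbors of p."""
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    """8-neighbors of p: the union of N4(p) and ND(p)."""
    return n4(x, y) + nd(x, y)

print(n4(2, 2))  # [(3, 2), (1, 2), (2, 3), (2, 1)]
```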
ADJACENCY, CONNECTIVITY, REGIONS, AND
BOUNDARIES

• Adjacency
• Definition: Two elements (e.g., pixels, nodes, or regions) are
adjacent if they share a common edge or vertex.
• Types:
• 4-adjacency: Two elements share a common side.
• 8-adjacency: Two elements share a common side or corner.
• Example:
ABC
DEF
GHI
• 4-adjacency: Pixel E is adjacent to D, F, B, and H.
• 8-adjacency: Pixel E is adjacent to D, F, B, H, and also A, C, G, I.
m-Adjacency (Mixed Adjacency)
• m-adjacency, or mixed adjacency, is a hybrid of 4-adjacency and 8-adjacency.
• It is introduced to avoid ambiguous connections (e.g., diagonal connections that might lead to multiple possible paths).
• It is particularly useful in scenarios where the connectivity between pixels needs to be defined unambiguously.
m-Adjacency (Mixed Adjacency)
• Rules of m-Adjacency:
• Let V be the set of intensity values used to define adjacency (e.g., V = {1} in a binary image). Two pixels p and q with values from V are m-adjacent if:
1. q is in N4(p), or
2. q is in ND(p) and the set N4(p) ∩ N4(q) contains no pixels whose values are from V (see the code sketch below).
• Example (V = {1}):
1 0 1
0 1 0
1 0 1
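A minimal check of these two rules, as a Python sketch (the function names and the NumPy dependency are illustrative assumptions, not from the slides):

```python
import numpy as np

def n4(p):
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def in_image(p, img):
    return 0 <= p[0] < img.shape[0] and 0 <= p[1] < img.shape[1]

def m_adjacent(p, q, img, V):
    """True if pixels p and q (row, col) are m-adjacent for value set V."""
    if img[p] not in V or img[q] not in V:
        return False
    if q in n4(p):                      # Rule 1: plain 4-adjacency
        return True
    if q in nd(p):                      # Rule 2: diagonal, but only if no
        common = n4(p) & n4(q)          # shared 4-neighbor has a value in V
        return not any(in_image(r, img) and img[r] in V for r in common)
    return False

img = np.array([[1, 0, 1],
                [0, 1, 0],
                [1, 0, 1]])
print(m_adjacent((0, 0), (1, 1), img, {1}))  # True: shared 4-neighbors are 0
```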
Connectivity
Connected Set
Region
What is the use of all these terms? In medical imaging, for example, we might identify a tumor as a connected region of pixels with specific intensities.
Boundary (Border or Contour) of a Region R
• The boundary of a region R consists of the pixels in R that are
adjacent to the background (complement of R).
• These are pixels that have at least one neighboring pixel that does not
belong to R.

• Inner and Outer Borders
• The inner border is the boundary inside the region itself.
• The outer border is the boundary between the region and the background (see the sketch below).
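As a hedged sketch (assuming SciPy; the morphological definitions used here are one common convention), both borders can be extracted with erosion and dilation:

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

# A small binary region R (True = foreground).
region = np.zeros((5, 5), dtype=bool)
region[1:4, 1:4] = True

# Inner border: pixels of R that touch the background,
# i.e. pixels of R removed by one erosion step.
inner_border = region & ~binary_erosion(region)

# Outer border: background pixels that touch R,
# i.e. pixels added by one dilation step.
outer_border = binary_dilation(region) & ~region

print(inner_border.astype(int))
print(outer_border.astype(int))
```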
Boundary of an Entire Image
• If R represents the entire image, then its boundary refers to the first and last rows and columns of the image.
• This is necessary because an image does not have neighbors beyond
its edges.
• Edge vs. Boundary
• The boundary is a property of a region, while an edge is a property of
intensity variation (brightness changes).
• Edges are detected where intensity values change significantly (e.g., using techniques like gradient detection).
Color Image Fundamentals
• Primary colors of light: red (R), green (G), and blue (B).
• The primary colors can be mixed to produce the secondary colors: magenta (red + blue), cyan (green + blue), and yellow (red + green).
Color Image Processing
• Full-color processing: the images are acquired with a full-color sensor, such as a color camera or scanner.
• Pseudo-color processing: colors are assigned to monochrome intensities or ranges of intensities.
Color Characteristics
• The characteristics used to distinguish one color from
another are:
• Brightness: the perceived amount of light (the achromatic notion of intensity).
• Hue: represents dominant color as perceived by an
observer.
• Saturation: refers to the amount of white light mixed with
a hue.
RGB COLOUR MODEL
• In the RGB Model, an image consists of three independent image
planes, one in each of the primary colours: red, green and blue.
Converting Colors from RGB to HSI
• RGB is best for display and digital representation, as it is how screens generate color.
• HSI is best for image analysis because it separates color from brightness, making it useful for tasks like color-based object detection.
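The standard per-pixel RGB-to-HSI equations can be sketched as below; the epsilon guard and the assumption that inputs are normalized to [0, 1] are implementation choices, not from the slides:

```python
import numpy as np

def rgb_to_hsi(r, g, b, eps=1e-10):
    """Convert normalized RGB in [0, 1] to (H in degrees, S, I)."""
    i = (r + g + b) / 3.0                            # intensity: the average
    s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + eps) # saturation
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta           # hue angle
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))  # pure red -> H = 0, S = 1, I = 1/3
```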
Quiz - 1
• The colour in its purest form is called _________
• A). pixel
• B). hue
• C). Intensity
• D). Saturation
Quiz - 1
• The colour in its purest form is called _________
• A). pixel
• B). hue (Answer)
• C). Intensity
• D). Saturation
Two-Dimensional Mathematical Preliminaries
• Spatial (Time) Domain
• Transform (Frequency) Domain
• Kernel
• Convolution
• Correlation
Spatial (Time) Domain
• The spatial domain refers to the representation of an image or signal in
its original coordinate system (e.g., pixels in an image).
• Operations in the spatial domain are performed directly on pixel
values (e.g., filtering, sharpening, edge detection).
Transform (Frequency) Domain
• The transform domain represents an image or signal in terms of
frequency components rather than pixel values.
• Operations in this domain manipulate the transformed coefficients
instead of direct pixel values.
• Common transforms include Fourier Transform (FT), Discrete Cosine
Transform (DCT), and Wavelet Transform (WT).
• Example: JPEG compression uses the DCT to remove high-frequency
components, reducing file size.
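As a rough illustration of that JPEG idea (a sketch assuming SciPy; real JPEG additionally uses 8×8 blocks with perceptual quantization tables), one can transform a block, discard high-frequency coefficients, and invert:

```python
import numpy as np
from scipy.fft import dctn, idctn

# A smooth 8x8 stand-in block (values in [0, 1]).
x = np.arange(8)
block = np.outer(x, x) / 49.0

coeffs = dctn(block, norm="ortho")

# Keep only the 4x4 low-frequency corner; zero the rest.
mask = np.zeros_like(coeffs)
mask[:4, :4] = 1
approx = idctn(coeffs * mask, norm="ortho")

# Error stays small because the block's energy sits at low frequencies.
print(np.abs(block - approx).mean())
```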
Quiz - 2
• ______________ represents an image or signal in terms of frequency
components rather than pixel values.
• A). Graph domain
• B). Temporal domain
• C). Spatial domain
• D). Transform domain
Quiz - 2
• ______________ represents an image or signal in terms of frequency components rather than pixel values.
• A). Graph domain
• B). Temporal domain
• C). Spatial domain
• D). Transform domain (Answer)
Why is the frequency domain preferred over the time domain?
1. Easier Signal Processing
2. Compression
3. Identification of Frequency Components
4. Noise Removal
5. Simplification of Complex Operations
6. Signal Analysis
7. Faster Computation
KERNEL
• A kernel is a small matrix (e.g., 3×3, 5×5, 7×7) that is slid over an
image to apply transformations like blurring, sharpening, edge
detection, noise reduction, and feature extraction.
• The kernel values determine the effect it has on the image.
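A few classic kernels and one way to apply them (a sketch; the kernel values are the standard textbook ones, and scipy.ndimage.convolve is just one of several suitable functions):

```python
import numpy as np
from scipy.ndimage import convolve

box_blur = np.ones((3, 3)) / 9.0        # averaging: blurs / smooths

sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]])      # boosts the center pixel

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])        # responds to vertical edges

img = np.random.default_rng(0).random((6, 6))  # stand-in grayscale image
blurred = convolve(img, box_blur, mode="reflect")
edges = convolve(img, sobel_x, mode="reflect")
```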
Quiz - 3
• Can we write any random matrix elements for kernel function?
• Yes
• No
Quiz - 3
• Can we write any random matrix elements for kernel function?
• Yes
• No (Answer): a valid kernel must have meaningful values that produce a desired effect when applied to an image.
Correlation
• Correlation is similar to convolution but does not flip the kernel before applying it.
Convolution
• Convolution is a mathematical operation that combines an image with a kernel by flipping the kernel both horizontally and vertically before applying it.
• Convolution is used in edge detection (Sobel, Prewitt, Laplacian filters), blurring, and sharpening.
Example of 3×3 Convolution
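The slide's worked example was a figure; an equivalent numeric sketch (assuming SciPy) shows that convolution flips the kernel while correlation does not:

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
kernel = np.array([[1, 0],
                   [0, 0]])   # a single 1 in the top-left corner

conv = convolve2d(img, kernel, mode="valid")   # kernel flipped first
corr = correlate2d(img, kernel, mode="valid")  # kernel used as-is

print(corr)  # [[1 2]
             #  [4 5]]  - picks up the pixel under the 1
print(conv)  # [[5 6]
             #  [8 9]]  - the flip moves the 1 to the bottom-right
```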
Quiz - 4
• _________ will flip and ______ will not flip the kernel function
before applying to the image processing computation.
• A). Evolution, Revolution
• B). Ovulation, Corrosion
• C). Correlation, Convolution
• D). Convolution, Correlation
Quiz - 4
• _________ will flip and ______ will not flip the kernel function before applying it in the image processing computation.
• A). Evolution, Revolution
• B). Ovulation, Corrosion
• C). Correlation, Convolution
• D). Convolution, Correlation (Answer)
DFT
• The DFT transforms an image from the time (spatial) domain into the frequency domain. It provides a frequency spectrum that shows the contribution of different frequencies to the image.
• Pros: useful for tasks like signal processing, pattern recognition, and enhancement.
• Cons: computationally expensive.
DCT
• The DCT is similar to the DFT but uses only cosine functions (real values), making it more suitable for practical applications like image compression.
• Pros: avoids complex arithmetic and is particularly effective for compression.
• Cons: typically used in lossy compression, where quantizing or discarding coefficients loses information.
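Both transforms are a single call in NumPy/SciPy (a sketch; normalization conventions differ between libraries):

```python
import numpy as np
from scipy.fft import dctn

img = np.random.default_rng(1).random((4, 4))  # stand-in grayscale image

spectrum = np.fft.fft2(img)                    # 2-D DFT: complex output
magnitude = np.abs(np.fft.fftshift(spectrum))  # centered magnitude spectrum

dct_coeffs = dctn(img, norm="ortho")           # 2-D DCT: real output

print(spectrum.dtype)    # complex128 - the DFT yields complex coefficients
print(dct_coeffs.dtype)  # float64   - the DCT stays real-valued
```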
